
2 posts tagged with "customer story"


Toco is a Swiss digital currency and wallet service provider that offers Toco, a digital currency representing one ton of CO2 equivalent captured from the atmosphere. Users can easily exchange local currency for Tocos via the Tocos app, and use Tocos for spending, saving, or trading, thereby actively contributing to carbon removal with each transaction.

Technical Challenges

As Toco gradually expands across Europe, it anticipates significant growth challenges in the coming months. The marketing team wants to capture everything as data and present it in various displays, which puts the data infrastructure to the test.

Toco places high demands on its big data and analytics platform. It requires a modular, flexible, and scalable data stack. The database must be able to read from S3 buckets while leaving the files in those buckets readable by other tools. It must also separate compute from storage to provide real flexibility and modularity. These requirements narrowed down the database options.

Why Databend Cloud?

Databend Cloud was selected due to its excellent compute-storage separation architecture and the following features:

  • Enhanced Distributed Computing: Inspired by Snowflake, Databend Cloud enhances distributed computing capabilities. The user experience is similar to Snowflake's, with better resource scheduling and utilization and lower costs.

  • Vectorized Computing Engine: All operators are vectorized, an industry-leading design that significantly improves both single-machine and distributed cluster performance.

  • Object Storage Design: Fully designed around object storage, supporting over 20 protocols including HDFS, Amazon S3, Azure Blob, OSS, and COS. This achieves true compute-storage separation with finer-grained resource control. Compute nodes can scale elastically without storage capacity constraints. The built-in Stream (CDC) and Task features provide stream processing and task scheduling, enabling a unified batch and stream processing solution.

  • Native STAGE Support: STAGE is core to Databend Cloud's data flow, allowing users to load and export data and perform queries directly in STAGE. Users only need to create a STAGE containing data files to easily query data without complex table creation or data import processes. Transitioning from STAGE to Table is also straightforward.
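The STAGE workflow described in the last bullet can be sketched as three SQL statements: create a stage over a bucket, query the staged files in place, then materialize them as a table. The Python helpers below only build the statement text. The stage name, bucket URL, table name, and file pattern are hypothetical, and the statement shapes are modelled on Databend's stage syntax, so they should be checked against the current documentation before use.

```python
def create_stage_sql(stage: str, bucket_url: str) -> str:
    """Build a CREATE STAGE statement pointing at an S3 prefix.

    Credentials are deliberately left as placeholders; supply them
    through your own secret management rather than hard-coding them.
    """
    return (
        f"CREATE STAGE IF NOT EXISTS {stage} "
        f"URL = '{bucket_url}' "
        "CONNECTION = (ACCESS_KEY_ID = '<key>' SECRET_ACCESS_KEY = '<secret>')"
    )


def query_stage_sql(stage: str, pattern: str, limit: int = 10) -> str:
    """Build a SELECT that reads Parquet files directly from the stage."""
    return (
        f"SELECT * FROM @{stage} "
        f"(FILE_FORMAT => 'parquet', PATTERN => '{pattern}') "
        f"LIMIT {limit}"
    )


def stage_to_table_sql(stage: str, table: str, pattern: str) -> str:
    """Build a CREATE TABLE ... AS SELECT that materializes staged files."""
    return (
        f"CREATE TABLE {table} AS SELECT * FROM @{stage} "
        f"(FILE_FORMAT => 'parquet', PATTERN => '{pattern}')"
    )


if __name__ == "__main__":
    # Hypothetical names, used only for illustration.
    print(create_stage_sql("toco_raw", "s3://example-bucket/raw/"))
    print(query_stage_sql("toco_raw", ".*[.]parquet"))
    print(stage_to_table_sql("toco_raw", "signups", ".*[.]parquet"))
```

The generated statements could be run through any Databend SQL client. Keeping them as plain strings makes the three steps easy to review before anything touches the warehouse.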

Toco's tech lead states: "As a growing company with high demands for big data and analytics platforms, Databend Cloud's object storage design provides low-cost, high-performance, and comprehensive, flexible data processing. Few tools offer such convenience."

Deployment Solution

Toco currently uses Databend Cloud as its primary analytics database. The deployment is divided into several phases:

Phase I: Data Acquisition & Processing

In this phase, Toco primarily gathers data from application APIs and marketing activity webpages, using Mage to orchestrate the data processing workflows. The orchestration coordinates requests to multiple endpoints to collect data, which is then forwarded to an S3 bucket. Databend Cloud reads the data from the S3 bucket. Toco also runs dbt processes in the database to transform and prepare the data, forming a robust data warehouse.
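As a rough illustration of the hand-off between the collection step and the S3 bucket, the sketch below serializes collected records as newline-delimited JSON and derives a date-partitioned object key. The key layout, source name, and record fields are assumptions made for this example, not Toco's actual conventions, and the upload itself is left to whichever S3 client the pipeline already uses.

```python
import json
from datetime import date


def object_key(source: str, day: date, part: int = 0) -> str:
    """Build a date-partitioned key so one day's files share a prefix."""
    return f"raw/{source}/dt={day.isoformat()}/part-{part:04d}.ndjson"


def to_ndjson(records: list[dict]) -> str:
    """Serialize records as newline-delimited JSON, one object per line."""
    if not records:
        return ""
    return "\n".join(json.dumps(r, sort_keys=True) for r in records) + "\n"


if __name__ == "__main__":
    # Hypothetical records standing in for an API response.
    records = [
        {"user_id": 1, "event": "signup"},
        {"user_id": 2, "event": "purchase"},
    ]
    print(object_key("campaign_events", date(2024, 5, 1)))
    print(to_ndjson(records), end="")
```

Partitioning keys by date keeps each load idempotent: re-running a day overwrites the same prefix instead of scattering duplicates across the bucket.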

Phase II: Data Access & Presentation

Once the data is ready for customer use, Toco utilizes Superset to push the processed table data to customers. For internal users, Toco provides access to public Superset Dashboards, allowing them to connect directly to Databend Cloud and read the data. The public can also access certain data through Superset panels.

This structured approach ensures that both internal and external customers can seamlessly and efficiently access the necessary data. However, the process currently lacks automation. Toco's tech team plans to refine and expand these processes to meet future needs.

The API method does not capture all data from regular marketing activities, such as table data in PostgreSQL and MongoDB databases, so it falls short of requirements. In the next phase, Toco plans to use Airbyte, an open-source data integration tool, to replace API data acquisition. Airbyte will automate the creation of source tables and S3 buckets. Any changes made by developers to the databases will automatically appear in the source tables.

The Story Continues ...

Toco has been using Databend Cloud as its analytics database, and everything has been running smoothly without major issues. As marketing efforts increase, Toco may face a surge in user volume. At that point, it will need to conduct a "battle test" to evaluate the system's stability in a production environment.

Simultaneously, Toco plans to push some data from Databend Cloud to users' API endpoints and explore real-time data stream analysis using Airbyte and CDC. This data will primarily support marketers in real-time customer segmentation. This requires the database to have near-real-time data analysis capabilities. Databend Cloud can provide near-real-time data analysis, offering timely business decision support and helping Toco meet growing business demands.

Typing (TYPING TECHNOLOGY PTE. LTD.), founded in 2022, is a company that offers social platforms in Southeast Asia, Latin America, and the Middle East. The platforms feature a variety of social functions including video streaming, voice chat rooms, short videos, lifestyle sharing, and text chat. With over a million registered users and hundreds of thousands of daily active users, people can meet interesting individuals, make new friends, and create their own social communities on the platform.

Business Scenario

Social platforms have become an indispensable part of daily life. People use them to make friends, share content, and exchange information. This generates a wealth of user behavior and preference data. Big data technology enables the efficient mining and analysis of this data, providing technological and decision-making support for the development and enhancement of social platforms.

As a social media company, Typing recognizes the critical importance of data, which can uncover significant commercial value:

  • Building User Profiles: User profiles are models created based on users' behavioral data and personal information. Typing analyzes data such as user interactions, friendships, and interests to build accurate user profiles. These profiles help Typing better understand user needs and behavioral trends, enabling the platform to offer more personalized and precise services and recommendations, thereby enhancing user experience and satisfaction.

  • Content Recommendation & Personalization: The vast and complex array of content on Typing’s platform includes audio, video, text, and images. Finding relevant content and people can be challenging for users. Utilizing big data analysis, Typing can examine users' historical behavior data to discern their interests and preferences, providing personalized content recommendations and notifications. This personalization boosts user engagement and retention, fostering greater loyalty and dependency on the platform.

  • Social Relationship Analysis: Understanding and analyzing social relationships is central to Typing's platform. By leveraging big data, Typing can analyze users' friendships and interactions to identify interest groups and social networks. This insight allows Typing to offer more precise social recommendations. Additionally, social relationship analysis supports strategies for predicting user churn and maintaining user relationships, ultimately improving user retention and activity levels.

Technical Challenges

Due to its startup scale, Typing's entire development team consists of only about 15 people, with no dedicated big data or AI algorithm recommendation teams. However, the company has a strong need for refined operations, requiring a deep understanding of both users and the platform. Extracting valuable insights and analysis from data becomes essential.

To achieve this, the Typing technical team explored various solutions, including big data offerings from Alibaba Cloud and Volcano Engine. However, these solutions were deemed complex in terms of documentation and integration, with high time and manpower costs, making them impractical for a startup to implement.

Typing also experimented with the open-source ClickHouse, but it required specialized data developers to handle intermediate data cleaning and ETL tasks. Due to the lack of manpower in this area, this solution also proved unfeasible for Typing.

Why Databend Cloud?

At an open-source event during a conference, the technical lead of Typing encountered Databend Cloud. After extensive research and discussions, he was deeply impressed by several key features of Databend Cloud:

  • Separation of Storage and Compute: Databend Cloud completely separates storage from computation, allowing users to easily scale up or down based on application needs. Its design for object storage overcomes the traditional database disk capacity limitations.

  • High-Performance Queries: Databend Cloud’s advanced architecture and vectorized query engine enable real-time analysis of massive datasets with sub-second latency. Leveraging data-level parallelism (Vectorized Query Execution) and instruction-level parallelism (SIMD), Databend Cloud delivers exceptional data analysis performance. It outperforms mainstream next-generation cloud-native databases by 1.3 times and traditional integrated databases by 2-3 times in the TPC-H benchmark across data import, cold run, and hot run scenarios.

  • Seamless Integration with Data Ecosystems: Databend Cloud integrates seamlessly with popular data technologies and tools, offering SDKs in Java, Go, Python, Node.js, and Rust. It supports integration with Kafka, dbt, Flink CDC, Airbyte, DataX, and Debezium, addressing Typing's compatibility issues with their existing tech stack. This comprehensive support meets all data transformation, business intelligence, ad-hoc analysis, and data application needs, helping users quickly uncover the potential value of their data.

  • Cost Efficiency: Databend Cloud’s economical and intelligent compute clusters, combined with highly compressed and performance-optimized object storage, can reduce costs by up to 90%. This cost-effectiveness is crucial for startups like Typing, making data processing affordable.

  • Ease of Use: Databend Cloud offers a one-stop SaaS service, simplifying data import with data pipelines and task management, and freeing users from maintenance burdens. It is ready to use out-of-the-box with no need to build indexes, manually tune, or calculate partitions or sharding. All of this is automatically handled when data is loaded into tables.

Deployment Solution

The features of Databend Cloud perfectly matched Typing's requirements for a big data platform, leading Typing to choose Databend Cloud as their primary tool for big data analysis. After thorough planning, preparation, and compatibility assessments, Typing successfully migrated their big data computing operations to Databend Cloud.


Currently, Typing’s data primarily originates from AWS Aurora databases. Developers perform daily data synchronization on a T+1 basis. They first use the databend-py SDK to export data from dozens of tables in Aurora to S3. Then, the data from S3 is directly imported into Databend Cloud. Thanks to Databend's commitment to open-source principles and contributions to Superset, integrating with the Superset open-source data dashboard tool is seamless. Once the data is processed in Databend Cloud, it is transmitted to Superset for visualization.

In this setup, Databend Cloud mainly supports the operational data dashboards. Typing starts the data synchronization at 8 AM daily, handling around 2-3 TB of data, and completes data import and computation by 10 AM. This allows Typing's technical team to use Superset to create operational and product-focused data visualizations as soon as they start their workday.
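A T+1 schedule means that a run on day T loads the complete data for day T-1. The sketch below computes that window and builds one load statement per table. The table names, stage name, and `dt=` prefix layout are hypothetical, and the `COPY INTO` shape is modelled on Databend's documented syntax rather than taken from Typing's pipeline.

```python
from datetime import date, datetime, time, timedelta


def t_plus_1_window(run_day: date) -> tuple[datetime, datetime]:
    """Return the half-open [start, end) window covering the day before run_day."""
    end = datetime.combine(run_day, time.min)
    return end - timedelta(days=1), end


def copy_into_sql(table: str, stage: str, run_day: date) -> str:
    """Build a COPY INTO statement for the previous day's exported files."""
    start, _ = t_plus_1_window(run_day)
    prefix = f"{table}/dt={start.date().isoformat()}/"
    return (
        f"COPY INTO {table} FROM @{stage}/{prefix} "
        "FILE_FORMAT = (TYPE = PARQUET)"
    )


if __name__ == "__main__":
    # Hypothetical table names standing in for the Aurora exports.
    for table in ("users", "gift_transactions"):
        print(copy_into_sql(table, "aurora_export", date(2024, 5, 2)))
```

Using a half-open window avoids double-counting rows stamped exactly at midnight, since each timestamp belongs to exactly one daily run.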

Databend Cloud also serves a second purpose at Typing. Historical user behavior data (such as purchase records, voice room activity, and gift transactions) is processed in Databend Cloud to compute user segmentation labels, which are then imported into the business servers. These labels support business application development by facilitating personalized push notifications and other user-specific interactions.

Project Benefits

Since completing the deployment in November last year, Typing has experienced six months of significant improvements with Databend Cloud, effectively addressing various big data analysis challenges. The benefits have exceeded Typing's expectations in terms of query speed, accuracy of results, and cost efficiency.

  • Cost Reduction: After migrating to Databend Cloud, Typing achieved a 90% reduction in data costs, primarily attributed to the faster query speeds. The highest remaining cost is the data synchronization from AWS Aurora to Databend Cloud, and Typing is exploring ways to reduce this expense by collaborating with Databend Cloud on new synchronization mechanisms.

  • Operational Efficiency: Typing's operations team frequently writes SQL queries to set metrics and view data dashboards. Databend Cloud's unified SQL interface aligns with the team's existing database usage habits, reducing adaptation costs. The team has found the new dashboards very user-friendly and quick to yield results, contributing to a smooth and stable workflow.

  • Dedicated Support: Databend Cloud provides dedicated engineering support, addressing urgent issues within hours or days. This support has allowed Typing to forgo dedicated data development personnel, effectively making Databend engineers part of its data team, a level of service previously unimaginable with other major cloud providers.

The Story Continues ...

Typing is embarking on a new phase of exploration with Databend, driven by their trust in the platform and its potential for broader applications. Looking ahead, Typing plans to synchronize server-side event tracking data to Databend Cloud. This data, which captures more granular user behavior than database data, is invaluable for business decision-making and supports more time-sensitive business logic. The event tracking data will be synchronized approximately every 15 minutes, necessitating near-real-time updates. Databend, considering cost and timeliness, offers an incremental synchronization solution that can achieve updates as frequently as hourly.
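One common way to implement incremental synchronization is a watermark: each run picks up only the events newer than the last timestamp already synchronized, then advances the watermark. The sketch below illustrates that general technique and is not Databend's or Typing's actual mechanism. The `ts` field name and the in-memory event list are assumptions made for the example.

```python
from datetime import datetime


def incremental_batch(
    events: list[dict], watermark: datetime
) -> tuple[list[dict], datetime]:
    """Select events newer than the watermark and return the advanced watermark.

    Each event is assumed to carry a datetime under the hypothetical key "ts".
    If nothing new arrived, the watermark is returned unchanged.
    """
    fresh = sorted(
        (e for e in events if e["ts"] > watermark),
        key=lambda e: e["ts"],
    )
    new_watermark = fresh[-1]["ts"] if fresh else watermark
    return fresh, new_watermark


if __name__ == "__main__":
    events = [
        {"id": 1, "ts": datetime(2024, 5, 1, 10, 0)},
        {"id": 2, "ts": datetime(2024, 5, 1, 10, 20)},
    ]
    batch, mark = incremental_batch(events, datetime(2024, 5, 1, 10, 5))
    print([e["id"] for e in batch], mark.isoformat())
```

The same function works whether the scheduler fires every 15 minutes or every hour: the interval changes only how many events each batch contains, not the logic.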

Throughout their collaboration, Databend has not only resolved many of Typing's existing technical challenges but also embraced an open and cooperative approach. Together, they continue to explore new scenarios, providing reliable data support for the growth and development of Typing's social platform business.