Customer Success Story: Toco Chooses Databend Cloud to Tackle Big Data Challenges

Scott Feng · Jun 11, 2024

Toco is a Swiss digital currency and wallet service provider that offers Toco, a digital currency representing one ton of CO2 equivalent captured from the atmosphere. Users can easily exchange local currency for Tocos via the Tocos app, and use Tocos for spending, saving, or trading, thereby actively contributing to carbon removal with each transaction.

Technical Challenges

As Toco gradually expands across Europe, it anticipates significant growth challenges in the coming months. The marketing team wants to capture as much activity as possible as data and present it in a variety of displays, which puts the data infrastructure to the test.

Toco has high demands for big data and analytics platforms and requires a modular, flexible, and scalable data tech stack. The database must be able to access S3 Buckets while leaving the files in those buckets readable by other tools. It should also have a compute-storage separation architecture, which provides true flexibility and modularity. Together, these requirements narrowed down the database options.

Why Databend Cloud?

Databend Cloud was selected due to its excellent compute-storage separation architecture and the following features:

  • Enhanced Distributed Computing: Inspired by Snowflake, Databend Cloud enhances distributed computing capabilities. The user experience is similar to Snowflake's, with better resource scheduling and utilization at lower cost.

  • Vectorized Computing Engine: An industry-leading engine in which all operators are vectorized, significantly improving both single-machine performance and distributed cluster capabilities.

  • Object Storage Design: Fully designed around object storage, supporting over 20 protocols including HDFS, Amazon S3, Azure Blob, OSS, and COS. This achieves true compute-storage separation with finer-grained resource control, and compute nodes can scale elastically without storage capacity constraints. Built-in Stream (CDC) and Task features provide stream processing and task scheduling, enabling a unified batch-stream processing solution.

  • Native STAGE Support: STAGE is core to Databend Cloud's data flow, allowing users to load and export data and perform queries directly in STAGE. Users only need to create a STAGE containing data files to easily query data without complex table creation or data import processes. Transitioning from STAGE to Table is also straightforward.
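The STAGE workflow described above (create a stage over existing files, query it in place, then move to a table) can be sketched as a handful of SQL statements. The Python helpers below only build the statement text; they do not connect to anything. The stage name, bucket URL, table name, and Parquet file format are all hypothetical, and the credentials clause is deliberately left out. The SQL follows Databend's documented CREATE STAGE, stage query, and COPY INTO forms as best understood here, so it should be checked against the current documentation before use.

```python
def create_stage_sql(stage: str, s3_url: str) -> str:
    """Build a CREATE STAGE statement pointing at an S3 prefix.

    Credentials are omitted on purpose; a real statement would add a
    CONNECTION = (...) clause appropriate to the bucket.
    """
    return f"CREATE STAGE IF NOT EXISTS {stage} URL = '{s3_url}';"


def query_stage_sql(stage: str, pattern: str, limit: int = 10) -> str:
    """Build a SELECT that reads Parquet files directly from the stage."""
    return (
        f"SELECT * FROM @{stage} "
        f"(FILE_FORMAT => 'parquet', PATTERN => '{pattern}') "
        f"LIMIT {limit};"
    )


def copy_into_sql(table: str, stage: str, pattern: str) -> str:
    """Build a COPY INTO statement that loads the staged files into a table."""
    return (
        f"COPY INTO {table} FROM @{stage} "
        f"PATTERN = '{pattern}' "
        f"FILE_FORMAT = (TYPE = PARQUET);"
    )


if __name__ == "__main__":
    # All names below are hypothetical placeholders.
    stage = "marketing_stage"
    pattern = ".*[.]parquet"
    print(create_stage_sql(stage, "s3://example-bucket/marketing/"))
    print(query_stage_sql(stage, pattern))
    print(copy_into_sql("marketing_events", stage, pattern))
```

The second statement is the one that matters for exploration: it reads the files where they sit, so a table only needs to be created once the data has proven useful.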

Toco's tech lead states: "As a growing company with high demands for big data and analytics platforms, Databend Cloud's object storage design provides low-cost, high-performance, and comprehensive, flexible data processing. Few tools offer such convenience."

Deployment Solution

Toco currently uses Databend Cloud as its primary analytics database. The deployment is organized into several phases:

Phase I: Data Acquisition & Processing

In this phase, Toco primarily gathers data from application APIs and marketing activity webpages, using Mage to orchestrate the data processing workflows. The orchestration coordinates requests to multiple endpoints to collect data, which is then forwarded to an S3 Bucket, and Databend Cloud reads the data from that bucket. Toco also runs dbt processes in the database to transform and prepare the data, forming a robust data warehouse.
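The collection step of this phase can be illustrated with a minimal sketch. It is not Toco's Mage pipeline: the source names, the `raw/<source>/<date>.ndjson` key layout, and the newline-delimited JSON format are assumptions made for illustration. The fetchers and the uploader are passed in as plain callables, so the sketch runs without network access or an S3 client.

```python
import json
from datetime import date
from typing import Callable, Dict, Iterable, List

Record = Dict[str, object]


def object_key(source: str, day: date) -> str:
    """Return a date-partitioned object key for one source (hypothetical layout)."""
    return f"raw/{source}/{day.isoformat()}.ndjson"


def to_ndjson(records: Iterable[Record]) -> str:
    """Serialize records as newline-delimited JSON, one object per line."""
    return "\n".join(json.dumps(r, sort_keys=True) for r in records)


def run_collection(
    sources: Dict[str, Callable[[], List[Record]]],
    upload: Callable[[str, str], None],
    day: date,
) -> List[str]:
    """Fetch every source and pass the serialized payload to `upload`.

    Returns the object keys that were written, in source-name order.
    """
    written: List[str] = []
    for name in sorted(sources):
        records = sources[name]()
        if not records:
            continue  # skip empty batches rather than writing empty files
        key = object_key(name, day)
        upload(key, to_ndjson(records))
        written.append(key)
    return written


if __name__ == "__main__":
    bucket: Dict[str, str] = {}  # in-memory stand-in for an S3 Bucket
    keys = run_collection(
        {"app_api": lambda: [{"user_id": 1, "event": "signup"}]},
        bucket.__setitem__,
        date(2024, 6, 11),
    )
    print(keys)
```

In a real deployment the `upload` callable would wrap an S3 client, and each fetcher would wrap one endpoint request. Keeping one file per source per day gives the database predictable paths to read from.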

Phase II: Data Access & Presentation

Once the data is ready for customer use, Toco utilizes Superset to push the processed table data to customers. For internal users, Toco provides access to public Superset Dashboards, allowing them to connect directly to Databend Cloud and read the data. The public can also access certain data through Superset panels.

This structured approach ensures that both internal and external customers can seamlessly and efficiently access the necessary data. However, the process currently lacks automation. Toco's tech team plans to refine and expand these processes to meet future needs.

Not all data from regular marketing activities is captured this way. Table data held in PostgreSQL and MongoDB databases, for example, is not, so the API method falls short of requirements. In the next phase, Toco plans to use Airbyte, an open-source data integration tool, to replace API data acquisition. Airbyte will automate the creation of source tables and S3 Buckets, and any changes developers make to the databases will automatically appear in the source tables.

The Story Continues ...

Toco has been using Databend Cloud as its analytics database, and everything has run smoothly without major issues. As marketing efforts increase, Toco may face a surge in user volume. At that point, it will need to conduct a "battle test" to evaluate the system's stability in a production environment.

At the same time, Toco plans to push some data from Databend Cloud to users' API endpoints and to explore real-time data stream analysis using Airbyte and CDC. This data will primarily support marketers in real-time customer segmentation, which requires near-real-time analysis capabilities from the database. Databend Cloud can provide this, offering timely business decision support and helping Toco meet growing business demands.
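One way the built-in Stream (CDC) and Task features mentioned earlier could support near-real-time segmentation is a stream that tracks changes on a table, plus a scheduled task that moves those changes onward. The sketch below only builds the SQL text. Whether Toco will use this approach is not stated in the source, and the table, stream, task, and warehouse names are hypothetical. The syntax follows Databend's documented CREATE STREAM and CREATE TASK statements as best understood here and should be verified before use.

```python
def create_stream_sql(stream: str, table: str) -> str:
    """Build a CREATE STREAM statement that tracks changes on a table."""
    return f"CREATE STREAM IF NOT EXISTS {stream} ON TABLE {table};"


def create_task_sql(task: str, warehouse: str, every_minutes: int, body_sql: str) -> str:
    """Build a CREATE TASK statement that runs `body_sql` on a fixed schedule."""
    if every_minutes < 1:
        raise ValueError("every_minutes must be at least 1")
    return (
        f"CREATE TASK IF NOT EXISTS {task}\n"
        f"  WAREHOUSE = '{warehouse}'\n"
        f"  SCHEDULE = {every_minutes} MINUTE\n"
        f"AS\n{body_sql};"
    )


if __name__ == "__main__":
    # All names below are hypothetical placeholders.
    print(create_stream_sql("events_stream", "events"))
    print(
        create_task_sql(
            "refresh_segments",
            "default",
            1,
            "INSERT INTO segment_events "
            "SELECT user_id, event_type, event_time FROM events_stream",
        )
    )
```

The schedule interval sets the freshness of the downstream table, so "near-real-time" in this sketch means a delay on the order of the chosen interval.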
