This Week in Databend #88

What's On In Databend

Support Eager Aggregation

Eager aggregation helps improve the performance of queries that involve grouping and joining data. It works by partially pushing a groupby past a join, which reduces the number of input rows to the join and may result in a better overall plan.

Databend recently added support for eager aggregation. Here is an example of how it works.

Input:
    aggregate: SUM(x), SUM(y)
        |
       join
        | \
        |  (y)
        |
       (x)

(1) Eager Groupby-Count:
    final aggregate: SUM(eager SUM(x)), SUM(y * cnt)
        |
       join
        | \
        |  (y)
        |
    eager group-by: eager SUM(x), eager count: cnt

(2) Eager Split:
    final aggregate: SUM(eager SUM(x) * cnt2), SUM(eager SUM(y) * cnt1)
        |
       join
        | \
        |  eager group-by: eager SUM(y), eager count: cnt2
        |
    eager group-by: eager SUM(x), eager count: cnt1
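
To see why the two rewrites above preserve the query result, here is a small standalone Rust sketch that evaluates all three plans over in-memory rows and checks that they agree. It is only an illustration of the algebra, not Databend's optimizer code: the `Row` type, the function names, and the sample data are all made up for this example, and the join is assumed to be an inner equi-join on a single key.

```rust
use std::collections::HashMap;

/// One input row: (join key, value). The left input carries x, the right carries y.
/// This layout is an assumption made for the example.
type Row = (u32, i64);

/// Baseline plan: join first, then compute SUM(x), SUM(y) over the joined rows.
fn join_then_aggregate(left: &[Row], right: &[Row]) -> (i64, i64) {
    let (mut sum_x, mut sum_y) = (0i64, 0i64);
    for &(lk, x) in left {
        for &(rk, y) in right {
            if lk == rk {
                sum_x += x;
                sum_y += y;
            }
        }
    }
    (sum_x, sum_y)
}

/// Eager group-by on one input: per join key, (eager SUM, eager count).
fn eager_group_by(rows: &[Row]) -> HashMap<u32, (i64, i64)> {
    let mut groups: HashMap<u32, (i64, i64)> = HashMap::new();
    for &(key, value) in rows {
        let entry = groups.entry(key).or_insert((0, 0));
        entry.0 += value;
        entry.1 += 1;
    }
    groups
}

/// (1) Eager Groupby-Count: pre-aggregate only the x side.
/// final aggregate: SUM(eager SUM(x)), SUM(y * cnt)
fn eager_groupby_count(left: &[Row], right: &[Row]) -> (i64, i64) {
    let left_groups = eager_group_by(left);
    let (mut sum_x, mut sum_y) = (0i64, 0i64);
    for &(rk, y) in right {
        if let Some(&(eager_sum_x, cnt)) = left_groups.get(&rk) {
            sum_x += eager_sum_x;
            sum_y += y * cnt;
        }
    }
    (sum_x, sum_y)
}

/// (2) Eager Split: pre-aggregate both sides.
/// final aggregate: SUM(eager SUM(x) * cnt2), SUM(eager SUM(y) * cnt1)
fn eager_split(left: &[Row], right: &[Row]) -> (i64, i64) {
    let left_groups = eager_group_by(left); // eager SUM(x), cnt1
    let right_groups = eager_group_by(right); // eager SUM(y), cnt2
    let (mut sum_x, mut sum_y) = (0i64, 0i64);
    for (key, &(eager_sum_x, cnt1)) in &left_groups {
        if let Some(&(eager_sum_y, cnt2)) = right_groups.get(key) {
            sum_x += eager_sum_x * cnt2;
            sum_y += eager_sum_y * cnt1;
        }
    }
    (sum_x, sum_y)
}

fn main() {
    let left = [(1, 10), (1, 20), (2, 5), (3, 7)];
    let right = [(1, 1), (1, 2), (1, 3), (2, 4), (4, 9)];
    let baseline = join_then_aggregate(&left, &right);
    assert_eq!(baseline, eager_groupby_count(&left, &right));
    assert_eq!(baseline, eager_split(&left, &right));
    println!("SUM(x) = {}, SUM(y) = {}", baseline.0, baseline.1);
}
```

In this toy data the baseline join produces seven rows, while the eager plans feed the join at most one row per key from each pre-aggregated side, which is where the potential saving comes from.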

Support All TPC-DS Queries

Databend now supports all TPC-DS queries!

TPC-DS is a decision support benchmark that models several generally applicable aspects of a decision support system, including queries and data maintenance. The benchmark provides a representative evaluation of performance as a general-purpose decision support system.

Code Corner

databend-driver - A driver for Databend in Rust

The Databend community has crafted a Rust driver that allows developers to connect to Databend and execute SQL queries in Rust.

Here's an example of how to use the driver:

use databend_driver::new_connection;

// `exec` is async, so this snippet must run inside an async runtime.
let dsn = "databend://root:@localhost:8000/default?sslmode=disable";
let conn = new_connection(dsn).unwrap();

let sql_create = "CREATE TABLE books (
    title VARCHAR,
    author VARCHAR,
    date Date
);";
conn.exec(sql_create).await.unwrap();
let sql_insert = "INSERT INTO books VALUES ('The Little Prince', 'Antoine de Saint-Exupéry', '1943-04-06');";
conn.exec(sql_insert).await.unwrap();

AskBend - SQL-based Knowledge Base Search and Completion

AskBend is a Rust project that utilizes the power of Databend and OpenAI to create a SQL-based knowledge base from Markdown files.

With AskBend, you can easily search and retrieve the most relevant information to your queries using SQL. The project automatically generates document embeddings from the content, enabling you to quickly find the information you need.

How it works:

  1. Read and parse Markdown files from a directory.
  2. Store the content in the askbend.doc table.
  3. Compute embeddings for the content using Databend Cloud's built-in AI capabilities.
  4. When a user asks a question, generate the embedding using Databend Cloud's SQL-based ai_embedding_vector function.
  5. Find the most relevant doc.content using Databend Cloud's SQL-based cosine_distance function.
  6. Use OpenAI's completion capabilities with Databend Cloud's SQL-based ai_text_completion function.
  7. Output the completion result in Markdown format.
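
The retrieval in step 5 boils down to a nearest-neighbour search by cosine distance. The sketch below shows that idea in plain Rust, outside of Databend. It is not how AskBend or Databend's cosine_distance function is implemented: the helper names, the three-dimensional toy embeddings, and the document labels are all hypothetical, and real embeddings have far more dimensions.

```rust
/// Cosine distance between two vectors: 1 - (a . b) / (|a| * |b|).
/// Returns None for empty, mismatched, or zero-length-norm inputs.
fn cosine_distance(a: &[f32], b: &[f32]) -> Option<f32> {
    if a.is_empty() || a.len() != b.len() {
        return None;
    }
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let norm_a = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let norm_b = b.iter().map(|y| y * y).sum::<f32>().sqrt();
    if norm_a == 0.0 || norm_b == 0.0 {
        return None;
    }
    Some(1.0 - dot / (norm_a * norm_b))
}

/// Return the label of the document whose embedding is closest to the query.
fn most_relevant<'a>(query: &[f32], docs: &'a [(&'a str, Vec<f32>)]) -> Option<&'a str> {
    docs.iter()
        .filter_map(|(label, embedding)| {
            cosine_distance(query, embedding).map(|dist| (*label, dist))
        })
        .min_by(|a, b| a.1.total_cmp(&b.1))
        .map(|(label, _)| label)
}

fn main() {
    // Toy embeddings, invented for illustration only.
    let docs: Vec<(&str, Vec<f32>)> = vec![
        ("doc about joins", vec![1.0, 0.0, 0.0]),
        ("doc about AI functions", vec![0.0, 1.0, 1.0]),
    ];
    let query: [f32; 3] = [0.1, 0.9, 0.8];
    match most_relevant(&query, &docs) {
        Some(label) => println!("most relevant: {}", label),
        None => println!("no comparable document"),
    }
}
```

A smaller distance means the two vectors point in a more similar direction, so the document with the minimum distance is treated as the best match for the question.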

  • New Aggregation Functions Added: QUANTILE_DISC, KURTOSIS, SKEWNESS
  • Learn everything about AI functions in Databend: Docs - AI Functions

What's Up Next

Add Nullable Table Schema Tests to Databend

Currently, Databend table schemas are not nullable by default, so almost all table schemas used in tests are non-nullable. We need to add tests whose table schemas are nullable to cover this case.

To achieve this goal, we need to add some new test cases in Databend. These test cases should include nullable table schemas to ensure that Databend can handle these cases correctly.

Issue #10969 | test: add some tests which table schemas are nullable

Please let us know if you're interested in contributing to this issue, or pick up a good first issue to get started.

New Contributors

  • @Dousir9 made their first contribution in #10884. The PR fixes the wrong cardinality estimation when the aggregation function's argument has multiple columns.
  • @YimingQiao made their first contribution in #10906. The PR adds summaries of the KURTOSIS and SKEWNESS functions and reorders the functions to make them consistent with the function order in the navigation bar.
  • @jsoref made their first contribution in #10914. The PR helps improve the quality of the code and documentation by fixing spelling errors.
  • @leiwenfang made their first contribution in #10917. The PR beautifies the blog covers.
  • @ArberSephirotheca made their first contribution in #10949. The PR adds a new function called to_unix_timestamp(), which converts a Databend timestamp to a Unix timestamp.


You can check the changelog of Databend Nightly for details about our latest developments.

Full Changelog:


A total of 22 contributors participated

We are very grateful for the outstanding work of the contributors.