This Week in Databend #123

Databend is a modern cloud data warehouse, serving your massive-scale analytics needs at low cost and complexity. It is an open-source alternative to Snowflake and is also available in the cloud: https://app.databend.com.

What's New

Stay informed about the latest features of Databend.

MERGE INTO Now Shows Statistics

Databend now shows statistics for MERGE INTO, returning the number of affected rows after updates, deletes, and inserts.

🐳 :) create table t1(a int);
🐳 :) create table t2(a int);
🐳 :) insert into t1 values(1),(3);
🐳 :) insert into t2 values(1),(3),(4);
🐳 :) merge into t1 using t2 on t1.a = t2.a when matched and t2.a = 1 then update * when matched then delete when not matched then insert *;
+-------------+-------------+-------------+
| insert_rows | update_rows | delete_rows |
+-------------+-------------+-------------+
|           1 |           1 |           1 |
+-------------+-------------+-------------+
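
To see where the three counts come from, the merge above can be replayed in plain Rust. This is only a simulation of the three clauses on single-column tables with unique values, not Databend's implementation; `MergeStats` and `simulate_merge` are hypothetical names introduced for illustration.

```rust
/// Counters mirroring the columns in the MERGE INTO result (hypothetical type).
#[derive(Debug, Default, PartialEq)]
struct MergeStats {
    insert_rows: usize,
    update_rows: usize,
    delete_rows: usize,
}

/// Replays the three clauses of the example statement on single-column tables.
/// Assumes values are unique within each table.
fn simulate_merge(target: &mut Vec<i32>, source: &[i32]) -> MergeStats {
    // Match against a snapshot so rows inserted during the merge never match.
    let original = target.clone();
    let mut stats = MergeStats::default();
    for &s in source {
        if original.contains(&s) {
            if s == 1 {
                // "when matched and t2.a = 1 then update *":
                // the row is overwritten with the (identical) source value.
                stats.update_rows += 1;
            } else {
                // "when matched then delete"
                target.retain(|&t| t != s);
                stats.delete_rows += 1;
            }
        } else {
            // "when not matched then insert *"
            target.push(s);
            stats.insert_rows += 1;
        }
    }
    stats
}
```

Calling `simulate_merge` with a target of `[1, 3]` and a source of `[1, 3, 4]` updates the row 1, deletes the row 3, and inserts the row 4, which gives one row per counter.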

If you want to learn more, please feel free to contact the Databend team, or check out the resources listed below.

Code Corner

Discover some fascinating code snippets or projects that showcase our work or learning journey.

Adding Custom Clippy Rules with clippy.toml

rust-clippy is the official linting tool for Rust. It uses static analysis to flag likely bugs and non-idiomatic code.

By configuring the clippy.toml file, you can define project-specific Clippy rules that enforce consistent coding practices across the codebase.

For example, the following configuration prompts developers to use std::sync::LazyLock instead of lazy_static::lazy_static:

disallowed-macros = [
    { path = "lazy_static::lazy_static", reason = "Please use `std::sync::LazyLock` instead." },
]
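
As a rough illustration of the migration this rule nudges developers toward, here is a minimal sketch of a lazily initialized global written with `std::sync::LazyLock` instead of the `lazy_static!` macro. It assumes a toolchain where `LazyLock` is stable (Rust 1.80 or later; older toolchains need nightly). The names `DEFAULT_SETTINGS` and `default_setting` and the values in the map are made up for the example.

```rust
use std::collections::HashMap;
use std::sync::LazyLock;

// Hypothetical global table. The closure runs once, on first access,
// and the result is shared by all threads afterwards.
static DEFAULT_SETTINGS: LazyLock<HashMap<&'static str, u64>> = LazyLock::new(|| {
    let mut m = HashMap::new();
    m.insert("max_threads", 8);
    m.insert("max_block_size", 65536);
    m
});

/// Looks up a setting by name; `LazyLock` dereferences to the inner map.
fn default_setting(name: &str) -> Option<u64> {
    DEFAULT_SETTINGS.get(name).copied()
}
```

Unlike `lazy_static!`, this needs no external crate or macro: the static is an ordinary value whose type is spelled out in full.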

Highlights

We have also made these improvements to Databend that we hope you will find helpful:

  • Introduced json_path_match function and @?, @@ operators.
  • Switched to volo thrift as the replacement for upstream thrift, which is no longer actively maintained.
  • See Docs | Tutorial: Dashboarding Covid-19 Data from New York Times to learn how to create and manage charts with the Databend Cloud Dashboard.

What's Up Next

We're always open to cutting-edge technologies and innovative ideas. You're more than welcome to join the community and bring them to Databend.

Adding Support for More Data File Types to INFER_SCHEMA

Databend supports the infer_schema table function, which infers the schema of data files for easy data loading and analysis.

See an example:

SELECT * FROM INFER_SCHEMA(location => '@infer_parquet/data_e0fd9cba-f45c-4c43-aa07-d6d87d134378_0_0.parquet');
+-------------+-----------------+----------+----------+
| column_name | type            | nullable | order_id |
+-------------+-----------------+----------+----------+
| number      | BIGINT UNSIGNED |        0 |        0 |
+-------------+-----------------+----------+----------+

Currently, infer_schema only works with Parquet files. We plan to extend its support to include other file types such as CSV and JSON.

Issue #13959 | INFER_SCHEMA supports more file types
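
For contributors wondering what inference over a text format might involve, here is a toy sketch in Rust. It is not Databend's design or code. It assumes a comma-separated input with a header row and no quoting, tries each value as an integer, then as a float, and otherwise falls back to a string, marking a column nullable when it sees an empty field. All type and function names are hypothetical.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
enum InferredType {
    BigInt,
    Double,
    Varchar,
}

#[derive(Debug, PartialEq)]
struct InferredColumn {
    name: String,
    ty: InferredType,
    nullable: bool,
}

/// Guesses the narrowest type that can hold a single non-empty value.
fn infer_value(v: &str) -> InferredType {
    if v.parse::<i64>().is_ok() {
        InferredType::BigInt
    } else if v.parse::<f64>().is_ok() {
        InferredType::Double
    } else {
        InferredType::Varchar
    }
}

/// Combines two guesses into the wider one: Varchar > Double > BigInt.
fn widen(a: InferredType, b: InferredType) -> InferredType {
    use InferredType::*;
    match (a, b) {
        (Varchar, _) | (_, Varchar) => Varchar,
        (Double, _) | (_, Double) => Double,
        _ => BigInt,
    }
}

/// Infers one column per header field. Fields beyond the header are ignored,
/// and columns with no non-empty values default to Varchar.
fn infer_csv_schema(csv: &str) -> Vec<InferredColumn> {
    let mut lines = csv.lines();
    let Some(header) = lines.next() else {
        return Vec::new();
    };
    // (name, type seen so far, saw an empty field)
    let mut cols: Vec<(String, Option<InferredType>, bool)> = header
        .split(',')
        .map(|n| (n.trim().to_string(), None, false))
        .collect();
    for line in lines {
        for (i, field) in line.split(',').enumerate() {
            let Some(col) = cols.get_mut(i) else { continue };
            let field = field.trim();
            if field.is_empty() {
                col.2 = true;
                continue;
            }
            let ty = infer_value(field);
            col.1 = Some(match col.1 {
                Some(prev) => widen(prev, ty),
                None => ty,
            });
        }
    }
    cols.into_iter()
        .map(|(name, ty, nullable)| InferredColumn {
            name,
            ty: ty.unwrap_or(InferredType::Varchar),
            nullable,
        })
        .collect()
}
```

A real implementation would also need to handle quoting, sampling limits, and richer types such as dates, which this sketch deliberately leaves out.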

Please let us know if you're interested in contributing to this feature, or pick up a good first issue at https://link.databend.rs/i-m-feeling-lucky to get started.

Changelog

You can check the changelog of Databend Nightly for details about our latest developments.

Full Changelog: https://github.com/datafuselabs/databend/compare/v1.2.239-nightly...v1.2.248-nightly


Contributors

A total of 24 contributors participated in this release.

We are very grateful for the outstanding work of the contributors.