Blog

This Week in Databend #123

PsiACEDec 10, 2023

Databend is a modern cloud data warehouse, serving your massive-scale analytics needs at low cost and complexity. Open source alternative to Snowflake. Also available in the cloud: https://app.databend.com .

What's New

Stay informed about the latest features of Databend.

MERGE INTO now Shows Statistics

Databend now shows statistics for MERGE INTO, returning the number of affected rows after updates, deletes, and inserts.

🐳 :) create table t1(a int);
🐳 :) create table t2(b int);
🐳 :) insert into t1 values(1),(3);
🐳 :) insert into t2 values(1),(3),(4);
🐳 :) merge into t1 using t2 on t1.a = t2.a when matched and t2.a = 1 then update * when
matched then delete when not matched then insert *;
+-------------+-------------+-------------+
| insert_rows | update_rows | delete_rows |
+-------------+-------------+-------------+
| 1 | 1 | 1 |
+-------------+-------------+-------------+

If you want to learn more, please feel free to contact the Databend team, or check out the resources listed below.

Code Corner

Discover some fascinating code snippets or projects that showcase our work or learning journey.

Adding Custom Clippy Rules with clippy.toml

rust-clippy is an official code linting tool provided by Rust. It uses static analysis to discover issues or non-compliant code.

By configuring the

clippy.toml
file, you can specify project-specific Clippy rules to enforce consistent code development practices and provide best practice guidelines.

For example, the following configuration prompts developers to use

std::sync::LazyLock
instead of
lazy_static::lazy_static
:

disallowed-macros = [
{ path = "lazy_static::lazy_static", reason = "Please use `std::sync::LazyLock` instead." },
]

Highlights

We have also made these improvements to Databend that we hope you will find helpful:

  • Introduced
    json_path_match
    function and
    @?
    ,
    @@
    operators.
  • Switched to volo thrift as the replacement for upstream thrift, which is no longer actively maintained.
  • Read the documentation Docs | Tutorial: Dashboarding Covid-19 Data from New York Times to learn how to create and manage visual charts using the Databend Cloud Dashboard.

What's Up Next

We're always open to cutting-edge technologies and innovative ideas. You're more than welcome to join the community and bring them to Databend.

Adding Support for More Data File Types to
INFER_SCHEMA

Databend supports the

infer_schema
table function, which infers the schema of data files for easy data loading and analysis.

See an example:

SELECT * FROM INFER_SCHEMA(location => '@infer_parquet/data_e0fd9cba-f45c-4c43-aa07-d6d87d134378_0_0.parquet');
+-------------+-----------------+----------+----------+
| column_name | type | nullable | order_id |
+-------------+-----------------+----------+----------+
| number | BIGINT UNSIGNED | 0 | 0 |
+-------------+-----------------+----------+----------+

Currently, infer_schema only works with Parquet files. We plan to extend its support to include other file types such as CSV and JSON.

Issue #13959 | INFER_SCHEMA supports more file types

Please let us know if you're interested in contributing to this feature, or pick up a good first issue at https://link.databend.com/i-m-feeling-lucky to get started.

Changelog

You can check the changelog of Databend Nightly for details about our latest developments.

Full Changelog: https://github.com/datafuselabs/databend/compare/v1.2.239-nightly...v1.2.248-nightly

Share this post

Subscribe to our newsletter

Stay informed on feature releases, product roadmap, support, and cloud offerings!