Skip to main content

This Week in Databend #130

Databend is a modern cloud data warehouse, serving your massive-scale analytics needs at low cost and complexity. Open source alternative to Snowflake. Also available in the cloud: https://app.databend.com .

What's New

Stay informed about the latest features of Databend.

Supporting CREATE OR REPLACE DATABASE

CREATE OR REPLACE DATABASE is syntactic sugar designed to simplify the following SQL statements:

DROP DATABASE IF EXISTS ...
CREATE DATABASE ...

Support for CREATE OR REPLACE TABLE is also actively underway.

If you would like to learn more, please contact the Databend team or refer to the resources listed below:

Code Corner

Discover some fascinating code snippets or projects that showcase our work or learning journey.

Predicting Iris Species with SQL

Using a combination of CASE WHEN statements and simple criteria, we can mock a basic tree model with SQL, achieving nearly 97% accuracy in classifying iris species.

SELECT
COUNT(*) AS total_count,
SUM(CASE WHEN subquery.species = subquery.prediction THEN 1 ELSE 0 END) AS correct_count,
SUM(CASE WHEN subquery.species <> subquery.prediction THEN 1 ELSE 0 END) AS incorrect_count,
SUM(CASE WHEN subquery.species = subquery.prediction THEN 1 ELSE 0 END) / COUNT(*) AS accuracy
FROM
(
SELECT
sepal_length,
sepal_width,
petal_length,
petal_width,
species,
CASE
WHEN petal_width < 0.75 THEN 1
WHEN petal_width >= 0.75 AND petal_width < 1.35 THEN 2
WHEN petal_width >= 0.75 AND petal_width < 1.75 AND petal_width >= 1.35 AND sepal_width < 2.65 THEN 3
WHEN petal_width >= 0.75 AND petal_width < 1.75 AND petal_width >= 1.35 AND sepal_width >= 2.65 THEN 2
WHEN petal_width >= 0.75 AND petal_width >= 1.75 THEN 3
END AS prediction
FROM
iris
) AS subquery;

The complete example, using the iris dataset from Hugging Face, includes steps such as data access and data cleaning. For further reading, check out:

Highlights

We have also made these improvements to Databend that we hope you will find helpful:

  • Added support for SHOW USER FUNCTIONS.
  • Added a system table user_functions.
  • Added conditional functions nvl and nvl2.
  • Added JSON processing function minus supports deletion by index or name.
  • Added support for Predicate Move-Around.

What's Up Next

We're always open to cutting-edge technologies and innovative ideas. You're more than welcome to join the community and bring them to Databend.

Adding Support for Inverted Index

Inverted Index is a word-doc mapping structure commonly used in full-text search systems. Databend plans to introduce inverted indexes to meet the needs of full-text search scenarios.

The reference syntax is as follows:

-- Create index
CREATE [ASYNC] INVERTED INDEX ivt_index ON <table>(<column>) ...

-- Refresh index
REFRESH INVERTED INDEX <index_name> [LIMIT <limit>]

Issue #14505 | feat: inverted index

Please let us know if you're interested in contributing to this feature, or pick up a good first issue at https://link.databend.rs/i-m-feeling-lucky to get started.

New Contributors

We always open arms to everyone and can't wait to see how you'll help our community grow and thrive.

Changelog

You can check the changelog of Databend Nightly for details about our latest developments.

Full Changelog: https://github.com/datafuselabs/databend/compare/v1.2.307-nightly...v1.2.318-nightly


Contributors

A total of 29 contributors participated

We are very grateful for the outstanding work of the contributors.