Blog

This Week in Databend #130

PsiACEJan 29, 2024

Databend is a modern cloud data warehouse, serving your massive-scale analytics needs at low cost and complexity. Open source alternative to Snowflake. Also available in the cloud: https://app.databend.com .

What's New

Stay informed about the latest features of Databend.

Supporting
CREATE OR REPLACE DATABASE

CREATE OR REPLACE DATABASE
is syntactic sugar designed to simplify the following SQL statements:

DROP DATABASE IF EXISTS ...
CREATE DATABASE ...

Support for

CREATE OR REPLACE TABLE
is also actively underway.

If you would like to learn more, please contact the Databend team or refer to the resources listed below:

Code Corner

Discover some fascinating code snippets or projects that showcase our work or learning journey.

Predicting Iris Species with SQL

Using a combination of

CASE WHEN
statements and simple criteria, we can mock a basic tree model with SQL, achieving nearly 97% accuracy in classifying iris species.

SELECT
COUNT(*) AS total_count,
SUM(CASE WHEN subquery.species = subquery.prediction THEN 1 ELSE 0 END) AS correct_count,
SUM(CASE WHEN subquery.species <> subquery.prediction THEN 1 ELSE 0 END) AS incorrect_count,
SUM(CASE WHEN subquery.species = subquery.prediction THEN 1 ELSE 0 END) / COUNT(*) AS accuracy
FROM
(
SELECT
sepal_length,
sepal_width,
petal_length,
petal_width,
species,
CASE
WHEN petal_width < 0.75 THEN 1
WHEN petal_width >= 0.75 AND petal_width < 1.35 THEN 2
WHEN petal_width >= 0.75 AND petal_width < 1.75 AND petal_width >= 1.35 AND sepal_width < 2.65 THEN 3
WHEN petal_width >= 0.75 AND petal_width < 1.75 AND petal_width >= 1.35 AND sepal_width >= 2.65 THEN 2
WHEN petal_width >= 0.75 AND petal_width >= 1.75 THEN 3
END AS prediction
FROM
iris
) AS subquery;

The complete example, using the iris dataset from Hugging Face, includes steps such as data access and data cleaning. For further reading, check out:

Highlights

We have also made these improvements to Databend that we hope you will find helpful:

  • Added support for
    SHOW USER FUNCTIONS
    .
  • Added a system table
    user_functions
    .
  • Added conditional functions
    nvl
    and
    nvl2
    .
  • Added JSON processing function
    minus
    supports deletion by index or name.
  • Added support for Predicate Move-Around.

What's Up Next

We're always open to cutting-edge technologies and innovative ideas. You're more than welcome to join the community and bring them to Databend.

Adding Support for Inverted Index

Inverted Index is a word-doc mapping structure commonly used in full-text search systems. Databend plans to introduce inverted indexes to meet the needs of full-text search scenarios.

The reference syntax is as follows:

-- Create index
CREATE [ASYNC] INVERTED INDEX ivt_index ON <table>(<column>) ...

-- Refresh index
REFRESH INVERTED INDEX <index_name> [LIMIT <limit>]

Issue #14505 | feat: inverted index

Please let us know if you're interested in contributing to this feature, or pick up a good first issue at https://link.databend.com/i-m-feeling-lucky to get started.

New Contributors

We always open arms to everyone and can't wait to see how you'll help our community grow and thrive.

Changelog

You can check the changelog of Databend Nightly for details about our latest developments.

Full Changelog: https://github.com/datafuselabs/databend/compare/v1.2.307-nightly...v1.2.318-nightly

Share this post

Subscribe to our newsletter

Stay informed on feature releases, product roadmap, support, and cloud offerings!