This Week in Databend #128

Databend is a modern cloud data warehouse that serves massive-scale analytics needs at low cost and complexity. It is an open-source alternative to Snowflake and is also available in the cloud: https://app.databend.com.

What's New

Stay informed about the latest features of Databend.

Querying Data on HuggingFace File System with Databend

Hugging Face is one of the most popular AI communities today. Databend can now directly query and analyze large datasets and models stored on the Hugging Face file system.

URI format: hf://{repo_id}/path/to/file, where a repo_id might look like fka/awesome-chatgpt-prompts.

Supported configurations include:

  • repo_type: The type of Hugging Face repository. Available options are dataset and model; the default is dataset.
  • revision: The revision of the Hugging Face repository, which can be a branch, tag, or commit. The default is main.
  • token: The Hugging Face API token.

For example, a query against fka/awesome-chatgpt-prompts can list the first 5 rows of the first column in a CSV file.
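As a rough illustration of the URI format and options described above, the Python sketch below assembles an hf:// URI, fills in the documented defaults, and builds a query string. The helper names, the file name prompts.csv, and the exact SQL spelling (modeled on Databend's general syntax for querying a file by URI) are assumptions rather than details confirmed in this post, so check the Databend documentation before relying on them.

```python
# Hypothetical helpers: the names and the SQL spelling are illustrative assumptions.

# Defaults as described in the option list above.
DEFAULTS = {"repo_type": "dataset", "revision": "main"}


def hf_options(**overrides: str) -> dict:
    """Merge user-supplied options over the documented defaults."""
    options = {**DEFAULTS, **overrides}
    if options["repo_type"] not in ("dataset", "model"):
        raise ValueError("repo_type must be 'dataset' or 'model'")
    return options


def hf_uri(repo_id: str, path: str) -> str:
    """Build a URI of the form hf://{repo_id}/path/to/file."""
    return f"hf://{repo_id.strip('/')}/{path.lstrip('/')}"


def first_column_query(uri: str, limit: int = 5) -> str:
    """Return SQL text that selects the first column ($1) of a CSV file."""
    return f"SELECT $1 FROM '{uri}' (FILE_FORMAT => 'CSV') LIMIT {limit};"


# prompts.csv is an assumed file name inside the repository.
uri = hf_uri("fka/awesome-chatgpt-prompts", "prompts.csv")
query = first_column_query(uri)
options = hf_options(revision="main")
print(query)
```

The resulting string can be pasted into a Databend client. How repo_type, revision, and token are passed alongside the query is not shown here, because the post does not spell it out.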

If you would like to learn more, please contact the Databend team.

Code Corner

Discover some fascinating code snippets or projects that showcase our work or learning journey.

Data Type Mappings across Databend, MySQL, and Oracle

This table provides an outline of the mapping of data types between Databend, MySQL, and Oracle.

Databend      | MySQL     | Oracle
--------------+-----------+-------------
TINYINT       | TINYINT   | NUMBER(3,0)
SMALLINT      | SMALLINT  | NUMBER(5,0)
INT           | INT       | NUMBER(10,0)
BIGINT        | BIGINT    | NUMBER(19,0)
FLOAT         | FLOAT     | FLOAT
DOUBLE        | DOUBLE    | FLOAT(24)
DECIMAL       | DECIMAL   | FLOAT(24)
DATE          | DATE      | DATE
TIMESTAMP     | TIMESTAMP | NUMBER
DATETIME      | DATETIME  | DATE
YEAR          | INT       | NUMBER
VARCHAR       | VARCHAR   | VARCHAR2
VARCHAR       | CHAR      | CHAR
VARBINARY     | VARBINARY | RAW, BLOB
VARCHAR       | VARCHAR   | VARCHAR2
VARCHAR       | VARCHAR   | RAW, CBLOB
VARBINARY     | VARBINARY | RAW, BLOB
VARCHAR       | VARCHAR   | RAW, CBLOB
VARCHAR       | VARCHAR   | VARCHAR2
VARCHAR       | VARCHAR   | VARCHAR2
ARRAY         | N/A       | N/A
BOOLEAN       | N/A       | N/A
TUPLE         | N/A       | N/A
MAP           | N/A       | N/A
JSON, VARIANT | JSON      | JSON
BITMAP        | N/A       | N/A
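When migrating a schema, a mapping like this is typically applied as a simple lookup. The sketch below is one possible way to do that in Python for the MySQL to Databend direction; the function and dictionary names are hypothetical, only the unambiguous rows of the table are included, INT is mapped to INT, and JSON is mapped to VARIANT as an arbitrary pick between the two listed names.

```python
# MySQL type -> Databend type, copied from the unambiguous rows of the table above.
MYSQL_TO_DATABEND = {
    "TINYINT": "TINYINT",
    "SMALLINT": "SMALLINT",
    "INT": "INT",
    "BIGINT": "BIGINT",
    "FLOAT": "FLOAT",
    "DOUBLE": "DOUBLE",
    "DECIMAL": "DECIMAL",
    "DATE": "DATE",
    "TIMESTAMP": "TIMESTAMP",
    "DATETIME": "DATETIME",
    "CHAR": "VARCHAR",
    "VARCHAR": "VARCHAR",
    "VARBINARY": "VARBINARY",
    "JSON": "VARIANT",  # the table lists "JSON, VARIANT"; VARIANT is chosen here
}


def to_databend_type(mysql_type: str) -> str:
    """Map a MySQL column type such as 'varchar(255)' to a Databend type name.

    Length and precision arguments are dropped, so this is only a starting
    point for a real migration.
    """
    base = mysql_type.split("(", 1)[0].strip().upper()
    try:
        return MYSQL_TO_DATABEND[base]
    except KeyError:
        raise ValueError(f"no mapping listed for MySQL type {mysql_type!r}") from None


print(to_databend_type("char(10)"))
```

Raising an error for unlisted types, instead of silently falling back to VARCHAR, keeps unmapped columns visible during a migration.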

Highlights

We have also made these improvements to Databend that we hope you will find helpful:

  • Added the Binary data type and support for conversion between String and Binary.
  • Support for adaptive filter reorder.
  • Support for JSON function concat.
  • Support for automatic refresh of the ReadOnlyAttach table schema.
  • Support for a greedy JOIN order algorithm.

What's Up Next

We're always open to cutting-edge technologies and innovative ideas. You're more than welcome to join the community and bring them to Databend.

Adding Support for Task Advice System Table

Databend plans to introduce the task_advice system table to provide effective insights on daily operations and help database administrators to manage their data more easily.

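catalog_name | database_name | table_name | task_type         | need_run | task_sql                                      | reason(variant)
-------------+---------------+------------+-------------------+----------+-----------------------------------------------+-------------------
default      | db            | xx         | COMPACT           | 1        | optimize table xx compact limit 3             | "{status in json}"
default      | db            | xx         | AGGREGATING_INDEX | 0        | refresh aggregating index xx_agg_idx limit 10 | "{status in json}"
default      | db            | yy         | ADD_CLUSTER_KEY   | 1        | alter table yy cluster by(col1)               | "{status in json}"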
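To make the intended use more concrete, here is a minimal Python sketch of how an administrator's script might consume rows shaped like the proposal above. The table is only planned, so the column set may change. Reading need_run = 1 as "this task should be executed" is an assumption based on the column name, and pending_tasks is a hypothetical helper.

```python
# Sample rows mirroring the proposed task_advice layout (planned, not final).
rows = [
    {"table_name": "xx", "task_type": "COMPACT", "need_run": 1,
     "task_sql": "optimize table xx compact limit 3"},
    {"table_name": "xx", "task_type": "AGGREGATING_INDEX", "need_run": 0,
     "task_sql": "refresh aggregating index xx_agg_idx limit 10"},
    {"table_name": "yy", "task_type": "ADD_CLUSTER_KEY", "need_run": 1,
     "task_sql": "alter table yy cluster by(col1)"},
]


def pending_tasks(advice_rows):
    """Return the task_sql of every row flagged with need_run == 1."""
    return [row["task_sql"] for row in advice_rows if row["need_run"] == 1]


# Print the statements an operator would review before running them.
for sql in pending_tasks(rows):
    print(sql)
```

Printing the statements for review, not executing them automatically, leaves the final decision with the operator.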

Issue #14323 | feat: task_advice system table

Please let us know if you're interested in contributing to this feature, or pick up a good first issue at https://link.databend.rs/i-m-feeling-lucky to get started.

Changelog

You can check the changelog of Databend Nightly for details about our latest developments.

Full Changelog: https://github.com/datafuselabs/databend/compare/v1.2.286-nightly...v1.2.296-nightly


Contributors

A total of 19 contributors participated.

We are very grateful for the outstanding work of the contributors.