This Week in Databend #97
PsiACEJun 11, 2023
Databend is a modern cloud data warehouse, serving your massive-scale analytics needs at low cost and complexity. Open source alternative to Snowflake. Also available in the cloud: https://app.databend.com .
What's On In Databend
Stay connected with the latest news about Databend.
Column Position
Databend now offers support for utilizing syntax like $N to represent column positions. For instance, $2 indicates the second column. Additionally, Databend allows the usage of column positions alongside column names in SQL statements. Here is a simple example:
CREATE TABLE IF NOT EXISTS t1(a int, b varchar);
INSERT INTO t1 values (1, 'a'), (2, 'b');
select $1, $2, a, b from t1;
┌─────────────────────────────────┐
│ $1 │ $2 │ a │ b │
│ Int32 │ String │ Int32 │ String │
├───────┼────────┼───────┼────────┤
│ 1 │ a │ 1 │ a │
│ 2 │ b │ 2 │ b │
└──────── ─────────────────────────┘
You can also use column positions when you SELECT FROM a staged NDJSON file. We are also actively working on extending this support to other formats. When using the COPY INTO statement to copy data from a stage, Databend matches the field names at the top level of the NDJSON file with the column names in the destination table, rather than relying on column positions.
SELECT $1 FROM @my_stage (FILE_FORMAT=>'ndjson')
COPY INTO my_table FROM (SELECT $1 SELECT @my_stage t) FILE_FORMAT = (type = NDJSON)
It is important to note that when using the SELECT statement for NDJSON in Databend, only $1 is allowed, representing the entire row and having the data type variant.
-- Select the entire row using column position:
SELECT $1 FROM @my_stage (FILE_FORMAT=>'ndjson')
--Select a specific field named "a" using column position:
SELECT $1:a FROM @my_stage (FILE_FORMAT=>'ndjson')
If you are interested in learning more, please check out the resources listed below:
- Issue | Feature: support $《col_position》
- Issue | Feature: copy/select from stage by pos
- PR | feat: support column position like $N
- PR | feat: select from stage support NDJson
Code Corner
Discover some fascinating code snippets or projects that showcase our work or learning journey.
Learn Databend Workflows - Typos Check
Databend now has a very complex workflow for handling code auditing, testing, benchmarking and release. Typos Check is undoubtedly the simplest part of it. Let's take a look at some of its contents together.
Like other workflows, we need to use actions/checkout to check out the code.
- uses: actions/checkout@v4
with:
clean: 'true'
typos-cli
- uses: baptiste0928/cargo-install@v1
with:
crate: typos-cli
args: --locked
cache-key: typos-check
We use
baptiste0928/cargo-install
cargo install
- name: do typos check with typos-cli
run: typos
One thing to note is that
typos-cli
typos
If you are interested in learning more, please check out the resources listed below:
Highlights
We have also made these improvements to Databend that we hope you will find helpful:
- Added support for distributed Top-N.
- The lazy_topn_threshold setting is now active by default, with a default value of 1,000.
- For enhanced security measures, the ability to change the password has been added to the root user.
- Read Blog | Databend X Tableau to learn how to connect Databend for BI data analysis in Tableau.
- Read Docs | Integrating Databend as a Sink for Vector to understand how to integrate Vector with Databend.
What's Up Next
We're always open to cutting-edge technologies and innovative ideas. You're more than welcome to join the community and bring them to Databend.
Add a Deduplication Label Field to the Rest API
To ensure that data ingestion is idempotent, Databend now supports deduplication of DML through the use of a deduplication label. You can find more information on this feature at Docs | Setting Commands - SET_VAR.
To facilitate cross-language driver integration, we could add a REST API field for the label.
Issue #11710 | Feature: support to bring deduplication label on stage attachment api
Please let us know if you're interested in contributing to this issue, or pick up a good first issue at https://link.databend.com/i-m-feeling-lucky to get started.
Changelog
You can check the changelog of Databend Nightly for details about our latest developments.
Full Changelog: https://github.com/datafuselabs/databend/compare/v1.1.55-nightly...v1.1.56-draft2
Subscribe to our newsletter
Stay informed on feature releases, product roadmap, support, and cloud offerings!