Blog

This Week in Databend #122

PsiACEDec 3, 2023
This Week in Databend #122

Databend is a modern cloud data warehouse, serving your massive-scale analytics needs at low cost and complexity. Open source alternative to Snowflake. Also available in the cloud: https://app.databend.com .

What's New

Stay informed about the latest features of Databend.

Supporting Chain Function Calls

Databend now offers an experimental SQL dialect to support chain function calls.

To try out this feature, run the command below to enable it first:

set sql_dialect = 'experimental';

Here is a simple example:

🐳 :) with t(f) as (select '11|open|22|ai|33|is nothing without sam'.split('|')
.array_filter(x -> try_cast(x as int) is not null ).array_apply(x -> x::int ).array_apply(x -> x::int64 ))
select f.array_reduce('sum'), f.array_reduce('max'), f.array_reduce('min') from t;
┌──────────────────────────────────────────────────────────────────────────┐
│ array_reduce(f, 'sum') │ array_reduce(f, 'max') │ array_reduce(f, 'min')
│ Nullable(Int64) │ Nullable(Int64) │ Nullable(Int64)
├────────────────────────┼────────────────────────┼────────────────────────┤
663311
└──────────────────────────────────────────────────────────────────────────┘

If you want to learn more, please feel free to contact the Databend team, or check out the resources listed below.

Code Corner

Discover some fascinating code snippets or projects that showcase our work or learning journey.

Load Balancing with Query ID and Nginx

Databend JDBC carries an HTTP Header named

X-DATABEND-QUERY-ID
with each request. If you want to distribute multiple requests from the same client to different backend instances, you can use this HTTP Header as the Hash Key for Nginx, to achieve load balancing.

The following Nginx configuration file demonstrates how to implement load balancing based on Query ID for two databend-query instances located at ports 8000 and 8009:

#user  nobody;
worker_processes 1;

#error_log logs/error.log;
#error_log logs/error.log notice;
#error_log logs/error.log info;

#pid logs/nginx.pid;


events {
worker_connections 1024;
}


http {
log_format main '$http_x_databend_query_id "$time_local" $host "$request_method $request_uri $server_protocol" $status $bytes_sent "$http_referer" "$http_user_agent" $remote_port $upstream_addr $scheme $gzip_ratio $request_length $request_time $ssl_protocol "$upstream_response_time"';
access_log /opt/homebrew/var/log/nginx/access.log main;
map $http_x_query_id $backend {
default backend1;
}
upstream backend1 {
hash $http_x_databend_query_id consistent;
server localhost:8000;
server localhost:8009;
}
server {
listen 8085;
location / {
proxy_pass http://$backend;
proxy_set_header Host $host;
proxy_set_header X-Real-IP $remote_addr;
proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
proxy_set_header X-DATABEND-QUERY-ID $http_x_databend_query_id;
}
}
}

Highlights

We have also made these improvements to Databend that we hope you will find helpful:

  • Added support for the
    QUALIFY
    clause.
  • Connection can now be created with role_arn as access credentials.
  • Added a change in
    MERGE INTO
    which separates matched-only and insert-only Pipelines.

What's Up Next

We're always open to cutting-edge technologies and innovative ideas. You're more than welcome to join the community and bring them to Databend.

Supporting for Calculating JOIN ORDER Using Greedy Algorithm

Databend currently calculates JOIN ORDER using DPhpy. If too much time is spent in this process, it is necessary to switch from dynamic programming algorithm to greedy algorithm to minimize the overall query time.

Issue #13884 | Feature: Support join order greedy algorithm

Please let us know if you're interested in contributing to this feature, or pick up a good first issue at https://link.databend.com/i-m-feeling-lucky to get started.

Changelog

You can check the changelog of Databend Nightly for details about our latest developments.

Full Changelog: https://github.com/datafuselabs/databend/compare/v1.2.226-nightly...v1.2.239-nightly

Share this post

Subscribe to our newsletter

Stay informed on feature releases, product roadmap, support, and cloud offerings!