
Databend Feb–Mar Bimonthly Report: Continuous Data Development to Enhanced GIS Capabilities

DatabendLabs · Apr 8, 2026

From February to March 2026, Databend continued advancing SQL capabilities, spatial analysis, file formats, query execution pipelines, and Meta architecture evolution. Sandbox UDF, Geography Functions, R-Tree Spatial Index, the TEXT format, enhanced EXPLAIN PERF, Meta KV compression, and more: continuous delivery of data development capabilities defined these two months.

Hi, Databend community!

Since we didn't publish a standalone February report last month, this issue combines February 2026 and March 2026 into a single bimonthly recap, covering from v1.2.874-nightly to v1.2.891-nightly through March 30, 2026.

The main themes across these two months are clear: Sandbox UDF and Geography Functions expanded SQL expressiveness; R-Tree Spatial Index, Spatial Join, and spatial pruning pipelines were progressively filled in; CSV/TSV capabilities continued to improve and converged into the TEXT format in March; the query engine kept pushing forward with the new join, spill fixes, EXPLAIN PERF, Sort/Spill optimizations, and Bloom Index reads. The Meta subsystem also went through multiple rounds of refactoring and compression upgrades, while bendsql and related drivers received matching improvements on the client tooling side.

Bimonthly Stats

From v1.2.874-nightly to v1.2.891-nightly, the Databend main repository released a total of:

  • 18 nightly versions
  • 2 patch versions

Breakdown:

  • February: 10 nightlies
  • March: 8 nightlies and 2 patches

By release note counts, the Databend main repository merged across these two months:

  • 43 new features
  • 49 bug fixes
  • 50 refactors
  • 2 CI / build improvements
  • 3 documentation updates
  • 17 other optimizations

These numbers reflect Databend advancing on three parallel tracks in Feb–Mar: feature expansion, execution stability, and foundational restructuring.

Highlights

Core New Features

  • Sandbox UDF + LATERAL generate_series — Databend continues advancing SQL expressiveness, enabling more flexible query composition and a safer foundation for executing custom logic.
  • Geography Functions + st_hilbert — Spatial analysis functions keep expanding, from the core Geography Functions to st_hilbert.
  • Object governance enhancements — Tagging support extended to USER, ROLE, STREAM, VIEW, UDF, and PROCEDURE; experimental table tag support added for FUSE table snapshots.
  • Data pipeline enhancements — COPY INTO <location> now supports PARTITION BY; COPY INTO supports Lance datasets. Import/export pipelines continue to fill out.
  • Table Branches with independent schemas — Branched data management is now more clearly organized, and Table Branch continues to be optimized.
  • Table Tag is included in the latest version and ready to use.
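As a quick illustration of the new LATERAL generate_series support, the sketch below expands each row into a per-day series. Note that the events table, its columns, and the output column name of generate_series are assumptions for illustration; check the Databend documentation for the exact shape.

```sql
-- Hypothetical table: events(event_id INT, duration_days INT)
-- LATERAL lets the series bounds reference columns from the outer row,
-- producing one output row per event per day offset in [0, duration_days].
SELECT
    e.event_id,
    s.value AS day_offset  -- "value" is an assumed output column name
FROM events e,
     LATERAL generate_series(0, e.duration_days) AS s;
```

Without LATERAL, a set-returning function like generate_series could not depend on per-row values from the joined table.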

Spatial Analysis: From Geography Functions to Spatial Index

  • Geography Functions — February laid the groundwork for geospatial computation, setting the stage for indexing and pruning.
  • R-Tree Spatial Index — March formally introduced R-Tree-based Spatial Index, landing spatial indexing capabilities.
  • Spatial Statistics — Added spatial statistics to BlockMeta for geospatial range pruning.
  • Spatial Join — Runtime Filter now supports Spatial Index Join, integrating spatial indexes into the query optimization and execution path.
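A rough sketch of how these pieces could combine in a query follows. The function names here (st_geographyfromtext, st_contains) and the poi table are assumptions based on common geospatial SQL conventions, not confirmed Databend syntax; consult the current docs before use.

```sql
-- Illustrative only: a containment filter over a GEOGRAPHY column.
-- With spatial statistics in BlockMeta and an R-Tree Spatial Index,
-- blocks whose bounding boxes fall outside the polygon can be pruned
-- before the exact st_contains check runs.
SELECT id, name
FROM poi
WHERE st_contains(
    st_geographyfromtext(
        'POLYGON((116.3 39.8, 116.5 39.8, 116.5 40.0, 116.3 40.0, 116.3 39.8))'
    ),
    location
);
```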

File Formats & Data Exchange: From CSV/TSV Enhancements to TEXT

  • CSV / TSV compatibility improvements — Multi-byte field delimiters supported; complex types can be encoded as JSON for more flexible heterogeneous data ingestion.
  • read_file + COPY INTO location PARTITION BY — File read/write pipelines continue to strengthen, making data engineering scenarios more complete.
  • TSV → TEXT — March formally converged TSV into TEXT and added TEXT file format parameters, making text-based data import definitions cleaner.
  • Detail fixes — infer_schema() now supports TSV; FIELD_DELIMITER = '' supports full-line reads; CRLF handling continues to improve.

Query Performance & Execution Engine: From New Join to EXPLAIN PERF

  • Experimental new join enabled by default — A significant milestone in the execution pipeline in February, followed by ongoing correctness fixes for spill scenarios.
  • EXPLAIN PERF — March added per-plan hardware performance counters, further enhancing execution plan analysis.
  • Sort / Spill optimizations — Spilled sort blocks prefetch, source sort key reuse, spilled sort streams compression, and batch rank-limit sort continue to advance.
  • Join and filter optimizations — Small bloom index read optimization, proactive memory reclamation after Hash Join completion, and ongoing Runtime Filter enhancements.
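Trying the new profiling output is as simple as prefixing a query. The exact counters reported will depend on your build and platform; the mart_orders table below is the same example table used later in this post.

```sql
-- Collect per-plan hardware performance counters for a query.
EXPLAIN PERF
SELECT status, sum(amount)
FROM mart_orders
GROUP BY status;
```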

Meta & Operations: Ongoing Foundational Restructuring

  • Meta architecture cleanup — HTTP admin API extracted into a standalone crate, databend-meta-admin; Meta config structure, CLI config, and crate boundaries continue to be cleaned up.
  • Meta KV Compression — Introduced transparent zstd compression and typed serialization in v1.2.883-nightly, further optimizing metadata storage.
  • Query-Meta compatibility — Query-meta version compatibility documentation filled in, providing clearer boundaries for future evolution.
  • Connection and operational details — Meta value compression toggle, gRPC listener TCP_NODELAY, and other capabilities continue to be added.

Stability & Quality: Execution Correctness and Edge Case Fixes

  • Execution correctness — Ongoing fixes for new join spill data loss; recursive CTE concurrent result issues resolved.
  • Storage and metadata stability — raft-log concurrent chunk read race condition fixed; unnecessary S3 refresh for HTTP catalog attached tables resolved.
  • SQL edge case fixes — UNION parenthesis preservation, correlated scalar subquery decorrelation, and Unicode identifier and alias support continue to improve.
  • Security and governance boundaries — Row access policy execution issue under direct UPDATE resolved.

Practical Demo: Data Development and Delivery with Databend

Putting these two months of updates into a real-world example makes Databend's value more tangible. A common data development workflow looks like this: build cleansing, aggregation, and models inside Databend, then deliver results to object storage partitioned for downstream batch jobs; for specific analysis or application scenarios, export results into a more consumption-friendly dataset format. This maps directly to two practical capabilities shipped in Feb–Mar:

  • COPY INTO <location> PARTITION BY
  • COPY INTO <location> FILE_FORMAT = (TYPE = LANCE)

Scenario 1: Partitioned Parquet Delivery by Date and Hour

Suppose you've built a wide orders table in Databend and want to deliver results daily to object storage partitioned by date/hour, for downstream Spark, Flink, or other batch jobs.

You can use:

COPY INTO @delivery_stage/orders/
FROM (
    SELECT
        order_id,
        user_id,
        amount,
        status,
        created_at,
        to_date(created_at) AS dt
    FROM mart_orders
    WHERE created_at >= '2026-03-01'
)
PARTITION BY (
    'date=' || to_varchar(dt, 'YYYY-MM-DD')
    || '/hour=' || lpad(to_varchar(date_part('hour', created_at)), 2, '0')
)
FILE_FORMAT = (TYPE = PARQUET);

The benefits are straightforward:

  • Partition directories are determined directly by SQL expressions — delivery paths are fully controlled
  • Results land in directory structures like date=2026-03-01/hour=08/
  • Downstream jobs don't need an extra partition rewrite step

Scenario 2: Delivering Analysis Results as a Lance Dataset

If downstream consumption prioritizes efficient reads, columnar access, or a specific dataset organization, the Lance format support added in March is more valuable.

For example, to export a feature result set directly as a Lance dataset:

COPY INTO @delivery_stage/user_features_lance/
FROM (
    SELECT number
    FROM numbers(10)
)
FILE_FORMAT = (TYPE = LANCE)
USE_RAW_PATH = TRUE
OVERWRITE = TRUE;

A few important notes:

  • A Lance dataset is a complete dataset directory, not a single file output
  • By default, Databend generates a query_id subdirectory under the target path
  • Use USE_RAW_PATH = TRUE to write results directly to the specified directory
  • Lance currently does not support use with PARTITION BY

Databend's delivery pipeline offers two clear modes:

  • For general data lake delivery: PARTITION BY + PARQUET
  • For specific dataset organization or high-efficiency consumption: LANCE

Ecosystem Updates

bendsql: 2 Releases in March

bendsql had no official release in February, but shipped two versions in March:

  • v0.33.5, released 2026-03-11
  • v0.33.6, released 2026-03-15

These updates focused on:

  • More readable error messages, with reqwest::Error source and source chain output added
  • Request stability improvements, including a heartbeat headers fix, response.bytes() error retry, and timeout adjustments
  • Release and build pipeline improvements, including OIDC trusted publishing and abi3 wheel build fixes

Both releases lean toward CLI usability, error diagnostics, and engineering quality improvements.

Drivers and Clients

Beyond bendsql, Databend ecosystem drivers and clients also had notable updates in Feb–Mar:

  • databend-go released v0.9.2 on 2026-03-23, continuing Go driver usability improvements
  • databend-jdbc shipped a set of exception handling improvements around 2026-03-20, including embedding query_id in SQLException messages for easier troubleshooting
  • databend-sqlalchemy continued compatibility fixes and version upgrades between 2026-02-25 and 2026-02-28, including varchar type parsing fixes and get_sequence_names issue corrections

These updates don't show up as headline features the way the main repository does, but they're equally important for Databend's usability at the application integration layer.


Ready to try Table Branching and Spatial Indexes?

Get started in minutes with Databend Cloud—the agent-ready data warehouse for analytics, search, AI, and Python Sandbox—and receive $200 in free credits.
