Databend Feb–Mar Bimonthly Report: Continuous Data Development to Enhanced GIS Capabilities
DatabendLabs, Apr 8, 2026
From February to March 2026, Databend continued advancing SQL capabilities, spatial analysis, file formats, query execution pipelines, and Meta architecture evolution. Sandbox UDF, Geography Functions, R-Tree Spatial Index, the TEXT format, enhanced EXPLAIN PERF, Meta KV compression, and more: continuous data development delivery was the most notable theme of these two months.
Hi, Databend community!
Since we didn't publish a standalone February report last month, this issue combines February 2026 and March 2026 into a single bimonthly recap, covering releases from v1.2.874-nightly to v1.2.891-nightly.
The main themes across these two months are clear: Sandbox UDF and Geography Functions expanded SQL expressiveness; R-Tree Spatial Index, Spatial Join, and spatial pruning pipelines were progressively filled in; CSV/TSV capabilities continued to improve and converged into the TEXT format in March; the query engine kept pushing forward with the new join, spill fixes, and EXPLAIN PERF; and bendsql and the client drivers kept shipping updates.
Monthly Stats
From v1.2.874-nightly to v1.2.891-nightly, Databend released:
- 18 nightly versions
- 2 patch versions
Breakdown:
- February: 10 nightlies
- March: 8 nightlies and 2 patches
By release note counts, the Databend main repository merged across these two months:
- 43 new features
- 49 bug fixes
- 50 refactors
- 2 CI / build improvements
- 3 documentation updates
- 17 other optimizations
These numbers reflect Databend advancing on three parallel tracks in Feb–Mar: feature expansion, execution stability, and foundational restructuring.
Highlights
Core New Features
- Sandbox UDF + LATERAL generate_series: Databend continues advancing SQL expressiveness, enabling more flexible query composition and a safer foundation for custom logic execution.
- Geography Functions + st_hilbert: Spatial analysis functions keep expanding, from Geography Functions to st_hilbert.
- Object governance enhancements: Tagging support extended to USER, ROLE, STREAM, view, udf, and procedure; experimental table tags support added for FUSE table snapshots.
- Data pipeline enhancements: COPY INTO location now supports PARTITION BY; COPY INTO supports Lance dataset. Import/export pipelines continue to fill out.
- Table Branches with independent Schema: Branched data management is now more clearly organized. Table Branch is still being further optimized.
- Table Tag is included in the latest version and ready to use.
Spatial Analysis: From Geography Functions to Spatial Index
- Geography Functions: February laid the groundwork for geospatial computation, setting the stage for indexing and pruning.
- R-Tree Spatial Index: March formally introduced an R-Tree-based Spatial Index, landing spatial indexing capabilities.
- Spatial Statistics: Added spatial statistics to BlockMeta for geospatial range pruning.
- Spatial Join: Runtime Filter now supports Spatial Index Join, integrating spatial indexes into the query optimization and execution path.
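To make the idea of range pruning with per-block spatial statistics concrete, here is a minimal Python sketch. It is not Databend's implementation: the BBox class, bbox_of, and prune_blocks are hypothetical names, and it assumes the statistic kept per block is simply the bounding box of the points in that block.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BBox:
    """Axis-aligned bounding box (hypothetical stand-in for per-block spatial statistics)."""
    min_x: float
    min_y: float
    max_x: float
    max_y: float

    def intersects(self, other: "BBox") -> bool:
        # Two boxes overlap only if they overlap on both axes.
        return (
            self.min_x <= other.max_x
            and other.min_x <= self.max_x
            and self.min_y <= other.max_y
            and other.min_y <= self.max_y
        )


def bbox_of(points):
    """Compute the bounding box of a non-empty list of (x, y) points."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    return BBox(min(xs), min(ys), max(xs), max(ys))


def prune_blocks(block_stats, query):
    """Return the ids of blocks whose bounding box overlaps the query window.

    Blocks that do not overlap cannot contain matching rows and can be skipped.
    """
    return [block_id for block_id, bbox in block_stats.items() if bbox.intersects(query)]
```

With three blocks where only two overlap the query window, prune_blocks returns just those two ids, so the third block never needs to be read.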
File Formats & Data Exchange: From CSV/TSV Enhancements to TEXT
- CSV / TSV compatibility improvements: Multi-byte field delimiters supported; complex types can be encoded as JSON for more flexible heterogeneous data ingestion.
- read_file + COPY INTO location PARTITION BY: File read/write pipelines continue to strengthen, making data engineering scenarios more complete.
- TSV → TEXT: March formally converged TSV into TEXT and added TEXT file format parameters, making text-based data import definitions cleaner.
- Detail fixes: infer_schema() now supports TSV; FIELD_DELIMITER='' supports full-line reads; CRLF handling continues to improve.
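As an illustration of what input with a multi-byte field delimiter and JSON-encoded complex values can look like, here is a small Python sketch. The "||" delimiter and the encode_row and decode_row helpers are assumptions made for this example; the sketch does no quoting or escaping and says nothing about how Databend parses such files internally.

```python
import json


def encode_row(values, delimiter="||"):
    """Join one row into a text line, writing lists and dicts as compact JSON."""
    fields = []
    for value in values:
        if isinstance(value, (list, dict)):
            fields.append(json.dumps(value, separators=(",", ":")))
        else:
            fields.append(str(value))
    return delimiter.join(fields)


def decode_row(line, delimiter="||"):
    """Split a text line on the multi-byte delimiter; fields are returned as strings."""
    return line.split(delimiter)
```

For example, encode_row([1, "alice", [1, 2], {"k": "v"}]) yields a single line with four fields, and the third and fourth fields can be parsed back with json.loads.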
Query Performance & Execution Engine: From New Join to EXPLAIN PERF
- Experimental new join enabled by default: A significant milestone in the execution pipeline in February, followed by ongoing correctness fixes for spill scenarios.
- EXPLAIN PERF: March added per-plan hardware performance counters, further enhancing execution plan analysis.
- Sort / Spill optimizations: Spilled sort blocks prefetch, source sort key reuse, spilled sort streams compression, and batch rank-limit sort continue to advance.
- Join and filter optimizations: Small bloom index read optimization, proactive memory reclamation after Hash Join completion, and ongoing Runtime Filter enhancements.
Meta & Operations: Ongoing Foundational Restructuring
- Meta architecture cleanup: HTTP admin API extracted into a standalone crate, databend-meta-admin; Meta config structure, CLI config, and crate boundaries continue to be cleaned up.
- Meta KV Compression: Introduced transparent zstd compression and typed serialization in v1.2.883-nightly, further optimizing metadata storage.
- Query-Meta compatibility: Query-meta version compatibility documentation filled in, providing clearer boundaries for future evolution.
- Connection and operational details: Meta value compression toggle, gRPC listener TCP_NODELAY, and other capabilities continue to be added.
Stability & Quality: Execution Correctness and Edge Case Fixes
- Execution correctness: Ongoing fixes for new join spill data loss; recursive CTE concurrent result issues resolved.
- Storage and metadata stability: raft-log concurrent chunk read race condition fixed; unnecessary S3 refresh for HTTP catalog attached tables resolved.
- SQL edge case fixes: UNION parenthesis preservation, correlated scalar subquery decorrelation, Unicode identifier and alias support continue to improve.
- Security and governance boundaries: Row access policy execution issue under direct UPDATE resolved.
Practical Demo: Data Development and Delivery with Databend
Putting these two months of updates into a real-world example makes Databend's value more tangible. A common data development workflow looks like this: build cleansing, aggregation, and models inside Databend, then deliver results to object storage partitioned for downstream batch jobs; for specific analysis or application scenarios, export results into a more consumption-friendly dataset format. This maps directly to two practical capabilities shipped in Feb–Mar:
- COPY INTO <location> PARTITION BY
- COPY INTO <location> FILE_FORMAT = (TYPE = LANCE)
Scenario 1: Partitioned Parquet Delivery by Date and Hour
Suppose you've built an order wide table in Databend and want to deliver results daily to object storage partitioned by date/hour.
You can use:
COPY INTO @delivery_stage/orders/
FROM (
    SELECT
        order_id,
        user_id,
        amount,
        status,
        created_at,
        to_date(created_at) AS dt
    FROM mart_orders
    WHERE created_at >= '2026-03-01'
)
PARTITION BY (
    'date=' || to_varchar(dt, 'YYYY-MM-DD')
    || '/hour=' || lpad(to_varchar(date_part('hour', created_at)), 2, '0')
)
FILE_FORMAT = (TYPE = PARQUET);
The benefits are straightforward:
- Partition directories are determined directly by SQL expressions, so delivery paths are fully controlled
- Results land in directory structures like date=2026-03-01/hour=08/
- Downstream jobs don't need an extra partition rewrite step
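To show how a downstream job might work with this layout, here is a minimal Python sketch that mirrors the partition expression in the COPY INTO statement and parses such a directory path back into its partition values. The partition_path and parse_partition helpers are hypothetical names for illustration and are not part of any Databend client.

```python
from datetime import datetime


def partition_path(created_at: datetime) -> str:
    """Mirror the SQL expression: 'date=' || YYYY-MM-DD || '/hour=' || zero-padded hour."""
    return f"date={created_at:%Y-%m-%d}/hour={created_at.hour:02d}"


def parse_partition(path: str) -> dict:
    """Split a 'key=value/key=value/' directory path into a dict of partition values."""
    segments = [segment for segment in path.strip("/").split("/") if "=" in segment]
    return dict(segment.split("=", 1) for segment in segments)
```

A row created at 08:15 on 2026-03-01 maps to date=2026-03-01/hour=08, and parsing that path gives back the date and hour as strings, which a batch job can use to select only the directories it needs.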
Scenario 2: Delivering Analysis Results as a Lance Dataset
If downstream consumption prioritizes efficient reads, columnar access, or a specific dataset organization, the Lance format support added in March is more valuable.
For example, to export a feature result set directly as a Lance dataset:
COPY INTO @delivery_stage/user_features_lance/
FROM (
    SELECT
        number
    FROM numbers(10)
)
FILE_FORMAT = (TYPE = LANCE)
USE_RAW_PATH = TRUE
OVERWRITE = TRUE;
A few important notes:
- A Lance dataset is a complete dataset directory, not a single file output
- By default, Databend generates a query_id subdirectory under the target path
- Use USE_RAW_PATH = TRUE to write results directly to the specified directory
- Lance currently does not support use with PARTITION BY
Databend's delivery pipeline offers two clear modes:
- For general data lake delivery: PARTITION BY + PARQUET
- For specific dataset organization or high-efficiency consumption: LANCE
Ecosystem Updates
bendsql: 2 Releases in March
bendsql shipped two releases in March:
- v0.33.5, released 2026-03-11
- v0.33.6, released 2026-03-15
These updates focused on:
- More readable error messages, with reqwest::Error source and source chain output added
- Request stability improvements, including heartbeat headers fix, response.bytes() error retry, and timeout adjustments
- Release and build pipeline improvements, including OIDC trusted publishing and abi3 wheel build fixes
Both releases lean toward CLI usability, error diagnostics, and engineering quality improvements.
Drivers and Clients
Beyond bendsql, several drivers and clients were also updated:
- databend-go released v0.9.2 on 2026-03-23, continuing Go driver usability improvements
- databend-jdbc shipped a set of exception handling improvements around 2026-03-20, including embedding query_id in SQLException messages for easier troubleshooting
- databend-sqlalchemy continued compatibility fixes and version upgrades between 2026-02-25 and 2026-02-28, including varchar type parsing fixes and get_sequence_names issue corrections
These updates don't show up as headline features the way the main repository does, but they're equally important for Databend's usability at the application integration layer.