Databend GIS Upgrade: Spatial Index Now Available, Spatial Query Performance Up to 8.3x Faster
baishen, Apr 22, 2026
Databend Spatial Index is officially live! Leveraging R-Tree and Hilbert Clustering optimizations, it significantly accelerates range scans and spatial JOINs, delivering up to an 8.3x performance boost. This release completes a vital piece of the puzzle for large-scale GIS analytics, providing powerful native support for LBS, logistics, and IoT sectors.
When working on GIS, LBS, logistics tracking, or IoT spatial analysis, everyone runs into the same problem:
Being able to store spatial data doesn't mean spatial queries are fast enough.
Especially as data volume grows, range filtering, proximity lookups, and spatial JOINs tend to slow down quickly. Without a dedicated spatial index, even the most powerful spatial functions struggle to support real production workloads.
Databend now officially ships spatial index support, providing native acceleration for large-scale spatial data queries. Built on the classic and efficient R-Tree structure, the index covers Geometry data.
In the SpatialBenchmark standard test suite, Databend spatial index delivers up to 8.3x performance improvement in typical scenarios.
Why Do GIS Workloads Require Spatial Indexes?
Spatial data is fundamentally different from ordinary structured data.
In practice, the most common operations are not simple equality lookups, but rather:
- Querying objects within a given region
- Querying objects within a certain distance of a point
- Determining whether two spatial objects intersect or contain each other
- Performing spatial join analysis across two spatial tables
Without index support, these queries typically fall back to full table scans. The result:
- Slow queries: Response time climbs noticeably as data volume increases
- High resource consumption: Spatial computation drives up CPU and I/O usage
- Unable to support real-time workloads: Nearby search, trajectory analysis, and spatial joins can't run at low latency
Spatial indexes are not a nice-to-have — they are the critical step that takes GIS capability from "usable" to "production-ready."
Databend GIS: Filling the Critical Gap
Databend has long natively supported both the GEOMETRY and GEOGRAPHY data types, along with a set of spatial functions. For example:
- ST_GeomFromText
- ST_Distance
- ST_Area
- ST_Intersects
- ST_Transform
These capabilities already support spatial data storage, parsing, transformation, and computation.
The newly released spatial index fills the remaining gap in Databend's GIS query performance, making large-scale spatial analysis genuinely viable in production.
How Does the Databend Spatial Index Accelerate Queries?
The core design of Databend's spatial index can be summarized in three layers:
1. Foundation: Bounding Box
BBox (Bounding Box, or minimum bounding rectangle) is the fundamental data unit of Databend's spatial index. It uses a compact four-value coordinate structure to represent a rectangular extent in 2D space, which makes it lightweight to compute and well suited to high-speed spatial filtering. A standard BBox consists of four double-precision floating-point values giving the left, bottom, right, and top extremes of a geometry in the 2D plane, in the fixed format:
(minX, minY, maxX, maxY)
Whether the original geometry is a Point, LineString, or Polygon, this unified structure reduces spatial relationship checks between complex geometries to simple rectangle intersection and containment tests — requiring only a few numeric comparisons, at a fraction of the cost of operating directly on complex geometries.
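To make the rectangle tests concrete, here is a minimal Python sketch of the idea. It is not Databend's internal code; the helper names (bbox_of, bbox_intersects, bbox_contains) are made up for illustration, and geometries are reduced to plain lists of vertex coordinates.

```python
from typing import Iterable, Tuple

BBox = Tuple[float, float, float, float]  # (minX, minY, maxX, maxY)
Point = Tuple[float, float]


def bbox_of(coords: Iterable[Point]) -> BBox:
    """Bounding box of a geometry, given its vertex coordinates."""
    xs, ys = zip(*coords)
    return (min(xs), min(ys), max(xs), max(ys))


def bbox_intersects(a: BBox, b: BBox) -> bool:
    """True when two rectangles share at least one point (edges count)."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def bbox_contains(outer: BBox, inner: BBox) -> bool:
    """True when `inner` lies entirely inside `outer`."""
    return (outer[0] <= inner[0] and outer[1] <= inner[1]
            and inner[2] <= outer[2] and inner[3] <= outer[3])


# Hypothetical sample geometries: a polygon ring, a point, a distant line.
polygon = [(-124.0, 37.0), (-124.0, 38.0), (-122.0, 38.0), (-122.0, 37.0)]
point = [(-122.4, 37.7)]
far_line = [(10.0, 10.0), (12.0, 11.0)]

region_box = bbox_of(polygon)
point_box = bbox_of(point)
line_box = bbox_of(far_line)
```

Each test is four numeric comparisons regardless of how many vertices the original geometry has, which is why the BBox works well as a first-pass filter.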
2. Spatial Index Filtering
Databend's spatial index uses a two-level filtering mechanism, combining coarse-grained Block-level filtering with fine-grained R-Tree index filtering to accelerate spatial queries efficiently.
Databend maintains BBox-based spatial statistics for each data block, recording a single bounding rectangle that covers every geometry object in that block. When executing a spatial query, the system first performs a fast intersection check between the query geometry and each block's BBox statistics and immediately eliminates blocks that cannot match. This coarse-grained pass reads no index or data files, so it completes at minimal cost.
After the coarse pass, remaining blocks undergo fine-grained precise matching. Each block builds an R-Tree index over the BBoxes of all its geometry objects. R-Tree is a classic spatial index structure, organized similarly to a B-Tree — balanced, hierarchical, and ordered — enabling efficient spatial data lookup and filtering. Through this "coarse block filter first, fine index match second" two-level mechanism, Databend's spatial index significantly reduces I/O and computation overhead at scale, greatly improving spatial query efficiency.
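The two-level mechanism can be sketched in a few lines of Python. This is a simplified model under stated assumptions, not Databend's implementation: the Block class and candidate_rows function are hypothetical, and a linear scan over per-row BBoxes stands in for the R-Tree lookup to keep the example short.

```python
from dataclasses import dataclass
from typing import List, Tuple

BBox = Tuple[float, float, float, float]  # (minX, minY, maxX, maxY)


def intersects(a: BBox, b: BBox) -> bool:
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def union(boxes: List[BBox]) -> BBox:
    return (min(b[0] for b in boxes), min(b[1] for b in boxes),
            max(b[2] for b in boxes), max(b[3] for b in boxes))


@dataclass
class Block:
    row_boxes: List[BBox]  # one bbox per geometry row in the block

    @property
    def stats_box(self) -> BBox:
        # Block-level statistic: the rectangle covering every row in the block.
        return union(self.row_boxes)


def candidate_rows(blocks: List[Block], query: BBox) -> List[Tuple[int, int]]:
    """Return (block_id, row_id) pairs that survive both filter levels."""
    hits = []
    for block_id, block in enumerate(blocks):
        # Level 1: skip the whole block if its stats box misses the query.
        if not intersects(block.stats_box, query):
            continue
        # Level 2: per-row bbox check (linear scan standing in for an R-Tree).
        for row_id, box in enumerate(block.row_boxes):
            if intersects(box, query):
                hits.append((block_id, row_id))
    return hits


blocks = [
    Block([(0, 0, 1, 1), (2, 2, 3, 3)]),          # south-west cluster
    Block([(50, 50, 51, 51), (52, 52, 53, 53)]),  # far north-east cluster
]
result = candidate_rows(blocks, query=(2.5, 2.5, 4, 4))
```

In this toy data the second block is rejected by a single rectangle comparison, and only one row of the first block remains for the exact geometry test.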
3. Hilbert Clustering for Optimal Spatial Data Layout
For the spatial index to achieve maximum filtering efficiency, geographically nearby data must be co-located in the same data block as much as possible. If spatial data is spread randomly across blocks, every block's BBox ends up spanning nearly the whole dataset extent and overlapping heavily with the others, so block-level filtering eliminates almost nothing.
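Databend supports combining CLUSTER BY with ST_HILBERT(...) to improve the spatial data layout. ST_HILBERT encodes a geometry's position as a value on a Hilbert curve, and CLUSTER BY orders the data by that value so that spatially nearby objects land in the same blocks.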
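The following Python sketch shows why ordering by a Hilbert value keeps neighbors together. It uses the textbook grid-based Hilbert mapping; how ST_HILBERT computes its value internally is not described here, so the grid resolution (order), the hilbert_key helper, and the way bounds are applied are all assumptions for illustration.

```python
from typing import Tuple


def hilbert_index(x: int, y: int, order: int) -> int:
    """Position of grid cell (x, y) along a Hilbert curve on a 2^order grid."""
    n = 1 << order
    d = 0
    s = n >> 1
    while s > 0:
        rx = 1 if x & s else 0
        ry = 1 if y & s else 0
        d += s * s * ((3 * rx) ^ ry)
        # Rotate/flip the quadrant so the curve stays continuous.
        if ry == 0:
            if rx == 1:
                x = n - 1 - x
                y = n - 1 - y
            x, y = y, x
        s >>= 1
    return d


def hilbert_key(lon: float, lat: float,
                bounds: Tuple[float, float, float, float] = (-180.0, -90.0, 180.0, 90.0),
                order: int = 16) -> int:
    """Map a coordinate inside `bounds` to a grid cell, then to a curve index."""
    min_x, min_y, max_x, max_y = bounds
    n = 1 << order
    cx = min(n - 1, max(0, int((lon - min_x) / (max_x - min_x) * n)))
    cy = min(n - 1, max(0, int((lat - min_y) / (max_y - min_y) * n)))
    return hilbert_index(cx, cy, order)


# Hypothetical pickup points from two distant cities, in interleaved order.
points = [
    ("sf", -122.40, 37.70),
    ("tokyo", 139.70, 35.60),
    ("sf", -122.41, 37.71),
    ("tokyo", 139.71, 35.61),
]
clustered = sorted(points, key=lambda p: hilbert_key(p[1], p[2]))
cities_in_order = [name for name, _, _ in clustered]
```

After sorting, points from the same city sit next to each other, so cutting the sorted rows into blocks yields small, well-separated block BBoxes instead of blocks that each span both cities.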
Which Query Scenarios Benefit Directly?
Databend's spatial index currently accelerates the following four core functions automatically:
- ST_Intersects: determines whether two geometry objects intersect; the most commonly used spatial filter function
- ST_Contains: determines whether one geometry object completely contains another
- ST_Within: determines whether one geometry object is completely inside another; the inverse of ST_Contains
- ST_DWithin: determines whether the distance between two geometry objects is within a specified threshold; commonly used for proximity search
These functions cover the two most common categories of GIS queries.
Scenario 1: Spatial Filtering in WHERE Clauses
Examples:
- Query stores, vehicles, or devices within a given region
- Query trajectory points within a certain radius of a point
- Query objects that fall inside a geofence
These queries are the foundational capability for LBS, local services, IoT, and similar workloads.
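A radius query is a good illustration of how a rectangle-based index can serve a distance predicate. The Python sketch below follows the common filter-and-refine pattern: widen the search point into a square window, discard everything outside it with cheap comparisons, then run the exact distance check on the survivors. Whether Databend evaluates ST_DWithin exactly this way is an assumption; the function names and sample coordinates are hypothetical, and distances are planar, in the same units as the coordinates.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float]
BBox = Tuple[float, float, float, float]  # (minX, minY, maxX, maxY)


def expand(center: Point, distance: float) -> BBox:
    """Square search window that encloses the circle of radius `distance`."""
    x, y = center
    return (x - distance, y - distance, x + distance, y + distance)


def in_box(p: Point, box: BBox) -> bool:
    return box[0] <= p[0] <= box[2] and box[1] <= p[1] <= box[3]


def dwithin_points(points: List[Point], center: Point, distance: float) -> List[Point]:
    window = expand(center, distance)
    # Filter: cheap comparisons discard most rows.
    candidates = [p for p in points if in_box(p, window)]
    # Refine: exact planar distance on the survivors only.
    return [p for p in candidates if math.dist(p, center) <= distance]


pickups = [(-122.40, 37.70), (-122.43, 37.72), (-122.44, 37.74), (-121.90, 37.30)]
nearby = dwithin_points(pickups, center=(-122.4, 37.7), distance=0.05)
```

The third point shows why the refine step is still needed: it falls inside the square window but lies in a corner, farther than 0.05 from the center, so only the exact check removes it.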
Scenario 2: Spatial JOIN Analysis
Examples:
- Join order locations with service areas
- Match trajectory points against administrative boundaries
- Analyze device positions against building footprints or campus boundaries
In spatial JOIN scenarios, Databend's optimizer automatically leverages the index and runtime filters to reduce unnecessary data processing and improve large-scale spatial join efficiency.
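As a rough mental model of what a runtime filter can do in a spatial join, consider the Python sketch below. It is an assumption-laden illustration rather than a description of Databend's optimizer: the join_candidates function, the idea of summarizing the build side as one covering rectangle, and the sample data are all hypothetical.

```python
from typing import Dict, List, Tuple

BBox = Tuple[float, float, float, float]  # (minX, minY, maxX, maxY)
Point = Tuple[float, float]


def covers(box: BBox, p: Point) -> bool:
    return box[0] <= p[0] <= box[2] and box[1] <= p[1] <= box[3]


def join_candidates(buildings: Dict[str, BBox], trips: Dict[int, Point]):
    # Runtime filter: one rectangle covering the whole (small) build side.
    boxes = list(buildings.values())
    build_extent = (min(b[0] for b in boxes), min(b[1] for b in boxes),
                    max(b[2] for b in boxes), max(b[3] for b in boxes))
    # Probe rows outside that extent cannot match any building.
    probe = {k: p for k, p in trips.items() if covers(build_extent, p)}
    # Remaining rows are paired by bbox; an exact ST_Intersects would follow.
    pairs: List[Tuple[str, int]] = [
        (name, key)
        for name, box in buildings.items()
        for key, p in probe.items()
        if covers(box, p)
    ]
    return probe, pairs


buildings = {"depot": (0.0, 0.0, 2.0, 2.0), "tower": (5.0, 5.0, 6.0, 6.0)}
trips = {1: (1.0, 1.0), 2: (5.5, 5.5), 3: (3.0, 3.0), 4: (40.0, 40.0)}
probe, pairs = join_candidates(buildings, trips)
```

Trip 4 is dropped before any pairing happens, and trip 3 survives the coarse filter but matches no building, which mirrors how pruning reduces the rows that reach the expensive exact comparison.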
How to Use Databend Spatial Index
Databend supports defining a spatial index directly at table creation time, and it can be combined with CLUSTER BY ST_HILBERT(...).
Example:
CREATE TABLE trip (
    t_tripkey INT64,
    t_pickuploc GEOMETRY,
    t_dropoffloc GEOMETRY,
    SPATIAL INDEX idx_trip(t_pickuploc, t_dropoffloc)
)
CLUSTER BY (
    st_hilbert(t_pickuploc, [-180, -90, 180, 90]),
    st_hilbert(t_dropoffloc, [-180, -90, 180, 90])
);
After loading data, run the following SQL to apply reclustering:
ALTER TABLE trip RECLUSTER FINAL;
This optimizes the block layout for spatial filtering.
Once defined, common queries will automatically hit the spatial index.
For example, query trip pickup points within a specified geographic polygon for region-based spatial filtering:
SELECT t_tripkey, t_pickuploc
FROM trip
WHERE ST_Within(
    t_pickuploc,
    TO_GEOMETRY('POLYGON((-124 37, -124 38, -122 38, -122 37, -124 37))')
);
Query trip origins within a certain distance of a given point, suitable for LBS proximity search:
SELECT t_tripkey,
       ST_Distance(t_pickuploc, TO_GEOMETRY('POINT(-122.4 37.7)')) AS distance
FROM trip
WHERE ST_DWithin(
    t_pickuploc,
    TO_GEOMETRY('POINT(-122.4 37.7)'),
    0.05
)
ORDER BY distance ASC
LIMIT 5;
Query all trips whose pickup locations intersect a building boundary, for high-precision spatial matching:
SELECT b.b_name, t.t_tripkey
FROM building b
JOIN trip t
    ON ST_Intersects(t.t_pickuploc, b.b_boundary);
How Much Performance Improvement?
To validate the spatial index, we ran comparative tests using the SpatialBenchmark standard dataset.
The results are clear: Databend spatial index delivers significant acceleration across typical GIS queries.
1. Nearby Trip Query: 5.5x Faster
Proximity query using ST_DWithin:
- Without index: 1.328 seconds
- With spatial index: 0.243 seconds
- Improvement: 5.5x
2. Regional Trip Aggregation: 6.6x Faster
Spatial range filtering combined with aggregation:
- Without index: 2.445 seconds
- With spatial index: 0.368 seconds
- Improvement: 6.6x
3. Spatial JOIN Aggregation: 8.3x Faster
Complex multi-table spatial join:
- Without index: 2315.718 seconds
- With spatial index: 279.571 seconds
- Improvement: 8.3x
The gains are especially pronounced in complex spatial JOIN scenarios. This means Databend is not just capable of simple spatial filtering — it can handle large-scale spatial analysis workloads.
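The improvement factors quoted above are simply the unindexed time divided by the indexed time. A short Python check, using only the timings reported in this post (the dictionary keys are shorthand labels), reproduces them:

```python
# (seconds without index, seconds with spatial index), as reported above
timings = {
    "Nearby trip query": (1.328, 0.243),
    "Regional trip aggregation": (2.445, 0.368),
    "Spatial JOIN aggregation": (2315.718, 279.571),
}

# Speedup = time without index / time with index, rounded to one decimal.
speedups = {
    name: round(before / after, 1)
    for name, (before, after) in timings.items()
}

for name, factor in speedups.items():
    print(f"{name}: {factor}x")
```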
Business Scenarios That Benefit Directly
With spatial index now available, Databend is better positioned to support the following typical scenarios:
LBS and Local Services
- Nearby store search
- Service area matching
- Location-based recommendations
Logistics and Trajectory Analysis
- Vehicle trajectory point-in-region checks
- Route range analysis
- Delivery and regional correlation statistics
IoT and Spatiotemporal Analysis
- Device geofence alerting
- Sensor spatial aggregation analysis
- Regional event statistics
GIS and Spatial Data Platforms
- Administrative boundary joins
- Building, road, and region analysis
- Multi-source spatial data fusion and computation
Current Limitations
Databend spatial index currently supports the Geometry type.
The Geography type does not yet support index acceleration. If you need it, convert the data to Geometry before querying so the index can be used.
In other words, Databend is currently especially well-suited for:
- Spatial workloads based on planar or projected coordinates
- Large-scale Geometry data filtering and join analysis
- Scenarios that combine OLAP analytics with GIS queries
Closing Thoughts
The real challenge of spatial data analysis is not just "can it compute" — it's "can it still compute fast at scale."
The release of Databend spatial index fills the critical gap in its GIS capability: not only does it support spatial data types and spatial functions, it now delivers high-performance query execution for massive spatial datasets.
For LBS, logistics, IoT, and urban governance workloads, this means Databend can take on an integrated role — from spatial data storage all the way through analytical querying.
If you're looking for a data platform that handles both modern analytics and GIS query workloads, Databend's spatial capabilities are worth a closer look.
Upgrade today and experience Databend spatial index — build faster, more efficient spatial data analytics applications.



