Blog

Leveraging Randomized Testing with SQLsmith to Uncover Bugs

baishenOct 24, 2023

Testing plays a crucial role in the development and maintenance of database systems. It helps verify functionality correctness and proactively identify potential issues to maintain performance and stability. Databend's CI already supports these test types:

Unit Test: A unit test validates the functionality of minimum testable code units like functions and modules, and ensure they execute successfully and return the expected results.
SQL Logic Test: A SQL logic test verifies SQL syntax and logic correctness using test cases, covering various scenarios.
Performance Test: A performance test validates performance impact of new features and optimizations. This helps prevent performance regressions.

While these tests ensure functional correctness and stability during rapid development, they have limitations. Handwritten SQL queries are often simplistic and lack coverage of complex real-world scenarios, edge cases, and exceptions.

Introducing SQLsmith

SQLsmith is a randomized SQL query generator that produces a high volume of diverse test cases to simulate real-world variability. Compared to other testing approaches, SQLsmith improves test coverage to uncover more potential issues and bugs.

Implementing SQLsmith for Databend

The original SQLsmith for PostgreSQL fuzz testing was inspired by Csmith. Several well-known open-source databases have adapted SQLsmith for their own use, such as CockroachDB, TiDB, and RisingWave.

These open source SQLsmiths use diverse languages (C++, Go, Rust) with different syntax support suitable for their domains. We couldn't directly use them and had to build our own SQLsmith in Rust to fully support Databend's syntax and features.

SQLsmith has three main components:

SQL Generator: Generates abstract syntax trees (ASTs) of various types
SQL Reducer: Simplifies complex SQLs to pinpoint bugs
Runner: Executes SQLs and records errors

SQL Generator

The SQL Generator randomly generates ASTs including:

Data Types: Basic types, Nested types like
```
Array
```
,
```
Map
```
,
```
Tuple
```
etc.
DDLs:
```
CREATE
```
,
```
ALTER
```
,
```
DROP TABLE
```
etc. to create and modify test tables.
DMLs:
```
INSERT
```
,
```
UPDATE
```
,
```
DELETE
```
,
```
MERGE
```
etc. to populate test data.
Queries:
```
WITH
```
,
```
SELECT
```
,
```
JOIN
```
,
```
SubQuery
```
,
```
ORDER BY
```
etc.
```
WITH
```
and
```
SubQuery
```
can generate complex queries.
Expressions:
```
Column
```
,
```
Literal
```
,
```
Scalar Function
```
and
```
Aggregate Function
```
etc. Expressions can be nested.

By randomly generating AST components, it covers all possible SQL syntax. Controlled recursion also creates complex nested SQLs to find obscure bugs. But depth is capped to avoid unexecutable SQLs.

SQL Reducer

Since generated SQLs can be very complex with the bug trigger buried, simplifying the original SQL makes bug isolation easier.

The SQL Reducer iteratively removes AST components like

WITH

SubQuery

Expression

etc. If the simplified SQL continues to trigger the bug, it is adopted; otherwise, the original SQL is preserved. In the end, it generates the most concise SQL to reproduce the bug by systematically examining all AST components.

Runner

The SQL Runner executes test SQL queries using Databend, and in case of failure, it proceeds with the following error-handling steps:

Check for expected errors like syntax, semantics etc.
Verify if it's a known issue or unimplemented feature.
Call the SQL Reducer to generate the smallest reproducible SQL.
Log errors and simplified SQLs.

The Runner is now integrated into Databend's CI, running with each release to log errors for further analysis.

Impact

After a month of running, SQLsmith has uncovered 50+ bugs in Databend, including:

Internal logic errors (17 bugs)
Invalid function/expression checks (12 bugs)
Missing semantic checks (9 bugs)
Improper
```
unwrap
```
/
```
unreachable
```
handling (7 bugs)
Parser failures (3 bugs)
Failed casts across types (4 bugs)
Parquet I/O errors (1 bug)

Key Lessons Learned

Analyzing the uncovered bugs led to some key learnings to avoid common pitfalls:

Thoroughly validate function parameters and edge cases. Main cases include:
- For
```
String
```
  parameters, check for empty strings.
- For
```
Int
```
  parameters, check for large number values.
- When supporting any parameter type, consider unusual types like
```
Null
```
  ,
```
EmptyArray
```
  ,
```
Bitmap
```
  etc.
- When only supporting specific types, see if other types can auto-convert or check and return errors early.
Use unwrap judiciously. Explicitly handle
```
Result
```
and
```
Option
```
instead of assuming success.
Understand SQL semantic rules during development e.g. constraints around
```
GROUP BY
```
,
```
ORDER BY
```
etc. Perform semantic checks in the Binder phase to prevent runtime errors.
Add more unit test cases for critical modules to prevent bugs from internal logic errors.

What's Next

SQLsmith helps enhance Databend's stability and reliability by uncovering hidden issues. Upcoming improvements include:

Add support for more SQL features like
```
UNION
```
, Computed Column etc.
Introduce more configurations like expression nesting depths, query complexity.
Improve SQL Reducer.
Optimize query executions and result analysis.

Share this post

Subscribe to our newsletter

Stay informed on feature releases, product roadmap, support, and cloud offerings!

This Week in Databend #117

This Week in Databend #116

Leveraging Randomized Testing with SQLsmith to Uncover Bugs

baishenOct 24, 2023

Introducing SQLsmith

Implementing SQLsmith for Databend

SQL Generator

SQL Reducer

Runner

Impact

Key Lessons Learned

What's Next

Subscribe to our newsletter

Get Started

Talk to us

Products

Resources

Community

Company

Solutions

Partners