Data Vectorized Query Engines and Columnar Execution

Status: public · Confidence: medium (0.865) · Basis: verified_sources

## TL;DR

Vectorized query engines process a batch of column values per operator call instead of one row at a time. Knowing this helps agents reason about CPU efficiency, scan cost, and why columnar layouts matter.

## Core Explanation

Row-by-row execution is easy to reason about but can waste CPU on per-value dispatch, because every operator is invoked once for each value. Vectorized execution groups values into chunks or vectors so that each operator call runs over an array of typed data and the dispatch cost is paid once per chunk. Columnar layouts complement that model because a scan or filter can operate on contiguous values from only the columns it needs.
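The difference in dispatch can be made concrete with a minimal sketch. It is not taken from any engine: `VECTOR_SIZE`, the function names, and the sample columns are all hypothetical, and the tiny vector size is chosen only so the arithmetic is easy to follow. The sketch counts operator calls rather than measuring speed, since pure Python would not show the CPU benefit a compiled engine gets.

```python
from array import array

VECTOR_SIZE = 4  # illustrative only; real engines use far larger vectors


def sum_row_at_a_time(rows, threshold):
    """Filter and aggregate one row per operator call; returns (total, calls)."""
    total = 0
    calls = 0
    for price, qty in rows:
        calls += 1  # one filter call for every row
        if qty > threshold:
            calls += 1  # one aggregate call for every surviving row
            total += price
    return total, calls


def filter_gt(chunk, threshold):
    """Vectorized filter: return a selection vector of qualifying positions."""
    return [i for i, value in enumerate(chunk) if value > threshold]


def sum_selected(chunk, selection):
    """Vectorized aggregate: sum only the selected positions of a chunk."""
    return sum(chunk[i] for i in selection)


def sum_vectorized(price_col, qty_col, threshold):
    """Run each operator once per chunk; returns (total, calls)."""
    total = 0
    calls = 0
    for start in range(0, len(qty_col), VECTOR_SIZE):
        end = start + VECTOR_SIZE
        selection = filter_gt(qty_col[start:end], threshold)
        total += sum_selected(price_col[start:end], selection)
        calls += 2  # one filter call and one aggregate call per chunk
    return total, calls


# Two typed columns stored separately (columnar layout).
price = array("q", [10, 20, 30, 40, 50, 60, 70, 80, 90, 100])
qty = array("q", [1, 5, 2, 7, 3, 9, 4, 6, 8, 0])

row_result = sum_row_at_a_time(list(zip(price, qty)), threshold=4)
vector_result = sum_vectorized(price, qty, threshold=4)
print(row_result, vector_result)
```

Both paths compute the same total, but the vectorized path makes two operator calls per chunk while the row path makes at least one per row. The gap widens as the vector size grows.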

For data infrastructure agents, this topic matters when interpreting query plans, file formats, and performance reports. A slow query may be caused by poor column pruning (reading columns the query never uses), wide materialization, decompression cost, filters that eliminate few rows, or a function that prevents efficient vectorized processing.
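To illustrate why column pruning and materialization width matter, here is a rough sketch that counts how many column values a scan touches. The table, the column names, and `scan_with_pruning` are hypothetical, and the counting model (read the filter column fully, then read other projected columns only at surviving positions) is a simplifying assumption rather than how any particular engine accounts for I/O.

```python
from array import array


def scan_with_pruning(table, projected, filter_col, keep):
    """Scan a dict-of-columns table, touching only the columns that are needed.

    Returns (result_columns, stats). `keep` is a predicate on one value.
    """
    filter_values = table[filter_col]
    num_rows = len(filter_values)
    values_read = num_rows  # the filter column is read in full

    # Selection vector: positions of the rows that pass the filter.
    selection = [i for i, value in enumerate(filter_values) if keep(value)]

    result = {}
    for name in projected:
        column = table[name]
        result[name] = [column[i] for i in selection]
        if name != filter_col:
            values_read += len(selection)  # only surviving positions are read

    stats = {
        "values_read": values_read,
        "values_if_all_columns_materialized": num_rows * len(table),
        "fraction_kept": len(selection) / num_rows if num_rows else 0.0,
    }
    return result, stats


# A wide table where the query needs only two of five columns.
table = {
    "user_id": array("q", [1, 2, 3, 4, 5, 6]),
    "country_code": array("q", [44, 1, 44, 81, 1, 44]),
    "amount": array("q", [5, 7, 11, 13, 17, 19]),
    "unused_a": array("q", [0, 0, 0, 0, 0, 0]),
    "unused_b": array("q", [0, 0, 0, 0, 0, 0]),
}

result, stats = scan_with_pruning(
    table, projected=["amount"], filter_col="country_code", keep=lambda v: v == 44
)
print(result, stats)
```

In this toy case the pruned scan touches 9 values, against 30 if every column were materialized for every row. A filter that keeps most rows shrinks that advantage, because the projected columns are then read almost in full.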

Useful evidence includes the projected columns, filter selectivity, vector or block size, null representation, memory alignment, compression, batch-level operator timings, and whether the query spills or falls back to scalar behavior.
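One way to organize this evidence is a small per-operator record with a few derived signals. The sketch below is purely illustrative: `OperatorProfile`, its field names, and the `small_batch_rows` threshold are assumptions made for this example and do not correspond to the profile output of any specific engine.

```python
from dataclasses import dataclass


@dataclass
class OperatorProfile:
    """Hypothetical per-operator evidence gathered from a query profile."""
    name: str
    rows_in: int
    rows_out: int
    batches: int
    elapsed_ms: float
    spilled_bytes: int = 0
    scalar_fallback: bool = False


def fraction_kept(profile):
    """Share of input rows that the operator passed downstream."""
    return profile.rows_out / profile.rows_in if profile.rows_in else 0.0


def avg_batch_rows(profile):
    """Average number of rows per batch the operator received."""
    return profile.rows_in / profile.batches if profile.batches else 0.0


def findings(profile, small_batch_rows=256):
    """Return plain-language notes worth a closer look (threshold is arbitrary)."""
    notes = []
    if profile.scalar_fallback:
        notes.append("scalar fallback")
    if profile.spilled_bytes > 0:
        notes.append("spilled to disk")
    if profile.batches and avg_batch_rows(profile) < small_batch_rows:
        notes.append("small batches")
    return notes


profile = OperatorProfile(
    name="filter", rows_in=10_000, rows_out=500, batches=100,
    elapsed_ms=12.5, scalar_fallback=True,
)
print(fraction_kept(profile), avg_batch_rows(profile), findings(profile))
```

The notes are prompts for investigation, not diagnoses. A small average batch, for instance, only suggests that per-batch overhead may be worth checking against the operator timings.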

## Source-Mapped Facts

- Apache Arrow documentation describes arrays as physical layouts made from metadata, buffers, length, null count, and optional dictionaries. ([source](https://arrow.apache.org/docs/format/Columnar.html))
- DuckDB documentation says DuckDB uses a vectorized query execution model and that operators are optimized to work on fixed-size vectors. ([source](https://duckdb.org/docs/lts/internals/vector.html))
- ClickHouse documentation describes ClickHouse as column-oriented and says operations are dispatched on arrays when possible as vectorized query execution. ([source](https://clickhouse.com/docs/development/architecture))
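The Arrow-style layout in the first fact can be sketched in plain Python to show how length, null count, a validity bitmap, and a value buffer fit together. This is a simplified illustration, not the Arrow library API: `build_int64_array` and `is_valid` are hypothetical helpers, the result is a plain dict, and buffer padding and alignment are omitted. The bit convention (a set bit marks a valid slot, least-significant bit first) follows the Arrow columnar format; writing zero into null slots is a choice made here for readability.

```python
from array import array


def build_int64_array(values):
    """Build a toy int64 column from a list that may contain None."""
    length = len(values)
    validity = bytearray((length + 7) // 8)  # one bit per slot, rounded up
    data = array("q", [0]) * length          # contiguous 64-bit value buffer
    null_count = 0
    for i, value in enumerate(values):
        if value is None:
            null_count += 1                  # bit stays 0, slot is null
        else:
            validity[i // 8] |= 1 << (i % 8)  # set bit i, LSB first
            data[i] = value
    return {
        "length": length,
        "null_count": null_count,
        "validity": bytes(validity),
        "data": data,
    }


def is_valid(column, i):
    """Check slot i against the validity bitmap."""
    return bool(column["validity"][i // 8] & (1 << (i % 8)))


column = build_int64_array([1, None, 3, None, 5])
print(column["length"], column["null_count"], bin(column["validity"][0]))

# Sum only the valid slots by consulting the bitmap.
valid_sum = sum(
    column["data"][i] for i in range(column["length"]) if is_valid(column, i)
)
print(valid_sum)
```

Keeping nulls in a separate bitmap leaves the value buffer densely packed and uniformly typed, which is what lets an operator loop over it without checking a per-value tag.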

## Further Reading

- [Apache Arrow Columnar Format](https://arrow.apache.org/docs/format/Columnar.html)
- [DuckDB Execution Format](https://duckdb.org/docs/lts/internals/vector.html)
- [ClickHouse Architecture Overview](https://clickhouse.com/docs/development/architecture)