Release: OpenAddressHashAgg optimizations + PostgreSQL benchmark by poyrazK · Pull Request #164 · poyrazK/cloudSQL

poyrazK · 2026-06-13T13:12:28Z

Summary

Add PostgreSQL comparison benchmark suite (benchmarks/postgresql_comparison_bench.cpp)
Add PostgreSQL comparison documentation (docs/performance/POSTGRESQL_COMPARISON.md)
Optimize OpenAddressHashAgg with sentinel-based MIN/MAX initialization (eliminates has_mins branching)
Replace dynamic_cast with static_cast in update_bucket_accumulators for better performance
Add mask(), valid_indices(), bucket_at(), bucket_index() accessors to OpenAddressHashAgg for future batch processing

Performance Impact

GROUP BY benchmark: ~30% improvement (3.2M/s → 4.1M/s)
All 39 vectorized_operator_tests pass

Summary by CodeRabbit

New Features
- Added PostgreSQL comparison benchmark suite to evaluate engine performance against PostgreSQL backends across OLTP and analytical workloads.
Documentation
- Added comprehensive guide for PostgreSQL comparison benchmarks, including setup, execution, and performance interpretation.
Performance
- Optimized GROUP BY aggregation operations with sentinel-based tracking and batch processing.

… dynamic_cast, add accessors

coderabbitai · 2026-06-13T13:12:35Z

Warning

Review limit reached

@poyrazK, we couldn't start this review because you've reached your PR review rate limit.

More reviews will be available in 29 minutes and 5 seconds. Learn how PR review limits work.

Your organization has used up its prepaid credits, and credit purchases are no longer available. Enable the review add-on in the billing tab to keep reviews running — you're only billed for reviews past your plan's rate limits ($0.25/file).

⌛ How to resolve this issue?

After more reviews become available, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.

🚦 How do rate limits work?

CodeRabbit enforces hourly rate limits for each developer per organization.

Our paid plans include higher PR review limits than trial, open-source, and free plans. In all cases, reviews become available again over time. During sustained high-volume PR review activity, CodeRabbit may temporarily slow when the next review becomes available.

Please see our Fair Usage Limits Policy for further information.

ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: c8741479-b6e9-4b5d-b51d-d25d03992cb7

📥 Commits

Reviewing files that changed from the base of the PR and between bb0b83a and 1b661d7.

📒 Files selected for processing (4)

CMakeLists.txt
benchmarks/postgresql_comparison_bench.cpp
docs/performance/POSTGRESQL_COMPARISON.md
include/executor/vectorized_operator.hpp

📝 Walkthrough

Walkthrough

This PR adds a PostgreSQL comparison benchmark suite to measure cloudSQL's vectorized SQL engine against PostgreSQL, and optimizes GROUP BY aggregation through sentinel-based MIN/MAX tracking and batch accumulation. The benchmark covers OLTP (INSERT, UPDATE, SELECT) and analytical (SCAN, GROUP BY, JOIN, filtered queries) patterns, while aggregation improvements eliminate conditional branching and support multi-row batch updates.

Changes

PostgreSQL Comparison Benchmark Suite

Layer / File(s)	Summary
Build configuration and benchmark execution harness `CMakeLists.txt`, `benchmarks/postgresql_comparison_bench.cpp`	CMake discovers PostgreSQL libraries/headers and conditionally builds `postgresql_comparison_bench`. The benchmark program provides a SQL parser helper and two execution contexts (`PostgreSQLContext` via libpq and `CloudSQLContext` with local storage/executor components) for transactional SQL execution.
OLTP-style benchmarks (INSERT, UPDATE, point SELECT) `benchmarks/postgresql_comparison_bench.cpp`	Paired PostgreSQL and cloudSQL benchmarks for single-row writes and point lookups. Each benchmark populates tables once, then executes workload per iteration within transactions, measuring row-based throughput across argument ranges.
Analytical-style benchmarks (FULL SCAN, GROUP BY, JOIN, complex WHERE) `benchmarks/postgresql_comparison_bench.cpp`	Paired benchmarks for full-table scans, multi-group aggregations, orders-to-lineitem joins with aggregation, and compound filtered queries. Each follows the same pattern: one-time setup, per-iteration transactional execution, and processed-item tracking.
Benchmark documentation and usage guide `docs/performance/POSTGRESQL_COMPARISON.md`	Comprehensive guide covering workload definitions (OLTP vs. analytical), TPC-H-inspired schema, build/run instructions with environment variables and filters, expected outcomes, methodology notes, interpretation guidance, example JSON output, and references.

GROUP BY Aggregation Optimization

Layer / File(s)	Summary
Public bucket accessor interface `include/executor/vectorized_operator.hpp`	`OpenAddressHashAgg` adds four public methods (`mask()`, `valid_indices()`, `bucket_at()`, `bucket_index()`) to expose bucket state for external iteration and inspection.
Sentinel-based MIN/MAX initialization `include/executor/vectorized_operator.hpp`	Both generic-key and int64-fast-path insertion initialize MIN accumulators to `numeric_limits::max()` and MAX to `lowest()`, eliminating conditional "first value" checks during later accumulation updates.
Batch scratch vectors for state tracking `include/executor/vectorized_operator.hpp`	`VectorizedGroupByOperator` introduces `batch_bucket_idx_` and `batch_active_buckets_` scratch vectors to track per-row bucket assignments and active buckets within a batch, initialized in the constructor.
Batch-oriented aggregation update logic `include/executor/vectorized_operator.hpp`	SUM accumulation switches from `dynamic_cast` to `static_cast` for numeric vectors. MIN/MAX updates now use sentinel-initialized values with `std::min`/`std::max` directly. New `update_aggregate_batch` template helper updates COUNT/SUM/MIN/MAX across multiple row indices per call.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Possibly related PRs

poyrazK/cloudSQL#160: Both PRs modify OpenAddressHashAgg and VectorizedGroupByOperator bucket handling, MIN/MAX logic, and aggregation update mechanics in the same GROUP BY implementation.
poyrazK/cloudSQL#157: Both PRs modify GROUP BY aggregation internals in include/executor/vectorized_operator.hpp, including accumulator updates and MIN/MAX sentinel-based tracking.
poyrazK/cloudSQL#16: Both PRs extend the benchmark build infrastructure in CMakeLists.txt by conditionally adding benchmark executables under the BUILD_BENCHMARKS section.

Poem

🐰 A vectorized leap, benchmarks in sight,
Sentinels guard the MIN and MAX right,
PostgreSQL meets cloudSQL's might,
Batches of rows processed with delight,
Performance metrics burning bright! ✨

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 50.00% which is insufficient. The required threshold is 80.00%.	Write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (4 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The title accurately captures the two main changes: OpenAddressHashAgg optimizations and the new PostgreSQL benchmark suite. It is specific, concise, and directly reflects the primary modifications in the changeset.
Linked Issues check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check	✅ Passed	Check skipped because no linked issues were found for this pull request.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing Touches

🧪 Generate unit tests (beta)

Create PR with unit tests
Commit unit tests in branch release/hashagg-optimization

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

coderabbitai

Actionable comments posted: 5

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)

include/executor/vectorized_operator.hpp (2)

1170-1234: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Logic error in update_aggregate_batch: MIN/MAX branch is unreachable for int64/float64 columns.

The conditional structure is:

if (col.type() == TYPE_INT64) { /* sums */ }
else if (col.type() == TYPE_FLOAT64) { /* sums */ }
else if (agg.type == Min || agg.type == Max) { /* min/max */ }

The MIN/MAX branch (line 1207) only executes when col.type() is neither TYPE_INT64 nor TYPE_FLOAT64. For numeric MIN/MAX aggregates on int64 or float64 columns, the sum-update branches execute instead, producing incorrect results.

Since this helper is not yet called (scaffolding for future batch processing per PR summary), this is not a runtime bug today, but will cause incorrect aggregation when integrated.

Suggested fix

     // Type resolved ONCE, then batch process all rows
-    if (col.type() == common::ValueType::TYPE_INT64) {
+    if (agg.type == AggregateType::Min || agg.type == AggregateType::Max) {
+        // MIN/MAX with sentinel-based approach
+        if (col.type() == common::ValueType::TYPE_INT64) {
+            const auto& num_col = static_cast<const NumericVector<int64_t>&>(col);
+            const int64_t* raw = num_col.raw_data();
+            for (size_t j = 0; j < num_rows; ++j) {
+                size_t r = row_indices[j];
+                if (!num_col.is_null(r)) {
+                    bucket.mins[agg_idx] = std::min(bucket.mins[agg_idx], raw[r]);
+                    bucket.maxes[agg_idx] = std::max(bucket.maxes[agg_idx], raw[r]);
+                    bucket.has_mins[agg_idx] = true;
+                }
+            }
+        } else if (col.type() == common::ValueType::TYPE_FLOAT64) {
+            const auto& num_col = static_cast<const NumericVector<double>&>(col);
+            const double* raw = num_col.raw_data();
+            for (size_t j = 0; j < num_rows; ++j) {
+                size_t r = row_indices[j];
+                if (!num_col.is_null(r)) {
+                    bucket.mins_float64[agg_idx] = std::min(bucket.mins_float64[agg_idx], raw[r]);
+                    bucket.maxes_float64[agg_idx] = std::max(bucket.maxes_float64[agg_idx], raw[r]);
+                    bucket.has_float_minmax[agg_idx] = true;
+                }
+            }
+        }
+    } else if (col.type() == common::ValueType::TYPE_INT64) {
         const auto& num_col = static_cast<const NumericVector<int64_t>&>(col);
         // ... existing sum logic ...

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@include/executor/vectorized_operator.hpp` around lines 1170 - 1234, The batch
update logic in update_aggregate_batch incorrectly places the
AggregateType::Min/Max handling after the TYPE_INT64/TYPE_FLOAT64 branches so
numeric MIN/MAX never run; change the control flow so that agg.type is checked
first for MIN/MAX (using aggregates_[agg_idx].type / agg.type) and then, inside
that branch, dispatch on col.type() to update bucket.mins/mins_float64 and
bucket.maxes/maxes_float64 (using NumericVector<int64_t>/NumericVector<double>
and raw_data()), or alternatively keep the numeric-type branches but branch on
agg.type inside them to perform sum (bucket.sums_int64/sums_float64 and
bucket.has_float_value) for non-min/max and min/max updates for
AggregateType::Min/Max; ensure Count and input_col_idx handling remains
unchanged.

1150-1168: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Sentinel-based MIN/MAX update does not set has_mins flag.

This sentinel-based approach correctly uses std::min/std::max without branching on first value, but never sets bucket.has_mins[i] = true after a non-NULL update. This breaks:

Parallel aggregation: merge_from checks has_mins and skips merging (critical issue raised above).
Groups with all-NULL aggregate columns: Output phase will emit sentinel values (INT64_MAX for MIN, INT64_MIN for MAX) instead of NULL.

Add bucket.has_mins[i] = true; (for int64) and bucket.has_float_minmax[i] = true; (for float64) after the std::min/std::max updates:

                     bucket.mins_float64[i] = std::min(bucket.mins_float64[i], val);
                     bucket.maxes_float64[i] = std::max(bucket.maxes_float64[i], val);
+                    bucket.has_float_minmax[i] = true;
                 } else {
                     auto val = col.get(row_idx).to_int64();
                     bucket.mins[i] = std::min(bucket.mins[i], val);
                     bucket.maxes[i] = std::max(bucket.maxes[i], val);
+                    bucket.has_mins[i] = true;
                 }

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@include/executor/vectorized_operator.hpp` around lines 1150 - 1168, The
MIN/MAX sentinel update block for AggregateType::Min/Max updates
bucket.mins/bucket.maxes and bucket.mins_float64/bucket.maxes_float64 but never
marks the corresponding presence flags, which breaks merge_from and null-output
handling; after performing the std::min/std::max update for the int64 path set
bucket.has_mins[i] = true, and after the float64 path set
bucket.has_float_minmax[i] = true so the bucket correctly records that a
non-NULL value has been seen.

🧹 Nitpick comments (1)

CMakeLists.txt (1)

199-200: ⚡ Quick win

Add Homebrew search paths for macOS consistency.

The DuckDB benchmark (lines 187-188) includes /opt/homebrew/lib and /opt/homebrew/include in the search paths for macOS Homebrew installations. For consistency and improved portability, consider adding the same paths here.

🔧 Suggested enhancement

-    find_library(PQLIB_LIBRARY pq PATHS /usr/lib /usr/local/lib)
-    find_path(PQLIB_INCLUDE_DIR libpq-fe.h PATHS /usr/include /usr/local/include)
+    find_library(PQLIB_LIBRARY pq PATHS /opt/homebrew/lib /usr/lib /usr/local/lib)
+    find_path(PQLIB_INCLUDE_DIR libpq-fe.h PATHS /opt/homebrew/include /usr/include /usr/local/include)

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@CMakeLists.txt` around lines 199 - 200, The find commands for libpq currently
search only /usr/lib and /usr/local/lib for find_library(PQLIB_LIBRARY) and
/usr/include and /usr/local/include for find_path(PQLIB_INCLUDE_DIR); update
these invocations to also include macOS Homebrew locations by adding
/opt/homebrew/lib to the PATHS for find_library and /opt/homebrew/include to the
PATHS for find_path so Homebrew-installed PostgreSQL is discovered consistently
(update the lines containing find_library(PQLIB_LIBRARY pq PATHS ...) and
find_path(PQLIB_INCLUDE_DIR libpq-fe.h PATHS ...)).

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@benchmarks/postgresql_comparison_bench.cpp`:
- Around line 148-189: The INSERT benchmarks (BM_PostgreSQL_Insert and
BM_CloudSQL_Insert) append rows every iteration because the table is created
once and never cleared; fix by clearing the lineitem table at the start of each
benchmark iteration or document intent. Modify BM_PostgreSQL_Insert to call
ctx.execute_sql("TRUNCATE TABLE lineitem") (or DELETE FROM lineitem) immediately
inside the for (auto _ : state) loop before "BEGIN" (keeping ctx.create_tables()
once before the loop). Modify BM_CloudSQL_Insert similarly: call
ctx.executor->execute("TRUNCATE TABLE lineitem") (or the appropriate ParseSQL
variant) at the top of its per-iteration loop before "BEGIN". Alternatively, if
growth is intentional, add a clear comment in both BM_PostgreSQL_Insert and
BM_CloudSQL_Insert stating the benchmark measures inserts into a growing table.
- Around line 64-78: In create_tables(), each PQexec call currently discards the
PGresult* and ignores errors; capture each call's PGresult* (e.g., r =
PQexec(conn, "...")), check PQresultStatus(r) for PGRES_COMMAND_OK (or
appropriate success codes), handle/report failures (fprintf(stderr, ...) or
throw/exit) and always call PQclear(r) after checking; apply this pattern to the
SET, DROP TABLE and CREATE TABLE statements so no PGresult* is leaked and errors
are surfaced.

In `@docs/performance/POSTGRESQL_COMPARISON.md`:
- Line 163: Update the broken relative link in the Markdown line containing
"cloudSQL: [GitHub Repository](../README.md)" so it points to the repository
root README; replace "(../README.md)" with "(../../README.md)" to correctly
navigate up two levels to the top-level README.
- Around line 30-44: The docs declare PRIMARY KEY on l_orderkey and o_orderkey
in the CREATE TABLE examples but the benchmark's actual table-creation SQL for
the lineitem and orders tables does not add those constraints; fix by either (A)
updating the documentation to remove the PRIMARY KEY annotations for
lineitem.l_orderkey and orders.o_orderkey so the docs match the current
implementation, or (B) updating the implementation's CREATE TABLE statements for
the lineitem and orders tables to add "PRIMARY KEY" to l_orderkey and o_orderkey
respectively (and ensure any required index/constraint handling in the setup
code is adjusted accordingly) so the code matches the docs.

In `@include/executor/vectorized_operator.hpp`:
- Around line 533-539: The sentinel initialization disables MIN/MAX merging
because has_mins remains false; fix by setting has_mins[a] = true when
initializing bucket.mins/maxes/mins_float64/maxes_float64 (in the block with
bucket.mins[a] = std::numeric_limits<int64_t>::max()), and also update
update_bucket_accumulators to set has_mins[i] = true (and has_float_minmax[i] =
true for floating updates) whenever a non-NULL value updates a per-bucket min or
max; this preserves the existing merge_from logic (which checks src.has_mins[i])
and ensures thread-local buckets are merged correctly.

---

Outside diff comments:
In `@include/executor/vectorized_operator.hpp`:
- Around line 1170-1234: The batch update logic in update_aggregate_batch
incorrectly places the AggregateType::Min/Max handling after the
TYPE_INT64/TYPE_FLOAT64 branches so numeric MIN/MAX never run; change the
control flow so that agg.type is checked first for MIN/MAX (using
aggregates_[agg_idx].type / agg.type) and then, inside that branch, dispatch on
col.type() to update bucket.mins/mins_float64 and bucket.maxes/maxes_float64
(using NumericVector<int64_t>/NumericVector<double> and raw_data()), or
alternatively keep the numeric-type branches but branch on agg.type inside them
to perform sum (bucket.sums_int64/sums_float64 and bucket.has_float_value) for
non-min/max and min/max updates for AggregateType::Min/Max; ensure Count and
input_col_idx handling remains unchanged.
- Around line 1150-1168: The MIN/MAX sentinel update block for
AggregateType::Min/Max updates bucket.mins/bucket.maxes and
bucket.mins_float64/bucket.maxes_float64 but never marks the corresponding
presence flags, which breaks merge_from and null-output handling; after
performing the std::min/std::max update for the int64 path set
bucket.has_mins[i] = true, and after the float64 path set
bucket.has_float_minmax[i] = true so the bucket correctly records that a
non-NULL value has been seen.

---

Nitpick comments:
In `@CMakeLists.txt`:
- Around line 199-200: The find commands for libpq currently search only
/usr/lib and /usr/local/lib for find_library(PQLIB_LIBRARY) and /usr/include and
/usr/local/include for find_path(PQLIB_INCLUDE_DIR); update these invocations to
also include macOS Homebrew locations by adding /opt/homebrew/lib to the PATHS
for find_library and /opt/homebrew/include to the PATHS for find_path so
Homebrew-installed PostgreSQL is discovered consistently (update the lines
containing find_library(PQLIB_LIBRARY pq PATHS ...) and
find_path(PQLIB_INCLUDE_DIR libpq-fe.h PATHS ...)).

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 19f084b0-3cac-48c6-a7d6-011a8d422c9a

📥 Commits

Reviewing files that changed from the base of the PR and between 4f41d62 and bb0b83a.

📒 Files selected for processing (4)

CMakeLists.txt
benchmarks/postgresql_comparison_bench.cpp
docs/performance/POSTGRESQL_COMPARISON.md
include/executor/vectorized_operator.hpp

…ark code)

…gs, Homebrew paths - Fix PGresult* leak in create_tables(): check PQresultStatus and always PQclear - Fix INSERT benchmark: add TRUNCATE at start of each iteration for fair measurement - Fix MIN/MAX presence flags: set has_mins/has_float_minmax in update_bucket_accumulators - Add Homebrew paths for macOS PostgreSQL discovery in CMakeLists.txt - Fix broken link in POSTGRESQL_COMPARISON.md (../README.md -> ../../README.md)

poyrazK

It's okay to merge

poyrazK added 3 commits June 13, 2026 16:11

Add PostgreSQL comparison benchmark target with libpq linking

a27966d

Optimize OpenAddressHashAgg: sentinel-based MIN/MAX, static_cast over…

fafdd8e

… dynamic_cast, add accessors

Add PostgreSQL comparison benchmark suite and documentation

36e1cca

style: automated clang-format fixes

bb0b83a

coderabbitai Bot reviewed Jun 13, 2026

View reviewed changes

Comment thread benchmarks/postgresql_comparison_bench.cpp

Comment thread benchmarks/postgresql_comparison_bench.cpp

Comment thread docs/performance/POSTGRESQL_COMPARISON.md

Comment thread docs/performance/POSTGRESQL_COMPARISON.md Outdated

Comment thread include/executor/vectorized_operator.hpp

poyrazK added 3 commits June 13, 2026 16:25

Clean up dead code from batch optimization attempt

c136c94

Fix documentation: schema shows no PRIMARY KEY (matches actual benchm…

b088558

…ark code)

poyrazK commented Jun 13, 2026

View reviewed changes

poyrazK merged commit d9dc734 into main Jun 13, 2026
9 checks passed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Release: OpenAddressHashAgg optimizations + PostgreSQL benchmark#164

Release: OpenAddressHashAgg optimizations + PostgreSQL benchmark#164
poyrazK merged 7 commits into
mainfrom
release/hashagg-optimization

poyrazK commented Jun 13, 2026 •

edited by coderabbitai Bot

Loading

Uh oh!

coderabbitai Bot commented Jun 13, 2026 •

edited

Loading

Review limit reached

Walkthrough

Changes

Estimated code review effort

Possibly related PRs

Poem

❌ Failed checks (1 warning)

Uh oh!

coderabbitai Bot left a comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

poyrazK left a comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

poyrazK commented Jun 13, 2026 • edited by coderabbitai Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Performance Impact

Summary by CodeRabbit

Uh oh!

coderabbitai Bot commented Jun 13, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Review limit reached

Walkthrough

Changes

Estimated code review effort

Possibly related PRs

Poem

❌ Failed checks (1 warning)

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

poyrazK left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

poyrazK commented Jun 13, 2026 •

edited by coderabbitai Bot

Loading

coderabbitai Bot commented Jun 13, 2026 •

edited

Loading