Skip to content

Release: OpenAddressHashAgg optimizations + PostgreSQL benchmark#164

Merged
poyrazK merged 7 commits into
mainfrom
release/hashagg-optimization
Jun 13, 2026
Merged

Release: OpenAddressHashAgg optimizations + PostgreSQL benchmark#164
poyrazK merged 7 commits into
mainfrom
release/hashagg-optimization

Conversation

@poyrazK

@poyrazK poyrazK commented Jun 13, 2026

Copy link
Copy Markdown
Owner

Summary

  • Add PostgreSQL comparison benchmark suite (benchmarks/postgresql_comparison_bench.cpp)
  • Add PostgreSQL comparison documentation (docs/performance/POSTGRESQL_COMPARISON.md)
  • Optimize OpenAddressHashAgg with sentinel-based MIN/MAX initialization (eliminates has_mins branching)
  • Replace dynamic_cast with static_cast in update_bucket_accumulators for better performance
  • Add mask(), valid_indices(), bucket_at(), bucket_index() accessors to OpenAddressHashAgg for future batch processing

Performance Impact

  • GROUP BY benchmark: ~30% improvement (3.2M/s → 4.1M/s)
  • All 39 vectorized_operator_tests pass

Summary by CodeRabbit

  • New Features

    • Added PostgreSQL comparison benchmark suite to evaluate engine performance against PostgreSQL backends across OLTP and analytical workloads.
  • Documentation

    • Added comprehensive guide for PostgreSQL comparison benchmarks, including setup, execution, and performance interpretation.
  • Performance

    • Optimized GROUP BY aggregation operations with sentinel-based tracking and batch processing.

@coderabbitai

coderabbitai Bot commented Jun 13, 2026

Copy link
Copy Markdown

Review Change Stack

Warning

Review limit reached

@poyrazK, we couldn't start this review because you've reached your PR review rate limit.

More reviews will be available in 29 minutes and 5 seconds. Learn how PR review limits work.

Your organization has used up its prepaid credits, and credit purchases are no longer available. Enable the review add-on in the billing tab to keep reviews running — you're only billed for reviews past your plan's rate limits ($0.25/file).

⌛ How to resolve this issue?

After more reviews become available, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.

🚦 How do rate limits work?

CodeRabbit enforces hourly rate limits for each developer per organization.

Our paid plans include higher PR review limits than trial, open-source, and free plans. In all cases, reviews become available again over time. During sustained high-volume PR review activity, CodeRabbit may temporarily slow when the next review becomes available.

Please see our Fair Usage Limits Policy for further information.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: c8741479-b6e9-4b5d-b51d-d25d03992cb7

📥 Commits

Reviewing files that changed from the base of the PR and between bb0b83a and 1b661d7.

📒 Files selected for processing (4)
  • CMakeLists.txt
  • benchmarks/postgresql_comparison_bench.cpp
  • docs/performance/POSTGRESQL_COMPARISON.md
  • include/executor/vectorized_operator.hpp
📝 Walkthrough

Walkthrough

This PR adds a PostgreSQL comparison benchmark suite to measure cloudSQL's vectorized SQL engine against PostgreSQL, and optimizes GROUP BY aggregation through sentinel-based MIN/MAX tracking and batch accumulation. The benchmark covers OLTP (INSERT, UPDATE, SELECT) and analytical (SCAN, GROUP BY, JOIN, filtered queries) patterns, while aggregation improvements eliminate conditional branching and support multi-row batch updates.

Changes

PostgreSQL Comparison Benchmark Suite

Layer / File(s) Summary
Build configuration and benchmark execution harness
CMakeLists.txt, benchmarks/postgresql_comparison_bench.cpp
CMake discovers PostgreSQL libraries/headers and conditionally builds postgresql_comparison_bench. The benchmark program provides a SQL parser helper and two execution contexts (PostgreSQLContext via libpq and CloudSQLContext with local storage/executor components) for transactional SQL execution.
OLTP-style benchmarks (INSERT, UPDATE, point SELECT)
benchmarks/postgresql_comparison_bench.cpp
Paired PostgreSQL and cloudSQL benchmarks for single-row writes and point lookups. Each benchmark populates tables once, then executes workload per iteration within transactions, measuring row-based throughput across argument ranges.
Analytical-style benchmarks (FULL SCAN, GROUP BY, JOIN, complex WHERE)
benchmarks/postgresql_comparison_bench.cpp
Paired benchmarks for full-table scans, multi-group aggregations, orders-to-lineitem joins with aggregation, and compound filtered queries. Each follows the same pattern: one-time setup, per-iteration transactional execution, and processed-item tracking.
Benchmark documentation and usage guide
docs/performance/POSTGRESQL_COMPARISON.md
Comprehensive guide covering workload definitions (OLTP vs. analytical), TPC-H-inspired schema, build/run instructions with environment variables and filters, expected outcomes, methodology notes, interpretation guidance, example JSON output, and references.

GROUP BY Aggregation Optimization

Layer / File(s) Summary
Public bucket accessor interface
include/executor/vectorized_operator.hpp
OpenAddressHashAgg adds four public methods (mask(), valid_indices(), bucket_at(), bucket_index()) to expose bucket state for external iteration and inspection.
Sentinel-based MIN/MAX initialization
include/executor/vectorized_operator.hpp
Both generic-key and int64-fast-path insertion initialize MIN accumulators to numeric_limits::max() and MAX to lowest(), eliminating conditional "first value" checks during later accumulation updates.
Batch scratch vectors for state tracking
include/executor/vectorized_operator.hpp
VectorizedGroupByOperator introduces batch_bucket_idx_ and batch_active_buckets_ scratch vectors to track per-row bucket assignments and active buckets within a batch, initialized in the constructor.
Batch-oriented aggregation update logic
include/executor/vectorized_operator.hpp
SUM accumulation switches from dynamic_cast to static_cast for numeric vectors. MIN/MAX updates now use sentinel-initialized values with std::min/std::max directly. New update_aggregate_batch template helper updates COUNT/SUM/MIN/MAX across multiple row indices per call.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Possibly related PRs

  • poyrazK/cloudSQL#160: Both PRs modify OpenAddressHashAgg and VectorizedGroupByOperator bucket handling, MIN/MAX logic, and aggregation update mechanics in the same GROUP BY implementation.
  • poyrazK/cloudSQL#157: Both PRs modify GROUP BY aggregation internals in include/executor/vectorized_operator.hpp, including accumulator updates and MIN/MAX sentinel-based tracking.
  • poyrazK/cloudSQL#16: Both PRs extend the benchmark build infrastructure in CMakeLists.txt by conditionally adding benchmark executables under the BUILD_BENCHMARKS section.

Poem

🐰 A vectorized leap, benchmarks in sight,
Sentinels guard the MIN and MAX right,
PostgreSQL meets cloudSQL's might,
Batches of rows processed with delight,
Performance metrics burning bright! ✨

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 50.00% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title accurately captures the two main changes: OpenAddressHashAgg optimizations and the new PostgreSQL benchmark suite. It is specific, concise, and directly reflects the primary modifications in the changeset.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch release/hashagg-optimization

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 5

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
include/executor/vectorized_operator.hpp (2)

1170-1234: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Logic error in update_aggregate_batch: MIN/MAX branch is unreachable for int64/float64 columns.

The conditional structure is:

if (col.type() == TYPE_INT64) { /* sums */ }
else if (col.type() == TYPE_FLOAT64) { /* sums */ }
else if (agg.type == Min || agg.type == Max) { /* min/max */ }

The MIN/MAX branch (line 1207) only executes when col.type() is neither TYPE_INT64 nor TYPE_FLOAT64. For numeric MIN/MAX aggregates on int64 or float64 columns, the sum-update branches execute instead, producing incorrect results.

Since this helper is not yet called (scaffolding for future batch processing per PR summary), this is not a runtime bug today, but will cause incorrect aggregation when integrated.

Suggested fix
     // Type resolved ONCE, then batch process all rows
-    if (col.type() == common::ValueType::TYPE_INT64) {
+    if (agg.type == AggregateType::Min || agg.type == AggregateType::Max) {
+        // MIN/MAX with sentinel-based approach
+        if (col.type() == common::ValueType::TYPE_INT64) {
+            const auto& num_col = static_cast<const NumericVector<int64_t>&>(col);
+            const int64_t* raw = num_col.raw_data();
+            for (size_t j = 0; j < num_rows; ++j) {
+                size_t r = row_indices[j];
+                if (!num_col.is_null(r)) {
+                    bucket.mins[agg_idx] = std::min(bucket.mins[agg_idx], raw[r]);
+                    bucket.maxes[agg_idx] = std::max(bucket.maxes[agg_idx], raw[r]);
+                    bucket.has_mins[agg_idx] = true;
+                }
+            }
+        } else if (col.type() == common::ValueType::TYPE_FLOAT64) {
+            const auto& num_col = static_cast<const NumericVector<double>&>(col);
+            const double* raw = num_col.raw_data();
+            for (size_t j = 0; j < num_rows; ++j) {
+                size_t r = row_indices[j];
+                if (!num_col.is_null(r)) {
+                    bucket.mins_float64[agg_idx] = std::min(bucket.mins_float64[agg_idx], raw[r]);
+                    bucket.maxes_float64[agg_idx] = std::max(bucket.maxes_float64[agg_idx], raw[r]);
+                    bucket.has_float_minmax[agg_idx] = true;
+                }
+            }
+        }
+    } else if (col.type() == common::ValueType::TYPE_INT64) {
         const auto& num_col = static_cast<const NumericVector<int64_t>&>(col);
         // ... existing sum logic ...
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@include/executor/vectorized_operator.hpp` around lines 1170 - 1234, The batch
update logic in update_aggregate_batch incorrectly places the
AggregateType::Min/Max handling after the TYPE_INT64/TYPE_FLOAT64 branches so
numeric MIN/MAX never run; change the control flow so that agg.type is checked
first for MIN/MAX (using aggregates_[agg_idx].type / agg.type) and then, inside
that branch, dispatch on col.type() to update bucket.mins/mins_float64 and
bucket.maxes/maxes_float64 (using NumericVector<int64_t>/NumericVector<double>
and raw_data()), or alternatively keep the numeric-type branches but branch on
agg.type inside them to perform sum (bucket.sums_int64/sums_float64 and
bucket.has_float_value) for non-min/max and min/max updates for
AggregateType::Min/Max; ensure Count and input_col_idx handling remains
unchanged.

1150-1168: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Sentinel-based MIN/MAX update does not set has_mins flag.

This sentinel-based approach correctly uses std::min/std::max without branching on first value, but never sets bucket.has_mins[i] = true after a non-NULL update. This breaks:

  1. Parallel aggregation: merge_from checks has_mins and skips merging (critical issue raised above).
  2. Groups with all-NULL aggregate columns: Output phase will emit sentinel values (INT64_MAX for MIN, INT64_MIN for MAX) instead of NULL.

Add bucket.has_mins[i] = true; (for int64) and bucket.has_float_minmax[i] = true; (for float64) after the std::min/std::max updates:

                     bucket.mins_float64[i] = std::min(bucket.mins_float64[i], val);
                     bucket.maxes_float64[i] = std::max(bucket.maxes_float64[i], val);
+                    bucket.has_float_minmax[i] = true;
                 } else {
                     auto val = col.get(row_idx).to_int64();
                     bucket.mins[i] = std::min(bucket.mins[i], val);
                     bucket.maxes[i] = std::max(bucket.maxes[i], val);
+                    bucket.has_mins[i] = true;
                 }
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@include/executor/vectorized_operator.hpp` around lines 1150 - 1168, The
MIN/MAX sentinel update block for AggregateType::Min/Max updates
bucket.mins/bucket.maxes and bucket.mins_float64/bucket.maxes_float64 but never
marks the corresponding presence flags, which breaks merge_from and null-output
handling; after performing the std::min/std::max update for the int64 path set
bucket.has_mins[i] = true, and after the float64 path set
bucket.has_float_minmax[i] = true so the bucket correctly records that a
non-NULL value has been seen.
🧹 Nitpick comments (1)
CMakeLists.txt (1)

199-200: ⚡ Quick win

Add Homebrew search paths for macOS consistency.

The DuckDB benchmark (lines 187-188) includes /opt/homebrew/lib and /opt/homebrew/include in the search paths for macOS Homebrew installations. For consistency and improved portability, consider adding the same paths here.

🔧 Suggested enhancement
-    find_library(PQLIB_LIBRARY pq PATHS /usr/lib /usr/local/lib)
-    find_path(PQLIB_INCLUDE_DIR libpq-fe.h PATHS /usr/include /usr/local/include)
+    find_library(PQLIB_LIBRARY pq PATHS /opt/homebrew/lib /usr/lib /usr/local/lib)
+    find_path(PQLIB_INCLUDE_DIR libpq-fe.h PATHS /opt/homebrew/include /usr/include /usr/local/include)
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@CMakeLists.txt` around lines 199 - 200, The find commands for libpq currently
search only /usr/lib and /usr/local/lib for find_library(PQLIB_LIBRARY) and
/usr/include and /usr/local/include for find_path(PQLIB_INCLUDE_DIR); update
these invocations to also include macOS Homebrew locations by adding
/opt/homebrew/lib to the PATHS for find_library and /opt/homebrew/include to the
PATHS for find_path so Homebrew-installed PostgreSQL is discovered consistently
(update the lines containing find_library(PQLIB_LIBRARY pq PATHS ...) and
find_path(PQLIB_INCLUDE_DIR libpq-fe.h PATHS ...)).
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@benchmarks/postgresql_comparison_bench.cpp`:
- Around line 148-189: The INSERT benchmarks (BM_PostgreSQL_Insert and
BM_CloudSQL_Insert) append rows every iteration because the table is created
once and never cleared; fix by clearing the lineitem table at the start of each
benchmark iteration or document intent. Modify BM_PostgreSQL_Insert to call
ctx.execute_sql("TRUNCATE TABLE lineitem") (or DELETE FROM lineitem) immediately
inside the for (auto _ : state) loop before "BEGIN" (keeping ctx.create_tables()
once before the loop). Modify BM_CloudSQL_Insert similarly: call
ctx.executor->execute("TRUNCATE TABLE lineitem") (or the appropriate ParseSQL
variant) at the top of its per-iteration loop before "BEGIN". Alternatively, if
growth is intentional, add a clear comment in both BM_PostgreSQL_Insert and
BM_CloudSQL_Insert stating the benchmark measures inserts into a growing table.
- Around line 64-78: In create_tables(), each PQexec call currently discards the
PGresult* and ignores errors; capture each call's PGresult* (e.g., r =
PQexec(conn, "...")), check PQresultStatus(r) for PGRES_COMMAND_OK (or
appropriate success codes), handle/report failures (fprintf(stderr, ...) or
throw/exit) and always call PQclear(r) after checking; apply this pattern to the
SET, DROP TABLE and CREATE TABLE statements so no PGresult* is leaked and errors
are surfaced.

In `@docs/performance/POSTGRESQL_COMPARISON.md`:
- Line 163: Update the broken relative link in the Markdown line containing
"cloudSQL: [GitHub Repository](../README.md)" so it points to the repository
root README; replace "(../README.md)" with "(../../README.md)" to correctly
navigate up two levels to the top-level README.
- Around line 30-44: The docs declare PRIMARY KEY on l_orderkey and o_orderkey
in the CREATE TABLE examples but the benchmark's actual table-creation SQL for
the lineitem and orders tables does not add those constraints; fix by either (A)
updating the documentation to remove the PRIMARY KEY annotations for
lineitem.l_orderkey and orders.o_orderkey so the docs match the current
implementation, or (B) updating the implementation's CREATE TABLE statements for
the lineitem and orders tables to add "PRIMARY KEY" to l_orderkey and o_orderkey
respectively (and ensure any required index/constraint handling in the setup
code is adjusted accordingly) so the code matches the docs.

In `@include/executor/vectorized_operator.hpp`:
- Around line 533-539: The sentinel initialization disables MIN/MAX merging
because has_mins remains false; fix by setting has_mins[a] = true when
initializing bucket.mins/maxes/mins_float64/maxes_float64 (in the block with
bucket.mins[a] = std::numeric_limits<int64_t>::max()), and also update
update_bucket_accumulators to set has_mins[i] = true (and has_float_minmax[i] =
true for floating updates) whenever a non-NULL value updates a per-bucket min or
max; this preserves the existing merge_from logic (which checks src.has_mins[i])
and ensures thread-local buckets are merged correctly.

---

Outside diff comments:
In `@include/executor/vectorized_operator.hpp`:
- Around line 1170-1234: The batch update logic in update_aggregate_batch
incorrectly places the AggregateType::Min/Max handling after the
TYPE_INT64/TYPE_FLOAT64 branches so numeric MIN/MAX never run; change the
control flow so that agg.type is checked first for MIN/MAX (using
aggregates_[agg_idx].type / agg.type) and then, inside that branch, dispatch on
col.type() to update bucket.mins/mins_float64 and bucket.maxes/maxes_float64
(using NumericVector<int64_t>/NumericVector<double> and raw_data()), or
alternatively keep the numeric-type branches but branch on agg.type inside them
to perform sum (bucket.sums_int64/sums_float64 and bucket.has_float_value) for
non-min/max and min/max updates for AggregateType::Min/Max; ensure Count and
input_col_idx handling remains unchanged.
- Around line 1150-1168: The MIN/MAX sentinel update block for
AggregateType::Min/Max updates bucket.mins/bucket.maxes and
bucket.mins_float64/bucket.maxes_float64 but never marks the corresponding
presence flags, which breaks merge_from and null-output handling; after
performing the std::min/std::max update for the int64 path set
bucket.has_mins[i] = true, and after the float64 path set
bucket.has_float_minmax[i] = true so the bucket correctly records that a
non-NULL value has been seen.

---

Nitpick comments:
In `@CMakeLists.txt`:
- Around line 199-200: The find commands for libpq currently search only
/usr/lib and /usr/local/lib for find_library(PQLIB_LIBRARY) and /usr/include and
/usr/local/include for find_path(PQLIB_INCLUDE_DIR); update these invocations to
also include macOS Homebrew locations by adding /opt/homebrew/lib to the PATHS
for find_library and /opt/homebrew/include to the PATHS for find_path so
Homebrew-installed PostgreSQL is discovered consistently (update the lines
containing find_library(PQLIB_LIBRARY pq PATHS ...) and
find_path(PQLIB_INCLUDE_DIR libpq-fe.h PATHS ...)).
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 19f084b0-3cac-48c6-a7d6-011a8d422c9a

📥 Commits

Reviewing files that changed from the base of the PR and between 4f41d62 and bb0b83a.

📒 Files selected for processing (4)
  • CMakeLists.txt
  • benchmarks/postgresql_comparison_bench.cpp
  • docs/performance/POSTGRESQL_COMPARISON.md
  • include/executor/vectorized_operator.hpp

Comment thread benchmarks/postgresql_comparison_bench.cpp
Comment thread benchmarks/postgresql_comparison_bench.cpp
Comment thread docs/performance/POSTGRESQL_COMPARISON.md
Comment thread docs/performance/POSTGRESQL_COMPARISON.md Outdated
Comment thread include/executor/vectorized_operator.hpp
poyrazK added 3 commits June 13, 2026 16:25
…gs, Homebrew paths

- Fix PGresult* leak in create_tables(): check PQresultStatus and always PQclear
- Fix INSERT benchmark: add TRUNCATE at start of each iteration for fair measurement
- Fix MIN/MAX presence flags: set has_mins/has_float_minmax in update_bucket_accumulators
- Add Homebrew paths for macOS PostgreSQL discovery in CMakeLists.txt
- Fix broken link in POSTGRESQL_COMPARISON.md (../README.md -> ../../README.md)

@poyrazK poyrazK left a comment

Copy link
Copy Markdown
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's okay to merge

@poyrazK poyrazK merged commit d9dc734 into main Jun 13, 2026
9 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant