feat: add bucket and level statistics to ManifestFileMeta #350

Merged
JingsongLi merged 1 commit into apache:main from shyjsarah:worktree-fix-manifest-schema-bucket-level
Jun 2, 2026

@shyjsarah

Purpose

paimon-rust's MANIFEST_FILE_META_SCHEMA declares only 9 fields, while the upstream Java org.apache.paimon.manifest.ManifestFileMeta.SCHEMA declares 12. The four bucket / level pruning fields added in apache/paimon#5345 (_MIN_BUCKET, _MAX_BUCKET, _MIN_LEVEL, _MAX_LEVEL) are absent from both the Avro schema and the Rust struct. Manifests written by paimon-rust therefore carry no bucket / level summary.

This is not a correctness bug — the Java reader's pruning code in ManifestsReader.filterManifestFileMeta guards every branch with Integer != null, so missing fields just short-circuit the optimization. But it silently disables three production optimizations on tables originating from paimon-rust:

  • specifiedBucket pruning — bucket-targeted Spark/Flink scans (ReadBuilder.withBucket(...), bucketed joins, runtime filter pushdown) fall back to reading every manifest instead of filtering by bucket range.
  • levelMinMaxFilter pruning — Java's compaction (CompactAction / minor compaction) reads extra manifests that could be skipped.
  • onlyReadRealBuckets — commit-time StrictModeChecker no longer skips virtual / index-only manifests (those with negative bucket numbers).
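
The null-guarded pruning described above can be illustrated with a small Rust sketch. This is not the Java `ManifestsReader` code: `ManifestStats`, `may_contain_bucket`, and `may_overlap_levels` are hypothetical names, and modelling the level filter as a plain range-overlap check is an assumption made for illustration. The point it shows is that absent statistics never exclude a manifest; they only remove the chance to skip it.

```rust
/// Hypothetical stand-in for the bucket / level summary of one manifest file.
/// `None` means the writer recorded no statistics for that bound.
#[derive(Debug, Clone, Copy, Default)]
pub struct ManifestStats {
    pub min_bucket: Option<i32>,
    pub max_bucket: Option<i32>,
    pub min_level: Option<i32>,
    pub max_level: Option<i32>,
}

/// Returns `false` only when the stats prove the manifest cannot hold `bucket`.
pub fn may_contain_bucket(stats: &ManifestStats, bucket: i32) -> bool {
    match (stats.min_bucket, stats.max_bucket) {
        (Some(min), Some(max)) => min <= bucket && bucket <= max,
        // Missing stats carry no information, so the manifest must be read.
        _ => true,
    }
}

/// Returns `false` only when the stats prove no entry lies in levels `lo..=hi`.
/// (Assumption: the level filter is modelled here as a range overlap.)
pub fn may_overlap_levels(stats: &ManifestStats, lo: i32, hi: i32) -> bool {
    match (stats.min_level, stats.max_level) {
        (Some(min), Some(max)) => min <= hi && lo <= max,
        _ => true,
    }
}
```

With `ManifestStats::default()` (all `None`, the shape of a manifest written before this change) both functions return `true` for any input, which is why the missing fields cost performance rather than correctness.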

Brief change log

  • Add four Option<i32> fields to ManifestFileMeta (crates/paimon/src/spec/manifest_file_meta.rs) with #[serde(rename, default, skip_serializing_if = "Option::is_none")] mirroring the existing min_row_id / max_row_id pattern, plus getters.
  • Add a with_bucket_level_stats(min_bucket, max_bucket, min_level, max_level) chain method for writers that already have a constructed ManifestFileMeta.
  • Extend MANIFEST_FILE_META_SCHEMA with the four ["null", "int"] Avro fields, inserted between _SCHEMA_ID and _MIN_ROW_ID to match the Java field order.
  • Update the hand-written Avro decoder (crates/paimon/src/spec/avro/manifest_file_meta_decode.rs) to read the new fields when present (otherwise they fall through the _ => skip_nullable_field arm as None).
  • Aggregate min/max bucket and level across manifest entries at write time in TableCommit::write_manifest_file (crates/paimon/src/table/table_commit.rs) and attach them via the new chain method. When the entry list is empty, all four stay None, the same shape a legacy manifest without these fields decodes to.
  • Extend new_with_version to accept the new positional Option<i32> arguments; new still defaults them to None, so non-writer call sites (tests, objects_file.rs) need no changes beyond passing None through new_with_version.
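
As a rough illustration of the write-time aggregation listed above, the following sketch folds min/max bucket and level over a slice of entries. It is not the code in `table_commit.rs`: `Entry`, `BucketLevelStats`, and `aggregate_bucket_level_stats` are hypothetical names, and the real manifest entry type carries many more fields.

```rust
/// Hypothetical, trimmed-down manifest entry: only the fields needed here.
#[derive(Debug, Clone, Copy)]
pub struct Entry {
    pub bucket: i32,
    pub level: i32,
}

/// (min_bucket, max_bucket, min_level, max_level)
pub type BucketLevelStats = (Option<i32>, Option<i32>, Option<i32>, Option<i32>);

/// Computes the four summary values over `entries`.
/// `Iterator::min` / `max` return `None` on an empty iterator, so an empty
/// entry list yields four `None`s without any special casing.
pub fn aggregate_bucket_level_stats(entries: &[Entry]) -> BucketLevelStats {
    (
        entries.iter().map(|e| e.bucket).min(),
        entries.iter().map(|e| e.bucket).max(),
        entries.iter().map(|e| e.level).min(),
        entries.iter().map(|e| e.level).max(),
    )
}
```

The four passes over the slice keep the sketch short; a single fold would do the same work in one pass if the entry list were large.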

Serializer version is intentionally not bumped — it stays at 2. Java did the same in apache/paimon#5345 because the fields are nullable with default: null, so old and new files coexist.

Tests

Three new tests:

  • spec::manifest_list::tests::test_manifest_list_roundtrip_preserves_bucket_level_stats — writes a manifest list with explicit bucket / level values via ManifestList::write and asserts they survive the Avro round-trip through ManifestList::read.
  • table::table_commit::tests::test_commit_writes_bucket_and_level_stats_into_manifest_list — drives a real TableCommit::commit with messages spanning buckets [0, 3] and levels [0, 2], then reads back the manifest list and asserts the aggregates. Exercises the full plumbing from CommitMessage through messages_to_entries to write_manifest_file.
  • spec::manifest_list::tests::test_manifest_list_decodes_legacy_without_bucket_level_fields — fabricates a manifest list written under the pre-5345 Avro schema (no bucket / level fields) and asserts it decodes cleanly with the new getters returning None. Pins the back-compat contract.

Local verification:

  • cargo build -p paimon — ok
  • cargo test -p paimon --lib — 675 passed, 0 failed
  • cargo clippy -p paimon --all-targets -- -D warnings — clean
  • cargo fmt --check — clean

API and Format

On-disk format: extends MANIFEST_FILE_META_SCHEMA with four optional Avro fields. Old paimon-rust files decode unchanged (missing fields → None); new files are readable by Java (which already has these fields) and by any reader that follows Avro's nullable-union default semantics. Serializer version stays at 2.
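
For readers unfamiliar with how a `["null", "int"]` field looks on the wire, here is a minimal sketch based on the Avro binary encoding: a union is written as the zero-based branch index (a zigzag varint), followed by the value of that branch, and `null` occupies zero bytes. The function names are hypothetical and this is not the decoder in `manifest_file_meta_decode.rs`; it only covers the case where the field is present in the file, not schema resolution for legacy files.

```rust
/// Reads one zigzag-encoded varint from the front of `buf`, advancing it.
/// Returns `None` on truncated or over-long input.
fn read_zigzag_long(buf: &mut &[u8]) -> Option<i64> {
    let mut raw: u64 = 0;
    let mut shift = 0u32;
    loop {
        let (&byte, rest) = buf.split_first()?;
        *buf = rest;
        if shift >= 64 {
            return None; // more than 10 bytes: not a valid long
        }
        raw |= u64::from(byte & 0x7f) << shift;
        if byte & 0x80 == 0 {
            break;
        }
        shift += 7;
    }
    // Undo zigzag: 0 -> 0, 1 -> -1, 2 -> 1, 3 -> -2, ...
    Some(((raw >> 1) as i64) ^ -((raw & 1) as i64))
}

/// Decodes one `["null", "int"]` union value.
/// Outer `None` = malformed input; inner `None` = the null branch.
pub fn read_nullable_int(buf: &mut &[u8]) -> Option<Option<i32>> {
    match read_zigzag_long(buf)? {
        0 => Some(None), // branch 0 is "null": no payload bytes
        1 => {
            let v = read_zigzag_long(buf)?;
            if v < i64::from(i32::MIN) || v > i64::from(i32::MAX) {
                None // does not fit an Avro int
            } else {
                Some(Some(v as i32))
            }
        }
        _ => None, // the union has only two branches
    }
}
```

For example, the bytes `[0x02, 0x06]` decode as branch 1 with value 3, and a single `0x00` byte decodes as the null branch.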

Public API:

  • ManifestFileMeta gains four Option<i32> getters (min_bucket, max_bucket, min_level, max_level) and a with_bucket_level_stats(...) chain method.
  • ManifestFileMeta::new(...) signature unchanged — new fields default to None.
  • ManifestFileMeta::new_with_version(...) signature gains four Option<i32> arguments (positional, after schema_id, before min_row_id). This is pub(crate); the only external impact is keeping the Avro decoder in sync.
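
The API shape described above might look roughly like the following. This is a simplified sketch, not the real type: `ManifestFileMetaSketch` is a hypothetical stand-in that keeps only a file name, the single-argument `new` is an assumption, and the serde attributes from the change log are omitted so the block compiles with the standard library alone.

```rust
/// Hypothetical, reduced version of `ManifestFileMeta` for illustration.
#[derive(Debug, Clone, PartialEq)]
pub struct ManifestFileMetaSketch {
    file_name: String,
    min_bucket: Option<i32>,
    max_bucket: Option<i32>,
    min_level: Option<i32>,
    max_level: Option<i32>,
}

impl ManifestFileMetaSketch {
    /// Like `new` in the PR, the statistics default to `None`.
    pub fn new(file_name: impl Into<String>) -> Self {
        Self {
            file_name: file_name.into(),
            min_bucket: None,
            max_bucket: None,
            min_level: None,
            max_level: None,
        }
    }

    /// Chain method: consumes the value and returns it with stats attached.
    pub fn with_bucket_level_stats(
        mut self,
        min_bucket: Option<i32>,
        max_bucket: Option<i32>,
        min_level: Option<i32>,
        max_level: Option<i32>,
    ) -> Self {
        self.min_bucket = min_bucket;
        self.max_bucket = max_bucket;
        self.min_level = min_level;
        self.max_level = max_level;
        self
    }

    pub fn file_name(&self) -> &str { &self.file_name }
    pub fn min_bucket(&self) -> Option<i32> { self.min_bucket }
    pub fn max_bucket(&self) -> Option<i32> { self.max_bucket }
    pub fn min_level(&self) -> Option<i32> { self.min_level }
    pub fn max_level(&self) -> Option<i32> { self.max_level }
}
```

A writer would call something like `ManifestFileMetaSketch::new("manifest-0").with_bucket_level_stats(Some(0), Some(3), Some(0), Some(2))`, while callers that never touch statistics keep using `new` unchanged.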

Documentation

Rustdoc comments on the new fields and with_bucket_level_stats explain the back-compat semantics (None = stats absent, treat as "no information") and reference apache/paimon#5345 for the upstream change. No prose docs need updating.

`paimon-rust`'s `MANIFEST_FILE_META_SCHEMA` was missing the four bucket
and level statistics fields added to `org.apache.paimon.manifest.ManifestFileMeta.SCHEMA`
in apache/paimon#5345 (Mar 2025). Manifests written by `paimon-rust`
therefore carried no `_MIN_BUCKET`, `_MAX_BUCKET`, `_MIN_LEVEL`,
`_MAX_LEVEL` summaries. The Java reader handles their absence gracefully
(`Integer != null` guards short-circuit each pruning branch in
`ManifestsReader.filterManifestFileMeta`), so this is not a correctness
bug — but it disables three optimizations on tables originating from
`paimon-rust`:

- `specifiedBucket` pruning (bucket-targeted Spark/Flink scans fall back
  to a full manifest scan instead of filtering by bucket range);
- `levelMinMaxFilter` pruning (compaction reads extra manifests);
- `onlyReadRealBuckets` (commit-time conflict checks no longer skip
  virtual-bucket manifests).

The fix adds four `Option<i32>` fields to `ManifestFileMeta`, extends
the Avro schema with `["null", "int"]` unions defaulting to `null` (in
the same order as Java), aggregates min/max from manifest entries at
write time via a new `with_bucket_level_stats` chain method, and surfaces
the values through the manual Avro decoder. Serializer version stays at
`2` — Java did not bump it either, since the fields are optional and
back-compat is guaranteed by `default: null`.

Tests cover three angles:
- A round-trip through `ManifestList::write`/`read` preserves explicit
  bucket/level values.
- A real `TableCommit::commit` with messages on multiple buckets / levels
  produces the correct aggregate in the manifest list.
- A manifest list written in the pre-5345 schema (no bucket/level fields)
  still decodes, with the new getters returning `None` instead of failing.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

@JingsongLi JingsongLi left a comment

LGTM. I checked the schema/order against current Java ManifestFileMeta, the fast Avro decode path, and the manifest write aggregation. The new fields are nullable and preserve legacy reads as expected.

Verification: cargo test -p paimon passes locally on this PR.

@JingsongLi JingsongLi merged commit 27f574a into apache:main Jun 2, 2026
8 checks passed