Avoid allocations in IterableUtils.frequency and countMatches #698

Open
nishantmehta wants to merge 2 commits into apache:master from nishantmehta:perf/count-utils-noalloc

@nishantmehta

What

IterableUtils.frequency (for iterables that are neither a Set nor a Bag) and IterableUtils.countMatches each build a lazy filtered-iterable pipeline just to count elements:

// frequency
return size(filteredIterable(emptyIfNull(iterable), EqualPredicate.equalPredicate(obj)));
// countMatches
return size(filteredIterable(emptyIfNull(input), predicate));

Each call allocates a FluentIterable decorator and a filtered iterator, plus an EqualPredicate in the case of frequency. Since frequency backs CollectionUtils.cardinality and countMatches is a commonly used utility, callers that invoke them in hot loops pay that allocation cost on every iteration.

This change counts matching elements directly in a single pass. The matching semantics are preserved exactly:

  • EqualPredicate.equalPredicate(obj) evaluates Objects.equals(obj, element), and equalPredicate(null) matches null elements, so calling Objects.equals(obj, element) in the loop reproduces it, including null handling.
  • FilterIterator advances using predicate.test(element), so the loop calls predicate.test for the same effect.
  • emptyIfNull's null handling becomes an explicit null check.

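With the decorators gone, the only remaining allocation candidate is the source collection's own iterator. The direct loop lets the JIT scalar-replace it through escape analysis, which is consistent with the 0 B/op measured below.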
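
The three points above can be sketched as follows. This is an illustration under stated assumptions, not the code from this PR: the class name DirectCountSketch is hypothetical, java.util.function.Predicate stands in for the library's Predicate so that the block compiles without Commons Collections on the classpath, and the Set and Bag fast paths of frequency are left out.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Objects;
import java.util.function.Predicate;

// Illustrative sketch only; names and signatures are assumptions, not the PR's code.
final class DirectCountSketch {

    private DirectCountSketch() {
    }

    // Counts elements equal to obj in one pass.
    // Objects.equals(obj, element) is null-safe, so a null obj matches null elements.
    static <T> int frequency(final Iterable<? extends T> iterable, final T obj) {
        if (iterable == null) {
            return 0; // explicit null check in place of emptyIfNull(iterable)
        }
        int count = 0;
        for (final T element : iterable) {
            if (Objects.equals(obj, element)) {
                count++;
            }
        }
        return count;
    }

    // Counts elements accepted by the predicate in one pass.
    static <T> long countMatches(final Iterable<? extends T> input,
                                 final Predicate<? super T> predicate) {
        Objects.requireNonNull(predicate, "predicate");
        if (input == null) {
            return 0; // explicit null check in place of emptyIfNull(input)
        }
        long count = 0;
        for (final T element : input) {
            if (predicate.test(element)) {
                count++;
            }
        }
        return count;
    }

    public static void main(final String[] args) {
        final List<String> items = Arrays.asList("a", null, "a", "b");
        System.out.println(frequency(items, "a"));                // 2
        System.out.println(frequency(items, null));               // 1
        System.out.println(countMatches(items, Objects::nonNull)); // 3
        System.out.println(countMatches(null, s -> true));        // 0
    }
}
```

Neither method creates a decorator, a wrapping iterator, or a predicate object of its own; the only object the loop itself asks for is the source's iterator.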

Benchmark

Measured with a ThreadMXBean allocation driver (200k warmed ops):

CollectionUtils.cardinality   52 B/op -> 0 B/op
IterableUtils.countMatches    66 B/op -> 0 B/op
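
The driver itself is not part of this PR. A minimal sketch of how such a measurement could be taken is shown here, assuming a HotSpot JVM where the platform ThreadMXBean can be cast to com.sun.management.ThreadMXBean. The class name AllocationDriverSketch, the bytesPerOp helper, and the warm-up and measurement counts are hypothetical.

```java
import java.lang.management.ManagementFactory;

// Hypothetical allocation driver; the one used for the numbers above is not shown in the PR.
final class AllocationDriverSketch {

    // Results are stored here so the JIT cannot discard the measured work.
    static volatile Object sink;

    private AllocationDriverSketch() {
    }

    // Returns the average number of bytes allocated by the current thread per call to op.
    static double bytesPerOp(final Runnable op, final int warmupOps, final int measuredOps) {
        // Assumption: HotSpot, where this cast succeeds and allocation tracking is enabled.
        final com.sun.management.ThreadMXBean bean =
                (com.sun.management.ThreadMXBean) ManagementFactory.getThreadMXBean();
        final long threadId = Thread.currentThread().getId();

        // Warm up so the measured loop runs JIT-compiled code.
        for (int i = 0; i < warmupOps; i++) {
            op.run();
        }

        final long before = bean.getThreadAllocatedBytes(threadId);
        for (int i = 0; i < measuredOps; i++) {
            op.run();
        }
        final long after = bean.getThreadAllocatedBytes(threadId);

        return (double) (after - before) / measuredOps;
    }
}
```

A call such as `AllocationDriverSketch.bytesPerOp(() -> AllocationDriverSketch.sink = CollectionUtils.cardinality(obj, list), 200_000, 200_000)` would report a B/op figure comparable in kind to the ones above, where obj and list are placeholders for the benchmark inputs. Note that boxing the int result into sink can itself allocate for values outside the Integer cache, so a real driver would need to account for that.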

Testing

CollectionUtilsTest (161 tests) and IterableUtilsTest (42 tests) pass unchanged.

frequency() handled non-Set, non-Bag iterables by building a filtered-
iterable pipeline:

    size(filteredIterable(emptyIfNull(iterable), EqualPredicate.equalPredicate(obj)))

which allocates an EqualPredicate, a FluentIterable decorator and a filtered
iterator on every call just to count matching elements. Since this also backs
CollectionUtils.cardinality(), the cost is paid by a commonly used utility.

Count directly with a single pass instead. EqualPredicate.equalPredicate(obj)
evaluates Objects.equals(obj, element) (and equalPredicate(null) matches null
elements), so the loop reproduces the previous semantics exactly, including
null handling. The direct loop also lets the JIT scalar-replace the iterator.

Measured with a ThreadMXBean allocation driver (200k warmed ops):
CollectionUtils.cardinality 52 B/op -> 0 B/op. CollectionUtilsTest (161) and
IterableUtilsTest (42) pass unchanged.

Signed-off-by: Nishant Mehta <nishantmehta.n@gmail.com>

countMatches() built a filtered-iterable pipeline just to count elements:

    size(filteredIterable(emptyIfNull(input), predicate))

allocating a FluentIterable decorator and a filtered iterator on every call.
Count directly in a single pass instead. FilterIterator advances using
predicate.test(element), so calling predicate.test in the loop reproduces the
previous matching semantics, and emptyIfNull's null handling becomes an
explicit null check.

Measured with a ThreadMXBean allocation driver (200k warmed ops):
IterableUtils.countMatches 66 B/op -> 0 B/op. CollectionUtilsTest (161) and
IterableUtilsTest (42) pass unchanged.

Signed-off-by: Nishant Mehta <nishantmehta.n@gmail.com>

nishantmehta marked this pull request as ready for review on June 28, 2026, 21:09