Open

V2 #15

97 commits
0d04d17
feat(aggregation): add value count aggregation
Spamercz May 20, 2026
3a81761
feat(aggregation): add cardinality aggregation
Spamercz May 20, 2026
e0047ca
feat(aggregation): add stats aggregation
Spamercz May 20, 2026
f57ca56
feat(aggregation): add extended stats aggregation
Spamercz May 20, 2026
9435900
feat(aggregation): add percentiles aggregation
Spamercz May 20, 2026
58b5be0
feat(aggregation): add percentile ranks aggregation
Spamercz May 20, 2026
6c29b3e
feat(aggregation): add weighted avg aggregation
Spamercz May 20, 2026
f19ced6
feat(aggregation): add median absolute deviation aggregation
Spamercz May 20, 2026
5ea6ee3
feat(aggregation): add string stats aggregation
Spamercz May 20, 2026
1e205ab
feat(aggregation): add boxplot aggregation
Spamercz May 20, 2026
b4992ba
feat(aggregation): add geo centroid aggregation
Spamercz May 20, 2026
68225c3
feat(aggregation): add geo bounds aggregation
Spamercz May 20, 2026
72ca67c
feat(aggregation): add date histogram aggregation
Spamercz May 20, 2026
5314aaa
feat(aggregation): add date range aggregation
Spamercz May 20, 2026
2122bb9
feat(aggregation): add missing aggregation
Spamercz May 20, 2026
7da0e02
feat(aggregation): add global aggregation
Spamercz May 20, 2026
955a5e2
feat(aggregation): add significant terms aggregation
Spamercz May 20, 2026
b5592dd
feat(aggregation): add significant text aggregation
Spamercz May 20, 2026
93bac42
feat(aggregation): add geo distance aggregation
Spamercz May 20, 2026
092b56d
feat(aggregation): add geohash grid aggregation
Spamercz May 20, 2026
7a56900
feat(aggregation): add geotile grid aggregation
Spamercz May 20, 2026
7df3f33
feat(aggregation): add reverse nested aggregation
Spamercz May 20, 2026
fbc8e9a
feat(aggregation): add composite aggregation
Spamercz May 20, 2026
146ce28
feat(aggregation): add multi terms aggregation
Spamercz May 20, 2026
3966ec1
feat(aggregation): add rare terms aggregation
Spamercz May 20, 2026
bd4c6e1
feat(aggregation): add sampler aggregation
Spamercz May 20, 2026
f1d51f8
feat(aggregation): add diversified sampler aggregation
Spamercz May 20, 2026
48dc7ee
feat(aggregation): add adjacency matrix aggregation
Spamercz May 20, 2026
111e13e
feat(aggregation): add ip range aggregation
Spamercz May 20, 2026
47c262d
feat(aggregation): add avg bucket pipeline aggregation
Spamercz May 20, 2026
cea99eb
feat(aggregation): add sum bucket pipeline aggregation
Spamercz May 20, 2026
7ffd196
feat(aggregation): add max bucket pipeline aggregation
Spamercz May 20, 2026
cdb97d1
feat(aggregation): add min bucket pipeline aggregation
Spamercz May 20, 2026
13cd8cd
feat(aggregation): add stats bucket pipeline aggregation
Spamercz May 20, 2026
9b22538
feat(aggregation): add percentiles bucket pipeline aggregation
Spamercz May 20, 2026
c9401ec
feat(aggregation): add derivative pipeline aggregation
Spamercz May 20, 2026
358e097
feat(aggregation): add cumulative sum pipeline aggregation
Spamercz May 20, 2026
3ec8af2
feat(aggregation): add moving function pipeline aggregation
Spamercz May 20, 2026
d9a4f69
feat(aggregation): add serial diff pipeline aggregation
Spamercz May 20, 2026
c7be141
feat(aggregation): add bucket script pipeline aggregation
Spamercz May 20, 2026
f3e34ad
feat(aggregation): add bucket selector pipeline aggregation
Spamercz May 20, 2026
4ab45fa
feat(aggregation): add bucket sort pipeline aggregation
Spamercz May 20, 2026
b655051
feat(aggregation): add normalize pipeline aggregation
Spamercz May 20, 2026
bc41270
feat(query): add ids query
Spamercz May 20, 2026
0cce7f1
feat(query): add prefix query
Spamercz May 20, 2026
2d3dc40
feat(query): add regexp query
Spamercz May 20, 2026
028172b
feat(query): add term set query
Spamercz May 20, 2026
8a32256
feat(query): add match bool prefix query
Spamercz May 20, 2026
a49314c
feat(query): add combined fields query
Spamercz May 20, 2026
c2b1d09
feat(query): add query string query
Spamercz May 20, 2026
02c8c60
feat(query): add simple query string query
Spamercz May 20, 2026
4bfb9f9
feat(query): add intervals query
Spamercz May 20, 2026
573b19f
feat(query): add match none query
Spamercz May 20, 2026
f3c4313
feat(query): add boosting compound query
Spamercz May 20, 2026
2bcb7fb
feat(query): add constant score compound query
Spamercz May 20, 2026
8827fa3
feat(query): add dis max compound query
Spamercz May 20, 2026
40b57ac
feat(query): add has child joining query
Spamercz May 20, 2026
0369e28
feat(query): add has parent joining query
Spamercz May 20, 2026
94266a7
feat(query): add parent id joining query
Spamercz May 20, 2026
93bcfe1
feat(query): add geo bounding box query
Spamercz May 20, 2026
ead03cd
feat(query): add geo shape query
Spamercz May 20, 2026
8e4d7d8
feat(query): add shape query
Spamercz May 20, 2026
96e9126
feat(query): add script query
Spamercz May 20, 2026
d64cf17
feat(query): add script score query
Spamercz May 20, 2026
6e24431
feat(query): add more like this query
Spamercz May 20, 2026
8b42e84
feat(query): add rank feature query
Spamercz May 20, 2026
f354b6b
feat(query): add distance feature query
Spamercz May 20, 2026
8da126d
feat(query): add pinned query
Spamercz May 20, 2026
fcefba0
feat(query): add percolate query
Spamercz May 20, 2026
e84ef0c
feat(query): add wrapper query
Spamercz May 20, 2026
813f6c3
feat(query): add span term query
Spamercz May 20, 2026
d0eecef
feat(query): add span first query
Spamercz May 20, 2026
17992b6
feat(query): add span near query
Spamercz May 20, 2026
a207f5c
feat(query): add span or query
Spamercz May 20, 2026
dffc919
feat(query): add span not query
Spamercz May 20, 2026
e86210a
feat(query): add span containing query
Spamercz May 20, 2026
83ae5ee
feat(query): add span within query
Spamercz May 20, 2026
a4015e2
feat(query): add span multi query
Spamercz May 20, 2026
9ce1816
feat(query): add field masking span query
Spamercz May 20, 2026
641abaf
test: add AbstractElasticTestCase to remove curl boilerplate
Spamercz May 20, 2026
8efe673
fix(query): correct geo_distance, nested envelope; add inner_hits
Spamercz May 20, 2026
467d6bf
feat(query): bring full-text queries to ES DSL parity
Spamercz May 20, 2026
3f06817
feat(query): bring term-level queries to ES DSL parity
Spamercz May 20, 2026
0da8389
feat(query): inner_hits on joining queries; ParentId boost
Spamercz May 20, 2026
caadba6
feat(query): IndexedShape support + extra geo args
Spamercz May 20, 2026
59283d8
feat(query): MoreLikeThis + Percolate full coverage
Spamercz May 20, 2026
f7b5e9e
feat(query): add vector / semantic / rules query types
Spamercz May 20, 2026
add3466
feat(agg): metric aggregation coverage + TopHits rewrite
Spamercz May 20, 2026
1ccbeb8
feat(agg): bucket aggregation expansion + Composite typed sources
Spamercz May 20, 2026
ec55e29
feat(agg): add 16 new aggregation types
Spamercz May 20, 2026
7283bb6
feat(function-score): decay functions + ScriptScore + boost_mode
Spamercz May 20, 2026
4394573
feat(options): Sort expansion + ScriptSort + NestedSort
Spamercz May 20, 2026
cad64aa
feat(highlight): rewrite with HighlightField + global options
Spamercz May 20, 2026
958e306
feat(options): Source/Pit/Collapse/Rescore/Suggest + many search opts
Spamercz May 20, 2026
df4e324
feat(filter): expand FilterCollection to full bool body
Spamercz May 20, 2026
5c3bff2
docs: CHANGELOG + README + Query doc updates for v2
Spamercz May 20, 2026
fe3089f
fix(tests): drop deprecated curl_close() call
Spamercz Jun 2, 2026
173 changes: 173 additions & 0 deletions CHANGELOG.md
@@ -0,0 +1,173 @@
# Changelog

## v2 — full DSL coverage

This branch brings every documented Elasticsearch DSL feature under typed PHP objects, fixes two queries that produced invalid DSL, and round-trips every feature against a real ES container via the new `AbstractElasticTestCase`.

### Test infrastructure

- New `tests/SpameriTests/ElasticQuery/AbstractElasticTestCase` base class with `createIndex($mapping)`, `indexDocument($body, $id, $refresh)`, `search($elasticQuery)`, `deleteIndex()`, and a generic `request($method, $path, $body)`. Tests that extend it shrink from ~70 lines of curl boilerplate to ~15.

### Bug fixes (BC-breaking)

| File | Previous | Fixed |
| --- | --- | --- |
| `Query/GeoDistance` | Emitted `{pin: {location: ...}}` (invalid DSL) and lacked the required `distance` argument. | Emits proper `geo_distance` envelope. Constructor now takes `distance` (required), plus `distance_type`, `validation_method`, `ignore_unmapped`, `boost`. |
| `Query/Nested` | Wrapped inner query in `[$queryArray]` (extra list level — rejected by ES). | Inner query is now an object. Added `score_mode`, `ignore_unmapped`, `inner_hits`. |
| `Query/PhrasePrefix` | `int $boost = 1` (inconsistent type). | `float $boost = 1.0`. |
| `Options/GeoDistanceSort` | `ignore_unmapped` hard-coded to `true`. | Constructor arg `bool $ignoreUnmapped = true`. |
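
For orientation, the request bodies Elasticsearch accepts for the two corrected queries look roughly like this. The sketch uses plain PHP arrays rather than the library's classes (whose constructor signatures are not shown here); the field names `pin.location` and `comments` and all values are made up for illustration.

```php
<?php
// Hypothetical field names and values; only the envelope shape matters.

// geo_distance: `distance` is required and sits next to the geo-point field.
$geoDistance = [
    'geo_distance' => [
        'distance' => '12km',
        'pin.location' => ['lat' => 40.0, 'lon' => -70.0],
    ],
];

// nested: `query` holds one query object, not a list wrapping it.
$nested = [
    'nested' => [
        'path' => 'comments',
        'query' => ['match' => ['comments.text' => 'elastic']],
        'score_mode' => 'avg',
    ],
];

echo json_encode($geoDistance), PHP_EOL;
echo json_encode($nested), PHP_EOL;
```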

### New query types

- **Knn** — vector similarity (field, queryVector, k, numCandidates, similarity, filter, boost).
- **SparseVector** — ELSER-style sparse vector query (inference_id+query or queryVector tokens).
- **TextExpansion** — legacy ELSER form (model_id, model_text).
- **Semantic** — queries a `semantic_text` field.
- **RuleQuery** — Search Application query rules over an organic query.
- **WeightedTokens** — token weights against a `sparse_vector` field.
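
As a rough illustration of what a `Knn` query is expected to serialize to, here is the Elasticsearch `knn` query body written as a plain PHP array. The field name `embedding`, the vector values, and the filter are assumptions for the example, not taken from this branch.

```php
<?php
// Hypothetical field name and vector; the key names follow the ES `knn` query.
$body = [
    'query' => [
        'knn' => [
            'field' => 'embedding',
            'query_vector' => [0.12, -0.45, 0.91],
            'k' => 10,
            // num_candidates is per shard and must not be smaller than k.
            'num_candidates' => 100,
            'filter' => ['term' => ['status' => 'published']],
        ],
    ],
];

echo json_encode($body, JSON_PRETTY_PRINT), PHP_EOL;
```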

### Existing queries — new constructor arguments

| Query | New args |
| --- | --- |
| `ElasticMatch` | `zero_terms_query`, `auto_generate_synonyms_phrase_query`, `lenient`, `prefix_length`, `max_expansions`, `fuzzy_transpositions`, `fuzzy_rewrite` |
| `MultiMatch` | `tie_breaker`, `slop`, `prefix_length`, `max_expansions`, `lenient`, `zero_terms_query`, `auto_generate_synonyms_phrase_query`, `fuzzy_transpositions`, `fuzzy_rewrite` |
| `MatchPhrase` | `zero_terms_query` |
| `PhrasePrefix` | `analyzer`, `max_expansions`, `zero_terms_query` |
| `MatchBoolPrefix` | `fuzziness`, `prefix_length`, `max_expansions`, `fuzzy_transpositions`, `fuzzy_rewrite` |
| `QueryString` | `analyze_wildcard`, `auto_generate_synonyms_phrase_query`, `enable_position_increments`, `fuzziness`, `fuzzy_max_expansions`, `fuzzy_prefix_length`, `fuzzy_transpositions`, `lenient`, `max_determinized_states`, `minimum_should_match`, `quote_analyzer`, `phrase_slop`, `quote_field_suffix`, `rewrite`, `time_zone`, `type`, `tie_breaker` |
| `SimpleQueryString` | `analyze_wildcard`, `auto_generate_synonyms_phrase_query`, `fuzzy_max_expansions`, `fuzzy_prefix_length`, `fuzzy_transpositions`, `lenient`, `minimum_should_match`, `quote_field_suffix` |
| `CombinedFields` | `auto_generate_synonyms_phrase_query` |
| `Term` | `case_insensitive` |
| `Terms` | accepts `TermsLookup` for cross-document terms resolution |
| `Range` | `gt`, `lt`, `format`, `relation` (new `Range\Relation` constants), `time_zone` |
| `Exists` | `boost` |
| `WildCard` | `case_insensitive`, `rewrite` |
| `Prefix` | `rewrite` |
| `Fuzzy` | `transpositions`, `rewrite` |
| `Regexp` | `rewrite` |
| `TermSet` | `boost` |
| `HasChild` | `inner_hits` |
| `HasParent` | `inner_hits` |
| `Nested` | `score_mode`, `ignore_unmapped`, `inner_hits` |
| `ParentId` | `boost` |
| `GeoBoundingBox` | `validation_method`, `ignore_unmapped`, `boost` |
| `GeoShape` | `indexed_shape` (new `IndexedShape` sub-object), `boost` |
| `Shape` | `indexed_shape`, `boost` |
| `MoreLikeThis` | `boost_terms`, `include`, `min_doc_freq`, `max_doc_freq`, `min_word_length`, `max_word_length`, `stop_words`, `analyzer`, `boost`, `fail_on_unsupported_field` |
| `Percolate` | `documents` (multi-doc), `name`, `routing`, `preference`, `version` |

### New sub-objects

- `Query/TermsLookup` — `index`, `id`, `path`, `routing`.
- `Query/Range/Relation` — constants: `INTERSECTS`, `CONTAINS`, `WITHIN`.
- `Query/InnerHits` — `name`, `from`, `size`, `sort`, `_source`, `highlight`, `explain`, `script_fields`, `docvalue_fields`, `version`, `seq_no_primary_term`, `stored_fields`, `track_scores`.
- `Query/IndexedShape` — `id`, `index`, `path`, `routing`.
- `Script` (top-level) — reusable script value object (`source`, `lang`, `params`).
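
To show where two of these sub-objects end up in the request, the sketch below writes a terms lookup and a range `relation` as plain PHP arrays. Index, document id, and field names are illustrative assumptions.

```php
<?php
// Hypothetical index/field names; the key names follow the ES DSL.

// Terms lookup: the terms are read from `path` of another document.
$termsLookup = [
    'terms' => [
        'color' => [
            'index' => 'my-index-000001',
            'id' => '2',
            'path' => 'color',
        ],
    ],
];

// Range relation: only meaningful for range-typed fields.
$range = [
    'range' => [
        'time_frame' => [
            'gte' => '2015-10-31',
            'lte' => '2015-11-01',
            'relation' => 'WITHIN',
        ],
    ],
];

echo json_encode($termsLookup), PHP_EOL;
echo json_encode($range), PHP_EOL;
```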

### Aggregations — new types

Bucket: `Filters` (named filters), `AutoDateHistogram`, `VariableWidthHistogram`, `CategorizeText` *(platinum license)*, `FrequentItemSets` *(platinum license)*, `IpPrefix`, `TimeSeries`.

Metric: `TopMetrics`, `GeoLine` *(gold license)*, `TTest`, `Rate`, `MatrixStats`.

Pipeline/sampler/ML: `RandomSampler`, `CumulativeCardinality`, `ExtendedStatsBucket`, `Inference`.

### Aggregations — new constructor arguments

| Agg | New args |
| --- | --- |
| `Min`/`Max`/`Avg`/`Sum`/`ValueCount`/`Stats` | `missing`, `script`, `format` |
| `ExtendedStats` | `missing`, `script`, `format` (kept `sigma`) |
| `Cardinality` | `script`, `missing`, `rehash` |
| `MedianAbsoluteDeviation`/`StringStats` | `missing`, `script` |
| `BoxPlot` | `missing`, `script`, `execution_hint` |
| `Percentiles` | `tdigest`, `hdr`, `missing`, `script` |
| `PercentileRanks` | `hdr`, `missing`, `script` |
| `WeightedAvg` | **rewritten** — takes typed `WeightedAvgValue` for value/weight (each with `field`/`script`/`missing`), plus `format` |
| `TopHits` | **rewritten** — `from`, `sort`, `_source`, `highlight`, `explain`, `script_fields`, `docvalue_fields`, `version`, `seq_no_primary_term`, `stored_fields`, `track_scores` |
| `Term` | `min_doc_count`, `shard_size`, `shard_min_doc_count`, `show_term_doc_count_error`, `script`, `collect_mode`, `execution_hint`, `value_type`, `format`; `include`/`exclude` accept arrays |
| `MultiTerms` | `order`, `min_doc_count`, `shard_size`, `shard_min_doc_count`, `collect_mode`, `format` |
| `RareTerms` | `include`, `exclude`, `missing` |
| `SignificantTerms` | `shard_size`, `shard_min_doc_count`, `execution_hint`, `background_filter`, `heuristic` (with `HEURISTIC_*` constants) |
| `SignificantText` | `shard_size`, `shard_min_doc_count`, `min_doc_count`, `background_filter`, `source_fields` |
| `Range` | `script`, `missing`, `format` |
| `DateRange` | `script`, `missing` |
| `Histogram` | `min_doc_count`, `extended_bounds` (new `Histogram\Bounds`), `hard_bounds`, `offset`, `order`, `script`, `missing`, `keyed`, `format` |
| `DateHistogram` | `extended_bounds`, `hard_bounds`, `keyed`, `order`, `script`, `missing` |
| `IpRange` | **rewritten** — new `IpRange\IpRangeValue` with `mask` (CIDR) support |
| `Filter` | **rewritten** — accepts any `LeafQueryInterface` directly |
| `Composite` | typed sources: `Composite\TermsSource`, `Composite\HistogramSource`, `Composite\DateHistogramSource`, `Composite\GeotileGridSource`, each with `order`/`missing_bucket` |
| `AdjacencyMatrix` | `separator`, accepts `LeafQueryInterface` for filters |
| `GeoDistance` (agg) | `keyed`, `script`, `missing` |
| `GeoHashGrid`/`GeoTileGrid` | `bounds` |
| `DiversifiedSampler` | `execution_hint`, `script` |
| `Missing` | `script` |
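
A composite aggregation with two typed sources would be expected to produce a body along these lines. This is a sketch in plain PHP arrays; the aggregation name `my_buckets` and the fields `product` and `timestamp` are assumptions.

```php
<?php
// Hypothetical aggregation and field names; key names follow the ES DSL.
$aggs = [
    'my_buckets' => [
        'composite' => [
            'size' => 100,
            'sources' => [
                // Each source is a single-key map: source name => source definition.
                ['product' => ['terms' => [
                    'field' => 'product',
                    'order' => 'asc',
                    'missing_bucket' => true,
                ]]],
                ['day' => ['date_histogram' => [
                    'field' => 'timestamp',
                    'calendar_interval' => '1d',
                    'order' => 'desc',
                ]]],
            ],
        ],
    ],
];

echo json_encode(['size' => 0, 'aggs' => $aggs], JSON_PRETTY_PRINT), PHP_EOL;
```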

### Score functions

- New `FunctionScore/ScoreFunction/Decay/Gauss`, `Linear`, `Exp` with shared `AbstractDecay` parent (`field`, `origin`, `scale`, `offset`, `decay`, `multi_value_mode`).
- New `FunctionScore/ScoreFunction/ScriptScore` (function variant — distinct from the `Query/ScriptScore` leaf).
- `FunctionScore` gained `boost`, `boost_mode` (with `BOOST_MODE_*` constants), `max_boost`, `min_score`.
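
For context, a `function_score` query with a `gauss` decay function and a `boost_mode` has the following shape in the request body. The sketch uses plain PHP arrays; the `date` field and the numeric values are illustrative assumptions.

```php
<?php
// Hypothetical field and values; key names follow the ES DSL.
$functionScore = [
    'function_score' => [
        // stdClass encodes to {} (an empty array would encode to []).
        'query' => ['match_all' => new stdClass()],
        'functions' => [
            ['gauss' => [
                'date' => [
                    'origin' => '2013-09-17',
                    'scale' => '10d',
                    'offset' => '5d',
                    'decay' => 0.5,
                ],
            ]],
        ],
        'boost_mode' => 'multiply',
        'max_boost' => 42,
    ],
];

echo json_encode($functionScore, JSON_PRETTY_PRINT), PHP_EOL;
```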

### Sort

- `Sort` gains `mode`, `nested` (new `NestedSort`), `numeric_type`, `unmapped_type`, `format`.
- New `Options/ScriptSort` — script-based sort.
- New `Options/NestedSort` — path/filter/max_children for nested sorting (recursive).
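
The recursive part of nested sorting is easiest to see in the request body: a `nested` block may itself contain another `nested` block. The sketch below uses plain PHP arrays with made-up field names (`parent.child.age`, `parent.name`).

```php
<?php
// Hypothetical field names; key names follow the ES sort DSL.
$sort = [
    ['parent.child.age' => [
        'mode' => 'min',
        'order' => 'asc',
        'nested' => [
            'path' => 'parent',
            'filter' => ['term' => ['parent.name' => 'alice']],
            // Inner level: sort values come from the nested `child` objects.
            'nested' => [
                'path' => 'parent.child',
                'max_children' => 10,
            ],
        ],
    ]],
];

echo json_encode(['sort' => $sort], JSON_PRETTY_PRINT), PHP_EOL;
```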

### Highlight — rewritten

- `Highlight/HighlightField` — per-field config (type, number_of_fragments, fragment_size, all boundary_*, encoder, force_source, fragmenter, highlight_query, matched_fields, no_match_size, order, phrase_limit, require_field_match, tags_schema, pre_tags, post_tags).
- `Highlight/HighlightFieldCollection` — typed collection.
- `Highlight` accepts either `HighlightFieldCollection` or simple `array<string>` of field names (BC). Adds all global options.

### Options — many new fields

| Field | Type |
| --- | --- |
| `_source` | new `Options\Source` (includes/excludes, or `false`) |
| `track_total_hits` | `bool\|int` |
| `track_scores` | `bool` |
| `explain` | `bool` |
| `terminate_after` | `int` |
| `timeout` | `string` |
| `search_after` | `array` |
| `pit` | new `Options\Pit` |
| `stored_fields` | `array` |
| `docvalue_fields` | `array` |
| `fields` | `array` |
| `script_fields` | `array` |
| `runtime_mappings` | `array` |
| `seq_no_primary_term` | `bool` |
| `indices_boost` | `array` |
| `collapse` | new `Options\Collapse` |
| `rescore` | `array<Options\Rescore>` |
| `suggesters` | `array<Suggest\SuggesterInterface>` |
| `profile` | `bool` |
| `stats` | `array<string>` |
| `ext` | `array` |

`ElasticQuery::toArray()` wires `collapse`, `rescore`, and `suggest` to the top-level request body.
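
Several of these options are typically combined for deep pagination. The following sketch shows a request body that pages through a point in time with `search_after`; it is written as a plain PHP array, and the PIT id, sort field, and cursor values are placeholders rather than real data.

```php
<?php
// Placeholder values throughout; key names follow the ES search API.
$pitId = 'PIT_ID_PLACEHOLDER'; // would come from the open point-in-time call

$body = [
    'size' => 100,
    'track_total_hits' => false,
    // With a PIT the index is not named in the request path.
    'pit' => ['id' => $pitId, 'keep_alive' => '1m'],
    'sort' => [
        ['@timestamp' => 'asc'],
        ['_shard_doc' => 'asc'], // tiebreaker available when a PIT is used
    ],
    // One value per sort clause, copied from the last hit of the previous page.
    'search_after' => ['2021-05-20T05:30:04.832Z', 4294967298],
];

echo json_encode($body, JSON_PRETTY_PRINT), PHP_EOL;
```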

### Filter container — bool expansion

`Filter/FilterCollection` previously exposed only `must()`. It now mirrors `Query/QueryCollection` with `must()`, `should()`, `mustNot()`, and `filter()` — the `bool` body emits all four arms.
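
The four arms correspond to the four occurrence types of the Elasticsearch `bool` query. As a sketch in plain PHP arrays (field names and values are assumptions), a body using all of them looks like this; note that `filter` and `must_not` clauses do not contribute to the score.

```php
<?php
// Hypothetical fields and values; key names follow the ES `bool` query.
$bool = [
    'bool' => [
        'must' => [['match' => ['title' => 'search']]],
        'should' => [['term' => ['tags' => 'featured']]],
        'must_not' => [['term' => ['status' => 'archived']]],
        'filter' => [['range' => ['published_at' => ['gte' => 'now-1y']]]],
    ],
];

echo json_encode($bool, JSON_PRETTY_PRINT), PHP_EOL;
```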

### Suggesters

- `Options/Suggest/SuggesterInterface`
- `Options/Suggest/TermSuggester`
- `Options/Suggest/PhraseSuggester`
- `Options/Suggest/CompletionSuggester`

### Response mapper

- `ResultMapper` now handles named buckets (string keys, e.g. from `Filters` agg) and composite-key buckets (array keys).
- `Result/Aggregation/Bucket.from`/`to` accept `string` (e.g. for IP / date range buckets).
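
The difference between the bucket shapes can be sketched as follows. `normalizeBuckets()` is a hypothetical helper written for this illustration, not the `ResultMapper` implementation: keyed responses (for example from a named `filters` aggregation) use the bucket name as the array key, while list responses carry a `key` entry that is a scalar or, for composite aggregations, an array.

```php
<?php
/**
 * Hypothetical helper: flattens keyed and list-style bucket arrays
 * (as produced by json_decode($json, true)) into one uniform list.
 */
function normalizeBuckets(array $buckets): array
{
    $out = [];
    foreach ($buckets as $index => $bucket) {
        // Keyed response: the name is the array key. List response: use `key`.
        $key = is_string($index) ? $index : ($bucket['key'] ?? null);
        $out[] = ['key' => $key, 'doc_count' => $bucket['doc_count']];
    }
    return $out;
}

// Named buckets, e.g. from a `filters` aggregation.
$named = normalizeBuckets([
    'errors' => ['doc_count' => 1],
    'warnings' => ['doc_count' => 2],
]);

// Composite buckets: the key is itself an array of source values.
$composite = normalizeBuckets([
    ['key' => ['product' => 'shoe', 'day' => 1436918400000], 'doc_count' => 3],
]);

echo json_encode($named), PHP_EOL;
echo json_encode($composite), PHP_EOL;
```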

### CI / tests

- 218 tests, 3 of them skipped on a basic license (the skips include `geo_line` and `categorize_text`).
- ES 9.2.2 container in CI; `make tests` passes end-to-end against it.
- The two pre-existing buggy tests for `GeoDistance` and `Nested` (which asserted invalid output) are now corrected and re-run as integration tests against ES.
17 changes: 10 additions & 7 deletions README.md
@@ -4,13 +4,16 @@ A PHP library that converts Elasticsearch query DSL into strongly-typed PHP obje

## Features

- **Type-safe queries** - Full-text, term-level, compound, geo, and nested queries
- **Aggregations** - Metric (min, max, avg) and bucket (terms, histogram, range, filter) aggregations
- **Response mapping** - Automatic mapping of Elasticsearch responses to typed objects
- **Index mapping** - Define index settings, analyzers, tokenizers, and filters
- **Function scoring** - Custom scoring with field value factors, weights, and random scores
- **Highlighting** - Search result highlighting support
- **Pagination & sorting** - Options for size, offset, scroll, and geo-distance sorting
- **Type-safe queries** — full-text, term-level, compound, geo, nested, joining, vector (knn / sparse_vector / semantic), span queries, and rule queries
- **Aggregations** — metric (min, max, avg, stats, weighted_avg, top_hits, top_metrics, t_test, geo_line, …), bucket (terms, histogram, date_histogram, range, filter, filters, composite with typed sources, ip_prefix, time_series, …), pipeline (cumulative_*, bucket_*, normalize, serial_diff, inference, …)
- **Function scoring** — field value factor, weight, random, decay (gauss / linear / exp), script_score; score_mode + boost_mode
- **Sort** — field, geo-distance, script-based, with nested sort (filter / max_children / recursive)
- **Highlight** — per-field config (type, fragment_size, boundary scanner, encoder, fragmenter, highlight_query, matched_fields, no_match_size, order, phrase_limit, …)
- **Search options** — `_source`, `track_total_hits`, `search_after`, `pit`, `collapse`, `rescore`, `suggest` (term / phrase / completion), `runtime_mappings`, `script_fields`, `docvalue_fields`, `stored_fields`, `terminate_after`, `timeout`, `profile`, `stats`, `ext`
- **Response mapping** — automatic mapping of Elasticsearch responses (including composite/named buckets, IP/date range buckets) to typed objects
- **Index mapping** — index settings, analyzers, tokenizers, filters

See [CHANGELOG.md](CHANGELOG.md) for the full list of types and arguments added in v2.

## Requirements
