ATLAS-5032: Fix basic search when querying by long attribute values by saksenasonali · Pull Request #650 · apache/atlas

saksenasonali · 2026-05-27T13:31:39Z

What changes were proposed in this pull request?

ATLAS-5032: Fix basic search for long qualifiedName with startsWith / endsWith / contains

Problem
Basic search with attribute filters on qualifiedName returns no results when filter values exceed Solr’s default max token length (255). This affects startsWith, endsWith, and contains, especially when multiple criteria on the same attribute are combined with AND (e.g. qualifiedName starts with a long prefix and ends with @primary).

Root cause: Solr ignores tokens longer than maxTokenLength, so index-based search does not match even though the entity exists and can be retrieved by GUID.

Solution (Approach 2 from ATLAS-5032)
For indexed string attributes, when the filter value length exceeds the configured Solr token limit, do not use the Solr index for STARTS_WITH, ENDS_WITH, or CONTAINS. Search falls back to JanusGraph instead.

Also ensure index and graph query paths stay consistent when the same attribute appears in multiple AND criteria:

Skip graph filter construction when the criterion is still index-searchable.
Skip index query construction when the criterion is not index-searchable.

How was this patch tested?
Unit / module tests
EntitySearchProcessorTest — 48 tests, including 6 new ATLAS-5032 scenarios (short and long qualifiedName, hive_table and hive_column, tokenized name, CONTAINS + ENDS_WITH).
Full repository module: mvn -pl repository test — 2391 tests, 0 failures.
mvn -pl common,repository -DskipTests install — build success.
Manual / REST (local Docker Atlas)
Reproduced the JIRA flow against http://localhost:21000:

Created a hive_table with a ~370-character name and qualifiedName default.@primary.
Basic search with AND:
qualifiedName startsWith default.<370-char-name>
qualifiedName endsWith @primary
Result: 1 matching entity (previously empty).

rkundam · 2026-06-01T22:25:45Z

@@ -866,6 +871,10 @@ private boolean isIndexSearchable(FilterCriteria filterCriteria, AtlasStructType
                } else if (operator == SearchParameters.Operator.CONTAINS && AtlasAttribute.hastokenizeChar(attributeValue) && indexType == null) { // indexType = TEXT


Since this patch updating isIndexSearchable method, it's good to reorder indexType before hastokenizeChar: Check indexType == null first so STRING-indexed attributes skip the full-string tokenize scan. Small performance win, no behavior change.

rkundam · 2026-06-01T22:27:10Z

                    LOG.debug("{} operator found for string (TEXT) attribute {} and special characters found in filter value {}, deferring to in-memory or graph query (might cause poor performance)", operator, qualifiedName, attributeValue);

+                    ret = false;
+                } else if ((operator == SearchParameters.Operator.STARTS_WITH || operator == SearchParameters.Operator.ENDS_WITH || operator == SearchParameters.Operator.CONTAINS)  && attributeValue.length() > SOLR_MAX_TOKEN_STR_LENGTH) {


It's optional, Single CONTAINS block: Tokenize-char and max-length checks for CONTAINS are correct but split across two conditions. Merging them under one CONTAINS condition would improve readability.

rkundam · 2026-06-01T22:29:17Z

                    LOG.debug("{} operator found for string (TEXT) attribute {} and special characters found in filter value {}, deferring to in-memory or graph query (might cause poor performance)", operator, qualifiedName, attributeValue);

+                    ret = false;
+                } else if ((operator == SearchParameters.Operator.STARTS_WITH || operator == SearchParameters.Operator.ENDS_WITH || operator == SearchParameters.Operator.CONTAINS)  && attributeValue.length() > SOLR_MAX_TOKEN_STR_LENGTH) {


Null safety (minor): The length check uses attributeValue.length() directly; a null-safe length check or using StringUtils.length would avoid a potential NPE.

rkundam · 2026-06-01T22:33:46Z

    public static final String INDEX_SEARCH_MAX_RESULT_SET_SIZE        = "atlas.graph.index.search.max-result-set-size";
    public static final String INDEX_SEARCH_TYPES_MAX_QUERY_STR_LENGTH = "atlas.graph.index.search.types.max-query-str-length";
    public static final String INDEX_SEARCH_TAGS_MAX_QUERY_STR_LENGTH  = "atlas.graph.index.search.tags.max-query-str-length";
+    public static final String INDEX_SEARCH_SOLR_MAX_TOKEN_LENGTH      = "atlas.graph.solr.index.search.max-token-length";


Avoid Solr-specific naming in INDEX_SEARCH_SOLR_MAX_TOKEN_LENGTH: Atlas can use different indexing backends, so embedding "Solr" in the constant (and related names like SOLR_MAX_TOKEN_STR_LENGTH and the config key atlas.graph.solr.index.search.max-token-length) ties the API to one implementation. Prefer a backend-neutral name such as INDEX_SEARCH_MAX_TOKEN_LENGTH / INDEX_SEARCH_MAX_TOKEN_STR_LENGTH, with a generic property key (e.g. atlas.graph.index.search.max-token-length) unless this limit is truly Solr-only.

rkundam · 2026-06-01T22:36:54Z

+    private static final String LONG_TABLE_NAME =
+            "rltdvhrxhocivajyqojaxulwzhgzgzpkfolziacfnndzncjkthzeaeykdhhrjqdeibhdgiepwqkinvqzqxevushydtwjaabzgbfjmzvcbsoxewruhyhciyjefzsnokxvbeiiowzbhlkmujcnwilslgeswobzwwvkugyupsemqxsbdcmgrlpxmeuljvxyddvpccvcloupjorziwhogwnjvsdrwksvrbxcxjlcrcmrvvmbdmenafmvgrqzcaqbgpnhxiqbvxcbnudafsmjzvlzzzzpqmjkngbximmbjbijrqfb";
+
+    private static final String LONG_TOKENIZED_TABLE_NAME =


Update LONG_TOKENIZED_TABLE_NAME to include tokenize characters: The value is currently all alphanumeric, so it does not satisfy hastokenizeChar and the name is misleading. Change the string to include a few special characters (e.g. ., :, or @) so it actually exercises the TEXT-index tokenize-char deferral path, not just the max token length boundary.

ATLAS-5032: Long qualifiedName basic search fix

96da10d

rkundam reviewed Jun 1, 2026

View reviewed changes

Handled PR Comments

51a9a4f

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

ATLAS-5032: Fix basic search when querying by long attribute values#650

ATLAS-5032: Fix basic search when querying by long attribute values#650
saksenasonali wants to merge 2 commits into
apache:masterfrom
saksenasonali:ATLAS-5032-fix-long-attribute-search

saksenasonali commented May 27, 2026

Uh oh!

rkundam Jun 1, 2026

Uh oh!

rkundam Jun 1, 2026

Uh oh!

rkundam Jun 1, 2026

Uh oh!

rkundam Jun 1, 2026

Uh oh!

rkundam Jun 1, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

		@@ -866,6 +871,10 @@ private boolean isIndexSearchable(FilterCriteria filterCriteria, AtlasStructType
		} else if (operator == SearchParameters.Operator.CONTAINS && AtlasAttribute.hastokenizeChar(attributeValue) && indexType == null) { // indexType = TEXT

Conversation

saksenasonali commented May 27, 2026

Uh oh!

rkundam Jun 1, 2026

Choose a reason for hiding this comment

Uh oh!

rkundam Jun 1, 2026

Choose a reason for hiding this comment

Uh oh!

rkundam Jun 1, 2026

Choose a reason for hiding this comment

Uh oh!

rkundam Jun 1, 2026

Choose a reason for hiding this comment

Uh oh!

rkundam Jun 1, 2026

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants