explorer: #248 Flavor B — search by EXTERNAL concept URIs (Getty AAT, UBERON/OBO) via a pid→concept-URI projection #256

@rdhyee

The ask (Eric Kansa)

Search material samples by a concept's URI/PID using linked data, not strings (2026-05-29 call). Eric's four guide queries — these are the acceptance tests:

  1. 'bucchero ware' → https://vocab.getty.edu/aat/300387149
  2. 'tibia' → https://purl.obolibrary.org/obo/UBERON_0000979
  3. 'femur' → https://purl.obolibrary.org/obo/UBERON_0000981 (in geographic context: Italy)
  4. 'whorls (spindle flywheels)' → https://vocab.getty.edu/aat/300263796

Key finding (verified 2026-06-01)

Flavor A (#248/#252) returns 0 results for all four queries. These are external vocabulary URIs, not iSamples-vocabulary terms, so they do not appear in the object_type/material/context columns of sample_facets_v2. They live in the PQG graph as IdentifiedConcept nodes, referenced by p__keywords / p__has_* (BIGINT[] node refs in the wide parquet). The keyword labels are abundant in text (bucchero 2,693 · tibia 16,577 · femur 13,388 · whorl 1,473), so A1 free-text search approximates these queries today.
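
To make the graph path concrete, here is a minimal sketch of following p__keywords node refs from a sample to its IdentifiedConcept nodes. It uses toy in-memory rows rather than the real parquet. The column names row_id, otype, pid and label, the otype value MaterialSampleRecord, the sample pids, and the helper names are all assumptions for illustration; only p__keywords, IdentifiedConcept and the two URIs come from this issue.

```python
from typing import Iterator

# Toy stand-in for the wide PQG parquet, where every row is a graph node.
# Every column name except p__keywords is an assumption, as are the sample pids.
NODES = [
    {"row_id": 1, "otype": "IdentifiedConcept",
     "pid": "https://purl.obolibrary.org/obo/UBERON_0000979", "label": "tibia"},
    {"row_id": 2, "otype": "IdentifiedConcept",
     "pid": "https://vocab.getty.edu/aat/300387149", "label": "bucchero ware"},
    {"row_id": 10, "otype": "MaterialSampleRecord", "pid": "sample-a", "p__keywords": [1]},
    {"row_id": 11, "otype": "MaterialSampleRecord", "pid": "sample-b", "p__keywords": [1, 2]},
    {"row_id": 12, "otype": "MaterialSampleRecord", "pid": "sample-c", "p__keywords": []},
]

# Columns holding BIGINT[] node refs; the p__has_* columns would be listed here too.
REF_COLUMNS = ("p__keywords",)


def concept_edges(nodes) -> Iterator[tuple[str, str, str, str]]:
    """Yield (sample_pid, concept_uri, concept_label, relation) per node ref."""
    concepts = {n["row_id"]: n for n in nodes if n["otype"] == "IdentifiedConcept"}
    for node in nodes:
        if node["otype"] != "MaterialSampleRecord":
            continue
        for column in REF_COLUMNS:
            for ref in node.get(column) or []:
                concept = concepts.get(ref)
                if concept is not None:  # ignore refs that are not concept nodes
                    yield node["pid"], concept["pid"], concept["label"], column


def pids_for_uri(nodes, uri: str) -> set[str]:
    """Return the set of sample pids that reference the given concept URI."""
    return {pid for pid, concept_uri, _, _ in concept_edges(nodes) if concept_uri == uri}
```

Calling pids_for_uri(NODES, "https://purl.obolibrary.org/obo/UBERON_0000979") on the toy data gives the two samples tagged with tibia. Against the real narrow/wide PQG the same traversal would be a join on the node-ref arrays, which is the query-time cost the proposed projection is meant to avoid.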

Proposed work

  1. Confirm the graph path: resolve Eric's 4 URIs through p__keywords (+ category relations) in the narrow/wide PQG to verify reachability and get pid sets. (Narrow: isamples_202601_narrow.)
  2. Build a denormalized projection pid → [concept_uri, concept_label, relation] (a new facets-style parquet) so described-by=<any-uri> can semi-join against it, eliminating graph-traversal joins at query time. (Eric & Andrea repeatedly flagged this 'new denormalized parquet' need in the MVP notes.)
  3. Frontend: point buildConceptFilter at the new projection (or UNION it with the existing iSamples-vocab columns); add URI→label resolution + free-text fallback for unknown URIs; paste-a-URI input.
  4. Compose with geography (query 3: femur in Italy) — already works via concept filter + viewport.
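
The query side of the steps above might look roughly like the following sketch: a semi-join against the projection, URI→label resolution, a free-text fallback for URIs the projection does not know, and an optional bounding box for the geographic case. This is not the actual buildConceptFilter. The row shapes, sample data, bounding box values, and the names described_by, label_for and in_bbox are hypothetical; only the projection columns (pid, concept_uri, concept_label, relation) and the two URIs come from this issue.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ProjectionRow:  # one row of the proposed pid -> concept projection
    pid: str
    concept_uri: str
    concept_label: str
    relation: str


@dataclass(frozen=True)
class Sample:  # hypothetical minimal sample record
    pid: str
    text: str
    lon: float
    lat: float


FEMUR = "https://purl.obolibrary.org/obo/UBERON_0000981"
WHORL = "https://vocab.getty.edu/aat/300263796"

# Toy data; pids, text and coordinates are invented for illustration.
PROJECTION = [
    ProjectionRow("sample-a", FEMUR, "femur", "p__keywords"),
    ProjectionRow("sample-b", FEMUR, "femur", "p__keywords"),
    ProjectionRow("sample-c", WHORL, "whorls (spindle flywheels)", "p__keywords"),
]
SAMPLES = [
    Sample("sample-a", "femur fragment", 12.5, 41.9),   # inside the box below
    Sample("sample-b", "femur, left", -0.1, 51.5),      # outside the box below
    Sample("sample-c", "spindle whorl", 11.3, 43.8),
    Sample("sample-d", "whorl, unlinked", 12.0, 42.0),  # label in text only
]

# Rough (west, south, east, north) box around Italy; an assumption, not a real viewport.
ITALY_BBOX = (6.6, 35.5, 18.6, 47.1)


def in_bbox(sample: Sample, bbox) -> bool:
    west, south, east, north = bbox
    return west <= sample.lon <= east and south <= sample.lat <= north


def label_for(uri: str) -> Optional[str]:
    """Resolve a concept URI to its label, or None if the projection lacks it."""
    return next((r.concept_label for r in PROJECTION if r.concept_uri == uri), None)


def described_by(uri: str, fallback_text: Optional[str] = None, bbox=None) -> list[str]:
    """Return sample pids for a concept URI, falling back to free text if unknown."""
    linked = {r.pid for r in PROJECTION if r.concept_uri == uri}
    hits = []
    for sample in SAMPLES:
        if linked:
            match = sample.pid in linked  # semi-join on pid
        else:
            match = bool(fallback_text) and fallback_text.lower() in sample.text.lower()
        if match and (bbox is None or in_bbox(sample, bbox)):
            hits.append(sample.pid)
    return hits
```

With this toy data, described_by(FEMUR, bbox=ITALY_BBOX) keeps only the femur sample inside the box, which mirrors guide query 3. An unknown URI with fallback_text="whorl" matches on text alone, so it also picks up samples that have no concept link.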

Repos

Data/projection work likely in pqg/; frontend in isamplesorg.github.io. Cross-repo.
