Shared Discussion (knowledge gaps) + Dataset module — TraitMech reference integration (claw#7 Phase 1) #119

Merged
realmarcin merged 2 commits into main from feat/shared-discussion-dataset-module on Jun 18, 2026

Conversation

@realmarcin

Contributor

Phase 1 of the cross-Mech DisMech feature adoption (CultureBotAI/culturebotai-claw#7). Introduces the canonical, byte-identical shared LinkML module and wires it into TraitMech as the greenfield reference integration.

What's here

  • src/traitmech/schema/mech_shared.yaml (new canonical module, to be vendored byte-identical across all four Mechs):
    • Discussion — broad knowledge-gap/discourse supertype. kind ∈ {KNOWLEDGE_GAP, OPEN_QUESTION, CONTROVERSY, CURATION_TODO, EMERGING_HYPOTHESIS, INTERPRETATION, HUMAN_MODEL_MISMATCH}; lifecycle status; free-form attaches_to hash-anchor pointers; slim proposed_experiments; evidence.
    • Dataset — lightweight public-data reference. DatasetTypeEnum reconciles CultureMech's omics types + microbial additions (AMPLICON/METAGENOMICS/METATRANSCRIPTOMICS/GENOMICS/…); DatasetRepositoryEnum from CommunityMech's orientation (SRA/GEO/ENA/MGnify/JGI GOLD+IMG/NMDC/…).
    • ProposedExperiment — domain-neutral resolution sketch.
  • TraitRecord gains discussions + datasets slots (attaches_to anchors into causal_graphs#<edge>).
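
For orientation, the three classes listed above can be pictured roughly as the following Python sketch. This is not the generated `gen-pydantic` output and not the schema itself. The class names, the `kind` values, and the slot names `attaches_to`, `proposed_experiments`, `evidence`, `dataset_type`, `accession`, `title`, `description`, `url`, `name`, and `model_systems` come from this PR; the enum name `DiscussionKind`, the field types, the optional/multivalued choices, and the use of plain strings for `status` and `evidence` are assumptions made for illustration.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List, Optional


class DiscussionKind(Enum):
    """The seven `kind` values named in this PR (enum name is hypothetical)."""
    KNOWLEDGE_GAP = "KNOWLEDGE_GAP"
    OPEN_QUESTION = "OPEN_QUESTION"
    CONTROVERSY = "CONTROVERSY"
    CURATION_TODO = "CURATION_TODO"
    EMERGING_HYPOTHESIS = "EMERGING_HYPOTHESIS"
    INTERPRETATION = "INTERPRETATION"
    HUMAN_MODEL_MISMATCH = "HUMAN_MODEL_MISMATCH"


@dataclass
class ProposedExperiment:
    """Domain-neutral resolution sketch."""
    name: Optional[str] = None
    model_systems: List[str] = field(default_factory=list)


@dataclass
class Discussion:
    """Knowledge-gap/discourse supertype."""
    kind: DiscussionKind
    # Lifecycle status; its permissible values are not listed in this PR,
    # so a plain string stands in here (assumption).
    status: Optional[str] = None
    # Hash-anchor pointers such as "causal_graphs#<edge>" (assumed multivalued).
    attaches_to: List[str] = field(default_factory=list)
    proposed_experiments: List[ProposedExperiment] = field(default_factory=list)
    # Placeholder: the real range is an evidence class, not a string.
    evidence: List[str] = field(default_factory=list)


@dataclass
class Dataset:
    """Lightweight public-data reference."""
    dataset_type: str  # e.g. "METAGENOMICS"
    accession: Optional[str] = None
    title: Optional[str] = None
    description: Optional[str] = None
    url: Optional[str] = None
    evidence: List[str] = field(default_factory=list)
```

A record would then carry a list of `Discussion` objects and a list of `Dataset` objects, for example `Discussion(kind=DiscussionKind.KNOWLEDGE_GAP, attaches_to=["causal_graphs#edge_1"])`, where `edge_1` is a made-up anchor.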

Design note (please eyeball)

Discussion.evidence / Dataset.evidence use range: EvidenceItem, resolving to the importing schema's EvidenceItem — so each Mech reuses its own evidence model, the module defines no colliding EvidenceItem/Dataset classes, and stays byte-identical. (MIM has MappingEvidence, not EvidenceItem — its adoption will need a small reconciliation; tracked in claw#7.)

Validation

  • gen-pydantic compiles with no errors; Discussion/Dataset/ProposedExperiment emitted; EvidenceItem resolves.
  • A sample TraitRecord with a KNOWLEDGE_GAP discussion (+ proposed_experiments) and a METAGENOMICS dataset passes linkml-validate ("No issues found").

Next (claw#7)

Vendor this module byte-identical + sha-pin across MIM / CommunityMech / CultureMech; do the Dataset migration for CultureMech (Dataset) and CommunityMech (AssociatedDataset); MIM EvidenceItem reconciliation.
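
As a rough illustration of the field-level part of that Dataset migration: the follow-up commit in this PR records two renames, CultureMech `dataset_id` → `accession` and CommunityMech `name` → `title`. The sketch below applies only those two renames to a record held as a plain dict. The function name `migrate_dataset`, the `RENAMES` table layout, and the collision check are hypothetical, and the old→new enum value map (documented inline in the module, not in this PR text) is deliberately not reproduced here.

```python
from typing import Any, Dict

# Field renames stated in this PR; the table layout itself is hypothetical.
RENAMES: Dict[str, Dict[str, str]] = {
    "CultureMech": {"dataset_id": "accession"},
    "CommunityMech": {"name": "title"},
}


def migrate_dataset(record: Dict[str, Any], source_repo: str) -> Dict[str, Any]:
    """Return a copy of `record` with the repo-specific field renames applied."""
    renames = RENAMES[source_repo]
    migrated: Dict[str, Any] = {}
    for key, value in record.items():
        new_key = renames.get(key, key)
        # Refuse to silently overwrite when old and new names both exist.
        if new_key in migrated:
            raise ValueError(f"field collision on {new_key!r}")
        migrated[new_key] = value
    return migrated
```

For example, `migrate_dataset({"dataset_id": "EXAMPLE-0001", "dataset_type": "METAGENOMICS"}, "CultureMech")` yields a dict keyed by `accession` and `dataset_type`; the accession value is invented for the example.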

🤖 Generated with Claude Code

realmarcin and others added 2 commits June 17, 2026 20:04
…aitRecord

Phase 1 of the DisMech feature adoption (culturebotai-claw#7). Introduces the
canonical, byte-identical shared LinkML module
src/traitmech/schema/mech_shared.yaml:

  * Discussion — broad discourse/knowledge-gap supertype (kind enum incl.
    KNOWLEDGE_GAP/OPEN_QUESTION/CONTROVERSY/CURATION_TODO/EMERGING_HYPOTHESIS/
    INTERPRETATION/HUMAN_MODEL_MISMATCH; lifecycle status; free-form
    `attaches_to` hash-anchor pointers; slim `proposed_experiments`; `evidence`).
  * Dataset — lightweight public-data reference; canonical DatasetTypeEnum
    (omics + microbial: AMPLICON/METAGENOMICS/METATRANSCRIPTOMICS/GENOMICS/…)
    and DatasetRepositoryEnum (SRA/GEO/ENA/MGnify/JGI GOLD+IMG/NMDC/…).
  * ProposedExperiment — domain-neutral resolution sketch.

`Discussion.evidence` / `Dataset.evidence` resolve to the IMPORTING schema's
`EvidenceItem`, so each Mech reuses its own evidence model (no duplicate class,
no collision). TraitRecord gains `discussions` + `datasets` slots; `attaches_to`
anchors into `causal_graphs#<edge>`.

TraitMech is the greenfield reference integration (no dataset migration needed).
Validated: `gen-pydantic` compiles (Discussion/Dataset/ProposedExperiment emitted,
EvidenceItem resolves); a sample TraitRecord with a KNOWLEDGE_GAP discussion +
METAGENOMICS dataset passes `linkml-validate` (No issues found).

Next (claw#7): vendor this module byte-identical + sha-pin across MIM /
CommunityMech / CultureMech, with the Dataset migration for CultureMech
(`Dataset`) and CommunityMech (`AssociatedDataset`).

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
…, naming

Addresses the four requested adjustment areas:

- Self-contained evidence: drop the `range: EvidenceItem` dependency on the
  importing schema; define `SupportingReference` (+ `SupportLevelEnum`) in the
  module. Discussion.evidence / Dataset.evidence now reference it. Module
  validates STANDALONE (gen-pydantic clean) and needs no per-repo reconciliation
  (MIM's MappingEvidence no longer matters). Each record keeps its own primary
  EvidenceItem; discussions/datasets carry SupportingReference citations.
- Dataset enums grounded in reality: DatasetTypeEnum / DatasetRepositoryEnum are
  now the faithful UNION of the existing CultureMech + CommunityMech enums (plus
  microbial additions), with the old→new migration map documented inline so the
  Phase-2 migration of those two repos is lossless.
- Naming & constraints: field is `dataset_type` (matches both repos, was
  `data_type`); `accession`/`description` recommended; `url` range uri;
  documented `attaches_to` `<section>#<anchor>` grammar; CultureMech `dataset_id`
  → `accession` and CommunityMech `name` → `title` migration notes inline.
- ProposedExperiment: `model_systems` is now multivalued (matches DisMech's
  plurality); `name` recommended.
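
A minimal sketch of how a consumer might split an `attaches_to` pointer under the `<section>#<anchor>` grammar mentioned above. The grammar and the `causal_graphs#<edge>` example are from this PR; the helper name `parse_attaches_to`, the `Anchor` tuple, and the decision to reject pointers with an empty section or anchor are assumptions (the slot is described as free-form, so the module itself may be more permissive).

```python
from typing import NamedTuple


class Anchor(NamedTuple):
    section: str
    anchor: str


def parse_attaches_to(pointer: str) -> Anchor:
    """Split '<section>#<anchor>' at the first '#'."""
    section, sep, anchor = pointer.partition("#")
    # Strictness is an assumption: require both halves to be non-empty.
    if not sep or not section or not anchor:
        raise ValueError(f"expected '<section>#<anchor>', got {pointer!r}")
    return Anchor(section, anchor)
```

With this sketch, `parse_attaches_to("causal_graphs#edge_1")` returns `Anchor(section="causal_graphs", anchor="edge_1")`, where `edge_1` is a made-up edge identifier.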

Validated: module standalone gen-pydantic (exit 0, SupportingReference +
Discussion + Dataset + ProposedExperiment emitted); full traitmech schema
gen-pydantic (exit 0); sample TraitRecord instance with a KNOWLEDGE_GAP
discussion + METAGENOMICS dataset passes linkml-validate.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
@realmarcin realmarcin merged commit 654a5e3 into main Jun 18, 2026
3 checks passed
@realmarcin realmarcin deleted the feat/shared-discussion-dataset-module branch June 18, 2026 06:15
realmarcin added a commit that referenced this pull request Jun 18, 2026
Completes the cross-Mech pin: TraitMech (which introduced the module in #119,
before the pin recipe existed) now also pins mech_shared.yaml. All four Mechs
now carry the byte-identical module (1a5e21eb) + an identical sha sidecar.

Co-authored-by: Claude Fable 5 <noreply@anthropic.com>
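
One way to check that a vendored copy of the module still matches its pin, sketched under assumptions: the PR text does not say which hash algorithm the sidecar uses or how the sidecar file is laid out, so this example assumes SHA-256 and a sidecar whose first whitespace-separated token is the hex digest (the layout produced by `sha256sum`). The function names are hypothetical and this is not the pin recipe referenced above.

```python
import hashlib
from pathlib import Path


def file_sha256(path: Path) -> str:
    """Hex SHA-256 of the file's raw bytes (algorithm choice is an assumption)."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def matches_pin(module_path: Path, sidecar_path: Path) -> bool:
    """Compare the module's digest with the first token of the sidecar file."""
    tokens = sidecar_path.read_text().split()
    if not tokens:
        raise ValueError(f"empty sidecar: {sidecar_path}")
    return file_sha256(module_path) == tokens[0].lower()
```

Because the comparison is over raw bytes, any whitespace or line-ending change in a vendored copy would make `matches_pin` return `False`, which is the property "byte-identical" relies on.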