Skip to content

Phase 3: Rebuild FLASHApp viewers on OpenMS-Insight via frozen template#93

Open
t0mdavid-m wants to merge 53 commits into
developfrom
claude/kind-heisenberg-u6dVm
Open

Phase 3: Rebuild FLASHApp viewers on OpenMS-Insight via frozen template#93
t0mdavid-m wants to merge 53 commits into
developfrom
claude/kind-heisenberg-u6dVm

Conversation

@t0mdavid-m
Copy link
Copy Markdown
Member

@t0mdavid-m t0mdavid-m commented Jun 5, 2026

Summary

This PR completes Phase 3 of the FLASHApp migration: rebuild the three viewer pages (FLASHDeconv, FLASHTnT, FLASHQuant) on top of OpenMS-Insight using a reusable, tool-agnostic grid module from a frozen streamlit-template package. The old bespoke Vue mega-component and index-based selection logic are replaced with value-based filters/interactivity and a shared StateManager for cross-linked panels.

Key Changes

New Modules

  • src/view/grid.py — Tool-agnostic linked-grid renderer (render_linked_grid() + LayoutManager class). Distills the old render.py::render_grid() + both FLASH*LayoutManager classes into a single reusable template that can be frozen and vendored byte-for-byte unchanged into downstream apps.

  • src/render/schema.py — Adapter layer that converts FLASHApp's wide, list-column, index-addressed FileManager caches into tidy parquet with stable value IDs (scan_id, mass_id, peak_id, protein_id, etc.). Enables OpenMS-Insight's value-based filters/interactivity model. Includes helpers for exploding nested signal/noise peaks, KDE distributions, and protein/tag data.

  • src/render/render.py (repurposed) — Now a builder factory. make_builders(file_manager, dataset_id, tool, settings=None) returns a {comp_name: () -> BaseComponent} map. Each zero-arg factory closes over dataset context and uses file_manager.result_path() to feed data_path= to Insight components. Includes oracle Tabulator column definitions (titles, formatters, sorters) ported verbatim so migrated Tables show the same curated columns and formatting.

Removed Modules

  • src/render/update.py — Index-based selection logic (scanIndex / massIndex / proteinIndex / xRange/yRange slicing). Replaced by value-based filters/interactivity in Insight components.
  • src/render/initialize.py — Per-panel cache loading. Now handled by Insight's data_path= + subprocess preprocessing.
  • src/render/StateTracker.py — Local state reconciler. Replaced 1:1 by openms_insight.StateManager.
  • src/render/components.py — Vue component registration. No longer needed.
  • src/render/util.py — Hash utility (unused).
  • .gitmodules + openms-streamlit-vue-component/ — Old Vue submodule removed.

Page Updates

  • content/FLASHDeconv/FLASHDeconvViewer.py, content/FLASHTnT/FLASHTnTViewer.py, content/FLASHQuant/FLASHQuantViewer.py — Simplified to call show_linked_grid() with a layout and builders. No more render_grid() / StateTracker / initialize_data / update_data / filter_data calls.

  • src/common/common.py — Added show_linked_grid() convenience wrapper for rendering N-experiment grids with the frozen template.

Data Layer

  • src/workflow/FileManager.py — Added result_path(dataset_id, name_tag) method to expose parquet file paths for data_path= (Insight's subprocess preprocessing entry point).

Testing & Migration Harness

  • tests/test_render_schema.py — Construct-smoke for build_insight_caches(): synthetic FileManager caches → tidy parquet with stable IDs and correct explode shapes.
  • **`tests/test_render

https://claude.ai/code/session_017kD4FyAsNvW6VFTZwVvSne

Summary by CodeRabbit

  • New Features
    • Unified linked-grid viewer with per-experiment layout editing and side-by-side rendering; value-based Insight builders and tidy-cache generation for all viewers; FileManager now exposes parquet path helpers.
  • Documentation
    • Added Phase 3 migration plan and review guidance.
  • Tests
    • Large new test suites exercising builders, schema transforms, and grid wiring.
  • Chores
    • Runtime image build updated to install Insight package; dependency pinned to openms-insight; removed legacy local JS submodule.

claude added 30 commits June 2, 2026 18:46
Scaffolds Phase-3 tracking: rebuild FLASHApp viewer pages on OpenMS-Insight
via a reusable, frozen streamlit-template grid (no FLASHApp fork). No app code changes.

- units.yaml: 10 units (template build+freeze, then FLASHApp rebuild) with oracles.
- run_review.py: shared record/gate/report convergence driver.
- nondivergence.py: asserts FLASHApp grid code == frozen template (normalized hash).
- review-log/ ledger, REVIEW.md rollup, README.md.

https://claude.ai/code/session_017kD4FyAsNvW6VFTZwVvSne
… non-divergence)

Concrete spec from the planning fan-out: streamlit-template src/view/grid.py
(render_linked_grid + LayoutManager), FileManager data layer, show_linked_grid,
visualization_template demo; FLASHApp src/render/schema.py (tidy parquet, stable IDs);
builders factory + StateManager per (tool,experiment); 3 viewer rebuilds; index->value
selection map; non-divergence via vendored grid.py == frozen template grid.py.

https://claude.ai/code/session_017kD4FyAsNvW6VFTZwVvSne
MSG
- src/view/grid.py: byte-identical vendored copy of the frozen streamlit-template
  src/view/grid.py (the render_linked_grid + LayoutManager SSOT). Plain copy synced from
  the template, NOT a git submodule.
- src/view/__init__.py: package marker.
- migration/units.yaml: register nondivergence_pairs = [FLASHApp/src/view/grid.py,
  streamlit-template/src/view/grid.py] so nondivergence.py enforces byte-identical
  (normalized-hash) equality of the two side-by-side checkouts.

nondivergence GREEN (vendored == frozen template).

https://claude.ai/code/session_017kD4FyAsNvW6VFTZwVvSne
…rozen template grid

- src/render/schema.py (NEW): build_insight_caches(file_manager, dataset_id, tool) ->
  Insight-ready tidy parquet (stable scan_id/mass_id/peak_id/protein_id/tag_id/feature_id;
  exploded SignalPeaks/NoisyPeaks; comma-split quant traces; long-format target/decoy KDE).
- src/render/render.py: make_builders(file_manager, dataset_id, tool) factory (comp_name ->
  zero-arg -> BaseComponent), cache_id=f"{tool}__{dataset_id}__{comp}", data_path=result_path;
  index->value selection map (scanIndex->scan_id, massIndex->mass_in_scan/mass_id,
  proteinIndex+proteoform_scan_map->protein_id filter, StateTracker->StateManager).
- 3 viewer pages shrink to page_setup -> pick experiment -> load layout -> show_linked_grid
  (imports render_linked_grid/LayoutManager from the vendored src/view/grid.py, unchanged).
- src/common/common.py: show_linked_grid; FileManager: get_results(as_path)/result_path.
- DELETED dead layer: components.py, initialize.py, update.py, StateTracker.py, util.py.
- requirements.txt: + openms-insight.
- tests/: conftest + test_render_schema (9) + test_render_builders (8).
- Known seams (for review loop): Plot3D x/y/z config not forwarded through Insight's
  data_path subprocess (worked around with in-process data= for the small Plot3D frames);
  FLASHTnT SequenceView peaks not scan->protein remapped on protein selection.

Verified: pytest 45 passed/2 skipped, parse OK, nondivergence GREEN (grid.py untouched).

https://claude.ai/code/session_017kD4FyAsNvW6VFTZwVvSne
Replace the Phase-0 ast.parse placeholders with: nondivergence + template pytest (74,
incl. test_view_grid construct-smoke) + FLASHApp pytest (45, incl. render schema/builders
construct-smoke) + viewer-parse. Real machine gate for the review loop.

https://claude.ai/code/session_017kD4FyAsNvW6VFTZwVvSne
Resolves the high/med round-1 review findings on the FLASHApp rebuild:

schema.py:
- _explode_list_cols: drop null/empty list cells before exploding (no more
  phantom null rows for empty spectra / zero-mass scans).
- deconv_spectrum_tidy: alias MonoMass->mass (SequenceView requires a 'mass'
  column) and expose the per-scan ordinal as mass_in_scan (the oracle
  massIndex space the 3D / mass-table share) instead of dropping it.
- proteins: denormalize scan_id (=deconv_index via build_proteoform_scan_map)
  so a protein-row click resolves to its scan (value-based proteoform_scan_map).
- tags: key on scan_id (not a collapsed last-wins protein_id) so every tag on
  the selected proteoform's scan shows for ANY proteoform on that scan (oracle
  filtered by Scan).

render.py:
- mass_table / deconv_spectrum: 'mass' selection = mass_in_scan (per-scan
  ordinal the 3D consumes), not a global id; deconv x_column -> mass.
- anno_spectrum: remove mass interactivity (raw-m/z click never matched the
  deconvolved MonoMass in the oracle, so it selected nothing).
- protein_table: click sets BOTH protein and scan; tag_table + augmented
  spectrum + sequence-view peaks follow via scan; combined_spectrum filters
  by scan (was an unset 'spectrum' slot -> blank).
- SequenceView (tnt): filters by protein (sequence) AND scan (peaks).

viewers: blank-until-pick experiment selector (oracle parity; also avoids
eager cache builds on page load).

tests updated to assert the parity-correct wiring + a null-guard regression.
- combined_spectrum: resolve the tag-table's scalar tag_id to the tag's
  fragment masses + sequence via the tags frame (FLASHTnT only; FLASHDeconv
  has no tags frame so the resolve kwargs are omitted).
- sequence_view (tnt): publish residue clicks as the 'aa' selection the
  tagger consumes for the tag-relative selectedAA (gold highlight).
- test: the tagger resolves tag_id -> {sequence, masses, selectedAA}, tag/aa
  are re-render dependencies, and FLASHDeconv has no tag resolution.

Resolves the last round-1 finding (3-tnt-003, tagger overlay dead).
…rowing

Resolves the round-2 review findings:

- 3-builders-003 (high): 3D Precursor Signals showed nothing until a mass was
  clicked because mass was a hard filter. Now optional_filters=["mass"] so the
  scan's full Signal/Noisy scatter shows immediately, narrowing to one mass only
  when selected (oracle parity).
- 3-quant-001 (med): quant feature-trace 3D axes were swapped/mislabeled. Now
  x=m/z, y=retention time, z=intensity with explicit labels (oracle
  FLASHQuantView), title 'Feature group signals'.
- 3-tnt-005 (low): clicking a sequence residue now narrows the tag table to tags
  spanning it via interval_filters={"aa": ("StartPos", "EndPos")}, reusing the
  SequenceView 'aa' selection (oracle TabulatorTagTable secondary filter).

migration: gate now also runs the OpenMS-Insight gate (pytest/build/vitest/
parity) since Phase 3 re-opened Insight; insight:tagger-seqview tracked as a
no-regression unit. Tests added for all three fixes.
…or grid

- 3-schema-004 (high): the Precursor Signals 3D plotted raw m/z on a 'Mass'
  axis. The oracle (get3DplotInputFromSNRPeaks) uses x = mz*charge; add a
  precomputed 'mass' column to precursor_signals and point Plot3D x_column there
  (Plot3D's default x_label 'Mass' matches the oracle axis title). Charge states
  no longer collapse to their m/z positions.
- re-vendor src/view/grid.py from the frozen template (empty-experiment upload
  fix); nondivergence GREEN.
- tests: precursor_signals carries mass==mz*charge (explode unit + integration);
  3D builder x/y/z = mass/charge/intensity with label 'Mass'.

Also records round-5 review (9 clean, 2 findings: 3-schema-004, 3-grid-003).
…ution lines

Resolves round-6 review findings:
- 3-builders-005 (med): revert the 3D Precursor Signals optional_filters=[mass].
  The oracle frontend (getPrecursorSignal) renders EMPTY when no mass is selected
  (the precursor-scan lookup fails on the scan-filtered per_scan_data); it only
  draws SignalPeaks[mass_index] once a mass is chosen. mass is again a REQUIRED
  filter. (Round-2's 3-builders-003 misread update.py's data-prep as the displayed
  behavior.)
- 3-builders-004 (low): pass oracle axis titles to the plain spectra + heatmaps
  (Monoisotopic Mass / m/z / Intensity / Retention Time) instead of raw column names.
- 3-quant-002 (med): quant feature-trace 3D uses stem=False so each charge is one
  connected elution line (oracle FLASHQuantView mode:lines), not per-point spikes.
- 3-schema-006 (nit): correct the anno is_signal test comment.

tests: axis-title coverage; quant stem=False; corrected is_signal comment.
(OpenMS-Insight tagger config-key leak 3-tnt-006 fixed in the Insight commit.)
…ch-oracle-chrome)

Per the chosen full-presentation-parity bar:
- All 5 tables now pass column_definitions (oracle Tabulator titles, fixed +
  placeholder(-1->'-') formatters, sorters) + per-table initial_sort
  (Protein/Tag = Score desc). The Table renders ONLY the curated oracle columns;
  internal carrier columns (scan_id, mzs, ProteinIndex, full sequence, ...) stay
  in the data for filters/interactivity/index but are no longer displayed.
  Coverage(%) omitted (oracle commented it out); de-duped FLASHQuant's duplicated
  'Feature Group Quantity' column.
- quant_traces_3d: series_column='isotope' (breaks the polyline between isotopes
  within a charge, reproducing the oracle -1000 z-sentinel gaps) +
  category_name_template='Charge: {}' (legend 'Charge: 2'), via the new Plot3D
  features.
- conftest: synthetic protein_dfs/tag_dfs fixtures extended with the real
  FLASHTagger columns (Score, MatchingFragments, ModCount, TagCount, Coverage(%),
  Nmass/Cmass/DeltaMass) so initial_sort + the placeholder formatter are exercised.

Resolves 3-tables-001, 3-quant-003, 3-quant-004. (ProteinTable best-per-spectrum
toggle still pending.)
Round 8 (match-oracle-chrome bar): template/grid/common/filemanager/page + insight
+ deconv-viewer CLEAN; table chrome + quant isotope-breaks/legend verified resolved.
Remaining findings: 3-quant-005 (per-trace-id for same-isotope duplicate traces),
3-tables-002 (best-per-spectrum toggle), 3-tables-003 (go_to_fields), 3-fdr-001/002
(FDR title+trace names), 3-feat-001 (feature-table title), 3-anno-001 (anno spectrum
selection-driven highlight model). User directive: match the full interaction model.
…can flag

Data-layer support for the full-interaction-model parity work (render wiring in a
follow-up):
- anno_highlight_link (NEW tidy frame: scan_id, peak_id, mass_in_scan, charge):
  maps each annotated signal peak to the deconvolved mass(es) it belongs to + its
  charge, keyed by anno_spectrum_tidy.peak_id, so the annotated spectrum can
  highlight ONLY the selected mass's raw peaks + show per-charge z=N labels.
  Confirmed 1:MANY (a raw m/z peak can be a signal peak for several masses at
  different charges) -> one row per (peak, mass).
- quant_traces + trace_in_feature: per-feature running trace id minted at the
  trace explode, carried through the point comma-split, so the 3D breaks the
  polyline PER-TRACE (fixes 3-quant-005: two same-(charge,isotope) traces no
  longer merge).
- proteins + is_best_per_scan: 1 for the max-Score row per Scan (ordinal rank,
  ties keep-first) -> backs the ProteinTable best-per-spectrum default view.
Additive; tests + fixtures updated (59 passed).
… chrome

Wire the new OpenMS-Insight selection-driven LinePlot/Plot3D features + schema
data into the viewers (match-full-interaction-model directive):
- anno_spectrum: selection-driven highlight of the SELECTED mass via the
  anno_highlight_link frame (highlight_selection=mass, highlight_link_path,
  key=peak_id, match=mass_in_scan, charge labels z={}, deconv_peaks_toggle=True);
  exposes peak_id via a private anno_peak interactivity slot (does not drive the
  shared mass slot). Removes the static is_signal highlight. (3-anno-001)
- deconv_spectrum: selective highlight of the selected mass (highlight_match_column
  =mass_in_scan). (residual: selected-mass value label needs a match-column label
  producer in the LinePlot -- noted for follow-up)
- 3D_SN_plot: title_selection -> dynamic "" / "Precursor signals" / "Mass signals".
  (3-3d-001)
- quant_traces_3d: series_column=trace_in_feature (per-trace break; fixes the
  same-(charge,isotope) merge). (3-quant-005)
- go_to_fields per oracle for every table (no internal-carrier leakage). (3-tables-003)
- FDR: title "FDR Plot" + trace names "Target QScores"/"Decoy QScores". (3-fdr-001/002)
- feature table title "Feature groups". (3-feat-001)
- best-per-spectrum toggle: per-experiment checkbox -> make_builders(best_per_spectrum)
  -> protein_table sources is_best_per_scan-filtered data under a distinct cache_id.
  (3-tables-002)
gate GREEN; 69 passed; nondivergence GREEN.
Wire the new LinePlot match-column value-label producer into deconv_spectrum:
highlight_value_column="mass" + highlight_value_template="{:.2f}" so the selected
mass deconvolved stick gets its MonoMass value label (oracle mass.toFixed(2)),
closing the round-9 finding. Test extended to assert exactly one value label at
the selected stick. 69 passed.
…forms (3-best-002)

is_best_per_scan now flags EVERY proteoform whose Scan is missing (NaN / null /
non-numeric), matching the oracle filterBestPerSpectrum (keeps each row where
typeof scan !== "number" || isNaN(scan)) instead of collapsing them into a single
.over(Scan) group (which flagged only one -> hid the rest in best-per-spectrum
mode). A missing Scan from protein.tsv arrives as float NaN, so the check casts to
f64 (non-numeric -> null) and treats null|NaN as missing (dtype-safe; is_nan errors
on an int column without the cast). Edge-case-gated (real protein.tsv populates
Scan) but a real parity divergence. Test added.
Round 10: 9 units CLEAN (template grid/common/filemanager/page, nondivergence,
builders, deconv-viewer, quant-viewer, insight). Deconv value-label (3-deconv-001)
verified resolved. Two findings: 3-best-002 (best-per-spectrum must pass through
missing-Scan proteoforms; fixed) and 3-cascade-001 (protein selection must
cascade-clear the dependent aa/tag selections; fix in progress).
Wire clears_selections=["aa","tag"] onto the protein_table, reproducing the oracle
ProteinTable.updateSelectedProtein cascade (clears selectedAA/selectedTag/tagData
on every protein click). Switching proteoform now resets the dependent residue (aa)
+ tag selections, so the tag-table interval filter + tagger overlay follow the new
proteoform instead of going stale/empty. Test asserts the protein_table args carry
clearsSelections=["aa","tag"].
3-best-003 (schema): is_best_per_scan now ranks the highest REAL Score first --
NaN/missing/non-numeric Score is pushed to -inf before rank(descending) so it
sorts LAST, mirroring the oracle toScore (NaN -> -Infinity). Previously polars
ranked NaN as largest, flagging a no-score proteoform best. Test added.

3-cascade-002 (render): scan_table gains clears_selections=["mass"], reproducing
the oracle TabulatorScanTable updateSelectedScan -> updateSelectedMass(0): a scan
click clears the mass selection, and the mass_table (default_row=0) re-defaults to
mass_in_scan 0 of the new scan via the bridge _auto_selection -- so the
deconv/anno spectra + 3D show the new scan first mass, not a stale ordinal.
Test asserts the cascade arg AND that the mass_table auto-selects 0 when mass is
unset.
Round 12: 10 units clean (grid empty-row divergence sanctioned as exception-b:
degenerate malformed-upload-only, no well-formed-data difference). Round-11 fixes
(Score-NaN 3-best-003, scan->mass 3-cascade-002) + anno-peak click all verified
resolved/correct. One new finding 3-seqview-001: SequenceView residue-click model
(coverage- vs fragment-gating, no toggle-clear, no residue->mass) diverges from the
oracle; re-opens the converged SequenceView -- escalating scope.
…001)

Add fragment_mass_identifier="mass" to the FLASHTnT sequence_view builder so the
oracle two-path residue click is fully reproduced (maintainer: "both should be
supported as in the FLASHTnT Viewer"):
- PATH 1 (aa / sequence-tag): coverage_column="coverage" (already wired) now makes
  residue_identifier="aa" coverage-gated + toggling via the Insight SequenceView
  (tag-covered residues drive the tagger/tag table; re-click clears).
- PATH 2 (mass / fragment): a residue click on a fragment-matched residue publishes
  that fragment peak mass_in_scan to "mass" (resolved via the existing interactivity
  "mass" column), reproducing updateMassTableFromFragmentMass -> updateSelectedMass.
FLASHDeconv sequence_view (no coverage/residue wiring) is unchanged. Test asserts
the two-path wiring.
…view-002)

Add fragment_mass_identifier="mass" to the FLASHDeconv sequence_view branch so a
fragment-residue click in the (global) deconv Sequence View publishes the fragment
peak mass_in_scan to the shared mass selection -- driving the mass table, deconv/anno
spectra, and 3D S/N plot (all in the deconv default layout), matching the oracle
aminoAcidSelected -> updateSelectedMass which runs on every tool. PATH 2 only (no
tags/coverage on the global sequence). Test asserts deconv fragment_mass_identifier
== mass and coverage_column is None.
Round 13: 9 units clean (SequenceView two-path PATH-1/PATH-2 verified for TnT; gate
GREEN). 3 findings, all SequenceView: 3-seqview-002 deconv residue->mass (fixed),
3-seqview-003 inbound mass->fragment-table highlight, 3-seqview-004 mass-info header.
Comprehensive SequenceView parity pass in progress.
…ghlight (3-seqview-003/004)

- schema._build_seq_tnt: add observed_mass = computed_mass (= ProteoformMass), the
  oracle SequenceView header observed proteoform mass.
- render._sequence_view (tnt): observed_mass_column="observed_mass" +
  mass_header_title="Proteoform" (renders Theoretical/Observed/Delta Mass header) +
  mass_selection_identifier="mass" (a mass selected elsewhere highlights the matching
  fragment-table row -- oracle updateFragmentTableFromMassSelection).
- render._sequence_view (deconv): mass_selection_identifier="mass" (inbound highlight;
  no header -- global sequence, not a proteoform).
Tests assert the header + inbound wiring on both branches.
claude added 21 commits June 4, 2026 09:10
FLASHTnT sequence_view passes theoretical_mass_label="Theoretical protein mass" +
observed_mass_label="Observed proteoform mass" (oracle preparePrecursorInfo
proteoform branch) instead of the generic precursor-branch defaults. Test asserts.
- schema._build_seq_deconv: add per-scan observed_mass = PrecursorMass (NULL for MS1
  scans where PrecursorMass==0, so the SequenceView hides the header there -- matching
  the oracle precursor branch which renders no header for MS1; vs the TnT proteoform
  branch which shows "-" for a missing mass).
- render._sequence_view (deconv): observed_mass_column="observed_mass" +
  mass_header_title="Precursor" (generic "Theoretical mass"/"Observed mass" labels =
  oracle preparePrecursorInfo precursor branch). Completes the deconv half of the
  mass-info header (3-seqview-004 did the TnT proteoform half).
Test asserts deconv observed_mass_column/title + seq_deconv observed_mass column.
ms1_raw_heatmap/ms2_raw_heatmap plot raw m/z (from the annotated spectra), so the
oracle PlotlyHeatmap yAxisLabel returns "m/z" for Raw MS1/MS2 Heatmaps -- only the
DECONV heatmaps are "Monoisotopic Mass". Fix the two raw heatmaps y_label="m/z";
test now expects "m/z" for raw, "Monoisotopic Mass" for deconv.
…t ion)

Round 16: 8 units clean (X-residue + deconv precursor header verified). 2 findings:
3-heatmap-001 raw-heatmap y-label (fixed: "m/z"); 3-seqview-008 SequenceView TSG path
omits the full-length terminal fragment ion vs oracle (fix in progress).
Round 17: 9 units clean (terminal-ion + raw-heatmap fixes verified; internal-fragment
_terminal_collision_masses expansion confirmed oracle-faithful improvement, not a
regression). 3 findings: 3-seqview-009/010 (HIGH) SequenceView computes fragments on
the FULL protein not the proteoform sub-sequence (wrong masses/grid for truncated
proteoforms) + no undetermined-terminus suppression; 3-seqview-011 (low) docstring
accuracy. Proteoform-aware fragment-handling fix in progress.
…ng (3-seqview-009/010)

seq_tnt carries the full sequence + 0-based proteoform terminals; end-to-end
SequenceView-from-seq_tnt fragment grid == oracle sub-region (truncated + undetermined).
No FLASHApp source change needed (tnt.py already slices; render.py already wires
proteoform_start/end_column -- the Insight side now consumes them for fragments).
Wire interactivity on all 4 heatmap builders (oracle PlotlyHeatmap click ->
updateSelectedScan for all + updateSelectedMass for deconv MS1/MS2): deconv heatmaps
interactivity={scan:scan_idx, mass:mass_idx}; raw heatmaps interactivity={scan:scan_idx}.
The reused heatmap caches carry scan_idx(=scan_id)/mass_idx(=mass_in_scan) from
getMSSignalDF. conftest heatmap fixture extended with scan_idx/mass_idx; test asserts
the per-heatmap interactivity mapping.
First fully-clean round since round 11 -- the SequenceView interaction tail
(residue click, mass header, X-residue, terminal ion, proteoform sub-region,
ion-priority) + heatmap navigation are all resolved. All 53 findings raised across
rounds 5-18 marked resolved (each fixed + confirmed by the all-units-clean round 19).
Need rounds 20-21 clean to converge.
… rounds 19-20-21)

Rounds 19, 20, 21 each fully clean (all 11 Phase-3 units pass the three critics +
GREEN machine gate); 0 open findings. STATUS: CONVERGED.

Round 21 note: an environment restart mid-round killed 4 in-flight review agents;
they were re-launched and all confirmed CLEAN (schema 1/0, builders+deconv 2/0,
tnt+quant 2/0, insight 1/0).
…pin openms-insight==0.1.15

The migration replaced the local Vue component (openms-streamlit-vue-component) with
the openms-insight package, which ships its own Vue bundle. Per PHASE3_PLAN.md 5.5:
- requirements.txt: openms-insight 0.1.11 -> 0.1.15 (the published 0.1.14 is Phase-1&2
  WITHOUT the Phase-3 component features; 0.1.15 is the new Insight release carrying them).
- Dockerfile + Dockerfile.arm: remove the node:21 js-build stage that cloned+built
  openms-streamlit-vue-component and the COPY of its dist into /app/js-component/dist.
- Remove the openms-streamlit-vue-component git submodule (.gitmodules + gitlink).
The migrated app has no js-component references (verified); local tests unaffected.

REMAINING EXTERNAL STEP: publish openms-insight 0.1.15 to PyPI (maintainer release
action) so a clean pip install -r requirements.txt / Docker build resolves it.
…PyPI dep)

Add an insight-build stage (node:21) that clones the openms-insight branch
claude/kind-heisenberg-u6dVm, builds its Vue bundle (npm install && npm run build),
and syncs dist into openms_insight/js-component/dist. The compile-openms stage then
pip-installs that source tree (hatchling force-includes the bundled dist) BEFORE
requirements, so openms-insight==0.1.15 resolves from the branch instead of PyPI.
Makes the image self-contained without a PyPI publish. (Verified: a source build
produces a wheel that bundles assets/index.js.)
…esh-clone fix)

In a fresh clone openms_insight/js-component/ does not exist (it only ever held the
gitignored dist/), so cp had no parent dir and the build failed. mkdir -p it first.
Validated end-to-end against a fresh clone of the branch: clone -> npm build -> sync
-> pip wheel produces a wheel bundling the Vue dist (7 files incl. index.js).
@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai Bot commented Jun 5, 2026

Review Change Stack

📝 Walkthrough

Walkthrough

This PR completes Phase 3 of the FLASHApp migration by replacing legacy Vue/component builds and index-based rendering with OpenMS-Insight integration and a frozen, reusable linked-grid template. It introduces tidy-parquet schema adapters, a builder factory, a layout manager/renderer, viewer rewrites, Docker/CI integration for openms-insight, migration gates/harness, and comprehensive tests.

Changes

FLASHApp Phase 3 Migration

Layer / File(s) Summary
Docker: Build and Install OpenMS-Insight Package
Dockerfile, Dockerfile.arm
Adds a node:21 insight-build stage that clones the OpenMS-Insight migration branch, runs the JS build, injects the Vue dist into the Python package tree, and pip-installs the built package during image build; removes legacy local JS build/artifact copy steps.
Requirements & CI build step
requirements.txt, .github/workflows/unit-tests.yml
Pins openms-insight==0.1.15 and updates unit-tests workflow to set up Node 21, clone/build openms-insight, copy the built dist, install local package, then run pytest.
FileManager: Add Parquet Path Support for Insight
src/workflow/FileManager.py
Adds as_path option to get_results() (priority over pyarrow/polars branches) and result_path() helper to return on-disk Parquet paths for Insight data_path usage; guards against missing DB rows.
Schema Adapter: Transform Oracle Caches to Insight-Ready Tidy Parquet
src/render/schema.py
New adapter implementing Polars-based primitives to explode list/nested columns, mint stable IDs, normalize KDEs, and build idempotent tidy Parquet outputs per tool (flashdeconv, flashtnt, flashquant) behind build_insight_caches().
Render Builder Factory: Lazy Component Construction with Interactivity Wiring
src/render/render.py
Repurposes render module into make_builders() returning zero-arg component factories with curated table column definitions, _sequence_view wiring, value-based filters/interactivity, and namespaced cache IDs routing to tidy Parquet frames.
Frozen Grid Module: Reusable Linked-Grid Renderer and Layout Editor
src/view/grid.py
Adds render_linked_grid() to render nested component layouts using a shared StateManager for cross-panel interactivity and LayoutManager to edit/persist layouts with dependency validation and JSON import/export.
Streamlit Wrapper: show_linked_grid() for Multi-Experiment Rendering
src/common/common.py
Introduces show_linked_grid() to wrap render_linked_grid() and handle multi-experiment state scoping and side-by-side vs stacked rendering.
FLASHDeconv Viewer: Lazy Experiment Selection with Insight Rendering
content/FLASHDeconv/FLASHDeconvViewer.py
Rewrites viewer to use per-experiment selectbox values as stable dataset ids, lazily call build_insight_caches() and make_builders(), and render via show_linked_grid(); removes previous index/display-name mapping helpers.
FLASHQuant Viewer: Dataset Selection with Tidy Cache Building
content/FLASHQuant/FLASHQuantViewer.py
Replaces name→index session-state flow with id-based selection, gates on quant_dfs, lazily builds tidy caches, constructs builders, and renders the linked grid.
FLASHTnT Viewer: Per-Experiment Rendering with Best-Per-Spectrum Control
content/FLASHTnT/FLASHTnTViewer.py
Adds per-experiment _render_experiment() with "Best per spectrum" checkbox forwarded to make_builders(), lazily builds caches and uses show_linked_grid(); persists layout/side_by_side fields.
Migration Infrastructure: Phase 3 Planning, Verification, and Convergence
migration/README.md, migration/REVIEW.md, migration/specs/PHASE3_PLAN.md, migration/units.yaml, migration/nondivergence.py, migration/run_review.py
Adds Phase 3 plans/specs, units registry, nondivergence gate that compares normalized file hashes, and a convergence harness (run_review.py) that records gates/reviews and reports convergence across rounds.
Test Infrastructure: Synthetic Caches, Component Smoke Tests, and Schema Validation
tests/conftest.py, tests/test_render_builders.py, tests/test_render_schema.py
Adds Streamlit-mocking test fixtures and synthetic oracle caches; tests validate schema primitives, build_insight_caches idempotency/edge cases, make_builders() outputs, and render_linked_grid wiring and component payloads.
Legacy Stack Removal
src/render/StateTracker.py, src/render/components.py, src/render/initialize.py, src/render/update.py, src/render/util.py, tests/test_selection_clear.py, .gitmodules, openms-streamlit-vue-component
Removes prior StateTracker, component wrappers, initialization/update utilities, hashing helper, selection-clear tests, and deletes .gitmodules submodule entry referencing the old Vue component.

Sequence Diagram(s)

sequenceDiagram
  participant Page as Streamlit Page
  participant Grid as render_linked_grid
  participant Builder as make_builders factory
  participant State as StateManager

  Page->>Grid: call render_linked_grid(layout, builders, state_key)
  Grid->>State: create StateManager(state_key)
  Grid->>Builder: invoke factory() for each cell
  Builder->>Grid: return component instance
  Builder->>State: register filters/interactivity
  Grid->>Builder: component.render() -> Vue payload
Loading

Possibly Related PRs

  • OpenMS/FLASHApp#22: Related to the submodule reference for openms-streamlit-vue-component (both PRs touch that metadata).
  • OpenMS/FLASHApp#57: Related changes to viewer selection/rendering flows in FLASHDeconv and FLASHTnT; both refactor selection behavior.
  • OpenMS/FLASHApp#21: Prior work on viewer helpers that this PR ultimately removes or replaces in favor of Insight-linked-grid rendering.

Poem

🐰 I hopped through grids and tidy pails of data,
Swapped indices for values — now viewers cater.
Vue crumbs tucked away, Insight bakes the chart,
Layouts freeze, builders hum — the panels start.
Phase Three done, the rabbit nods with cheer.

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 58.74% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title accurately summarizes the main change: rebuilding FLASHApp viewers on OpenMS-Insight using a frozen template, which is the core objective of this Phase 3 migration PR.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch claude/kind-heisenberg-u6dVm

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 10

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
src/workflow/FileManager.py (1)

480-486: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Handle missing cache rows before iterating fetchone().

If dataset_id has no row in one of the cache tables, fetchone() returns None and both loops raise TypeError before partial/KeyError handling runs. That turns a normal cache miss into a hard crash and breaks the new result_path() contract.

Suggested fix
         result = self.cache_cursor.fetchone()
+        if result is None:
+            result = (None,) * len(file_columns)
         for c, r in zip(file_columns, result):
             if r is None:
                 if partial:
                     continue
                 else:
@@
         result = self.cache_cursor.fetchone()
+        if result is None:
+            result = (None,) * len(data_columns)
         for c, r in zip(data_columns, result):
             if r is None:
                 if partial:
                     continue
                 else:

Also applies to: 498-504

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/workflow/FileManager.py` around lines 480 - 486, The SELECT followed by
fetchone() can return None for a cache miss, so before zipping over file_columns
and result ensure you check "result is None" and treat it as a cache miss (do
not iterate). For both occurrences (the block using self.cache_cursor,
file_columns and dataset_id starting at the shown snippet and the similar block
at 498-504) either raise KeyError(dataset_id) so upstream partial/KeyError
handling runs, or return the missing-value expected by result_path(); do this
check immediately after result = self.cache_cursor.fetchone() to avoid the
TypeError.
🧹 Nitpick comments (1)
migration/nondivergence.py (1)

56-57: ⚡ Quick win

Resolve nondivergence pair paths from a fixed base, not process CWD.

Line 56-Line 57 currently makes relative paths caller-dependent. Resolve relative entries from repo/config root so checks are deterministic across runners.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@migration/nondivergence.py` around lines 56 - 57, Current code creates
Path(pair[0]) and Path(pair[1]) which resolves relative paths against the
process CWD; change this to resolve against the repository/config root instead.
Modify the logic around a, b (and subsequent ea, eb) so you first compute a
stable base Path (e.g., repo_root or config_root) and then construct a =
base.joinpath(pair[0]) and b = base.joinpath(pair[1]) (or otherwise resolve pair
entries against that base) so existence checks ea, eb are deterministic across
runners; update any uses of a and b accordingly.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@content/FLASHDeconv/FLASHDeconvViewer.py`:
- Around line 47-49: The current mapping using to_id =
{file_manager.get_display_name(r): r for r in results} is lossy when display
names are duplicated; update the lookup to preserve all datasets by either (a)
keying by a unique identifier (use the raw dataset id from each result instead
of file_manager.get_display_name) where the UI can still show display names, or
(b) build a multimap from display name to a list of result ids (e.g., to_id:
dict[str, list] and append each r) so duplicates are retained; apply the same
change to the other mapping logic around the block handling lines 57–64 to
ensure no dataset is overwritten or lost.
- Line 67: The tool key passed to show_linked_grid is not panel-scoped and can
collide when two panels use the same ds; update the call to generate a unique
panel-scoped tool string (e.g., include a panel identifier such as an explicit
panel index, a unique id derived from exp_layout or builders, or a UUID) and
pass that as the tool argument to show_linked_grid (replace the current
f"flashdeconv_{ds}" with something like f"flashdeconv_{panel_id}_{ds}" or
similar) so the linked-grid state is isolated per panel; change the code that
constructs the tool key where show_linked_grid([exp_layout], builders, tool=...)
is invoked.

In `@content/FLASHQuant/FLASHQuantViewer.py`:
- Around line 29-30: The current code builds names =
[file_manager.get_display_name(r) for r in results] and to_id =
{file_manager.get_display_name(r): r for r in results}, which loses datasets
when display names collide; update the mapping and selection logic to use a
unique key instead of the display name (or map a display name to a list of
results). For example, keep a list of tuples (display_name, result) or build
to_id using a unique identifier from each result (e.g., r.id or r.path) while
still presenting file_manager.get_display_name(r) in the UI, and change the
lookup code that currently uses the display name (the selection at Line 38 and
the surrounding block that spans the same section) to resolve selection via that
unique identifier or disambiguate by choosing the correct entry from the list of
results for that display name.

In `@content/FLASHTnT/FLASHTnTViewer.py`:
- Around line 42-43: The code builds names = [file_manager.get_display_name(r)
for r in results] and to_id = {file_manager.get_display_name(r): r for r in
results}, which will drop entries when multiple results share the same display
name; change the mapping so resolution is based on a unique identifier instead
of display name (e.g., use r.id or r.path) or map display names to lists of
results: create to_id_by_unique = {r.id: r for r in results} and keep names as
display labels, or build to_id = defaultdict(list) and append each r under
file_manager.get_display_name(r) so duplicates are preserved; then update any
lookup code that uses to_id (the lookup at or around the code that resolves
selections) to use the unique-id map or to handle list values accordingly (e.g.,
disambiguate by id or present multiple choices).
- Line 76: The linked-grid tool key is currently only namespaced by dataset id
via tool=f"flashtnt_{ds}", which allows two side-by-side panels selecting the
same dataset to collide; update the call to show_linked_grid to include the
panel index (e.g., tool=f"flashtnt_{ds}_{panel_idx}" or similar) so the key is
unique per panel (locate the show_linked_grid([exp_layout], builders, tool=...)
invocation and add the panel index variable used when rendering panels).

In `@migration/units.yaml`:
- Around line 8-14: Replace hard-coded absolute paths in migration/units.yaml
(oracle_root, template_repo, nondivergence_pairs) with repo-relative or
environment-parameterized tokens (e.g., use a ${REPO_ROOT} or ${TEMPLATE_REPO}
placeholder) and update the code that reads units.yaml to expand those tokens at
runtime (resolve via an env var or compute repo root with os.getcwd()/git
rev-parse and join paths). Specifically change the values for oracle_root and
template_repo and the paths inside nondivergence_pairs to use placeholders, and
update the loader/validator that reads units.yaml to perform
environment-variable expansion and path normalization so CI and other machines
can resolve the locations correctly.

In `@src/render/render.py`:
- Around line 624-626: The sequence_view factory is always registered and
immediately triggers result_path() for seq_deconv/seq_tnt even if
build_insight_caches skipped them, causing cache misses; update the registration
site to first check for the presence of the sequence cache(s) before adding the
"sequence_view" entry (e.g. query the cache mapping returned by
build_insight_caches or call file_manager.has_cache/exists for the
seq_deconv/seq_tnt result paths) and only assign "sequence_view": lambda:
_sequence_view(...) when that check succeeds so datasets without sequence data
won't construct the panel.

In `@src/render/schema.py`:
- Around line 551-555: The lambda passed to pl.col("Scan").map_elements() only
checks for s is not None, so NaN/pd.NA still reach int(s) and crash; update the
guard to treat all missing/null-like values as missing (return -1) by using a
null check that covers NaN/pd.NA (e.g., pd.isna(s) or math.isnan where
appropriate) before calling int(s), keeping the rest of the expression
(scan_to_deconv lookup, alias("scan_id"), return_dtype=pl.Int64) unchanged so
missing scans yield -1 and valid scans are int-cast and mapped via
scan_to_deconv.

In `@src/view/grid.py`:
- Around line 243-266: The upload flow needs to validate JSON and avoid index
errors in expand: wrap json.load(uploaded) in a try/except to catch
JSONDecodeError and return/raise a controlled validation error, verify the
decoded object is the expected list-of-lists structure before calling expand,
and inside Grid.expand change the name-lookup to a safe pattern (e.g., check
membership of col in self.component_names or use a dict mapping name->option and
skip unknown IDs) rather than calling self.component_names.index(col) which
raises ValueError; ensure expand returns a consistent validated structure (or
raises a clear validation error) instead of letting runtime exceptions
propagate.

---

Outside diff comments:
In `@src/workflow/FileManager.py`:
- Around line 480-486: The SELECT followed by fetchone() can return None for a
cache miss, so before zipping over file_columns and result ensure you check
"result is None" and treat it as a cache miss (do not iterate). For both
occurrences (the block using self.cache_cursor, file_columns and dataset_id
starting at the shown snippet and the similar block at 498-504) either raise
KeyError(dataset_id) so upstream partial/KeyError handling runs, or return the
missing-value expected by result_path(); do this check immediately after result
= self.cache_cursor.fetchone() to avoid the TypeError.

---

Nitpick comments:
In `@migration/nondivergence.py`:
- Around line 56-57: Current code creates Path(pair[0]) and Path(pair[1]) which
resolves relative paths against the process CWD; change this to resolve against
the repository/config root instead. Modify the logic around a, b (and subsequent
ea, eb) so you first compute a stable base Path (e.g., repo_root or config_root)
and then construct a = base.joinpath(pair[0]) and b = base.joinpath(pair[1]) (or
otherwise resolve pair entries against that base) so existence checks ea, eb are
deterministic across runners; update any uses of a and b accordingly.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 9ef92ede-ee8d-47e5-83d9-a18e8557d9c0

📥 Commits

Reviewing files that changed from the base of the PR and between a89427d and de4b7d5.

📒 Files selected for processing (31)
  • .gitmodules
  • Dockerfile
  • Dockerfile.arm
  • content/FLASHDeconv/FLASHDeconvViewer.py
  • content/FLASHQuant/FLASHQuantViewer.py
  • content/FLASHTnT/FLASHTnTViewer.py
  • migration/README.md
  • migration/REVIEW.md
  • migration/nondivergence.py
  • migration/review-log/phase-3.jsonl
  • migration/run_review.py
  • migration/specs/.gitkeep
  • migration/specs/PHASE3_PLAN.md
  • migration/units.yaml
  • openms-streamlit-vue-component
  • requirements.txt
  • src/common/common.py
  • src/render/StateTracker.py
  • src/render/components.py
  • src/render/initialize.py
  • src/render/render.py
  • src/render/schema.py
  • src/render/update.py
  • src/render/util.py
  • src/view/__init__.py
  • src/view/grid.py
  • src/workflow/FileManager.py
  • tests/conftest.py
  • tests/test_render_builders.py
  • tests/test_render_schema.py
  • tests/test_selection_clear.py
💤 Files with no reviewable changes (8)
  • src/render/StateTracker.py
  • .gitmodules
  • tests/test_selection_clear.py
  • src/render/util.py
  • src/render/initialize.py
  • src/render/components.py
  • src/render/update.py
  • openms-streamlit-vue-component

Comment thread content/FLASHDeconv/FLASHDeconvViewer.py Outdated
Comment thread content/FLASHDeconv/FLASHDeconvViewer.py Outdated
Comment thread content/FLASHQuant/FLASHQuantViewer.py Outdated
Comment thread content/FLASHTnT/FLASHTnTViewer.py Outdated
Comment thread content/FLASHTnT/FLASHTnTViewer.py Outdated
Comment thread migration/units.yaml
Comment on lines +8 to +14
oracle_root: /home/user/FLASHApp/src/render
template_repo: /home/user/streamlit-template
# Pairs checked by nondivergence.py: FLASHApp must reuse the template module verbatim.
# Filled in once the template grid module is frozen (Phase 3 step 2).
nondivergence_pairs:
- [/home/user/FLASHApp/src/view/grid.py, /home/user/streamlit-template/src/view/grid.py]

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | 🏗️ Heavy lift

Hard-coded /home/user/... paths make the phase harness non-portable.

This config will fail on any machine/CI runner that doesn’t mirror that exact filesystem layout. Please switch targets/oracles/cwd entries to repo-relative or env-parameterized paths and resolve them in the scripts.

Also applies to: 22-50, 55-119

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@migration/units.yaml` around lines 8 - 14, Replace hard-coded absolute paths
in migration/units.yaml (oracle_root, template_repo, nondivergence_pairs) with
repo-relative or environment-parameterized tokens (e.g., use a ${REPO_ROOT} or
${TEMPLATE_REPO} placeholder) and update the code that reads units.yaml to
expand those tokens at runtime (resolve via an env var or compute repo root with
os.getcwd()/git rev-parse and join paths). Specifically change the values for
oracle_root and template_repo and the paths inside nondivergence_pairs to use
placeholders, and update the loader/validator that reads units.yaml to perform
environment-variable expansion and path normalization so CI and other machines
can resolve the locations correctly.

Comment thread src/common/common.py
Comment on lines +965 to +1011
def show_linked_grid(
layout,
builders,
*,
tool,
side_by_side=False,
grid_key="linked_grid",
height=None,
column_heights=None,
):
"""Render an N-experiment OpenMS-Insight linked grid.

Thin convenience over ``src.view.grid.render_linked_grid`` (the frozen,
tool-agnostic grid) that handles the multi-experiment + side-by-side page
concern. ``layout`` is ``List[experiment]``; each experiment is the nested
rows list consumed by ``render_linked_grid``. One independent
``StateManager`` is created per experiment (``session_key=f"{tool}__exp{i}"``)
so experiments never cross-link. When exactly two experiments and
``side_by_side=True`` they render in two ``st.columns``; otherwise they stack
with ``st.divider()``.

Experiment *selection* (the per-experiment ``st.selectbox``) stays in the
viewer page because it is tool/data specific; this helper only owns the
grid + side-by-side rendering.
"""
from src.view.grid import render_linked_grid

def _one(exp_idx, exp_layout, container):
with container:
render_linked_grid(
exp_layout,
builders,
state_key=f"{tool}__exp{exp_idx}",
grid_key=f"{grid_key}_{exp_idx}",
height=height,
column_heights=column_heights,
)

if len(layout) == 2 and side_by_side:
c1, c2 = st.columns(2)
_one(0, layout[0], c1)
_one(1, layout[1], c2)
else:
for i, exp_layout in enumerate(layout):
if i:
st.divider()
_one(i, exp_layout, st.container())
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | 🏗️ Heavy lift

Make experiment rendering input/state explicitly per-experiment.

This helper currently applies one builders map to every experiment and uses state_key=f"{tool}__exp{idx}". That can leak stale selections across dataset switches and prevents clean per-dataset isolation when experiments differ.

💡 Suggested API shape
 def show_linked_grid(
     layout,
-    builders,
+    builders,
     *,
     tool,
+    state_keys=None,
     side_by_side=False,
@@
 ):
@@
-    def _one(exp_idx, exp_layout, container):
+    def _one(exp_idx, exp_layout, container):
+        exp_builders = (
+            builders[exp_idx] if isinstance(builders, (list, tuple)) else builders
+        )
+        exp_state_key = (
+            state_keys[exp_idx]
+            if state_keys is not None
+            else f"{tool}__exp{exp_idx}"
+        )
         with container:
             render_linked_grid(
                 exp_layout,
-                builders,
-                state_key=f"{tool}__exp{exp_idx}",
+                exp_builders,
+                state_key=exp_state_key,
                 grid_key=f"{grid_key}_{exp_idx}",
                 height=height,
                 column_heights=column_heights,
             )

Comment thread src/render/render.py Outdated
Comment thread src/render/schema.py
Comment thread src/view/grid.py
Comment on lines +243 to +266
def expand(self, trimmed: list, drop_empty_experiments: bool = True) -> list:
"""internal names -> labels, dropping empty cells/rows.

``drop_empty_experiments`` (default True, the edit-mode behavior) also drops
a wholly-empty experiment. The upload path passes False to match the oracle
``handleSettingButtons``, whose inline expand keeps an empty experiment as a
``[]`` stub so ``num_experiments`` stays ``len(uploaded)`` and the
reset-on-count-mismatch never fires (which would wipe the upload).
"""
expanded = []
for exp in trimmed:
rows = []
for row in exp:
cols = []
for col in row:
if col:
cols.append(
self.component_options[self.component_names.index(col)]
)
if cols:
rows.append(cols)
if rows or not drop_empty_experiments:
expanded.append(rows)
return expanded
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Harden uploaded layout handling to avoid runtime crashes.

json.load(uploaded) and the expand() name lookup path can throw on invalid JSON or unknown/stale component IDs, which currently crashes the layout page instead of surfacing a validation error.

💡 Suggested fix
 def _handle_setting_buttons(self) -> None:
@@
-        if uploaded is not None:
-            uploaded_layout = json.load(uploaded)
+        if uploaded is not None:
+            try:
+                uploaded_layout = json.load(uploaded)
+            except json.JSONDecodeError:
+                st.session_state[self._k("component_error")] = "Invalid JSON layout file"
+                return
+            if not isinstance(uploaded_layout, list):
+                st.session_state[self._k("component_error")] = "Invalid layout format"
+                return
@@
-                st.session_state[self._k("layout")] = self.expand(
+                st.session_state[self._k("layout")] = self.expand(
                     uploaded_layout, drop_empty_experiments=False
                 )
 def expand(self, trimmed: list, drop_empty_experiments: bool = True) -> list:
@@
                 for col in row:
                     if col:
-                        cols.append(
-                            self.component_options[self.component_names.index(col)]
-                        )
+                        if col not in self.component_names:
+                            st.session_state[self._k("component_error")] = (
+                                f"Unknown component in saved layout: {col}"
+                            )
+                            continue
+                        cols.append(
+                            self.component_options[self.component_names.index(col)]
+                        )

Also applies to: 423-438

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/view/grid.py` around lines 243 - 266, The upload flow needs to validate
JSON and avoid index errors in expand: wrap json.load(uploaded) in a try/except
to catch JSONDecodeError and return/raise a controlled validation error, verify
the decoded object is the expected list-of-lists structure before calling
expand, and inside Grid.expand change the name-lookup to a safe pattern (e.g.,
check membership of col in self.component_names or use a dict mapping
name->option and skip unknown IDs) rather than calling
self.component_names.index(col) which raises ValueError; ensure expand returns a
consistent validated structure (or raises a clear validation error) instead of
letting runtime exceptions propagate.

claude added 2 commits June 5, 2026 12:02
…ache-miss guards

- Viewers: select experiments by stable dataset id (display name via format_func)
  so duplicate display names can't collapse distinct datasets; scope linked-grid
  tool keys by experiment index to isolate side-by-side panel state.
- render.make_builders: only register sequence_view when its seq_deconv/seq_tnt
  cache exists, so datasets without sequence data don't hit a cache-miss crash.
- schema._build_tags: treat NaN/None/pd.NA tag scans as missing (scan_id = -1)
  instead of crashing in int().
- FileManager.get_results: handle a missing cache row (fetchone() is None) as a
  normal cache miss instead of raising TypeError.
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
src/workflow/FileManager.py (1)

535-542: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

result_path() should enforce parquet-only contract.

result_path() is documented/typed to return a parquet path string, but it currently forwards whatever get_results(..., as_path=True) returns. For non-.pq tags, that can be a loaded Python object, which breaks the method contract and can cause downstream data_path= misuse.

Proposed fix
 def result_path(self, dataset_id: str, name_tag: str) -> str:
@@
-    return self.get_results(dataset_id, [name_tag], as_path=True)[name_tag]
+    value = self.get_results(dataset_id, [name_tag], as_path=True)[name_tag]
+    if not isinstance(value, str):
+        raise KeyError(
+            f"{name_tag} for {dataset_id} is not a parquet-backed result path"
+        )
+    return value
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/workflow/FileManager.py` around lines 535 - 542, The result_path method
currently returns whatever get_results(..., as_path=True)[name_tag] yields,
which can be a non-path object; change result_path to fetch value =
self.get_results(dataset_id, [name_tag], as_path=True)[name_tag], then validate
that value is a string path ending with ".pq" (and keep letting KeyError bubble
if the tag is missing); if the value is not a string or does not end with ".pq"
raise a TypeError with a clear message indicating result_path enforces a parquet
path contract. Ensure references inside the method remain to result_path and
get_results so the behavior is localized.
.github/workflows/unit-tests.yml (1)

44-49: ⚠️ Potential issue | 🟠 Major

Exclude openms-insight from the subsequent requirements.txt install.

requirements.txt includes openms-insight==0.1.15 (line 62), but the workflow already installs the built local /tmp/openms-insight. Leaving the pin in place can still make pip install -r requirements.txt try to resolve openms-insight again (despite it requiring the prebuilt/mirrored js-component/dist to be installable in this repo).

Suggested fix
-          pip install -r requirements.txt
+          grep -vi 'openms-insight' requirements.txt > /tmp/requirements-no-insight.txt
+          pip install -r /tmp/requirements-no-insight.txt
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In @.github/workflows/unit-tests.yml around lines 44 - 49, The workflow
currently installs the local wheel with "pip install /tmp/openms-insight" and
then re-runs "pip install -r requirements.txt" which contains a pinned
openms-insight entry; modify the job in .github/workflows/unit-tests.yml so the
requirements install does not attempt to reinstall openms-insight — e.g.,
generate or use a filtered requirements file before running "pip install -r
requirements.txt" or run pip install with a constraint/exclude to skip
"openms-insight==0.1.15"; ensure the change targets the sequence around the
existing "pip install /tmp/openms-insight" and "pip install -r requirements.txt"
commands so only the local package is installed.

Source: Learnings

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In @.github/workflows/unit-tests.yml:
- Around line 27-30: The workflow is using the mutable tag actions/setup-node@v4
which weakens CI supply-chain security; update the uses line in
.github/workflows/unit-tests.yml to pin to a specific commit SHA for
actions/setup-node (replace actions/setup-node@v4 with
actions/setup-node@<full-commit-sha>), keeping the existing node-version setting
intact and using the commit SHA from the official actions/setup-node repository
release you trust.
- Around line 38-39: The CI job currently clones the mutable branch reference
"claude/kind-heisenberg-u6dVm" via the git clone command and can drift; change
the workflow to fetch and checkout a specific immutable commit SHA instead of
relying on -b branch (or add an explicit git checkout <COMMIT_SHA> after clone),
and apply the same hardening to the Dockerfile/Dockerfile.arm flow that consumes
insight-ref.json by resolving and pinning the referenced commit SHA before
building; update references to the git clone invocation and the insight-ref.json
fetch so all checkouts use a concrete commit SHA rather than a branch name.

---

Outside diff comments:
In @.github/workflows/unit-tests.yml:
- Around line 44-49: The workflow currently installs the local wheel with "pip
install /tmp/openms-insight" and then re-runs "pip install -r requirements.txt"
which contains a pinned openms-insight entry; modify the job in
.github/workflows/unit-tests.yml so the requirements install does not attempt to
reinstall openms-insight — e.g., generate or use a filtered requirements file
before running "pip install -r requirements.txt" or run pip install with a
constraint/exclude to skip "openms-insight==0.1.15"; ensure the change targets
the sequence around the existing "pip install /tmp/openms-insight" and "pip
install -r requirements.txt" commands so only the local package is installed.

In `@src/workflow/FileManager.py`:
- Around line 535-542: The result_path method currently returns whatever
get_results(..., as_path=True)[name_tag] yields, which can be a non-path object;
change result_path to fetch value = self.get_results(dataset_id, [name_tag],
as_path=True)[name_tag], then validate that value is a string path ending with
".pq" (and keep letting KeyError bubble if the tag is missing); if the value is
not a string or does not end with ".pq" raise a TypeError with a clear message
indicating result_path enforces a parquet path contract. Ensure references
inside the method remain to result_path and get_results so the behavior is
localized.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 5cf58efb-9598-46e6-8c10-83db94c2b574

📥 Commits

Reviewing files that changed from the base of the PR and between de4b7d5 and 94ed99a.

📒 Files selected for processing (7)
  • .github/workflows/unit-tests.yml
  • content/FLASHDeconv/FLASHDeconvViewer.py
  • content/FLASHQuant/FLASHQuantViewer.py
  • content/FLASHTnT/FLASHTnTViewer.py
  • src/render/render.py
  • src/render/schema.py
  • src/workflow/FileManager.py
🚧 Files skipped from review as they are similar to previous changes (5)
  • content/FLASHQuant/FLASHQuantViewer.py
  • content/FLASHTnT/FLASHTnTViewer.py
  • content/FLASHDeconv/FLASHDeconvViewer.py
  • src/render/render.py
  • src/render/schema.py

Comment on lines +27 to +30
- name: Set up Node
uses: actions/setup-node@v4
with:
node-version: "21"
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Pin actions/setup-node to a full commit SHA.

.github/workflows/unit-tests.yml uses the mutable action tag actions/setup-node@v4, which weakens CI supply-chain integrity—pin it to a specific commit SHA instead of v4.

🧰 Tools
🪛 zizmor (1.25.2)

[error] 28-28: unpinned action reference (unpinned-uses): action is not pinned to a hash (required by blanket policy)

(unpinned-uses)

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In @.github/workflows/unit-tests.yml around lines 27 - 30, The workflow is using
the mutable tag actions/setup-node@v4 which weakens CI supply-chain security;
update the uses line in .github/workflows/unit-tests.yml to pin to a specific
commit SHA for actions/setup-node (replace actions/setup-node@v4 with
actions/setup-node@<full-commit-sha>), keeping the existing node-version setting
intact and using the commit SHA from the official actions/setup-node repository
release you trust.

Comment on lines +38 to +39
git clone -b claude/kind-heisenberg-u6dVm --single-branch --depth 1 \
https://github.com/t0mdavid-m/openms-insight.git /tmp/openms-insight
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
set -euo pipefail

# Locate and print the relevant section of the workflow around the cited lines
FILE=".github/workflows/unit-tests.yml"
if [ ! -f "$FILE" ]; then
  echo "Missing $FILE"
  exit 1
fi

nl -ba "$FILE" | sed -n '1,120p'
echo "----"
nl -ba "$FILE" | sed -n '20,80p'
echo "----"

# Also search for any other openms-insight clone/install logic in the repo
rg -n "openms-insight|/tmp/openms-insight|npm run build|js-component/dist|requirements\.txt" .github/workflows/unit-tests.yml .claude/hooks/session-start.sh Dockerfile* . || true

Repository: OpenMS/FLASHApp

Length of output: 103


🏁 Script executed:

#!/bin/bash
set -euo pipefail

FILE=".github/workflows/unit-tests.yml"
if [ ! -f "$FILE" ]; then
  echo "Missing $FILE"
  exit 1
fi

echo "---- $FILE (lines ~20-80) ----"
awk 'NR>=20 && NR<=80 {printf "%5d %s\n", NR, $0}' "$FILE"

echo "---- matches for openms-insight related logic ----"
rg -n "openms-insight|/tmp/openms-insight|js-component/dist|npm run build|requirements\.txt" .github/workflows .claude/hooks Dockerfile* 2>/dev/null || true

echo "---- any checkout/pin patterns in the workflow ----"
rg -n "git clone|checkout .*sha|checkout --detach|fetch --depth|--depth 1" .github/workflows/unit-tests.yml 2>/dev/null || true

Repository: OpenMS/FLASHApp

Length of output: 5956


🏁 Script executed:

#!/bin/bash
set -euo pipefail

for f in Dockerfile Dockerfile.arm; do
  echo "===== $f (around insight clone) ====="
  [ -f "$f" ] || continue
  # print likely relevant region
  awk 'NR>=60 && NR<=160 {printf "%5d %s\n", NR, $0}' "$f"
  echo
done

echo "===== search for insight-ref.json usage ====="
rg -n "insight-ref\.json|INSIGHT_BRANCH|openms-insight" Dockerfile Dockerfile.arm

Repository: OpenMS/FLASHApp

Length of output: 12567


🏁 Script executed:

#!/bin/bash
set -euo pipefail

rg -n "insight-ref\.json|insight-ref|INSIGHT_BRANCH=.*|git.*checkout.*openms-insight" -S . \
  || true

# Specifically ensure there is no follow-up that extracts a SHA from insight-ref.json
rg -n "jq |\.object\.sha|\"sha\"|grep .*insight-ref\.json|cat .*insight-ref\.json" Dockerfile Dockerfile.arm .github/workflows .claude/hooks 2>/dev/null || true

# List any other workflows that clone openms-insight
rg -n "openms-insight\.git|/tmp/openms-insight|openms_insight/js-component/dist" .github/workflows .claude/hooks Dockerfile* 2>/dev/null || true

Repository: OpenMS/FLASHApp

Length of output: 2552


Use an immutable openms-insight revision in CI checkout.

.github/workflows/unit-tests.yml clones claude/kind-heisenberg-u6dVm via git clone -b ... --depth 1 with no commit SHA pinning, so CI results can silently drift as the branch moves. The same mutable-branch pattern is also used in Dockerfile/Dockerfile.arm (the insight-ref.json fetch is not followed by a SHA checkout).

Suggested hardening
-          git clone -b claude/kind-heisenberg-u6dVm --single-branch --depth 1 \
-            https://github.com/t0mdavid-m/openms-insight.git /tmp/openms-insight
+          git clone https://github.com/t0mdavid-m/openms-insight.git /tmp/openms-insight
+          git -C /tmp/openms-insight checkout <pinned-openms-insight-commit-sha>
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In @.github/workflows/unit-tests.yml around lines 38 - 39, The CI job currently
clones the mutable branch reference "claude/kind-heisenberg-u6dVm" via the git
clone command and can drift; change the workflow to fetch and checkout a
specific immutable commit SHA instead of relying on -b branch (or add an
explicit git checkout <COMMIT_SHA> after clone), and apply the same hardening to
the Dockerfile/Dockerfile.arm flow that consumes insight-ref.json by resolving
and pinning the referenced commit SHA before building; update references to the
git clone invocation and the insight-ref.json fetch so all checkouts use a
concrete commit SHA rather than a branch name.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants