Phase 3: Rebuild FLASHApp viewers on OpenMS-Insight via frozen template#93
Phase 3: Rebuild FLASHApp viewers on OpenMS-Insight via frozen template#93t0mdavid-m wants to merge 53 commits into
Conversation
Scaffolds Phase-3 tracking: rebuild FLASHApp viewer pages on OpenMS-Insight via a reusable, frozen streamlit-template grid (no FLASHApp fork). No app code changes. - units.yaml: 10 units (template build+freeze, then FLASHApp rebuild) with oracles. - run_review.py: shared record/gate/report convergence driver. - nondivergence.py: asserts FLASHApp grid code == frozen template (normalized hash). - review-log/ ledger, REVIEW.md rollup, README.md. https://claude.ai/code/session_017kD4FyAsNvW6VFTZwVvSne
… non-divergence) Concrete spec from the planning fan-out: streamlit-template src/view/grid.py (render_linked_grid + LayoutManager), FileManager data layer, show_linked_grid, visualization_template demo; FLASHApp src/render/schema.py (tidy parquet, stable IDs); builders factory + StateManager per (tool,experiment); 3 viewer rebuilds; index->value selection map; non-divergence via vendored grid.py == frozen template grid.py. https://claude.ai/code/session_017kD4FyAsNvW6VFTZwVvSne MSG
- src/view/grid.py: byte-identical vendored copy of the frozen streamlit-template src/view/grid.py (the render_linked_grid + LayoutManager SSOT). Plain copy synced from the template, NOT a git submodule. - src/view/__init__.py: package marker. - migration/units.yaml: register nondivergence_pairs = [FLASHApp/src/view/grid.py, streamlit-template/src/view/grid.py] so nondivergence.py enforces byte-identical (normalized-hash) equality of the two side-by-side checkouts. nondivergence GREEN (vendored == frozen template). https://claude.ai/code/session_017kD4FyAsNvW6VFTZwVvSne
…rozen template grid
- src/render/schema.py (NEW): build_insight_caches(file_manager, dataset_id, tool) ->
Insight-ready tidy parquet (stable scan_id/mass_id/peak_id/protein_id/tag_id/feature_id;
exploded SignalPeaks/NoisyPeaks; comma-split quant traces; long-format target/decoy KDE).
- src/render/render.py: make_builders(file_manager, dataset_id, tool) factory (comp_name ->
zero-arg -> BaseComponent), cache_id=f"{tool}__{dataset_id}__{comp}", data_path=result_path;
index->value selection map (scanIndex->scan_id, massIndex->mass_in_scan/mass_id,
proteinIndex+proteoform_scan_map->protein_id filter, StateTracker->StateManager).
- 3 viewer pages shrink to page_setup -> pick experiment -> load layout -> show_linked_grid
(imports render_linked_grid/LayoutManager from the vendored src/view/grid.py, unchanged).
- src/common/common.py: show_linked_grid; FileManager: get_results(as_path)/result_path.
- DELETED dead layer: components.py, initialize.py, update.py, StateTracker.py, util.py.
- requirements.txt: + openms-insight.
- tests/: conftest + test_render_schema (9) + test_render_builders (8).
- Known seams (for review loop): Plot3D x/y/z config not forwarded through Insight's
data_path subprocess (worked around with in-process data= for the small Plot3D frames);
FLASHTnT SequenceView peaks not scan->protein remapped on protein selection.
Verified: pytest 45 passed/2 skipped, parse OK, nondivergence GREEN (grid.py untouched).
https://claude.ai/code/session_017kD4FyAsNvW6VFTZwVvSne
Replace the Phase-0 ast.parse placeholders with: nondivergence + template pytest (74, incl. test_view_grid construct-smoke) + FLASHApp pytest (45, incl. render schema/builders construct-smoke) + viewer-parse. Real machine gate for the review loop. https://claude.ai/code/session_017kD4FyAsNvW6VFTZwVvSne
Resolves the high/med round-1 review findings on the FLASHApp rebuild: schema.py: - _explode_list_cols: drop null/empty list cells before exploding (no more phantom null rows for empty spectra / zero-mass scans). - deconv_spectrum_tidy: alias MonoMass->mass (SequenceView requires a 'mass' column) and expose the per-scan ordinal as mass_in_scan (the oracle massIndex space the 3D / mass-table share) instead of dropping it. - proteins: denormalize scan_id (=deconv_index via build_proteoform_scan_map) so a protein-row click resolves to its scan (value-based proteoform_scan_map). - tags: key on scan_id (not a collapsed last-wins protein_id) so every tag on the selected proteoform's scan shows for ANY proteoform on that scan (oracle filtered by Scan). render.py: - mass_table / deconv_spectrum: 'mass' selection = mass_in_scan (per-scan ordinal the 3D consumes), not a global id; deconv x_column -> mass. - anno_spectrum: remove mass interactivity (raw-m/z click never matched the deconvolved MonoMass in the oracle, so it selected nothing). - protein_table: click sets BOTH protein and scan; tag_table + augmented spectrum + sequence-view peaks follow via scan; combined_spectrum filters by scan (was an unset 'spectrum' slot -> blank). - SequenceView (tnt): filters by protein (sequence) AND scan (peaks). viewers: blank-until-pick experiment selector (oracle parity; also avoids eager cache builds on page load). tests updated to assert the parity-correct wiring + a null-guard regression.
- combined_spectrum: resolve the tag-table's scalar tag_id to the tag's
fragment masses + sequence via the tags frame (FLASHTnT only; FLASHDeconv
has no tags frame so the resolve kwargs are omitted).
- sequence_view (tnt): publish residue clicks as the 'aa' selection the
tagger consumes for the tag-relative selectedAA (gold highlight).
- test: the tagger resolves tag_id -> {sequence, masses, selectedAA}, tag/aa
are re-render dependencies, and FLASHDeconv has no tag resolution.
Resolves the last round-1 finding (3-tnt-003, tagger overlay dead).
…rowing
Resolves the round-2 review findings:
- 3-builders-003 (high): 3D Precursor Signals showed nothing until a mass was
clicked because mass was a hard filter. Now optional_filters=["mass"] so the
scan's full Signal/Noisy scatter shows immediately, narrowing to one mass only
when selected (oracle parity).
- 3-quant-001 (med): quant feature-trace 3D axes were swapped/mislabeled. Now
x=m/z, y=retention time, z=intensity with explicit labels (oracle
FLASHQuantView), title 'Feature group signals'.
- 3-tnt-005 (low): clicking a sequence residue now narrows the tag table to tags
spanning it via interval_filters={"aa": ("StartPos", "EndPos")}, reusing the
SequenceView 'aa' selection (oracle TabulatorTagTable secondary filter).
migration: gate now also runs the OpenMS-Insight gate (pytest/build/vitest/
parity) since Phase 3 re-opened Insight; insight:tagger-seqview tracked as a
no-regression unit. Tests added for all three fixes.
…utive clean round
…utive clean round
…or grid - 3-schema-004 (high): the Precursor Signals 3D plotted raw m/z on a 'Mass' axis. The oracle (get3DplotInputFromSNRPeaks) uses x = mz*charge; add a precomputed 'mass' column to precursor_signals and point Plot3D x_column there (Plot3D's default x_label 'Mass' matches the oracle axis title). Charge states no longer collapse to their m/z positions. - re-vendor src/view/grid.py from the frozen template (empty-experiment upload fix); nondivergence GREEN. - tests: precursor_signals carries mass==mz*charge (explode unit + integration); 3D builder x/y/z = mass/charge/intensity with label 'Mass'. Also records round-5 review (9 clean, 2 findings: 3-schema-004, 3-grid-003).
…ution lines Resolves round-6 review findings: - 3-builders-005 (med): revert the 3D Precursor Signals optional_filters=[mass]. The oracle frontend (getPrecursorSignal) renders EMPTY when no mass is selected (the precursor-scan lookup fails on the scan-filtered per_scan_data); it only draws SignalPeaks[mass_index] once a mass is chosen. mass is again a REQUIRED filter. (Round-2's 3-builders-003 misread update.py's data-prep as the displayed behavior.) - 3-builders-004 (low): pass oracle axis titles to the plain spectra + heatmaps (Monoisotopic Mass / m/z / Intensity / Retention Time) instead of raw column names. - 3-quant-002 (med): quant feature-trace 3D uses stem=False so each charge is one connected elution line (oracle FLASHQuantView mode:lines), not per-point spikes. - 3-schema-006 (nit): correct the anno is_signal test comment. tests: axis-title coverage; quant stem=False; corrected is_signal comment. (OpenMS-Insight tagger config-key leak 3-tnt-006 fixed in the Insight commit.)
…ch-oracle-chrome)
Per the chosen full-presentation-parity bar:
- All 5 tables now pass column_definitions (oracle Tabulator titles, fixed +
placeholder(-1->'-') formatters, sorters) + per-table initial_sort
(Protein/Tag = Score desc). The Table renders ONLY the curated oracle columns;
internal carrier columns (scan_id, mzs, ProteinIndex, full sequence, ...) stay
in the data for filters/interactivity/index but are no longer displayed.
Coverage(%) omitted (oracle commented it out); de-duped FLASHQuant's duplicated
'Feature Group Quantity' column.
- quant_traces_3d: series_column='isotope' (breaks the polyline between isotopes
within a charge, reproducing the oracle -1000 z-sentinel gaps) +
category_name_template='Charge: {}' (legend 'Charge: 2'), via the new Plot3D
features.
- conftest: synthetic protein_dfs/tag_dfs fixtures extended with the real
FLASHTagger columns (Score, MatchingFragments, ModCount, TagCount, Coverage(%),
Nmass/Cmass/DeltaMass) so initial_sort + the placeholder formatter are exercised.
Resolves 3-tables-001, 3-quant-003, 3-quant-004. (ProteinTable best-per-spectrum
toggle still pending.)
Round 8 (match-oracle-chrome bar): template/grid/common/filemanager/page + insight + deconv-viewer CLEAN; table chrome + quant isotope-breaks/legend verified resolved. Remaining findings: 3-quant-005 (per-trace-id for same-isotope duplicate traces), 3-tables-002 (best-per-spectrum toggle), 3-tables-003 (go_to_fields), 3-fdr-001/002 (FDR title+trace names), 3-feat-001 (feature-table title), 3-anno-001 (anno spectrum selection-driven highlight model). User directive: match the full interaction model.
…can flag Data-layer support for the full-interaction-model parity work (render wiring in a follow-up): - anno_highlight_link (NEW tidy frame: scan_id, peak_id, mass_in_scan, charge): maps each annotated signal peak to the deconvolved mass(es) it belongs to + its charge, keyed by anno_spectrum_tidy.peak_id, so the annotated spectrum can highlight ONLY the selected mass's raw peaks + show per-charge z=N labels. Confirmed 1:MANY (a raw m/z peak can be a signal peak for several masses at different charges) -> one row per (peak, mass). - quant_traces + trace_in_feature: per-feature running trace id minted at the trace explode, carried through the point comma-split, so the 3D breaks the polyline PER-TRACE (fixes 3-quant-005: two same-(charge,isotope) traces no longer merge). - proteins + is_best_per_scan: 1 for the max-Score row per Scan (ordinal rank, ties keep-first) -> backs the ProteinTable best-per-spectrum default view. Additive; tests + fixtures updated (59 passed).
… chrome
Wire the new OpenMS-Insight selection-driven LinePlot/Plot3D features + schema
data into the viewers (match-full-interaction-model directive):
- anno_spectrum: selection-driven highlight of the SELECTED mass via the
anno_highlight_link frame (highlight_selection=mass, highlight_link_path,
key=peak_id, match=mass_in_scan, charge labels z={}, deconv_peaks_toggle=True);
exposes peak_id via a private anno_peak interactivity slot (does not drive the
shared mass slot). Removes the static is_signal highlight. (3-anno-001)
- deconv_spectrum: selective highlight of the selected mass (highlight_match_column
=mass_in_scan). (residual: selected-mass value label needs a match-column label
producer in the LinePlot -- noted for follow-up)
- 3D_SN_plot: title_selection -> dynamic "" / "Precursor signals" / "Mass signals".
(3-3d-001)
- quant_traces_3d: series_column=trace_in_feature (per-trace break; fixes the
same-(charge,isotope) merge). (3-quant-005)
- go_to_fields per oracle for every table (no internal-carrier leakage). (3-tables-003)
- FDR: title "FDR Plot" + trace names "Target QScores"/"Decoy QScores". (3-fdr-001/002)
- feature table title "Feature groups". (3-feat-001)
- best-per-spectrum toggle: per-experiment checkbox -> make_builders(best_per_spectrum)
-> protein_table sources is_best_per_scan-filtered data under a distinct cache_id.
(3-tables-002)
gate GREEN; 69 passed; nondivergence GREEN.
Wire the new LinePlot match-column value-label producer into deconv_spectrum:
highlight_value_column="mass" + highlight_value_template="{:.2f}" so the selected
mass deconvolved stick gets its MonoMass value label (oracle mass.toFixed(2)),
closing the round-9 finding. Test extended to assert exactly one value label at
the selected stick. 69 passed.
…forms (3-best-002) is_best_per_scan now flags EVERY proteoform whose Scan is missing (NaN / null / non-numeric), matching the oracle filterBestPerSpectrum (keeps each row where typeof scan !== "number" || isNaN(scan)) instead of collapsing them into a single .over(Scan) group (which flagged only one -> hid the rest in best-per-spectrum mode). A missing Scan from protein.tsv arrives as float NaN, so the check casts to f64 (non-numeric -> null) and treats null|NaN as missing (dtype-safe; is_nan errors on an int column without the cast). Edge-case-gated (real protein.tsv populates Scan) but a real parity divergence. Test added.
Round 10: 9 units CLEAN (template grid/common/filemanager/page, nondivergence, builders, deconv-viewer, quant-viewer, insight). Deconv value-label (3-deconv-001) verified resolved. Two findings: 3-best-002 (best-per-spectrum must pass through missing-Scan proteoforms; fixed) and 3-cascade-001 (protein selection must cascade-clear the dependent aa/tag selections; fix in progress).
Wire clears_selections=["aa","tag"] onto the protein_table, reproducing the oracle ProteinTable.updateSelectedProtein cascade (clears selectedAA/selectedTag/tagData on every protein click). Switching proteoform now resets the dependent residue (aa) + tag selections, so the tag-table interval filter + tagger overlay follow the new proteoform instead of going stale/empty. Test asserts the protein_table args carry clearsSelections=["aa","tag"].
3-best-003 (schema): is_best_per_scan now ranks the highest REAL Score first -- NaN/missing/non-numeric Score is pushed to -inf before rank(descending) so it sorts LAST, mirroring the oracle toScore (NaN -> -Infinity). Previously polars ranked NaN as largest, flagging a no-score proteoform best. Test added. 3-cascade-002 (render): scan_table gains clears_selections=["mass"], reproducing the oracle TabulatorScanTable updateSelectedScan -> updateSelectedMass(0): a scan click clears the mass selection, and the mass_table (default_row=0) re-defaults to mass_in_scan 0 of the new scan via the bridge _auto_selection -- so the deconv/anno spectra + 3D show the new scan first mass, not a stale ordinal. Test asserts the cascade arg AND that the mass_table auto-selects 0 when mass is unset.
Round 12: 10 units clean (grid empty-row divergence sanctioned as exception-b: degenerate malformed-upload-only, no well-formed-data difference). Round-11 fixes (Score-NaN 3-best-003, scan->mass 3-cascade-002) + anno-peak click all verified resolved/correct. One new finding 3-seqview-001: SequenceView residue-click model (coverage- vs fragment-gating, no toggle-clear, no residue->mass) diverges from the oracle; re-opens the converged SequenceView -- escalating scope.
…001) Add fragment_mass_identifier="mass" to the FLASHTnT sequence_view builder so the oracle two-path residue click is fully reproduced (maintainer: "both should be supported as in the FLASHTnT Viewer"): - PATH 1 (aa / sequence-tag): coverage_column="coverage" (already wired) now makes residue_identifier="aa" coverage-gated + toggling via the Insight SequenceView (tag-covered residues drive the tagger/tag table; re-click clears). - PATH 2 (mass / fragment): a residue click on a fragment-matched residue publishes that fragment peak mass_in_scan to "mass" (resolved via the existing interactivity "mass" column), reproducing updateMassTableFromFragmentMass -> updateSelectedMass. FLASHDeconv sequence_view (no coverage/residue wiring) is unchanged. Test asserts the two-path wiring.
…view-002) Add fragment_mass_identifier="mass" to the FLASHDeconv sequence_view branch so a fragment-residue click in the (global) deconv Sequence View publishes the fragment peak mass_in_scan to the shared mass selection -- driving the mass table, deconv/anno spectra, and 3D S/N plot (all in the deconv default layout), matching the oracle aminoAcidSelected -> updateSelectedMass which runs on every tool. PATH 2 only (no tags/coverage on the global sequence). Test asserts deconv fragment_mass_identifier == mass and coverage_column is None.
Round 13: 9 units clean (SequenceView two-path PATH-1/PATH-2 verified for TnT; gate GREEN). 3 findings, all SequenceView: 3-seqview-002 deconv residue->mass (fixed), 3-seqview-003 inbound mass->fragment-table highlight, 3-seqview-004 mass-info header. Comprehensive SequenceView parity pass in progress.
…ghlight (3-seqview-003/004) - schema._build_seq_tnt: add observed_mass = computed_mass (= ProteoformMass), the oracle SequenceView header observed proteoform mass. - render._sequence_view (tnt): observed_mass_column="observed_mass" + mass_header_title="Proteoform" (renders Theoretical/Observed/Delta Mass header) + mass_selection_identifier="mass" (a mass selected elsewhere highlights the matching fragment-table row -- oracle updateFragmentTableFromMassSelection). - render._sequence_view (deconv): mass_selection_identifier="mass" (inbound highlight; no header -- global sequence, not a proteoform). Tests assert the header + inbound wiring on both branches.
…ighlight verified)
FLASHTnT sequence_view passes theoretical_mass_label="Theoretical protein mass" + observed_mass_label="Observed proteoform mass" (oracle preparePrecursorInfo proteoform branch) instead of the generic precursor-branch defaults. Test asserts.
- schema._build_seq_deconv: add per-scan observed_mass = PrecursorMass (NULL for MS1 scans where PrecursorMass==0, so the SequenceView hides the header there -- matching the oracle precursor branch which renders no header for MS1; vs the TnT proteoform branch which shows "-" for a missing mass). - render._sequence_view (deconv): observed_mass_column="observed_mass" + mass_header_title="Precursor" (generic "Theoretical mass"/"Observed mass" labels = oracle preparePrecursorInfo precursor branch). Completes the deconv half of the mass-info header (3-seqview-004 did the TnT proteoform half). Test asserts deconv observed_mass_column/title + seq_deconv observed_mass column.
…ader fixes verified)
ms1_raw_heatmap/ms2_raw_heatmap plot raw m/z (from the annotated spectra), so the oracle PlotlyHeatmap yAxisLabel returns "m/z" for Raw MS1/MS2 Heatmaps -- only the DECONV heatmaps are "Monoisotopic Mass". Fix the two raw heatmaps y_label="m/z"; test now expects "m/z" for raw, "Monoisotopic Mass" for deconv.
…t ion) Round 16: 8 units clean (X-residue + deconv precursor header verified). 2 findings: 3-heatmap-001 raw-heatmap y-label (fixed: "m/z"); 3-seqview-008 SequenceView TSG path omits the full-length terminal fragment ion vs oracle (fix in progress).
…tmap fixes verified)
Round 17: 9 units clean (terminal-ion + raw-heatmap fixes verified; internal-fragment _terminal_collision_masses expansion confirmed oracle-faithful improvement, not a regression). 3 findings: 3-seqview-009/010 (HIGH) SequenceView computes fragments on the FULL protein not the proteoform sub-sequence (wrong masses/grid for truncated proteoforms) + no undetermined-terminus suppression; 3-seqview-011 (low) docstring accuracy. Proteoform-aware fragment-handling fix in progress.
…ng (3-seqview-009/010) seq_tnt carries the full sequence + 0-based proteoform terminals; end-to-end SequenceView-from-seq_tnt fragment grid == oracle sub-region (truncated + undetermined). No FLASHApp source change needed (tnt.py already slices; render.py already wires proteoform_start/end_column -- the Insight side now consumes them for fragments).
Wire interactivity on all 4 heatmap builders (oracle PlotlyHeatmap click ->
updateSelectedScan for all + updateSelectedMass for deconv MS1/MS2): deconv heatmaps
interactivity={scan:scan_idx, mass:mass_idx}; raw heatmaps interactivity={scan:scan_idx}.
The reused heatmap caches carry scan_idx(=scan_id)/mass_idx(=mass_in_scan) from
getMSSignalDF. conftest heatmap fixture extended with scan_idx/mass_idx; test asserts
the per-heatmap interactivity mapping.
…ority fixes verified)
First fully-clean round since round 11 -- the SequenceView interaction tail (residue click, mass header, X-residue, terminal ion, proteoform sub-region, ion-priority) + heatmap navigation are all resolved. All 53 findings raised across rounds 5-18 marked resolved (each fixed + confirmed by the all-units-clean round 19). Need rounds 20-21 clean to converge.
… rounds 19-20-21) Rounds 19, 20, 21 each fully clean (all 11 Phase-3 units pass the three critics + GREEN machine gate); 0 open findings. STATUS: CONVERGED. Round 21 note: an environment restart mid-round killed 4 in-flight review agents; they were re-launched and all confirmed CLEAN (schema 1/0, builders+deconv 2/0, tnt+quant 2/0, insight 1/0).
…pin openms-insight==0.1.15 The migration replaced the local Vue component (openms-streamlit-vue-component) with the openms-insight package, which ships its own Vue bundle. Per PHASE3_PLAN.md 5.5: - requirements.txt: openms-insight 0.1.11 -> 0.1.15 (the published 0.1.14 is Phase-1&2 WITHOUT the Phase-3 component features; 0.1.15 is the new Insight release carrying them). - Dockerfile + Dockerfile.arm: remove the node:21 js-build stage that cloned+built openms-streamlit-vue-component and the COPY of its dist into /app/js-component/dist. - Remove the openms-streamlit-vue-component git submodule (.gitmodules + gitlink). The migrated app has no js-component references (verified); local tests unaffected. REMAINING EXTERNAL STEP: publish openms-insight 0.1.15 to PyPI (maintainer release action) so a clean pip install -r requirements.txt / Docker build resolves it.
…PyPI dep) Add an insight-build stage (node:21) that clones the openms-insight branch claude/kind-heisenberg-u6dVm, builds its Vue bundle (npm install && npm run build), and syncs dist into openms_insight/js-component/dist. The compile-openms stage then pip-installs that source tree (hatchling force-includes the bundled dist) BEFORE requirements, so openms-insight==0.1.15 resolves from the branch instead of PyPI. Makes the image self-contained without a PyPI publish. (Verified: a source build produces a wheel that bundles assets/index.js.)
…esh-clone fix) In a fresh clone openms_insight/js-component/ does not exist (it only ever held the gitignored dist/), so cp had no parent dir and the build failed. mkdir -p it first. Validated end-to-end against a fresh clone of the branch: clone -> npm build -> sync -> pip wheel produces a wheel bundling the Vue dist (7 files incl. index.js).
📝 WalkthroughWalkthroughThis PR completes Phase 3 of the FLASHApp migration by replacing legacy Vue/component builds and index-based rendering with OpenMS-Insight integration and a frozen, reusable linked-grid template. It introduces tidy-parquet schema adapters, a builder factory, a layout manager/renderer, viewer rewrites, Docker/CI integration for openms-insight, migration gates/harness, and comprehensive tests. ChangesFLASHApp Phase 3 Migration
Sequence Diagram(s)sequenceDiagram
participant Page as Streamlit Page
participant Grid as render_linked_grid
participant Builder as make_builders factory
participant State as StateManager
Page->>Grid: call render_linked_grid(layout, builders, state_key)
Grid->>State: create StateManager(state_key)
Grid->>Builder: invoke factory() for each cell
Builder->>Grid: return component instance
Builder->>State: register filters/interactivity
Grid->>Builder: component.render() -> Vue payload
Possibly Related PRs
Poem
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 10
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
src/workflow/FileManager.py (1)
480-486:⚠️ Potential issue | 🟠 Major | ⚡ Quick winHandle missing cache rows before iterating
fetchone().If
dataset_idhas no row in one of the cache tables,fetchone()returnsNoneand both loops raiseTypeErrorbeforepartial/KeyErrorhandling runs. That turns a normal cache miss into a hard crash and breaks the newresult_path()contract.Suggested fix
result = self.cache_cursor.fetchone() + if result is None: + result = (None,) * len(file_columns) for c, r in zip(file_columns, result): if r is None: if partial: continue else: @@ result = self.cache_cursor.fetchone() + if result is None: + result = (None,) * len(data_columns) for c, r in zip(data_columns, result): if r is None: if partial: continue else:Also applies to: 498-504
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/workflow/FileManager.py` around lines 480 - 486, The SELECT followed by fetchone() can return None for a cache miss, so before zipping over file_columns and result ensure you check "result is None" and treat it as a cache miss (do not iterate). For both occurrences (the block using self.cache_cursor, file_columns and dataset_id starting at the shown snippet and the similar block at 498-504) either raise KeyError(dataset_id) so upstream partial/KeyError handling runs, or return the missing-value expected by result_path(); do this check immediately after result = self.cache_cursor.fetchone() to avoid the TypeError.
🧹 Nitpick comments (1)
migration/nondivergence.py (1)
56-57: ⚡ Quick winResolve nondivergence pair paths from a fixed base, not process CWD.
Line 56-Line 57 currently makes relative paths caller-dependent. Resolve relative entries from repo/config root so checks are deterministic across runners.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@migration/nondivergence.py` around lines 56 - 57, Current code creates Path(pair[0]) and Path(pair[1]) which resolves relative paths against the process CWD; change this to resolve against the repository/config root instead. Modify the logic around a, b (and subsequent ea, eb) so you first compute a stable base Path (e.g., repo_root or config_root) and then construct a = base.joinpath(pair[0]) and b = base.joinpath(pair[1]) (or otherwise resolve pair entries against that base) so existence checks ea, eb are deterministic across runners; update any uses of a and b accordingly.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@content/FLASHDeconv/FLASHDeconvViewer.py`:
- Around line 47-49: The current mapping using to_id =
{file_manager.get_display_name(r): r for r in results} is lossy when display
names are duplicated; update the lookup to preserve all datasets by either (a)
keying by a unique identifier (use the raw dataset id from each result instead
of file_manager.get_display_name) where the UI can still show display names, or
(b) build a multimap from display name to a list of result ids (e.g., to_id:
dict[str, list] and append each r) so duplicates are retained; apply the same
change to the other mapping logic around the block handling lines 57–64 to
ensure no dataset is overwritten or lost.
- Line 67: The tool key passed to show_linked_grid is not panel-scoped and can
collide when two panels use the same ds; update the call to generate a unique
panel-scoped tool string (e.g., include a panel identifier such as an explicit
panel index, a unique id derived from exp_layout or builders, or a UUID) and
pass that as the tool argument to show_linked_grid (replace the current
f"flashdeconv_{ds}" with something like f"flashdeconv_{panel_id}_{ds}" or
similar) so the linked-grid state is isolated per panel; change the code that
constructs the tool key where show_linked_grid([exp_layout], builders, tool=...)
is invoked.
In `@content/FLASHQuant/FLASHQuantViewer.py`:
- Around line 29-30: The current code builds names =
[file_manager.get_display_name(r) for r in results] and to_id =
{file_manager.get_display_name(r): r for r in results}, which loses datasets
when display names collide; update the mapping and selection logic to use a
unique key instead of the display name (or map a display name to a list of
results). For example, keep a list of tuples (display_name, result) or build
to_id using a unique identifier from each result (e.g., r.id or r.path) while
still presenting file_manager.get_display_name(r) in the UI, and change the
lookup code that currently uses the display name (the selection at Line 38 and
the surrounding block that spans the same section) to resolve selection via that
unique identifier or disambiguate by choosing the correct entry from the list of
results for that display name.
In `@content/FLASHTnT/FLASHTnTViewer.py`:
- Around line 42-43: The code builds names = [file_manager.get_display_name(r)
for r in results] and to_id = {file_manager.get_display_name(r): r for r in
results}, which will drop entries when multiple results share the same display
name; change the mapping so resolution is based on a unique identifier instead
of display name (e.g., use r.id or r.path) or map display names to lists of
results: create to_id_by_unique = {r.id: r for r in results} and keep names as
display labels, or build to_id = defaultdict(list) and append each r under
file_manager.get_display_name(r) so duplicates are preserved; then update any
lookup code that uses to_id (the lookup at or around the code that resolves
selections) to use the unique-id map or to handle list values accordingly (e.g.,
disambiguate by id or present multiple choices).
- Line 76: The linked-grid tool key is currently only namespaced by dataset id
via tool=f"flashtnt_{ds}", which allows two side-by-side panels selecting the
same dataset to collide; update the call to show_linked_grid to include the
panel index (e.g., tool=f"flashtnt_{ds}_{panel_idx}" or similar) so the key is
unique per panel (locate the show_linked_grid([exp_layout], builders, tool=...)
invocation and add the panel index variable used when rendering panels).
In `@migration/units.yaml`:
- Around line 8-14: Replace hard-coded absolute paths in migration/units.yaml
(oracle_root, template_repo, nondivergence_pairs) with repo-relative or
environment-parameterized tokens (e.g., use a ${REPO_ROOT} or ${TEMPLATE_REPO}
placeholder) and update the code that reads units.yaml to expand those tokens at
runtime (resolve via an env var or compute repo root with os.getcwd()/git
rev-parse and join paths). Specifically change the values for oracle_root and
template_repo and the paths inside nondivergence_pairs to use placeholders, and
update the loader/validator that reads units.yaml to perform
environment-variable expansion and path normalization so CI and other machines
can resolve the locations correctly.
In `@src/render/render.py`:
- Around line 624-626: The sequence_view factory is always registered and
immediately triggers result_path() for seq_deconv/seq_tnt even if
build_insight_caches skipped them, causing cache misses; update the registration
site to first check for the presence of the sequence cache(s) before adding the
"sequence_view" entry (e.g. query the cache mapping returned by
build_insight_caches or call file_manager.has_cache/exists for the
seq_deconv/seq_tnt result paths) and only assign "sequence_view": lambda:
_sequence_view(...) when that check succeeds so datasets without sequence data
won't construct the panel.
In `@src/render/schema.py`:
- Around line 551-555: The lambda passed to pl.col("Scan").map_elements() only
checks for s is not None, so NaN/pd.NA still reach int(s) and crash; update the
guard to treat all missing/null-like values as missing (return -1) by using a
null check that covers NaN/pd.NA (e.g., pd.isna(s) or math.isnan where
appropriate) before calling int(s), keeping the rest of the expression
(scan_to_deconv lookup, alias("scan_id"), return_dtype=pl.Int64) unchanged so
missing scans yield -1 and valid scans are int-cast and mapped via
scan_to_deconv.
In `@src/view/grid.py`:
- Around line 243-266: The upload flow needs to validate JSON and avoid index
errors in expand: wrap json.load(uploaded) in a try/except to catch
JSONDecodeError and return/raise a controlled validation error, verify the
decoded object is the expected list-of-lists structure before calling expand,
and inside Grid.expand change the name-lookup to a safe pattern (e.g., check
membership of col in self.component_names or use a dict mapping name->option and
skip unknown IDs) rather than calling self.component_names.index(col) which
raises ValueError; ensure expand returns a consistent validated structure (or
raises a clear validation error) instead of letting runtime exceptions
propagate.
---
Outside diff comments:
In `@src/workflow/FileManager.py`:
- Around line 480-486: The SELECT followed by fetchone() can return None for a
cache miss, so before zipping over file_columns and result ensure you check
"result is None" and treat it as a cache miss (do not iterate). For both
occurrences (the block using self.cache_cursor, file_columns and dataset_id
starting at the shown snippet and the similar block at 498-504) either raise
KeyError(dataset_id) so upstream partial/KeyError handling runs, or return the
missing-value expected by result_path(); do this check immediately after result
= self.cache_cursor.fetchone() to avoid the TypeError.
---
Nitpick comments:
In `@migration/nondivergence.py`:
- Around line 56-57: Current code creates Path(pair[0]) and Path(pair[1]) which
resolves relative paths against the process CWD; change this to resolve against
the repository/config root instead. Modify the logic around a, b (and subsequent
ea, eb) so you first compute a stable base Path (e.g., repo_root or config_root)
and then construct a = base.joinpath(pair[0]) and b = base.joinpath(pair[1]) (or
otherwise resolve pair entries against that base) so existence checks ea, eb are
deterministic across runners; update any uses of a and b accordingly.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 9ef92ede-ee8d-47e5-83d9-a18e8557d9c0
📒 Files selected for processing (31)
.gitmodulesDockerfileDockerfile.armcontent/FLASHDeconv/FLASHDeconvViewer.pycontent/FLASHQuant/FLASHQuantViewer.pycontent/FLASHTnT/FLASHTnTViewer.pymigration/README.mdmigration/REVIEW.mdmigration/nondivergence.pymigration/review-log/phase-3.jsonlmigration/run_review.pymigration/specs/.gitkeepmigration/specs/PHASE3_PLAN.mdmigration/units.yamlopenms-streamlit-vue-componentrequirements.txtsrc/common/common.pysrc/render/StateTracker.pysrc/render/components.pysrc/render/initialize.pysrc/render/render.pysrc/render/schema.pysrc/render/update.pysrc/render/util.pysrc/view/__init__.pysrc/view/grid.pysrc/workflow/FileManager.pytests/conftest.pytests/test_render_builders.pytests/test_render_schema.pytests/test_selection_clear.py
💤 Files with no reviewable changes (8)
- src/render/StateTracker.py
- .gitmodules
- tests/test_selection_clear.py
- src/render/util.py
- src/render/initialize.py
- src/render/components.py
- src/render/update.py
- openms-streamlit-vue-component
| oracle_root: /home/user/FLASHApp/src/render | ||
| template_repo: /home/user/streamlit-template | ||
| # Pairs checked by nondivergence.py: FLASHApp must reuse the template module verbatim. | ||
| # Filled in once the template grid module is frozen (Phase 3 step 2). | ||
| nondivergence_pairs: | ||
| - [/home/user/FLASHApp/src/view/grid.py, /home/user/streamlit-template/src/view/grid.py] | ||
|
|
There was a problem hiding this comment.
Hard-coded /home/user/... paths make the phase harness non-portable.
This config will fail on any machine/CI runner that doesn’t mirror that exact filesystem layout. Please switch targets/oracles/cwd entries to repo-relative or env-parameterized paths and resolve them in the scripts.
Also applies to: 22-50, 55-119
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@migration/units.yaml` around lines 8 - 14, Replace hard-coded absolute paths
in migration/units.yaml (oracle_root, template_repo, nondivergence_pairs) with
repo-relative or environment-parameterized tokens (e.g., use a ${REPO_ROOT} or
${TEMPLATE_REPO} placeholder) and update the code that reads units.yaml to
expand those tokens at runtime (resolve via an env var or compute repo root with
os.getcwd()/git rev-parse and join paths). Specifically change the values for
oracle_root and template_repo and the paths inside nondivergence_pairs to use
placeholders, and update the loader/validator that reads units.yaml to perform
environment-variable expansion and path normalization so CI and other machines
can resolve the locations correctly.
| def show_linked_grid( | ||
| layout, | ||
| builders, | ||
| *, | ||
| tool, | ||
| side_by_side=False, | ||
| grid_key="linked_grid", | ||
| height=None, | ||
| column_heights=None, | ||
| ): | ||
| """Render an N-experiment OpenMS-Insight linked grid. | ||
|
|
||
| Thin convenience over ``src.view.grid.render_linked_grid`` (the frozen, | ||
| tool-agnostic grid) that handles the multi-experiment + side-by-side page | ||
| concern. ``layout`` is ``List[experiment]``; each experiment is the nested | ||
| rows list consumed by ``render_linked_grid``. One independent | ||
| ``StateManager`` is created per experiment (``session_key=f"{tool}__exp{i}"``) | ||
| so experiments never cross-link. When exactly two experiments and | ||
| ``side_by_side=True`` they render in two ``st.columns``; otherwise they stack | ||
| with ``st.divider()``. | ||
|
|
||
| Experiment *selection* (the per-experiment ``st.selectbox``) stays in the | ||
| viewer page because it is tool/data specific; this helper only owns the | ||
| grid + side-by-side rendering. | ||
| """ | ||
| from src.view.grid import render_linked_grid | ||
|
|
||
| def _one(exp_idx, exp_layout, container): | ||
| with container: | ||
| render_linked_grid( | ||
| exp_layout, | ||
| builders, | ||
| state_key=f"{tool}__exp{exp_idx}", | ||
| grid_key=f"{grid_key}_{exp_idx}", | ||
| height=height, | ||
| column_heights=column_heights, | ||
| ) | ||
|
|
||
| if len(layout) == 2 and side_by_side: | ||
| c1, c2 = st.columns(2) | ||
| _one(0, layout[0], c1) | ||
| _one(1, layout[1], c2) | ||
| else: | ||
| for i, exp_layout in enumerate(layout): | ||
| if i: | ||
| st.divider() | ||
| _one(i, exp_layout, st.container()) |
There was a problem hiding this comment.
Make experiment rendering input/state explicitly per-experiment.
This helper currently applies one builders map to every experiment and uses state_key=f"{tool}__exp{idx}". That can leak stale selections across dataset switches and prevents clean per-dataset isolation when experiments differ.
💡 Suggested API shape
def show_linked_grid(
layout,
- builders,
+ builders,
*,
tool,
+ state_keys=None,
side_by_side=False,
@@
):
@@
- def _one(exp_idx, exp_layout, container):
+ def _one(exp_idx, exp_layout, container):
+ exp_builders = (
+ builders[exp_idx] if isinstance(builders, (list, tuple)) else builders
+ )
+ exp_state_key = (
+ state_keys[exp_idx]
+ if state_keys is not None
+ else f"{tool}__exp{exp_idx}"
+ )
with container:
render_linked_grid(
exp_layout,
- builders,
- state_key=f"{tool}__exp{exp_idx}",
+ exp_builders,
+ state_key=exp_state_key,
grid_key=f"{grid_key}_{exp_idx}",
height=height,
column_heights=column_heights,
)| def expand(self, trimmed: list, drop_empty_experiments: bool = True) -> list: | ||
| """internal names -> labels, dropping empty cells/rows. | ||
|
|
||
| ``drop_empty_experiments`` (default True, the edit-mode behavior) also drops | ||
| a wholly-empty experiment. The upload path passes False to match the oracle | ||
| ``handleSettingButtons``, whose inline expand keeps an empty experiment as a | ||
| ``[]`` stub so ``num_experiments`` stays ``len(uploaded)`` and the | ||
| reset-on-count-mismatch never fires (which would wipe the upload). | ||
| """ | ||
| expanded = [] | ||
| for exp in trimmed: | ||
| rows = [] | ||
| for row in exp: | ||
| cols = [] | ||
| for col in row: | ||
| if col: | ||
| cols.append( | ||
| self.component_options[self.component_names.index(col)] | ||
| ) | ||
| if cols: | ||
| rows.append(cols) | ||
| if rows or not drop_empty_experiments: | ||
| expanded.append(rows) | ||
| return expanded |
There was a problem hiding this comment.
Harden uploaded layout handling to avoid runtime crashes.
json.load(uploaded) and the expand() name lookup path can throw on invalid JSON or unknown/stale component IDs, which currently crashes the layout page instead of surfacing a validation error.
💡 Suggested fix
def _handle_setting_buttons(self) -> None:
@@
- if uploaded is not None:
- uploaded_layout = json.load(uploaded)
+ if uploaded is not None:
+ try:
+ uploaded_layout = json.load(uploaded)
+ except json.JSONDecodeError:
+ st.session_state[self._k("component_error")] = "Invalid JSON layout file"
+ return
+ if not isinstance(uploaded_layout, list):
+ st.session_state[self._k("component_error")] = "Invalid layout format"
+ return
@@
- st.session_state[self._k("layout")] = self.expand(
+ st.session_state[self._k("layout")] = self.expand(
uploaded_layout, drop_empty_experiments=False
) def expand(self, trimmed: list, drop_empty_experiments: bool = True) -> list:
@@
for col in row:
if col:
- cols.append(
- self.component_options[self.component_names.index(col)]
- )
+ if col not in self.component_names:
+ st.session_state[self._k("component_error")] = (
+ f"Unknown component in saved layout: {col}"
+ )
+ continue
+ cols.append(
+ self.component_options[self.component_names.index(col)]
+ )Also applies to: 423-438
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@src/view/grid.py` around lines 243 - 266, The upload flow needs to validate
JSON and avoid index errors in expand: wrap json.load(uploaded) in a try/except
to catch JSONDecodeError and return/raise a controlled validation error, verify
the decoded object is the expected list-of-lists structure before calling
expand, and inside Grid.expand change the name-lookup to a safe pattern (e.g.,
check membership of col in self.component_names or use a dict mapping
name->option and skip unknown IDs) rather than calling
self.component_names.index(col) which raises ValueError; ensure expand returns a
consistent validated structure (or raises a clear validation error) instead of
letting runtime exceptions propagate.
…ache-miss guards - Viewers: select experiments by stable dataset id (display name via format_func) so duplicate display names can't collapse distinct datasets; scope linked-grid tool keys by experiment index to isolate side-by-side panel state. - render.make_builders: only register sequence_view when its seq_deconv/seq_tnt cache exists, so datasets without sequence data don't hit a cache-miss crash. - schema._build_tags: treat NaN/None/pd.NA tag scans as missing (scan_id = -1) instead of crashing in int(). - FileManager.get_results: handle a missing cache row (fetchone() is None) as a normal cache miss instead of raising TypeError.
There was a problem hiding this comment.
Actionable comments posted: 2
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
src/workflow/FileManager.py (1)
535-542:⚠️ Potential issue | 🟡 Minor | ⚡ Quick win
result_path()should enforce parquet-only contract.
result_path()is documented/typed to return a parquet path string, but it currently forwards whateverget_results(..., as_path=True)returns. For non-.pqtags, that can be a loaded Python object, which breaks the method contract and can cause downstreamdata_path=misuse.Proposed fix
def result_path(self, dataset_id: str, name_tag: str) -> str: @@ - return self.get_results(dataset_id, [name_tag], as_path=True)[name_tag] + value = self.get_results(dataset_id, [name_tag], as_path=True)[name_tag] + if not isinstance(value, str): + raise KeyError( + f"{name_tag} for {dataset_id} is not a parquet-backed result path" + ) + return value🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/workflow/FileManager.py` around lines 535 - 542, The result_path method currently returns whatever get_results(..., as_path=True)[name_tag] yields, which can be a non-path object; change result_path to fetch value = self.get_results(dataset_id, [name_tag], as_path=True)[name_tag], then validate that value is a string path ending with ".pq" (and keep letting KeyError bubble if the tag is missing); if the value is not a string or does not end with ".pq" raise a TypeError with a clear message indicating result_path enforces a parquet path contract. Ensure references inside the method remain to result_path and get_results so the behavior is localized..github/workflows/unit-tests.yml (1)
44-49:⚠️ Potential issue | 🟠 MajorExclude
openms-insightfrom the subsequentrequirements.txtinstall.
requirements.txtincludesopenms-insight==0.1.15(line 62), but the workflow already installs the built local/tmp/openms-insight. Leaving the pin in place can still makepip install -r requirements.txttry to resolveopenms-insightagain (despite it requiring the prebuilt/mirroredjs-component/distto be installable in this repo).Suggested fix
- pip install -r requirements.txt + grep -vi 'openms-insight' requirements.txt > /tmp/requirements-no-insight.txt + pip install -r /tmp/requirements-no-insight.txt🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In @.github/workflows/unit-tests.yml around lines 44 - 49, The workflow currently installs the local wheel with "pip install /tmp/openms-insight" and then re-runs "pip install -r requirements.txt" which contains a pinned openms-insight entry; modify the job in .github/workflows/unit-tests.yml so the requirements install does not attempt to reinstall openms-insight — e.g., generate or use a filtered requirements file before running "pip install -r requirements.txt" or run pip install with a constraint/exclude to skip "openms-insight==0.1.15"; ensure the change targets the sequence around the existing "pip install /tmp/openms-insight" and "pip install -r requirements.txt" commands so only the local package is installed.Source: Learnings
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In @.github/workflows/unit-tests.yml:
- Around line 27-30: The workflow is using the mutable tag actions/setup-node@v4
which weakens CI supply-chain security; update the uses line in
.github/workflows/unit-tests.yml to pin to a specific commit SHA for
actions/setup-node (replace actions/setup-node@v4 with
actions/setup-node@<full-commit-sha>), keeping the existing node-version setting
intact and using the commit SHA from the official actions/setup-node repository
release you trust.
- Around line 38-39: The CI job currently clones the mutable branch reference
"claude/kind-heisenberg-u6dVm" via the git clone command and can drift; change
the workflow to fetch and checkout a specific immutable commit SHA instead of
relying on -b branch (or add an explicit git checkout <COMMIT_SHA> after clone),
and apply the same hardening to the Dockerfile/Dockerfile.arm flow that consumes
insight-ref.json by resolving and pinning the referenced commit SHA before
building; update references to the git clone invocation and the insight-ref.json
fetch so all checkouts use a concrete commit SHA rather than a branch name.
---
Outside diff comments:
In @.github/workflows/unit-tests.yml:
- Around line 44-49: The workflow currently installs the local wheel with "pip
install /tmp/openms-insight" and then re-runs "pip install -r requirements.txt"
which contains a pinned openms-insight entry; modify the job in
.github/workflows/unit-tests.yml so the requirements install does not attempt to
reinstall openms-insight — e.g., generate or use a filtered requirements file
before running "pip install -r requirements.txt" or run pip install with a
constraint/exclude to skip "openms-insight==0.1.15"; ensure the change targets
the sequence around the existing "pip install /tmp/openms-insight" and "pip
install -r requirements.txt" commands so only the local package is installed.
In `@src/workflow/FileManager.py`:
- Around line 535-542: The result_path method currently returns whatever
get_results(..., as_path=True)[name_tag] yields, which can be a non-path object;
change result_path to fetch value = self.get_results(dataset_id, [name_tag],
as_path=True)[name_tag], then validate that value is a string path ending with
".pq" (and keep letting KeyError bubble if the tag is missing); if the value is
not a string or does not end with ".pq" raise a TypeError with a clear message
indicating result_path enforces a parquet path contract. Ensure references
inside the method remain to result_path and get_results so the behavior is
localized.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 5cf58efb-9598-46e6-8c10-83db94c2b574
📒 Files selected for processing (7)
.github/workflows/unit-tests.ymlcontent/FLASHDeconv/FLASHDeconvViewer.pycontent/FLASHQuant/FLASHQuantViewer.pycontent/FLASHTnT/FLASHTnTViewer.pysrc/render/render.pysrc/render/schema.pysrc/workflow/FileManager.py
🚧 Files skipped from review as they are similar to previous changes (5)
- content/FLASHQuant/FLASHQuantViewer.py
- content/FLASHTnT/FLASHTnTViewer.py
- content/FLASHDeconv/FLASHDeconvViewer.py
- src/render/render.py
- src/render/schema.py
| - name: Set up Node | ||
| uses: actions/setup-node@v4 | ||
| with: | ||
| node-version: "21" |
There was a problem hiding this comment.
Pin actions/setup-node to a full commit SHA.
.github/workflows/unit-tests.yml uses the mutable action tag actions/setup-node@v4, which weakens CI supply-chain integrity—pin it to a specific commit SHA instead of v4.
🧰 Tools
🪛 zizmor (1.25.2)
[error] 28-28: unpinned action reference (unpinned-uses): action is not pinned to a hash (required by blanket policy)
(unpinned-uses)
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In @.github/workflows/unit-tests.yml around lines 27 - 30, The workflow is using
the mutable tag actions/setup-node@v4 which weakens CI supply-chain security;
update the uses line in .github/workflows/unit-tests.yml to pin to a specific
commit SHA for actions/setup-node (replace actions/setup-node@v4 with
actions/setup-node@<full-commit-sha>), keeping the existing node-version setting
intact and using the commit SHA from the official actions/setup-node repository
release you trust.
| git clone -b claude/kind-heisenberg-u6dVm --single-branch --depth 1 \ | ||
| https://github.com/t0mdavid-m/openms-insight.git /tmp/openms-insight |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
set -euo pipefail
# Locate and print the relevant section of the workflow around the cited lines
FILE=".github/workflows/unit-tests.yml"
if [ ! -f "$FILE" ]; then
echo "Missing $FILE"
exit 1
fi
nl -ba "$FILE" | sed -n '1,120p'
echo "----"
nl -ba "$FILE" | sed -n '20,80p'
echo "----"
# Also search for any other openms-insight clone/install logic in the repo
rg -n "openms-insight|/tmp/openms-insight|npm run build|js-component/dist|requirements\.txt" .github/workflows/unit-tests.yml .claude/hooks/session-start.sh Dockerfile* . || trueRepository: OpenMS/FLASHApp
Length of output: 103
🏁 Script executed:
#!/bin/bash
set -euo pipefail
FILE=".github/workflows/unit-tests.yml"
if [ ! -f "$FILE" ]; then
echo "Missing $FILE"
exit 1
fi
echo "---- $FILE (lines ~20-80) ----"
awk 'NR>=20 && NR<=80 {printf "%5d %s\n", NR, $0}' "$FILE"
echo "---- matches for openms-insight related logic ----"
rg -n "openms-insight|/tmp/openms-insight|js-component/dist|npm run build|requirements\.txt" .github/workflows .claude/hooks Dockerfile* 2>/dev/null || true
echo "---- any checkout/pin patterns in the workflow ----"
rg -n "git clone|checkout .*sha|checkout --detach|fetch --depth|--depth 1" .github/workflows/unit-tests.yml 2>/dev/null || trueRepository: OpenMS/FLASHApp
Length of output: 5956
🏁 Script executed:
#!/bin/bash
set -euo pipefail
for f in Dockerfile Dockerfile.arm; do
echo "===== $f (around insight clone) ====="
[ -f "$f" ] || continue
# print likely relevant region
awk 'NR>=60 && NR<=160 {printf "%5d %s\n", NR, $0}' "$f"
echo
done
echo "===== search for insight-ref.json usage ====="
rg -n "insight-ref\.json|INSIGHT_BRANCH|openms-insight" Dockerfile Dockerfile.armRepository: OpenMS/FLASHApp
Length of output: 12567
🏁 Script executed:
#!/bin/bash
set -euo pipefail
rg -n "insight-ref\.json|insight-ref|INSIGHT_BRANCH=.*|git.*checkout.*openms-insight" -S . \
|| true
# Specifically ensure there is no follow-up that extracts a SHA from insight-ref.json
rg -n "jq |\.object\.sha|\"sha\"|grep .*insight-ref\.json|cat .*insight-ref\.json" Dockerfile Dockerfile.arm .github/workflows .claude/hooks 2>/dev/null || true
# List any other workflows that clone openms-insight
rg -n "openms-insight\.git|/tmp/openms-insight|openms_insight/js-component/dist" .github/workflows .claude/hooks Dockerfile* 2>/dev/null || trueRepository: OpenMS/FLASHApp
Length of output: 2552
Use an immutable openms-insight revision in CI checkout.
.github/workflows/unit-tests.yml clones claude/kind-heisenberg-u6dVm via git clone -b ... --depth 1 with no commit SHA pinning, so CI results can silently drift as the branch moves. The same mutable-branch pattern is also used in Dockerfile/Dockerfile.arm (the insight-ref.json fetch is not followed by a SHA checkout).
Suggested hardening
- git clone -b claude/kind-heisenberg-u6dVm --single-branch --depth 1 \
- https://github.com/t0mdavid-m/openms-insight.git /tmp/openms-insight
+ git clone https://github.com/t0mdavid-m/openms-insight.git /tmp/openms-insight
+ git -C /tmp/openms-insight checkout <pinned-openms-insight-commit-sha>🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In @.github/workflows/unit-tests.yml around lines 38 - 39, The CI job currently
clones the mutable branch reference "claude/kind-heisenberg-u6dVm" via the git
clone command and can drift; change the workflow to fetch and checkout a
specific immutable commit SHA instead of relying on -b branch (or add an
explicit git checkout <COMMIT_SHA> after clone), and apply the same hardening to
the Dockerfile/Dockerfile.arm flow that consumes insight-ref.json by resolving
and pinning the referenced commit SHA before building; update references to the
git clone invocation and the insight-ref.json fetch so all checkouts use a
concrete commit SHA rather than a branch name.
Summary
This PR completes Phase 3 of the FLASHApp migration: rebuild the three viewer pages (FLASHDeconv, FLASHTnT, FLASHQuant) on top of OpenMS-Insight using a reusable, tool-agnostic grid module from a frozen
streamlit-templatepackage. The old bespoke Vue mega-component and index-based selection logic are replaced with value-basedfilters/interactivityand a sharedStateManagerfor cross-linked panels.Key Changes
New Modules
src/view/grid.py— Tool-agnostic linked-grid renderer (render_linked_grid()+LayoutManagerclass). Distills the oldrender.py::render_grid()+ bothFLASH*LayoutManagerclasses into a single reusable template that can be frozen and vendored byte-for-byte unchanged into downstream apps.src/render/schema.py— Adapter layer that converts FLASHApp's wide, list-column, index-addressed FileManager caches into tidy parquet with stable value IDs (scan_id,mass_id,peak_id,protein_id, etc.). Enables OpenMS-Insight's value-basedfilters/interactivitymodel. Includes helpers for exploding nested signal/noise peaks, KDE distributions, and protein/tag data.src/render/render.py(repurposed) — Now a builder factory.make_builders(file_manager, dataset_id, tool, settings=None)returns a{comp_name: () -> BaseComponent}map. Each zero-arg factory closes over dataset context and usesfile_manager.result_path()to feeddata_path=to Insight components. Includes oracle Tabulator column definitions (titles, formatters, sorters) ported verbatim so migrated Tables show the same curated columns and formatting.Removed Modules
src/render/update.py— Index-based selection logic (scanIndex/massIndex/proteinIndex/xRange/yRangeslicing). Replaced by value-basedfilters/interactivityin Insight components.src/render/initialize.py— Per-panel cache loading. Now handled by Insight'sdata_path=+ subprocess preprocessing.src/render/StateTracker.py— Local state reconciler. Replaced 1:1 byopenms_insight.StateManager.src/render/components.py— Vue component registration. No longer needed.src/render/util.py— Hash utility (unused)..gitmodules+openms-streamlit-vue-component/— Old Vue submodule removed.Page Updates
content/FLASHDeconv/FLASHDeconvViewer.py,content/FLASHTnT/FLASHTnTViewer.py,content/FLASHQuant/FLASHQuantViewer.py— Simplified to callshow_linked_grid()with a layout and builders. No morerender_grid()/StateTracker/initialize_data/update_data/filter_datacalls.src/common/common.py— Addedshow_linked_grid()convenience wrapper for rendering N-experiment grids with the frozen template.Data Layer
src/workflow/FileManager.py— Addedresult_path(dataset_id, name_tag)method to expose parquet file paths fordata_path=(Insight's subprocess preprocessing entry point).Testing & Migration Harness
tests/test_render_schema.py— Construct-smoke forbuild_insight_caches(): synthetic FileManager caches → tidy parquet with stable IDs and correct explode shapes.https://claude.ai/code/session_017kD4FyAsNvW6VFTZwVvSne
Summary by CodeRabbit