Assessment: L1 pipeline for L2 and Post-Processing #904

@vprashrex

Is your feature request related to a problem?
Currently, all rows are sent directly to the L2 LLM batch without any pre-filtering. Irrelevant or duplicate submissions therefore incur full L2 cost, and the export output is fixed: it supports no computed columns, filtering, or sorting.

Describe the solution you'd like

  • Implement an L1 pipeline (topic relevance and duplicate detection) that runs as a Celery task before rows are sent to L2.
  • Have topic relevance also evaluate the selected attachment documents, and include per-input relevance in the export.
  • Detect the attachment type per item, so that a single column can mix images and PDFs.
  • Apply post-processing (computed columns, filter, sort) at the export stage.
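
To make the L1 stage concrete, here is a minimal sketch of a pre-filter that flags duplicates and scores topic relevance before anything reaches L2. Everything in it is an assumption for illustration: the row shape, the names `run_l1` and `L1Result`, the exact-match hashing, and the keyword-overlap score (a stand-in for whatever relevance check the pipeline ends up using). In the real pipeline a function like this would presumably be wrapped in a Celery task; it is kept as plain Python here so it runs on its own.

```python
import hashlib
import re
from dataclasses import dataclass


@dataclass
class L1Result:
    row_id: str
    relevance: float    # share of topic keywords found in the row text
    is_duplicate: bool
    send_to_l2: bool


def _normalize(text):
    # Lowercase and collapse whitespace so trivially reformatted copies hash alike.
    return re.sub(r"\s+", " ", text.strip().lower())


def run_l1(rows, topic_keywords, min_relevance=0.5):
    """rows: iterable of (row_id, text, attachment_texts) tuples (hypothetical shape).

    topic_keywords are assumed to be single words.
    """
    keywords = {k.lower() for k in topic_keywords}
    seen = set()
    results = []
    for row_id, text, attachment_texts in rows:
        # Judge the submission text together with its selected attachment text.
        combined = _normalize(" ".join([text, *attachment_texts]))
        digest = hashlib.sha256(combined.encode("utf-8")).hexdigest()
        is_duplicate = digest in seen
        seen.add(digest)
        tokens = set(re.findall(r"[a-z0-9]+", combined))
        relevance = len(keywords & tokens) / len(keywords) if keywords else 0.0
        results.append(
            L1Result(
                row_id=row_id,
                relevance=relevance,
                is_duplicate=is_duplicate,
                send_to_l2=(not is_duplicate) and relevance >= min_relevance,
            )
        )
    return results
```

Only rows with `send_to_l2=True` would be batched for L2, while the `relevance` value stays attached to every row so it can be written to the export.
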
Original issue

Describe the current behavior
All rows go straight to the L2 LLM batch. No pre-filter, so irrelevant/duplicate submissions consume full L2 cost. Relevance ignores attachments, columns can't mix image+PDF, and export output is fixed (no computed columns/filter/sort).

Describe the enhancement you'd like

  • L1 pipeline (topic relevance + duplicate detection) running before L2 via a Celery task.
  • Topic relevance that also evaluates selected attachment documents, with per-input relevance in the export.
  • Per-item attachment type detection (mixed image/PDF columns).
  • Post-processing (computed columns, filter, sort) applied at export.
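
One possible way to approach per-item attachment type detection is to look at each file's leading bytes instead of assuming one type per column. The sketch below is an assumption, not the project's implementation: the function names and the `"pdf"` / `"image"` / `"unknown"` labels are hypothetical, and only a few common image signatures are checked.

```python
PDF_MAGIC = b"%PDF-"

# Leading-byte signatures for a few common image formats.
IMAGE_MAGICS = (
    b"\x89PNG\r\n\x1a\n",  # PNG
    b"\xff\xd8\xff",       # JPEG
    b"GIF87a",             # GIF
    b"GIF89a",             # GIF
)


def detect_attachment_type(data):
    """Return 'pdf', 'image', or 'unknown' for one attachment's raw bytes."""
    if data.startswith(PDF_MAGIC):
        return "pdf"
    if any(data.startswith(magic) for magic in IMAGE_MAGICS):
        return "image"
    return "unknown"


def detect_column_types(column):
    """column: mapping of item id -> raw bytes (hypothetical shape).

    Each item is classified on its own, so one column may hold both
    images and PDFs.
    """
    return {item_id: detect_attachment_type(data) for item_id, data in column.items()}
```

Classifying each item independently is what lets a mixed column through: every attachment can then be routed to the matching image or PDF handling.
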

Why is this enhancement needed?
Cuts L2 cost by dropping irrelevant/duplicate rows early, improves relevance accuracy by judging real document content, and gives reviewers derived/filtered/sorted exports.
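
As a rough illustration of the derived/filtered/sorted exports mentioned above, post-processing at export could be a small pure function applied to the result rows. The function name `post_process`, its parameters, and the column names in the usage example are all hypothetical.

```python
def post_process(rows, computed=None, keep=None, sort_key=None, descending=False):
    """Apply computed columns, then an optional filter, then an optional sort.

    rows: list of dicts, one per exported row (hypothetical shape).
    computed: mapping of new column name -> function(row) returning its value.
    keep: predicate(row); rows for which it returns False are dropped.
    sort_key: name of the column to sort by.
    """
    computed = computed or {}
    out = []
    for row in rows:
        new_row = dict(row)  # copy so the caller's rows are not mutated
        for name, fn in computed.items():
            new_row[name] = fn(new_row)
        out.append(new_row)
    if keep is not None:
        out = [r for r in out if keep(r)]
    if sort_key is not None:
        out.sort(key=lambda r: r[sort_key], reverse=descending)
    return out


# Usage with made-up columns: weight the L2 score by relevance, drop
# low-relevance rows, and list the highest weighted score first.
rows = [
    {"id": 1, "l2_score": 7, "relevance": 0.9},
    {"id": 2, "l2_score": 9, "relevance": 0.4},
    {"id": 3, "l2_score": 8, "relevance": 0.8},
]
exported = post_process(
    rows,
    computed={"weighted": lambda r: r["l2_score"] * r["relevance"]},
    keep=lambda r: r["relevance"] >= 0.5,
    sort_key="weighted",
    descending=True,
)
```

Computed columns are applied first so that both the filter and the sort can refer to them.
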

Labels: enhancement
Status: In Progress