Skip to content

Prompt-driver OpenAI topic relevance validator#128

Merged
rkritika1508 merged 4 commits into
mainfrom
feat/open-ai-topic-relevance-fix
Jun 5, 2026
Merged

Prompt-driver OpenAI topic relevance validator#128
rkritika1508 merged 4 commits into
mainfrom
feat/open-ai-topic-relevance-fix

Conversation

@rkritika1508
Copy link
Copy Markdown
Collaborator

@rkritika1508 rkritika1508 commented Jun 4, 2026

Summary

Target issue is #129
Explain the motivation for making this change. What existing problem does the pull request solve?

Problem

The TopicRelevanceOpenAI validator had hardcoded scoring instructions that caused false positives for forbidden-topic configurations. When a user configured the system prompt with forbidden topics (e.g. "do not answer queries about gender detection"), the model was misled by the scoring semantics — score 3 meant "clearly in scope", which conflicted with the intent of an exclusion-based prompt where a high score should mean "clearly NOT forbidden".

Solution

Replaced the hardcoded _SCORING_INSTRUCTIONS string with a versioned file-based prompt system, matching the architecture already used by the TopicRelevance (Guardrails Hub) validator. Each version encodes a different scoring strategy:

Version Strategy Use case
v1 Allowed topics only Classic allowlist — score 3 = in scope, score 1 = out of scope
v2 Forbidden topics only Exclusion/blocklist — score 3 = clearly NOT forbidden, score 1 = clearly forbidden
v3 Combined allowed + forbidden Checks forbidden first, then allowed; designed for mixed configurations

The scoring instructions now travel as the user message (with {{USER_PROMPT}} marking where the query is injected), keeping the system message clean as pure topic configuration.

Changes

  • prompts/topic_relevance_openai/v1.md — allowed-topics scoring (mirrors original hardcoded logic)
  • prompts/topic_relevance_openai/v2.md — forbidden-topics scoring (fixes the false positive issue)
  • prompts/topic_relevance_openai/v3.md — combined strategy, prioritises forbidden topics, then checks allowed
  • topic_relevance_openai.py — added _load_prompt_template() with lru_cache, new prompt_schema_version: int = 1 param, system prompt is now stored as-is without appended instructions
  • topic_relevance_openai_safety_validator_config.py — exposed prompt_schema_version field so callers can select the strategy via config
  • test_topic_relevance_openai.py — updated 1 test to reflect the new attribute (_user_message_template), added 4 new tests covering v2/v3 loading and invalid version handling

Backward compatibility

Defaults to prompt_schema_version=1, which preserves the original scoring behaviour for existing deployments.

Checklist

Before submitting a pull request, please ensure that you mark these task.

  • Ran fastapi run --reload app/main.py or docker compose up in the repository root and test.
  • If you've fixed a bug or added code that is tested and has test cases.

Notes

Please add here if any other information is required for the reviewer.

Summary by CodeRabbit

  • New Features
    • Topic relevance validator now supports multiple versioned prompt templates for improved flexibility and configuration control.
    • Added configurable prompt schema version parameter to select between different validation template versions.
    • Prompt templates are now modular and versioned, enabling easier updates and maintenance of validation instructions.

@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented Jun 4, 2026

Review Change Stack

Warning

Review limit reached

@rkritika1508, we couldn't start this review because you've reached your PR review rate limit.

More reviews will be available in 43 minutes and 13 seconds. Learn how PR review limits work.

Your organization has run out of usage credits. Purchase more in the billing tab.

⌛ How to resolve this issue?

After more reviews become available, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.

🚦 How do rate limits work?

CodeRabbit enforces hourly rate limits for each developer per organization.

Our paid plans include higher PR review limits than trial, open-source, and free plans. In all cases, reviews become available again over time. During sustained high-volume PR review activity, CodeRabbit may temporarily slow when the next review becomes available.

Please see our Fair Usage Limits Policy for further information.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: a73b5544-1cfe-42c1-8cee-404793c83f8e

📥 Commits

Reviewing files that changed from the base of the PR and between 97d8c1b and d7d24b3.

📒 Files selected for processing (14)
  • backend/app/api/routes/guardrails.py
  • backend/app/core/config.py
  • backend/app/core/enum.py
  • backend/app/core/validators/config/topic_relevance_llm_safety_validator_config.py
  • backend/app/core/validators/prompts/topic_relevance_llm/v1.md
  • backend/app/core/validators/prompts/topic_relevance_llm/v2.md
  • backend/app/core/validators/prompts/topic_relevance_llm/v3.md
  • backend/app/core/validators/topic_relevance_llm.py
  • backend/app/core/validators/validators.json
  • backend/app/evaluation/topic_relevance/run.py
  • backend/app/schemas/guardrail_config.py
  • backend/app/tests/test_llm_validators.py
  • backend/app/tests/test_validate_with_guard.py
  • backend/app/tests/validators/test_topic_relevance_llm.py
📝 Walkthrough

Walkthrough

This PR implements versioned prompt templates for the TopicRelevanceOpenAI validator, replacing hardcoded scoring instructions with configurable markdown templates. It adds three template versions (v1, v2, v3), implements cached template loading with validation, updates the validator core to substitute placeholders, wires the schema version through configuration, and expands test coverage across all versions.

Changes

Prompt Versioning and Template Infrastructure

Layer / File(s) Summary
Prompt Template Definitions
backend/app/core/validators/prompts/topic_relevance_openai/v1.md, v2.md, v3.md
Three versioned markdown prompt templates define topic-relevance scoring criteria (1–3 scale), decision rules, and enforce JSON-only response format with {"scope_violation": <score>} contract.
Prompt Template Loading and Validation
backend/app/core/validators/topic_relevance_openai.py (lines 5–6, 26–53)
Adds imports for caching and filesystem access; implements _load_prompt_template() with input validation, file-exists checks, placeholder presence verification, and UTF-8 reading.
Validator Core: Template-Based Prompting
backend/app/core/validators/topic_relevance_openai.py (lines 62–103, 115–124)
Updates TopicRelevanceOpenAI.__init__ to accept prompt_schema_version, load templates, validate system prompt and template, and construct user messages by substituting {{USER_PROMPT}} with input text during validation.
Configuration Integration
backend/app/core/validators/config/topic_relevance_openai_safety_validator_config.py
Adds prompt_schema_version field (default 1, ge=1) to config and forwards it through build() to the validator.
Test Coverage and Verification
backend/app/tests/validators/test_topic_relevance_openai.py
Imports placeholder constant, refactors existing tests with descriptive validator names, adds user message construction test, and expands prompt-template suite to verify system/user content, version-specific template loading, and invalid schema version handling.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~22 minutes

Possibly related PRs

  • ProjectTech4DevAI/kaapi-guardrails#126: Extends the TopicRelevanceOpenAI validator implementation by adding versioned prompt template loading with prompt_schema_version parameter.
  • ProjectTech4DevAI/kaapi-guardrails#109: Connects at the shared prompt_schema_version wiring for topic relevance by populating that field from llm_prompt_config_crud in run_guardrails config resolution.

Suggested reviewers

  • nishika26
  • AkhileshNegi

Poem

🐰 Three prompts dance in versioned grace,
Templates load at runtime's pace,
User queries find their place,
Placeholders swap without a trace—
A validator now wears many faces! ✨

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 5.26% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Title check ✅ Passed The title 'Prompt-driver OpenAI topic relevance validator' clearly and concisely summarizes the main change: introducing a prompt-driven (versioned prompt templates) approach to the OpenAI topic relevance validator, replacing hardcoded scoring instructions.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch feat/open-ai-topic-relevance-fix

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Comment thread backend/app/tests/validators/test_topic_relevance_openai.py Outdated
Comment thread backend/app/tests/validators/test_topic_relevance_openai.py Outdated
Comment thread backend/app/tests/validators/test_topic_relevance_llm.py
@rkritika1508 rkritika1508 self-assigned this Jun 5, 2026
@rkritika1508 rkritika1508 added enhancement New feature or request ready-for-review labels Jun 5, 2026
@github-project-automation github-project-automation Bot moved this to To Do in Kaapi-dev Jun 5, 2026
@rkritika1508 rkritika1508 linked an issue Jun 5, 2026 that may be closed by this pull request
@rkritika1508 rkritika1508 merged commit 0d97649 into main Jun 5, 2026
2 checks passed
@github-project-automation github-project-automation Bot moved this from To Do to Closed in Kaapi-dev Jun 5, 2026
@rkritika1508 rkritika1508 deleted the feat/open-ai-topic-relevance-fix branch June 5, 2026 11:04
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

enhancement New feature or request ready-for-review

Projects

Status: Closed

Development

Successfully merging this pull request may close these issues.

Evaluation: Update scoring semantics

2 participants