Prompt-driver OpenAI topic relevance validator#128
Conversation
|
Warning Review limit reached
More reviews will be available in 43 minutes and 13 seconds. Learn how PR review limits work. Your organization has run out of usage credits. Purchase more in the billing tab. ⌛ How to resolve this issue?After more reviews become available, a review can be triggered using the We recommend that you space out your commits to avoid hitting the rate limit. 🚦 How do rate limits work?CodeRabbit enforces hourly rate limits for each developer per organization. Our paid plans include higher PR review limits than trial, open-source, and free plans. In all cases, reviews become available again over time. During sustained high-volume PR review activity, CodeRabbit may temporarily slow when the next review becomes available. Please see our Fair Usage Limits Policy for further information. ℹ️ Review info⚙️ Run configurationConfiguration used: Path: .coderabbit.yaml Review profile: CHILL Plan: Pro Run ID: 📒 Files selected for processing (14)
📝 WalkthroughWalkthroughThis PR implements versioned prompt templates for the TopicRelevanceOpenAI validator, replacing hardcoded scoring instructions with configurable markdown templates. It adds three template versions (v1, v2, v3), implements cached template loading with validation, updates the validator core to substitute placeholders, wires the schema version through configuration, and expands test coverage across all versions. ChangesPrompt Versioning and Template Infrastructure
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~22 minutes Possibly related PRs
Suggested reviewers
Poem
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
Summary
Target issue is #129
Explain the motivation for making this change. What existing problem does the pull request solve?
Problem
The
TopicRelevanceOpenAIvalidator had hardcoded scoring instructions that caused false positives for forbidden-topic configurations. When a user configured the system prompt with forbidden topics (e.g. "do not answer queries about gender detection"), the model was misled by the scoring semantics — score 3 meant "clearly in scope", which conflicted with the intent of an exclusion-based prompt where a high score should mean "clearly NOT forbidden".Solution
Replaced the hardcoded
_SCORING_INSTRUCTIONSstring with a versioned file-based prompt system, matching the architecture already used by theTopicRelevance(Guardrails Hub) validator. Each version encodes a different scoring strategy:v1v2v3The scoring instructions now travel as the user message (with
{{USER_PROMPT}}marking where the query is injected), keeping the system message clean as pure topic configuration.Changes
prompts/topic_relevance_openai/v1.md— allowed-topics scoring (mirrors original hardcoded logic)prompts/topic_relevance_openai/v2.md— forbidden-topics scoring (fixes the false positive issue)prompts/topic_relevance_openai/v3.md— combined strategy, prioritises forbidden topics, then checks allowedtopic_relevance_openai.py— added_load_prompt_template()withlru_cache, newprompt_schema_version: int = 1param, system prompt is now stored as-is without appended instructionstopic_relevance_openai_safety_validator_config.py— exposedprompt_schema_versionfield so callers can select the strategy via configtest_topic_relevance_openai.py— updated 1 test to reflect the new attribute (_user_message_template), added 4 new tests covering v2/v3 loading and invalid version handlingBackward compatibility
Defaults to
prompt_schema_version=1, which preserves the original scoring behaviour for existing deployments.Checklist
Before submitting a pull request, please ensure that you mark these task.
fastapi run --reload app/main.pyordocker compose upin the repository root and test.Notes
Please add here if any other information is required for the reviewer.
Summary by CodeRabbit