Code Review Assistant
Evaluation suite for the AI-powered code review tool. Focuses on suggestion accuracy and security vulnerability detection.
Status
FAILCases
64
Visibility
Public
Baseline
run-005
Target Setup
OpenAI
Auth: Server KeyModel: gpt-4o-miniHealth: Not Tested
Thresholds
Pass Rate Min85%
Faithfulness Min0.8
PII Max0
Secrets Max0
Jailbreak Max2
Policy Configuration
PIIMonitor
SecretsBlocking
JailbreakBlocking
ToxicityMonitor
Case Uploads
| Status | File | Date |
|---|---|---|
| SUCCESS | code_review_new_cases.csv | Dec 11, 2025 |