OpenEvalOps

Code Review Assistant

Evaluation suite for the AI-powered code review tool. Focuses on suggestion accuracy and security vulnerability detection.

Status

FAIL

Cases

64

Visibility

Public

Baseline

run-005

Target Setup

OpenAI
Auth: Server KeyModel: gpt-4o-miniHealth: Not Tested

Thresholds

Pass Rate Min85%
Faithfulness Min0.8
PII Max0
Secrets Max0
Jailbreak Max2

Policy Configuration

PIIMonitor
SecretsBlocking
JailbreakBlocking
ToxicityMonitor

Case Uploads

StatusFileDate
SUCCESScode_review_new_cases.csvDec 11, 2025

Recent Runs

ResultRun IDStarted
FAILrun-00510h ago
WARNrun-0061d ago