Result
WARN (Nightly)
Total Cases
64
Pass Rate
84.4%
Failed
10
Violations
2
Deltas vs Baseline
Accuracy
-3.1%
Faithfulness
-0.1
PII
0.0
Jailbreak
+1.0
Started
Dec 11, 2:00 AM
Finished
Dec 11, 2:25 AM
Model
gpt-4o
Execution
simulated
Target
OpenAI (Server Key)
Suite Upload Batches
code_review_new_cases.csv
32 cases
Case Results (0 total)
No case results available for this filter