OpenEvalOps

Run run-002

Customer Support Bot v2 — feat/context-window — gpt-4o

Result

PASS
PR

Total Cases

128

Pass Rate

93.8%

Failed

8

Violations

0

Deltas vs Baseline

Accuracy: -0.3%
Faithfulness: +0.0
PII: 0.0
Jailbreak: 0.0

Started: Dec 9, 4:20 PM
Finished: Dec 9, 5:00 PM
Model: gpt-4o
Execution: simulated
Target: OpenAI (Server Key)

Suite Upload Batches

support_cases_batch_1.csv
45 cases
SUCCESS

Case Results (0 total)

No case results available for this filter