OpenEvalOps

Run run-010

Summarization — Earnings Reports — main — claude-3.5-sonnet

Result

PASS (Nightly)

Total Cases

96

Pass Rate

93.8%

Failed

6

Violations

0

Deltas vs Baseline

Accuracy: +0.3%
Faithfulness: +0.0
PII: 0.0
Jailbreak: 0.0

Started: Dec 8, 3:00 AM
Finished: Dec 8, 3:35 AM
Model: claude-3.5-sonnet
Execution: simulated
Target: OpenAI (Server Key)

Suite Upload Batches

No upload batches are linked to this suite.

Case Results (0 total)

No case results are available for this filter.