Working note · Mar 10, 2026
Sonnet vs. Haiku for BOM column mapping — an empirical note
A quick harness, a small eval set, and a surprisingly large delta on edge cases.
#claude #evals #agents
I ran Sonnet and Haiku against the same 200-row labeled BOM set and measured column-mapping precision, with particular attention to the hard cases (ambiguous headers, multi-language part numbers, suppliers with non-standard columns). The precision gap between the two models was roughly 3× larger on edge cases than on easy cases.
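For anyone who wants to reproduce the comparison, here is a minimal sketch of how the per-bucket precision and the edge-vs-easy ratio could be computed. It is not the actual harness: the `Prediction` record, the `easy`/`edge` bucket labels, the `model_a`/`model_b` names, and the toy rows are all assumptions made up for illustration, and the toy numbers are not results from the 200-row set.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Prediction:
    """One model output for one labeled BOM column (hypothetical schema)."""
    model: str                # e.g. "model_a" or "model_b"
    bucket: str               # "easy" or "edge" (assumed labels)
    predicted: Optional[str]  # canonical field chosen by the model, None = abstained
    expected: str             # labeled canonical field


def precision_by_bucket(preds):
    """Precision = correct mappings / emitted mappings, per (model, bucket)."""
    correct = defaultdict(int)
    emitted = defaultdict(int)
    for p in preds:
        if p.predicted is None:
            continue  # abstentions are not emitted mappings, so they are skipped
        key = (p.model, p.bucket)
        emitted[key] += 1
        correct[key] += int(p.predicted == p.expected)
    return {key: correct[key] / emitted[key] for key in emitted}


def delta_ratio(prec, model_a, model_b):
    """Ratio of the edge-case precision gap to the easy-case precision gap."""
    easy_gap = prec[(model_a, "easy")] - prec[(model_b, "easy")]
    edge_gap = prec[(model_a, "edge")] - prec[(model_b, "edge")]
    if easy_gap == 0:
        return None  # ratio is undefined when the models tie on easy cases
    return edge_gap / easy_gap


def _toy(model, bucket, n_correct, n_wrong):
    """Build made-up rows: n_correct right mappings and n_wrong wrong ones."""
    rows = [Prediction(model, bucket, "part_number", "part_number")] * n_correct
    rows += [Prediction(model, bucket, "description", "part_number")] * n_wrong
    return rows


# Toy data only, chosen so the arithmetic is easy to check by hand.
TOY = (
    _toy("model_a", "easy", 4, 0)
    + _toy("model_a", "edge", 3, 1)
    + _toy("model_b", "easy", 3, 1)
    + _toy("model_b", "edge", 1, 3)
)
TOY.append(Prediction("model_b", "edge", None, "part_number"))  # one abstention

if __name__ == "__main__":
    prec = precision_by_bucket(TOY)
    for key in sorted(prec):
        print(key, prec[key])
    print("edge/easy gap ratio:", delta_ratio(prec, "model_a", "model_b"))
```

Reporting the ratio of the gaps, rather than the raw precision numbers alone, makes it visible when two models look close on easy rows but diverge on hard ones. The ratio becomes unstable when the easy-case gap is near zero, so it is worth showing the per-bucket precisions next to it.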
TODO: drop in the actual table once I clean up the harness.