Writing

Field notes from running AI programs in large enterprises. Honest, specific, occasionally opinionated.

Agent Evaluation Jun 2025

If You're Validating a Customer Service Agent, Here's How I'd Approach It

3 tiers, 24 measures, a hard safety gate, and a triage system for figuring out not just that something failed, but why, and what to fix.

Read more →