Writing

Field notes from running AI programs in large enterprises. Honest, specific, occasionally opinionated.

Agent Evaluation Jun 2025

If You're Validating a Customer Service Agent, Here's How I'd Approach It

Three tiers, 24 measures, a hard safety gate, and a triage system for working out not just that something failed, but why, and what to fix.

Test Automation · Architecture May 2025

One Way to Generate Test Cases Using Agents — A RAG + Few-Shot Approach Worth Exploring

A RAG + few-shot prompting pattern for test case generation, with a feedback loop intended to improve the output over time. Partly built, partly hypothetical, fully worth thinking about.