Writing

Field notes from running AI programs in large enterprises. Honest, specific, occasionally opinionated.

Agent Evaluation Jun 2025

If You're Validating a Customer Service Agent, Here's How I'd Approach It

3 tiers, 24 measures, a hard safety gate, and a triage system for figuring out not just that something failed, but why, and what to fix.

Read more →

Test Automation · Architecture May 2025

One Way to Generate Test Cases Using Agents: A RAG + Few-Shot Approach Worth Exploring

Exploring a RAG + few-shot prompting pattern for test case generation, with a feedback loop that gets smarter over time. Partly built, partly hypothetical, fully worth thinking about.

Read more →