Currently: Shipping AI into a real-estate acquisitions workflow
Location: San Mateo, CA
Updated: 2026-07-03

§ · Writing

Builder notes on the AI tools I run on real deals.

One idea per post. Architecture decisions, things I tried first and abandoned, what I'd do differently.

007

The dashboard was wrong for two weeks. Every health check was green.

Model evals don't catch a broken data layer. The fix is the same discipline pointed at the database — checks on the numbers people actually act on.

AI

2026 · 06 · 10

006

The feature that tested seven points better and changed almost nothing

How the eval caught a leaked metric — and then told me how to make the feature actually useful.

AI

2026 · 06 · 08

005

An accuracy number is meaningless without the cost of being wrong

Not all classification errors cost the same. The eval reports two numbers and ranks every failure by business impact.

AI

2026 · 06 · 05

004

You can't grade a language model against its own guesses

The fastest way to build a useless eval is to use the model's own output as the answer key.

AI

2026 · 06 · 02

003

The one place the LLM lives in my data pipeline

Dozens of Python tools, exactly one LLM call per document. The boundary is the design decision.

AI

2026 · 05 · 25

002

Why I moved my rent-benchmark and income-statement workflows out of Claude chat into Skills

Three problems with running spreadsheet workflows in Claude chat that pushed me to build dedicated skills.

AI

2026 · 05 · 22

001

What this site is

An introduction to the site — who I am, what I build, and what to expect here.

2026 · 05 · 17