Currently: Shipping AI into a real-estate acquisitions workflow
Location: San Mateo, CA
Updated: 2026-07-03
§ · Writing

Builder notes on the AI tools I run on real deals.

One idea per post. Architecture decisions, things I tried first and abandoned, what I'd do differently.

  1. 007

    The dashboard was wrong for two weeks. Every health check was green.

    Model evals don't catch a broken data layer. The fix is the same discipline pointed at the database — checks on the numbers people actually act on.

    AI
    2026 · 06 · 10
  2. 006

    The feature that tested seven points better and changed almost nothing

    How the eval caught a leaked metric — and then told me how to make the feature actually useful.

    AI
    2026 · 06 · 08
  3. 005

    An accuracy number is meaningless without the cost of being wrong

    Not all classification errors cost the same. The eval reports two numbers and ranks every failure by business impact.

    AI
    2026 · 06 · 05
  4. 004

    You can't grade a language model against its own guesses

    The fastest way to build a useless eval is to use the model's own output as the answer key.

    AI
    2026 · 06 · 02
  5. 003

    The one place the LLM lives in my data pipeline

    Dozens of Python tools, exactly one LLM call per document. The boundary is the design decision.

    AI
    2026 · 05 · 25
  6. 002

    Why I moved my rent-benchmark and income-statement workflows out of Claude chat into Skills

    Three problems running spreadsheet workflows in Claude chat that pushed me to build dedicated skills.

    AI
    2026 · 05 · 22
  7. 001

    What this site is

    An introduction to the site — who I am, what I build, and what to expect here.

    2026 · 05 · 17