N° 001 / Index

I build LLM tools — for my own real-estate underwriting work.

I work in acquisitions at Haven Realty Capital, underwriting rental housing — single-family rentals and build-to-rent communities — across the Sunbelt. The unusual part: I scope the problem, prompt-engineer the workflows, and direct Claude through the build of the AI tooling we use internally — Claude Code skills, Model Context Protocol servers, agent workflows. Four production tools shipped since March 2026, two more in build; each one compresses a specific part of the workflow. A one-page deal summary that used to take two hours now takes fifteen minutes.

§ 02 Projects Internal tools shipped to teammates

Internal tooling In production

Claude Code skills suite

Five skills I run on most deals — a one-page deal summary from the broker's pitch document, an underwriting financial model, rent benchmarks against comparable nearby properties, a unit-by-unit rent-roll summary, and categorization of the trailing twelve-month income statement. Each is a Markdown spec plus a folder of Python tools; Claude orchestrates, Python does the deterministic work. The deal-summary skill went from ~2 hours of manual work to ~15 minutes.

Claude Code Python openpyxl since 2025
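To illustrate the split between orchestration and deterministic work in the skills suite, here is a minimal sketch of the kind of Python tool a rent-roll skill might call. The function name `summarize_rent_roll` and the field names `rent` and `occupied` are assumptions made for this example, not the actual tools.

```python
from statistics import mean


def summarize_rent_roll(units: list) -> dict:
    """Summarize a unit-by-unit rent roll with plain arithmetic.

    Each unit is a dict with hypothetical keys "rent" (monthly rent)
    and "occupied" (bool). No model call is involved at this step.
    """
    occupied = [u for u in units if u["occupied"]]
    return {
        "unit_count": len(units),
        # Share of units currently occupied.
        "occupancy": len(occupied) / len(units) if units else 0.0,
        # Average rent across occupied units only.
        "avg_in_place_rent": mean(u["rent"] for u in occupied) if occupied else 0.0,
    }
```

Because the function is pure arithmetic over its input, the same rent roll produces the same summary on every run, which is the point of keeping this work out of the model.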

Internal tooling · agent Coming soon

Underwriting agent

An orchestration agent that takes a deal from broker email to a finished analysis pack. When a deal lands in my inbox, the agent gathers the source files into a fresh folder, runs the underlying analysis skills in the right order, and drafts a one-paragraph summary for me.

Claude Agent SDK Python Supabase since 2026

Pipeline · data ingest In production

Off-market and supply pipeline

An automated pipeline that watches public records across eight Sunbelt metros for two signals — new rental-housing supply heading into the market, and off-market deals that haven't hit commercial datasets yet. Each new filing gets classified by a language model into a structured schema and surfaced to a dashboard.

Python Postgres Railway since 2024
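The classification step in the pipeline above, where a language model turns each filing into a structured record, could be guarded by a validator along these lines. This is a sketch under stated assumptions: the label sets, the field names, and the `parse_classification` helper are all hypothetical, and the real schema is not shown on this page.

```python
import json
from dataclasses import dataclass
from typing import Optional

# Hypothetical label sets, chosen only for illustration.
SIGNALS = {"new_supply", "off_market", "neither"}
PROPERTY_TYPES = {"single_family_rental", "build_to_rent", "other"}


@dataclass(frozen=True)
class FilingRecord:
    filing_id: str
    metro: str
    signal: str
    property_type: str
    unit_count: Optional[int]


def parse_classification(filing_id: str, metro: str, raw_json: str) -> FilingRecord:
    """Validate a model's JSON reply and convert it to a typed record.

    Raises ValueError on malformed JSON or on any value outside the schema,
    so a bad reply never reaches the database or the dashboard.
    """
    data = json.loads(raw_json)  # JSONDecodeError is a ValueError subclass
    signal = data.get("signal")
    if signal not in SIGNALS:
        raise ValueError(f"unexpected signal: {signal!r}")
    property_type = data.get("property_type")
    if property_type not in PROPERTY_TYPES:
        raise ValueError(f"unexpected property_type: {property_type!r}")
    units = data.get("unit_count")
    if units is not None:
        # bool is a subclass of int, so reject it explicitly.
        if isinstance(units, bool) or not isinstance(units, int) or units < 0:
            raise ValueError(f"unexpected unit_count: {units!r}")
    return FilingRecord(filing_id, metro, signal, property_type, units)
```

Rejecting anything outside the schema at this boundary keeps every downstream step working with typed, predictable records.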

Internal tooling In production

MCP server for the off-market and supply pipeline

A read-only Model Context Protocol server that exposes the supply pipeline to Claude as nine tools. My team logs into claude.ai, asks something like "what's been filed in this part of town in the last 30 days" in plain English, and the server queries the underlying database and returns a clean answer.

Python FastMCP OAuth 2.1 since 2025
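A query behind a question like "what's been filed in this part of town in the last 30 days" might look like the sketch below. It uses the standard-library `sqlite3` module as a stand-in for the production database so the example runs on its own. The `filings` table, its columns, and the `recent_filings` function are assumptions for illustration, and registering the function as an MCP tool is left out.

```python
import sqlite3
from datetime import date, timedelta
from typing import Optional


def recent_filings(conn: sqlite3.Connection, area: str, days: int = 30,
                   today: Optional[date] = None) -> list:
    """Return filings for one area from the last `days` days, newest first.

    Only a parameterized SELECT is issued, so the function cannot modify data.
    Dates are assumed to be stored as ISO strings (YYYY-MM-DD), which compare
    correctly as text.
    """
    if days <= 0:
        raise ValueError("days must be positive")
    today = today or date.today()
    cutoff = (today - timedelta(days=days)).isoformat()
    rows = conn.execute(
        "SELECT filing_id, area, filed_on, summary FROM filings "
        "WHERE area = ? AND filed_on >= ? ORDER BY filed_on DESC",
        (area, cutoff),
    ).fetchall()
    return [
        {"filing_id": r[0], "area": r[1], "filed_on": r[2], "summary": r[3]}
        for r in rows
    ]
```

In a real deployment the read-only guarantee would more likely come from the database role than from the query text; the parameterized SELECT here is only the application-side half of that.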

Internal tooling · evaluation In production

Evaluation framework

A labeled gold-standard set plus scoring scripts that catch regressions before prompt or model changes are merged. Built first for the property-type classifier in my supply pipeline (independently labeled goldens, cost-sensitive scoring, deterministic runs), and now being extended to the higher-volume document-extraction step.

Python pytest-style runner Anthropic API since 2026
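Cost-sensitive scoring against a gold set, as used in the evaluation framework, can be sketched as follows. The labels, the cost values, and the `score` function are hypothetical; the real cost matrix would come from what each kind of mistake costs on an actual deal.

```python
# Hypothetical cost matrix: (gold label, predicted label) -> cost of the error.
COSTS = {
    ("build_to_rent", "other"): 5.0,  # assumed: a missed relevant filing is expensive
    ("other", "build_to_rent"): 1.0,  # assumed: a false alarm only costs review time
}
DEFAULT_ERROR_COST = 1.0


def score(goldens: dict, predictions: dict) -> dict:
    """Compare predictions with gold labels, keyed by item id.

    Returns plain accuracy, total error cost, and the list of errors
    sorted from most to least costly (ties broken by item id).
    """
    errors = []
    for item_id, gold in goldens.items():
        pred = predictions.get(item_id)  # a missing prediction counts as an error
        if pred != gold:
            cost = COSTS.get((gold, pred), DEFAULT_ERROR_COST)
            errors.append((cost, item_id, gold, pred))
    errors.sort(key=lambda e: (-e[0], e[1]))
    n = len(goldens)
    return {
        "accuracy": (n - len(errors)) / n if n else 0.0,
        "total_cost": sum(e[0] for e in errors),
        "errors": errors,
    }
```

Two runs with the same accuracy can differ widely in total cost, which is why the sketch reports both numbers and ranks the failures.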

Internal tooling · retrieval In production

Deal-document search

A hybrid search system over my team's years of deal documents — pitch memos, broker emails, internal notes. A relational database for the structured facts, vector embeddings for the prose, and a router that picks whichever fits the question.

Python ChromaDB Voyage embeddings since 2026
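The router in the deal-document search decides between the relational store and the vector index. A minimal keyword-based version is sketched below, purely as an assumption about how such a router could work; the cue list and the `route` function are invented for this example, and the real router may well use a model instead.

```python
import re

# Hypothetical cues suggesting the question asks for a structured fact
# (counts, prices, dates) rather than a passage of prose.
STRUCTURED_CUES = re.compile(
    r"\b(how many|average|total|count|price|units?|cap rate|"
    r"between|since|before|after)\b|\d",
    re.IGNORECASE,
)


def route(question: str) -> str:
    """Return "sql" for fact-style questions and "vector" for everything else."""
    return "sql" if STRUCTURED_CUES.search(question) else "vector"
```

For example, `route("How many units did the seller report?")` selects the relational path, while a question about what concerns a broker raised falls through to the vector index.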

§ 03 Writing Builder notes on the AI tools I run on real deals · all 7 posts

007

The dashboard was wrong for two weeks. Every health check was green.

Model evals don't catch a broken data layer. The fix is the same discipline pointed at the database — checks on the numbers people actually act on.

AI

2026 · 06 · 10

006

The feature that tested seven points better and changed almost nothing

How the eval caught a leaked metric — and then told me how to make the feature actually useful.

AI

2026 · 06 · 08

005

An accuracy number is meaningless without the cost of being wrong

Not all classification errors cost the same. The eval reports two numbers and ranks every failure by business impact.

AI

2026 · 06 · 05

004

You can't grade a language model against its own guesses

The fastest way to build a useless eval is to use the model's own output as the answer key.

AI

2026 · 06 · 02

003

The one place the LLM lives in my data pipeline

Dozens of Python tools, exactly one LLM call per document. The boundary is the design decision.

AI

2026 · 05 · 25

002

Why I moved my rent-benchmark and income-statement workflows out of Claude chat into Skills

Three problems running spreadsheet workflows in Claude chat that pushed me to build dedicated skills.

AI

2026 · 05 · 22

001

What this site is

An introduction to the site — who I am, what I build, and what to expect here.

2026 · 05 · 17

End of index Subscribe via RSS →