<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"><channel><title>Herman Chan</title><description>Builder notes on the AI tools I run on real deals.</description><link>https://hermanchanai.com/</link><item><title>The dashboard was wrong for two weeks. Every health check was green.</title><link>https://hermanchanai.com/posts/every-health-check-was-green/</link><guid isPermaLink="true">https://hermanchanai.com/posts/every-health-check-was-green/</guid><description>Model evals don&apos;t catch a broken data layer. The fix is the same discipline pointed at the database — checks on the numbers people actually act on.</description><pubDate>Wed, 10 Jun 2026 00:00:00 GMT</pubDate></item><item><title>The feature that tested seven points better and changed almost nothing</title><link>https://hermanchanai.com/posts/the-feature-that-tested-well/</link><guid isPermaLink="true">https://hermanchanai.com/posts/the-feature-that-tested-well/</guid><description>How the eval caught a leaked metric — and then told me how to make the feature actually useful.</description><pubDate>Mon, 08 Jun 2026 00:00:00 GMT</pubDate></item><item><title>An accuracy number is meaningless without the cost of being wrong</title><link>https://hermanchanai.com/posts/cost-of-being-wrong/</link><guid isPermaLink="true">https://hermanchanai.com/posts/cost-of-being-wrong/</guid><description>Not all classification errors cost the same. The eval reports two numbers and ranks every failure by business impact.</description><pubDate>Fri, 05 Jun 2026 00:00:00 GMT</pubDate></item><item><title>You can&apos;t grade a language model against its own guesses</title><link>https://hermanchanai.com/posts/grading-an-llm-against-its-own-guesses/</link><guid isPermaLink="true">https://hermanchanai.com/posts/grading-an-llm-against-its-own-guesses/</guid><description>The fastest way to build a useless eval is to use the model&apos;s own output as the answer key.</description><pubDate>Tue, 02 Jun 2026 00:00:00 GMT</pubDate></item><item><title>The one place the LLM lives in my data pipeline</title><link>https://hermanchanai.com/posts/one-place-the-llm-lives/</link><guid isPermaLink="true">https://hermanchanai.com/posts/one-place-the-llm-lives/</guid><description>Dozens of Python tools, exactly one LLM call per document. The boundary is the design decision.</description><pubDate>Mon, 25 May 2026 00:00:00 GMT</pubDate></item><item><title>Why I moved my rent-benchmark and income-statement workflows out of Claude chat into Skills</title><link>https://hermanchanai.com/posts/rent-comps-t12-skills/</link><guid isPermaLink="true">https://hermanchanai.com/posts/rent-comps-t12-skills/</guid><description>Three problems running spreadsheet workflows in Claude chat that pushed me to build dedicated skills.</description><pubDate>Fri, 22 May 2026 00:00:00 GMT</pubDate></item><item><title>What this site is</title><link>https://hermanchanai.com/posts/what-this-site-is/</link><guid isPermaLink="true">https://hermanchanai.com/posts/what-this-site-is/</guid><description>An introduction to the site — who I am, what I build, and what to expect here.</description><pubDate>Sun, 17 May 2026 00:00:00 GMT</pubDate></item></channel></rss>