How Claude Fable Helped Ship a Safer Open-Source Library — and What the $149 Price Tag Reveals About AI-Assisted Development
Simon Willison used Claude Fable in Claude Code to catch a critical data-loss bug and ship sqlite-utils 4.0rc2 across 37 prompts and 34 commits, at an estimated API cost of $149.25 (≈ ₹12,686). The case reveals how AI code review, cross-model validation, and subagent workflows are becoming practical — and expensive — tools in modern software development.
A $149 AI Session That Caught a Data-Loss Bug Before It Shipped
In early July 2026, developer Simon Willison published a detailed account on his blog, simonwillison.net, of how he used Claude Fable — Anthropic’s most capable model available on the Claude Max subscription — to prepare the open-source tool sqlite-utils for a stable 4.0 release. The result: 37 prompts, 34 code commits, changes across 30 files, and a discovered bug so serious that it would have silently erased users’ data. The estimated cost for the AI work, had it been billed at direct API rates, was $149.25 (approximately ₹12,686 at current exchange rates).
If you are not a developer, this story still matters to you — because it illustrates exactly what AI coding assistants are becoming capable of, what their real-world costs look like, and where the genuine limitations remain.
What Is sqlite-utils, and Why Does This Matter?
sqlite-utils is a widely used Python library and command-line tool that helps developers work with SQLite databases, the same kind of lightweight database that powers everything from mobile apps to internal analytics dashboards. A “major version” release like 4.0 signals breaking changes: things that worked in the old version may behave differently in the new one. Simon Willison, the tool’s author, follows Semantic Versioning (SemVer) discipline, meaning he tries to make such releases rare and carefully considered.
The challenge he faced was ensuring that every edge case in the new transaction model — the way the library saves data to disk — was correct before shipping. This is painstaking, detail-oriented work. It is precisely the kind of task where a tireless AI reviewer can add significant value.
The Bug Claude Found That Willison Had Missed
Willison started with a single prompt in Claude Code, run from his iPhone: “Final review before shipping a stable 4.0 release — very important to spot any last minute things that would be a breaking change if we fix them later.”
According to Willison’s account, Claude Fable returned an initial report identifying five “release blockers.” The worst was a data-loss bug in the delete_where() method. In plain language: when you told the library to delete certain rows from a database, it deleted them in memory but never properly saved that deletion to disk. Worse, it left the database connection in a broken state, so subsequent write operations — inserts, updates — also failed to save. When you closed the database and reopened it, all of that work was simply gone.
Willison’s own words on his blog capture the severity: “Reopen: rows are [0, 1, 2] — the delete, row 50, AND table u are all gone.”
This is the kind of bug that is genuinely hard to spot in code review because everything appears to work while the database is open. It only fails silently on close. Claude Fable caught it by systematically tracing the transaction logic through the codebase — something a human reviewer doing a quick scan might easily miss.
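For readers who want to see this class of failure in action, here is a minimal sketch using Python's built-in sqlite3 module. It is not the sqlite-utils code and not the actual delete_where() bug; the table name t, the file demo.db, and the helper ids_after_reopen() are all hypothetical. It only shows how a delete can look successful while the connection is open and then vanish when the connection closes without a commit.

```python
import os
import sqlite3
import tempfile


def ids_after_reopen(commit: bool) -> list[int]:
    """Delete rows, optionally commit, then reopen the file and list what survived."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "demo.db")  # hypothetical throwaway database

        conn = sqlite3.connect(path)
        conn.execute("CREATE TABLE t (id INTEGER PRIMARY KEY)")
        conn.executemany("INSERT INTO t (id) VALUES (?)", [(i,) for i in range(1, 6)])
        conn.commit()  # rows 1..5 are now safely on disk

        # The delete opens an implicit transaction that is not yet committed.
        conn.execute("DELETE FROM t WHERE id > 3")

        # While the connection is open, the delete appears to have worked.
        in_session = [row[0] for row in conn.execute("SELECT id FROM t ORDER BY id")]
        assert in_session == [1, 2, 3]

        if commit:
            conn.commit()
        conn.close()  # closing without a commit discards the pending delete

        # Reopen the file to see what was actually saved.
        conn = sqlite3.connect(path)
        saved = [row[0] for row in conn.execute("SELECT id FROM t ORDER BY id")]
        conn.close()
        return saved


print("without commit:", ids_after_reopen(commit=False))
print("with commit:   ", ids_after_reopen(commit=True))
```

Without the commit, the reopened database still contains all five rows, even though the same session had just reported only three. That gap between what the session shows and what the file holds is why tests that never close and reopen the database can pass while the bug remains.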
What a Non-Technical Indian Professional Can Learn From This
Imagine you are leading a data operations team at a mid-sized NBFC (non-banking financial company) in Pune. Your engineering team is about to ship an upgrade to an internal tool that records loan disbursement entries into a database. A senior developer has reviewed the code. Tests have passed. You are ready to go live.
Now consider: what if there were a transaction bug similar to the one Willison described, where disbursement records appeared to save during a session but were silently rolled back when the database connection closed? In a financial context, that is not just a technical inconvenience. It is a compliance and audit risk.
This is the category of value that AI code review tools like Claude Code are beginning to provide. Not writing code from scratch, but acting as an exhaustive second reviewer — one that traces logic paths systematically and does not get tired after reviewing the fifteenth file.
Willison’s account notes that he was able to attend the Half Moon Bay 4th of July parade while Fable worked through tasks that took 10–15 minutes each. He checked in on his phone between prompts. The AI handled the investigation; the human made the final judgement calls.
The Cross-Model Review That Caught Even More
One of the more revealing details in Willison’s account is his use of a second AI — GPT-5.5 via Codex Desktop — to review Fable’s work. He wrote on his blog: “I used to think that the idea of having one model review the work of another was somewhat absurd — it felt weirdly superstitious. The problem is it really does work.”
The GPT-5.5 review surfaced two additional issues:
- A db.query() method that committed a database update to disk before raising an error, so the update happened even though the method said it failed.
- An INSERT ... RETURNING statement that only committed when you iterated through all the results, meaning developers who stopped reading after the first row would leave the database in an inconsistent state.
Willison then pasted those findings into a fresh Claude Fable session, which confirmed both problems and produced fixes.
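The second finding is easier to grasp with a small sketch. The code below is an illustration of the general pattern, not the library's implementation: insert_and_yield() is a hypothetical helper, and it uses a plain INSERT followed by a SELECT rather than INSERT ... RETURNING so that it runs on any SQLite version. The point is that a commit placed after a lazy loop runs only if the caller reads every row.

```python
import os
import sqlite3
import tempfile
from typing import Iterator


def insert_and_yield(conn: sqlite3.Connection, names: list[str]) -> Iterator[tuple]:
    """Hypothetical helper: write rows, hand them back lazily, commit at the very end."""
    conn.executemany("INSERT INTO people (name) VALUES (?)", [(n,) for n in names])
    for row in conn.execute("SELECT id, name FROM people ORDER BY id"):
        yield row
    conn.commit()  # reached only when the caller has consumed every row


def names_after_reopen(read_all: bool) -> list[str]:
    """Run the helper, reading all rows or just the first, then check what was saved."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "lazy.db")  # hypothetical throwaway database

        conn = sqlite3.connect(path)
        conn.execute("CREATE TABLE people (id INTEGER PRIMARY KEY, name TEXT)")
        conn.commit()

        rows = insert_and_yield(conn, ["asha", "ravi", "meera"])
        if read_all:
            list(rows)    # exhausts the generator, so the commit runs
        else:
            next(rows)    # caller only wanted the first row
            rows.close()  # generator is abandoned before the commit line
        conn.close()

        conn = sqlite3.connect(path)
        saved = [row[0] for row in conn.execute("SELECT name FROM people ORDER BY id")]
        conn.close()
        return saved


print("read every row:     ", names_after_reopen(read_all=True))
print("stopped after first:", names_after_reopen(read_all=False))
```

In this sketch, the caller who stops after the first row ends up with nothing saved at all. A common remedy for this pattern is to commit before handing any rows back, or to collect the rows eagerly; which approach the actual fix took is not stated in the account.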
For non-technical readers, the lesson here is structural: AI models have blind spots, and those blind spots can be partially compensated for by using a different model as a reviewer. This is now a documented, practical workflow — not a theoretical idea.
The Real Cost: ₹12,686 Worth of AI Work
Willison was curious about the API-equivalent cost of the session. He used a tool called AgentsView inside his Claude Code session to calculate it. The breakdown, as reported on his blog:
- Main session (claude-fable-5): $141.02 (≈ ₹11,987)
- Four specialist sub-agent sessions: $7.91 combined (≈ ₹672)
- One cheaper model (Claude Opus 4.8) for counting tasks: $0.32 (≈ ₹27)
- Total: $149.25 (≈ ₹12,686)
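The arithmetic behind that breakdown can be checked in a few lines. The dollar figures are the ones reported above; the exchange rate of ₹85 per dollar is an assumption, inferred from the rupee figures quoted in this article rather than taken from Willison's post.

```python
from decimal import Decimal

# Dollar figures as reported in the breakdown above.
costs_usd = {
    "main session (claude-fable-5)": Decimal("141.02"),
    "four sub-agent sessions": Decimal("7.91"),
    "cheaper model for counting tasks": Decimal("0.32"),
}

# Assumption: rate implied by the rupee figures in this article, not an official rate.
INR_PER_USD = Decimal("85")


def to_inr(usd: Decimal) -> Decimal:
    """Convert dollars to whole rupees at the assumed rate."""
    return (usd * INR_PER_USD).quantize(Decimal("1"))


total_usd = sum(costs_usd.values(), Decimal("0"))
main_share = costs_usd["main session (claude-fable-5)"] / total_usd

for label, usd in costs_usd.items():
    print(f"{label}: ${usd} (about ₹{to_inr(usd)})")
print(f"total: ${total_usd} (about ₹{to_inr(total_usd)})")
print(f"main session share of total: {main_share:.1%}")
```

The three line items do add up to $149.25, and the main session accounts for roughly 94 percent of the total, which is why the choice of model for that session dominates the bill.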
Willison was on the Claude Max $200/month plan (≈ ₹17,000/month), which he upgraded from the $100/month tier specifically to get more Fable usage before a July 7th deadline — after which even Max subscribers would pay full API rates for the model.
His own reflection: “I really should have followed my own advice and leaned more heavily into subagents with cheaper models.” The implication is that a smarter workflow — routing simpler tasks to cheaper models and reserving Fable for the hardest reasoning — could have significantly reduced the cost.
The Honest Limitations You Should Know
Before you conclude that AI can fully replace engineering judgment, Willison’s account surfaces several important constraints.
What to Watch Next
The sqlite-utils 4.0 stable release had not shipped at the time of Willison’s post — rc2 was the release candidate. Watch the sqlite-utils changelog for the stable release announcement.
More broadly, the workflow Willison describes — AI as exhaustive pre-release reviewer, cross-model validation, subagent cost optimisation — represents a maturing pattern in AI-assisted development. If your organisation ships internal tools or manages data pipelines, this is the category of capability worth understanding now, before it becomes standard practice in your industry.
