How Claude Code Ported a 0.2B AI Image-Editing Model to Run Entirely in Your Browser
Simon Willison used Claude Code to convert the Moebius 0.2B inpainting model from a Python-only tool into a fully browser-based app running on WebGPU — without writing a single line of code himself. The exercise reveals how Claude Code can handle model conversion, cloud deployment, and UI building autonomously, pointing toward a future where AI image tools run locally in a browser tab with no server required.
An AI Photo-Editing Model Now Runs Directly in Your Browser — Thanks to Claude Code
Imagine erasing an unwanted object from a photograph (a stray cable, a passerby, a logo on a shirt) entirely within a browser tab, without uploading the image to a server, subscribing to a cloud AI service, or installing Python. That is now possible, and the story of how it got there shows what Claude Code can do as an autonomous coding agent.
Simon Willison’s analysis, published on simonwillison.net on 22 June 2026, documents the full journey of porting the Moebius 0.2B image inpainting model from a Python-and-NVIDIA-CUDA-only tool to a working, shareable web application that runs in Chrome, Firefox, and Safari using WebGPU.
What Is Image Inpainting, in Plain Language?
Inpainting is a specific AI task: you mark a region of a photograph with a brush, and the model fills that region in with a plausible reconstruction of what might naturally be behind the removed area. Think of it as an intelligent digital eraser that does not leave a blank hole but instead imagines the background.
The Moebius model, released by researchers at Huazhong University of Science and Technology, is unusual because it achieves strong inpainting results with only 0.2 billion parameters — a tiny footprint by modern AI standards. The original release required PyTorch (a Python machine-learning framework) and an NVIDIA CUDA-compatible GPU, meaning it was practically inaccessible to anyone without a developer setup.
Willison spotted the model on Hacker News and made a simple observation: if the model is this small, maybe it can run directly in a browser using WebGPU, the newer browser API that allows web pages to access a device’s GPU for computation.
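To make the WebGPU idea concrete, here is a minimal sketch of how a page might check whether the GPU API is usable before attempting to load a model. It is an illustration rather than code from Willison's project: the function name canUseWebGPU and the NavigatorLike and GpuLike interfaces are hypothetical stand-ins, declared so the sketch type-checks outside a browser. The only browser API assumed is navigator.gpu.requestAdapter(), which resolves to null when no suitable adapter is available.

```typescript
// Structural stand-ins for the parts of the browser's Navigator we need.
// These interface names are hypothetical, not part of any library.
interface GpuLike {
  requestAdapter(): Promise<unknown>;
}
interface NavigatorLike {
  gpu?: GpuLike;
}

// Returns true only if the WebGPU entry point exists AND an adapter is granted.
async function canUseWebGPU(nav: NavigatorLike): Promise<boolean> {
  if (nav.gpu === undefined) {
    return false; // Browser does not expose WebGPU at all.
  }
  try {
    const adapter = await nav.gpu.requestAdapter();
    // requestAdapter() resolves to null when no suitable GPU is found.
    return adapter !== null && adapter !== undefined;
  } catch {
    return false; // Treat any failure as "not available".
  }
}
```

In a browser this would be called as canUseWebGPU(navigator), and a page could show a clear message instead of failing midway through a large download.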
How Claude Code Did the Heavy Lifting
Willison was simultaneously working on an unrelated major feature for his open-source project Datasette, using a different AI coding tool called Codex Desktop. While waiting for that agent to grind through long refactoring tasks, he opened Claude Code in a separate terminal window and set it a parallel challenge: port Moebius to the web.
According to Willison’s detailed account, the workflow proceeded in several distinct phases, each driven primarily by Claude Code with Willison acting as a director rather than a programmer.
Phase 1: Feasibility Research via Claude.ai
Before writing a single line of code, Willison asked Claude.ai — the conversational interface — to clone the Moebius GitHub repository and assess the porting options. He used a prompt style he describes as telling the model to “muse on” a problem, meaning he wanted contemplation rather than a concrete solution. Claude suggested using ONNX Runtime Web on the WebGPU backend, a lower-level runtime that sits beneath the popular Transformers.js library. Willison saved that analysis as a research.md file.
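As a rough illustration of what "ONNX Runtime Web on the WebGPU backend" means in practice, the sketch below asks the runtime for a session with WebGPU as the preferred execution provider. It is not taken from the Moebius port. The OrtLike interface and the loadSession function are hypothetical and exist only so the example is self-contained; the fallback to the "wasm" provider is an assumption, not something Willison's account describes. The one library call assumed is InferenceSession.create(modelUrl, { executionProviders }).

```typescript
// Simplified shape of the onnxruntime-web module (hypothetical interface).
interface SessionOptionsLike {
  executionProviders: string[];
}
interface OrtLike<Session> {
  InferenceSession: {
    create(modelUrl: string, options: SessionOptionsLike): Promise<Session>;
  };
}

// Creates a session, preferring WebGPU and falling back to WebAssembly (assumed policy).
async function loadSession<Session>(
  ort: OrtLike<Session>,
  modelUrl: string,
  useWebGPU: boolean,
): Promise<Session> {
  const executionProviders = useWebGPU ? ["webgpu", "wasm"] : ["wasm"];
  return ort.InferenceSession.create(modelUrl, { executionProviders });
}
```

With the real library, the imported ort module would be passed as the first argument. Whether its typings match this simplified interface exactly is not checked here.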
Phase 2: Autonomous Execution by Claude Code
Willison gathered the raw materials — the Moebius Python source code, the 1.24 GB model weights from Hugging Face, and the source code of potentially relevant libraries — into a single folder. He then pointed Claude Code at that folder and gave it two instructions: read the research file, and build a browser-based port.
He also asked the agent to maintain a plan.md and a notes.md file throughout, documenting its own reasoning as it worked. According to Willison’s account, the agent converted the PyTorch model to the ONNX format (Open Neural Network Exchange, a portable format for neural networks that separates the model’s mathematical operations from any specific hardware or framework), published the converted weights to Hugging Face, built a web interface, and deployed the frontend to GitHub Pages — all without Willison writing a single line of code himself.
Phase 3: Iterative Debugging and Refinement
Willison’s role was to test the live demo in his browser, paste errors and screenshots back into Claude Code, and suggest improvements. One significant problem emerged: each page reload appeared to download the full 1.3 GB of model weights again. Willison pointed Claude Code toward the Whisper Web demo project as a reference, asking it to examine how that project handled caching. Claude Code determined that Whisper Web used the browser’s CacheStorage API and added the same mechanism to the Moebius project.
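The caching fix can be sketched as a cache-first fetch. The version below is an illustration of the general CacheStorage pattern, not the code Claude Code wrote: the function name fetchWithCache, the cache name, and the ResponseLike, CacheLike, and CacheStorageLike interfaces are all hypothetical, declared so the example type-checks outside a browser. The browser calls it mirrors are caches.open(), cache.match(), cache.put(), and response.clone().

```typescript
// Structural stand-ins for the browser's Response, Cache and CacheStorage types.
interface ResponseLike {
  ok: boolean;
  clone(): ResponseLike;
}
interface CacheLike {
  match(url: string): Promise<ResponseLike | undefined>;
  put(url: string, response: ResponseLike): Promise<void>;
}
interface CacheStorageLike {
  open(cacheName: string): Promise<CacheLike>;
}
type FetchLike = (url: string) => Promise<ResponseLike>;

// Hypothetical cache name, not taken from the Moebius project.
const WEIGHTS_CACHE = "model-weights-v1";

// Returns the cached response if present; otherwise downloads and stores it.
async function fetchWithCache(
  url: string,
  storage: CacheStorageLike,
  fetchFn: FetchLike,
): Promise<ResponseLike> {
  const cache = await storage.open(WEIGHTS_CACHE);
  const cached = await cache.match(url);
  if (cached !== undefined) {
    return cached; // Cache hit: no network request is made.
  }
  const response = await fetchFn(url);
  if (response.ok) {
    // Store a clone so the caller can still read the original body.
    await cache.put(url, response.clone());
  }
  return response;
}
```

In a browser, the global caches object and fetch function would be passed as the second and third arguments. Only successful responses are stored, so a failed download is retried on the next load rather than cached.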
The entire exercise, run in parallel with a separate major software project, produced a working, publicly accessible demo at simonw.github.io/moebius-web.
What This Enables: A Concrete Scenario for Indian Professionals
Consider a small photography studio in Pune that handles wedding and event photography. The team’s post-processing workflow currently relies on a designer with access to Adobe Photoshop or similar paid software to remove background distractions — a tent pole in the background, an uninvited guest walking through the frame, a garland stand that shouldn’t be in the hero shot.
With a browser-based inpainting tool built on the same architecture Willison demonstrated, that workflow changes significantly. A photographer could open the tool directly in Chrome on any laptop, load the image, brush over the unwanted region, and let the model fill it in — without uploading the image to any external server, without a Photoshop subscription, and without waiting for a cloud API response. The model runs locally on the device’s GPU.
The first load would download approximately 1.3 GB of model weights (a one-time cost given proper browser caching). After that, the tool works offline. For a studio that handles sensitive client photographs or operates in areas with unreliable connectivity, that combination of privacy and offline capability is a genuine advantage.
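To put that one-time download in perspective, a back-of-the-envelope estimate helps. The helper below is a hypothetical sketch: the function name and the 50 Mbps example speed are assumptions chosen for illustration, and real transfers will be slower because of protocol overhead and fluctuating connections.

```typescript
// Rough transfer time: size in bits divided by link speed in bits per second.
// Uses decimal units (1 GB = 1e9 bytes, 1 Mbps = 1e6 bits per second).
function estimateDownloadSeconds(sizeGB: number, linkMbps: number): number {
  if (sizeGB < 0 || linkMbps <= 0) {
    throw new RangeError("size must be non-negative and link speed positive");
  }
  const bits = sizeGB * 1e9 * 8;
  return bits / (linkMbps * 1e6);
}
```

Under these assumptions, estimateDownloadSeconds(1.3, 50) comes to about 208 seconds, or roughly three and a half minutes for the first load.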
The same logic applies to a marketing team in Hyderabad creating product photographs for an e-commerce catalogue, or a real estate agency in Chennai needing to clean up property images before listing.
What Claude Code Actually Demonstrated About Its Own Capabilities
Willison’s account is explicit about what this exercise revealed. According to his writeup, Claude Opus 4.8 — the model version running inside Claude Code during this project — proved capable of:
- Converting a PyTorch deep-learning model to the ONNX format
- Publishing the resulting model weights to Hugging Face using the hf CLI tool
- Building a web application and user interface that loads and executes that model in a browser
- Diagnosing a caching problem and implementing a browser-native CacheStorage solution by studying a reference project
All of this was accomplished while Willison’s attention was elsewhere, with him checking in periodically rather than supervising each step.
The Limitations You Should Know About
Willison is candid about the tradeoffs, and they are significant.
What to Watch For
The deeper implication of Willison’s experiment is not about this specific tool — it is about the trajectory. The combination of smaller, efficient models (like Moebius at 0.2B parameters) and browser-native GPU APIs (like WebGPU) is making it feasible to run genuinely useful AI tasks client-side, with no server costs and no data leaving the device.
For non-technical professionals, the practical signal is this: tools that currently require a cloud subscription or a developer to deploy may, over the next year or two, become browser tabs you can simply open. Claude Code is one of the agents making that transition faster by handling the conversion and deployment work that would otherwise require deep technical expertise.
If you want to understand the architecture behind what Willison built, his post links to an understanding.md file that Claude.ai generated from the finished codebase. It is a readable explanation of ONNX, WebGPU, and the Moebius model pipeline, written specifically for someone encountering these concepts for the first time. That document is worth reading before the next time someone mentions running AI models locally.
