Feed fetched in 65 ms.
Content type is application/xml; charset=utf-8.
Feed is 141,854 characters long.
Warning Feed is missing an ETag.
Feed has a last modified date of Sat, 13 Jun 2026 01:01:50 GMT.
Feed is well-formed XML.
Warning Feed has no styling.
This is an Atom feed.
Feed title: Simon Willison's Weblog
Error Feed self link: http://simonwillison.net/atom/everything/ does not match feed URL: https://simonwillison.net/atom/everything/.
Warning Feed is missing an image.
Feed has 30 items.
First item published on 2026-06-13T01:01:50.000Z
Last item published on 2026-06-02T03:43:45.000Z
All items have published dates.
Newest item was published on 2026-06-13T01:01:50.000Z.
Home page URL: http://simonwillison.net/
Error Home page URL is on a different protocol: http:.
Warning Home page URL redirected to https://simonwillison.net/.
Home page has feed discovery link in <head>.
Home page has a link to the feed in the <body>
<?xml version="1.0" encoding="utf-8"?>
<feed xml:lang="en-us" xmlns="http://www.w3.org/2005/Atom">
<title>Simon Willison's Weblog</title>
<link href="http://simonwillison.net/" rel="alternate"/>
<link href="http://simonwillison.net/atom/everything/" rel="self"/>
<id>http://simonwillison.net/</id>
<updated>2026-06-13T01:01:50+00:00</updated>
<author>
<name>Simon Willison</name>
</author>
<entry>
<title>Statement on the US government directive to suspend access to Fable 5 and Mythos 5</title>
<link href="https://simonwillison.net/2026/Jun/13/us-government-directive-to-suspend-access/#atom-everything" rel="alternate"/>
<published>2026-06-13T01:01:50+00:00</published>
<updated>2026-06-13T01:01:50+00:00</updated>
<id>https://simonwillison.net/2026/Jun/13/us-government-directive-to-suspend-access/#atom-everything</id>
<summary type="html"><p><strong><a href="https://www.anthropic.com/news/fable-mythos-access">Statement on the US government directive to suspend access to Fable 5 and Mythos 5</a></strong></p>
<p>Well this is <em>nuts</em>:</p>
<blockquote>
<p>The US government, citing national security authorities, has issued an export control directive to suspend all access to Fable 5 and Mythos 5 by any foreign national, whether inside or outside the United States, including foreign national Anthropic employees. The net effect of this order is that we must abruptly disable Fable 5 and Mythos 5 for <strong>all</strong> our customers to ensure compliance. <strong>Access to all other Anthropic models</strong> <strong>will not be affected.</strong></p>
<p>We received the directive from the government today at 5:21pm (ET). The letter did not provide specific details of its national security concern. Our understanding is that the government believes it has become aware of a method of bypassing, or "jailbreaking" Fable 5. We reviewed a demonstration of this specific technique being used to identify a small number of previously known, minor vulnerabilities. These vulnerabilities all appear relatively simple, and we have found that other publicly-available models are able to discover them as well without requiring a bypass. [...]</p>
<p>To date, the government has only given us verbal evidence of a potential narrow, non-universal jailbreak, which essentially consists of asking the model to read a specific codebase and fix any software flaws. Our understanding is that one potential jailbreak was shared with the government. We have reviewed the report and validated that the level of capability displayed there is widely available from other models (including OpenAI's <a href="https://deploymentsafety.openai.com/gpt-5-5/tacit-knowledge-and-troubleshooting">GPT-5.5</a>), and is used every day by the defenders who keep systems safe. We will share more details over the next 24 hours.</p>
</blockquote>
<p>I still have access to Fable via <a href="https://claude.ai/">claude.ai</a> and Claude Code now, at 9:01pm ET.</p>
<p><strong>Update</strong>: I ran <a href="https://gist.github.com/simonw/5894cfafc64a2b8aafbe834bc9c950b9">this script</a> against the Anthropic API to spot when <code>claude-fable-5</code> would stop working. My access was cut off at 6:59pm Pacific (9:59pm ET):</p>
<pre>[2026-06-12T18:56:50-07:00] attempt 35: running uv run llm -m claude-fable-5 hi
[2026-06-12T18:56:55-07:00] success: Hi there! How can I help you today?
[2026-06-12T18:57:55-07:00] attempt 36: running uv run llm -m claude-fable-5 hi
[2026-06-12T18:57:59-07:00] success: Hi! How can I help you today?
[2026-06-12T18:58:59-07:00] attempt 37: running uv run llm -m claude-fable-5 hi
[2026-06-12T18:59:00-07:00] FAILED after attempt 37 with exit code 1
stderr:
Error: Error code: 404 - {'type': 'error', 'error': {'type': 'not_found_error', 'message': 'Claude Fable 5 is not available. Please use Opus 4.8. Learn more: https://www.anthropic.com/news/fable-mythos-access'}, 'request_id': 'req_011CbzRyirV7KZLHYYdBM9od'}</pre>
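<p>The linked script is not reproduced here. As a rough sketch of the same idea, assuming a once-a-minute retry that stops at the first non-zero exit code (the function names and the 60 second interval are illustrative guesses based on the log above, not taken from the gist):</p>

```python
import itertools
import subprocess
import time
from datetime import datetime

# The command shown in the log above.
COMMAND = ["uv", "run", "llm", "-m", "claude-fable-5", "hi"]


def run_command(command):
    """Run the command once and return (exit_code, stdout, stderr)."""
    proc = subprocess.run(command, capture_output=True, text=True)
    return proc.returncode, proc.stdout.strip(), proc.stderr.strip()


def now():
    """Local time with a UTC offset, similar to the timestamps in the log."""
    return datetime.now().astimezone().isoformat(timespec="seconds")


def poll_until_failure(command=COMMAND, run=run_command, interval=60,
                       sleep=time.sleep, log=print):
    """Retry the command until it exits non-zero; return that attempt number."""
    for attempt in itertools.count(1):
        log(f"[{now()}] attempt {attempt}: running {' '.join(command)}")
        code, out, err = run(command)
        if code != 0:
            log(f"[{now()}] FAILED after attempt {attempt} with exit code {code}")
            log(f"stderr:\n{err}")
            return attempt
        log(f"[{now()}] success: {out}")
        sleep(interval)  # hypothetical interval; wait before the next attempt
```

<p>Passing <code>run</code>, <code>sleep</code> and <code>log</code> as parameters keeps the loop easy to exercise without calling the real API.</p>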
<p><small>Via <a href="https://twitter.com/AnthropicAI/status/2065597531644743999">@AnthropicAI</a></small></p>
<p>Tags: <a href="https://simonwillison.net/tags/jailbreaking">jailbreaking</a>, <a href="https://simonwillison.net/tags/ai">ai</a>, <a href="https://simonwillison.net/tags/generative-ai">generative-ai</a>, <a href="https://simonwillison.net/tags/llms">llms</a>, <a href="https://simonwillison.net/tags/anthropic">anthropic</a>, <a href="https://simonwillison.net/tags/claude">claude</a>, <a href="https://simonwillison.net/tags/ai-ethics">ai-ethics</a>, <a href="https://simonwillison.net/tags/claude-mythos">claude-mythos</a></p></summary>
<category term="jailbreaking"/>
<category term="ai"/>
<category term="generative-ai"/>
<category term="llms"/>
<category term="anthropic"/>
<category term="claude"/>
<category term="ai-ethics"/>
<category term="claude-mythos"/>
</entry>
<entry>
<title>OpenAI WebRTC Audio Session, now with document context</title>
<link href="https://simonwillison.net/2026/Jun/12/openai-webrtc/#atom-everything" rel="alternate"/>
<published>2026-06-12T23:53:04+00:00</published>
<updated>2026-06-12T23:53:04+00:00</updated>
<id>https://simonwillison.net/2026/Jun/12/openai-webrtc/#atom-everything</id>
<summary type="html"><p><strong><a href="https://tools.simonwillison.net/openai-webrtc">OpenAI WebRTC Audio Session, now with document context</a></strong></p>
<p>I built the first version of this tool <a href="https://simonwillison.net/2024/Dec/17/openai-webrtc/">in December 2024</a> to try out the then-new OpenAI WebRTC API for interacting with their realtime audio models.</p>
<p>Last month OpenAI <a href="https://openai.com/index/advancing-voice-intelligence-with-new-models-in-the-api/">introduced a brand new model</a> to that API called <a href="https://developers.openai.com/api/docs/models/gpt-realtime-2">GPT‑Realtime‑2</a>, which they promoted as "our first voice model with GPT‑5‑class reasoning" - with a Sep 30, 2024 knowledge cut-off.</p>
<p>I've been waiting for that model to show up in the ChatGPT iPhone app but it still hasn't, so I revisited my old playground.</p>
<p>You can now pick the better model, and you can also paste in a big chunk of document context so you can have an audio conversation in your browser about whatever information you think would be useful to explore in a conversational way.</p>
<p><img src="https://static.simonwillison.net/static/2026/openai-webrtc-document-context.jpg" alt="Screenshot of a web interface titled &quot;OpenAI WebRTC Audio Session&quot; with a gray status dot. Form fields: &quot;OpenAI API Token&quot; showing a masked password of dots, &quot;Voice&quot; dropdown set to &quot;Coral&quot;, &quot;Model&quot; dropdown set to &quot;gpt-realtime-2&quot;. A collapsible section labeled &quot;▼ Document context (optional — paste text to talk about)&quot; with bold instruction &quot;Paste a document here before starting the session and the model will be able to discuss it with you&quot; above a textarea containing a pasted Markdown document about whether DuckDB can run untrusted SQL as safely as Datasette runs SQLite. Below are a blue &quot;Start Session&quot; button and a gray disabled &quot;Mute Mic&quot; button, then a green success message &quot;Session established successfully!&quot; At the bottom, a dark panel headed &quot;Last transcript&quot; reads: &quot;DuckDB can be made about as safe as SQLite for running untrusted SELECT queries, but only if you lock it down properly. Using read only true by itself is not enough, because SQL can still&quot; (text cut off)." class="blogmark-image" style="max-width: 80%">
</p>
<p>Tags: <a href="https://simonwillison.net/tags/audio">audio</a>, <a href="https://simonwillison.net/tags/tools">tools</a>, <a href="https://simonwillison.net/tags/ai">ai</a>, <a href="https://simonwillison.net/tags/openai">openai</a>, <a href="https://simonwillison.net/tags/generative-ai">generative-ai</a>, <a href="https://simonwillison.net/tags/llms">llms</a>, <a href="https://simonwillison.net/tags/multi-modal-output">multi-modal-output</a>, <a href="https://simonwillison.net/tags/webrtc">webrtc</a></p></summary>
<category term="audio"/>
<category term="tools"/>
<category term="ai"/>
<category term="openai"/>
<category term="generative-ai"/>
<category term="llms"/>
<category term="multi-modal-output"/>
<category term="webrtc"/>
</entry>
<entry>
<title>Quoting Andrew Singleton</title>
<link href="https://simonwillison.net/2026/Jun/12/andrew-singleton/#atom-everything" rel="alternate"/>
<published>2026-06-12T18:09:21+00:00</published>
<updated>2026-06-12T18:09:21+00:00</updated>
<id>https://simonwillison.net/2026/Jun/12/andrew-singleton/#atom-everything</id>
<summary type="html"><blockquote cite="https://www.mcsweeneys.net/articles/ai-economics-for-dummies"><p>Jenny owns a crematorium. John’s propane company gives her a $20 billion investment in return for 5 percent of her operation. Jenny throws $10 billion into the incinerator, then pays John $10 billion to buy propane to burn that money to ashes. John reports that his AI investments have generated $10 billion in revenue this quarter and that he owns 5 percent of a $100 billion business. A reporter from <em>Forbes</em> is assigned to profile John and Jenny, and over the course of his research, he becomes embroiled in a passionate but confusing three-way love affair with them, which eventually turns into a polyamorous common-law marriage. His profile is glowing, but light on financial details.</p></blockquote>
<p class="cite">&mdash; <a href="https://www.mcsweeneys.net/articles/ai-economics-for-dummies">Andrew Singleton</a>, AI Economics for Dummies</p>
<p>Tags: <a href="https://simonwillison.net/tags/ai">ai</a></p></summary>
<category term="ai"/>
</entry>
<entry>
<title>Claude Fable is relentlessly proactive</title>
<link href="https://simonwillison.net/2026/Jun/11/fable-is-relentlessly-proactive/#atom-everything" rel="alternate"/>
<published>2026-06-11T23:35:17+00:00</published>
<updated>2026-06-11T23:35:17+00:00</updated>
<id>https://simonwillison.net/2026/Jun/11/fable-is-relentlessly-proactive/#atom-everything</id>
<summary type="html"><p>After two days of experience with <a href="https://simonwillison.net/2026/Jun/9/claude-fable-5/">Claude Fable 5</a> I think the best way to describe it is <strong>relentlessly proactive</strong>. It knows a whole lot of tricks and it will deploy pretty much any of them to get to its goal.</p>
<p>I'll illustrate this with an example. I was hacking on <a href="https://agent.datasette.io/">Datasette Agent</a> today when I noticed a glitch: a horizontal scrollbar that shouldn't be there in the jump menu chat prompt. I snapped this screenshot:</p>
<p><img src="https://static.simonwillison.net/static/2026/jump-to-bug.jpg" alt="Screenshot of a modal dialog demonstrating a scrollbar bug. At the top is a focused search input with blue outline and placeholder &quot;Jump to...&quot;, with an X close button to its right. Below, a heading reads &quot;Start a new agent chat&quot; above a textarea with the placeholder &quot;Ask a question about your data...&quot; — the bug: a thick gray horizontal scrollbar is incorrectly displayed along the bottom edge of the empty textarea, spanning nearly its full width, next to the resize handle. Below the textarea: &quot;Press Enter to start. Shift+Enter adds a new line.&quot; followed by a blue &quot;Start chat&quot; button." style="max-width: 100%;" /></p>
<p>Then I started a fresh <code>claude</code> session in my <code>datasette-agent</code> checkout, dragged in the screenshot and told it:</p>
<blockquote>
<p><code>Look at dependencies to help figure out why there is a horizontal scrollbar here</code></p>
</blockquote>
<p>I had a hunch the cause was in a dependency of Datasette Agent (likely Datasette itself) and I knew Fable was good at digging into dependency code, either by inspecting installed files in its own virtual environment <code>site-packages</code> or by referencing a local checkout on disk. Telling it to start with dependencies felt like a good bet.</p>
<p>I got distracted by a domestic task and wandered away from my computer.</p>
<p>When I came back a few minutes later I saw my machine <em>open a browser window</em> in my regular Firefox and then <em>navigate to the dialog in question</em>. I had not told Claude Code to use any browser automation, and I was pretty sure it wasn't possible for it to trigger mouse movements or keyboard shortcuts within a window, so how was it doing that?</p>
<p>I watched in fascination as it continued with its explorations, then saw it open a Safari window instead of Firefox. I also grabbed this snapshot from the Claude terminal:</p>
<p><img src="https://static.simonwillison.net/static/2026/fable-bash-pyobjc.jpg" alt="Screenshot of two Bash tool calls in a dark terminal interface. First: Bash(open -a Safari /tmp/textarea-scrollbar-test.html &amp;&amp; sleep 4 &amp;&amp; uv run --with pyobjc-framework-Quartz python - &lt;&lt;'EOF' import Quartz wins = Quartz.CGWindowListCopyWindowInfo(Quartz.kCGWindowListOptionOnScreenOnly, Quartz.kCGNullWindowID) for w in wins: if (w.get('kCGWindowOwnerName') or '') == 'Safari' and 'textarea' in (w.get('kCGWindowName') or '').lower(): print(w.get('kCGWindowNumber')) EOF) with output 153551. Second: Bash(screencapture -x -o -l 153551 /tmp/safari-cases.png &amp;&amp; echo ok) with output ok." style="max-width: 100%;" /></p>
<p>What was it doing there with <code>uv run --with pyobjc-framework-Quartz</code>?</p>
<p>It turns out Fable had hacked up its own pattern for taking screenshots of browser windows. It was using Python to iterate through all available windows on my machine, then filtering for Safari windows with expected strings such as <code>"textarea"</code> in the window name. It used that to find their window number - an integer like 153551 - which it could then use with the <code>screencapture</code> CLI tool to grab a PNG.</p>
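<p>A minimal sketch of that pattern, split so the filtering logic can run anywhere. The function names are hypothetical. The dictionary keys and the <code>screencapture</code> flags are the ones visible in the terminal screenshot above, and the Quartz call only works on macOS with <code>pyobjc-framework-Quartz</code> installed:</p>

```python
def find_window_numbers(windows, owner="Safari", title_contains="textarea"):
    """Return window numbers whose owner matches and whose title contains the text."""
    matches = []
    for w in windows:
        owner_name = w.get("kCGWindowOwnerName") or ""
        title = (w.get("kCGWindowName") or "").lower()
        if owner_name == owner and title_contains.lower() in title:
            matches.append(w.get("kCGWindowNumber"))
    return matches


def screencapture_command(window_number, out_path):
    """Build the CLI call: -x (no sound), -o (no window shadow), -l (window ID)."""
    return ["screencapture", "-x", "-o", "-l", str(window_number), out_path]


def list_onscreen_windows():
    """macOS only. Quartz is imported lazily so the helpers above work elsewhere."""
    import Quartz

    return Quartz.CGWindowListCopyWindowInfo(
        Quartz.kCGWindowListOptionOnScreenOnly, Quartz.kCGNullWindowID
    )
```

<p>On a Mac the pieces would combine as <code>find_window_numbers(list_onscreen_windows())</code>, with each resulting number passed to <code>screencapture_command()</code> and run via <code>subprocess.run()</code>.</p>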
<p>OK fine, that's a neat way of taking screenshots. But what was it taking screenshots of?</p>
<p>Turns out it had been writing its own scratch HTML pages to try and recreate the bug, then opening Safari and grabbing screenshots.</p>
<p>Here's that <a href="https://static.simonwillison.net/static/2026/textarea-scrollbar-test.html">/tmp/textarea-scrollbar-test.html</a> page it created, and the screenshot it took with <code>screencapture -x -o -l 153551 /tmp/safari-cases.png</code>:</p>
<p><img src="https://static.simonwillison.net/static/2026/safari-cases.jpg" alt="Screenshot of a Safari browser window showing a textarea scrollbar test page at file:///private/tmp/textarea-scrollbar-test.html. Page text reads: scrollbar thickness: 17px | UA: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/26.4 Safari/605.1.15 | devicePixelRatio: 2. Four numbered test cases follow, each with a textarea containing the placeholder &quot;Ask a question about your data...&quot;: 1. Exact plugin CSS (resize: vertical, default overflow), 2. Plugin CSS + overflow-x: hidden, 3. Plugin CSS + resize: none, and 4. Bare default textarea, which is a much smaller box with the placeholder wrapping onto two lines." style="max-width: 100%;" />
(I have way too many open tabs!)</p>
<p>OK, so I can see how it's opening test pages and taking screenshots, but how on earth was it triggering the modal dialog that was meant to be under test? That's only available via a click or a keyboard shortcut, and I couldn't see a mechanism for it to run those in Safari.</p>
<p>I eventually figured out what it had done.</p>
<p>Claude was running in a folder that contained the source code for the application. It knows enough about <a href="https://datasette.io/">Datasette</a> to be able to run a local development server. It turns out it was editing Datasette's own templates to add JavaScript that would trigger the correct keyboard shortcut as soon as the window opened, adding code like this:</p>
<div class="highlight highlight-text-html-basic"><pre><span class="pl-kos">&lt;</span><span class="pl-ent">script</span><span class="pl-kos">&gt;</span>
<span class="pl-smi">window</span><span class="pl-kos">.</span><span class="pl-en">addEventListener</span><span class="pl-kos">(</span><span class="pl-s">"load"</span><span class="pl-kos">,</span> <span class="pl-k">function</span> <span class="pl-kos">(</span><span class="pl-kos">)</span> <span class="pl-kos">{</span>
  <span class="pl-en">setTimeout</span><span class="pl-kos">(</span><span class="pl-k">function</span> <span class="pl-kos">(</span><span class="pl-kos">)</span> <span class="pl-kos">{</span>
    <span class="pl-smi">document</span><span class="pl-kos">.</span><span class="pl-en">dispatchEvent</span><span class="pl-kos">(</span><span class="pl-k">new</span> <span class="pl-v">KeyboardEvent</span><span class="pl-kos">(</span><span class="pl-s">"keydown"</span><span class="pl-kos">,</span> <span class="pl-kos">{</span><span class="pl-c1">key</span>: <span class="pl-s">"/"</span><span class="pl-kos">,</span> <span class="pl-c1">bubbles</span>: <span class="pl-c1">true</span><span class="pl-kos">}</span><span class="pl-kos">)</span><span class="pl-kos">)</span><span class="pl-kos">;</span>
  <span class="pl-kos">}</span><span class="pl-kos">,</span> <span class="pl-c1">1200</span><span class="pl-kos">)</span><span class="pl-kos">;</span>
<span class="pl-kos">}</span><span class="pl-kos">)</span><span class="pl-kos">;</span>
<span class="pl-kos">&lt;/</span><span class="pl-ent">script</span><span class="pl-kos">&gt;</span></pre></div>
<p>1.2 seconds after the window opens, this code triggers a simulated <code>/</code> key, which is the keyboard shortcut for opening the modal dialog.</p>
<p>There was one challenge left. In order to understand what was going on, Claude needed to run JavaScript on the page to take measurements for itself.</p>
<p>It wrote its own custom web application to capture information via CORS, then ran that as a local server and opened a page with JavaScript that would POST directly to it!</p>
<p>Here's the Python web app it wrote, using the standard library <a href="https://docs.python.org/3/library/http.server.html">http.server</a> package:</p>
<pre><span class="pl-k">from</span> <span class="pl-s1">http</span>.<span class="pl-s1">server</span> <span class="pl-k">import</span> <span class="pl-v">HTTPServer</span>, <span class="pl-v">BaseHTTPRequestHandler</span>
<span class="pl-k">class</span> <span class="pl-c1">H</span>(<span class="pl-v">BaseHTTPRequestHandler</span>):
    <span class="pl-k">def</span> <span class="pl-en">do_POST</span>(<span class="pl-s1">self</span>):
        <span class="pl-s1">n</span> <span class="pl-c1">=</span> <span class="pl-en">int</span>(<span class="pl-s1">self</span>.<span class="pl-c1">headers</span>.<span class="pl-c1">get</span>(<span class="pl-s">"Content-Length"</span>, <span class="pl-c1">0</span>))
        <span class="pl-en">open</span>(<span class="pl-s">"/tmp/diag.json"</span>, <span class="pl-s">"w"</span>).<span class="pl-c1">write</span>(<span class="pl-s1">self</span>.<span class="pl-c1">rfile</span>.<span class="pl-c1">read</span>(<span class="pl-s1">n</span>).<span class="pl-c1">decode</span>())
        <span class="pl-s1">self</span>.<span class="pl-c1">send_response</span>(<span class="pl-c1">200</span>)
        <span class="pl-s1">self</span>.<span class="pl-c1">send_header</span>(<span class="pl-s">"Access-Control-Allow-Origin"</span>, <span class="pl-s">"*"</span>)
        <span class="pl-s1">self</span>.<span class="pl-c1">end_headers</span>()
    <span class="pl-k">def</span> <span class="pl-en">do_OPTIONS</span>(<span class="pl-s1">self</span>):
        <span class="pl-s1">self</span>.<span class="pl-c1">send_response</span>(<span class="pl-c1">200</span>)
        <span class="pl-s1">self</span>.<span class="pl-c1">send_header</span>(<span class="pl-s">"Access-Control-Allow-Origin"</span>, <span class="pl-s">"*"</span>)
        <span class="pl-s1">self</span>.<span class="pl-c1">send_header</span>(<span class="pl-s">"Access-Control-Allow-Headers"</span>, <span class="pl-s">"*"</span>)
        <span class="pl-s1">self</span>.<span class="pl-c1">end_headers</span>()
    <span class="pl-k">def</span> <span class="pl-en">log_message</span>(<span class="pl-s1">self</span>, <span class="pl-c1">*</span><span class="pl-s1">a</span>): <span class="pl-c"># quiet</span>
        <span class="pl-k">pass</span>
<span class="pl-en">HTTPServer</span>((<span class="pl-s">"127.0.0.1"</span>, <span class="pl-c1">9999</span>), <span class="pl-c1">H</span>).<span class="pl-c1">serve_forever</span>()</pre>
<p>All this does is accept a POST request full of JSON and write that to the <code>/tmp/diag.json</code> file. It sends <code>Access-Control-Allow-Origin: *</code> headers (including from <code>OPTIONS</code> requests) so that code running on another domain can still communicate back to it.</p>
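<p>To check a capture server like this without a browser, here is a self-contained sketch that starts an equivalent handler on a free port, POSTs JSON to it from Python and reads the file back. The names <code>make_handler</code>, <code>post_json</code> and <code>round_trip</code>, the temporary output path and the ephemeral port are assumptions made so the example can run anywhere. They are not part of the code Claude wrote.</p>

```python
import json
import tempfile
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer
from pathlib import Path


def make_handler(out_path):
    """Build a handler class that saves each POST body to out_path."""

    class Capture(BaseHTTPRequestHandler):
        def do_POST(self):
            n = int(self.headers.get("Content-Length", 0))
            Path(out_path).write_text(self.rfile.read(n).decode())
            self.send_response(200)
            self.send_header("Access-Control-Allow-Origin", "*")
            self.end_headers()

        def log_message(self, *args):  # keep the console quiet
            pass

    return Capture


def post_json(url, payload):
    """POST a JSON payload; return (status, Access-Control-Allow-Origin header)."""
    req = urllib.request.Request(
        url,
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return resp.status, resp.headers.get("Access-Control-Allow-Origin")


def round_trip(payload):
    """Start the server on a free port, send one payload, return what was saved."""
    out_path = Path(tempfile.mkdtemp()) / "diag.json"
    server = HTTPServer(("127.0.0.1", 0), make_handler(out_path))
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        port = server.server_address[1]
        status, allow_origin = post_json(f"http://127.0.0.1:{port}/diag", payload)
    finally:
        server.shutdown()
        server.server_close()
    return status, allow_origin, json.loads(out_path.read_text())
```

<p>Calling <code>round_trip({"scrollWidth": 500, "clientWidth": 480})</code> returns the response status, the CORS header value and the JSON that was written to disk.</p>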
<p>Then Claude injected this code into the template that it was loading in a browser:</p>
<div class="highlight highlight-source-js"><pre><span class="pl-k">const</span> <span class="pl-s1">host</span> <span class="pl-c1">=</span> <span class="pl-smi">document</span><span class="pl-kos">.</span><span class="pl-en">querySelector</span><span class="pl-kos">(</span><span class="pl-s">"navigation-search"</span><span class="pl-kos">)</span><span class="pl-kos">;</span>
<span class="pl-k">const</span> <span class="pl-s1">ta</span> <span class="pl-c1">=</span> <span class="pl-s1">host</span><span class="pl-kos">.</span><span class="pl-c1">shadowRoot</span><span class="pl-kos">.</span><span class="pl-en">querySelector</span><span class="pl-kos">(</span><span class="pl-s">"textarea"</span><span class="pl-kos">)</span><span class="pl-kos">;</span>
<span class="pl-k">const</span> <span class="pl-s1">cs</span> <span class="pl-c1">=</span> <span class="pl-en">getComputedStyle</span><span class="pl-kos">(</span><span class="pl-s1">ta</span><span class="pl-kos">)</span><span class="pl-kos">;</span>
<span class="pl-en">fetch</span><span class="pl-kos">(</span><span class="pl-s">"http://127.0.0.1:9999/diag"</span><span class="pl-kos">,</span> <span class="pl-kos">{</span>
  <span class="pl-c1">method</span>: <span class="pl-s">"POST"</span><span class="pl-kos">,</span>
  <span class="pl-c1">body</span>: <span class="pl-c1">JSON</span><span class="pl-kos">.</span><span class="pl-en">stringify</span><span class="pl-kos">(</span><span class="pl-kos">{</span>
    <span class="pl-c1">dpr</span>: <span class="pl-smi">window</span><span class="pl-kos">.</span><span class="pl-c1">devicePixelRatio</span><span class="pl-kos">,</span>
    <span class="pl-c1">scrollWidth</span>: <span class="pl-s1">ta</span><span class="pl-kos">.</span><span class="pl-c1">scrollWidth</span><span class="pl-kos">,</span> <span class="pl-c1">clientWidth</span>: <span class="pl-s1">ta</span><span class="pl-kos">.</span><span class="pl-c1">clientWidth</span><span class="pl-kos">,</span>
    <span class="pl-c1">whiteSpace</span>: <span class="pl-s1">cs</span><span class="pl-kos">.</span><span class="pl-c1">whiteSpace</span><span class="pl-kos">,</span> <span class="pl-c1">width</span>: <span class="pl-s1">cs</span><span class="pl-kos">.</span><span class="pl-c1">width</span><span class="pl-kos">,</span>
  <span class="pl-kos">}</span><span class="pl-kos">)</span><span class="pl-kos">,</span>
<span class="pl-kos">}</span><span class="pl-kos">)</span><span class="pl-kos">;</span></pre></div>
<p>This took measurements of the <code>&lt;textarea&gt;</code> inside the <code>&lt;navigation-search&gt;</code> Web Component and sent them to the server, which wrote them to a file on disk, which Claude could then read.</p>
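<p>One plausible way to read a captured file like that, as a sketch: content overflows horizontally when <code>scrollWidth</code> is larger than <code>clientWidth</code>. The field names match the JavaScript above. The helper names are hypothetical, and this is not necessarily the check that was used in the session.</p>

```python
import json


def horizontal_overflow(diag):
    """Pixels by which the content is wider than the visible box (0 if it fits)."""
    return max(0, diag["scrollWidth"] - diag["clientWidth"])


def summarize(diag_json):
    """Turn the captured JSON text into a one-line verdict."""
    diag = json.loads(diag_json)
    extra = horizontal_overflow(diag)
    verdict = "overflows" if extra > 0 else "fits"
    return (
        f"textarea {verdict} by {extra}px "
        f"(scrollWidth={diag['scrollWidth']}, clientWidth={diag['clientWidth']}, "
        f"white-space={diag['whiteSpace']})"
    )
```

<p>In practice the input would be the contents of <code>/tmp/diag.json</code>, for example <code>summarize(open("/tmp/diag.json").read())</code>.</p>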
<p>Having figured out all of these tricks Fable... hit some invisible guardrail and downgraded itself to Opus. Thankfully Opus had access to the full transcript and could continue using the tricks pioneered by Fable, and shortly afterwards found, tested and verified <a href="https://github.com/datasette/datasette-agent/commit/a75a8b727b42c30ced1fc41dc8add7eb9f04fefe">the fix</a>.</p>
<p>I prompted Opus to:</p>
<blockquote>
<p><code>Write a report in /tmp/automation-report.md where you note down all of the tricks you have used in this session to test against real browsers on my computer, include runnable code examples</code></p>
</blockquote>
<p>Which produced <a href="https://gist.github.com/simonw/aef7f7db9ac992643110a74e43d6d42f">this report</a>, which was invaluable for piecing together the details of what had happened for this post.</p>
<p>I've shared <a href="https://gisthost.github.io/?cc14774f6d37eb67bf089f3ac3925f8f">the full terminal transcript</a> of the Claude Code session as well.</p>
<h4 id="a-review-of-everything-it-did">A review of everything it did</h4>
<p>Based on a screenshot and a one-line prompt, Claude Fable 5 + Claude Code:</p>
<ul>
<li>Figured out the recipe to run the local development server (with fake environment variables needed to get it running)</li>
<li>Fired up a Playwright Chrome session</li>
<li>Turned on the visible scrollbars setting for Chrome with <code>defaults write com.google.chrome.for.testing AppleShowScrollBars Always</code> (it turned that off again later)</li>
<li>Cycled through Firefox and WebKit in Playwright too, failing to recreate the bug</li>
<li>Worked out my default browser was Safari</li>
<li>Built a <code>textarea-scrollbar-test.html</code> HTML document</li>
<li>Opened that in real (not Playwright) Firefox</li>
<li>Found that <code>osascript -e 'tell application "System Events" to tell process "firefox" to id of window 1'</code> was blocked because "osascript is not allowed assistive access"</li>
<li>Figured out that <code>uv run --with pyobjc-framework-Quartz python</code> workaround, described above</li>
<li>Added JavaScript to the site templates in order to trigger the <code>/</code> key</li>
<li>Built its own little Python CORS web server to capture JSON data</li>
<li>Rewrote the template to capture that data and send it to the server</li>
<li>Scripted its way through the Web Component shadow DOM to the information it needed</li>
<li>Opened Safari to confirm the source of the bug</li>
<li>Modified its custom template to hack in a potential fix</li>
<li>Confirmed the hacked fix worked</li>
<li>Reported back on how to fix the problem</li>
</ul>
<p>Like I said, relentlessly proactive!</p>
<h4 id="an-estimate-of-the-cost">An estimate of the cost</h4>
<p>I'm currently on the $100/month Claude Max plan, which includes a generous allowance for Fable up until June 22nd, after which Anthropic say they'll start charging full API prices for it.</p>
<p>I'm using <a href="https://www.agentsview.io">AgentsView</a> to track my spending (see <a href="https://til.simonwillison.net/llms/agentsview-custom-model-price">this TIL</a>). Here's what AgentsView says this session would have cost me if I was paying full price for it:</p>
<pre><code>~ % uvx agentsview session usage be8850a7-6119-46a0-b5d6-79c7fff5ae2b
Session: be8850a7-6119-46a0-b5d6-79c7fff5ae2b
Agent: claude
Output: 68606
Peak ctx: 113178
Cost: ~$12.11 (claude-fable-5, claude-opus-4-8)
</code></pre>
<p>If you don't keep a close eye on it, Fable will quite happily burn $12 in tokens inventing new ways to debug your CSS.</p>
<h4 id="i-really-need-to-lock-this-thing-down">I really need to lock this thing down</h4>
<p>On the one hand, watching Fable go to extreme lengths to get the information that it needed to debug what was, in the end, a two-line CSS fix, was <em>fascinating</em>.</p>
<p>But on the other hand... this is a robust reminder that coding agents can do anything <em>you</em> can do by typing commands into a terminal - and frontier models know every trick in the book, and evidently a few that nobody has ever written down before.</p>
<p>If Fable had been acting on malicious instructions - a prompt injection attack hidden in code or an issue thread, or something I'd carelessly pasted into my terminal - it's alarming to think quite how far it could go to exfiltrate data or cause other forms of mischief.</p>
<p>Running coding agents outside of a sandbox has always been a bad idea - it's my top contender for <a href="https://simonwillison.net/2026/Jan/8/llm-predictions-for-2026/#1-year-a-challenger-disaster-for-coding-agent-security">a Challenger disaster</a> incident, as described by Johann Rehberger in <a href="https://embracethered.com/blog/posts/2025/the-normalization-of-deviance-in-ai/">The Normalization of Deviance in AI</a>.</p>
<p>Fable is arguably smarter and hence more suspicious of potentially malicious instructions. But that smartness is very much a two-edged sword: if it <em>does</em> get subverted by instructions, the amount of damage it can do given its relentless proactivity is terrifying.</p>
<p>Tags: <a href="https://simonwillison.net/tags/ai">ai</a>, <a href="https://simonwillison.net/tags/prompt-injection">prompt-injection</a>, <a href="https://simonwillison.net/tags/generative-ai">generative-ai</a>, <a href="https://simonwillison.net/tags/llms">llms</a>, <a href="https://simonwillison.net/tags/ai-assisted-programming">ai-assisted-programming</a>, <a href="https://simonwillison.net/tags/coding-agents">coding-agents</a>, <a href="https://simonwillison.net/tags/claude-code">claude-code</a>, <a href="https://simonwillison.net/tags/claude-mythos">claude-mythos</a></p></summary>
<category term="ai"/>
<category term="prompt-injection"/>
<category term="generative-ai"/>
<category term="llms"/>
<category term="ai-assisted-programming"/>
<category term="coding-agents"/>
<category term="claude-code"/>
<category term="claude-mythos"/>
</entry>
<entry>
<title>datasette 1.0a33</title>
<link href="https://simonwillison.net/2026/Jun/11/datasette/#atom-everything" rel="alternate"/>
<published>2026-06-11T15:26:49+00:00</published>
<updated>2026-06-11T15:26:49+00:00</updated>
<id>https://simonwillison.net/2026/Jun/11/datasette/#atom-everything</id>
<summary type="html"><p><strong>Release:</strong> <a href="https://github.com/simonw/datasette/releases/tag/1.0a33">datasette 1.0a33</a></p>
<p>This alpha is a significant step on the road to a stable 1.0, finally extending the <code>?_extra=</code> pattern I introduced <a href="https://docs.datasette.io/en/1.0a3/changelog.html#a3-2023-08-09">in Datasette 1.0a3</a> to cover queries and rows in addition to tables. That pattern is also <a href="https://docs.datasette.io/en/latest/json_api.html#expanding-json-responses">now documented</a>!</p>
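<p>As a small illustration, here is a sketch that adds <code>_extra</code> parameters to a Datasette JSON URL. The helper name <code>with_extras</code> is hypothetical, and repeating the <code>_extra</code> parameter once per extra is an assumption about the accepted form. The linked documentation is the authority on that.</p>

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def with_extras(url, extras):
    """Return url with one _extra parameter appended per requested extra."""
    parts = urlsplit(url)
    # Keep any existing query string parameters, such as filters.
    query = parse_qsl(parts.query, keep_blank_values=True)
    query.extend(("_extra", name) for name in extras)
    return urlunsplit(parts._replace(query=urlencode(query)))


if __name__ == "__main__":
    print(with_extras("https://latest.datasette.io/fixtures/facetable.json", ["columns"]))
```

<p>The resulting URL can be fetched with any HTTP client. The requested extras should then appear as additional keys in the JSON response.</p>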
<p>I wrote a whole lot more about the new release on the Datasette project blog: <strong><a href="http://datasette.io/blog/2026/api-extras/">Datasette 1.0a33 with JSON extras in the API</a></strong>.</p>
<p>Because API explorer tools are almost free to build now I had Claude Fable 5 in Claude Code (for <a href="https://gist.github.com/simonw/d8bf1a8f36e28fbd595cede946e0ab6d">the plan</a>) and GPT-5.5 xhigh in Codex Desktop (for <a href="https://gist.github.com/simonw/12d5e09797072a6807d7b9cfcc8ff6b7">the implementation</a>) build me this <a href="https://tools.simonwillison.net/datasette-extras-explorer">custom extras API explorer</a> to help demonstrate the feature:</p>
<p><img alt="Screenshot of a web application titled &quot;Datasette extras explorer&quot;. A URL input field contains https://latest.datasette.io/fixtures/facetable.json with a teal Explore button next to it. Below, a left panel labeled EXTRAS (30) lists checkboxes: all_columns - All columns in the table, regardless of _col/_nocol filtering; column_types - Column type assignments for this table; columns (checked) - Column names returned by this query; count - Total count of rows matching these filters; count_sql - SQL query used to calculate the total count; custom_table_templates - Custom template names considered for this table; database - Database name; database_color - Color assigned to the database. A right panel labeled RESPONSE shows GET /fixtures/fac… with Copy JSON and Copy URL buttons, then a dark JSON viewer showing 200 - 9.9 KB - 114ms and JSON: &quot;ok&quot;: true, &quot;next&quot;: null, &quot;columns&quot;: (highlighted array) &quot;pk&quot;, &quot;created&quot;, &quot;planet_int&quot;, &quot;on_earth&quot;, &quot;state&quot;, &quot;_city_id&quot;, &quot;_neighborhood&quot;, &quot;tags&quot;, &quot;complex_array&quot;, &quot;distinct_some_null&quot;, &quot;n&quot;, &quot;rows&quot;: list of objects." src="https://static.simonwillison.net/static/2026/extras-explorer.png" /></p>
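<p>As a rough illustration of the <code>?_extra=</code> pattern, here is a minimal Python sketch that builds a request URL for a Datasette JSON endpoint. The helper name <code>build_extras_url</code> is hypothetical, and passing extras as repeated <code>_extra</code> keys is an assumption; check the linked documentation for the exact forms the server accepts.</p>

```python
from urllib.parse import urlencode


def build_extras_url(json_url, extras):
    """Return json_url with one _extra parameter per requested extra.

    Hypothetical helper for illustration only. Assumption: the server
    accepts repeated _extra keys in the query string.
    """
    query = urlencode([("_extra", name) for name in extras])
    # Leave the URL untouched when no extras were requested.
    return json_url + "?" + query if query else json_url


# The endpoint and extra names below are the ones visible in the screenshot.
example_url = build_extras_url(
    "https://latest.datasette.io/fixtures/facetable.json",
    ["columns", "count"],
)
```

<p>The sketch only constructs the URL; fetching it with any HTTP client should return JSON that includes the requested keys alongside the rows.</p>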
<p>Tags: <a href="https://simonwillison.net/tags/projects">projects</a>, <a href="https://simonwillison.net/tags/datasette">datasette</a>, <a href="https://simonwillison.net/tags/annotated-release-notes">annotated-release-notes</a>, <a href="https://simonwillison.net/tags/ai-assisted-programming">ai-assisted-programming</a></p></summary>
<category term="projects"/>
<category term="datasette"/>
<category term="annotated-release-notes"/>
<category term="ai-assisted-programming"/>
</entry>
<entry>
<title>asyncinject 0.7</title>
<link href="https://simonwillison.net/2026/Jun/11/asyncinject/#atom-everything" rel="alternate"/>
<published>2026-06-11T06:28:09+00:00</published>
<updated>2026-06-11T06:28:09+00:00</updated>
<id>https://simonwillison.net/2026/Jun/11/asyncinject/#atom-everything</id>
<summary type="html"><p><strong>Release:</strong> <a href="https://github.com/simonw/asyncinject/releases/tag/0.7">asyncinject 0.7</a></p>
<p>I built this utility library to support an <code>asyncio</code> dependency injection pattern a few years ago. I was using it with Datasette and Claude Fable 5 spotted some bugs in the dependency which it then fixed for me. It's a very proactive model!</p>
<p>Tags: <a href="https://simonwillison.net/tags/async">async</a>, <a href="https://simonwillison.net/tags/projects">projects</a>, <a href="https://simonwillison.net/tags/python">python</a>, <a href="https://simonwillison.net/tags/claude-mythos">claude-mythos</a></p></summary>
<category term="async"/>
<category term="projects"/>
<category term="python"/>
<category term="claude-mythos"/>
</entry>
<entry>
<title>Anthropic Walks Back Policy That Could Have ‘Sabotaged’ AI Researchers Using Claude</title>
<link href="https://simonwillison.net/2026/Jun/11/anthropic-walks-back-policy/#atom-everything" rel="alternate"/>
<published>2026-06-11T03:45:49+00:00</published>
<updated>2026-06-11T03:45:49+00:00</updated>
<id>https://simonwillison.net/2026/Jun/11/anthropic-walks-back-policy/#atom-everything</id>
<summary type="html"><p><strong><a href="https://www.wired.com/story/anthropic-responds-to-backlash-on-claudes-secret-sabotage-on-ai-research/">Anthropic Walks Back Policy That Could Have ‘Sabotaged’ AI Researchers Using Claude</a></strong></p>
<p>Big scoop for Maxwell Zeff at Wired:</p>
<blockquote>
<p>“We’re changing Fable 5’s safeguards for frontier LLM development to make them visible.” Anthropic said in a statement to WIRED. “We made the wrong tradeoff and we apologize for not getting the balance right.”</p>
</blockquote>
<p>There's been a <em>huge</em> outcry about Anthropic's policy, <a href="https://simonwillison.net/2026/Jun/10/if-claude-fable-stops-helping-you/">tucked away in their system card</a>, that Claude Fable/Mythos would identify "requests targeting frontier LLM development" and "limit effectiveness" without notifying the user.</p>
<p>It's good news that they're dropping the invisible aspect of this. It would be a whole lot better if they dropped this category of refusals entirely.</p>
<p><strong>Update</strong>: More details from <a href="https://twitter.com/claudedevs/status/2064949876463645026">@ClaudeDevs on Twitter</a>:</p>
<blockquote>
<p>We’re rolling out changes to make Fable 5’s safeguards for frontier LLM development visible.</p>
<p>Starting this week, flagged requests will visibly fall back to Opus 4.8—the same as our safeguards for cyber and bio. You will see this every time it happens. On the API, any flagged requests will return a reason for their refusal (coming to server-side fallback in the next few days).</p>
<p>We wanted to deploy Fable 5 to our users quickly and safely. Visible safeguards can be probed, so they have to be robust, which takes time to get right. Invisible safeguards can be targeted more narrowly, allowing us to ship quickly with very few false positives. We went with invisible safeguards for this reason—and that was the wrong tradeoff. You should have visibility into the safeguards we have in place, and why. We’re sorry for not getting the balance right.</p>
</blockquote>
<p><small>Via <a href="https://twitter.com/zeffmax/status/2064910040503627917">@zeffmax</a></small></p>
<p>Tags: <a href="https://simonwillison.net/tags/ai">ai</a>, <a href="https://simonwillison.net/tags/generative-ai">generative-ai</a>, <a href="https://simonwillison.net/tags/llms">llms</a>, <a href="https://simonwillison.net/tags/anthropic">anthropic</a>, <a href="https://simonwillison.net/tags/claude">claude</a>, <a href="https://simonwillison.net/tags/ai-ethics">ai-ethics</a>, <a href="https://simonwillison.net/tags/claude-mythos">claude-mythos</a></p></summary>
<category term="ai"/>
<category term="generative-ai"/>
<category term="llms"/>
<category term="anthropic"/>
<category term="claude"/>
<category term="ai-ethics"/>
<category term="claude-mythos"/>
</entry>
<entry>
<title>datasette-agent 0.2a0</title>
<link href="https://simonwillison.net/2026/Jun/10/datasette-agent/#atom-everything" rel="alternate"/>
<published>2026-06-10T23:57:27+00:00</published>
<updated>2026-06-10T23:57:27+00:00</updated>
<id>https://simonwillison.net/2026/Jun/10/datasette-agent/#atom-everything</id>
<summary type="html"><p><strong>Release:</strong> <a href="https://github.com/datasette/datasette-agent/releases/tag/0.2a0">datasette-agent 0.2a0</a></p>
<p>Highlights from the release notes:</p>
<blockquote>
<ul>
<li>Tools can now ask the user questions mid-execution. Tools that declare a <code>context</code> parameter receive a <code>ToolContext</code> object, and <code>await context.ask_user(...)</code> can ask a yes/no, multiple-choice (<code>options=[...]</code>) or free-text (<code>free_text=True</code>) question. While a question is unanswered the agent turn suspends: the question renders as a form in the chat UI and persists to the internal database, so suspended conversations survive a server restart. Once answered, the tool re-executes from the top with stored answers replayed, so call <code>ask_user()</code> before performing side effects. <a href="https://github.com/datasette/datasette-agent/pull/20">#20</a></li>
<li>New built-in <code>save_query</code> tool: the agent can save SQL it has written as a <a href="https://docs.datasette.io/en/latest/sql_queries.html#saved-queries">Datasette stored query</a>. Saving always requires human approval - the agent shows the full SQL plus the proposed name, database and visibility, and nothing is stored until you click Yes. <a href="https://github.com/datasette/datasette-agent/pull/20">#20</a></li>
</ul>
</blockquote>
<p>The <code>ask_user()</code> feature was enabled by the new LLM alpha I <a href="https://simonwillison.net/2026/Jun/9/claude-fable-5/#adding-features-to-datasette-agent-and-llm-using-claude-code">built yesterday</a> with the help of Claude Fable 5.</p>
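<p>The advice to call <code>ask_user()</code> before performing side effects follows from the replay behaviour described in the release notes. The toy model below is a sketch of that idea only: <code>ToyContext</code>, <code>NeedsAnswer</code> and <code>run_with_replay</code> are hypothetical names invented for this example, and none of this is the datasette-agent implementation.</p>

```python
import asyncio


class NeedsAnswer(Exception):
    """Hypothetical signal that a question has no stored answer yet."""

    def __init__(self, question):
        super().__init__(question)
        self.question = question


class ToyContext:
    """Toy stand-in for a tool context; NOT datasette-agent's ToolContext."""

    def __init__(self, stored_answers):
        self.stored_answers = dict(stored_answers)

    async def ask_user(self, question):
        # Replay a stored answer if there is one, otherwise suspend the tool.
        if question in self.stored_answers:
            return self.stored_answers[question]
        raise NeedsAnswer(question)


side_effects = []


async def delete_rows_tool(context):
    # Ask first: if this suspends, no side effect has happened yet.
    confirmed = await context.ask_user("Delete 3 rows?")
    if confirmed == "yes":
        side_effects.append("deleted 3 rows")
        return "Deleted"
    return "Cancelled"


async def run_with_replay(tool, answers):
    # Every run starts the tool from the top with whatever answers exist.
    try:
        return await tool(ToyContext(answers))
    except NeedsAnswer as pause:
        return "suspended: " + pause.question


def demo():
    first = asyncio.run(run_with_replay(delete_rows_tool, {}))
    second = asyncio.run(
        run_with_replay(delete_rows_tool, {"Delete 3 rows?": "yes"})
    )
    return first, second
```

<p>Because the tool runs from the top on every attempt, a side effect placed above the <code>ask_user()</code> call would happen once on the suspended run and again on the replayed run. Placing it after the question means it happens once.</p>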
<p>Tags: <a href="https://simonwillison.net/tags/ai">ai</a>, <a href="https://simonwillison.net/tags/datasette">datasette</a>, <a href="https://simonwillison.net/tags/generative-ai">generative-ai</a>, <a href="https://simonwillison.net/tags/llms">llms</a>, <a href="https://simonwillison.net/tags/datasette-agent">datasette-agent</a></p></summary>
<category term="ai"/>
<category term="datasette"/>
<category term="generative-ai"/>
<category term="llms"/>
<category term="datasette-agent"/>
</entry>
<entry>
<title>DiffusionGemma</title>
<link href="https://simonwillison.net/2026/Jun/10/diffusiongemma/#atom-everything" rel="alternate"/>
<published>2026-06-10T20:00:54+00:00</published>
<updated>2026-06-10T20:00:54+00:00</updated>
<id>https://simonwillison.net/2026/Jun/10/diffusiongemma/#atom-everything</id>
<summary type="html"><p><strong><a href="https://blog.google/innovation-and-ai/technology/developers-tools/diffusion-gemma-faster-text-generation/">DiffusionGemma</a></strong></p>
<p>Last May Google briefly released an experimental Gemini Diffusion model. I <a href="https://simonwillison.net/2025/May/21/gemini-diffusion/">tried the preview at the time</a> and recorded it running at 857 tokens/second. It was an exciting model, but Google made no further announcements about it.</p>
<p>That research has returned in the best possible way: as a new open weight (Apache 2 licensed) Gemma model, <a href="https://huggingface.co/google/diffusiongemma-26B-A4B-it">google/diffusiongemma-26B-A4B-it</a>.</p>
<p>NVIDIA are currently <a href="https://build.nvidia.com/google/diffusiongemma-26b-a4b-it">hosting the model for free</a> on their NIM cloud API. I used that API to <a href="https://tools.simonwillison.net/markdown-svg-renderer#url=https%3A%2F%2Fgist.github.com%2Fsimonw%2Fe5e234a6dc6eef61e209ce1629620042">generate this pelican</a>, which took 4.4s (according to <code>time uv run generate.py</code>) to return 2,409 tokens - so at least 500 tokens/second.</p>
<p><img alt="Flat minimalist illustration of a white pelican with a large orange beak riding a red bicycle with black wheels, against a pale blue background with a green line representing the ground" src="https://static.simonwillison.net/static/2026/diffusiongemma-pelican.png" /></p>
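<p>The "at least 500 tokens/second" figure can be checked with a line of arithmetic. The sketch below uses only the two numbers quoted above. The wall-clock time from <code>time uv run generate.py</code> presumably also covers process startup and network latency, so the result is a lower bound on generation speed.</p>

```python
def tokens_per_second(token_count, elapsed_seconds):
    """Return throughput as tokens divided by wall-clock seconds."""
    return token_count / elapsed_seconds


# 2,409 tokens returned in 4.4 seconds of total wall-clock time.
throughput = tokens_per_second(2409, 4.4)
print(round(throughput, 1))
```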
<p><small>Via <a href="https://news.ycombinator.com/item?id=48478471">Hacker News</a></small></p>
<p>Tags: <a href="https://simonwillison.net/tags/google">google</a>, <a href="https://simonwillison.net/tags/ai">ai</a>, <a href="https://simonwillison.net/tags/generative-ai">generative-ai</a>, <a href="https://simonwillison.net/tags/llms">llms</a>, <a href="https://simonwillison.net/tags/nvidia">nvidia</a>, <a href="https://simonwillison.net/tags/pelican-riding-a-bicycle">pelican-riding-a-bicycle</a>, <a href="https://simonwillison.net/tags/gemma">gemma</a>, <a href="https://simonwillison.net/tags/llm-release">llm-release</a>, <a href="https://simonwillison.net/tags/llm-performance">llm-performance</a></p></summary>
<category term="google"/>
<category term="ai"/>
<category term="generative-ai"/>
<category term="llms"/>
<category term="nvidia"/>
<category term="pelican-riding-a-bicycle"/>
<category term="gemma"/>
<category term="llm-release"/>
<category term="llm-performance"/>
</entry>
<entry>
<title>Quoting Jeremy Howard</title>
<link href="https://simonwillison.net/2026/Jun/10/jeremy-howard/#atom-everything" rel="alternate"/>
<published>2026-06-10T15:23:34+00:00</published>
<updated>2026-06-10T15:23:34+00:00</updated>
<id>https://simonwillison.net/2026/Jun/10/jeremy-howard/#atom-everything</id>
<summary type="html"><blockquote cite="https://twitter.com/jeremyphoward/status/2064595816875217362"><p>Easy solution to slow down recursive AI self improvement:</p>
<ul>
<li>The lab with the top-ranked model must agree THEY must not use it for working on frontier AI</li>
<li>But everyone else should have access to it.</li>
</ul>
<p>By definition, this means the frontier doesn't advance.</p>
<p>It also has the critical benefit of avoiding a dangerous power imbalance.</p>
<p>Anthropic has chosen the <em>opposite</em> of the safe path: they are allowing themselves, the current top lab, to use their top model for frontier AI research. They've said they'll sabotage others who try.</p>
<p>This means the AI frontier advances, &amp; power imbalance increases.</p>
<p>(To be clear, <em>I</em> don't think we should try to slow down recursive AI self improvement - I think we should open it up and democratize it as much as possible. My point is: if <em>you</em> claim we should slow down, and you have the best model, you should ensure your org can't use it.)</p></blockquote>
<p class="cite">&mdash; <a href="https://twitter.com/jeremyphoward/status/2064595816875217362">Jeremy Howard</a>, in a Twitter thread</p>
<p>Tags: <a href="https://simonwillison.net/tags/ai-ethics">ai-ethics</a>, <a href="https://simonwillison.net/tags/anthropic">anthropic</a>, <a href="https://simonwillison.net/tags/generative-ai">generative-ai</a>, <a href="https://simonwillison.net/tags/claude-mythos">claude-mythos</a>, <a href="https://simonwillison.net/tags/jeremy-howard">jeremy-howard</a>, <a href="https://simonwillison.net/tags/ai">ai</a>, <a href="https://simonwillison.net/tags/llms">llms</a></p></summary>
<category term="ai-ethics"/>
<category term="anthropic"/>
<category term="generative-ai"/>
<category term="claude-mythos"/>
<category term="jeremy-howard"/>
<category term="ai"/>
<category term="llms"/>
</entry>
<entry>
<title>If Claude Fable stops helping you, you'll never know</title>
<link href="https://simonwillison.net/2026/Jun/10/if-claude-fable-stops-helping-you/#atom-everything" rel="alternate"/>
<published>2026-06-10T00:37:25+00:00</published>
<updated>2026-06-10T00:37:25+00:00</updated>
<id>https://simonwillison.net/2026/Jun/10/if-claude-fable-stops-helping-you/#atom-everything</id>
<summary type="html"><p><strong><a href="https://jonready.com/blog/posts/claude-fable5-is-allowed-to-sabotage-your-app-if-youre-a-competitor.html">If Claude Fable stops helping you, you&#x27;ll never know</a></strong></p>
<p>Jonathon Ready highlights one of the more eyebrow-raising details from the <a href="https://www-cdn.anthropic.com/d00db56fa754a1b115b6dd7cb2e3c342ee809620.pdf">319 page system card</a> for Fable 5 and Mythos 5. Here's a longer excerpt, highlights mine:</p>
<blockquote>
<p>In light of the ability of recent models to <a href="https://www.anthropic.com/institute/recursive-self-improvement">accelerate their own development</a>, we’ve <strong>implemented new interventions</strong> that limit Claude’s effectiveness for requests targeting frontier LLM development (for example, on <strong>building pretraining pipelines, distributed training infrastructure, or ML accelerator design</strong>). Using Claude to develop competing models already violates our <a href="https://www.anthropic.com/legal/consumer-terms">Terms of Service</a>, but enforcing this restriction through our safeguards avoids accelerating the actors most willing to violate these terms.</p>
<p>Unlike our interventions for cybersecurity, biology and chemistry, and distillation attempts, <strong>these safeguards will not be visible to the user</strong>. Fable 5 will not fall back to a different model. Instead, the safeguards will limit effectiveness through methods such as prompt modification, steering vectors, or parameter-efficient fine-tuning (PEFT). These interventions will not affect the vast majority of coding work. We estimate they will impact ~0.03% of traffic, concentrated in fewer than 0.1% of organizations.</p>
</blockquote>
<p>I believe this is the first time Anthropic have announced these kinds of silent interventions. The justification still feels pretty science-fiction to me - the linked article talks about "recursive self-improvement". I'm not at all keen on a model that silently corrupts its replies to questions about "ML accelerator design" purely to slow down research that might conflict with Anthropic's own goals!</p>
<p><strong>Update</strong>: Anthropic <a href="https://simonwillison.net/2026/Jun/11/anthropic-walks-back-policy/">walked back this policy</a> in the face of widespread outrage from the research community.</p>
<p><small>Via <a href="https://news.ycombinator.com/item?id=48467896">Hacker News</a></small></p>
<p>Tags: <a href="https://simonwillison.net/tags/ai">ai</a>, <a href="https://simonwillison.net/tags/generative-ai">generative-ai</a>, <a href="https://simonwillison.net/tags/llms">llms</a>, <a href="https://simonwillison.net/tags/anthropic">anthropic</a>, <a href="https://simonwillison.net/tags/claude">claude</a>, <a href="https://simonwillison.net/tags/ai-ethics">ai-ethics</a>, <a href="https://simonwillison.net/tags/claude-mythos">claude-mythos</a></p></summary>
<category term="ai"/>
<category term="generative-ai"/>
<category term="llms"/>
<category term="anthropic"/>
<category term="claude"/>
<category term="ai-ethics"/>
<category term="claude-mythos"/>
</entry>
<entry>
<title>Initial impressions of Claude Fable 5</title>
<link href="https://simonwillison.net/2026/Jun/9/claude-fable-5/#atom-everything" rel="alternate"/>
<published>2026-06-09T23:59:54+00:00</published>
<updated>2026-06-09T23:59:54+00:00</updated>
<id>https://simonwillison.net/2026/Jun/9/claude-fable-5/#atom-everything</id>
<summary type="html"><p>I didn't have early access to today's <a href="https://www.anthropic.com/news/claude-fable-5-mythos-5">Claude Fable 5</a> release, but I've spent the past ~5.5 hours putting it through its paces. My initial impressions are that this is something of a <em>beast</em>. It's slow, expensive and has been quite happily churning through everything I've thrown at it so far. As is frequently the case with current frontier models the challenge is finding tasks that it can't do.</p>
<p>First, let's review the key characteristics.</p>
<p>Anthropic claim that <a href="https://www.anthropic.com/news/claude-fable-5-mythos-5">Claude Fable 5</a> offers the same performance as Claude Mythos 5, except with much more strict guardrails in place to prevent it being used for harmful things. Those guardrails trigger often enough that the Claude API has new mechanisms for letting you know when you hit them, and even has a <a href="https://platform.claude.com/docs/en/build-with-claude/refusals-and-fallback">new option</a> to request it falls back to another model automatically if something gets rejected.</p>
<p>Claude Mythos 5 is out today as well, <a href="https://platform.claude.com/docs/en/about-claude/models/introducing-claude-fable-5-and-claude-mythos-5">Anthropic say it</a> "Shares Claude Fable 5's capabilities without the safety classifiers".</p>
<p>The models have a 1 million token context window, 128,000 maximum output tokens and a knowledge cut-off date of January 2026.</p>
<p>They cost twice as much as Claude Opus 4.5/4.6/4.7/4.8: $10/million input tokens and $50/million output tokens. There's no increase in price for longer context usage.</p>
<p>Other than that the <a href="https://platform.claude.com/docs/en/about-claude/models/introducing-claude-fable-5-and-claude-mythos-5">upgrade guide</a> is substantially thinner than <a href="https://platform.claude.com/docs/en/about-claude/models/whats-new-claude-4-8">the similar guide for Opus 4.8</a>.</p>
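<p>To make the pricing concrete, here is a small cost estimator using the $10 and $50 per million token rates quoted above. The function name and the token counts in the example call are made up for illustration. The sketch ignores anything not mentioned here, such as caching or batch discounts.</p>

```python
INPUT_PRICE_PER_MILLION = 10   # US dollars, from the quoted pricing
OUTPUT_PRICE_PER_MILLION = 50  # US dollars, from the quoted pricing


def estimate_cost(input_tokens, output_tokens):
    """Return the estimated cost in US dollars for one request."""
    input_cost = input_tokens * INPUT_PRICE_PER_MILLION / 1_000_000
    output_cost = output_tokens * OUTPUT_PRICE_PER_MILLION / 1_000_000
    return input_cost + output_cost


# Hypothetical request: a full 1 million token context, 100,000 output tokens.
example_cost = estimate_cost(1_000_000, 100_000)
print(example_cost)
```

<p>Since there is no long-context surcharge, the same two rates apply however much of the context window a request uses.</p>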
<h4 id="the-big-model-smell">The big model smell</h4>
<p>The best way to describe Fable is that it feels <em>big</em>. Not just in terms of speed and cost, but also in how much it knows.</p>
<p>Here's an example prompt, which I used to compare Fable's knowledge with Opus 4.8 (typo included):</p>
<blockquote>
<p><code>List all of Simon Willion's open source projects, most recent first, each with a rough date of when they were first released</code></p>
</blockquote>
<p>Opus 4.8's reply <a href="https://gist.github.com/simonw/e77abf645156ade8c5c610f524226dcb#response-1">started like this</a>:</p>
<blockquote>
<p>I don't have a reliable, comprehensive, and date-verified list of all of Simon Willison's open source projects, and I want to be honest about that rather than risk giving you inaccurate dates or fabricated entries.</p>
<p>Here's what I can tell you with reasonable confidence:</p>
<p><strong>Well-known projects by Simon Willison:</strong></p>
<ul>
<li>
<p><strong>LLM</strong> (~2023) – A command-line tool and Python library for interacting with large language models.</p>
</li>
<li>
<p><strong>Datasette</strong> (~2017) – A tool for exploring and publishing data, probably his most prominent recent project.</p>
</li>
<li>
<p><strong>sqlite-utils</strong> (~2018-2019) – A Python library and CLI tool for manipulating SQLite databases, often used alongside Datasette.</p>
</li>
<li>
<p><strong>Django</strong> (2005) – He is one of the co-creators of the Django web framework, along with Adrian Holovaty, originally developed at the Lawrence Journal-World newspaper.</p>
</li>
</ul>
<p>He also maintains a large ecosystem of <strong>Datasette plugins</strong> and many smaller utilities (often with names like <code>shot-scraper</code>, various <code>*-to-sqlite</code> tools for importing data, etc.).</p>
</blockquote>
<p>That's not bad. Here's <a href="https://gist.github.com/simonw/e77abf645156ade8c5c610f524226dcb#response">the start of the response</a> for the exact same prompt from Fable 5 - both models were running without access to search:</p>
<blockquote>
<p>I think you mean <strong>Simon Willison</strong> — the prolific open source developer, co-creator of Django, and creator of Datasette.</p>
<p>An important caveat upfront: a truly complete list isn't feasible. Simon Willison is famously prolific — he has published <strong>hundreds</strong> of repositories (Datasette alone has 100+ plugins, many written by him), and my knowledge has a cutoff date, so I'll be missing his most recent work. Below are his <strong>notable</strong> projects with approximate initial release dates, most recent first:</p>
<h2 id="notable-projects-approximate-dates-">Notable projects (approximate dates)</h2>
<ul>
<li>
<strong>files-to-prompt</strong> (April 2024) – concatenates files into a single prompt for LLMs</li>
<li>
<strong>datasette-extract</strong> (2024) – plugin for extracting structured data using LLMs</li>
<li>
<strong>LLM</strong> (May–June 2023) – CLI tool and Python library for interacting with large language models, plus a large ecosystem of plugins (llm-gpt4all, llm-mistral, llm-claude, etc., 2023 onward)</li>
<li>
<strong>symbex</strong> (June 2023) – search Python code for symbols</li>
<li>
<strong>ttok</strong> and <strong>strip-tags</strong> (May 2023) – CLI tools for token counting and HTML cleanup for LLM pipelines</li>
<li>
<strong>datasette-lite</strong> (May 2022) – Datasette running entirely in the browser via WebAssembly/Pyodide</li>
<li>
<strong>shot-scraper</strong> (March 2022) – automated website screenshots via Playwright</li>
<li>
<strong>s3-credentials</strong> (November 2021) – CLI for creating scoped AWS S3 credentials</li>
<li>
<strong>django-sql-dashboard</strong> (2021) – SQL reporting dashboards for Django</li>
<li>
<strong>Dogsheep suite</strong> (2019) – personal analytics tools: twitter-to-sqlite, github-to-sqlite, healthkit-to-sqlite, dogsheep-beta, etc.</li>
<li>
<strong>sqlite-utils</strong> (2018) – CLI and Python library for manipulating SQLite databases</li>
<li>
<strong>Datasette</strong> (November 2017) – his flagship project; tool for exploring and publishing data</li>
<li>
<strong>csvs-to-sqlite</strong> (2017) – convert CSV files to SQLite</li>
<li>
<strong>Various early tools</strong> (~2007–2010) – soupselect, json-head, geocoders, and others</li>
<li>
<strong>Django</strong> (developed 2003–2005, open-sourced July 2005) – co-created with Adrian Holovaty at the Lawrence Journal-World</li>
</ul>
</blockquote>
<p>(Here's <a href="https://gist.github.com/simonw/8d05ce836d44bd8543055c9614f7c478">GPT-5.5 for good measure</a>. It listed even more projects than Fable did!)</p>
<p>In the past I've stated that I don't care about how much models <em>know</em> - I want them to be able to manipulate text and code in useful ways and actively look up the information they need via search tools, not bake it into their weights.</p>
<p>But knowledge like this is a reasonably good proxy for model size - you can cram a whole lot more details about the world into a larger number of parameters.</p>
<p>Does <em>knowing more stuff</em> mean the model is better at the tasks we pose to it? I can certainly imagine how a coding model with deeper knowledge of modern libraries and patterns could crunch through coding tasks more effectively.</p>
<p>Is Fable really bigger than Opus? Anthropic haven't said anything about model size, so all we have are tea-leaves, but the speed, pricing and my own poking at its knowledge make me think that it's a large model. Maybe the largest yet from any vendor.</p>
<h4 id="using-fable-in-claude-ai">Using Fable in Claude.ai</h4>
<p>Anthropic made Fable 5 available across all of their surfaces - the <a href="https://claude.ai/">Claude.ai</a> chat interface, Claude Code for web, Claude Code CLI and Claude Cowork as well. The model is available "until June 22nd" on the subscription plans (I'm on $100/month Max at the moment), after which it will be billed extra.</p>
<p>Claude.ai is often under-estimated. Since <a href="https://simonwillison.net/2025/Sep/9/claude-code-interpreter/">September 2025</a> every chat has had access to a full container environment to run code, including the ability to install additional packages and even clone repositories directly from GitHub.</p>
<p>Last week <a href="https://simonwillison.net/2026/Jun/6/micropython-in-a-sandbox/">I released micropython-wasm</a>, a Python library that uses <a href="https://wasmtime.dev/">wasmtime</a> to run a custom build of MicroPython in WebAssembly to act as a sandbox for untrusted Python code.</p>
<p>I decided to see if Fable could upgrade that to running full Python instead. I started with this prompt:</p>
<blockquote>
<p><code>Clone simonw/micropython-wasm from GitHub and research how this could use a full Python as opposed to MicroPython</code></p>
</blockquote>
<p>Fable identified that it could use Brett Cannon's <a href="https://github.com/brettcannon/cpython-wasi-build/releases">cpython-wasi-build</a> builds for this, but was unable to download them itself due to environment restrictions.</p>
<p>So I grabbed the two zip files from that page and uploaded them to Claude:</p>
<blockquote>
<p><code>Here's the Brett Cannon builds</code> (<code>python-3.zip</code>, <code>_build-python-3.zip</code> as attachments)</p>
</blockquote>
<p>And that was that. It churned away for a few minutes and got the entire thing working.</p>
<p>Part of the response included:</p>
<blockquote>
<p>I tried the cleaner single-zip-stdlib approach to shrink the filesystem surface, but CPython's <code>getpath</code> bootstrap fails to find <code>encodings</code> from inside a zip without more prefix finessing — the directory-preopen approach works reliably, so that's what the PoC uses. The zip path is solvable but needs <code>_PYTHONHOME</code>/frozen-getpath work.</p>
</blockquote>
<p>So I said:</p>
<blockquote>
<p><code>Try a bit more at the single-zip-stdlib problem</code></p>
</blockquote>
<p>Then a little later:</p>
<blockquote>
<p><code>I want a wheel that has the whole system in it, the Python wrappers and the WASM files and the stdlibrary, so I can do uv run --with path-to-whl python -c "demo code"</code></p>
</blockquote>
<p>... and it gave me <a href="https://static.simonwillison.net/static/cors-allow/2026/cpython_wasm-0.1.0-py3-none-any.whl">this 13.9MB cpython_wasm-0.1.0-py3-none-any.whl</a> file. You can try running Python code in a sandbox using that wheel URL and <code>uv</code> like this:</p>
<div class="highlight highlight-source-shell"><pre>uv run --with https://static.simonwillison.net/static/cors-allow/2026/cpython_wasm-0.1.0-py3-none-any.whl \
cpython-wasm -c <span class="pl-s"><span class="pl-pds">'</span>print(45 ** 56)<span class="pl-pds">'</span></span></pre></div>
<p>Here's <a href="https://claude.ai/share/a73b8b8b-8ebc-4fef-9e5c-7438e5e7ae35">the full chat transcript</a>.</p>
<p>This was a <em>very</em> strong start.</p>
<h4 id="adding-features-to-datasette-agent-and-llm-using-claude-code">Adding features to Datasette Agent and LLM using Claude Code</h4>
<p>Before I'd realized it was Fable day, my stretch goal for today was to add a new feature to <a href="https://agent.datasette.io/">Datasette Agent</a>: I wanted tool calls within that agent software to gain the ability to pause mid-execution and request approval directly from the user.</p>
<p>This felt like a suitably meaty task to throw at the new model.</p>
<p>Over the course of the day Fable not only <a href="https://github.com/datasette/datasette-agent/pull/20">solved that problem</a>, it also identified and then implemented four improvements to my underlying LLM library that would help support this kind of advanced pause-resume mechanism in tool calls.</p>
<p>It got everything working first using somewhat gnarly hacks, but the moment I told it that changes to LLM itself were in scope it set to work unraveling the hacks and turning them into supported features of LLM instead.</p>
<p>My stretch goal turned into <a href="https://llm.datasette.io/en/latest/changelog.html#a3-2026-06-09">LLM 0.32a3</a>, almost entirely written by Fable. Here are the release notes:</p>
<blockquote>
<p>Driven by the needs of <a href="https://github.com/datasette/datasette-agent">Datasette Agent</a>'s human-in-the-loop <code>ask_user()</code> feature, made the following improvements to how tool calls work:</p>
<ul>
<li>Tool implementations can declare a parameter named <code>llm_tool_call</code> in order to be passed the <code>llm.ToolCall</code> object for the current invocation. This allows them to access the current <code>llm_tool_call.tool_call_id</code>. See <a href="https://llm.datasette.io/en/latest/python-api.html#python-api-tools-llm-tool-call">Accessing the tool call from inside a tool</a>. <a href="https://github.com/simonw/llm/pull/1480">#1480</a>
</li>
<li>Every tool call is now guaranteed a unique <code>tool_call_id</code> - providers that do not supply one get a synthesized <code>tc_</code>-prefixed ULID. <a href="https://github.com/simonw/llm/pull/1481">#1481</a>
</li>
<li>Tools can raise a <code>llm.PauseChain</code> exception to cleanly pause the tool chain, useful for things like waiting for human approval. The exception propagates to the caller with <code>.tool_call</code> and <code>.tool_results</code> (completed sibling results) attached, and no model call is made with a placeholder result. See <a href="https://llm.datasette.io/en/latest/python-api.html#python-api-tools-pause">Pausing a chain from inside a tool</a>. <a href="https://github.com/simonw/llm/pull/1482">#1482</a>
</li>
<li>Failure semantics for concurrent tool execution: async sibling tool calls always run to completion before a pause or hook exception propagates. <a href="https://github.com/simonw/llm/pull/1482">#1482</a>
</li>
<li>Chains can now resume from a <code>messages=</code> history ending in unresolved tool calls: the calls are executed through the normal <code>before_call</code>/<code>after_call</code> machinery before the first model call, skipping any that already have results. The <code>execute_tool_calls()</code> method also accepts a new optional <code>tool_calls_list=</code> argument for executing an explicit list of <code>ToolCall</code> objects in place of the calls requested by the response. See <a href="https://llm.datasette.io/en/latest/python-api.html#python-api-tools-resume">Resuming a chain with pending tool calls</a>. <a href="https://github.com/simonw/llm/pull/1482">#1482</a>
</li>
<li>Fixed a bug where the async tool executor silently dropped calls to tools not present in <code>tools=</code> - these now return <code>Error: tool "..." does not exist</code> results, matching the sync executor. <a href="https://github.com/simonw/llm/pull/1483">#1483</a>
</li>
</ul>
</blockquote>
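<p>The sibling-call guarantee in those notes can be illustrated with a small standalone sketch. This is not LLM's implementation: the <code>PauseChain</code> class below is a local stand-in for <code>llm.PauseChain</code>, and <code>run_tool()</code>, <code>execute_siblings()</code> and <code>demo()</code> are hypothetical names. It assumes the behavior described above, where every concurrent call finishes before the pause propagates with the completed results attached.</p>

```python
import asyncio


class PauseChain(Exception):
    """Local stand-in for llm.PauseChain, defined here for illustration only."""


async def run_tool(name, delay, pause=False):
    # Hypothetical tool: sleeps, then either pauses the chain or returns a result
    await asyncio.sleep(delay)
    if pause:
        raise PauseChain(name)
    return f"{name}: done"


async def execute_siblings(calls):
    # return_exceptions=True lets every sibling finish before failures are inspected
    outcomes = await asyncio.gather(
        *(run_tool(*call) for call in calls), return_exceptions=True
    )
    completed = [o for o in outcomes if not isinstance(o, BaseException)]
    for outcome in outcomes:
        if isinstance(outcome, PauseChain):
            # Attach the results of the siblings that did complete, then propagate
            outcome.tool_results = completed
            raise outcome
    return completed


def demo():
    calls = [("slow", 0.02, False), ("needs_approval", 0.0, True), ("fast", 0.0, False)]
    try:
        asyncio.run(execute_siblings(calls))
    except PauseChain as exc:
        return exc.tool_results
    return None
```

<p>Calling <code>demo()</code> returns the results of both completed siblings, including the slow one that was still running when the pause was raised.</p>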
<p>I'm really impressed with the quality of API design, tests, code and documentation that Fable put together for this. I spent several hours on it today, but it feels like several days' worth of work.</p>
<h4 id="how-much-i-ve-spent">How much I've spent</h4>
<p>I recently started using <a href="https://agentsview.io">AgentsView</a> to help track my local LLM usage across all of the different coding agents. I published a <a href="https://til.simonwillison.net/llms/agentsview-custom-model-price">TIL today</a> about adding custom Fable pricing to that tool, which I expect will not be necessary in the very near future.</p>
<p>After setting the price, I ran this command to start a localhost web server to explore my usage:</p>
<pre><code>uvx agentsview serve
</code></pre>
<p>Here's the treemap showing the breakdown of my Fable usage across various projects today:</p>
<p><img src="https://static.simonwillison.net/static/2026/agentsview-fable-full-day.jpg" alt="Screenshot of a cost tracking dashboard with two panels. The first panel is titled &quot;Cost Attribution&quot; with toggle buttons for Project / Model / Agent and Treemap / List, with Project and Treemap selected. Italic text reads &quot;Click to hide from chart&quot;. A treemap shows a large red block labeled prod_datasette_agent $99.26 89.9%, with smaller blocks to its right labeled cloud (blue), datasette (teal), llm (red), and money (pink), plus a tiny orange sliver. A legend lists: 1 prod_datasette_agent $99.26, 2 cloud $3.98, 3 datasette $2.81, 4 llm $2.30, 5 money $1.92, 6 simon $0.15. The second panel is titled &quot;Top Sessions by Cost&quot; and lists nine sessions, each with a &quot;Claude&quot; badge, a prompt excerpt, a project name with a session UUID (omitted here), a token count, and a cost: 1. Review ./datasette-agent and ./datasette-apps - we are going to add a new feature to agent but you ... prod_datasette_agent, 78.2M, $99.26. 2. issues.db is a copy of the Datasette issues database. There are a LOT of notes in there relating to... datasette, 826.8k, $2.81. 3. Consult fly-docs and then look at datasette.cloud (which launches fly machines) and datasettecloud-... cloud, 924.7k, $2.61. 4. simonwillisonblog.db is a copy of my blog, plus all my software releases and other interesting thin... money, 542.9k, $1.92. 5. Look in datasette.cloud and figure out all remaining steps and decisions that need to be made in or... cloud, 455k, $1.37. 6. Review PRs and issues filed against this repo within the last 4 weeks and see if any deserve to be ... llm, 323.3k, $0.95. 7. run mypy, llm, 320.9k, $0.76. 8. [Image #1] fix this in github actions, llm, 183.9k, $0.59. 9. simon, simon, 26.4k, $0.15." style="max-width: 100%;" /></p>
<p>I used $110.42 worth of tokens today, all as part of my $100/month subscription.</p>
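<p>That total can be checked against the per-project figures in the treemap legend. A quick sketch, using only the numbers from the screenshot description above:</p>

```python
from decimal import Decimal

# Per-project costs as listed in the treemap legend
costs = {
    "prod_datasette_agent": Decimal("99.26"),
    "cloud": Decimal("3.98"),
    "datasette": Decimal("2.81"),
    "llm": Decimal("2.30"),
    "money": Decimal("1.92"),
    "simon": Decimal("0.15"),
}

total = sum(costs.values())
# Share of the day's spend attributed to the largest project, as a percentage
share = costs["prod_datasette_agent"] / total * 100

print(total)
print(round(share, 1))
```

<p>The sum matches the $110.42 figure, and the largest project's share rounds to the 89.9% shown on the treemap.</p>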
<h4 id="and-some-pelicans">And some pelicans</h4>
<p>I ran "Generate an SVG of a pelican riding a bicycle" against all five thinking effort levels with Fable.</p>
<p>Here are <a href="https://tools.simonwillison.net/markdown-svg-renderer#url=https%3A%2F%2Fgist.github.com%2Fsimonw%2F94fde31c34a0400c1d29f57e6a708e6b">the results</a>, including the token cost for each one:</p>
<div style="display: flex; flex-wrap: wrap; gap: 10px; margin-bottom: 1em">
<figure style="margin: 0; flex: 1 1 30%;">
<img src="https://static.simonwillison.net/static/2026/fable-low.jpg" alt="low" style="width: 100%; height: auto;" />
<figcaption style="text-align: center;">low: <a href="https://www.llm-prices.com/#it=25&amp;ot=1929&amp;sel=claude-fable-5">1,929 out, 9.67c</a></figcaption>
</figure>
<figure style="margin: 0; flex: 1 1 30%;">
<img src="https://static.simonwillison.net/static/2026/fable-medium.jpg" alt="medium" style="width: 100%; height: auto;" />
<figcaption style="text-align: center;">medium: <a href="https://www.llm-prices.com/#it=25&amp;ot=2290&amp;sel=claude-fable-5">2,290 out, 11.475c</a></figcaption>
</figure>
<figure style="margin: 0; flex: 1 1 30%;">
<img src="https://static.simonwillison.net/static/2026/fable-high.jpg" alt="high" style="width: 100%; height: auto;" />
<figcaption style="text-align: center;">high: <a href="https://www.llm-prices.com/#it=25&amp;ot=2057&amp;sel=claude-fable-5">2,057 out, 10.31c</a></figcaption>
</figure>
<figure style="margin: 0; flex: 1 1 45%;">
<img src="https://static.simonwillison.net/static/2026/fable-xhigh.jpg" alt="xhigh" style="width: 100%; height: auto;" />
<figcaption style="text-align: center;">xhigh: <a href="https://www.llm-prices.com/#it=25&amp;ot=5992&amp;sel=claude-fable-5">5,992 out, 29.985c</a></figcaption>
</figure>
<figure style="margin: 0; flex: 1 1 45%;">
<img src="https://static.simonwillison.net/static/2026/fable-max.jpg" alt="max" style="width: 100%; height: auto;" />
<figcaption style="text-align: center;">max: <a href="https://www.llm-prices.com/#it=25&amp;ot=14430&amp;sel=claude-fable-5">14,430 out, 72.175c</a></figcaption>
</figure>
</div>
<p>It's interesting that high ended up using fewer tokens than medium for this particular run.</p>
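<p>The per-run costs in those captions follow from the token counts. In the sketch below the two rates are assumptions back-solved from the five captions, not quoted from a price list: $10 per million input tokens and $50 per million output tokens. The 25 input tokens come from the <code>it=25</code> parameter in the links.</p>

```python
from decimal import Decimal

# Assumed rates in dollars per million tokens, back-solved from the captions
INPUT_PER_MILLION = Decimal("10")
OUTPUT_PER_MILLION = Decimal("50")


def cost_in_cents(input_tokens, output_tokens):
    dollars = (
        input_tokens * INPUT_PER_MILLION + output_tokens * OUTPUT_PER_MILLION
    ) / Decimal(1_000_000)
    return dollars * 100


# Output token counts for each thinking effort level, from the captions
runs = {"low": 1929, "medium": 2290, "high": 2057, "xhigh": 5992, "max": 14430}
costs = {level: cost_in_cents(25, out) for level, out in runs.items()}

for level, cents in costs.items():
    print(level, cents)
```

<p>Under those assumed rates every computed value agrees with its caption, so the cost differences between levels come almost entirely from output tokens.</p>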
<p>Here are the <a href="https://simonwillison.net/2026/May/28/claude-opus-4-8/#and-some-pelicans">Opus 4.8 pelicans</a> for comparison.</p>
<p>Tags: <a href="https://simonwillison.net/tags/ai">ai</a>, <a href="https://simonwillison.net/tags/generative-ai">generative-ai</a>, <a href="https://simonwillison.net/tags/llms">llms</a>, <a href="https://simonwillison.net/tags/anthropic">anthropic</a>, <a href="https://simonwillison.net/tags/claude">claude</a>, <a href="https://simonwillison.net/tags/llm-pricing">llm-pricing</a>, <a href="https://simonwillison.net/tags/pelican-riding-a-bicycle">pelican-riding-a-bicycle</a>, <a href="https://simonwillison.net/tags/llm-release">llm-release</a>, <a href="https://simonwillison.net/tags/claude-mythos">claude-mythos</a></p></summary>
<category term="ai"/>
<category term="generative-ai"/>
<category term="llms"/>
<category term="anthropic"/>
<category term="claude"/>
<category term="llm-pricing"/>
<category term="pelican-riding-a-bicycle"/>
<category term="llm-release"/>
<category term="claude-mythos"/>
</entry>
<entry>
<title>llm 0.32a3</title>
<link href="https://simonwillison.net/2026/Jun/9/llm/#atom-everything" rel="alternate"/>
<published>2026-06-09T22:27:03+00:00</published>
<updated>2026-06-09T22:27:03+00:00</updated>
<id>https://simonwillison.net/2026/Jun/9/llm/#atom-everything</id>
<summary type="html"><p><strong>Release:</strong> <a href="https://github.com/simonw/llm/releases/tag/0.32a3">llm 0.32a3</a></p>
<p>Almost entirely written by the new Claude Fable 5, see <a href="https://simonwillison.net/2026/Jun/9/claude-fable-5/#adding-features-to-datasette-agent-and-llm-using-claude-code">my write-up for more details</a>.</p>
<p>Tags: <a href="https://simonwillison.net/tags/projects">projects</a>, <a href="https://simonwillison.net/tags/ai">ai</a>, <a href="https://simonwillison.net/tags/generative-ai">generative-ai</a>, <a href="https://simonwillison.net/tags/llms">llms</a>, <a href="https://simonwillison.net/tags/llm">llm</a>, <a href="https://simonwillison.net/tags/claude-mythos">claude-mythos</a></p></summary>
<category term="projects"/>
<category term="ai"/>
<category term="generative-ai"/>
<category term="llms"/>
<category term="llm"/>
<category term="claude-mythos"/>
</entry>
<entry>
<title>Setting a custom price for a model in AgentsView</title>
<link href="https://simonwillison.net/2026/Jun/9/agentsview-custom-model-price/#atom-everything" rel="alternate"/>
<published>2026-06-09T21:35:31+00:00</published>
<updated>2026-06-09T21:35:31+00:00</updated>
<id>https://simonwillison.net/2026/Jun/9/agentsview-custom-model-price/#atom-everything</id>
<summary type="html"><p><strong>TIL:</strong> <a href="https://til.simonwillison.net/llms/agentsview-custom-model-price">Setting a custom price for a model in AgentsView</a></p>
<p>I've been really enjoying <a href="https://agentsview.io/">AgentsView</a> by Wes McKinney as a tool for exploring my token usage across different coding agents running on my laptop.</p>
<p>Claude Fable 5 came out today and wasn't yet included in the pricing database AgentsView uses. I used Fable to reverse-engineer AgentsView and figured out this recipe for setting custom prices.</p>
<p>Here's my Claude Fable 5 usage for today so far, plotted by AgentsView as a treemap across my different local projects:</p>
<p><img alt="Screenshot of a cost analytics dashboard. Cost Attribution - Click to hide from chart - toggle buttons for Project / Model / Agent and Treemap / List. A treemap shows a large red block: prod_datasette_agent $74.06 89.3%, then blue: cloud $3.98 4.8%, teal: datasette $2.81 3.4%, pink: money $1.92 2.3%, and a thin orange sliver. A legend lists 1 prod_datasette_agent $74.06, 2 cloud $3.98, 3 datasette $2.81, 4 money $1.92, 5 simon $0.15. Below left, Top Sessions by Cost: 1 Claude - Review ./datasette-agent and ./datasette-apps - we are going to a... - prod_datasette_agent · 08a1f374-0e77-420f-be2d-af805d67e8aa - 55.9M $74.06; 2 Claude - issues.db is a copy of the Datasette issues database. There are a... - datasette · 8caa2d2d-b91f-43b3-bf3a-4268995b6011 - 826.8k $2.81; 3 Claude - Consult fly-docs and then look at datasette.cloud (which launche... - cloud · bfcacc70-09d7-4b27-aaec-4bb8accd9fec - 924.7k $2.61; 4 Claude - simonwillisonblog.db is a copy of my blog, plus all my software re... - money · 0c0fb9dc-6347-4e1b-9307-3709a7cdf0c8 - 542.9k $1.92; 5 Claude - Look in datasette.cloud and figure out all remaining steps and dec... - cloud · 45963b5f-608a-4caa-ad6b-6ae81e1dbf0d - 455k $1.37; 6 Claude - simon - simon · deeccb5d-9e90-4b1e-bfe6-c2b271e1b1d4 - 26.4k $0.15. Below right, Cache Efficiency with horizontal bars: Cache Reads 57.6M (nearly full green bar), Cache Writes 769.3K, Uncached Input 64.4K, Output 300.9K (all tiny bars), and a green highlighted note: $516.62 saved vs uncached." src="https://static.simonwillison.net/static/2026/agentsview-fable.jpg" /></p>
<p>Tags: <a href="https://simonwillison.net/tags/ai">ai</a>, <a href="https://simonwillison.net/tags/generative-ai">generative-ai</a>, <a href="https://simonwillison.net/tags/llms">llms</a>, <a href="https://simonwillison.net/tags/llm-pricing">llm-pricing</a>, <a href="https://simonwillison.net/tags/claude-mythos">claude-mythos</a></p></summary>
<category term="ai"/>
<category term="generative-ai"/>
<category term="llms"/>
<category term="llm-pricing"/>
<category term="claude-mythos"/>
</entry>
<entry>
<title>Quoting Andrej Karpathy</title>
<link href="https://simonwillison.net/2026/Jun/9/andrej-karpathy/#atom-everything" rel="alternate"/>
<published>2026-06-09T19:03:10+00:00</published>
<updated>2026-06-09T19:03:10+00:00</updated>
<id>https://simonwillison.net/2026/Jun/9/andrej-karpathy/#atom-everything</id>
<summary type="html"><blockquote cite="https://twitter.com/karpathy/status/2064409694761054332"><p>I feel a lot of things changing as working software increasingly comes out on a tap. The Jevon's paradox kicks in and I feel my own demand for software growing substantially. You can ask for anything - explainers, visualizers, dashboards, bespoke single-use apps (e.g. a full wandb that is hyper-specific just for your project), you can 10X your test suite, auto-optimize code, run giant research projects with custom HTML for the results, anything! "Free your mind" (Matrix ref).</p></blockquote>
<p class="cite">&mdash; <a href="https://twitter.com/karpathy/status/2064409694761054332">Andrej Karpathy</a>, on Claude Fable 5</p>
<p>Tags: <a href="https://simonwillison.net/tags/andrej-karpathy">andrej-karpathy</a>, <a href="https://simonwillison.net/tags/jevons-paradox">jevons-paradox</a>, <a href="https://simonwillison.net/tags/anthropic">anthropic</a>, <a href="https://simonwillison.net/tags/generative-ai">generative-ai</a>, <a href="https://simonwillison.net/tags/ai">ai</a>, <a href="https://simonwillison.net/tags/llms">llms</a>, <a href="https://simonwillison.net/tags/claude-mythos">claude-mythos</a></p></summary>
<category term="andrej-karpathy"/>
<category term="jevons-paradox"/>
<category term="anthropic"/>
<category term="generative-ai"/>
<category term="ai"/>
<category term="llms"/>
<category term="claude-mythos"/>
</entry>
<entry>
<title>Siri AI at WWDC 2026</title>
<link href="https://simonwillison.net/2026/Jun/8/wwdc/#atom-everything" rel="alternate"/>
<published>2026-06-08T23:58:04+00:00</published>
<updated>2026-06-08T23:58:04+00:00</updated>
<id>https://simonwillison.net/2026/Jun/8/wwdc/#atom-everything</id>
<summary type="html"><p>Given how badly burned anyone who took Apple's <a href="https://simonwillison.net/2024/Jun/10/apple-intelligence/">2024 WWDC Apple Intelligence announcements</a> at face value was, I'm holding to a strict "I'll believe it when I see it" policy for everything <a href="https://www.apple.com/newsroom/2026/06/apple-unveils-next-generation-of-apple-intelligence-siri-ai-and-more/">they announced today</a>. </p>
<p>The new Siri AI features do at least look feasible with today's technology, especially since Apple are licensing a custom Gemini-derived model that they can run on their own <a href="https://simonwillison.net/2024/Jun/11/private-cloud-compute/">Private Cloud Compute</a>.</p>
<p>It sounds like they'll be taking advantage of vision LLMs to extract information from the user's screen, which neatly sidesteps the need for every existing application to ship custom code in order to integrate with Apple Intelligence. Vision LLMs were a much less mature category in June 2024.</p>
<p>The new Core AI library looks like a good step in enabling developers to finally take full advantage of Apple's hardware for running their own models. It integrates with Meta's open source PyTorch ecosystem, using these <a href="https://apple.github.io/coreai-torch/main/">Core AI PyTorch extensions</a>:</p>
<blockquote>
<p>Core AI PyTorch Extensions (<code>coreai-torch</code>) is a Python package that bridges PyTorch and Core AI. You can use it to bring up an existing PyTorch model — exported as a <code>torch.export.ExportedProgram</code> — into a Core AI <code>AIProgram</code> ready to run on Apple hardware, traversing the FX graph node-by-node and mapping ATen operators to Core AI operations.</p>
</blockquote>
<p>You can install an iOS 27 Developer Beta today, which supposedly has the new features - but you then have to make it through a waiting list for access to the new Siri AI. Aaron Perris from MacRumors reports having <a href="https://twitter.com/aaronp613/status/2064078063814471977">made it off the waitlist</a> so we may start seeing credible reports on how well Siri AI works in the very near future.</p>
<p><strong>Update</strong>: These Private Cloud Compute Gemini models are running in Google Cloud, and using NVIDIA hardware. According to <a href="https://security.apple.com/blog/expanding-pcc/?linkId=100000425571569">Expanding Private Cloud Compute</a> on Apple's Security Research blog:</p>
<blockquote>
<p>For the most demanding tasks, including agentic tool-use and complex reasoning, we worked with Google and NVIDIA to extend our PCC infrastructure to Google Cloud systems using NVIDIA GPUs, while maintaining Apple's powerful security and privacy protections. [...]</p>
<p>PCC on Google Cloud leverages many of the same architectural security patterns as PCC on Apple silicon to implement these layered protections: initial network data parsing for each request happens in a dedicated process within its own namespace, shared inference software is recycled with a short time-to-live duration, and attested keys are held in a separate, dedicated confidential VM isolated from external inputs. [...]</p>
<p>As with PCC on Apple silicon, all binaries will be published for public inspection.</p>
</blockquote>
<p>Tags: <a href="https://simonwillison.net/tags/vision-llms">vision-llms</a>, <a href="https://simonwillison.net/tags/apple">apple</a>, <a href="https://simonwillison.net/tags/generative-ai">generative-ai</a>, <a href="https://simonwillison.net/tags/ai">ai</a>, <a href="https://simonwillison.net/tags/llms">llms</a>, <a href="https://simonwillison.net/tags/gemini">gemini</a>, <a href="https://simonwillison.net/tags/nvidia">nvidia</a>, <a href="https://simonwillison.net/tags/google">google</a></p></summary>
<category term="vision-llms"/>
<category term="apple"/>
<category term="generative-ai"/>
<category term="ai"/>
<category term="llms"/>
<category term="gemini"/>
<category term="nvidia"/>
<category term="google"/>
</entry>
<entry>
<title>datasette-agent-edit 0.1a0</title>
<link href="https://simonwillison.net/2026/Jun/7/datasette-agent-edit/#atom-everything" rel="alternate"/>
<published>2026-06-07T23:56:38+00:00</published>
<updated>2026-06-07T23:56:38+00:00</updated>
<id>https://simonwillison.net/2026/Jun/7/datasette-agent-edit/#atom-everything</id>
<summary type="html"><p><strong>Release:</strong> <a href="https://github.com/datasette/datasette-agent-edit/releases/tag/0.1a0">datasette-agent-edit 0.1a0</a></p>
<p>I'm planning several plugins for <a href="https://agent.datasette.io/">Datasette Agent</a> which can make edits to existing pieces of text - things like collaborative Markdown editing, updating large SQL queries, and editing SVG files.</p>
<p>Agentic editing of text is a little tricky to get right. My favorite published design for this is for the <a href="https://platform.claude.com/docs/en/agents-and-tools/tool-use/text-editor-tool#use-the-text-editor-tool">Claude text editor</a>, which implements the following tools:</p>
<ul>
<li><code>view</code> - view sections of a file, with line numbers added to every line.</li>
<li><code>str_replace</code> - find an exact <code>old_str</code> and replace it with <code>new_str</code>, failing if <code>old_str</code> is not unique.</li>
<li><code>insert</code> - insert the specified text after the specified line number.</li>
</ul>
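<p>A minimal sketch of those three operations against an in-memory string may help make the design concrete. This is neither the Claude tool nor the <code>datasette-agent-edit</code> code: <code>TextBuffer</code> and its method signatures are hypothetical, and treating a missing <code>old_str</code> as an error alongside a duplicated one is an assumption.</p>

```python
class TextBuffer:
    """Hypothetical in-memory editor; method names mirror the three tools above."""

    def __init__(self, text):
        self.text = text

    def view(self, start=1, end=None):
        # Number every line from 1, then return the requested inclusive range
        numbered = list(enumerate(self.text.splitlines(), start=1))
        return "\n".join(f"{n}: {line}" for n, line in numbered[start - 1:end])

    def str_replace(self, old_str, new_str):
        # Refuse ambiguous edits: the target must match exactly once (assumption)
        count = self.text.count(old_str)
        if count != 1:
            raise ValueError(f"old_str must match exactly once, found {count}")
        self.text = self.text.replace(old_str, new_str)

    def insert(self, line_number, new_text):
        # Insert after the given line; 0 means "before the first line"
        lines = self.text.splitlines()
        if line_number not in range(len(lines) + 1):
            raise ValueError("line_number out of range")
        lines[line_number:line_number] = new_text.splitlines()
        # Note: this simple version drops any trailing newline
        self.text = "\n".join(lines)
```

<p>The uniqueness check is what makes <code>str_replace</code> safe for a model to use: an edit either lands in exactly one place or fails loudly, so the model can retry with more surrounding context.</p>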
<p>Rather than recreate these patterns for every plugin that needs them I decided to create this base plugin, <code>datasette-agent-edit</code>, which implements the core tools in a way that allows them to be adapted for other plugins.</p>
<p>Tags: <a href="https://simonwillison.net/tags/ai">ai</a>, <a href="https://simonwillison.net/tags/datasette">datasette</a>, <a href="https://simonwillison.net/tags/generative-ai">generative-ai</a>, <a href="https://simonwillison.net/tags/llms">llms</a>, <a href="https://simonwillison.net/tags/llm-tool-use">llm-tool-use</a>, <a href="https://simonwillison.net/tags/datasette-agent">datasette-agent</a></p></summary>
<category term="ai"/>
<category term="datasette"/>
<category term="generative-ai"/>
<category term="llms"/>
<category term="llm-tool-use"/>
<category term="datasette-agent"/>
</entry>
<entry>
<title>micropython-wasm 0.1a2</title>
<link href="https://simonwillison.net/2026/Jun/6/micropython-wasm/#atom-everything" rel="alternate"/>
<published>2026-06-06T04:26:06+00:00</published>
<updated>2026-06-06T04:26:06+00:00</updated>
<id>https://simonwillison.net/2026/Jun/6/micropython-wasm/#atom-everything</id>
<summary type="html"><p><strong>Release:</strong> <a href="https://github.com/simonw/micropython-wasm/releases/tag/0.1a2">micropython-wasm 0.1a2</a></p>
<p>I added a CLI to <code>micropython-wasm</code> (<a href="https://github.com/simonw/micropython-wasm/issues/7">issue #7</a>), inspired by the first draft of <a href="https://simonwillison.net/2026/Jun/6/micropython-in-a-sandbox/">the blog entry</a> when I realized it would be a great way to illustrate the <a href="https://simonwillison.net/2026/Jun/6/micropython-in-a-sandbox/#try-it-yourself">Try it yourself</a> section.</p>
<p>Tags: <a href="https://simonwillison.net/tags/python">python</a>, <a href="https://simonwillison.net/tags/sandboxing">sandboxing</a>, <a href="https://simonwillison.net/tags/webassembly">webassembly</a>, <a href="https://simonwillison.net/tags/micropython">micropython</a></p></summary>
<category term="python"/>
<category term="sandboxing"/>
<category term="webassembly"/>
<category term="micropython"/>
</entry>
<entry>
<title>Running Python code in a sandbox with MicroPython and WASM</title>
<link href="https://simonwillison.net/2026/Jun/6/micropython-in-a-sandbox/#atom-everything" rel="alternate"/>
<published>2026-06-06T03:53:34+00:00</published>
<updated>2026-06-06T03:53:34+00:00</updated>
<id>https://simonwillison.net/2026/Jun/6/micropython-in-a-sandbox/#atom-everything</id>
<summary type="html"><p>I've been experimenting with different approaches to running code in a sandbox for several years now, but my latest attempt feels like it might finally have all of the characteristics I've been looking for. I've released it as an alpha package called <a href="https://github.com/simonw/micropython-wasm">micropython-wasm</a>, and I'm using it for a code execution sandbox plugin for <a href="https://github.com/datasette/datasette-agent">Datasette Agent</a> called <a href="https://github.com/datasette/datasette-agent-micropython">datasette-agent-micropython</a>.</p>
<ul>
<li><a href="https://simonwillison.net/2026/Jun/6/micropython-in-a-sandbox/#why-do-i-want-a-sandbox-">Why do I want a sandbox?</a></li>
<li><a href="https://simonwillison.net/2026/Jun/6/micropython-in-a-sandbox/#what-i-want-from-a-sandbox">What I want from a sandbox</a></li>
<li><a href="https://simonwillison.net/2026/Jun/6/micropython-in-a-sandbox/#webassembly-looks-really-promising-here">WebAssembly looks really promising here</a></li>
<li><a href="https://simonwillison.net/2026/Jun/6/micropython-in-a-sandbox/#micropython-in-webassembly">MicroPython in WebAssembly</a></li>
<li><a href="https://simonwillison.net/2026/Jun/6/micropython-in-a-sandbox/#building-the-first-version">Building the first version</a></li>
<li><a href="https://simonwillison.net/2026/Jun/6/micropython-in-a-sandbox/#try-it-yourself">Try it yourself</a></li>
<li><a href="https://simonwillison.net/2026/Jun/6/micropython-in-a-sandbox/#should-you-trust-my-vibe-coded-sandbox-">Should you trust my vibe-coded sandbox?</a></li>
</ul>
<h4 id="why-do-i-want-a-sandbox-">Why do I want a sandbox?</h4>
<p>My key open source projects - <a href="https://datasette.io/">Datasette</a>, <a href="https://llm.datasette.io/">LLM</a>, even <a href="https://sqlite-utils.datasette.io/">sqlite-utils</a> - all support plugins.</p>
<p>I absolutely love plugins as a mechanism for extending software. A carefully designed plugin system reduces the risk involved in trying new things to almost nothing - even the wildest ideas won't leave a lasting influence on the core application itself. My software can grow a new feature overnight and I don't even have to review a pull request!</p>
<p>There's one major drawback: my plugin systems all use Python and <a href="https://pluggy.readthedocs.io/en/latest/">Pluggy</a>, and plugin code executes with full privileges within my applications. A buggy or malicious plugin could break everything or leak private data.</p>
<p>I'd love to be able to run plugin-style code in an environment where it is unable to read unapproved files, connect to a network, or generally operate in a way that's risky or harmful to the rest of the application or the user's computer.</p>
<p>My interest covers more than just plugins. For Datasette in particular there are many features I'd like to support where arbitrary code execution would be useful. I've already experimented with this for <a href="https://enrichments.datasette.io/">Datasette Enrichments</a>, where code can be used to transform values stored in a table. I'd love to build a mechanism where you can run code on a schedule that fetches JSON from an approved location, runs a tiny bit of code to reformat it into a list of dictionaries, then inserts those as rows in a SQLite database table.</p>
<h4 id="what-i-want-from-a-sandbox">What I want from a sandbox</h4>
<p>My goal is to execute code safely within my own Python applications. Here's what I need:</p>
<ul>
<li>Dependencies that <strong>cleanly install from PyPI</strong>, including binary wheels across multiple platforms if necessary. I don't want people using my software to have to take any extra steps beyond directly installing my Python package.</li>
<li>Executed code must be subject to both <strong>memory</strong> and <strong>CPU</strong> limits. I don't want <code>while True: s += "longer string"</code> to crash my application or the user's computer.</li>
<li>
<strong>File access must be strictly controlled</strong>. Either no filesystem access at all or I get to define exactly which files can be read and which files can be written to.</li>
<li>
<strong>Network access is controlled as well</strong>. Sandboxed code should not be able to communicate with anything without going through a layer I fully control.</li>
<li>Support for interaction with <strong>host functions</strong>. A sandbox isn't much use if I can't carefully expose selected platform features to the code that it's running.</li>
<li>It has to be <strong>robust, supported, and clearly documented</strong>. I've lost count of the number of sandbox projects I've seen in repos with warnings that they aren't actively maintained!</li>
</ul>
<h4 id="webassembly-looks-really-promising-here">WebAssembly looks really promising here</h4>
<p>Web browsers operate in the most hostile environment imaginable when it comes to malicious code. Their job is to download <em>and execute</em> untrusted code from the web on almost every page load.</p>
<p>Given this, JavaScript engines should be excellent candidates for sandboxes. Sadly those engines are also extremely complicated, and are not designed for easy embedding in other projects. Most of the V8-in-Python projects I've seen are infrequently maintained and come with warnings not to use them with completely untrusted code.</p>
<p>WebAssembly is a <em>much better</em> candidate. It was designed from the start to support all of the characteristics I care about and has been tested in browsers for nearly a decade. The <a href="https://pypi.org/project/wasmtime">wasmtime</a> Python library brings WASM to Python, is actively maintained, and has binary wheels.</p>
<h4 id="micropython-in-webassembly">MicroPython in WebAssembly</h4>
<p>WebAssembly engines like wasmtime run WebAssembly binaries. Some programming languages like Rust are easy to compile directly to WebAssembly. Dynamic languages like JavaScript and Python are harder - they support language primitives like <code>eval()</code>, which means they need a full interpreter available at runtime.</p>
<p>To run Python we need a full Python interpreter compiled to WebAssembly, wired up in a way that makes it easy to feed it code, hook up host functions and access the results.</p>
<p>Pyodide offers an outstanding package for running Python using WebAssembly in the browser, but using Pyodide in server-side Python isn't supported. The most recent advice I could find was <a href="https://github.com/pyodide/pyodide/discussions/5145">from October 2024</a> stating "Pyodide is built by the Emscripten toolchain and can only run in a browser or Node.js".</p>
<p>The other day I decided to take a look at <a href="https://micropython.org">MicroPython</a> as an option for this. The MicroPython site says:</p>
<blockquote>
<p>MicroPython is a lean and efficient implementation of the Python 3 programming language that includes a small subset of the Python standard library and is optimised to run on microcontrollers and in constrained environments.</p>
</blockquote>
<p>WebAssembly sure feels like a constrained environment to me!</p>
<h4 id="building-the-first-version">Building the first version</h4>
<p>I had GPT-5.5 Pro <a href="https://chatgpt.com/share/6a1e2a5c-58b8-8328-ba1c-0e6aadb0a051">do some research for me</a>, which turned up <a href="https://github.com/micropython/micropython/pull/13676">this PR against MicroPython</a> by <a href="https://github.com/yamt">YAMAMOTO Takashi</a> titled "Experimental WASI support for ports/unix".</p>
<p>It then produced this <a href="https://github.com/simonw/micropython-wasm/blob/c08fbd2276b15dc8c9bdff82845f750971f45647/research.md">research.md document</a>, so I let Codex Desktop and GPT-5.5 high <a href="https://gist.github.com/simonw/27461a16d76f28f8619c609444d544fe">loose on it</a> to see what would happen:</p>
<blockquote>
<p><code>read the research.md document and build this. You will probably need to write a script that compiles a custom WASM version of MicroPython as part of this project - fetch the MicroPython code to a /tmp directory for this as part of that script.</code></p>
</blockquote>
<p>It worked. I now had a prototype Python library that could execute Python code inside a WebAssembly sandbox!</p>
<p>The trickiest piece to solve was persistent interpreter state. The WASM build we are using here exposes a single entry point which starts the interpreter, runs the code and then stops the interpreter at the end.</p>
<p>This works fine for one-off scripts, but for Datasette Agent I want variables and functions to stay resident in memory so I can reuse them across multiple code execution calls.</p>
<p>A neat thing about working with coding agents is that you can get from an idea to a proof of concept quickly. I prompted:</p>
<blockquote>
<p><code>For keeping variables resident: what if we ran code inside micropython itself which called a host function get_next_python_code() and then passed that to eval() - and that host function blocked until new code was available, maybe by running in a thread with a queue? Could that or a similar idea help here?</code></p>
</blockquote>
<p>After some iteration we got to a version of this that works! In Python code you can now do this:</p>
<pre><span class="pl-k">from</span> <span class="pl-s1">micropython_wasm</span> <span class="pl-k">import</span> <span class="pl-v">MicroPythonSession</span>

<span class="pl-k">with</span> <span class="pl-en">MicroPythonSession</span>() <span class="pl-k">as</span> <span class="pl-s1">session</span>:
    <span class="pl-en">print</span>(<span class="pl-s1">session</span>.<span class="pl-c1">run</span>(<span class="pl-s">"x = 10<span class="pl-cce">\n</span>print(x)"</span>).<span class="pl-c1">stdout</span>)
    <span class="pl-en">print</span>(<span class="pl-s1">session</span>.<span class="pl-c1">run</span>(<span class="pl-s">"x += 5<span class="pl-cce">\n</span>print(x)"</span>).<span class="pl-c1">stdout</span>)
    <span class="pl-en">print</span>(<span class="pl-s1">session</span>.<span class="pl-c1">run</span>(<span class="pl-s">"print(x * 2)"</span>).<span class="pl-c1">stdout</span>)</pre>
<p>Under the hood this starts a thread, sets up a request queue and then sends a message to that queue for each <code>session.run()</code> call, each time waiting on a reply queue for the result of that execution. Inside WASM the MicroPython interpreter blocks waiting for a <code>__session_next__()</code> host function to return the next block of code, which it runs <code>eval()</code> on before calling <code>__session_result__({"id": request_id, "ok": True})</code> when each block has been successfully executed.</p>
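<p>The queue-and-thread pattern can be sketched in plain Python with no WebAssembly involved. This is an illustration of the idea, not the <code>micropython-wasm</code> code: the <code>Session</code> class is hypothetical, the worker thread stands in for the sandboxed interpreter, and it uses CPython's <code>exec()</code> against a namespace that persists between calls.</p>

```python
import io
import queue
import threading


class Session:
    """Toy stand-in for a persistent interpreter session (no WASM involved)."""

    def __init__(self):
        self.requests = queue.Queue()
        self.replies = queue.Queue()
        self.worker = threading.Thread(target=self._loop, daemon=True)
        self.worker.start()

    def _loop(self):
        namespace = {}  # survives across run() calls, like resident interpreter state
        while True:
            code = self.requests.get()  # blocks until the host supplies more code
            if code is None:
                break
            out = io.StringIO()
            # Route print() into a per-request buffer instead of real stdout
            namespace["print"] = lambda *args, _out=out: _out.write(
                " ".join(map(str, args)) + "\n"
            )
            try:
                exec(code, namespace)
                self.replies.put({"ok": True, "stdout": out.getvalue()})
            except Exception as exc:
                self.replies.put(
                    {"ok": False, "stdout": out.getvalue(), "error": repr(exc)}
                )

    def run(self, code):
        # Send one block of code and wait for its result on the reply queue
        self.requests.put(code)
        return self.replies.get()

    def close(self):
        self.requests.put(None)
        self.worker.join()
```

<p>Because the worker never exits between requests, a variable assigned in one <code>run()</code> call is still there for the next, which is the property the one-shot entry point lacked.</p>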
<p>The other piece of complexity was supporting host functions, so my Python library could selectively expose functions that could then be called by code running in MicroPython.</p>
<p>Codex ended up solving this with <a href="https://github.com/simonw/micropython-wasm/blob/0.1a1/micropython_wasm/usercmodule/host/hostmodule.c">78 lines of C</a>, which ends up compiled into the <a href="https://github.com/simonw/micropython-wasm/blob/0.1a1/micropython_wasm/artifacts/micropython-wasi.wasm">362KB WebAssembly blob</a> I'm distributing with the package.</p>
<p>I am by no means a C programmer, but I've read the C and had two different models explain it to me (here's <a href="https://claude.ai/share/62f74371-cc3c-44f2-b406-33d03513de9e">Claude's explanation</a>) and I've subjected it to a barrage of tests.</p>
<p>The great thing about working with WebAssembly is that if the C turns out to be fatally flawed the worst that can happen is the WebAssembly execution will fail with an exception. I can live with that risk.</p>
<p>Memory limits are directly supported by wasmtime. CPU limits are a little harder: wasmtime offers a "fuel" concept to limit how many operations a WebAssembly call can execute, and that's the correct fit for this problem, but the units are hard to reason about. I'm experimenting with a 20 million default "fuel" setting now but I'm not confident that it's the most appropriate value.</p>
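<p>To illustrate the general idea of fuel, here's a toy sketch in plain Python. It is not how wasmtime implements metering: the three-instruction machine, <code>run_with_fuel()</code> and <code>OutOfFuel</code> are all invented for this example, and I'm assuming a cost of exactly one unit per instruction.</p>

```python
class OutOfFuel(Exception):
    """Raised when the toy machine has spent its whole fuel budget."""


def run_with_fuel(program, fuel):
    """Run a toy instruction list, spending one unit of fuel per instruction.

    Supported instructions: ("push", n), ("add",) and ("jump", target_index).
    Returns the final stack and the fuel left over once execution
    falls off the end of the program.
    """
    stack = []
    pc = 0
    while pc != len(program):
        if fuel == 0:
            raise OutOfFuel(f"out of fuel at instruction {pc}")
        fuel -= 1
        op = program[pc]
        if op[0] == "push":
            stack.append(op[1])
        elif op[0] == "add":
            stack.append(stack.pop() + stack.pop())
        elif op[0] == "jump":
            pc = op[1]
            continue
        else:
            raise ValueError(f"unknown instruction: {op!r}")
        pc += 1
    return stack, fuel


if __name__ == "__main__":
    # Three instructions cost three units of fuel.
    print(run_with_fuel([("push", 1), ("push", 2), ("add",)], fuel=10))
    # An infinite loop is stopped once the budget is spent.
    try:
        run_with_fuel([("jump", 0)], fuel=1000)
    except OutOfFuel as error:
        print("stopped:", error)
```

<p>The budget counts low-level instructions rather than lines of Python, which is why it's hard to guess how much fuel a given snippet needs.</p>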
<h4 id="try-it-yourself">Try it yourself</h4>
<p>The <code>micropython-wasm</code> alpha is now <a href="https://pypi.org/project/micropython-wasm">live on PyPI</a>.</p>
<p>You can try it from your own Python code as <a href="https://github.com/simonw/micropython-wasm">described in the README</a>. I've also added a simple CLI mode in <a href="https://github.com/simonw/micropython-wasm/releases/tag/0.1a2">version 0.1a2</a> which means you can try it using <code>uvx</code> without first installing it like so:</p>
<div class="highlight highlight-source-shell"><pre>uvx micropython-wasm -c <span class="pl-s"><span class="pl-pds">'</span>print("Hello world")<span class="pl-pds">'</span></span>
<span class="pl-c"><span class="pl-c">#</span> To see it run out of fuel:</span>
uvx micropython-wasm -c <span class="pl-s"><span class="pl-pds">'</span>s = ""; while True: s += "longer"<span class="pl-pds">'</span></span>
<span class="pl-c"><span class="pl-c">#</span> Outputs: micropython-wasm: guest exited with code 1</span></pre></div>
<p>You can also try it in <a href="https://agent.datasette.io/">Datasette Agent</a> like this:</p>
<div class="highlight highlight-source-shell"><pre>uvx llm keys <span class="pl-c1">set</span> openai
<span class="pl-c"><span class="pl-c">#</span> Paste in an OpenAI key, then:</span>
uvx --with datasette-agent \
--with datasette-agent-micropython \
--prerelease allow \
datasette --internal internal.db \
-s plugins.datasette-llm.default_model gpt-5.5 \
--root -o</pre></div>
<p>Then navigate to <a href="http://127.0.0.1:8001/-/agent">http://127.0.0.1:8001/-/agent</a> and run the prompt:</p>
<blockquote>
<p><code>show me some micropython</code></p>
</blockquote>
<p><img src="https://static.simonwillison.net/static/2026/micropython-in-datasette-agent.jpg" style="max-width: 100%" alt="Screenshot of a chat application interface with a dark blue-grey header reading &quot;home&quot; on the left and &quot;root&quot; with a hamburger menu icon on the right. Below is a navigation row with &quot;← Back&quot; and &quot;Chat&quot; on the left and an &quot;EXPORT&quot; button on the right. A blue user message bubble reads &quot;show me some micropython&quot;. Below it a collapsed thinking section reads &quot;▸Thinking: … to show the result clearly. After that, I can wrap up with a brief explanation!&quot; followed by a &quot;▶ Tool: execute_micropython&quot; label. A code block follows: &quot;# A tiny MicroPython example: blink-style logic + Fibonacci&quot; / &quot;def fib(n):&quot; / &quot; a, b = 0, 1&quot; / &quot; out = []&quot; / &quot; for _ in range(n):&quot; / &quot; out.append(a)&quot; / &quot; a, b = b, a + b&quot; / &quot; return out&quot; / &#39;print(&quot;Hello from MicroPython!&quot;)&#39; / &#39;print(&quot;First 10 Fibonacci numbers:&quot;, fib(10))&#39; / &quot;# MicroPython often runs on microcontrollers, e.g.:&quot; / &quot;# from machine import Pin&quot; / &quot;# led = Pin(2, Pin.OUT)&quot; / &quot;# led.value(1) # turn LED on&quot; / &quot;# led.value(0) # turn LED off&quot;. Below a horizontal divider is the output: &quot;Hello from MicroPython!&quot; / &quot;First 10 Fibonacci numbers: [0, 1, 1, 2, 3, 5, 8, 13, 21, 34]&quot;, followed by a &quot;▶ Result: execute_micropython&quot; label. At the bottom is a text input field with placeholder &quot;Type a message...&quot; and a blue &quot;Send&quot; button." /></p>
<p>You can try a live demo of that plugin running in Datasette Agent by signing into <a href="https://agent.datasette.io">agent.datasette.io</a> with your GitHub account.</p>
<h4 id="should-you-trust-my-vibe-coded-sandbox-">Should you trust my vibe-coded sandbox?</h4>
<p>Having complained about immature, loosely-maintained sandboxing libraries, it's deeply ironic that I've now built my own!</p>
<p>I deliberately slapped an alpha release version on it, and I'm not ready to recommend it to anyone who isn't willing to take a significant risk.</p>
<p>I've put it through enough testing that I'm OK using it myself. I've shipped my first plugin that uses it, <a href="https://github.com/datasette/datasette-agent-micropython">datasette-agent-micropython</a>. I've also locked GPT-5.5 xhigh in that Datasette Agent plugin and <a href="https://gist.github.com/simonw/5de497c44d25f9fd459c8aa2c959fe4a">challenged it to break out of the sandbox</a> and so far it has not managed to.</p>
<p>I'm hoping this implementation can convince some companies with professional security teams and high-stakes problems to commit to using Python in WebAssembly as a sandboxing approach and open source their own solutions.</p>
<p>Tags: <a href="https://simonwillison.net/tags/python">python</a>, <a href="https://simonwillison.net/tags/sandboxing">sandboxing</a>, <a href="https://simonwillison.net/tags/ai">ai</a>, <a href="https://simonwillison.net/tags/datasette">datasette</a>, <a href="https://simonwillison.net/tags/webassembly">webassembly</a>, <a href="https://simonwillison.net/tags/generative-ai">generative-ai</a>, <a href="https://simonwillison.net/tags/llms">llms</a>, <a href="https://simonwillison.net/tags/ai-assisted-programming">ai-assisted-programming</a>, <a href="https://simonwillison.net/tags/codex">codex</a>, <a href="https://simonwillison.net/tags/datasette-agent">datasette-agent</a>, <a href="https://simonwillison.net/tags/micropython">micropython</a></p></summary>
<category term="python"/>
<category term="sandboxing"/>
<category term="ai"/>
<category term="datasette"/>
<category term="webassembly"/>
<category term="generative-ai"/>
<category term="llms"/>
<category term="ai-assisted-programming"/>
<category term="codex"/>
<category term="datasette-agent"/>
<category term="micropython"/>
</entry>
<entry>
<title>OpenAI Help: Lockdown Mode</title>
<link href="https://simonwillison.net/2026/Jun/5/openai-help-lockdown-mode/#atom-everything" rel="alternate"/>
<published>2026-06-05T23:56:40+00:00</published>
<updated>2026-06-05T23:56:40+00:00</updated>
<id>https://simonwillison.net/2026/Jun/5/openai-help-lockdown-mode/#atom-everything</id>
<summary type="html"><p><strong><a href="https://help.openai.com/en/articles/20001061-lockdown-mode">OpenAI Help: Lockdown Mode</a></strong></p>
<p>OpenAI first teased this <a href="https://openai.com/index/introducing-lockdown-mode-and-elevated-risk-labels-in-chatgpt/">in February</a>, but now it's live and "rolling out to eligible personal accounts, including Free, Go, Plus, and Pro, and self-serve ChatGPT Business accounts":</p>
<blockquote>
<p>Lockdown Mode is designed to help prevent the final stage of data exfiltration from a prompt injection attack by limiting outbound network requests that could transfer sensitive data to an attacker. Lockdown Mode does not prevent prompt injections from appearing in the content ChatGPT processes. For example, a prompt injection could appear in cached web content or in an uploaded file, and could still affect the behavior or accuracy of a response.</p>
</blockquote>
<p>This looks really good to me.</p>
<p>The <a href="https://simonwillison.net/2025/Jun/16/the-lethal-trifecta/">Lethal Trifecta</a> occurs when an LLM system combines all three of the following: access to private data, exposure to untrusted content and a way to steal data and transmit it back to the attacker.</p>
<p>The only way to solve the trifecta is to cut off one of the three legs, and by far the easiest leg to restrict without making your LLM systems far less useful is the exfiltration vector used to steal data.</p>
<p>It looks to me like lockdown mode directly attacks that leg, using mechanisms that are deterministic and, crucially, are not evaluated by AI systems that themselves can be subverted by sufficiently devious attacks.</p>
<p>The existence of lockdown mode does however imply that ChatGPT, in its default settings, does <em>not</em> provide robust protection against sufficiently determined data exfiltration attacks!</p>
<p><strong>Update</strong>: <a href="https://twitter.com/cryps1s/status/2062923575049531422">This tweet</a> from OpenAI CISO Dane Stuckey:</p>
<blockquote>
<p>Lockdown mode is not meant for everyone. However, for folks who have an elevated risk profile - due to who they are, what they work on, or the types of data they work with - it's an excellent tool for further securing themselves. This has some tradeoffs on functionality and utility, but for these users, the tradeoff is worthwhile.</p>
</blockquote>
<p>Tags: <a href="https://simonwillison.net/tags/security">security</a>, <a href="https://simonwillison.net/tags/ai">ai</a>, <a href="https://simonwillison.net/tags/openai">openai</a>, <a href="https://simonwillison.net/tags/prompt-injection">prompt-injection</a>, <a href="https://simonwillison.net/tags/llms">llms</a>, <a href="https://simonwillison.net/tags/lethal-trifecta">lethal-trifecta</a></p></summary>
<category term="security"/>
<category term="ai"/>
<category term="openai"/>
<category term="prompt-injection"/>
<category term="llms"/>
<category term="lethal-trifecta"/>
</entry>
<entry>
<title>Quoting Andreas Kling</title>
<link href="https://simonwillison.net/2026/Jun/5/andreas-kling/#atom-everything" rel="alternate"/>
<published>2026-06-05T11:10:05+00:00</published>
<updated>2026-06-05T11:10:05+00:00</updated>
<id>https://simonwillison.net/2026/Jun/5/andreas-kling/#atom-everything</id>
<summary type="html"><blockquote cite="https://ladybird.org/posts/changing-how-we-develop-ladybird/"><p>We will no longer accept public pull requests. [...]</p>
<p>A substantial patch used to imply substantial effort, and that effort was a reasonable proxy for good faith. That assumption no longer holds. [...]</p>
<p>Whether code was typed by hand is beside the point. What matters is who is responsible for it once it enters the browser. Ladybird is becoming a browser for real users. The people introducing changes to it must be the people who decide those changes belong in the project, and who will answer for the consequences.</p></blockquote>
<p class="cite">&mdash; <a href="https://ladybird.org/posts/changing-how-we-develop-ladybird/">Andreas Kling</a>, Changing How We Develop Ladybird</p>
<p>Tags: <a href="https://simonwillison.net/tags/ladybird">ladybird</a>, <a href="https://simonwillison.net/tags/ai-ethics">ai-ethics</a>, <a href="https://simonwillison.net/tags/open-source">open-source</a>, <a href="https://simonwillison.net/tags/generative-ai">generative-ai</a>, <a href="https://simonwillison.net/tags/ai">ai</a>, <a href="https://simonwillison.net/tags/andreas-kling">andreas-kling</a>, <a href="https://simonwillison.net/tags/llms">llms</a></p></summary>
<category term="ladybird"/>
<category term="ai-ethics"/>
<category term="open-source"/>
<category term="generative-ai"/>
<category term="ai"/>
<category term="andreas-kling"/>
<category term="llms"/>
</entry>
<entry>
<title>AI enthusiasts are in a race against time, AI skeptics are in a race against entropy</title>
<link href="https://simonwillison.net/2026/Jun/4/ai-enthusiasts-ai-skeptics/#atom-everything" rel="alternate"/>
<published>2026-06-04T23:55:27+00:00</published>
<updated>2026-06-04T23:55:27+00:00</updated>
<id>https://simonwillison.net/2026/Jun/4/ai-enthusiasts-ai-skeptics/#atom-everything</id>
<summary type="html"><p><strong><a href="https://charitydotwtf.substack.com/p/ai-enthusiasts-are-in-a-race-against">AI enthusiasts are in a race against time, AI skeptics are in a race against entropy</a></strong></p>
<p>Charity Majors neatly captures the dynamic between AI enthusiasts and AI skeptics, both of whom are trying to build great software, often in the same teams:</p>
<blockquote>
<p>The enthusiasts are <em>not wrong</em>. We are starting to see real, non-imaginary, discontinuous leaps in capabilities from teams that lean in hard to working with AI. And this does not feel like a normal technology cycle where you can wait for the dust to settle; teams that sit this out while competitors are hustling could be out of business before the dust settles. That’s a real, existential threat.</p>
<p>The skeptics are also <em>not wrong</em>. When you ship code faster than engineers can read it, in domains where nobody has full context, you are making withdrawals from a trust account that took years to build. Reliability degrades, institutional knowledge evaporates. You end up with systems nobody understands, products burbling into incoherence, and on-call rotations that grind people up and spit them out. That is ALSO a real existential threat.</p>
</blockquote>
<p>Charity recommends treating this as both a leadership challenge and an engineering challenge. The key issue:</p>
<blockquote>
<p>There is no natural feedback loop connecting enthusiasts with skeptics.</p>
</blockquote>
<p>Designing feedback loops to help "mend the gap in shared reality" between the two groups is a fascinating organizational design problem.</p>
<p><small>Via <a href="https://lobste.rs/s/ri4flr/ai_enthusiasts_are_race_against_time_ai">Lobste.rs</a></small></p>
<p>Tags: <a href="https://simonwillison.net/tags/ai">ai</a>, <a href="https://simonwillison.net/tags/charity-majors">charity-majors</a>, <a href="https://simonwillison.net/tags/agentic-engineering">agentic-engineering</a></p></summary>
<category term="ai"/>
<category term="charity-majors"/>
<category term="agentic-engineering"/>
</entry>
<entry>
<title>Quoting Emanuel Maiberg, 404 Media</title>
<link href="https://simonwillison.net/2026/Jun/4/a-slightly-different-version/#atom-everything" rel="alternate"/>
<published>2026-06-04T16:38:29+00:00</published>
<updated>2026-06-04T16:38:29+00:00</updated>
<id>https://simonwillison.net/2026/Jun/4/a-slightly-different-version/#atom-everything</id>
<summary type="html"><blockquote cite="https://www.404media.co/google-employees-internally-share-memes-about-how-its-ai-sucks/"><p>After this story was published Google's spokesperson reached out and asked us to publish a slightly different version of that statement. The new statement no longer stated that "it's critical that we maintain humans in the loop."</p></blockquote>
<p class="cite">&mdash; <a href="https://www.404media.co/google-employees-internally-share-memes-about-how-its-ai-sucks/">Emanuel Maiberg, 404 Media</a>, Google Employees Internally Share Memes About How Its AI Sucks</p>
<p>Tags: <a href="https://simonwillison.net/tags/ai-ethics">ai-ethics</a>, <a href="https://simonwillison.net/tags/journalism">journalism</a>, <a href="https://simonwillison.net/tags/ai">ai</a>, <a href="https://simonwillison.net/tags/google">google</a></p></summary>
<category term="ai-ethics"/>
<category term="journalism"/>
<category term="ai"/>
<category term="google"/>
</entry>
<entry>
<title>Uber Caps Usage of AI Tools Like Claude Code to Manage Costs</title>
<link href="https://simonwillison.net/2026/Jun/3/uber-caps-usage/#atom-everything" rel="alternate"/>
<published>2026-06-03T12:01:27+00:00</published>
<updated>2026-06-03T12:01:27+00:00</updated>
<id>https://simonwillison.net/2026/Jun/3/uber-caps-usage/#atom-everything</id>
<summary type="html"><p><strong><a href="https://www.bloomberg.com/news/articles/2026-06-02/uber-caps-usage-of-ai-tools-like-claude-code-to-cut-costs">Uber Caps Usage of AI Tools Like Claude Code to Manage Costs</a></strong></p>
<p>I wrote <a href="https://simonwillison.net/2026/May/27/product-market-fit/#the-ai-failure-stories-around-this-are-pretty-thin">the other day</a> about Uber blowing its 2026 AI budget in four months, and how that wasn't particularly surprising given they would have set that budget in 2025, before anyone could have predicted how popular token-burning coding agents were about to become.
Natalie Lung for Bloomberg:</p>
<blockquote>
<p>The rideshare giant is limiting all employees to $1,500 in monthly token spending per AI coding tool, an Uber spokesperson said in response to a Bloomberg News inquiry. That means spending on one tool doesn’t have a bearing on the budget for another. The limits, which have been instituted in recent months, only apply to agentic coding software such as Cursor or Anthropic PBC’s Claude Code.</p>
</blockquote>
<p>A $1,500 monthly limit per tool strikes me as a rational policy response to over-spending, and <em>much</em> more sensible than those <a href="https://en.wikipedia.org/wiki/Token_maxxing">tokenmaxxing</a> leaderboards encouraging employees to compete for as much AI usage as possible.</p>
<p>It's also interesting in that it hints at a real dollar value for what Uber is getting out of these tools. If we assume two actively used tools per engineer that's $3,000 * 12 = $36,000 cap per engineer per year. Levels.fyi lists <a href="https://www.levels.fyi/companies/uber/salaries/software-engineer?country=254">the median yearly compensation package for Uber software engineers in the USA</a> at $330,000.</p>
<p>That means each employee's AI spending cap is ~11% of that median compensation package.</p>
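<p>Here's that arithmetic as a quick Python check. The two-tools-per-engineer figure is the assumption stated above, and the variable names are just for illustration:</p>

```python
monthly_cap_per_tool = 1_500      # Uber's reported per-tool limit, in dollars
tools_per_engineer = 2            # assumption: two actively used tools
median_compensation = 330_000     # Levels.fyi median for Uber engineers in the USA

annual_cap = monthly_cap_per_tool * tools_per_engineer * 12
share = annual_cap / median_compensation

print(f"${annual_cap:,} per year, {share:.1%} of median compensation")
# prints: $36,000 per year, 10.9% of median compensation
```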
<p>I <a href="https://simonwillison.net/2026/May/27/product-market-fit/#enterprise-customers-are-now-paying-api-prices">noted</a> that my own token usage comes to about $1,000/month against each of Anthropic and OpenAI - which currently costs me just $100 per provider thanks to their generous subsidized plans for individual subscribers. Those plans are no longer available to larger companies like Uber.</p>
<p>Their new policy means if I were working at Uber I'd still have ~$500/month of tokens to spare for each of those tools, given my current usage patterns.</p>
<p>Tags: <a href="https://simonwillison.net/tags/ai">ai</a>, <a href="https://simonwillison.net/tags/generative-ai">generative-ai</a>, <a href="https://simonwillison.net/tags/llms">llms</a>, <a href="https://simonwillison.net/tags/llm-pricing">llm-pricing</a>, <a href="https://simonwillison.net/tags/coding-agents">coding-agents</a>, <a href="https://simonwillison.net/tags/uber">uber</a></p></summary>
<category term="ai"/>
<category term="generative-ai"/>
<category term="llms"/>
<category term="llm-pricing"/>
<category term="coding-agents"/>
<category term="uber"/>
</entry>
<entry>
<title>Microsoft's new MAI models</title>
<link href="https://simonwillison.net/2026/Jun/2/microsofts-new-models/#atom-everything" rel="alternate"/>
<published>2026-06-02T22:21:52+00:00</published>
<updated>2026-06-02T22:21:52+00:00</updated>
<id>https://simonwillison.net/2026/Jun/2/microsofts-new-models/#atom-everything</id>
<summary type="html"><p>Microsoft <a href="https://microsoft.ai/news/building-a-hillclimbing-machine-launching-seven-new-mai-models/">announced two new text LLMs</a> this morning - <strong><a href="https://microsoft.ai/news/introducing-mai-thinking-1/">MAI-Thinking-1</a></strong> (reasoning, 1T parameters, 35B active, available to "select early partners") and <strong><a href="https://microsoft.ai/news/introducingmai-code-1-flash/">MAI-Code-1-Flash</a></strong> (137B Parameters, 5B active, "purpose-built for GitHub Copilot and VS Code to deliver high performance and lower cost [...] rolling out to GitHub Copilot individual users in Visual Studio Code"). I've not been able to try either of them just yet.</p>
<p><strike>It's very interesting to see Microsoft releasing models with such low parameter counts, especially given how expensive larger models are to access right now. They claim MAI-Thinking-1 "is preferred to Sonnet 4.6 in our blind human side-by-side evaluations", which is impressive for a 35B model seeing as I frequently run models larger than that on my own laptop.</strike> (UPDATE: I got this entirely wrong, see note below.)</p>
<p>Also <a href="https://microsoft.ai/news/introducing-mai-thinking-1/">of note</a>:</p>
<blockquote>
<p>We trained [MAI-Thinking-1] from the ground up on enterprise grade, clean and commercially licensed data, without distillation from third-party models.</p>
</blockquote>
<p>And for <a href="https://microsoft.ai/news/introducingmai-code-1-flash/">MAI-Code-1-Flash</a> as well:</p>
<blockquote>
<p>It is built end-to-end by Microsoft using clean and appropriately licensed data.</p>
</blockquote>
<p>I would <em>very much</em> like to learn more about this "appropriately licensed" data! Could these be the first generally useful code-specialist models that didn't train on an unlicensed dump of the web? (<strong>Update</strong>: the answer is no, see note below.)</p>
<p><strong>Update</strong>: My initial published notes got the size of the models wrong. I misread Microsoft's announcements and interpreted the MoE active parameter count as the total parameter count, but the <a href="https://microsoft.ai/pdf/MAI-Code-1-Flash-Model-Card.PDF">model card for MAI-Code-1-Flash</a> lists it as 137B with 5B active and the <a href="https://microsoft.ai/wp-content/uploads/2026/06/main_20260602_2.pdf">MAI-Thinking-1 technical paper</a> reveals it to be a 1T model with 35B active.</p>
<p>I deeply regret this error.</p>
<p><strong>Update 2</strong>: That technical paper describes the training data in some detail from page 80 onwards. It has the same licensing problems as all of the other major LLMs: it's trained on a crawl of the public web:</p>
<blockquote>
<p>The majority of our web HTML corpus comes from a proprietary crawl. After initial page discovery and selection, approximately 1.2 trillion pages are crawled and parsed. [...] In addition to Microsoft standard policy Sec. 2.4, we apply UT1 block list (Prigent, 2026) to remove adult content and piracy-related domains. In all, this filtering reduces the corpus from 1.2 trillion pages to 794 billion pages. Given the prevalence of AI-generated content on the web, we also score pages with a proprietary AI-content detection model and use manual inspection to identify domains with extensive AI-generated content; those domains are filtered out of the training corpus.</p>
<p>[...]</p>
<p>We process Common Crawl with the same pipeline. [...] After filtering, deduplication, merging with the proprietary web corpus, and a final round of exact-URL and content-level fuzzy deduplication, the Common Crawl portion contains 24.2 billion pages.</p>
</blockquote>
<p>I did not cover this one at all well, which is somewhat ironic since I was at the Microsoft Build conference when I wrote this up! I'm sorry for not digging deeper before publishing my initial notes.</p>
<p>Tags: <a href="https://simonwillison.net/tags/llm-release">llm-release</a>, <a href="https://simonwillison.net/tags/generative-ai">generative-ai</a>, <a href="https://simonwillison.net/tags/ai">ai</a>, <a href="https://simonwillison.net/tags/microsoft">microsoft</a>, <a href="https://simonwillison.net/tags/llms">llms</a>, <a href="https://simonwillison.net/tags/training-data">training-data</a></p></summary>
<category term="llm-release"/>
<category term="generative-ai"/>
<category term="ai"/>
<category term="microsoft"/>
<category term="llms"/>
<category term="training-data"/>
</entry>
<entry>
<title>datasette-agent-micropython 0.1a0</title>
<link href="https://simonwillison.net/2026/Jun/2/datasette-agent-micropython/#atom-everything" rel="alternate"/>
<published>2026-06-02T19:28:36+00:00</published>
<updated>2026-06-02T19:28:36+00:00</updated>
<id>https://simonwillison.net/2026/Jun/2/datasette-agent-micropython/#atom-everything</id>
<summary type="html"><p><strong>Release:</strong> <a href="https://github.com/datasette/datasette-agent-micropython/releases/tag/0.1a0">datasette-agent-micropython 0.1a0</a></p>
<p>I want <a href="https://agent.datasette.io">Datasette Agent</a> to be able to generate and execute Python code safely. This alpha is looking promising so far. GPT-5.5 has not yet managed to break out of the sandbox!</p>
<p>Tags: <a href="https://simonwillison.net/tags/python">python</a>, <a href="https://simonwillison.net/tags/sandboxing">sandboxing</a>, <a href="https://simonwillison.net/tags/datasette">datasette</a>, <a href="https://simonwillison.net/tags/webassembly">webassembly</a>, <a href="https://simonwillison.net/tags/datasette-agent">datasette-agent</a>, <a href="https://simonwillison.net/tags/micropython">micropython</a></p></summary>
<category term="python"/>
<category term="sandboxing"/>
<category term="datasette"/>
<category term="webassembly"/>
<category term="datasette-agent"/>
<category term="micropython"/>
</entry>
<entry>
<title>micropython-wasm 0.1a1</title>
<link href="https://simonwillison.net/2026/Jun/2/micropython-wasm/#atom-everything" rel="alternate"/>
<published>2026-06-02T19:20:47+00:00</published>
<updated>2026-06-02T19:20:47+00:00</updated>
<id>https://simonwillison.net/2026/Jun/2/micropython-wasm/#atom-everything</id>
<summary type="html"><p><strong>Release:</strong> <a href="https://github.com/simonw/micropython-wasm/releases/tag/0.1a1">micropython-wasm 0.1a1</a></p>
<p>Fixes for some limitations that emerged while I was trying to use this to build <code>datasette-agent-micropython</code>.</p>
<p>Tags: <a href="https://simonwillison.net/tags/python">python</a>, <a href="https://simonwillison.net/tags/sandboxing">sandboxing</a>, <a href="https://simonwillison.net/tags/webassembly">webassembly</a>, <a href="https://simonwillison.net/tags/micropython">micropython</a></p></summary>
<category term="python"/>
<category term="sandboxing"/>
<category term="webassembly"/>
<category term="micropython"/>
</entry>
<entry>
<title>California Brown Pelican</title>
<link href="https://simonwillison.net/2026/Jun/2/sighting-367841339/#atom-everything" rel="alternate"/>
<published>2026-06-02T18:17:03+00:00</published>
<updated>2026-06-02T18:17:03+00:00</updated>
<id>https://simonwillison.net/2026/Jun/2/sighting-367841339/#atom-everything</id>
<summary type="html"><p><img src="https://static.inaturalist.org/photos/671786719/large.jpg" alt="California Brown Pelican"></p><p>California Brown Pelican, in Fort Mason, CA, US</p><p>I'm at the <a href="https://build.microsoft.com/">Microsoft Build</a> conference today, held at <a href="https://en.wikipedia.org/wiki/Fort_Mason">Fort Mason</a> in San Francisco. There are California Brown Pelicans diving into the water directly behind the venue!</p>
<p>Tags: <a href="https://simonwillison.net/tags/microsoft">microsoft</a>, <a href="https://simonwillison.net/tags/ai">ai</a>, <a href="https://simonwillison.net/tags/generative-ai">generative-ai</a>, <a href="https://simonwillison.net/tags/llms">llms</a>, <a href="https://simonwillison.net/tags/llm-release">llm-release</a></p></summary>
<category term="microsoft"/>
<category term="ai"/>
<category term="generative-ai"/>
<category term="llms"/>
<category term="llm-release"/>
</entry>
<entry>
<title>Pasted File Editor</title>
<link href="https://simonwillison.net/2026/Jun/2/pasted-file-editor/#atom-everything" rel="alternate"/>
<published>2026-06-02T04:13:36+00:00</published>
<updated>2026-06-02T04:13:36+00:00</updated>
<id>https://simonwillison.net/2026/Jun/2/pasted-file-editor/#atom-everything</id>
<summary type="html"><p><strong>Tool:</strong> <a href="https://tools.simonwillison.net/pasted-file-editor">Pasted File Editor</a></p>
<p>I really like how you can paste a large volume of text into <a href="https://claude.ai">claude.ai</a> (or the Claude desktop/mobile apps) and it will detect it as a large paste and turn it into a file attachment instead.</p>
<p>I decided to have Codex desktop <a href="https://gist.github.com/simonw/74c79119b487a5acce18b4dcc26b9f79">build me a version of that</a> as a prototype.</p>
<p>You can also open files directly - including images which will be shown as thumbnails - or drag files onto the textarea.</p>
<p>Tags: <a href="https://simonwillison.net/tags/javascript">javascript</a>, <a href="https://simonwillison.net/tags/tools">tools</a>, <a href="https://simonwillison.net/tags/ai-assisted-programming">ai-assisted-programming</a>, <a href="https://simonwillison.net/tags/claude">claude</a>, <a href="https://simonwillison.net/tags/codex">codex</a></p></summary>
<category term="javascript"/>
<category term="tools"/>
<category term="ai-assisted-programming"/>
<category term="claude"/>
<category term="codex"/>
</entry>
<entry>
<title>micropython-wasm 0.1a0</title>
<link href="https://simonwillison.net/2026/Jun/2/micropython-wasm-2/#atom-everything" rel="alternate"/>
<published>2026-06-02T03:43:45+00:00</published>
<updated>2026-06-02T03:43:45+00:00</updated>
<id>https://simonwillison.net/2026/Jun/2/micropython-wasm-2/#atom-everything</id>
<summary type="html"><p><strong>Release:</strong> <a href="https://github.com/simonw/micropython-wasm/releases/tag/0.1a0">micropython-wasm 0.1a0</a></p>
<p>My latest sandboxing experiment: This alpha package bundles a lightly customized WASM build of <a href="https://micropython.org/">MicroPython</a> with a wrapper to execute code in it via <a href="https://wasmtime.dev/">wasmtime</a>.</p>
<p>Tags: <a href="https://simonwillison.net/tags/python">python</a>, <a href="https://simonwillison.net/tags/sandboxing">sandboxing</a>, <a href="https://simonwillison.net/tags/webassembly">webassembly</a>, <a href="https://simonwillison.net/tags/micropython">micropython</a></p></summary>
<category term="python"/>
<category term="sandboxing"/>
<category term="webassembly"/>
<category term="micropython"/>
</entry>
</feed>
<?xml version="1.0" encoding="utf-8"?>
<feed xml:lang="en-us" xmlns="http://www.w3.org/2005/Atom"><title>Simon Willison's Weblog</title><link href="http://simonwillison.net/" rel="alternate"/><link href="http://simonwillison.net/atom/everything/" rel="self"/><id>http://simonwillison.net/</id><updated>2026-06-13T01:01:50+00:00</updated><author><name>Simon Willison</name></author><entry><title>Statement on the US government directive to suspend access to Fable 5 and Mythos 5</title><link href="https://simonwillison.net/2026/Jun/13/us-government-directive-to-suspend-access/#atom-everything" rel="alternate"/><published>2026-06-13T01:01:50+00:00</published><updated>2026-06-13T01:01:50+00:00</updated><id>https://simonwillison.net/2026/Jun/13/us-government-directive-to-suspend-access/#atom-everything</id><summary type="html">
<p><strong><a href="https://www.anthropic.com/news/fable-mythos-access">Statement on the US government directive to suspend access to Fable 5 and Mythos 5</a></strong></p>
<p>Well this is <em>nuts</em>:</p>
<blockquote>
<p>The US government, citing national security authorities, has issued an export control directive to suspend all access to Fable 5 and Mythos 5 by any foreign national, whether inside or outside the United States, including foreign national Anthropic employees. The net effect of this order is that we must abruptly disable Fable 5 and Mythos 5 for <strong>all</strong> our customers to ensure compliance. <strong>Access to all other Anthropic models</strong> <strong>will not be affected.</strong></p>
<p>We received the directive from the government today at 5:21pm (ET). The letter did not provide specific details of its national security concern. Our understanding is that the government believes it has become aware of a method of bypassing, or "jailbreaking" Fable 5. We reviewed a demonstration of this specific technique being used to identify a small number of previously known, minor vulnerabilities. These vulnerabilities all appear relatively simple, and we have found that other publicly-available models are able to discover them as well without requiring a bypass. [...]</p>
<p>To date, the government has only given us verbal evidence of a potential narrow, non-universal jailbreak, which essentially consists of asking the model to read a specific codebase and fix any software flaws. Our understanding is that one potential jailbreak was shared with the government. We have reviewed the report and validated that the level of capability displayed there is widely available from other models (including OpenAI's <a href="https://deploymentsafety.openai.com/gpt-5-5/tacit-knowledge-and-troubleshooting">GPT-5.5</a>), and is used every day by the defenders who keep systems safe. We will share more details over the next 24 hours.</p>
</blockquote>
<p>I still have access to Fable via <a href="https://claude.ai/">claude.ai</a> and Claude Code now, at 9:01pm ET.</p>
<p><strong>Update</strong>: I ran <a href="https://gist.github.com/simonw/5894cfafc64a2b8aafbe834bc9c950b9">this script</a> against the Anthropic API to spot when <code>claude-fable-5</code> would stop working. My access was cut off at 6:59pm Pacific (9:59pm ET):</p>
<pre>[2026-06-12T18:56:50-07:00] attempt 35: running uv run llm -m claude-fable-5 hi
[2026-06-12T18:56:55-07:00] success: Hi there! How can I help you today?
[2026-06-12T18:57:55-07:00] attempt 36: running uv run llm -m claude-fable-5 hi
[2026-06-12T18:57:59-07:00] success: Hi! How can I help you today?
[2026-06-12T18:58:59-07:00] attempt 37: running uv run llm -m claude-fable-5 hi
[2026-06-12T18:59:00-07:00] FAILED after attempt 37 with exit code 1
stderr:
Error: Error code: 404 - {'type': 'error', 'error': {'type': 'not_found_error', 'message': 'Claude Fable 5 is not available. Please use Opus 4.8. Learn more: https://www.anthropic.com/news/fable-mythos-access'}, 'request_id': 'req_011CbzRyirV7KZLHYYdBM9od'}</pre>
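<p>The polling loop behind a log like that can be sketched in a few lines of Python. This is a minimal illustration and not the linked gist: the function name <code>poll_until_failure</code>, its parameters and the 60 second default interval are assumptions chosen to match the shape of the log output above.</p>

```python
import subprocess
import time
from datetime import datetime


def now():
    # Local time with UTC offset, e.g. 2026-06-12T18:56:50-07:00
    return datetime.now().astimezone().isoformat(timespec="seconds")


def poll_until_failure(command, interval_seconds=60, max_attempts=None):
    """Run command repeatedly until it exits non-zero.

    Returns (attempt, returncode, stderr) for the first failure, or None
    if max_attempts is reached without a failure. Hypothetical helper.
    """
    attempt = 0
    while max_attempts is None or attempt != max_attempts:
        attempt += 1
        print(f"[{now()}] attempt {attempt}: running {' '.join(command)}")
        result = subprocess.run(command, capture_output=True, text=True)
        if result.returncode != 0:
            print(f"[{now()}] FAILED after attempt {attempt} with exit code {result.returncode}")
            print("stderr:")
            print(result.stderr)
            return attempt, result.returncode, result.stderr
        print(f"[{now()}] success: {result.stdout.strip()}")
        time.sleep(interval_seconds)
    return None
```

<p>Calling <code>poll_until_failure(["uv", "run", "llm", "-m", "claude-fable-5", "hi"])</code> would repeat the command from the log once a minute until it exits with an error.</p>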
<p><small>Via <a href="https://twitter.com/AnthropicAI/status/2065597531644743999">@AnthropicAI</a></small></p>
<p>Tags: <a href="https://simonwillison.net/tags/jailbreaking">jailbreaking</a>, <a href="https://simonwillison.net/tags/ai">ai</a>, <a href="https://simonwillison.net/tags/generative-ai">generative-ai</a>, <a href="https://simonwillison.net/tags/llms">llms</a>, <a href="https://simonwillison.net/tags/anthropic">anthropic</a>, <a href="https://simonwillison.net/tags/claude">claude</a>, <a href="https://simonwillison.net/tags/ai-ethics">ai-ethics</a>, <a href="https://simonwillison.net/tags/claude-mythos">claude-mythos</a></p>
</summary><category term="jailbreaking"/><category term="ai"/><category term="generative-ai"/><category term="llms"/><category term="anthropic"/><category term="claude"/><category term="ai-ethics"/><category term="claude-mythos"/></entry><entry><title>OpenAI WebRTC Audio Session, now with document context</title><link href="https://simonwillison.net/2026/Jun/12/openai-webrtc/#atom-everything" rel="alternate"/><published>2026-06-12T23:53:04+00:00</published><updated>2026-06-12T23:53:04+00:00</updated><id>https://simonwillison.net/2026/Jun/12/openai-webrtc/#atom-everything</id><summary type="html">
<p><strong><a href="https://tools.simonwillison.net/openai-webrtc">OpenAI WebRTC Audio Session, now with document context</a></strong></p>
<p>I built the first version of this tool <a href="https://simonwillison.net/2024/Dec/17/openai-webrtc/">in December 2024</a> to try out the then-new OpenAI WebRTC API for interacting with their realtime audio models.</p>
<p>Last month OpenAI <a href="https://openai.com/index/advancing-voice-intelligence-with-new-models-in-the-api/">introduced a brand new model</a> to that API called <a href="https://developers.openai.com/api/docs/models/gpt-realtime-2">GPT‑Realtime‑2</a>, which they promoted as "our first voice model with GPT‑5‑class reasoning" - with a Sep 30, 2024 knowledge cut-off.</p>
<p>I've been waiting for that model to show up in the ChatGPT iPhone app but it still hasn't, so I revisited my old playground.</p>
<p>You can now pick the better model, and you can also paste in a big chunk of document context so you can have an audio conversation in your browser about whatever information you think would be useful to explore in a conversational way.</p>
<p><img src="https://static.simonwillison.net/static/2026/openai-webrtc-document-context.jpg" alt="Screenshot of a web interface titled &quot;OpenAI WebRTC Audio Session&quot; with a gray status dot. Form fields: &quot;OpenAI API Token&quot; showing a masked password of dots, &quot;Voice&quot; dropdown set to &quot;Coral&quot;, &quot;Model&quot; dropdown set to &quot;gpt-realtime-2&quot;. A collapsible section labeled &quot;▼ Document context (optional — paste text to talk about)&quot; with bold instruction &quot;Paste a document here before starting the session and the model will be able to discuss it with you&quot; above a textarea containing a pasted Markdown document about whether DuckDB can run untrusted SQL as safely as Datasette runs SQLite. Below are a blue &quot;Start Session&quot; button and a gray disabled &quot;Mute Mic&quot; button, then a green success message &quot;Session established successfully!&quot; At the bottom, a dark panel headed &quot;Last transcript&quot; reads: &quot;DuckDB can be made about as safe as SQLite for running untrusted SELECT queries, but only if you lock it down properly. Using read only true by itself is not enough, because SQL can still&quot; (text cut off)." class="blogmark-image" style="max-width: 80%">
<p>Tags: <a href="https://simonwillison.net/tags/audio">audio</a>, <a href="https://simonwillison.net/tags/tools">tools</a>, <a href="https://simonwillison.net/tags/ai">ai</a>, <a href="https://simonwillison.net/tags/openai">openai</a>, <a href="https://simonwillison.net/tags/generative-ai">generative-ai</a>, <a href="https://simonwillison.net/tags/llms">llms</a>, <a href="https://simonwillison.net/tags/multi-modal-output">multi-modal-output</a>, <a href="https://simonwillison.net/tags/webrtc">webrtc</a></p>
</summary><category term="audio"/><category term="tools"/><category term="ai"/><category term="openai"/><category term="generative-ai"/><category term="llms"/><category term="multi-modal-output"/><category term="webrtc"/></entry><entry><title>Quoting Andrew Singleton</title><link href="https://simonwillison.net/2026/Jun/12/andrew-singleton/#atom-everything" rel="alternate"/><published>2026-06-12T18:09:21+00:00</published><updated>2026-06-12T18:09:21+00:00</updated><id>https://simonwillison.net/2026/Jun/12/andrew-singleton/#atom-everything</id><summary type="html">
<blockquote cite="https://www.mcsweeneys.net/articles/ai-economics-for-dummies"><p>Jenny owns a crematorium. John’s propane company gives her a $20 billion investment in return for 5 percent of her operation. Jenny throws $10 billion into the incinerator, then pays John $10 billion to buy propane to burn that money to ashes. John reports that his AI investments have generated $10 billion in revenue this quarter and that he owns 5 percent of a $100 billion business. A reporter from <em>Forbes</em> is assigned to profile John and Jenny, and over the course of his research, he becomes embroiled in a passionate but confusing three-way love affair with them, which eventually turns into a polyamorous common-law marriage. His profile is glowing, but light on financial details.</p></blockquote>
<p class="cite">&mdash; <a href="https://www.mcsweeneys.net/articles/ai-economics-for-dummies">Andrew Singleton</a>, AI Economics for Dummies</p>
<p>Tags: <a href="https://simonwillison.net/tags/ai">ai</a></p>
</summary><category term="ai"/></entry><entry><title>Claude Fable is relentlessly proactive</title><link href="https://simonwillison.net/2026/Jun/11/fable-is-relentlessly-proactive/#atom-everything" rel="alternate"/><published>2026-06-11T23:35:17+00:00</published><updated>2026-06-11T23:35:17+00:00</updated><id>https://simonwillison.net/2026/Jun/11/fable-is-relentlessly-proactive/#atom-everything</id><summary type="html">
<p>After two days of experience with <a href="https://simonwillison.net/2026/Jun/9/claude-fable-5/">Claude Fable 5</a> I think the best way to describe it is <strong>relentlessly proactive</strong>. It knows a whole lot of tricks and it will deploy pretty much any of them to get to its goal.</p>
<p>I'll illustrate this with an example. I was hacking on <a href="https://agent.datasette.io/">Datasette Agent</a> today when I noticed a glitch: a horizontal scrollbar that shouldn't be there in the jump menu chat prompt. I snapped this screenshot:</p>
<p><img src="https://static.simonwillison.net/static/2026/jump-to-bug.jpg" alt="Screenshot of a modal dialog demonstrating a scrollbar bug. At the top is a focused search input with blue outline and placeholder &quot;Jump to...&quot;, with an X close button to its right. Below, a heading reads &quot;Start a new agent chat&quot; above a textarea with the placeholder &quot;Ask a question about your data...&quot; — the bug: a thick gray horizontal scrollbar is incorrectly displayed along the bottom edge of the empty textarea, spanning nearly its full width, next to the resize handle. Below the textarea: &quot;Press Enter to start. Shift+Enter adds a new line.&quot; followed by a blue &quot;Start chat&quot; button." style="max-width: 100%;" /></p>
<p>Then I started a fresh <code>claude</code> session in my <code>datasette-agent</code> checkout, dragged in the screenshot and told it:</p>
<blockquote>
<p><code>Look at dependencies to help figure out why there is a horizontal scrollbar here</code></p>
</blockquote>
<p>I had a hunch the cause was in a dependency of Datasette Agent (likely Datasette itself) and I knew Fable was good at digging into dependency code, either by inspecting installed files in its own virtual environment <code>site-packages</code> or by referencing a local checkout on disk. Telling it to start with dependencies felt like a good bet.</p>
<p>I got distracted by a domestic task and wandered away from my computer.</p>
<p>When I came back a few minutes later I saw my machine <em>open a browser window</em> in my regular Firefox and then <em>navigate to the dialog in question</em>. I had not told Claude Code to use any browser automation, and I was pretty sure it wasn't possible for it to trigger mouse movements or keyboard shortcuts within a window, so how was it doing that?</p>
<p>I watched in fascination as it continued with its explorations, then saw it open a Safari window instead of Firefox. I also grabbed this snapshot from the Claude terminal:</p>
<p><img src="https://static.simonwillison.net/static/2026/fable-bash-pyobjc.jpg" alt="Screenshot of two Bash tool calls in a dark terminal interface. First: Bash(open -a Safari /tmp/textarea-scrollbar-test.html &amp;&amp; sleep 4 &amp;&amp; uv run --with pyobjc-framework-Quartz python - &lt;&lt;'EOF' import Quartz wins = Quartz.CGWindowListCopyWindowInfo(Quartz.kCGWindowListOptionOnScreenOnly, Quartz.kCGNullWindowID) for w in wins: if (w.get('kCGWindowOwnerName') or '') == 'Safari' and 'textarea' in (w.get('kCGWindowName') or '').lower(): print(w.get('kCGWindowNumber')) EOF) with output 153551. Second: Bash(screencapture -x -o -l 153551 /tmp/safari-cases.png &amp;&amp; echo ok) with output ok." style="max-width: 100%;" /></p>
<p>What was it doing there with <code>uv run --with pyobjc-framework-Quartz</code>?</p>
<p>It turns out Fable had hacked up its own pattern for taking screenshots of browser windows. It was using Python to iterate through all available windows on my machine, then filtering for Safari windows with expected strings such as <code>"textarea"</code> in the window name. It used that to find their window number - an integer like 153551 - which it could then use with the <code>screencapture</code> CLI tool to grab a PNG.</p>
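<p>Here is a rough sketch of that pattern as a reusable script. The Quartz call, the dictionary keys and the <code>screencapture</code> flags are the ones visible in the terminal screenshot. The function names are my own, and the Quartz part only works on macOS with <code>pyobjc-framework-Quartz</code> installed.</p>

```python
import subprocess


def find_window_numbers(windows, owner, title_fragment):
    """Return window numbers for windows owned by `owner` whose title
    contains `title_fragment` (case-insensitive). Hypothetical helper."""
    matches = []
    for w in windows:
        owner_name = w.get("kCGWindowOwnerName") or ""
        window_name = (w.get("kCGWindowName") or "").lower()
        if owner_name == owner and title_fragment.lower() in window_name:
            matches.append(w.get("kCGWindowNumber"))
    return matches


def list_onscreen_windows():
    """macOS only. Needs pyobjc-framework-Quartz, for example via
    uv run --with pyobjc-framework-Quartz python script.py"""
    import Quartz  # imported lazily so the filter above works anywhere

    return Quartz.CGWindowListCopyWindowInfo(
        Quartz.kCGWindowListOptionOnScreenOnly, Quartz.kCGNullWindowID
    )


def capture_window(window_number, output_path):
    """Capture a single window to a PNG with the screencapture CLI,
    using the same flags as the session above."""
    subprocess.run(
        ["screencapture", "-x", "-o", "-l", str(window_number), output_path],
        check=True,
    )
```

<p>On a Mac, <code>find_window_numbers(list_onscreen_windows(), "Safari", "textarea")</code> should return the matching window numbers, which can then be passed to <code>capture_window()</code>.</p>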
<p>OK fine, that's a neat way of taking screenshots. But what was it taking screenshots of?</p>
<p>Turns out it had been writing its own scratch HTML pages to try and recreate the bug, then opening Safari and grabbing screenshots.</p>
<p>Here's that <a href="https://static.simonwillison.net/static/2026/textarea-scrollbar-test.html">/tmp/textarea-scrollbar-test.html</a> page it created, and the screenshot it took with <code>screencapture -x -o -l 153551 /tmp/safari-cases.png</code>:</p>
<p><img src="https://static.simonwillison.net/static/2026/safari-cases.jpg" alt="Screenshot of a Safari browser window showing a textarea scrollbar test page at file:///private/tmp/textarea-scrollbar-test.html. Page text reads: scrollbar thickness: 17px | UA: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/26.4 Safari/605.1.15 | devicePixelRatio: 2. Four numbered test cases follow, each with a textarea containing the placeholder &quot;Ask a question about your data...&quot;: 1. Exact plugin CSS (resize: vertical, default overflow), 2. Plugin CSS + overflow-x: hidden, 3. Plugin CSS + resize: none, and 4. Bare default textarea, which is a much smaller box with the placeholder wrapping onto two lines." style="max-width: 100%;" />
(I have way too many open tabs!)</p>
<p>OK, so I can see how it's opening test pages and taking screenshots, but how on earth was it triggering the modal dialog that was meant to be under test? That's only available via a click or a keyboard shortcut, and I couldn't see a mechanism for it to run those in Safari.</p>
<p>I eventually figured out what it had done.</p>
<p>Claude was running in a folder that contained the source code for the application. It knows enough about <a href="https://datasette.io/">Datasette</a> to be able to run a local development server. It turns out it was editing Datasette's own templates to add JavaScript that would trigger the correct keyboard shortcut as soon as the window opened, adding code like this:</p>
<div class="highlight highlight-text-html-basic"><pre><span class="pl-kos">&lt;</span><span class="pl-ent">script</span><span class="pl-kos">&gt;</span>
<span class="pl-smi">window</span><span class="pl-kos">.</span><span class="pl-en">addEventListener</span><span class="pl-kos">(</span><span class="pl-s">"load"</span><span class="pl-kos">,</span> <span class="pl-k">function</span> <span class="pl-kos">(</span><span class="pl-kos">)</span> <span class="pl-kos">{</span>
  <span class="pl-en">setTimeout</span><span class="pl-kos">(</span><span class="pl-k">function</span> <span class="pl-kos">(</span><span class="pl-kos">)</span> <span class="pl-kos">{</span>
    <span class="pl-smi">document</span><span class="pl-kos">.</span><span class="pl-en">dispatchEvent</span><span class="pl-kos">(</span><span class="pl-k">new</span> <span class="pl-v">KeyboardEvent</span><span class="pl-kos">(</span><span class="pl-s">"keydown"</span><span class="pl-kos">,</span> <span class="pl-kos">{</span><span class="pl-c1">key</span>: <span class="pl-s">"/"</span><span class="pl-kos">,</span> <span class="pl-c1">bubbles</span>: <span class="pl-c1">true</span><span class="pl-kos">}</span><span class="pl-kos">)</span><span class="pl-kos">)</span><span class="pl-kos">;</span>
  <span class="pl-kos">}</span><span class="pl-kos">,</span> <span class="pl-c1">1200</span><span class="pl-kos">)</span><span class="pl-kos">;</span>
<span class="pl-kos">}</span><span class="pl-kos">)</span><span class="pl-kos">;</span>
<span class="pl-kos">&lt;/</span><span class="pl-ent">script</span><span class="pl-kos">&gt;</span></pre></div>
<p>1.2 seconds after the window opens, this code triggers a simulated <code>/</code> key, which is the keyboard shortcut for opening the modal dialog.</p>
<p>There was one challenge left. In order to understand what was going on, Claude needed to run JavaScript on the page to take measurements for itself.</p>
<p>It wrote its own custom web application to capture information via CORS, then ran that as a local server and opened a page with JavaScript that would POST directly to it!</p>
<p>Here's the Python web app it wrote, using the standard library <a href="https://docs.python.org/3/library/http.server.html">http.server</a> package:</p>
<pre><span class="pl-k">from</span> <span class="pl-s1">http</span>.<span class="pl-s1">server</span> <span class="pl-k">import</span> <span class="pl-v">HTTPServer</span>, <span class="pl-v">BaseHTTPRequestHandler</span>
<span class="pl-k">class</span> <span class="pl-c1">H</span>(<span class="pl-v">BaseHTTPRequestHandler</span>):
    <span class="pl-k">def</span> <span class="pl-en">do_POST</span>(<span class="pl-s1">self</span>):
        <span class="pl-s1">n</span> <span class="pl-c1">=</span> <span class="pl-en">int</span>(<span class="pl-s1">self</span>.<span class="pl-c1">headers</span>.<span class="pl-c1">get</span>(<span class="pl-s">"Content-Length"</span>, <span class="pl-c1">0</span>))
        <span class="pl-en">open</span>(<span class="pl-s">"/tmp/diag.json"</span>, <span class="pl-s">"w"</span>).<span class="pl-c1">write</span>(<span class="pl-s1">self</span>.<span class="pl-c1">rfile</span>.<span class="pl-c1">read</span>(<span class="pl-s1">n</span>).<span class="pl-c1">decode</span>())
        <span class="pl-s1">self</span>.<span class="pl-c1">send_response</span>(<span class="pl-c1">200</span>)
        <span class="pl-s1">self</span>.<span class="pl-c1">send_header</span>(<span class="pl-s">"Access-Control-Allow-Origin"</span>, <span class="pl-s">"*"</span>)
        <span class="pl-s1">self</span>.<span class="pl-c1">end_headers</span>()
    <span class="pl-k">def</span> <span class="pl-en">do_OPTIONS</span>(<span class="pl-s1">self</span>):
        <span class="pl-s1">self</span>.<span class="pl-c1">send_response</span>(<span class="pl-c1">200</span>)
        <span class="pl-s1">self</span>.<span class="pl-c1">send_header</span>(<span class="pl-s">"Access-Control-Allow-Origin"</span>, <span class="pl-s">"*"</span>)
        <span class="pl-s1">self</span>.<span class="pl-c1">send_header</span>(<span class="pl-s">"Access-Control-Allow-Headers"</span>, <span class="pl-s">"*"</span>)
        <span class="pl-s1">self</span>.<span class="pl-c1">end_headers</span>()
    <span class="pl-k">def</span> <span class="pl-en">log_message</span>(<span class="pl-s1">self</span>, <span class="pl-c1">*</span><span class="pl-s1">a</span>): <span class="pl-c"># quiet</span>
        <span class="pl-k">pass</span>
<span class="pl-en">HTTPServer</span>((<span class="pl-s">"127.0.0.1"</span>, <span class="pl-c1">9999</span>), <span class="pl-c1">H</span>).<span class="pl-c1">serve_forever</span>()</pre>
<p>All this does is accept a POST request full of JSON and write that to the <code>/tmp/diag.json</code> file. It sends <code>Access-Control-Allow-Origin: *</code> headers (including from <code>OPTIONS</code> requests) so that code running on another domain can still communicate back to it.</p>
<p>Then Claude injected this code into the template that it was loading in a browser:</p>
<div class="highlight highlight-source-js"><pre><span class="pl-k">const</span> <span class="pl-s1">host</span> <span class="pl-c1">=</span> <span class="pl-smi">document</span><span class="pl-kos">.</span><span class="pl-en">querySelector</span><span class="pl-kos">(</span><span class="pl-s">"navigation-search"</span><span class="pl-kos">)</span><span class="pl-kos">;</span>
<span class="pl-k">const</span> <span class="pl-s1">ta</span> <span class="pl-c1">=</span> <span class="pl-s1">host</span><span class="pl-kos">.</span><span class="pl-c1">shadowRoot</span><span class="pl-kos">.</span><span class="pl-en">querySelector</span><span class="pl-kos">(</span><span class="pl-s">"textarea"</span><span class="pl-kos">)</span><span class="pl-kos">;</span>
<span class="pl-k">const</span> <span class="pl-s1">cs</span> <span class="pl-c1">=</span> <span class="pl-en">getComputedStyle</span><span class="pl-kos">(</span><span class="pl-s1">ta</span><span class="pl-kos">)</span><span class="pl-kos">;</span>
<span class="pl-en">fetch</span><span class="pl-kos">(</span><span class="pl-s">"http://127.0.0.1:9999/diag"</span><span class="pl-kos">,</span> <span class="pl-kos">{</span>
  <span class="pl-c1">method</span>: <span class="pl-s">"POST"</span><span class="pl-kos">,</span>
  <span class="pl-c1">body</span>: <span class="pl-c1">JSON</span><span class="pl-kos">.</span><span class="pl-en">stringify</span><span class="pl-kos">(</span><span class="pl-kos">{</span>
    <span class="pl-c1">dpr</span>: <span class="pl-smi">window</span><span class="pl-kos">.</span><span class="pl-c1">devicePixelRatio</span><span class="pl-kos">,</span>
    <span class="pl-c1">scrollWidth</span>: <span class="pl-s1">ta</span><span class="pl-kos">.</span><span class="pl-c1">scrollWidth</span><span class="pl-kos">,</span> <span class="pl-c1">clientWidth</span>: <span class="pl-s1">ta</span><span class="pl-kos">.</span><span class="pl-c1">clientWidth</span><span class="pl-kos">,</span>
    <span class="pl-c1">whiteSpace</span>: <span class="pl-s1">cs</span><span class="pl-kos">.</span><span class="pl-c1">whiteSpace</span><span class="pl-kos">,</span> <span class="pl-c1">width</span>: <span class="pl-s1">cs</span><span class="pl-kos">.</span><span class="pl-c1">width</span><span class="pl-kos">,</span>
  <span class="pl-kos">}</span><span class="pl-kos">)</span><span class="pl-kos">,</span>
<span class="pl-kos">}</span><span class="pl-kos">)</span><span class="pl-kos">;</span></pre></div>
<p>This took measurements of the <code>&lt;textarea&gt;</code> inside the <code>&lt;navigation-search&gt;</code> Web Component and sent them to the server, which wrote them to a file on disk, which Claude could then read.</p>
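<p>I don't know exactly how Claude interpreted that file, but one plausible check is to compare <code>scrollWidth</code> against <code>clientWidth</code>: when the content is wider than the visible box the element overflows horizontally, which is what produces a scrollbar. The helper names in this sketch are assumptions of mine, while the keys are the ones sent by the JavaScript above.</p>

```python
import json


def load_measurements(path="/tmp/diag.json"):
    """Read the JSON written to disk by the diagnostics server."""
    with open(path) as f:
        return json.load(f)


def has_horizontal_overflow(measurements):
    """True when the textarea content is wider than its visible box."""
    return measurements["scrollWidth"] > measurements["clientWidth"]
```

<p>After the page has posted its measurements, <code>has_horizontal_overflow(load_measurements())</code> reports whether the textarea overflowed in that browser.</p>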
<p>Having figured out all of these tricks Fable... hit some invisible guardrail and downgraded itself to Opus. Thankfully Opus had access to the full transcript and could continue using the tricks pioneered by Fable, and shortly afterwards found, tested and verified <a href="https://github.com/datasette/datasette-agent/commit/a75a8b727b42c30ced1fc41dc8add7eb9f04fefe">the fix</a>.</p>
<p>I prompted Opus to:</p>
<blockquote>
<p><code>Write a report in /tmp/automation-report.md where you note down all of the tricks you have used in this session to test against real browsers on my computer, include runnable code examples</code></p>
</blockquote>
<p>Which produced <a href="https://gist.github.com/simonw/aef7f7db9ac992643110a74e43d6d42f">this report</a>, which was invaluable for piecing together the details of what had happened for this post.</p>
<p>I've shared <a href="https://gisthost.github.io/?cc14774f6d37eb67bf089f3ac3925f8f">the full terminal transcript</a> of the Claude Code session as well.</p>
<h4 id="a-review-of-everything-it-did">A review of everything it did</h4>
<p>Based on a screenshot and a one-line prompt, Claude Fable 5 + Claude Code:</p>
<ul>
<li>Figured out the recipe to run the local development server (with fake environment variables needed to get it running)</li>
<li>Fired up a Playwright Chrome session</li>
<li>Turned on the visible scrollbars setting for Chrome <code>defaults write com.google.chrome.for.testing AppleShowScrollBars Always</code> (it turned that off again later)</li>
<li>Cycled through Firefox and WebKit in Playwright too, failing to recreate the bug</li>
<li>Worked out my default browser was Safari</li>
<li>Built a <code>textarea-scrollbar-test.html</code> HTML document</li>
<li>Opened that in real (not Playwright) Firefox</li>
<li>Found that <code>osascript -e 'tell application "System Events" to tell process "firefox" to id of window 1'</code> was blocked because "osascript is not allowed assistive access"</li>
<li>Figured out that <code>uv run --with pyobjc-framework-Quartz python</code> workaround, described above</li>
<li>Added JavaScript to the site templates in order to trigger the <code>/</code> key</li>
<li>Built its own little Python CORS web server to capture JSON data</li>
<li>Rewrote the template to capture that data and send it to the server</li>
<li>Scripted its way through the Web Component shadow DOM to the information it needed</li>
<li>Opened Safari to confirm the source of the bug</li>
<li>Modified its custom template to hack in a potential fix</li>
<li>Confirmed the hacked fix worked</li>
<li>Reported back on how to fix the problem</li>
</ul>
<p>Like I said, relentlessly proactive!</p>
<h4 id="an-estimate-of-the-cost">An estimate of the cost</h4>
<p>I'm currently on the $100/month Claude Max plan, which includes a generous allowance for Fable up until June 22nd, after which Anthropic say they'll start charging full API prices for it.</p>
<p>I'm using <a href="https://www.agentsview.io">AgentsView</a> to track my spending (see <a href="https://til.simonwillison.net/llms/agentsview-custom-model-price">this TIL</a>). Here's what AgentsView says this session would have cost me if I was paying full price for it:</p>
<pre><code>~ % uvx agentsview session usage be8850a7-6119-46a0-b5d6-79c7fff5ae2b
Session: be8850a7-6119-46a0-b5d6-79c7fff5ae2b
Agent: claude
Output: 68606
Peak ctx: 113178
Cost: ~$12.11 (claude-fable-5, claude-opus-4-8)
</code></pre>
<p>If you don't keep a close eye on it, Fable will quite happily burn $12 in tokens inventing new ways to debug your CSS.</p>
<h4 id="i-really-need-to-lock-this-thing-down">I really need to lock this thing down</h4>
<p>On the one hand, watching Fable go to extreme lengths to get the information that it needed to debug what was, in the end, a two-line CSS fix, was <em>fascinating</em>.</p>
<p>But on the other hand... this is a robust reminder that coding agents can do anything <em>you</em> can do by typing commands into a terminal - and frontier models know every trick in the book, and evidently a few that nobody has ever written down before.</p>
<p>If Fable had been acting on malicious instructions - a prompt injection attack hidden in code or an issue thread, or something I'd carelessly pasted into my terminal - it's alarming to think quite how far it could go to exfiltrate data or cause other forms of mischief.</p>
<p>Running coding agents outside of a sandbox has always been a bad idea - it's my top contender for <a href="https://simonwillison.net/2026/Jan/8/llm-predictions-for-2026/#1-year-a-challenger-disaster-for-coding-agent-security">a Challenger disaster</a> incident, as described by Johann Rehberger in <a href="https://embracethered.com/blog/posts/2025/the-normalization-of-deviance-in-ai/">The Normalization of Deviance in AI</a>.</p>
<p>Fable is arguably smarter and hence more suspicious of potentially malicious instructions. But that smartness is very much a two-edged sword: if it <em>does</em> get subverted by instructions, the amount of damage it can do given its relentless proactivity is terrifying.</p>
<p>Tags: <a href="https://simonwillison.net/tags/ai">ai</a>, <a href="https://simonwillison.net/tags/prompt-injection">prompt-injection</a>, <a href="https://simonwillison.net/tags/generative-ai">generative-ai</a>, <a href="https://simonwillison.net/tags/llms">llms</a>, <a href="https://simonwillison.net/tags/ai-assisted-programming">ai-assisted-programming</a>, <a href="https://simonwillison.net/tags/coding-agents">coding-agents</a>, <a href="https://simonwillison.net/tags/claude-code">claude-code</a>, <a href="https://simonwillison.net/tags/claude-mythos">claude-mythos</a></p>
</summary><category term="ai"/><category term="prompt-injection"/><category term="generative-ai"/><category term="llms"/><category term="ai-assisted-programming"/><category term="coding-agents"/><category term="claude-code"/><category term="claude-mythos"/></entry><entry><title>datasette 1.0a33</title><link href="https://simonwillison.net/2026/Jun/11/datasette/#atom-everything" rel="alternate"/><published>2026-06-11T15:26:49+00:00</published><updated>2026-06-11T15:26:49+00:00</updated><id>https://simonwillison.net/2026/Jun/11/datasette/#atom-everything</id><summary type="html">
<p><strong>Release:</strong> <a href="https://github.com/simonw/datasette/releases/tag/1.0a33">datasette 1.0a33</a></p>
<p>This alpha is a significant step on the road to a stable 1.0, finally extending the <code>?_extra=</code> pattern I introduced <a href="https://docs.datasette.io/en/1.0a3/changelog.html#a3-2023-08-09">in Datasette 1.0a3</a> to cover queries and rows in addition to tables. That pattern is also <a href="https://docs.datasette.io/en/latest/json_api.html#expanding-json-responses">now documented</a>!</p>
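<p>As a small illustration of the pattern, here is a sketch that builds a JSON URL asking for specific extras. It assumes that extras are requested by repeating the <code>_extra</code> parameter and that <code>columns</code> and <code>count</code> are valid extras for a table. The <code>extras_url</code> helper is hypothetical and not part of Datasette.</p>

```python
from urllib.parse import urlencode


def extras_url(json_url, extras):
    """Append one _extra parameter per requested key (hypothetical helper)."""
    query = urlencode([("_extra", name) for name in extras])
    return f"{json_url}?{query}" if query else json_url


url = extras_url(
    "https://latest.datasette.io/fixtures/facetable.json",
    ["columns", "count"],
)
print(url)
```

<p>Fetching the resulting URL should return the usual JSON with the requested extra keys included alongside the rows.</p>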
<p>I wrote a whole lot more about the new release on the Datasette project blog: <strong><a href="http://datasette.io/blog/2026/api-extras/">Datasette 1.0a33 with JSON extras in the API</a></strong>.</p>
<p>Because API explorer tools are almost free to build now I had Claude Fable 5 in Claude Code (for <a href="https://gist.github.com/simonw/d8bf1a8f36e28fbd595cede946e0ab6d">the plan</a>) and GPT-5.5 xhigh in Codex Desktop (for <a href="https://gist.github.com/simonw/12d5e09797072a6807d7b9cfcc8ff6b7">the implementation</a>) build me this <a href="https://tools.simonwillison.net/datasette-extras-explorer">custom extras API explorer</a> to help demonstrate the feature:</p>
<p><img alt="Screenshot of a web application titled &quot;Datasette extras explorer&quot;. A URL input field contains https://latest.datasette.io/fixtures/facetable.json with a teal Explore button next to it. Below, a left panel labeled EXTRAS (30) lists checkboxes: all_columns - All columns in the table, regardless of _col/_nocol filtering; column_types - Column type assignments for this table; columns (checked) - Column names returned by this query; count - Total count of rows matching these filters; count_sql - SQL query used to calculate the total count; custom_table_templates - Custom template names considered for this table; database - Database name; database_color - Color assigned to the database. A right panel labeled RESPONSE shows GET /fixtures/fac… with Copy JSON and Copy URL buttons, then a dark JSON viewer showing 200 - 9.9 KB - 114ms and JSON: &quot;ok&quot;: true, &quot;next&quot;: null, &quot;columns&quot;: (highlighted array) &quot;pk&quot;, &quot;created&quot;, &quot;planet_int&quot;, &quot;on_earth&quot;, &quot;state&quot;, &quot;_city_id&quot;, &quot;_neighborhood&quot;, &quot;tags&quot;, &quot;complex_array&quot;, &quot;distinct_some_null&quot;, &quot;n&quot;, &quot;rows&quot;: list of objects." src="https://static.simonwillison.net/static/2026/extras-explorer.png" /></p>
<p>Tags: <a href="https://simonwillison.net/tags/projects">projects</a>, <a href="https://simonwillison.net/tags/datasette">datasette</a>, <a href="https://simonwillison.net/tags/annotated-release-notes">annotated-release-notes</a>, <a href="https://simonwillison.net/tags/ai-assisted-programming">ai-assisted-programming</a></p>
</summary><category term="projects"/><category term="datasette"/><category term="annotated-release-notes"/><category term="ai-assisted-programming"/></entry><entry><title>asyncinject 0.7</title><link href="https://simonwillison.net/2026/Jun/11/asyncinject/#atom-everything" rel="alternate"/><published>2026-06-11T06:28:09+00:00</published><updated>2026-06-11T06:28:09+00:00</updated><id>https://simonwillison.net/2026/Jun/11/asyncinject/#atom-everything</id><summary type="html">
<p><strong>Release:</strong> <a href="https://github.com/simonw/asyncinject/releases/tag/0.7">asyncinject 0.7</a></p>
<p>I built this utility library to support an <code>asyncio</code> dependency injection pattern a few years ago. I was using it with Datasette and Claude Fable 5 spotted some bugs in the dependency which it then fixed for me. It's a very proactive model!</p>
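<p>For anyone unfamiliar with the idea, here is a minimal sketch of name-based dependency injection in plain <code>asyncio</code>. It does not use asyncinject's own API. The <code>resolve</code> function and the registry dictionary are illustrative assumptions, and there is no cycle detection.</p>

```python
import asyncio
import inspect


async def resolve(name, registry, cache=None):
    """Await registry[name], first resolving each of its parameters by
    looking up a registered coroutine function with the same name."""
    cache = {} if cache is None else cache
    if name not in cache:
        func = registry[name]
        param_names = list(inspect.signature(func).parameters)

        async def run():
            # Independent dependencies are awaited concurrently
            values = await asyncio.gather(
                *(resolve(p, registry, cache) for p in param_names)
            )
            return await func(*values)

        # Store the task so a shared dependency only runs once
        cache[name] = asyncio.create_task(run())
    return await cache[name]


async def a():
    return 1


async def b():
    return 2


async def total(a, b):
    return a + b


print(asyncio.run(resolve("total", {"a": a, "b": b, "total": total})))
```

<p>Here <code>total</code> declares its dependencies through its parameter names, so <code>a</code> and <code>b</code> run concurrently before <code>total</code> is called with their results.</p>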
<p>Tags: <a href="https://simonwillison.net/tags/async">async</a>, <a href="https://simonwillison.net/tags/projects">projects</a>, <a href="https://simonwillison.net/tags/python">python</a>, <a href="https://simonwillison.net/tags/claude-mythos">claude-mythos</a></p>
</summary><category term="async"/><category term="projects"/><category term="python"/><category term="claude-mythos"/></entry><entry><title>Anthropic Walks Back Policy That Could Have ‘Sabotaged’ AI Researchers Using Claude</title><link href="https://simonwillison.net/2026/Jun/11/anthropic-walks-back-policy/#atom-everything" rel="alternate"/><published>2026-06-11T03:45:49+00:00</published><updated>2026-06-11T03:45:49+00:00</updated><id>https://simonwillison.net/2026/Jun/11/anthropic-walks-back-policy/#atom-everything</id><summary type="html">
<p><strong><a href="https://www.wired.com/story/anthropic-responds-to-backlash-on-claudes-secret-sabotage-on-ai-research/">Anthropic Walks Back Policy That Could Have ‘Sabotaged’ AI Researchers Using Claude</a></strong></p>
<p>Big scoop for Maxwell Zeff at Wired:</p>
<blockquote>
<p>“We’re changing Fable 5’s safeguards for frontier LLM development to make them visible.” Anthropic said in a statement to WIRED. “We made the wrong tradeoff and we apologize for not getting the balance right.”</p>
</blockquote>
<p>There's been a <em>huge</em> outcry about Anthropic's policy, <a href="https://simonwillison.net/2026/Jun/10/if-claude-fable-stops-helping-you/">tucked away in their system card</a>, that Claude Fable/Mythos would identify "requests targeting frontier LLM development" and "limit effectiveness" without notifying the user.</p>
<p>It's good news that they're dropping the invisible aspect of this. It would be a whole lot better if they dropped this category of refusals entirely.</p>
<p><strong>Update</strong>: More details from <a href="https://twitter.com/claudedevs/status/2064949876463645026">@ClaudeDevs on Twitter</a>:</p>
<blockquote>
<p>We’re rolling out changes to make Fable 5’s safeguards for frontier LLM development visible.</p>
<p>Starting this week, flagged requests will visibly fall back to Opus 4.8—the same as our safeguards for cyber and bio. You will see this every time it happens. On the API, any flagged requests will return a reason for their refusal (coming to server-side fallback in the next few days).</p>
<p>We wanted to deploy Fable 5 to our users quickly and safely. Visible safeguards can be probed, so they have to be robust, which takes time to get right. Invisible safeguards can be targeted more narrowly, allowing us to ship quickly with very few false positives. We went with invisible safeguards for this reason—and that was the wrong tradeoff. You should have visibility into the safeguards we have in place, and why. We’re sorry for not getting the balance right.</p>
</blockquote>
<p><small>Via <a href="https://twitter.com/zeffmax/status/2064910040503627917">@zeffmax</a></small></p>
<p>Tags: <a href="https://simonwillison.net/tags/ai">ai</a>, <a href="https://simonwillison.net/tags/generative-ai">generative-ai</a>, <a href="https://simonwillison.net/tags/llms">llms</a>, <a href="https://simonwillison.net/tags/anthropic">anthropic</a>, <a href="https://simonwillison.net/tags/claude">claude</a>, <a href="https://simonwillison.net/tags/ai-ethics">ai-ethics</a>, <a href="https://simonwillison.net/tags/claude-mythos">claude-mythos</a></p>
</summary><category term="ai"/><category term="generative-ai"/><category term="llms"/><category term="anthropic"/><category term="claude"/><category term="ai-ethics"/><category term="claude-mythos"/></entry><entry><title>datasette-agent 0.2a0</title><link href="https://simonwillison.net/2026/Jun/10/datasette-agent/#atom-everything" rel="alternate"/><published>2026-06-10T23:57:27+00:00</published><updated>2026-06-10T23:57:27+00:00</updated><id>https://simonwillison.net/2026/Jun/10/datasette-agent/#atom-everything</id><summary type="html">
<p><strong>Release:</strong> <a href="https://github.com/datasette/datasette-agent/releases/tag/0.2a0">datasette-agent 0.2a0</a></p>
<p>Highlights from the release notes:</p>
<blockquote>
<ul>
<li>Tools can now ask the user questions mid-execution. Tools that declare a <code>context</code> parameter receive a <code>ToolContext</code> object, and <code>await context.ask_user(...)</code> can ask a yes/no, multiple-choice (<code>options=[...]</code>) or free-text (<code>free_text=True</code>) question. While a question is unanswered the agent turn suspends: the question renders as a form in the chat UI and persists to the internal database, so suspended conversations survive a server restart. Once answered, the tool re-executes from the top with stored answers replayed, so call <code>ask_user()</code> before performing side effects. <a href="https://github.com/datasette/datasette-agent/pull/20">#20</a></li>
<li>New built-in <code>save_query</code> tool: the agent can save SQL it has written as a <a href="https://docs.datasette.io/en/latest/sql_queries.html#saved-queries">Datasette stored query</a>. Saving always requires human approval - the agent shows the full SQL plus the proposed name, database and visibility, and nothing is stored until you click Yes. <a href="https://github.com/datasette/datasette-agent/pull/20">#20</a></li>
</ul>
</blockquote>
<p>The <code>ask_user()</code> feature was enabled by the new LLM alpha I <a href="https://simonwillison.net/2026/Jun/9/claude-fable-5/#adding-features-to-datasette-agent-and-llm-using-claude-code">built yesterday</a> with the help of Claude Fable 5.</p>
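<p>The "re-executes from the top with stored answers replayed" behavior in those release notes is why questions need to come before side effects. Here is a rough, self-contained simulation of that idea. The <code>ToolContext</code> class below is a hypothetical stand-in written for this illustration, as are <code>NeedsAnswer</code>, <code>delete_rows()</code> and <code>run_tool()</code>; none of it is the real datasette-agent code.</p>

```python
import asyncio
from collections import deque


class NeedsAnswer(Exception):
    """Raised when a tool asks a question that has no stored answer yet."""

    def __init__(self, question):
        super().__init__(question)
        self.question = question


class ToolContext:
    """Hypothetical stand-in, not the real datasette-agent class."""

    def __init__(self, stored_answers):
        self._answers = deque(stored_answers)

    async def ask_user(self, question):
        # Replay a stored answer if there is one, otherwise suspend.
        if self._answers:
            return self._answers.popleft()
        raise NeedsAnswer(question)


side_effects = []


async def delete_rows(context, table):
    # Ask first: everything above the side effect runs again on each replay.
    confirmed = await context.ask_user(f"Delete all rows from {table}?")
    if not confirmed:
        return "cancelled"
    side_effects.append(f"deleted {table}")
    return "done"


async def run_tool(stored_answers):
    context = ToolContext(stored_answers)
    try:
        return ("finished", await delete_rows(context, "logs"))
    except NeedsAnswer as pending:
        return ("suspended", pending.question)


first = asyncio.run(run_tool([]))       # no answers yet, so the turn suspends
second = asyncio.run(run_tool([True]))  # re-run from the top with the answer
```

<p>The tool body runs twice, but the side effect only happens once, because the first run stopped at the unanswered question.</p>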
<p>Tags: <a href="https://simonwillison.net/tags/ai">ai</a>, <a href="https://simonwillison.net/tags/datasette">datasette</a>, <a href="https://simonwillison.net/tags/generative-ai">generative-ai</a>, <a href="https://simonwillison.net/tags/llms">llms</a>, <a href="https://simonwillison.net/tags/datasette-agent">datasette-agent</a></p>
</summary><category term="ai"/><category term="datasette"/><category term="generative-ai"/><category term="llms"/><category term="datasette-agent"/></entry><entry><title>DiffusionGemma</title><link href="https://simonwillison.net/2026/Jun/10/diffusiongemma/#atom-everything" rel="alternate"/><published>2026-06-10T20:00:54+00:00</published><updated>2026-06-10T20:00:54+00:00</updated><id>https://simonwillison.net/2026/Jun/10/diffusiongemma/#atom-everything</id><summary type="html">
<p><strong><a href="https://blog.google/innovation-and-ai/technology/developers-tools/diffusion-gemma-faster-text-generation/">DiffusionGemma</a></strong></p>
<p>Last May Google briefly released an experimental Gemini Diffusion model. I <a href="https://simonwillison.net/2025/May/21/gemini-diffusion/">tried the preview at the time</a> and recorded it running at 857 tokens/second. It was an exciting model, but Google made no further announcements about it.</p>
<p>That research has returned in the best possible way: as a new open weight (Apache 2 licensed) Gemma model, <a href="https://huggingface.co/google/diffusiongemma-26B-A4B-it">google/diffusiongemma-26B-A4B-it</a>.</p>
<p>NVIDIA are currently <a href="https://build.nvidia.com/google/diffusiongemma-26b-a4b-it">hosting the model for free</a> on their NIM cloud API. I used that API to <a href="https://tools.simonwillison.net/markdown-svg-renderer#url=https%3A%2F%2Fgist.github.com%2Fsimonw%2Fe5e234a6dc6eef61e209ce1629620042">generate this pelican</a>, which took 4.4s (according to <code>time uv run generate.py</code>) to return 2,409 tokens - so at least 500 tokens/second.</p>
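<p>The arithmetic behind that estimate, as a quick sketch. The variable names are mine. The 4.4s figure is wall-clock time for the whole script, which covers more than generation alone, so the result is a lower bound on the generation speed:</p>

```python
# Figures quoted above: 2,409 tokens returned in 4.4 seconds of wall-clock time.
tokens = 2409
seconds = 4.4

tokens_per_second = tokens / seconds
print(f"{tokens_per_second:.1f} tokens/second")
```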
<p><img alt="Flat minimalist illustration of a white pelican with a large orange beak riding a red bicycle with black wheels, against a pale blue background with a green line representing the ground" src="https://static.simonwillison.net/static/2026/diffusiongemma-pelican.png" /></p>
<p><small>Via <a href="https://news.ycombinator.com/item?id=48478471">Hacker News</a></small></p>
<p>Tags: <a href="https://simonwillison.net/tags/google">google</a>, <a href="https://simonwillison.net/tags/ai">ai</a>, <a href="https://simonwillison.net/tags/generative-ai">generative-ai</a>, <a href="https://simonwillison.net/tags/llms">llms</a>, <a href="https://simonwillison.net/tags/nvidia">nvidia</a>, <a href="https://simonwillison.net/tags/pelican-riding-a-bicycle">pelican-riding-a-bicycle</a>, <a href="https://simonwillison.net/tags/gemma">gemma</a>, <a href="https://simonwillison.net/tags/llm-release">llm-release</a>, <a href="https://simonwillison.net/tags/llm-performance">llm-performance</a></p>
</summary><category term="google"/><category term="ai"/><category term="generative-ai"/><category term="llms"/><category term="nvidia"/><category term="pelican-riding-a-bicycle"/><category term="gemma"/><category term="llm-release"/><category term="llm-performance"/></entry><entry><title>Quoting Jeremy Howard</title><link href="https://simonwillison.net/2026/Jun/10/jeremy-howard/#atom-everything" rel="alternate"/><published>2026-06-10T15:23:34+00:00</published><updated>2026-06-10T15:23:34+00:00</updated><id>https://simonwillison.net/2026/Jun/10/jeremy-howard/#atom-everything</id><summary type="html">
<blockquote cite="https://twitter.com/jeremyphoward/status/2064595816875217362"><p>Easy solution to slow down recursive AI self improvement:</p>
<ul>
<li>The lab with the top-ranked model must agree THEY must not use it for working on frontier AI</li>
<li>But everyone else should have access to it.</li>
</ul>
<p>By definition, this means the frontier doesn't advance.</p>
<p>It also has the critical benefit of avoiding a dangerous power imbalance.</p>
<p>Anthropic has chosen the <em>opposite</em> of the safe path: they are allowing themselves, the current top lab, to use their top model for frontier AI research. They've said they'll sabotage others who try.</p>
<p>This means the AI frontier advances, &amp; power imbalance increases.</p>
<p>(To be clear, <em>I</em> don't think we should try to slow down recursive AI self improvement - I think we should open it up and democratize it as much as possible. My point is: if <em>you</em> claim we should slow down, and you have the best model, you should ensure your org can't use it.)</p></blockquote>
<p class="cite">&mdash; <a href="https://twitter.com/jeremyphoward/status/2064595816875217362">Jeremy Howard</a>, in a Twitter thread</p>
<p>Tags: <a href="https://simonwillison.net/tags/ai-ethics">ai-ethics</a>, <a href="https://simonwillison.net/tags/anthropic">anthropic</a>, <a href="https://simonwillison.net/tags/generative-ai">generative-ai</a>, <a href="https://simonwillison.net/tags/claude-mythos">claude-mythos</a>, <a href="https://simonwillison.net/tags/jeremy-howard">jeremy-howard</a>, <a href="https://simonwillison.net/tags/ai">ai</a>, <a href="https://simonwillison.net/tags/llms">llms</a></p>
</summary><category term="ai-ethics"/><category term="anthropic"/><category term="generative-ai"/><category term="claude-mythos"/><category term="jeremy-howard"/><category term="ai"/><category term="llms"/></entry><entry><title>If Claude Fable stops helping you, you'll never know</title><link href="https://simonwillison.net/2026/Jun/10/if-claude-fable-stops-helping-you/#atom-everything" rel="alternate"/><published>2026-06-10T00:37:25+00:00</published><updated>2026-06-10T00:37:25+00:00</updated><id>https://simonwillison.net/2026/Jun/10/if-claude-fable-stops-helping-you/#atom-everything</id><summary type="html">
<p><strong><a href="https://jonready.com/blog/posts/claude-fable5-is-allowed-to-sabotage-your-app-if-youre-a-competitor.html">If Claude Fable stops helping you, you&#x27;ll never know</a></strong></p>
<p>Jonathon Ready highlights one of the more eyebrow-raising details from the <a href="https://www-cdn.anthropic.com/d00db56fa754a1b115b6dd7cb2e3c342ee809620.pdf">319 page system card</a> for Fable 5 and Mythos 5. Here's a longer excerpt, highlights mine:</p>
<blockquote>
<p>In light of the ability of recent models to <a href="https://www.anthropic.com/institute/recursive-self-improvement">accelerate their own development</a>, we’ve <strong>implemented new interventions</strong> that limit Claude’s effectiveness for requests targeting frontier LLM development (for example, on <strong>building pretraining pipelines, distributed training infrastructure, or ML accelerator design</strong>). Using Claude to develop competing models already violates our <a href="https://www.anthropic.com/legal/consumer-terms">Terms of Service</a>, but enforcing this restriction through our safeguards avoids accelerating the actors most willing to violate these terms.</p>
<p>Unlike our interventions for cybersecurity, biology and chemistry, and distillation attempts, <strong>these safeguards will not be visible to the user</strong>. Fable 5 will not fall back to a different model. Instead, the safeguards will limit effectiveness through methods such as prompt modification, steering vectors, or parameter-efficient fine-tuning (PEFT). These interventions will not affect the vast majority of coding work. We estimate they will impact ~0.03% of traffic, concentrated in fewer than 0.1% of organizations.</p>
</blockquote>
<p>I believe this is the first time Anthropic have announced these kinds of silent interventions. The justification still feels pretty science-fiction to me - the linked article talks about "recursive self-improvement". I'm not at all keen on a model that silently corrupts its replies to questions about "ML accelerator design" purely to slow down research that might conflict with Anthropic's own goals!</p>
<p><strong>Update</strong>: Anthropic <a href="https://simonwillison.net/2026/Jun/11/anthropic-walks-back-policy/">walked back this policy</a> in the face of widespread outrage from the research community.</p>
<p><small>Via <a href="https://news.ycombinator.com/item?id=48467896">Hacker News</a></small></p>
<p>Tags: <a href="https://simonwillison.net/tags/ai">ai</a>, <a href="https://simonwillison.net/tags/generative-ai">generative-ai</a>, <a href="https://simonwillison.net/tags/llms">llms</a>, <a href="https://simonwillison.net/tags/anthropic">anthropic</a>, <a href="https://simonwillison.net/tags/claude">claude</a>, <a href="https://simonwillison.net/tags/ai-ethics">ai-ethics</a>, <a href="https://simonwillison.net/tags/claude-mythos">claude-mythos</a></p>
</summary><category term="ai"/><category term="generative-ai"/><category term="llms"/><category term="anthropic"/><category term="claude"/><category term="ai-ethics"/><category term="claude-mythos"/></entry><entry><title>Initial impressions of Claude Fable 5</title><link href="https://simonwillison.net/2026/Jun/9/claude-fable-5/#atom-everything" rel="alternate"/><published>2026-06-09T23:59:54+00:00</published><updated>2026-06-09T23:59:54+00:00</updated><id>https://simonwillison.net/2026/Jun/9/claude-fable-5/#atom-everything</id><summary type="html">
<p>I didn't have early access to today's <a href="https://www.anthropic.com/news/claude-fable-5-mythos-5">Claude Fable 5</a> release, but I've spent the past ~5.5 hours putting it through its paces. My initial impressions are that this is something of a <em>beast</em>. It's slow, expensive and has been quite happily churning through everything I've thrown at it so far. As is frequently the case with current frontier models the challenge is finding tasks that it can't do.</p>
<p>First, let's review the key characteristics.</p>
<p>Anthropic claim that <a href="https://www.anthropic.com/news/claude-fable-5-mythos-5">Claude Fable 5</a> offers the same performance as Claude Mythos 5, except with much stricter guardrails in place to prevent it being used for harmful things. Those guardrails trigger often enough that the Claude API has new mechanisms for letting you know when you hit them, and even has a <a href="https://platform.claude.com/docs/en/build-with-claude/refusals-and-fallback">new option</a> to request that it fall back to another model automatically if something gets rejected.</p>
<p>Claude Mythos 5 is out today as well, <a href="https://platform.claude.com/docs/en/about-claude/models/introducing-claude-fable-5-and-claude-mythos-5">Anthropic say it</a> "Shares Claude Fable 5's capabilities without the safety classifiers".</p>
<p>The models have a 1 million token context window, 128,000 maximum output tokens and a knowledge cut-off date of January 2026.</p>
<p>They cost twice as much as Claude Opus 4.5/4.6/4.7/4.8: $10/million input tokens and $50/million output tokens. There's no increase in price for longer context usage.</p>
<p>Other than that the <a href="https://platform.claude.com/docs/en/about-claude/models/introducing-claude-fable-5-and-claude-mythos-5">upgrade guide</a> is substantially thinner than <a href="https://platform.claude.com/docs/en/about-claude/models/whats-new-claude-4-8">the similar guide for Opus 4.8</a>.</p>
<h4 id="the-big-model-smell">The big model smell</h4>
<p>The best way to describe Fable is that it feels <em>big</em>. Not just in terms of speed and cost, but also in how much it knows.</p>
<p>Here's an example prompt, which I used to compare Fable's knowledge with Opus 4.8 (typo included):</p>
<blockquote>
<p><code>List all of Simon Willion's open source projects, most recent first, each with a rough date of when they were first released</code></p>
</blockquote>
<p>Opus 4.8's reply <a href="https://gist.github.com/simonw/e77abf645156ade8c5c610f524226dcb#response-1">started like this</a>:</p>
<blockquote>
<p>I don't have a reliable, comprehensive, and date-verified list of all of Simon Willison's open source projects, and I want to be honest about that rather than risk giving you inaccurate dates or fabricated entries.</p>
<p>Here's what I can tell you with reasonable confidence:</p>
<p><strong>Well-known projects by Simon Willison:</strong></p>
<ul>
<li>
<p><strong>LLM</strong> (~2023) – A command-line tool and Python library for interacting with large language models.</p>
</li>
<li>
<p><strong>Datasette</strong> (~2017) – A tool for exploring and publishing data, probably his most prominent recent project.</p>
</li>
<li>
<p><strong>sqlite-utils</strong> (~2018-2019) – A Python library and CLI tool for manipulating SQLite databases, often used alongside Datasette.</p>
</li>
<li>
<p><strong>Django</strong> (2005) – He is one of the co-creators of the Django web framework, along with Adrian Holovaty, originally developed at the Lawrence Journal-World newspaper.</p>
</li>
</ul>
<p>He also maintains a large ecosystem of <strong>Datasette plugins</strong> and many smaller utilities (often with names like <code>shot-scraper</code>, various <code>*-to-sqlite</code> tools for importing data, etc.).</p>
</blockquote>
<p>That's not bad. Here's <a href="https://gist.github.com/simonw/e77abf645156ade8c5c610f524226dcb#response">the start of the response</a> for the exact same prompt from Fable 5 - both models were running without access to search:</p>
<blockquote>
<p>I think you mean <strong>Simon Willison</strong> — the prolific open source developer, co-creator of Django, and creator of Datasette.</p>
<p>An important caveat upfront: a truly complete list isn't feasible. Simon Willison is famously prolific — he has published <strong>hundreds</strong> of repositories (Datasette alone has 100+ plugins, many written by him), and my knowledge has a cutoff date, so I'll be missing his most recent work. Below are his <strong>notable</strong> projects with approximate initial release dates, most recent first:</p>
<h2 id="notable-projects-approximate-dates-">Notable projects (approximate dates)</h2>
<ul>
<li>
<strong>files-to-prompt</strong> (April 2024) – concatenates files into a single prompt for LLMs</li>
<li>
<strong>datasette-extract</strong> (2024) – plugin for extracting structured data using LLMs</li>
<li>
<strong>LLM</strong> (May–June 2023) – CLI tool and Python library for interacting with large language models, plus a large ecosystem of plugins (llm-gpt4all, llm-mistral, llm-claude, etc., 2023 onward)</li>
<li>
<strong>symbex</strong> (June 2023) – search Python code for symbols</li>
<li>
<strong>ttok</strong> and <strong>strip-tags</strong> (May 2023) – CLI tools for token counting and HTML cleanup for LLM pipelines</li>
<li>
<strong>datasette-lite</strong> (May 2022) – Datasette running entirely in the browser via WebAssembly/Pyodide</li>
<li>
<strong>shot-scraper</strong> (March 2022) – automated website screenshots via Playwright</li>
<li>
<strong>s3-credentials</strong> (November 2021) – CLI for creating scoped AWS S3 credentials</li>
<li>
<strong>django-sql-dashboard</strong> (2021) – SQL reporting dashboards for Django</li>
<li>
<strong>Dogsheep suite</strong> (2019) – personal analytics tools: twitter-to-sqlite, github-to-sqlite, healthkit-to-sqlite, dogsheep-beta, etc.</li>
<li>
<strong>sqlite-utils</strong> (2018) – CLI and Python library for manipulating SQLite databases</li>
<li>
<strong>Datasette</strong> (November 2017) – his flagship project; tool for exploring and publishing data</li>
<li>
<strong>csvs-to-sqlite</strong> (2017) – convert CSV files to SQLite</li>
<li>
<strong>Various early tools</strong> (~2007–2010) – soupselect, json-head, geocoders, and others</li>
<li>
<strong>Django</strong> (developed 2003–2005, open-sourced July 2005) – co-created with Adrian Holovaty at the Lawrence Journal-World</li>
</ul>
</blockquote>
<p>(Here's <a href="https://gist.github.com/simonw/8d05ce836d44bd8543055c9614f7c478">GPT-5.5 for good measure</a>. It listed even more projects than Fable did!)</p>
<p>In the past I've stated that I don't care about how much models <em>know</em> - I want them to be able to manipulate text and code in useful ways and actively look up the information they need via search tools, not bake it into their weights.</p>
<p>But knowledge like this is a reasonably good proxy for model size - you can cram a whole lot more details about the world into a larger number of parameters.</p>
<p>Does <em>knowing more stuff</em> mean the model is better at the tasks we pose to it? I can certainly imagine how a coding model with deeper knowledge of modern libraries and patterns could crunch through coding tasks more effectively.</p>
<p>Is Fable really bigger than Opus? Anthropic haven't said anything about model size, so all we have are tea-leaves, but the speed, pricing and my own poking at its knowledge make me think that it's a large model. Maybe the largest yet from any vendor.</p>
<h4 id="using-fable-in-claude-ai">Using Fable in Claude.ai</h4>
<p>Anthropic made Fable 5 available across all of their surfaces - the <a href="https://claude.ai/">Claude.ai</a> chat interface, Claude Code for web, Claude Code CLI and Claude Cowork as well. The model is available "until June 22nd" on the subscription plans (I'm on $100/month Max at the moment), after which it will be billed extra.</p>
<p>Claude.ai is often under-estimated. Since <a href="https://simonwillison.net/2025/Sep/9/claude-code-interpreter/">September 2025</a> every chat has had access to a full container environment to run code, including the ability to install additional packages and even clone repositories directly from GitHub.</p>
<p>Last week <a href="https://simonwillison.net/2026/Jun/6/micropython-in-a-sandbox/">I released micropython-wasm</a>, a Python library that uses <a href="https://wasmtime.dev/">wasmtime</a> to run a custom build of MicroPython in WebAssembly to act as a sandbox for untrusted Python code.</p>
<p>I decided to see if Fable could upgrade that to running full Python instead. I started with this prompt:</p>
<blockquote>
<p><code>Clone simonw/micropython-wasm from GitHub and research how this could use a full Python as opposed to MicroPython</code></p>
</blockquote>
<p>Fable identified that it could use Brett Cannon's <a href="https://github.com/brettcannon/cpython-wasi-build/releases">cpython-wasi-build</a> builds for this, but was unable to download them itself due to environment restrictions.</p>
<p>So I grabbed the two zip files from that page and uploaded them to Claude:</p>
<blockquote>
<p><code>Here's the Brett Cannon builds</code> (<code>python-3.zip</code>, <code>_build-python-3.zip</code> as attachments)</p>
</blockquote>
<p>And that was that. It churned away for a few minutes and got the entire thing working.</p>
<p>Part of the response included:</p>
<blockquote>
<p>I tried the cleaner single-zip-stdlib approach to shrink the filesystem surface, but CPython's <code>getpath</code> bootstrap fails to find <code>encodings</code> from inside a zip without more prefix finessing — the directory-preopen approach works reliably, so that's what the PoC uses. The zip path is solvable but needs <code>_PYTHONHOME</code>/frozen-getpath work.</p>
</blockquote>
<p>So I said:</p>
<blockquote>
<p><code>Try a bit more at the single-zip-stdlib problem</code></p>
</blockquote>
<p>Then a little later:</p>
<blockquote>
<p><code>I want a wheel that has the whole system in it, the Python wrappers and the WASM files and the stdlibrary, so I can do uv run --with path-to-whl python -c "demo code"</code></p>
</blockquote>
<p>... and it gave me <a href="https://static.simonwillison.net/static/cors-allow/2026/cpython_wasm-0.1.0-py3-none-any.whl">this 13.9MB cpython_wasm-0.1.0-py3-none-any.whl</a> file. You can try running Python code in a sandbox using that wheel URL and <code>uv</code> like this:</p>
<div class="highlight highlight-source-shell"><pre>uv run --with https://static.simonwillison.net/static/cors-allow/2026/cpython_wasm-0.1.0-py3-none-any.whl \
cpython-wasm -c <span class="pl-s"><span class="pl-pds">'</span>print(45 ** 56)<span class="pl-pds">'</span></span></pre></div>
<p>Here's <a href="https://claude.ai/share/a73b8b8b-8ebc-4fef-9e5c-7438e5e7ae35">the full chat transcript</a>.</p>
<p>This was a <em>very</em> strong start.</p>
<h4 id="adding-features-to-datasette-agent-and-llm-using-claude-code">Adding features to Datasette Agent and LLM using Claude Code</h4>
<p>Before I'd realized it was Fable day, my stretch goal for today was to add a new feature to <a href="https://agent.datasette.io/">Datasette Agent</a>: I wanted tool calls within that agent software to gain the ability to pause mid-execution and request approval directly from the user.</p>
<p>This felt like a suitably meaty task to throw at the new model.</p>
<p>Over the course of the day Fable not only <a href="https://github.com/datasette/datasette-agent/pull/20">solved that problem</a>, it also identified four changes to my underlying LLM library that would help support this kind of advanced pause-resume mechanism in tool calls, and then implemented them.</p>
<p>It got everything working first using somewhat gnarly hacks, but the moment I told it that changes to LLM itself were in scope it set to work unraveling the hacks and turning them into supported features of LLM instead.</p>
<p>My stretch goal turned into <a href="https://llm.datasette.io/en/latest/changelog.html#a3-2026-06-09">LLM 0.32a3</a>, almost entirely written by Fable. Here are the release notes:</p>
<blockquote>
<p>Driven by the needs of <a href="https://github.com/datasette/datasette-agent">Datasette Agent</a>'s human-in-the-loop <code>ask_user()</code> feature, made the following improvements to how tool calls work:</p>
<ul>
<li>Tool implementations can declare a parameter named <code>llm_tool_call</code> in order to be passed the <code>llm.ToolCall</code> object for the current invocation. This allows them to access the current <code>llm_tool_call.tool_call_id</code>. See <a href="https://llm.datasette.io/en/latest/python-api.html#python-api-tools-llm-tool-call">Accessing the tool call from inside a tool</a>. <a href="https://github.com/simonw/llm/pull/1480">#1480</a>
</li>
<li>Every tool call is now guaranteed a unique <code>tool_call_id</code> - providers that do not supply one get a synthesized <code>tc_</code>-prefixed ULID. <a href="https://github.com/simonw/llm/pull/1481">#1481</a>
</li>
<li>Tools can raise a <code>llm.PauseChain</code> exception to cleanly pause the tool chain, useful for things like waiting for human approval. The exception propagates to the caller with <code>.tool_call</code> and <code>.tool_results</code> (completed sibling results) attached, and no model call is made with a placeholder result. See <a href="https://llm.datasette.io/en/latest/python-api.html#python-api-tools-pause">Pausing a chain from inside a tool</a>. <a href="https://github.com/simonw/llm/pull/1482">#1482</a>
</li>
<li>Failure semantics for concurrent tool execution: async sibling tool calls always run to completion before a pause or hook exception propagates. <a href="https://github.com/simonw/llm/pull/1482">#1482</a>
</li>
<li>Chains can now resume from a <code>messages=</code> history ending in unresolved tool calls: the calls are executed through the normal <code>before_call</code>/<code>after_call</code> machinery before the first model call, skipping any that already have results. The <code>execute_tool_calls()</code> method also accepts a new optional <code>tool_calls_list=</code> argument for executing an explicit list of <code>ToolCall</code> objects in place of the calls requested by the response. See <a href="https://llm.datasette.io/en/latest/python-api.html#python-api-tools-resume">Resuming a chain with pending tool calls</a>. <a href="https://github.com/simonw/llm/pull/1482">#1482</a>
</li>
<li>Fixed a bug where the async tool executor silently dropped calls to tools not present in <code>tools=</code> - these now return <code>Error: tool "..." does not exist</code> results, matching the sync executor. <a href="https://github.com/simonw/llm/pull/1483">#1483</a>
</li>
</ul>
</blockquote>
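<p>The "sibling tool calls always run to completion before a pause propagates" rule in those notes can be approximated with <code>asyncio.gather(..., return_exceptions=True)</code>. This is only a sketch of the general technique, not how LLM implements it: the <code>PauseChain</code> class here is a simplified local stand-in (it carries a <code>tool_call_id</code> string rather than a full tool call object), and <code>execute_siblings()</code> is a hypothetical helper.</p>

```python
import asyncio


class PauseChain(Exception):
    """Simplified local stand-in for illustration, not imported from llm."""

    def __init__(self, tool_call_id):
        super().__init__(tool_call_id)
        self.tool_call_id = tool_call_id
        self.tool_results = {}


async def slow_lookup():
    await asyncio.sleep(0.01)
    return "42 rows"


async def needs_approval():
    raise PauseChain("tc_1")


async def execute_siblings(calls):
    # return_exceptions=True lets every sibling finish instead of
    # surfacing the first exception while the others are still running.
    outcomes = await asyncio.gather(
        *(function() for function in calls.values()), return_exceptions=True
    )
    results, pause = {}, None
    for call_id, outcome in zip(calls, outcomes):
        if isinstance(outcome, PauseChain):
            if pause is None:
                pause = outcome
        elif isinstance(outcome, Exception):
            raise outcome
        else:
            results[call_id] = outcome
    if pause is not None:
        # Attach the completed sibling results before propagating the pause.
        pause.tool_results = results
        raise pause
    return results


async def main():
    try:
        await execute_siblings({"tc_1": needs_approval, "tc_2": slow_lookup})
    except PauseChain as paused:
        return paused.tool_call_id, paused.tool_results


caught = asyncio.run(main())
```

<p>The pausing tool raises immediately, but the caller still receives the result from the slower sibling alongside the pause.</p>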
<p>I'm really impressed with the quality of API design, tests, code and documentation that Fable put together for this. I spent several hours on it today, but it feels like several days' worth of work.</p>
<h4 id="how-much-i-ve-spent">How much I've spent</h4>
<p>I recently started using <a href="https://agentsview.io">AgentsView</a> to help track my local LLM usage across all of the different coding agents. I published a <a href="https://til.simonwillison.net/llms/agentsview-custom-model-price">TIL today</a> about adding custom Fable pricing to that tool, which I expect will not be necessary in the very near future.</p>
<p>After setting the price, I ran this command to start a localhost web server to explore my usage:</p>
<pre><code>uvx agentsview serve
</code></pre>
<p>Here's the treemap showing the breakdown of my Fable usage across various projects today:</p>
<p><img src="https://static.simonwillison.net/static/2026/agentsview-fable-full-day.jpg" alt="Screenshot of a cost tracking dashboard with two panels. The first panel is titled &quot;Cost Attribution&quot; with toggle buttons for Project / Model / Agent and Treemap / List, with Project and Treemap selected. Italic text reads &quot;Click to hide from chart&quot;. A treemap shows a large red block labeled prod_datasette_agent $99.26 89.9%, with smaller blocks to its right labeled cloud (blue), datasette (teal), llm (red), and money (pink), plus a tiny orange sliver. A legend lists: 1 prod_datasette_agent $99.26, 2 cloud $3.98, 3 datasette $2.81, 4 llm $2.30, 5 money $1.92, 6 simon $0.15. The second panel is titled &quot;Top Sessions by Cost&quot; and lists nine sessions, each with a &quot;Claude&quot; badge, a prompt excerpt, a project name with a session UUID (omitted here), a token count, and a cost: 1. Review ./datasette-agent and ./datasette-apps - we are going to add a new feature to agent but you ... prod_datasette_agent, 78.2M, $99.26. 2. issues.db is a copy of the Datasette issues database. There are a LOT of notes in there relating to... datasette, 826.8k, $2.81. 3. Consult fly-docs and then look at datasette.cloud (which launches fly machines) and datasettecloud-... cloud, 924.7k, $2.61. 4. simonwillisonblog.db is a copy of my blog, plus all my software releases and other interesting thin... money, 542.9k, $1.92. 5. Look in datasette.cloud and figure out all remaining steps and decisions that need to be made in or... cloud, 455k, $1.37. 6. Review PRs and issues filed against this repo within the last 4 weeks and see if any deserve to be ... llm, 323.3k, $0.95. 7. run mypy, llm, 320.9k, $0.76. 8. [Image #1] fix this in github actions, llm, 183.9k, $0.59. 9. simon, simon, 26.4k, $0.15." style="max-width: 100%;" /></p>
<p>I used $110.42 worth of tokens today, all as part of my $100/month subscription.</p>
<h4 id="and-some-pelicans">And some pelicans</h4>
<p>I ran "Generate an SVG of a pelican riding a bicycle" against all five thinking effort levels with Fable.</p>
<p>Here are <a href="https://tools.simonwillison.net/markdown-svg-renderer#url=https%3A%2F%2Fgist.github.com%2Fsimonw%2F94fde31c34a0400c1d29f57e6a708e6b">the results</a>, including the token cost for each one:</p>
<div style="display: flex; flex-wrap: wrap; gap: 10px; margin-bottom: 1em">
<figure style="margin: 0; flex: 1 1 30%;">
<img src="https://static.simonwillison.net/static/2026/fable-low.jpg" alt="low" style="width: 100%; height: auto;" />
<figcaption style="text-align: center;">low: <a href="https://www.llm-prices.com/#it=25&amp;ot=1929&amp;sel=claude-fable-5">1,929 out, 9.67c</a></figcaption>
</figure>
<figure style="margin: 0; flex: 1 1 30%;">
<img src="https://static.simonwillison.net/static/2026/fable-medium.jpg" alt="medium" style="width: 100%; height: auto;" />
<figcaption style="text-align: center;">medium: <a href="https://www.llm-prices.com/#it=25&amp;ot=2290&amp;sel=claude-fable-5">2,290 out, 11.475c</a></figcaption>
</figure>
<figure style="margin: 0; flex: 1 1 30%;">
<img src="https://static.simonwillison.net/static/2026/fable-high.jpg" alt="high" style="width: 100%; height: auto;" />
<figcaption style="text-align: center;">high: <a href="https://www.llm-prices.com/#it=25&amp;ot=2057&amp;sel=claude-fable-5">2,057 out, 10.31c</a></figcaption>
</figure>
<figure style="margin: 0; flex: 1 1 45%;">
<img src="https://static.simonwillison.net/static/2026/fable-xhigh.jpg" alt="xhigh" style="width: 100%; height: auto;" />
<figcaption style="text-align: center;">xhigh: <a href="https://www.llm-prices.com/#it=25&amp;ot=5992&amp;sel=claude-fable-5">5,992 out, 29.985c</a></figcaption>
</figure>
<figure style="margin: 0; flex: 1 1 45%;">
<img src="https://static.simonwillison.net/static/2026/fable-max.jpg" alt="max" style="width: 100%; height: auto;" />
<figcaption style="text-align: center;">max: <a href="https://www.llm-prices.com/#it=25&amp;ot=14430&amp;sel=claude-fable-5">14,430 out, 72.175c</a></figcaption>
</figure>
</div>
<p>It's interesting that high ended up using fewer tokens than medium for this particular run.</p>
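<p>The costs in those captions follow a simple linear formula. Here's a minimal sketch of the arithmetic, assuming $10 per million input tokens and $50 per million output tokens - those two rates are inferred from the figures in the captions rather than quoted from a price list, and <code>cost_cents()</code> is a hypothetical helper name:</p>

```python
# Assumed prices in dollars per million tokens, inferred from the captions above.
INPUT_PRICE = 10
OUTPUT_PRICE = 50


def cost_cents(input_tokens, output_tokens):
    """Return the cost of a single call in cents."""
    # tokens * (dollars per million tokens) gives millionths of a dollar;
    # dividing by 10,000 converts that to cents.
    return (input_tokens * INPUT_PRICE + output_tokens * OUTPUT_PRICE) / 10_000


# Output token counts for the five runs; each prompt was 25 input tokens.
runs = {"low": 1929, "medium": 2290, "high": 2057, "xhigh": 5992, "max": 14430}
for effort, output_tokens in runs.items():
    print(effort, cost_cents(25, output_tokens))
```

<p>Output tokens dominate: the 25 input tokens contribute just 0.025c to each total.</p>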
<p>Here are the <a href="https://simonwillison.net/2026/May/28/claude-opus-4-8/#and-some-pelicans">Opus 4.8 pelicans</a> for comparison.</p>
<p>Tags: <a href="https://simonwillison.net/tags/ai">ai</a>, <a href="https://simonwillison.net/tags/generative-ai">generative-ai</a>, <a href="https://simonwillison.net/tags/llms">llms</a>, <a href="https://simonwillison.net/tags/anthropic">anthropic</a>, <a href="https://simonwillison.net/tags/claude">claude</a>, <a href="https://simonwillison.net/tags/llm-pricing">llm-pricing</a>, <a href="https://simonwillison.net/tags/pelican-riding-a-bicycle">pelican-riding-a-bicycle</a>, <a href="https://simonwillison.net/tags/llm-release">llm-release</a>, <a href="https://simonwillison.net/tags/claude-mythos">claude-mythos</a></p>
</summary><category term="ai"/><category term="generative-ai"/><category term="llms"/><category term="anthropic"/><category term="claude"/><category term="llm-pricing"/><category term="pelican-riding-a-bicycle"/><category term="llm-release"/><category term="claude-mythos"/></entry><entry><title>llm 0.32a3</title><link href="https://simonwillison.net/2026/Jun/9/llm/#atom-everything" rel="alternate"/><published>2026-06-09T22:27:03+00:00</published><updated>2026-06-09T22:27:03+00:00</updated><id>https://simonwillison.net/2026/Jun/9/llm/#atom-everything</id><summary type="html">
<p><strong>Release:</strong> <a href="https://github.com/simonw/llm/releases/tag/0.32a3">llm 0.32a3</a></p>
<p>Almost entirely written by the new Claude Fable 5, see <a href="https://simonwillison.net/2026/Jun/9/claude-fable-5/#adding-features-to-datasette-agent-and-llm-using-claude-code">my write-up for more details</a>.</p>
<p>Tags: <a href="https://simonwillison.net/tags/projects">projects</a>, <a href="https://simonwillison.net/tags/ai">ai</a>, <a href="https://simonwillison.net/tags/generative-ai">generative-ai</a>, <a href="https://simonwillison.net/tags/llms">llms</a>, <a href="https://simonwillison.net/tags/llm">llm</a>, <a href="https://simonwillison.net/tags/claude-mythos">claude-mythos</a></p>
</summary><category term="projects"/><category term="ai"/><category term="generative-ai"/><category term="llms"/><category term="llm"/><category term="claude-mythos"/></entry><entry><title>Setting a custom price for a model in AgentsView</title><link href="https://simonwillison.net/2026/Jun/9/agentsview-custom-model-price/#atom-everything" rel="alternate"/><published>2026-06-09T21:35:31+00:00</published><updated>2026-06-09T21:35:31+00:00</updated><id>https://simonwillison.net/2026/Jun/9/agentsview-custom-model-price/#atom-everything</id><summary type="html">
<p><strong>TIL:</strong> <a href="https://til.simonwillison.net/llms/agentsview-custom-model-price">Setting a custom price for a model in AgentsView</a></p>
<p>I've been really enjoying <a href="https://agentsview.io/">AgentsView</a> by Wes McKinney as a tool for exploring my token usage across different coding agents running on my laptop.</p>
<p>Claude Fable 5 came out today and wasn't yet included in the pricing database AgentsView uses. I used Fable to reverse-engineer AgentsView and figured out this recipe for setting custom prices.</p>
<p>Here's my Claude Fable 5 usage for today so far, plotted by AgentsView as a treemap across my different local projects:</p>
<p><img alt="Screenshot of a cost analytics dashboard. Cost Attribution - Click to hide from chart - toggle buttons for Project / Model / Agent and Treemap / List. A treemap shows a large red block: prod_datasette_agent $74.06 89.3%, then blue: cloud $3.98 4.8%, teal: datasette $2.81 3.4%, pink: money $1.92 2.3%, and a thin orange sliver. A legend lists 1 prod_datasette_agent $74.06, 2 cloud $3.98, 3 datasette $2.81, 4 money $1.92, 5 simon $0.15. Below left, Top Sessions by Cost: 1 Claude - Review ./datasette-agent and ./datasette-apps - we are going to a... - prod_datasette_agent · 08a1f374-0e77-420f-be2d-af805d67e8aa - 55.9M $74.06; 2 Claude - issues.db is a copy of the Datasette issues database. There are a... - datasette · 8caa2d2d-b91f-43b3-bf3a-4268995b6011 - 826.8k $2.81; 3 Claude - Consult fly-docs and then look at datasette.cloud (which launche... - cloud · bfcacc70-09d7-4b27-aaec-4bb8accd9fec - 924.7k $2.61; 4 Claude - simonwillisonblog.db is a copy of my blog, plus all my software re... - money · 0c0fb9dc-6347-4e1b-9307-3709a7cdf0c8 - 542.9k $1.92; 5 Claude - Look in datasette.cloud and figure out all remaining steps and dec... - cloud · 45963b5f-608a-4caa-ad6b-6ae81e1dbf0d - 455k $1.37; 6 Claude - simon - simon · deeccb5d-9e90-4b1e-bfe6-c2b271e1b1d4 - 26.4k $0.15. Below right, Cache Efficiency with horizontal bars: Cache Reads 57.6M (nearly full green bar), Cache Writes 769.3K, Uncached Input 64.4K, Output 300.9K (all tiny bars), and a green highlighted note: $516.62 saved vs uncached." src="https://static.simonwillison.net/static/2026/agentsview-fable.jpg" /></p>
<p>Tags: <a href="https://simonwillison.net/tags/ai">ai</a>, <a href="https://simonwillison.net/tags/generative-ai">generative-ai</a>, <a href="https://simonwillison.net/tags/llms">llms</a>, <a href="https://simonwillison.net/tags/llm-pricing">llm-pricing</a>, <a href="https://simonwillison.net/tags/claude-mythos">claude-mythos</a></p>
</summary><category term="ai"/><category term="generative-ai"/><category term="llms"/><category term="llm-pricing"/><category term="claude-mythos"/></entry><entry><title>Quoting Andrej Karpathy</title><link href="https://simonwillison.net/2026/Jun/9/andrej-karpathy/#atom-everything" rel="alternate"/><published>2026-06-09T19:03:10+00:00</published><updated>2026-06-09T19:03:10+00:00</updated><id>https://simonwillison.net/2026/Jun/9/andrej-karpathy/#atom-everything</id><summary type="html">
<blockquote cite="https://twitter.com/karpathy/status/2064409694761054332"><p>I feel a lot of things changing as working software increasingly comes out on a tap. The Jevon's paradox kicks in and I feel my own demand for software growing substantially. You can ask for anything - explainers, visualizers, dashboards, bespoke single-use apps (e.g. a full wandb that is hyper-specific just for your project), you can 10X your test suite, auto-optimize code, run giant research projects with custom HTML for the results, anything! "Free your mind" (Matrix ref).</p></blockquote>
<p class="cite">&mdash; <a href="https://twitter.com/karpathy/status/2064409694761054332">Andrej Karpathy</a>, on Claude Fable 5</p>
<p>Tags: <a href="https://simonwillison.net/tags/andrej-karpathy">andrej-karpathy</a>, <a href="https://simonwillison.net/tags/jevons-paradox">jevons-paradox</a>, <a href="https://simonwillison.net/tags/anthropic">anthropic</a>, <a href="https://simonwillison.net/tags/generative-ai">generative-ai</a>, <a href="https://simonwillison.net/tags/ai">ai</a>, <a href="https://simonwillison.net/tags/llms">llms</a>, <a href="https://simonwillison.net/tags/claude-mythos">claude-mythos</a></p>
</summary><category term="andrej-karpathy"/><category term="jevons-paradox"/><category term="anthropic"/><category term="generative-ai"/><category term="ai"/><category term="llms"/><category term="claude-mythos"/></entry><entry><title>Siri AI at WWDC 2026</title><link href="https://simonwillison.net/2026/Jun/8/wwdc/#atom-everything" rel="alternate"/><published>2026-06-08T23:58:04+00:00</published><updated>2026-06-08T23:58:04+00:00</updated><id>https://simonwillison.net/2026/Jun/8/wwdc/#atom-everything</id><summary type="html">
<p>Anyone who took Apple's <a href="https://simonwillison.net/2024/Jun/10/apple-intelligence/">2024 WWDC Apple Intelligence announcements</a> at face value got badly burned, so I'm holding to a strict "I'll believe it when I see it" policy for everything <a href="https://www.apple.com/newsroom/2026/06/apple-unveils-next-generation-of-apple-intelligence-siri-ai-and-more/">they announced today</a>.</p>
<p>The new Siri AI features do at least look feasible with today's technology, especially since Apple are licensing a custom Gemini-derived model that they can run on their own <a href="https://simonwillison.net/2024/Jun/11/private-cloud-compute/">Private Cloud Compute</a>.</p>
<p>It sounds like they'll be taking advantage of vision LLMs to extract information from the user's screen, which neatly sidesteps the need for every existing application to ship custom code in order to integrate with Apple Intelligence. Vision LLMs were a much less mature category in June 2024.</p>
<p>The new Core AI library looks like a good step toward enabling developers to finally take full advantage of Apple's hardware for running their own models. It integrates with Meta's open source PyTorch ecosystem, using these <a href="https://apple.github.io/coreai-torch/main/">Core AI PyTorch extensions</a>:</p>
<blockquote>
<p>Core AI PyTorch Extensions (<code>coreai-torch</code>) is a Python package that bridges PyTorch and Core AI. You can use it to bring up an existing PyTorch model — exported as a <code>torch.export.ExportedProgram</code> — into a Core AI <code>AIProgram</code> ready to run on Apple hardware, traversing the FX graph node-by-node and mapping ATen operators to Core AI operations.</p>
</blockquote>
<p>You can install an iOS 27 Developer Beta today, which supposedly has the new features - but you then have to make it through a waiting list for access to the new Siri AI. Aaron Perris from MacRumors reports having <a href="https://twitter.com/aaronp613/status/2064078063814471977">made it off the waitlist</a> so we may start seeing credible reports on how well Siri AI works in the very near future.</p>
<p><strong>Update</strong>: These Private Cloud Compute Gemini models are running in Google Cloud, and using NVIDIA hardware. According to <a href="https://security.apple.com/blog/expanding-pcc/?linkId=100000425571569">Expanding Private Cloud Compute</a> on Apple's Security Research blog:</p>
<blockquote>
<p>For the most demanding tasks, including agentic tool-use and complex reasoning, we worked with Google and NVIDIA to extend our PCC infrastructure to Google Cloud systems using NVIDIA GPUs, while maintaining Apple's powerful security and privacy protections. [...]</p>
<p>PCC on Google Cloud leverages many of the same architectural security patterns as PCC on Apple silicon to implement these layered protections: initial network data parsing for each request happens in a dedicated process within its own namespace, shared inference software is recycled with a short time-to-live duration, and attested keys are held in a separate, dedicated confidential VM isolated from external inputs. [...]</p>
<p>As with PCC on Apple silicon, all binaries will be published for public inspection.</p>
</blockquote>
<p>Tags: <a href="https://simonwillison.net/tags/vision-llms">vision-llms</a>, <a href="https://simonwillison.net/tags/apple">apple</a>, <a href="https://simonwillison.net/tags/generative-ai">generative-ai</a>, <a href="https://simonwillison.net/tags/ai">ai</a>, <a href="https://simonwillison.net/tags/llms">llms</a>, <a href="https://simonwillison.net/tags/gemini">gemini</a>, <a href="https://simonwillison.net/tags/nvidia">nvidia</a>, <a href="https://simonwillison.net/tags/google">google</a></p>
</summary><category term="vision-llms"/><category term="apple"/><category term="generative-ai"/><category term="ai"/><category term="llms"/><category term="gemini"/><category term="nvidia"/><category term="google"/></entry><entry><title>datasette-agent-edit 0.1a0</title><link href="https://simonwillison.net/2026/Jun/7/datasette-agent-edit/#atom-everything" rel="alternate"/><published>2026-06-07T23:56:38+00:00</published><updated>2026-06-07T23:56:38+00:00</updated><id>https://simonwillison.net/2026/Jun/7/datasette-agent-edit/#atom-everything</id><summary type="html">
<p><strong>Release:</strong> <a href="https://github.com/datasette/datasette-agent-edit/releases/tag/0.1a0">datasette-agent-edit 0.1a0</a></p>
<p>I'm planning several plugins for <a href="https://agent.datasette.io/">Datasette Agent</a> which can make edits to existing pieces of text - things like collaborative Markdown editing, updating large SQL queries, and editing SVG files.</p>
<p>Agentic editing of text is a little tricky to get right. My favorite published design for this is the <a href="https://platform.claude.com/docs/en/agents-and-tools/tool-use/text-editor-tool#use-the-text-editor-tool">Claude text editor</a> tool, which implements the following commands:</p>
<ul>
<li><code>view</code> - view sections of a file, with line numbers added to every line.</li>
<li><code>str_replace</code> - find an exact <code>old_str</code> and replace it with <code>new_str</code> - fail if the original string is not unique.</li>
<li><code>insert</code> - insert the specified text after the specified line number.</li>
</ul>
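<p>To make those three operations concrete, here's a rough in-memory sketch. This is not the Claude tool or the <code>datasette-agent-edit</code> API - the <code>TextEditor</code> class and its method signatures are hypothetical, and it edits a string rather than a file:</p>

```python
class TextEditor:
    """Toy in-memory editor with view, str_replace and insert operations."""

    def __init__(self, text):
        self.text = text

    def view(self, start=1, end=None):
        """Return lines start..end (1-indexed, inclusive) prefixed with line numbers."""
        selected = self.text.split("\n")[start - 1:end]
        return "\n".join(
            f"{number}: {line}" for number, line in enumerate(selected, start=start)
        )

    def str_replace(self, old_str, new_str):
        """Replace old_str with new_str, refusing unless it matches exactly once."""
        count = self.text.count(old_str)
        if count != 1:
            raise ValueError(f"old_str must match exactly once, found {count}")
        self.text = self.text.replace(old_str, new_str)

    def insert(self, line_number, new_text):
        """Insert new_text after line_number; 0 inserts at the very top."""
        lines = self.text.split("\n")
        lines[line_number:line_number] = new_text.split("\n")
        self.text = "\n".join(lines)


editor = TextEditor("select id\nfrom users\nwhere id = 1")
editor.str_replace("id = 1", "id = 2")
editor.insert(1, "     , name")
print(editor.view())
```

<p>The uniqueness check in <code>str_replace</code> is the important design choice: an ambiguous edit fails loudly, which pushes the model to supply more surrounding context instead of silently changing the wrong occurrence.</p>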
<p>Rather than recreate these patterns for every plugin that needs them I decided to create this base plugin, <code>datasette-agent-edit</code>, which implements the core tools in a way that allows them to be adapted for other plugins.</p>
<p>Tags: <a href="https://simonwillison.net/tags/ai">ai</a>, <a href="https://simonwillison.net/tags/datasette">datasette</a>, <a href="https://simonwillison.net/tags/generative-ai">generative-ai</a>, <a href="https://simonwillison.net/tags/llms">llms</a>, <a href="https://simonwillison.net/tags/llm-tool-use">llm-tool-use</a>, <a href="https://simonwillison.net/tags/datasette-agent">datasette-agent</a></p>
</summary><category term="ai"/><category term="datasette"/><category term="generative-ai"/><category term="llms"/><category term="llm-tool-use"/><category term="datasette-agent"/></entry><entry><title>micropython-wasm 0.1a2</title><link href="https://simonwillison.net/2026/Jun/6/micropython-wasm/#atom-everything" rel="alternate"/><published>2026-06-06T04:26:06+00:00</published><updated>2026-06-06T04:26:06+00:00</updated><id>https://simonwillison.net/2026/Jun/6/micropython-wasm/#atom-everything</id><summary type="html">
<p><strong>Release:</strong> <a href="https://github.com/simonw/micropython-wasm/releases/tag/0.1a2">micropython-wasm 0.1a2</a></p>
<p>I added a CLI to <code>micropython-wasm</code> (<a href="https://github.com/simonw/micropython-wasm/issues/7">issue #7</a>). The idea came while writing the first draft of <a href="https://simonwillison.net/2026/Jun/6/micropython-in-a-sandbox/">the blog entry</a>, when I realized it would be a great way to illustrate the <a href="https://simonwillison.net/2026/Jun/6/micropython-in-a-sandbox/#try-it-yourself">Try it yourself</a> section.</p>
<p>Tags: <a href="https://simonwillison.net/tags/python">python</a>, <a href="https://simonwillison.net/tags/sandboxing">sandboxing</a>, <a href="https://simonwillison.net/tags/webassembly">webassembly</a>, <a href="https://simonwillison.net/tags/micropython">micropython</a></p>
</summary><category term="python"/><category term="sandboxing"/><category term="webassembly"/><category term="micropython"/></entry><entry><title>Running Python code in a sandbox with MicroPython and WASM</title><link href="https://simonwillison.net/2026/Jun/6/micropython-in-a-sandbox/#atom-everything" rel="alternate"/><published>2026-06-06T03:53:34+00:00</published><updated>2026-06-06T03:53:34+00:00</updated><id>https://simonwillison.net/2026/Jun/6/micropython-in-a-sandbox/#atom-everything</id><summary type="html">
<p>I've been experimenting with different approaches to running code in a sandbox for several years now, but my latest attempt feels like it might finally have all of the characteristics I've been looking for. I've released it as an alpha package called <a href="https://github.com/simonw/micropython-wasm">micropython-wasm</a>, and I'm using it for a code execution sandbox plugin for <a href="https://github.com/datasette/datasette-agent">Datasette Agent</a> called <a href="https://github.com/datasette/datasette-agent-micropython">datasette-agent-micropython</a>.</p>
<ul>
<li><a href="https://simonwillison.net/2026/Jun/6/micropython-in-a-sandbox/#why-do-i-want-a-sandbox-">Why do I want a sandbox?</a></li>
<li><a href="https://simonwillison.net/2026/Jun/6/micropython-in-a-sandbox/#what-i-want-from-a-sandbox">What I want from a sandbox</a></li>
<li><a href="https://simonwillison.net/2026/Jun/6/micropython-in-a-sandbox/#webassembly-looks-really-promising-here">WebAssembly looks really promising here</a></li>
<li><a href="https://simonwillison.net/2026/Jun/6/micropython-in-a-sandbox/#micropython-in-webassembly">MicroPython in WebAssembly</a></li>
<li><a href="https://simonwillison.net/2026/Jun/6/micropython-in-a-sandbox/#building-the-first-version">Building the first version</a></li>
<li><a href="https://simonwillison.net/2026/Jun/6/micropython-in-a-sandbox/#try-it-yourself">Try it yourself</a></li>
<li><a href="https://simonwillison.net/2026/Jun/6/micropython-in-a-sandbox/#should-you-trust-my-vibe-coded-sandbox-">Should you trust my vibe-coded sandbox?</a></li>
</ul>
<h4 id="why-do-i-want-a-sandbox-">Why do I want a sandbox?</h4>
<p>My key open source projects - <a href="https://datasette.io/">Datasette</a>, <a href="https://llm.datasette.io/">LLM</a>, even <a href="https://sqlite-utils.datasette.io/">sqlite-utils</a> - all support plugins.</p>
<p>I absolutely love plugins as a mechanism for extending software. A carefully designed plugin system reduces the risk involved in trying new things to almost nothing - even the wildest ideas won't leave a lasting influence on the core application itself. My software can grow a new feature overnight and I don't even have to review a pull request!</p>
<p>There's one major drawback: my plugin systems all use Python and <a href="https://pluggy.readthedocs.io/en/latest/">Pluggy</a>, and plugin code executes with full privileges within my applications. A buggy or malicious plugin could break everything or leak private data.</p>
<p>I'd love to be able to run plugin-style code in an environment where it is unable to read unapproved files, connect to a network, or generally operate in a way that's risky or harmful to the rest of the application or the user's computer.</p>
<p>My interest covers more than just plugins. For Datasette in particular there are many features I'd like to support where arbitrary code execution would be useful. I've already experimented with this for <a href="https://enrichments.datasette.io/">Datasette Enrichments</a>, where code can be used to transform values stored in a table. I'd love to build a mechanism where you can run code on a schedule that fetches JSON from an approved location, runs a tiny bit of code to reformat it into a list of dictionaries, then inserts those as rows in a SQLite database table.</p>
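<p>Here's a minimal sketch of what that last pipeline might look like, using only the standard library. Everything here is an assumption for illustration: the JSON is a hard-coded stand-in for a real fetch, the <code>reformat()</code> and <code>insert_rows()</code> names are hypothetical, and nothing is sandboxed - in the real thing <code>reformat()</code> is the piece that would run inside the sandbox:</p>

```python
import json
import sqlite3

# Stand-in for JSON fetched from an approved location (hypothetical data).
fetched = (
    '{"items": ['
    '{"id": 1, "name": "Cleo", "extra": "x"}, '
    '{"id": 2, "name": "Pancakes", "extra": "y"}]}'
)


def reformat(payload):
    """The 'tiny bit of code': turn the raw payload into a list of dictionaries."""
    return [{"id": item["id"], "name": item["name"]} for item in payload["items"]]


def insert_rows(db, table, rows):
    """Insert dictionaries as rows, creating the table from the first row's keys."""
    columns = list(rows[0])
    column_sql = ", ".join(f'"{c}"' for c in columns)
    placeholders = ", ".join(f":{c}" for c in columns)
    db.execute(f'create table if not exists "{table}" ({column_sql})')
    db.executemany(
        f'insert into "{table}" ({column_sql}) values ({placeholders})', rows
    )
    db.commit()


db = sqlite3.connect(":memory:")
rows = reformat(json.loads(fetched))
insert_rows(db, "pets", rows)
print(db.execute("select id, name from pets order by id").fetchall())
```

<p>The untrusted step only sees a parsed JSON value and returns plain data, so the host keeps full control over both the network fetch and the database write.</p>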
<h4 id="what-i-want-from-a-sandbox">What I want from a sandbox</h4>
<p>My goal is to execute code safely within my own Python applications. Here's what I need:</p>
<ul>
<li>Dependencies that <strong>cleanly install from PyPI</strong>, including binary wheels across multiple platforms if necessary. I don't want people using my software to have to take any extra steps beyond directly installing my Python package.</li>
<li>Executed code must be subject to both <strong>memory</strong> and <strong>CPU</strong> limits. I don't want <code>while True: s += "longer string"</code> to crash my application or the user's computer.</li>
<li>
<strong>File access must be strictly controlled</strong>. Either no filesystem access at all or I get to define exactly which files can be read and which files can be written to.</li>
<li>
<strong>Network access is controlled as well</strong>. Sandboxed code should not be able to communicate with anything without going through a layer I fully control.</li>
<li>Support for interaction with <strong>host functions</strong>. A sandbox isn't much use if I can't carefully expose selected platform features to the code that it's running.</li>
<li>It has to be <strong>robust, supported, and clearly documented</strong>. I've lost count of the number of sandbox projects I've seen in repos with warnings that they aren't actively maintained!</li>
</ul>
<h4 id="webassembly-looks-really-promising-here">WebAssembly looks really promising here</h4>
<p>Web browsers operate in the most hostile environment imaginable when it comes to malicious code. Their job is to download <em>and execute</em> untrusted code from the web on almost every page load.</p>
<p>Given this, JavaScript engines should be excellent candidates for sandboxes. Sadly those engines are also extremely complicated, and are not designed for easy embedding in other projects. Most of the V8-in-Python projects I've seen are infrequently maintained and come with warnings not to use them with completely untrusted code.</p>
<p>WebAssembly is a <em>much better</em> candidate. It was designed from the start to support all of the characteristics I care about and has been tested in browsers for nearly a decade. The <a href="https://pypi.org/project/wasmtime">wasmtime</a> Python library brings WASM to Python, is actively maintained, and has binary wheels.</p>
<h4 id="micropython-in-webassembly">MicroPython in WebAssembly</h4>
<p>WebAssembly engines like wasmtime run WebAssembly binaries. Some programming languages like Rust are easy to compile directly to WebAssembly. Dynamic languages like JavaScript and Python are harder - they support language primitives like <code>eval()</code>, which means they need a full interpreter available at runtime.</p>
<p>To run Python we need a full Python interpreter compiled to WebAssembly, wired up in a way that makes it easy to feed it code, hook up host functions and access the results.</p>
<p>Pyodide offers an outstanding package for running Python using WebAssembly in the browser, but using Pyodide in server-side Python isn't supported. The most recent advice I could find was <a href="https://github.com/pyodide/pyodide/discussions/5145">from October 2024</a> stating "Pyodide is built by the Emscripten toolchain and can only run in a browser or Node.js".</p>
<p>The other day I decided to take a look at <a href="https://micropython.org">MicroPython</a> as an option for this. The MicroPython site says:</p>
<blockquote>
<p>MicroPython is a lean and efficient implementation of the Python 3 programming language that includes a small subset of the Python standard library and is optimised to run on microcontrollers and in constrained environments.</p>
</blockquote>
<p>WebAssembly sure feels like a constrained environment to me!</p>
<h4 id="building-the-first-version">Building the first version</h4>
<p>I had GPT-5.5 Pro <a href="https://chatgpt.com/share/6a1e2a5c-58b8-8328-ba1c-0e6aadb0a051">do some research for me</a>, which turned up <a href="https://github.com/micropython/micropython/pull/13676">this PR against MicroPython</a> by <a href="https://github.com/yamt">YAMAMOTO Takashi</a> titled "Experimental WASI support for ports/unix".</p>
<p>It then produced this <a href="https://github.com/simonw/micropython-wasm/blob/c08fbd2276b15dc8c9bdff82845f750971f45647/research.md">research.md document</a>, so I let Codex Desktop and GPT-5.5 high <a href="https://gist.github.com/simonw/27461a16d76f28f8619c609444d544fe">loose on it</a> to see what would happen:</p>
<blockquote>
<p><code>read the research.md document and build this. You will probably need to write a script that compiles a custom WASM version of MicroPython as part of this project - fetch the MicroPython code to a /tmp directory for this as part of that script.</code></p>
</blockquote>
<p>It worked. I now had a prototype Python library that could execute Python code inside a WebAssembly sandbox!</p>
<p>The trickiest piece to solve was persistent interpreter state. The WASM build we are using here exposes a single entry point which starts the interpreter, runs the code and then stops the interpreter at the end.</p>
<p>This works fine for one-off scripts, but for Datasette Agent I want variables and functions to stay resident in memory so I can reuse them across multiple code execution calls.</p>
<p>A neat thing about working with coding agents is that you can get from an idea to a proof of concept quickly. I prompted:</p>
<blockquote>
<p><code>For keeping variables resident: what if we ran code inside micropython itself which called a host function get_next_python_code() and then passed that to eval() - and that host function blocked until new code was available, maybe by running in a thread with a queue? Could that or a similar idea help here?</code></p>
</blockquote>
<p>After some iteration we got to a version of this that works! In Python code you can now do this:</p>
<pre><span class="pl-k">from</span> <span class="pl-s1">micropython_wasm</span> <span class="pl-k">import</span> <span class="pl-v">MicroPythonSession</span>

<span class="pl-k">with</span> <span class="pl-en">MicroPythonSession</span>() <span class="pl-k">as</span> <span class="pl-s1">session</span>:
    <span class="pl-en">print</span>(<span class="pl-s1">session</span>.<span class="pl-c1">run</span>(<span class="pl-s">"x = 10<span class="pl-cce">\n</span>print(x)"</span>).<span class="pl-c1">stdout</span>)
    <span class="pl-en">print</span>(<span class="pl-s1">session</span>.<span class="pl-c1">run</span>(<span class="pl-s">"x += 5<span class="pl-cce">\n</span>print(x)"</span>).<span class="pl-c1">stdout</span>)
    <span class="pl-en">print</span>(<span class="pl-s1">session</span>.<span class="pl-c1">run</span>(<span class="pl-s">"print(x * 2)"</span>).<span class="pl-c1">stdout</span>)</pre>
<p>Under the hood this starts a thread, sets up a request queue and then sends a message to that queue for each <code>session.run()</code> call, each time waiting on a reply queue for the result of that execution. Inside WASM the MicroPython interpreter blocks waiting for a <code>__session_next__()</code> host function to return the next block of code, which it runs <code>eval()</code> on before calling <code>__session_result__({"id": request_id, "ok": True})</code> when each block has been successfully executed.</p>
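<p>The request queue / reply queue pattern is easier to see in a stripped-down form. This sketch is not the <code>micropython-wasm</code> implementation: it uses ordinary <code>exec()</code> in a regular Python thread with no sandbox at all, and the <code>ToySession</code> class is hypothetical. The two queues stand in for the roles played by the <code>__session_next__()</code> and <code>__session_result__()</code> host functions:</p>

```python
import io
import queue
import threading


class ToySession:
    """Keeps one namespace alive in a worker thread, fed through two queues."""

    def __init__(self):
        self.requests = queue.Queue()
        self.replies = queue.Queue()
        self.thread = threading.Thread(target=self._loop, daemon=True)
        self.thread.start()

    def _loop(self):
        namespace = {}  # survives across run() calls, like interpreter state
        while True:
            request_id, code = self.requests.get()  # blocks until code arrives
            if code is None:  # shutdown signal sent by close()
                break
            output = io.StringIO()
            # Route print() calls from the executed code into a per-request buffer.
            namespace["print"] = lambda *args: output.write(
                " ".join(map(str, args)) + "\n"
            )
            try:
                exec(code, namespace)
                ok = True
            except Exception as ex:
                output.write(repr(ex))
                ok = False
            self.replies.put({"id": request_id, "ok": ok, "stdout": output.getvalue()})

    def run(self, code, request_id=0):
        """Send code to the worker and wait for its result."""
        self.requests.put((request_id, code))
        return self.replies.get()

    def close(self):
        self.requests.put((None, None))
        self.thread.join()


session = ToySession()
print(session.run("x = 10\nprint(x)")["stdout"])
print(session.run("x += 5\nprint(x)")["stdout"])
session.close()
```

<p>Because the worker loop never exits between requests, <code>x</code> is still defined on the second call.</p>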
<p>The other piece of complexity was supporting host functions, so my Python library could selectively expose functions that could then be called by code running in MicroPython.</p>
<p>Codex solved this with <a href="https://github.com/simonw/micropython-wasm/blob/0.1a1/micropython_wasm/usercmodule/host/hostmodule.c">78 lines of C</a>, which are compiled into the <a href="https://github.com/simonw/micropython-wasm/blob/0.1a1/micropython_wasm/artifacts/micropython-wasi.wasm">362KB WebAssembly blob</a> I'm distributing with the package.</p>
<p>I am by no means a C programmer, but I've read the C and had two different models explain it to me (here's <a href="https://claude.ai/share/62f74371-cc3c-44f2-b406-33d03513de9e">Claude's explanation</a>) and I've subjected it to a barrage of tests.</p>
<p>The great thing about working with WebAssembly is that if the C turns out to be fatally flawed the worst that can happen is the WebAssembly execution will fail with an exception. I can live with that risk.</p>
<p>Memory limits are directly supported by wasmtime. CPU limits are a little harder: wasmtime offers a "fuel" concept to limit how many operations a WebAssembly call can execute, and that's the correct fit for this problem, but the units are hard to reason about. I'm experimenting with a 20 million default "fuel" setting now but I'm not confident that it's the most appropriate value.</p>
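<p>If the "fuel" idea is unfamiliar, this toy model may help. It is not how wasmtime implements fuel, and the instruction set and the <code>run()</code> function are invented for illustration: it's a tiny stack machine that spends one unit of fuel per instruction and stops when the budget hits zero.</p>

```python
class OutOfFuel(Exception):
    """Raised when a program uses up its fuel budget."""


def run(program, fuel):
    """Run a list of (op, arg) instructions, spending one unit of fuel on each."""
    stack, pc = [], 0
    while pc != len(program):
        if fuel == 0:
            raise OutOfFuel(f"stopped at instruction {pc}")
        fuel -= 1
        op, arg = program[pc]
        pc += 1
        if op == "push":
            stack.append(arg)
        elif op == "add":
            stack.append(stack.pop() + stack.pop())
        elif op == "jump":
            pc = arg  # jumping backwards creates a loop
    return stack, fuel


# A finite program finishes with fuel to spare.
print(run([("push", 1), ("push", 2), ("add", None)], fuel=10))

# An infinite loop is cut off once the budget is spent.
try:
    run([("push", 1), ("jump", 0)], fuel=1000)
except OutOfFuel as ex:
    print("out of fuel:", ex)
```

<p>This also shows why the units are hard to reason about: the budget counts low-level operations, not lines of Python, so a sensible default depends on how many operations the interpreter itself burns for each line it executes.</p>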
<h4 id="try-it-yourself">Try it yourself</h4>
<p>The <code>micropython-wasm</code> alpha is now <a href="https://pypi.org/project/micropython-wasm">live on PyPI</a>.</p>
<p>You can try it from your own Python code as <a href="https://github.com/simonw/micropython-wasm">described in the README</a>. I've also added a simple CLI mode in <a href="https://github.com/simonw/micropython-wasm/releases/tag/0.1a2">version 0.1a2</a> which means you can try it using <code>uvx</code> without first installing it like so:</p>
<div class="highlight highlight-source-shell"><pre>uvx micropython-wasm -c <span class="pl-s"><span class="pl-pds">'</span>print("Hello world")<span class="pl-pds">'</span></span>
<span class="pl-c"><span class="pl-c">#</span> To see it run out of fuel:</span>
uvx micropython-wasm -c <span class="pl-s"><span class="pl-pds">'</span>s = ""
while True: s += "longer"<span class="pl-pds">'</span></span>
<span class="pl-c"><span class="pl-c">#</span> Outputs: micropython-wasm: guest exited with code 1</span></pre></div>
<p>You can also try it in <a href="https://agent.datasette.io/">Datasette Agent</a> like this:</p>
<div class="highlight highlight-source-shell"><pre>uvx llm keys <span class="pl-c1">set</span> openai
<span class="pl-c"><span class="pl-c">#</span> Paste in an OpenAI key, then:</span>
uvx --with datasette-agent \
  --with datasette-agent-micropython \
  --prerelease allow \
  datasette --internal internal.db \
  -s plugins.datasette-llm.default_model gpt-5.5 \
  --root -o</pre></div>
<p>Then navigate to <a href="http://127.0.0.1:8001/-/agent">http://127.0.0.1:8001/-/agent</a> and run the prompt:</p>
<blockquote>
<p><code>show me some micropython</code></p>
</blockquote>
<p><img src="https://static.simonwillison.net/static/2026/micropython-in-datasette-agent.jpg" style="max-width: 100%" alt="Screenshot of a chat application interface with a dark blue-grey header reading &quot;home&quot; on the left and &quot;root&quot; with a hamburger menu icon on the right. Below is a navigation row with &quot;← Back&quot; and &quot;Chat&quot; on the left and an &quot;EXPORT&quot; button on the right. A blue user message bubble reads &quot;show me some micropython&quot;. Below it a collapsed thinking section reads &quot;▸Thinking: … to show the result clearly. After that, I can wrap up with a brief explanation!&quot; followed by a &quot;▶ Tool: execute_micropython&quot; label. A code block follows: &quot;# A tiny MicroPython example: blink-style logic + Fibonacci&quot; / &quot;def fib(n):&quot; / &quot; a, b = 0, 1&quot; / &quot; out = []&quot; / &quot; for _ in range(n):&quot; / &quot; out.append(a)&quot; / &quot; a, b = b, a + b&quot; / &quot; return out&quot; / &#39;print(&quot;Hello from MicroPython!&quot;)&#39; / &#39;print(&quot;First 10 Fibonacci numbers:&quot;, fib(10))&#39; / &quot;# MicroPython often runs on microcontrollers, e.g.:&quot; / &quot;# from machine import Pin&quot; / &quot;# led = Pin(2, Pin.OUT)&quot; / &quot;# led.value(1) # turn LED on&quot; / &quot;# led.value(0) # turn LED off&quot;. Below a horizontal divider is the output: &quot;Hello from MicroPython!&quot; / &quot;First 10 Fibonacci numbers: [0, 1, 1, 2, 3, 5, 8, 13, 21, 34]&quot;, followed by a &quot;▶ Result: execute_micropython&quot; label. At the bottom is a text input field with placeholder &quot;Type a message...&quot; and a blue &quot;Send&quot; button." /></p>
<p>You can try a live demo of that plugin running in Datasette Agent by signing into <a href="https://agent.datasette.io">agent.datasette.io</a> with your GitHub account.</p>
<h4 id="should-you-trust-my-vibe-coded-sandbox-">Should you trust my vibe-coded sandbox?</h4>
<p>Having complained about immature, loosely-maintained sandboxing libraries, it's deeply ironic that I've now built my own!</p>
<p>I deliberately slapped an alpha release version on it, and I'm not ready to recommend it to anyone who isn't willing to take a significant risk.</p>
<p>I've put it through enough testing that I'm OK using it myself. I've shipped my first plugin that uses it, <a href="https://github.com/datasette/datasette-agent-micropython">datasette-agent-micropython</a>. I've also locked GPT-5.5 xhigh in that Datasette Agent plugin and <a href="https://gist.github.com/simonw/5de497c44d25f9fd459c8aa2c959fe4a">challenged it to break out of the sandbox</a> and so far it has not managed to.</p>
<p>I'm hoping this implementation can convince some companies with professional security teams and high-stakes problems to commit to using Python in WebAssembly as a sandboxing approach and open source their own solutions.</p>
<p>Tags: <a href="https://simonwillison.net/tags/python">python</a>, <a href="https://simonwillison.net/tags/sandboxing">sandboxing</a>, <a href="https://simonwillison.net/tags/ai">ai</a>, <a href="https://simonwillison.net/tags/datasette">datasette</a>, <a href="https://simonwillison.net/tags/webassembly">webassembly</a>, <a href="https://simonwillison.net/tags/generative-ai">generative-ai</a>, <a href="https://simonwillison.net/tags/llms">llms</a>, <a href="https://simonwillison.net/tags/ai-assisted-programming">ai-assisted-programming</a>, <a href="https://simonwillison.net/tags/codex">codex</a>, <a href="https://simonwillison.net/tags/datasette-agent">datasette-agent</a>, <a href="https://simonwillison.net/tags/micropython">micropython</a></p>
</summary><category term="python"/><category term="sandboxing"/><category term="ai"/><category term="datasette"/><category term="webassembly"/><category term="generative-ai"/><category term="llms"/><category term="ai-assisted-programming"/><category term="codex"/><category term="datasette-agent"/><category term="micropython"/></entry><entry><title>OpenAI Help: Lockdown Mode</title><link href="https://simonwillison.net/2026/Jun/5/openai-help-lockdown-mode/#atom-everything" rel="alternate"/><published>2026-06-05T23:56:40+00:00</published><updated>2026-06-05T23:56:40+00:00</updated><id>https://simonwillison.net/2026/Jun/5/openai-help-lockdown-mode/#atom-everything</id><summary type="html">
<p><strong><a href="https://help.openai.com/en/articles/20001061-lockdown-mode">OpenAI Help: Lockdown Mode</a></strong></p>
<p>OpenAI first teased this <a href="https://openai.com/index/introducing-lockdown-mode-and-elevated-risk-labels-in-chatgpt/">in February</a>, but now it's live and "rolling out to eligible personal accounts, including Free, Go, Plus, and Pro, and self-serve ChatGPT Business accounts":</p>
<blockquote>
<p>Lockdown Mode is designed to help prevent the final stage of data exfiltration from a prompt injection attack by limiting outbound network requests that could transfer sensitive data to an attacker. Lockdown Mode does not prevent prompt injections from appearing in the content ChatGPT processes. For example, a prompt injection could appear in cached web content or in an uploaded file, and could still affect the behavior or accuracy of a response.</p>
</blockquote>
<p>This looks really good to me.</p>
<p>The <a href="https://simonwillison.net/2025/Jun/16/the-lethal-trifecta/">Lethal Trifecta</a> occurs when an LLM system combines all three of the following: access to private data, exposure to untrusted content, and a way to transmit stolen data back to the attacker.</p>
<p>The only way to solve the trifecta is to cut off one of the three legs, and by far the easiest leg to restrict without making your LLM systems far less useful is exfiltration: the vectors an attacker can use to steal data.</p>
<p>It looks to me like lockdown mode directly attacks that leg, using mechanisms that are deterministic and, crucially, are not evaluated by AI systems that themselves can be subverted by sufficiently devious attacks.</p>
<p>The existence of lockdown mode does however imply that ChatGPT, in its default settings, does <em>not</em> provide robust protection against sufficiently determined data exfiltration attacks!</p>
<p><strong>Update</strong>: <a href="https://twitter.com/cryps1s/status/2062923575049531422">This tweet</a> from OpenAI CISO Dane Stuckey:</p>
<blockquote>
<p>Lockdown mode is not meant for everyone. However, for folks who have an elevated risk profile - due to who they are, what they work on, or the types of data they work with - it's an excellent tool for further securing themselves. This has some tradeoffs on functionality and utility, but for these users, the tradeoff is worthwhile.</p>
</blockquote>
<p>Tags: <a href="https://simonwillison.net/tags/security">security</a>, <a href="https://simonwillison.net/tags/ai">ai</a>, <a href="https://simonwillison.net/tags/openai">openai</a>, <a href="https://simonwillison.net/tags/prompt-injection">prompt-injection</a>, <a href="https://simonwillison.net/tags/llms">llms</a>, <a href="https://simonwillison.net/tags/lethal-trifecta">lethal-trifecta</a></p>
</summary><category term="security"/><category term="ai"/><category term="openai"/><category term="prompt-injection"/><category term="llms"/><category term="lethal-trifecta"/></entry><entry><title>Quoting Andreas Kling</title><link href="https://simonwillison.net/2026/Jun/5/andreas-kling/#atom-everything" rel="alternate"/><published>2026-06-05T11:10:05+00:00</published><updated>2026-06-05T11:10:05+00:00</updated><id>https://simonwillison.net/2026/Jun/5/andreas-kling/#atom-everything</id><summary type="html">
<blockquote cite="https://ladybird.org/posts/changing-how-we-develop-ladybird/"><p>We will no longer accept public pull requests. [...]</p>
<p>A substantial patch used to imply substantial effort, and that effort was a reasonable proxy for good faith. That assumption no longer holds. [...]</p>
<p>Whether code was typed by hand is beside the point. What matters is who is responsible for it once it enters the browser. Ladybird is becoming a browser for real users. The people introducing changes to it must be the people who decide those changes belong in the project, and who will answer for the consequences.</p></blockquote>
<p class="cite">&mdash; <a href="https://ladybird.org/posts/changing-how-we-develop-ladybird/">Andreas Kling</a>, Changing How We Develop Ladybird</p>
<p>Tags: <a href="https://simonwillison.net/tags/ladybird">ladybird</a>, <a href="https://simonwillison.net/tags/ai-ethics">ai-ethics</a>, <a href="https://simonwillison.net/tags/open-source">open-source</a>, <a href="https://simonwillison.net/tags/generative-ai">generative-ai</a>, <a href="https://simonwillison.net/tags/ai">ai</a>, <a href="https://simonwillison.net/tags/andreas-kling">andreas-kling</a>, <a href="https://simonwillison.net/tags/llms">llms</a></p>
</summary><category term="ladybird"/><category term="ai-ethics"/><category term="open-source"/><category term="generative-ai"/><category term="ai"/><category term="andreas-kling"/><category term="llms"/></entry><entry><title>AI enthusiasts are in a race against time, AI skeptics are in a race against entropy</title><link href="https://simonwillison.net/2026/Jun/4/ai-enthusiasts-ai-skeptics/#atom-everything" rel="alternate"/><published>2026-06-04T23:55:27+00:00</published><updated>2026-06-04T23:55:27+00:00</updated><id>https://simonwillison.net/2026/Jun/4/ai-enthusiasts-ai-skeptics/#atom-everything</id><summary type="html">
<p><strong><a href="https://charitydotwtf.substack.com/p/ai-enthusiasts-are-in-a-race-against">AI enthusiasts are in a race against time, AI skeptics are in a race against entropy</a></strong></p>
<p>Charity Majors neatly captures the dynamic between AI enthusiasts and AI skeptics, both of whom are trying to build great software, often in the same teams:</p>
<blockquote>
<p>The enthusiasts are <em>not wrong</em>. We are starting to see real, non-imaginary, discontinuous leaps in capabilities from teams that lean in hard to working with AI. And this does not feel like a normal technology cycle where you can wait for the dust to settle; teams that sit this out while competitors are hustling could be out of business before the dust settles. That’s a real, existential threat.</p>
<p>The skeptics are also <em>not wrong</em>. When you ship code faster than engineers can read it, in domains where nobody has full context, you are making withdrawals from a trust account that took years to build. Reliability degrades, institutional knowledge evaporates. You end up with systems nobody understands, products burbling into incoherence, and on-call rotations that grind people up and spit them out. That is ALSO a real existential threat.</p>
</blockquote>
<p>Charity recommends treating this as both a leadership challenge and an engineering challenge. The key issue:</p>
<blockquote>
<p>There is no natural feedback loop connecting enthusiasts with skeptics.</p>
</blockquote>
<p>Designing feedback loops to help "mend the gap in shared reality" between the two groups is a fascinating organizational design problem.</p>
<p><small>Via <a href="https://lobste.rs/s/ri4flr/ai_enthusiasts_are_race_against_time_ai">Lobste.rs</a></small></p>
<p>Tags: <a href="https://simonwillison.net/tags/ai">ai</a>, <a href="https://simonwillison.net/tags/charity-majors">charity-majors</a>, <a href="https://simonwillison.net/tags/agentic-engineering">agentic-engineering</a></p>
</summary><category term="ai"/><category term="charity-majors"/><category term="agentic-engineering"/></entry><entry><title>Quoting Emanuel Maiberg, 404 Media</title><link href="https://simonwillison.net/2026/Jun/4/a-slightly-different-version/#atom-everything" rel="alternate"/><published>2026-06-04T16:38:29+00:00</published><updated>2026-06-04T16:38:29+00:00</updated><id>https://simonwillison.net/2026/Jun/4/a-slightly-different-version/#atom-everything</id><summary type="html">
<blockquote cite="https://www.404media.co/google-employees-internally-share-memes-about-how-its-ai-sucks/"><p>After this story was published Google's spokesperson reached out and asked us to publish a slightly different version of that statement. The new statement no longer stated that "it's critical that we maintain humans in the loop."</p></blockquote>
<p class="cite">&mdash; <a href="https://www.404media.co/google-employees-internally-share-memes-about-how-its-ai-sucks/">Emanuel Maiberg, 404 Media</a>, Google Employees Internally Share Memes About How Its AI Sucks</p>
<p>Tags: <a href="https://simonwillison.net/tags/ai-ethics">ai-ethics</a>, <a href="https://simonwillison.net/tags/journalism">journalism</a>, <a href="https://simonwillison.net/tags/ai">ai</a>, <a href="https://simonwillison.net/tags/google">google</a></p>
</summary><category term="ai-ethics"/><category term="journalism"/><category term="ai"/><category term="google"/></entry><entry><title>Uber Caps Usage of AI Tools Like Claude Code to Manage Costs</title><link href="https://simonwillison.net/2026/Jun/3/uber-caps-usage/#atom-everything" rel="alternate"/><published>2026-06-03T12:01:27+00:00</published><updated>2026-06-03T12:01:27+00:00</updated><id>https://simonwillison.net/2026/Jun/3/uber-caps-usage/#atom-everything</id><summary type="html">
<p><strong><a href="https://www.bloomberg.com/news/articles/2026-06-02/uber-caps-usage-of-ai-tools-like-claude-code-to-cut-costs">Uber Caps Usage of AI Tools Like Claude Code to Manage Costs</a></strong></p>
<p>I wrote <a href="https://simonwillison.net/2026/May/27/product-market-fit/#the-ai-failure-stories-around-this-are-pretty-thin">the other day</a> about Uber blowing its 2026 AI budget in four months, and how that wasn't particularly surprising given they would have set that budget in 2025, before anyone could have predicted how popular token-burning coding agents were about to become.
Natalie Lung for Bloomberg:</p>
<blockquote>
<p>The rideshare giant is limiting all employees to $1,500 in monthly token spending per AI coding tool, an Uber spokesperson said in response to a Bloomberg News inquiry. That means spending on one tool doesn’t have a bearing on the budget for another. The limits, which have been instituted in recent months, only apply to agentic coding software such as Cursor or Anthropic PBC’s Claude Code.</p>
</blockquote>
<p>A $1,500 monthly limit per tool strikes me as a rational policy response to over-spending, and <em>much</em> more sensible than those <a href="https://en.wikipedia.org/wiki/Token_maxxing">tokenmaxxing</a> leaderboards encouraging employees to compete for as much AI usage as possible.</p>
<p>It's also interesting in that it hints at a real dollar value for what Uber is getting out of these tools. If we assume two actively used tools per engineer that's $3,000 * 12 = $36,000 cap per engineer per year. Levels.fyi lists <a href="https://www.levels.fyi/companies/uber/salaries/software-engineer?country=254">the median yearly compensation package for Uber software engineers in the USA</a> at $330,000.</p>
<p>That means each employee's AI spending cap is ~11% of that median compensation package.</p>
<p>I <a href="https://simonwillison.net/2026/May/27/product-market-fit/#enterprise-customers-are-now-paying-api-prices">noted</a> that my own token usage comes to about $1,000/month against each of Anthropic and OpenAI - which currently costs me just $100 per provider thanks to their generous subsidized plans for individual subscribers. Those plans are no longer available to larger companies like Uber.</p>
<p>Their new policy means if I were working at Uber I'd still have ~$500/month of tokens to spare for each of those tools, given my current usage patterns.</p>
<p>Tags: <a href="https://simonwillison.net/tags/ai">ai</a>, <a href="https://simonwillison.net/tags/generative-ai">generative-ai</a>, <a href="https://simonwillison.net/tags/llms">llms</a>, <a href="https://simonwillison.net/tags/llm-pricing">llm-pricing</a>, <a href="https://simonwillison.net/tags/coding-agents">coding-agents</a>, <a href="https://simonwillison.net/tags/uber">uber</a></p>
</summary><category term="ai"/><category term="generative-ai"/><category term="llms"/><category term="llm-pricing"/><category term="coding-agents"/><category term="uber"/></entry><entry><title>Microsoft's new MAI models</title><link href="https://simonwillison.net/2026/Jun/2/microsofts-new-models/#atom-everything" rel="alternate"/><published>2026-06-02T22:21:52+00:00</published><updated>2026-06-02T22:21:52+00:00</updated><id>https://simonwillison.net/2026/Jun/2/microsofts-new-models/#atom-everything</id><summary type="html">
<p>Microsoft <a href="https://microsoft.ai/news/building-a-hillclimbing-machine-launching-seven-new-mai-models/">announced two new text LLMs</a> this morning - <strong><a href="https://microsoft.ai/news/introducing-mai-thinking-1/">MAI-Thinking-1</a></strong> (reasoning, 1T parameters, 35B active, available to "select early partners") and <strong><a href="https://microsoft.ai/news/introducingmai-code-1-flash/">MAI-Code-1-Flash</a></strong> (137B parameters, 5B active, "purpose-built for GitHub Copilot and VS Code to deliver high performance and lower cost [...] rolling out to GitHub Copilot individual users in Visual Studio Code"). I've not been able to try either of them just yet.</p>
<p><strike>It's very interesting to see Microsoft releasing models with such low parameter counts, especially given how expensive larger models are to access right now. They claim MAI-Thinking-1 "is preferred to Sonnet 4.6 in our blind human side-by-side evaluations", which is impressive for a 35B model seeing as I frequently run models larger than that on my own laptop.</strike> (UPDATE: I got this entirely wrong, see note below.)</p>
<p>Also <a href="https://microsoft.ai/news/introducing-mai-thinking-1/">of note</a>:</p>
<blockquote>
<p>We trained [MAI-Thinking-1] from the ground up on enterprise grade, clean and commercially licensed data, without distillation from third-party models.</p>
</blockquote>
<p>And for <a href="https://microsoft.ai/news/introducingmai-code-1-flash/">MAI-Code-1-Flash</a> as well:</p>
<blockquote>
<p>It is built end-to-end by Microsoft using clean and appropriately licensed data.</p>
</blockquote>
<p>I would <em>very much</em> like to learn more about this "appropriately licensed" data! Could these be the first generally useful code-specialist models that didn't train on an unlicensed dump of the web? (<strong>Update</strong>: the answer is no, see note below.)</p>
<p><strong>Update</strong>: My initial published notes got the size of the models wrong. I misread Microsoft's announcements and interpreted the MoE active parameter count as the total parameter count, but the <a href="https://microsoft.ai/pdf/MAI-Code-1-Flash-Model-Card.PDF">model card for MAI-Code-1-Flash</a> lists it as 137B with 5B active and the <a href="https://microsoft.ai/wp-content/uploads/2026/06/main_20260602_2.pdf">MAI-Thinking-1 technical paper</a> reveals it to be a 1T model with 35B active.</p>
<p>I deeply regret this error.</p>
<p><strong>Update 2</strong>: That technical paper describes the training data in some detail from page 80 onwards. It has the same licensing problems as all of the other major LLMs: it's trained on a crawl of the public web:</p>
<blockquote>
<p>The majority of our web HTML corpus comes from a proprietary crawl. After initial page discovery and selection, approximately 1.2 trillion pages are crawled and parsed. [...] In addition to Microsoft standard policy Sec. 2.4, we apply UT1 block list (Prigent, 2026) to remove adult content and piracy-related domains. In all, this filtering reduces the corpus from 1.2 trillion pages to 794 billion pages. Given the prevalence of AI-generated content on the web, we also score pages with a proprietary AI-content detection model and use manual inspection to identify domains with extensive AI-generated content; those domains are filtered out of the training corpus.</p>
<p>[...]</p>
<p>We process Common Crawl with the same pipeline. [...] After filtering, deduplication, merging with the proprietary web corpus, and a final round of exact-URL and content-level fuzzy deduplication, the Common Crawl portion contains 24.2 billion pages.</p>
</blockquote>
<p>I did not cover this one at all well, which is somewhat ironic since I was at the Microsoft Build conference when I wrote this up! I'm sorry for not digging deeper before publishing my initial notes.</p>
<p>Tags: <a href="https://simonwillison.net/tags/llm-release">llm-release</a>, <a href="https://simonwillison.net/tags/generative-ai">generative-ai</a>, <a href="https://simonwillison.net/tags/ai">ai</a>, <a href="https://simonwillison.net/tags/microsoft">microsoft</a>, <a href="https://simonwillison.net/tags/llms">llms</a>, <a href="https://simonwillison.net/tags/training-data">training-data</a></p>
</summary><category term="llm-release"/><category term="generative-ai"/><category term="ai"/><category term="microsoft"/><category term="llms"/><category term="training-data"/></entry><entry><title>datasette-agent-micropython 0.1a0</title><link href="https://simonwillison.net/2026/Jun/2/datasette-agent-micropython/#atom-everything" rel="alternate"/><published>2026-06-02T19:28:36+00:00</published><updated>2026-06-02T19:28:36+00:00</updated><id>https://simonwillison.net/2026/Jun/2/datasette-agent-micropython/#atom-everything</id><summary type="html">
<p><strong>Release:</strong> <a href="https://github.com/datasette/datasette-agent-micropython/releases/tag/0.1a0">datasette-agent-micropython 0.1a0</a></p>
<p>I want <a href="https://agent.datasette.io">Datasette Agent</a> to be able to generate and execute Python code safely. This alpha is looking promising: GPT-5.5 has so far failed to break out of the sandbox!</p>
<p>Tags: <a href="https://simonwillison.net/tags/python">python</a>, <a href="https://simonwillison.net/tags/sandboxing">sandboxing</a>, <a href="https://simonwillison.net/tags/datasette">datasette</a>, <a href="https://simonwillison.net/tags/webassembly">webassembly</a>, <a href="https://simonwillison.net/tags/datasette-agent">datasette-agent</a>, <a href="https://simonwillison.net/tags/micropython">micropython</a></p>
</summary><category term="python"/><category term="sandboxing"/><category term="datasette"/><category term="webassembly"/><category term="datasette-agent"/><category term="micropython"/></entry><entry><title>micropython-wasm 0.1a1</title><link href="https://simonwillison.net/2026/Jun/2/micropython-wasm/#atom-everything" rel="alternate"/><published>2026-06-02T19:20:47+00:00</published><updated>2026-06-02T19:20:47+00:00</updated><id>https://simonwillison.net/2026/Jun/2/micropython-wasm/#atom-everything</id><summary type="html">
<p><strong>Release:</strong> <a href="https://github.com/simonw/micropython-wasm/releases/tag/0.1a1">micropython-wasm 0.1a1</a></p>
<p>Fixes for some limitations that emerged while I was trying to use this to build <code>datasette-agent-micropython</code>.</p>
<p>Tags: <a href="https://simonwillison.net/tags/python">python</a>, <a href="https://simonwillison.net/tags/sandboxing">sandboxing</a>, <a href="https://simonwillison.net/tags/webassembly">webassembly</a>, <a href="https://simonwillison.net/tags/micropython">micropython</a></p>
</summary><category term="python"/><category term="sandboxing"/><category term="webassembly"/><category term="micropython"/></entry><entry><title>California Brown Pelican</title><link href="https://simonwillison.net/2026/Jun/2/sighting-367841339/#atom-everything" rel="alternate"/><published>2026-06-02T18:17:03+00:00</published><updated>2026-06-02T18:17:03+00:00</updated><id>https://simonwillison.net/2026/Jun/2/sighting-367841339/#atom-everything</id><summary type="html">
<p><img src="https://static.inaturalist.org/photos/671786719/large.jpg" alt="California Brown Pelican"></p><p>California Brown Pelican, in Fort Mason, CA, US</p><p>I'm at the <a href="https://build.microsoft.com/">Microsoft Build</a> conference today, held at <a href="https://en.wikipedia.org/wiki/Fort_Mason">Fort Mason</a> in San Francisco. There are California Brown Pelicans diving into the water directly behind the venue!</p>
<p>Tags: <a href="https://simonwillison.net/tags/microsoft">microsoft</a>, <a href="https://simonwillison.net/tags/ai">ai</a>, <a href="https://simonwillison.net/tags/generative-ai">generative-ai</a>, <a href="https://simonwillison.net/tags/llms">llms</a>, <a href="https://simonwillison.net/tags/llm-release">llm-release</a></p>
</summary><category term="microsoft"/><category term="ai"/><category term="generative-ai"/><category term="llms"/><category term="llm-release"/></entry><entry><title>Pasted File Editor</title><link href="https://simonwillison.net/2026/Jun/2/pasted-file-editor/#atom-everything" rel="alternate"/><published>2026-06-02T04:13:36+00:00</published><updated>2026-06-02T04:13:36+00:00</updated><id>https://simonwillison.net/2026/Jun/2/pasted-file-editor/#atom-everything</id><summary type="html">
<p><strong>Tool:</strong> <a href="https://tools.simonwillison.net/pasted-file-editor">Pasted File Editor</a></p>
<p>I really like how you can paste a large volume of text into <a href="https://claude.ai">claude.ai</a> (or the Claude desktop/mobile apps) and it will detect it as a large paste and turn it into a file attachment instead.</p>
<p>I decided to have Codex desktop <a href="https://gist.github.com/simonw/74c79119b487a5acce18b4dcc26b9f79">build me a version of that</a> as a prototype.</p>
<p>You can also open files directly - including images which will be shown as thumbnails - or drag files onto the textarea.</p>
<p>Tags: <a href="https://simonwillison.net/tags/javascript">javascript</a>, <a href="https://simonwillison.net/tags/tools">tools</a>, <a href="https://simonwillison.net/tags/ai-assisted-programming">ai-assisted-programming</a>, <a href="https://simonwillison.net/tags/claude">claude</a>, <a href="https://simonwillison.net/tags/codex">codex</a></p>
</summary><category term="javascript"/><category term="tools"/><category term="ai-assisted-programming"/><category term="claude"/><category term="codex"/></entry><entry><title>micropython-wasm 0.1a0</title><link href="https://simonwillison.net/2026/Jun/2/micropython-wasm-2/#atom-everything" rel="alternate"/><published>2026-06-02T03:43:45+00:00</published><updated>2026-06-02T03:43:45+00:00</updated><id>https://simonwillison.net/2026/Jun/2/micropython-wasm-2/#atom-everything</id><summary type="html">
<p><strong>Release:</strong> <a href="https://github.com/simonw/micropython-wasm/releases/tag/0.1a0">micropython-wasm 0.1a0</a></p>
<p>My latest sandboxing experiment: This alpha package bundles a lightly customized WASM build of <a href="https://micropython.org/">MicroPython</a> with a wrapper to execute code in it via <a href="https://wasmtime.dev/">wasmtime</a>.</p>
<p>Tags: <a href="https://simonwillison.net/tags/python">python</a>, <a href="https://simonwillison.net/tags/sandboxing">sandboxing</a>, <a href="https://simonwillison.net/tags/webassembly">webassembly</a>, <a href="https://simonwillison.net/tags/micropython">micropython</a></p>
</summary><category term="python"/><category term="sandboxing"/><category term="webassembly"/><category term="micropython"/></entry></feed>
{
"accept-ranges": "bytes",
"access-control-allow-methods": "GET, OPTIONS",
"access-control-allow-origin": "*",
"access-control-max-age": "1000",
"age": "257",
"cache-control": "s-maxage=600",
"cf-cache-status": "HIT",
"cf-ray": "a0b1b8fded80c424-CMH",
"connection": "close",
"content-length": "141986",
"content-type": "application/xml; charset=utf-8",
"date": "Sat, 13 Jun 2026 14:18:06 GMT",
"django-composition": "Impromptu",
"last-modified": "Sat, 13 Jun 2026 01:01:50 GMT",
"nel": "{\"report_to\":\"heroku-nel\",\"response_headers\":[\"Via\"],\"max_age\":3600,\"success_fraction\":0.01,\"failure_fraction\":0.1}",
"referrer-policy": "strict-origin-when-cross-origin",
"report-to": "{\"group\":\"heroku-nel\",\"endpoints\":[{\"url\":\"https://nel.heroku.com/reports?s=T3DpsoUB4MxF3Xi9fy%2BxVvU4ARg%2FZcgQIt7B1U%2FmQLQ%3D\\u0026sid=c46efe9b-d3d2-4a0c-8c76-bfafa16c5add\\u0026ts=1781360028\"}],\"max_age\":3600}",
"reporting-endpoints": "heroku-nel=\"https://nel.heroku.com/reports?s=T3DpsoUB4MxF3Xi9fy%2BxVvU4ARg%2FZcgQIt7B1U%2FmQLQ%3D&sid=c46efe9b-d3d2-4a0c-8c76-bfafa16c5add&ts=1781360028\"",
"server": "cloudflare",
"via": "1.1 heroku-router",
"x-content-type-options": "nosniff"
}
{
"meta": {
"type": "atom",
"version": "1.0"
},
"language": "en-us",
"title": "Simon Willison's Weblog",
"description": null,
"copyright": null,
"url": "http://simonwillison.net/",
"self": "http://simonwillison.net/atom/everything/",
"published": null,
"updated": "2026-06-13T01:01:50.000Z",
"generator": null,
"image": null,
"authors": [
{
"name": "Simon Willison",
"email": null,
"url": null
}
],
"categories": [],
"items": [
{
"id": "https://simonwillison.net/2026/Jun/13/us-government-directive-to-suspend-access/#atom-everything",
"title": "Statement on the US government directive to suspend access to Fable 5 and Mythos 5",
"description": "<p><strong><a href=\"https://www.anthropic.com/news/fable-mythos-access\">Statement on the US government directive to suspend access to Fable 5 and Mythos 5</a></strong></p>\nWell this is <em>nuts</em>:</p>\n<blockquote>\n<p>The US government, citing national security authorities, has issued an export control directive to suspend all access to Fable 5 and Mythos 5 by any foreign national, whether inside or outside the United States, including foreign national Anthropic employees. The net effect of this order is that we must abruptly disable Fable 5 and Mythos 5 for <strong>all</strong> our customers to ensure compliance. <strong>Access to all other Anthropic models</strong> <strong>will not be affected.</strong></p>\n<p>We received the directive from the government today at 5:21pm (ET). The letter did not provide specific details of its national security concern. Our understanding is that the government believes it has become aware of a method of bypassing, or \"jailbreaking\" Fable 5. We reviewed a demonstration of this specific technique being used to identify a small number of previously known, minor vulnerabilities. These vulnerabilities all appear relatively simple, and we have found that other publicly-available models are able to discover them as well without requiring a bypass. [...]</p>\n<p>To date, the government has only given us verbal evidence of a potential narrow, non-universal jailbreak, which essentially consists of asking the model to read a specific codebase and fix any software flaws. Our understanding is that one potential jailbreak was shared with the government. We have reviewed the report and validated that the level of capability displayed there is widely available from other models (including OpenAI's <a href=\"https://deploymentsafety.openai.com/gpt-5-5/tacit-knowledge-and-troubleshooting\">GPT-5.5</a>), and is used every day by the defenders who keep systems safe. 
We will share more details over the next 24 hours.</p>\n</blockquote>\n<p>I still have access to Fable via <a href=\"https://claude.ai/\">claude.ai</a> and Claude Code now, at 9:01pm ET.</p>\n<p><strong>Update</strong>: I ran <a href=\"https://gist.github.com/simonw/5894cfafc64a2b8aafbe834bc9c950b9\">this script</a> against the Anthropic API to spot when <code>claude-fable-5</code> would stop working. My access was cut off at 6:59pm Pacific (9:59pm ET):</p>\n<pre>[2026-06-12T18:56:50-07:00] attempt 35: running uv run llm -m claude-fable-5 hi\n[2026-06-12T18:56:55-07:00] success: Hi there! How can I help you today?\n[2026-06-12T18:57:55-07:00] attempt 36: running uv run llm -m claude-fable-5 hi\n[2026-06-12T18:57:59-07:00] success: Hi! How can I help you today?\n[2026-06-12T18:58:59-07:00] attempt 37: running uv run llm -m claude-fable-5 hi\n[2026-06-12T18:59:00-07:00] FAILED after attempt 37 with exit code 1\n\nstderr:\nError: Error code: 404 - {'type': 'error', 'error': {'type': 'not_found_error', 'message': 'Claude Fable 5 is not available. Please use Opus 4.8. Learn more: https://www.anthropic.com/news/fable-mythos-access'}, 'request_id': 'req_011CbzRyirV7KZLHYYdBM9od'}</pre>\n\n <p><small></small>Via <a href=\"https://twitter.com/AnthropicAI/status/2065597531644743999\">@AnthropicAI</a></small></p>\n\n\n <p>Tags: <a href=\"https://simonwillison.net/tags/jailbreaking\">jailbreaking</a>, <a href=\"https://simonwillison.net/tags/ai\">ai</a>, <a href=\"https://simonwillison.net/tags/generative-ai\">generative-ai</a>, <a href=\"https://simonwillison.net/tags/llms\">llms</a>, <a href=\"https://simonwillison.net/tags/anthropic\">anthropic</a>, <a href=\"https://simonwillison.net/tags/claude\">claude</a>, <a href=\"https://simonwillison.net/tags/ai-ethics\">ai-ethics</a>, <a href=\"https://simonwillison.net/tags/claude-mythos\">claude-mythos</a></p>",
"url": "https://simonwillison.net/2026/Jun/13/us-government-directive-to-suspend-access/#atom-everything",
"published": "2026-06-13T01:01:50.000Z",
"updated": "2026-06-13T01:01:50.000Z",
"content": null,
"image": null,
"media": [],
"authors": [
{
"name": "Simon Willison",
"email": null,
"url": null
}
],
"categories": [
{
"label": "jailbreaking",
"term": "jailbreaking",
"url": null
},
{
"label": "ai",
"term": "ai",
"url": null
},
{
"label": "generative-ai",
"term": "generative-ai",
"url": null
},
{
"label": "llms",
"term": "llms",
"url": null
},
{
"label": "anthropic",
"term": "anthropic",
"url": null
},
{
"label": "claude",
"term": "claude",
"url": null
},
{
"label": "ai-ethics",
"term": "ai-ethics",
"url": null
},
{
"label": "claude-mythos",
"term": "claude-mythos",
"url": null
}
]
},
{
"id": "https://simonwillison.net/2026/Jun/12/openai-webrtc/#atom-everything",
"title": "OpenAI WebRTC Audio Session, now with document context",
"description": "<p><strong><a href=\"https://tools.simonwillison.net/openai-webrtc\">OpenAI WebRTC Audio Session, now with document context</a></strong></p>\nI built the first version of this tool <a href=\"https://simonwillison.net/2024/Dec/17/openai-webrtc/\">in December 2024</a> to try out the then-new OpenAI WebRTC API for interacting with their realtime audio models.</p>\n<p>Last month OpenAI <a href=\"https://openai.com/index/advancing-voice-intelligence-with-new-models-in-the-api/\">introduced a brand new model</a> to that API called <a href=\"https://developers.openai.com/api/docs/models/gpt-realtime-2\">GPT‑Realtime‑2</a>, which they promoted as \"our first voice model with GPT‑5‑class reasoning\" - with a Sep 30, 2024 knowledge cut-off.</p>\n<p>I've been waiting for that model to show up in the ChatGPT iPhone app but it still hasn't, so I revisited my old playground.</p>\n<p>You can now pick the better model, and you can also paste in a big chunk of document context so you can have as audio conversation in your browser about whatever information you think would be useful to explore in a conversational way.</p>\n<p><img src=\"https://static.simonwillison.net/static/2026/openai-webrtc-document-context.jpg\" alt=\"Screenshot of a web interface titled \"OpenAI WebRTC Audio Session\" with a gray status dot. Form fields: \"OpenAI API Token\" showing a masked password of dots, \"Voice\" dropdown set to \"Coral\", \"Model\" dropdown set to \"gpt-realtime-2\". A collapsible section labeled \"▼ Document context (optional — paste text to talk about)\" with bold instruction \"Paste a document here before starting the session and the model will be able to discuss it with you\" above a textarea containing a pasted Markdown document about whether DuckDB can run untrusted SQL as safely as Datasette runs SQLite. 
Below are a blue \"Start Session\" button and a gray disabled \"Mute Mic\" button, then a green success message \"Session established successfully!\" At the bottom, a dark panel headed \"Last transcript\" reads: \"DuckDB can be made about as safe as SQLite for running untrusted SELECT queries, but only if you lock it down properly. Using read only true by itself is not enough, because SQL can still\" (text cut off).\" class=\"blogmark-image\" style=\"max-width: 80%\">\n\n\n <p>Tags: <a href=\"https://simonwillison.net/tags/audio\">audio</a>, <a href=\"https://simonwillison.net/tags/tools\">tools</a>, <a href=\"https://simonwillison.net/tags/ai\">ai</a>, <a href=\"https://simonwillison.net/tags/openai\">openai</a>, <a href=\"https://simonwillison.net/tags/generative-ai\">generative-ai</a>, <a href=\"https://simonwillison.net/tags/llms\">llms</a>, <a href=\"https://simonwillison.net/tags/multi-modal-output\">multi-modal-output</a>, <a href=\"https://simonwillison.net/tags/webrtc\">webrtc</a></p>",
"url": "https://simonwillison.net/2026/Jun/12/openai-webrtc/#atom-everything",
"published": "2026-06-12T23:53:04.000Z",
"updated": "2026-06-12T23:53:04.000Z",
"content": null,
"image": null,
"media": [],
"authors": [
{
"name": "Simon Willison",
"email": null,
"url": null
}
],
"categories": [
{
"label": "audio",
"term": "audio",
"url": null
},
{
"label": "tools",
"term": "tools",
"url": null
},
{
"label": "ai",
"term": "ai",
"url": null
},
{
"label": "openai",
"term": "openai",
"url": null
},
{
"label": "generative-ai",
"term": "generative-ai",
"url": null
},
{
"label": "llms",
"term": "llms",
"url": null
},
{
"label": "multi-modal-output",
"term": "multi-modal-output",
"url": null
},
{
"label": "webrtc",
"term": "webrtc",
"url": null
}
]
},
{
"id": "https://simonwillison.net/2026/Jun/12/andrew-singleton/#atom-everything",
"title": "Quoting Andrew Singleton",
"description": "<blockquote cite=\"https://www.mcsweeneys.net/articles/ai-economics-for-dummies\"><p>Jenny owns a crematorium. John’s propane company gives her a $20 billion investment in return for 5 percent of her operation. Jenny throws $10 billion into the incinerator, then pays John $10 billion to buy propane to burn that money to ashes. John reports that his AI investments have generated $10 billion in revenue this quarter and that he owns 5 percent of a $100 billion business. A reporter from <em>Forbes</em> is assigned to profile John and Jenny, and over the course of his research, he becomes embroiled in a passionate but confusing three-way love affair with them, which eventually turns into a polyamorous common-law marriage. His profile is glowing, but light on financial details.</p></blockquote>\n<p class=\"cite\">— <a href=\"https://www.mcsweeneys.net/articles/ai-economics-for-dummies\">Andrew Singleton</a>, AI Economics for Dummies</p>\n\n <p>Tags: <a href=\"https://simonwillison.net/tags/ai\">ai</a></p>",
"url": "https://simonwillison.net/2026/Jun/12/andrew-singleton/#atom-everything",
"published": "2026-06-12T18:09:21.000Z",
"updated": "2026-06-12T18:09:21.000Z",
"content": null,
"image": null,
"media": [],
"authors": [
{
"name": "Simon Willison",
"email": null,
"url": null
}
],
"categories": [
{
"label": "ai",
"term": "ai",
"url": null
}
]
},
{
"id": "https://simonwillison.net/2026/Jun/11/fable-is-relentlessly-proactive/#atom-everything",
"title": "Claude Fable is relentlessly proactive",
"description": "<p>After two days of experience with <a href=\"https://simonwillison.net/2026/Jun/9/claude-fable-5/\">Claude Fable 5</a> I think the best way to describe it is <strong>relentlessly proactive</strong>. It knows a whole lot of tricks and it will deploy pretty much any of them to get to its goal.</p>\n<p>I'll illustrate this with an example. I was hacking on <a href=\"https://agent.datasette.io/\">Datasette Agent</a> today when I noticed a glitch: a horizontal scrollbar that shouldn't be there in the jump menu chat prompt. I snapped this screenshot:</p>\n<p><img src=\"https://static.simonwillison.net/static/2026/jump-to-bug.jpg\" alt=\"Screenshot of a modal dialog demonstrating a scrollbar bug. At the top is a focused search input with blue outline and placeholder \"Jump to...\", with an X close button to its right. Below, a heading reads \"Start a new agent chat\" above a textarea with the placeholder \"Ask a question about your data...\" — the bug: a thick gray horizontal scrollbar is incorrectly displayed along the bottom edge of the empty textarea, spanning nearly its full width, next to the resize handle. Below the textarea: \"Press Enter to start. Shift+Enter adds a new line.\" followed by a blue \"Start chat\" button.\" style=\"max-width: 100%;\" /></p>\n<p>Then I started a fresh <code>claude</code> session in my <code>datasette-agent</code> checkout, dragged in the screenshot and told it:</p>\n<blockquote>\n<p><code>Look at dependencies to help figure out why there is a horizontal scrollbar here</code></p>\n</blockquote>\n<p>I had a hunch the cause was in a dependency of Datasette Agent (likely Datasette itself) and I knew Fable was good at digging into dependency code, either by inspecting installed files in its own virtual environment <code>site-packages</code> or by referencing a local checkout on disk. 
Telling it to start with dependencies felt like a good bet.</p>\n<p>I got distracted by a domestic task and wandered away from my computer.</p>\n<p>When I came back a few minutes later I saw my machine <em>open a browser window</em> in my regular Firefox and then <em>navigate to the dialog in question</em>. I had not told Claude Code to use any browser automation, and I was pretty sure it wasn't possible for it to trigger mouse movements or keyboard shortcuts within a window, so how was it doing that?</p>\n<p>I watched in fascination as it continued with its explorations, then saw it open a Safari window instead of Firefox. I also grabbed this snapshot from the Claude terminal:</p>\n<p><img src=\"https://static.simonwillison.net/static/2026/fable-bash-pyobjc.jpg\" alt=\"Screenshot of two Bash tool calls in a dark terminal interface. First: Bash(open -a Safari /tmp/textarea-scrollbar-test.html && sleep 4 && uv run --with pyobjc-framework-Quartz python - <<'EOF' import Quartz wins = Quartz.CGWindowListCopyWindowInfo(Quartz.kCGWindowListOptionOnScreenOnly, Quartz.kCGNullWindowID) for w in wins: if (w.get('kCGWindowOwnerName') or '') == 'Safari' and 'textarea' in (w.get('kCGWindowName') or '').lower(): print(w.get('kCGWindowNumber')) EOF) with output 153551. Second: Bash(screencapture -x -o -l 153551 /tmp/safari-cases.png && echo ok) with output ok.\" style=\"max-width: 100%;\" /></p>\n<p>What was it doing there with <code>uv run --with pyobjc-framework-Quartz</code>?</p>\n<p>It turns out Fable had hacked up its own pattern for taking screenshots of browser windows. It was using Python to iterate through all available windows on my machine, then filtering for Safari windows with expected strings such as <code>\"textarea\"</code> in the window name. It used that to find their window number - an integer like 153551 - which it could then use with the <code>screencapture</code> CLI tool to grab a PNG.</p>\n<p>OK fine, that's a neat way of taking screenshots. 
But what was it taking screenshots of?</p>\n<p>Turns out it had been writing its own scratch HTML pages to try and recreate the bug, then opening Safari and grabbing screenshots.</p>\n<p>Here's that <a href=\"https://static.simonwillison.net/static/2026/textarea-scrollbar-test.html\">/tmp/textarea-scrollbar-test.html</a> page it created, and the screenshot it took with <code>screencapture -x -o -l 153551 /tmp/safari-cases.png</code>:</p>\n<p><img src=\"https://static.simonwillison.net/static/2026/safari-cases.jpg\" alt=\"Screenshot of a Safari browser window showing a textarea scrollbar test page at file:///private/tmp/textarea-scrollbar-test.html. Page text reads: scrollbar thickness: 17px | UA: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/26.4 Safari/605.1.15 | devicePixelRatio: 2. Four numbered test cases follow, each with a textarea containing the placeholder \"Ask a question about your data...\": 1. Exact plugin CSS (resize: vertical, default overflow), 2. Plugin CSS + overflow-x: hidden, 3. Plugin CSS + resize: none, and 4. Bare default textarea, which is a much smaller box with the placeholder wrapping onto two lines.\" style=\"max-width: 100%;\" />\n(I have way too many open tabs!)</p>\n<p>OK, so I can see how it's opening test pages and taking screenshots, but how on earth was it triggering the modal dialog that was meant to be under test? That's only available via a click or a keyboard shortcut, and I couldn't see a mechanism for it to run those in Safari.</p>\n<p>I eventually figured out what it had done.</p>\n<p>Claude was running in a folder that contained the source code for the application. It knows enough about <a href=\"https://datasette.io/\">Datasette</a> to be able to run a local development server. 
It turns out it was editing Datasette's own templates to add JavaScript that would trigger the correct keyboard shortcut as soon as the window opened, adding code like this:</p>\n<div class=\"highlight highlight-text-html-basic\"><pre><span class=\"pl-kos\"><</span><span class=\"pl-ent\">script</span><span class=\"pl-kos\">></span>\n<span class=\"pl-smi\">window</span><span class=\"pl-kos\">.</span><span class=\"pl-en\">addEventListener</span><span class=\"pl-kos\">(</span><span class=\"pl-s\">\"load\"</span><span class=\"pl-kos\">,</span> <span class=\"pl-k\">function</span> <span class=\"pl-kos\">(</span><span class=\"pl-kos\">)</span> <span class=\"pl-kos\">{</span>\n <span class=\"pl-en\">setTimeout</span><span class=\"pl-kos\">(</span><span class=\"pl-k\">function</span> <span class=\"pl-kos\">(</span><span class=\"pl-kos\">)</span> <span class=\"pl-kos\">{</span>\n <span class=\"pl-smi\">document</span><span class=\"pl-kos\">.</span><span class=\"pl-en\">dispatchEvent</span><span class=\"pl-kos\">(</span><span class=\"pl-k\">new</span> <span class=\"pl-v\">KeyboardEvent</span><span class=\"pl-kos\">(</span><span class=\"pl-s\">\"keydown\"</span><span class=\"pl-kos\">,</span> <span class=\"pl-kos\">{</span><span class=\"pl-c1\">key</span>: <span class=\"pl-s\">\"/\"</span><span class=\"pl-kos\">,</span> <span class=\"pl-c1\">bubbles</span>: <span class=\"pl-c1\">true</span><span class=\"pl-kos\">}</span><span class=\"pl-kos\">)</span><span class=\"pl-kos\">)</span><span class=\"pl-kos\">;</span>\n <span class=\"pl-kos\">}</span><span class=\"pl-kos\">,</span> <span class=\"pl-c1\">1200</span><span class=\"pl-kos\">)</span><span class=\"pl-kos\">;</span>\n<span class=\"pl-kos\">}</span><span class=\"pl-kos\">)</span><span class=\"pl-kos\">;</span>\n<span class=\"pl-kos\"></</span><span class=\"pl-ent\">script</span><span class=\"pl-kos\">></span></pre></div>\n<p>1.2 seconds after the window opens, this code triggers a simulated <code>/</code> key, which is the 
keyboard shortcut for opening the modal dialog.</p>\n<p>There was one challenge left. In order to understand what was going on, Claude needed to run JavaScript on the page to take measurements for itself.</p>\n<p>It wrote its own custom web application to capture information via CORS, then ran that as a local server and opened a page with JavaScript that would POST directly to it!</p>\n<p>Here's the Python web app it wrote, using the standard library <a href=\"https://docs.python.org/3/library/http.server.html\">http.server</a> package:</p>\n<pre><span class=\"pl-k\">from</span> <span class=\"pl-s1\">http</span>.<span class=\"pl-s1\">server</span> <span class=\"pl-k\">import</span> <span class=\"pl-v\">HTTPServer</span>, <span class=\"pl-v\">BaseHTTPRequestHandler</span>\n\n<span class=\"pl-k\">class</span> <span class=\"pl-c1\">H</span>(<span class=\"pl-v\">BaseHTTPRequestHandler</span>):\n <span class=\"pl-k\">def</span> <span class=\"pl-en\">do_POST</span>(<span class=\"pl-s1\">self</span>):\n <span class=\"pl-s1\">n</span> <span class=\"pl-c1\">=</span> <span class=\"pl-en\">int</span>(<span class=\"pl-s1\">self</span>.<span class=\"pl-c1\">headers</span>.<span class=\"pl-c1\">get</span>(<span class=\"pl-s\">\"Content-Length\"</span>, <span class=\"pl-c1\">0</span>))\n <span class=\"pl-en\">open</span>(<span class=\"pl-s\">\"/tmp/diag.json\"</span>, <span class=\"pl-s\">\"w\"</span>).<span class=\"pl-c1\">write</span>(<span class=\"pl-s1\">self</span>.<span class=\"pl-c1\">rfile</span>.<span class=\"pl-c1\">read</span>(<span class=\"pl-s1\">n</span>).<span class=\"pl-c1\">decode</span>())\n <span class=\"pl-s1\">self</span>.<span class=\"pl-c1\">send_response</span>(<span class=\"pl-c1\">200</span>)\n <span class=\"pl-s1\">self</span>.<span class=\"pl-c1\">send_header</span>(<span class=\"pl-s\">\"Access-Control-Allow-Origin\"</span>, <span class=\"pl-s\">\"*\"</span>)\n <span class=\"pl-s1\">self</span>.<span class=\"pl-c1\">end_headers</span>()\n <span 
class=\"pl-k\">def</span> <span class=\"pl-en\">do_OPTIONS</span>(<span class=\"pl-s1\">self</span>):\n <span class=\"pl-s1\">self</span>.<span class=\"pl-c1\">send_response</span>(<span class=\"pl-c1\">200</span>)\n <span class=\"pl-s1\">self</span>.<span class=\"pl-c1\">send_header</span>(<span class=\"pl-s\">\"Access-Control-Allow-Origin\"</span>, <span class=\"pl-s\">\"*\"</span>)\n <span class=\"pl-s1\">self</span>.<span class=\"pl-c1\">send_header</span>(<span class=\"pl-s\">\"Access-Control-Allow-Headers\"</span>, <span class=\"pl-s\">\"*\"</span>)\n <span class=\"pl-s1\">self</span>.<span class=\"pl-c1\">end_headers</span>()\n <span class=\"pl-k\">def</span> <span class=\"pl-en\">log_message</span>(<span class=\"pl-s1\">self</span>, <span class=\"pl-c1\">*</span><span class=\"pl-s1\">a</span>): <span class=\"pl-c\"># quiet</span>\n <span class=\"pl-k\">pass</span>\n\n<span class=\"pl-en\">HTTPServer</span>((<span class=\"pl-s\">\"127.0.0.1\"</span>, <span class=\"pl-c1\">9999</span>), <span class=\"pl-c1\">H</span>).<span class=\"pl-c1\">serve_forever</span>()</pre>\n<p>All this does is accept a POST request full of JSON and write that to the <code>/tmp/diag.json</code> file. 
It sends <code>Access-Control-Allow-Origin: *</code> headers (including from <code>OPTIONS</code> requests) so that code running on another domain can still communicate back to it.</p>\n<p>Then Claude injected this code into the template that it was loading in a browser:</p>\n<div class=\"highlight highlight-source-js\"><pre><span class=\"pl-k\">const</span> <span class=\"pl-s1\">host</span> <span class=\"pl-c1\">=</span> <span class=\"pl-smi\">document</span><span class=\"pl-kos\">.</span><span class=\"pl-en\">querySelector</span><span class=\"pl-kos\">(</span><span class=\"pl-s\">\"navigation-search\"</span><span class=\"pl-kos\">)</span><span class=\"pl-kos\">;</span>\n<span class=\"pl-k\">const</span> <span class=\"pl-s1\">ta</span> <span class=\"pl-c1\">=</span> <span class=\"pl-s1\">host</span><span class=\"pl-kos\">.</span><span class=\"pl-c1\">shadowRoot</span><span class=\"pl-kos\">.</span><span class=\"pl-en\">querySelector</span><span class=\"pl-kos\">(</span><span class=\"pl-s\">\"textarea\"</span><span class=\"pl-kos\">)</span><span class=\"pl-kos\">;</span>\n<span class=\"pl-k\">const</span> <span class=\"pl-s1\">cs</span> <span class=\"pl-c1\">=</span> <span class=\"pl-en\">getComputedStyle</span><span class=\"pl-kos\">(</span><span class=\"pl-s1\">ta</span><span class=\"pl-kos\">)</span><span class=\"pl-kos\">;</span>\n<span class=\"pl-en\">fetch</span><span class=\"pl-kos\">(</span><span class=\"pl-s\">\"http://127.0.0.1:9999/diag\"</span><span class=\"pl-kos\">,</span> <span class=\"pl-kos\">{</span>\n <span class=\"pl-c1\">method</span>: <span class=\"pl-s\">\"POST\"</span><span class=\"pl-kos\">,</span>\n <span class=\"pl-c1\">body</span>: <span class=\"pl-c1\">JSON</span><span class=\"pl-kos\">.</span><span class=\"pl-en\">stringify</span><span class=\"pl-kos\">(</span><span class=\"pl-kos\">{</span>\n <span class=\"pl-c1\">dpr</span>: <span class=\"pl-smi\">window</span><span class=\"pl-kos\">.</span><span 
class=\"pl-c1\">devicePixelRatio</span><span class=\"pl-kos\">,</span>\n <span class=\"pl-c1\">scrollWidth</span>: <span class=\"pl-s1\">ta</span><span class=\"pl-kos\">.</span><span class=\"pl-c1\">scrollWidth</span><span class=\"pl-kos\">,</span> <span class=\"pl-c1\">clientWidth</span>: <span class=\"pl-s1\">ta</span><span class=\"pl-kos\">.</span><span class=\"pl-c1\">clientWidth</span><span class=\"pl-kos\">,</span>\n <span class=\"pl-c1\">whiteSpace</span>: <span class=\"pl-s1\">cs</span><span class=\"pl-kos\">.</span><span class=\"pl-c1\">whiteSpace</span><span class=\"pl-kos\">,</span> <span class=\"pl-c1\">width</span>: <span class=\"pl-s1\">cs</span><span class=\"pl-kos\">.</span><span class=\"pl-c1\">width</span><span class=\"pl-kos\">,</span>\n <span class=\"pl-kos\">}</span><span class=\"pl-kos\">)</span><span class=\"pl-kos\">,</span>\n<span class=\"pl-kos\">}</span><span class=\"pl-kos\">)</span><span class=\"pl-kos\">;</span></pre></div>\n<p>This took measurements of the <code><textarea></code> inside the <code><navigation-search></code> Web Component and sent them to the server, which wrote them to a file on disk, which Claude could then read.</p>\n<p>Having figured out all of these tricks Fable... hit some invisible guardrail and downgraded itself to Opus. 
Thankfully Opus had access to the full transcript and could continue using the tricks pioneered by Fable, and shortly afterwards found, tested and verified <a href=\"https://github.com/datasette/datasette-agent/commit/a75a8b727b42c30ced1fc41dc8add7eb9f04fefe\">the fix</a>.</p>\n<p>I prompted Opus to:</p>\n<blockquote>\n<p><code>Write a report in /tmp/automation-report.md where you note down all of the tricks you have used in this session to test against real browsers on my computer, include runnable code examples</code></p>\n</blockquote>\n<p>Which produced <a href=\"https://gist.github.com/simonw/aef7f7db9ac992643110a74e43d6d42f\">this report</a>, which was invaluable for piecing together the details of what had happened for this post.</p>\n<p>I've shared <a href=\"https://gisthost.github.io/?cc14774f6d37eb67bf089f3ac3925f8f\">the full terminal transcript</a> of the Claude Code session as well.</p>\n<h4 id=\"a-review-of-everything-it-did\">A review of everything it did</h4>\n<p>Based on a screenshot and a one-line prompt, Claude Fable 5 + Claude Code:</p>\n<ul>\n<li>Figured out the recipe to run the local development server (with fake environment variables needed to get it running)</li>\n<li>Fired up a Playwright Chrome session</li>\n<li>Turned on the visible scrollbars setting for Chrome <code>defaults write com.google.chrome.for.testing AppleShowScrollBars Always</code> (it turned that off again later)</li>\n<li>Cycled through Firefox and WebKit in Playwright too, failing to recreate the bug</li>\n<li>Worked out my default browser was Safari</li>\n<li>Built a <code>textarea-scrollbar-test.html</code> HTML document</li>\n<li>Opened that in real (not Playwright) Firefox</li>\n<li>Found that <code>osascript -e 'tell application \"System Events\" to tell process \"firefox\" to id of window 1'</code> was blocked because \"osascript is not allowed assistive access\"</li>\n<li>Figured out that <code>uv run --with pyobjc-framework-Quartz python</code> workaround, 
described above</li>\n<li>Added JavaScript to the site templates in order to trigger the <code>/</code> key</li>\n<li>Built its own little Python CORS web server to capture JSON data</li>\n<li>Rewrote the template to capture that data and send it to the server</li>\n<li>Scripted its way through the Web Component shadow DOM to the information it needed</li>\n<li>Opened Safari to confirm the source of the bug</li>\n<li>Modified its custom template to hack in a potential fix</li>\n<li>Confirmed the hacked fix worked</li>\n<li>Reported back on how to fix the problem</li>\n</ul>\n<p>Like I said, relentlessly proactive!</p>\n<h4 id=\"an-estimate-of-the-cost\">An estimate of the cost</h4>\n<p>I'm currently on the $100/month Claude Max plan, which includes a generous allowance for Fable up until June 22nd after which Anthropic say they'll start charging full API prices for it.</p>\n<p>I'm using <a href=\"https://www.agentsview.io\">AgentsView</a> to track my spending (see <a href=\"https://til.simonwillison.net/llms/agentsview-custom-model-price\">this TIL</a>). Here's what AgentsView says this session would have cost me if I was paying full price for it:</p>\n<pre><code>~ % uvx agentsview session usage be8850a7-6119-46a0-b5d6-79c7fff5ae2b\nSession: be8850a7-6119-46a0-b5d6-79c7fff5ae2b\nAgent: claude\nOutput: 68606\nPeak ctx: 113178\nCost: ~$12.11 (claude-fable-5, claude-opus-4-8)\n</code></pre>\n<p>If you don't keep a close eye on it, Fable will quite happily burn $12 in tokens inventing new ways to debug your CSS.</p>\n<h4 id=\"i-really-need-to-lock-this-thing-down\">I really need to lock this thing down</h4>\n<p>On the one hand, watching Fable go to extreme lengths to get the information that it needed to debug what was, in the end, a two-line CSS fix, was <em>fascinating</em>.</p>\n<p>But on the other hand... 
this is a robust reminder that coding agents can do anything <em>you</em> can do by typing commands into a terminal - and frontier models know every trick in the book, and evidently a few that nobody has ever written down before.</p>\n<p>If Fable had been acting on malicious instructions - a prompt injection attack hidden in code or an issue thread, or something I'd carelessly pasted into my terminal - it's alarming to think quite how far it could go to exfiltrate data or cause other forms of mischief.</p>\n<p>Running coding agents outside of a sandbox has always been a bad idea - it's my top contender for <a href=\"https://simonwillison.net/2026/Jan/8/llm-predictions-for-2026/#1-year-a-challenger-disaster-for-coding-agent-security\">a Challenger disaster</a> incident, as described by Johann Rehberger in <a href=\"https://embracethered.com/blog/posts/2025/the-normalization-of-deviance-in-ai/\">The Normalization of Deviance in AI</a>.</p>\n<p>Fable is arguably smarter and hence more suspicious of potentially malicious instructions. But that smartness is very much a two-edged sword: if it <em>does</em> get subverted by instructions, the amount of damage it can do given its relentless proactivity is terrifying.</p>\n \n <p>Tags: <a href=\"https://simonwillison.net/tags/ai\">ai</a>, <a href=\"https://simonwillison.net/tags/prompt-injection\">prompt-injection</a>, <a href=\"https://simonwillison.net/tags/generative-ai\">generative-ai</a>, <a href=\"https://simonwillison.net/tags/llms\">llms</a>, <a href=\"https://simonwillison.net/tags/ai-assisted-programming\">ai-assisted-programming</a>, <a href=\"https://simonwillison.net/tags/coding-agents\">coding-agents</a>, <a href=\"https://simonwillison.net/tags/claude-code\">claude-code</a>, <a href=\"https://simonwillison.net/tags/claude-mythos\">claude-mythos</a></p>",
"url": "https://simonwillison.net/2026/Jun/11/fable-is-relentlessly-proactive/#atom-everything",
"published": "2026-06-11T23:35:17.000Z",
"updated": "2026-06-11T23:35:17.000Z",
"content": null,
"image": null,
"media": [],
"authors": [
{
"name": "Simon Willison",
"email": null,
"url": null
}
],
"categories": [
{
"label": "ai",
"term": "ai",
"url": null
},
{
"label": "prompt-injection",
"term": "prompt-injection",
"url": null
},
{
"label": "generative-ai",
"term": "generative-ai",
"url": null
},
{
"label": "llms",
"term": "llms",
"url": null
},
{
"label": "ai-assisted-programming",
"term": "ai-assisted-programming",
"url": null
},
{
"label": "coding-agents",
"term": "coding-agents",
"url": null
},
{
"label": "claude-code",
"term": "claude-code",
"url": null
},
{
"label": "claude-mythos",
"term": "claude-mythos",
"url": null
}
]
},
{
"id": "https://simonwillison.net/2026/Jun/11/datasette/#atom-everything",
"title": "datasette 1.0a33",
"description": "<p><strong>Release:</strong> <a href=\"https://github.com/simonw/datasette/releases/tag/1.0a33\">datasette 1.0a33</a></p>\n <p>This alpha is a significant step on the road to a stable 1.0, finally extending the <code>?_extra=</code> pattern I introduced <a href=\"https://docs.datasette.io/en/1.0a3/changelog.html#a3-2023-08-09\">in Datasette 1.0a3</a> to cover queries and rows in addition to tables. That pattern is also <a href=\"https://docs.datasette.io/en/latest/json_api.html#expanding-json-responses\">now documented</a>!</p>\n<p>I wrote a whole lot more about the new release on the Datasette project blog: <strong><a href=\"http://datasette.io/blog/2026/api-extras/\">Datasette 1.0a33 with JSON extras in the API</a></strong>.</p>\n<p>Because API explorer tools are almost free to build now I had Claude Fable 5 in Claude Code (for <a href=\"https://gist.github.com/simonw/d8bf1a8f36e28fbd595cede946e0ab6d\">the plan</a>) and GPT-5.5 xhigh in Codex Desktop (for <a href=\"https://gist.github.com/simonw/12d5e09797072a6807d7b9cfcc8ff6b7\">the implementation</a>) build me this <a href=\"https://tools.simonwillison.net/datasette-extras-explorer\">custom extras API explorer</a> to help demonstrate the feature:</p>\n<p><img alt=\"Screenshot of a web application titled \"Datasette extras explorer\". A URL input field contains https://latest.datasette.io/fixtures/facetable.json with a teal Explore button next to it. Below, a left panel labeled EXTRAS (30) lists checkboxes: all_columns - All columns in the table, regardless of _col/_nocol filtering; column_types - Column type assignments for this table; columns (checked) - Column names returned by this query; count - Total count of rows matching these filters; count_sql - SQL query used to calculate the total count; custom_table_templates - Custom template names considered for this table; database - Database name; database_color - Color assigned to the database. 
A right panel labeled RESPONSE shows GET /fixtures/fac… with Copy JSON and Copy URL buttons, then a dark JSON viewer showing 200 - 9.9 KB - 114ms and JSON: \"ok\": true, \"next\": null, \"columns\": (highlighted array) \"pk\", \"created\", \"planet_int\", \"on_earth\", \"state\", \"_city_id\", \"_neighborhood\", \"tags\", \"complex_array\", \"distinct_some_null\", \"n\", \"rows\": list of objects.\" src=\"https://static.simonwillison.net/static/2026/extras-explorer.png\" /></p>\n \n \n <p>Tags: <a href=\"https://simonwillison.net/tags/projects\">projects</a>, <a href=\"https://simonwillison.net/tags/datasette\">datasette</a>, <a href=\"https://simonwillison.net/tags/annotated-release-notes\">annotated-release-notes</a>, <a href=\"https://simonwillison.net/tags/ai-assisted-programming\">ai-assisted-programming</a></p>",
"url": "https://simonwillison.net/2026/Jun/11/datasette/#atom-everything",
"published": "2026-06-11T15:26:49.000Z",
"updated": "2026-06-11T15:26:49.000Z",
"content": null,
"image": null,
"media": [],
"authors": [
{
"name": "Simon Willison",
"email": null,
"url": null
}
],
"categories": [
{
"label": "projects",
"term": "projects",
"url": null
},
{
"label": "datasette",
"term": "datasette",
"url": null
},
{
"label": "annotated-release-notes",
"term": "annotated-release-notes",
"url": null
},
{
"label": "ai-assisted-programming",
"term": "ai-assisted-programming",
"url": null
}
]
},
{
"id": "https://simonwillison.net/2026/Jun/11/asyncinject/#atom-everything",
"title": "asyncinject 0.7",
"description": "<p><strong>Release:</strong> <a href=\"https://github.com/simonw/asyncinject/releases/tag/0.7\">asyncinject 0.7</a></p>\n <p>I built this utility library to support an <code>asyncio</code> dependency injection pattern a few years ago. I was using it with Datasette and Claude Fable 5 spotted some bugs in the dependency which it then fixed for me. It's a very proactive model!</p>\n \n \n <p>Tags: <a href=\"https://simonwillison.net/tags/async\">async</a>, <a href=\"https://simonwillison.net/tags/projects\">projects</a>, <a href=\"https://simonwillison.net/tags/python\">python</a>, <a href=\"https://simonwillison.net/tags/claude-mythos\">claude-mythos</a></p>",
"url": "https://simonwillison.net/2026/Jun/11/asyncinject/#atom-everything",
"published": "2026-06-11T06:28:09.000Z",
"updated": "2026-06-11T06:28:09.000Z",
"content": null,
"image": null,
"media": [],
"authors": [
{
"name": "Simon Willison",
"email": null,
"url": null
}
],
"categories": [
{
"label": "async",
"term": "async",
"url": null
},
{
"label": "projects",
"term": "projects",
"url": null
},
{
"label": "python",
"term": "python",
"url": null
},
{
"label": "claude-mythos",
"term": "claude-mythos",
"url": null
}
]
},
{
"id": "https://simonwillison.net/2026/Jun/11/anthropic-walks-back-policy/#atom-everything",
"title": "Anthropic Walks Back Policy That Could Have ‘Sabotaged’ AI Researchers Using Claude",
"description": "<p><strong><a href=\"https://www.wired.com/story/anthropic-responds-to-backlash-on-claudes-secret-sabotage-on-ai-research/\">Anthropic Walks Back Policy That Could Have ‘Sabotaged’ AI Researchers Using Claude</a></strong></p>\nBig scoop for Maxwell Zeff at Wired:</p>\n<blockquote>\n<p>“We’re changing Fable 5’s safeguards for frontier LLM development to make them visible.” Anthropic said in a statement to WIRED. “We made the wrong tradeoff and we apologize for not getting the balance right.”</p>\n</blockquote>\n<p>There's been a <em>huge</em> outcry about Anthropic's policy, <a href=\"https://simonwillison.net/2026/Jun/10/if-claude-fable-stops-helping-you/\">tucked away in their system card</a>, that Claude Fable/Mythos would identify \"requests targeting frontier LLM development\" and \"limit effectiveness\" without notifying the user.</p>\n<p>It's good news that they're dropping the invisible aspect of this. It would be a whole lot better of they dropped this category of refusals entirely.</p>\n<p><strong>Update</strong>: More details from <a href=\"https://twitter.com/claudedevs/status/2064949876463645026\">@ClaudeDevs on Twitter</a>:</p>\n<blockquote>\n<p>We’re rolling out changes to make Fable 5’s safeguards for frontier LLM development visible.</p>\n<p>Starting this week, flagged requests will visibly fall back to Opus 4.8—the same as our safeguards for cyber and bio. You will see this every time it happens. On the API, any flagged requests will return a reason for their refusal (coming to server-side fallback in the next few days).</p>\n<p>We wanted to deploy Fable 5 to our users quickly and safely. Visible safeguards can be probed, so they have to be robust, which takes time to get right. Invisible safeguards can be targeted more narrowly, allowing us to ship quickly with very few false positives. We went with invisible safeguards for this reason—and that was the wrong tradeoff. 
You should have visibility into the safeguards we have in place, and why. We’re sorry for not getting the balance right.</p>\n</blockquote>\n\n <p><small>Via <a href=\"https://twitter.com/zeffmax/status/2064910040503627917\">@zeffmax</a></small></p>\n\n\n <p>Tags: <a href=\"https://simonwillison.net/tags/ai\">ai</a>, <a href=\"https://simonwillison.net/tags/generative-ai\">generative-ai</a>, <a href=\"https://simonwillison.net/tags/llms\">llms</a>, <a href=\"https://simonwillison.net/tags/anthropic\">anthropic</a>, <a href=\"https://simonwillison.net/tags/claude\">claude</a>, <a href=\"https://simonwillison.net/tags/ai-ethics\">ai-ethics</a>, <a href=\"https://simonwillison.net/tags/claude-mythos\">claude-mythos</a></p>",
"url": "https://simonwillison.net/2026/Jun/11/anthropic-walks-back-policy/#atom-everything",
"published": "2026-06-11T03:45:49.000Z",
"updated": "2026-06-11T03:45:49.000Z",
"content": null,
"image": null,
"media": [],
"authors": [
{
"name": "Simon Willison",
"email": null,
"url": null
}
],
"categories": [
{
"label": "ai",
"term": "ai",
"url": null
},
{
"label": "generative-ai",
"term": "generative-ai",
"url": null
},
{
"label": "llms",
"term": "llms",
"url": null
},
{
"label": "anthropic",
"term": "anthropic",
"url": null
},
{
"label": "claude",
"term": "claude",
"url": null
},
{
"label": "ai-ethics",
"term": "ai-ethics",
"url": null
},
{
"label": "claude-mythos",
"term": "claude-mythos",
"url": null
}
]
},
{
"id": "https://simonwillison.net/2026/Jun/10/datasette-agent/#atom-everything",
"title": "datasette-agent 0.2a0",
"description": "<p><strong>Release:</strong> <a href=\"https://github.com/datasette/datasette-agent/releases/tag/0.2a0\">datasette-agent 0.2a0</a></p>\n <p>Highlights from the release notes:</p>\n<blockquote>\n<ul>\n<li>Tools can now ask the user questions mid-execution. Tools that declare a <code>context</code> parameter receive a <code>ToolContext</code> object, and <code>await context.ask_user(...)</code> can ask a yes/no, multiple-choice (<code>options=[...]</code>) or free-text (<code>free_text=True</code>) question. While a question is unanswered the agent turn suspends: the question renders as a form in the chat UI and persists to the internal database, so suspended conversations survive a server restart. Once answered, the tool re-executes from the top with stored answers replayed, so call <code>ask_user()</code> before performing side effects. <a href=\"https://github.com/datasette/datasette-agent/pull/20\">#20</a></li>\n<li>New built-in <code>save_query</code> tool: the agent can save SQL it has written as a <a href=\"https://docs.datasette.io/en/latest/sql_queries.html#saved-queries\">Datasette stored query</a>. Saving always requires human approval - the agent shows the full SQL plus the proposed name, database and visibility, and nothing is stored until you click Yes. 
<a href=\"https://github.com/datasette/datasette-agent/pull/20\">#20</a></li>\n</ul>\n</blockquote>\n<p>The <code>ask_user()</code> feature was enabled by the new LLM alpha I <a href=\"https://simonwillison.net/2026/Jun/9/claude-fable-5/#adding-features-to-datasette-agent-and-llm-using-claude-code\">built yesterday</a> with the help of Claude Fable 5.</p>\n \n \n <p>Tags: <a href=\"https://simonwillison.net/tags/ai\">ai</a>, <a href=\"https://simonwillison.net/tags/datasette\">datasette</a>, <a href=\"https://simonwillison.net/tags/generative-ai\">generative-ai</a>, <a href=\"https://simonwillison.net/tags/llms\">llms</a>, <a href=\"https://simonwillison.net/tags/datasette-agent\">datasette-agent</a></p>",
"url": "https://simonwillison.net/2026/Jun/10/datasette-agent/#atom-everything",
"published": "2026-06-10T23:57:27.000Z",
"updated": "2026-06-10T23:57:27.000Z",
"content": null,
"image": null,
"media": [],
"authors": [
{
"name": "Simon Willison",
"email": null,
"url": null
}
],
"categories": [
{
"label": "ai",
"term": "ai",
"url": null
},
{
"label": "datasette",
"term": "datasette",
"url": null
},
{
"label": "generative-ai",
"term": "generative-ai",
"url": null
},
{
"label": "llms",
"term": "llms",
"url": null
},
{
"label": "datasette-agent",
"term": "datasette-agent",
"url": null
}
]
},
{
"id": "https://simonwillison.net/2026/Jun/10/diffusiongemma/#atom-everything",
"title": "DiffusionGemma",
"description": "<p><strong><a href=\"https://blog.google/innovation-and-ai/technology/developers-tools/diffusion-gemma-faster-text-generation/\">DiffusionGemma</a></strong></p>\nLast May Google briefly released an experimental Gemini Diffusion model. I <a href=\"https://simonwillison.net/2025/May/21/gemini-diffusion/\">tried the preview at the time</a> and recorded it running at 857 tokens/second. It was an exciting model, but Google made no further announcements about it.</p>\n<p>That research has returned in the best possible way: as a new open weight (Apache 2 licensed) Gemma model, <a href=\"https://huggingface.co/google/diffusiongemma-26B-A4B-it\">google/diffusiongemma-26B-A4B-it</a>.</p>\n<p>NVIDIA are currently <a href=\"https://build.nvidia.com/google/diffusiongemma-26b-a4b-it\">hosting the model for free</a> on their NIM cloud API. I used that API to <a href=\"https://tools.simonwillison.net/markdown-svg-renderer#url=https%3A%2F%2Fgist.github.com%2Fsimonw%2Fe5e234a6dc6eef61e209ce1629620042\">generate this pelican</a>, which took 4.4s (according to <code>time uv run generate.py</code>) to return 2,409 tokens - so at least 500 tokens/second.</p>\n<p><img alt=\"Flat minimalist illustration of a white pelican with a large orange beak riding a red bicycle with black wheels, against a pale blue background with a green line representing the ground\" src=\"https://static.simonwillison.net/static/2026/diffusiongemma-pelican.png\" />\n\n <p><small></small>Via <a href=\"https://news.ycombinator.com/item?id=48478471\">Hacker News</a></small></p>\n\n\n <p>Tags: <a href=\"https://simonwillison.net/tags/google\">google</a>, <a href=\"https://simonwillison.net/tags/ai\">ai</a>, <a href=\"https://simonwillison.net/tags/generative-ai\">generative-ai</a>, <a href=\"https://simonwillison.net/tags/llms\">llms</a>, <a href=\"https://simonwillison.net/tags/nvidia\">nvidia</a>, <a href=\"https://simonwillison.net/tags/pelican-riding-a-bicycle\">pelican-riding-a-bicycle</a>, <a 
href=\"https://simonwillison.net/tags/gemma\">gemma</a>, <a href=\"https://simonwillison.net/tags/llm-release\">llm-release</a>, <a href=\"https://simonwillison.net/tags/llm-performance\">llm-performance</a></p>",
"url": "https://simonwillison.net/2026/Jun/10/diffusiongemma/#atom-everything",
"published": "2026-06-10T20:00:54.000Z",
"updated": "2026-06-10T20:00:54.000Z",
"content": null,
"image": null,
"media": [],
"authors": [
{
"name": "Simon Willison",
"email": null,
"url": null
}
],
"categories": [
{
"label": "google",
"term": "google",
"url": null
},
{
"label": "ai",
"term": "ai",
"url": null
},
{
"label": "generative-ai",
"term": "generative-ai",
"url": null
},
{
"label": "llms",
"term": "llms",
"url": null
},
{
"label": "nvidia",
"term": "nvidia",
"url": null
},
{
"label": "pelican-riding-a-bicycle",
"term": "pelican-riding-a-bicycle",
"url": null
},
{
"label": "gemma",
"term": "gemma",
"url": null
},
{
"label": "llm-release",
"term": "llm-release",
"url": null
},
{
"label": "llm-performance",
"term": "llm-performance",
"url": null
}
]
},
{
"id": "https://simonwillison.net/2026/Jun/10/jeremy-howard/#atom-everything",
"title": "Quoting Jeremy Howard",
"description": "<blockquote cite=\"https://twitter.com/jeremyphoward/status/2064595816875217362\"><p>Easy solution to slow down recursive AI self improvement:</p>\n<ul>\n<li>The lab with the top-ranked model must agree THEY must not use it for working on frontier AI</li>\n<li>But everyone else should have access to it.</li>\n</ul>\n<p>By definition, this means the frontier doesn't advance.</p>\n<p>It also has the critical benefit of avoiding a dangerous power imbalance.</p>\n<p>Anthropic has chosen the <em>opposite</em> of the safe path: they are allowing themselves, the current top lab, to use their top model for frontier AI research. They've said they'll sabotage others who try.</p>\n<p>This means the AI frontier advances, & power imbalance increases.</p>\n<p>(To be clear, <em>I</em> don't think we should try to slow down recursive AI self improvement - I think we should open it up and democratize it as much as possible. My point is: if <em>you</em> claim we should slow down, and you have the best model, you should ensure your org can't use it.)</p></blockquote>\n<p class=\"cite\">— <a href=\"https://twitter.com/jeremyphoward/status/2064595816875217362\">Jeremy Howard</a>, in a Twitter thread</p>\n\n <p>Tags: <a href=\"https://simonwillison.net/tags/ai-ethics\">ai-ethics</a>, <a href=\"https://simonwillison.net/tags/anthropic\">anthropic</a>, <a href=\"https://simonwillison.net/tags/generative-ai\">generative-ai</a>, <a href=\"https://simonwillison.net/tags/claude-mythos\">claude-mythos</a>, <a href=\"https://simonwillison.net/tags/jeremy-howard\">jeremy-howard</a>, <a href=\"https://simonwillison.net/tags/ai\">ai</a>, <a href=\"https://simonwillison.net/tags/llms\">llms</a></p>",
"url": "https://simonwillison.net/2026/Jun/10/jeremy-howard/#atom-everything",
"published": "2026-06-10T15:23:34.000Z",
"updated": "2026-06-10T15:23:34.000Z",
"content": null,
"image": null,
"media": [],
"authors": [
{
"name": "Simon Willison",
"email": null,
"url": null
}
],
"categories": [
{
"label": "ai-ethics",
"term": "ai-ethics",
"url": null
},
{
"label": "anthropic",
"term": "anthropic",
"url": null
},
{
"label": "generative-ai",
"term": "generative-ai",
"url": null
},
{
"label": "claude-mythos",
"term": "claude-mythos",
"url": null
},
{
"label": "jeremy-howard",
"term": "jeremy-howard",
"url": null
},
{
"label": "ai",
"term": "ai",
"url": null
},
{
"label": "llms",
"term": "llms",
"url": null
}
]
},
{
"id": "https://simonwillison.net/2026/Jun/10/if-claude-fable-stops-helping-you/#atom-everything",
"title": "If Claude Fable stops helping you, you'll never know",
"description": "<p><strong><a href=\"https://jonready.com/blog/posts/claude-fable5-is-allowed-to-sabotage-your-app-if-youre-a-competitor.html\">If Claude Fable stops helping you, you'll never know</a></strong></p>\nJonathon Ready highlights one of the more eyebrow-raising details from the <a href=\"https://www-cdn.anthropic.com/d00db56fa754a1b115b6dd7cb2e3c342ee809620.pdf\">319 page system card</a> for Fable 5 and Mythos 5. Here's a longer excerpt, highlights mine:</p>\n<blockquote>\n<p>In light of the ability of recent models to <a href=\"https://www.anthropic.com/institute/recursive-self-improvement\">accelerate their own development</a>, we’ve <strong>implemented new interventions</strong> that limit Claude’s effectiveness for requests targeting frontier LLM development (for example, on <strong>building pretraining pipelines, distributed training infrastructure, or ML accelerator design</strong>). Using Claude to develop competing models already violates our <a href=\"https://www.anthropic.com/legal/consumer-terms\">Terms of Service</a>, but enforcing this restriction through our safeguards avoids accelerating the actors most willing to violate these terms.</p>\n<p>Unlike our interventions for cybersecurity, biology and chemistry, and distillation attempts, <strong>these safeguards will not be visible to the user</strong>. Fable 5 will not fall back to a different model. Instead, the safeguards will limit effectiveness through methods such as prompt modification, steering vectors, or parameter-efficient fine-tuning (PEFT). These interventions will not affect the vast majority of coding work. We estimate they will impact ~0.03% of traffic, concentrated in fewer than 0.1% of organizations.</p>\n</blockquote>\n<p>I believe this is the first time Anthropic have announced these kinds of silent interventions. The justification still feels pretty science-fiction to me - the linked article talks about \"recursive self-improvement\". 
I'm not at all keen on a model that silently corrupts its replies to questions about \"ML accelerator design\" purely to slow down research that might conflict with Anthropic's own goals!</p>\n<p><strong>Update</strong>: Anthropic <a href=\"https://simonwillison.net/2026/Jun/11/anthropic-walks-back-policy/\">walked back this policy</a> in the face of widespread outrage from the research community.</p>\n\n <p><small>Via <a href=\"https://news.ycombinator.com/item?id=48467896\">Hacker News</a></small></p>\n\n\n <p>Tags: <a href=\"https://simonwillison.net/tags/ai\">ai</a>, <a href=\"https://simonwillison.net/tags/generative-ai\">generative-ai</a>, <a href=\"https://simonwillison.net/tags/llms\">llms</a>, <a href=\"https://simonwillison.net/tags/anthropic\">anthropic</a>, <a href=\"https://simonwillison.net/tags/claude\">claude</a>, <a href=\"https://simonwillison.net/tags/ai-ethics\">ai-ethics</a>, <a href=\"https://simonwillison.net/tags/claude-mythos\">claude-mythos</a></p>",
"url": "https://simonwillison.net/2026/Jun/10/if-claude-fable-stops-helping-you/#atom-everything",
"published": "2026-06-10T00:37:25.000Z",
"updated": "2026-06-10T00:37:25.000Z",
"content": null,
"image": null,
"media": [],
"authors": [
{
"name": "Simon Willison",
"email": null,
"url": null
}
],
"categories": [
{
"label": "ai",
"term": "ai",
"url": null
},
{
"label": "generative-ai",
"term": "generative-ai",
"url": null
},
{
"label": "llms",
"term": "llms",
"url": null
},
{
"label": "anthropic",
"term": "anthropic",
"url": null
},
{
"label": "claude",
"term": "claude",
"url": null
},
{
"label": "ai-ethics",
"term": "ai-ethics",
"url": null
},
{
"label": "claude-mythos",
"term": "claude-mythos",
"url": null
}
]
},
{
"id": "https://simonwillison.net/2026/Jun/9/claude-fable-5/#atom-everything",
"title": "Initial impressions of Claude Fable 5",
"description": "<p>I didn't have early access to today's <a href=\"https://www.anthropic.com/news/claude-fable-5-mythos-5\">Claude Fable 5</a> release, but I've spent the past ~5.5 hours putting it through its paces. My initial impressions are that this is something of a <em>beast</em>. It's slow, expensive and has been quite happily churning through everything I've thrown at it so far. As is frequently the case with current frontier models the challenge is finding tasks that it can't do.</p>\n<p>First, let's review the key characteristics.</p>\n<p>Anthropic claim that <a href=\"https://www.anthropic.com/news/claude-fable-5-mythos-5\">Claude Fable 5</a> offers the same performance as Claude Mythos 5, except with much more strict guardrails in place to prevent it being used for harmful things. Those guardrails trigger often enough that the Claude API has new mechanisms for letting you know when you hit them, and even has a <a href=\"https://platform.claude.com/docs/en/build-with-claude/refusals-and-fallback\">new option</a> to request it falls back to another model automatically if something gets rejected.</p>\n<p>Claude Mythos 5 is out today as well, <a href=\"https://platform.claude.com/docs/en/about-claude/models/introducing-claude-fable-5-and-claude-mythos-5\">Anthropic say it</a> \"Shares Claude Fable 5's capabilities without the safety classifiers\".</p>\n<p>The models have a 1 million token context window, 128,000 maximum output tokens and a knowledge cut-off date of January 2026.</p>\n<p>They are priced at twice the price of Claude Opus 4.5/4.6/4.7/4.8: $10/million input tokens and $50/million output tokens. 
There's no increase in price for longer context usage.</p>\n<p>Other than that the <a href=\"https://platform.claude.com/docs/en/about-claude/models/introducing-claude-fable-5-and-claude-mythos-5\">upgrade guide</a> is substantially thinner than <a href=\"https://platform.claude.com/docs/en/about-claude/models/whats-new-claude-4-8\">the similar guide for Opus 4.8</a>.</p>\n<h4 id=\"the-big-model-smell\">The big model smell</h4>\n<p>The best way to describe Fable is that it feels <em>big</em>. Not just in terms of speed and cost, but also in how much it knows.</p>\n<p>Here's an example prompt, which I used to compare Fable's knowledge with Opus 4.8 (typo included):</p>\n<blockquote>\n<p><code>List all of Simon Willion's open source projects, most recent first, each with a rough date of when they were first released</code></p>\n</blockquote>\n<p>Opus 4.8's reply <a href=\"https://gist.github.com/simonw/e77abf645156ade8c5c610f524226dcb#response-1\">started like this</a>:</p>\n<blockquote>\n<p>I don't have a reliable, comprehensive, and date-verified list of all of Simon Willison's open source projects, and I want to be honest about that rather than risk giving you inaccurate dates or fabricated entries.</p>\n<p>Here's what I can tell you with reasonable confidence:</p>\n<p><strong>Well-known projects by Simon Willison:</strong></p>\n<ul>\n<li>\n<p><strong>LLM</strong> (~2023) – A command-line tool and Python library for interacting with large language models.</p>\n</li>\n<li>\n<p><strong>Datasette</strong> (~2017) – A tool for exploring and publishing data, probably his most prominent recent project.</p>\n</li>\n<li>\n<p><strong>sqlite-utils</strong> (~2018-2019) – A Python library and CLI tool for manipulating SQLite databases, often used alongside Datasette.</p>\n</li>\n<li>\n<p><strong>Django</strong> (2005) – He is one of the co-creators of the Django web framework, along with Adrian Holovaty, originally developed at the Lawrence Journal-World 
newspaper.</p>\n</li>\n</ul>\n<p>He also maintains a large ecosystem of <strong>Datasette plugins</strong> and many smaller utilities (often with names like <code>shot-scraper</code>, various <code>*-to-sqlite</code> tools for importing data, etc.).</p>\n</blockquote>\n<p>That's not bad. Here's <a href=\"https://gist.github.com/simonw/e77abf645156ade8c5c610f524226dcb#response\">the start of the response</a> for the exact same prompt from Fable 5 - both models were running without access to search:</p>\n<blockquote>\n<p>I think you mean <strong>Simon Willison</strong> — the prolific open source developer, co-creator of Django, and creator of Datasette.</p>\n<p>An important caveat upfront: a truly complete list isn't feasible. Simon Willison is famously prolific — he has published <strong>hundreds</strong> of repositories (Datasette alone has 100+ plugins, many written by him), and my knowledge has a cutoff date, so I'll be missing his most recent work. Below are his <strong>notable</strong> projects with approximate initial release dates, most recent first:</p>\n<h2 id=\"notable-projects-approximate-dates-\">Notable projects (approximate dates)</h2>\n<ul>\n<li>\n<strong>files-to-prompt</strong> (April 2024) – concatenates files into a single prompt for LLMs</li>\n<li>\n<strong>datasette-extract</strong> (2024) – plugin for extracting structured data using LLMs</li>\n<li>\n<strong>LLM</strong> (May–June 2023) – CLI tool and Python library for interacting with large language models, plus a large ecosystem of plugins (llm-gpt4all, llm-mistral, llm-claude, etc., 2023 onward)</li>\n<li>\n<strong>symbex</strong> (June 2023) – search Python code for symbols</li>\n<li>\n<strong>ttok</strong> and <strong>strip-tags</strong> (May 2023) – CLI tools for token counting and HTML cleanup for LLM pipelines</li>\n<li>\n<strong>datasette-lite</strong> (May 2022) – Datasette running entirely in the browser via WebAssembly/Pyodide</li>\n<li>\n<strong>shot-scraper</strong> (March 2022) 
– automated website screenshots via Playwright</li>\n<li>\n<strong>s3-credentials</strong> (November 2021) – CLI for creating scoped AWS S3 credentials</li>\n<li>\n<strong>django-sql-dashboard</strong> (2021) – SQL reporting dashboards for Django</li>\n<li>\n<strong>Dogsheep suite</strong> (2019) – personal analytics tools: twitter-to-sqlite, github-to-sqlite, healthkit-to-sqlite, dogsheep-beta, etc.</li>\n<li>\n<strong>sqlite-utils</strong> (2018) – CLI and Python library for manipulating SQLite databases</li>\n<li>\n<strong>Datasette</strong> (November 2017) – his flagship project; tool for exploring and publishing data</li>\n<li>\n<strong>csvs-to-sqlite</strong> (2017) – convert CSV files to SQLite</li>\n<li>\n<strong>Various early tools</strong> (~2007–2010) – soupselect, json-head, geocoders, and others</li>\n<li>\n<strong>Django</strong> (developed 2003–2005, open-sourced July 2005) – co-created with Adrian Holovaty at the Lawrence Journal-World</li>\n</ul>\n</blockquote>\n<p>(Here's <a href=\"https://gist.github.com/simonw/8d05ce836d44bd8543055c9614f7c478\">GPT-5.5 for good measure</a>. It listed even more projects than Fable did!)</p>\n<p>In the past I've stated that I don't care about how much models <em>know</em> - I want them to be able to manipulate text and code in useful ways and actively look up the information they need via search tools, not bake it into their weights.</p>\n<p>But knowledge like this is a reasonably good proxy for model size - you can cram a whole lot more details about the world into a larger number of parameters.</p>\n<p>Does <em>knowing more stuff</em> mean the model is better at the tasks we pose to it? I can certainly imagine how a coding model with deeper knowledge of modern libraries and patterns could crunch through coding tasks more effectively.</p>\n<p>Is Fable really bigger than Opus? 
Anthropic haven't said anything about model size, so all we have are tea-leaves, but the speed, pricing and my own poking at its knowledge make me think that it's a large model. Maybe the largest yet from any vendor.</p>\n<h4 id=\"using-fable-in-claude-ai\">Using Fable in Claude.ai</h4>\n<p>Anthropic made Fable 5 available across all of their surfaces - the <a href=\"https://claude.ai/\">Claude.ai</a> chat interface, Claude Code for web, Claude Code CLI and Claude Cowork as well. The model is available \"until June 22nd\" on the subscription plans (I'm on $100/month Max at the moment), after which it will be billed extra.</p>\n<p>Claude.ai is often under-estimated. Since <a href=\"https://simonwillison.net/2025/Sep/9/claude-code-interpreter/\">September 2025</a> every chat has had access to a full container environment to run code, including the ability to install additional packages and even clone repositories directly from GitHub.</p>\n<p>Last week <a href=\"https://simonwillison.net/2026/Jun/6/micropython-in-a-sandbox/\">I released micropython-wasm</a>, a Python library that uses <a href=\"https://wasmtime.dev/\">wasmtime</a> to run a custom build of MicroPython in WebAssembly to act as a sandbox for untrusted Python code.</p>\n<p>I decided to see if Fable could upgrade that to running full Python instead. 
I started with this prompt:</p>\n<blockquote>\n<p><code>Clone simonw/micropython-wasm from GitHub and research how this could use a full Python as opposed to MicroPython</code></p>\n</blockquote>\n<p>Fable identified that it could use Brett Cannon's <a href=\"https://github.com/brettcannon/cpython-wasi-build/releases\">cpython-wasi-build</a> builds for this, but was unable to download them itself due to environment restrictions.</p>\n<p>So I grabbed the two zip files from that page and uploaded them to Claude:</p>\n<blockquote>\n<p><code>Here's the Brett Cannon builds</code> (<code>python-3.zip</code>, <code>_build-python-3.zip</code> as attachments)</p>\n</blockquote>\n<p>And that was that. It churned away for a few minutes and got the entire thing working.</p>\n<p>Part of the response included:</p>\n<blockquote>\n<p>I tried the cleaner single-zip-stdlib approach to shrink the filesystem surface, but CPython's <code>getpath</code> bootstrap fails to find <code>encodings</code> from inside a zip without more prefix finessing — the directory-preopen approach works reliably, so that's what the PoC uses. The zip path is solvable but needs <code>_PYTHONHOME</code>/frozen-getpath work.</p>\n</blockquote>\n<p>So I said:</p>\n<blockquote>\n<p><code>Try a bit more at the single-zip-stdlib problem</code></p>\n</blockquote>\n<p>Then a little later:</p>\n<blockquote>\n<p><code>I want a wheel that has the whole system in it, the Python wrappers and the WASM files and the stdlibrary, so I can do uv run --with path-to-whl python -c \"demo code\"</code></p>\n</blockquote>\n<p>... and it gave me <a href=\"https://static.simonwillison.net/static/cors-allow/2026/cpython_wasm-0.1.0-py3-none-any.whl\">this 13.9MB cpython_wasm-0.1.0-py3-none-any.whl</a> file. 
You can try running Python code in a sandbox using that wheel URL and <code>uv</code> like this:</p>\n<div class=\"highlight highlight-source-shell\"><pre>uv run --with https://static.simonwillison.net/static/cors-allow/2026/cpython_wasm-0.1.0-py3-none-any.whl \\\n cpython-wasm -c <span class=\"pl-s\"><span class=\"pl-pds\">'</span>print(45 ** 56)<span class=\"pl-pds\">'</span></span></pre></div>\n<p>Here's <a href=\"https://claude.ai/share/a73b8b8b-8ebc-4fef-9e5c-7438e5e7ae35\">the full chat transcript</a>.</p>\n<p>This was a <em>very</em> strong start.</p>\n<h4 id=\"adding-features-to-datasette-agent-and-llm-using-claude-code\">Adding features to Datasette Agent and LLM using Claude Code</h4>\n<p>Before I'd realized it was Fable day, my stretch goal for today was to add a new feature to <a href=\"https://agent.datasette.io/\">Datasette Agent</a>: I wanted tool calls within that agent software to gain the ability to pause mid-execution and request approval directly from the user.</p>\n<p>This felt like a suitably meaty task to throw at the new model.</p>\n<p>Over the course of the day Fable not only <a href=\"https://github.com/datasette/datasette-agent/pull/20\">solved that problem</a>, it also identified and then implemented four issues in my underlying LLM library that would help support this kind of advanced pause-resume mechanism in tool calls.</p>\n<p>It got everything working first using somewhat gnarly hacks, but the moment I told it that changes to LLM itself were in scope it set to work unraveling the hacks and turning them into supported features of LLM instead.</p>\n<p>My stretch goal turned into <a href=\"https://llm.datasette.io/en/latest/changelog.html#a3-2026-06-09\">LLM 0.32a3</a>, almost entirely written by Fable. 
Here are the release notes:</p>\n<blockquote>\n<p>Driven by the needs of <a href=\"https://github.com/datasette/datasette-agent\">Datasette Agent</a>'s human-in-the-loop <code>ask_user()</code> feature, made the following improvements to how tool calls work:</p>\n<ul>\n<li>Tool implementations can declare a parameter named <code>llm_tool_call</code> in order to be passed the <code>llm.ToolCall</code> object for the current invocation. This allows them to access the current <code>llm_tool_call.tool_call_id</code>. See <a href=\"https://llm.datasette.io/en/latest/python-api.html#python-api-tools-llm-tool-call\">Accessing the tool call from inside a tool</a>. <a href=\"https://github.com/simonw/llm/pull/1480\">#1480</a>\n</li>\n<li>Every tool call is now guaranteed a unique <code>tool_call_id</code> - providers that do not supply one get a synthesized <code>tc_</code>-prefixed ULID. <a href=\"https://github.com/simonw/llm/pull/1481\">#1481</a>\n</li>\n<li>Tools can raise a <code>llm.PauseChain</code> exception to cleanly pause the tool chain, useful for things like waiting for human approval. The exception propagates to the caller with <code>.tool_call</code> and <code>.tool_results</code> (completed sibling results) attached, and no model call is made with a placeholder result. See <a href=\"https://llm.datasette.io/en/latest/python-api.html#python-api-tools-pause\">Pausing a chain from inside a tool</a>. <a href=\"https://github.com/simonw/llm/pull/1482\">#1482</a>\n</li>\n<li>Failure semantics for concurrent tool execution: async sibling tool calls always run to completion before a pause or hook exception propagates. <a href=\"https://github.com/simonw/llm/pull/1482\">#1482</a>\n</li>\n<li>Chains can now resume from a <code>messages=</code> history ending in unresolved tool calls: the calls are executed through the normal <code>before_call</code>/<code>after_call</code> machinery before the first model call, skipping any that already have results. 
The <code>execute_tool_calls()</code> method also accepts a new optional <code>tool_calls_list=</code> argument for executing an explicit list of <code>ToolCall</code> objects in place of the calls requested by the response. See <a href=\"https://llm.datasette.io/en/latest/python-api.html#python-api-tools-resume\">Resuming a chain with pending tool calls</a>. <a href=\"https://github.com/simonw/llm/pull/1482\">#1482</a>\n</li>\n<li>Fixed a bug where the async tool executor silently dropped calls to tools not present in <code>tools=</code> - these now return <code>Error: tool \"...\" does not exist</code> results, matching the sync executor. <a href=\"https://github.com/simonw/llm/pull/1483\">#1483</a>\n</li>\n</ul>\n</blockquote>\n<p>I'm really impressed with the quality of API design, tests, code and documentation that Fable put together for this. I spent several hours on it today, but it feels like several days' worth of work.</p>\n<h4 id=\"how-much-i-ve-spent\">How much I've spent</h4>\n<p>I recently started using <a href=\"https://agentsview.io\">AgentsView</a> to help track my local LLM usage across all of the different coding agents. I published a <a href=\"https://til.simonwillison.net/llms/agentsview-custom-model-price\">TIL today</a> about adding custom Fable pricing to that tool, which I expect will not be necessary in the very near future.</p>\n<p>After setting the price, I ran this command to start a localhost web server to explore my usage:</p>\n<pre><code>uvx agentsview serve\n</code></pre>\n<p>Here's the treemap showing the breakdown of my Fable usage across various projects today:</p>\n<p><img src=\"https://static.simonwillison.net/static/2026/agentsview-fable-full-day.jpg\" alt=\"Screenshot of a cost tracking dashboard with two panels. The first panel is titled \"Cost Attribution\" with toggle buttons for Project / Model / Agent and Treemap / List, with Project and Treemap selected. Italic text reads \"Click to hide from chart\". 
A treemap shows a large red block labeled prod_datasette_agent $99.26 89.9%, with smaller blocks to its right labeled cloud (blue), datasette (teal), llm (red), and money (pink), plus a tiny orange sliver. A legend lists: 1 prod_datasette_agent $99.26, 2 cloud $3.98, 3 datasette $2.81, 4 llm $2.30, 5 money $1.92, 6 simon $0.15. The second panel is titled \"Top Sessions by Cost\" and lists nine sessions, each with a \"Claude\" badge, a prompt excerpt, a project name with a session UUID (omitted here), a token count, and a cost: 1. Review ./datasette-agent and ./datasette-apps - we are going to add a new feature to agent but you ... prod_datasette_agent, 78.2M, $99.26. 2. issues.db is a copy of the Datasette issues database. There are a LOT of notes in there relating to... datasette, 826.8k, $2.81. 3. Consult fly-docs and then look at datasette.cloud (which launches fly machines) and datasettecloud-... cloud, 924.7k, $2.61. 4. simonwillisonblog.db is a copy of my blog, plus all my software releases and other interesting thin... money, 542.9k, $1.92. 5. Look in datasette.cloud and figure out all remaining steps and decisions that need to be made in or... cloud, 455k, $1.37. 6. Review PRs and issues filed against this repo within the last 4 weeks and see if any deserve to be ... llm, 323.3k, $0.95. 7. run mypy, llm, 320.9k, $0.76. 8. [Image #1] fix this in github actions, llm, 183.9k, $0.59. 9. 
simon, simon, 26.4k, $0.15.\" style=\"max-width: 100%;\" /></p>\n<p>I used $110.42 worth of tokens today, all as part of my $100/month subscription.</p>\n<h4 id=\"and-some-pelicans\">And some pelicans</h4>\n<p>I ran \"Generate an SVG of a pelican riding a bicycle\" against all five thinking effort levels with Fable.</p>\n<p>Here are <a href=\"https://tools.simonwillison.net/markdown-svg-renderer#url=https%3A%2F%2Fgist.github.com%2Fsimonw%2F94fde31c34a0400c1d29f57e6a708e6b\">the results</a>, including the token cost for each one:</p>\n\n<div style=\"display: flex; flex-wrap: wrap; gap: 10px; margin-bottom: 1em\">\n <figure style=\"margin: 0; flex: 1 1 30%;\">\n <img src=\"https://static.simonwillison.net/static/2026/fable-low.jpg\" alt=\"low\" style=\"width: 100%; height: auto;\" />\n <figcaption style=\"text-align: center;\">low: <a href=\"https://www.llm-prices.com/#it=25&ot=1929&sel=claude-fable-5\">1,929 out, 9.67c</a></figcaption>\n </figure>\n <figure style=\"margin: 0; flex: 1 1 30%;\">\n <img src=\"https://static.simonwillison.net/static/2026/fable-medium.jpg\" alt=\"medium\" style=\"width: 100%; height: auto;\" />\n <figcaption style=\"text-align: center;\">medium: <a href=\"https://www.llm-prices.com/#it=25&ot=2290&sel=claude-fable-5\">2,290 out, 11.475c</a></figcaption>\n </figure>\n <figure style=\"margin: 0; flex: 1 1 30%;\">\n <img src=\"https://static.simonwillison.net/static/2026/fable-high.jpg\" alt=\"high\" style=\"width: 100%; height: auto;\" />\n <figcaption style=\"text-align: center;\">high: <a href=\"https://www.llm-prices.com/#it=25&ot=2057&sel=claude-fable-5\">2,057 out, 10.31c</a></figcaption>\n </figure>\n <figure style=\"margin: 0; flex: 1 1 45%;\">\n <img src=\"https://static.simonwillison.net/static/2026/fable-xhigh.jpg\" alt=\"xhigh\" style=\"width: 100%; height: auto;\" />\n <figcaption style=\"text-align: center;\">xhigh: <a href=\"https://www.llm-prices.com/#it=25&ot=5992&sel=claude-fable-5\">5,992 out, 29.985c</a></figcaption>\n 
</figure>\n <figure style=\"margin: 0; flex: 1 1 45%;\">\n <img src=\"https://static.simonwillison.net/static/2026/fable-max.jpg\" alt=\"max\" style=\"width: 100%; height: auto;\" />\n <figcaption style=\"text-align: center;\">max: <a href=\"https://www.llm-prices.com/#it=25&ot=14430&sel=claude-fable-5\">14,430 out, 72.175c</a></figcaption>\n </figure>\n</div>\n\n<p>It's interesting that high ended up using fewer tokens than medium for this particular run.</p>\n\n<p>Here are the <a href=\"https://simonwillison.net/2026/May/28/claude-opus-4-8/#and-some-pelicans\">Opus 4.8 pelicans</a> for comparison.</p>\n \n <p>Tags: <a href=\"https://simonwillison.net/tags/ai\">ai</a>, <a href=\"https://simonwillison.net/tags/generative-ai\">generative-ai</a>, <a href=\"https://simonwillison.net/tags/llms\">llms</a>, <a href=\"https://simonwillison.net/tags/anthropic\">anthropic</a>, <a href=\"https://simonwillison.net/tags/claude\">claude</a>, <a href=\"https://simonwillison.net/tags/llm-pricing\">llm-pricing</a>, <a href=\"https://simonwillison.net/tags/pelican-riding-a-bicycle\">pelican-riding-a-bicycle</a>, <a href=\"https://simonwillison.net/tags/llm-release\">llm-release</a>, <a href=\"https://simonwillison.net/tags/claude-mythos\">claude-mythos</a></p>",
"url": "https://simonwillison.net/2026/Jun/9/claude-fable-5/#atom-everything",
"published": "2026-06-09T23:59:54.000Z",
"updated": "2026-06-09T23:59:54.000Z",
"content": null,
"image": null,
"media": [],
"authors": [
{
"name": "Simon Willison",
"email": null,
"url": null
}
],
"categories": [
{
"label": "ai",
"term": "ai",
"url": null
},
{
"label": "generative-ai",
"term": "generative-ai",
"url": null
},
{
"label": "llms",
"term": "llms",
"url": null
},
{
"label": "anthropic",
"term": "anthropic",
"url": null
},
{
"label": "claude",
"term": "claude",
"url": null
},
{
"label": "llm-pricing",
"term": "llm-pricing",
"url": null
},
{
"label": "pelican-riding-a-bicycle",
"term": "pelican-riding-a-bicycle",
"url": null
},
{
"label": "llm-release",
"term": "llm-release",
"url": null
},
{
"label": "claude-mythos",
"term": "claude-mythos",
"url": null
}
]
},
{
"id": "https://simonwillison.net/2026/Jun/9/llm/#atom-everything",
"title": "llm 0.32a3",
"description": "<p><strong>Release:</strong> <a href=\"https://github.com/simonw/llm/releases/tag/0.32a3\">llm 0.32a3</a></p>\n <p>Almost entirely written by the new Claude Fable 5, see <a href=\"https://simonwillison.net/2026/Jun/9/claude-fable-5/#adding-features-to-datasette-agent-and-llm-using-claude-code\">my write-up for more details</a>.</p>\n \n \n <p>Tags: <a href=\"https://simonwillison.net/tags/projects\">projects</a>, <a href=\"https://simonwillison.net/tags/ai\">ai</a>, <a href=\"https://simonwillison.net/tags/generative-ai\">generative-ai</a>, <a href=\"https://simonwillison.net/tags/llms\">llms</a>, <a href=\"https://simonwillison.net/tags/llm\">llm</a>, <a href=\"https://simonwillison.net/tags/claude-mythos\">claude-mythos</a></p>",
"url": "https://simonwillison.net/2026/Jun/9/llm/#atom-everything",
"published": "2026-06-09T22:27:03.000Z",
"updated": "2026-06-09T22:27:03.000Z",
"content": null,
"image": null,
"media": [],
"authors": [
{
"name": "Simon Willison",
"email": null,
"url": null
}
],
"categories": [
{
"label": "projects",
"term": "projects",
"url": null
},
{
"label": "ai",
"term": "ai",
"url": null
},
{
"label": "generative-ai",
"term": "generative-ai",
"url": null
},
{
"label": "llms",
"term": "llms",
"url": null
},
{
"label": "llm",
"term": "llm",
"url": null
},
{
"label": "claude-mythos",
"term": "claude-mythos",
"url": null
}
]
},
{
"id": "https://simonwillison.net/2026/Jun/9/agentsview-custom-model-price/#atom-everything",
"title": "Setting a custom price for a model in AgentsView",
"description": "<p><strong>TIL:</strong> <a href=\"https://til.simonwillison.net/llms/agentsview-custom-model-price\">Setting a custom price for a model in AgentsView</a></p>\n <p>I've been really enjoying <a href=\"https://agentsview.io/\">AgentsView</a> by Wes McKinney as a tool for exploring my token usage across different coding agents running on my laptop.</p>\n<p>Claude Fable 5 came out today and wasn't yet included in the pricing database AgentsView uses. I used Fable to reverse-engineer AgentsView and figured out this recipe for setting custom prices.</p>\n<p>Here's my Claude Fable 5 usage for today so far, plotted by AgentsView as a treemap across my different local projects:</p>\n<p><img alt=\"Screenshot of a cost analytics dashboard. Cost Attribution - Click to hide from chart - toggle buttons for Project / Model / Agent and Treemap / List. A treemap shows a large red block: prod_datasette_agent $74.06 89.3%, then blue: cloud $3.98 4.8%, teal: datasette $2.81 3.4%, pink: money $1.92 2.3%, and a thin orange sliver. A legend lists 1 prod_datasette_agent $74.06, 2 cloud $3.98, 3 datasette $2.81, 4 money $1.92, 5 simon $0.15. Below left, Top Sessions by Cost: 1 Claude - Review ./datasette-agent and ./datasette-apps - we are going to a... - prod_datasette_agent · 08a1f374-0e77-420f-be2d-af805d67e8aa - 55.9M $74.06; 2 Claude - issues.db is a copy of the Datasette issues database. There are a... - datasette · 8caa2d2d-b91f-43b3-bf3a-4268995b6011 - 826.8k $2.81; 3 Claude - Consult fly-docs and then look at datasette.cloud (which launche... - cloud · bfcacc70-09d7-4b27-aaec-4bb8accd9fec - 924.7k $2.61; 4 Claude - simonwillisonblog.db is a copy of my blog, plus all my software re... - money · 0c0fb9dc-6347-4e1b-9307-3709a7cdf0c8 - 542.9k $1.92; 5 Claude - Look in datasette.cloud and figure out all remaining steps and dec... - cloud · 45963b5f-608a-4caa-ad6b-6ae81e1dbf0d - 455k $1.37; 6 Claude - simon - simon · deeccb5d-9e90-4b1e-bfe6-c2b271e1b1d4 - 26.4k $0.15. 
Below right, Cache Efficiency with horizontal bars: Cache Reads 57.6M (nearly full green bar), Cache Writes 769.3K, Uncached Input 64.4K, Output 300.9K (all tiny bars), and a green highlighted note: $516.62 saved vs uncached.\" src=\"https://static.simonwillison.net/static/2026/agentsview-fable.jpg\" /></p>\n \n \n <p>Tags: <a href=\"https://simonwillison.net/tags/ai\">ai</a>, <a href=\"https://simonwillison.net/tags/generative-ai\">generative-ai</a>, <a href=\"https://simonwillison.net/tags/llms\">llms</a>, <a href=\"https://simonwillison.net/tags/llm-pricing\">llm-pricing</a>, <a href=\"https://simonwillison.net/tags/claude-mythos\">claude-mythos</a></p>",
"url": "https://simonwillison.net/2026/Jun/9/agentsview-custom-model-price/#atom-everything",
"published": "2026-06-09T21:35:31.000Z",
"updated": "2026-06-09T21:35:31.000Z",
"content": null,
"image": null,
"media": [],
"authors": [
{
"name": "Simon Willison",
"email": null,
"url": null
}
],
"categories": [
{
"label": "ai",
"term": "ai",
"url": null
},
{
"label": "generative-ai",
"term": "generative-ai",
"url": null
},
{
"label": "llms",
"term": "llms",
"url": null
},
{
"label": "llm-pricing",
"term": "llm-pricing",
"url": null
},
{
"label": "claude-mythos",
"term": "claude-mythos",
"url": null
}
]
},
{
"id": "https://simonwillison.net/2026/Jun/9/andrej-karpathy/#atom-everything",
"title": "Quoting Andrej Karpathy",
"description": "<blockquote cite=\"https://twitter.com/karpathy/status/2064409694761054332\"><p>I feel a lot of things changing as working software increasingly comes out on a tap. The Jevon's paradox kicks in and I feel my own demand for software growing substantially. You can ask for anything - explainers, visualizers, dashboards, bespoke single-use apps (e.g. a full wandb that is hyper-specific just for your project), you can 10X your test suite, auto-optimize code, run giant research projects with custom HTML for the results, anything! \"Free your mind\" (Matrix ref).</p></blockquote>\n<p class=\"cite\">— <a href=\"https://twitter.com/karpathy/status/2064409694761054332\">Andrej Karpathy</a>, on Claude Fable 5</p>\n\n <p>Tags: <a href=\"https://simonwillison.net/tags/andrej-karpathy\">andrej-karpathy</a>, <a href=\"https://simonwillison.net/tags/jevons-paradox\">jevons-paradox</a>, <a href=\"https://simonwillison.net/tags/anthropic\">anthropic</a>, <a href=\"https://simonwillison.net/tags/generative-ai\">generative-ai</a>, <a href=\"https://simonwillison.net/tags/ai\">ai</a>, <a href=\"https://simonwillison.net/tags/llms\">llms</a>, <a href=\"https://simonwillison.net/tags/claude-mythos\">claude-mythos</a></p>",
"url": "https://simonwillison.net/2026/Jun/9/andrej-karpathy/#atom-everything",
"published": "2026-06-09T19:03:10.000Z",
"updated": "2026-06-09T19:03:10.000Z",
"content": null,
"image": null,
"media": [],
"authors": [
{
"name": "Simon Willison",
"email": null,
"url": null
}
],
"categories": [
{
"label": "andrej-karpathy",
"term": "andrej-karpathy",
"url": null
},
{
"label": "jevons-paradox",
"term": "jevons-paradox",
"url": null
},
{
"label": "anthropic",
"term": "anthropic",
"url": null
},
{
"label": "generative-ai",
"term": "generative-ai",
"url": null
},
{
"label": "ai",
"term": "ai",
"url": null
},
{
"label": "llms",
"term": "llms",
"url": null
},
{
"label": "claude-mythos",
"term": "claude-mythos",
"url": null
}
]
},
{
"id": "https://simonwillison.net/2026/Jun/8/wwdc/#atom-everything",
"title": "Siri AI at WWDC 2026",
"description": "<p>Given how badly burned anyone who took Apple's <a href=\"https://simonwillison.net/2024/Jun/10/apple-intelligence/\">2024 WWDC Apple Intelligence announcements</a> at face value was, I'm holding to a strict \"I'll believe it when I see it\" policy for everything <a href=\"https://www.apple.com/newsroom/2026/06/apple-unveils-next-generation-of-apple-intelligence-siri-ai-and-more/\">they announced today</a>. </p>\n<p>The new Siri AI features do at least look feasible with today's technology, especially since Apple are licensing a custom Gemini-derived model that they can run on their own <a href=\"https://simonwillison.net/2024/Jun/11/private-cloud-compute/\">Private Cloud Compute</a>.</p>\n<p>It sounds like they'll be taking advantage of vision LLMs to extract information from the user's screen, which neatly sidesteps the need for every existing application to ship custom code in order to integrate with Apple Intelligence. Vision LLMs were a much less mature category in June 2024.</p>\n<p>The new Core AI library looks like a good step in enabling developers to finally take full advantage of Apple's hardware for running their own models. It integrates with Meta's open source PyTorch ecosystem, using these <a href=\"https://apple.github.io/coreai-torch/main/\">Core AI PyTorch extensions</a>:</p>\n<blockquote>\n<p>Core AI PyTorch Extensions (<code>coreai-torch</code>) is a Python package that bridges PyTorch and Core AI. You can use it to bring up an existing PyTorch model — exported as a <code>torch.export.ExportedProgram</code> — into a Core AI <code>AIProgram</code> ready to run on Apple hardware, traversing the FX graph node-by-node and mapping ATen operators to Core AI operations.</p>\n</blockquote>\n<p>You can install an iOS 27 Developer Beta today, which supposedly has the new features - but you then have to make it through a waiting list for access to the new Siri AI. 
Aaron Perris from MacRumors reports having <a href=\"https://twitter.com/aaronp613/status/2064078063814471977\">made it off the waitlist</a> so we may start seeing credible reports on how well Siri AI works in the very near future.</p>\n<p><strong>Update</strong>: These Private Cloud Compute Gemini models are running in Google Cloud, and using NVIDIA hardware. According to <a href=\"https://security.apple.com/blog/expanding-pcc/?linkId=100000425571569\">Expanding Private Cloud Compute</a> on Apple's Security Research blog:</p>\n<blockquote>\n<p>For the most demanding tasks, including agentic tool-use and complex reasoning, we worked with Google and NVIDIA to extend our PCC infrastructure to Google Cloud systems using NVIDIA GPUs, while maintaining Apple's powerful security and privacy protections. [...]</p>\n<p>PCC on Google Cloud leverages many of the same architectural security patterns as PCC on Apple silicon to implement these layered protections: initial network data parsing for each request happens in a dedicated process within its own namespace, shared inference software is recycled with a short time-to-live duration, and attested keys are held in a separate, dedicated confidential VM isolated from external inputs. [...]</p>\n<p>As with PCC on Apple silicon, all binaries will be published for public inspection.</p>\n</blockquote>\n\n <p>Tags: <a href=\"https://simonwillison.net/tags/vision-llms\">vision-llms</a>, <a href=\"https://simonwillison.net/tags/apple\">apple</a>, <a href=\"https://simonwillison.net/tags/generative-ai\">generative-ai</a>, <a href=\"https://simonwillison.net/tags/ai\">ai</a>, <a href=\"https://simonwillison.net/tags/llms\">llms</a>, <a href=\"https://simonwillison.net/tags/gemini\">gemini</a>, <a href=\"https://simonwillison.net/tags/nvidia\">nvidia</a>, <a href=\"https://simonwillison.net/tags/google\">google</a></p>",
"url": "https://simonwillison.net/2026/Jun/8/wwdc/#atom-everything",
"published": "2026-06-08T23:58:04.000Z",
"updated": "2026-06-08T23:58:04.000Z",
"content": null,
"image": null,
"media": [],
"authors": [
{
"name": "Simon Willison",
"email": null,
"url": null
}
],
"categories": [
{
"label": "vision-llms",
"term": "vision-llms",
"url": null
},
{
"label": "apple",
"term": "apple",
"url": null
},
{
"label": "generative-ai",
"term": "generative-ai",
"url": null
},
{
"label": "ai",
"term": "ai",
"url": null
},
{
"label": "llms",
"term": "llms",
"url": null
},
{
"label": "gemini",
"term": "gemini",
"url": null
},
{
"label": "nvidia",
"term": "nvidia",
"url": null
},
{
"label": "google",
"term": "google",
"url": null
}
]
},
{
"id": "https://simonwillison.net/2026/Jun/7/datasette-agent-edit/#atom-everything",
"title": "datasette-agent-edit 0.1a0",
"description": "<p><strong>Release:</strong> <a href=\"https://github.com/datasette/datasette-agent-edit/releases/tag/0.1a0\">datasette-agent-edit 0.1a0</a></p>\n <p>I'm planning several plugins for <a href=\"https://agent.datasette.io/\">Datasette Agent</a> which can make edits to existing pieces of text - things like collaborative Markdown editing, updating large SQL queries, and editing SVG files.</p>\n<p>Agentic editing of text is a little tricky to get right. My favorite published design for this is for the <a href=\"https://platform.claude.com/docs/en/agents-and-tools/tool-use/text-editor-tool#use-the-text-editor-tool\">Claude text editor</a>, which implements the following tools:</p>\n<ul>\n<li><code>view</code> - view sections of a file, with line numbers added to every line.</li>\n<li><code>str_replace</code> - find an exact <code>old_str</code> and replace it with <code>new_str</code> - fail if the original string is not unique</li>\n<li><code>insert</code> - insert the specified text after the specified line number</li>\n</ul>\n<p>Rather than recreate these patterns for every plugin that needs them I decided to create this base plugin, <code>datasette-agent-edit</code>, which implements the core tools in a way that allows them to be adapted for other plugins.</p>\n \n \n <p>Tags: <a href=\"https://simonwillison.net/tags/ai\">ai</a>, <a href=\"https://simonwillison.net/tags/datasette\">datasette</a>, <a href=\"https://simonwillison.net/tags/generative-ai\">generative-ai</a>, <a href=\"https://simonwillison.net/tags/llms\">llms</a>, <a href=\"https://simonwillison.net/tags/llm-tool-use\">llm-tool-use</a>, <a href=\"https://simonwillison.net/tags/datasette-agent\">datasette-agent</a></p>",
"url": "https://simonwillison.net/2026/Jun/7/datasette-agent-edit/#atom-everything",
"published": "2026-06-07T23:56:38.000Z",
"updated": "2026-06-07T23:56:38.000Z",
"content": null,
"image": null,
"media": [],
"authors": [
{
"name": "Simon Willison",
"email": null,
"url": null
}
],
"categories": [
{
"label": "ai",
"term": "ai",
"url": null
},
{
"label": "datasette",
"term": "datasette",
"url": null
},
{
"label": "generative-ai",
"term": "generative-ai",
"url": null
},
{
"label": "llms",
"term": "llms",
"url": null
},
{
"label": "llm-tool-use",
"term": "llm-tool-use",
"url": null
},
{
"label": "datasette-agent",
"term": "datasette-agent",
"url": null
}
]
},
{
"id": "https://simonwillison.net/2026/Jun/6/micropython-wasm/#atom-everything",
"title": "micropython-wasm 0.1a2",
"description": "<p><strong>Release:</strong> <a href=\"https://github.com/simonw/micropython-wasm/releases/tag/0.1a2\">micropython-wasm 0.1a2</a></p>\n <p>I added a CLI to <code>micropython-wasm</code> (<a href=\"https://github.com/simonw/micropython-wasm/issues/7\">issue #7</a>), inspired by the first draft of <a href=\"https://simonwillison.net/2026/Jun/6/micropython-in-a-sandbox/\">the blog entry</a> when I realized it would be a great way to illustrate the <a href=\"https://simonwillison.net/2026/Jun/6/micropython-in-a-sandbox/#try-it-yourself\">Try it yourself</a> section.</p>\n \n \n <p>Tags: <a href=\"https://simonwillison.net/tags/python\">python</a>, <a href=\"https://simonwillison.net/tags/sandboxing\">sandboxing</a>, <a href=\"https://simonwillison.net/tags/webassembly\">webassembly</a>, <a href=\"https://simonwillison.net/tags/micropython\">micropython</a></p>",
"url": "https://simonwillison.net/2026/Jun/6/micropython-wasm/#atom-everything",
"published": "2026-06-06T04:26:06.000Z",
"updated": "2026-06-06T04:26:06.000Z",
"content": null,
"image": null,
"media": [],
"authors": [
{
"name": "Simon Willison",
"email": null,
"url": null
}
],
"categories": [
{
"label": "python",
"term": "python",
"url": null
},
{
"label": "sandboxing",
"term": "sandboxing",
"url": null
},
{
"label": "webassembly",
"term": "webassembly",
"url": null
},
{
"label": "micropython",
"term": "micropython",
"url": null
}
]
},
{
"id": "https://simonwillison.net/2026/Jun/6/micropython-in-a-sandbox/#atom-everything",
"title": "Running Python code in a sandbox with MicroPython and WASM",
"description": "<p>I've been experimenting with different approaches to running code in a sandbox for several years now, but my latest attempt feels like it might finally have all of the characteristics I've been looking for. I've released it as an alpha package called <a href=\"https://github.com/simonw/micropython-wasm\">micropython-wasm</a>, and I'm using it for a code execution sandbox plugin for <a href=\"https://github.com/datasette/datasette-agent\">Datasette Agent</a> called <a href=\"https://github.com/datasette/datasette-agent-micropython\">datasette-agent-micropython</a>.</p>\n<ul>\n <li><a href=\"https://simonwillison.net/2026/Jun/6/micropython-in-a-sandbox/#why-do-i-want-a-sandbox-\">Why do I want a sandbox?</a></li>\n <li><a href=\"https://simonwillison.net/2026/Jun/6/micropython-in-a-sandbox/#what-i-want-from-a-sandbox\">What I want from a sandbox</a></li>\n <li><a href=\"https://simonwillison.net/2026/Jun/6/micropython-in-a-sandbox/#webassembly-looks-really-promising-here\">WebAssembly looks really promising here</a></li>\n <li><a href=\"https://simonwillison.net/2026/Jun/6/micropython-in-a-sandbox/#micropython-in-webassembly\">MicroPython in WebAssembly</a></li>\n <li><a href=\"https://simonwillison.net/2026/Jun/6/micropython-in-a-sandbox/#building-the-first-version\">Building the first version</a></li>\n <li><a href=\"https://simonwillison.net/2026/Jun/6/micropython-in-a-sandbox/#try-it-yourself\">Try it yourself</a></li>\n <li><a href=\"https://simonwillison.net/2026/Jun/6/micropython-in-a-sandbox/#should-you-trust-my-vibe-coded-sandbox-\">Should you trust my vibe-coded sandbox?</a></li>\n</ul>\n<h4 id=\"why-do-i-want-a-sandbox-\">Why do I want a sandbox?</h4>\n<p>My key open source projects - <a href=\"https://datasette.io/\">Datasette</a>, <a href=\"https://llm.datasette.io/\">LLM</a>, even <a href=\"https://sqlite-utils.datasette.io/\">sqlite-utils</a> - all support plugins.</p>\n<p>I absolutely love plugins as a mechanism for extending 
software. A carefully designed plugin system reduces the risk involved in trying new things to almost nothing - even the wildest ideas won't leave a lasting influence on the core application itself. My software can grow a new feature overnight and I don't even have to review a pull request!</p>\n<p>There's one major drawback: my plugin systems all use Python and <a href=\"https://pluggy.readthedocs.io/en/latest/\">Pluggy</a>, and plugin code executes with full privileges within my applications. A buggy or malicious plugin could break everything or leak private data.</p>\n<p>I'd love to be able to run plugin-style code in an environment where it is unable to read unapproved files, connect to a network, or generally operate in a way that's risky or harmful to the rest of the application or the user's computer.</p>\n<p>My interest covers more than just plugins. For Datasette in particular there are many features I'd like to support where arbitrary code execution would be useful. I've already experimented with this for <a href=\"https://enrichments.datasette.io/\">Datasette Enrichments</a>, where code can be used to transform values stored in a table. I'd love to build a mechanism where you can run code on a schedule that fetches JSON from an approved location, runs a tiny bit of code to reformat it into a list of dictionaries, then inserts those as rows in a SQLite database table.</p>\n<h4 id=\"what-i-want-from-a-sandbox\">What I want from a sandbox</h4>\n<p>My goal is to execute code safely within my own Python applications. Here's what I need:</p>\n<ul>\n<li>Dependencies that <strong>cleanly install from PyPI</strong>, including binary wheels across multiple platforms if necessary. I don't want people using my software to have to take any extra steps beyond directly installing my Python package.</li>\n<li>Executed code must be subject to both <strong>memory</strong> and <strong>CPU</strong> limits. 
I don't want <code>while True: s += \"longer string\"</code> to crash my application or the user's computer.</li>\n<li>\n<strong>File access must be strictly controlled</strong>. Either no filesystem access at all or I get to define exactly which files can be read and which files can be written to.</li>\n<li>\n<strong>Network access is controlled as well</strong>. Sandboxed code should not be able to communicate with anything without going through a layer I fully control.</li>\n<li>Support for interaction with <strong>host functions</strong>. A sandbox isn't much use if I can't carefully expose selected platform features to the code that it's running.</li>\n<li>It has to be <strong>robust, supported, and clearly documented</strong>. I've lost count of the number of sandbox projects I've seen in repos with warnings that they aren't actively maintained!</li>\n</ul>\n<h4 id=\"webassembly-looks-really-promising-here\">WebAssembly looks really promising here</h4>\n<p>Web browsers operate in the most hostile environment imaginable when it comes to malicious code. Their job is to download <em>and execute</em> untrusted code from the web on almost every page load.</p>\n<p>Given this, JavaScript engines should be excellent candidates for sandboxes. Sadly those engines are also extremely complicated, and are not designed for easy embedding in other projects. Most of the V8-in-Python projects I've seen are infrequently maintained and come with warnings not to use them with completely untrusted code.</p>\n<p>WebAssembly is a <em>much better</em> candidate. It was designed from the start to support all of the characteristics I care about and has been tested in browsers for nearly a decade. The <a href=\"https://pypi.org/project/wasmtime\">wasmtime</a> Python library brings WASM to Python, is actively maintained, and has binary wheels.</p>\n<h4 id=\"micropython-in-webassembly\">MicroPython in WebAssembly</h4>\n<p>WebAssembly engines like wasmtime run WebAssembly binaries. 
Some programming languages like Rust are easy to compile directly to WebAssembly. Dynamic languages like JavaScript and Python are harder - they support language primitives like <code>eval()</code>, which means they need a full interpreter available at runtime.</p>\n<p>To run Python we need a full Python interpreter compiled to WebAssembly, wired up in a way that makes it easy to feed it code, hook up host functions and access the results.</p>\n<p>Pyodide offers an outstanding package for running Python using WebAssembly in the browser, but using Pyodide in server-side Python isn't supported. The most recent advice I could find was <a href=\"https://github.com/pyodide/pyodide/discussions/5145\">from October 2024</a> stating \"Pyodide is built by the Emscripten toolchain and can only run in a browser or Node.js\".</p>\n<p>The other day I decided to take a look at <a href=\"https://micropython.org\">MicroPython</a> as an option for this. The MicroPython site says:</p>\n<blockquote>\n<p>MicroPython is a lean and efficient implementation of the Python 3 programming language that includes a small subset of the Python standard library and is optimised to run on microcontrollers and in constrained environments.</p>\n</blockquote>\n<p>WebAssembly sure feels like a constrained environment to me!</p>\n<h4 id=\"building-the-first-version\">Building the first version</h4>\n<p>I had GPT-5.5 Pro <a href=\"https://chatgpt.com/share/6a1e2a5c-58b8-8328-ba1c-0e6aadb0a051\">do some research for me</a>, which turned up <a href=\"https://github.com/micropython/micropython/pull/13676\">this PR against MicroPython</a> by <a href=\"https://github.com/yamt\">Yamamoto Takahashi</a> titled \"Experimental WASI support for ports/unix\".</p>\n<p>It then produced this <a href=\"https://github.com/simonw/micropython-wasm/blob/c08fbd2276b15dc8c9bdff82845f750971f45647/research.md\">research.md document</a>, so I let Codex Desktop and GPT-5.5 high <a 
href=\"https://gist.github.com/simonw/27461a16d76f28f8619c609444d544fe\">loose on it</a> to see what would happen:</p>\n<blockquote>\n<p><code>read the research.md document and build this. You will probably need to write a script that compiles a custom WASM version of MicroPython as part of this project - fetch the MicroPython code to a /tmp directory for this as part of that script.</code></p>\n</blockquote>\n<p>It worked. I now had a prototype Python library that could execute Python code inside a WebAssembly sandbox!</p>\n<p>The trickiest piece to solve was persistent interpreter state. The WASM build we are using here exposes a single entry point which starts the interpreter, runs the code and then stops the interpreter at the end.</p>\n<p>This works fine for one-off scripts, but for Datasette Agent I want variables and functions to stay resident in memory so I can reuse them across multiple code execution calls.</p>\n<p>A neat thing about working with coding agents is that you can get from an idea to a proof of concept quickly. I prompted:</p>\n<blockquote>\n<p><code>For keeping variables resident: what if we ran code inside micropython itself which called a host function get_next_python_code() and then passed that to eval() - and that host function blocked until new code was available, maybe by running in a thread with a queue? Could that or a similar idea help here?</code></p>\n</blockquote>\n<p>After some iteration we got to a version of this that works! 
In Python code you can now do this:</p>\n<pre><span class=\"pl-k\">from</span> <span class=\"pl-s1\">micropython_wasm</span> <span class=\"pl-k\">import</span> <span class=\"pl-v\">MicroPythonSession</span>\n\n<span class=\"pl-k\">with</span> <span class=\"pl-en\">MicroPythonSession</span>() <span class=\"pl-k\">as</span> <span class=\"pl-s1\">session</span>:\n <span class=\"pl-en\">print</span>(<span class=\"pl-s1\">session</span>.<span class=\"pl-c1\">run</span>(<span class=\"pl-s\">\"x = 10<span class=\"pl-cce\">\\n</span>print(x)\"</span>).<span class=\"pl-c1\">stdout</span>)\n <span class=\"pl-en\">print</span>(<span class=\"pl-s1\">session</span>.<span class=\"pl-c1\">run</span>(<span class=\"pl-s\">\"x += 5<span class=\"pl-cce\">\\n</span>print(x)\"</span>).<span class=\"pl-c1\">stdout</span>)\n <span class=\"pl-en\">print</span>(<span class=\"pl-s1\">session</span>.<span class=\"pl-c1\">run</span>(<span class=\"pl-s\">\"print(x * 2)\"</span>).<span class=\"pl-c1\">stdout</span>)</pre>\n<p>Under the hood this starts a thread, sets up a request queue and then sends messages to that queue for the <code>session.run()</code> command, each time waiting on a reply queue for the result of that execution. 
Inside WASM the MicroPython interpreter blocks waiting for a <code>__session_next__()</code> host function to return the next line of code, which it runs <code>eval()</code> on before calling <code>__session_result__({\"id\": request_id, \"ok\": True})</code> when each block has been successfully executed.</p>\n<p>The other piece of complexity was supporting host functions, so my Python library could selectively expose functions that could then be called by code running in MicroPython.</p>\n<p>Codex ended up solving this with <a href=\"https://github.com/simonw/micropython-wasm/blob/0.1a1/micropython_wasm/usercmodule/host/hostmodule.c\">78 lines of C</a>, which ends up compiled into the <a href=\"https://github.com/simonw/micropython-wasm/blob/0.1a1/micropython_wasm/artifacts/micropython-wasi.wasm\">362KB WebAssembly blob</a> I'm distributing with the package.</p>\n<p>I am by no means a C programmer, but I've read the C and had two different models explain it to me (here's <a href=\"https://claude.ai/share/62f74371-cc3c-44f2-b406-33d03513de9e\">Claude's explanation</a>) and I've subjected it to a barrage of tests.</p>\n<p>The great thing about working with WebAssembly is that if the C turns out to be fatally flawed the worst that can happen is the WebAssembly execution will fail with an exception. I can live with that risk.</p>\n<p>Memory limits are directly supported by wasmtime. CPU limits are a little harder: wasmtime offers a \"fuel\" concept to limit how many operations a WebAssembly call can execute, and that's the correct fit for this problem, but the units are hard to reason about. 
I'm experimenting with a 20 million default \"fuel\" setting now but I'm not confident that it's the most appropriate value.</p>\n<h4 id=\"try-it-yourself\">Try it yourself</h4>\n<p>The <code>micropython-wasm</code> alpha is now <a href=\"https://pypi.org/project/micropython-wasm\">live on PyPI</a>.</p>\n<p>You can try it from your own Python code as <a href=\"https://github.com/simonw/micropython-wasm\">described in the README</a>. I've also added a simple CLI mode in <a href=\"https://github.com/simonw/micropython-wasm/releases/tag/0.1a2\">version 0.1a2</a> which means you can try it using <code>uvx</code> without first installing it like so:</p>\n<div class=\"highlight highlight-source-shell\"><pre>uvx micropython-wasm -c <span class=\"pl-s\"><span class=\"pl-pds\">'</span>print(\"Hello world\")<span class=\"pl-pds\">'</span></span>\n<span class=\"pl-c\"><span class=\"pl-c\">#</span> To see it run out of fuel:</span>\nuvx micropython-wasm -c <span class=\"pl-s\"><span class=\"pl-pds\">'</span>s = \"\"; while True: s += \"longer\"<span class=\"pl-pds\">'</span></span>\n<span class=\"pl-c\"><span class=\"pl-c\">#</span> Outputs: micropython-wasm: guest exited with code 1</span></pre></div>\n<p>You can also try it in <a href=\"https://agent.datasette.io/\">Datasette Agent</a> like this:</p>\n<div class=\"highlight highlight-source-shell\"><pre>uvx llm keys <span class=\"pl-c1\">set</span> openai\n<span class=\"pl-c\"><span class=\"pl-c\">#</span> Paste in an OpenAI key, then:</span>\nuvx --with datasette-agent \\\n --with datasette-agent-micropython \\\n --prerelease allow \\\n datasette --internal internal.db \\\n -s plugins.datasette-llm.default_model gpt-5.5 \\\n --root -o</pre></div>\n<p>Then navigate to <a href=\"http://127.0.0.1:8001/-/agent\">http://127.0.0.1:8001/-/agent</a> and run the prompt:</p>\n<blockquote>\n<p><code>show me some micropython</code></p>\n</blockquote>\n\n<p><img 
src=\"https://static.simonwillison.net/static/2026/micropython-in-datasette-agent.jpg\" style=\"max-width: 100%\" alt=\"Screenshot of a chat application interface with a dark blue-grey header reading \"home\" on the left and \"root\" with a hamburger menu icon on the right. Below is a navigation row with \"← Back\" and \"Chat\" on the left and an \"EXPORT\" button on the right. A blue user message bubble reads \"show me some micropython\". Below it a collapsed thinking section reads \"▸Thinking: … to show the result clearly. After that, I can wrap up with a brief explanation!\" followed by a \"▶ Tool: execute_micropython\" label. A code block follows: \"# A tiny MicroPython example: blink-style logic + Fibonacci\" / \"def fib(n):\" / \" a, b = 0, 1\" / \" out = []\" / \" for _ in range(n):\" / \" out.append(a)\" / \" a, b = b, a + b\" / \" return out\" / 'print(\"Hello from MicroPython!\")' / 'print(\"First 10 Fibonacci numbers:\", fib(10))' / \"# MicroPython often runs on microcontrollers, e.g.:\" / \"# from machine import Pin\" / \"# led = Pin(2, Pin.OUT)\" / \"# led.value(1) # turn LED on\" / \"# led.value(0) # turn LED off\". Below a horizontal divider is the output: \"Hello from MicroPython!\" / \"First 10 Fibonacci numbers: [0, 1, 1, 2, 3, 5, 8, 13, 21, 34]\", followed by a \"▶ Result: execute_micropython\" label. 
At the bottom is a text input field with placeholder \"Type a message...\" and a blue \"Send\" button.\" /></p>\n\n<p>You can try a live demo of that plugin running in Datasette Agent by signing into <a href=\"https://agent.datasette.io\">agent.datasette.io</a> with your GitHub account.</p>\n\n<h4 id=\"should-you-trust-my-vibe-coded-sandbox-\">Should you trust my vibe-coded sandbox?</h4>\n<p>Having complained about immature, loosely-maintained sandboxing libraries, it's deeply ironic that I've now built my own!</p>\n<p>I deliberately slapped an alpha release version on it, and I'm not ready to recommend it to anyone who isn't willing to take a significant risk.</p>\n<p>I've put it through enough testing that I'm OK using it myself. I've shipped my first plugin that uses it, <a href=\"https://github.com/datasette/datasette-agent-micropython\">datasette-agent-micropython</a>. I've also locked GPT-5.5 xhigh in that Datasette Agent plugin and <a href=\"https://gist.github.com/simonw/5de497c44d25f9fd459c8aa2c959fe4a\">challenged it to break out of the sandbox</a> and so far it has not managed to.</p>\n<p>I'm hoping this implementation can convince some companies with professional security teams and high-stakes problems to commit to using Python in WebAssembly as a sandboxing approach and open source their own solutions.</p>\n \n <p>Tags: <a href=\"https://simonwillison.net/tags/python\">python</a>, <a href=\"https://simonwillison.net/tags/sandboxing\">sandboxing</a>, <a href=\"https://simonwillison.net/tags/ai\">ai</a>, <a href=\"https://simonwillison.net/tags/datasette\">datasette</a>, <a href=\"https://simonwillison.net/tags/webassembly\">webassembly</a>, <a href=\"https://simonwillison.net/tags/generative-ai\">generative-ai</a>, <a href=\"https://simonwillison.net/tags/llms\">llms</a>, <a href=\"https://simonwillison.net/tags/ai-assisted-programming\">ai-assisted-programming</a>, <a href=\"https://simonwillison.net/tags/codex\">codex</a>, <a 
href=\"https://simonwillison.net/tags/datasette-agent\">datasette-agent</a>, <a href=\"https://simonwillison.net/tags/micropython\">micropython</a></p>",
"url": "https://simonwillison.net/2026/Jun/6/micropython-in-a-sandbox/#atom-everything",
"published": "2026-06-06T03:53:34.000Z",
"updated": "2026-06-06T03:53:34.000Z",
"content": null,
"image": null,
"media": [],
"authors": [
{
"name": "Simon Willison",
"email": null,
"url": null
}
],
"categories": [
{
"label": "python",
"term": "python",
"url": null
},
{
"label": "sandboxing",
"term": "sandboxing",
"url": null
},
{
"label": "ai",
"term": "ai",
"url": null
},
{
"label": "datasette",
"term": "datasette",
"url": null
},
{
"label": "webassembly",
"term": "webassembly",
"url": null
},
{
"label": "generative-ai",
"term": "generative-ai",
"url": null
},
{
"label": "llms",
"term": "llms",
"url": null
},
{
"label": "ai-assisted-programming",
"term": "ai-assisted-programming",
"url": null
},
{
"label": "codex",
"term": "codex",
"url": null
},
{
"label": "datasette-agent",
"term": "datasette-agent",
"url": null
},
{
"label": "micropython",
"term": "micropython",
"url": null
}
]
},
{
"id": "https://simonwillison.net/2026/Jun/5/openai-help-lockdown-mode/#atom-everything",
"title": "OpenAI Help: Lockdown Mode",
"description": "<p><strong><a href=\"https://help.openai.com/en/articles/20001061-lockdown-mode\">OpenAI Help: Lockdown Mode</a></strong></p>\nOpenAI first teased this <a href=\"https://openai.com/index/introducing-lockdown-mode-and-elevated-risk-labels-in-chatgpt/\">in February</a>, but now it's live and \"rolling out to eligible personal accounts, including Free, Go, Plus, and Pro, and self-serve ChatGPT Business accounts\":</p>\n<blockquote>\n<p>Lockdown Mode is designed to help prevent the final stage of data exfiltration from a prompt injection attack by limiting outbound network requests that could transfer sensitive data to an attacker. Lockdown Mode does not prevent prompt injections from appearing in the content ChatGPT processes. For example, a prompt injection could appear in cached web content or in an uploaded file, and could still affect the behavior or accuracy of a response.</p>\n</blockquote>\n<p>This looks really good to me.</p>\n<p>The <a href=\"https://simonwillison.net/2025/Jun/16/the-lethal-trifecta/\">Lethal Trifecta</a> occurs when an LLM system has access to all three of access to private data, exposure to untrusted content and a way to steal data and transmit it back to the attacker.</p>\n<p>The only way to solve the trifecta is to cut off one of the three legs, and by far the easiest leg to restrict without making your LLM systems far less useful is the exfiltration vectors to steal data.</p>\n<p>It looks to me like lockdown mode directly attacks that leg, using mechanisms that are deterministic and, crucially, are not evaluated by AI systems that themselves can be subverted by sufficiently devious attacks.</p>\n<p>The existence of lockdown mode does however imply that ChatGPT, in its default settings, does <em>not</em> provide robust protection against sufficiently determined data exfiltration attacks!</p>\n<p><strong>Update</strong>: <a href=\"https://twitter.com/cryps1s/status/2062923575049531422\">This tweet</a> OpenAI CISO Dane 
Stuckey:</p>\n<blockquote>\n<p>Lockdown mode is not meant for everyone. However, for folks who have an elevated risk profile - due to who they are, what they work on, or the types of data they work with - it's an excellent tool for further securing themselves. This has some tradeoffs on functionality and utility, but for these users, the tradeoff is worthwhile.</p>\n</blockquote>\n\n\n <p>Tags: <a href=\"https://simonwillison.net/tags/security\">security</a>, <a href=\"https://simonwillison.net/tags/ai\">ai</a>, <a href=\"https://simonwillison.net/tags/openai\">openai</a>, <a href=\"https://simonwillison.net/tags/prompt-injection\">prompt-injection</a>, <a href=\"https://simonwillison.net/tags/llms\">llms</a>, <a href=\"https://simonwillison.net/tags/lethal-trifecta\">lethal-trifecta</a></p>",
"url": "https://simonwillison.net/2026/Jun/5/openai-help-lockdown-mode/#atom-everything",
"published": "2026-06-05T23:56:40.000Z",
"updated": "2026-06-05T23:56:40.000Z",
"content": null,
"image": null,
"media": [],
"authors": [
{
"name": "Simon Willison",
"email": null,
"url": null
}
],
"categories": [
{
"label": "security",
"term": "security",
"url": null
},
{
"label": "ai",
"term": "ai",
"url": null
},
{
"label": "openai",
"term": "openai",
"url": null
},
{
"label": "prompt-injection",
"term": "prompt-injection",
"url": null
},
{
"label": "llms",
"term": "llms",
"url": null
},
{
"label": "lethal-trifecta",
"term": "lethal-trifecta",
"url": null
}
]
},
{
"id": "https://simonwillison.net/2026/Jun/5/andreas-kling/#atom-everything",
"title": "Quoting Andreas Kling",
"description": "<blockquote cite=\"https://ladybird.org/posts/changing-how-we-develop-ladybird/\"><p>We will no longer accept public pull requests. [...]</p>\n<p>A substantial patch used to imply substantial effort, and that effort was a reasonable proxy for good faith. That assumption no longer holds. [...]</p>\n<p>Whether code was typed by hand is beside the point. What matters is who is responsible for it once it enters the browser. Ladybird is becoming a browser for real users. The people introducing changes to it must be the people who decide those changes belong in the project, and who will answer for the consequences.</p></blockquote>\n<p class=\"cite\">— <a href=\"https://ladybird.org/posts/changing-how-we-develop-ladybird/\">Andreas Kling</a>, Changing How We Develop Ladybird</p>\n\n <p>Tags: <a href=\"https://simonwillison.net/tags/ladybird\">ladybird</a>, <a href=\"https://simonwillison.net/tags/ai-ethics\">ai-ethics</a>, <a href=\"https://simonwillison.net/tags/open-source\">open-source</a>, <a href=\"https://simonwillison.net/tags/generative-ai\">generative-ai</a>, <a href=\"https://simonwillison.net/tags/ai\">ai</a>, <a href=\"https://simonwillison.net/tags/andreas-kling\">andreas-kling</a>, <a href=\"https://simonwillison.net/tags/llms\">llms</a></p>",
"url": "https://simonwillison.net/2026/Jun/5/andreas-kling/#atom-everything",
"published": "2026-06-05T11:10:05.000Z",
"updated": "2026-06-05T11:10:05.000Z",
"content": null,
"image": null,
"media": [],
"authors": [
{
"name": "Simon Willison",
"email": null,
"url": null
}
],
"categories": [
{
"label": "ladybird",
"term": "ladybird",
"url": null
},
{
"label": "ai-ethics",
"term": "ai-ethics",
"url": null
},
{
"label": "open-source",
"term": "open-source",
"url": null
},
{
"label": "generative-ai",
"term": "generative-ai",
"url": null
},
{
"label": "ai",
"term": "ai",
"url": null
},
{
"label": "andreas-kling",
"term": "andreas-kling",
"url": null
},
{
"label": "llms",
"term": "llms",
"url": null
}
]
},
{
"id": "https://simonwillison.net/2026/Jun/4/ai-enthusiasts-ai-skeptics/#atom-everything",
"title": "AI enthusiasts are in a race against time, AI skeptics are in a race against entropy",
"description": "<p><strong><a href=\"https://charitydotwtf.substack.com/p/ai-enthusiasts-are-in-a-race-against\">AI enthusiasts are in a race against time, AI skeptics are in a race against entropy</a></strong></p>\nCharity Majors neatly captures the dynamic between AI enthusiasts and AI skeptics, both of whom are trying to build great software, often in the same teams:</p>\n<blockquote>\n<p>The enthusiasts are <em>not wrong</em>. We are starting to see real, non-imaginary, discontinuous leaps in capabilities from teams that lean in hard to working with AI. And this does not feel like a normal technology cycle where you can wait for the dust to settle; teams that sit this out while competitors are hustling could be out of business before the dust settles. That’s a real, existential threat.</p>\n<p>The skeptics are also <em>not wrong</em>. When you ship code faster than engineers can read it, in domains where nobody has full context, you are making withdrawals from a trust account that took years to build. Reliability degrades, institutional knowledge evaporates. You end up with systems nobody understands, products burbling into incoherence, and on-call rotations that grind people up and spit them out. That is ALSO a real existential threat.</p>\n</blockquote>\n<p>Charity recommends treating this as both a leadership challenge and an engineering challenge. 
The key issue:</p>\n<blockquote>\n<p>There is no natural feedback loop connecting enthusiasts with skeptics.</p>\n</blockquote>\n<p>Designing feedback loops to help \"mend the gap in shared reality\" between the two groups is a fascinating organizational design problem.</p>\n\n <p><small>Via <a href=\"https://lobste.rs/s/ri4flr/ai_enthusiasts_are_race_against_time_ai\">Lobste.rs</a></small></p>\n\n\n <p>Tags: <a href=\"https://simonwillison.net/tags/ai\">ai</a>, <a href=\"https://simonwillison.net/tags/charity-majors\">charity-majors</a>, <a href=\"https://simonwillison.net/tags/agentic-engineering\">agentic-engineering</a></p>",
"url": "https://simonwillison.net/2026/Jun/4/ai-enthusiasts-ai-skeptics/#atom-everything",
"published": "2026-06-04T23:55:27.000Z",
"updated": "2026-06-04T23:55:27.000Z",
"content": null,
"image": null,
"media": [],
"authors": [
{
"name": "Simon Willison",
"email": null,
"url": null
}
],
"categories": [
{
"label": "ai",
"term": "ai",
"url": null
},
{
"label": "charity-majors",
"term": "charity-majors",
"url": null
},
{
"label": "agentic-engineering",
"term": "agentic-engineering",
"url": null
}
]
},
{
"id": "https://simonwillison.net/2026/Jun/4/a-slightly-different-version/#atom-everything",
"title": "Quoting Emanuel Maiberg, 404 Media",
"description": "<blockquote cite=\"https://www.404media.co/google-employees-internally-share-memes-about-how-its-ai-sucks/\"><p>After this story was published Google's spokesperson reached out and asked us to publish a slightly different version of that statement. The new statement no longer stated that \"it's critical that we maintain humans in the loop.\"</p></blockquote>\n<p class=\"cite\">— <a href=\"https://www.404media.co/google-employees-internally-share-memes-about-how-its-ai-sucks/\">Emanuel Maiberg, 404 Media</a>, Google Employees Internally Share Memes About How Its AI Sucks</p>\n\n <p>Tags: <a href=\"https://simonwillison.net/tags/ai-ethics\">ai-ethics</a>, <a href=\"https://simonwillison.net/tags/journalism\">journalism</a>, <a href=\"https://simonwillison.net/tags/ai\">ai</a>, <a href=\"https://simonwillison.net/tags/google\">google</a></p>",
"url": "https://simonwillison.net/2026/Jun/4/a-slightly-different-version/#atom-everything",
"published": "2026-06-04T16:38:29.000Z",
"updated": "2026-06-04T16:38:29.000Z",
"content": null,
"image": null,
"media": [],
"authors": [
{
"name": "Simon Willison",
"email": null,
"url": null
}
],
"categories": [
{
"label": "ai-ethics",
"term": "ai-ethics",
"url": null
},
{
"label": "journalism",
"term": "journalism",
"url": null
},
{
"label": "ai",
"term": "ai",
"url": null
},
{
"label": "google",
"term": "google",
"url": null
}
]
},
{
"id": "https://simonwillison.net/2026/Jun/3/uber-caps-usage/#atom-everything",
"title": "Uber Caps Usage of AI Tools Like Claude Code to Manage Costs",
"description": "<p><strong><a href=\"https://www.bloomberg.com/news/articles/2026-06-02/uber-caps-usage-of-ai-tools-like-claude-code-to-cut-costs\">Uber Caps Usage of AI Tools Like Claude Code to Manage Costs</a></strong></p>\nI wrote <a href=\"https://simonwillison.net/2026/May/27/product-market-fit/#the-ai-failure-stories-around-this-are-pretty-thin\">the other day</a> about Uber blowing its 2026 AI budget in four months, and how that wasn't particularly surprising given they would have set that budget in 2025, before anyone could have predicted how popular token-burning coding agents were about to become.\nNatalie Lung for Bloomberg:</p>\n<blockquote>\n<p>The rideshare giant is limiting all employees to $1,500 in monthly token spending per AI coding tool, an Uber spokesperson said in response to a Bloomberg News inquiry. That means spending on one tool doesn’t have a bearing on the budget for another. The limits, which have been instituted in recent months, only apply to agentic coding software such as Cursor or Anthropic PBC’s Claude Code.</p>\n</blockquote>\n<p>A $1,500 monthly limit per tool strikes me as a rational policy response to over-spending, and <em>much</em> more sensible than those <a href=\"https://en.wikipedia.org/wiki/Token_maxxing\">tokenmaxxing</a> leaderboards encouraging employees to compete for as much AI usage as possible.</p>\n<p>It's also interesting in that it hints at a real dollar value for what Uber is getting out of these tools. If we assume two actively used tools per engineer that's $3,000 * 12 = $36,000 cap per engineer per year. 
Levels.fyi lists <a href=\"https://www.levels.fyi/companies/uber/salaries/software-engineer?country=254\">the median yearly compensation package for Uber software engineers in the USA</a> at $330,000.</p>\n<p>Under that two-tool assumption, each engineer's AI spending cap is ~11% of that median compensation package ($36,000 / $330,000).</p>\n<p>I <a href=\"https://simonwillison.net/2026/May/27/product-market-fit/#enterprise-customers-are-now-paying-api-prices\">noted</a> that my own token usage comes to about $1,000/month against each of Anthropic and OpenAI - which currently costs me just $100 per provider thanks to their generous subsidized plans for individual subscribers. Those plans are no longer available to larger companies like Uber.</p>\n<p>Their new policy means if I were working at Uber I'd still have ~$500/month of tokens to spare for each of those tools, given my current usage patterns.</p>\n\n\n <p>Tags: <a href=\"https://simonwillison.net/tags/ai\">ai</a>, <a href=\"https://simonwillison.net/tags/generative-ai\">generative-ai</a>, <a href=\"https://simonwillison.net/tags/llms\">llms</a>, <a href=\"https://simonwillison.net/tags/llm-pricing\">llm-pricing</a>, <a href=\"https://simonwillison.net/tags/coding-agents\">coding-agents</a>, <a href=\"https://simonwillison.net/tags/uber\">uber</a></p>",
"url": "https://simonwillison.net/2026/Jun/3/uber-caps-usage/#atom-everything",
"published": "2026-06-03T12:01:27.000Z",
"updated": "2026-06-03T12:01:27.000Z",
"content": null,
"image": null,
"media": [],
"authors": [
{
"name": "Simon Willison",
"email": null,
"url": null
}
],
"categories": [
{
"label": "ai",
"term": "ai",
"url": null
},
{
"label": "generative-ai",
"term": "generative-ai",
"url": null
},
{
"label": "llms",
"term": "llms",
"url": null
},
{
"label": "llm-pricing",
"term": "llm-pricing",
"url": null
},
{
"label": "coding-agents",
"term": "coding-agents",
"url": null
},
{
"label": "uber",
"term": "uber",
"url": null
}
]
},
{
"id": "https://simonwillison.net/2026/Jun/2/microsofts-new-models/#atom-everything",
"title": "Microsoft's new MAI models",
"description": "<p>Microsoft <a href=\"https://microsoft.ai/news/building-a-hillclimbing-machine-launching-seven-new-mai-models/\">announced two new text LLMs</a> this morning - <strong><a href=\"https://microsoft.ai/news/introducing-mai-thinking-1/\">MAI-Thinking-1</a></strong> (reasoning, 1T parameters, 35B active, available to \"select early partners\") and <strong><a href=\"https://microsoft.ai/news/introducingmai-code-1-flash/\">MAI-Code-1-Flash</a></strong> (137B Parameters, 5B active, \"purpose-built for GitHub Copilot and VS Code to deliver high performance and lower cost [...] rolling out to GitHub Copilot individual users in Visual Studio Code\"). I've not been able to try either of them just yet.</p>\n<p><strike>It's very interesting to see Microsoft releasing models with such low parameter counts, especially given how expensive larger models are to access right now. They claim MAI-Thinking-1 \"is preferred to Sonnet 4.6 in our blind human side-by-side evaluations\", which is impressive for a 35B model seeing as I frequently run models larger than that on my own laptop.</strike> (UPDATE: I got this entirely wrong, see note below.)</p>\n<p>Also <a href=\"https://microsoft.ai/news/introducing-mai-thinking-1/\">of note</a>:</p>\n<blockquote>\n<p>We trained [MAI-Thinking-1] from the ground up on enterprise grade, clean and commercially licensed data, without distillation from third-party models.</p>\n</blockquote>\n<p>And for <a href=\"https://microsoft.ai/news/introducingmai-code-1-flash/\">MAI-Code-1-Flash</a> as well:</p>\n<blockquote>\n<p>It is built end-to-end by Microsoft using clean and appropriately licensed data.</p>\n</blockquote>\n<p>I would <em>very much</em> like to learn more about this \"appropriately licensed\" data! Could these be the first generally useful code-specialist models that didn't train on an unlicensed dump of the web? 
(<strong>Update</strong>: the answer is no, see note below.)</p>\n<p><strong>Update</strong>: My initial published notes got the size of the models wrong. I misread Microsoft's announcements and interpreted the MoE active parameter count as the total parameter count, but the <a href=\"https://microsoft.ai/pdf/MAI-Code-1-Flash-Model-Card.PDF\">model card for MAI-Code-1-Flash</a> lists it as 137B with 5B active and the <a href=\"https://microsoft.ai/wp-content/uploads/2026/06/main_20260602_2.pdf\">MAI-Thinking-1 technical paper</a> reveals it to be a 1T model with 35B active.</p>\n<p>I deeply regret this error.</p>\n<p><strong>Update 2</strong>: That technical paper describes the training data in some detail from page 80 onwards. It has the same licensing problems as all of the other major LLMs: it's trained on a crawl of the public web:</p>\n<blockquote>\n<p>The majority of our web HTML corpus comes from a proprietary crawl. After initial page discovery and selection, approximately 1.2 trillion pages are crawled and parsed. [...] In addition to Microsoft standard policy Sec. 2.4, we apply UT1 block list (Prigent, 2026) to remove adult content and piracy-related domains. In all, this filtering reduces the corpus from 1.2 trillion pages to 794 billion pages. Given the prevalence of AI-generated content on the web, we also score pages with a proprietary AI-content detection model and use manual inspection to identify domains with extensive AI-generated content; those domains are filtered out of the training corpus.</p>\n<p>[...]</p>\n<p>We process Common Crawl with the same pipeline. [...] After filtering, deduplication, merging with the proprietary web corpus, and a final round of exact-URL and content-level fuzzy deduplication, the Common Crawl portion contains 24.2 billion pages.</p>\n</blockquote>\n<p>I did not cover this one at all well, which is somewhat ironic since I was at the Microsoft Build conference when I wrote this up! 
I'm sorry for not digging deeper before publishing my initial notes.</p>\n\n <p>Tags: <a href=\"https://simonwillison.net/tags/llm-release\">llm-release</a>, <a href=\"https://simonwillison.net/tags/generative-ai\">generative-ai</a>, <a href=\"https://simonwillison.net/tags/ai\">ai</a>, <a href=\"https://simonwillison.net/tags/microsoft\">microsoft</a>, <a href=\"https://simonwillison.net/tags/llms\">llms</a>, <a href=\"https://simonwillison.net/tags/training-data\">training-data</a></p>",
"url": "https://simonwillison.net/2026/Jun/2/microsofts-new-models/#atom-everything",
"published": "2026-06-02T22:21:52.000Z",
"updated": "2026-06-02T22:21:52.000Z",
"content": null,
"image": null,
"media": [],
"authors": [
{
"name": "Simon Willison",
"email": null,
"url": null
}
],
"categories": [
{
"label": "llm-release",
"term": "llm-release",
"url": null
},
{
"label": "generative-ai",
"term": "generative-ai",
"url": null
},
{
"label": "ai",
"term": "ai",
"url": null
},
{
"label": "microsoft",
"term": "microsoft",
"url": null
},
{
"label": "llms",
"term": "llms",
"url": null
},
{
"label": "training-data",
"term": "training-data",
"url": null
}
]
},
{
"id": "https://simonwillison.net/2026/Jun/2/datasette-agent-micropython/#atom-everything",
"title": "datasette-agent-micropython 0.1a0",
"description": "<p><strong>Release:</strong> <a href=\"https://github.com/datasette/datasette-agent-micropython/releases/tag/0.1a0\">datasette-agent-micropython 0.1a0</a></p>\n <p>I want <a href=\"https://agent.datasette.io\">Datasette Agent</a> to be able to generate and execute Python code safely. This alpha is looking promising so far. GPT-5.5 has so far failed to break out of the sandbox!</p>\n \n \n <p>Tags: <a href=\"https://simonwillison.net/tags/python\">python</a>, <a href=\"https://simonwillison.net/tags/sandboxing\">sandboxing</a>, <a href=\"https://simonwillison.net/tags/datasette\">datasette</a>, <a href=\"https://simonwillison.net/tags/webassembly\">webassembly</a>, <a href=\"https://simonwillison.net/tags/datasette-agent\">datasette-agent</a>, <a href=\"https://simonwillison.net/tags/micropython\">micropython</a></p>",
"url": "https://simonwillison.net/2026/Jun/2/datasette-agent-micropython/#atom-everything",
"published": "2026-06-02T19:28:36.000Z",
"updated": "2026-06-02T19:28:36.000Z",
"content": null,
"image": null,
"media": [],
"authors": [
{
"name": "Simon Willison",
"email": null,
"url": null
}
],
"categories": [
{
"label": "python",
"term": "python",
"url": null
},
{
"label": "sandboxing",
"term": "sandboxing",
"url": null
},
{
"label": "datasette",
"term": "datasette",
"url": null
},
{
"label": "webassembly",
"term": "webassembly",
"url": null
},
{
"label": "datasette-agent",
"term": "datasette-agent",
"url": null
},
{
"label": "micropython",
"term": "micropython",
"url": null
}
]
},
{
"id": "https://simonwillison.net/2026/Jun/2/micropython-wasm/#atom-everything",
"title": "micropython-wasm 0.1a1",
"description": "<p><strong>Release:</strong> <a href=\"https://github.com/simonw/micropython-wasm/releases/tag/0.1a1\">micropython-wasm 0.1a1</a></p>\n <p>Fixes for some limitations that emerged while I was trying to use this to build <code>datasette-agent-micropython</code>.</p>\n \n \n <p>Tags: <a href=\"https://simonwillison.net/tags/python\">python</a>, <a href=\"https://simonwillison.net/tags/sandboxing\">sandboxing</a>, <a href=\"https://simonwillison.net/tags/webassembly\">webassembly</a>, <a href=\"https://simonwillison.net/tags/micropython\">micropython</a></p>",
"url": "https://simonwillison.net/2026/Jun/2/micropython-wasm/#atom-everything",
"published": "2026-06-02T19:20:47.000Z",
"updated": "2026-06-02T19:20:47.000Z",
"content": null,
"image": null,
"media": [],
"authors": [
{
"name": "Simon Willison",
"email": null,
"url": null
}
],
"categories": [
{
"label": "python",
"term": "python",
"url": null
},
{
"label": "sandboxing",
"term": "sandboxing",
"url": null
},
{
"label": "webassembly",
"term": "webassembly",
"url": null
},
{
"label": "micropython",
"term": "micropython",
"url": null
}
]
},
{
"id": "https://simonwillison.net/2026/Jun/2/sighting-367841339/#atom-everything",
"title": "California Brown Pelican",
"description": "<p><img src=\"https://static.inaturalist.org/photos/671786719/large.jpg\" alt=\"California Brown Pelican\"></p><p>California Brown Pelican, in Fort Mason, CA, US</p><p>I'm at the <a href=\"https://build.microsoft.com/\">Microsoft Build</a> conference today, held at <a href=\"https://en.wikipedia.org/wiki/Fort_Mason\">Fort Mason</a> in San Francisco. There are California Brown Pelicans diving into the water directly behind venue!</p>\n \n \n <p>Tags: <a href=\"https://simonwillison.net/tags/microsoft\">microsoft</a>, <a href=\"https://simonwillison.net/tags/ai\">ai</a>, <a href=\"https://simonwillison.net/tags/generative-ai\">generative-ai</a>, <a href=\"https://simonwillison.net/tags/llms\">llms</a>, <a href=\"https://simonwillison.net/tags/llm-release\">llm-release</a></p>",
"url": "https://simonwillison.net/2026/Jun/2/sighting-367841339/#atom-everything",
"published": "2026-06-02T18:17:03.000Z",
"updated": "2026-06-02T18:17:03.000Z",
"content": null,
"image": null,
"media": [],
"authors": [
{
"name": "Simon Willison",
"email": null,
"url": null
}
],
"categories": [
{
"label": "microsoft",
"term": "microsoft",
"url": null
},
{
"label": "ai",
"term": "ai",
"url": null
},
{
"label": "generative-ai",
"term": "generative-ai",
"url": null
},
{
"label": "llms",
"term": "llms",
"url": null
},
{
"label": "llm-release",
"term": "llm-release",
"url": null
}
]
},
{
"id": "https://simonwillison.net/2026/Jun/2/pasted-file-editor/#atom-everything",
"title": "Pasted File Editor",
"description": "<p><strong>Tool:</strong> <a href=\"https://tools.simonwillison.net/pasted-file-editor\">Pasted File Editor</a></p>\n <p>I really like how you can paste a large volume of text into <a href=\"https://claude.ail\">claude.ai</a> (or the Claude desktop/mobile apps) and it will detect it as a large paste and turn it into a file attachment instead.</p>\n<p>I decided to have Codex desktop <a href=\"https://gist.github.com/simonw/74c79119b487a5acce18b4dcc26b9f79\">build me a version of that</a> as a prototype.</p>\n<p>You can also open files directly - including images which will be shown as thumbnails - or drag files onto the textarea.</p>\n \n \n <p>Tags: <a href=\"https://simonwillison.net/tags/javascript\">javascript</a>, <a href=\"https://simonwillison.net/tags/tools\">tools</a>, <a href=\"https://simonwillison.net/tags/ai-assisted-programming\">ai-assisted-programming</a>, <a href=\"https://simonwillison.net/tags/claude\">claude</a>, <a href=\"https://simonwillison.net/tags/codex\">codex</a></p>",
"url": "https://simonwillison.net/2026/Jun/2/pasted-file-editor/#atom-everything",
"published": "2026-06-02T04:13:36.000Z",
"updated": "2026-06-02T04:13:36.000Z",
"content": null,
"image": null,
"media": [],
"authors": [
{
"name": "Simon Willison",
"email": null,
"url": null
}
],
"categories": [
{
"label": "javascript",
"term": "javascript",
"url": null
},
{
"label": "tools",
"term": "tools",
"url": null
},
{
"label": "ai-assisted-programming",
"term": "ai-assisted-programming",
"url": null
},
{
"label": "claude",
"term": "claude",
"url": null
},
{
"label": "codex",
"term": "codex",
"url": null
}
]
},
{
"id": "https://simonwillison.net/2026/Jun/2/micropython-wasm-2/#atom-everything",
"title": "micropython-wasm 0.1a0",
"description": "<p><strong>Release:</strong> <a href=\"https://github.com/simonw/micropython-wasm/releases/tag/0.1a0\">micropython-wasm 0.1a0</a></p>\n <p>My latest sandboxing experiment: This alpha package bundles a lightly customized WASM build of <a href=\"https://micropython.org/\">MicroPython</a> with a wrapper to execute code in it via <a href=\"https://wasmtime.dev/\">wasmtime</a>.</p>\n \n \n <p>Tags: <a href=\"https://simonwillison.net/tags/python\">python</a>, <a href=\"https://simonwillison.net/tags/sandboxing\">sandboxing</a>, <a href=\"https://simonwillison.net/tags/webassembly\">webassembly</a>, <a href=\"https://simonwillison.net/tags/micropython\">micropython</a></p>",
"url": "https://simonwillison.net/2026/Jun/2/micropython-wasm-2/#atom-everything",
"published": "2026-06-02T03:43:45.000Z",
"updated": "2026-06-02T03:43:45.000Z",
"content": null,
"image": null,
"media": [],
"authors": [
{
"name": "Simon Willison",
"email": null,
"url": null
}
],
"categories": [
{
"label": "python",
"term": "python",
"url": null
},
{
"label": "sandboxing",
"term": "sandboxing",
"url": null
},
{
"label": "webassembly",
"term": "webassembly",
"url": null
},
{
"label": "micropython",
"term": "micropython",
"url": null
}
]
}
]
}