Analysis of https://buttondown.email/hillelwayne/rss

Feed fetched in 385 ms.
Warning Feed URL redirected to https://buttondown.com/hillelwayne/rss.
Warning Content type is application/rss+xml; charset=utf-8, not text/xml or applicaton/xml.
Feed is 298,444 characters long.
Warning Feed is missing an ETag.
Feed has a last modified date of Tue, 10 Mar 2026 17:12:30 GMT.
Feed is well-formed XML.
Warning Feed has no styling.
This is an RSS feed.
Feed title: Computer Things
Feed self link matches feed URL.
Warning Feed is missing an image.
Feed has 30 items.
First item published on 2026-03-10T17:12:30.000Z
Last item published on 2025-06-05T14:59:11.000Z
All items have published dates.
Newest item was published on 2026-03-10T17:12:30.000Z.
Home page URL: https://buttondown.com/hillelwayne
Error Home page does not have a matching feed discovery link in the <head>.

1 feed links in <head>
  • https://buttondown.com/hillelwayne/rss

  • Error Home page does not have a link to the feed in the <body>.

    Formatted XML
    <?xml version="1.0" encoding="utf-8"?>
    <rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom">
        <channel>
            <title>Computer Things</title>
            <link>https://buttondown.com/hillelwayne</link>
            <description>&lt;!-- buttondown-editor-mode: fancy --&gt;&lt;p&gt;Hi, I'm Hillel. This is the newsletter version of &lt;a target="_blank" rel="noopener noreferrer nofollow" href="https://www.hillelwayne.com"&gt;my website&lt;/a&gt;. I post all website updates here. I also post weekly content just for the newsletter, on topics like&lt;/p&gt;&lt;ul&gt;&lt;li&gt;&lt;p&gt;Formal Methods&lt;/p&gt;&lt;/li&gt;&lt;li&gt;&lt;p&gt;Software History and Culture&lt;/p&gt;&lt;/li&gt;&lt;li&gt;&lt;p&gt;Fringetech and exotic tooling&lt;/p&gt;&lt;/li&gt;&lt;li&gt;&lt;p&gt;The philosophy and theory of software engineering&lt;/p&gt;&lt;/li&gt;&lt;/ul&gt;&lt;p&gt;You can see the archive of all public essays &lt;a target="_blank" rel="noopener noreferrer nofollow" href="https://buttondown.email/hillelwayne/archive/"&gt;here&lt;/a&gt;.&lt;/p&gt;</description>
            <atom:link href="https://buttondown.email/hillelwayne/rss" rel="self"/>
            <language>en-us</language>
            <lastBuildDate>Tue, 10 Mar 2026 17:12:30 +0000</lastBuildDate>
            <item>
                <title>LLMs are bad at vibing specifications</title>
                <link>https://buttondown.com/hillelwayne/archive/llms-are-bad-at-vibing-specifications/</link>
                <description>&lt;h3&gt;No newsletter next week&lt;/h3&gt;
    &lt;p&gt;I'll be speaking at &lt;a href="https://qconlondon.com/" target="_blank"&gt;InfoQ London&lt;/a&gt;. But see below for a book giveaway!&lt;/p&gt;
    &lt;hr /&gt;
    &lt;h1&gt;LLMs are bad at vibing specifications&lt;/h1&gt;
    &lt;p&gt;About a year ago I wrote &lt;a href="https://buttondown.com/hillelwayne/archive/ai-is-a-gamechanger-for-tla-users/" target="_blank"&gt;AI is a gamechanger for TLA+ users&lt;/a&gt;, which argued that AI are a "specification force multiplier". That was written from the perspective an TLA+ expert using these tools. A full &lt;a href="https://github.com/search?q=path%3A*.tla+NOT+is%3Afork+claude&amp;amp;type=code" target="_blank"&gt;4% of Github TLA+ specs&lt;/a&gt; now have the word "Claude" somewhere in them. This is interesting to me, because it suggests there was always an interest in formal methods, people just lacked the skills to do it.  &lt;/p&gt;
    &lt;p&gt;It's also interesting because it gives me a sense of what happens when beginners use AI to write formal specs. It's not good.&lt;/p&gt;
    &lt;p&gt;As a case study, we'll use &lt;a href="https://github.com/myProjectsRavi/sentinel-protocol/tree/main/docs/formal/specs" target="_blank"&gt;this project&lt;/a&gt;, which is kind of enough to have vibed out TLA+ and Alloy specs.&lt;/p&gt;
    &lt;h3&gt;Looking at a project&lt;/h3&gt;
    &lt;p&gt;&lt;a href="https://github.com/myProjectsRavi/sentinel-protocol/blob/main/docs/formal/specs/threat-intel-mesh.als" target="_blank"&gt;Starting with the Alloy spec&lt;/a&gt;. Here it is in its entirety:&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;module ThreatIntelMesh
    
    sig Node {}
    
    one sig LocalNode extends Node {}
    
    sig Snapshot {
      owner: one Node,
      signed: one Bool,
      signatures: set Signature
    }
    
    sig Signature {}
    
    sig Policy {
      allowUnsignedImport: one Bool
    }
    
    pred canImport[p: Policy, s: Snapshot] {
      (p.allowUnsignedImport = True) or (s.signed = True)
    }
    
    assert UnsignedImportMustBeDenied {
      all p: Policy, s: Snapshot |
        p.allowUnsignedImport = False and s.signed = False implies not canImport[p, s]
    }
    
    assert SignedImportMayBeAccepted {
      all p: Policy, s: Snapshot |
        s.signed = True implies canImport[p, s]
    }
    
    check UnsignedImportMustBeDenied for 5
    check SignedImportMayBeAccepted for 5
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    
    &lt;p class="empty-line" style="height:16px; margin:0px !important;"&gt;&lt;/p&gt;
    &lt;p&gt;Couple of things to note here: first of all, this doesn't actually compile. It's using the &lt;a href="https://alloy.readthedocs.io/en/latest/modules/boolean.html" target="_blank"&gt;Boolean&lt;/a&gt; standard module so needs &lt;code&gt;open util/boolean&lt;/code&gt; to function. Second, Boolean is the wrong approach here; you're supposed to use subtyping. &lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;sig Snapshot {
    &lt;span class="w"&gt; &lt;/span&gt; owner: one Node,
    &lt;span class="gd"&gt;- signed: one Bool,&lt;/span&gt;
    &lt;span class="w"&gt; &lt;/span&gt; signatures: set Signature
    }
    
    &lt;span class="gi"&gt;+ sig SignedSnapshot in Snapshot {}&lt;/span&gt;
    
    
    pred canImport[p: Policy, s: Snapshot] {
    &lt;span class="gd"&gt;- s.signed = True&lt;/span&gt;
    &lt;span class="gi"&gt;+ s in SignedSnapshot&lt;/span&gt;
    }
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    
    &lt;p&gt;So we know the person did not actually run these specs. This is &lt;em&gt;somewhat&lt;/em&gt; less of a problem in TLA+, which has an official MCP server that lets the agent run model checking. Even so, I regularly see specs that I'm pretty sure won't model check, with things like using &lt;code&gt;Reals&lt;/code&gt; or assuming &lt;code&gt;NULL&lt;/code&gt; is a built-in and not a user-defined constant.&lt;/p&gt;
    &lt;p&gt;The bigger problem with the spec is that &lt;code&gt;UnsignedImportMustBeDenied&lt;/code&gt; and &lt;code&gt;SignedImportMayBeAccepted&lt;/code&gt; &lt;em&gt;don't actually do anything&lt;/em&gt;. &lt;code&gt;canImport&lt;/code&gt; is defined as &lt;code&gt;P || Q&lt;/code&gt;. &lt;code&gt;UnsignedImportMustBeDenied&lt;/code&gt; checks that &lt;code&gt;!P &amp;amp;&amp;amp; !Q =&amp;gt; !canImport&lt;/code&gt;. &lt;code&gt;SignedImportMayBeAccepted&lt;/code&gt; checks that &lt;code&gt;P =&amp;gt; canImport&lt;/code&gt;. These are tautologically true! If they do anything at all, it is only checking that &lt;code&gt;canImport&lt;/code&gt; was defined correctly. &lt;/p&gt;
    &lt;p&gt;You see the same thing in the &lt;a href="https://github.com/myProjectsRavi/sentinel-protocol/blob/main/docs/formal/specs/serialization-firewall.tla" target="_blank"&gt;TLA+ specs&lt;/a&gt;, too:&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;GadgetPayload ==
      /\ gadgetDetected&amp;#39; = TRUE
      /\ depth&amp;#39; \in 0..(MaxDepth + 5)
      /\ UNCHANGED allowlistedFormat
      /\ decision&amp;#39; = &amp;quot;block&amp;quot;
    
    NoExploitAllowed == gadgetDetected =&amp;gt; decision = &amp;quot;block&amp;quot;
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    
    &lt;p class="empty-line" style="height:16px; margin:0px !important;"&gt;&lt;/p&gt;
    &lt;p&gt;The AI is only writing "obvious properties", which fail for reasons like "we missed a guard clause" or "we forgot to update a variable". It does not seem to be good at writing "subtle" properties that fail due to concurrency, nondeterminism, or bad behavior separated by several steps. Obvious properties are useful for orienting yourself and ensuring the system behaves like you expect, but the actual value in using formal methods comes from the subtle properties. &lt;/p&gt;
    &lt;p&gt;(This ties into &lt;a href="https://buttondown.com/hillelwayne/archive/some-tests-are-stronger-than-others/" target="_blank"&gt;Strong and Weak Properties&lt;/a&gt;. LLM properties are weak, intended properties need to be strong.)&lt;/p&gt;
    &lt;p&gt;This is a problem I see in almost every FM spec written by AI. LLMs aren't doing one of the core features of a spec. Articles like &lt;a href="https://martin.kleppmann.com/2025/12/08/ai-formal-verification.html" target="_blank"&gt;Prediction: AI will make formal verification go mainstream&lt;/a&gt; and &lt;a href="https://leodemoura.github.io/blog/2026/02/28/when-ai-writes-the-worlds-software.html" target="_blank"&gt;When AI Writes the World's Software, Who Verifies It?&lt;/a&gt; argue that LLMs will make formal methods go mainstream, but being easily able to write specifications doesn't help with correctness if the specs don't actually verify anything.&lt;/p&gt;
    &lt;h3&gt;Is this a user error?&lt;/h3&gt;
    &lt;div class="subscribe-form"&gt;&lt;/div&gt;
    &lt;p&gt;I first got interested in LLMs and TLA+ from &lt;a href="https://zfhuang99.github.io/github%20copilot/formal%20verification/tla+/2025/05/24/ai-revolution-in-distributed-systems.html" target="_blank"&gt;The Coming AI Revolution in Distributed Systems&lt;/a&gt;. The author of that later &lt;a href="https://github.com/zfhuang99/lamport-agent/blob/main/spec/CRAQ/CRAQ.tla" target="_blank"&gt;vibecoded a spec&lt;/a&gt; with a considerably more complex property:&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;NoStaleStrictRead ==
      \A i \in 1..Len(eventLog) :
        LET ev == eventLog[i] IN
          ev.type = &amp;quot;read&amp;quot; =&amp;gt;
            LET c == ev.chunk IN
            LET v == ev.version IN
            /\ \A j \in 1..i :
                 LET evC == eventLog[j] IN
                   evC.type = &amp;quot;commit&amp;quot; /\ evC.chunk = c =&amp;gt; evC.version &amp;lt;= v
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    
    &lt;p&gt;This is a lot more complicated than the &lt;code&gt;(P =&amp;gt; Q &amp;amp;&amp;amp; P) =&amp;gt; Q&lt;/code&gt; properties I've seen! It could be because &lt;a href="https://github.com/deepseek-ai/3FS/tree/main/specs/DataStorage" target="_blank"&gt;the corresponding system already had a complete spec written in P&lt;/a&gt;. But it could also be that Cheng Huang is already an expert specifier, meaning he can get more out of an LLM than an ordinary developer can. I've also noticed that I can usually coax an LLM to do more interesting things than most of my clients can. Which is good for my current livelihood, but bad for the hope of LLMs making formal methods mainstream. If you need to know formal methods to get the LLM to do formal methods, is that really helping?&lt;/p&gt;
    &lt;p&gt;(Yes, if it lowers the skill threshold-- means you can apply FM with 20 hours of practice instead of 80. But the jury's still out on how &lt;em&gt;much&lt;/em&gt; it lowers the threshold. What if it only lowers it from 80 to 75?) &lt;/p&gt;
    &lt;p&gt;On the other hand, there also seem to be some properties that AI struggles with, even with explicit instructions. Last week a client and I tried to get Claude to generate a good &lt;a href="https://www.hillelwayne.com/post/safety-and-liveness/" target="_blank"&gt;liveness&lt;/a&gt; or &lt;a href="https://www.hillelwayne.com/post/action-properties/" target="_blank"&gt;action&lt;/a&gt; property instead of a standard obvious invariant, and it just couldn't. Training data issue? Something in the innate complexity of liveness? It's not clear yet. These properties are even more "subtle" than most invariants, so maybe that's it.&lt;/p&gt;
    &lt;p&gt;On the other other hand, this is all as of March 2026. Maybe this whole article will be laughably obsolete by June. &lt;/p&gt;
    &lt;hr /&gt;
    &lt;h3&gt;&lt;a href="https://logicforprogrammers.com" target="_blank"&gt;Logic for Programmers&lt;/a&gt; Giveaway&lt;/h3&gt;
    &lt;p&gt;Last week's giveaway raised a few issues. First, the New World copies were all taken before all of the emails went out, so a lot of people did not even get a chance to try for a book. Second, due to a Leanpub bug the Europe coupon scheduled for 10 AM UTC actually activated at 10 AM my time, which was early evening for Europe. Third, everybody in the APAC region got left out.&lt;/p&gt;
    &lt;p&gt;So, since I'm not doing a newsletter next week, let's have another giveaway:&lt;/p&gt;
    &lt;ul&gt;
    &lt;li&gt;&lt;a href="https://leanpub.com/logic/c/E5A55F7B482C3" target="_blank"&gt;This coupon&lt;/a&gt; will go up 2026-03-16 at 11:00 UTC, which should be noon Central European Time, and be good for ten books (five for this giveaway, five to account for last week's bug).&lt;/li&gt;
    &lt;li&gt;&lt;a href="https://leanpub.com/logic/c/ADC664C95B6D1" target="_blank"&gt;This coupon&lt;/a&gt; will go up 2026-03-17 at 04:00 UTC, which should be noon Beijing Time, and be good for five books.&lt;/li&gt;
    &lt;li&gt;&lt;a href="https://leanpub.com/logic/c/U1250212A9070" target="_blank"&gt;This coupon&lt;/a&gt; will go up 2026-03-17 at 17:00 UTC, which should be noon Central US Time, and also be good for five books.&lt;/li&gt;
    &lt;/ul&gt;
    &lt;p&gt;I think that gives the best chance of everybody getting at least a chance of a book, while being resilient to timezone shenanigans due to travel / Leanpub dropping bugfixes / daylight savings / whatever. &lt;/p&gt;
    &lt;p&gt;(No guarantees that later "no newsletter" weeks will have giveaways! This is a gimmick)&lt;/p&gt;</description>
                <pubDate>Tue, 10 Mar 2026 17:12:30 +0000</pubDate>
                <guid>https://buttondown.com/hillelwayne/archive/llms-are-bad-at-vibing-specifications/</guid>
            </item>
            <item>
                <title>Free Books</title>
                <link>https://buttondown.com/hillelwayne/archive/free-books/</link>
                <description>&lt;p&gt;Spinning a &lt;a href="https://www.youtube.com/watch?v=NB4hzg4k7_A" target="_blank"&gt;lot of plates&lt;/a&gt; this week so skipping the newsletter. As an apology, have ten free copies of &lt;em&gt;Logic for Programmers&lt;/em&gt;.&lt;/p&gt;
    &lt;ul&gt;
    &lt;li&gt;&lt;a href="https://leanpub.com/logic/c/EBDFA51B15C1" target="_blank"&gt;These five&lt;/a&gt; are available now.&lt;/li&gt;
    &lt;li&gt;&lt;del&gt;&lt;a href="https://leanpub.com/logic/c/5A55F7B482C3" target="_blank"&gt;These five&lt;/a&gt; &lt;em&gt;should&lt;/em&gt; be available at 10:30 AM CEST tomorrow, so people in Europe have a better chance of nabbing one.&lt;/del&gt; Nevermind Leanpub had a bug that made this not work properly&lt;/li&gt;
    &lt;/ul&gt;</description>
                <pubDate>Tue, 03 Mar 2026 16:34:33 +0000</pubDate>
                <guid>https://buttondown.com/hillelwayne/archive/free-books/</guid>
            </item>
            <item>
                <title>New Blog Post: Some Silly Z3 Scripts I Wrote</title>
                <link>https://buttondown.com/hillelwayne/archive/new-blog-post-some-silly-z3-scripts-i-wrote/</link>
                <description>&lt;p&gt;Now that I'm not spending all my time on Logic for Programmers, I have time to update my website again! So here's the first blog post in five months: &lt;a href="https://www.hillelwayne.com/post/z3-examples/" target="_blank"&gt;Some Silly Z3 Scripts I Wrote&lt;/a&gt;.&lt;/p&gt;
    &lt;p&gt;Normally I'd also put a link to the Patreon notes but I've decided I don't like publishing gated content and am going to wind that whole thing down. So some quick notes about this post:&lt;/p&gt;
    &lt;ul&gt;
    &lt;li&gt;Part of the point is admittedly to hype up the eventual release of LfP. I want to start marketing the book, but don't want the marketing material to be devoid of interest, so tangentially-related-but-independent blog posts are a good place to start.&lt;/li&gt;
    &lt;li&gt;The post discusses the concept of "chaff", the enormous quantity of material (both code samples and prose) that didn't make it into the book. The book is about 50,000 words… and considerably shorter than the total volume of chaff! I don't &lt;em&gt;think&lt;/em&gt; most of it can be turned into useful public posts, but I'm not entirely opposed to the idea. Maybe some of the old chapters could be made into something?&lt;/li&gt;
    &lt;li&gt;Coming up with a conditioned mathematical property to prove was a struggle. I had two candidates: &lt;code&gt;a == b * c =&amp;gt; a / b == c&lt;/code&gt;, which would have required a long tangent on how division must be total in Z3, and  &lt;code&gt;a != 0 =&amp;gt; some b: b * a == 1&lt;/code&gt;, which would have required introducing a quantifier (SMT is real weird about quantifiers). Division by zero has already caused me enough grief so I went with the latter. This did mean I had to reintroduce "operations must be total" when talking about arrays.&lt;/li&gt;
    &lt;li&gt;I have no idea why the array example returns &lt;code&gt;2&lt;/code&gt; for the max profit and not &lt;code&gt;99999999&lt;/code&gt;. I'm guessing there's some short circuiting logic in the optimizer when the problem is ill-defined?&lt;/li&gt;
    &lt;li&gt;One example I could not get working, which is unfortunate, was a demonstration of how SMT solvers are undecidable via encoding Goldbach's conjecture as an SMT problem. Anything with multiple nested quantifiers is a pain.&lt;/li&gt;
    &lt;/ul&gt;</description>
                <pubDate>Mon, 23 Feb 2026 16:49:10 +0000</pubDate>
                <guid>https://buttondown.com/hillelwayne/archive/new-blog-post-some-silly-z3-scripts-i-wrote/</guid>
            </item>
            <item>
                <title>Stream of Consciousness Driven Development</title>
                <link>https://buttondown.com/hillelwayne/archive/stream-of-consciousness-driven-development/</link>
                <description>&lt;p&gt;This is something I just tried out last week but it seems to have enough potential to be worth showing unpolished. I was pairing with a client on writing a spec. I saw a problem with the spec, a convoluted way of fixing the spec. Instead of trying to verbally explain it, I started by creating a new markdown file:&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;NameOfProblem.md
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    
    &lt;p&gt;Then I started typing. First the problem summary, then a detailed description, then the solution and why it worked. When my partner asked questions, I incorporated his question and our discussion of it into the flow. If we hit a dead end with the solution, we marked it out as a dead end. Eventually the file looked something like this:&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;Current state of spec
    Problems caused by this
        Elaboration of problems
        What we tried that didn&amp;#39;t work
    Proposed Solution
        Theory behind proposed solution
        How the solution works
        Expected changes
        Other problems this helps solve
        Problems this does *not* help with
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    
    &lt;p&gt;Only once this was done, my partner fully understood the chain of thought, &lt;em&gt;and&lt;/em&gt; we agreed it represented the right approach, did we start making changes to the spec. &lt;/p&gt;
    &lt;h3&gt;How is this better than just making the change?&lt;/h3&gt;
    &lt;p&gt;The change was &lt;em&gt;conceptually&lt;/em&gt; complex. A rough analogy: imagine pairing with a beginner who wrote an insertion sort, and you want to replace it with quicksort. You need to explain why the insertion sort is too slow, why the quicksort isn't slow, and how quicksort actually correctly sorts a list. This could involve tangents into computational complexity, big-o notation, recursion, etc. These are all concepts you have internalized, so the change is simple to you, but the solution uses concepts the beginner does not know. So it's conceptually complex to them.&lt;/p&gt;
    &lt;p&gt;I wasn't pairing with a beginning programmer or even a beginning specifier. This was a client who could confidently write complex specs on their own. But they don't work on specifications full time like I do. Any time there's a relative gap in experience in a pair, there's solutions that are conceptually simple to one person and complex to the other.&lt;/p&gt;
    &lt;p&gt;I've noticed too often that when one person doesn't fully understand the concepts behind a change, they just go "you're the expert, I trust you." That eventually leads to a totally unmaintainable spec. Hence, writing it all out. &lt;/p&gt;
    &lt;p&gt;As I said before, I've only tried this once (though I've successfully used a similar idea when teaching workshops). It worked pretty well, though! Just be prepared for a lot of typing.&lt;/p&gt;</description>
                <pubDate>Wed, 18 Feb 2026 16:33:08 +0000</pubDate>
                <guid>https://buttondown.com/hillelwayne/archive/stream-of-consciousness-driven-development/</guid>
            </item>
            <item>
                <title>Proving What's Possible</title>
                <link>https://buttondown.com/hillelwayne/archive/proving-whats-possible/</link>
                <description>&lt;p&gt;As a formal methods consultant I have to mathematically express properties of systems. I generally do this with two "temporal operators": &lt;/p&gt;
    &lt;ul&gt;
    &lt;li&gt;A(x) means that &lt;code&gt;x&lt;/code&gt; is always true. For example, a database table &lt;em&gt;always&lt;/em&gt; satisfies all record-level constraints, and a state machine &lt;em&gt;always&lt;/em&gt; makes valid transitions between states. If &lt;code&gt;x&lt;/code&gt; is a statement about an individual state (as in the database but not state machine example), we further call it an &lt;strong&gt;invariant&lt;/strong&gt;.&lt;/li&gt;
    &lt;li&gt;E(x) means that &lt;code&gt;x&lt;/code&gt; is "eventually" true, conventionally meaning "guaranteed true at some point in the future". A database transaction &lt;em&gt;eventually&lt;/em&gt; completes or rolls back, a state machine &lt;em&gt;eventually&lt;/em&gt; reaches the "done" state, etc. &lt;/li&gt;
    &lt;/ul&gt;
    &lt;p&gt;These come from linear temporal logic, which is the mainstream notation for expressing system properties. &lt;sup id="fnref:modal"&gt;&lt;a class="footnote-ref" href="#fn:modal"&gt;1&lt;/a&gt;&lt;/sup&gt; We like these operators because they elegantly cover &lt;a href="https://www.hillelwayne.com/post/safety-and-liveness/" target="_blank"&gt;safety and liveness properties&lt;/a&gt;, and because &lt;a href="https://buttondown.com/hillelwayne/archive/formalizing-stability-and-resilience-properties/" target="_blank"&gt;we can combine them&lt;/a&gt;. &lt;code&gt;A(E(x))&lt;/code&gt; means &lt;code&gt;x&lt;/code&gt; is true an infinite number of times, while &lt;code&gt;A(x =&amp;gt; E(y)&lt;/code&gt; means that &lt;code&gt;x&lt;/code&gt; being true guarantees &lt;code&gt;y&lt;/code&gt; true in the future. &lt;/p&gt;
    &lt;p&gt;There's a third class of properties, that I will call &lt;em&gt;possibility&lt;/em&gt; properties: &lt;code&gt;P(x)&lt;/code&gt; is "can x happen in this model"? Is it possible for a table to have more than ten records? Can a state machine transition from "Done" to "Retry", even if it &lt;em&gt;doesn't&lt;/em&gt;? Importantly, &lt;code&gt;P(x)&lt;/code&gt; does not need to be possible &lt;em&gt;immediately&lt;/em&gt;, just at some point in the future. It's possible to lose 100 dollars betting on slot machines, even if you only bet one dollar at a time. If &lt;code&gt;x&lt;/code&gt; is a statement about an individual state, we can further call it a &lt;a href="https://en.wikipedia.org/wiki/Reachability" target="_blank"&gt;&lt;em&gt;reachability&lt;/em&gt; property&lt;/a&gt;. I'm going to use the two interchangeably for flow. &lt;/p&gt;
    &lt;p&gt;&lt;code&gt;A(P(x))&lt;/code&gt; says that &lt;code&gt;x&lt;/code&gt; is &lt;em&gt;always&lt;/em&gt; possible. No matter what we've done in our system, we can make &lt;code&gt;x&lt;/code&gt; happen again. There's no way to do this with just &lt;code&gt;A&lt;/code&gt; and &lt;code&gt;E&lt;/code&gt;. Other meaningful combinations include:&lt;/p&gt;
    &lt;ul&gt;
    &lt;li&gt;&lt;code&gt;P(A(x))&lt;/code&gt;: there is a reachable state from which &lt;code&gt;x&lt;/code&gt; is always true.&lt;/li&gt;
    &lt;li&gt;&lt;code&gt;A(x =&amp;gt; P(y))&lt;/code&gt;: &lt;code&gt;y&lt;/code&gt; is possible from any state where &lt;code&gt;x&lt;/code&gt; is true.&lt;/li&gt;
    &lt;li&gt;&lt;code&gt;E(x &amp;amp;&amp;amp; P(y))&lt;/code&gt;: There is always a future state where x is true and y is reachable.&lt;/li&gt;
    &lt;li&gt;&lt;code&gt;A(P(x) =&amp;gt; E(x))&lt;/code&gt;: If &lt;code&gt;x&lt;/code&gt; is ever possible, it will eventually happen.&lt;/li&gt;
    &lt;li&gt;&lt;code&gt;E(P(x))&lt;/code&gt; and &lt;code&gt;P(E(x))&lt;/code&gt; are the same as &lt;code&gt;P(x)&lt;/code&gt;.&lt;/li&gt;
    &lt;/ul&gt;
    &lt;p&gt;See the paper &lt;a href="https://dl.acm.org/doi/epdf/10.1145/567446.567463" target="_blank"&gt;"Sometime" is sometimes "not never"&lt;/a&gt; for a deeper discussion of &lt;code&gt;E&lt;/code&gt; and &lt;code&gt;P&lt;/code&gt;.&lt;/p&gt;
    &lt;h3&gt;The use case&lt;/h3&gt;
    &lt;p&gt;Possibility properties are "something good &lt;em&gt;can&lt;/em&gt; happen", which is generally less useful (&lt;em&gt;in specifications&lt;/em&gt;) than "something bad &lt;em&gt;can't&lt;/em&gt; happen" (safety) and "something good &lt;em&gt;will&lt;/em&gt; happen" (liveness). But it still comes up as an important property! My favorite example:&lt;/p&gt;
    &lt;p&gt;&lt;img alt="A guy who can't shut down his computer because system preferences interrupts shutdown" class="newsletter-image" src="https://www.hillelwayne.com/post/safety-and-liveness/img/tweet2.png" /&gt;&lt;/p&gt;
    &lt;p&gt;The big use I've found for the idea is as a sense-check that we wrote the spec properly. Say I take the property "A worker in the 'Retry' state eventually leaves that state":&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;A(state == &amp;#39;Retry&amp;#39; =&amp;gt; E(state != &amp;#39;Retry&amp;#39;))
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    
    &lt;p&gt;The model checker checks this property and confirms it holds of the spec. Great! Our system is correct! ...Unless the system can never &lt;em&gt;reach&lt;/em&gt; the "Retry" state, in which case the expression is trivially true. I need to verify that 'Retry' is reachable, eg &lt;code&gt;P(state == 'Retry')&lt;/code&gt;. Notice I can't use &lt;code&gt;E&lt;/code&gt; to do this, because I don't want to say "the worker always needs to retry at least once". &lt;/p&gt;
    &lt;h3&gt;It's not supported though&lt;/h3&gt;
    &lt;p&gt;I say "use I've found for &lt;em&gt;the idea&lt;/em&gt;" because the main formalisms I use (Alloy and TLA+) don't natively support &lt;code&gt;P&lt;/code&gt;. &lt;sup id="fnref:tla"&gt;&lt;a class="footnote-ref" href="#fn:tla"&gt;2&lt;/a&gt;&lt;/sup&gt; On top of &lt;code&gt;P&lt;/code&gt; being less useful than &lt;code&gt;A&lt;/code&gt; and &lt;code&gt;E&lt;/code&gt;, simple reachability properties are &lt;a href="https://www.hillelwayne.com/post/software-mimicry/" target="_blank"&gt;mimickable&lt;/a&gt; with A(x). &lt;code&gt;P(x)&lt;/code&gt; &lt;em&gt;passes&lt;/em&gt; whenever &lt;code&gt;A(!x)&lt;/code&gt; &lt;em&gt;fails&lt;/em&gt;, meaning I can verify &lt;code&gt;P(state == 'Retry')&lt;/code&gt; by testing that &lt;code&gt;A(!(state == 'Retry'))&lt;/code&gt; finds a counterexample. We &lt;em&gt;cannot&lt;/em&gt; mimic combined operators this way like &lt;code&gt;A(P(x))&lt;/code&gt; but those are significantly less common than state-reachability. &lt;/p&gt;
    &lt;p&gt;(Also, refinement doesn't preserve possibility properties, but that's a whole other kettle of worms.)&lt;/p&gt;
    &lt;p&gt;The one that's bitten me a little is that we can't mimic "&lt;code&gt;P(x)&lt;/code&gt; from every starting state". "&lt;code&gt;A(!x)&lt;/code&gt;" fails if there's at least one path from one starting state that leads to &lt;code&gt;x&lt;/code&gt;, but other starting states might not make &lt;code&gt;x&lt;/code&gt; possible.&lt;/p&gt;
    &lt;p&gt;I suspect there's also a chicken-and-egg problem here. Since my tools can't verify possibility properties, I'm not used to noticing them in systems. I'd be interested in hearing if anybody works with codebases where possibility properties are important, especially if it's something complex like &lt;code&gt;A(x =&amp;gt; P(y))&lt;/code&gt;.&lt;/p&gt;
    &lt;div class="footnote"&gt;
    &lt;hr /&gt;
    &lt;ol&gt;
    &lt;li id="fn:modal"&gt;
    &lt;p&gt;Instead of &lt;code&gt;A(x)&lt;/code&gt;, the literature uses &lt;code&gt;[]x&lt;/code&gt; or &lt;code&gt;Gx&lt;/code&gt; ("globally x") and instead of &lt;code&gt;E(x)&lt;/code&gt; it uses &lt;code&gt;&amp;lt;&amp;gt;x&lt;/code&gt; or &lt;code&gt;Fx&lt;/code&gt; ("finally x"). I'm using A and E because this isn't teaching material.&amp;#160;&lt;a class="footnote-backref" href="#fnref:modal" title="Jump back to footnote 1 in the text"&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id="fn:tla"&gt;
    &lt;p&gt;There's &lt;a href="https://github.com/tlaplus/tlaplus/issues/860" target="_blank"&gt;some discussion to add it to TLA+, though&lt;/a&gt;.&amp;#160;&lt;a class="footnote-backref" href="#fnref:tla" title="Jump back to footnote 2 in the text"&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;/ol&gt;
    &lt;/div&gt;</description>
                <pubDate>Wed, 11 Feb 2026 18:36:53 +0000</pubDate>
                <guid>https://buttondown.com/hillelwayne/archive/proving-whats-possible/</guid>
            </item>
            <item>
                <title>Logic for Programmers New Release and Next Steps</title>
                <link>https://buttondown.com/hillelwayne/archive/logic-for-programmers-new-release-and-next-steps/</link>
                <description>&lt;p&gt;&lt;img alt="cover.jpg" class="newsletter-image" src="https://assets.buttondown.email/images/f821145f-d310-403c-88f4-327758a66606.jpg?w=480&amp;amp;fit=max" /&gt;&lt;/p&gt;
    &lt;p&gt;It's taken four months, but the next release of &lt;a href="https://logicforprogrammers.com" target="_blank"&gt;Logic for Programmers is now available&lt;/a&gt;! v0.13 is over 50,000 words, making it both 20% larger than v0.12 and officially the longest thing I have ever written.&lt;sup id="fnref:longest"&gt;&lt;a class="footnote-ref" href="#fn:longest"&gt;1&lt;/a&gt;&lt;/sup&gt; Full release notes are &lt;a href="https://github.com/logicforprogrammers/book-assets/blob/master/CHANGELOG.md" target="_blank"&gt;here&lt;/a&gt;, but I'll talk a bit about the biggest changes. &lt;/p&gt;
    &lt;p&gt;For one, every chapter has been rewritten. Every single one. They span from &lt;em&gt;relatively&lt;/em&gt; minor changes to complete chapter rewrites. After some rough git diffing, I think I deleted about 11,000 words?&lt;sup id="fnref:gross-additions"&gt;&lt;a class="footnote-ref" href="#fn:gross-additions"&gt;2&lt;/a&gt;&lt;/sup&gt; The biggest change is probably to the Alloy chapter. After many sleepless nights, I realized the right approach wasn't to teach Alloy as a &lt;em&gt;data modeling&lt;/em&gt; tool but to teach it as a &lt;em&gt;domain modeling&lt;/em&gt; tool. Which technically means the book no longer covers data modeling.&lt;/p&gt;
    &lt;p&gt;There's also a lot more connections between the chapters. The introductory math chapter, for example, foreshadows how each bit of math will be used in the future techniques. I also put more emphasis on the general "themes" like the expressiveness-guarantees tradeoff (working title). One theme I'm really excited about is compatibility (extremely working title). It turns out that the &lt;a href="https://buttondown.com/hillelwayne/archive/the-liskov-substitution-principle-does-more-than/" target="_blank"&gt;Liskov substitution principle&lt;/a&gt;/subtyping in general, &lt;a href="https://buttondown.com/hillelwayne/archive/refinement-without-specification/" target="_blank"&gt;database migrations&lt;/a&gt;, backwards-compatible API changes, and &lt;a href="https://hillelwayne.com/post/refinement/" target="_blank"&gt;specification refinement&lt;/a&gt; all follow &lt;em&gt;basically&lt;/em&gt; the same general principles. I'm calling this "compatibility" for now but prolly need a better name.&lt;/p&gt;
    &lt;p&gt;Finally, there's just a lot more new topics in the various chapters. &lt;code&gt;Testing&lt;/code&gt; properly covers structural and metamorphic properties. &lt;code&gt;Proofs&lt;/code&gt; covers proof by induction and proving recursive functions (in an exercise). &lt;code&gt;Logic Programming&lt;/code&gt; now finally has a section on answer set programming. You get the picture.&lt;/p&gt;
    &lt;h3&gt;Next Steps&lt;/h3&gt;
    &lt;p&gt;There's a lot I still want to add to the book: proper data modeling, data structures, type theory, model-based testing, etc. But I've added new material for two year, and if I keep going it will never get done. So with this release, all the content is in!&lt;/p&gt;
    &lt;p&gt;Just like all the content was in &lt;a href="https://buttondown.com/hillelwayne/archive/five-unusual-raku-features/" target="_blank"&gt;two Novembers ago&lt;/a&gt; and &lt;a href="https://buttondown.com/hillelwayne/archive/logic-for-programmers-project-update/" target="_blank"&gt;two Januaries ago&lt;/a&gt; and &lt;a href="https://buttondown.com/hillelwayne/archive/logic-for-programmers-turns-one/" target="_blank"&gt;last July&lt;/a&gt;. To make it absolutely 100% for sure that I won't be tempted to add anything else, I passed the whole manuscript over to a copy editor. So if I write more, it won't get edits. That's a pretty good incentive to stop.&lt;/p&gt;
    &lt;p&gt;I also need to find a technical reviewer and proofreader. Once all three phases are done then it's "just" a matter of fixing the layout and finding a good printer. I don't know what the timeline looks like but I really want to have something I can hold in my hands before the summer.&lt;/p&gt;
    &lt;p&gt;(I also need to get notable-people testimonials. Hampered a little in this because I'm trying real hard not to quid-pro-quo, so I'd like to avoid anybody who helped me or is mentioned in the book. And given I tapped most of my network to help me... I've got some ideas though!)&lt;/p&gt;
    &lt;p&gt;There's still a lot of work ahead. Even so, for the first time in two years I don't have research to do or sections to write and it feels so crazy. Maybe I'll update my blog again! Maybe I'll run a workshop! Maybe I'll go outside if Chicago ever gets above 6°F! &lt;/p&gt;
    &lt;hr /&gt;
    &lt;h2&gt;Conference Season&lt;/h2&gt;
    &lt;p&gt;After a pretty slow 2025, the 2026 conference season is looking to be pretty busy! Here's where I'm speaking so far:&lt;/p&gt;
    &lt;ul&gt;
    &lt;li&gt;&lt;a href="https://qconlondon.com/" target="_blank"&gt;QCon London&lt;/a&gt;, March 16-19&lt;/li&gt;
    &lt;li&gt;&lt;a href="https://craft-conf.com/2026" target="_blank"&gt;Craft Conference&lt;/a&gt;, Budapest, June 4-5&lt;/li&gt;
    &lt;li&gt;&lt;a href="https://softwareshould.work/" target="_blank"&gt;Software Should Work&lt;/a&gt;, Missouri, July 16-17&lt;/li&gt;
    &lt;li&gt;&lt;a href="https://hfpug.org/" target="_blank"&gt;Houston Functional Programmers&lt;/a&gt;, Virtual, December 3&lt;/li&gt;
    &lt;/ul&gt;
    &lt;p&gt;For the first three I'm giving variations of my talk "How to find bugs in systems that don't exist", which I gave last year at &lt;a href="https://systemsdistributed.com/" target="_blank"&gt;Systems Distributed&lt;/a&gt;. Last one will ideally be a talk based on LfP. &lt;/p&gt;
    &lt;div class="footnote"&gt;
    &lt;hr /&gt;
    &lt;ol&gt;
    &lt;li id="fn:longest"&gt;
    &lt;p&gt;The second longest was my 2003 NaNoWriMo. The third longest was &lt;em&gt;Practical TLA+&lt;/em&gt;.&amp;#160;&lt;a class="footnote-backref" href="#fnref:longest" title="Jump back to footnote 1 in the text"&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id="fn:gross-additions"&gt;
    &lt;p&gt;This means I must have written 20,000 words total. For comparison, the v0.1 release was 19,000 words.&amp;#160;&lt;a class="footnote-backref" href="#fnref:gross-additions" title="Jump back to footnote 2 in the text"&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;/ol&gt;
    &lt;/div&gt;</description>
                <pubDate>Wed, 04 Feb 2026 14:00:00 +0000</pubDate>
                <guid>https://buttondown.com/hillelwayne/archive/logic-for-programmers-new-release-and-next-steps/</guid>
            </item>
            <item>
                <title>Refinement without Specification</title>
                <link>https://buttondown.com/hillelwayne/archive/refinement-without-specification/</link>
                <description>&lt;p&gt;Imagine we have a SQL database with a &lt;code&gt;user&lt;/code&gt; table, and users have a non-nullable &lt;code&gt;is_activated&lt;/code&gt; boolean column. Having read &lt;a href="https://ntietz.com/blog/that-boolean-should-probably-be-something-else/" target="_blank"&gt;That Boolean Should Probably Be Something else&lt;/a&gt;, you decide to migrate it to a nullable &lt;code&gt;activated_at&lt;/code&gt; column. You can change any of the SQL queries that read/update the &lt;code&gt;user&lt;/code&gt; table but not any of the code that uses the results of these queries. Can we make this change in a way that preserves all external properties? &lt;/p&gt;
    &lt;p&gt;Yes. If an update would set &lt;code&gt;is_activated&lt;/code&gt; to true, instead set it to the current date. Now define the &lt;strong&gt;refinement mapping&lt;/strong&gt; that takes a &lt;code&gt;new_user&lt;/code&gt; and returns an &lt;code&gt;old_user&lt;/code&gt;. All columns will be unchanged &lt;em&gt;except&lt;/em&gt; &lt;code&gt;is_activated&lt;/code&gt;, which will be&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;f(new_user).is_activated = 
        if new_user.activated_at == NULL 
        then FALSE
        else TRUE
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    
    &lt;p&gt;Now new code can use &lt;code&gt;new_user&lt;/code&gt; directly while legacy code can use &lt;code&gt;f(new_user)&lt;/code&gt; instead, which will behave indistinguishably from the &lt;code&gt;old_user&lt;/code&gt;. &lt;/p&gt;
    &lt;p&gt;A little more time passes and you decide to switch to an &lt;a href="https://martinfowler.com/eaaDev/EventSourcing.html" target="_blank"&gt;event sourcing&lt;/a&gt;-like model. So instead of an &lt;code&gt;activated_at&lt;/code&gt; column, you have a &lt;code&gt;user_events&lt;/code&gt; table, where every record is &lt;code&gt;(user_id, timestamp, event)&lt;/code&gt;. So adding an &lt;code&gt;activate&lt;/code&gt; event will activate the user, adding a &lt;code&gt;deactivate&lt;/code&gt; event will deactivate the user. Once again, we can update the queries but not any of the code that uses the results of these queries. Can we make a change that preserves all external properties?&lt;/p&gt;
    &lt;p&gt;Yes. If an update would change &lt;code&gt;is_activated&lt;/code&gt;, instead have it add an appropriate record to the event table. Now, define the refinement mapping that takes &lt;code&gt;newer_user&lt;/code&gt; and returns &lt;code&gt;new_user&lt;/code&gt;. The &lt;code&gt;activated_at&lt;/code&gt; field will be computed like this:&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;g(newer_user).activated_at =
            # last_activated_event
        let lae = 
                newer_user.events
                          .filter(event = &amp;quot;activate&amp;quot; | &amp;quot;deactivate&amp;quot;)
                          .last,
        in
            if lae.event == &amp;quot;activate&amp;quot; 
            then lae.timestamp
            else NULL
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    
    &lt;p class="empty-line" style="height:16px; margin:0px !important;"&gt;&lt;/p&gt;
    &lt;p&gt;Now new code can use &lt;code&gt;newer_user&lt;/code&gt; directly while old code can use &lt;code&gt;g(newer_user)&lt;/code&gt; and the really old code can use &lt;code&gt;f(g(newer_user))&lt;/code&gt;.&lt;/p&gt;
    &lt;h3&gt;Mutability constraints&lt;/h3&gt;
    &lt;div class="subscribe-form"&gt;&lt;/div&gt;
    &lt;p&gt;I said "these preserve all external properties" and that was a lie. It depends on the properties we explicitly have, and I didn't list any. The real interesting properties for me are mutability constraints on how the system can evolve. So let's go back in time and add a constraint to &lt;code&gt;user&lt;/code&gt;:&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;C1(u) = u.is_activated =&amp;gt; u.is_activated&amp;#39;
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    
    &lt;p&gt;This constraint means that if a user is activated, any change will preserve its activated-ness. This means a user can go from deactivated to activated but not the other way. It's not a particular good constraint but it's good enough for teaching purposes. Such a SQL constraint can be enforced with &lt;a href="https://www.postgresql.org/docs/current/sql-createeventtrigger.html" target="_blank"&gt;triggers&lt;/a&gt;. &lt;/p&gt;
    &lt;p&gt;Now we can throw a constraint on &lt;code&gt;new_user&lt;/code&gt;:&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;C2(nu) = nu.activated_at != NULL =&amp;gt; nu.activated_at&amp;#39; != NULL
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    
    &lt;p&gt;If &lt;code&gt;nu&lt;/code&gt; satisfies &lt;code&gt;C2&lt;/code&gt;, then &lt;code&gt;f(nu)&lt;/code&gt; satisfies &lt;code&gt;C1&lt;/code&gt;. So the refinement still holds.&lt;/p&gt;
    &lt;p&gt;With &lt;code&gt;newer_u&lt;/code&gt;, we &lt;em&gt;cannot&lt;/em&gt; guarantee that &lt;code&gt;g(newer_u)&lt;/code&gt; satisfies &lt;code&gt;C2&lt;/code&gt; because we can go from "activated" to "deactivated" just by appending a new event. So it's not a refinement. This is fixable by removing deactivation events, that would work too.&lt;/p&gt;
    &lt;p&gt;So a more interesting case is &lt;code&gt;bad_user&lt;/code&gt;, a refinement of &lt;code&gt;user&lt;/code&gt; that has both &lt;code&gt;activated_at&lt;/code&gt; and &lt;code&gt;activated_until&lt;/code&gt;. We propose the refinement mapping &lt;code&gt;b&lt;/code&gt;:&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;b(bad_user).activated =
        if bad_user.activated_at == NULL &amp;amp;&amp;amp; activated_until == NULL
        then FALSE
        else bad_user.activated_at &amp;lt;= now() &amp;lt; bad_user.activated_until
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    
    &lt;p&gt;But now if enough time passes, &lt;code&gt;b(bad_user).activated' = false&lt;/code&gt;, so this is not a refinement either.&lt;/p&gt;
    &lt;h3&gt;The punchline&lt;/h3&gt;
    &lt;p&gt;Refinement is one of the most powerful techniques in formal specification, but also one of the hardest for people to understand. I'm starting to think that the reason it's so hard is because they learn refinement while they're &lt;em&gt;also&lt;/em&gt; learning formal methods, so are faced with an unfamiliar topic in an unfamiliar context. If that's the case, then maybe it's easier introducing refinement in a more common context like databases.&lt;/p&gt;
    &lt;p&gt;I've written a bit about refinement in the normal context &lt;a href="https://hillelwayne.com/post/refinement/" target="_blank"&gt;here&lt;/a&gt; (showing one specification is an implementation of another). I kinda want to work this explanation into the book but it might be too late for big content additions like this.&lt;/p&gt;
    &lt;p&gt;(Food for thought: how do refinement mappings relate to database views?)&lt;/p&gt;</description>
                <pubDate>Tue, 20 Jan 2026 17:49:07 +0000</pubDate>
                <guid>https://buttondown.com/hillelwayne/archive/refinement-without-specification/</guid>
            </item>
            <item>
                <title>My Gripes with Prolog</title>
                <link>https://buttondown.com/hillelwayne/archive/my-gripes-with-prolog/</link>
                <description>&lt;p&gt;For the next release of &lt;a href="https://leanpub.com/logic/" target="_blank"&gt;Logic for Programmers&lt;/a&gt;, I'm finally adding the sections on Answer Set Programming and Constraint Logic Programming that I TODOd back in version 0.9. And this is making me re-experience some of my pain points with Prolog, which I will gripe about now.  If you want to know more about why Prolog is cool instead, go &lt;a href="https://buttondown.com/hillelwayne/archive/a48fce5b-8a05-4302-b620-9b26f057f145/" target="_blank"&gt;here&lt;/a&gt; or &lt;a href="https://www.metalevel.at/prolog" target="_blank"&gt;here&lt;/a&gt; or &lt;a href="https://ianthehenry.com/posts/drinking-with-datalog/" target="_blank"&gt;here&lt;/a&gt; or &lt;a href="https://logicprogramming.org/" target="_blank"&gt;here&lt;/a&gt;. &lt;/p&gt;
    &lt;h3&gt;No standardized strings&lt;/h3&gt;
    &lt;p&gt;ISO "strings" are just atoms or lists of single-character atoms (or lists of integer character codes). The various implementations of Prolog add custom string operators but they are not cross compatible, so code written with strings in SWI-Prolog will not work in Scryer Prolog. &lt;/p&gt;
    &lt;h3&gt;No functions&lt;/h3&gt;
    &lt;p&gt;Code logic is expressed entirely in &lt;em&gt;rules&lt;/em&gt;, predicates which return true or false for certain values. For example if you wanted to get the length of a Prolog list, you write this:&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="s s-Atom"&gt;?-&lt;/span&gt; &lt;span class="nf"&gt;length&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="s s-Atom"&gt;a&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;b&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;c&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="nv"&gt;Len&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;
    
       &lt;span class="nv"&gt;Len&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mf"&gt;3.&lt;/span&gt;
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p class="empty-line" style="height:16px; margin:0px !important;"&gt;&lt;/p&gt;
    &lt;p&gt;Now this is pretty cool in that it allows bidirectionality, or running predicates "in reverse". To generate lists of length 3, you can write &lt;code&gt;length(L, 3)&lt;/code&gt;. But it also means that if you want to get the length a list &lt;em&gt;plus one&lt;/em&gt;, you can't do that in one expression, you have to write &lt;code&gt;length(List, Out), X is Out+1&lt;/code&gt;.&lt;/p&gt;
    &lt;p&gt;For a while I thought no functions was necessary evil for bidirectionality, but then I discovered &lt;a href="https://picat-lang.org/" target="_blank"&gt;Picat&lt;/a&gt; has functions and works just fine. That by itself is a reason for me to prefer Picat for my LP needs.&lt;/p&gt;
    &lt;p&gt;(Bidirectionality is a killer feature of Prolog, so it's a shame I so rarely run into situations that use it.)&lt;/p&gt;
    &lt;h3&gt;No standardized collection types besides lists&lt;/h3&gt;
    &lt;p&gt;Aside from atoms (&lt;code&gt;abc&lt;/code&gt;) and numbers, there are two data types:&lt;/p&gt;
    &lt;ul&gt;
    &lt;li&gt;Linked lists like &lt;code&gt;[a,b,c,d]&lt;/code&gt;.&lt;/li&gt;
    &lt;li&gt;Compound terms like &lt;code&gt;dog(rex, poodle)&lt;/code&gt;, which &lt;em&gt;seem&lt;/em&gt; like record types but are actually tuples. You can even convert compound terms to linked lists with &lt;code&gt;=..&lt;/code&gt;:&lt;/li&gt;
    &lt;/ul&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="s s-Atom"&gt;?-&lt;/span&gt; &lt;span class="nv"&gt;L&lt;/span&gt; &lt;span class="s s-Atom"&gt;=..&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s s-Atom"&gt;a&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;b&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;c&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;
       &lt;span class="nv"&gt;L&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;a&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s s-Atom"&gt;b&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;c&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;
    &lt;span class="s s-Atom"&gt;?-&lt;/span&gt; &lt;span class="nf"&gt;a&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s s-Atom"&gt;b&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nf"&gt;c&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s s-Atom"&gt;c&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="s s-Atom"&gt;=..&lt;/span&gt; &lt;span class="nv"&gt;L&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;
       &lt;span class="nv"&gt;L&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s s-Atom"&gt;a&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;b&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nf"&gt;c&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s s-Atom"&gt;c&lt;/span&gt;&lt;span class="p"&gt;)].&lt;/span&gt;
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;There's no proper key-value maps or even struct types. Again, this is something that individual distributions can fix (without cross compatibility), but these never feel integrated with the rest of the language. &lt;/p&gt;
    &lt;h3&gt;No boolean values&lt;/h3&gt;
    &lt;p&gt;&lt;code&gt;true&lt;/code&gt; and &lt;code&gt;false&lt;/code&gt; aren't values, they're control flow statements. &lt;code&gt;true&lt;/code&gt; is a noop and &lt;code&gt;false&lt;/code&gt; says that the current search path is a dead end, so backtrack and start again. You can't explicitly store true and false as values, you have to implicitly have them in facts (&lt;code&gt;passed(test)&lt;/code&gt; instead of &lt;code&gt;test.passed? == true&lt;/code&gt;).&lt;/p&gt;
    &lt;p&gt;This hasn't made any tasks impossible, and I can usually find a workaround to whatever I want to do. But I do think it makes things more inconvenient! Sometimes I want to do something dumb like "get all atoms that don't pass at least three of these rules", and that'd be a lot easier if I could shove intermediate results into a sack of booleans. &lt;/p&gt;
    &lt;p&gt;(This is called "&lt;a href="https://en.wikipedia.org/wiki/Negation_as_failure" target="_blank"&gt;Negation as Failure&lt;/a&gt;". I think this might be necessary to make Prolog a Turing complete general programming language. Picat fixes a lot of Prolog's gripes and still has negation as failure. ASP has regular negation but it's not Turing complete.) &lt;/p&gt;
    &lt;h3&gt;Cuts are confusing&lt;/h3&gt;
    &lt;div class="subscribe-form"&gt;&lt;/div&gt;
    &lt;p&gt;Prolog finds solutions through depth first search, and a "cut" (&lt;code&gt;!&lt;/code&gt;) symbol prevents backtracking past a certain point. This is necessary for optimization but can lead to invalid programs. &lt;/p&gt;
    &lt;p&gt;You're not supposed to use cuts if you can avoid it, so I pretended cuts didn't exist. Which is why I was surprised to find that &lt;a href="https://eu.swi-prolog.org/pldoc/doc_for?object=(-%3E)/2" target="_blank"&gt;conditionals&lt;/a&gt; are implemented with cuts. Because cuts are spooky dark magic conditionals &lt;em&gt;sometimes&lt;/em&gt; conditionals work as I expect them to and sometimes leave out valid solutions and I have no idea how to tell which it'll be. Usually I find it safer to just avoid conditionals entirely, which means my code gets a lot longer and messier. &lt;/p&gt;
    &lt;h3&gt;Non-cuts are confusing&lt;/h3&gt;
    &lt;p&gt;The original example in the last section was this: &lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="nf"&gt;foo&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;A&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;B&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;:-&lt;/span&gt;
        &lt;span class="s s-Atom"&gt;\+&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;A&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nv"&gt;B&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
        &lt;span class="nv"&gt;A&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="nv"&gt;B&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mf"&gt;2.&lt;/span&gt;
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;&lt;code&gt;foo(1, 2)&lt;/code&gt; returns true, so you'd expect &lt;code&gt;f(A, B)&lt;/code&gt; to return &lt;code&gt;A=1, B=2&lt;/code&gt;. But it returns &lt;code&gt;false&lt;/code&gt;.  Whereas this works as expected.&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="nf"&gt;bar&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;A&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;B&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;:-&lt;/span&gt;
        &lt;span class="nv"&gt;A&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="nv"&gt;B&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="s s-Atom"&gt;\+&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;A&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nv"&gt;B&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;I &lt;em&gt;thought&lt;/em&gt; this was because &lt;code&gt;\+&lt;/code&gt; was implemented with cuts, and the &lt;a href="https://www.amazon.com/Programming-Prolog-Using-ISO-Standard/dp/3540006788" target="_blank"&gt;Clocksin book&lt;/a&gt; suggests it's &lt;code&gt;call(P), !, fail&lt;/code&gt;, so this was my prime example about how cuts are confusing. But then I tried this:&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="s s-Atom"&gt;?-&lt;/span&gt; &lt;span class="nf"&gt;member&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;A&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;]),&lt;/span&gt; &lt;span class="s s-Atom"&gt;\+&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;A&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="nv"&gt;A&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mf"&gt;3.&lt;/span&gt;
    &lt;span class="nv"&gt;A&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mf"&gt;3.&lt;/span&gt; &lt;span class="c1"&gt;% wtf?&lt;/span&gt;
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;There's no way to get that behavior with cuts! I don't think &lt;code&gt;\+&lt;/code&gt; uses cuts at all! And now I have to figure out why 
    &lt;code&gt;foo(A, B)&lt;/code&gt; doesn't returns results. Is it &lt;a href="https://github.com/dtonhofer/prolog_notes/blob/master/other_notes/about_negation/floundering.md" target="_blank"&gt;floundering&lt;/a&gt;? Is it because &lt;code&gt;\+ P&lt;/code&gt; only succeeds if &lt;code&gt;P&lt;/code&gt; fails, and &lt;code&gt;A = B&lt;/code&gt; always succeeds? A closed-world assumption? Something else?&lt;sup id="fnref:dif"&gt;&lt;a class="footnote-ref" href="#fn:dif"&gt;1&lt;/a&gt;&lt;/sup&gt;&lt;/p&gt;
    &lt;h3&gt;Straying outside of default queries is confusing&lt;/h3&gt;
    &lt;p&gt;Say I have a program like this:&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="nf"&gt;tree&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s s-Atom"&gt;n&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;n1&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;
    &lt;span class="nf"&gt;tree&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s s-Atom"&gt;n&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;n2&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;
    &lt;span class="nf"&gt;tree&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s s-Atom"&gt;n1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;n11&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;
    &lt;span class="nf"&gt;tree&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s s-Atom"&gt;n2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;n21&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;
    &lt;span class="nf"&gt;tree&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s s-Atom"&gt;n2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;n22&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;
    &lt;span class="nf"&gt;tree&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s s-Atom"&gt;n11&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;n111&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;
    &lt;span class="nf"&gt;tree&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s s-Atom"&gt;n11&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;n112&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;
    
    &lt;span class="nf"&gt;branch&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;N&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;:-&lt;/span&gt; &lt;span class="c1"&gt;% two children&lt;/span&gt;
        &lt;span class="nf"&gt;tree&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;N&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;C1&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
        &lt;span class="nf"&gt;tree&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;N&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;C2&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
        &lt;span class="nv"&gt;C1&lt;/span&gt; &lt;span class="s s-Atom"&gt;@&amp;lt;&lt;/span&gt; &lt;span class="nv"&gt;C2&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt; &lt;span class="c1"&gt;% ordering&lt;/span&gt;
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;And I want to know all of the nodes that are parents of branches. The normal way to do this is with a query:&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="s s-Atom"&gt;?-&lt;/span&gt; &lt;span class="nf"&gt;tree&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;A&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;N&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="nf"&gt;branch&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;N&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;
    &lt;span class="nv"&gt;A&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s s-Atom"&gt;n&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;N&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s s-Atom"&gt;n2&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="c1"&gt;% show more...&lt;/span&gt;
    &lt;span class="nv"&gt;A&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s s-Atom"&gt;n1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;N&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s s-Atom"&gt;n11&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;This is interactively making me query for every result. That's usually not what I want, I know the result of my query is finite and I want all of the results at once, so I can count or farble or whatever them. It took a while to figure out that the proper solution is &lt;a href="https://www.swi-prolog.org/pldoc/man?predicate=bagof/3" target="_blank"&gt;&lt;code&gt;bagof(Template, Goal, Bag)&lt;/code&gt;&lt;/a&gt;, which will "Unify Bag with the alternatives of Template":&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="s s-Atom"&gt;?-&lt;/span&gt; &lt;span class="nf"&gt;bagof&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;A&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;tree&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;A&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;N&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="nf"&gt;branch&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;N&lt;/span&gt;&lt;span class="p"&gt;)),&lt;/span&gt; &lt;span class="nv"&gt;As&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;
    
    &lt;span class="nv"&gt;As&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s s-Atom"&gt;n1&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="nv"&gt;N&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s s-Atom"&gt;n11&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="nv"&gt;As&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s s-Atom"&gt;n&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="nv"&gt;N&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s s-Atom"&gt;n2&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;Wait crap that's still giving one result at a time, because &lt;code&gt;N&lt;/code&gt; is a free variable in &lt;code&gt;bagof&lt;/code&gt; so it backtracks over that. It surprises me but I guess it's good to have as an option. So how do I get all of the results at once?&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="s s-Atom"&gt;?-&lt;/span&gt; &lt;span class="nf"&gt;bagof&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;A&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;N&lt;/span&gt;&lt;span class="s s-Atom"&gt;^&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;tree&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;A&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;N&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="nf"&gt;branch&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;N&lt;/span&gt;&lt;span class="p"&gt;)),&lt;/span&gt; &lt;span class="nv"&gt;As&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;
    
    &lt;span class="nv"&gt;As&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s s-Atom"&gt;n&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;n1&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;The only difference is the &lt;code&gt;N^Goal&lt;/code&gt;, which tells &lt;code&gt;bagof&lt;/code&gt; to ignore and group the results of &lt;code&gt;N&lt;/code&gt;. As far as I can tell, this is the &lt;em&gt;only&lt;/em&gt; place the ISO standard uses &lt;code&gt;^&lt;/code&gt; to mean anything besides exponentiation. Supposedly it's the &lt;a href="https://sicstus.sics.se/sicstus/docs/latest4/html/sicstus.html/ref_002dall_002dsum.html" target="_blank"&gt;existential quantifier&lt;/a&gt;? In general whenever I try to stray outside simpler use-cases, especially if I try to do things non-interactively, I run into trouble.&lt;/p&gt;
    &lt;h3&gt;I have mixed feelings about symbol terms&lt;/h3&gt;
    &lt;p&gt;It took me a long time to realize the reason &lt;code&gt;bagof&lt;/code&gt;  "works" is because infix symbols are mapped to prefix compound terms, so that  &lt;code&gt;a^b&lt;/code&gt; is &lt;code&gt;^(a, b)&lt;/code&gt;, and then different predicates can decide to do different things with &lt;code&gt;^(a, b)&lt;/code&gt;.&lt;/p&gt;
    &lt;p&gt;This is also why you can't just write &lt;code&gt;A = B+1&lt;/code&gt;: that unifies &lt;code&gt;A&lt;/code&gt; with the &lt;em&gt;compound term&lt;/em&gt; &lt;code&gt;+(B, 1)&lt;/code&gt;. &lt;code&gt;A+1 = B+2&lt;/code&gt; is &lt;em&gt;false&lt;/em&gt;, as &lt;code&gt;1 \= 2&lt;/code&gt;. You have to write &lt;code&gt;A+1 is B+2&lt;/code&gt;, as &lt;code&gt;is&lt;/code&gt; is the operator that converts &lt;code&gt;+(B, 1)&lt;/code&gt; to a mathematical term.&lt;/p&gt;
    &lt;p&gt;(And &lt;em&gt;that&lt;/em&gt; fails because &lt;code&gt;is&lt;/code&gt; isn't fully bidirectional. The lhs &lt;em&gt;must&lt;/em&gt; be a single variable. You have to import &lt;code&gt;clpfd&lt;/code&gt; and write &lt;code&gt;A + 1 #= B + 2&lt;/code&gt;.)&lt;/p&gt;
    &lt;p&gt;I don't like this, but I'm a hypocrite for saying that because I appreciate the idea and don't mind custom symbols in other languages. I guess what annoys me is there's no official definition of what &lt;code&gt;^(a, b)&lt;/code&gt; is, it's purely a convention. ISO Prolog uses &lt;code&gt;-(a, b)&lt;/code&gt; (aka &lt;code&gt;a-b&lt;/code&gt;) as a convention to mean "pairs", and the only way to realize that is to see that an awful lot of standard modules use that convention. But you can use &lt;code&gt;-(a, b)&lt;/code&gt; to mean something else in your own code and nothing will warn you of the inconsistency.&lt;/p&gt;
    &lt;p&gt;Anyway I griped about pairs so I can gripe about &lt;code&gt;sort&lt;/code&gt;.&lt;/p&gt;
    &lt;h3&gt;go home sort, ur drunk&lt;/h3&gt;
    &lt;p&gt;This one's just a blunder:&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="s s-Atom"&gt;?-&lt;/span&gt; &lt;span class="nf"&gt;sort&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="nv"&gt;Out&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;
       &lt;span class="nv"&gt;Out&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt; &lt;span class="c1"&gt;% wat&lt;/span&gt;
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;According to an expert online this is because sort is supposed to return a sorted &lt;em&gt;set&lt;/em&gt;, not a sorted list. If you want to preserve duplicates you're supposed to lift all of the values into &lt;code&gt;-($key, $value)&lt;/code&gt; compound terms, then use &lt;a href="https://eu.swi-prolog.org/pldoc/doc_for?object=keysort/2" target="_blank"&gt;keysort&lt;/a&gt;, then extract the values. And, since there's no functions, this process takes at least three lines. This is also how you're supposed to sort by a custom predicate, like "the second value of a compound term". &lt;/p&gt;
    &lt;p&gt;(Most (but not all) distributions have a duplicate merge like &lt;a href="https://eu.swi-prolog.org/pldoc/doc_for?object=msort/2" target="_blank"&gt;msort&lt;/a&gt;. SWI-Prolog also has a &lt;a href="https://eu.swi-prolog.org/pldoc/doc_for?object=predsort/3" target="_blank"&gt;sort by key&lt;/a&gt; but it removes duplicates.)&lt;/p&gt;
    &lt;h3&gt;Please just let me end rules with a trailing comma instead of a period, I'm begging you&lt;/h3&gt;
    &lt;p&gt;I don't care if it makes fact parsing ambiguous, I just don't want "reorder two lines" to be a syntax error anymore&lt;/p&gt;
    &lt;hr/&gt;
    &lt;p&gt;I expect by this time tomorrow I'll have been Cunningham'd and there will be a 2000 word essay about how all of my gripes are either easily fixable by doing XYZ or how they are the best possible choice that Prolog could have made. I mean, even in writing this I found out some fixes to problems I had. Like I was going to gripe about how I can't run SWI-Prolog queries from the command line but, in doing do diligence finally &lt;em&gt;finally&lt;/em&gt; figured it out:&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;swipl&lt;span class="w"&gt; &lt;/span&gt;-t&lt;span class="w"&gt; &lt;/span&gt;halt&lt;span class="w"&gt; &lt;/span&gt;-g&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"bagof(X, Goal, Xs), print(Xs)"&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;./file.pl
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;It's pretty clunky but still better than the old process of having to enter an interactive session every time I wanted to validate a script change.&lt;/p&gt;
    &lt;p&gt;(Also, answer set programming is pretty darn cool. Excited to write about it in the book!)&lt;/p&gt;
    &lt;div class="footnote"&gt;
    &lt;hr/&gt;
    &lt;ol&gt;
    &lt;li id="fn:dif"&gt;
    &lt;p&gt;A couple of people mentioned using &lt;a href="https://eu.swi-prolog.org/pldoc/doc_for?object=dif/2" target="_blank"&gt;dif/2&lt;/a&gt; instead of &lt;code&gt;\+ A = B&lt;/code&gt;. Dif is great but usually I hit the negation footgun with things like &lt;code&gt;\+ foo(A, B), bar(B, C), baz(A, C)&lt;/code&gt;, where &lt;code&gt;dif/2&lt;/code&gt; isn't applicable. &lt;a class="footnote-backref" href="#fnref:dif" title="Jump back to footnote 1 in the text"&gt;↩&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;/ol&gt;
    &lt;/div&gt;</description>
                <pubDate>Wed, 14 Jan 2026 16:48:51 +0000</pubDate>
                <guid>https://buttondown.com/hillelwayne/archive/my-gripes-with-prolog/</guid>
            </item>
            <item>
                <title>The Liskov Substitution Principle does more than you think</title>
                <link>https://buttondown.com/hillelwayne/archive/the-liskov-substitution-principle-does-more-than/</link>
                <description>&lt;p class="empty-line" style="height:16px; margin:0px !important;"&gt;&lt;/p&gt;
    &lt;p&gt;Happy New Year! I'm done with the newsletter hiatus and am going to try updating weekly again. To ease into things a bit, I'll try to keep posts a little more off the cuff and casual for a while, at least until &lt;a href="https://leanpub.com/logic/" target="_blank"&gt;&lt;em&gt;Logic for Programmers&lt;/em&gt;&lt;/a&gt; is done. Speaking of which, v0.13 should be out by the end of this month.&lt;/p&gt;
    &lt;p&gt;So for this newsletter I want to talk about the &lt;a href="https://en.wikipedia.org/wiki/Liskov_substitution_principle" target="_blank"&gt;Liskov Substitution Principle&lt;/a&gt; (LSP). Last week I read &lt;a href="https://loup-vaillant.fr/articles/solid-bull" target="_blank"&gt;A SOLID Load of Bull&lt;/a&gt; by cryptographer Loupe Vaillant, where he argues the &lt;a href="https://en.wikipedia.org/wiki/SOLID" target="_blank"&gt;SOLID&lt;/a&gt; principles of OOP are not worth following. He makes an exception for LSP, but also claims that it's "just subtyping" and further:&lt;/p&gt;
    &lt;blockquote&gt;
    &lt;p&gt;If I were trying really hard to be negative about the Liskov substitution principle, I would stress that &lt;strong&gt;it only applies when inheritance is involved&lt;/strong&gt;, and inheritance is strongly discouraged anyway.&lt;/p&gt;
    &lt;/blockquote&gt;
    &lt;p&gt;LSP is more interesting than that! In the original paper, &lt;a href="https://www.cs.cmu.edu/~wing/publications/LiskovWing94.pdf" target="_blank"&gt;A Behavioral Notion of Subtyping&lt;/a&gt;, Barbara Liskov and Jeannette Wing start by defining a "correct" subtyping as follows:&lt;/p&gt;
    &lt;blockquote&gt;
    &lt;p&gt;Subtype Requirement: Let ϕ(x) be a property provable about objects x of type T. Then ϕ(y) should be true for objects y of type S where S is a subtype of T.&lt;/p&gt;
    &lt;/blockquote&gt;
    &lt;p&gt;From then on, the paper determine what &lt;em&gt;guarantees&lt;/em&gt; that a subtype is correct.&lt;sup id="fnref:safety"&gt;&lt;a class="footnote-ref" href="#fn:safety"&gt;1&lt;/a&gt;&lt;/sup&gt;  They identify three conditions: &lt;/p&gt;
    &lt;ol&gt;
    &lt;li&gt;Each of the subtype's methods has the same or weaker preconditions and the same or stronger postconditions as the corresponding supertype method.&lt;sup id="fnref:cocontra"&gt;&lt;a class="footnote-ref" href="#fn:cocontra"&gt;2&lt;/a&gt;&lt;/sup&gt; &lt;/li&gt;
    &lt;li&gt;The subtype satisfies all state invariants of the supertype. &lt;/li&gt;
    &lt;li&gt;The subtype satisfies all "history properties" of the supertype. &lt;sup id="fnref:refinement"&gt;&lt;a class="footnote-ref" href="#fn:refinement"&gt;3&lt;/a&gt;&lt;/sup&gt; e.g. if a supertype has an immutable field, the subtype cannot make it mutable. &lt;/li&gt;
    &lt;/ol&gt;
    &lt;p&gt;(Later, Elisa Baniassad and Alexander Summers &lt;a href="https://www.cs.ubc.ca/~alexsumm/papers/BaniassadSummers21.pdf" target="_blank"&gt;would realize&lt;/a&gt; these are equivalent to "the subtype passes all black-box tests designed for the supertype", which I wrote a little bit more about &lt;a href="https://www.hillelwayne.com/post/lsp/" target="_blank"&gt;here&lt;/a&gt;.)&lt;/p&gt;
    &lt;p&gt;I want to focus on the first rule about preconditions and postconditions. This refers to the method's &lt;strong&gt;contract&lt;/strong&gt;.  For a function &lt;code&gt;f&lt;/code&gt;, &lt;code&gt;f.Pre&lt;/code&gt; is what must be true going into the function, and &lt;code&gt;f.Post&lt;/code&gt; is what the function guarantees on execution. A canonical example is square root: &lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;sqrt.Pre(x) = x &amp;gt;= 0
    sqrt.Post(x, out) = out &amp;gt;= 0 &amp;amp;&amp;amp; out*out == x
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;div class="subscribe-form"&gt;&lt;/div&gt;
    &lt;p&gt;Mathematically we would write this as &lt;code&gt;all x: f.Pre(x) =&amp;gt; f.Post(x)&lt;/code&gt; (where &lt;code&gt;=&amp;gt;&lt;/code&gt; is the &lt;a href="https://en.wikipedia.org/wiki/Material_conditional" target="_blank"&gt;implication operator&lt;/a&gt;). If that relation holds for all &lt;code&gt;x&lt;/code&gt;, we say the function is "correct". With this definition we can actually formally deduce the first  subtyping requirement. Let &lt;code&gt;caller&lt;/code&gt; be some code that uses a method, which we will call &lt;code&gt;super&lt;/code&gt;, and let both &lt;code&gt;caller&lt;/code&gt; and &lt;code&gt;super&lt;/code&gt; be correct. Then we know the following statements are true:&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;  1. caller.Pre &amp;amp;&amp;amp; stuff =&amp;gt; super.Pre
      2. super.Pre =&amp;gt; super.Post
      3. super.Post &amp;amp;&amp;amp; more_stuff =&amp;gt; caller.Post
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;Now let's say we substitute &lt;code&gt;super&lt;/code&gt; with &lt;code&gt;sub&lt;/code&gt;, which is also correct. Here is what we now know is true: &lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="w"&gt; &lt;/span&gt; 1. caller.Pre =&amp;gt; super.Pre
    &lt;span class="gd"&gt;- 2. super.Pre =&amp;gt; super.Post&lt;/span&gt;
    &lt;span class="gi"&gt;+ 2. sub.Pre =&amp;gt; sub.Post&lt;/span&gt;
    &lt;span class="w"&gt; &lt;/span&gt; 3. super.Post =&amp;gt; caller.Post
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;When is &lt;code&gt;caller&lt;/code&gt; still correct? When we can fill in the "gaps" in the chain, aka if &lt;code&gt;super.Pre =&amp;gt; sub.Pre&lt;/code&gt; and &lt;code&gt;sub.Post =&amp;gt; super.Post&lt;/code&gt;. In other words, if &lt;code&gt;sub&lt;/code&gt;'s preconditions are weaker than (or equivalent to) &lt;code&gt;super&lt;/code&gt;'s preconditions and if &lt;code&gt;sub&lt;/code&gt;'s postconditions are stronger than (or equivalent to) &lt;code&gt;super&lt;/code&gt;'s postconditions.&lt;/p&gt;
    &lt;p&gt;Notice that I never actually said &lt;code&gt;sub&lt;/code&gt; was from a subtype of &lt;code&gt;super&lt;/code&gt;! The LSP conditions (at least, the contract rule of LSP) doesn't just apply to &lt;em&gt;subtypes&lt;/em&gt; but can be applied in any situation where we substitute a function or block of code for another. Subtyping is a common place where this happens, but by no means the only! We can also substitute across time.Any time we modify some code's behavior, we are effectively substituting the new version in for the old version, and so the new version's contract must be compatible with the old version's to guarantee no existing code is broken.&lt;/p&gt;
    &lt;p&gt;For example, say we maintain an API or function with two required inputs, &lt;code&gt;X&lt;/code&gt; and &lt;code&gt;Y&lt;/code&gt;, and one optional input, &lt;code&gt;Z&lt;/code&gt;. Making &lt;code&gt;Z&lt;/code&gt; required strengthens the precondition ("input must have Z" is stronger than "input may have Z"), so potentially breaks existing users of our API. Making &lt;code&gt;Y&lt;/code&gt; optional weakens the precondition ("input may have Y" is weaker than "input must have Y"), so is guaranteed to be compatible.&lt;/p&gt;
    &lt;p&gt;(This also underpins &lt;a href="https://en.wikipedia.org/wiki/Robustness_principle" target="_blank"&gt;The robustness principle&lt;/a&gt;: "be conservative in what you send, be liberal in what you accept".)&lt;/p&gt;
    &lt;p&gt;Now the dark side of all this is &lt;a href="https://www.hyrumslaw.com/" target="_blank"&gt;Hyrum's Law&lt;/a&gt;. In the below code, are &lt;code&gt;new&lt;/code&gt;'s postconditions stronger than &lt;code&gt;old&lt;/code&gt;'s postconditions? &lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="k"&gt;def&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nf"&gt;old&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="s2"&gt;"a"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"foo"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;"b"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"bar"&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;
    
    &lt;span class="k"&gt;def&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nf"&gt;new&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="s2"&gt;"a"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"foo"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;"b"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"bar"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;"c"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"baz"&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;On a first appearance, this is a strengthened postcondition: &lt;code&gt;out.contains_keys([a, b, c]) =&amp;gt; out.contains_keys([a, b])&lt;/code&gt;. But now someone does this:&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="n"&gt;my_dict&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="s2"&gt;"c"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"blat"&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt; 
    &lt;span class="n"&gt;my_dict&lt;/span&gt; &lt;span class="o"&gt;|=&lt;/span&gt; &lt;span class="n"&gt;new&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="k"&gt;assert&lt;/span&gt; &lt;span class="n"&gt;my_dict&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;c&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="s2"&gt;"blat"&lt;/span&gt;
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;Oh no, their code now breaks! They saw &lt;code&gt;old&lt;/code&gt; had the postcondition "&lt;code&gt;out&lt;/code&gt; does NOT contain "c" as a key", and then wrote their code expecting that postcondition. In a sense, &lt;em&gt;any&lt;/em&gt; change the postcondition can potentially break &lt;em&gt;someone&lt;/em&gt;. "All observable behaviors of your system
    will be depended on by somebody", as &lt;a href="https://www.hyrumslaw.com/" target="_blank"&gt;Hyrum's Law&lt;/a&gt; puts it.&lt;/p&gt;
    &lt;p&gt;So we need to be explicit in what our postconditions actually are, and properties of the output that are not part of our explicit postconditions are subject to be violated on the next version. You'll break people's workflows but you also have grounds to say "I warned you".&lt;/p&gt;
    &lt;p&gt;Overall, Liskov and Wing did their work in the context of subtyping, but the principles are more widely applicable, certainly to more than just the use of inheritance.&lt;/p&gt;
    &lt;div class="footnote"&gt;
    &lt;hr/&gt;
    &lt;ol&gt;
    &lt;li id="fn:safety"&gt;
    &lt;p&gt;Though they restrict it to just &lt;a href="https://www.hillelwayne.com/post/safety-and-liveness/" target="_blank"&gt;safety properties&lt;/a&gt;. &lt;a class="footnote-backref" href="#fnref:safety" title="Jump back to footnote 1 in the text"&gt;↩&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id="fn:cocontra"&gt;
    &lt;p&gt;The paper lists a couple of other authors as introduce the idea of "contra/covariance rules", but part of being "off-the-cuff and casual" means not diving into every referenced paper. So they might have gotten the pre/postconditions thing from an earlier author, dunno for sure! &lt;a class="footnote-backref" href="#fnref:cocontra" title="Jump back to footnote 2 in the text"&gt;↩&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id="fn:refinement"&gt;
    &lt;p&gt;I &lt;em&gt;believe&lt;/em&gt; that this is equivalent to the formal methods notion of a &lt;a href="https://www.hillelwayne.com/post/refinement/" target="_blank"&gt;refinement&lt;/a&gt;. &lt;a class="footnote-backref" href="#fnref:refinement" title="Jump back to footnote 3 in the text"&gt;↩&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;/ol&gt;
    &lt;/div&gt;</description>
                <pubDate>Tue, 06 Jan 2026 16:51:26 +0000</pubDate>
                <guid>https://buttondown.com/hillelwayne/archive/the-liskov-substitution-principle-does-more-than/</guid>
            </item>
            <item>
                <title>Some Fun Software Facts</title>
                <link>https://buttondown.com/hillelwayne/archive/some-fun-software-facts/</link>
                <description>&lt;p&gt;Last newsletter of the year!&lt;/p&gt;
    &lt;p&gt;First some news on &lt;em&gt;Logic for Programmers&lt;/em&gt;. Thanks to everyone who donated to the &lt;a href="https://buttondown.com/hillelwayne/archive/get-logic-for-programmers-50-off-support-chicago" target="_blank"&gt;feedchicago charity drive&lt;/a&gt;! In total we raised $2250 for Chicago food banks. Proof &lt;a href="https://link.fndrsp.net/CL0/https:%2F%2Fgiving.chicagosfoodbank.org%2Freceipts%2FBMDDDCAF%3FreceiptType=oneTime%26emailLog=YS699MZW/2/0100019ae2b7eb92-7c917ad0-c94e-4fe2-8ee1-1b9dc521c607-000000/brmxoTOvoJN94I9nQH26s7fRrmyFDj_Jir1FySSoxCw=434" target="_blank"&gt;here&lt;/a&gt;.&lt;/p&gt;
    &lt;p&gt;If you missed buying &lt;em&gt;Logic for Programmers&lt;/em&gt; real cheap in the charity drive, you can still get it for $10 off with the holiday code &lt;a href="https://leanpub.com/logic/c/hannukah-presents" target="_blank"&gt;hannukah-presents&lt;/a&gt;. This will last from now until the end of the year. After that, I'll be raising the price from $25 to $30.&lt;/p&gt;
    &lt;p&gt;Anyway, to make this more than just some record keeping, let's close out with something light. I'm one of those people who loves hearing "fun facts" about stuff. So here's some random fun facts I accumulated about software over the years:&lt;/p&gt;
    &lt;ul&gt;
    &lt;li&gt;In 2017, a team of eight+ programmers &lt;a href="https://codegolf.stackexchange.com/questions/11880/build-a-working-game-of-tetris-in-conways-game-of-life" target="_blank"&gt;successfully implemented Tetris&lt;/a&gt; as a &lt;a href="https://en.wikipedia.org/wiki/Conway's_Game_of_Life" target="_blank"&gt;game of life simulation&lt;/a&gt;. The GoL grid had an area of 30 trillion pixels and implemented a full programmable CPU as part of the project.&lt;/li&gt;
    &lt;/ul&gt;
    &lt;ul&gt;
    &lt;li&gt;Computer systems have to deal with leap seconds in order to keep UTC (where one day is 86,400 seconds) in sync with UT1 (where one day is exactly one full earth rotation). The people in charge recently passed a resolution to abolish the leap second by 2035, letting UTC and UT1 slowly drift out of sync.&lt;/li&gt;
    &lt;/ul&gt;
    &lt;ul&gt;
    &lt;li&gt;&lt;a href="https://buttondown.com/hillelwayne/archive/vim-is-turing-complete/" target="_blank"&gt;Vim is Turing complete&lt;/a&gt;.&lt;/li&gt;
    &lt;/ul&gt;
    &lt;div class="subscribe-form"&gt;&lt;/div&gt;
    &lt;ul&gt;
    &lt;li&gt;The backslash character basically didn't exist in writing before 1930, and &lt;a href="http://dump.deadcodersociety.org/ascii.pdf" target="_blank"&gt;was only added to ASCII&lt;/a&gt; so mathematicians (and ALGOLists) could write &lt;code&gt;/\&lt;/code&gt; and &lt;code&gt;\/&lt;/code&gt;. It's popular use in computing stems entirely from being a useless key on the keyboard.&lt;/li&gt;
    &lt;/ul&gt;
    &lt;ul&gt;
    &lt;li&gt;&lt;a href="https://en.wikipedia.org/wiki/Galactic_algorithm" target="_blank"&gt;Galactic Algorithms&lt;/a&gt; are algorithms that are theoretically faster than algorithms we use, but only at scales that make them impractical. For example, matrix multiplication of NxN is &lt;a href="https://en.wikipedia.org/wiki/Strassen_algorithm" target="_blank"&gt;normally&lt;/a&gt; O(N^2.81). The &lt;a href="https://www-auth.cs.wisc.edu/lists/theory-reading/2009-December/pdfmN6UVeUiJ3.pdf" target="_blank"&gt;Coppersmith Winograd&lt;/a&gt; algorithm is O(N^2.38), but is so complex that it's vastly slower for even &lt;a href="https://mathoverflow.net/questions/1743/what-is-the-constant-of-the-coppersmith-winograd-matrix-multiplication-algorithm" target="_blank"&gt;10,000 x 10,000 matrices&lt;/a&gt;. It's still interesting in advancing our mathematical understanding of algorithms!&lt;/li&gt;
    &lt;/ul&gt;
    &lt;ul&gt;
    &lt;li&gt;Cloudflare generates random numbers by, in part, &lt;a href="https://www.cloudflare.com/learning/ssl/lava-lamp-encryption/" target="_blank"&gt;taking pictures of 100 lava lamps&lt;/a&gt;.&lt;/li&gt;
    &lt;/ul&gt;
    &lt;ul&gt;
    &lt;li&gt;Mergesort is older than bubblesort. Quicksort is slightly younger than bubblesort but older than the &lt;em&gt;term&lt;/em&gt; "bubblesort". Bubblesort, btw, &lt;a href="https://buttondown.com/hillelwayne/archive/when-would-you-ever-want-bubblesort/" target="_blank"&gt;does have some uses&lt;/a&gt;.&lt;/li&gt;
    &lt;/ul&gt;
    &lt;ul&gt;
    &lt;li&gt;Speaking of mergesort, most implementations of mergesort pre-2006 &lt;a href="https://research.google/blog/extra-extra-read-all-about-it-nearly-all-binary-searches-and-mergesorts-are-broken/" target="_blank"&gt;were broken&lt;/a&gt;. Basically the problem was that the "find the midpoint of a list" step &lt;em&gt;could&lt;/em&gt; overflow if the list was big enough. For C with 32-bit signed integers, "big enough" meant over a billion elements, which was why the bug went unnoticed for so long.&lt;/li&gt;
    &lt;/ul&gt;
    &lt;ul&gt;
    &lt;li&gt;&lt;a href="https://nibblestew.blogspot.com/2023/09/circles-do-not-exist.html" target="_blank"&gt;PDF's drawing model cannot render perfect circles&lt;/a&gt;.&lt;/li&gt;
    &lt;/ul&gt;
    &lt;ul&gt;
    &lt;li&gt;People make fun of how you have to flip USBs three times to get them into a computer, but there's supposed to be a guide: according to the standard, USBs are supposed to be inserted &lt;em&gt;logo-side up&lt;/em&gt;. Of course, this assumes that the port is right-side up, too, which is why USB-C is just symmetric. &lt;/li&gt;
    &lt;/ul&gt;
    &lt;ul&gt;
    &lt;li&gt;I was gonna write a fun fact about how all spreadsheet software treats 1900 as a leap year, as that was a bug in Lotus 1-2-3 and everybody preserved backwards compatibility. But I checked and Google sheets considers it a normal year. So I guess the fun fact is that things have changed!&lt;/li&gt;
    &lt;/ul&gt;
    &lt;ul&gt;
    &lt;li&gt;Speaking of spreadsheet errors, in 2020 &lt;a href="https://www.engadget.com/scientists-rename-genes-due-to-excel-151748790.html" target="_blank"&gt;biologists changed the official nomenclature&lt;/a&gt; of 27 genes because Excel kept parsing their names as dates. F.ex MARCH1 was renamed to MARCHF1 to avoid being parsed as "March 1st". Microsoft rolled out a fix for this... three years later.&lt;/li&gt;
    &lt;/ul&gt;
    &lt;ul&gt;
    &lt;li&gt;It is possible to encode any valid JavaScript program with just the characters &lt;code&gt;()+[]!&lt;/code&gt;. This encoding is called &lt;a href="https://en.wikipedia.org/wiki/JSFuck" target="_blank"&gt;JSFuck&lt;/a&gt; and was once used to distribute malware on &lt;a href="https://arstechnica.com/information-technology/2016/02/ebay-has-no-plans-to-fix-severe-bug-that-allows-malware-distribution/" target="_blank"&gt;Ebay&lt;/a&gt;.&lt;/li&gt;
    &lt;/ul&gt;
    &lt;p&gt;Happy holidays everyone, and see you in 2026!&lt;/p&gt;
    &lt;div class="footnote"&gt;
    &lt;hr/&gt;
    &lt;ol&gt;
    &lt;li id="fn:status"&gt;
    &lt;p&gt;Current status update: I'm finally getting line by line structural editing done and it's turning up lots of improvements, so I'm doing more rewrites than I expected to be doing. &lt;a class="footnote-backref" href="#fnref:status" title="Jump back to footnote 1 in the text"&gt;↩&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;/ol&gt;
    &lt;/div&gt;</description>
                <pubDate>Wed, 10 Dec 2025 18:45:37 +0000</pubDate>
                <guid>https://buttondown.com/hillelwayne/archive/some-fun-software-facts/</guid>
            </item>
            <item>
                <title>One more week to the Logic for Programmers Food Drive</title>
                <link>https://buttondown.com/hillelwayne/archive/one-more-week-to-the-logic-for-programmers-food/</link>
                <description>&lt;p&gt;A couple of weeks ago I started a fundraiser for the &lt;a href="https://www.chicagosfoodbank.org/" target="_blank"&gt;Greater Chicago Food Depository&lt;/a&gt;: get &lt;a href="https://leanpub.com/logic/c/feedchicago" target="_blank"&gt;Logic for Programmers 50% off&lt;/a&gt; and all the royalties will go to charity.&lt;sup id="fnref:royalties"&gt;&lt;a class="footnote-ref" href="#fn:royalties"&gt;1&lt;/a&gt;&lt;/sup&gt; Since then, we've raised a bit over $1600. Y'all are great! &lt;/p&gt;
    &lt;p&gt;The fundraiser is going on until the end of November, so you still have one more week to get the book real cheap.&lt;/p&gt;
    &lt;p&gt;I feel a bit weird about doing two newsletter adverts without raw content, so here's a teaser from a old project I really need to get back to. &lt;a href="https://vorpus.org/blog/notes-on-structured-concurrency-or-go-statement-considered-harmful/#what-is-a-goto-statement-anyway" target="_blank"&gt;Notes on structured concurrency&lt;/a&gt; argues that old languages had a "old-testament fire-and-brimstone &lt;code&gt;goto&lt;/code&gt;" that could send control flow anywhere, like from the body of one function into the body of another function. This "wild goto", the article claims, what Dijkstra was railing against in &lt;a href="https://homepages.cwi.nl/~storm/teaching/reader/Dijkstra68.pdf" target="_blank"&gt;Go To Statement Considered Harmful&lt;/a&gt;, and that modern goto statements are much more limited, "tame" if you will, and wouldn't invoke Dijkstra's ire.&lt;/p&gt;
    &lt;p&gt;I've shared this historical fact about Dijkstra many times, but recently two &lt;a href="https://without.boats/blog/" target="_blank"&gt;separate&lt;/a&gt; &lt;a href="https://matklad.github.io/" target="_blank"&gt;people&lt;/a&gt; have told me it doesn't makes sense: Dijkstra used ALGOL-60, which &lt;em&gt;already had&lt;/em&gt; tame gotos. All of the problems he raises with &lt;code&gt;goto&lt;/code&gt; hold even for tame ones, none are exclusive to wild gotos. So &lt;/p&gt;
    &lt;p&gt;This got me looking to see which languages, if any, ever had the wild goto. I define this as any goto which lets you jump from outside to into a loop or function scope. Turns out, FORTRAN had tame gotos from the start, BASIC has wild gotos, and COBOL is a nonsense language intentionally designed to horrify me. I mean, look at this:&lt;/p&gt;
    &lt;p&gt;&lt;img alt="The COBOL ALTER statement, which redefines a goto target" class="newsletter-image" src="https://assets.buttondown.email/images/e4dfa0fd-fdd5-4fef-b813-4053a183be2f.png?w=960&amp;amp;fit=max"/&gt;&lt;/p&gt;
    &lt;p&gt;The COBOL ALTER statement &lt;em&gt;changes a &lt;code&gt;goto&lt;/code&gt;'s target at runtime&lt;/em&gt;. &lt;/p&gt;
    &lt;p&gt;(Early COBOL has tame gotos but only on a technicality: there are no nested scopes in COBOL so no jumping from outside and into a nested scope.)&lt;/p&gt;
    &lt;p&gt;Anyway I need to write up the full story (and complain about COBOL more) but this is pretty neat! Reminder, &lt;a href="https://leanpub.com/logic/c/feedchicago" target="_blank"&gt;fundraiser here&lt;/a&gt;. Let's get it to 2k.&lt;/p&gt;
    &lt;div class="footnote"&gt;
    &lt;hr/&gt;
    &lt;ol&gt;
    &lt;li id="fn:royalties"&gt;
    &lt;p&gt;Royalties are 80% so if you already have the book you get a bit more bang for your buck by donating to the GCFD directly &lt;a class="footnote-backref" href="#fnref:royalties" title="Jump back to footnote 1 in the text"&gt;↩&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;/ol&gt;
    &lt;/div&gt;</description>
                <pubDate>Mon, 24 Nov 2025 18:21:49 +0000</pubDate>
                <guid>https://buttondown.com/hillelwayne/archive/one-more-week-to-the-logic-for-programmers-food/</guid>
            </item>
            <item>
                <title>Get Logic for Programmers 50% off &amp; Support Chicago Foodbanks</title>
                <link>https://buttondown.com/hillelwayne/archive/get-logic-for-programmers-50-off-support-chicago/</link>
                <description>&lt;p&gt;From now until the end of the month, you can get &lt;a href="https://leanpub.com/logic/c/feedchicago" target="_blank"&gt;Logic for Programmers at half price&lt;/a&gt; with the coupon &lt;code&gt;feedchicago&lt;/code&gt;. All royalties from that coupon will go to the &lt;a href="https://www.chicagosfoodbank.org/" target="_blank"&gt;Greater Chicago Food Depository&lt;/a&gt;. Thank you!&lt;/p&gt;</description>
                <pubDate>Mon, 10 Nov 2025 16:31:11 +0000</pubDate>
                <guid>https://buttondown.com/hillelwayne/archive/get-logic-for-programmers-50-off-support-chicago/</guid>
            </item>
            <item>
                <title>I'm taking a break</title>
                <link>https://buttondown.com/hillelwayne/archive/im-taking-a-break/</link>
                <description>&lt;p&gt;Hi everyone,&lt;/p&gt;
    &lt;p&gt;I've been getting burnt out on writing a weekly software essay. It's gone from taking me an afternoon to write a post to taking two or three days, and that's made it really difficult to get other writing done. That, plus some short-term work and life priorities, means now feels like a good time for a break. &lt;/p&gt;
    &lt;p&gt;So I'm taking off from &lt;em&gt;Computer Things&lt;/em&gt; for the rest of the year. There &lt;em&gt;might&lt;/em&gt; be some announcements and/or one or two short newsletters in the meantime but I won't be attempting a weekly cadence until 2026.&lt;/p&gt;
    &lt;p&gt;Thanks again for reading!&lt;/p&gt;
    &lt;p&gt;Hillel&lt;/p&gt;</description>
                <pubDate>Mon, 27 Oct 2025 21:02:37 +0000</pubDate>
                <guid>https://buttondown.com/hillelwayne/archive/im-taking-a-break/</guid>
            </item>
            <item>
                <title>Modal editing is a weird historical contingency we have through sheer happenstance</title>
                <link>https://buttondown.com/hillelwayne/archive/modal-editing-is-a-weird-historical-contingency/</link>
                <description>&lt;p class="empty-line" style="height:16px; margin:0px !important;"&gt;&lt;/p&gt;
    &lt;p&gt;A while back my friend &lt;a href="https://morepablo.com/" target="_blank"&gt;Pablo Meier&lt;/a&gt; was reviewing some 2024 videogames and wrote &lt;a href="https://morepablo.com/2025/03/games-of-2024.html" target="_blank"&gt;this&lt;/a&gt;:&lt;/p&gt;
    &lt;blockquote&gt;
    &lt;p&gt;I feel like some artists, if they didn't exist, would have the resulting void filled in by someone similar (e.g. if Katy Perry didn't exist, someone like her would have). But others don't have successful imitators or comparisons (thinking Jackie Chan, or Weird Al): they are irreplaceable.  &lt;/p&gt;
    &lt;/blockquote&gt;
    &lt;p&gt;He was using it to describe auteurs but I see this as a property of opportunity, in that "replaceable" artists are those who work in bigger markets. Katy Perry's market is large, visible and obviously (but not &lt;em&gt;easily&lt;/em&gt;) exploitable, so there are a lot of people who'd compete in her niche. Weird Al's market is unclear: while there were successful parody songs in the past, it wasn't clear there was enough opportunity there to support a superstar.&lt;/p&gt;
    &lt;p&gt;I think that modal editing is in the latter category. Vim is now very popular and has spawned numerous successors. But its key feature, &lt;strong&gt;modes&lt;/strong&gt;, is not obviously-beneficial, to the point that if Bill Joy didn't make vi (vim's direct predecessor) fifty years ago I don't think we'd have any modal editors today. &lt;/p&gt;
    &lt;h3&gt;A quick overview of "modal editing"&lt;/h3&gt;
    &lt;p&gt;In a non-modal editor, pressing the "u" key adds a "u" to your text, as you'd expect. In a &lt;strong&gt;modal editor&lt;/strong&gt;, pressing "u" does something different depending on the "mode" you are in. In Vim's default "normal" mode, "u" undoes the last change to the text, while in the "visual" mode it lowercases all selected text. It only inserts the character in "insert" mode. All other keys, as well as chorded shortcuts (&lt;code&gt;ctrl-x&lt;/code&gt;), work the same way. &lt;/p&gt;
    &lt;p&gt;The clearest benefit to this is you can densely pack the keyboard with advanced commands. The standard US keyboard has 48ish keys dedicated to inserting characters. With the ctrl and shift modifiers that becomes at least ~150 extra shortcuts for each other mode. This is also what IMO "spiritually" distinguishes modal editing from contextual shortcuts. Even if a unimodal editor lets you change a keyboard shortcut's behavior based on languages or focused panel, without global user-controlled modes it simply can't achieve that density of shortcuts.&lt;/p&gt;
    &lt;p&gt;Now while modal editing today is widely beloved (the Vim plugin for &lt;a href="https://marketplace.visualstudio.com/items?itemName=vscodevim.vim" target="_blank"&gt;VSCode&lt;/a&gt; has at least eight million downloads), I suspect it was "carried" by the popularity of vi, as opposed to driving vi's popularity.&lt;/p&gt;
    &lt;p class="empty-line" style="height:16px; margin:0px !important;"&gt;&lt;/p&gt;
    &lt;h3&gt;Modal editing is an unusual idea&lt;/h3&gt;
    &lt;p&gt;Pre-vi editors weren't modal. Some, like &lt;a href="https://en.wikipedia.org/wiki/EDT_(Digital)" target="_blank"&gt;EDT/KED&lt;/a&gt;, used chorded commands, while others like &lt;a href="https://en.wikipedia.org/wiki/Ed_(software)" target="_blank"&gt;ed&lt;/a&gt; or &lt;a href="https://en.wikipedia.org/wiki/TECO_(text_editor)" target="_blank"&gt;TECO&lt;/a&gt; basically REPLs for text-editing DSLs. Both of these ideas widely reappear in modern editors.&lt;/p&gt;
    &lt;p&gt;As far as I can tell, the first modal editor was Butler Lampson's &lt;a href="https://en.wikipedia.org/wiki/Bravo_(editor)" target="_blank"&gt;Bravo&lt;/a&gt; in 1974. Bill Joy &lt;a href="https://web.archive.org/web/20120210184000/http://web.cecs.pdx.edu/~kirkenda/joy84.html" target="_blank"&gt;admits he used it for inspiration&lt;/a&gt;: &lt;/p&gt;
    &lt;blockquote&gt;
    &lt;p&gt;A lot of the ideas for the screen editing mode were stolen from a Bravo manual I surreptitiously looked at and copied. Dot is really the double-escape from Bravo, the redo command. Most of the stuff was stolen. &lt;/p&gt;
    &lt;/blockquote&gt;
    &lt;p&gt;Bill Joy probably took the idea because he was working on &lt;a href="https://en.wikipedia.org/wiki/ADM-3A" target="_blank"&gt;dumb terminals&lt;/a&gt; that were slow to register keystrokes, which put pressure to minimize the number needed for complex operations.&lt;/p&gt;
    &lt;p&gt;Why did Bravo have modal editing? Looking at the &lt;a href="https://www.microsoft.com/en-us/research/wp-content/uploads/2016/11/15a-AltoHandbook.pdf" target="_blank"&gt;Alto handbook&lt;/a&gt;, I get the impression that Xerox was trying to figure out the best mouse and GUI workflows. Bravo was an experiment with modes, one hand on the mouse and one issuing commands on the keyboard. Other experiments included context menus (the Markup program) and toolbars (Draw).&lt;/p&gt;
    &lt;p&gt;Xerox very quickly decided &lt;em&gt;against&lt;/em&gt; modes, as the successors &lt;a href="http://www.bitsavers.org/pdf/xerox/alto/memos_1975/Gypsy_The_Ginn_Typescript_System_Apr75.pdf" target="_blank"&gt;Gypsy&lt;/a&gt; and &lt;a href="http://www.bitsavers.org/pdf/xerox/alto/BravoXMan.pdf" target="_blank"&gt;BravoX&lt;/a&gt; were modeless. Commands originally assigned to English letters were moved to graphical menus, special keys, and chords. &lt;/p&gt;
    &lt;p&gt;It seems to me that modes started as an unsuccessful experiment deal with a specific constraint and then later successfully adopted to deal with a different constraint. It was a specialized feature as opposed to a generally useful feature like chords.&lt;/p&gt;
    &lt;h3&gt;Modal editing didn't popularize vi&lt;/h3&gt;
    &lt;p&gt;While vi was popular at Bill Joy's coworkers, he doesn't &lt;a href="https://web.archive.org/web/20120210184000/http://web.cecs.pdx.edu/~kirkenda/joy84.html" target="_blank"&gt;attribute its success to its features&lt;/a&gt;:&lt;/p&gt;
    &lt;blockquote&gt;
    &lt;p&gt;I think the wonderful thing about vi is that it has such a good market share because we gave it away. Everybody has it now. So it actually had a chance to become part of what is perceived as basic UNIX. EMACS is a nice editor too, but because it costs hundreds of dollars, there will always be people who won't buy it. &lt;/p&gt;
    &lt;/blockquote&gt;
    &lt;p&gt;Vi was distributed for free with the popular &lt;a href="https://en.wikipedia.org/wiki/Berkeley_Software_Distribution" target="_blank"&gt;BSD Unix&lt;/a&gt; and was standardized in &lt;a href="https://pubs.opengroup.org/onlinepubs/9799919799/" target="_blank"&gt;POSIX Issue 2&lt;/a&gt;, meaning all Unix OSes had to have vi. That arguably is what made it popular, and why so many people ended up learning a modal editor. &lt;/p&gt;
    &lt;h3&gt;Modal editing doesn't really spread outside of vim&lt;/h3&gt;
    &lt;div class="subscribe-form"&gt;&lt;/div&gt;
    &lt;p&gt;I think by the 90s, people started believing that modal editing was a Good Idea, if not an obvious one. That's why we see direct descendants of vi, most famously vim. It's also why extensible editors like Emacs and VSCode have vim-mode extensions, but these are but these are always simple emulation layers on top of a unimodal baselines. This was good for getting people used to the vim keybindings (I learned on &lt;a href="https://en.wikipedia.org/wiki/Kile" target="_blank"&gt;Kile&lt;/a&gt;) but it means people weren't really &lt;em&gt;doing&lt;/em&gt; anything with modal editing. It was always "The Vim Gimmick".&lt;/p&gt;
    &lt;p&gt;Modes also didn't take off anywhere else. There's no modal word processor, spreadsheet editor, or email client.&lt;sup id="fnref:gmail"&gt;&lt;a class="footnote-ref" href="#fn:gmail"&gt;1&lt;/a&gt;&lt;/sup&gt; &lt;a href="https://www.visidata.org/" target="_blank"&gt;Visidata&lt;/a&gt; is an extremely cool modal data exploration tool but it's pretty niche. Firefox used to have &lt;a href="https://en.wikipedia.org/wiki/Vimperator" target="_blank"&gt;vimperator&lt;/a&gt; (which was inspired by Vim) but that's defunct now. Modal software means modal editing which means vi.&lt;/p&gt;
    &lt;p&gt;This has been changing a little, though! Nowadays we do see new modal text editors, like &lt;a href="https://kakoune.org/" target="_blank"&gt;kakoune&lt;/a&gt; and &lt;a href="https://helix-editor.com/" target="_blank"&gt;Helix&lt;/a&gt;, that don't just try to emulate vi but do entirely new things. These were made, though, in response to perceived shortcomings in vi's editing model. I think they are still classifiable as descendants. If vi never existed, would the developers of kak and helix have still made modal editors, or would they have explored different ideas? &lt;/p&gt;
    &lt;h3&gt;People aren't clamouring for more experiments&lt;/h3&gt;
    &lt;p&gt;Not too related to the overall picture, but a gripe of mine. Vi and vim have a set of hardcoded modes, and adding an entirely new mode is impossible. Like if a plugin (like vim's default &lt;code&gt;netrw&lt;/code&gt;) adds a file explorer it should be able to add a filesystem mode, right? But it can't, so instead it waits for you to open the filesystem and then &lt;a href="https://github.com/vim/vim/blob/0124320c97b0fbbb44613f42fc1c34fee6181fc8/runtime/pack/dist/opt/netrw/autoload/netrw.vim#L4867" target="_blank"&gt;adds 60 new mappings to normal mode&lt;/a&gt;. There's no way to properly add a "filesystem" mode, a "diff" mode, a "git" mode, etc, so plugin developers have to &lt;a href="https://www.hillelwayne.com/post/software-mimicry/" target="_blank"&gt;mimic&lt;/a&gt; them.&lt;/p&gt;
    &lt;p&gt;I don't think people see this as a problem, though! Neovim, which aims to fix all of the baggage in vim's legacy, didn't consider creating modes an important feature. Kak and Helix, which reimagine modal editing from from the ground up, don't support creating modes either.&lt;sup id="fnref:helix"&gt;&lt;a class="footnote-ref" href="#fn:helix"&gt;2&lt;/a&gt;&lt;/sup&gt; People aren't clamouring for new modes!&lt;/p&gt;
    &lt;h2&gt;Modes are a niche power user feature&lt;/h2&gt;
    &lt;p&gt;So far I've been trying to show that vi is, in Pablo's words, "irreplaceable". Editors weren't doing modal editing before Bravo, and even after vi became incredibly popular, unrelated editors did not adapt modal editing. At most, they got a vi emulation layer. Kak and helix complicate this story but I don't think they refute it; they appear much later and arguably count as descendants (so are related). &lt;/p&gt;
    &lt;p&gt;I think the best explanation is that in a vacuum modal editing sounds like a bad idea. The mode is global state that users always have to know, which makes it dangerous. To use new modes well you have to memorize all of the keybindings, which makes it difficult. Modal editing has a brutal skill floor before it becomes more efficient than a unimodal, chorded editor like VSCode.&lt;/p&gt;
    &lt;p&gt;That's why it originally appears in very specific circumstances, as early experiments in mouse UX and as a way of dealing with modem latencies. The fact we have vim today is a historical accident. &lt;/p&gt;
    &lt;p&gt;And I'm glad for it! You can pry Neovim from my cold dead hands, you monsters.&lt;/p&gt;
    &lt;hr/&gt;
    &lt;h1&gt;&lt;a href="https://www.p99conf.io/" target="_blank"&gt;P99 talk this Thursday&lt;/a&gt;!&lt;/h1&gt;
    &lt;p&gt;My talk, "Designing Low-Latency Systems with TLA+", is happening 10/23 at 11:40 central time. Tickets are free, the conf is online, and the talk's only 16 minutes, so come check it out!&lt;/p&gt;
    &lt;div class="footnote"&gt;
    &lt;hr/&gt;
    &lt;ol&gt;
    &lt;li id="fn:gmail"&gt;
    &lt;p&gt;I guess if you squint &lt;a href="https://support.google.com/mail/answer/6594?hl=en&amp;amp;co=GENIE.Platform%3DDesktop" target="_blank"&gt;gmail kinda counts&lt;/a&gt; but it's basically an antifeature &lt;a class="footnote-backref" href="#fnref:gmail" title="Jump back to footnote 1 in the text"&gt;↩&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id="fn:helix"&gt;
    &lt;p&gt;It looks like Helix supports &lt;a href="https://docs.helix-editor.com/remapping.html" target="_blank"&gt;creating minor modes&lt;/a&gt;, but these are only active for one keystroke, making them akin to a better, more ergonomic version of vim multikey mappings. &lt;a class="footnote-backref" href="#fnref:helix" title="Jump back to footnote 2 in the text"&gt;↩&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;/ol&gt;
    &lt;/div&gt;</description>
                <pubDate>Tue, 21 Oct 2025 16:46:24 +0000</pubDate>
                <guid>https://buttondown.com/hillelwayne/archive/modal-editing-is-a-weird-historical-contingency/</guid>
            </item>
            <item>
                <title>The Phase Change</title>
                <link>https://buttondown.com/hillelwayne/archive/the-phase-change/</link>
                <description>&lt;p&gt;Last week I ran my first 10k.&lt;/p&gt;
    &lt;p&gt;It wasn't a race or anything. I left that evening planning to run a 5k, and then three miles later thought "what if I kept going?"&lt;sup id="fnref:distance"&gt;&lt;a class="footnote-ref" href="#fn:distance"&gt;1&lt;/a&gt;&lt;/sup&gt;&lt;/p&gt;
    &lt;p&gt;I've been running for just over two years now. My goal was to run a mile, then three, then three at a pace faster than a power-walk. I wish I could say that I then found joy in running, but really I was just mad at myself for being so bad at it. Spite has always been my brightest muse.&lt;/p&gt;
    &lt;p&gt;Looking back, the thing I find most fascinating is what progress looked like. I couldn't tell you if I was physically progressing steadily, but for sure mental progress moved in discrete jumps. For a long time a 5k was me pushing myself, then suddenly a "phase change" happens and it becomes something I can just do on a run. Sometime in the future the 10k will feel the same way.&lt;/p&gt;
    &lt;p&gt;I've noticed this in a lot of other places. For every skill I know, my sense of myself follows a phase change. In every programming language I've ever learned, I lurch from "bad" to "okay" to "good". There's no "20% bad / 80% normal" in between. Pedagogical experts say that learning is about steadily building a &lt;a href="https://teachtogether.tech/en/index.html#s:models" target="_blank"&gt;mental model&lt;/a&gt; of the topic. It really feels like knowledge grows continuously, and then it suddenly becomes a model.&lt;/p&gt;
    &lt;p&gt;Now, for all the time I spend writing about software history and software theory and stuff, my actually job boils down to &lt;a href="https://www.hillelwayne.com/consulting/" target="_blank"&gt;teaching formal methods&lt;/a&gt;. So I now have two questions about phase changes.&lt;/p&gt;
    &lt;p&gt;The first is "can we make phase changes happen faster?" I don't know if this is even possible! I've found lots of ways to teach concepts faster, cover more ground in less time, so that people know the material more quickly. But it doesn't seem to speed up that very first phase change from "this is foreign" to "this is normal". Maybe we can't really do that until we've spent enough effort on understanding.&lt;/p&gt;
    &lt;p&gt;So the second may be more productive: "can we motivate people to keep going until the phase change?" This is a lot easier to tackle! For example, removing frustration makes a huge difference. Getting a proper pair of running shoes made running so much less unpleasant, and made me more willing to keep putting in the hours. For teaching tech topics like formal methods, this often takes the form of better tooling and troubleshooting info.&lt;/p&gt;
    &lt;p&gt;We can also reduce the effort of investing time. This is also why I prefer to pair on writing specifications with clients and not just write specs for them. It's more work for them than fobbing it all off on me, but a whole lot &lt;em&gt;less&lt;/em&gt; work than writing the spec by themselves, so they'll put in time and gradually develop skills on their own.&lt;/p&gt;
    &lt;p&gt;Question two seems much more fruitful than question one but also so much less interesting! Speeding up the phase change feels like the kind of dream that empires are built on. I know I'm going to keep obsessing over it, even if that leads nowhere.&lt;/p&gt;
    &lt;div class="footnote"&gt;
    &lt;hr/&gt;
    &lt;ol&gt;
    &lt;li id="fn:distance"&gt;
    &lt;p&gt;For non-running Americans: 5km is about 3.1 miles, and 10km is 6.2. &lt;a class="footnote-backref" href="#fnref:distance" title="Jump back to footnote 1 in the text"&gt;↩&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;/ol&gt;
    &lt;/div&gt;</description>
                <pubDate>Thu, 16 Oct 2025 14:59:25 +0000</pubDate>
                <guid>https://buttondown.com/hillelwayne/archive/the-phase-change/</guid>
            </item>
            <item>
                <title>Three ways formally verified code can go wrong in practice</title>
                <link>https://buttondown.com/hillelwayne/archive/three-ways-formally-verified-code-can-go-wrong-in/</link>
                <description>&lt;h3&gt;New Logic for Programmers Release!&lt;/h3&gt;
    &lt;p&gt;&lt;a href="https://leanpub.com/logic/" rel="noopener noreferrer nofollow" target="_blank"&gt;v0.12 is now available&lt;/a&gt;! This should be the last major content release. The next few months are going to be technical review, copyediting and polishing, with a hopeful 1.0 release in March. &lt;a href="https://github.com/logicforprogrammers/book-assets/blob/master/CHANGELOG.md" rel="noopener noreferrer nofollow" target="_blank"&gt;Full release notes here&lt;/a&gt;.&lt;/p&gt;
    &lt;figure&gt;&lt;img alt="Cover of the boooooook" draggable="false" src="https://assets.buttondown.email/images/92b4a35d-2bdd-416a-92c7-15ff42b49d8d.jpg?w=960&amp;amp;fit=max"/&gt;&lt;figcaption&gt;&lt;/figcaption&gt;&lt;/figure&gt;
    &lt;p class="empty-line" style="height:16px; margin:0px !important;"&gt;&lt;/p&gt;
    &lt;h1&gt;Three ways formally verified code can go wrong in practice&lt;/h1&gt;
    &lt;p&gt;I run this small project called &lt;a href="https://github.com/hwayne/lets-prove-leftpad" rel="noopener noreferrer nofollow" target="_blank"&gt;Let's Prove Leftpad&lt;/a&gt;, where people submit formally verified proofs of the &lt;a href="https://en.wikipedia.org/wiki/Npm_left-pad_incident" rel="noopener noreferrer nofollow" target="_blank"&gt;eponymous meme&lt;/a&gt;. Recently I read &lt;a href="https://lukeplant.me.uk/blog/posts/breaking-provably-correct-leftpad/" rel="noopener noreferrer nofollow" target="_blank"&gt;Breaking “provably correct” Leftpad&lt;/a&gt;, which argued that most (if not all) of the provably correct leftpads have bugs! The lean proof, for example, &lt;em&gt;should&lt;/em&gt; render &lt;code&gt;leftpad('-', 9, אֳֽ֑)&lt;/code&gt; as &lt;code&gt;---------אֳֽ֑&lt;/code&gt;, but actually does &lt;code&gt;------אֳֽ֑&lt;/code&gt;.&lt;/p&gt;
    &lt;p&gt;You can read the article for a good explanation of why this goes wrong (Unicode). The actual problem is that correct can mean two different things, and this leads to confusion about how much formal methods can actually guarantee us. So I see this as a great opportunity to talk about the nature of proof, correctness, and how "correct" code can still have bugs.&lt;/p&gt;
    &lt;h2&gt;What we talk about when we talk about correctness&lt;/h2&gt;
    &lt;p&gt;In most of the real world, correct means "no bugs". Except "bugs" isn't a very clear category. A bug is anything that causes someone to say "this isn't working right, there's a bug." Being too slow is a bug, a typo is a bug, etc. "correct" is a little fuzzy.&lt;/p&gt;
    &lt;p&gt;In formal methods, "correct" has a very specific and precise meaning: the code conforms to a &lt;strong&gt;specification&lt;/strong&gt; (or "spec"). The spec is a higher-level description of what is supposed the code's properties, usually something we can't just directly implement. Let's look at the most popular kind of proven specification:&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="c1"&gt;-- Haskell&lt;/span&gt;
    &lt;span class="nf"&gt;inc&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="ow"&gt;::&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kt"&gt;Int&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;-&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;gt&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kt"&gt;Int&lt;/span&gt;
    &lt;span class="nf"&gt;inc&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="ow"&gt;=&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;+&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;The type signature &lt;code&gt;Int -&amp;gt; Int&lt;/code&gt; is a specification! It corresponds to the logical statement &lt;code&gt;all x in Int: inc(x) in Int&lt;/code&gt;. The Haskell type checker can automatically verify this for us. It cannot, however, verify properties like &lt;code&gt;all x in Int: inc(x) &amp;gt; x&lt;/code&gt;. Formal verification is concerned with verifying arbitrary properties beyond what is (easily) automatically verifiable. Most often, this takes the form of proof. A human manually writes a proof that the code conforms to its specification, and the prover checks that the proof is correct.&lt;/p&gt;
    &lt;p&gt;Even if we have a proof of "correctness", though, there's a few different ways the code can still have bugs.&lt;/p&gt;
    &lt;h3&gt;1. The proof is invalid&lt;/h3&gt;
    &lt;p&gt;For some reason the proof doesn't actually show the code matches the specification. This is pretty common in pencil-and-paper verification, where the proof is checked by someone saying "yep looks good to me". It's much rarer when doing formal verification but it can still happen in a couple of specific cases:&lt;/p&gt;
    &lt;ol&gt;&lt;li&gt;&lt;p&gt;The theorem prover itself has a bug (in the code or introduced in the compiled binary) that makes it accept an incorrect proof. This is something people are really concerned about but it's so much rarer than every other way verified code goes wrong, so is only included for completeness.&lt;/p&gt;&lt;/li&gt;&lt;li&gt;&lt;p&gt;For convenience, most provers and FM languages have an "just accept this statement is true" feature. This helps you work on the big picture proof and fill in the details later. If you leave in a shortcut, &lt;em&gt;and&lt;/em&gt; the compiler is configured to allow code-with-proof-assumptions to compile, &lt;em&gt;then&lt;/em&gt; you can compile incorrect code that "passes the proof checker". You really should know better, though.&lt;/p&gt;&lt;/li&gt;&lt;/ol&gt;
    &lt;div class="subscribe-form"&gt;&lt;/div&gt;
    &lt;h3&gt;2. The properties are wrong&lt;/h3&gt;
    &lt;blockquote&gt;&lt;figure&gt;&lt;img alt="The horrible bug you had wasn't covered in the specification/came from some other module/etc" draggable="false" src="https://cdn.prod.website-files.com/673b407e535dbf3b547179ff/681ca0bf4a045f39f785faeb_AD_4nXfFhdn6DGmgLAcmaUNHl9a3Nog8gH8Hluve5Kof7zLk4CyOlD4zCmCqVJaowKqu-pTicwZ393jE7anIrjYZTSuRvGiYhFhAkkX9vifNt9vEWYwZUp65hsbrRTmZzRgb9vgu7n7buA.png"/&gt;&lt;figcaption&gt;&lt;/figcaption&gt;&lt;/figure&gt;&lt;p&gt;&lt;a href="https://www.galois.com/articles/what-works-and-doesnt-selling-formal-methods" rel="noopener noreferrer nofollow" target="_blank"&gt;Galois&lt;/a&gt;&lt;/p&gt;&lt;/blockquote&gt;
    &lt;p&gt;This code is provably correct:&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="nf"&gt;inc&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="ow"&gt;::&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kt"&gt;Int&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;-&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;gt&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kt"&gt;Int&lt;/span&gt;
    &lt;span class="nf"&gt;inc&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="ow"&gt;=&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;The only specification I've given is the type signature &lt;code&gt;Int -&amp;gt; Int&lt;/code&gt;. At no point did I put the property &lt;code&gt;inc(x) &amp;gt; x&lt;/code&gt; in my specification, so it doesn't matter that it doesn't hold, the code is still "correct".&lt;/p&gt;
    &lt;p&gt;This is what "went wrong" with the leftpad proofs. They do &lt;em&gt;not&lt;/em&gt; prove the property "&lt;code&gt;leftpad(c, n, s)&lt;/code&gt; will take up either &lt;code&gt;n&lt;/code&gt; spaces on the screen or however many characters &lt;code&gt;s&lt;/code&gt; takes up (if more than &lt;code&gt;n&lt;/code&gt;)". They prove the weaker property "&lt;code&gt;len(leftpad(c, n, s)) == max(n, len(s))&lt;/code&gt;, for however you want to define &lt;code&gt;len(string)&lt;/code&gt;". The second is a rough proxy for the first that works in most cases, but if someone really needs the former property they are liable to experience a bug.&lt;/p&gt;
    &lt;p&gt;Why don't we prove the stronger property? Sometimes it's because the code is meant to be used one way and people want to use it another way. This can lead to accusations that the developer is "misusing the provably correct code" but this should more often be seen as the verification expert failing to educate devs on was actually "proven".&lt;/p&gt;
    &lt;p&gt;Sometimes it's because the property is too hard to prove. "Outputs are visually aligned" is a proof about Unicode inputs, and the &lt;em&gt;core&lt;/em&gt; Unicode specification is &lt;a href="https://www.unicode.org/versions/Unicode17.0.0/UnicodeStandard-17.0.pdf" rel="noopener noreferrer nofollow" target="_blank"&gt;1,243 pages long&lt;/a&gt;.&lt;/p&gt;
    &lt;p&gt;Sometimes it's because the property we want is too hard to &lt;em&gt;express&lt;/em&gt;. How do you mathematically represent "people will perceive the output as being visually aligned"? Is it OS and font dependent? These two lines are exactly five characters but not visually aligned:&lt;/p&gt;
    &lt;blockquote&gt;&lt;p&gt;|||||&lt;/p&gt;&lt;p&gt;MMMMM&lt;/p&gt;&lt;/blockquote&gt;
    &lt;p&gt;Or maybe they are aligned for you! I don't know, lots of people read email in a monospace font. "We can't express the property" comes up a lot when dealing with human/business concepts as opposed to mathematical/computational ones.&lt;/p&gt;
    &lt;p&gt;Finally, there's just the possibility of a brain fart. All of the proofs in &lt;a href="https://research.google/blog/extra-extra-read-all-about-it-nearly-all-binary-searches-and-mergesorts-are-broken/" rel="noopener noreferrer nofollow" target="_blank"&gt;Nearly All Binary Searches and Mergesorts are Broken&lt;/a&gt; are like this. They (informally) proved the correctness of binary search with unbound integers, forgetting that many programming languages use &lt;em&gt;machine&lt;/em&gt; integers, where a large enough sum can overflow.&lt;/p&gt;
    &lt;h3&gt;3. The assumptions are wrong&lt;/h3&gt;
    &lt;p&gt;This is arguably the most important and most subtle source of bugs. Most properties we prove aren't "&lt;code&gt;X&lt;/code&gt; is always true". They are "&lt;em&gt;assuming&lt;/em&gt; &lt;code&gt;Y&lt;/code&gt; is true, &lt;code&gt;X&lt;/code&gt; is also true". Then if &lt;code&gt;Y&lt;/code&gt; is not true, the proof no longer guarantees &lt;code&gt;X&lt;/code&gt;. A good example of this is binary &lt;s&gt;sort&lt;/s&gt; &lt;em&gt;search&lt;/em&gt;, which only correctly finds elements &lt;em&gt;assuming&lt;/em&gt; the input list is sorted. If the list is not sorted, it will not work correctly.&lt;/p&gt;
    &lt;p&gt;Formal verification adds two more wrinkles. One: sometimes we need assumptions to make the property valid, but we can also add them to make the proof easier. So the code can be bug-free even if the assumptions used to verify it no longer hold! Even if a leftpad implements visual alignment for all Unicode glyphs, it will be a lot easier to &lt;em&gt;prove&lt;/em&gt; visual alignment for just ASCII strings and padding.&lt;/p&gt;
    &lt;p&gt;Two: we need make a lot of &lt;em&gt;environmental&lt;/em&gt; assumptions that are outside our control. Does the algorithm return output or use the stack? Need to assume that there's sufficient memory to store stuff. Does it use any variables? Need to assume nothing is concurrently modifying them. Does it use an external service? Need to assume the vendor doesn't change the API or response formats. You need to assume the compiler worked correctly, the hardware isn't faulty, and the OS doesn't mess with things, etc. Any of these could change well after the code is proven and deployed, meaning formal verification can't be a one-and-done thing.&lt;/p&gt;
    &lt;p&gt;You don't actually have to assume most of these, but each assumption drop makes the proof harder and the properties you can prove more restricted. Remember, the code might still be bug-free even if the environmental assumptions change, so there's a tradeoff in time spent proving vs doing other useful work.&lt;/p&gt;
    &lt;p&gt;Another common source of "assumptions" is when verified code depends on unverified code. The Rust compiler can prove that safe code doesn't have a memory bug &lt;em&gt;assuming&lt;/em&gt; unsafe code does not have one either, but depends on the human to confirm that assumption. &lt;a href="https://ucsd-progsys.github.io/liquidhaskell/" rel="noopener noreferrer nofollow" target="_blank"&gt;Liquid Haskell&lt;/a&gt; is verifiable but can also call regular Haskell libraries, which are unverified. We need to assume that code is correct (in the "conforms to spec") sense, and if it's not, our proof can be "correct" and still cause bugs.&lt;/p&gt;
    &lt;hr/&gt;&lt;p&gt;These boundaries are fuzzy. I wrote that the "binary search" bug happened because they proved the wrong property, but you can just as well argue that it was a broken assumption (that integers could not overflow). What really matters is having a clear understanding of what "this code is proven correct" actually &lt;em&gt;tells&lt;/em&gt; you. Where can you use it safely? When should you worry? How do you communicate all of this to your teammates?&lt;/p&gt;
    &lt;p&gt;Good lord it's already Friday&lt;/p&gt;</description>
                <pubDate>Fri, 10 Oct 2025 17:06:19 +0000</pubDate>
                <guid>https://buttondown.com/hillelwayne/archive/three-ways-formally-verified-code-can-go-wrong-in/</guid>
            </item>
            <item>
                <title>New Blog Post: " A Very Early History of Algebraic Data Types"</title>
                <link>https://buttondown.com/hillelwayne/archive/new-blog-post-a-very-early-history-of-algebraic/</link>
                <description>&lt;p&gt;Last week I said that this week's newsletter would be a brief history of algebraic data types.&lt;/p&gt;
    &lt;p&gt;I was wrong.&lt;/p&gt;
    &lt;p&gt;That history is now a &lt;a href="https://www.hillelwayne.com/post/algdt-history/" target="_blank"&gt;3500 word blog post&lt;/a&gt;.&lt;/p&gt;
    &lt;p&gt;&lt;a href="https://www.patreon.com/posts/blog-notes-very-139696324?utm_medium=clipboard_copy&amp;amp;utm_source=copyLink&amp;amp;utm_campaign=postshare_creator&amp;amp;utm_content=join_link" target="_blank"&gt;Patreon blog notes here&lt;/a&gt;.&lt;/p&gt;
    &lt;hr/&gt;
    &lt;h3&gt;I'm speaking at &lt;a href="https://www.p99conf.io/" target="_blank"&gt;P99 Conf&lt;/a&gt;!&lt;/h3&gt;
    &lt;p&gt;My talk, "Designing Low-Latency Systems with TLA+", is happening 10/23 at 11:30 central time. It's an online conf and the talk's only 16 minutes, so come check it out!&lt;/p&gt;</description>
                <pubDate>Thu, 25 Sep 2025 16:50:58 +0000</pubDate>
                <guid>https://buttondown.com/hillelwayne/archive/new-blog-post-a-very-early-history-of-algebraic/</guid>
            </item>
            <item>
                <title>Many Hard Leetcode Problems are Easy Constraint Problems</title>
                <link>https://buttondown.com/hillelwayne/archive/many-hard-leetcode-problems-are-easy-constraint/</link>
                <description>&lt;p&gt;In my first interview out of college I was asked the change counter problem:&lt;/p&gt;
    &lt;blockquote&gt;
    &lt;p&gt;Given a set of coin denominations, find the minimum number of coins required to make change for a given number. IE for USA coinage and 37 cents, the minimum number is four (quarter, dime, 2 pennies).&lt;/p&gt;
    &lt;/blockquote&gt;
    &lt;p&gt;I implemented the simple greedy algorithm and immediately fell into the trap of the question: the greedy algorithm only works for "well-behaved" denominations. If the coin values were &lt;code&gt;[10, 9, 1]&lt;/code&gt;, then making 37 cents would take 10 coins in the greedy algorithm but only 4 coins optimally (&lt;code&gt;10+9+9+9&lt;/code&gt;). The "smart" answer is to use a dynamic programming algorithm, which I didn't know how to do. So I failed the interview.&lt;/p&gt;
    &lt;p&gt;But you only need dynamic programming if you're writing your own algorithm. It's really easy if you throw it into a constraint solver like &lt;a href="https://www.minizinc.org/" target="_blank"&gt;MiniZinc&lt;/a&gt; and call it a day. &lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;int: total;
    array[int] of int: values = [10, 9, 1];
    array[index_set(values)] of var 0..: coins;
    
    constraint sum (c in index_set(coins)) (coins[c] * values[c]) == total;
    solve minimize sum(coins);
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;You can try this online &lt;a href="https://play.minizinc.dev/" target="_blank"&gt;here&lt;/a&gt;. It'll give you a prompt to put in &lt;code&gt;total&lt;/code&gt; and then give you successively-better solutions:&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;coins = [0, 0, 37];
    ----------
    coins = [0, 1, 28];
    ----------
    coins = [0, 2, 19];
    ----------
    coins = [0, 3, 10];
    ----------
    coins = [0, 4, 1];
    ----------
    coins = [1, 3, 0];
    ----------
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p class="empty-line" style="height:16px; margin:0px !important;"&gt;&lt;/p&gt;
    &lt;p&gt;Lots of similar interview questions are this kind of mathematical optimization problem, where we have to find the maximum or minimum of a function corresponding to constraints. They're hard in programming languages because programming languages are too low-level. They are also exactly the problems that constraint solvers were designed to solve. Hard leetcode problems are easy constraint problems.&lt;sup id="fnref:leetcode"&gt;&lt;a class="footnote-ref" href="#fn:leetcode"&gt;1&lt;/a&gt;&lt;/sup&gt; Here I'm using MiniZinc, but you could just as easily use Z3 or OR-Tools or whatever your favorite generalized solver is.&lt;/p&gt;
    &lt;h3&gt;More examples&lt;/h3&gt;
    &lt;p&gt;This was a question in a different interview (which I thankfully passed):&lt;/p&gt;
    &lt;blockquote&gt;
    &lt;p&gt;Given a list of stock prices through the day, find maximum profit you can get by buying one stock and selling one stock later.&lt;/p&gt;
    &lt;/blockquote&gt;
    &lt;p&gt;It's easy to do in O(n^2) time, or if you are clever, you can do it in O(n). Or you could be not clever at all and just write it as a constraint problem:&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;array[int] of int: prices = [3, 1, 4, 1, 5, 9, 2, 6, 5, 3, 5, 8];
    var int: buy;
    var int: sell;
    var int: profit = prices[sell] - prices[buy];
    
    constraint sell &amp;gt; buy;
    constraint profit &amp;gt; 0;
    solve maximize profit;
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;Reminder, link to trying it online &lt;a href="https://play.minizinc.dev/" target="_blank"&gt;here&lt;/a&gt;. While working at that job, one interview question we tested out was:&lt;/p&gt;
    &lt;blockquote&gt;
    &lt;p&gt;Given a list, determine if three numbers in that list can be added or subtracted to give 0? &lt;/p&gt;
    &lt;/blockquote&gt;
    &lt;p&gt;This is a satisfaction problem, not a constraint problem: we don't need the "best answer", any answer will do. We eventually decided against it for being too tricky for the engineers we were targeting. But it's not tricky in a solver; &lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;include "globals.mzn";
    array[int] of int: numbers = [3, 1, 4, 1, 5, 9, 2, 6, 5, 3, 5, 8];
    array[index_set(numbers)] of var {0, -1, 1}: choices;
    
    constraint sum(n in index_set(numbers)) (numbers[n] * choices[n]) = 0;
    constraint count(choices, -1) + count(choices, 1) = 3;
    solve satisfy;
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;Okay, one last one, a problem I saw last year at &lt;a href="https://chicagopython.github.io/algosig/" target="_blank"&gt;Chipy AlgoSIG&lt;/a&gt;. Basically they pick some leetcode problems and we all do them. I failed to solve &lt;a href="https://leetcode.com/problems/largest-rectangle-in-histogram/description/" target="_blank"&gt;this one&lt;/a&gt;:&lt;/p&gt;
    &lt;blockquote&gt;
    &lt;p&gt;Given an array of integers heights representing the histogram's bar height where the width of each bar is 1, return the area of the largest rectangle in the histogram.&lt;/p&gt;
    &lt;p&gt;&lt;img alt="example from leetcode link" class="newsletter-image" src="https://assets.buttondown.email/images/63337f78-7138-4b21-87a0-917c0c5b1706.jpg?w=960&amp;amp;fit=max"/&gt;&lt;/p&gt;
    &lt;/blockquote&gt;
    &lt;p&gt;The "proper" solution is a tricky thing involving tracking lots of bookkeeping states, which you can completely bypass by expressing it as constraints:&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;array[int] of int: numbers = [2,1,5,6,2,3];
    
    var 1..length(numbers): x; 
    var 1..length(numbers): dx;
    var 1..: y;
    
    constraint x + dx &amp;lt;= length(numbers);
    constraint forall (i in x..(x+dx)) (y &amp;lt;= numbers[i]);
    
    var int: area = (dx+1)*y;
    solve maximize area;
    
    output ["(\(x)-&amp;gt;\(x+dx))*\(y) = \(area)"]
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;There's even a way to &lt;a href="https://docs.minizinc.dev/en/2.9.3/visualisation.html" target="_blank"&gt;automatically visualize the solution&lt;/a&gt; (using &lt;code&gt;vis_geost_2d&lt;/code&gt;), but I didn't feel like figuring it out in time for the newsletter.&lt;/p&gt;
    &lt;h3&gt;Is this better?&lt;/h3&gt;
    &lt;p&gt;Now if I actually brought these questions to an interview the interviewee could ruin my day by asking "what's the runtime complexity?" Constraint solvers runtimes are unpredictable and almost always slower than an ideal bespoke algorithm because they are more expressive, in what I refer to as the &lt;a href="https://buttondown.com/hillelwayne/archive/the-capability-tractability-tradeoff/" target="_blank"&gt;capability/tractability tradeoff&lt;/a&gt;. But even so, they'll do way better than a &lt;em&gt;bad&lt;/em&gt; bespoke algorithm, and I'm not experienced enough in handwriting algorithms to consistently beat a solver.&lt;/p&gt;
    &lt;p&gt;The real advantage of solvers, though, is how well they handle new constraints. Take the stock picking problem above. I can write an O(n²) algorithm in a few minutes and the O(n) algorithm if you give me some time to think. Now change the problem to&lt;/p&gt;
    &lt;blockquote&gt;
    &lt;p&gt;Maximize the profit by buying and selling up to &lt;code&gt;max_sales&lt;/code&gt; stocks, but you can only buy or sell one stock at a given time and you can only hold up to &lt;code&gt;max_hold&lt;/code&gt; stocks at a time?&lt;/p&gt;
    &lt;/blockquote&gt;
    &lt;p&gt;That's a way harder problem to write even an inefficient algorithm for! While the constraint problem is only a tiny bit more complicated:&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;include "globals.mzn";
    int: max_sales = 3;
    int: max_hold = 2;
    array[int] of int: prices = [3, 1, 4, 1, 5, 9, 2, 6, 5, 3, 5, 8];
    array [1..max_sales] of var int: buy;
    array [1..max_sales] of var int: sell;
    array [index_set(prices)] of var 0..max_hold: stocks_held;
    var int: profit = sum(s in 1..max_sales) (prices[sell[s]] - prices[buy[s]]);
    
    constraint forall (s in 1..max_sales) (sell[s] &amp;gt; buy[s]);
    constraint profit &amp;gt; 0;
    
    constraint forall(i in index_set(prices)) (stocks_held[i] = (count(s in 1..max_sales) (buy[s] &amp;lt;= i) - count(s in 1..max_sales) (sell[s] &amp;lt;= i)));
    constraint alldifferent(buy ++ sell);
    solve maximize profit;
    
    output ["buy at \(buy)\n", "sell at \(sell)\n", "for \(profit)"];
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p class="empty-line" style="height:16px; margin:0px !important;"&gt;&lt;/p&gt;
    &lt;p&gt;Most constraint solving examples online are puzzles, like &lt;a href="https://docs.minizinc.dev/en/stable/modelling2.html#ex-sudoku" target="_blank"&gt;Sudoku&lt;/a&gt; or "&lt;a href="https://docs.minizinc.dev/en/stable/modelling2.html#ex-smm" target="_blank"&gt;SEND + MORE = MONEY&lt;/a&gt;". Solving leetcode problems would be a more interesting demonstration. And you get more interesting opportunities to teach optimizations, like symmetry breaking.&lt;/p&gt;
    &lt;hr/&gt;
    &lt;h3&gt;Update for the Internet&lt;/h3&gt;
    &lt;p&gt;This was sent as a weekly newsletter, which is usually on topics like &lt;a href="https://buttondown.com/hillelwayne/archive/why-do-we-call-it-boilerplate-code" target="_blank"&gt;software history&lt;/a&gt;, &lt;a href="https://buttondown.com/hillelwayne/archive/the-seven-specification-ur-languages/" target="_blank"&gt;formal methods&lt;/a&gt;, &lt;a href="https://buttondown.com/hillelwayne/archive/i-formally-modeled-dreidel-for-no-good-reason/" target="_blank"&gt;unusual technologies&lt;/a&gt;, and the &lt;a href="https://buttondown.com/hillelwayne/archive/be-suspicious-of-success/" target="_blank"&gt;theory of software engineering&lt;/a&gt;. You can subscribe here: &lt;/p&gt;
    &lt;div class="subscribe-form"&gt;&lt;/div&gt;
    &lt;div class="footnote"&gt;
    &lt;hr/&gt;
    &lt;ol&gt;
    &lt;li id="fn:leetcode"&gt;
    &lt;p&gt;Because my dad will email me if I don't explain this: "leetcode" is slang for "tricky algorithmic interview questions that have little-to-no relevance in the actual job you're interviewing for." It's from &lt;a href="https://leetcode.com/" target="_blank"&gt;leetcode.com&lt;/a&gt;. &lt;a class="footnote-backref" href="#fnref:leetcode" title="Jump back to footnote 1 in the text"&gt;↩&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;/ol&gt;
    &lt;/div&gt;</description>
                <pubDate>Wed, 10 Sep 2025 13:00:00 +0000</pubDate>
                <guid>https://buttondown.com/hillelwayne/archive/many-hard-leetcode-problems-are-easy-constraint/</guid>
            </item>
            <item>
                <title>The Angels and Demons of Nondeterminism</title>
                <link>https://buttondown.com/hillelwayne/archive/the-angels-and-demons-of-nondeterminism/</link>
                <description>&lt;p&gt;Greetings everyone! You might have noticed that it's September and I don't have the next version of &lt;em&gt;Logic for Programmers&lt;/em&gt; ready. As penance, &lt;a href="https://leanpub.com/logic/c/september-2025-kuBCrhBnUzb7" target="_blank"&gt;here's ten free copies of the book&lt;/a&gt;.&lt;/p&gt;
    &lt;p&gt;So a few months ago I wrote &lt;a href="https://buttondown.com/hillelwayne/archive/five-kinds-of-nondeterminism/" target="_blank"&gt;a newsletter&lt;/a&gt; about how we use nondeterminism in formal methods.  The overarching idea:&lt;/p&gt;
    &lt;ol&gt;
    &lt;li&gt;Nondeterminism is when multiple paths are possible from a starting state.&lt;/li&gt;
    &lt;li&gt;A system preserves a property if it holds on &lt;em&gt;all&lt;/em&gt; possible paths. If even one path violates the property, then we have a bug.&lt;/li&gt;
    &lt;/ol&gt;
    &lt;p&gt;An intuitive model of this is that for this is that when faced with a nondeterministic choice, the system always makes the &lt;em&gt;worst possible choice&lt;/em&gt;. This is sometimes called &lt;strong&gt;demonic nondeterminism&lt;/strong&gt; and is favored in formal methods because we are paranoid to a fault.&lt;/p&gt;
    &lt;p&gt;The opposite would be &lt;strong&gt;angelic nondeterminism&lt;/strong&gt;, where the system always makes the &lt;em&gt;best possible choice&lt;/em&gt;. A property then holds if &lt;em&gt;any&lt;/em&gt; possible path satisfies that property.&lt;sup id="fnref:duals"&gt;&lt;a class="footnote-ref" href="#fn:duals"&gt;1&lt;/a&gt;&lt;/sup&gt; This is not as common in FM, but it still has its uses! "Players can access the secret level" or "&lt;a href="https://www.hillelwayne.com/post/safety-and-liveness/#other-properties" target="_blank"&gt;We can always shut down the computer&lt;/a&gt;" are &lt;strong&gt;reachability&lt;/strong&gt; properties, that something is possible even if not actually done.&lt;/p&gt;
    &lt;p&gt;In broader computer science research, I'd say that angelic nondeterminism is more popular, due to its widespread use in complexity analysis and programming languages.&lt;/p&gt;
    &lt;h3&gt;Complexity Analysis&lt;/h3&gt;
    &lt;p&gt;P is the set of all "decision problems" (&lt;em&gt;basically&lt;/em&gt;, boolean functions) can be solved in polynomial time: there's an algorithm that's worst-case in &lt;code&gt;O(n)&lt;/code&gt;, &lt;code&gt;O(n²)&lt;/code&gt;, &lt;code&gt;O(n³)&lt;/code&gt;, etc.&lt;sup id="fnref:big-o"&gt;&lt;a class="footnote-ref" href="#fn:big-o"&gt;2&lt;/a&gt;&lt;/sup&gt;  NP is the set of all problems that can be solved in polynomial time by an algorithm with &lt;em&gt;angelic nondeterminism&lt;/em&gt;.&lt;sup id="fnref:TM"&gt;&lt;a class="footnote-ref" href="#fn:TM"&gt;3&lt;/a&gt;&lt;/sup&gt; For example, the question "does list &lt;code&gt;l&lt;/code&gt; contain &lt;code&gt;x&lt;/code&gt;" can be solved in O(1) time by a nondeterministic algorithm:&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;fun is_member(l: List[T], x: T): bool {
      if l == [] {return false};
    
      guess i in 0..&amp;lt;(len(l)-1);
      return l[i] == x;
    }
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;Say call &lt;code&gt;is_member([a, b, c, d], c)&lt;/code&gt;. The best possible choice would be to guess &lt;code&gt;i = 2&lt;/code&gt;, which would correctly return true. Now call &lt;code&gt;is_member([a, b], d)&lt;/code&gt;. No matter what we guess, the algorithm correctly returns false. and just return false. Ergo, O(1). NP stands for "Nondeterministic Polynomial". &lt;/p&gt;
    &lt;p&gt;(And I just now realized something pretty cool: you can say that P is the set of all problems solvable in polynomial time under &lt;em&gt;demonic nondeterminism&lt;/em&gt;, which is a nice parallel between the two classes.)&lt;/p&gt;
    &lt;p&gt;Computer scientists have proven that angelic nondeterminism doesn't give us any more "power": there are no problems solvable with AN that aren't also solvable deterministically. The big question is whether AN is more &lt;em&gt;efficient&lt;/em&gt;: it is widely believed, but not &lt;em&gt;proven&lt;/em&gt;, that there are problems in NP but not in P. Most famously, "Is there any variable assignment that makes this boolean formula true?" A polynomial AN algorithm is again easy:&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;fun SAT(f(x1, x2, …: bool): bool): bool {
       N = num_params(f)
       for i in 1..=num_params(f) {
         guess x_i in {true, false}
       }
    
       return f(x_1, x_2, …)
    }
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;The best deterministic algorithms we have to solve the same problem are worst-case exponential with the number of boolean parameters. This a real frustrating problem because real computers don't have angelic nondeterminism, so problems like SAT remain hard. We can solve most "well-behaved" instances of the problem &lt;a href="https://www.hillelwayne.com/post/np-hard/" target="_blank"&gt;in reasonable time&lt;/a&gt;, but the worst-case instances get intractable real fast.&lt;/p&gt;
    &lt;h3&gt;Means of Abstraction&lt;/h3&gt;
    &lt;div class="subscribe-form"&gt;&lt;/div&gt;
    &lt;p&gt;We can directly turn an AN algorithm into a (possibly much slower) deterministic algorithm, such as by &lt;a href="https://en.wikipedia.org/wiki/Backtracking" target="_blank"&gt;backtracking&lt;/a&gt;. This makes AN a pretty good abstraction over what an algorithm is doing. Does the regex &lt;code&gt;(a+b)\1+&lt;/code&gt; match "abaabaabaab"? Yes, if the regex engine nondeterministically guesses that it needs to start at the third letter and make the group &lt;code&gt;aab&lt;/code&gt;. How does my PL's regex implementation find that match? I dunno, backtracking or &lt;a href="https://swtch.com/~rsc/regexp/regexp1.html" target="_blank"&gt;NFA construction&lt;/a&gt; or something, I don't need to know the deterministic specifics in order to use the nondeterministic abstraction.&lt;/p&gt;
    &lt;p&gt;Neel Krishnaswami has &lt;a href="https://semantic-domain.blogspot.com/2013/07/what-declarative-languages-are.html" target="_blank"&gt;a great definition of 'declarative language'&lt;/a&gt;: "any language with a semantics has some nontrivial existential quantifiers in it". I'm not sure if this is &lt;em&gt;identical&lt;/em&gt; to saying "a language with an angelic nondeterministic abstraction", but they must be pretty close, and all of his examples match:&lt;/p&gt;
    &lt;ul&gt;
    &lt;li&gt;SQL's selects and joins&lt;/li&gt;
    &lt;li&gt;Parsing DSLs&lt;/li&gt;
    &lt;li&gt;Logic programming's unification&lt;/li&gt;
    &lt;li&gt;Constraint solving&lt;/li&gt;
    &lt;/ul&gt;
    &lt;p&gt;On top of that I'd add CSS selectors and &lt;a href="https://www.hillelwayne.com/post/picat/" target="_blank"&gt;planner's actions&lt;/a&gt;; all nondeterministic abstractions over a deterministic implementation. He also says that the things programmers hate most in declarative languages are features that "that expose the operational model": constraint solver search strategies, Prolog cuts, regex backreferences, etc. Which again matches my experiences with angelic nondeterminism: I dread features that force me to understand the deterministic implementation. But they're necessary, since P probably != NP and so we need to worry about operational optimizations.&lt;/p&gt;
    &lt;h3&gt;Eldritch Nondeterminism&lt;/h3&gt;
    &lt;p&gt;If you need to know the &lt;a href="https://en.wikipedia.org/wiki/PP_(complexity)" target="_blank"&gt;ratio of good/bad paths&lt;/a&gt;, &lt;a href="https://en.wikipedia.org/wiki/%E2%99%AFP" target="_blank"&gt;the number of good paths&lt;/a&gt;, or probability, or anything more than "there is a good path" or "there is a bad path", you are beyond the reach of heaven or hell.&lt;/p&gt;
    &lt;div class="footnote"&gt;
    &lt;hr/&gt;
    &lt;ol&gt;
    &lt;li id="fn:duals"&gt;
    &lt;p&gt;Angelic and demonic nondeterminism are &lt;a href="https://buttondown.com/hillelwayne/archive/logical-duals-in-software-engineering/" target="_blank"&gt;duals&lt;/a&gt;: angelic returns "yes" if &lt;code&gt;some choice: correct&lt;/code&gt; and demonic returns "no" if &lt;code&gt;!all choice: correct&lt;/code&gt;, which is the same as &lt;code&gt;some choice: !correct&lt;/code&gt;. &lt;a class="footnote-backref" href="#fnref:duals" title="Jump back to footnote 1 in the text"&gt;↩&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id="fn:big-o"&gt;
    &lt;p&gt;Pet peeve about Big-O notation: &lt;code&gt;O(n²)&lt;/code&gt; is the &lt;em&gt;set&lt;/em&gt; of all algorithms that, for sufficiently large problem sizes, grow no faster that quadratically. "Bubblesort has &lt;code&gt;O(n²)&lt;/code&gt; complexity" &lt;em&gt;should&lt;/em&gt; be written &lt;code&gt;Bubblesort in O(n²)&lt;/code&gt;, &lt;em&gt;not&lt;/em&gt; &lt;code&gt;Bubblesort = O(n²)&lt;/code&gt;. &lt;a class="footnote-backref" href="#fnref:big-o" title="Jump back to footnote 2 in the text"&gt;↩&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id="fn:TM"&gt;
    &lt;p&gt;To be precise, solvable in polynomial time by a &lt;em&gt;Nondeterministic Turing Machine&lt;/em&gt;, a very particular model of computation. We can broadly talk about P and NP without framing everything in terms of Turing machines, but some details of complexity classes (like the existence "weak NP-hardness") kinda need Turing machines to make sense. &lt;a class="footnote-backref" href="#fnref:TM" title="Jump back to footnote 3 in the text"&gt;↩&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;/ol&gt;
    &lt;/div&gt;</description>
                <pubDate>Thu, 04 Sep 2025 14:00:00 +0000</pubDate>
                <guid>https://buttondown.com/hillelwayne/archive/the-angels-and-demons-of-nondeterminism/</guid>
            </item>
            <item>
                <title>Logical Duals in Software Engineering</title>
                <link>https://buttondown.com/hillelwayne/archive/logical-duals-in-software-engineering/</link>
                <description>&lt;p&gt;(&lt;a href="https://buttondown.com/hillelwayne/archive/sapir-whorf-does-not-apply-to-programming/" target="_blank"&gt;Last week's newsletter&lt;/a&gt; took too long and I'm way behind on &lt;a href="https://leanpub.com/logic/" target="_blank"&gt;Logic for Programmers&lt;/a&gt; revisions so short one this time.&lt;sup id="fnref:retread"&gt;&lt;a class="footnote-ref" href="#fn:retread"&gt;1&lt;/a&gt;&lt;/sup&gt;)&lt;/p&gt;
    &lt;p&gt;In classical logic, two operators &lt;code&gt;F/G&lt;/code&gt; are &lt;strong&gt;duals&lt;/strong&gt; if &lt;code&gt;F(x) = !G(!x)&lt;/code&gt;. Three examples:&lt;/p&gt;
    &lt;ol&gt;
    &lt;li&gt;&lt;code&gt;x || y&lt;/code&gt; is the same as &lt;code&gt;!(!x &amp;amp;&amp;amp; !y)&lt;/code&gt;.&lt;/li&gt;
    &lt;li&gt;&lt;code&gt;&amp;lt;&amp;gt;P&lt;/code&gt; ("P is possibly true") is the same as &lt;code&gt;![]!P&lt;/code&gt; ("not P isn't definitely true").&lt;/li&gt;
    &lt;li&gt;&lt;code&gt;some x in set: P(x)&lt;/code&gt; is the same as &lt;code&gt;!(all x in set: !P(x))&lt;/code&gt;.&lt;/li&gt;
    &lt;/ol&gt;
    &lt;p&gt;(1) is just a version of De Morgan's Law, which we regularly use to simplify boolean expressions. (2) is important in modal logic but has niche applications in software engineering, mostly in how it powers various formal methods.&lt;sup id="fnref:fm"&gt;&lt;a class="footnote-ref" href="#fn:fm"&gt;2&lt;/a&gt;&lt;/sup&gt; The real interesting one is (3), the "quantifier duals". We use lots of software tools to either &lt;em&gt;find&lt;/em&gt; a value satisfying &lt;code&gt;P&lt;/code&gt; or &lt;em&gt;check&lt;/em&gt; that all values satisfy &lt;code&gt;P&lt;/code&gt;. And by duality, any tool that does one can do the other, by seeing if it &lt;em&gt;fails&lt;/em&gt; to find/check &lt;code&gt;!P&lt;/code&gt;. Some examples in the wild:&lt;/p&gt;
    &lt;ul&gt;
    &lt;li&gt;Z3 is used to solve mathematical constraints, like "find x, where &lt;code&gt;f(x) &amp;gt;= 0&lt;/code&gt;. If I want to prove a property like "f is always positive", I ask z3 to solve "find x, where &lt;code&gt;!(f(x) &amp;gt;= 0)&lt;/code&gt;, and see if that is unsatisfiable. This use case powers a LOT of theorem provers and formal verification tooling.&lt;/li&gt;
    &lt;li&gt;Property testing checks that all inputs to a code block satisfy a property. I've used it to generate complex inputs with certain properties by checking that all inputs &lt;em&gt;don't&lt;/em&gt; satisfy the property and reading out the test failure.&lt;/li&gt;
    &lt;li&gt;Model checkers check that all behaviors of a specification satisfy a property, so we can find a behavior that reaches a goal state G by checking that all states are &lt;code&gt;!G&lt;/code&gt;. &lt;a href="https://github.com/tlaplus/Examples/blob/master/specifications/DieHard/DieHard.tla" target="_blank"&gt;Here's TLA+ solving a puzzle this way&lt;/a&gt;.&lt;sup id="fnref:antithesis"&gt;&lt;a class="footnote-ref" href="#fn:antithesis"&gt;3&lt;/a&gt;&lt;/sup&gt;&lt;/li&gt;
    &lt;li&gt;Planners find behaviors that reach a goal state, so we can check if all behaviors satisfy a property P by asking it to reach goal state &lt;code&gt;!P&lt;/code&gt;.&lt;/li&gt;
    &lt;li&gt;The problem "find the shortest &lt;a href="https://en.wikipedia.org/wiki/Travelling_salesman_problem" target="_blank"&gt;traveling salesman route&lt;/a&gt;" can be broken into &lt;code&gt;some route: distance(route) = n&lt;/code&gt; and &lt;code&gt;all route: !(distance(route) &amp;lt; n)&lt;/code&gt;. Then a route finder can find the first, and then convert the second into a &lt;code&gt;some&lt;/code&gt; and &lt;em&gt;fail&lt;/em&gt; to find it, proving &lt;code&gt;n&lt;/code&gt; is optimal.&lt;/li&gt;
    &lt;/ul&gt;
    &lt;p&gt;Even cooler to me is when a tool does &lt;em&gt;both&lt;/em&gt; finding and checking, but gives them different "meanings". In SQL, &lt;code&gt;some x: P(x)&lt;/code&gt; is true if we can &lt;em&gt;query&lt;/em&gt; for &lt;code&gt;P(x)&lt;/code&gt; and get a nonempty response, while &lt;code&gt;all x: P(x)&lt;/code&gt; is true if all records satisfy the &lt;code&gt;P(x)&lt;/code&gt; &lt;em&gt;constraint&lt;/em&gt;. Most SQL databases allow for complex queries but not complex constraints! You got &lt;code&gt;UNIQUE&lt;/code&gt;, &lt;code&gt;NOT NULL&lt;/code&gt;, &lt;code&gt;REFERENCES&lt;/code&gt;, which are fixed predicates, and &lt;code&gt;CHECK&lt;/code&gt;, which is one-record only.&lt;sup id="fnref:check"&gt;&lt;a class="footnote-ref" href="#fn:check"&gt;4&lt;/a&gt;&lt;/sup&gt;&lt;/p&gt;
    &lt;p&gt;Oh, and you got database triggers, which can run arbitrary queries and throw exceptions. So if you really need to enforce a complex constraint &lt;code&gt;P(x, y, z)&lt;/code&gt;, you put in a database trigger that queries &lt;code&gt;some x, y, z: !P(x, y, z)&lt;/code&gt; and throws an exception if it finds any results. That all works because of quantifier duality! See &lt;a href="https://eddmann.com/posts/maintaining-invariant-constraints-in-postgresql-using-trigger-functions/" target="_blank"&gt;here&lt;/a&gt; for an example of this in practice.&lt;/p&gt;
    &lt;h3&gt;Duals more broadly&lt;/h3&gt;
    &lt;p&gt;"Dual" doesn't have a strict meaning in math, it's more of a vibe thing where all of the "duals" are kinda similar in meaning but don't strictly follow all of the same rules. &lt;em&gt;Usually&lt;/em&gt; things X and Y are duals if there is some transform &lt;code&gt;F&lt;/code&gt; where &lt;code&gt;X = F(Y)&lt;/code&gt; and &lt;code&gt;Y = F(X)&lt;/code&gt;, but not always. Maybe the category theorists have a formal definition that covers all of the different uses. Usually duals switch properties of things, too: an example showing &lt;code&gt;some x: P(x)&lt;/code&gt; becomes a &lt;em&gt;counterexample&lt;/em&gt; of &lt;code&gt;all x: !P(x)&lt;/code&gt;.&lt;/p&gt;
    &lt;p&gt;Under this definition, I think the dual of a list &lt;code&gt;l&lt;/code&gt; could be &lt;code&gt;reverse(l)&lt;/code&gt;. The first element of &lt;code&gt;l&lt;/code&gt; becomes the last element of &lt;code&gt;reverse(l)&lt;/code&gt;, the last becomes the first, etc. A more interesting case is the dual of a &lt;code&gt;K -&amp;gt; set(V)&lt;/code&gt; map is the &lt;code&gt;V -&amp;gt; set(K)&lt;/code&gt; map. IE the dual of &lt;code&gt;lived_in_city = {alice: {paris}, bob: {detroit}, charlie: {detroit, paris}}&lt;/code&gt; is &lt;code&gt;city_lived_in_by = {paris: {alice, charlie}, detroit: {bob, charlie}}&lt;/code&gt;. This preserves the property that &lt;code&gt;x in map[y] &amp;lt;=&amp;gt; y in dual[x]&lt;/code&gt;.&lt;/p&gt;
    &lt;p class="empty-line" style="height:16px; margin:0px !important;"&gt;&lt;/p&gt;
    &lt;div class="footnote"&gt;
    &lt;hr/&gt;
    &lt;ol&gt;
    &lt;li id="fn:retread"&gt;
    &lt;p&gt;And after writing this I just realized this is partial retread of a newsletter I wrote &lt;a href="https://buttondown.com/hillelwayne/archive/logical-quantifiers-in-software/" target="_blank"&gt;a couple months ago&lt;/a&gt;. But only a &lt;em&gt;partial&lt;/em&gt; retread! &lt;a class="footnote-backref" href="#fnref:retread" title="Jump back to footnote 1 in the text"&gt;↩&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id="fn:fm"&gt;
    &lt;p&gt;Specifically "linear temporal logics" are modal logics, so "&lt;code&gt;eventually P&lt;/code&gt; ("P is true in at least one state of each behavior") is the same as saying &lt;code&gt;!always !P&lt;/code&gt; ("not P isn't true in all states of all behaviors"). This is the basis of &lt;a href="https://www.hillelwayne.com/post/safety-and-liveness/" target="_blank"&gt;liveness checking&lt;/a&gt;. &lt;a class="footnote-backref" href="#fnref:fm" title="Jump back to footnote 2 in the text"&gt;↩&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id="fn:antithesis"&gt;
    &lt;p&gt;I don't know for sure, but my best guess is that Antithesis does something similar &lt;a href="https://antithesis.com/blog/tag/games/" target="_blank"&gt;when their fuzzer beats videogames&lt;/a&gt;. They're doing fuzzing, not model checking, but they have the same purpose check that complex state spaces don't have bugs. Making the bug "we can't reach the end screen" can make a fuzzer output a complete end-to-end run of the game. Obvs a lot more complicated than that but that's the general idea at least. &lt;a class="footnote-backref" href="#fnref:antithesis" title="Jump back to footnote 3 in the text"&gt;↩&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id="fn:check"&gt;
    &lt;p&gt;For &lt;code&gt;CHECK&lt;/code&gt; to constraint multiple records you would need to use a subquery. Core SQL does not support subqueries in check. It is an optional database "feature outside of core SQL" (F671), which &lt;a href="https://www.postgresql.org/docs/current/unsupported-features-sql-standard.html" target="_blank"&gt;Postgres does not support&lt;/a&gt;. &lt;a class="footnote-backref" href="#fnref:check" title="Jump back to footnote 4 in the text"&gt;↩&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;/ol&gt;
    &lt;/div&gt;</description>
                <pubDate>Wed, 27 Aug 2025 19:25:32 +0000</pubDate>
                <guid>https://buttondown.com/hillelwayne/archive/logical-duals-in-software-engineering/</guid>
            </item>
            <item>
                <title>Sapir-Whorf does not apply to Programming Languages</title>
                <link>https://buttondown.com/hillelwayne/archive/sapir-whorf-does-not-apply-to-programming/</link>
                <description>&lt;p&gt;&lt;em&gt;This one is a hot mess but it's too late in the week to start over. Oh well!&lt;/em&gt;&lt;/p&gt;
    &lt;p&gt;Someone recognized me at last week's &lt;a href="https://www.chipy.org/" target="_blank"&gt;Chipy&lt;/a&gt; and asked for my opinion on Sapir-Whorf hypothesis in programming languages. I thought this was interesting enough to make a newsletter. First what it is, then why it &lt;em&gt;looks&lt;/em&gt; like it applies, and then why it doesn't apply after all.&lt;/p&gt;
    &lt;h3&gt;The Sapir-Whorf Hypothesis&lt;/h3&gt;
    &lt;blockquote&gt;
    &lt;p&gt;We dissect nature along lines laid down by our native language. — &lt;a href="https://web.mit.edu/allanmc/www/whorf.scienceandlinguistics.pdf" target="_blank"&gt;Whorf&lt;/a&gt;&lt;/p&gt;
    &lt;/blockquote&gt;
    &lt;p&gt;To quote from a &lt;a href="https://www.amazon.com/Linguistics-Complete-Introduction-Teach-Yourself/dp/1444180320" target="_blank"&gt;Linguistics book I've read&lt;/a&gt;, the hypothesis is that "an individual's fundamental perception of reality is moulded by the language they speak." As a massive oversimplification, if English did not have a word for "rebellion", we would not be able to conceive of rebellion. This view, now called &lt;a href="https://en.wikipedia.org/wiki/Linguistic_determinism" target="_blank"&gt;Linguistic Determinism&lt;/a&gt;, is mostly rejected by modern linguists.&lt;/p&gt;
    &lt;p&gt;The "weak" form of SWH is that the language we speak influences, but does not &lt;em&gt;decide&lt;/em&gt; our cognition. &lt;a href="https://langcog.stanford.edu/papers/winawer2007.pdf" target="_blank"&gt;For example&lt;/a&gt;, Russian has distinct words for "light blue" and "dark blue", so can discriminate between "light blue" and "dark blue" shades faster than they can discriminate two "light blue" shades. English does not have distinct words, so we discriminate those at the same speed. This &lt;strong&gt;linguistic relativism&lt;/strong&gt; seems to have lots of empirical support in studies, but mostly with "small indicators". I don't think there's anything that convincingly shows linguistic relativism having effects on a societal level.&lt;sup id="fnref:economic-behavior"&gt;&lt;a class="footnote-ref" href="#fn:economic-behavior"&gt;1&lt;/a&gt;&lt;/sup&gt;&lt;/p&gt;
    &lt;p&gt;The weak form of SWH for software would then be the "the programming languages you know affects how you think about programs."&lt;/p&gt;
    &lt;h3&gt;SWH in software&lt;/h3&gt;
    &lt;p&gt;This seems like a natural fit, as different paradigms solve problems in different ways. Consider the &lt;a href="https://hadid.dev/posts/living-coding/" target="_blank"&gt;hardest interview question ever&lt;/a&gt;, "given a list of integers, sum the even numbers". Here it is in four paradigms:&lt;/p&gt;
    &lt;ul&gt;
    &lt;li&gt;Procedural: &lt;code&gt;total = 0; foreach x in list {if IsEven(x) total += x}&lt;/code&gt;. You iterate over data with an algorithm.&lt;/li&gt;
    &lt;li&gt;Functional: &lt;code&gt;reduce(+, filter(IsEven, list), 0)&lt;/code&gt;. You apply transformations to data to get a result.&lt;/li&gt;
    &lt;li&gt;Array: &lt;code&gt;+ fold L * iseven L&lt;/code&gt;.&lt;sup id="fnref:J"&gt;&lt;a class="footnote-ref" href="#fn:J"&gt;2&lt;/a&gt;&lt;/sup&gt; In English: replace every element in L with 0 if odd and 1 if even, multiple the new array elementwise against &lt;code&gt;L&lt;/code&gt;, and then sum the resulting array. It's like functional except everything is in terms of whole-array transformations.&lt;/li&gt;
    &lt;li&gt;Logical: Somethingish like &lt;code&gt;sumeven(0, []). sumeven(X, [Y|L]) :- iseven(Y) -&amp;gt; sumeven(Z, L), X is Y + Z ; sumeven(X, L)&lt;/code&gt;. You write a set of equations that express what it means for X to &lt;em&gt;be&lt;/em&gt; the sum of events of L.&lt;/li&gt;
    &lt;/ul&gt;
    &lt;p&gt;There's some similarities between how these paradigms approach the problem, but each is also unique, too. It's plausible that where a procedural programmer "sees" a for loop, a functional programmer "sees" a map and an array programmer "sees" a singular operator.&lt;/p&gt;
    &lt;p&gt;I also have a personal experience with how a language changed the way I think. I use &lt;a href="https://learntla.com/" target="_blank"&gt;TLA+&lt;/a&gt; to detect concurrency bugs in software designs. After doing this for several years, I've gotten much better at intuitively seeing race conditions in things even &lt;em&gt;without&lt;/em&gt; writing a TLA+ spec. It's even leaked out into my day-to-day life. I see concurrency bugs everywhere. Phone tag is a race condition.&lt;/p&gt;
    &lt;p&gt;But I still don't think SWH is the right mental model to use, for one big reason: language is &lt;em&gt;special&lt;/em&gt;. We think in language, we dream in language, there are huge parts of our brain dedicated to processing language. &lt;a href="https://web.eecs.umich.edu/~weimerw/p/weimer-icse2017-preprint.pdf" target="_blank"&gt;We don't use those parts of our brain to read code&lt;/a&gt;. &lt;/p&gt;
    &lt;p&gt;SWH is so intriguing because it seems so unnatural, that the way we express thoughts changes the way we &lt;em&gt;think&lt;/em&gt; thoughts. That I would be a different person if I was bilingual in Spanish, not because the life experiences it would open up but because &lt;a href="https://en.wikipedia.org/wiki/Grammatical_gender" target="_blank"&gt;grammatical gender&lt;/a&gt; would change my brain.&lt;/p&gt;
    &lt;p&gt;Compared to that, the idea that programming languages affect our brain is more natural and has a simpler explanation:&lt;/p&gt;
    &lt;p&gt;It's the goddamned &lt;a href="https://en.wikipedia.org/wiki/Tetris_effect" target="_blank"&gt;Tetris Effect&lt;/a&gt;.&lt;/p&gt;
    &lt;h3&gt;The Goddamned Tetris Effect&lt;/h3&gt;
    &lt;div class="subscribe-form"&gt;&lt;/div&gt;
    &lt;blockquote&gt;
    &lt;p&gt;The Tetris effect occurs when someone dedicates vast amounts of time, effort and concentration on an activity which thereby alters their thoughts, dreams, and other experiences not directly linked to said activity. — Wikipedia&lt;/p&gt;
    &lt;/blockquote&gt;
    &lt;p&gt;Every skill does this. I'm a juggler, so every item I can see right now has a tiny metadata field of "how would this tumble if I threw it up". I teach professionally, so I'm always noticing good teaching examples everywhere. I spent years writing specs in TLA+ and watching the model checker throw concurrency errors in my face, so now race conditions have visceral presence. Every skill does this. &lt;/p&gt;
    &lt;p&gt;And to really develop a skill, you gotta practice. This is where I think programming paradigms do something especially interesting that make them feel more like Sapir-Whorfy than, like, juggling. Some languages mix lots of different paradigms, like Javascript or Rust. Others like Haskell really focus on &lt;em&gt;excluding&lt;/em&gt; paradigms. If something is easy for you in procedural and hard in FP, in JS you could just lean on the procedural bits. In Haskell, &lt;em&gt;too bad&lt;/em&gt;, you're learning how to do it the functional way.&lt;sup id="fnref:escape-hatch"&gt;&lt;a class="footnote-ref" href="#fn:escape-hatch"&gt;3&lt;/a&gt;&lt;/sup&gt;&lt;/p&gt;
    &lt;p&gt;And that forces you to practice, which makes you see functional patterns everywhere. Tetris effect!&lt;/p&gt;
    &lt;p&gt;Anyway this may all seem like quibbling— why does it matter whether we call it "Tetris effect" or "Sapir-Whorf", if our brains is get rewired either way? For me, personally, it's because SWH sounds really special and &lt;em&gt;unique&lt;/em&gt;, while Tetris effect sounds mundane and commonplace. Which it &lt;em&gt;is&lt;/em&gt;. But also because TE suggests it's not just programming languages that affect how we think about software, it's &lt;em&gt;everything&lt;/em&gt;. Spending lots of time debugging, profiling, writing exploits, whatever will change what you notice, what you think a program "is". And that's a way useful idea that shouldn't be restricted to just PLs.&lt;/p&gt;
    &lt;p&gt;(Then again, the Tetris Effect might also be a bad analogy to what's going on here, because I think part of it is that it wears off after a while. Maybe it's just "building a mental model is good".)&lt;/p&gt;
    &lt;h3&gt;I just realized all of this might have missed the point&lt;/h3&gt;
    &lt;p&gt;Wait are people actually using SWH to mean the &lt;em&gt;weak form&lt;/em&gt; or the &lt;em&gt;strong&lt;/em&gt; form? Like that if a language doesn't make something possible, its users can't conceive of it being possible. I've been arguing against the weaker form in software but I think I've seen strong form often too. Dammit.&lt;/p&gt;
    &lt;p&gt;Well, it's already Thursday and far too late to rewrite the whole newsletter, so I'll just outline the problem with the strong form: we describe the capabilities of our programming languages &lt;em&gt;with human language&lt;/em&gt;. In college I wrote a lot of crappy physics lab C++ and one of my projects was filled with comments like "man I hate copying this triply-nested loop in 10 places with one-line changes, I wish I could put it in one function and just take the changing line as a parameter". Even if I hadn't &lt;em&gt;encountered&lt;/em&gt; higher-order functions, I was still perfectly capable of expressing the idea. So if the strong SWH isn't true for human language, it's not true for programming languages either.&lt;/p&gt;
    &lt;p class="empty-line" style="height:16px; margin:0px !important;"&gt;&lt;/p&gt;
    &lt;hr/&gt;
    &lt;h1&gt;Systems Distributed talk now up!&lt;/h1&gt;
    &lt;p&gt;&lt;a href="https://www.youtube.com/watch?v=d9cM8f_qSLQ" target="_blank"&gt;Link here&lt;/a&gt;! Original abstract:&lt;/p&gt;
    &lt;blockquote&gt;
    &lt;p&gt;Building correct distributed systems takes thinking outside the box, and the fastest way to do that is to think inside a different box. One different box is "formal methods", the discipline of mathematically verifying software and systems. Formal methods encourages unusual perspectives on systems, models that are also broadly useful to all software developers. In this talk we will learn two of the most important FM perspectives: the abstract specifications behind software systems, and the property they are and aren't supposed to have.&lt;/p&gt;
    &lt;/blockquote&gt;
    &lt;p&gt;The talk ended up evolving away from that abstract but I like how it turned out!&lt;/p&gt;
    &lt;div class="footnote"&gt;
    &lt;hr/&gt;
    &lt;ol&gt;
    &lt;li id="fn:economic-behavior"&gt;
    &lt;p&gt;There is &lt;a href="https://www.anderson.ucla.edu/faculty/keith.chen/papers/LanguageWorkingPaper.pdf" target="_blank"&gt;one paper&lt;/a&gt; arguing that people who speak a language that doesn't have a "future tense" are more likely to save and eat healthy, but it is... &lt;a href="https://www.reddit.com/r/linguistics/comments/rcne7m/comment/hnz2705/" target="_blank"&gt;extremely questionable&lt;/a&gt;. &lt;a class="footnote-backref" href="#fnref:economic-behavior" title="Jump back to footnote 1 in the text"&gt;↩&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id="fn:J"&gt;
    &lt;p&gt;The original J is &lt;code&gt;+/ (* (0 =  2&amp;amp;|))&lt;/code&gt;. Obligatory &lt;a href="https://www.jsoftware.com/papers/tot.htm" target="_blank"&gt;Notation as a Tool of Thought&lt;/a&gt; reference &lt;a class="footnote-backref" href="#fnref:J" title="Jump back to footnote 2 in the text"&gt;↩&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id="fn:escape-hatch"&gt;
    &lt;p&gt;Though if it's &lt;em&gt;too&lt;/em&gt; hard for you, that's why languages have &lt;a href="https://buttondown.com/hillelwayne/archive/programming-language-escape-hatches/" target="_blank"&gt;escape hatches&lt;/a&gt; &lt;a class="footnote-backref" href="#fnref:escape-hatch" title="Jump back to footnote 3 in the text"&gt;↩&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;/ol&gt;
    &lt;/div&gt;</description>
                <pubDate>Thu, 21 Aug 2025 13:00:00 +0000</pubDate>
                <guid>https://buttondown.com/hillelwayne/archive/sapir-whorf-does-not-apply-to-programming/</guid>
            </item>
            <item>
                <title>Software books I wish I could read</title>
                <link>https://buttondown.com/hillelwayne/archive/software-books-i-wish-i-could-read/</link>
                <description>&lt;h3&gt;New Logic for Programmers Release!&lt;/h3&gt;
    &lt;p&gt;&lt;a href="https://leanpub.com/logic/" target="_blank"&gt;v0.11 is now available&lt;/a&gt;! This is over 20%  longer than v0.10, with a new chapter on code proofs, three chapter overhauls, and more! &lt;a href="https://github.com/logicforprogrammers/book-assets/blob/master/CHANGELOG.md" target="_blank"&gt;Full release notes here&lt;/a&gt;.&lt;/p&gt;
    &lt;p&gt;&lt;img alt="Cover of the boooooook" class="newsletter-image" src="https://assets.buttondown.email/images/92b4a35d-2bdd-416a-92c7-15ff42b49d8d.jpg?w=960&amp;amp;fit=max"/&gt;&lt;/p&gt;
    &lt;h1&gt;Software books I wish I could read&lt;/h1&gt;
    &lt;p&gt;I'm writing &lt;em&gt;Logic for Programmers&lt;/em&gt; because it's a book I wanted to have ten years ago. I had to learn everything in it the hard way, which is why I'm ensuring that everybody else can learn it the easy way.&lt;/p&gt;
    &lt;p&gt;Books occupy a sort of weird niche in software. We're great at sharing information via blogs and git repos and entire websites. These have many benefits over books: they're free, they're easily accessible, they can be updated quickly, they can even be interactive. But no blog post has influenced me as profoundly as &lt;a href="https://buttondown.com/hillelwayne/archive/why-you-should-read-data-and-reality/" target="_blank"&gt;Data and Reality&lt;/a&gt; or &lt;a href="https://www.oreilly.com/library/view/making-software/9780596808310/" target="_blank"&gt;Making Software&lt;/a&gt;. There is no blog or talk about debugging as good as the 
    &lt;a href="https://debuggingrules.com/" target="_blank"&gt;Debugging&lt;/a&gt; book.&lt;/p&gt;
    &lt;p&gt;It might not be anything deeper than "people spend more time per word on writing books than blog posts". I dunno.&lt;/p&gt;
    &lt;p&gt;So here are some other books I wish I could read. I don't &lt;em&gt;think&lt;/em&gt; any of them exist yet but it's a big world out there. Also while they're probably best as books, a website or a series of blog posts would be ok too.&lt;/p&gt;
    &lt;h4&gt;Everything about Configurations&lt;/h4&gt;
    &lt;p&gt;The whole topic of how we configure software, whether by CLI flags, environmental vars, or JSON/YAML/XML/Dhall files. What causes the &lt;a href="https://mikehadlow.blogspot.com/2012/05/configuration-complexity-clock.html" target="_blank"&gt;configuration complexity clock&lt;/a&gt;? How do we distinguish between basic, advanced, and developer-only configuration options? When should we disallow configuration? How do we test all possible configurations for correctness? Why do so many widespread outages trace back to misconfiguration, and how do we prevent them? &lt;/p&gt;
    &lt;p&gt;I also want the same for plugin systems. Manifests, permissions, common APIs and architectures, etc. Configuration management is more universal, though, since everybody either uses software with configuration or has made software with configuration.&lt;/p&gt;
    &lt;h4&gt;The Big Book of Complicated Data Schemas&lt;/h4&gt;
    &lt;p&gt;I guess this would kind of be like &lt;a href="https://schema.org/docs/full.html" target="_blank"&gt;Schema.org&lt;/a&gt;, except with a lot more on the "why" and not the what. Why is important for the &lt;a href="https://schema.org/Volcano" target="_blank"&gt;Volcano model&lt;/a&gt; to have a "smokingAllowed" field?&lt;sup id="fnref:volcano"&gt;&lt;a class="footnote-ref" href="#fn:volcano"&gt;1&lt;/a&gt;&lt;/sup&gt;&lt;/p&gt;
    &lt;p&gt;I'd see this less as "here's your guide to putting Volcanos in your database" and more "here's recurring motifs in modeling interesting domains", to help a person see sources of complexity in their &lt;em&gt;own&lt;/em&gt; domain. Does something crop up if the references can form a cycle? If a relationship needs to be strictly temporary, or a reference can change type? Bonus: path dependence in data models, where an additional requirement leads to a vastly different ideal data model that a company couldn't do because they made the old model.&lt;/p&gt;
    &lt;p&gt;(This has got to exist, right? Business modeling is a big enough domain that this must exist. Maybe &lt;a href="https://essenceofsoftware.com/" target="_blank"&gt;The Essence of Software&lt;/a&gt; touches on this? Man I feel bad I haven't read that yet.)&lt;/p&gt;
    &lt;h4&gt;Computer Science for Software Engineers&lt;/h4&gt;
    &lt;p&gt;Yes, I checked, this book does not exist (though maybe &lt;a href="https://www.amazon.com/A-Programmers-Guide-to-Computer-Science-2-book-series/dp/B08433QR53" target="_blank"&gt;this&lt;/a&gt; is the same thing). I don't have any formal software education; everything I know was either self-taught or learned on the job. But it's way easier to learn software engineering that way than computer science. And I bet there's a lot of other engineers in the same boat. &lt;/p&gt;
    &lt;p&gt;This book wouldn't have to be comprehensive or instructive: just enough about each topic to understand why it's an area of study and appreciate how research in it eventually finds its way into practice. &lt;/p&gt;
    &lt;h4&gt;MISU Patterns&lt;/h4&gt;
    &lt;p&gt;MISU, or "Make Illegal States Unrepresentable", is the idea of designing system invariants in the structure of your data. For example, if a &lt;code&gt;Contact&lt;/code&gt; needs at least one of &lt;code&gt;email&lt;/code&gt; or &lt;code&gt;phone&lt;/code&gt; to be non-null, make it a sum type over &lt;code&gt;EmailContact, PhoneContact, EmailPhoneContact&lt;/code&gt; (from &lt;a href="https://fsharpforfunandprofit.com/posts/designing-with-types-making-illegal-states-unrepresentable/" target="_blank"&gt;this post&lt;/a&gt;). MISU is great.&lt;/p&gt;
    &lt;p&gt;Most MISU in the wild look very different than that, though, because the concept of MISU is so broad there's lots of different ways to achieve it. And that means there are "patterns": smart constructors, product types, properly using sets, &lt;a href="https://lexi-lambda.github.io/blog/2020/11/01/names-are-not-type-safety/" target="_blank"&gt;newtypes to some degree&lt;/a&gt;, etc. Some of them are specific to typed FP, while others can be used in even untyped languages. Someone oughta make a pattern book.&lt;/p&gt;
    &lt;p&gt;My one request would be to not give them cutesy names. Do something like the &lt;a href="https://ia600301.us.archive.org/18/items/Thompson2016MotifIndex/Thompson_2016_Motif-Index.pdf" target="_blank"&gt;Aarne–Thompson–Uther Index&lt;/a&gt;, where items are given names like "Recognition by manner of throwing cakes of different weights into faces of old uncles". Names can come later.&lt;/p&gt;
    &lt;h4&gt;The Tools of '25&lt;/h4&gt;
    &lt;p&gt;Not something I'd read, but something to recommend to junior engineers. Starting out it's easy to think the only bit that matters is the language or framework and not realize the enormous amount of surrounding tooling you'll have to learn. This book would cover the basics of tools that &lt;em&gt;enough&lt;/em&gt; developers will probably use at some point: git, VSCode, &lt;em&gt;very&lt;/em&gt; basic Unix and bash, curl. Maybe the general concepts of tools that appear in every ecosystem, like package managers, build tools, task runners. That might be easier if we specialize this to one particular domain, like webdev or data science.&lt;/p&gt;
    &lt;p&gt;Ideally the book would only have to be updated every five years or so. No LLM stuff because I don't expect the tooling will be stable through 2026, to say nothing of 2030.&lt;/p&gt;
    &lt;h4&gt;A History of Obsolete Optimizations&lt;/h4&gt;
    &lt;p&gt;Probably better as a really long blog series. Each chapter would be broken up into two parts:&lt;/p&gt;
    &lt;ol&gt;
    &lt;li&gt;A deep dive into a brilliant, elegant, insightful historical optimization designed to work within the constraints of that era's computing technology&lt;/li&gt;
    &lt;li&gt;What we started doing instead, once we had more compute/network/storage available.&lt;/li&gt;
    &lt;/ol&gt;
    &lt;p&gt;c.f. &lt;a href="https://prog21.dadgum.com/29.html" target="_blank"&gt;A Spellchecker Used to Be a Major Feat of Software Engineering&lt;/a&gt;. Bonus topics would be brilliance obsoleted by standardization (like what people did before git and json were universal), optimizations we do today that may not stand the test of time, and optimizations from the past that &lt;em&gt;did&lt;/em&gt;.&lt;/p&gt;
    &lt;h4&gt;Sphinx Internals&lt;/h4&gt;
    &lt;p&gt;&lt;em&gt;I need this&lt;/em&gt;. I've spent so much goddamn time digging around in Sphinx and docutils source code I'm gonna throw up.&lt;/p&gt;
    &lt;hr/&gt;
    &lt;h3&gt;Systems Distributed Talk Today!&lt;/h3&gt;
    &lt;p&gt;Online premier's at noon central / 5 PM UTC, &lt;a href="https://www.youtube.com/watch?v=d9cM8f_qSLQ" target="_blank"&gt;here&lt;/a&gt;! I'll be hanging out to answer questions and be awkward. You ever watch a recording of your own talk? It's real uncomfortable!&lt;/p&gt;
    &lt;div class="footnote"&gt;
    &lt;hr/&gt;
    &lt;ol&gt;
    &lt;li id="fn:volcano"&gt;
    &lt;p&gt;In &lt;em&gt;this&lt;/em&gt; case because it's a field on one of &lt;code&gt;Volcano&lt;/code&gt;'s supertypes. I guess schemas gotta follow LSP too &lt;a class="footnote-backref" href="#fnref:volcano" title="Jump back to footnote 1 in the text"&gt;↩&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;/ol&gt;
    &lt;/div&gt;</description>
                <pubDate>Wed, 06 Aug 2025 13:00:00 +0000</pubDate>
                <guid>https://buttondown.com/hillelwayne/archive/software-books-i-wish-i-could-read/</guid>
            </item>
            <item>
                <title>2000 words about arrays and tables</title>
                <link>https://buttondown.com/hillelwayne/archive/2000-words-about-arrays-and-tables/</link>
                <description>&lt;p&gt;I'm way too discombobulated from getting next month's release of &lt;a href="https://leanpub.com/logic/" target="_blank"&gt;Logic for Programmers&lt;/a&gt; ready, so I'm pulling a idea from the slush pile. Basically I wanted to come up with a mental model of arrays as a concept that explained APL-style multidimensional arrays and tables but also why there weren't multitables.&lt;/p&gt;
    &lt;p&gt;So, arrays. In all languages they are basically the same: they map a sequence of numbers (I'll use &lt;code&gt;1..N&lt;/code&gt;)&lt;sup id="fnref:1-indexing"&gt;&lt;a class="footnote-ref" href="#fn:1-indexing"&gt;1&lt;/a&gt;&lt;/sup&gt; to homogeneous values (values of a single type). This is in contrast to the other two foundational types, associative arrays (which map an arbitrary type to homogeneous values) and structs (which map a fixed set of keys to &lt;em&gt;heterogeneous&lt;/em&gt; values). Arrays appear in PLs earlier than the other two, possibly because they have the simplest implementation and the most obvious application to scientific computing. The OG FORTRAN had arrays. &lt;/p&gt;
    &lt;p&gt;I'm interested in two structural extensions to arrays. The first, found in languages like nushell and frameworks like Pandas, is the &lt;em&gt;table&lt;/em&gt;. Tables have string keys like a struct &lt;em&gt;and&lt;/em&gt; indexes like an array. Each row is a struct, so you can get "all values in this column" or "all values for this row". They're heavily used in databases and data science.&lt;/p&gt;
    &lt;p&gt;The other extension is the &lt;strong&gt;N-dimensional array&lt;/strong&gt;, mostly seen in APLs like Dyalog and J. Think of this like arrays-of-arrays(-of-arrays), except all arrays at the same depth have the same length. So &lt;code&gt;[[1,2,3],[4]]&lt;/code&gt; is not a 2D array, but &lt;code&gt;[[1,2,3],[4,5,6]]&lt;/code&gt; is. This means that N-arrays can be queried on any axis.&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;]&lt;/span&gt;&lt;span class="nv"&gt;x&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;=:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;i&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt;
    &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;
    &lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;5&lt;/span&gt;
    &lt;span class="mi"&gt;6&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;7&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;8&lt;/span&gt;
    &lt;span class="w"&gt;   &lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;{&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;x&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="c1"&gt;NB. first row&lt;/span&gt;
    &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;
    &lt;span class="w"&gt;   &lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;{"&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;x&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="c1"&gt;NB. first column&lt;/span&gt;
    &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;6&lt;/span&gt;
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;So, I've had some ideas on a conceptual model of arrays that explains all of these variations and possibly predicts new variations. I wrote up my notes and did the bare minimum of editing and polishing. Somehow it ended up being 2000 words.&lt;/p&gt;
    &lt;h3&gt;1-dimensional arrays&lt;/h3&gt;
    &lt;p&gt;A one-dimensional array is a function over &lt;code&gt;1..N&lt;/code&gt; for some N. &lt;/p&gt;
    &lt;p&gt;To be clear this is &lt;em&gt;math&lt;/em&gt; functions, not programming functions. Programming functions take values of a type and perform computations on them. Math functions take values of a fixed set and return values of another set. So the array &lt;code&gt;[a, b, c, d]&lt;/code&gt; can be represented by the function &lt;code&gt;(1 -&amp;gt; a ++ 2 -&amp;gt; b ++ 3 -&amp;gt; c ++ 4 -&amp;gt; d)&lt;/code&gt;. Let's write the set of all four element character arrays as &lt;code&gt;1..4 -&amp;gt; char&lt;/code&gt;. &lt;code&gt;1..4&lt;/code&gt; is the function's &lt;strong&gt;domain&lt;/strong&gt;.&lt;/p&gt;
    &lt;p&gt;The set of all character arrays is the empty array + the functions with domain &lt;code&gt;1..1&lt;/code&gt; + the functions with domain &lt;code&gt;1..2&lt;/code&gt; + ... Let's call this set &lt;code&gt;Array[Char]&lt;/code&gt;. Our compilers can enforce that a type belongs to &lt;code&gt;Array[Char]&lt;/code&gt;, but some operations care about the more specific type, like matrix multiplication. This is either checked with the runtime type or, in exotic enough languages, with static dependent types.&lt;/p&gt;
    &lt;p&gt;(This is actually how TLA+ does things: the basic collection types are functions and sets, and a function with domain 1..N is a sequence.)&lt;/p&gt;
    &lt;h3&gt;2-dimensional arrays&lt;/h3&gt;
    &lt;p&gt;Now take the 3x4 matrix&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="w"&gt;   &lt;/span&gt;&lt;span class="nv"&gt;i&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;4&lt;/span&gt;
    &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt;
    &lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="mi"&gt;6&lt;/span&gt;&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="mi"&gt;7&lt;/span&gt;
    &lt;span class="mi"&gt;8&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;9&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;11&lt;/span&gt;
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;There are two equally valid ways to represent the array function:&lt;/p&gt;
    &lt;ol&gt;
    &lt;li&gt;A function that takes a row and a column and returns the value at that index, so it would look like &lt;code&gt;f(r: 1..3, c: 1..4) -&amp;gt; Int&lt;/code&gt;.&lt;/li&gt;
    &lt;li&gt;A function that takes a row and returns that column as an array, aka another function: &lt;code&gt;f(r: 1..3) -&amp;gt; g(c: 1..4) -&amp;gt; Int&lt;/code&gt;.&lt;sup id="fnref:associative"&gt;&lt;a class="footnote-ref" href="#fn:associative"&gt;2&lt;/a&gt;&lt;/sup&gt;&lt;/li&gt;
    &lt;/ol&gt;
    &lt;p&gt;Man, (2) looks a lot like &lt;a href="https://en.wikipedia.org/wiki/Currying" target="_blank"&gt;currying&lt;/a&gt;! In Haskell, functions can only have one parameter. If you write &lt;code&gt;(+) 6 10&lt;/code&gt;, &lt;code&gt;(+) 6&lt;/code&gt; first returns a &lt;em&gt;new&lt;/em&gt; function &lt;code&gt;f y = y + 6&lt;/code&gt;, and then applies &lt;code&gt;f 10&lt;/code&gt; to get 16. So &lt;code&gt;(+)&lt;/code&gt; has the type signature &lt;code&gt;Int -&amp;gt; Int -&amp;gt; Int&lt;/code&gt;: it's a function that takes an &lt;code&gt;Int&lt;/code&gt; and returns a function of type &lt;code&gt;Int -&amp;gt; Int&lt;/code&gt;.&lt;sup id="fnref:typeclass"&gt;&lt;a class="footnote-ref" href="#fn:typeclass"&gt;3&lt;/a&gt;&lt;/sup&gt;&lt;/p&gt;
    &lt;p&gt;Similarly, our 2D array can be represented as an array function that returns array functions: it has type &lt;code&gt;1..3 -&amp;gt; 1..4 -&amp;gt; Int&lt;/code&gt;, meaning it takes a row index and returns &lt;code&gt;1..4 -&amp;gt; Int&lt;/code&gt;, aka a single array.&lt;/p&gt;
    &lt;p&gt;(This differs from conventional array-of-arrays because it forces all of the subarrays to have the same domain, aka the same length. If we wanted to permit ragged arrays, we would instead have the type &lt;code&gt;1..3 -&amp;gt; Array[Int]&lt;/code&gt;.)&lt;/p&gt;
    &lt;p&gt;Why is this useful? A couple of reasons. First of all, we can apply function transformations to arrays, like "&lt;a href="https://blog.zdsmith.com/series/combinatory-programming.html" target="_blank"&gt;combinators&lt;/a&gt;". For example, we can flip any function of type &lt;code&gt;a -&amp;gt; b -&amp;gt; c&lt;/code&gt; into a function of type &lt;code&gt;b -&amp;gt; a -&amp;gt; c&lt;/code&gt;. So given a function that takes rows and returns columns, we can produce one that takes columns and returns rows. That's just a matrix transposition! &lt;/p&gt;
    &lt;p&gt;Second, we can extend this to any number of dimensions: a three-dimensional array is one with type &lt;code&gt;1..M -&amp;gt; 1..N -&amp;gt; 1..O -&amp;gt; V&lt;/code&gt;. We can still use function transformations to rearrange the array along any ordering of axes.&lt;/p&gt;
    &lt;p&gt;Speaking of dimensions:&lt;/p&gt;
    &lt;h3&gt;What are dimensions, anyway&lt;/h3&gt;
    &lt;div class="subscribe-form"&gt;&lt;/div&gt;
    &lt;p&gt;Okay, so now imagine we have a &lt;code&gt;Row&lt;/code&gt; × &lt;code&gt;Col&lt;/code&gt; grid of pixels, where each pixel is a struct of type &lt;code&gt;Pixel(R: int, G: int, B: int)&lt;/code&gt;. So the array is&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;Row -&amp;gt; Col -&amp;gt; Pixel
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;But we can also represent the &lt;em&gt;Pixel struct&lt;/em&gt; with a function: &lt;code&gt;Pixel(R: 0, G: 0, B: 255)&lt;/code&gt; is the function where &lt;code&gt;f(R) = 0&lt;/code&gt;, &lt;code&gt;f(G) = 0&lt;/code&gt;, &lt;code&gt;f(B) = 255&lt;/code&gt;, making it a function of type &lt;code&gt;{R, G, B} -&amp;gt; Int&lt;/code&gt;. So the array is actually the function&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;Row -&amp;gt; Col -&amp;gt; {R, G, B} -&amp;gt; Int
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;And then we can rearrange the parameters of the function like this:&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;{R, G, B} -&amp;gt; Row -&amp;gt; Col -&amp;gt; Int
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;Even though the set &lt;code&gt;{R, G, B}&lt;/code&gt; is not of form 1..N, this clearly has a real meaning: &lt;code&gt;f[R]&lt;/code&gt; is the function mapping each coordinate to that coordinate's red value. What about &lt;code&gt;Row -&amp;gt; {R, G, B} -&amp;gt; Col -&amp;gt; Int&lt;/code&gt;?  That's for each row, the 3 × Col array mapping each color to that row's intensities.&lt;/p&gt;
    &lt;p&gt;Really &lt;em&gt;any finite set&lt;/em&gt; can be a "dimension". Recording the monitor over a span of time? &lt;code&gt;Frame -&amp;gt; Row -&amp;gt; Col -&amp;gt; Color -&amp;gt; Int&lt;/code&gt;. Recording a bunch of computers over some time? &lt;code&gt;Computer -&amp;gt; Frame -&amp;gt; Row …&lt;/code&gt;.&lt;/p&gt;
    &lt;p&gt;This is pretty common in constraint satisfaction! Like if you're conference trying to assign talks to talk slots, your array might be type &lt;code&gt;(Day, Time, Room) -&amp;gt; Talk&lt;/code&gt;, where Day/Time/Room are enumerations.&lt;/p&gt;
    &lt;p&gt;An implementation constraint is that most programming languages &lt;em&gt;only&lt;/em&gt; allow integer indexes, so we have to replace Rooms and Colors with numerical enumerations over the set. As long as the set is finite, this is always possible, and for struct-functions, we can always choose the indexing on the lexicographic ordering of the keys. But we lose type safety.&lt;/p&gt;
    &lt;h3&gt;Why tables are different&lt;/h3&gt;
    &lt;p&gt;One more example: &lt;code&gt;Day -&amp;gt; Hour -&amp;gt; Airport(name: str, flights: int, revenue: USD)&lt;/code&gt;. Can we turn the struct into a dimension like before? &lt;/p&gt;
    &lt;p&gt;In this case, no. We were able to make &lt;code&gt;Color&lt;/code&gt; an axis because we could turn &lt;code&gt;Pixel&lt;/code&gt; into a &lt;code&gt;Color -&amp;gt; Int&lt;/code&gt; function, and we could only do that because all of the fields of the struct had the same type. This time, the fields are &lt;em&gt;different&lt;/em&gt; types. So we can't convert &lt;code&gt;{name, flights, revenue}&lt;/code&gt; into an axis. &lt;sup id="fnref:name-dimension"&gt;&lt;a class="footnote-ref" href="#fn:name-dimension"&gt;4&lt;/a&gt;&lt;/sup&gt; One thing we can do is convert it to three &lt;em&gt;separate&lt;/em&gt; functions:&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;airport: Day -&amp;gt; Hour -&amp;gt; Str
    flights: Day -&amp;gt; Hour -&amp;gt; Int
    revenue: Day -&amp;gt; Hour -&amp;gt; USD
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;But we want to keep all of the data in one place. That's where &lt;strong&gt;tables&lt;/strong&gt; come in: an array-of-structs is isomorphic to a struct-of-arrays:&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;AirportColumns(
        airport: Day -&amp;gt; Hour -&amp;gt; Str,
        flights: Day -&amp;gt; Hour -&amp;gt; Int,
        revenue: Day -&amp;gt; Hour -&amp;gt; USD,
    )
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;The table is a sort of &lt;em&gt;both&lt;/em&gt; representations simultaneously. If this was a pandas dataframe, &lt;code&gt;df["airport"]&lt;/code&gt; would get the airport column, while &lt;code&gt;df.loc[day1]&lt;/code&gt; would get the first day's data. I don't think many table implementations support more than one axis dimension but there's no reason they &lt;em&gt;couldn't&lt;/em&gt;. &lt;/p&gt;
    &lt;p&gt;These are also possible transforms:&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;Hour -&amp;gt; NamesAreHard(
        airport: Day -&amp;gt; Str,
        flights: Day -&amp;gt; Int,
        revenue: Day -&amp;gt; USD,
    )
    
    Day -&amp;gt; Whatever(
        airport: Hour -&amp;gt; Str,
        flights: Hour -&amp;gt; Int,
        revenue: Hour -&amp;gt; USD,
    )
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;In my mental model, the heterogeneous struct acts as a "block" in the array. We can't remove it, we can only push an index into the fields or pull a shared column out. But there's no way to convert a heterogeneous table into an array.&lt;/p&gt;
    &lt;h3&gt;Actually there is a terrible way&lt;/h3&gt;
    &lt;p&gt;Most languages have unions or &lt;del&gt;product&lt;/del&gt; sum types that let us say "this is a string OR integer". So we can make our airport data &lt;code&gt;Day -&amp;gt; Hour -&amp;gt; AirportKey -&amp;gt; Int | Str | USD&lt;/code&gt;. Heck, might as well just say it's &lt;code&gt;Day -&amp;gt; Hour -&amp;gt; AirportKey -&amp;gt; Any&lt;/code&gt;. But would anybody really be mad enough to use that in practice?&lt;/p&gt;
    &lt;p&gt;&lt;a href="https://code.jsoftware.com/wiki/Vocabulary/lt" target="_blank"&gt;Oh wait J does exactly that&lt;/a&gt;. J has an opaque datatype called a "box". A "table" is a function &lt;code&gt;Dim1 -&amp;gt; Dim2 -&amp;gt; Box&lt;/code&gt;. You can see some examples of what that looks like &lt;a href="https://code.jsoftware.com/wiki/DB/Flwor" target="_blank"&gt;here&lt;/a&gt;&lt;/p&gt;
    &lt;h3&gt;Misc Thoughts and Questions&lt;/h3&gt;
    &lt;p&gt;The heterogeneity barrier seems like it explains why we don't see multiple axes of table columns, while we do see multiple axes of array dimensions. But is that actually why? Is there a system out there that &lt;em&gt;does&lt;/em&gt; have multiple columnar axes?&lt;/p&gt;
    &lt;p&gt;The array &lt;code&gt;x = [[a, b, a], [b, b, b]]&lt;/code&gt; has type &lt;code&gt;1..2 -&amp;gt; 1..3 -&amp;gt; {a, b}&lt;/code&gt;. Can we rearrange it to &lt;code&gt;1..2 -&amp;gt; {a, b} -&amp;gt; 1..3&lt;/code&gt;? No. But we &lt;em&gt;can&lt;/em&gt; rearrange it to &lt;code&gt;1..2 -&amp;gt; {a, b} -&amp;gt; PowerSet(1..3)&lt;/code&gt;, which maps rows and characters to columns &lt;em&gt;with&lt;/em&gt; that character. &lt;code&gt;[(a -&amp;gt; {1, 3} ++ b -&amp;gt; {2}), (a -&amp;gt; {} ++ b -&amp;gt; {1, 2, 3}]&lt;/code&gt;. &lt;/p&gt;
    &lt;p&gt;We can also transform &lt;code&gt;Row -&amp;gt; PowerSet(Col)&lt;/code&gt; into &lt;code&gt;Row -&amp;gt; Col -&amp;gt; Bool&lt;/code&gt;, aka a boolean matrix. This makes sense to me as both forms are means of representing directed graphs.&lt;/p&gt;
    &lt;p&gt;Are other function combinators useful for thinking about arrays?&lt;/p&gt;
    &lt;p&gt;Does this model cover pivot tables? Can we extend it to relational data with multiple tables?&lt;/p&gt;
    &lt;hr/&gt;
    &lt;h3&gt;Systems Distributed Talk (will be) Online&lt;/h3&gt;
    &lt;p&gt;The premier will be August 6 at 12 CST, &lt;a href="https://www.youtube.com/watch?v=d9cM8f_qSLQ" target="_blank"&gt;here&lt;/a&gt;! I'll be there to answer questions / mock my own performance / generally make a fool of myself.&lt;/p&gt;
    &lt;div class="footnote"&gt;
    &lt;hr/&gt;
    &lt;ol&gt;
    &lt;li id="fn:1-indexing"&gt;
    &lt;p&gt;&lt;a href="https://buttondown.com/hillelwayne/archive/why-do-arrays-start-at-0/" target="_blank"&gt;Sacrilege&lt;/a&gt;! But it turns out in this context, it's easier to use 1-indexing than 0-indexing. In the years since I wrote that article I've settled on "each indexing choice matches different kinds of mathematical work", so mathematicians and computer scientists are best served by being able to choose their index. But software engineers need consistency, and 0-indexing is overall a net better consistency pick. &lt;a class="footnote-backref" href="#fnref:1-indexing" title="Jump back to footnote 1 in the text"&gt;↩&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id="fn:associative"&gt;
    &lt;p&gt;This is &lt;em&gt;right-associative&lt;/em&gt;: &lt;code&gt;a -&amp;gt; b -&amp;gt; c&lt;/code&gt; means &lt;code&gt;a -&amp;gt; (b -&amp;gt; c)&lt;/code&gt;, not &lt;code&gt;(a -&amp;gt; b) -&amp;gt; c&lt;/code&gt;. &lt;code&gt;(1..3 -&amp;gt; 1..4) -&amp;gt; Int&lt;/code&gt; would be the associative array that maps length-3 arrays to integers. &lt;a class="footnote-backref" href="#fnref:associative" title="Jump back to footnote 2 in the text"&gt;↩&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id="fn:typeclass"&gt;
    &lt;p&gt;Technically it has type &lt;code&gt;Num a =&amp;gt; a -&amp;gt; a -&amp;gt; a&lt;/code&gt;, since &lt;code&gt;(+)&lt;/code&gt; works on floats too. &lt;a class="footnote-backref" href="#fnref:typeclass" title="Jump back to footnote 3 in the text"&gt;↩&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id="fn:name-dimension"&gt;
    &lt;p&gt;Notice that if each &lt;code&gt;Airport&lt;/code&gt; had a unique name, we &lt;em&gt;could&lt;/em&gt; pull it out into &lt;code&gt;AirportName -&amp;gt; Airport(flights, revenue)&lt;/code&gt;, but we still are stuck with two different values. &lt;a class="footnote-backref" href="#fnref:name-dimension" title="Jump back to footnote 4 in the text"&gt;↩&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;/ol&gt;
    &lt;/div&gt;</description>
                <pubDate>Wed, 30 Jul 2025 13:00:00 +0000</pubDate>
                <guid>https://buttondown.com/hillelwayne/archive/2000-words-about-arrays-and-tables/</guid>
            </item>
            <item>
                <title>Programming Language Escape Hatches</title>
                <link>https://buttondown.com/hillelwayne/archive/programming-language-escape-hatches/</link>
                <description>&lt;p&gt;The excellent-but-defunct blog &lt;a href="https://prog21.dadgum.com/38.html" target="_blank"&gt;Programming in the 21st Century&lt;/a&gt; defines "puzzle languages" as languages were part of the appeal is in figuring out how to express a program idiomatically, like a puzzle. As examples, he lists Haskell, Erlang, and J. All puzzle languages, the author says, have an "escape" out of the puzzle model that is pragmatic but stigmatized.&lt;/p&gt;
    &lt;p&gt;But many mainstream languages have escape hatches, too.&lt;/p&gt;
    &lt;p&gt;Languages have a lot of properties. One of these properties is the language's &lt;a href="https://buttondown.com/hillelwayne/archive/the-capability-tractability-tradeoff/" target="_blank"&gt;capabilities&lt;/a&gt;, roughly the set of things you can do in the language. Capability is desirable but comes into conflicts with a lot of other desirable properties, like simplicity or efficiency. In particular, reducing the capability of a language means that all remaining programs share more in common, meaning there's more assumptions the compiler and programmer can make ("tractability"). Assumptions are generally used to reason about correctness, but can also be about things like optimization: J's assumption that everything is an array leads to &lt;a href="https://code.jsoftware.com/wiki/Vocabulary/SpecialCombinations" target="_blank"&gt;high-performance "special combinations"&lt;/a&gt;. &lt;/p&gt;
    &lt;p&gt;Rust is the most famous example of &lt;strong&gt;mainstream&lt;/strong&gt; language that trades capability for tractability.&lt;sup id="fnref:gc"&gt;&lt;a class="footnote-ref" href="#fn:gc"&gt;1&lt;/a&gt;&lt;/sup&gt; Rust has a lot of rules designed to prevent common memory errors, like keeping a reference to deallocated memory or modifying memory while something else is reading it. As a consequence, there's a lot of things that cannot be done in (safe) Rust, like interface with an external C function (as it doesn't have these guarantees).&lt;/p&gt;
    &lt;p&gt;To do this, you need to use &lt;a href="https://doc.rust-lang.org/book/ch20-01-unsafe-rust.html" target="_blank"&gt;unsafe Rust&lt;/a&gt;, which lets you do additional things forbidden by safe Rust, such as deference a raw pointer. Everybody tells you not to use &lt;code&gt;unsafe&lt;/code&gt; unless you absolutely 100% know what you're doing, and possibly not even then.&lt;/p&gt;
    &lt;p&gt;Sounds like an escape hatch to me!&lt;/p&gt;
    &lt;p&gt;To extrapolate, an &lt;strong&gt;escape hatch&lt;/strong&gt; is a feature (either in the language itself or a particular implementation) that deliberately breaks core assumptions about the language in order to add capabilities. This explains both Rust and most of the so-called "puzzle languages": they need escape hatches because they have very strong conceptual models of the language which leads to lots of assumptions about programs. But plenty of "kitchen sink" mainstream languages have escape hatches, too:&lt;/p&gt;
    &lt;ul&gt;
    &lt;li&gt;Some compilers let C++ code embed &lt;a href="https://en.cppreference.com/w/cpp/language/asm.html" target="_blank"&gt;inline assembly&lt;/a&gt;.&lt;/li&gt;
    &lt;li&gt;Languages built on .NET or the JVM has some sort of interop with C# or Java, and many of those languages make assumptions about programs that C#/Java do not.&lt;/li&gt;
    &lt;li&gt;The SQL language has stored procedures as an escape hatch &lt;em&gt;and&lt;/em&gt; vendors create a second escape hatch of user-defined functions.&lt;/li&gt;
    &lt;li&gt;Ruby lets you bypass any form of encapsulation with &lt;a href="https://ruby-doc.org/3.4.1/Object.html#method-i-send" target="_blank"&gt;&lt;code&gt;send&lt;/code&gt;&lt;/a&gt;.&lt;/li&gt;
    &lt;li&gt;Frameworks have escape hatches, too! React has &lt;a href="https://react.dev/learn/escape-hatches" target="_blank"&gt;an entire page on them&lt;/a&gt;.&lt;/li&gt;
    &lt;/ul&gt;
    &lt;p&gt;(Does &lt;code&gt;eval&lt;/code&gt; in interpreted languages count as an escape hatch? It feels different, but it does add a lot of capability. Maybe they don't "break assumptions" in the same way?)&lt;/p&gt;
    &lt;h3&gt;The problem with escape hatches&lt;/h3&gt;
    &lt;p&gt;In all languages with escape hatches, the rule is "use this as carefully and sparingly as possible", to the point where a messy solution &lt;em&gt;without&lt;/em&gt; an escape hatch is preferable to a clean solution &lt;em&gt;with&lt;/em&gt; one. Breaking a core assumption is a big deal! If the language is operating as if its still true, it's going to do incorrect things. &lt;/p&gt;
    &lt;p&gt;I recently had this problem in a TLA+ contract. TLA+ is a language for modeling complicated systems, and assumes that the model is a self-contained universe. The client wanted to use the TLA+ to test a real system. The model checker should send commands to a test device and check the next states were the same. This is straightforward to set up with the &lt;a href="https://github.com/tlaplus/CommunityModules/blob/master/modules/IOUtils.tla" target="_blank"&gt;IOExec escape hatch&lt;/a&gt;.&lt;sup id="fnref:ioexec"&gt;&lt;a class="footnote-ref" href="#fn:ioexec"&gt;2&lt;/a&gt;&lt;/sup&gt; But the model checker assumed that state exploration was pure and it could skip around the state randomly, meaning it would do things like &lt;code&gt;set x = 10&lt;/code&gt;, then skip to &lt;code&gt;set x = 1&lt;/code&gt;, then skip back to &lt;code&gt;inc x; assert x == 11&lt;/code&gt;. Oops!&lt;/p&gt;
    &lt;p&gt;We eventually found workarounds but it took a lot of clever tricks to pull off. I'll probably write up the technique when I'm less busy with The Book.&lt;/p&gt;
    &lt;p&gt;The other problem with escape hatches is the rest of the language is designed around &lt;em&gt;not&lt;/em&gt; having said capabilities, meaning it can't support the feature as well as a language designed for them from the start. Even if your escape hatch code is clean, it might not cleanly &lt;em&gt;integrate&lt;/em&gt; with the rest of your code. This is why people &lt;a href="https://chadaustin.me/2024/10/intrusive-linked-list-in-rust/" target="_blank"&gt;complain about unsafe Rust&lt;/a&gt; so often.&lt;/p&gt;
    &lt;div class="footnote"&gt;
    &lt;hr/&gt;
    &lt;ol&gt;
    &lt;li id="fn:gc"&gt;
    &lt;p&gt;It should be noted though that &lt;em&gt;all&lt;/em&gt; languages with automatic memory management are trading capability for tractability, too. If you can't deference pointers, you can't deference &lt;em&gt;null&lt;/em&gt; pointers. &lt;a class="footnote-backref" href="#fnref:gc" title="Jump back to footnote 1 in the text"&gt;↩&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id="fn:ioexec"&gt;
    &lt;p&gt;From the Community Modules (which come default with the VSCode extension). &lt;a class="footnote-backref" href="#fnref:ioexec" title="Jump back to footnote 2 in the text"&gt;↩&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;/ol&gt;
    &lt;/div&gt;</description>
                <pubDate>Thu, 24 Jul 2025 14:00:00 +0000</pubDate>
                <guid>https://buttondown.com/hillelwayne/archive/programming-language-escape-hatches/</guid>
            </item>
            <item>
                <title>Maybe writing speed actually is a bottleneck for programming</title>
                <link>https://buttondown.com/hillelwayne/archive/maybe-writing-speed-actually-is-a-bottleneck-for/</link>
                <description>&lt;p&gt;I'm a big (neo)vim buff. My config is over 1500 lines and I regularly write new scripts. I recently ported my neovim config to a new laptop. Before then, I was using VSCode to write, and when I switched back I immediately saw a big gain in productivity.&lt;/p&gt;
    &lt;p&gt;People often pooh-pooh vim (and other assistive writing technologies) by saying that writing code isn't the bottleneck in software development. Reading, understanding, and thinking through code is!&lt;/p&gt;
    &lt;p&gt;Now I don't know how true this actually is in practice, because empirical studies of time spent coding are all over the place. Most of them, like &lt;a href="https://www.microsoft.com/en-us/research/wp-content/uploads/2016/02/meyer-fse-2014.pdf" target="_blank"&gt;this study&lt;/a&gt;, track time spent in the editor but don't distinguish between time spent reading code and time spent writing code. The only one I found that separates them was &lt;a href="https://scispace.com/pdf/i-know-what-you-did-last-summer-an-investigation-of-how-3zxclzzocc.pdf" target="_blank"&gt;this study&lt;/a&gt;. It finds that developers spend only 5% of their time editing. It also finds they spend 14% of their time moving or resizing editor windows, so I don't know how clean their data is.&lt;/p&gt;
    &lt;p&gt;But I have a bigger problem with "writing is not the bottleneck": when I think of a bottleneck, I imagine that &lt;em&gt;no&lt;/em&gt; amount of improvement will lead to productivity gains. Like if a program is bottlenecked on the network, it isn't going to get noticeably faster with 100x more ram or compute. &lt;/p&gt;
    &lt;p&gt;But being able to type code 100x faster, even with without corresponding improvements to reading and imagining code, would be &lt;strong&gt;huge&lt;/strong&gt;. &lt;/p&gt;
    &lt;p&gt;We'll assume the average developer writes at 80 words per minute, at five characters a word, for 400 characters a minute.What could we do if we instead wrote at 8,000 words/40k characters a minute? &lt;/p&gt;
    &lt;h3&gt;Writing fast&lt;/h3&gt;
    &lt;h4&gt;Boilerplate is trivial&lt;/h4&gt;
    &lt;p&gt;Why do people like type inference? Because writing all of the types manually is annoying. Why don't people like boilerplate? Because it's annoying to write every damn time. Programmers like features that help them write less! That's not a problem if you can write all of the boilerplate in 0.1 seconds.&lt;/p&gt;
    &lt;p&gt;You still have the problem of &lt;em&gt;reading&lt;/em&gt; boilerplate heavy code, but you can use the remaining 0.9 seconds to churn out an extension that parses the file and presents the boilerplate in a more legible fashion. &lt;/p&gt;
    &lt;h4&gt;We can write more tooling&lt;/h4&gt;
    &lt;p&gt;This is something I've noticed with LLMs: when I can churn out crappy code as a free action, I use that to write lots of tools that assist me in writing &lt;em&gt;good&lt;/em&gt; code. Even if I'm bottlenecked on a large program, I can still quickly write a script that helps me with something. Most of these aren't things I would have written because they'd take too long to write! &lt;/p&gt;
    &lt;p&gt;Again, not the best comparison, because LLMs also shortcut learning the relevant APIs, so also optimize the "understanding code" part. Then again, if I could type real fast I could more quickly whip up experiments on new apis to learn them faster. &lt;/p&gt;
    &lt;h4&gt;We can do practices that slow us down in the short-term&lt;/h4&gt;
    &lt;p&gt;Something like test-driven development significantly slows down how fast you write production code, because you have to spend a lot more time writing test code. Pair programming trades speed of writing code for speed of understanding code. A two-order-of-magnitude writing speedup makes both of them effectively free. Or, if you're not an eXtreme Programming fan, you can more easily follow the &lt;a href="https://en.wikipedia.org/wiki/The_Power_of_10:_Rules_for_Developing_Safety-Critical_Code" target="_blank"&gt;The Power of Ten Rules&lt;/a&gt; and blanket your code with contracts and assertions.&lt;/p&gt;
    &lt;h4&gt;We could do more speculative editing&lt;/h4&gt;
    &lt;p&gt;This is probably the biggest difference in how we'd work if we could write 100x faster: it'd be much easier to try changes to the code to see if they're good ideas in the first place. &lt;/p&gt;
    &lt;p&gt;How often have I tried optimizing something, only to find out it didn't make a difference? How often have I done a refactoring only to end up with lower-quality code overall? Too often. Over time it makes me prefer to try things that I know will work, and only "speculatively edit" when I think it be a fast change. If I could code 100x faster it would absolutely lead to me trying more speculative edits.&lt;/p&gt;
    &lt;p&gt;This is especially big because I believe that lots of speculative edits are high-risk, high-reward: given 50 things we could do to the code, 49 won't make a difference and one will be a major improvement. If I only have time to try five things, I have a 10% chance of hitting the jackpot. If I can try 500 things I will get that reward every single time. &lt;/p&gt;
    &lt;h2&gt;Processes are built off constraints&lt;/h2&gt;
    &lt;p&gt;There are just a few ideas I came up with; there are probably others. Most of them, I suspect, will share the same property in common: they change &lt;em&gt;the process&lt;/em&gt; of writing code to leverage the speedup. I can totally believe that a large speedup would not remove a bottleneck in the processes we &lt;em&gt;currently&lt;/em&gt; use to write code. But that's because those processes are developed work within our existing constraints. Remove a constraint and new processes become possible.&lt;/p&gt;
    &lt;p&gt;The way I see it, if our current process produces 1 Utils of Software / day, a 100x writing speedup might lead to only 1.5 UoS/day. But there are other processes that produce only 0.5 UoS/d &lt;em&gt;because they are bottlenecked on writing speed&lt;/em&gt;. A 100x speedup would lead to 10 UoS/day.&lt;/p&gt;
    &lt;p&gt;The problem with all of this that 100x speedup isn't realistic, and it's not obvious whether a 2x improvement would lead to better processes. Then again, one of the first custom vim function scripts I wrote was an aid to writing unit tests in a particular codebase, and it lead to me writing a lot more tests. So maybe even a 2x speedup is going to be speed things up, too.&lt;/p&gt;
    &lt;hr/&gt;
    &lt;h3&gt;Patreon Stuff&lt;/h3&gt;
    &lt;p&gt;I wrote a couple of TLA+ specs to show how to model &lt;a href="https://en.wikipedia.org/wiki/Fork%E2%80%93join_model" target="_blank"&gt;fork-join&lt;/a&gt; algorithms. I'm planning on eventually writing them up for my blog/learntla but it'll be a while, so if you want to see them in the meantime I put them up on &lt;a href="https://www.patreon.com/posts/fork-join-in-tla-134209395?utm_medium=clipboard_copy&amp;amp;utm_source=copyLink&amp;amp;utm_campaign=postshare_creator&amp;amp;utm_content=join_link" target="_blank"&gt;Patreon&lt;/a&gt;.&lt;/p&gt;</description>
                <pubDate>Thu, 17 Jul 2025 19:08:27 +0000</pubDate>
                <guid>https://buttondown.com/hillelwayne/archive/maybe-writing-speed-actually-is-a-bottleneck-for/</guid>
            </item>
            <item>
                <title>Logic for Programmers Turns One</title>
                <link>https://buttondown.com/hillelwayne/archive/logic-for-programmers-turns-one/</link>
                <description>&lt;p&gt;I released &lt;a href="https://leanpub.com/logic/" target="_blank"&gt;Logic for Programmers&lt;/a&gt; exactly one year ago today. It feels weird to celebrate the anniversary of something that isn't 1.0 yet, but software projects have a proud tradition of celebrating a dozen anniversaries before 1.0. I wanted to share about what's changed in the past year and the work for the next six+ months.&lt;/p&gt;
    &lt;p&gt;&lt;img alt="The book cover!" class="newsletter-image" src="https://assets.buttondown.email/images/70ac47c9-c49f-47c0-9a05-7a9e70551d03.jpg?w=960&amp;amp;fit=max"/&gt;&lt;/p&gt;
    &lt;h3&gt;The Road to 0.1&lt;/h3&gt;
    &lt;p&gt;I had been noodling on the idea of a logic book since the pandemic. The first time I wrote about it on the newsletter was in &lt;a href="https://buttondown.com/hillelwayne/archive/predicate-logic-for-programmers/" target="_blank"&gt;2021&lt;/a&gt;! Then I said that it would be done by June and would be "under 50 pages". The idea was to cover logic as a "soft skill" that helped you think about things like requirements and stuff.&lt;/p&gt;
    &lt;p&gt;That version &lt;em&gt;sucked&lt;/em&gt;. If you want to see how much it sucked, I put it up on &lt;a href="https://www.patreon.com/posts/what-logic-for-133675688" target="_blank"&gt;Patreon&lt;/a&gt;. Then I slept on the next draft for three years. Then in 2024 a lot of business fell through and I had a lot of free time, so with the help of &lt;a href="https://saul.pw/" target="_blank"&gt;Saul Pwanson&lt;/a&gt; I rewrote the book. This time I emphasized breadth over depth, trying to cover a lot more techniques.  &lt;/p&gt;
    &lt;p&gt;I also decided to self-publish it instead of pitching it to a publisher. Not going the traditional route would mean I would be responsible for paying for editing, advertising, graphic design etc, but I hoped that would be compensated by &lt;em&gt;much&lt;/em&gt; higher royalties. It also meant I could release the book in early access and use early sales to fund further improvements. So I wrote up a draft in &lt;a href="https://www.sphinx-doc.org/en/master/" target="_blank"&gt;Sphinx&lt;/a&gt;, compiled it to LaTeX, and uploaded the PDF to &lt;a href="https://leanpub.com/" target="_blank"&gt;leanpub&lt;/a&gt;. That was in June 2024.&lt;/p&gt;
    &lt;p&gt;Since then I kept to a monthly cadence of updates, missing once in November (short-notice contract) and once last month (&lt;a href="https://systemsdistributed.com/" target="_blank"&gt;Systems Distributed&lt;/a&gt;). The book's now on v0.10. What's changed?&lt;/p&gt;
    &lt;h3&gt;A LOT&lt;/h3&gt;
    &lt;p&gt;v0.1 was &lt;em&gt;very obviously&lt;/em&gt; an alpha, and I have made a lot of improvements since then. For one, the book no longer looks like a &lt;a href="https://www.sphinx-doc.org/_/downloads/en/master/pdf/#page=13" target="_blank"&gt;Sphinx manual&lt;/a&gt;. Compare!&lt;/p&gt;
    &lt;p&gt;&lt;img alt="0.1 on left, 0.10 on right. Way better!" class="newsletter-image" src="https://assets.buttondown.email/images/e4d880ad-80b8-4360-9cae-27c07598c740.png?w=960&amp;amp;fit=max"/&gt;&lt;/p&gt;
    &lt;p&gt;Also, the content is very, very different. v0.1 was 19,000 words, v.10 is 31,000.&lt;sup id="fnref:pagesize"&gt;&lt;a class="footnote-ref" href="#fn:pagesize"&gt;1&lt;/a&gt;&lt;/sup&gt; This comes from new chapters on TLA+, constraint/SMT solving, logic programming, and major expansions to the existing chapters. Originally, "Simplifying Conditionals" was 600 words. Six hundred words! It almost fit in two pages!&lt;/p&gt;
    &lt;p&gt;&lt;img alt="How short Simplifying Conditions USED to be" class="newsletter-image" src="https://assets.buttondown.email/images/31e731b7-3bdc-4ded-9b09-2a6261a323ec.png?w=960&amp;amp;fit=max"/&gt;&lt;/p&gt;
    &lt;p&gt;The chapter is now 2600 words, now covering condition lifting, quantifier manipulation, helper predicates, and set optimizations. All the other chapters have either gotten similar facelifts or are scheduled to get facelifts.&lt;/p&gt;
    &lt;p&gt;The last big change is the addition of &lt;a href="https://github.com/logicforprogrammers/book-assets" target="_blank"&gt;book assets&lt;/a&gt;. Originally you had to manually copy over all of the code to try it out, which is a problem when there are samples in eight distinct languages! Now there are ready-to-go examples for each chapter, with instructions on how to set up each programming environment. This is also nice because it gives me breaks from writing to code instead.&lt;/p&gt;
    &lt;h3&gt;How did the book do?&lt;/h3&gt;
    &lt;p&gt;Leanpub's all-time visualizations are terrible, so I'll just give the summary: 1180 copies sold, $18,241 in royalties. That's a lot of money for something that isn't fully out yet! By comparison, &lt;em&gt;Practical TLA+&lt;/em&gt; has made me less than half of that, despite selling over 5x as many books. Self-publishing was the right choice!&lt;/p&gt;
    &lt;p&gt;In that time I've paid about $400 for the book cover (worth it) and maybe $800 in Leanpub's advertising service (probably not worth it). &lt;/p&gt;
    &lt;p&gt;Right now that doesn't come close to making back the time investment, but I think it can get there post-release. I believe there's a lot more potential customers via marketing. I think post-release 10k copies sold is within reach.&lt;/p&gt;
    &lt;p class="empty-line" style="height:16px; margin:0px !important;"&gt;&lt;/p&gt;
    &lt;h3&gt;Where is the book going?&lt;/h3&gt;
    &lt;div class="subscribe-form"&gt;&lt;/div&gt;
    &lt;p&gt;The main content work is rewrites: many of the chapters have not meaningfully changed since 1.0, so I am going through and rewriting them from scratch. So far four of the ten chapters have been rewritten. My (admittedly ambitious) goal is to rewrite three of them by the end of this month and another three by the end of next. I also want to do final passes on the rewritten chapters; as most of them have a few TODOs left lying around.&lt;/p&gt;
    &lt;p&gt;(Also somehow in starting this newsletter and publishing it I realized that one of the chapters might be better split into two chapters, so there could well-be a tenth technique in v0.11 or v0.12!)&lt;/p&gt;
    &lt;p&gt;After that, I will pass it to a copy editor while I work on improving the layout, making images, and indexing. I want to have something worthy of printing on a dead tree by 1.0. &lt;/p&gt;
    &lt;p&gt;In terms of timelines, I am &lt;strong&gt;very roughly&lt;/strong&gt; estimating something like this:&lt;/p&gt;
    &lt;ul&gt;
    &lt;li&gt;Summer: final big changes and rewrites&lt;/li&gt;
    &lt;li&gt;Early Autumn: graphic design and copy editing&lt;/li&gt;
    &lt;li&gt;Late Autumn: proofing, figuring out printing stuff&lt;/li&gt;
    &lt;li&gt;Winter: final ebook and initial print releases of 1.0.&lt;/li&gt;
    &lt;/ul&gt;
    &lt;p&gt;(If you know a service that helps get self-published books "past the finish line", I'd love to hear about it! Preferably something that works for a fee, not part of royalties.)&lt;/p&gt;
    &lt;p&gt;This timeline may be disrupted by official client work, like a new TLA+ contract or a conference invitation.&lt;/p&gt;
    &lt;p&gt;Needless to say, I am incredibly excited to complete this book and share the final version with you all. This is a book I wished for years ago, a book I wrote because nobody else would. It fills a critical gap in software educational material, and someday soon I'll be able to put a copy on my bookshelf. It's exhilarating and terrifying and above all, satisfying.&lt;/p&gt;
    &lt;div class="footnote"&gt;
    &lt;hr/&gt;
    &lt;ol&gt;
    &lt;li id="fn:pagesize"&gt;
    &lt;p&gt;It's also 150 pages vs 50 pages, but admittedly this is partially because I made the book smaller with a larger font. &lt;a class="footnote-backref" href="#fnref:pagesize" title="Jump back to footnote 1 in the text"&gt;↩&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;/ol&gt;
    &lt;/div&gt;</description>
                <pubDate>Tue, 08 Jul 2025 18:18:52 +0000</pubDate>
                <guid>https://buttondown.com/hillelwayne/archive/logic-for-programmers-turns-one/</guid>
            </item>
            <item>
                <title>Logical Quantifiers in Software</title>
                <link>https://buttondown.com/hillelwayne/archive/logical-quantifiers-in-software/</link>
                <description>&lt;p&gt;I realize that for all I've talked about &lt;a href="https://leanpub.com/logic/" target="_blank"&gt;Logic for Programmers&lt;/a&gt; in this newsletter, I never once explained basic logical quantifiers. They're both simple and incredibly useful, so let's do that this week! &lt;/p&gt;
    &lt;h3&gt;Sets and quantifiers&lt;/h3&gt;
    &lt;p&gt;A &lt;strong&gt;set&lt;/strong&gt; is a collection of unordered, unique elements. &lt;code&gt;{1, 2, 3, …}&lt;/code&gt; is a set, as are "every programming language", "every programming language's Wikipedia page", and "every function ever defined in any programming language's standard library". You can put whatever you want in a set, with some very specific limitations to avoid certain paradoxes.&lt;sup id="fnref:paradox"&gt;&lt;a class="footnote-ref" href="#fn:paradox"&gt;2&lt;/a&gt;&lt;/sup&gt; &lt;/p&gt;
    &lt;p&gt;Once we have a set, we can ask "is something true for all elements of the set" and "is something true for at least one element of the set?" IE, is it true that every programming language has a &lt;code&gt;set&lt;/code&gt; collection type in the core language? We would write it like this:&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;# all of them
    all l in ProgrammingLanguages: HasSetType(l)
    
    # at least one
    some l in ProgrammingLanguages: HasSetType(l)
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;This is the notation I use in the book because it's easy to read, type, and search for. Mathematicians historically had a few different formats; the one I grew up with was &lt;code&gt;∀x ∈ set: P(x)&lt;/code&gt; to mean &lt;code&gt;all x in set&lt;/code&gt;, and &lt;code&gt;∃&lt;/code&gt; to mean &lt;code&gt;some&lt;/code&gt;. I use these when writing for just myself, but find them confusing to programmers when communicating.&lt;/p&gt;
    &lt;p&gt;"All" and "some" are respectively referred to as "universal" and "existential" quantifiers.&lt;/p&gt;
    &lt;h3&gt;Some cool properties&lt;/h3&gt;
    &lt;p&gt;We can simplify expressions with quantifiers, in the same way that we can simplify &lt;code&gt;!(x &amp;amp;&amp;amp; y)&lt;/code&gt; to &lt;code&gt;!x || !y&lt;/code&gt;.&lt;/p&gt;
    &lt;p&gt;First of all, quantifiers are commutative with themselves. &lt;code&gt;some x: some y: P(x,y)&lt;/code&gt; is the same as &lt;code&gt;some y: some x: P(x, y)&lt;/code&gt;. For this reason we can write &lt;code&gt;some x, y: P(x,y)&lt;/code&gt; as shorthand. We can even do this when quantifying over different sets, writing &lt;code&gt;some x, x' in X, y in Y&lt;/code&gt; instead of &lt;code&gt;some x, x' in X: some y in Y&lt;/code&gt;. We can &lt;em&gt;not&lt;/em&gt; do this with "alternating quantifiers":&lt;/p&gt;
    &lt;ul&gt;
    &lt;li&gt;&lt;code&gt;all p in Person: some m in Person: Mother(m, p)&lt;/code&gt; says that every person has a mother.&lt;/li&gt;
    &lt;li&gt;&lt;code&gt;some m in Person: all p in Person: Mother(m, p)&lt;/code&gt; says that someone is every person's mother.&lt;/li&gt;
    &lt;/ul&gt;
    &lt;p&gt;Second, existentials distribute over &lt;code&gt;||&lt;/code&gt; while universals distribute over &lt;code&gt;&amp;amp;&amp;amp;&lt;/code&gt;. "There is some url which returns a 403 or 404" is the same as "there is some url which returns a 403 or some url that returns a 404", and "all PRs pass the linter and the test suites" is the same as "all PRs pass the linter and all PRs pass the test suites".&lt;/p&gt;
    &lt;p&gt;Finally, &lt;code&gt;some&lt;/code&gt; and &lt;code&gt;all&lt;/code&gt; are &lt;em&gt;duals&lt;/em&gt;: &lt;code&gt;some x: P(x) == !(all x: !P(x))&lt;/code&gt;, and vice-versa. Intuitively: if some file is malicious, it's not true that all files are benign.&lt;/p&gt;
    &lt;p&gt;All these rules together mean we can manipulate quantifiers &lt;em&gt;almost&lt;/em&gt; as easily as we can manipulate regular booleans, putting them in whatever form is easiest to use in programming. &lt;/p&gt;
    &lt;p&gt;Speaking of which, how &lt;em&gt;do&lt;/em&gt; we use this in in programming?&lt;/p&gt;
    &lt;h2&gt;How we use this in programming&lt;/h2&gt;
    &lt;p&gt;First of all, people clearly have a need for directly using quantifiers in code. If we have something of the form:&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;for x in list:
        if P(x):
            return true
    return false
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;That's just &lt;code&gt;some x in list: P(x)&lt;/code&gt;. And this is a prevalent pattern, as you can see by using &lt;a href="https://github.com/search?q=%2Ffor+.*%3A%5Cn%5Cs*if+.*%3A%5Cn%5Cs*return+%28False%7CTrue%29%5Cn%5Cs*return+%28True%7CFalse%29%2F+language%3Apython+NOT+is%3Afork&amp;amp;type=code" target="_blank"&gt;GitHub code search&lt;/a&gt;. It finds over 500k examples of this pattern in Python alone! That can be simplified via using the language's built-in quantifiers: the Python would be &lt;code&gt;any(P(x) for x in list)&lt;/code&gt;.&lt;/p&gt;
    &lt;p&gt;(Note this is not quantifying over sets but iterables. But the idea translates cleanly enough.)&lt;/p&gt;
    &lt;p&gt;More generally, quantifiers are a key way we express higher-level properties of software. What does it mean for a list to be sorted in ascending order? That &lt;code&gt;all i, j in 0..&amp;lt;len(l): if i &amp;lt; j then l[i] &amp;lt;= l[j]&lt;/code&gt;. When should a &lt;a href="https://qntm.org/ratchet" target="_blank"&gt;ratchet test fail&lt;/a&gt;? When &lt;code&gt;some f in functions - exceptions: Uses(f, bad_function)&lt;/code&gt;. Should the image classifier work upside down? &lt;code&gt;all i in images: classify(i) == classify(rotate(i, 180))&lt;/code&gt;. These are the properties we verify with tests and types and &lt;a href="https://www.hillelwayne.com/post/constructive/" target="_blank"&gt;MISU&lt;/a&gt; and whatnot;&lt;sup id="fnref:misu"&gt;&lt;a class="footnote-ref" href="#fn:misu"&gt;1&lt;/a&gt;&lt;/sup&gt; it helps to be able to make them explicit!&lt;/p&gt;
    &lt;p&gt;One cool use case that'll be in the book's next version: database invariants are universal statements over the set of all records, like &lt;code&gt;all a in accounts: a.balance &amp;gt; 0&lt;/code&gt;. That's enforceable with a &lt;a href="https://sqlite.org/lang_createtable.html#check_constraints" target="_blank"&gt;CHECK&lt;/a&gt; constraint. But what about something like &lt;code&gt;all i, i' in intervals: NoOverlap(i, i')&lt;/code&gt;? That isn't covered by CHECK, since it spans two rows.&lt;/p&gt;
    &lt;p&gt;Quantifier duality to the rescue! The invariant is equivalent to &lt;code&gt;!(some i, i' in intervals: Overlap(i, i'))&lt;/code&gt;, so is preserved if the &lt;em&gt;query&lt;/em&gt; &lt;code&gt;SELECT COUNT(*) FROM intervals CROSS JOIN intervals …&lt;/code&gt; returns 0 rows. This means we can test it via a &lt;a href="https://sqlite.org/lang_createtrigger.html" target="_blank"&gt;database trigger&lt;/a&gt;.&lt;sup id="fnref:efficiency"&gt;&lt;a class="footnote-ref" href="#fn:efficiency"&gt;3&lt;/a&gt;&lt;/sup&gt;&lt;/p&gt;
    &lt;hr/&gt;
    &lt;p&gt;There are a lot more use cases for quantifiers, but this is enough to introduce the ideas! Next week's the one year anniversary of the book entering early access, so I'll be writing a bit about that experience and how the book changed. It's &lt;em&gt;crazy&lt;/em&gt; how crude v0.1 was compared to the current version.&lt;/p&gt;
    &lt;div class="footnote"&gt;
    &lt;hr/&gt;
    &lt;ol&gt;
    &lt;li id="fn:misu"&gt;
    &lt;p&gt;MISU ("make illegal states unrepresentable") means using data representations that rule out invalid values. For example, if you have a &lt;code&gt;location -&amp;gt; Optional(item)&lt;/code&gt; lookup and want to make sure that each item is in exactly one location, consider instead changing the map to &lt;code&gt;item -&amp;gt; location&lt;/code&gt;. This is a means of &lt;em&gt;implementing&lt;/em&gt; the property &lt;code&gt;all i in item, l, l' in location: if ItemIn(i, l) &amp;amp;&amp;amp; l != l' then !ItemIn(i, l')&lt;/code&gt;. &lt;a class="footnote-backref" href="#fnref:misu" title="Jump back to footnote 1 in the text"&gt;↩&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id="fn:paradox"&gt;
    &lt;p&gt;Specifically, a set can't be an element of itself, which rules out constructing things like "the set of all sets" or "the set of sets that don't contain themselves". &lt;a class="footnote-backref" href="#fnref:paradox" title="Jump back to footnote 2 in the text"&gt;↩&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id="fn:efficiency"&gt;
    &lt;p&gt;Though note that when you're inserting or updating an interval, you already &lt;em&gt;have&lt;/em&gt; that row's fields in the trigger's &lt;code&gt;NEW&lt;/code&gt; keyword. So you can just query &lt;code&gt;!(some i in intervals: Overlap(new, i'))&lt;/code&gt;, which is more efficient. &lt;a class="footnote-backref" href="#fnref:efficiency" title="Jump back to footnote 3 in the text"&gt;↩&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;/ol&gt;
    &lt;/div&gt;</description>
                <pubDate>Wed, 02 Jul 2025 19:44:22 +0000</pubDate>
                <guid>https://buttondown.com/hillelwayne/archive/logical-quantifiers-in-software/</guid>
            </item>
            <item>
                <title>You can cheat a test suite with a big enough polynomial</title>
                <link>https://buttondown.com/hillelwayne/archive/you-can-cheat-a-test-suite-with-a-big-enough/</link>
                <description>&lt;p&gt;Hi nerds, I'm back from &lt;a href="https://systemsdistributed.com/" target="_blank"&gt;Systems Distributed&lt;/a&gt;! I'd heartily recommend it, wildest conference I've been to in years. I have a lot of work to catch up on, so this will be a short newsletter.&lt;/p&gt;
    &lt;p&gt;In an earlier version of my talk, I had a gag about unit tests. First I showed the test &lt;code&gt;f([1,2,3]) == 3&lt;/code&gt;, then said that this was satisfied by &lt;code&gt;f(l) = 3&lt;/code&gt;, &lt;code&gt;f(l) = l[-1]&lt;/code&gt;, &lt;code&gt;f(l) = len(l)&lt;/code&gt;, &lt;code&gt;f(l) = (129*l[0]-34*l[1]-617)*l[2] - 443*l[0] + 1148*l[1] - 182&lt;/code&gt;. Then I progressively rule them out one by one with more unit tests, except the last polynomial which stubbornly passes every single test.&lt;/p&gt;
    &lt;p&gt;If you're given some function of &lt;code&gt;f(x: int, y: int, …): int&lt;/code&gt; and a set of unit tests asserting &lt;a href="https://buttondown.com/hillelwayne/archive/oracle-testing/" target="_blank"&gt;specific inputs give specific outputs&lt;/a&gt;, then you can find a polynomial that passes every single unit test.&lt;/p&gt;
    &lt;p&gt;To find the gag, and as &lt;a href="https://en.wikipedia.org/wiki/Satisfiability_modulo_theories" target="_blank"&gt;SMT&lt;/a&gt; practice, I wrote a Python program that finds a polynomial that passes a test suite meant for &lt;code&gt;max&lt;/code&gt;. It's hardcoded for three parameters and only finds 2nd-order polynomials but I think it could be generalized with enough effort.&lt;/p&gt;
    &lt;h2&gt;The code&lt;/h2&gt;
    &lt;p&gt;Full code &lt;a href="https://gist.github.com/hwayne/0ed045a35376c786171f9cf4b55c470f" target="_blank"&gt;here&lt;/a&gt;, breakdown below.&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nn"&gt;z3&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;  &lt;span class="c1"&gt;# type: ignore&lt;/span&gt;
    &lt;span class="n"&gt;s1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;s2&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;Solver&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt; &lt;span class="n"&gt;Solver&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;&lt;a href="https://microsoft.github.io/z3guide/" target="_blank"&gt;Z3&lt;/a&gt; is just the particular SMT solver we use, as it has good language bindings and a lot of affordances.&lt;/p&gt;
    &lt;p&gt;As part of learning SMT I wanted to do this two ways. First by putting the polynomial "outside" of the SMT solver in a python function, second by doing it "natively" in Z3. I created two solvers so I could test both versions in one run. &lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="n"&gt;a0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;a&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;b&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;c&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;d&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;f&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;Consts&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'a0 a b c d e f'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;IntSort&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt;
    &lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;y&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;z&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;Ints&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'x y z'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;t&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"a*x+b*y+c*z+d*x*y+e*x*z+f*y*z+a0"&lt;/span&gt;
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;Both &lt;code&gt;Const('x', IntSort())&lt;/code&gt; and &lt;code&gt;Int('x')&lt;/code&gt; do the exact same thing, the latter being syntactic sugar for the former. I did not know this when I wrote the program. &lt;/p&gt;
    &lt;p&gt;To keep the two versions in sync I represented the equation as a string, which I later &lt;code&gt;eval&lt;/code&gt;. This is one of the rare cases where eval is a good idea, to help us experiment more quickly while learning. The polynomial is a "2nd-order polynomial", even though it doesn't have &lt;code&gt;x^2&lt;/code&gt; terms, as it has &lt;code&gt;xy&lt;/code&gt; and &lt;code&gt;xz&lt;/code&gt; terms.&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="n"&gt;lambdamax&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;lambda&lt;/span&gt; &lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;y&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;z&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;eval&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;t&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    
    &lt;span class="n"&gt;z3max&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;Function&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'z3max'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;IntSort&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt; &lt;span class="n"&gt;IntSort&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt; &lt;span class="n"&gt;IntSort&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt;  &lt;span class="n"&gt;IntSort&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt;
    &lt;span class="n"&gt;s1&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;add&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ForAll&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;y&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;z&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="n"&gt;z3max&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;y&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;z&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="nb"&gt;eval&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;t&lt;/span&gt;&lt;span class="p"&gt;)))&lt;/span&gt;
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;&lt;code&gt;lambdamax&lt;/code&gt; is pretty straightforward: create a lambda with three parameters and &lt;code&gt;eval&lt;/code&gt; the string. The string "&lt;code&gt;a*x&lt;/code&gt;" then becomes the python expression &lt;code&gt;a*x&lt;/code&gt;, &lt;code&gt;a&lt;/code&gt; is an SMT symbol, while the &lt;code&gt;x&lt;/code&gt; SMT symbol is shadowed by the lambda parameter. To reiterate, a terrible idea in practice, but a good way to learn faster.&lt;/p&gt;
    &lt;p&gt;&lt;code&gt;z3max&lt;/code&gt; function is a little more complex. &lt;code&gt;Function&lt;/code&gt; takes an identifier string and N "sorts" (roughly the same as programming types). The first &lt;code&gt;N-1&lt;/code&gt; sorts define the parameters of the function, while the last becomes the output. So here I assign the string identifier &lt;code&gt;"z3max"&lt;/code&gt; to be a function with signature &lt;code&gt;(int, int, int) -&amp;gt; int&lt;/code&gt;.&lt;/p&gt;
    &lt;p&gt;I can load the function into the model by specifying constraints on what &lt;code&gt;z3max&lt;/code&gt; &lt;em&gt;could&lt;/em&gt; be. This could either be a strict input/output, as will be done later, or a &lt;code&gt;ForAll&lt;/code&gt; over all possible inputs. Here I just use that directly to say "for all inputs, the function should match this polynomial." But I could do more complicated constraints, like commutativity (&lt;code&gt;f(x, y) == f(y, x)&lt;/code&gt;) or monotonicity (&lt;code&gt;Implies(x &amp;lt; y, f(x) &amp;lt;= f(y))&lt;/code&gt;).&lt;/p&gt;
    &lt;p&gt;Note &lt;code&gt;ForAll&lt;/code&gt; takes a list of z3 symbols to quantify over. That's the only reason we need to define &lt;code&gt;x, y, z&lt;/code&gt; in the first place. The lambda version doesn't need them. &lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="n"&gt;inputs&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="p"&gt;)]&lt;/span&gt;
    
    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;g&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;inputs&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;s1&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;add&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;z3max&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;g&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="nb"&gt;max&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;g&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
        &lt;span class="n"&gt;s2&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;add&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;lambdamax&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;g&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="nb"&gt;max&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;g&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;This sets up the joke: adding constraints to each solver that the polynomial it finds must, for a fixed list of triplets, return the max of each triplet.&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;s&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;func&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="p"&gt;[(&lt;/span&gt;&lt;span class="n"&gt;s1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;z3max&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;s2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;lambdamax&lt;/span&gt;&lt;span class="p"&gt;)]:&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;s&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;check&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="n"&gt;sat&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="n"&gt;m&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;s&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
            &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;y&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;z&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;inputs&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
                &lt;span class="nb"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="s2"&gt;"max([&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;, &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;y&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;, &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;z&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;]) ="&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;m&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;evaluate&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;func&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;y&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;z&lt;/span&gt;&lt;span class="p"&gt;)))&lt;/span&gt;
            &lt;span class="nb"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="s2"&gt;"max([x, y, z]) = &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;m&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;a&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;x + &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;m&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;b&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;y"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="s2"&gt;"+ &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;m&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;c&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;z +"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="c1"&gt;# linebreaks added for newsletter rendering&lt;/span&gt;
                &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;m&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;d&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;xy + &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;m&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;xz + &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;m&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;yz + &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;m&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;a0&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;Output:&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;max([1, 2, 3]) = 3
    # etc
    max([x, y, z]) = -133x + 130y + -10z + -2xy + 62xz + -46yz + 0
    
    max([1, 2, 3]) = 3
    # etc
    max([x, y, z]) = -17x + 16y + 0z + 0xy + 8xz + -6yz + 0
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;I find that &lt;code&gt;z3max&lt;/code&gt; (top) consistently finds larger coefficients than &lt;code&gt;lambdamax&lt;/code&gt; does. I don't know why.&lt;/p&gt;
    &lt;h3&gt;Practical Applications&lt;/h3&gt;
    &lt;p&gt;&lt;strong&gt;Test-Driven Development&lt;/strong&gt; recommends a strict "red-green refactor" cycle. Write a new failing test, make the new test pass, then go back and refactor. Well, the easiest way to make the new test pass would be to paste in a new polynomial, so that's what you should be doing. You can even do this all automatically: have a script read the set of test cases, pass them to the solver, and write the new polynomial to your code file. All you need to do is write the tests!&lt;/p&gt;
    &lt;h3&gt;Pedagogical Notes&lt;/h3&gt;
    &lt;p&gt;Writing the script took me a couple of hours. I'm sure an LLM could have whipped it all up in five minutes but I really want to &lt;em&gt;learn&lt;/em&gt; SMT and &lt;a href="https://www.sciencedirect.com/science/article/pii/S0747563224002541" target="_blank"&gt;LLMs &lt;em&gt;may&lt;/em&gt; decrease learning retention&lt;/a&gt;.&lt;sup id="fnref:caveat"&gt;&lt;a class="footnote-ref" href="#fn:caveat"&gt;1&lt;/a&gt;&lt;/sup&gt; Z3 documentation is not... great for non-academics, though, and most other SMT solvers have even worse docs. One useful trick I use regularly is to use Github code search to find code using the same APIs and study how that works. Turns out reading API-heavy code is a lot easier than writing it!&lt;/p&gt;
    &lt;p&gt;Anyway, I'm very, very slowly feeling like I'm getting the basics on how to use SMT. I don't have any practical use cases yet, but I wanted to learn this skill for a while and glad I finally did.&lt;/p&gt;
    &lt;div class="footnote"&gt;
    &lt;hr/&gt;
    &lt;ol&gt;
    &lt;li id="fn:caveat"&gt;
    &lt;p&gt;Caveat I have not actually &lt;em&gt;read&lt;/em&gt; the study, for all I know it could have a sample size of three people, I'll get around to it eventually &lt;a class="footnote-backref" href="#fnref:caveat" title="Jump back to footnote 1 in the text"&gt;↩&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;/ol&gt;
    &lt;/div&gt;</description>
                <pubDate>Tue, 24 Jun 2025 16:27:01 +0000</pubDate>
                <guid>https://buttondown.com/hillelwayne/archive/you-can-cheat-a-test-suite-with-a-big-enough/</guid>
            </item>
            <item>
                <title>Solving LinkedIn Queens with SMT</title>
                <link>https://buttondown.com/hillelwayne/archive/solving-linkedin-queens-with-smt/</link>
                <description>&lt;h3&gt;No newsletter next week&lt;/h3&gt;
    &lt;p&gt;I’ll be speaking at &lt;a href="https://systemsdistributed.com/" target="_blank"&gt;Systems Distributed&lt;/a&gt;. My talk isn't close to done yet, which is why this newsletter is both late and short. &lt;/p&gt;
    &lt;h1&gt;Solving LinkedIn Queens in SMT&lt;/h1&gt;
    &lt;p&gt;The article &lt;a href="https://codingnest.com/modern-sat-solvers-fast-neat-underused-part-1-of-n/" target="_blank"&gt;Modern SAT solvers: fast, neat and underused&lt;/a&gt; claims that SAT solvers&lt;sup id="fnref:SAT"&gt;&lt;a class="footnote-ref" href="#fn:SAT"&gt;1&lt;/a&gt;&lt;/sup&gt; are "criminally underused by the industry". A while back on the newsletter I asked "why": how come they're so powerful and yet nobody uses them? Many experts responded saying the reason is that encoding SAT kinda sucked and they rather prefer using tools that compile to SAT. &lt;/p&gt;
    &lt;p&gt;I was reminded of this when I read &lt;a href="https://ryanberger.me/posts/queens/" target="_blank"&gt;Ryan Berger's post&lt;/a&gt; on solving “LinkedIn Queens” as a SAT problem. &lt;/p&gt;
    &lt;p&gt;A quick overview of Queens. You’re presented with an NxN grid divided into N regions, and have to place N queens so that there is exactly one queen in each row, column, and region. While queens can be on the same diagonal, they &lt;em&gt;cannot&lt;/em&gt; be adjacently diagonal.&lt;/p&gt;
    &lt;p&gt;(Important note: Linkedin “Queens” is a variation on the puzzle game &lt;a href="https://starbattle.puzzlebaron.com/" target="_blank"&gt;Star Battle&lt;/a&gt;, which is the same except the number of stars you place in each row/column/region varies per puzzle, and is usually two. This is also why 'queens' don’t capture like chess queens.)&lt;/p&gt;
    &lt;p&gt;&lt;img alt="An image of a solved queens board. Copied from https://ryanberger.me/posts/queens" class="newsletter-image" src="https://assets.buttondown.email/images/96f6f923-331f-424d-8641-fe6753e1c2ca.png?w=960&amp;amp;fit=max"/&gt;&lt;/p&gt;
    &lt;p&gt;Ryan solved this by writing Queens as a SAT problem, expressing properties like "there is exactly one queen in row 3" as a large number of boolean clauses. &lt;a href="https://ryanberger.me/posts/queens/" target="_blank"&gt;Go read his post, it's pretty cool&lt;/a&gt;. What leapt out to me was that he used &lt;a href="https://cvc5.github.io/" target="_blank"&gt;CVC5&lt;/a&gt;, an &lt;strong&gt;SMT&lt;/strong&gt; solver.&lt;sup id="fnref:SMT"&gt;&lt;a class="footnote-ref" href="#fn:SMT"&gt;2&lt;/a&gt;&lt;/sup&gt; SMT solvers are "higher-level" than SAT, capable of handling more data types than just boolean variables. It's a lot easier to solve the problem at the SMT level than at the SAT level. To show this, I whipped up a short demo of solving the same problem in &lt;a href="https://github.com/Z3Prover/z3/wiki" target="_blank"&gt;Z3&lt;/a&gt; (via the &lt;a href="https://pypi.org/project/z3-solver/" target="_blank"&gt;Python API&lt;/a&gt;).&lt;/p&gt;
    &lt;p&gt;&lt;a href="https://gist.github.com/hwayne/c5de7bc52e733995311236666bedecd3" target="_blank"&gt;Full code here&lt;/a&gt;, which you can compare to Ryan's SAT solution &lt;a href="https://github.com/ryan-berger/queens/blob/master/main.py" target="_blank"&gt;here&lt;/a&gt;. I didn't do a whole lot of cleanup on it (again, time crunch!), but short explanation below.&lt;/p&gt;
    &lt;h3&gt;The code&lt;/h3&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nn"&gt;z3&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="c1"&gt;# type: ignore&lt;/span&gt;
    &lt;span class="kn"&gt;from&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nn"&gt;itertools&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;combinations&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;chain&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;product&lt;/span&gt;
    &lt;span class="n"&gt;solver&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;Solver&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="n"&gt;size&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;9&lt;/span&gt; &lt;span class="c1"&gt;# N&lt;/span&gt;
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;Initial setup and modules. &lt;code&gt;size&lt;/code&gt; is the number of rows/columns/regions in the board, which I'll call &lt;code&gt;N&lt;/code&gt; below.&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="c1"&gt;# queens[n] = col of queen on row n&lt;/span&gt;
    &lt;span class="c1"&gt;# by construction, not on same row&lt;/span&gt;
    &lt;span class="n"&gt;queens&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;IntVector&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'q'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;size&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; 
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;SAT represents the queen positions via N² booleans: &lt;code&gt;q_00&lt;/code&gt; means that a Queen is on row 0 and column 0, &lt;code&gt;!q_05&lt;/code&gt; means a queen &lt;em&gt;isn't&lt;/em&gt; on row 0 col 5, etc. In SMT we can instead encode it as N integers: &lt;code&gt;q_0 = 5&lt;/code&gt; means that the queen on row 0 is positioned at column 5. This immediately enforces one class of constraints for us: we don't need any constraints saying "exactly one queen per row", because that's embedded in the definition of &lt;code&gt;queens&lt;/code&gt;!&lt;/p&gt;
    &lt;p&gt;(Incidentally, using 0-based indexing for the board was a mistake on my part, it makes correctly encoding the regions later really painful.)&lt;/p&gt;
    &lt;p&gt;To actually make the variables &lt;code&gt;[q_0, q_1, …]&lt;/code&gt;, we use the Z3 affordance &lt;code&gt;IntVector(str, n)&lt;/code&gt; for making &lt;code&gt;n&lt;/code&gt; variables at once.&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="n"&gt;solver&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;add&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="n"&gt;And&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;=&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="n"&gt;size&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;queens&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
    &lt;span class="c1"&gt;# not on same column&lt;/span&gt;
    &lt;span class="n"&gt;solver&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;add&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;Distinct&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;queens&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;First we constrain all the integers to &lt;code&gt;[0, N)&lt;/code&gt;, then use the &lt;em&gt;incredibly&lt;/em&gt; handy &lt;code&gt;Distinct&lt;/code&gt; constraint to force all the integers to have different values. This guarantees at most one queen per column, which by the &lt;a href="https://en.wikipedia.org/wiki/Pigeonhole_principle" target="_blank"&gt;pigeonhole principle&lt;/a&gt; means there is exactly one queen per column.&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="c1"&gt;# not diagonally adjacent&lt;/span&gt;
    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;size&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="n"&gt;q1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;q2&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;queens&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="n"&gt;queens&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="o"&gt;+&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
        &lt;span class="n"&gt;solver&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;add&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;Abs&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;q1&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="n"&gt;q2&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;!=&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;One of the rules is that queens can't be adjacent. We already know that they can't be horizontally or vertically adjacent via other constraints, which leaves the diagonals. We only need to add constraints that, for each queen, there is no queen in the lower-left or lower-right corner, aka &lt;code&gt;q_3 != q_2 ± 1&lt;/code&gt;. We don't need to check the top corners because if &lt;code&gt;q_1&lt;/code&gt; is in the upper-left corner of &lt;code&gt;q_2&lt;/code&gt;, then &lt;code&gt;q_2&lt;/code&gt; is in the lower-right corner of &lt;code&gt;q_1&lt;/code&gt;!&lt;/p&gt;
    &lt;p&gt;That covers everything except the "one queen per region" constraint. But the regions are the tricky part, which we should expect because we vary the difficulty of queens games by varying the regions.&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="n"&gt;regions&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="s2"&gt;"purple"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[(&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;6&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;7&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;8&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
                       &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;6&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;7&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;8&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
                       &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;8&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)],&lt;/span&gt;
            &lt;span class="s2"&gt;"red"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;6&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;6&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;7&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;7&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;8&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;8&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;),],&lt;/span&gt;
            &lt;span class="c1"&gt;# you get the picture&lt;/span&gt;
            &lt;span class="p"&gt;}&lt;/span&gt;
    
    &lt;span class="c1"&gt;# Some checking code left out, see below&lt;/span&gt;
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;The region has to be manually coded in, which is a huge pain.&lt;/p&gt;
    &lt;p&gt;(In the link, some validation code follows. Since it breaks up explaining the model I put it in the next section.)&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;r&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;regions&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;values&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
        &lt;span class="n"&gt;solver&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;add&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;Or&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
            &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;queens&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;row&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="n"&gt;col&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;row&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;col&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;r&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
            &lt;span class="p"&gt;))&lt;/span&gt;
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;Finally we have the region constraint. The easiest way I found to say "there is exactly one queen in each region" is to say "there is a queen in region 1 and a queen in region 2 and a queen in region 3" etc." Then to say "there is a queen in region &lt;code&gt;purple&lt;/code&gt;" I wrote "&lt;code&gt;q_0 = 0&lt;/code&gt; OR &lt;code&gt;q_0 = 1&lt;/code&gt; OR … OR &lt;code&gt;q_1 = 0&lt;/code&gt; etc." &lt;/p&gt;
    &lt;p&gt;Why iterate over every position in the region instead of doing something like &lt;code&gt;(0, q[0]) in r&lt;/code&gt;? I tried that but it's not an expression that Z3 supports.&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;solver&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;check&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="n"&gt;sat&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;m&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;solver&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
        &lt;span class="nb"&gt;print&lt;/span&gt;&lt;span class="p"&gt;([(&lt;/span&gt;&lt;span class="n"&gt;l&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;m&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;l&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;l&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;queens&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;Finally, we solve and print the positions. Running this gives me:&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="p"&gt;[(&lt;/span&gt;&lt;span class="n"&gt;q__0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;q__1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;q__2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;8&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; 
     &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;q__3&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;q__4&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;7&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;q__5&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; 
     &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;q__6&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;q__7&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;q__8&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;6&lt;/span&gt;&lt;span class="p"&gt;)]&lt;/span&gt;
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;Which is the correct solution to the queens puzzle. I didn't benchmark the solution times, but I imagine it's considerably slower than a raw SAT solver. &lt;a href="https://github.com/audemard/glucose" target="_blank"&gt;Glucose&lt;/a&gt; is really, really fast.&lt;/p&gt;
    &lt;p&gt;But even so, solving the problem with SMT was a lot &lt;em&gt;easier&lt;/em&gt; than solving it with SAT. That satisfies me as an explanation for why people prefer it to SAT.&lt;/p&gt;
    &lt;h3&gt;Sanity checks&lt;/h3&gt;
    &lt;p&gt;One bit I glossed over earlier was the sanity checking code. I &lt;em&gt;knew for sure&lt;/em&gt; that I was going to make a mistake encoding the &lt;code&gt;region&lt;/code&gt;, and the solver wasn't going to provide useful information abut what I did wrong.  In cases like these, I like adding small tests and checks to catch mistakes early, because the solver certainly isn't going to catch them!&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="n"&gt;all_squares&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;set&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;product&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;size&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="n"&gt;repeat&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
    &lt;span class="k"&gt;def&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nf"&gt;test_i_set_up_problem_right&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
        &lt;span class="k"&gt;assert&lt;/span&gt; &lt;span class="n"&gt;all_squares&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="nb"&gt;set&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;chain&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;from_iterable&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;regions&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;values&lt;/span&gt;&lt;span class="p"&gt;()))&lt;/span&gt;
    
        &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;r1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;r2&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;combinations&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;regions&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;values&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
            &lt;span class="k"&gt;assert&lt;/span&gt; &lt;span class="ow"&gt;not&lt;/span&gt; &lt;span class="nb"&gt;set&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;r1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&lt;/span&gt; &lt;span class="nb"&gt;set&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;r2&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="nb"&gt;set&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;r1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&lt;/span&gt; &lt;span class="nb"&gt;set&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;r2&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;The first check was a quick test that I didn't leave any squares out, or accidentally put the same square in both regions. Converting the values into sets makes both checks a lot easier. Honestly I don't know why I didn't just use sets from the start, sets are great.&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="k"&gt;def&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nf"&gt;render_regions&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
        &lt;span class="n"&gt;colormap&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"purple"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;  &lt;span class="s2"&gt;"red"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;"brown"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;"white"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;"green"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;"yellow"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;"orange"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;"blue"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;"pink"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
        &lt;span class="n"&gt;board&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;_&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;size&lt;/span&gt;&lt;span class="p"&gt;)]&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;_&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;size&lt;/span&gt;&lt;span class="p"&gt;)]&lt;/span&gt; 
        &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;row&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;col&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;all_squares&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;color&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;region&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;regions&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;items&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
                &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;row&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;col&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;region&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
                    &lt;span class="n"&gt;board&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;row&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="n"&gt;col&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;colormap&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;index&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;color&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;+&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;
    
        &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;row&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;board&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="nb"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;""&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;join&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;map&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;row&lt;/span&gt;&lt;span class="p"&gt;)))&lt;/span&gt;
    
    &lt;span class="n"&gt;render_regions&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;The second check is something that prints out the regions. It produces something like this:&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;111111111
    112333999
    122439999
    124437799
    124666779
    124467799
    122467899
    122555889
    112258899
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;I can compare this to the picture of the board to make sure I got it right. I guess a more advanced solution would be to print emoji squares like 🟥 instead.&lt;/p&gt;
    &lt;p&gt;Neither check is quality code but it's throwaway and it gets the job done so eh.&lt;/p&gt;
    &lt;h3&gt;Update for the Internet&lt;/h3&gt;
    &lt;p&gt;This was sent as a weekly newsletter, which is usually on topics like &lt;a href="https://buttondown.com/hillelwayne/archive/why-do-we-call-it-boilerplate-code" target="_blank"&gt;software history&lt;/a&gt;, &lt;a href="https://buttondown.com/hillelwayne/archive/the-seven-specification-ur-languages/" target="_blank"&gt;formal methods&lt;/a&gt;, &lt;a href="https://buttondown.com/hillelwayne/archive/i-formally-modeled-dreidel-for-no-good-reason/" target="_blank"&gt;unusual technologies&lt;/a&gt;, and the &lt;a href="https://buttondown.com/hillelwayne/archive/be-suspicious-of-success/" target="_blank"&gt;theory of software engineering&lt;/a&gt;. You &lt;a href="https://buttondown.email/hillelwayne/" target="_blank"&gt;can subscribe here&lt;/a&gt;.&lt;/p&gt;
    &lt;div class="footnote"&gt;
    &lt;hr/&gt;
    &lt;ol&gt;
    &lt;li id="fn:SAT"&gt;
    &lt;p&gt;"Boolean &lt;strong&gt;SAT&lt;/strong&gt;isfiability Solver", aka a solver that can find assignments that make complex boolean expressions true. I write a bit more about them &lt;a href="https://www.hillelwayne.com/post/np-hard/" target="_blank"&gt;here&lt;/a&gt;. &lt;a class="footnote-backref" href="#fnref:SAT" title="Jump back to footnote 1 in the text"&gt;↩&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id="fn:SMT"&gt;
    &lt;p&gt;"Satisfiability Modulo Theories" &lt;a class="footnote-backref" href="#fnref:SMT" title="Jump back to footnote 2 in the text"&gt;↩&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;/ol&gt;
    &lt;/div&gt;</description>
                <pubDate>Thu, 12 Jun 2025 15:43:25 +0000</pubDate>
                <guid>https://buttondown.com/hillelwayne/archive/solving-linkedin-queens-with-smt/</guid>
            </item>
            <item>
                <title>AI is a gamechanger for TLA+ users</title>
                <link>https://buttondown.com/hillelwayne/archive/ai-is-a-gamechanger-for-tla-users/</link>
                <description>&lt;h3&gt;New Logic for Programmers Release&lt;/h3&gt;
    &lt;p&gt;&lt;a href="https://leanpub.com/logic/" target="_blank"&gt;v0.10 is now available&lt;/a&gt;! This is a minor release, mostly focused on logic-based refactoring, with new material on set types and testing refactors are correct. See the full release notes at &lt;a href="https://github.com/logicforprogrammers/book-assets/blob/master/CHANGELOG.md" target="_blank"&gt;the changelog page&lt;/a&gt;. Due to &lt;a href="https://systemsdistributed.com/" target="_blank"&gt;conference pressure&lt;/a&gt; v0.11 will also likely be a minor release. &lt;/p&gt;
    &lt;p&gt;&lt;img alt="The book cover" class="newsletter-image" src="https://assets.buttondown.email/images/29d4ae9d-bcb9-4d8b-99d4-8a35c0990ad5.jpg?w=300&amp;amp;fit=max"/&gt;&lt;/p&gt;
    &lt;h1&gt;AI is a gamechanger for TLA+ users&lt;/h1&gt;
    &lt;p&gt;&lt;a href="https://lamport.azurewebsites.net/tla/tla.html" target="_blank"&gt;TLA+&lt;/a&gt; is a specification language to model and debug distributed systems. While very powerful, it's also hard for programmers to learn, and there's always questions of connecting specifications with actual code. &lt;/p&gt;
    &lt;p&gt;That's why &lt;a href="https://zfhuang99.github.io/github%20copilot/formal%20verification/tla+/2025/05/24/ai-revolution-in-distributed-systems.html" target="_blank"&gt;The Coming AI Revolution in Distributed Systems&lt;/a&gt; caught my interest. In the post, Cheng Huang claims that Azure successfully used LLMs to examine an existing codebase, derive a TLA+ spec, and find a production bug in that spec. "After a decade of manually crafting TLA+ specifications", he wrote, "I must acknowledge that this AI-generated specification rivals human work".&lt;/p&gt;
    &lt;p&gt;This inspired me to experiment with LLMs in TLA+ myself. My goals are a little less ambitious than Cheng's: I wanted to see how LLMs could help junior specifiers write TLA+, rather than handling the entire spec automatically. Details on what did and didn't work below, but my takeaway is that &lt;strong&gt;LLMs are an immense specification force multiplier.&lt;/strong&gt;&lt;/p&gt;
    &lt;p&gt;All tests were done with a standard VSCode Copilot subscription, writing Claude 3.7 in Agent mode. Other LLMs or IDEs may be more or less effective, etc.&lt;/p&gt;
    &lt;h2&gt;Things Claude was good at&lt;/h2&gt;
    &lt;h3&gt;Fixing syntax errors&lt;/h3&gt;
    &lt;p&gt;TLA+ uses a very different syntax than mainstream programming languages, meaning beginners make a lot of mistakes where they do a "programming syntax" instead of TLA+ syntax:&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;NotThree(x) = \* should be ==, not =
        x != 3 \* should be #, not !=
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;The problem is that the TLA+ syntax checker, SANY, is 30 years old and doesn't provide good information. Here's what it says for that snippet:&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;Was expecting "==== or more Module body"
    Encountered "NotThree" at line 6, column 1
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;That only isolates one error and doesn't tell us what the problem is, only where it is. Experienced TLA+ users get "error eyes" and can quickly see what the problem is, but beginners really struggle with this.&lt;/p&gt;
    &lt;p&gt;The TLA+ foundation has made LLM integration a priority, so the VSCode extension &lt;a href="https://github.com/tlaplus/vscode-tlaplus/blob/master/src/main.ts#L174" target="_blank"&gt;naturally supports several agents actions&lt;/a&gt;. One of these is running SANY, meaning an agent can get an error, fix it, get another error, fix it, etc. Provided the above sample and asked to make it work, Claude successfully fixed both errors. It also fixed many errors in a larger spec, as well as figure out why PlusCal specs weren't compiling to TLA+.&lt;/p&gt;
    &lt;p&gt;This by itself is already enough to make LLMs a worthwhile tool, as it fixes one of the biggest barriers to entry.&lt;/p&gt;
    &lt;h3&gt;Understanding error traces&lt;/h3&gt;
    &lt;p&gt;When TLA+ finds a violated property, it outputs the sequence of steps that leads to the error. This starts in plaintext, and VSCode parses it into an interactive table:&lt;/p&gt;
    &lt;p&gt;&lt;img alt="An example error trace" class="newsletter-image" src="https://assets.buttondown.email/images/f7f16d0e-c61f-4286-ae49-67e03f844126.png?w=960&amp;amp;fit=max"/&gt;&lt;/p&gt;
    &lt;p&gt;Learning to read these error traces is a skill in itself. You have to understand what's happening in each step and how it relates back to the actually broken property. It takes a long time for people to learn how to do this well.&lt;/p&gt;
    &lt;p&gt;Claude was successful here, too, accurately reading 20+ step error traces and giving a high-level explanation of what went wrong. It also could condense error traces: if ten steps of the error trace could be condensed into a one-sentence summary (which can happen if you're modeling a lot of process internals) Claude would do it.&lt;/p&gt;
    &lt;p&gt;I did have issues here with doing this in agent mode: while the extension does provide a "run model checker" command, the agent would regularly ignore this and prefer to run a terminal command instead. This would be fine except that the LLM consistently hallucinated invalid commands. I had to amend every prompt with "run the model checker via vscode, do not use a terminal command". You can skip this if you're willing to copy and paste the error trace into the prompt.&lt;/p&gt;
    &lt;p&gt;As with syntax checking, if this was the &lt;em&gt;only&lt;/em&gt; thing LLMs could effectively do, that would already be enough&lt;sup id="fnref:dayenu"&gt;&lt;a class="footnote-ref" href="#fn:dayenu"&gt;1&lt;/a&gt;&lt;/sup&gt; to earn a strong recommend. Even as a TLA+ expert I expect I'll be using this trick regularly. &lt;/p&gt;
    &lt;h3&gt;Boilerplate tasks&lt;/h3&gt;
    &lt;p&gt;TLA+ has a lot of boilerplate. One of the most notorious examples is &lt;code&gt;UNCHANGED&lt;/code&gt; rules. Specifications are extremely precise — so precise that you have to specify what variables &lt;em&gt;don't&lt;/em&gt; change in every step. This takes the form of an &lt;code&gt;UNCHANGED&lt;/code&gt; clause at the end of relevant actions:&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;RemoveObjectFromStore(srv, o, s) ==
      /\ o \in stored[s]
      /\ stored' = [stored EXCEPT ![s] = @ \ {o}]
      /\ UNCHANGED &amp;lt;&amp;lt;capacity, log, objectsize, pc&amp;gt;&amp;gt;
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;Writing this is really annoying. Updating these whenever you change an action, or add a new variable to the spec, is doubly so. Syntax checking and error analysis are important for beginners, but this is what I wanted for &lt;em&gt;myself&lt;/em&gt;. I took a spec and prompted Claude&lt;/p&gt;
    &lt;blockquote&gt;
    &lt;p&gt;Add UNCHANGED &amp;lt;&lt;v1, etc="" v2,=""&gt;&amp;gt; for each variable not changed in an action.&lt;/v1,&gt;&lt;/p&gt;
    &lt;/blockquote&gt;
    &lt;p&gt;And it worked! It successfully updated the &lt;code&gt;UNCHANGED&lt;/code&gt; in every action. &lt;/p&gt;
    &lt;p&gt;(Note, though, that it was a "well-behaved" spec in this regard: only one "action" happened at a time. In TLA+ you can have two actions happen simultaneously, that each update half of the variables, meaning neither of them should have an &lt;code&gt;UNCHANGED&lt;/code&gt; clause. I haven't tested how Claude handles that!)&lt;/p&gt;
    &lt;p&gt;That's the most obvious win, but Claude was good at handling other tedious work, too. Some examples include updating &lt;code&gt;vars&lt;/code&gt; (the conventional collection of all state variables), lifting a hard-coded value into a model parameter, and changing data formats. Most impressive to me, though, was rewriting a spec designed for one process to instead handle multiple processes. This means taking all of the process variables, which originally have types like &lt;code&gt;Int&lt;/code&gt;, converting them to types like &lt;code&gt;[Process -&amp;gt; Int]&lt;/code&gt;, and then updating the uses of all of those variables in the spec. It didn't account for race conditions in the new concurrent behavior, but it was an excellent scaffold to do more work.&lt;/p&gt;
    &lt;h3&gt;Writing properties from an informal description&lt;/h3&gt;
    &lt;p&gt;You have to be pretty precise with your intended property description but it handles converting that precise description into TLA+'s formalized syntax, which is something beginners often struggle with.&lt;/p&gt;
    &lt;h2&gt;Things it is less good at&lt;/h2&gt;
    &lt;h3&gt;Generating model config files&lt;/h3&gt;
    &lt;p&gt;To model check TLA+, you need both a specification (&lt;code&gt;.tla&lt;/code&gt;) and a model config file (&lt;code&gt;.cfg&lt;/code&gt;), which have separate syntaxes. Asking the agent to generate the second often lead to it using TLA+ syntax. It automatically fixed this after getting parsing errors, though. &lt;/p&gt;
    &lt;h3&gt;Fixing specs&lt;/h3&gt;
    &lt;p&gt;Whenever the ran model checking and discovered a bug, it would naturally propose a change to either the invalid property or the spec. Sometimes the changes were good, other times the changes were not physically realizable. For example, if it found that a bug was due to a race condition between processes, it would often suggest fixing it by saying race conditions were okay. I mean yes, if you say bugs are okay, then the spec finds that bugs are okay! Or it would alternatively suggest adding a constraint to the spec saying that race conditions don't happen. &lt;a href="https://www.hillelwayne.com/post/alloy-facts/" target="_blank"&gt;But that's a huge mistake in specification&lt;/a&gt;, because race conditions happen if we don't have coordination. We need to specify the &lt;em&gt;mechanism&lt;/em&gt; that is supposed to prevent them.&lt;/p&gt;
    &lt;h3&gt;Finding properties of the spec&lt;/h3&gt;
    &lt;p&gt;After seeing how capable it was at translating my properties to TLA+, I started prompting Claude to come up with properties on its own. Unfortunately, almost everything I got back was either trivial, uninteresting, or too coupled to implementation details. I haven't tested if it would work better to ask it for "properties that may be violated".&lt;/p&gt;
    &lt;h3&gt;Generating code from specs&lt;/h3&gt;
    &lt;p&gt;I have to be specific here: Claude &lt;em&gt;could&lt;/em&gt; sometimes convert Python into a passable spec, an vice versa. It &lt;em&gt;wasn't&lt;/em&gt; good at recognizing abstraction. For example, TLA+ specifications often represent sequential operations with a state variable, commonly called &lt;code&gt;pc&lt;/code&gt;. If modeling code that nonatomically retrieves a counter value and increments it, we'd have one action that requires &lt;code&gt;pc = "Get"&lt;/code&gt; and sets the new value to &lt;code&gt;"Inc"&lt;/code&gt;, then another that requires it be &lt;code&gt;"Inc"&lt;/code&gt; and sets it to &lt;code&gt;"Done"&lt;/code&gt;.&lt;/p&gt;
    &lt;p&gt;I found that Claude would try to somehow convert &lt;code&gt;pc&lt;/code&gt; into part of the Python program's state, rather than recognize it as a TLA+ abstraction. On the other side, when converting python code to TLA+ it would often try to translate things like &lt;code&gt;sleep&lt;/code&gt; into some part of the spec, not recognizing that it is abstractable into a distinct action. I didn't test other possible misconceptions, like converting randomness to nondeterminism.&lt;/p&gt;
    &lt;p&gt;For the record, when converting TLA+ to Python Claude tended to make simulators of the spec, rather than possible production code implementing the spec. I really wasn't expecting otherwise though.&lt;/p&gt;
    &lt;h2&gt;Unexplored Applications&lt;/h2&gt;
    &lt;p&gt;Things I haven't explored thoroughly but could possibly be effective, based on what I know about TLA+ and AI:&lt;/p&gt;
    &lt;h3&gt;Writing Java Overrides&lt;/h3&gt;
    &lt;p&gt;Most TLA+ operators are resolved via TLA+ interpreters, but you can also implement them in "native" Java. This lets you escape the standard language semantics and add capabilities like &lt;a href="https://github.com/tlaplus/CommunityModules/blob/master/modules/IOUtils.tla" target="_blank"&gt;executing programs during model-checking&lt;/a&gt; or &lt;a href="https://github.com/tlaplus/tlaplus/blob/master/tlatools/org.lamport.tlatools/src/tla2sany/StandardModules/TLC.tla#L62" target="_blank"&gt;dynamically constrain the depth of the searched state space&lt;/a&gt;. There's a lot of cool things I think would be possible with overrides. The problem is there's only a handful of people in the world who know how to write them. But that handful have written quite a few overrides and I think there's enough there for Claude to work with. &lt;/p&gt;
    &lt;h3&gt;Writing specs, given a reference mechanism&lt;/h3&gt;
    &lt;p&gt;In all my experiments, the LLM only had my prompts and the occasional Python script as information. That makes me suspect that some of its problems with writing and fixing specs come down to not having a system model. Maybe it wouldn't suggest fixes like "these processes never race" if it had a design doc saying that the processes can't coordinate. &lt;/p&gt;
    &lt;p&gt;(Could a Sufficiently Powerful LLM derive some TLA+ specification from a design document?)&lt;/p&gt;
    &lt;h3&gt;Connecting specs and code&lt;/h3&gt;
    &lt;p&gt;This is the holy grail of TLA+: taking a codebase and showing it correctly implements a spec. Currently the best ways to do this are by either using TLA+ to generate a test suite, or by taking logged production traces and matching them to TLA+ behaviors. &lt;a href="https://www.mongodb.com/blog/post/engineering/conformance-checking-at-mongodb-testing-our-code-matches-our-tla-specs" target="_blank"&gt;This blog post discusses both&lt;/a&gt;. While I've seen a lot of academic research into these approaches there are no industry-ready tools. So if you want trace validation you have to do a lot of manual labour tailored to your specific product. &lt;/p&gt;
    &lt;p&gt;If LLMs could do some of this work for us then that'd really amplify the usefulness of TLA+ to many companies.&lt;/p&gt;
    &lt;h2&gt;Thoughts&lt;/h2&gt;
    &lt;p&gt;&lt;em&gt;Right now&lt;/em&gt;, agents seem good at the tedious and routine parts of TLA+ and worse at the strategic and abstraction parts. But, since the routine parts are often a huge barrier to beginners, this means that LLMs have the potential to make TLA+ far, far more accessible than it previously was.&lt;/p&gt;
    &lt;p&gt;I have mixed thoughts on this. As an &lt;em&gt;advocate&lt;/em&gt;, this is incredible. I want more people using formal specifications because I believe it leads to cheaper, safer, more reliable software. Anything that gets people comfortable with specs is great for our industry. As a &lt;em&gt;professional TLA+ consultant&lt;/em&gt;, I'm worried that this obsoletes me. Most of my income comes from training and coaching, which companies will have far less demand of now. Then again, maybe this an opportunity to pitch "agentic TLA+ training" to companies!&lt;/p&gt;
    &lt;p&gt;Anyway, if you're interested in TLA+, there has never been a better time to try it. I mean it, these tools handle so much of the hard part now. I've got a &lt;a href="https://learntla.com/" target="_blank"&gt;free book available online&lt;/a&gt;, as does &lt;a href="https://lamport.azurewebsites.net/tla/book.html" target="_blank"&gt;the inventor of TLA+&lt;/a&gt;. I like &lt;a href="https://elliotswart.github.io/pragmaticformalmodeling/" target="_blank"&gt;this guide too&lt;/a&gt;. Happy modeling!&lt;/p&gt;
    &lt;div class="footnote"&gt;
    &lt;hr/&gt;
    &lt;ol&gt;
    &lt;li id="fn:dayenu"&gt;
    &lt;p&gt;Dayenu. &lt;a class="footnote-backref" href="#fnref:dayenu" title="Jump back to footnote 1 in the text"&gt;↩&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;/ol&gt;
    &lt;/div&gt;</description>
                <pubDate>Thu, 05 Jun 2025 14:59:11 +0000</pubDate>
                <guid>https://buttondown.com/hillelwayne/archive/ai-is-a-gamechanger-for-tla-users/</guid>
            </item>
        </channel>
    </rss>
    Raw text
    <?xml version="1.0" encoding="utf-8"?>
    <rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Computer Things</title><link>https://buttondown.com/hillelwayne</link><description>&lt;!-- buttondown-editor-mode: fancy --&gt;&lt;p&gt;Hi, I'm Hillel. This is the newsletter version of &lt;a target="_blank" rel="noopener noreferrer nofollow" href="https://www.hillelwayne.com"&gt;my website&lt;/a&gt;. I post all website updates here. I also post weekly content just for the newsletter, on topics like&lt;/p&gt;&lt;ul&gt;&lt;li&gt;&lt;p&gt;Formal Methods&lt;/p&gt;&lt;/li&gt;&lt;li&gt;&lt;p&gt;Software History and Culture&lt;/p&gt;&lt;/li&gt;&lt;li&gt;&lt;p&gt;Fringetech and exotic tooling&lt;/p&gt;&lt;/li&gt;&lt;li&gt;&lt;p&gt;The philosophy and theory of software engineering&lt;/p&gt;&lt;/li&gt;&lt;/ul&gt;&lt;p&gt;You can see the archive of all public essays &lt;a target="_blank" rel="noopener noreferrer nofollow" href="https://buttondown.email/hillelwayne/archive/"&gt;here&lt;/a&gt;.&lt;/p&gt;</description><atom:link href="https://buttondown.email/hillelwayne/rss" rel="self"/><language>en-us</language><lastBuildDate>Tue, 10 Mar 2026 17:12:30 +0000</lastBuildDate><item><title>LLMs are bad at vibing specifications</title><link>https://buttondown.com/hillelwayne/archive/llms-are-bad-at-vibing-specifications/</link><description>
    &lt;h3&gt;No newsletter next week&lt;/h3&gt;
    &lt;p&gt;I'll be speaking at &lt;a href="https://qconlondon.com/" target="_blank"&gt;InfoQ London&lt;/a&gt;. But see below for a book giveaway!&lt;/p&gt;
    &lt;hr /&gt;
    &lt;h1&gt;LLMs are bad at vibing specifications&lt;/h1&gt;
    &lt;p&gt;About a year ago I wrote &lt;a href="https://buttondown.com/hillelwayne/archive/ai-is-a-gamechanger-for-tla-users/" target="_blank"&gt;AI is a gamechanger for TLA+ users&lt;/a&gt;, which argued that AI are a "specification force multiplier". That was written from the perspective an TLA+ expert using these tools. A full &lt;a href="https://github.com/search?q=path%3A*.tla+NOT+is%3Afork+claude&amp;amp;type=code" target="_blank"&gt;4% of Github TLA+ specs&lt;/a&gt; now have the word "Claude" somewhere in them. This is interesting to me, because it suggests there was always an interest in formal methods, people just lacked the skills to do it.  &lt;/p&gt;
    &lt;p&gt;It's also interesting because it gives me a sense of what happens when beginners use AI to write formal specs. It's not good.&lt;/p&gt;
    &lt;p&gt;As a case study, we'll use &lt;a href="https://github.com/myProjectsRavi/sentinel-protocol/tree/main/docs/formal/specs" target="_blank"&gt;this project&lt;/a&gt;, which is kind of enough to have vibed out TLA+ and Alloy specs.&lt;/p&gt;
    &lt;h3&gt;Looking at a project&lt;/h3&gt;
    &lt;p&gt;&lt;a href="https://github.com/myProjectsRavi/sentinel-protocol/blob/main/docs/formal/specs/threat-intel-mesh.als" target="_blank"&gt;Starting with the Alloy spec&lt;/a&gt;. Here it is in its entirety:&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;module ThreatIntelMesh
    
    sig Node {}
    
    one sig LocalNode extends Node {}
    
    sig Snapshot {
      owner: one Node,
      signed: one Bool,
      signatures: set Signature
    }
    
    sig Signature {}
    
    sig Policy {
      allowUnsignedImport: one Bool
    }
    
    pred canImport[p: Policy, s: Snapshot] {
      (p.allowUnsignedImport = True) or (s.signed = True)
    }
    
    assert UnsignedImportMustBeDenied {
      all p: Policy, s: Snapshot |
        p.allowUnsignedImport = False and s.signed = False implies not canImport[p, s]
    }
    
    assert SignedImportMayBeAccepted {
      all p: Policy, s: Snapshot |
        s.signed = True implies canImport[p, s]
    }
    
    check UnsignedImportMustBeDenied for 5
    check SignedImportMayBeAccepted for 5
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    
    &lt;p class="empty-line" style="height:16px; margin:0px !important;"&gt;&lt;/p&gt;
    &lt;p&gt;Couple of things to note here: first of all, this doesn't actually compile. It's using the &lt;a href="https://alloy.readthedocs.io/en/latest/modules/boolean.html" target="_blank"&gt;Boolean&lt;/a&gt; standard module so needs &lt;code&gt;open util/boolean&lt;/code&gt; to function. Second, Boolean is the wrong approach here; you're supposed to use subtyping. &lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;sig Snapshot {
    &lt;span class="w"&gt; &lt;/span&gt; owner: one Node,
    &lt;span class="gd"&gt;- signed: one Bool,&lt;/span&gt;
    &lt;span class="w"&gt; &lt;/span&gt; signatures: set Signature
    }
    
    &lt;span class="gi"&gt;+ sig SignedSnapshot in Snapshot {}&lt;/span&gt;
    
    
    pred canImport[p: Policy, s: Snapshot] {
    &lt;span class="gd"&gt;- s.signed = True&lt;/span&gt;
    &lt;span class="gi"&gt;+ s in SignedSnapshot&lt;/span&gt;
    }
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    
    &lt;p&gt;So we know the person did not actually run these specs. This is &lt;em&gt;somewhat&lt;/em&gt; less of a problem in TLA+, which has an official MCP server that lets the agent run model checking. Even so, I regularly see specs that I'm pretty sure won't model check, with things like using &lt;code&gt;Reals&lt;/code&gt; or assuming &lt;code&gt;NULL&lt;/code&gt; is a built-in and not a user-defined constant.&lt;/p&gt;
    &lt;p&gt;The bigger problem with the spec is that &lt;code&gt;UnsignedImportMustBeDenied&lt;/code&gt; and &lt;code&gt;SignedImportMayBeAccepted&lt;/code&gt; &lt;em&gt;don't actually do anything&lt;/em&gt;. &lt;code&gt;canImport&lt;/code&gt; is defined as &lt;code&gt;P || Q&lt;/code&gt;. &lt;code&gt;UnsignedImportMustBeDenied&lt;/code&gt; checks that &lt;code&gt;!P &amp;amp;&amp;amp; !Q =&amp;gt; !canImport&lt;/code&gt;. &lt;code&gt;SignedImportMayBeAccepted&lt;/code&gt; checks that &lt;code&gt;P =&amp;gt; canImport&lt;/code&gt;. These are tautologically true! If they do anything at all, it is only checking that &lt;code&gt;canImport&lt;/code&gt; was defined correctly. &lt;/p&gt;
    &lt;p&gt;You see the same thing in the &lt;a href="https://github.com/myProjectsRavi/sentinel-protocol/blob/main/docs/formal/specs/serialization-firewall.tla" target="_blank"&gt;TLA+ specs&lt;/a&gt;, too:&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;GadgetPayload ==
      /\ gadgetDetected&amp;#39; = TRUE
      /\ depth&amp;#39; \in 0..(MaxDepth + 5)
      /\ UNCHANGED allowlistedFormat
      /\ decision&amp;#39; = &amp;quot;block&amp;quot;
    
    NoExploitAllowed == gadgetDetected =&amp;gt; decision = &amp;quot;block&amp;quot;
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    
    &lt;p class="empty-line" style="height:16px; margin:0px !important;"&gt;&lt;/p&gt;
    &lt;p&gt;The AI is only writing "obvious properties", which fail for reasons like "we missed a guard clause" or "we forgot to update a variable". It does not seem to be good at writing "subtle" properties that fail due to concurrency, nondeterminism, or bad behavior separated by several steps. Obvious properties are useful for orienting yourself and ensuring the system behaves like you expect, but the actual value in using formal methods comes from the subtle properties. &lt;/p&gt;
    &lt;p&gt;(This ties into &lt;a href="https://buttondown.com/hillelwayne/archive/some-tests-are-stronger-than-others/" target="_blank"&gt;Strong and Weak Properties&lt;/a&gt;. LLM properties are weak, intended properties need to be strong.)&lt;/p&gt;
    &lt;p&gt;This is a problem I see in almost every FM spec written by AI. LLMs aren't doing one of the core features of a spec. Articles like &lt;a href="https://martin.kleppmann.com/2025/12/08/ai-formal-verification.html" target="_blank"&gt;Prediction: AI will make formal verification go mainstream&lt;/a&gt; and &lt;a href="https://leodemoura.github.io/blog/2026/02/28/when-ai-writes-the-worlds-software.html" target="_blank"&gt;When AI Writes the World's Software, Who Verifies It?&lt;/a&gt; argue that LLMs will make formal methods go mainstream, but being easily able to write specifications doesn't help with correctness if the specs don't actually verify anything.&lt;/p&gt;
    &lt;h3&gt;Is this a user error?&lt;/h3&gt;
    &lt;div class="subscribe-form"&gt;&lt;/div&gt;
    &lt;p&gt;I first got interested in LLMs and TLA+ from &lt;a href="https://zfhuang99.github.io/github%20copilot/formal%20verification/tla+/2025/05/24/ai-revolution-in-distributed-systems.html" target="_blank"&gt;The Coming AI Revolution in Distributed Systems&lt;/a&gt;. The author of that later &lt;a href="https://github.com/zfhuang99/lamport-agent/blob/main/spec/CRAQ/CRAQ.tla" target="_blank"&gt;vibecoded a spec&lt;/a&gt; with a considerably more complex property:&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;NoStaleStrictRead ==
      \A i \in 1..Len(eventLog) :
        LET ev == eventLog[i] IN
          ev.type = &amp;quot;read&amp;quot; =&amp;gt;
            LET c == ev.chunk IN
            LET v == ev.version IN
            /\ \A j \in 1..i :
                 LET evC == eventLog[j] IN
                   evC.type = &amp;quot;commit&amp;quot; /\ evC.chunk = c =&amp;gt; evC.version &amp;lt;= v
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    
    &lt;p&gt;This is a lot more complicated than the &lt;code&gt;(P =&amp;gt; Q &amp;amp;&amp;amp; P) =&amp;gt; Q&lt;/code&gt; properties I've seen! It could be because &lt;a href="https://github.com/deepseek-ai/3FS/tree/main/specs/DataStorage" target="_blank"&gt;the corresponding system already had a complete spec written in P&lt;/a&gt;. But it could also be that Cheng Huang is already an expert specifier, meaning he can get more out of an LLM than an ordinary developer can. I've also noticed that I can usually coax an LLM to do more interesting things than most of my clients can. Which is good for my current livelihood, but bad for the hope of LLMs making formal methods mainstream. If you need to know formal methods to get the LLM to do formal methods, is that really helping?&lt;/p&gt;
    &lt;p&gt;(Yes, if it lowers the skill threshold-- means you can apply FM with 20 hours of practice instead of 80. But the jury's still out on how &lt;em&gt;much&lt;/em&gt; it lowers the threshold. What if it only lowers it from 80 to 75?) &lt;/p&gt;
    &lt;p&gt;On the other hand, there also seem to be some properties that AI struggles with, even with explicit instructions. Last week a client and I tried to get Claude to generate a good &lt;a href="https://www.hillelwayne.com/post/safety-and-liveness/" target="_blank"&gt;liveness&lt;/a&gt; or &lt;a href="https://www.hillelwayne.com/post/action-properties/" target="_blank"&gt;action&lt;/a&gt; property instead of a standard obvious invariant, and it just couldn't. Training data issue? Something in the innate complexity of liveness? It's not clear yet. These properties are even more "subtle" than most invariants, so maybe that's it.&lt;/p&gt;
    &lt;p&gt;On the other other hand, this is all as of March 2026. Maybe this whole article will be laughably obsolete by June. &lt;/p&gt;
    &lt;hr /&gt;
    &lt;h3&gt;&lt;a href="https://logicforprogrammers.com" target="_blank"&gt;Logic for Programmers&lt;/a&gt; Giveaway&lt;/h3&gt;
    &lt;p&gt;Last week's giveaway raised a few issues. First, the New World copies were all taken before all of the emails went out, so a lot of people did not even get a chance to try for a book. Second, due to a Leanpub bug the Europe coupon scheduled for 10 AM UTC actually activated at 10 AM my time, which was early evening for Europe. Third, everybody in the APAC region got left out.&lt;/p&gt;
    &lt;p&gt;So, since I'm not doing a newsletter next week, let's have another giveaway:&lt;/p&gt;
    &lt;ul&gt;
    &lt;li&gt;&lt;a href="https://leanpub.com/logic/c/E5A55F7B482C3" target="_blank"&gt;This coupon&lt;/a&gt; will go up 2026-03-16 at 11:00 UTC, which should be noon Central European Time, and be good for ten books (five for this giveaway, five to account for last week's bug).&lt;/li&gt;
    &lt;li&gt;&lt;a href="https://leanpub.com/logic/c/ADC664C95B6D1" target="_blank"&gt;This coupon&lt;/a&gt; will go up 2026-03-17 at 04:00 UTC, which should be noon Beijing Time, and be good for five books.&lt;/li&gt;
    &lt;li&gt;&lt;a href="https://leanpub.com/logic/c/U1250212A9070" target="_blank"&gt;This coupon&lt;/a&gt; will go up 2026-03-17 at 17:00 UTC, which should be noon Central US Time, and also be good for five books.&lt;/li&gt;
    &lt;/ul&gt;
    &lt;p&gt;I think that gives the best chance of everybody getting at least a chance of a book, while being resilient to timezone shenanigans due to travel / Leanpub dropping bugfixes / daylight savings / whatever. &lt;/p&gt;
    &lt;p&gt;(No guarantees that later "no newsletter" weeks will have giveaways! This is a gimmick)&lt;/p&gt;
    </description><pubDate>Tue, 10 Mar 2026 17:12:30 +0000</pubDate><guid>https://buttondown.com/hillelwayne/archive/llms-are-bad-at-vibing-specifications/</guid></item><item><title>Free Books</title><link>https://buttondown.com/hillelwayne/archive/free-books/</link><description>
    &lt;p&gt;Spinning a &lt;a href="https://www.youtube.com/watch?v=NB4hzg4k7_A" target="_blank"&gt;lot of plates&lt;/a&gt; this week so skipping the newsletter. As an apology, have ten free copies of &lt;em&gt;Logic for Programmers&lt;/em&gt;.&lt;/p&gt;
    &lt;ul&gt;
    &lt;li&gt;&lt;a href="https://leanpub.com/logic/c/EBDFA51B15C1" target="_blank"&gt;These five&lt;/a&gt; are available now.&lt;/li&gt;
    &lt;li&gt;&lt;del&gt;&lt;a href="https://leanpub.com/logic/c/5A55F7B482C3" target="_blank"&gt;These five&lt;/a&gt; &lt;em&gt;should&lt;/em&gt; be available at 10:30 AM CEST tomorrow, so people in Europe have a better chance of nabbing one.&lt;/del&gt; Nevermind Leanpub had a bug that made this not work properly&lt;/li&gt;
    &lt;/ul&gt;
    </description><pubDate>Tue, 03 Mar 2026 16:34:33 +0000</pubDate><guid>https://buttondown.com/hillelwayne/archive/free-books/</guid></item><item><title>New Blog Post: Some Silly Z3 Scripts I Wrote</title><link>https://buttondown.com/hillelwayne/archive/new-blog-post-some-silly-z3-scripts-i-wrote/</link><description>
    &lt;p&gt;Now that I'm not spending all my time on Logic for Programmers, I have time to update my website again! So here's the first blog post in five months: &lt;a href="https://www.hillelwayne.com/post/z3-examples/" target="_blank"&gt;Some Silly Z3 Scripts I Wrote&lt;/a&gt;.&lt;/p&gt;
    &lt;p&gt;Normally I'd also put a link to the Patreon notes but I've decided I don't like publishing gated content and am going to wind that whole thing down. So some quick notes about this post:&lt;/p&gt;
    &lt;ul&gt;
    &lt;li&gt;Part of the point is admittedly to hype up the eventual release of LfP. I want to start marketing the book, but don't want the marketing material to be devoid of interest, so tangentially-related-but-independent blog posts are a good place to start.&lt;/li&gt;
    &lt;li&gt;The post discusses the concept of "chaff", the enormous quantity of material (both code samples and prose) that didn't make it into the book. The book is about 50,000 words… and considerably shorter than the total volume of chaff! I don't &lt;em&gt;think&lt;/em&gt; most of it can be turned into useful public posts, but I'm not entirely opposed to the idea. Maybe some of the old chapters could be made into something?&lt;/li&gt;
    &lt;li&gt;Coming up with a conditioned mathematical property to prove was a struggle. I had two candidates: &lt;code&gt;a == b * c =&amp;gt; a / b == c&lt;/code&gt;, which would have required a long tangent on how division must be total in Z3, and  &lt;code&gt;a != 0 =&amp;gt; some b: b * a == 1&lt;/code&gt;, which would have required introducing a quantifier (SMT is real weird about quantifiers). Division by zero has already caused me enough grief so I went with the latter. This did mean I had to reintroduce "operations must be total" when talking about arrays.&lt;/li&gt;
    &lt;li&gt;I have no idea why the array example returns &lt;code&gt;2&lt;/code&gt; for the max profit and not &lt;code&gt;99999999&lt;/code&gt;. I'm guessing there's some short circuiting logic in the optimizer when the problem is ill-defined?&lt;/li&gt;
    &lt;li&gt;One example I could not get working, which is unfortunate, was a demonstration of how SMT solvers are undecidable via encoding Goldbach's conjecture as an SMT problem. Anything with multiple nested quantifiers is a pain.&lt;/li&gt;
    &lt;/ul&gt;
    </description><pubDate>Mon, 23 Feb 2026 16:49:10 +0000</pubDate><guid>https://buttondown.com/hillelwayne/archive/new-blog-post-some-silly-z3-scripts-i-wrote/</guid></item><item><title>Stream of Consciousness Driven Development</title><link>https://buttondown.com/hillelwayne/archive/stream-of-consciousness-driven-development/</link><description>
    &lt;p&gt;This is something I just tried out last week but it seems to have enough potential to be worth showing unpolished. I was pairing with a client on writing a spec. I saw a problem with the spec, a convoluted way of fixing the spec. Instead of trying to verbally explain it, I started by creating a new markdown file:&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;NameOfProblem.md
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    
    &lt;p&gt;Then I started typing. First the problem summary, then a detailed description, then the solution and why it worked. When my partner asked questions, I incorporated his question and our discussion of it into the flow. If we hit a dead end with the solution, we marked it out as a dead end. Eventually the file looked something like this:&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;Current state of spec
    Problems caused by this
        Elaboration of problems
        What we tried that didn&amp;#39;t work
    Proposed Solution
        Theory behind proposed solution
        How the solution works
        Expected changes
        Other problems this helps solve
        Problems this does *not* help with
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    
    &lt;p&gt;Only once this was done, my partner fully understood the chain of thought, &lt;em&gt;and&lt;/em&gt; we agreed it represented the right approach, did we start making changes to the spec. &lt;/p&gt;
    &lt;h3&gt;How is this better than just making the change?&lt;/h3&gt;
    &lt;p&gt;The change was &lt;em&gt;conceptually&lt;/em&gt; complex. A rough analogy: imagine pairing with a beginner who wrote an insertion sort, and you want to replace it with quicksort. You need to explain why the insertion sort is too slow, why the quicksort isn't slow, and how quicksort actually correctly sorts a list. This could involve tangents into computational complexity, big-o notation, recursion, etc. These are all concepts you have internalized, so the change is simple to you, but the solution uses concepts the beginner does not know. So it's conceptually complex to them.&lt;/p&gt;
    &lt;p&gt;I wasn't pairing with a beginning programmer or even a beginning specifier. This was a client who could confidently write complex specs on their own. But they don't work on specifications full time like I do. Any time there's a relative gap in experience in a pair, there's solutions that are conceptually simple to one person and complex to the other.&lt;/p&gt;
    &lt;p&gt;I've noticed too often that when one person doesn't fully understand the concepts behind a change, they just go "you're the expert, I trust you." That eventually leads to a totally unmaintainable spec. Hence, writing it all out. &lt;/p&gt;
    &lt;p&gt;As I said before, I've only tried this once (though I've successfully used a similar idea when teaching workshops). It worked pretty well, though! Just be prepared for a lot of typing.&lt;/p&gt;
    </description><pubDate>Wed, 18 Feb 2026 16:33:08 +0000</pubDate><guid>https://buttondown.com/hillelwayne/archive/stream-of-consciousness-driven-development/</guid></item><item><title>Proving What's Possible</title><link>https://buttondown.com/hillelwayne/archive/proving-whats-possible/</link><description>
    &lt;p&gt;As a formal methods consultant I have to mathematically express properties of systems. I generally do this with two "temporal operators": &lt;/p&gt;
    &lt;ul&gt;
    &lt;li&gt;A(x) means that &lt;code&gt;x&lt;/code&gt; is always true. For example, a database table &lt;em&gt;always&lt;/em&gt; satisfies all record-level constraints, and a state machine &lt;em&gt;always&lt;/em&gt; makes valid transitions between states. If &lt;code&gt;x&lt;/code&gt; is a statement about an individual state (as in the database but not state machine example), we further call it an &lt;strong&gt;invariant&lt;/strong&gt;.&lt;/li&gt;
    &lt;li&gt;E(x) means that &lt;code&gt;x&lt;/code&gt; is "eventually" true, conventionally meaning "guaranteed true at some point in the future". A database transaction &lt;em&gt;eventually&lt;/em&gt; completes or rolls back, a state machine &lt;em&gt;eventually&lt;/em&gt; reaches the "done" state, etc. &lt;/li&gt;
    &lt;/ul&gt;
    &lt;p&gt;These come from linear temporal logic, which is the mainstream notation for expressing system properties. &lt;sup id="fnref:modal"&gt;&lt;a class="footnote-ref" href="#fn:modal"&gt;1&lt;/a&gt;&lt;/sup&gt; We like these operators because they elegantly cover &lt;a href="https://www.hillelwayne.com/post/safety-and-liveness/" target="_blank"&gt;safety and liveness properties&lt;/a&gt;, and because &lt;a href="https://buttondown.com/hillelwayne/archive/formalizing-stability-and-resilience-properties/" target="_blank"&gt;we can combine them&lt;/a&gt;. &lt;code&gt;A(E(x))&lt;/code&gt; means &lt;code&gt;x&lt;/code&gt; is true an infinite number of times, while &lt;code&gt;A(x =&amp;gt; E(y)&lt;/code&gt; means that &lt;code&gt;x&lt;/code&gt; being true guarantees &lt;code&gt;y&lt;/code&gt; true in the future. &lt;/p&gt;
    &lt;p&gt;There's a third class of properties, that I will call &lt;em&gt;possibility&lt;/em&gt; properties: &lt;code&gt;P(x)&lt;/code&gt; is "can x happen in this model"? Is it possible for a table to have more than ten records? Can a state machine transition from "Done" to "Retry", even if it &lt;em&gt;doesn't&lt;/em&gt;? Importantly, &lt;code&gt;P(x)&lt;/code&gt; does not need to be possible &lt;em&gt;immediately&lt;/em&gt;, just at some point in the future. It's possible to lose 100 dollars betting on slot machines, even if you only bet one dollar at a time. If &lt;code&gt;x&lt;/code&gt; is a statement about an individual state, we can further call it a &lt;a href="https://en.wikipedia.org/wiki/Reachability" target="_blank"&gt;&lt;em&gt;reachability&lt;/em&gt; property&lt;/a&gt;. I'm going to use the two interchangeably for flow. &lt;/p&gt;
    &lt;p&gt;&lt;code&gt;A(P(x))&lt;/code&gt; says that &lt;code&gt;x&lt;/code&gt; is &lt;em&gt;always&lt;/em&gt; possible. No matter what we've done in our system, we can make &lt;code&gt;x&lt;/code&gt; happen again. There's no way to do this with just &lt;code&gt;A&lt;/code&gt; and &lt;code&gt;E&lt;/code&gt;. Other meaningful combinations include:&lt;/p&gt;
    &lt;ul&gt;
    &lt;li&gt;&lt;code&gt;P(A(x))&lt;/code&gt;: there is a reachable state from which &lt;code&gt;x&lt;/code&gt; is always true.&lt;/li&gt;
    &lt;li&gt;&lt;code&gt;A(x =&amp;gt; P(y))&lt;/code&gt;: &lt;code&gt;y&lt;/code&gt; is possible from any state where &lt;code&gt;x&lt;/code&gt; is true.&lt;/li&gt;
    &lt;li&gt;&lt;code&gt;E(x &amp;amp;&amp;amp; P(y))&lt;/code&gt;: There is always a future state where x is true and y is reachable.&lt;/li&gt;
    &lt;li&gt;&lt;code&gt;A(P(x) =&amp;gt; E(x))&lt;/code&gt;: If &lt;code&gt;x&lt;/code&gt; is ever possible, it will eventually happen.&lt;/li&gt;
    &lt;li&gt;&lt;code&gt;E(P(x))&lt;/code&gt; and &lt;code&gt;P(E(x))&lt;/code&gt; are the same as &lt;code&gt;P(x)&lt;/code&gt;.&lt;/li&gt;
    &lt;/ul&gt;
    &lt;p&gt;See the paper &lt;a href="https://dl.acm.org/doi/epdf/10.1145/567446.567463" target="_blank"&gt;"Sometime" is sometimes "not never"&lt;/a&gt; for a deeper discussion of &lt;code&gt;E&lt;/code&gt; and &lt;code&gt;P&lt;/code&gt;.&lt;/p&gt;
    &lt;h3&gt;The use case&lt;/h3&gt;
    &lt;p&gt;Possibility properties are "something good &lt;em&gt;can&lt;/em&gt; happen", which is generally less useful (&lt;em&gt;in specifications&lt;/em&gt;) than "something bad &lt;em&gt;can't&lt;/em&gt; happen" (safety) and "something good &lt;em&gt;will&lt;/em&gt; happen" (liveness). But it still comes up as an important property! My favorite example:&lt;/p&gt;
    &lt;p&gt;&lt;img alt="A guy who can't shut down his computer because system preferences interrupts shutdown" class="newsletter-image" src="https://www.hillelwayne.com/post/safety-and-liveness/img/tweet2.png" /&gt;&lt;/p&gt;
    &lt;p&gt;The big use I've found for the idea is as a sense-check that we wrote the spec properly. Say I take the property "A worker in the 'Retry' state eventually leaves that state":&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;A(state == &amp;#39;Retry&amp;#39; =&amp;gt; E(state != &amp;#39;Retry&amp;#39;))
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    
    &lt;p&gt;The model checker checks this property and confirms it holds of the spec. Great! Our system is correct! ...Unless the system can never &lt;em&gt;reach&lt;/em&gt; the "Retry" state, in which case the expression is trivially true. I need to verify that 'Retry' is reachable, eg &lt;code&gt;P(state == 'Retry')&lt;/code&gt;. Notice I can't use &lt;code&gt;E&lt;/code&gt; to do this, because I don't want to say "the worker always needs to retry at least once". &lt;/p&gt;
    &lt;h3&gt;It's not supported though&lt;/h3&gt;
    &lt;p&gt;I say "use I've found for &lt;em&gt;the idea&lt;/em&gt;" because the main formalisms I use (Alloy and TLA+) don't natively support &lt;code&gt;P&lt;/code&gt;. &lt;sup id="fnref:tla"&gt;&lt;a class="footnote-ref" href="#fn:tla"&gt;2&lt;/a&gt;&lt;/sup&gt; On top of &lt;code&gt;P&lt;/code&gt; being less useful than &lt;code&gt;A&lt;/code&gt; and &lt;code&gt;E&lt;/code&gt;, simple reachability properties are &lt;a href="https://www.hillelwayne.com/post/software-mimicry/" target="_blank"&gt;mimickable&lt;/a&gt; with A(x). &lt;code&gt;P(x)&lt;/code&gt; &lt;em&gt;passes&lt;/em&gt; whenever &lt;code&gt;A(!x)&lt;/code&gt; &lt;em&gt;fails&lt;/em&gt;, meaning I can verify &lt;code&gt;P(state == 'Retry')&lt;/code&gt; by testing that &lt;code&gt;A(!(state == 'Retry'))&lt;/code&gt; finds a counterexample. We &lt;em&gt;cannot&lt;/em&gt; mimic combined operators this way like &lt;code&gt;A(P(x))&lt;/code&gt; but those are significantly less common than state-reachability. &lt;/p&gt;
    &lt;p&gt;(Also, refinement doesn't preserve possibility properties, but that's a whole other kettle of worms.)&lt;/p&gt;
    &lt;p&gt;The one that's bitten me a little is that we can't mimic "&lt;code&gt;P(x)&lt;/code&gt; from every starting state". "&lt;code&gt;A(!x)&lt;/code&gt;" fails if there's at least one path from one starting state that leads to &lt;code&gt;x&lt;/code&gt;, but other starting states might not make &lt;code&gt;x&lt;/code&gt; possible.&lt;/p&gt;
    &lt;p&gt;I suspect there's also a chicken-and-egg problem here. Since my tools can't verify possibility properties, I'm not used to noticing them in systems. I'd be interested in hearing if anybody works with codebases where possibility properties are important, especially if it's something complex like &lt;code&gt;A(x =&amp;gt; P(y))&lt;/code&gt;.&lt;/p&gt;
    &lt;div class="footnote"&gt;
    &lt;hr /&gt;
    &lt;ol&gt;
    &lt;li id="fn:modal"&gt;
    &lt;p&gt;Instead of &lt;code&gt;A(x)&lt;/code&gt;, the literature uses &lt;code&gt;[]x&lt;/code&gt; or &lt;code&gt;Gx&lt;/code&gt; ("globally x") and instead of &lt;code&gt;E(x)&lt;/code&gt; it uses &lt;code&gt;&amp;lt;&amp;gt;x&lt;/code&gt; or &lt;code&gt;Fx&lt;/code&gt; ("finally x"). I'm using A and E because this isn't teaching material.&amp;#160;&lt;a class="footnote-backref" href="#fnref:modal" title="Jump back to footnote 1 in the text"&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id="fn:tla"&gt;
    &lt;p&gt;There's &lt;a href="https://github.com/tlaplus/tlaplus/issues/860" target="_blank"&gt;some discussion to add it to TLA+, though&lt;/a&gt;.&amp;#160;&lt;a class="footnote-backref" href="#fnref:tla" title="Jump back to footnote 2 in the text"&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;/ol&gt;
    &lt;/div&gt;
    </description><pubDate>Wed, 11 Feb 2026 18:36:53 +0000</pubDate><guid>https://buttondown.com/hillelwayne/archive/proving-whats-possible/</guid></item><item><title>Logic for Programmers New Release and Next Steps</title><link>https://buttondown.com/hillelwayne/archive/logic-for-programmers-new-release-and-next-steps/</link><description>
    &lt;p&gt;&lt;img alt="cover.jpg" class="newsletter-image" src="https://assets.buttondown.email/images/f821145f-d310-403c-88f4-327758a66606.jpg?w=480&amp;amp;fit=max" /&gt;&lt;/p&gt;
    &lt;p&gt;It's taken four months, but the next release of &lt;a href="https://logicforprogrammers.com" target="_blank"&gt;Logic for Programmers is now available&lt;/a&gt;! v0.13 is over 50,000 words, making it both 20% larger than v0.12 and officially the longest thing I have ever written.&lt;sup id="fnref:longest"&gt;&lt;a class="footnote-ref" href="#fn:longest"&gt;1&lt;/a&gt;&lt;/sup&gt; Full release notes are &lt;a href="https://github.com/logicforprogrammers/book-assets/blob/master/CHANGELOG.md" target="_blank"&gt;here&lt;/a&gt;, but I'll talk a bit about the biggest changes. &lt;/p&gt;
    &lt;p&gt;For one, every chapter has been rewritten. Every single one. They span from &lt;em&gt;relatively&lt;/em&gt; minor changes to complete chapter rewrites. After some rough git diffing, I think I deleted about 11,000 words?&lt;sup id="fnref:gross-additions"&gt;&lt;a class="footnote-ref" href="#fn:gross-additions"&gt;2&lt;/a&gt;&lt;/sup&gt; The biggest change is probably to the Alloy chapter. After many sleepless nights, I realized the right approach wasn't to teach Alloy as a &lt;em&gt;data modeling&lt;/em&gt; tool but to teach it as a &lt;em&gt;domain modeling&lt;/em&gt; tool. Which technically means the book no longer covers data modeling.&lt;/p&gt;
    &lt;p&gt;There's also a lot more connections between the chapters. The introductory math chapter, for example, foreshadows how each bit of math will be used in the future techniques. I also put more emphasis on the general "themes" like the expressiveness-guarantees tradeoff (working title). One theme I'm really excited about is compatibility (extremely working title). It turns out that the &lt;a href="https://buttondown.com/hillelwayne/archive/the-liskov-substitution-principle-does-more-than/" target="_blank"&gt;Liskov substitution principle&lt;/a&gt;/subtyping in general, &lt;a href="https://buttondown.com/hillelwayne/archive/refinement-without-specification/" target="_blank"&gt;database migrations&lt;/a&gt;, backwards-compatible API changes, and &lt;a href="https://hillelwayne.com/post/refinement/" target="_blank"&gt;specification refinement&lt;/a&gt; all follow &lt;em&gt;basically&lt;/em&gt; the same general principles. I'm calling this "compatibility" for now but prolly need a better name.&lt;/p&gt;
    &lt;p&gt;Finally, there's just a lot more new topics in the various chapters. &lt;code&gt;Testing&lt;/code&gt; properly covers structural and metamorphic properties. &lt;code&gt;Proofs&lt;/code&gt; covers proof by induction and proving recursive functions (in an exercise). &lt;code&gt;Logic Programming&lt;/code&gt; now finally has a section on answer set programming. You get the picture.&lt;/p&gt;
    &lt;h3&gt;Next Steps&lt;/h3&gt;
    &lt;p&gt;There's a lot I still want to add to the book: proper data modeling, data structures, type theory, model-based testing, etc. But I've added new material for two year, and if I keep going it will never get done. So with this release, all the content is in!&lt;/p&gt;
    &lt;p&gt;Just like all the content was in &lt;a href="https://buttondown.com/hillelwayne/archive/five-unusual-raku-features/" target="_blank"&gt;two Novembers ago&lt;/a&gt; and &lt;a href="https://buttondown.com/hillelwayne/archive/logic-for-programmers-project-update/" target="_blank"&gt;two Januaries ago&lt;/a&gt; and &lt;a href="https://buttondown.com/hillelwayne/archive/logic-for-programmers-turns-one/" target="_blank"&gt;last July&lt;/a&gt;. To make it absolutely 100% for sure that I won't be tempted to add anything else, I passed the whole manuscript over to a copy editor. So if I write more, it won't get edits. That's a pretty good incentive to stop.&lt;/p&gt;
    &lt;p&gt;I also need to find a technical reviewer and proofreader. Once all three phases are done then it's "just" a matter of fixing the layout and finding a good printer. I don't know what the timeline looks like but I really want to have something I can hold in my hands before the summer.&lt;/p&gt;
    &lt;p&gt;(I also need to get notable-people testimonials. Hampered a little in this because I'm trying real hard not to quid-pro-quo, so I'd like to avoid anybody who helped me or is mentioned in the book. And given I tapped most of my network to help me... I've got some ideas though!)&lt;/p&gt;
    &lt;p&gt;There's still a lot of work ahead. Even so, for the first time in two years I don't have research to do or sections to write and it feels so crazy. Maybe I'll update my blog again! Maybe I'll run a workshop! Maybe I'll go outside if Chicago ever gets above 6°F! &lt;/p&gt;
    &lt;hr /&gt;
    &lt;h2&gt;Conference Season&lt;/h2&gt;
    &lt;p&gt;After a pretty slow 2025, the 2026 conference season is looking to be pretty busy! Here's where I'm speaking so far:&lt;/p&gt;
    &lt;ul&gt;
    &lt;li&gt;&lt;a href="https://qconlondon.com/" target="_blank"&gt;QCon London&lt;/a&gt;, March 16-19&lt;/li&gt;
    &lt;li&gt;&lt;a href="https://craft-conf.com/2026" target="_blank"&gt;Craft Conference&lt;/a&gt;, Budapest, June 4-5&lt;/li&gt;
    &lt;li&gt;&lt;a href="https://softwareshould.work/" target="_blank"&gt;Software Should Work&lt;/a&gt;, Missouri, July 16-17&lt;/li&gt;
    &lt;li&gt;&lt;a href="https://hfpug.org/" target="_blank"&gt;Houston Functional Programmers&lt;/a&gt;, Virtual, December 3&lt;/li&gt;
    &lt;/ul&gt;
    &lt;p&gt;For the first three I'm giving variations of my talk "How to find bugs in systems that don't exist", which I gave last year at &lt;a href="https://systemsdistributed.com/" target="_blank"&gt;Systems Distributed&lt;/a&gt;. Last one will ideally be a talk based on LfP. &lt;/p&gt;
    &lt;div class="footnote"&gt;
    &lt;hr /&gt;
    &lt;ol&gt;
    &lt;li id="fn:longest"&gt;
    &lt;p&gt;The second longest was my 2003 NaNoWriMo. The third longest was &lt;em&gt;Practical TLA+&lt;/em&gt;.&amp;#160;&lt;a class="footnote-backref" href="#fnref:longest" title="Jump back to footnote 1 in the text"&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id="fn:gross-additions"&gt;
    &lt;p&gt;This means I must have written 20,000 words total. For comparison, the v0.1 release was 19,000 words.&amp;#160;&lt;a class="footnote-backref" href="#fnref:gross-additions" title="Jump back to footnote 2 in the text"&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;/ol&gt;
    &lt;/div&gt;
    </description><pubDate>Wed, 04 Feb 2026 14:00:00 +0000</pubDate><guid>https://buttondown.com/hillelwayne/archive/logic-for-programmers-new-release-and-next-steps/</guid></item><item><title>Refinement without Specification</title><link>https://buttondown.com/hillelwayne/archive/refinement-without-specification/</link><description>
    &lt;p&gt;Imagine we have a SQL database with a &lt;code&gt;user&lt;/code&gt; table, and users have a non-nullable &lt;code&gt;is_activated&lt;/code&gt; boolean column. Having read &lt;a href="https://ntietz.com/blog/that-boolean-should-probably-be-something-else/" target="_blank"&gt;That Boolean Should Probably Be Something else&lt;/a&gt;, you decide to migrate it to a nullable &lt;code&gt;activated_at&lt;/code&gt; column. You can change any of the SQL queries that read/update the &lt;code&gt;user&lt;/code&gt; table but not any of the code that uses the results of these queries. Can we make this change in a way that preserves all external properties? &lt;/p&gt;
    &lt;p&gt;Yes. If an update would set &lt;code&gt;is_activated&lt;/code&gt; to true, instead set it to the current date. Now define the &lt;strong&gt;refinement mapping&lt;/strong&gt; that takes a &lt;code&gt;new_user&lt;/code&gt; and returns an &lt;code&gt;old_user&lt;/code&gt;. All columns will be unchanged &lt;em&gt;except&lt;/em&gt; &lt;code&gt;is_activated&lt;/code&gt;, which will be&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;f(new_user).is_activated = 
        if new_user.activated_at == NULL 
        then FALSE
        else TRUE
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    
    &lt;p&gt;Now new code can use &lt;code&gt;new_user&lt;/code&gt; directly while legacy code can use &lt;code&gt;f(new_user)&lt;/code&gt; instead, which will behave indistinguishably from the &lt;code&gt;old_user&lt;/code&gt;. &lt;/p&gt;
    &lt;p&gt;A little more time passes and you decide to switch to an &lt;a href="https://martinfowler.com/eaaDev/EventSourcing.html" target="_blank"&gt;event sourcing&lt;/a&gt;-like model. So instead of an &lt;code&gt;activated_at&lt;/code&gt; column, you have a &lt;code&gt;user_events&lt;/code&gt; table, where every record is &lt;code&gt;(user_id, timestamp, event)&lt;/code&gt;. So adding an &lt;code&gt;activate&lt;/code&gt; event will activate the user, adding a &lt;code&gt;deactivate&lt;/code&gt; event will deactivate the user. Once again, we can update the queries but not any of the code that uses the results of these queries. Can we make a change that preserves all external properties?&lt;/p&gt;
    &lt;p&gt;Yes. If an update would change &lt;code&gt;is_activated&lt;/code&gt;, instead have it add an appropriate record to the event table. Now, define the refinement mapping that takes &lt;code&gt;newer_user&lt;/code&gt; and returns &lt;code&gt;new_user&lt;/code&gt;. The &lt;code&gt;activated_at&lt;/code&gt; field will be computed like this:&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;g(newer_user).activated_at =
            # last_activated_event
        let lae = 
                newer_user.events
                          .filter(event = &amp;quot;activate&amp;quot; | &amp;quot;deactivate&amp;quot;)
                          .last,
        in
            if lae.event == &amp;quot;activate&amp;quot; 
            then lae.timestamp
            else NULL
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    
    &lt;p class="empty-line" style="height:16px; margin:0px !important;"&gt;&lt;/p&gt;
    &lt;p&gt;Now new code can use &lt;code&gt;newer_user&lt;/code&gt; directly while old code can use &lt;code&gt;g(newer_user)&lt;/code&gt; and the really old code can use &lt;code&gt;f(g(newer_user))&lt;/code&gt;.&lt;/p&gt;
    &lt;h3&gt;Mutability constraints&lt;/h3&gt;
    &lt;div class="subscribe-form"&gt;&lt;/div&gt;
    &lt;p&gt;I said "these preserve all external properties" and that was a lie. It depends on the properties we explicitly have, and I didn't list any. The real interesting properties for me are mutability constraints on how the system can evolve. So let's go back in time and add a constraint to &lt;code&gt;user&lt;/code&gt;:&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;C1(u) = u.is_activated =&amp;gt; u.is_activated&amp;#39;
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    
    &lt;p&gt;This constraint means that if a user is activated, any change will preserve its activated-ness. This means a user can go from deactivated to activated but not the other way. It's not a particular good constraint but it's good enough for teaching purposes. Such a SQL constraint can be enforced with &lt;a href="https://www.postgresql.org/docs/current/sql-createeventtrigger.html" target="_blank"&gt;triggers&lt;/a&gt;. &lt;/p&gt;
    &lt;p&gt;Now we can throw a constraint on &lt;code&gt;new_user&lt;/code&gt;:&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;C2(nu) = nu.activated_at != NULL =&amp;gt; nu.activated_at&amp;#39; != NULL
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    
    &lt;p&gt;If &lt;code&gt;nu&lt;/code&gt; satisfies &lt;code&gt;C2&lt;/code&gt;, then &lt;code&gt;f(nu)&lt;/code&gt; satisfies &lt;code&gt;C1&lt;/code&gt;. So the refinement still holds.&lt;/p&gt;
    &lt;p&gt;With &lt;code&gt;newer_u&lt;/code&gt;, we &lt;em&gt;cannot&lt;/em&gt; guarantee that &lt;code&gt;g(newer_u)&lt;/code&gt; satisfies &lt;code&gt;C2&lt;/code&gt; because we can go from "activated" to "deactivated" just by appending a new event. So it's not a refinement. This is fixable by removing deactivation events, that would work too.&lt;/p&gt;
    &lt;p&gt;So a more interesting case is &lt;code&gt;bad_user&lt;/code&gt;, a refinement of &lt;code&gt;user&lt;/code&gt; that has both &lt;code&gt;activated_at&lt;/code&gt; and &lt;code&gt;activated_until&lt;/code&gt;. We propose the refinement mapping &lt;code&gt;b&lt;/code&gt;:&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;b(bad_user).activated =
        if bad_user.activated_at == NULL &amp;amp;&amp;amp; activated_until == NULL
        then FALSE
        else bad_user.activated_at &amp;lt;= now() &amp;lt; bad_user.activated_until
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    
    &lt;p&gt;But now if enough time passes, &lt;code&gt;b(bad_user).activated' = false&lt;/code&gt;, so this is not a refinement either.&lt;/p&gt;
    &lt;h3&gt;The punchline&lt;/h3&gt;
    &lt;p&gt;Refinement is one of the most powerful techniques in formal specification, but also one of the hardest for people to understand. I'm starting to think that the reason it's so hard is because they learn refinement while they're &lt;em&gt;also&lt;/em&gt; learning formal methods, so are faced with an unfamiliar topic in an unfamiliar context. If that's the case, then maybe it's easier introducing refinement in a more common context like databases.&lt;/p&gt;
    &lt;p&gt;I've written a bit about refinement in the normal context &lt;a href="https://hillelwayne.com/post/refinement/" target="_blank"&gt;here&lt;/a&gt; (showing one specification is an implementation of another). I kinda want to work this explanation into the book but it might be too late for big content additions like this.&lt;/p&gt;
    &lt;p&gt;(Food for thought: how do refinement mappings relate to database views?)&lt;/p&gt;
    </description><pubDate>Tue, 20 Jan 2026 17:49:07 +0000</pubDate><guid>https://buttondown.com/hillelwayne/archive/refinement-without-specification/</guid></item><item><title>My Gripes with Prolog</title><link>https://buttondown.com/hillelwayne/archive/my-gripes-with-prolog/</link><description>
    &lt;p&gt;For the next release of &lt;a href="https://leanpub.com/logic/" target="_blank"&gt;Logic for Programmers&lt;/a&gt;, I'm finally adding the sections on Answer Set Programming and Constraint Logic Programming that I TODOd back in version 0.9. And this is making me re-experience some of my pain points with Prolog, which I will gripe about now.  If you want to know more about why Prolog is cool instead, go &lt;a href="https://buttondown.com/hillelwayne/archive/a48fce5b-8a05-4302-b620-9b26f057f145/" target="_blank"&gt;here&lt;/a&gt; or &lt;a href="https://www.metalevel.at/prolog" target="_blank"&gt;here&lt;/a&gt; or &lt;a href="https://ianthehenry.com/posts/drinking-with-datalog/" target="_blank"&gt;here&lt;/a&gt; or &lt;a href="https://logicprogramming.org/" target="_blank"&gt;here&lt;/a&gt;. &lt;/p&gt;
    &lt;h3&gt;No standardized strings&lt;/h3&gt;
    &lt;p&gt;ISO "strings" are just atoms or lists of single-character atoms (or lists of integer character codes). The various implementations of Prolog add custom string operators but they are not cross compatible, so code written with strings in SWI-Prolog will not work in Scryer Prolog. &lt;/p&gt;
    &lt;h3&gt;No functions&lt;/h3&gt;
    &lt;p&gt;Code logic is expressed entirely in &lt;em&gt;rules&lt;/em&gt;, predicates which return true or false for certain values. For example if you wanted to get the length of a Prolog list, you write this:&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="s s-Atom"&gt;?-&lt;/span&gt; &lt;span class="nf"&gt;length&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="s s-Atom"&gt;a&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;b&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;c&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="nv"&gt;Len&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;
    
       &lt;span class="nv"&gt;Len&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mf"&gt;3.&lt;/span&gt;
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p class="empty-line" style="height:16px; margin:0px !important;"&gt;&lt;/p&gt;
    &lt;p&gt;Now this is pretty cool in that it allows bidirectionality, or running predicates "in reverse". To generate lists of length 3, you can write &lt;code&gt;length(L, 3)&lt;/code&gt;. But it also means that if you want to get the length a list &lt;em&gt;plus one&lt;/em&gt;, you can't do that in one expression, you have to write &lt;code&gt;length(List, Out), X is Out+1&lt;/code&gt;.&lt;/p&gt;
    &lt;p&gt;For a while I thought no functions was necessary evil for bidirectionality, but then I discovered &lt;a href="https://picat-lang.org/" target="_blank"&gt;Picat&lt;/a&gt; has functions and works just fine. That by itself is a reason for me to prefer Picat for my LP needs.&lt;/p&gt;
    &lt;p&gt;(Bidirectionality is a killer feature of Prolog, so it's a shame I so rarely run into situations that use it.)&lt;/p&gt;
    &lt;h3&gt;No standardized collection types besides lists&lt;/h3&gt;
    &lt;p&gt;Aside from atoms (&lt;code&gt;abc&lt;/code&gt;) and numbers, there are two data types:&lt;/p&gt;
    &lt;ul&gt;
    &lt;li&gt;Linked lists like &lt;code&gt;[a,b,c,d]&lt;/code&gt;.&lt;/li&gt;
    &lt;li&gt;Compound terms like &lt;code&gt;dog(rex, poodle)&lt;/code&gt;, which &lt;em&gt;seem&lt;/em&gt; like record types but are actually tuples. You can even convert compound terms to linked lists with &lt;code&gt;=..&lt;/code&gt;:&lt;/li&gt;
    &lt;/ul&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="s s-Atom"&gt;?-&lt;/span&gt; &lt;span class="nv"&gt;L&lt;/span&gt; &lt;span class="s s-Atom"&gt;=..&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s s-Atom"&gt;a&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;b&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;c&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;
       &lt;span class="nv"&gt;L&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;a&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s s-Atom"&gt;b&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;c&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;
    &lt;span class="s s-Atom"&gt;?-&lt;/span&gt; &lt;span class="nf"&gt;a&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s s-Atom"&gt;b&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nf"&gt;c&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s s-Atom"&gt;c&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="s s-Atom"&gt;=..&lt;/span&gt; &lt;span class="nv"&gt;L&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;
       &lt;span class="nv"&gt;L&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s s-Atom"&gt;a&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;b&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nf"&gt;c&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s s-Atom"&gt;c&lt;/span&gt;&lt;span class="p"&gt;)].&lt;/span&gt;
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;There's no proper key-value maps or even struct types. Again, this is something that individual distributions can fix (without cross compatibility), but these never feel integrated with the rest of the language. &lt;/p&gt;
    &lt;h3&gt;No boolean values&lt;/h3&gt;
    &lt;p&gt;&lt;code&gt;true&lt;/code&gt; and &lt;code&gt;false&lt;/code&gt; aren't values, they're control flow statements. &lt;code&gt;true&lt;/code&gt; is a noop and &lt;code&gt;false&lt;/code&gt; says that the current search path is a dead end, so backtrack and start again. You can't explicitly store true and false as values, you have to implicitly have them in facts (&lt;code&gt;passed(test)&lt;/code&gt; instead of &lt;code&gt;test.passed? == true&lt;/code&gt;).&lt;/p&gt;
    &lt;p&gt;This hasn't made any tasks impossible, and I can usually find a workaround to whatever I want to do. But I do think it makes things more inconvenient! Sometimes I want to do something dumb like "get all atoms that don't pass at least three of these rules", and that'd be a lot easier if I could shove intermediate results into a sack of booleans. &lt;/p&gt;
    &lt;p&gt;(This is called "&lt;a href="https://en.wikipedia.org/wiki/Negation_as_failure" target="_blank"&gt;Negation as Failure&lt;/a&gt;". I think this might be necessary to make Prolog a Turing complete general programming language. Picat fixes a lot of Prolog's gripes and still has negation as failure. ASP has regular negation but it's not Turing complete.) &lt;/p&gt;
    &lt;h3&gt;Cuts are confusing&lt;/h3&gt;
    &lt;div class="subscribe-form"&gt;&lt;/div&gt;
    &lt;p&gt;Prolog finds solutions through depth first search, and a "cut" (&lt;code&gt;!&lt;/code&gt;) symbol prevents backtracking past a certain point. This is necessary for optimization but can lead to invalid programs. &lt;/p&gt;
    &lt;p&gt;You're not supposed to use cuts if you can avoid it, so I pretended cuts didn't exist. Which is why I was surprised to find that &lt;a href="https://eu.swi-prolog.org/pldoc/doc_for?object=(-%3E)/2" target="_blank"&gt;conditionals&lt;/a&gt; are implemented with cuts. Because cuts are spooky dark magic conditionals &lt;em&gt;sometimes&lt;/em&gt; conditionals work as I expect them to and sometimes leave out valid solutions and I have no idea how to tell which it'll be. Usually I find it safer to just avoid conditionals entirely, which means my code gets a lot longer and messier. &lt;/p&gt;
    &lt;h3&gt;Non-cuts are confusing&lt;/h3&gt;
    &lt;p&gt;The original example in the last section was this: &lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="nf"&gt;foo&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;A&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;B&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;:-&lt;/span&gt;
        &lt;span class="s s-Atom"&gt;\+&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;A&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nv"&gt;B&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
        &lt;span class="nv"&gt;A&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="nv"&gt;B&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mf"&gt;2.&lt;/span&gt;
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;&lt;code&gt;foo(1, 2)&lt;/code&gt; returns true, so you'd expect &lt;code&gt;f(A, B)&lt;/code&gt; to return &lt;code&gt;A=1, B=2&lt;/code&gt;. But it returns &lt;code&gt;false&lt;/code&gt;.  Whereas this works as expected.&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="nf"&gt;bar&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;A&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;B&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;:-&lt;/span&gt;
        &lt;span class="nv"&gt;A&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="nv"&gt;B&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="s s-Atom"&gt;\+&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;A&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nv"&gt;B&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;I &lt;em&gt;thought&lt;/em&gt; this was because &lt;code&gt;\+&lt;/code&gt; was implemented with cuts, and the &lt;a href="https://www.amazon.com/Programming-Prolog-Using-ISO-Standard/dp/3540006788" target="_blank"&gt;Clocksin book&lt;/a&gt; suggests it's &lt;code&gt;call(P), !, fail&lt;/code&gt;, so this was my prime example about how cuts are confusing. But then I tried this:&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="s s-Atom"&gt;?-&lt;/span&gt; &lt;span class="nf"&gt;member&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;A&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;]),&lt;/span&gt; &lt;span class="s s-Atom"&gt;\+&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;A&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="nv"&gt;A&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mf"&gt;3.&lt;/span&gt;
    &lt;span class="nv"&gt;A&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mf"&gt;3.&lt;/span&gt; &lt;span class="c1"&gt;% wtf?&lt;/span&gt;
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;There's no way to get that behavior with cuts! I don't think &lt;code&gt;\+&lt;/code&gt; uses cuts at all! And now I have to figure out why 
    &lt;code&gt;foo(A, B)&lt;/code&gt; doesn't returns results. Is it &lt;a href="https://github.com/dtonhofer/prolog_notes/blob/master/other_notes/about_negation/floundering.md" target="_blank"&gt;floundering&lt;/a&gt;? Is it because &lt;code&gt;\+ P&lt;/code&gt; only succeeds if &lt;code&gt;P&lt;/code&gt; fails, and &lt;code&gt;A = B&lt;/code&gt; always succeeds? A closed-world assumption? Something else?&lt;sup id="fnref:dif"&gt;&lt;a class="footnote-ref" href="#fn:dif"&gt;1&lt;/a&gt;&lt;/sup&gt;&lt;/p&gt;
    &lt;h3&gt;Straying outside of default queries is confusing&lt;/h3&gt;
    &lt;p&gt;Say I have a program like this:&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="nf"&gt;tree&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s s-Atom"&gt;n&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;n1&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;
    &lt;span class="nf"&gt;tree&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s s-Atom"&gt;n&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;n2&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;
    &lt;span class="nf"&gt;tree&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s s-Atom"&gt;n1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;n11&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;
    &lt;span class="nf"&gt;tree&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s s-Atom"&gt;n2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;n21&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;
    &lt;span class="nf"&gt;tree&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s s-Atom"&gt;n2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;n22&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;
    &lt;span class="nf"&gt;tree&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s s-Atom"&gt;n11&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;n111&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;
    &lt;span class="nf"&gt;tree&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s s-Atom"&gt;n11&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;n112&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;
    
    &lt;span class="nf"&gt;branch&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;N&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;:-&lt;/span&gt; &lt;span class="c1"&gt;% two children&lt;/span&gt;
        &lt;span class="nf"&gt;tree&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;N&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;C1&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
        &lt;span class="nf"&gt;tree&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;N&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;C2&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
        &lt;span class="nv"&gt;C1&lt;/span&gt; &lt;span class="s s-Atom"&gt;@&amp;lt;&lt;/span&gt; &lt;span class="nv"&gt;C2&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt; &lt;span class="c1"&gt;% ordering&lt;/span&gt;
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;And I want to know all of the nodes that are parents of branches. The normal way to do this is with a query:&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="s s-Atom"&gt;?-&lt;/span&gt; &lt;span class="nf"&gt;tree&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;A&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;N&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="nf"&gt;branch&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;N&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;
    &lt;span class="nv"&gt;A&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s s-Atom"&gt;n&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;N&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s s-Atom"&gt;n2&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="c1"&gt;% show more...&lt;/span&gt;
    &lt;span class="nv"&gt;A&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s s-Atom"&gt;n1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;N&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s s-Atom"&gt;n11&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;This is interactively making me query for every result. That's usually not what I want, I know the result of my query is finite and I want all of the results at once, so I can count or farble or whatever them. It took a while to figure out that the proper solution is &lt;a href="https://www.swi-prolog.org/pldoc/man?predicate=bagof/3" target="_blank"&gt;&lt;code&gt;bagof(Template, Goal, Bag)&lt;/code&gt;&lt;/a&gt;, which will "Unify Bag with the alternatives of Template":&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="s s-Atom"&gt;?-&lt;/span&gt; &lt;span class="nf"&gt;bagof&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;A&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;tree&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;A&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;N&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="nf"&gt;branch&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;N&lt;/span&gt;&lt;span class="p"&gt;)),&lt;/span&gt; &lt;span class="nv"&gt;As&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;
    
    &lt;span class="nv"&gt;As&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s s-Atom"&gt;n1&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="nv"&gt;N&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s s-Atom"&gt;n11&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="nv"&gt;As&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s s-Atom"&gt;n&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="nv"&gt;N&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s s-Atom"&gt;n2&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;Wait crap that's still giving one result at a time, because &lt;code&gt;N&lt;/code&gt; is a free variable in &lt;code&gt;bagof&lt;/code&gt; so it backtracks over that. It surprises me but I guess it's good to have as an option. So how do I get all of the results at once?&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="s s-Atom"&gt;?-&lt;/span&gt; &lt;span class="nf"&gt;bagof&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;A&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;N&lt;/span&gt;&lt;span class="s s-Atom"&gt;^&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;tree&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;A&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;N&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="nf"&gt;branch&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;N&lt;/span&gt;&lt;span class="p"&gt;)),&lt;/span&gt; &lt;span class="nv"&gt;As&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;
    
    &lt;span class="nv"&gt;As&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s s-Atom"&gt;n&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;n1&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;The only difference is the &lt;code&gt;N^Goal&lt;/code&gt;, which tells &lt;code&gt;bagof&lt;/code&gt; to ignore and group the results of &lt;code&gt;N&lt;/code&gt;. As far as I can tell, this is the &lt;em&gt;only&lt;/em&gt; place the ISO standard uses &lt;code&gt;^&lt;/code&gt; to mean anything besides exponentiation. Supposedly it's the &lt;a href="https://sicstus.sics.se/sicstus/docs/latest4/html/sicstus.html/ref_002dall_002dsum.html" target="_blank"&gt;existential quantifier&lt;/a&gt;? In general whenever I try to stray outside simpler use-cases, especially if I try to do things non-interactively, I run into trouble.&lt;/p&gt;
    &lt;h3&gt;I have mixed feelings about symbol terms&lt;/h3&gt;
    &lt;p&gt;It took me a long time to realize the reason &lt;code&gt;bagof&lt;/code&gt;  "works" is because infix symbols are mapped to prefix compound terms, so that  &lt;code&gt;a^b&lt;/code&gt; is &lt;code&gt;^(a, b)&lt;/code&gt;, and then different predicates can decide to do different things with &lt;code&gt;^(a, b)&lt;/code&gt;.&lt;/p&gt;
    &lt;p&gt;This is also why you can't just write &lt;code&gt;A = B+1&lt;/code&gt;: that unifies &lt;code&gt;A&lt;/code&gt; with the &lt;em&gt;compound term&lt;/em&gt; &lt;code&gt;+(B, 1)&lt;/code&gt;. &lt;code&gt;A+1 = B+2&lt;/code&gt; is &lt;em&gt;false&lt;/em&gt;, as &lt;code&gt;1 \= 2&lt;/code&gt;. You have to write &lt;code&gt;A+1 is B+2&lt;/code&gt;, as &lt;code&gt;is&lt;/code&gt; is the operator that converts &lt;code&gt;+(B, 1)&lt;/code&gt; to a mathematical term.&lt;/p&gt;
    &lt;p&gt;(And &lt;em&gt;that&lt;/em&gt; fails because &lt;code&gt;is&lt;/code&gt; isn't fully bidirectional. The lhs &lt;em&gt;must&lt;/em&gt; be a single variable. You have to import &lt;code&gt;clpfd&lt;/code&gt; and write &lt;code&gt;A + 1 #= B + 2&lt;/code&gt;.)&lt;/p&gt;
    &lt;p&gt;I don't like this, but I'm a hypocrite for saying that because I appreciate the idea and don't mind custom symbols in other languages. I guess what annoys me is there's no official definition of what &lt;code&gt;^(a, b)&lt;/code&gt; is, it's purely a convention. ISO Prolog uses &lt;code&gt;-(a, b)&lt;/code&gt; (aka &lt;code&gt;a-b&lt;/code&gt;) as a convention to mean "pairs", and the only way to realize that is to see that an awful lot of standard modules use that convention. But you can use &lt;code&gt;-(a, b)&lt;/code&gt; to mean something else in your own code and nothing will warn you of the inconsistency.&lt;/p&gt;
    &lt;p&gt;Anyway I griped about pairs so I can gripe about &lt;code&gt;sort&lt;/code&gt;.&lt;/p&gt;
    &lt;h3&gt;go home sort, ur drunk&lt;/h3&gt;
    &lt;p&gt;This one's just a blunder:&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="s s-Atom"&gt;?-&lt;/span&gt; &lt;span class="nf"&gt;sort&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="nv"&gt;Out&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;
       &lt;span class="nv"&gt;Out&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt; &lt;span class="c1"&gt;% wat&lt;/span&gt;
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;According to an expert online this is because sort is supposed to return a sorted &lt;em&gt;set&lt;/em&gt;, not a sorted list. If you want to preserve duplicates you're supposed to lift all of the values into &lt;code&gt;-($key, $value)&lt;/code&gt; compound terms, then use &lt;a href="https://eu.swi-prolog.org/pldoc/doc_for?object=keysort/2" target="_blank"&gt;keysort&lt;/a&gt;, then extract the values. And, since there's no functions, this process takes at least three lines. This is also how you're supposed to sort by a custom predicate, like "the second value of a compound term". &lt;/p&gt;
    &lt;p&gt;(Most (but not all) distributions have a duplicate merge like &lt;a href="https://eu.swi-prolog.org/pldoc/doc_for?object=msort/2" target="_blank"&gt;msort&lt;/a&gt;. SWI-Prolog also has a &lt;a href="https://eu.swi-prolog.org/pldoc/doc_for?object=predsort/3" target="_blank"&gt;sort by key&lt;/a&gt; but it removes duplicates.)&lt;/p&gt;
    &lt;h3&gt;Please just let me end rules with a trailing comma instead of a period, I'm begging you&lt;/h3&gt;
    &lt;p&gt;I don't care if it makes fact parsing ambiguous, I just don't want "reorder two lines" to be a syntax error anymore&lt;/p&gt;
    &lt;hr/&gt;
    &lt;p&gt;I expect by this time tomorrow I'll have been Cunningham'd and there will be a 2000 word essay about how all of my gripes are either easily fixable by doing XYZ or how they are the best possible choice that Prolog could have made. I mean, even in writing this I found out some fixes to problems I had. Like I was going to gripe about how I can't run SWI-Prolog queries from the command line but, in doing do diligence finally &lt;em&gt;finally&lt;/em&gt; figured it out:&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;swipl&lt;span class="w"&gt; &lt;/span&gt;-t&lt;span class="w"&gt; &lt;/span&gt;halt&lt;span class="w"&gt; &lt;/span&gt;-g&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"bagof(X, Goal, Xs), print(Xs)"&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;./file.pl
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;It's pretty clunky but still better than the old process of having to enter an interactive session every time I wanted to validate a script change.&lt;/p&gt;
    &lt;p&gt;(Also, answer set programming is pretty darn cool. Excited to write about it in the book!)&lt;/p&gt;
    &lt;div class="footnote"&gt;
    &lt;hr/&gt;
    &lt;ol&gt;
    &lt;li id="fn:dif"&gt;
    &lt;p&gt;A couple of people mentioned using &lt;a href="https://eu.swi-prolog.org/pldoc/doc_for?object=dif/2" target="_blank"&gt;dif/2&lt;/a&gt; instead of &lt;code&gt;\+ A = B&lt;/code&gt;. Dif is great but usually I hit the negation footgun with things like &lt;code&gt;\+ foo(A, B), bar(B, C), baz(A, C)&lt;/code&gt;, where &lt;code&gt;dif/2&lt;/code&gt; isn't applicable. &lt;a class="footnote-backref" href="#fnref:dif" title="Jump back to footnote 1 in the text"&gt;↩&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;/ol&gt;
    &lt;/div&gt;
    </description><pubDate>Wed, 14 Jan 2026 16:48:51 +0000</pubDate><guid>https://buttondown.com/hillelwayne/archive/my-gripes-with-prolog/</guid></item><item><title>The Liskov Substitution Principle does more than you think</title><link>https://buttondown.com/hillelwayne/archive/the-liskov-substitution-principle-does-more-than/</link><description>
    &lt;p class="empty-line" style="height:16px; margin:0px !important;"&gt;&lt;/p&gt;
    &lt;p&gt;Happy New Year! I'm done with the newsletter hiatus and am going to try updating weekly again. To ease into things a bit, I'll try to keep posts a little more off the cuff and casual for a while, at least until &lt;a href="https://leanpub.com/logic/" target="_blank"&gt;&lt;em&gt;Logic for Programmers&lt;/em&gt;&lt;/a&gt; is done. Speaking of which, v0.13 should be out by the end of this month.&lt;/p&gt;
    &lt;p&gt;So for this newsletter I want to talk about the &lt;a href="https://en.wikipedia.org/wiki/Liskov_substitution_principle" target="_blank"&gt;Liskov Substitution Principle&lt;/a&gt; (LSP). Last week I read &lt;a href="https://loup-vaillant.fr/articles/solid-bull" target="_blank"&gt;A SOLID Load of Bull&lt;/a&gt; by cryptographer Loupe Vaillant, where he argues the &lt;a href="https://en.wikipedia.org/wiki/SOLID" target="_blank"&gt;SOLID&lt;/a&gt; principles of OOP are not worth following. He makes an exception for LSP, but also claims that it's "just subtyping" and further:&lt;/p&gt;
    &lt;blockquote&gt;
    &lt;p&gt;If I were trying really hard to be negative about the Liskov substitution principle, I would stress that &lt;strong&gt;it only applies when inheritance is involved&lt;/strong&gt;, and inheritance is strongly discouraged anyway.&lt;/p&gt;
    &lt;/blockquote&gt;
    &lt;p&gt;LSP is more interesting than that! In the original paper, &lt;a href="https://www.cs.cmu.edu/~wing/publications/LiskovWing94.pdf" target="_blank"&gt;A Behavioral Notion of Subtyping&lt;/a&gt;, Barbara Liskov and Jeannette Wing start by defining a "correct" subtyping as follows:&lt;/p&gt;
    &lt;blockquote&gt;
    &lt;p&gt;Subtype Requirement: Let ϕ(x) be a property provable about objects x of type T. Then ϕ(y) should be true for objects y of type S where S is a subtype of T.&lt;/p&gt;
    &lt;/blockquote&gt;
    &lt;p&gt;From then on, the paper determine what &lt;em&gt;guarantees&lt;/em&gt; that a subtype is correct.&lt;sup id="fnref:safety"&gt;&lt;a class="footnote-ref" href="#fn:safety"&gt;1&lt;/a&gt;&lt;/sup&gt;  They identify three conditions: &lt;/p&gt;
    &lt;ol&gt;
    &lt;li&gt;Each of the subtype's methods has the same or weaker preconditions and the same or stronger postconditions as the corresponding supertype method.&lt;sup id="fnref:cocontra"&gt;&lt;a class="footnote-ref" href="#fn:cocontra"&gt;2&lt;/a&gt;&lt;/sup&gt; &lt;/li&gt;
    &lt;li&gt;The subtype satisfies all state invariants of the supertype. &lt;/li&gt;
    &lt;li&gt;The subtype satisfies all "history properties" of the supertype. &lt;sup id="fnref:refinement"&gt;&lt;a class="footnote-ref" href="#fn:refinement"&gt;3&lt;/a&gt;&lt;/sup&gt; e.g. if a supertype has an immutable field, the subtype cannot make it mutable. &lt;/li&gt;
    &lt;/ol&gt;
    &lt;p&gt;(Later, Elisa Baniassad and Alexander Summers &lt;a href="https://www.cs.ubc.ca/~alexsumm/papers/BaniassadSummers21.pdf" target="_blank"&gt;would realize&lt;/a&gt; these are equivalent to "the subtype passes all black-box tests designed for the supertype", which I wrote a little bit more about &lt;a href="https://www.hillelwayne.com/post/lsp/" target="_blank"&gt;here&lt;/a&gt;.)&lt;/p&gt;
    &lt;p&gt;I want to focus on the first rule about preconditions and postconditions. This refers to the method's &lt;strong&gt;contract&lt;/strong&gt;.  For a function &lt;code&gt;f&lt;/code&gt;, &lt;code&gt;f.Pre&lt;/code&gt; is what must be true going into the function, and &lt;code&gt;f.Post&lt;/code&gt; is what the function guarantees on execution. A canonical example is square root: &lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;sqrt.Pre(x) = x &amp;gt;= 0
    sqrt.Post(x, out) = out &amp;gt;= 0 &amp;amp;&amp;amp; out*out == x
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;div class="subscribe-form"&gt;&lt;/div&gt;
    &lt;p&gt;Mathematically we would write this as &lt;code&gt;all x: f.Pre(x) =&amp;gt; f.Post(x)&lt;/code&gt; (where &lt;code&gt;=&amp;gt;&lt;/code&gt; is the &lt;a href="https://en.wikipedia.org/wiki/Material_conditional" target="_blank"&gt;implication operator&lt;/a&gt;). If that relation holds for all &lt;code&gt;x&lt;/code&gt;, we say the function is "correct". With this definition we can actually formally deduce the first  subtyping requirement. Let &lt;code&gt;caller&lt;/code&gt; be some code that uses a method, which we will call &lt;code&gt;super&lt;/code&gt;, and let both &lt;code&gt;caller&lt;/code&gt; and &lt;code&gt;super&lt;/code&gt; be correct. Then we know the following statements are true:&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;  1. caller.Pre &amp;amp;&amp;amp; stuff =&amp;gt; super.Pre
      2. super.Pre =&amp;gt; super.Post
      3. super.Post &amp;amp;&amp;amp; more_stuff =&amp;gt; caller.Post
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;Now let's say we substitute &lt;code&gt;super&lt;/code&gt; with &lt;code&gt;sub&lt;/code&gt;, which is also correct. Here is what we now know is true: &lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="w"&gt; &lt;/span&gt; 1. caller.Pre =&amp;gt; super.Pre
    &lt;span class="gd"&gt;- 2. super.Pre =&amp;gt; super.Post&lt;/span&gt;
    &lt;span class="gi"&gt;+ 2. sub.Pre =&amp;gt; sub.Post&lt;/span&gt;
    &lt;span class="w"&gt; &lt;/span&gt; 3. super.Post =&amp;gt; caller.Post
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;When is &lt;code&gt;caller&lt;/code&gt; still correct? When we can fill in the "gaps" in the chain, aka if &lt;code&gt;super.Pre =&amp;gt; sub.Pre&lt;/code&gt; and &lt;code&gt;sub.Post =&amp;gt; super.Post&lt;/code&gt;. In other words, if &lt;code&gt;sub&lt;/code&gt;'s preconditions are weaker than (or equivalent to) &lt;code&gt;super&lt;/code&gt;'s preconditions and if &lt;code&gt;sub&lt;/code&gt;'s postconditions are stronger than (or equivalent to) &lt;code&gt;super&lt;/code&gt;'s postconditions.&lt;/p&gt;
    &lt;p&gt;Notice that I never actually said &lt;code&gt;sub&lt;/code&gt; was from a subtype of &lt;code&gt;super&lt;/code&gt;! The LSP conditions (at least, the contract rule of LSP) doesn't just apply to &lt;em&gt;subtypes&lt;/em&gt; but can be applied in any situation where we substitute a function or block of code for another. Subtyping is a common place where this happens, but by no means the only! We can also substitute across time.Any time we modify some code's behavior, we are effectively substituting the new version in for the old version, and so the new version's contract must be compatible with the old version's to guarantee no existing code is broken.&lt;/p&gt;
    &lt;p&gt;For example, say we maintain an API or function with two required inputs, &lt;code&gt;X&lt;/code&gt; and &lt;code&gt;Y&lt;/code&gt;, and one optional input, &lt;code&gt;Z&lt;/code&gt;. Making &lt;code&gt;Z&lt;/code&gt; required strengthens the precondition ("input must have Z" is stronger than "input may have Z"), so potentially breaks existing users of our API. Making &lt;code&gt;Y&lt;/code&gt; optional weakens the precondition ("input may have Y" is weaker than "input must have Y"), so is guaranteed to be compatible.&lt;/p&gt;
    &lt;p&gt;(This also underpins &lt;a href="https://en.wikipedia.org/wiki/Robustness_principle" target="_blank"&gt;The robustness principle&lt;/a&gt;: "be conservative in what you send, be liberal in what you accept".)&lt;/p&gt;
    &lt;p&gt;Now the dark side of all this is &lt;a href="https://www.hyrumslaw.com/" target="_blank"&gt;Hyrum's Law&lt;/a&gt;. In the below code, are &lt;code&gt;new&lt;/code&gt;'s postconditions stronger than &lt;code&gt;old&lt;/code&gt;'s postconditions? &lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="k"&gt;def&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nf"&gt;old&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="s2"&gt;"a"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"foo"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;"b"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"bar"&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;
    
    &lt;span class="k"&gt;def&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nf"&gt;new&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="s2"&gt;"a"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"foo"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;"b"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"bar"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;"c"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"baz"&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;On a first appearance, this is a strengthened postcondition: &lt;code&gt;out.contains_keys([a, b, c]) =&amp;gt; out.contains_keys([a, b])&lt;/code&gt;. But now someone does this:&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="n"&gt;my_dict&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="s2"&gt;"c"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"blat"&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt; 
    &lt;span class="n"&gt;my_dict&lt;/span&gt; &lt;span class="o"&gt;|=&lt;/span&gt; &lt;span class="n"&gt;new&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="k"&gt;assert&lt;/span&gt; &lt;span class="n"&gt;my_dict&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;c&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="s2"&gt;"blat"&lt;/span&gt;
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;Oh no, their code now breaks! They saw &lt;code&gt;old&lt;/code&gt; had the postcondition "&lt;code&gt;out&lt;/code&gt; does NOT contain "c" as a key", and then wrote their code expecting that postcondition. In a sense, &lt;em&gt;any&lt;/em&gt; change the postcondition can potentially break &lt;em&gt;someone&lt;/em&gt;. "All observable behaviors of your system
    will be depended on by somebody", as &lt;a href="https://www.hyrumslaw.com/" target="_blank"&gt;Hyrum's Law&lt;/a&gt; puts it.&lt;/p&gt;
    &lt;p&gt;So we need to be explicit in what our postconditions actually are, and properties of the output that are not part of our explicit postconditions are subject to be violated on the next version. You'll break people's workflows but you also have grounds to say "I warned you".&lt;/p&gt;
    &lt;p&gt;Overall, Liskov and Wing did their work in the context of subtyping, but the principles are more widely applicable, certainly to more than just the use of inheritance.&lt;/p&gt;
    &lt;div class="footnote"&gt;
    &lt;hr/&gt;
    &lt;ol&gt;
    &lt;li id="fn:safety"&gt;
    &lt;p&gt;Though they restrict it to just &lt;a href="https://www.hillelwayne.com/post/safety-and-liveness/" target="_blank"&gt;safety properties&lt;/a&gt;. &lt;a class="footnote-backref" href="#fnref:safety" title="Jump back to footnote 1 in the text"&gt;↩&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id="fn:cocontra"&gt;
    &lt;p&gt;The paper lists a couple of other authors as introduce the idea of "contra/covariance rules", but part of being "off-the-cuff and casual" means not diving into every referenced paper. So they might have gotten the pre/postconditions thing from an earlier author, dunno for sure! &lt;a class="footnote-backref" href="#fnref:cocontra" title="Jump back to footnote 2 in the text"&gt;↩&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id="fn:refinement"&gt;
    &lt;p&gt;I &lt;em&gt;believe&lt;/em&gt; that this is equivalent to the formal methods notion of a &lt;a href="https://www.hillelwayne.com/post/refinement/" target="_blank"&gt;refinement&lt;/a&gt;. &lt;a class="footnote-backref" href="#fnref:refinement" title="Jump back to footnote 3 in the text"&gt;↩&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;/ol&gt;
    &lt;/div&gt;
    </description><pubDate>Tue, 06 Jan 2026 16:51:26 +0000</pubDate><guid>https://buttondown.com/hillelwayne/archive/the-liskov-substitution-principle-does-more-than/</guid></item><item><title>Some Fun Software Facts</title><link>https://buttondown.com/hillelwayne/archive/some-fun-software-facts/</link><description>
    &lt;p&gt;Last newsletter of the year!&lt;/p&gt;
    &lt;p&gt;First some news on &lt;em&gt;Logic for Programmers&lt;/em&gt;. Thanks to everyone who donated to the &lt;a href="https://buttondown.com/hillelwayne/archive/get-logic-for-programmers-50-off-support-chicago" target="_blank"&gt;feedchicago charity drive&lt;/a&gt;! In total we raised $2250 for Chicago food banks. Proof &lt;a href="https://link.fndrsp.net/CL0/https:%2F%2Fgiving.chicagosfoodbank.org%2Freceipts%2FBMDDDCAF%3FreceiptType=oneTime%26emailLog=YS699MZW/2/0100019ae2b7eb92-7c917ad0-c94e-4fe2-8ee1-1b9dc521c607-000000/brmxoTOvoJN94I9nQH26s7fRrmyFDj_Jir1FySSoxCw=434" target="_blank"&gt;here&lt;/a&gt;.&lt;/p&gt;
    &lt;p&gt;If you missed buying &lt;em&gt;Logic for Programmers&lt;/em&gt; real cheap in the charity drive, you can still get it for $10 off with the holiday code &lt;a href="https://leanpub.com/logic/c/hannukah-presents" target="_blank"&gt;hannukah-presents&lt;/a&gt;. This will last from now until the end of the year. After that, I'll be raising the price from $25 to $30.&lt;/p&gt;
    &lt;p&gt;Anyway, to make this more than just some record keeping, let's close out with something light. I'm one of those people who loves hearing "fun facts" about stuff. So here's some random fun facts I accumulated about software over the years:&lt;/p&gt;
    &lt;ul&gt;
    &lt;li&gt;In 2017, a team of eight+ programmers &lt;a href="https://codegolf.stackexchange.com/questions/11880/build-a-working-game-of-tetris-in-conways-game-of-life" target="_blank"&gt;successfully implemented Tetris&lt;/a&gt; as a &lt;a href="https://en.wikipedia.org/wiki/Conway's_Game_of_Life" target="_blank"&gt;game of life simulation&lt;/a&gt;. The GoL grid had an area of 30 trillion pixels and implemented a full programmable CPU as part of the project.&lt;/li&gt;
    &lt;/ul&gt;
    &lt;ul&gt;
    &lt;li&gt;Computer systems have to deal with leap seconds in order to keep UTC (where one day is 86,400 seconds) in sync with UT1 (where one day is exactly one full earth rotation). The people in charge recently passed a resolution to abolish the leap second by 2035, letting UTC and UT1 slowly drift out of sync.&lt;/li&gt;
    &lt;/ul&gt;
    &lt;ul&gt;
    &lt;li&gt;&lt;a href="https://buttondown.com/hillelwayne/archive/vim-is-turing-complete/" target="_blank"&gt;Vim is Turing complete&lt;/a&gt;.&lt;/li&gt;
    &lt;/ul&gt;
    &lt;div class="subscribe-form"&gt;&lt;/div&gt;
    &lt;ul&gt;
    &lt;li&gt;The backslash character basically didn't exist in writing before 1930, and &lt;a href="http://dump.deadcodersociety.org/ascii.pdf" target="_blank"&gt;was only added to ASCII&lt;/a&gt; so mathematicians (and ALGOLists) could write &lt;code&gt;/\&lt;/code&gt; and &lt;code&gt;\/&lt;/code&gt;. It's popular use in computing stems entirely from being a useless key on the keyboard.&lt;/li&gt;
    &lt;/ul&gt;
    &lt;ul&gt;
    &lt;li&gt;&lt;a href="https://en.wikipedia.org/wiki/Galactic_algorithm" target="_blank"&gt;Galactic Algorithms&lt;/a&gt; are algorithms that are theoretically faster than algorithms we use, but only at scales that make them impractical. For example, matrix multiplication of NxN is &lt;a href="https://en.wikipedia.org/wiki/Strassen_algorithm" target="_blank"&gt;normally&lt;/a&gt; O(N^2.81). The &lt;a href="https://www-auth.cs.wisc.edu/lists/theory-reading/2009-December/pdfmN6UVeUiJ3.pdf" target="_blank"&gt;Coppersmith Winograd&lt;/a&gt; algorithm is O(N^2.38), but is so complex that it's vastly slower for even &lt;a href="https://mathoverflow.net/questions/1743/what-is-the-constant-of-the-coppersmith-winograd-matrix-multiplication-algorithm" target="_blank"&gt;10,000 x 10,000 matrices&lt;/a&gt;. It's still interesting in advancing our mathematical understanding of algorithms!&lt;/li&gt;
    &lt;/ul&gt;
    &lt;ul&gt;
    &lt;li&gt;Cloudflare generates random numbers by, in part, &lt;a href="https://www.cloudflare.com/learning/ssl/lava-lamp-encryption/" target="_blank"&gt;taking pictures of 100 lava lamps&lt;/a&gt;.&lt;/li&gt;
    &lt;/ul&gt;
    &lt;ul&gt;
    &lt;li&gt;Mergesort is older than bubblesort. Quicksort is slightly younger than bubblesort but older than the &lt;em&gt;term&lt;/em&gt; "bubblesort". Bubblesort, btw, &lt;a href="https://buttondown.com/hillelwayne/archive/when-would-you-ever-want-bubblesort/" target="_blank"&gt;does have some uses&lt;/a&gt;.&lt;/li&gt;
    &lt;/ul&gt;
    &lt;ul&gt;
    &lt;li&gt;Speaking of mergesort, most implementations of mergesort pre-2006 &lt;a href="https://research.google/blog/extra-extra-read-all-about-it-nearly-all-binary-searches-and-mergesorts-are-broken/" target="_blank"&gt;were broken&lt;/a&gt;. Basically the problem was that the "find the midpoint of a list" step &lt;em&gt;could&lt;/em&gt; overflow if the list was big enough. For C with 32-bit signed integers, "big enough" meant over a billion elements, which was why the bug went unnoticed for so long.&lt;/li&gt;
    &lt;/ul&gt;
    &lt;ul&gt;
    &lt;li&gt;&lt;a href="https://nibblestew.blogspot.com/2023/09/circles-do-not-exist.html" target="_blank"&gt;PDF's drawing model cannot render perfect circles&lt;/a&gt;.&lt;/li&gt;
    &lt;/ul&gt;
    &lt;ul&gt;
    &lt;li&gt;People make fun of how you have to flip USBs three times to get them into a computer, but there's supposed to be a guide: according to the standard, USBs are supposed to be inserted &lt;em&gt;logo-side up&lt;/em&gt;. Of course, this assumes that the port is right-side up, too, which is why USB-C is just symmetric. &lt;/li&gt;
    &lt;/ul&gt;
    &lt;ul&gt;
    &lt;li&gt;I was gonna write a fun fact about how all spreadsheet software treats 1900 as a leap year, as that was a bug in Lotus 1-2-3 and everybody preserved backwards compatibility. But I checked and Google sheets considers it a normal year. So I guess the fun fact is that things have changed!&lt;/li&gt;
    &lt;/ul&gt;
    &lt;ul&gt;
    &lt;li&gt;Speaking of spreadsheet errors, in 2020 &lt;a href="https://www.engadget.com/scientists-rename-genes-due-to-excel-151748790.html" target="_blank"&gt;biologists changed the official nomenclature&lt;/a&gt; of 27 genes because Excel kept parsing their names as dates. F.ex MARCH1 was renamed to MARCHF1 to avoid being parsed as "March 1st". Microsoft rolled out a fix for this... three years later.&lt;/li&gt;
    &lt;/ul&gt;
    &lt;ul&gt;
    &lt;li&gt;It is possible to encode any valid JavaScript program with just the characters &lt;code&gt;()+[]!&lt;/code&gt;. This encoding is called &lt;a href="https://en.wikipedia.org/wiki/JSFuck" target="_blank"&gt;JSFuck&lt;/a&gt; and was once used to distribute malware on &lt;a href="https://arstechnica.com/information-technology/2016/02/ebay-has-no-plans-to-fix-severe-bug-that-allows-malware-distribution/" target="_blank"&gt;Ebay&lt;/a&gt;.&lt;/li&gt;
    &lt;/ul&gt;
    &lt;p&gt;Happy holidays everyone, and see you in 2026!&lt;/p&gt;
    &lt;div class="footnote"&gt;
    &lt;hr/&gt;
    &lt;ol&gt;
    &lt;li id="fn:status"&gt;
    &lt;p&gt;Current status update: I'm finally getting line by line structural editing done and it's turning up lots of improvements, so I'm doing more rewrites than I expected to be doing. &lt;a class="footnote-backref" href="#fnref:status" title="Jump back to footnote 1 in the text"&gt;↩&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;/ol&gt;
    &lt;/div&gt;
    </description><pubDate>Wed, 10 Dec 2025 18:45:37 +0000</pubDate><guid>https://buttondown.com/hillelwayne/archive/some-fun-software-facts/</guid></item><item><title>One more week to the Logic for Programmers Food Drive</title><link>https://buttondown.com/hillelwayne/archive/one-more-week-to-the-logic-for-programmers-food/</link><description>
    &lt;p&gt;A couple of weeks ago I started a fundraiser for the &lt;a href="https://www.chicagosfoodbank.org/" target="_blank"&gt;Greater Chicago Food Depository&lt;/a&gt;: get &lt;a href="https://leanpub.com/logic/c/feedchicago" target="_blank"&gt;Logic for Programmers 50% off&lt;/a&gt; and all the royalties will go to charity.&lt;sup id="fnref:royalties"&gt;&lt;a class="footnote-ref" href="#fn:royalties"&gt;1&lt;/a&gt;&lt;/sup&gt; Since then, we've raised a bit over $1600. Y'all are great! &lt;/p&gt;
    &lt;p&gt;The fundraiser is going on until the end of November, so you still have one more week to get the book real cheap.&lt;/p&gt;
    &lt;p&gt;I feel a bit weird about doing two newsletter adverts without raw content, so here's a teaser from a old project I really need to get back to. &lt;a href="https://vorpus.org/blog/notes-on-structured-concurrency-or-go-statement-considered-harmful/#what-is-a-goto-statement-anyway" target="_blank"&gt;Notes on structured concurrency&lt;/a&gt; argues that old languages had a "old-testament fire-and-brimstone &lt;code&gt;goto&lt;/code&gt;" that could send control flow anywhere, like from the body of one function into the body of another function. This "wild goto", the article claims, what Dijkstra was railing against in &lt;a href="https://homepages.cwi.nl/~storm/teaching/reader/Dijkstra68.pdf" target="_blank"&gt;Go To Statement Considered Harmful&lt;/a&gt;, and that modern goto statements are much more limited, "tame" if you will, and wouldn't invoke Dijkstra's ire.&lt;/p&gt;
    &lt;p&gt;I've shared this historical fact about Dijkstra many times, but recently two &lt;a href="https://without.boats/blog/" target="_blank"&gt;separate&lt;/a&gt; &lt;a href="https://matklad.github.io/" target="_blank"&gt;people&lt;/a&gt; have told me it doesn't makes sense: Dijkstra used ALGOL-60, which &lt;em&gt;already had&lt;/em&gt; tame gotos. All of the problems he raises with &lt;code&gt;goto&lt;/code&gt; hold even for tame ones, none are exclusive to wild gotos. So &lt;/p&gt;
    &lt;p&gt;This got me looking to see which languages, if any, ever had the wild goto. I define this as any goto which lets you jump from outside to into a loop or function scope. Turns out, FORTRAN had tame gotos from the start, BASIC has wild gotos, and COBOL is a nonsense language intentionally designed to horrify me. I mean, look at this:&lt;/p&gt;
    &lt;p&gt;&lt;img alt="The COBOL ALTER statement, which redefines a goto target" class="newsletter-image" src="https://assets.buttondown.email/images/e4dfa0fd-fdd5-4fef-b813-4053a183be2f.png?w=960&amp;amp;fit=max"/&gt;&lt;/p&gt;
    &lt;p&gt;The COBOL ALTER statement &lt;em&gt;changes a &lt;code&gt;goto&lt;/code&gt;'s target at runtime&lt;/em&gt;. &lt;/p&gt;
    &lt;p&gt;(Early COBOL has tame gotos but only on a technicality: there are no nested scopes in COBOL so no jumping from outside and into a nested scope.)&lt;/p&gt;
    &lt;p&gt;Anyway I need to write up the full story (and complain about COBOL more) but this is pretty neat! Reminder, &lt;a href="https://leanpub.com/logic/c/feedchicago" target="_blank"&gt;fundraiser here&lt;/a&gt;. Let's get it to 2k.&lt;/p&gt;
    &lt;div class="footnote"&gt;
    &lt;hr/&gt;
    &lt;ol&gt;
    &lt;li id="fn:royalties"&gt;
    &lt;p&gt;Royalties are 80% so if you already have the book you get a bit more bang for your buck by donating to the GCFD directly &lt;a class="footnote-backref" href="#fnref:royalties" title="Jump back to footnote 1 in the text"&gt;↩&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;/ol&gt;
    &lt;/div&gt;
    </description><pubDate>Mon, 24 Nov 2025 18:21:49 +0000</pubDate><guid>https://buttondown.com/hillelwayne/archive/one-more-week-to-the-logic-for-programmers-food/</guid></item><item><title>Get Logic for Programmers 50% off &amp; Support Chicago Foodbanks</title><link>https://buttondown.com/hillelwayne/archive/get-logic-for-programmers-50-off-support-chicago/</link><description>
    &lt;p&gt;From now until the end of the month, you can get &lt;a href="https://leanpub.com/logic/c/feedchicago" target="_blank"&gt;Logic for Programmers at half price&lt;/a&gt; with the coupon &lt;code&gt;feedchicago&lt;/code&gt;. All royalties from that coupon will go to the &lt;a href="https://www.chicagosfoodbank.org/" target="_blank"&gt;Greater Chicago Food Depository&lt;/a&gt;. Thank you!&lt;/p&gt;
    </description><pubDate>Mon, 10 Nov 2025 16:31:11 +0000</pubDate><guid>https://buttondown.com/hillelwayne/archive/get-logic-for-programmers-50-off-support-chicago/</guid></item><item><title>I'm taking a break</title><link>https://buttondown.com/hillelwayne/archive/im-taking-a-break/</link><description>
    &lt;p&gt;Hi everyone,&lt;/p&gt;
    &lt;p&gt;I've been getting burnt out on writing a weekly software essay. It's gone from taking me an afternoon to write a post to taking two or three days, and that's made it really difficult to get other writing done. That, plus some short-term work and life priorities, means now feels like a good time for a break. &lt;/p&gt;
    &lt;p&gt;So I'm taking off from &lt;em&gt;Computer Things&lt;/em&gt; for the rest of the year. There &lt;em&gt;might&lt;/em&gt; be some announcements and/or one or two short newsletters in the meantime but I won't be attempting a weekly cadence until 2026.&lt;/p&gt;
    &lt;p&gt;Thanks again for reading!&lt;/p&gt;
    &lt;p&gt;Hillel&lt;/p&gt;
    </description><pubDate>Mon, 27 Oct 2025 21:02:37 +0000</pubDate><guid>https://buttondown.com/hillelwayne/archive/im-taking-a-break/</guid></item><item><title>Modal editing is a weird historical contingency we have through sheer happenstance</title><link>https://buttondown.com/hillelwayne/archive/modal-editing-is-a-weird-historical-contingency/</link><description>
    &lt;p class="empty-line" style="height:16px; margin:0px !important;"&gt;&lt;/p&gt;
    &lt;p&gt;A while back my friend &lt;a href="https://morepablo.com/" target="_blank"&gt;Pablo Meier&lt;/a&gt; was reviewing some 2024 videogames and wrote &lt;a href="https://morepablo.com/2025/03/games-of-2024.html" target="_blank"&gt;this&lt;/a&gt;:&lt;/p&gt;
    &lt;blockquote&gt;
    &lt;p&gt;I feel like some artists, if they didn't exist, would have the resulting void filled in by someone similar (e.g. if Katy Perry didn't exist, someone like her would have). But others don't have successful imitators or comparisons (thinking Jackie Chan, or Weird Al): they are irreplaceable.  &lt;/p&gt;
    &lt;/blockquote&gt;
    &lt;p&gt;He was using it to describe auteurs but I see this as a property of opportunity, in that "replaceable" artists are those who work in bigger markets. Katy Perry's market is large, visible and obviously (but not &lt;em&gt;easily&lt;/em&gt;) exploitable, so there are a lot of people who'd compete in her niche. Weird Al's market is unclear: while there were successful parody songs in the past, it wasn't clear there was enough opportunity there to support a superstar.&lt;/p&gt;
    &lt;p&gt;I think that modal editing is in the latter category. Vim is now very popular and has spawned numerous successors. But its key feature, &lt;strong&gt;modes&lt;/strong&gt;, is not obviously-beneficial, to the point that if Bill Joy didn't make vi (vim's direct predecessor) fifty years ago I don't think we'd have any modal editors today. &lt;/p&gt;
    &lt;h3&gt;A quick overview of "modal editing"&lt;/h3&gt;
    &lt;p&gt;In a non-modal editor, pressing the "u" key adds a "u" to your text, as you'd expect. In a &lt;strong&gt;modal editor&lt;/strong&gt;, pressing "u" does something different depending on the "mode" you are in. In Vim's default "normal" mode, "u" undoes the last change to the text, while in the "visual" mode it lowercases all selected text. It only inserts the character in "insert" mode. All other keys, as well as chorded shortcuts (&lt;code&gt;ctrl-x&lt;/code&gt;), work the same way. &lt;/p&gt;
    &lt;p&gt;The clearest benefit to this is you can densely pack the keyboard with advanced commands. The standard US keyboard has 48ish keys dedicated to inserting characters. With the ctrl and shift modifiers that becomes at least ~150 extra shortcuts for each other mode. This is also what IMO "spiritually" distinguishes modal editing from contextual shortcuts. Even if a unimodal editor lets you change a keyboard shortcut's behavior based on languages or focused panel, without global user-controlled modes it simply can't achieve that density of shortcuts.&lt;/p&gt;
    &lt;p&gt;Now while modal editing today is widely beloved (the Vim plugin for &lt;a href="https://marketplace.visualstudio.com/items?itemName=vscodevim.vim" target="_blank"&gt;VSCode&lt;/a&gt; has at least eight million downloads), I suspect it was "carried" by the popularity of vi, as opposed to driving vi's popularity.&lt;/p&gt;
    &lt;p class="empty-line" style="height:16px; margin:0px !important;"&gt;&lt;/p&gt;
    &lt;h3&gt;Modal editing is an unusual idea&lt;/h3&gt;
    &lt;p&gt;Pre-vi editors weren't modal. Some, like &lt;a href="https://en.wikipedia.org/wiki/EDT_(Digital)" target="_blank"&gt;EDT/KED&lt;/a&gt;, used chorded commands, while others like &lt;a href="https://en.wikipedia.org/wiki/Ed_(software)" target="_blank"&gt;ed&lt;/a&gt; or &lt;a href="https://en.wikipedia.org/wiki/TECO_(text_editor)" target="_blank"&gt;TECO&lt;/a&gt; basically REPLs for text-editing DSLs. Both of these ideas widely reappear in modern editors.&lt;/p&gt;
    &lt;p&gt;As far as I can tell, the first modal editor was Butler Lampson's &lt;a href="https://en.wikipedia.org/wiki/Bravo_(editor)" target="_blank"&gt;Bravo&lt;/a&gt; in 1974. Bill Joy &lt;a href="https://web.archive.org/web/20120210184000/http://web.cecs.pdx.edu/~kirkenda/joy84.html" target="_blank"&gt;admits he used it for inspiration&lt;/a&gt;: &lt;/p&gt;
    &lt;blockquote&gt;
    &lt;p&gt;A lot of the ideas for the screen editing mode were stolen from a Bravo manual I surreptitiously looked at and copied. Dot is really the double-escape from Bravo, the redo command. Most of the stuff was stolen. &lt;/p&gt;
    &lt;/blockquote&gt;
    &lt;p&gt;Bill Joy probably took the idea because he was working on &lt;a href="https://en.wikipedia.org/wiki/ADM-3A" target="_blank"&gt;dumb terminals&lt;/a&gt; that were slow to register keystrokes, which put pressure to minimize the number needed for complex operations.&lt;/p&gt;
    &lt;p&gt;Why did Bravo have modal editing? Looking at the &lt;a href="https://www.microsoft.com/en-us/research/wp-content/uploads/2016/11/15a-AltoHandbook.pdf" target="_blank"&gt;Alto handbook&lt;/a&gt;, I get the impression that Xerox was trying to figure out the best mouse and GUI workflows. Bravo was an experiment with modes, one hand on the mouse and one issuing commands on the keyboard. Other experiments included context menus (the Markup program) and toolbars (Draw).&lt;/p&gt;
    &lt;p&gt;Xerox very quickly decided &lt;em&gt;against&lt;/em&gt; modes, as the successors &lt;a href="http://www.bitsavers.org/pdf/xerox/alto/memos_1975/Gypsy_The_Ginn_Typescript_System_Apr75.pdf" target="_blank"&gt;Gypsy&lt;/a&gt; and &lt;a href="http://www.bitsavers.org/pdf/xerox/alto/BravoXMan.pdf" target="_blank"&gt;BravoX&lt;/a&gt; were modeless. Commands originally assigned to English letters were moved to graphical menus, special keys, and chords. &lt;/p&gt;
    &lt;p&gt;It seems to me that modes started as an unsuccessful experiment deal with a specific constraint and then later successfully adopted to deal with a different constraint. It was a specialized feature as opposed to a generally useful feature like chords.&lt;/p&gt;
    &lt;h3&gt;Modal editing didn't popularize vi&lt;/h3&gt;
    &lt;p&gt;While vi was popular at Bill Joy's coworkers, he doesn't &lt;a href="https://web.archive.org/web/20120210184000/http://web.cecs.pdx.edu/~kirkenda/joy84.html" target="_blank"&gt;attribute its success to its features&lt;/a&gt;:&lt;/p&gt;
    &lt;blockquote&gt;
    &lt;p&gt;I think the wonderful thing about vi is that it has such a good market share because we gave it away. Everybody has it now. So it actually had a chance to become part of what is perceived as basic UNIX. EMACS is a nice editor too, but because it costs hundreds of dollars, there will always be people who won't buy it. &lt;/p&gt;
    &lt;/blockquote&gt;
    &lt;p&gt;Vi was distributed for free with the popular &lt;a href="https://en.wikipedia.org/wiki/Berkeley_Software_Distribution" target="_blank"&gt;BSD Unix&lt;/a&gt; and was standardized in &lt;a href="https://pubs.opengroup.org/onlinepubs/9799919799/" target="_blank"&gt;POSIX Issue 2&lt;/a&gt;, meaning all Unix OSes had to have vi. That arguably is what made it popular, and why so many people ended up learning a modal editor. &lt;/p&gt;
    &lt;h3&gt;Modal editing doesn't really spread outside of vim&lt;/h3&gt;
    &lt;div class="subscribe-form"&gt;&lt;/div&gt;
    &lt;p&gt;I think by the 90s, people started believing that modal editing was a Good Idea, if not an obvious one. That's why we see direct descendants of vi, most famously vim. It's also why extensible editors like Emacs and VSCode have vim-mode extensions, but these are but these are always simple emulation layers on top of a unimodal baselines. This was good for getting people used to the vim keybindings (I learned on &lt;a href="https://en.wikipedia.org/wiki/Kile" target="_blank"&gt;Kile&lt;/a&gt;) but it means people weren't really &lt;em&gt;doing&lt;/em&gt; anything with modal editing. It was always "The Vim Gimmick".&lt;/p&gt;
    &lt;p&gt;Modes also didn't take off anywhere else. There's no modal word processor, spreadsheet editor, or email client.&lt;sup id="fnref:gmail"&gt;&lt;a class="footnote-ref" href="#fn:gmail"&gt;1&lt;/a&gt;&lt;/sup&gt; &lt;a href="https://www.visidata.org/" target="_blank"&gt;Visidata&lt;/a&gt; is an extremely cool modal data exploration tool but it's pretty niche. Firefox used to have &lt;a href="https://en.wikipedia.org/wiki/Vimperator" target="_blank"&gt;vimperator&lt;/a&gt; (which was inspired by Vim) but that's defunct now. Modal software means modal editing which means vi.&lt;/p&gt;
    &lt;p&gt;This has been changing a little, though! Nowadays we do see new modal text editors, like &lt;a href="https://kakoune.org/" target="_blank"&gt;kakoune&lt;/a&gt; and &lt;a href="https://helix-editor.com/" target="_blank"&gt;Helix&lt;/a&gt;, that don't just try to emulate vi but do entirely new things. These were made, though, in response to perceived shortcomings in vi's editing model. I think they are still classifiable as descendants. If vi never existed, would the developers of kak and helix have still made modal editors, or would they have explored different ideas? &lt;/p&gt;
    &lt;h3&gt;People aren't clamouring for more experiments&lt;/h3&gt;
    &lt;p&gt;Not too related to the overall picture, but a gripe of mine. Vi and vim have a set of hardcoded modes, and adding an entirely new mode is impossible. Like if a plugin (like vim's default &lt;code&gt;netrw&lt;/code&gt;) adds a file explorer it should be able to add a filesystem mode, right? But it can't, so instead it waits for you to open the filesystem and then &lt;a href="https://github.com/vim/vim/blob/0124320c97b0fbbb44613f42fc1c34fee6181fc8/runtime/pack/dist/opt/netrw/autoload/netrw.vim#L4867" target="_blank"&gt;adds 60 new mappings to normal mode&lt;/a&gt;. There's no way to properly add a "filesystem" mode, a "diff" mode, a "git" mode, etc, so plugin developers have to &lt;a href="https://www.hillelwayne.com/post/software-mimicry/" target="_blank"&gt;mimic&lt;/a&gt; them.&lt;/p&gt;
    &lt;p&gt;I don't think people see this as a problem, though! Neovim, which aims to fix all of the baggage in vim's legacy, didn't consider creating modes an important feature. Kak and Helix, which reimagine modal editing from from the ground up, don't support creating modes either.&lt;sup id="fnref:helix"&gt;&lt;a class="footnote-ref" href="#fn:helix"&gt;2&lt;/a&gt;&lt;/sup&gt; People aren't clamouring for new modes!&lt;/p&gt;
    &lt;h2&gt;Modes are a niche power user feature&lt;/h2&gt;
    &lt;p&gt;So far I've been trying to show that vi is, in Pablo's words, "irreplaceable". Editors weren't doing modal editing before Bravo, and even after vi became incredibly popular, unrelated editors did not adapt modal editing. At most, they got a vi emulation layer. Kak and helix complicate this story but I don't think they refute it; they appear much later and arguably count as descendants (so are related). &lt;/p&gt;
    &lt;p&gt;I think the best explanation is that in a vacuum modal editing sounds like a bad idea. The mode is global state that users always have to know, which makes it dangerous. To use new modes well you have to memorize all of the keybindings, which makes it difficult. Modal editing has a brutal skill floor before it becomes more efficient than a unimodal, chorded editor like VSCode.&lt;/p&gt;
    &lt;p&gt;That's why it originally appears in very specific circumstances, as early experiments in mouse UX and as a way of dealing with modem latencies. The fact we have vim today is a historical accident. &lt;/p&gt;
    &lt;p&gt;And I'm glad for it! You can pry Neovim from my cold dead hands, you monsters.&lt;/p&gt;
    &lt;hr/&gt;
    &lt;h1&gt;&lt;a href="https://www.p99conf.io/" target="_blank"&gt;P99 talk this Thursday&lt;/a&gt;!&lt;/h1&gt;
    &lt;p&gt;My talk, "Designing Low-Latency Systems with TLA+", is happening 10/23 at 11:40 central time. Tickets are free, the conf is online, and the talk's only 16 minutes, so come check it out!&lt;/p&gt;
    &lt;div class="footnote"&gt;
    &lt;hr/&gt;
    &lt;ol&gt;
    &lt;li id="fn:gmail"&gt;
    &lt;p&gt;I guess if you squint &lt;a href="https://support.google.com/mail/answer/6594?hl=en&amp;amp;co=GENIE.Platform%3DDesktop" target="_blank"&gt;gmail kinda counts&lt;/a&gt; but it's basically an antifeature &lt;a class="footnote-backref" href="#fnref:gmail" title="Jump back to footnote 1 in the text"&gt;↩&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id="fn:helix"&gt;
    &lt;p&gt;It looks like Helix supports &lt;a href="https://docs.helix-editor.com/remapping.html" target="_blank"&gt;creating minor modes&lt;/a&gt;, but these are only active for one keystroke, making them akin to a better, more ergonomic version of vim multikey mappings. &lt;a class="footnote-backref" href="#fnref:helix" title="Jump back to footnote 2 in the text"&gt;↩&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;/ol&gt;
    &lt;/div&gt;
    </description><pubDate>Tue, 21 Oct 2025 16:46:24 +0000</pubDate><guid>https://buttondown.com/hillelwayne/archive/modal-editing-is-a-weird-historical-contingency/</guid></item><item><title>The Phase Change</title><link>https://buttondown.com/hillelwayne/archive/the-phase-change/</link><description>
    &lt;p&gt;Last week I ran my first 10k.&lt;/p&gt;
    &lt;p&gt;It wasn't a race or anything. I left that evening planning to run a 5k, and then three miles later thought "what if I kept going?"&lt;sup id="fnref:distance"&gt;&lt;a class="footnote-ref" href="#fn:distance"&gt;1&lt;/a&gt;&lt;/sup&gt;&lt;/p&gt;
    &lt;p&gt;I've been running for just over two years now. My goal was to run a mile, then three, then three at a pace faster than a power-walk. I wish I could say that I then found joy in running, but really I was just mad at myself for being so bad at it. Spite has always been my brightest muse.&lt;/p&gt;
    &lt;p&gt;Looking back, the thing I find most fascinating is what progress looked like. I couldn't tell you if I was physically progressing steadily, but for sure mental progress moved in discrete jumps. For a long time a 5k was me pushing myself, then suddenly a "phase change" happens and it becomes something I can just do on a run. Sometime in the future the 10k will feel the same way.&lt;/p&gt;
    &lt;p&gt;I've noticed this in a lot of other places. For every skill I know, my sense of myself follows a phase change. In every programming language I've ever learned, I lurch from "bad" to "okay" to "good". There's no "20% bad / 80% normal" in between. Pedagogical experts say that learning is about steadily building a &lt;a href="https://teachtogether.tech/en/index.html#s:models" target="_blank"&gt;mental model&lt;/a&gt; of the topic. It really feels like knowledge grows continuously, and then it suddenly becomes a model.&lt;/p&gt;
    &lt;p&gt;Now, for all the time I spend writing about software history and software theory and stuff, my actually job boils down to &lt;a href="https://www.hillelwayne.com/consulting/" target="_blank"&gt;teaching formal methods&lt;/a&gt;. So I now have two questions about phase changes.&lt;/p&gt;
    &lt;p&gt;The first is "can we make phase changes happen faster?" I don't know if this is even possible! I've found lots of ways to teach concepts faster, cover more ground in less time, so that people know the material more quickly. But it doesn't seem to speed up that very first phase change from "this is foreign" to "this is normal". Maybe we can't really do that until we've spent enough effort on understanding.&lt;/p&gt;
    &lt;p&gt;So the second may be more productive: "can we motivate people to keep going until the phase change?" This is a lot easier to tackle! For example, removing frustration makes a huge difference. Getting a proper pair of running shoes made running so much less unpleasant, and made me more willing to keep putting in the hours. For teaching tech topics like formal methods, this often takes the form of better tooling and troubleshooting info.&lt;/p&gt;
    &lt;p&gt;We can also reduce the effort of investing time. This is also why I prefer to pair on writing specifications with clients and not just write specs for them. It's more work for them than fobbing it all off on me, but a whole lot &lt;em&gt;less&lt;/em&gt; work than writing the spec by themselves, so they'll put in time and gradually develop skills on their own.&lt;/p&gt;
    &lt;p&gt;Question two seems much more fruitful than question one but also so much less interesting! Speeding up the phase change feels like the kind of dream that empires are built on. I know I'm going to keep obsessing over it, even if that leads nowhere.&lt;/p&gt;
    &lt;div class="footnote"&gt;
    &lt;hr/&gt;
    &lt;ol&gt;
    &lt;li id="fn:distance"&gt;
    &lt;p&gt;For non-running Americans: 5km is about 3.1 miles, and 10km is 6.2. &lt;a class="footnote-backref" href="#fnref:distance" title="Jump back to footnote 1 in the text"&gt;↩&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;/ol&gt;
    &lt;/div&gt;
    </description><pubDate>Thu, 16 Oct 2025 14:59:25 +0000</pubDate><guid>https://buttondown.com/hillelwayne/archive/the-phase-change/</guid></item><item><title>Three ways formally verified code can go wrong in practice</title><link>https://buttondown.com/hillelwayne/archive/three-ways-formally-verified-code-can-go-wrong-in/</link><description>
    &lt;h3&gt;New Logic for Programmers Release!&lt;/h3&gt;
    &lt;p&gt;&lt;a href="https://leanpub.com/logic/" rel="noopener noreferrer nofollow" target="_blank"&gt;v0.12 is now available&lt;/a&gt;! This should be the last major content release. The next few months are going to be technical review, copyediting and polishing, with a hopeful 1.0 release in March. &lt;a href="https://github.com/logicforprogrammers/book-assets/blob/master/CHANGELOG.md" rel="noopener noreferrer nofollow" target="_blank"&gt;Full release notes here&lt;/a&gt;.&lt;/p&gt;
    &lt;figure&gt;&lt;img alt="Cover of the boooooook" draggable="false" src="https://assets.buttondown.email/images/92b4a35d-2bdd-416a-92c7-15ff42b49d8d.jpg?w=960&amp;amp;fit=max"/&gt;&lt;figcaption&gt;&lt;/figcaption&gt;&lt;/figure&gt;
    &lt;p class="empty-line" style="height:16px; margin:0px !important;"&gt;&lt;/p&gt;
    &lt;h1&gt;Three ways formally verified code can go wrong in practice&lt;/h1&gt;
    &lt;p&gt;I run this small project called &lt;a href="https://github.com/hwayne/lets-prove-leftpad" rel="noopener noreferrer nofollow" target="_blank"&gt;Let's Prove Leftpad&lt;/a&gt;, where people submit formally verified proofs of the &lt;a href="https://en.wikipedia.org/wiki/Npm_left-pad_incident" rel="noopener noreferrer nofollow" target="_blank"&gt;eponymous meme&lt;/a&gt;. Recently I read &lt;a href="https://lukeplant.me.uk/blog/posts/breaking-provably-correct-leftpad/" rel="noopener noreferrer nofollow" target="_blank"&gt;Breaking “provably correct” Leftpad&lt;/a&gt;, which argued that most (if not all) of the provably correct leftpads have bugs! The lean proof, for example, &lt;em&gt;should&lt;/em&gt; render &lt;code&gt;leftpad('-', 9, אֳֽ֑)&lt;/code&gt; as &lt;code&gt;---------אֳֽ֑&lt;/code&gt;, but actually does &lt;code&gt;------אֳֽ֑&lt;/code&gt;.&lt;/p&gt;
    &lt;p&gt;You can read the article for a good explanation of why this goes wrong (Unicode). The actual problem is that correct can mean two different things, and this leads to confusion about how much formal methods can actually guarantee us. So I see this as a great opportunity to talk about the nature of proof, correctness, and how "correct" code can still have bugs.&lt;/p&gt;
    &lt;h2&gt;What we talk about when we talk about correctness&lt;/h2&gt;
    &lt;p&gt;In most of the real world, correct means "no bugs". Except "bugs" isn't a very clear category. A bug is anything that causes someone to say "this isn't working right, there's a bug." Being too slow is a bug, a typo is a bug, etc. "correct" is a little fuzzy.&lt;/p&gt;
    &lt;p&gt;In formal methods, "correct" has a very specific and precise meaning: the code conforms to a &lt;strong&gt;specification&lt;/strong&gt; (or "spec"). The spec is a higher-level description of what is supposed the code's properties, usually something we can't just directly implement. Let's look at the most popular kind of proven specification:&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="c1"&gt;-- Haskell&lt;/span&gt;
    &lt;span class="nf"&gt;inc&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="ow"&gt;::&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kt"&gt;Int&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;-&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;gt&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kt"&gt;Int&lt;/span&gt;
    &lt;span class="nf"&gt;inc&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="ow"&gt;=&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;+&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;The type signature &lt;code&gt;Int -&amp;gt; Int&lt;/code&gt; is a specification! It corresponds to the logical statement &lt;code&gt;all x in Int: inc(x) in Int&lt;/code&gt;. The Haskell type checker can automatically verify this for us. It cannot, however, verify properties like &lt;code&gt;all x in Int: inc(x) &amp;gt; x&lt;/code&gt;. Formal verification is concerned with verifying arbitrary properties beyond what is (easily) automatically verifiable. Most often, this takes the form of proof. A human manually writes a proof that the code conforms to its specification, and the prover checks that the proof is correct.&lt;/p&gt;
    &lt;p&gt;Even if we have a proof of "correctness", though, there's a few different ways the code can still have bugs.&lt;/p&gt;
    &lt;h3&gt;1. The proof is invalid&lt;/h3&gt;
    &lt;p&gt;For some reason the proof doesn't actually show the code matches the specification. This is pretty common in pencil-and-paper verification, where the proof is checked by someone saying "yep looks good to me". It's much rarer when doing formal verification but it can still happen in a couple of specific cases:&lt;/p&gt;
    &lt;ol&gt;&lt;li&gt;&lt;p&gt;The theorem prover itself has a bug (in the code or introduced in the compiled binary) that makes it accept an incorrect proof. This is something people are really concerned about but it's so much rarer than every other way verified code goes wrong, so is only included for completeness.&lt;/p&gt;&lt;/li&gt;&lt;li&gt;&lt;p&gt;For convenience, most provers and FM languages have an "just accept this statement is true" feature. This helps you work on the big picture proof and fill in the details later. If you leave in a shortcut, &lt;em&gt;and&lt;/em&gt; the compiler is configured to allow code-with-proof-assumptions to compile, &lt;em&gt;then&lt;/em&gt; you can compile incorrect code that "passes the proof checker". You really should know better, though.&lt;/p&gt;&lt;/li&gt;&lt;/ol&gt;
    &lt;div class="subscribe-form"&gt;&lt;/div&gt;
    &lt;h3&gt;2. The properties are wrong&lt;/h3&gt;
    &lt;blockquote&gt;&lt;figure&gt;&lt;img alt="The horrible bug you had wasn't covered in the specification/came from some other module/etc" draggable="false" src="https://cdn.prod.website-files.com/673b407e535dbf3b547179ff/681ca0bf4a045f39f785faeb_AD_4nXfFhdn6DGmgLAcmaUNHl9a3Nog8gH8Hluve5Kof7zLk4CyOlD4zCmCqVJaowKqu-pTicwZ393jE7anIrjYZTSuRvGiYhFhAkkX9vifNt9vEWYwZUp65hsbrRTmZzRgb9vgu7n7buA.png"/&gt;&lt;figcaption&gt;&lt;/figcaption&gt;&lt;/figure&gt;&lt;p&gt;&lt;a href="https://www.galois.com/articles/what-works-and-doesnt-selling-formal-methods" rel="noopener noreferrer nofollow" target="_blank"&gt;Galois&lt;/a&gt;&lt;/p&gt;&lt;/blockquote&gt;
    &lt;p&gt;This code is provably correct:&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="nf"&gt;inc&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="ow"&gt;::&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kt"&gt;Int&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;-&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;gt&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kt"&gt;Int&lt;/span&gt;
    &lt;span class="nf"&gt;inc&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="ow"&gt;=&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;The only specification I've given is the type signature &lt;code&gt;Int -&amp;gt; Int&lt;/code&gt;. At no point did I put the property &lt;code&gt;inc(x) &amp;gt; x&lt;/code&gt; in my specification, so it doesn't matter that it doesn't hold, the code is still "correct".&lt;/p&gt;
    &lt;p&gt;This is what "went wrong" with the leftpad proofs. They do &lt;em&gt;not&lt;/em&gt; prove the property "&lt;code&gt;leftpad(c, n, s)&lt;/code&gt; will take up either &lt;code&gt;n&lt;/code&gt; spaces on the screen or however many characters &lt;code&gt;s&lt;/code&gt; takes up (if more than &lt;code&gt;n&lt;/code&gt;)". They prove the weaker property "&lt;code&gt;len(leftpad(c, n, s)) == max(n, len(s))&lt;/code&gt;, for however you want to define &lt;code&gt;len(string)&lt;/code&gt;". The second is a rough proxy for the first that works in most cases, but if someone really needs the former property they are liable to experience a bug.&lt;/p&gt;
    &lt;p&gt;Why don't we prove the stronger property? Sometimes it's because the code is meant to be used one way and people want to use it another way. This can lead to accusations that the developer is "misusing the provably correct code" but this should more often be seen as the verification expert failing to educate devs on was actually "proven".&lt;/p&gt;
    &lt;p&gt;Sometimes it's because the property is too hard to prove. "Outputs are visually aligned" is a proof about Unicode inputs, and the &lt;em&gt;core&lt;/em&gt; Unicode specification is &lt;a href="https://www.unicode.org/versions/Unicode17.0.0/UnicodeStandard-17.0.pdf" rel="noopener noreferrer nofollow" target="_blank"&gt;1,243 pages long&lt;/a&gt;.&lt;/p&gt;
    &lt;p&gt;Sometimes it's because the property we want is too hard to &lt;em&gt;express&lt;/em&gt;. How do you mathematically represent "people will perceive the output as being visually aligned"? Is it OS and font dependent? These two lines are exactly five characters but not visually aligned:&lt;/p&gt;
    &lt;blockquote&gt;&lt;p&gt;|||||&lt;/p&gt;&lt;p&gt;MMMMM&lt;/p&gt;&lt;/blockquote&gt;
    &lt;p&gt;Or maybe they are aligned for you! I don't know, lots of people read email in a monospace font. "We can't express the property" comes up a lot when dealing with human/business concepts as opposed to mathematical/computational ones.&lt;/p&gt;
    &lt;p&gt;Finally, there's just the possibility of a brain fart. All of the proofs in &lt;a href="https://research.google/blog/extra-extra-read-all-about-it-nearly-all-binary-searches-and-mergesorts-are-broken/" rel="noopener noreferrer nofollow" target="_blank"&gt;Nearly All Binary Searches and Mergesorts are Broken&lt;/a&gt; are like this. They (informally) proved the correctness of binary search with unbound integers, forgetting that many programming languages use &lt;em&gt;machine&lt;/em&gt; integers, where a large enough sum can overflow.&lt;/p&gt;
    &lt;h3&gt;3. The assumptions are wrong&lt;/h3&gt;
    &lt;p&gt;This is arguably the most important and most subtle source of bugs. Most properties we prove aren't "&lt;code&gt;X&lt;/code&gt; is always true". They are "&lt;em&gt;assuming&lt;/em&gt; &lt;code&gt;Y&lt;/code&gt; is true, &lt;code&gt;X&lt;/code&gt; is also true". Then if &lt;code&gt;Y&lt;/code&gt; is not true, the proof no longer guarantees &lt;code&gt;X&lt;/code&gt;. A good example of this is binary &lt;s&gt;sort&lt;/s&gt; &lt;em&gt;search&lt;/em&gt;, which only correctly finds elements &lt;em&gt;assuming&lt;/em&gt; the input list is sorted. If the list is not sorted, it will not work correctly.&lt;/p&gt;
    &lt;p&gt;Formal verification adds two more wrinkles. One: sometimes we need assumptions to make the property valid, but we can also add them to make the proof easier. So the code can be bug-free even if the assumptions used to verify it no longer hold! Even if a leftpad implements visual alignment for all Unicode glyphs, it will be a lot easier to &lt;em&gt;prove&lt;/em&gt; visual alignment for just ASCII strings and padding.&lt;/p&gt;
    &lt;p&gt;Two: we need make a lot of &lt;em&gt;environmental&lt;/em&gt; assumptions that are outside our control. Does the algorithm return output or use the stack? Need to assume that there's sufficient memory to store stuff. Does it use any variables? Need to assume nothing is concurrently modifying them. Does it use an external service? Need to assume the vendor doesn't change the API or response formats. You need to assume the compiler worked correctly, the hardware isn't faulty, and the OS doesn't mess with things, etc. Any of these could change well after the code is proven and deployed, meaning formal verification can't be a one-and-done thing.&lt;/p&gt;
    &lt;p&gt;You don't actually have to assume most of these, but each assumption drop makes the proof harder and the properties you can prove more restricted. Remember, the code might still be bug-free even if the environmental assumptions change, so there's a tradeoff in time spent proving vs doing other useful work.&lt;/p&gt;
    &lt;p&gt;Another common source of "assumptions" is when verified code depends on unverified code. The Rust compiler can prove that safe code doesn't have a memory bug &lt;em&gt;assuming&lt;/em&gt; unsafe code does not have one either, but depends on the human to confirm that assumption. &lt;a href="https://ucsd-progsys.github.io/liquidhaskell/" rel="noopener noreferrer nofollow" target="_blank"&gt;Liquid Haskell&lt;/a&gt; is verifiable but can also call regular Haskell libraries, which are unverified. We need to assume that code is correct (in the "conforms to spec") sense, and if it's not, our proof can be "correct" and still cause bugs.&lt;/p&gt;
    &lt;hr/&gt;&lt;p&gt;These boundaries are fuzzy. I wrote that the "binary search" bug happened because they proved the wrong property, but you can just as well argue that it was a broken assumption (that integers could not overflow). What really matters is having a clear understanding of what "this code is proven correct" actually &lt;em&gt;tells&lt;/em&gt; you. Where can you use it safely? When should you worry? How do you communicate all of this to your teammates?&lt;/p&gt;
    &lt;p&gt;Good lord it's already Friday&lt;/p&gt;
    </description><pubDate>Fri, 10 Oct 2025 17:06:19 +0000</pubDate><guid>https://buttondown.com/hillelwayne/archive/three-ways-formally-verified-code-can-go-wrong-in/</guid></item><item><title>New Blog Post: " A Very Early History of Algebraic Data Types"</title><link>https://buttondown.com/hillelwayne/archive/new-blog-post-a-very-early-history-of-algebraic/</link><description>
    &lt;p&gt;Last week I said that this week's newsletter would be a brief history of algebraic data types.&lt;/p&gt;
    &lt;p&gt;I was wrong.&lt;/p&gt;
    &lt;p&gt;That history is now a &lt;a href="https://www.hillelwayne.com/post/algdt-history/" target="_blank"&gt;3500 word blog post&lt;/a&gt;.&lt;/p&gt;
    &lt;p&gt;&lt;a href="https://www.patreon.com/posts/blog-notes-very-139696324?utm_medium=clipboard_copy&amp;amp;utm_source=copyLink&amp;amp;utm_campaign=postshare_creator&amp;amp;utm_content=join_link" target="_blank"&gt;Patreon blog notes here&lt;/a&gt;.&lt;/p&gt;
    &lt;hr/&gt;
    &lt;h3&gt;I'm speaking at &lt;a href="https://www.p99conf.io/" target="_blank"&gt;P99 Conf&lt;/a&gt;!&lt;/h3&gt;
    &lt;p&gt;My talk, "Designing Low-Latency Systems with TLA+", is happening 10/23 at 11:30 central time. It's an online conf and the talk's only 16 minutes, so come check it out!&lt;/p&gt;
    </description><pubDate>Thu, 25 Sep 2025 16:50:58 +0000</pubDate><guid>https://buttondown.com/hillelwayne/archive/new-blog-post-a-very-early-history-of-algebraic/</guid></item><item><title>Many Hard Leetcode Problems are Easy Constraint Problems</title><link>https://buttondown.com/hillelwayne/archive/many-hard-leetcode-problems-are-easy-constraint/</link><description>
    &lt;p&gt;In my first interview out of college I was asked the change counter problem:&lt;/p&gt;
    &lt;blockquote&gt;
    &lt;p&gt;Given a set of coin denominations, find the minimum number of coins required to make change for a given number. IE for USA coinage and 37 cents, the minimum number is four (quarter, dime, 2 pennies).&lt;/p&gt;
    &lt;/blockquote&gt;
    &lt;p&gt;I implemented the simple greedy algorithm and immediately fell into the trap of the question: the greedy algorithm only works for "well-behaved" denominations. If the coin values were &lt;code&gt;[10, 9, 1]&lt;/code&gt;, then making 37 cents would take 10 coins in the greedy algorithm but only 4 coins optimally (&lt;code&gt;10+9+9+9&lt;/code&gt;). The "smart" answer is to use a dynamic programming algorithm, which I didn't know how to do. So I failed the interview.&lt;/p&gt;
    &lt;p&gt;But you only need dynamic programming if you're writing your own algorithm. It's really easy if you throw it into a constraint solver like &lt;a href="https://www.minizinc.org/" target="_blank"&gt;MiniZinc&lt;/a&gt; and call it a day. &lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;int: total;
    array[int] of int: values = [10, 9, 1];
    array[index_set(values)] of var 0..: coins;
    
    constraint sum (c in index_set(coins)) (coins[c] * values[c]) == total;
    solve minimize sum(coins);
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;You can try this online &lt;a href="https://play.minizinc.dev/" target="_blank"&gt;here&lt;/a&gt;. It'll give you a prompt to put in &lt;code&gt;total&lt;/code&gt; and then give you successively-better solutions:&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;coins = [0, 0, 37];
    ----------
    coins = [0, 1, 28];
    ----------
    coins = [0, 2, 19];
    ----------
    coins = [0, 3, 10];
    ----------
    coins = [0, 4, 1];
    ----------
    coins = [1, 3, 0];
    ----------
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p class="empty-line" style="height:16px; margin:0px !important;"&gt;&lt;/p&gt;
    &lt;p&gt;Lots of similar interview questions are this kind of mathematical optimization problem, where we have to find the maximum or minimum of a function corresponding to constraints. They're hard in programming languages because programming languages are too low-level. They are also exactly the problems that constraint solvers were designed to solve. Hard leetcode problems are easy constraint problems.&lt;sup id="fnref:leetcode"&gt;&lt;a class="footnote-ref" href="#fn:leetcode"&gt;1&lt;/a&gt;&lt;/sup&gt; Here I'm using MiniZinc, but you could just as easily use Z3 or OR-Tools or whatever your favorite generalized solver is.&lt;/p&gt;
    &lt;h3&gt;More examples&lt;/h3&gt;
    &lt;p&gt;This was a question in a different interview (which I thankfully passed):&lt;/p&gt;
    &lt;blockquote&gt;
    &lt;p&gt;Given a list of stock prices through the day, find maximum profit you can get by buying one stock and selling one stock later.&lt;/p&gt;
    &lt;/blockquote&gt;
    &lt;p&gt;It's easy to do in O(n^2) time, or if you are clever, you can do it in O(n). Or you could be not clever at all and just write it as a constraint problem:&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;array[int] of int: prices = [3, 1, 4, 1, 5, 9, 2, 6, 5, 3, 5, 8];
    var int: buy;
    var int: sell;
    var int: profit = prices[sell] - prices[buy];
    
    constraint sell &amp;gt; buy;
    constraint profit &amp;gt; 0;
    solve maximize profit;
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;Reminder, link to trying it online &lt;a href="https://play.minizinc.dev/" target="_blank"&gt;here&lt;/a&gt;. While working at that job, one interview question we tested out was:&lt;/p&gt;
    &lt;blockquote&gt;
    &lt;p&gt;Given a list, determine if three numbers in that list can be added or subtracted to give 0? &lt;/p&gt;
    &lt;/blockquote&gt;
    &lt;p&gt;This is a satisfaction problem, not a constraint problem: we don't need the "best answer", any answer will do. We eventually decided against it for being too tricky for the engineers we were targeting. But it's not tricky in a solver; &lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;include "globals.mzn";
    array[int] of int: numbers = [3, 1, 4, 1, 5, 9, 2, 6, 5, 3, 5, 8];
    array[index_set(numbers)] of var {0, -1, 1}: choices;
    
    constraint sum(n in index_set(numbers)) (numbers[n] * choices[n]) = 0;
    constraint count(choices, -1) + count(choices, 1) = 3;
    solve satisfy;
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;Okay, one last one, a problem I saw last year at &lt;a href="https://chicagopython.github.io/algosig/" target="_blank"&gt;Chipy AlgoSIG&lt;/a&gt;. Basically they pick some leetcode problems and we all do them. I failed to solve &lt;a href="https://leetcode.com/problems/largest-rectangle-in-histogram/description/" target="_blank"&gt;this one&lt;/a&gt;:&lt;/p&gt;
    &lt;blockquote&gt;
    &lt;p&gt;Given an array of integers heights representing the histogram's bar height where the width of each bar is 1, return the area of the largest rectangle in the histogram.&lt;/p&gt;
    &lt;p&gt;&lt;img alt="example from leetcode link" class="newsletter-image" src="https://assets.buttondown.email/images/63337f78-7138-4b21-87a0-917c0c5b1706.jpg?w=960&amp;amp;fit=max"/&gt;&lt;/p&gt;
    &lt;/blockquote&gt;
    &lt;p&gt;The "proper" solution is a tricky thing involving tracking lots of bookkeeping states, which you can completely bypass by expressing it as constraints:&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;array[int] of int: numbers = [2,1,5,6,2,3];
    
    var 1..length(numbers): x; 
    var 1..length(numbers): dx;
    var 1..: y;
    
    constraint x + dx &amp;lt;= length(numbers);
    constraint forall (i in x..(x+dx)) (y &amp;lt;= numbers[i]);
    
    var int: area = (dx+1)*y;
    solve maximize area;
    
    output ["(\(x)-&amp;gt;\(x+dx))*\(y) = \(area)"]
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;There's even a way to &lt;a href="https://docs.minizinc.dev/en/2.9.3/visualisation.html" target="_blank"&gt;automatically visualize the solution&lt;/a&gt; (using &lt;code&gt;vis_geost_2d&lt;/code&gt;), but I didn't feel like figuring it out in time for the newsletter.&lt;/p&gt;
    &lt;h3&gt;Is this better?&lt;/h3&gt;
    &lt;p&gt;Now if I actually brought these questions to an interview the interviewee could ruin my day by asking "what's the runtime complexity?" Constraint solvers runtimes are unpredictable and almost always slower than an ideal bespoke algorithm because they are more expressive, in what I refer to as the &lt;a href="https://buttondown.com/hillelwayne/archive/the-capability-tractability-tradeoff/" target="_blank"&gt;capability/tractability tradeoff&lt;/a&gt;. But even so, they'll do way better than a &lt;em&gt;bad&lt;/em&gt; bespoke algorithm, and I'm not experienced enough in handwriting algorithms to consistently beat a solver.&lt;/p&gt;
    &lt;p&gt;The real advantage of solvers, though, is how well they handle new constraints. Take the stock picking problem above. I can write an O(n²) algorithm in a few minutes and the O(n) algorithm if you give me some time to think. Now change the problem to&lt;/p&gt;
    &lt;blockquote&gt;
    &lt;p&gt;Maximize the profit by buying and selling up to &lt;code&gt;max_sales&lt;/code&gt; stocks, but you can only buy or sell one stock at a given time and you can only hold up to &lt;code&gt;max_hold&lt;/code&gt; stocks at a time?&lt;/p&gt;
    &lt;/blockquote&gt;
    &lt;p&gt;That's a way harder problem to write even an inefficient algorithm for! While the constraint problem is only a tiny bit more complicated:&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;include "globals.mzn";
    int: max_sales = 3;
    int: max_hold = 2;
    array[int] of int: prices = [3, 1, 4, 1, 5, 9, 2, 6, 5, 3, 5, 8];
    array [1..max_sales] of var int: buy;
    array [1..max_sales] of var int: sell;
    array [index_set(prices)] of var 0..max_hold: stocks_held;
    var int: profit = sum(s in 1..max_sales) (prices[sell[s]] - prices[buy[s]]);
    
    constraint forall (s in 1..max_sales) (sell[s] &amp;gt; buy[s]);
    constraint profit &amp;gt; 0;
    
    constraint forall(i in index_set(prices)) (stocks_held[i] = (count(s in 1..max_sales) (buy[s] &amp;lt;= i) - count(s in 1..max_sales) (sell[s] &amp;lt;= i)));
    constraint alldifferent(buy ++ sell);
    solve maximize profit;
    
    output ["buy at \(buy)\n", "sell at \(sell)\n", "for \(profit)"];
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p class="empty-line" style="height:16px; margin:0px !important;"&gt;&lt;/p&gt;
    &lt;p&gt;Most constraint solving examples online are puzzles, like &lt;a href="https://docs.minizinc.dev/en/stable/modelling2.html#ex-sudoku" target="_blank"&gt;Sudoku&lt;/a&gt; or "&lt;a href="https://docs.minizinc.dev/en/stable/modelling2.html#ex-smm" target="_blank"&gt;SEND + MORE = MONEY&lt;/a&gt;". Solving leetcode problems would be a more interesting demonstration. And you get more interesting opportunities to teach optimizations, like symmetry breaking.&lt;/p&gt;
    &lt;hr/&gt;
    &lt;h3&gt;Update for the Internet&lt;/h3&gt;
    &lt;p&gt;This was sent as a weekly newsletter, which is usually on topics like &lt;a href="https://buttondown.com/hillelwayne/archive/why-do-we-call-it-boilerplate-code" target="_blank"&gt;software history&lt;/a&gt;, &lt;a href="https://buttondown.com/hillelwayne/archive/the-seven-specification-ur-languages/" target="_blank"&gt;formal methods&lt;/a&gt;, &lt;a href="https://buttondown.com/hillelwayne/archive/i-formally-modeled-dreidel-for-no-good-reason/" target="_blank"&gt;unusual technologies&lt;/a&gt;, and the &lt;a href="https://buttondown.com/hillelwayne/archive/be-suspicious-of-success/" target="_blank"&gt;theory of software engineering&lt;/a&gt;. You can subscribe here: &lt;/p&gt;
    &lt;div class="subscribe-form"&gt;&lt;/div&gt;
    &lt;div class="footnote"&gt;
    &lt;hr/&gt;
    &lt;ol&gt;
    &lt;li id="fn:leetcode"&gt;
    &lt;p&gt;Because my dad will email me if I don't explain this: "leetcode" is slang for "tricky algorithmic interview questions that have little-to-no relevance in the actual job you're interviewing for." It's from &lt;a href="https://leetcode.com/" target="_blank"&gt;leetcode.com&lt;/a&gt;. &lt;a class="footnote-backref" href="#fnref:leetcode" title="Jump back to footnote 1 in the text"&gt;↩&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;/ol&gt;
    &lt;/div&gt;
    </description><pubDate>Wed, 10 Sep 2025 13:00:00 +0000</pubDate><guid>https://buttondown.com/hillelwayne/archive/many-hard-leetcode-problems-are-easy-constraint/</guid></item><item><title>The Angels and Demons of Nondeterminism</title><link>https://buttondown.com/hillelwayne/archive/the-angels-and-demons-of-nondeterminism/</link><description>
    &lt;p&gt;Greetings everyone! You might have noticed that it's September and I don't have the next version of &lt;em&gt;Logic for Programmers&lt;/em&gt; ready. As penance, &lt;a href="https://leanpub.com/logic/c/september-2025-kuBCrhBnUzb7" target="_blank"&gt;here's ten free copies of the book&lt;/a&gt;.&lt;/p&gt;
    &lt;p&gt;So a few months ago I wrote &lt;a href="https://buttondown.com/hillelwayne/archive/five-kinds-of-nondeterminism/" target="_blank"&gt;a newsletter&lt;/a&gt; about how we use nondeterminism in formal methods.  The overarching idea:&lt;/p&gt;
    &lt;ol&gt;
    &lt;li&gt;Nondeterminism is when multiple paths are possible from a starting state.&lt;/li&gt;
    &lt;li&gt;A system preserves a property if it holds on &lt;em&gt;all&lt;/em&gt; possible paths. If even one path violates the property, then we have a bug.&lt;/li&gt;
    &lt;/ol&gt;
    &lt;p&gt;An intuitive model of this is that for this is that when faced with a nondeterministic choice, the system always makes the &lt;em&gt;worst possible choice&lt;/em&gt;. This is sometimes called &lt;strong&gt;demonic nondeterminism&lt;/strong&gt; and is favored in formal methods because we are paranoid to a fault.&lt;/p&gt;
    &lt;p&gt;The opposite would be &lt;strong&gt;angelic nondeterminism&lt;/strong&gt;, where the system always makes the &lt;em&gt;best possible choice&lt;/em&gt;. A property then holds if &lt;em&gt;any&lt;/em&gt; possible path satisfies that property.&lt;sup id="fnref:duals"&gt;&lt;a class="footnote-ref" href="#fn:duals"&gt;1&lt;/a&gt;&lt;/sup&gt; This is not as common in FM, but it still has its uses! "Players can access the secret level" or "&lt;a href="https://www.hillelwayne.com/post/safety-and-liveness/#other-properties" target="_blank"&gt;We can always shut down the computer&lt;/a&gt;" are &lt;strong&gt;reachability&lt;/strong&gt; properties, that something is possible even if not actually done.&lt;/p&gt;
    &lt;p&gt;In broader computer science research, I'd say that angelic nondeterminism is more popular, due to its widespread use in complexity analysis and programming languages.&lt;/p&gt;
    &lt;h3&gt;Complexity Analysis&lt;/h3&gt;
    &lt;p&gt;P is the set of all "decision problems" (&lt;em&gt;basically&lt;/em&gt;, boolean functions) can be solved in polynomial time: there's an algorithm that's worst-case in &lt;code&gt;O(n)&lt;/code&gt;, &lt;code&gt;O(n²)&lt;/code&gt;, &lt;code&gt;O(n³)&lt;/code&gt;, etc.&lt;sup id="fnref:big-o"&gt;&lt;a class="footnote-ref" href="#fn:big-o"&gt;2&lt;/a&gt;&lt;/sup&gt;  NP is the set of all problems that can be solved in polynomial time by an algorithm with &lt;em&gt;angelic nondeterminism&lt;/em&gt;.&lt;sup id="fnref:TM"&gt;&lt;a class="footnote-ref" href="#fn:TM"&gt;3&lt;/a&gt;&lt;/sup&gt; For example, the question "does list &lt;code&gt;l&lt;/code&gt; contain &lt;code&gt;x&lt;/code&gt;" can be solved in O(1) time by a nondeterministic algorithm:&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;fun is_member(l: List[T], x: T): bool {
      if l == [] {return false};
    
      guess i in 0..&amp;lt;(len(l)-1);
      return l[i] == x;
    }
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;Say call &lt;code&gt;is_member([a, b, c, d], c)&lt;/code&gt;. The best possible choice would be to guess &lt;code&gt;i = 2&lt;/code&gt;, which would correctly return true. Now call &lt;code&gt;is_member([a, b], d)&lt;/code&gt;. No matter what we guess, the algorithm correctly returns false. and just return false. Ergo, O(1). NP stands for "Nondeterministic Polynomial". &lt;/p&gt;
    &lt;p&gt;(And I just now realized something pretty cool: you can say that P is the set of all problems solvable in polynomial time under &lt;em&gt;demonic nondeterminism&lt;/em&gt;, which is a nice parallel between the two classes.)&lt;/p&gt;
    &lt;p&gt;Computer scientists have proven that angelic nondeterminism doesn't give us any more "power": there are no problems solvable with AN that aren't also solvable deterministically. The big question is whether AN is more &lt;em&gt;efficient&lt;/em&gt;: it is widely believed, but not &lt;em&gt;proven&lt;/em&gt;, that there are problems in NP but not in P. Most famously, "Is there any variable assignment that makes this boolean formula true?" A polynomial AN algorithm is again easy:&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;fun SAT(f(x1, x2, …: bool): bool): bool {
       N = num_params(f)
       for i in 1..=num_params(f) {
         guess x_i in {true, false}
       }
    
       return f(x_1, x_2, …)
    }
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;The best deterministic algorithms we have to solve the same problem are worst-case exponential with the number of boolean parameters. This a real frustrating problem because real computers don't have angelic nondeterminism, so problems like SAT remain hard. We can solve most "well-behaved" instances of the problem &lt;a href="https://www.hillelwayne.com/post/np-hard/" target="_blank"&gt;in reasonable time&lt;/a&gt;, but the worst-case instances get intractable real fast.&lt;/p&gt;
    &lt;h3&gt;Means of Abstraction&lt;/h3&gt;
    &lt;div class="subscribe-form"&gt;&lt;/div&gt;
    &lt;p&gt;We can directly turn an AN algorithm into a (possibly much slower) deterministic algorithm, such as by &lt;a href="https://en.wikipedia.org/wiki/Backtracking" target="_blank"&gt;backtracking&lt;/a&gt;. This makes AN a pretty good abstraction over what an algorithm is doing. Does the regex &lt;code&gt;(a+b)\1+&lt;/code&gt; match "abaabaabaab"? Yes, if the regex engine nondeterministically guesses that it needs to start at the third letter and make the group &lt;code&gt;aab&lt;/code&gt;. How does my PL's regex implementation find that match? I dunno, backtracking or &lt;a href="https://swtch.com/~rsc/regexp/regexp1.html" target="_blank"&gt;NFA construction&lt;/a&gt; or something, I don't need to know the deterministic specifics in order to use the nondeterministic abstraction.&lt;/p&gt;
    &lt;p&gt;Neel Krishnaswami has &lt;a href="https://semantic-domain.blogspot.com/2013/07/what-declarative-languages-are.html" target="_blank"&gt;a great definition of 'declarative language'&lt;/a&gt;: "any language with a semantics has some nontrivial existential quantifiers in it". I'm not sure if this is &lt;em&gt;identical&lt;/em&gt; to saying "a language with an angelic nondeterministic abstraction", but they must be pretty close, and all of his examples match:&lt;/p&gt;
    &lt;ul&gt;
    &lt;li&gt;SQL's selects and joins&lt;/li&gt;
    &lt;li&gt;Parsing DSLs&lt;/li&gt;
    &lt;li&gt;Logic programming's unification&lt;/li&gt;
    &lt;li&gt;Constraint solving&lt;/li&gt;
    &lt;/ul&gt;
    &lt;p&gt;On top of that I'd add CSS selectors and &lt;a href="https://www.hillelwayne.com/post/picat/" target="_blank"&gt;planner's actions&lt;/a&gt;; all nondeterministic abstractions over a deterministic implementation. He also says that the things programmers hate most in declarative languages are features that "that expose the operational model": constraint solver search strategies, Prolog cuts, regex backreferences, etc. Which again matches my experiences with angelic nondeterminism: I dread features that force me to understand the deterministic implementation. But they're necessary, since P probably != NP and so we need to worry about operational optimizations.&lt;/p&gt;
    &lt;h3&gt;Eldritch Nondeterminism&lt;/h3&gt;
    &lt;p&gt;If you need to know the &lt;a href="https://en.wikipedia.org/wiki/PP_(complexity)" target="_blank"&gt;ratio of good/bad paths&lt;/a&gt;, &lt;a href="https://en.wikipedia.org/wiki/%E2%99%AFP" target="_blank"&gt;the number of good paths&lt;/a&gt;, or probability, or anything more than "there is a good path" or "there is a bad path", you are beyond the reach of heaven or hell.&lt;/p&gt;
    &lt;div class="footnote"&gt;
    &lt;hr/&gt;
    &lt;ol&gt;
    &lt;li id="fn:duals"&gt;
    &lt;p&gt;Angelic and demonic nondeterminism are &lt;a href="https://buttondown.com/hillelwayne/archive/logical-duals-in-software-engineering/" target="_blank"&gt;duals&lt;/a&gt;: angelic returns "yes" if &lt;code&gt;some choice: correct&lt;/code&gt; and demonic returns "no" if &lt;code&gt;!all choice: correct&lt;/code&gt;, which is the same as &lt;code&gt;some choice: !correct&lt;/code&gt;. &lt;a class="footnote-backref" href="#fnref:duals" title="Jump back to footnote 1 in the text"&gt;↩&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id="fn:big-o"&gt;
    &lt;p&gt;Pet peeve about Big-O notation: &lt;code&gt;O(n²)&lt;/code&gt; is the &lt;em&gt;set&lt;/em&gt; of all algorithms that, for sufficiently large problem sizes, grow no faster that quadratically. "Bubblesort has &lt;code&gt;O(n²)&lt;/code&gt; complexity" &lt;em&gt;should&lt;/em&gt; be written &lt;code&gt;Bubblesort in O(n²)&lt;/code&gt;, &lt;em&gt;not&lt;/em&gt; &lt;code&gt;Bubblesort = O(n²)&lt;/code&gt;. &lt;a class="footnote-backref" href="#fnref:big-o" title="Jump back to footnote 2 in the text"&gt;↩&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id="fn:TM"&gt;
    &lt;p&gt;To be precise, solvable in polynomial time by a &lt;em&gt;Nondeterministic Turing Machine&lt;/em&gt;, a very particular model of computation. We can broadly talk about P and NP without framing everything in terms of Turing machines, but some details of complexity classes (like the existence "weak NP-hardness") kinda need Turing machines to make sense. &lt;a class="footnote-backref" href="#fnref:TM" title="Jump back to footnote 3 in the text"&gt;↩&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;/ol&gt;
    &lt;/div&gt;
    </description><pubDate>Thu, 04 Sep 2025 14:00:00 +0000</pubDate><guid>https://buttondown.com/hillelwayne/archive/the-angels-and-demons-of-nondeterminism/</guid></item><item><title>Logical Duals in Software Engineering</title><link>https://buttondown.com/hillelwayne/archive/logical-duals-in-software-engineering/</link><description>
    &lt;p&gt;(&lt;a href="https://buttondown.com/hillelwayne/archive/sapir-whorf-does-not-apply-to-programming/" target="_blank"&gt;Last week's newsletter&lt;/a&gt; took too long and I'm way behind on &lt;a href="https://leanpub.com/logic/" target="_blank"&gt;Logic for Programmers&lt;/a&gt; revisions so short one this time.&lt;sup id="fnref:retread"&gt;&lt;a class="footnote-ref" href="#fn:retread"&gt;1&lt;/a&gt;&lt;/sup&gt;)&lt;/p&gt;
    &lt;p&gt;In classical logic, two operators &lt;code&gt;F/G&lt;/code&gt; are &lt;strong&gt;duals&lt;/strong&gt; if &lt;code&gt;F(x) = !G(!x)&lt;/code&gt;. Three examples:&lt;/p&gt;
    &lt;ol&gt;
    &lt;li&gt;&lt;code&gt;x || y&lt;/code&gt; is the same as &lt;code&gt;!(!x &amp;amp;&amp;amp; !y)&lt;/code&gt;.&lt;/li&gt;
    &lt;li&gt;&lt;code&gt;&amp;lt;&amp;gt;P&lt;/code&gt; ("P is possibly true") is the same as &lt;code&gt;![]!P&lt;/code&gt; ("not P isn't definitely true").&lt;/li&gt;
    &lt;li&gt;&lt;code&gt;some x in set: P(x)&lt;/code&gt; is the same as &lt;code&gt;!(all x in set: !P(x))&lt;/code&gt;.&lt;/li&gt;
    &lt;/ol&gt;
    &lt;p&gt;(1) is just a version of De Morgan's Law, which we regularly use to simplify boolean expressions. (2) is important in modal logic but has niche applications in software engineering, mostly in how it powers various formal methods.&lt;sup id="fnref:fm"&gt;&lt;a class="footnote-ref" href="#fn:fm"&gt;2&lt;/a&gt;&lt;/sup&gt; The real interesting one is (3), the "quantifier duals". We use lots of software tools to either &lt;em&gt;find&lt;/em&gt; a value satisfying &lt;code&gt;P&lt;/code&gt; or &lt;em&gt;check&lt;/em&gt; that all values satisfy &lt;code&gt;P&lt;/code&gt;. And by duality, any tool that does one can do the other, by seeing if it &lt;em&gt;fails&lt;/em&gt; to find/check &lt;code&gt;!P&lt;/code&gt;. Some examples in the wild:&lt;/p&gt;
    &lt;ul&gt;
    &lt;li&gt;Z3 is used to solve mathematical constraints, like "find x, where &lt;code&gt;f(x) &amp;gt;= 0&lt;/code&gt;. If I want to prove a property like "f is always positive", I ask z3 to solve "find x, where &lt;code&gt;!(f(x) &amp;gt;= 0)&lt;/code&gt;, and see if that is unsatisfiable. This use case powers a LOT of theorem provers and formal verification tooling.&lt;/li&gt;
    &lt;li&gt;Property testing checks that all inputs to a code block satisfy a property. I've used it to generate complex inputs with certain properties by checking that all inputs &lt;em&gt;don't&lt;/em&gt; satisfy the property and reading out the test failure.&lt;/li&gt;
    &lt;li&gt;Model checkers check that all behaviors of a specification satisfy a property, so we can find a behavior that reaches a goal state G by checking that all states are &lt;code&gt;!G&lt;/code&gt;. &lt;a href="https://github.com/tlaplus/Examples/blob/master/specifications/DieHard/DieHard.tla" target="_blank"&gt;Here's TLA+ solving a puzzle this way&lt;/a&gt;.&lt;sup id="fnref:antithesis"&gt;&lt;a class="footnote-ref" href="#fn:antithesis"&gt;3&lt;/a&gt;&lt;/sup&gt;&lt;/li&gt;
    &lt;li&gt;Planners find behaviors that reach a goal state, so we can check if all behaviors satisfy a property P by asking it to reach goal state &lt;code&gt;!P&lt;/code&gt;.&lt;/li&gt;
    &lt;li&gt;The problem "find the shortest &lt;a href="https://en.wikipedia.org/wiki/Travelling_salesman_problem" target="_blank"&gt;traveling salesman route&lt;/a&gt;" can be broken into &lt;code&gt;some route: distance(route) = n&lt;/code&gt; and &lt;code&gt;all route: !(distance(route) &amp;lt; n)&lt;/code&gt;. Then a route finder can find the first, and then convert the second into a &lt;code&gt;some&lt;/code&gt; and &lt;em&gt;fail&lt;/em&gt; to find it, proving &lt;code&gt;n&lt;/code&gt; is optimal.&lt;/li&gt;
    &lt;/ul&gt;
    &lt;p&gt;Even cooler to me is when a tool does &lt;em&gt;both&lt;/em&gt; finding and checking, but gives them different "meanings". In SQL, &lt;code&gt;some x: P(x)&lt;/code&gt; is true if we can &lt;em&gt;query&lt;/em&gt; for &lt;code&gt;P(x)&lt;/code&gt; and get a nonempty response, while &lt;code&gt;all x: P(x)&lt;/code&gt; is true if all records satisfy the &lt;code&gt;P(x)&lt;/code&gt; &lt;em&gt;constraint&lt;/em&gt;. Most SQL databases allow for complex queries but not complex constraints! You got &lt;code&gt;UNIQUE&lt;/code&gt;, &lt;code&gt;NOT NULL&lt;/code&gt;, &lt;code&gt;REFERENCES&lt;/code&gt;, which are fixed predicates, and &lt;code&gt;CHECK&lt;/code&gt;, which is one-record only.&lt;sup id="fnref:check"&gt;&lt;a class="footnote-ref" href="#fn:check"&gt;4&lt;/a&gt;&lt;/sup&gt;&lt;/p&gt;
    &lt;p&gt;Oh, and you got database triggers, which can run arbitrary queries and throw exceptions. So if you really need to enforce a complex constraint &lt;code&gt;P(x, y, z)&lt;/code&gt;, you put in a database trigger that queries &lt;code&gt;some x, y, z: !P(x, y, z)&lt;/code&gt; and throws an exception if it finds any results. That all works because of quantifier duality! See &lt;a href="https://eddmann.com/posts/maintaining-invariant-constraints-in-postgresql-using-trigger-functions/" target="_blank"&gt;here&lt;/a&gt; for an example of this in practice.&lt;/p&gt;
    &lt;h3&gt;Duals more broadly&lt;/h3&gt;
    &lt;p&gt;"Dual" doesn't have a strict meaning in math, it's more of a vibe thing where all of the "duals" are kinda similar in meaning but don't strictly follow all of the same rules. &lt;em&gt;Usually&lt;/em&gt; things X and Y are duals if there is some transform &lt;code&gt;F&lt;/code&gt; where &lt;code&gt;X = F(Y)&lt;/code&gt; and &lt;code&gt;Y = F(X)&lt;/code&gt;, but not always. Maybe the category theorists have a formal definition that covers all of the different uses. Usually duals switch properties of things, too: an example showing &lt;code&gt;some x: P(x)&lt;/code&gt; becomes a &lt;em&gt;counterexample&lt;/em&gt; of &lt;code&gt;all x: !P(x)&lt;/code&gt;.&lt;/p&gt;
    &lt;p&gt;Under this definition, I think the dual of a list &lt;code&gt;l&lt;/code&gt; could be &lt;code&gt;reverse(l)&lt;/code&gt;. The first element of &lt;code&gt;l&lt;/code&gt; becomes the last element of &lt;code&gt;reverse(l)&lt;/code&gt;, the last becomes the first, etc. A more interesting case is the dual of a &lt;code&gt;K -&amp;gt; set(V)&lt;/code&gt; map is the &lt;code&gt;V -&amp;gt; set(K)&lt;/code&gt; map. IE the dual of &lt;code&gt;lived_in_city = {alice: {paris}, bob: {detroit}, charlie: {detroit, paris}}&lt;/code&gt; is &lt;code&gt;city_lived_in_by = {paris: {alice, charlie}, detroit: {bob, charlie}}&lt;/code&gt;. This preserves the property that &lt;code&gt;x in map[y] &amp;lt;=&amp;gt; y in dual[x]&lt;/code&gt;.&lt;/p&gt;
    &lt;p class="empty-line" style="height:16px; margin:0px !important;"&gt;&lt;/p&gt;
    &lt;div class="footnote"&gt;
    &lt;hr/&gt;
    &lt;ol&gt;
    &lt;li id="fn:retread"&gt;
    &lt;p&gt;And after writing this I just realized this is partial retread of a newsletter I wrote &lt;a href="https://buttondown.com/hillelwayne/archive/logical-quantifiers-in-software/" target="_blank"&gt;a couple months ago&lt;/a&gt;. But only a &lt;em&gt;partial&lt;/em&gt; retread! &lt;a class="footnote-backref" href="#fnref:retread" title="Jump back to footnote 1 in the text"&gt;↩&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id="fn:fm"&gt;
    &lt;p&gt;Specifically "linear temporal logics" are modal logics, so "&lt;code&gt;eventually P&lt;/code&gt; ("P is true in at least one state of each behavior") is the same as saying &lt;code&gt;!always !P&lt;/code&gt; ("not P isn't true in all states of all behaviors"). This is the basis of &lt;a href="https://www.hillelwayne.com/post/safety-and-liveness/" target="_blank"&gt;liveness checking&lt;/a&gt;. &lt;a class="footnote-backref" href="#fnref:fm" title="Jump back to footnote 2 in the text"&gt;↩&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id="fn:antithesis"&gt;
    &lt;p&gt;I don't know for sure, but my best guess is that Antithesis does something similar &lt;a href="https://antithesis.com/blog/tag/games/" target="_blank"&gt;when their fuzzer beats videogames&lt;/a&gt;. They're doing fuzzing, not model checking, but they have the same purpose check that complex state spaces don't have bugs. Making the bug "we can't reach the end screen" can make a fuzzer output a complete end-to-end run of the game. Obvs a lot more complicated than that but that's the general idea at least. &lt;a class="footnote-backref" href="#fnref:antithesis" title="Jump back to footnote 3 in the text"&gt;↩&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id="fn:check"&gt;
    &lt;p&gt;For &lt;code&gt;CHECK&lt;/code&gt; to constraint multiple records you would need to use a subquery. Core SQL does not support subqueries in check. It is an optional database "feature outside of core SQL" (F671), which &lt;a href="https://www.postgresql.org/docs/current/unsupported-features-sql-standard.html" target="_blank"&gt;Postgres does not support&lt;/a&gt;. &lt;a class="footnote-backref" href="#fnref:check" title="Jump back to footnote 4 in the text"&gt;↩&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;/ol&gt;
    &lt;/div&gt;
    </description><pubDate>Wed, 27 Aug 2025 19:25:32 +0000</pubDate><guid>https://buttondown.com/hillelwayne/archive/logical-duals-in-software-engineering/</guid></item><item><title>Sapir-Whorf does not apply to Programming Languages</title><link>https://buttondown.com/hillelwayne/archive/sapir-whorf-does-not-apply-to-programming/</link><description>
    &lt;p&gt;&lt;em&gt;This one is a hot mess but it's too late in the week to start over. Oh well!&lt;/em&gt;&lt;/p&gt;
    &lt;p&gt;Someone recognized me at last week's &lt;a href="https://www.chipy.org/" target="_blank"&gt;Chipy&lt;/a&gt; and asked for my opinion on Sapir-Whorf hypothesis in programming languages. I thought this was interesting enough to make a newsletter. First what it is, then why it &lt;em&gt;looks&lt;/em&gt; like it applies, and then why it doesn't apply after all.&lt;/p&gt;
    &lt;h3&gt;The Sapir-Whorf Hypothesis&lt;/h3&gt;
    &lt;blockquote&gt;
    &lt;p&gt;We dissect nature along lines laid down by our native language. — &lt;a href="https://web.mit.edu/allanmc/www/whorf.scienceandlinguistics.pdf" target="_blank"&gt;Whorf&lt;/a&gt;&lt;/p&gt;
    &lt;/blockquote&gt;
    &lt;p&gt;To quote from a &lt;a href="https://www.amazon.com/Linguistics-Complete-Introduction-Teach-Yourself/dp/1444180320" target="_blank"&gt;Linguistics book I've read&lt;/a&gt;, the hypothesis is that "an individual's fundamental perception of reality is moulded by the language they speak." As a massive oversimplification, if English did not have a word for "rebellion", we would not be able to conceive of rebellion. This view, now called &lt;a href="https://en.wikipedia.org/wiki/Linguistic_determinism" target="_blank"&gt;Linguistic Determinism&lt;/a&gt;, is mostly rejected by modern linguists.&lt;/p&gt;
    &lt;p&gt;The "weak" form of SWH is that the language we speak influences, but does not &lt;em&gt;decide&lt;/em&gt; our cognition. &lt;a href="https://langcog.stanford.edu/papers/winawer2007.pdf" target="_blank"&gt;For example&lt;/a&gt;, Russian has distinct words for "light blue" and "dark blue", so can discriminate between "light blue" and "dark blue" shades faster than they can discriminate two "light blue" shades. English does not have distinct words, so we discriminate those at the same speed. This &lt;strong&gt;linguistic relativism&lt;/strong&gt; seems to have lots of empirical support in studies, but mostly with "small indicators". I don't think there's anything that convincingly shows linguistic relativism having effects on a societal level.&lt;sup id="fnref:economic-behavior"&gt;&lt;a class="footnote-ref" href="#fn:economic-behavior"&gt;1&lt;/a&gt;&lt;/sup&gt;&lt;/p&gt;
    &lt;p&gt;The weak form of SWH for software would then be the "the programming languages you know affects how you think about programs."&lt;/p&gt;
    &lt;h3&gt;SWH in software&lt;/h3&gt;
    &lt;p&gt;This seems like a natural fit, as different paradigms solve problems in different ways. Consider the &lt;a href="https://hadid.dev/posts/living-coding/" target="_blank"&gt;hardest interview question ever&lt;/a&gt;, "given a list of integers, sum the even numbers". Here it is in four paradigms:&lt;/p&gt;
    &lt;ul&gt;
    &lt;li&gt;Procedural: &lt;code&gt;total = 0; foreach x in list {if IsEven(x) total += x}&lt;/code&gt;. You iterate over data with an algorithm.&lt;/li&gt;
    &lt;li&gt;Functional: &lt;code&gt;reduce(+, filter(IsEven, list), 0)&lt;/code&gt;. You apply transformations to data to get a result.&lt;/li&gt;
    &lt;li&gt;Array: &lt;code&gt;+ fold L * iseven L&lt;/code&gt;.&lt;sup id="fnref:J"&gt;&lt;a class="footnote-ref" href="#fn:J"&gt;2&lt;/a&gt;&lt;/sup&gt; In English: replace every element in L with 0 if odd and 1 if even, multiple the new array elementwise against &lt;code&gt;L&lt;/code&gt;, and then sum the resulting array. It's like functional except everything is in terms of whole-array transformations.&lt;/li&gt;
    &lt;li&gt;Logical: Somethingish like &lt;code&gt;sumeven(0, []). sumeven(X, [Y|L]) :- iseven(Y) -&amp;gt; sumeven(Z, L), X is Y + Z ; sumeven(X, L)&lt;/code&gt;. You write a set of equations that express what it means for X to &lt;em&gt;be&lt;/em&gt; the sum of events of L.&lt;/li&gt;
    &lt;/ul&gt;
    &lt;p&gt;There's some similarities between how these paradigms approach the problem, but each is also unique, too. It's plausible that where a procedural programmer "sees" a for loop, a functional programmer "sees" a map and an array programmer "sees" a singular operator.&lt;/p&gt;
    &lt;p&gt;I also have a personal experience with how a language changed the way I think. I use &lt;a href="https://learntla.com/" target="_blank"&gt;TLA+&lt;/a&gt; to detect concurrency bugs in software designs. After doing this for several years, I've gotten much better at intuitively seeing race conditions in things even &lt;em&gt;without&lt;/em&gt; writing a TLA+ spec. It's even leaked out into my day-to-day life. I see concurrency bugs everywhere. Phone tag is a race condition.&lt;/p&gt;
    &lt;p&gt;But I still don't think SWH is the right mental model to use, for one big reason: language is &lt;em&gt;special&lt;/em&gt;. We think in language, we dream in language, there are huge parts of our brain dedicated to processing language. &lt;a href="https://web.eecs.umich.edu/~weimerw/p/weimer-icse2017-preprint.pdf" target="_blank"&gt;We don't use those parts of our brain to read code&lt;/a&gt;. &lt;/p&gt;
    &lt;p&gt;SWH is so intriguing because it seems so unnatural, that the way we express thoughts changes the way we &lt;em&gt;think&lt;/em&gt; thoughts. That I would be a different person if I was bilingual in Spanish, not because the life experiences it would open up but because &lt;a href="https://en.wikipedia.org/wiki/Grammatical_gender" target="_blank"&gt;grammatical gender&lt;/a&gt; would change my brain.&lt;/p&gt;
    &lt;p&gt;Compared to that, the idea that programming languages affect our brain is more natural and has a simpler explanation:&lt;/p&gt;
    &lt;p&gt;It's the goddamned &lt;a href="https://en.wikipedia.org/wiki/Tetris_effect" target="_blank"&gt;Tetris Effect&lt;/a&gt;.&lt;/p&gt;
    &lt;h3&gt;The Goddamned Tetris Effect&lt;/h3&gt;
    &lt;div class="subscribe-form"&gt;&lt;/div&gt;
    &lt;blockquote&gt;
    &lt;p&gt;The Tetris effect occurs when someone dedicates vast amounts of time, effort and concentration on an activity which thereby alters their thoughts, dreams, and other experiences not directly linked to said activity. — Wikipedia&lt;/p&gt;
    &lt;/blockquote&gt;
    &lt;p&gt;Every skill does this. I'm a juggler, so every item I can see right now has a tiny metadata field of "how would this tumble if I threw it up". I teach professionally, so I'm always noticing good teaching examples everywhere. I spent years writing specs in TLA+ and watching the model checker throw concurrency errors in my face, so now race conditions have visceral presence. Every skill does this. &lt;/p&gt;
    &lt;p&gt;And to really develop a skill, you gotta practice. This is where I think programming paradigms do something especially interesting that make them feel more like Sapir-Whorfy than, like, juggling. Some languages mix lots of different paradigms, like Javascript or Rust. Others like Haskell really focus on &lt;em&gt;excluding&lt;/em&gt; paradigms. If something is easy for you in procedural and hard in FP, in JS you could just lean on the procedural bits. In Haskell, &lt;em&gt;too bad&lt;/em&gt;, you're learning how to do it the functional way.&lt;sup id="fnref:escape-hatch"&gt;&lt;a class="footnote-ref" href="#fn:escape-hatch"&gt;3&lt;/a&gt;&lt;/sup&gt;&lt;/p&gt;
    &lt;p&gt;And that forces you to practice, which makes you see functional patterns everywhere. Tetris effect!&lt;/p&gt;
    &lt;p&gt;Anyway this may all seem like quibbling— why does it matter whether we call it "Tetris effect" or "Sapir-Whorf", if our brains is get rewired either way? For me, personally, it's because SWH sounds really special and &lt;em&gt;unique&lt;/em&gt;, while Tetris effect sounds mundane and commonplace. Which it &lt;em&gt;is&lt;/em&gt;. But also because TE suggests it's not just programming languages that affect how we think about software, it's &lt;em&gt;everything&lt;/em&gt;. Spending lots of time debugging, profiling, writing exploits, whatever will change what you notice, what you think a program "is". And that's a way useful idea that shouldn't be restricted to just PLs.&lt;/p&gt;
    &lt;p&gt;(Then again, the Tetris Effect might also be a bad analogy to what's going on here, because I think part of it is that it wears off after a while. Maybe it's just "building a mental model is good".)&lt;/p&gt;
    &lt;h3&gt;I just realized all of this might have missed the point&lt;/h3&gt;
    &lt;p&gt;Wait are people actually using SWH to mean the &lt;em&gt;weak form&lt;/em&gt; or the &lt;em&gt;strong&lt;/em&gt; form? Like that if a language doesn't make something possible, its users can't conceive of it being possible. I've been arguing against the weaker form in software but I think I've seen strong form often too. Dammit.&lt;/p&gt;
    &lt;p&gt;Well, it's already Thursday and far too late to rewrite the whole newsletter, so I'll just outline the problem with the strong form: we describe the capabilities of our programming languages &lt;em&gt;with human language&lt;/em&gt;. In college I wrote a lot of crappy physics lab C++ and one of my projects was filled with comments like "man I hate copying this triply-nested loop in 10 places with one-line changes, I wish I could put it in one function and just take the changing line as a parameter". Even if I hadn't &lt;em&gt;encountered&lt;/em&gt; higher-order functions, I was still perfectly capable of expressing the idea. So if the strong SWH isn't true for human language, it's not true for programming languages either.&lt;/p&gt;
    &lt;p class="empty-line" style="height:16px; margin:0px !important;"&gt;&lt;/p&gt;
    &lt;hr/&gt;
    &lt;h1&gt;Systems Distributed talk now up!&lt;/h1&gt;
    &lt;p&gt;&lt;a href="https://www.youtube.com/watch?v=d9cM8f_qSLQ" target="_blank"&gt;Link here&lt;/a&gt;! Original abstract:&lt;/p&gt;
    &lt;blockquote&gt;
    &lt;p&gt;Building correct distributed systems takes thinking outside the box, and the fastest way to do that is to think inside a different box. One different box is "formal methods", the discipline of mathematically verifying software and systems. Formal methods encourages unusual perspectives on systems, models that are also broadly useful to all software developers. In this talk we will learn two of the most important FM perspectives: the abstract specifications behind software systems, and the property they are and aren't supposed to have.&lt;/p&gt;
    &lt;/blockquote&gt;
    &lt;p&gt;The talk ended up evolving away from that abstract but I like how it turned out!&lt;/p&gt;
    &lt;div class="footnote"&gt;
    &lt;hr/&gt;
    &lt;ol&gt;
    &lt;li id="fn:economic-behavior"&gt;
    &lt;p&gt;There is &lt;a href="https://www.anderson.ucla.edu/faculty/keith.chen/papers/LanguageWorkingPaper.pdf" target="_blank"&gt;one paper&lt;/a&gt; arguing that people who speak a language that doesn't have a "future tense" are more likely to save and eat healthy, but it is... &lt;a href="https://www.reddit.com/r/linguistics/comments/rcne7m/comment/hnz2705/" target="_blank"&gt;extremely questionable&lt;/a&gt;. &lt;a class="footnote-backref" href="#fnref:economic-behavior" title="Jump back to footnote 1 in the text"&gt;↩&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id="fn:J"&gt;
    &lt;p&gt;The original J is &lt;code&gt;+/ (* (0 =  2&amp;amp;|))&lt;/code&gt;. Obligatory &lt;a href="https://www.jsoftware.com/papers/tot.htm" target="_blank"&gt;Notation as a Tool of Thought&lt;/a&gt; reference &lt;a class="footnote-backref" href="#fnref:J" title="Jump back to footnote 2 in the text"&gt;↩&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id="fn:escape-hatch"&gt;
    &lt;p&gt;Though if it's &lt;em&gt;too&lt;/em&gt; hard for you, that's why languages have &lt;a href="https://buttondown.com/hillelwayne/archive/programming-language-escape-hatches/" target="_blank"&gt;escape hatches&lt;/a&gt; &lt;a class="footnote-backref" href="#fnref:escape-hatch" title="Jump back to footnote 3 in the text"&gt;↩&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;/ol&gt;
    &lt;/div&gt;
    </description><pubDate>Thu, 21 Aug 2025 13:00:00 +0000</pubDate><guid>https://buttondown.com/hillelwayne/archive/sapir-whorf-does-not-apply-to-programming/</guid></item><item><title>Software books I wish I could read</title><link>https://buttondown.com/hillelwayne/archive/software-books-i-wish-i-could-read/</link><description>
    &lt;h3&gt;New Logic for Programmers Release!&lt;/h3&gt;
    &lt;p&gt;&lt;a href="https://leanpub.com/logic/" target="_blank"&gt;v0.11 is now available&lt;/a&gt;! This is over 20%  longer than v0.10, with a new chapter on code proofs, three chapter overhauls, and more! &lt;a href="https://github.com/logicforprogrammers/book-assets/blob/master/CHANGELOG.md" target="_blank"&gt;Full release notes here&lt;/a&gt;.&lt;/p&gt;
    &lt;p&gt;&lt;img alt="Cover of the boooooook" class="newsletter-image" src="https://assets.buttondown.email/images/92b4a35d-2bdd-416a-92c7-15ff42b49d8d.jpg?w=960&amp;amp;fit=max"/&gt;&lt;/p&gt;
    &lt;h1&gt;Software books I wish I could read&lt;/h1&gt;
    &lt;p&gt;I'm writing &lt;em&gt;Logic for Programmers&lt;/em&gt; because it's a book I wanted to have ten years ago. I had to learn everything in it the hard way, which is why I'm ensuring that everybody else can learn it the easy way.&lt;/p&gt;
    &lt;p&gt;Books occupy a sort of weird niche in software. We're great at sharing information via blogs and git repos and entire websites. These have many benefits over books: they're free, they're easily accessible, they can be updated quickly, they can even be interactive. But no blog post has influenced me as profoundly as &lt;a href="https://buttondown.com/hillelwayne/archive/why-you-should-read-data-and-reality/" target="_blank"&gt;Data and Reality&lt;/a&gt; or &lt;a href="https://www.oreilly.com/library/view/making-software/9780596808310/" target="_blank"&gt;Making Software&lt;/a&gt;. There is no blog or talk about debugging as good as the 
    &lt;a href="https://debuggingrules.com/" target="_blank"&gt;Debugging&lt;/a&gt; book.&lt;/p&gt;
    &lt;p&gt;It might not be anything deeper than "people spend more time per word on writing books than blog posts". I dunno.&lt;/p&gt;
    &lt;p&gt;So here are some other books I wish I could read. I don't &lt;em&gt;think&lt;/em&gt; any of them exist yet but it's a big world out there. Also while they're probably best as books, a website or a series of blog posts would be ok too.&lt;/p&gt;
    &lt;h4&gt;Everything about Configurations&lt;/h4&gt;
    &lt;p&gt;The whole topic of how we configure software, whether by CLI flags, environmental vars, or JSON/YAML/XML/Dhall files. What causes the &lt;a href="https://mikehadlow.blogspot.com/2012/05/configuration-complexity-clock.html" target="_blank"&gt;configuration complexity clock&lt;/a&gt;? How do we distinguish between basic, advanced, and developer-only configuration options? When should we disallow configuration? How do we test all possible configurations for correctness? Why do so many widespread outages trace back to misconfiguration, and how do we prevent them? &lt;/p&gt;
    &lt;p&gt;I also want the same for plugin systems. Manifests, permissions, common APIs and architectures, etc. Configuration management is more universal, though, since everybody either uses software with configuration or has made software with configuration.&lt;/p&gt;
    &lt;h4&gt;The Big Book of Complicated Data Schemas&lt;/h4&gt;
    &lt;p&gt;I guess this would kind of be like &lt;a href="https://schema.org/docs/full.html" target="_blank"&gt;Schema.org&lt;/a&gt;, except with a lot more on the "why" and not the what. Why is important for the &lt;a href="https://schema.org/Volcano" target="_blank"&gt;Volcano model&lt;/a&gt; to have a "smokingAllowed" field?&lt;sup id="fnref:volcano"&gt;&lt;a class="footnote-ref" href="#fn:volcano"&gt;1&lt;/a&gt;&lt;/sup&gt;&lt;/p&gt;
    &lt;p&gt;I'd see this less as "here's your guide to putting Volcanos in your database" and more "here's recurring motifs in modeling interesting domains", to help a person see sources of complexity in their &lt;em&gt;own&lt;/em&gt; domain. Does something crop up if the references can form a cycle? If a relationship needs to be strictly temporary, or a reference can change type? Bonus: path dependence in data models, where an additional requirement leads to a vastly different ideal data model that a company couldn't do because they made the old model.&lt;/p&gt;
    &lt;p&gt;(This has got to exist, right? Business modeling is a big enough domain that this must exist. Maybe &lt;a href="https://essenceofsoftware.com/" target="_blank"&gt;The Essence of Software&lt;/a&gt; touches on this? Man I feel bad I haven't read that yet.)&lt;/p&gt;
    &lt;h4&gt;Computer Science for Software Engineers&lt;/h4&gt;
    &lt;p&gt;Yes, I checked, this book does not exist (though maybe &lt;a href="https://www.amazon.com/A-Programmers-Guide-to-Computer-Science-2-book-series/dp/B08433QR53" target="_blank"&gt;this&lt;/a&gt; is the same thing). I don't have any formal software education; everything I know was either self-taught or learned on the job. But it's way easier to learn software engineering that way than computer science. And I bet there's a lot of other engineers in the same boat. &lt;/p&gt;
    &lt;p&gt;This book wouldn't have to be comprehensive or instructive: just enough about each topic to understand why it's an area of study and appreciate how research in it eventually finds its way into practice. &lt;/p&gt;
    &lt;h4&gt;MISU Patterns&lt;/h4&gt;
    &lt;p&gt;MISU, or "Make Illegal States Unrepresentable", is the idea of designing system invariants in the structure of your data. For example, if a &lt;code&gt;Contact&lt;/code&gt; needs at least one of &lt;code&gt;email&lt;/code&gt; or &lt;code&gt;phone&lt;/code&gt; to be non-null, make it a sum type over &lt;code&gt;EmailContact, PhoneContact, EmailPhoneContact&lt;/code&gt; (from &lt;a href="https://fsharpforfunandprofit.com/posts/designing-with-types-making-illegal-states-unrepresentable/" target="_blank"&gt;this post&lt;/a&gt;). MISU is great.&lt;/p&gt;
    &lt;p&gt;Most MISU in the wild look very different than that, though, because the concept of MISU is so broad there's lots of different ways to achieve it. And that means there are "patterns": smart constructors, product types, properly using sets, &lt;a href="https://lexi-lambda.github.io/blog/2020/11/01/names-are-not-type-safety/" target="_blank"&gt;newtypes to some degree&lt;/a&gt;, etc. Some of them are specific to typed FP, while others can be used in even untyped languages. Someone oughta make a pattern book.&lt;/p&gt;
    &lt;p&gt;My one request would be to not give them cutesy names. Do something like the &lt;a href="https://ia600301.us.archive.org/18/items/Thompson2016MotifIndex/Thompson_2016_Motif-Index.pdf" target="_blank"&gt;Aarne–Thompson–Uther Index&lt;/a&gt;, where items are given names like "Recognition by manner of throwing cakes of different weights into faces of old uncles". Names can come later.&lt;/p&gt;
    &lt;h4&gt;The Tools of '25&lt;/h4&gt;
    &lt;p&gt;Not something I'd read, but something to recommend to junior engineers. Starting out it's easy to think the only bit that matters is the language or framework and not realize the enormous amount of surrounding tooling you'll have to learn. This book would cover the basics of tools that &lt;em&gt;enough&lt;/em&gt; developers will probably use at some point: git, VSCode, &lt;em&gt;very&lt;/em&gt; basic Unix and bash, curl. Maybe the general concepts of tools that appear in every ecosystem, like package managers, build tools, task runners. That might be easier if we specialize this to one particular domain, like webdev or data science.&lt;/p&gt;
    &lt;p&gt;Ideally the book would only have to be updated every five years or so. No LLM stuff because I don't expect the tooling will be stable through 2026, to say nothing of 2030.&lt;/p&gt;
    &lt;h4&gt;A History of Obsolete Optimizations&lt;/h4&gt;
    &lt;p&gt;Probably better as a really long blog series. Each chapter would be broken up into two parts:&lt;/p&gt;
    &lt;ol&gt;
    &lt;li&gt;A deep dive into a brilliant, elegant, insightful historical optimization designed to work within the constraints of that era's computing technology&lt;/li&gt;
    &lt;li&gt;What we started doing instead, once we had more compute/network/storage available.&lt;/li&gt;
    &lt;/ol&gt;
    &lt;p&gt;c.f. &lt;a href="https://prog21.dadgum.com/29.html" target="_blank"&gt;A Spellchecker Used to Be a Major Feat of Software Engineering&lt;/a&gt;. Bonus topics would be brilliance obsoleted by standardization (like what people did before git and json were universal), optimizations we do today that may not stand the test of time, and optimizations from the past that &lt;em&gt;did&lt;/em&gt;.&lt;/p&gt;
    &lt;h4&gt;Sphinx Internals&lt;/h4&gt;
    &lt;p&gt;&lt;em&gt;I need this&lt;/em&gt;. I've spent so much goddamn time digging around in Sphinx and docutils source code I'm gonna throw up.&lt;/p&gt;
    &lt;hr/&gt;
    &lt;h3&gt;Systems Distributed Talk Today!&lt;/h3&gt;
    &lt;p&gt;Online premier's at noon central / 5 PM UTC, &lt;a href="https://www.youtube.com/watch?v=d9cM8f_qSLQ" target="_blank"&gt;here&lt;/a&gt;! I'll be hanging out to answer questions and be awkward. You ever watch a recording of your own talk? It's real uncomfortable!&lt;/p&gt;
    &lt;div class="footnote"&gt;
    &lt;hr/&gt;
    &lt;ol&gt;
    &lt;li id="fn:volcano"&gt;
    &lt;p&gt;In &lt;em&gt;this&lt;/em&gt; case because it's a field on one of &lt;code&gt;Volcano&lt;/code&gt;'s supertypes. I guess schemas gotta follow LSP too &lt;a class="footnote-backref" href="#fnref:volcano" title="Jump back to footnote 1 in the text"&gt;↩&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;/ol&gt;
    &lt;/div&gt;
    </description><pubDate>Wed, 06 Aug 2025 13:00:00 +0000</pubDate><guid>https://buttondown.com/hillelwayne/archive/software-books-i-wish-i-could-read/</guid></item><item><title>2000 words about arrays and tables</title><link>https://buttondown.com/hillelwayne/archive/2000-words-about-arrays-and-tables/</link><description>
    &lt;p&gt;I'm way too discombobulated from getting next month's release of &lt;a href="https://leanpub.com/logic/" target="_blank"&gt;Logic for Programmers&lt;/a&gt; ready, so I'm pulling a idea from the slush pile. Basically I wanted to come up with a mental model of arrays as a concept that explained APL-style multidimensional arrays and tables but also why there weren't multitables.&lt;/p&gt;
    &lt;p&gt;So, arrays. In all languages they are basically the same: they map a sequence of numbers (I'll use &lt;code&gt;1..N&lt;/code&gt;)&lt;sup id="fnref:1-indexing"&gt;&lt;a class="footnote-ref" href="#fn:1-indexing"&gt;1&lt;/a&gt;&lt;/sup&gt; to homogeneous values (values of a single type). This is in contrast to the other two foundational types, associative arrays (which map an arbitrary type to homogeneous values) and structs (which map a fixed set of keys to &lt;em&gt;heterogeneous&lt;/em&gt; values). Arrays appear in PLs earlier than the other two, possibly because they have the simplest implementation and the most obvious application to scientific computing. The OG FORTRAN had arrays. &lt;/p&gt;
    &lt;p&gt;I'm interested in two structural extensions to arrays. The first, found in languages like nushell and frameworks like Pandas, is the &lt;em&gt;table&lt;/em&gt;. Tables have string keys like a struct &lt;em&gt;and&lt;/em&gt; indexes like an array. Each row is a struct, so you can get "all values in this column" or "all values for this row". They're heavily used in databases and data science.&lt;/p&gt;
    &lt;p&gt;The other extension is the &lt;strong&gt;N-dimensional array&lt;/strong&gt;, mostly seen in APLs like Dyalog and J. Think of this like arrays-of-arrays(-of-arrays), except all arrays at the same depth have the same length. So &lt;code&gt;[[1,2,3],[4]]&lt;/code&gt; is not a 2D array, but &lt;code&gt;[[1,2,3],[4,5,6]]&lt;/code&gt; is. This means that N-arrays can be queried on any axis.&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;]&lt;/span&gt;&lt;span class="nv"&gt;x&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;=:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;i&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt;
    &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;
    &lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;5&lt;/span&gt;
    &lt;span class="mi"&gt;6&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;7&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;8&lt;/span&gt;
    &lt;span class="w"&gt;   &lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;{&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;x&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="c1"&gt;NB. first row&lt;/span&gt;
    &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;
    &lt;span class="w"&gt;   &lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;{"&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;x&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="c1"&gt;NB. first column&lt;/span&gt;
    &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;6&lt;/span&gt;
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;So, I've had some ideas on a conceptual model of arrays that explains all of these variations and possibly predicts new variations. I wrote up my notes and did the bare minimum of editing and polishing. Somehow it ended up being 2000 words.&lt;/p&gt;
    &lt;h3&gt;1-dimensional arrays&lt;/h3&gt;
    &lt;p&gt;A one-dimensional array is a function over &lt;code&gt;1..N&lt;/code&gt; for some N. &lt;/p&gt;
    &lt;p&gt;To be clear this is &lt;em&gt;math&lt;/em&gt; functions, not programming functions. Programming functions take values of a type and perform computations on them. Math functions take values of a fixed set and return values of another set. So the array &lt;code&gt;[a, b, c, d]&lt;/code&gt; can be represented by the function &lt;code&gt;(1 -&amp;gt; a ++ 2 -&amp;gt; b ++ 3 -&amp;gt; c ++ 4 -&amp;gt; d)&lt;/code&gt;. Let's write the set of all four element character arrays as &lt;code&gt;1..4 -&amp;gt; char&lt;/code&gt;. &lt;code&gt;1..4&lt;/code&gt; is the function's &lt;strong&gt;domain&lt;/strong&gt;.&lt;/p&gt;
    &lt;p&gt;The set of all character arrays is the empty array + the functions with domain &lt;code&gt;1..1&lt;/code&gt; + the functions with domain &lt;code&gt;1..2&lt;/code&gt; + ... Let's call this set &lt;code&gt;Array[Char]&lt;/code&gt;. Our compilers can enforce that a type belongs to &lt;code&gt;Array[Char]&lt;/code&gt;, but some operations care about the more specific type, like matrix multiplication. This is either checked with the runtime type or, in exotic enough languages, with static dependent types.&lt;/p&gt;
    &lt;p&gt;(This is actually how TLA+ does things: the basic collection types are functions and sets, and a function with domain 1..N is a sequence.)&lt;/p&gt;
    &lt;h3&gt;2-dimensional arrays&lt;/h3&gt;
    &lt;p&gt;Now take the 3x4 matrix&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="w"&gt;   &lt;/span&gt;&lt;span class="nv"&gt;i&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;4&lt;/span&gt;
    &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt;
    &lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="mi"&gt;6&lt;/span&gt;&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="mi"&gt;7&lt;/span&gt;
    &lt;span class="mi"&gt;8&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;9&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;11&lt;/span&gt;
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;There are two equally valid ways to represent the array function:&lt;/p&gt;
    &lt;ol&gt;
    &lt;li&gt;A function that takes a row and a column and returns the value at that index, so it would look like &lt;code&gt;f(r: 1..3, c: 1..4) -&amp;gt; Int&lt;/code&gt;.&lt;/li&gt;
    &lt;li&gt;A function that takes a row and returns that column as an array, aka another function: &lt;code&gt;f(r: 1..3) -&amp;gt; g(c: 1..4) -&amp;gt; Int&lt;/code&gt;.&lt;sup id="fnref:associative"&gt;&lt;a class="footnote-ref" href="#fn:associative"&gt;2&lt;/a&gt;&lt;/sup&gt;&lt;/li&gt;
    &lt;/ol&gt;
    &lt;p&gt;Man, (2) looks a lot like &lt;a href="https://en.wikipedia.org/wiki/Currying" target="_blank"&gt;currying&lt;/a&gt;! In Haskell, functions can only have one parameter. If you write &lt;code&gt;(+) 6 10&lt;/code&gt;, &lt;code&gt;(+) 6&lt;/code&gt; first returns a &lt;em&gt;new&lt;/em&gt; function &lt;code&gt;f y = y + 6&lt;/code&gt;, and then applies &lt;code&gt;f 10&lt;/code&gt; to get 16. So &lt;code&gt;(+)&lt;/code&gt; has the type signature &lt;code&gt;Int -&amp;gt; Int -&amp;gt; Int&lt;/code&gt;: it's a function that takes an &lt;code&gt;Int&lt;/code&gt; and returns a function of type &lt;code&gt;Int -&amp;gt; Int&lt;/code&gt;.&lt;sup id="fnref:typeclass"&gt;&lt;a class="footnote-ref" href="#fn:typeclass"&gt;3&lt;/a&gt;&lt;/sup&gt;&lt;/p&gt;
    &lt;p&gt;Similarly, our 2D array can be represented as an array function that returns array functions: it has type &lt;code&gt;1..3 -&amp;gt; 1..4 -&amp;gt; Int&lt;/code&gt;, meaning it takes a row index and returns &lt;code&gt;1..4 -&amp;gt; Int&lt;/code&gt;, aka a single array.&lt;/p&gt;
    &lt;p&gt;(This differs from conventional array-of-arrays because it forces all of the subarrays to have the same domain, aka the same length. If we wanted to permit ragged arrays, we would instead have the type &lt;code&gt;1..3 -&amp;gt; Array[Int]&lt;/code&gt;.)&lt;/p&gt;
    &lt;p&gt;Why is this useful? A couple of reasons. First of all, we can apply function transformations to arrays, like "&lt;a href="https://blog.zdsmith.com/series/combinatory-programming.html" target="_blank"&gt;combinators&lt;/a&gt;". For example, we can flip any function of type &lt;code&gt;a -&amp;gt; b -&amp;gt; c&lt;/code&gt; into a function of type &lt;code&gt;b -&amp;gt; a -&amp;gt; c&lt;/code&gt;. So given a function that takes rows and returns columns, we can produce one that takes columns and returns rows. That's just a matrix transposition! &lt;/p&gt;
    &lt;p&gt;Second, we can extend this to any number of dimensions: a three-dimensional array is one with type &lt;code&gt;1..M -&amp;gt; 1..N -&amp;gt; 1..O -&amp;gt; V&lt;/code&gt;. We can still use function transformations to rearrange the array along any ordering of axes.&lt;/p&gt;
    &lt;p&gt;Speaking of dimensions:&lt;/p&gt;
    &lt;h3&gt;What are dimensions, anyway&lt;/h3&gt;
    &lt;div class="subscribe-form"&gt;&lt;/div&gt;
    &lt;p&gt;Okay, so now imagine we have a &lt;code&gt;Row&lt;/code&gt; × &lt;code&gt;Col&lt;/code&gt; grid of pixels, where each pixel is a struct of type &lt;code&gt;Pixel(R: int, G: int, B: int)&lt;/code&gt;. So the array is&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;Row -&amp;gt; Col -&amp;gt; Pixel
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;But we can also represent the &lt;em&gt;Pixel struct&lt;/em&gt; with a function: &lt;code&gt;Pixel(R: 0, G: 0, B: 255)&lt;/code&gt; is the function where &lt;code&gt;f(R) = 0&lt;/code&gt;, &lt;code&gt;f(G) = 0&lt;/code&gt;, &lt;code&gt;f(B) = 255&lt;/code&gt;, making it a function of type &lt;code&gt;{R, G, B} -&amp;gt; Int&lt;/code&gt;. So the array is actually the function&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;Row -&amp;gt; Col -&amp;gt; {R, G, B} -&amp;gt; Int
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;And then we can rearrange the parameters of the function like this:&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;{R, G, B} -&amp;gt; Row -&amp;gt; Col -&amp;gt; Int
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;Even though the set &lt;code&gt;{R, G, B}&lt;/code&gt; is not of form 1..N, this clearly has a real meaning: &lt;code&gt;f[R]&lt;/code&gt; is the function mapping each coordinate to that coordinate's red value. What about &lt;code&gt;Row -&amp;gt; {R, G, B} -&amp;gt; Col -&amp;gt; Int&lt;/code&gt;?  That's for each row, the 3 × Col array mapping each color to that row's intensities.&lt;/p&gt;
    &lt;p&gt;Really &lt;em&gt;any finite set&lt;/em&gt; can be a "dimension". Recording the monitor over a span of time? &lt;code&gt;Frame -&amp;gt; Row -&amp;gt; Col -&amp;gt; Color -&amp;gt; Int&lt;/code&gt;. Recording a bunch of computers over some time? &lt;code&gt;Computer -&amp;gt; Frame -&amp;gt; Row …&lt;/code&gt;.&lt;/p&gt;
    &lt;p&gt;This is pretty common in constraint satisfaction! Like if you're conference trying to assign talks to talk slots, your array might be type &lt;code&gt;(Day, Time, Room) -&amp;gt; Talk&lt;/code&gt;, where Day/Time/Room are enumerations.&lt;/p&gt;
    &lt;p&gt;An implementation constraint is that most programming languages &lt;em&gt;only&lt;/em&gt; allow integer indexes, so we have to replace Rooms and Colors with numerical enumerations over the set. As long as the set is finite, this is always possible, and for struct-functions, we can always choose the indexing on the lexicographic ordering of the keys. But we lose type safety.&lt;/p&gt;
    &lt;h3&gt;Why tables are different&lt;/h3&gt;
    &lt;p&gt;One more example: &lt;code&gt;Day -&amp;gt; Hour -&amp;gt; Airport(name: str, flights: int, revenue: USD)&lt;/code&gt;. Can we turn the struct into a dimension like before? &lt;/p&gt;
    &lt;p&gt;In this case, no. We were able to make &lt;code&gt;Color&lt;/code&gt; an axis because we could turn &lt;code&gt;Pixel&lt;/code&gt; into a &lt;code&gt;Color -&amp;gt; Int&lt;/code&gt; function, and we could only do that because all of the fields of the struct had the same type. This time, the fields are &lt;em&gt;different&lt;/em&gt; types. So we can't convert &lt;code&gt;{name, flights, revenue}&lt;/code&gt; into an axis. &lt;sup id="fnref:name-dimension"&gt;&lt;a class="footnote-ref" href="#fn:name-dimension"&gt;4&lt;/a&gt;&lt;/sup&gt; One thing we can do is convert it to three &lt;em&gt;separate&lt;/em&gt; functions:&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;airport: Day -&amp;gt; Hour -&amp;gt; Str
    flights: Day -&amp;gt; Hour -&amp;gt; Int
    revenue: Day -&amp;gt; Hour -&amp;gt; USD
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;But we want to keep all of the data in one place. That's where &lt;strong&gt;tables&lt;/strong&gt; come in: an array-of-structs is isomorphic to a struct-of-arrays:&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;AirportColumns(
        airport: Day -&amp;gt; Hour -&amp;gt; Str,
        flights: Day -&amp;gt; Hour -&amp;gt; Int,
        revenue: Day -&amp;gt; Hour -&amp;gt; USD,
    )
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;The table is a sort of &lt;em&gt;both&lt;/em&gt; representations simultaneously. If this was a pandas dataframe, &lt;code&gt;df["airport"]&lt;/code&gt; would get the airport column, while &lt;code&gt;df.loc[day1]&lt;/code&gt; would get the first day's data. I don't think many table implementations support more than one axis dimension but there's no reason they &lt;em&gt;couldn't&lt;/em&gt;. &lt;/p&gt;
    &lt;p&gt;These are also possible transforms:&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;Hour -&amp;gt; NamesAreHard(
        airport: Day -&amp;gt; Str,
        flights: Day -&amp;gt; Int,
        revenue: Day -&amp;gt; USD,
    )
    
    Day -&amp;gt; Whatever(
        airport: Hour -&amp;gt; Str,
        flights: Hour -&amp;gt; Int,
        revenue: Hour -&amp;gt; USD,
    )
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;In my mental model, the heterogeneous struct acts as a "block" in the array. We can't remove it, we can only push an index into the fields or pull a shared column out. But there's no way to convert a heterogeneous table into an array.&lt;/p&gt;
    &lt;h3&gt;Actually there is a terrible way&lt;/h3&gt;
    &lt;p&gt;Most languages have unions or &lt;del&gt;product&lt;/del&gt; sum types that let us say "this is a string OR integer". So we can make our airport data &lt;code&gt;Day -&amp;gt; Hour -&amp;gt; AirportKey -&amp;gt; Int | Str | USD&lt;/code&gt;. Heck, might as well just say it's &lt;code&gt;Day -&amp;gt; Hour -&amp;gt; AirportKey -&amp;gt; Any&lt;/code&gt;. But would anybody really be mad enough to use that in practice?&lt;/p&gt;
    &lt;p&gt;&lt;a href="https://code.jsoftware.com/wiki/Vocabulary/lt" target="_blank"&gt;Oh wait J does exactly that&lt;/a&gt;. J has an opaque datatype called a "box". A "table" is a function &lt;code&gt;Dim1 -&amp;gt; Dim2 -&amp;gt; Box&lt;/code&gt;. You can see some examples of what that looks like &lt;a href="https://code.jsoftware.com/wiki/DB/Flwor" target="_blank"&gt;here&lt;/a&gt;&lt;/p&gt;
    &lt;h3&gt;Misc Thoughts and Questions&lt;/h3&gt;
    &lt;p&gt;The heterogeneity barrier seems like it explains why we don't see multiple axes of table columns, while we do see multiple axes of array dimensions. But is that actually why? Is there a system out there that &lt;em&gt;does&lt;/em&gt; have multiple columnar axes?&lt;/p&gt;
    &lt;p&gt;The array &lt;code&gt;x = [[a, b, a], [b, b, b]]&lt;/code&gt; has type &lt;code&gt;1..2 -&amp;gt; 1..3 -&amp;gt; {a, b}&lt;/code&gt;. Can we rearrange it to &lt;code&gt;1..2 -&amp;gt; {a, b} -&amp;gt; 1..3&lt;/code&gt;? No. But we &lt;em&gt;can&lt;/em&gt; rearrange it to &lt;code&gt;1..2 -&amp;gt; {a, b} -&amp;gt; PowerSet(1..3)&lt;/code&gt;, which maps rows and characters to columns &lt;em&gt;with&lt;/em&gt; that character. &lt;code&gt;[(a -&amp;gt; {1, 3} ++ b -&amp;gt; {2}), (a -&amp;gt; {} ++ b -&amp;gt; {1, 2, 3}]&lt;/code&gt;. &lt;/p&gt;
    &lt;p&gt;We can also transform &lt;code&gt;Row -&amp;gt; PowerSet(Col)&lt;/code&gt; into &lt;code&gt;Row -&amp;gt; Col -&amp;gt; Bool&lt;/code&gt;, aka a boolean matrix. This makes sense to me as both forms are means of representing directed graphs.&lt;/p&gt;
    &lt;p&gt;Are other function combinators useful for thinking about arrays?&lt;/p&gt;
    &lt;p&gt;Does this model cover pivot tables? Can we extend it to relational data with multiple tables?&lt;/p&gt;
    &lt;hr/&gt;
    &lt;h3&gt;Systems Distributed Talk (will be) Online&lt;/h3&gt;
    &lt;p&gt;The premier will be August 6 at 12 CST, &lt;a href="https://www.youtube.com/watch?v=d9cM8f_qSLQ" target="_blank"&gt;here&lt;/a&gt;! I'll be there to answer questions / mock my own performance / generally make a fool of myself.&lt;/p&gt;
    &lt;div class="footnote"&gt;
    &lt;hr/&gt;
    &lt;ol&gt;
    &lt;li id="fn:1-indexing"&gt;
    &lt;p&gt;&lt;a href="https://buttondown.com/hillelwayne/archive/why-do-arrays-start-at-0/" target="_blank"&gt;Sacrilege&lt;/a&gt;! But it turns out in this context, it's easier to use 1-indexing than 0-indexing. In the years since I wrote that article I've settled on "each indexing choice matches different kinds of mathematical work", so mathematicians and computer scientists are best served by being able to choose their index. But software engineers need consistency, and 0-indexing is overall a net better consistency pick. &lt;a class="footnote-backref" href="#fnref:1-indexing" title="Jump back to footnote 1 in the text"&gt;↩&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id="fn:associative"&gt;
    &lt;p&gt;This is &lt;em&gt;right-associative&lt;/em&gt;: &lt;code&gt;a -&amp;gt; b -&amp;gt; c&lt;/code&gt; means &lt;code&gt;a -&amp;gt; (b -&amp;gt; c)&lt;/code&gt;, not &lt;code&gt;(a -&amp;gt; b) -&amp;gt; c&lt;/code&gt;. &lt;code&gt;(1..3 -&amp;gt; 1..4) -&amp;gt; Int&lt;/code&gt; would be the associative array that maps length-3 arrays to integers. &lt;a class="footnote-backref" href="#fnref:associative" title="Jump back to footnote 2 in the text"&gt;↩&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id="fn:typeclass"&gt;
    &lt;p&gt;Technically it has type &lt;code&gt;Num a =&amp;gt; a -&amp;gt; a -&amp;gt; a&lt;/code&gt;, since &lt;code&gt;(+)&lt;/code&gt; works on floats too. &lt;a class="footnote-backref" href="#fnref:typeclass" title="Jump back to footnote 3 in the text"&gt;↩&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id="fn:name-dimension"&gt;
    &lt;p&gt;Notice that if each &lt;code&gt;Airport&lt;/code&gt; had a unique name, we &lt;em&gt;could&lt;/em&gt; pull it out into &lt;code&gt;AirportName -&amp;gt; Airport(flights, revenue)&lt;/code&gt;, but we still are stuck with two different values. &lt;a class="footnote-backref" href="#fnref:name-dimension" title="Jump back to footnote 4 in the text"&gt;↩&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;/ol&gt;
    &lt;/div&gt;
    </description><pubDate>Wed, 30 Jul 2025 13:00:00 +0000</pubDate><guid>https://buttondown.com/hillelwayne/archive/2000-words-about-arrays-and-tables/</guid></item><item><title>Programming Language Escape Hatches</title><link>https://buttondown.com/hillelwayne/archive/programming-language-escape-hatches/</link><description>
    &lt;p&gt;The excellent-but-defunct blog &lt;a href="https://prog21.dadgum.com/38.html" target="_blank"&gt;Programming in the 21st Century&lt;/a&gt; defines "puzzle languages" as languages were part of the appeal is in figuring out how to express a program idiomatically, like a puzzle. As examples, he lists Haskell, Erlang, and J. All puzzle languages, the author says, have an "escape" out of the puzzle model that is pragmatic but stigmatized.&lt;/p&gt;
    &lt;p&gt;But many mainstream languages have escape hatches, too.&lt;/p&gt;
    &lt;p&gt;Languages have a lot of properties. One of these properties is the language's &lt;a href="https://buttondown.com/hillelwayne/archive/the-capability-tractability-tradeoff/" target="_blank"&gt;capabilities&lt;/a&gt;, roughly the set of things you can do in the language. Capability is desirable but comes into conflicts with a lot of other desirable properties, like simplicity or efficiency. In particular, reducing the capability of a language means that all remaining programs share more in common, meaning there's more assumptions the compiler and programmer can make ("tractability"). Assumptions are generally used to reason about correctness, but can also be about things like optimization: J's assumption that everything is an array leads to &lt;a href="https://code.jsoftware.com/wiki/Vocabulary/SpecialCombinations" target="_blank"&gt;high-performance "special combinations"&lt;/a&gt;. &lt;/p&gt;
    &lt;p&gt;Rust is the most famous example of &lt;strong&gt;mainstream&lt;/strong&gt; language that trades capability for tractability.&lt;sup id="fnref:gc"&gt;&lt;a class="footnote-ref" href="#fn:gc"&gt;1&lt;/a&gt;&lt;/sup&gt; Rust has a lot of rules designed to prevent common memory errors, like keeping a reference to deallocated memory or modifying memory while something else is reading it. As a consequence, there's a lot of things that cannot be done in (safe) Rust, like interface with an external C function (as it doesn't have these guarantees).&lt;/p&gt;
    &lt;p&gt;To do this, you need to use &lt;a href="https://doc.rust-lang.org/book/ch20-01-unsafe-rust.html" target="_blank"&gt;unsafe Rust&lt;/a&gt;, which lets you do additional things forbidden by safe Rust, such as deference a raw pointer. Everybody tells you not to use &lt;code&gt;unsafe&lt;/code&gt; unless you absolutely 100% know what you're doing, and possibly not even then.&lt;/p&gt;
    &lt;p&gt;Sounds like an escape hatch to me!&lt;/p&gt;
    &lt;p&gt;To extrapolate, an &lt;strong&gt;escape hatch&lt;/strong&gt; is a feature (either in the language itself or a particular implementation) that deliberately breaks core assumptions about the language in order to add capabilities. This explains both Rust and most of the so-called "puzzle languages": they need escape hatches because they have very strong conceptual models of the language which leads to lots of assumptions about programs. But plenty of "kitchen sink" mainstream languages have escape hatches, too:&lt;/p&gt;
    &lt;ul&gt;
    &lt;li&gt;Some compilers let C++ code embed &lt;a href="https://en.cppreference.com/w/cpp/language/asm.html" target="_blank"&gt;inline assembly&lt;/a&gt;.&lt;/li&gt;
    &lt;li&gt;Languages built on .NET or the JVM has some sort of interop with C# or Java, and many of those languages make assumptions about programs that C#/Java do not.&lt;/li&gt;
    &lt;li&gt;The SQL language has stored procedures as an escape hatch &lt;em&gt;and&lt;/em&gt; vendors create a second escape hatch of user-defined functions.&lt;/li&gt;
    &lt;li&gt;Ruby lets you bypass any form of encapsulation with &lt;a href="https://ruby-doc.org/3.4.1/Object.html#method-i-send" target="_blank"&gt;&lt;code&gt;send&lt;/code&gt;&lt;/a&gt;.&lt;/li&gt;
    &lt;li&gt;Frameworks have escape hatches, too! React has &lt;a href="https://react.dev/learn/escape-hatches" target="_blank"&gt;an entire page on them&lt;/a&gt;.&lt;/li&gt;
    &lt;/ul&gt;
    &lt;p&gt;(Does &lt;code&gt;eval&lt;/code&gt; in interpreted languages count as an escape hatch? It feels different, but it does add a lot of capability. Maybe they don't "break assumptions" in the same way?)&lt;/p&gt;
    &lt;h3&gt;The problem with escape hatches&lt;/h3&gt;
    &lt;p&gt;In all languages with escape hatches, the rule is "use this as carefully and sparingly as possible", to the point where a messy solution &lt;em&gt;without&lt;/em&gt; an escape hatch is preferable to a clean solution &lt;em&gt;with&lt;/em&gt; one. Breaking a core assumption is a big deal! If the language is operating as if its still true, it's going to do incorrect things. &lt;/p&gt;
    &lt;p&gt;I recently had this problem in a TLA+ contract. TLA+ is a language for modeling complicated systems, and assumes that the model is a self-contained universe. The client wanted to use the TLA+ to test a real system. The model checker should send commands to a test device and check the next states were the same. This is straightforward to set up with the &lt;a href="https://github.com/tlaplus/CommunityModules/blob/master/modules/IOUtils.tla" target="_blank"&gt;IOExec escape hatch&lt;/a&gt;.&lt;sup id="fnref:ioexec"&gt;&lt;a class="footnote-ref" href="#fn:ioexec"&gt;2&lt;/a&gt;&lt;/sup&gt; But the model checker assumed that state exploration was pure and it could skip around the state randomly, meaning it would do things like &lt;code&gt;set x = 10&lt;/code&gt;, then skip to &lt;code&gt;set x = 1&lt;/code&gt;, then skip back to &lt;code&gt;inc x; assert x == 11&lt;/code&gt;. Oops!&lt;/p&gt;
    &lt;p&gt;We eventually found workarounds but it took a lot of clever tricks to pull off. I'll probably write up the technique when I'm less busy with The Book.&lt;/p&gt;
    &lt;p&gt;The other problem with escape hatches is the rest of the language is designed around &lt;em&gt;not&lt;/em&gt; having said capabilities, meaning it can't support the feature as well as a language designed for them from the start. Even if your escape hatch code is clean, it might not cleanly &lt;em&gt;integrate&lt;/em&gt; with the rest of your code. This is why people &lt;a href="https://chadaustin.me/2024/10/intrusive-linked-list-in-rust/" target="_blank"&gt;complain about unsafe Rust&lt;/a&gt; so often.&lt;/p&gt;
    &lt;div class="footnote"&gt;
    &lt;hr/&gt;
    &lt;ol&gt;
    &lt;li id="fn:gc"&gt;
    &lt;p&gt;It should be noted though that &lt;em&gt;all&lt;/em&gt; languages with automatic memory management are trading capability for tractability, too. If you can't deference pointers, you can't deference &lt;em&gt;null&lt;/em&gt; pointers. &lt;a class="footnote-backref" href="#fnref:gc" title="Jump back to footnote 1 in the text"&gt;↩&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id="fn:ioexec"&gt;
    &lt;p&gt;From the Community Modules (which come default with the VSCode extension). &lt;a class="footnote-backref" href="#fnref:ioexec" title="Jump back to footnote 2 in the text"&gt;↩&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;/ol&gt;
    &lt;/div&gt;
    </description><pubDate>Thu, 24 Jul 2025 14:00:00 +0000</pubDate><guid>https://buttondown.com/hillelwayne/archive/programming-language-escape-hatches/</guid></item><item><title>Maybe writing speed actually is a bottleneck for programming</title><link>https://buttondown.com/hillelwayne/archive/maybe-writing-speed-actually-is-a-bottleneck-for/</link><description>
    &lt;p&gt;I'm a big (neo)vim buff. My config is over 1500 lines and I regularly write new scripts. I recently ported my neovim config to a new laptop. Before then, I was using VSCode to write, and when I switched back I immediately saw a big gain in productivity.&lt;/p&gt;
    &lt;p&gt;People often pooh-pooh vim (and other assistive writing technologies) by saying that writing code isn't the bottleneck in software development. Reading, understanding, and thinking through code is!&lt;/p&gt;
    &lt;p&gt;Now I don't know how true this actually is in practice, because empirical studies of time spent coding are all over the place. Most of them, like &lt;a href="https://www.microsoft.com/en-us/research/wp-content/uploads/2016/02/meyer-fse-2014.pdf" target="_blank"&gt;this study&lt;/a&gt;, track time spent in the editor but don't distinguish between time spent reading code and time spent writing code. The only one I found that separates them was &lt;a href="https://scispace.com/pdf/i-know-what-you-did-last-summer-an-investigation-of-how-3zxclzzocc.pdf" target="_blank"&gt;this study&lt;/a&gt;. It finds that developers spend only 5% of their time editing. It also finds they spend 14% of their time moving or resizing editor windows, so I don't know how clean their data is.&lt;/p&gt;
    &lt;p&gt;But I have a bigger problem with "writing is not the bottleneck": when I think of a bottleneck, I imagine that &lt;em&gt;no&lt;/em&gt; amount of improvement will lead to productivity gains. Like if a program is bottlenecked on the network, it isn't going to get noticeably faster with 100x more ram or compute. &lt;/p&gt;
    &lt;p&gt;But being able to type code 100x faster, even with without corresponding improvements to reading and imagining code, would be &lt;strong&gt;huge&lt;/strong&gt;. &lt;/p&gt;
    &lt;p&gt;We'll assume the average developer writes at 80 words per minute, at five characters a word, for 400 characters a minute.What could we do if we instead wrote at 8,000 words/40k characters a minute? &lt;/p&gt;
    &lt;h3&gt;Writing fast&lt;/h3&gt;
    &lt;h4&gt;Boilerplate is trivial&lt;/h4&gt;
    &lt;p&gt;Why do people like type inference? Because writing all of the types manually is annoying. Why don't people like boilerplate? Because it's annoying to write every damn time. Programmers like features that help them write less! That's not a problem if you can write all of the boilerplate in 0.1 seconds.&lt;/p&gt;
    &lt;p&gt;You still have the problem of &lt;em&gt;reading&lt;/em&gt; boilerplate heavy code, but you can use the remaining 0.9 seconds to churn out an extension that parses the file and presents the boilerplate in a more legible fashion. &lt;/p&gt;
    &lt;h4&gt;We can write more tooling&lt;/h4&gt;
    &lt;p&gt;This is something I've noticed with LLMs: when I can churn out crappy code as a free action, I use that to write lots of tools that assist me in writing &lt;em&gt;good&lt;/em&gt; code. Even if I'm bottlenecked on a large program, I can still quickly write a script that helps me with something. Most of these aren't things I would have written because they'd take too long to write! &lt;/p&gt;
    &lt;p&gt;Again, not the best comparison, because LLMs also shortcut learning the relevant APIs, so also optimize the "understanding code" part. Then again, if I could type real fast I could more quickly whip up experiments on new apis to learn them faster. &lt;/p&gt;
    &lt;h4&gt;We can do practices that slow us down in the short-term&lt;/h4&gt;
    &lt;p&gt;Something like test-driven development significantly slows down how fast you write production code, because you have to spend a lot more time writing test code. Pair programming trades speed of writing code for speed of understanding code. A two-order-of-magnitude writing speedup makes both of them effectively free. Or, if you're not an eXtreme Programming fan, you can more easily follow the &lt;a href="https://en.wikipedia.org/wiki/The_Power_of_10:_Rules_for_Developing_Safety-Critical_Code" target="_blank"&gt;The Power of Ten Rules&lt;/a&gt; and blanket your code with contracts and assertions.&lt;/p&gt;
    &lt;h4&gt;We could do more speculative editing&lt;/h4&gt;
    &lt;p&gt;This is probably the biggest difference in how we'd work if we could write 100x faster: it'd be much easier to try changes to the code to see if they're good ideas in the first place. &lt;/p&gt;
    &lt;p&gt;How often have I tried optimizing something, only to find out it didn't make a difference? How often have I done a refactoring only to end up with lower-quality code overall? Too often. Over time it makes me prefer to try things that I know will work, and only "speculatively edit" when I think it be a fast change. If I could code 100x faster it would absolutely lead to me trying more speculative edits.&lt;/p&gt;
    &lt;p&gt;This is especially big because I believe that lots of speculative edits are high-risk, high-reward: given 50 things we could do to the code, 49 won't make a difference and one will be a major improvement. If I only have time to try five things, I have a 10% chance of hitting the jackpot. If I can try 500 things I will get that reward every single time. &lt;/p&gt;
    &lt;h2&gt;Processes are built off constraints&lt;/h2&gt;
    &lt;p&gt;There are just a few ideas I came up with; there are probably others. Most of them, I suspect, will share the same property in common: they change &lt;em&gt;the process&lt;/em&gt; of writing code to leverage the speedup. I can totally believe that a large speedup would not remove a bottleneck in the processes we &lt;em&gt;currently&lt;/em&gt; use to write code. But that's because those processes are developed work within our existing constraints. Remove a constraint and new processes become possible.&lt;/p&gt;
    &lt;p&gt;The way I see it, if our current process produces 1 Utils of Software / day, a 100x writing speedup might lead to only 1.5 UoS/day. But there are other processes that produce only 0.5 UoS/d &lt;em&gt;because they are bottlenecked on writing speed&lt;/em&gt;. A 100x speedup would lead to 10 UoS/day.&lt;/p&gt;
    &lt;p&gt;The problem with all of this that 100x speedup isn't realistic, and it's not obvious whether a 2x improvement would lead to better processes. Then again, one of the first custom vim function scripts I wrote was an aid to writing unit tests in a particular codebase, and it lead to me writing a lot more tests. So maybe even a 2x speedup is going to be speed things up, too.&lt;/p&gt;
    &lt;hr/&gt;
    &lt;h3&gt;Patreon Stuff&lt;/h3&gt;
    &lt;p&gt;I wrote a couple of TLA+ specs to show how to model &lt;a href="https://en.wikipedia.org/wiki/Fork%E2%80%93join_model" target="_blank"&gt;fork-join&lt;/a&gt; algorithms. I'm planning on eventually writing them up for my blog/learntla but it'll be a while, so if you want to see them in the meantime I put them up on &lt;a href="https://www.patreon.com/posts/fork-join-in-tla-134209395?utm_medium=clipboard_copy&amp;amp;utm_source=copyLink&amp;amp;utm_campaign=postshare_creator&amp;amp;utm_content=join_link" target="_blank"&gt;Patreon&lt;/a&gt;.&lt;/p&gt;
    </description><pubDate>Thu, 17 Jul 2025 19:08:27 +0000</pubDate><guid>https://buttondown.com/hillelwayne/archive/maybe-writing-speed-actually-is-a-bottleneck-for/</guid></item><item><title>Logic for Programmers Turns One</title><link>https://buttondown.com/hillelwayne/archive/logic-for-programmers-turns-one/</link><description>
    &lt;p&gt;I released &lt;a href="https://leanpub.com/logic/" target="_blank"&gt;Logic for Programmers&lt;/a&gt; exactly one year ago today. It feels weird to celebrate the anniversary of something that isn't 1.0 yet, but software projects have a proud tradition of celebrating a dozen anniversaries before 1.0. I wanted to share about what's changed in the past year and the work for the next six+ months.&lt;/p&gt;
    &lt;p&gt;&lt;img alt="The book cover!" class="newsletter-image" src="https://assets.buttondown.email/images/70ac47c9-c49f-47c0-9a05-7a9e70551d03.jpg?w=960&amp;amp;fit=max"/&gt;&lt;/p&gt;
    &lt;h3&gt;The Road to 0.1&lt;/h3&gt;
    &lt;p&gt;I had been noodling on the idea of a logic book since the pandemic. The first time I wrote about it on the newsletter was in &lt;a href="https://buttondown.com/hillelwayne/archive/predicate-logic-for-programmers/" target="_blank"&gt;2021&lt;/a&gt;! Then I said that it would be done by June and would be "under 50 pages". The idea was to cover logic as a "soft skill" that helped you think about things like requirements and stuff.&lt;/p&gt;
    &lt;p&gt;That version &lt;em&gt;sucked&lt;/em&gt;. If you want to see how much it sucked, I put it up on &lt;a href="https://www.patreon.com/posts/what-logic-for-133675688" target="_blank"&gt;Patreon&lt;/a&gt;. Then I slept on the next draft for three years. Then in 2024 a lot of business fell through and I had a lot of free time, so with the help of &lt;a href="https://saul.pw/" target="_blank"&gt;Saul Pwanson&lt;/a&gt; I rewrote the book. This time I emphasized breadth over depth, trying to cover a lot more techniques.  &lt;/p&gt;
    &lt;p&gt;I also decided to self-publish it instead of pitching it to a publisher. Not going the traditional route would mean I would be responsible for paying for editing, advertising, graphic design etc, but I hoped that would be compensated by &lt;em&gt;much&lt;/em&gt; higher royalties. It also meant I could release the book in early access and use early sales to fund further improvements. So I wrote up a draft in &lt;a href="https://www.sphinx-doc.org/en/master/" target="_blank"&gt;Sphinx&lt;/a&gt;, compiled it to LaTeX, and uploaded the PDF to &lt;a href="https://leanpub.com/" target="_blank"&gt;leanpub&lt;/a&gt;. That was in June 2024.&lt;/p&gt;
    &lt;p&gt;Since then I kept to a monthly cadence of updates, missing once in November (short-notice contract) and once last month (&lt;a href="https://systemsdistributed.com/" target="_blank"&gt;Systems Distributed&lt;/a&gt;). The book's now on v0.10. What's changed?&lt;/p&gt;
    &lt;h3&gt;A LOT&lt;/h3&gt;
    &lt;p&gt;v0.1 was &lt;em&gt;very obviously&lt;/em&gt; an alpha, and I have made a lot of improvements since then. For one, the book no longer looks like a &lt;a href="https://www.sphinx-doc.org/_/downloads/en/master/pdf/#page=13" target="_blank"&gt;Sphinx manual&lt;/a&gt;. Compare!&lt;/p&gt;
    &lt;p&gt;&lt;img alt="0.1 on left, 0.10 on right. Way better!" class="newsletter-image" src="https://assets.buttondown.email/images/e4d880ad-80b8-4360-9cae-27c07598c740.png?w=960&amp;amp;fit=max"/&gt;&lt;/p&gt;
    &lt;p&gt;Also, the content is very, very different. v0.1 was 19,000 words, v.10 is 31,000.&lt;sup id="fnref:pagesize"&gt;&lt;a class="footnote-ref" href="#fn:pagesize"&gt;1&lt;/a&gt;&lt;/sup&gt; This comes from new chapters on TLA+, constraint/SMT solving, logic programming, and major expansions to the existing chapters. Originally, "Simplifying Conditionals" was 600 words. Six hundred words! It almost fit in two pages!&lt;/p&gt;
    &lt;p&gt;&lt;img alt="How short Simplifying Conditions USED to be" class="newsletter-image" src="https://assets.buttondown.email/images/31e731b7-3bdc-4ded-9b09-2a6261a323ec.png?w=960&amp;amp;fit=max"/&gt;&lt;/p&gt;
    &lt;p&gt;The chapter is now 2600 words, now covering condition lifting, quantifier manipulation, helper predicates, and set optimizations. All the other chapters have either gotten similar facelifts or are scheduled to get facelifts.&lt;/p&gt;
    &lt;p&gt;The last big change is the addition of &lt;a href="https://github.com/logicforprogrammers/book-assets" target="_blank"&gt;book assets&lt;/a&gt;. Originally you had to manually copy over all of the code to try it out, which is a problem when there are samples in eight distinct languages! Now there are ready-to-go examples for each chapter, with instructions on how to set up each programming environment. This is also nice because it gives me breaks from writing to code instead.&lt;/p&gt;
    &lt;h3&gt;How did the book do?&lt;/h3&gt;
    &lt;p&gt;Leanpub's all-time visualizations are terrible, so I'll just give the summary: 1180 copies sold, $18,241 in royalties. That's a lot of money for something that isn't fully out yet! By comparison, &lt;em&gt;Practical TLA+&lt;/em&gt; has made me less than half of that, despite selling over 5x as many books. Self-publishing was the right choice!&lt;/p&gt;
    &lt;p&gt;In that time I've paid about $400 for the book cover (worth it) and maybe $800 in Leanpub's advertising service (probably not worth it). &lt;/p&gt;
    &lt;p&gt;Right now that doesn't come close to making back the time investment, but I think it can get there post-release. I believe there's a lot more potential customers via marketing. I think post-release 10k copies sold is within reach.&lt;/p&gt;
    &lt;p class="empty-line" style="height:16px; margin:0px !important;"&gt;&lt;/p&gt;
    &lt;h3&gt;Where is the book going?&lt;/h3&gt;
    &lt;div class="subscribe-form"&gt;&lt;/div&gt;
    &lt;p&gt;The main content work is rewrites: many of the chapters have not meaningfully changed since 1.0, so I am going through and rewriting them from scratch. So far four of the ten chapters have been rewritten. My (admittedly ambitious) goal is to rewrite three of them by the end of this month and another three by the end of next. I also want to do final passes on the rewritten chapters; as most of them have a few TODOs left lying around.&lt;/p&gt;
    &lt;p&gt;(Also somehow in starting this newsletter and publishing it I realized that one of the chapters might be better split into two chapters, so there could well-be a tenth technique in v0.11 or v0.12!)&lt;/p&gt;
    &lt;p&gt;After that, I will pass it to a copy editor while I work on improving the layout, making images, and indexing. I want to have something worthy of printing on a dead tree by 1.0. &lt;/p&gt;
    &lt;p&gt;In terms of timelines, I am &lt;strong&gt;very roughly&lt;/strong&gt; estimating something like this:&lt;/p&gt;
    &lt;ul&gt;
    &lt;li&gt;Summer: final big changes and rewrites&lt;/li&gt;
    &lt;li&gt;Early Autumn: graphic design and copy editing&lt;/li&gt;
    &lt;li&gt;Late Autumn: proofing, figuring out printing stuff&lt;/li&gt;
    &lt;li&gt;Winter: final ebook and initial print releases of 1.0.&lt;/li&gt;
    &lt;/ul&gt;
    &lt;p&gt;(If you know a service that helps get self-published books "past the finish line", I'd love to hear about it! Preferably something that works for a fee, not part of royalties.)&lt;/p&gt;
    &lt;p&gt;This timeline may be disrupted by official client work, like a new TLA+ contract or a conference invitation.&lt;/p&gt;
    &lt;p&gt;Needless to say, I am incredibly excited to complete this book and share the final version with you all. This is a book I wished for years ago, a book I wrote because nobody else would. It fills a critical gap in software educational material, and someday soon I'll be able to put a copy on my bookshelf. It's exhilarating and terrifying and above all, satisfying.&lt;/p&gt;
    &lt;div class="footnote"&gt;
    &lt;hr/&gt;
    &lt;ol&gt;
    &lt;li id="fn:pagesize"&gt;
    &lt;p&gt;It's also 150 pages vs 50 pages, but admittedly this is partially because I made the book smaller with a larger font. &lt;a class="footnote-backref" href="#fnref:pagesize" title="Jump back to footnote 1 in the text"&gt;↩&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;/ol&gt;
    &lt;/div&gt;
    </description><pubDate>Tue, 08 Jul 2025 18:18:52 +0000</pubDate><guid>https://buttondown.com/hillelwayne/archive/logic-for-programmers-turns-one/</guid></item><item><title>Logical Quantifiers in Software</title><link>https://buttondown.com/hillelwayne/archive/logical-quantifiers-in-software/</link><description>
    &lt;p&gt;I realize that for all I've talked about &lt;a href="https://leanpub.com/logic/" target="_blank"&gt;Logic for Programmers&lt;/a&gt; in this newsletter, I never once explained basic logical quantifiers. They're both simple and incredibly useful, so let's do that this week! &lt;/p&gt;
    &lt;h3&gt;Sets and quantifiers&lt;/h3&gt;
    &lt;p&gt;A &lt;strong&gt;set&lt;/strong&gt; is a collection of unordered, unique elements. &lt;code&gt;{1, 2, 3, …}&lt;/code&gt; is a set, as are "every programming language", "every programming language's Wikipedia page", and "every function ever defined in any programming language's standard library". You can put whatever you want in a set, with some very specific limitations to avoid certain paradoxes.&lt;sup id="fnref:paradox"&gt;&lt;a class="footnote-ref" href="#fn:paradox"&gt;2&lt;/a&gt;&lt;/sup&gt; &lt;/p&gt;
    &lt;p&gt;Once we have a set, we can ask "is something true for all elements of the set" and "is something true for at least one element of the set?" IE, is it true that every programming language has a &lt;code&gt;set&lt;/code&gt; collection type in the core language? We would write it like this:&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;# all of them
    all l in ProgrammingLanguages: HasSetType(l)
    
    # at least one
    some l in ProgrammingLanguages: HasSetType(l)
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;This is the notation I use in the book because it's easy to read, type, and search for. Mathematicians historically had a few different formats; the one I grew up with was &lt;code&gt;∀x ∈ set: P(x)&lt;/code&gt; to mean &lt;code&gt;all x in set&lt;/code&gt;, and &lt;code&gt;∃&lt;/code&gt; to mean &lt;code&gt;some&lt;/code&gt;. I use these when writing for just myself, but find them confusing to programmers when communicating.&lt;/p&gt;
    &lt;p&gt;"All" and "some" are respectively referred to as "universal" and "existential" quantifiers.&lt;/p&gt;
    &lt;h3&gt;Some cool properties&lt;/h3&gt;
    &lt;p&gt;We can simplify expressions with quantifiers, in the same way that we can simplify &lt;code&gt;!(x &amp;amp;&amp;amp; y)&lt;/code&gt; to &lt;code&gt;!x || !y&lt;/code&gt;.&lt;/p&gt;
    &lt;p&gt;First of all, quantifiers are commutative with themselves. &lt;code&gt;some x: some y: P(x,y)&lt;/code&gt; is the same as &lt;code&gt;some y: some x: P(x, y)&lt;/code&gt;. For this reason we can write &lt;code&gt;some x, y: P(x,y)&lt;/code&gt; as shorthand. We can even do this when quantifying over different sets, writing &lt;code&gt;some x, x' in X, y in Y&lt;/code&gt; instead of &lt;code&gt;some x, x' in X: some y in Y&lt;/code&gt;. We can &lt;em&gt;not&lt;/em&gt; do this with "alternating quantifiers":&lt;/p&gt;
    &lt;ul&gt;
    &lt;li&gt;&lt;code&gt;all p in Person: some m in Person: Mother(m, p)&lt;/code&gt; says that every person has a mother.&lt;/li&gt;
    &lt;li&gt;&lt;code&gt;some m in Person: all p in Person: Mother(m, p)&lt;/code&gt; says that someone is every person's mother.&lt;/li&gt;
    &lt;/ul&gt;
    &lt;p&gt;Second, existentials distribute over &lt;code&gt;||&lt;/code&gt; while universals distribute over &lt;code&gt;&amp;amp;&amp;amp;&lt;/code&gt;. "There is some url which returns a 403 or 404" is the same as "there is some url which returns a 403 or some url that returns a 404", and "all PRs pass the linter and the test suites" is the same as "all PRs pass the linter and all PRs pass the test suites".&lt;/p&gt;
    &lt;p&gt;Finally, &lt;code&gt;some&lt;/code&gt; and &lt;code&gt;all&lt;/code&gt; are &lt;em&gt;duals&lt;/em&gt;: &lt;code&gt;some x: P(x) == !(all x: !P(x))&lt;/code&gt;, and vice-versa. Intuitively: if some file is malicious, it's not true that all files are benign.&lt;/p&gt;
    &lt;p&gt;All these rules together mean we can manipulate quantifiers &lt;em&gt;almost&lt;/em&gt; as easily as we can manipulate regular booleans, putting them in whatever form is easiest to use in programming. &lt;/p&gt;
    &lt;p&gt;Speaking of which, how &lt;em&gt;do&lt;/em&gt; we use this in in programming?&lt;/p&gt;
    &lt;h2&gt;How we use this in programming&lt;/h2&gt;
    &lt;p&gt;First of all, people clearly have a need for directly using quantifiers in code. If we have something of the form:&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;for x in list:
        if P(x):
            return true
    return false
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;That's just &lt;code&gt;some x in list: P(x)&lt;/code&gt;. And this is a prevalent pattern, as you can see by using &lt;a href="https://github.com/search?q=%2Ffor+.*%3A%5Cn%5Cs*if+.*%3A%5Cn%5Cs*return+%28False%7CTrue%29%5Cn%5Cs*return+%28True%7CFalse%29%2F+language%3Apython+NOT+is%3Afork&amp;amp;type=code" target="_blank"&gt;GitHub code search&lt;/a&gt;. It finds over 500k examples of this pattern in Python alone! That can be simplified via using the language's built-in quantifiers: the Python would be &lt;code&gt;any(P(x) for x in list)&lt;/code&gt;.&lt;/p&gt;
    &lt;p&gt;(Note this is not quantifying over sets but iterables. But the idea translates cleanly enough.)&lt;/p&gt;
    &lt;p&gt;More generally, quantifiers are a key way we express higher-level properties of software. What does it mean for a list to be sorted in ascending order? That &lt;code&gt;all i, j in 0..&amp;lt;len(l): if i &amp;lt; j then l[i] &amp;lt;= l[j]&lt;/code&gt;. When should a &lt;a href="https://qntm.org/ratchet" target="_blank"&gt;ratchet test fail&lt;/a&gt;? When &lt;code&gt;some f in functions - exceptions: Uses(f, bad_function)&lt;/code&gt;. Should the image classifier work upside down? &lt;code&gt;all i in images: classify(i) == classify(rotate(i, 180))&lt;/code&gt;. These are the properties we verify with tests and types and &lt;a href="https://www.hillelwayne.com/post/constructive/" target="_blank"&gt;MISU&lt;/a&gt; and whatnot;&lt;sup id="fnref:misu"&gt;&lt;a class="footnote-ref" href="#fn:misu"&gt;1&lt;/a&gt;&lt;/sup&gt; it helps to be able to make them explicit!&lt;/p&gt;
    &lt;p&gt;One cool use case that'll be in the book's next version: database invariants are universal statements over the set of all records, like &lt;code&gt;all a in accounts: a.balance &amp;gt; 0&lt;/code&gt;. That's enforceable with a &lt;a href="https://sqlite.org/lang_createtable.html#check_constraints" target="_blank"&gt;CHECK&lt;/a&gt; constraint. But what about something like &lt;code&gt;all i, i' in intervals: NoOverlap(i, i')&lt;/code&gt;? That isn't covered by CHECK, since it spans two rows.&lt;/p&gt;
    &lt;p&gt;Quantifier duality to the rescue! The invariant is equivalent to &lt;code&gt;!(some i, i' in intervals: Overlap(i, i'))&lt;/code&gt;, so is preserved if the &lt;em&gt;query&lt;/em&gt; &lt;code&gt;SELECT COUNT(*) FROM intervals CROSS JOIN intervals …&lt;/code&gt; returns 0 rows. This means we can test it via a &lt;a href="https://sqlite.org/lang_createtrigger.html" target="_blank"&gt;database trigger&lt;/a&gt;.&lt;sup id="fnref:efficiency"&gt;&lt;a class="footnote-ref" href="#fn:efficiency"&gt;3&lt;/a&gt;&lt;/sup&gt;&lt;/p&gt;
    &lt;hr/&gt;
    &lt;p&gt;There are a lot more use cases for quantifiers, but this is enough to introduce the ideas! Next week's the one year anniversary of the book entering early access, so I'll be writing a bit about that experience and how the book changed. It's &lt;em&gt;crazy&lt;/em&gt; how crude v0.1 was compared to the current version.&lt;/p&gt;
    &lt;div class="footnote"&gt;
    &lt;hr/&gt;
    &lt;ol&gt;
    &lt;li id="fn:misu"&gt;
    &lt;p&gt;MISU ("make illegal states unrepresentable") means using data representations that rule out invalid values. For example, if you have a &lt;code&gt;location -&amp;gt; Optional(item)&lt;/code&gt; lookup and want to make sure that each item is in exactly one location, consider instead changing the map to &lt;code&gt;item -&amp;gt; location&lt;/code&gt;. This is a means of &lt;em&gt;implementing&lt;/em&gt; the property &lt;code&gt;all i in item, l, l' in location: if ItemIn(i, l) &amp;amp;&amp;amp; l != l' then !ItemIn(i, l')&lt;/code&gt;. &lt;a class="footnote-backref" href="#fnref:misu" title="Jump back to footnote 1 in the text"&gt;↩&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id="fn:paradox"&gt;
    &lt;p&gt;Specifically, a set can't be an element of itself, which rules out constructing things like "the set of all sets" or "the set of sets that don't contain themselves". &lt;a class="footnote-backref" href="#fnref:paradox" title="Jump back to footnote 2 in the text"&gt;↩&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id="fn:efficiency"&gt;
    &lt;p&gt;Though note that when you're inserting or updating an interval, you already &lt;em&gt;have&lt;/em&gt; that row's fields in the trigger's &lt;code&gt;NEW&lt;/code&gt; keyword. So you can just query &lt;code&gt;!(some i in intervals: Overlap(new, i'))&lt;/code&gt;, which is more efficient. &lt;a class="footnote-backref" href="#fnref:efficiency" title="Jump back to footnote 3 in the text"&gt;↩&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;/ol&gt;
    &lt;/div&gt;
    </description><pubDate>Wed, 02 Jul 2025 19:44:22 +0000</pubDate><guid>https://buttondown.com/hillelwayne/archive/logical-quantifiers-in-software/</guid></item><item><title>You can cheat a test suite with a big enough polynomial</title><link>https://buttondown.com/hillelwayne/archive/you-can-cheat-a-test-suite-with-a-big-enough/</link><description>
    &lt;p&gt;Hi nerds, I'm back from &lt;a href="https://systemsdistributed.com/" target="_blank"&gt;Systems Distributed&lt;/a&gt;! I'd heartily recommend it, wildest conference I've been to in years. I have a lot of work to catch up on, so this will be a short newsletter.&lt;/p&gt;
    &lt;p&gt;In an earlier version of my talk, I had a gag about unit tests. First I showed the test &lt;code&gt;f([1,2,3]) == 3&lt;/code&gt;, then said that this was satisfied by &lt;code&gt;f(l) = 3&lt;/code&gt;, &lt;code&gt;f(l) = l[-1]&lt;/code&gt;, &lt;code&gt;f(l) = len(l)&lt;/code&gt;, &lt;code&gt;f(l) = (129*l[0]-34*l[1]-617)*l[2] - 443*l[0] + 1148*l[1] - 182&lt;/code&gt;. Then I progressively rule them out one by one with more unit tests, except the last polynomial which stubbornly passes every single test.&lt;/p&gt;
    &lt;p&gt;If you're given some function of &lt;code&gt;f(x: int, y: int, …): int&lt;/code&gt; and a set of unit tests asserting &lt;a href="https://buttondown.com/hillelwayne/archive/oracle-testing/" target="_blank"&gt;specific inputs give specific outputs&lt;/a&gt;, then you can find a polynomial that passes every single unit test.&lt;/p&gt;
    &lt;p&gt;To find the gag, and as &lt;a href="https://en.wikipedia.org/wiki/Satisfiability_modulo_theories" target="_blank"&gt;SMT&lt;/a&gt; practice, I wrote a Python program that finds a polynomial that passes a test suite meant for &lt;code&gt;max&lt;/code&gt;. It's hardcoded for three parameters and only finds 2nd-order polynomials but I think it could be generalized with enough effort.&lt;/p&gt;
    &lt;h2&gt;The code&lt;/h2&gt;
    &lt;p&gt;Full code &lt;a href="https://gist.github.com/hwayne/0ed045a35376c786171f9cf4b55c470f" target="_blank"&gt;here&lt;/a&gt;, breakdown below.&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nn"&gt;z3&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;  &lt;span class="c1"&gt;# type: ignore&lt;/span&gt;
    &lt;span class="n"&gt;s1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;s2&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;Solver&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt; &lt;span class="n"&gt;Solver&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;&lt;a href="https://microsoft.github.io/z3guide/" target="_blank"&gt;Z3&lt;/a&gt; is just the particular SMT solver we use, as it has good language bindings and a lot of affordances.&lt;/p&gt;
    &lt;p&gt;As part of learning SMT I wanted to do this two ways. First by putting the polynomial "outside" of the SMT solver in a python function, second by doing it "natively" in Z3. I created two solvers so I could test both versions in one run. &lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="n"&gt;a0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;a&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;b&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;c&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;d&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;f&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;Consts&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'a0 a b c d e f'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;IntSort&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt;
    &lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;y&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;z&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;Ints&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'x y z'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;t&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"a*x+b*y+c*z+d*x*y+e*x*z+f*y*z+a0"&lt;/span&gt;
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;Both &lt;code&gt;Const('x', IntSort())&lt;/code&gt; and &lt;code&gt;Int('x')&lt;/code&gt; do the exact same thing, the latter being syntactic sugar for the former. I did not know this when I wrote the program. &lt;/p&gt;
    &lt;p&gt;To keep the two versions in sync I represented the equation as a string, which I later &lt;code&gt;eval&lt;/code&gt;. This is one of the rare cases where eval is a good idea, to help us experiment more quickly while learning. The polynomial is a "2nd-order polynomial", even though it doesn't have &lt;code&gt;x^2&lt;/code&gt; terms, as it has &lt;code&gt;xy&lt;/code&gt; and &lt;code&gt;xz&lt;/code&gt; terms.&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="n"&gt;lambdamax&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;lambda&lt;/span&gt; &lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;y&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;z&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;eval&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;t&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    
    &lt;span class="n"&gt;z3max&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;Function&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'z3max'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;IntSort&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt; &lt;span class="n"&gt;IntSort&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt; &lt;span class="n"&gt;IntSort&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt;  &lt;span class="n"&gt;IntSort&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt;
    &lt;span class="n"&gt;s1&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;add&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ForAll&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;y&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;z&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="n"&gt;z3max&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;y&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;z&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="nb"&gt;eval&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;t&lt;/span&gt;&lt;span class="p"&gt;)))&lt;/span&gt;
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;&lt;code&gt;lambdamax&lt;/code&gt; is pretty straightforward: create a lambda with three parameters and &lt;code&gt;eval&lt;/code&gt; the string. The string "&lt;code&gt;a*x&lt;/code&gt;" then becomes the python expression &lt;code&gt;a*x&lt;/code&gt;, &lt;code&gt;a&lt;/code&gt; is an SMT symbol, while the &lt;code&gt;x&lt;/code&gt; SMT symbol is shadowed by the lambda parameter. To reiterate, a terrible idea in practice, but a good way to learn faster.&lt;/p&gt;
    &lt;p&gt;&lt;code&gt;z3max&lt;/code&gt; function is a little more complex. &lt;code&gt;Function&lt;/code&gt; takes an identifier string and N "sorts" (roughly the same as programming types). The first &lt;code&gt;N-1&lt;/code&gt; sorts define the parameters of the function, while the last becomes the output. So here I assign the string identifier &lt;code&gt;"z3max"&lt;/code&gt; to be a function with signature &lt;code&gt;(int, int, int) -&amp;gt; int&lt;/code&gt;.&lt;/p&gt;
    &lt;p&gt;I can load the function into the model by specifying constraints on what &lt;code&gt;z3max&lt;/code&gt; &lt;em&gt;could&lt;/em&gt; be. This could either be a strict input/output, as will be done later, or a &lt;code&gt;ForAll&lt;/code&gt; over all possible inputs. Here I just use that directly to say "for all inputs, the function should match this polynomial." But I could do more complicated constraints, like commutativity (&lt;code&gt;f(x, y) == f(y, x)&lt;/code&gt;) or monotonicity (&lt;code&gt;Implies(x &amp;lt; y, f(x) &amp;lt;= f(y))&lt;/code&gt;).&lt;/p&gt;
    &lt;p&gt;Note &lt;code&gt;ForAll&lt;/code&gt; takes a list of z3 symbols to quantify over. That's the only reason we need to define &lt;code&gt;x, y, z&lt;/code&gt; in the first place. The lambda version doesn't need them. &lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="n"&gt;inputs&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="p"&gt;)]&lt;/span&gt;
    
    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;g&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;inputs&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;s1&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;add&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;z3max&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;g&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="nb"&gt;max&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;g&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
        &lt;span class="n"&gt;s2&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;add&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;lambdamax&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;g&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="nb"&gt;max&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;g&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;This sets up the joke: adding constraints to each solver that the polynomial it finds must, for a fixed list of triplets, return the max of each triplet.&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;s&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;func&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="p"&gt;[(&lt;/span&gt;&lt;span class="n"&gt;s1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;z3max&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;s2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;lambdamax&lt;/span&gt;&lt;span class="p"&gt;)]:&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;s&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;check&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="n"&gt;sat&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="n"&gt;m&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;s&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
            &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;y&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;z&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;inputs&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
                &lt;span class="nb"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="s2"&gt;"max([&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;, &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;y&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;, &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;z&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;]) ="&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;m&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;evaluate&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;func&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;y&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;z&lt;/span&gt;&lt;span class="p"&gt;)))&lt;/span&gt;
            &lt;span class="nb"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="s2"&gt;"max([x, y, z]) = &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;m&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;a&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;x + &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;m&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;b&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;y"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="s2"&gt;"+ &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;m&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;c&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;z +"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="c1"&gt;# linebreaks added for newsletter rendering&lt;/span&gt;
                &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;m&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;d&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;xy + &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;m&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;xz + &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;m&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;yz + &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;m&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;a0&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;Output:&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;max([1, 2, 3]) = 3
    # etc
    max([x, y, z]) = -133x + 130y + -10z + -2xy + 62xz + -46yz + 0
    
    max([1, 2, 3]) = 3
    # etc
    max([x, y, z]) = -17x + 16y + 0z + 0xy + 8xz + -6yz + 0
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;I find that &lt;code&gt;z3max&lt;/code&gt; (top) consistently finds larger coefficients than &lt;code&gt;lambdamax&lt;/code&gt; does. I don't know why.&lt;/p&gt;
    &lt;h3&gt;Practical Applications&lt;/h3&gt;
    &lt;p&gt;&lt;strong&gt;Test-Driven Development&lt;/strong&gt; recommends a strict "red-green refactor" cycle. Write a new failing test, make the new test pass, then go back and refactor. Well, the easiest way to make the new test pass would be to paste in a new polynomial, so that's what you should be doing. You can even do this all automatically: have a script read the set of test cases, pass them to the solver, and write the new polynomial to your code file. All you need to do is write the tests!&lt;/p&gt;
    &lt;h3&gt;Pedagogical Notes&lt;/h3&gt;
    &lt;p&gt;Writing the script took me a couple of hours. I'm sure an LLM could have whipped it all up in five minutes but I really want to &lt;em&gt;learn&lt;/em&gt; SMT and &lt;a href="https://www.sciencedirect.com/science/article/pii/S0747563224002541" target="_blank"&gt;LLMs &lt;em&gt;may&lt;/em&gt; decrease learning retention&lt;/a&gt;.&lt;sup id="fnref:caveat"&gt;&lt;a class="footnote-ref" href="#fn:caveat"&gt;1&lt;/a&gt;&lt;/sup&gt; Z3 documentation is not... great for non-academics, though, and most other SMT solvers have even worse docs. One useful trick I use regularly is to use Github code search to find code using the same APIs and study how that works. Turns out reading API-heavy code is a lot easier than writing it!&lt;/p&gt;
    &lt;p&gt;Anyway, I'm very, very slowly feeling like I'm getting the basics on how to use SMT. I don't have any practical use cases yet, but I wanted to learn this skill for a while and glad I finally did.&lt;/p&gt;
    &lt;div class="footnote"&gt;
    &lt;hr/&gt;
    &lt;ol&gt;
    &lt;li id="fn:caveat"&gt;
    &lt;p&gt;Caveat I have not actually &lt;em&gt;read&lt;/em&gt; the study, for all I know it could have a sample size of three people, I'll get around to it eventually &lt;a class="footnote-backref" href="#fnref:caveat" title="Jump back to footnote 1 in the text"&gt;↩&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;/ol&gt;
    &lt;/div&gt;
    </description><pubDate>Tue, 24 Jun 2025 16:27:01 +0000</pubDate><guid>https://buttondown.com/hillelwayne/archive/you-can-cheat-a-test-suite-with-a-big-enough/</guid></item><item><title>Solving LinkedIn Queens with SMT</title><link>https://buttondown.com/hillelwayne/archive/solving-linkedin-queens-with-smt/</link><description>
    &lt;h3&gt;No newsletter next week&lt;/h3&gt;
    &lt;p&gt;I’ll be speaking at &lt;a href="https://systemsdistributed.com/" target="_blank"&gt;Systems Distributed&lt;/a&gt;. My talk isn't close to done yet, which is why this newsletter is both late and short. &lt;/p&gt;
    &lt;h1&gt;Solving LinkedIn Queens in SMT&lt;/h1&gt;
    &lt;p&gt;The article &lt;a href="https://codingnest.com/modern-sat-solvers-fast-neat-underused-part-1-of-n/" target="_blank"&gt;Modern SAT solvers: fast, neat and underused&lt;/a&gt; claims that SAT solvers&lt;sup id="fnref:SAT"&gt;&lt;a class="footnote-ref" href="#fn:SAT"&gt;1&lt;/a&gt;&lt;/sup&gt; are "criminally underused by the industry". A while back on the newsletter I asked "why": how come they're so powerful and yet nobody uses them? Many experts responded saying the reason is that encoding SAT kinda sucked and they rather prefer using tools that compile to SAT. &lt;/p&gt;
    &lt;p&gt;I was reminded of this when I read &lt;a href="https://ryanberger.me/posts/queens/" target="_blank"&gt;Ryan Berger's post&lt;/a&gt; on solving “LinkedIn Queens” as a SAT problem. &lt;/p&gt;
    &lt;p&gt;A quick overview of Queens. You’re presented with an NxN grid divided into N regions, and have to place N queens so that there is exactly one queen in each row, column, and region. While queens can be on the same diagonal, they &lt;em&gt;cannot&lt;/em&gt; be adjacently diagonal.&lt;/p&gt;
    &lt;p&gt;(Important note: Linkedin “Queens” is a variation on the puzzle game &lt;a href="https://starbattle.puzzlebaron.com/" target="_blank"&gt;Star Battle&lt;/a&gt;, which is the same except the number of stars you place in each row/column/region varies per puzzle, and is usually two. This is also why 'queens' don’t capture like chess queens.)&lt;/p&gt;
    &lt;p&gt;&lt;img alt="An image of a solved queens board. Copied from https://ryanberger.me/posts/queens" class="newsletter-image" src="https://assets.buttondown.email/images/96f6f923-331f-424d-8641-fe6753e1c2ca.png?w=960&amp;amp;fit=max"/&gt;&lt;/p&gt;
    &lt;p&gt;Ryan solved this by writing Queens as a SAT problem, expressing properties like "there is exactly one queen in row 3" as a large number of boolean clauses. &lt;a href="https://ryanberger.me/posts/queens/" target="_blank"&gt;Go read his post, it's pretty cool&lt;/a&gt;. What leapt out to me was that he used &lt;a href="https://cvc5.github.io/" target="_blank"&gt;CVC5&lt;/a&gt;, an &lt;strong&gt;SMT&lt;/strong&gt; solver.&lt;sup id="fnref:SMT"&gt;&lt;a class="footnote-ref" href="#fn:SMT"&gt;2&lt;/a&gt;&lt;/sup&gt; SMT solvers are "higher-level" than SAT, capable of handling more data types than just boolean variables. It's a lot easier to solve the problem at the SMT level than at the SAT level. To show this, I whipped up a short demo of solving the same problem in &lt;a href="https://github.com/Z3Prover/z3/wiki" target="_blank"&gt;Z3&lt;/a&gt; (via the &lt;a href="https://pypi.org/project/z3-solver/" target="_blank"&gt;Python API&lt;/a&gt;).&lt;/p&gt;
    &lt;p&gt;&lt;a href="https://gist.github.com/hwayne/c5de7bc52e733995311236666bedecd3" target="_blank"&gt;Full code here&lt;/a&gt;, which you can compare to Ryan's SAT solution &lt;a href="https://github.com/ryan-berger/queens/blob/master/main.py" target="_blank"&gt;here&lt;/a&gt;. I didn't do a whole lot of cleanup on it (again, time crunch!), but short explanation below.&lt;/p&gt;
    &lt;h3&gt;The code&lt;/h3&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nn"&gt;z3&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="c1"&gt;# type: ignore&lt;/span&gt;
    &lt;span class="kn"&gt;from&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nn"&gt;itertools&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;combinations&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;chain&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;product&lt;/span&gt;
    &lt;span class="n"&gt;solver&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;Solver&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="n"&gt;size&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;9&lt;/span&gt; &lt;span class="c1"&gt;# N&lt;/span&gt;
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;Initial setup and modules. &lt;code&gt;size&lt;/code&gt; is the number of rows/columns/regions in the board, which I'll call &lt;code&gt;N&lt;/code&gt; below.&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="c1"&gt;# queens[n] = col of queen on row n&lt;/span&gt;
    &lt;span class="c1"&gt;# by construction, not on same row&lt;/span&gt;
    &lt;span class="n"&gt;queens&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;IntVector&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'q'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;size&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; 
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;SAT represents the queen positions via N² booleans: &lt;code&gt;q_00&lt;/code&gt; means that a Queen is on row 0 and column 0, &lt;code&gt;!q_05&lt;/code&gt; means a queen &lt;em&gt;isn't&lt;/em&gt; on row 0 col 5, etc. In SMT we can instead encode it as N integers: &lt;code&gt;q_0 = 5&lt;/code&gt; means that the queen on row 0 is positioned at column 5. This immediately enforces one class of constraints for us: we don't need any constraints saying "exactly one queen per row", because that's embedded in the definition of &lt;code&gt;queens&lt;/code&gt;!&lt;/p&gt;
    &lt;p&gt;(Incidentally, using 0-based indexing for the board was a mistake on my part, it makes correctly encoding the regions later really painful.)&lt;/p&gt;
    &lt;p&gt;To actually make the variables &lt;code&gt;[q_0, q_1, …]&lt;/code&gt;, we use the Z3 affordance &lt;code&gt;IntVector(str, n)&lt;/code&gt; for making &lt;code&gt;n&lt;/code&gt; variables at once.&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="n"&gt;solver&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;add&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="n"&gt;And&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;=&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="n"&gt;size&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;queens&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
    &lt;span class="c1"&gt;# not on same column&lt;/span&gt;
    &lt;span class="n"&gt;solver&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;add&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;Distinct&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;queens&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;First we constrain all the integers to &lt;code&gt;[0, N)&lt;/code&gt;, then use the &lt;em&gt;incredibly&lt;/em&gt; handy &lt;code&gt;Distinct&lt;/code&gt; constraint to force all the integers to have different values. This guarantees at most one queen per column, which by the &lt;a href="https://en.wikipedia.org/wiki/Pigeonhole_principle" target="_blank"&gt;pigeonhole principle&lt;/a&gt; means there is exactly one queen per column.&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="c1"&gt;# not diagonally adjacent&lt;/span&gt;
    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;size&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="n"&gt;q1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;q2&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;queens&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="n"&gt;queens&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="o"&gt;+&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
        &lt;span class="n"&gt;solver&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;add&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;Abs&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;q1&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="n"&gt;q2&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;!=&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;One of the rules is that queens can't be adjacent. We already know that they can't be horizontally or vertically adjacent via other constraints, which leaves the diagonals. We only need to add constraints that, for each queen, there is no queen in the lower-left or lower-right corner, aka &lt;code&gt;q_3 != q_2 ± 1&lt;/code&gt;. We don't need to check the top corners because if &lt;code&gt;q_1&lt;/code&gt; is in the upper-left corner of &lt;code&gt;q_2&lt;/code&gt;, then &lt;code&gt;q_2&lt;/code&gt; is in the lower-right corner of &lt;code&gt;q_1&lt;/code&gt;!&lt;/p&gt;
    &lt;p&gt;That covers everything except the "one queen per region" constraint. But the regions are the tricky part, which we should expect because we vary the difficulty of queens games by varying the regions.&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="n"&gt;regions&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="s2"&gt;"purple"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[(&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;6&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;7&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;8&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
                       &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;6&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;7&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;8&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
                       &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;8&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)],&lt;/span&gt;
            &lt;span class="s2"&gt;"red"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;6&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;6&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;7&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;7&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;8&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;8&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;),],&lt;/span&gt;
            &lt;span class="c1"&gt;# you get the picture&lt;/span&gt;
            &lt;span class="p"&gt;}&lt;/span&gt;
    
    &lt;span class="c1"&gt;# Some checking code left out, see below&lt;/span&gt;
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;The region has to be manually coded in, which is a huge pain.&lt;/p&gt;
    &lt;p&gt;(In the link, some validation code follows. Since it breaks up explaining the model I put it in the next section.)&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;r&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;regions&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;values&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
        &lt;span class="n"&gt;solver&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;add&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;Or&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
            &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;queens&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;row&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="n"&gt;col&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;row&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;col&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;r&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
            &lt;span class="p"&gt;))&lt;/span&gt;
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;Finally we have the region constraint. The easiest way I found to say "there is exactly one queen in each region" is to say "there is a queen in region 1 and a queen in region 2 and a queen in region 3" etc." Then to say "there is a queen in region &lt;code&gt;purple&lt;/code&gt;" I wrote "&lt;code&gt;q_0 = 0&lt;/code&gt; OR &lt;code&gt;q_0 = 1&lt;/code&gt; OR … OR &lt;code&gt;q_1 = 0&lt;/code&gt; etc." &lt;/p&gt;
    &lt;p&gt;Why iterate over every position in the region instead of doing something like &lt;code&gt;(0, q[0]) in r&lt;/code&gt;? I tried that but it's not an expression that Z3 supports.&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;solver&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;check&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="n"&gt;sat&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;m&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;solver&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
        &lt;span class="nb"&gt;print&lt;/span&gt;&lt;span class="p"&gt;([(&lt;/span&gt;&lt;span class="n"&gt;l&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;m&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;l&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;l&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;queens&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;Finally, we solve and print the positions. Running this gives me:&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="p"&gt;[(&lt;/span&gt;&lt;span class="n"&gt;q__0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;q__1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;q__2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;8&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; 
     &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;q__3&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;q__4&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;7&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;q__5&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; 
     &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;q__6&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;q__7&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;q__8&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;6&lt;/span&gt;&lt;span class="p"&gt;)]&lt;/span&gt;
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;Which is the correct solution to the queens puzzle. I didn't benchmark the solution times, but I imagine it's considerably slower than a raw SAT solver. &lt;a href="https://github.com/audemard/glucose" target="_blank"&gt;Glucose&lt;/a&gt; is really, really fast.&lt;/p&gt;
    &lt;p&gt;But even so, solving the problem with SMT was a lot &lt;em&gt;easier&lt;/em&gt; than solving it with SAT. That satisfies me as an explanation for why people prefer it to SAT.&lt;/p&gt;
    &lt;h3&gt;Sanity checks&lt;/h3&gt;
    &lt;p&gt;One bit I glossed over earlier was the sanity checking code. I &lt;em&gt;knew for sure&lt;/em&gt; that I was going to make a mistake encoding the &lt;code&gt;region&lt;/code&gt;, and the solver wasn't going to provide useful information abut what I did wrong.  In cases like these, I like adding small tests and checks to catch mistakes early, because the solver certainly isn't going to catch them!&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="n"&gt;all_squares&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;set&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;product&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;size&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="n"&gt;repeat&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
    &lt;span class="k"&gt;def&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nf"&gt;test_i_set_up_problem_right&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
        &lt;span class="k"&gt;assert&lt;/span&gt; &lt;span class="n"&gt;all_squares&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="nb"&gt;set&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;chain&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;from_iterable&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;regions&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;values&lt;/span&gt;&lt;span class="p"&gt;()))&lt;/span&gt;
    
        &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;r1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;r2&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;combinations&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;regions&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;values&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
            &lt;span class="k"&gt;assert&lt;/span&gt; &lt;span class="ow"&gt;not&lt;/span&gt; &lt;span class="nb"&gt;set&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;r1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&lt;/span&gt; &lt;span class="nb"&gt;set&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;r2&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="nb"&gt;set&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;r1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&lt;/span&gt; &lt;span class="nb"&gt;set&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;r2&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;The first check was a quick test that I didn't leave any squares out, or accidentally put the same square in both regions. Converting the values into sets makes both checks a lot easier. Honestly I don't know why I didn't just use sets from the start, sets are great.&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="k"&gt;def&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nf"&gt;render_regions&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
        &lt;span class="n"&gt;colormap&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"purple"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;  &lt;span class="s2"&gt;"red"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;"brown"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;"white"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;"green"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;"yellow"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;"orange"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;"blue"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;"pink"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
        &lt;span class="n"&gt;board&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;_&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;size&lt;/span&gt;&lt;span class="p"&gt;)]&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;_&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;size&lt;/span&gt;&lt;span class="p"&gt;)]&lt;/span&gt; 
        &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;row&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;col&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;all_squares&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;color&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;region&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;regions&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;items&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
                &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;row&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;col&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;region&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
                    &lt;span class="n"&gt;board&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;row&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="n"&gt;col&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;colormap&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;index&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;color&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;+&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;
    
        &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;row&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;board&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="nb"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;""&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;join&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;map&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;row&lt;/span&gt;&lt;span class="p"&gt;)))&lt;/span&gt;
    
    &lt;span class="n"&gt;render_regions&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;The second check is something that prints out the regions. It produces something like this:&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;111111111
    112333999
    122439999
    124437799
    124666779
    124467799
    122467899
    122555889
    112258899
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;I can compare this to the picture of the board to make sure I got it right. I guess a more advanced solution would be to print emoji squares like 🟥 instead.&lt;/p&gt;
    &lt;p&gt;Neither check is quality code but it's throwaway and it gets the job done so eh.&lt;/p&gt;
    &lt;h3&gt;Update for the Internet&lt;/h3&gt;
    &lt;p&gt;This was sent as a weekly newsletter, which is usually on topics like &lt;a href="https://buttondown.com/hillelwayne/archive/why-do-we-call-it-boilerplate-code" target="_blank"&gt;software history&lt;/a&gt;, &lt;a href="https://buttondown.com/hillelwayne/archive/the-seven-specification-ur-languages/" target="_blank"&gt;formal methods&lt;/a&gt;, &lt;a href="https://buttondown.com/hillelwayne/archive/i-formally-modeled-dreidel-for-no-good-reason/" target="_blank"&gt;unusual technologies&lt;/a&gt;, and the &lt;a href="https://buttondown.com/hillelwayne/archive/be-suspicious-of-success/" target="_blank"&gt;theory of software engineering&lt;/a&gt;. You &lt;a href="https://buttondown.email/hillelwayne/" target="_blank"&gt;can subscribe here&lt;/a&gt;.&lt;/p&gt;
    &lt;div class="footnote"&gt;
    &lt;hr/&gt;
    &lt;ol&gt;
    &lt;li id="fn:SAT"&gt;
    &lt;p&gt;"Boolean &lt;strong&gt;SAT&lt;/strong&gt;isfiability Solver", aka a solver that can find assignments that make complex boolean expressions true. I write a bit more about them &lt;a href="https://www.hillelwayne.com/post/np-hard/" target="_blank"&gt;here&lt;/a&gt;. &lt;a class="footnote-backref" href="#fnref:SAT" title="Jump back to footnote 1 in the text"&gt;↩&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id="fn:SMT"&gt;
    &lt;p&gt;"Satisfiability Modulo Theories" &lt;a class="footnote-backref" href="#fnref:SMT" title="Jump back to footnote 2 in the text"&gt;↩&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;/ol&gt;
    &lt;/div&gt;
    </description><pubDate>Thu, 12 Jun 2025 15:43:25 +0000</pubDate><guid>https://buttondown.com/hillelwayne/archive/solving-linkedin-queens-with-smt/</guid></item><item><title>AI is a gamechanger for TLA+ users</title><link>https://buttondown.com/hillelwayne/archive/ai-is-a-gamechanger-for-tla-users/</link><description>
    &lt;h3&gt;New Logic for Programmers Release&lt;/h3&gt;
    &lt;p&gt;&lt;a href="https://leanpub.com/logic/" target="_blank"&gt;v0.10 is now available&lt;/a&gt;! This is a minor release, mostly focused on logic-based refactoring, with new material on set types and testing refactors are correct. See the full release notes at &lt;a href="https://github.com/logicforprogrammers/book-assets/blob/master/CHANGELOG.md" target="_blank"&gt;the changelog page&lt;/a&gt;. Due to &lt;a href="https://systemsdistributed.com/" target="_blank"&gt;conference pressure&lt;/a&gt; v0.11 will also likely be a minor release. &lt;/p&gt;
    &lt;p&gt;&lt;img alt="The book cover" class="newsletter-image" src="https://assets.buttondown.email/images/29d4ae9d-bcb9-4d8b-99d4-8a35c0990ad5.jpg?w=300&amp;amp;fit=max"/&gt;&lt;/p&gt;
    &lt;h1&gt;AI is a gamechanger for TLA+ users&lt;/h1&gt;
    &lt;p&gt;&lt;a href="https://lamport.azurewebsites.net/tla/tla.html" target="_blank"&gt;TLA+&lt;/a&gt; is a specification language to model and debug distributed systems. While very powerful, it's also hard for programmers to learn, and there's always questions of connecting specifications with actual code. &lt;/p&gt;
    &lt;p&gt;That's why &lt;a href="https://zfhuang99.github.io/github%20copilot/formal%20verification/tla+/2025/05/24/ai-revolution-in-distributed-systems.html" target="_blank"&gt;The Coming AI Revolution in Distributed Systems&lt;/a&gt; caught my interest. In the post, Cheng Huang claims that Azure successfully used LLMs to examine an existing codebase, derive a TLA+ spec, and find a production bug in that spec. "After a decade of manually crafting TLA+ specifications", he wrote, "I must acknowledge that this AI-generated specification rivals human work".&lt;/p&gt;
    &lt;p&gt;This inspired me to experiment with LLMs in TLA+ myself. My goals are a little less ambitious than Cheng's: I wanted to see how LLMs could help junior specifiers write TLA+, rather than handling the entire spec automatically. Details on what did and didn't work below, but my takeaway is that &lt;strong&gt;LLMs are an immense specification force multiplier.&lt;/strong&gt;&lt;/p&gt;
    &lt;p&gt;All tests were done with a standard VSCode Copilot subscription, writing Claude 3.7 in Agent mode. Other LLMs or IDEs may be more or less effective, etc.&lt;/p&gt;
    &lt;h2&gt;Things Claude was good at&lt;/h2&gt;
    &lt;h3&gt;Fixing syntax errors&lt;/h3&gt;
    &lt;p&gt;TLA+ uses a very different syntax than mainstream programming languages, meaning beginners make a lot of mistakes where they do a "programming syntax" instead of TLA+ syntax:&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;NotThree(x) = \* should be ==, not =
        x != 3 \* should be #, not !=
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;The problem is that the TLA+ syntax checker, SANY, is 30 years old and doesn't provide good information. Here's what it says for that snippet:&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;Was expecting "==== or more Module body"
    Encountered "NotThree" at line 6, column 1
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;That only isolates one error and doesn't tell us what the problem is, only where it is. Experienced TLA+ users get "error eyes" and can quickly see what the problem is, but beginners really struggle with this.&lt;/p&gt;
    &lt;p&gt;The TLA+ foundation has made LLM integration a priority, so the VSCode extension &lt;a href="https://github.com/tlaplus/vscode-tlaplus/blob/master/src/main.ts#L174" target="_blank"&gt;naturally supports several agents actions&lt;/a&gt;. One of these is running SANY, meaning an agent can get an error, fix it, get another error, fix it, etc. Provided the above sample and asked to make it work, Claude successfully fixed both errors. It also fixed many errors in a larger spec, as well as figure out why PlusCal specs weren't compiling to TLA+.&lt;/p&gt;
    &lt;p&gt;This by itself is already enough to make LLMs a worthwhile tool, as it fixes one of the biggest barriers to entry.&lt;/p&gt;
    &lt;h3&gt;Understanding error traces&lt;/h3&gt;
    &lt;p&gt;When TLA+ finds a violated property, it outputs the sequence of steps that leads to the error. This starts in plaintext, and VSCode parses it into an interactive table:&lt;/p&gt;
    &lt;p&gt;&lt;img alt="An example error trace" class="newsletter-image" src="https://assets.buttondown.email/images/f7f16d0e-c61f-4286-ae49-67e03f844126.png?w=960&amp;amp;fit=max"/&gt;&lt;/p&gt;
    &lt;p&gt;Learning to read these error traces is a skill in itself. You have to understand what's happening in each step and how it relates back to the actually broken property. It takes a long time for people to learn how to do this well.&lt;/p&gt;
    &lt;p&gt;Claude was successful here, too, accurately reading 20+ step error traces and giving a high-level explanation of what went wrong. It also could condense error traces: if ten steps of the error trace could be condensed into a one-sentence summary (which can happen if you're modeling a lot of process internals) Claude would do it.&lt;/p&gt;
    &lt;p&gt;I did have issues here with doing this in agent mode: while the extension does provide a "run model checker" command, the agent would regularly ignore this and prefer to run a terminal command instead. This would be fine except that the LLM consistently hallucinated invalid commands. I had to amend every prompt with "run the model checker via vscode, do not use a terminal command". You can skip this if you're willing to copy and paste the error trace into the prompt.&lt;/p&gt;
    &lt;p&gt;As with syntax checking, if this was the &lt;em&gt;only&lt;/em&gt; thing LLMs could effectively do, that would already be enough&lt;sup id="fnref:dayenu"&gt;&lt;a class="footnote-ref" href="#fn:dayenu"&gt;1&lt;/a&gt;&lt;/sup&gt; to earn a strong recommend. Even as a TLA+ expert I expect I'll be using this trick regularly. &lt;/p&gt;
    &lt;h3&gt;Boilerplate tasks&lt;/h3&gt;
    &lt;p&gt;TLA+ has a lot of boilerplate. One of the most notorious examples is &lt;code&gt;UNCHANGED&lt;/code&gt; rules. Specifications are extremely precise — so precise that you have to specify what variables &lt;em&gt;don't&lt;/em&gt; change in every step. This takes the form of an &lt;code&gt;UNCHANGED&lt;/code&gt; clause at the end of relevant actions:&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;RemoveObjectFromStore(srv, o, s) ==
      /\ o \in stored[s]
      /\ stored' = [stored EXCEPT ![s] = @ \ {o}]
      /\ UNCHANGED &amp;lt;&amp;lt;capacity, log, objectsize, pc&amp;gt;&amp;gt;
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;Writing this is really annoying. Updating these whenever you change an action, or add a new variable to the spec, is doubly so. Syntax checking and error analysis are important for beginners, but this is what I wanted for &lt;em&gt;myself&lt;/em&gt;. I took a spec and prompted Claude&lt;/p&gt;
    &lt;blockquote&gt;
    &lt;p&gt;Add UNCHANGED &amp;lt;&lt;v1, etc="" v2,=""&gt;&amp;gt; for each variable not changed in an action.&lt;/v1,&gt;&lt;/p&gt;
    &lt;/blockquote&gt;
    &lt;p&gt;And it worked! It successfully updated the &lt;code&gt;UNCHANGED&lt;/code&gt; in every action. &lt;/p&gt;
    &lt;p&gt;(Note, though, that it was a "well-behaved" spec in this regard: only one "action" happened at a time. In TLA+ you can have two actions happen simultaneously, that each update half of the variables, meaning neither of them should have an &lt;code&gt;UNCHANGED&lt;/code&gt; clause. I haven't tested how Claude handles that!)&lt;/p&gt;
    &lt;p&gt;That's the most obvious win, but Claude was good at handling other tedious work, too. Some examples include updating &lt;code&gt;vars&lt;/code&gt; (the conventional collection of all state variables), lifting a hard-coded value into a model parameter, and changing data formats. Most impressive to me, though, was rewriting a spec designed for one process to instead handle multiple processes. This means taking all of the process variables, which originally have types like &lt;code&gt;Int&lt;/code&gt;, converting them to types like &lt;code&gt;[Process -&amp;gt; Int]&lt;/code&gt;, and then updating the uses of all of those variables in the spec. It didn't account for race conditions in the new concurrent behavior, but it was an excellent scaffold to do more work.&lt;/p&gt;
    &lt;h3&gt;Writing properties from an informal description&lt;/h3&gt;
    &lt;p&gt;You have to be pretty precise with your intended property description but it handles converting that precise description into TLA+'s formalized syntax, which is something beginners often struggle with.&lt;/p&gt;
    &lt;h2&gt;Things it is less good at&lt;/h2&gt;
    &lt;h3&gt;Generating model config files&lt;/h3&gt;
    &lt;p&gt;To model check TLA+, you need both a specification (&lt;code&gt;.tla&lt;/code&gt;) and a model config file (&lt;code&gt;.cfg&lt;/code&gt;), which have separate syntaxes. Asking the agent to generate the second often lead to it using TLA+ syntax. It automatically fixed this after getting parsing errors, though. &lt;/p&gt;
    &lt;h3&gt;Fixing specs&lt;/h3&gt;
    &lt;p&gt;Whenever the ran model checking and discovered a bug, it would naturally propose a change to either the invalid property or the spec. Sometimes the changes were good, other times the changes were not physically realizable. For example, if it found that a bug was due to a race condition between processes, it would often suggest fixing it by saying race conditions were okay. I mean yes, if you say bugs are okay, then the spec finds that bugs are okay! Or it would alternatively suggest adding a constraint to the spec saying that race conditions don't happen. &lt;a href="https://www.hillelwayne.com/post/alloy-facts/" target="_blank"&gt;But that's a huge mistake in specification&lt;/a&gt;, because race conditions happen if we don't have coordination. We need to specify the &lt;em&gt;mechanism&lt;/em&gt; that is supposed to prevent them.&lt;/p&gt;
    &lt;h3&gt;Finding properties of the spec&lt;/h3&gt;
    &lt;p&gt;After seeing how capable it was at translating my properties to TLA+, I started prompting Claude to come up with properties on its own. Unfortunately, almost everything I got back was either trivial, uninteresting, or too coupled to implementation details. I haven't tested if it would work better to ask it for "properties that may be violated".&lt;/p&gt;
    &lt;h3&gt;Generating code from specs&lt;/h3&gt;
    &lt;p&gt;I have to be specific here: Claude &lt;em&gt;could&lt;/em&gt; sometimes convert Python into a passable spec, an vice versa. It &lt;em&gt;wasn't&lt;/em&gt; good at recognizing abstraction. For example, TLA+ specifications often represent sequential operations with a state variable, commonly called &lt;code&gt;pc&lt;/code&gt;. If modeling code that nonatomically retrieves a counter value and increments it, we'd have one action that requires &lt;code&gt;pc = "Get"&lt;/code&gt; and sets the new value to &lt;code&gt;"Inc"&lt;/code&gt;, then another that requires it be &lt;code&gt;"Inc"&lt;/code&gt; and sets it to &lt;code&gt;"Done"&lt;/code&gt;.&lt;/p&gt;
    &lt;p&gt;I found that Claude would try to somehow convert &lt;code&gt;pc&lt;/code&gt; into part of the Python program's state, rather than recognize it as a TLA+ abstraction. On the other side, when converting python code to TLA+ it would often try to translate things like &lt;code&gt;sleep&lt;/code&gt; into some part of the spec, not recognizing that it is abstractable into a distinct action. I didn't test other possible misconceptions, like converting randomness to nondeterminism.&lt;/p&gt;
    &lt;p&gt;For the record, when converting TLA+ to Python Claude tended to make simulators of the spec, rather than possible production code implementing the spec. I really wasn't expecting otherwise though.&lt;/p&gt;
    &lt;h2&gt;Unexplored Applications&lt;/h2&gt;
    &lt;p&gt;Things I haven't explored thoroughly but could possibly be effective, based on what I know about TLA+ and AI:&lt;/p&gt;
    &lt;h3&gt;Writing Java Overrides&lt;/h3&gt;
    &lt;p&gt;Most TLA+ operators are resolved via TLA+ interpreters, but you can also implement them in "native" Java. This lets you escape the standard language semantics and add capabilities like &lt;a href="https://github.com/tlaplus/CommunityModules/blob/master/modules/IOUtils.tla" target="_blank"&gt;executing programs during model-checking&lt;/a&gt; or &lt;a href="https://github.com/tlaplus/tlaplus/blob/master/tlatools/org.lamport.tlatools/src/tla2sany/StandardModules/TLC.tla#L62" target="_blank"&gt;dynamically constrain the depth of the searched state space&lt;/a&gt;. There's a lot of cool things I think would be possible with overrides. The problem is there's only a handful of people in the world who know how to write them. But that handful have written quite a few overrides and I think there's enough there for Claude to work with. &lt;/p&gt;
    &lt;h3&gt;Writing specs, given a reference mechanism&lt;/h3&gt;
    &lt;p&gt;In all my experiments, the LLM only had my prompts and the occasional Python script as information. That makes me suspect that some of its problems with writing and fixing specs come down to not having a system model. Maybe it wouldn't suggest fixes like "these processes never race" if it had a design doc saying that the processes can't coordinate. &lt;/p&gt;
    &lt;p&gt;(Could a Sufficiently Powerful LLM derive some TLA+ specification from a design document?)&lt;/p&gt;
    &lt;h3&gt;Connecting specs and code&lt;/h3&gt;
    &lt;p&gt;This is the holy grail of TLA+: taking a codebase and showing it correctly implements a spec. Currently the best ways to do this are by either using TLA+ to generate a test suite, or by taking logged production traces and matching them to TLA+ behaviors. &lt;a href="https://www.mongodb.com/blog/post/engineering/conformance-checking-at-mongodb-testing-our-code-matches-our-tla-specs" target="_blank"&gt;This blog post discusses both&lt;/a&gt;. While I've seen a lot of academic research into these approaches there are no industry-ready tools. So if you want trace validation you have to do a lot of manual labour tailored to your specific product. &lt;/p&gt;
    &lt;p&gt;If LLMs could do some of this work for us then that'd really amplify the usefulness of TLA+ to many companies.&lt;/p&gt;
    &lt;h2&gt;Thoughts&lt;/h2&gt;
    &lt;p&gt;&lt;em&gt;Right now&lt;/em&gt;, agents seem good at the tedious and routine parts of TLA+ and worse at the strategic and abstraction parts. But, since the routine parts are often a huge barrier to beginners, this means that LLMs have the potential to make TLA+ far, far more accessible than it previously was.&lt;/p&gt;
    &lt;p&gt;I have mixed thoughts on this. As an &lt;em&gt;advocate&lt;/em&gt;, this is incredible. I want more people using formal specifications because I believe it leads to cheaper, safer, more reliable software. Anything that gets people comfortable with specs is great for our industry. As a &lt;em&gt;professional TLA+ consultant&lt;/em&gt;, I'm worried that this obsoletes me. Most of my income comes from training and coaching, which companies will have far less demand of now. Then again, maybe this an opportunity to pitch "agentic TLA+ training" to companies!&lt;/p&gt;
    &lt;p&gt;Anyway, if you're interested in TLA+, there has never been a better time to try it. I mean it, these tools handle so much of the hard part now. I've got a &lt;a href="https://learntla.com/" target="_blank"&gt;free book available online&lt;/a&gt;, as does &lt;a href="https://lamport.azurewebsites.net/tla/book.html" target="_blank"&gt;the inventor of TLA+&lt;/a&gt;. I like &lt;a href="https://elliotswart.github.io/pragmaticformalmodeling/" target="_blank"&gt;this guide too&lt;/a&gt;. Happy modeling!&lt;/p&gt;
    &lt;div class="footnote"&gt;
    &lt;hr/&gt;
    &lt;ol&gt;
    &lt;li id="fn:dayenu"&gt;
    &lt;p&gt;Dayenu. &lt;a class="footnote-backref" href="#fnref:dayenu" title="Jump back to footnote 1 in the text"&gt;↩&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;/ol&gt;
    &lt;/div&gt;
    </description><pubDate>Thu, 05 Jun 2025 14:59:11 +0000</pubDate><guid>https://buttondown.com/hillelwayne/archive/ai-is-a-gamechanger-for-tla-users/</guid></item></channel></rss>
    Raw headers
    {
      "cf-cache-status": "DYNAMIC",
      "cf-ray": "9dc5cb406aa55751-CMH",
      "content-security-policy": "default-src 'self'; script-src 'self' 'nonce--_br2I4Q9VJypMBerTU6hQ' 'sha256-fmq7SB5YRtx2c358LZJiPdOUdgbDngasL//1pZSOw7Q=' 'sha256-Zsbh78gbtd2zyd9vW5HmEiB7ZYWa/Und7TqJ7cDC1PM=' 'sha256-8TGjKztrL0R7wxViRmWe0k/Je4l3hMmzBhO8FBQdYf0=' https://static.addtoany.com https://embed.bsky.app https://challenges.cloudflare.com https://static.cloudflareinsights.com https://connect.facebook.net https://cdn.usefathom.com https://embedr.flickr.com https://www.googletagmanager.com https://www.instagram.com https://plausible.io https://cdn.seline.com https://scripts.simpleanalyticscdn.com https://sniperl.ink https://js.stripe.com https://js.stripe.com https://cdn.tailwindcss.com https://www.tiktok.com https://tinylytics.app https://sf16-website-login.neutral.ttwstatic.com https://platform.twitter.com https://cloud.umami.is https://pay.google.com https://static-assets.buttondown.com https://tiktokcdn-us.com; style-src 'self' 'unsafe-inline' https:; img-src 'self' data: https: http: blob:; media-src 'self' data: https: http: blob:; font-src 'self' data: https:; frame-src https: blob:; connect-src 'self' https:; manifest-src 'self'; object-src 'none'; base-uri 'self'; form-action 'self' https://buttondown.com",
      "content-type": "application/rss+xml; charset=utf-8",
      "cross-origin-opener-policy": "same-origin",
      "date": "Sat, 14 Mar 2026 19:48:06 GMT",
      "last-modified": "Tue, 10 Mar 2026 17:12:30 GMT",
      "nel": "{\"report_to\":\"heroku-nel\",\"response_headers\":[\"Via\"],\"max_age\":3600,\"success_fraction\":0.01,\"failure_fraction\":0.1}, {\"report_to\":\"heroku-nel\",\"response_headers\":[\"Via\"],\"max_age\":3600,\"success_fraction\":0.01,\"failure_fraction\":0.1}",
      "referrer-policy": "strict-origin-when-cross-origin",
      "report-to": "{\"group\":\"heroku-nel\",\"endpoints\":[{\"url\":\"https://nel.heroku.com/reports?s=JdZp615%2BnJIEKgvDoNtbqg8SUEHewYt9748btYALSVk%3D\\u0026sid=929419e7-33ea-4e2f-85f0-7d8b7cd5cbd6\\u0026ts=1773517686\"}],\"max_age\":3600}, {\"group\":\"heroku-nel\",\"endpoints\":[{\"url\":\"https://nel.heroku.com/reports?s=ERDuAoUKO%2FtSjnqkNQNmS6H%2BC7YtAIdm0AhwnM9LcyI%3D\\u0026sid=e11707d5-02a7-43ef-b45e-2cf4d2036f7d\\u0026ts=1773517686\"}],\"max_age\":3600}",
      "reporting-endpoints": "heroku-nel=\"https://nel.heroku.com/reports?s=JdZp615%2BnJIEKgvDoNtbqg8SUEHewYt9748btYALSVk%3D&sid=929419e7-33ea-4e2f-85f0-7d8b7cd5cbd6&ts=1773517686\", heroku-nel=\"https://nel.heroku.com/reports?s=ERDuAoUKO%2FtSjnqkNQNmS6H%2BC7YtAIdm0AhwnM9LcyI%3D&sid=e11707d5-02a7-43ef-b45e-2cf4d2036f7d&ts=1773517686\"",
      "server": "cloudflare",
      "set-cookie": "initial_path=\"/hillelwayne/rss\"; expires=Mon, 13 Apr 2026 19:48:06 GMT; Max-Age=2592000; Path=/",
      "transfer-encoding": "chunked",
      "vary": "Cookie, Host, origin, Accept-Encoding",
      "via": "1.1 heroku-router, 2.0 heroku-router",
      "x-content-type-options": "nosniff",
      "x-frame-options": "DENY"
    }
    Parsed with @rowanmanning/feed-parser
    {
      "meta": {
        "type": "rss",
        "version": "2.0"
      },
      "language": "en-us",
      "title": "Computer Things",
      "description": "<!-- buttondown-editor-mode: fancy --><p>Hi, I'm Hillel. This is the newsletter version of <a target=\"_blank\" rel=\"noopener noreferrer nofollow\" href=\"https://www.hillelwayne.com\">my website</a>. I post all website updates here. I also post weekly content just for the newsletter, on topics like</p><ul><li><p>Formal Methods</p></li><li><p>Software History and Culture</p></li><li><p>Fringetech and exotic tooling</p></li><li><p>The philosophy and theory of software engineering</p></li></ul><p>You can see the archive of all public essays <a target=\"_blank\" rel=\"noopener noreferrer nofollow\" href=\"https://buttondown.email/hillelwayne/archive/\">here</a>.</p>",
      "copyright": null,
      "url": "https://buttondown.com/hillelwayne",
      "self": "https://buttondown.email/hillelwayne/rss",
      "published": null,
      "updated": "2026-03-10T17:12:30.000Z",
      "generator": null,
      "image": null,
      "authors": [],
      "categories": [],
      "items": [
        {
          "id": "https://buttondown.com/hillelwayne/archive/llms-are-bad-at-vibing-specifications/",
          "title": "LLMs are bad at vibing specifications",
          "description": "<h3>No newsletter next week</h3>\n<p>I'll be speaking at <a href=\"https://qconlondon.com/\" target=\"_blank\">InfoQ London</a>. But see below for a book giveaway!</p>\n<hr />\n<h1>LLMs are bad at vibing specifications</h1>\n<p>About a year ago I wrote <a href=\"https://buttondown.com/hillelwayne/archive/ai-is-a-gamechanger-for-tla-users/\" target=\"_blank\">AI is a gamechanger for TLA+ users</a>, which argued that AI are a \"specification force multiplier\". That was written from the perspective an TLA+ expert using these tools. A full <a href=\"https://github.com/search?q=path%3A*.tla+NOT+is%3Afork+claude&type=code\" target=\"_blank\">4% of Github TLA+ specs</a> now have the word \"Claude\" somewhere in them. This is interesting to me, because it suggests there was always an interest in formal methods, people just lacked the skills to do it.  </p>\n<p>It's also interesting because it gives me a sense of what happens when beginners use AI to write formal specs. It's not good.</p>\n<p>As a case study, we'll use <a href=\"https://github.com/myProjectsRavi/sentinel-protocol/tree/main/docs/formal/specs\" target=\"_blank\">this project</a>, which is kind of enough to have vibed out TLA+ and Alloy specs.</p>\n<h3>Looking at a project</h3>\n<p><a href=\"https://github.com/myProjectsRavi/sentinel-protocol/blob/main/docs/formal/specs/threat-intel-mesh.als\" target=\"_blank\">Starting with the Alloy spec</a>. Here it is in its entirety:</p>\n<div class=\"codehilite\"><pre><span></span><code>module ThreatIntelMesh\n\nsig Node {}\n\none sig LocalNode extends Node {}\n\nsig Snapshot {\n  owner: one Node,\n  signed: one Bool,\n  signatures: set Signature\n}\n\nsig Signature {}\n\nsig Policy {\n  allowUnsignedImport: one Bool\n}\n\npred canImport[p: Policy, s: Snapshot] {\n  (p.allowUnsignedImport = True) or (s.signed = True)\n}\n\nassert UnsignedImportMustBeDenied {\n  all p: Policy, s: Snapshot |\n    p.allowUnsignedImport = False and s.signed = False implies not canImport[p, s]\n}\n\nassert SignedImportMayBeAccepted {\n  all p: Policy, s: Snapshot |\n    s.signed = True implies canImport[p, s]\n}\n\ncheck UnsignedImportMustBeDenied for 5\ncheck SignedImportMayBeAccepted for 5\n</code></pre></div>\n\n<p class=\"empty-line\" style=\"height:16px; margin:0px !important;\"></p>\n<p>Couple of things to note here: first of all, this doesn't actually compile. It's using the <a href=\"https://alloy.readthedocs.io/en/latest/modules/boolean.html\" target=\"_blank\">Boolean</a> standard module so needs <code>open util/boolean</code> to function. Second, Boolean is the wrong approach here; you're supposed to use subtyping. </p>\n<div class=\"codehilite\"><pre><span></span><code>sig Snapshot {\n<span class=\"w\"> </span> owner: one Node,\n<span class=\"gd\">- signed: one Bool,</span>\n<span class=\"w\"> </span> signatures: set Signature\n}\n\n<span class=\"gi\">+ sig SignedSnapshot in Snapshot {}</span>\n\n\npred canImport[p: Policy, s: Snapshot] {\n<span class=\"gd\">- s.signed = True</span>\n<span class=\"gi\">+ s in SignedSnapshot</span>\n}\n</code></pre></div>\n\n<p>So we know the person did not actually run these specs. This is <em>somewhat</em> less of a problem in TLA+, which has an official MCP server that lets the agent run model checking. Even so, I regularly see specs that I'm pretty sure won't model check, with things like using <code>Reals</code> or assuming <code>NULL</code> is a built-in and not a user-defined constant.</p>\n<p>The bigger problem with the spec is that <code>UnsignedImportMustBeDenied</code> and <code>SignedImportMayBeAccepted</code> <em>don't actually do anything</em>. <code>canImport</code> is defined as <code>P || Q</code>. <code>UnsignedImportMustBeDenied</code> checks that <code>!P && !Q => !canImport</code>. <code>SignedImportMayBeAccepted</code> checks that <code>P => canImport</code>. These are tautologically true! If they do anything at all, it is only checking that <code>canImport</code> was defined correctly. </p>\n<p>You see the same thing in the <a href=\"https://github.com/myProjectsRavi/sentinel-protocol/blob/main/docs/formal/specs/serialization-firewall.tla\" target=\"_blank\">TLA+ specs</a>, too:</p>\n<div class=\"codehilite\"><pre><span></span><code>GadgetPayload ==\n  /\\ gadgetDetected' = TRUE\n  /\\ depth' \\in 0..(MaxDepth + 5)\n  /\\ UNCHANGED allowlistedFormat\n  /\\ decision' = \"block\"\n\nNoExploitAllowed == gadgetDetected => decision = \"block\"\n</code></pre></div>\n\n<p class=\"empty-line\" style=\"height:16px; margin:0px !important;\"></p>\n<p>The AI is only writing \"obvious properties\", which fail for reasons like \"we missed a guard clause\" or \"we forgot to update a variable\". It does not seem to be good at writing \"subtle\" properties that fail due to concurrency, nondeterminism, or bad behavior separated by several steps. Obvious properties are useful for orienting yourself and ensuring the system behaves like you expect, but the actual value in using formal methods comes from the subtle properties. </p>\n<p>(This ties into <a href=\"https://buttondown.com/hillelwayne/archive/some-tests-are-stronger-than-others/\" target=\"_blank\">Strong and Weak Properties</a>. LLM properties are weak, intended properties need to be strong.)</p>\n<p>This is a problem I see in almost every FM spec written by AI. LLMs aren't doing one of the core features of a spec. Articles like <a href=\"https://martin.kleppmann.com/2025/12/08/ai-formal-verification.html\" target=\"_blank\">Prediction: AI will make formal verification go mainstream</a> and <a href=\"https://leodemoura.github.io/blog/2026/02/28/when-ai-writes-the-worlds-software.html\" target=\"_blank\">When AI Writes the World's Software, Who Verifies It?</a> argue that LLMs will make formal methods go mainstream, but being easily able to write specifications doesn't help with correctness if the specs don't actually verify anything.</p>\n<h3>Is this a user error?</h3>\n<div class=\"subscribe-form\"></div>\n<p>I first got interested in LLMs and TLA+ from <a href=\"https://zfhuang99.github.io/github%20copilot/formal%20verification/tla+/2025/05/24/ai-revolution-in-distributed-systems.html\" target=\"_blank\">The Coming AI Revolution in Distributed Systems</a>. The author of that later <a href=\"https://github.com/zfhuang99/lamport-agent/blob/main/spec/CRAQ/CRAQ.tla\" target=\"_blank\">vibecoded a spec</a> with a considerably more complex property:</p>\n<div class=\"codehilite\"><pre><span></span><code>NoStaleStrictRead ==\n  \\A i \\in 1..Len(eventLog) :\n    LET ev == eventLog[i] IN\n      ev.type = \"read\" =>\n        LET c == ev.chunk IN\n        LET v == ev.version IN\n        /\\ \\A j \\in 1..i :\n             LET evC == eventLog[j] IN\n               evC.type = \"commit\" /\\ evC.chunk = c => evC.version <= v\n</code></pre></div>\n\n<p>This is a lot more complicated than the <code>(P => Q && P) => Q</code> properties I've seen! It could be because <a href=\"https://github.com/deepseek-ai/3FS/tree/main/specs/DataStorage\" target=\"_blank\">the corresponding system already had a complete spec written in P</a>. But it could also be that Cheng Huang is already an expert specifier, meaning he can get more out of an LLM than an ordinary developer can. I've also noticed that I can usually coax an LLM to do more interesting things than most of my clients can. Which is good for my current livelihood, but bad for the hope of LLMs making formal methods mainstream. If you need to know formal methods to get the LLM to do formal methods, is that really helping?</p>\n<p>(Yes, if it lowers the skill threshold-- means you can apply FM with 20 hours of practice instead of 80. But the jury's still out on how <em>much</em> it lowers the threshold. What if it only lowers it from 80 to 75?) </p>\n<p>On the other hand, there also seem to be some properties that AI struggles with, even with explicit instructions. Last week a client and I tried to get Claude to generate a good <a href=\"https://www.hillelwayne.com/post/safety-and-liveness/\" target=\"_blank\">liveness</a> or <a href=\"https://www.hillelwayne.com/post/action-properties/\" target=\"_blank\">action</a> property instead of a standard obvious invariant, and it just couldn't. Training data issue? Something in the innate complexity of liveness? It's not clear yet. These properties are even more \"subtle\" than most invariants, so maybe that's it.</p>\n<p>On the other other hand, this is all as of March 2026. Maybe this whole article will be laughably obsolete by June. </p>\n<hr />\n<h3><a href=\"https://logicforprogrammers.com\" target=\"_blank\">Logic for Programmers</a> Giveaway</h3>\n<p>Last week's giveaway raised a few issues. First, the New World copies were all taken before all of the emails went out, so a lot of people did not even get a chance to try for a book. Second, due to a Leanpub bug the Europe coupon scheduled for 10 AM UTC actually activated at 10 AM my time, which was early evening for Europe. Third, everybody in the APAC region got left out.</p>\n<p>So, since I'm not doing a newsletter next week, let's have another giveaway:</p>\n<ul>\n<li><a href=\"https://leanpub.com/logic/c/E5A55F7B482C3\" target=\"_blank\">This coupon</a> will go up 2026-03-16 at 11:00 UTC, which should be noon Central European Time, and be good for ten books (five for this giveaway, five to account for last week's bug).</li>\n<li><a href=\"https://leanpub.com/logic/c/ADC664C95B6D1\" target=\"_blank\">This coupon</a> will go up 2026-03-17 at 04:00 UTC, which should be noon Beijing Time, and be good for five books.</li>\n<li><a href=\"https://leanpub.com/logic/c/U1250212A9070\" target=\"_blank\">This coupon</a> will go up 2026-03-17 at 17:00 UTC, which should be noon Central US Time, and also be good for five books.</li>\n</ul>\n<p>I think that gives the best chance of everybody getting at least a chance of a book, while being resilient to timezone shenanigans due to travel / Leanpub dropping bugfixes / daylight savings / whatever. </p>\n<p>(No guarantees that later \"no newsletter\" weeks will have giveaways! This is a gimmick)</p>",
          "url": "https://buttondown.com/hillelwayne/archive/llms-are-bad-at-vibing-specifications/",
          "published": "2026-03-10T17:12:30.000Z",
          "updated": "2026-03-10T17:12:30.000Z",
          "content": null,
          "image": null,
          "media": [],
          "authors": [],
          "categories": []
        },
        {
          "id": "https://buttondown.com/hillelwayne/archive/free-books/",
          "title": "Free Books",
          "description": "<p>Spinning a <a href=\"https://www.youtube.com/watch?v=NB4hzg4k7_A\" target=\"_blank\">lot of plates</a> this week so skipping the newsletter. As an apology, have ten free copies of <em>Logic for Programmers</em>.</p>\n<ul>\n<li><a href=\"https://leanpub.com/logic/c/EBDFA51B15C1\" target=\"_blank\">These five</a> are available now.</li>\n<li><del><a href=\"https://leanpub.com/logic/c/5A55F7B482C3\" target=\"_blank\">These five</a> <em>should</em> be available at 10:30 AM CEST tomorrow, so people in Europe have a better chance of nabbing one.</del> Nevermind Leanpub had a bug that made this not work properly</li>\n</ul>",
          "url": "https://buttondown.com/hillelwayne/archive/free-books/",
          "published": "2026-03-03T16:34:33.000Z",
          "updated": "2026-03-03T16:34:33.000Z",
          "content": null,
          "image": null,
          "media": [],
          "authors": [],
          "categories": []
        },
        {
          "id": "https://buttondown.com/hillelwayne/archive/new-blog-post-some-silly-z3-scripts-i-wrote/",
          "title": "New Blog Post: Some Silly Z3 Scripts I Wrote",
          "description": "<p>Now that I'm not spending all my time on Logic for Programmers, I have time to update my website again! So here's the first blog post in five months: <a href=\"https://www.hillelwayne.com/post/z3-examples/\" target=\"_blank\">Some Silly Z3 Scripts I Wrote</a>.</p>\n<p>Normally I'd also put a link to the Patreon notes but I've decided I don't like publishing gated content and am going to wind that whole thing down. So some quick notes about this post:</p>\n<ul>\n<li>Part of the point is admittedly to hype up the eventual release of LfP. I want to start marketing the book, but don't want the marketing material to be devoid of interest, so tangentially-related-but-independent blog posts are a good place to start.</li>\n<li>The post discusses the concept of \"chaff\", the enormous quantity of material (both code samples and prose) that didn't make it into the book. The book is about 50,000 words… and considerably shorter than the total volume of chaff! I don't <em>think</em> most of it can be turned into useful public posts, but I'm not entirely opposed to the idea. Maybe some of the old chapters could be made into something?</li>\n<li>Coming up with a conditioned mathematical property to prove was a struggle. I had two candidates: <code>a == b * c => a / b == c</code>, which would have required a long tangent on how division must be total in Z3, and  <code>a != 0 => some b: b * a == 1</code>, which would have required introducing a quantifier (SMT is real weird about quantifiers). Division by zero has already caused me enough grief so I went with the latter. This did mean I had to reintroduce \"operations must be total\" when talking about arrays.</li>\n<li>I have no idea why the array example returns <code>2</code> for the max profit and not <code>99999999</code>. I'm guessing there's some short circuiting logic in the optimizer when the problem is ill-defined?</li>\n<li>One example I could not get working, which is unfortunate, was a demonstration of how SMT solvers are undecidable via encoding Goldbach's conjecture as an SMT problem. Anything with multiple nested quantifiers is a pain.</li>\n</ul>",
          "url": "https://buttondown.com/hillelwayne/archive/new-blog-post-some-silly-z3-scripts-i-wrote/",
          "published": "2026-02-23T16:49:10.000Z",
          "updated": "2026-02-23T16:49:10.000Z",
          "content": null,
          "image": null,
          "media": [],
          "authors": [],
          "categories": []
        },
        {
          "id": "https://buttondown.com/hillelwayne/archive/stream-of-consciousness-driven-development/",
          "title": "Stream of Consciousness Driven Development",
          "description": "<p>This is something I just tried out last week but it seems to have enough potential to be worth showing unpolished. I was pairing with a client on writing a spec. I saw a problem with the spec, a convoluted way of fixing the spec. Instead of trying to verbally explain it, I started by creating a new markdown file:</p>\n<div class=\"codehilite\"><pre><span></span><code>NameOfProblem.md\n</code></pre></div>\n\n<p>Then I started typing. First the problem summary, then a detailed description, then the solution and why it worked. When my partner asked questions, I incorporated his question and our discussion of it into the flow. If we hit a dead end with the solution, we marked it out as a dead end. Eventually the file looked something like this:</p>\n<div class=\"codehilite\"><pre><span></span><code>Current state of spec\nProblems caused by this\n    Elaboration of problems\n    What we tried that didn't work\nProposed Solution\n    Theory behind proposed solution\n    How the solution works\n    Expected changes\n    Other problems this helps solve\n    Problems this does *not* help with\n</code></pre></div>\n\n<p>Only once this was done, my partner fully understood the chain of thought, <em>and</em> we agreed it represented the right approach, did we start making changes to the spec. </p>\n<h3>How is this better than just making the change?</h3>\n<p>The change was <em>conceptually</em> complex. A rough analogy: imagine pairing with a beginner who wrote an insertion sort, and you want to replace it with quicksort. You need to explain why the insertion sort is too slow, why the quicksort isn't slow, and how quicksort actually correctly sorts a list. This could involve tangents into computational complexity, big-o notation, recursion, etc. These are all concepts you have internalized, so the change is simple to you, but the solution uses concepts the beginner does not know. So it's conceptually complex to them.</p>\n<p>I wasn't pairing with a beginning programmer or even a beginning specifier. This was a client who could confidently write complex specs on their own. But they don't work on specifications full time like I do. Any time there's a relative gap in experience in a pair, there's solutions that are conceptually simple to one person and complex to the other.</p>\n<p>I've noticed too often that when one person doesn't fully understand the concepts behind a change, they just go \"you're the expert, I trust you.\" That eventually leads to a totally unmaintainable spec. Hence, writing it all out. </p>\n<p>As I said before, I've only tried this once (though I've successfully used a similar idea when teaching workshops). It worked pretty well, though! Just be prepared for a lot of typing.</p>",
          "url": "https://buttondown.com/hillelwayne/archive/stream-of-consciousness-driven-development/",
          "published": "2026-02-18T16:33:08.000Z",
          "updated": "2026-02-18T16:33:08.000Z",
          "content": null,
          "image": null,
          "media": [],
          "authors": [],
          "categories": []
        },
        {
          "id": "https://buttondown.com/hillelwayne/archive/proving-whats-possible/",
          "title": "Proving What's Possible",
          "description": "<p>As a formal methods consultant I have to mathematically express properties of systems. I generally do this with two \"temporal operators\": </p>\n<ul>\n<li>A(x) means that <code>x</code> is always true. For example, a database table <em>always</em> satisfies all record-level constraints, and a state machine <em>always</em> makes valid transitions between states. If <code>x</code> is a statement about an individual state (as in the database but not state machine example), we further call it an <strong>invariant</strong>.</li>\n<li>E(x) means that <code>x</code> is \"eventually\" true, conventionally meaning \"guaranteed true at some point in the future\". A database transaction <em>eventually</em> completes or rolls back, a state machine <em>eventually</em> reaches the \"done\" state, etc. </li>\n</ul>\n<p>These come from linear temporal logic, which is the mainstream notation for expressing system properties. <sup id=\"fnref:modal\"><a class=\"footnote-ref\" href=\"#fn:modal\">1</a></sup> We like these operators because they elegantly cover <a href=\"https://www.hillelwayne.com/post/safety-and-liveness/\" target=\"_blank\">safety and liveness properties</a>, and because <a href=\"https://buttondown.com/hillelwayne/archive/formalizing-stability-and-resilience-properties/\" target=\"_blank\">we can combine them</a>. <code>A(E(x))</code> means <code>x</code> is true an infinite number of times, while <code>A(x => E(y)</code> means that <code>x</code> being true guarantees <code>y</code> true in the future. </p>\n<p>There's a third class of properties, that I will call <em>possibility</em> properties: <code>P(x)</code> is \"can x happen in this model\"? Is it possible for a table to have more than ten records? Can a state machine transition from \"Done\" to \"Retry\", even if it <em>doesn't</em>? Importantly, <code>P(x)</code> does not need to be possible <em>immediately</em>, just at some point in the future. It's possible to lose 100 dollars betting on slot machines, even if you only bet one dollar at a time. If <code>x</code> is a statement about an individual state, we can further call it a <a href=\"https://en.wikipedia.org/wiki/Reachability\" target=\"_blank\"><em>reachability</em> property</a>. I'm going to use the two interchangeably for flow. </p>\n<p><code>A(P(x))</code> says that <code>x</code> is <em>always</em> possible. No matter what we've done in our system, we can make <code>x</code> happen again. There's no way to do this with just <code>A</code> and <code>E</code>. Other meaningful combinations include:</p>\n<ul>\n<li><code>P(A(x))</code>: there is a reachable state from which <code>x</code> is always true.</li>\n<li><code>A(x => P(y))</code>: <code>y</code> is possible from any state where <code>x</code> is true.</li>\n<li><code>E(x && P(y))</code>: There is always a future state where x is true and y is reachable.</li>\n<li><code>A(P(x) => E(x))</code>: If <code>x</code> is ever possible, it will eventually happen.</li>\n<li><code>E(P(x))</code> and <code>P(E(x))</code> are the same as <code>P(x)</code>.</li>\n</ul>\n<p>See the paper <a href=\"https://dl.acm.org/doi/epdf/10.1145/567446.567463\" target=\"_blank\">\"Sometime\" is sometimes \"not never\"</a> for a deeper discussion of <code>E</code> and <code>P</code>.</p>\n<h3>The use case</h3>\n<p>Possibility properties are \"something good <em>can</em> happen\", which is generally less useful (<em>in specifications</em>) than \"something bad <em>can't</em> happen\" (safety) and \"something good <em>will</em> happen\" (liveness). But it still comes up as an important property! My favorite example:</p>\n<p><img alt=\"A guy who can't shut down his computer because system preferences interrupts shutdown\" class=\"newsletter-image\" src=\"https://www.hillelwayne.com/post/safety-and-liveness/img/tweet2.png\" /></p>\n<p>The big use I've found for the idea is as a sense-check that we wrote the spec properly. Say I take the property \"A worker in the 'Retry' state eventually leaves that state\":</p>\n<div class=\"codehilite\"><pre><span></span><code>A(state == 'Retry' => E(state != 'Retry'))\n</code></pre></div>\n\n<p>The model checker checks this property and confirms it holds of the spec. Great! Our system is correct! ...Unless the system can never <em>reach</em> the \"Retry\" state, in which case the expression is trivially true. I need to verify that 'Retry' is reachable, eg <code>P(state == 'Retry')</code>. Notice I can't use <code>E</code> to do this, because I don't want to say \"the worker always needs to retry at least once\". </p>\n<h3>It's not supported though</h3>\n<p>I say \"use I've found for <em>the idea</em>\" because the main formalisms I use (Alloy and TLA+) don't natively support <code>P</code>. <sup id=\"fnref:tla\"><a class=\"footnote-ref\" href=\"#fn:tla\">2</a></sup> On top of <code>P</code> being less useful than <code>A</code> and <code>E</code>, simple reachability properties are <a href=\"https://www.hillelwayne.com/post/software-mimicry/\" target=\"_blank\">mimickable</a> with A(x). <code>P(x)</code> <em>passes</em> whenever <code>A(!x)</code> <em>fails</em>, meaning I can verify <code>P(state == 'Retry')</code> by testing that <code>A(!(state == 'Retry'))</code> finds a counterexample. We <em>cannot</em> mimic combined operators this way like <code>A(P(x))</code> but those are significantly less common than state-reachability. </p>\n<p>(Also, refinement doesn't preserve possibility properties, but that's a whole other kettle of worms.)</p>\n<p>The one that's bitten me a little is that we can't mimic \"<code>P(x)</code> from every starting state\". \"<code>A(!x)</code>\" fails if there's at least one path from one starting state that leads to <code>x</code>, but other starting states might not make <code>x</code> possible.</p>\n<p>I suspect there's also a chicken-and-egg problem here. Since my tools can't verify possibility properties, I'm not used to noticing them in systems. I'd be interested in hearing if anybody works with codebases where possibility properties are important, especially if it's something complex like <code>A(x => P(y))</code>.</p>\n<div class=\"footnote\">\n<hr />\n<ol>\n<li id=\"fn:modal\">\n<p>Instead of <code>A(x)</code>, the literature uses <code>[]x</code> or <code>Gx</code> (\"globally x\") and instead of <code>E(x)</code> it uses <code><>x</code> or <code>Fx</code> (\"finally x\"). I'm using A and E because this isn't teaching material. <a class=\"footnote-backref\" href=\"#fnref:modal\" title=\"Jump back to footnote 1 in the text\">↩</a></p>\n</li>\n<li id=\"fn:tla\">\n<p>There's <a href=\"https://github.com/tlaplus/tlaplus/issues/860\" target=\"_blank\">some discussion to add it to TLA+, though</a>. <a class=\"footnote-backref\" href=\"#fnref:tla\" title=\"Jump back to footnote 2 in the text\">↩</a></p>\n</li>\n</ol>\n</div>",
          "url": "https://buttondown.com/hillelwayne/archive/proving-whats-possible/",
          "published": "2026-02-11T18:36:53.000Z",
          "updated": "2026-02-11T18:36:53.000Z",
          "content": null,
          "image": null,
          "media": [],
          "authors": [],
          "categories": []
        },
        {
          "id": "https://buttondown.com/hillelwayne/archive/logic-for-programmers-new-release-and-next-steps/",
          "title": "Logic for Programmers New Release and Next Steps",
          "description": "<p><img alt=\"cover.jpg\" class=\"newsletter-image\" src=\"https://assets.buttondown.email/images/f821145f-d310-403c-88f4-327758a66606.jpg?w=480&fit=max\" /></p>\n<p>It's taken four months, but the next release of <a href=\"https://logicforprogrammers.com\" target=\"_blank\">Logic for Programmers is now available</a>! v0.13 is over 50,000 words, making it both 20% larger than v0.12 and officially the longest thing I have ever written.<sup id=\"fnref:longest\"><a class=\"footnote-ref\" href=\"#fn:longest\">1</a></sup> Full release notes are <a href=\"https://github.com/logicforprogrammers/book-assets/blob/master/CHANGELOG.md\" target=\"_blank\">here</a>, but I'll talk a bit about the biggest changes. </p>\n<p>For one, every chapter has been rewritten. Every single one. They span from <em>relatively</em> minor changes to complete chapter rewrites. After some rough git diffing, I think I deleted about 11,000 words?<sup id=\"fnref:gross-additions\"><a class=\"footnote-ref\" href=\"#fn:gross-additions\">2</a></sup> The biggest change is probably to the Alloy chapter. After many sleepless nights, I realized the right approach wasn't to teach Alloy as a <em>data modeling</em> tool but to teach it as a <em>domain modeling</em> tool. Which technically means the book no longer covers data modeling.</p>\n<p>There's also a lot more connections between the chapters. The introductory math chapter, for example, foreshadows how each bit of math will be used in the future techniques. I also put more emphasis on the general \"themes\" like the expressiveness-guarantees tradeoff (working title). One theme I'm really excited about is compatibility (extremely working title). It turns out that the <a href=\"https://buttondown.com/hillelwayne/archive/the-liskov-substitution-principle-does-more-than/\" target=\"_blank\">Liskov substitution principle</a>/subtyping in general, <a href=\"https://buttondown.com/hillelwayne/archive/refinement-without-specification/\" target=\"_blank\">database migrations</a>, backwards-compatible API changes, and <a href=\"https://hillelwayne.com/post/refinement/\" target=\"_blank\">specification refinement</a> all follow <em>basically</em> the same general principles. I'm calling this \"compatibility\" for now but prolly need a better name.</p>\n<p>Finally, there's just a lot more new topics in the various chapters. <code>Testing</code> properly covers structural and metamorphic properties. <code>Proofs</code> covers proof by induction and proving recursive functions (in an exercise). <code>Logic Programming</code> now finally has a section on answer set programming. You get the picture.</p>\n<h3>Next Steps</h3>\n<p>There's a lot I still want to add to the book: proper data modeling, data structures, type theory, model-based testing, etc. But I've added new material for two year, and if I keep going it will never get done. So with this release, all the content is in!</p>\n<p>Just like all the content was in <a href=\"https://buttondown.com/hillelwayne/archive/five-unusual-raku-features/\" target=\"_blank\">two Novembers ago</a> and <a href=\"https://buttondown.com/hillelwayne/archive/logic-for-programmers-project-update/\" target=\"_blank\">two Januaries ago</a> and <a href=\"https://buttondown.com/hillelwayne/archive/logic-for-programmers-turns-one/\" target=\"_blank\">last July</a>. To make it absolutely 100% for sure that I won't be tempted to add anything else, I passed the whole manuscript over to a copy editor. So if I write more, it won't get edits. That's a pretty good incentive to stop.</p>\n<p>I also need to find a technical reviewer and proofreader. Once all three phases are done then it's \"just\" a matter of fixing the layout and finding a good printer. I don't know what the timeline looks like but I really want to have something I can hold in my hands before the summer.</p>\n<p>(I also need to get notable-people testimonials. Hampered a little in this because I'm trying real hard not to quid-pro-quo, so I'd like to avoid anybody who helped me or is mentioned in the book. And given I tapped most of my network to help me... I've got some ideas though!)</p>\n<p>There's still a lot of work ahead. Even so, for the first time in two years I don't have research to do or sections to write and it feels so crazy. Maybe I'll update my blog again! Maybe I'll run a workshop! Maybe I'll go outside if Chicago ever gets above 6°F! </p>\n<hr />\n<h2>Conference Season</h2>\n<p>After a pretty slow 2025, the 2026 conference season is looking to be pretty busy! Here's where I'm speaking so far:</p>\n<ul>\n<li><a href=\"https://qconlondon.com/\" target=\"_blank\">QCon London</a>, March 16-19</li>\n<li><a href=\"https://craft-conf.com/2026\" target=\"_blank\">Craft Conference</a>, Budapest, June 4-5</li>\n<li><a href=\"https://softwareshould.work/\" target=\"_blank\">Software Should Work</a>, Missouri, July 16-17</li>\n<li><a href=\"https://hfpug.org/\" target=\"_blank\">Houston Functional Programmers</a>, Virtual, December 3</li>\n</ul>\n<p>For the first three I'm giving variations of my talk \"How to find bugs in systems that don't exist\", which I gave last year at <a href=\"https://systemsdistributed.com/\" target=\"_blank\">Systems Distributed</a>. Last one will ideally be a talk based on LfP. </p>\n<div class=\"footnote\">\n<hr />\n<ol>\n<li id=\"fn:longest\">\n<p>The second longest was my 2003 NaNoWriMo. The third longest was <em>Practical TLA+</em>. <a class=\"footnote-backref\" href=\"#fnref:longest\" title=\"Jump back to footnote 1 in the text\">↩</a></p>\n</li>\n<li id=\"fn:gross-additions\">\n<p>This means I must have written 20,000 words total. For comparison, the v0.1 release was 19,000 words. <a class=\"footnote-backref\" href=\"#fnref:gross-additions\" title=\"Jump back to footnote 2 in the text\">↩</a></p>\n</li>\n</ol>\n</div>",
          "url": "https://buttondown.com/hillelwayne/archive/logic-for-programmers-new-release-and-next-steps/",
          "published": "2026-02-04T14:00:00.000Z",
          "updated": "2026-02-04T14:00:00.000Z",
          "content": null,
          "image": null,
          "media": [],
          "authors": [],
          "categories": []
        },
        {
          "id": "https://buttondown.com/hillelwayne/archive/refinement-without-specification/",
          "title": "Refinement without Specification",
          "description": "<p>Imagine we have a SQL database with a <code>user</code> table, and users have a non-nullable <code>is_activated</code> boolean column. Having read <a href=\"https://ntietz.com/blog/that-boolean-should-probably-be-something-else/\" target=\"_blank\">That Boolean Should Probably Be Something else</a>, you decide to migrate it to a nullable <code>activated_at</code> column. You can change any of the SQL queries that read/update the <code>user</code> table but not any of the code that uses the results of these queries. Can we make this change in a way that preserves all external properties? </p>\n<p>Yes. If an update would set <code>is_activated</code> to true, instead set it to the current date. Now define the <strong>refinement mapping</strong> that takes a <code>new_user</code> and returns an <code>old_user</code>. All columns will be unchanged <em>except</em> <code>is_activated</code>, which will be</p>\n<div class=\"codehilite\"><pre><span></span><code>f(new_user).is_activated = \n    if new_user.activated_at == NULL \n    then FALSE\n    else TRUE\n</code></pre></div>\n\n<p>Now new code can use <code>new_user</code> directly while legacy code can use <code>f(new_user)</code> instead, which will behave indistinguishably from the <code>old_user</code>. </p>\n<p>A little more time passes and you decide to switch to an <a href=\"https://martinfowler.com/eaaDev/EventSourcing.html\" target=\"_blank\">event sourcing</a>-like model. So instead of an <code>activated_at</code> column, you have a <code>user_events</code> table, where every record is <code>(user_id, timestamp, event)</code>. So adding an <code>activate</code> event will activate the user, adding a <code>deactivate</code> event will deactivate the user. Once again, we can update the queries but not any of the code that uses the results of these queries. Can we make a change that preserves all external properties?</p>\n<p>Yes. If an update would change <code>is_activated</code>, instead have it add an appropriate record to the event table. Now, define the refinement mapping that takes <code>newer_user</code> and returns <code>new_user</code>. The <code>activated_at</code> field will be computed like this:</p>\n<div class=\"codehilite\"><pre><span></span><code>g(newer_user).activated_at =\n        # last_activated_event\n    let lae = \n            newer_user.events\n                      .filter(event = \"activate\" | \"deactivate\")\n                      .last,\n    in\n        if lae.event == \"activate\" \n        then lae.timestamp\n        else NULL\n</code></pre></div>\n\n<p class=\"empty-line\" style=\"height:16px; margin:0px !important;\"></p>\n<p>Now new code can use <code>newer_user</code> directly while old code can use <code>g(newer_user)</code> and the really old code can use <code>f(g(newer_user))</code>.</p>\n<h3>Mutability constraints</h3>\n<div class=\"subscribe-form\"></div>\n<p>I said \"these preserve all external properties\" and that was a lie. It depends on the properties we explicitly have, and I didn't list any. The real interesting properties for me are mutability constraints on how the system can evolve. So let's go back in time and add a constraint to <code>user</code>:</p>\n<div class=\"codehilite\"><pre><span></span><code>C1(u) = u.is_activated => u.is_activated'\n</code></pre></div>\n\n<p>This constraint means that if a user is activated, any change will preserve its activated-ness. This means a user can go from deactivated to activated but not the other way. It's not a particular good constraint but it's good enough for teaching purposes. Such a SQL constraint can be enforced with <a href=\"https://www.postgresql.org/docs/current/sql-createeventtrigger.html\" target=\"_blank\">triggers</a>. </p>\n<p>Now we can throw a constraint on <code>new_user</code>:</p>\n<div class=\"codehilite\"><pre><span></span><code>C2(nu) = nu.activated_at != NULL => nu.activated_at' != NULL\n</code></pre></div>\n\n<p>If <code>nu</code> satisfies <code>C2</code>, then <code>f(nu)</code> satisfies <code>C1</code>. So the refinement still holds.</p>\n<p>With <code>newer_u</code>, we <em>cannot</em> guarantee that <code>g(newer_u)</code> satisfies <code>C2</code> because we can go from \"activated\" to \"deactivated\" just by appending a new event. So it's not a refinement. This is fixable by removing deactivation events, that would work too.</p>\n<p>So a more interesting case is <code>bad_user</code>, a refinement of <code>user</code> that has both <code>activated_at</code> and <code>activated_until</code>. We propose the refinement mapping <code>b</code>:</p>\n<div class=\"codehilite\"><pre><span></span><code>b(bad_user).activated =\n    if bad_user.activated_at == NULL && activated_until == NULL\n    then FALSE\n    else bad_user.activated_at <= now() < bad_user.activated_until\n</code></pre></div>\n\n<p>But now if enough time passes, <code>b(bad_user).activated' = false</code>, so this is not a refinement either.</p>\n<h3>The punchline</h3>\n<p>Refinement is one of the most powerful techniques in formal specification, but also one of the hardest for people to understand. I'm starting to think that the reason it's so hard is because they learn refinement while they're <em>also</em> learning formal methods, so are faced with an unfamiliar topic in an unfamiliar context. If that's the case, then maybe it's easier introducing refinement in a more common context like databases.</p>\n<p>I've written a bit about refinement in the normal context <a href=\"https://hillelwayne.com/post/refinement/\" target=\"_blank\">here</a> (showing one specification is an implementation of another). I kinda want to work this explanation into the book but it might be too late for big content additions like this.</p>\n<p>(Food for thought: how do refinement mappings relate to database views?)</p>",
          "url": "https://buttondown.com/hillelwayne/archive/refinement-without-specification/",
          "published": "2026-01-20T17:49:07.000Z",
          "updated": "2026-01-20T17:49:07.000Z",
          "content": null,
          "image": null,
          "media": [],
          "authors": [],
          "categories": []
        },
        {
          "id": "https://buttondown.com/hillelwayne/archive/my-gripes-with-prolog/",
          "title": "My Gripes with Prolog",
          "description": "<p>For the next release of <a href=\"https://leanpub.com/logic/\" target=\"_blank\">Logic for Programmers</a>, I'm finally adding the sections on Answer Set Programming and Constraint Logic Programming that I TODOd back in version 0.9. And this is making me re-experience some of my pain points with Prolog, which I will gripe about now.  If you want to know more about why Prolog is cool instead, go <a href=\"https://buttondown.com/hillelwayne/archive/a48fce5b-8a05-4302-b620-9b26f057f145/\" target=\"_blank\">here</a> or <a href=\"https://www.metalevel.at/prolog\" target=\"_blank\">here</a> or <a href=\"https://ianthehenry.com/posts/drinking-with-datalog/\" target=\"_blank\">here</a> or <a href=\"https://logicprogramming.org/\" target=\"_blank\">here</a>. </p>\n<h3>No standardized strings</h3>\n<p>ISO \"strings\" are just atoms or lists of single-character atoms (or lists of integer character codes). The various implementations of Prolog add custom string operators but they are not cross compatible, so code written with strings in SWI-Prolog will not work in Scryer Prolog. </p>\n<h3>No functions</h3>\n<p>Code logic is expressed entirely in <em>rules</em>, predicates which return true or false for certain values. For example if you wanted to get the length of a Prolog list, you write this:</p>\n<div class=\"codehilite\"><pre><span></span><code><span class=\"s s-Atom\">?-</span> <span class=\"nf\">length</span><span class=\"p\">([</span><span class=\"s s-Atom\">a</span><span class=\"p\">,</span> <span class=\"s s-Atom\">b</span><span class=\"p\">,</span> <span class=\"s s-Atom\">c</span><span class=\"p\">],</span> <span class=\"nv\">Len</span><span class=\"p\">).</span>\n\n   <span class=\"nv\">Len</span> <span class=\"o\">=</span> <span class=\"mf\">3.</span>\n</code></pre></div>\n<p class=\"empty-line\" style=\"height:16px; margin:0px !important;\"></p>\n<p>Now this is pretty cool in that it allows bidirectionality, or running predicates \"in reverse\". To generate lists of length 3, you can write <code>length(L, 3)</code>. But it also means that if you want to get the length a list <em>plus one</em>, you can't do that in one expression, you have to write <code>length(List, Out), X is Out+1</code>.</p>\n<p>For a while I thought no functions was necessary evil for bidirectionality, but then I discovered <a href=\"https://picat-lang.org/\" target=\"_blank\">Picat</a> has functions and works just fine. That by itself is a reason for me to prefer Picat for my LP needs.</p>\n<p>(Bidirectionality is a killer feature of Prolog, so it's a shame I so rarely run into situations that use it.)</p>\n<h3>No standardized collection types besides lists</h3>\n<p>Aside from atoms (<code>abc</code>) and numbers, there are two data types:</p>\n<ul>\n<li>Linked lists like <code>[a,b,c,d]</code>.</li>\n<li>Compound terms like <code>dog(rex, poodle)</code>, which <em>seem</em> like record types but are actually tuples. You can even convert compound terms to linked lists with <code>=..</code>:</li>\n</ul>\n<div class=\"codehilite\"><pre><span></span><code><span class=\"s s-Atom\">?-</span> <span class=\"nv\">L</span> <span class=\"s s-Atom\">=..</span> <span class=\"p\">[</span><span class=\"s s-Atom\">a</span><span class=\"p\">,</span> <span class=\"s s-Atom\">b</span><span class=\"p\">,</span> <span class=\"s s-Atom\">c</span><span class=\"p\">].</span>\n   <span class=\"nv\">L</span> <span class=\"o\">=</span> <span class=\"nf\">a</span><span class=\"p\">(</span><span class=\"s s-Atom\">b</span><span class=\"p\">,</span> <span class=\"s s-Atom\">c</span><span class=\"p\">).</span>\n<span class=\"s s-Atom\">?-</span> <span class=\"nf\">a</span><span class=\"p\">(</span><span class=\"s s-Atom\">b</span><span class=\"p\">,</span> <span class=\"nf\">c</span><span class=\"p\">(</span><span class=\"s s-Atom\">c</span><span class=\"p\">))</span> <span class=\"s s-Atom\">=..</span> <span class=\"nv\">L</span><span class=\"p\">.</span>\n   <span class=\"nv\">L</span> <span class=\"o\">=</span> <span class=\"p\">[</span><span class=\"s s-Atom\">a</span><span class=\"p\">,</span> <span class=\"s s-Atom\">b</span><span class=\"p\">,</span> <span class=\"nf\">c</span><span class=\"p\">(</span><span class=\"s s-Atom\">c</span><span class=\"p\">)].</span>\n</code></pre></div>\n<p>There's no proper key-value maps or even struct types. Again, this is something that individual distributions can fix (without cross compatibility), but these never feel integrated with the rest of the language. </p>\n<h3>No boolean values</h3>\n<p><code>true</code> and <code>false</code> aren't values, they're control flow statements. <code>true</code> is a noop and <code>false</code> says that the current search path is a dead end, so backtrack and start again. You can't explicitly store true and false as values, you have to implicitly have them in facts (<code>passed(test)</code> instead of <code>test.passed? == true</code>).</p>\n<p>This hasn't made any tasks impossible, and I can usually find a workaround to whatever I want to do. But I do think it makes things more inconvenient! Sometimes I want to do something dumb like \"get all atoms that don't pass at least three of these rules\", and that'd be a lot easier if I could shove intermediate results into a sack of booleans. </p>\n<p>(This is called \"<a href=\"https://en.wikipedia.org/wiki/Negation_as_failure\" target=\"_blank\">Negation as Failure</a>\". I think this might be necessary to make Prolog a Turing complete general programming language. Picat fixes a lot of Prolog's gripes and still has negation as failure. ASP has regular negation but it's not Turing complete.) </p>\n<h3>Cuts are confusing</h3>\n<div class=\"subscribe-form\"></div>\n<p>Prolog finds solutions through depth first search, and a \"cut\" (<code>!</code>) symbol prevents backtracking past a certain point. This is necessary for optimization but can lead to invalid programs. </p>\n<p>You're not supposed to use cuts if you can avoid it, so I pretended cuts didn't exist. Which is why I was surprised to find that <a href=\"https://eu.swi-prolog.org/pldoc/doc_for?object=(-%3E)/2\" target=\"_blank\">conditionals</a> are implemented with cuts. Because cuts are spooky dark magic conditionals <em>sometimes</em> conditionals work as I expect them to and sometimes leave out valid solutions and I have no idea how to tell which it'll be. Usually I find it safer to just avoid conditionals entirely, which means my code gets a lot longer and messier. </p>\n<h3>Non-cuts are confusing</h3>\n<p>The original example in the last section was this: </p>\n<div class=\"codehilite\"><pre><span></span><code><span class=\"nf\">foo</span><span class=\"p\">(</span><span class=\"nv\">A</span><span class=\"p\">,</span> <span class=\"nv\">B</span><span class=\"p\">)</span> <span class=\"p\">:-</span>\n    <span class=\"s s-Atom\">\\+</span> <span class=\"p\">(</span><span class=\"nv\">A</span> <span class=\"o\">=</span> <span class=\"nv\">B</span><span class=\"p\">),</span>\n    <span class=\"nv\">A</span> <span class=\"o\">=</span> <span class=\"mi\">1</span><span class=\"p\">,</span>\n    <span class=\"nv\">B</span> <span class=\"o\">=</span> <span class=\"mf\">2.</span>\n</code></pre></div>\n<p><code>foo(1, 2)</code> returns true, so you'd expect <code>f(A, B)</code> to return <code>A=1, B=2</code>. But it returns <code>false</code>.  Whereas this works as expected.</p>\n<div class=\"codehilite\"><pre><span></span><code><span class=\"nf\">bar</span><span class=\"p\">(</span><span class=\"nv\">A</span><span class=\"p\">,</span> <span class=\"nv\">B</span><span class=\"p\">)</span> <span class=\"p\">:-</span>\n    <span class=\"nv\">A</span> <span class=\"o\">=</span> <span class=\"mi\">1</span><span class=\"p\">,</span>\n    <span class=\"nv\">B</span> <span class=\"o\">=</span> <span class=\"mi\">2</span><span class=\"p\">,</span>\n    <span class=\"s s-Atom\">\\+</span> <span class=\"p\">(</span><span class=\"nv\">A</span> <span class=\"o\">=</span> <span class=\"nv\">B</span><span class=\"p\">).</span>\n</code></pre></div>\n<p>I <em>thought</em> this was because <code>\\+</code> was implemented with cuts, and the <a href=\"https://www.amazon.com/Programming-Prolog-Using-ISO-Standard/dp/3540006788\" target=\"_blank\">Clocksin book</a> suggests it's <code>call(P), !, fail</code>, so this was my prime example about how cuts are confusing. But then I tried this:</p>\n<div class=\"codehilite\"><pre><span></span><code><span class=\"s s-Atom\">?-</span> <span class=\"nf\">member</span><span class=\"p\">(</span><span class=\"nv\">A</span><span class=\"p\">,</span> <span class=\"p\">[</span><span class=\"mi\">1</span><span class=\"p\">,</span><span class=\"mi\">2</span><span class=\"p\">,</span><span class=\"mi\">3</span><span class=\"p\">]),</span> <span class=\"s s-Atom\">\\+</span> <span class=\"p\">(</span><span class=\"nv\">A</span> <span class=\"o\">=</span> <span class=\"mi\">2</span><span class=\"p\">),</span> <span class=\"nv\">A</span> <span class=\"o\">=</span> <span class=\"mf\">3.</span>\n<span class=\"nv\">A</span> <span class=\"o\">=</span> <span class=\"mf\">3.</span> <span class=\"c1\">% wtf?</span>\n</code></pre></div>\n<p>There's no way to get that behavior with cuts! I don't think <code>\\+</code> uses cuts at all! And now I have to figure out why \n<code>foo(A, B)</code> doesn't returns results. Is it <a href=\"https://github.com/dtonhofer/prolog_notes/blob/master/other_notes/about_negation/floundering.md\" target=\"_blank\">floundering</a>? Is it because <code>\\+ P</code> only succeeds if <code>P</code> fails, and <code>A = B</code> always succeeds? A closed-world assumption? Something else?<sup id=\"fnref:dif\"><a class=\"footnote-ref\" href=\"#fn:dif\">1</a></sup></p>\n<h3>Straying outside of default queries is confusing</h3>\n<p>Say I have a program like this:</p>\n<div class=\"codehilite\"><pre><span></span><code><span class=\"nf\">tree</span><span class=\"p\">(</span><span class=\"s s-Atom\">n</span><span class=\"p\">,</span> <span class=\"s s-Atom\">n1</span><span class=\"p\">).</span>\n<span class=\"nf\">tree</span><span class=\"p\">(</span><span class=\"s s-Atom\">n</span><span class=\"p\">,</span> <span class=\"s s-Atom\">n2</span><span class=\"p\">).</span>\n<span class=\"nf\">tree</span><span class=\"p\">(</span><span class=\"s s-Atom\">n1</span><span class=\"p\">,</span> <span class=\"s s-Atom\">n11</span><span class=\"p\">).</span>\n<span class=\"nf\">tree</span><span class=\"p\">(</span><span class=\"s s-Atom\">n2</span><span class=\"p\">,</span> <span class=\"s s-Atom\">n21</span><span class=\"p\">).</span>\n<span class=\"nf\">tree</span><span class=\"p\">(</span><span class=\"s s-Atom\">n2</span><span class=\"p\">,</span> <span class=\"s s-Atom\">n22</span><span class=\"p\">).</span>\n<span class=\"nf\">tree</span><span class=\"p\">(</span><span class=\"s s-Atom\">n11</span><span class=\"p\">,</span> <span class=\"s s-Atom\">n111</span><span class=\"p\">).</span>\n<span class=\"nf\">tree</span><span class=\"p\">(</span><span class=\"s s-Atom\">n11</span><span class=\"p\">,</span> <span class=\"s s-Atom\">n112</span><span class=\"p\">).</span>\n\n<span class=\"nf\">branch</span><span class=\"p\">(</span><span class=\"nv\">N</span><span class=\"p\">)</span> <span class=\"p\">:-</span> <span class=\"c1\">% two children</span>\n    <span class=\"nf\">tree</span><span class=\"p\">(</span><span class=\"nv\">N</span><span class=\"p\">,</span> <span class=\"nv\">C1</span><span class=\"p\">),</span>\n    <span class=\"nf\">tree</span><span class=\"p\">(</span><span class=\"nv\">N</span><span class=\"p\">,</span> <span class=\"nv\">C2</span><span class=\"p\">),</span>\n    <span class=\"nv\">C1</span> <span class=\"s s-Atom\">@<</span> <span class=\"nv\">C2</span><span class=\"p\">.</span> <span class=\"c1\">% ordering</span>\n</code></pre></div>\n<p>And I want to know all of the nodes that are parents of branches. The normal way to do this is with a query:</p>\n<div class=\"codehilite\"><pre><span></span><code><span class=\"s s-Atom\">?-</span> <span class=\"nf\">tree</span><span class=\"p\">(</span><span class=\"nv\">A</span><span class=\"p\">,</span> <span class=\"nv\">N</span><span class=\"p\">),</span> <span class=\"nf\">branch</span><span class=\"p\">(</span><span class=\"nv\">N</span><span class=\"p\">).</span>\n<span class=\"nv\">A</span> <span class=\"o\">=</span> <span class=\"s s-Atom\">n</span><span class=\"p\">,</span> <span class=\"nv\">N</span> <span class=\"o\">=</span> <span class=\"s s-Atom\">n2</span><span class=\"p\">;</span> <span class=\"c1\">% show more...</span>\n<span class=\"nv\">A</span> <span class=\"o\">=</span> <span class=\"s s-Atom\">n1</span><span class=\"p\">,</span> <span class=\"nv\">N</span> <span class=\"o\">=</span> <span class=\"s s-Atom\">n11</span><span class=\"p\">.</span>\n</code></pre></div>\n<p>This is interactively making me query for every result. That's usually not what I want, I know the result of my query is finite and I want all of the results at once, so I can count or farble or whatever them. It took a while to figure out that the proper solution is <a href=\"https://www.swi-prolog.org/pldoc/man?predicate=bagof/3\" target=\"_blank\"><code>bagof(Template, Goal, Bag)</code></a>, which will \"Unify Bag with the alternatives of Template\":</p>\n<div class=\"codehilite\"><pre><span></span><code><span class=\"s s-Atom\">?-</span> <span class=\"nf\">bagof</span><span class=\"p\">(</span><span class=\"nv\">A</span><span class=\"p\">,</span> <span class=\"p\">(</span><span class=\"nf\">tree</span><span class=\"p\">(</span><span class=\"nv\">A</span><span class=\"p\">,</span> <span class=\"nv\">N</span><span class=\"p\">),</span> <span class=\"nf\">branch</span><span class=\"p\">(</span><span class=\"nv\">N</span><span class=\"p\">)),</span> <span class=\"nv\">As</span><span class=\"p\">).</span>\n\n<span class=\"nv\">As</span> <span class=\"o\">=</span> <span class=\"p\">[</span><span class=\"s s-Atom\">n1</span><span class=\"p\">],</span> <span class=\"nv\">N</span> <span class=\"o\">=</span> <span class=\"s s-Atom\">n11</span><span class=\"p\">;</span>\n<span class=\"nv\">As</span> <span class=\"o\">=</span> <span class=\"p\">[</span><span class=\"s s-Atom\">n</span><span class=\"p\">],</span> <span class=\"nv\">N</span> <span class=\"o\">=</span> <span class=\"s s-Atom\">n2</span><span class=\"p\">.</span>\n</code></pre></div>\n<p>Wait crap that's still giving one result at a time, because <code>N</code> is a free variable in <code>bagof</code> so it backtracks over that. It surprises me but I guess it's good to have as an option. So how do I get all of the results at once?</p>\n<div class=\"codehilite\"><pre><span></span><code><span class=\"s s-Atom\">?-</span> <span class=\"nf\">bagof</span><span class=\"p\">(</span><span class=\"nv\">A</span><span class=\"p\">,</span> <span class=\"nv\">N</span><span class=\"s s-Atom\">^</span><span class=\"p\">(</span><span class=\"nf\">tree</span><span class=\"p\">(</span><span class=\"nv\">A</span><span class=\"p\">,</span> <span class=\"nv\">N</span><span class=\"p\">),</span> <span class=\"nf\">branch</span><span class=\"p\">(</span><span class=\"nv\">N</span><span class=\"p\">)),</span> <span class=\"nv\">As</span><span class=\"p\">).</span>\n\n<span class=\"nv\">As</span> <span class=\"o\">=</span> <span class=\"p\">[</span><span class=\"s s-Atom\">n</span><span class=\"p\">,</span> <span class=\"s s-Atom\">n1</span><span class=\"p\">]</span>\n</code></pre></div>\n<p>The only difference is the <code>N^Goal</code>, which tells <code>bagof</code> to ignore and group the results of <code>N</code>. As far as I can tell, this is the <em>only</em> place the ISO standard uses <code>^</code> to mean anything besides exponentiation. Supposedly it's the <a href=\"https://sicstus.sics.se/sicstus/docs/latest4/html/sicstus.html/ref_002dall_002dsum.html\" target=\"_blank\">existential quantifier</a>? In general whenever I try to stray outside simpler use-cases, especially if I try to do things non-interactively, I run into trouble.</p>\n<h3>I have mixed feelings about symbol terms</h3>\n<p>It took me a long time to realize the reason <code>bagof</code>  \"works\" is because infix symbols are mapped to prefix compound terms, so that  <code>a^b</code> is <code>^(a, b)</code>, and then different predicates can decide to do different things with <code>^(a, b)</code>.</p>\n<p>This is also why you can't just write <code>A = B+1</code>: that unifies <code>A</code> with the <em>compound term</em> <code>+(B, 1)</code>. <code>A+1 = B+2</code> is <em>false</em>, as <code>1 \\= 2</code>. You have to write <code>A+1 is B+2</code>, as <code>is</code> is the operator that converts <code>+(B, 1)</code> to a mathematical term.</p>\n<p>(And <em>that</em> fails because <code>is</code> isn't fully bidirectional. The lhs <em>must</em> be a single variable. You have to import <code>clpfd</code> and write <code>A + 1 #= B + 2</code>.)</p>\n<p>I don't like this, but I'm a hypocrite for saying that because I appreciate the idea and don't mind custom symbols in other languages. I guess what annoys me is there's no official definition of what <code>^(a, b)</code> is, it's purely a convention. ISO Prolog uses <code>-(a, b)</code> (aka <code>a-b</code>) as a convention to mean \"pairs\", and the only way to realize that is to see that an awful lot of standard modules use that convention. But you can use <code>-(a, b)</code> to mean something else in your own code and nothing will warn you of the inconsistency.</p>\n<p>Anyway I griped about pairs so I can gripe about <code>sort</code>.</p>\n<h3>go home sort, ur drunk</h3>\n<p>This one's just a blunder:</p>\n<div class=\"codehilite\"><pre><span></span><code><span class=\"s s-Atom\">?-</span> <span class=\"nf\">sort</span><span class=\"p\">([</span><span class=\"mi\">3</span><span class=\"p\">,</span><span class=\"mi\">1</span><span class=\"p\">,</span><span class=\"mi\">2</span><span class=\"p\">,</span><span class=\"mi\">1</span><span class=\"p\">,</span><span class=\"mi\">3</span><span class=\"p\">],</span> <span class=\"nv\">Out</span><span class=\"p\">).</span>\n   <span class=\"nv\">Out</span> <span class=\"o\">=</span> <span class=\"p\">[</span><span class=\"mi\">1</span><span class=\"p\">,</span> <span class=\"mi\">2</span><span class=\"p\">,</span> <span class=\"mi\">3</span><span class=\"p\">].</span> <span class=\"c1\">% wat</span>\n</code></pre></div>\n<p>According to an expert online this is because sort is supposed to return a sorted <em>set</em>, not a sorted list. If you want to preserve duplicates you're supposed to lift all of the values into <code>-($key, $value)</code> compound terms, then use <a href=\"https://eu.swi-prolog.org/pldoc/doc_for?object=keysort/2\" target=\"_blank\">keysort</a>, then extract the values. And, since there's no functions, this process takes at least three lines. This is also how you're supposed to sort by a custom predicate, like \"the second value of a compound term\". </p>\n<p>(Most (but not all) distributions have a duplicate merge like <a href=\"https://eu.swi-prolog.org/pldoc/doc_for?object=msort/2\" target=\"_blank\">msort</a>. SWI-Prolog also has a <a href=\"https://eu.swi-prolog.org/pldoc/doc_for?object=predsort/3\" target=\"_blank\">sort by key</a> but it removes duplicates.)</p>\n<h3>Please just let me end rules with a trailing comma instead of a period, I'm begging you</h3>\n<p>I don't care if it makes fact parsing ambiguous, I just don't want \"reorder two lines\" to be a syntax error anymore</p>\n<hr/>\n<p>I expect by this time tomorrow I'll have been Cunningham'd and there will be a 2000 word essay about how all of my gripes are either easily fixable by doing XYZ or how they are the best possible choice that Prolog could have made. I mean, even in writing this I found out some fixes to problems I had. Like I was going to gripe about how I can't run SWI-Prolog queries from the command line but, in doing do diligence finally <em>finally</em> figured it out:</p>\n<div class=\"codehilite\"><pre><span></span><code>swipl<span class=\"w\"> </span>-t<span class=\"w\"> </span>halt<span class=\"w\"> </span>-g<span class=\"w\"> </span><span class=\"s2\">\"bagof(X, Goal, Xs), print(Xs)\"</span><span class=\"w\"> </span>./file.pl\n</code></pre></div>\n<p>It's pretty clunky but still better than the old process of having to enter an interactive session every time I wanted to validate a script change.</p>\n<p>(Also, answer set programming is pretty darn cool. Excited to write about it in the book!)</p>\n<div class=\"footnote\">\n<hr/>\n<ol>\n<li id=\"fn:dif\">\n<p>A couple of people mentioned using <a href=\"https://eu.swi-prolog.org/pldoc/doc_for?object=dif/2\" target=\"_blank\">dif/2</a> instead of <code>\\+ A = B</code>. Dif is great but usually I hit the negation footgun with things like <code>\\+ foo(A, B), bar(B, C), baz(A, C)</code>, where <code>dif/2</code> isn't applicable. <a class=\"footnote-backref\" href=\"#fnref:dif\" title=\"Jump back to footnote 1 in the text\">↩</a></p>\n</li>\n</ol>\n</div>",
          "url": "https://buttondown.com/hillelwayne/archive/my-gripes-with-prolog/",
          "published": "2026-01-14T16:48:51.000Z",
          "updated": "2026-01-14T16:48:51.000Z",
          "content": null,
          "image": null,
          "media": [],
          "authors": [],
          "categories": []
        },
        {
          "id": "https://buttondown.com/hillelwayne/archive/the-liskov-substitution-principle-does-more-than/",
          "title": "The Liskov Substitution Principle does more than you think",
          "description": "<p class=\"empty-line\" style=\"height:16px; margin:0px !important;\"></p>\n<p>Happy New Year! I'm done with the newsletter hiatus and am going to try updating weekly again. To ease into things a bit, I'll try to keep posts a little more off the cuff and casual for a while, at least until <a href=\"https://leanpub.com/logic/\" target=\"_blank\"><em>Logic for Programmers</em></a> is done. Speaking of which, v0.13 should be out by the end of this month.</p>\n<p>So for this newsletter I want to talk about the <a href=\"https://en.wikipedia.org/wiki/Liskov_substitution_principle\" target=\"_blank\">Liskov Substitution Principle</a> (LSP). Last week I read <a href=\"https://loup-vaillant.fr/articles/solid-bull\" target=\"_blank\">A SOLID Load of Bull</a> by cryptographer Loupe Vaillant, where he argues the <a href=\"https://en.wikipedia.org/wiki/SOLID\" target=\"_blank\">SOLID</a> principles of OOP are not worth following. He makes an exception for LSP, but also claims that it's \"just subtyping\" and further:</p>\n<blockquote>\n<p>If I were trying really hard to be negative about the Liskov substitution principle, I would stress that <strong>it only applies when inheritance is involved</strong>, and inheritance is strongly discouraged anyway.</p>\n</blockquote>\n<p>LSP is more interesting than that! In the original paper, <a href=\"https://www.cs.cmu.edu/~wing/publications/LiskovWing94.pdf\" target=\"_blank\">A Behavioral Notion of Subtyping</a>, Barbara Liskov and Jeannette Wing start by defining a \"correct\" subtyping as follows:</p>\n<blockquote>\n<p>Subtype Requirement: Let ϕ(x) be a property provable about objects x of type T. Then ϕ(y) should be true for objects y of type S where S is a subtype of T.</p>\n</blockquote>\n<p>From then on, the paper determine what <em>guarantees</em> that a subtype is correct.<sup id=\"fnref:safety\"><a class=\"footnote-ref\" href=\"#fn:safety\">1</a></sup>  They identify three conditions: </p>\n<ol>\n<li>Each of the subtype's methods has the same or weaker preconditions and the same or stronger postconditions as the corresponding supertype method.<sup id=\"fnref:cocontra\"><a class=\"footnote-ref\" href=\"#fn:cocontra\">2</a></sup> </li>\n<li>The subtype satisfies all state invariants of the supertype. </li>\n<li>The subtype satisfies all \"history properties\" of the supertype. <sup id=\"fnref:refinement\"><a class=\"footnote-ref\" href=\"#fn:refinement\">3</a></sup> e.g. if a supertype has an immutable field, the subtype cannot make it mutable. </li>\n</ol>\n<p>(Later, Elisa Baniassad and Alexander Summers <a href=\"https://www.cs.ubc.ca/~alexsumm/papers/BaniassadSummers21.pdf\" target=\"_blank\">would realize</a> these are equivalent to \"the subtype passes all black-box tests designed for the supertype\", which I wrote a little bit more about <a href=\"https://www.hillelwayne.com/post/lsp/\" target=\"_blank\">here</a>.)</p>\n<p>I want to focus on the first rule about preconditions and postconditions. This refers to the method's <strong>contract</strong>.  For a function <code>f</code>, <code>f.Pre</code> is what must be true going into the function, and <code>f.Post</code> is what the function guarantees on execution. A canonical example is square root: </p>\n<div class=\"codehilite\"><pre><span></span><code>sqrt.Pre(x) = x >= 0\nsqrt.Post(x, out) = out >= 0 && out*out == x\n</code></pre></div>\n<div class=\"subscribe-form\"></div>\n<p>Mathematically we would write this as <code>all x: f.Pre(x) => f.Post(x)</code> (where <code>=></code> is the <a href=\"https://en.wikipedia.org/wiki/Material_conditional\" target=\"_blank\">implication operator</a>). If that relation holds for all <code>x</code>, we say the function is \"correct\". With this definition we can actually formally deduce the first  subtyping requirement. Let <code>caller</code> be some code that uses a method, which we will call <code>super</code>, and let both <code>caller</code> and <code>super</code> be correct. Then we know the following statements are true:</p>\n<div class=\"codehilite\"><pre><span></span><code>  1. caller.Pre && stuff => super.Pre\n  2. super.Pre => super.Post\n  3. super.Post && more_stuff => caller.Post\n</code></pre></div>\n<p>Now let's say we substitute <code>super</code> with <code>sub</code>, which is also correct. Here is what we now know is true: </p>\n<div class=\"codehilite\"><pre><span></span><code><span class=\"w\"> </span> 1. caller.Pre => super.Pre\n<span class=\"gd\">- 2. super.Pre => super.Post</span>\n<span class=\"gi\">+ 2. sub.Pre => sub.Post</span>\n<span class=\"w\"> </span> 3. super.Post => caller.Post\n</code></pre></div>\n<p>When is <code>caller</code> still correct? When we can fill in the \"gaps\" in the chain, aka if <code>super.Pre => sub.Pre</code> and <code>sub.Post => super.Post</code>. In other words, if <code>sub</code>'s preconditions are weaker than (or equivalent to) <code>super</code>'s preconditions and if <code>sub</code>'s postconditions are stronger than (or equivalent to) <code>super</code>'s postconditions.</p>\n<p>Notice that I never actually said <code>sub</code> was from a subtype of <code>super</code>! The LSP conditions (at least, the contract rule of LSP) doesn't just apply to <em>subtypes</em> but can be applied in any situation where we substitute a function or block of code for another. Subtyping is a common place where this happens, but by no means the only! We can also substitute across time.Any time we modify some code's behavior, we are effectively substituting the new version in for the old version, and so the new version's contract must be compatible with the old version's to guarantee no existing code is broken.</p>\n<p>For example, say we maintain an API or function with two required inputs, <code>X</code> and <code>Y</code>, and one optional input, <code>Z</code>. Making <code>Z</code> required strengthens the precondition (\"input must have Z\" is stronger than \"input may have Z\"), so potentially breaks existing users of our API. Making <code>Y</code> optional weakens the precondition (\"input may have Y\" is weaker than \"input must have Y\"), so is guaranteed to be compatible.</p>\n<p>(This also underpins <a href=\"https://en.wikipedia.org/wiki/Robustness_principle\" target=\"_blank\">The robustness principle</a>: \"be conservative in what you send, be liberal in what you accept\".)</p>\n<p>Now the dark side of all this is <a href=\"https://www.hyrumslaw.com/\" target=\"_blank\">Hyrum's Law</a>. In the below code, are <code>new</code>'s postconditions stronger than <code>old</code>'s postconditions? </p>\n<div class=\"codehilite\"><pre><span></span><code><span class=\"k\">def</span><span class=\"w\"> </span><span class=\"nf\">old</span><span class=\"p\">():</span>\n    <span class=\"k\">return</span> <span class=\"p\">{</span><span class=\"s2\">\"a\"</span><span class=\"p\">:</span> <span class=\"s2\">\"foo\"</span><span class=\"p\">,</span> <span class=\"s2\">\"b\"</span><span class=\"p\">:</span> <span class=\"s2\">\"bar\"</span><span class=\"p\">}</span>\n\n<span class=\"k\">def</span><span class=\"w\"> </span><span class=\"nf\">new</span><span class=\"p\">():</span>\n    <span class=\"k\">return</span> <span class=\"p\">{</span><span class=\"s2\">\"a\"</span><span class=\"p\">:</span> <span class=\"s2\">\"foo\"</span><span class=\"p\">,</span> <span class=\"s2\">\"b\"</span><span class=\"p\">:</span> <span class=\"s2\">\"bar\"</span><span class=\"p\">,</span> <span class=\"s2\">\"c\"</span><span class=\"p\">:</span> <span class=\"s2\">\"baz\"</span><span class=\"p\">}</span>\n</code></pre></div>\n<p>On a first appearance, this is a strengthened postcondition: <code>out.contains_keys([a, b, c]) => out.contains_keys([a, b])</code>. But now someone does this:</p>\n<div class=\"codehilite\"><pre><span></span><code><span class=\"n\">my_dict</span> <span class=\"o\">=</span> <span class=\"p\">{</span><span class=\"s2\">\"c\"</span><span class=\"p\">:</span> <span class=\"s2\">\"blat\"</span><span class=\"p\">}</span> \n<span class=\"n\">my_dict</span> <span class=\"o\">|=</span> <span class=\"n\">new</span><span class=\"p\">()</span>\n<span class=\"k\">assert</span> <span class=\"n\">my_dict</span><span class=\"p\">[</span><span class=\"n\">c</span><span class=\"p\">]</span> <span class=\"o\">==</span> <span class=\"s2\">\"blat\"</span>\n</code></pre></div>\n<p>Oh no, their code now breaks! They saw <code>old</code> had the postcondition \"<code>out</code> does NOT contain \"c\" as a key\", and then wrote their code expecting that postcondition. In a sense, <em>any</em> change the postcondition can potentially break <em>someone</em>. \"All observable behaviors of your system\nwill be depended on by somebody\", as <a href=\"https://www.hyrumslaw.com/\" target=\"_blank\">Hyrum's Law</a> puts it.</p>\n<p>So we need to be explicit in what our postconditions actually are, and properties of the output that are not part of our explicit postconditions are subject to be violated on the next version. You'll break people's workflows but you also have grounds to say \"I warned you\".</p>\n<p>Overall, Liskov and Wing did their work in the context of subtyping, but the principles are more widely applicable, certainly to more than just the use of inheritance.</p>\n<div class=\"footnote\">\n<hr/>\n<ol>\n<li id=\"fn:safety\">\n<p>Though they restrict it to just <a href=\"https://www.hillelwayne.com/post/safety-and-liveness/\" target=\"_blank\">safety properties</a>. <a class=\"footnote-backref\" href=\"#fnref:safety\" title=\"Jump back to footnote 1 in the text\">↩</a></p>\n</li>\n<li id=\"fn:cocontra\">\n<p>The paper lists a couple of other authors as introduce the idea of \"contra/covariance rules\", but part of being \"off-the-cuff and casual\" means not diving into every referenced paper. So they might have gotten the pre/postconditions thing from an earlier author, dunno for sure! <a class=\"footnote-backref\" href=\"#fnref:cocontra\" title=\"Jump back to footnote 2 in the text\">↩</a></p>\n</li>\n<li id=\"fn:refinement\">\n<p>I <em>believe</em> that this is equivalent to the formal methods notion of a <a href=\"https://www.hillelwayne.com/post/refinement/\" target=\"_blank\">refinement</a>. <a class=\"footnote-backref\" href=\"#fnref:refinement\" title=\"Jump back to footnote 3 in the text\">↩</a></p>\n</li>\n</ol>\n</div>",
          "url": "https://buttondown.com/hillelwayne/archive/the-liskov-substitution-principle-does-more-than/",
          "published": "2026-01-06T16:51:26.000Z",
          "updated": "2026-01-06T16:51:26.000Z",
          "content": null,
          "image": null,
          "media": [],
          "authors": [],
          "categories": []
        },
        {
          "id": "https://buttondown.com/hillelwayne/archive/some-fun-software-facts/",
          "title": "Some Fun Software Facts",
          "description": "<p>Last newsletter of the year!</p>\n<p>First some news on <em>Logic for Programmers</em>. Thanks to everyone who donated to the <a href=\"https://buttondown.com/hillelwayne/archive/get-logic-for-programmers-50-off-support-chicago\" target=\"_blank\">feedchicago charity drive</a>! In total we raised $2250 for Chicago food banks. Proof <a href=\"https://link.fndrsp.net/CL0/https:%2F%2Fgiving.chicagosfoodbank.org%2Freceipts%2FBMDDDCAF%3FreceiptType=oneTime%26emailLog=YS699MZW/2/0100019ae2b7eb92-7c917ad0-c94e-4fe2-8ee1-1b9dc521c607-000000/brmxoTOvoJN94I9nQH26s7fRrmyFDj_Jir1FySSoxCw=434\" target=\"_blank\">here</a>.</p>\n<p>If you missed buying <em>Logic for Programmers</em> real cheap in the charity drive, you can still get it for $10 off with the holiday code <a href=\"https://leanpub.com/logic/c/hannukah-presents\" target=\"_blank\">hannukah-presents</a>. This will last from now until the end of the year. After that, I'll be raising the price from $25 to $30.</p>\n<p>Anyway, to make this more than just some record keeping, let's close out with something light. I'm one of those people who loves hearing \"fun facts\" about stuff. So here's some random fun facts I accumulated about software over the years:</p>\n<ul>\n<li>In 2017, a team of eight+ programmers <a href=\"https://codegolf.stackexchange.com/questions/11880/build-a-working-game-of-tetris-in-conways-game-of-life\" target=\"_blank\">successfully implemented Tetris</a> as a <a href=\"https://en.wikipedia.org/wiki/Conway's_Game_of_Life\" target=\"_blank\">game of life simulation</a>. The GoL grid had an area of 30 trillion pixels and implemented a full programmable CPU as part of the project.</li>\n</ul>\n<ul>\n<li>Computer systems have to deal with leap seconds in order to keep UTC (where one day is 86,400 seconds) in sync with UT1 (where one day is exactly one full earth rotation). The people in charge recently passed a resolution to abolish the leap second by 2035, letting UTC and UT1 slowly drift out of sync.</li>\n</ul>\n<ul>\n<li><a href=\"https://buttondown.com/hillelwayne/archive/vim-is-turing-complete/\" target=\"_blank\">Vim is Turing complete</a>.</li>\n</ul>\n<div class=\"subscribe-form\"></div>\n<ul>\n<li>The backslash character basically didn't exist in writing before 1930, and <a href=\"http://dump.deadcodersociety.org/ascii.pdf\" target=\"_blank\">was only added to ASCII</a> so mathematicians (and ALGOLists) could write <code>/\\</code> and <code>\\/</code>. It's popular use in computing stems entirely from being a useless key on the keyboard.</li>\n</ul>\n<ul>\n<li><a href=\"https://en.wikipedia.org/wiki/Galactic_algorithm\" target=\"_blank\">Galactic Algorithms</a> are algorithms that are theoretically faster than algorithms we use, but only at scales that make them impractical. For example, matrix multiplication of NxN is <a href=\"https://en.wikipedia.org/wiki/Strassen_algorithm\" target=\"_blank\">normally</a> O(N^2.81). The <a href=\"https://www-auth.cs.wisc.edu/lists/theory-reading/2009-December/pdfmN6UVeUiJ3.pdf\" target=\"_blank\">Coppersmith Winograd</a> algorithm is O(N^2.38), but is so complex that it's vastly slower for even <a href=\"https://mathoverflow.net/questions/1743/what-is-the-constant-of-the-coppersmith-winograd-matrix-multiplication-algorithm\" target=\"_blank\">10,000 x 10,000 matrices</a>. It's still interesting in advancing our mathematical understanding of algorithms!</li>\n</ul>\n<ul>\n<li>Cloudflare generates random numbers by, in part, <a href=\"https://www.cloudflare.com/learning/ssl/lava-lamp-encryption/\" target=\"_blank\">taking pictures of 100 lava lamps</a>.</li>\n</ul>\n<ul>\n<li>Mergesort is older than bubblesort. Quicksort is slightly younger than bubblesort but older than the <em>term</em> \"bubblesort\". Bubblesort, btw, <a href=\"https://buttondown.com/hillelwayne/archive/when-would-you-ever-want-bubblesort/\" target=\"_blank\">does have some uses</a>.</li>\n</ul>\n<ul>\n<li>Speaking of mergesort, most implementations of mergesort pre-2006 <a href=\"https://research.google/blog/extra-extra-read-all-about-it-nearly-all-binary-searches-and-mergesorts-are-broken/\" target=\"_blank\">were broken</a>. Basically the problem was that the \"find the midpoint of a list\" step <em>could</em> overflow if the list was big enough. For C with 32-bit signed integers, \"big enough\" meant over a billion elements, which was why the bug went unnoticed for so long.</li>\n</ul>\n<ul>\n<li><a href=\"https://nibblestew.blogspot.com/2023/09/circles-do-not-exist.html\" target=\"_blank\">PDF's drawing model cannot render perfect circles</a>.</li>\n</ul>\n<ul>\n<li>People make fun of how you have to flip USBs three times to get them into a computer, but there's supposed to be a guide: according to the standard, USBs are supposed to be inserted <em>logo-side up</em>. Of course, this assumes that the port is right-side up, too, which is why USB-C is just symmetric. </li>\n</ul>\n<ul>\n<li>I was gonna write a fun fact about how all spreadsheet software treats 1900 as a leap year, as that was a bug in Lotus 1-2-3 and everybody preserved backwards compatibility. But I checked and Google sheets considers it a normal year. So I guess the fun fact is that things have changed!</li>\n</ul>\n<ul>\n<li>Speaking of spreadsheet errors, in 2020 <a href=\"https://www.engadget.com/scientists-rename-genes-due-to-excel-151748790.html\" target=\"_blank\">biologists changed the official nomenclature</a> of 27 genes because Excel kept parsing their names as dates. F.ex MARCH1 was renamed to MARCHF1 to avoid being parsed as \"March 1st\". Microsoft rolled out a fix for this... three years later.</li>\n</ul>\n<ul>\n<li>It is possible to encode any valid JavaScript program with just the characters <code>()+[]!</code>. This encoding is called <a href=\"https://en.wikipedia.org/wiki/JSFuck\" target=\"_blank\">JSFuck</a> and was once used to distribute malware on <a href=\"https://arstechnica.com/information-technology/2016/02/ebay-has-no-plans-to-fix-severe-bug-that-allows-malware-distribution/\" target=\"_blank\">Ebay</a>.</li>\n</ul>\n<p>Happy holidays everyone, and see you in 2026!</p>\n<div class=\"footnote\">\n<hr/>\n<ol>\n<li id=\"fn:status\">\n<p>Current status update: I'm finally getting line by line structural editing done and it's turning up lots of improvements, so I'm doing more rewrites than I expected to be doing. <a class=\"footnote-backref\" href=\"#fnref:status\" title=\"Jump back to footnote 1 in the text\">↩</a></p>\n</li>\n</ol>\n</div>",
          "url": "https://buttondown.com/hillelwayne/archive/some-fun-software-facts/",
          "published": "2025-12-10T18:45:37.000Z",
          "updated": "2025-12-10T18:45:37.000Z",
          "content": null,
          "image": null,
          "media": [],
          "authors": [],
          "categories": []
        },
        {
          "id": "https://buttondown.com/hillelwayne/archive/one-more-week-to-the-logic-for-programmers-food/",
          "title": "One more week to the Logic for Programmers Food Drive",
          "description": "<p>A couple of weeks ago I started a fundraiser for the <a href=\"https://www.chicagosfoodbank.org/\" target=\"_blank\">Greater Chicago Food Depository</a>: get <a href=\"https://leanpub.com/logic/c/feedchicago\" target=\"_blank\">Logic for Programmers 50% off</a> and all the royalties will go to charity.<sup id=\"fnref:royalties\"><a class=\"footnote-ref\" href=\"#fn:royalties\">1</a></sup> Since then, we've raised a bit over $1600. Y'all are great! </p>\n<p>The fundraiser is going on until the end of November, so you still have one more week to get the book real cheap.</p>\n<p>I feel a bit weird about doing two newsletter adverts without raw content, so here's a teaser from a old project I really need to get back to. <a href=\"https://vorpus.org/blog/notes-on-structured-concurrency-or-go-statement-considered-harmful/#what-is-a-goto-statement-anyway\" target=\"_blank\">Notes on structured concurrency</a> argues that old languages had a \"old-testament fire-and-brimstone <code>goto</code>\" that could send control flow anywhere, like from the body of one function into the body of another function. This \"wild goto\", the article claims, what Dijkstra was railing against in <a href=\"https://homepages.cwi.nl/~storm/teaching/reader/Dijkstra68.pdf\" target=\"_blank\">Go To Statement Considered Harmful</a>, and that modern goto statements are much more limited, \"tame\" if you will, and wouldn't invoke Dijkstra's ire.</p>\n<p>I've shared this historical fact about Dijkstra many times, but recently two <a href=\"https://without.boats/blog/\" target=\"_blank\">separate</a> <a href=\"https://matklad.github.io/\" target=\"_blank\">people</a> have told me it doesn't makes sense: Dijkstra used ALGOL-60, which <em>already had</em> tame gotos. All of the problems he raises with <code>goto</code> hold even for tame ones, none are exclusive to wild gotos. So </p>\n<p>This got me looking to see which languages, if any, ever had the wild goto. I define this as any goto which lets you jump from outside to into a loop or function scope. Turns out, FORTRAN had tame gotos from the start, BASIC has wild gotos, and COBOL is a nonsense language intentionally designed to horrify me. I mean, look at this:</p>\n<p><img alt=\"The COBOL ALTER statement, which redefines a goto target\" class=\"newsletter-image\" src=\"https://assets.buttondown.email/images/e4dfa0fd-fdd5-4fef-b813-4053a183be2f.png?w=960&fit=max\"/></p>\n<p>The COBOL ALTER statement <em>changes a <code>goto</code>'s target at runtime</em>. </p>\n<p>(Early COBOL has tame gotos but only on a technicality: there are no nested scopes in COBOL so no jumping from outside and into a nested scope.)</p>\n<p>Anyway I need to write up the full story (and complain about COBOL more) but this is pretty neat! Reminder, <a href=\"https://leanpub.com/logic/c/feedchicago\" target=\"_blank\">fundraiser here</a>. Let's get it to 2k.</p>\n<div class=\"footnote\">\n<hr/>\n<ol>\n<li id=\"fn:royalties\">\n<p>Royalties are 80% so if you already have the book you get a bit more bang for your buck by donating to the GCFD directly <a class=\"footnote-backref\" href=\"#fnref:royalties\" title=\"Jump back to footnote 1 in the text\">↩</a></p>\n</li>\n</ol>\n</div>",
          "url": "https://buttondown.com/hillelwayne/archive/one-more-week-to-the-logic-for-programmers-food/",
          "published": "2025-11-24T18:21:49.000Z",
          "updated": "2025-11-24T18:21:49.000Z",
          "content": null,
          "image": null,
          "media": [],
          "authors": [],
          "categories": []
        },
        {
          "id": "https://buttondown.com/hillelwayne/archive/get-logic-for-programmers-50-off-support-chicago/",
          "title": "Get Logic for Programmers 50% off & Support Chicago Foodbanks",
          "description": "<p>From now until the end of the month, you can get <a href=\"https://leanpub.com/logic/c/feedchicago\" target=\"_blank\">Logic for Programmers at half price</a> with the coupon <code>feedchicago</code>. All royalties from that coupon will go to the <a href=\"https://www.chicagosfoodbank.org/\" target=\"_blank\">Greater Chicago Food Depository</a>. Thank you!</p>",
          "url": "https://buttondown.com/hillelwayne/archive/get-logic-for-programmers-50-off-support-chicago/",
          "published": "2025-11-10T16:31:11.000Z",
          "updated": "2025-11-10T16:31:11.000Z",
          "content": null,
          "image": null,
          "media": [],
          "authors": [],
          "categories": []
        },
        {
          "id": "https://buttondown.com/hillelwayne/archive/im-taking-a-break/",
          "title": "I'm taking a break",
          "description": "<p>Hi everyone,</p>\n<p>I've been getting burnt out on writing a weekly software essay. It's gone from taking me an afternoon to write a post to taking two or three days, and that's made it really difficult to get other writing done. That, plus some short-term work and life priorities, means now feels like a good time for a break. </p>\n<p>So I'm taking off from <em>Computer Things</em> for the rest of the year. There <em>might</em> be some announcements and/or one or two short newsletters in the meantime but I won't be attempting a weekly cadence until 2026.</p>\n<p>Thanks again for reading!</p>\n<p>Hillel</p>",
          "url": "https://buttondown.com/hillelwayne/archive/im-taking-a-break/",
          "published": "2025-10-27T21:02:37.000Z",
          "updated": "2025-10-27T21:02:37.000Z",
          "content": null,
          "image": null,
          "media": [],
          "authors": [],
          "categories": []
        },
        {
          "id": "https://buttondown.com/hillelwayne/archive/modal-editing-is-a-weird-historical-contingency/",
          "title": "Modal editing is a weird historical contingency we have through sheer happenstance",
          "description": "<p class=\"empty-line\" style=\"height:16px; margin:0px !important;\"></p>\n<p>A while back my friend <a href=\"https://morepablo.com/\" target=\"_blank\">Pablo Meier</a> was reviewing some 2024 videogames and wrote <a href=\"https://morepablo.com/2025/03/games-of-2024.html\" target=\"_blank\">this</a>:</p>\n<blockquote>\n<p>I feel like some artists, if they didn't exist, would have the resulting void filled in by someone similar (e.g. if Katy Perry didn't exist, someone like her would have). But others don't have successful imitators or comparisons (thinking Jackie Chan, or Weird Al): they are irreplaceable.  </p>\n</blockquote>\n<p>He was using it to describe auteurs but I see this as a property of opportunity, in that \"replaceable\" artists are those who work in bigger markets. Katy Perry's market is large, visible and obviously (but not <em>easily</em>) exploitable, so there are a lot of people who'd compete in her niche. Weird Al's market is unclear: while there were successful parody songs in the past, it wasn't clear there was enough opportunity there to support a superstar.</p>\n<p>I think that modal editing is in the latter category. Vim is now very popular and has spawned numerous successors. But its key feature, <strong>modes</strong>, is not obviously-beneficial, to the point that if Bill Joy didn't make vi (vim's direct predecessor) fifty years ago I don't think we'd have any modal editors today. </p>\n<h3>A quick overview of \"modal editing\"</h3>\n<p>In a non-modal editor, pressing the \"u\" key adds a \"u\" to your text, as you'd expect. In a <strong>modal editor</strong>, pressing \"u\" does something different depending on the \"mode\" you are in. In Vim's default \"normal\" mode, \"u\" undoes the last change to the text, while in the \"visual\" mode it lowercases all selected text. It only inserts the character in \"insert\" mode. All other keys, as well as chorded shortcuts (<code>ctrl-x</code>), work the same way. </p>\n<p>The clearest benefit to this is you can densely pack the keyboard with advanced commands. The standard US keyboard has 48ish keys dedicated to inserting characters. With the ctrl and shift modifiers that becomes at least ~150 extra shortcuts for each other mode. This is also what IMO \"spiritually\" distinguishes modal editing from contextual shortcuts. Even if a unimodal editor lets you change a keyboard shortcut's behavior based on languages or focused panel, without global user-controlled modes it simply can't achieve that density of shortcuts.</p>\n<p>Now while modal editing today is widely beloved (the Vim plugin for <a href=\"https://marketplace.visualstudio.com/items?itemName=vscodevim.vim\" target=\"_blank\">VSCode</a> has at least eight million downloads), I suspect it was \"carried\" by the popularity of vi, as opposed to driving vi's popularity.</p>\n<p class=\"empty-line\" style=\"height:16px; margin:0px !important;\"></p>\n<h3>Modal editing is an unusual idea</h3>\n<p>Pre-vi editors weren't modal. Some, like <a href=\"https://en.wikipedia.org/wiki/EDT_(Digital)\" target=\"_blank\">EDT/KED</a>, used chorded commands, while others like <a href=\"https://en.wikipedia.org/wiki/Ed_(software)\" target=\"_blank\">ed</a> or <a href=\"https://en.wikipedia.org/wiki/TECO_(text_editor)\" target=\"_blank\">TECO</a> basically REPLs for text-editing DSLs. Both of these ideas widely reappear in modern editors.</p>\n<p>As far as I can tell, the first modal editor was Butler Lampson's <a href=\"https://en.wikipedia.org/wiki/Bravo_(editor)\" target=\"_blank\">Bravo</a> in 1974. Bill Joy <a href=\"https://web.archive.org/web/20120210184000/http://web.cecs.pdx.edu/~kirkenda/joy84.html\" target=\"_blank\">admits he used it for inspiration</a>: </p>\n<blockquote>\n<p>A lot of the ideas for the screen editing mode were stolen from a Bravo manual I surreptitiously looked at and copied. Dot is really the double-escape from Bravo, the redo command. Most of the stuff was stolen. </p>\n</blockquote>\n<p>Bill Joy probably took the idea because he was working on <a href=\"https://en.wikipedia.org/wiki/ADM-3A\" target=\"_blank\">dumb terminals</a> that were slow to register keystrokes, which put pressure to minimize the number needed for complex operations.</p>\n<p>Why did Bravo have modal editing? Looking at the <a href=\"https://www.microsoft.com/en-us/research/wp-content/uploads/2016/11/15a-AltoHandbook.pdf\" target=\"_blank\">Alto handbook</a>, I get the impression that Xerox was trying to figure out the best mouse and GUI workflows. Bravo was an experiment with modes, one hand on the mouse and one issuing commands on the keyboard. Other experiments included context menus (the Markup program) and toolbars (Draw).</p>\n<p>Xerox very quickly decided <em>against</em> modes, as the successors <a href=\"http://www.bitsavers.org/pdf/xerox/alto/memos_1975/Gypsy_The_Ginn_Typescript_System_Apr75.pdf\" target=\"_blank\">Gypsy</a> and <a href=\"http://www.bitsavers.org/pdf/xerox/alto/BravoXMan.pdf\" target=\"_blank\">BravoX</a> were modeless. Commands originally assigned to English letters were moved to graphical menus, special keys, and chords. </p>\n<p>It seems to me that modes started as an unsuccessful experiment deal with a specific constraint and then later successfully adopted to deal with a different constraint. It was a specialized feature as opposed to a generally useful feature like chords.</p>\n<h3>Modal editing didn't popularize vi</h3>\n<p>While vi was popular at Bill Joy's coworkers, he doesn't <a href=\"https://web.archive.org/web/20120210184000/http://web.cecs.pdx.edu/~kirkenda/joy84.html\" target=\"_blank\">attribute its success to its features</a>:</p>\n<blockquote>\n<p>I think the wonderful thing about vi is that it has such a good market share because we gave it away. Everybody has it now. So it actually had a chance to become part of what is perceived as basic UNIX. EMACS is a nice editor too, but because it costs hundreds of dollars, there will always be people who won't buy it. </p>\n</blockquote>\n<p>Vi was distributed for free with the popular <a href=\"https://en.wikipedia.org/wiki/Berkeley_Software_Distribution\" target=\"_blank\">BSD Unix</a> and was standardized in <a href=\"https://pubs.opengroup.org/onlinepubs/9799919799/\" target=\"_blank\">POSIX Issue 2</a>, meaning all Unix OSes had to have vi. That arguably is what made it popular, and why so many people ended up learning a modal editor. </p>\n<h3>Modal editing doesn't really spread outside of vim</h3>\n<div class=\"subscribe-form\"></div>\n<p>I think by the 90s, people started believing that modal editing was a Good Idea, if not an obvious one. That's why we see direct descendants of vi, most famously vim. It's also why extensible editors like Emacs and VSCode have vim-mode extensions, but these are but these are always simple emulation layers on top of a unimodal baselines. This was good for getting people used to the vim keybindings (I learned on <a href=\"https://en.wikipedia.org/wiki/Kile\" target=\"_blank\">Kile</a>) but it means people weren't really <em>doing</em> anything with modal editing. It was always \"The Vim Gimmick\".</p>\n<p>Modes also didn't take off anywhere else. There's no modal word processor, spreadsheet editor, or email client.<sup id=\"fnref:gmail\"><a class=\"footnote-ref\" href=\"#fn:gmail\">1</a></sup> <a href=\"https://www.visidata.org/\" target=\"_blank\">Visidata</a> is an extremely cool modal data exploration tool but it's pretty niche. Firefox used to have <a href=\"https://en.wikipedia.org/wiki/Vimperator\" target=\"_blank\">vimperator</a> (which was inspired by Vim) but that's defunct now. Modal software means modal editing which means vi.</p>\n<p>This has been changing a little, though! Nowadays we do see new modal text editors, like <a href=\"https://kakoune.org/\" target=\"_blank\">kakoune</a> and <a href=\"https://helix-editor.com/\" target=\"_blank\">Helix</a>, that don't just try to emulate vi but do entirely new things. These were made, though, in response to perceived shortcomings in vi's editing model. I think they are still classifiable as descendants. If vi never existed, would the developers of kak and helix have still made modal editors, or would they have explored different ideas? </p>\n<h3>People aren't clamouring for more experiments</h3>\n<p>Not too related to the overall picture, but a gripe of mine. Vi and vim have a set of hardcoded modes, and adding an entirely new mode is impossible. Like if a plugin (like vim's default <code>netrw</code>) adds a file explorer it should be able to add a filesystem mode, right? But it can't, so instead it waits for you to open the filesystem and then <a href=\"https://github.com/vim/vim/blob/0124320c97b0fbbb44613f42fc1c34fee6181fc8/runtime/pack/dist/opt/netrw/autoload/netrw.vim#L4867\" target=\"_blank\">adds 60 new mappings to normal mode</a>. There's no way to properly add a \"filesystem\" mode, a \"diff\" mode, a \"git\" mode, etc, so plugin developers have to <a href=\"https://www.hillelwayne.com/post/software-mimicry/\" target=\"_blank\">mimic</a> them.</p>\n<p>I don't think people see this as a problem, though! Neovim, which aims to fix all of the baggage in vim's legacy, didn't consider creating modes an important feature. Kak and Helix, which reimagine modal editing from from the ground up, don't support creating modes either.<sup id=\"fnref:helix\"><a class=\"footnote-ref\" href=\"#fn:helix\">2</a></sup> People aren't clamouring for new modes!</p>\n<h2>Modes are a niche power user feature</h2>\n<p>So far I've been trying to show that vi is, in Pablo's words, \"irreplaceable\". Editors weren't doing modal editing before Bravo, and even after vi became incredibly popular, unrelated editors did not adapt modal editing. At most, they got a vi emulation layer. Kak and helix complicate this story but I don't think they refute it; they appear much later and arguably count as descendants (so are related). </p>\n<p>I think the best explanation is that in a vacuum modal editing sounds like a bad idea. The mode is global state that users always have to know, which makes it dangerous. To use new modes well you have to memorize all of the keybindings, which makes it difficult. Modal editing has a brutal skill floor before it becomes more efficient than a unimodal, chorded editor like VSCode.</p>\n<p>That's why it originally appears in very specific circumstances, as early experiments in mouse UX and as a way of dealing with modem latencies. The fact we have vim today is a historical accident. </p>\n<p>And I'm glad for it! You can pry Neovim from my cold dead hands, you monsters.</p>\n<hr/>\n<h1><a href=\"https://www.p99conf.io/\" target=\"_blank\">P99 talk this Thursday</a>!</h1>\n<p>My talk, \"Designing Low-Latency Systems with TLA+\", is happening 10/23 at 11:40 central time. Tickets are free, the conf is online, and the talk's only 16 minutes, so come check it out!</p>\n<div class=\"footnote\">\n<hr/>\n<ol>\n<li id=\"fn:gmail\">\n<p>I guess if you squint <a href=\"https://support.google.com/mail/answer/6594?hl=en&co=GENIE.Platform%3DDesktop\" target=\"_blank\">gmail kinda counts</a> but it's basically an antifeature <a class=\"footnote-backref\" href=\"#fnref:gmail\" title=\"Jump back to footnote 1 in the text\">↩</a></p>\n</li>\n<li id=\"fn:helix\">\n<p>It looks like Helix supports <a href=\"https://docs.helix-editor.com/remapping.html\" target=\"_blank\">creating minor modes</a>, but these are only active for one keystroke, making them akin to a better, more ergonomic version of vim multikey mappings. <a class=\"footnote-backref\" href=\"#fnref:helix\" title=\"Jump back to footnote 2 in the text\">↩</a></p>\n</li>\n</ol>\n</div>",
          "url": "https://buttondown.com/hillelwayne/archive/modal-editing-is-a-weird-historical-contingency/",
          "published": "2025-10-21T16:46:24.000Z",
          "updated": "2025-10-21T16:46:24.000Z",
          "content": null,
          "image": null,
          "media": [],
          "authors": [],
          "categories": []
        },
        {
          "id": "https://buttondown.com/hillelwayne/archive/the-phase-change/",
          "title": "The Phase Change",
          "description": "<p>Last week I ran my first 10k.</p>\n<p>It wasn't a race or anything. I left that evening planning to run a 5k, and then three miles later thought \"what if I kept going?\"<sup id=\"fnref:distance\"><a class=\"footnote-ref\" href=\"#fn:distance\">1</a></sup></p>\n<p>I've been running for just over two years now. My goal was to run a mile, then three, then three at a pace faster than a power-walk. I wish I could say that I then found joy in running, but really I was just mad at myself for being so bad at it. Spite has always been my brightest muse.</p>\n<p>Looking back, the thing I find most fascinating is what progress looked like. I couldn't tell you if I was physically progressing steadily, but for sure mental progress moved in discrete jumps. For a long time a 5k was me pushing myself, then suddenly a \"phase change\" happens and it becomes something I can just do on a run. Sometime in the future the 10k will feel the same way.</p>\n<p>I've noticed this in a lot of other places. For every skill I know, my sense of myself follows a phase change. In every programming language I've ever learned, I lurch from \"bad\" to \"okay\" to \"good\". There's no \"20% bad / 80% normal\" in between. Pedagogical experts say that learning is about steadily building a <a href=\"https://teachtogether.tech/en/index.html#s:models\" target=\"_blank\">mental model</a> of the topic. It really feels like knowledge grows continuously, and then it suddenly becomes a model.</p>\n<p>Now, for all the time I spend writing about software history and software theory and stuff, my actually job boils down to <a href=\"https://www.hillelwayne.com/consulting/\" target=\"_blank\">teaching formal methods</a>. So I now have two questions about phase changes.</p>\n<p>The first is \"can we make phase changes happen faster?\" I don't know if this is even possible! I've found lots of ways to teach concepts faster, cover more ground in less time, so that people know the material more quickly. But it doesn't seem to speed up that very first phase change from \"this is foreign\" to \"this is normal\". Maybe we can't really do that until we've spent enough effort on understanding.</p>\n<p>So the second may be more productive: \"can we motivate people to keep going until the phase change?\" This is a lot easier to tackle! For example, removing frustration makes a huge difference. Getting a proper pair of running shoes made running so much less unpleasant, and made me more willing to keep putting in the hours. For teaching tech topics like formal methods, this often takes the form of better tooling and troubleshooting info.</p>\n<p>We can also reduce the effort of investing time. This is also why I prefer to pair on writing specifications with clients and not just write specs for them. It's more work for them than fobbing it all off on me, but a whole lot <em>less</em> work than writing the spec by themselves, so they'll put in time and gradually develop skills on their own.</p>\n<p>Question two seems much more fruitful than question one but also so much less interesting! Speeding up the phase change feels like the kind of dream that empires are built on. I know I'm going to keep obsessing over it, even if that leads nowhere.</p>\n<div class=\"footnote\">\n<hr/>\n<ol>\n<li id=\"fn:distance\">\n<p>For non-running Americans: 5km is about 3.1 miles, and 10km is 6.2. <a class=\"footnote-backref\" href=\"#fnref:distance\" title=\"Jump back to footnote 1 in the text\">↩</a></p>\n</li>\n</ol>\n</div>",
          "url": "https://buttondown.com/hillelwayne/archive/the-phase-change/",
          "published": "2025-10-16T14:59:25.000Z",
          "updated": "2025-10-16T14:59:25.000Z",
          "content": null,
          "image": null,
          "media": [],
          "authors": [],
          "categories": []
        },
        {
          "id": "https://buttondown.com/hillelwayne/archive/three-ways-formally-verified-code-can-go-wrong-in/",
          "title": "Three ways formally verified code can go wrong in practice",
          "description": "<h3>New Logic for Programmers Release!</h3>\n<p><a href=\"https://leanpub.com/logic/\" rel=\"noopener noreferrer nofollow\" target=\"_blank\">v0.12 is now available</a>! This should be the last major content release. The next few months are going to be technical review, copyediting and polishing, with a hopeful 1.0 release in March. <a href=\"https://github.com/logicforprogrammers/book-assets/blob/master/CHANGELOG.md\" rel=\"noopener noreferrer nofollow\" target=\"_blank\">Full release notes here</a>.</p>\n<figure><img alt=\"Cover of the boooooook\" draggable=\"false\" src=\"https://assets.buttondown.email/images/92b4a35d-2bdd-416a-92c7-15ff42b49d8d.jpg?w=960&fit=max\"/><figcaption></figcaption></figure>\n<p class=\"empty-line\" style=\"height:16px; margin:0px !important;\"></p>\n<h1>Three ways formally verified code can go wrong in practice</h1>\n<p>I run this small project called <a href=\"https://github.com/hwayne/lets-prove-leftpad\" rel=\"noopener noreferrer nofollow\" target=\"_blank\">Let's Prove Leftpad</a>, where people submit formally verified proofs of the <a href=\"https://en.wikipedia.org/wiki/Npm_left-pad_incident\" rel=\"noopener noreferrer nofollow\" target=\"_blank\">eponymous meme</a>. Recently I read <a href=\"https://lukeplant.me.uk/blog/posts/breaking-provably-correct-leftpad/\" rel=\"noopener noreferrer nofollow\" target=\"_blank\">Breaking “provably correct” Leftpad</a>, which argued that most (if not all) of the provably correct leftpads have bugs! The lean proof, for example, <em>should</em> render <code>leftpad('-', 9, אֳֽ֑)</code> as <code>---------אֳֽ֑</code>, but actually does <code>------אֳֽ֑</code>.</p>\n<p>You can read the article for a good explanation of why this goes wrong (Unicode). The actual problem is that correct can mean two different things, and this leads to confusion about how much formal methods can actually guarantee us. So I see this as a great opportunity to talk about the nature of proof, correctness, and how \"correct\" code can still have bugs.</p>\n<h2>What we talk about when we talk about correctness</h2>\n<p>In most of the real world, correct means \"no bugs\". Except \"bugs\" isn't a very clear category. A bug is anything that causes someone to say \"this isn't working right, there's a bug.\" Being too slow is a bug, a typo is a bug, etc. \"correct\" is a little fuzzy.</p>\n<p>In formal methods, \"correct\" has a very specific and precise meaning: the code conforms to a <strong>specification</strong> (or \"spec\"). The spec is a higher-level description of what is supposed the code's properties, usually something we can't just directly implement. Let's look at the most popular kind of proven specification:</p>\n<div class=\"codehilite\"><pre><span></span><code><span class=\"c1\">-- Haskell</span>\n<span class=\"nf\">inc</span><span class=\"w\"> </span><span class=\"ow\">::</span><span class=\"w\"> </span><span class=\"kt\">Int</span><span class=\"w\"> </span><span class=\"o\">-&</span><span class=\"n\">gt</span><span class=\"p\">;</span><span class=\"w\"> </span><span class=\"kt\">Int</span>\n<span class=\"nf\">inc</span><span class=\"w\"> </span><span class=\"n\">x</span><span class=\"w\"> </span><span class=\"ow\">=</span><span class=\"w\"> </span><span class=\"n\">x</span><span class=\"w\"> </span><span class=\"o\">+</span><span class=\"w\"> </span><span class=\"mi\">1</span>\n</code></pre></div>\n<p>The type signature <code>Int -> Int</code> is a specification! It corresponds to the logical statement <code>all x in Int: inc(x) in Int</code>. The Haskell type checker can automatically verify this for us. It cannot, however, verify properties like <code>all x in Int: inc(x) > x</code>. Formal verification is concerned with verifying arbitrary properties beyond what is (easily) automatically verifiable. Most often, this takes the form of proof. A human manually writes a proof that the code conforms to its specification, and the prover checks that the proof is correct.</p>\n<p>Even if we have a proof of \"correctness\", though, there's a few different ways the code can still have bugs.</p>\n<h3>1. The proof is invalid</h3>\n<p>For some reason the proof doesn't actually show the code matches the specification. This is pretty common in pencil-and-paper verification, where the proof is checked by someone saying \"yep looks good to me\". It's much rarer when doing formal verification but it can still happen in a couple of specific cases:</p>\n<ol><li><p>The theorem prover itself has a bug (in the code or introduced in the compiled binary) that makes it accept an incorrect proof. This is something people are really concerned about but it's so much rarer than every other way verified code goes wrong, so is only included for completeness.</p></li><li><p>For convenience, most provers and FM languages have an \"just accept this statement is true\" feature. This helps you work on the big picture proof and fill in the details later. If you leave in a shortcut, <em>and</em> the compiler is configured to allow code-with-proof-assumptions to compile, <em>then</em> you can compile incorrect code that \"passes the proof checker\". You really should know better, though.</p></li></ol>\n<div class=\"subscribe-form\"></div>\n<h3>2. The properties are wrong</h3>\n<blockquote><figure><img alt=\"The horrible bug you had wasn't covered in the specification/came from some other module/etc\" draggable=\"false\" src=\"https://cdn.prod.website-files.com/673b407e535dbf3b547179ff/681ca0bf4a045f39f785faeb_AD_4nXfFhdn6DGmgLAcmaUNHl9a3Nog8gH8Hluve5Kof7zLk4CyOlD4zCmCqVJaowKqu-pTicwZ393jE7anIrjYZTSuRvGiYhFhAkkX9vifNt9vEWYwZUp65hsbrRTmZzRgb9vgu7n7buA.png\"/><figcaption></figcaption></figure><p><a href=\"https://www.galois.com/articles/what-works-and-doesnt-selling-formal-methods\" rel=\"noopener noreferrer nofollow\" target=\"_blank\">Galois</a></p></blockquote>\n<p>This code is provably correct:</p>\n<div class=\"codehilite\"><pre><span></span><code><span class=\"nf\">inc</span><span class=\"w\"> </span><span class=\"ow\">::</span><span class=\"w\"> </span><span class=\"kt\">Int</span><span class=\"w\"> </span><span class=\"o\">-&</span><span class=\"n\">gt</span><span class=\"p\">;</span><span class=\"w\"> </span><span class=\"kt\">Int</span>\n<span class=\"nf\">inc</span><span class=\"w\"> </span><span class=\"n\">x</span><span class=\"w\"> </span><span class=\"ow\">=</span><span class=\"w\"> </span><span class=\"n\">x</span><span class=\"o\">-</span><span class=\"mi\">1</span>\n</code></pre></div>\n<p>The only specification I've given is the type signature <code>Int -> Int</code>. At no point did I put the property <code>inc(x) > x</code> in my specification, so it doesn't matter that it doesn't hold, the code is still \"correct\".</p>\n<p>This is what \"went wrong\" with the leftpad proofs. They do <em>not</em> prove the property \"<code>leftpad(c, n, s)</code> will take up either <code>n</code> spaces on the screen or however many characters <code>s</code> takes up (if more than <code>n</code>)\". They prove the weaker property \"<code>len(leftpad(c, n, s)) == max(n, len(s))</code>, for however you want to define <code>len(string)</code>\". The second is a rough proxy for the first that works in most cases, but if someone really needs the former property they are liable to experience a bug.</p>\n<p>Why don't we prove the stronger property? Sometimes it's because the code is meant to be used one way and people want to use it another way. This can lead to accusations that the developer is \"misusing the provably correct code\" but this should more often be seen as the verification expert failing to educate devs on was actually \"proven\".</p>\n<p>Sometimes it's because the property is too hard to prove. \"Outputs are visually aligned\" is a proof about Unicode inputs, and the <em>core</em> Unicode specification is <a href=\"https://www.unicode.org/versions/Unicode17.0.0/UnicodeStandard-17.0.pdf\" rel=\"noopener noreferrer nofollow\" target=\"_blank\">1,243 pages long</a>.</p>\n<p>Sometimes it's because the property we want is too hard to <em>express</em>. How do you mathematically represent \"people will perceive the output as being visually aligned\"? Is it OS and font dependent? These two lines are exactly five characters but not visually aligned:</p>\n<blockquote><p>|||||</p><p>MMMMM</p></blockquote>\n<p>Or maybe they are aligned for you! I don't know, lots of people read email in a monospace font. \"We can't express the property\" comes up a lot when dealing with human/business concepts as opposed to mathematical/computational ones.</p>\n<p>Finally, there's just the possibility of a brain fart. All of the proofs in <a href=\"https://research.google/blog/extra-extra-read-all-about-it-nearly-all-binary-searches-and-mergesorts-are-broken/\" rel=\"noopener noreferrer nofollow\" target=\"_blank\">Nearly All Binary Searches and Mergesorts are Broken</a> are like this. They (informally) proved the correctness of binary search with unbound integers, forgetting that many programming languages use <em>machine</em> integers, where a large enough sum can overflow.</p>\n<h3>3. The assumptions are wrong</h3>\n<p>This is arguably the most important and most subtle source of bugs. Most properties we prove aren't \"<code>X</code> is always true\". They are \"<em>assuming</em> <code>Y</code> is true, <code>X</code> is also true\". Then if <code>Y</code> is not true, the proof no longer guarantees <code>X</code>. A good example of this is binary <s>sort</s> <em>search</em>, which only correctly finds elements <em>assuming</em> the input list is sorted. If the list is not sorted, it will not work correctly.</p>\n<p>Formal verification adds two more wrinkles. One: sometimes we need assumptions to make the property valid, but we can also add them to make the proof easier. So the code can be bug-free even if the assumptions used to verify it no longer hold! Even if a leftpad implements visual alignment for all Unicode glyphs, it will be a lot easier to <em>prove</em> visual alignment for just ASCII strings and padding.</p>\n<p>Two: we need make a lot of <em>environmental</em> assumptions that are outside our control. Does the algorithm return output or use the stack? Need to assume that there's sufficient memory to store stuff. Does it use any variables? Need to assume nothing is concurrently modifying them. Does it use an external service? Need to assume the vendor doesn't change the API or response formats. You need to assume the compiler worked correctly, the hardware isn't faulty, and the OS doesn't mess with things, etc. Any of these could change well after the code is proven and deployed, meaning formal verification can't be a one-and-done thing.</p>\n<p>You don't actually have to assume most of these, but each assumption drop makes the proof harder and the properties you can prove more restricted. Remember, the code might still be bug-free even if the environmental assumptions change, so there's a tradeoff in time spent proving vs doing other useful work.</p>\n<p>Another common source of \"assumptions\" is when verified code depends on unverified code. The Rust compiler can prove that safe code doesn't have a memory bug <em>assuming</em> unsafe code does not have one either, but depends on the human to confirm that assumption. <a href=\"https://ucsd-progsys.github.io/liquidhaskell/\" rel=\"noopener noreferrer nofollow\" target=\"_blank\">Liquid Haskell</a> is verifiable but can also call regular Haskell libraries, which are unverified. We need to assume that code is correct (in the \"conforms to spec\") sense, and if it's not, our proof can be \"correct\" and still cause bugs.</p>\n<hr/><p>These boundaries are fuzzy. I wrote that the \"binary search\" bug happened because they proved the wrong property, but you can just as well argue that it was a broken assumption (that integers could not overflow). What really matters is having a clear understanding of what \"this code is proven correct\" actually <em>tells</em> you. Where can you use it safely? When should you worry? How do you communicate all of this to your teammates?</p>\n<p>Good lord it's already Friday</p>",
          "url": "https://buttondown.com/hillelwayne/archive/three-ways-formally-verified-code-can-go-wrong-in/",
          "published": "2025-10-10T17:06:19.000Z",
          "updated": "2025-10-10T17:06:19.000Z",
          "content": null,
          "image": null,
          "media": [],
          "authors": [],
          "categories": []
        },
        {
          "id": "https://buttondown.com/hillelwayne/archive/new-blog-post-a-very-early-history-of-algebraic/",
          "title": "New Blog Post: \" A Very Early History of Algebraic Data Types\"",
          "description": "<p>Last week I said that this week's newsletter would be a brief history of algebraic data types.</p>\n<p>I was wrong.</p>\n<p>That history is now a <a href=\"https://www.hillelwayne.com/post/algdt-history/\" target=\"_blank\">3500 word blog post</a>.</p>\n<p><a href=\"https://www.patreon.com/posts/blog-notes-very-139696324?utm_medium=clipboard_copy&utm_source=copyLink&utm_campaign=postshare_creator&utm_content=join_link\" target=\"_blank\">Patreon blog notes here</a>.</p>\n<hr/>\n<h3>I'm speaking at <a href=\"https://www.p99conf.io/\" target=\"_blank\">P99 Conf</a>!</h3>\n<p>My talk, \"Designing Low-Latency Systems with TLA+\", is happening 10/23 at 11:30 central time. It's an online conf and the talk's only 16 minutes, so come check it out!</p>",
          "url": "https://buttondown.com/hillelwayne/archive/new-blog-post-a-very-early-history-of-algebraic/",
          "published": "2025-09-25T16:50:58.000Z",
          "updated": "2025-09-25T16:50:58.000Z",
          "content": null,
          "image": null,
          "media": [],
          "authors": [],
          "categories": []
        },
        {
          "id": "https://buttondown.com/hillelwayne/archive/many-hard-leetcode-problems-are-easy-constraint/",
          "title": "Many Hard Leetcode Problems are Easy Constraint Problems",
          "description": "<p>In my first interview out of college I was asked the change counter problem:</p>\n<blockquote>\n<p>Given a set of coin denominations, find the minimum number of coins required to make change for a given number. IE for USA coinage and 37 cents, the minimum number is four (quarter, dime, 2 pennies).</p>\n</blockquote>\n<p>I implemented the simple greedy algorithm and immediately fell into the trap of the question: the greedy algorithm only works for \"well-behaved\" denominations. If the coin values were <code>[10, 9, 1]</code>, then making 37 cents would take 10 coins in the greedy algorithm but only 4 coins optimally (<code>10+9+9+9</code>). The \"smart\" answer is to use a dynamic programming algorithm, which I didn't know how to do. So I failed the interview.</p>\n<p>But you only need dynamic programming if you're writing your own algorithm. It's really easy if you throw it into a constraint solver like <a href=\"https://www.minizinc.org/\" target=\"_blank\">MiniZinc</a> and call it a day. </p>\n<div class=\"codehilite\"><pre><span></span><code>int: total;\narray[int] of int: values = [10, 9, 1];\narray[index_set(values)] of var 0..: coins;\n\nconstraint sum (c in index_set(coins)) (coins[c] * values[c]) == total;\nsolve minimize sum(coins);\n</code></pre></div>\n<p>You can try this online <a href=\"https://play.minizinc.dev/\" target=\"_blank\">here</a>. It'll give you a prompt to put in <code>total</code> and then give you successively-better solutions:</p>\n<div class=\"codehilite\"><pre><span></span><code>coins = [0, 0, 37];\n----------\ncoins = [0, 1, 28];\n----------\ncoins = [0, 2, 19];\n----------\ncoins = [0, 3, 10];\n----------\ncoins = [0, 4, 1];\n----------\ncoins = [1, 3, 0];\n----------\n</code></pre></div>\n<p class=\"empty-line\" style=\"height:16px; margin:0px !important;\"></p>\n<p>Lots of similar interview questions are this kind of mathematical optimization problem, where we have to find the maximum or minimum of a function corresponding to constraints. They're hard in programming languages because programming languages are too low-level. They are also exactly the problems that constraint solvers were designed to solve. Hard leetcode problems are easy constraint problems.<sup id=\"fnref:leetcode\"><a class=\"footnote-ref\" href=\"#fn:leetcode\">1</a></sup> Here I'm using MiniZinc, but you could just as easily use Z3 or OR-Tools or whatever your favorite generalized solver is.</p>\n<h3>More examples</h3>\n<p>This was a question in a different interview (which I thankfully passed):</p>\n<blockquote>\n<p>Given a list of stock prices through the day, find maximum profit you can get by buying one stock and selling one stock later.</p>\n</blockquote>\n<p>It's easy to do in O(n^2) time, or if you are clever, you can do it in O(n). Or you could be not clever at all and just write it as a constraint problem:</p>\n<div class=\"codehilite\"><pre><span></span><code>array[int] of int: prices = [3, 1, 4, 1, 5, 9, 2, 6, 5, 3, 5, 8];\nvar int: buy;\nvar int: sell;\nvar int: profit = prices[sell] - prices[buy];\n\nconstraint sell > buy;\nconstraint profit > 0;\nsolve maximize profit;\n</code></pre></div>\n<p>Reminder, link to trying it online <a href=\"https://play.minizinc.dev/\" target=\"_blank\">here</a>. While working at that job, one interview question we tested out was:</p>\n<blockquote>\n<p>Given a list, determine if three numbers in that list can be added or subtracted to give 0? </p>\n</blockquote>\n<p>This is a satisfaction problem, not a constraint problem: we don't need the \"best answer\", any answer will do. We eventually decided against it for being too tricky for the engineers we were targeting. But it's not tricky in a solver; </p>\n<div class=\"codehilite\"><pre><span></span><code>include \"globals.mzn\";\narray[int] of int: numbers = [3, 1, 4, 1, 5, 9, 2, 6, 5, 3, 5, 8];\narray[index_set(numbers)] of var {0, -1, 1}: choices;\n\nconstraint sum(n in index_set(numbers)) (numbers[n] * choices[n]) = 0;\nconstraint count(choices, -1) + count(choices, 1) = 3;\nsolve satisfy;\n</code></pre></div>\n<p>Okay, one last one, a problem I saw last year at <a href=\"https://chicagopython.github.io/algosig/\" target=\"_blank\">Chipy AlgoSIG</a>. Basically they pick some leetcode problems and we all do them. I failed to solve <a href=\"https://leetcode.com/problems/largest-rectangle-in-histogram/description/\" target=\"_blank\">this one</a>:</p>\n<blockquote>\n<p>Given an array of integers heights representing the histogram's bar height where the width of each bar is 1, return the area of the largest rectangle in the histogram.</p>\n<p><img alt=\"example from leetcode link\" class=\"newsletter-image\" src=\"https://assets.buttondown.email/images/63337f78-7138-4b21-87a0-917c0c5b1706.jpg?w=960&fit=max\"/></p>\n</blockquote>\n<p>The \"proper\" solution is a tricky thing involving tracking lots of bookkeeping states, which you can completely bypass by expressing it as constraints:</p>\n<div class=\"codehilite\"><pre><span></span><code>array[int] of int: numbers = [2,1,5,6,2,3];\n\nvar 1..length(numbers): x; \nvar 1..length(numbers): dx;\nvar 1..: y;\n\nconstraint x + dx <= length(numbers);\nconstraint forall (i in x..(x+dx)) (y <= numbers[i]);\n\nvar int: area = (dx+1)*y;\nsolve maximize area;\n\noutput [\"(\\(x)->\\(x+dx))*\\(y) = \\(area)\"]\n</code></pre></div>\n<p>There's even a way to <a href=\"https://docs.minizinc.dev/en/2.9.3/visualisation.html\" target=\"_blank\">automatically visualize the solution</a> (using <code>vis_geost_2d</code>), but I didn't feel like figuring it out in time for the newsletter.</p>\n<h3>Is this better?</h3>\n<p>Now if I actually brought these questions to an interview the interviewee could ruin my day by asking \"what's the runtime complexity?\" Constraint solvers runtimes are unpredictable and almost always slower than an ideal bespoke algorithm because they are more expressive, in what I refer to as the <a href=\"https://buttondown.com/hillelwayne/archive/the-capability-tractability-tradeoff/\" target=\"_blank\">capability/tractability tradeoff</a>. But even so, they'll do way better than a <em>bad</em> bespoke algorithm, and I'm not experienced enough in handwriting algorithms to consistently beat a solver.</p>\n<p>The real advantage of solvers, though, is how well they handle new constraints. Take the stock picking problem above. I can write an O(n²) algorithm in a few minutes and the O(n) algorithm if you give me some time to think. Now change the problem to</p>\n<blockquote>\n<p>Maximize the profit by buying and selling up to <code>max_sales</code> stocks, but you can only buy or sell one stock at a given time and you can only hold up to <code>max_hold</code> stocks at a time?</p>\n</blockquote>\n<p>That's a way harder problem to write even an inefficient algorithm for! While the constraint problem is only a tiny bit more complicated:</p>\n<div class=\"codehilite\"><pre><span></span><code>include \"globals.mzn\";\nint: max_sales = 3;\nint: max_hold = 2;\narray[int] of int: prices = [3, 1, 4, 1, 5, 9, 2, 6, 5, 3, 5, 8];\narray [1..max_sales] of var int: buy;\narray [1..max_sales] of var int: sell;\narray [index_set(prices)] of var 0..max_hold: stocks_held;\nvar int: profit = sum(s in 1..max_sales) (prices[sell[s]] - prices[buy[s]]);\n\nconstraint forall (s in 1..max_sales) (sell[s] > buy[s]);\nconstraint profit > 0;\n\nconstraint forall(i in index_set(prices)) (stocks_held[i] = (count(s in 1..max_sales) (buy[s] <= i) - count(s in 1..max_sales) (sell[s] <= i)));\nconstraint alldifferent(buy ++ sell);\nsolve maximize profit;\n\noutput [\"buy at \\(buy)\\n\", \"sell at \\(sell)\\n\", \"for \\(profit)\"];\n</code></pre></div>\n<p class=\"empty-line\" style=\"height:16px; margin:0px !important;\"></p>\n<p>Most constraint solving examples online are puzzles, like <a href=\"https://docs.minizinc.dev/en/stable/modelling2.html#ex-sudoku\" target=\"_blank\">Sudoku</a> or \"<a href=\"https://docs.minizinc.dev/en/stable/modelling2.html#ex-smm\" target=\"_blank\">SEND + MORE = MONEY</a>\". Solving leetcode problems would be a more interesting demonstration. And you get more interesting opportunities to teach optimizations, like symmetry breaking.</p>\n<hr/>\n<h3>Update for the Internet</h3>\n<p>This was sent as a weekly newsletter, which is usually on topics like <a href=\"https://buttondown.com/hillelwayne/archive/why-do-we-call-it-boilerplate-code\" target=\"_blank\">software history</a>, <a href=\"https://buttondown.com/hillelwayne/archive/the-seven-specification-ur-languages/\" target=\"_blank\">formal methods</a>, <a href=\"https://buttondown.com/hillelwayne/archive/i-formally-modeled-dreidel-for-no-good-reason/\" target=\"_blank\">unusual technologies</a>, and the <a href=\"https://buttondown.com/hillelwayne/archive/be-suspicious-of-success/\" target=\"_blank\">theory of software engineering</a>. You can subscribe here: </p>\n<div class=\"subscribe-form\"></div>\n<div class=\"footnote\">\n<hr/>\n<ol>\n<li id=\"fn:leetcode\">\n<p>Because my dad will email me if I don't explain this: \"leetcode\" is slang for \"tricky algorithmic interview questions that have little-to-no relevance in the actual job you're interviewing for.\" It's from <a href=\"https://leetcode.com/\" target=\"_blank\">leetcode.com</a>. <a class=\"footnote-backref\" href=\"#fnref:leetcode\" title=\"Jump back to footnote 1 in the text\">↩</a></p>\n</li>\n</ol>\n</div>",
          "url": "https://buttondown.com/hillelwayne/archive/many-hard-leetcode-problems-are-easy-constraint/",
          "published": "2025-09-10T13:00:00.000Z",
          "updated": "2025-09-10T13:00:00.000Z",
          "content": null,
          "image": null,
          "media": [],
          "authors": [],
          "categories": []
        },
        {
          "id": "https://buttondown.com/hillelwayne/archive/the-angels-and-demons-of-nondeterminism/",
          "title": "The Angels and Demons of Nondeterminism",
          "description": "<p>Greetings everyone! You might have noticed that it's September and I don't have the next version of <em>Logic for Programmers</em> ready. As penance, <a href=\"https://leanpub.com/logic/c/september-2025-kuBCrhBnUzb7\" target=\"_blank\">here's ten free copies of the book</a>.</p>\n<p>So a few months ago I wrote <a href=\"https://buttondown.com/hillelwayne/archive/five-kinds-of-nondeterminism/\" target=\"_blank\">a newsletter</a> about how we use nondeterminism in formal methods.  The overarching idea:</p>\n<ol>\n<li>Nondeterminism is when multiple paths are possible from a starting state.</li>\n<li>A system preserves a property if it holds on <em>all</em> possible paths. If even one path violates the property, then we have a bug.</li>\n</ol>\n<p>An intuitive model of this is that for this is that when faced with a nondeterministic choice, the system always makes the <em>worst possible choice</em>. This is sometimes called <strong>demonic nondeterminism</strong> and is favored in formal methods because we are paranoid to a fault.</p>\n<p>The opposite would be <strong>angelic nondeterminism</strong>, where the system always makes the <em>best possible choice</em>. A property then holds if <em>any</em> possible path satisfies that property.<sup id=\"fnref:duals\"><a class=\"footnote-ref\" href=\"#fn:duals\">1</a></sup> This is not as common in FM, but it still has its uses! \"Players can access the secret level\" or \"<a href=\"https://www.hillelwayne.com/post/safety-and-liveness/#other-properties\" target=\"_blank\">We can always shut down the computer</a>\" are <strong>reachability</strong> properties, that something is possible even if not actually done.</p>\n<p>In broader computer science research, I'd say that angelic nondeterminism is more popular, due to its widespread use in complexity analysis and programming languages.</p>\n<h3>Complexity Analysis</h3>\n<p>P is the set of all \"decision problems\" (<em>basically</em>, boolean functions) can be solved in polynomial time: there's an algorithm that's worst-case in <code>O(n)</code>, <code>O(n²)</code>, <code>O(n³)</code>, etc.<sup id=\"fnref:big-o\"><a class=\"footnote-ref\" href=\"#fn:big-o\">2</a></sup>  NP is the set of all problems that can be solved in polynomial time by an algorithm with <em>angelic nondeterminism</em>.<sup id=\"fnref:TM\"><a class=\"footnote-ref\" href=\"#fn:TM\">3</a></sup> For example, the question \"does list <code>l</code> contain <code>x</code>\" can be solved in O(1) time by a nondeterministic algorithm:</p>\n<div class=\"codehilite\"><pre><span></span><code>fun is_member(l: List[T], x: T): bool {\n  if l == [] {return false};\n\n  guess i in 0..<(len(l)-1);\n  return l[i] == x;\n}\n</code></pre></div>\n<p>Say call <code>is_member([a, b, c, d], c)</code>. The best possible choice would be to guess <code>i = 2</code>, which would correctly return true. Now call <code>is_member([a, b], d)</code>. No matter what we guess, the algorithm correctly returns false. and just return false. Ergo, O(1). NP stands for \"Nondeterministic Polynomial\". </p>\n<p>(And I just now realized something pretty cool: you can say that P is the set of all problems solvable in polynomial time under <em>demonic nondeterminism</em>, which is a nice parallel between the two classes.)</p>\n<p>Computer scientists have proven that angelic nondeterminism doesn't give us any more \"power\": there are no problems solvable with AN that aren't also solvable deterministically. The big question is whether AN is more <em>efficient</em>: it is widely believed, but not <em>proven</em>, that there are problems in NP but not in P. Most famously, \"Is there any variable assignment that makes this boolean formula true?\" A polynomial AN algorithm is again easy:</p>\n<div class=\"codehilite\"><pre><span></span><code>fun SAT(f(x1, x2, …: bool): bool): bool {\n   N = num_params(f)\n   for i in 1..=num_params(f) {\n     guess x_i in {true, false}\n   }\n\n   return f(x_1, x_2, …)\n}\n</code></pre></div>\n<p>The best deterministic algorithms we have to solve the same problem are worst-case exponential with the number of boolean parameters. This a real frustrating problem because real computers don't have angelic nondeterminism, so problems like SAT remain hard. We can solve most \"well-behaved\" instances of the problem <a href=\"https://www.hillelwayne.com/post/np-hard/\" target=\"_blank\">in reasonable time</a>, but the worst-case instances get intractable real fast.</p>\n<h3>Means of Abstraction</h3>\n<div class=\"subscribe-form\"></div>\n<p>We can directly turn an AN algorithm into a (possibly much slower) deterministic algorithm, such as by <a href=\"https://en.wikipedia.org/wiki/Backtracking\" target=\"_blank\">backtracking</a>. This makes AN a pretty good abstraction over what an algorithm is doing. Does the regex <code>(a+b)\\1+</code> match \"abaabaabaab\"? Yes, if the regex engine nondeterministically guesses that it needs to start at the third letter and make the group <code>aab</code>. How does my PL's regex implementation find that match? I dunno, backtracking or <a href=\"https://swtch.com/~rsc/regexp/regexp1.html\" target=\"_blank\">NFA construction</a> or something, I don't need to know the deterministic specifics in order to use the nondeterministic abstraction.</p>\n<p>Neel Krishnaswami has <a href=\"https://semantic-domain.blogspot.com/2013/07/what-declarative-languages-are.html\" target=\"_blank\">a great definition of 'declarative language'</a>: \"any language with a semantics has some nontrivial existential quantifiers in it\". I'm not sure if this is <em>identical</em> to saying \"a language with an angelic nondeterministic abstraction\", but they must be pretty close, and all of his examples match:</p>\n<ul>\n<li>SQL's selects and joins</li>\n<li>Parsing DSLs</li>\n<li>Logic programming's unification</li>\n<li>Constraint solving</li>\n</ul>\n<p>On top of that I'd add CSS selectors and <a href=\"https://www.hillelwayne.com/post/picat/\" target=\"_blank\">planner's actions</a>; all nondeterministic abstractions over a deterministic implementation. He also says that the things programmers hate most in declarative languages are features that \"that expose the operational model\": constraint solver search strategies, Prolog cuts, regex backreferences, etc. Which again matches my experiences with angelic nondeterminism: I dread features that force me to understand the deterministic implementation. But they're necessary, since P probably != NP and so we need to worry about operational optimizations.</p>\n<h3>Eldritch Nondeterminism</h3>\n<p>If you need to know the <a href=\"https://en.wikipedia.org/wiki/PP_(complexity)\" target=\"_blank\">ratio of good/bad paths</a>, <a href=\"https://en.wikipedia.org/wiki/%E2%99%AFP\" target=\"_blank\">the number of good paths</a>, or probability, or anything more than \"there is a good path\" or \"there is a bad path\", you are beyond the reach of heaven or hell.</p>\n<div class=\"footnote\">\n<hr/>\n<ol>\n<li id=\"fn:duals\">\n<p>Angelic and demonic nondeterminism are <a href=\"https://buttondown.com/hillelwayne/archive/logical-duals-in-software-engineering/\" target=\"_blank\">duals</a>: angelic returns \"yes\" if <code>some choice: correct</code> and demonic returns \"no\" if <code>!all choice: correct</code>, which is the same as <code>some choice: !correct</code>. <a class=\"footnote-backref\" href=\"#fnref:duals\" title=\"Jump back to footnote 1 in the text\">↩</a></p>\n</li>\n<li id=\"fn:big-o\">\n<p>Pet peeve about Big-O notation: <code>O(n²)</code> is the <em>set</em> of all algorithms that, for sufficiently large problem sizes, grow no faster that quadratically. \"Bubblesort has <code>O(n²)</code> complexity\" <em>should</em> be written <code>Bubblesort in O(n²)</code>, <em>not</em> <code>Bubblesort = O(n²)</code>. <a class=\"footnote-backref\" href=\"#fnref:big-o\" title=\"Jump back to footnote 2 in the text\">↩</a></p>\n</li>\n<li id=\"fn:TM\">\n<p>To be precise, solvable in polynomial time by a <em>Nondeterministic Turing Machine</em>, a very particular model of computation. We can broadly talk about P and NP without framing everything in terms of Turing machines, but some details of complexity classes (like the existence \"weak NP-hardness\") kinda need Turing machines to make sense. <a class=\"footnote-backref\" href=\"#fnref:TM\" title=\"Jump back to footnote 3 in the text\">↩</a></p>\n</li>\n</ol>\n</div>",
          "url": "https://buttondown.com/hillelwayne/archive/the-angels-and-demons-of-nondeterminism/",
          "published": "2025-09-04T14:00:00.000Z",
          "updated": "2025-09-04T14:00:00.000Z",
          "content": null,
          "image": null,
          "media": [],
          "authors": [],
          "categories": []
        },
        {
          "id": "https://buttondown.com/hillelwayne/archive/logical-duals-in-software-engineering/",
          "title": "Logical Duals in Software Engineering",
          "description": "<p>(<a href=\"https://buttondown.com/hillelwayne/archive/sapir-whorf-does-not-apply-to-programming/\" target=\"_blank\">Last week's newsletter</a> took too long and I'm way behind on <a href=\"https://leanpub.com/logic/\" target=\"_blank\">Logic for Programmers</a> revisions so short one this time.<sup id=\"fnref:retread\"><a class=\"footnote-ref\" href=\"#fn:retread\">1</a></sup>)</p>\n<p>In classical logic, two operators <code>F/G</code> are <strong>duals</strong> if <code>F(x) = !G(!x)</code>. Three examples:</p>\n<ol>\n<li><code>x || y</code> is the same as <code>!(!x && !y)</code>.</li>\n<li><code><>P</code> (\"P is possibly true\") is the same as <code>![]!P</code> (\"not P isn't definitely true\").</li>\n<li><code>some x in set: P(x)</code> is the same as <code>!(all x in set: !P(x))</code>.</li>\n</ol>\n<p>(1) is just a version of De Morgan's Law, which we regularly use to simplify boolean expressions. (2) is important in modal logic but has niche applications in software engineering, mostly in how it powers various formal methods.<sup id=\"fnref:fm\"><a class=\"footnote-ref\" href=\"#fn:fm\">2</a></sup> The real interesting one is (3), the \"quantifier duals\". We use lots of software tools to either <em>find</em> a value satisfying <code>P</code> or <em>check</em> that all values satisfy <code>P</code>. And by duality, any tool that does one can do the other, by seeing if it <em>fails</em> to find/check <code>!P</code>. Some examples in the wild:</p>\n<ul>\n<li>Z3 is used to solve mathematical constraints, like \"find x, where <code>f(x) >= 0</code>. If I want to prove a property like \"f is always positive\", I ask z3 to solve \"find x, where <code>!(f(x) >= 0)</code>, and see if that is unsatisfiable. This use case powers a LOT of theorem provers and formal verification tooling.</li>\n<li>Property testing checks that all inputs to a code block satisfy a property. I've used it to generate complex inputs with certain properties by checking that all inputs <em>don't</em> satisfy the property and reading out the test failure.</li>\n<li>Model checkers check that all behaviors of a specification satisfy a property, so we can find a behavior that reaches a goal state G by checking that all states are <code>!G</code>. <a href=\"https://github.com/tlaplus/Examples/blob/master/specifications/DieHard/DieHard.tla\" target=\"_blank\">Here's TLA+ solving a puzzle this way</a>.<sup id=\"fnref:antithesis\"><a class=\"footnote-ref\" href=\"#fn:antithesis\">3</a></sup></li>\n<li>Planners find behaviors that reach a goal state, so we can check if all behaviors satisfy a property P by asking it to reach goal state <code>!P</code>.</li>\n<li>The problem \"find the shortest <a href=\"https://en.wikipedia.org/wiki/Travelling_salesman_problem\" target=\"_blank\">traveling salesman route</a>\" can be broken into <code>some route: distance(route) = n</code> and <code>all route: !(distance(route) < n)</code>. Then a route finder can find the first, and then convert the second into a <code>some</code> and <em>fail</em> to find it, proving <code>n</code> is optimal.</li>\n</ul>\n<p>Even cooler to me is when a tool does <em>both</em> finding and checking, but gives them different \"meanings\". In SQL, <code>some x: P(x)</code> is true if we can <em>query</em> for <code>P(x)</code> and get a nonempty response, while <code>all x: P(x)</code> is true if all records satisfy the <code>P(x)</code> <em>constraint</em>. Most SQL databases allow for complex queries but not complex constraints! You got <code>UNIQUE</code>, <code>NOT NULL</code>, <code>REFERENCES</code>, which are fixed predicates, and <code>CHECK</code>, which is one-record only.<sup id=\"fnref:check\"><a class=\"footnote-ref\" href=\"#fn:check\">4</a></sup></p>\n<p>Oh, and you got database triggers, which can run arbitrary queries and throw exceptions. So if you really need to enforce a complex constraint <code>P(x, y, z)</code>, you put in a database trigger that queries <code>some x, y, z: !P(x, y, z)</code> and throws an exception if it finds any results. That all works because of quantifier duality! See <a href=\"https://eddmann.com/posts/maintaining-invariant-constraints-in-postgresql-using-trigger-functions/\" target=\"_blank\">here</a> for an example of this in practice.</p>\n<h3>Duals more broadly</h3>\n<p>\"Dual\" doesn't have a strict meaning in math, it's more of a vibe thing where all of the \"duals\" are kinda similar in meaning but don't strictly follow all of the same rules. <em>Usually</em> things X and Y are duals if there is some transform <code>F</code> where <code>X = F(Y)</code> and <code>Y = F(X)</code>, but not always. Maybe the category theorists have a formal definition that covers all of the different uses. Usually duals switch properties of things, too: an example showing <code>some x: P(x)</code> becomes a <em>counterexample</em> of <code>all x: !P(x)</code>.</p>\n<p>Under this definition, I think the dual of a list <code>l</code> could be <code>reverse(l)</code>. The first element of <code>l</code> becomes the last element of <code>reverse(l)</code>, the last becomes the first, etc. A more interesting case is the dual of a <code>K -> set(V)</code> map is the <code>V -> set(K)</code> map. IE the dual of <code>lived_in_city = {alice: {paris}, bob: {detroit}, charlie: {detroit, paris}}</code> is <code>city_lived_in_by = {paris: {alice, charlie}, detroit: {bob, charlie}}</code>. This preserves the property that <code>x in map[y] <=> y in dual[x]</code>.</p>\n<p class=\"empty-line\" style=\"height:16px; margin:0px !important;\"></p>\n<div class=\"footnote\">\n<hr/>\n<ol>\n<li id=\"fn:retread\">\n<p>And after writing this I just realized this is partial retread of a newsletter I wrote <a href=\"https://buttondown.com/hillelwayne/archive/logical-quantifiers-in-software/\" target=\"_blank\">a couple months ago</a>. But only a <em>partial</em> retread! <a class=\"footnote-backref\" href=\"#fnref:retread\" title=\"Jump back to footnote 1 in the text\">↩</a></p>\n</li>\n<li id=\"fn:fm\">\n<p>Specifically \"linear temporal logics\" are modal logics, so \"<code>eventually P</code> (\"P is true in at least one state of each behavior\") is the same as saying <code>!always !P</code> (\"not P isn't true in all states of all behaviors\"). This is the basis of <a href=\"https://www.hillelwayne.com/post/safety-and-liveness/\" target=\"_blank\">liveness checking</a>. <a class=\"footnote-backref\" href=\"#fnref:fm\" title=\"Jump back to footnote 2 in the text\">↩</a></p>\n</li>\n<li id=\"fn:antithesis\">\n<p>I don't know for sure, but my best guess is that Antithesis does something similar <a href=\"https://antithesis.com/blog/tag/games/\" target=\"_blank\">when their fuzzer beats videogames</a>. They're doing fuzzing, not model checking, but they have the same purpose check that complex state spaces don't have bugs. Making the bug \"we can't reach the end screen\" can make a fuzzer output a complete end-to-end run of the game. Obvs a lot more complicated than that but that's the general idea at least. <a class=\"footnote-backref\" href=\"#fnref:antithesis\" title=\"Jump back to footnote 3 in the text\">↩</a></p>\n</li>\n<li id=\"fn:check\">\n<p>For <code>CHECK</code> to constraint multiple records you would need to use a subquery. Core SQL does not support subqueries in check. It is an optional database \"feature outside of core SQL\" (F671), which <a href=\"https://www.postgresql.org/docs/current/unsupported-features-sql-standard.html\" target=\"_blank\">Postgres does not support</a>. <a class=\"footnote-backref\" href=\"#fnref:check\" title=\"Jump back to footnote 4 in the text\">↩</a></p>\n</li>\n</ol>\n</div>",
          "url": "https://buttondown.com/hillelwayne/archive/logical-duals-in-software-engineering/",
          "published": "2025-08-27T19:25:32.000Z",
          "updated": "2025-08-27T19:25:32.000Z",
          "content": null,
          "image": null,
          "media": [],
          "authors": [],
          "categories": []
        },
        {
          "id": "https://buttondown.com/hillelwayne/archive/sapir-whorf-does-not-apply-to-programming/",
          "title": "Sapir-Whorf does not apply to Programming Languages",
          "description": "<p><em>This one is a hot mess but it's too late in the week to start over. Oh well!</em></p>\n<p>Someone recognized me at last week's <a href=\"https://www.chipy.org/\" target=\"_blank\">Chipy</a> and asked for my opinion on Sapir-Whorf hypothesis in programming languages. I thought this was interesting enough to make a newsletter. First what it is, then why it <em>looks</em> like it applies, and then why it doesn't apply after all.</p>\n<h3>The Sapir-Whorf Hypothesis</h3>\n<blockquote>\n<p>We dissect nature along lines laid down by our native language. — <a href=\"https://web.mit.edu/allanmc/www/whorf.scienceandlinguistics.pdf\" target=\"_blank\">Whorf</a></p>\n</blockquote>\n<p>To quote from a <a href=\"https://www.amazon.com/Linguistics-Complete-Introduction-Teach-Yourself/dp/1444180320\" target=\"_blank\">Linguistics book I've read</a>, the hypothesis is that \"an individual's fundamental perception of reality is moulded by the language they speak.\" As a massive oversimplification, if English did not have a word for \"rebellion\", we would not be able to conceive of rebellion. This view, now called <a href=\"https://en.wikipedia.org/wiki/Linguistic_determinism\" target=\"_blank\">Linguistic Determinism</a>, is mostly rejected by modern linguists.</p>\n<p>The \"weak\" form of SWH is that the language we speak influences, but does not <em>decide</em> our cognition. <a href=\"https://langcog.stanford.edu/papers/winawer2007.pdf\" target=\"_blank\">For example</a>, Russian has distinct words for \"light blue\" and \"dark blue\", so can discriminate between \"light blue\" and \"dark blue\" shades faster than they can discriminate two \"light blue\" shades. English does not have distinct words, so we discriminate those at the same speed. This <strong>linguistic relativism</strong> seems to have lots of empirical support in studies, but mostly with \"small indicators\". I don't think there's anything that convincingly shows linguistic relativism having effects on a societal level.<sup id=\"fnref:economic-behavior\"><a class=\"footnote-ref\" href=\"#fn:economic-behavior\">1</a></sup></p>\n<p>The weak form of SWH for software would then be the \"the programming languages you know affects how you think about programs.\"</p>\n<h3>SWH in software</h3>\n<p>This seems like a natural fit, as different paradigms solve problems in different ways. Consider the <a href=\"https://hadid.dev/posts/living-coding/\" target=\"_blank\">hardest interview question ever</a>, \"given a list of integers, sum the even numbers\". Here it is in four paradigms:</p>\n<ul>\n<li>Procedural: <code>total = 0; foreach x in list {if IsEven(x) total += x}</code>. You iterate over data with an algorithm.</li>\n<li>Functional: <code>reduce(+, filter(IsEven, list), 0)</code>. You apply transformations to data to get a result.</li>\n<li>Array: <code>+ fold L * iseven L</code>.<sup id=\"fnref:J\"><a class=\"footnote-ref\" href=\"#fn:J\">2</a></sup> In English: replace every element in L with 0 if odd and 1 if even, multiple the new array elementwise against <code>L</code>, and then sum the resulting array. It's like functional except everything is in terms of whole-array transformations.</li>\n<li>Logical: Somethingish like <code>sumeven(0, []). sumeven(X, [Y|L]) :- iseven(Y) -> sumeven(Z, L), X is Y + Z ; sumeven(X, L)</code>. You write a set of equations that express what it means for X to <em>be</em> the sum of events of L.</li>\n</ul>\n<p>There's some similarities between how these paradigms approach the problem, but each is also unique, too. It's plausible that where a procedural programmer \"sees\" a for loop, a functional programmer \"sees\" a map and an array programmer \"sees\" a singular operator.</p>\n<p>I also have a personal experience with how a language changed the way I think. I use <a href=\"https://learntla.com/\" target=\"_blank\">TLA+</a> to detect concurrency bugs in software designs. After doing this for several years, I've gotten much better at intuitively seeing race conditions in things even <em>without</em> writing a TLA+ spec. It's even leaked out into my day-to-day life. I see concurrency bugs everywhere. Phone tag is a race condition.</p>\n<p>But I still don't think SWH is the right mental model to use, for one big reason: language is <em>special</em>. We think in language, we dream in language, there are huge parts of our brain dedicated to processing language. <a href=\"https://web.eecs.umich.edu/~weimerw/p/weimer-icse2017-preprint.pdf\" target=\"_blank\">We don't use those parts of our brain to read code</a>. </p>\n<p>SWH is so intriguing because it seems so unnatural, that the way we express thoughts changes the way we <em>think</em> thoughts. That I would be a different person if I was bilingual in Spanish, not because the life experiences it would open up but because <a href=\"https://en.wikipedia.org/wiki/Grammatical_gender\" target=\"_blank\">grammatical gender</a> would change my brain.</p>\n<p>Compared to that, the idea that programming languages affect our brain is more natural and has a simpler explanation:</p>\n<p>It's the goddamned <a href=\"https://en.wikipedia.org/wiki/Tetris_effect\" target=\"_blank\">Tetris Effect</a>.</p>\n<h3>The Goddamned Tetris Effect</h3>\n<div class=\"subscribe-form\"></div>\n<blockquote>\n<p>The Tetris effect occurs when someone dedicates vast amounts of time, effort and concentration on an activity which thereby alters their thoughts, dreams, and other experiences not directly linked to said activity. — Wikipedia</p>\n</blockquote>\n<p>Every skill does this. I'm a juggler, so every item I can see right now has a tiny metadata field of \"how would this tumble if I threw it up\". I teach professionally, so I'm always noticing good teaching examples everywhere. I spent years writing specs in TLA+ and watching the model checker throw concurrency errors in my face, so now race conditions have visceral presence. Every skill does this. </p>\n<p>And to really develop a skill, you gotta practice. This is where I think programming paradigms do something especially interesting that make them feel more like Sapir-Whorfy than, like, juggling. Some languages mix lots of different paradigms, like Javascript or Rust. Others like Haskell really focus on <em>excluding</em> paradigms. If something is easy for you in procedural and hard in FP, in JS you could just lean on the procedural bits. In Haskell, <em>too bad</em>, you're learning how to do it the functional way.<sup id=\"fnref:escape-hatch\"><a class=\"footnote-ref\" href=\"#fn:escape-hatch\">3</a></sup></p>\n<p>And that forces you to practice, which makes you see functional patterns everywhere. Tetris effect!</p>\n<p>Anyway this may all seem like quibbling— why does it matter whether we call it \"Tetris effect\" or \"Sapir-Whorf\", if our brains is get rewired either way? For me, personally, it's because SWH sounds really special and <em>unique</em>, while Tetris effect sounds mundane and commonplace. Which it <em>is</em>. But also because TE suggests it's not just programming languages that affect how we think about software, it's <em>everything</em>. Spending lots of time debugging, profiling, writing exploits, whatever will change what you notice, what you think a program \"is\". And that's a way useful idea that shouldn't be restricted to just PLs.</p>\n<p>(Then again, the Tetris Effect might also be a bad analogy to what's going on here, because I think part of it is that it wears off after a while. Maybe it's just \"building a mental model is good\".)</p>\n<h3>I just realized all of this might have missed the point</h3>\n<p>Wait are people actually using SWH to mean the <em>weak form</em> or the <em>strong</em> form? Like that if a language doesn't make something possible, its users can't conceive of it being possible. I've been arguing against the weaker form in software but I think I've seen strong form often too. Dammit.</p>\n<p>Well, it's already Thursday and far too late to rewrite the whole newsletter, so I'll just outline the problem with the strong form: we describe the capabilities of our programming languages <em>with human language</em>. In college I wrote a lot of crappy physics lab C++ and one of my projects was filled with comments like \"man I hate copying this triply-nested loop in 10 places with one-line changes, I wish I could put it in one function and just take the changing line as a parameter\". Even if I hadn't <em>encountered</em> higher-order functions, I was still perfectly capable of expressing the idea. So if the strong SWH isn't true for human language, it's not true for programming languages either.</p>\n<p class=\"empty-line\" style=\"height:16px; margin:0px !important;\"></p>\n<hr/>\n<h1>Systems Distributed talk now up!</h1>\n<p><a href=\"https://www.youtube.com/watch?v=d9cM8f_qSLQ\" target=\"_blank\">Link here</a>! Original abstract:</p>\n<blockquote>\n<p>Building correct distributed systems takes thinking outside the box, and the fastest way to do that is to think inside a different box. One different box is \"formal methods\", the discipline of mathematically verifying software and systems. Formal methods encourages unusual perspectives on systems, models that are also broadly useful to all software developers. In this talk we will learn two of the most important FM perspectives: the abstract specifications behind software systems, and the property they are and aren't supposed to have.</p>\n</blockquote>\n<p>The talk ended up evolving away from that abstract but I like how it turned out!</p>\n<div class=\"footnote\">\n<hr/>\n<ol>\n<li id=\"fn:economic-behavior\">\n<p>There is <a href=\"https://www.anderson.ucla.edu/faculty/keith.chen/papers/LanguageWorkingPaper.pdf\" target=\"_blank\">one paper</a> arguing that people who speak a language that doesn't have a \"future tense\" are more likely to save and eat healthy, but it is... <a href=\"https://www.reddit.com/r/linguistics/comments/rcne7m/comment/hnz2705/\" target=\"_blank\">extremely questionable</a>. <a class=\"footnote-backref\" href=\"#fnref:economic-behavior\" title=\"Jump back to footnote 1 in the text\">↩</a></p>\n</li>\n<li id=\"fn:J\">\n<p>The original J is <code>+/ (* (0 =  2&|))</code>. Obligatory <a href=\"https://www.jsoftware.com/papers/tot.htm\" target=\"_blank\">Notation as a Tool of Thought</a> reference <a class=\"footnote-backref\" href=\"#fnref:J\" title=\"Jump back to footnote 2 in the text\">↩</a></p>\n</li>\n<li id=\"fn:escape-hatch\">\n<p>Though if it's <em>too</em> hard for you, that's why languages have <a href=\"https://buttondown.com/hillelwayne/archive/programming-language-escape-hatches/\" target=\"_blank\">escape hatches</a> <a class=\"footnote-backref\" href=\"#fnref:escape-hatch\" title=\"Jump back to footnote 3 in the text\">↩</a></p>\n</li>\n</ol>\n</div>",
          "url": "https://buttondown.com/hillelwayne/archive/sapir-whorf-does-not-apply-to-programming/",
          "published": "2025-08-21T13:00:00.000Z",
          "updated": "2025-08-21T13:00:00.000Z",
          "content": null,
          "image": null,
          "media": [],
          "authors": [],
          "categories": []
        },
        {
          "id": "https://buttondown.com/hillelwayne/archive/software-books-i-wish-i-could-read/",
          "title": "Software books I wish I could read",
          "description": "<h3>New Logic for Programmers Release!</h3>\n<p><a href=\"https://leanpub.com/logic/\" target=\"_blank\">v0.11 is now available</a>! This is over 20%  longer than v0.10, with a new chapter on code proofs, three chapter overhauls, and more! <a href=\"https://github.com/logicforprogrammers/book-assets/blob/master/CHANGELOG.md\" target=\"_blank\">Full release notes here</a>.</p>\n<p><img alt=\"Cover of the boooooook\" class=\"newsletter-image\" src=\"https://assets.buttondown.email/images/92b4a35d-2bdd-416a-92c7-15ff42b49d8d.jpg?w=960&fit=max\"/></p>\n<h1>Software books I wish I could read</h1>\n<p>I'm writing <em>Logic for Programmers</em> because it's a book I wanted to have ten years ago. I had to learn everything in it the hard way, which is why I'm ensuring that everybody else can learn it the easy way.</p>\n<p>Books occupy a sort of weird niche in software. We're great at sharing information via blogs and git repos and entire websites. These have many benefits over books: they're free, they're easily accessible, they can be updated quickly, they can even be interactive. But no blog post has influenced me as profoundly as <a href=\"https://buttondown.com/hillelwayne/archive/why-you-should-read-data-and-reality/\" target=\"_blank\">Data and Reality</a> or <a href=\"https://www.oreilly.com/library/view/making-software/9780596808310/\" target=\"_blank\">Making Software</a>. There is no blog or talk about debugging as good as the \n<a href=\"https://debuggingrules.com/\" target=\"_blank\">Debugging</a> book.</p>\n<p>It might not be anything deeper than \"people spend more time per word on writing books than blog posts\". I dunno.</p>\n<p>So here are some other books I wish I could read. I don't <em>think</em> any of them exist yet but it's a big world out there. Also while they're probably best as books, a website or a series of blog posts would be ok too.</p>\n<h4>Everything about Configurations</h4>\n<p>The whole topic of how we configure software, whether by CLI flags, environmental vars, or JSON/YAML/XML/Dhall files. What causes the <a href=\"https://mikehadlow.blogspot.com/2012/05/configuration-complexity-clock.html\" target=\"_blank\">configuration complexity clock</a>? How do we distinguish between basic, advanced, and developer-only configuration options? When should we disallow configuration? How do we test all possible configurations for correctness? Why do so many widespread outages trace back to misconfiguration, and how do we prevent them? </p>\n<p>I also want the same for plugin systems. Manifests, permissions, common APIs and architectures, etc. Configuration management is more universal, though, since everybody either uses software with configuration or has made software with configuration.</p>\n<h4>The Big Book of Complicated Data Schemas</h4>\n<p>I guess this would kind of be like <a href=\"https://schema.org/docs/full.html\" target=\"_blank\">Schema.org</a>, except with a lot more on the \"why\" and not the what. Why is important for the <a href=\"https://schema.org/Volcano\" target=\"_blank\">Volcano model</a> to have a \"smokingAllowed\" field?<sup id=\"fnref:volcano\"><a class=\"footnote-ref\" href=\"#fn:volcano\">1</a></sup></p>\n<p>I'd see this less as \"here's your guide to putting Volcanos in your database\" and more \"here's recurring motifs in modeling interesting domains\", to help a person see sources of complexity in their <em>own</em> domain. Does something crop up if the references can form a cycle? If a relationship needs to be strictly temporary, or a reference can change type? Bonus: path dependence in data models, where an additional requirement leads to a vastly different ideal data model that a company couldn't do because they made the old model.</p>\n<p>(This has got to exist, right? Business modeling is a big enough domain that this must exist. Maybe <a href=\"https://essenceofsoftware.com/\" target=\"_blank\">The Essence of Software</a> touches on this? Man I feel bad I haven't read that yet.)</p>\n<h4>Computer Science for Software Engineers</h4>\n<p>Yes, I checked, this book does not exist (though maybe <a href=\"https://www.amazon.com/A-Programmers-Guide-to-Computer-Science-2-book-series/dp/B08433QR53\" target=\"_blank\">this</a> is the same thing). I don't have any formal software education; everything I know was either self-taught or learned on the job. But it's way easier to learn software engineering that way than computer science. And I bet there's a lot of other engineers in the same boat. </p>\n<p>This book wouldn't have to be comprehensive or instructive: just enough about each topic to understand why it's an area of study and appreciate how research in it eventually finds its way into practice. </p>\n<h4>MISU Patterns</h4>\n<p>MISU, or \"Make Illegal States Unrepresentable\", is the idea of designing system invariants in the structure of your data. For example, if a <code>Contact</code> needs at least one of <code>email</code> or <code>phone</code> to be non-null, make it a sum type over <code>EmailContact, PhoneContact, EmailPhoneContact</code> (from <a href=\"https://fsharpforfunandprofit.com/posts/designing-with-types-making-illegal-states-unrepresentable/\" target=\"_blank\">this post</a>). MISU is great.</p>\n<p>Most MISU in the wild look very different than that, though, because the concept of MISU is so broad there's lots of different ways to achieve it. And that means there are \"patterns\": smart constructors, product types, properly using sets, <a href=\"https://lexi-lambda.github.io/blog/2020/11/01/names-are-not-type-safety/\" target=\"_blank\">newtypes to some degree</a>, etc. Some of them are specific to typed FP, while others can be used in even untyped languages. Someone oughta make a pattern book.</p>\n<p>My one request would be to not give them cutesy names. Do something like the <a href=\"https://ia600301.us.archive.org/18/items/Thompson2016MotifIndex/Thompson_2016_Motif-Index.pdf\" target=\"_blank\">Aarne–Thompson–Uther Index</a>, where items are given names like \"Recognition by manner of throwing cakes of different weights into faces of old uncles\". Names can come later.</p>\n<h4>The Tools of '25</h4>\n<p>Not something I'd read, but something to recommend to junior engineers. Starting out it's easy to think the only bit that matters is the language or framework and not realize the enormous amount of surrounding tooling you'll have to learn. This book would cover the basics of tools that <em>enough</em> developers will probably use at some point: git, VSCode, <em>very</em> basic Unix and bash, curl. Maybe the general concepts of tools that appear in every ecosystem, like package managers, build tools, task runners. That might be easier if we specialize this to one particular domain, like webdev or data science.</p>\n<p>Ideally the book would only have to be updated every five years or so. No LLM stuff because I don't expect the tooling will be stable through 2026, to say nothing of 2030.</p>\n<h4>A History of Obsolete Optimizations</h4>\n<p>Probably better as a really long blog series. Each chapter would be broken up into two parts:</p>\n<ol>\n<li>A deep dive into a brilliant, elegant, insightful historical optimization designed to work within the constraints of that era's computing technology</li>\n<li>What we started doing instead, once we had more compute/network/storage available.</li>\n</ol>\n<p>c.f. <a href=\"https://prog21.dadgum.com/29.html\" target=\"_blank\">A Spellchecker Used to Be a Major Feat of Software Engineering</a>. Bonus topics would be brilliance obsoleted by standardization (like what people did before git and json were universal), optimizations we do today that may not stand the test of time, and optimizations from the past that <em>did</em>.</p>\n<h4>Sphinx Internals</h4>\n<p><em>I need this</em>. I've spent so much goddamn time digging around in Sphinx and docutils source code I'm gonna throw up.</p>\n<hr/>\n<h3>Systems Distributed Talk Today!</h3>\n<p>Online premier's at noon central / 5 PM UTC, <a href=\"https://www.youtube.com/watch?v=d9cM8f_qSLQ\" target=\"_blank\">here</a>! I'll be hanging out to answer questions and be awkward. You ever watch a recording of your own talk? It's real uncomfortable!</p>\n<div class=\"footnote\">\n<hr/>\n<ol>\n<li id=\"fn:volcano\">\n<p>In <em>this</em> case because it's a field on one of <code>Volcano</code>'s supertypes. I guess schemas gotta follow LSP too <a class=\"footnote-backref\" href=\"#fnref:volcano\" title=\"Jump back to footnote 1 in the text\">↩</a></p>\n</li>\n</ol>\n</div>",
          "url": "https://buttondown.com/hillelwayne/archive/software-books-i-wish-i-could-read/",
          "published": "2025-08-06T13:00:00.000Z",
          "updated": "2025-08-06T13:00:00.000Z",
          "content": null,
          "image": null,
          "media": [],
          "authors": [],
          "categories": []
        },
        {
          "id": "https://buttondown.com/hillelwayne/archive/2000-words-about-arrays-and-tables/",
          "title": "2000 words about arrays and tables",
          "description": "<p>I'm way too discombobulated from getting next month's release of <a href=\"https://leanpub.com/logic/\" target=\"_blank\">Logic for Programmers</a> ready, so I'm pulling a idea from the slush pile. Basically I wanted to come up with a mental model of arrays as a concept that explained APL-style multidimensional arrays and tables but also why there weren't multitables.</p>\n<p>So, arrays. In all languages they are basically the same: they map a sequence of numbers (I'll use <code>1..N</code>)<sup id=\"fnref:1-indexing\"><a class=\"footnote-ref\" href=\"#fn:1-indexing\">1</a></sup> to homogeneous values (values of a single type). This is in contrast to the other two foundational types, associative arrays (which map an arbitrary type to homogeneous values) and structs (which map a fixed set of keys to <em>heterogeneous</em> values). Arrays appear in PLs earlier than the other two, possibly because they have the simplest implementation and the most obvious application to scientific computing. The OG FORTRAN had arrays. </p>\n<p>I'm interested in two structural extensions to arrays. The first, found in languages like nushell and frameworks like Pandas, is the <em>table</em>. Tables have string keys like a struct <em>and</em> indexes like an array. Each row is a struct, so you can get \"all values in this column\" or \"all values for this row\". They're heavily used in databases and data science.</p>\n<p>The other extension is the <strong>N-dimensional array</strong>, mostly seen in APLs like Dyalog and J. Think of this like arrays-of-arrays(-of-arrays), except all arrays at the same depth have the same length. So <code>[[1,2,3],[4]]</code> is not a 2D array, but <code>[[1,2,3],[4,5,6]]</code> is. This means that N-arrays can be queried on any axis.</p>\n<div class=\"codehilite\"><pre><span></span><code><span class=\"w\"> </span><span class=\"o\">]</span><span class=\"nv\">x</span><span class=\"w\"> </span><span class=\"o\">=:</span><span class=\"w\"> </span><span class=\"nv\">i</span><span class=\"o\">.</span><span class=\"w\"> </span><span class=\"mi\">3</span><span class=\"w\"> </span><span class=\"mi\">3</span>\n<span class=\"mi\">0</span><span class=\"w\"> </span><span class=\"mi\">1</span><span class=\"w\"> </span><span class=\"mi\">2</span>\n<span class=\"mi\">3</span><span class=\"w\"> </span><span class=\"mi\">4</span><span class=\"w\"> </span><span class=\"mi\">5</span>\n<span class=\"mi\">6</span><span class=\"w\"> </span><span class=\"mi\">7</span><span class=\"w\"> </span><span class=\"mi\">8</span>\n<span class=\"w\">   </span><span class=\"mi\">0</span><span class=\"w\"> </span><span class=\"o\">{</span><span class=\"w\"> </span><span class=\"nv\">x</span><span class=\"w\"> </span><span class=\"c1\">NB. first row</span>\n<span class=\"mi\">0</span><span class=\"w\"> </span><span class=\"mi\">1</span><span class=\"w\"> </span><span class=\"mi\">2</span>\n<span class=\"w\">   </span><span class=\"mi\">0</span><span class=\"w\"> </span><span class=\"o\">{\"</span><span class=\"mi\">1</span><span class=\"w\"> </span><span class=\"nv\">x</span><span class=\"w\"> </span><span class=\"c1\">NB. first column</span>\n<span class=\"mi\">0</span><span class=\"w\"> </span><span class=\"mi\">3</span><span class=\"w\"> </span><span class=\"mi\">6</span>\n</code></pre></div>\n<p>So, I've had some ideas on a conceptual model of arrays that explains all of these variations and possibly predicts new variations. I wrote up my notes and did the bare minimum of editing and polishing. Somehow it ended up being 2000 words.</p>\n<h3>1-dimensional arrays</h3>\n<p>A one-dimensional array is a function over <code>1..N</code> for some N. </p>\n<p>To be clear this is <em>math</em> functions, not programming functions. Programming functions take values of a type and perform computations on them. Math functions take values of a fixed set and return values of another set. So the array <code>[a, b, c, d]</code> can be represented by the function <code>(1 -> a ++ 2 -> b ++ 3 -> c ++ 4 -> d)</code>. Let's write the set of all four element character arrays as <code>1..4 -> char</code>. <code>1..4</code> is the function's <strong>domain</strong>.</p>\n<p>The set of all character arrays is the empty array + the functions with domain <code>1..1</code> + the functions with domain <code>1..2</code> + ... Let's call this set <code>Array[Char]</code>. Our compilers can enforce that a type belongs to <code>Array[Char]</code>, but some operations care about the more specific type, like matrix multiplication. This is either checked with the runtime type or, in exotic enough languages, with static dependent types.</p>\n<p>(This is actually how TLA+ does things: the basic collection types are functions and sets, and a function with domain 1..N is a sequence.)</p>\n<h3>2-dimensional arrays</h3>\n<p>Now take the 3x4 matrix</p>\n<div class=\"codehilite\"><pre><span></span><code><span class=\"w\">   </span><span class=\"nv\">i</span><span class=\"o\">.</span><span class=\"w\"> </span><span class=\"mi\">3</span><span class=\"w\"> </span><span class=\"mi\">4</span>\n<span class=\"mi\">0</span><span class=\"w\"> </span><span class=\"mi\">1</span><span class=\"w\">  </span><span class=\"mi\">2</span><span class=\"w\">  </span><span class=\"mi\">3</span>\n<span class=\"mi\">4</span><span class=\"w\"> </span><span class=\"mi\">5</span><span class=\"w\">  </span><span class=\"mi\">6</span><span class=\"w\">  </span><span class=\"mi\">7</span>\n<span class=\"mi\">8</span><span class=\"w\"> </span><span class=\"mi\">9</span><span class=\"w\"> </span><span class=\"mi\">10</span><span class=\"w\"> </span><span class=\"mi\">11</span>\n</code></pre></div>\n<p>There are two equally valid ways to represent the array function:</p>\n<ol>\n<li>A function that takes a row and a column and returns the value at that index, so it would look like <code>f(r: 1..3, c: 1..4) -> Int</code>.</li>\n<li>A function that takes a row and returns that column as an array, aka another function: <code>f(r: 1..3) -> g(c: 1..4) -> Int</code>.<sup id=\"fnref:associative\"><a class=\"footnote-ref\" href=\"#fn:associative\">2</a></sup></li>\n</ol>\n<p>Man, (2) looks a lot like <a href=\"https://en.wikipedia.org/wiki/Currying\" target=\"_blank\">currying</a>! In Haskell, functions can only have one parameter. If you write <code>(+) 6 10</code>, <code>(+) 6</code> first returns a <em>new</em> function <code>f y = y + 6</code>, and then applies <code>f 10</code> to get 16. So <code>(+)</code> has the type signature <code>Int -> Int -> Int</code>: it's a function that takes an <code>Int</code> and returns a function of type <code>Int -> Int</code>.<sup id=\"fnref:typeclass\"><a class=\"footnote-ref\" href=\"#fn:typeclass\">3</a></sup></p>\n<p>Similarly, our 2D array can be represented as an array function that returns array functions: it has type <code>1..3 -> 1..4 -> Int</code>, meaning it takes a row index and returns <code>1..4 -> Int</code>, aka a single array.</p>\n<p>(This differs from conventional array-of-arrays because it forces all of the subarrays to have the same domain, aka the same length. If we wanted to permit ragged arrays, we would instead have the type <code>1..3 -> Array[Int]</code>.)</p>\n<p>Why is this useful? A couple of reasons. First of all, we can apply function transformations to arrays, like \"<a href=\"https://blog.zdsmith.com/series/combinatory-programming.html\" target=\"_blank\">combinators</a>\". For example, we can flip any function of type <code>a -> b -> c</code> into a function of type <code>b -> a -> c</code>. So given a function that takes rows and returns columns, we can produce one that takes columns and returns rows. That's just a matrix transposition! </p>\n<p>Second, we can extend this to any number of dimensions: a three-dimensional array is one with type <code>1..M -> 1..N -> 1..O -> V</code>. We can still use function transformations to rearrange the array along any ordering of axes.</p>\n<p>Speaking of dimensions:</p>\n<h3>What are dimensions, anyway</h3>\n<div class=\"subscribe-form\"></div>\n<p>Okay, so now imagine we have a <code>Row</code> × <code>Col</code> grid of pixels, where each pixel is a struct of type <code>Pixel(R: int, G: int, B: int)</code>. So the array is</p>\n<div class=\"codehilite\"><pre><span></span><code>Row -> Col -> Pixel\n</code></pre></div>\n<p>But we can also represent the <em>Pixel struct</em> with a function: <code>Pixel(R: 0, G: 0, B: 255)</code> is the function where <code>f(R) = 0</code>, <code>f(G) = 0</code>, <code>f(B) = 255</code>, making it a function of type <code>{R, G, B} -> Int</code>. So the array is actually the function</p>\n<div class=\"codehilite\"><pre><span></span><code>Row -> Col -> {R, G, B} -> Int\n</code></pre></div>\n<p>And then we can rearrange the parameters of the function like this:</p>\n<div class=\"codehilite\"><pre><span></span><code>{R, G, B} -> Row -> Col -> Int\n</code></pre></div>\n<p>Even though the set <code>{R, G, B}</code> is not of form 1..N, this clearly has a real meaning: <code>f[R]</code> is the function mapping each coordinate to that coordinate's red value. What about <code>Row -> {R, G, B} -> Col -> Int</code>?  That's for each row, the 3 × Col array mapping each color to that row's intensities.</p>\n<p>Really <em>any finite set</em> can be a \"dimension\". Recording the monitor over a span of time? <code>Frame -> Row -> Col -> Color -> Int</code>. Recording a bunch of computers over some time? <code>Computer -> Frame -> Row …</code>.</p>\n<p>This is pretty common in constraint satisfaction! Like if you're conference trying to assign talks to talk slots, your array might be type <code>(Day, Time, Room) -> Talk</code>, where Day/Time/Room are enumerations.</p>\n<p>An implementation constraint is that most programming languages <em>only</em> allow integer indexes, so we have to replace Rooms and Colors with numerical enumerations over the set. As long as the set is finite, this is always possible, and for struct-functions, we can always choose the indexing on the lexicographic ordering of the keys. But we lose type safety.</p>\n<h3>Why tables are different</h3>\n<p>One more example: <code>Day -> Hour -> Airport(name: str, flights: int, revenue: USD)</code>. Can we turn the struct into a dimension like before? </p>\n<p>In this case, no. We were able to make <code>Color</code> an axis because we could turn <code>Pixel</code> into a <code>Color -> Int</code> function, and we could only do that because all of the fields of the struct had the same type. This time, the fields are <em>different</em> types. So we can't convert <code>{name, flights, revenue}</code> into an axis. <sup id=\"fnref:name-dimension\"><a class=\"footnote-ref\" href=\"#fn:name-dimension\">4</a></sup> One thing we can do is convert it to three <em>separate</em> functions:</p>\n<div class=\"codehilite\"><pre><span></span><code>airport: Day -> Hour -> Str\nflights: Day -> Hour -> Int\nrevenue: Day -> Hour -> USD\n</code></pre></div>\n<p>But we want to keep all of the data in one place. That's where <strong>tables</strong> come in: an array-of-structs is isomorphic to a struct-of-arrays:</p>\n<div class=\"codehilite\"><pre><span></span><code>AirportColumns(\n    airport: Day -> Hour -> Str,\n    flights: Day -> Hour -> Int,\n    revenue: Day -> Hour -> USD,\n)\n</code></pre></div>\n<p>The table is a sort of <em>both</em> representations simultaneously. If this was a pandas dataframe, <code>df[\"airport\"]</code> would get the airport column, while <code>df.loc[day1]</code> would get the first day's data. I don't think many table implementations support more than one axis dimension but there's no reason they <em>couldn't</em>. </p>\n<p>These are also possible transforms:</p>\n<div class=\"codehilite\"><pre><span></span><code>Hour -> NamesAreHard(\n    airport: Day -> Str,\n    flights: Day -> Int,\n    revenue: Day -> USD,\n)\n\nDay -> Whatever(\n    airport: Hour -> Str,\n    flights: Hour -> Int,\n    revenue: Hour -> USD,\n)\n</code></pre></div>\n<p>In my mental model, the heterogeneous struct acts as a \"block\" in the array. We can't remove it, we can only push an index into the fields or pull a shared column out. But there's no way to convert a heterogeneous table into an array.</p>\n<h3>Actually there is a terrible way</h3>\n<p>Most languages have unions or <del>product</del> sum types that let us say \"this is a string OR integer\". So we can make our airport data <code>Day -> Hour -> AirportKey -> Int | Str | USD</code>. Heck, might as well just say it's <code>Day -> Hour -> AirportKey -> Any</code>. But would anybody really be mad enough to use that in practice?</p>\n<p><a href=\"https://code.jsoftware.com/wiki/Vocabulary/lt\" target=\"_blank\">Oh wait J does exactly that</a>. J has an opaque datatype called a \"box\". A \"table\" is a function <code>Dim1 -> Dim2 -> Box</code>. You can see some examples of what that looks like <a href=\"https://code.jsoftware.com/wiki/DB/Flwor\" target=\"_blank\">here</a></p>\n<h3>Misc Thoughts and Questions</h3>\n<p>The heterogeneity barrier seems like it explains why we don't see multiple axes of table columns, while we do see multiple axes of array dimensions. But is that actually why? Is there a system out there that <em>does</em> have multiple columnar axes?</p>\n<p>The array <code>x = [[a, b, a], [b, b, b]]</code> has type <code>1..2 -> 1..3 -> {a, b}</code>. Can we rearrange it to <code>1..2 -> {a, b} -> 1..3</code>? No. But we <em>can</em> rearrange it to <code>1..2 -> {a, b} -> PowerSet(1..3)</code>, which maps rows and characters to columns <em>with</em> that character. <code>[(a -> {1, 3} ++ b -> {2}), (a -> {} ++ b -> {1, 2, 3}]</code>. </p>\n<p>We can also transform <code>Row -> PowerSet(Col)</code> into <code>Row -> Col -> Bool</code>, aka a boolean matrix. This makes sense to me as both forms are means of representing directed graphs.</p>\n<p>Are other function combinators useful for thinking about arrays?</p>\n<p>Does this model cover pivot tables? Can we extend it to relational data with multiple tables?</p>\n<hr/>\n<h3>Systems Distributed Talk (will be) Online</h3>\n<p>The premier will be August 6 at 12 CST, <a href=\"https://www.youtube.com/watch?v=d9cM8f_qSLQ\" target=\"_blank\">here</a>! I'll be there to answer questions / mock my own performance / generally make a fool of myself.</p>\n<div class=\"footnote\">\n<hr/>\n<ol>\n<li id=\"fn:1-indexing\">\n<p><a href=\"https://buttondown.com/hillelwayne/archive/why-do-arrays-start-at-0/\" target=\"_blank\">Sacrilege</a>! But it turns out in this context, it's easier to use 1-indexing than 0-indexing. In the years since I wrote that article I've settled on \"each indexing choice matches different kinds of mathematical work\", so mathematicians and computer scientists are best served by being able to choose their index. But software engineers need consistency, and 0-indexing is overall a net better consistency pick. <a class=\"footnote-backref\" href=\"#fnref:1-indexing\" title=\"Jump back to footnote 1 in the text\">↩</a></p>\n</li>\n<li id=\"fn:associative\">\n<p>This is <em>right-associative</em>: <code>a -> b -> c</code> means <code>a -> (b -> c)</code>, not <code>(a -> b) -> c</code>. <code>(1..3 -> 1..4) -> Int</code> would be the associative array that maps length-3 arrays to integers. <a class=\"footnote-backref\" href=\"#fnref:associative\" title=\"Jump back to footnote 2 in the text\">↩</a></p>\n</li>\n<li id=\"fn:typeclass\">\n<p>Technically it has type <code>Num a => a -> a -> a</code>, since <code>(+)</code> works on floats too. <a class=\"footnote-backref\" href=\"#fnref:typeclass\" title=\"Jump back to footnote 3 in the text\">↩</a></p>\n</li>\n<li id=\"fn:name-dimension\">\n<p>Notice that if each <code>Airport</code> had a unique name, we <em>could</em> pull it out into <code>AirportName -> Airport(flights, revenue)</code>, but we still are stuck with two different values. <a class=\"footnote-backref\" href=\"#fnref:name-dimension\" title=\"Jump back to footnote 4 in the text\">↩</a></p>\n</li>\n</ol>\n</div>",
          "url": "https://buttondown.com/hillelwayne/archive/2000-words-about-arrays-and-tables/",
          "published": "2025-07-30T13:00:00.000Z",
          "updated": "2025-07-30T13:00:00.000Z",
          "content": null,
          "image": null,
          "media": [],
          "authors": [],
          "categories": []
        },
        {
          "id": "https://buttondown.com/hillelwayne/archive/programming-language-escape-hatches/",
          "title": "Programming Language Escape Hatches",
          "description": "<p>The excellent-but-defunct blog <a href=\"https://prog21.dadgum.com/38.html\" target=\"_blank\">Programming in the 21st Century</a> defines \"puzzle languages\" as languages were part of the appeal is in figuring out how to express a program idiomatically, like a puzzle. As examples, he lists Haskell, Erlang, and J. All puzzle languages, the author says, have an \"escape\" out of the puzzle model that is pragmatic but stigmatized.</p>\n<p>But many mainstream languages have escape hatches, too.</p>\n<p>Languages have a lot of properties. One of these properties is the language's <a href=\"https://buttondown.com/hillelwayne/archive/the-capability-tractability-tradeoff/\" target=\"_blank\">capabilities</a>, roughly the set of things you can do in the language. Capability is desirable but comes into conflicts with a lot of other desirable properties, like simplicity or efficiency. In particular, reducing the capability of a language means that all remaining programs share more in common, meaning there's more assumptions the compiler and programmer can make (\"tractability\"). Assumptions are generally used to reason about correctness, but can also be about things like optimization: J's assumption that everything is an array leads to <a href=\"https://code.jsoftware.com/wiki/Vocabulary/SpecialCombinations\" target=\"_blank\">high-performance \"special combinations\"</a>. </p>\n<p>Rust is the most famous example of <strong>mainstream</strong> language that trades capability for tractability.<sup id=\"fnref:gc\"><a class=\"footnote-ref\" href=\"#fn:gc\">1</a></sup> Rust has a lot of rules designed to prevent common memory errors, like keeping a reference to deallocated memory or modifying memory while something else is reading it. As a consequence, there's a lot of things that cannot be done in (safe) Rust, like interface with an external C function (as it doesn't have these guarantees).</p>\n<p>To do this, you need to use <a href=\"https://doc.rust-lang.org/book/ch20-01-unsafe-rust.html\" target=\"_blank\">unsafe Rust</a>, which lets you do additional things forbidden by safe Rust, such as deference a raw pointer. Everybody tells you not to use <code>unsafe</code> unless you absolutely 100% know what you're doing, and possibly not even then.</p>\n<p>Sounds like an escape hatch to me!</p>\n<p>To extrapolate, an <strong>escape hatch</strong> is a feature (either in the language itself or a particular implementation) that deliberately breaks core assumptions about the language in order to add capabilities. This explains both Rust and most of the so-called \"puzzle languages\": they need escape hatches because they have very strong conceptual models of the language which leads to lots of assumptions about programs. But plenty of \"kitchen sink\" mainstream languages have escape hatches, too:</p>\n<ul>\n<li>Some compilers let C++ code embed <a href=\"https://en.cppreference.com/w/cpp/language/asm.html\" target=\"_blank\">inline assembly</a>.</li>\n<li>Languages built on .NET or the JVM has some sort of interop with C# or Java, and many of those languages make assumptions about programs that C#/Java do not.</li>\n<li>The SQL language has stored procedures as an escape hatch <em>and</em> vendors create a second escape hatch of user-defined functions.</li>\n<li>Ruby lets you bypass any form of encapsulation with <a href=\"https://ruby-doc.org/3.4.1/Object.html#method-i-send\" target=\"_blank\"><code>send</code></a>.</li>\n<li>Frameworks have escape hatches, too! React has <a href=\"https://react.dev/learn/escape-hatches\" target=\"_blank\">an entire page on them</a>.</li>\n</ul>\n<p>(Does <code>eval</code> in interpreted languages count as an escape hatch? It feels different, but it does add a lot of capability. Maybe they don't \"break assumptions\" in the same way?)</p>\n<h3>The problem with escape hatches</h3>\n<p>In all languages with escape hatches, the rule is \"use this as carefully and sparingly as possible\", to the point where a messy solution <em>without</em> an escape hatch is preferable to a clean solution <em>with</em> one. Breaking a core assumption is a big deal! If the language is operating as if its still true, it's going to do incorrect things. </p>\n<p>I recently had this problem in a TLA+ contract. TLA+ is a language for modeling complicated systems, and assumes that the model is a self-contained universe. The client wanted to use the TLA+ to test a real system. The model checker should send commands to a test device and check the next states were the same. This is straightforward to set up with the <a href=\"https://github.com/tlaplus/CommunityModules/blob/master/modules/IOUtils.tla\" target=\"_blank\">IOExec escape hatch</a>.<sup id=\"fnref:ioexec\"><a class=\"footnote-ref\" href=\"#fn:ioexec\">2</a></sup> But the model checker assumed that state exploration was pure and it could skip around the state randomly, meaning it would do things like <code>set x = 10</code>, then skip to <code>set x = 1</code>, then skip back to <code>inc x; assert x == 11</code>. Oops!</p>\n<p>We eventually found workarounds but it took a lot of clever tricks to pull off. I'll probably write up the technique when I'm less busy with The Book.</p>\n<p>The other problem with escape hatches is the rest of the language is designed around <em>not</em> having said capabilities, meaning it can't support the feature as well as a language designed for them from the start. Even if your escape hatch code is clean, it might not cleanly <em>integrate</em> with the rest of your code. This is why people <a href=\"https://chadaustin.me/2024/10/intrusive-linked-list-in-rust/\" target=\"_blank\">complain about unsafe Rust</a> so often.</p>\n<div class=\"footnote\">\n<hr/>\n<ol>\n<li id=\"fn:gc\">\n<p>It should be noted though that <em>all</em> languages with automatic memory management are trading capability for tractability, too. If you can't deference pointers, you can't deference <em>null</em> pointers. <a class=\"footnote-backref\" href=\"#fnref:gc\" title=\"Jump back to footnote 1 in the text\">↩</a></p>\n</li>\n<li id=\"fn:ioexec\">\n<p>From the Community Modules (which come default with the VSCode extension). <a class=\"footnote-backref\" href=\"#fnref:ioexec\" title=\"Jump back to footnote 2 in the text\">↩</a></p>\n</li>\n</ol>\n</div>",
          "url": "https://buttondown.com/hillelwayne/archive/programming-language-escape-hatches/",
          "published": "2025-07-24T14:00:00.000Z",
          "updated": "2025-07-24T14:00:00.000Z",
          "content": null,
          "image": null,
          "media": [],
          "authors": [],
          "categories": []
        },
        {
          "id": "https://buttondown.com/hillelwayne/archive/maybe-writing-speed-actually-is-a-bottleneck-for/",
          "title": "Maybe writing speed actually is a bottleneck for programming",
          "description": "<p>I'm a big (neo)vim buff. My config is over 1500 lines and I regularly write new scripts. I recently ported my neovim config to a new laptop. Before then, I was using VSCode to write, and when I switched back I immediately saw a big gain in productivity.</p>\n<p>People often pooh-pooh vim (and other assistive writing technologies) by saying that writing code isn't the bottleneck in software development. Reading, understanding, and thinking through code is!</p>\n<p>Now I don't know how true this actually is in practice, because empirical studies of time spent coding are all over the place. Most of them, like <a href=\"https://www.microsoft.com/en-us/research/wp-content/uploads/2016/02/meyer-fse-2014.pdf\" target=\"_blank\">this study</a>, track time spent in the editor but don't distinguish between time spent reading code and time spent writing code. The only one I found that separates them was <a href=\"https://scispace.com/pdf/i-know-what-you-did-last-summer-an-investigation-of-how-3zxclzzocc.pdf\" target=\"_blank\">this study</a>. It finds that developers spend only 5% of their time editing. It also finds they spend 14% of their time moving or resizing editor windows, so I don't know how clean their data is.</p>\n<p>But I have a bigger problem with \"writing is not the bottleneck\": when I think of a bottleneck, I imagine that <em>no</em> amount of improvement will lead to productivity gains. Like if a program is bottlenecked on the network, it isn't going to get noticeably faster with 100x more ram or compute. </p>\n<p>But being able to type code 100x faster, even with without corresponding improvements to reading and imagining code, would be <strong>huge</strong>. </p>\n<p>We'll assume the average developer writes at 80 words per minute, at five characters a word, for 400 characters a minute.What could we do if we instead wrote at 8,000 words/40k characters a minute? </p>\n<h3>Writing fast</h3>\n<h4>Boilerplate is trivial</h4>\n<p>Why do people like type inference? Because writing all of the types manually is annoying. Why don't people like boilerplate? Because it's annoying to write every damn time. Programmers like features that help them write less! That's not a problem if you can write all of the boilerplate in 0.1 seconds.</p>\n<p>You still have the problem of <em>reading</em> boilerplate heavy code, but you can use the remaining 0.9 seconds to churn out an extension that parses the file and presents the boilerplate in a more legible fashion. </p>\n<h4>We can write more tooling</h4>\n<p>This is something I've noticed with LLMs: when I can churn out crappy code as a free action, I use that to write lots of tools that assist me in writing <em>good</em> code. Even if I'm bottlenecked on a large program, I can still quickly write a script that helps me with something. Most of these aren't things I would have written because they'd take too long to write! </p>\n<p>Again, not the best comparison, because LLMs also shortcut learning the relevant APIs, so also optimize the \"understanding code\" part. Then again, if I could type real fast I could more quickly whip up experiments on new apis to learn them faster. </p>\n<h4>We can do practices that slow us down in the short-term</h4>\n<p>Something like test-driven development significantly slows down how fast you write production code, because you have to spend a lot more time writing test code. Pair programming trades speed of writing code for speed of understanding code. A two-order-of-magnitude writing speedup makes both of them effectively free. Or, if you're not an eXtreme Programming fan, you can more easily follow the <a href=\"https://en.wikipedia.org/wiki/The_Power_of_10:_Rules_for_Developing_Safety-Critical_Code\" target=\"_blank\">The Power of Ten Rules</a> and blanket your code with contracts and assertions.</p>\n<h4>We could do more speculative editing</h4>\n<p>This is probably the biggest difference in how we'd work if we could write 100x faster: it'd be much easier to try changes to the code to see if they're good ideas in the first place. </p>\n<p>How often have I tried optimizing something, only to find out it didn't make a difference? How often have I done a refactoring only to end up with lower-quality code overall? Too often. Over time it makes me prefer to try things that I know will work, and only \"speculatively edit\" when I think it be a fast change. If I could code 100x faster it would absolutely lead to me trying more speculative edits.</p>\n<p>This is especially big because I believe that lots of speculative edits are high-risk, high-reward: given 50 things we could do to the code, 49 won't make a difference and one will be a major improvement. If I only have time to try five things, I have a 10% chance of hitting the jackpot. If I can try 500 things I will get that reward every single time. </p>\n<h2>Processes are built off constraints</h2>\n<p>There are just a few ideas I came up with; there are probably others. Most of them, I suspect, will share the same property in common: they change <em>the process</em> of writing code to leverage the speedup. I can totally believe that a large speedup would not remove a bottleneck in the processes we <em>currently</em> use to write code. But that's because those processes are developed work within our existing constraints. Remove a constraint and new processes become possible.</p>\n<p>The way I see it, if our current process produces 1 Utils of Software / day, a 100x writing speedup might lead to only 1.5 UoS/day. But there are other processes that produce only 0.5 UoS/d <em>because they are bottlenecked on writing speed</em>. A 100x speedup would lead to 10 UoS/day.</p>\n<p>The problem with all of this that 100x speedup isn't realistic, and it's not obvious whether a 2x improvement would lead to better processes. Then again, one of the first custom vim function scripts I wrote was an aid to writing unit tests in a particular codebase, and it lead to me writing a lot more tests. So maybe even a 2x speedup is going to be speed things up, too.</p>\n<hr/>\n<h3>Patreon Stuff</h3>\n<p>I wrote a couple of TLA+ specs to show how to model <a href=\"https://en.wikipedia.org/wiki/Fork%E2%80%93join_model\" target=\"_blank\">fork-join</a> algorithms. I'm planning on eventually writing them up for my blog/learntla but it'll be a while, so if you want to see them in the meantime I put them up on <a href=\"https://www.patreon.com/posts/fork-join-in-tla-134209395?utm_medium=clipboard_copy&utm_source=copyLink&utm_campaign=postshare_creator&utm_content=join_link\" target=\"_blank\">Patreon</a>.</p>",
          "url": "https://buttondown.com/hillelwayne/archive/maybe-writing-speed-actually-is-a-bottleneck-for/",
          "published": "2025-07-17T19:08:27.000Z",
          "updated": "2025-07-17T19:08:27.000Z",
          "content": null,
          "image": null,
          "media": [],
          "authors": [],
          "categories": []
        },
        {
          "id": "https://buttondown.com/hillelwayne/archive/logic-for-programmers-turns-one/",
          "title": "Logic for Programmers Turns One",
          "description": "<p>I released <a href=\"https://leanpub.com/logic/\" target=\"_blank\">Logic for Programmers</a> exactly one year ago today. It feels weird to celebrate the anniversary of something that isn't 1.0 yet, but software projects have a proud tradition of celebrating a dozen anniversaries before 1.0. I wanted to share about what's changed in the past year and the work for the next six+ months.</p>\n<p><img alt=\"The book cover!\" class=\"newsletter-image\" src=\"https://assets.buttondown.email/images/70ac47c9-c49f-47c0-9a05-7a9e70551d03.jpg?w=960&fit=max\"/></p>\n<h3>The Road to 0.1</h3>\n<p>I had been noodling on the idea of a logic book since the pandemic. The first time I wrote about it on the newsletter was in <a href=\"https://buttondown.com/hillelwayne/archive/predicate-logic-for-programmers/\" target=\"_blank\">2021</a>! Then I said that it would be done by June and would be \"under 50 pages\". The idea was to cover logic as a \"soft skill\" that helped you think about things like requirements and stuff.</p>\n<p>That version <em>sucked</em>. If you want to see how much it sucked, I put it up on <a href=\"https://www.patreon.com/posts/what-logic-for-133675688\" target=\"_blank\">Patreon</a>. Then I slept on the next draft for three years. Then in 2024 a lot of business fell through and I had a lot of free time, so with the help of <a href=\"https://saul.pw/\" target=\"_blank\">Saul Pwanson</a> I rewrote the book. This time I emphasized breadth over depth, trying to cover a lot more techniques.  </p>\n<p>I also decided to self-publish it instead of pitching it to a publisher. Not going the traditional route would mean I would be responsible for paying for editing, advertising, graphic design etc, but I hoped that would be compensated by <em>much</em> higher royalties. It also meant I could release the book in early access and use early sales to fund further improvements. So I wrote up a draft in <a href=\"https://www.sphinx-doc.org/en/master/\" target=\"_blank\">Sphinx</a>, compiled it to LaTeX, and uploaded the PDF to <a href=\"https://leanpub.com/\" target=\"_blank\">leanpub</a>. That was in June 2024.</p>\n<p>Since then I kept to a monthly cadence of updates, missing once in November (short-notice contract) and once last month (<a href=\"https://systemsdistributed.com/\" target=\"_blank\">Systems Distributed</a>). The book's now on v0.10. What's changed?</p>\n<h3>A LOT</h3>\n<p>v0.1 was <em>very obviously</em> an alpha, and I have made a lot of improvements since then. For one, the book no longer looks like a <a href=\"https://www.sphinx-doc.org/_/downloads/en/master/pdf/#page=13\" target=\"_blank\">Sphinx manual</a>. Compare!</p>\n<p><img alt=\"0.1 on left, 0.10 on right. Way better!\" class=\"newsletter-image\" src=\"https://assets.buttondown.email/images/e4d880ad-80b8-4360-9cae-27c07598c740.png?w=960&fit=max\"/></p>\n<p>Also, the content is very, very different. v0.1 was 19,000 words, v.10 is 31,000.<sup id=\"fnref:pagesize\"><a class=\"footnote-ref\" href=\"#fn:pagesize\">1</a></sup> This comes from new chapters on TLA+, constraint/SMT solving, logic programming, and major expansions to the existing chapters. Originally, \"Simplifying Conditionals\" was 600 words. Six hundred words! It almost fit in two pages!</p>\n<p><img alt=\"How short Simplifying Conditions USED to be\" class=\"newsletter-image\" src=\"https://assets.buttondown.email/images/31e731b7-3bdc-4ded-9b09-2a6261a323ec.png?w=960&fit=max\"/></p>\n<p>The chapter is now 2600 words, now covering condition lifting, quantifier manipulation, helper predicates, and set optimizations. All the other chapters have either gotten similar facelifts or are scheduled to get facelifts.</p>\n<p>The last big change is the addition of <a href=\"https://github.com/logicforprogrammers/book-assets\" target=\"_blank\">book assets</a>. Originally you had to manually copy over all of the code to try it out, which is a problem when there are samples in eight distinct languages! Now there are ready-to-go examples for each chapter, with instructions on how to set up each programming environment. This is also nice because it gives me breaks from writing to code instead.</p>\n<h3>How did the book do?</h3>\n<p>Leanpub's all-time visualizations are terrible, so I'll just give the summary: 1180 copies sold, $18,241 in royalties. That's a lot of money for something that isn't fully out yet! By comparison, <em>Practical TLA+</em> has made me less than half of that, despite selling over 5x as many books. Self-publishing was the right choice!</p>\n<p>In that time I've paid about $400 for the book cover (worth it) and maybe $800 in Leanpub's advertising service (probably not worth it). </p>\n<p>Right now that doesn't come close to making back the time investment, but I think it can get there post-release. I believe there's a lot more potential customers via marketing. I think post-release 10k copies sold is within reach.</p>\n<p class=\"empty-line\" style=\"height:16px; margin:0px !important;\"></p>\n<h3>Where is the book going?</h3>\n<div class=\"subscribe-form\"></div>\n<p>The main content work is rewrites: many of the chapters have not meaningfully changed since 1.0, so I am going through and rewriting them from scratch. So far four of the ten chapters have been rewritten. My (admittedly ambitious) goal is to rewrite three of them by the end of this month and another three by the end of next. I also want to do final passes on the rewritten chapters; as most of them have a few TODOs left lying around.</p>\n<p>(Also somehow in starting this newsletter and publishing it I realized that one of the chapters might be better split into two chapters, so there could well-be a tenth technique in v0.11 or v0.12!)</p>\n<p>After that, I will pass it to a copy editor while I work on improving the layout, making images, and indexing. I want to have something worthy of printing on a dead tree by 1.0. </p>\n<p>In terms of timelines, I am <strong>very roughly</strong> estimating something like this:</p>\n<ul>\n<li>Summer: final big changes and rewrites</li>\n<li>Early Autumn: graphic design and copy editing</li>\n<li>Late Autumn: proofing, figuring out printing stuff</li>\n<li>Winter: final ebook and initial print releases of 1.0.</li>\n</ul>\n<p>(If you know a service that helps get self-published books \"past the finish line\", I'd love to hear about it! Preferably something that works for a fee, not part of royalties.)</p>\n<p>This timeline may be disrupted by official client work, like a new TLA+ contract or a conference invitation.</p>\n<p>Needless to say, I am incredibly excited to complete this book and share the final version with you all. This is a book I wished for years ago, a book I wrote because nobody else would. It fills a critical gap in software educational material, and someday soon I'll be able to put a copy on my bookshelf. It's exhilarating and terrifying and above all, satisfying.</p>\n<div class=\"footnote\">\n<hr/>\n<ol>\n<li id=\"fn:pagesize\">\n<p>It's also 150 pages vs 50 pages, but admittedly this is partially because I made the book smaller with a larger font. <a class=\"footnote-backref\" href=\"#fnref:pagesize\" title=\"Jump back to footnote 1 in the text\">↩</a></p>\n</li>\n</ol>\n</div>",
          "url": "https://buttondown.com/hillelwayne/archive/logic-for-programmers-turns-one/",
          "published": "2025-07-08T18:18:52.000Z",
          "updated": "2025-07-08T18:18:52.000Z",
          "content": null,
          "image": null,
          "media": [],
          "authors": [],
          "categories": []
        },
        {
          "id": "https://buttondown.com/hillelwayne/archive/logical-quantifiers-in-software/",
          "title": "Logical Quantifiers in Software",
          "description": "<p>I realize that for all I've talked about <a href=\"https://leanpub.com/logic/\" target=\"_blank\">Logic for Programmers</a> in this newsletter, I never once explained basic logical quantifiers. They're both simple and incredibly useful, so let's do that this week! </p>\n<h3>Sets and quantifiers</h3>\n<p>A <strong>set</strong> is a collection of unordered, unique elements. <code>{1, 2, 3, …}</code> is a set, as are \"every programming language\", \"every programming language's Wikipedia page\", and \"every function ever defined in any programming language's standard library\". You can put whatever you want in a set, with some very specific limitations to avoid certain paradoxes.<sup id=\"fnref:paradox\"><a class=\"footnote-ref\" href=\"#fn:paradox\">2</a></sup> </p>\n<p>Once we have a set, we can ask \"is something true for all elements of the set\" and \"is something true for at least one element of the set?\" IE, is it true that every programming language has a <code>set</code> collection type in the core language? We would write it like this:</p>\n<div class=\"codehilite\"><pre><span></span><code># all of them\nall l in ProgrammingLanguages: HasSetType(l)\n\n# at least one\nsome l in ProgrammingLanguages: HasSetType(l)\n</code></pre></div>\n<p>This is the notation I use in the book because it's easy to read, type, and search for. Mathematicians historically had a few different formats; the one I grew up with was <code>∀x ∈ set: P(x)</code> to mean <code>all x in set</code>, and <code>∃</code> to mean <code>some</code>. I use these when writing for just myself, but find them confusing to programmers when communicating.</p>\n<p>\"All\" and \"some\" are respectively referred to as \"universal\" and \"existential\" quantifiers.</p>\n<h3>Some cool properties</h3>\n<p>We can simplify expressions with quantifiers, in the same way that we can simplify <code>!(x && y)</code> to <code>!x || !y</code>.</p>\n<p>First of all, quantifiers are commutative with themselves. <code>some x: some y: P(x,y)</code> is the same as <code>some y: some x: P(x, y)</code>. For this reason we can write <code>some x, y: P(x,y)</code> as shorthand. We can even do this when quantifying over different sets, writing <code>some x, x' in X, y in Y</code> instead of <code>some x, x' in X: some y in Y</code>. We can <em>not</em> do this with \"alternating quantifiers\":</p>\n<ul>\n<li><code>all p in Person: some m in Person: Mother(m, p)</code> says that every person has a mother.</li>\n<li><code>some m in Person: all p in Person: Mother(m, p)</code> says that someone is every person's mother.</li>\n</ul>\n<p>Second, existentials distribute over <code>||</code> while universals distribute over <code>&&</code>. \"There is some url which returns a 403 or 404\" is the same as \"there is some url which returns a 403 or some url that returns a 404\", and \"all PRs pass the linter and the test suites\" is the same as \"all PRs pass the linter and all PRs pass the test suites\".</p>\n<p>Finally, <code>some</code> and <code>all</code> are <em>duals</em>: <code>some x: P(x) == !(all x: !P(x))</code>, and vice-versa. Intuitively: if some file is malicious, it's not true that all files are benign.</p>\n<p>All these rules together mean we can manipulate quantifiers <em>almost</em> as easily as we can manipulate regular booleans, putting them in whatever form is easiest to use in programming. </p>\n<p>Speaking of which, how <em>do</em> we use this in in programming?</p>\n<h2>How we use this in programming</h2>\n<p>First of all, people clearly have a need for directly using quantifiers in code. If we have something of the form:</p>\n<div class=\"codehilite\"><pre><span></span><code>for x in list:\n    if P(x):\n        return true\nreturn false\n</code></pre></div>\n<p>That's just <code>some x in list: P(x)</code>. And this is a prevalent pattern, as you can see by using <a href=\"https://github.com/search?q=%2Ffor+.*%3A%5Cn%5Cs*if+.*%3A%5Cn%5Cs*return+%28False%7CTrue%29%5Cn%5Cs*return+%28True%7CFalse%29%2F+language%3Apython+NOT+is%3Afork&type=code\" target=\"_blank\">GitHub code search</a>. It finds over 500k examples of this pattern in Python alone! That can be simplified via using the language's built-in quantifiers: the Python would be <code>any(P(x) for x in list)</code>.</p>\n<p>(Note this is not quantifying over sets but iterables. But the idea translates cleanly enough.)</p>\n<p>More generally, quantifiers are a key way we express higher-level properties of software. What does it mean for a list to be sorted in ascending order? That <code>all i, j in 0..<len(l): if i < j then l[i] <= l[j]</code>. When should a <a href=\"https://qntm.org/ratchet\" target=\"_blank\">ratchet test fail</a>? When <code>some f in functions - exceptions: Uses(f, bad_function)</code>. Should the image classifier work upside down? <code>all i in images: classify(i) == classify(rotate(i, 180))</code>. These are the properties we verify with tests and types and <a href=\"https://www.hillelwayne.com/post/constructive/\" target=\"_blank\">MISU</a> and whatnot;<sup id=\"fnref:misu\"><a class=\"footnote-ref\" href=\"#fn:misu\">1</a></sup> it helps to be able to make them explicit!</p>\n<p>One cool use case that'll be in the book's next version: database invariants are universal statements over the set of all records, like <code>all a in accounts: a.balance > 0</code>. That's enforceable with a <a href=\"https://sqlite.org/lang_createtable.html#check_constraints\" target=\"_blank\">CHECK</a> constraint. But what about something like <code>all i, i' in intervals: NoOverlap(i, i')</code>? That isn't covered by CHECK, since it spans two rows.</p>\n<p>Quantifier duality to the rescue! The invariant is equivalent to <code>!(some i, i' in intervals: Overlap(i, i'))</code>, so is preserved if the <em>query</em> <code>SELECT COUNT(*) FROM intervals CROSS JOIN intervals …</code> returns 0 rows. This means we can test it via a <a href=\"https://sqlite.org/lang_createtrigger.html\" target=\"_blank\">database trigger</a>.<sup id=\"fnref:efficiency\"><a class=\"footnote-ref\" href=\"#fn:efficiency\">3</a></sup></p>\n<hr/>\n<p>There are a lot more use cases for quantifiers, but this is enough to introduce the ideas! Next week's the one year anniversary of the book entering early access, so I'll be writing a bit about that experience and how the book changed. It's <em>crazy</em> how crude v0.1 was compared to the current version.</p>\n<div class=\"footnote\">\n<hr/>\n<ol>\n<li id=\"fn:misu\">\n<p>MISU (\"make illegal states unrepresentable\") means using data representations that rule out invalid values. For example, if you have a <code>location -> Optional(item)</code> lookup and want to make sure that each item is in exactly one location, consider instead changing the map to <code>item -> location</code>. This is a means of <em>implementing</em> the property <code>all i in item, l, l' in location: if ItemIn(i, l) && l != l' then !ItemIn(i, l')</code>. <a class=\"footnote-backref\" href=\"#fnref:misu\" title=\"Jump back to footnote 1 in the text\">↩</a></p>\n</li>\n<li id=\"fn:paradox\">\n<p>Specifically, a set can't be an element of itself, which rules out constructing things like \"the set of all sets\" or \"the set of sets that don't contain themselves\". <a class=\"footnote-backref\" href=\"#fnref:paradox\" title=\"Jump back to footnote 2 in the text\">↩</a></p>\n</li>\n<li id=\"fn:efficiency\">\n<p>Though note that when you're inserting or updating an interval, you already <em>have</em> that row's fields in the trigger's <code>NEW</code> keyword. So you can just query <code>!(some i in intervals: Overlap(new, i'))</code>, which is more efficient. <a class=\"footnote-backref\" href=\"#fnref:efficiency\" title=\"Jump back to footnote 3 in the text\">↩</a></p>\n</li>\n</ol>\n</div>",
          "url": "https://buttondown.com/hillelwayne/archive/logical-quantifiers-in-software/",
          "published": "2025-07-02T19:44:22.000Z",
          "updated": "2025-07-02T19:44:22.000Z",
          "content": null,
          "image": null,
          "media": [],
          "authors": [],
          "categories": []
        },
        {
          "id": "https://buttondown.com/hillelwayne/archive/you-can-cheat-a-test-suite-with-a-big-enough/",
          "title": "You can cheat a test suite with a big enough polynomial",
          "description": "<p>Hi nerds, I'm back from <a href=\"https://systemsdistributed.com/\" target=\"_blank\">Systems Distributed</a>! I'd heartily recommend it, wildest conference I've been to in years. I have a lot of work to catch up on, so this will be a short newsletter.</p>\n<p>In an earlier version of my talk, I had a gag about unit tests. First I showed the test <code>f([1,2,3]) == 3</code>, then said that this was satisfied by <code>f(l) = 3</code>, <code>f(l) = l[-1]</code>, <code>f(l) = len(l)</code>, <code>f(l) = (129*l[0]-34*l[1]-617)*l[2] - 443*l[0] + 1148*l[1] - 182</code>. Then I progressively rule them out one by one with more unit tests, except the last polynomial which stubbornly passes every single test.</p>\n<p>If you're given some function of <code>f(x: int, y: int, …): int</code> and a set of unit tests asserting <a href=\"https://buttondown.com/hillelwayne/archive/oracle-testing/\" target=\"_blank\">specific inputs give specific outputs</a>, then you can find a polynomial that passes every single unit test.</p>\n<p>To find the gag, and as <a href=\"https://en.wikipedia.org/wiki/Satisfiability_modulo_theories\" target=\"_blank\">SMT</a> practice, I wrote a Python program that finds a polynomial that passes a test suite meant for <code>max</code>. It's hardcoded for three parameters and only finds 2nd-order polynomials but I think it could be generalized with enough effort.</p>\n<h2>The code</h2>\n<p>Full code <a href=\"https://gist.github.com/hwayne/0ed045a35376c786171f9cf4b55c470f\" target=\"_blank\">here</a>, breakdown below.</p>\n<div class=\"codehilite\"><pre><span></span><code><span class=\"kn\">from</span><span class=\"w\"> </span><span class=\"nn\">z3</span><span class=\"w\"> </span><span class=\"kn\">import</span> <span class=\"o\">*</span>  <span class=\"c1\"># type: ignore</span>\n<span class=\"n\">s1</span><span class=\"p\">,</span> <span class=\"n\">s2</span> <span class=\"o\">=</span> <span class=\"n\">Solver</span><span class=\"p\">(),</span> <span class=\"n\">Solver</span><span class=\"p\">()</span>\n</code></pre></div>\n<p><a href=\"https://microsoft.github.io/z3guide/\" target=\"_blank\">Z3</a> is just the particular SMT solver we use, as it has good language bindings and a lot of affordances.</p>\n<p>As part of learning SMT I wanted to do this two ways. First by putting the polynomial \"outside\" of the SMT solver in a python function, second by doing it \"natively\" in Z3. I created two solvers so I could test both versions in one run. </p>\n<div class=\"codehilite\"><pre><span></span><code><span class=\"n\">a0</span><span class=\"p\">,</span> <span class=\"n\">a</span><span class=\"p\">,</span> <span class=\"n\">b</span><span class=\"p\">,</span> <span class=\"n\">c</span><span class=\"p\">,</span> <span class=\"n\">d</span><span class=\"p\">,</span> <span class=\"n\">e</span><span class=\"p\">,</span> <span class=\"n\">f</span> <span class=\"o\">=</span> <span class=\"n\">Consts</span><span class=\"p\">(</span><span class=\"s1\">'a0 a b c d e f'</span><span class=\"p\">,</span> <span class=\"n\">IntSort</span><span class=\"p\">())</span>\n<span class=\"n\">x</span><span class=\"p\">,</span> <span class=\"n\">y</span><span class=\"p\">,</span> <span class=\"n\">z</span> <span class=\"o\">=</span> <span class=\"n\">Ints</span><span class=\"p\">(</span><span class=\"s1\">'x y z'</span><span class=\"p\">)</span>\n<span class=\"n\">t</span> <span class=\"o\">=</span> <span class=\"s2\">\"a*x+b*y+c*z+d*x*y+e*x*z+f*y*z+a0\"</span>\n</code></pre></div>\n<p>Both <code>Const('x', IntSort())</code> and <code>Int('x')</code> do the exact same thing, the latter being syntactic sugar for the former. I did not know this when I wrote the program. </p>\n<p>To keep the two versions in sync I represented the equation as a string, which I later <code>eval</code>. This is one of the rare cases where eval is a good idea, to help us experiment more quickly while learning. The polynomial is a \"2nd-order polynomial\", even though it doesn't have <code>x^2</code> terms, as it has <code>xy</code> and <code>xz</code> terms.</p>\n<div class=\"codehilite\"><pre><span></span><code><span class=\"n\">lambdamax</span> <span class=\"o\">=</span> <span class=\"k\">lambda</span> <span class=\"n\">x</span><span class=\"p\">,</span> <span class=\"n\">y</span><span class=\"p\">,</span> <span class=\"n\">z</span><span class=\"p\">:</span> <span class=\"nb\">eval</span><span class=\"p\">(</span><span class=\"n\">t</span><span class=\"p\">)</span>\n\n<span class=\"n\">z3max</span> <span class=\"o\">=</span> <span class=\"n\">Function</span><span class=\"p\">(</span><span class=\"s1\">'z3max'</span><span class=\"p\">,</span> <span class=\"n\">IntSort</span><span class=\"p\">(),</span> <span class=\"n\">IntSort</span><span class=\"p\">(),</span> <span class=\"n\">IntSort</span><span class=\"p\">(),</span>  <span class=\"n\">IntSort</span><span class=\"p\">())</span>\n<span class=\"n\">s1</span><span class=\"o\">.</span><span class=\"n\">add</span><span class=\"p\">(</span><span class=\"n\">ForAll</span><span class=\"p\">([</span><span class=\"n\">x</span><span class=\"p\">,</span> <span class=\"n\">y</span><span class=\"p\">,</span> <span class=\"n\">z</span><span class=\"p\">],</span> <span class=\"n\">z3max</span><span class=\"p\">(</span><span class=\"n\">x</span><span class=\"p\">,</span> <span class=\"n\">y</span><span class=\"p\">,</span> <span class=\"n\">z</span><span class=\"p\">)</span> <span class=\"o\">==</span> <span class=\"nb\">eval</span><span class=\"p\">(</span><span class=\"n\">t</span><span class=\"p\">)))</span>\n</code></pre></div>\n<p><code>lambdamax</code> is pretty straightforward: create a lambda with three parameters and <code>eval</code> the string. The string \"<code>a*x</code>\" then becomes the python expression <code>a*x</code>, <code>a</code> is an SMT symbol, while the <code>x</code> SMT symbol is shadowed by the lambda parameter. To reiterate, a terrible idea in practice, but a good way to learn faster.</p>\n<p><code>z3max</code> function is a little more complex. <code>Function</code> takes an identifier string and N \"sorts\" (roughly the same as programming types). The first <code>N-1</code> sorts define the parameters of the function, while the last becomes the output. So here I assign the string identifier <code>\"z3max\"</code> to be a function with signature <code>(int, int, int) -> int</code>.</p>\n<p>I can load the function into the model by specifying constraints on what <code>z3max</code> <em>could</em> be. This could either be a strict input/output, as will be done later, or a <code>ForAll</code> over all possible inputs. Here I just use that directly to say \"for all inputs, the function should match this polynomial.\" But I could do more complicated constraints, like commutativity (<code>f(x, y) == f(y, x)</code>) or monotonicity (<code>Implies(x < y, f(x) <= f(y))</code>).</p>\n<p>Note <code>ForAll</code> takes a list of z3 symbols to quantify over. That's the only reason we need to define <code>x, y, z</code> in the first place. The lambda version doesn't need them. </p>\n<div class=\"codehilite\"><pre><span></span><code><span class=\"n\">inputs</span> <span class=\"o\">=</span> <span class=\"p\">[(</span><span class=\"mi\">1</span><span class=\"p\">,</span><span class=\"mi\">2</span><span class=\"p\">,</span><span class=\"mi\">3</span><span class=\"p\">),</span> <span class=\"p\">(</span><span class=\"mi\">4</span><span class=\"p\">,</span> <span class=\"mi\">2</span><span class=\"p\">,</span> <span class=\"mi\">2</span><span class=\"p\">),</span> <span class=\"p\">(</span><span class=\"mi\">1</span><span class=\"p\">,</span> <span class=\"mi\">1</span><span class=\"p\">,</span> <span class=\"mi\">1</span><span class=\"p\">),</span> <span class=\"p\">(</span><span class=\"mi\">3</span><span class=\"p\">,</span> <span class=\"mi\">5</span><span class=\"p\">,</span> <span class=\"mi\">4</span><span class=\"p\">)]</span>\n\n<span class=\"k\">for</span> <span class=\"n\">g</span> <span class=\"ow\">in</span> <span class=\"n\">inputs</span><span class=\"p\">:</span>\n    <span class=\"n\">s1</span><span class=\"o\">.</span><span class=\"n\">add</span><span class=\"p\">(</span><span class=\"n\">z3max</span><span class=\"p\">(</span><span class=\"o\">*</span><span class=\"n\">g</span><span class=\"p\">)</span> <span class=\"o\">==</span> <span class=\"nb\">max</span><span class=\"p\">(</span><span class=\"o\">*</span><span class=\"n\">g</span><span class=\"p\">))</span>\n    <span class=\"n\">s2</span><span class=\"o\">.</span><span class=\"n\">add</span><span class=\"p\">(</span><span class=\"n\">lambdamax</span><span class=\"p\">(</span><span class=\"o\">*</span><span class=\"n\">g</span><span class=\"p\">)</span> <span class=\"o\">==</span> <span class=\"nb\">max</span><span class=\"p\">(</span><span class=\"o\">*</span><span class=\"n\">g</span><span class=\"p\">))</span>\n</code></pre></div>\n<p>This sets up the joke: adding constraints to each solver that the polynomial it finds must, for a fixed list of triplets, return the max of each triplet.</p>\n<div class=\"codehilite\"><pre><span></span><code><span class=\"k\">for</span> <span class=\"n\">s</span><span class=\"p\">,</span> <span class=\"n\">func</span> <span class=\"ow\">in</span> <span class=\"p\">[(</span><span class=\"n\">s1</span><span class=\"p\">,</span> <span class=\"n\">z3max</span><span class=\"p\">),</span> <span class=\"p\">(</span><span class=\"n\">s2</span><span class=\"p\">,</span> <span class=\"n\">lambdamax</span><span class=\"p\">)]:</span>\n    <span class=\"k\">if</span> <span class=\"n\">s</span><span class=\"o\">.</span><span class=\"n\">check</span><span class=\"p\">()</span> <span class=\"o\">==</span> <span class=\"n\">sat</span><span class=\"p\">:</span>\n        <span class=\"n\">m</span> <span class=\"o\">=</span> <span class=\"n\">s</span><span class=\"o\">.</span><span class=\"n\">model</span><span class=\"p\">()</span>\n        <span class=\"k\">for</span> <span class=\"n\">x</span><span class=\"p\">,</span> <span class=\"n\">y</span><span class=\"p\">,</span> <span class=\"n\">z</span> <span class=\"ow\">in</span> <span class=\"n\">inputs</span><span class=\"p\">:</span>\n            <span class=\"nb\">print</span><span class=\"p\">(</span><span class=\"sa\">f</span><span class=\"s2\">\"max([</span><span class=\"si\">{</span><span class=\"n\">x</span><span class=\"si\">}</span><span class=\"s2\">, </span><span class=\"si\">{</span><span class=\"n\">y</span><span class=\"si\">}</span><span class=\"s2\">, </span><span class=\"si\">{</span><span class=\"n\">z</span><span class=\"si\">}</span><span class=\"s2\">]) =\"</span><span class=\"p\">,</span> <span class=\"n\">m</span><span class=\"o\">.</span><span class=\"n\">evaluate</span><span class=\"p\">(</span><span class=\"n\">func</span><span class=\"p\">(</span><span class=\"n\">x</span><span class=\"p\">,</span> <span class=\"n\">y</span><span class=\"p\">,</span> <span class=\"n\">z</span><span class=\"p\">)))</span>\n        <span class=\"nb\">print</span><span class=\"p\">(</span><span class=\"sa\">f</span><span class=\"s2\">\"max([x, y, z]) = </span><span class=\"si\">{</span><span class=\"n\">m</span><span class=\"p\">[</span><span class=\"n\">a</span><span class=\"p\">]</span><span class=\"si\">}</span><span class=\"s2\">x + </span><span class=\"si\">{</span><span class=\"n\">m</span><span class=\"p\">[</span><span class=\"n\">b</span><span class=\"p\">]</span><span class=\"si\">}</span><span class=\"s2\">y\"</span><span class=\"p\">,</span>\n            <span class=\"sa\">f</span><span class=\"s2\">\"+ </span><span class=\"si\">{</span><span class=\"n\">m</span><span class=\"p\">[</span><span class=\"n\">c</span><span class=\"p\">]</span><span class=\"si\">}</span><span class=\"s2\">z +\"</span><span class=\"p\">,</span> <span class=\"c1\"># linebreaks added for newsletter rendering</span>\n            <span class=\"sa\">f</span><span class=\"s2\">\"</span><span class=\"si\">{</span><span class=\"n\">m</span><span class=\"p\">[</span><span class=\"n\">d</span><span class=\"p\">]</span><span class=\"si\">}</span><span class=\"s2\">xy + </span><span class=\"si\">{</span><span class=\"n\">m</span><span class=\"p\">[</span><span class=\"n\">e</span><span class=\"p\">]</span><span class=\"si\">}</span><span class=\"s2\">xz + </span><span class=\"si\">{</span><span class=\"n\">m</span><span class=\"p\">[</span><span class=\"n\">f</span><span class=\"p\">]</span><span class=\"si\">}</span><span class=\"s2\">yz + </span><span class=\"si\">{</span><span class=\"n\">m</span><span class=\"p\">[</span><span class=\"n\">a0</span><span class=\"p\">]</span><span class=\"si\">}</span><span class=\"se\">\\n</span><span class=\"s2\">\"</span><span class=\"p\">)</span>\n</code></pre></div>\n<p>Output:</p>\n<div class=\"codehilite\"><pre><span></span><code>max([1, 2, 3]) = 3\n# etc\nmax([x, y, z]) = -133x + 130y + -10z + -2xy + 62xz + -46yz + 0\n\nmax([1, 2, 3]) = 3\n# etc\nmax([x, y, z]) = -17x + 16y + 0z + 0xy + 8xz + -6yz + 0\n</code></pre></div>\n<p>I find that <code>z3max</code> (top) consistently finds larger coefficients than <code>lambdamax</code> does. I don't know why.</p>\n<h3>Practical Applications</h3>\n<p><strong>Test-Driven Development</strong> recommends a strict \"red-green refactor\" cycle. Write a new failing test, make the new test pass, then go back and refactor. Well, the easiest way to make the new test pass would be to paste in a new polynomial, so that's what you should be doing. You can even do this all automatically: have a script read the set of test cases, pass them to the solver, and write the new polynomial to your code file. All you need to do is write the tests!</p>\n<h3>Pedagogical Notes</h3>\n<p>Writing the script took me a couple of hours. I'm sure an LLM could have whipped it all up in five minutes but I really want to <em>learn</em> SMT and <a href=\"https://www.sciencedirect.com/science/article/pii/S0747563224002541\" target=\"_blank\">LLMs <em>may</em> decrease learning retention</a>.<sup id=\"fnref:caveat\"><a class=\"footnote-ref\" href=\"#fn:caveat\">1</a></sup> Z3 documentation is not... great for non-academics, though, and most other SMT solvers have even worse docs. One useful trick I use regularly is to use Github code search to find code using the same APIs and study how that works. Turns out reading API-heavy code is a lot easier than writing it!</p>\n<p>Anyway, I'm very, very slowly feeling like I'm getting the basics on how to use SMT. I don't have any practical use cases yet, but I wanted to learn this skill for a while and glad I finally did.</p>\n<div class=\"footnote\">\n<hr/>\n<ol>\n<li id=\"fn:caveat\">\n<p>Caveat I have not actually <em>read</em> the study, for all I know it could have a sample size of three people, I'll get around to it eventually <a class=\"footnote-backref\" href=\"#fnref:caveat\" title=\"Jump back to footnote 1 in the text\">↩</a></p>\n</li>\n</ol>\n</div>",
          "url": "https://buttondown.com/hillelwayne/archive/you-can-cheat-a-test-suite-with-a-big-enough/",
          "published": "2025-06-24T16:27:01.000Z",
          "updated": "2025-06-24T16:27:01.000Z",
          "content": null,
          "image": null,
          "media": [],
          "authors": [],
          "categories": []
        },
        {
          "id": "https://buttondown.com/hillelwayne/archive/solving-linkedin-queens-with-smt/",
          "title": "Solving LinkedIn Queens with SMT",
          "description": "<h3>No newsletter next week</h3>\n<p>I’ll be speaking at <a href=\"https://systemsdistributed.com/\" target=\"_blank\">Systems Distributed</a>. My talk isn't close to done yet, which is why this newsletter is both late and short. </p>\n<h1>Solving LinkedIn Queens in SMT</h1>\n<p>The article <a href=\"https://codingnest.com/modern-sat-solvers-fast-neat-underused-part-1-of-n/\" target=\"_blank\">Modern SAT solvers: fast, neat and underused</a> claims that SAT solvers<sup id=\"fnref:SAT\"><a class=\"footnote-ref\" href=\"#fn:SAT\">1</a></sup> are \"criminally underused by the industry\". A while back on the newsletter I asked \"why\": how come they're so powerful and yet nobody uses them? Many experts responded saying the reason is that encoding SAT kinda sucked and they rather prefer using tools that compile to SAT. </p>\n<p>I was reminded of this when I read <a href=\"https://ryanberger.me/posts/queens/\" target=\"_blank\">Ryan Berger's post</a> on solving “LinkedIn Queens” as a SAT problem. </p>\n<p>A quick overview of Queens. You’re presented with an NxN grid divided into N regions, and have to place N queens so that there is exactly one queen in each row, column, and region. While queens can be on the same diagonal, they <em>cannot</em> be adjacently diagonal.</p>\n<p>(Important note: Linkedin “Queens” is a variation on the puzzle game <a href=\"https://starbattle.puzzlebaron.com/\" target=\"_blank\">Star Battle</a>, which is the same except the number of stars you place in each row/column/region varies per puzzle, and is usually two. This is also why 'queens' don’t capture like chess queens.)</p>\n<p><img alt=\"An image of a solved queens board. Copied from https://ryanberger.me/posts/queens\" class=\"newsletter-image\" src=\"https://assets.buttondown.email/images/96f6f923-331f-424d-8641-fe6753e1c2ca.png?w=960&fit=max\"/></p>\n<p>Ryan solved this by writing Queens as a SAT problem, expressing properties like \"there is exactly one queen in row 3\" as a large number of boolean clauses. <a href=\"https://ryanberger.me/posts/queens/\" target=\"_blank\">Go read his post, it's pretty cool</a>. What leapt out to me was that he used <a href=\"https://cvc5.github.io/\" target=\"_blank\">CVC5</a>, an <strong>SMT</strong> solver.<sup id=\"fnref:SMT\"><a class=\"footnote-ref\" href=\"#fn:SMT\">2</a></sup> SMT solvers are \"higher-level\" than SAT, capable of handling more data types than just boolean variables. It's a lot easier to solve the problem at the SMT level than at the SAT level. To show this, I whipped up a short demo of solving the same problem in <a href=\"https://github.com/Z3Prover/z3/wiki\" target=\"_blank\">Z3</a> (via the <a href=\"https://pypi.org/project/z3-solver/\" target=\"_blank\">Python API</a>).</p>\n<p><a href=\"https://gist.github.com/hwayne/c5de7bc52e733995311236666bedecd3\" target=\"_blank\">Full code here</a>, which you can compare to Ryan's SAT solution <a href=\"https://github.com/ryan-berger/queens/blob/master/main.py\" target=\"_blank\">here</a>. I didn't do a whole lot of cleanup on it (again, time crunch!), but short explanation below.</p>\n<h3>The code</h3>\n<div class=\"codehilite\"><pre><span></span><code><span class=\"kn\">from</span><span class=\"w\"> </span><span class=\"nn\">z3</span><span class=\"w\"> </span><span class=\"kn\">import</span> <span class=\"o\">*</span> <span class=\"c1\"># type: ignore</span>\n<span class=\"kn\">from</span><span class=\"w\"> </span><span class=\"nn\">itertools</span><span class=\"w\"> </span><span class=\"kn\">import</span> <span class=\"n\">combinations</span><span class=\"p\">,</span> <span class=\"n\">chain</span><span class=\"p\">,</span> <span class=\"n\">product</span>\n<span class=\"n\">solver</span> <span class=\"o\">=</span> <span class=\"n\">Solver</span><span class=\"p\">()</span>\n<span class=\"n\">size</span> <span class=\"o\">=</span> <span class=\"mi\">9</span> <span class=\"c1\"># N</span>\n</code></pre></div>\n<p>Initial setup and modules. <code>size</code> is the number of rows/columns/regions in the board, which I'll call <code>N</code> below.</p>\n<div class=\"codehilite\"><pre><span></span><code><span class=\"c1\"># queens[n] = col of queen on row n</span>\n<span class=\"c1\"># by construction, not on same row</span>\n<span class=\"n\">queens</span> <span class=\"o\">=</span> <span class=\"n\">IntVector</span><span class=\"p\">(</span><span class=\"s1\">'q'</span><span class=\"p\">,</span> <span class=\"n\">size</span><span class=\"p\">)</span> \n</code></pre></div>\n<p>SAT represents the queen positions via N² booleans: <code>q_00</code> means that a Queen is on row 0 and column 0, <code>!q_05</code> means a queen <em>isn't</em> on row 0 col 5, etc. In SMT we can instead encode it as N integers: <code>q_0 = 5</code> means that the queen on row 0 is positioned at column 5. This immediately enforces one class of constraints for us: we don't need any constraints saying \"exactly one queen per row\", because that's embedded in the definition of <code>queens</code>!</p>\n<p>(Incidentally, using 0-based indexing for the board was a mistake on my part, it makes correctly encoding the regions later really painful.)</p>\n<p>To actually make the variables <code>[q_0, q_1, …]</code>, we use the Z3 affordance <code>IntVector(str, n)</code> for making <code>n</code> variables at once.</p>\n<div class=\"codehilite\"><pre><span></span><code><span class=\"n\">solver</span><span class=\"o\">.</span><span class=\"n\">add</span><span class=\"p\">([</span><span class=\"n\">And</span><span class=\"p\">(</span><span class=\"mi\">0</span> <span class=\"o\"><=</span> <span class=\"n\">i</span><span class=\"p\">,</span> <span class=\"n\">i</span> <span class=\"o\"><</span> <span class=\"n\">size</span><span class=\"p\">)</span> <span class=\"k\">for</span> <span class=\"n\">i</span> <span class=\"ow\">in</span> <span class=\"n\">queens</span><span class=\"p\">])</span>\n<span class=\"c1\"># not on same column</span>\n<span class=\"n\">solver</span><span class=\"o\">.</span><span class=\"n\">add</span><span class=\"p\">(</span><span class=\"n\">Distinct</span><span class=\"p\">(</span><span class=\"n\">queens</span><span class=\"p\">))</span>\n</code></pre></div>\n<p>First we constrain all the integers to <code>[0, N)</code>, then use the <em>incredibly</em> handy <code>Distinct</code> constraint to force all the integers to have different values. This guarantees at most one queen per column, which by the <a href=\"https://en.wikipedia.org/wiki/Pigeonhole_principle\" target=\"_blank\">pigeonhole principle</a> means there is exactly one queen per column.</p>\n<div class=\"codehilite\"><pre><span></span><code><span class=\"c1\"># not diagonally adjacent</span>\n<span class=\"k\">for</span> <span class=\"n\">i</span> <span class=\"ow\">in</span> <span class=\"nb\">range</span><span class=\"p\">(</span><span class=\"n\">size</span><span class=\"o\">-</span><span class=\"mi\">1</span><span class=\"p\">):</span>\n    <span class=\"n\">q1</span><span class=\"p\">,</span> <span class=\"n\">q2</span> <span class=\"o\">=</span> <span class=\"n\">queens</span><span class=\"p\">[</span><span class=\"n\">i</span><span class=\"p\">],</span> <span class=\"n\">queens</span><span class=\"p\">[</span><span class=\"n\">i</span><span class=\"o\">+</span><span class=\"mi\">1</span><span class=\"p\">]</span>\n    <span class=\"n\">solver</span><span class=\"o\">.</span><span class=\"n\">add</span><span class=\"p\">(</span><span class=\"n\">Abs</span><span class=\"p\">(</span><span class=\"n\">q1</span> <span class=\"o\">-</span> <span class=\"n\">q2</span><span class=\"p\">)</span> <span class=\"o\">!=</span> <span class=\"mi\">1</span><span class=\"p\">)</span>\n</code></pre></div>\n<p>One of the rules is that queens can't be adjacent. We already know that they can't be horizontally or vertically adjacent via other constraints, which leaves the diagonals. We only need to add constraints that, for each queen, there is no queen in the lower-left or lower-right corner, aka <code>q_3 != q_2 ± 1</code>. We don't need to check the top corners because if <code>q_1</code> is in the upper-left corner of <code>q_2</code>, then <code>q_2</code> is in the lower-right corner of <code>q_1</code>!</p>\n<p>That covers everything except the \"one queen per region\" constraint. But the regions are the tricky part, which we should expect because we vary the difficulty of queens games by varying the regions.</p>\n<div class=\"codehilite\"><pre><span></span><code><span class=\"n\">regions</span> <span class=\"o\">=</span> <span class=\"p\">{</span>\n        <span class=\"s2\">\"purple\"</span><span class=\"p\">:</span> <span class=\"p\">[(</span><span class=\"mi\">0</span><span class=\"p\">,</span> <span class=\"mi\">0</span><span class=\"p\">),</span> <span class=\"p\">(</span><span class=\"mi\">0</span><span class=\"p\">,</span> <span class=\"mi\">1</span><span class=\"p\">),</span> <span class=\"p\">(</span><span class=\"mi\">0</span><span class=\"p\">,</span> <span class=\"mi\">2</span><span class=\"p\">),</span> <span class=\"p\">(</span><span class=\"mi\">0</span><span class=\"p\">,</span> <span class=\"mi\">3</span><span class=\"p\">),</span> <span class=\"p\">(</span><span class=\"mi\">0</span><span class=\"p\">,</span> <span class=\"mi\">4</span><span class=\"p\">),</span> <span class=\"p\">(</span><span class=\"mi\">0</span><span class=\"p\">,</span> <span class=\"mi\">5</span><span class=\"p\">),</span> <span class=\"p\">(</span><span class=\"mi\">0</span><span class=\"p\">,</span> <span class=\"mi\">6</span><span class=\"p\">),</span> <span class=\"p\">(</span><span class=\"mi\">0</span><span class=\"p\">,</span> <span class=\"mi\">7</span><span class=\"p\">),</span> <span class=\"p\">(</span><span class=\"mi\">0</span><span class=\"p\">,</span> <span class=\"mi\">8</span><span class=\"p\">),</span>\n                   <span class=\"p\">(</span><span class=\"mi\">1</span><span class=\"p\">,</span> <span class=\"mi\">0</span><span class=\"p\">),</span> <span class=\"p\">(</span><span class=\"mi\">2</span><span class=\"p\">,</span> <span class=\"mi\">0</span><span class=\"p\">),</span> <span class=\"p\">(</span><span class=\"mi\">3</span><span class=\"p\">,</span> <span class=\"mi\">0</span><span class=\"p\">),</span> <span class=\"p\">(</span><span class=\"mi\">4</span><span class=\"p\">,</span> <span class=\"mi\">0</span><span class=\"p\">),</span> <span class=\"p\">(</span><span class=\"mi\">5</span><span class=\"p\">,</span> <span class=\"mi\">0</span><span class=\"p\">),</span> <span class=\"p\">(</span><span class=\"mi\">6</span><span class=\"p\">,</span> <span class=\"mi\">0</span><span class=\"p\">),</span> <span class=\"p\">(</span><span class=\"mi\">7</span><span class=\"p\">,</span> <span class=\"mi\">0</span><span class=\"p\">),</span> <span class=\"p\">(</span><span class=\"mi\">8</span><span class=\"p\">,</span> <span class=\"mi\">0</span><span class=\"p\">),</span>\n                   <span class=\"p\">(</span><span class=\"mi\">1</span><span class=\"p\">,</span> <span class=\"mi\">1</span><span class=\"p\">),</span> <span class=\"p\">(</span><span class=\"mi\">8</span><span class=\"p\">,</span> <span class=\"mi\">1</span><span class=\"p\">)],</span>\n        <span class=\"s2\">\"red\"</span><span class=\"p\">:</span> <span class=\"p\">[(</span><span class=\"mi\">1</span><span class=\"p\">,</span> <span class=\"mi\">2</span><span class=\"p\">),</span> <span class=\"p\">(</span><span class=\"mi\">2</span><span class=\"p\">,</span> <span class=\"mi\">2</span><span class=\"p\">),</span> <span class=\"p\">(</span><span class=\"mi\">2</span><span class=\"p\">,</span> <span class=\"mi\">1</span><span class=\"p\">),</span> <span class=\"p\">(</span><span class=\"mi\">3</span><span class=\"p\">,</span> <span class=\"mi\">1</span><span class=\"p\">),</span> <span class=\"p\">(</span><span class=\"mi\">4</span><span class=\"p\">,</span> <span class=\"mi\">1</span><span class=\"p\">),</span> <span class=\"p\">(</span><span class=\"mi\">5</span><span class=\"p\">,</span> <span class=\"mi\">1</span><span class=\"p\">),</span> <span class=\"p\">(</span><span class=\"mi\">6</span><span class=\"p\">,</span> <span class=\"mi\">1</span><span class=\"p\">),</span> <span class=\"p\">(</span><span class=\"mi\">6</span><span class=\"p\">,</span> <span class=\"mi\">2</span><span class=\"p\">),</span> <span class=\"p\">(</span><span class=\"mi\">7</span><span class=\"p\">,</span> <span class=\"mi\">1</span><span class=\"p\">),</span> <span class=\"p\">(</span><span class=\"mi\">7</span><span class=\"p\">,</span> <span class=\"mi\">2</span><span class=\"p\">),</span> <span class=\"p\">(</span><span class=\"mi\">8</span><span class=\"p\">,</span> <span class=\"mi\">2</span><span class=\"p\">),</span> <span class=\"p\">(</span><span class=\"mi\">8</span><span class=\"p\">,</span> <span class=\"mi\">3</span><span class=\"p\">),],</span>\n        <span class=\"c1\"># you get the picture</span>\n        <span class=\"p\">}</span>\n\n<span class=\"c1\"># Some checking code left out, see below</span>\n</code></pre></div>\n<p>The region has to be manually coded in, which is a huge pain.</p>\n<p>(In the link, some validation code follows. Since it breaks up explaining the model I put it in the next section.)</p>\n<div class=\"codehilite\"><pre><span></span><code><span class=\"k\">for</span> <span class=\"n\">r</span> <span class=\"ow\">in</span> <span class=\"n\">regions</span><span class=\"o\">.</span><span class=\"n\">values</span><span class=\"p\">():</span>\n    <span class=\"n\">solver</span><span class=\"o\">.</span><span class=\"n\">add</span><span class=\"p\">(</span><span class=\"n\">Or</span><span class=\"p\">(</span>\n        <span class=\"o\">*</span><span class=\"p\">[</span><span class=\"n\">queens</span><span class=\"p\">[</span><span class=\"n\">row</span><span class=\"p\">]</span> <span class=\"o\">==</span> <span class=\"n\">col</span> <span class=\"k\">for</span> <span class=\"p\">(</span><span class=\"n\">row</span><span class=\"p\">,</span> <span class=\"n\">col</span><span class=\"p\">)</span> <span class=\"ow\">in</span> <span class=\"n\">r</span><span class=\"p\">]</span>\n        <span class=\"p\">))</span>\n</code></pre></div>\n<p>Finally we have the region constraint. The easiest way I found to say \"there is exactly one queen in each region\" is to say \"there is a queen in region 1 and a queen in region 2 and a queen in region 3\" etc.\" Then to say \"there is a queen in region <code>purple</code>\" I wrote \"<code>q_0 = 0</code> OR <code>q_0 = 1</code> OR … OR <code>q_1 = 0</code> etc.\" </p>\n<p>Why iterate over every position in the region instead of doing something like <code>(0, q[0]) in r</code>? I tried that but it's not an expression that Z3 supports.</p>\n<div class=\"codehilite\"><pre><span></span><code><span class=\"k\">if</span> <span class=\"n\">solver</span><span class=\"o\">.</span><span class=\"n\">check</span><span class=\"p\">()</span> <span class=\"o\">==</span> <span class=\"n\">sat</span><span class=\"p\">:</span>\n    <span class=\"n\">m</span> <span class=\"o\">=</span> <span class=\"n\">solver</span><span class=\"o\">.</span><span class=\"n\">model</span><span class=\"p\">()</span>\n    <span class=\"nb\">print</span><span class=\"p\">([(</span><span class=\"n\">l</span><span class=\"p\">,</span> <span class=\"n\">m</span><span class=\"p\">[</span><span class=\"n\">l</span><span class=\"p\">])</span> <span class=\"k\">for</span> <span class=\"n\">l</span> <span class=\"ow\">in</span> <span class=\"n\">queens</span><span class=\"p\">])</span>\n</code></pre></div>\n<p>Finally, we solve and print the positions. Running this gives me:</p>\n<div class=\"codehilite\"><pre><span></span><code><span class=\"p\">[(</span><span class=\"n\">q__0</span><span class=\"p\">,</span> <span class=\"mi\">0</span><span class=\"p\">),</span> <span class=\"p\">(</span><span class=\"n\">q__1</span><span class=\"p\">,</span> <span class=\"mi\">5</span><span class=\"p\">),</span> <span class=\"p\">(</span><span class=\"n\">q__2</span><span class=\"p\">,</span> <span class=\"mi\">8</span><span class=\"p\">),</span> \n <span class=\"p\">(</span><span class=\"n\">q__3</span><span class=\"p\">,</span> <span class=\"mi\">2</span><span class=\"p\">),</span> <span class=\"p\">(</span><span class=\"n\">q__4</span><span class=\"p\">,</span> <span class=\"mi\">7</span><span class=\"p\">),</span> <span class=\"p\">(</span><span class=\"n\">q__5</span><span class=\"p\">,</span> <span class=\"mi\">4</span><span class=\"p\">),</span> \n <span class=\"p\">(</span><span class=\"n\">q__6</span><span class=\"p\">,</span> <span class=\"mi\">1</span><span class=\"p\">),</span> <span class=\"p\">(</span><span class=\"n\">q__7</span><span class=\"p\">,</span> <span class=\"mi\">3</span><span class=\"p\">),</span> <span class=\"p\">(</span><span class=\"n\">q__8</span><span class=\"p\">,</span> <span class=\"mi\">6</span><span class=\"p\">)]</span>\n</code></pre></div>\n<p>Which is the correct solution to the queens puzzle. I didn't benchmark the solution times, but I imagine it's considerably slower than a raw SAT solver. <a href=\"https://github.com/audemard/glucose\" target=\"_blank\">Glucose</a> is really, really fast.</p>\n<p>But even so, solving the problem with SMT was a lot <em>easier</em> than solving it with SAT. That satisfies me as an explanation for why people prefer it to SAT.</p>\n<h3>Sanity checks</h3>\n<p>One bit I glossed over earlier was the sanity checking code. I <em>knew for sure</em> that I was going to make a mistake encoding the <code>region</code>, and the solver wasn't going to provide useful information abut what I did wrong.  In cases like these, I like adding small tests and checks to catch mistakes early, because the solver certainly isn't going to catch them!</p>\n<div class=\"codehilite\"><pre><span></span><code><span class=\"n\">all_squares</span> <span class=\"o\">=</span> <span class=\"nb\">set</span><span class=\"p\">(</span><span class=\"n\">product</span><span class=\"p\">(</span><span class=\"nb\">range</span><span class=\"p\">(</span><span class=\"n\">size</span><span class=\"p\">),</span> <span class=\"n\">repeat</span><span class=\"o\">=</span><span class=\"mi\">2</span><span class=\"p\">))</span>\n<span class=\"k\">def</span><span class=\"w\"> </span><span class=\"nf\">test_i_set_up_problem_right</span><span class=\"p\">():</span>\n    <span class=\"k\">assert</span> <span class=\"n\">all_squares</span> <span class=\"o\">==</span> <span class=\"nb\">set</span><span class=\"p\">(</span><span class=\"n\">chain</span><span class=\"o\">.</span><span class=\"n\">from_iterable</span><span class=\"p\">(</span><span class=\"n\">regions</span><span class=\"o\">.</span><span class=\"n\">values</span><span class=\"p\">()))</span>\n\n    <span class=\"k\">for</span> <span class=\"n\">r1</span><span class=\"p\">,</span> <span class=\"n\">r2</span> <span class=\"ow\">in</span> <span class=\"n\">combinations</span><span class=\"p\">(</span><span class=\"n\">regions</span><span class=\"o\">.</span><span class=\"n\">values</span><span class=\"p\">(),</span> <span class=\"mi\">2</span><span class=\"p\">):</span>\n        <span class=\"k\">assert</span> <span class=\"ow\">not</span> <span class=\"nb\">set</span><span class=\"p\">(</span><span class=\"n\">r1</span><span class=\"p\">)</span> <span class=\"o\">&</span> <span class=\"nb\">set</span><span class=\"p\">(</span><span class=\"n\">r2</span><span class=\"p\">),</span> <span class=\"nb\">set</span><span class=\"p\">(</span><span class=\"n\">r1</span><span class=\"p\">)</span> <span class=\"o\">&</span> <span class=\"nb\">set</span><span class=\"p\">(</span><span class=\"n\">r2</span><span class=\"p\">)</span>\n</code></pre></div>\n<p>The first check was a quick test that I didn't leave any squares out, or accidentally put the same square in both regions. Converting the values into sets makes both checks a lot easier. Honestly I don't know why I didn't just use sets from the start, sets are great.</p>\n<div class=\"codehilite\"><pre><span></span><code><span class=\"k\">def</span><span class=\"w\"> </span><span class=\"nf\">render_regions</span><span class=\"p\">():</span>\n    <span class=\"n\">colormap</span> <span class=\"o\">=</span> <span class=\"p\">[</span><span class=\"s2\">\"purple\"</span><span class=\"p\">,</span>  <span class=\"s2\">\"red\"</span><span class=\"p\">,</span> <span class=\"s2\">\"brown\"</span><span class=\"p\">,</span> <span class=\"s2\">\"white\"</span><span class=\"p\">,</span> <span class=\"s2\">\"green\"</span><span class=\"p\">,</span> <span class=\"s2\">\"yellow\"</span><span class=\"p\">,</span> <span class=\"s2\">\"orange\"</span><span class=\"p\">,</span> <span class=\"s2\">\"blue\"</span><span class=\"p\">,</span> <span class=\"s2\">\"pink\"</span><span class=\"p\">]</span>\n    <span class=\"n\">board</span> <span class=\"o\">=</span> <span class=\"p\">[[</span><span class=\"mi\">0</span> <span class=\"k\">for</span> <span class=\"n\">_</span> <span class=\"ow\">in</span> <span class=\"nb\">range</span><span class=\"p\">(</span><span class=\"n\">size</span><span class=\"p\">)]</span> <span class=\"k\">for</span> <span class=\"n\">_</span> <span class=\"ow\">in</span> <span class=\"nb\">range</span><span class=\"p\">(</span><span class=\"n\">size</span><span class=\"p\">)]</span> \n    <span class=\"k\">for</span> <span class=\"p\">(</span><span class=\"n\">row</span><span class=\"p\">,</span> <span class=\"n\">col</span><span class=\"p\">)</span> <span class=\"ow\">in</span> <span class=\"n\">all_squares</span><span class=\"p\">:</span>\n        <span class=\"k\">for</span> <span class=\"n\">color</span><span class=\"p\">,</span> <span class=\"n\">region</span> <span class=\"ow\">in</span> <span class=\"n\">regions</span><span class=\"o\">.</span><span class=\"n\">items</span><span class=\"p\">():</span>\n            <span class=\"k\">if</span> <span class=\"p\">(</span><span class=\"n\">row</span><span class=\"p\">,</span> <span class=\"n\">col</span><span class=\"p\">)</span> <span class=\"ow\">in</span> <span class=\"n\">region</span><span class=\"p\">:</span>\n                <span class=\"n\">board</span><span class=\"p\">[</span><span class=\"n\">row</span><span class=\"p\">][</span><span class=\"n\">col</span><span class=\"p\">]</span> <span class=\"o\">=</span> <span class=\"n\">colormap</span><span class=\"o\">.</span><span class=\"n\">index</span><span class=\"p\">(</span><span class=\"n\">color</span><span class=\"p\">)</span><span class=\"o\">+</span><span class=\"mi\">1</span>\n\n    <span class=\"k\">for</span> <span class=\"n\">row</span> <span class=\"ow\">in</span> <span class=\"n\">board</span><span class=\"p\">:</span>\n        <span class=\"nb\">print</span><span class=\"p\">(</span><span class=\"s2\">\"\"</span><span class=\"o\">.</span><span class=\"n\">join</span><span class=\"p\">(</span><span class=\"nb\">map</span><span class=\"p\">(</span><span class=\"nb\">str</span><span class=\"p\">,</span> <span class=\"n\">row</span><span class=\"p\">)))</span>\n\n<span class=\"n\">render_regions</span><span class=\"p\">()</span>\n</code></pre></div>\n<p>The second check is something that prints out the regions. It produces something like this:</p>\n<div class=\"codehilite\"><pre><span></span><code>111111111\n112333999\n122439999\n124437799\n124666779\n124467799\n122467899\n122555889\n112258899\n</code></pre></div>\n<p>I can compare this to the picture of the board to make sure I got it right. I guess a more advanced solution would be to print emoji squares like 🟥 instead.</p>\n<p>Neither check is quality code but it's throwaway and it gets the job done so eh.</p>\n<h3>Update for the Internet</h3>\n<p>This was sent as a weekly newsletter, which is usually on topics like <a href=\"https://buttondown.com/hillelwayne/archive/why-do-we-call-it-boilerplate-code\" target=\"_blank\">software history</a>, <a href=\"https://buttondown.com/hillelwayne/archive/the-seven-specification-ur-languages/\" target=\"_blank\">formal methods</a>, <a href=\"https://buttondown.com/hillelwayne/archive/i-formally-modeled-dreidel-for-no-good-reason/\" target=\"_blank\">unusual technologies</a>, and the <a href=\"https://buttondown.com/hillelwayne/archive/be-suspicious-of-success/\" target=\"_blank\">theory of software engineering</a>. You <a href=\"https://buttondown.email/hillelwayne/\" target=\"_blank\">can subscribe here</a>.</p>\n<div class=\"footnote\">\n<hr/>\n<ol>\n<li id=\"fn:SAT\">\n<p>\"Boolean <strong>SAT</strong>isfiability Solver\", aka a solver that can find assignments that make complex boolean expressions true. I write a bit more about them <a href=\"https://www.hillelwayne.com/post/np-hard/\" target=\"_blank\">here</a>. <a class=\"footnote-backref\" href=\"#fnref:SAT\" title=\"Jump back to footnote 1 in the text\">↩</a></p>\n</li>\n<li id=\"fn:SMT\">\n<p>\"Satisfiability Modulo Theories\" <a class=\"footnote-backref\" href=\"#fnref:SMT\" title=\"Jump back to footnote 2 in the text\">↩</a></p>\n</li>\n</ol>\n</div>",
          "url": "https://buttondown.com/hillelwayne/archive/solving-linkedin-queens-with-smt/",
          "published": "2025-06-12T15:43:25.000Z",
          "updated": "2025-06-12T15:43:25.000Z",
          "content": null,
          "image": null,
          "media": [],
          "authors": [],
          "categories": []
        },
        {
          "id": "https://buttondown.com/hillelwayne/archive/ai-is-a-gamechanger-for-tla-users/",
          "title": "AI is a gamechanger for TLA+ users",
          "description": "<h3>New Logic for Programmers Release</h3>\n<p><a href=\"https://leanpub.com/logic/\" target=\"_blank\">v0.10 is now available</a>! This is a minor release, mostly focused on logic-based refactoring, with new material on set types and testing refactors are correct. See the full release notes at <a href=\"https://github.com/logicforprogrammers/book-assets/blob/master/CHANGELOG.md\" target=\"_blank\">the changelog page</a>. Due to <a href=\"https://systemsdistributed.com/\" target=\"_blank\">conference pressure</a> v0.11 will also likely be a minor release. </p>\n<p><img alt=\"The book cover\" class=\"newsletter-image\" src=\"https://assets.buttondown.email/images/29d4ae9d-bcb9-4d8b-99d4-8a35c0990ad5.jpg?w=300&fit=max\"/></p>\n<h1>AI is a gamechanger for TLA+ users</h1>\n<p><a href=\"https://lamport.azurewebsites.net/tla/tla.html\" target=\"_blank\">TLA+</a> is a specification language to model and debug distributed systems. While very powerful, it's also hard for programmers to learn, and there's always questions of connecting specifications with actual code. </p>\n<p>That's why <a href=\"https://zfhuang99.github.io/github%20copilot/formal%20verification/tla+/2025/05/24/ai-revolution-in-distributed-systems.html\" target=\"_blank\">The Coming AI Revolution in Distributed Systems</a> caught my interest. In the post, Cheng Huang claims that Azure successfully used LLMs to examine an existing codebase, derive a TLA+ spec, and find a production bug in that spec. \"After a decade of manually crafting TLA+ specifications\", he wrote, \"I must acknowledge that this AI-generated specification rivals human work\".</p>\n<p>This inspired me to experiment with LLMs in TLA+ myself. My goals are a little less ambitious than Cheng's: I wanted to see how LLMs could help junior specifiers write TLA+, rather than handling the entire spec automatically. Details on what did and didn't work below, but my takeaway is that <strong>LLMs are an immense specification force multiplier.</strong></p>\n<p>All tests were done with a standard VSCode Copilot subscription, writing Claude 3.7 in Agent mode. Other LLMs or IDEs may be more or less effective, etc.</p>\n<h2>Things Claude was good at</h2>\n<h3>Fixing syntax errors</h3>\n<p>TLA+ uses a very different syntax than mainstream programming languages, meaning beginners make a lot of mistakes where they do a \"programming syntax\" instead of TLA+ syntax:</p>\n<div class=\"codehilite\"><pre><span></span><code>NotThree(x) = \\* should be ==, not =\n    x != 3 \\* should be #, not !=\n</code></pre></div>\n<p>The problem is that the TLA+ syntax checker, SANY, is 30 years old and doesn't provide good information. Here's what it says for that snippet:</p>\n<div class=\"codehilite\"><pre><span></span><code>Was expecting \"==== or more Module body\"\nEncountered \"NotThree\" at line 6, column 1\n</code></pre></div>\n<p>That only isolates one error and doesn't tell us what the problem is, only where it is. Experienced TLA+ users get \"error eyes\" and can quickly see what the problem is, but beginners really struggle with this.</p>\n<p>The TLA+ foundation has made LLM integration a priority, so the VSCode extension <a href=\"https://github.com/tlaplus/vscode-tlaplus/blob/master/src/main.ts#L174\" target=\"_blank\">naturally supports several agents actions</a>. One of these is running SANY, meaning an agent can get an error, fix it, get another error, fix it, etc. Provided the above sample and asked to make it work, Claude successfully fixed both errors. It also fixed many errors in a larger spec, as well as figure out why PlusCal specs weren't compiling to TLA+.</p>\n<p>This by itself is already enough to make LLMs a worthwhile tool, as it fixes one of the biggest barriers to entry.</p>\n<h3>Understanding error traces</h3>\n<p>When TLA+ finds a violated property, it outputs the sequence of steps that leads to the error. This starts in plaintext, and VSCode parses it into an interactive table:</p>\n<p><img alt=\"An example error trace\" class=\"newsletter-image\" src=\"https://assets.buttondown.email/images/f7f16d0e-c61f-4286-ae49-67e03f844126.png?w=960&fit=max\"/></p>\n<p>Learning to read these error traces is a skill in itself. You have to understand what's happening in each step and how it relates back to the actually broken property. It takes a long time for people to learn how to do this well.</p>\n<p>Claude was successful here, too, accurately reading 20+ step error traces and giving a high-level explanation of what went wrong. It also could condense error traces: if ten steps of the error trace could be condensed into a one-sentence summary (which can happen if you're modeling a lot of process internals) Claude would do it.</p>\n<p>I did have issues here with doing this in agent mode: while the extension does provide a \"run model checker\" command, the agent would regularly ignore this and prefer to run a terminal command instead. This would be fine except that the LLM consistently hallucinated invalid commands. I had to amend every prompt with \"run the model checker via vscode, do not use a terminal command\". You can skip this if you're willing to copy and paste the error trace into the prompt.</p>\n<p>As with syntax checking, if this was the <em>only</em> thing LLMs could effectively do, that would already be enough<sup id=\"fnref:dayenu\"><a class=\"footnote-ref\" href=\"#fn:dayenu\">1</a></sup> to earn a strong recommend. Even as a TLA+ expert I expect I'll be using this trick regularly. </p>\n<h3>Boilerplate tasks</h3>\n<p>TLA+ has a lot of boilerplate. One of the most notorious examples is <code>UNCHANGED</code> rules. Specifications are extremely precise — so precise that you have to specify what variables <em>don't</em> change in every step. This takes the form of an <code>UNCHANGED</code> clause at the end of relevant actions:</p>\n<div class=\"codehilite\"><pre><span></span><code>RemoveObjectFromStore(srv, o, s) ==\n  /\\ o \\in stored[s]\n  /\\ stored' = [stored EXCEPT ![s] = @ \\ {o}]\n  /\\ UNCHANGED <<capacity, log, objectsize, pc>>\n</code></pre></div>\n<p>Writing this is really annoying. Updating these whenever you change an action, or add a new variable to the spec, is doubly so. Syntax checking and error analysis are important for beginners, but this is what I wanted for <em>myself</em>. I took a spec and prompted Claude</p>\n<blockquote>\n<p>Add UNCHANGED <<v1, etc=\"\" v2,=\"\">> for each variable not changed in an action.</v1,></p>\n</blockquote>\n<p>And it worked! It successfully updated the <code>UNCHANGED</code> in every action. </p>\n<p>(Note, though, that it was a \"well-behaved\" spec in this regard: only one \"action\" happened at a time. In TLA+ you can have two actions happen simultaneously, that each update half of the variables, meaning neither of them should have an <code>UNCHANGED</code> clause. I haven't tested how Claude handles that!)</p>\n<p>That's the most obvious win, but Claude was good at handling other tedious work, too. Some examples include updating <code>vars</code> (the conventional collection of all state variables), lifting a hard-coded value into a model parameter, and changing data formats. Most impressive to me, though, was rewriting a spec designed for one process to instead handle multiple processes. This means taking all of the process variables, which originally have types like <code>Int</code>, converting them to types like <code>[Process -> Int]</code>, and then updating the uses of all of those variables in the spec. It didn't account for race conditions in the new concurrent behavior, but it was an excellent scaffold to do more work.</p>\n<h3>Writing properties from an informal description</h3>\n<p>You have to be pretty precise with your intended property description but it handles converting that precise description into TLA+'s formalized syntax, which is something beginners often struggle with.</p>\n<h2>Things it is less good at</h2>\n<h3>Generating model config files</h3>\n<p>To model check TLA+, you need both a specification (<code>.tla</code>) and a model config file (<code>.cfg</code>), which have separate syntaxes. Asking the agent to generate the second often lead to it using TLA+ syntax. It automatically fixed this after getting parsing errors, though. </p>\n<h3>Fixing specs</h3>\n<p>Whenever the ran model checking and discovered a bug, it would naturally propose a change to either the invalid property or the spec. Sometimes the changes were good, other times the changes were not physically realizable. For example, if it found that a bug was due to a race condition between processes, it would often suggest fixing it by saying race conditions were okay. I mean yes, if you say bugs are okay, then the spec finds that bugs are okay! Or it would alternatively suggest adding a constraint to the spec saying that race conditions don't happen. <a href=\"https://www.hillelwayne.com/post/alloy-facts/\" target=\"_blank\">But that's a huge mistake in specification</a>, because race conditions happen if we don't have coordination. We need to specify the <em>mechanism</em> that is supposed to prevent them.</p>\n<h3>Finding properties of the spec</h3>\n<p>After seeing how capable it was at translating my properties to TLA+, I started prompting Claude to come up with properties on its own. Unfortunately, almost everything I got back was either trivial, uninteresting, or too coupled to implementation details. I haven't tested if it would work better to ask it for \"properties that may be violated\".</p>\n<h3>Generating code from specs</h3>\n<p>I have to be specific here: Claude <em>could</em> sometimes convert Python into a passable spec, an vice versa. It <em>wasn't</em> good at recognizing abstraction. For example, TLA+ specifications often represent sequential operations with a state variable, commonly called <code>pc</code>. If modeling code that nonatomically retrieves a counter value and increments it, we'd have one action that requires <code>pc = \"Get\"</code> and sets the new value to <code>\"Inc\"</code>, then another that requires it be <code>\"Inc\"</code> and sets it to <code>\"Done\"</code>.</p>\n<p>I found that Claude would try to somehow convert <code>pc</code> into part of the Python program's state, rather than recognize it as a TLA+ abstraction. On the other side, when converting python code to TLA+ it would often try to translate things like <code>sleep</code> into some part of the spec, not recognizing that it is abstractable into a distinct action. I didn't test other possible misconceptions, like converting randomness to nondeterminism.</p>\n<p>For the record, when converting TLA+ to Python Claude tended to make simulators of the spec, rather than possible production code implementing the spec. I really wasn't expecting otherwise though.</p>\n<h2>Unexplored Applications</h2>\n<p>Things I haven't explored thoroughly but could possibly be effective, based on what I know about TLA+ and AI:</p>\n<h3>Writing Java Overrides</h3>\n<p>Most TLA+ operators are resolved via TLA+ interpreters, but you can also implement them in \"native\" Java. This lets you escape the standard language semantics and add capabilities like <a href=\"https://github.com/tlaplus/CommunityModules/blob/master/modules/IOUtils.tla\" target=\"_blank\">executing programs during model-checking</a> or <a href=\"https://github.com/tlaplus/tlaplus/blob/master/tlatools/org.lamport.tlatools/src/tla2sany/StandardModules/TLC.tla#L62\" target=\"_blank\">dynamically constrain the depth of the searched state space</a>. There's a lot of cool things I think would be possible with overrides. The problem is there's only a handful of people in the world who know how to write them. But that handful have written quite a few overrides and I think there's enough there for Claude to work with. </p>\n<h3>Writing specs, given a reference mechanism</h3>\n<p>In all my experiments, the LLM only had my prompts and the occasional Python script as information. That makes me suspect that some of its problems with writing and fixing specs come down to not having a system model. Maybe it wouldn't suggest fixes like \"these processes never race\" if it had a design doc saying that the processes can't coordinate. </p>\n<p>(Could a Sufficiently Powerful LLM derive some TLA+ specification from a design document?)</p>\n<h3>Connecting specs and code</h3>\n<p>This is the holy grail of TLA+: taking a codebase and showing it correctly implements a spec. Currently the best ways to do this are by either using TLA+ to generate a test suite, or by taking logged production traces and matching them to TLA+ behaviors. <a href=\"https://www.mongodb.com/blog/post/engineering/conformance-checking-at-mongodb-testing-our-code-matches-our-tla-specs\" target=\"_blank\">This blog post discusses both</a>. While I've seen a lot of academic research into these approaches there are no industry-ready tools. So if you want trace validation you have to do a lot of manual labour tailored to your specific product. </p>\n<p>If LLMs could do some of this work for us then that'd really amplify the usefulness of TLA+ to many companies.</p>\n<h2>Thoughts</h2>\n<p><em>Right now</em>, agents seem good at the tedious and routine parts of TLA+ and worse at the strategic and abstraction parts. But, since the routine parts are often a huge barrier to beginners, this means that LLMs have the potential to make TLA+ far, far more accessible than it previously was.</p>\n<p>I have mixed thoughts on this. As an <em>advocate</em>, this is incredible. I want more people using formal specifications because I believe it leads to cheaper, safer, more reliable software. Anything that gets people comfortable with specs is great for our industry. As a <em>professional TLA+ consultant</em>, I'm worried that this obsoletes me. Most of my income comes from training and coaching, which companies will have far less demand of now. Then again, maybe this an opportunity to pitch \"agentic TLA+ training\" to companies!</p>\n<p>Anyway, if you're interested in TLA+, there has never been a better time to try it. I mean it, these tools handle so much of the hard part now. I've got a <a href=\"https://learntla.com/\" target=\"_blank\">free book available online</a>, as does <a href=\"https://lamport.azurewebsites.net/tla/book.html\" target=\"_blank\">the inventor of TLA+</a>. I like <a href=\"https://elliotswart.github.io/pragmaticformalmodeling/\" target=\"_blank\">this guide too</a>. Happy modeling!</p>\n<div class=\"footnote\">\n<hr/>\n<ol>\n<li id=\"fn:dayenu\">\n<p>Dayenu. <a class=\"footnote-backref\" href=\"#fnref:dayenu\" title=\"Jump back to footnote 1 in the text\">↩</a></p>\n</li>\n</ol>\n</div>",
          "url": "https://buttondown.com/hillelwayne/archive/ai-is-a-gamechanger-for-tla-users/",
          "published": "2025-06-05T14:59:11.000Z",
          "updated": "2025-06-05T14:59:11.000Z",
          "content": null,
          "image": null,
          "media": [],
          "authors": [],
          "categories": []
        }
      ]
    }
    Analyze Another View with RSS.Style