RSS.Style RSS/Atom Feed Analysis


Analysis of https://buttondown.email/hillelwayne/rss

Feed fetched in 1,050 ms.
Warning Content type is application/rss+xml; charset=utf-8, not text/xml.
Feed is 374,437 characters long.
Warning Feed is missing an ETag.
Feed has a last modified date of Wed, 10 Sep 2025 13:00:00 GMT.
Warning This feed does not have a stylesheet.
This appears to be an RSS feed.
Feed title: Computer Things
Feed self link matches feed URL.
Feed has 30 items.
First item published on 2025-09-10T13:00:00.000Z
Last item published on 2025-01-07T18:49:40.000Z
Home page URL: https://buttondown.com/hillelwayne
Error Home page does not have a matching feed discovery link in the <head>.

1 feed links in <head>
  • https://buttondown.com/hillelwayne/rss

  • Error Home page does not have a link to the feed in the <body>.

    Formatted XML
    <?xml version="1.0" encoding="utf-8"?>
    <rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom">
        <channel>
            <title>Computer Things</title>
            <link>https://buttondown.com/hillelwayne</link>
            <description>Hi, I'm Hillel. This is the newsletter version of [my website](https://www.hillelwayne.com). I post all website updates here. I also post weekly content just for the newsletter, on topics like
    
    * Formal Methods
    
    * Software History and Culture
    
    * Fringetech and exotic tooling
    
    * The philosophy and theory of software engineering
    
    You can see the archive of all public essays [here](https://buttondown.email/hillelwayne/archive/).</description>
            <atom:link href="https://buttondown.email/hillelwayne/rss" rel="self"/>
            <language>en-us</language>
            <lastBuildDate>Wed, 10 Sep 2025 13:00:00 +0000</lastBuildDate>
            <item>
                <title>Many Hard Leetcode Problems are Easy Constraint Problems</title>
                <link>https://buttondown.com/hillelwayne/archive/many-hard-leetcode-problems-are-easy-constraint/</link>
                <description>&lt;p&gt;In my first interview out of college I was asked the change counter problem:&lt;/p&gt;
    &lt;blockquote&gt;
    &lt;p&gt;Given a set of coin denominations, find the minimum number of coins required to make change for a given number. IE for USA coinage and 37 cents, the minimum number is four (quarter, dime, 2 pennies).&lt;/p&gt;
    &lt;/blockquote&gt;
    &lt;p&gt;I implemented the simple greedy algorithm and immediately fell into the trap of the question: the greedy algorithm only works for "well-behaved" denominations. If the coin values were &lt;code&gt;[10, 9, 1]&lt;/code&gt;, then making 37 cents would take 10 coins in the greedy algorithm but only 4 coins optimally (&lt;code&gt;10+9+9+9&lt;/code&gt;). The "smart" answer is to use a dynamic programming algorithm, which I didn't know how to do. So I failed the interview.&lt;/p&gt;
    &lt;p&gt;But you only need dynamic programming if you're writing your own algorithm. It's really easy if you throw it into a constraint solver like &lt;a href="https://www.minizinc.org/" target="_blank"&gt;MiniZinc&lt;/a&gt; and call it a day. &lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;int: total;
    array[int] of int: values = [10, 9, 1];
    array[index_set(values)] of var 0..: coins;
    
    constraint sum (c in index_set(coins)) (coins[c] * values[c]) == total;
    solve minimize sum(coins);
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;You can try this online &lt;a href="https://play.minizinc.dev/" target="_blank"&gt;here&lt;/a&gt;. It'll give you a prompt to put in &lt;code&gt;total&lt;/code&gt; and then give you successively-better solutions:&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;coins = [0, 0, 37];
    ----------
    coins = [0, 1, 28];
    ----------
    coins = [0, 2, 19];
    ----------
    coins = [0, 3, 10];
    ----------
    coins = [0, 4, 1];
    ----------
    coins = [1, 3, 0];
    ----------
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
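    &lt;p&gt;For comparison, here is a rough Python sketch of the two hand-written approaches mentioned above: the greedy algorithm and a bottom-up dynamic programming version. The function names are my own for illustration, and the greedy version assumes a 1-valued coin exists so it can always finish.&lt;/p&gt;

```python
def greedy_coins(values, total):
    """Always take as many of the largest remaining coin as will fit."""
    count = 0
    for v in sorted(values, reverse=True):
        count += total // v
        total %= v
    return count  # assumes a 1-valued coin exists, so the remainder reaches 0


def min_coins(values, total):
    """Bottom-up dynamic programming: best[t] is the fewest coins summing to t."""
    unreachable = float("inf")
    best = [0] + [unreachable] * total
    for t in range(1, total + 1):
        # "t - v in range(t)" keeps only positive coins that do not overshoot t
        best[t] = min(
            (best[t - v] + 1 for v in values if t - v in range(t)),
            default=unreachable,
        )
    return best[total]
```

    &lt;p&gt;With &lt;code&gt;[10, 9, 1]&lt;/code&gt; and a total of 37, the greedy version uses 10 coins while the dynamic programming version finds the 4-coin answer.&lt;/p&gt;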
    &lt;p class="empty-line" style="height:16px; margin:0px !important;"&gt;&lt;/p&gt;
    &lt;p&gt;Lots of similar interview questions are this kind of mathematical optimization problem, where we have to find the maximum or minimum of a function subject to constraints. They're hard in programming languages because programming languages are too low-level. They are also exactly the problems that constraint solvers were designed to solve. Hard leetcode problems are easy constraint problems.&lt;sup id="fnref:leetcode"&gt;&lt;a class="footnote-ref" href="#fn:leetcode"&gt;1&lt;/a&gt;&lt;/sup&gt; Here I'm using MiniZinc, but you could just as easily use Z3 or OR-Tools or whatever your favorite generalized solver is.&lt;/p&gt;
    &lt;h3&gt;More examples&lt;/h3&gt;
    &lt;p&gt;This was a question in a different interview (which I thankfully passed):&lt;/p&gt;
    &lt;blockquote&gt;
    &lt;p&gt;Given a list of stock prices through the day, find the maximum profit you can get by buying one stock and selling one stock later.&lt;/p&gt;
    &lt;/blockquote&gt;
    &lt;p&gt;It's easy to do in O(n²) time, or if you are clever, you can do it in O(n). Or you could be not clever at all and just write it as a constraint problem:&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;array[int] of int: prices = [3, 1, 4, 1, 5, 9, 2, 6, 5, 3, 5, 8];
    var int: buy;
    var int: sell;
    var int: profit = prices[sell] - prices[buy];
    
    constraint sell &amp;gt; buy;
    constraint profit &amp;gt; 0;
    solve maximize profit;
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
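    &lt;p&gt;For reference, the "clever" O(n) approach could look something like this Python sketch. The function name is hypothetical, and unlike the model above (which is unsatisfiable when no trade makes money), it simply returns 0 in that case.&lt;/p&gt;

```python
def max_profit(prices):
    """Single pass: track the cheapest price so far and the best sale against it."""
    if not prices:
        return 0
    best = 0
    lowest = prices[0]
    for price in prices[1:]:
        best = max(best, price - lowest)   # sell today, having bought at the low
        lowest = min(lowest, price)        # then consider today as a new low
    return best
```

    &lt;p&gt;On the prices in the model this gives 8: buy at 1, sell at 9.&lt;/p&gt;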
    &lt;p&gt;As a reminder, you can try the MiniZinc model online &lt;a href="https://play.minizinc.dev/" target="_blank"&gt;here&lt;/a&gt;. While working at that job, one interview question we tested out was:&lt;/p&gt;
    &lt;blockquote&gt;
    &lt;p&gt;Given a list, determine if three numbers in that list can be added or subtracted to give 0.&lt;/p&gt;
    &lt;/blockquote&gt;
    &lt;p&gt;This is a satisfaction problem, not an optimization problem: we don't need the "best answer", any answer will do. We eventually decided against it for being too tricky for the engineers we were targeting. But it's not tricky in a solver:&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;include "globals.mzn";
    array[int] of int: numbers = [3, 1, 4, 1, 5, 9, 2, 6, 5, 3, 5, 8];
    array[index_set(numbers)] of var {0, -1, 1}: choices;
    
    constraint sum(n in index_set(numbers)) (numbers[n] * choices[n]) = 0;
    constraint count(choices, -1) + count(choices, 1) = 3;
    solve satisfy;
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
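    &lt;p&gt;If you want to sanity-check the solver's answer, a brute-force Python sketch is short enough. This is my own illustrative helper, not an efficient algorithm: it tries every choice of three positions and every sign pattern.&lt;/p&gt;

```python
from itertools import combinations, product


def three_to_zero(numbers):
    """Return (values, signs) for three entries whose signed sum is 0, else None."""
    for trio in combinations(numbers, 3):          # picks by position, like choices[]
        for signs in product((1, -1), repeat=3):   # each picked number is added or subtracted
            if sum(s * n for s, n in zip(signs, trio)) == 0:
                return trio, signs
    return None
```

    &lt;p&gt;For the list in the model, the first hit is &lt;code&gt;3 + 1 - 4&lt;/code&gt;.&lt;/p&gt;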
    &lt;p&gt;Okay, one last one, a problem I saw last year at &lt;a href="https://chicagopython.github.io/algosig/" target="_blank"&gt;Chipy AlgoSIG&lt;/a&gt;. Basically they pick some leetcode problems and we all do them. I failed to solve &lt;a href="https://leetcode.com/problems/largest-rectangle-in-histogram/description/" target="_blank"&gt;this one&lt;/a&gt;:&lt;/p&gt;
    &lt;blockquote&gt;
    &lt;p&gt;Given an array of integers heights representing the histogram's bar height where the width of each bar is 1, return the area of the largest rectangle in the histogram.&lt;/p&gt;
    &lt;p&gt;&lt;img alt="example from leetcode link" class="newsletter-image" src="https://assets.buttondown.email/images/63337f78-7138-4b21-87a0-917c0c5b1706.jpg?w=960&amp;amp;fit=max"/&gt;&lt;/p&gt;
    &lt;/blockquote&gt;
    &lt;p&gt;The "proper" solution is a tricky thing involving tracking lots of bookkeeping states, which you can completely bypass by expressing it as constraints:&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;array[int] of int: numbers = [2,1,5,6,2,3];
    
    var 1..length(numbers): x; 
    var 0..length(numbers)-1: dx; % dx = 0 allows a single-bar rectangle
    var 1..: y;
    
    constraint x + dx &amp;lt;= length(numbers);
    constraint forall (i in x..(x+dx)) (y &amp;lt;= numbers[i]);
    
    var int: area = (dx+1)*y;
    solve maximize area;
    
    output ["(\(x)-&amp;gt;\(x+dx))*\(y) = \(area)"]
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
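    &lt;p&gt;A quick way to cross-check the model's output is a brute-force Python sketch like the one below. To be clear, this is the simple O(n²) scan over every span of bars, not the "proper" bookkeeping-heavy solution, and the function name is just for illustration.&lt;/p&gt;

```python
def largest_rectangle(heights):
    """Try every span of bars; the rectangle's height is the shortest bar in the span."""
    best = 0
    for start in range(len(heights)):
        lowest = heights[start]
        for end in range(start, len(heights)):
            lowest = min(lowest, heights[end])
            best = max(best, lowest * (end - start + 1))
    return best
```

    &lt;p&gt;For &lt;code&gt;[2,1,5,6,2,3]&lt;/code&gt; this returns 10, the rectangle of height 5 across the third and fourth bars.&lt;/p&gt;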
    &lt;p&gt;There's even a way to &lt;a href="https://docs.minizinc.dev/en/2.9.3/visualisation.html" target="_blank"&gt;automatically visualize the solution&lt;/a&gt; (using &lt;code&gt;vis_geost_2d&lt;/code&gt;), but I didn't feel like figuring it out in time for the newsletter.&lt;/p&gt;
    &lt;h3&gt;Is this better?&lt;/h3&gt;
    &lt;p&gt;Now if I actually brought these questions to an interview the interviewee could ruin my day by asking "what's the runtime complexity?" Constraint solver runtimes are unpredictable and almost always slower than an ideal bespoke algorithm because they are more expressive, in what I refer to as the &lt;a href="https://buttondown.com/hillelwayne/archive/the-capability-tractability-tradeoff/" target="_blank"&gt;capability/tractability tradeoff&lt;/a&gt;. But even so, they'll do way better than a &lt;em&gt;bad&lt;/em&gt; bespoke algorithm, and I'm not experienced enough in handwriting algorithms to consistently beat a solver.&lt;/p&gt;
    &lt;p&gt;The real advantage of solvers, though, is how well they handle new constraints. Take the stock picking problem above. I can write an O(n²) algorithm in a few minutes and the O(n) algorithm if you give me some time to think. Now change the problem to&lt;/p&gt;
    &lt;blockquote&gt;
    &lt;p&gt;Maximize the profit by buying and selling up to &lt;code&gt;max_sales&lt;/code&gt; stocks, but you can only buy or sell one stock at a given time and you can only hold up to &lt;code&gt;max_hold&lt;/code&gt; stocks at a time.&lt;/p&gt;
    &lt;/blockquote&gt;
    &lt;p&gt;That's a way harder problem to write even an inefficient algorithm for, while the constraint problem is only a tiny bit more complicated:&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;include "globals.mzn";
    int: max_sales = 3;
    int: max_hold = 2;
    array[int] of int: prices = [3, 1, 4, 1, 5, 9, 2, 6, 5, 3, 5, 8];
    array [1..max_sales] of var int: buy;
    array [1..max_sales] of var int: sell;
    array [index_set(prices)] of var 0..max_hold: stocks_held;
    var int: profit = sum(s in 1..max_sales) (prices[sell[s]] - prices[buy[s]]);
    
    constraint forall (s in 1..max_sales) (sell[s] &amp;gt; buy[s]);
    constraint profit &amp;gt; 0;
    
    constraint forall(i in index_set(prices)) (stocks_held[i] = (count(s in 1..max_sales) (buy[s] &amp;lt;= i) - count(s in 1..max_sales) (sell[s] &amp;lt;= i)));
    constraint alldifferent(buy ++ sell);
    solve maximize profit;
    
    output ["buy at \(buy)\n", "sell at \(sell)\n", "for \(profit)"];
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p class="empty-line" style="height:16px; margin:0px !important;"&gt;&lt;/p&gt;
    &lt;p&gt;Most constraint solving examples online are puzzles, like &lt;a href="https://docs.minizinc.dev/en/stable/modelling2.html#ex-sudoku" target="_blank"&gt;Sudoku&lt;/a&gt; or "&lt;a href="https://docs.minizinc.dev/en/stable/modelling2.html#ex-smm" target="_blank"&gt;SEND + MORE = MONEY&lt;/a&gt;". Solving leetcode problems would be a more interesting demonstration. And you get more interesting opportunities to teach optimizations, like symmetry breaking.&lt;/p&gt;
    &lt;hr/&gt;
    &lt;h3&gt;Update for the Internet&lt;/h3&gt;
    &lt;p&gt;This was sent as a weekly newsletter, which is usually on topics like &lt;a href="https://buttondown.com/hillelwayne/archive/why-do-we-call-it-boilerplate-code" target="_blank"&gt;software history&lt;/a&gt;, &lt;a href="https://buttondown.com/hillelwayne/archive/the-seven-specification-ur-languages/" target="_blank"&gt;formal methods&lt;/a&gt;, &lt;a href="https://buttondown.com/hillelwayne/archive/i-formally-modeled-dreidel-for-no-good-reason/" target="_blank"&gt;unusual technologies&lt;/a&gt;, and the &lt;a href="https://buttondown.com/hillelwayne/archive/be-suspicious-of-success/" target="_blank"&gt;theory of software engineering&lt;/a&gt;. You can subscribe here: &lt;/p&gt;
    &lt;div class="subscribe-form"&gt;&lt;/div&gt;
    &lt;div class="footnote"&gt;
    &lt;hr/&gt;
    &lt;ol&gt;
    &lt;li id="fn:leetcode"&gt;
    &lt;p&gt;Because my dad will email me if I don't explain this: "leetcode" is slang for "tricky algorithmic interview questions that have little-to-no relevance in the actual job you're interviewing for." It's from &lt;a href="https://leetcode.com/" target="_blank"&gt;leetcode.com&lt;/a&gt;. &lt;a class="footnote-backref" href="#fnref:leetcode" title="Jump back to footnote 1 in the text"&gt;↩&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;/ol&gt;
    &lt;/div&gt;</description>
                <pubDate>Wed, 10 Sep 2025 13:00:00 +0000</pubDate>
                <guid>https://buttondown.com/hillelwayne/archive/many-hard-leetcode-problems-are-easy-constraint/</guid>
            </item>
            <item>
                <title>The Angels and Demons of Nondeterminism</title>
                <link>https://buttondown.com/hillelwayne/archive/the-angels-and-demons-of-nondeterminism/</link>
                <description>&lt;p&gt;Greetings everyone! You might have noticed that it's September and I don't have the next version of &lt;em&gt;Logic for Programmers&lt;/em&gt; ready. As penance, &lt;a href="https://leanpub.com/logic/c/september-2025-kuBCrhBnUzb7" target="_blank"&gt;here's ten free copies of the book&lt;/a&gt;.&lt;/p&gt;
    &lt;p&gt;So a few months ago I wrote &lt;a href="https://buttondown.com/hillelwayne/archive/five-kinds-of-nondeterminism/" target="_blank"&gt;a newsletter&lt;/a&gt; about how we use nondeterminism in formal methods.  The overarching idea:&lt;/p&gt;
    &lt;ol&gt;
    &lt;li&gt;Nondeterminism is when multiple paths are possible from a starting state.&lt;/li&gt;
    &lt;li&gt;A system preserves a property if it holds on &lt;em&gt;all&lt;/em&gt; possible paths. If even one path violates the property, then we have a bug.&lt;/li&gt;
    &lt;/ol&gt;
    &lt;p&gt;An intuitive model for this is that, when faced with a nondeterministic choice, the system always makes the &lt;em&gt;worst possible choice&lt;/em&gt;. This is sometimes called &lt;strong&gt;demonic nondeterminism&lt;/strong&gt; and is favored in formal methods because we are paranoid to a fault.&lt;/p&gt;
    &lt;p&gt;The opposite would be &lt;strong&gt;angelic nondeterminism&lt;/strong&gt;, where the system always makes the &lt;em&gt;best possible choice&lt;/em&gt;. A property then holds if &lt;em&gt;any&lt;/em&gt; possible path satisfies that property.&lt;sup id="fnref:duals"&gt;&lt;a class="footnote-ref" href="#fn:duals"&gt;1&lt;/a&gt;&lt;/sup&gt; This is not as common in FM, but it still has its uses! "Players can access the secret level" or "&lt;a href="https://www.hillelwayne.com/post/safety-and-liveness/#other-properties" target="_blank"&gt;We can always shut down the computer&lt;/a&gt;" are &lt;strong&gt;reachability&lt;/strong&gt; properties, that something is possible even if not actually done.&lt;/p&gt;
    &lt;p&gt;In broader computer science research, I'd say that angelic nondeterminism is more popular, due to its widespread use in complexity analysis and programming languages.&lt;/p&gt;
    &lt;h3&gt;Complexity Analysis&lt;/h3&gt;
    &lt;p&gt;P is the set of all "decision problems" (&lt;em&gt;basically&lt;/em&gt;, boolean functions) that can be solved in polynomial time: there's an algorithm that's worst-case in &lt;code&gt;O(n)&lt;/code&gt;, &lt;code&gt;O(n²)&lt;/code&gt;, &lt;code&gt;O(n³)&lt;/code&gt;, etc.&lt;sup id="fnref:big-o"&gt;&lt;a class="footnote-ref" href="#fn:big-o"&gt;2&lt;/a&gt;&lt;/sup&gt;  NP is the set of all problems that can be solved in polynomial time by an algorithm with &lt;em&gt;angelic nondeterminism&lt;/em&gt;.&lt;sup id="fnref:TM"&gt;&lt;a class="footnote-ref" href="#fn:TM"&gt;3&lt;/a&gt;&lt;/sup&gt; For example, the question "does list &lt;code&gt;l&lt;/code&gt; contain &lt;code&gt;x&lt;/code&gt;" can be solved in O(1) time by a nondeterministic algorithm:&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;fun is_member(l: List[T], x: T): bool {
      if l == [] {return false};
    
      guess i in 0..&amp;lt;len(l);
      return l[i] == x;
    }
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;Say we call &lt;code&gt;is_member([a, b, c, d], c)&lt;/code&gt;. The best possible choice would be to guess &lt;code&gt;i = 2&lt;/code&gt;, which would correctly return true. Now call &lt;code&gt;is_member([a, b], d)&lt;/code&gt;. No matter what we guess, the algorithm correctly returns false. Ergo, O(1). NP stands for "Nondeterministic Polynomial".&lt;/p&gt;
    &lt;p&gt;(And I just now realized something pretty cool: you can say that P is the set of all problems solvable in polynomial time under &lt;em&gt;demonic nondeterminism&lt;/em&gt;, which is a nice parallel between the two classes.)&lt;/p&gt;
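    &lt;p&gt;Real computers can't guess, but we can simulate the guess deterministically by trying every option. Here's a minimal Python sketch, with helper names (&lt;code&gt;angelic&lt;/code&gt;, &lt;code&gt;demonic&lt;/code&gt;) that I made up for illustration. Note the simulation costs O(n), not O(1), since it visits every possible guess.&lt;/p&gt;

```python
def angelic(choices, accepts):
    """Angelic choice succeeds if at least one guess is accepted."""
    return any(accepts(c) for c in choices)


def demonic(choices, accepts):
    """Demonic choice succeeds only if every guess is accepted."""
    return all(accepts(c) for c in choices)


def is_member(items, x):
    # Replace "guess i" with an exhaustive search over every index.
    return angelic(range(len(items)), lambda i: items[i] == x)
```

    &lt;p&gt;The empty list needs no special case here: with nothing to guess, the angelic search finds no accepting choice and returns false.&lt;/p&gt;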
    &lt;p&gt;Computer scientists have proven that angelic nondeterminism doesn't give us any more "power": there are no problems solvable with AN that aren't also solvable deterministically. The big question is whether AN is more &lt;em&gt;efficient&lt;/em&gt;: it is widely believed, but not &lt;em&gt;proven&lt;/em&gt;, that there are problems in NP but not in P. Most famously, "Is there any variable assignment that makes this boolean formula true?" A polynomial AN algorithm is again easy:&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;fun SAT(f(x_1, x_2, …: bool): bool): bool {
       N = num_params(f)
       for i in 1..=N {
         guess x_i in {true, false}
       }
    
       return f(x_1, x_2, …)
    }
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;The best deterministic algorithms we have to solve the same problem are worst-case exponential with the number of boolean parameters. This is a really frustrating problem because real computers don't have angelic nondeterminism, so problems like SAT remain hard. We can solve most "well-behaved" instances of the problem &lt;a href="https://www.hillelwayne.com/post/np-hard/" target="_blank"&gt;in reasonable time&lt;/a&gt;, but the worst-case instances get intractable real fast.&lt;/p&gt;
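    &lt;p&gt;To make the exponential cost concrete, here is a naive deterministic version in Python that replaces each guess with a loop over both values. It is only a sketch (real SAT solvers are far smarter), and it assumes the formula is passed as an ordinary function together with its parameter count.&lt;/p&gt;

```python
from itertools import product


def brute_force_sat(formula, num_params):
    """Return a satisfying assignment as a tuple of bools, or None if there is none."""
    # product() enumerates all 2**num_params assignments, hence the exponential worst case
    for assignment in product((True, False), repeat=num_params):
        if formula(*assignment):
            return assignment
    return None
```

    &lt;p&gt;An unsatisfiable formula forces the loop through every assignment before it can answer, which is exactly the worst case.&lt;/p&gt;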
    &lt;h3&gt;Means of Abstraction&lt;/h3&gt;
    &lt;div class="subscribe-form"&gt;&lt;/div&gt;
    &lt;p&gt;We can directly turn an AN algorithm into a (possibly much slower) deterministic algorithm, such as by &lt;a href="https://en.wikipedia.org/wiki/Backtracking" target="_blank"&gt;backtracking&lt;/a&gt;. This makes AN a pretty good abstraction over what an algorithm is doing. Does the regex &lt;code&gt;(a+b)\1+&lt;/code&gt; match "abaabaabaab"? Yes, if the regex engine nondeterministically guesses that it needs to start at the third letter and make the group &lt;code&gt;aab&lt;/code&gt;. How does my PL's regex implementation find that match? I dunno, backtracking or &lt;a href="https://swtch.com/~rsc/regexp/regexp1.html" target="_blank"&gt;NFA construction&lt;/a&gt; or something, I don't need to know the deterministic specifics in order to use the nondeterministic abstraction.&lt;/p&gt;
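    &lt;p&gt;You can see the abstraction at work in Python's &lt;code&gt;re&lt;/code&gt; module, which supports backreferences. This small sketch just asks for the match and inspects where it started and what the group captured, without caring how the engine found it.&lt;/p&gt;

```python
import re

match = re.search(r"(a+b)\1+", "abaabaabaab")
start = match.start()    # zero-based index where the match begins
group = match.group(1)   # the text captured by (a+b)
```

    &lt;p&gt;Here &lt;code&gt;start&lt;/code&gt; is 2 (the third letter) and &lt;code&gt;group&lt;/code&gt; is &lt;code&gt;aab&lt;/code&gt;.&lt;/p&gt;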
    &lt;p&gt;Neel Krishnaswami has &lt;a href="https://semantic-domain.blogspot.com/2013/07/what-declarative-languages-are.html" target="_blank"&gt;a great definition of 'declarative language'&lt;/a&gt;: "any language with a semantics has some nontrivial existential quantifiers in it". I'm not sure if this is &lt;em&gt;identical&lt;/em&gt; to saying "a language with an angelic nondeterministic abstraction", but they must be pretty close, and all of his examples match:&lt;/p&gt;
    &lt;ul&gt;
    &lt;li&gt;SQL's selects and joins&lt;/li&gt;
    &lt;li&gt;Parsing DSLs&lt;/li&gt;
    &lt;li&gt;Logic programming's unification&lt;/li&gt;
    &lt;li&gt;Constraint solving&lt;/li&gt;
    &lt;/ul&gt;
    &lt;p&gt;On top of that I'd add CSS selectors and &lt;a href="https://www.hillelwayne.com/post/picat/" target="_blank"&gt;planner's actions&lt;/a&gt;; all nondeterministic abstractions over a deterministic implementation. He also says that the things programmers hate most in declarative languages are features "that expose the operational model": constraint solver search strategies, Prolog cuts, regex backreferences, etc. Which again matches my experiences with angelic nondeterminism: I dread features that force me to understand the deterministic implementation. But they're necessary, since P probably != NP and so we need to worry about operational optimizations.&lt;/p&gt;
    &lt;h3&gt;Eldritch Nondeterminism&lt;/h3&gt;
    &lt;p&gt;If you need to know the &lt;a href="https://en.wikipedia.org/wiki/PP_(complexity)" target="_blank"&gt;ratio of good/bad paths&lt;/a&gt;, &lt;a href="https://en.wikipedia.org/wiki/%E2%99%AFP" target="_blank"&gt;the number of good paths&lt;/a&gt;, or probability, or anything more than "there is a good path" or "there is a bad path", you are beyond the reach of heaven or hell.&lt;/p&gt;
    &lt;div class="footnote"&gt;
    &lt;hr/&gt;
    &lt;ol&gt;
    &lt;li id="fn:duals"&gt;
    &lt;p&gt;Angelic and demonic nondeterminism are &lt;a href="https://buttondown.com/hillelwayne/archive/logical-duals-in-software-engineering/" target="_blank"&gt;duals&lt;/a&gt;: angelic returns "yes" if &lt;code&gt;some choice: correct&lt;/code&gt; and demonic returns "no" if &lt;code&gt;!all choice: correct&lt;/code&gt;, which is the same as &lt;code&gt;some choice: !correct&lt;/code&gt;. &lt;a class="footnote-backref" href="#fnref:duals" title="Jump back to footnote 1 in the text"&gt;↩&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id="fn:big-o"&gt;
    &lt;p&gt;Pet peeve about Big-O notation: &lt;code&gt;O(n²)&lt;/code&gt; is the &lt;em&gt;set&lt;/em&gt; of all algorithms that, for sufficiently large problem sizes, grow no faster than quadratically. "Bubblesort has &lt;code&gt;O(n²)&lt;/code&gt; complexity" &lt;em&gt;should&lt;/em&gt; be written &lt;code&gt;Bubblesort in O(n²)&lt;/code&gt;, &lt;em&gt;not&lt;/em&gt; &lt;code&gt;Bubblesort = O(n²)&lt;/code&gt;. &lt;a class="footnote-backref" href="#fnref:big-o" title="Jump back to footnote 2 in the text"&gt;↩&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id="fn:TM"&gt;
    &lt;p&gt;To be precise, solvable in polynomial time by a &lt;em&gt;Nondeterministic Turing Machine&lt;/em&gt;, a very particular model of computation. We can broadly talk about P and NP without framing everything in terms of Turing machines, but some details of complexity classes (like the existence of "weak NP-hardness") kinda need Turing machines to make sense. &lt;a class="footnote-backref" href="#fnref:TM" title="Jump back to footnote 3 in the text"&gt;↩&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;/ol&gt;
    &lt;/div&gt;</description>
                <pubDate>Thu, 04 Sep 2025 14:00:00 +0000</pubDate>
                <guid>https://buttondown.com/hillelwayne/archive/the-angels-and-demons-of-nondeterminism/</guid>
            </item>
            <item>
                <title>Logical Duals in Software Engineering</title>
                <link>https://buttondown.com/hillelwayne/archive/logical-duals-in-software-engineering/</link>
                <description>&lt;p&gt;(&lt;a href="https://buttondown.com/hillelwayne/archive/sapir-whorf-does-not-apply-to-programming/" target="_blank"&gt;Last week's newsletter&lt;/a&gt; took too long and I'm way behind on &lt;a href="https://leanpub.com/logic/" target="_blank"&gt;Logic for Programmers&lt;/a&gt; revisions so short one this time.&lt;sup id="fnref:retread"&gt;&lt;a class="footnote-ref" href="#fn:retread"&gt;1&lt;/a&gt;&lt;/sup&gt;)&lt;/p&gt;
    &lt;p&gt;In classical logic, two operators &lt;code&gt;F/G&lt;/code&gt; are &lt;strong&gt;duals&lt;/strong&gt; if &lt;code&gt;F(x) = !G(!x)&lt;/code&gt;. Three examples:&lt;/p&gt;
    &lt;ol&gt;
    &lt;li&gt;&lt;code&gt;x || y&lt;/code&gt; is the same as &lt;code&gt;!(!x &amp;amp;&amp;amp; !y)&lt;/code&gt;.&lt;/li&gt;
    &lt;li&gt;&lt;code&gt;&amp;lt;&amp;gt;P&lt;/code&gt; ("P is possibly true") is the same as &lt;code&gt;![]!P&lt;/code&gt; ("not P isn't definitely true").&lt;/li&gt;
    &lt;li&gt;&lt;code&gt;some x in set: P(x)&lt;/code&gt; is the same as &lt;code&gt;!(all x in set: !P(x))&lt;/code&gt;.&lt;/li&gt;
    &lt;/ol&gt;
    &lt;p&gt;(1) is just a version of De Morgan's Law, which we regularly use to simplify boolean expressions. (2) is important in modal logic but has niche applications in software engineering, mostly in how it powers various formal methods.&lt;sup id="fnref:fm"&gt;&lt;a class="footnote-ref" href="#fn:fm"&gt;2&lt;/a&gt;&lt;/sup&gt; The real interesting one is (3), the "quantifier duals". We use lots of software tools to either &lt;em&gt;find&lt;/em&gt; a value satisfying &lt;code&gt;P&lt;/code&gt; or &lt;em&gt;check&lt;/em&gt; that all values satisfy &lt;code&gt;P&lt;/code&gt;. And by duality, any tool that does one can do the other, by seeing if it &lt;em&gt;fails&lt;/em&gt; to find/check &lt;code&gt;!P&lt;/code&gt;. Some examples in the wild:&lt;/p&gt;
    &lt;ul&gt;
    &lt;li&gt;Z3 is used to solve mathematical constraints, like "find x, where &lt;code&gt;f(x) &amp;gt;= 0&lt;/code&gt;". If I want to prove a property like "f is never negative", I ask Z3 to solve "find x, where &lt;code&gt;!(f(x) &amp;gt;= 0)&lt;/code&gt;", and see if that is unsatisfiable. This use case powers a LOT of theorem provers and formal verification tooling.&lt;/li&gt;
    &lt;li&gt;Property testing checks that all inputs to a code block satisfy a property. I've used it to generate complex inputs with certain properties by checking that all inputs &lt;em&gt;don't&lt;/em&gt; satisfy the property and reading out the test failure.&lt;/li&gt;
    &lt;li&gt;Model checkers check that all behaviors of a specification satisfy a property, so we can find a behavior that reaches a goal state G by checking that all states are &lt;code&gt;!G&lt;/code&gt;. &lt;a href="https://github.com/tlaplus/Examples/blob/master/specifications/DieHard/DieHard.tla" target="_blank"&gt;Here's TLA+ solving a puzzle this way&lt;/a&gt;.&lt;sup id="fnref:antithesis"&gt;&lt;a class="footnote-ref" href="#fn:antithesis"&gt;3&lt;/a&gt;&lt;/sup&gt;&lt;/li&gt;
    &lt;li&gt;Planners find behaviors that reach a goal state, so we can check if all behaviors satisfy a property P by asking it to reach goal state &lt;code&gt;!P&lt;/code&gt;.&lt;/li&gt;
    &lt;li&gt;The problem "find the shortest &lt;a href="https://en.wikipedia.org/wiki/Travelling_salesman_problem" target="_blank"&gt;traveling salesman route&lt;/a&gt;" can be broken into &lt;code&gt;some route: distance(route) = n&lt;/code&gt; and &lt;code&gt;all route: !(distance(route) &amp;lt; n)&lt;/code&gt;. Then a route finder can find the first, and then convert the second into a &lt;code&gt;some&lt;/code&gt; and &lt;em&gt;fail&lt;/em&gt; to find it, proving &lt;code&gt;n&lt;/code&gt; is optimal.&lt;/li&gt;
    &lt;/ul&gt;
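    &lt;p&gt;The trick shared by all of these can be sketched in a few lines of Python. Assume we only have a "checker" that verifies a property over all values and reports the first counterexample (both function names here are hypothetical). Asking it to check the &lt;em&gt;negated&lt;/em&gt; property turns it into a finder.&lt;/p&gt;

```python
def check_all(values, prop):
    """Return None if every value satisfies prop, else the first counterexample."""
    for v in values:
        if not prop(v):
            return v
    return None


def find_some(values, prop):
    """Find a value satisfying prop by 'checking' that no value does."""
    return check_all(values, lambda v: not prop(v))
```

    &lt;p&gt;So &lt;code&gt;find_some(range(10), lambda v: v * v == 49)&lt;/code&gt; returns 7: the "test failure" of the negated property is the value we wanted.&lt;/p&gt;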
    &lt;p&gt;Even cooler to me is when a tool does &lt;em&gt;both&lt;/em&gt; finding and checking, but gives them different "meanings". In SQL, &lt;code&gt;some x: P(x)&lt;/code&gt; is true if we can &lt;em&gt;query&lt;/em&gt; for &lt;code&gt;P(x)&lt;/code&gt; and get a nonempty response, while &lt;code&gt;all x: P(x)&lt;/code&gt; is true if all records satisfy the &lt;code&gt;P(x)&lt;/code&gt; &lt;em&gt;constraint&lt;/em&gt;. Most SQL databases allow for complex queries but not complex constraints! You got &lt;code&gt;UNIQUE&lt;/code&gt;, &lt;code&gt;NOT NULL&lt;/code&gt;, &lt;code&gt;REFERENCES&lt;/code&gt;, which are fixed predicates, and &lt;code&gt;CHECK&lt;/code&gt;, which is one-record only.&lt;sup id="fnref:check"&gt;&lt;a class="footnote-ref" href="#fn:check"&gt;4&lt;/a&gt;&lt;/sup&gt;&lt;/p&gt;
    &lt;p&gt;Oh, and you got database triggers, which can run arbitrary queries and throw exceptions. So if you really need to enforce a complex constraint &lt;code&gt;P(x, y, z)&lt;/code&gt;, you put in a database trigger that queries &lt;code&gt;some x, y, z: !P(x, y, z)&lt;/code&gt; and throws an exception if it finds any results. That all works because of quantifier duality! See &lt;a href="https://eddmann.com/posts/maintaining-invariant-constraints-in-postgresql-using-trigger-functions/" target="_blank"&gt;here&lt;/a&gt; for an example of this in practice.&lt;/p&gt;
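    &lt;p&gt;Here's a small sketch of the same pattern using Python's built-in &lt;code&gt;sqlite3&lt;/code&gt; module rather than Postgres. The table, trigger, and rule ("a manager must be in the same department") are all made up for illustration, and a real version would also need a trigger on updates. The trigger queries for any pair of records violating the rule and aborts the insert if one exists.&lt;/p&gt;

```python
import sqlite3


def make_db():
    """Create an in-memory database with a hypothetical cross-record constraint."""
    db = sqlite3.connect(":memory:")
    db.executescript("""
        CREATE TABLE emp (id INTEGER PRIMARY KEY, dept TEXT, manager_id INTEGER);
        CREATE TRIGGER same_dept AFTER INSERT ON emp
        BEGIN
            -- some e, m: !P(e, m)  ==  the constraint "all e, m: P(e, m)" is violated
            SELECT RAISE(ABORT, 'manager in another department')
            WHERE EXISTS (
                SELECT 1 FROM emp e JOIN emp m ON e.manager_id = m.id
                WHERE e.dept != m.dept
            );
        END;
    """)
    return db


def try_insert(db, row):
    """Return True if the row was accepted, False if the trigger rejected it."""
    try:
        db.execute("INSERT INTO emp VALUES (?, ?, ?)", row)
        return True
    except sqlite3.DatabaseError:
        return False
```

    &lt;p&gt;Inserting &lt;code&gt;(1, "eng", None)&lt;/code&gt; and &lt;code&gt;(2, "eng", 1)&lt;/code&gt; succeeds, while &lt;code&gt;(3, "ops", 1)&lt;/code&gt; is rejected because the query finds a violating pair.&lt;/p&gt;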
    &lt;h3&gt;Duals more broadly&lt;/h3&gt;
    &lt;p&gt;"Dual" doesn't have a strict meaning in math, it's more of a vibe thing where all of the "duals" are kinda similar in meaning but don't strictly follow all of the same rules. &lt;em&gt;Usually&lt;/em&gt; things X and Y are duals if there is some transform &lt;code&gt;F&lt;/code&gt; where &lt;code&gt;X = F(Y)&lt;/code&gt; and &lt;code&gt;Y = F(X)&lt;/code&gt;, but not always. Maybe the category theorists have a formal definition that covers all of the different uses. Usually duals switch properties of things, too: an example showing &lt;code&gt;some x: P(x)&lt;/code&gt; becomes a &lt;em&gt;counterexample&lt;/em&gt; of &lt;code&gt;all x: !P(x)&lt;/code&gt;.&lt;/p&gt;
    &lt;p&gt;Under this definition, I think the dual of a list &lt;code&gt;l&lt;/code&gt; could be &lt;code&gt;reverse(l)&lt;/code&gt;. The first element of &lt;code&gt;l&lt;/code&gt; becomes the last element of &lt;code&gt;reverse(l)&lt;/code&gt;, the last becomes the first, etc. A more interesting case: the dual of a &lt;code&gt;K -&amp;gt; set(V)&lt;/code&gt; map is the &lt;code&gt;V -&amp;gt; set(K)&lt;/code&gt; map. For example, the dual of &lt;code&gt;lived_in_city = {alice: {paris}, bob: {detroit}, charlie: {detroit, paris}}&lt;/code&gt; is &lt;code&gt;city_lived_in_by = {paris: {alice, charlie}, detroit: {bob, charlie}}&lt;/code&gt;. This preserves the property that &lt;code&gt;x in map[y] &amp;lt;=&amp;gt; y in dual[x]&lt;/code&gt;.&lt;/p&gt;
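    &lt;p&gt;The map dual is easy to compute. A minimal Python sketch, using a function name of my own choosing and plain strings for the people and cities:&lt;/p&gt;

```python
def dual(mapping):
    """Invert a key-to-set-of-values map into a value-to-set-of-keys map."""
    inverted = {}
    for key, values in mapping.items():
        for value in values:
            inverted.setdefault(value, set()).add(key)
    return inverted


lived_in_city = {
    "alice": {"paris"},
    "bob": {"detroit"},
    "charlie": {"detroit", "paris"},
}
city_lived_in_by = dual(lived_in_city)
```

    &lt;p&gt;Applying &lt;code&gt;dual&lt;/code&gt; twice gets back the original map, as long as no key maps to an empty set.&lt;/p&gt;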
    &lt;p class="empty-line" style="height:16px; margin:0px !important;"&gt;&lt;/p&gt;
    &lt;div class="footnote"&gt;
    &lt;hr/&gt;
    &lt;ol&gt;
    &lt;li id="fn:retread"&gt;
    &lt;p&gt;And after writing this I just realized this is a partial retread of a newsletter I wrote &lt;a href="https://buttondown.com/hillelwayne/archive/logical-quantifiers-in-software/" target="_blank"&gt;a couple months ago&lt;/a&gt;. But only a &lt;em&gt;partial&lt;/em&gt; retread! &lt;a class="footnote-backref" href="#fnref:retread" title="Jump back to footnote 1 in the text"&gt;↩&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id="fn:fm"&gt;
    &lt;p&gt;Specifically "linear temporal logics" are modal logics, so &lt;code&gt;eventually P&lt;/code&gt; ("P is true in at least one state of each behavior") is the same as saying &lt;code&gt;!always !P&lt;/code&gt; ("not P isn't true in all states of all behaviors"). This is the basis of &lt;a href="https://www.hillelwayne.com/post/safety-and-liveness/" target="_blank"&gt;liveness checking&lt;/a&gt;. &lt;a class="footnote-backref" href="#fnref:fm" title="Jump back to footnote 2 in the text"&gt;↩&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id="fn:antithesis"&gt;
    &lt;p&gt;I don't know for sure, but my best guess is that Antithesis does something similar &lt;a href="https://antithesis.com/blog/tag/games/" target="_blank"&gt;when their fuzzer beats videogames&lt;/a&gt;. They're doing fuzzing, not model checking, but they have the same purpose: checking that complex state spaces don't have bugs. Making the bug "we can't reach the end screen" can make a fuzzer output a complete end-to-end run of the game. Obvs a lot more complicated than that but that's the general idea at least. &lt;a class="footnote-backref" href="#fnref:antithesis" title="Jump back to footnote 3 in the text"&gt;↩&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id="fn:check"&gt;
    &lt;p&gt;For &lt;code&gt;CHECK&lt;/code&gt; to constrain multiple records you would need to use a subquery. Core SQL does not support subqueries in check. It is an optional database "feature outside of core SQL" (F671), which &lt;a href="https://www.postgresql.org/docs/current/unsupported-features-sql-standard.html" target="_blank"&gt;Postgres does not support&lt;/a&gt;. &lt;a class="footnote-backref" href="#fnref:check" title="Jump back to footnote 4 in the text"&gt;↩&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;/ol&gt;
    &lt;/div&gt;</description>
                <pubDate>Wed, 27 Aug 2025 19:25:32 +0000</pubDate>
                <guid>https://buttondown.com/hillelwayne/archive/logical-duals-in-software-engineering/</guid>
            </item>
            <item>
                <title>Sapir-Whorf does not apply to Programming Languages</title>
                <link>https://buttondown.com/hillelwayne/archive/sapir-whorf-does-not-apply-to-programming/</link>
                <description>&lt;p&gt;&lt;em&gt;This one is a hot mess but it's too late in the week to start over. Oh well!&lt;/em&gt;&lt;/p&gt;
    &lt;p&gt;Someone recognized me at last week's &lt;a href="https://www.chipy.org/" target="_blank"&gt;Chipy&lt;/a&gt; and asked for my opinion on the Sapir-Whorf hypothesis in programming languages. I thought this was interesting enough to make a newsletter. First what it is, then why it &lt;em&gt;looks&lt;/em&gt; like it applies, and then why it doesn't apply after all.&lt;/p&gt;
    &lt;h3&gt;The Sapir-Whorf Hypothesis&lt;/h3&gt;
    &lt;blockquote&gt;
    &lt;p&gt;We dissect nature along lines laid down by our native language. — &lt;a href="https://web.mit.edu/allanmc/www/whorf.scienceandlinguistics.pdf" target="_blank"&gt;Whorf&lt;/a&gt;&lt;/p&gt;
    &lt;/blockquote&gt;
    &lt;p&gt;To quote from a &lt;a href="https://www.amazon.com/Linguistics-Complete-Introduction-Teach-Yourself/dp/1444180320" target="_blank"&gt;Linguistics book I've read&lt;/a&gt;, the hypothesis is that "an individual's fundamental perception of reality is moulded by the language they speak." As a massive oversimplification, if English did not have a word for "rebellion", we would not be able to conceive of rebellion. This view, now called &lt;a href="https://en.wikipedia.org/wiki/Linguistic_determinism" target="_blank"&gt;Linguistic Determinism&lt;/a&gt;, is mostly rejected by modern linguists.&lt;/p&gt;
    &lt;p&gt;The "weak" form of SWH is that the language we speak influences, but does not &lt;em&gt;decide&lt;/em&gt;, our cognition. &lt;a href="https://langcog.stanford.edu/papers/winawer2007.pdf" target="_blank"&gt;For example&lt;/a&gt;, Russian has distinct words for "light blue" and "dark blue", so Russian speakers can discriminate between a "light blue" and a "dark blue" shade faster than they can discriminate two "light blue" shades. English does not have distinct words, so English speakers discriminate those at the same speed. This &lt;strong&gt;linguistic relativism&lt;/strong&gt; seems to have lots of empirical support in studies, but mostly with "small indicators". I don't think there's anything that convincingly shows linguistic relativism having effects on a societal level.&lt;sup id="fnref:economic-behavior"&gt;&lt;a class="footnote-ref" href="#fn:economic-behavior"&gt;1&lt;/a&gt;&lt;/sup&gt;&lt;/p&gt;
    &lt;p&gt;The weak form of SWH for software would then be "the programming languages you know affect how you think about programs."&lt;/p&gt;
    &lt;h3&gt;SWH in software&lt;/h3&gt;
    &lt;p&gt;This seems like a natural fit, as different paradigms solve problems in different ways. Consider the &lt;a href="https://hadid.dev/posts/living-coding/" target="_blank"&gt;hardest interview question ever&lt;/a&gt;, "given a list of integers, sum the even numbers". Here it is in four paradigms:&lt;/p&gt;
    &lt;ul&gt;
    &lt;li&gt;Procedural: &lt;code&gt;total = 0; foreach x in list {if IsEven(x) total += x}&lt;/code&gt;. You iterate over data with an algorithm.&lt;/li&gt;
    &lt;li&gt;Functional: &lt;code&gt;reduce(+, filter(IsEven, list), 0)&lt;/code&gt;. You apply transformations to data to get a result.&lt;/li&gt;
    &lt;li&gt;Array: &lt;code&gt;+ fold L * iseven L&lt;/code&gt;.&lt;sup id="fnref:J"&gt;&lt;a class="footnote-ref" href="#fn:J"&gt;2&lt;/a&gt;&lt;/sup&gt; In English: replace every element in L with 0 if odd and 1 if even, multiply the new array elementwise against &lt;code&gt;L&lt;/code&gt;, and then sum the resulting array. It's like functional except everything is in terms of whole-array transformations.&lt;/li&gt;
    &lt;li&gt;Logical: Somethingish like &lt;code&gt;sumeven(0, []). sumeven(X, [Y|L]) :- iseven(Y) -&amp;gt; sumeven(Z, L), X is Y + Z ; sumeven(X, L)&lt;/code&gt;. You write a set of equations that express what it means for X to &lt;em&gt;be&lt;/em&gt; the sum of the evens of L.&lt;/li&gt;
    &lt;/ul&gt;
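    &lt;p&gt;As a rough illustration, here are the first three bullets as runnable Python. The function names are made up for this sketch, and the array version only imitates the whole-array style with plain lists rather than using a real array language.&lt;/p&gt;

```python
from functools import reduce
from operator import add, mul


def is_even(x):
    return x % 2 == 0


def sum_even_procedural(xs):
    # Iterate over the data, updating an accumulator as we go.
    total = 0
    for x in xs:
        if is_even(x):
            total += x
    return total


def sum_even_functional(xs):
    # Filter, then fold the survivors with +.
    return reduce(add, filter(is_even, xs), 0)


def sum_even_array(xs):
    # Build a 0/1 mask, multiply it elementwise against xs, then sum.
    mask = [int(is_even(x)) for x in xs]
    return sum(map(mul, xs, mask))
```

    &lt;p&gt;All three return the same number for the same list; what changes is which intermediate objects you end up thinking about.&lt;/p&gt;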
    &lt;p&gt;There are some similarities between how these paradigms approach the problem, but each is also unique. It's plausible that where a procedural programmer "sees" a for loop, a functional programmer "sees" a map and an array programmer "sees" a singular operator.&lt;/p&gt;
    &lt;p&gt;I also have a personal experience with how a language changed the way I think. I use &lt;a href="https://learntla.com/" target="_blank"&gt;TLA+&lt;/a&gt; to detect concurrency bugs in software designs. After doing this for several years, I've gotten much better at intuitively seeing race conditions in things even &lt;em&gt;without&lt;/em&gt; writing a TLA+ spec. It's even leaked out into my day-to-day life. I see concurrency bugs everywhere. Phone tag is a race condition.&lt;/p&gt;
    &lt;p&gt;But I still don't think SWH is the right mental model to use, for one big reason: language is &lt;em&gt;special&lt;/em&gt;. We think in language, we dream in language, there are huge parts of our brain dedicated to processing language. &lt;a href="https://web.eecs.umich.edu/~weimerw/p/weimer-icse2017-preprint.pdf" target="_blank"&gt;We don't use those parts of our brain to read code&lt;/a&gt;. &lt;/p&gt;
    &lt;p&gt;SWH is so intriguing because it seems so unnatural, that the way we express thoughts changes the way we &lt;em&gt;think&lt;/em&gt; thoughts. That I would be a different person if I was bilingual in Spanish, not because the life experiences it would open up but because &lt;a href="https://en.wikipedia.org/wiki/Grammatical_gender" target="_blank"&gt;grammatical gender&lt;/a&gt; would change my brain.&lt;/p&gt;
    &lt;p&gt;Compared to that, the idea that programming languages affect our brain is more natural and has a simpler explanation:&lt;/p&gt;
    &lt;p&gt;It's the goddamned &lt;a href="https://en.wikipedia.org/wiki/Tetris_effect" target="_blank"&gt;Tetris Effect&lt;/a&gt;.&lt;/p&gt;
    &lt;h3&gt;The Goddamned Tetris Effect&lt;/h3&gt;
    &lt;blockquote&gt;
    &lt;p&gt;The Tetris effect occurs when someone dedicates vast amounts of time, effort and concentration on an activity which thereby alters their thoughts, dreams, and other experiences not directly linked to said activity. — Wikipedia&lt;/p&gt;
    &lt;/blockquote&gt;
    &lt;p&gt;Every skill does this. I'm a juggler, so every item I can see right now has a tiny metadata field of "how would this tumble if I threw it up". I teach professionally, so I'm always noticing good teaching examples everywhere. I spent years writing specs in TLA+ and watching the model checker throw concurrency errors in my face, so now race conditions have visceral presence. Every skill does this. &lt;/p&gt;
    &lt;p&gt;And to really develop a skill, you gotta practice. This is where I think programming paradigms do something especially interesting that makes them feel more Sapir-Whorfy than, like, juggling. Some languages mix lots of different paradigms, like Javascript or Rust. Others like Haskell really focus on &lt;em&gt;excluding&lt;/em&gt; paradigms. If something is easy for you in procedural and hard in FP, in JS you could just lean on the procedural bits. In Haskell, &lt;em&gt;too bad&lt;/em&gt;, you're learning how to do it the functional way.&lt;sup id="fnref:escape-hatch"&gt;&lt;a class="footnote-ref" href="#fn:escape-hatch"&gt;3&lt;/a&gt;&lt;/sup&gt;&lt;/p&gt;
    &lt;p&gt;And that forces you to practice, which makes you see functional patterns everywhere. Tetris effect!&lt;/p&gt;
    &lt;p&gt;Anyway this may all seem like quibbling: why does it matter whether we call it "Tetris effect" or "Sapir-Whorf", if our brains get rewired either way? For me, personally, it's because SWH sounds really special and &lt;em&gt;unique&lt;/em&gt;, while Tetris effect sounds mundane and commonplace. Which it &lt;em&gt;is&lt;/em&gt;. But also because TE suggests it's not just programming languages that affect how we think about software, it's &lt;em&gt;everything&lt;/em&gt;. Spending lots of time debugging, profiling, writing exploits, whatever will change what you notice, what you think a program "is". And that's a way useful idea that shouldn't be restricted to just PLs.&lt;/p&gt;
    &lt;p&gt;(Then again, the Tetris Effect might also be a bad analogy to what's going on here, because I think part of it is that it wears off after a while. Maybe it's just "building a mental model is good".)&lt;/p&gt;
    &lt;h3&gt;I just realized all of this might have missed the point&lt;/h3&gt;
    &lt;p&gt;Wait are people actually using SWH to mean the &lt;em&gt;weak form&lt;/em&gt; or the &lt;em&gt;strong&lt;/em&gt; form? Like that if a language doesn't make something possible, its users can't conceive of it being possible. I've been arguing against the weaker form in software but I think I've seen strong form often too. Dammit.&lt;/p&gt;
    &lt;p&gt;Well, it's already Thursday and far too late to rewrite the whole newsletter, so I'll just outline the problem with the strong form: we describe the capabilities of our programming languages &lt;em&gt;with human language&lt;/em&gt;. In college I wrote a lot of crappy physics lab C++ and one of my projects was filled with comments like "man I hate copying this triply-nested loop in 10 places with one-line changes, I wish I could put it in one function and just take the changing line as a parameter". Even if I hadn't &lt;em&gt;encountered&lt;/em&gt; higher-order functions, I was still perfectly capable of expressing the idea. So if the strong SWH isn't true for human language, it's not true for programming languages either.&lt;/p&gt;
    &lt;p class="empty-line" style="height:16px; margin:0px !important;"&gt;&lt;/p&gt;
    &lt;hr/&gt;
    &lt;h1&gt;Systems Distributed talk now up!&lt;/h1&gt;
    &lt;p&gt;&lt;a href="https://www.youtube.com/watch?v=d9cM8f_qSLQ" target="_blank"&gt;Link here&lt;/a&gt;! Original abstract:&lt;/p&gt;
    &lt;blockquote&gt;
    &lt;p&gt;Building correct distributed systems takes thinking outside the box, and the fastest way to do that is to think inside a different box. One different box is "formal methods", the discipline of mathematically verifying software and systems. Formal methods encourages unusual perspectives on systems, models that are also broadly useful to all software developers. In this talk we will learn two of the most important FM perspectives: the abstract specifications behind software systems, and the property they are and aren't supposed to have.&lt;/p&gt;
    &lt;/blockquote&gt;
    &lt;p&gt;The talk ended up evolving away from that abstract but I like how it turned out!&lt;/p&gt;
    &lt;div class="footnote"&gt;
    &lt;hr/&gt;
    &lt;ol&gt;
    &lt;li id="fn:economic-behavior"&gt;
    &lt;p&gt;There is &lt;a href="https://www.anderson.ucla.edu/faculty/keith.chen/papers/LanguageWorkingPaper.pdf" target="_blank"&gt;one paper&lt;/a&gt; arguing that people who speak a language that doesn't have a "future tense" are more likely to save and eat healthy, but it is... &lt;a href="https://www.reddit.com/r/linguistics/comments/rcne7m/comment/hnz2705/" target="_blank"&gt;extremely questionable&lt;/a&gt;. &lt;a class="footnote-backref" href="#fnref:economic-behavior" title="Jump back to footnote 1 in the text"&gt;↩&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id="fn:J"&gt;
    &lt;p&gt;The original J is &lt;code&gt;+/ (* (0 =  2&amp;amp;|))&lt;/code&gt;. Obligatory &lt;a href="https://www.jsoftware.com/papers/tot.htm" target="_blank"&gt;Notation as a Tool of Thought&lt;/a&gt; reference &lt;a class="footnote-backref" href="#fnref:J" title="Jump back to footnote 2 in the text"&gt;↩&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id="fn:escape-hatch"&gt;
    &lt;p&gt;Though if it's &lt;em&gt;too&lt;/em&gt; hard for you, that's why languages have &lt;a href="https://buttondown.com/hillelwayne/archive/programming-language-escape-hatches/" target="_blank"&gt;escape hatches&lt;/a&gt; &lt;a class="footnote-backref" href="#fnref:escape-hatch" title="Jump back to footnote 3 in the text"&gt;↩&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;/ol&gt;
    &lt;/div&gt;</description>
                <pubDate>Thu, 21 Aug 2025 13:00:00 +0000</pubDate>
                <guid>https://buttondown.com/hillelwayne/archive/sapir-whorf-does-not-apply-to-programming/</guid>
            </item>
            <item>
                <title>Software books I wish I could read</title>
                <link>https://buttondown.com/hillelwayne/archive/software-books-i-wish-i-could-read/</link>
                <description>&lt;h3&gt;New Logic for Programmers Release!&lt;/h3&gt;
    &lt;p&gt;&lt;a href="https://leanpub.com/logic/" target="_blank"&gt;v0.11 is now available&lt;/a&gt;! This is over 20% longer than v0.10, with a new chapter on code proofs, three chapter overhauls, and more! &lt;a href="https://github.com/logicforprogrammers/book-assets/blob/master/CHANGELOG.md" target="_blank"&gt;Full release notes here&lt;/a&gt;.&lt;/p&gt;
    &lt;p&gt;&lt;img alt="Cover of the boooooook" class="newsletter-image" src="https://assets.buttondown.email/images/92b4a35d-2bdd-416a-92c7-15ff42b49d8d.jpg?w=960&amp;amp;fit=max"/&gt;&lt;/p&gt;
    &lt;h1&gt;Software books I wish I could read&lt;/h1&gt;
    &lt;p&gt;I'm writing &lt;em&gt;Logic for Programmers&lt;/em&gt; because it's a book I wanted to have ten years ago. I had to learn everything in it the hard way, which is why I'm ensuring that everybody else can learn it the easy way.&lt;/p&gt;
    &lt;p&gt;Books occupy a sort of weird niche in software. We're great at sharing information via blogs and git repos and entire websites. These have many benefits over books: they're free, they're easily accessible, they can be updated quickly, they can even be interactive. But no blog post has influenced me as profoundly as &lt;a href="https://buttondown.com/hillelwayne/archive/why-you-should-read-data-and-reality/" target="_blank"&gt;Data and Reality&lt;/a&gt; or &lt;a href="https://www.oreilly.com/library/view/making-software/9780596808310/" target="_blank"&gt;Making Software&lt;/a&gt;. There is no blog or talk about debugging as good as the 
    &lt;a href="https://debuggingrules.com/" target="_blank"&gt;Debugging&lt;/a&gt; book.&lt;/p&gt;
    &lt;p&gt;It might not be anything deeper than "people spend more time per word on writing books than blog posts". I dunno.&lt;/p&gt;
    &lt;p&gt;So here are some other books I wish I could read. I don't &lt;em&gt;think&lt;/em&gt; any of them exist yet but it's a big world out there. Also while they're probably best as books, a website or a series of blog posts would be ok too.&lt;/p&gt;
    &lt;h4&gt;Everything about Configurations&lt;/h4&gt;
    &lt;p&gt;The whole topic of how we configure software, whether by CLI flags, environmental vars, or JSON/YAML/XML/Dhall files. What causes the &lt;a href="https://mikehadlow.blogspot.com/2012/05/configuration-complexity-clock.html" target="_blank"&gt;configuration complexity clock&lt;/a&gt;? How do we distinguish between basic, advanced, and developer-only configuration options? When should we disallow configuration? How do we test all possible configurations for correctness? Why do so many widespread outages trace back to misconfiguration, and how do we prevent them? &lt;/p&gt;
    &lt;p&gt;I also want the same for plugin systems. Manifests, permissions, common APIs and architectures, etc. Configuration management is more universal, though, since everybody either uses software with configuration or has made software with configuration.&lt;/p&gt;
    &lt;h4&gt;The Big Book of Complicated Data Schemas&lt;/h4&gt;
    &lt;p&gt;I guess this would kind of be like &lt;a href="https://schema.org/docs/full.html" target="_blank"&gt;Schema.org&lt;/a&gt;, except with a lot more on the "why" and not the what. Why is it important for the &lt;a href="https://schema.org/Volcano" target="_blank"&gt;Volcano model&lt;/a&gt; to have a "smokingAllowed" field?&lt;sup id="fnref:volcano"&gt;&lt;a class="footnote-ref" href="#fn:volcano"&gt;1&lt;/a&gt;&lt;/sup&gt;&lt;/p&gt;
    &lt;p&gt;I'd see this less as "here's your guide to putting Volcanos in your database" and more "here's recurring motifs in modeling interesting domains", to help a person see sources of complexity in their &lt;em&gt;own&lt;/em&gt; domain. Does something crop up if the references can form a cycle? If a relationship needs to be strictly temporary, or a reference can change type? Bonus: path dependence in data models, where an additional requirement leads to a vastly different ideal data model that a company couldn't do because they made the old model.&lt;/p&gt;
    &lt;p&gt;(This has got to exist, right? Business modeling is a big enough domain that this must exist. Maybe &lt;a href="https://essenceofsoftware.com/" target="_blank"&gt;The Essence of Software&lt;/a&gt; touches on this? Man I feel bad I haven't read that yet.)&lt;/p&gt;
    &lt;h4&gt;Computer Science for Software Engineers&lt;/h4&gt;
    &lt;p&gt;Yes, I checked, this book does not exist (though maybe &lt;a href="https://www.amazon.com/A-Programmers-Guide-to-Computer-Science-2-book-series/dp/B08433QR53" target="_blank"&gt;this&lt;/a&gt; is the same thing). I don't have any formal software education; everything I know was either self-taught or learned on the job. But it's way easier to learn software engineering that way than computer science. And I bet there's a lot of other engineers in the same boat. &lt;/p&gt;
    &lt;p&gt;This book wouldn't have to be comprehensive or instructive: just enough about each topic to understand why it's an area of study and appreciate how research in it eventually finds its way into practice. &lt;/p&gt;
    &lt;h4&gt;MISU Patterns&lt;/h4&gt;
    &lt;p&gt;MISU, or "Make Illegal States Unrepresentable", is the idea of designing system invariants in the structure of your data. For example, if a &lt;code&gt;Contact&lt;/code&gt; needs at least one of &lt;code&gt;email&lt;/code&gt; or &lt;code&gt;phone&lt;/code&gt; to be non-null, make it a sum type over &lt;code&gt;EmailContact, PhoneContact, EmailPhoneContact&lt;/code&gt; (from &lt;a href="https://fsharpforfunandprofit.com/posts/designing-with-types-making-illegal-states-unrepresentable/" target="_blank"&gt;this post&lt;/a&gt;). MISU is great.&lt;/p&gt;
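    &lt;p&gt;A minimal sketch of that &lt;code&gt;Contact&lt;/code&gt; example in Python, assuming a static type checker is in the loop (Python itself won't enforce the annotations at runtime). The &lt;code&gt;reachable_by&lt;/code&gt; helper is hypothetical and only there to show how a consumer handles each case.&lt;/p&gt;

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class EmailContact:
    email: str


@dataclass(frozen=True)
class PhoneContact:
    phone: str


@dataclass(frozen=True)
class EmailPhoneContact:
    email: str
    phone: str


# A Contact is exactly one of the three cases, so there is no case
# that has neither an email nor a phone.
Contact = Union[EmailContact, PhoneContact, EmailPhoneContact]


def reachable_by(contact: Contact) -> list:
    # Each case tells us which fields exist; no null checks needed.
    if isinstance(contact, EmailContact):
        return [contact.email]
    if isinstance(contact, PhoneContact):
        return [contact.phone]
    return [contact.email, contact.phone]
```

    &lt;p&gt;Compare this with a single record that has two nullable fields: there, "both are null" is a value you can construct and have to remember to reject.&lt;/p&gt;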
    &lt;p&gt;Most MISU in the wild look very different than that, though, because the concept of MISU is so broad there's lots of different ways to achieve it. And that means there are "patterns": smart constructors, product types, properly using sets, &lt;a href="https://lexi-lambda.github.io/blog/2020/11/01/names-are-not-type-safety/" target="_blank"&gt;newtypes to some degree&lt;/a&gt;, etc. Some of them are specific to typed FP, while others can be used in even untyped languages. Someone oughta make a pattern book.&lt;/p&gt;
    &lt;p&gt;My one request would be to not give them cutesy names. Do something like the &lt;a href="https://ia600301.us.archive.org/18/items/Thompson2016MotifIndex/Thompson_2016_Motif-Index.pdf" target="_blank"&gt;Aarne–Thompson–Uther Index&lt;/a&gt;, where items are given names like "Recognition by manner of throwing cakes of different weights into faces of old uncles". Names can come later.&lt;/p&gt;
    &lt;h4&gt;The Tools of '25&lt;/h4&gt;
    &lt;p&gt;Not something I'd read, but something to recommend to junior engineers. Starting out it's easy to think the only bit that matters is the language or framework and not realize the enormous amount of surrounding tooling you'll have to learn. This book would cover the basics of tools that &lt;em&gt;enough&lt;/em&gt; developers will probably use at some point: git, VSCode, &lt;em&gt;very&lt;/em&gt; basic Unix and bash, curl. Maybe the general concepts of tools that appear in every ecosystem, like package managers, build tools, task runners. That might be easier if we specialize this to one particular domain, like webdev or data science.&lt;/p&gt;
    &lt;p&gt;Ideally the book would only have to be updated every five years or so. No LLM stuff because I don't expect the tooling will be stable through 2026, to say nothing of 2030.&lt;/p&gt;
    &lt;h4&gt;A History of Obsolete Optimizations&lt;/h4&gt;
    &lt;p&gt;Probably better as a really long blog series. Each chapter would be broken up into two parts:&lt;/p&gt;
    &lt;ol&gt;
    &lt;li&gt;A deep dive into a brilliant, elegant, insightful historical optimization designed to work within the constraints of that era's computing technology&lt;/li&gt;
    &lt;li&gt;What we started doing instead, once we had more compute/network/storage available.&lt;/li&gt;
    &lt;/ol&gt;
    &lt;p&gt;Cf. &lt;a href="https://prog21.dadgum.com/29.html" target="_blank"&gt;A Spellchecker Used to Be a Major Feat of Software Engineering&lt;/a&gt;. Bonus topics would be brilliance obsoleted by standardization (like what people did before git and json were universal), optimizations we do today that may not stand the test of time, and optimizations from the past that &lt;em&gt;did&lt;/em&gt;.&lt;/p&gt;
    &lt;h4&gt;Sphinx Internals&lt;/h4&gt;
    &lt;p&gt;&lt;em&gt;I need this&lt;/em&gt;. I've spent so much goddamn time digging around in Sphinx and docutils source code I'm gonna throw up.&lt;/p&gt;
    &lt;hr/&gt;
    &lt;h3&gt;Systems Distributed Talk Today!&lt;/h3&gt;
    &lt;p&gt;Online premiere's at noon central / 5 PM UTC, &lt;a href="https://www.youtube.com/watch?v=d9cM8f_qSLQ" target="_blank"&gt;here&lt;/a&gt;! I'll be hanging out to answer questions and be awkward. You ever watch a recording of your own talk? It's real uncomfortable!&lt;/p&gt;
    &lt;div class="footnote"&gt;
    &lt;hr/&gt;
    &lt;ol&gt;
    &lt;li id="fn:volcano"&gt;
    &lt;p&gt;In &lt;em&gt;this&lt;/em&gt; case because it's a field on one of &lt;code&gt;Volcano&lt;/code&gt;'s supertypes. I guess schemas gotta follow LSP too &lt;a class="footnote-backref" href="#fnref:volcano" title="Jump back to footnote 1 in the text"&gt;↩&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;/ol&gt;
    &lt;/div&gt;</description>
                <pubDate>Wed, 06 Aug 2025 13:00:00 +0000</pubDate>
                <guid>https://buttondown.com/hillelwayne/archive/software-books-i-wish-i-could-read/</guid>
            </item>
            <item>
                <title>2000 words about arrays and tables</title>
                <link>https://buttondown.com/hillelwayne/archive/2000-words-about-arrays-and-tables/</link>
                <description>&lt;p&gt;I'm way too discombobulated from getting next month's release of &lt;a href="https://leanpub.com/logic/" target="_blank"&gt;Logic for Programmers&lt;/a&gt; ready, so I'm pulling an idea from the slush pile. Basically I wanted to come up with a mental model of arrays as a concept that explained APL-style multidimensional arrays and tables but also why there weren't multitables.&lt;/p&gt;
    &lt;p&gt;So, arrays. In all languages they are basically the same: they map a sequence of numbers (I'll use &lt;code&gt;1..N&lt;/code&gt;)&lt;sup id="fnref:1-indexing"&gt;&lt;a class="footnote-ref" href="#fn:1-indexing"&gt;1&lt;/a&gt;&lt;/sup&gt; to homogeneous values (values of a single type). This is in contrast to the other two foundational types, associative arrays (which map an arbitrary type to homogeneous values) and structs (which map a fixed set of keys to &lt;em&gt;heterogeneous&lt;/em&gt; values). Arrays appear in PLs earlier than the other two, possibly because they have the simplest implementation and the most obvious application to scientific computing. The OG FORTRAN had arrays. &lt;/p&gt;
    &lt;p&gt;I'm interested in two structural extensions to arrays. The first, found in languages like nushell and frameworks like Pandas, is the &lt;em&gt;table&lt;/em&gt;. Tables have string keys like a struct &lt;em&gt;and&lt;/em&gt; indexes like an array. Each row is a struct, so you can get "all values in this column" or "all values for this row". They're heavily used in databases and data science.&lt;/p&gt;
    &lt;p&gt;The other extension is the &lt;strong&gt;N-dimensional array&lt;/strong&gt;, mostly seen in APLs like Dyalog and J. Think of this like arrays-of-arrays(-of-arrays), except all arrays at the same depth have the same length. So &lt;code&gt;[[1,2,3],[4]]&lt;/code&gt; is not a 2D array, but &lt;code&gt;[[1,2,3],[4,5,6]]&lt;/code&gt; is. This means that N-arrays can be queried on any axis.&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;]&lt;/span&gt;&lt;span class="nv"&gt;x&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;=:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;i&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt;
    &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;
    &lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;5&lt;/span&gt;
    &lt;span class="mi"&gt;6&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;7&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;8&lt;/span&gt;
    &lt;span class="w"&gt;   &lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;{&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;x&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="c1"&gt;NB. first row&lt;/span&gt;
    &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;
    &lt;span class="w"&gt;   &lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;{"&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;x&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="c1"&gt;NB. first column&lt;/span&gt;
    &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;6&lt;/span&gt;
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;So, I've had some ideas on a conceptual model of arrays that explains all of these variations and possibly predicts new variations. I wrote up my notes and did the bare minimum of editing and polishing. Somehow it ended up being 2000 words.&lt;/p&gt;
    &lt;h3&gt;1-dimensional arrays&lt;/h3&gt;
    &lt;p&gt;A one-dimensional array is a function over &lt;code&gt;1..N&lt;/code&gt; for some N. &lt;/p&gt;
    &lt;p&gt;To be clear, these are &lt;em&gt;math&lt;/em&gt; functions, not programming functions. Programming functions take values of a type and perform computations on them. Math functions take values of a fixed set and return values of another set. So the array &lt;code&gt;[a, b, c, d]&lt;/code&gt; can be represented by the function &lt;code&gt;(1 -&amp;gt; a ++ 2 -&amp;gt; b ++ 3 -&amp;gt; c ++ 4 -&amp;gt; d)&lt;/code&gt;. Let's write the set of all four-element character arrays as &lt;code&gt;1..4 -&amp;gt; char&lt;/code&gt;. &lt;code&gt;1..4&lt;/code&gt; is the function's &lt;strong&gt;domain&lt;/strong&gt;.&lt;/p&gt;
    &lt;p&gt;The set of all character arrays is the empty array + the functions with domain &lt;code&gt;1..1&lt;/code&gt; + the functions with domain &lt;code&gt;1..2&lt;/code&gt; + ... Let's call this set &lt;code&gt;Array[Char]&lt;/code&gt;. Our compilers can enforce that a value belongs to &lt;code&gt;Array[Char]&lt;/code&gt;, but some operations care about the more specific type, like matrix multiplication. This is either checked with the runtime type or, in exotic enough languages, with static dependent types.&lt;/p&gt;
    &lt;p&gt;(This is actually how TLA+ does things: the basic collection types are functions and sets, and a function with domain 1..N is a sequence.)&lt;/p&gt;
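    &lt;p&gt;To make the "array as a function over &lt;code&gt;1..N&lt;/code&gt;" idea concrete, here's a small Python sketch that uses a dict as a stand-in for a math function. The helper names (&lt;code&gt;as_function&lt;/code&gt;, &lt;code&gt;domain&lt;/code&gt;, &lt;code&gt;is_array&lt;/code&gt;) are invented for illustration.&lt;/p&gt;

```python
def as_function(values):
    # Represent [a, b, c, d] as the mapping 1 -> a, 2 -> b, 3 -> c, 4 -> d.
    return {i: v for i, v in enumerate(values, start=1)}


def domain(f):
    return set(f.keys())


def is_array(f):
    # f counts as an array when its domain is exactly 1..N for some N.
    # N = 0 gives the empty array.
    return domain(f) == set(range(1, len(f) + 1))
```

    &lt;p&gt;Under this reading, a dict with a gap in its keys, like &lt;code&gt;{1: 'a', 3: 'c'}&lt;/code&gt;, is still a function but is not an array.&lt;/p&gt;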
    &lt;h3&gt;2-dimensional arrays&lt;/h3&gt;
    &lt;p&gt;Now take the 3x4 matrix&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="w"&gt;   &lt;/span&gt;&lt;span class="nv"&gt;i&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;4&lt;/span&gt;
    &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt;
    &lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="mi"&gt;6&lt;/span&gt;&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="mi"&gt;7&lt;/span&gt;
    &lt;span class="mi"&gt;8&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;9&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;11&lt;/span&gt;
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;There are two equally valid ways to represent the array function:&lt;/p&gt;
    &lt;ol&gt;
    &lt;li&gt;A function that takes a row and a column and returns the value at that index, so it would look like &lt;code&gt;f(r: 1..3, c: 1..4) -&amp;gt; Int&lt;/code&gt;.&lt;/li&gt;
    &lt;li&gt;A function that takes a row and returns that row as an array, aka another function: &lt;code&gt;f(r: 1..3) -&amp;gt; g(c: 1..4) -&amp;gt; Int&lt;/code&gt;.&lt;sup id="fnref:associative"&gt;&lt;a class="footnote-ref" href="#fn:associative"&gt;2&lt;/a&gt;&lt;/sup&gt;&lt;/li&gt;
    &lt;/ol&gt;
    &lt;p&gt;Man, (2) looks a lot like &lt;a href="https://en.wikipedia.org/wiki/Currying" target="_blank"&gt;currying&lt;/a&gt;! In Haskell, functions can only have one parameter. If you write &lt;code&gt;(+) 6 10&lt;/code&gt;, &lt;code&gt;(+) 6&lt;/code&gt; first returns a &lt;em&gt;new&lt;/em&gt; function &lt;code&gt;f y = y + 6&lt;/code&gt;, and then applies &lt;code&gt;f 10&lt;/code&gt; to get 16. So &lt;code&gt;(+)&lt;/code&gt; has the type signature &lt;code&gt;Int -&amp;gt; Int -&amp;gt; Int&lt;/code&gt;: it's a function that takes an &lt;code&gt;Int&lt;/code&gt; and returns a function of type &lt;code&gt;Int -&amp;gt; Int&lt;/code&gt;.&lt;sup id="fnref:typeclass"&gt;&lt;a class="footnote-ref" href="#fn:typeclass"&gt;3&lt;/a&gt;&lt;/sup&gt;&lt;/p&gt;
    &lt;p&gt;Similarly, our 2D array can be represented as an array function that returns array functions: it has type &lt;code&gt;1..3 -&amp;gt; 1..4 -&amp;gt; Int&lt;/code&gt;, meaning it takes a row index and returns &lt;code&gt;1..4 -&amp;gt; Int&lt;/code&gt;, aka a single array.&lt;/p&gt;
    &lt;p&gt;(This differs from conventional array-of-arrays because it forces all of the subarrays to have the same domain, aka the same length. If we wanted to permit ragged arrays, we would instead have the type &lt;code&gt;1..3 -&amp;gt; Array[Int]&lt;/code&gt;.)&lt;/p&gt;
    &lt;p&gt;Why is this useful? A couple of reasons. First of all, we can apply function transformations to arrays, like "&lt;a href="https://blog.zdsmith.com/series/combinatory-programming.html" target="_blank"&gt;combinators&lt;/a&gt;". For example, we can flip any function of type &lt;code&gt;a -&amp;gt; b -&amp;gt; c&lt;/code&gt; into a function of type &lt;code&gt;b -&amp;gt; a -&amp;gt; c&lt;/code&gt;. So given a function that takes rows and returns columns, we can produce one that takes columns and returns rows. That's just a matrix transposition! &lt;/p&gt;
    &lt;p&gt;Second, we can extend this to any number of dimensions: a three-dimensional array is one with type &lt;code&gt;1..M -&amp;gt; 1..N -&amp;gt; 1..O -&amp;gt; V&lt;/code&gt;. We can still use function transformations to rearrange the array along any ordering of axes.&lt;/p&gt;
    &lt;p&gt;Speaking of dimensions:&lt;/p&gt;
    &lt;h3&gt;What are dimensions, anyway&lt;/h3&gt;
    &lt;p&gt;Okay, so now imagine we have a &lt;code&gt;Row&lt;/code&gt; × &lt;code&gt;Col&lt;/code&gt; grid of pixels, where each pixel is a struct of type &lt;code&gt;Pixel(R: int, G: int, B: int)&lt;/code&gt;. So the array is&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;Row -&amp;gt; Col -&amp;gt; Pixel
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;But we can also represent the &lt;em&gt;Pixel struct&lt;/em&gt; with a function: &lt;code&gt;Pixel(R: 0, G: 0, B: 255)&lt;/code&gt; is the function where &lt;code&gt;f(R) = 0&lt;/code&gt;, &lt;code&gt;f(G) = 0&lt;/code&gt;, &lt;code&gt;f(B) = 255&lt;/code&gt;, making it a function of type &lt;code&gt;{R, G, B} -&amp;gt; Int&lt;/code&gt;. So the array is actually the function&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;Row -&amp;gt; Col -&amp;gt; {R, G, B} -&amp;gt; Int
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;And then we can rearrange the parameters of the function like this:&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;{R, G, B} -&amp;gt; Row -&amp;gt; Col -&amp;gt; Int
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;Even though the set &lt;code&gt;{R, G, B}&lt;/code&gt; is not of form 1..N, this clearly has a real meaning: &lt;code&gt;f[R]&lt;/code&gt; is the function mapping each coordinate to that coordinate's red value. What about &lt;code&gt;Row -&amp;gt; {R, G, B} -&amp;gt; Col -&amp;gt; Int&lt;/code&gt;?  That's for each row, the 3 × Col array mapping each color to that row's intensities.&lt;/p&gt;
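    &lt;p&gt;As a rough Python sketch of that rearrangement, assuming a hypothetical 2 × 2 image where each pixel is a dict from channel name to intensity (&lt;code&gt;channel_first&lt;/code&gt; is an illustrative helper):&lt;/p&gt;

```python
# Hypothetical 2x2 image; each pixel maps a channel name to an intensity.
IMAGE = [
    [{"R": 0, "G": 0, "B": 255}, {"R": 10, "G": 20, "B": 30}],
    [{"R": 255, "G": 255, "B": 255}, {"R": 1, "G": 2, "B": 3}],
]


def channel_first(image):
    """Reorder Row, Col, Channel into Channel, Row, Col."""
    channels = image[0][0].keys()
    return {ch: [[pixel[ch] for pixel in row] for row in image] for ch in channels}


planes = channel_first(IMAGE)
red = planes["R"]  # the Row-by-Col grid of red values
```

    &lt;p&gt;This only works because every channel holds the same type of value, so each plane is an ordinary grid of integers.&lt;/p&gt;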
    &lt;p&gt;Really &lt;em&gt;any finite set&lt;/em&gt; can be a "dimension". Recording the monitor over a span of time? &lt;code&gt;Frame -&amp;gt; Row -&amp;gt; Col -&amp;gt; Color -&amp;gt; Int&lt;/code&gt;. Recording a bunch of computers over some time? &lt;code&gt;Computer -&amp;gt; Frame -&amp;gt; Row …&lt;/code&gt;.&lt;/p&gt;
    &lt;p&gt;This is pretty common in constraint satisfaction! Like if you're a conference trying to assign talks to talk slots, your array might be type &lt;code&gt;(Day, Time, Room) -&amp;gt; Talk&lt;/code&gt;, where Day/Time/Room are enumerations.&lt;/p&gt;
    &lt;p&gt;An implementation constraint is that most programming languages &lt;em&gt;only&lt;/em&gt; allow integer indexes, so we have to replace Rooms and Colors with numerical enumerations over the set. As long as the set is finite, this is always possible, and for struct-functions, we can always choose the indexing on the lexicographic ordering of the keys. But we lose type safety.&lt;/p&gt;
    &lt;h3&gt;Why tables are different&lt;/h3&gt;
    &lt;p&gt;One more example: &lt;code&gt;Day -&amp;gt; Hour -&amp;gt; Airport(name: str, flights: int, revenue: USD)&lt;/code&gt;. Can we turn the struct into a dimension like before? &lt;/p&gt;
    &lt;p&gt;In this case, no. We were able to make &lt;code&gt;Color&lt;/code&gt; an axis because we could turn &lt;code&gt;Pixel&lt;/code&gt; into a &lt;code&gt;Color -&amp;gt; Int&lt;/code&gt; function, and we could only do that because all of the fields of the struct had the same type. This time, the fields are &lt;em&gt;different&lt;/em&gt; types. So we can't convert &lt;code&gt;{name, flights, revenue}&lt;/code&gt; into an axis. &lt;sup id="fnref:name-dimension"&gt;&lt;a class="footnote-ref" href="#fn:name-dimension"&gt;4&lt;/a&gt;&lt;/sup&gt; One thing we can do is convert it to three &lt;em&gt;separate&lt;/em&gt; functions:&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;airport: Day -&amp;gt; Hour -&amp;gt; Str
    flights: Day -&amp;gt; Hour -&amp;gt; Int
    revenue: Day -&amp;gt; Hour -&amp;gt; USD
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;But we want to keep all of the data in one place. That's where &lt;strong&gt;tables&lt;/strong&gt; come in: an array-of-structs is isomorphic to a struct-of-arrays:&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;AirportColumns(
        airport: Day -&amp;gt; Hour -&amp;gt; Str,
        flights: Day -&amp;gt; Hour -&amp;gt; Int,
        revenue: Day -&amp;gt; Hour -&amp;gt; USD,
    )
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;The table is sort of &lt;em&gt;both&lt;/em&gt; representations simultaneously. If this was a pandas dataframe, &lt;code&gt;df["airport"]&lt;/code&gt; would get the airport column, while &lt;code&gt;df.loc[day1]&lt;/code&gt; would get the first day's data. I don't think many table implementations support more than one axis dimension but there's no reason they &lt;em&gt;couldn't&lt;/em&gt;. &lt;/p&gt;
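    &lt;p&gt;The array-of-structs / struct-of-arrays isomorphism can be sketched in plain Python. The records below are made-up sample data, revenue is stored as a float purely for illustration, and &lt;code&gt;to_columns&lt;/code&gt; and &lt;code&gt;to_rows&lt;/code&gt; are hypothetical helpers:&lt;/p&gt;

```python
# Hypothetical records indexed by (day, hour); the values are invented for illustration.
ROWS = {
    ("mon", 9): {"airport": "ORD", "flights": 12, "revenue": 1000.0},
    ("mon", 10): {"airport": "ORD", "flights": 15, "revenue": 1500.0},
    ("tue", 9): {"airport": "MDW", "flights": 7, "revenue": 400.0},
}


def to_columns(rows):
    """Array-of-structs to struct-of-arrays: one index-to-value mapping per field."""
    columns = {}
    for index, record in rows.items():
        for field, value in record.items():
            columns.setdefault(field, {})[index] = value
    return columns


def to_rows(columns):
    """Inverse direction: rebuild one record per index from the per-field mappings."""
    rows = {}
    for field, column in columns.items():
        for index, value in column.items():
            rows.setdefault(index, {})[field] = value
    return rows


COLUMNS = to_columns(ROWS)
```

    &lt;p&gt;Round-tripping with &lt;code&gt;to_rows(to_columns(ROWS))&lt;/code&gt; gives back the original records, which is what "isomorphic" means here.&lt;/p&gt;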
    &lt;p&gt;These are also possible transforms:&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;Hour -&amp;gt; NamesAreHard(
        airport: Day -&amp;gt; Str,
        flights: Day -&amp;gt; Int,
        revenue: Day -&amp;gt; USD,
    )
    
    Day -&amp;gt; Whatever(
        airport: Hour -&amp;gt; Str,
        flights: Hour -&amp;gt; Int,
        revenue: Hour -&amp;gt; USD,
    )
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;In my mental model, the heterogeneous struct acts as a "block" in the array. We can't remove it, we can only push an index into the fields or pull a shared column out. But there's no way to convert a heterogeneous table into an array.&lt;/p&gt;
    &lt;h3&gt;Actually there is a terrible way&lt;/h3&gt;
    &lt;p&gt;Most languages have unions or &lt;del&gt;product&lt;/del&gt; sum types that let us say "this is a string OR integer". So we can make our airport data &lt;code&gt;Day -&amp;gt; Hour -&amp;gt; AirportKey -&amp;gt; Int | Str | USD&lt;/code&gt;. Heck, might as well just say it's &lt;code&gt;Day -&amp;gt; Hour -&amp;gt; AirportKey -&amp;gt; Any&lt;/code&gt;. But would anybody really be mad enough to use that in practice?&lt;/p&gt;
    &lt;p&gt;&lt;a href="https://code.jsoftware.com/wiki/Vocabulary/lt" target="_blank"&gt;Oh wait J does exactly that&lt;/a&gt;. J has an opaque datatype called a "box". A "table" is a function &lt;code&gt;Dim1 -&amp;gt; Dim2 -&amp;gt; Box&lt;/code&gt;. You can see some examples of what that looks like &lt;a href="https://code.jsoftware.com/wiki/DB/Flwor" target="_blank"&gt;here&lt;/a&gt;.&lt;/p&gt;
    &lt;h3&gt;Misc Thoughts and Questions&lt;/h3&gt;
    &lt;p&gt;The heterogeneity barrier seems like it explains why we don't see multiple axes of table columns, while we do see multiple axes of array dimensions. But is that actually why? Is there a system out there that &lt;em&gt;does&lt;/em&gt; have multiple columnar axes?&lt;/p&gt;
    &lt;p&gt;The array &lt;code&gt;x = [[a, b, a], [b, b, b]]&lt;/code&gt; has type &lt;code&gt;1..2 -&amp;gt; 1..3 -&amp;gt; {a, b}&lt;/code&gt;. Can we rearrange it to &lt;code&gt;1..2 -&amp;gt; {a, b} -&amp;gt; 1..3&lt;/code&gt;? No. But we &lt;em&gt;can&lt;/em&gt; rearrange it to &lt;code&gt;1..2 -&amp;gt; {a, b} -&amp;gt; PowerSet(1..3)&lt;/code&gt;, which maps rows and characters to columns &lt;em&gt;with&lt;/em&gt; that character. &lt;code&gt;[(a -&amp;gt; {1, 3} ++ b -&amp;gt; {2}), (a -&amp;gt; {} ++ b -&amp;gt; {1, 2, 3})]&lt;/code&gt;. &lt;/p&gt;
    &lt;p&gt;We can also transform &lt;code&gt;Row -&amp;gt; PowerSet(Col)&lt;/code&gt; into &lt;code&gt;Row -&amp;gt; Col -&amp;gt; Bool&lt;/code&gt;, aka a boolean matrix. This makes sense to me as both forms are means of representing directed graphs.&lt;/p&gt;
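    &lt;p&gt;Both transforms are short enough to sketch in Python, using the same &lt;code&gt;x&lt;/code&gt; as above with strings standing in for &lt;code&gt;a&lt;/code&gt; and &lt;code&gt;b&lt;/code&gt;. The helper names &lt;code&gt;positions_by_value&lt;/code&gt; and &lt;code&gt;to_bool_matrix&lt;/code&gt; are assumptions for illustration:&lt;/p&gt;

```python
X = [["a", "b", "a"], ["b", "b", "b"]]


def positions_by_value(grid, alphabet):
    """For each row, map every symbol to the set of 1-based columns holding it."""
    return [
        {sym: {c for c, v in enumerate(row, start=1) if v == sym} for sym in alphabet}
        for row in grid
    ]


def to_bool_matrix(row_sets, n_cols):
    """Turn Row to set-of-Col into Row to Col to Bool."""
    return [[c in cols for c in range(1, n_cols + 1)] for cols in row_sets]


inverted = positions_by_value(X, ["a", "b"])
# Boolean matrix marking where "b" appears in each row.
b_matrix = to_bool_matrix([row["b"] for row in inverted], 3)
```

    &lt;p&gt;The empty set for &lt;code&gt;a&lt;/code&gt; in the second row is why the target has to be a power set and not a single column index.&lt;/p&gt;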
    &lt;p&gt;Are other function combinators useful for thinking about arrays?&lt;/p&gt;
    &lt;p&gt;Does this model cover pivot tables? Can we extend it to relational data with multiple tables?&lt;/p&gt;
    &lt;hr/&gt;
    &lt;h3&gt;Systems Distributed Talk (will be) Online&lt;/h3&gt;
    &lt;p&gt;The premiere will be August 6 at 12 CST, &lt;a href="https://www.youtube.com/watch?v=d9cM8f_qSLQ" target="_blank"&gt;here&lt;/a&gt;! I'll be there to answer questions / mock my own performance / generally make a fool of myself.&lt;/p&gt;
    &lt;div class="footnote"&gt;
    &lt;hr/&gt;
    &lt;ol&gt;
    &lt;li id="fn:1-indexing"&gt;
    &lt;p&gt;&lt;a href="https://buttondown.com/hillelwayne/archive/why-do-arrays-start-at-0/" target="_blank"&gt;Sacrilege&lt;/a&gt;! But it turns out in this context, it's easier to use 1-indexing than 0-indexing. In the years since I wrote that article I've settled on "each indexing choice matches different kinds of mathematical work", so mathematicians and computer scientists are best served by being able to choose their index. But software engineers need consistency, and 0-indexing is overall a net better consistency pick. &lt;a class="footnote-backref" href="#fnref:1-indexing" title="Jump back to footnote 1 in the text"&gt;↩&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id="fn:associative"&gt;
    &lt;p&gt;This is &lt;em&gt;right-associative&lt;/em&gt;: &lt;code&gt;a -&amp;gt; b -&amp;gt; c&lt;/code&gt; means &lt;code&gt;a -&amp;gt; (b -&amp;gt; c)&lt;/code&gt;, not &lt;code&gt;(a -&amp;gt; b) -&amp;gt; c&lt;/code&gt;. &lt;code&gt;(1..3 -&amp;gt; 1..4) -&amp;gt; Int&lt;/code&gt; would be the associative array that maps length-3 arrays to integers. &lt;a class="footnote-backref" href="#fnref:associative" title="Jump back to footnote 2 in the text"&gt;↩&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id="fn:typeclass"&gt;
    &lt;p&gt;Technically it has type &lt;code&gt;Num a =&amp;gt; a -&amp;gt; a -&amp;gt; a&lt;/code&gt;, since &lt;code&gt;(+)&lt;/code&gt; works on floats too. &lt;a class="footnote-backref" href="#fnref:typeclass" title="Jump back to footnote 3 in the text"&gt;↩&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id="fn:name-dimension"&gt;
    &lt;p&gt;Notice that if each &lt;code&gt;Airport&lt;/code&gt; had a unique name, we &lt;em&gt;could&lt;/em&gt; pull it out into &lt;code&gt;AirportName -&amp;gt; Airport(flights, revenue)&lt;/code&gt;, but we still are stuck with two different values. &lt;a class="footnote-backref" href="#fnref:name-dimension" title="Jump back to footnote 4 in the text"&gt;↩&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;/ol&gt;
    &lt;/div&gt;</description>
                <pubDate>Wed, 30 Jul 2025 13:00:00 +0000</pubDate>
                <guid>https://buttondown.com/hillelwayne/archive/2000-words-about-arrays-and-tables/</guid>
            </item>
            <item>
                <title>Programming Language Escape Hatches</title>
                <link>https://buttondown.com/hillelwayne/archive/programming-language-escape-hatches/</link>
            <description>&lt;p&gt;The excellent-but-defunct blog &lt;a href="https://prog21.dadgum.com/38.html" target="_blank"&gt;Programming in the 21st Century&lt;/a&gt; defines "puzzle languages" as languages where part of the appeal is in figuring out how to express a program idiomatically, like a puzzle. As examples, he lists Haskell, Erlang, and J. All puzzle languages, the author says, have an "escape" out of the puzzle model that is pragmatic but stigmatized.&lt;/p&gt;
    &lt;p&gt;But many mainstream languages have escape hatches, too.&lt;/p&gt;
    &lt;p&gt;Languages have a lot of properties. One of these properties is the language's &lt;a href="https://buttondown.com/hillelwayne/archive/the-capability-tractability-tradeoff/" target="_blank"&gt;capabilities&lt;/a&gt;, roughly the set of things you can do in the language. Capability is desirable but comes into conflict with a lot of other desirable properties, like simplicity or efficiency. In particular, reducing the capability of a language means that all remaining programs share more in common, meaning there are more assumptions the compiler and programmer can make ("tractability"). Assumptions are generally used to reason about correctness, but can also be about things like optimization: J's assumption that everything is an array leads to &lt;a href="https://code.jsoftware.com/wiki/Vocabulary/SpecialCombinations" target="_blank"&gt;high-performance "special combinations"&lt;/a&gt;. &lt;/p&gt;
    &lt;p&gt;Rust is the most famous example of a &lt;strong&gt;mainstream&lt;/strong&gt; language that trades capability for tractability.&lt;sup id="fnref:gc"&gt;&lt;a class="footnote-ref" href="#fn:gc"&gt;1&lt;/a&gt;&lt;/sup&gt; Rust has a lot of rules designed to prevent common memory errors, like keeping a reference to deallocated memory or modifying memory while something else is reading it. As a consequence, there's a lot of things that cannot be done in (safe) Rust, like interfacing with an external C function (as it doesn't have these guarantees).&lt;/p&gt;
    &lt;p&gt;To do this, you need to use &lt;a href="https://doc.rust-lang.org/book/ch20-01-unsafe-rust.html" target="_blank"&gt;unsafe Rust&lt;/a&gt;, which lets you do additional things forbidden by safe Rust, such as dereferencing a raw pointer. Everybody tells you not to use &lt;code&gt;unsafe&lt;/code&gt; unless you absolutely 100% know what you're doing, and possibly not even then.&lt;/p&gt;
    &lt;p&gt;Sounds like an escape hatch to me!&lt;/p&gt;
    &lt;p&gt;To extrapolate, an &lt;strong&gt;escape hatch&lt;/strong&gt; is a feature (either in the language itself or a particular implementation) that deliberately breaks core assumptions about the language in order to add capabilities. This explains both Rust and most of the so-called "puzzle languages": they need escape hatches because they have very strong conceptual models of the language which leads to lots of assumptions about programs. But plenty of "kitchen sink" mainstream languages have escape hatches, too:&lt;/p&gt;
    &lt;ul&gt;
    &lt;li&gt;Some compilers let C++ code embed &lt;a href="https://en.cppreference.com/w/cpp/language/asm.html" target="_blank"&gt;inline assembly&lt;/a&gt;.&lt;/li&gt;
    &lt;li&gt;Languages built on .NET or the JVM have some sort of interop with C# or Java, and many of those languages make assumptions about programs that C#/Java do not.&lt;/li&gt;
    &lt;li&gt;The SQL language has stored procedures as an escape hatch &lt;em&gt;and&lt;/em&gt; vendors create a second escape hatch of user-defined functions.&lt;/li&gt;
    &lt;li&gt;Ruby lets you bypass any form of encapsulation with &lt;a href="https://ruby-doc.org/3.4.1/Object.html#method-i-send" target="_blank"&gt;&lt;code&gt;send&lt;/code&gt;&lt;/a&gt;.&lt;/li&gt;
    &lt;li&gt;Frameworks have escape hatches, too! React has &lt;a href="https://react.dev/learn/escape-hatches" target="_blank"&gt;an entire page on them&lt;/a&gt;.&lt;/li&gt;
    &lt;/ul&gt;
    &lt;p&gt;(Does &lt;code&gt;eval&lt;/code&gt; in interpreted languages count as an escape hatch? It feels different, but it does add a lot of capability. Maybe they don't "break assumptions" in the same way?)&lt;/p&gt;
    &lt;h3&gt;The problem with escape hatches&lt;/h3&gt;
    &lt;p&gt;In all languages with escape hatches, the rule is "use this as carefully and sparingly as possible", to the point where a messy solution &lt;em&gt;without&lt;/em&gt; an escape hatch is preferable to a clean solution &lt;em&gt;with&lt;/em&gt; one. Breaking a core assumption is a big deal! If the language is operating as if it's still true, it's going to do incorrect things. &lt;/p&gt;
    &lt;p&gt;I recently had this problem in a TLA+ contract. TLA+ is a language for modeling complicated systems, and assumes that the model is a self-contained universe. The client wanted to use the TLA+ to test a real system. The model checker should send commands to a test device and check the next states were the same. This is straightforward to set up with the &lt;a href="https://github.com/tlaplus/CommunityModules/blob/master/modules/IOUtils.tla" target="_blank"&gt;IOExec escape hatch&lt;/a&gt;.&lt;sup id="fnref:ioexec"&gt;&lt;a class="footnote-ref" href="#fn:ioexec"&gt;2&lt;/a&gt;&lt;/sup&gt; But the model checker assumed that state exploration was pure and it could skip around the state randomly, meaning it would do things like &lt;code&gt;set x = 10&lt;/code&gt;, then skip to &lt;code&gt;set x = 1&lt;/code&gt;, then skip back to &lt;code&gt;inc x; assert x == 11&lt;/code&gt;. Oops!&lt;/p&gt;
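    &lt;p&gt;The failure mode can be illustrated with a toy Python sketch. This is not TLA+ or the actual setup: &lt;code&gt;Device&lt;/code&gt; is a hypothetical stand-in for the stateful test device, and the three calls mimic a checker that treats states as independent snapshots while the device only has one mutable state:&lt;/p&gt;

```python
class Device:
    """Hypothetical stand-in for a real, stateful test device."""

    def __init__(self):
        self.x = 0

    def run(self, command, arg=None):
        """Apply a command to the single shared state and return the new value of x."""
        if command == "set":
            self.x = arg
        elif command == "inc":
            self.x += 1
        return self.x


device = Device()

# The checker visits the state where x = 10, then jumps to a sibling state where x = 1.
model_state_a = device.run("set", 10)
model_state_b = device.run("set", 1)

# Back in state A, the pure model expects inc to produce 11,
# but the device still remembers the last command it actually received.
expected = model_state_a + 1
actual = device.run("inc")
mismatch = expected != actual
```

    &lt;p&gt;The model's expectation is 11 while the device reports 2, because the device never "jumped back" along with the checker.&lt;/p&gt;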
    &lt;p&gt;We eventually found workarounds but it took a lot of clever tricks to pull off. I'll probably write up the technique when I'm less busy with The Book.&lt;/p&gt;
    &lt;p&gt;The other problem with escape hatches is the rest of the language is designed around &lt;em&gt;not&lt;/em&gt; having said capabilities, meaning it can't support the feature as well as a language designed for them from the start. Even if your escape hatch code is clean, it might not cleanly &lt;em&gt;integrate&lt;/em&gt; with the rest of your code. This is why people &lt;a href="https://chadaustin.me/2024/10/intrusive-linked-list-in-rust/" target="_blank"&gt;complain about unsafe Rust&lt;/a&gt; so often.&lt;/p&gt;
    &lt;div class="footnote"&gt;
    &lt;hr/&gt;
    &lt;ol&gt;
    &lt;li id="fn:gc"&gt;
    &lt;p&gt;It should be noted though that &lt;em&gt;all&lt;/em&gt; languages with automatic memory management are trading capability for tractability, too. If you can't dereference pointers, you can't dereference &lt;em&gt;null&lt;/em&gt; pointers. &lt;a class="footnote-backref" href="#fnref:gc" title="Jump back to footnote 1 in the text"&gt;↩&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id="fn:ioexec"&gt;
    &lt;p&gt;From the Community Modules (which come default with the VSCode extension). &lt;a class="footnote-backref" href="#fnref:ioexec" title="Jump back to footnote 2 in the text"&gt;↩&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;/ol&gt;
    &lt;/div&gt;</description>
                <pubDate>Thu, 24 Jul 2025 14:00:00 +0000</pubDate>
                <guid>https://buttondown.com/hillelwayne/archive/programming-language-escape-hatches/</guid>
            </item>
            <item>
                <title>Maybe writing speed actually is a bottleneck for programming</title>
                <link>https://buttondown.com/hillelwayne/archive/maybe-writing-speed-actually-is-a-bottleneck-for/</link>
                <description>&lt;p&gt;I'm a big (neo)vim buff. My config is over 1500 lines and I regularly write new scripts. I recently ported my neovim config to a new laptop. Before then, I was using VSCode to write, and when I switched back I immediately saw a big gain in productivity.&lt;/p&gt;
    &lt;p&gt;People often pooh-pooh vim (and other assistive writing technologies) by saying that writing code isn't the bottleneck in software development. Reading, understanding, and thinking through code is!&lt;/p&gt;
    &lt;p&gt;Now I don't know how true this actually is in practice, because empirical studies of time spent coding are all over the place. Most of them, like &lt;a href="https://www.microsoft.com/en-us/research/wp-content/uploads/2016/02/meyer-fse-2014.pdf" target="_blank"&gt;this study&lt;/a&gt;, track time spent in the editor but don't distinguish between time spent reading code and time spent writing code. The only one I found that separates them was &lt;a href="https://scispace.com/pdf/i-know-what-you-did-last-summer-an-investigation-of-how-3zxclzzocc.pdf" target="_blank"&gt;this study&lt;/a&gt;. It finds that developers spend only 5% of their time editing. It also finds they spend 14% of their time moving or resizing editor windows, so I don't know how clean their data is.&lt;/p&gt;
    &lt;p&gt;But I have a bigger problem with "writing is not the bottleneck": when I think of a bottleneck, I imagine that &lt;em&gt;no&lt;/em&gt; amount of improvement will lead to productivity gains. Like if a program is bottlenecked on the network, it isn't going to get noticeably faster with 100x more ram or compute. &lt;/p&gt;
    &lt;p&gt;But being able to type code 100x faster, even without corresponding improvements to reading and imagining code, would be &lt;strong&gt;huge&lt;/strong&gt;. &lt;/p&gt;
    &lt;p&gt;We'll assume the average developer writes at 80 words per minute, at five characters a word, for 400 characters a minute. What could we do if we instead wrote at 8,000 words/40k characters a minute? &lt;/p&gt;
    &lt;h3&gt;Writing fast&lt;/h3&gt;
    &lt;h4&gt;Boilerplate is trivial&lt;/h4&gt;
    &lt;p&gt;Why do people like type inference? Because writing all of the types manually is annoying. Why don't people like boilerplate? Because it's annoying to write every damn time. Programmers like features that help them write less! That's not a problem if you can write all of the boilerplate in 0.1 seconds.&lt;/p&gt;
    &lt;p&gt;You still have the problem of &lt;em&gt;reading&lt;/em&gt; boilerplate heavy code, but you can use the remaining 0.9 seconds to churn out an extension that parses the file and presents the boilerplate in a more legible fashion. &lt;/p&gt;
    &lt;h4&gt;We can write more tooling&lt;/h4&gt;
    &lt;p&gt;This is something I've noticed with LLMs: when I can churn out crappy code as a free action, I use that to write lots of tools that assist me in writing &lt;em&gt;good&lt;/em&gt; code. Even if I'm bottlenecked on a large program, I can still quickly write a script that helps me with something. Most of these aren't things I would have written because they'd take too long to write! &lt;/p&gt;
    &lt;p&gt;Again, not the best comparison, because LLMs also shortcut learning the relevant APIs, so also optimize the "understanding code" part. Then again, if I could type real fast I could more quickly whip up experiments on new APIs to learn them faster. &lt;/p&gt;
    &lt;h4&gt;We can do practices that slow us down in the short-term&lt;/h4&gt;
    &lt;p&gt;Something like test-driven development significantly slows down how fast you write production code, because you have to spend a lot more time writing test code. Pair programming trades speed of writing code for speed of understanding code. A two-order-of-magnitude writing speedup makes both of them effectively free. Or, if you're not an eXtreme Programming fan, you can more easily follow &lt;a href="https://en.wikipedia.org/wiki/The_Power_of_10:_Rules_for_Developing_Safety-Critical_Code" target="_blank"&gt;The Power of Ten Rules&lt;/a&gt; and blanket your code with contracts and assertions.&lt;/p&gt;
    &lt;h4&gt;We could do more speculative editing&lt;/h4&gt;
    &lt;p&gt;This is probably the biggest difference in how we'd work if we could write 100x faster: it'd be much easier to try changes to the code to see if they're good ideas in the first place. &lt;/p&gt;
    &lt;p&gt;How often have I tried optimizing something, only to find out it didn't make a difference? How often have I done a refactoring only to end up with lower-quality code overall? Too often. Over time it makes me prefer to try things that I know will work, and only "speculatively edit" when I think it'll be a fast change. If I could code 100x faster it would absolutely lead to me trying more speculative edits.&lt;/p&gt;
    &lt;p&gt;This is especially big because I believe that lots of speculative edits are high-risk, high-reward: given 50 things we could do to the code, 49 won't make a difference and one will be a major improvement. If I only have time to try five things, I have a 10% chance of hitting the jackpot. If I can try 500 things I will get that reward every single time. &lt;/p&gt;
    &lt;h2&gt;Processes are built off constraints&lt;/h2&gt;
    &lt;p&gt;These are just a few ideas I came up with; there are probably others. Most of them, I suspect, will share the same property: they change &lt;em&gt;the process&lt;/em&gt; of writing code to leverage the speedup. I can totally believe that a large speedup would not remove a bottleneck in the processes we &lt;em&gt;currently&lt;/em&gt; use to write code. But that's because those processes are developed to work within our existing constraints. Remove a constraint and new processes become possible.&lt;/p&gt;
    &lt;p&gt;The way I see it, if our current process produces 1 Utils of Software / day, a 100x writing speedup might lead to only 1.5 UoS/day. But there are other processes that produce only 0.5 UoS/d &lt;em&gt;because they are bottlenecked on writing speed&lt;/em&gt;. A 100x speedup would lead to 10 UoS/day.&lt;/p&gt;
    &lt;p&gt;The problem with all of this is that a 100x speedup isn't realistic, and it's not obvious whether a 2x improvement would lead to better processes. Then again, one of the first custom vim function scripts I wrote was an aid to writing unit tests in a particular codebase, and it led to me writing a lot more tests. So maybe even a 2x speedup is going to speed things up, too.&lt;/p&gt;
    &lt;hr/&gt;
    &lt;h3&gt;Patreon Stuff&lt;/h3&gt;
    &lt;p&gt;I wrote a couple of TLA+ specs to show how to model &lt;a href="https://en.wikipedia.org/wiki/Fork%E2%80%93join_model" target="_blank"&gt;fork-join&lt;/a&gt; algorithms. I'm planning on eventually writing them up for my blog/learntla but it'll be a while, so if you want to see them in the meantime I put them up on &lt;a href="https://www.patreon.com/posts/fork-join-in-tla-134209395?utm_medium=clipboard_copy&amp;amp;utm_source=copyLink&amp;amp;utm_campaign=postshare_creator&amp;amp;utm_content=join_link" target="_blank"&gt;Patreon&lt;/a&gt;.&lt;/p&gt;</description>
                <pubDate>Thu, 17 Jul 2025 19:08:27 +0000</pubDate>
                <guid>https://buttondown.com/hillelwayne/archive/maybe-writing-speed-actually-is-a-bottleneck-for/</guid>
            </item>
            <item>
                <title>Logic for Programmers Turns One</title>
                <link>https://buttondown.com/hillelwayne/archive/logic-for-programmers-turns-one/</link>
                <description>&lt;p&gt;I released &lt;a href="https://leanpub.com/logic/" target="_blank"&gt;Logic for Programmers&lt;/a&gt; exactly one year ago today. It feels weird to celebrate the anniversary of something that isn't 1.0 yet, but software projects have a proud tradition of celebrating a dozen anniversaries before 1.0. I wanted to share about what's changed in the past year and the work for the next six+ months.&lt;/p&gt;
    &lt;p&gt;&lt;img alt="The book cover!" class="newsletter-image" src="https://assets.buttondown.email/images/70ac47c9-c49f-47c0-9a05-7a9e70551d03.jpg?w=960&amp;amp;fit=max"/&gt;&lt;/p&gt;
    &lt;h3&gt;The Road to 0.1&lt;/h3&gt;
    &lt;p&gt;I had been noodling on the idea of a logic book since the pandemic. The first time I wrote about it on the newsletter was in &lt;a href="https://buttondown.com/hillelwayne/archive/predicate-logic-for-programmers/" target="_blank"&gt;2021&lt;/a&gt;! Then I said that it would be done by June and would be "under 50 pages". The idea was to cover logic as a "soft skill" that helped you think about things like requirements and stuff.&lt;/p&gt;
    &lt;p&gt;That version &lt;em&gt;sucked&lt;/em&gt;. If you want to see how much it sucked, I put it up on &lt;a href="https://www.patreon.com/posts/what-logic-for-133675688" target="_blank"&gt;Patreon&lt;/a&gt;. Then I slept on the next draft for three years. Then in 2024 a lot of business fell through and I had a lot of free time, so with the help of &lt;a href="https://saul.pw/" target="_blank"&gt;Saul Pwanson&lt;/a&gt; I rewrote the book. This time I emphasized breadth over depth, trying to cover a lot more techniques.  &lt;/p&gt;
    &lt;p&gt;I also decided to self-publish it instead of pitching it to a publisher. Not going the traditional route would mean I would be responsible for paying for editing, advertising, graphic design etc, but I hoped that would be compensated by &lt;em&gt;much&lt;/em&gt; higher royalties. It also meant I could release the book in early access and use early sales to fund further improvements. So I wrote up a draft in &lt;a href="https://www.sphinx-doc.org/en/master/" target="_blank"&gt;Sphinx&lt;/a&gt;, compiled it to LaTeX, and uploaded the PDF to &lt;a href="https://leanpub.com/" target="_blank"&gt;leanpub&lt;/a&gt;. That was in June 2024.&lt;/p&gt;
    &lt;p&gt;Since then I kept to a monthly cadence of updates, missing once in November (short-notice contract) and once last month (&lt;a href="https://systemsdistributed.com/" target="_blank"&gt;Systems Distributed&lt;/a&gt;). The book's now on v0.10. What's changed?&lt;/p&gt;
    &lt;h3&gt;A LOT&lt;/h3&gt;
    &lt;p&gt;v0.1 was &lt;em&gt;very obviously&lt;/em&gt; an alpha, and I have made a lot of improvements since then. For one, the book no longer looks like a &lt;a href="https://www.sphinx-doc.org/_/downloads/en/master/pdf/#page=13" target="_blank"&gt;Sphinx manual&lt;/a&gt;. Compare!&lt;/p&gt;
    &lt;p&gt;&lt;img alt="0.1 on left, 0.10 on right. Way better!" class="newsletter-image" src="https://assets.buttondown.email/images/e4d880ad-80b8-4360-9cae-27c07598c740.png?w=960&amp;amp;fit=max"/&gt;&lt;/p&gt;
    &lt;p&gt;Also, the content is very, very different. v0.1 was 19,000 words, v0.10 is 31,000.&lt;sup id="fnref:pagesize"&gt;&lt;a class="footnote-ref" href="#fn:pagesize"&gt;1&lt;/a&gt;&lt;/sup&gt; This comes from new chapters on TLA+, constraint/SMT solving, logic programming, and major expansions to the existing chapters. Originally, "Simplifying Conditionals" was 600 words. Six hundred words! It almost fit in two pages!&lt;/p&gt;
    &lt;p&gt;&lt;img alt="How short Simplifying Conditionals USED to be" class="newsletter-image" src="https://assets.buttondown.email/images/31e731b7-3bdc-4ded-9b09-2a6261a323ec.png?w=960&amp;amp;fit=max"/&gt;&lt;/p&gt;
    &lt;p&gt;The chapter is now 2600 words, now covering condition lifting, quantifier manipulation, helper predicates, and set optimizations. All the other chapters have either gotten similar facelifts or are scheduled to get facelifts.&lt;/p&gt;
    &lt;p&gt;The last big change is the addition of &lt;a href="https://github.com/logicforprogrammers/book-assets" target="_blank"&gt;book assets&lt;/a&gt;. Originally you had to manually copy over all of the code to try it out, which is a problem when there are samples in eight distinct languages! Now there are ready-to-go examples for each chapter, with instructions on how to set up each programming environment. This is also nice because it gives me breaks from writing to code instead.&lt;/p&gt;
    &lt;h3&gt;How did the book do?&lt;/h3&gt;
    &lt;p&gt;Leanpub's all-time visualizations are terrible, so I'll just give the summary: 1180 copies sold, $18,241 in royalties. That's a lot of money for something that isn't fully out yet! By comparison, &lt;em&gt;Practical TLA+&lt;/em&gt; has made me less than half of that, despite selling over 5x as many books. Self-publishing was the right choice!&lt;/p&gt;
    &lt;p&gt;In that time I've paid about $400 for the book cover (worth it) and maybe $800 in Leanpub's advertising service (probably not worth it). &lt;/p&gt;
    &lt;p&gt;Right now that doesn't come close to making back the time investment, but I think it can get there post-release. I believe marketing can reach a lot more potential customers. I think post-release 10k copies sold is within reach.&lt;/p&gt;
    &lt;h3&gt;Where is the book going?&lt;/h3&gt;
    &lt;p&gt;The main content work is rewrites: many of the chapters have not meaningfully changed since 0.1, so I am going through and rewriting them from scratch. So far four of the ten chapters have been rewritten. My (admittedly ambitious) goal is to rewrite three of them by the end of this month and another three by the end of next. I also want to do final passes on the rewritten chapters, as most of them have a few TODOs left lying around.&lt;/p&gt;
    &lt;p&gt;(Also somehow in starting this newsletter and publishing it I realized that one of the chapters might be better split into two chapters, so there could well be a tenth technique in v0.11 or v0.12!)&lt;/p&gt;
    &lt;p&gt;After that, I will pass it to a copy editor while I work on improving the layout, making images, and indexing. I want to have something worthy of printing on a dead tree by 1.0. &lt;/p&gt;
    &lt;p&gt;In terms of timelines, I am &lt;strong&gt;very roughly&lt;/strong&gt; estimating something like this:&lt;/p&gt;
    &lt;ul&gt;
    &lt;li&gt;Summer: final big changes and rewrites&lt;/li&gt;
    &lt;li&gt;Early Autumn: graphic design and copy editing&lt;/li&gt;
    &lt;li&gt;Late Autumn: proofing, figuring out printing stuff&lt;/li&gt;
    &lt;li&gt;Winter: final ebook and initial print releases of 1.0.&lt;/li&gt;
    &lt;/ul&gt;
    &lt;p&gt;(If you know a service that helps get self-published books "past the finish line", I'd love to hear about it! Preferably something that works for a fee, not part of royalties.)&lt;/p&gt;
    &lt;p&gt;This timeline may be disrupted by official client work, like a new TLA+ contract or a conference invitation.&lt;/p&gt;
    &lt;p&gt;Needless to say, I am incredibly excited to complete this book and share the final version with you all. This is a book I wished for years ago, a book I wrote because nobody else would. It fills a critical gap in software educational material, and someday soon I'll be able to put a copy on my bookshelf. It's exhilarating and terrifying and above all, satisfying.&lt;/p&gt;
    &lt;div class="footnote"&gt;
    &lt;hr/&gt;
    &lt;ol&gt;
    &lt;li id="fn:pagesize"&gt;
    &lt;p&gt;It's also 150 pages vs 50 pages, but admittedly this is partially because I made the book smaller with a larger font. &lt;a class="footnote-backref" href="#fnref:pagesize" title="Jump back to footnote 1 in the text"&gt;↩&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;/ol&gt;
    &lt;/div&gt;</description>
                <pubDate>Tue, 08 Jul 2025 18:18:52 +0000</pubDate>
                <guid>https://buttondown.com/hillelwayne/archive/logic-for-programmers-turns-one/</guid>
            </item>
            <item>
                <title>Logical Quantifiers in Software</title>
                <link>https://buttondown.com/hillelwayne/archive/logical-quantifiers-in-software/</link>
                <description>&lt;p&gt;I realize that for all I've talked about &lt;a href="https://leanpub.com/logic/" target="_blank"&gt;Logic for Programmers&lt;/a&gt; in this newsletter, I never once explained basic logical quantifiers. They're both simple and incredibly useful, so let's do that this week! &lt;/p&gt;
    &lt;h3&gt;Sets and quantifiers&lt;/h3&gt;
    &lt;p&gt;A &lt;strong&gt;set&lt;/strong&gt; is a collection of unordered, unique elements. &lt;code&gt;{1, 2, 3, …}&lt;/code&gt; is a set, as are "every programming language", "every programming language's Wikipedia page", and "every function ever defined in any programming language's standard library". You can put whatever you want in a set, with some very specific limitations to avoid certain paradoxes.&lt;sup id="fnref:paradox"&gt;&lt;a class="footnote-ref" href="#fn:paradox"&gt;2&lt;/a&gt;&lt;/sup&gt; &lt;/p&gt;
    &lt;p&gt;Once we have a set, we can ask "is something true for all elements of the set?" and "is something true for at least one element of the set?" For example, is it true that every programming language has a &lt;code&gt;set&lt;/code&gt; collection type in the core language? We would write both questions like this:&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;# all of them
    all l in ProgrammingLanguages: HasSetType(l)
    
    # at least one
    some l in ProgrammingLanguages: HasSetType(l)
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;This is the notation I use in the book because it's easy to read, type, and search for. Mathematicians historically had a few different formats; the one I grew up with was &lt;code&gt;∀x ∈ set: P(x)&lt;/code&gt; to mean &lt;code&gt;all x in set&lt;/code&gt;, and &lt;code&gt;∃&lt;/code&gt; to mean &lt;code&gt;some&lt;/code&gt;. I use these when writing for just myself, but find they confuse programmers when I use them to communicate.&lt;/p&gt;
    &lt;p&gt;"All" and "some" are respectively referred to as "universal" and "existential" quantifiers.&lt;/p&gt;
    &lt;h3&gt;Some cool properties&lt;/h3&gt;
    &lt;p&gt;We can simplify expressions with quantifiers, in the same way that we can simplify &lt;code&gt;!(x &amp;amp;&amp;amp; y)&lt;/code&gt; to &lt;code&gt;!x || !y&lt;/code&gt;.&lt;/p&gt;
    &lt;p&gt;First of all, quantifiers are commutative with themselves. &lt;code&gt;some x: some y: P(x,y)&lt;/code&gt; is the same as &lt;code&gt;some y: some x: P(x, y)&lt;/code&gt;. For this reason we can write &lt;code&gt;some x, y: P(x,y)&lt;/code&gt; as shorthand. We can even do this when quantifying over different sets, writing &lt;code&gt;some x, x' in X, y in Y&lt;/code&gt; instead of &lt;code&gt;some x, x' in X: some y in Y&lt;/code&gt;. We can &lt;em&gt;not&lt;/em&gt; do this with "alternating quantifiers":&lt;/p&gt;
    &lt;ul&gt;
    &lt;li&gt;&lt;code&gt;all p in Person: some m in Person: Mother(m, p)&lt;/code&gt; says that every person has a mother.&lt;/li&gt;
    &lt;li&gt;&lt;code&gt;some m in Person: all p in Person: Mother(m, p)&lt;/code&gt; says that someone is every person's mother.&lt;/li&gt;
    &lt;/ul&gt;
    &lt;p&gt;Second, existentials distribute over &lt;code&gt;||&lt;/code&gt; while universals distribute over &lt;code&gt;&amp;amp;&amp;amp;&lt;/code&gt;. "There is some url which returns a 403 or 404" is the same as "there is some url which returns a 403 or some url that returns a 404", and "all PRs pass the linter and the test suites" is the same as "all PRs pass the linter and all PRs pass the test suites".&lt;/p&gt;
    &lt;p&gt;Finally, &lt;code&gt;some&lt;/code&gt; and &lt;code&gt;all&lt;/code&gt; are &lt;em&gt;duals&lt;/em&gt;: &lt;code&gt;some x: P(x) == !(all x: !P(x))&lt;/code&gt;, and vice-versa. Intuitively: if some file is malicious, it's not true that all files are benign.&lt;/p&gt;
    &lt;p&gt;All these rules together mean we can manipulate quantifiers &lt;em&gt;almost&lt;/em&gt; as easily as we can manipulate regular booleans, putting them in whatever form is easiest to use in programming. &lt;/p&gt;
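    &lt;p&gt;As a minimal sketch, these rules can be spot-checked in Python on a small finite set, using the built-ins &lt;code&gt;any&lt;/code&gt; and &lt;code&gt;all&lt;/code&gt; as stand-ins for &lt;code&gt;some&lt;/code&gt; and &lt;code&gt;all&lt;/code&gt;. The helper names &lt;code&gt;check_rules&lt;/code&gt; and &lt;code&gt;alternating&lt;/code&gt;, the set &lt;code&gt;X&lt;/code&gt;, and the predicates are all made up for illustration, and checking one example is of course not a proof.&lt;/p&gt;

```python
def check_rules(X, P, Q):
    """Spot-check three quantifier rules on a finite set X (illustrative only)."""
    return {
        # some x: P(x)  ==  !(all x: !P(x))
        "duality": any(P(x) for x in X) == (not all(not P(x) for x in X)),
        # existentials distribute over "or"
        "some_over_or": any(P(x) or Q(x) for x in X)
        == (any(P(x) for x in X) or any(Q(x) for x in X)),
        # universals distribute over "and"
        "all_over_and": all(P(x) and Q(x) for x in X)
        == (all(P(x) for x in X) and all(Q(x) for x in X)),
    }


def alternating(X, R):
    """Evaluate 'all x: some y: R(x, y)' and 'some y: all x: R(x, y)'."""
    all_some = all(any(R(x, y) for y in X) for x in X)
    some_all = any(all(R(x, y) for x in X) for y in X)
    return all_some, some_all


X = {1, 2, 3, 4, 5, 6}
rules = check_rules(X, lambda x: x % 2 == 0, lambda x: x > 4)
# R(x, y) is "y differs from x": everyone has a different element,
# but no single element differs from everything (including itself).
swapped = alternating(X, lambda x, y: x != y)
print(rules)
print(swapped)
```

    &lt;p&gt;The second helper is the "mother" example in miniature: swapping the order of &lt;code&gt;all&lt;/code&gt; and &lt;code&gt;some&lt;/code&gt; changes the answer.&lt;/p&gt;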
    &lt;p&gt;Speaking of which, how &lt;em&gt;do&lt;/em&gt; we use this in programming?&lt;/p&gt;
    &lt;h2&gt;How we use this in programming&lt;/h2&gt;
    &lt;p&gt;First of all, people clearly have a need for directly using quantifiers in code. If we have something of the form:&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;for x in list:
        if P(x):
            return true
    return false
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;That's just &lt;code&gt;some x in list: P(x)&lt;/code&gt;. And this is a prevalent pattern, as you can see by using &lt;a href="https://github.com/search?q=%2Ffor+.*%3A%5Cn%5Cs*if+.*%3A%5Cn%5Cs*return+%28False%7CTrue%29%5Cn%5Cs*return+%28True%7CFalse%29%2F+language%3Apython+NOT+is%3Afork&amp;amp;type=code" target="_blank"&gt;GitHub code search&lt;/a&gt;. It finds over 500k examples of this pattern in Python alone! That can be simplified by using the language's built-in quantifiers: the Python would be &lt;code&gt;any(P(x) for x in list)&lt;/code&gt;.&lt;/p&gt;
    &lt;p&gt;(Note this is not quantifying over sets but iterables. But the idea translates cleanly enough.)&lt;/p&gt;
    &lt;p&gt;More generally, quantifiers are a key way we express higher-level properties of software. What does it mean for a list to be sorted in ascending order? That &lt;code&gt;all i, j in 0..&amp;lt;len(l): if i &amp;lt; j then l[i] &amp;lt;= l[j]&lt;/code&gt;. When should a &lt;a href="https://qntm.org/ratchet" target="_blank"&gt;ratchet test fail&lt;/a&gt;? When &lt;code&gt;some f in functions - exceptions: Uses(f, bad_function)&lt;/code&gt;. Should the image classifier work upside down? &lt;code&gt;all i in images: classify(i) == classify(rotate(i, 180))&lt;/code&gt;. These are the properties we verify with tests and types and &lt;a href="https://www.hillelwayne.com/post/constructive/" target="_blank"&gt;MISU&lt;/a&gt; and whatnot;&lt;sup id="fnref:misu"&gt;&lt;a class="footnote-ref" href="#fn:misu"&gt;1&lt;/a&gt;&lt;/sup&gt; it helps to be able to make them explicit!&lt;/p&gt;
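    &lt;p&gt;Here is a rough sketch of how two of those properties might translate into executable checks. The function names and the tiny ratchet data are hypothetical; in particular, &lt;code&gt;callers_of_bad&lt;/code&gt; stands in for whatever analysis decides &lt;code&gt;Uses(f, bad_function)&lt;/code&gt;.&lt;/p&gt;

```python
from itertools import combinations


def is_sorted(l):
    """all i, j: if i comes before j then l[j] >= l[i]."""
    # combinations(range(n), 2) yields exactly the index pairs with i before j.
    return all(l[j] >= l[i] for i, j in combinations(range(len(l)), 2))


def ratchet_fails(functions, exceptions, callers_of_bad):
    """some f in functions - exceptions: Uses(f, bad_function)."""
    return any(f in callers_of_bad for f in functions - exceptions)


print(is_sorted([1, 2, 2, 5]), is_sorted([3, 1, 2]))

functions = {"parse", "render", "legacy_io"}
exceptions = {"legacy_io"}  # grandfathered-in users of the bad function
print(ratchet_fails(functions, exceptions, {"legacy_io"}))
print(ratchet_fails(functions, exceptions, {"legacy_io", "render"}))
```

    &lt;p&gt;Checking every pair is quadratic, so this is a statement of the property rather than an efficient implementation of it.&lt;/p&gt;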
    &lt;p&gt;One cool use case that'll be in the book's next version: database invariants are universal statements over the set of all records, like &lt;code&gt;all a in accounts: a.balance &amp;gt; 0&lt;/code&gt;. That's enforceable with a &lt;a href="https://sqlite.org/lang_createtable.html#check_constraints" target="_blank"&gt;CHECK&lt;/a&gt; constraint. But what about something like &lt;code&gt;all i, i' in intervals: NoOverlap(i, i')&lt;/code&gt;? That isn't covered by CHECK, since it spans two rows.&lt;/p&gt;
    &lt;p&gt;Quantifier duality to the rescue! The invariant is equivalent to &lt;code&gt;!(some i, i' in intervals: Overlap(i, i'))&lt;/code&gt;, so it is preserved if the &lt;em&gt;query&lt;/em&gt; &lt;code&gt;SELECT COUNT(*) FROM intervals CROSS JOIN intervals …&lt;/code&gt; returns a count of 0. This means we can test it via a &lt;a href="https://sqlite.org/lang_createtrigger.html" target="_blank"&gt;database trigger&lt;/a&gt;.&lt;sup id="fnref:efficiency"&gt;&lt;a class="footnote-ref" href="#fn:efficiency"&gt;3&lt;/a&gt;&lt;/sup&gt;&lt;/p&gt;
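    &lt;p&gt;A minimal sketch of what that could look like in SQLite, driven from Python's standard &lt;code&gt;sqlite3&lt;/code&gt; module. The schema is an assumption made for the example: integer half-open intervals with hypothetical &lt;code&gt;start&lt;/code&gt; and &lt;code&gt;stop&lt;/code&gt; columns. The trigger compares only the incoming &lt;code&gt;NEW&lt;/code&gt; row against existing rows, and the helper &lt;code&gt;count_overlaps&lt;/code&gt; runs the full cross-join query as a sanity check.&lt;/p&gt;

```python
import sqlite3


def make_db():
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE intervals (
            id    INTEGER PRIMARY KEY,
            start INTEGER NOT NULL,
            stop  INTEGER NOT NULL,
            CHECK (stop > start)   -- single-row invariant: CHECK is enough
        );

        -- Cross-row invariant: abort if some existing row overlaps NEW.
        CREATE TRIGGER intervals_no_overlap
        BEFORE INSERT ON intervals
        WHEN EXISTS (
            SELECT 1 FROM intervals AS i
            WHERE i.stop > NEW.start AND NEW.stop > i.start
        )
        BEGIN
            SELECT RAISE(ABORT, 'interval overlaps an existing row');
        END;
    """)
    return conn


def try_insert(conn, start, stop):
    """Return True if the row was accepted, False if a constraint rejected it."""
    try:
        conn.execute(
            "INSERT INTO intervals (start, stop) VALUES (?, ?)", (start, stop)
        )
        return True
    except sqlite3.Error:
        return False


def count_overlaps(conn):
    """Count overlapping pairs; the invariant holds when this is 0."""
    (n,) = conn.execute("""
        SELECT COUNT(*) FROM intervals AS a CROSS JOIN intervals AS b
        WHERE b.id > a.id AND b.stop > a.start AND a.stop > b.start
    """).fetchone()
    return n


conn = make_db()
results = [try_insert(conn, 0, 5), try_insert(conn, 5, 10), try_insert(conn, 3, 7)]
print(results, count_overlaps(conn))
```

    &lt;p&gt;This sketch only guards inserts; a real table would presumably want a matching &lt;code&gt;BEFORE UPDATE&lt;/code&gt; trigger too.&lt;/p&gt;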
    &lt;hr/&gt;
    &lt;p&gt;There are a lot more use cases for quantifiers, but this is enough to introduce the ideas! Next week's the one year anniversary of the book entering early access, so I'll be writing a bit about that experience and how the book changed. It's &lt;em&gt;crazy&lt;/em&gt; how crude v0.1 was compared to the current version.&lt;/p&gt;
    &lt;div class="footnote"&gt;
    &lt;hr/&gt;
    &lt;ol&gt;
    &lt;li id="fn:misu"&gt;
    &lt;p&gt;MISU ("make illegal states unrepresentable") means using data representations that rule out invalid values. For example, if you have a &lt;code&gt;location -&amp;gt; Optional(item)&lt;/code&gt; lookup and want to make sure that each item is in exactly one location, consider instead changing the map to &lt;code&gt;item -&amp;gt; location&lt;/code&gt;. This is a means of &lt;em&gt;implementing&lt;/em&gt; the property &lt;code&gt;all i in item, l, l' in location: if ItemIn(i, l) &amp;amp;&amp;amp; l != l' then !ItemIn(i, l')&lt;/code&gt;. &lt;a class="footnote-backref" href="#fnref:misu" title="Jump back to footnote 1 in the text"&gt;↩&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id="fn:paradox"&gt;
    &lt;p&gt;Specifically, a set can't be an element of itself, which rules out constructing things like "the set of all sets" or "the set of sets that don't contain themselves". &lt;a class="footnote-backref" href="#fnref:paradox" title="Jump back to footnote 2 in the text"&gt;↩&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id="fn:efficiency"&gt;
    &lt;p&gt;Though note that when you're inserting or updating an interval, you already &lt;em&gt;have&lt;/em&gt; that row's fields in the trigger's &lt;code&gt;NEW&lt;/code&gt; keyword. So you can just query &lt;code&gt;!(some i in intervals: Overlap(new, i))&lt;/code&gt;, which is more efficient. &lt;a class="footnote-backref" href="#fnref:efficiency" title="Jump back to footnote 3 in the text"&gt;↩&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;/ol&gt;
    &lt;/div&gt;</description>
                <pubDate>Wed, 02 Jul 2025 19:44:22 +0000</pubDate>
                <guid>https://buttondown.com/hillelwayne/archive/logical-quantifiers-in-software/</guid>
            </item>
            <item>
                <title>You can cheat a test suite with a big enough polynomial</title>
                <link>https://buttondown.com/hillelwayne/archive/you-can-cheat-a-test-suite-with-a-big-enough/</link>
                <description>&lt;p&gt;Hi nerds, I'm back from &lt;a href="https://systemsdistributed.com/" target="_blank"&gt;Systems Distributed&lt;/a&gt;! I'd heartily recommend it, wildest conference I've been to in years. I have a lot of work to catch up on, so this will be a short newsletter.&lt;/p&gt;
    &lt;p&gt;In an earlier version of my talk, I had a gag about unit tests. First I showed the test &lt;code&gt;f([1,2,3]) == 3&lt;/code&gt;, then said that this was satisfied by &lt;code&gt;f(l) = 3&lt;/code&gt;, &lt;code&gt;f(l) = l[-1]&lt;/code&gt;, &lt;code&gt;f(l) = len(l)&lt;/code&gt;, and &lt;code&gt;f(l) = (129*l[0]-34*l[1]-617)*l[2] - 443*l[0] + 1148*l[1] - 182&lt;/code&gt;. Then I progressively ruled them out one by one with more unit tests, except the last polynomial, which stubbornly passed every single test.&lt;/p&gt;
    &lt;p&gt;If you're given some function &lt;code&gt;f(x: int, y: int, …): int&lt;/code&gt; and a set of unit tests asserting &lt;a href="https://buttondown.com/hillelwayne/archive/oracle-testing/" target="_blank"&gt;specific inputs give specific outputs&lt;/a&gt;, then you can find a polynomial that passes every single unit test.&lt;/p&gt;
    &lt;p&gt;To find the gag, and as &lt;a href="https://en.wikipedia.org/wiki/Satisfiability_modulo_theories" target="_blank"&gt;SMT&lt;/a&gt; practice, I wrote a Python program that finds a polynomial that passes a test suite meant for &lt;code&gt;max&lt;/code&gt;. It's hardcoded for three parameters and only finds 2nd-order polynomials but I think it could be generalized with enough effort.&lt;/p&gt;
    &lt;h2&gt;The code&lt;/h2&gt;
    &lt;p&gt;Full code &lt;a href="https://gist.github.com/hwayne/0ed045a35376c786171f9cf4b55c470f" target="_blank"&gt;here&lt;/a&gt;, breakdown below.&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nn"&gt;z3&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;  &lt;span class="c1"&gt;# type: ignore&lt;/span&gt;
    &lt;span class="n"&gt;s1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;s2&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;Solver&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt; &lt;span class="n"&gt;Solver&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;&lt;a href="https://microsoft.github.io/z3guide/" target="_blank"&gt;Z3&lt;/a&gt; is just the particular SMT solver we use, as it has good language bindings and a lot of affordances.&lt;/p&gt;
    &lt;p&gt;As part of learning SMT I wanted to do this two ways: first by putting the polynomial "outside" of the SMT solver in a Python function, second by doing it "natively" in Z3. I created two solvers so I could test both versions in one run. &lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="n"&gt;a0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;a&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;b&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;c&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;d&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;f&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;Consts&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'a0 a b c d e f'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;IntSort&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt;
    &lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;y&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;z&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;Ints&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'x y z'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;t&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"a*x+b*y+c*z+d*x*y+e*x*z+f*y*z+a0"&lt;/span&gt;
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;Both &lt;code&gt;Const('x', IntSort())&lt;/code&gt; and &lt;code&gt;Int('x')&lt;/code&gt; do the exact same thing, the latter being syntactic sugar for the former. I did not know this when I wrote the program. &lt;/p&gt;
    &lt;p&gt;To keep the two versions in sync I represented the equation as a string, which I later &lt;code&gt;eval&lt;/code&gt;. This is one of the rare cases where eval is a good idea, to help us experiment more quickly while learning. The polynomial is a "2nd-order polynomial", even though it doesn't have &lt;code&gt;x^2&lt;/code&gt; terms, as it has &lt;code&gt;xy&lt;/code&gt; and &lt;code&gt;xz&lt;/code&gt; terms.&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="n"&gt;lambdamax&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;lambda&lt;/span&gt; &lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;y&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;z&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;eval&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;t&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    
    &lt;span class="n"&gt;z3max&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;Function&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'z3max'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;IntSort&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt; &lt;span class="n"&gt;IntSort&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt; &lt;span class="n"&gt;IntSort&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt;  &lt;span class="n"&gt;IntSort&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt;
    &lt;span class="n"&gt;s1&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;add&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ForAll&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;y&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;z&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="n"&gt;z3max&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;y&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;z&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="nb"&gt;eval&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;t&lt;/span&gt;&lt;span class="p"&gt;)))&lt;/span&gt;
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;&lt;code&gt;lambdamax&lt;/code&gt; is pretty straightforward: create a lambda with three parameters and &lt;code&gt;eval&lt;/code&gt; the string. The string "&lt;code&gt;a*x&lt;/code&gt;" then becomes the Python expression &lt;code&gt;a*x&lt;/code&gt;, where &lt;code&gt;a&lt;/code&gt; is an SMT symbol and the &lt;code&gt;x&lt;/code&gt; SMT symbol is shadowed by the lambda parameter. To reiterate, this is a terrible idea in practice, but a good way to learn faster.&lt;/p&gt;
    &lt;p&gt;The &lt;code&gt;z3max&lt;/code&gt; function is a little more complex. &lt;code&gt;Function&lt;/code&gt; takes an identifier string and N "sorts" (roughly the same as programming types). The first &lt;code&gt;N-1&lt;/code&gt; sorts define the parameters of the function, while the last becomes the output. So here I assign the string identifier &lt;code&gt;"z3max"&lt;/code&gt; to be a function with signature &lt;code&gt;(int, int, int) -&amp;gt; int&lt;/code&gt;.&lt;/p&gt;
    &lt;p&gt;I can load the function into the model by specifying constraints on what &lt;code&gt;z3max&lt;/code&gt; &lt;em&gt;could&lt;/em&gt; be. This could either be a strict input/output, as will be done later, or a &lt;code&gt;ForAll&lt;/code&gt; over all possible inputs. Here I just use that directly to say "for all inputs, the function should match this polynomial." But I could do more complicated constraints, like commutativity (&lt;code&gt;f(x, y) == f(y, x)&lt;/code&gt;) or monotonicity (&lt;code&gt;Implies(x &amp;lt; y, f(x) &amp;lt;= f(y))&lt;/code&gt;).&lt;/p&gt;
    &lt;p&gt;Note &lt;code&gt;ForAll&lt;/code&gt; takes a list of z3 symbols to quantify over. That's the only reason we need to define &lt;code&gt;x, y, z&lt;/code&gt; in the first place. The lambda version doesn't need them. &lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="n"&gt;inputs&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="p"&gt;)]&lt;/span&gt;
    
    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;g&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;inputs&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;s1&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;add&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;z3max&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;g&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="nb"&gt;max&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;g&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
        &lt;span class="n"&gt;s2&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;add&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;lambdamax&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;g&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="nb"&gt;max&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;g&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;This sets up the joke: adding constraints to each solver that the polynomial it finds must, for a fixed list of triplets, return the max of each triplet.&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;s&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;func&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="p"&gt;[(&lt;/span&gt;&lt;span class="n"&gt;s1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;z3max&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;s2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;lambdamax&lt;/span&gt;&lt;span class="p"&gt;)]:&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;s&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;check&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="n"&gt;sat&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="n"&gt;m&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;s&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
            &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;y&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;z&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;inputs&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
                &lt;span class="nb"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="s2"&gt;"max([&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;, &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;y&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;, &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;z&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;]) ="&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;m&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;evaluate&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;func&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;y&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;z&lt;/span&gt;&lt;span class="p"&gt;)))&lt;/span&gt;
            &lt;span class="nb"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="s2"&gt;"max([x, y, z]) = &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;m&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;a&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;x + &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;m&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;b&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;y"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="s2"&gt;"+ &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;m&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;c&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;z +"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="c1"&gt;# linebreaks added for newsletter rendering&lt;/span&gt;
                &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;m&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;d&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;xy + &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;m&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;xz + &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;m&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;yz + &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;m&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;a0&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;Output:&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;max([1, 2, 3]) = 3
    # etc
    max([x, y, z]) = -133x + 130y + -10z + -2xy + 62xz + -46yz + 0
    
    max([1, 2, 3]) = 3
    # etc
    max([x, y, z]) = -17x + 16y + 0z + 0xy + 8xz + -6yz + 0
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;I find that &lt;code&gt;z3max&lt;/code&gt; (top) consistently finds larger coefficients than &lt;code&gt;lambdamax&lt;/code&gt; does. I don't know why.&lt;/p&gt;
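    &lt;p&gt;As a quick sanity check that needs no solver, the two polynomials printed above can be evaluated in plain Python. The coefficients are copied from the output; the helper names &lt;code&gt;poly&lt;/code&gt; and &lt;code&gt;passes&lt;/code&gt; and the extra input &lt;code&gt;(3, 2, 1)&lt;/code&gt; are mine, added only to show that passing the listed tests says nothing about any other input.&lt;/p&gt;

```python
def poly(coeffs, x, y, z):
    """Evaluate a*x + b*y + c*z + d*x*y + e*x*z + f*y*z + a0."""
    a, b, c, d, e, f, a0 = coeffs
    return a*x + b*y + c*z + d*x*y + e*x*z + f*y*z + a0


def passes(coeffs, cases):
    """True if the polynomial agrees with max on every listed case."""
    return all(poly(coeffs, *g) == max(g) for g in cases)


tests = [(1, 2, 3), (4, 2, 2), (1, 1, 1), (3, 5, 4)]

# Coefficients (a, b, c, d, e, f, a0) as printed in the output above.
z3max_coeffs = (-133, 130, -10, -2, 62, -46, 0)
lambdamax_coeffs = (-17, 16, 0, 0, 8, -6, 0)

print(passes(z3max_coeffs, tests), passes(lambdamax_coeffs, tests))

# An input that is not in the test suite: neither polynomial returns max here.
print(poly(z3max_coeffs, 3, 2, 1), poly(lambdamax_coeffs, 3, 2, 1), max(3, 2, 1))
```

    &lt;p&gt;Both polynomials agree with &lt;code&gt;max&lt;/code&gt; on all four test inputs and disagree with it on the extra one, which is the whole joke.&lt;/p&gt;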
    &lt;h3&gt;Practical Applications&lt;/h3&gt;
    &lt;p&gt;&lt;strong&gt;Test-Driven Development&lt;/strong&gt; recommends a strict "red-green refactor" cycle. Write a new failing test, make the new test pass, then go back and refactor. Well, the easiest way to make the new test pass would be to paste in a new polynomial, so that's what you should be doing. You can even do this all automatically: have a script read the set of test cases, pass them to the solver, and write the new polynomial to your code file. All you need to do is write the tests!&lt;/p&gt;
    &lt;h3&gt;Pedagogical Notes&lt;/h3&gt;
    &lt;p&gt;Writing the script took me a couple of hours. I'm sure an LLM could have whipped it all up in five minutes but I really want to &lt;em&gt;learn&lt;/em&gt; SMT and &lt;a href="https://www.sciencedirect.com/science/article/pii/S0747563224002541" target="_blank"&gt;LLMs &lt;em&gt;may&lt;/em&gt; decrease learning retention&lt;/a&gt;.&lt;sup id="fnref:caveat"&gt;&lt;a class="footnote-ref" href="#fn:caveat"&gt;1&lt;/a&gt;&lt;/sup&gt; Z3 documentation is not... great for non-academics, though, and most other SMT solvers have even worse docs. One useful trick I use regularly is to use GitHub code search to find code using the same APIs and study how that works. Turns out reading API-heavy code is a lot easier than writing it!&lt;/p&gt;
    &lt;p&gt;Anyway, I'm very, very slowly feeling like I'm getting the basics of how to use SMT. I don't have any practical use cases yet, but I've wanted to learn this skill for a while and am glad I finally did.&lt;/p&gt;
    &lt;div class="footnote"&gt;
    &lt;hr/&gt;
    &lt;ol&gt;
    &lt;li id="fn:caveat"&gt;
    &lt;p&gt;Caveat: I have not actually &lt;em&gt;read&lt;/em&gt; the study; for all I know it could have a sample size of three people. I'll get around to it eventually. &lt;a class="footnote-backref" href="#fnref:caveat" title="Jump back to footnote 1 in the text"&gt;↩&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;/ol&gt;
    &lt;/div&gt;</description>
                <pubDate>Tue, 24 Jun 2025 16:27:01 +0000</pubDate>
                <guid>https://buttondown.com/hillelwayne/archive/you-can-cheat-a-test-suite-with-a-big-enough/</guid>
            </item>
            <item>
                <title>Solving LinkedIn Queens with SMT</title>
                <link>https://buttondown.com/hillelwayne/archive/solving-linkedin-queens-with-smt/</link>
                <description>&lt;h3&gt;No newsletter next week&lt;/h3&gt;
    &lt;p&gt;I’ll be speaking at &lt;a href="https://systemsdistributed.com/" target="_blank"&gt;Systems Distributed&lt;/a&gt;. My talk isn't close to done yet, which is why this newsletter is both late and short. &lt;/p&gt;
    &lt;h1&gt;Solving LinkedIn Queens in SMT&lt;/h1&gt;
    &lt;p&gt;The article &lt;a href="https://codingnest.com/modern-sat-solvers-fast-neat-underused-part-1-of-n/" target="_blank"&gt;Modern SAT solvers: fast, neat and underused&lt;/a&gt; claims that SAT solvers&lt;sup id="fnref:SAT"&gt;&lt;a class="footnote-ref" href="#fn:SAT"&gt;1&lt;/a&gt;&lt;/sup&gt; are "criminally underused by the industry". A while back on the newsletter I asked "why": how come they're so powerful and yet nobody uses them? Many experts responded that encoding problems in SAT kinda sucks and they'd rather use tools that compile to SAT. &lt;/p&gt;
    &lt;p&gt;I was reminded of this when I read &lt;a href="https://ryanberger.me/posts/queens/" target="_blank"&gt;Ryan Berger's post&lt;/a&gt; on solving “LinkedIn Queens” as a SAT problem. &lt;/p&gt;
    &lt;p&gt;A quick overview of Queens. You’re presented with an NxN grid divided into N regions, and have to place N queens so that there is exactly one queen in each row, column, and region. While queens can be on the same diagonal, they &lt;em&gt;cannot&lt;/em&gt; be adjacently diagonal.&lt;/p&gt;
    &lt;p&gt;(Important note: LinkedIn “Queens” is a variation on the puzzle game &lt;a href="https://starbattle.puzzlebaron.com/" target="_blank"&gt;Star Battle&lt;/a&gt;, which is the same except the number of stars you place in each row/column/region varies per puzzle, and is usually two. This is also why 'queens' don’t capture like chess queens.)&lt;/p&gt;
    &lt;p&gt;&lt;img alt="An image of a solved queens board. Copied from https://ryanberger.me/posts/queens" class="newsletter-image" src="https://assets.buttondown.email/images/96f6f923-331f-424d-8641-fe6753e1c2ca.png?w=960&amp;amp;fit=max"/&gt;&lt;/p&gt;
    &lt;p&gt;Ryan solved this by writing Queens as a SAT problem, expressing properties like "there is exactly one queen in row 3" as a large number of boolean clauses. &lt;a href="https://ryanberger.me/posts/queens/" target="_blank"&gt;Go read his post, it's pretty cool&lt;/a&gt;. What leapt out to me was that he used &lt;a href="https://cvc5.github.io/" target="_blank"&gt;CVC5&lt;/a&gt;, an &lt;strong&gt;SMT&lt;/strong&gt; solver.&lt;sup id="fnref:SMT"&gt;&lt;a class="footnote-ref" href="#fn:SMT"&gt;2&lt;/a&gt;&lt;/sup&gt; SMT solvers are "higher-level" than SAT, capable of handling more data types than just boolean variables. It's a lot easier to solve the problem at the SMT level than at the SAT level. To show this, I whipped up a short demo of solving the same problem in &lt;a href="https://github.com/Z3Prover/z3/wiki" target="_blank"&gt;Z3&lt;/a&gt; (via the &lt;a href="https://pypi.org/project/z3-solver/" target="_blank"&gt;Python API&lt;/a&gt;).&lt;/p&gt;
    &lt;p&gt;&lt;a href="https://gist.github.com/hwayne/c5de7bc52e733995311236666bedecd3" target="_blank"&gt;Full code here&lt;/a&gt;, which you can compare to Ryan's SAT solution &lt;a href="https://github.com/ryan-berger/queens/blob/master/main.py" target="_blank"&gt;here&lt;/a&gt;. I didn't do a whole lot of cleanup on it (again, time crunch!), but there's a short explanation below.&lt;/p&gt;
    &lt;h3&gt;The code&lt;/h3&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nn"&gt;z3&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="c1"&gt;# type: ignore&lt;/span&gt;
    &lt;span class="kn"&gt;from&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nn"&gt;itertools&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;combinations&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;chain&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;product&lt;/span&gt;
    &lt;span class="n"&gt;solver&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;Solver&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="n"&gt;size&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;9&lt;/span&gt; &lt;span class="c1"&gt;# N&lt;/span&gt;
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;Initial setup and modules. &lt;code&gt;size&lt;/code&gt; is the number of rows/columns/regions in the board, which I'll call &lt;code&gt;N&lt;/code&gt; below.&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="c1"&gt;# queens[n] = col of queen on row n&lt;/span&gt;
    &lt;span class="c1"&gt;# by construction, not on same row&lt;/span&gt;
    &lt;span class="n"&gt;queens&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;IntVector&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'q'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;size&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; 
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;SAT represents the queen positions via N² booleans: &lt;code&gt;q_00&lt;/code&gt; means that a Queen is on row 0 and column 0, &lt;code&gt;!q_05&lt;/code&gt; means a queen &lt;em&gt;isn't&lt;/em&gt; on row 0 col 5, etc. In SMT we can instead encode it as N integers: &lt;code&gt;q_0 = 5&lt;/code&gt; means that the queen on row 0 is positioned at column 5. This immediately enforces one class of constraints for us: we don't need any constraints saying "exactly one queen per row", because that's embedded in the definition of &lt;code&gt;queens&lt;/code&gt;!&lt;/p&gt;
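    &lt;p&gt;To get a feel for the gap, here is a rough sketch of what "exactly one queen per row" costs when the board is N² booleans. It assumes the common pairwise encoding (one "at least one" clause, plus one "at most one" clause per pair of cells). That encoding is an assumption made for illustration and is not necessarily what Ryan's solution does, and the helper &lt;code&gt;exactly_one_clauses&lt;/code&gt; is a hypothetical name.&lt;/p&gt;

```python
from itertools import combinations

def exactly_one_clauses(variables):
    """Pairwise CNF encoding of 'exactly one of these booleans is true'.

    Each clause is a list of (variable, polarity) literals.
    Hypothetical helper, written only for this illustration.
    """
    at_least_one = [[(v, True) for v in variables]]
    at_most_one = [[(a, False), (b, False)] for a, b in combinations(variables, 2)]
    return at_least_one + at_most_one

size = 9  # N, as in the article

row_clauses = []
for row in range(size):
    cells = [f"q_{row}{col}" for col in range(size)]
    row_clauses.extend(exactly_one_clauses(cells))

print(size * size)       # boolean variables in the SAT encoding: 81
print(len(row_clauses))  # clauses for the row rule alone: 333
print(size)              # integer variables in the SMT encoding: 9
```

    &lt;p&gt;With one integer per row, those 333 clauses are not needed at all, since the row rule is part of the representation.&lt;/p&gt;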
    &lt;p&gt;(Incidentally, using 0-based indexing for the board was a mistake on my part, it makes correctly encoding the regions later really painful.)&lt;/p&gt;
    &lt;p&gt;To actually make the variables &lt;code&gt;[q_0, q_1, …]&lt;/code&gt;, we use the Z3 affordance &lt;code&gt;IntVector(str, n)&lt;/code&gt; for making &lt;code&gt;n&lt;/code&gt; variables at once.&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="n"&gt;solver&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;add&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="n"&gt;And&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;=&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="n"&gt;size&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;queens&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
    &lt;span class="c1"&gt;# not on same column&lt;/span&gt;
    &lt;span class="n"&gt;solver&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;add&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;Distinct&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;queens&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;First we constrain all the integers to &lt;code&gt;[0, N)&lt;/code&gt;, then use the &lt;em&gt;incredibly&lt;/em&gt; handy &lt;code&gt;Distinct&lt;/code&gt; constraint to force all the integers to have different values. This guarantees at most one queen per column, which by the &lt;a href="https://en.wikipedia.org/wiki/Pigeonhole_principle" target="_blank"&gt;pigeonhole principle&lt;/a&gt; means there is exactly one queen per column.&lt;/p&gt;
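    &lt;p&gt;The pigeonhole step can be checked by brute force on a small board. This is a plain-Python stand-in for what the two constraints express, not Z3 code, and it uses N = 4 only to keep the enumeration tiny.&lt;/p&gt;

```python
from itertools import product

size = 4  # a small N keeps the brute force tiny (4**4 = 256 candidates)

# Every assignment of a column in [0, N) to each of the N rows.
candidates = product(range(size), repeat=size)

# Keep the assignments whose values are pairwise different,
# which is what Distinct(queens) asks of the solver.
distinct = [cols for cols in candidates if len(set(cols)) == size]

# Pigeonhole: N different values drawn from N columns use every column once.
assert all(sorted(cols) == list(range(size)) for cols in distinct)

print(len(distinct))  # 24, the 4! permutations of the columns
```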
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="c1"&gt;# not diagonally adjacent&lt;/span&gt;
    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;size&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="n"&gt;q1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;q2&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;queens&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="n"&gt;queens&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="o"&gt;+&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
        &lt;span class="n"&gt;solver&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;add&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;Abs&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;q1&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="n"&gt;q2&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;!=&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;One of the rules is that queens can't be adjacent. We already know that they can't be horizontally or vertically adjacent via other constraints, which leaves the diagonals. We only need to add constraints that, for each queen, there is no queen in the lower-left or lower-right corner, aka &lt;code&gt;q_3 != q_2 ± 1&lt;/code&gt;. We don't need to check the top corners because if &lt;code&gt;q_1&lt;/code&gt; is in the upper-left corner of &lt;code&gt;q_2&lt;/code&gt;, then &lt;code&gt;q_2&lt;/code&gt; is in the lower-right corner of &lt;code&gt;q_1&lt;/code&gt;!&lt;/p&gt;
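    &lt;p&gt;The symmetry argument can also be checked mechanically. The sketch below is plain Python, not Z3, and both function names are hypothetical. It compares the one-directional check with an exhaustive look at all four diagonal corners, over every placement that has one queen per row and column on a 6x6 board.&lt;/p&gt;

```python
from itertools import permutations

def no_lower_diagonal_neighbour(cols):
    """The article's check: compare each row only with the row below it."""
    return all(abs(cols[i] - cols[i + 1]) != 1 for i in range(len(cols) - 1))

def no_diagonal_neighbour(cols):
    """Exhaustive check: look at all four diagonal corners of every queen."""
    placed = {(row, col) for row, col in enumerate(cols)}
    corners = [(-1, -1), (-1, 1), (1, -1), (1, 1)]
    return not any(
        (row + dr, col + dc) in placed
        for row, col in placed
        for dr, dc in corners
    )

# The two checks agree on every placement with one queen per row and column.
size = 6
assert all(
    no_lower_diagonal_neighbour(p) == no_diagonal_neighbour(p)
    for p in permutations(range(size))
)

print(no_lower_diagonal_neighbour([1, 3, 0, 2]))  # True
print(no_lower_diagonal_neighbour([0, 1, 2, 3]))  # False
```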
    &lt;p&gt;That covers everything except the "one queen per region" constraint. But the regions are the tricky part, which we should expect because we vary the difficulty of queens games by varying the regions.&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="n"&gt;regions&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="s2"&gt;"purple"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[(&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;6&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;7&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;8&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
                       &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;6&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;7&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;8&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
                       &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;8&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)],&lt;/span&gt;
            &lt;span class="s2"&gt;"red"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;6&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;6&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;7&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;7&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span 
class="mi"&gt;8&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;8&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;),],&lt;/span&gt;
            &lt;span class="c1"&gt;# you get the picture&lt;/span&gt;
            &lt;span class="p"&gt;}&lt;/span&gt;
    
    &lt;span class="c1"&gt;# Some checking code left out, see below&lt;/span&gt;
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;The regions have to be manually coded in, which is a huge pain.&lt;/p&gt;
    &lt;p&gt;(In the link, some validation code follows. Since it breaks up explaining the model I put it in the next section.)&lt;/p&gt;
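    &lt;p&gt;One possible way to reduce the pain is to type the board as a grid of digits and derive the dictionary from it. This is only a sketch: &lt;code&gt;regions_from_grid&lt;/code&gt; and the grid format are hypothetical, the digits follow the colour order used by the rendering code in the sanity-check section, and the grid itself is copied from the output shown there, so treat it as an approximation of the real board.&lt;/p&gt;

```python
from collections import defaultdict

# Digit k stands for colormap[k - 1], the same order the sanity-check rendering uses.
colormap = ["purple", "red", "brown", "white", "green",
            "yellow", "orange", "blue", "pink"]

# Copied from the rendered output in the sanity-check section.
grid = """
111111111
112333999
122439999
124437799
124666779
124467799
122467899
122555889
112258899
"""

def regions_from_grid(text):
    """Turn a digit grid into {color: [(row, col), ...]} (hypothetical helper)."""
    regions = defaultdict(list)
    for row, line in enumerate(text.split()):
        for col, digit in enumerate(line):
            regions[colormap[int(digit) - 1]].append((row, col))
    return dict(regions)

regions = regions_from_grid(grid)
print(len(regions))        # 9
print(regions["red"][:3])  # [(1, 2), (2, 1), (2, 2)]
```

    &lt;p&gt;The squares come out in row order rather than in the order typed above, which makes no difference to the constraints.&lt;/p&gt;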
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;r&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;regions&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;values&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
        &lt;span class="n"&gt;solver&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;add&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;Or&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
            &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;queens&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;row&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="n"&gt;col&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;row&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;col&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;r&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
            &lt;span class="p"&gt;))&lt;/span&gt;
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;Finally we have the region constraint. The easiest way I found to say "there is exactly one queen in each region" is to say "there is a queen in region 1 and a queen in region 2 and a queen in region 3", etc. Since there are N queens and N non-overlapping regions, at least one queen per region means exactly one queen per region. Then to say "there is a queen in region &lt;code&gt;purple&lt;/code&gt;" I wrote "&lt;code&gt;q_0 = 0&lt;/code&gt; OR &lt;code&gt;q_0 = 1&lt;/code&gt; OR … OR &lt;code&gt;q_1 = 0&lt;/code&gt; etc." &lt;/p&gt;
    &lt;p&gt;Why iterate over every position in the region instead of doing something like &lt;code&gt;(0, q[0]) in r&lt;/code&gt;? I tried that but it's not an expression that Z3 supports.&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;solver&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;check&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="n"&gt;sat&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;m&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;solver&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
        &lt;span class="nb"&gt;print&lt;/span&gt;&lt;span class="p"&gt;([(&lt;/span&gt;&lt;span class="n"&gt;l&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;m&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;l&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;l&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;queens&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;Finally, we solve and print the positions. Running this gives me:&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="p"&gt;[(&lt;/span&gt;&lt;span class="n"&gt;q__0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;q__1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;q__2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;8&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; 
     &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;q__3&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;q__4&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;7&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;q__5&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; 
     &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;q__6&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;q__7&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;q__8&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;6&lt;/span&gt;&lt;span class="p"&gt;)]&lt;/span&gt;
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;Which is the correct solution to the queens puzzle. I didn't benchmark the solution times, but I imagine it's considerably slower than a raw SAT solver. &lt;a href="https://github.com/audemard/glucose" target="_blank"&gt;Glucose&lt;/a&gt; is really, really fast.&lt;/p&gt;
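    &lt;p&gt;The output can be double-checked without the solver. The sketch below is plain Python with a hypothetical &lt;code&gt;check&lt;/code&gt; function. It takes the columns from the output above and the region grid that the sanity-check section prints further down, assuming that grid matches the real board.&lt;/p&gt;

```python
# Region grid as printed by the sanity-check section (assumed to match the board).
grid = [
    "111111111",
    "112333999",
    "122439999",
    "124437799",
    "124666779",
    "124467799",
    "122467899",
    "122555889",
    "112258899",
]

# solution[row] = col, copied from the solver output above.
solution = [0, 5, 8, 2, 7, 4, 1, 3, 6]

size = len(grid)

def check(cols):
    """Re-check the puzzle rules for a list of columns, one entry per row."""
    one_per_column = sorted(cols) == list(range(size))
    no_diagonal_touch = all(abs(cols[i] - cols[i + 1]) != 1 for i in range(size - 1))
    # N queens landing in N different regions means one queen per region.
    regions_hit = {grid[row][col] for row, col in enumerate(cols)}
    one_per_region = len(regions_hit) == size
    return one_per_column and no_diagonal_touch and one_per_region

print(check(solution))  # True
```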
    &lt;p&gt;But even so, solving the problem with SMT was a lot &lt;em&gt;easier&lt;/em&gt; than solving it with SAT. That satisfies me as an explanation for why people prefer it to SAT.&lt;/p&gt;
    &lt;h3&gt;Sanity checks&lt;/h3&gt;
    &lt;p&gt;One bit I glossed over earlier was the sanity checking code. I &lt;em&gt;knew for sure&lt;/em&gt; that I was going to make a mistake encoding &lt;code&gt;regions&lt;/code&gt;, and the solver wasn't going to provide useful information about what I did wrong. In cases like these, I like adding small tests and checks to catch mistakes early, because the solver certainly isn't going to catch them!&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="n"&gt;all_squares&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;set&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;product&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;size&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="n"&gt;repeat&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
    &lt;span class="k"&gt;def&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nf"&gt;test_i_set_up_problem_right&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
        &lt;span class="k"&gt;assert&lt;/span&gt; &lt;span class="n"&gt;all_squares&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="nb"&gt;set&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;chain&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;from_iterable&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;regions&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;values&lt;/span&gt;&lt;span class="p"&gt;()))&lt;/span&gt;
    
        &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;r1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;r2&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;combinations&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;regions&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;values&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
            &lt;span class="k"&gt;assert&lt;/span&gt; &lt;span class="ow"&gt;not&lt;/span&gt; &lt;span class="nb"&gt;set&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;r1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&lt;/span&gt; &lt;span class="nb"&gt;set&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;r2&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="nb"&gt;set&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;r1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&lt;/span&gt; &lt;span class="nb"&gt;set&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;r2&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;The first check was a quick test that I didn't leave any squares out, or accidentally put the same square in two regions. Converting the values into sets makes both checks a lot easier. Honestly I don't know why I didn't just use sets from the start, sets are great.&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="k"&gt;def&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nf"&gt;render_regions&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
        &lt;span class="n"&gt;colormap&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"purple"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;  &lt;span class="s2"&gt;"red"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;"brown"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;"white"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;"green"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;"yellow"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;"orange"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;"blue"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;"pink"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
        &lt;span class="n"&gt;board&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;_&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;size&lt;/span&gt;&lt;span class="p"&gt;)]&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;_&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;size&lt;/span&gt;&lt;span class="p"&gt;)]&lt;/span&gt; 
        &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;row&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;col&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;all_squares&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;color&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;region&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;regions&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;items&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
                &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;row&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;col&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;region&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
                    &lt;span class="n"&gt;board&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;row&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="n"&gt;col&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;colormap&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;index&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;color&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;+&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;
    
        &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;row&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;board&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="nb"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;""&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;join&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;map&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;row&lt;/span&gt;&lt;span class="p"&gt;)))&lt;/span&gt;
    
    &lt;span class="n"&gt;render_regions&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;The second check is something that prints out the regions. It produces something like this:&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;111111111
    112333999
    122439999
    124437799
    124666779
    124467799
    122467899
    122555889
    112258899
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;I can compare this to the picture of the board to make sure I got it right. I guess a more advanced solution would be to print emoji squares like 🟥 instead.&lt;/p&gt;
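    &lt;p&gt;A minimal sketch of the emoji idea, working from the digit rows rather than from &lt;code&gt;regions&lt;/code&gt;. The digit-to-emoji table and &lt;code&gt;render_emoji&lt;/code&gt; are hypothetical. There is no pink square emoji, so a black square stands in for pink here.&lt;/p&gt;

```python
# Hypothetical digit-to-emoji table, in the same colour order as colormap.
squares = {
    "1": "🟪",  # purple
    "2": "🟥",  # red
    "3": "🟫",  # brown
    "4": "⬜",  # white
    "5": "🟩",  # green
    "6": "🟨",  # yellow
    "7": "🟧",  # orange
    "8": "🟦",  # blue
    "9": "⬛",  # pink (stand-in, no pink square exists)
}

def render_emoji(lines):
    """Map each digit of each row to its coloured square."""
    return ["".join(squares[ch] for ch in line) for line in lines]

rows = ["111111111", "112333999", "122439999"]  # first three rows of the output above
for line in render_emoji(rows):
    print(line)
```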
    &lt;p&gt;Neither check is quality code but it's throwaway and it gets the job done so eh.&lt;/p&gt;
    &lt;h3&gt;Update for the Internet&lt;/h3&gt;
    &lt;p&gt;This was sent as a weekly newsletter, which is usually on topics like &lt;a href="https://buttondown.com/hillelwayne/archive/why-do-we-call-it-boilerplate-code" target="_blank"&gt;software history&lt;/a&gt;, &lt;a href="https://buttondown.com/hillelwayne/archive/the-seven-specification-ur-languages/" target="_blank"&gt;formal methods&lt;/a&gt;, &lt;a href="https://buttondown.com/hillelwayne/archive/i-formally-modeled-dreidel-for-no-good-reason/" target="_blank"&gt;unusual technologies&lt;/a&gt;, and the &lt;a href="https://buttondown.com/hillelwayne/archive/be-suspicious-of-success/" target="_blank"&gt;theory of software engineering&lt;/a&gt;. You &lt;a href="https://buttondown.email/hillelwayne/" target="_blank"&gt;can subscribe here&lt;/a&gt;.&lt;/p&gt;
    &lt;div class="footnote"&gt;
    &lt;hr/&gt;
    &lt;ol&gt;
    &lt;li id="fn:SAT"&gt;
    &lt;p&gt;"Boolean &lt;strong&gt;SAT&lt;/strong&gt;isfiability Solver", aka a solver that can find assignments that make complex boolean expressions true. I write a bit more about them &lt;a href="https://www.hillelwayne.com/post/np-hard/" target="_blank"&gt;here&lt;/a&gt;. &lt;a class="footnote-backref" href="#fnref:SAT" title="Jump back to footnote 1 in the text"&gt;↩&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id="fn:SMT"&gt;
    &lt;p&gt;"Satisfiability Modulo Theories" &lt;a class="footnote-backref" href="#fnref:SMT" title="Jump back to footnote 2 in the text"&gt;↩&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;/ol&gt;
    &lt;/div&gt;</description>
                <pubDate>Thu, 12 Jun 2025 15:43:25 +0000</pubDate>
                <guid>https://buttondown.com/hillelwayne/archive/solving-linkedin-queens-with-smt/</guid>
            </item>
            <item>
                <title>AI is a gamechanger for TLA+ users</title>
                <link>https://buttondown.com/hillelwayne/archive/ai-is-a-gamechanger-for-tla-users/</link>
                <description>&lt;h3&gt;New Logic for Programmers Release&lt;/h3&gt;
    &lt;p&gt;&lt;a href="https://leanpub.com/logic/" target="_blank"&gt;v0.10 is now available&lt;/a&gt;! This is a minor release, mostly focused on logic-based refactoring, with new material on set types and testing refactors are correct. See the full release notes at &lt;a href="https://github.com/logicforprogrammers/book-assets/blob/master/CHANGELOG.md" target="_blank"&gt;the changelog page&lt;/a&gt;. Due to &lt;a href="https://systemsdistributed.com/" target="_blank"&gt;conference pressure&lt;/a&gt; v0.11 will also likely be a minor release. &lt;/p&gt;
    &lt;p&gt;&lt;img alt="The book cover" class="newsletter-image" src="https://assets.buttondown.email/images/29d4ae9d-bcb9-4d8b-99d4-8a35c0990ad5.jpg?w=300&amp;amp;fit=max"/&gt;&lt;/p&gt;
    &lt;h1&gt;AI is a gamechanger for TLA+ users&lt;/h1&gt;
    &lt;p&gt;&lt;a href="https://lamport.azurewebsites.net/tla/tla.html" target="_blank"&gt;TLA+&lt;/a&gt; is a specification language to model and debug distributed systems. While very powerful, it's also hard for programmers to learn, and there's always questions of connecting specifications with actual code. &lt;/p&gt;
    &lt;p&gt;That's why &lt;a href="https://zfhuang99.github.io/github%20copilot/formal%20verification/tla+/2025/05/24/ai-revolution-in-distributed-systems.html" target="_blank"&gt;The Coming AI Revolution in Distributed Systems&lt;/a&gt; caught my interest. In the post, Cheng Huang claims that Azure successfully used LLMs to examine an existing codebase, derive a TLA+ spec, and find a production bug in that spec. "After a decade of manually crafting TLA+ specifications", he wrote, "I must acknowledge that this AI-generated specification rivals human work".&lt;/p&gt;
    &lt;p&gt;This inspired me to experiment with LLMs in TLA+ myself. My goals are a little less ambitious than Cheng's: I wanted to see how LLMs could help junior specifiers write TLA+, rather than handling the entire spec automatically. Details on what did and didn't work below, but my takeaway is that &lt;strong&gt;LLMs are an immense specification force multiplier.&lt;/strong&gt;&lt;/p&gt;
    &lt;p&gt;All tests were done with a standard VSCode Copilot subscription, using Claude 3.7 in Agent mode. Other LLMs or IDEs may be more or less effective, etc.&lt;/p&gt;
    &lt;h2&gt;Things Claude was good at&lt;/h2&gt;
    &lt;h3&gt;Fixing syntax errors&lt;/h3&gt;
    &lt;p&gt;TLA+ uses a very different syntax than mainstream programming languages, meaning beginners make a lot of mistakes where they do a "programming syntax" instead of TLA+ syntax:&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;NotThree(x) = \* should be ==, not =
        x != 3 \* should be #, not !=
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;The problem is that the TLA+ syntax checker, SANY, is 30 years old and doesn't provide good information. Here's what it says for that snippet:&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;Was expecting "==== or more Module body"
    Encountered "NotThree" at line 6, column 1
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;That only isolates one error and doesn't tell us what the problem is, only where it is. Experienced TLA+ users get "error eyes" and can quickly see what the problem is, but beginners really struggle with this.&lt;/p&gt;
    &lt;p&gt;The TLA+ foundation has made LLM integration a priority, so the VSCode extension &lt;a href="https://github.com/tlaplus/vscode-tlaplus/blob/master/src/main.ts#L174" target="_blank"&gt;naturally supports several agent actions&lt;/a&gt;. One of these is running SANY, meaning an agent can get an error, fix it, get another error, fix it, etc. Provided the above sample and asked to make it work, Claude successfully fixed both errors. It also fixed many errors in a larger spec, as well as figuring out why PlusCal specs weren't compiling to TLA+.&lt;/p&gt;
    &lt;p&gt;This by itself is already enough to make LLMs a worthwhile tool, as it fixes one of the biggest barriers to entry.&lt;/p&gt;
    &lt;h3&gt;Understanding error traces&lt;/h3&gt;
    &lt;p&gt;When TLA+ finds a violated property, it outputs the sequence of steps that leads to the error. This starts in plaintext, and VSCode parses it into an interactive table:&lt;/p&gt;
    &lt;p&gt;&lt;img alt="An example error trace" class="newsletter-image" src="https://assets.buttondown.email/images/f7f16d0e-c61f-4286-ae49-67e03f844126.png?w=960&amp;amp;fit=max"/&gt;&lt;/p&gt;
    &lt;p&gt;Learning to read these error traces is a skill in itself. You have to understand what's happening in each step and how it relates back to the actually broken property. It takes a long time for people to learn how to do this well.&lt;/p&gt;
    &lt;p&gt;Claude was successful here, too, accurately reading 20+ step error traces and giving a high-level explanation of what went wrong. It also could condense error traces: if ten steps of the error trace could be condensed into a one-sentence summary (which can happen if you're modeling a lot of process internals) Claude would do it.&lt;/p&gt;
    &lt;p&gt;I did have issues here with doing this in agent mode: while the extension does provide a "run model checker" command, the agent would regularly ignore this and prefer to run a terminal command instead. This would be fine except that the LLM consistently hallucinated invalid commands. I had to amend every prompt with "run the model checker via vscode, do not use a terminal command". You can skip this if you're willing to copy and paste the error trace into the prompt.&lt;/p&gt;
    &lt;p&gt;As with syntax checking, if this was the &lt;em&gt;only&lt;/em&gt; thing LLMs could effectively do, that would already be enough&lt;sup id="fnref:dayenu"&gt;&lt;a class="footnote-ref" href="#fn:dayenu"&gt;1&lt;/a&gt;&lt;/sup&gt; to earn a strong recommend. Even as a TLA+ expert I expect I'll be using this trick regularly. &lt;/p&gt;
    &lt;h3&gt;Boilerplate tasks&lt;/h3&gt;
    &lt;p&gt;TLA+ has a lot of boilerplate. One of the most notorious examples is &lt;code&gt;UNCHANGED&lt;/code&gt; rules. Specifications are extremely precise — so precise that you have to specify what variables &lt;em&gt;don't&lt;/em&gt; change in every step. This takes the form of an &lt;code&gt;UNCHANGED&lt;/code&gt; clause at the end of relevant actions:&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;RemoveObjectFromStore(srv, o, s) ==
      /\ o \in stored[s]
      /\ stored' = [stored EXCEPT ![s] = @ \ {o}]
      /\ UNCHANGED &amp;lt;&amp;lt;capacity, log, objectsize, pc&amp;gt;&amp;gt;
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;Writing this is really annoying. Updating these whenever you change an action, or add a new variable to the spec, is doubly so. Syntax checking and error analysis are important for beginners, but this is what I wanted for &lt;em&gt;myself&lt;/em&gt;. I took a spec and prompted Claude&lt;/p&gt;
    &lt;blockquote&gt;
    &lt;p&gt;Add UNCHANGED &amp;lt;&amp;lt;v1, v2, etc&amp;gt;&amp;gt; for each variable not changed in an action.&lt;/p&gt;
    &lt;/blockquote&gt;
    &lt;p&gt;And it worked! It successfully updated the &lt;code&gt;UNCHANGED&lt;/code&gt; in every action. &lt;/p&gt;
    &lt;p&gt;(Note, though, that it was a "well-behaved" spec in this regard: only one "action" happened at a time. In TLA+ you can have two actions happen simultaneously, that each update half of the variables, meaning neither of them should have an &lt;code&gt;UNCHANGED&lt;/code&gt; clause. I haven't tested how Claude handles that!)&lt;/p&gt;
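    &lt;p&gt;To make the chore concrete, here is a rough Python sketch of the bookkeeping involved in the "well-behaved" case. It is not how Claude or the VSCode extension does it. The helper &lt;code&gt;unchanged_vars&lt;/code&gt; and the five-variable list are assumptions for illustration (the spec's full variable list isn't shown above), and the regex is naive: it only looks for names followed by a prime.&lt;/p&gt;

```python
import re

def unchanged_vars(action_body, all_vars):
    """List the spec variables that never appear primed (x') in an action.

    Hypothetical helper: a naive text scan, not a TLA+ parser.
    """
    primed = set(re.findall(r"\b(\w+)'", action_body))
    return [v for v in all_vars if v not in primed]

# Body of RemoveObjectFromStore, minus its UNCHANGED clause.
action = r"""
  /\ o \in stored[s]
  /\ stored' = [stored EXCEPT ![s] = @ \ {o}]
"""

# Assumed variable list for the spec.
spec_vars = ["capacity", "log", "objectsize", "pc", "stored"]
print(unchanged_vars(action, spec_vars))
# ['capacity', 'log', 'objectsize', 'pc']
```

    &lt;p&gt;A scan like this breaks down in exactly the simultaneous-action case described above, since it looks at one action in isolation.&lt;/p&gt;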
    &lt;p&gt;That's the most obvious win, but Claude was good at handling other tedious work, too. Some examples include updating &lt;code&gt;vars&lt;/code&gt; (the conventional collection of all state variables), lifting a hard-coded value into a model parameter, and changing data formats. Most impressive to me, though, was rewriting a spec designed for one process to instead handle multiple processes. This means taking all of the process variables, which originally have types like &lt;code&gt;Int&lt;/code&gt;, converting them to types like &lt;code&gt;[Process -&amp;gt; Int]&lt;/code&gt;, and then updating the uses of all of those variables in the spec. It didn't account for race conditions in the new concurrent behavior, but it was an excellent scaffold to do more work.&lt;/p&gt;
    &lt;h3&gt;Writing properties from an informal description&lt;/h3&gt;
    &lt;p&gt;You have to be pretty precise with your intended property description but it handles converting that precise description into TLA+'s formalized syntax, which is something beginners often struggle with.&lt;/p&gt;
    &lt;h2&gt;Things it is less good at&lt;/h2&gt;
    &lt;h3&gt;Generating model config files&lt;/h3&gt;
    &lt;p&gt;To model check TLA+, you need both a specification (&lt;code&gt;.tla&lt;/code&gt;) and a model config file (&lt;code&gt;.cfg&lt;/code&gt;), which have separate syntaxes. Asking the agent to generate the second often led to it using TLA+ syntax. It automatically fixed this after getting parsing errors, though. &lt;/p&gt;
    &lt;h3&gt;Fixing specs&lt;/h3&gt;
    &lt;p&gt;Whenever the agent ran model checking and discovered a bug, it would naturally propose a change to either the invalid property or the spec. Sometimes the changes were good, other times the changes were not physically realizable. For example, if it found that a bug was due to a race condition between processes, it would often suggest fixing it by saying race conditions were okay. I mean yes, if you say bugs are okay, then the spec finds that bugs are okay! Or it would alternatively suggest adding a constraint to the spec saying that race conditions don't happen. &lt;a href="https://www.hillelwayne.com/post/alloy-facts/" target="_blank"&gt;But that's a huge mistake in specification&lt;/a&gt;, because race conditions happen if we don't have coordination. We need to specify the &lt;em&gt;mechanism&lt;/em&gt; that is supposed to prevent them.&lt;/p&gt;
    &lt;h3&gt;Finding properties of the spec&lt;/h3&gt;
    &lt;p&gt;After seeing how capable it was at translating my properties to TLA+, I started prompting Claude to come up with properties on its own. Unfortunately, almost everything I got back was either trivial, uninteresting, or too coupled to implementation details. I haven't tested if it would work better to ask it for "properties that may be violated".&lt;/p&gt;
    &lt;h3&gt;Generating code from specs&lt;/h3&gt;
    &lt;p&gt;I have to be specific here: Claude &lt;em&gt;could&lt;/em&gt; sometimes convert Python into a passable spec, and vice versa. It &lt;em&gt;wasn't&lt;/em&gt; good at recognizing abstraction. For example, TLA+ specifications often represent sequential operations with a state variable, commonly called &lt;code&gt;pc&lt;/code&gt;. If modeling code that nonatomically retrieves a counter value and increments it, we'd have one action that requires &lt;code&gt;pc = "Get"&lt;/code&gt; and sets the new value to &lt;code&gt;"Inc"&lt;/code&gt;, then another that requires it be &lt;code&gt;"Inc"&lt;/code&gt; and sets it to &lt;code&gt;"Done"&lt;/code&gt;.&lt;/p&gt;
    &lt;p&gt;I found that Claude would try to somehow convert &lt;code&gt;pc&lt;/code&gt; into part of the Python program's state, rather than recognize it as a TLA+ abstraction. On the other side, when converting Python code to TLA+ it would often try to translate things like &lt;code&gt;sleep&lt;/code&gt; into some part of the spec, not recognizing that it is abstractable into a distinct action. I didn't test other possible misconceptions, like converting randomness to nondeterminism.&lt;/p&gt;
    &lt;p&gt;For the record, when converting TLA+ to Python Claude tended to make simulators of the spec, rather than possible production code implementing the spec. I really wasn't expecting otherwise though.&lt;/p&gt;
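    &lt;p&gt;To show what the &lt;code&gt;pc&lt;/code&gt; abstraction is doing, here is a small Python sketch of the nonatomic counter described above. It is deliberately a simulator of the spec, not production code. The state layout and the names &lt;code&gt;next_states&lt;/code&gt; and &lt;code&gt;final_counters&lt;/code&gt; are my own assumptions. It explores every interleaving of two processes, each running a "Get" step and then an "Inc" step.&lt;/p&gt;

```python
def next_states(state):
    """Yield every state reachable in one step from (counter, pcs, tmps)."""
    counter, pcs, tmps = state
    for p in range(len(pcs)):
        if pcs[p] == "Get":
            # Process p reads the shared counter into its local copy.
            yield (counter,
                   pcs[:p] + ("Inc",) + pcs[p + 1:],
                   tmps[:p] + (counter,) + tmps[p + 1:])
        elif pcs[p] == "Inc":
            # Process p writes back its (possibly stale) local copy plus one.
            yield (tmps[p] + 1,
                   pcs[:p] + ("Done",) + pcs[p + 1:],
                   tmps)

def final_counters(num_procs=2):
    """Explore all interleavings and collect the counter in terminal states."""
    init = (0, ("Get",) * num_procs, (0,) * num_procs)
    seen, frontier, finals = {init}, [init], set()
    while frontier:
        state = frontier.pop()
        successors = list(next_states(state))
        if not successors:          # every process is "Done"
            finals.add(state[0])
        for s in successors:
            if s not in seen:
                seen.add(s)
                frontier.append(s)
    return finals

print(sorted(final_counters()))  # [1, 2]
```

    &lt;p&gt;The final value 1 is the lost update: both processes read 0 before either writes. Here &lt;code&gt;pc&lt;/code&gt; exists only to say which step each process is on, which is why it has no natural counterpart in an ordinary Python implementation.&lt;/p&gt;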
    &lt;h2&gt;Unexplored Applications&lt;/h2&gt;
    &lt;p&gt;Things I haven't explored thoroughly but could possibly be effective, based on what I know about TLA+ and AI:&lt;/p&gt;
    &lt;h3&gt;Writing Java Overrides&lt;/h3&gt;
    &lt;p&gt;Most TLA+ operators are resolved via TLA+ interpreters, but you can also implement them in "native" Java. This lets you escape the standard language semantics and add capabilities like &lt;a href="https://github.com/tlaplus/CommunityModules/blob/master/modules/IOUtils.tla" target="_blank"&gt;executing programs during model-checking&lt;/a&gt; or &lt;a href="https://github.com/tlaplus/tlaplus/blob/master/tlatools/org.lamport.tlatools/src/tla2sany/StandardModules/TLC.tla#L62" target="_blank"&gt;dynamically constrain the depth of the searched state space&lt;/a&gt;. There's a lot of cool things I think would be possible with overrides. The problem is there's only a handful of people in the world who know how to write them. But that handful have written quite a few overrides and I think there's enough there for Claude to work with. &lt;/p&gt;
    &lt;h3&gt;Writing specs, given a reference mechanism&lt;/h3&gt;
    &lt;p&gt;In all my experiments, the LLM only had my prompts and the occasional Python script as information. That makes me suspect that some of its problems with writing and fixing specs come down to not having a system model. Maybe it wouldn't suggest fixes like "these processes never race" if it had a design doc saying that the processes can't coordinate. &lt;/p&gt;
    &lt;p&gt;(Could a Sufficiently Powerful LLM derive some TLA+ specification from a design document?)&lt;/p&gt;
    &lt;h3&gt;Connecting specs and code&lt;/h3&gt;
    &lt;p&gt;This is the holy grail of TLA+: taking a codebase and showing it correctly implements a spec. Currently the best ways to do this are by either using TLA+ to generate a test suite, or by taking logged production traces and matching them to TLA+ behaviors. &lt;a href="https://www.mongodb.com/blog/post/engineering/conformance-checking-at-mongodb-testing-our-code-matches-our-tla-specs" target="_blank"&gt;This blog post discusses both&lt;/a&gt;. While I've seen a lot of academic research into these approaches there are no industry-ready tools. So if you want trace validation you have to do a lot of manual labour tailored to your specific product. &lt;/p&gt;
    &lt;p&gt;If LLMs could do some of this work for us then that'd really amplify the usefulness of TLA+ to many companies.&lt;/p&gt;
    &lt;h2&gt;Thoughts&lt;/h2&gt;
    &lt;p&gt;&lt;em&gt;Right now&lt;/em&gt;, agents seem good at the tedious and routine parts of TLA+ and worse at the strategic and abstraction parts. But, since the routine parts are often a huge barrier to beginners, this means that LLMs have the potential to make TLA+ far, far more accessible than it previously was.&lt;/p&gt;
    &lt;p&gt;I have mixed thoughts on this. As an &lt;em&gt;advocate&lt;/em&gt;, this is incredible. I want more people using formal specifications because I believe it leads to cheaper, safer, more reliable software. Anything that gets people comfortable with specs is great for our industry. As a &lt;em&gt;professional TLA+ consultant&lt;/em&gt;, I'm worried that this obsoletes me. Most of my income comes from training and coaching, which companies will have far less demand for now. Then again, maybe this is an opportunity to pitch "agentic TLA+ training" to companies!&lt;/p&gt;
    &lt;p&gt;Anyway, if you're interested in TLA+, there has never been a better time to try it. I mean it, these tools handle so much of the hard part now. I've got a &lt;a href="https://learntla.com/" target="_blank"&gt;free book available online&lt;/a&gt;, as does &lt;a href="https://lamport.azurewebsites.net/tla/book.html" target="_blank"&gt;the inventor of TLA+&lt;/a&gt;. I like &lt;a href="https://elliotswart.github.io/pragmaticformalmodeling/" target="_blank"&gt;this guide too&lt;/a&gt;. Happy modeling!&lt;/p&gt;
    &lt;div class="footnote"&gt;
    &lt;hr/&gt;
    &lt;ol&gt;
    &lt;li id="fn:dayenu"&gt;
    &lt;p&gt;Dayenu. &lt;a class="footnote-backref" href="#fnref:dayenu" title="Jump back to footnote 1 in the text"&gt;↩&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;/ol&gt;
    &lt;/div&gt;</description>
                <pubDate>Thu, 05 Jun 2025 14:59:11 +0000</pubDate>
                <guid>https://buttondown.com/hillelwayne/archive/ai-is-a-gamechanger-for-tla-users/</guid>
            </item>
            <item>
                <title>What does "Undecidable" mean, anyway</title>
                <link>https://buttondown.com/hillelwayne/archive/what-does-undecidable-mean-anyway/</link>
                <description>&lt;h3&gt;Systems Distributed&lt;/h3&gt;
    &lt;p&gt;I'll be speaking at &lt;a href="https://systemsdistributed.com/" target="_blank"&gt;Systems Distributed&lt;/a&gt; next month! The talk is brand new and will aim to showcase some of the formal methods mental models that would be useful in mainstream software development. It has added some extra stress on my schedule, though, so expect the next two monthly releases of &lt;em&gt;Logic for Programmers&lt;/em&gt; to be mostly minor changes.&lt;/p&gt;
    &lt;h2&gt;What does "Undecidable" mean, anyway&lt;/h2&gt;
    &lt;p&gt;Last week I read &lt;a href="https://liamoc.net/forest/loc-000S/index.xml" target="_blank"&gt;Against Curry-Howard Mysticism&lt;/a&gt;, which is a solid article I recommend reading. But this newsletter is actually about &lt;a href="https://lobste.rs/s/n0whur/against_curry_howard_mysticism#c_lbts57" target="_blank"&gt;one comment&lt;/a&gt;:&lt;/p&gt;
    &lt;blockquote&gt;
    &lt;p&gt;I like to see posts like this because I often feel like I can’t tell the difference between BS and a point I’m missing. Can we get one for questions like “Isn’t XYZ (Undecidable|NP-Complete|PSPACE-Complete)?” &lt;/p&gt;
    &lt;/blockquote&gt;
    &lt;p&gt;I've already written one of these for &lt;a href="https://www.hillelwayne.com/post/np-hard/" target="_blank"&gt;NP-complete&lt;/a&gt;, so let's do one for "undecidable". Step one is to pull a technical definition from the book &lt;a href="https://link.springer.com/book/10.1007/978-1-4612-1844-9" target="_blank"&gt;&lt;em&gt;Automata and Computability&lt;/em&gt;&lt;/a&gt;:&lt;/p&gt;
    &lt;blockquote&gt;
    &lt;p&gt;A property P of strings is said to be decidable if ... there is a total Turing machine that accepts input strings that have property P and rejects those that do not. (pg 220)&lt;/p&gt;
    &lt;/blockquote&gt;
    &lt;p&gt;Step two is to translate the technical computer science definition into more conventional programmer terms. Warning, because this is a newsletter and not a blog post, I might be a little sloppy with terms.&lt;/p&gt;
    &lt;h3&gt;Machines and Decision Problems&lt;/h3&gt;
    &lt;p&gt;In automata theory, all inputs to a "program" are strings of characters, and all outputs are "true" or "false". A program "accepts" a string if it outputs "true", and "rejects" if it outputs "false". You can think of this as automata studying all pure functions of type &lt;code&gt;f :: string -&amp;gt; boolean&lt;/code&gt;. Problems solvable by finding such an &lt;code&gt;f&lt;/code&gt; are called "decision problems".&lt;/p&gt;
    &lt;p&gt;This covers more than you'd think, because we can bootstrap more powerful functions from these. First, as anyone who's programmed in bash knows, strings can represent any other data. Second, we can fake non-boolean outputs by instead checking if a certain computation gives a certain result. For example, I can reframe the function &lt;code&gt;add(x, y) = x + y&lt;/code&gt; as a decision problem like this:&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;IS_SUM(str) {
        x, y, z = split(str, "#")
        return x + y == z
    }
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;Then because &lt;code&gt;IS_SUM("2#3#5")&lt;/code&gt; returns true, we know &lt;code&gt;2 + 3 == 5&lt;/code&gt;, while &lt;code&gt;IS_SUM("2#3#6")&lt;/code&gt; is false. Since we can bootstrap parameters out of strings, I'll just say it's &lt;code&gt;IS_SUM(x, y, z)&lt;/code&gt; going forward.&lt;/p&gt;
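    &lt;p&gt;For anyone who wants to run it, here is a minimal Python version of the same idea. The name &lt;code&gt;is_sum&lt;/code&gt; and the choice to reject malformed strings (rather than raise) are my assumptions. The pseudocode above doesn't say what happens on bad input, but a decision procedure has to answer true or false for every string.&lt;/p&gt;

```python
def is_sum(s):
    """Accept strings of the form "x#y#z" where x + y == z; reject the rest."""
    parts = s.split("#")
    if len(parts) != 3:
        return False  # wrong shape: reject instead of crashing
    try:
        x, y, z = (int(p) for p in parts)
    except ValueError:
        return False  # a field was not an integer
    return x + y == z

print(is_sum("2#3#5"))  # True
print(is_sum("2#3#6"))  # False
```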
    &lt;p&gt;A big part of automata theory is studying different models of computation with different strengths. One of the weakest is called &lt;a href="https://en.wikipedia.org/wiki/Deterministic_finite_automaton" target="_blank"&gt;"DFA"&lt;/a&gt;. I won't go into any details about what DFA actually can do, but the important thing is that it &lt;em&gt;can't&lt;/em&gt; solve &lt;code&gt;IS_SUM&lt;/code&gt;. That is, if you give me a DFA that takes inputs of form &lt;code&gt;x#y#z&lt;/code&gt;, I can always find an input where the DFA returns true when &lt;code&gt;x + y != z&lt;/code&gt;, &lt;em&gt;or&lt;/em&gt; an input which returns false when &lt;code&gt;x + y == z&lt;/code&gt;.&lt;/p&gt;
    &lt;p&gt;It's really important to keep this model of "solve" in mind: a program solves a problem if it correctly returns true on all true inputs and correctly returns false on all false inputs.&lt;/p&gt;
    &lt;h3&gt;(total) Turing Machines&lt;/h3&gt;
    &lt;p&gt;A Turing Machine (TM) is a particular type of computation model. It's important for two reasons: &lt;/p&gt;
    &lt;ol&gt;
    &lt;li&gt;
    &lt;p&gt;By the &lt;a href="https://en.wikipedia.org/wiki/Church%E2%80%93Turing_thesis" target="_blank"&gt;Church-Turing thesis&lt;/a&gt;, a Turing Machine is the "upper bound" of how powerful (physically realizable) computational models can get. This means that if an actual real-world programming language can solve a particular decision problem, so can a TM. Conversely, if the TM &lt;em&gt;can't&lt;/em&gt; solve it, neither can the programming language.&lt;sup id="fnref:caveat"&gt;&lt;a class="footnote-ref" href="#fn:caveat"&gt;1&lt;/a&gt;&lt;/sup&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li&gt;
    &lt;p&gt;It's possible to write a Turing machine that takes &lt;em&gt;a textual representation of another Turing machine&lt;/em&gt; as input, and then simulates that Turing machine as part of its computations. &lt;/p&gt;
    &lt;/li&gt;
    &lt;/ol&gt;
    &lt;p&gt;Property (1) means that we can move between different computational models of equal strength, proving things about one to learn things about another. That's why I'm able to write &lt;code&gt;IS_SUM&lt;/code&gt; in a pseudocode instead of writing it in terms of the TM computational model (and why I was able to use &lt;code&gt;split&lt;/code&gt; for convenience). &lt;/p&gt;
    &lt;p&gt;Property (2) does several interesting things. First of all, it makes it possible to compose Turing machines. Here's how I can roughly ask if a given number is the sum of two primes, with "just" addition and boolean functions:&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;IS_SUM_TWO_PRIMES(z):
        x := 1
        y := 1
        loop {
            if x &amp;gt; z {return false}
            if IS_PRIME(x) {
                if IS_PRIME(y) {
                    if IS_SUM(x, y, z) {
                        return true;
                    }
                }
            }
            y := y + 1
            if y &amp;gt; x {
                x := x + 1
                y := 0
            }
        }
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;Notice that without the &lt;code&gt;if x &amp;gt; z {return false}&lt;/code&gt;, the program would loop forever on &lt;code&gt;z=2&lt;/code&gt;. A TM that always halts for all inputs is called &lt;strong&gt;total&lt;/strong&gt;.&lt;/p&gt;
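    &lt;p&gt;Here is one way the same composition might look in Python. This is a sketch rather than a translation: I've swapped the open-ended &lt;code&gt;loop&lt;/code&gt; for two bounded &lt;code&gt;for&lt;/code&gt; loops, and &lt;code&gt;is_prime&lt;/code&gt; is a simple trial-division stand-in that assumes non-negative inputs. Because both loops are bounded by &lt;code&gt;z&lt;/code&gt;, the function returns on every input, which is what makes it total.&lt;/p&gt;

```python
import math

def is_prime(n):
    """Trial division; assumes n is a non-negative integer."""
    if n in (0, 1):
        return False
    return all(n % d != 0 for d in range(2, math.isqrt(n) + 1))

def is_sum_two_primes(z):
    """Decide whether z can be written as x + y with x and y both prime."""
    # Neither x nor y can exceed z, so searching up to z is enough
    # and guarantees the search terminates.
    for x in range(2, z + 1):
        for y in range(2, x + 1):
            if is_prime(x) and is_prime(y) and x + y == z:
                return True
    return False

print(is_sum_two_primes(2))   # False
print(is_sum_two_primes(10))  # True
```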
    &lt;p&gt;Property (2) also makes "Turing machines" a possible input to functions, meaning that we can now make decision problems about the behavior of Turing machines. For example, "does the TM &lt;code&gt;M&lt;/code&gt; either accept or reject &lt;code&gt;x&lt;/code&gt; within ten steps?"&lt;sup id="fnref:backticks"&gt;&lt;a class="footnote-ref" href="#fn:backticks"&gt;2&lt;/a&gt;&lt;/sup&gt;&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;IS_DONE_IN_TEN_STEPS(M, x) {
        for (i = 0; i &amp;lt; 10; i++) {
            `simulate M(x) for one step`
            if(`M accepted or rejected`) {
                return true
            }
        }
        return false
    }
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
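    &lt;p&gt;The backticked parts can be made concrete if we pick a representation for machines. As a rough sketch, assume a "machine" is a Python generator function: every &lt;code&gt;yield&lt;/code&gt; marks the end of one step, and returning means it accepted or rejected. That representation, and the names &lt;code&gt;done_within&lt;/code&gt;, &lt;code&gt;countdown&lt;/code&gt;, and &lt;code&gt;loop_forever&lt;/code&gt;, are assumptions for illustration and not real Turing machine encodings.&lt;/p&gt;

```python
def done_within(machine, x, max_steps=10):
    """True if machine(x) finishes during its first max_steps resumptions."""
    run = machine(x)
    for _ in range(max_steps):
        try:
            next(run)          # simulate the machine for one step
        except StopIteration:
            return True        # the machine accepted or rejected
    return False               # still running after max_steps steps

def countdown(n):
    """Takes about n steps, then accepts."""
    while n != 0:
        n -= 1
        yield
    return True

def loop_forever(_x):
    """Never halts."""
    while True:
        yield

print(done_within(countdown, 3))      # True
print(done_within(countdown, 50))     # False
print(done_within(loop_forever, 0))   # False
```

    &lt;p&gt;The important part is that &lt;code&gt;done_within&lt;/code&gt; itself always returns, even when the machine it simulates never does.&lt;/p&gt;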
    &lt;h3&gt;Decidability and Undecidability&lt;/h3&gt;
    &lt;p&gt;Now we have all of the pieces to understand our original definition:&lt;/p&gt;
    &lt;blockquote&gt;
    &lt;p&gt;A property P of strings is said to be decidable if ... there is a total Turing machine that accepts input strings that have property P and rejects those that do not. (220)&lt;/p&gt;
    &lt;/blockquote&gt;
    &lt;p&gt;Let &lt;code&gt;IS_P&lt;/code&gt; be the decision problem "Does the input satisfy P"? Then &lt;code&gt;IS_P&lt;/code&gt; is decidable if it can be solved by a Turing machine, ie, I can provide some &lt;code&gt;IS_P(x)&lt;/code&gt; machine that &lt;em&gt;always&lt;/em&gt; accepts if &lt;code&gt;x&lt;/code&gt; has property P, and always rejects if &lt;code&gt;x&lt;/code&gt; doesn't have property P. If I can't do that, then &lt;code&gt;IS_P&lt;/code&gt; is undecidable. &lt;/p&gt;
    &lt;p&gt;&lt;code&gt;IS_SUM(x, y, z)&lt;/code&gt; and &lt;code&gt;IS_DONE_IN_TEN_STEPS(M, x)&lt;/code&gt; are decidable properties. Is &lt;code&gt;IS_SUM_TWO_PRIMES(z)&lt;/code&gt; decidable? Some analysis shows that our corresponding program will either find a solution, or have &lt;code&gt;x&amp;gt;z&lt;/code&gt; and return false. So yes, it is decidable.&lt;/p&gt;
    &lt;p&gt;Notice there's an asymmetry here. To prove some property is decidable, I just need to find &lt;em&gt;one&lt;/em&gt; program that correctly solves it. To prove some property is undecidable, I need to show that any possible program, no matter what it is, doesn't solve it.&lt;/p&gt;
    &lt;p&gt;So with that asymmetry in mind, are there &lt;em&gt;any&lt;/em&gt; undecidable problems? Yes, quite a lot. Recall that Turing machines can accept encodings of other TMs as input, meaning we can write a TM that checks &lt;em&gt;properties of Turing machines&lt;/em&gt;. And, by &lt;a href="https://en.wikipedia.org/wiki/Rice%27s_theorem" target="_blank"&gt;Rice's Theorem&lt;/a&gt;, almost every nontrivial semantic&lt;sup id="fnref:nontrivial"&gt;&lt;a class="footnote-ref" href="#fn:nontrivial"&gt;3&lt;/a&gt;&lt;/sup&gt; property of Turing machines is undecidable. The conventional way to prove this is to first find a single undecidable property &lt;code&gt;H&lt;/code&gt;, and then use that to bootstrap undecidability of other properties.&lt;/p&gt;
    &lt;p&gt;The canonical and most famous example of an undecidable problem is the &lt;a href="https://en.wikipedia.org/wiki/Halting_problem" target="_blank"&gt;Halting problem&lt;/a&gt;: "does machine M halt on input i?" It's pretty easy to prove undecidable, and easy to use it to bootstrap other undecidability properties. But again, &lt;em&gt;any&lt;/em&gt; nontrivial property is undecidable. Checking a TM is total is undecidable. Checking a TM accepts &lt;em&gt;any&lt;/em&gt; inputs is undecidable. Checking a TM solves &lt;code&gt;IS_SUM&lt;/code&gt; is undecidable. Etc etc etc.&lt;/p&gt;
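    &lt;p&gt;The core of the halting proof fits in a few lines. What follows is a simplified sketch, not the formal argument: zero-argument Python functions stand in for "machine plus input", and &lt;code&gt;make_counterexample&lt;/code&gt; and &lt;code&gt;always_says_loops&lt;/code&gt; are names I made up. Given &lt;em&gt;any&lt;/em&gt; claimed halt detector, we build a program that asks the detector about itself and then does the opposite.&lt;/p&gt;

```python
def make_counterexample(halts):
    """Build a program that the claimed decider `halts` must misjudge."""
    def contrary():
        if halts(contrary):
            while True:       # decider said "halts", so loop forever
                pass
        return "halted"       # decider said "loops", so halt immediately
    return contrary

def always_says_loops(program):
    """A (bad) candidate decider that claims nothing ever halts."""
    return False

g = make_counterexample(always_says_loops)
print(always_says_loops(g), g())  # False halted
```

    &lt;p&gt;The candidate said &lt;code&gt;g&lt;/code&gt; doesn't halt, and &lt;code&gt;g&lt;/code&gt; halted. A candidate that answers true instead would send &lt;code&gt;g&lt;/code&gt; into the infinite loop (so don't run that case), and it would be wrong in the other direction. Either way the candidate misjudges at least one program.&lt;/p&gt;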
    &lt;h3&gt;What this doesn't mean in practice&lt;/h3&gt;
    &lt;p&gt;I often see the halting problem misconstrued as "it's impossible to tell if a program will halt before running it." &lt;strong&gt;This is wrong&lt;/strong&gt;. The halting problem says that we cannot create an algorithm that, when applied to an arbitrary program, tells us whether the program will halt or not. It is absolutely possible to tell if many programs will halt or not. It's possible to find entire subcategories of programs that are guaranteed to halt. It's possible to say "a program constructed following constraints XYZ is guaranteed to halt." &lt;/p&gt;
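    &lt;p&gt;As an illustration of "constraints XYZ", here is a rough sketch of a checker for a tiny subset of Python: assignments, arithmetic, &lt;code&gt;if&lt;/code&gt;, and &lt;code&gt;for&lt;/code&gt; loops over &lt;code&gt;range(...)&lt;/code&gt;, with no &lt;code&gt;while&lt;/code&gt;, no function definitions, and no calls other than &lt;code&gt;range&lt;/code&gt;. The whitelist and the name &lt;code&gt;provably_halts&lt;/code&gt; are my own choices. Programs in this subset always halt (possibly with an error, and possibly after a very long time). For everything else the checker answers "don't know", not "doesn't halt".&lt;/p&gt;

```python
import ast

# Node types allowed in the restricted subset (an illustrative whitelist).
SAFE_NODES = (
    ast.Module, ast.Assign, ast.AugAssign, ast.Expr, ast.If, ast.For, ast.Pass,
    ast.Name, ast.Constant, ast.BinOp, ast.UnaryOp, ast.BoolOp, ast.Compare,
    ast.operator, ast.unaryop, ast.boolop, ast.cmpop, ast.expr_context,
)

def is_range_call(node):
    """True for a plain call like range(a), range(a, b) or range(a, b, c)."""
    return (isinstance(node, ast.Call)
            and isinstance(node.func, ast.Name)
            and node.func.id == "range"
            and not node.keywords)

def provably_halts(source):
    """Return True if source is in the safe subset, else None (unknown)."""
    def ok(node):
        if is_range_call(node):
            return all(ok(arg) for arg in node.args)
        if not isinstance(node, SAFE_NODES):
            return False
        return all(ok(child) for child in ast.iter_child_nodes(node))
    return True if ok(ast.parse(source)) else None

print(provably_halts("total = 0\nfor i in range(10):\n    total += i\n"))  # True
print(provably_halts("while True:\n    pass\n"))                           # None
```

    &lt;p&gt;This doesn't contradict the halting problem, because the checker is allowed to give up. It never has to classify an arbitrary program.&lt;/p&gt;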
    &lt;p&gt;The actual consequence of undecidability is more subtle. If we want to know if a program has property P, undecidability tells us&lt;/p&gt;
    &lt;ol&gt;
    &lt;li&gt;We will have to spend time and mental effort to determine if it has P&lt;/li&gt;
    &lt;li&gt;We may not be successful.&lt;/li&gt;
    &lt;/ol&gt;
    &lt;p&gt;This is subtle because we're so used to living in a world where everything's undecidable that we don't really consider what the counterfactual would be like. In such a world there might be no need for Rust, because "does this C program guarantee memory-safety" is a decidable property. The entire field of formal verification could be unnecessary, as we could just check properties of arbitrary programs directly. We could automatically check if a change in a program preserves all existing behavior. Lots of famous math problems could be solved overnight. &lt;/p&gt;
    &lt;p&gt;(This to me is a strong "intuitive" argument for why the halting problem is undecidable: a halt detector can be trivially repurposed as a program optimizer / theorem-prover / bcrypt cracker / chess engine. It's &lt;em&gt;too powerful&lt;/em&gt;, so we should expect it to be impossible.)&lt;/p&gt;
    &lt;p&gt;But because we don't live in that world, all of those things are hard problems that take effort and ingenuity to solve, and even then we often fail.&lt;/p&gt;
    &lt;h3&gt;Update for the Internet&lt;/h3&gt;
    &lt;p&gt;This was sent as a weekly newsletter, which is usually on topics like &lt;a href="https://buttondown.com/hillelwayne/archive/why-do-we-call-it-boilerplate-code" target="_blank"&gt;software history&lt;/a&gt;, &lt;a href="https://buttondown.com/hillelwayne/archive/the-seven-specification-ur-languages/" target="_blank"&gt;formal methods&lt;/a&gt;, &lt;a href="https://buttondown.com/hillelwayne/archive/i-formally-modeled-dreidel-for-no-good-reason/" target="_blank"&gt;unusual technologies&lt;/a&gt;, and the &lt;a href="https://buttondown.com/hillelwayne/archive/be-suspicious-of-success/" target="_blank"&gt;theory of software engineering&lt;/a&gt;. You &lt;a href="https://buttondown.email/hillelwayne/" target="_blank"&gt;can subscribe here&lt;/a&gt;.&lt;/p&gt;
    &lt;div class="footnote"&gt;
    &lt;hr/&gt;
    &lt;ol&gt;
    &lt;li id="fn:caveat"&gt;
    &lt;p&gt;To be pedantic, a TM can't do things like "scrape a webpage" or "render a bitmap", but we're only talking about computational decision problems here. &lt;a class="footnote-backref" href="#fnref:caveat" title="Jump back to footnote 1 in the text"&gt;↩&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id="fn:backticks"&gt;
    &lt;p&gt;One notation I've adopted in &lt;em&gt;Logic for Programmers&lt;/em&gt; is marking abstract sections of pseudocode with backticks. It's really handy! &lt;a class="footnote-backref" href="#fnref:backticks" title="Jump back to footnote 2 in the text"&gt;↩&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id="fn:nontrivial"&gt;
    &lt;p&gt;Nontrivial meaning "at least one TM has this property and at least one TM doesn't have this property". Semantic meaning "related to whether the TM accepts, rejects, or runs forever on a class of inputs". &lt;code&gt;IS_DONE_IN_TEN_STEPS&lt;/code&gt; is &lt;em&gt;not&lt;/em&gt; a semantic property, as it doesn't tell us anything about inputs that take longer than ten steps. &lt;a class="footnote-backref" href="#fnref:nontrivial" title="Jump back to footnote 3 in the text"&gt;↩&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;/ol&gt;
    &lt;/div&gt;</description>
                <pubDate>Wed, 28 May 2025 19:34:02 +0000</pubDate>
                <guid>https://buttondown.com/hillelwayne/archive/what-does-undecidable-mean-anyway/</guid>
            </item>
            <item>
                <title>Finding hard 24 puzzles with planner programming</title>
                <link>https://buttondown.com/hillelwayne/archive/finding-hard-24-puzzles-with-planner-programming/</link>
                <description>&lt;p&gt;&lt;strong&gt;Planner programming&lt;/strong&gt; is a programming technique where you solve problems by providing a goal and actions, and letting the planner find actions that reach the goal. In a previous edition of &lt;a href="https://leanpub.com/logic/" target="_blank"&gt;Logic for Programmers&lt;/a&gt;, I demonstrated how this worked by solving the 
    &lt;a href="https://en.wikipedia.org/wiki/24_(puzzle)" target="_blank"&gt;24 puzzle&lt;/a&gt; with planning. For &lt;a href="https://buttondown.com/hillelwayne/archive/a48fce5b-8a05-4302-b620-9b26f057f145/" target="_blank"&gt;reasons discussed here&lt;/a&gt; I replaced that example with something more practical (orchestrating deployments), but left the &lt;a href="https://github.com/logicforprogrammers/book-assets/tree/master/code/chapter-misc" target="_blank"&gt;code online&lt;/a&gt; for posterity.&lt;/p&gt;
    &lt;p&gt;Recently I saw a family member try and fail to vibe code a tool that would find all valid 24 puzzles, and realized I could adapt the puzzle solver to also be a puzzle generator. First I'll explain the puzzle rules, then the original solver, then the generator.&lt;sup id="fnref:complex"&gt;&lt;a class="footnote-ref" href="#fn:complex"&gt;1&lt;/a&gt;&lt;/sup&gt; For a much longer intro to planning, see &lt;a href="https://www.hillelwayne.com/post/picat/" target="_blank"&gt;here&lt;/a&gt;.&lt;/p&gt;
    &lt;h3&gt;The rules of 24&lt;/h3&gt;
    &lt;p&gt;You're given four numbers and have to find some elementary equation (&lt;code&gt;+-*/&lt;/code&gt;+groupings) that uses all four numbers and results in 24. Each number must be used exactly once, but they do not need to be used in the starting puzzle order. Some examples:&lt;/p&gt;
    &lt;ul&gt;
    &lt;li&gt;&lt;code&gt;[6, 6, 6, 6]&lt;/code&gt; -&amp;gt; &lt;code&gt;6+6+6+6=24&lt;/code&gt;&lt;/li&gt;
    &lt;li&gt;&lt;code&gt;[1, 1, 6, 6]&lt;/code&gt; -&amp;gt; &lt;code&gt;(6+6)*(1+1)=24&lt;/code&gt;&lt;/li&gt;
    &lt;li&gt;&lt;code&gt;[4, 4, 4, 5]&lt;/code&gt; -&amp;gt; &lt;code&gt;4*(5+4/4)=24&lt;/code&gt;&lt;/li&gt;
    &lt;/ul&gt;
    &lt;p&gt;Some setups are impossible, like &lt;code&gt;[1, 1, 1, 1]&lt;/code&gt;. Others are possible only with non-elementary operations, like &lt;code&gt;[1, 5, 5, 324]&lt;/code&gt; (which requires exponentiation).&lt;/p&gt;
    &lt;h3&gt;The solver&lt;/h3&gt;
    &lt;p&gt;We will use &lt;a href="http://picat-lang.org/" target="_blank"&gt;Picat&lt;/a&gt;, the only language that I know has a built-in planner module. The current state of our plan will be represented by a single list with all of the numbers.&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="s s-Atom"&gt;import&lt;/span&gt; &lt;span class="s s-Atom"&gt;planner&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;math&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;
    &lt;span class="s s-Atom"&gt;import&lt;/span&gt; &lt;span class="s s-Atom"&gt;cp&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;
    
    &lt;span class="nf"&gt;action&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;S0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;S1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;Action&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;Cost&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="s s-Atom"&gt;?=&amp;gt;&lt;/span&gt;
      &lt;span class="nf"&gt;member&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;X&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;S0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
      &lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;S0&lt;/span&gt; &lt;span class="s s-Atom"&gt;:=&lt;/span&gt; &lt;span class="nf"&gt;delete&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;S0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;X&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="c1"&gt;% , is `and`&lt;/span&gt;
      &lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nf"&gt;member&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;Y&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;S0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
      &lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;S0&lt;/span&gt; &lt;span class="s s-Atom"&gt;:=&lt;/span&gt; &lt;span class="nf"&gt;delete&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;S0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;Y&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
      &lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
          &lt;span class="nv"&gt;A&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="err"&gt;$&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;X&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="nv"&gt;Y&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; 
        &lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="nv"&gt;A&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="err"&gt;$&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;X&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="nv"&gt;Y&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="nv"&gt;A&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="err"&gt;$&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;X&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="nv"&gt;Y&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="nv"&gt;A&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="err"&gt;$&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;X&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="nv"&gt;Y&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="nv"&gt;Y&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;
        &lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;S1&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nv"&gt;S0&lt;/span&gt; &lt;span class="s s-Atom"&gt;++&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nf"&gt;apply&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;A&lt;/span&gt;&lt;span class="p"&gt;)]&lt;/span&gt;
      &lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;Action&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nv"&gt;A&lt;/span&gt;
      &lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;Cost&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;
      &lt;span class="p"&gt;.&lt;/span&gt;
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;This is our "action", and it works in three steps:&lt;/p&gt;
    &lt;ol&gt;
    &lt;li&gt;Nondeterministically pull two different values out of the input, deleting them&lt;/li&gt;
    &lt;li&gt;Nondeterministically pick one of the basic operations&lt;/li&gt;
    &lt;li&gt;The new state is the remaining elements, appended with that operation applied to our two picks.&lt;/li&gt;
    &lt;/ol&gt;
    &lt;p&gt;Let's walk through this with &lt;code&gt;[1, 6, 1, 7]&lt;/code&gt;. There are four choices for &lt;code&gt;X&lt;/code&gt; and three for &lt;code&gt;Y&lt;/code&gt;. If the planner chooses &lt;code&gt;X=6&lt;/code&gt; and &lt;code&gt;Y=7&lt;/code&gt;, then &lt;code&gt;A = $(6 + 7)&lt;/code&gt;. This is an uncomputed term in the same way lisps might use quotation. We can resolve the computation with &lt;code&gt;apply&lt;/code&gt;, as in the line &lt;code&gt;S1 = S0 ++ [apply(A)]&lt;/code&gt;.&lt;/p&gt;
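    &lt;p&gt;For readers who don't know Picat, here is a rough Python sketch of the same three steps. It is an illustration, not the Picat code: &lt;code&gt;successors&lt;/code&gt; is a made-up name, the nondeterministic choices become explicit loops, and the quoted term &lt;code&gt;$(X + Y)&lt;/code&gt; is approximated by a plain tuple.&lt;/p&gt;

```python
from fractions import Fraction
from itertools import permutations


def successors(state):
    """Yield (action, next_state) pairs, one per choice the planner could make."""
    # Step 1: pick X and Y from two different positions and remove both.
    for i, j in permutations(range(len(state)), 2):
        x, y = state[i], state[j]
        rest = [v for k, v in enumerate(state) if k not in (i, j)]
        # Step 2: pick one of the basic operations.
        candidates = [("+", x + y), ("-", x - y), ("*", x * y)]
        if y > 0:  # same guard as the Picat action
            candidates.append(("/", Fraction(x) / y))
        # Step 3: the new state is the remaining elements plus the result.
        for op, value in candidates:
            yield (x, op, y), rest + [value]


for action, next_state in successors([1, 6, 1, 7]):
    if action == (6, "+", 7):
        print(action, next_state)
```

    &lt;p&gt;Division uses &lt;code&gt;Fraction&lt;/code&gt; so results stay exact, which is a choice of this sketch rather than something the Picat version does.&lt;/p&gt;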
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="nf"&gt;final&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="nv"&gt;N&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt; &lt;span class="s s-Atom"&gt;=&amp;gt;&lt;/span&gt;
      &lt;span class="nv"&gt;N&lt;/span&gt; &lt;span class="o"&gt;=:=&lt;/span&gt; &lt;span class="mf"&gt;24.&lt;/span&gt; &lt;span class="c1"&gt;% handle floating point&lt;/span&gt;
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;Our final goal is just a list where the only element is 24. This has to be a little floating point-sensitive to handle floating point division, which is what &lt;code&gt;=:=&lt;/code&gt; does.&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="s s-Atom"&gt;main&lt;/span&gt; &lt;span class="s s-Atom"&gt;=&amp;gt;&lt;/span&gt;
      &lt;span class="nv"&gt;Start&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;6&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
      &lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nf"&gt;best_plan&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;Start&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;Plan&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
      &lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nf"&gt;printf&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;"%w %w%n"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;Start&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;Plan&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
      &lt;span class="p"&gt;.&lt;/span&gt;
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;For &lt;code&gt;main&lt;/code&gt;, we just find the best plan with the maximum cost of &lt;code&gt;4&lt;/code&gt; and print it. When run from the command line, &lt;code&gt;picat&lt;/code&gt; automatically executes whatever is in &lt;code&gt;main&lt;/code&gt;.&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;$ picat 24.pi
    [1,5,5,6] [1 + 5,5 * 6,30 - 6]
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
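    &lt;p&gt;To make the search itself less magical, here is a minimal Python sketch of what the planner is doing for us. It assumes a plain depth-first search in place of &lt;code&gt;best_plan&lt;/code&gt;; every plan for four numbers has exactly three actions, so there is no cost to minimize. &lt;code&gt;find_plan&lt;/code&gt; is a hypothetical helper, not part of Picat.&lt;/p&gt;

```python
from fractions import Fraction
from itertools import permutations


def find_plan(state, plan=()):
    """Depth-first search for a list of actions that reduces `state` to [24]."""
    if len(state) == 1:
        # Analogue of `final`: Fractions are exact, so plain equality is enough.
        return list(plan) if state[0] == 24 else None
    for i, j in permutations(range(len(state)), 2):
        x, y = state[i], state[j]
        rest = [v for k, v in enumerate(state) if k not in (i, j)]
        steps = [("+", x + y), ("-", x - y), ("*", x * y)]
        if y > 0:
            steps.append(("/", Fraction(x) / y))
        for op, value in steps:
            found = find_plan(rest + [value], plan + (f"{x} {op} {y}",))
            if found is not None:
                return found  # commit to the first plan found
    return None


print(find_plan([1, 5, 5, 6]))
print(find_plan([1, 1, 1, 1]))  # no plan exists, so this prints None
```

    &lt;p&gt;The plan it prints may differ from Picat's, since the two searches need not explore actions in the same order.&lt;/p&gt;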
    &lt;p&gt;I don't want to spoil any more 24 puzzles, so let's stop showing the plan:&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;main =&amp;gt;
    &lt;span class="gd"&gt;- , printf("%w %w%n", Start, Plan)&lt;/span&gt;
    &lt;span class="gi"&gt;+ , printf("%w%n", Start)&lt;/span&gt;
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;h3&gt;Generating puzzles&lt;/h3&gt;
    &lt;p&gt;Picat provides a &lt;code&gt;find_all(X, p(X))&lt;/code&gt; function, which returns all &lt;code&gt;X&lt;/code&gt; for which &lt;code&gt;p(X)&lt;/code&gt; is true. In theory, we could write &lt;code&gt;find_all(S, best_plan(S, 4, _))&lt;/code&gt;. In practice, there are an infinite number of valid puzzles, so we need to bound S somewhat. We also don't want to find any redundant puzzles, such as &lt;code&gt;[6, 6, 6, 4]&lt;/code&gt; and &lt;code&gt;[4, 6, 6, 6]&lt;/code&gt;.&lt;/p&gt;
    &lt;p&gt;We can solve both issues by writing a helper &lt;code&gt;valid24(S)&lt;/code&gt;, which will check that &lt;code&gt;S&lt;/code&gt; is a sorted list of integers within some bounds, like &lt;code&gt;1..8&lt;/code&gt;, and also has a valid solution.&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="nf"&gt;valid24&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;Start&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="s s-Atom"&gt;=&amp;gt;&lt;/span&gt;
      &lt;span class="nv"&gt;Start&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;new_list&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
      &lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;Start&lt;/span&gt; &lt;span class="s s-Atom"&gt;::&lt;/span&gt; &lt;span class="mf"&gt;1..8&lt;/span&gt; &lt;span class="c1"&gt;% every value in 1..8&lt;/span&gt;
      &lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nf"&gt;increasing&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;Start&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="c1"&gt;% sorted ascending&lt;/span&gt;
      &lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nf"&gt;solve&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;Start&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="c1"&gt;% turn into values&lt;/span&gt;
      &lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nf"&gt;best_plan&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;Start&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;Plan&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
      &lt;span class="p"&gt;.&lt;/span&gt;
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;This leans on Picat's constraint solving features to automatically find bounded sorted lists, which is why we need the &lt;code&gt;solve&lt;/code&gt; step.&lt;sup id="fnref:efficiency"&gt;&lt;a class="footnote-ref" href="#fn:efficiency"&gt;2&lt;/a&gt;&lt;/sup&gt; Now we can just loop through all of the values in &lt;code&gt;find_all&lt;/code&gt; to get all solutions:&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="s s-Atom"&gt;main&lt;/span&gt; &lt;span class="s s-Atom"&gt;=&amp;gt;&lt;/span&gt;
      &lt;span class="nf"&gt;foreach&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="nv"&gt;S&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="s s-Atom"&gt;in&lt;/span&gt; &lt;span class="nf"&gt;find_all&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;Start&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
        &lt;span class="nf"&gt;valid24&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;Start&lt;/span&gt;&lt;span class="p"&gt;)))&lt;/span&gt;
        &lt;span class="nf"&gt;printf&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;"%w%n"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;S&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
      &lt;span class="s s-Atom"&gt;end&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;$ picat 24.pi
    
    [1,1,1,8]
    [1,1,2,6]
    [1,1,2,7]
    [1,1,2,8]
    # etc
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
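    &lt;p&gt;The same generate-and-filter idea can be sketched in Python, assuming a brute-force &lt;code&gt;solvable&lt;/code&gt; check (a hypothetical helper) in place of the planner. Here &lt;code&gt;combinations_with_replacement&lt;/code&gt; plays the role of the bounds, &lt;code&gt;increasing&lt;/code&gt;, and &lt;code&gt;solve&lt;/code&gt;: it yields each sorted list of four values in 1..8 exactly once.&lt;/p&gt;

```python
from fractions import Fraction
from itertools import combinations_with_replacement, permutations


def solvable(state):
    """True if some sequence of actions reduces `state` to [24]."""
    if len(state) == 1:
        return state[0] == 24
    for i, j in permutations(range(len(state)), 2):
        x, y = state[i], state[j]
        rest = [v for k, v in enumerate(state) if k not in (i, j)]
        results = [x + y, x - y, x * y]
        if y > 0:
            results.append(Fraction(x) / y)
        if any(solvable(rest + [r]) for r in results):
            return True
    return False


# Each tuple is already sorted ascending, so no puzzle shows up twice.
puzzles = [
    list(c)
    for c in combinations_with_replacement(range(1, 9), 4)
    if solvable(list(c))
]
for puzzle in puzzles[:4]:
    print(puzzle)
```

    &lt;p&gt;This is much less flexible than the Picat version, because each extra constraint on the plan means changing the search itself rather than adding one line.&lt;/p&gt;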
    &lt;h3&gt;Finding hard puzzles&lt;/h3&gt;
    &lt;p&gt;Last Friday I realized I could do something more interesting with this. Once I have found a plan, I can apply further constraints to the plan, for example to find problems that can be solved with division:&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;valid24(Start, Plan) =&amp;gt;
    &lt;span class="w"&gt; &lt;/span&gt; Start = new_list(4)
    &lt;span class="w"&gt; &lt;/span&gt; , Start :: 1..8
    &lt;span class="w"&gt; &lt;/span&gt; , increasing(Start)
    &lt;span class="w"&gt; &lt;/span&gt; , solve(Start)
    &lt;span class="w"&gt; &lt;/span&gt; , best_plan(Start, 4, Plan)
    &lt;span class="gi"&gt;+ , member($(_ / _), Plan)&lt;/span&gt;
    &lt;span class="w"&gt; &lt;/span&gt; .
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;In playing with this, though, I noticed something weird: there are some solutions that appear if I sort &lt;em&gt;up&lt;/em&gt; but not &lt;em&gt;down&lt;/em&gt;. For example, &lt;code&gt;[3,3,4,5]&lt;/code&gt; appears in the solution set, but &lt;code&gt;[5, 4, 3, 3]&lt;/code&gt; doesn't appear if I replace &lt;code&gt;increasing&lt;/code&gt; with &lt;code&gt;decreasing&lt;/code&gt;.&lt;/p&gt;
    &lt;p&gt;As far as I can tell, this is because Picat only finds one best plan, and &lt;code&gt;[5, 4, 3, 3]&lt;/code&gt; has &lt;em&gt;two&lt;/em&gt; solutions: &lt;code&gt;4*(5+3/3)&lt;/code&gt; and &lt;code&gt;3*(5+4)-3&lt;/code&gt;. &lt;code&gt;best_plan&lt;/code&gt; is a &lt;em&gt;deterministic&lt;/em&gt; operator, so Picat commits to the first best plan it finds. So if it finds &lt;code&gt;3*(5+4)-3&lt;/code&gt; first, it sees that the solution doesn't contain a division, throws &lt;code&gt;[5, 4, 3, 3]&lt;/code&gt; away as a candidate, and moves on to the next puzzle.&lt;/p&gt;
    &lt;p&gt;There's a couple ways we can fix this. We could replace &lt;code&gt;best_plan&lt;/code&gt; with &lt;code&gt;best_plan_nondet&lt;/code&gt;, which can backtrack to find new plans (at the cost of an enormous number of duplicates). Or we could modify our &lt;code&gt;final&lt;/code&gt; to only accept plans with a division: &lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;% Hypothetical change
    final([N]) =&amp;gt;
    &lt;span class="gi"&gt;+ member($(_ / _), current_plan()),&lt;/span&gt;
    &lt;span class="w"&gt; &lt;/span&gt; N =:= 24.
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;My favorite "fix" is to ask another question entirely. While I was looking for puzzles that can be solved with division, what I actually want is puzzles that &lt;em&gt;must&lt;/em&gt; be solved with division. What if I rejected any puzzle that has a solution &lt;em&gt;without&lt;/em&gt; division?&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="gi"&gt;+ plan_with_no_div(S, P) =&amp;gt; best_plan_nondet(S, 4, P), not member($(_ / _), P).&lt;/span&gt;
    
    valid24(Start, Plan) =&amp;gt;
    &lt;span class="w"&gt; &lt;/span&gt; Start = new_list(4)
    &lt;span class="w"&gt; &lt;/span&gt; , Start :: 1..8
    &lt;span class="w"&gt; &lt;/span&gt; , increasing(Start)
    &lt;span class="w"&gt; &lt;/span&gt; , solve(Start)
    &lt;span class="w"&gt; &lt;/span&gt; , best_plan(Start, 4, Plan)
    &lt;span class="gd"&gt;- , member($(_ / _), Plan)&lt;/span&gt;
    &lt;span class="gi"&gt;+ , not plan_with_no_div(Start, _)&lt;/span&gt;
    &lt;span class="w"&gt; &lt;/span&gt; .
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;The new line's a bit tricky. &lt;code&gt;plan_with_no_div&lt;/code&gt; nondeterministically finds a plan, and then fails if the plan contains a division.&lt;sup id="fnref:not"&gt;&lt;a class="footnote-ref" href="#fn:not"&gt;3&lt;/a&gt;&lt;/sup&gt; Since I used &lt;code&gt;best_plan_nondet&lt;/code&gt;, it can backtrack from there and find a new plan. This means &lt;code&gt;plan_with_no_div&lt;/code&gt; only fails if no such plan exists. And in &lt;code&gt;valid24&lt;/code&gt;, we only succeed if &lt;code&gt;plan_with_no_div&lt;/code&gt; fails, guaranteeing that the only existing plans use division. Since this doesn't depend on the plan found via &lt;code&gt;best_plan&lt;/code&gt;, it doesn't matter how the values in &lt;code&gt;Start&lt;/code&gt; are arranged: this will not miss any valid puzzles.&lt;/p&gt;
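    &lt;p&gt;Here is one way to sketch the "must use division" check in Python. Instead of enumerating plans and filtering out the ones with a division, this version simply searches again with division switched off, which should answer the same question. The names &lt;code&gt;solvable&lt;/code&gt; and &lt;code&gt;requires_division&lt;/code&gt; are made up for this sketch.&lt;/p&gt;

```python
from fractions import Fraction
from itertools import permutations


def solvable(state, allow_div=True):
    """True if `state` can be reduced to [24], optionally without division."""
    if len(state) == 1:
        return state[0] == 24
    for i, j in permutations(range(len(state)), 2):
        x, y = state[i], state[j]
        rest = [v for k, v in enumerate(state) if k not in (i, j)]
        results = [x + y, x - y, x * y]
        if allow_div and y > 0:
            results.append(Fraction(x) / y)
        if any(solvable(rest + [r], allow_div) for r in results):
            return True
    return False


def requires_division(puzzle):
    # A solution exists, and no division-free solution exists.
    return solvable(puzzle) and not solvable(puzzle, allow_div=False)


print(requires_division([1, 3, 4, 6]))  # True: the known solution is 6/(1-3/4)
print(requires_division([6, 6, 6, 6]))  # False: 6+6+6+6 needs no division
```

    &lt;p&gt;Like the Picat version, the result does not depend on the order of the numbers in the puzzle.&lt;/p&gt;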
    &lt;h4&gt;Aside for my &lt;a href="https://leanpub.com/logic/" target="_blank"&gt;logic book readers&lt;/a&gt;&lt;/h4&gt;
    &lt;p&gt;The new clause is equivalent to &lt;code&gt;!(some p: Plan(p) &amp;amp;&amp;amp; !(div in p))&lt;/code&gt;. Applying the simplifications we learned:&lt;/p&gt;
    &lt;ol&gt;
    &lt;li&gt;&lt;code&gt;!(some p: Plan(p) &amp;amp;&amp;amp; !(div in p))&lt;/code&gt; (init)&lt;/li&gt;
    &lt;li&gt;&lt;code&gt;all p: !(Plan(p) &amp;amp;&amp;amp; !(div in p))&lt;/code&gt; (all/some duality)&lt;/li&gt;
    &lt;li&gt;&lt;code&gt;all p: !Plan(p) || div in p&lt;/code&gt; (De Morgan's law)&lt;/li&gt;
    &lt;li&gt;&lt;code&gt;all p: Plan(p) =&amp;gt; div in p&lt;/code&gt; (implication definition)&lt;/li&gt;
    &lt;/ol&gt;
    &lt;p&gt;Which more obviously means "if p is a valid plan, then it contains a division".&lt;/p&gt;
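    &lt;p&gt;If you don't trust the algebra, a brute-force check is cheap. The Python sketch below is only a sanity test: it treats each candidate as a pair of booleans ("is a plan", "contains a division") and compares the first and last formulas over every assignment for three hypothetical candidates.&lt;/p&gt;

```python
from itertools import product


def original(candidates):
    # not (some p: p is a plan and p has no division)
    return not any(is_plan and not has_div for is_plan, has_div in candidates)


def simplified(candidates):
    # all p: p is a plan implies p has a division
    return all((not is_plan) or has_div for is_plan, has_div in candidates)


pairs = [(a, b) for a in (False, True) for b in (False, True)]
cases = list(product(pairs, repeat=3))
assert all(original(c) == simplified(c) for c in cases)
print(len(cases), "cases checked")  # 64 cases checked
```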
    &lt;h4&gt;Back to finding hard puzzles&lt;/h4&gt;
    &lt;p&gt;&lt;em&gt;Anyway&lt;/em&gt;, with &lt;code&gt;not plan_with_no_div&lt;/code&gt;, we are filtering puzzles on the set of possible solutions, not just specific solutions. And this gives me an idea: what if we find puzzles that have only one solution? &lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="gh"&gt;different_plan(S, P) =&amp;gt; best_plan_nondet(S, 4, P2), P2 != P.&lt;/span&gt;
    
    valid24(Start, Plan) =&amp;gt;
    &lt;span class="gi"&gt;+ , not different_plan(Start, Plan)&lt;/span&gt;
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;I tried this from &lt;code&gt;1..8&lt;/code&gt; and got:&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;[1,2,7,7]
    [1,3,4,6]
    [1,6,6,8]
    [3,3,8,8]
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;These happen to be some of the &lt;a href="https://www.4nums.com/game/difficulties/" target="_blank"&gt;hardest 24 puzzles known&lt;/a&gt;, though not all of them. Note this is assuming that &lt;code&gt;(X + Y)&lt;/code&gt; and &lt;code&gt;(Y + X)&lt;/code&gt; are &lt;em&gt;different&lt;/em&gt; solutions. If we say they're the same (by writing &lt;code&gt;A = $(X + Y), X &amp;lt;= Y&lt;/code&gt; in our action) then we get a lot more puzzles, many of which are considered "easy". Other "hard" things we can look for include plans that require fractions:&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="nf"&gt;plan_with_no_fractions&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;S&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;P&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="s s-Atom"&gt;=&amp;gt;&lt;/span&gt; 
      &lt;span class="nf"&gt;best_plan_nondet&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;S&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;P&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
      &lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="o"&gt;not&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="nf"&gt;member&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;X&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;P&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
        &lt;span class="nf"&gt;round&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;apply&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;X&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="s s-Atom"&gt;=\=&lt;/span&gt; &lt;span class="nv"&gt;X&lt;/span&gt;
      &lt;span class="p"&gt;).&lt;/span&gt;
    
    &lt;span class="c1"&gt;% insert `not plan...` in valid24 as usual&lt;/span&gt;
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;Finally, we could try seeing if a negative number is required:&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="nf"&gt;plan_with_no_negatives&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;S&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;P&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="s s-Atom"&gt;=&amp;gt;&lt;/span&gt; 
      &lt;span class="nf"&gt;best_plan_nondet&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;S&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;P&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
      &lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="o"&gt;not&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="nf"&gt;member&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;X&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;P&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
        &lt;span class="nf"&gt;apply&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;X&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;
      &lt;span class="p"&gt;).&lt;/span&gt;
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;Interestingly this one returns no solutions, so you are never required to construct a negative number as part of a standard 24 puzzle.&lt;/p&gt;
    &lt;div class="footnote"&gt;
    &lt;hr/&gt;
    &lt;ol&gt;
    &lt;li id="fn:complex"&gt;
    &lt;p&gt;The code below is different from the old book version, as it uses fancier logic programming features that aren't a good fit for learning material. &lt;a class="footnote-backref" href="#fnref:complex" title="Jump back to footnote 1 in the text"&gt;↩&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id="fn:efficiency"&gt;
    &lt;p&gt;&lt;code&gt;increasing&lt;/code&gt; is a constraint predicate. We could alternatively write &lt;code&gt;sorted&lt;/code&gt;, which is a Picat logical predicate and must be placed after &lt;code&gt;solve&lt;/code&gt;. There don't seem to be any efficiency gains either way. &lt;a class="footnote-backref" href="#fnref:efficiency" title="Jump back to footnote 2 in the text"&gt;↩&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id="fn:not"&gt;
    &lt;p&gt;I don't know what the standard is in Picat, but in Prolog, the convention is to use &lt;code&gt;\+&lt;/code&gt; instead of &lt;code&gt;not&lt;/code&gt;. They mean the same thing, so I'm using &lt;code&gt;not&lt;/code&gt; because it's clearer to non-LPers. &lt;a class="footnote-backref" href="#fnref:not" title="Jump back to footnote 3 in the text"&gt;↩&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;/ol&gt;
    &lt;/div&gt;</description>
                <pubDate>Tue, 20 May 2025 18:21:01 +0000</pubDate>
                <guid>https://buttondown.com/hillelwayne/archive/finding-hard-24-puzzles-with-planner-programming/</guid>
            </item>
            <item>
                <title>Modeling Awkward Social Situations with TLA+</title>
                <link>https://buttondown.com/hillelwayne/archive/modeling-awkward-social-situations-with-tla/</link>
                <description>&lt;p&gt;You're walking down the street and need to pass someone going the opposite way. You take a step left, but they're thinking the same thing and take a step to their &lt;em&gt;right&lt;/em&gt;, aka your left. You're still blocking each other. Then you take a step to the right, and they take a step to their left, and you're back to where you started. I've heard this called "walkwarding".&lt;/p&gt;
    &lt;p&gt;Let's model this in &lt;a href="https://lamport.azurewebsites.net/tla/tla.html" target="_blank"&gt;TLA+&lt;/a&gt;. TLA+ is a &lt;strong&gt;formal methods&lt;/strong&gt; tool for finding bugs in complex software designs, most often involving concurrency. Two people trying to get past each other just also happens to be a concurrent system. A gentler introduction to TLA+'s capabilities is &lt;a href="https://www.hillelwayne.com/post/modeling-deployments/" target="_blank"&gt;here&lt;/a&gt;; an in-depth guide teaching the language is &lt;a href="https://learntla.com/" target="_blank"&gt;here&lt;/a&gt;.&lt;/p&gt;
    &lt;h2&gt;The spec&lt;/h2&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;---- MODULE walkward ----
    EXTENDS Integers
    
    VARIABLES pos
    vars == &amp;lt;&amp;lt;pos&amp;gt;&amp;gt;
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;Double equals defines a new operator, single equals is an equality check. &lt;code&gt;&amp;lt;&amp;lt;pos&amp;gt;&amp;gt;&lt;/code&gt; is a sequence, aka array.&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;you == "you"
    me == "me"
    People == {you, me}
    
    MaxPlace == 4
    
    left == 0
    right == 1
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;I've gotten into the habit of assigning string "symbols" to operators so that the compiler complains if I misspell something. &lt;code&gt;left&lt;/code&gt; and &lt;code&gt;right&lt;/code&gt; are numbers so we can shift position with &lt;code&gt;right - pos&lt;/code&gt;.&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;direction == [you |-&amp;gt; 1, me |-&amp;gt; -1]
    goal == [you |-&amp;gt; MaxPlace, me |-&amp;gt; 1]
    
    Init ==
      \* left-right, forward-backward
      pos = [you |-&amp;gt; [lr |-&amp;gt; left, fb |-&amp;gt; 1], me |-&amp;gt; [lr |-&amp;gt; left, fb |-&amp;gt; MaxPlace]]
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;&lt;code&gt;direction&lt;/code&gt;, &lt;code&gt;goal&lt;/code&gt;, and &lt;code&gt;pos&lt;/code&gt; are "records", or hash tables with string keys. I can get my left-right position with &lt;code&gt;pos.me.lr&lt;/code&gt; or &lt;code&gt;pos["me"]["lr"]&lt;/code&gt; (or &lt;code&gt;pos[me].lr&lt;/code&gt;, as &lt;code&gt;me == "me"&lt;/code&gt;).&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;Juke(person) ==
      pos' = [pos EXCEPT ![person].lr = right - @]
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;TLA+ breaks the world into a sequence of steps. In each step, &lt;code&gt;pos&lt;/code&gt; is the value of &lt;code&gt;pos&lt;/code&gt; in the &lt;em&gt;current&lt;/em&gt; step and &lt;code&gt;pos'&lt;/code&gt; is the value in the &lt;em&gt;next&lt;/em&gt; step. The main outcome of this semantics is that we "assign" a new value to &lt;code&gt;pos&lt;/code&gt; by declaring &lt;code&gt;pos'&lt;/code&gt; equal to something. But the semantics also open up lots of cool tricks, like swapping two values with &lt;code&gt;x' = y /\ y' = x&lt;/code&gt;.&lt;/p&gt;
    &lt;p&gt;TLA+ is a little weird about updating functions. To set &lt;code&gt;f[x] = 3&lt;/code&gt;, you gotta write &lt;code&gt;f' = [f EXCEPT ![x] = 3]&lt;/code&gt;. To make things a little easier, the rhs of a function update can contain &lt;code&gt;@&lt;/code&gt; for the old value. &lt;code&gt;![me].lr = right - @&lt;/code&gt; is the same as &lt;code&gt;right - pos[me].lr&lt;/code&gt;, so it swaps left and right.&lt;/p&gt;
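    &lt;p&gt;If the &lt;code&gt;EXCEPT&lt;/code&gt; syntax is unfamiliar, it may help to read it as a non-mutating update. The Python sketch below is only an analogy, with nested dicts standing in for TLA+ records: &lt;code&gt;juke&lt;/code&gt; builds the next value of &lt;code&gt;pos&lt;/code&gt; and leaves the current one alone.&lt;/p&gt;

```python
LEFT, RIGHT = 0, 1


def juke(pos, person):
    """Return the next-state value of `pos`; the input dict is left untouched."""
    old = pos[person]
    # `@` in the TLA+ update stands for the old value, here old["lr"].
    return {**pos, person: {**old, "lr": RIGHT - old["lr"]}}


pos = {"you": {"lr": LEFT, "fb": 1}, "me": {"lr": LEFT, "fb": 4}}
pos_next = juke(pos, "me")
print(pos["me"]["lr"], pos_next["me"]["lr"])  # 0 1
```

    &lt;p&gt;Juking twice gets you back to the original position, since &lt;code&gt;right - (right - x)&lt;/code&gt; is just &lt;code&gt;x&lt;/code&gt;.&lt;/p&gt;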
    &lt;p&gt;("Juke" comes from &lt;a href="https://www.merriam-webster.com/dictionary/juke" target="_blank"&gt;here&lt;/a&gt;)&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;Move(person) ==
      LET new_pos == [pos[person] EXCEPT !.fb = @ + direction[person]]
      IN
        /\ pos[person].fb # goal[person]
        /\ \A p \in People: pos[p] # new_pos
        /\ pos' = [pos EXCEPT ![person] = new_pos]
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;The &lt;code&gt;EXCEPT&lt;/code&gt; syntax can be used in regular definitions, too. This lets someone move one step in their goal direction &lt;em&gt;unless&lt;/em&gt; they are at the goal &lt;em&gt;or&lt;/em&gt; someone is already in that space. &lt;code&gt;/\&lt;/code&gt; means "and".&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;Next ==
      \E p \in People:
        \/ Move(p)
        \/ Juke(p)
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;I really like how TLA+ represents concurrency: "In each step, there is a person who either moves or jukes." It can take a few uses to really wrap your head around but it can express extraordinarily complicated distributed systems.&lt;/p&gt;
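    &lt;p&gt;One way to read &lt;code&gt;Next&lt;/code&gt; is as a function from the current state to every state that could come next. The Python below is a sketch of that reading, not how TLC works internally. Positions are &lt;code&gt;(lr, fb)&lt;/code&gt; tuples, and the concrete positions, directions, and goals are assumed values chosen to reproduce a blocked situation.&lt;/p&gt;

```python
def move(pos, person, direction, goal):
    """Return the state after `person` moves, or None if Move is not enabled."""
    lr, fb = pos[person]
    new_pos = (lr, fb + direction[person])
    if fb == goal[person]:          # already at the goal
        return None
    if new_pos in pos.values():     # someone is already in that space
        return None
    return {**pos, person: new_pos}


def juke(pos, person, right=1):
    """Juke is always enabled: swap left and right."""
    lr, fb = pos[person]
    return {**pos, person: (right - lr, fb)}


def next_states(pos, direction, goal):
    """Every (label, state) reachable in one step."""
    successors = []
    for p in pos:                                        # \E p \in People
        moved = move(pos, p, direction, goal)
        if moved is not None:                            # \/ Move(p)
            successors.append((f"Move({p})", moved))
        successors.append((f"Juke({p})", juke(pos, p)))  # \/ Juke(p)
    return successors


# Assumed values: we face each other in the same lane.
direction = {"me": -1, "you": 1}
goal = {"me": 0, "you": 4}
blocked = {"you": (0, 2), "me": (0, 3)}

labels = [label for label, _ in next_states(blocked, direction, goal)]
print(labels)
```

    &lt;p&gt;In the blocked state neither &lt;code&gt;Move&lt;/code&gt; is enabled, so the only possible steps are the two jukes. The model checker explores all of these branches instead of picking one.&lt;/p&gt;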
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;Spec == Init /\ [][Next]_vars
    
    Liveness == &amp;lt;&amp;gt;(pos[me].fb = goal[me])
    ====
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;&lt;code&gt;Spec&lt;/code&gt; is our specification: we start at &lt;code&gt;Init&lt;/code&gt; and take a &lt;code&gt;Next&lt;/code&gt; step every step.&lt;/p&gt;
    &lt;p&gt;Liveness is the generic term for "something good is guaranteed to happen", see &lt;a href="https://www.hillelwayne.com/post/safety-and-liveness/" target="_blank"&gt;here&lt;/a&gt; for more.  &lt;code&gt;&amp;lt;&amp;gt;&lt;/code&gt; means "eventually", so &lt;code&gt;Liveness&lt;/code&gt; means "eventually my forward-backward position will be my goal". I could extend it to "both of us eventually reach our goal" but I think this is good enough for a demo.&lt;/p&gt;
    &lt;h3&gt;Checking the spec&lt;/h3&gt;
    &lt;p&gt;Four years ago, everybody in TLA+ used the &lt;a href="https://lamport.azurewebsites.net/tla/toolbox.html" target="_blank"&gt;toolbox&lt;/a&gt;. Now the community has collectively shifted over to using the &lt;a href="https://github.com/tlaplus/vscode-tlaplus/" target="_blank"&gt;VSCode extension&lt;/a&gt;.&lt;sup id="fnref:ltla"&gt;&lt;a class="footnote-ref" href="#fn:ltla"&gt;1&lt;/a&gt;&lt;/sup&gt; VSCode requires we write a configuration file, which I will call &lt;code&gt;walkward.cfg&lt;/code&gt;.&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;SPECIFICATION Spec
    PROPERTY Liveness
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;I then check the model with the VSCode command &lt;code&gt;TLA+: Check model with TLC&lt;/code&gt;. Unsurprisingly, it finds an error:&lt;/p&gt;
    &lt;p&gt;&lt;img alt="Screenshot 2025-05-12 153537.png" class="newsletter-image" src="https://assets.buttondown.email/images/af6f9e89-0bc6-4705-b293-4da5f5c16cfe.png?w=960&amp;amp;fit=max"/&gt;&lt;/p&gt;
    &lt;p&gt;The reason it fails is "stuttering": I can get one step away from my goal and then just stop moving forever. We say the spec is &lt;a href="https://www.hillelwayne.com/post/fairness/" target="_blank"&gt;unfair&lt;/a&gt;: it does not guarantee that if progress is always possible, progress will be made. If I want the spec to always make progress, I have to make some of the steps &lt;strong&gt;weakly fair&lt;/strong&gt;.&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="gi"&gt;+ Fairness == WF_vars(Next)&lt;/span&gt;
    
    &lt;span class="gd"&gt;- Spec == Init /\ [][Next]_vars&lt;/span&gt;
    &lt;span class="gi"&gt;+ Spec == Init /\ [][Next]_vars /\ Fairness&lt;/span&gt;
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;Now the spec is weakly fair, so someone will always do &lt;em&gt;something&lt;/em&gt;. New error:&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;\* First six steps cut
    7: &amp;lt;Move("me")&amp;gt;
    pos = [you |-&amp;gt; [lr |-&amp;gt; 0, fb |-&amp;gt; 4], me |-&amp;gt; [lr |-&amp;gt; 1, fb |-&amp;gt; 2]]
    8: &amp;lt;Juke("me")&amp;gt;
    pos = [you |-&amp;gt; [lr |-&amp;gt; 0, fb |-&amp;gt; 4], me |-&amp;gt; [lr |-&amp;gt; 0, fb |-&amp;gt; 2]]
    9: &amp;lt;Juke("me")&amp;gt; (back to state 7)
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;In this failure, I've successfully gotten past you, and then I spend the rest of my life endlessly juking back and forth. The &lt;code&gt;Next&lt;/code&gt; step keeps happening, so weak fairness is satisfied. What I actually want is for my &lt;code&gt;Move&lt;/code&gt; and my &lt;code&gt;Juke&lt;/code&gt; to each be weakly fair, independently of each other.&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="gd"&gt;- Fairness == WF_vars(Next)&lt;/span&gt;
    &lt;span class="gi"&gt;+ Fairness == WF_vars(Move(me)) /\ WF_vars(Juke(me))&lt;/span&gt;
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;If my liveness property also specified that &lt;em&gt;you&lt;/em&gt; reached your goal, I could instead write &lt;code&gt;\A p \in People: WF_vars(Move(p)) etc&lt;/code&gt;. I could also swap the &lt;code&gt;\A&lt;/code&gt; with a &lt;code&gt;\E&lt;/code&gt; to mean at least one of us is guaranteed to have fair actions, but not necessarily both of us. &lt;/p&gt;
    &lt;p&gt;New error:&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;3: &amp;lt;Move("me")&amp;gt;
    pos = [you |-&amp;gt; [lr |-&amp;gt; 0, fb |-&amp;gt; 2], me |-&amp;gt; [lr |-&amp;gt; 0, fb |-&amp;gt; 3]]
    4: &amp;lt;Juke("you")&amp;gt;
    pos = [you |-&amp;gt; [lr |-&amp;gt; 1, fb |-&amp;gt; 2], me |-&amp;gt; [lr |-&amp;gt; 0, fb |-&amp;gt; 3]]
    5: &amp;lt;Juke("me")&amp;gt;
    pos = [you |-&amp;gt; [lr |-&amp;gt; 1, fb |-&amp;gt; 2], me |-&amp;gt; [lr |-&amp;gt; 1, fb |-&amp;gt; 3]]
    6: &amp;lt;Juke("me")&amp;gt;
    pos = [you |-&amp;gt; [lr |-&amp;gt; 1, fb |-&amp;gt; 2], me |-&amp;gt; [lr |-&amp;gt; 0, fb |-&amp;gt; 3]]
    7: &amp;lt;Juke("you")&amp;gt; (back to state 3)
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;Now we're getting somewhere! This is the original walkwarding situation we wanted to capture. We're in each other's way, then you juke, but before either of us can move I juke, then we both juke back. We can repeat this forever, trapped in a social hell.&lt;/p&gt;
    &lt;p&gt;Wait, but doesn't &lt;code&gt;WF(Move(me))&lt;/code&gt; guarantee I will eventually move? Yes, but &lt;em&gt;only if a move is permanently available&lt;/em&gt;. In this case, it's not permanently available, because every couple of steps it's made temporarily unavailable.&lt;/p&gt;
    &lt;p&gt;How do I fix this? I can't add a rule saying that we only juke if we're blocked, because the whole point of walkwarding is that we're not coordinated. In the real world, walkwarding can go on for agonizing seconds. What I can do instead is say that Liveness holds &lt;em&gt;as long as &lt;code&gt;Move&lt;/code&gt; is strongly fair&lt;/em&gt;. Unlike weak fairness, &lt;a href="https://www.hillelwayne.com/post/fairness/#strong-fairness" target="_blank"&gt;strong fairness&lt;/a&gt; guarantees something happens if it keeps becoming possible, even with interruptions. &lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;Liveness == 
    &lt;span class="gi"&gt;+  SF_vars(Move(me)) =&amp;gt; &lt;/span&gt;
    &lt;span class="w"&gt; &lt;/span&gt;   &amp;lt;&amp;gt;(pos[me].fb = goal[me])
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;This makes the spec pass. Even if we weave back and forth for five minutes, as long as we eventually pass each other, I will reach my goal. Note we could also fix this by making &lt;code&gt;Move&lt;/code&gt; in &lt;code&gt;Fairness&lt;/code&gt; strongly fair, which is preferable if we have a lot of different liveness properties to check.&lt;/p&gt;
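    &lt;p&gt;The difference between weak and strong fairness is easy to see on a looping trace. The sketch below hand-transcribes the loop from the last error trace (states 3 through 6) into a list of steps, each recording which of my actions are enabled and which action was actually taken. The &lt;code&gt;enabled&lt;/code&gt; sets are my own reading of the trace, not TLC output, and both function names are made up. On a loop that repeats forever, "permanently available" means enabled at every step of the loop, while "keeps becoming possible" means enabled at some step.&lt;/p&gt;

```python
def weakly_fair(cycle, action):
    """WF holds on a repeating cycle unless `action` is enabled at every
    step of the cycle and yet never taken."""
    always_enabled = all(action in step["enabled"] for step in cycle)
    ever_taken = any(step["taken"] == action for step in cycle)
    return ever_taken or not always_enabled


def strongly_fair(cycle, action):
    """SF holds on a repeating cycle unless `action` is enabled at some
    step of the cycle and yet never taken."""
    ever_enabled = any(action in step["enabled"] for step in cycle)
    ever_taken = any(step["taken"] == action for step in cycle)
    return ever_taken or not ever_enabled


# Hand-transcribed loop (assumed reading of the trace). Only my own
# actions are listed in "enabled"; Move(me) is blocked whenever we
# share a lane.
cycle = [
    {"enabled": {"Juke(me)"}, "taken": "Juke(you)"},             # state 3
    {"enabled": {"Juke(me)", "Move(me)"}, "taken": "Juke(me)"},  # state 4
    {"enabled": {"Juke(me)"}, "taken": "Juke(me)"},              # state 5
    {"enabled": {"Juke(me)", "Move(me)"}, "taken": "Juke(you)"}, # state 6
]

print(weakly_fair(cycle, "Move(me)"))
print(strongly_fair(cycle, "Move(me)"))
```

    &lt;p&gt;The loop satisfies weak fairness for &lt;code&gt;Move(me)&lt;/code&gt;, which is why TLC can report it as a counterexample, but it violates strong fairness, which is why assuming &lt;code&gt;SF_vars(Move(me))&lt;/code&gt; rules it out.&lt;/p&gt;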
    &lt;h3&gt;A small exercise for the reader&lt;/h3&gt;
    &lt;p&gt;There is a presumed invariant that is violated. Identify what it is, write it as a property in TLA+, and show the spec violates it. Then fix it.&lt;/p&gt;
    &lt;p&gt;Answer (in &lt;a href="https://rot13.com/" target="_blank"&gt;rot13&lt;/a&gt;): Gur vainevnag vf "ab gjb crbcyr ner va gur rknpg fnzr ybpngvba". &lt;code&gt;Zbir&lt;/code&gt; thnenagrrf guvf ohg &lt;code&gt;Whxr&lt;/code&gt; &lt;em&gt;qbrf abg&lt;/em&gt;.&lt;/p&gt;
    &lt;h3&gt;More TLA+ Exercises&lt;/h3&gt;
    &lt;p&gt;I've started work on &lt;a href="https://github.com/hwayne/tlaplus-exercises/" target="_blank"&gt;an exercises repo&lt;/a&gt;. There's only a handful of specific problems now but I'm planning on adding more over the summer.&lt;/p&gt;
    &lt;div class="footnote"&gt;
    &lt;hr/&gt;
    &lt;ol&gt;
    &lt;li id="fn:ltla"&gt;
    &lt;p&gt;&lt;a href="https://learntla.com/" target="_blank"&gt;learntla&lt;/a&gt; is still on the toolbox, but I'm hoping to get it all moved over this summer. &lt;a class="footnote-backref" href="#fnref:ltla" title="Jump back to footnote 1 in the text"&gt;↩&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;/ol&gt;
    &lt;/div&gt;</description>
                <pubDate>Wed, 14 May 2025 16:02:21 +0000</pubDate>
                <guid>https://buttondown.com/hillelwayne/archive/modeling-awkward-social-situations-with-tla/</guid>
            </item>
            <item>
                <title>Write the most clever code you possibly can</title>
                <link>https://buttondown.com/hillelwayne/archive/write-the-most-clever-code-you-possibly-can/</link>
                <description>&lt;p&gt;&lt;em&gt;I started writing this early last week but Real Life Stuff happened and now you're getting the first-draft late this week. Warning, unedited thoughts ahead!&lt;/em&gt;&lt;/p&gt;
    &lt;h2&gt;New Logic for Programmers release!&lt;/h2&gt;
    &lt;p&gt;&lt;a href="https://leanpub.com/logic/" target="_blank"&gt;v0.9 is out&lt;/a&gt;! This is a big release, with a new cover design, several rewritten chapters, &lt;a href="https://github.com/logicforprogrammers/book-assets/tree/master/code" target="_blank"&gt;online code samples&lt;/a&gt; and much more. See the full release notes at the &lt;a href="https://github.com/logicforprogrammers/book-assets/blob/master/CHANGELOG.md" target="_blank"&gt;changelog page&lt;/a&gt;, and &lt;a href="https://leanpub.com/logic/" target="_blank"&gt;get the book here&lt;/a&gt;!&lt;/p&gt;
    &lt;p&gt;&lt;img alt="The new cover! It's a lot nicer" class="newsletter-image" src="https://assets.buttondown.email/images/038a7092-5dc7-41a5-9a16-56bdef8b5d58.jpg?w=400&amp;amp;fit=max"/&gt;&lt;/p&gt;
    &lt;h2&gt;Write the cleverest code you possibly can&lt;/h2&gt;
    &lt;p&gt;There are millions of articles online about how programmers should not write "clever" code, and instead write simple, maintainable code that everybody understands. Sometimes the example of "clever" code looks like this (&lt;a href="https://codegolf.stackexchange.com/questions/57617/is-this-number-a-prime/57682#57682" target="_blank"&gt;src&lt;/a&gt;):&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="c1"&gt;# Python&lt;/span&gt;
    
    &lt;span class="n"&gt;p&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;n&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;
    &lt;span class="n"&gt;exec&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;"p*=n*n;n+=1;"&lt;/span&gt;&lt;span class="o"&gt;*~-&lt;/span&gt;&lt;span class="nb"&gt;int&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;input&lt;/span&gt;&lt;span class="p"&gt;()))&lt;/span&gt;
    &lt;span class="nb"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;p&lt;/span&gt;&lt;span class="o"&gt;%&lt;/span&gt;&lt;span class="n"&gt;n&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;This is code-golfing, the sport of writing the most concise code possible. Obviously you shouldn't run this in production for the same reason you shouldn't eat dinner off a Rembrandt. &lt;/p&gt;
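    &lt;p&gt;For the curious, here is an ungolfed sketch of what that snippet appears to compute. The function name &lt;code&gt;is_prime_wilson&lt;/code&gt; is made up, and it takes an argument instead of reading from standard input. The trick is Wilson's theorem: for k greater than 1, (k-1)! is congruent to -1 mod k exactly when k is prime, so its square is 1 mod k for primes and 0 for everything else.&lt;/p&gt;

```python
def is_prime_wilson(k):
    """Ungolfed version of the snippet: checks ((k-1)!)**2 % k == 1."""
    p = n = 1
    for _ in range(k - 1):  # the golfed ~-int(input()) is just k - 1
        p *= n * n          # p accumulates the squared factorial
        n += 1              # n ends up equal to k
    return p % n == 1


print([k for k in range(1, 20) if is_prime_wilson(k)])
```

    &lt;p&gt;Readable, but still nothing to put in production: the factorial grows enormously and the loop is linear in k.&lt;/p&gt;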
    &lt;p&gt;Other times the example looks like this:&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="k"&gt;def&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nf"&gt;is_prime&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;x&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="kc"&gt;False&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nb"&gt;all&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="o"&gt;%&lt;/span&gt;&lt;span class="n"&gt;n&lt;/span&gt; &lt;span class="o"&gt;!=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;n&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;)])&lt;/span&gt;
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;This is "clever" because it uses a single list comprehension, as opposed to a "simple" for loop. Yes, "list comprehensions are too clever" is something I've read in one of these articles. &lt;/p&gt;
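    &lt;p&gt;For comparison, this is roughly what the "simple" for-loop version would look like. The function names are mine, and the first function just repeats the comprehension version so the two can be checked against each other.&lt;/p&gt;

```python
def is_prime_comprehension(x):
    """The "clever" version from above."""
    if x == 1:
        return False
    return all([x % n != 0 for n in range(2, x)])


def is_prime_loop(x):
    """The "simple" version: same check, spelled out as a loop."""
    if x == 1:
        return False
    for n in range(2, x):
        if x % n == 0:
            return False
    return True


# Both versions agree on every input tried here.
print(all(is_prime_comprehension(x) == is_prime_loop(x) for x in range(1, 50)))
```

    &lt;p&gt;The loop is longer, and whether it is actually easier to read depends on who is reading it.&lt;/p&gt;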
    &lt;p&gt;I've also talked to people who think that datatypes besides lists and hashmaps are too clever to use, that most optimizations are too clever to bother with, and even that functions and classes are too clever and code should be a linear script.&lt;sup id="fnref:grad-students"&gt;&lt;a class="footnote-ref" href="#fn:grad-students"&gt;1&lt;/a&gt;&lt;/sup&gt; Clever code is anything using features or domain concepts we don't understand. Something that seems unbearably clever to me might be utterly mundane for you, and vice versa. &lt;/p&gt;
    &lt;p&gt;How do we make something utterly mundane? By using it and working at the boundaries of our skills. Almost everything I'm "good at" comes from banging my head against it more than is healthy. That suggests a really good reason to write clever code: it's an excellent form of purposeful practice. Writing clever code forces us to code outside of our comfort zone, developing our skills as software engineers. &lt;/p&gt;
    &lt;blockquote&gt;
    &lt;p&gt;Debugging is twice as hard as writing the code in the first place. Therefore, if you write the code as cleverly as possible, you [will get excellent debugging practice at exactly the right level required to push your skills as a software engineer] — Brian Kernighan, probably&lt;/p&gt;
    &lt;/blockquote&gt;
    &lt;p&gt;There are other benefits, too, but first let's kill the elephant in the room:&lt;sup id="fnref:bajillion"&gt;&lt;a class="footnote-ref" href="#fn:bajillion"&gt;2&lt;/a&gt;&lt;/sup&gt;&lt;/p&gt;
    &lt;h3&gt;Don't &lt;em&gt;commit&lt;/em&gt; clever code&lt;/h3&gt;
    &lt;p&gt;I am proposing writing clever code as a means of practice. Being at work is a &lt;em&gt;job&lt;/em&gt; with coworkers who will not appreciate it if your code is too clever. Similarly, don't use &lt;a href="https://mcfunley.com/choose-boring-technology" target="_blank"&gt;too many innovative technologies&lt;/a&gt;. Don't put anything in production you are &lt;em&gt;uncomfortable&lt;/em&gt; with.&lt;/p&gt;
    &lt;p&gt;We can still responsibly write clever code at work, though: &lt;/p&gt;
    &lt;ol&gt;
    &lt;li&gt;Solve a problem in both a simple and a clever way, and then only commit the simple way. This works well for small scale problems where trying the "clever way" only takes a few minutes.&lt;/li&gt;
    &lt;li&gt;Write our &lt;em&gt;personal&lt;/em&gt; tools cleverly. I'm a big believer in the idea that most programmers would benefit from writing more scripts and support code customized to their particular work environment. This is a great place to practice new techniques, languages, etc.&lt;/li&gt;
    &lt;li&gt;If clever code is absolutely the best way to solve a problem, then commit it with &lt;strong&gt;extensive documentation&lt;/strong&gt; explaining how it works and why it's preferable to simpler solutions. Bonus: this potentially helps the whole team upskill.&lt;/li&gt;
    &lt;/ol&gt;
    &lt;h2&gt;Writing clever code...&lt;/h2&gt;
    &lt;div class="subscribe-form"&gt;&lt;/div&gt;
    &lt;h3&gt;...teaches simple solutions&lt;/h3&gt;
    &lt;p&gt;Usually, code that's called too clever composes several powerful features together — the "not a single list comprehension or function" people are the exception. &lt;a href="https://www.joshwcomeau.com/career/clever-code-considered-harmful/" target="_blank"&gt;Josh Comeau's&lt;/a&gt; "don't write clever code" article gives this example of "too clever":&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="kd"&gt;const&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nx"&gt;extractDataFromResponse&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;response&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;=&amp;gt;&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="kd"&gt;const&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nx"&gt;Component&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nx"&gt;props&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nx"&gt;response&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    
    &lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="kd"&gt;const&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nx"&gt;resultsEntries&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nb"&gt;Object&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;entries&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nx"&gt;Component&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nx"&gt;props&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;});&lt;/span&gt;
    &lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="kd"&gt;const&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nx"&gt;assignIfValueTruthy&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;o&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nx"&gt;k&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nx"&gt;v&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;=&amp;gt;&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;v&lt;/span&gt;
    &lt;span class="w"&gt;    &lt;/span&gt;&lt;span class="o"&gt;?&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;...&lt;/span&gt;&lt;span class="nx"&gt;o&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nx"&gt;k&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nx"&gt;v&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="w"&gt;    &lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nx"&gt;o&lt;/span&gt;
    &lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    
    &lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="k"&gt;return&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nx"&gt;resultsEntries&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;reduce&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;assignIfValueTruthy&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{});&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;What makes this "clever"? I count seven language features composed together: &lt;code&gt;entries&lt;/code&gt;, argument unpacking, implicit objects, splats, ternaries, higher-order functions, and reductions. Would code that used only one or two of these features still be "clever"? I don't think so. These features exist for a reason, and oftentimes they make code simpler than not using them.&lt;/p&gt;
    &lt;p&gt;We can, of course, learn these features one at a time. Writing the clever version (but not &lt;em&gt;committing it&lt;/em&gt;) gives us practice with all seven at once and also with how they compose together. That knowledge comes in handy when we want to apply a single one of the ideas.&lt;/p&gt;
    &lt;p&gt;I've recently had to do a bit of pandas for a project. Whenever I have to do a new analysis, I try to write it as a single chain of transformations, and then as a more balanced set of updates.&lt;/p&gt;
    &lt;h3&gt;...helps us master concepts&lt;/h3&gt;
    &lt;p&gt;Even if the composite parts of a "clever" solution aren't by themselves useful, it still makes us better at the overall language, and that's inherently valuable. A few years ago I wrote &lt;a href="https://www.hillelwayne.com/post/python-abc/" target="_blank"&gt;Crimes with Python's Pattern Matching&lt;/a&gt;. It involves writing horrible code like this:&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nn"&gt;abc&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;ABC&lt;/span&gt;
    
    &lt;span class="k"&gt;class&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nc"&gt;NotIterable&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ABC&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    
        &lt;span class="nd"&gt;@classmethod&lt;/span&gt;
        &lt;span class="k"&gt;def&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nf"&gt;__subclasshook__&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="bp"&gt;cls&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;C&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
            &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="ow"&gt;not&lt;/span&gt; &lt;span class="nb"&gt;hasattr&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;C&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;"__iter__"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    
    &lt;span class="k"&gt;def&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nf"&gt;f&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="k"&gt;match&lt;/span&gt; &lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="k"&gt;case&lt;/span&gt; &lt;span class="n"&gt;NotIterable&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
                &lt;span class="nb"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s2"&gt; is not iterable"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="k"&gt;case&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="k"&gt;_&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
                &lt;span class="nb"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s2"&gt; is iterable"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="vm"&gt;__name__&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="s2"&gt;"__main__"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;"string"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;This composes Python match statements, which are broadly useful, and abstract base classes, which are incredibly niche. But even if I never use ABCs in real production code, it helped me understand Python's match semantics and &lt;a href="https://docs.python.org/3/howto/mro.html#python-2-3-mro" target="_blank"&gt;Method Resolution Order&lt;/a&gt; better. &lt;/p&gt;
    &lt;h3&gt;...prepares us for necessity&lt;/h3&gt;
    &lt;p&gt;Sometimes the clever way is the &lt;em&gt;only&lt;/em&gt; way. Maybe we need something faster than the simplest solution. Maybe we are working with constrained tools or frameworks that demand cleverness. Peter Norvig argued that design patterns compensate for missing language features. I'd argue that cleverness is another means of compensating: if our tools don't have an easy way to do something, we need to find a clever way.&lt;/p&gt;
    &lt;p&gt;You see this a lot in formal methods like TLA+. Need to check a hyperproperty? &lt;a href="https://www.hillelwayne.com/post/graphing-tla/" target="_blank"&gt;Cast your state space to a directed graph&lt;/a&gt;. Need to compose ten specifications together? &lt;a href="https://www.hillelwayne.com/post/composing-tla/" target="_blank"&gt;Combine refinements with state machines&lt;/a&gt;. Most difficult problems have a "clever" solution. The real problem is that clever solutions have a skill floor. If normal use of the tool is at difficulty 3 out of 10, then basic clever solutions are at 5 out of 10, and it's hard to jump those two steps in the moment you need the cleverness.&lt;/p&gt;
    &lt;p&gt;But if you've practiced with writing overly clever code, you're used to working at a 7 out of 10 level in short bursts, and then you can "drop down" to 5/10. I don't know if that makes too much sense, but I see it happen a lot in practice.&lt;/p&gt;
    &lt;h3&gt;...builds comradery&lt;/h3&gt;
    &lt;p&gt;On a few occasions, after getting a pull request merged, I pulled the reviewer over and said "check out this horrible way of doing the same thing". I find that as long as people know they're not going to be subjected to a clever solution in production, they enjoy seeing it!&lt;/p&gt;
    &lt;p&gt;&lt;em&gt;Next week's newsletter will probably also be late; after that we should be back to a regular schedule for the rest of the summer.&lt;/em&gt;&lt;/p&gt;
    &lt;div class="footnote"&gt;
    &lt;hr/&gt;
    &lt;ol&gt;
    &lt;li id="fn:grad-students"&gt;
    &lt;p&gt;Mostly grad students outside of CS who have to write scripts to do research. And more than one data scientist. I think it's correlated with using Jupyter. &lt;a class="footnote-backref" href="#fnref:grad-students" title="Jump back to footnote 1 in the text"&gt;↩&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id="fn:bajillion"&gt;
    &lt;p&gt;If I don't put this at the beginning, I'll get a bajillion responses like "your team will hate you" &lt;a class="footnote-backref" href="#fnref:bajillion" title="Jump back to footnote 2 in the text"&gt;↩&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;/ol&gt;
    &lt;/div&gt;</description>
                <pubDate>Thu, 08 May 2025 15:04:42 +0000</pubDate>
                <guid>https://buttondown.com/hillelwayne/archive/write-the-most-clever-code-you-possibly-can/</guid>
            </item>
            <item>
                <title>Requirements change until they don't</title>
                <link>https://buttondown.com/hillelwayne/archive/requirements-change-until-they-dont/</link>
                <description>&lt;p&gt;Recently I got a question on formal methods&lt;sup id="fnref:fs"&gt;&lt;a class="footnote-ref" href="#fn:fs"&gt;1&lt;/a&gt;&lt;/sup&gt;: how does it help to mathematically model systems when the system requirements are constantly changing? It doesn't make sense to spend a lot of time proving a design works, and then deliver the product and find out it's not at all what the client needs. As the saying goes, the hard part is "building the right thing", not "building the thing right".&lt;/p&gt;
    &lt;p&gt;One possible response: "why write tests"? You shouldn't write tests, &lt;em&gt;especially&lt;/em&gt; &lt;a href="https://en.wikipedia.org/wiki/Test-driven_development" target="_blank"&gt;lots of unit tests ahead of time&lt;/a&gt;, if you might just throw them all away when the requirements change.&lt;/p&gt;
    &lt;p&gt;This is a bad response because we all know the difference between writing tests and formal methods: testing is &lt;em&gt;easy&lt;/em&gt; and FM is &lt;em&gt;hard&lt;/em&gt;. Testing requires low cost for moderate correctness, FM requires high(ish) cost for high correctness. And when requirements are constantly changing, "high(ish) cost" isn't affordable and "high correctness" isn't worthwhile, because a kinda-okay solution that solves a customer's problem is infinitely better than a solid solution that doesn't.&lt;/p&gt;
    &lt;p&gt;But eventually you get something that solves the problem, and what then?&lt;/p&gt;
    &lt;p&gt;Most of us don't work for Google, we can't axe features and products &lt;a href="https://killedbygoogle.com/" target="_blank"&gt;on a whim&lt;/a&gt;. If the client is happy with your solution, you are expected to support it. It should work when your customers run into new edge cases, or migrate all their computers to the next OS version, or expand into a market with shoddy internet. It should work when 10x as many customers are using 10x as many features. It should work when &lt;a href="https://www.hillelwayne.com/post/feature-interaction/" target="_blank"&gt;you add new features that come into conflict&lt;/a&gt;. &lt;/p&gt;
    &lt;p&gt;And just as importantly, &lt;em&gt;it should never stop solving their problem&lt;/em&gt;. Canonical example: your feature involves processing requested tasks synchronously. At scale, this doesn't work, so to improve latency you make it asynchronous. Now it's eventually consistent, but your customers were depending on it being always consistent. Now it no longer does what they need, and has stopped solving their problems.&lt;/p&gt;
    &lt;p&gt;Every successful requirement met spawns a new requirement: "keep this working". That requirement is permanent, or close enough to decide our long-term strategy. It takes active investment to keep a feature behaving the same as the world around it changes.&lt;/p&gt;
    &lt;p&gt;(Is this all a pretentious way of saying "software maintenance is hard?" Maybe!)&lt;/p&gt;
    &lt;h3&gt;Phase changes&lt;/h3&gt;
    &lt;p&gt;In physics there's a concept of a &lt;a href="https://en.wikipedia.org/wiki/Phase_transition" target="_blank"&gt;phase transition&lt;/a&gt;. To raise the temperature of a gram of liquid water by 1°C, you have to add 4.184 joules of energy.&lt;sup id="fnref:calorie"&gt;&lt;a class="footnote-ref" href="#fn:calorie"&gt;2&lt;/a&gt;&lt;/sup&gt; This continues until you raise it to 100°C, then the temperature stops rising. After you've added another two &lt;em&gt;thousand&lt;/em&gt; joules to that gram, it suddenly turns into steam. The energy of the system changes continuously but the form, or phase, changes discretely.&lt;/p&gt;
    &lt;p&gt;&lt;img alt="Phase_diagram_of_water_simplified.svg.png (from above link)" class="newsletter-image" src="https://assets.buttondown.email/images/31676a33-be6a-4c6d-a96f-425723dcb0d5.png?w=960&amp;amp;fit=max"/&gt;&lt;/p&gt;
    &lt;p class="empty-line" style="height:16px; margin:0px !important;"&gt;&lt;/p&gt;
    &lt;p&gt;Software isn't physics but the idea works as a metaphor. A certain architecture handles a certain level of load, and past that you need a new architecture. Or a bunch of similar features are independently hardcoded until the system becomes too messy to understand, and you remodel the internals into something unified and extendable. etc etc etc. It doesn't have to be a totally discrete phase transition, but there's definitely a "before" and "after" in the system form. &lt;/p&gt;
    &lt;p&gt;Phase changes tend to lead to more intricacy/complexity in the system, meaning it's likely that a phase change will introduce new bugs into existing behaviors. Take the synchronous vs asynchronous case. A very simple toy model of synchronous updates would be &lt;code&gt;Set(key, val)&lt;/code&gt;, which updates &lt;code&gt;data[key]&lt;/code&gt; to &lt;code&gt;val&lt;/code&gt;.&lt;sup id="fnref:tla"&gt;&lt;a class="footnote-ref" href="#fn:tla"&gt;3&lt;/a&gt;&lt;/sup&gt; A model of asynchronous updates would be &lt;code&gt;AsyncSet(key, val, priority)&lt;/code&gt;, which adds a &lt;code&gt;(key, val, priority, server_time())&lt;/code&gt; tuple to a &lt;code&gt;tasks&lt;/code&gt; set, and then another process asynchronously pulls a tuple (ordered by highest priority, then earliest time) and calls &lt;code&gt;Set(key, val)&lt;/code&gt;. Here are some properties the client may need preserved as a requirement: &lt;/p&gt;
    &lt;ul&gt;
    &lt;li&gt;If &lt;code&gt;AsyncSet(key, val, _)&lt;/code&gt; is called, then &lt;em&gt;eventually&lt;/em&gt; &lt;code&gt;data[key] = val&lt;/code&gt; (possibly violated if higher-priority tasks keep coming in)&lt;/li&gt;
    &lt;li&gt;If someone calls &lt;code&gt;AsyncSet(key1, val1, low)&lt;/code&gt; and then &lt;code&gt;AsyncSet(key2, val2, low)&lt;/code&gt;, they should see the first update and then the second (linearizability, possibly violated if the requests go to different servers with different clock times)&lt;/li&gt;
    &lt;li&gt;If someone calls &lt;code&gt;AsyncSet(key, val, _)&lt;/code&gt; and &lt;em&gt;immediately&lt;/em&gt; reads &lt;code&gt;data[key]&lt;/code&gt; they should get &lt;code&gt;val&lt;/code&gt; (obviously violated, though the client may accept a &lt;em&gt;slightly&lt;/em&gt; weaker property)&lt;/li&gt;
    &lt;/ul&gt;
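    &lt;p&gt;The two toy models can be sketched in Python to make the third violation concrete. This is only an illustration: the class names, the integer priorities, and the counter standing in for &lt;code&gt;server_time()&lt;/code&gt; are assumptions, not part of any real system.&lt;/p&gt;

```python
import heapq
import itertools


class SyncStore:
    """Toy synchronous model: set() updates data[key] immediately."""

    def __init__(self):
        self.data = {}

    def set(self, key, val):
        self.data[key] = val


class AsyncStore:
    """Toy asynchronous model: async_set() only enqueues a task."""

    def __init__(self):
        self.data = {}
        self.tasks = []                 # heap of pending tasks
        self.clock = itertools.count()  # hypothetical stand-in for server_time()

    def async_set(self, key, val, priority):
        # Negate the priority so the highest priority pops first;
        # ties are broken by the earliest enqueue time.
        heapq.heappush(self.tasks, (-priority, next(self.clock), key, val))

    def step(self):
        """The worker process: apply one pending task, if there is one."""
        if self.tasks:
            _, _, key, val = heapq.heappop(self.tasks)
            self.data[key] = val


sync_store = SyncStore()
sync_store.set("x", 1)
read_after_sync_write = sync_store.data.get("x")    # the write is visible

async_store = AsyncStore()
async_store.async_set("x", 1, priority=0)
read_after_async_write = async_store.data.get("x")  # worker has not run yet
async_store.step()
read_after_step = async_store.data.get("x")         # now the write is visible
```

    &lt;p&gt;The immediate read only succeeds in the synchronous store. In the asynchronous one it returns nothing until the worker takes a step.&lt;/p&gt;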
    &lt;p&gt;If the new system doesn't satisfy an existing customer requirement, it's prudent to fix the bug &lt;em&gt;before&lt;/em&gt; releasing the new system. The customer doesn't notice or care that your system underwent a phase change. They'll just see that one day your product solves their problems, and the next day it suddenly doesn't. &lt;/p&gt;
    &lt;p&gt;This is one of the most common applications of formal methods. Both of those systems, and every one of those properties, are formally specifiable in a specification language. We can then automatically check that the new system satisfies the existing properties, and from there do things like &lt;a href="https://arxiv.org/abs/2006.00915" target="_blank"&gt;automatically generate test suites&lt;/a&gt;. This does take a lot of work, so if your requirements are constantly changing, FM may not be worth the investment. But eventually requirements &lt;em&gt;stop&lt;/em&gt; changing, and then you're stuck with them forever. That's where models shine.&lt;/p&gt;
    &lt;div class="footnote"&gt;
    &lt;hr/&gt;
    &lt;ol&gt;
    &lt;li id="fn:fs"&gt;
    &lt;p&gt;As always, I'm using formal methods to mean the subdiscipline of formal specification of designs, leaving out the formal verification of code. Mostly because "formal specification" is really awkward to say. &lt;a class="footnote-backref" href="#fnref:fs" title="Jump back to footnote 1 in the text"&gt;↩&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id="fn:calorie"&gt;
    &lt;p&gt;Also called a "calorie". The US "dietary Calorie" is actually a kilocalorie. &lt;a class="footnote-backref" href="#fnref:calorie" title="Jump back to footnote 2 in the text"&gt;↩&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id="fn:tla"&gt;
    &lt;p&gt;This is all directly translatable to a TLA+ specification, I'm just describing it in English to avoid paying the syntax tax. &lt;a class="footnote-backref" href="#fnref:tla" title="Jump back to footnote 3 in the text"&gt;↩&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;/ol&gt;
    &lt;/div&gt;</description>
                <pubDate>Thu, 24 Apr 2025 11:00:00 +0000</pubDate>
                <guid>https://buttondown.com/hillelwayne/archive/requirements-change-until-they-dont/</guid>
            </item>
            <item>
                <title>The Halting Problem is a terrible example of NP-Harder</title>
                <link>https://buttondown.com/hillelwayne/archive/the-halting-problem-is-a-terrible-example-of-np/</link>
                <description>&lt;p&gt;&lt;em&gt;Short one this time because I have a lot going on this week.&lt;/em&gt;&lt;/p&gt;
    &lt;p&gt;In computational complexity, &lt;strong&gt;NP&lt;/strong&gt; is the class of all decision problems (yes/no) where a potential proof (or "witness") for "yes" can be &lt;em&gt;verified&lt;/em&gt; in polynomial time. For example, "does this set of numbers have a subset that sums to zero" is in NP. If the answer is "yes", you can prove it by presenting a set of numbers. We would then verify the witness by 1) checking that all the numbers are present in the set (~linear time) and 2) adding up all the numbers (also linear).&lt;/p&gt;
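    &lt;p&gt;The two verification steps can be sketched as a short Python function. The function name and the rule that an empty witness is rejected are assumptions made for the example:&lt;/p&gt;

```python
def verify_subset_sum(numbers, witness):
    """Check a claimed "yes" witness for the zero-sum subset problem."""
    number_set = set(numbers)
    if not witness:
        return False  # assumption: the empty subset does not count
    if len(set(witness)) != len(witness):
        return False  # each number may be used at most once
    if not all(n in number_set for n in witness):
        return False  # step 1: every witness number is in the set
    return sum(witness) == 0  # step 2: the witness sums to zero


numbers = [3, -7, 12, 4, -9, 5]
good = verify_subset_sum(numbers, [3, 4, -7])  # 3 + 4 - 7 = 0
bad = verify_subset_sum(numbers, [3, -9, 5])   # 3 - 9 + 5 = -1
```

    &lt;p&gt;Both steps take time roughly linear in the size of the input, which is all NP asks of a "yes" witness.&lt;/p&gt;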
    &lt;p&gt;&lt;strong&gt;NP-complete&lt;/strong&gt; is the class of "hardest possible" NP problems. Subset sum is NP-complete. &lt;strong&gt;NP-hard&lt;/strong&gt; is the set of all problems &lt;em&gt;at least as hard&lt;/em&gt; as NP-complete. Notably, NP-hard is &lt;em&gt;not&lt;/em&gt; a subset of NP, as it contains problems that are &lt;em&gt;harder&lt;/em&gt; than NP-complete. A natural question to ask is "like what?" And the canonical example of "NP-harder" is the halting problem (HALT): does program P halt on input C? As the argument goes, it's undecidable, so obviously not in NP.&lt;/p&gt;
    &lt;p&gt;I think this is a bad example for two reasons:&lt;/p&gt;
    &lt;ol&gt;&lt;li&gt;&lt;p&gt;All NP requires is that witnesses for "yes" can be verified in polynomial time. It does not require anything for the "no" case! And even though HALT is undecidable, there &lt;em&gt;is&lt;/em&gt; a decidable way to verify a "yes": let the witness be "it halts in N steps", then run the program for that many steps and see if it halted by then. To prove HALT is not in NP, you have to show that this verification process grows faster than polynomially. It does (as &lt;a href="https://en.wikipedia.org/wiki/Busy_beaver" rel="noopener noreferrer nofollow" target="_blank"&gt;busy beaver&lt;/a&gt; is uncomputable), but this all makes the example needlessly confusing.&lt;sup id="fnref:1"&gt;&lt;a class="footnote-ref" data-id="37347adc-dba6-4629-9d24-c6252292ac6b" data-reference-number="1" href="#fn:1"&gt;1&lt;/a&gt;&lt;/sup&gt;&lt;/p&gt;&lt;/li&gt;&lt;li&gt;&lt;p&gt;"What's bigger than a dog? THE MOON"&lt;/p&gt;&lt;/li&gt;&lt;/ol&gt;
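    &lt;p&gt;The "it halts in N steps" check from (1) can be sketched in Python under one loud assumption: a program is modeled as a generator, and each &lt;code&gt;next()&lt;/code&gt; call counts as one step. The names &lt;code&gt;halts_within&lt;/code&gt;, &lt;code&gt;countdown&lt;/code&gt;, and &lt;code&gt;loop_forever&lt;/code&gt; are made up for the example.&lt;/p&gt;

```python
def halts_within(program, arg, steps):
    """Run program(arg) for at most `steps` steps; True if it finished."""
    run = program(arg)
    for _ in range(steps):
        try:
            next(run)  # advance the program by one step
        except StopIteration:
            return True  # the program halted inside the budget
    return False  # budget exhausted: the witness is rejected


def countdown(n):
    """Halts after n yields, so it needs n + 1 steps to be seen halting."""
    while n > 0:
        n -= 1
        yield


def loop_forever(_):
    """Never halts, so no step count is a valid witness."""
    while True:
        yield
```

    &lt;p&gt;For example, &lt;code&gt;halts_within(countdown, 3, 4)&lt;/code&gt; accepts the witness, while any step count for &lt;code&gt;loop_forever&lt;/code&gt; is rejected. The check costs time proportional to the claimed step count, and the smallest valid step count is not bounded by any polynomial in the size of the program and input.&lt;/p&gt;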
    &lt;p&gt;Really (2) bothers me a lot more than (1) because it's just so inelegant. It suggests that NP-complete is the upper bound of "solvable" problems, and after that you're in full-on undecidability. I'd rather show intuitive problems that are harder than NP but not &lt;em&gt;that&lt;/em&gt; much harder.&lt;/p&gt;
    &lt;p&gt;But in looking for a "slightly harder" problem, I ran into an, ah, problem. It &lt;em&gt;seems&lt;/em&gt; like the next-hardest class would be &lt;a href="https://en.wikipedia.org/wiki/EXPTIME" rel="noopener noreferrer nofollow" target="_blank"&gt;EXPTIME&lt;/a&gt;, except we don't know &lt;em&gt;for sure&lt;/em&gt; that NP != EXPTIME. We know &lt;em&gt;for sure&lt;/em&gt; that NP != &lt;a href="https://en.wikipedia.org/wiki/NEXPTIME" rel="noopener noreferrer nofollow" target="_blank"&gt;NEXPTIME&lt;/a&gt;, but NEXPTIME doesn't have any intuitive, easily explainable problems. Most "definitely harder than NP" problems require a nontrivial background in theoretical computer science or mathematics to understand.&lt;/p&gt;
    &lt;p&gt;There is one problem, though, that I find easily explainable. Place a token at the bottom left corner of a grid that extends infinitely up and right, call that point (0, 0). You're given a list of valid displacement moves for the token, like &lt;code&gt;(+1, +0)&lt;/code&gt;, &lt;code&gt;(-20, +13)&lt;/code&gt;, &lt;code&gt;(-5, -6)&lt;/code&gt;, etc, and a target point like &lt;code&gt;(700, 1)&lt;/code&gt;. You may make any sequence of moves in any order, as long as no move ever puts the token off the grid. Does any sequence of moves bring you to the target?&lt;/p&gt;
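    &lt;p&gt;To get a feel for the problem, here is a rough Python sketch that searches for a path breadth-first. It is not a decision procedure: the &lt;code&gt;limit&lt;/code&gt; parameter is an assumption added so the search terminates, and it confines the token to a finite box.&lt;/p&gt;

```python
from collections import deque


def reachable(moves, target, limit):
    """Breadth-first search over points with both coordinates in [0, limit].

    True means a path was found. False only means no path stays inside
    the box, not that the target is unreachable on the infinite grid.
    """
    start = (0, 0)
    seen = {start}
    queue = deque([start])
    while queue:
        x, y = queue.popleft()
        if (x, y) == target:
            return True
        for dx, dy in moves:
            nxt = (x + dx, y + dy)
            # Stay on the grid (no negative coordinates) and inside the box.
            inside = min(nxt) >= 0 and limit >= max(nxt)
            if inside and nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return False


# (0,0) then (3,0) then (2,1) then (1,2); starting with (-1,+1) would leave the grid.
found = reachable([(3, 0), (-1, 1)], (1, 2), limit=5)
```

    &lt;p&gt;A &lt;code&gt;False&lt;/code&gt; result here proves nothing on its own, because a valid path may have to wander far outside any box you pick. That is a large part of why the real problem is so hard.&lt;/p&gt;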
    &lt;p&gt;This is PSPACE-complete, I think, which still isn't proven to be harder than NP-complete (though it's widely believed). But what if you increase the number of dimensions of the grid? Past a certain number of dimensions the problem jumps to being EXPSPACE-complete, and then TOWER-complete (grows &lt;a href="https://en.wikipedia.org/wiki/Tetration" rel="noopener noreferrer nofollow" target="_blank"&gt;tetrationally&lt;/a&gt;), and then it keeps going. Some people might recognize this as looking a lot like the &lt;a href="https://en.wikipedia.org/wiki/Ackermann_function" rel="noopener noreferrer nofollow" target="_blank"&gt;Ackermann function&lt;/a&gt;, and in fact this problem is &lt;a href="https://arxiv.org/abs/2104.13866" rel="noopener noreferrer nofollow" target="_blank"&gt;ACKERMANN-complete on the number of available dimensions&lt;/a&gt;.&lt;/p&gt;
    &lt;p&gt;&lt;a href="https://www.quantamagazine.org/an-easy-sounding-problem-yields-numbers-too-big-for-our-universe-20231204/" rel="noopener noreferrer nofollow" target="_blank"&gt;A friend wrote a Quanta article about the whole mess&lt;/a&gt;, you should read it.&lt;/p&gt;
    &lt;p&gt;This problem is ludicrously bigger than NP ("Chicago" instead of "The Moon"), but at least it's clearly decidable, easily explainable, and definitely &lt;em&gt;not&lt;/em&gt; in NP.&lt;/p&gt;
    &lt;div class="footnote"&gt;&lt;hr/&gt;&lt;ol class="footnotes"&gt;&lt;li data-id="37347adc-dba6-4629-9d24-c6252292ac6b" id="fn:1"&gt;&lt;p&gt;It's less confusing if you're taught the alternate (and original!) definition of NP, "the class of problems solvable in polynomial time by a nondeterministic Turing machine". Then HALT can't be in NP because otherwise runtime would be bounded by an exponential function. &lt;a class="footnote-backref" href="#fnref:1"&gt;↩&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;&lt;/ol&gt;&lt;/div&gt;</description>
                <pubDate>Wed, 16 Apr 2025 17:39:23 +0000</pubDate>
                <guid>https://buttondown.com/hillelwayne/archive/the-halting-problem-is-a-terrible-example-of-np/</guid>
            </item>
            <item>
                <title>Solving a "Layton Puzzle" with Prolog</title>
                <link>https://buttondown.com/hillelwayne/archive/a48fce5b-8a05-4302-b620-9b26f057f145/</link>
            <description>&lt;p&gt;I have a lot in the works for this month's &lt;a href="https://leanpub.com/logic/" target="_blank"&gt;Logic for Programmers&lt;/a&gt; release. Among other things, I'm completely rewriting the chapter on Logic Programming Languages. &lt;/p&gt;
    &lt;p&gt;I originally showcased the paradigm with puzzle solvers, like &lt;a href="https://swish.swi-prolog.org/example/queens.pl" target="_blank"&gt;eight queens&lt;/a&gt; or &lt;a href="https://saksagan.ceng.metu.edu.tr/courses/ceng242/documents/prolog/jrfisher/2_1.html" target="_blank"&gt;four-coloring&lt;/a&gt;. Lots of other demos do this too! It takes creativity and insight for humans to solve them, so a program doing it feels magical. But I'm trying to write a book about practical techniques and I want everything I talk about to be &lt;em&gt;useful&lt;/em&gt;. So in v0.9 I'll be replacing these examples with a couple of new programs that might get people thinking that Prolog could help them in their day-to-day work.&lt;/p&gt;
    &lt;p&gt;On the other hand, for a newsletter, showcasing a puzzle solver is pretty cool. And recently I stumbled into &lt;a href="https://morepablo.com/2010/09/some-professor-layton-prolog.html" target="_blank"&gt;this post&lt;/a&gt; by my friend &lt;a href="https://morepablo.com/" target="_blank"&gt;Pablo Meier&lt;/a&gt;, where he solves a videogame puzzle with Prolog:&lt;sup id="fnref:path"&gt;&lt;a class="footnote-ref" href="#fn:path"&gt;1&lt;/a&gt;&lt;/sup&gt;&lt;/p&gt;
    &lt;p&gt;&lt;img alt="See description below" class="newsletter-image" src="https://assets.buttondown.email/images/a4ee8689-bbce-4dc9-8175-a1de3bd8f2db.png?w=960&amp;amp;fit=max"/&gt;&lt;/p&gt;
    &lt;p&gt;Summary for the text-only readers: We have a test with 10 true/false questions (denoted &lt;code&gt;a/b&lt;/code&gt;) and four student attempts. Given the scores of the first three students, we have to figure out the fourth student's score.&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;bbababbabb = 7
    baaababaaa = 5
    baaabbbaba = 3
    bbaaabbaaa = ???
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;You can see Pablo's solution &lt;a href="https://morepablo.com/2010/09/some-professor-layton-prolog.html" target="_blank"&gt;here&lt;/a&gt;, and try it in SWI-prolog &lt;a href="https://swish.swi-prolog.org/p/Some%20Professor%20Layton%20Prolog.pl" target="_blank"&gt;here&lt;/a&gt;. Pretty cool! But after way too long studying Prolog just to write this dang book chapter, I wanted to see if I could do it more elegantly than him. Code and puzzle spoilers to follow.&lt;/p&gt;
    &lt;p&gt;(Normally here's where I'd link to a gentler introduction I wrote but I think this is my first time writing about Prolog online? Uh here's a &lt;a href="https://www.hillelwayne.com/post/picat/" target="_blank"&gt;Picat intro&lt;/a&gt; instead)&lt;/p&gt;
    &lt;h3&gt;The Program&lt;/h3&gt;
    &lt;p&gt;You can try this all online at &lt;a href="https://swish.swi-prolog.org/p/" target="_blank"&gt;SWISH&lt;/a&gt; or just jump to my final version &lt;a href="https://swish.swi-prolog.org/p/layton_prolog_puzzle.pl" target="_blank"&gt;here&lt;/a&gt;.&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="p"&gt;:-&lt;/span&gt; &lt;span class="nf"&gt;use_module&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;library&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s s-Atom"&gt;dif&lt;/span&gt;&lt;span class="p"&gt;)).&lt;/span&gt;    &lt;span class="c1"&gt;% Sound inequality&lt;/span&gt;
    &lt;span class="p"&gt;:-&lt;/span&gt; &lt;span class="nf"&gt;use_module&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;library&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s s-Atom"&gt;clpfd&lt;/span&gt;&lt;span class="p"&gt;)).&lt;/span&gt;  &lt;span class="c1"&gt;% Finite domain constraints&lt;/span&gt;
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;First some imports. &lt;code&gt;dif&lt;/code&gt; lets us write &lt;code&gt;dif(A, B)&lt;/code&gt;, which is true if &lt;code&gt;A&lt;/code&gt; and &lt;code&gt;B&lt;/code&gt; are &lt;em&gt;not&lt;/em&gt; equal. &lt;code&gt;clpfd&lt;/code&gt; lets us write &lt;code&gt;A #= B + 1&lt;/code&gt; to say "A is 1 more than B".&lt;sup id="fnref:superior"&gt;&lt;a class="footnote-ref" href="#fn:superior"&gt;2&lt;/a&gt;&lt;/sup&gt;&lt;/p&gt;
    &lt;p&gt;We'll say both the student submission and the key will be lists, where each value is &lt;code&gt;a&lt;/code&gt; or &lt;code&gt;b&lt;/code&gt;. In Prolog, lowercase identifiers are &lt;strong&gt;atoms&lt;/strong&gt; (like symbols in other languages) and identifiers that start with a capital are &lt;strong&gt;variables&lt;/strong&gt;. Prolog finds values for variables that match equations (&lt;strong&gt;unification&lt;/strong&gt;). The pattern matching is real real good.&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="c1"&gt;% ?- means query&lt;/span&gt;
    &lt;span class="s s-Atom"&gt;?-&lt;/span&gt; &lt;span class="nv"&gt;L&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s s-Atom"&gt;a&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="nv"&gt;B&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="s s-Atom"&gt;c&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;Y&lt;/span&gt;&lt;span class="p"&gt;|&lt;/span&gt;&lt;span class="nv"&gt;X&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;|&lt;/span&gt;&lt;span class="nv"&gt;L&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="nv"&gt;B&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt; &lt;span class="s s-Atom"&gt;#=&lt;/span&gt; &lt;span class="mf"&gt;7.&lt;/span&gt;
    
    &lt;span class="nv"&gt;B&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;6&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="nv"&gt;L&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s s-Atom"&gt;a&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;6&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;c&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
    &lt;span class="nv"&gt;X&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;a&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;6&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;c&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
    &lt;span class="nv"&gt;Y&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;Next, we define &lt;code&gt;score/3&lt;/code&gt;&lt;sup id="fnref:arity"&gt;&lt;a class="footnote-ref" href="#fn:arity"&gt;3&lt;/a&gt;&lt;/sup&gt; recursively. &lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="c1"&gt;% The student's test score&lt;/span&gt;
    &lt;span class="c1"&gt;% score(student answers, answer key, score)&lt;/span&gt;
    &lt;span class="nf"&gt;score&lt;/span&gt;&lt;span class="p"&gt;([],&lt;/span&gt; &lt;span class="p"&gt;[],&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;
    &lt;span class="nf"&gt;score&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="nv"&gt;A&lt;/span&gt;&lt;span class="p"&gt;|&lt;/span&gt;&lt;span class="nv"&gt;As&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;A&lt;/span&gt;&lt;span class="p"&gt;|&lt;/span&gt;&lt;span class="nv"&gt;Ks&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="nv"&gt;N&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;:-&lt;/span&gt;
       &lt;span class="nv"&gt;N&lt;/span&gt; &lt;span class="s s-Atom"&gt;#=&lt;/span&gt; &lt;span class="nv"&gt;M&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nf"&gt;score&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;As&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;Ks&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;M&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;
    &lt;span class="nf"&gt;score&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="nv"&gt;A&lt;/span&gt;&lt;span class="p"&gt;|&lt;/span&gt;&lt;span class="nv"&gt;As&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;K&lt;/span&gt;&lt;span class="p"&gt;|&lt;/span&gt;&lt;span class="nv"&gt;Ks&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="nv"&gt;N&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;:-&lt;/span&gt; 
        &lt;span class="nf"&gt;dif&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;A&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;K&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="nf"&gt;score&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;As&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;Ks&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;N&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;The first argument is the student's answers, the second is the answer key, and the third is the final score. The base case is the empty test, which has score 0. Otherwise, we take the head values of each list and compare them. If they're the same, we add one to the score, otherwise we keep the same score. &lt;/p&gt;
    &lt;p&gt;Notice we couldn't write &lt;code&gt;if x then y else z&lt;/code&gt;, we instead used pattern matching to effectively express &lt;code&gt;(x &amp;amp;&amp;amp; y) || (!x &amp;amp;&amp;amp; z)&lt;/code&gt;. Prolog does have a conditional operator, but it prevents backtracking so what's the point???&lt;/p&gt;
    &lt;p class="empty-line" style="height:16px; margin:0px !important;"&gt;&lt;/p&gt;
    &lt;h3&gt;A quick break about bidirectionality&lt;/h3&gt;
    &lt;p&gt;One of the coolest things about Prolog: all purely logical predicates are bidirectional. We can use &lt;code&gt;score&lt;/code&gt; to check if our expected score is correct:&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="s s-Atom"&gt;?-&lt;/span&gt; &lt;span class="nf"&gt;score&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="s s-Atom"&gt;a&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;b&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;b&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s s-Atom"&gt;b&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;b&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;b&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;
    &lt;span class="s s-Atom"&gt;true&lt;/span&gt;
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;But we can also give it answers and a key and ask it for the score:&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="s s-Atom"&gt;?-&lt;/span&gt; &lt;span class="nf"&gt;score&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="s s-Atom"&gt;a&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;b&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;b&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s s-Atom"&gt;b&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;b&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;b&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="nv"&gt;X&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;
    &lt;span class="nv"&gt;X&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;&lt;em&gt;Or&lt;/em&gt; we could give it a key and a score and ask "what test answers would have this score?"&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="s s-Atom"&gt;?-&lt;/span&gt; &lt;span class="nf"&gt;score&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;X&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s s-Atom"&gt;b&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;b&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;b&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;
    &lt;span class="nv"&gt;X&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s s-Atom"&gt;b&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;b&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;_&lt;/span&gt;&lt;span class="nv"&gt;A&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
    &lt;span class="nf"&gt;dif&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;_&lt;/span&gt;&lt;span class="nv"&gt;A&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="s s-Atom"&gt;b&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="nv"&gt;X&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s s-Atom"&gt;b&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;_&lt;/span&gt;&lt;span class="nv"&gt;A&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;b&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
    &lt;span class="nf"&gt;dif&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;_&lt;/span&gt;&lt;span class="nv"&gt;A&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="s s-Atom"&gt;b&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="nv"&gt;X&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="k"&gt;_&lt;/span&gt;&lt;span class="nv"&gt;A&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;b&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;b&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
    &lt;span class="nf"&gt;dif&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;_&lt;/span&gt;&lt;span class="nv"&gt;A&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="s s-Atom"&gt;b&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;The different value is written &lt;code&gt;_A&lt;/code&gt; because we never told Prolog that the list can &lt;em&gt;only&lt;/em&gt; contain &lt;code&gt;a&lt;/code&gt; and &lt;code&gt;b&lt;/code&gt;. We'll fix this later.&lt;/p&gt;
    &lt;h3&gt;Okay back to the program&lt;/h3&gt;
    &lt;p&gt;Now that we have a way of computing scores, we want to find a possible answer key that matches all of our observations, ie gives everybody the correct scores.&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="nf"&gt;key&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;Key&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;:-&lt;/span&gt;
        &lt;span class="c1"&gt;% Figure it out&lt;/span&gt;
        &lt;span class="nf"&gt;score&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="s s-Atom"&gt;b&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;b&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;a&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;b&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;a&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;b&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;b&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;a&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;b&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;b&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="nv"&gt;Key&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;7&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
        &lt;span class="nf"&gt;score&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="s s-Atom"&gt;b&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;a&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;a&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;a&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;b&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;a&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;b&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;a&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;a&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;a&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="nv"&gt;Key&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
        &lt;span class="nf"&gt;score&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="s s-Atom"&gt;b&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;a&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;a&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;a&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;b&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;b&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;b&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;a&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;b&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;a&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="nv"&gt;Key&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;So far we haven't explicitly said that the &lt;code&gt;Key&lt;/code&gt; length matches the student answer lengths. This is implicitly verified by &lt;code&gt;score&lt;/code&gt; (both lists need to be empty at the same time) but it's a good idea to explicitly add &lt;code&gt;length(Key, 10)&lt;/code&gt; as a clause of &lt;code&gt;key/1&lt;/code&gt;. We should also explicitly say that every element of &lt;code&gt;Key&lt;/code&gt; is either &lt;code&gt;a&lt;/code&gt; or &lt;code&gt;b&lt;/code&gt;.&lt;sup id="fnref:explicit"&gt;&lt;a class="footnote-ref" href="#fn:explicit"&gt;4&lt;/a&gt;&lt;/sup&gt; Now we &lt;em&gt;could&lt;/em&gt; write a second predicate saying &lt;code&gt;Key&lt;/code&gt; had the right 'type': &lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;keytype([]).
    keytype([K|Ks]) :- member(K, [a, b]), keytype(Ks).
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;But "generating lists that match a constraint" is a thing that comes up often enough that we don't want to write a separate predicate for each constraint! So after some digging, I found a more elegant solution: &lt;code&gt;maplist&lt;/code&gt;. Let &lt;code&gt;L=[l1, l2]&lt;/code&gt;. Then &lt;code&gt;maplist(p, L)&lt;/code&gt; is equivalent to the clause &lt;code&gt;p(l1), p(l2)&lt;/code&gt;. It also accepts partial predicates: &lt;code&gt;maplist(p(x), L)&lt;/code&gt; is equivalent to &lt;code&gt;p(x, l1), p(x, l2)&lt;/code&gt;. So we could write&lt;sup id="fnref:yall"&gt;&lt;a class="footnote-ref" href="#fn:yall"&gt;5&lt;/a&gt;&lt;/sup&gt;&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="nf"&gt;contains&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;L&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;X&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;:-&lt;/span&gt; &lt;span class="nf"&gt;member&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;X&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;L&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;
    
    &lt;span class="nf"&gt;key&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;Key&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;:-&lt;/span&gt;
        &lt;span class="nf"&gt;length&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;Key&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
        &lt;span class="nf"&gt;maplist&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;contains&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="s s-Atom"&gt;a&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="s s-Atom"&gt;b&lt;/span&gt;&lt;span class="p"&gt;]),&lt;/span&gt; &lt;span class="nv"&gt;Key&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
        &lt;span class="c1"&gt;% the score stuff&lt;/span&gt;
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;Now, let's query for the Key:&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="s s-Atom"&gt;?-&lt;/span&gt; &lt;span class="nf"&gt;key&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;Key&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;
    &lt;span class="nv"&gt;Key&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s s-Atom"&gt;a&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;b&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;a&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;b&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;a&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;a&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;b&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;a&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;a&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;b&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
    &lt;span class="nv"&gt;Key&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s s-Atom"&gt;b&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;b&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;a&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;b&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;a&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;a&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;a&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;a&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;a&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;b&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
    &lt;span class="nv"&gt;Key&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s s-Atom"&gt;b&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;b&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;a&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;b&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;a&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;a&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;b&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;b&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;a&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;b&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
    &lt;span class="nv"&gt;Key&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s s-Atom"&gt;b&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;b&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;b&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;b&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;a&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;a&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;b&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;a&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;a&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;b&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;So there are actually four &lt;em&gt;different&lt;/em&gt; keys that all explain our data. Does this mean the puzzle is broken and has multiple different answers?&lt;/p&gt;
    &lt;h3&gt;Nope&lt;/h3&gt;
    &lt;p&gt;The puzzle wasn't to find the answer key; it was to find the fourth student's score. And if we query for it, we see all four solutions give him the same score:&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="s s-Atom"&gt;?-&lt;/span&gt; &lt;span class="nf"&gt;key&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;Key&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="nf"&gt;score&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="s s-Atom"&gt;b&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;b&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;a&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;a&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;a&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;b&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;b&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;a&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;a&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;a&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="nv"&gt;Key&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;X&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;
    &lt;span class="nv"&gt;X&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;6&lt;/span&gt;
    &lt;span class="nv"&gt;X&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;6&lt;/span&gt;
    &lt;span class="nv"&gt;X&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;6&lt;/span&gt;
    &lt;span class="nv"&gt;X&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;6&lt;/span&gt;
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;Huh! I really like it when puzzles look like they're broken, but every "alternate" solution still gives the same puzzle answer.&lt;/p&gt;
    &lt;p&gt;Total program length: 15 lines of code, compared to the original's 80 lines. &lt;em&gt;Suck it, Pablo.&lt;/em&gt;&lt;/p&gt;
    &lt;p&gt;(Incidentally, you can get all of the answers at once by writing &lt;code&gt;findall(X, (key(Key), score($answer-array, Key, X)), L).&lt;/code&gt;)&lt;/p&gt;
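    &lt;p&gt;As a cross-check outside Prolog, here's a minimal Python sketch that rescores the fourth student against each of the four keys from the query above. It assumes &lt;code&gt;score&lt;/code&gt; counts the positions where an answer matches the key; the Python names &lt;code&gt;score&lt;/code&gt;, &lt;code&gt;KEYS&lt;/code&gt;, and &lt;code&gt;FOURTH&lt;/code&gt; are only for illustration.&lt;/p&gt;

```python
# Hypothetical Python analog of the findall query: collect the fourth
# student's score under every candidate key.

def score(answers, key):
    """Count positions where the answer matches the key (assumed scoring rule)."""
    return sum(1 for a, k in zip(answers, key) if a == k)

# The four keys printed by the key(Key) query, written as strings.
KEYS = [
    "ababaabaab",
    "bbabaaaaab",
    "bbabaabbab",
    "bbbbaabaab",
]

# The fourth student's answers, taken from the score query.
FOURTH = "bbaaabbaaa"

scores = [score(FOURTH, key) for key in KEYS]
print(scores)  # [6, 6, 6, 6]
```

    &lt;p&gt;This only checks keys we already have. Unlike the Prolog version, it doesn't search for them.&lt;/p&gt;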
    &lt;p class="empty-line" style="height:16px; margin:0px !important;"&gt;&lt;/p&gt;
    &lt;h3&gt;I still don't like puzzles for teaching&lt;/h3&gt;
    &lt;p&gt;The actual examples I'm using in &lt;a href="https://leanpub.com/logic/" target="_blank"&gt;the book&lt;/a&gt; are "analyzing a version control commit graph" and "planning a sequence of infrastructure changes", which are somewhat more likely to occur at work than needing to solve a puzzle. You'll see them in the next release!&lt;/p&gt;
    &lt;div class="footnote"&gt;
    &lt;hr/&gt;
    &lt;ol&gt;
    &lt;li id="fn:path"&gt;
    &lt;p&gt;I found it because he wrote &lt;a href="https://morepablo.com/2025/04/gamer-games-for-lite-gamers.html" target="_blank"&gt;Gamer Games for Lite Gamers&lt;/a&gt; as a response to my &lt;a href="https://www.hillelwayne.com/post/vidja-games/" target="_blank"&gt;Gamer Games for Non-Gamers&lt;/a&gt;. &lt;a class="footnote-backref" href="#fnref:path" title="Jump back to footnote 1 in the text"&gt;↩&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id="fn:superior"&gt;
    &lt;p&gt;These are better versions of the core Prolog expressions &lt;code&gt;\+ (A = B)&lt;/code&gt; and &lt;code&gt;A is B + 1&lt;/code&gt;, because they can &lt;a href="https://eu.swi-prolog.org/pldoc/man?predicate=dif/2" target="_blank"&gt;defer unification&lt;/a&gt;. &lt;a class="footnote-backref" href="#fnref:superior" title="Jump back to footnote 2 in the text"&gt;↩&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id="fn:arity"&gt;
    &lt;p&gt;Prolog-descendants have a convention of writing the arity of the function after its name, so &lt;code&gt;score/3&lt;/code&gt; means "score has three parameters". I think they do this because you can overload predicates with multiple different arities. Also Joe Armstrong used Prolog for prototyping, so Erlang and Elixir follow the same convention. &lt;a class="footnote-backref" href="#fnref:arity" title="Jump back to footnote 3 in the text"&gt;↩&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id="fn:explicit"&gt;
    &lt;p&gt;It &lt;em&gt;still&lt;/em&gt; gets the right answers without this type restriction, but I had no idea it did until I checked for myself. Probably better not to rely on this! &lt;a class="footnote-backref" href="#fnref:explicit" title="Jump back to footnote 4 in the text"&gt;↩&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id="fn:yall"&gt;
    &lt;p&gt;We could make this even more compact by using a lambda function. First import module &lt;code&gt;yall&lt;/code&gt;, then write &lt;code&gt;maplist([X]&amp;gt;&amp;gt;member(X, [a,b]), Key)&lt;/code&gt;. But (1) it's not a shorter program because you replace the extra definition with an extra module import, and (2) &lt;code&gt;yall&lt;/code&gt; is SWI-Prolog-specific and not an ISO-standard Prolog module. Using &lt;code&gt;contains&lt;/code&gt; is more portable. &lt;a class="footnote-backref" href="#fnref:yall" title="Jump back to footnote 5 in the text"&gt;↩&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;/ol&gt;
    &lt;/div&gt;</description>
                <pubDate>Tue, 08 Apr 2025 18:34:50 +0000</pubDate>
                <guid>https://buttondown.com/hillelwayne/archive/a48fce5b-8a05-4302-b620-9b26f057f145/</guid>
            </item>
            <item>
                <title>[April Cools] Gaming Games for Non-Gamers</title>
                <link>https://buttondown.com/hillelwayne/archive/april-cools-gaming-games-for-non-gamers/</link>
                <description>&lt;p&gt;My &lt;em&gt;April Cools&lt;/em&gt; is out! &lt;a href="https://www.hillelwayne.com/post/vidja-games/" target="_blank"&gt;Gaming Games for Non-Gamers&lt;/a&gt; is a 3,000 word essay on video games worth playing if you've never enjoyed a video game before. &lt;a href="https://www.patreon.com/posts/blog-notes-gamer-125654321?utm_medium=clipboard_copy&amp;amp;utm_source=copyLink&amp;amp;utm_campaign=postshare_creator&amp;amp;utm_content=join_link" target="_blank"&gt;Patreon notes here&lt;/a&gt;.&lt;/p&gt;
    &lt;p&gt;(April Cools is a project where we write genuine content on non-normal topics. You can see all the other April Cools posted so far &lt;a href="https://www.aprilcools.club/" target="_blank"&gt;here&lt;/a&gt;. There's still time to submit your own!)&lt;/p&gt;
    &lt;a class="embedded-link" href="https://www.aprilcools.club/"&gt; &lt;div style="width: 100%; background: #fff; border: 1px #ced3d9 solid; border-radius: 5px; margin-top: 1em; overflow: auto; margin-bottom: 1em;"&gt; &lt;div style="float: left; border-bottom: 1px #ced3d9 solid;"&gt; &lt;img class="link-image" src="https://www.aprilcools.club/aprilcoolsclub.png"/&gt; &lt;/div&gt; &lt;div style="float: left; color: #393f48; padding-left: 1em; padding-right: 1em;"&gt; &lt;h4 class="link-title" style="margin-bottom: 0em; line-height: 1.25em; margin-top: 1em; font-size: 14px;"&gt;                April Cools' Club&lt;/h4&gt; &lt;/div&gt; &lt;/div&gt;&lt;/a&gt;</description>
                <pubDate>Tue, 01 Apr 2025 16:04:59 +0000</pubDate>
                <guid>https://buttondown.com/hillelwayne/archive/april-cools-gaming-games-for-non-gamers/</guid>
            </item>
            <item>
                <title>Betteridge's Law of Software Engineering Specialness</title>
                <link>https://buttondown.com/hillelwayne/archive/betteridges-law-of-software-engineering/</link>
                <description>&lt;h3&gt;Logic for Programmers v0.8 now out!&lt;/h3&gt;
    &lt;p&gt;The new release has minor changes: new formatting for notes and a better introduction to predicates. I would have rolled it all into v0.9 next month but I like the monthly cadence. &lt;a href="https://leanpub.com/logic/" target="_blank"&gt;Get it here!&lt;/a&gt;&lt;/p&gt;
    &lt;h1&gt;Betteridge's Law of Software Engineering Specialness&lt;/h1&gt;
    &lt;p&gt;In &lt;a href="https://agileotter.blogspot.com/2025/03/there-is-no-automatic-reset-in.html" target="_blank"&gt;There is No Automatic Reset in Engineering&lt;/a&gt;, Tim Ottinger asks:&lt;/p&gt;
    &lt;blockquote&gt;
    &lt;p&gt;Do the other people have to live with January 2013 for the rest of their lives? Or is it only engineering that has to deal with every dirty hack since the beginning of the organization?&lt;/p&gt;
    &lt;/blockquote&gt;
    &lt;p&gt;&lt;strong&gt;Betteridge's Law of Headlines&lt;/strong&gt; says that if a journalism headline ends with a question mark, the answer is probably "no". I propose a similar law relating to software engineering specialness:&lt;sup id="fnref:ottinger"&gt;&lt;a class="footnote-ref" href="#fn:ottinger"&gt;1&lt;/a&gt;&lt;/sup&gt;&lt;/p&gt;
    &lt;blockquote&gt;
    &lt;p&gt;If someone asks if some aspect of software development is truly unique to just software development, the answer is probably "no".&lt;/p&gt;
    &lt;/blockquote&gt;
    &lt;p&gt;Take the idea that "in software, hacks are forever." My favorite example of this comes from a different profession. The &lt;a href="https://en.wikipedia.org/wiki/Dewey_Decimal_Classification" target="_blank"&gt;Dewey Decimal System&lt;/a&gt; hierarchically categorizes books by discipline. For example, &lt;em&gt;&lt;a href="https://www.librarything.com/work/10143437/t/Covered-Bridges-of-Pennsylvania" target="_blank"&gt;Covered Bridges of Pennsylvania&lt;/a&gt;&lt;/em&gt; has Dewey number &lt;code&gt;624.37&lt;/code&gt;. &lt;code&gt;6--&lt;/code&gt; is the technology discipline, &lt;code&gt;62-&lt;/code&gt; is engineering, &lt;code&gt;624&lt;/code&gt; is civil engineering, and &lt;code&gt;624.3&lt;/code&gt; is "special types of bridges". I have no idea what the last &lt;code&gt;0.07&lt;/code&gt; means, but you get the picture.&lt;/p&gt;
    &lt;p&gt;Now if you look at the &lt;a href="https://www.librarything.com/mds/6" target="_blank"&gt;6-- "technology" breakdown&lt;/a&gt;, you'll see that there's no "software" subdiscipline. This is because Dewey preallocated the whole technology block in 1876. New topics were instead to be added to the &lt;code&gt;00-&lt;/code&gt; "general-knowledge" catch-all. Eventually &lt;code&gt;005&lt;/code&gt; was assigned to "software development", meaning &lt;em&gt;The C Programming Language&lt;/em&gt; lives at &lt;code&gt;005.133&lt;/code&gt;.&lt;/p&gt;
    &lt;p&gt;Incidentally, another late addition to the general knowledge block is &lt;code&gt;001.9&lt;/code&gt;: "controversial knowledge". &lt;/p&gt;
    &lt;p&gt;And that's why my hometown library shelved the C++ books right next to &lt;em&gt;The Mothman Prophecies&lt;/em&gt;.&lt;/p&gt;
    &lt;p&gt;How's &lt;em&gt;that&lt;/em&gt; for technical debt?&lt;/p&gt;
    &lt;p&gt;If anything, fixing hacks in software is significantly &lt;em&gt;easier&lt;/em&gt; than in other fields. This came up when I was &lt;a href="https://www.hillelwayne.com/post/we-are-not-special/" target="_blank"&gt;interviewing classic engineers&lt;/a&gt;. Kludges happen all the time, but "refactoring" them out is &lt;em&gt;expensive&lt;/em&gt;. Need to house a machine that's just two inches taller than the room? Guess what, you're cutting a hole in the ceiling.&lt;/p&gt;
    &lt;p&gt;(Even if we restrict the question to other departments in a &lt;em&gt;software company&lt;/em&gt;, we can find kludges that are horrible to undo. I once worked for a company which landed an early contract by adding a bespoke support agreement for that one customer. That plagued them for years afterward.)&lt;/p&gt;
    &lt;p&gt;That's not to say that there aren't things that are different about software vs other fields!&lt;sup id="fnref:example"&gt;&lt;a class="footnote-ref" href="#fn:example"&gt;2&lt;/a&gt;&lt;/sup&gt;  But I think that &lt;em&gt;most&lt;/em&gt; of the time, when we say "software development is the only profession that deals with XYZ", it's only because we're ignorant of how those other professions work.&lt;/p&gt;
    &lt;hr/&gt;
    &lt;p&gt;Short newsletter because I'm way behind on writing my &lt;a href="https://www.aprilcools.club/" target="_blank"&gt;April Cools&lt;/a&gt;. If you're interested in April Cools, you should try it out! I make it &lt;em&gt;way&lt;/em&gt; harder on myself than it actually needs to be— everybody else who participates finds it pretty chill.&lt;/p&gt;
    &lt;div class="footnote"&gt;
    &lt;hr/&gt;
    &lt;ol&gt;
    &lt;li id="fn:ottinger"&gt;
    &lt;p&gt;Ottinger caveats it with "engineering, software or otherwise", so I think he knows that other branches of &lt;em&gt;engineering&lt;/em&gt;, at least, have kludges. &lt;a class="footnote-backref" href="#fnref:ottinger" title="Jump back to footnote 1 in the text"&gt;↩&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id="fn:example"&gt;
    &lt;p&gt;The "software is different" idea that I'm most sympathetic to is that in software, the tools we use and the products we create are made from the same material. That's unusual at least in classic engineering. Then again, plenty of machinists have made their own lathes and mills! &lt;a class="footnote-backref" href="#fnref:example" title="Jump back to footnote 2 in the text"&gt;↩&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;/ol&gt;
    &lt;/div&gt;</description>
                <pubDate>Wed, 26 Mar 2025 18:48:39 +0000</pubDate>
                <guid>https://buttondown.com/hillelwayne/archive/betteridges-law-of-software-engineering/</guid>
            </item>
            <item>
                <title>Verification-First Development</title>
                <link>https://buttondown.com/hillelwayne/archive/verification-first-development/</link>
                <description>&lt;p&gt;A while back I argued on the Blue Site&lt;sup id="fnref:li"&gt;&lt;a class="footnote-ref" href="#fn:li"&gt;1&lt;/a&gt;&lt;/sup&gt; that "test-first development" (TFD) was different than "test-driven development" (TDD). The former is "write tests before you write code", the latter is a paradigm, culture, and collection of norms that's based on TFD. More broadly, TFD is a special case of &lt;strong&gt;Verification-First Development&lt;/strong&gt; and TDD is not.&lt;/p&gt;
    &lt;blockquote&gt;
    &lt;p&gt;VFD: before writing code, put in place some means of verifying that the code is correct, or at least have an idea of what you'll do.&lt;/p&gt;
    &lt;/blockquote&gt;
    &lt;p&gt;"Verifying" could mean writing tests, or figuring out how to encode invariants in types, or &lt;a href="https://blog.regehr.org/archives/1091" target="_blank"&gt;adding contracts&lt;/a&gt;, or &lt;a href="https://learntla.com/" target="_blank"&gt;making a formal model&lt;/a&gt;, or writing a separate script that checks the output of the program. Just have &lt;em&gt;something&lt;/em&gt; appropriate in place that you can run as you go building the code. Ideally, we'd have verification in place for every interesting property, but that's rarely possible in practice. &lt;/p&gt;
    &lt;p&gt;Oftentimes we can't make the verification until the code is partially complete. In that case it still helps to figure out the verification we'll write later. The point is to have a &lt;em&gt;plan&lt;/em&gt; and follow it promptly.&lt;/p&gt;
    &lt;p&gt;I'm using "code" as a stand-in for anything we programmers make, not just software programs. When using constraint solvers, I try to find representative problems I know the answers to. When writing formal specifications, I figure out the system's properties before the design that satisfies those properties. There are probably equivalents in security and other topics, too.&lt;/p&gt;
    &lt;h3&gt;The Benefits of VFD&lt;/h3&gt;
    &lt;ol&gt;
    &lt;li&gt;Doing verification before coding makes it less likely we'll skip verification entirely. It's the professional equivalent of "No TV until you do your homework."&lt;/li&gt;
    &lt;li&gt;It's easier to make sure a verifier works properly if we start by running it on code we know doesn't pass it. Bebugging working code takes more discipline.&lt;/li&gt;
    &lt;li&gt;We can run checks earlier in the development process. It's better to realize that our code is broken five minutes after we broke it rather than two hours after.&lt;/li&gt;
    &lt;/ol&gt;
    &lt;p&gt;That's it, those are the benefits of verification-first development. Those are also &lt;em&gt;big&lt;/em&gt; benefits for relatively little investment. Specializations of VFD like test-first development can have more benefits, but also more drawbacks.&lt;/p&gt;
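    &lt;p&gt;To make the second benefit concrete, here's a minimal Python sketch, assuming the thing we're building is an order-preserving deduplication function. All the names (&lt;code&gt;check_dedupe&lt;/code&gt;, &lt;code&gt;dedupe_stub&lt;/code&gt;, &lt;code&gt;dedupe&lt;/code&gt;) are hypothetical. The verifier is written first and run against a stub we know is wrong, so we see it fail before we trust it to pass.&lt;/p&gt;

```python
def check_dedupe(fn):
    """Verifier written before the real code: True if fn passes every case."""
    cases = [
        ([], []),
        ([1, 1, 2], [1, 2]),
        ([3, 1, 3, 2, 1], [3, 1, 2]),
    ]
    return all(fn(list(inp)) == expected for inp, expected in cases)


def dedupe_stub(items):
    """Deliberately wrong placeholder: returns the input unchanged."""
    return items


def dedupe(items):
    """Real implementation: drop repeats, keeping first occurrences in order."""
    seen = set()
    result = []
    for item in items:
        if item not in seen:
            seen.add(item)
            result.append(item)
    return result


print(check_dedupe(dedupe_stub))  # False: the verifier rejects the known-bad stub
print(check_dedupe(dedupe))       # True
```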
    &lt;h3&gt;The drawbacks of VFD&lt;/h3&gt;
    &lt;ol&gt;
    &lt;li&gt;It slows us down. I know lots of people say that "no actually it makes you go faster in the long run," but that's the &lt;em&gt;long&lt;/em&gt; run. Sometimes we do marathons, sometimes we sprint.&lt;/li&gt;
    &lt;li&gt;Verification gets in the way of exploratory coding, where we don't know what exactly we want or how exactly to do something.&lt;/li&gt;
    &lt;li&gt;Any specific form of verification exerts a pressure on our code to make it easier to verify with that method. For example, if we're mostly verifying via type invariants, we need to figure out how to express those things in our language's type system, which may not be suited for the specific invariants we need.&lt;sup id="fnref:sphinx"&gt;&lt;a class="footnote-ref" href="#fn:sphinx"&gt;2&lt;/a&gt;&lt;/sup&gt;&lt;/li&gt;
    &lt;/ol&gt;
    &lt;h3&gt;Whether "pressure" is a real drawback is incredibly controversial&lt;/h3&gt;
    &lt;p&gt;If I had to summarize what makes "test-driven development" different from VFD:&lt;sup id="fnref:tdd"&gt;&lt;a class="footnote-ref" href="#fn:tdd"&gt;3&lt;/a&gt;&lt;/sup&gt;&lt;/p&gt;
    &lt;ol&gt;
    &lt;li&gt;The form of verification should specifically be tests, and unit tests at that&lt;/li&gt;
    &lt;li&gt;Testing pressure is invariably good. "Making your code easier to unit test" is the same as "making your code better".&lt;/li&gt;
    &lt;/ol&gt;
    &lt;p&gt;This is something all of the various "drivens"— TDD, Type Driven Development, Design by Contract— share in common, this idea that the purpose of the paradigm is to exert pressure. Lots of TDD experts claim that "having a good test suite" is only the secondary benefit of TDD and the real benefit is how it improves code quality.&lt;sup id="fnref:docs"&gt;&lt;a class="footnote-ref" href="#fn:docs"&gt;4&lt;/a&gt;&lt;/sup&gt;&lt;/p&gt;
    &lt;p&gt;Whether they're right or not is not something I want to argue: I've seen these approaches all improve my code structure, but also sometimes worsen it. Regardless, I consider pressure a drawback to VFD in general, for a somewhat idiosyncratic reason. If it &lt;em&gt;weren't&lt;/em&gt; for pressure, VFD would be wholly independent of the code itself. It would &lt;em&gt;just&lt;/em&gt; be about verification, and our decisions would exclusively be about how we want to verify. But the design pressure means that our means of verification affects the system we're checking. What if these conflict in some way?&lt;/p&gt;
    &lt;h3&gt;VFD is a technique, not a paradigm&lt;/h3&gt;
    &lt;p&gt;One of the main differences between "techniques" and "paradigms" is that paradigms don't play well with each other. If you tried to do both "proper" Test-Driven Development and "proper" Cleanroom, your head would explode. Whereas VFD being a "technique" means it works well with other techniques and even with many full paradigms.&lt;/p&gt;
    &lt;p&gt;It also doesn't take a whole lot of practice to start using. It does take practice, both in thinking of verifications and in using the particular verification method involved, to &lt;em&gt;use well&lt;/em&gt;, but we can use it poorly and still benefit.&lt;/p&gt;
    &lt;div class="footnote"&gt;
    &lt;hr/&gt;
    &lt;ol&gt;
    &lt;li id="fn:li"&gt;
    &lt;p&gt;LinkedIn, what did you think I meant? &lt;a class="footnote-backref" href="#fnref:li" title="Jump back to footnote 1 in the text"&gt;↩&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id="fn:sphinx"&gt;
    &lt;p&gt;This bit me in the butt when making my own &lt;a href="https://www.sphinx-doc.org/en/master/" target="_blank"&gt;sphinx&lt;/a&gt; extensions. The official guides do things in a highly dynamic way that Mypy can't statically check. I had to do things in a completely different way. Ended up being better though! &lt;a class="footnote-backref" href="#fnref:sphinx" title="Jump back to footnote 2 in the text"&gt;↩&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id="fn:tdd"&gt;
    &lt;p&gt;Someone's going to yell at me that I completely missed the point of TDD, which is XYZ. Well guess what, someone else &lt;em&gt;already&lt;/em&gt; yelled at me that only dumb idiot babies think XYZ is important in TDD. Put in whatever you want for XYZ. &lt;a class="footnote-backref" href="#fnref:tdd" title="Jump back to footnote 3 in the text"&gt;↩&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id="fn:docs"&gt;
    &lt;p&gt;Another thing that weirdly all of the paradigms claim: that they lead to better documentation. I can see the argument, I just find it strange that &lt;em&gt;every single one&lt;/em&gt; makes this claim! &lt;a class="footnote-backref" href="#fnref:docs" title="Jump back to footnote 4 in the text"&gt;↩&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;/ol&gt;
    &lt;/div&gt;</description>
                <pubDate>Tue, 18 Mar 2025 16:22:20 +0000</pubDate>
                <guid>https://buttondown.com/hillelwayne/archive/verification-first-development/</guid>
            </item>
            <item>
                <title>New Blog Post: "A Perplexing Javascript Parsing Puzzle"</title>
                <link>https://buttondown.com/hillelwayne/archive/new-blog-post-a-perplexing-javascript-parsing/</link>
                <description>&lt;p&gt;I know I said we'd be back to normal newsletters this week and in fact had 80% of one already written. &lt;/p&gt;
    &lt;p&gt;Then I unearthed something that was better left buried.&lt;/p&gt;
    &lt;p&gt;&lt;a href="http://www.hillelwayne.com/post/javascript-puzzle/" target="_blank"&gt;Blog post here&lt;/a&gt;, &lt;a href="https://www.patreon.com/posts/blog-notes-124153641" target="_blank"&gt;Patreon notes here&lt;/a&gt; (Mostly an explanation of how I found this horror in the first place). Next week I'll send what was supposed to be this week's piece.&lt;/p&gt;
    &lt;p&gt;(PS: &lt;a href="https://www.aprilcools.club/" target="_blank"&gt;April Cools&lt;/a&gt; in three weeks!)&lt;/p&gt;</description>
                <pubDate>Wed, 12 Mar 2025 14:49:52 +0000</pubDate>
                <guid>https://buttondown.com/hillelwayne/archive/new-blog-post-a-perplexing-javascript-parsing/</guid>
            </item>
            <item>
                <title>Five Kinds of Nondeterminism</title>
                <link>https://buttondown.com/hillelwayne/archive/five-kinds-of-nondeterminism/</link>
                <description>&lt;p&gt;No newsletter next week, I'm teaching a TLA+ workshop.&lt;/p&gt;
    &lt;p&gt;Speaking of which: I spend a lot of time thinking about formal methods (and TLA+ specifically) because it's the source of almost all my revenue. But I don't share most of the details because 90% of my readers don't use FM and never will. I think it's more interesting to talk about ideas &lt;em&gt;from&lt;/em&gt; FM that would be useful to people outside that field. For example, the idea of "property strength" translates to the &lt;a href="https://buttondown.com/hillelwayne/archive/some-tests-are-stronger-than-others/" target="_blank"&gt;idea that some tests are stronger than others&lt;/a&gt;.&lt;/p&gt;
    &lt;p&gt;Another possible export is how FM approaches nondeterminism. A &lt;strong&gt;nondeterministic&lt;/strong&gt; algorithm is one that, from the same starting conditions, has multiple possible outputs. This is nondeterministic:&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;# Pseudocode
    
    def f() {
        return rand()+1;
    }
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;When specifying systems, I may not &lt;em&gt;encounter&lt;/em&gt; nondeterminism more often than in real systems, but I am definitely more aware of its presence. Modeling nondeterminism is a core part of formal specification. I mentally categorize nondeterminism into five buckets. Caveat, this is specifically about nondeterminism from the perspective of &lt;em&gt;system modeling&lt;/em&gt;, not computer science as a whole. If I tried to include stuff on NFAs and amb operations this would be twice as long.&lt;sup id="fnref:nondeterminism"&gt;&lt;a class="footnote-ref" href="#fn:nondeterminism"&gt;1&lt;/a&gt;&lt;/sup&gt;&lt;/p&gt;
    &lt;p style="height:16px; margin:0px !important;"&gt;&lt;/p&gt;
    &lt;h2&gt;1. True Randomness&lt;/h2&gt;
    &lt;p&gt;Programs that literally make calls to a &lt;code&gt;random&lt;/code&gt; function and then use the results. This is the simplest type of nondeterminism and one of the most ubiquitous.&lt;/p&gt;
    &lt;p&gt;Most of the time, &lt;code&gt;random&lt;/code&gt; isn't &lt;em&gt;truly&lt;/em&gt; nondeterministic. Most of the time computer randomness is actually &lt;strong&gt;pseudorandom&lt;/strong&gt;, meaning we seed a deterministic algorithm that behaves "randomly-enough" for some use. You could "lift" a nondeterministic random function into a deterministic one by adding a fixed seed to the starting state.&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="c1"&gt;# Python&lt;/span&gt;
    
    &lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="nn"&gt;random&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;random&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;seed&lt;/span&gt;
    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;f&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="n"&gt;seed&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;random&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    
    &lt;span class="o"&gt;&gt;&gt;&gt;&lt;/span&gt; &lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="mf"&gt;0.23796462709189137&lt;/span&gt;
    &lt;span class="o"&gt;&gt;&gt;&gt;&lt;/span&gt; &lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="mf"&gt;0.23796462709189137&lt;/span&gt;
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;Often we don't do this because the &lt;em&gt;point&lt;/em&gt; of randomness is to provide nondeterminism! We deliberately &lt;em&gt;abstract out&lt;/em&gt; the starting state of the seed from our program, because it's easier to think about it as locally nondeterministic.&lt;/p&gt;
    &lt;p&gt;(There's also "true" randomness, like using &lt;a href="https://www.intel.com/content/www/us/en/developer/articles/guide/intel-digital-random-number-generator-drng-software-implementation-guide.html#inpage-nav-3-2" target="_blank"&gt;thermal noise&lt;/a&gt; as an entropy source, which I think is mainly used for cryptography and seeding PRNGs.)&lt;/p&gt;
    &lt;p&gt;Most formal specification languages don't deal with randomness (though some deal with &lt;a href="https://buttondown.com/hillelwayne/archive/i-formally-modeled-dreidel-for-no-good-reason/" target="_blank"&gt;probability more broadly&lt;/a&gt;). Instead, we treat it as a nondeterministic choice:&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;# software
    if rand &gt; 0.001 then return a else crash
    
    # specification
    either return a or crash
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;This is because we're looking at worst-case scenarios. It doesn't matter if &lt;code&gt;crash&lt;/code&gt; happens 50% of the time or 0.0001% of the time: it's still possible.&lt;/p&gt;
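    &lt;p&gt;As a rough sketch of that idea in Python (the function names here are made up for illustration, not taken from any specification language): the model keeps the set of possible outcomes, throws away the probabilities, and a property only counts as holding if it holds for every outcome.&lt;/p&gt;

```python
# Hypothetical model of the specification "either return a or crash".
def spec_outcomes():
    # The spec keeps every possible outcome and drops the probabilities.
    return {"return a", "crash"}


def holds_in_all_outcomes(prop):
    # A property only holds if it also holds in the worst case.
    return all(prop(outcome) for outcome in spec_outcomes())


# "The system never crashes" fails, no matter how unlikely the crash is.
never_crashes = holds_in_all_outcomes(lambda outcome: outcome != "crash")
print(never_crashes)
```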
    &lt;h2&gt;2. Concurrency&lt;/h2&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;# Pseudocode
    global x = 1, y = 0;
    
    def thread1() {
        x++;
        x++;
        x++;
    }
    
    def thread2() {
        y := x;
    }
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;If &lt;code&gt;thread1()&lt;/code&gt; and &lt;code&gt;thread2()&lt;/code&gt; run sequentially, then (assuming the sequence is fixed) the final value of &lt;code&gt;y&lt;/code&gt; is deterministic. If the two functions are started and run simultaneously, then depending on when &lt;code&gt;thread2&lt;/code&gt; executes, &lt;code&gt;y&lt;/code&gt; can be 1, 2, 3, &lt;em&gt;or&lt;/em&gt; 4. Both functions are locally sequential, but running them concurrently leads to global nondeterminism.&lt;/p&gt;
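    &lt;p&gt;One way to convince yourself of those four outcomes is to enumerate every interleaving by hand. This is a minimal Python sketch, assuming each &lt;code&gt;x++&lt;/code&gt; and the read in &lt;code&gt;thread2&lt;/code&gt; are single indivisible steps; the &lt;code&gt;run&lt;/code&gt; helper and the "inc"/"read" labels are invented for the illustration.&lt;/p&gt;

```python
from itertools import permutations


def run(schedule):
    # Replay one interleaving of thread1's increments and thread2's read.
    x, y = 1, 0
    for step in schedule:
        if step == "inc":
            x += 1  # one x++ from thread1
        else:
            y = x  # thread2's single assignment
    return y


# Every distinct ordering of three increments and one read.
schedules = set(permutations(["inc", "inc", "inc", "read"]))
outcomes = sorted({run(schedule) for schedule in schedules})
print(outcomes)
```

    &lt;p&gt;Each thread's own steps stay in order; only the position of the read relative to the increments changes, and that alone is enough to produce every value from 1 to 4.&lt;/p&gt;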
    &lt;p&gt;Concurrency is arguably the most &lt;em&gt;dramatic&lt;/em&gt; source of nondeterminism. &lt;a href="https://buttondown.com/hillelwayne/archive/what-makes-concurrency-so-hard/" target="_blank"&gt;Small amounts of concurrency lead to huge explosions in the state space&lt;/a&gt;. We have words for the specific kinds of nondeterminism caused by concurrency, like "race condition" and "dirty write". Often we think about it as a separate &lt;em&gt;topic&lt;/em&gt; from nondeterminism. To some extent it "overshadows" the other kinds: I have a much easier time teaching students about concurrency in models than nondeterminism in models.&lt;/p&gt;
    &lt;p&gt;Many formal specification languages have special syntax/machinery for the concurrent aspects of a system, and generic syntax for other kinds of nondeterminism. In P, the generic syntax is &lt;a href="https://p-org.github.io/P/manual/expressions/#choose" target="_blank"&gt;choose&lt;/a&gt;. Others don't special-case concurrency, instead representing it as nondeterministic choices by a global coordinator. This is more flexible but also more inconvenient, as you have to implement process-local sequencing code yourself. &lt;/p&gt;
    &lt;h2&gt;3. User Input&lt;/h2&gt;
    &lt;div class="subscribe-form"&gt;&lt;/div&gt;
    &lt;p&gt;One of the most famous and influential programming books is &lt;em&gt;The C Programming Language&lt;/em&gt; by Kernighan and Ritchie. The first example of a nondeterministic program appears on page 14:&lt;/p&gt;
    &lt;p&gt;&lt;img alt="Picture of the book page. Code reproduced below." class="newsletter-image" src="https://assets.buttondown.email/images/94e6ad15-8d09-48df-b885-191318bfd179.jpg?w=960&amp;fit=max"/&gt;&lt;/p&gt;
    &lt;p&gt;For the newsletter readers who get text only emails,&lt;sup id="fnref:text-only"&gt;&lt;a class="footnote-ref" href="#fn:text-only"&gt;2&lt;/a&gt;&lt;/sup&gt; here's the program:&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="cp"&gt;#include&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="cpf"&gt;&lt;stdio.h&gt;&lt;/span&gt;
    &lt;span class="cm"&gt;/* copy input to output; 1st version */&lt;/span&gt;
    &lt;span class="n"&gt;main&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="w"&gt;    &lt;/span&gt;&lt;span class="kt"&gt;int&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;c&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="w"&gt;    &lt;/span&gt;&lt;span class="n"&gt;c&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;getchar&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
    &lt;span class="w"&gt;    &lt;/span&gt;&lt;span class="k"&gt;while&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;c&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;!=&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;EOF&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="w"&gt;        &lt;/span&gt;&lt;span class="n"&gt;putchar&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;c&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="w"&gt;        &lt;/span&gt;&lt;span class="n"&gt;c&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;getchar&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
    &lt;span class="w"&gt;    &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;Yup, that's nondeterministic. Because the user can enter any string, any call of &lt;code&gt;main()&lt;/code&gt; could have any output, meaning the number of possible outcomes is infinite.&lt;/p&gt;
    &lt;p&gt;Okay that seems a little cheap, and I think it's because we tend to think of determinism in terms of how the user &lt;em&gt;experiences&lt;/em&gt; the program. Yes, &lt;code&gt;main()&lt;/code&gt; has an infinite number of user inputs, but for each input the user will experience only one possible output. It starts to feel more nondeterministic when modeling a long-running system that's &lt;em&gt;reacting&lt;/em&gt; to user input, for example a server that runs a script whenever the user uploads a file. This can be modeled with nondeterminism and concurrency: We have one execution that's the system, and one nondeterministic execution that represents the effects of our user.&lt;/p&gt;
    &lt;p&gt;(One intrusive thought I sometimes have: any "yes/no" dialogue actually has &lt;em&gt;three&lt;/em&gt; outcomes: yes, no, or the user getting up and walking away without picking a choice, permanently stalling the execution.)&lt;/p&gt;
    &lt;h2&gt;4. External forces&lt;/h2&gt;
    &lt;p&gt;The more general version of "user input": anything where either 1) some part of the execution outcome depends on retrieving external information, or 2) the external world can change some state outside of your system. I call the distinction between internal and external components of the system &lt;a href="https://www.hillelwayne.com/post/world-vs-machine/" target="_blank"&gt;the world and the machine&lt;/a&gt;. Simple examples: code that at some point reads an external temperature sensor. Unrelated code running on a system which quits programs if it gets too hot. API requests to a third party vendor. Code processing files but users can delete files before the script gets to them.&lt;/p&gt;
    &lt;p&gt;Like with PRNGs, some of these cases don't &lt;em&gt;have&lt;/em&gt; to be nondeterministic; we can argue that "the temperature" should be a virtual input into the function. Like with PRNGs, we treat it as nondeterministic because it's useful to think in that way. Also, what if the temperature changes between starting a function and reading it?&lt;/p&gt;
    &lt;p&gt;External forces are also a source of nondeterminism as &lt;em&gt;uncertainty&lt;/em&gt;. Measurements in the real world often come with errors, so repeating a measurement twice can give two different answers. Sometimes operations fail for no discernible reason, or for a non-programmatic reason (like something physically blocks the sensor).&lt;/p&gt;
    &lt;p&gt;All of these situations can be modeled in the same way as user input: a concurrent execution making nondeterministic choices.&lt;/p&gt;
    &lt;h2&gt;5. Abstraction&lt;/h2&gt;
    &lt;p&gt;This is where nondeterminism in system models and in "real software" differ the most. I said earlier that pseudorandomness is &lt;em&gt;arguably&lt;/em&gt; deterministic, but we abstract it into nondeterminism. More generally, &lt;strong&gt;nondeterminism hides implementation details of deterministic processes&lt;/strong&gt;.&lt;/p&gt;
    &lt;p&gt;In one consulting project, we had a machine that received a message, parsed a lot of data from the message, went into a complicated workflow, and then entered one of three states. The final state was fully determined by the content of the message, but the actual process of determining that final state took tons and tons of code. None of that mattered at the scope we were modeling, so we abstracted it all away: "on receiving message, nondeterministically enter state A, B, or C."&lt;/p&gt;
    &lt;p&gt;Doing this makes the system easier to model. It also makes the model more sensitive to possible errors. What if the workflow is bugged and sends us to the wrong state? That's already covered by the nondeterministic choice! Nondeterministic abstraction gives us the potential to pick the worst-case scenario for our system, so we can prove it's robust even under those conditions.&lt;/p&gt;
    &lt;p&gt;I know I beat the "nondeterminism as abstraction" drum a whole lot but that's because it's the insight from formal methods I personally value the most, that nondeterminism is a powerful tool to &lt;em&gt;simplify reasoning about things&lt;/em&gt;. You can see the same approach in how I approach modeling users and external forces: complex realities black-boxed and simplified into nondeterministic forces on the system.&lt;/p&gt;
    &lt;hr/&gt;
    &lt;p&gt;Anyway, I hope this collection of ideas I got from formal methods is useful to my broader readership. Lemme know if it somehow helps you out!&lt;/p&gt;
    &lt;div class="footnote"&gt;
    &lt;hr/&gt;
    &lt;ol&gt;
    &lt;li id="fn:nondeterminism"&gt;
    &lt;p&gt;I realized after writing this that I already wrote an essay about nondeterminism in formal specification &lt;a href="https://buttondown.com/hillelwayne/archive/nondeterminism-in-formal-specification/" target="_blank"&gt;just under a year ago&lt;/a&gt;. I hope this one covers enough new ground to be interesting! &lt;a class="footnote-backref" href="#fnref:nondeterminism" title="Jump back to footnote 1 in the text"&gt;↩&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id="fn:text-only"&gt;
    &lt;p&gt;There is a surprising number of you. &lt;a class="footnote-backref" href="#fnref:text-only" title="Jump back to footnote 2 in the text"&gt;↩&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;/ol&gt;
    &lt;/div&gt;</description>
                <pubDate>Wed, 19 Feb 2025 19:37:57 +0000</pubDate>
                <guid>https://buttondown.com/hillelwayne/archive/five-kinds-of-nondeterminism/</guid>
            </item>
            <item>
                <title>Are Efficiency and Horizontal Scalability at odds?</title>
                <link>https://buttondown.com/hillelwayne/archive/are-efficiency-and-horizontal-scalability-at-odds/</link>
                <description>&lt;p&gt;Sorry for missing the newsletter last week! I started writing on Monday as normal, and by Wednesday the piece (about the &lt;a href="https://en.wikipedia.org/wiki/Hierarchy_of_hazard_controls" target="_blank"&gt;hierarchy of controls&lt;/a&gt;) was 2000 words and not &lt;em&gt;close&lt;/em&gt; to done. So now it'll be a blog post sometime later this month.&lt;/p&gt;
    &lt;p&gt;I also just released a new version of &lt;a href="https://leanpub.com/logic/" target="_blank"&gt;Logic for Programmers&lt;/a&gt;! 0.7 adds a bunch of new content (type invariants, modeling access policies, rewrites of the first chapters) but more importantly has new fonts that are more legible than the old ones. &lt;a href="https://leanpub.com/logic/" target="_blank"&gt;Go check it out!&lt;/a&gt;&lt;/p&gt;
    &lt;p&gt;For this week's newsletter I want to brainstorm an idea I've been noodling over for a while. Say we have a computational task, like running a simulation or searching a very large graph, and it's taking too long to complete on a computer. There are generally three things that we can do to make it faster:&lt;/p&gt;
    &lt;ol&gt;
    &lt;li&gt;Buy a faster computer ("vertical scaling")&lt;/li&gt;
    &lt;li&gt;Modify the software to use the computer's resources better ("efficiency")&lt;/li&gt;
    &lt;li&gt;Modify the software to use multiple computers ("horizontal scaling")&lt;/li&gt;
    &lt;/ol&gt;
    &lt;p&gt;(Splitting single-threaded software across multiple threads/processes is sort of a blend of (2) and (3).)&lt;/p&gt;
    &lt;p&gt;The big benefit of (1) is that we (usually) don't have to make any changes to the software to get a speedup. The downside is that for the past couple of decades computers haven't &lt;em&gt;gotten&lt;/em&gt; much faster, except in ways that require recoding (like GPUs and multicore). This means we rely on (2) and (3), and we can do both to a point. I've noticed, though, that horizontal scaling seems to conflict with efficiency. Software optimized to scale well tends to be worse for the &lt;code&gt;N=1&lt;/code&gt; case than software optimized to, um, be optimized. &lt;/p&gt;
    &lt;p&gt;Are there reasons to &lt;em&gt;expect&lt;/em&gt; this? It seems reasonable that design goals of software are generally in conflict, purely because exclusively optimizing for one property means making decisions that impede other properties. But is there something in the nature of "efficiency" and "horizontal scalability" that makes them especially disjoint?&lt;/p&gt;
    &lt;p&gt;This isn't me trying to explain a fully coherent idea, more me trying to figure this all out for myself. Also I'm probably getting some hardware stuff wrong.&lt;/p&gt;
    &lt;h3&gt;Amdahl's Law&lt;/h3&gt;
    &lt;p&gt;According to &lt;a href="https://en.wikipedia.org/wiki/Amdahl%27s_law" target="_blank"&gt;Amdahl's Law&lt;/a&gt;, the maximum speedup by parallelization is constrained by the proportion of the work that can be parallelized. If 80% of algorithm X is parallelizable, the maximum speedup from horizontal scaling is 5x. If algorithm Y is 25% parallelizable, the maximum speedup is only 1.3x. &lt;/p&gt;
    &lt;p&gt;If you need horizontal scalability, you want to use algorithm X, &lt;em&gt;even if Y is naturally 3x faster&lt;/em&gt;. But if Y was 4x faster, you'd prefer it to X. Maximal scalability means finding the optimal balance between baseline speed and parallelizability. Maximal efficiency means just optimizing baseline speed. &lt;/p&gt;
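    &lt;p&gt;The arithmetic behind those numbers, as a small Python sketch. The helper names are invented for the example, and it assumes the idealized form of Amdahl's Law with unlimited workers and no coordination cost, where the speedup limit is 1 / (1 - p) for a parallelizable fraction p.&lt;/p&gt;

```python
def max_speedup(parallel_fraction):
    # Amdahl's Law limit as the number of workers grows without bound.
    return 1 / (1 - parallel_fraction)


def best_scaled_speed(baseline_speed, parallel_fraction):
    # Baseline speed (relative to X on one machine) times the scaling limit.
    return baseline_speed * max_speedup(parallel_fraction)


x = best_scaled_speed(1, 0.80)   # algorithm X, 80% parallelizable
y3 = best_scaled_speed(3, 0.25)  # algorithm Y if it starts out 3x faster
y4 = best_scaled_speed(4, 0.25)  # algorithm Y if it starts out 4x faster

print(f"X: {x:.2f}x, Y at 3x: {y3:.2f}x, Y at 4x: {y4:.2f}x")
```

    &lt;p&gt;Fully scaled out, X tops out at about 5x its single-machine speed. Y at a 3x head start only reaches about 4x, so X wins; at a 4x head start Y reaches about 5.3x and edges ahead.&lt;/p&gt;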
    &lt;h3&gt;Coordination Overhead&lt;/h3&gt;
    &lt;p&gt;Distributed algorithms require more coordination. To add a list of numbers in parallel via &lt;a href="https://en.wikipedia.org/wiki/Fork%E2%80%93join_model" target="_blank"&gt;fork-join&lt;/a&gt;, we'd do something like this:&lt;/p&gt;
    &lt;ol&gt;
    &lt;li&gt;Split the list into N sublists&lt;/li&gt;
    &lt;li&gt;Fork a new thread/process for each sublist&lt;/li&gt;
    &lt;li&gt;Wait for each thread/process to finish&lt;/li&gt;
    &lt;li&gt;Add the sums together.&lt;/li&gt;
    &lt;/ol&gt;
    &lt;p&gt;(1), (2), and (3) all add overhead to the algorithm. At the very least, it's extra lines of code to execute, but it can also mean inter-process communication or network hops. Distribution also means you have fewer natural correctness guarantees, so you need more administrative overhead to avoid race conditions. &lt;/p&gt;
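    &lt;p&gt;Here's roughly what those four steps look like in Python, using the standard library's &lt;code&gt;concurrent.futures&lt;/code&gt;. This is a sketch of the structure only: &lt;code&gt;parallel_sum&lt;/code&gt; and its &lt;code&gt;n_workers&lt;/code&gt; parameter are names I made up, and I'm using threads so the example runs anywhere, which on a GIL build of CPython won't actually make the addition faster.&lt;/p&gt;

```python
from concurrent.futures import ThreadPoolExecutor


def parallel_sum(numbers, n_workers=4):
    # 1. Split the list into N sublists (ceiling division for the chunk size).
    chunk = max(1, -(-len(numbers) // n_workers))
    sublists = [numbers[i:i + chunk] for i in range(0, len(numbers), chunk)]

    # 2 and 3. Fork one task per sublist, then wait for all of them to finish.
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        partial_sums = list(pool.map(sum, sublists))

    # 4. Add the partial sums together.
    return sum(partial_sums)


total = parallel_sum(list(range(1, 101)))
print(total)
```

    &lt;p&gt;Everything except the calls to &lt;code&gt;sum&lt;/code&gt; is coordination: slicing, scheduling, waiting, and collecting results. The single-threaded version is just &lt;code&gt;sum(numbers)&lt;/code&gt;.&lt;/p&gt;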
    &lt;p&gt;&lt;strong&gt;Real world example:&lt;/strong&gt; Historically CPython has had a "global interpreter lock" (GIL). In multithreaded code, only one thread could execute Python code at a time (others could execute C code). The &lt;a href="https://docs.python.org/3/howto/free-threading-python.html#single-threaded-performance" target="_blank"&gt;newest version&lt;/a&gt; supports disabling the GIL, which comes at a 40% overhead for single-threaded programs. Supposedly the difference is because the &lt;a href="https://docs.python.org/3/whatsnew/3.11.html#whatsnew311-pep659" target="_blank"&gt;specializing adaptive interpreter&lt;/a&gt; optimization isn't thread-safe yet. The Python team is hoping to get it down to "only" 10%. &lt;/p&gt;
    &lt;p style="height:16px; margin:0px !important;"&gt;&lt;/p&gt;
    &lt;h3&gt;Scaling loses shared resources&lt;/h3&gt;
    &lt;p&gt;I'd say that intra-machine scaling (multiple threads/processes) feels qualitatively &lt;em&gt;different&lt;/em&gt; than inter-machine scaling. Part of that is that intra-machine scaling is "capped" while inter-machine is not. But there's also a difference in what assumptions you can make about shared resources. Starting from the baseline of a single-threaded program:&lt;/p&gt;
    &lt;ol&gt;
    &lt;li&gt;Threads have a much harder time sharing CPU caches (you have to manually mess with affinities)&lt;/li&gt;
    &lt;li&gt;Processes have a much harder time sharing RAM (I think you have to use &lt;a href="https://en.wikipedia.org/wiki/Memory-mapped_file" target="_blank"&gt;mmap&lt;/a&gt;?)&lt;/li&gt;
    &lt;li&gt;Machines can't share cache, RAM, or disk, period.&lt;/li&gt;
    &lt;/ol&gt;
    &lt;p&gt;It's a lot easier to solve a problem when the whole thing fits in RAM. But if you split a 50 GB problem across three machines, it doesn't fit in RAM by default, even if the machines have 64 GB each. Scaling also means that separate machines can't reuse resources like database connections.&lt;/p&gt;
    &lt;h3&gt;Efficiency comes from limits&lt;/h3&gt;
    &lt;p&gt;I think the two previous points tie together in the idea that maximal efficiency comes from being able to make assumptions about the system. If we know the &lt;em&gt;exact&lt;/em&gt; sequence of computations, we can aim to minimize cache misses. If we don't have to worry about thread-safety, &lt;a href="https://www.playingwithpointers.com/blog/refcounting-harder-than-it-sounds.html" target="_blank"&gt;tracking references is dramatically simpler&lt;/a&gt;. If we have all of the data in a single database, our query planner has more room to work with. At various tiers of scaling these assumptions are no longer guaranteed and we lose the corresponding optimizations.&lt;/p&gt;
    &lt;p&gt;Sometimes these assumptions are implicit and crop up in odd places. Like if you're working at a scale where you need multiple synced databases, you might want to use UUIDs instead of numbers for keys. But then you lose the assumption "recently inserted rows are close together in the index", which I've read &lt;a href="https://www.cybertec-postgresql.com/en/unexpected-downsides-of-uuid-keys-in-postgresql/" target="_blank"&gt;can lead to significant slowdowns&lt;/a&gt;. &lt;/p&gt;
    &lt;p&gt;This suggests that if you can find a limit somewhere else, you can get both high horizontal scaling and high efficiency. &lt;del&gt;Supposedly the &lt;a href="https://tigerbeetle.com/" target="_blank"&gt;TigerBeetle database&lt;/a&gt; has both, but that could be because they limit all records to &lt;a href="https://docs.tigerbeetle.com/coding/" target="_blank"&gt;accounts and transfers&lt;/a&gt;. This means every record fits in &lt;a href="https://tigerbeetle.com/blog/2024-07-23-rediscovering-transaction-processing-from-history-and-first-principles/#transaction-processing-from-first-principles" target="_blank"&gt;exactly 128 bytes&lt;/a&gt;.&lt;/del&gt; [A TigerBeetle engineer reached out to tell me that they do &lt;em&gt;not&lt;/em&gt; horizontally scale compute, they distribute across multiple nodes for redundancy. &lt;a href="https://lobste.rs/s/5akiq3/are_efficiency_horizontal_scalability#c_ve8ud5" target="_blank"&gt;"You can't make it faster by adding more machines."&lt;/a&gt;]&lt;/p&gt;
    &lt;p&gt;Does this mean that "assumptions" could be both "assumptions about the computing environment" and "assumptions about the problem"? In the famous essay &lt;a href="http://www.frankmcsherry.org/graph/scalability/cost/2015/01/15/COST.html" target="_blank"&gt;Scalability! But at what COST&lt;/a&gt;, Frank McSherry shows that his single-threaded laptop could outperform 128-node "big data systems" on PageRank and graph connectivity (via label propagation). Afterwards, he discusses how a different algorithm solves graph connectivity even faster: &lt;/p&gt;
    &lt;blockquote&gt;
    &lt;p&gt;[Union find] is more line of code than label propagation, but it is 10x faster and 100x less embarassing. … The union-find algorithm is fundamentally incompatible with the graph computation approaches Giraph, GraphLab, and GraphX put forward (the so-called “think like a vertex” model).&lt;/p&gt;
    &lt;/blockquote&gt;
    &lt;p&gt;The interesting thing to me is that his alternative makes more "assumptions" than what he's comparing to. He can "assume" a fixed goal and optimize the code for that goal. The "big data systems" are trying to be general purpose compute platforms and have to pick a model that supports the widest range of possible problems. &lt;/p&gt;
    &lt;p&gt;A few years back I wrote &lt;a href="https://www.hillelwayne.com/post/cleverness/" target="_blank"&gt;clever vs insightful code&lt;/a&gt;. I think what I'm trying to say here is that efficiency comes from having insight into your problem and environment.&lt;/p&gt;
    &lt;p&gt;(Last thought to shove in here: to exploit assumptions, you need &lt;em&gt;control&lt;/em&gt;. Carefully arranging your data to fit in L1 doesn't matter if your programming language doesn't let you control where things are stored!)&lt;/p&gt;
    &lt;h3&gt;Is there a cultural aspect?&lt;/h3&gt;
    &lt;p&gt;Maybe there's also a cultural element to this conflict. What if the engineers interested in "efficiency" are different from the engineers interested in "horizontal scaling"?&lt;/p&gt;
    &lt;p&gt;At my first job the data scientists set up a &lt;a href="https://en.wikipedia.org/wiki/Apache_Hadoop" target="_blank"&gt;Hadoop&lt;/a&gt; cluster for their relatively small dataset, only a few dozen gigabytes or so. One of the senior software engineers saw this and said "big data is stupid." To prove it, he took one of their example queries, wrote a script in Go to compute the same thing, and optimized it to run faster on his machine.&lt;/p&gt;
    &lt;p&gt;At the time I was like "yeah, you're right, big data IS stupid!" But I think now that we both missed something obvious: with the "scalable" solution, the data scientists &lt;em&gt;didn't&lt;/em&gt; have to write an optimized script for every single query. Optimizing code is hard, adding more machines is easy! &lt;/p&gt;
    &lt;p&gt;The highest tier of horizontal scaling is usually something large businesses want, and large businesses like problems that can be solved purely with money. Maximizing efficiency requires a lot of knowledge-intensive human labour, so is less appealing as an investment. Then again, I've seen a lot of work on making the scalable systems more efficient, such as evenly balancing heterogeneous workloads. Maybe in the largest systems intra-machine efficiency is just too small-scale a problem. &lt;/p&gt;
    &lt;h3&gt;I'm not sure where this fits in but scaling a volume of tasks conflicts less than scaling individual tasks&lt;/h3&gt;
    &lt;p&gt;If you have 1,000 machines and need to crunch one big graph, you probably want the most scalable algorithm. If you instead have 50,000 small graphs, you probably want the most efficient algorithm, which you then run on all 1,000 machines. When we call a problem &lt;a href="https://en.wikipedia.org/wiki/Embarrassingly_parallel" target="_blank"&gt;embarrassingly parallel&lt;/a&gt;, we usually mean it's easy to horizontally scale. But it's also one that's easy to make more efficient, because local optimizations don't affect the scaling! &lt;/p&gt;
    &lt;hr/&gt;
    &lt;p&gt;Okay that's enough brainstorming for one week.&lt;/p&gt;
    &lt;h3&gt;Blog Rec&lt;/h3&gt;
    &lt;p&gt;Whenever I think about optimization as a skill, the first article that comes to mind is &lt;a href="https://matklad.github.io/" target="_blank"&gt;matklad's&lt;/a&gt; &lt;a href="https://matklad.github.io/2023/11/15/push-ifs-up-and-fors-down.html" target="_blank"&gt;Push Ifs Up And Fors Down&lt;/a&gt;. I'd never have considered on my own that inlining loops into functions could be such a huge performance win. The blog has a lot of other posts on the nuts-and-bolts of systems languages, optimization, and concurrency.&lt;/p&gt;</description>
                <pubDate>Wed, 12 Feb 2025 18:26:20 +0000</pubDate>
                <guid>https://buttondown.com/hillelwayne/archive/are-efficiency-and-horizontal-scalability-at-odds/</guid>
            </item>
            <item>
                <title>What hard thing does your tech make easy?</title>
                <link>https://buttondown.com/hillelwayne/archive/what-hard-thing-does-your-tech-make-easy/</link>
                <description>&lt;p&gt;I occasionally receive emails asking me to look at the writer's new language/library/tool. Sometimes it's in an area I know well, like formal methods. Other times, I'm a complete stranger to the field. Regardless, I'm generally happy to check it out.&lt;/p&gt;
    &lt;p&gt;When starting out, this is the biggest question I'm looking to answer:&lt;/p&gt;
    &lt;blockquote&gt;
    &lt;p&gt;What does this technology make easy that's normally hard?&lt;/p&gt;
    &lt;/blockquote&gt;
    &lt;p&gt;What justifies me learning and migrating to a &lt;em&gt;new&lt;/em&gt; thing as opposed to fighting through my problems with the tools I already know? The new thing has to have some sort of value proposition, which could be something like "better performance" or "more secure". The most universal value and the most direct to show is "takes less time and mental effort to do something". I can't accurately judge two benchmarks, but I can see two demos or code samples and compare which one feels easier to me.&lt;/p&gt;
    &lt;h2&gt;Examples&lt;/h2&gt;
    &lt;h3&gt;Functional programming&lt;/h3&gt;
    &lt;p&gt;What drew me originally to functional programming was higher order functions. &lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;# Without HOFs
    
    out = []
    for x in input {
      if test(x) {
        out.append(x)
      }
    }
    
    # With HOFs
    
    filter(test, input)
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p style="height:16px; margin:0px !important;"&gt;&lt;/p&gt;
    &lt;p&gt;We can also compare the easiness of various tasks between examples within the same paradigm. If I know FP via Clojure, what could be appealing about Haskell or F#? For one, null safety is a lot easier when I've got option types.&lt;/p&gt;
    &lt;h3&gt;Array Programming&lt;/h3&gt;
    &lt;p&gt;Array programming languages like APL or J make certain classes of computation easier. For example, finding all of the indices where two arrays &lt;del&gt;differ&lt;/del&gt; match. Here it is in Python:&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="n"&gt;x&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
    &lt;span class="n"&gt;y&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
    
    &lt;span class="o"&gt;&gt;&gt;&gt;&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;a&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;b&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;enumerate&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;zip&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;y&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;a&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="n"&gt;b&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
    &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;7&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;9&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;And here it is in J:&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="nv"&gt;x&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;=:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;4&lt;/span&gt;
    &lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="nv"&gt;y&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;=:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;4&lt;/span&gt;
    
    &lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="nv"&gt;I&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;x&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;y&lt;/span&gt;
    &lt;span class="mi"&gt;7&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;9&lt;/span&gt;
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;Not every tool is meant for every programmer, because you might not have any of the problems a tool makes easier. What comes up more often for you: filtering a list or finding all the indices where two lists match? Statistically speaking, functional programming is more useful to you than array programming.&lt;/p&gt;
    &lt;p&gt;But &lt;em&gt;I&lt;/em&gt; have this problem enough to justify learning array programming.&lt;/p&gt;
    &lt;h3&gt;LLMs&lt;/h3&gt;
    &lt;p&gt;I think a lot of the appeal of LLMs is they make a lot of specialist tasks easy for nonspecialists. One thing I recently did was convert some rst &lt;a href="https://docutils.sourceforge.io/docs/ref/rst/directives.html#list-table" target="_blank"&gt;list tables&lt;/a&gt; to &lt;a href="https://docutils.sourceforge.io/docs/ref/rst/directives.html#csv-table-1" target="_blank"&gt;csv tables&lt;/a&gt;. Normally I'd have to write some tricky parsing and serialization code to automatically convert between the two. With LLMs, it's just&lt;/p&gt;
    &lt;blockquote&gt;
    &lt;p&gt;Convert the following rst list-table into a csv-table: [table]&lt;/p&gt;
    &lt;/blockquote&gt;
    &lt;p&gt;"Easy" can trump "correct" as a value. The LLM might get some translations wrong, but it's so convenient I'd rather manually review all the translations for errors than write specialized script that is correct 100% of the time.&lt;/p&gt;
    &lt;h2&gt;Let's not take this too far&lt;/h2&gt;
    &lt;p&gt;A college friend once claimed that he cracked the secret of human behavior: humans do whatever makes them happiest. "What about the martyr who dies for their beliefs?" "Well, in their last second of life they get REALLY happy."&lt;/p&gt;
    &lt;p&gt;We can do the same here, fitting every value proposition into the frame of "easy". CUDA makes it easier to do matrix multiplication. Rust makes it easier to write low-level code without memory bugs. TLA+ makes it easier to find errors in your design. Monads make it easier to sequence computations in a lazy environment. Making everything about "easy" obscures other reasons for adopting new things.&lt;/p&gt;
    &lt;h3&gt;That whole "simple vs easy" thing&lt;/h3&gt;
    &lt;p&gt;Sometimes people think that "simple" is better than "easy", because "simple" is objective and "easy" is subjective. This comes from the famous talk &lt;a href="https://www.infoq.com/presentations/Simple-Made-Easy/" target="_blank"&gt;Simple Made Easy&lt;/a&gt;. I'm not sure I agree that simple is better &lt;em&gt;or&lt;/em&gt; more objective: the speaker claims that polymorphism and typeclasses are "simpler" than conditionals, and I doubt everybody would agree with that.&lt;/p&gt;
    &lt;p&gt;The problem is that "simple" is used to mean both "not complicated" &lt;em&gt;and&lt;/em&gt; "not complex". And everybody agrees that "complicated" and "complex" are different, even if they can't agree &lt;em&gt;what&lt;/em&gt; the difference is. This idea should probably be expanded into its own newsletter.&lt;/p&gt;
    &lt;p&gt;It's also a lot harder to pitch a technology on being "simpler". Simplicity by itself doesn't make a tool better equipped to solve problems. Simplicity can unlock other benefits, like compositionality or &lt;a href="https://buttondown.com/hillelwayne/archive/the-capability-tractability-tradeoff/" target="_blank"&gt;tractability&lt;/a&gt;, that provide the actual value. And often that value is in the form of "makes some tasks easier". &lt;/p&gt;</description>
                <pubDate>Wed, 29 Jan 2025 18:09:47 +0000</pubDate>
                <guid>https://buttondown.com/hillelwayne/archive/what-hard-thing-does-your-tech-make-easy/</guid>
            </item>
            <item>
                <title>The Juggler's Curse</title>
                <link>https://buttondown.com/hillelwayne/archive/the-jugglers-curse/</link>
                <description>&lt;p&gt;I'm making a more focused effort to juggle this year. Mostly &lt;a href="https://youtu.be/PPhG_90VH5k?si=AxOO65PcX4ZwnxPQ&amp;t=49" target="_blank"&gt;boxes&lt;/a&gt;, but also classic balls too.&lt;sup id="fnref:boxes"&gt;&lt;a class="footnote-ref" href="#fn:boxes"&gt;1&lt;/a&gt;&lt;/sup&gt; I've gotten to the point where I can almost consistently do a five-ball cascade, which I &lt;em&gt;thought&lt;/em&gt; was the cutoff to being a "good juggler". "Thought" because I now know a "good juggler" is one who can do the five-ball cascade with &lt;em&gt;outside throws&lt;/em&gt;. &lt;/p&gt;
    &lt;p&gt;I know this because I can't do the outside five-ball cascade... yet. But it's something I can see myself eventually mastering, unlike the slightly more difficult trick of the five-ball mess, which is impossible for mere mortals like me. &lt;/p&gt;
    &lt;p&gt;&lt;em&gt;In theory&lt;/em&gt; there is a spectrum of trick difficulties and skill levels. I could place myself on the axis like this:&lt;/p&gt;
    &lt;p&gt;&lt;img alt="A crudely-drawn scale with 10 even ticks, I'm between 5 and 6" class="newsletter-image" src="https://assets.buttondown.email/images/8ee51aa1-5dd4-48b8-8110-2cdf9a273612.png?w=960&amp;fit=max"/&gt;&lt;/p&gt;
    &lt;p&gt;In practice, there are three tiers:&lt;/p&gt;
    &lt;ol&gt;
    &lt;li&gt;Toddlers&lt;/li&gt;
    &lt;li&gt;Good jugglers who practice hard&lt;/li&gt;
    &lt;li&gt;Genetic freaks and actual wizards&lt;/li&gt;
    &lt;/ol&gt;
    &lt;p&gt;And the graph always, &lt;em&gt;always&lt;/em&gt; looks like this:&lt;/p&gt;
    &lt;p&gt;&lt;img alt="The same graph, with the top compressed into "wizards" and bottom into "toddlers". I'm in toddlers." class="newsletter-image" src="https://assets.buttondown.email/images/04c76cec-671e-4560-b64e-498b7652359e.png?w=960&amp;fit=max"/&gt;&lt;/p&gt;
    &lt;p&gt;This is the juggler's curse, and it's a three-parter:&lt;/p&gt;
    &lt;ol&gt;
    &lt;li&gt;The threshold between you and "good" is the next trick you cannot do.&lt;/li&gt;
    &lt;li&gt;Everything below that level is trivial. Once you've gotten a trick down, you can never go back to not knowing it, to appreciating how difficult it was to learn in the first place.&lt;sup id="fnref:expert-blindness"&gt;&lt;a class="footnote-ref" href="#fn:expert-blindness"&gt;2&lt;/a&gt;&lt;/sup&gt;&lt;/li&gt;
    &lt;li&gt;Everything above that level is just "impossible". You don't have the knowledge needed to recognize the different tiers.&lt;sup id="fnref:dk"&gt;&lt;a class="footnote-ref" href="#fn:dk"&gt;3&lt;/a&gt;&lt;/sup&gt;&lt;/li&gt;
    &lt;/ol&gt;
    &lt;p&gt;So as you get better, the stuff that was impossible becomes differentiable, and you can see that some of it &lt;em&gt;is&lt;/em&gt; possible. And everything you learned becomes trivial. So you're never a good juggler until you learn "just one more hard trick".&lt;/p&gt;
    &lt;p&gt;The more you know, the more you know you don't know and the less you know you know.&lt;/p&gt;
    &lt;h3&gt;This is supposed to be a software newsletter&lt;/h3&gt;
    &lt;blockquote&gt;
    &lt;p&gt;A monad is a monoid in the category of endofunctors, what's the problem? &lt;a href="https://james-iry.blogspot.com/2009/05/brief-incomplete-and-mostly-wrong.html" target="_blank"&gt;(src)&lt;/a&gt;&lt;/p&gt;
    &lt;/blockquote&gt;
    &lt;p&gt;I think this applies to any difficult topic? Most fields don't have the same stark &lt;a href="https://en.wikipedia.org/wiki/Spectral_line" target="_blank"&gt;spectral lines&lt;/a&gt; as juggling, but there's still tiers of difficulty to techniques, which get compressed the further in either direction they are from your current level.&lt;/p&gt;
    &lt;p&gt;Like, I'm not good at formal methods. I've written two books on it but I've never mastered a dependently-typed language or a theorem prover. Those are equally hard. And I'm not good at modeling concurrent systems because I don't understand the formal definition of bisimulation and haven't implemented a Raft. Those are also equally hard, in fact exactly as hard as mastering a theorem prover.&lt;/p&gt;
    &lt;p&gt;At the same time, the skills I've already developed are easy: properly using refinement is &lt;em&gt;exactly as easy&lt;/em&gt; as writing &lt;a href="https://buttondown.com/hillelwayne/archive/what-are-the-rosettas-of-formal-specification/" target="_blank"&gt;a wrapped counter&lt;/a&gt;. Then I get surprised when I try to explain strong fairness to someone and they just don't get how □◇(ENABLED〈A〉ᵥ) is &lt;em&gt;obviously&lt;/em&gt; different from ◇□(ENABLED 〈A〉ᵥ).&lt;/p&gt;
    &lt;p&gt;Juggler's curse!&lt;/p&gt;
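    &lt;p&gt;For anyone on the other side of that conversation, here is a minimal sketch of why the two formulas differ. It assumes a "lasso" trace, meaning a finite prefix followed by a cycle that repeats forever, so only the cycle matters. The helper names and the example trace are made up for illustration.&lt;/p&gt;

```python
def always_eventually(cycle):
    """□◇P on a lasso trace: P holds at some state of the repeating cycle."""
    return any(cycle)


def eventually_always(cycle):
    """◇□P on a lasso trace: P holds at every state of the repeating cycle."""
    return all(cycle)


# Hypothetical trace: the action is enabled on every other step, forever.
enabled_on_cycle = [True, False]

print(always_eventually(enabled_on_cycle))  # True: enabled infinitely often
print(eventually_always(enabled_on_cycle))  # False: never enabled for good
```

    &lt;p&gt;An action that flickers between enabled and disabled satisfies the first formula but not the second, which is the gap between the two fairness conditions.&lt;/p&gt;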
    &lt;p&gt;Now I don't know if this is actually how everybody experiences expertise or if it's just my particular personality (I was a juggler long before I was a software developer). Then again, I'd argue that lots of people talk about one consequence of the juggler's curse: imposter syndrome. If you constantly think what you know is "trivial" and what you don't know is "impossible", then yeah, you'd start feeling like an imposter at work real quick.&lt;/p&gt;
    &lt;p&gt;I wonder if part of the cause is that a lot of skills you have to learn are invisible. One of my favorite blog posts ever is &lt;a href="https://www.benkuhn.net/blub/" target="_blank"&gt;In Defense of Blub Studies&lt;/a&gt;, which argues that software expertise comes through understanding "boring" topics like "what all of the error messages mean" and "how to use a debugger well".  Blub is a critical part of expertise and takes a lot of hard work to learn, but it &lt;em&gt;feels&lt;/em&gt; like trivia. So looking back on a skill I mastered, I might think it was "easy" because I'm not including all of the blub that I had to learn, too.&lt;/p&gt;
    &lt;p&gt;The takeaway, of course, is that the outside five-ball cascade &lt;em&gt;is&lt;/em&gt; objectively the cutoff between good jugglers and toddlers.&lt;/p&gt;
    &lt;div class="footnote"&gt;
    &lt;hr/&gt;
    &lt;ol&gt;
    &lt;li id="fn:boxes"&gt;
    &lt;p&gt;Rant time: I &lt;em&gt;love&lt;/em&gt; cigar box juggling. It's fun, it's creative, it's totally unlike any other kind of juggling. And it's so niche I straight up cannot find anybody in Chicago to practice with. I once went to a juggling convention and was the only person with a cigar box set there. &lt;a class="footnote-backref" href="#fnref:boxes" title="Jump back to footnote 1 in the text"&gt;↩&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id="fn:expert-blindness"&gt;
    &lt;p&gt;This particular part of the juggler's curse is also called &lt;a href="https://en.wikipedia.org/wiki/Curse_of_knowledge" target="_blank"&gt;the curse of knowledge&lt;/a&gt; or "expert blindness". &lt;a class="footnote-backref" href="#fnref:expert-blindness" title="Jump back to footnote 2 in the text"&gt;↩&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id="fn:dk"&gt;
    &lt;p&gt;This isn't Dunning-Kruger, because DK says that people think they are &lt;em&gt;better&lt;/em&gt; than they actually are, and also &lt;a href="https://www.mcgill.ca/oss/article/critical-thinking/dunning-kruger-effect-probably-not-real" target="_blank"&gt;may not actually be real&lt;/a&gt;. &lt;a class="footnote-backref" href="#fnref:dk" title="Jump back to footnote 3 in the text"&gt;↩&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;/ol&gt;
    &lt;/div&gt;</description>
                <pubDate>Wed, 22 Jan 2025 18:50:40 +0000</pubDate>
                <guid>https://buttondown.com/hillelwayne/archive/the-jugglers-curse/</guid>
            </item>
            <item>
                <title>What are the Rosettas of formal specification?</title>
                <link>https://buttondown.com/hillelwayne/archive/what-are-the-rosettas-of-formal-specification/</link>
                <description>&lt;p&gt;First of all, I just released version 0.6 of &lt;em&gt;Logic for Programmers&lt;/em&gt;! You can get it &lt;a href="https://leanpub.com/logic/" target="_blank"&gt;here&lt;/a&gt;. Release notes in the footnote.&lt;sup id="fnref:release-notes"&gt;&lt;a class="footnote-ref" href="#fn:release-notes"&gt;1&lt;/a&gt;&lt;/sup&gt;&lt;/p&gt;
    &lt;p&gt;I've been thinking about my next project after the book's done. One idea is to do a survey of new formal specification languages. There's been a lot of new ones in the past few years (P, Quint, etc), plus some old ones I haven't critically examined (SPIN, mcrl2). I'm thinking of a brief overview of each, what's interesting about it, and some examples of the corresponding models.&lt;/p&gt;
    &lt;p&gt;For this I'd want a set of "Rosetta" examples. &lt;a href="https://rosettacode.org/wiki/Rosetta_Code" target="_blank"&gt;Rosetta Code&lt;/a&gt; is a collection of programming tasks done in different languages. For example, &lt;a href="https://rosettacode.org/wiki/99_bottles_of_beer" target="_blank"&gt;"99 bottles of beer on the wall"&lt;/a&gt; in over 300 languages. If I wanted to make a Rosetta Code for specifications of concurrent systems, what examples would I use? &lt;/p&gt;
    &lt;h3&gt;What makes a good Rosetta example?&lt;/h3&gt;
    &lt;p&gt;A good Rosetta example would be simple enough to understand and implement but also showcase the differences between the languages. &lt;/p&gt;
    &lt;p&gt;A good example of a Rosetta example is &lt;a href="https://github.com/hwayne/lets-prove-leftpad" target="_blank"&gt;leftpad for code verification&lt;/a&gt;. Proving leftpad correct is short in whatever verification language you use. But the proofs themselves are different enough that you can compare what it's like to use code contracts vs dependent types, etc. &lt;/p&gt;
    &lt;p&gt;A &lt;em&gt;bad&lt;/em&gt; Rosetta example is "hello world". While it's good for showing how to run a language, it doesn't clearly differentiate languages. Haskell's "hello world" is almost identical to BASIC's "hello world".&lt;/p&gt;
    &lt;p&gt;Rosetta examples don't have to be flashy, but I &lt;em&gt;want&lt;/em&gt; mine to be flashy. Formal specification is niche enough that regardless of my medium, most of my audience hasn't used it and may be skeptical. I always have to be selling. This biases me away from using things like dining philosophers or two-phase commit.&lt;/p&gt;
    &lt;p&gt;So with that in mind, three ideas:&lt;/p&gt;
    &lt;h3&gt;1. Wrapped Counter&lt;/h3&gt;
    &lt;p&gt;A counter that starts at 1 and counts to N, after which it wraps around to 1 again.&lt;/p&gt;
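    &lt;p&gt;As a rough illustration of the behavior being specified (ordinary Python, not a formal spec; the function name is made up), the counter could be sketched as a generator:&lt;/p&gt;

```python
from itertools import islice


def wrapped_counter(n):
    """Yield 1, 2, ..., n, then wrap around to 1 again, forever."""
    value = 1
    while True:
        yield value
        # The only transition: step up, or wrap once we reach n.
        value = 1 if value == n else value + 1


print(list(islice(wrapped_counter(3), 7)))  # [1, 2, 3, 1, 2, 3, 1]
```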
    &lt;h4&gt;Why it's good&lt;/h4&gt;
    &lt;p&gt;This is a good introductory formal specification: it's a minimal possible stateful system without concurrency or nondeterminism. You can use it to talk about the basic structure of a spec, how a verifier works, etc. It's also a good way of introducing "boring" semantics, like conditionals and arithmetic, and checking if the language does anything unusual with them. Alloy, for example, defaults to 4-bit signed integers, so you run into problems if you set N too high.&lt;sup id="fnref:alloy"&gt;&lt;a class="footnote-ref" href="#fn:alloy"&gt;2&lt;/a&gt;&lt;/sup&gt;&lt;/p&gt;
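    &lt;p&gt;To make the 4-bit problem concrete, here is a small sketch of two's-complement wraparound. It assumes overflow silently wraps; how Alloy actually treats overflow depends on its settings, so read this as arithmetic rather than a description of the tool.&lt;/p&gt;

```python
BITS = 4  # the default integer bitwidth mentioned above


def wrap_signed(value, bits=BITS):
    """Reduce value into the two's-complement range for the given bitwidth."""
    half = 2 ** (bits - 1)  # 8 for 4 bits, so the range is -8..7
    return (value + half) % (2 * half) - half


print(wrap_signed(7))      # 7, the largest representable value
print(wrap_signed(7 + 1))  # -8, one past the top wraps to the bottom
```

    &lt;p&gt;Under that assumption, a counter with N above 7 can never reach N, so the wrap back to 1 would never trigger.&lt;/p&gt;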
    &lt;p&gt;At the same time, wrapped counters are a common building block of complex systems. Lots of things can be represented this way: &lt;code&gt;N=2&lt;/code&gt; is a flag or blinker, &lt;code&gt;N=3&lt;/code&gt; is a traffic light, &lt;code&gt;N=24&lt;/code&gt; is a clock, etc.&lt;/p&gt;
    &lt;p&gt;The next example is better for showing basic &lt;a href="https://www.hillelwayne.com/post/safety-and-liveness/" target="_blank"&gt;safety and liveness properties&lt;/a&gt;, but this will do in a pinch. &lt;/p&gt;
    &lt;h3&gt;2. Threads&lt;/h3&gt;
    &lt;p&gt;A counter starts at 0. N threads each simultaneously try to update the counter. They do this nonatomically: first they read the value of the counter and store that in a thread-local &lt;code&gt;tmp&lt;/code&gt;, then they increment &lt;code&gt;tmp&lt;/code&gt;, then they set the counter to &lt;code&gt;tmp&lt;/code&gt;. The expected behavior is that the final value of the counter will be N.&lt;/p&gt;
    &lt;h4&gt;Why it's good&lt;/h4&gt;
    &lt;p&gt;The system as described is bugged. If two threads interleave the setlocal commands, one thread's update can "clobber" the other and the counter can go backwards. To my surprise, most people &lt;em&gt;do not&lt;/em&gt; see this error. So it's a good showcase of how the language actually finds real bugs, and how it can verify fixes.&lt;/p&gt;
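    &lt;p&gt;A brute-force sketch of what a model checker does here: enumerate every interleaving and collect the possible final values. This is plain Python, not any spec language, and it folds "read" and "increment &lt;code&gt;tmp&lt;/code&gt;" into one step, since the increment only touches thread-local state. The function name and state encoding are illustrative assumptions.&lt;/p&gt;

```python
def final_counter_values(n_threads):
    """Return every final counter value reachable under some interleaving."""
    finals = set()
    seen = set()
    # State: (counter, per-thread program counter, per-thread tmp).
    # Program counter 0 = about to read, 1 = about to write, 2 = done.
    stack = [(0, (0,) * n_threads, (0,) * n_threads)]
    while stack:
        state = stack.pop()
        if state in seen:
            continue
        seen.add(state)
        counter, pcs, tmps = state
        if all(pc == 2 for pc in pcs):
            finals.add(counter)
            continue
        for t in range(n_threads):
            if pcs[t] == 0:
                # Read the shared counter into tmp and increment tmp locally.
                new_tmps = tmps[:t] + (counter + 1,) + tmps[t + 1:]
                new_pcs = pcs[:t] + (1,) + pcs[t + 1:]
                stack.append((counter, new_pcs, new_tmps))
            elif pcs[t] == 1:
                # Write tmp back, clobbering whatever the counter holds now.
                new_pcs = pcs[:t] + (2,) + pcs[t + 1:]
                stack.append((tmps[t], new_pcs, tmps))
    return finals


print(sorted(final_counter_values(2)))  # [1, 2]
```

    &lt;p&gt;With two threads the counter can end at 1 instead of 2: both threads read 0 before either writes.&lt;/p&gt;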
    &lt;p&gt;As to actual language topics: the spec covers concurrency and tracks process-local state. A good spec language should make it possible to adjust N without having to add any new variables. And it "naturally" introduces safety, liveness, and &lt;a href="https://www.hillelwayne.com/post/action-properties/" target="_blank"&gt;action&lt;/a&gt; properties.&lt;/p&gt;
    &lt;p&gt;Finally, the thread spec is endlessly adaptable. I've used variations of it to teach refinement, resource starvation, fairness, livelocks, and hyperproperties. Tweak it a bit and you get dining philosophers.&lt;/p&gt;
    &lt;h3&gt;3. Bounded buffer&lt;/h3&gt;
    &lt;p&gt;We have a bounded buffer with maximum length &lt;code&gt;X&lt;/code&gt;. We have &lt;code&gt;R&lt;/code&gt; reader and &lt;code&gt;W&lt;/code&gt; writer processes. Before writing, writers first check if the buffer is full. If full, the writer goes to sleep. Otherwise, the writer wakes up &lt;em&gt;a random&lt;/em&gt; sleeping process, then pushes an arbitrary value. Readers work the same way, except they pop from the buffer (and go to sleep if the buffer is empty).&lt;/p&gt;
    &lt;p&gt;The only way for a sleeping process to wake up is if another process successfully performs a read or write.&lt;/p&gt;
    &lt;h4&gt;Why it's good&lt;/h4&gt;
    &lt;p&gt;This shows process-local nondeterminism (in choosing which sleeping process to wake up), different behavior for different types of processes, and deadlocks: it's possible for every reader and writer to be asleep at the same time.&lt;/p&gt;
    &lt;p&gt;The beautiful thing about this example: the spec can only deadlock if &lt;code&gt;X < 2*(R+W)&lt;/code&gt;. This is the kind of bug you'd struggle to debug in real code. And in fact, people did struggle: even when presented with a minimal code sample and told there was a bug, many &lt;a href="http://wiki.c2.com/?ExtremeProgrammingChallengeFourteen" target="_blank"&gt;testing experts couldn't find it&lt;/a&gt;. Whereas a formal model of the same code &lt;a href="https://www.hillelwayne.com/post/augmenting-agile/" target="_blank"&gt;finds the bug in seconds&lt;/a&gt;. &lt;/p&gt;
    &lt;p&gt;If a spec language can model the bounded buffer, then it's good enough for production systems.&lt;/p&gt;
    &lt;p&gt;On top of that, the bug happens regardless of what writers actually put in the buffer, so you can abstract that all away. This example can demonstrate that you can leave implementation details out of a spec and still find critical errors.&lt;/p&gt;
    &lt;h2&gt;Caveat&lt;/h2&gt;
    &lt;p&gt;This is all with a &lt;em&gt;heavy&lt;/em&gt; TLA+ bias. I've modeled all of these systems in TLA+ and it works pretty well for them. That is to say, none of these do things TLA+ is &lt;em&gt;bad&lt;/em&gt; at: reachability, subtyping, transitive closures, unbound spaces, etc. I imagine that as I cover more specification languages I'll find new Rosettas.&lt;/p&gt;
    &lt;div class="footnote"&gt;
    &lt;hr/&gt;
    &lt;ol&gt;
    &lt;li id="fn:release-notes"&gt;
    &lt;ul&gt;
    &lt;li&gt;Exercises are more compact, answers now show name of exercise in title&lt;/li&gt;
    &lt;/ul&gt;
    &lt;ul&gt;
    &lt;li&gt;"Conditionals" chapter has new section on nested conditionals&lt;/li&gt;
    &lt;/ul&gt;
    &lt;ul&gt;
    &lt;li&gt;"Crash course" chapter significantly rewritten&lt;/li&gt;
    &lt;li&gt;Started migrating to consistently use &lt;code&gt;==&lt;/code&gt; for equality and &lt;code&gt;=&lt;/code&gt; for definition. Not everything is migrated yet&lt;/li&gt;
    &lt;li&gt;"Beyond Logic" appendix does a &lt;em&gt;slightly&lt;/em&gt; better job of covering HOL and constructive logic&lt;/li&gt;
    &lt;li&gt;Addressed various reader feedback&lt;/li&gt;
    &lt;li&gt;Two new exercises&lt;/li&gt;
    &lt;/ul&gt;
    &lt;p&gt;&lt;a class="footnote-backref" href="#fnref:release-notes" title="Jump back to footnote 1 in the text"&gt;↩&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id="fn:alloy"&gt;
    &lt;p&gt;You can change the int size in a model run, so this is more "surprising footgun and inconvenience" than "fundamental limit of the specification language." Something still good to know! &lt;a class="footnote-backref" href="#fnref:alloy" title="Jump back to footnote 2 in the text"&gt;↩&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;/ol&gt;
    &lt;/div&gt;</description>
                <pubDate>Wed, 15 Jan 2025 17:34:40 +0000</pubDate>
                <guid>https://buttondown.com/hillelwayne/archive/what-are-the-rosettas-of-formal-specification/</guid>
            </item>
            <item>
                <title>"Logic for Programmers" Project Update</title>
                <link>https://buttondown.com/hillelwayne/archive/logic-for-programmers-project-update/</link>
                <description>&lt;p&gt;Happy new year everyone!&lt;/p&gt;
    &lt;p&gt;I released the first &lt;em&gt;Logic for Programmers&lt;/em&gt; alpha six months ago. There have been four new versions since then, with the November release putting us in beta. Between work and holidays I didn't make much progress in December, but there will be a 0.6 release in the next week or two.&lt;/p&gt;
    &lt;p&gt;People have asked me if the book will ever be available in print, and my answer to that is "when it's done". To keep "when it's done" from being "never", I'm committing myself to &lt;strong&gt;have the book finished by July.&lt;/strong&gt; That means roughly six more releases between now and the official First Edition. Then I will start looking for a way to get it printed.&lt;/p&gt;
    &lt;h3&gt;The Current State and What Needs to be Done&lt;/h3&gt;
    &lt;p&gt;Right now the book is 26,000 words. For the most part, the structure is set— I don't plan to reorganize the chapters much. But I still need to fix shortcomings identified by the reader feedback. In particular, a few topics need more on real world applications, and the Alloy chapter is pretty weak. There's also a bunch of notes and todos and "fix this"s I need to go over.&lt;/p&gt;
    &lt;p&gt;I also need to rewrite the introduction and predicate logic chapters. Those haven't changed much since 0.1 and I need to go over them &lt;em&gt;very carefully&lt;/em&gt;.&lt;/p&gt;
    &lt;p&gt;After that comes copyediting.&lt;/p&gt;
    &lt;h4&gt;Ugh, Copyediting&lt;/h4&gt;
    &lt;p&gt;Copyediting means going through the entire book to make word- and sentence-level changes to the flow. An example would be changing&lt;/p&gt;
    &lt;table&gt;
    &lt;thead&gt;
    &lt;tr&gt;
    &lt;th&gt;From&lt;/th&gt;
    &lt;th&gt;To&lt;/th&gt;
    &lt;/tr&gt;
    &lt;/thead&gt;
    &lt;tbody&gt;
    &lt;tr&gt;
    &lt;td&gt;I said predicates are just “boolean functions”. That isn’t &lt;em&gt;quite&lt;/em&gt; true.&lt;/td&gt;
    &lt;td&gt;It's easy to think of predicates as just "boolean" functions, but there is a subtle and important difference.&lt;/td&gt;
    &lt;/tr&gt;
    &lt;/tbody&gt;
    &lt;/table&gt;
    &lt;p&gt;It's a tiny difference but it reads slightly better to me and makes the book slightly better. Now repeat that for all 3000-odd sentences in the book and I'm done with copyediting!&lt;/p&gt;
    &lt;p&gt;For the first pass, anyway. Copyediting is miserable. &lt;/p&gt;
    &lt;p&gt;Some of the changes I need to make come from reader feedback, but most will come from going through it line-by-line with a copyeditor. Someone's kindly offered to do some of this for free, but I want to find a professional too. If you know anybody, let me know.&lt;/p&gt;
    &lt;h4&gt;Formatting&lt;/h4&gt;
    &lt;p&gt;The book, if I'm being honest, looks ugly. I'm using the default sphinx/latex combination for layout and typesetting. My thinking is it's not worth making the book pretty until it's worth reading. But I also want the book, when it's eventually printed, to look &lt;em&gt;nice&lt;/em&gt;. At the very least it shouldn't have "self-published" vibes. &lt;/p&gt;
    &lt;p&gt;I've found someone who's been giving me excellent advice on layout and I'm slowly mastering the LaTeX formatting arcana. It's gonna take a few iterations to get things right.&lt;/p&gt;
    &lt;h4&gt;Front cover&lt;/h4&gt;
    &lt;p&gt;Currently the front cover is this:&lt;/p&gt;
    &lt;p&gt;&lt;img alt="Front cover" class="newsletter-image" src="https://assets.buttondown.email/images/b42ee3de-9d8a-4729-809e-a8739741f0cf.png?w=960&amp;fit=max"/&gt;&lt;/p&gt;
    &lt;p&gt;It works but gives "programmer spent ten minutes in Inkscape" vibes. I have a vision in my head for what would be nicer. A few people have recommended using Fiverr. So far the results haven't been that good. &lt;/p&gt;
    &lt;h4&gt;Fixing Epub&lt;/h4&gt;
    &lt;p&gt;&lt;em&gt;Ugh&lt;/em&gt;&lt;/p&gt;
    &lt;p&gt;I thought making an epub version would be kinder for phone reading, but it's such a painful format to develop for. Did you know that epub backlinks work totally differently on Kindle vs other ereaders? Did you know the only way to test if you got em working right is to load them up in a virtual Kindle? The feedback loops are miserable. So I've been treating epub as a second-class citizen for now and only fixing the &lt;em&gt;worst&lt;/em&gt; errors (like math not rendering properly), but that'll have to change as the book finalizes.&lt;/p&gt;
    &lt;h3&gt;What comes next?&lt;/h3&gt;
    &lt;p&gt;After 1.0, I get my book an ISBN and figure out how to make print copies. The margin on print is &lt;em&gt;way&lt;/em&gt; lower than ebooks, especially if it's on-demand: the net royalties for &lt;a href="https://kdp.amazon.com/en_US/help/topic/G201834330" target="_blank"&gt;Amazon direct publishing&lt;/a&gt; would be 7 dollars on a 20-dollar book (as opposed to Leanpub's 16 dollars). Would having a print version double the sales? I hope so! Either way, a lot of people have been asking about a print version so I want to make that possible.&lt;/p&gt;
    &lt;p&gt;(I also want to figure out how to give people who already have the ebook a discount on print, but I don't know if that's feasible.)&lt;/p&gt;
    &lt;p&gt;Then, I dunno, maybe make a talk or a workshop I can pitch to conferences. Once I have that I think I can call &lt;em&gt;LfP&lt;/em&gt; complete... at least until the second edition.&lt;/p&gt;
    &lt;hr/&gt;
    &lt;p&gt;Anyway none of that is actually technical so here's a quick fun thing. I spent a good chunk of my break reading the &lt;a href="https://www.mcrl2.org/web/index.html" target="_blank"&gt;mCRL2 book&lt;/a&gt;. mCRL2 defines an "algebra" for "communicating processes". As a very broad explanation, that's defining what it means to "add" and "multiply" two processes. What's interesting is that according to their definition, the algebra follows the distributive law, &lt;em&gt;but only if you multiply on the right&lt;/em&gt;. eg&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;// VALID
    (a+b)*c = a*c + b*c
    
    // INVALID
    a*(b+c) = a*b + a*c
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;This is the first time I've ever seen this in practice! Jury's still out on the rest of the language.&lt;/p&gt;
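    &lt;p&gt;One way to get a feel for why only one side distributes is a toy model. This is an assumption for illustration, not mCRL2's actual semantics. Treat a process as a set of (action, what-happens-next) branches, so two processes are equal only when they offer the same choices at the same moments. All names below are made up.&lt;/p&gt;

```python
# Toy model: a process is a frozenset of (action, continuation) pairs.
DONE = frozenset()  # the process with nothing left to do


def act(name):
    """A process that performs one action and then finishes."""
    return frozenset({(name, DONE)})


def plus(p, q):
    """Choice: behave as either p or q."""
    return p | q


def times(p, q):
    """Sequencing: run p to completion, then continue as q."""
    if p == DONE:
        return q
    return frozenset((name, times(rest, q)) for name, rest in p)


a, b, c = act("a"), act("b"), act("c")

# (a+b)*c versus a*c + b*c: the choice happens up front on both sides.
right_distributes = times(plus(a, b), c) == plus(times(a, c), times(b, c))
# a*(b+c) chooses after doing a; a*b + a*c commits before doing a.
left_distributes = times(a, plus(b, c)) == plus(times(a, b), times(a, c))

print(right_distributes)  # True
print(left_distributes)   # False
```

    &lt;p&gt;In this model the invalid law fails because &lt;code&gt;a*(b+c)&lt;/code&gt; still has both options open after &lt;code&gt;a&lt;/code&gt;, while &lt;code&gt;a*b + a*c&lt;/code&gt; has already picked one.&lt;/p&gt;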
    &lt;hr/&gt;
    &lt;h3&gt;Videos and Stuff&lt;/h3&gt;
    &lt;ul&gt;
    &lt;li&gt;My &lt;em&gt;DDD Europe&lt;/em&gt; talk is now out! &lt;a href="https://www.youtube.com/watch?v=uRmNSuYBUOU" target="_blank"&gt;What We Know We Don't Know&lt;/a&gt; is about empirical software engineering in general, and software engineering research on Domain Driven Design in particular.&lt;/li&gt;
    &lt;li&gt;I was interviewed in the last video on &lt;a href="https://www.youtube.com/watch?v=yXxmSI9SlwM" target="_blank"&gt;Craft vs Cruft&lt;/a&gt;'s "Year of Formal Methods". Check it out!&lt;/li&gt;
    &lt;/ul&gt;</description>
                <pubDate>Tue, 07 Jan 2025 18:49:40 +0000</pubDate>
                <guid>https://buttondown.com/hillelwayne/archive/logic-for-programmers-project-update/</guid>
            </item>
        </channel>
    </rss>
    Raw text
    <?xml version="1.0" encoding="utf-8"?>
    <rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Computer Things</title><link>https://buttondown.com/hillelwayne</link><description>Hi, I'm Hillel. This is the newsletter version of [my website](https://www.hillelwayne.com). I post all website updates here. I also post weekly content just for the newsletter, on topics like
    
    * Formal Methods
    
    * Software History and Culture
    
    * Fringetech and exotic tooling
    
    * The philosophy and theory of software engineering
    
    You can see the archive of all public essays [here](https://buttondown.email/hillelwayne/archive/).</description><atom:link href="https://buttondown.email/hillelwayne/rss" rel="self"/><language>en-us</language><lastBuildDate>Wed, 10 Sep 2025 13:00:00 +0000</lastBuildDate><item><title>Many Hard Leetcode Problems are Easy Constraint Problems</title><link>https://buttondown.com/hillelwayne/archive/many-hard-leetcode-problems-are-easy-constraint/</link><description>
    &lt;p&gt;In my first interview out of college I was asked the change counter problem:&lt;/p&gt;
    &lt;blockquote&gt;
    &lt;p&gt;Given a set of coin denominations, find the minimum number of coins required to make change for a given number. IE for USA coinage and 37 cents, the minimum number is four (quarter, dime, 2 pennies).&lt;/p&gt;
    &lt;/blockquote&gt;
    &lt;p&gt;I implemented the simple greedy algorithm and immediately fell into the trap of the question: the greedy algorithm only works for "well-behaved" denominations. If the coin values were &lt;code&gt;[10, 9, 1]&lt;/code&gt;, then making 37 cents would take 10 coins in the greedy algorithm but only 4 coins optimally (&lt;code&gt;10+9+9+9&lt;/code&gt;). The "smart" answer is to use a dynamic programming algorithm, which I didn't know how to do. So I failed the interview.&lt;/p&gt;
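    &lt;p&gt;To make the trap concrete, here is a minimal Python sketch of both approaches. The function names &lt;code&gt;greedy_coins&lt;/code&gt; and &lt;code&gt;min_coins&lt;/code&gt; are made up for illustration, and the dynamic programming version is just one common formulation, not necessarily the one the interviewer had in mind.&lt;/p&gt;

```python
def greedy_coins(values, total):
    """Greedy: take as many of the largest coin as fit, then move to the next."""
    count = 0
    for v in sorted(values, reverse=True):
        used, total = divmod(total, v)
        count += used
    # If something is left over, greedy could not make exact change.
    return count if total == 0 else None


def min_coins(values, total):
    """Dynamic programming: best[t] is the fewest coins that sum to exactly t."""
    unreachable = float("inf")
    best = [0] + [unreachable] * total
    for t in range(1, total + 1):
        for v in values:
            if t - v >= 0:
                best[t] = min(best[t], best[t - v] + 1)
    return None if best[total] == unreachable else best[total]


print(greedy_coins([10, 9, 1], 37))  # greedy answer
print(min_coins([10, 9, 1], 37))     # optimal answer
```

    &lt;p&gt;For &lt;code&gt;[10, 9, 1]&lt;/code&gt; and 37 the two functions disagree (10 coins versus 4), while for USA coinage they agree.&lt;/p&gt;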
    &lt;p&gt;But you only need dynamic programming if you're writing your own algorithm. It's really easy if you throw it into a constraint solver like &lt;a href="https://www.minizinc.org/" target="_blank"&gt;MiniZinc&lt;/a&gt; and call it a day. &lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;int: total;
    array[int] of int: values = [10, 9, 1];
    array[index_set(values)] of var 0..: coins;
    
    constraint sum (c in index_set(coins)) (coins[c] * values[c]) == total;
    solve minimize sum(coins);
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;You can try this online &lt;a href="https://play.minizinc.dev/" target="_blank"&gt;here&lt;/a&gt;. It'll give you a prompt to put in &lt;code&gt;total&lt;/code&gt; and then give you successively-better solutions:&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;coins = [0, 0, 37];
    ----------
    coins = [0, 1, 28];
    ----------
    coins = [0, 2, 19];
    ----------
    coins = [0, 3, 10];
    ----------
    coins = [0, 4, 1];
    ----------
    coins = [1, 3, 0];
    ----------
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p class="empty-line" style="height:16px; margin:0px !important;"&gt;&lt;/p&gt;
    &lt;p&gt;Lots of similar interview questions are this kind of mathematical optimization problem, where we have to find the maximum or minimum of a function corresponding to constraints. They're hard in programming languages because programming languages are too low-level. They are also exactly the problems that constraint solvers were designed to solve. Hard leetcode problems are easy constraint problems.&lt;sup id="fnref:leetcode"&gt;&lt;a class="footnote-ref" href="#fn:leetcode"&gt;1&lt;/a&gt;&lt;/sup&gt; Here I'm using MiniZinc, but you could just as easily use Z3 or OR-Tools or whatever your favorite generalized solver is.&lt;/p&gt;
    &lt;h3&gt;More examples&lt;/h3&gt;
    &lt;p&gt;This was a question in a different interview (which I thankfully passed):&lt;/p&gt;
    &lt;blockquote&gt;
    &lt;p&gt;Given a list of stock prices through the day, find maximum profit you can get by buying one stock and selling one stock later.&lt;/p&gt;
    &lt;/blockquote&gt;
    &lt;p&gt;It's easy to do in O(n^2) time, or if you are clever, you can do it in O(n). Or you could be not clever at all and just write it as a constraint problem:&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;array[int] of int: prices = [3, 1, 4, 1, 5, 9, 2, 6, 5, 3, 5, 8];
    var int: buy;
    var int: sell;
    var int: profit = prices[sell] - prices[buy];
    
    constraint sell &amp;gt; buy;
    constraint profit &amp;gt; 0;
    solve maximize profit;
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
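    &lt;p&gt;For comparison with the constraint model, the "clever" O(n) version can be sketched in Python roughly like this. The name &lt;code&gt;max_profit&lt;/code&gt; is hypothetical, the list is assumed to be non-empty, and unlike the model (which requires &lt;code&gt;profit &amp;gt; 0&lt;/code&gt;) this sketch returns 0 when no profitable trade exists.&lt;/p&gt;

```python
def max_profit(prices):
    """Single pass: remember the cheapest earlier price and the best sale against it."""
    best = 0
    cheapest = prices[0]
    for price in prices[1:]:
        # Best profit if we sell now, having bought at the cheapest earlier price.
        best = max(best, price - cheapest)
        cheapest = min(cheapest, price)
    return best


print(max_profit([3, 1, 4, 1, 5, 9, 2, 6, 5, 3, 5, 8]))
```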
    &lt;p&gt;Reminder, link to trying it online &lt;a href="https://play.minizinc.dev/" target="_blank"&gt;here&lt;/a&gt;. While working at that job, one interview question we tested out was:&lt;/p&gt;
    &lt;blockquote&gt;
    &lt;p&gt;Given a list, determine if three numbers in that list can be added or subtracted to give 0? &lt;/p&gt;
    &lt;/blockquote&gt;
    &lt;p&gt;This is a satisfaction problem, not an optimization problem: we don't need the "best answer", any answer will do. We eventually decided against it for being too tricky for the engineers we were targeting. But it's not tricky in a solver:&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;include "globals.mzn";
    array[int] of int: numbers = [3, 1, 4, 1, 5, 9, 2, 6, 5, 3, 5, 8];
    array[index_set(numbers)] of var {0, -1, 1}: choices;
    
    constraint sum(n in index_set(numbers)) (numbers[n] * choices[n]) = 0;
    constraint count(choices, -1) + count(choices, 1) = 3;
    solve satisfy;
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
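    &lt;p&gt;If you did want to handwrite it, a brute-force Python sketch might look like the following. It tries every choice of three positions and every sign pattern, so it is cubic in the length of the list. The name &lt;code&gt;signed_triple&lt;/code&gt; and the return shape are my own assumptions, not part of the original question.&lt;/p&gt;

```python
from itertools import combinations, product


def signed_triple(numbers):
    """Return (indices, signs) for three entries whose signed sum is 0, or None."""
    for idx in combinations(range(len(numbers)), 3):
        for signs in product((1, -1), repeat=3):
            if sum(s * numbers[i] for s, i in zip(signs, idx)) == 0:
                return idx, signs
    return None


print(signed_triple([3, 1, 4, 1, 5, 9, 2, 6, 5, 3, 5, 8]))
```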
    &lt;p&gt;Okay, one last one, a problem I saw last year at &lt;a href="https://chicagopython.github.io/algosig/" target="_blank"&gt;Chipy AlgoSIG&lt;/a&gt;. Basically they pick some leetcode problems and we all do them. I failed to solve &lt;a href="https://leetcode.com/problems/largest-rectangle-in-histogram/description/" target="_blank"&gt;this one&lt;/a&gt;:&lt;/p&gt;
    &lt;blockquote&gt;
    &lt;p&gt;Given an array of integers heights representing the histogram's bar height where the width of each bar is 1, return the area of the largest rectangle in the histogram.&lt;/p&gt;
    &lt;p&gt;&lt;img alt="example from leetcode link" class="newsletter-image" src="https://assets.buttondown.email/images/63337f78-7138-4b21-87a0-917c0c5b1706.jpg?w=960&amp;amp;fit=max"/&gt;&lt;/p&gt;
    &lt;/blockquote&gt;
    &lt;p&gt;The "proper" solution is a tricky thing involving tracking lots of bookkeeping states, which you can completely bypass by expressing it as constraints:&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;array[int] of int: numbers = [2,1,5,6,2,3];
    
    var 1..length(numbers): x; 
    var 0..length(numbers)-1: dx;
    var 1..: y;
    
    constraint x + dx &amp;lt;= length(numbers);
    constraint forall (i in x..(x+dx)) (y &amp;lt;= numbers[i]);
    
    var int: area = (dx+1)*y;
    solve maximize area;
    
    output ["(\(x)-&amp;gt;\(x+dx))*\(y) = \(area)"]
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;There's even a way to &lt;a href="https://docs.minizinc.dev/en/2.9.3/visualisation.html" target="_blank"&gt;automatically visualize the solution&lt;/a&gt; (using &lt;code&gt;vis_geost_2d&lt;/code&gt;), but I didn't feel like figuring it out in time for the newsletter.&lt;/p&gt;
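    &lt;p&gt;For reference, here is a rough Python sketch that mirrors the constraint model by enumerating every start and end bar. This is the simple O(n²) enumeration, not the "proper" bookkeeping-heavy solution, and &lt;code&gt;largest_rectangle&lt;/code&gt; is a name I made up.&lt;/p&gt;

```python
def largest_rectangle(heights):
    """Try every span of bars; the rectangle height is the lowest bar in the span."""
    best = 0
    for start in range(len(heights)):
        lowest = heights[start]
        for end in range(start, len(heights)):
            lowest = min(lowest, heights[end])
            best = max(best, lowest * (end - start + 1))
    return best


print(largest_rectangle([2, 1, 5, 6, 2, 3]))
```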
    &lt;h3&gt;Is this better?&lt;/h3&gt;
    &lt;p&gt;Now if I actually brought these questions to an interview the interviewee could ruin my day by asking "what's the runtime complexity?" Constraint solver runtimes are unpredictable and almost always slower than an ideal bespoke algorithm because they are more expressive, in what I refer to as the &lt;a href="https://buttondown.com/hillelwayne/archive/the-capability-tractability-tradeoff/" target="_blank"&gt;capability/tractability tradeoff&lt;/a&gt;. But even so, they'll do way better than a &lt;em&gt;bad&lt;/em&gt; bespoke algorithm, and I'm not experienced enough in handwriting algorithms to consistently beat a solver.&lt;/p&gt;
    &lt;p&gt;The real advantage of solvers, though, is how well they handle new constraints. Take the stock picking problem above. I can write an O(n²) algorithm in a few minutes and the O(n) algorithm if you give me some time to think. Now change the problem to&lt;/p&gt;
    &lt;blockquote&gt;
    &lt;p&gt;Maximize the profit by buying and selling up to &lt;code&gt;max_sales&lt;/code&gt; stocks, but you can only buy or sell one stock at a given time and you can only hold up to &lt;code&gt;max_hold&lt;/code&gt; stocks at a time?&lt;/p&gt;
    &lt;/blockquote&gt;
    &lt;p&gt;That's a way harder problem to write even an inefficient algorithm for! While the constraint problem is only a tiny bit more complicated:&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;include "globals.mzn";
    int: max_sales = 3;
    int: max_hold = 2;
    array[int] of int: prices = [3, 1, 4, 1, 5, 9, 2, 6, 5, 3, 5, 8];
    array [1..max_sales] of var int: buy;
    array [1..max_sales] of var int: sell;
    array [index_set(prices)] of var 0..max_hold: stocks_held;
    var int: profit = sum(s in 1..max_sales) (prices[sell[s]] - prices[buy[s]]);
    
    constraint forall (s in 1..max_sales) (sell[s] &amp;gt; buy[s]);
    constraint profit &amp;gt; 0;
    
    constraint forall(i in index_set(prices)) (stocks_held[i] = (count(s in 1..max_sales) (buy[s] &amp;lt;= i) - count(s in 1..max_sales) (sell[s] &amp;lt;= i)));
    constraint alldifferent(buy ++ sell);
    solve maximize profit;
    
    output ["buy at \(buy)\n", "sell at \(sell)\n", "for \(profit)"];
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p class="empty-line" style="height:16px; margin:0px !important;"&gt;&lt;/p&gt;
    &lt;p&gt;Most constraint solving examples online are puzzles, like &lt;a href="https://docs.minizinc.dev/en/stable/modelling2.html#ex-sudoku" target="_blank"&gt;Sudoku&lt;/a&gt; or "&lt;a href="https://docs.minizinc.dev/en/stable/modelling2.html#ex-smm" target="_blank"&gt;SEND + MORE = MONEY&lt;/a&gt;". Solving leetcode problems would be a more interesting demonstration. And you get more interesting opportunities to teach optimizations, like symmetry breaking.&lt;/p&gt;
    &lt;hr/&gt;
    &lt;h3&gt;Update for the Internet&lt;/h3&gt;
    &lt;p&gt;This was sent as a weekly newsletter, which is usually on topics like &lt;a href="https://buttondown.com/hillelwayne/archive/why-do-we-call-it-boilerplate-code" target="_blank"&gt;software history&lt;/a&gt;, &lt;a href="https://buttondown.com/hillelwayne/archive/the-seven-specification-ur-languages/" target="_blank"&gt;formal methods&lt;/a&gt;, &lt;a href="https://buttondown.com/hillelwayne/archive/i-formally-modeled-dreidel-for-no-good-reason/" target="_blank"&gt;unusual technologies&lt;/a&gt;, and the &lt;a href="https://buttondown.com/hillelwayne/archive/be-suspicious-of-success/" target="_blank"&gt;theory of software engineering&lt;/a&gt;. You can subscribe here: &lt;/p&gt;
    &lt;div class="subscribe-form"&gt;&lt;/div&gt;
    &lt;div class="footnote"&gt;
    &lt;hr/&gt;
    &lt;ol&gt;
    &lt;li id="fn:leetcode"&gt;
    &lt;p&gt;Because my dad will email me if I don't explain this: "leetcode" is slang for "tricky algorithmic interview questions that have little-to-no relevance in the actual job you're interviewing for." It's from &lt;a href="https://leetcode.com/" target="_blank"&gt;leetcode.com&lt;/a&gt;. &lt;a class="footnote-backref" href="#fnref:leetcode" title="Jump back to footnote 1 in the text"&gt;↩&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;/ol&gt;
    &lt;/div&gt;
    </description><pubDate>Wed, 10 Sep 2025 13:00:00 +0000</pubDate><guid>https://buttondown.com/hillelwayne/archive/many-hard-leetcode-problems-are-easy-constraint/</guid></item><item><title>The Angels and Demons of Nondeterminism</title><link>https://buttondown.com/hillelwayne/archive/the-angels-and-demons-of-nondeterminism/</link><description>
    &lt;p&gt;Greetings everyone! You might have noticed that it's September and I don't have the next version of &lt;em&gt;Logic for Programmers&lt;/em&gt; ready. As penance, &lt;a href="https://leanpub.com/logic/c/september-2025-kuBCrhBnUzb7" target="_blank"&gt;here's ten free copies of the book&lt;/a&gt;.&lt;/p&gt;
    &lt;p&gt;So a few months ago I wrote &lt;a href="https://buttondown.com/hillelwayne/archive/five-kinds-of-nondeterminism/" target="_blank"&gt;a newsletter&lt;/a&gt; about how we use nondeterminism in formal methods.  The overarching idea:&lt;/p&gt;
    &lt;ol&gt;
    &lt;li&gt;Nondeterminism is when multiple paths are possible from a starting state.&lt;/li&gt;
    &lt;li&gt;A system preserves a property if it holds on &lt;em&gt;all&lt;/em&gt; possible paths. If even one path violates the property, then we have a bug.&lt;/li&gt;
    &lt;/ol&gt;
    &lt;p&gt;An intuitive model for this is that, when faced with a nondeterministic choice, the system always makes the &lt;em&gt;worst possible choice&lt;/em&gt;. This is sometimes called &lt;strong&gt;demonic nondeterminism&lt;/strong&gt; and is favored in formal methods because we are paranoid to a fault.&lt;/p&gt;
    &lt;p&gt;The opposite would be &lt;strong&gt;angelic nondeterminism&lt;/strong&gt;, where the system always makes the &lt;em&gt;best possible choice&lt;/em&gt;. A property then holds if &lt;em&gt;any&lt;/em&gt; possible path satisfies that property.&lt;sup id="fnref:duals"&gt;&lt;a class="footnote-ref" href="#fn:duals"&gt;1&lt;/a&gt;&lt;/sup&gt; This is not as common in FM, but it still has its uses! "Players can access the secret level" or "&lt;a href="https://www.hillelwayne.com/post/safety-and-liveness/#other-properties" target="_blank"&gt;We can always shut down the computer&lt;/a&gt;" are &lt;strong&gt;reachability&lt;/strong&gt; properties, that something is possible even if not actually done.&lt;/p&gt;
    &lt;p&gt;In broader computer science research, I'd say that angelic nondeterminism is more popular, due to its widespread use in complexity analysis and programming languages.&lt;/p&gt;
    &lt;h3&gt;Complexity Analysis&lt;/h3&gt;
    &lt;p&gt;P is the set of all "decision problems" (&lt;em&gt;basically&lt;/em&gt;, boolean functions) that can be solved in polynomial time: there's an algorithm that's worst-case in &lt;code&gt;O(n)&lt;/code&gt;, &lt;code&gt;O(n²)&lt;/code&gt;, &lt;code&gt;O(n³)&lt;/code&gt;, etc.&lt;sup id="fnref:big-o"&gt;&lt;a class="footnote-ref" href="#fn:big-o"&gt;2&lt;/a&gt;&lt;/sup&gt;  NP is the set of all problems that can be solved in polynomial time by an algorithm with &lt;em&gt;angelic nondeterminism&lt;/em&gt;.&lt;sup id="fnref:TM"&gt;&lt;a class="footnote-ref" href="#fn:TM"&gt;3&lt;/a&gt;&lt;/sup&gt; For example, the question "does list &lt;code&gt;l&lt;/code&gt; contain &lt;code&gt;x&lt;/code&gt;" can be solved in O(1) time by a nondeterministic algorithm:&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;fun is_member(l: List[T], x: T): bool {
      if l == [] {return false};
    
      guess i in 0..&lt;len(l);
      return l[i] == x;
    }
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;Say we call &lt;code&gt;is_member([a, b, c, d], c)&lt;/code&gt;. The best possible choice would be to guess &lt;code&gt;i = 2&lt;/code&gt;, which would correctly return true. Now call &lt;code&gt;is_member([a, b], d)&lt;/code&gt;. No matter what we guess, the algorithm correctly returns false. Ergo, O(1). NP stands for "Nondeterministic Polynomial". &lt;/p&gt;
    &lt;p&gt;(And I just now realized something pretty cool: you can say that P is the set of all problems solvable in polynomial time under &lt;em&gt;demonic nondeterminism&lt;/em&gt;, which is a nice parallel between the two classes.)&lt;/p&gt;
    &lt;p&gt;Computer scientists have proven that angelic nondeterminism doesn't give us any more "power": there are no problems solvable with AN that aren't also solvable deterministically. The big question is whether AN is more &lt;em&gt;efficient&lt;/em&gt;: it is widely believed, but not &lt;em&gt;proven&lt;/em&gt;, that there are problems in NP but not in P. Most famously, "Is there any variable assignment that makes this boolean formula true?" A polynomial AN algorithm is again easy:&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;fun SAT(f(x1, x2, …: bool): bool): bool {
       N = num_params(f)
       for i in 1..=num_params(f) {
         guess x_i in {true, false}
       }
    
       return f(x_1, x_2, …)
    }
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;The best deterministic algorithms we have to solve the same problem are worst-case exponential with the number of boolean parameters. This is a real frustrating problem because real computers don't have angelic nondeterminism, so problems like SAT remain hard. We can solve most "well-behaved" instances of the problem &lt;a href="https://www.hillelwayne.com/post/np-hard/" target="_blank"&gt;in reasonable time&lt;/a&gt;, but the worst-case instances get intractable real fast.&lt;/p&gt;
    &lt;h3&gt;Means of Abstraction&lt;/h3&gt;
    &lt;div class="subscribe-form"&gt;&lt;/div&gt;
    &lt;p&gt;We can directly turn an AN algorithm into a (possibly much slower) deterministic algorithm, such as by &lt;a href="https://en.wikipedia.org/wiki/Backtracking" target="_blank"&gt;backtracking&lt;/a&gt;. This makes AN a pretty good abstraction over what an algorithm is doing. Does the regex &lt;code&gt;(a+b)\1+&lt;/code&gt; match "abaabaabaab"? Yes, if the regex engine nondeterministically guesses that it needs to start at the third letter and make the group &lt;code&gt;aab&lt;/code&gt;. How does my PL's regex implementation find that match? I dunno, backtracking or &lt;a href="https://swtch.com/~rsc/regexp/regexp1.html" target="_blank"&gt;NFA construction&lt;/a&gt; or something, I don't need to know the deterministic specifics in order to use the nondeterministic abstraction.&lt;/p&gt;
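    &lt;p&gt;As a rough illustration of that translation, here are Python stand-ins for the two &lt;code&gt;guess&lt;/code&gt; algorithms, where every angelic guess is replaced by deterministically trying all the options. This is plain exhaustive enumeration, the bluntest possible simulation rather than clever backtracking, and the Python names and signatures are my own assumptions.&lt;/p&gt;

```python
from itertools import product


def is_member(l, x):
    """'guess i' becomes 'try every i': O(1) angelic time turns into O(n)."""
    return any(l[i] == x for i in range(len(l)))


def sat(f, num_params):
    """'guess each x_i' becomes 'try all 2**num_params assignments'."""
    return any(f(*guess) for guess in product((True, False), repeat=num_params))


print(is_member(["a", "b", "c", "d"], "c"))
print(sat(lambda x1, x2: x1 and not x2, 2))
```

    &lt;p&gt;The angelic version answers "yes" if &lt;em&gt;any&lt;/em&gt; guess works, which is exactly what &lt;code&gt;any&lt;/code&gt; computes here, just much more slowly.&lt;/p&gt;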
    &lt;p&gt;Neel Krishnaswami has &lt;a href="https://semantic-domain.blogspot.com/2013/07/what-declarative-languages-are.html" target="_blank"&gt;a great definition of 'declarative language'&lt;/a&gt;: "any language with a semantics has some nontrivial existential quantifiers in it". I'm not sure if this is &lt;em&gt;identical&lt;/em&gt; to saying "a language with an angelic nondeterministic abstraction", but they must be pretty close, and all of his examples match:&lt;/p&gt;
    &lt;ul&gt;
    &lt;li&gt;SQL's selects and joins&lt;/li&gt;
    &lt;li&gt;Parsing DSLs&lt;/li&gt;
    &lt;li&gt;Logic programming's unification&lt;/li&gt;
    &lt;li&gt;Constraint solving&lt;/li&gt;
    &lt;/ul&gt;
    &lt;p&gt;On top of that I'd add CSS selectors and &lt;a href="https://www.hillelwayne.com/post/picat/" target="_blank"&gt;planner's actions&lt;/a&gt;; all nondeterministic abstractions over a deterministic implementation. He also says that the things programmers hate most in declarative languages are features "that expose the operational model": constraint solver search strategies, Prolog cuts, regex backreferences, etc. Which again matches my experiences with angelic nondeterminism: I dread features that force me to understand the deterministic implementation. But they're necessary, since P probably != NP and so we need to worry about operational optimizations.&lt;/p&gt;
    &lt;h3&gt;Eldritch Nondeterminism&lt;/h3&gt;
    &lt;p&gt;If you need to know the &lt;a href="https://en.wikipedia.org/wiki/PP_(complexity)" target="_blank"&gt;ratio of good/bad paths&lt;/a&gt;, &lt;a href="https://en.wikipedia.org/wiki/%E2%99%AFP" target="_blank"&gt;the number of good paths&lt;/a&gt;, or probability, or anything more than "there is a good path" or "there is a bad path", you are beyond the reach of heaven or hell.&lt;/p&gt;
    &lt;div class="footnote"&gt;
    &lt;hr/&gt;
    &lt;ol&gt;
    &lt;li id="fn:duals"&gt;
    &lt;p&gt;Angelic and demonic nondeterminism are &lt;a href="https://buttondown.com/hillelwayne/archive/logical-duals-in-software-engineering/" target="_blank"&gt;duals&lt;/a&gt;: angelic returns "yes" if &lt;code&gt;some choice: correct&lt;/code&gt; and demonic returns "no" if &lt;code&gt;!all choice: correct&lt;/code&gt;, which is the same as &lt;code&gt;some choice: !correct&lt;/code&gt;. &lt;a class="footnote-backref" href="#fnref:duals" title="Jump back to footnote 1 in the text"&gt;↩&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id="fn:big-o"&gt;
    &lt;p&gt;Pet peeve about Big-O notation: &lt;code&gt;O(n²)&lt;/code&gt; is the &lt;em&gt;set&lt;/em&gt; of all algorithms that, for sufficiently large problem sizes, grow no faster than quadratically. "Bubblesort has &lt;code&gt;O(n²)&lt;/code&gt; complexity" &lt;em&gt;should&lt;/em&gt; be written &lt;code&gt;Bubblesort in O(n²)&lt;/code&gt;, &lt;em&gt;not&lt;/em&gt; &lt;code&gt;Bubblesort = O(n²)&lt;/code&gt;. &lt;a class="footnote-backref" href="#fnref:big-o" title="Jump back to footnote 2 in the text"&gt;↩&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id="fn:TM"&gt;
    &lt;p&gt;To be precise, solvable in polynomial time by a &lt;em&gt;Nondeterministic Turing Machine&lt;/em&gt;, a very particular model of computation. We can broadly talk about P and NP without framing everything in terms of Turing machines, but some details of complexity classes (like the existence of "weak NP-hardness") kinda need Turing machines to make sense. &lt;a class="footnote-backref" href="#fnref:TM" title="Jump back to footnote 3 in the text"&gt;↩&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;/ol&gt;
    &lt;/div&gt;
    </description><pubDate>Thu, 04 Sep 2025 14:00:00 +0000</pubDate><guid>https://buttondown.com/hillelwayne/archive/the-angels-and-demons-of-nondeterminism/</guid></item><item><title>Logical Duals in Software Engineering</title><link>https://buttondown.com/hillelwayne/archive/logical-duals-in-software-engineering/</link><description>
    &lt;p&gt;(&lt;a href="https://buttondown.com/hillelwayne/archive/sapir-whorf-does-not-apply-to-programming/" target="_blank"&gt;Last week's newsletter&lt;/a&gt; took too long and I'm way behind on &lt;a href="https://leanpub.com/logic/" target="_blank"&gt;Logic for Programmers&lt;/a&gt; revisions so short one this time.&lt;sup id="fnref:retread"&gt;&lt;a class="footnote-ref" href="#fn:retread"&gt;1&lt;/a&gt;&lt;/sup&gt;)&lt;/p&gt;
    &lt;p&gt;In classical logic, two operators &lt;code&gt;F/G&lt;/code&gt; are &lt;strong&gt;duals&lt;/strong&gt; if &lt;code&gt;F(x) = !G(!x)&lt;/code&gt;. Three examples:&lt;/p&gt;
    &lt;ol&gt;
    &lt;li&gt;&lt;code&gt;x || y&lt;/code&gt; is the same as &lt;code&gt;!(!x &amp;amp;&amp;amp; !y)&lt;/code&gt;.&lt;/li&gt;
    &lt;li&gt;&lt;code&gt;&amp;lt;&amp;gt;P&lt;/code&gt; ("P is possibly true") is the same as &lt;code&gt;![]!P&lt;/code&gt; ("not P isn't definitely true").&lt;/li&gt;
    &lt;li&gt;&lt;code&gt;some x in set: P(x)&lt;/code&gt; is the same as &lt;code&gt;!(all x in set: !P(x))&lt;/code&gt;.&lt;/li&gt;
    &lt;/ol&gt;
    &lt;p&gt;(1) is just a version of De Morgan's Law, which we regularly use to simplify boolean expressions. (2) is important in modal logic but has niche applications in software engineering, mostly in how it powers various formal methods.&lt;sup id="fnref:fm"&gt;&lt;a class="footnote-ref" href="#fn:fm"&gt;2&lt;/a&gt;&lt;/sup&gt; The real interesting one is (3), the "quantifier duals". We use lots of software tools to either &lt;em&gt;find&lt;/em&gt; a value satisfying &lt;code&gt;P&lt;/code&gt; or &lt;em&gt;check&lt;/em&gt; that all values satisfy &lt;code&gt;P&lt;/code&gt;. And by duality, any tool that does one can do the other, by seeing if it &lt;em&gt;fails&lt;/em&gt; to find/check &lt;code&gt;!P&lt;/code&gt;. Some examples in the wild:&lt;/p&gt;
    &lt;ul&gt;
    &lt;li&gt;Z3 is used to solve mathematical constraints, like "find x, where &lt;code&gt;f(x) &amp;gt;= 0&lt;/code&gt;". If I want to prove a property like "f is always positive", I ask Z3 to solve "find x, where &lt;code&gt;!(f(x) &amp;gt;= 0)&lt;/code&gt;", and see if that is unsatisfiable. This use case powers a LOT of theorem provers and formal verification tooling.&lt;/li&gt;
    &lt;li&gt;Property testing checks that all inputs to a code block satisfy a property. I've used it to generate complex inputs with certain properties by checking that all inputs &lt;em&gt;don't&lt;/em&gt; satisfy the property and reading out the test failure.&lt;/li&gt;
    &lt;li&gt;Model checkers check that all behaviors of a specification satisfy a property, so we can find a behavior that reaches a goal state G by checking that all states are &lt;code&gt;!G&lt;/code&gt;. &lt;a href="https://github.com/tlaplus/Examples/blob/master/specifications/DieHard/DieHard.tla" target="_blank"&gt;Here's TLA+ solving a puzzle this way&lt;/a&gt;.&lt;sup id="fnref:antithesis"&gt;&lt;a class="footnote-ref" href="#fn:antithesis"&gt;3&lt;/a&gt;&lt;/sup&gt;&lt;/li&gt;
    &lt;li&gt;Planners find behaviors that reach a goal state, so we can check if all behaviors satisfy a property P by asking it to reach goal state &lt;code&gt;!P&lt;/code&gt;.&lt;/li&gt;
    &lt;li&gt;The problem "find the shortest &lt;a href="https://en.wikipedia.org/wiki/Travelling_salesman_problem" target="_blank"&gt;traveling salesman route&lt;/a&gt;" can be broken into &lt;code&gt;some route: distance(route) = n&lt;/code&gt; and &lt;code&gt;all route: !(distance(route) &amp;lt; n)&lt;/code&gt;. Then a route finder can find the first, and then convert the second into a &lt;code&gt;some&lt;/code&gt; and &lt;em&gt;fail&lt;/em&gt; to find it, proving &lt;code&gt;n&lt;/code&gt; is optimal.&lt;/li&gt;
    &lt;/ul&gt;
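    &lt;p&gt;The common trick in all of these can be sketched in a few lines of Python. Here &lt;code&gt;check_all&lt;/code&gt; is a toy stand-in for a "checker" (property tester, model checker) over a finite domain, and &lt;code&gt;find_some&lt;/code&gt; turns it into a "finder" by checking the negated goal and reading out the counterexample. Both names are hypothetical, and real tools obviously search much larger spaces much more cleverly.&lt;/p&gt;

```python
def check_all(domain, prop):
    """Toy checker: None if prop holds for every x, else the first counterexample."""
    for x in domain:
        if not prop(x):
            return x
    return None


def find_some(domain, goal):
    """Finder built from the checker: a counterexample to 'all x: !goal(x)'
    is a witness for 'some x: goal(x)'."""
    return check_all(domain, lambda x: not goal(x))


print(find_some(range(10), lambda x: x * x == 49))
```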
    &lt;p&gt;Even cooler to me is when a tool does &lt;em&gt;both&lt;/em&gt; finding and checking, but gives them different "meanings". In SQL, &lt;code&gt;some x: P(x)&lt;/code&gt; is true if we can &lt;em&gt;query&lt;/em&gt; for &lt;code&gt;P(x)&lt;/code&gt; and get a nonempty response, while &lt;code&gt;all x: P(x)&lt;/code&gt; is true if all records satisfy the &lt;code&gt;P(x)&lt;/code&gt; &lt;em&gt;constraint&lt;/em&gt;. Most SQL databases allow for complex queries but not complex constraints! You got &lt;code&gt;UNIQUE&lt;/code&gt;, &lt;code&gt;NOT NULL&lt;/code&gt;, &lt;code&gt;REFERENCES&lt;/code&gt;, which are fixed predicates, and &lt;code&gt;CHECK&lt;/code&gt;, which is one-record only.&lt;sup id="fnref:check"&gt;&lt;a class="footnote-ref" href="#fn:check"&gt;4&lt;/a&gt;&lt;/sup&gt;&lt;/p&gt;
    &lt;p&gt;Oh, and you got database triggers, which can run arbitrary queries and throw exceptions. So if you really need to enforce a complex constraint &lt;code&gt;P(x, y, z)&lt;/code&gt;, you put in a database trigger that queries &lt;code&gt;some x, y, z: !P(x, y, z)&lt;/code&gt; and throws an exception if it finds any results. That all works because of quantifier duality! See &lt;a href="https://eddmann.com/posts/maintaining-invariant-constraints-in-postgresql-using-trigger-functions/" target="_blank"&gt;here&lt;/a&gt; for an example of this in practice.&lt;/p&gt;
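    &lt;p&gt;Here is a small sketch of the trigger trick using Python's built-in &lt;code&gt;sqlite3&lt;/code&gt; module rather than Postgres. The &lt;code&gt;shifts&lt;/code&gt; table and the "no worker has more than two shifts" rule are invented for illustration. The trigger queries for any violation of the constraint and aborts the insert if it finds one.&lt;/p&gt;

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE shifts (worker TEXT, day TEXT);

-- Constraint P: no worker has more than two shifts (spans multiple records).
-- The trigger queries "some worker: !P" and aborts if that query finds anything.
CREATE TRIGGER max_two_shifts AFTER INSERT ON shifts
BEGIN
    SELECT RAISE(ABORT, 'worker has more than two shifts')
    WHERE EXISTS (
        SELECT 1 FROM shifts GROUP BY worker HAVING COUNT(*) > 2
    );
END;
""")


def add_shift(worker, day):
    """Insert a shift; return False if the trigger rejects it."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute("INSERT INTO shifts VALUES (?, ?)", (worker, day))
        return True
    except sqlite3.DatabaseError:
        return False


print([add_shift("alice", d) for d in ("mon", "tue", "wed")])
```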
    &lt;h3&gt;Duals more broadly&lt;/h3&gt;
    &lt;p&gt;"Dual" doesn't have a strict meaning in math, it's more of a vibe thing where all of the "duals" are kinda similar in meaning but don't strictly follow all of the same rules. &lt;em&gt;Usually&lt;/em&gt; things X and Y are duals if there is some transform &lt;code&gt;F&lt;/code&gt; where &lt;code&gt;X = F(Y)&lt;/code&gt; and &lt;code&gt;Y = F(X)&lt;/code&gt;, but not always. Maybe the category theorists have a formal definition that covers all of the different uses. Usually duals switch properties of things, too: an example showing &lt;code&gt;some x: P(x)&lt;/code&gt; becomes a &lt;em&gt;counterexample&lt;/em&gt; of &lt;code&gt;all x: !P(x)&lt;/code&gt;.&lt;/p&gt;
    &lt;p&gt;Under this definition, I think the dual of a list &lt;code&gt;l&lt;/code&gt; could be &lt;code&gt;reverse(l)&lt;/code&gt;. The first element of &lt;code&gt;l&lt;/code&gt; becomes the last element of &lt;code&gt;reverse(l)&lt;/code&gt;, the last becomes the first, etc. A more interesting case is the dual of a &lt;code&gt;K -&amp;gt; set(V)&lt;/code&gt; map is the &lt;code&gt;V -&amp;gt; set(K)&lt;/code&gt; map. IE the dual of &lt;code&gt;lived_in_city = {alice: {paris}, bob: {detroit}, charlie: {detroit, paris}}&lt;/code&gt; is &lt;code&gt;city_lived_in_by = {paris: {alice, charlie}, detroit: {bob, charlie}}&lt;/code&gt;. This preserves the property that &lt;code&gt;x in map[y] &amp;lt;=&amp;gt; y in dual[x]&lt;/code&gt;.&lt;/p&gt;
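    &lt;p&gt;A quick Python sketch of that map dual, assuming dictionaries of sets. The function name &lt;code&gt;dual&lt;/code&gt; is just for illustration.&lt;/p&gt;

```python
from collections import defaultdict


def dual(mapping):
    """Turn a K -> set(V) map into the V -> set(K) map."""
    out = defaultdict(set)
    for key, values in mapping.items():
        for value in values:
            out[value].add(key)
    return dict(out)


lived_in_city = {
    "alice": {"paris"},
    "bob": {"detroit"},
    "charlie": {"detroit", "paris"},
}
city_lived_in_by = dual(lived_in_city)
print(city_lived_in_by)
```

    &lt;p&gt;As long as no key maps to an empty set, applying &lt;code&gt;dual&lt;/code&gt; twice gets you back the original map.&lt;/p&gt;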
    &lt;p class="empty-line" style="height:16px; margin:0px !important;"&gt;&lt;/p&gt;
    &lt;div class="footnote"&gt;
    &lt;hr/&gt;
    &lt;ol&gt;
    &lt;li id="fn:retread"&gt;
    &lt;p&gt;And after writing this I just realized this is a partial retread of a newsletter I wrote &lt;a href="https://buttondown.com/hillelwayne/archive/logical-quantifiers-in-software/" target="_blank"&gt;a couple months ago&lt;/a&gt;. But only a &lt;em&gt;partial&lt;/em&gt; retread! &lt;a class="footnote-backref" href="#fnref:retread" title="Jump back to footnote 1 in the text"&gt;↩&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id="fn:fm"&gt;
    &lt;p&gt;Specifically "linear temporal logics" are modal logics, so &lt;code&gt;eventually P&lt;/code&gt; ("P is true in at least one state of each behavior") is the same as saying &lt;code&gt;!always !P&lt;/code&gt; ("not P isn't true in all states of all behaviors"). This is the basis of &lt;a href="https://www.hillelwayne.com/post/safety-and-liveness/" target="_blank"&gt;liveness checking&lt;/a&gt;. &lt;a class="footnote-backref" href="#fnref:fm" title="Jump back to footnote 2 in the text"&gt;↩&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id="fn:antithesis"&gt;
    &lt;p&gt;I don't know for sure, but my best guess is that Antithesis does something similar &lt;a href="https://antithesis.com/blog/tag/games/" target="_blank"&gt;when their fuzzer beats videogames&lt;/a&gt;. They're doing fuzzing, not model checking, but they have the same purpose: checking that complex state spaces don't have bugs. Making the bug "we can't reach the end screen" can make a fuzzer output a complete end-to-end run of the game. Obvs a lot more complicated than that but that's the general idea at least. &lt;a class="footnote-backref" href="#fnref:antithesis" title="Jump back to footnote 3 in the text"&gt;↩&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id="fn:check"&gt;
    &lt;p&gt;For &lt;code&gt;CHECK&lt;/code&gt; to constrain multiple records you would need to use a subquery. Core SQL does not support subqueries in check. It is an optional database "feature outside of core SQL" (F671), which &lt;a href="https://www.postgresql.org/docs/current/unsupported-features-sql-standard.html" target="_blank"&gt;Postgres does not support&lt;/a&gt;. &lt;a class="footnote-backref" href="#fnref:check" title="Jump back to footnote 4 in the text"&gt;↩&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;/ol&gt;
    &lt;/div&gt;
    </description><pubDate>Wed, 27 Aug 2025 19:25:32 +0000</pubDate><guid>https://buttondown.com/hillelwayne/archive/logical-duals-in-software-engineering/</guid></item><item><title>Sapir-Whorf does not apply to Programming Languages</title><link>https://buttondown.com/hillelwayne/archive/sapir-whorf-does-not-apply-to-programming/</link><description>
    &lt;p&gt;&lt;em&gt;This one is a hot mess but it's too late in the week to start over. Oh well!&lt;/em&gt;&lt;/p&gt;
    &lt;p&gt;Someone recognized me at last week's &lt;a href="https://www.chipy.org/" target="_blank"&gt;Chipy&lt;/a&gt; and asked for my opinion on the Sapir-Whorf hypothesis in programming languages. I thought this was interesting enough to make a newsletter. First what it is, then why it &lt;em&gt;looks&lt;/em&gt; like it applies, and then why it doesn't apply after all.&lt;/p&gt;
    &lt;h3&gt;The Sapir-Whorf Hypothesis&lt;/h3&gt;
    &lt;blockquote&gt;
    &lt;p&gt;We dissect nature along lines laid down by our native language. — &lt;a href="https://web.mit.edu/allanmc/www/whorf.scienceandlinguistics.pdf" target="_blank"&gt;Whorf&lt;/a&gt;&lt;/p&gt;
    &lt;/blockquote&gt;
    &lt;p&gt;To quote from a &lt;a href="https://www.amazon.com/Linguistics-Complete-Introduction-Teach-Yourself/dp/1444180320" target="_blank"&gt;Linguistics book I've read&lt;/a&gt;, the hypothesis is that "an individual's fundamental perception of reality is moulded by the language they speak." As a massive oversimplification, if English did not have a word for "rebellion", we would not be able to conceive of rebellion. This view, now called &lt;a href="https://en.wikipedia.org/wiki/Linguistic_determinism" target="_blank"&gt;Linguistic Determinism&lt;/a&gt;, is mostly rejected by modern linguists.&lt;/p&gt;
    &lt;p&gt;The "weak" form of SWH is that the language we speak influences, but does not &lt;em&gt;decide&lt;/em&gt;, our cognition. &lt;a href="https://langcog.stanford.edu/papers/winawer2007.pdf" target="_blank"&gt;For example&lt;/a&gt;, Russian has distinct words for "light blue" and "dark blue", so Russian speakers can discriminate between "light blue" and "dark blue" shades faster than they can discriminate two "light blue" shades. English does not have distinct words, so we discriminate those at the same speed. This &lt;strong&gt;linguistic relativism&lt;/strong&gt; seems to have lots of empirical support in studies, but mostly with "small indicators". I don't think there's anything that convincingly shows linguistic relativism having effects on a societal level.&lt;sup id="fnref:economic-behavior"&gt;&lt;a class="footnote-ref" href="#fn:economic-behavior"&gt;1&lt;/a&gt;&lt;/sup&gt;&lt;/p&gt;
    &lt;p&gt;The weak form of SWH for software would then be "the programming languages you know affect how you think about programs."&lt;/p&gt;
    &lt;h3&gt;SWH in software&lt;/h3&gt;
    &lt;p&gt;This seems like a natural fit, as different paradigms solve problems in different ways. Consider the &lt;a href="https://hadid.dev/posts/living-coding/" target="_blank"&gt;hardest interview question ever&lt;/a&gt;, "given a list of integers, sum the even numbers". Here it is in four paradigms:&lt;/p&gt;
    &lt;ul&gt;
    &lt;li&gt;Procedural: &lt;code&gt;total = 0; foreach x in list {if IsEven(x) total += x}&lt;/code&gt;. You iterate over data with an algorithm.&lt;/li&gt;
    &lt;li&gt;Functional: &lt;code&gt;reduce(+, filter(IsEven, list), 0)&lt;/code&gt;. You apply transformations to data to get a result.&lt;/li&gt;
    &lt;li&gt;Array: &lt;code&gt;+ fold L * iseven L&lt;/code&gt;.&lt;sup id="fnref:J"&gt;&lt;a class="footnote-ref" href="#fn:J"&gt;2&lt;/a&gt;&lt;/sup&gt; In English: replace every element in L with 0 if odd and 1 if even, multiply the new array elementwise against &lt;code&gt;L&lt;/code&gt;, and then sum the resulting array. It's like functional except everything is in terms of whole-array transformations.&lt;/li&gt;
    &lt;li&gt;Logical: Somethingish like &lt;code&gt;sumeven(0, []). sumeven(X, [Y|L]) :- iseven(Y) -&amp;gt; sumeven(Z, L), X is Y + Z ; sumeven(X, L)&lt;/code&gt;. You write a set of equations that express what it means for X to &lt;em&gt;be&lt;/em&gt; the sum of evens of L.&lt;/li&gt;
    &lt;/ul&gt;
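    &lt;p&gt;As a rough illustration, here is how the four styles above might look when squeezed into a single language. This is a sketch in Python, not a translation into real procedural, functional, array, or logic languages. The function names are made up for this example, and the "logical" version only mirrors the recursive clause structure of the Prolog-ish snippet rather than doing actual logic programming.&lt;/p&gt;

```python
from functools import reduce
from operator import add


def is_even(x):
    return x % 2 == 0


def sum_even_procedural(xs):
    # Iterate over the data and update an accumulator.
    total = 0
    for x in xs:
        if is_even(x):
            total += x
    return total


def sum_even_functional(xs):
    # Filter, then fold the survivors with +, starting from 0.
    return reduce(add, filter(is_even, xs), 0)


def sum_even_array(xs):
    # Build a 0/1 mask, multiply it elementwise against xs, sum the result.
    mask = [1 if is_even(x) else 0 for x in xs]
    return sum(m * x for m, x in zip(mask, xs))


def sum_even_recursive(xs):
    # Mirrors the two clauses: empty list, then head/tail with a branch.
    if not xs:
        return 0
    head, *tail = xs
    rest = sum_even_recursive(tail)
    return head + rest if is_even(head) else rest
```

    &lt;p&gt;All four return the same number for the same list. What changes is which shape you reach for first.&lt;/p&gt;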
    &lt;p&gt;There are some similarities between how these paradigms approach the problem, but each is also unique. It's plausible that where a procedural programmer "sees" a for loop, a functional programmer "sees" a map and an array programmer "sees" a singular operator.&lt;/p&gt;
    &lt;p&gt;I also have a personal experience with how a language changed the way I think. I use &lt;a href="https://learntla.com/" target="_blank"&gt;TLA+&lt;/a&gt; to detect concurrency bugs in software designs. After doing this for several years, I've gotten much better at intuitively seeing race conditions in things even &lt;em&gt;without&lt;/em&gt; writing a TLA+ spec. It's even leaked out into my day-to-day life. I see concurrency bugs everywhere. Phone tag is a race condition.&lt;/p&gt;
    &lt;p&gt;But I still don't think SWH is the right mental model to use, for one big reason: language is &lt;em&gt;special&lt;/em&gt;. We think in language, we dream in language, there are huge parts of our brain dedicated to processing language. &lt;a href="https://web.eecs.umich.edu/~weimerw/p/weimer-icse2017-preprint.pdf" target="_blank"&gt;We don't use those parts of our brain to read code&lt;/a&gt;. &lt;/p&gt;
    &lt;p&gt;SWH is so intriguing because it seems so unnatural, that the way we express thoughts changes the way we &lt;em&gt;think&lt;/em&gt; thoughts. That I would be a different person if I was bilingual in Spanish, not because the life experiences it would open up but because &lt;a href="https://en.wikipedia.org/wiki/Grammatical_gender" target="_blank"&gt;grammatical gender&lt;/a&gt; would change my brain.&lt;/p&gt;
    &lt;p&gt;Compared to that, the idea that programming languages affect our brain is more natural and has a simpler explanation:&lt;/p&gt;
    &lt;p&gt;It's the goddamned &lt;a href="https://en.wikipedia.org/wiki/Tetris_effect" target="_blank"&gt;Tetris Effect&lt;/a&gt;.&lt;/p&gt;
    &lt;h3&gt;The Goddamned Tetris Effect&lt;/h3&gt;
    &lt;div class="subscribe-form"&gt;&lt;/div&gt;
    &lt;blockquote&gt;
    &lt;p&gt;The Tetris effect occurs when someone dedicates vast amounts of time, effort and concentration on an activity which thereby alters their thoughts, dreams, and other experiences not directly linked to said activity. — Wikipedia&lt;/p&gt;
    &lt;/blockquote&gt;
    &lt;p&gt;Every skill does this. I'm a juggler, so every item I can see right now has a tiny metadata field of "how would this tumble if I threw it up". I teach professionally, so I'm always noticing good teaching examples everywhere. I spent years writing specs in TLA+ and watching the model checker throw concurrency errors in my face, so now race conditions have visceral presence. Every skill does this. &lt;/p&gt;
    &lt;p&gt;And to really develop a skill, you gotta practice. This is where I think programming paradigms do something especially interesting that makes them feel more Sapir-Whorfy than, like, juggling. Some languages mix lots of different paradigms, like Javascript or Rust. Others like Haskell really focus on &lt;em&gt;excluding&lt;/em&gt; paradigms. If something is easy for you in procedural and hard in FP, in JS you could just lean on the procedural bits. In Haskell, &lt;em&gt;too bad&lt;/em&gt;, you're learning how to do it the functional way.&lt;sup id="fnref:escape-hatch"&gt;&lt;a class="footnote-ref" href="#fn:escape-hatch"&gt;3&lt;/a&gt;&lt;/sup&gt;&lt;/p&gt;
    &lt;p&gt;And that forces you to practice, which makes you see functional patterns everywhere. Tetris effect!&lt;/p&gt;
    &lt;p&gt;Anyway this may all seem like quibbling. Why does it matter whether we call it "Tetris effect" or "Sapir-Whorf", if our brains get rewired either way? For me, personally, it's because SWH sounds really special and &lt;em&gt;unique&lt;/em&gt;, while Tetris effect sounds mundane and commonplace. Which it &lt;em&gt;is&lt;/em&gt;. But also because TE suggests it's not just programming languages that affect how we think about software, it's &lt;em&gt;everything&lt;/em&gt;. Spending lots of time debugging, profiling, writing exploits, whatever will change what you notice, what you think a program "is". And that's a way useful idea that shouldn't be restricted to just PLs.&lt;/p&gt;
    &lt;p&gt;(Then again, the Tetris Effect might also be a bad analogy to what's going on here, because I think part of it is that it wears off after a while. Maybe it's just "building a mental model is good".)&lt;/p&gt;
    &lt;h3&gt;I just realized all of this might have missed the point&lt;/h3&gt;
    &lt;p&gt;Wait are people actually using SWH to mean the &lt;em&gt;weak form&lt;/em&gt; or the &lt;em&gt;strong&lt;/em&gt; form? Like that if a language doesn't make something possible, its users can't conceive of it being possible. I've been arguing against the weaker form in software but I think I've seen strong form often too. Dammit.&lt;/p&gt;
    &lt;p&gt;Well, it's already Thursday and far too late to rewrite the whole newsletter, so I'll just outline the problem with the strong form: we describe the capabilities of our programming languages &lt;em&gt;with human language&lt;/em&gt;. In college I wrote a lot of crappy physics lab C++ and one of my projects was filled with comments like "man I hate copying this triply-nested loop in 10 places with one-line changes, I wish I could put it in one function and just take the changing line as a parameter". Even if I hadn't &lt;em&gt;encountered&lt;/em&gt; higher-order functions, I was still perfectly capable of expressing the idea. So if the strong SWH isn't true for human language, it's not true for programming languages either.&lt;/p&gt;
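    &lt;p&gt;For illustration, the wish in that comment amounts to something like the following. This is a Python sketch rather than the original C++, and &lt;code&gt;for_each_index&lt;/code&gt; and its parameters are hypothetical names chosen for this example.&lt;/p&gt;

```python
def for_each_index(nx, ny, nz, body):
    """Run body(i, j, k) for every index triple in an nx by ny by nz grid.

    The triply-nested loop lives in one place; the line that used to
    change between copies is passed in as the `body` function.
    """
    for i in range(nx):
        for j in range(ny):
            for k in range(nz):
                body(i, j, k)


# Two "copies" of the loop that differ only in the one changing line.
visited = []
for_each_index(2, 2, 2, lambda i, j, k: visited.append((i, j, k)))

index_sums = []
for_each_index(2, 2, 2, lambda i, j, k: index_sums.append(i + j + k))
```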
    &lt;p class="empty-line" style="height:16px; margin:0px !important;"&gt;&lt;/p&gt;
    &lt;hr/&gt;
    &lt;h1&gt;Systems Distributed talk now up!&lt;/h1&gt;
    &lt;p&gt;&lt;a href="https://www.youtube.com/watch?v=d9cM8f_qSLQ" target="_blank"&gt;Link here&lt;/a&gt;! Original abstract:&lt;/p&gt;
    &lt;blockquote&gt;
    &lt;p&gt;Building correct distributed systems takes thinking outside the box, and the fastest way to do that is to think inside a different box. One different box is "formal methods", the discipline of mathematically verifying software and systems. Formal methods encourages unusual perspectives on systems, models that are also broadly useful to all software developers. In this talk we will learn two of the most important FM perspectives: the abstract specifications behind software systems, and the property they are and aren't supposed to have.&lt;/p&gt;
    &lt;/blockquote&gt;
    &lt;p&gt;The talk ended up evolving away from that abstract but I like how it turned out!&lt;/p&gt;
    &lt;div class="footnote"&gt;
    &lt;hr/&gt;
    &lt;ol&gt;
    &lt;li id="fn:economic-behavior"&gt;
    &lt;p&gt;There is &lt;a href="https://www.anderson.ucla.edu/faculty/keith.chen/papers/LanguageWorkingPaper.pdf" target="_blank"&gt;one paper&lt;/a&gt; arguing that people who speak a language that doesn't have a "future tense" are more likely to save and eat healthy, but it is... &lt;a href="https://www.reddit.com/r/linguistics/comments/rcne7m/comment/hnz2705/" target="_blank"&gt;extremely questionable&lt;/a&gt;. &lt;a class="footnote-backref" href="#fnref:economic-behavior" title="Jump back to footnote 1 in the text"&gt;↩&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id="fn:J"&gt;
    &lt;p&gt;The original J is &lt;code&gt;+/ (* (0 =  2&amp;amp;|))&lt;/code&gt;. Obligatory &lt;a href="https://www.jsoftware.com/papers/tot.htm" target="_blank"&gt;Notation as a Tool of Thought&lt;/a&gt; reference &lt;a class="footnote-backref" href="#fnref:J" title="Jump back to footnote 2 in the text"&gt;↩&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id="fn:escape-hatch"&gt;
    &lt;p&gt;Though if it's &lt;em&gt;too&lt;/em&gt; hard for you, that's why languages have &lt;a href="https://buttondown.com/hillelwayne/archive/programming-language-escape-hatches/" target="_blank"&gt;escape hatches&lt;/a&gt; &lt;a class="footnote-backref" href="#fnref:escape-hatch" title="Jump back to footnote 3 in the text"&gt;↩&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;/ol&gt;
    &lt;/div&gt;
    </description><pubDate>Thu, 21 Aug 2025 13:00:00 +0000</pubDate><guid>https://buttondown.com/hillelwayne/archive/sapir-whorf-does-not-apply-to-programming/</guid></item><item><title>Software books I wish I could read</title><link>https://buttondown.com/hillelwayne/archive/software-books-i-wish-i-could-read/</link><description>
    &lt;h3&gt;New Logic for Programmers Release!&lt;/h3&gt;
    &lt;p&gt;&lt;a href="https://leanpub.com/logic/" target="_blank"&gt;v0.11 is now available&lt;/a&gt;! This is over 20%  longer than v0.10, with a new chapter on code proofs, three chapter overhauls, and more! &lt;a href="https://github.com/logicforprogrammers/book-assets/blob/master/CHANGELOG.md" target="_blank"&gt;Full release notes here&lt;/a&gt;.&lt;/p&gt;
    &lt;p&gt;&lt;img alt="Cover of the boooooook" class="newsletter-image" src="https://assets.buttondown.email/images/92b4a35d-2bdd-416a-92c7-15ff42b49d8d.jpg?w=960&amp;amp;fit=max"/&gt;&lt;/p&gt;
    &lt;h1&gt;Software books I wish I could read&lt;/h1&gt;
    &lt;p&gt;I'm writing &lt;em&gt;Logic for Programmers&lt;/em&gt; because it's a book I wanted to have ten years ago. I had to learn everything in it the hard way, which is why I'm ensuring that everybody else can learn it the easy way.&lt;/p&gt;
    &lt;p&gt;Books occupy a sort of weird niche in software. We're great at sharing information via blogs and git repos and entire websites. These have many benefits over books: they're free, they're easily accessible, they can be updated quickly, they can even be interactive. But no blog post has influenced me as profoundly as &lt;a href="https://buttondown.com/hillelwayne/archive/why-you-should-read-data-and-reality/" target="_blank"&gt;Data and Reality&lt;/a&gt; or &lt;a href="https://www.oreilly.com/library/view/making-software/9780596808310/" target="_blank"&gt;Making Software&lt;/a&gt;. There is no blog or talk about debugging as good as the 
    &lt;a href="https://debuggingrules.com/" target="_blank"&gt;Debugging&lt;/a&gt; book.&lt;/p&gt;
    &lt;p&gt;It might not be anything deeper than "people spend more time per word on writing books than blog posts". I dunno.&lt;/p&gt;
    &lt;p&gt;So here are some other books I wish I could read. I don't &lt;em&gt;think&lt;/em&gt; any of them exist yet but it's a big world out there. Also while they're probably best as books, a website or a series of blog posts would be ok too.&lt;/p&gt;
    &lt;h4&gt;Everything about Configurations&lt;/h4&gt;
    &lt;p&gt;The whole topic of how we configure software, whether by CLI flags, environmental vars, or JSON/YAML/XML/Dhall files. What causes the &lt;a href="https://mikehadlow.blogspot.com/2012/05/configuration-complexity-clock.html" target="_blank"&gt;configuration complexity clock&lt;/a&gt;? How do we distinguish between basic, advanced, and developer-only configuration options? When should we disallow configuration? How do we test all possible configurations for correctness? Why do so many widespread outages trace back to misconfiguration, and how do we prevent them? &lt;/p&gt;
    &lt;p&gt;I also want the same for plugin systems. Manifests, permissions, common APIs and architectures, etc. Configuration management is more universal, though, since everybody either uses software with configuration or has made software with configuration.&lt;/p&gt;
    &lt;h4&gt;The Big Book of Complicated Data Schemas&lt;/h4&gt;
    &lt;p&gt;I guess this would kind of be like &lt;a href="https://schema.org/docs/full.html" target="_blank"&gt;Schema.org&lt;/a&gt;, except with a lot more on the "why" and not the what. Why is it important for the &lt;a href="https://schema.org/Volcano" target="_blank"&gt;Volcano model&lt;/a&gt; to have a "smokingAllowed" field?&lt;sup id="fnref:volcano"&gt;&lt;a class="footnote-ref" href="#fn:volcano"&gt;1&lt;/a&gt;&lt;/sup&gt;&lt;/p&gt;
    &lt;p&gt;I'd see this less as "here's your guide to putting Volcanos in your database" and more "here's recurring motifs in modeling interesting domains", to help a person see sources of complexity in their &lt;em&gt;own&lt;/em&gt; domain. Does something crop up if the references can form a cycle? If a relationship needs to be strictly temporary, or a reference can change type? Bonus: path dependence in data models, where an additional requirement leads to a vastly different ideal data model that a company couldn't do because they made the old model.&lt;/p&gt;
    &lt;p&gt;(This has got to exist, right? Business modeling is a big enough domain that this must exist. Maybe &lt;a href="https://essenceofsoftware.com/" target="_blank"&gt;The Essence of Software&lt;/a&gt; touches on this? Man I feel bad I haven't read that yet.)&lt;/p&gt;
    &lt;h4&gt;Computer Science for Software Engineers&lt;/h4&gt;
    &lt;p&gt;Yes, I checked, this book does not exist (though maybe &lt;a href="https://www.amazon.com/A-Programmers-Guide-to-Computer-Science-2-book-series/dp/B08433QR53" target="_blank"&gt;this&lt;/a&gt; is the same thing). I don't have any formal software education; everything I know was either self-taught or learned on the job. But it's way easier to learn software engineering that way than computer science. And I bet there's a lot of other engineers in the same boat. &lt;/p&gt;
    &lt;p&gt;This book wouldn't have to be comprehensive or instructive: just enough about each topic to understand why it's an area of study and appreciate how research in it eventually finds its way into practice. &lt;/p&gt;
    &lt;h4&gt;MISU Patterns&lt;/h4&gt;
    &lt;p&gt;MISU, or "Make Illegal States Unrepresentable", is the idea of designing system invariants in the structure of your data. For example, if a &lt;code&gt;Contact&lt;/code&gt; needs at least one of &lt;code&gt;email&lt;/code&gt; or &lt;code&gt;phone&lt;/code&gt; to be non-null, make it a sum type over &lt;code&gt;EmailContact, PhoneContact, EmailPhoneContact&lt;/code&gt; (from &lt;a href="https://fsharpforfunandprofit.com/posts/designing-with-types-making-illegal-states-unrepresentable/" target="_blank"&gt;this post&lt;/a&gt;). MISU is great.&lt;/p&gt;
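    &lt;p&gt;A minimal sketch of that &lt;code&gt;Contact&lt;/code&gt; example, written in Python for illustration (the linked post uses F#). The class names come from the example above; &lt;code&gt;describe&lt;/code&gt; is a hypothetical helper. One caveat: Python only gets the "unrepresentable" part with a static type checker on top, since nothing here stops someone from passing &lt;code&gt;None&lt;/code&gt; at runtime.&lt;/p&gt;

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class EmailContact:
    email: str


@dataclass(frozen=True)
class PhoneContact:
    phone: str


@dataclass(frozen=True)
class EmailPhoneContact:
    email: str
    phone: str


# A Contact is exactly one of the three shapes. There is no shape
# with neither an email nor a phone, so that state cannot be built.
Contact = Union[EmailContact, PhoneContact, EmailPhoneContact]


def describe(contact):
    # Hypothetical helper: handle each case of the sum type explicitly.
    if isinstance(contact, EmailPhoneContact):
        return f"email {contact.email} or call {contact.phone}"
    if isinstance(contact, EmailContact):
        return f"email {contact.email}"
    if isinstance(contact, PhoneContact):
        return f"call {contact.phone}"
    raise TypeError("not a Contact")
```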
    &lt;p&gt;Most MISU in the wild look very different than that, though, because the concept of MISU is so broad there's lots of different ways to achieve it. And that means there are "patterns": smart constructors, product types, properly using sets, &lt;a href="https://lexi-lambda.github.io/blog/2020/11/01/names-are-not-type-safety/" target="_blank"&gt;newtypes to some degree&lt;/a&gt;, etc. Some of them are specific to typed FP, while others can be used in even untyped languages. Someone oughta make a pattern book.&lt;/p&gt;
    &lt;p&gt;My one request would be to not give them cutesy names. Do something like the &lt;a href="https://ia600301.us.archive.org/18/items/Thompson2016MotifIndex/Thompson_2016_Motif-Index.pdf" target="_blank"&gt;Aarne–Thompson–Uther Index&lt;/a&gt;, where items are given names like "Recognition by manner of throwing cakes of different weights into faces of old uncles". Names can come later.&lt;/p&gt;
    &lt;h4&gt;The Tools of '25&lt;/h4&gt;
    &lt;p&gt;Not something I'd read, but something to recommend to junior engineers. Starting out it's easy to think the only bit that matters is the language or framework and not realize the enormous amount of surrounding tooling you'll have to learn. This book would cover the basics of tools that &lt;em&gt;enough&lt;/em&gt; developers will probably use at some point: git, VSCode, &lt;em&gt;very&lt;/em&gt; basic Unix and bash, curl. Maybe the general concepts of tools that appear in every ecosystem, like package managers, build tools, task runners. That might be easier if we specialize this to one particular domain, like webdev or data science.&lt;/p&gt;
    &lt;p&gt;Ideally the book would only have to be updated every five years or so. No LLM stuff because I don't expect the tooling will be stable through 2026, to say nothing of 2030.&lt;/p&gt;
    &lt;h4&gt;A History of Obsolete Optimizations&lt;/h4&gt;
    &lt;p&gt;Probably better as a really long blog series. Each chapter would be broken up into two parts:&lt;/p&gt;
    &lt;ol&gt;
    &lt;li&gt;A deep dive into a brilliant, elegant, insightful historical optimization designed to work within the constraints of that era's computing technology&lt;/li&gt;
    &lt;li&gt;What we started doing instead, once we had more compute/network/storage available.&lt;/li&gt;
    &lt;/ol&gt;
    &lt;p&gt;cf. &lt;a href="https://prog21.dadgum.com/29.html" target="_blank"&gt;A Spellchecker Used to Be a Major Feat of Software Engineering&lt;/a&gt;. Bonus topics would be brilliance obsoleted by standardization (like what people did before git and json were universal), optimizations we do today that may not stand the test of time, and optimizations from the past that &lt;em&gt;did&lt;/em&gt;.&lt;/p&gt;
    &lt;h4&gt;Sphinx Internals&lt;/h4&gt;
    &lt;p&gt;&lt;em&gt;I need this&lt;/em&gt;. I've spent so much goddamn time digging around in Sphinx and docutils source code I'm gonna throw up.&lt;/p&gt;
    &lt;hr/&gt;
    &lt;h3&gt;Systems Distributed Talk Today!&lt;/h3&gt;
    &lt;p&gt;Online premiere's at noon central / 5 PM UTC, &lt;a href="https://www.youtube.com/watch?v=d9cM8f_qSLQ" target="_blank"&gt;here&lt;/a&gt;! I'll be hanging out to answer questions and be awkward. You ever watch a recording of your own talk? It's real uncomfortable!&lt;/p&gt;
    &lt;div class="footnote"&gt;
    &lt;hr/&gt;
    &lt;ol&gt;
    &lt;li id="fn:volcano"&gt;
    &lt;p&gt;In &lt;em&gt;this&lt;/em&gt; case because it's a field on one of &lt;code&gt;Volcano&lt;/code&gt;'s supertypes. I guess schemas gotta follow LSP too &lt;a class="footnote-backref" href="#fnref:volcano" title="Jump back to footnote 1 in the text"&gt;↩&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;/ol&gt;
    &lt;/div&gt;
    </description><pubDate>Wed, 06 Aug 2025 13:00:00 +0000</pubDate><guid>https://buttondown.com/hillelwayne/archive/software-books-i-wish-i-could-read/</guid></item><item><title>2000 words about arrays and tables</title><link>https://buttondown.com/hillelwayne/archive/2000-words-about-arrays-and-tables/</link><description>
    &lt;p&gt;I'm way too discombobulated from getting next month's release of &lt;a href="https://leanpub.com/logic/" target="_blank"&gt;Logic for Programmers&lt;/a&gt; ready, so I'm pulling an idea from the slush pile. Basically I wanted to come up with a mental model of arrays as a concept that explained APL-style multidimensional arrays and tables but also why there weren't multitables.&lt;/p&gt;
    &lt;p&gt;So, arrays. In all languages they are basically the same: they map a sequence of numbers (I'll use &lt;code&gt;1..N&lt;/code&gt;)&lt;sup id="fnref:1-indexing"&gt;&lt;a class="footnote-ref" href="#fn:1-indexing"&gt;1&lt;/a&gt;&lt;/sup&gt; to homogeneous values (values of a single type). This is in contrast to the other two foundational types, associative arrays (which map an arbitrary type to homogeneous values) and structs (which map a fixed set of keys to &lt;em&gt;heterogeneous&lt;/em&gt; values). Arrays appear in PLs earlier than the other two, possibly because they have the simplest implementation and the most obvious application to scientific computing. The OG FORTRAN had arrays. &lt;/p&gt;
    &lt;p&gt;I'm interested in two structural extensions to arrays. The first, found in languages like nushell and frameworks like Pandas, is the &lt;em&gt;table&lt;/em&gt;. Tables have string keys like a struct &lt;em&gt;and&lt;/em&gt; indexes like an array. Each row is a struct, so you can get "all values in this column" or "all values for this row". They're heavily used in databases and data science.&lt;/p&gt;
    &lt;p&gt;The other extension is the &lt;strong&gt;N-dimensional array&lt;/strong&gt;, mostly seen in APLs like Dyalog and J. Think of this like arrays-of-arrays(-of-arrays), except all arrays at the same depth have the same length. So &lt;code&gt;[[1,2,3],[4]]&lt;/code&gt; is not a 2D array, but &lt;code&gt;[[1,2,3],[4,5,6]]&lt;/code&gt; is. This means that N-arrays can be queried on any axis.&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;]&lt;/span&gt;&lt;span class="nv"&gt;x&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;=:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;i&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt;
    &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;
    &lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;5&lt;/span&gt;
    &lt;span class="mi"&gt;6&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;7&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;8&lt;/span&gt;
    &lt;span class="w"&gt;   &lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;{&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;x&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="c1"&gt;NB. first row&lt;/span&gt;
    &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;
    &lt;span class="w"&gt;   &lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;{"&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;x&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="c1"&gt;NB. first column&lt;/span&gt;
    &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;6&lt;/span&gt;
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;So, I've had some ideas on a conceptual model of arrays that explains all of these variations and possibly predicts new variations. I wrote up my notes and did the bare minimum of editing and polishing. Somehow it ended up being 2000 words.&lt;/p&gt;
    &lt;h3&gt;1-dimensional arrays&lt;/h3&gt;
    &lt;p&gt;A one-dimensional array is a function over &lt;code&gt;1..N&lt;/code&gt; for some N. &lt;/p&gt;
    &lt;p&gt;To be clear this is &lt;em&gt;math&lt;/em&gt; functions, not programming functions. Programming functions take values of a type and perform computations on them. Math functions take values of a fixed set and return values of another set. So the array &lt;code&gt;[a, b, c, d]&lt;/code&gt; can be represented by the function &lt;code&gt;(1 -&amp;gt; a ++ 2 -&amp;gt; b ++ 3 -&amp;gt; c ++ 4 -&amp;gt; d)&lt;/code&gt;. Let's write the set of all four element character arrays as &lt;code&gt;1..4 -&amp;gt; char&lt;/code&gt;. &lt;code&gt;1..4&lt;/code&gt; is the function's &lt;strong&gt;domain&lt;/strong&gt;.&lt;/p&gt;
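    &lt;p&gt;One way to make this concrete is to write the array out as an explicit mapping. This is a sketch in Python, where a dict stands in for the math function and &lt;code&gt;array_as_function&lt;/code&gt; is a name invented for this example.&lt;/p&gt;

```python
def array_as_function(values):
    """Return a dict mapping each index in 1..N to the value at that position."""
    return {index: value for index, value in enumerate(values, start=1)}


chars = array_as_function(["a", "b", "c", "d"])

domain = set(chars)  # the function's domain, here the set 1..4
third = chars[3]     # "applying" the function to 3
```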
    &lt;p&gt;The set of all character arrays is the empty array + the functions with domain &lt;code&gt;1..1&lt;/code&gt; + the functions with domain &lt;code&gt;1..2&lt;/code&gt; + ... Let's call this set &lt;code&gt;Array[Char]&lt;/code&gt;. Our compilers can enforce that a type belongs to &lt;code&gt;Array[Char]&lt;/code&gt;, but some operations care about the more specific type, like matrix multiplication. This is either checked with the runtime type or, in exotic enough languages, with static dependent types.&lt;/p&gt;
    &lt;p&gt;(This is actually how TLA+ does things: the basic collection types are functions and sets, and a function with domain 1..N is a sequence.)&lt;/p&gt;
    &lt;h3&gt;2-dimensional arrays&lt;/h3&gt;
    &lt;p&gt;Now take the 3x4 matrix&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="w"&gt;   &lt;/span&gt;&lt;span class="nv"&gt;i&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;4&lt;/span&gt;
    &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt;
    &lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="mi"&gt;6&lt;/span&gt;&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="mi"&gt;7&lt;/span&gt;
    &lt;span class="mi"&gt;8&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;9&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;11&lt;/span&gt;
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;There are two equally valid ways to represent the array function:&lt;/p&gt;
    &lt;ol&gt;
    &lt;li&gt;A function that takes a row and a column and returns the value at that index, so it would look like &lt;code&gt;f(r: 1..3, c: 1..4) -&amp;gt; Int&lt;/code&gt;.&lt;/li&gt;
    &lt;li&gt;A function that takes a row and returns that row as an array, aka another function: &lt;code&gt;f(r: 1..3) -&amp;gt; g(c: 1..4) -&amp;gt; Int&lt;/code&gt;.&lt;sup id="fnref:associative"&gt;&lt;a class="footnote-ref" href="#fn:associative"&gt;2&lt;/a&gt;&lt;/sup&gt;&lt;/li&gt;
    &lt;/ol&gt;
    &lt;p&gt;Man, (2) looks a lot like &lt;a href="https://en.wikipedia.org/wiki/Currying" target="_blank"&gt;currying&lt;/a&gt;! In Haskell, functions can only have one parameter. If you write &lt;code&gt;(+) 6 10&lt;/code&gt;, &lt;code&gt;(+) 6&lt;/code&gt; first returns a &lt;em&gt;new&lt;/em&gt; function &lt;code&gt;f y = y + 6&lt;/code&gt;, and then applies &lt;code&gt;f 10&lt;/code&gt; to get 16. So &lt;code&gt;(+)&lt;/code&gt; has the type signature &lt;code&gt;Int -&amp;gt; Int -&amp;gt; Int&lt;/code&gt;: it's a function that takes an &lt;code&gt;Int&lt;/code&gt; and returns a function of type &lt;code&gt;Int -&amp;gt; Int&lt;/code&gt;.&lt;sup id="fnref:typeclass"&gt;&lt;a class="footnote-ref" href="#fn:typeclass"&gt;3&lt;/a&gt;&lt;/sup&gt;&lt;/p&gt;
    &lt;p&gt;Similarly, our 2D array can be represented as an array function that returns array functions: it has type &lt;code&gt;1..3 -&amp;gt; 1..4 -&amp;gt; Int&lt;/code&gt;, meaning it takes a row index and returns &lt;code&gt;1..4 -&amp;gt; Int&lt;/code&gt;, aka a single array.&lt;/p&gt;
    &lt;p&gt;(This differs from conventional array-of-arrays because it forces all of the subarrays to have the same domain, aka the same length. If we wanted to permit ragged arrays, we would instead have the type &lt;code&gt;1..3 -&amp;gt; Array[Int]&lt;/code&gt;.)&lt;/p&gt;
    &lt;p&gt;Why is this useful? A couple of reasons. First of all, we can apply function transformations to arrays, like "&lt;a href="https://blog.zdsmith.com/series/combinatory-programming.html" target="_blank"&gt;combinators&lt;/a&gt;". For example, we can flip any function of type &lt;code&gt;a -&amp;gt; b -&amp;gt; c&lt;/code&gt; into a function of type &lt;code&gt;b -&amp;gt; a -&amp;gt; c&lt;/code&gt;. So given a function that takes rows and returns columns, we can produce one that takes columns and returns rows. That's just a matrix transposition! &lt;/p&gt;
    &lt;p&gt;Second, we can extend this to any number of dimensions: a three-dimensional array is one with type &lt;code&gt;1..M -&amp;gt; 1..N -&amp;gt; 1..O -&amp;gt; V&lt;/code&gt;. We can still use function transformations to rearrange the array along any ordering of axes.&lt;/p&gt;
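    &lt;p&gt;Here is a small sketch of the flip-as-transpose idea, using Python closures as stand-ins for curried functions. &lt;code&gt;matrix_as_function&lt;/code&gt; and &lt;code&gt;flip&lt;/code&gt; are names made up for this example, and indexes are 1-based to match the &lt;code&gt;1..N&lt;/code&gt; convention used here.&lt;/p&gt;

```python
def matrix_as_function(rows):
    """Curried view of a 2D array: row index to (column index to value)."""
    return lambda r: (lambda c: rows[r - 1][c - 1])


def flip(f):
    """Swap the order of the two curried parameters of f."""
    return lambda b: (lambda a: f(a)(b))


# The 3x4 matrix produced by `i. 3 4`.
rows = [
    [0, 1, 2, 3],
    [4, 5, 6, 7],
    [8, 9, 10, 11],
]

m = matrix_as_function(rows)  # type 1..3 to 1..4 to Int
t = flip(m)                   # type 1..4 to 1..3 to Int, i.e. the transpose

first_column = [t(1)(r) for r in range(1, 4)]
```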
    &lt;p&gt;Speaking of dimensions:&lt;/p&gt;
    &lt;h3&gt;What are dimensions, anyway&lt;/h3&gt;
    &lt;div class="subscribe-form"&gt;&lt;/div&gt;
    &lt;p&gt;Okay, so now imagine we have a &lt;code&gt;Row&lt;/code&gt; × &lt;code&gt;Col&lt;/code&gt; grid of pixels, where each pixel is a struct of type &lt;code&gt;Pixel(R: int, G: int, B: int)&lt;/code&gt;. So the array is&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;Row -&amp;gt; Col -&amp;gt; Pixel
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;But we can also represent the &lt;em&gt;Pixel struct&lt;/em&gt; with a function: &lt;code&gt;Pixel(R: 0, G: 0, B: 255)&lt;/code&gt; is the function where &lt;code&gt;f(R) = 0&lt;/code&gt;, &lt;code&gt;f(G) = 0&lt;/code&gt;, &lt;code&gt;f(B) = 255&lt;/code&gt;, making it a function of type &lt;code&gt;{R, G, B} -&amp;gt; Int&lt;/code&gt;. So the array is actually the function&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;Row -&amp;gt; Col -&amp;gt; {R, G, B} -&amp;gt; Int
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;And then we can rearrange the parameters of the function like this:&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;{R, G, B} -&amp;gt; Row -&amp;gt; Col -&amp;gt; Int
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;Even though the set &lt;code&gt;{R, G, B}&lt;/code&gt; is not of form 1..N, this clearly has a real meaning: &lt;code&gt;f[R]&lt;/code&gt; is the function mapping each coordinate to that coordinate's red value. What about &lt;code&gt;Row -&amp;gt; {R, G, B} -&amp;gt; Col -&amp;gt; Int&lt;/code&gt;?  That's for each row, the 3 × Col array mapping each color to that row's intensities.&lt;/p&gt;
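    &lt;p&gt;As a rough Python sketch of those rearrangements, assuming a tiny 2 × 2 image stored as nested lists of dicts (so rows and columns are 0-indexed here). The pixel values and the names &lt;code&gt;image&lt;/code&gt;, &lt;code&gt;by_channel&lt;/code&gt;, and &lt;code&gt;by_row&lt;/code&gt; are invented for the example:&lt;/p&gt;

```python
CHANNELS = ("R", "G", "B")

# image[row][col][channel]: rows, then columns, then color channel.
image = [
    [{"R": 255, "G": 0, "B": 0}, {"R": 0, "G": 255, "B": 0}],
    [{"R": 0, "G": 0, "B": 255}, {"R": 10, "G": 20, "B": 30}],
]

# Color axis pulled to the front: by_channel[channel][row][col]
by_channel = {
    ch: [[pixel[ch] for pixel in row] for row in image]
    for ch in CHANNELS
}

# Color axis moved to the middle: by_row[row][channel][col]
by_row = [
    {ch: [pixel[ch] for pixel in row] for ch in CHANNELS}
    for row in image
]

print(by_channel["R"])  # the red value at every coordinate
print(by_row[0]["G"])   # green intensities along the first row
```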
    &lt;p&gt;Really &lt;em&gt;any finite set&lt;/em&gt; can be a "dimension". Recording the monitor over a span of time? &lt;code&gt;Frame -&amp;gt; Row -&amp;gt; Col -&amp;gt; Color -&amp;gt; Int&lt;/code&gt;. Recording a bunch of computers over some time? &lt;code&gt;Computer -&amp;gt; Frame -&amp;gt; Row …&lt;/code&gt;.&lt;/p&gt;
    &lt;p&gt;This is pretty common in constraint satisfaction! Like if you're a conference trying to assign talks to talk slots, your array might be type &lt;code&gt;(Day, Time, Room) -&amp;gt; Talk&lt;/code&gt;, where Day/Time/Room are enumerations.&lt;/p&gt;
    &lt;p&gt;An implementation constraint is that most programming languages &lt;em&gt;only&lt;/em&gt; allow integer indexes, so we have to replace Rooms and Colors with numerical enumerations over the set. As long as the set is finite, this is always possible, and for struct-functions, we can always choose the indexing on the lexicographic ordering of the keys. But we lose type safety.&lt;/p&gt;
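    &lt;p&gt;One way this can look in Python, as a sketch: an &lt;code&gt;IntEnum&lt;/code&gt; numbers the keys in lexicographic order so they can index a plain list. The &lt;code&gt;Color&lt;/code&gt; class and the &lt;code&gt;pixel&lt;/code&gt; list are hypothetical names for this example:&lt;/p&gt;

```python
from enum import IntEnum

class Color(IntEnum):
    # Keys numbered by lexicographic order: B, G, R
    B = 0
    G = 1
    R = 2

# Pixel(R: 0, G: 0, B: 255) stored as an integer-indexed list in B, G, R order.
pixel = [255, 0, 0]

print(pixel[Color.B])  # index by enum member
print(pixel[0])        # a bare integer is accepted just as readily
```

    &lt;p&gt;The second lookup is the lost type safety: nothing stops a row number from being used where a color was expected.&lt;/p&gt;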
    &lt;h3&gt;Why tables are different&lt;/h3&gt;
    &lt;p&gt;One more example: &lt;code&gt;Day -&amp;gt; Hour -&amp;gt; Airport(name: str, flights: int, revenue: USD)&lt;/code&gt;. Can we turn the struct into a dimension like before? &lt;/p&gt;
    &lt;p&gt;In this case, no. We were able to make &lt;code&gt;Color&lt;/code&gt; an axis because we could turn &lt;code&gt;Pixel&lt;/code&gt; into a &lt;code&gt;Color -&amp;gt; Int&lt;/code&gt; function, and we could only do that because all of the fields of the struct had the same type. This time, the fields are &lt;em&gt;different&lt;/em&gt; types. So we can't convert &lt;code&gt;{name, flights, revenue}&lt;/code&gt; into an axis. &lt;sup id="fnref:name-dimension"&gt;&lt;a class="footnote-ref" href="#fn:name-dimension"&gt;4&lt;/a&gt;&lt;/sup&gt; One thing we can do is convert it to three &lt;em&gt;separate&lt;/em&gt; functions:&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;airport: Day -&amp;gt; Hour -&amp;gt; Str
    flights: Day -&amp;gt; Hour -&amp;gt; Int
    revenue: Day -&amp;gt; Hour -&amp;gt; USD
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;But we want to keep all of the data in one place. That's where &lt;strong&gt;tables&lt;/strong&gt; come in: an array-of-structs is isomorphic to a struct-of-arrays:&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;AirportColumns(
        airport: Day -&amp;gt; Hour -&amp;gt; Str,
        flights: Day -&amp;gt; Hour -&amp;gt; Int,
        revenue: Day -&amp;gt; Hour -&amp;gt; USD,
    )
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;The table is sort of &lt;em&gt;both&lt;/em&gt; representations simultaneously. If this was a pandas dataframe, &lt;code&gt;df["airport"]&lt;/code&gt; would get the airport column, while &lt;code&gt;df.loc[day1]&lt;/code&gt; would get the first day's data. I don't think many table implementations support more than one axis dimension, but there's no reason they &lt;em&gt;couldn't&lt;/em&gt;. &lt;/p&gt;
    &lt;p&gt;These are also possible transforms:&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;Hour -&amp;gt; NamesAreHard(
        airport: Day -&amp;gt; Str,
        flights: Day -&amp;gt; Int,
        revenue: Day -&amp;gt; USD,
    )
    
    Day -&amp;gt; Whatever(
        airport: Hour -&amp;gt; Str,
        flights: Hour -&amp;gt; Int,
        revenue: Hour -&amp;gt; USD,
    )
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;In my mental model, the heterogeneous struct acts as a "block" in the array. We can't remove it, we can only push an index into the fields or pull a shared column out. But there's no way to convert a heterogeneous table into an array.&lt;/p&gt;
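    &lt;p&gt;The array-of-structs / struct-of-arrays isomorphism can be sketched in plain Python. This assumes the array is keyed by uncurried &lt;code&gt;(day, hour)&lt;/code&gt; tuples. All the data is made up, and &lt;code&gt;to_columns&lt;/code&gt; and &lt;code&gt;to_rows&lt;/code&gt; are hypothetical helper names:&lt;/p&gt;

```python
# Array-of-structs: one heterogeneous record per (day, hour) key.
records = {
    ("mon", 9): {"airport": "ORD", "flights": 12, "revenue": 3400.0},
    ("mon", 10): {"airport": "ORD", "flights": 15, "revenue": 4100.0},
    ("tue", 9): {"airport": "MDW", "flights": 7, "revenue": 1900.0},
}

def to_columns(rows):
    """Array-of-structs to struct-of-arrays: one (day, hour)-keyed map per field."""
    columns = {}
    for key, record in rows.items():
        for field, value in record.items():
            columns.setdefault(field, {})[key] = value
    return columns

def to_rows(columns):
    """Struct-of-arrays back to array-of-structs."""
    rows = {}
    for field, column in columns.items():
        for key, value in column.items():
            rows.setdefault(key, {})[field] = value
    return rows

columns = to_columns(records)
print(columns["flights"][("mon", 10)])  # column first, then index
print(to_rows(columns) == records)      # the round trip loses nothing
```

    &lt;p&gt;Each column is homogeneous, so each one is a proper array. The heterogeneity lives only in the struct that holds the columns.&lt;/p&gt;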
    &lt;h3&gt;Actually there is a terrible way&lt;/h3&gt;
    &lt;p&gt;Most languages have unions or &lt;del&gt;product&lt;/del&gt; sum types that let us say "this is a string OR integer". So we can make our airport data &lt;code&gt;Day -&amp;gt; Hour -&amp;gt; AirportKey -&amp;gt; Int | Str | USD&lt;/code&gt;. Heck, might as well just say it's &lt;code&gt;Day -&amp;gt; Hour -&amp;gt; AirportKey -&amp;gt; Any&lt;/code&gt;. But would anybody really be mad enough to use that in practice?&lt;/p&gt;
    &lt;p&gt;&lt;a href="https://code.jsoftware.com/wiki/Vocabulary/lt" target="_blank"&gt;Oh wait J does exactly that&lt;/a&gt;. J has an opaque datatype called a "box". A "table" is a function &lt;code&gt;Dim1 -&amp;gt; Dim2 -&amp;gt; Box&lt;/code&gt;. You can see some examples of what that looks like &lt;a href="https://code.jsoftware.com/wiki/DB/Flwor" target="_blank"&gt;here&lt;/a&gt;.&lt;/p&gt;
    &lt;h3&gt;Misc Thoughts and Questions&lt;/h3&gt;
    &lt;p&gt;The heterogeneity barrier seems like it explains why we don't see multiple axes of table columns, while we do see multiple axes of array dimensions. But is that actually why? Is there a system out there that &lt;em&gt;does&lt;/em&gt; have multiple columnar axes?&lt;/p&gt;
    &lt;p&gt;The array &lt;code&gt;x = [[a, b, a], [b, b, b]]&lt;/code&gt; has type &lt;code&gt;1..2 -&amp;gt; 1..3 -&amp;gt; {a, b}&lt;/code&gt;. Can we rearrange it to &lt;code&gt;1..2 -&amp;gt; {a, b} -&amp;gt; 1..3&lt;/code&gt;? No. But we &lt;em&gt;can&lt;/em&gt; rearrange it to &lt;code&gt;1..2 -&amp;gt; {a, b} -&amp;gt; PowerSet(1..3)&lt;/code&gt;, which maps rows and characters to columns &lt;em&gt;with&lt;/em&gt; that character. &lt;code&gt;[(a -&amp;gt; {1, 3} ++ b -&amp;gt; {2}), (a -&amp;gt; {} ++ b -&amp;gt; {1, 2, 3})]&lt;/code&gt;. &lt;/p&gt;
    &lt;p&gt;We can also transform &lt;code&gt;Row -&amp;gt; PowerSet(Col)&lt;/code&gt; into &lt;code&gt;Row -&amp;gt; Col -&amp;gt; Bool&lt;/code&gt;, aka a boolean matrix. This makes sense to me as both forms are means of representing directed graphs.&lt;/p&gt;
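    &lt;p&gt;Both rearrangements are short enough to sketch in Python, keeping the 1-indexed columns. The names &lt;code&gt;positions&lt;/code&gt; and &lt;code&gt;to_bool_matrix&lt;/code&gt; are invented for the example:&lt;/p&gt;

```python
x = [["a", "b", "a"],
     ["b", "b", "b"]]
symbols = ("a", "b")

# For each row, map each symbol to the set of (1-indexed) columns holding it.
positions = [
    {s: {col for col, value in enumerate(row, start=1) if value == s} for s in symbols}
    for row in x
]
print(positions)

def to_bool_matrix(column_sets, n_cols):
    """Turn one set of columns per row into a row-by-column boolean matrix."""
    return [[col in cols for col in range(1, n_cols + 1)] for cols in column_sets]

# Where does "a" appear? One set per row, then the same thing as booleans.
a_columns = [row_map["a"] for row_map in positions]
print(to_bool_matrix(a_columns, 3))
```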
    &lt;p&gt;Are other function combinators useful for thinking about arrays?&lt;/p&gt;
    &lt;p&gt;Does this model cover pivot tables? Can we extend it to relational data with multiple tables?&lt;/p&gt;
    &lt;hr/&gt;
    &lt;h3&gt;Systems Distributed Talk (will be) Online&lt;/h3&gt;
    &lt;p&gt;The premiere will be August 6 at 12 CST, &lt;a href="https://www.youtube.com/watch?v=d9cM8f_qSLQ" target="_blank"&gt;here&lt;/a&gt;! I'll be there to answer questions / mock my own performance / generally make a fool of myself.&lt;/p&gt;
    &lt;div class="footnote"&gt;
    &lt;hr/&gt;
    &lt;ol&gt;
    &lt;li id="fn:1-indexing"&gt;
    &lt;p&gt;&lt;a href="https://buttondown.com/hillelwayne/archive/why-do-arrays-start-at-0/" target="_blank"&gt;Sacrilege&lt;/a&gt;! But it turns out in this context, it's easier to use 1-indexing than 0-indexing. In the years since I wrote that article I've settled on "each indexing choice matches different kinds of mathematical work", so mathematicians and computer scientists are best served by being able to choose their index. But software engineers need consistency, and 0-indexing is overall a net better consistency pick. &lt;a class="footnote-backref" href="#fnref:1-indexing" title="Jump back to footnote 1 in the text"&gt;↩&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id="fn:associative"&gt;
    &lt;p&gt;This is &lt;em&gt;right-associative&lt;/em&gt;: &lt;code&gt;a -&amp;gt; b -&amp;gt; c&lt;/code&gt; means &lt;code&gt;a -&amp;gt; (b -&amp;gt; c)&lt;/code&gt;, not &lt;code&gt;(a -&amp;gt; b) -&amp;gt; c&lt;/code&gt;. &lt;code&gt;(1..3 -&amp;gt; 1..4) -&amp;gt; Int&lt;/code&gt; would be the associative array that maps length-3 arrays to integers. &lt;a class="footnote-backref" href="#fnref:associative" title="Jump back to footnote 2 in the text"&gt;↩&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id="fn:typeclass"&gt;
    &lt;p&gt;Technically it has type &lt;code&gt;Num a =&amp;gt; a -&amp;gt; a -&amp;gt; a&lt;/code&gt;, since &lt;code&gt;(+)&lt;/code&gt; works on floats too. &lt;a class="footnote-backref" href="#fnref:typeclass" title="Jump back to footnote 3 in the text"&gt;↩&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id="fn:name-dimension"&gt;
    &lt;p&gt;Notice that if each &lt;code&gt;Airport&lt;/code&gt; had a unique name, we &lt;em&gt;could&lt;/em&gt; pull it out into &lt;code&gt;AirportName -&amp;gt; Airport(flights, revenue)&lt;/code&gt;, but we still are stuck with two different values. &lt;a class="footnote-backref" href="#fnref:name-dimension" title="Jump back to footnote 4 in the text"&gt;↩&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;/ol&gt;
    &lt;/div&gt;
    </description><pubDate>Wed, 30 Jul 2025 13:00:00 +0000</pubDate><guid>https://buttondown.com/hillelwayne/archive/2000-words-about-arrays-and-tables/</guid></item><item><title>Programming Language Escape Hatches</title><link>https://buttondown.com/hillelwayne/archive/programming-language-escape-hatches/</link><description>
    &lt;p&gt;The excellent-but-defunct blog &lt;a href="https://prog21.dadgum.com/38.html" target="_blank"&gt;Programming in the 21st Century&lt;/a&gt; defines "puzzle languages" as languages where part of the appeal is in figuring out how to express a program idiomatically, like a puzzle. As examples, he lists Haskell, Erlang, and J. All puzzle languages, the author says, have an "escape" out of the puzzle model that is pragmatic but stigmatized.&lt;/p&gt;
    &lt;p&gt;But many mainstream languages have escape hatches, too.&lt;/p&gt;
    &lt;p&gt;Languages have a lot of properties. One of these properties is the language's &lt;a href="https://buttondown.com/hillelwayne/archive/the-capability-tractability-tradeoff/" target="_blank"&gt;capabilities&lt;/a&gt;, roughly the set of things you can do in the language. Capability is desirable but comes into conflict with a lot of other desirable properties, like simplicity or efficiency. In particular, reducing the capability of a language means that all remaining programs share more in common, meaning there's more assumptions the compiler and programmer can make ("tractability"). Assumptions are generally used to reason about correctness, but can also be about things like optimization: J's assumption that everything is an array leads to &lt;a href="https://code.jsoftware.com/wiki/Vocabulary/SpecialCombinations" target="_blank"&gt;high-performance "special combinations"&lt;/a&gt;. &lt;/p&gt;
    &lt;p&gt;Rust is the most famous example of a &lt;strong&gt;mainstream&lt;/strong&gt; language that trades capability for tractability.&lt;sup id="fnref:gc"&gt;&lt;a class="footnote-ref" href="#fn:gc"&gt;1&lt;/a&gt;&lt;/sup&gt; Rust has a lot of rules designed to prevent common memory errors, like keeping a reference to deallocated memory or modifying memory while something else is reading it. As a consequence, there's a lot of things that cannot be done in (safe) Rust, like interface with an external C function (as it doesn't have these guarantees).&lt;/p&gt;
    &lt;p&gt;To do this, you need to use &lt;a href="https://doc.rust-lang.org/book/ch20-01-unsafe-rust.html" target="_blank"&gt;unsafe Rust&lt;/a&gt;, which lets you do additional things forbidden by safe Rust, such as dereference a raw pointer. Everybody tells you not to use &lt;code&gt;unsafe&lt;/code&gt; unless you absolutely 100% know what you're doing, and possibly not even then.&lt;/p&gt;
    &lt;p&gt;Sounds like an escape hatch to me!&lt;/p&gt;
    &lt;p&gt;To extrapolate, an &lt;strong&gt;escape hatch&lt;/strong&gt; is a feature (either in the language itself or a particular implementation) that deliberately breaks core assumptions about the language in order to add capabilities. This explains both Rust and most of the so-called "puzzle languages": they need escape hatches because they have very strong conceptual models of the language which leads to lots of assumptions about programs. But plenty of "kitchen sink" mainstream languages have escape hatches, too:&lt;/p&gt;
    &lt;ul&gt;
    &lt;li&gt;Some compilers let C++ code embed &lt;a href="https://en.cppreference.com/w/cpp/language/asm.html" target="_blank"&gt;inline assembly&lt;/a&gt;.&lt;/li&gt;
    &lt;li&gt;Languages built on .NET or the JVM have some sort of interop with C# or Java, and many of those languages make assumptions about programs that C#/Java do not.&lt;/li&gt;
    &lt;li&gt;The SQL language has stored procedures as an escape hatch &lt;em&gt;and&lt;/em&gt; vendors create a second escape hatch of user-defined functions.&lt;/li&gt;
    &lt;li&gt;Ruby lets you bypass any form of encapsulation with &lt;a href="https://ruby-doc.org/3.4.1/Object.html#method-i-send" target="_blank"&gt;&lt;code&gt;send&lt;/code&gt;&lt;/a&gt;.&lt;/li&gt;
    &lt;li&gt;Frameworks have escape hatches, too! React has &lt;a href="https://react.dev/learn/escape-hatches" target="_blank"&gt;an entire page on them&lt;/a&gt;.&lt;/li&gt;
    &lt;/ul&gt;
    &lt;p&gt;(Does &lt;code&gt;eval&lt;/code&gt; in interpreted languages count as an escape hatch? It feels different, but it does add a lot of capability. Maybe they don't "break assumptions" in the same way?)&lt;/p&gt;
    &lt;h3&gt;The problem with escape hatches&lt;/h3&gt;
    &lt;p&gt;In all languages with escape hatches, the rule is "use this as carefully and sparingly as possible", to the point where a messy solution &lt;em&gt;without&lt;/em&gt; an escape hatch is preferable to a clean solution &lt;em&gt;with&lt;/em&gt; one. Breaking a core assumption is a big deal! If the language is operating as if it's still true, it's going to do incorrect things. &lt;/p&gt;
    &lt;p&gt;I recently had this problem in a TLA+ contract. TLA+ is a language for modeling complicated systems, and assumes that the model is a self-contained universe. The client wanted to use the TLA+ model to test a real system. The model checker should send commands to a test device and check the next states were the same. This is straightforward to set up with the &lt;a href="https://github.com/tlaplus/CommunityModules/blob/master/modules/IOUtils.tla" target="_blank"&gt;IOExec escape hatch&lt;/a&gt;.&lt;sup id="fnref:ioexec"&gt;&lt;a class="footnote-ref" href="#fn:ioexec"&gt;2&lt;/a&gt;&lt;/sup&gt; But the model checker assumed that state exploration was pure and it could skip around the state randomly, meaning it would do things like &lt;code&gt;set x = 10&lt;/code&gt;, then skip to &lt;code&gt;set x = 1&lt;/code&gt;, then skip back to &lt;code&gt;inc x; assert x == 11&lt;/code&gt;. Oops!&lt;/p&gt;
    &lt;p&gt;We eventually found workarounds but it took a lot of clever tricks to pull off. I'll probably write up the technique when I'm less busy with The Book.&lt;/p&gt;
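    &lt;p&gt;To make the skipping problem concrete, here is a toy Python illustration. It is not TLA+ and not how IOExec works. The &lt;code&gt;Device&lt;/code&gt; class is a hypothetical stand-in for the real system, and the call order mimics a checker that assumes exploration is pure:&lt;/p&gt;

```python
class Device:
    """Hypothetical stand-in for the real system: it remembers its last state."""

    def __init__(self):
        self.x = 0

    def set(self, value):
        self.x = value

    def inc(self):
        self.x += 1

device = Device()

device.set(10)    # branch A of the model: x = 10
device.set(1)     # the checker skips to branch B: x = 1
model_x = 10 + 1  # the checker skips back to branch A and takes the inc step
device.inc()      # the device never skipped back, so it increments branch B's value

print(model_x, device.x)  # the model and the device now disagree
```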
    &lt;p&gt;The other problem with escape hatches is the rest of the language is designed around &lt;em&gt;not&lt;/em&gt; having said capabilities, meaning it can't support the feature as well as a language designed for them from the start. Even if your escape hatch code is clean, it might not cleanly &lt;em&gt;integrate&lt;/em&gt; with the rest of your code. This is why people &lt;a href="https://chadaustin.me/2024/10/intrusive-linked-list-in-rust/" target="_blank"&gt;complain about unsafe Rust&lt;/a&gt; so often.&lt;/p&gt;
    &lt;div class="footnote"&gt;
    &lt;hr/&gt;
    &lt;ol&gt;
    &lt;li id="fn:gc"&gt;
    &lt;p&gt;It should be noted though that &lt;em&gt;all&lt;/em&gt; languages with automatic memory management are trading capability for tractability, too. If you can't dereference pointers, you can't dereference &lt;em&gt;null&lt;/em&gt; pointers. &lt;a class="footnote-backref" href="#fnref:gc" title="Jump back to footnote 1 in the text"&gt;↩&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id="fn:ioexec"&gt;
    &lt;p&gt;From the Community Modules (which come default with the VSCode extension). &lt;a class="footnote-backref" href="#fnref:ioexec" title="Jump back to footnote 2 in the text"&gt;↩&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;/ol&gt;
    &lt;/div&gt;
    </description><pubDate>Thu, 24 Jul 2025 14:00:00 +0000</pubDate><guid>https://buttondown.com/hillelwayne/archive/programming-language-escape-hatches/</guid></item><item><title>Maybe writing speed actually is a bottleneck for programming</title><link>https://buttondown.com/hillelwayne/archive/maybe-writing-speed-actually-is-a-bottleneck-for/</link><description>
    &lt;p&gt;I'm a big (neo)vim buff. My config is over 1500 lines and I regularly write new scripts. I recently ported my neovim config to a new laptop. Before then, I was using VSCode to write, and when I switched back I immediately saw a big gain in productivity.&lt;/p&gt;
    &lt;p&gt;People often pooh-pooh vim (and other assistive writing technologies) by saying that writing code isn't the bottleneck in software development. Reading, understanding, and thinking through code is!&lt;/p&gt;
    &lt;p&gt;Now I don't know how true this actually is in practice, because empirical studies of time spent coding are all over the place. Most of them, like &lt;a href="https://www.microsoft.com/en-us/research/wp-content/uploads/2016/02/meyer-fse-2014.pdf" target="_blank"&gt;this study&lt;/a&gt;, track time spent in the editor but don't distinguish between time spent reading code and time spent writing code. The only one I found that separates them was &lt;a href="https://scispace.com/pdf/i-know-what-you-did-last-summer-an-investigation-of-how-3zxclzzocc.pdf" target="_blank"&gt;this study&lt;/a&gt;. It finds that developers spend only 5% of their time editing. It also finds they spend 14% of their time moving or resizing editor windows, so I don't know how clean their data is.&lt;/p&gt;
    &lt;p&gt;But I have a bigger problem with "writing is not the bottleneck": when I think of a bottleneck, I imagine that &lt;em&gt;no&lt;/em&gt; amount of improvement will lead to productivity gains. Like if a program is bottlenecked on the network, it isn't going to get noticeably faster with 100x more RAM or compute. &lt;/p&gt;
    &lt;p&gt;But being able to type code 100x faster, even without corresponding improvements to reading and imagining code, would be &lt;strong&gt;huge&lt;/strong&gt;. &lt;/p&gt;
    &lt;p&gt;We'll assume the average developer writes at 80 words per minute, at five characters a word, for 400 characters a minute. What could we do if we instead wrote at 8,000 words/40k characters a minute? &lt;/p&gt;
    &lt;h3&gt;Writing fast&lt;/h3&gt;
    &lt;h4&gt;Boilerplate is trivial&lt;/h4&gt;
    &lt;p&gt;Why do people like type inference? Because writing all of the types manually is annoying. Why don't people like boilerplate? Because it's annoying to write every damn time. Programmers like features that help them write less! That's not a problem if you can write all of the boilerplate in 0.1 seconds.&lt;/p&gt;
    &lt;p&gt;You still have the problem of &lt;em&gt;reading&lt;/em&gt; boilerplate-heavy code, but you can use the remaining 0.9 seconds to churn out an extension that parses the file and presents the boilerplate in a more legible fashion. &lt;/p&gt;
    &lt;h4&gt;We can write more tooling&lt;/h4&gt;
    &lt;p&gt;This is something I've noticed with LLMs: when I can churn out crappy code as a free action, I use that to write lots of tools that assist me in writing &lt;em&gt;good&lt;/em&gt; code. Even if I'm bottlenecked on a large program, I can still quickly write a script that helps me with something. Most of these aren't things I would have written because they'd take too long to write! &lt;/p&gt;
    &lt;p&gt;Again, not the best comparison, because LLMs also shortcut learning the relevant APIs, so they also optimize the "understanding code" part. Then again, if I could type real fast I could more quickly whip up experiments on new APIs to learn them faster. &lt;/p&gt;
    &lt;h4&gt;We can do practices that slow us down in the short-term&lt;/h4&gt;
    &lt;p&gt;Something like test-driven development significantly slows down how fast you write production code, because you have to spend a lot more time writing test code. Pair programming trades speed of writing code for speed of understanding code. A two-order-of-magnitude writing speedup makes both of them effectively free. Or, if you're not an eXtreme Programming fan, you can more easily follow &lt;a href="https://en.wikipedia.org/wiki/The_Power_of_10:_Rules_for_Developing_Safety-Critical_Code" target="_blank"&gt;The Power of Ten Rules&lt;/a&gt; and blanket your code with contracts and assertions.&lt;/p&gt;
    &lt;h4&gt;We could do more speculative editing&lt;/h4&gt;
    &lt;p&gt;This is probably the biggest difference in how we'd work if we could write 100x faster: it'd be much easier to try changes to the code to see if they're good ideas in the first place. &lt;/p&gt;
    &lt;p&gt;How often have I tried optimizing something, only to find out it didn't make a difference? How often have I done a refactoring only to end up with lower-quality code overall? Too often. Over time it makes me prefer to try things that I know will work, and only "speculatively edit" when I think it'll be a fast change. If I could code 100x faster it would absolutely lead to me trying more speculative edits.&lt;/p&gt;
    &lt;p&gt;This is especially big because I believe that lots of speculative edits are high-risk, high-reward: given 50 things we could do to the code, 49 won't make a difference and one will be a major improvement. If I only have time to try five things, I have a 10% chance of hitting the jackpot. If I can try 500 things I will get that reward every single time. &lt;/p&gt;
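    &lt;p&gt;The arithmetic behind those numbers, as a small Python check. It assumes the tries are distinct picks from the 50 candidates and that exactly one candidate pays off. &lt;code&gt;jackpot_chance&lt;/code&gt; is a made-up helper name:&lt;/p&gt;

```python
from fractions import Fraction
from math import comb

def jackpot_chance(candidates, tries):
    """Chance that `tries` distinct picks out of `candidates` include the one winner."""
    tries = min(tries, candidates)  # can't try more distinct things than exist
    # Ways to pick only losers, divided by all ways to pick.
    miss = Fraction(comb(candidates - 1, tries), comb(candidates, tries))
    return 1 - miss

print(jackpot_chance(50, 5))    # five tries out of fifty
print(jackpot_chance(50, 500))  # enough tries to cover every candidate
```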
    &lt;h2&gt;Processes are built off constraints&lt;/h2&gt;
    &lt;p&gt;These are just a few ideas I came up with; there are probably others. Most of them, I suspect, will share the same property: they change &lt;em&gt;the process&lt;/em&gt; of writing code to leverage the speedup. I can totally believe that a large speedup would not remove a bottleneck in the processes we &lt;em&gt;currently&lt;/em&gt; use to write code. But that's because those processes are developed to work within our existing constraints. Remove a constraint and new processes become possible.&lt;/p&gt;
    &lt;p&gt;The way I see it, if our current process produces 1 Utils of Software / day, a 100x writing speedup might lead to only 1.5 UoS/day. But there are other processes that produce only 0.5 UoS/d &lt;em&gt;because they are bottlenecked on writing speed&lt;/em&gt;. A 100x speedup would lead to 10 UoS/day.&lt;/p&gt;
    &lt;p&gt;The problem with all of this is that a 100x speedup isn't realistic, and it's not obvious whether a 2x improvement would lead to better processes. Then again, one of the first custom vim function scripts I wrote was an aid to writing unit tests in a particular codebase, and it led to me writing a lot more tests. So maybe even a 2x speedup is going to speed things up, too.&lt;/p&gt;
    &lt;hr/&gt;
    &lt;h3&gt;Patreon Stuff&lt;/h3&gt;
    &lt;p&gt;I wrote a couple of TLA+ specs to show how to model &lt;a href="https://en.wikipedia.org/wiki/Fork%E2%80%93join_model" target="_blank"&gt;fork-join&lt;/a&gt; algorithms. I'm planning on eventually writing them up for my blog/learntla but it'll be a while, so if you want to see them in the meantime I put them up on &lt;a href="https://www.patreon.com/posts/fork-join-in-tla-134209395?utm_medium=clipboard_copy&amp;amp;utm_source=copyLink&amp;amp;utm_campaign=postshare_creator&amp;amp;utm_content=join_link" target="_blank"&gt;Patreon&lt;/a&gt;.&lt;/p&gt;
    </description><pubDate>Thu, 17 Jul 2025 19:08:27 +0000</pubDate><guid>https://buttondown.com/hillelwayne/archive/maybe-writing-speed-actually-is-a-bottleneck-for/</guid></item><item><title>Logic for Programmers Turns One</title><link>https://buttondown.com/hillelwayne/archive/logic-for-programmers-turns-one/</link><description>
    &lt;p&gt;I released &lt;a href="https://leanpub.com/logic/" target="_blank"&gt;Logic for Programmers&lt;/a&gt; exactly one year ago today. It feels weird to celebrate the anniversary of something that isn't 1.0 yet, but software projects have a proud tradition of celebrating a dozen anniversaries before 1.0. I wanted to share about what's changed in the past year and the work for the next six+ months.&lt;/p&gt;
    &lt;p&gt;&lt;img alt="The book cover!" class="newsletter-image" src="https://assets.buttondown.email/images/70ac47c9-c49f-47c0-9a05-7a9e70551d03.jpg?w=960&amp;amp;fit=max"/&gt;&lt;/p&gt;
    &lt;h3&gt;The Road to 0.1&lt;/h3&gt;
    &lt;p&gt;I had been noodling on the idea of a logic book since the pandemic. The first time I wrote about it on the newsletter was in &lt;a href="https://buttondown.com/hillelwayne/archive/predicate-logic-for-programmers/" target="_blank"&gt;2021&lt;/a&gt;! Then I said that it would be done by June and would be "under 50 pages". The idea was to cover logic as a "soft skill" that helped you think about things like requirements and stuff.&lt;/p&gt;
    &lt;p&gt;That version &lt;em&gt;sucked&lt;/em&gt;. If you want to see how much it sucked, I put it up on &lt;a href="https://www.patreon.com/posts/what-logic-for-133675688" target="_blank"&gt;Patreon&lt;/a&gt;. Then I slept on the next draft for three years. Then in 2024 a lot of business fell through and I had a lot of free time, so with the help of &lt;a href="https://saul.pw/" target="_blank"&gt;Saul Pwanson&lt;/a&gt; I rewrote the book. This time I emphasized breadth over depth, trying to cover a lot more techniques.  &lt;/p&gt;
    &lt;p&gt;I also decided to self-publish it instead of pitching it to a publisher. Not going the traditional route would mean I would be responsible for paying for editing, advertising, graphic design etc, but I hoped that would be compensated by &lt;em&gt;much&lt;/em&gt; higher royalties. It also meant I could release the book in early access and use early sales to fund further improvements. So I wrote up a draft in &lt;a href="https://www.sphinx-doc.org/en/master/" target="_blank"&gt;Sphinx&lt;/a&gt;, compiled it to LaTeX, and uploaded the PDF to &lt;a href="https://leanpub.com/" target="_blank"&gt;leanpub&lt;/a&gt;. That was in June 2024.&lt;/p&gt;
    &lt;p&gt;Since then I kept to a monthly cadence of updates, missing once in November (short-notice contract) and once last month (&lt;a href="https://systemsdistributed.com/" target="_blank"&gt;Systems Distributed&lt;/a&gt;). The book's now on v0.10. What's changed?&lt;/p&gt;
    &lt;h3&gt;A LOT&lt;/h3&gt;
    &lt;p&gt;v0.1 was &lt;em&gt;very obviously&lt;/em&gt; an alpha, and I have made a lot of improvements since then. For one, the book no longer looks like a &lt;a href="https://www.sphinx-doc.org/_/downloads/en/master/pdf/#page=13" target="_blank"&gt;Sphinx manual&lt;/a&gt;. Compare!&lt;/p&gt;
    &lt;p&gt;&lt;img alt="0.1 on left, 0.10 on right. Way better!" class="newsletter-image" src="https://assets.buttondown.email/images/e4d880ad-80b8-4360-9cae-27c07598c740.png?w=960&amp;amp;fit=max"/&gt;&lt;/p&gt;
    &lt;p&gt;Also, the content is very, very different. v0.1 was 19,000 words, v0.10 is 31,000.&lt;sup id="fnref:pagesize"&gt;&lt;a class="footnote-ref" href="#fn:pagesize"&gt;1&lt;/a&gt;&lt;/sup&gt; This comes from new chapters on TLA+, constraint/SMT solving, logic programming, and major expansions to the existing chapters. Originally, "Simplifying Conditionals" was 600 words. Six hundred words! It almost fit in two pages!&lt;/p&gt;
    &lt;p&gt;&lt;img alt="How short Simplifying Conditions USED to be" class="newsletter-image" src="https://assets.buttondown.email/images/31e731b7-3bdc-4ded-9b09-2a6261a323ec.png?w=960&amp;amp;fit=max"/&gt;&lt;/p&gt;
    &lt;p&gt;The chapter is now 2600 words, now covering condition lifting, quantifier manipulation, helper predicates, and set optimizations. All the other chapters have either gotten similar facelifts or are scheduled to get facelifts.&lt;/p&gt;
    &lt;p&gt;The last big change is the addition of &lt;a href="https://github.com/logicforprogrammers/book-assets" target="_blank"&gt;book assets&lt;/a&gt;. Originally you had to manually copy over all of the code to try it out, which is a problem when there are samples in eight distinct languages! Now there are ready-to-go examples for each chapter, with instructions on how to set up each programming environment. This is also nice because it gives me breaks from writing to code instead.&lt;/p&gt;
    &lt;h3&gt;How did the book do?&lt;/h3&gt;
    &lt;p&gt;Leanpub's all-time visualizations are terrible, so I'll just give the summary: 1180 copies sold, $18,241 in royalties. That's a lot of money for something that isn't fully out yet! By comparison, &lt;em&gt;Practical TLA+&lt;/em&gt; has made me less than half of that, despite selling over 5x as many books. Self-publishing was the right choice!&lt;/p&gt;
    &lt;p&gt;In that time I've paid about $400 for the book cover (worth it) and maybe $800 in Leanpub's advertising service (probably not worth it). &lt;/p&gt;
    &lt;p&gt;Right now that doesn't come close to making back the time investment, but I think it can get there post-release. I believe there's a lot more potential customers via marketing. I think post-release 10k copies sold is within reach.&lt;/p&gt;
    &lt;h3&gt;Where is the book going?&lt;/h3&gt;
    &lt;p&gt;The main content work is rewrites: many of the chapters have not meaningfully changed since 0.1, so I am going through and rewriting them from scratch. So far four of the ten chapters have been rewritten. My (admittedly ambitious) goal is to rewrite three of them by the end of this month and another three by the end of next. I also want to do final passes on the rewritten chapters, as most of them have a few TODOs left lying around.&lt;/p&gt;
    &lt;p&gt;(Also somehow in starting this newsletter and publishing it I realized that one of the chapters might be better split into two chapters, so there could well be a tenth technique in v0.11 or v0.12!)&lt;/p&gt;
    &lt;p&gt;After that, I will pass it to a copy editor while I work on improving the layout, making images, and indexing. I want to have something worthy of printing on a dead tree by 1.0. &lt;/p&gt;
    &lt;p&gt;In terms of timelines, I am &lt;strong&gt;very roughly&lt;/strong&gt; estimating something like this:&lt;/p&gt;
    &lt;ul&gt;
    &lt;li&gt;Summer: final big changes and rewrites&lt;/li&gt;
    &lt;li&gt;Early Autumn: graphic design and copy editing&lt;/li&gt;
    &lt;li&gt;Late Autumn: proofing, figuring out printing stuff&lt;/li&gt;
    &lt;li&gt;Winter: final ebook and initial print releases of 1.0.&lt;/li&gt;
    &lt;/ul&gt;
    &lt;p&gt;(If you know a service that helps get self-published books "past the finish line", I'd love to hear about it! Preferably something that works for a fee, not part of royalties.)&lt;/p&gt;
    &lt;p&gt;This timeline may be disrupted by official client work, like a new TLA+ contract or a conference invitation.&lt;/p&gt;
    &lt;p&gt;Needless to say, I am incredibly excited to complete this book and share the final version with you all. This is a book I wished for years ago, a book I wrote because nobody else would. It fills a critical gap in software educational material, and someday soon I'll be able to put a copy on my bookshelf. It's exhilarating and terrifying and above all, satisfying.&lt;/p&gt;
    &lt;div class="footnote"&gt;
    &lt;hr/&gt;
    &lt;ol&gt;
    &lt;li id="fn:pagesize"&gt;
    &lt;p&gt;It's also 150 pages vs 50 pages, but admittedly this is partially because I made the book smaller with a larger font. &lt;a class="footnote-backref" href="#fnref:pagesize" title="Jump back to footnote 1 in the text"&gt;↩&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;/ol&gt;
    &lt;/div&gt;
    </description><pubDate>Tue, 08 Jul 2025 18:18:52 +0000</pubDate><guid>https://buttondown.com/hillelwayne/archive/logic-for-programmers-turns-one/</guid></item><item><title>Logical Quantifiers in Software</title><link>https://buttondown.com/hillelwayne/archive/logical-quantifiers-in-software/</link><description>
    &lt;p&gt;I realize that for all I've talked about &lt;a href="https://leanpub.com/logic/" target="_blank"&gt;Logic for Programmers&lt;/a&gt; in this newsletter, I never once explained basic logical quantifiers. They're both simple and incredibly useful, so let's do that this week! &lt;/p&gt;
    &lt;h3&gt;Sets and quantifiers&lt;/h3&gt;
    &lt;p&gt;A &lt;strong&gt;set&lt;/strong&gt; is a collection of unordered, unique elements. &lt;code&gt;{1, 2, 3, …}&lt;/code&gt; is a set, as are "every programming language", "every programming language's Wikipedia page", and "every function ever defined in any programming language's standard library". You can put whatever you want in a set, with some very specific limitations to avoid certain paradoxes.&lt;sup id="fnref:paradox"&gt;&lt;a class="footnote-ref" href="#fn:paradox"&gt;2&lt;/a&gt;&lt;/sup&gt; &lt;/p&gt;
    &lt;p&gt;Once we have a set, we can ask "is something true for all elements of the set?" and "is something true for at least one element of the set?" For example, is it true that every programming language has a &lt;code&gt;set&lt;/code&gt; collection type in the core language? We would write the two versions of that question like this:&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;# all of them
    all l in ProgrammingLanguages: HasSetType(l)
    
    # at least one
    some l in ProgrammingLanguages: HasSetType(l)
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;This is the notation I use in the book because it's easy to read, type, and search for. Mathematicians historically had a few different formats; the one I grew up with was &lt;code&gt;∀x ∈ set: P(x)&lt;/code&gt; to mean &lt;code&gt;all x in set&lt;/code&gt;, and &lt;code&gt;∃&lt;/code&gt; to mean &lt;code&gt;some&lt;/code&gt;. I use these when writing for just myself, but find them confusing to programmers when communicating.&lt;/p&gt;
    &lt;p&gt;"All" and "some" are respectively referred to as "universal" and "existential" quantifiers.&lt;/p&gt;
    &lt;h3&gt;Some cool properties&lt;/h3&gt;
    &lt;p&gt;We can simplify expressions with quantifiers, in the same way that we can simplify &lt;code&gt;!(x &amp;amp;&amp;amp; y)&lt;/code&gt; to &lt;code&gt;!x || !y&lt;/code&gt;.&lt;/p&gt;
    &lt;p&gt;First of all, quantifiers are commutative with themselves. &lt;code&gt;some x: some y: P(x,y)&lt;/code&gt; is the same as &lt;code&gt;some y: some x: P(x, y)&lt;/code&gt;. For this reason we can write &lt;code&gt;some x, y: P(x,y)&lt;/code&gt; as shorthand. We can even do this when quantifying over different sets, writing &lt;code&gt;some x, x' in X, y in Y&lt;/code&gt; instead of &lt;code&gt;some x, x' in X: some y in Y&lt;/code&gt;. We can &lt;em&gt;not&lt;/em&gt; do this with "alternating quantifiers":&lt;/p&gt;
    &lt;ul&gt;
    &lt;li&gt;&lt;code&gt;all p in Person: some m in Person: Mother(m, p)&lt;/code&gt; says that every person has a mother.&lt;/li&gt;
    &lt;li&gt;&lt;code&gt;some m in Person: all p in Person: Mother(m, p)&lt;/code&gt; says that someone is every person's mother.&lt;/li&gt;
    &lt;/ul&gt;
    &lt;p&gt;Second, existentials distribute over &lt;code&gt;||&lt;/code&gt; while universals distribute over &lt;code&gt;&amp;amp;&amp;amp;&lt;/code&gt;. "There is some url which returns a 403 or 404" is the same as "there is some url which returns a 403 or some url that returns a 404", and "all PRs pass the linter and the test suites" is the same as "all PRs pass the linter and all PRs pass the test suites".&lt;/p&gt;
    &lt;p&gt;Finally, &lt;code&gt;some&lt;/code&gt; and &lt;code&gt;all&lt;/code&gt; are &lt;em&gt;duals&lt;/em&gt;: &lt;code&gt;some x: P(x) == !(all x: !P(x))&lt;/code&gt;, and vice-versa. Intuitively: if some file is malicious, it's not true that all files are benign.&lt;/p&gt;
    &lt;p&gt;All these rules together mean we can manipulate quantifiers &lt;em&gt;almost&lt;/em&gt; as easily as we can manipulate regular booleans, putting them in whatever form is easiest to use in programming. &lt;/p&gt;
    &lt;p&gt;Speaking of which, how &lt;em&gt;do&lt;/em&gt; we use this in programming?&lt;/p&gt;
    &lt;h2&gt;How we use this in programming&lt;/h2&gt;
    &lt;p&gt;First of all, people clearly have a need for directly using quantifiers in code. If we have something of the form:&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;for x in list:
        if P(x):
            return true
    return false
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;That's just &lt;code&gt;some x in list: P(x)&lt;/code&gt;. And this is a prevalent pattern, as you can see by using &lt;a href="https://github.com/search?q=%2Ffor+.*%3A%5Cn%5Cs*if+.*%3A%5Cn%5Cs*return+%28False%7CTrue%29%5Cn%5Cs*return+%28True%7CFalse%29%2F+language%3Apython+NOT+is%3Afork&amp;amp;type=code" target="_blank"&gt;GitHub code search&lt;/a&gt;. It finds over 500k examples of this pattern in Python alone! That can be simplified by using the language's built-in quantifiers: the Python would be &lt;code&gt;any(P(x) for x in list)&lt;/code&gt;.&lt;/p&gt;
    &lt;p&gt;(Note this is not quantifying over sets but iterables. But the idea translates cleanly enough.)&lt;/p&gt;
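    &lt;p&gt;As a rough sketch of that rewrite, here is the hand-written loop next to &lt;code&gt;any&lt;/code&gt;, plus a spot check of the duality and distribution rules from the previous section. The file names and the &lt;code&gt;is_malicious&lt;/code&gt; and &lt;code&gt;is_hidden&lt;/code&gt; predicates are made up for illustration.&lt;/p&gt;

```python
# Hypothetical data and predicates, purely for illustration.
files = ["notes.txt", "setup.exe", ".env"]


def is_malicious(name):
    return name.endswith(".exe")


def is_hidden(name):
    return name.startswith(".")


def some_loop(pred, items):
    """The explicit-loop form of `some x in items: pred(x)`."""
    for x in items:
        if pred(x):
            return True
    return False


# Same question, asked with the built-in existential quantifier.
some_builtin = any(is_malicious(f) for f in files)

# Duality: some x: P(x) == !(all x: !P(x))
some_via_all = not all(not is_malicious(f) for f in files)

# Existentials distribute over `or`.
either_joint = any(is_malicious(f) or is_hidden(f) for f in files)
either_split = any(map(is_malicious, files)) or any(map(is_hidden, files))

# Universals distribute over `and`.
both_joint = all(is_malicious(f) and is_hidden(f) for f in files)
both_split = all(map(is_malicious, files)) and all(map(is_hidden, files))
```

    &lt;p&gt;One wrinkle: on an empty iterable, &lt;code&gt;any&lt;/code&gt; returns &lt;code&gt;False&lt;/code&gt; and &lt;code&gt;all&lt;/code&gt; returns &lt;code&gt;True&lt;/code&gt;, which matches the usual convention for quantifying over an empty set.&lt;/p&gt;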
    &lt;p&gt;More generally, quantifiers are a key way we express higher-level properties of software. What does it mean for a list to be sorted in ascending order? That &lt;code&gt;all i, j in 0..&amp;lt;len(l): if i &amp;lt; j then l[i] &amp;lt;= l[j]&lt;/code&gt;. When should a &lt;a href="https://qntm.org/ratchet" target="_blank"&gt;ratchet test fail&lt;/a&gt;? When &lt;code&gt;some f in functions - exceptions: Uses(f, bad_function)&lt;/code&gt;. Should the image classifier work upside down? &lt;code&gt;all i in images: classify(i) == classify(rotate(i, 180))&lt;/code&gt;. These are the properties we verify with tests and types and &lt;a href="https://www.hillelwayne.com/post/constructive/" target="_blank"&gt;MISU&lt;/a&gt; and whatnot;&lt;sup id="fnref:misu"&gt;&lt;a class="footnote-ref" href="#fn:misu"&gt;1&lt;/a&gt;&lt;/sup&gt; it helps to be able to make them explicit!&lt;/p&gt;
    &lt;p&gt;One cool use case that'll be in the book's next version: database invariants are universal statements over the set of all records, like &lt;code&gt;all a in accounts: a.balance &amp;gt; 0&lt;/code&gt;. That's enforceable with a &lt;a href="https://sqlite.org/lang_createtable.html#check_constraints" target="_blank"&gt;CHECK&lt;/a&gt; constraint. But what about something like &lt;code&gt;all i, i' in intervals: NoOverlap(i, i')&lt;/code&gt;? That isn't covered by CHECK, since it spans two rows.&lt;/p&gt;
    &lt;p&gt;Quantifier duality to the rescue! The invariant is equivalent to &lt;code&gt;!(some i, i' in intervals: Overlap(i, i'))&lt;/code&gt;, so it is preserved if the &lt;em&gt;query&lt;/em&gt; &lt;code&gt;SELECT COUNT(*) FROM intervals CROSS JOIN intervals …&lt;/code&gt; returns a count of 0. This means we can test it via a &lt;a href="https://sqlite.org/lang_createtrigger.html" target="_blank"&gt;database trigger&lt;/a&gt;.&lt;sup id="fnref:efficiency"&gt;&lt;a class="footnote-ref" href="#fn:efficiency"&gt;3&lt;/a&gt;&lt;/sup&gt;&lt;/p&gt;
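    &lt;p&gt;A minimal sketch of the trigger approach, using Python's built-in &lt;code&gt;sqlite3&lt;/code&gt; module. The table layout (integer &lt;code&gt;lo&lt;/code&gt; and &lt;code&gt;hi&lt;/code&gt; columns for closed intervals), the trigger name, and the &lt;code&gt;try_insert&lt;/code&gt; helper are all assumptions for illustration. It uses the cheaper &lt;code&gt;NEW&lt;/code&gt;-row check from the footnote rather than the full cross join, and it only guards inserts; a real schema would presumably want a matching trigger for updates.&lt;/p&gt;

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical schema: each row is a closed interval [lo, hi].
# The overlap test assumes lo is no greater than hi for every row.
conn.executescript("""
CREATE TABLE intervals (
    id INTEGER PRIMARY KEY,
    lo INTEGER NOT NULL,
    hi INTEGER NOT NULL
);

-- Enforces !(some i in intervals: Overlap(NEW, i)) on every insert.
-- Two closed intervals overlap exactly when one of them starts
-- inside the other.
CREATE TRIGGER intervals_no_overlap
BEFORE INSERT ON intervals
BEGIN
    SELECT RAISE(ABORT, 'interval overlaps an existing row')
    WHERE EXISTS (
        SELECT 1 FROM intervals AS i
        WHERE NEW.lo BETWEEN i.lo AND i.hi
           OR i.lo BETWEEN NEW.lo AND NEW.hi
    );
END;
""")


def try_insert(lo, hi):
    """Insert an interval; return False if the trigger rejects it."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO intervals (lo, hi) VALUES (?, ?)", (lo, hi)
            )
        return True
    except sqlite3.DatabaseError:
        return False
```

    &lt;p&gt;With this in place, &lt;code&gt;try_insert(1, 5)&lt;/code&gt; and &lt;code&gt;try_insert(6, 9)&lt;/code&gt; succeed, while a later &lt;code&gt;try_insert(4, 7)&lt;/code&gt; is rejected because it overlaps the first row.&lt;/p&gt;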
    &lt;hr/&gt;
    &lt;p&gt;There are a lot more use cases for quantifiers, but this is enough to introduce the ideas! Next week's the one-year anniversary of the book entering early access, so I'll be writing a bit about that experience and how the book changed. It's &lt;em&gt;crazy&lt;/em&gt; how crude v0.1 was compared to the current version.&lt;/p&gt;
    &lt;div class="footnote"&gt;
    &lt;hr/&gt;
    &lt;ol&gt;
    &lt;li id="fn:misu"&gt;
    &lt;p&gt;MISU ("make illegal states unrepresentable") means using data representations that rule out invalid values. For example, if you have a &lt;code&gt;location -&amp;gt; Optional(item)&lt;/code&gt; lookup and want to make sure that each item is in exactly one location, consider instead changing the map to &lt;code&gt;item -&amp;gt; location&lt;/code&gt;. This is a means of &lt;em&gt;implementing&lt;/em&gt; the property &lt;code&gt;all i in item, l, l' in location: if ItemIn(i, l) &amp;amp;&amp;amp; l != l' then !ItemIn(i, l')&lt;/code&gt;. &lt;a class="footnote-backref" href="#fnref:misu" title="Jump back to footnote 1 in the text"&gt;↩&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id="fn:paradox"&gt;
    &lt;p&gt;Specifically, a set can't be an element of itself, which rules out constructing things like "the set of all sets" or "the set of sets that don't contain themselves". &lt;a class="footnote-backref" href="#fnref:paradox" title="Jump back to footnote 2 in the text"&gt;↩&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id="fn:efficiency"&gt;
    &lt;p&gt;Though note that when you're inserting or updating an interval, you already &lt;em&gt;have&lt;/em&gt; that row's fields in the trigger's &lt;code&gt;NEW&lt;/code&gt; keyword. So you can just query &lt;code&gt;!(some i in intervals: Overlap(new, i))&lt;/code&gt;, which is more efficient. &lt;a class="footnote-backref" href="#fnref:efficiency" title="Jump back to footnote 3 in the text"&gt;↩&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;/ol&gt;
    &lt;/div&gt;
    </description><pubDate>Wed, 02 Jul 2025 19:44:22 +0000</pubDate><guid>https://buttondown.com/hillelwayne/archive/logical-quantifiers-in-software/</guid></item><item><title>You can cheat a test suite with a big enough polynomial</title><link>https://buttondown.com/hillelwayne/archive/you-can-cheat-a-test-suite-with-a-big-enough/</link><description>
    &lt;p&gt;Hi nerds, I'm back from &lt;a href="https://systemsdistributed.com/" target="_blank"&gt;Systems Distributed&lt;/a&gt;! I'd heartily recommend it, wildest conference I've been to in years. I have a lot of work to catch up on, so this will be a short newsletter.&lt;/p&gt;
    &lt;p&gt;In an earlier version of my talk, I had a gag about unit tests. First I showed the test &lt;code&gt;f([1,2,3]) == 3&lt;/code&gt;, then said that this was satisfied by &lt;code&gt;f(l) = 3&lt;/code&gt;, &lt;code&gt;f(l) = l[-1]&lt;/code&gt;, &lt;code&gt;f(l) = len(l)&lt;/code&gt;, and &lt;code&gt;f(l) = (129*l[0]-34*l[1]-617)*l[2] - 443*l[0] + 1148*l[1] - 182&lt;/code&gt;. Then I progressively ruled them out one by one with more unit tests, except the last polynomial, which stubbornly passed every single test.&lt;/p&gt;
    &lt;p&gt;If you're given some function &lt;code&gt;f(x: int, y: int, …): int&lt;/code&gt; and a set of unit tests asserting &lt;a href="https://buttondown.com/hillelwayne/archive/oracle-testing/" target="_blank"&gt;specific inputs give specific outputs&lt;/a&gt;, then you can find a polynomial that passes every single unit test.&lt;/p&gt;
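    &lt;p&gt;For the one-argument case you don't even need a solver: Lagrange interpolation builds such a polynomial directly. The sketch below is not the SMT approach used in the rest of this post, and unlike that approach its coefficients can be fractions rather than integers. The &lt;code&gt;interpolate&lt;/code&gt; name is made up, and the test inputs are assumed to be distinct.&lt;/p&gt;

```python
from fractions import Fraction


def interpolate(tests):
    """Return a polynomial function agreeing with every (input, expected) pair.

    Assumes each input appears only once in `tests`.
    """
    xs = [x for x, _ in tests]

    def poly(x):
        total = Fraction(0)
        for i, (xi, yi) in enumerate(tests):
            # Basis term: equals yi at xi and 0 at every other test input.
            term = Fraction(yi)
            for j, xj in enumerate(xs):
                if j != i:
                    term *= Fraction(x - xj, xi - xj)
            total += term
        return total

    return poly
```

    &lt;p&gt;Every extra unit test just adds one more term to the sum, so the "cheat" keeps passing no matter how many cases are added.&lt;/p&gt;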
    &lt;p&gt;To find the gag, and as &lt;a href="https://en.wikipedia.org/wiki/Satisfiability_modulo_theories" target="_blank"&gt;SMT&lt;/a&gt; practice, I wrote a Python program that finds a polynomial that passes a test suite meant for &lt;code&gt;max&lt;/code&gt;. It's hardcoded for three parameters and only finds 2nd-order polynomials but I think it could be generalized with enough effort.&lt;/p&gt;
    &lt;h2&gt;The code&lt;/h2&gt;
    &lt;p&gt;Full code &lt;a href="https://gist.github.com/hwayne/0ed045a35376c786171f9cf4b55c470f" target="_blank"&gt;here&lt;/a&gt;, breakdown below.&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nn"&gt;z3&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;  &lt;span class="c1"&gt;# type: ignore&lt;/span&gt;
    &lt;span class="n"&gt;s1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;s2&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;Solver&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt; &lt;span class="n"&gt;Solver&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;&lt;a href="https://microsoft.github.io/z3guide/" target="_blank"&gt;Z3&lt;/a&gt; is just the particular SMT solver we use, as it has good language bindings and a lot of affordances.&lt;/p&gt;
    &lt;p&gt;As part of learning SMT I wanted to do this two ways. First by putting the polynomial "outside" of the SMT solver in a Python function, second by doing it "natively" in Z3. I created two solvers so I could test both versions in one run. &lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="n"&gt;a0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;a&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;b&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;c&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;d&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;f&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;Consts&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'a0 a b c d e f'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;IntSort&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt;
    &lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;y&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;z&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;Ints&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'x y z'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;t&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"a*x+b*y+c*z+d*x*y+e*x*z+f*y*z+a0"&lt;/span&gt;
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;Both &lt;code&gt;Const('x', IntSort())&lt;/code&gt; and &lt;code&gt;Int('x')&lt;/code&gt; do the exact same thing, the latter being syntactic sugar for the former. I did not know this when I wrote the program. &lt;/p&gt;
    &lt;p&gt;To keep the two versions in sync I represented the equation as a string, which I later &lt;code&gt;eval&lt;/code&gt;. This is one of the rare cases where eval is a good idea, to help us experiment more quickly while learning. The polynomial is a "2nd-order polynomial", even though it doesn't have &lt;code&gt;x^2&lt;/code&gt; terms, as it has &lt;code&gt;xy&lt;/code&gt; and &lt;code&gt;xz&lt;/code&gt; terms.&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="n"&gt;lambdamax&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;lambda&lt;/span&gt; &lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;y&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;z&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;eval&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;t&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    
    &lt;span class="n"&gt;z3max&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;Function&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'z3max'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;IntSort&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt; &lt;span class="n"&gt;IntSort&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt; &lt;span class="n"&gt;IntSort&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt;  &lt;span class="n"&gt;IntSort&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt;
    &lt;span class="n"&gt;s1&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;add&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ForAll&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;y&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;z&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="n"&gt;z3max&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;y&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;z&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="nb"&gt;eval&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;t&lt;/span&gt;&lt;span class="p"&gt;)))&lt;/span&gt;
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;&lt;code&gt;lambdamax&lt;/code&gt; is pretty straightforward: create a lambda with three parameters and &lt;code&gt;eval&lt;/code&gt; the string. The string "&lt;code&gt;a*x&lt;/code&gt;" then becomes the Python expression &lt;code&gt;a*x&lt;/code&gt;, where &lt;code&gt;a&lt;/code&gt; is an SMT symbol and the &lt;code&gt;x&lt;/code&gt; SMT symbol is shadowed by the lambda parameter. To reiterate, a terrible idea in practice, but a good way to learn faster.&lt;/p&gt;
    &lt;p&gt;The &lt;code&gt;z3max&lt;/code&gt; function is a little more complex. &lt;code&gt;Function&lt;/code&gt; takes an identifier string and N "sorts" (roughly the same as programming types). The first &lt;code&gt;N-1&lt;/code&gt; sorts define the parameters of the function, while the last becomes the output. So here I assign the string identifier &lt;code&gt;"z3max"&lt;/code&gt; to be a function with signature &lt;code&gt;(int, int, int) -&amp;gt; int&lt;/code&gt;.&lt;/p&gt;
    &lt;p&gt;I can load the function into the model by specifying constraints on what &lt;code&gt;z3max&lt;/code&gt; &lt;em&gt;could&lt;/em&gt; be. This could either be a strict input/output, as will be done later, or a &lt;code&gt;ForAll&lt;/code&gt; over all possible inputs. Here I just use that directly to say "for all inputs, the function should match this polynomial." But I could do more complicated constraints, like commutativity (&lt;code&gt;f(x, y) == f(y, x)&lt;/code&gt;) or monotonicity (&lt;code&gt;Implies(x &amp;lt; y, f(x) &amp;lt;= f(y))&lt;/code&gt;).&lt;/p&gt;
    &lt;p&gt;Note &lt;code&gt;ForAll&lt;/code&gt; takes a list of z3 symbols to quantify over. That's the only reason we need to define &lt;code&gt;x, y, z&lt;/code&gt; in the first place. The lambda version doesn't need them. &lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="n"&gt;inputs&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="p"&gt;)]&lt;/span&gt;
    
    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;g&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;inputs&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;s1&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;add&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;z3max&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;g&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="nb"&gt;max&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;g&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
        &lt;span class="n"&gt;s2&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;add&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;lambdamax&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;g&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="nb"&gt;max&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;g&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;This sets up the joke: adding constraints to each solver that the polynomial it finds must, for a fixed list of triplets, return the max of each triplet.&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;s&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;func&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="p"&gt;[(&lt;/span&gt;&lt;span class="n"&gt;s1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;z3max&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;s2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;lambdamax&lt;/span&gt;&lt;span class="p"&gt;)]:&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;s&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;check&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="n"&gt;sat&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="n"&gt;m&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;s&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
            &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;y&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;z&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;inputs&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
                &lt;span class="nb"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="s2"&gt;"max([&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;, &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;y&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;, &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;z&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;]) ="&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;m&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;evaluate&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;func&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;y&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;z&lt;/span&gt;&lt;span class="p"&gt;)))&lt;/span&gt;
            &lt;span class="nb"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="s2"&gt;"max([x, y, z]) = &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;m&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;a&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;x + &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;m&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;b&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;y"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="s2"&gt;"+ &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;m&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;c&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;z +"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="c1"&gt;# linebreaks added for newsletter rendering&lt;/span&gt;
                &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;m&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;d&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;xy + &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;m&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;xz + &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;m&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;yz + &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;m&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;a0&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;Output:&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;max([1, 2, 3]) = 3
    # etc
    max([x, y, z]) = -133x + 130y + -10z + -2xy + 62xz + -46yz + 0
    
    max([1, 2, 3]) = 3
    # etc
    max([x, y, z]) = -17x + 16y + 0z + 0xy + 8xz + -6yz + 0
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;I find that &lt;code&gt;z3max&lt;/code&gt; (top) consistently finds larger coefficients than &lt;code&gt;lambdamax&lt;/code&gt; does. I don't know why.&lt;/p&gt;
    &lt;h3&gt;Practical Applications&lt;/h3&gt;
    &lt;p&gt;&lt;strong&gt;Test-Driven Development&lt;/strong&gt; recommends a strict "red-green refactor" cycle. Write a new failing test, make the new test pass, then go back and refactor. Well, the easiest way to make the new test pass would be to paste in a new polynomial, so that's what you should be doing. You can even do this all automatically: have a script read the set of test cases, pass them to the solver, and write the new polynomial to your code file. All you need to do is write the tests!&lt;/p&gt;
    &lt;h3&gt;Pedagogical Notes&lt;/h3&gt;
    &lt;p&gt;Writing the script took me a couple of hours. I'm sure an LLM could have whipped it all up in five minutes but I really want to &lt;em&gt;learn&lt;/em&gt; SMT and &lt;a href="https://www.sciencedirect.com/science/article/pii/S0747563224002541" target="_blank"&gt;LLMs &lt;em&gt;may&lt;/em&gt; decrease learning retention&lt;/a&gt;.&lt;sup id="fnref:caveat"&gt;&lt;a class="footnote-ref" href="#fn:caveat"&gt;1&lt;/a&gt;&lt;/sup&gt; Z3 documentation is not... great for non-academics, though, and most other SMT solvers have even worse docs. One useful trick I use regularly is to use GitHub code search to find code using the same APIs and study how that works. Turns out reading API-heavy code is a lot easier than writing it!&lt;/p&gt;
    &lt;p&gt;Anyway, I'm very, very slowly feeling like I'm getting the basics on how to use SMT. I don't have any practical use cases yet, but I've wanted to learn this skill for a while and am glad I finally did.&lt;/p&gt;
    &lt;div class="footnote"&gt;
    &lt;hr/&gt;
    &lt;ol&gt;
    &lt;li id="fn:caveat"&gt;
    &lt;p&gt;Caveat: I have not actually &lt;em&gt;read&lt;/em&gt; the study. For all I know it could have a sample size of three people. I'll get around to it eventually. &lt;a class="footnote-backref" href="#fnref:caveat" title="Jump back to footnote 1 in the text"&gt;↩&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;/ol&gt;
    &lt;/div&gt;
    </description><pubDate>Tue, 24 Jun 2025 16:27:01 +0000</pubDate><guid>https://buttondown.com/hillelwayne/archive/you-can-cheat-a-test-suite-with-a-big-enough/</guid></item><item><title>Solving LinkedIn Queens with SMT</title><link>https://buttondown.com/hillelwayne/archive/solving-linkedin-queens-with-smt/</link><description>
    &lt;h3&gt;No newsletter next week&lt;/h3&gt;
    &lt;p&gt;I’ll be speaking at &lt;a href="https://systemsdistributed.com/" target="_blank"&gt;Systems Distributed&lt;/a&gt;. My talk isn't close to done yet, which is why this newsletter is both late and short. &lt;/p&gt;
    &lt;h1&gt;Solving LinkedIn Queens in SMT&lt;/h1&gt;
    &lt;p&gt;The article &lt;a href="https://codingnest.com/modern-sat-solvers-fast-neat-underused-part-1-of-n/" target="_blank"&gt;Modern SAT solvers: fast, neat and underused&lt;/a&gt; claims that SAT solvers&lt;sup id="fnref:SAT"&gt;&lt;a class="footnote-ref" href="#fn:SAT"&gt;1&lt;/a&gt;&lt;/sup&gt; are "criminally underused by the industry". A while back on the newsletter I asked "why": how come they're so powerful and yet nobody uses them? Many experts responded saying the reason is that encoding SAT kinda sucks and they'd rather use tools that compile to SAT. &lt;/p&gt;
    &lt;p&gt;I was reminded of this when I read &lt;a href="https://ryanberger.me/posts/queens/" target="_blank"&gt;Ryan Berger's post&lt;/a&gt; on solving “LinkedIn Queens” as a SAT problem. &lt;/p&gt;
    &lt;p&gt;A quick overview of Queens. You’re presented with an NxN grid divided into N regions, and have to place N queens so that there is exactly one queen in each row, column, and region. While queens can be on the same diagonal, they &lt;em&gt;cannot&lt;/em&gt; be adjacently diagonal.&lt;/p&gt;
    &lt;p&gt;(Important note: LinkedIn “Queens” is a variation on the puzzle game &lt;a href="https://starbattle.puzzlebaron.com/" target="_blank"&gt;Star Battle&lt;/a&gt;, which is the same except the number of stars you place in each row/column/region varies per puzzle, and is usually two. This is also why 'queens' don’t capture like chess queens.)&lt;/p&gt;
    &lt;p&gt;&lt;img alt="An image of a solved queens board. Copied from https://ryanberger.me/posts/queens" class="newsletter-image" src="https://assets.buttondown.email/images/96f6f923-331f-424d-8641-fe6753e1c2ca.png?w=960&amp;amp;fit=max"/&gt;&lt;/p&gt;
    &lt;p&gt;Ryan solved this by writing Queens as a SAT problem, expressing properties like "there is exactly one queen in row 3" as a large number of boolean clauses. &lt;a href="https://ryanberger.me/posts/queens/" target="_blank"&gt;Go read his post, it's pretty cool&lt;/a&gt;. What leapt out to me was that he used &lt;a href="https://cvc5.github.io/" target="_blank"&gt;CVC5&lt;/a&gt;, an &lt;strong&gt;SMT&lt;/strong&gt; solver.&lt;sup id="fnref:SMT"&gt;&lt;a class="footnote-ref" href="#fn:SMT"&gt;2&lt;/a&gt;&lt;/sup&gt; SMT solvers are "higher-level" than SAT, capable of handling more data types than just boolean variables. It's a lot easier to solve the problem at the SMT level than at the SAT level. To show this, I whipped up a short demo of solving the same problem in &lt;a href="https://github.com/Z3Prover/z3/wiki" target="_blank"&gt;Z3&lt;/a&gt; (via the &lt;a href="https://pypi.org/project/z3-solver/" target="_blank"&gt;Python API&lt;/a&gt;).&lt;/p&gt;
    &lt;p&gt;&lt;a href="https://gist.github.com/hwayne/c5de7bc52e733995311236666bedecd3" target="_blank"&gt;Full code here&lt;/a&gt;, which you can compare to Ryan's SAT solution &lt;a href="https://github.com/ryan-berger/queens/blob/master/main.py" target="_blank"&gt;here&lt;/a&gt;. I didn't do a whole lot of cleanup on it (again, time crunch!), but short explanation below.&lt;/p&gt;
    &lt;h3&gt;The code&lt;/h3&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nn"&gt;z3&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="c1"&gt;# type: ignore&lt;/span&gt;
    &lt;span class="kn"&gt;from&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nn"&gt;itertools&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;combinations&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;chain&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;product&lt;/span&gt;
    &lt;span class="n"&gt;solver&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;Solver&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="n"&gt;size&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;9&lt;/span&gt; &lt;span class="c1"&gt;# N&lt;/span&gt;
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;Initial setup and modules. &lt;code&gt;size&lt;/code&gt; is the number of rows/columns/regions in the board, which I'll call &lt;code&gt;N&lt;/code&gt; below.&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="c1"&gt;# queens[n] = col of queen on row n&lt;/span&gt;
    &lt;span class="c1"&gt;# by construction, not on same row&lt;/span&gt;
    &lt;span class="n"&gt;queens&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;IntVector&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'q'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;size&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; 
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;SAT represents the queen positions via N² booleans: &lt;code&gt;q_00&lt;/code&gt; means that a Queen is on row 0 and column 0, &lt;code&gt;!q_05&lt;/code&gt; means a queen &lt;em&gt;isn't&lt;/em&gt; on row 0 col 5, etc. In SMT we can instead encode it as N integers: &lt;code&gt;q_0 = 5&lt;/code&gt; means that the queen on row 0 is positioned at column 5. This immediately enforces one class of constraints for us: we don't need any constraints saying "exactly one queen per row", because that's embedded in the definition of &lt;code&gt;queens&lt;/code&gt;!&lt;/p&gt;
    &lt;p&gt;(Incidentally, using 0-based indexing for the board was a mistake on my part, it makes correctly encoding the regions later really painful.)&lt;/p&gt;
    &lt;p&gt;To actually make the variables &lt;code&gt;[q_0, q_1, …]&lt;/code&gt;, we use the Z3 affordance &lt;code&gt;IntVector(str, n)&lt;/code&gt; for making &lt;code&gt;n&lt;/code&gt; variables at once.&lt;/p&gt;
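    &lt;p&gt;To get a feel for the size difference, here's a minimal sketch that builds the "exactly one queen per row" clauses for the boolean encoding. It assumes the simple pairwise encoding (Ryan's post may well encode it differently), and &lt;code&gt;exactly_one_clauses&lt;/code&gt; is a made-up helper for illustration, not part of any solver library.&lt;/p&gt;

```python
from itertools import combinations

def exactly_one_clauses(variables):
    """Pairwise CNF encoding of 'exactly one of these booleans is true'.

    Each clause is a list of (name, polarity) literals.
    """
    # One big clause: at least one variable is true.
    at_least_one = [[(v, True) for v in variables]]
    # One small clause per pair: the two are never true together.
    at_most_one = [[(a, False), (b, False)] for a, b in combinations(variables, 2)]
    return at_least_one + at_most_one

size = 9
row_clauses = [
    exactly_one_clauses([f"q_{row}{col}" for col in range(size)])
    for row in range(size)
]
total = sum(len(clauses) for clauses in row_clauses)
print(len(row_clauses[0]), total)  # 37 333
```

    &lt;p&gt;Under that assumption, a 9x9 board needs 37 clauses per row and 333 overall, just for the row rule that the integer encoding gets for free.&lt;/p&gt;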
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="n"&gt;solver&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;add&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="n"&gt;And&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;=&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="n"&gt;size&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;queens&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
    &lt;span class="c1"&gt;# not on same column&lt;/span&gt;
    &lt;span class="n"&gt;solver&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;add&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;Distinct&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;queens&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;First we constrain all the integers to &lt;code&gt;[0, N)&lt;/code&gt;, then use the &lt;em&gt;incredibly&lt;/em&gt; handy &lt;code&gt;Distinct&lt;/code&gt; constraint to force all the integers to have different values. This guarantees at most one queen per column, which by the &lt;a href="https://en.wikipedia.org/wiki/Pigeonhole_principle" target="_blank"&gt;pigeonhole principle&lt;/a&gt; means there is exactly one queen per column.&lt;/p&gt;
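    &lt;p&gt;If the pigeonhole step feels too quick, it can be brute-forced on a small board. This is a plain-Python sketch, not Z3: it enumerates every assignment of N values in &lt;code&gt;[0, N)&lt;/code&gt; and checks that the all-different ones use every column exactly once. The board size of 4 is just an assumption to keep the enumeration small.&lt;/p&gt;

```python
from itertools import product

size = 4  # small enough to enumerate all size**size assignments
distinct_count = 0

for queens in product(range(size), repeat=size):
    if len(set(queens)) == size:  # the property Distinct(queens) asserts
        distinct_count += 1
        # Every column 0..size-1 shows up exactly once.
        assert sorted(queens) == list(range(size))

print(distinct_count)  # 24, the number of permutations of 4 columns
```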
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="c1"&gt;# not diagonally adjacent&lt;/span&gt;
    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;size&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="n"&gt;q1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;q2&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;queens&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="n"&gt;queens&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="o"&gt;+&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
        &lt;span class="n"&gt;solver&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;add&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;Abs&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;q1&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="n"&gt;q2&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;!=&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;One of the rules is that queens can't be adjacent. We already know that they can't be horizontally or vertically adjacent via other constraints, which leaves the diagonals. We only need to add constraints that, for each queen, there is no queen in the lower-left or lower-right corner, aka &lt;code&gt;q_3 != q_2 ± 1&lt;/code&gt;. We don't need to check the top corners because if &lt;code&gt;q_1&lt;/code&gt; is in the upper-left corner of &lt;code&gt;q_2&lt;/code&gt;, then &lt;code&gt;q_2&lt;/code&gt; is in the lower-right corner of &lt;code&gt;q_1&lt;/code&gt;!&lt;/p&gt;
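    &lt;p&gt;Here's a rough plain-Python check of that symmetry argument. The two function names are made up for this sketch: one looks at all four diagonal neighbours of every queen, the other only compares each queen with the queen in the next row, like the constraint above. With one queen per row they should always agree.&lt;/p&gt;

```python
from itertools import permutations

def touches_diagonally_full(queens):
    """Check all four diagonal neighbours of every queen."""
    occupied = {(row, col) for row, col in enumerate(queens)}
    return any(
        (row + dr, col + dc) in occupied
        for row, col in occupied
        for dr in (-1, 1)
        for dc in (-1, 1)
    )

def touches_diagonally_short(queens):
    """Only compare each queen with the queen one row below it."""
    return any(abs(a - b) == 1 for a, b in zip(queens, queens[1:]))

# The two checks agree on every one-queen-per-row-and-column 6x6 placement.
for queens in permutations(range(6)):
    assert touches_diagonally_full(queens) == touches_diagonally_short(queens)

solution = (0, 5, 8, 2, 7, 4, 1, 3, 6)  # the solver output shown later in the post
print(touches_diagonally_short(solution))  # False
```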
    &lt;p&gt;That covers everything except the "one queen per region" constraint. But the regions are the tricky part, which we should expect because we vary the difficulty of queens games by varying the regions.&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="n"&gt;regions&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="s2"&gt;"purple"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[(&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;6&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;7&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;8&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
                       &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;6&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;7&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;8&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
                       &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;8&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)],&lt;/span&gt;
            &lt;span class="s2"&gt;"red"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;6&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;6&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;7&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;7&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span 
class="mi"&gt;8&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;8&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;),],&lt;/span&gt;
            &lt;span class="c1"&gt;# you get the picture&lt;/span&gt;
            &lt;span class="p"&gt;}&lt;/span&gt;
    
    &lt;span class="c1"&gt;# Some checking code left out, see below&lt;/span&gt;
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;The region has to be manually coded in, which is a huge pain.&lt;/p&gt;
    &lt;p&gt;(In the link, some validation code follows. Since it breaks up explaining the model I put it in the next section.)&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;r&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;regions&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;values&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
        &lt;span class="n"&gt;solver&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;add&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;Or&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
            &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;queens&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;row&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="n"&gt;col&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;row&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;col&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;r&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
            &lt;span class="p"&gt;))&lt;/span&gt;
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;Finally we have the region constraint. The easiest way I found to say "there is exactly one queen in each region" is to say "there is a queen in region 1 and a queen in region 2 and a queen in region 3", etc. Since there are N queens and N regions, at least one queen per region means exactly one queen per region (the pigeonhole principle again). Then to say "there is a queen in region &lt;code&gt;purple&lt;/code&gt;" I wrote "&lt;code&gt;q_0 = 0&lt;/code&gt; OR &lt;code&gt;q_0 = 1&lt;/code&gt; OR … OR &lt;code&gt;q_1 = 0&lt;/code&gt;", etc.&lt;/p&gt;
    &lt;p&gt;Why iterate over every position in the region instead of doing something like &lt;code&gt;(0, q[0]) in r&lt;/code&gt;? I tried that but it's not an expression that Z3 supports.&lt;/p&gt;
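    &lt;p&gt;The &lt;code&gt;Or&lt;/code&gt; only says "at least one queen in this region", so it's worth convincing yourself that this is enough. A quick plain-Python sketch on a hypothetical 4x4 board (the grid below is invented, not the puzzle from the post): with N queens and N regions, any placement that hits every region cannot put two queens in the same one.&lt;/p&gt;

```python
from itertools import product

# Hypothetical 4x4 board, one letter per region (not the puzzle from the post).
grid = ["AABB",
        "ACBB",
        "CCDD",
        "CDDD"]
size = len(grid)
region_names = set("".join(grid))
checked = 0

for queens in product(range(size), repeat=size):  # queens[row] = column
    hit = [grid[row][col] for row, col in enumerate(queens)]
    if set(hit) == region_names:           # at least one queen in every region
        checked += 1
        assert len(hit) == len(set(hit))   # so no region holds two queens
```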
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;solver&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;check&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="n"&gt;sat&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;m&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;solver&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
        &lt;span class="nb"&gt;print&lt;/span&gt;&lt;span class="p"&gt;([(&lt;/span&gt;&lt;span class="n"&gt;l&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;m&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;l&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;l&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;queens&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;Finally, we solve and print the positions. Running this gives me:&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="p"&gt;[(&lt;/span&gt;&lt;span class="n"&gt;q__0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;q__1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;q__2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;8&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; 
     &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;q__3&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;q__4&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;7&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;q__5&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; 
     &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;q__6&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;q__7&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;q__8&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;6&lt;/span&gt;&lt;span class="p"&gt;)]&lt;/span&gt;
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;Which is the correct solution to the queens puzzle. I didn't benchmark the solution times, but I imagine it's considerably slower than a raw SAT solver. &lt;a href="https://github.com/audemard/glucose" target="_blank"&gt;Glucose&lt;/a&gt; is really, really fast.&lt;/p&gt;
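    &lt;p&gt;One way to double-check the answer without the solver is a small independent validator. This sketch assumes the region grid is the one printed by &lt;code&gt;render_regions()&lt;/code&gt; in the sanity checks section (digits are region numbers), and &lt;code&gt;is_valid&lt;/code&gt; is a made-up helper that isn't in the gist.&lt;/p&gt;

```python
# Region grid as printed in the sanity-check section (digits = regions).
grid = """
111111111
112333999
122439999
124437799
124666779
124467799
122467899
122555889
112258899
""".split()

solution = [0, 5, 8, 2, 7, 4, 1, 3, 6]  # solution[row] = column, from the solver output
size = len(grid)

def is_valid(queens):
    """Re-check every rule of the puzzle in plain Python."""
    one_per_column = sorted(queens) == list(range(size))
    no_diagonal_touch = all(abs(a - b) != 1 for a, b in zip(queens, queens[1:]))
    regions_hit = {grid[row][col] for row, col in enumerate(queens)}
    return one_per_column and no_diagonal_touch and len(regions_hit) == size

print(is_valid(solution))  # True
```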
    &lt;p&gt;But even so, solving the problem with SMT was a lot &lt;em&gt;easier&lt;/em&gt; than solving it with SAT. That satisfies me as an explanation for why people prefer it to SAT.&lt;/p&gt;
    &lt;h3&gt;Sanity checks&lt;/h3&gt;
    &lt;p&gt;One bit I glossed over earlier was the sanity checking code. I &lt;em&gt;knew for sure&lt;/em&gt; that I was going to make a mistake encoding the &lt;code&gt;regions&lt;/code&gt;, and the solver wasn't going to provide useful information about what I did wrong. In cases like these, I like adding small tests and checks to catch mistakes early, because the solver certainly isn't going to catch them!&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="n"&gt;all_squares&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;set&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;product&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;size&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="n"&gt;repeat&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
    &lt;span class="k"&gt;def&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nf"&gt;test_i_set_up_problem_right&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
        &lt;span class="k"&gt;assert&lt;/span&gt; &lt;span class="n"&gt;all_squares&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="nb"&gt;set&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;chain&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;from_iterable&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;regions&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;values&lt;/span&gt;&lt;span class="p"&gt;()))&lt;/span&gt;
    
        &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;r1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;r2&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;combinations&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;regions&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;values&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
            &lt;span class="k"&gt;assert&lt;/span&gt; &lt;span class="ow"&gt;not&lt;/span&gt; &lt;span class="nb"&gt;set&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;r1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&lt;/span&gt; &lt;span class="nb"&gt;set&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;r2&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="nb"&gt;set&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;r1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&lt;/span&gt; &lt;span class="nb"&gt;set&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;r2&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;The first check was a quick test that I didn't leave any squares out, or accidentally put the same square in two regions. Converting the values into sets makes both checks a lot easier. Honestly I don't know why I didn't just use sets from the start, sets are great.&lt;/p&gt;
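    &lt;p&gt;A failing &lt;code&gt;assert&lt;/code&gt; says something is wrong but not always which square. As a possible variation (not what the gist does), a &lt;code&gt;Counter&lt;/code&gt; can report the missing, duplicated, and off-board squares directly. &lt;code&gt;region_problems&lt;/code&gt; and the tiny broken 2x2 board are invented for this sketch.&lt;/p&gt;

```python
from collections import Counter
from itertools import product

def region_problems(regions, size):
    """Return (missing, duplicated, off_board) squares for a region dict."""
    counts = Counter(sq for squares in regions.values() for sq in squares)
    all_squares = set(product(range(size), repeat=2))
    missing = sorted(all_squares - set(counts))
    duplicated = sorted(sq for sq, n in counts.items() if n > 1)
    off_board = sorted(set(counts) - all_squares)
    return missing, duplicated, off_board

# Deliberately broken 2x2 example (hypothetical data).
regions = {"a": [(0, 0), (0, 1)], "b": [(0, 1), (2, 2)]}
print(region_problems(regions, 2))
# ([(1, 0), (1, 1)], [(0, 1)], [(2, 2)])
```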
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="k"&gt;def&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nf"&gt;render_regions&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
        &lt;span class="n"&gt;colormap&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"purple"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;  &lt;span class="s2"&gt;"red"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;"brown"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;"white"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;"green"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;"yellow"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;"orange"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;"blue"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;"pink"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
        &lt;span class="n"&gt;board&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;_&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;size&lt;/span&gt;&lt;span class="p"&gt;)]&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;_&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;size&lt;/span&gt;&lt;span class="p"&gt;)]&lt;/span&gt; 
        &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;row&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;col&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;all_squares&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;color&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;region&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;regions&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;items&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
                &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;row&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;col&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;region&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
                    &lt;span class="n"&gt;board&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;row&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="n"&gt;col&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;colormap&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;index&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;color&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;+&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;
    
        &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;row&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;board&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="nb"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;""&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;join&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;map&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;row&lt;/span&gt;&lt;span class="p"&gt;)))&lt;/span&gt;
    
    &lt;span class="n"&gt;render_regions&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;The second check is something that prints out the regions. It produces something like this:&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;111111111
    112333999
    122439999
    124437799
    124666779
    124467799
    122467899
    122555889
    112258899
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;I can compare this to the picture of the board to make sure I got it right. I guess a more advanced solution would be to print emoji squares like 🟥 instead.&lt;/p&gt;
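    &lt;p&gt;For what it's worth, the emoji version could look roughly like this. The mapping is an assumption based on the order of &lt;code&gt;colormap&lt;/code&gt; above, and since there is no pink square emoji, a black square stands in for pink. &lt;code&gt;render_emoji&lt;/code&gt; is a made-up name.&lt;/p&gt;

```python
# Hypothetical mapping from region numbers in the printout to emoji squares,
# following the colormap order: purple, red, brown, white, green, yellow,
# orange, blue, pink (no pink square exists, so black is a stand-in).
EMOJI = {
    "1": "🟪", "2": "🟥", "3": "🟫", "4": "⬜", "5": "🟩",
    "6": "🟨", "7": "🟧", "8": "🟦", "9": "⬛",
}

def render_emoji(rows):
    """Turn rows of region digits into rows of coloured squares."""
    return "\n".join("".join(EMOJI[cell] for cell in row) for row in rows)

# First two rows of the printout above.
print(render_emoji(["111111111", "112333999"]))
```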
    &lt;p&gt;Neither check is quality code but it's throwaway and it gets the job done so eh.&lt;/p&gt;
    &lt;h3&gt;Update for the Internet&lt;/h3&gt;
    &lt;p&gt;This was sent as a weekly newsletter, which is usually on topics like &lt;a href="https://buttondown.com/hillelwayne/archive/why-do-we-call-it-boilerplate-code" target="_blank"&gt;software history&lt;/a&gt;, &lt;a href="https://buttondown.com/hillelwayne/archive/the-seven-specification-ur-languages/" target="_blank"&gt;formal methods&lt;/a&gt;, &lt;a href="https://buttondown.com/hillelwayne/archive/i-formally-modeled-dreidel-for-no-good-reason/" target="_blank"&gt;unusual technologies&lt;/a&gt;, and the &lt;a href="https://buttondown.com/hillelwayne/archive/be-suspicious-of-success/" target="_blank"&gt;theory of software engineering&lt;/a&gt;. You &lt;a href="https://buttondown.email/hillelwayne/" target="_blank"&gt;can subscribe here&lt;/a&gt;.&lt;/p&gt;
    &lt;div class="footnote"&gt;
    &lt;hr/&gt;
    &lt;ol&gt;
    &lt;li id="fn:SAT"&gt;
    &lt;p&gt;"Boolean &lt;strong&gt;SAT&lt;/strong&gt;isfiability Solver", aka a solver that can find assignments that make complex boolean expressions true. I write a bit more about them &lt;a href="https://www.hillelwayne.com/post/np-hard/" target="_blank"&gt;here&lt;/a&gt;. &lt;a class="footnote-backref" href="#fnref:SAT" title="Jump back to footnote 1 in the text"&gt;↩&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id="fn:SMT"&gt;
    &lt;p&gt;"Satisfiability Modulo Theories" &lt;a class="footnote-backref" href="#fnref:SMT" title="Jump back to footnote 2 in the text"&gt;↩&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;/ol&gt;
    &lt;/div&gt;
    </description><pubDate>Thu, 12 Jun 2025 15:43:25 +0000</pubDate><guid>https://buttondown.com/hillelwayne/archive/solving-linkedin-queens-with-smt/</guid></item><item><title>AI is a gamechanger for TLA+ users</title><link>https://buttondown.com/hillelwayne/archive/ai-is-a-gamechanger-for-tla-users/</link><description>
    &lt;h3&gt;New Logic for Programmers Release&lt;/h3&gt;
    &lt;p&gt;&lt;a href="https://leanpub.com/logic/" target="_blank"&gt;v0.10 is now available&lt;/a&gt;! This is a minor release, mostly focused on logic-based refactoring, with new material on set types and testing that refactors are correct. See the full release notes at &lt;a href="https://github.com/logicforprogrammers/book-assets/blob/master/CHANGELOG.md" target="_blank"&gt;the changelog page&lt;/a&gt;. Due to &lt;a href="https://systemsdistributed.com/" target="_blank"&gt;conference pressure&lt;/a&gt; v0.11 will also likely be a minor release. &lt;/p&gt;
    &lt;p&gt;&lt;img alt="The book cover" class="newsletter-image" src="https://assets.buttondown.email/images/29d4ae9d-bcb9-4d8b-99d4-8a35c0990ad5.jpg?w=300&amp;amp;fit=max"/&gt;&lt;/p&gt;
    &lt;h1&gt;AI is a gamechanger for TLA+ users&lt;/h1&gt;
    &lt;p&gt;&lt;a href="https://lamport.azurewebsites.net/tla/tla.html" target="_blank"&gt;TLA+&lt;/a&gt; is a specification language to model and debug distributed systems. While very powerful, it's also hard for programmers to learn, and there are always questions about connecting specifications with actual code. &lt;/p&gt;
    &lt;p&gt;That's why &lt;a href="https://zfhuang99.github.io/github%20copilot/formal%20verification/tla+/2025/05/24/ai-revolution-in-distributed-systems.html" target="_blank"&gt;The Coming AI Revolution in Distributed Systems&lt;/a&gt; caught my interest. In the post, Cheng Huang claims that Azure successfully used LLMs to examine an existing codebase, derive a TLA+ spec, and find a production bug in that spec. "After a decade of manually crafting TLA+ specifications", he wrote, "I must acknowledge that this AI-generated specification rivals human work".&lt;/p&gt;
    &lt;p&gt;This inspired me to experiment with LLMs in TLA+ myself. My goals are a little less ambitious than Cheng's: I wanted to see how LLMs could help junior specifiers write TLA+, rather than handling the entire spec automatically. Details on what did and didn't work below, but my takeaway is that &lt;strong&gt;LLMs are an immense specification force multiplier.&lt;/strong&gt;&lt;/p&gt;
    &lt;p&gt;All tests were done with a standard VSCode Copilot subscription, using Claude 3.7 in Agent mode. Other LLMs or IDEs may be more or less effective, etc.&lt;/p&gt;
    &lt;h2&gt;Things Claude was good at&lt;/h2&gt;
    &lt;h3&gt;Fixing syntax errors&lt;/h3&gt;
    &lt;p&gt;TLA+ uses a very different syntax than mainstream programming languages, meaning beginners make a lot of mistakes where they write "programming syntax" instead of TLA+ syntax:&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;NotThree(x) = \* should be ==, not =
        x != 3 \* should be #, not !=
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;The problem is that the TLA+ syntax checker, SANY, is 30 years old and doesn't provide good information. Here's what it says for that snippet:&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;Was expecting "==== or more Module body"
    Encountered "NotThree" at line 6, column 1
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;That only isolates one error and doesn't tell us what the problem is, only where it is. Experienced TLA+ users get "error eyes" and can quickly see what the problem is, but beginners really struggle with this.&lt;/p&gt;
    &lt;p&gt;The TLA+ foundation has made LLM integration a priority, so the VSCode extension &lt;a href="https://github.com/tlaplus/vscode-tlaplus/blob/master/src/main.ts#L174" target="_blank"&gt;naturally supports several agent actions&lt;/a&gt;. One of these is running SANY, meaning an agent can get an error, fix it, get another error, fix it, etc. Provided the above sample and asked to make it work, Claude successfully fixed both errors. It also fixed many errors in a larger spec, as well as figuring out why PlusCal specs weren't compiling to TLA+.&lt;/p&gt;
    &lt;p&gt;This by itself is already enough to make LLMs a worthwhile tool, as it fixes one of the biggest barriers to entry.&lt;/p&gt;
    &lt;h3&gt;Understanding error traces&lt;/h3&gt;
    &lt;p&gt;When TLA+ finds a violated property, it outputs the sequence of steps that leads to the error. This starts in plaintext, and VSCode parses it into an interactive table:&lt;/p&gt;
    &lt;p&gt;&lt;img alt="An example error trace" class="newsletter-image" src="https://assets.buttondown.email/images/f7f16d0e-c61f-4286-ae49-67e03f844126.png?w=960&amp;amp;fit=max"/&gt;&lt;/p&gt;
    &lt;p&gt;Learning to read these error traces is a skill in itself. You have to understand what's happening in each step and how it relates back to the actually broken property. It takes a long time for people to learn how to do this well.&lt;/p&gt;
    &lt;p&gt;Claude was successful here, too, accurately reading 20+ step error traces and giving a high-level explanation of what went wrong. It also could condense error traces: if ten steps of the error trace could be condensed into a one-sentence summary (which can happen if you're modeling a lot of process internals) Claude would do it.&lt;/p&gt;
    &lt;p&gt;I did have issues here with doing this in agent mode: while the extension does provide a "run model checker" command, the agent would regularly ignore this and prefer to run a terminal command instead. This would be fine except that the LLM consistently hallucinated invalid commands. I had to amend every prompt with "run the model checker via vscode, do not use a terminal command". You can skip this if you're willing to copy and paste the error trace into the prompt.&lt;/p&gt;
    &lt;p&gt;As with syntax checking, if this was the &lt;em&gt;only&lt;/em&gt; thing LLMs could effectively do, that would already be enough&lt;sup id="fnref:dayenu"&gt;&lt;a class="footnote-ref" href="#fn:dayenu"&gt;1&lt;/a&gt;&lt;/sup&gt; to earn a strong recommend. Even as a TLA+ expert I expect I'll be using this trick regularly. &lt;/p&gt;
    &lt;h3&gt;Boilerplate tasks&lt;/h3&gt;
    &lt;p&gt;TLA+ has a lot of boilerplate. One of the most notorious examples is &lt;code&gt;UNCHANGED&lt;/code&gt; rules. Specifications are extremely precise — so precise that you have to specify what variables &lt;em&gt;don't&lt;/em&gt; change in every step. This takes the form of an &lt;code&gt;UNCHANGED&lt;/code&gt; clause at the end of relevant actions:&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;RemoveObjectFromStore(srv, o, s) ==
      /\ o \in stored[s]
      /\ stored' = [stored EXCEPT ![s] = @ \ {o}]
      /\ UNCHANGED &amp;lt;&amp;lt;capacity, log, objectsize, pc&amp;gt;&amp;gt;
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;Writing this is really annoying. Updating these whenever you change an action, or add a new variable to the spec, is doubly so. Syntax checking and error analysis are important for beginners, but this is what I wanted for &lt;em&gt;myself&lt;/em&gt;. I took a spec and prompted Claude&lt;/p&gt;
    &lt;blockquote&gt;
    &lt;p&gt;Add UNCHANGED &amp;lt;&amp;lt;v1, v2, etc&amp;gt;&amp;gt; for each variable not changed in an action.&lt;/p&gt;
    &lt;/blockquote&gt;
    &lt;p&gt;And it worked! It successfully updated the &lt;code&gt;UNCHANGED&lt;/code&gt; in every action. &lt;/p&gt;
    &lt;p&gt;(Note, though, that it was a "well-behaved" spec in this regard: only one "action" happened at a time. In TLA+ you can have two actions happen simultaneously, that each update half of the variables, meaning neither of them should have an &lt;code&gt;UNCHANGED&lt;/code&gt; clause. I haven't tested how Claude handles that!)&lt;/p&gt;
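    &lt;p&gt;To make the bookkeeping concrete, here is a minimal Python sketch of the rule for the well-behaved case: a variable belongs in the &lt;code&gt;UNCHANGED&lt;/code&gt; clause if it never appears primed in the action body. The helper name &lt;code&gt;unchanged_vars&lt;/code&gt; is hypothetical, the regex is a rough approximation rather than a TLA+ parser, and it does not handle the simultaneous-action case described above.&lt;/p&gt;

```python
import re


def unchanged_vars(action_body, all_vars):
    """Return the spec variables that never appear primed in action_body.

    Hypothetical helper: a regex approximation, not a real TLA+ parser.
    """
    # A primed name such as stored' marks a variable the action updates.
    primed = set(re.findall(r"([A-Za-z_]\w*)'", action_body))
    return [v for v in all_vars if v not in primed]


# Body of RemoveObjectFromStore from the example, minus its UNCHANGED line.
body = r"""
  /\ o \in stored[s]
  /\ stored' = [stored EXCEPT ![s] = @ \ {o}]
"""

print(unchanged_vars(body, ["capacity", "log", "objectsize", "pc", "stored"]))
```

    &lt;p&gt;For the example action this lists &lt;code&gt;capacity&lt;/code&gt;, &lt;code&gt;log&lt;/code&gt;, &lt;code&gt;objectsize&lt;/code&gt;, and &lt;code&gt;pc&lt;/code&gt;, matching the clause written by hand.&lt;/p&gt;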
    &lt;p&gt;That's the most obvious win, but Claude was good at handling other tedious work, too. Some examples include updating &lt;code&gt;vars&lt;/code&gt; (the conventional collection of all state variables), lifting a hard-coded value into a model parameter, and changing data formats. Most impressive to me, though, was rewriting a spec designed for one process to instead handle multiple processes. This means taking all of the process variables, which originally have types like &lt;code&gt;Int&lt;/code&gt;, converting them to types like &lt;code&gt;[Process -&amp;gt; Int]&lt;/code&gt;, and then updating the uses of all of those variables in the spec. It didn't account for race conditions in the new concurrent behavior, but it was an excellent scaffold to do more work.&lt;/p&gt;
    &lt;h3&gt;Writing properties from an informal description&lt;/h3&gt;
    &lt;p&gt;You have to be pretty precise with your intended property description, but Claude handles converting that precise description into TLA+'s formalized syntax, which is something beginners often struggle with.&lt;/p&gt;
    &lt;h2&gt;Things it is less good at&lt;/h2&gt;
    &lt;h3&gt;Generating model config files&lt;/h3&gt;
    &lt;p&gt;To model check TLA+, you need both a specification (&lt;code&gt;.tla&lt;/code&gt;) and a model config file (&lt;code&gt;.cfg&lt;/code&gt;), which have separate syntaxes. Asking the agent to generate the second often led to it using TLA+ syntax. It automatically fixed this after getting parsing errors, though. &lt;/p&gt;
    &lt;h3&gt;Fixing specs&lt;/h3&gt;
    &lt;p&gt;Whenever the agent ran the model checker and discovered a bug, it would naturally propose a change to either the invalid property or the spec. Sometimes the changes were good, other times the changes were not physically realizable. For example, if it found that a bug was due to a race condition between processes, it would often suggest fixing it by saying race conditions were okay. I mean yes, if you say bugs are okay, then the spec finds that bugs are okay! Or it would alternatively suggest adding a constraint to the spec saying that race conditions don't happen. &lt;a href="https://www.hillelwayne.com/post/alloy-facts/" target="_blank"&gt;But that's a huge mistake in specification&lt;/a&gt;, because race conditions happen if we don't have coordination. We need to specify the &lt;em&gt;mechanism&lt;/em&gt; that is supposed to prevent them.&lt;/p&gt;
    &lt;h3&gt;Finding properties of the spec&lt;/h3&gt;
    &lt;p&gt;After seeing how capable it was at translating my properties to TLA+, I started prompting Claude to come up with properties on its own. Unfortunately, almost everything I got back was either trivial, uninteresting, or too coupled to implementation details. I haven't tested if it would work better to ask it for "properties that may be violated".&lt;/p&gt;
    &lt;h3&gt;Generating code from specs&lt;/h3&gt;
    &lt;p&gt;I have to be specific here: Claude &lt;em&gt;could&lt;/em&gt; sometimes convert Python into a passable spec, and vice versa. It &lt;em&gt;wasn't&lt;/em&gt; good at recognizing abstraction. For example, TLA+ specifications often represent sequential operations with a state variable, commonly called &lt;code&gt;pc&lt;/code&gt;. If modeling code that nonatomically retrieves a counter value and increments it, we'd have one action that requires &lt;code&gt;pc = "Get"&lt;/code&gt; and sets the new value to &lt;code&gt;"Inc"&lt;/code&gt;, then another that requires it be &lt;code&gt;"Inc"&lt;/code&gt; and sets it to &lt;code&gt;"Done"&lt;/code&gt;.&lt;/p&gt;
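    &lt;p&gt;As a rough illustration of what &lt;code&gt;pc&lt;/code&gt; is modeling, here is a small Python sketch that simulates the two-step counter for two processes and tries every interleaving. It is a simulator of the idea, not a translation of any particular spec; the names &lt;code&gt;run&lt;/code&gt; and &lt;code&gt;final_counters&lt;/code&gt; and the process labels are made up for the example.&lt;/p&gt;

```python
from itertools import permutations


def run(schedule):
    """Run one interleaving; each entry names the process taking its next step."""
    counter = 0
    pc = {p: "Get" for p in schedule}  # per-process program counter
    local = {}                         # value each process read during "Get"
    for p in schedule:
        if pc[p] == "Get":
            local[p] = counter         # nonatomic read
            pc[p] = "Inc"
        elif pc[p] == "Inc":
            counter = local[p] + 1     # write back a possibly stale value
            pc[p] = "Done"
    return counter


def final_counters(processes=("a", "b")):
    """Collect the final counter value over every possible interleaving."""
    steps = [p for p in processes for _ in range(2)]  # two steps per process
    return {run(s) for s in set(permutations(steps))}


print(final_counters())
```

    &lt;p&gt;With two processes the final counter can be 1 or 2. The 1 is the lost update that the &lt;code&gt;pc&lt;/code&gt; steps exist to expose; in real code the same information lives in the position of each thread, not in a variable.&lt;/p&gt;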
    &lt;p&gt;I found that Claude would try to somehow convert &lt;code&gt;pc&lt;/code&gt; into part of the Python program's state, rather than recognize it as a TLA+ abstraction. On the other side, when converting Python code to TLA+ it would often try to translate things like &lt;code&gt;sleep&lt;/code&gt; into some part of the spec, not recognizing that it is abstractable into a distinct action. I didn't test other possible misconceptions, like converting randomness to nondeterminism.&lt;/p&gt;
    &lt;p&gt;For the record, when converting TLA+ to Python Claude tended to make simulators of the spec, rather than possible production code implementing the spec. I really wasn't expecting otherwise though.&lt;/p&gt;
    &lt;h2&gt;Unexplored Applications&lt;/h2&gt;
    &lt;p&gt;Things I haven't explored thoroughly but could possibly be effective, based on what I know about TLA+ and AI:&lt;/p&gt;
    &lt;h3&gt;Writing Java Overrides&lt;/h3&gt;
    &lt;p&gt;Most TLA+ operators are resolved via TLA+ interpreters, but you can also implement them in "native" Java. This lets you escape the standard language semantics and add capabilities like &lt;a href="https://github.com/tlaplus/CommunityModules/blob/master/modules/IOUtils.tla" target="_blank"&gt;executing programs during model-checking&lt;/a&gt; or &lt;a href="https://github.com/tlaplus/tlaplus/blob/master/tlatools/org.lamport.tlatools/src/tla2sany/StandardModules/TLC.tla#L62" target="_blank"&gt;dynamically constrain the depth of the searched state space&lt;/a&gt;. There's a lot of cool things I think would be possible with overrides. The problem is there's only a handful of people in the world who know how to write them. But that handful have written quite a few overrides and I think there's enough there for Claude to work with. &lt;/p&gt;
    &lt;h3&gt;Writing specs, given a reference mechanism&lt;/h3&gt;
    &lt;p&gt;In all my experiments, the LLM only had my prompts and the occasional Python script as information. That makes me suspect that some of its problems with writing and fixing specs come down to not having a system model. Maybe it wouldn't suggest fixes like "these processes never race" if it had a design doc saying that the processes can't coordinate. &lt;/p&gt;
    &lt;p&gt;(Could a Sufficiently Powerful LLM derive some TLA+ specification from a design document?)&lt;/p&gt;
    &lt;h3&gt;Connecting specs and code&lt;/h3&gt;
    &lt;p&gt;This is the holy grail of TLA+: taking a codebase and showing it correctly implements a spec. Currently the best ways to do this are by either using TLA+ to generate a test suite, or by taking logged production traces and matching them to TLA+ behaviors. &lt;a href="https://www.mongodb.com/blog/post/engineering/conformance-checking-at-mongodb-testing-our-code-matches-our-tla-specs" target="_blank"&gt;This blog post discusses both&lt;/a&gt;. While I've seen a lot of academic research into these approaches there are no industry-ready tools. So if you want trace validation you have to do a lot of manual labour tailored to your specific product. &lt;/p&gt;
    &lt;p&gt;If LLMs could do some of this work for us then that'd really amplify the usefulness of TLA+ to many companies.&lt;/p&gt;
    &lt;h2&gt;Thoughts&lt;/h2&gt;
    &lt;p&gt;&lt;em&gt;Right now&lt;/em&gt;, agents seem good at the tedious and routine parts of TLA+ and worse at the strategic and abstraction parts. But, since the routine parts are often a huge barrier to beginners, this means that LLMs have the potential to make TLA+ far, far more accessible than it previously was.&lt;/p&gt;
    &lt;p&gt;I have mixed thoughts on this. As an &lt;em&gt;advocate&lt;/em&gt;, this is incredible. I want more people using formal specifications because I believe it leads to cheaper, safer, more reliable software. Anything that gets people comfortable with specs is great for our industry. As a &lt;em&gt;professional TLA+ consultant&lt;/em&gt;, I'm worried that this obsoletes me. Most of my income comes from training and coaching, which companies will have far less demand for now. Then again, maybe this is an opportunity to pitch "agentic TLA+ training" to companies!&lt;/p&gt;
    &lt;p&gt;Anyway, if you're interested in TLA+, there has never been a better time to try it. I mean it, these tools handle so much of the hard part now. I've got a &lt;a href="https://learntla.com/" target="_blank"&gt;free book available online&lt;/a&gt;, as does &lt;a href="https://lamport.azurewebsites.net/tla/book.html" target="_blank"&gt;the inventor of TLA+&lt;/a&gt;. I like &lt;a href="https://elliotswart.github.io/pragmaticformalmodeling/" target="_blank"&gt;this guide too&lt;/a&gt;. Happy modeling!&lt;/p&gt;
    &lt;div class="footnote"&gt;
    &lt;hr/&gt;
    &lt;ol&gt;
    &lt;li id="fn:dayenu"&gt;
    &lt;p&gt;Dayenu. &lt;a class="footnote-backref" href="#fnref:dayenu" title="Jump back to footnote 1 in the text"&gt;↩&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;/ol&gt;
    &lt;/div&gt;
    </description><pubDate>Thu, 05 Jun 2025 14:59:11 +0000</pubDate><guid>https://buttondown.com/hillelwayne/archive/ai-is-a-gamechanger-for-tla-users/</guid></item><item><title>What does "Undecidable" mean, anyway</title><link>https://buttondown.com/hillelwayne/archive/what-does-undecidable-mean-anyway/</link><description>
    &lt;h3&gt;Systems Distributed&lt;/h3&gt;
    &lt;p&gt;I'll be speaking at &lt;a href="https://systemsdistributed.com/" target="_blank"&gt;Systems Distributed&lt;/a&gt; next month! The talk is brand new and will aim to showcase some of the formal methods mental models that would be useful in mainstream software development. It has added some extra stress on my schedule, though, so expect the next two monthly releases of &lt;em&gt;Logic for Programmers&lt;/em&gt; to be mostly minor changes.&lt;/p&gt;
    &lt;h2&gt;What does "Undecidable" mean, anyway&lt;/h2&gt;
    &lt;p&gt;Last week I read &lt;a href="https://liamoc.net/forest/loc-000S/index.xml" target="_blank"&gt;Against Curry-Howard Mysticism&lt;/a&gt;, which is a solid article I recommend reading. But this newsletter is actually about &lt;a href="https://lobste.rs/s/n0whur/against_curry_howard_mysticism#c_lbts57" target="_blank"&gt;one comment&lt;/a&gt;:&lt;/p&gt;
    &lt;blockquote&gt;
    &lt;p&gt;I like to see posts like this because I often feel like I can’t tell the difference between BS and a point I’m missing. Can we get one for questions like “Isn’t XYZ (Undecidable|NP-Complete|PSPACE-Complete)?” &lt;/p&gt;
    &lt;/blockquote&gt;
    &lt;p&gt;I've already written one of these for &lt;a href="https://www.hillelwayne.com/post/np-hard/" target="_blank"&gt;NP-complete&lt;/a&gt;, so let's do one for "undecidable". Step one is to pull a technical definition from the book &lt;a href="https://link.springer.com/book/10.1007/978-1-4612-1844-9" target="_blank"&gt;&lt;em&gt;Automata and Computability&lt;/em&gt;&lt;/a&gt;:&lt;/p&gt;
    &lt;blockquote&gt;
    &lt;p&gt;A property P of strings is said to be decidable if ... there is a total Turing machine that accepts input strings that have property P and rejects those that do not. (pg 220)&lt;/p&gt;
    &lt;/blockquote&gt;
    &lt;p&gt;Step two is to translate the technical computer science definition into more conventional programmer terms. Warning, because this is a newsletter and not a blog post, I might be a little sloppy with terms.&lt;/p&gt;
    &lt;h3&gt;Machines and Decision Problems&lt;/h3&gt;
    &lt;p&gt;In automata theory, all inputs to a "program" are strings of characters, and all outputs are "true" or "false". A program "accepts" a string if it outputs "true", and "rejects" if it outputs "false". You can think of this as automata studying all pure functions of type &lt;code&gt;f :: string -&amp;gt; boolean&lt;/code&gt;. Problems solvable by finding such an &lt;code&gt;f&lt;/code&gt; are called "decision problems".&lt;/p&gt;
    &lt;p&gt;This covers more than you'd think, because we can bootstrap more powerful functions from these. First, as anyone who's programmed in bash knows, strings can represent any other data. Second, we can fake non-boolean outputs by instead checking if a certain computation gives a certain result. For example, I can reframe the function &lt;code&gt;add(x, y) = x + y&lt;/code&gt; as a decision problem like this:&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;IS_SUM(str) {
        x, y, z = split(str, "#")
        return x + y == z
    }
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;Then because &lt;code&gt;IS_SUM("2#3#5")&lt;/code&gt; returns true, we know &lt;code&gt;2 + 3 == 5&lt;/code&gt;, while &lt;code&gt;IS_SUM("2#3#6")&lt;/code&gt; is false. Since we can bootstrap parameters out of strings, I'll just say it's &lt;code&gt;IS_SUM(x, y, z)&lt;/code&gt; going forward.&lt;/p&gt;
    &lt;p&gt;A big part of automata theory is studying different models of computation with different strengths. One of the weakest is called &lt;a href="https://en.wikipedia.org/wiki/Deterministic_finite_automaton" target="_blank"&gt;"DFA"&lt;/a&gt;. I won't go into any details about what DFA actually can do, but the important thing is that it &lt;em&gt;can't&lt;/em&gt; solve &lt;code&gt;IS_SUM&lt;/code&gt;. That is, if you give me a DFA that takes inputs of form &lt;code&gt;x#y#z&lt;/code&gt;, I can always find an input where the DFA returns true when &lt;code&gt;x + y != z&lt;/code&gt;, &lt;em&gt;or&lt;/em&gt; an input which returns false when &lt;code&gt;x + y == z&lt;/code&gt;.&lt;/p&gt;
    &lt;p&gt;It's really important to keep this model of "solve" in mind: a program solves a problem if it correctly returns true on all true inputs and correctly returns false on all false inputs.&lt;/p&gt;
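    &lt;p&gt;For readers who prefer running code, here is one way &lt;code&gt;IS_SUM&lt;/code&gt; might look as an actual string-to-boolean function in Python. Treating malformed input as a rejection is my own choice for the sketch; the pseudocode above doesn't say what happens there.&lt;/p&gt;

```python
def is_sum(s: str) -> bool:
    """Accept strings of the form 'x#y#z' where x + y == z; reject all others."""
    parts = s.split("#")
    # Assumption: anything that is not three decimal numbers is rejected.
    if len(parts) != 3 or not all(p.isdecimal() for p in parts):
        return False
    x, y, z = (int(p) for p in parts)
    return x + y == z


print(is_sum("2#3#5"), is_sum("2#3#6"))
```

    &lt;p&gt;The function returns an answer for every possible input string, which is what "solving" the decision problem requires.&lt;/p&gt;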
    &lt;h3&gt;(total) Turing Machines&lt;/h3&gt;
    &lt;p&gt;A Turing Machine (TM) is a particular type of computation model. It's important for two reasons: &lt;/p&gt;
    &lt;ol&gt;
    &lt;li&gt;
    &lt;p&gt;By the &lt;a href="https://en.wikipedia.org/wiki/Church%E2%80%93Turing_thesis" target="_blank"&gt;Church-Turing thesis&lt;/a&gt;, a Turing Machine is the "upper bound" of how powerful (physically realizable) computational models can get. This means that if an actual real-world programming language can solve a particular decision problem, so can a TM. Conversely, if the TM &lt;em&gt;can't&lt;/em&gt; solve it, neither can the programming language.&lt;sup id="fnref:caveat"&gt;&lt;a class="footnote-ref" href="#fn:caveat"&gt;1&lt;/a&gt;&lt;/sup&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li&gt;
    &lt;p&gt;It's possible to write a Turing machine that takes &lt;em&gt;a textual representation of another Turing machine&lt;/em&gt; as input, and then simulates that Turing machine as part of its computations. &lt;/p&gt;
    &lt;/li&gt;
    &lt;/ol&gt;
    &lt;p&gt;Property (1) means that we can move between different computational models of equal strength, proving things about one to learn things about another. That's why I'm able to write &lt;code&gt;IS_SUM&lt;/code&gt; in a pseudocode instead of writing it in terms of the TM computational model (and why I was able to use &lt;code&gt;split&lt;/code&gt; for convenience). &lt;/p&gt;
    &lt;p&gt;Property (2) does several interesting things. First of all, it makes it possible to compose Turing machines. Here's how I can roughly ask if a given number is the sum of two primes, with "just" addition and boolean functions:&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;IS_SUM_TWO_PRIMES(z):
        x := 1
        y := 1
        loop {
            if x &amp;gt; z {return false}
            if IS_PRIME(x) {
                if IS_PRIME(y) {
                    if IS_SUM(x, y, z) {
                        return true;
                    }
                }
            }
            y := y + 1
            if y &amp;gt; x {
                x := x + 1
                y := 0
            }
        }
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;Notice that without the &lt;code&gt;if x &amp;gt; z {return false}&lt;/code&gt;, the program would loop forever on &lt;code&gt;z=2&lt;/code&gt;. A TM that always halts for all inputs is called &lt;strong&gt;total&lt;/strong&gt;.&lt;/p&gt;
    &lt;p&gt;Property (2) also makes "Turing machines" a possible input to functions, meaning that we can now make decision problems about the behavior of Turing machines. For example, "does the TM &lt;code&gt;M&lt;/code&gt; either accept or reject &lt;code&gt;x&lt;/code&gt; within ten steps?"&lt;sup id="fnref:backticks"&gt;&lt;a class="footnote-ref" href="#fn:backticks"&gt;2&lt;/a&gt;&lt;/sup&gt;&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;IS_DONE_IN_TEN_STEPS(M, x) {
        for (i = 0; i &amp;lt; 10; i++) {
            `simulate M(x) for one step`
            if(`M accepted or rejected`) {
                return true
            }
        }
        return false
    }
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
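    &lt;p&gt;Here is a loose Python analogue of that check. As a stand-in for Turing machines, it assumes each "machine" is a generator function that yields once per step and returns when it accepts or rejects. That modeling choice and all the names are mine, purely for illustration.&lt;/p&gt;

```python
def is_done_in_steps(machine, x, max_steps=10):
    """Return True if machine(x) finishes within max_steps simulated steps."""
    run = machine(x)
    for _ in range(max_steps):
        try:
            next(run)      # simulate one step
        except StopIteration:
            return True    # the machine accepted or rejected
    return False           # still running after max_steps steps


def countdown(x):
    """Toy machine: takes x steps, then finishes."""
    n = x
    while n > 0:
        n -= 1
        yield


def loop_forever(x):
    """Toy machine that never finishes."""
    while True:
        yield


print(is_done_in_steps(countdown, 3), is_done_in_steps(loop_forever, 3))
```

    &lt;p&gt;Because the simulation is capped at a fixed number of steps, the checker itself always returns, even when the machine it simulates would run forever.&lt;/p&gt;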
    &lt;h3&gt;Decidability and Undecidability&lt;/h3&gt;
    &lt;p&gt;Now we have all of the pieces to understand our original definition:&lt;/p&gt;
    &lt;blockquote&gt;
    &lt;p&gt;A property P of strings is said to be decidable if ... there is a total Turing machine that accepts input strings that have property P and rejects those that do not. (220)&lt;/p&gt;
    &lt;/blockquote&gt;
    &lt;p&gt;Let &lt;code&gt;IS_P&lt;/code&gt; be the decision problem "Does the input satisfy P"? Then &lt;code&gt;IS_P&lt;/code&gt; is decidable if it can be solved by a Turing machine, ie, I can provide some &lt;code&gt;IS_P(x)&lt;/code&gt; machine that &lt;em&gt;always&lt;/em&gt; accepts if &lt;code&gt;x&lt;/code&gt; has property P, and always rejects if &lt;code&gt;x&lt;/code&gt; doesn't have property P. If I can't do that, then &lt;code&gt;IS_P&lt;/code&gt; is undecidable. &lt;/p&gt;
    &lt;p&gt;&lt;code&gt;IS_SUM(x, y, z)&lt;/code&gt; and &lt;code&gt;IS_DONE_IN_TEN_STEPS(M, x)&lt;/code&gt; are decidable properties. Is &lt;code&gt;IS_SUM_TWO_PRIMES(z)&lt;/code&gt; decidable? Some analysis shows that our corresponding program will either find a solution, or have &lt;code&gt;x&amp;gt;z&lt;/code&gt; and return false. So yes, it is decidable.&lt;/p&gt;
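    &lt;p&gt;One way to convince yourself is to write a version where termination is obvious. The Python sketch below is not the same algorithm as the pseudocode above: it only searches a finite range of candidates, so every loop is bounded and the function returns on every input.&lt;/p&gt;

```python
def is_prime(n: int) -> bool:
    """Trial division; the loop is bounded by n, so it always terminates."""
    return n >= 2 and all(n % d != 0 for d in range(2, n))


def is_sum_two_primes(z: int) -> bool:
    """Decide whether z = x + y for some primes x and y.

    Any such x lies between 2 and z - 2, so a finite search is enough.
    """
    return any(is_prime(x) and is_prime(z - x) for x in range(2, z - 1))


print([z for z in range(1, 13) if is_sum_two_primes(z)])
```

    &lt;p&gt;Since the property can be decided by one total program, it is decidable, no matter how many other programs for the same problem might loop forever.&lt;/p&gt;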
    &lt;p&gt;Notice there's an asymmetry here. To prove some property is decidable, I just need to find &lt;em&gt;one&lt;/em&gt; program that correctly solves it. To prove some property is undecidable, I need to show that any possible program, no matter what it is, doesn't solve it.&lt;/p&gt;
    &lt;p&gt;So with that asymmetry in mind, are there &lt;em&gt;any&lt;/em&gt; undecidable problems? Yes, quite a lot. Recall that Turing machines can accept encodings of other TMs as input, meaning we can write a TM that checks &lt;em&gt;properties of Turing machines&lt;/em&gt;. And, by &lt;a href="https://en.wikipedia.org/wiki/Rice%27s_theorem" target="_blank"&gt;Rice's Theorem&lt;/a&gt;, every nontrivial semantic&lt;sup id="fnref:nontrivial"&gt;&lt;a class="footnote-ref" href="#fn:nontrivial"&gt;3&lt;/a&gt;&lt;/sup&gt; property of Turing machines is undecidable. The conventional way to prove this is to first find a single undecidable property &lt;code&gt;H&lt;/code&gt;, and then use that to bootstrap undecidability of other properties.&lt;/p&gt;
    &lt;p&gt;The canonical and most famous example of an undecidable problem is the &lt;a href="https://en.wikipedia.org/wiki/Halting_problem" target="_blank"&gt;Halting problem&lt;/a&gt;: "does machine M halt on input i?" It's pretty easy to prove undecidable, and easy to use it to bootstrap other undecidability properties. But again, &lt;em&gt;any&lt;/em&gt; nontrivial semantic property is undecidable. Checking a TM is total is undecidable. Checking a TM accepts &lt;em&gt;any&lt;/em&gt; inputs is undecidable. Checking a TM solves &lt;code&gt;IS_SUM&lt;/code&gt; is undecidable. Etc etc etc.&lt;/p&gt;
    &lt;h3&gt;What this doesn't mean in practice&lt;/h3&gt;
    &lt;p&gt;I often see the halting problem misconstrued as "it's impossible to tell if a program will halt before running it." &lt;strong&gt;This is wrong&lt;/strong&gt;. The halting problem says that we cannot create an algorithm that, when applied to an arbitrary program, tells us whether the program will halt or not. It is absolutely possible to tell if many programs will halt or not. It's possible to find entire subcategories of programs that are guaranteed to halt. It's possible to say "a program constructed following constraints XYZ is guaranteed to halt." &lt;/p&gt;
    &lt;p&gt;The actual consequence of undecidability is more subtle. If we want to know if a program has property P, undecidability tells us&lt;/p&gt;
    &lt;ol&gt;
    &lt;li&gt;We will have to spend time and mental effort to determine if it has P&lt;/li&gt;
    &lt;li&gt;We may not be successful.&lt;/li&gt;
    &lt;/ol&gt;
    &lt;p&gt;This is subtle because we're so used to living in a world where everything's undecidable that we don't really consider what the counterfactual would be like. In such a world there might be no need for Rust, because "does this C program guarantee memory-safety" is a decidable property. The entire field of formal verification could be unnecessary, as we could just check properties of arbitrary programs directly. We could automatically check if a change in a program preserves all existing behavior. Lots of famous math problems could be solved overnight. &lt;/p&gt;
    &lt;p&gt;(This to me is a strong "intuitive" argument for why the halting problem is undecidable: a halt detector can be trivially repurposed as a program optimizer / theorem-prover / bcrypt cracker / chess engine. It's &lt;em&gt;too powerful&lt;/em&gt;, so we should expect it to be impossible.)&lt;/p&gt;
    &lt;p&gt;But because we don't live in that world, all of those things are hard problems that take effort and ingenuity to solve, and even then we often fail.&lt;/p&gt;
    &lt;h3&gt;Update for the Internet&lt;/h3&gt;
    &lt;p&gt;This was sent as a weekly newsletter, which is usually on topics like &lt;a href="https://buttondown.com/hillelwayne/archive/why-do-we-call-it-boilerplate-code" target="_blank"&gt;software history&lt;/a&gt;, &lt;a href="https://buttondown.com/hillelwayne/archive/the-seven-specification-ur-languages/" target="_blank"&gt;formal methods&lt;/a&gt;, &lt;a href="https://buttondown.com/hillelwayne/archive/i-formally-modeled-dreidel-for-no-good-reason/" target="_blank"&gt;unusual technologies&lt;/a&gt;, and the &lt;a href="https://buttondown.com/hillelwayne/archive/be-suspicious-of-success/" target="_blank"&gt;theory of software engineering&lt;/a&gt;. You &lt;a href="https://buttondown.email/hillelwayne/" target="_blank"&gt;can subscribe here&lt;/a&gt;.&lt;/p&gt;
    &lt;div class="footnote"&gt;
    &lt;hr/&gt;
    &lt;ol&gt;
    &lt;li id="fn:caveat"&gt;
    &lt;p&gt;To be pedantic, a TM can't do things like "scrape a webpage" or "render a bitmap", but we're only talking about computational decision problems here. &lt;a class="footnote-backref" href="#fnref:caveat" title="Jump back to footnote 1 in the text"&gt;↩&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id="fn:backticks"&gt;
    &lt;p&gt;One notation I've adopted in &lt;em&gt;Logic for Programmers&lt;/em&gt; is marking abstract sections of pseudocode with backticks. It's really handy! &lt;a class="footnote-backref" href="#fnref:backticks" title="Jump back to footnote 2 in the text"&gt;↩&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id="fn:nontrivial"&gt;
    &lt;p&gt;Nontrivial meaning "at least one TM has this property and at least one TM doesn't have this property". Semantic meaning "related to whether the TM accepts, rejects, or runs forever on a class of inputs". &lt;code&gt;IS_DONE_IN_TEN_STEPS&lt;/code&gt; is &lt;em&gt;not&lt;/em&gt; a semantic property, as it doesn't tell us anything about inputs that take longer than ten steps. &lt;a class="footnote-backref" href="#fnref:nontrivial" title="Jump back to footnote 3 in the text"&gt;↩&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;/ol&gt;
    &lt;/div&gt;
    </description><pubDate>Wed, 28 May 2025 19:34:02 +0000</pubDate><guid>https://buttondown.com/hillelwayne/archive/what-does-undecidable-mean-anyway/</guid></item><item><title>Finding hard 24 puzzles with planner programming</title><link>https://buttondown.com/hillelwayne/archive/finding-hard-24-puzzles-with-planner-programming/</link><description>
    &lt;p&gt;&lt;strong&gt;Planner programming&lt;/strong&gt; is a programming technique where you solve problems by providing a goal and actions, and letting the planner find actions that reach the goal. In a previous edition of &lt;a href="https://leanpub.com/logic/" target="_blank"&gt;Logic for Programmers&lt;/a&gt;, I demonstrated how this worked by solving the 
    &lt;a href="https://en.wikipedia.org/wiki/24_(puzzle)" target="_blank"&gt;24 puzzle&lt;/a&gt; with planning. For &lt;a href="https://buttondown.com/hillelwayne/archive/a48fce5b-8a05-4302-b620-9b26f057f145/" target="_blank"&gt;reasons discussed here&lt;/a&gt; I replaced that example with something more practical (orchestrating deployments), but left the &lt;a href="https://github.com/logicforprogrammers/book-assets/tree/master/code/chapter-misc" target="_blank"&gt;code online&lt;/a&gt; for posterity.&lt;/p&gt;
    &lt;p&gt;Recently I saw a family member try and fail to vibe code a tool that would find all valid 24 puzzles, and realized I could adapt the puzzle solver to also be a puzzle generator. First I'll explain the puzzle rules, then the original solver, then the generator.&lt;sup id="fnref:complex"&gt;&lt;a class="footnote-ref" href="#fn:complex"&gt;1&lt;/a&gt;&lt;/sup&gt; For a much longer intro to planning, see &lt;a href="https://www.hillelwayne.com/post/picat/" target="_blank"&gt;here&lt;/a&gt;.&lt;/p&gt;
    &lt;h3&gt;The rules of 24&lt;/h3&gt;
    &lt;p&gt;You're given four numbers and have to find some elementary equation (&lt;code&gt;+-*/&lt;/code&gt; plus groupings) that uses all four numbers and results in 24. Each number must be used exactly once, but the numbers do not need to be used in the starting puzzle order. Some examples:&lt;/p&gt;
    &lt;ul&gt;
    &lt;li&gt;&lt;code&gt;[6, 6, 6, 6]&lt;/code&gt; -&amp;gt; &lt;code&gt;6+6+6+6=24&lt;/code&gt;&lt;/li&gt;
    &lt;li&gt;&lt;code&gt;[1, 1, 6, 6]&lt;/code&gt; -&amp;gt; &lt;code&gt;(6+6)*(1+1)=24&lt;/code&gt;&lt;/li&gt;
    &lt;li&gt;&lt;code&gt;[4, 4, 4, 5]&lt;/code&gt; -&amp;gt; &lt;code&gt;4*(5+4/4)=24&lt;/code&gt;&lt;/li&gt;
    &lt;/ul&gt;
    &lt;p&gt;Some setups are impossible, like &lt;code&gt;[1, 1, 1, 1]&lt;/code&gt;. Others are possible only with non-elementary operations, like &lt;code&gt;[1, 5, 5, 324]&lt;/code&gt; (which requires exponentiation).&lt;/p&gt;
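    &lt;p&gt;To pin the rules down, here is a brute-force Python check for whether a puzzle is solvable. It is a plain recursive search, not the planner approach this post is about, and the function names are made up. Exact fractions are used so that division doesn't introduce rounding errors.&lt;/p&gt;

```python
from fractions import Fraction
from itertools import combinations


def solvable(nums, target=24):
    """Return True if nums can be combined with + - * / to reach target."""
    return _reach([Fraction(n) for n in nums], Fraction(target))


def _reach(vals, target):
    if len(vals) == 1:
        return vals[0] == target
    # Pick any two values, replace them with one combined result, and recurse.
    for i, j in combinations(range(len(vals)), 2):
        a, b = vals[i], vals[j]
        rest = [v for k, v in enumerate(vals) if k not in (i, j)]
        results = {a + b, a - b, b - a, a * b}
        if b != 0:
            results.add(a / b)
        if a != 0:
            results.add(b / a)
        if any(_reach(rest + [r], target) for r in results):
            return True
    return False


print(solvable([4, 4, 4, 5]), solvable([1, 1, 1, 1]))
```

    &lt;p&gt;Combining two numbers at a time covers every possible grouping, because any parenthesized expression is built by repeatedly merging two subexpressions into one.&lt;/p&gt;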
    &lt;h2&gt;The solver&lt;/h2&gt;
    &lt;p&gt;We will use &lt;a href="http://picat-lang.org/" target="_blank"&gt;Picat&lt;/a&gt;, the only language I know of that has a built-in planner module. The current state of our plan will be represented by a single list with all of the numbers.&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="s s-Atom"&gt;import&lt;/span&gt; &lt;span class="s s-Atom"&gt;planner&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;math&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;
    &lt;span class="s s-Atom"&gt;import&lt;/span&gt; &lt;span class="s s-Atom"&gt;cp&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;
    
    &lt;span class="nf"&gt;action&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;S0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;S1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;Action&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;Cost&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="s s-Atom"&gt;?=&amp;gt;&lt;/span&gt;
      &lt;span class="nf"&gt;member&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;X&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;S0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
      &lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;S0&lt;/span&gt; &lt;span class="s s-Atom"&gt;:=&lt;/span&gt; &lt;span class="nf"&gt;delete&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;S0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;X&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="c1"&gt;% , is `and`&lt;/span&gt;
      &lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nf"&gt;member&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;Y&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;S0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
      &lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;S0&lt;/span&gt; &lt;span class="s s-Atom"&gt;:=&lt;/span&gt; &lt;span class="nf"&gt;delete&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;S0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;Y&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
      &lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
          &lt;span class="nv"&gt;A&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="err"&gt;$&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;X&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="nv"&gt;Y&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; 
        &lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="nv"&gt;A&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="err"&gt;$&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;X&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="nv"&gt;Y&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="nv"&gt;A&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="err"&gt;$&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;X&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="nv"&gt;Y&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="nv"&gt;A&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="err"&gt;$&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;X&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="nv"&gt;Y&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="nv"&gt;Y&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;
        &lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;S1&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nv"&gt;S0&lt;/span&gt; &lt;span class="s s-Atom"&gt;++&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nf"&gt;apply&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;A&lt;/span&gt;&lt;span class="p"&gt;)]&lt;/span&gt;
      &lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;Action&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nv"&gt;A&lt;/span&gt;
      &lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;Cost&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;
      &lt;span class="p"&gt;.&lt;/span&gt;
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;This is our "action", and it works in three steps:&lt;/p&gt;
    &lt;ol&gt;
    &lt;li&gt;Nondeterministically pull two different values out of the input, deleting them&lt;/li&gt;
    &lt;li&gt;Nondeterministically pick one of the basic operations&lt;/li&gt;
    &lt;li&gt;The new state is the remaining elements, appended with that operation applied to our two picks.&lt;/li&gt;
    &lt;/ol&gt;
    &lt;p&gt;Let's walk through this with &lt;code&gt;[1, 6, 1, 7]&lt;/code&gt;. There are four choices for &lt;code&gt;X&lt;/code&gt; and three for &lt;code&gt;Y&lt;/code&gt;. If the planner chooses &lt;code&gt;X=6&lt;/code&gt; and &lt;code&gt;Y=7&lt;/code&gt;, then &lt;code&gt;A = $(6 + 7)&lt;/code&gt;. This is an uncomputed term in the same way lisps might use quotation. We can resolve the computation with &lt;code&gt;apply&lt;/code&gt;, as in the line &lt;code&gt;S1 = S0 ++ [apply(A)]&lt;/code&gt;.&lt;/p&gt;
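    &lt;p&gt;For readers who don't have Picat handy, here is a rough Python sketch of the same three steps. It is not the Picat planner, just an illustration: &lt;code&gt;successors&lt;/code&gt; is a hypothetical name, actions are plain tuples instead of quoted terms, and the division guard checks for nonzero rather than positive values.&lt;/p&gt;

```python
from fractions import Fraction
from itertools import permutations

def successors(state):
    """Yield (action, next_state) pairs, mirroring the three steps of the Picat action."""
    # Step 1: pick two different positions (ordered, so X and Y are distinguishable).
    for i, j in permutations(range(len(state)), 2):
        x, y = state[i], state[j]
        rest = [v for k, v in enumerate(state) if k not in (i, j)]
        # Step 2: try each basic operation.
        candidates = [("+", x + y), ("-", x - y), ("*", x * y)]
        if y != 0:  # assumption: guard on nonzero; the Picat version guards on Y being positive
            candidates.append(("/", Fraction(x) / Fraction(y)))
        # Step 3: the new state is the remaining numbers plus the computed result.
        for op, value in candidates:
            yield (x, op, y), rest + [value]

steps = list(successors([1, 6, 1, 7]))
print(len(steps))  # 4 choices for X, times 3 for Y, times 4 operations
```

    &lt;p&gt;Picking &lt;code&gt;X=6&lt;/code&gt;, &lt;code&gt;Y=7&lt;/code&gt; and addition shows up here as the action &lt;code&gt;(6, "+", 7)&lt;/code&gt; with next state &lt;code&gt;[1, 1, 13]&lt;/code&gt;.&lt;/p&gt;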
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="nf"&gt;final&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="nv"&gt;N&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt; &lt;span class="s s-Atom"&gt;=&amp;gt;&lt;/span&gt;
      &lt;span class="nv"&gt;N&lt;/span&gt; &lt;span class="o"&gt;=:=&lt;/span&gt; &lt;span class="mf"&gt;24.&lt;/span&gt; &lt;span class="c1"&gt;% handle floating point&lt;/span&gt;
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;Our final goal is just a list where the only element is 24. This has to be a little floating point-sensitive to handle floating point division, which is what &lt;code&gt;=:=&lt;/code&gt; does.&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="s s-Atom"&gt;main&lt;/span&gt; &lt;span class="s s-Atom"&gt;=&amp;gt;&lt;/span&gt;
      &lt;span class="nv"&gt;Start&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;6&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
      &lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nf"&gt;best_plan&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;Start&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;Plan&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
      &lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nf"&gt;printf&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;"%w %w%n"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;Start&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;Plan&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
      &lt;span class="p"&gt;.&lt;/span&gt;
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;For &lt;code&gt;main&lt;/code&gt;, we just find the best plan with the maximum cost of &lt;code&gt;4&lt;/code&gt; and print it. When run from the command line, &lt;code&gt;picat&lt;/code&gt; automatically executes whatever is in &lt;code&gt;main&lt;/code&gt;.&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;$ picat 24.pi
    [1,5,5,6] [1 + 5,5 * 6,30 - 6]
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;I don't want to spoil any more 24 puzzles, so let's stop showing the plan:&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;main =&amp;gt;
    &lt;span class="gd"&gt;- , printf("%w %w%n", Start, Plan)&lt;/span&gt;
    &lt;span class="gi"&gt;+ , printf("%w%n", Start)&lt;/span&gt;
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;h3&gt;Generating puzzles&lt;/h3&gt;
    &lt;p&gt;Picat provides a &lt;code&gt;find_all(X, p(X))&lt;/code&gt; function, which returns all &lt;code&gt;X&lt;/code&gt; for which &lt;code&gt;p(X)&lt;/code&gt; is true. In theory, we could write &lt;code&gt;find_all(S, best_plan(S, 4, _))&lt;/code&gt;. In practice, there are an infinite number of valid puzzles, so we need to bound &lt;code&gt;S&lt;/code&gt; somewhat. We also don't want to find any redundant puzzles, such as &lt;code&gt;[6, 6, 6, 4]&lt;/code&gt; and &lt;code&gt;[4, 6, 6, 6]&lt;/code&gt;.&lt;/p&gt;
    &lt;p&gt;We can solve both issues by writing a helper &lt;code&gt;valid24(S)&lt;/code&gt;, which will check that &lt;code&gt;S&lt;/code&gt; is a sorted list of integers within some bounds, like &lt;code&gt;1..8&lt;/code&gt;, and also has a valid solution.&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="nf"&gt;valid24&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;Start&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="s s-Atom"&gt;=&amp;gt;&lt;/span&gt;
      &lt;span class="nv"&gt;Start&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;new_list&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
      &lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;Start&lt;/span&gt; &lt;span class="s s-Atom"&gt;::&lt;/span&gt; &lt;span class="mf"&gt;1..8&lt;/span&gt; &lt;span class="c1"&gt;% every value in 1..8&lt;/span&gt;
      &lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nf"&gt;increasing&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;Start&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="c1"&gt;% sorted ascending&lt;/span&gt;
      &lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nf"&gt;solve&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;Start&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="c1"&gt;% turn into values&lt;/span&gt;
      &lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nf"&gt;best_plan&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;Start&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;Plan&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
      &lt;span class="p"&gt;.&lt;/span&gt;
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;This leans on Picat's constraint solving features to automatically find bounded sorted lists, which is why we need the &lt;code&gt;solve&lt;/code&gt; step.&lt;sup id="fnref:efficiency"&gt;&lt;a class="footnote-ref" href="#fn:efficiency"&gt;2&lt;/a&gt;&lt;/sup&gt; Now we can just loop through all of the values in &lt;code&gt;find_all&lt;/code&gt; to get all solutions:&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="s s-Atom"&gt;main&lt;/span&gt; &lt;span class="s s-Atom"&gt;=&amp;gt;&lt;/span&gt;
      &lt;span class="nf"&gt;foreach&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="nv"&gt;S&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="s s-Atom"&gt;in&lt;/span&gt; &lt;span class="nf"&gt;find_all&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;Start&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
        &lt;span class="nf"&gt;valid24&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;Start&lt;/span&gt;&lt;span class="p"&gt;)))&lt;/span&gt;
        &lt;span class="nf"&gt;printf&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;"%w%n"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;S&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
      &lt;span class="s s-Atom"&gt;end&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;$ picat 24.pi
    
    [1,1,1,8]
    [1,1,2,6]
    [1,1,2,7]
    [1,1,2,8]
    # etc
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
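    &lt;p&gt;As a sanity check outside Picat, the same enumeration can be sketched by brute force in Python. This is only a cross-check under my own assumptions, not the planner: &lt;code&gt;solvable&lt;/code&gt; is a hypothetical helper, sorted bounded lists come from &lt;code&gt;itertools.combinations_with_replacement&lt;/code&gt; instead of constraints, and exact fractions stand in for floating point.&lt;/p&gt;

```python
from fractions import Fraction
from itertools import combinations_with_replacement, permutations

def solvable(nums, target=24):
    """Return True if the numbers can be combined into target with + - * /."""
    if len(nums) == 1:
        return nums[0] == target
    for i, j in permutations(range(len(nums)), 2):
        x, y = nums[i], nums[j]
        rest = [v for k, v in enumerate(nums) if k not in (i, j)]
        results = [x + y, x - y, x * y]
        if y != 0:  # exact division, so no floating point tolerance is needed
            results.append(Fraction(x) / Fraction(y))
        if any(solvable(rest + [r], target) for r in results):
            return True
    return False

# Every ascending 4-tuple over 1..8 appears exactly once, so there are no redundant puzzles.
puzzles = [c for c in combinations_with_replacement(range(1, 9), 4) if solvable(list(c))]
for p in puzzles[:4]:
    print(list(p))
```

    &lt;p&gt;The tuples come out in ascending lexicographic order, which makes the result easy to compare against the Picat output by eye.&lt;/p&gt;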
    &lt;h3&gt;Finding hard puzzles&lt;/h3&gt;
    &lt;p&gt;Last Friday I realized I could do something more interesting with this. Once I have found a plan, I can apply further constraints to the plan, for example to find problems that can be solved with division:&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;valid24(Start, Plan) =&amp;gt;
    &lt;span class="w"&gt; &lt;/span&gt; Start = new_list(4)
    &lt;span class="w"&gt; &lt;/span&gt; , Start :: 1..8
    &lt;span class="w"&gt; &lt;/span&gt; , increasing(Start)
    &lt;span class="w"&gt; &lt;/span&gt; , solve(Start)
    &lt;span class="w"&gt; &lt;/span&gt; , best_plan(Start, 4, Plan)
    &lt;span class="gi"&gt;+ , member($(_ / _), Plan)&lt;/span&gt;
    &lt;span class="w"&gt; &lt;/span&gt; .
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;In playing with this, though, I noticed something weird: there are some solutions that appear if I sort &lt;em&gt;up&lt;/em&gt; but not &lt;em&gt;down&lt;/em&gt;. For example, &lt;code&gt;[3,3,4,5]&lt;/code&gt; appears in the solution set, but &lt;code&gt;[5, 4, 3, 3]&lt;/code&gt; doesn't appear if I replace &lt;code&gt;increasing&lt;/code&gt; with &lt;code&gt;decreasing&lt;/code&gt;.&lt;/p&gt;
    &lt;p&gt;As far as I can tell, this is because Picat only finds one best plan, and &lt;code&gt;[5, 4, 3, 3]&lt;/code&gt; has &lt;em&gt;two&lt;/em&gt; solutions: &lt;code&gt;4*(5-3/3)&lt;/code&gt; and &lt;code&gt;3*(5+4)-3&lt;/code&gt;. &lt;code&gt;best_plan&lt;/code&gt; is a &lt;em&gt;deterministic&lt;/em&gt; operator, so Picat commits to the first best plan it finds. So if it finds &lt;code&gt;3*(5+4)-3&lt;/code&gt; first, it sees that the solution doesn't contain a division, throws &lt;code&gt;[5, 4, 3, 3]&lt;/code&gt; away as a candidate, and moves on to the next puzzle.&lt;/p&gt;
    &lt;p&gt;There's a couple ways we can fix this. We could replace &lt;code&gt;best_plan&lt;/code&gt; with &lt;code&gt;best_plan_nondet&lt;/code&gt;, which can backtrack to find new plans (at the cost of an enormous number of duplicates). Or we could modify our &lt;code&gt;final&lt;/code&gt; to only accept plans with a division: &lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;% Hypothetical change
    final([N]) =&amp;gt;
    &lt;span class="gi"&gt;+ member($(_ / _), current_plan()),&lt;/span&gt;
    &lt;span class="w"&gt; &lt;/span&gt; N =:= 24.
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;My favorite "fix" is to ask another question entirely. While I was looking for puzzles that can be solved with division, what I actually want is puzzles that &lt;em&gt;must&lt;/em&gt; be solved with division. What if I rejected any puzzle that has a solution &lt;em&gt;without&lt;/em&gt; division?&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="gi"&gt;+ plan_with_no_div(S, P) =&amp;gt; best_plan_nondet(S, 4, P), not member($(_ / _), P).&lt;/span&gt;
    
    valid24(Start, Plan) =&amp;gt;
    &lt;span class="w"&gt; &lt;/span&gt; Start = new_list(4)
    &lt;span class="w"&gt; &lt;/span&gt; , Start :: 1..8
    &lt;span class="w"&gt; &lt;/span&gt; , increasing(Start)
    &lt;span class="w"&gt; &lt;/span&gt; , solve(Start)
    &lt;span class="w"&gt; &lt;/span&gt; , best_plan(Start, 4, Plan)
    &lt;span class="gd"&gt;- , member($(_ / _), Plan)&lt;/span&gt;
    &lt;span class="gi"&gt;+ , not plan_with_no_div(Start, _)&lt;/span&gt;
    &lt;span class="w"&gt; &lt;/span&gt; .
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;The new line's a bit tricky. &lt;code&gt;plan_with_no_div&lt;/code&gt; nondeterministically finds a plan, and then fails if the plan contains a division.&lt;sup id="fnref:not"&gt;&lt;a class="footnote-ref" href="#fn:not"&gt;3&lt;/a&gt;&lt;/sup&gt; Since I used &lt;code&gt;best_plan_nondet&lt;/code&gt;, it can backtrack from there and find a new plan. This means &lt;code&gt;plan_with_no_div&lt;/code&gt; only fails if no such plan exists. And in &lt;code&gt;valid24&lt;/code&gt;, we only succeed if &lt;code&gt;plan_with_no_div&lt;/code&gt; fails, guaranteeing that the only existing plans use division. Since this doesn't depend on the plan found via &lt;code&gt;best_plan&lt;/code&gt;, it doesn't matter how the values in &lt;code&gt;Start&lt;/code&gt; are arranged; this will not miss any valid puzzles.&lt;/p&gt;
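    &lt;p&gt;The same "must use division" idea can be sketched in Python by asking two questions of a brute-force search: is the puzzle solvable with all four operations, and is it still solvable once division is taken away? The names &lt;code&gt;solvable&lt;/code&gt; and &lt;code&gt;requires_division&lt;/code&gt; are my own, and this is a cross-check rather than a translation of the Picat code.&lt;/p&gt;

```python
from fractions import Fraction
from itertools import permutations

def solvable(nums, ops, target=24):
    """True if some plan reaches target using only the operations listed in ops."""
    if len(nums) == 1:
        return nums[0] == target
    for i, j in permutations(range(len(nums)), 2):
        x, y = nums[i], nums[j]
        rest = [v for k, v in enumerate(nums) if k not in (i, j)]
        results = {"+": x + y, "-": x - y, "*": x * y}
        if y != 0:
            results["/"] = Fraction(x) / Fraction(y)
        for op in ops:
            if op in results and solvable(rest + [results[op]], ops, target):
                return True
    return False

def requires_division(nums):
    """Solvable overall, but no plan exists that avoids division."""
    return solvable(nums, "+-*/") and not solvable(nums, "+-*")

print(requires_division([1, 3, 4, 6]))
print(requires_division([6, 6, 6, 6]))
```

    &lt;p&gt;A plan with no division step is just a plan over &lt;code&gt;+-*&lt;/code&gt;, so the second call plays the role of &lt;code&gt;plan_with_no_div&lt;/code&gt;, and negating it plays the role of &lt;code&gt;not plan_with_no_div&lt;/code&gt;.&lt;/p&gt;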
    &lt;h4&gt;Aside for my &lt;a href="https://leanpub.com/logic/" target="_blank"&gt;logic book readers&lt;/a&gt;&lt;/h4&gt;
    &lt;p&gt;The new clause is equivalent to &lt;code&gt;!(some p: plan(p) &amp;amp;&amp;amp; !(div in p))&lt;/code&gt;. Applying the simplifications we learned:&lt;/p&gt;
    &lt;ol&gt;
    &lt;li&gt;&lt;code&gt;!(some p: plan(p) &amp;amp;&amp;amp; !(div in p))&lt;/code&gt; (init)&lt;/li&gt;
    &lt;li&gt;&lt;code&gt;all p: !(plan(p) &amp;amp;&amp;amp; !(div in p))&lt;/code&gt; (all/some duality)&lt;/li&gt;
    &lt;li&gt;&lt;code&gt;all p: !plan(p) || (div in p)&lt;/code&gt; (De Morgan's law)&lt;/li&gt;
    &lt;li&gt;&lt;code&gt;all p: plan(p) =&amp;gt; div in p&lt;/code&gt; (implication definition)&lt;/li&gt;
    &lt;/ol&gt;
    &lt;p&gt;Which more obviously means "if P is a valid plan, then it contains a division".&lt;/p&gt;
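    &lt;p&gt;If you'd rather check the first and last forms mechanically than trust the algebra, a small exhaustive test works. This Python sketch assumes a tiny made-up domain of candidate plans and tries every way of marking each one as "is a plan" and "contains a division"; &lt;code&gt;equivalent_everywhere&lt;/code&gt; is a hypothetical name.&lt;/p&gt;

```python
from itertools import product

def equivalent_everywhere(size=3):
    """Compare the initial and final formulas on every assignment over a small domain."""
    domain = range(size)
    truth_rows = list(product([False, True], repeat=size))
    for is_plan, has_div in product(truth_rows, repeat=2):
        # !(some p: plan(p) and !(div in p))
        initial = not any(is_plan[p] and not has_div[p] for p in domain)
        # all p: plan(p) implies div in p
        final = all((not is_plan[p]) or has_div[p] for p in domain)
        if initial != final:
            return False
    return True

print(equivalent_everywhere())
```

    &lt;p&gt;This is evidence for small domains only, not a proof, but it's a cheap way to catch a slipped negation.&lt;/p&gt;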
    &lt;h4&gt;Back to finding hard puzzles&lt;/h4&gt;
    &lt;p&gt;&lt;em&gt;Anyway&lt;/em&gt;, with &lt;code&gt;not plan_with_no_div&lt;/code&gt;, we are filtering puzzles on the set of possible solutions, not just specific solutions. And this gives me an idea: what if we find puzzles that have only one solution? &lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="gh"&gt;different_plan(S, P) =&amp;gt; best_plan_nondet(S, 4, P2), P2 != P.&lt;/span&gt;
    
    valid24(Start, Plan) =&amp;gt;
    &lt;span class="gi"&gt;+ , not different_plan(Start, Plan)&lt;/span&gt;
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;I tried this from &lt;code&gt;1..8&lt;/code&gt; and got:&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;[1,2,7,7]
    [1,3,4,6]
    [1,6,6,8]
    [3,3,8,8]
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;These happen to be some of the &lt;a href="https://www.4nums.com/game/difficulties/" target="_blank"&gt;hardest 24 puzzles known&lt;/a&gt;, though not all of them. Note this is assuming that &lt;code&gt;(X + Y)&lt;/code&gt; and &lt;code&gt;(Y + X)&lt;/code&gt; are &lt;em&gt;different&lt;/em&gt; solutions. If we say they're the same (by writing &lt;code&gt;A = $(X + Y), X &amp;lt;= Y&lt;/code&gt; in our action) then we get a lot more puzzles, many of which are considered "easy". Other "hard" things we can look for include plans that require fractions:&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="nf"&gt;plan_with_no_fractions&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;S&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;P&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="s s-Atom"&gt;=&amp;gt;&lt;/span&gt; 
      &lt;span class="nf"&gt;best_plan_nondet&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;S&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;P&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
      &lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="o"&gt;not&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="nf"&gt;member&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;X&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;P&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
        &lt;span class="nf"&gt;round&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;apply&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;X&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="s s-Atom"&gt;=\=&lt;/span&gt; &lt;span class="nv"&gt;X&lt;/span&gt;
      &lt;span class="p"&gt;).&lt;/span&gt;
    
    &lt;span class="c1"&gt;% insert `not plan...` in valid24 as usual&lt;/span&gt;
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;Finally, we could try seeing if a negative number is required:&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="nf"&gt;plan_with_no_negatives&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;S&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;P&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="s s-Atom"&gt;=&amp;gt;&lt;/span&gt; 
      &lt;span class="nf"&gt;best_plan_nondet&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;S&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;P&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
      &lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="o"&gt;not&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="nf"&gt;member&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;X&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;P&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
        &lt;span class="nf"&gt;apply&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;X&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;
      &lt;span class="p"&gt;).&lt;/span&gt;
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;Interestingly this one returns no solutions, so you are never required to construct a negative number as part of a standard 24 puzzle.&lt;/p&gt;
    &lt;div class="footnote"&gt;
    &lt;hr/&gt;
    &lt;ol&gt;
    &lt;li id="fn:complex"&gt;
    &lt;p&gt;The code below is different than the old book version, as it uses more fancy logic programming features that aren't good in learning material. &lt;a class="footnote-backref" href="#fnref:complex" title="Jump back to footnote 1 in the text"&gt;↩&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id="fn:efficiency"&gt;
    &lt;p&gt;&lt;code&gt;increasing&lt;/code&gt; is a constraint predicate. We could alternatively write &lt;code&gt;sorted&lt;/code&gt;, which is a Picat logical predicate and must be placed after &lt;code&gt;solve&lt;/code&gt;. There don't seem to be any efficiency gains either way. &lt;a class="footnote-backref" href="#fnref:efficiency" title="Jump back to footnote 2 in the text"&gt;↩&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id="fn:not"&gt;
    &lt;p&gt;I don't know what the standard is in Picat, but in Prolog, the convention is to use &lt;code&gt;\+&lt;/code&gt; instead of &lt;code&gt;not&lt;/code&gt;. They mean the same thing, so I'm using &lt;code&gt;not&lt;/code&gt; because it's clearer to non-LPers. &lt;a class="footnote-backref" href="#fnref:not" title="Jump back to footnote 3 in the text"&gt;↩&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;/ol&gt;
    &lt;/div&gt;
    </description><pubDate>Tue, 20 May 2025 18:21:01 +0000</pubDate><guid>https://buttondown.com/hillelwayne/archive/finding-hard-24-puzzles-with-planner-programming/</guid></item><item><title>Modeling Awkward Social Situations with TLA+</title><link>https://buttondown.com/hillelwayne/archive/modeling-awkward-social-situations-with-tla/</link><description>
    &lt;p&gt;You're walking down the street and need to pass someone going the opposite way. You take a step left, but they're thinking the same thing and take a step to their &lt;em&gt;right&lt;/em&gt;, aka your left. You're still blocking each other. Then you take a step to the right, and they take a step to their left, and you're back to where you started. I've heard this called "walkwarding".&lt;/p&gt;
    &lt;p&gt;Let's model this in &lt;a href="https://lamport.azurewebsites.net/tla/tla.html" target="_blank"&gt;TLA+&lt;/a&gt;. TLA+ is a &lt;strong&gt;formal methods&lt;/strong&gt; tool for finding bugs in complex software designs, most often involving concurrency. Two people trying to get past each other just also happens to be a concurrent system. A gentler introduction to TLA+'s capabilities is &lt;a href="https://www.hillelwayne.com/post/modeling-deployments/" target="_blank"&gt;here&lt;/a&gt;, and an in-depth guide teaching the language is &lt;a href="https://learntla.com/" target="_blank"&gt;here&lt;/a&gt;.&lt;/p&gt;
    &lt;h2&gt;The spec&lt;/h2&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;---- MODULE walkward ----
    EXTENDS Integers
    
    VARIABLES pos
    vars == &amp;lt;&amp;lt;pos&amp;gt;&amp;gt;
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;Double equals defines a new operator, single equals is an equality check. &lt;code&gt;&amp;lt;&amp;lt;pos&amp;gt;&amp;gt;&lt;/code&gt; is a sequence, aka array.&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;you == "you"
    me == "me"
    People == {you, me}
    
    MaxPlace == 4
    
    left == 0
    right == 1
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;I've gotten into the habit of assigning string "symbols" to operators so that the compiler complains if I misspell something. &lt;code&gt;left&lt;/code&gt; and &lt;code&gt;right&lt;/code&gt; are numbers so we can shift position with &lt;code&gt;right - pos&lt;/code&gt;.&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;direction == [you |-&amp;gt; 1, me |-&amp;gt; -1]
    goal == [you |-&amp;gt; MaxPlace, me |-&amp;gt; 1]
    
    Init ==
      \* left-right, forward-backward
      pos = [you |-&amp;gt; [lr |-&amp;gt; left, fb |-&amp;gt; 1], me |-&amp;gt; [lr |-&amp;gt; left, fb |-&amp;gt; MaxPlace]]
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;&lt;code&gt;direction&lt;/code&gt;, &lt;code&gt;goal&lt;/code&gt;, and &lt;code&gt;pos&lt;/code&gt; are "records", or hash tables with string keys. I can get my left-right position with &lt;code&gt;pos.me.lr&lt;/code&gt; or &lt;code&gt;pos["me"]["lr"]&lt;/code&gt; (or &lt;code&gt;pos[me].lr&lt;/code&gt;, as &lt;code&gt;me == "me"&lt;/code&gt;).&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;Juke(person) ==
      pos' = [pos EXCEPT ![person].lr = right - @]
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;TLA+ breaks the world into a sequence of steps. In each step, &lt;code&gt;pos&lt;/code&gt; is the value of &lt;code&gt;pos&lt;/code&gt; in the &lt;em&gt;current&lt;/em&gt; step and &lt;code&gt;pos'&lt;/code&gt; is the value in the &lt;em&gt;next&lt;/em&gt; step. The main outcome of this semantics is that we "assign" a new value to &lt;code&gt;pos&lt;/code&gt; by declaring &lt;code&gt;pos'&lt;/code&gt; equal to something. But the semantics also open up lots of cool tricks, like swapping two values with &lt;code&gt;x' = y /\ y' = x&lt;/code&gt;.&lt;/p&gt;
    &lt;p&gt;TLA+ is a little weird about updating functions. To set &lt;code&gt;f[x] = 3&lt;/code&gt;, you gotta write &lt;code&gt;f' = [f EXCEPT ![x] = 3]&lt;/code&gt;. To make things a little easier, the rhs of a function update can contain &lt;code&gt;@&lt;/code&gt; for the old value. &lt;code&gt;![me].lr = right - @&lt;/code&gt; sets the new value to &lt;code&gt;right - pos[me].lr&lt;/code&gt;, so it swaps left and right.&lt;/p&gt;
    &lt;p&gt;("Juke" comes from &lt;a href="https://www.merriam-webster.com/dictionary/juke" target="_blank"&gt;here&lt;/a&gt;)&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;Move(person) ==
      LET new_pos == [pos[person] EXCEPT !.fb = @ + direction[person]]
      IN
        /\ pos[person].fb # goal[person]
        /\ \A p \in People: pos[p] # new_pos
        /\ pos' = [pos EXCEPT ![person] = new_pos]
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;The &lt;code&gt;EXCEPT&lt;/code&gt; syntax can be used in regular definitions, too. This lets someone move one step in their goal direction &lt;em&gt;unless&lt;/em&gt; they are at the goal &lt;em&gt;or&lt;/em&gt; someone is already in that space. &lt;code&gt;/\&lt;/code&gt; means "and".&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;Next ==
      \E p \in People:
        \/ Move(p)
        \/ Juke(p)
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;I really like how TLA+ represents concurrency: "In each step, there is a person who either moves or jukes." It can take a few uses to really wrap your head around but it can express extraordinarily complicated distributed systems.&lt;/p&gt;
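    &lt;p&gt;To make "there is a person who either moves or jukes" concrete, here is a rough Python sketch of the same transition relation. It is not how TLC works internally, and the function names are my own; positions are &lt;code&gt;(lr, fb)&lt;/code&gt; tuples rather than records, and a disabled &lt;code&gt;Move&lt;/code&gt; is modeled by returning &lt;code&gt;None&lt;/code&gt;.&lt;/p&gt;

```python
PEOPLE = ("you", "me")
MAX_PLACE = 4
LEFT, RIGHT = 0, 1
DIRECTION = {"you": 1, "me": -1}
GOAL = {"you": MAX_PLACE, "me": 1}
INIT = {"you": (LEFT, 1), "me": (LEFT, MAX_PLACE)}  # (lr, fb) per person

def juke(pos, person):
    """Swap lanes; always enabled, like Juke in the spec."""
    lr, fb = pos[person]
    return {**pos, person: (RIGHT - lr, fb)}

def move(pos, person):
    """Step toward the goal, or return None when Move is not enabled."""
    lr, fb = pos[person]
    new_pos = (lr, fb + DIRECTION[person])
    if fb == GOAL[person] or new_pos in pos.values():
        return None  # already at the goal, or someone occupies that space
    return {**pos, person: new_pos}

def next_states(pos):
    """All states reachable in one Next step: some person either moves or jukes."""
    states = []
    for person in PEOPLE:
        for action in (move, juke):
            result = action(pos, person)
            if result is not None:
                states.append(result)
    return states

print(len(next_states(INIT)))
```

    &lt;p&gt;The nondeterminism of &lt;code&gt;\E&lt;/code&gt; and &lt;code&gt;\/&lt;/code&gt; turns into "collect every enabled option", which is roughly the set a model checker has to explore from each state.&lt;/p&gt;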
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;Spec == Init /\ [][Next]_vars
    
    Liveness == &amp;lt;&amp;gt;(pos[me].fb = goal[me])
    ====
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;&lt;code&gt;Spec&lt;/code&gt; is our specification: we start at &lt;code&gt;Init&lt;/code&gt; and take a &lt;code&gt;Next&lt;/code&gt; step every step.&lt;/p&gt;
    &lt;p&gt;Liveness is the generic term for "something good is guaranteed to happen", see &lt;a href="https://www.hillelwayne.com/post/safety-and-liveness/" target="_blank"&gt;here&lt;/a&gt; for more. &lt;code&gt;&amp;lt;&amp;gt;&lt;/code&gt; means "eventually", so &lt;code&gt;Liveness&lt;/code&gt; means "eventually my forward-backward position will be my goal". I could extend it to "both of us eventually reach our goal" but I think this is good enough for a demo.&lt;/p&gt;
    &lt;h3&gt;Checking the spec&lt;/h3&gt;
    &lt;p&gt;Four years ago, everybody in TLA+ used the &lt;a href="https://lamport.azurewebsites.net/tla/toolbox.html" target="_blank"&gt;toolbox&lt;/a&gt;. Now the community has collectively shifted over to using the &lt;a href="https://github.com/tlaplus/vscode-tlaplus/" target="_blank"&gt;VSCode extension&lt;/a&gt;.&lt;sup id="fnref:ltla"&gt;&lt;a class="footnote-ref" href="#fn:ltla"&gt;1&lt;/a&gt;&lt;/sup&gt; VSCode requires we write a configuration file, which I will call &lt;code&gt;walkward.cfg&lt;/code&gt;.&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;SPECIFICATION Spec
    PROPERTY Liveness
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;I then check the model with the VSCode command &lt;code&gt;TLA+: Check model with TLC&lt;/code&gt;. Unsurprisingly, it finds an error:&lt;/p&gt;
    &lt;p&gt;&lt;img alt="Screenshot 2025-05-12 153537.png" class="newsletter-image" src="https://assets.buttondown.email/images/af6f9e89-0bc6-4705-b293-4da5f5c16cfe.png?w=960&amp;amp;fit=max"/&gt;&lt;/p&gt;
    &lt;p&gt;The reason it fails is "stuttering": I can get one step away from my goal and then just stop moving forever. We say the spec is &lt;a href="https://www.hillelwayne.com/post/fairness/" target="_blank"&gt;unfair&lt;/a&gt;: it does not guarantee that if progress is always possible, progress will be made. If I want the spec to always make progress, I have to make some of the steps &lt;strong&gt;weakly fair&lt;/strong&gt;.&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="gi"&gt;+ Fairness == WF_vars(Next)&lt;/span&gt;
    
    &lt;span class="gd"&gt;- Spec == Init /\ [][Next]_vars&lt;/span&gt;
    &lt;span class="gi"&gt;+ Spec == Init /\ [][Next]_vars /\ Fairness&lt;/span&gt;
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;Now the spec is weakly fair, so someone will always do &lt;em&gt;something&lt;/em&gt;. New error:&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;\* First six steps cut
    7: &amp;lt;Move("me")&amp;gt;
    pos = [you |-&amp;gt; [lr |-&amp;gt; 0, fb |-&amp;gt; 4], me |-&amp;gt; [lr |-&amp;gt; 1, fb |-&amp;gt; 2]]
    8: &amp;lt;Juke("me")&amp;gt;
    pos = [you |-&amp;gt; [lr |-&amp;gt; 0, fb |-&amp;gt; 4], me |-&amp;gt; [lr |-&amp;gt; 0, fb |-&amp;gt; 2]]
    9: &amp;lt;Juke("me")&amp;gt; (back to state 7)
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;In this failure, I've successfully gotten past you, and then spent the rest of my life endlessly juking back and forth. The &lt;code&gt;Next&lt;/code&gt; step keeps happening, so weak fairness is satisfied. What I actually want is for my &lt;code&gt;Move&lt;/code&gt; and my &lt;code&gt;Juke&lt;/code&gt; to each be weakly fair, independently of each other.&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="gd"&gt;- Fairness == WF_vars(Next)&lt;/span&gt;
    &lt;span class="gi"&gt;+ Fairness == WF_vars(Move(me)) /\ WF_vars(Juke(me))&lt;/span&gt;
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;If my liveness property also specified that &lt;em&gt;you&lt;/em&gt; reached your goal, I could instead write &lt;code&gt;\A p \in People: WF_vars(Move(p)) etc&lt;/code&gt;. I could also swap the &lt;code&gt;\A&lt;/code&gt; with a &lt;code&gt;\E&lt;/code&gt; to mean at least one of us is guaranteed to have fair actions, but not necessarily both of us. &lt;/p&gt;
    &lt;p&gt;New error:&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;3: &amp;lt;Move("me")&amp;gt;
    pos = [you |-&amp;gt; [lr |-&amp;gt; 0, fb |-&amp;gt; 2], me |-&amp;gt; [lr |-&amp;gt; 0, fb |-&amp;gt; 3]]
    4: &amp;lt;Juke("you")&amp;gt;
    pos = [you |-&amp;gt; [lr |-&amp;gt; 1, fb |-&amp;gt; 2], me |-&amp;gt; [lr |-&amp;gt; 0, fb |-&amp;gt; 3]]
    5: &amp;lt;Juke("me")&amp;gt;
    pos = [you |-&amp;gt; [lr |-&amp;gt; 1, fb |-&amp;gt; 2], me |-&amp;gt; [lr |-&amp;gt; 1, fb |-&amp;gt; 3]]
    6: &amp;lt;Juke("me")&amp;gt;
    pos = [you |-&amp;gt; [lr |-&amp;gt; 1, fb |-&amp;gt; 2], me |-&amp;gt; [lr |-&amp;gt; 0, fb |-&amp;gt; 3]]
    7: &amp;lt;Juke("you")&amp;gt; (back to state 3)
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;Now we're getting somewhere! This is the original walkwarding situation we wanted to capture. We're in each other's way, then you juke, but before either of us can move I juke too, then we both juke back. We can repeat this forever, trapped in a social hell.&lt;/p&gt;
    &lt;p&gt;Wait, but doesn't &lt;code&gt;WF(Move(me))&lt;/code&gt; guarantee I will eventually move? Yes, but &lt;em&gt;only if a move is permanently available&lt;/em&gt;. In this case, it's not permanently available, because every couple of steps it's made temporarily unavailable.&lt;/p&gt;
    &lt;p&gt;How do I fix this? I can't add a rule saying that we only juke if we're blocked, because the whole point of walkwarding is that we're not coordinated. In the real world, walkwarding can go on for agonizing seconds. What I can do instead is say that Liveness holds &lt;em&gt;as long as &lt;code&gt;Move&lt;/code&gt; is strongly fair&lt;/em&gt;. Unlike weak fairness, &lt;a href="https://www.hillelwayne.com/post/fairness/#strong-fairness" target="_blank"&gt;strong fairness&lt;/a&gt; guarantees something happens if it keeps becoming possible, even with interruptions. &lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;Liveness == 
    &lt;span class="gi"&gt;+  SF_vars(Move(me)) =&amp;gt; &lt;/span&gt;
    &lt;span class="w"&gt; &lt;/span&gt;   &amp;lt;&amp;gt;(pos[me].fb = goal[me])
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;This makes the spec pass. Even if we weave back and forth for five minutes, as long as we eventually pass each other, I will reach my goal. Note we could also do this by making &lt;code&gt;Move&lt;/code&gt; in &lt;code&gt;Fairness&lt;/code&gt; strongly fair, which is preferable if we have a lot of different liveness properties to check.&lt;/p&gt;
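    &lt;p&gt;To make the difference between the two kinds of fairness concrete, here is a rough Python sketch (not TLA+, and not part of the spec) that replays the looping part of the last error trace. It assumes, hypothetically, that &lt;code&gt;Move(me)&lt;/code&gt; is enabled exactly when we are in different &lt;code&gt;lr&lt;/code&gt; lanes; the helper names are made up for illustration. On a loop that repeats forever, weak fairness only constrains an action that is enabled in every state of the loop, while strong fairness constrains an action that is enabled in any state of the loop.&lt;/p&gt;

```python
# Looping part of the error trace (states 3 to 6).
# Each entry: (lr of "you", lr of "me", action taken to leave that state).
CYCLE = [
    (0, 0, "Juke(you)"),  # state 3: same lane, I am blocked
    (1, 0, "Juke(me)"),   # state 4: different lanes
    (1, 1, "Juke(me)"),   # state 5: same lane again
    (1, 0, "Juke(you)"),  # state 6: different lanes, then back to state 3
]

def move_me_enabled(you_lr, me_lr):
    # Assumption for this sketch: I can step forward only when
    # you are not in my lane.
    return you_lr != me_lr

def weakly_fair(cycle, action, enabled):
    # WF: if the action is enabled in every state of the loop,
    # it must be taken somewhere in the loop.
    taken = any(step == action for _, _, step in cycle)
    always_enabled = all(enabled(you, me) for you, me, _ in cycle)
    return taken or not always_enabled

def strongly_fair(cycle, action, enabled):
    # SF: if the action is enabled in some state of the loop
    # (so infinitely often overall), it must be taken in the loop.
    taken = any(step == action for _, _, step in cycle)
    ever_enabled = any(enabled(you, me) for you, me, _ in cycle)
    return taken or not ever_enabled

print(weakly_fair(CYCLE, "Move(me)", move_me_enabled))    # True
print(strongly_fair(CYCLE, "Move(me)", move_me_enabled))  # False
```

    &lt;p&gt;Under these assumptions the loop is allowed by weak fairness, because my move keeps getting interrupted, but ruled out by strong fairness, which is why the strongly fair version of the property passes.&lt;/p&gt;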
    &lt;h3&gt;A small exercise for the reader&lt;/h3&gt;
    &lt;p&gt;There is a presumed invariant that is violated. Identify what it is, write it as a property in TLA+, and show the spec violates it. Then fix it.&lt;/p&gt;
    &lt;p&gt;Answer (in &lt;a href="https://rot13.com/" target="_blank"&gt;rot13&lt;/a&gt;): Gur vainevnag vf "ab gjb crbcyr ner va gur rknpg fnzr ybpngvba". &lt;code&gt;Zbir&lt;/code&gt; thnenagrrf guvf ohg &lt;code&gt;Whxr&lt;/code&gt; &lt;em&gt;qbrf abg&lt;/em&gt;.&lt;/p&gt;
    &lt;h3&gt;More TLA+ Exercises&lt;/h3&gt;
    &lt;p&gt;I've started work on &lt;a href="https://github.com/hwayne/tlaplus-exercises/" target="_blank"&gt;an exercises repo&lt;/a&gt;. There's only a handful of specific problems now but I'm planning on adding more over the summer.&lt;/p&gt;
    &lt;div class="footnote"&gt;
    &lt;hr/&gt;
    &lt;ol&gt;
    &lt;li id="fn:ltla"&gt;
    &lt;p&gt;&lt;a href="https://learntla.com/" target="_blank"&gt;learntla&lt;/a&gt; is still on the toolbox, but I'm hoping to get it all moved over this summer. &lt;a class="footnote-backref" href="#fnref:ltla" title="Jump back to footnote 1 in the text"&gt;↩&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;/ol&gt;
    &lt;/div&gt;
    </description><pubDate>Wed, 14 May 2025 16:02:21 +0000</pubDate><guid>https://buttondown.com/hillelwayne/archive/modeling-awkward-social-situations-with-tla/</guid></item><item><title>Write the most clever code you possibly can</title><link>https://buttondown.com/hillelwayne/archive/write-the-most-clever-code-you-possibly-can/</link><description>
    &lt;p&gt;&lt;em&gt;I started writing this early last week but Real Life Stuff happened and now you're getting the first-draft late this week. Warning, unedited thoughts ahead!&lt;/em&gt;&lt;/p&gt;
    &lt;h2&gt;New Logic for Programmers release!&lt;/h2&gt;
    &lt;p&gt;&lt;a href="https://leanpub.com/logic/" target="_blank"&gt;v0.9 is out&lt;/a&gt;! This is a big release, with a new cover design, several rewritten chapters, &lt;a href="https://github.com/logicforprogrammers/book-assets/tree/master/code" target="_blank"&gt;online code samples&lt;/a&gt; and much more. See the full release notes at the &lt;a href="https://github.com/logicforprogrammers/book-assets/blob/master/CHANGELOG.md" target="_blank"&gt;changelog page&lt;/a&gt;, and &lt;a href="https://leanpub.com/logic/" target="_blank"&gt;get the book here&lt;/a&gt;!&lt;/p&gt;
    &lt;p&gt;&lt;img alt="The new cover! It's a lot nicer" class="newsletter-image" src="https://assets.buttondown.email/images/038a7092-5dc7-41a5-9a16-56bdef8b5d58.jpg?w=400&amp;amp;fit=max"/&gt;&lt;/p&gt;
    &lt;h2&gt;Write the cleverest code you possibly can&lt;/h2&gt;
    &lt;p&gt;There are millions of articles online about how programmers should not write "clever" code, and instead write simple, maintainable code that everybody understands. Sometimes the example of "clever" code looks like this (&lt;a href="https://codegolf.stackexchange.com/questions/57617/is-this-number-a-prime/57682#57682" target="_blank"&gt;src&lt;/a&gt;):&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="c1"&gt;# Python&lt;/span&gt;
    
    &lt;span class="n"&gt;p&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;n&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;
    &lt;span class="n"&gt;exec&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;"p*=n*n;n+=1;"&lt;/span&gt;&lt;span class="o"&gt;*~-&lt;/span&gt;&lt;span class="nb"&gt;int&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;input&lt;/span&gt;&lt;span class="p"&gt;()))&lt;/span&gt;
    &lt;span class="nb"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;p&lt;/span&gt;&lt;span class="o"&gt;%&lt;/span&gt;&lt;span class="n"&gt;n&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;This is code-golfing, the sport of writing the most concise code possible. Obviously you shouldn't run this in production for the same reason you shouldn't eat dinner off a Rembrandt. &lt;/p&gt;
    &lt;p&gt;Other times the example looks like this:&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="k"&gt;def&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nf"&gt;is_prime&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;x&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="kc"&gt;False&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nb"&gt;all&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="o"&gt;%&lt;/span&gt;&lt;span class="n"&gt;n&lt;/span&gt; &lt;span class="o"&gt;!=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;n&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;)])&lt;/span&gt;
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;This is "clever" because it uses a single list comprehension, as opposed to a "simple" for loop. Yes, "list comprehensions are too clever" is something I've read in one of these articles. &lt;/p&gt;
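    &lt;p&gt;For comparison, here is a sketch of what the "simple" for-loop version might look like next to the comprehension. The function names are mine, not taken from any of those articles; both do the same trial division.&lt;/p&gt;

```python
def is_prime_loop(x):
    # The "simple" version: an explicit loop with an early exit.
    if x == 1:
        return False
    for n in range(2, x):
        if x % n == 0:
            return False
    return True

def is_prime_comprehension(x):
    # The "clever" version from above: one list comprehension.
    if x == 1:
        return False
    return all([x % n != 0 for n in range(2, x)])

print([x for x in range(1, 20) if is_prime_loop(x)])
print([x for x in range(1, 20) if is_prime_comprehension(x)])
```

    &lt;p&gt;The two agree on every positive integer; the only real difference is whether the reader is comfortable with comprehensions.&lt;/p&gt;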
    &lt;p&gt;I've also talked to people who think that datatypes besides lists and hashmaps are too clever to use, that most optimizations are too clever to bother with, and even that functions and classes are too clever and code should be a linear script.&lt;sup id="fnref:grad-students"&gt;&lt;a class="footnote-ref" href="#fn:grad-students"&gt;1&lt;/a&gt;&lt;/sup&gt; Clever code is anything using features or domain concepts we don't understand. Something that seems unbearably clever to me might be utterly mundane for you, and vice versa. &lt;/p&gt;
    &lt;p&gt;How do we make something utterly mundane? By using it and working at the boundaries of our skills. Almost everything I'm "good at" comes from banging my head against it more than is healthy. That suggests a really good reason to write clever code: it's an excellent form of purposeful practice. Writing clever code forces us to code outside of our comfort zone, developing our skills as software engineers. &lt;/p&gt;
    &lt;blockquote&gt;
    &lt;p&gt;Debugging is twice as hard as writing the code in the first place. Therefore, if you write the code as cleverly as possible, you [will get excellent debugging practice at exactly the right level required to push your skills as a software engineer] — Brian Kernighan, probably&lt;/p&gt;
    &lt;/blockquote&gt;
    &lt;p&gt;There are other benefits, too, but first let's kill the elephant in the room:&lt;sup id="fnref:bajillion"&gt;&lt;a class="footnote-ref" href="#fn:bajillion"&gt;2&lt;/a&gt;&lt;/sup&gt;&lt;/p&gt;
    &lt;h3&gt;Don't &lt;em&gt;commit&lt;/em&gt; clever code&lt;/h3&gt;
    &lt;p&gt;I am proposing writing clever code as a means of practice. Being at work is a &lt;em&gt;job&lt;/em&gt; with coworkers who will not appreciate it if your code is too clever. Similarly, don't use &lt;a href="https://mcfunley.com/choose-boring-technology" target="_blank"&gt;too many innovative technologies&lt;/a&gt;. Don't put anything in production you are &lt;em&gt;uncomfortable&lt;/em&gt; with.&lt;/p&gt;
    &lt;p&gt;We can still responsibly write clever code at work, though: &lt;/p&gt;
    &lt;ol&gt;
    &lt;li&gt;Solve a problem in both a simple and a clever way, and then only commit the simple way. This works well for small scale problems where trying the "clever way" only takes a few minutes.&lt;/li&gt;
    &lt;li&gt;Write our &lt;em&gt;personal&lt;/em&gt; tools cleverly. I'm a big believer in the idea that most programmers would benefit from writing more scripts and support code customized to their particular work environment. This is a great place to practice new techniques, languages, etc.&lt;/li&gt;
    &lt;li&gt;If clever code is absolutely the best way to solve a problem, then commit it with &lt;strong&gt;extensive documentation&lt;/strong&gt; explaining how it works and why it's preferable to simpler solutions. Bonus: this potentially helps the whole team upskill.&lt;/li&gt;
    &lt;/ol&gt;
    &lt;h2&gt;Writing clever code...&lt;/h2&gt;
    &lt;div class="subscribe-form"&gt;&lt;/div&gt;
    &lt;h3&gt;...teaches simple solutions&lt;/h3&gt;
    &lt;p&gt;Usually, code that's called too clever composes several powerful features together — the "not a single list comprehension or function" people are the exception. &lt;a href="https://www.joshwcomeau.com/career/clever-code-considered-harmful/" target="_blank"&gt;Josh Comeau's&lt;/a&gt; "don't write clever code" article gives this example of "too clever":&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="kd"&gt;const&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nx"&gt;extractDataFromResponse&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;response&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;=&amp;gt;&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="kd"&gt;const&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nx"&gt;Component&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nx"&gt;props&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nx"&gt;response&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    
    &lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="kd"&gt;const&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nx"&gt;resultsEntries&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nb"&gt;Object&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;entries&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nx"&gt;Component&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nx"&gt;props&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;});&lt;/span&gt;
    &lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="kd"&gt;const&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nx"&gt;assignIfValueTruthy&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;o&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nx"&gt;k&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nx"&gt;v&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;=&amp;gt;&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;v&lt;/span&gt;
    &lt;span class="w"&gt;    &lt;/span&gt;&lt;span class="o"&gt;?&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;...&lt;/span&gt;&lt;span class="nx"&gt;o&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nx"&gt;k&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nx"&gt;v&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="w"&gt;    &lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nx"&gt;o&lt;/span&gt;
    &lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    
    &lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="k"&gt;return&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nx"&gt;resultsEntries&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;reduce&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;assignIfValueTruthy&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{});&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;What makes this "clever"? I count seven language features composed together: &lt;code&gt;entries&lt;/code&gt;, argument unpacking, implicit objects, splats, ternaries, higher-order functions, and reductions. Would code that used only one or two of these features still be "clever"? I don't think so. These features exist for a reason, and oftentimes they make code simpler than not using them.&lt;/p&gt;
    &lt;p&gt;We can, of course, learn these features one at a time. Writing the clever version (but not &lt;em&gt;committing it&lt;/em&gt;) gives us practice with all seven at once and also with how they compose together. That knowledge comes in handy when we want to apply a single one of the ideas.&lt;/p&gt;
    &lt;p&gt;I've recently had to do a bit of pandas for a project. Whenever I have to do a new analysis, I try to write it as a single chain of transformations, and then as a more balanced set of updates.&lt;/p&gt;
    &lt;h3&gt;...helps us master concepts&lt;/h3&gt;
    &lt;p&gt;Even if the composite parts of a "clever" solution aren't by themselves useful, it still makes us better at the overall language, and that's inherently valuable. A few years ago I wrote &lt;a href="https://www.hillelwayne.com/post/python-abc/" target="_blank"&gt;Crimes with Python's Pattern Matching&lt;/a&gt;. It involves writing horrible code like this:&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nn"&gt;abc&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;ABC&lt;/span&gt;
    
    &lt;span class="k"&gt;class&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nc"&gt;NotIterable&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ABC&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    
        &lt;span class="nd"&gt;@classmethod&lt;/span&gt;
        &lt;span class="k"&gt;def&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nf"&gt;__subclasshook__&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="bp"&gt;cls&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;C&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
            &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="ow"&gt;not&lt;/span&gt; &lt;span class="nb"&gt;hasattr&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;C&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;"__iter__"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    
    &lt;span class="k"&gt;def&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nf"&gt;f&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="k"&gt;match&lt;/span&gt; &lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="k"&gt;case&lt;/span&gt; &lt;span class="n"&gt;NotIterable&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
                &lt;span class="nb"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s2"&gt; is not iterable"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="k"&gt;case&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="k"&gt;_&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
                &lt;span class="nb"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s2"&gt; is iterable"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="vm"&gt;__name__&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="s2"&gt;"__main__"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;"string"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;This composes Python match statements, which are broadly useful, and abstract base classes, which are incredibly niche. But even if I never use ABCs in real production code, it helped me understand Python's match semantics and &lt;a href="https://docs.python.org/3/howto/mro.html#python-2-3-mro" target="_blank"&gt;Method Resolution Order&lt;/a&gt; better. &lt;/p&gt;
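    &lt;p&gt;As a small illustration of the mechanism involved: a class pattern like &lt;code&gt;case NotIterable():&lt;/code&gt; boils down to an &lt;code&gt;isinstance&lt;/code&gt; check, and for an ABC that check consults &lt;code&gt;__subclasshook__&lt;/code&gt;. The sketch below reuses the same class and calls &lt;code&gt;isinstance&lt;/code&gt; directly, without &lt;code&gt;match&lt;/code&gt;.&lt;/p&gt;

```python
from abc import ABC

class NotIterable(ABC):

    @classmethod
    def __subclasshook__(cls, C):
        # Called with the class of the object being checked.
        return not hasattr(C, "__iter__")

# The same check that the class pattern performs under the hood.
print(isinstance(10, NotIterable))         # True
print(isinstance("string", NotIterable))   # False
print(isinstance([1, 2, 3], NotIterable))  # False
```

    &lt;p&gt;None of these types inherit from &lt;code&gt;NotIterable&lt;/code&gt;; the hook alone decides the answer, which is what makes the trick both powerful and horrible.&lt;/p&gt;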
    &lt;h3&gt;...prepares us for necessity&lt;/h3&gt;
    &lt;p&gt;Sometimes the clever way is the &lt;em&gt;only&lt;/em&gt; way. Maybe we need something faster than the simplest solution. Maybe we are working with constrained tools or frameworks that demand cleverness. Peter Norvig argued that design patterns compensate for missing language features. I'd argue that cleverness is another means of compensating: if our tools don't have an easy way to do something, we need to find a clever way.&lt;/p&gt;
    &lt;p&gt;You see this a lot in formal methods like TLA+. Need to check a hyperproperty? &lt;a href="https://www.hillelwayne.com/post/graphing-tla/" target="_blank"&gt;Cast your state space to a directed graph&lt;/a&gt;. Need to compose ten specifications together? &lt;a href="https://www.hillelwayne.com/post/composing-tla/" target="_blank"&gt;Combine refinements with state machines&lt;/a&gt;. Most difficult problems have a "clever" solution. The real problem is that clever solutions have a skill floor. If normal use of the tool is at difficulty 3 out of 10, then basic clever solutions are at 5 out of 10, and it's hard to jump those two steps in the moment you need the cleverness.&lt;/p&gt;
    &lt;p&gt;But if you've practiced with writing overly clever code, you're used to working at a 7 out of 10 level in short bursts, and then you can "drop down" to 5/10. I don't know if that makes too much sense, but I see it happen a lot in practice.&lt;/p&gt;
    &lt;h3&gt;...builds comradery&lt;/h3&gt;
    &lt;p&gt;On a few occasions, after getting a pull request merged, I pulled the reviewer over and said "check out this horrible way of doing the same thing". I find that as long as people know they're not going to be subjected to a clever solution in production, they enjoy seeing it!&lt;/p&gt;
    &lt;p&gt;&lt;em&gt;Next week's newsletter will probably also be late. After that we should be back to a regular schedule for the rest of the summer.&lt;/em&gt;&lt;/p&gt;
    &lt;div class="footnote"&gt;
    &lt;hr/&gt;
    &lt;ol&gt;
    &lt;li id="fn:grad-students"&gt;
    &lt;p&gt;Mostly grad students outside of CS who have to write scripts to do research. And more than one data scientist. I think it's correlated with using Jupyter. &lt;a class="footnote-backref" href="#fnref:grad-students" title="Jump back to footnote 1 in the text"&gt;↩&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id="fn:bajillion"&gt;
    &lt;p&gt;If I don't put this at the beginning, I'll get a bajillion responses like "your team will hate you" &lt;a class="footnote-backref" href="#fnref:bajillion" title="Jump back to footnote 2 in the text"&gt;↩&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;/ol&gt;
    &lt;/div&gt;
    </description><pubDate>Thu, 08 May 2025 15:04:42 +0000</pubDate><guid>https://buttondown.com/hillelwayne/archive/write-the-most-clever-code-you-possibly-can/</guid></item><item><title>Requirements change until they don't</title><link>https://buttondown.com/hillelwayne/archive/requirements-change-until-they-dont/</link><description>
    &lt;p&gt;Recently I got a question on formal methods&lt;sup id="fnref:fs"&gt;&lt;a class="footnote-ref" href="#fn:fs"&gt;1&lt;/a&gt;&lt;/sup&gt;: how does it help to mathematically model systems when the system requirements are constantly changing? It doesn't make sense to spend a lot of time proving a design works, and then deliver the product and find out it's not at all what the client needs. As the saying goes, the hard part is "building the right thing", not "building the thing right".&lt;/p&gt;
    &lt;p&gt;One possible response: "why write tests"? You shouldn't write tests, &lt;em&gt;especially&lt;/em&gt; &lt;a href="https://en.wikipedia.org/wiki/Test-driven_development" target="_blank"&gt;lots of unit tests ahead of time&lt;/a&gt;, if you might just throw them all away when the requirements change.&lt;/p&gt;
    &lt;p&gt;This is a bad response because we all know the difference between writing tests and formal methods: testing is &lt;em&gt;easy&lt;/em&gt; and FM is &lt;em&gt;hard&lt;/em&gt;. Testing requires low cost for moderate correctness, FM requires high(ish) cost for high correctness. And when requirements are constantly changing, "high(ish) cost" isn't affordable and "high correctness" isn't worthwhile, because a kinda-okay solution that solves a customer's problem is infinitely better than a solid solution that doesn't.&lt;/p&gt;
    &lt;p&gt;But eventually you get something that solves the problem, and what then?&lt;/p&gt;
    &lt;p&gt;Most of us don't work for Google, we can't axe features and products &lt;a href="https://killedbygoogle.com/" target="_blank"&gt;on a whim&lt;/a&gt;. If the client is happy with your solution, you are expected to support it. It should work when your customers run into new edge cases, or migrate all their computers to the next OS version, or expand into a market with shoddy internet. It should work when 10x as many customers are using 10x as many features. It should work when &lt;a href="https://www.hillelwayne.com/post/feature-interaction/" target="_blank"&gt;you add new features that come into conflict&lt;/a&gt;. &lt;/p&gt;
    &lt;p&gt;And just as importantly, &lt;em&gt;it should never stop solving their problem&lt;/em&gt;. Canonical example: your feature involves processing requested tasks synchronously. At scale, this doesn't work, so to improve latency you make it asynchronous. Now it's eventually consistent, but your customers were depending on it being always consistent. Now it no longer does what they need, and has stopped solving their problems.&lt;/p&gt;
    &lt;p&gt;Every successful requirement met spawns a new requirement: "keep this working". That requirement is permanent, or close enough to decide our long-term strategy. It takes active investment to keep a feature behaving the same as the world around it changes.&lt;/p&gt;
    &lt;p&gt;(Is this all a pretentious way of saying "software maintenance is hard?" Maybe!)&lt;/p&gt;
    &lt;h3&gt;Phase changes&lt;/h3&gt;
    &lt;div class="subscribe-form"&gt;&lt;/div&gt;
    &lt;p&gt;In physics there's a concept of a &lt;a href="https://en.wikipedia.org/wiki/Phase_transition" target="_blank"&gt;phase transition&lt;/a&gt;. To raise the temperature of a gram of liquid water by 1°C, you have to add 4.184 joules of energy.&lt;sup id="fnref:calorie"&gt;&lt;a class="footnote-ref" href="#fn:calorie"&gt;2&lt;/a&gt;&lt;/sup&gt; This continues until you raise it to 100°C, then the temperature stops rising. After you've added another two &lt;em&gt;thousand&lt;/em&gt; joules to that gram, it suddenly turns into steam. The energy of the system changes continuously but the form, or phase, changes discretely.&lt;/p&gt;
    &lt;p&gt;&lt;img alt="Phase_diagram_of_water_simplified.svg.png (from above link)" class="newsletter-image" src="https://assets.buttondown.email/images/31676a33-be6a-4c6d-a96f-425723dcb0d5.png?w=960&amp;amp;fit=max"/&gt;&lt;/p&gt;
    &lt;p class="empty-line" style="height:16px; margin:0px !important;"&gt;&lt;/p&gt;
    &lt;p&gt;Software isn't physics but the idea works as a metaphor. A certain architecture handles a certain level of load, and past that you need a new architecture. Or a bunch of similar features are independently hardcoded until the system becomes too messy to understand, and you remodel the internals into something unified and extendable. And so on. It doesn't have to be a totally discrete phase transition, but there's definitely a "before" and "after" in the system form. &lt;/p&gt;
    &lt;p&gt;Phase changes tend to lead to more intricacy/complexity in the system, meaning it's likely that a phase change will introduce new bugs into existing behaviors. Take the synchronous vs asynchronous case. A very simple toy model of synchronous updates would be &lt;code&gt;Set(key, val)&lt;/code&gt;, which updates &lt;code&gt;db[key]&lt;/code&gt; to &lt;code&gt;val&lt;/code&gt;.&lt;sup id="fnref:tla"&gt;&lt;a class="footnote-ref" href="#fn:tla"&gt;3&lt;/a&gt;&lt;/sup&gt; A model of asynchronous updates would be &lt;code&gt;AsyncSet(key, val, priority)&lt;/code&gt;, which adds a &lt;code&gt;(key, val, priority, server_time())&lt;/code&gt; tuple to a &lt;code&gt;tasks&lt;/code&gt; set; another process then asynchronously pulls a tuple (ordered by highest priority, then earliest time) and calls &lt;code&gt;Set(key, val)&lt;/code&gt;. Here are some properties the client may need preserved as a requirement: &lt;/p&gt;
    &lt;ul&gt;
    &lt;li&gt;If &lt;code&gt;AsyncSet(key, val, _)&lt;/code&gt; is called, then &lt;em&gt;eventually&lt;/em&gt; &lt;code&gt;db[key] = val&lt;/code&gt; (possibly violated if higher-priority tasks keep coming in)&lt;/li&gt;
    &lt;li&gt;If someone calls &lt;code&gt;AsyncSet(key1, val1, low)&lt;/code&gt; and then &lt;code&gt;AsyncSet(key2, val2, low)&lt;/code&gt;, they should see the first update and then the second (linearizability, possibly violated if the requests go to different servers with different clock times)&lt;/li&gt;
    &lt;li&gt;If someone calls &lt;code&gt;AsyncSet(key, val, _)&lt;/code&gt; and &lt;em&gt;immediately&lt;/em&gt; reads &lt;code&gt;db[key]&lt;/code&gt; they should get &lt;code&gt;val&lt;/code&gt; (obviously violated, though the client may accept a &lt;em&gt;slightly&lt;/em&gt; weaker property)&lt;/li&gt;
    &lt;/ul&gt;
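    &lt;p&gt;To make the toy model concrete, here is a minimal Python sketch of both update paths. The class name &lt;code&gt;ToyStore&lt;/code&gt;, the method names, and the counter standing in for &lt;code&gt;server_time()&lt;/code&gt; are all assumptions made for illustration, and a single in-process heap stands in for the &lt;code&gt;tasks&lt;/code&gt; set. It is not a substitute for a formal specification.&lt;/p&gt;

```python
import heapq
import itertools


class ToyStore:
    """Toy model of the synchronous and asynchronous update paths."""

    def __init__(self):
        self.data = {}
        self.tasks = []                  # heap of pending async updates
        self._clock = itertools.count()  # hypothetical stand-in for server_time()

    def set(self, key, val):
        # Synchronous update: visible as soon as the call returns.
        self.data[key] = val

    def async_set(self, key, val, priority):
        # Negate the priority so the highest priority pops first;
        # ties fall back to the earliest timestamp.
        heapq.heappush(self.tasks, (-priority, next(self._clock), key, val))

    def run_one_task(self):
        # The background worker: apply one pending update, if there is one.
        if not self.tasks:
            return False
        _, _, key, val = heapq.heappop(self.tasks)
        self.set(key, val)
        return True


store = ToyStore()
store.async_set("x", 1, priority=0)
read_immediately = store.data.get("x")   # the worker has not run yet
store.run_one_task()
read_after_worker = store.data.get("x")  # the update has now been applied
```

    &lt;p&gt;Even this tiny version shows the third property failing: the read right after &lt;code&gt;async_set&lt;/code&gt; does not see the new value until the worker has run.&lt;/p&gt;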
    &lt;p&gt;If the new system doesn't satisfy an existing customer requirement, it's prudent to fix the bug &lt;em&gt;before&lt;/em&gt; releasing the new system. The customer doesn't notice or care that your system underwent a phase change. They'll just see that one day your product solves their problems, and the next day it suddenly doesn't. &lt;/p&gt;
    &lt;p&gt;This is one of the most common applications of formal methods. Both of those systems, and every one of those properties, are formally specifiable in a specification language. We can then automatically check that the new system satisfies the existing properties, and from there do things like &lt;a href="https://arxiv.org/abs/2006.00915" target="_blank"&gt;automatically generate test suites&lt;/a&gt;. This does take a lot of work, so if your requirements are constantly changing, FM may not be worth the investment. But eventually requirements &lt;em&gt;stop&lt;/em&gt; changing, and then you're stuck with them forever. That's where models shine.&lt;/p&gt;
    &lt;div class="footnote"&gt;
    &lt;hr/&gt;
    &lt;ol&gt;
    &lt;li id="fn:fs"&gt;
    &lt;p&gt;As always, I'm using formal methods to mean the subdiscipline of formal specification of designs, leaving out the formal verification of code. Mostly because "formal specification" is really awkward to say. &lt;a class="footnote-backref" href="#fnref:fs" title="Jump back to footnote 1 in the text"&gt;↩&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id="fn:calorie"&gt;
    &lt;p&gt;Also called a "calorie". The US "dietary Calorie" is actually a kilocalorie. &lt;a class="footnote-backref" href="#fnref:calorie" title="Jump back to footnote 2 in the text"&gt;↩&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id="fn:tla"&gt;
    &lt;p&gt;This is all directly translatable to a TLA+ specification; I'm just describing it in English to avoid paying the syntax tax. &lt;a class="footnote-backref" href="#fnref:tla" title="Jump back to footnote 3 in the text"&gt;↩&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;/ol&gt;
    &lt;/div&gt;
    </description><pubDate>Thu, 24 Apr 2025 11:00:00 +0000</pubDate><guid>https://buttondown.com/hillelwayne/archive/requirements-change-until-they-dont/</guid></item><item><title>The Halting Problem is a terrible example of NP-Harder</title><link>https://buttondown.com/hillelwayne/archive/the-halting-problem-is-a-terrible-example-of-np/</link><description>
    &lt;p&gt;&lt;em&gt;Short one this time because I have a lot going on this week.&lt;/em&gt;&lt;/p&gt;
    &lt;p&gt;In computational complexity, &lt;strong&gt;NP&lt;/strong&gt; is the class of all decision problems (yes/no) where a potential proof (or "witness") for "yes" can be &lt;em&gt;verified&lt;/em&gt; in polynomial time. For example, "does this set of numbers have a nonempty subset that sums to zero" is in NP. If the answer is "yes", you can prove it by presenting such a subset. We would then verify the witness by 1) checking that all the numbers are present in the set (~linear time) and 2) adding up all the numbers (also linear).&lt;/p&gt;
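    &lt;p&gt;The two verification steps can be sketched in a few lines of Python. The function name is made up for illustration, and two details are assumptions on my part: the input is treated as a multiset (so duplicates count), and the empty witness is rejected because it would trivially sum to zero.&lt;/p&gt;

```python
from collections import Counter


def verify_subset_sum_witness(numbers, witness):
    """Check a claimed "yes" witness for zero-sum subset in linear time."""
    if not witness:
        return False  # assumption: the trivial empty subset does not count
    # 1) Every witness element must come from the input, respecting duplicates.
    #    Counter subtraction keeps only the elements the input cannot cover.
    missing = Counter(witness) - Counter(numbers)
    if missing:
        return False
    # 2) The witness must actually sum to zero.
    return sum(witness) == 0
```

    &lt;p&gt;Note that this only checks a witness someone hands us. Finding one in the first place is the hard part.&lt;/p&gt;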
    &lt;p&gt;&lt;strong&gt;NP-complete&lt;/strong&gt; is the class of "hardest possible" NP problems. Subset sum is NP-complete. &lt;strong&gt;NP-hard&lt;/strong&gt; is the set of all problems &lt;em&gt;at least as hard&lt;/em&gt; as NP-complete. Notably, NP-hard is &lt;em&gt;not&lt;/em&gt; a subset of NP, as it contains problems that are &lt;em&gt;harder&lt;/em&gt; than NP-complete. A natural question to ask is "like what?" And the canonical example of "NP-harder" is the halting problem (HALT): does program P halt on input C? As the argument goes, it's undecidable, so obviously not in NP.&lt;/p&gt;
    &lt;p&gt;I think this is a bad example for two reasons:&lt;/p&gt;
    &lt;ol&gt;&lt;li&gt;&lt;p&gt;All NP requires is that witnesses for "yes" can be verified in polynomial time. It does not require anything for the "no" case! And even though HALT is undecidable, there &lt;em&gt;is&lt;/em&gt; a decidable way to verify a "yes": let the witness be "it halts in N steps", then run the program for that many steps and see if it halted by then. To prove HALT is not in NP, you have to show that this verification process grows faster than polynomially. It does (as &lt;a href="https://en.wikipedia.org/wiki/Busy_beaver" rel="noopener noreferrer nofollow" target="_blank"&gt;busy beaver&lt;/a&gt; is uncomputable), but this all makes the example needlessly confusing.&lt;sup id="fnref:1"&gt;&lt;a class="footnote-ref" data-id="37347adc-dba6-4629-9d24-c6252292ac6b" data-reference-number="1" href="#fn:1"&gt;1&lt;/a&gt;&lt;/sup&gt;&lt;/p&gt;&lt;/li&gt;&lt;li&gt;&lt;p&gt;"What's bigger than a dog? THE MOON"&lt;/p&gt;&lt;/li&gt;&lt;/ol&gt;
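    &lt;p&gt;The "halts in N steps" check from (1) can be sketched as a bounded simulation. Purely as an illustrative assumption, a "program" here is a Python generator function and each &lt;code&gt;yield&lt;/code&gt; counts as one step; &lt;code&gt;halts_within&lt;/code&gt;, &lt;code&gt;countdown&lt;/code&gt;, and &lt;code&gt;loop_forever&lt;/code&gt; are made-up names.&lt;/p&gt;

```python
def halts_within(program, program_input, max_steps):
    """Verify the witness "halts within max_steps steps" by running the program."""
    steps = program(program_input)
    for _ in range(max_steps):
        try:
            next(steps)
        except StopIteration:
            return True   # the program finished: witness confirmed
    return False          # still running: this witness proves nothing


def countdown(n):
    # Halts after n steps.
    while n > 0:
        n -= 1
        yield


def loop_forever(_):
    # Never halts, so no step bound will ever confirm it.
    while True:
        yield


confirmed = halts_within(countdown, 3, 10)
unconfirmed = halts_within(loop_forever, None, 1000)
```

    &lt;p&gt;The check itself always terminates, but its cost is linear in N, and N is not bounded by any polynomial in the size of the program and input.&lt;/p&gt;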
    &lt;p&gt;Really (2) bothers me a lot more than (1) because it's just so inelegant. It suggests that NP-complete is the upper bound of "solvable" problems, and after that you're in full-on undecidability. I'd rather show intuitive problems that are harder than NP but not &lt;em&gt;that&lt;/em&gt; much harder.&lt;/p&gt;
    &lt;p&gt;But in looking for a "slightly harder" problem, I ran into an, ah, problem. It &lt;em&gt;seems&lt;/em&gt; like the next-hardest class would be &lt;a href="https://en.wikipedia.org/wiki/EXPTIME" rel="noopener noreferrer nofollow" target="_blank"&gt;EXPTIME&lt;/a&gt;, except we don't know &lt;em&gt;for sure&lt;/em&gt; that NP != EXPTIME. We know &lt;em&gt;for sure&lt;/em&gt; that NP != &lt;a href="https://en.wikipedia.org/wiki/NEXPTIME" rel="noopener noreferrer nofollow" target="_blank"&gt;NEXPTIME&lt;/a&gt;, but NEXPTIME doesn't have any intuitive, easily explainable problems. Most "definitely harder than NP" problems require a nontrivial background in theoretical computer science or mathematics to understand.&lt;/p&gt;
    &lt;p&gt;There is one problem, though, that I find easily explainable. Place a token at the bottom left corner of a grid that extends infinitely up and right, call that point (0, 0). You're given a list of valid displacement moves for the token, like &lt;code&gt;(+1, +0)&lt;/code&gt;, &lt;code&gt;(-20, +13)&lt;/code&gt;, &lt;code&gt;(-5, -6)&lt;/code&gt;, etc, and a target point like &lt;code&gt;(700, 1)&lt;/code&gt;. You may make any sequence of moves in any order, as long as no move ever puts the token off the grid. Does any sequence of moves bring you to the target?&lt;/p&gt;
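    &lt;p&gt;To get a feel for the rules, here is a small Python sketch that searches breadth-first for the target using at most &lt;code&gt;max_steps&lt;/code&gt; moves. The function name and the step bound are assumptions for illustration. This is &lt;em&gt;not&lt;/em&gt; a decision procedure for the problem: a &lt;code&gt;False&lt;/code&gt; only means "not found within the bound".&lt;/p&gt;

```python
from collections import deque


def reachable_within(moves, target, max_steps):
    """Breadth-first search for the target using at most max_steps moves."""
    start = (0, 0)
    seen = {start}
    frontier = deque([(start, 0)])
    while frontier:
        (x, y), depth = frontier.popleft()
        if (x, y) == target:
            return True
        if depth == max_steps:
            continue  # out of budget along this path
        for dx, dy in moves:
            nxt = (x + dx, y + dy)
            # A move may never put the token off the grid.
            if min(nxt) >= 0 and nxt not in seen:
                seen.add(nxt)
                frontier.append((nxt, depth + 1))
    return False


# (2, -1) is illegal from the origin, so the token has to move up first.
found = reachable_within([(2, -1), (0, 1)], (2, 0), max_steps=2)
```

    &lt;p&gt;The hard part of the real problem is that the shortest path to the target can be absurdly long, so no small bound is safe in general.&lt;/p&gt;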
    &lt;div class="subscribe-form"&gt;&lt;/div&gt;
    &lt;p&gt;This is PSPACE-complete, I think, which still isn't proven to be harder than NP-complete (though it's widely believed). But what if you increase the number of dimensions of the grid? Past a certain number of dimensions the problem jumps to being EXPSPACE-complete, and then TOWER-complete (grows &lt;a href="https://en.wikipedia.org/wiki/Tetration" rel="noopener noreferrer nofollow" target="_blank"&gt;tetrationally&lt;/a&gt;), and then it keeps going. Some people might recognize this as looking a lot like the &lt;a href="https://en.wikipedia.org/wiki/Ackermann_function" rel="noopener noreferrer nofollow" target="_blank"&gt;Ackermann function&lt;/a&gt;, and in fact this problem is &lt;a href="https://arxiv.org/abs/2104.13866" rel="noopener noreferrer nofollow" target="_blank"&gt;ACKERMANN-complete on the number of available dimensions&lt;/a&gt;.&lt;/p&gt;
    &lt;p&gt;&lt;a href="https://www.quantamagazine.org/an-easy-sounding-problem-yields-numbers-too-big-for-our-universe-20231204/" rel="noopener noreferrer nofollow" target="_blank"&gt;A friend wrote a Quanta article about the whole mess&lt;/a&gt;, you should read it.&lt;/p&gt;
    &lt;p&gt;This problem is ludicrously bigger than NP ("Chicago" instead of "The Moon"), but at least it's clearly decidable, easily explainable, and definitely &lt;em&gt;not&lt;/em&gt; in NP.&lt;/p&gt;
    &lt;div class="footnote"&gt;&lt;hr/&gt;&lt;ol class="footnotes"&gt;&lt;li data-id="37347adc-dba6-4629-9d24-c6252292ac6b" id="fn:1"&gt;&lt;p&gt;It's less confusing if you're taught the alternate (and original!) definition of NP, "the class of problems solvable in polynomial time by a nondeterministic Turing machine". Then HALT can't be in NP because otherwise runtime would be bounded by an exponential function. &lt;a class="footnote-backref" href="#fnref:1"&gt;↩&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;&lt;/ol&gt;&lt;/div&gt;
    </description><pubDate>Wed, 16 Apr 2025 17:39:23 +0000</pubDate><guid>https://buttondown.com/hillelwayne/archive/the-halting-problem-is-a-terrible-example-of-np/</guid></item><item><title>Solving a "Layton Puzzle" with Prolog</title><link>https://buttondown.com/hillelwayne/archive/a48fce5b-8a05-4302-b620-9b26f057f145/</link><description>
    &lt;p&gt;I have a lot in the works for this month's &lt;a href="https://leanpub.com/logic/" target="_blank"&gt;Logic for Programmers&lt;/a&gt; release. Among other things, I'm completely rewriting the chapter on Logic Programming Languages. &lt;/p&gt;
    &lt;p&gt;I originally showcased the paradigm with puzzle solvers, like &lt;a href="https://swish.swi-prolog.org/example/queens.pl" target="_blank"&gt;eight queens&lt;/a&gt; or &lt;a href="https://saksagan.ceng.metu.edu.tr/courses/ceng242/documents/prolog/jrfisher/2_1.html" target="_blank"&gt;four-coloring&lt;/a&gt;. Lots of other demos do this too! It takes creativity and insight for humans to solve them, so a program doing it feels magical. But I'm trying to write a book about practical techniques and I want everything I talk about to be &lt;em&gt;useful&lt;/em&gt;. So in v0.9 I'll be replacing these examples with a couple of new programs that might get people thinking that Prolog could help them in their day-to-day work.&lt;/p&gt;
    &lt;p&gt;On the other hand, for a newsletter, showcasing a puzzle solver is pretty cool. And recently I stumbled into &lt;a href="https://morepablo.com/2010/09/some-professor-layton-prolog.html" target="_blank"&gt;this post&lt;/a&gt; by my friend &lt;a href="https://morepablo.com/" target="_blank"&gt;Pablo Meier&lt;/a&gt;, where he solves a videogame puzzle with Prolog:&lt;sup id="fnref:path"&gt;&lt;a class="footnote-ref" href="#fn:path"&gt;1&lt;/a&gt;&lt;/sup&gt;&lt;/p&gt;
    &lt;p&gt;&lt;img alt="See description below" class="newsletter-image" src="https://assets.buttondown.email/images/a4ee8689-bbce-4dc9-8175-a1de3bd8f2db.png?w=960&amp;amp;fit=max"/&gt;&lt;/p&gt;
    &lt;p&gt;Summary for the text-only readers: We have a test with 10 true/false questions (denoted &lt;code&gt;a/b&lt;/code&gt;) and four student attempts. Given the scores of the first three students, we have to figure out the fourth student's score.&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;bbababbabb = 7
    baaababaaa = 5
    baaabbbaba = 3
    bbaaabbaaa = ???
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;You can see Pablo's solution &lt;a href="https://morepablo.com/2010/09/some-professor-layton-prolog.html" target="_blank"&gt;here&lt;/a&gt;, and try it in SWI-Prolog &lt;a href="https://swish.swi-prolog.org/p/Some%20Professor%20Layton%20Prolog.pl" target="_blank"&gt;here&lt;/a&gt;. Pretty cool! But after way too long studying Prolog just to write this dang book chapter, I wanted to see if I could do it more elegantly than him. Code and puzzle spoilers to follow.&lt;/p&gt;
    &lt;p&gt;(Normally here's where I'd link to a gentler introduction I wrote but I think this is my first time writing about Prolog online? Uh here's a &lt;a href="https://www.hillelwayne.com/post/picat/" target="_blank"&gt;Picat intro&lt;/a&gt; instead)&lt;/p&gt;
    &lt;h3&gt;The Program&lt;/h3&gt;
    &lt;p&gt;You can try this all online at &lt;a href="https://swish.swi-prolog.org/p/" target="_blank"&gt;SWISH&lt;/a&gt; or just jump to my final version &lt;a href="https://swish.swi-prolog.org/p/layton_prolog_puzzle.pl" target="_blank"&gt;here&lt;/a&gt;.&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="p"&gt;:-&lt;/span&gt; &lt;span class="nf"&gt;use_module&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;library&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s s-Atom"&gt;dif&lt;/span&gt;&lt;span class="p"&gt;)).&lt;/span&gt;    &lt;span class="c1"&gt;% Sound inequality&lt;/span&gt;
    &lt;span class="p"&gt;:-&lt;/span&gt; &lt;span class="nf"&gt;use_module&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;library&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s s-Atom"&gt;clpfd&lt;/span&gt;&lt;span class="p"&gt;)).&lt;/span&gt;  &lt;span class="c1"&gt;% Finite domain constraints&lt;/span&gt;
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;First some imports. &lt;code&gt;dif&lt;/code&gt; lets us write &lt;code&gt;dif(A, B)&lt;/code&gt;, which is true if &lt;code&gt;A&lt;/code&gt; and &lt;code&gt;B&lt;/code&gt; are &lt;em&gt;not&lt;/em&gt; equal. &lt;code&gt;clpfd&lt;/code&gt; lets us write &lt;code&gt;A #= B + 1&lt;/code&gt; to say "A is 1 more than B".&lt;sup id="fnref:superior"&gt;&lt;a class="footnote-ref" href="#fn:superior"&gt;2&lt;/a&gt;&lt;/sup&gt;&lt;/p&gt;
    &lt;p&gt;We'll say both the student submission and the key will be lists, where each value is &lt;code&gt;a&lt;/code&gt; or &lt;code&gt;b&lt;/code&gt;. In Prolog, lowercase identifiers are &lt;strong&gt;atoms&lt;/strong&gt; (like symbols in other languages) and identifiers that start with a capital are &lt;strong&gt;variables&lt;/strong&gt;. Prolog finds values for variables that match equations (&lt;strong&gt;unification&lt;/strong&gt;). The pattern matching is real real good.&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="c1"&gt;% ?- means query&lt;/span&gt;
    &lt;span class="s s-Atom"&gt;?-&lt;/span&gt; &lt;span class="nv"&gt;L&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s s-Atom"&gt;a&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="nv"&gt;B&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="s s-Atom"&gt;c&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;Y&lt;/span&gt;&lt;span class="p"&gt;|&lt;/span&gt;&lt;span class="nv"&gt;X&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;|&lt;/span&gt;&lt;span class="nv"&gt;L&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="nv"&gt;B&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt; &lt;span class="s s-Atom"&gt;#=&lt;/span&gt; &lt;span class="mf"&gt;7.&lt;/span&gt;
    
    &lt;span class="nv"&gt;B&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;6&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="nv"&gt;L&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s s-Atom"&gt;a&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;6&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;c&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
    &lt;span class="nv"&gt;X&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;a&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;6&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;c&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
    &lt;span class="nv"&gt;Y&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;Next, we define &lt;code&gt;score/3&lt;/code&gt;&lt;sup id="fnref:arity"&gt;&lt;a class="footnote-ref" href="#fn:arity"&gt;3&lt;/a&gt;&lt;/sup&gt; recursively. &lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="c1"&gt;% The student's test score&lt;/span&gt;
    &lt;span class="c1"&gt;% score(student answers, answer key, score)&lt;/span&gt;
    &lt;span class="nf"&gt;score&lt;/span&gt;&lt;span class="p"&gt;([],&lt;/span&gt; &lt;span class="p"&gt;[],&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;
    &lt;span class="nf"&gt;score&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="nv"&gt;A&lt;/span&gt;&lt;span class="p"&gt;|&lt;/span&gt;&lt;span class="nv"&gt;As&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;A&lt;/span&gt;&lt;span class="p"&gt;|&lt;/span&gt;&lt;span class="nv"&gt;Ks&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="nv"&gt;N&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;:-&lt;/span&gt;
       &lt;span class="nv"&gt;N&lt;/span&gt; &lt;span class="s s-Atom"&gt;#=&lt;/span&gt; &lt;span class="nv"&gt;M&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nf"&gt;score&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;As&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;Ks&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;M&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;
    &lt;span class="nf"&gt;score&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="nv"&gt;A&lt;/span&gt;&lt;span class="p"&gt;|&lt;/span&gt;&lt;span class="nv"&gt;As&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;K&lt;/span&gt;&lt;span class="p"&gt;|&lt;/span&gt;&lt;span class="nv"&gt;Ks&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="nv"&gt;N&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;:-&lt;/span&gt; 
        &lt;span class="nf"&gt;dif&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;A&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;K&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="nf"&gt;score&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;As&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;Ks&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;N&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;The first argument is the student's answers, the second is the answer key, and the third is the final score. The base case is the empty test, which has score 0. Otherwise, we take the head values of each list and compare them. If they're the same, we add one to the score, otherwise we keep the same score. &lt;/p&gt;
    &lt;p&gt;Notice we couldn't write &lt;code&gt;if x then y else z&lt;/code&gt;, we instead used pattern matching to effectively express &lt;code&gt;(x &amp;amp;&amp;amp; y) || (!x &amp;amp;&amp;amp; z)&lt;/code&gt;. Prolog does have a conditional operator, but it prevents backtracking so what's the point???&lt;/p&gt;
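    &lt;p&gt;For comparison, here is roughly what the same scoring rule looks like as an ordinary function in Python. This is a sketch for readers more at home in imperative languages, not part of the Prolog program. Unlike the predicate, it only runs in one direction: answers and key in, score out.&lt;/p&gt;

```python
def score(answers, key):
    """Count the positions where a student's answers match the answer key."""
    if len(answers) != len(key):
        raise ValueError("answers and key must have the same length")
    return sum(1 for a, k in zip(answers, key) if a == k)


example = score("abb", "bbb")  # the last two positions match
```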
    &lt;p class="empty-line" style="height:16px; margin:0px !important;"&gt;&lt;/p&gt;
    &lt;h3&gt;A quick break about bidirectionality&lt;/h3&gt;
    &lt;p&gt;One of the coolest things about Prolog: all purely logical predicates are bidirectional. We can use &lt;code&gt;score&lt;/code&gt; to check if our expected score is correct:&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="s s-Atom"&gt;?-&lt;/span&gt; &lt;span class="nf"&gt;score&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="s s-Atom"&gt;a&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;b&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;b&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s s-Atom"&gt;b&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;b&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;b&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;
    &lt;span class="s s-Atom"&gt;true&lt;/span&gt;
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;But we can also give it answers and a key and ask it for the score:&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="s s-Atom"&gt;?-&lt;/span&gt; &lt;span class="nf"&gt;score&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="s s-Atom"&gt;a&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;b&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;b&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s s-Atom"&gt;b&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;b&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;b&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="nv"&gt;X&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;
    &lt;span class="nv"&gt;X&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;&lt;em&gt;Or&lt;/em&gt; we could give it a key and a score and ask "what test answers would have this score?"&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="s s-Atom"&gt;?-&lt;/span&gt; &lt;span class="nf"&gt;score&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;X&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s s-Atom"&gt;b&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;b&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;b&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;
    &lt;span class="nv"&gt;X&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s s-Atom"&gt;b&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;b&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;_&lt;/span&gt;&lt;span class="nv"&gt;A&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
    &lt;span class="nf"&gt;dif&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;_&lt;/span&gt;&lt;span class="nv"&gt;A&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="s s-Atom"&gt;b&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="nv"&gt;X&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s s-Atom"&gt;b&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;_&lt;/span&gt;&lt;span class="nv"&gt;A&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;b&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
    &lt;span class="nf"&gt;dif&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;_&lt;/span&gt;&lt;span class="nv"&gt;A&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="s s-Atom"&gt;b&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="nv"&gt;X&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="k"&gt;_&lt;/span&gt;&lt;span class="nv"&gt;A&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;b&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;b&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
    &lt;span class="nf"&gt;dif&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;_&lt;/span&gt;&lt;span class="nv"&gt;A&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="s s-Atom"&gt;b&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;The different value is written &lt;code&gt;_A&lt;/code&gt; because we never told Prolog that the list can &lt;em&gt;only&lt;/em&gt; contain &lt;code&gt;a&lt;/code&gt; and &lt;code&gt;b&lt;/code&gt;. We'll fix this later.&lt;/p&gt;
    &lt;h3&gt;Okay back to the program&lt;/h3&gt;
    &lt;p&gt;Now that we have a way of computing scores, we want to find a possible answer key that matches all of our observations, ie gives everybody the correct scores.&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="nf"&gt;key&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;Key&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;:-&lt;/span&gt;
        &lt;span class="c1"&gt;% Figure it out&lt;/span&gt;
        &lt;span class="nf"&gt;score&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="s s-Atom"&gt;b&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;b&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;a&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;b&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;a&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;b&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;b&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;a&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;b&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;b&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="nv"&gt;Key&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;7&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
        &lt;span class="nf"&gt;score&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="s s-Atom"&gt;b&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;a&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;a&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;a&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;b&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;a&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;b&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;a&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;a&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;a&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="nv"&gt;Key&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
        &lt;span class="nf"&gt;score&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="s s-Atom"&gt;b&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;a&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;a&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;a&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;b&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;b&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;b&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;a&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;b&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;a&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="nv"&gt;Key&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
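    &lt;p&gt;As a sanity check on whatever Prolog finds, the same constraints can be brute-forced in Python, since there are only 2^10 possible keys. This is a cross-check sketch rather than the approach this post is about. The names &lt;code&gt;OBSERVATIONS&lt;/code&gt;, &lt;code&gt;matches&lt;/code&gt;, and &lt;code&gt;candidate_keys&lt;/code&gt; are made up, and it bakes in the assumptions that the key has length 10 and contains only &lt;code&gt;a&lt;/code&gt; and &lt;code&gt;b&lt;/code&gt;.&lt;/p&gt;

```python
from itertools import product

# The three graded attempts from the puzzle statement.
OBSERVATIONS = [
    ("bbababbabb", 7),
    ("baaababaaa", 5),
    ("baaabbbaba", 3),
]


def matches(answers, key):
    """Number of positions where the answers agree with the key."""
    return sum(1 for a, k in zip(answers, key) if a == k)


def candidate_keys():
    """Yield every a/b key of length 10 consistent with all three scores."""
    for letters in product("ab", repeat=10):
        key = "".join(letters)
        if all(matches(answers, key) == n for answers, n in OBSERVATIONS):
            yield key


keys = list(candidate_keys())
# The fourth student's score under each surviving key.
fourth_scores = {matches("bbaaabbaaa", key) for key in keys}
```

    &lt;p&gt;If &lt;code&gt;fourth_scores&lt;/code&gt; ends up with a single element, the puzzle has a unique answer even when the key itself is not unique.&lt;/p&gt;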
    &lt;p&gt;So far we haven't explicitly said that the &lt;code&gt;Key&lt;/code&gt; length matches the student answer lengths. This is implicitly verified by &lt;code&gt;score&lt;/code&gt; (both lists need to be empty at the same time) but it's a good idea to explicitly add &lt;code&gt;length(Key, 10)&lt;/code&gt; as a clause of &lt;code&gt;key/1&lt;/code&gt;. We should also explicitly say that every element of &lt;code&gt;Key&lt;/code&gt; is either &lt;code&gt;a&lt;/code&gt; or &lt;code&gt;b&lt;/code&gt;.&lt;sup id="fnref:explicit"&gt;&lt;a class="footnote-ref" href="#fn:explicit"&gt;4&lt;/a&gt;&lt;/sup&gt; Now we &lt;em&gt;could&lt;/em&gt; write a second predicate saying &lt;code&gt;Key&lt;/code&gt; had the right 'type': &lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;keytype([]).
    keytype([K|Ks]) :- member(K, [a, b]), keytype(Ks).
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;But "generating lists that match a constraint" is a thing that comes up often enough that we don't want to write a separate predicate for each constraint! So after some digging, I found a more elegant solution: &lt;code&gt;maplist&lt;/code&gt;. Let &lt;code&gt;L=[l1, l2]&lt;/code&gt;. Then &lt;code&gt;maplist(p, L)&lt;/code&gt; is equivalent to the clause &lt;code&gt;p(l1), p(l2)&lt;/code&gt;. It also accepts partial predicates: &lt;code&gt;maplist(p(x), L)&lt;/code&gt; is equivalent to &lt;code&gt;p(x, l1), p(x, l2)&lt;/code&gt;. So we could write&lt;sup id="fnref:yall"&gt;&lt;a class="footnote-ref" href="#fn:yall"&gt;5&lt;/a&gt;&lt;/sup&gt;&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="nf"&gt;contains&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;L&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;X&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;:-&lt;/span&gt; &lt;span class="nf"&gt;member&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;X&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;L&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;
    
    &lt;span class="nf"&gt;key&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;Key&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;:-&lt;/span&gt;
        &lt;span class="nf"&gt;length&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;Key&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
        &lt;span class="nf"&gt;maplist&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;contains&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="s s-Atom"&gt;a&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="s s-Atom"&gt;b&lt;/span&gt;&lt;span class="p"&gt;]),&lt;/span&gt; &lt;span class="nv"&gt;Key&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
        &lt;span class="c1"&gt;% the score stuff&lt;/span&gt;
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;Now, let's query for the Key:&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="s s-Atom"&gt;?-&lt;/span&gt; &lt;span class="nf"&gt;key&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;Key&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;
    &lt;span class="nv"&gt;Key&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s s-Atom"&gt;a&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;b&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;a&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;b&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;a&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;a&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;b&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;a&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;a&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;b&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
    &lt;span class="nv"&gt;Key&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s s-Atom"&gt;b&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;b&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;a&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;b&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;a&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;a&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;a&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;a&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;a&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;b&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
    &lt;span class="nv"&gt;Key&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s s-Atom"&gt;b&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;b&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;a&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;b&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;a&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;a&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;b&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;b&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;a&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;b&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
    &lt;span class="nv"&gt;Key&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s s-Atom"&gt;b&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;b&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;b&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;b&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;a&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;a&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;b&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;a&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;a&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;b&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;So there are actually four &lt;em&gt;different&lt;/em&gt; keys that all explain our data. Does this mean the puzzle is broken and has multiple different answers?&lt;/p&gt;
    &lt;h3&gt;Nope&lt;/h3&gt;
    &lt;p&gt;The puzzle wasn't to find out what the answer key was, the point was to find the fourth student's score. And if we query for it, we see all four solutions give him the same score:&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="s s-Atom"&gt;?-&lt;/span&gt; &lt;span class="nf"&gt;key&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;Key&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="nf"&gt;score&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="s s-Atom"&gt;b&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;b&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;a&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;a&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;a&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;b&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;b&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;a&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;a&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;a&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="nv"&gt;Key&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;X&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;
    &lt;span class="nv"&gt;X&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;6&lt;/span&gt;
    &lt;span class="nv"&gt;X&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;6&lt;/span&gt;
    &lt;span class="nv"&gt;X&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;6&lt;/span&gt;
    &lt;span class="nv"&gt;X&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;6&lt;/span&gt;
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;Huh! I really like it when puzzles look like they're broken, but every "alternate" solution still gives the same puzzle answer.&lt;/p&gt;
    &lt;p&gt;Total program length: 15 lines of code, compared to the original's 80 lines. &lt;em&gt;Suck it, Pablo.&lt;/em&gt;&lt;/p&gt;
    &lt;p&gt;(Incidentally, you can get all of the answers at once by writing &lt;code&gt;findall(X, (key(Key), score($answer-array, Key, X)), L).&lt;/code&gt;) &lt;/p&gt;
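    &lt;p&gt;The same collection step can be sketched in Python using the four keys printed earlier. The &lt;code&gt;score&lt;/code&gt; function here is an assumption: it counts the positions where a student's answer matches the key, which agrees with the output shown but is not copied from the Prolog program.&lt;/p&gt;

```python
# The four keys returned by the key(Key) query.
candidate_keys = [
    "ababaabaab",
    "bbabaaaaab",
    "bbabaabbab",
    "bbbbaabaab",
]

# The fourth student's answers, taken from the score query.
fourth_student = "bbaaabbaaa"


def score(answers, key):
    # Assumed scoring rule: one point per position that matches the key.
    return sum(1 for given, correct in zip(answers, key) if given == correct)


# Rough analogue of findall/3: collect the score under every candidate key.
scores = [score(fourth_student, key) for key in candidate_keys]
print(scores)  # [6, 6, 6, 6]
```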
    &lt;p class="empty-line" style="height:16px; margin:0px !important;"&gt;&lt;/p&gt;
    &lt;h3&gt;I still don't like puzzles for teaching&lt;/h3&gt;
    &lt;p&gt;The actual examples I'm using in &lt;a href="https://leanpub.com/logic/" target="_blank"&gt;the book&lt;/a&gt; are "analyzing a version control commit graph" and "planning a sequence of infrastructure changes", which are somewhat more likely to occur at work than needing to solve a puzzle. You'll see them in the next release!&lt;/p&gt;
    &lt;div class="footnote"&gt;
    &lt;hr/&gt;
    &lt;ol&gt;
    &lt;li id="fn:path"&gt;
    &lt;p&gt;I found it because he wrote &lt;a href="https://morepablo.com/2025/04/gamer-games-for-lite-gamers.html" target="_blank"&gt;Gamer Games for Lite Gamers&lt;/a&gt; as a response to my &lt;a href="https://www.hillelwayne.com/post/vidja-games/" target="_blank"&gt;Gaming Games for Non-Gamers&lt;/a&gt;. &lt;a class="footnote-backref" href="#fnref:path" title="Jump back to footnote 1 in the text"&gt;↩&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id="fn:superior"&gt;
    &lt;p&gt;These are better versions of the core Prolog expressions &lt;code&gt;\+ (A = B)&lt;/code&gt; and &lt;code&gt;A is B + 1&lt;/code&gt;, because they can &lt;a href="https://eu.swi-prolog.org/pldoc/man?predicate=dif/2" target="_blank"&gt;defer unification&lt;/a&gt;. &lt;a class="footnote-backref" href="#fnref:superior" title="Jump back to footnote 2 in the text"&gt;↩&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id="fn:arity"&gt;
    &lt;p&gt;Prolog-descendants have a convention of writing the arity of the function after its name, so &lt;code&gt;score/3&lt;/code&gt; means "score has three parameters". I think they do this because you can overload predicates with multiple different arities. Also Joe Armstrong used Prolog for prototyping, so Erlang and Elixir follow the same convention. &lt;a class="footnote-backref" href="#fnref:arity" title="Jump back to footnote 3 in the text"&gt;↩&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id="fn:explicit"&gt;
    &lt;p&gt;It &lt;em&gt;still&lt;/em&gt; gets the right answers without this type restriction, but I had no idea it did until I checked for myself. Probably better not to rely on this! &lt;a class="footnote-backref" href="#fnref:explicit" title="Jump back to footnote 4 in the text"&gt;↩&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id="fn:yall"&gt;
    &lt;p&gt;We could make this even more compact by using a lambda function. First import module &lt;code&gt;yall&lt;/code&gt;, then write &lt;code&gt;maplist([X]&amp;gt;&amp;gt;member(X, [a,b]), Key)&lt;/code&gt;. But (1) it's not a shorter program because you replace the extra definition with an extra module import, and (2) &lt;code&gt;yall&lt;/code&gt; is SWI-Prolog specific and not an ISO-standard prolog module. Using &lt;code&gt;contains&lt;/code&gt; is more portable. &lt;a class="footnote-backref" href="#fnref:yall" title="Jump back to footnote 5 in the text"&gt;↩&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;/ol&gt;
    &lt;/div&gt;
    </description><pubDate>Tue, 08 Apr 2025 18:34:50 +0000</pubDate><guid>https://buttondown.com/hillelwayne/archive/a48fce5b-8a05-4302-b620-9b26f057f145/</guid></item><item><title>[April Cools] Gaming Games for Non-Gamers</title><link>https://buttondown.com/hillelwayne/archive/april-cools-gaming-games-for-non-gamers/</link><description>
    &lt;p&gt;My &lt;em&gt;April Cools&lt;/em&gt; is out! &lt;a href="https://www.hillelwayne.com/post/vidja-games/" target="_blank"&gt;Gaming Games for Non-Gamers&lt;/a&gt; is a 3,000 word essay on video games worth playing if you've never enjoyed a video game before. &lt;a href="https://www.patreon.com/posts/blog-notes-gamer-125654321?utm_medium=clipboard_copy&amp;amp;utm_source=copyLink&amp;amp;utm_campaign=postshare_creator&amp;amp;utm_content=join_link" target="_blank"&gt;Patreon notes here&lt;/a&gt;.&lt;/p&gt;
    &lt;p&gt;(April Cools is a project where we write genuine content on non-normal topics. You can see all the other April Cools posted so far &lt;a href="https://www.aprilcools.club/" target="_blank"&gt;here&lt;/a&gt;. There's still time to submit your own!)&lt;/p&gt;
    &lt;a class="embedded-link" href="https://www.aprilcools.club/"&gt; &lt;div style="width: 100%; background: #fff; border: 1px #ced3d9 solid; border-radius: 5px; margin-top: 1em; overflow: auto; margin-bottom: 1em;"&gt; &lt;div style="float: left; border-bottom: 1px #ced3d9 solid;"&gt; &lt;img class="link-image" src="https://www.aprilcools.club/aprilcoolsclub.png"/&gt; &lt;/div&gt; &lt;div style="float: left; color: #393f48; padding-left: 1em; padding-right: 1em;"&gt; &lt;h4 class="link-title" style="margin-bottom: 0em; line-height: 1.25em; margin-top: 1em; font-size: 14px;"&gt;                April Cools' Club&lt;/h4&gt; &lt;/div&gt; &lt;/div&gt;&lt;/a&gt;
    </description><pubDate>Tue, 01 Apr 2025 16:04:59 +0000</pubDate><guid>https://buttondown.com/hillelwayne/archive/april-cools-gaming-games-for-non-gamers/</guid></item><item><title>Betteridge's Law of Software Engineering Specialness</title><link>https://buttondown.com/hillelwayne/archive/betteridges-law-of-software-engineering/</link><description>
    &lt;h3&gt;Logic for Programmers v0.8 now out!&lt;/h3&gt;
    &lt;p&gt;The new release has minor changes: new formatting for notes and a better introduction to predicates. I would have rolled it all into v0.9 next month but I like the monthly cadence. &lt;a href="https://leanpub.com/logic/" target="_blank"&gt;Get it here!&lt;/a&gt;&lt;/p&gt;
    &lt;h1&gt;Betteridge's Law of Software Engineering Specialness&lt;/h1&gt;
    &lt;p&gt;In &lt;a href="https://agileotter.blogspot.com/2025/03/there-is-no-automatic-reset-in.html" target="_blank"&gt;There is No Automatic Reset in Engineering&lt;/a&gt;, Tim Ottinger asks:&lt;/p&gt;
    &lt;blockquote&gt;
    &lt;p&gt;Do the other people have to live with January 2013 for the rest of their lives? Or is it only engineering that has to deal with every dirty hack since the beginning of the organization?&lt;/p&gt;
    &lt;/blockquote&gt;
    &lt;p&gt;&lt;strong&gt;Betteridge's Law of Headlines&lt;/strong&gt; says that if a journalism headline ends with a question mark, the answer is probably "no". I propose a similar law relating to software engineering specialness:&lt;sup id="fnref:ottinger"&gt;&lt;a class="footnote-ref" href="#fn:ottinger"&gt;1&lt;/a&gt;&lt;/sup&gt;&lt;/p&gt;
    &lt;blockquote&gt;
    &lt;p&gt;If someone asks if some aspect of software development is truly unique to just software development, the answer is probably "no".&lt;/p&gt;
    &lt;/blockquote&gt;
    &lt;p&gt;Take the idea that "in software, hacks are forever." My favorite example of this comes from a different profession. The &lt;a href="https://en.wikipedia.org/wiki/Dewey_Decimal_Classification" target="_blank"&gt;Dewey Decimal System&lt;/a&gt; hierarchically categorizes books by discipline. For example, &lt;em&gt;&lt;a href="https://www.librarything.com/work/10143437/t/Covered-Bridges-of-Pennsylvania" target="_blank"&gt;Covered Bridges of Pennsylvania&lt;/a&gt;&lt;/em&gt; has Dewey number &lt;code&gt;624.37&lt;/code&gt;. &lt;code&gt;6--&lt;/code&gt; is the technology discipline, &lt;code&gt;62-&lt;/code&gt; is engineering, &lt;code&gt;624&lt;/code&gt; is civil engineering, and &lt;code&gt;624.3&lt;/code&gt; is "special types of bridges". I have no idea what the last &lt;code&gt;0.07&lt;/code&gt; means, but you get the picture.&lt;/p&gt;
    &lt;p&gt;Now if you look at the &lt;a href="https://www.librarything.com/mds/6" target="_blank"&gt;6-- "technology" breakdown&lt;/a&gt;, you'll see that there's no "software" subdiscipline. This is because Dewey preallocated the whole technology block in 1876. New topics were instead to be added to the &lt;code&gt;00-&lt;/code&gt; "general-knowledge" catch-all. Eventually &lt;code&gt;005&lt;/code&gt; was assigned to "software development", meaning &lt;em&gt;The C Programming Language&lt;/em&gt; lives at &lt;code&gt;005.133&lt;/code&gt;. &lt;/p&gt;
    &lt;p&gt;Incidentally, another late addition to the general knowledge block is &lt;code&gt;001.9&lt;/code&gt;: "controversial knowledge". &lt;/p&gt;
    &lt;p&gt;And that's why my hometown library shelved the C++ books right next to &lt;em&gt;The Mothman Prophecies&lt;/em&gt;.&lt;/p&gt;
    &lt;p&gt;How's &lt;em&gt;that&lt;/em&gt; for technical debt?&lt;/p&gt;
    &lt;p&gt;If anything, fixing hacks in software is significantly &lt;em&gt;easier&lt;/em&gt; than in other fields. This came up when I was &lt;a href="https://www.hillelwayne.com/post/we-are-not-special/" target="_blank"&gt;interviewing classic engineers&lt;/a&gt;. Kludges happened all the time, but "refactoring" them out is &lt;em&gt;expensive&lt;/em&gt;. Need to house a machine that's just two inches taller than the room? Guess what, you're cutting a hole in the ceiling.&lt;/p&gt;
    &lt;p&gt;(Even if we restrict the question to other departments in a &lt;em&gt;software company&lt;/em&gt;, we can find kludges that are horrible to undo. I once worked for a company which landed an early contract by adding a bespoke support agreement for that one customer. That plagued them for years afterward.)&lt;/p&gt;
    &lt;p&gt;That's not to say that there aren't things that are different about software vs other fields!&lt;sup id="fnref:example"&gt;&lt;a class="footnote-ref" href="#fn:example"&gt;2&lt;/a&gt;&lt;/sup&gt;  But I think that &lt;em&gt;most&lt;/em&gt; of the time, when we say "software development is the only profession that deals with XYZ", it's only because we're ignorant of how those other professions work.&lt;/p&gt;
    &lt;hr/&gt;
    &lt;p&gt;Short newsletter because I'm way behind on writing my &lt;a href="https://www.aprilcools.club/" target="_blank"&gt;April Cools&lt;/a&gt;. If you're interested in April Cools, you should try it out! I make it &lt;em&gt;way&lt;/em&gt; harder on myself than it actually needs to be— everybody else who participates finds it pretty chill.&lt;/p&gt;
    &lt;div class="footnote"&gt;
    &lt;hr/&gt;
    &lt;ol&gt;
    &lt;li id="fn:ottinger"&gt;
    &lt;p&gt;Ottinger caveats it with "engineering, software or otherwise", so I think he knows that other branches of &lt;em&gt;engineering&lt;/em&gt;, at least, have kludges. &lt;a class="footnote-backref" href="#fnref:ottinger" title="Jump back to footnote 1 in the text"&gt;↩&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id="fn:example"&gt;
    &lt;p&gt;The "software is different" idea that I'm most sympathetic to is that in software, the tools we use and the products we create are made from the same material. That's unusual at least in classic engineering. Then again, plenty of machinists have made their own lathes and mills! &lt;a class="footnote-backref" href="#fnref:example" title="Jump back to footnote 2 in the text"&gt;↩&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;/ol&gt;
    &lt;/div&gt;
    </description><pubDate>Wed, 26 Mar 2025 18:48:39 +0000</pubDate><guid>https://buttondown.com/hillelwayne/archive/betteridges-law-of-software-engineering/</guid></item><item><title>Verification-First Development</title><link>https://buttondown.com/hillelwayne/archive/verification-first-development/</link><description>
    &lt;p&gt;A while back I argued on the Blue Site&lt;sup id="fnref:li"&gt;&lt;a class="footnote-ref" href="#fn:li"&gt;1&lt;/a&gt;&lt;/sup&gt; that "test-first development" (TFD) was different than "test-driven development" (TDD). The former is "write tests before you write code", the latter is a paradigm, culture, and collection of norms that's based on TFD. More broadly, TFD is a special case of &lt;strong&gt;Verification-First Development&lt;/strong&gt; and TDD is not.&lt;/p&gt;
    &lt;blockquote&gt;
    &lt;p&gt;VFD: before writing code, put in place some means of verifying that the code is correct, or at least have an idea of what you'll do.&lt;/p&gt;
    &lt;/blockquote&gt;
    &lt;p&gt;"Verifying" could mean writing tests, or figuring out how to encode invariants in types, or &lt;a href="https://blog.regehr.org/archives/1091" target="_blank"&gt;adding contracts&lt;/a&gt;, or &lt;a href="https://learntla.com/" target="_blank"&gt;making a formal model&lt;/a&gt;, or writing a separate script that checks the output of the program. Just have &lt;em&gt;something&lt;/em&gt; appropriate in place that you can run as you go building the code. Ideally, we'd have verification in place for every interesting property, but that's rarely possible in practice. &lt;/p&gt;
    &lt;p&gt;Oftentimes we can't make the verification until the code is partially complete. In that case it still helps to figure out the verification we'll write later. The point is to have a &lt;em&gt;plan&lt;/em&gt; and follow it promptly.&lt;/p&gt;
    &lt;p&gt;I'm using "code" as a standin for anything we programmers make, not just software programs. When using constraint solvers, I try to find representative problems I know the answers to. When writing formal specifications, I figure out the system's properties before the design that satisfies those properties. There's probably equivalents in security and other topics, too.&lt;/p&gt;
    &lt;h3&gt;The Benefits of VFD&lt;/h3&gt;
    &lt;ol&gt;
    &lt;li&gt;Doing verification before coding makes it less likely we'll skip verification entirely. It's the professional equivalent of "No TV until you do your homework."&lt;/li&gt;
    &lt;li&gt;It's easier to make sure a verifier works properly if we start by running it on code we know doesn't pass it. Bebugging working code takes more discipline.&lt;/li&gt;
    &lt;li&gt;We can run checks earlier in the development process. It's better to realize that our code is broken five minutes after we broke it rather than two hours after.&lt;/li&gt;
    &lt;/ol&gt;
    &lt;p&gt;That's it, those are the benefits of verification-first development. Those are also &lt;em&gt;big&lt;/em&gt; benefits for relatively little investment. Specializations of VFD like test-first development can have more benefits, but also more drawbacks.&lt;/p&gt;
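    &lt;p&gt;As a rough illustration of benefit 2, here is a sketch in Python. The task (deduplicating a list) and all the names are hypothetical. The checker is written first and run against a stub that is known to be wrong, so we can see it fail before trusting it on the real implementation.&lt;/p&gt;

```python
def check_dedupe(dedupe, cases):
    """Return the inputs for which the output is not a valid deduplication."""
    failures = []
    for case in cases:
        out = dedupe(list(case))
        no_repeats = len(out) == len(set(out))
        same_elements = set(out) == set(case)
        if not (no_repeats and same_elements):
            failures.append(case)
    return failures


cases = [[], [1], [1, 1, 2], [3, 1, 3, 2, 1]]


def dedupe_stub(items):
    # Deliberately wrong placeholder: returns the input unchanged.
    return items


# Step 1: confirm the verifier rejects code we know is broken.
print(check_dedupe(dedupe_stub, cases))  # [[1, 1, 2], [3, 1, 3, 2, 1]]


def dedupe(items):
    # Step 2: the real implementation, keeping the first occurrence of each item.
    return list(dict.fromkeys(items))


print(check_dedupe(dedupe, cases))  # []
```

    &lt;p&gt;The checker here is a standalone script rather than a unit test, which is one of several forms the verification could take.&lt;/p&gt;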
    &lt;h3&gt;The drawbacks of VFD&lt;/h3&gt;
    &lt;ol&gt;
    &lt;li&gt;It slows us down. I know lots of people say that "no actually it makes you go faster in the long run," but that's the &lt;em&gt;long&lt;/em&gt; run. Sometimes we do marathons, sometimes we sprint.&lt;/li&gt;
    &lt;li&gt;Verification gets in the way of exploratory coding, where we don't know what exactly we want or how exactly to do something.&lt;/li&gt;
    &lt;li&gt;Any specific form of verification exerts a pressure on our code to make it easier to verify with that method. For example, if we're mostly verifying via type invariants, we need to figure out how to express those things in our language's type system, which may not be suited for the specific invariants we need.&lt;sup id="fnref:sphinx"&gt;&lt;a class="footnote-ref" href="#fn:sphinx"&gt;2&lt;/a&gt;&lt;/sup&gt;&lt;/li&gt;
    &lt;/ol&gt;
    &lt;h2&gt;Whether "pressure" is a real drawback is incredibly controversial&lt;/h2&gt;
    &lt;p&gt;If I had to summarize what makes "test-driven development" different from VFD:&lt;sup id="fnref:tdd"&gt;&lt;a class="footnote-ref" href="#fn:tdd"&gt;3&lt;/a&gt;&lt;/sup&gt;&lt;/p&gt;
    &lt;ol&gt;
    &lt;li&gt;The form of verification should specifically be tests, and unit tests at that&lt;/li&gt;
    &lt;li&gt;Testing pressure is invariably good. "Making your code easier to unit test" is the same as "making your code better".&lt;/li&gt;
    &lt;/ol&gt;
    &lt;p&gt;This is something all of the various "drivens"— TDD, Type Driven Development, Design by Contract— share in common, this idea that the purpose of the paradigm is to exert pressure. Lots of TDD experts claim that "having a good test suite" is only the secondary benefit of TDD and the real benefit is how it improves code quality.&lt;sup id="fnref:docs"&gt;&lt;a class="footnote-ref" href="#fn:docs"&gt;4&lt;/a&gt;&lt;/sup&gt;&lt;/p&gt;
    &lt;p&gt;Whether they're right or not is not something I want to argue: I've seen these approaches all improve my code structure, but also sometimes worsen it. Regardless, I consider pressure a drawback to VFD in general, for a somewhat idiosyncratic reason. If it &lt;em&gt;weren't&lt;/em&gt; for pressure, VFD would be wholly independent of the code itself. It would &lt;em&gt;just&lt;/em&gt; be about verification, and our decisions would exclusively be about how we want to verify. But the design pressure means that our means of verification affects the system we're checking. What if these conflict in some way?&lt;/p&gt;
    &lt;h3&gt;VFD is a technique, not a paradigm&lt;/h3&gt;
    &lt;p&gt;One of the main differences between "techniques" and "paradigms" is that paradigms don't play well with each other. If you tried to do both "proper" Test-Driven Development and "proper" Cleanroom, your head would explode. Whereas VFD being a "technique" means it works well with other techniques and even with many full paradigms.&lt;/p&gt;
    &lt;p&gt;It also doesn't take a whole lot of practice to start using. It does take practice, both in thinking of verifications and in using the particular verification method involved, to &lt;em&gt;use well&lt;/em&gt;, but we can use it poorly and still benefit.&lt;/p&gt;
    &lt;div class="footnote"&gt;
    &lt;hr/&gt;
    &lt;ol&gt;
    &lt;li id="fn:li"&gt;
    &lt;p&gt;LinkedIn, what did you think I meant? &lt;a class="footnote-backref" href="#fnref:li" title="Jump back to footnote 1 in the text"&gt;↩&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id="fn:sphinx"&gt;
    &lt;p&gt;This bit me in the butt when making my own &lt;a href="https://www.sphinx-doc.org/en/master/" target="_blank"&gt;sphinx&lt;/a&gt; extensions. The official guides do things in a highly dynamic way that Mypy can't statically check. I had to do things in a completely different way. Ended up being better though! &lt;a class="footnote-backref" href="#fnref:sphinx" title="Jump back to footnote 2 in the text"&gt;↩&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id="fn:tdd"&gt;
    &lt;p&gt;Someone's going to yell at me that I completely missed the point of TDD, which is XYZ. Well guess what, someone else &lt;em&gt;already&lt;/em&gt; yelled at me that only dumb idiot babies think XYZ is important in TDD. Put in whatever you want for XYZ. &lt;a class="footnote-backref" href="#fnref:tdd" title="Jump back to footnote 3 in the text"&gt;↩&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id="fn:docs"&gt;
    &lt;p&gt;Another thing that weirdly all of the paradigms claim: that they lead to better documentation. I can see the argument, I just find it strange that &lt;em&gt;every single one&lt;/em&gt; makes this claim! &lt;a class="footnote-backref" href="#fnref:docs" title="Jump back to footnote 4 in the text"&gt;↩&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;/ol&gt;
    &lt;/div&gt;
    </description><pubDate>Tue, 18 Mar 2025 16:22:20 +0000</pubDate><guid>https://buttondown.com/hillelwayne/archive/verification-first-development/</guid></item><item><title>New Blog Post: "A Perplexing Javascript Parsing Puzzle"</title><link>https://buttondown.com/hillelwayne/archive/new-blog-post-a-perplexing-javascript-parsing/</link><description>
    &lt;p&gt;I know I said we'd be back to normal newsletters this week and in fact had 80% of one already written. &lt;/p&gt;
    &lt;p&gt;Then I unearthed something that was better left buried.&lt;/p&gt;
    &lt;p&gt;&lt;a href="http://www.hillelwayne.com/post/javascript-puzzle/" target="_blank"&gt;Blog post here&lt;/a&gt;, &lt;a href="https://www.patreon.com/posts/blog-notes-124153641" target="_blank"&gt;Patreon notes here&lt;/a&gt; (Mostly an explanation of how I found this horror in the first place). Next week I'll send what was supposed to be this week's piece.&lt;/p&gt;
    &lt;p&gt;(PS: &lt;a href="https://www.aprilcools.club/" target="_blank"&gt;April Cools&lt;/a&gt; in three weeks!)&lt;/p&gt;
    </description><pubDate>Wed, 12 Mar 2025 14:49:52 +0000</pubDate><guid>https://buttondown.com/hillelwayne/archive/new-blog-post-a-perplexing-javascript-parsing/</guid></item><item><title>Five Kinds of Nondeterminism</title><link>https://buttondown.com/hillelwayne/archive/five-kinds-of-nondeterminism/</link><description>
    &lt;p&gt;No newsletter next week, I'm teaching a TLA+ workshop.&lt;/p&gt;
    &lt;p&gt;Speaking of which: I spend a lot of time thinking about formal methods (and TLA+ specifically) because it's the source of almost all my revenue. But I don't share most of the details because 90% of my readers don't use FM and never will. I think it's more interesting to talk about ideas &lt;em&gt;from&lt;/em&gt; FM that would be useful to people outside that field. For example, the idea of "property strength" translates to the &lt;a href="https://buttondown.com/hillelwayne/archive/some-tests-are-stronger-than-others/" target="_blank"&gt;idea that some tests are stronger than others&lt;/a&gt;. &lt;/p&gt;
    &lt;p&gt;Another possible export is how FM approaches nondeterminism. A &lt;strong&gt;nondeterministic&lt;/strong&gt; algorithm is one that, from the same starting conditions, has multiple possible outputs. This is nondeterministic:&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;# Pseudocode
    
    def f() {
        return rand()+1;
    }
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;When specifying systems, I may not &lt;em&gt;encounter&lt;/em&gt; nondeterminism more often than in real systems, but I am definitely more aware of its presence. Modeling nondeterminism is a core part of formal specification. I mentally categorize nondeterminism into five buckets. Caveat, this is specifically about nondeterminism from the perspective of &lt;em&gt;system modeling&lt;/em&gt;, not computer science as a whole. If I tried to include stuff on NFAs and amb operations this would be twice as long.&lt;sup id="fnref:nondeterminism"&gt;&lt;a class="footnote-ref" href="#fn:nondeterminism"&gt;1&lt;/a&gt;&lt;/sup&gt;&lt;/p&gt;
    &lt;p style="height:16px; margin:0px !important;"&gt;&lt;/p&gt;
    &lt;h2&gt;1. True Randomness&lt;/h2&gt;
    &lt;p&gt;Programs that literally make calls to a &lt;code&gt;random&lt;/code&gt; function and then use the results. This is the simplest type of nondeterminism and one of the most ubiquitous. &lt;/p&gt;
    &lt;p&gt;Most of the time, &lt;code&gt;random&lt;/code&gt; isn't &lt;em&gt;truly&lt;/em&gt; nondeterministic. Most of the time computer randomness is actually &lt;strong&gt;pseudorandom&lt;/strong&gt;, meaning we seed a deterministic algorithm that behaves "randomly-enough" for some use. You could "lift" a nondeterministic random function into a deterministic one by adding a fixed seed to the starting state.&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="c1"&gt;# Python&lt;/span&gt;
    
    &lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="nn"&gt;random&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;random&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;seed&lt;/span&gt;
    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;f&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="n"&gt;seed&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;random&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    
    &lt;span class="o"&gt;&gt;&gt;&gt;&lt;/span&gt; &lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="mf"&gt;0.23796462709189137&lt;/span&gt;
    &lt;span class="o"&gt;&gt;&gt;&gt;&lt;/span&gt; &lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="mf"&gt;0.23796462709189137&lt;/span&gt;
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;Often we don't do this because the &lt;em&gt;point&lt;/em&gt; of randomness is to provide nondeterminism! We deliberately &lt;em&gt;abstract out&lt;/em&gt; the starting state of the seed from our program, because it's easier to think about it as locally nondeterministic.&lt;/p&gt;
    &lt;p&gt;(There's also "true" randomness, like using &lt;a href="https://www.intel.com/content/www/us/en/developer/articles/guide/intel-digital-random-number-generator-drng-software-implementation-guide.html#inpage-nav-3-2" target="_blank"&gt;thermal noise&lt;/a&gt; as an entropy source, which I think are mainly used for cryptography and seeding PRNGs.)&lt;/p&gt;
    &lt;p&gt;Most formal specification languages don't deal with randomness (though some deal with &lt;a href="https://buttondown.com/hillelwayne/archive/i-formally-modeled-dreidel-for-no-good-reason/" target="_blank"&gt;probability more broadly&lt;/a&gt;). Instead, we treat it as a nondeterministic choice:&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;# software
    if rand &gt; 0.001 then return a else crash
    
    # specification
    either return a or crash
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;This is because we're looking at worst-case scenarios, so it doesn't matter if &lt;code&gt;crash&lt;/code&gt; happens 50% of the time or 0.0001% of the time, it's still possible.  &lt;/p&gt;
    &lt;h2&gt;2. Concurrency&lt;/h2&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;# Pseudocode
    global x = 1, y = 0;
    
    def thread1() {
        x++;
        x++;
        x++;
    }
    
    def thread2() {
        y := x;
    }
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;If &lt;code&gt;thread1()&lt;/code&gt; and &lt;code&gt;thread2()&lt;/code&gt; run sequentially, then (assuming the sequence is fixed) the final value of &lt;code&gt;y&lt;/code&gt; is deterministic. If the two functions are started and run simultaneously, then depending on when &lt;code&gt;thread2&lt;/code&gt; executes, &lt;code&gt;y&lt;/code&gt; can be 1, 2, 3, &lt;em&gt;or&lt;/em&gt; 4. Both functions are locally sequential, but running them concurrently leads to global nondeterminism.&lt;/p&gt;
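    &lt;p&gt;To make that concrete, here's a rough Python sketch that enumerates every interleaving of the two threads' steps and collects the final values of &lt;code&gt;y&lt;/code&gt;. It assumes each &lt;code&gt;x++&lt;/code&gt; and the read in &lt;code&gt;thread2&lt;/code&gt; are atomic, which real hardware doesn't promise; the helpers &lt;code&gt;interleavings&lt;/code&gt; and &lt;code&gt;run&lt;/code&gt; are my own illustrative names.&lt;/p&gt;

```python
def interleavings(a, b):
    """Yield every merge of step lists a and b that keeps each list's own order."""
    if not a or not b:
        yield list(a) + list(b)
        return
    for rest in interleavings(a[1:], b):
        yield [a[0]] + rest
    for rest in interleavings(a, b[1:]):
        yield [b[0]] + rest


def inc(state):
    # One atomic x++ from thread1.
    state["x"] += 1


def read(state):
    # The single step of thread2: y := x.
    state["y"] = state["x"]


def run(schedule):
    """Execute one global schedule from the initial state and return the final y."""
    state = {"x": 1, "y": 0}
    for step in schedule:
        step(state)
    return state["y"]


thread1 = [inc, inc, inc]
thread2 = [read]
outcomes = sorted({run(s) for s in interleavings(thread1, thread2)})
print(outcomes)
```

    &lt;p&gt;There are only four schedules here, one per position where the read can land, and each gives a different &lt;code&gt;y&lt;/code&gt;.&lt;/p&gt;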
    &lt;p&gt;Concurrency is arguably the most &lt;em&gt;dramatic&lt;/em&gt; source of nondeterminism. &lt;a href="https://buttondown.com/hillelwayne/archive/what-makes-concurrency-so-hard/" target="_blank"&gt;Small amounts of concurrency lead to huge explosions in the state space&lt;/a&gt;. We have words for the specific kinds of nondeterminism caused by concurrency, like "race condition" and "dirty write". Often we think about it as a separate &lt;em&gt;topic&lt;/em&gt; from nondeterminism. To some extent it "overshadows" the other kinds: I have a much easier time teaching students about concurrency in models than nondeterminism in models.&lt;/p&gt;
    &lt;p&gt;Many formal specification languages have special syntax/machinery for the concurrent aspects of a system, and generic syntax for other kinds of nondeterminism. In P that's &lt;a href="https://p-org.github.io/P/manual/expressions/#choose" target="_blank"&gt;choose&lt;/a&gt;. Others don't special-case concurrency, instead representing it as nondeterministic choices by a global coordinator. This is more flexible but also more inconvenient, as you have to implement process-local sequencing code yourself.&lt;/p&gt;
    &lt;h2&gt;3. User Input&lt;/h2&gt;
    &lt;p&gt;One of the most famous and influential programming books is &lt;em&gt;The C Programming Language&lt;/em&gt; by Kernighan and Ritchie. The first example of a nondeterministic program appears on page 14:&lt;/p&gt;
    &lt;p&gt;&lt;img alt="Picture of the book page. Code reproduced below." class="newsletter-image" src="https://assets.buttondown.email/images/94e6ad15-8d09-48df-b885-191318bfd179.jpg?w=960&amp;fit=max"/&gt;&lt;/p&gt;
    &lt;p&gt;For the newsletter readers who get text only emails,&lt;sup id="fnref:text-only"&gt;&lt;a class="footnote-ref" href="#fn:text-only"&gt;2&lt;/a&gt;&lt;/sup&gt; here's the program:&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="cp"&gt;#include&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="cpf"&gt;&lt;stdio.h&gt;&lt;/span&gt;
    &lt;span class="cm"&gt;/* copy input to output; 1st version */&lt;/span&gt;
    &lt;span class="n"&gt;main&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="w"&gt;    &lt;/span&gt;&lt;span class="kt"&gt;int&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;c&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="w"&gt;    &lt;/span&gt;&lt;span class="n"&gt;c&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;getchar&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
    &lt;span class="w"&gt;    &lt;/span&gt;&lt;span class="k"&gt;while&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;c&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;!=&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;EOF&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="w"&gt;        &lt;/span&gt;&lt;span class="n"&gt;putchar&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;c&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="w"&gt;        &lt;/span&gt;&lt;span class="n"&gt;c&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;getchar&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
    &lt;span class="w"&gt;    &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;Yup, that's nondeterministic. Because the user can enter any string, any call of &lt;code&gt;main()&lt;/code&gt; could have any output, meaning the number of possible outcomes is infinite.&lt;/p&gt;
    &lt;p&gt;Okay that seems a little cheap, and I think it's because we tend to think of determinism in terms of how the user &lt;em&gt;experiences&lt;/em&gt; the program. Yes, &lt;code&gt;main()&lt;/code&gt; has an infinite number of user inputs, but for each input the user will experience only one possible output. It starts to feel more nondeterministic when modeling a long-standing system that's &lt;em&gt;reacting&lt;/em&gt; to user input, for example a server that runs a script whenever the user uploads a file. This can be modeled with nondeterminism and concurrency: We have one execution that's the system, and one nondeterministic execution that represents the effects of our user.&lt;/p&gt;
    &lt;p&gt;(One intrusive thought I sometimes have: any "yes/no" dialogue actually has &lt;em&gt;three&lt;/em&gt; outcomes: yes, no, or the user getting up and walking away without picking a choice, permanently stalling the execution.)&lt;/p&gt;
    &lt;h2&gt;4. External forces&lt;/h2&gt;
    &lt;p&gt;The more general version of "user input": anything where either 1) some part of the execution outcome depends on retrieving external information, or 2) the external world can change some state outside of your system. I call the distinction between internal and external components of the system &lt;a href="https://www.hillelwayne.com/post/world-vs-machine/" target="_blank"&gt;the world and the machine&lt;/a&gt;. Simple examples: code that at some point reads an external temperature sensor. Unrelated code running on a system which quits programs if it gets too hot. API requests to a third party vendor. Code processing files but users can delete files before the script gets to them.&lt;/p&gt;
    &lt;p&gt;Like with PRNGs, some of these cases don't &lt;em&gt;have&lt;/em&gt; to be nondeterministic; we can argue that "the temperature" should be a virtual input into the function. Like with PRNGs, we treat it as nondeterministic because it's useful to think in that way. Also, what if the temperature changes between starting a function and reading it?&lt;/p&gt;
    &lt;p&gt;External forces are also a source of nondeterminism as &lt;em&gt;uncertainty&lt;/em&gt;. Measurements in the real world often come with errors, so repeating a measurement twice can give two different answers. Sometimes operations fail for no discernible reason, or for a non-programmatic reason (like something physically blocks the sensor).&lt;/p&gt;
    &lt;p&gt;All of these situations can be modeled in the same way as user input: a concurrent execution making nondeterministic choices.&lt;/p&gt;
    &lt;h2&gt;5. Abstraction&lt;/h2&gt;
    &lt;p&gt;This is where nondeterminism in system models and in "real software" differ the most. I said earlier that pseudorandomness is &lt;em&gt;arguably&lt;/em&gt; deterministic, but we abstract it into nondeterminism. More generally, &lt;strong&gt;nondeterminism hides implementation details of deterministic processes&lt;/strong&gt;.&lt;/p&gt;
    &lt;p&gt;In one consulting project, we had a machine that received a message, parsed a lot of data from the message, went into a complicated workflow, and then entered one of three states. The final state was totally deterministic on the content of the message, but the actual process of determining that final state took tons and tons of code. None of that mattered at the scope we were modeling, so we abstracted it all away: "on receiving message, nondeterministically enter state A, B, or C."&lt;/p&gt;
    &lt;p&gt;Doing this makes the system easier to model. It also makes the model more sensitive to possible errors. What if the workflow is bugged and sends us to the wrong state? That's already covered by the nondeterministic choice! Nondeterministic abstraction gives us the potential to pick the worst-case scenario for our system, so we can prove it's robust even under those conditions.&lt;/p&gt;
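    &lt;p&gt;A toy Python sketch of the idea, with everything in it hypothetical: &lt;code&gt;detailed_workflow&lt;/code&gt; stands in for the "tons and tons of code" (its logic is arbitrary), &lt;code&gt;abstract_workflow&lt;/code&gt; is the nondeterministic replacement, and &lt;code&gt;downstream_ok&lt;/code&gt; is some property we'd want to hold no matter which state we land in.&lt;/p&gt;

```python
STATES = ("A", "B", "C")


def detailed_workflow(message):
    # Stand-in for the real, complicated, deterministic logic (arbitrary on purpose).
    return STATES[len(message) % 3]


def abstract_workflow(message):
    # Abstraction: ignore the message and allow every final state.
    return set(STATES)


def downstream_ok(state):
    # Hypothetical property: every final state has a handler.
    handlers = {"A": "archive", "B": "bill", "C": "cancel"}
    return state in handlers


messages = ["", "x", "xy", "hello"]

# The abstraction covers whatever the detailed code does, even if that code is bugged.
covered = all(detailed_workflow(m) in abstract_workflow(m) for m in messages)

# Checking the property against the abstraction checks it for every possible behavior.
robust = all(downstream_ok(s) for m in messages for s in abstract_workflow(m))

print(covered, robust)
```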
    &lt;p&gt;I know I beat the "nondeterminism as abstraction" drum a whole lot but that's because it's the insight from formal methods I personally value the most, that nondeterminism is a powerful tool to &lt;em&gt;simplify reasoning about things&lt;/em&gt;. You can see the same approach in how I approach modeling users and external forces: complex realities black-boxed and simplified into nondeterministic forces on the system.&lt;/p&gt;
    &lt;hr/&gt;
    &lt;p&gt;Anyway, I hope this collection of ideas I got from formal methods is useful to my broader readership. Lemme know if it somehow helps you out!&lt;/p&gt;
    &lt;div class="footnote"&gt;
    &lt;hr/&gt;
    &lt;ol&gt;
    &lt;li id="fn:nondeterminism"&gt;
    &lt;p&gt;I realized after writing this that I already wrote an essay about nondeterminism in formal specification &lt;a href="https://buttondown.com/hillelwayne/archive/nondeterminism-in-formal-specification/" target="_blank"&gt;just under a year ago&lt;/a&gt;. I hope this one covers enough new ground to be interesting! &lt;a class="footnote-backref" href="#fnref:nondeterminism" title="Jump back to footnote 1 in the text"&gt;↩&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id="fn:text-only"&gt;
    &lt;p&gt;There is a surprising number of you. &lt;a class="footnote-backref" href="#fnref:text-only" title="Jump back to footnote 2 in the text"&gt;↩&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;/ol&gt;
    &lt;/div&gt;
    </description><pubDate>Wed, 19 Feb 2025 19:37:57 +0000</pubDate><guid>https://buttondown.com/hillelwayne/archive/five-kinds-of-nondeterminism/</guid></item><item><title>Are Efficiency and Horizontal Scalability at odds?</title><link>https://buttondown.com/hillelwayne/archive/are-efficiency-and-horizontal-scalability-at-odds/</link><description>
    &lt;p&gt;Sorry for missing the newsletter last week! I started writing on Monday as normal, and by Wednesday the piece (about the &lt;a href="https://en.wikipedia.org/wiki/Hierarchy_of_hazard_controls" target="_blank"&gt;hierarchy of controls&lt;/a&gt; ) was 2000 words and not &lt;em&gt;close&lt;/em&gt; to done. So now it'll be a blog post sometime later this month.&lt;/p&gt;
    &lt;p&gt;I also just released a new version of &lt;a href="https://leanpub.com/logic/" target="_blank"&gt;Logic for Programmers&lt;/a&gt;! 0.7 adds a bunch of new content (type invariants, modeling access policies, rewrites of the first chapters) but more importantly has new fonts that are more legible than the old ones. &lt;a href="https://leanpub.com/logic/" target="_blank"&gt;Go check it out!&lt;/a&gt;&lt;/p&gt;
    &lt;p&gt;For this week's newsletter I want to brainstorm an idea I've been noodling over for a while. Say we have a computational task, like running a simulation or searching a very large graph, and it's taking too long to complete on a computer. There's generally three things that we can do to make it faster:&lt;/p&gt;
    &lt;ol&gt;
    &lt;li&gt;Buy a faster computer ("vertical scaling")&lt;/li&gt;
    &lt;li&gt;Modify the software to use the computer's resources better ("efficiency")&lt;/li&gt;
    &lt;li&gt;Modify the software to use multiple computers ("horizontal scaling")&lt;/li&gt;
    &lt;/ol&gt;
    &lt;p&gt;(Splitting single-threaded software across multiple threads/processes is sort of a blend of (2) and (3).)&lt;/p&gt;
    &lt;p&gt;The big benefit of (1) is that we (usually) don't have to make any changes to the software to get a speedup. The downside is that for the past couple of decades computers haven't &lt;em&gt;gotten&lt;/em&gt; much faster, except in ways that require recoding (like GPUs and multicore). This means we rely on (2) and (3), and we can do both to a point. I've noticed, though, that horizontal scaling seems to conflict with efficiency. Software optimized to scale well tends to be worse for the &lt;code&gt;N=1&lt;/code&gt; case than software optimized to, um, be optimized.&lt;/p&gt;
    &lt;p&gt;Are there reasons to &lt;em&gt;expect&lt;/em&gt; this? It seems reasonable that design goals of software are generally in conflict, purely because exclusively optimizing for one property means making decisions that impede other properties. But is there something in the nature of "efficiency" and "horizontal scalability" that makes them especially disjoint?&lt;/p&gt;
    &lt;p&gt;This isn't me trying to explain a fully coherent idea, more me trying to figure this all out for myself. Also I'm probably getting some hardware stuff wrong.&lt;/p&gt;
    &lt;h3&gt;Amdahl's Law&lt;/h3&gt;
    &lt;p&gt;According to &lt;a href="https://en.wikipedia.org/wiki/Amdahl%27s_law" target="_blank"&gt;Amdahl's Law&lt;/a&gt;, the maximum speedup by parallelization is constrained by the proportion of the work that can be parallelized. If 80% of algorithm X is parallelizable, the maximum speedup from horizontal scaling is 5x. If algorithm Y is 25% parallelizable, the maximum speedup is only 1.3x. &lt;/p&gt;
    &lt;p&gt;If you need horizontal scalability, you want to use algorithm X, &lt;em&gt;even if Y is naturally 3x faster&lt;/em&gt;. But if Y was 4x faster, you'd prefer it to X. Maximal scalability means finding the optimal balance between baseline speed and parallelizability. Maximal efficiency means just optimizing baseline speed. &lt;/p&gt;
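    &lt;p&gt;The arithmetic behind those numbers, as a small Python sketch. The formula is the standard statement of Amdahl's Law; the function names and the "baseline speed times ceiling" comparison are just my framing of the X-versus-Y example.&lt;/p&gt;

```python
def amdahl_speedup(parallel_fraction, workers):
    """Speedup on `workers` machines when only `parallel_fraction` of the work parallelizes."""
    serial_fraction = 1 - parallel_fraction
    return 1 / (serial_fraction + parallel_fraction / workers)


def max_speedup(parallel_fraction):
    """Ceiling on the speedup as the number of workers grows without limit."""
    return 1 / (1 - parallel_fraction)


def best_scaled_speed(baseline_speed, parallel_fraction):
    """Best achievable speed: baseline speed times the Amdahl ceiling."""
    return baseline_speed * max_speedup(parallel_fraction)


# X has baseline speed 1 and is 80% parallelizable.
# Y is 25% parallelizable but has a faster baseline (3x or 4x).
x_best = best_scaled_speed(1, 0.80)
y3_best = best_scaled_speed(3, 0.25)
y4_best = best_scaled_speed(4, 0.25)

print(round(x_best, 2), round(y3_best, 2), round(y4_best, 2))
print("X beats the 3x-faster Y:", x_best > y3_best)
print("The 4x-faster Y beats X:", y4_best > x_best)
```

    &lt;p&gt;Note these are ceilings: with a finite number of machines, &lt;code&gt;amdahl_speedup&lt;/code&gt; gives a smaller number, so the crossover point moves in Y's favor.&lt;/p&gt;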
    &lt;h3&gt;Coordination Overhead&lt;/h3&gt;
    &lt;p&gt;Distributed algorithms require more coordination. To add a list of numbers in parallel via &lt;a href="https://en.wikipedia.org/wiki/Fork%E2%80%93join_model" target="_blank"&gt;fork-join&lt;/a&gt;, we'd do something like this:&lt;/p&gt;
    &lt;ol&gt;
    &lt;li&gt;Split the list into N sublists&lt;/li&gt;
    &lt;li&gt;Fork a new thread/process for each sublist&lt;/li&gt;
    &lt;li&gt;Wait for each thread/process to finish&lt;/li&gt;
    &lt;li&gt;Add the sums together.&lt;/li&gt;
    &lt;/ol&gt;
    &lt;p&gt;(1), (2), and (3) all add overhead to the algorithm. At the very least, it's extra lines of code to execute, but it can also mean inter-process communication or network hops. Distribution also means you have fewer natural correctness guarantees, so you need more administrative overhead to avoid race conditions. &lt;/p&gt;
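    &lt;p&gt;The four steps above, sketched in Python with the standard library's &lt;code&gt;concurrent.futures&lt;/code&gt;. This is only meant to show where the extra work lives; &lt;code&gt;fork_join_sum&lt;/code&gt; and its &lt;code&gt;n_workers&lt;/code&gt; parameter are illustrative names, and with threads under CPython's GIL I wouldn't expect this to actually beat a plain &lt;code&gt;sum&lt;/code&gt;.&lt;/p&gt;

```python
from concurrent.futures import ThreadPoolExecutor


def fork_join_sum(numbers, n_workers=4):
    # 1. Split the list into n_workers sublists (round-robin slices).
    chunks = [numbers[i::n_workers] for i in range(n_workers)]

    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        # 2. Fork: hand each sublist to a worker thread.
        futures = [pool.submit(sum, chunk) for chunk in chunks]
        # 3. Join: block until every worker has finished.
        partial_sums = [future.result() for future in futures]

    # 4. Add the partial sums together.
    return sum(partial_sums)


numbers = list(range(1, 101))
print(fork_join_sum(numbers) == sum(numbers))
```

    &lt;p&gt;Everything except the two calls to &lt;code&gt;sum&lt;/code&gt; is coordination that the sequential version doesn't need.&lt;/p&gt;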
    &lt;p&gt;&lt;strong&gt;Real world example:&lt;/strong&gt; Historically CPython has had a "global interpreter lock" (GIL). In multithreaded code, only one thread could execute Python code at a time (others could execute C code). The &lt;a href="https://docs.python.org/3/howto/free-threading-python.html#single-threaded-performance" target="_blank"&gt;newest version&lt;/a&gt; supports disabling the GIL, which comes at a 40% overhead for single-threaded programs. Supposedly the difference is because the &lt;a href="https://docs.python.org/3/whatsnew/3.11.html#whatsnew311-pep659" target="_blank"&gt;specializing adaptive interpreter&lt;/a&gt; optimization isn't thread-safe yet. The Python team is hoping to get it down to "only" 10%.&lt;/p&gt;
    &lt;h3&gt;Scaling loses shared resources&lt;/h3&gt;
    &lt;p&gt;I'd say that intra-machine scaling (multiple threads/processes) feels qualitatively &lt;em&gt;different&lt;/em&gt; than inter-machine scaling. Part of that is that intra-machine scaling is "capped" while inter-machine is not. But there's also a difference in what assumptions you can make about shared resources. Starting from the baseline of a single-threaded program:&lt;/p&gt;
    &lt;ol&gt;
    &lt;li&gt;Threads have a much harder time sharing CPU caches (you have to manually mess with affinities)&lt;/li&gt;
    &lt;li&gt;Processes have a much harder time sharing RAM (I think you have to use &lt;a href="https://en.wikipedia.org/wiki/Memory-mapped_file" target="_blank"&gt;mmap&lt;/a&gt;?)&lt;/li&gt;
    &lt;li&gt;Machines can't share cache, RAM, or disk, period.&lt;/li&gt;
    &lt;/ol&gt;
    &lt;p&gt;It's a lot easier to solve a problem when the whole thing fits in RAM. But if you split a 50 GB problem across three machines, it doesn't fit in RAM by default, even if the machines have 64 GB each. Scaling also means that separate machines can't reuse resources like database connections.&lt;/p&gt;
    &lt;h3&gt;Efficiency comes from limits&lt;/h3&gt;
    &lt;p&gt;I think the two previous points tie together in the idea that maximal efficiency comes from being able to make assumptions about the system. If we know the &lt;em&gt;exact&lt;/em&gt; sequence of computations, we can aim to minimize cache misses. If we don't have to worry about thread-safety, &lt;a href="https://www.playingwithpointers.com/blog/refcounting-harder-than-it-sounds.html" target="_blank"&gt;tracking references is dramatically simpler&lt;/a&gt;. If we have all of the data in a single database, our query planner has more room to work with. At various tiers of scaling these assumptions are no longer guaranteed and we lose the corresponding optimizations.&lt;/p&gt;
    &lt;p&gt;Sometimes these assumptions are implicit and crop up in odd places. Like if you're working at a scale where you need multiple synced databases, you might want to use UUIDs instead of numbers for keys. But then you lose the assumption "recently inserted rows are close together in the index", which I've read &lt;a href="https://www.cybertec-postgresql.com/en/unexpected-downsides-of-uuid-keys-in-postgresql/" target="_blank"&gt;can lead to significant slowdowns&lt;/a&gt;. &lt;/p&gt;
    &lt;p&gt;This suggests that if you can find a limit somewhere else, you can get both high horizontal scaling and high efficiency. &lt;del&gt;Supposedly the &lt;a href="https://tigerbeetle.com/" target="_blank"&gt;TigerBeetle database&lt;/a&gt; has both, but that could be because they limit all records to &lt;a href="https://docs.tigerbeetle.com/coding/" target="_blank"&gt;accounts and transfers&lt;/a&gt;. This means every record fits in &lt;a href="https://tigerbeetle.com/blog/2024-07-23-rediscovering-transaction-processing-from-history-and-first-principles/#transaction-processing-from-first-principles" target="_blank"&gt;exactly 128 bytes&lt;/a&gt;.&lt;/del&gt; [A TigerBeetle engineer reached out to tell me that they do &lt;em&gt;not&lt;/em&gt; horizontally scale compute, they distribute across multiple nodes for redundancy. &lt;a href="https://lobste.rs/s/5akiq3/are_efficiency_horizontal_scalability#c_ve8ud5" target="_blank"&gt;"You can't make it faster by adding more machines."&lt;/a&gt;]&lt;/p&gt;
    &lt;p&gt;Does this mean that "assumptions" could be both "assumptions about the computing environment" and "assumptions about the problem"? In the famous essay &lt;a href="http://www.frankmcsherry.org/graph/scalability/cost/2015/01/15/COST.html" target="_blank"&gt;Scalability! But at what COST&lt;/a&gt;, Frank McSherry shows that his single-threaded laptop could outperform 128-node "big data systems" on PageRank and graph connectivity (via label propagation). Afterwards, he discusses how a different algorithm solves graph connectivity even faster: &lt;/p&gt;
    &lt;blockquote&gt;
    &lt;p&gt;[Union find] is more line of code than label propagation, but it is 10x faster and 100x less embarassing. … The union-find algorithm is fundamentally incompatible with the graph computation approaches Giraph, GraphLab, and GraphX put forward (the so-called “think like a vertex” model).&lt;/p&gt;
    &lt;/blockquote&gt;
    &lt;p&gt;The interesting thing to me is that his alternative makes more "assumptions" than what he's comparing to. He can "assume" a fixed goal and optimize the code for that goal. The "big data systems" are trying to be general purpose compute platforms and have to pick a model that supports the widest range of possible problems.&lt;/p&gt;
    &lt;p&gt;A few years back I wrote &lt;a href="https://www.hillelwayne.com/post/cleverness/" target="_blank"&gt;clever vs insightful code&lt;/a&gt;. I think what I'm trying to say here is that efficiency comes from having insight into your problem and environment.&lt;/p&gt;
    &lt;p&gt;(Last thought to shove in here: to exploit assumptions, you need &lt;em&gt;control&lt;/em&gt;. Carefully arranging your data to fit in L1 doesn't matter if your programming language doesn't let you control where things are stored!)&lt;/p&gt;
    &lt;h3&gt;Is there a cultural aspect?&lt;/h3&gt;
    &lt;p&gt;Maybe there's also a cultural element to this conflict. What if the engineers interested in "efficiency" are different from the engineers interested in "horizontal scaling"?&lt;/p&gt;
    &lt;p&gt;At my first job the data scientists set up a &lt;a href="https://en.wikipedia.org/wiki/Apache_Hadoop" target="_blank"&gt;Hadoop&lt;/a&gt; cluster for their relatively small dataset, only a few dozen gigabytes or so. One of the senior software engineers saw this and said "big data is stupid." To prove it, he took one of their example queries, wrote a script in Go to compute the same thing, and optimized it to run faster on his machine.&lt;/p&gt;
    &lt;p&gt;At the time I was like "yeah, you're right, big data IS stupid!" But I think now that we both missed something obvious: with the "scalable" solution, the data scientists &lt;em&gt;didn't&lt;/em&gt; have to write an optimized script for every single query. Optimizing code is hard, adding more machines is easy! &lt;/p&gt;
    &lt;p&gt;The highest tier of horizontal scaling is usually something large businesses want, and large businesses like problems that can be solved purely with money. Maximizing efficiency requires a lot of knowledge-intensive human labour, so is less appealing as an investment. Then again, I've seen a lot of work on making the scalable systems more efficient, such as evenly balancing heterogeneous workloads. Maybe in the largest systems intra-machine efficiency is just too small-scale a problem.&lt;/p&gt;
    &lt;h3&gt;I'm not sure where this fits in but scaling a volume of tasks conflicts less than scaling individual tasks&lt;/h3&gt;
    &lt;p&gt;If you have 1,000 machines and need to crunch one big graph, you probably want the most scalable algorithm. If you instead have 50,000 small graphs, you probably want the most efficient algorithm, which you then run on all 1,000 machines. When we call a problem &lt;a href="https://en.wikipedia.org/wiki/Embarrassingly_parallel" target="_blank"&gt;embarrassingly parallel&lt;/a&gt;, we usually mean it's easy to horizontally scale. But it's also one that's easy to make more efficient, because local optimizations don't affect the scaling! &lt;/p&gt;
    &lt;hr/&gt;
    &lt;p&gt;Okay that's enough brainstorming for one week.&lt;/p&gt;
    &lt;h3&gt;Blog Rec&lt;/h3&gt;
    &lt;p&gt;Whenever I think about optimization as a skill, the first article that comes to mind is &lt;a href="https://matklad.github.io/" target="_blank"&gt;matklad's&lt;/a&gt; &lt;a href="https://matklad.github.io/2023/11/15/push-ifs-up-and-fors-down.html" target="_blank"&gt;Push Ifs Up And Fors Down&lt;/a&gt;. I'd never have considered on my own that inlining loops into functions could be such a huge performance win. The blog has a lot of other posts on the nuts-and-bolts of systems languages, optimization, and concurrency.&lt;/p&gt;
    </description><pubDate>Wed, 12 Feb 2025 18:26:20 +0000</pubDate><guid>https://buttondown.com/hillelwayne/archive/are-efficiency-and-horizontal-scalability-at-odds/</guid></item><item><title>What hard thing does your tech make easy?</title><link>https://buttondown.com/hillelwayne/archive/what-hard-thing-does-your-tech-make-easy/</link><description>
    &lt;p&gt;I occasionally receive emails asking me to look at the writer's new language/library/tool. Sometimes it's in an area I know well, like formal methods. Other times, I'm a complete stranger to the field. Regardless, I'm generally happy to check it out.&lt;/p&gt;
    &lt;p&gt;When starting out, this is the biggest question I'm looking to answer:&lt;/p&gt;
    &lt;blockquote&gt;
    &lt;p&gt;What does this technology make easy that's normally hard?&lt;/p&gt;
    &lt;/blockquote&gt;
    &lt;p&gt;What justifies me learning and migrating to a &lt;em&gt;new&lt;/em&gt; thing as opposed to fighting through my problems with the tools I already know? The new thing has to have some sort of value proposition, which could be something like "better performance" or "more secure". The most universal value and the most direct to show is "takes less time and mental effort to do something". I can't accurately judge two benchmarks, but I can see two demos or code samples and compare which one feels easier to me.&lt;/p&gt;
    &lt;h2&gt;Examples&lt;/h2&gt;
    &lt;h3&gt;Functional programming&lt;/h3&gt;
    &lt;p&gt;What drew me originally to functional programming was higher order functions. &lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;# Without HOFs
    
    out = []
    for x in input {
      if test(x) {
        out.append(x)
      }
    }
    
    # With HOFs
    
    filter(test, input)
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
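    &lt;p&gt;For anyone who wants to run the comparison, here's the same thing in real Python. The predicate &lt;code&gt;is_even&lt;/code&gt; and the sample list are placeholders I picked for the example.&lt;/p&gt;

```python
def is_even(n):
    # Placeholder predicate standing in for test().
    return n % 2 == 0


numbers = [1, 2, 3, 4, 5, 6]

# Without HOFs: build the result by hand.
out = []
for x in numbers:
    if is_even(x):
        out.append(x)

# With HOFs: filter() returns a lazy iterator, so wrap it in list() to see the values.
kept = list(filter(is_even, numbers))

print(out == kept)
```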
    &lt;p&gt;We can also compare the easiness of various tasks between examples within the same paradigm. If I know FP via Clojure, what could be appealing about Haskell or F#? For one, null safety is a lot easier when I've got option types.&lt;/p&gt;
    &lt;h3&gt;Array Programming&lt;/h3&gt;
    &lt;p&gt;Array programming languages like APL or J make certain classes of computation easier. For example, finding all of the indices where two arrays &lt;del&gt;differ&lt;/del&gt; match. Here it is in Python:&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="n"&gt;x&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
    &lt;span class="n"&gt;y&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
    
    &lt;span class="o"&gt;&gt;&gt;&gt;&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;a&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;b&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;enumerate&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;zip&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;y&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;a&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="n"&gt;b&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
    &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;7&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;9&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;And here it is in J:&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="nv"&gt;x&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;=:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;4&lt;/span&gt;
    &lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="nv"&gt;y&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;=:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;4&lt;/span&gt;
    
    &lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="nv"&gt;I&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;x&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;y&lt;/span&gt;
    &lt;span class="mi"&gt;7&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;9&lt;/span&gt;
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;Not every tool is meant for every programmer, because you might not have any of the problems a tool makes easier. What comes up more often for you: filtering a list or finding all the indices where two lists match? Statistically speaking, functional programming is more useful to you than array programming.&lt;/p&gt;
    &lt;p&gt;But &lt;em&gt;I&lt;/em&gt; have this problem enough to justify learning array programming.&lt;/p&gt;
    &lt;h3&gt;LLMs&lt;/h3&gt;
    &lt;p&gt;I think a lot of the appeal of LLMs is they make a lot of specialist tasks easy for nonspecialists. One thing I recently did was convert some rst &lt;a href="https://docutils.sourceforge.io/docs/ref/rst/directives.html#list-table" target="_blank"&gt;list tables&lt;/a&gt; to &lt;a href="https://docutils.sourceforge.io/docs/ref/rst/directives.html#csv-table-1" target="_blank"&gt;csv tables&lt;/a&gt;. Normally I'd have to write some tricky parsing and serialization code to automatically convert between the two. With LLMs, it's just&lt;/p&gt;
    &lt;blockquote&gt;
    &lt;p&gt;Convert the following rst list-table into a csv-table: [table]&lt;/p&gt;
    &lt;/blockquote&gt;
    &lt;p&gt;"Easy" can trump "correct" as a value. The LLM might get some translations wrong, but it's so convenient I'd rather manually review all the translations for errors than write a specialized script that is correct 100% of the time.&lt;/p&gt;
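    &lt;p&gt;For a sense of what the "specialized script" might involve, here is a minimal Python sketch. It is not the script I would actually ship: it assumes every cell fits on one line, that the directive is not nested or indented oddly, and that options such as &lt;code&gt;:header-rows:&lt;/code&gt; can be copied over unchanged (as far as I know both directives accept that option). The function name &lt;code&gt;list_table_to_csv_table&lt;/code&gt; is hypothetical.&lt;/p&gt;

```python
import csv
import io


def list_table_to_csv_table(text):
    lines = text.splitlines()
    # Assumption: the first line is the directive itself.
    out = [lines[0].replace(".. list-table::", ".. csv-table::", 1)]
    rows = []
    for line in lines[1:]:
        stripped = line.strip()
        if stripped.startswith("* - "):
            rows.append([stripped[4:]])       # first cell starts a new row
        elif stripped.startswith("- ") and rows:
            rows[-1].append(stripped[2:])     # later cells extend that row
        elif stripped.startswith(":"):
            out.append("   " + stripped)      # options are copied through
    # Let the csv module handle quoting of commas and quote characters.
    buffer = io.StringIO()
    csv.writer(buffer, lineterminator="\n").writerows(rows)
    out.append("")
    out.extend("   " + row for row in buffer.getvalue().splitlines())
    return "\n".join(out)


example = "\n".join([
    ".. list-table:: Fruit",
    "   :header-rows: 1",
    "",
    "   * - Name",
    "     - Count",
    "   * - Apple, red",
    "     - 3",
])
print(list_table_to_csv_table(example))
```

    &lt;p&gt;Even this toy version silently ignores multi-line cells and nested markup, which is the "tricky" part that makes handing the job to an LLM tempting.&lt;/p&gt;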
    &lt;h2&gt;Let's not take this too far&lt;/h2&gt;
    &lt;p&gt;A college friend once claimed that he cracked the secret of human behavior: humans do whatever makes them happiest. "What about the martyr who dies for their beliefs?" "Well, in their last second of life they get REALLY happy."&lt;/p&gt;
    &lt;p&gt;We can do the same here, fitting every value proposition into the frame of "easy". CUDA makes it easier to do matrix multiplication. Rust makes it easier to write low-level code without memory bugs. TLA+ makes it easier to find errors in your design. Monads make it easier to sequence computations in a lazy environment. Making everything about "easy" obscures other reasons for adopting new things.&lt;/p&gt;
    &lt;h3&gt;That whole "simple vs easy" thing&lt;/h3&gt;
    &lt;p&gt;Sometimes people think that "simple" is better than "easy", because "simple" is objective and "easy" is subjective. This comes from the famous talk &lt;a href="https://www.infoq.com/presentations/Simple-Made-Easy/" target="_blank"&gt;Simple Made Easy&lt;/a&gt;. I'm not sure I agree that simple is better &lt;em&gt;or&lt;/em&gt; more objective: the speaker claims that polymorphism and typeclasses are "simpler" than conditionals, and I doubt everybody would agree with that.&lt;/p&gt;
    &lt;p&gt;The problem is that "simple" is used to mean both "not complicated" &lt;em&gt;and&lt;/em&gt; "not complex". And everybody agrees that "complicated" and "complex" are different, even if they can't agree &lt;em&gt;what&lt;/em&gt; the difference is. This idea should probably be expanded into its own newsletter.&lt;/p&gt;
    &lt;p&gt;It's also a lot harder to pitch a technology on being "simpler". Simplicity by itself doesn't make a tool better equipped to solve problems. Simplicity can unlock other benefits, like compositionality or &lt;a href="https://buttondown.com/hillelwayne/archive/the-capability-tractability-tradeoff/" target="_blank"&gt;tractability&lt;/a&gt;, that provide the actual value. And often that value is in the form of "makes some tasks easier". &lt;/p&gt;
    </description><pubDate>Wed, 29 Jan 2025 18:09:47 +0000</pubDate><guid>https://buttondown.com/hillelwayne/archive/what-hard-thing-does-your-tech-make-easy/</guid></item><item><title>The Juggler's Curse</title><link>https://buttondown.com/hillelwayne/archive/the-jugglers-curse/</link><description>
    &lt;p&gt;I'm making a more focused effort to juggle this year. Mostly &lt;a href="https://youtu.be/PPhG_90VH5k?si=AxOO65PcX4ZwnxPQ&amp;t=49" target="_blank"&gt;boxes&lt;/a&gt;, but also classic balls too.&lt;sup id="fnref:boxes"&gt;&lt;a class="footnote-ref" href="#fn:boxes"&gt;1&lt;/a&gt;&lt;/sup&gt; I've gotten to the point where I can almost consistently do a five-ball cascade, which I &lt;em&gt;thought&lt;/em&gt; was the cutoff to being a "good juggler". "Thought" because I now know a "good juggler" is one who can do the five-ball cascade with &lt;em&gt;outside throws&lt;/em&gt;. &lt;/p&gt;
    &lt;p&gt;I know this because I can't do the outside five-ball cascade... yet. But it's something I can see myself eventually mastering, unlike the slightly more difficult trick of the five-ball mess, which is impossible for mere mortals like me. &lt;/p&gt;
    &lt;p&gt;&lt;em&gt;In theory&lt;/em&gt; there is a spectrum of trick difficulties and skill levels. I could place myself on the axis like this:&lt;/p&gt;
    &lt;p&gt;&lt;img alt="A crudely-drawn scale with 10 even ticks, I'm between 5 and 6" class="newsletter-image" src="https://assets.buttondown.email/images/8ee51aa1-5dd4-48b8-8110-2cdf9a273612.png?w=960&amp;fit=max"/&gt;&lt;/p&gt;
    &lt;p&gt;In practice, there are three tiers:&lt;/p&gt;
    &lt;ol&gt;
    &lt;li&gt;Toddlers&lt;/li&gt;
    &lt;li&gt;Good jugglers who practice hard&lt;/li&gt;
    &lt;li&gt;Genetic freaks and actual wizards&lt;/li&gt;
    &lt;/ol&gt;
    &lt;p&gt;And the graph always, &lt;em&gt;always&lt;/em&gt; looks like this:&lt;/p&gt;
    &lt;p&gt;&lt;img alt="The same graph, with the top compressed into "wizards" and bottom into "toddlers". I'm in toddlers." class="newsletter-image" src="https://assets.buttondown.email/images/04c76cec-671e-4560-b64e-498b7652359e.png?w=960&amp;fit=max"/&gt;&lt;/p&gt;
    &lt;p&gt;This is the juggler's curse, and it's a three-parter:&lt;/p&gt;
    &lt;ol&gt;
    &lt;li&gt;The threshold between you and "good" is the next trick you cannot do.&lt;/li&gt;
    &lt;li&gt;Everything below that level is trivial. Once you've gotten a trick down, you can never go back to not knowing it, to appreciating how difficult it was to learn in the first place.&lt;sup id="fnref:expert-blindness"&gt;&lt;a class="footnote-ref" href="#fn:expert-blindness"&gt;2&lt;/a&gt;&lt;/sup&gt;&lt;/li&gt;
    &lt;li&gt;Everything above that level is just "impossible". You don't have the knowledge needed to recognize the different tiers.&lt;sup id="fnref:dk"&gt;&lt;a class="footnote-ref" href="#fn:dk"&gt;3&lt;/a&gt;&lt;/sup&gt;&lt;/li&gt;
    &lt;/ol&gt;
    &lt;p&gt;So as you get better, the stuff that was impossible becomes differentiable, and you can see that some of it &lt;em&gt;is&lt;/em&gt; possible. And everything you learned becomes trivial. So you're never a good juggler until you learn "just one more hard trick".&lt;/p&gt;
    &lt;p&gt;The more you know, the more you know you don't know and the less you know you know.&lt;/p&gt;
    &lt;h3&gt;This is supposed to be a software newsletter&lt;/h3&gt;
    &lt;blockquote&gt;
    &lt;p&gt;A monad is a monoid in the category of endofunctors, what's the problem? &lt;a href="https://james-iry.blogspot.com/2009/05/brief-incomplete-and-mostly-wrong.html" target="_blank"&gt;(src)&lt;/a&gt;&lt;/p&gt;
    &lt;/blockquote&gt;
    &lt;p&gt;I think this applies to any difficult topic? Most fields don't have the same stark &lt;a href="https://en.wikipedia.org/wiki/Spectral_line" target="_blank"&gt;spectral lines&lt;/a&gt; as juggling, but there are still tiers of difficulty to techniques, which get compressed the further in either direction they are from your current level.&lt;/p&gt;
    &lt;p&gt;Like, I'm not good at formal methods. I've written two books on it but I've never mastered a dependently-typed language or a theorem prover. Those are equally hard. And I'm not good at modeling concurrent systems because I don't understand the formal definition of bisimulation and haven't implemented a Raft. Those are also equally hard, in fact exactly as hard as mastering a theorem prover.&lt;/p&gt;
    &lt;p&gt;At the same time, the skills I've already developed are easy: properly using refinement is &lt;em&gt;exactly as easy&lt;/em&gt; as writing &lt;a href="https://buttondown.com/hillelwayne/archive/what-are-the-rosettas-of-formal-specification/" target="_blank"&gt;a wrapped counter&lt;/a&gt;. Then I get surprised when I try to explain strong fairness to someone and they just don't get how □◇(ENABLED〈A〉ᵥ) is &lt;em&gt;obviously&lt;/em&gt; different from ◇□(ENABLED 〈A〉ᵥ).&lt;/p&gt;
    &lt;p&gt;Juggler's curse!&lt;/p&gt;
    &lt;p&gt;Now I don't actually know if this is actually how everybody experiences expertise or if it's just my particular personality— I was a juggler long before I was a software developer. Then again, I'd argue that lots of people talk about one consequence of the juggler's curse: imposter syndrome. If you constantly think what you know is "trivial" and what you don't know is "impossible", then yeah, you'd start feeling like an imposter at work real quick.&lt;/p&gt;
    &lt;p&gt;I wonder if part of the cause is that a lot of skills you have to learn are invisible. One of my favorite blog posts ever is &lt;a href="https://www.benkuhn.net/blub/" target="_blank"&gt;In Defense of Blub Studies&lt;/a&gt;, which argues that software expertise comes through understanding "boring" topics like "what all of the error messages mean" and "how to use a debugger well".  Blub is a critical part of expertise and takes a lot of hard work to learn, but it &lt;em&gt;feels&lt;/em&gt; like trivia. So looking back on a skill I mastered, I might think it was "easy" because I'm not including all of the blub that I had to learn, too.&lt;/p&gt;
    &lt;p&gt;The takeaway, of course, is that the outside five-ball cascade &lt;em&gt;is&lt;/em&gt; objectively the cutoff between good jugglers and toddlers.&lt;/p&gt;
    &lt;div class="footnote"&gt;
    &lt;hr/&gt;
    &lt;ol&gt;
    &lt;li id="fn:boxes"&gt;
    &lt;p&gt;Rant time: I &lt;em&gt;love&lt;/em&gt; cigar box juggling. It's fun, it's creative, it's totally unlike any other kind of juggling. And it's so niche I straight up cannot find anybody in Chicago to practice with. I once went to a juggling convention and was the only person with a cigar box set there. &lt;a class="footnote-backref" href="#fnref:boxes" title="Jump back to footnote 1 in the text"&gt;↩&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id="fn:expert-blindness"&gt;
    &lt;p&gt;This particular part of the juggler's curse is also called &lt;a href="https://en.wikipedia.org/wiki/Curse_of_knowledge" target="_blank"&gt;the curse of knowledge&lt;/a&gt; or "expert blindness". &lt;a class="footnote-backref" href="#fnref:expert-blindness" title="Jump back to footnote 2 in the text"&gt;↩&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id="fn:dk"&gt;
    &lt;p&gt;This isn't Dunning-Kruger, because DK says that people think they are &lt;em&gt;better&lt;/em&gt; than they actually are, and also &lt;a href="https://www.mcgill.ca/oss/article/critical-thinking/dunning-kruger-effect-probably-not-real" target="_blank"&gt;may not actually be real&lt;/a&gt;. &lt;a class="footnote-backref" href="#fnref:dk" title="Jump back to footnote 3 in the text"&gt;↩&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;/ol&gt;
    &lt;/div&gt;
    </description><pubDate>Wed, 22 Jan 2025 18:50:40 +0000</pubDate><guid>https://buttondown.com/hillelwayne/archive/the-jugglers-curse/</guid></item><item><title>What are the Rosettas of formal specification?</title><link>https://buttondown.com/hillelwayne/archive/what-are-the-rosettas-of-formal-specification/</link><description>
    &lt;p&gt;First of all, I just released version 0.6 of &lt;em&gt;Logic for Programmers&lt;/em&gt;! You can get it &lt;a href="https://leanpub.com/logic/" target="_blank"&gt;here&lt;/a&gt;. Release notes in the footnote.&lt;sup id="fnref:release-notes"&gt;&lt;a class="footnote-ref" href="#fn:release-notes"&gt;1&lt;/a&gt;&lt;/sup&gt;&lt;/p&gt;
    &lt;p&gt;I've been thinking about my next project after the book's done. One idea is to do a survey of new formal specification languages. There's been a lot of new ones in the past few years (P, Quint, etc), plus some old ones I haven't critically examined (SPIN, mcrl2). I'm thinking of a brief overview of each, what's interesting about it, and some examples of the corresponding models.&lt;/p&gt;
    &lt;p&gt;For this I'd want a set of "Rosetta" examples. &lt;a href="https://rosettacode.org/wiki/Rosetta_Code" target="_blank"&gt;Rosetta Code&lt;/a&gt; is a collection of programming tasks done in different languages. For example, &lt;a href="https://rosettacode.org/wiki/99_bottles_of_beer" target="_blank"&gt;"99 bottles of beer on the wall"&lt;/a&gt; in over 300 languages. If I wanted to make a Rosetta Code for specifications of concurrent systems, what examples would I use? &lt;/p&gt;
    &lt;h3&gt;What makes a good Rosetta example?&lt;/h3&gt;
    &lt;p&gt;A good Rosetta example would be simple enough to understand and implement but also showcase the differences between the languages. &lt;/p&gt;
    &lt;p&gt;A good example of a Rosetta example is &lt;a href="https://github.com/hwayne/lets-prove-leftpad" target="_blank"&gt;leftpad for code verification&lt;/a&gt;. Proving leftpad correct is short in whatever verification language you use. But the proofs themselves are different enough that you can compare what it's like to use code contracts vs dependent types, etc. &lt;/p&gt;
    &lt;p&gt;A &lt;em&gt;bad&lt;/em&gt; Rosetta example is "hello world". While it's good for showing how to run a language, it doesn't clearly differentiate languages. Haskell's "hello world" is almost identical to BASIC's "hello world".&lt;/p&gt;
    &lt;p&gt;Rosetta examples don't have to be flashy, but I &lt;em&gt;want&lt;/em&gt; mine to be flashy. Formal specification is niche enough that regardless of my medium, most of my audience hasn't used it and may be skeptical. I always have to be selling. This biases me away from using things like dining philosophers or two-phase commit.&lt;/p&gt;
    &lt;p&gt;So with that in mind, three ideas:&lt;/p&gt;
    &lt;h3&gt;1. Wrapped Counter&lt;/h3&gt;
    &lt;p&gt;A counter that starts at 1 and counts to N, after which it wraps around to 1 again.&lt;/p&gt;
    &lt;h4&gt;Why it's good&lt;/h4&gt;
    &lt;p&gt;This is a good introductory formal specification: it's a minimal possible stateful system without concurrency or nondeterminism. You can use it to talk about the basic structure of a spec, how a verifier works, etc. It's also a good way of introducing "boring" semantics, like conditionals and arithmetic, and checking if the language does anything unusual with them. Alloy, for example, defaults to 4-bit signed integers, so you run into problems if you set N too high.&lt;sup id="fnref:alloy"&gt;&lt;a class="footnote-ref" href="#fn:alloy"&gt;2&lt;/a&gt;&lt;/sup&gt;&lt;/p&gt;
    &lt;p&gt;At the same time, wrapped counters are a common building block of complex systems. Lots of things can be represented this way: &lt;code&gt;N=1&lt;/code&gt; is a flag or blinker, &lt;code&gt;N=3&lt;/code&gt; is a traffic light, &lt;code&gt;N=24&lt;/code&gt; is a clock, etc.&lt;/p&gt;
    &lt;p&gt;The next example is better for showing basic &lt;a href="https://www.hillelwayne.com/post/safety-and-liveness/" target="_blank"&gt;safety and liveness properties&lt;/a&gt;, but this will do in a pinch. &lt;/p&gt;
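    &lt;p&gt;To make the wrapped counter concrete outside of any particular spec language, here is a small Python sketch of its next-state function, plus a brute-force walk that checks the obvious invariant (the counter stays between 1 and N). The names &lt;code&gt;next_value&lt;/code&gt; and &lt;code&gt;reachable_values&lt;/code&gt; are mine, and this stands in for what a real model checker would do, it does not replace one.&lt;/p&gt;

```python
def next_value(value, n):
    # Wrap back to 1 after reaching n; otherwise count up by one.
    return 1 if value == n else value + 1


def reachable_values(n):
    # The system is deterministic, so follow its one behavior until it repeats.
    seen = []
    value = 1
    while value not in seen:
        seen.append(value)
        value = next_value(value, n)
    return seen


n = 3
values = reachable_values(n)
# Invariant: the counter never leaves the range 1..n.
assert all(v in range(1, n + 1) for v in values)
print(values)  # [1, 2, 3]
```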
    &lt;h3&gt;2. Threads&lt;/h3&gt;
    &lt;p&gt;A counter starts at 0. N threads each simultaneously try to update the counter. They do this nonatomically: first they read the value of the counter and store that in a thread-local &lt;code&gt;tmp&lt;/code&gt;, then they increment &lt;code&gt;tmp&lt;/code&gt;, then they set the counter to &lt;code&gt;tmp&lt;/code&gt;. The expected behavior is that the final value of the counter will be N.&lt;/p&gt;
    &lt;h4&gt;Why it's good&lt;/h4&gt;
    &lt;p&gt;The system as described is bugged. If two threads interleave the setlocal commands, one thread's update can "clobber" the other and the counter can go backwards. To my surprise, most people &lt;em&gt;do not&lt;/em&gt; see this error. So it's a good showcase of how the language actually finds real bugs, and how it can verify fixes.&lt;/p&gt;
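    &lt;p&gt;The clobbering is easy to see if you enumerate every interleaving by brute force. The Python sketch below does that for the three-step threads described above; it is an illustration under my own encoding (a program counter 0 to 3 per thread, and the hypothetical function name &lt;code&gt;final_counter_values&lt;/code&gt;), not how a spec language represents it.&lt;/p&gt;

```python
def final_counter_values(n_threads):
    # State: (counter, per-thread (pc, tmp)). pc 0=read, 1=increment, 2=write, 3=done.
    start = (0, tuple((0, 0) for _ in range(n_threads)))
    seen = {start}
    stack = [start]
    finals = set()
    while stack:
        counter, threads = stack.pop()
        if all(pc == 3 for pc, _ in threads):
            finals.add(counter)
            continue
        # Any thread that is not done may take the next step.
        for i, (pc, tmp) in enumerate(threads):
            if pc == 3:
                continue
            if pc == 0:
                new_counter, new_thread = counter, (1, counter)   # tmp := counter
            elif pc == 1:
                new_counter, new_thread = counter, (2, tmp + 1)   # tmp := tmp + 1
            else:
                new_counter, new_thread = tmp, (3, tmp)           # counter := tmp
            state = (new_counter, threads[:i] + (new_thread,) + threads[i + 1:])
            if state not in seen:
                seen.add(state)
                stack.append(state)
    return finals


print(final_counter_values(2))  # {1, 2}: the counter does not always end at N
```

    &lt;p&gt;With two threads the final value can be 1 as well as 2, which is exactly the lost update a model checker would report as a violated property.&lt;/p&gt;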
    &lt;p&gt;As to actual language topics: the spec covers concurrency and tracks process-local state. A good spec language should make it possible to adjust N without having to add any new variables. And it "naturally" introduces safety, liveness, and &lt;a href="https://www.hillelwayne.com/post/action-properties/" target="_blank"&gt;action&lt;/a&gt; properties.&lt;/p&gt;
    &lt;p&gt;Finally, the thread spec is endlessly adaptable. I've used variations of it to teach refinement, resource starvation, fairness, livelocks, and hyperproperties. Tweak it a bit and you get dining philosophers.&lt;/p&gt;
    &lt;h3&gt;3. Bounded buffer&lt;/h3&gt;
    &lt;p&gt;We have a bounded buffer with maximum length &lt;code&gt;X&lt;/code&gt;. We have &lt;code&gt;R&lt;/code&gt; reader and &lt;code&gt;W&lt;/code&gt; writer processes. Before writing, writers first check if the buffer is full. If full, the writer goes to sleep. Otherwise, the writer wakes up &lt;em&gt;a random&lt;/em&gt; sleeping process, then pushes an arbitrary value. Readers work the same way, except they pop from the buffer (and go to sleep if the buffer is empty).&lt;/p&gt;
    &lt;p&gt;The only way for a sleeping process to wake up is if another process successfully performs a read or write.&lt;/p&gt;
    &lt;h4&gt;Why it's good&lt;/h4&gt;
    &lt;p&gt;This shows process-local nondeterminism (in choosing which sleeping process to wake up), different behavior for different types of processes, and deadlocks: it's possible for every reader and writer to be asleep at the same time.&lt;/p&gt;
    &lt;p&gt;The beautiful thing about this example: the spec can only deadlock if &lt;code&gt;X < 2*(R+W)&lt;/code&gt;. This is the kind of bug you'd struggle to debug in real code. And in fact, people did struggle: even when presented with a minimal code sample and told there was a bug, many &lt;a href="http://wiki.c2.com/?ExtremeProgrammingChallengeFourteen" target="_blank"&gt;testing experts couldn't find it&lt;/a&gt;. Whereas a formal model of the same code &lt;a href="https://www.hillelwayne.com/post/augmenting-agile/" target="_blank"&gt;finds the bug in seconds&lt;/a&gt;. &lt;/p&gt;
    &lt;p&gt;If a spec language can model the bounded buffer, then it's good enough for production systems.&lt;/p&gt;
    &lt;p&gt;On top of that, the bug happens regardless of what writers actually put in the buffer, so you can abstract that all away. This example can demonstrate that you can leave implementation details out of a spec and still find critical errors.&lt;/p&gt;
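    &lt;p&gt;As a rough idea of what "finding the bug in seconds" looks like, here is a Python sketch that explores every reachable state of the bounded buffer described above and reports whether all processes can end up asleep. It makes a few assumptions of its own: each check-then-act step is atomic, buffer contents are abstracted to just a length, and the name &lt;code&gt;deadlock_reachable&lt;/code&gt; is hypothetical. It is only meant for small parameter values.&lt;/p&gt;

```python
from collections import deque


def deadlock_reachable(capacity, readers, writers):
    procs = [("r", i) for i in range(readers)] + [("w", i) for i in range(writers)]
    start = (0, frozenset())          # (buffer length, set of sleeping processes)
    seen = {start}
    queue = deque([start])
    while queue:
        length, asleep = queue.popleft()
        if len(asleep) == len(procs):
            return True               # nobody is left to wake anyone up
        for proc in procs:
            if proc in asleep:
                continue
            is_writer = proc[0] == "w"
            blocked = (length == capacity) if is_writer else (length == 0)
            if blocked:
                successors = [(length, asleep | {proc})]
            else:
                new_length = length + 1 if is_writer else length - 1
                if asleep:
                    # Nondeterminism: any one sleeping process may be woken.
                    successors = [(new_length, asleep - {woken}) for woken in asleep]
                else:
                    successors = [(new_length, asleep)]
            for state in successors:
                if state not in seen:
                    seen.add(state)
                    queue.append(state)
    return False


print(deadlock_reachable(1, 2, 1))  # True
print(deadlock_reachable(1, 1, 1))  # False
```

    &lt;p&gt;Note that the state never records what was written to the buffer, only how full it is, which is the abstraction point made above.&lt;/p&gt;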
    &lt;h2&gt;Caveat&lt;/h2&gt;
    &lt;p&gt;This is all with a &lt;em&gt;heavy&lt;/em&gt; TLA+ bias. I've modeled all of these systems in TLA+ and it works pretty well for them. That is to say, none of these do things TLA+ is &lt;em&gt;bad&lt;/em&gt; at: reachability, subtyping, transitive closures, unbound spaces, etc. I imagine that as I cover more specification languages I'll find new Rosettas.&lt;/p&gt;
    &lt;div class="footnote"&gt;
    &lt;hr/&gt;
    &lt;ol&gt;
    &lt;li id="fn:release-notes"&gt;
    &lt;ul&gt;
    &lt;li&gt;Exercises are more compact, answers now show name of exercise in title&lt;/li&gt;
    &lt;/ul&gt;
    &lt;ul&gt;
    &lt;li&gt;"Conditionals" chapter has new section on nested conditionals&lt;/li&gt;
    &lt;/ul&gt;
    &lt;ul&gt;
    &lt;li&gt;"Crash course" chapter significantly rewritten&lt;/li&gt;
    &lt;li&gt;Started migrating to consistently use &lt;code&gt;==&lt;/code&gt; for equality and &lt;code&gt;=&lt;/code&gt; for definition. Not everything is migrated yet&lt;/li&gt;
    &lt;li&gt;"Beyond Logic" appendix does a &lt;em&gt;slightly&lt;/em&gt; better job of covering HOL and constructive logic&lt;/li&gt;
    &lt;li&gt;Addressed various reader feedback&lt;/li&gt;
    &lt;li&gt;Two new exercises&lt;/li&gt;
    &lt;/ul&gt;
    &lt;p&gt;&lt;a class="footnote-backref" href="#fnref:release-notes" title="Jump back to footnote 1 in the text"&gt;↩&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id="fn:alloy"&gt;
    &lt;p&gt;You can change the int size in a model run, so this is more "surprising footgun and inconvenience" than "fundamental limit of the specification language." Something still good to know! &lt;a class="footnote-backref" href="#fnref:alloy" title="Jump back to footnote 2 in the text"&gt;↩&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;/ol&gt;
    &lt;/div&gt;
    </description><pubDate>Wed, 15 Jan 2025 17:34:40 +0000</pubDate><guid>https://buttondown.com/hillelwayne/archive/what-are-the-rosettas-of-formal-specification/</guid></item><item><title>"Logic for Programmers" Project Update</title><link>https://buttondown.com/hillelwayne/archive/logic-for-programmers-project-update/</link><description>
    &lt;p&gt;Happy new year everyone!&lt;/p&gt;
    &lt;p&gt;I released the first &lt;em&gt;Logic for Programmers&lt;/em&gt; alpha six months ago. There have been four new versions since then, with the November release putting us in beta. Between work and holidays I didn't make much progress in December, but there will be a 0.6 release in the next week or two.&lt;/p&gt;
    &lt;p&gt;People have asked me if the book will ever be available in print, and my answer to that is "when it's done". To keep "when it's done" from being "never", I'm committing myself to &lt;strong&gt;have the book finished by July.&lt;/strong&gt; That means roughly six more releases between now and the official First Edition. Then I will start looking for a way to get it printed.&lt;/p&gt;
    &lt;h3&gt;The Current State and What Needs to be Done&lt;/h3&gt;
    &lt;p&gt;Right now the book is 26,000 words. For the most part, the structure is set— I don't plan to reorganize the chapters much. But I still need to fix shortcomings identified by the reader feedback. In particular, a few topics need more on real world applications, and the Alloy chapter is pretty weak. There's also a bunch of notes and todos and "fix this"s I need to go over.&lt;/p&gt;
    &lt;p&gt;I also need to rewrite the introduction and predicate logic chapters. Those haven't changed much since 0.1 and I need to go over them &lt;em&gt;very carefully&lt;/em&gt;.&lt;/p&gt;
    &lt;p&gt;After that comes copyediting.&lt;/p&gt;
    &lt;h4&gt;Ugh, Copyediting&lt;/h4&gt;
    &lt;p&gt;Copyediting means going through the entire book to make word- and sentence-level changes to the flow. An example would be changing&lt;/p&gt;
    &lt;table&gt;
    &lt;thead&gt;
    &lt;tr&gt;
    &lt;th&gt;From&lt;/th&gt;
    &lt;th&gt;To&lt;/th&gt;
    &lt;/tr&gt;
    &lt;/thead&gt;
    &lt;tbody&gt;
    &lt;tr&gt;
    &lt;td&gt;I said predicates are just “boolean functions”. That isn’t &lt;em&gt;quite&lt;/em&gt; true.&lt;/td&gt;
    &lt;td&gt;It's easy to think of predicates as just "boolean" functions, but there is a subtle and important difference.&lt;/td&gt;
    &lt;/tr&gt;
    &lt;/tbody&gt;
    &lt;/table&gt;
    &lt;p&gt;It's a tiny difference but it reads slightly better to me and makes the book slightly better. Now repeat that for all 3000-odd sentences in the book and I'm done with copyediting!&lt;/p&gt;
    &lt;p&gt;For the first pass, anyway. Copyediting is miserable. &lt;/p&gt;
    &lt;p&gt;Some of the changes I need to make come from reader feedback, but most will come from going through it line-by-line with a copyeditor. Someone's kindly offered to do some of this for free, but I want to find a professional too. If you know anybody, let me know.&lt;/p&gt;
    &lt;h4&gt;Formatting&lt;/h4&gt;
    &lt;p&gt;The book, if I'm being honest, looks ugly. I'm using the default sphinx/latex combination for layout and typesetting. My thinking is it's not worth making the book pretty until it's worth reading. But I also want the book, when it's eventually printed, to look &lt;em&gt;nice&lt;/em&gt;. At the very least it shouldn't have "self-published" vibes. &lt;/p&gt;
    &lt;p&gt;I've found someone who's been giving me excellent advice on layout and I'm slowly mastering the LaTeX formatting arcana. It's gonna take a few iterations to get things right.&lt;/p&gt;
    &lt;h4&gt;Front cover&lt;/h4&gt;
    &lt;p&gt;Currently the front cover is this:&lt;/p&gt;
    &lt;p&gt;&lt;img alt="Front cover" class="newsletter-image" src="https://assets.buttondown.email/images/b42ee3de-9d8a-4729-809e-a8739741f0cf.png?w=960&amp;fit=max"/&gt;&lt;/p&gt;
    &lt;p&gt;It works but gives "programmer spent ten minutes in Inkscape" vibes. I have a vision in my head for what would be nicer. A few people have recommended using Fiverr. So far the results haven't been that good.&lt;/p&gt;
    &lt;h4&gt;Fixing Epub&lt;/h4&gt;
    &lt;p&gt;&lt;em&gt;Ugh&lt;/em&gt;&lt;/p&gt;
    &lt;p&gt;I thought making an epub version would be kinder for phone reading, but it's such a painful format to develop for. Did you know that epub backlinks work totally differently on Kindle vs other ereaders? Did you know the only way to test if you got em working right is to load them up in a virtual Kindle? The feedback loops are miserable. So I've been treating epub as a second-class citizen for now and only fixing the &lt;em&gt;worst&lt;/em&gt; errors (like math not rendering properly), but that'll have to change as the book finalizes.&lt;/p&gt;
    &lt;h3&gt;What comes next?&lt;/h3&gt;
    &lt;p&gt;After 1.0, I get my book an ISBN and figure out how to make print copies. The margin on print is &lt;em&gt;way&lt;/em&gt; lower than ebooks, especially if it's on-demand: the net royalties for &lt;a href="https://kdp.amazon.com/en_US/help/topic/G201834330" target="_blank"&gt;Amazon direct publishing&lt;/a&gt; would be 7 dollars on a 20-dollar book (as opposed to Leanpub's 16 dollars). Would having a print version double the sales? I hope so! Either way, a lot of people have been asking about a print version so I want to make that possible.&lt;/p&gt;
    &lt;p&gt;(I also want to figure out how to give people who already have the ebook a discount on print, but I don't know if that's feasible.)&lt;/p&gt;
    &lt;p&gt;Then, I dunno, maybe make a talk or a workshop I can pitch to conferences. Once I have that I think I can call &lt;em&gt;LfP&lt;/em&gt; complete... at least until the second edition.&lt;/p&gt;
    &lt;hr/&gt;
    &lt;p&gt;Anyway none of that is actually technical so here's a quick fun thing. I spent a good chunk of my break reading the &lt;a href="https://www.mcrl2.org/web/index.html" target="_blank"&gt;mCRL2 book&lt;/a&gt;. mCRL2 defines an "algebra" for "communicating processes". As a very broad explanation, that's defining what it means to "add" and "multiply" two processes. What's interesting is that according to their definition, the algebra follows the distributive law, &lt;em&gt;but only if you multiply on the right&lt;/em&gt;. eg&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;// VALID
    (a+b)*c = a*c + b*c
    
    // INVALID
    a*(b+c) = a*b + a*c
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;This is the first time I've ever seen this in practice! Jury's still out on the rest of the language.&lt;/p&gt;
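    &lt;p&gt;One way to get a feel for why only one side distributes is a toy model in Python. This is my own simplification, not mCRL2's actual semantics: a process is a finite set of (action, remaining process) pairs, &lt;code&gt;+&lt;/code&gt; is a choice between first steps, &lt;code&gt;*&lt;/code&gt; is sequencing, and two processes count as equal when these nested sets are equal. All names here (&lt;code&gt;DONE&lt;/code&gt;, &lt;code&gt;action&lt;/code&gt;, &lt;code&gt;choice&lt;/code&gt;, &lt;code&gt;seq&lt;/code&gt;) are made up for the sketch.&lt;/p&gt;

```python
DONE = frozenset()  # the process with nothing left to do


def action(name):
    # A process that performs one action and then finishes.
    return frozenset({(name, DONE)})


def choice(p, q):
    # "p + q": offer every first step of either process.
    return p | q


def seq(p, q):
    # "p * q": run p to completion, then behave like q.
    if p == DONE:
        return q
    return frozenset((name, seq(rest, q)) for name, rest in p)


a, b, c = action("a"), action("b"), action("c")

right = seq(choice(a, b), c) == choice(seq(a, c), seq(b, c))
left = seq(a, choice(b, c)) == choice(seq(a, b), seq(a, c))
print(right, left)  # True False
```

    &lt;p&gt;In this model &lt;code&gt;a*(b+c)&lt;/code&gt; does &lt;code&gt;a&lt;/code&gt; and then still has a choice to make, while &lt;code&gt;a*b + a*c&lt;/code&gt; has to commit to a branch before doing &lt;code&gt;a&lt;/code&gt;, so the two are different processes.&lt;/p&gt;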
    &lt;hr/&gt;
    &lt;h3&gt;Videos and Stuff&lt;/h3&gt;
    &lt;ul&gt;
    &lt;li&gt;My &lt;em&gt;DDD Europe&lt;/em&gt; talk is now out! &lt;a href="https://www.youtube.com/watch?v=uRmNSuYBUOU" target="_blank"&gt;What We Know We Don't Know&lt;/a&gt; is about empirical software engineering in general, and software engineering research on Domain Driven Design in particular.&lt;/li&gt;
    &lt;li&gt;I was interviewed in the last video on &lt;a href="https://www.youtube.com/watch?v=yXxmSI9SlwM" target="_blank"&gt;Craft vs Cruft&lt;/a&gt;'s "Year of Formal Methods". Check it out!&lt;/li&gt;
    &lt;/ul&gt;
    </description><pubDate>Tue, 07 Jan 2025 18:49:40 +0000</pubDate><guid>https://buttondown.com/hillelwayne/archive/logic-for-programmers-project-update/</guid></item></channel></rss>
    Raw headers
    {
      "cf-cache-status": "DYNAMIC",
      "cf-ray": "980be2ee902fb98a-ORD",
      "connection": "keep-alive",
      "content-security-policy-report-only": "default-src 'self'; script-src 'self' 'unsafe-inline' https://static.addtoany.com https://embed.bsky.app https://platform.twitter.com https://www.tiktok.com https://embedr.flickr.com https://scripts.simpleanalyticscdn.com https://cdn.usefathom.com https://plausible.io https://www.googletagmanager.com https://cloud.umami.is https://connect.facebook.net https://www.instagram.com https://sniperl.ink https://cdn.tailwindcss.com https://challenges.cloudflare.com; style-src 'self' 'unsafe-inline' https:; img-src 'self' data: https: http: blob:; media-src 'self' data: https: http: blob:; font-src 'self' data: https:; frame-src https: blob:; connect-src 'self' https:; manifest-src 'self'; object-src 'none'; base-uri 'self'; form-action 'self'; report-uri https://o97520.ingest.us.sentry.io/api/6063581/security/?sentry_key=98d0ca1c1c554806b630fa9caf185b1f; report-to csp-endpoint",
      "content-type": "application/rss+xml; charset=utf-8",
      "cross-origin-opener-policy": "same-origin",
      "date": "Wed, 17 Sep 2025 22:02:12 GMT",
      "last-modified": "Wed, 10 Sep 2025 13:00:00 GMT",
      "nel": "{\"report_to\":\"heroku-nel\",\"response_headers\":[\"Via\"],\"max_age\":3600,\"success_fraction\":0.01,\"failure_fraction\":0.1}, {\"report_to\":\"heroku-nel\",\"response_headers\":[\"Via\"],\"max_age\":3600,\"success_fraction\":0.01,\"failure_fraction\":0.1}",
      "referrer-policy": "strict-origin-when-cross-origin",
      "report-to": "{\"group\":\"heroku-nel\",\"endpoints\":[{\"url\":\"https://nel.heroku.com/reports?s=YRSECVgoG0ktMX7lhTfNO3OYHOMqemMnqwahcV9vtvw%3D\\u0026sid=929419e7-33ea-4e2f-85f0-7d8b7cd5cbd6\\u0026ts=1758146531\"}],\"max_age\":3600}, {\"group\":\"heroku-nel\",\"endpoints\":[{\"url\":\"https://nel.heroku.com/reports?s=uRgUM3M%2FJHix%2F8fkdJMfIStDYnIAwsLw94nCGKlevcg%3D\\u0026sid=e11707d5-02a7-43ef-b45e-2cf4d2036f7d\\u0026ts=1758146531\"}],\"max_age\":3600}, {\"group\": \"csp-endpoint\", \"max_age\": 86400, \"endpoints\": [{\"url\": \"https://o97520.ingest.us.sentry.io/api/6063581/security/?sentry_key=98d0ca1c1c554806b630fa9caf185b1f\"}], \"include_subdomains\": true}",
      "reporting-endpoints": "heroku-nel=\"https://nel.heroku.com/reports?s=YRSECVgoG0ktMX7lhTfNO3OYHOMqemMnqwahcV9vtvw%3D&sid=929419e7-33ea-4e2f-85f0-7d8b7cd5cbd6&ts=1758146531\", heroku-nel=\"https://nel.heroku.com/reports?s=uRgUM3M%2FJHix%2F8fkdJMfIStDYnIAwsLw94nCGKlevcg%3D&sid=e11707d5-02a7-43ef-b45e-2cf4d2036f7d&ts=1758146531\", csp-endpoint=\"https://o97520.ingest.us.sentry.io/api/6063581/security/?sentry_key=98d0ca1c1c554806b630fa9caf185b1f\"",
      "server": "cloudflare",
      "set-cookie": "initial_path=\"/hillelwayne/rss\"; expires=Fri, 17 Oct 2025 22:02:12 GMT; Max-Age=2592000; Path=/",
      "transfer-encoding": "chunked",
      "vary": "Cookie, Host, origin, Accept-Encoding",
      "via": "1.1 heroku-router, 2.0 heroku-router",
      "x-content-type-options": "nosniff",
      "x-frame-options": "DENY"
    }
    Parsed with @rowanmanning/feed-parser
    {
      "meta": {
        "type": "rss",
        "version": "2.0"
      },
      "language": "en-us",
      "title": "Computer Things",
      "description": "Hi, I'm Hillel. This is the newsletter version of [my website](https://www.hillelwayne.com). I post all website updates here. I also post weekly content just for the newsletter, on topics like\n\n* Formal Methods\n\n* Software History and Culture\n\n* Fringetech and exotic tooling\n\n* The philosophy and theory of software engineering\n\nYou can see the archive of all public essays [here](https://buttondown.email/hillelwayne/archive/).",
      "copyright": null,
      "url": "https://buttondown.com/hillelwayne",
      "self": "https://buttondown.email/hillelwayne/rss",
      "published": null,
      "updated": "2025-09-10T13:00:00.000Z",
      "generator": null,
      "image": null,
      "authors": [],
      "categories": [],
      "items": [
        {
          "id": "https://buttondown.com/hillelwayne/archive/many-hard-leetcode-problems-are-easy-constraint/",
          "title": "Many Hard Leetcode Problems are Easy Constraint Problems",
          "description": "<p>In my first interview out of college I was asked the change counter problem:</p>\n<blockquote>\n<p>Given a set of coin denominations, find the minimum number of coins required to make change for a given number. IE for USA coinage and 37 cents, the minimum number is four (quarter, dime, 2 pennies).</p>\n</blockquote>\n<p>I implemented the simple greedy algorithm and immediately fell into the trap of the question: the greedy algorithm only works for \"well-behaved\" denominations. If the coin values were <code>[10, 9, 1]</code>, then making 37 cents would take 10 coins in the greedy algorithm but only 4 coins optimally (<code>10+9+9+9</code>). The \"smart\" answer is to use a dynamic programming algorithm, which I didn't know how to do. So I failed the interview.</p>\n<p>But you only need dynamic programming if you're writing your own algorithm. It's really easy if you throw it into a constraint solver like <a href=\"https://www.minizinc.org/\" target=\"_blank\">MiniZinc</a> and call it a day. </p>\n<div class=\"codehilite\"><pre><span></span><code>int: total;\narray[int] of int: values = [10, 9, 1];\narray[index_set(values)] of var 0..: coins;\n\nconstraint sum (c in index_set(coins)) (coins[c] * values[c]) == total;\nsolve minimize sum(coins);\n</code></pre></div>\n<p>You can try this online <a href=\"https://play.minizinc.dev/\" target=\"_blank\">here</a>. 
It'll give you a prompt to put in <code>total</code> and then give you successively-better solutions:</p>\n<div class=\"codehilite\"><pre><span></span><code>coins = [0, 0, 37];\n----------\ncoins = [0, 1, 28];\n----------\ncoins = [0, 2, 19];\n----------\ncoins = [0, 3, 10];\n----------\ncoins = [0, 4, 1];\n----------\ncoins = [1, 3, 0];\n----------\n</code></pre></div>\n<p class=\"empty-line\" style=\"height:16px; margin:0px !important;\"></p>\n<p>Lots of similar interview questions are this kind of mathematical optimization problem, where we have to find the maximum or minimum of a function corresponding to constraints. They're hard in programming languages because programming languages are too low-level. They are also exactly the problems that constraint solvers were designed to solve. Hard leetcode problems are easy constraint problems.<sup id=\"fnref:leetcode\"><a class=\"footnote-ref\" href=\"#fn:leetcode\">1</a></sup> Here I'm using MiniZinc, but you could just as easily use Z3 or OR-Tools or whatever your favorite generalized solver is.</p>\n<h3>More examples</h3>\n<p>This was a question in a different interview (which I thankfully passed):</p>\n<blockquote>\n<p>Given a list of stock prices through the day, find maximum profit you can get by buying one stock and selling one stock later.</p>\n</blockquote>\n<p>It's easy to do in O(n^2) time, or if you are clever, you can do it in O(n). Or you could be not clever at all and just write it as a constraint problem:</p>\n<div class=\"codehilite\"><pre><span></span><code>array[int] of int: prices = [3, 1, 4, 1, 5, 9, 2, 6, 5, 3, 5, 8];\nvar int: buy;\nvar int: sell;\nvar int: profit = prices[sell] - prices[buy];\n\nconstraint sell > buy;\nconstraint profit > 0;\nsolve maximize profit;\n</code></pre></div>\n<p>Reminder, link to trying it online <a href=\"https://play.minizinc.dev/\" target=\"_blank\">here</a>. 
While working at that job, one interview question we tested out was:</p>\n<blockquote>\n<p>Given a list, determine if three numbers in that list can be added or subtracted to give 0? </p>\n</blockquote>\n<p>This is a satisfaction problem, not an optimization problem: we don't need the \"best answer\", any answer will do. We eventually decided against it for being too tricky for the engineers we were targeting. But it's not tricky in a solver:</p>\n<div class=\"codehilite\"><pre><span></span><code>include \"globals.mzn\";\narray[int] of int: numbers = [3, 1, 4, 1, 5, 9, 2, 6, 5, 3, 5, 8];\narray[index_set(numbers)] of var {0, -1, 1}: choices;\n\nconstraint sum(n in index_set(numbers)) (numbers[n] * choices[n]) = 0;\nconstraint count(choices, -1) + count(choices, 1) = 3;\nsolve satisfy;\n</code></pre></div>\n<p>Okay, one last one, a problem I saw last year at <a href=\"https://chicagopython.github.io/algosig/\" target=\"_blank\">Chipy AlgoSIG</a>. Basically they pick some leetcode problems and we all do them. 
I failed to solve <a href=\"https://leetcode.com/problems/largest-rectangle-in-histogram/description/\" target=\"_blank\">this one</a>:</p>\n<blockquote>\n<p>Given an array of integers heights representing the histogram's bar height where the width of each bar is 1, return the area of the largest rectangle in the histogram.</p>\n<p><img alt=\"example from leetcode link\" class=\"newsletter-image\" src=\"https://assets.buttondown.email/images/63337f78-7138-4b21-87a0-917c0c5b1706.jpg?w=960&fit=max\"/></p>\n</blockquote>\n<p>The \"proper\" solution is a tricky thing involving tracking lots of bookkeeping states, which you can completely bypass by expressing it as constraints:</p>\n<div class=\"codehilite\"><pre><span></span><code>array[int] of int: numbers = [2,1,5,6,2,3];\n\nvar 1..length(numbers): x; \nvar 0..length(numbers)-1: dx;\nvar 1..: y;\n\nconstraint x + dx <= length(numbers);\nconstraint forall (i in x..(x+dx)) (y <= numbers[i]);\n\nvar int: area = (dx+1)*y;\nsolve maximize area;\n\noutput [\"(\\(x)->\\(x+dx))*\\(y) = \\(area)\"]\n</code></pre></div>\n<p>There's even a way to <a href=\"https://docs.minizinc.dev/en/2.9.3/visualisation.html\" target=\"_blank\">automatically visualize the solution</a> (using <code>vis_geost_2d</code>), but I didn't feel like figuring it out in time for the newsletter.</p>\n<h3>Is this better?</h3>\n<p>Now if I actually brought these questions to an interview the interviewee could ruin my day by asking \"what's the runtime complexity?\" Constraint solver runtimes are unpredictable and almost always slower than an ideal bespoke algorithm because they are more expressive, in what I refer to as the <a href=\"https://buttondown.com/hillelwayne/archive/the-capability-tractability-tradeoff/\" target=\"_blank\">capability/tractability tradeoff</a>. 
But even so, they'll do way better than a <em>bad</em> bespoke algorithm, and I'm not experienced enough in handwriting algorithms to consistently beat a solver.</p>\n<p>The real advantage of solvers, though, is how well they handle new constraints. Take the stock picking problem above. I can write an O(n²) algorithm in a few minutes and the O(n) algorithm if you give me some time to think. Now change the problem to</p>\n<blockquote>\n<p>Maximize the profit by buying and selling up to <code>max_sales</code> stocks, but you can only buy or sell one stock at a given time and you can only hold up to <code>max_hold</code> stocks at a time?</p>\n</blockquote>\n<p>That's a way harder problem to write even an inefficient algorithm for! While the constraint problem is only a tiny bit more complicated:</p>\n<div class=\"codehilite\"><pre><span></span><code>include \"globals.mzn\";\nint: max_sales = 3;\nint: max_hold = 2;\narray[int] of int: prices = [3, 1, 4, 1, 5, 9, 2, 6, 5, 3, 5, 8];\narray [1..max_sales] of var int: buy;\narray [1..max_sales] of var int: sell;\narray [index_set(prices)] of var 0..max_hold: stocks_held;\nvar int: profit = sum(s in 1..max_sales) (prices[sell[s]] - prices[buy[s]]);\n\nconstraint forall (s in 1..max_sales) (sell[s] > buy[s]);\nconstraint profit > 0;\n\nconstraint forall(i in index_set(prices)) (stocks_held[i] = (count(s in 1..max_sales) (buy[s] <= i) - count(s in 1..max_sales) (sell[s] <= i)));\nconstraint alldifferent(buy ++ sell);\nsolve maximize profit;\n\noutput [\"buy at \\(buy)\\n\", \"sell at \\(sell)\\n\", \"for \\(profit)\"];\n</code></pre></div>\n<p class=\"empty-line\" style=\"height:16px; margin:0px !important;\"></p>\n<p>Most constraint solving examples online are puzzles, like <a href=\"https://docs.minizinc.dev/en/stable/modelling2.html#ex-sudoku\" target=\"_blank\">Sudoku</a> or \"<a href=\"https://docs.minizinc.dev/en/stable/modelling2.html#ex-smm\" target=\"_blank\">SEND + MORE = MONEY</a>\". 
Solving leetcode problems would be a more interesting demonstration. And you get more interesting opportunities to teach optimizations, like symmetry breaking.</p>\n<hr/>\n<h3>Update for the Internet</h3>\n<p>This was sent as a weekly newsletter, which is usually on topics like <a href=\"https://buttondown.com/hillelwayne/archive/why-do-we-call-it-boilerplate-code\" target=\"_blank\">software history</a>, <a href=\"https://buttondown.com/hillelwayne/archive/the-seven-specification-ur-languages/\" target=\"_blank\">formal methods</a>, <a href=\"https://buttondown.com/hillelwayne/archive/i-formally-modeled-dreidel-for-no-good-reason/\" target=\"_blank\">unusual technologies</a>, and the <a href=\"https://buttondown.com/hillelwayne/archive/be-suspicious-of-success/\" target=\"_blank\">theory of software engineering</a>. You can subscribe here: </p>\n<div class=\"subscribe-form\"></div>\n<div class=\"footnote\">\n<hr/>\n<ol>\n<li id=\"fn:leetcode\">\n<p>Because my dad will email me if I don't explain this: \"leetcode\" is slang for \"tricky algorithmic interview questions that have little-to-no relevance in the actual job you're interviewing for.\" It's from <a href=\"https://leetcode.com/\" target=\"_blank\">leetcode.com</a>. <a class=\"footnote-backref\" href=\"#fnref:leetcode\" title=\"Jump back to footnote 1 in the text\">↩</a></p>\n</li>\n</ol>\n</div>",
          "url": "https://buttondown.com/hillelwayne/archive/many-hard-leetcode-problems-are-easy-constraint/",
          "published": "2025-09-10T13:00:00.000Z",
          "updated": "2025-09-10T13:00:00.000Z",
          "content": null,
          "image": null,
          "media": [],
          "authors": [],
          "categories": []
        },
        {
          "id": "https://buttondown.com/hillelwayne/archive/the-angels-and-demons-of-nondeterminism/",
          "title": "The Angels and Demons of Nondeterminism",
          "description": "<p>Greetings everyone! You might have noticed that it's September and I don't have the next version of <em>Logic for Programmers</em> ready. As penance, <a href=\"https://leanpub.com/logic/c/september-2025-kuBCrhBnUzb7\" target=\"_blank\">here's ten free copies of the book</a>.</p>\n<p>So a few months ago I wrote <a href=\"https://buttondown.com/hillelwayne/archive/five-kinds-of-nondeterminism/\" target=\"_blank\">a newsletter</a> about how we use nondeterminism in formal methods.  The overarching idea:</p>\n<ol>\n<li>Nondeterminism is when multiple paths are possible from a starting state.</li>\n<li>A system preserves a property if it holds on <em>all</em> possible paths. If even one path violates the property, then we have a bug.</li>\n</ol>\n<p>An intuitive model of this is that when faced with a nondeterministic choice, the system always makes the <em>worst possible choice</em>. This is sometimes called <strong>demonic nondeterminism</strong> and is favored in formal methods because we are paranoid to a fault.</p>\n<p>The opposite would be <strong>angelic nondeterminism</strong>, where the system always makes the <em>best possible choice</em>. A property then holds if <em>any</em> possible path satisfies that property.<sup id=\"fnref:duals\"><a class=\"footnote-ref\" href=\"#fn:duals\">1</a></sup> This is not as common in FM, but it still has its uses! 
\"Players can access the secret level\" or \"<a href=\"https://www.hillelwayne.com/post/safety-and-liveness/#other-properties\" target=\"_blank\">We can always shut down the computer</a>\" are <strong>reachability</strong> properties: claims that something is possible even if it is not actually done.</p>\n<p>In broader computer science research, I'd say that angelic nondeterminism is more popular, due to its widespread use in complexity analysis and programming languages.</p>\n<h3>Complexity Analysis</h3>\n<p>P is the set of all \"decision problems\" (<em>basically</em>, boolean functions) that can be solved in polynomial time: there's an algorithm that's worst-case in <code>O(n)</code>, <code>O(n²)</code>, <code>O(n³)</code>, etc.<sup id=\"fnref:big-o\"><a class=\"footnote-ref\" href=\"#fn:big-o\">2</a></sup>  NP is the set of all problems that can be solved in polynomial time by an algorithm with <em>angelic nondeterminism</em>.<sup id=\"fnref:TM\"><a class=\"footnote-ref\" href=\"#fn:TM\">3</a></sup> For example, the question \"does list <code>l</code> contain <code>x</code>\" can be solved in O(1) time by a nondeterministic algorithm:</p>\n<div class=\"codehilite\"><pre><span></span><code>fun is_member(l: List[T], x: T): bool {\n  if l == [] {return false};\n\n  guess i in 0..<len(l);\n  return l[i] == x;\n}\n</code></pre></div>\n<p>Say we call <code>is_member([a, b, c, d], c)</code>. The best possible choice would be to guess <code>i = 2</code>, which would correctly return true. Now call <code>is_member([a, b], d)</code>. No matter what we guess, the algorithm correctly returns false. Ergo, O(1). NP stands for \"Nondeterministic Polynomial\". 
</p>\n<p>(And I just now realized something pretty cool: you can say that P is the set of all problems solvable in polynomial time under <em>demonic nondeterminism</em>, which is a nice parallel between the two classes.)</p>\n<p>Computer scientists have proven that angelic nondeterminism doesn't give us any more \"power\": there are no problems solvable with AN that aren't also solvable deterministically. The big question is whether AN is more <em>efficient</em>: it is widely believed, but not <em>proven</em>, that there are problems in NP but not in P. Most famously, \"Is there any variable assignment that makes this boolean formula true?\" A polynomial AN algorithm is again easy:</p>\n<div class=\"codehilite\"><pre><span></span><code>fun SAT(f(x_1, x_2, …: bool): bool): bool {\n   N = num_params(f)\n   for i in 1..=N {\n     guess x_i in {true, false}\n   }\n\n   return f(x_1, x_2, …)\n}\n</code></pre></div>\n<p>The best deterministic algorithms we have to solve the same problem are worst-case exponential with the number of boolean parameters. This is a real frustrating problem because real computers don't have angelic nondeterminism, so problems like SAT remain hard. We can solve most \"well-behaved\" instances of the problem <a href=\"https://www.hillelwayne.com/post/np-hard/\" target=\"_blank\">in reasonable time</a>, but the worst-case instances get intractable real fast.</p>\n<h3>Means of Abstraction</h3>\n<div class=\"subscribe-form\"></div>\n<p>We can directly turn an AN algorithm into a (possibly much slower) deterministic algorithm, such as by <a href=\"https://en.wikipedia.org/wiki/Backtracking\" target=\"_blank\">backtracking</a>. This makes AN a pretty good abstraction over what an algorithm is doing. Does the regex <code>(a+b)\\1+</code> match \"abaabaabaab\"? Yes, if the regex engine nondeterministically guesses that it needs to start at the third letter and make the group <code>aab</code>. 
How does my PL's regex implementation find that match? I dunno, backtracking or <a href=\"https://swtch.com/~rsc/regexp/regexp1.html\" target=\"_blank\">NFA construction</a> or something, I don't need to know the deterministic specifics in order to use the nondeterministic abstraction.</p>\n<p>Neel Krishnaswami has <a href=\"https://semantic-domain.blogspot.com/2013/07/what-declarative-languages-are.html\" target=\"_blank\">a great definition of 'declarative language'</a>: \"any language with a semantics has some nontrivial existential quantifiers in it\". I'm not sure if this is <em>identical</em> to saying \"a language with an angelic nondeterministic abstraction\", but they must be pretty close, and all of his examples match:</p>\n<ul>\n<li>SQL's selects and joins</li>\n<li>Parsing DSLs</li>\n<li>Logic programming's unification</li>\n<li>Constraint solving</li>\n</ul>\n<p>On top of that I'd add CSS selectors and <a href=\"https://www.hillelwayne.com/post/picat/\" target=\"_blank\">planner's actions</a>; all nondeterministic abstractions over a deterministic implementation. He also says that the things programmers hate most in declarative languages are features \"that expose the operational model\": constraint solver search strategies, Prolog cuts, regex backreferences, etc. Which again matches my experiences with angelic nondeterminism: I dread features that force me to understand the deterministic implementation. 
But they're necessary, since P probably != NP and so we need to worry about operational optimizations.</p>\n<h3>Eldritch Nondeterminism</h3>\n<p>If you need to know the <a href=\"https://en.wikipedia.org/wiki/PP_(complexity)\" target=\"_blank\">ratio of good/bad paths</a>, <a href=\"https://en.wikipedia.org/wiki/%E2%99%AFP\" target=\"_blank\">the number of good paths</a>, or probability, or anything more than \"there is a good path\" or \"there is a bad path\", you are beyond the reach of heaven or hell.</p>\n<div class=\"footnote\">\n<hr/>\n<ol>\n<li id=\"fn:duals\">\n<p>Angelic and demonic nondeterminism are <a href=\"https://buttondown.com/hillelwayne/archive/logical-duals-in-software-engineering/\" target=\"_blank\">duals</a>: angelic returns \"yes\" if <code>some choice: correct</code> and demonic returns \"no\" if <code>!all choice: correct</code>, which is the same as <code>some choice: !correct</code>. <a class=\"footnote-backref\" href=\"#fnref:duals\" title=\"Jump back to footnote 1 in the text\">↩</a></p>\n</li>\n<li id=\"fn:big-o\">\n<p>Pet peeve about Big-O notation: <code>O(n²)</code> is the <em>set</em> of all algorithms that, for sufficiently large problem sizes, grow no faster than quadratically. \"Bubblesort has <code>O(n²)</code> complexity\" <em>should</em> be written <code>Bubblesort in O(n²)</code>, <em>not</em> <code>Bubblesort = O(n²)</code>. <a class=\"footnote-backref\" href=\"#fnref:big-o\" title=\"Jump back to footnote 2 in the text\">↩</a></p>\n</li>\n<li id=\"fn:TM\">\n<p>To be precise, solvable in polynomial time by a <em>Nondeterministic Turing Machine</em>, a very particular model of computation. We can broadly talk about P and NP without framing everything in terms of Turing machines, but some details of complexity classes (like the existence of \"weak NP-hardness\") kinda need Turing machines to make sense. <a class=\"footnote-backref\" href=\"#fnref:TM\" title=\"Jump back to footnote 3 in the text\">↩</a></p>\n</li>\n</ol>\n</div>",
          "url": "https://buttondown.com/hillelwayne/archive/the-angels-and-demons-of-nondeterminism/",
          "published": "2025-09-04T14:00:00.000Z",
          "updated": "2025-09-04T14:00:00.000Z",
          "content": null,
          "image": null,
          "media": [],
          "authors": [],
          "categories": []
        },
        {
          "id": "https://buttondown.com/hillelwayne/archive/logical-duals-in-software-engineering/",
          "title": "Logical Duals in Software Engineering",
          "description": "<p>(<a href=\"https://buttondown.com/hillelwayne/archive/sapir-whorf-does-not-apply-to-programming/\" target=\"_blank\">Last week's newsletter</a> took too long and I'm way behind on <a href=\"https://leanpub.com/logic/\" target=\"_blank\">Logic for Programmers</a> revisions so short one this time.<sup id=\"fnref:retread\"><a class=\"footnote-ref\" href=\"#fn:retread\">1</a></sup>)</p>\n<p>In classical logic, two operators <code>F/G</code> are <strong>duals</strong> if <code>F(x) = !G(!x)</code>. Three examples:</p>\n<ol>\n<li><code>x || y</code> is the same as <code>!(!x && !y)</code>.</li>\n<li><code><>P</code> (\"P is possibly true\") is the same as <code>![]!P</code> (\"not P isn't definitely true\").</li>\n<li><code>some x in set: P(x)</code> is the same as <code>!(all x in set: !P(x))</code>.</li>\n</ol>\n<p>(1) is just a version of De Morgan's Law, which we regularly use to simplify boolean expressions. (2) is important in modal logic but has niche applications in software engineering, mostly in how it powers various formal methods.<sup id=\"fnref:fm\"><a class=\"footnote-ref\" href=\"#fn:fm\">2</a></sup> The real interesting one is (3), the \"quantifier duals\". We use lots of software tools to either <em>find</em> a value satisfying <code>P</code> or <em>check</em> that all values satisfy <code>P</code>. And by duality, any tool that does one can do the other, by seeing if it <em>fails</em> to find/check <code>!P</code>. Some examples in the wild:</p>\n<ul>\n<li>Z3 is used to solve mathematical constraints, like \"find x, where <code>f(x) >= 0</code>\". If I want to prove a property like \"f is always positive\", I ask z3 to solve \"find x, where <code>!(f(x) >= 0)</code>\", and see if that is unsatisfiable. This use case powers a LOT of theorem provers and formal verification tooling.</li>\n<li>Property testing checks that all inputs to a code block satisfy a property. 
I've used it to generate complex inputs with certain properties by checking that all inputs <em>don't</em> satisfy the property and reading out the test failure.</li>\n<li>Model checkers check that all behaviors of a specification satisfy a property, so we can find a behavior that reaches a goal state G by checking that all states are <code>!G</code>. <a href=\"https://github.com/tlaplus/Examples/blob/master/specifications/DieHard/DieHard.tla\" target=\"_blank\">Here's TLA+ solving a puzzle this way</a>.<sup id=\"fnref:antithesis\"><a class=\"footnote-ref\" href=\"#fn:antithesis\">3</a></sup></li>\n<li>Planners find behaviors that reach a goal state, so we can check if all behaviors satisfy a property P by asking it to reach goal state <code>!P</code>.</li>\n<li>The problem \"find the shortest <a href=\"https://en.wikipedia.org/wiki/Travelling_salesman_problem\" target=\"_blank\">traveling salesman route</a>\" can be broken into <code>some route: distance(route) = n</code> and <code>all route: !(distance(route) < n)</code>. Then a route finder can find the first, and then convert the second into a <code>some</code> and <em>fail</em> to find it, proving <code>n</code> is optimal.</li>\n</ul>\n<p>Even cooler to me is when a tool does <em>both</em> finding and checking, but gives them different \"meanings\". In SQL, <code>some x: P(x)</code> is true if we can <em>query</em> for <code>P(x)</code> and get a nonempty response, while <code>all x: P(x)</code> is true if all records satisfy the <code>P(x)</code> <em>constraint</em>. Most SQL databases allow for complex queries but not complex constraints! You got <code>UNIQUE</code>, <code>NOT NULL</code>, <code>REFERENCES</code>, which are fixed predicates, and <code>CHECK</code>, which is one-record only.<sup id=\"fnref:check\"><a class=\"footnote-ref\" href=\"#fn:check\">4</a></sup></p>\n<p>Oh, and you got database triggers, which can run arbitrary queries and throw exceptions. 
So if you really need to enforce a complex constraint <code>P(x, y, z)</code>, you put in a database trigger that queries <code>some x, y, z: !P(x, y, z)</code> and throws an exception if it finds any results. That all works because of quantifier duality! See <a href=\"https://eddmann.com/posts/maintaining-invariant-constraints-in-postgresql-using-trigger-functions/\" target=\"_blank\">here</a> for an example of this in practice.</p>\n<h3>Duals more broadly</h3>\n<p>\"Dual\" doesn't have a strict meaning in math, it's more of a vibe thing where all of the \"duals\" are kinda similar in meaning but don't strictly follow all of the same rules. <em>Usually</em> things X and Y are duals if there is some transform <code>F</code> where <code>X = F(Y)</code> and <code>Y = F(X)</code>, but not always. Maybe the category theorists have a formal definition that covers all of the different uses. Usually duals switch properties of things, too: an example showing <code>some x: P(x)</code> becomes a <em>counterexample</em> of <code>all x: !P(x)</code>.</p>\n<p>Under this definition, I think the dual of a list <code>l</code> could be <code>reverse(l)</code>. The first element of <code>l</code> becomes the last element of <code>reverse(l)</code>, the last becomes the first, etc. A more interesting case is the dual of a <code>K -> set(V)</code> map is the <code>V -> set(K)</code> map. IE the dual of <code>lived_in_city = {alice: {paris}, bob: {detroit}, charlie: {detroit, paris}}</code> is <code>city_lived_in_by = {paris: {alice, charlie}, detroit: {bob, charlie}}</code>. 
This preserves the property that <code>x in map[y] <=> y in dual[x]</code>.</p>\n<p class=\"empty-line\" style=\"height:16px; margin:0px !important;\"></p>\n<div class=\"footnote\">\n<hr/>\n<ol>\n<li id=\"fn:retread\">\n<p>And after writing this I just realized this is a partial retread of a newsletter I wrote <a href=\"https://buttondown.com/hillelwayne/archive/logical-quantifiers-in-software/\" target=\"_blank\">a couple months ago</a>. But only a <em>partial</em> retread! <a class=\"footnote-backref\" href=\"#fnref:retread\" title=\"Jump back to footnote 1 in the text\">↩</a></p>\n</li>\n<li id=\"fn:fm\">\n<p>Specifically \"linear temporal logics\" are modal logics, so <code>eventually P</code> (\"P is true in at least one state of each behavior\") is the same as saying <code>!always !P</code> (\"not P isn't true in all states of all behaviors\"). This is the basis of <a href=\"https://www.hillelwayne.com/post/safety-and-liveness/\" target=\"_blank\">liveness checking</a>. <a class=\"footnote-backref\" href=\"#fnref:fm\" title=\"Jump back to footnote 2 in the text\">↩</a></p>\n</li>\n<li id=\"fn:antithesis\">\n<p>I don't know for sure, but my best guess is that Antithesis does something similar <a href=\"https://antithesis.com/blog/tag/games/\" target=\"_blank\">when their fuzzer beats videogames</a>. They're doing fuzzing, not model checking, but they have the same purpose: checking that complex state spaces don't have bugs. Making the bug \"we can't reach the end screen\" can make a fuzzer output a complete end-to-end run of the game. Obvs a lot more complicated than that but that's the general idea at least. <a class=\"footnote-backref\" href=\"#fnref:antithesis\" title=\"Jump back to footnote 3 in the text\">↩</a></p>\n</li>\n<li id=\"fn:check\">\n<p>For <code>CHECK</code> to constrain multiple records you would need to use a subquery. Core SQL does not support subqueries in check. 
It is an optional database \"feature outside of core SQL\" (F671), which <a href=\"https://www.postgresql.org/docs/current/unsupported-features-sql-standard.html\" target=\"_blank\">Postgres does not support</a>. <a class=\"footnote-backref\" href=\"#fnref:check\" title=\"Jump back to footnote 4 in the text\">↩</a></p>\n</li>\n</ol>\n</div>",
          "url": "https://buttondown.com/hillelwayne/archive/logical-duals-in-software-engineering/",
          "published": "2025-08-27T19:25:32.000Z",
          "updated": "2025-08-27T19:25:32.000Z",
          "content": null,
          "image": null,
          "media": [],
          "authors": [],
          "categories": []
        },
        {
          "id": "https://buttondown.com/hillelwayne/archive/sapir-whorf-does-not-apply-to-programming/",
          "title": "Sapir-Whorf does not apply to Programming Languages",
          "description": "<p><em>This one is a hot mess but it's too late in the week to start over. Oh well!</em></p>\n<p>Someone recognized me at last week's <a href=\"https://www.chipy.org/\" target=\"_blank\">Chipy</a> and asked for my opinion on the Sapir-Whorf hypothesis in programming languages. I thought this was interesting enough to make a newsletter. First what it is, then why it <em>looks</em> like it applies, and then why it doesn't apply after all.</p>\n<h3>The Sapir-Whorf Hypothesis</h3>\n<blockquote>\n<p>We dissect nature along lines laid down by our native language. — <a href=\"https://web.mit.edu/allanmc/www/whorf.scienceandlinguistics.pdf\" target=\"_blank\">Whorf</a></p>\n</blockquote>\n<p>To quote from a <a href=\"https://www.amazon.com/Linguistics-Complete-Introduction-Teach-Yourself/dp/1444180320\" target=\"_blank\">Linguistics book I've read</a>, the hypothesis is that \"an individual's fundamental perception of reality is moulded by the language they speak.\" As a massive oversimplification, if English did not have a word for \"rebellion\", we would not be able to conceive of rebellion. This view, now called <a href=\"https://en.wikipedia.org/wiki/Linguistic_determinism\" target=\"_blank\">Linguistic Determinism</a>, is mostly rejected by modern linguists.</p>\n<p>The \"weak\" form of SWH is that the language we speak influences, but does not <em>decide</em> our cognition. <a href=\"https://langcog.stanford.edu/papers/winawer2007.pdf\" target=\"_blank\">For example</a>, Russian has distinct words for \"light blue\" and \"dark blue\", so Russian speakers can discriminate between \"light blue\" and \"dark blue\" shades faster than they can discriminate two \"light blue\" shades. English does not have distinct words, so we discriminate those at the same speed. This <strong>linguistic relativism</strong> seems to have lots of empirical support in studies, but mostly with \"small indicators\". 
I don't think there's anything that convincingly shows linguistic relativism having effects on a societal level.<sup id=\"fnref:economic-behavior\"><a class=\"footnote-ref\" href=\"#fn:economic-behavior\">1</a></sup></p>\n<p>The weak form of SWH for software would then be that \"the programming languages you know affect how you think about programs.\"</p>\n<h3>SWH in software</h3>\n<p>This seems like a natural fit, as different paradigms solve problems in different ways. Consider the <a href=\"https://hadid.dev/posts/living-coding/\" target=\"_blank\">hardest interview question ever</a>, \"given a list of integers, sum the even numbers\". Here it is in four paradigms:</p>\n<ul>\n<li>Procedural: <code>total = 0; foreach x in list {if IsEven(x) total += x}</code>. You iterate over data with an algorithm.</li>\n<li>Functional: <code>reduce(+, filter(IsEven, list), 0)</code>. You apply transformations to data to get a result.</li>\n<li>Array: <code>+ fold L * iseven L</code>.<sup id=\"fnref:J\"><a class=\"footnote-ref\" href=\"#fn:J\">2</a></sup> In English: replace every element in L with 0 if odd and 1 if even, multiply the new array elementwise against <code>L</code>, and then sum the resulting array. It's like functional except everything is in terms of whole-array transformations.</li>\n<li>Logical: Somethingish like <code>sumeven(0, []). sumeven(X, [Y|L]) :- iseven(Y) -> sumeven(Z, L), X is Y + Z ; sumeven(X, L)</code>. You write a set of equations that express what it means for X to <em>be</em> the sum of evens of L.</li>\n</ul>\n<p>There are some similarities between how these paradigms approach the problem, but each is also unique. It's plausible that where a procedural programmer \"sees\" a for loop, a functional programmer \"sees\" a map and an array programmer \"sees\" a singular operator.</p>\n<p>I also have a personal experience with how a language changed the way I think. 
I use <a href=\"https://learntla.com/\" target=\"_blank\">TLA+</a> to detect concurrency bugs in software designs. After doing this for several years, I've gotten much better at intuitively seeing race conditions in things even <em>without</em> writing a TLA+ spec. It's even leaked out into my day-to-day life. I see concurrency bugs everywhere. Phone tag is a race condition.</p>\n<p>But I still don't think SWH is the right mental model to use, for one big reason: language is <em>special</em>. We think in language, we dream in language, there are huge parts of our brain dedicated to processing language. <a href=\"https://web.eecs.umich.edu/~weimerw/p/weimer-icse2017-preprint.pdf\" target=\"_blank\">We don't use those parts of our brain to read code</a>. </p>\n<p>SWH is so intriguing because it seems so unnatural, that the way we express thoughts changes the way we <em>think</em> thoughts. That I would be a different person if I was bilingual in Spanish, not because the life experiences it would open up but because <a href=\"https://en.wikipedia.org/wiki/Grammatical_gender\" target=\"_blank\">grammatical gender</a> would change my brain.</p>\n<p>Compared to that, the idea that programming languages affect our brain is more natural and has a simpler explanation:</p>\n<p>It's the goddamned <a href=\"https://en.wikipedia.org/wiki/Tetris_effect\" target=\"_blank\">Tetris Effect</a>.</p>\n<h3>The Goddamned Tetris Effect</h3>\n<div class=\"subscribe-form\"></div>\n<blockquote>\n<p>The Tetris effect occurs when someone dedicates vast amounts of time, effort and concentration on an activity which thereby alters their thoughts, dreams, and other experiences not directly linked to said activity. — Wikipedia</p>\n</blockquote>\n<p>Every skill does this. I'm a juggler, so every item I can see right now has a tiny metadata field of \"how would this tumble if I threw it up\". I teach professionally, so I'm always noticing good teaching examples everywhere. 
I spent years writing specs in TLA+ and watching the model checker throw concurrency errors in my face, so now race conditions have a visceral presence. Every skill does this. </p>\n<p>And to really develop a skill, you gotta practice. This is where I think programming paradigms do something especially interesting that makes them feel more Sapir-Whorfy than, like, juggling. Some languages mix lots of different paradigms, like JavaScript or Rust. Others like Haskell really focus on <em>excluding</em> paradigms. If something is easy for you in procedural and hard in FP, in JS you could just lean on the procedural bits. In Haskell, <em>too bad</em>, you're learning how to do it the functional way.<sup id=\"fnref:escape-hatch\"><a class=\"footnote-ref\" href=\"#fn:escape-hatch\">3</a></sup></p>\n<p>And that forces you to practice, which makes you see functional patterns everywhere. Tetris effect!</p>\n<p>Anyway this may all seem like quibbling: why does it matter whether we call it \"Tetris effect\" or \"Sapir-Whorf\", if our brains get rewired either way? For me, personally, it's because SWH sounds really special and <em>unique</em>, while Tetris effect sounds mundane and commonplace. Which it <em>is</em>. But also because TE suggests it's not just programming languages that affect how we think about software, it's <em>everything</em>. Spending lots of time debugging, profiling, writing exploits, whatever will change what you notice, what you think a program \"is\". And that's a way useful idea that shouldn't be restricted to just PLs.</p>\n<p>(Then again, the Tetris Effect might also be a bad analogy to what's going on here, because I think part of it is that it wears off after a while. Maybe it's just \"building a mental model is good\".)</p>\n<h3>I just realized all of this might have missed the point</h3>\n<p>Wait are people actually using SWH to mean the <em>weak form</em> or the <em>strong</em> form? 
Like that if a language doesn't make something possible, its users can't conceive of it being possible. I've been arguing against the weaker form in software but I think I've seen strong form often too. Dammit.</p>\n<p>Well, it's already Thursday and far too late to rewrite the whole newsletter, so I'll just outline the problem with the strong form: we describe the capabilities of our programming languages <em>with human language</em>. In college I wrote a lot of crappy physics lab C++ and one of my projects was filled with comments like \"man I hate copying this triply-nested loop in 10 places with one-line changes, I wish I could put it in one function and just take the changing line as a parameter\". Even if I hadn't <em>encountered</em> higher-order functions, I was still perfectly capable of expressing the idea. So if the strong SWH isn't true for human language, it's not true for programming languages either.</p>\n<p class=\"empty-line\" style=\"height:16px; margin:0px !important;\"></p>\n<hr/>\n<h1>Systems Distributed talk now up!</h1>\n<p><a href=\"https://www.youtube.com/watch?v=d9cM8f_qSLQ\" target=\"_blank\">Link here</a>! Original abstract:</p>\n<blockquote>\n<p>Building correct distributed systems takes thinking outside the box, and the fastest way to do that is to think inside a different box. One different box is \"formal methods\", the discipline of mathematically verifying software and systems. Formal methods encourages unusual perspectives on systems, models that are also broadly useful to all software developers. 
In this talk we will learn two of the most important FM perspectives: the abstract specifications behind software systems, and the property they are and aren't supposed to have.</p>\n</blockquote>\n<p>The talk ended up evolving away from that abstract but I like how it turned out!</p>\n<div class=\"footnote\">\n<hr/>\n<ol>\n<li id=\"fn:economic-behavior\">\n<p>There is <a href=\"https://www.anderson.ucla.edu/faculty/keith.chen/papers/LanguageWorkingPaper.pdf\" target=\"_blank\">one paper</a> arguing that people who speak a language that doesn't have a \"future tense\" are more likely to save and eat healthy, but it is... <a href=\"https://www.reddit.com/r/linguistics/comments/rcne7m/comment/hnz2705/\" target=\"_blank\">extremely questionable</a>. <a class=\"footnote-backref\" href=\"#fnref:economic-behavior\" title=\"Jump back to footnote 1 in the text\">↩</a></p>\n</li>\n<li id=\"fn:J\">\n<p>The original J is <code>+/ (* (0 =  2&|))</code>. Obligatory <a href=\"https://www.jsoftware.com/papers/tot.htm\" target=\"_blank\">Notation as a Tool of Thought</a> reference <a class=\"footnote-backref\" href=\"#fnref:J\" title=\"Jump back to footnote 2 in the text\">↩</a></p>\n</li>\n<li id=\"fn:escape-hatch\">\n<p>Though if it's <em>too</em> hard for you, that's why languages have <a href=\"https://buttondown.com/hillelwayne/archive/programming-language-escape-hatches/\" target=\"_blank\">escape hatches</a> <a class=\"footnote-backref\" href=\"#fnref:escape-hatch\" title=\"Jump back to footnote 3 in the text\">↩</a></p>\n</li>\n</ol>\n</div>",
          "url": "https://buttondown.com/hillelwayne/archive/sapir-whorf-does-not-apply-to-programming/",
          "published": "2025-08-21T13:00:00.000Z",
          "updated": "2025-08-21T13:00:00.000Z",
          "content": null,
          "image": null,
          "media": [],
          "authors": [],
          "categories": []
        },
        {
          "id": "https://buttondown.com/hillelwayne/archive/software-books-i-wish-i-could-read/",
          "title": "Software books I wish I could read",
          "description": "<h3>New Logic for Programmers Release!</h3>\n<p><a href=\"https://leanpub.com/logic/\" target=\"_blank\">v0.11 is now available</a>! This is over 20%  longer than v0.10, with a new chapter on code proofs, three chapter overhauls, and more! <a href=\"https://github.com/logicforprogrammers/book-assets/blob/master/CHANGELOG.md\" target=\"_blank\">Full release notes here</a>.</p>\n<p><img alt=\"Cover of the boooooook\" class=\"newsletter-image\" src=\"https://assets.buttondown.email/images/92b4a35d-2bdd-416a-92c7-15ff42b49d8d.jpg?w=960&fit=max\"/></p>\n<h1>Software books I wish I could read</h1>\n<p>I'm writing <em>Logic for Programmers</em> because it's a book I wanted to have ten years ago. I had to learn everything in it the hard way, which is why I'm ensuring that everybody else can learn it the easy way.</p>\n<p>Books occupy a sort of weird niche in software. We're great at sharing information via blogs and git repos and entire websites. These have many benefits over books: they're free, they're easily accessible, they can be updated quickly, they can even be interactive. But no blog post has influenced me as profoundly as <a href=\"https://buttondown.com/hillelwayne/archive/why-you-should-read-data-and-reality/\" target=\"_blank\">Data and Reality</a> or <a href=\"https://www.oreilly.com/library/view/making-software/9780596808310/\" target=\"_blank\">Making Software</a>. There is no blog or talk about debugging as good as the \n<a href=\"https://debuggingrules.com/\" target=\"_blank\">Debugging</a> book.</p>\n<p>It might not be anything deeper than \"people spend more time per word on writing books than blog posts\". I dunno.</p>\n<p>So here are some other books I wish I could read. I don't <em>think</em> any of them exist yet but it's a big world out there. 
Also while they're probably best as books, a website or a series of blog posts would be ok too.</p>\n<h4>Everything about Configurations</h4>\n<p>The whole topic of how we configure software, whether by CLI flags, environment vars, or JSON/YAML/XML/Dhall files. What causes the <a href=\"https://mikehadlow.blogspot.com/2012/05/configuration-complexity-clock.html\" target=\"_blank\">configuration complexity clock</a>? How do we distinguish between basic, advanced, and developer-only configuration options? When should we disallow configuration? How do we test all possible configurations for correctness? Why do so many widespread outages trace back to misconfiguration, and how do we prevent them? </p>\n<p>I also want the same for plugin systems. Manifests, permissions, common APIs and architectures, etc. Configuration management is more universal, though, since everybody either uses software with configuration or has made software with configuration.</p>\n<h4>The Big Book of Complicated Data Schemas</h4>\n<p>I guess this would kind of be like <a href=\"https://schema.org/docs/full.html\" target=\"_blank\">Schema.org</a>, except with a lot more on the \"why\" and not the what. Why is it important for the <a href=\"https://schema.org/Volcano\" target=\"_blank\">Volcano model</a> to have a \"smokingAllowed\" field?<sup id=\"fnref:volcano\"><a class=\"footnote-ref\" href=\"#fn:volcano\">1</a></sup></p>\n<p>I'd see this less as \"here's your guide to putting Volcanos in your database\" and more \"here's recurring motifs in modeling interesting domains\", to help a person see sources of complexity in their <em>own</em> domain. Does something crop up if the references can form a cycle? If a relationship needs to be strictly temporary, or a reference can change type? Bonus: path dependence in data models, where an additional requirement leads to a vastly different ideal data model that a company couldn't do because they made the old model.</p>\n<p>(This has got to exist, right? 
Business modeling is a big enough domain that this must exist. Maybe <a href=\"https://essenceofsoftware.com/\" target=\"_blank\">The Essence of Software</a> touches on this? Man I feel bad I haven't read that yet.)</p>\n<h4>Computer Science for Software Engineers</h4>\n<p>Yes, I checked, this book does not exist (though maybe <a href=\"https://www.amazon.com/A-Programmers-Guide-to-Computer-Science-2-book-series/dp/B08433QR53\" target=\"_blank\">this</a> is the same thing). I don't have any formal software education; everything I know was either self-taught or learned on the job. But it's way easier to learn software engineering that way than computer science. And I bet there's a lot of other engineers in the same boat. </p>\n<p>This book wouldn't have to be comprehensive or instructive: just enough about each topic to understand why it's an area of study and appreciate how research in it eventually finds its way into practice. </p>\n<h4>MISU Patterns</h4>\n<p>MISU, or \"Make Illegal States Unrepresentable\", is the idea of designing system invariants in the structure of your data. For example, if a <code>Contact</code> needs at least one of <code>email</code> or <code>phone</code> to be non-null, make it a sum type over <code>EmailContact, PhoneContact, EmailPhoneContact</code> (from <a href=\"https://fsharpforfunandprofit.com/posts/designing-with-types-making-illegal-states-unrepresentable/\" target=\"_blank\">this post</a>). MISU is great.</p>\n<p>Most MISU in the wild look very different than that, though, because the concept of MISU is so broad there's lots of different ways to achieve it. And that means there are \"patterns\": smart constructors, product types, properly using sets, <a href=\"https://lexi-lambda.github.io/blog/2020/11/01/names-are-not-type-safety/\" target=\"_blank\">newtypes to some degree</a>, etc. Some of them are specific to typed FP, while others can be used in even untyped languages. 
Someone oughta make a pattern book.</p>\n<p>My one request would be to not give them cutesy names. Do something like the <a href=\"https://ia600301.us.archive.org/18/items/Thompson2016MotifIndex/Thompson_2016_Motif-Index.pdf\" target=\"_blank\">Aarne–Thompson–Uther Index</a>, where items are given names like \"Recognition by manner of throwing cakes of different weights into faces of old uncles\". Names can come later.</p>\n<h4>The Tools of '25</h4>\n<p>Not something I'd read, but something to recommend to junior engineers. Starting out it's easy to think the only bit that matters is the language or framework and not realize the enormous amount of surrounding tooling you'll have to learn. This book would cover the basics of tools that <em>enough</em> developers will probably use at some point: git, VSCode, <em>very</em> basic Unix and bash, curl. Maybe the general concepts of tools that appear in every ecosystem, like package managers, build tools, task runners. That might be easier if we specialize this to one particular domain, like webdev or data science.</p>\n<p>Ideally the book would only have to be updated every five years or so. No LLM stuff because I don't expect the tooling will be stable through 2026, to say nothing of 2030.</p>\n<h4>A History of Obsolete Optimizations</h4>\n<p>Probably better as a really long blog series. Each chapter would be broken up into two parts:</p>\n<ol>\n<li>A deep dive into a brilliant, elegant, insightful historical optimization designed to work within the constraints of that era's computing technology</li>\n<li>What we started doing instead, once we had more compute/network/storage available.</li>\n</ol>\n<p>c.f. <a href=\"https://prog21.dadgum.com/29.html\" target=\"_blank\">A Spellchecker Used to Be a Major Feat of Software Engineering</a>. 
Bonus topics would be brilliance obsoleted by standardization (like what people did before git and JSON were universal), optimizations we do today that may not stand the test of time, and optimizations from the past that <em>did</em>.</p>\n<h4>Sphinx Internals</h4>\n<p><em>I need this</em>. I've spent so much goddamn time digging around in Sphinx and docutils source code I'm gonna throw up.</p>\n<hr/>\n<h3>Systems Distributed Talk Today!</h3>\n<p>Online premiere's at noon central / 5 PM UTC, <a href=\"https://www.youtube.com/watch?v=d9cM8f_qSLQ\" target=\"_blank\">here</a>! I'll be hanging out to answer questions and be awkward. You ever watch a recording of your own talk? It's real uncomfortable!</p>\n<div class=\"footnote\">\n<hr/>\n<ol>\n<li id=\"fn:volcano\">\n<p>In <em>this</em> case because it's a field on one of <code>Volcano</code>'s supertypes. I guess schemas gotta follow LSP too <a class=\"footnote-backref\" href=\"#fnref:volcano\" title=\"Jump back to footnote 1 in the text\">↩</a></p>\n</li>\n</ol>\n</div>",
          "url": "https://buttondown.com/hillelwayne/archive/software-books-i-wish-i-could-read/",
          "published": "2025-08-06T13:00:00.000Z",
          "updated": "2025-08-06T13:00:00.000Z",
          "content": null,
          "image": null,
          "media": [],
          "authors": [],
          "categories": []
        },
        {
          "id": "https://buttondown.com/hillelwayne/archive/2000-words-about-arrays-and-tables/",
          "title": "2000 words about arrays and tables",
          "description": "<p>I'm way too discombobulated from getting next month's release of <a href=\"https://leanpub.com/logic/\" target=\"_blank\">Logic for Programmers</a> ready, so I'm pulling an idea from the slush pile. Basically I wanted to come up with a mental model of arrays as a concept that explained APL-style multidimensional arrays and tables but also why there weren't multitables.</p>\n<p>So, arrays. In all languages they are basically the same: they map a sequence of numbers (I'll use <code>1..N</code>)<sup id=\"fnref:1-indexing\"><a class=\"footnote-ref\" href=\"#fn:1-indexing\">1</a></sup> to homogeneous values (values of a single type). This is in contrast to the other two foundational types, associative arrays (which map an arbitrary type to homogeneous values) and structs (which map a fixed set of keys to <em>heterogeneous</em> values). Arrays appear in PLs earlier than the other two, possibly because they have the simplest implementation and the most obvious application to scientific computing. The OG FORTRAN had arrays. </p>\n<p>I'm interested in two structural extensions to arrays. The first, found in languages like nushell and frameworks like Pandas, is the <em>table</em>. Tables have string keys like a struct <em>and</em> indexes like an array. Each row is a struct, so you can get \"all values in this column\" or \"all values for this row\". They're heavily used in databases and data science.</p>\n<p>The other extension is the <strong>N-dimensional array</strong>, mostly seen in APLs like Dyalog and J. Think of this like arrays-of-arrays(-of-arrays), except all arrays at the same depth have the same length. So <code>[[1,2,3],[4]]</code> is not a 2D array, but <code>[[1,2,3],[4,5,6]]</code> is. 
This means that N-arrays can be queried on any axis.</p>\n<div class=\"codehilite\"><pre><span></span><code><span class=\"w\"> </span><span class=\"o\">]</span><span class=\"nv\">x</span><span class=\"w\"> </span><span class=\"o\">=:</span><span class=\"w\"> </span><span class=\"nv\">i</span><span class=\"o\">.</span><span class=\"w\"> </span><span class=\"mi\">3</span><span class=\"w\"> </span><span class=\"mi\">3</span>\n<span class=\"mi\">0</span><span class=\"w\"> </span><span class=\"mi\">1</span><span class=\"w\"> </span><span class=\"mi\">2</span>\n<span class=\"mi\">3</span><span class=\"w\"> </span><span class=\"mi\">4</span><span class=\"w\"> </span><span class=\"mi\">5</span>\n<span class=\"mi\">6</span><span class=\"w\"> </span><span class=\"mi\">7</span><span class=\"w\"> </span><span class=\"mi\">8</span>\n<span class=\"w\">   </span><span class=\"mi\">0</span><span class=\"w\"> </span><span class=\"o\">{</span><span class=\"w\"> </span><span class=\"nv\">x</span><span class=\"w\"> </span><span class=\"c1\">NB. first row</span>\n<span class=\"mi\">0</span><span class=\"w\"> </span><span class=\"mi\">1</span><span class=\"w\"> </span><span class=\"mi\">2</span>\n<span class=\"w\">   </span><span class=\"mi\">0</span><span class=\"w\"> </span><span class=\"o\">{\"</span><span class=\"mi\">1</span><span class=\"w\"> </span><span class=\"nv\">x</span><span class=\"w\"> </span><span class=\"c1\">NB. first column</span>\n<span class=\"mi\">0</span><span class=\"w\"> </span><span class=\"mi\">3</span><span class=\"w\"> </span><span class=\"mi\">6</span>\n</code></pre></div>\n<p>So, I've had some ideas on a conceptual model of arrays that explains all of these variations and possibly predicts new variations. I wrote up my notes and did the bare minimum of editing and polishing. Somehow it ended up being 2000 words.</p>\n<h3>1-dimensional arrays</h3>\n<p>A one-dimensional array is a function over <code>1..N</code> for some N. 
</p>\n<p>To be clear this is <em>math</em> functions, not programming functions. Programming functions take values of a type and perform computations on them. Math functions take values of a fixed set and return values of another set. So the array <code>[a, b, c, d]</code> can be represented by the function <code>(1 -> a ++ 2 -> b ++ 3 -> c ++ 4 -> d)</code>. Let's write the set of all four element character arrays as <code>1..4 -> char</code>. <code>1..4</code> is the function's <strong>domain</strong>.</p>\n<p>The set of all character arrays is the empty array + the functions with domain <code>1..1</code> + the functions with domain <code>1..2</code> + ... Let's call this set <code>Array[Char]</code>. Our compilers can enforce that a type belongs to <code>Array[Char]</code>, but some operations care about the more specific type, like matrix multiplication. This is either checked with the runtime type or, in exotic enough languages, with static dependent types.</p>\n<p>(This is actually how TLA+ does things: the basic collection types are functions and sets, and a function with domain 1..N is a sequence.)</p>\n<h3>2-dimensional arrays</h3>\n<p>Now take the 3x4 matrix</p>\n<div class=\"codehilite\"><pre><span></span><code><span class=\"w\">   </span><span class=\"nv\">i</span><span class=\"o\">.</span><span class=\"w\"> </span><span class=\"mi\">3</span><span class=\"w\"> </span><span class=\"mi\">4</span>\n<span class=\"mi\">0</span><span class=\"w\"> </span><span class=\"mi\">1</span><span class=\"w\">  </span><span class=\"mi\">2</span><span class=\"w\">  </span><span class=\"mi\">3</span>\n<span class=\"mi\">4</span><span class=\"w\"> </span><span class=\"mi\">5</span><span class=\"w\">  </span><span class=\"mi\">6</span><span class=\"w\">  </span><span class=\"mi\">7</span>\n<span class=\"mi\">8</span><span class=\"w\"> </span><span class=\"mi\">9</span><span class=\"w\"> </span><span class=\"mi\">10</span><span class=\"w\"> </span><span 
class=\"mi\">11</span>\n</code></pre></div>\n<p>There are two equally valid ways to represent the array function:</p>\n<ol>\n<li>A function that takes a row and a column and returns the value at that index, so it would look like <code>f(r: 1..3, c: 1..4) -> Int</code>.</li>\n<li>A function that takes a row and returns that column as an array, aka another function: <code>f(r: 1..3) -> g(c: 1..4) -> Int</code>.<sup id=\"fnref:associative\"><a class=\"footnote-ref\" href=\"#fn:associative\">2</a></sup></li>\n</ol>\n<p>Man, (2) looks a lot like <a href=\"https://en.wikipedia.org/wiki/Currying\" target=\"_blank\">currying</a>! In Haskell, functions can only have one parameter. If you write <code>(+) 6 10</code>, <code>(+) 6</code> first returns a <em>new</em> function <code>f y = y + 6</code>, and then applies <code>f 10</code> to get 16. So <code>(+)</code> has the type signature <code>Int -> Int -> Int</code>: it's a function that takes an <code>Int</code> and returns a function of type <code>Int -> Int</code>.<sup id=\"fnref:typeclass\"><a class=\"footnote-ref\" href=\"#fn:typeclass\">3</a></sup></p>\n<p>Similarly, our 2D array can be represented as an array function that returns array functions: it has type <code>1..3 -> 1..4 -> Int</code>, meaning it takes a row index and returns <code>1..4 -> Int</code>, aka a single array.</p>\n<p>(This differs from conventional array-of-arrays because it forces all of the subarrays to have the same domain, aka the same length. If we wanted to permit ragged arrays, we would instead have the type <code>1..3 -> Array[Int]</code>.)</p>\n<p>Why is this useful? A couple of reasons. First of all, we can apply function transformations to arrays, like \"<a href=\"https://blog.zdsmith.com/series/combinatory-programming.html\" target=\"_blank\">combinators</a>\". For example, we can flip any function of type <code>a -> b -> c</code> into a function of type <code>b -> a -> c</code>. 
So given a function that takes rows and returns columns, we can produce one that takes columns and returns rows. That's just a matrix transposition! </p>\n<p>Second, we can extend this to any number of dimensions: a three-dimensional array is one with type <code>1..M -> 1..N -> 1..O -> V</code>. We can still use function transformations to rearrange the array along any ordering of axes.</p>\n<p>Speaking of dimensions:</p>\n<h3>What are dimensions, anyway</h3>\n<div class=\"subscribe-form\"></div>\n<p>Okay, so now imagine we have a <code>Row</code> × <code>Col</code> grid of pixels, where each pixel is a struct of type <code>Pixel(R: int, G: int, B: int)</code>. So the array is</p>\n<div class=\"codehilite\"><pre><span></span><code>Row -> Col -> Pixel\n</code></pre></div>\n<p>But we can also represent the <em>Pixel struct</em> with a function: <code>Pixel(R: 0, G: 0, B: 255)</code> is the function where <code>f(R) = 0</code>, <code>f(G) = 0</code>, <code>f(B) = 255</code>, making it a function of type <code>{R, G, B} -> Int</code>. So the array is actually the function</p>\n<div class=\"codehilite\"><pre><span></span><code>Row -> Col -> {R, G, B} -> Int\n</code></pre></div>\n<p>And then we can rearrange the parameters of the function like this:</p>\n<div class=\"codehilite\"><pre><span></span><code>{R, G, B} -> Row -> Col -> Int\n</code></pre></div>\n<p>Even though the set <code>{R, G, B}</code> is not of form 1..N, this clearly has a real meaning: <code>f[R]</code> is the function mapping each coordinate to that coordinate's red value. What about <code>Row -> {R, G, B} -> Col -> Int</code>?  That's for each row, the 3 × Col array mapping each color to that row's intensities.</p>\n<p>Really <em>any finite set</em> can be a \"dimension\". Recording the monitor over a span of time? <code>Frame -> Row -> Col -> Color -> Int</code>. Recording a bunch of computers over some time? 
<code>Computer -> Frame -> Row …</code>.</p>\n<p>This is pretty common in constraint satisfaction! Like if you're a conference trying to assign talks to talk slots, your array might have type <code>(Day, Time, Room) -> Talk</code>, where Day/Time/Room are enumerations.</p>\n<p>An implementation constraint is that most programming languages <em>only</em> allow integer indexes, so we have to replace Rooms and Colors with numerical enumerations over the set. As long as the set is finite, this is always possible, and for struct-functions, we can always choose the indexing on the lexicographic ordering of the keys. But we lose type safety.</p>\n<h3>Why tables are different</h3>\n<p>One more example: <code>Day -> Hour -> Airport(name: str, flights: int, revenue: USD)</code>. Can we turn the struct into a dimension like before? </p>\n<p>In this case, no. We were able to make <code>Color</code> an axis because we could turn <code>Pixel</code> into a <code>Color -> Int</code> function, and we could only do that because all of the fields of the struct had the same type. This time, the fields are <em>different</em> types. So we can't convert <code>{name, flights, revenue}</code> into an axis. <sup id=\"fnref:name-dimension\"><a class=\"footnote-ref\" href=\"#fn:name-dimension\">4</a></sup> One thing we can do is convert it to three <em>separate</em> functions:</p>\n<div class=\"codehilite\"><pre><span></span><code>airport: Day -> Hour -> Str\nflights: Day -> Hour -> Int\nrevenue: Day -> Hour -> USD\n</code></pre></div>\n<p>But we want to keep all of the data in one place. That's where <strong>tables</strong> come in: an array-of-structs is isomorphic to a struct-of-arrays:</p>\n<div class=\"codehilite\"><pre><span></span><code>AirportColumns(\n    airport: Day -> Hour -> Str,\n    flights: Day -> Hour -> Int,\n    revenue: Day -> Hour -> USD,\n)\n</code></pre></div>\n<p>The table is sort of <em>both</em> representations simultaneously. 
If this was a pandas dataframe, <code>df[\"airport\"]</code> would get the airport column, while <code>df.loc[day1]</code> would get the first day's data. I don't think many table implementations support more than one axis dimension but there's no reason they <em>couldn't</em>. </p>\n<p>These are also possible transforms:</p>\n<div class=\"codehilite\"><pre><span></span><code>Hour -> NamesAreHard(\n    airport: Day -> Str,\n    flights: Day -> Int,\n    revenue: Day -> USD,\n)\n\nDay -> Whatever(\n    airport: Hour -> Str,\n    flights: Hour -> Int,\n    revenue: Hour -> USD,\n)\n</code></pre></div>\n<p>In my mental model, the heterogeneous struct acts as a \"block\" in the array. We can't remove it, we can only push an index into the fields or pull a shared column out. But there's no way to convert a heterogeneous table into an array.</p>\n<h3>Actually there is a terrible way</h3>\n<p>Most languages have unions or <del>product</del> sum types that let us say \"this is a string OR integer\". So we can make our airport data <code>Day -> Hour -> AirportKey -> Int | Str | USD</code>. Heck, might as well just say it's <code>Day -> Hour -> AirportKey -> Any</code>. But would anybody really be mad enough to use that in practice?</p>\n<p><a href=\"https://code.jsoftware.com/wiki/Vocabulary/lt\" target=\"_blank\">Oh wait J does exactly that</a>. J has an opaque datatype called a \"box\". A \"table\" is a function <code>Dim1 -> Dim2 -> Box</code>. You can see some examples of what that looks like <a href=\"https://code.jsoftware.com/wiki/DB/Flwor\" target=\"_blank\">here</a></p>\n<h3>Misc Thoughts and Questions</h3>\n<p>The heterogeneity barrier seems like it explains why we don't see multiple axes of table columns, while we do see multiple axes of array dimensions. But is that actually why? Is there a system out there that <em>does</em> have multiple columnar axes?</p>\n<p>The array <code>x = [[a, b, a], [b, b, b]]</code> has type <code>1..2 -> 1..3 -> {a, b}</code>. 
Can we rearrange it to <code>1..2 -> {a, b} -> 1..3</code>? No. But we <em>can</em> rearrange it to <code>1..2 -> {a, b} -> PowerSet(1..3)</code>, which maps rows and characters to columns <em>with</em> that character. <code>[(a -> {1, 3} ++ b -> {2}), (a -> {} ++ b -> {1, 2, 3})]</code>. </p>\n<p>We can also transform <code>Row -> PowerSet(Col)</code> into <code>Row -> Col -> Bool</code>, aka a boolean matrix. This makes sense to me as both forms are means of representing directed graphs.</p>\n<p>Are other function combinators useful for thinking about arrays?</p>\n<p>Does this model cover pivot tables? Can we extend it to relational data with multiple tables?</p>\n<hr/>\n<h3>Systems Distributed Talk (will be) Online</h3>\n<p>The premiere will be August 6 at 12 CST, <a href=\"https://www.youtube.com/watch?v=d9cM8f_qSLQ\" target=\"_blank\">here</a>! I'll be there to answer questions / mock my own performance / generally make a fool of myself.</p>\n<div class=\"footnote\">\n<hr/>\n<ol>\n<li id=\"fn:1-indexing\">\n<p><a href=\"https://buttondown.com/hillelwayne/archive/why-do-arrays-start-at-0/\" target=\"_blank\">Sacrilege</a>! But it turns out in this context, it's easier to use 1-indexing than 0-indexing. In the years since I wrote that article I've settled on \"each indexing choice matches different kinds of mathematical work\", so mathematicians and computer scientists are best served by being able to choose their index. But software engineers need consistency, and 0-indexing is overall a net better consistency pick. <a class=\"footnote-backref\" href=\"#fnref:1-indexing\" title=\"Jump back to footnote 1 in the text\">↩</a></p>\n</li>\n<li id=\"fn:associative\">\n<p>This is <em>right-associative</em>: <code>a -> b -> c</code> means <code>a -> (b -> c)</code>, not <code>(a -> b) -> c</code>. <code>(1..3 -> 1..4) -> Int</code> would be the associative array that maps length-3 arrays to integers. 
<a class=\"footnote-backref\" href=\"#fnref:associative\" title=\"Jump back to footnote 2 in the text\">↩</a></p>\n</li>\n<li id=\"fn:typeclass\">\n<p>Technically it has type <code>Num a => a -> a -> a</code>, since <code>(+)</code> works on floats too. <a class=\"footnote-backref\" href=\"#fnref:typeclass\" title=\"Jump back to footnote 3 in the text\">↩</a></p>\n</li>\n<li id=\"fn:name-dimension\">\n<p>Notice that if each <code>Airport</code> had a unique name, we <em>could</em> pull it out into <code>AirportName -> Airport(flights, revenue)</code>, but we still are stuck with two different values. <a class=\"footnote-backref\" href=\"#fnref:name-dimension\" title=\"Jump back to footnote 4 in the text\">↩</a></p>\n</li>\n</ol>\n</div>",
          "url": "https://buttondown.com/hillelwayne/archive/2000-words-about-arrays-and-tables/",
          "published": "2025-07-30T13:00:00.000Z",
          "updated": "2025-07-30T13:00:00.000Z",
          "content": null,
          "image": null,
          "media": [],
          "authors": [],
          "categories": []
        },
        {
          "id": "https://buttondown.com/hillelwayne/archive/programming-language-escape-hatches/",
          "title": "Programming Language Escape Hatches",
          "description": "<p>The excellent-but-defunct blog <a href=\"https://prog21.dadgum.com/38.html\" target=\"_blank\">Programming in the 21st Century</a> defines \"puzzle languages\" as languages where part of the appeal is in figuring out how to express a program idiomatically, like a puzzle. As examples, he lists Haskell, Erlang, and J. All puzzle languages, the author says, have an \"escape\" out of the puzzle model that is pragmatic but stigmatized.</p>\n<p>But many mainstream languages have escape hatches, too.</p>\n<p>Languages have a lot of properties. One of these properties is the language's <a href=\"https://buttondown.com/hillelwayne/archive/the-capability-tractability-tradeoff/\" target=\"_blank\">capabilities</a>, roughly the set of things you can do in the language. Capability is desirable but comes into conflict with a lot of other desirable properties, like simplicity or efficiency. In particular, reducing the capability of a language means that all remaining programs share more in common, meaning there's more assumptions the compiler and programmer can make (\"tractability\"). Assumptions are generally used to reason about correctness, but can also be about things like optimization: J's assumption that everything is an array leads to <a href=\"https://code.jsoftware.com/wiki/Vocabulary/SpecialCombinations\" target=\"_blank\">high-performance \"special combinations\"</a>. </p>\n<p>Rust is the most famous example of a <strong>mainstream</strong> language that trades capability for tractability.<sup id=\"fnref:gc\"><a class=\"footnote-ref\" href=\"#fn:gc\">1</a></sup> Rust has a lot of rules designed to prevent common memory errors, like keeping a reference to deallocated memory or modifying memory while something else is reading it. 
As a consequence, there's a lot of things that cannot be done in (safe) Rust, like interfacing with an external C function (as it doesn't have these guarantees).</p>\n<p>To do this, you need to use <a href=\"https://doc.rust-lang.org/book/ch20-01-unsafe-rust.html\" target=\"_blank\">unsafe Rust</a>, which lets you do additional things forbidden by safe Rust, such as dereferencing a raw pointer. Everybody tells you not to use <code>unsafe</code> unless you absolutely 100% know what you're doing, and possibly not even then.</p>\n<p>Sounds like an escape hatch to me!</p>\n<p>To extrapolate, an <strong>escape hatch</strong> is a feature (either in the language itself or a particular implementation) that deliberately breaks core assumptions about the language in order to add capabilities. This explains both Rust and most of the so-called \"puzzle languages\": they need escape hatches because they have very strong conceptual models of the language which lead to lots of assumptions about programs. But plenty of \"kitchen sink\" mainstream languages have escape hatches, too:</p>\n<ul>\n<li>Some compilers let C++ code embed <a href=\"https://en.cppreference.com/w/cpp/language/asm.html\" target=\"_blank\">inline assembly</a>.</li>\n<li>Languages built on .NET or the JVM have some sort of interop with C# or Java, and many of those languages make assumptions about programs that C#/Java do not.</li>\n<li>The SQL language has stored procedures as an escape hatch <em>and</em> vendors create a second escape hatch of user-defined functions.</li>\n<li>Ruby lets you bypass any form of encapsulation with <a href=\"https://ruby-doc.org/3.4.1/Object.html#method-i-send\" target=\"_blank\"><code>send</code></a>.</li>\n<li>Frameworks have escape hatches, too! React has <a href=\"https://react.dev/learn/escape-hatches\" target=\"_blank\">an entire page on them</a>.</li>\n</ul>\n<p>(Does <code>eval</code> in interpreted languages count as an escape hatch? 
It feels different, but it does add a lot of capability. Maybe they don't \"break assumptions\" in the same way?)</p>\n<h3>The problem with escape hatches</h3>\n<p>In all languages with escape hatches, the rule is \"use this as carefully and sparingly as possible\", to the point where a messy solution <em>without</em> an escape hatch is preferable to a clean solution <em>with</em> one. Breaking a core assumption is a big deal! If the language is operating as if it's still true, it's going to do incorrect things. </p>\n<p>I recently had this problem in a TLA+ contract. TLA+ is a language for modeling complicated systems, and assumes that the model is a self-contained universe. The client wanted to use the TLA+ to test a real system. The model checker should send commands to a test device and check the next states were the same. This is straightforward to set up with the <a href=\"https://github.com/tlaplus/CommunityModules/blob/master/modules/IOUtils.tla\" target=\"_blank\">IOExec escape hatch</a>.<sup id=\"fnref:ioexec\"><a class=\"footnote-ref\" href=\"#fn:ioexec\">2</a></sup> But the model checker assumed that state exploration was pure and it could skip around the state randomly, meaning it would do things like <code>set x = 10</code>, then skip to <code>set x = 1</code>, then skip back to <code>inc x; assert x == 11</code>. Oops!</p>\n<p>We eventually found workarounds but it took a lot of clever tricks to pull off. I'll probably write up the technique when I'm less busy with The Book.</p>\n<p>The other problem with escape hatches is the rest of the language is designed around <em>not</em> having said capabilities, meaning it can't support the feature as well as a language designed for them from the start. Even if your escape hatch code is clean, it might not cleanly <em>integrate</em> with the rest of your code. 
This is why people <a href=\"https://chadaustin.me/2024/10/intrusive-linked-list-in-rust/\" target=\"_blank\">complain about unsafe Rust</a> so often.</p>\n<div class=\"footnote\">\n<hr/>\n<ol>\n<li id=\"fn:gc\">\n<p>It should be noted though that <em>all</em> languages with automatic memory management are trading capability for tractability, too. If you can't dereference pointers, you can't dereference <em>null</em> pointers. <a class=\"footnote-backref\" href=\"#fnref:gc\" title=\"Jump back to footnote 1 in the text\">↩</a></p>\n</li>\n<li id=\"fn:ioexec\">\n<p>From the Community Modules (which come default with the VSCode extension). <a class=\"footnote-backref\" href=\"#fnref:ioexec\" title=\"Jump back to footnote 2 in the text\">↩</a></p>\n</li>\n</ol>\n</div>",
          "url": "https://buttondown.com/hillelwayne/archive/programming-language-escape-hatches/",
          "published": "2025-07-24T14:00:00.000Z",
          "updated": "2025-07-24T14:00:00.000Z",
          "content": null,
          "image": null,
          "media": [],
          "authors": [],
          "categories": []
        },
        {
          "id": "https://buttondown.com/hillelwayne/archive/maybe-writing-speed-actually-is-a-bottleneck-for/",
          "title": "Maybe writing speed actually is a bottleneck for programming",
          "description": "<p>I'm a big (neo)vim buff. My config is over 1500 lines and I regularly write new scripts. I recently ported my neovim config to a new laptop. Before then, I was using VSCode to write, and when I switched back I immediately saw a big gain in productivity.</p>\n<p>People often pooh-pooh vim (and other assistive writing technologies) by saying that writing code isn't the bottleneck in software development. Reading, understanding, and thinking through code is!</p>\n<p>Now I don't know how true this actually is in practice, because empirical studies of time spent coding are all over the place. Most of them, like <a href=\"https://www.microsoft.com/en-us/research/wp-content/uploads/2016/02/meyer-fse-2014.pdf\" target=\"_blank\">this study</a>, track time spent in the editor but don't distinguish between time spent reading code and time spent writing code. The only one I found that separates them was <a href=\"https://scispace.com/pdf/i-know-what-you-did-last-summer-an-investigation-of-how-3zxclzzocc.pdf\" target=\"_blank\">this study</a>. It finds that developers spend only 5% of their time editing. It also finds they spend 14% of their time moving or resizing editor windows, so I don't know how clean their data is.</p>\n<p>But I have a bigger problem with \"writing is not the bottleneck\": when I think of a bottleneck, I imagine that <em>no</em> amount of improvement will lead to productivity gains. Like if a program is bottlenecked on the network, it isn't going to get noticeably faster with 100x more ram or compute. </p>\n<p>But being able to type code 100x faster, even with without corresponding improvements to reading and imagining code, would be <strong>huge</strong>. </p>\n<p>We'll assume the average developer writes at 80 words per minute, at five characters a word, for 400 characters a minute.What could we do if we instead wrote at 8,000 words/40k characters a minute? 
</p>\n<h3>Writing fast</h3>\n<h4>Boilerplate is trivial</h4>\n<p>Why do people like type inference? Because writing all of the types manually is annoying. Why don't people like boilerplate? Because it's annoying to write every damn time. Programmers like features that help them write less! That's not a problem if you can write all of the boilerplate in 0.1 seconds.</p>\n<p>You still have the problem of <em>reading</em> boilerplate-heavy code, but you can use the remaining 0.9 seconds to churn out an extension that parses the file and presents the boilerplate in a more legible fashion. </p>\n<h4>We can write more tooling</h4>\n<p>This is something I've noticed with LLMs: when I can churn out crappy code as a free action, I use that to write lots of tools that assist me in writing <em>good</em> code. Even if I'm bottlenecked on a large program, I can still quickly write a script that helps me with something. Most of these aren't things I would have written otherwise, because they'd take too long to write! </p>\n<p>Again, not the best comparison, because LLMs also shortcut learning the relevant APIs, so also optimize the \"understanding code\" part. Then again, if I could type real fast I could more quickly whip up experiments on new APIs to learn them faster. </p>\n<h4>We can do practices that slow us down in the short term</h4>\n<p>Something like test-driven development significantly slows down how fast you write production code, because you have to spend a lot more time writing test code. Pair programming trades speed of writing code for speed of understanding code. A two-order-of-magnitude writing speedup makes both of them effectively free. 
Or, if you're not an eXtreme Programming fan, you can more easily follow <a href=\"https://en.wikipedia.org/wiki/The_Power_of_10:_Rules_for_Developing_Safety-Critical_Code\" target=\"_blank\">The Power of Ten Rules</a> and blanket your code with contracts and assertions.</p>\n<h4>We could do more speculative editing</h4>\n<p>This is probably the biggest difference in how we'd work if we could write 100x faster: it'd be much easier to try changes to the code to see if they're good ideas in the first place. </p>\n<p>How often have I tried optimizing something, only to find out it didn't make a difference? How often have I done a refactoring only to end up with lower-quality code overall? Too often. Over time it makes me prefer to try things that I know will work, and only \"speculatively edit\" when I think it'll be a fast change. If I could code 100x faster it would absolutely lead to me trying more speculative edits.</p>\n<p>This is especially big because I believe that lots of speculative edits are high-risk, high-reward: given 50 things we could do to the code, 49 won't make a difference and one will be a major improvement. If I only have time to try five things, I have a 10% chance of hitting the jackpot. If I can try 500 things I will get that reward every single time. </p>\n<h2>Processes are built off constraints</h2>\n<p>These are just a few ideas I came up with; there are probably others. Most of them, I suspect, will share the same property in common: they change <em>the process</em> of writing code to leverage the speedup. I can totally believe that a large speedup would not remove a bottleneck in the processes we <em>currently</em> use to write code. But that's because those processes are developed to work within our existing constraints. Remove a constraint and new processes become possible.</p>\n<p>The way I see it, if our current process produces 1 Utils of Software / day, a 100x writing speedup might lead to only 1.5 UoS/day. 
But there are other processes that produce only 0.5 UoS/d <em>because they are bottlenecked on writing speed</em>. A 100x speedup would lead to 10 UoS/day.</p>\n<p>The problem with all of this is that a 100x speedup isn't realistic, and it's not obvious whether a 2x improvement would lead to better processes. Then again, one of the first custom vim function scripts I wrote was an aid to writing unit tests in a particular codebase, and it led to me writing a lot more tests. So maybe even a 2x speedup is going to speed things up, too.</p>\n<hr/>\n<h3>Patreon Stuff</h3>\n<p>I wrote a couple of TLA+ specs to show how to model <a href=\"https://en.wikipedia.org/wiki/Fork%E2%80%93join_model\" target=\"_blank\">fork-join</a> algorithms. I'm planning on eventually writing them up for my blog/learntla but it'll be a while, so if you want to see them in the meantime I put them up on <a href=\"https://www.patreon.com/posts/fork-join-in-tla-134209395?utm_medium=clipboard_copy&utm_source=copyLink&utm_campaign=postshare_creator&utm_content=join_link\" target=\"_blank\">Patreon</a>.</p>",
          "url": "https://buttondown.com/hillelwayne/archive/maybe-writing-speed-actually-is-a-bottleneck-for/",
          "published": "2025-07-17T19:08:27.000Z",
          "updated": "2025-07-17T19:08:27.000Z",
          "content": null,
          "image": null,
          "media": [],
          "authors": [],
          "categories": []
        },
        {
          "id": "https://buttondown.com/hillelwayne/archive/logic-for-programmers-turns-one/",
          "title": "Logic for Programmers Turns One",
          "description": "<p>I released <a href=\"https://leanpub.com/logic/\" target=\"_blank\">Logic for Programmers</a> exactly one year ago today. It feels weird to celebrate the anniversary of something that isn't 1.0 yet, but software projects have a proud tradition of celebrating a dozen anniversaries before 1.0. I wanted to share about what's changed in the past year and the work for the next six+ months.</p>\n<p><img alt=\"The book cover!\" class=\"newsletter-image\" src=\"https://assets.buttondown.email/images/70ac47c9-c49f-47c0-9a05-7a9e70551d03.jpg?w=960&fit=max\"/></p>\n<h3>The Road to 0.1</h3>\n<p>I had been noodling on the idea of a logic book since the pandemic. The first time I wrote about it on the newsletter was in <a href=\"https://buttondown.com/hillelwayne/archive/predicate-logic-for-programmers/\" target=\"_blank\">2021</a>! Then I said that it would be done by June and would be \"under 50 pages\". The idea was to cover logic as a \"soft skill\" that helped you think about things like requirements and stuff.</p>\n<p>That version <em>sucked</em>. If you want to see how much it sucked, I put it up on <a href=\"https://www.patreon.com/posts/what-logic-for-133675688\" target=\"_blank\">Patreon</a>. Then I slept on the next draft for three years. Then in 2024 a lot of business fell through and I had a lot of free time, so with the help of <a href=\"https://saul.pw/\" target=\"_blank\">Saul Pwanson</a> I rewrote the book. This time I emphasized breadth over depth, trying to cover a lot more techniques.  </p>\n<p>I also decided to self-publish it instead of pitching it to a publisher. Not going the traditional route would mean I would be responsible for paying for editing, advertising, graphic design etc, but I hoped that would be compensated by <em>much</em> higher royalties. It also meant I could release the book in early access and use early sales to fund further improvements. 
So I wrote up a draft in <a href=\"https://www.sphinx-doc.org/en/master/\" target=\"_blank\">Sphinx</a>, compiled it to LaTeX, and uploaded the PDF to <a href=\"https://leanpub.com/\" target=\"_blank\">Leanpub</a>. That was in June 2024.</p>\n<p>Since then I kept to a monthly cadence of updates, missing once in November (short-notice contract) and once last month (<a href=\"https://systemsdistributed.com/\" target=\"_blank\">Systems Distributed</a>). The book's now on v0.10. What's changed?</p>\n<h3>A LOT</h3>\n<p>v0.1 was <em>very obviously</em> an alpha, and I have made a lot of improvements since then. For one, the book no longer looks like a <a href=\"https://www.sphinx-doc.org/_/downloads/en/master/pdf/#page=13\" target=\"_blank\">Sphinx manual</a>. Compare!</p>\n<p><img alt=\"0.1 on left, 0.10 on right. Way better!\" class=\"newsletter-image\" src=\"https://assets.buttondown.email/images/e4d880ad-80b8-4360-9cae-27c07598c740.png?w=960&fit=max\"/></p>\n<p>Also, the content is very, very different. v0.1 was 19,000 words, v0.10 is 31,000.<sup id=\"fnref:pagesize\"><a class=\"footnote-ref\" href=\"#fn:pagesize\">1</a></sup> This comes from new chapters on TLA+, constraint/SMT solving, logic programming, and major expansions to the existing chapters. Originally, \"Simplifying Conditionals\" was 600 words. Six hundred words! It almost fit in two pages!</p>\n<p><img alt=\"How short Simplifying Conditions USED to be\" class=\"newsletter-image\" src=\"https://assets.buttondown.email/images/31e731b7-3bdc-4ded-9b09-2a6261a323ec.png?w=960&fit=max\"/></p>\n<p>The chapter is now 2600 words, now covering condition lifting, quantifier manipulation, helper predicates, and set optimizations. All the other chapters have either gotten similar facelifts or are scheduled to get facelifts.</p>\n<p>The last big change is the addition of <a href=\"https://github.com/logicforprogrammers/book-assets\" target=\"_blank\">book assets</a>. 
Originally you had to manually copy over all of the code to try it out, which is a problem when there are samples in eight distinct languages! Now there are ready-to-go examples for each chapter, with instructions on how to set up each programming environment. This is also nice because it gives me breaks from writing to code instead.</p>\n<h3>How did the book do?</h3>\n<p>Leanpub's all-time visualizations are terrible, so I'll just give the summary: 1180 copies sold, $18,241 in royalties. That's a lot of money for something that isn't fully out yet! By comparison, <em>Practical TLA+</em> has made me less than half of that, despite selling over 5x as many books. Self-publishing was the right choice!</p>\n<p>In that time I've paid about $400 for the book cover (worth it) and maybe $800 in Leanpub's advertising service (probably not worth it). </p>\n<p>Right now that doesn't come close to making back the time investment, but I think it can get there post-release. I believe there are a lot more potential customers via marketing. I think post-release 10k copies sold is within reach.</p>\n<p class=\"empty-line\" style=\"height:16px; margin:0px !important;\"></p>\n<h3>Where is the book going?</h3>\n<div class=\"subscribe-form\"></div>\n<p>The main content work is rewrites: many of the chapters have not meaningfully changed since v0.1, so I am going through and rewriting them from scratch. So far four of the ten chapters have been rewritten. My (admittedly ambitious) goal is to rewrite three of them by the end of this month and another three by the end of next. 
I also want to do final passes on the rewritten chapters, as most of them have a few TODOs left lying around.</p>\n<p>(Also somehow in starting this newsletter and publishing it I realized that one of the chapters might be better split into two chapters, so there could well be a tenth technique in v0.11 or v0.12!)</p>\n<p>After that, I will pass it to a copy editor while I work on improving the layout, making images, and indexing. I want to have something worthy of printing on a dead tree by 1.0. </p>\n<p>In terms of timelines, I am <strong>very roughly</strong> estimating something like this:</p>\n<ul>\n<li>Summer: final big changes and rewrites</li>\n<li>Early Autumn: graphic design and copy editing</li>\n<li>Late Autumn: proofing, figuring out printing stuff</li>\n<li>Winter: final ebook and initial print releases of 1.0.</li>\n</ul>\n<p>(If you know a service that helps get self-published books \"past the finish line\", I'd love to hear about it! Preferably something that works for a fee, not part of royalties.)</p>\n<p>This timeline may be disrupted by official client work, like a new TLA+ contract or a conference invitation.</p>\n<p>Needless to say, I am incredibly excited to complete this book and share the final version with you all. This is a book I wished for years ago, a book I wrote because nobody else would. It fills a critical gap in software educational material, and someday soon I'll be able to put a copy on my bookshelf. It's exhilarating and terrifying and above all, satisfying.</p>\n<div class=\"footnote\">\n<hr/>\n<ol>\n<li id=\"fn:pagesize\">\n<p>It's also 150 pages vs 50 pages, but admittedly this is partially because I made the book smaller with a larger font. <a class=\"footnote-backref\" href=\"#fnref:pagesize\" title=\"Jump back to footnote 1 in the text\">↩</a></p>\n</li>\n</ol>\n</div>",
          "url": "https://buttondown.com/hillelwayne/archive/logic-for-programmers-turns-one/",
          "published": "2025-07-08T18:18:52.000Z",
          "updated": "2025-07-08T18:18:52.000Z",
          "content": null,
          "image": null,
          "media": [],
          "authors": [],
          "categories": []
        },
        {
          "id": "https://buttondown.com/hillelwayne/archive/logical-quantifiers-in-software/",
          "title": "Logical Quantifiers in Software",
          "description": "<p>I realize that for all I've talked about <a href=\"https://leanpub.com/logic/\" target=\"_blank\">Logic for Programmers</a> in this newsletter, I never once explained basic logical quantifiers. They're both simple and incredibly useful, so let's do that this week! </p>\n<h3>Sets and quantifiers</h3>\n<p>A <strong>set</strong> is a collection of unordered, unique elements. <code>{1, 2, 3, …}</code> is a set, as are \"every programming language\", \"every programming language's Wikipedia page\", and \"every function ever defined in any programming language's standard library\". You can put whatever you want in a set, with some very specific limitations to avoid certain paradoxes.<sup id=\"fnref:paradox\"><a class=\"footnote-ref\" href=\"#fn:paradox\">2</a></sup> </p>\n<p>Once we have a set, we can ask \"is something true for all elements of the set\" and \"is something true for at least one element of the set?\" IE, is it true that every programming language has a <code>set</code> collection type in the core language? We would write it like this:</p>\n<div class=\"codehilite\"><pre><span></span><code># all of them\nall l in ProgrammingLanguages: HasSetType(l)\n\n# at least one\nsome l in ProgrammingLanguages: HasSetType(l)\n</code></pre></div>\n<p>This is the notation I use in the book because it's easy to read, type, and search for. Mathematicians historically had a few different formats; the one I grew up with was <code>∀x ∈ set: P(x)</code> to mean <code>all x in set</code>, and <code>∃</code> to mean <code>some</code>. 
I use these when writing for just myself, but find them confusing to programmers when communicating.</p>\n<p>\"All\" and \"some\" are respectively referred to as \"universal\" and \"existential\" quantifiers.</p>\n<h3>Some cool properties</h3>\n<p>We can simplify expressions with quantifiers, in the same way that we can simplify <code>!(x && y)</code> to <code>!x || !y</code>.</p>\n<p>First of all, quantifiers are commutative with themselves. <code>some x: some y: P(x,y)</code> is the same as <code>some y: some x: P(x, y)</code>. For this reason we can write <code>some x, y: P(x,y)</code> as shorthand. We can even do this when quantifying over different sets, writing <code>some x, x' in X, y in Y</code> instead of <code>some x, x' in X: some y in Y</code>. We can <em>not</em> do this with \"alternating quantifiers\":</p>\n<ul>\n<li><code>all p in Person: some m in Person: Mother(m, p)</code> says that every person has a mother.</li>\n<li><code>some m in Person: all p in Person: Mother(m, p)</code> says that someone is every person's mother.</li>\n</ul>\n<p>Second, existentials distribute over <code>||</code> while universals distribute over <code>&&</code>. \"There is some url which returns a 403 or 404\" is the same as \"there is some url which returns a 403 or some url that returns a 404\", and \"all PRs pass the linter and the test suites\" is the same as \"all PRs pass the linter and all PRs pass the test suites\".</p>\n<p>Finally, <code>some</code> and <code>all</code> are <em>duals</em>: <code>some x: P(x) == !(all x: !P(x))</code>, and vice-versa. Intuitively: if some file is malicious, it's not true that all files are benign.</p>\n<p>All these rules together mean we can manipulate quantifiers <em>almost</em> as easily as we can manipulate regular booleans, putting them in whatever form is easiest to use in programming. 
</p>\n<p>Speaking of which, how <em>do</em> we use this in programming?</p>\n<h2>How we use this in programming</h2>\n<p>First of all, people clearly have a need for directly using quantifiers in code. If we have something of the form:</p>\n<div class=\"codehilite\"><pre><span></span><code>for x in list:\n    if P(x):\n        return true\nreturn false\n</code></pre></div>\n<p>That's just <code>some x in list: P(x)</code>. And this is a prevalent pattern, as you can see by using <a href=\"https://github.com/search?q=%2Ffor+.*%3A%5Cn%5Cs*if+.*%3A%5Cn%5Cs*return+%28False%7CTrue%29%5Cn%5Cs*return+%28True%7CFalse%29%2F+language%3Apython+NOT+is%3Afork&type=code\" target=\"_blank\">GitHub code search</a>. It finds over 500k examples of this pattern in Python alone! That can be simplified via using the language's built-in quantifiers: the Python would be <code>any(P(x) for x in list)</code>.</p>\n<p>(Note this is not quantifying over sets but iterables. But the idea translates cleanly enough.)</p>\n<p>More generally, quantifiers are a key way we express higher-level properties of software. What does it mean for a list to be sorted in ascending order? That <code>all i, j in 0..<len(l): if i < j then l[i] <= l[j]</code>. When should a <a href=\"https://qntm.org/ratchet\" target=\"_blank\">ratchet test fail</a>? When <code>some f in functions - exceptions: Uses(f, bad_function)</code>. Should the image classifier work upside down? <code>all i in images: classify(i) == classify(rotate(i, 180))</code>. These are the properties we verify with tests and types and <a href=\"https://www.hillelwayne.com/post/constructive/\" target=\"_blank\">MISU</a> and whatnot;<sup id=\"fnref:misu\"><a class=\"footnote-ref\" href=\"#fn:misu\">1</a></sup> it helps to be able to make them explicit!</p>\n<p>One cool use case that'll be in the book's next version: database invariants are universal statements over the set of all records, like <code>all a in accounts: a.balance > 0</code>. 
That's enforceable with a <a href=\"https://sqlite.org/lang_createtable.html#check_constraints\" target=\"_blank\">CHECK</a> constraint. But what about something like <code>all i, i' in intervals: NoOverlap(i, i')</code>? That isn't covered by CHECK, since it spans two rows.</p>\n<p>Quantifier duality to the rescue! The invariant is equivalent to <code>!(some i, i' in intervals: Overlap(i, i'))</code>, so it is preserved if the <em>query</em> <code>SELECT COUNT(*) FROM intervals CROSS JOIN intervals …</code> returns 0. This means we can test it via a <a href=\"https://sqlite.org/lang_createtrigger.html\" target=\"_blank\">database trigger</a>.<sup id=\"fnref:efficiency\"><a class=\"footnote-ref\" href=\"#fn:efficiency\">3</a></sup></p>\n<hr/>\n<p>There are a lot more use cases for quantifiers, but this is enough to introduce the ideas! Next week's the one year anniversary of the book entering early access, so I'll be writing a bit about that experience and how the book changed. It's <em>crazy</em> how crude v0.1 was compared to the current version.</p>\n<div class=\"footnote\">\n<hr/>\n<ol>\n<li id=\"fn:misu\">\n<p>MISU (\"make illegal states unrepresentable\") means using data representations that rule out invalid values. For example, if you have a <code>location -> Optional(item)</code> lookup and want to make sure that each item is in exactly one location, consider instead changing the map to <code>item -> location</code>. This is a means of <em>implementing</em> the property <code>all i in item, l, l' in location: if ItemIn(i, l) && l != l' then !ItemIn(i, l')</code>. <a class=\"footnote-backref\" href=\"#fnref:misu\" title=\"Jump back to footnote 1 in the text\">↩</a></p>\n</li>\n<li id=\"fn:paradox\">\n<p>Specifically, a set can't be an element of itself, which rules out constructing things like \"the set of all sets\" or \"the set of sets that don't contain themselves\". 
<a class=\"footnote-backref\" href=\"#fnref:paradox\" title=\"Jump back to footnote 2 in the text\">↩</a></p>\n</li>\n<li id=\"fn:efficiency\">\n<p>Though note that when you're inserting or updating an interval, you already <em>have</em> that row's fields in the trigger's <code>NEW</code> keyword. So you can just query <code>!(some i in intervals: Overlap(new, i))</code>, which is more efficient. <a class=\"footnote-backref\" href=\"#fnref:efficiency\" title=\"Jump back to footnote 3 in the text\">↩</a></p>\n</li>\n</ol>\n</div>",
          "url": "https://buttondown.com/hillelwayne/archive/logical-quantifiers-in-software/",
          "published": "2025-07-02T19:44:22.000Z",
          "updated": "2025-07-02T19:44:22.000Z",
          "content": null,
          "image": null,
          "media": [],
          "authors": [],
          "categories": []
        },
        {
          "id": "https://buttondown.com/hillelwayne/archive/you-can-cheat-a-test-suite-with-a-big-enough/",
          "title": "You can cheat a test suite with a big enough polynomial",
          "description": "<p>Hi nerds, I'm back from <a href=\"https://systemsdistributed.com/\" target=\"_blank\">Systems Distributed</a>! I'd heartily recommend it, wildest conference I've been to in years. I have a lot of work to catch up on, so this will be a short newsletter.</p>\n<p>In an earlier version of my talk, I had a gag about unit tests. First I showed the test <code>f([1,2,3]) == 3</code>, then said that this was satisfied by <code>f(l) = 3</code>, <code>f(l) = l[-1]</code>, <code>f(l) = len(l)</code>, <code>f(l) = (129*l[0]-34*l[1]-617)*l[2] - 443*l[0] + 1148*l[1] - 182</code>. Then I progressively rule them out one by one with more unit tests, except the last polynomial which stubbornly passes every single test.</p>\n<p>If you're given some function of <code>f(x: int, y: int, …): int</code> and a set of unit tests asserting <a href=\"https://buttondown.com/hillelwayne/archive/oracle-testing/\" target=\"_blank\">specific inputs give specific outputs</a>, then you can find a polynomial that passes every single unit test.</p>\n<p>To find the gag, and as <a href=\"https://en.wikipedia.org/wiki/Satisfiability_modulo_theories\" target=\"_blank\">SMT</a> practice, I wrote a Python program that finds a polynomial that passes a test suite meant for <code>max</code>. 
It's hardcoded for three parameters and only finds 2nd-order polynomials but I think it could be generalized with enough effort.</p>\n<h2>The code</h2>\n<p>Full code <a href=\"https://gist.github.com/hwayne/0ed045a35376c786171f9cf4b55c470f\" target=\"_blank\">here</a>, breakdown below.</p>\n<div class=\"codehilite\"><pre><span></span><code><span class=\"kn\">from</span><span class=\"w\"> </span><span class=\"nn\">z3</span><span class=\"w\"> </span><span class=\"kn\">import</span> <span class=\"o\">*</span>  <span class=\"c1\"># type: ignore</span>\n<span class=\"n\">s1</span><span class=\"p\">,</span> <span class=\"n\">s2</span> <span class=\"o\">=</span> <span class=\"n\">Solver</span><span class=\"p\">(),</span> <span class=\"n\">Solver</span><span class=\"p\">()</span>\n</code></pre></div>\n<p><a href=\"https://microsoft.github.io/z3guide/\" target=\"_blank\">Z3</a> is just the particular SMT solver we use, as it has good language bindings and a lot of affordances.</p>\n<p>As part of learning SMT I wanted to do this two ways. First by putting the polynomial \"outside\" of the SMT solver in a python function, second by doing it \"natively\" in Z3. I created two solvers so I could test both versions in one run. 
</p>\n<div class=\"codehilite\"><pre><span></span><code><span class=\"n\">a0</span><span class=\"p\">,</span> <span class=\"n\">a</span><span class=\"p\">,</span> <span class=\"n\">b</span><span class=\"p\">,</span> <span class=\"n\">c</span><span class=\"p\">,</span> <span class=\"n\">d</span><span class=\"p\">,</span> <span class=\"n\">e</span><span class=\"p\">,</span> <span class=\"n\">f</span> <span class=\"o\">=</span> <span class=\"n\">Consts</span><span class=\"p\">(</span><span class=\"s1\">'a0 a b c d e f'</span><span class=\"p\">,</span> <span class=\"n\">IntSort</span><span class=\"p\">())</span>\n<span class=\"n\">x</span><span class=\"p\">,</span> <span class=\"n\">y</span><span class=\"p\">,</span> <span class=\"n\">z</span> <span class=\"o\">=</span> <span class=\"n\">Ints</span><span class=\"p\">(</span><span class=\"s1\">'x y z'</span><span class=\"p\">)</span>\n<span class=\"n\">t</span> <span class=\"o\">=</span> <span class=\"s2\">\"a*x+b*y+c*z+d*x*y+e*x*z+f*y*z+a0\"</span>\n</code></pre></div>\n<p>Both <code>Const('x', IntSort())</code> and <code>Int('x')</code> do the exact same thing, the latter being syntactic sugar for the former. I did not know this when I wrote the program. </p>\n<p>To keep the two versions in sync I represented the equation as a string, which I later <code>eval</code>. This is one of the rare cases where eval is a good idea, to help us experiment more quickly while learning. 
The polynomial is a \"2nd-order polynomial\", even though it doesn't have <code>x^2</code> terms, as it has <code>xy</code> and <code>xz</code> terms.</p>\n<div class=\"codehilite\"><pre><span></span><code><span class=\"n\">lambdamax</span> <span class=\"o\">=</span> <span class=\"k\">lambda</span> <span class=\"n\">x</span><span class=\"p\">,</span> <span class=\"n\">y</span><span class=\"p\">,</span> <span class=\"n\">z</span><span class=\"p\">:</span> <span class=\"nb\">eval</span><span class=\"p\">(</span><span class=\"n\">t</span><span class=\"p\">)</span>\n\n<span class=\"n\">z3max</span> <span class=\"o\">=</span> <span class=\"n\">Function</span><span class=\"p\">(</span><span class=\"s1\">'z3max'</span><span class=\"p\">,</span> <span class=\"n\">IntSort</span><span class=\"p\">(),</span> <span class=\"n\">IntSort</span><span class=\"p\">(),</span> <span class=\"n\">IntSort</span><span class=\"p\">(),</span>  <span class=\"n\">IntSort</span><span class=\"p\">())</span>\n<span class=\"n\">s1</span><span class=\"o\">.</span><span class=\"n\">add</span><span class=\"p\">(</span><span class=\"n\">ForAll</span><span class=\"p\">([</span><span class=\"n\">x</span><span class=\"p\">,</span> <span class=\"n\">y</span><span class=\"p\">,</span> <span class=\"n\">z</span><span class=\"p\">],</span> <span class=\"n\">z3max</span><span class=\"p\">(</span><span class=\"n\">x</span><span class=\"p\">,</span> <span class=\"n\">y</span><span class=\"p\">,</span> <span class=\"n\">z</span><span class=\"p\">)</span> <span class=\"o\">==</span> <span class=\"nb\">eval</span><span class=\"p\">(</span><span class=\"n\">t</span><span class=\"p\">)))</span>\n</code></pre></div>\n<p><code>lambdamax</code> is pretty straightforward: create a lambda with three parameters and <code>eval</code> the string. 
The string \"<code>a*x</code>\" then becomes the Python expression <code>a*x</code>, where <code>a</code> is an SMT symbol, while the <code>x</code> SMT symbol is shadowed by the lambda parameter. To reiterate, this is a terrible idea in practice, but a good way to learn faster.</p>\n<p>The <code>z3max</code> function is a little more complex. <code>Function</code> takes an identifier string and N \"sorts\" (roughly the same as programming types). The first <code>N-1</code> sorts define the parameters of the function, while the last becomes the output. So here I assign the string identifier <code>\"z3max\"</code> to be a function with signature <code>(int, int, int) -> int</code>.</p>\n<p>I can load the function into the model by specifying constraints on what <code>z3max</code> <em>could</em> be. This could either be a strict input/output, as will be done later, or a <code>ForAll</code> over all possible inputs. Here I just use that directly to say \"for all inputs, the function should match this polynomial.\" But I could do more complicated constraints, like commutativity (<code>f(x, y) == f(y, x)</code>) or monotonicity (<code>Implies(x < y, f(x) <= f(y))</code>).</p>\n<p>Note that <code>ForAll</code> takes a list of Z3 symbols to quantify over. That's the only reason we need to define <code>x, y, z</code> in the first place. The lambda version doesn't need them. 
</p>\n<div class=\"codehilite\"><pre><span></span><code><span class=\"n\">inputs</span> <span class=\"o\">=</span> <span class=\"p\">[(</span><span class=\"mi\">1</span><span class=\"p\">,</span><span class=\"mi\">2</span><span class=\"p\">,</span><span class=\"mi\">3</span><span class=\"p\">),</span> <span class=\"p\">(</span><span class=\"mi\">4</span><span class=\"p\">,</span> <span class=\"mi\">2</span><span class=\"p\">,</span> <span class=\"mi\">2</span><span class=\"p\">),</span> <span class=\"p\">(</span><span class=\"mi\">1</span><span class=\"p\">,</span> <span class=\"mi\">1</span><span class=\"p\">,</span> <span class=\"mi\">1</span><span class=\"p\">),</span> <span class=\"p\">(</span><span class=\"mi\">3</span><span class=\"p\">,</span> <span class=\"mi\">5</span><span class=\"p\">,</span> <span class=\"mi\">4</span><span class=\"p\">)]</span>\n\n<span class=\"k\">for</span> <span class=\"n\">g</span> <span class=\"ow\">in</span> <span class=\"n\">inputs</span><span class=\"p\">:</span>\n    <span class=\"n\">s1</span><span class=\"o\">.</span><span class=\"n\">add</span><span class=\"p\">(</span><span class=\"n\">z3max</span><span class=\"p\">(</span><span class=\"o\">*</span><span class=\"n\">g</span><span class=\"p\">)</span> <span class=\"o\">==</span> <span class=\"nb\">max</span><span class=\"p\">(</span><span class=\"o\">*</span><span class=\"n\">g</span><span class=\"p\">))</span>\n    <span class=\"n\">s2</span><span class=\"o\">.</span><span class=\"n\">add</span><span class=\"p\">(</span><span class=\"n\">lambdamax</span><span class=\"p\">(</span><span class=\"o\">*</span><span class=\"n\">g</span><span class=\"p\">)</span> <span class=\"o\">==</span> <span class=\"nb\">max</span><span class=\"p\">(</span><span class=\"o\">*</span><span class=\"n\">g</span><span class=\"p\">))</span>\n</code></pre></div>\n<p>This sets up the joke: adding constraints to each solver that the polynomial it finds must, for a fixed list of triplets, return the 
max of each triplet.</p>\n<div class=\"codehilite\"><pre><span></span><code><span class=\"k\">for</span> <span class=\"n\">s</span><span class=\"p\">,</span> <span class=\"n\">func</span> <span class=\"ow\">in</span> <span class=\"p\">[(</span><span class=\"n\">s1</span><span class=\"p\">,</span> <span class=\"n\">z3max</span><span class=\"p\">),</span> <span class=\"p\">(</span><span class=\"n\">s2</span><span class=\"p\">,</span> <span class=\"n\">lambdamax</span><span class=\"p\">)]:</span>\n    <span class=\"k\">if</span> <span class=\"n\">s</span><span class=\"o\">.</span><span class=\"n\">check</span><span class=\"p\">()</span> <span class=\"o\">==</span> <span class=\"n\">sat</span><span class=\"p\">:</span>\n        <span class=\"n\">m</span> <span class=\"o\">=</span> <span class=\"n\">s</span><span class=\"o\">.</span><span class=\"n\">model</span><span class=\"p\">()</span>\n        <span class=\"k\">for</span> <span class=\"n\">x</span><span class=\"p\">,</span> <span class=\"n\">y</span><span class=\"p\">,</span> <span class=\"n\">z</span> <span class=\"ow\">in</span> <span class=\"n\">inputs</span><span class=\"p\">:</span>\n            <span class=\"nb\">print</span><span class=\"p\">(</span><span class=\"sa\">f</span><span class=\"s2\">\"max([</span><span class=\"si\">{</span><span class=\"n\">x</span><span class=\"si\">}</span><span class=\"s2\">, </span><span class=\"si\">{</span><span class=\"n\">y</span><span class=\"si\">}</span><span class=\"s2\">, </span><span class=\"si\">{</span><span class=\"n\">z</span><span class=\"si\">}</span><span class=\"s2\">]) =\"</span><span class=\"p\">,</span> <span class=\"n\">m</span><span class=\"o\">.</span><span class=\"n\">evaluate</span><span class=\"p\">(</span><span class=\"n\">func</span><span class=\"p\">(</span><span class=\"n\">x</span><span class=\"p\">,</span> <span class=\"n\">y</span><span class=\"p\">,</span> <span class=\"n\">z</span><span class=\"p\">)))</span>\n        <span 
class=\"nb\">print</span><span class=\"p\">(</span><span class=\"sa\">f</span><span class=\"s2\">\"max([x, y, z]) = </span><span class=\"si\">{</span><span class=\"n\">m</span><span class=\"p\">[</span><span class=\"n\">a</span><span class=\"p\">]</span><span class=\"si\">}</span><span class=\"s2\">x + </span><span class=\"si\">{</span><span class=\"n\">m</span><span class=\"p\">[</span><span class=\"n\">b</span><span class=\"p\">]</span><span class=\"si\">}</span><span class=\"s2\">y\"</span><span class=\"p\">,</span>\n            <span class=\"sa\">f</span><span class=\"s2\">\"+ </span><span class=\"si\">{</span><span class=\"n\">m</span><span class=\"p\">[</span><span class=\"n\">c</span><span class=\"p\">]</span><span class=\"si\">}</span><span class=\"s2\">z +\"</span><span class=\"p\">,</span> <span class=\"c1\"># linebreaks added for newsletter rendering</span>\n            <span class=\"sa\">f</span><span class=\"s2\">\"</span><span class=\"si\">{</span><span class=\"n\">m</span><span class=\"p\">[</span><span class=\"n\">d</span><span class=\"p\">]</span><span class=\"si\">}</span><span class=\"s2\">xy + </span><span class=\"si\">{</span><span class=\"n\">m</span><span class=\"p\">[</span><span class=\"n\">e</span><span class=\"p\">]</span><span class=\"si\">}</span><span class=\"s2\">xz + </span><span class=\"si\">{</span><span class=\"n\">m</span><span class=\"p\">[</span><span class=\"n\">f</span><span class=\"p\">]</span><span class=\"si\">}</span><span class=\"s2\">yz + </span><span class=\"si\">{</span><span class=\"n\">m</span><span class=\"p\">[</span><span class=\"n\">a0</span><span class=\"p\">]</span><span class=\"si\">}</span><span class=\"se\">\\n</span><span class=\"s2\">\"</span><span class=\"p\">)</span>\n</code></pre></div>\n<p>Output:</p>\n<div class=\"codehilite\"><pre><span></span><code>max([1, 2, 3]) = 3\n# etc\nmax([x, y, z]) = -133x + 130y + -10z + -2xy + 62xz + -46yz + 0\n\nmax([1, 2, 3]) = 3\n# etc\nmax([x, y, z]) = -17x + 16y + 0z 
+ 0xy + 8xz + -6yz + 0\n</code></pre></div>\n<p>I find that <code>z3max</code> (top) consistently finds larger coefficients than <code>lambdamax</code> does. I don't know why.</p>\n<h3>Practical Applications</h3>\n<p><strong>Test-Driven Development</strong> recommends a strict \"red-green refactor\" cycle. Write a new failing test, make the new test pass, then go back and refactor. Well, the easiest way to make the new test pass would be to paste in a new polynomial, so that's what you should be doing. You can even do this all automatically: have a script read the set of test cases, pass them to the solver, and write the new polynomial to your code file. All you need to do is write the tests!</p>\n<h3>Pedagogical Notes</h3>\n<p>Writing the script took me a couple of hours. I'm sure an LLM could have whipped it all up in five minutes but I really want to <em>learn</em> SMT and <a href=\"https://www.sciencedirect.com/science/article/pii/S0747563224002541\" target=\"_blank\">LLMs <em>may</em> decrease learning retention</a>.<sup id=\"fnref:caveat\"><a class=\"footnote-ref\" href=\"#fn:caveat\">1</a></sup> Z3 documentation is not... great for non-academics, though, and most other SMT solvers have even worse docs. One useful trick I use regularly is to use GitHub code search to find code using the same APIs and study how that works. Turns out reading API-heavy code is a lot easier than writing it!</p>\n<p>Anyway, I'm very, very slowly feeling like I'm getting the basics of how to use SMT. I don't have any practical use cases yet, but I've wanted to learn this skill for a while and I'm glad I finally did.</p>\n<div class=\"footnote\">\n<hr/>\n<ol>\n<li id=\"fn:caveat\">\n<p>Caveat: I have not actually <em>read</em> the study. For all I know it could have a sample size of three people. I'll get around to it eventually. <a class=\"footnote-backref\" href=\"#fnref:caveat\" title=\"Jump back to footnote 1 in the text\">↩</a></p>\n</li>\n</ol>\n</div>",
          "url": "https://buttondown.com/hillelwayne/archive/you-can-cheat-a-test-suite-with-a-big-enough/",
          "published": "2025-06-24T16:27:01.000Z",
          "updated": "2025-06-24T16:27:01.000Z",
          "content": null,
          "image": null,
          "media": [],
          "authors": [],
          "categories": []
        },
        {
          "id": "https://buttondown.com/hillelwayne/archive/solving-linkedin-queens-with-smt/",
          "title": "Solving LinkedIn Queens with SMT",
          "description": "<h3>No newsletter next week</h3>\n<p>I’ll be speaking at <a href=\"https://systemsdistributed.com/\" target=\"_blank\">Systems Distributed</a>. My talk isn't close to done yet, which is why this newsletter is both late and short. </p>\n<h1>Solving LinkedIn Queens in SMT</h1>\n<p>The article <a href=\"https://codingnest.com/modern-sat-solvers-fast-neat-underused-part-1-of-n/\" target=\"_blank\">Modern SAT solvers: fast, neat and underused</a> claims that SAT solvers<sup id=\"fnref:SAT\"><a class=\"footnote-ref\" href=\"#fn:SAT\">1</a></sup> are \"criminally underused by the industry\". A while back on the newsletter I asked \"why\": how come they're so powerful and yet nobody uses them? Many experts responded saying the reason is that encoding SAT kinda sucked and they rather prefer using tools that compile to SAT. </p>\n<p>I was reminded of this when I read <a href=\"https://ryanberger.me/posts/queens/\" target=\"_blank\">Ryan Berger's post</a> on solving “LinkedIn Queens” as a SAT problem. </p>\n<p>A quick overview of Queens. You’re presented with an NxN grid divided into N regions, and have to place N queens so that there is exactly one queen in each row, column, and region. While queens can be on the same diagonal, they <em>cannot</em> be adjacently diagonal.</p>\n<p>(Important note: Linkedin “Queens” is a variation on the puzzle game <a href=\"https://starbattle.puzzlebaron.com/\" target=\"_blank\">Star Battle</a>, which is the same except the number of stars you place in each row/column/region varies per puzzle, and is usually two. This is also why 'queens' don’t capture like chess queens.)</p>\n<p><img alt=\"An image of a solved queens board. 
Copied from https://ryanberger.me/posts/queens\" class=\"newsletter-image\" src=\"https://assets.buttondown.email/images/96f6f923-331f-424d-8641-fe6753e1c2ca.png?w=960&fit=max\"/></p>\n<p>Ryan solved this by writing Queens as a SAT problem, expressing properties like \"there is exactly one queen in row 3\" as a large number of boolean clauses. <a href=\"https://ryanberger.me/posts/queens/\" target=\"_blank\">Go read his post, it's pretty cool</a>. What leapt out to me was that he used <a href=\"https://cvc5.github.io/\" target=\"_blank\">CVC5</a>, an <strong>SMT</strong> solver.<sup id=\"fnref:SMT\"><a class=\"footnote-ref\" href=\"#fn:SMT\">2</a></sup> SMT solvers are \"higher-level\" than SAT, capable of handling more data types than just boolean variables. It's a lot easier to solve the problem at the SMT level than at the SAT level. To show this, I whipped up a short demo of solving the same problem in <a href=\"https://github.com/Z3Prover/z3/wiki\" target=\"_blank\">Z3</a> (via the <a href=\"https://pypi.org/project/z3-solver/\" target=\"_blank\">Python API</a>).</p>\n<p><a href=\"https://gist.github.com/hwayne/c5de7bc52e733995311236666bedecd3\" target=\"_blank\">Full code here</a>, which you can compare to Ryan's SAT solution <a href=\"https://github.com/ryan-berger/queens/blob/master/main.py\" target=\"_blank\">here</a>. 
I didn't do a whole lot of cleanup on it (again, time crunch!), but there's a short explanation below.</p>\n<h3>The code</h3>\n<div class=\"codehilite\"><pre><span></span><code><span class=\"kn\">from</span><span class=\"w\"> </span><span class=\"nn\">z3</span><span class=\"w\"> </span><span class=\"kn\">import</span> <span class=\"o\">*</span> <span class=\"c1\"># type: ignore</span>\n<span class=\"kn\">from</span><span class=\"w\"> </span><span class=\"nn\">itertools</span><span class=\"w\"> </span><span class=\"kn\">import</span> <span class=\"n\">combinations</span><span class=\"p\">,</span> <span class=\"n\">chain</span><span class=\"p\">,</span> <span class=\"n\">product</span>\n<span class=\"n\">solver</span> <span class=\"o\">=</span> <span class=\"n\">Solver</span><span class=\"p\">()</span>\n<span class=\"n\">size</span> <span class=\"o\">=</span> <span class=\"mi\">9</span> <span class=\"c1\"># N</span>\n</code></pre></div>\n<p>Initial setup and modules. <code>size</code> is the number of rows/columns/regions in the board, which I'll call <code>N</code> below.</p>\n<div class=\"codehilite\"><pre><span></span><code><span class=\"c1\"># queens[n] = col of queen on row n</span>\n<span class=\"c1\"># by construction, not on same row</span>\n<span class=\"n\">queens</span> <span class=\"o\">=</span> <span class=\"n\">IntVector</span><span class=\"p\">(</span><span class=\"s1\">'q'</span><span class=\"p\">,</span> <span class=\"n\">size</span><span class=\"p\">)</span> \n</code></pre></div>\n<p>SAT represents the queen positions via N² booleans: <code>q_00</code> means that a queen is on row 0 and column 0, <code>!q_05</code> means a queen <em>isn't</em> on row 0 and column 5, etc. In SMT we can instead encode it as N integers: <code>q_0 = 5</code> means that the queen on row 0 is positioned at column 5. 
This immediately enforces one class of constraints for us: we don't need any constraints saying \"exactly one queen per row\", because that's embedded in the definition of <code>queens</code>!</p>\n<p>(Incidentally, using 0-based indexing for the board was a mistake on my part, it makes correctly encoding the regions later really painful.)</p>\n<p>To actually make the variables <code>[q_0, q_1, …]</code>, we use the Z3 affordance <code>IntVector(str, n)</code> for making <code>n</code> variables at once.</p>\n<div class=\"codehilite\"><pre><span></span><code><span class=\"n\">solver</span><span class=\"o\">.</span><span class=\"n\">add</span><span class=\"p\">([</span><span class=\"n\">And</span><span class=\"p\">(</span><span class=\"mi\">0</span> <span class=\"o\"><=</span> <span class=\"n\">i</span><span class=\"p\">,</span> <span class=\"n\">i</span> <span class=\"o\"><</span> <span class=\"n\">size</span><span class=\"p\">)</span> <span class=\"k\">for</span> <span class=\"n\">i</span> <span class=\"ow\">in</span> <span class=\"n\">queens</span><span class=\"p\">])</span>\n<span class=\"c1\"># not on same column</span>\n<span class=\"n\">solver</span><span class=\"o\">.</span><span class=\"n\">add</span><span class=\"p\">(</span><span class=\"n\">Distinct</span><span class=\"p\">(</span><span class=\"n\">queens</span><span class=\"p\">))</span>\n</code></pre></div>\n<p>First we constrain all the integers to <code>[0, N)</code>, then use the <em>incredibly</em> handy <code>Distinct</code> constraint to force all the integers to have different values. 
This guarantees at most one queen per column, which by the <a href=\"https://en.wikipedia.org/wiki/Pigeonhole_principle\" target=\"_blank\">pigeonhole principle</a> means there is exactly one queen per column.</p>\n<div class=\"codehilite\"><pre><span></span><code><span class=\"c1\"># not diagonally adjacent</span>\n<span class=\"k\">for</span> <span class=\"n\">i</span> <span class=\"ow\">in</span> <span class=\"nb\">range</span><span class=\"p\">(</span><span class=\"n\">size</span><span class=\"o\">-</span><span class=\"mi\">1</span><span class=\"p\">):</span>\n    <span class=\"n\">q1</span><span class=\"p\">,</span> <span class=\"n\">q2</span> <span class=\"o\">=</span> <span class=\"n\">queens</span><span class=\"p\">[</span><span class=\"n\">i</span><span class=\"p\">],</span> <span class=\"n\">queens</span><span class=\"p\">[</span><span class=\"n\">i</span><span class=\"o\">+</span><span class=\"mi\">1</span><span class=\"p\">]</span>\n    <span class=\"n\">solver</span><span class=\"o\">.</span><span class=\"n\">add</span><span class=\"p\">(</span><span class=\"n\">Abs</span><span class=\"p\">(</span><span class=\"n\">q1</span> <span class=\"o\">-</span> <span class=\"n\">q2</span><span class=\"p\">)</span> <span class=\"o\">!=</span> <span class=\"mi\">1</span><span class=\"p\">)</span>\n</code></pre></div>\n<p>One of the rules is that queens can't be adjacent. We already know that they can't be horizontally or vertically adjacent via other constraints, which leaves the diagonals. We only need to add constraints that, for each queen, there is no queen in the lower-left or lower-right corner, aka <code>q_3 != q_2 ± 1</code>. We don't need to check the top corners because if <code>q_1</code> is in the upper-left corner of <code>q_2</code>, then <code>q_2</code> is in the lower-right corner of <code>q_1</code>!</p>\n<p>That covers everything except the \"one queen per region\" constraint. 
But the regions are the tricky part, which we should expect because we vary the difficulty of queens games by varying the regions.</p>\n<div class=\"codehilite\"><pre><span></span><code><span class=\"n\">regions</span> <span class=\"o\">=</span> <span class=\"p\">{</span>\n        <span class=\"s2\">\"purple\"</span><span class=\"p\">:</span> <span class=\"p\">[(</span><span class=\"mi\">0</span><span class=\"p\">,</span> <span class=\"mi\">0</span><span class=\"p\">),</span> <span class=\"p\">(</span><span class=\"mi\">0</span><span class=\"p\">,</span> <span class=\"mi\">1</span><span class=\"p\">),</span> <span class=\"p\">(</span><span class=\"mi\">0</span><span class=\"p\">,</span> <span class=\"mi\">2</span><span class=\"p\">),</span> <span class=\"p\">(</span><span class=\"mi\">0</span><span class=\"p\">,</span> <span class=\"mi\">3</span><span class=\"p\">),</span> <span class=\"p\">(</span><span class=\"mi\">0</span><span class=\"p\">,</span> <span class=\"mi\">4</span><span class=\"p\">),</span> <span class=\"p\">(</span><span class=\"mi\">0</span><span class=\"p\">,</span> <span class=\"mi\">5</span><span class=\"p\">),</span> <span class=\"p\">(</span><span class=\"mi\">0</span><span class=\"p\">,</span> <span class=\"mi\">6</span><span class=\"p\">),</span> <span class=\"p\">(</span><span class=\"mi\">0</span><span class=\"p\">,</span> <span class=\"mi\">7</span><span class=\"p\">),</span> <span class=\"p\">(</span><span class=\"mi\">0</span><span class=\"p\">,</span> <span class=\"mi\">8</span><span class=\"p\">),</span>\n                   <span class=\"p\">(</span><span class=\"mi\">1</span><span class=\"p\">,</span> <span class=\"mi\">0</span><span class=\"p\">),</span> <span class=\"p\">(</span><span class=\"mi\">2</span><span class=\"p\">,</span> <span class=\"mi\">0</span><span class=\"p\">),</span> <span class=\"p\">(</span><span class=\"mi\">3</span><span class=\"p\">,</span> <span class=\"mi\">0</span><span class=\"p\">),</span> <span 
class=\"p\">(</span><span class=\"mi\">4</span><span class=\"p\">,</span> <span class=\"mi\">0</span><span class=\"p\">),</span> <span class=\"p\">(</span><span class=\"mi\">5</span><span class=\"p\">,</span> <span class=\"mi\">0</span><span class=\"p\">),</span> <span class=\"p\">(</span><span class=\"mi\">6</span><span class=\"p\">,</span> <span class=\"mi\">0</span><span class=\"p\">),</span> <span class=\"p\">(</span><span class=\"mi\">7</span><span class=\"p\">,</span> <span class=\"mi\">0</span><span class=\"p\">),</span> <span class=\"p\">(</span><span class=\"mi\">8</span><span class=\"p\">,</span> <span class=\"mi\">0</span><span class=\"p\">),</span>\n                   <span class=\"p\">(</span><span class=\"mi\">1</span><span class=\"p\">,</span> <span class=\"mi\">1</span><span class=\"p\">),</span> <span class=\"p\">(</span><span class=\"mi\">8</span><span class=\"p\">,</span> <span class=\"mi\">1</span><span class=\"p\">)],</span>\n        <span class=\"s2\">\"red\"</span><span class=\"p\">:</span> <span class=\"p\">[(</span><span class=\"mi\">1</span><span class=\"p\">,</span> <span class=\"mi\">2</span><span class=\"p\">),</span> <span class=\"p\">(</span><span class=\"mi\">2</span><span class=\"p\">,</span> <span class=\"mi\">2</span><span class=\"p\">),</span> <span class=\"p\">(</span><span class=\"mi\">2</span><span class=\"p\">,</span> <span class=\"mi\">1</span><span class=\"p\">),</span> <span class=\"p\">(</span><span class=\"mi\">3</span><span class=\"p\">,</span> <span class=\"mi\">1</span><span class=\"p\">),</span> <span class=\"p\">(</span><span class=\"mi\">4</span><span class=\"p\">,</span> <span class=\"mi\">1</span><span class=\"p\">),</span> <span class=\"p\">(</span><span class=\"mi\">5</span><span class=\"p\">,</span> <span class=\"mi\">1</span><span class=\"p\">),</span> <span class=\"p\">(</span><span class=\"mi\">6</span><span class=\"p\">,</span> <span class=\"mi\">1</span><span class=\"p\">),</span> <span 
class=\"p\">(</span><span class=\"mi\">6</span><span class=\"p\">,</span> <span class=\"mi\">2</span><span class=\"p\">),</span> <span class=\"p\">(</span><span class=\"mi\">7</span><span class=\"p\">,</span> <span class=\"mi\">1</span><span class=\"p\">),</span> <span class=\"p\">(</span><span class=\"mi\">7</span><span class=\"p\">,</span> <span class=\"mi\">2</span><span class=\"p\">),</span> <span class=\"p\">(</span><span class=\"mi\">8</span><span class=\"p\">,</span> <span class=\"mi\">2</span><span class=\"p\">),</span> <span class=\"p\">(</span><span class=\"mi\">8</span><span class=\"p\">,</span> <span class=\"mi\">3</span><span class=\"p\">),],</span>\n        <span class=\"c1\"># you get the picture</span>\n        <span class=\"p\">}</span>\n\n<span class=\"c1\"># Some checking code left out, see below</span>\n</code></pre></div>\n<p>The region has to be manually coded in, which is a huge pain.</p>\n<p>(In the link, some validation code follows. Since it breaks up explaining the model I put it in the next section.)</p>\n<div class=\"codehilite\"><pre><span></span><code><span class=\"k\">for</span> <span class=\"n\">r</span> <span class=\"ow\">in</span> <span class=\"n\">regions</span><span class=\"o\">.</span><span class=\"n\">values</span><span class=\"p\">():</span>\n    <span class=\"n\">solver</span><span class=\"o\">.</span><span class=\"n\">add</span><span class=\"p\">(</span><span class=\"n\">Or</span><span class=\"p\">(</span>\n        <span class=\"o\">*</span><span class=\"p\">[</span><span class=\"n\">queens</span><span class=\"p\">[</span><span class=\"n\">row</span><span class=\"p\">]</span> <span class=\"o\">==</span> <span class=\"n\">col</span> <span class=\"k\">for</span> <span class=\"p\">(</span><span class=\"n\">row</span><span class=\"p\">,</span> <span class=\"n\">col</span><span class=\"p\">)</span> <span class=\"ow\">in</span> <span class=\"n\">r</span><span class=\"p\">]</span>\n        <span 
class=\"p\">))</span>\n</code></pre></div>\n<p>Finally we have the region constraint. The easiest way I found to say \"there is exactly one queen in each region\" is to say \"there is a queen in region 1 and a queen in region 2 and a queen in region 3\", etc. Since there are N queens and N regions, having at least one queen in every region forces exactly one in each. Then to say \"there is a queen in region <code>purple</code>\" I wrote \"<code>q_0 = 0</code> OR <code>q_0 = 1</code> OR … OR <code>q_1 = 0</code> etc.\" </p>\n<p>Why iterate over every position in the region instead of doing something like <code>(0, q[0]) in r</code>? I tried that but it's not an expression that Z3 supports.</p>\n<div class=\"codehilite\"><pre><span></span><code><span class=\"k\">if</span> <span class=\"n\">solver</span><span class=\"o\">.</span><span class=\"n\">check</span><span class=\"p\">()</span> <span class=\"o\">==</span> <span class=\"n\">sat</span><span class=\"p\">:</span>\n    <span class=\"n\">m</span> <span class=\"o\">=</span> <span class=\"n\">solver</span><span class=\"o\">.</span><span class=\"n\">model</span><span class=\"p\">()</span>\n    <span class=\"nb\">print</span><span class=\"p\">([(</span><span class=\"n\">l</span><span class=\"p\">,</span> <span class=\"n\">m</span><span class=\"p\">[</span><span class=\"n\">l</span><span class=\"p\">])</span> <span class=\"k\">for</span> <span class=\"n\">l</span> <span class=\"ow\">in</span> <span class=\"n\">queens</span><span class=\"p\">])</span>\n</code></pre></div>\n<p>Finally, we solve and print the positions. 
Running this gives me:</p>\n<div class=\"codehilite\"><pre><span></span><code><span class=\"p\">[(</span><span class=\"n\">q__0</span><span class=\"p\">,</span> <span class=\"mi\">0</span><span class=\"p\">),</span> <span class=\"p\">(</span><span class=\"n\">q__1</span><span class=\"p\">,</span> <span class=\"mi\">5</span><span class=\"p\">),</span> <span class=\"p\">(</span><span class=\"n\">q__2</span><span class=\"p\">,</span> <span class=\"mi\">8</span><span class=\"p\">),</span> \n <span class=\"p\">(</span><span class=\"n\">q__3</span><span class=\"p\">,</span> <span class=\"mi\">2</span><span class=\"p\">),</span> <span class=\"p\">(</span><span class=\"n\">q__4</span><span class=\"p\">,</span> <span class=\"mi\">7</span><span class=\"p\">),</span> <span class=\"p\">(</span><span class=\"n\">q__5</span><span class=\"p\">,</span> <span class=\"mi\">4</span><span class=\"p\">),</span> \n <span class=\"p\">(</span><span class=\"n\">q__6</span><span class=\"p\">,</span> <span class=\"mi\">1</span><span class=\"p\">),</span> <span class=\"p\">(</span><span class=\"n\">q__7</span><span class=\"p\">,</span> <span class=\"mi\">3</span><span class=\"p\">),</span> <span class=\"p\">(</span><span class=\"n\">q__8</span><span class=\"p\">,</span> <span class=\"mi\">6</span><span class=\"p\">)]</span>\n</code></pre></div>\n<p>Which is the correct solution to the queens puzzle. I didn't benchmark the solution times, but I imagine it's considerably slower than a raw SAT solver. <a href=\"https://github.com/audemard/glucose\" target=\"_blank\">Glucose</a> is really, really fast.</p>\n<p>But even so, solving the problem with SMT was a lot <em>easier</em> than solving it with SAT. That satisfies me as an explanation for why people prefer it to SAT.</p>\n<h3>Sanity checks</h3>\n<p>One bit I glossed over earlier was the sanity checking code. 
I <em>knew for sure</em> that I was going to make a mistake encoding the <code>regions</code>, and the solver wasn't going to provide useful information about what I did wrong. In cases like these, I like adding small tests and checks to catch mistakes early, because the solver certainly isn't going to catch them!</p>\n<div class=\"codehilite\"><pre><span></span><code><span class=\"n\">all_squares</span> <span class=\"o\">=</span> <span class=\"nb\">set</span><span class=\"p\">(</span><span class=\"n\">product</span><span class=\"p\">(</span><span class=\"nb\">range</span><span class=\"p\">(</span><span class=\"n\">size</span><span class=\"p\">),</span> <span class=\"n\">repeat</span><span class=\"o\">=</span><span class=\"mi\">2</span><span class=\"p\">))</span>\n<span class=\"k\">def</span><span class=\"w\"> </span><span class=\"nf\">test_i_set_up_problem_right</span><span class=\"p\">():</span>\n    <span class=\"k\">assert</span> <span class=\"n\">all_squares</span> <span class=\"o\">==</span> <span class=\"nb\">set</span><span class=\"p\">(</span><span class=\"n\">chain</span><span class=\"o\">.</span><span class=\"n\">from_iterable</span><span class=\"p\">(</span><span class=\"n\">regions</span><span class=\"o\">.</span><span class=\"n\">values</span><span class=\"p\">()))</span>\n\n    <span class=\"k\">for</span> <span class=\"n\">r1</span><span class=\"p\">,</span> <span class=\"n\">r2</span> <span class=\"ow\">in</span> <span class=\"n\">combinations</span><span class=\"p\">(</span><span class=\"n\">regions</span><span class=\"o\">.</span><span class=\"n\">values</span><span class=\"p\">(),</span> <span class=\"mi\">2</span><span class=\"p\">):</span>\n        <span class=\"k\">assert</span> <span class=\"ow\">not</span> <span class=\"nb\">set</span><span class=\"p\">(</span><span class=\"n\">r1</span><span class=\"p\">)</span> <span class=\"o\">&</span> <span class=\"nb\">set</span><span class=\"p\">(</span><span class=\"n\">r2</span><span 
class=\"p\">),</span> <span class=\"nb\">set</span><span class=\"p\">(</span><span class=\"n\">r1</span><span class=\"p\">)</span> <span class=\"o\">&</span> <span class=\"nb\">set</span><span class=\"p\">(</span><span class=\"n\">r2</span><span class=\"p\">)</span>\n</code></pre></div>\n<p>The first check was a quick test that I didn't leave any squares out, or accidentally put the same square in two regions. Converting the values into sets makes both checks a lot easier. Honestly I don't know why I didn't just use sets from the start, sets are great.</p>\n<div class=\"codehilite\"><pre><span></span><code><span class=\"k\">def</span><span class=\"w\"> </span><span class=\"nf\">render_regions</span><span class=\"p\">():</span>\n    <span class=\"n\">colormap</span> <span class=\"o\">=</span> <span class=\"p\">[</span><span class=\"s2\">\"purple\"</span><span class=\"p\">,</span>  <span class=\"s2\">\"red\"</span><span class=\"p\">,</span> <span class=\"s2\">\"brown\"</span><span class=\"p\">,</span> <span class=\"s2\">\"white\"</span><span class=\"p\">,</span> <span class=\"s2\">\"green\"</span><span class=\"p\">,</span> <span class=\"s2\">\"yellow\"</span><span class=\"p\">,</span> <span class=\"s2\">\"orange\"</span><span class=\"p\">,</span> <span class=\"s2\">\"blue\"</span><span class=\"p\">,</span> <span class=\"s2\">\"pink\"</span><span class=\"p\">]</span>\n    <span class=\"n\">board</span> <span class=\"o\">=</span> <span class=\"p\">[[</span><span class=\"mi\">0</span> <span class=\"k\">for</span> <span class=\"n\">_</span> <span class=\"ow\">in</span> <span class=\"nb\">range</span><span class=\"p\">(</span><span class=\"n\">size</span><span class=\"p\">)]</span> <span class=\"k\">for</span> <span class=\"n\">_</span> <span class=\"ow\">in</span> <span class=\"nb\">range</span><span class=\"p\">(</span><span class=\"n\">size</span><span class=\"p\">)]</span> \n    <span class=\"k\">for</span> <span class=\"p\">(</span><span class=\"n\">row</span><span 
class=\"p\">,</span> <span class=\"n\">col</span><span class=\"p\">)</span> <span class=\"ow\">in</span> <span class=\"n\">all_squares</span><span class=\"p\">:</span>\n        <span class=\"k\">for</span> <span class=\"n\">color</span><span class=\"p\">,</span> <span class=\"n\">region</span> <span class=\"ow\">in</span> <span class=\"n\">regions</span><span class=\"o\">.</span><span class=\"n\">items</span><span class=\"p\">():</span>\n            <span class=\"k\">if</span> <span class=\"p\">(</span><span class=\"n\">row</span><span class=\"p\">,</span> <span class=\"n\">col</span><span class=\"p\">)</span> <span class=\"ow\">in</span> <span class=\"n\">region</span><span class=\"p\">:</span>\n                <span class=\"n\">board</span><span class=\"p\">[</span><span class=\"n\">row</span><span class=\"p\">][</span><span class=\"n\">col</span><span class=\"p\">]</span> <span class=\"o\">=</span> <span class=\"n\">colormap</span><span class=\"o\">.</span><span class=\"n\">index</span><span class=\"p\">(</span><span class=\"n\">color</span><span class=\"p\">)</span><span class=\"o\">+</span><span class=\"mi\">1</span>\n\n    <span class=\"k\">for</span> <span class=\"n\">row</span> <span class=\"ow\">in</span> <span class=\"n\">board</span><span class=\"p\">:</span>\n        <span class=\"nb\">print</span><span class=\"p\">(</span><span class=\"s2\">\"\"</span><span class=\"o\">.</span><span class=\"n\">join</span><span class=\"p\">(</span><span class=\"nb\">map</span><span class=\"p\">(</span><span class=\"nb\">str</span><span class=\"p\">,</span> <span class=\"n\">row</span><span class=\"p\">)))</span>\n\n<span class=\"n\">render_regions</span><span class=\"p\">()</span>\n</code></pre></div>\n<p>The second check is something that prints out the regions. 
It produces something like this:</p>\n<div class=\"codehilite\"><pre><span></span><code>111111111\n112333999\n122439999\n124437799\n124666779\n124467799\n122467899\n122555889\n112258899\n</code></pre></div>\n<p>I can compare this to the picture of the board to make sure I got it right. I guess a more advanced solution would be to print emoji squares like 🟥 instead.</p>\n<p>Neither check is quality code but it's throwaway and it gets the job done so eh.</p>\n<h3>Update for the Internet</h3>\n<p>This was sent as a weekly newsletter, which is usually on topics like <a href=\"https://buttondown.com/hillelwayne/archive/why-do-we-call-it-boilerplate-code\" target=\"_blank\">software history</a>, <a href=\"https://buttondown.com/hillelwayne/archive/the-seven-specification-ur-languages/\" target=\"_blank\">formal methods</a>, <a href=\"https://buttondown.com/hillelwayne/archive/i-formally-modeled-dreidel-for-no-good-reason/\" target=\"_blank\">unusual technologies</a>, and the <a href=\"https://buttondown.com/hillelwayne/archive/be-suspicious-of-success/\" target=\"_blank\">theory of software engineering</a>. You <a href=\"https://buttondown.email/hillelwayne/\" target=\"_blank\">can subscribe here</a>.</p>\n<div class=\"footnote\">\n<hr/>\n<ol>\n<li id=\"fn:SAT\">\n<p>\"Boolean <strong>SAT</strong>isfiability Solver\", aka a solver that can find assignments that make complex boolean expressions true. I write a bit more about them <a href=\"https://www.hillelwayne.com/post/np-hard/\" target=\"_blank\">here</a>. <a class=\"footnote-backref\" href=\"#fnref:SAT\" title=\"Jump back to footnote 1 in the text\">↩</a></p>\n</li>\n<li id=\"fn:SMT\">\n<p>\"Satisfiability Modulo Theories\" <a class=\"footnote-backref\" href=\"#fnref:SMT\" title=\"Jump back to footnote 2 in the text\">↩</a></p>\n</li>\n</ol>\n</div>",
          "url": "https://buttondown.com/hillelwayne/archive/solving-linkedin-queens-with-smt/",
          "published": "2025-06-12T15:43:25.000Z",
          "updated": "2025-06-12T15:43:25.000Z",
          "content": null,
          "image": null,
          "media": [],
          "authors": [],
          "categories": []
        },
        {
          "id": "https://buttondown.com/hillelwayne/archive/ai-is-a-gamechanger-for-tla-users/",
          "title": "AI is a gamechanger for TLA+ users",
          "description": "<h3>New Logic for Programmers Release</h3>\n<p><a href=\"https://leanpub.com/logic/\" target=\"_blank\">v0.10 is now available</a>! This is a minor release, mostly focused on logic-based refactoring, with new material on set types and testing refactors are correct. See the full release notes at <a href=\"https://github.com/logicforprogrammers/book-assets/blob/master/CHANGELOG.md\" target=\"_blank\">the changelog page</a>. Due to <a href=\"https://systemsdistributed.com/\" target=\"_blank\">conference pressure</a> v0.11 will also likely be a minor release. </p>\n<p><img alt=\"The book cover\" class=\"newsletter-image\" src=\"https://assets.buttondown.email/images/29d4ae9d-bcb9-4d8b-99d4-8a35c0990ad5.jpg?w=300&fit=max\"/></p>\n<h1>AI is a gamechanger for TLA+ users</h1>\n<p><a href=\"https://lamport.azurewebsites.net/tla/tla.html\" target=\"_blank\">TLA+</a> is a specification language to model and debug distributed systems. While very powerful, it's also hard for programmers to learn, and there's always questions of connecting specifications with actual code. </p>\n<p>That's why <a href=\"https://zfhuang99.github.io/github%20copilot/formal%20verification/tla+/2025/05/24/ai-revolution-in-distributed-systems.html\" target=\"_blank\">The Coming AI Revolution in Distributed Systems</a> caught my interest. In the post, Cheng Huang claims that Azure successfully used LLMs to examine an existing codebase, derive a TLA+ spec, and find a production bug in that spec. \"After a decade of manually crafting TLA+ specifications\", he wrote, \"I must acknowledge that this AI-generated specification rivals human work\".</p>\n<p>This inspired me to experiment with LLMs in TLA+ myself. My goals are a little less ambitious than Cheng's: I wanted to see how LLMs could help junior specifiers write TLA+, rather than handling the entire spec automatically. 
Details on what did and didn't work below, but my takeaway is that <strong>LLMs are an immense specification force multiplier.</strong></p>\n<p>All tests were done with a standard VSCode Copilot subscription, using Claude 3.7 in Agent mode. Other LLMs or IDEs may be more or less effective, etc.</p>\n<h2>Things Claude was good at</h2>\n<h3>Fixing syntax errors</h3>\n<p>TLA+ uses a very different syntax than mainstream programming languages, meaning beginners make a lot of mistakes where they write \"programming syntax\" instead of TLA+ syntax:</p>\n<div class=\"codehilite\"><pre><span></span><code>NotThree(x) = \\* should be ==, not =\n    x != 3 \\* should be #, not !=\n</code></pre></div>\n<p>The problem is that the TLA+ syntax checker, SANY, is 30 years old and doesn't provide good information. Here's what it says for that snippet:</p>\n<div class=\"codehilite\"><pre><span></span><code>Was expecting \"==== or more Module body\"\nEncountered \"NotThree\" at line 6, column 1\n</code></pre></div>\n<p>That only isolates one error and doesn't tell us what the problem is, only where it is. Experienced TLA+ users get \"error eyes\" and can quickly see what the problem is, but beginners really struggle with this.</p>\n<p>The TLA+ Foundation has made LLM integration a priority, so the VSCode extension <a href=\"https://github.com/tlaplus/vscode-tlaplus/blob/master/src/main.ts#L174\" target=\"_blank\">naturally supports several agent actions</a>. One of these is running SANY, meaning an agent can get an error, fix it, get another error, fix it, etc. Provided the above sample and asked to make it work, Claude successfully fixed both errors. 
It also fixed many errors in a larger spec, and figured out why PlusCal specs weren't compiling to TLA+.</p>\n<p>This by itself is already enough to make LLMs a worthwhile tool, as it fixes one of the biggest barriers to entry.</p>\n<h3>Understanding error traces</h3>\n<p>When TLA+ finds a violated property, it outputs the sequence of steps that leads to the error. This starts in plaintext, and VSCode parses it into an interactive table:</p>\n<p><img alt=\"An example error trace\" class=\"newsletter-image\" src=\"https://assets.buttondown.email/images/f7f16d0e-c61f-4286-ae49-67e03f844126.png?w=960&fit=max\"/></p>\n<p>Learning to read these error traces is a skill in itself. You have to understand what's happening in each step and how it relates back to the actually broken property. It takes a long time for people to learn how to do this well.</p>\n<p>Claude was successful here, too, accurately reading 20+ step error traces and giving a high-level explanation of what went wrong. It also could condense error traces: if ten steps of the error trace could be condensed into a one-sentence summary (which can happen if you're modeling a lot of process internals) Claude would do it.</p>\n<p>I did have issues here with doing this in agent mode: while the extension does provide a \"run model checker\" command, the agent would regularly ignore this and prefer to run a terminal command instead. This would be fine except that the LLM consistently hallucinated invalid commands. I had to amend every prompt with \"run the model checker via vscode, do not use a terminal command\". You can skip this if you're willing to copy and paste the error trace into the prompt.</p>\n<p>As with syntax checking, if this was the <em>only</em> thing LLMs could effectively do, that would already be enough<sup id=\"fnref:dayenu\"><a class=\"footnote-ref\" href=\"#fn:dayenu\">1</a></sup> to earn a strong recommend. Even as a TLA+ expert I expect I'll be using this trick regularly. 
</p>\n<h3>Boilerplate tasks</h3>\n<p>TLA+ has a lot of boilerplate. One of the most notorious examples is <code>UNCHANGED</code> rules. Specifications are extremely precise, so precise that you have to specify what variables <em>don't</em> change in every step. This takes the form of an <code>UNCHANGED</code> clause at the end of relevant actions:</p>\n<div class=\"codehilite\"><pre><span></span><code>RemoveObjectFromStore(srv, o, s) ==\n  /\\ o \\in stored[s]\n  /\\ stored' = [stored EXCEPT ![s] = @ \\ {o}]\n  /\\ UNCHANGED <<capacity, log, objectsize, pc>>\n</code></pre></div>\n<p>Writing this is really annoying. Updating these whenever you change an action, or add a new variable to the spec, is doubly so. Syntax checking and error analysis are important for beginners, but this is what I wanted for <em>myself</em>. I took a spec and prompted Claude</p>\n<blockquote>\n<p>Add UNCHANGED <<v1, v2, etc>> for each variable not changed in an action.</p>\n</blockquote>\n<p>And it worked! It successfully updated the <code>UNCHANGED</code> in every action. </p>\n<p>(Note, though, that it was a \"well-behaved\" spec in this regard: only one \"action\" happened at a time. In TLA+ you can have two actions happen simultaneously, that each update half of the variables, meaning neither of them should have an <code>UNCHANGED</code> clause. I haven't tested how Claude handles that!)</p>\n<p>That's the most obvious win, but Claude was good at handling other tedious work, too. Some examples include updating <code>vars</code> (the conventional collection of all state variables), lifting a hard-coded value into a model parameter, and changing data formats. Most impressive to me, though, was rewriting a spec designed for one process to instead handle multiple processes. 
This means taking all of the process variables, which originally have types like <code>Int</code>, converting them to types like <code>[Process -> Int]</code>, and then updating the uses of all of those variables in the spec. It didn't account for race conditions in the new concurrent behavior, but it was an excellent scaffold to do more work.</p>\n<h3>Writing properties from an informal description</h3>\n<p>You have to be pretty precise with your intended property description, but Claude handles converting that precise description into TLA+'s formalized syntax, which is something beginners often struggle with.</p>\n<h2>Things it is less good at</h2>\n<h3>Generating model config files</h3>\n<p>To model check TLA+, you need both a specification (<code>.tla</code>) and a model config file (<code>.cfg</code>), which have separate syntaxes. Asking the agent to generate the second often led to it using TLA+ syntax. It automatically fixed this after getting parsing errors, though. </p>\n<h3>Fixing specs</h3>\n<p>Whenever the agent ran model checking and discovered a bug, it would naturally propose a change to either the invalid property or the spec. Sometimes the changes were good, other times the changes were not physically realizable. For example, if it found that a bug was due to a race condition between processes, it would often suggest fixing it by saying race conditions were okay. I mean yes, if you say bugs are okay, then the spec finds that bugs are okay! Or it would alternatively suggest adding a constraint to the spec saying that race conditions don't happen. <a href=\"https://www.hillelwayne.com/post/alloy-facts/\" target=\"_blank\">But that's a huge mistake in specification</a>, because race conditions happen if we don't have coordination. 
We need to specify the <em>mechanism</em> that is supposed to prevent them.</p>\n<h3>Finding properties of the spec</h3>\n<p>After seeing how capable it was at translating my properties to TLA+, I started prompting Claude to come up with properties on its own. Unfortunately, almost everything I got back was either trivial, uninteresting, or too coupled to implementation details. I haven't tested if it would work better to ask it for \"properties that may be violated\".</p>\n<h3>Generating code from specs</h3>\n<p>I have to be specific here: Claude <em>could</em> sometimes convert Python into a passable spec, and vice versa. It <em>wasn't</em> good at recognizing abstraction. For example, TLA+ specifications often represent sequential operations with a state variable, commonly called <code>pc</code>. If modeling code that nonatomically retrieves a counter value and increments it, we'd have one action that requires <code>pc = \"Get\"</code> and sets the new value to <code>\"Inc\"</code>, then another that requires it be <code>\"Inc\"</code> and sets it to <code>\"Done\"</code>.</p>\n<p>I found that Claude would try to somehow convert <code>pc</code> into part of the Python program's state, rather than recognize it as a TLA+ abstraction. On the other side, when converting Python code to TLA+ it would often try to translate things like <code>sleep</code> into some part of the spec, not recognizing that it is abstractable into a distinct action. I didn't test other possible misconceptions, like converting randomness to nondeterminism.</p>\n<p>For the record, when converting TLA+ to Python Claude tended to make simulators of the spec, rather than possible production code implementing the spec. 
I really wasn't expecting otherwise though.</p>\n<h2>Unexplored Applications</h2>\n<p>Things I haven't explored thoroughly but could possibly be effective, based on what I know about TLA+ and AI:</p>\n<h3>Writing Java Overrides</h3>\n<p>Most TLA+ operators are resolved via TLA+ interpreters, but you can also implement them in \"native\" Java. This lets you escape the standard language semantics and add capabilities like <a href=\"https://github.com/tlaplus/CommunityModules/blob/master/modules/IOUtils.tla\" target=\"_blank\">executing programs during model-checking</a> or <a href=\"https://github.com/tlaplus/tlaplus/blob/master/tlatools/org.lamport.tlatools/src/tla2sany/StandardModules/TLC.tla#L62\" target=\"_blank\">dynamically constrain the depth of the searched state space</a>. There's a lot of cool things I think would be possible with overrides. The problem is there's only a handful of people in the world who know how to write them. But that handful have written quite a few overrides and I think there's enough there for Claude to work with. </p>\n<h3>Writing specs, given a reference mechanism</h3>\n<p>In all my experiments, the LLM only had my prompts and the occasional Python script as information. That makes me suspect that some of its problems with writing and fixing specs come down to not having a system model. Maybe it wouldn't suggest fixes like \"these processes never race\" if it had a design doc saying that the processes can't coordinate. </p>\n<p>(Could a Sufficiently Powerful LLM derive some TLA+ specification from a design document?)</p>\n<h3>Connecting specs and code</h3>\n<p>This is the holy grail of TLA+: taking a codebase and showing it correctly implements a spec. Currently the best ways to do this are by either using TLA+ to generate a test suite, or by taking logged production traces and matching them to TLA+ behaviors. 
<a href=\"https://www.mongodb.com/blog/post/engineering/conformance-checking-at-mongodb-testing-our-code-matches-our-tla-specs\" target=\"_blank\">This blog post discusses both</a>. While I've seen a lot of academic research into these approaches there are no industry-ready tools. So if you want trace validation you have to do a lot of manual labour tailored to your specific product. </p>\n<p>If LLMs could do some of this work for us then that'd really amplify the usefulness of TLA+ to many companies.</p>\n<h2>Thoughts</h2>\n<p><em>Right now</em>, agents seem good at the tedious and routine parts of TLA+ and worse at the strategic and abstraction parts. But, since the routine parts are often a huge barrier to beginners, this means that LLMs have the potential to make TLA+ far, far more accessible than it previously was.</p>\n<p>I have mixed thoughts on this. As an <em>advocate</em>, this is incredible. I want more people using formal specifications because I believe it leads to cheaper, safer, more reliable software. Anything that gets people comfortable with specs is great for our industry. As a <em>professional TLA+ consultant</em>, I'm worried that this obsoletes me. Most of my income comes from training and coaching, which companies will have far less demand for now. Then again, maybe this is an opportunity to pitch \"agentic TLA+ training\" to companies!</p>\n<p>Anyway, if you're interested in TLA+, there has never been a better time to try it. I mean it, these tools handle so much of the hard part now. I've got a <a href=\"https://learntla.com/\" target=\"_blank\">free book available online</a>, as does <a href=\"https://lamport.azurewebsites.net/tla/book.html\" target=\"_blank\">the inventor of TLA+</a>. I like <a href=\"https://elliotswart.github.io/pragmaticformalmodeling/\" target=\"_blank\">this guide too</a>. Happy modeling!</p>\n<div class=\"footnote\">\n<hr/>\n<ol>\n<li id=\"fn:dayenu\">\n<p>Dayenu. 
<a class=\"footnote-backref\" href=\"#fnref:dayenu\" title=\"Jump back to footnote 1 in the text\">↩</a></p>\n</li>\n</ol>\n</div>",
          "url": "https://buttondown.com/hillelwayne/archive/ai-is-a-gamechanger-for-tla-users/",
          "published": "2025-06-05T14:59:11.000Z",
          "updated": "2025-06-05T14:59:11.000Z",
          "content": null,
          "image": null,
          "media": [],
          "authors": [],
          "categories": []
        },
        {
          "id": "https://buttondown.com/hillelwayne/archive/what-does-undecidable-mean-anyway/",
          "title": "What does \"Undecidable\" mean, anyway",
          "description": "<h3>Systems Distributed</h3>\n<p>I'll be speaking at <a href=\"https://systemsdistributed.com/\" target=\"_blank\">Systems Distributed</a> next month! The talk is brand new and will aim to showcase some of the formal methods mental models that would be useful in mainstream software development. It has added some extra stress on my schedule, though, so expect the next two monthly releases of <em>Logic for Programmers</em> to be mostly minor changes.</p>\n<h2>What does \"Undecidable\" mean, anyway</h2>\n<p>Last week I read <a href=\"https://liamoc.net/forest/loc-000S/index.xml\" target=\"_blank\">Against Curry-Howard Mysticism</a>, which is a solid article I recommend reading. But this newsletter is actually about <a href=\"https://lobste.rs/s/n0whur/against_curry_howard_mysticism#c_lbts57\" target=\"_blank\">one comment</a>:</p>\n<blockquote>\n<p>I like to see posts like this because I often feel like I can’t tell the difference between BS and a point I’m missing. Can we get one for questions like “Isn’t XYZ (Undecidable|NP-Complete|PSPACE-Complete)?” </p>\n</blockquote>\n<p>I've already written one of these for <a href=\"https://www.hillelwayne.com/post/np-hard/\" target=\"_blank\">NP-complete</a>, so let's do one for \"undecidable\". Step one is to pull a technical definition from the book <a href=\"https://link.springer.com/book/10.1007/978-1-4612-1844-9\" target=\"_blank\"><em>Automata and Computability</em></a>:</p>\n<blockquote>\n<p>A property P of strings is said to be decidable if ... there is a total Turing machine that accepts input strings that have property P and rejects those that do not. (pg 220)</p>\n</blockquote>\n<p>Step two is to translate the technical computer science definition into more conventional programmer terms. 
Warning, because this is a newsletter and not a blog post, I might be a little sloppy with terms.</p>\n<h3>Machines and Decision Problems</h3>\n<p>In automata theory, all inputs to a \"program\" are strings of characters, and all outputs are \"true\" or \"false\". A program \"accepts\" a string if it outputs \"true\", and \"rejects\" if it outputs \"false\". You can think of this as automata studying all pure functions of type <code>f :: string -> boolean</code>. Problems solvable by finding such an <code>f</code> are called \"decision problems\".</p>\n<p>This covers more than you'd think, because we can bootstrap more powerful functions from these. First, as anyone who's programmed in bash knows, strings can represent any other data. Second, we can fake non-boolean outputs by instead checking if a certain computation gives a certain result. For example, I can reframe the function <code>add(x, y) = x + y</code> as a decision problem like this:</p>\n<div class=\"codehilite\"><pre><span></span><code>IS_SUM(str) {\n    x, y, z = split(str, \"#\")\n    return x + y == z\n}\n</code></pre></div>\n<p>Then because <code>IS_SUM(\"2#3#5\")</code> returns true, we know <code>2 + 3 == 5</code>, while <code>IS_SUM(\"2#3#6\")</code> is false. Since we can bootstrap parameters out of strings, I'll just say it's <code>IS_SUM(x, y, z)</code> going forward.</p>\n<p>A big part of automata theory is studying different models of computation with different strengths. One of the weakest is called <a href=\"https://en.wikipedia.org/wiki/Deterministic_finite_automaton\" target=\"_blank\">\"DFA\"</a>. I won't go into any details about what DFA actually can do, but the important thing is that it <em>can't</em> solve <code>IS_SUM</code>. 
That is, if you give me a DFA that takes inputs of form <code>x#y#z</code>, I can always find an input where the DFA returns true when <code>x + y != z</code>, <em>or</em> an input which returns false when <code>x + y == z</code>.</p>\n<p>It's really important to keep this model of \"solve\" in mind: a program solves a problem if it correctly returns true on all true inputs and correctly returns false on all false inputs.</p>\n<h3>(total) Turing Machines</h3>\n<p>A Turing Machine (TM) is a particular type of computation model. It's important for two reasons: </p>\n<ol>\n<li>\n<p>By the <a href=\"https://en.wikipedia.org/wiki/Church%E2%80%93Turing_thesis\" target=\"_blank\">Church-Turing thesis</a>, a Turing Machine is the \"upper bound\" of how powerful (physically realizable) computational models can get. This means that if an actual real-world programming language can solve a particular decision problem, so can a TM. Conversely, if the TM <em>can't</em> solve it, neither can the programming language.<sup id=\"fnref:caveat\"><a class=\"footnote-ref\" href=\"#fn:caveat\">1</a></sup></p>\n</li>\n<li>\n<p>It's possible to write a Turing machine that takes <em>a textual representation of another Turing machine</em> as input, and then simulates that Turing machine as part of its computations. </p>\n</li>\n</ol>\n<p>Property (1) means that we can move between different computational models of equal strength, proving things about one to learn things about another. That's why I'm able to write <code>IS_SUM</code> in a pseudocode instead of writing it in terms of the TM computational model (and why I was able to use <code>split</code> for convenience). </p>\n<p>Property (2) does several interesting things. First of all, it makes it possible to compose Turing machines. 
Here's how I can roughly ask if a given number is the sum of two primes, with \"just\" addition and boolean functions:</p>\n<div class=\"codehilite\"><pre><span></span><code>IS_SUM_TWO_PRIMES(z):\n    x := 1\n    y := 1\n    loop {\n        if x > z {return false}\n        if IS_PRIME(x) {\n            if IS_PRIME(y) {\n                if IS_SUM(x, y, z) {\n                    return true;\n                }\n            }\n        }\n        y := y + 1\n        if y > x {\n            x := x + 1\n            y := 0\n        }\n    }\n</code></pre></div>\n<p>Notice that without the <code>if x > z {return false}</code>, the program would loop forever on <code>z=2</code>. A TM that always halts for all inputs is called <strong>total</strong>.</p>\n<p>Property (2) also makes \"Turing machines\" a possible input to functions, meaning that we can now make decision problems about the behavior of Turing machines. For example, \"does the TM <code>M</code> either accept or reject <code>x</code> within ten steps?\"<sup id=\"fnref:backticks\"><a class=\"footnote-ref\" href=\"#fn:backticks\">2</a></sup></p>\n<div class=\"codehilite\"><pre><span></span><code>IS_DONE_IN_TEN_STEPS(M, x) {\n    for (i = 0; i < 10; i++) {\n        `simulate M(x) for one step`\n        if(`M accepted or rejected`) {\n            return true\n        }\n    }\n    return false\n}\n</code></pre></div>\n<h3>Decidability and Undecidability</h3>\n<p>Now we have all of the pieces to understand our original definition:</p>\n<blockquote>\n<p>A property P of strings is said to be decidable if ... there is a total Turing machine that accepts input strings that have property P and rejects those that do not. (220)</p>\n</blockquote>\n<p>Let <code>IS_P</code> be the decision problem \"Does the input satisfy P\"? 
Then <code>IS_P</code> is decidable if it can be solved by a total Turing machine, i.e., I can provide some <code>IS_P(x)</code> machine that <em>always</em> accepts if <code>x</code> has property P, and always rejects if <code>x</code> doesn't have property P. If I can't do that, then <code>IS_P</code> is undecidable. </p>\n<p><code>IS_SUM(x, y, z)</code> and <code>IS_DONE_IN_TEN_STEPS(M, x)</code> are decidable properties. Is <code>IS_SUM_TWO_PRIMES(z)</code> decidable? Some analysis shows that our corresponding program will either find a solution, or have <code>x>z</code> and return false. So yes, it is decidable.</p>\n<p>Notice there's an asymmetry here. To prove some property is decidable, I just need to find <em>one</em> program that correctly solves it. To prove some property is undecidable, I need to show that any possible program, no matter what it is, doesn't solve it.</p>\n<p>So with that asymmetry in mind, are there <em>any</em> undecidable problems? Yes, quite a lot. Recall that Turing machines can accept encodings of other TMs as input, meaning we can write a TM that checks <em>properties of Turing machines</em>. And, by <a href=\"https://en.wikipedia.org/wiki/Rice%27s_theorem\" target=\"_blank\">Rice's Theorem</a>, every nontrivial semantic<sup id=\"fnref:nontrivial\"><a class=\"footnote-ref\" href=\"#fn:nontrivial\">3</a></sup> property of Turing machines is undecidable. The conventional way to prove this is to first find a single undecidable property <code>H</code>, and then use that to bootstrap undecidability of other properties.</p>\n<p>The canonical and most famous example of an undecidable problem is the <a href=\"https://en.wikipedia.org/wiki/Halting_problem\" target=\"_blank\">Halting problem</a>: \"does machine M halt on input i?\" It's pretty easy to prove undecidable, and easy to use it to bootstrap other undecidability properties. But again, <em>any</em> nontrivial semantic property is undecidable. Checking a TM is total is undecidable. 
Checking a TM accepts <em>any</em> inputs is undecidable. Checking a TM solves <code>IS_SUM</code> is undecidable. Etc etc etc.</p>\n<h3>What this doesn't mean in practice</h3>\n<p>I often see the halting problem misconstrued as \"it's impossible to tell if a program will halt before running it.\" <strong>This is wrong</strong>. The halting problem says that we cannot create an algorithm that, when applied to an arbitrary program, tells us whether the program will halt or not. It is absolutely possible to tell if many programs will halt or not. It's possible to find entire subcategories of programs that are guaranteed to halt. It's possible to say \"a program constructed following constraints XYZ is guaranteed to halt.\" </p>\n<p>The actual consequence of undecidability is more subtle. If we want to know if a program has property P, undecidability tells us</p>\n<ol>\n<li>We will have to spend time and mental effort to determine if it has P</li>\n<li>We may not be successful.</li>\n</ol>\n<p>This is subtle because we're so used to living in a world where everything's undecidable that we don't really consider what the counterfactual would be like. In such a world there might be no need for Rust, because \"does this C program guarantee memory-safety\" is a decidable property. The entire field of formal verification could be unnecessary, as we could just check properties of arbitrary programs directly. We could automatically check if a change in a program preserves all existing behavior. Lots of famous math problems could be solved overnight. </p>\n<p>(This to me is a strong \"intuitive\" argument for why the halting problem is undecidable: a halt detector can be trivially repurposed as a program optimizer / theorem-prover / bcrypt cracker / chess engine. 
It's <em>too powerful</em>, so we should expect it to be impossible.)</p>\n<p>But because we don't live in that world, all of those things are hard problems that take effort and ingenuity to solve, and even then we often fail.</p>\n<h3>Update for the Internet</h3>\n<p>This was sent as a weekly newsletter, which is usually on topics like <a href=\"https://buttondown.com/hillelwayne/archive/why-do-we-call-it-boilerplate-code\" target=\"_blank\">software history</a>, <a href=\"https://buttondown.com/hillelwayne/archive/the-seven-specification-ur-languages/\" target=\"_blank\">formal methods</a>, <a href=\"https://buttondown.com/hillelwayne/archive/i-formally-modeled-dreidel-for-no-good-reason/\" target=\"_blank\">unusual technologies</a>, and the <a href=\"https://buttondown.com/hillelwayne/archive/be-suspicious-of-success/\" target=\"_blank\">theory of software engineering</a>. You <a href=\"https://buttondown.email/hillelwayne/\" target=\"_blank\">can subscribe here</a>.</p>\n<div class=\"footnote\">\n<hr/>\n<ol>\n<li id=\"fn:caveat\">\n<p>To be pedantic, a TM can't do things like \"scrape a webpage\" or \"render a bitmap\", but we're only talking about computational decision problems here. <a class=\"footnote-backref\" href=\"#fnref:caveat\" title=\"Jump back to footnote 1 in the text\">↩</a></p>\n</li>\n<li id=\"fn:backticks\">\n<p>One notation I've adopted in <em>Logic for Programmers</em> is marking abstract sections of pseudocode with backticks. It's really handy! <a class=\"footnote-backref\" href=\"#fnref:backticks\" title=\"Jump back to footnote 2 in the text\">↩</a></p>\n</li>\n<li id=\"fn:nontrivial\">\n<p>Nontrivial meaning \"at least one TM has this property and at least one TM doesn't have this property\". Semantic meaning \"related to whether the TM accepts, rejects, or runs forever on a class of inputs\". <code>IS_DONE_IN_TEN_STEPS</code> is <em>not</em> a semantic property, as it doesn't tell us anything about inputs that take longer than ten steps. 
<a class=\"footnote-backref\" href=\"#fnref:nontrivial\" title=\"Jump back to footnote 3 in the text\">↩</a></p>\n</li>\n</ol>\n</div>",
          "url": "https://buttondown.com/hillelwayne/archive/what-does-undecidable-mean-anyway/",
          "published": "2025-05-28T19:34:02.000Z",
          "updated": "2025-05-28T19:34:02.000Z",
          "content": null,
          "image": null,
          "media": [],
          "authors": [],
          "categories": []
        },
        {
          "id": "https://buttondown.com/hillelwayne/archive/finding-hard-24-puzzles-with-planner-programming/",
          "title": "Finding hard 24 puzzles with planner programming",
          "description": "<p><strong>Planner programming</strong> is a programming technique where you solve problems by providing a goal and actions, and letting the planner find actions that reach the goal. In a previous edition of <a href=\"https://leanpub.com/logic/\" target=\"_blank\">Logic for Programmers</a>, I demonstrated how this worked by solving the \n<a href=\"https://en.wikipedia.org/wiki/24_(puzzle)\" target=\"_blank\">24 puzzle</a> with planning. For <a href=\"https://buttondown.com/hillelwayne/archive/a48fce5b-8a05-4302-b620-9b26f057f145/\" target=\"_blank\">reasons discussed here</a> I replaced that example with something more practical (orchestrating deployments), but left the <a href=\"https://github.com/logicforprogrammers/book-assets/tree/master/code/chapter-misc\" target=\"_blank\">code online</a> for posterity.</p>\n<p>Recently I saw a family member try and fail to vibe code a tool that would find all valid 24 puzzles, and realized I could adapt the puzzle solver to also be a puzzle generator. First I'll explain the puzzle rules, then the original solver, then the generator.<sup id=\"fnref:complex\"><a class=\"footnote-ref\" href=\"#fn:complex\">1</a></sup> For a much longer intro to planning, see <a href=\"https://www.hillelwayne.com/post/picat/\" target=\"_blank\">here</a>.</p>\n<h3>The rules of 24</h3>\n<p>You're given four numbers and have to find some elementary equation (<code>+-*/</code>+groupings) that uses all four numbers and results in 24. Each number must be used exactly once, but do not need to be used in the starting puzzle order. Some examples:</p>\n<ul>\n<li><code>[6, 6, 6, 6]</code> -> <code>6+6+6+6=24</code></li>\n<li><code>[1, 1, 6, 6]</code> -> <code>(6+6)*(1+1)=24</code></li>\n<li><code>[4, 4, 4, 5]</code> -> <code>4*(5+4/4)=24</code></li>\n</ul>\n<p>Some setups are impossible, like <code>[1, 1, 1, 1]</code>. 
Others are possible only with non-elementary operations, like <code>[1, 5, 5, 324]</code> (which requires exponentiation).</p>\n<h2>The solver</h2>\n<p>We will use <a href=\"http://picat-lang.org/\" target=\"_blank\">Picat</a>, the only language I know of with a built-in planner module. The current state of our plan will be represented by a single list with all of the numbers.</p>\n<div class=\"codehilite\"><pre><span></span><code><span class=\"s s-Atom\">import</span> <span class=\"s s-Atom\">planner</span><span class=\"p\">,</span> <span class=\"s s-Atom\">math</span><span class=\"p\">.</span>\n<span class=\"s s-Atom\">import</span> <span class=\"s s-Atom\">cp</span><span class=\"p\">.</span>\n\n<span class=\"nf\">action</span><span class=\"p\">(</span><span class=\"nv\">S0</span><span class=\"p\">,</span> <span class=\"nv\">S1</span><span class=\"p\">,</span> <span class=\"nv\">Action</span><span class=\"p\">,</span> <span class=\"nv\">Cost</span><span class=\"p\">)</span> <span class=\"s s-Atom\">?=></span>\n  <span class=\"nf\">member</span><span class=\"p\">(</span><span class=\"nv\">X</span><span class=\"p\">,</span> <span class=\"nv\">S0</span><span class=\"p\">)</span>\n  <span class=\"p\">,</span> <span class=\"nv\">S0</span> <span class=\"s s-Atom\">:=</span> <span class=\"nf\">delete</span><span class=\"p\">(</span><span class=\"nv\">S0</span><span class=\"p\">,</span> <span class=\"nv\">X</span><span class=\"p\">)</span> <span class=\"c1\">% , is `and`</span>\n  <span class=\"p\">,</span> <span class=\"nf\">member</span><span class=\"p\">(</span><span class=\"nv\">Y</span><span class=\"p\">,</span> <span class=\"nv\">S0</span><span class=\"p\">)</span>\n  <span class=\"p\">,</span> <span class=\"nv\">S0</span> <span class=\"s s-Atom\">:=</span> <span class=\"nf\">delete</span><span class=\"p\">(</span><span class=\"nv\">S0</span><span class=\"p\">,</span> <span class=\"nv\">Y</span><span class=\"p\">)</span>\n  <span class=\"p\">,</span> <span 
class=\"p\">(</span>\n      <span class=\"nv\">A</span> <span class=\"o\">=</span> <span class=\"err\">$</span><span class=\"p\">(</span><span class=\"nv\">X</span> <span class=\"o\">+</span> <span class=\"nv\">Y</span><span class=\"p\">)</span> \n    <span class=\"p\">;</span> <span class=\"nv\">A</span> <span class=\"o\">=</span> <span class=\"err\">$</span><span class=\"p\">(</span><span class=\"nv\">X</span> <span class=\"o\">-</span> <span class=\"nv\">Y</span><span class=\"p\">)</span>\n    <span class=\"p\">;</span> <span class=\"nv\">A</span> <span class=\"o\">=</span> <span class=\"err\">$</span><span class=\"p\">(</span><span class=\"nv\">X</span> <span class=\"o\">*</span> <span class=\"nv\">Y</span><span class=\"p\">)</span>\n    <span class=\"p\">;</span> <span class=\"nv\">A</span> <span class=\"o\">=</span> <span class=\"err\">$</span><span class=\"p\">(</span><span class=\"nv\">X</span> <span class=\"o\">/</span> <span class=\"nv\">Y</span><span class=\"p\">),</span> <span class=\"nv\">Y</span> <span class=\"o\">></span> <span class=\"mi\">0</span>\n    <span class=\"p\">)</span>\n    <span class=\"p\">,</span> <span class=\"nv\">S1</span> <span class=\"o\">=</span> <span class=\"nv\">S0</span> <span class=\"s s-Atom\">++</span> <span class=\"p\">[</span><span class=\"nf\">apply</span><span class=\"p\">(</span><span class=\"nv\">A</span><span class=\"p\">)]</span>\n  <span class=\"p\">,</span> <span class=\"nv\">Action</span> <span class=\"o\">=</span> <span class=\"nv\">A</span>\n  <span class=\"p\">,</span> <span class=\"nv\">Cost</span> <span class=\"o\">=</span> <span class=\"mi\">1</span>\n  <span class=\"p\">.</span>\n</code></pre></div>\n<p>This is our \"action\", and it works in three steps:</p>\n<ol>\n<li>Nondeterministically pull two different values out of the input, deleting them</li>\n<li>Nondeterministically pick one of the basic operations</li>\n<li>The new state is the remaining elements, appended with that operation applied to our 
two picks.</li>\n</ol>\n<p>Let's walk through this with <code>[1, 6, 1, 7]</code>. There are four choices for <code>X</code> and three for <code>Y</code>. If the planner chooses <code>X=6</code> and <code>Y=7</code>, <code>A = $(6 + 7)</code>. This is an uncomputed term in the same way lisps might use quotation. We can resolve the computation with <code>apply</code>, as in the line <code>S1 = S0 ++ [apply(A)]</code>.</p>\n<div class=\"codehilite\"><pre><span></span><code><span class=\"nf\">final</span><span class=\"p\">([</span><span class=\"nv\">N</span><span class=\"p\">])</span> <span class=\"s s-Atom\">=></span>\n  <span class=\"nv\">N</span> <span class=\"o\">=:=</span> <span class=\"mf\">24.</span> <span class=\"c1\">% handle floating point</span>\n</code></pre></div>\n<p>Our final goal is just a list where the only element is 24. This has to be a little floating point-sensitive to handle floating point division, done by <code>=:=</code>.</p>\n<div class=\"codehilite\"><pre><span></span><code><span class=\"s s-Atom\">main</span> <span class=\"s s-Atom\">=></span>\n  <span class=\"nv\">Start</span> <span class=\"o\">=</span> <span class=\"p\">[</span><span class=\"mi\">1</span><span class=\"p\">,</span> <span class=\"mi\">5</span><span class=\"p\">,</span> <span class=\"mi\">5</span><span class=\"p\">,</span> <span class=\"mi\">6</span><span class=\"p\">]</span>\n  <span class=\"p\">,</span> <span class=\"nf\">best_plan</span><span class=\"p\">(</span><span class=\"nv\">Start</span><span class=\"p\">,</span> <span class=\"mi\">4</span><span class=\"p\">,</span> <span class=\"nv\">Plan</span><span class=\"p\">)</span>\n  <span class=\"p\">,</span> <span class=\"nf\">printf</span><span class=\"p\">(</span><span class=\"s2\">\"%w %w%n\"</span><span class=\"p\">,</span> <span class=\"nv\">Start</span><span class=\"p\">,</span> <span class=\"nv\">Plan</span><span class=\"p\">)</span>\n  <span class=\"p\">.</span>\n</code></pre></div>\n<p>For <code>main,</code> we 
just find the best plan with a maximum cost of <code>4</code> and print it. When run from the command line, <code>picat</code> automatically executes whatever is in <code>main</code>.</p>\n<div class=\"codehilite\"><pre><span></span><code>$ picat 24.pi\n[1,5,5,6] [1 + 5,5 * 6,30 - 6]\n</code></pre></div>\n<p>I don't want to spoil any more 24 puzzles, so let's stop showing the plan:</p>\n<div class=\"codehilite\"><pre><span></span><code>main =>\n<span class=\"gd\">- , printf(\"%w %w%n\", Start, Plan)</span>\n<span class=\"gi\">+ , printf(\"%w%n\", Start)</span>\n</code></pre></div>\n<h3>Generating puzzles</h3>\n<p>Picat provides a <code>find_all(X, p(X))</code> function, which returns all <code>X</code> for which <code>p(X)</code> is true. In theory, we could write <code>find_all(S, best_plan(S, 4, _))</code>. In practice, there are an infinite number of valid puzzles, so we need to bound <code>S</code> somewhat. We also don't want to find any redundant puzzles, such as <code>[6, 6, 6, 4]</code> and <code>[4, 6, 6, 6]</code>. 
</p>\n<p>We can solve both issues by writing a helper <code>valid24(S)</code>, which will check that <code>S</code> is a sorted list of integers within some bounds, like <code>1..8</code>, and also has a valid solution.</p>\n<div class=\"codehilite\"><pre><span></span><code><span class=\"nf\">valid24</span><span class=\"p\">(</span><span class=\"nv\">Start</span><span class=\"p\">)</span> <span class=\"s s-Atom\">=></span>\n  <span class=\"nv\">Start</span> <span class=\"o\">=</span> <span class=\"nf\">new_list</span><span class=\"p\">(</span><span class=\"mi\">4</span><span class=\"p\">)</span>\n  <span class=\"p\">,</span> <span class=\"nv\">Start</span> <span class=\"s s-Atom\">::</span> <span class=\"mf\">1..8</span> <span class=\"c1\">% every value in 1..8</span>\n  <span class=\"p\">,</span> <span class=\"nf\">increasing</span><span class=\"p\">(</span><span class=\"nv\">Start</span><span class=\"p\">)</span> <span class=\"c1\">% sorted ascending</span>\n  <span class=\"p\">,</span> <span class=\"nf\">solve</span><span class=\"p\">(</span><span class=\"nv\">Start</span><span class=\"p\">)</span> <span class=\"c1\">% turn into values</span>\n  <span class=\"p\">,</span> <span class=\"nf\">best_plan</span><span class=\"p\">(</span><span class=\"nv\">Start</span><span class=\"p\">,</span> <span class=\"mi\">4</span><span class=\"p\">,</span> <span class=\"nv\">Plan</span><span class=\"p\">)</span>\n  <span class=\"p\">.</span>\n</code></pre></div>\n<p>This leans on Picat's constraint solving features to automatically find bounded sorted lists, which is why we need the <code>solve</code> step.<sup id=\"fnref:efficiency\"><a class=\"footnote-ref\" href=\"#fn:efficiency\">2</a></sup> Now we can just loop through all of the values in <code>find_all</code> to get all solutions:</p>\n<div class=\"codehilite\"><pre><span></span><code><span class=\"s s-Atom\">main</span> <span class=\"s s-Atom\">=></span>\n  <span class=\"nf\">foreach</span><span class=\"p\">([</span><span 
class=\"nv\">S</span><span class=\"p\">]</span> <span class=\"s s-Atom\">in</span> <span class=\"nf\">find_all</span><span class=\"p\">(</span>\n    <span class=\"p\">[</span><span class=\"nv\">Start</span><span class=\"p\">],</span>\n    <span class=\"nf\">valid24</span><span class=\"p\">(</span><span class=\"nv\">Start</span><span class=\"p\">)))</span>\n    <span class=\"nf\">printf</span><span class=\"p\">(</span><span class=\"s2\">\"%w%n\"</span><span class=\"p\">,</span> <span class=\"nv\">S</span><span class=\"p\">)</span>\n  <span class=\"s s-Atom\">end</span><span class=\"p\">.</span>\n</code></pre></div>\n<div class=\"codehilite\"><pre><span></span><code>$ picat 24.pi\n\n[1,1,1,8]\n[1,1,2,6]\n[1,1,2,7]\n[1,1,2,8]\n# etc\n</code></pre></div>\n<h3>Finding hard puzzles</h3>\n<p>Last Friday I realized I could do something more interesting with this. Once I have found a plan, I can apply further constraints to the plan, for example to find problems that can be solved with division:</p>\n<div class=\"codehilite\"><pre><span></span><code>valid24(Start, Plan) =>\n<span class=\"w\"> </span> Start = new_list(4)\n<span class=\"w\"> </span> , Start :: 1..8\n<span class=\"w\"> </span> , increasing(Start)\n<span class=\"w\"> </span> , solve(Start)\n<span class=\"w\"> </span> , best_plan(Start, 4, Plan)\n<span class=\"gi\">+ , member($(_ / _), Plan)</span>\n<span class=\"w\"> </span> .\n</code></pre></div>\n<p>In playing with this, though, I noticed something weird: there are some solutions that appear if I sort <em>up</em> but not <em>down</em>. For example, <code>[3,3,4,5]</code> appears in the solution set, but <code>[5, 4, 3, 3]</code> doesn't appear if I replace <code>increasing</code> with <code>decreasing</code>.</p>\n<p>As far as I can tell, this is because Picat only finds one best plan, and <code>[5, 4, 3, 3]</code> has <em>two</em> solutions: <code>4*(5-3/3)</code> and <code>3*(5+4)-3</code>. 
<code>best_plan</code> is a <em>deterministic</em> operator, so Picat commits to the first best plan it finds. So if it finds <code>3*(5+4)-3</code> first, it sees that the solution doesn't contain a division, throws <code>[5, 4, 3, 3]</code> away as a candidate, and moves on to the next puzzle.</p>\n<p>There are a couple of ways we can fix this. We could replace <code>best_plan</code> with <code>best_plan_nondet</code>, which can backtrack to find new plans (at the cost of an enormous number of duplicates). Or we could modify our <code>final</code> to only accept plans with a division: </p>\n<div class=\"codehilite\"><pre><span></span><code>% Hypothetical change\nfinal([N]) =>\n<span class=\"gi\">+ member($(_ / _), current_plan()),</span>\n<span class=\"w\"> </span> N =:= 24.\n</code></pre></div>\n<p>My favorite \"fix\" is to ask another question entirely. While I was looking for puzzles that can be solved with division, what I actually want is puzzles that <em>must</em> be solved with division. What if I rejected any puzzle that has a solution <em>without</em> division?</p>\n<div class=\"codehilite\"><pre><span></span><code><span class=\"gi\">+ plan_with_no_div(S, P) => best_plan_nondet(S, 4, P), not member($(_ / _), P).</span>\n\nvalid24(Start, Plan) =>\n<span class=\"w\"> </span> Start = new_list(4)\n<span class=\"w\"> </span> , Start :: 1..8\n<span class=\"w\"> </span> , increasing(Start)\n<span class=\"w\"> </span> , solve(Start)\n<span class=\"w\"> </span> , best_plan(Start, 4, Plan)\n<span class=\"gd\">- , member($(_ / _), Plan)</span>\n<span class=\"gi\">+ , not plan_with_no_div(Start, _)</span>\n<span class=\"w\"> </span> .\n</code></pre></div>\n<p>The new line's a bit tricky. <code>plan_with_no_div</code> nondeterministically finds a plan, and then fails if the plan contains a division.<sup id=\"fnref:not\"><a class=\"footnote-ref\" href=\"#fn:not\">3</a></sup> Since I used <code>best_plan_nondet</code>, it can backtrack from there and find a new plan. 
This means <code>plan_with_no_div</code> only fails if no such plan exists. And in <code>valid24</code>, we only succeed if <code>plan_with_no_div</code> fails, guaranteeing that the only existing plans use division. Since this doesn't depend on the plan found via <code>best_plan</code>, it doesn't matter how the values in <code>Start</code> are arranged; this will not miss any valid puzzles.</p>\n<h4>Aside for my <a href=\"https://leanpub.com/logic/\" target=\"_blank\">logic book readers</a></h4>\n<p>The new clause is equivalent to <code>!(some p: plan(p) && !(div in p))</code>. Applying the simplifications we learned:</p>\n<ol>\n<li><code>!(some p: plan(p) && !(div in p))</code> (init)</li>\n<li><code>all p: !(plan(p) && !(div in p))</code> (all/some duality)</li>\n<li><code>all p: !plan(p) || div in p</code> (De Morgan's law)</li>\n<li><code>all p: plan(p) => div in p</code> (implication definition)</li>\n</ol>\n<p>Which more obviously means \"if P is a valid plan, then it contains a division\".</p>\n<h4>Back to finding hard puzzles</h4>\n<p><em>Anyway</em>, with <code>not plan_with_no_div</code>, we are filtering puzzles on the set of possible solutions, not just specific solutions. And this gives me an idea: what if we find puzzles that have only one solution? </p>\n<div class=\"codehilite\"><pre><span></span><code><span class=\"gh\">different_plan(S, P) => best_plan_nondet(S, 4, P2), P2 != P.</span>\n\nvalid24(Start, Plan) =>\n<span class=\"gi\">+ , not different_plan(Start, Plan)</span>\n</code></pre></div>\n<p>I tried this from <code>1..8</code> and got:</p>\n<div class=\"codehilite\"><pre><span></span><code>[1,2,7,7]\n[1,3,4,6]\n[1,6,6,8]\n[3,3,8,8]\n</code></pre></div>\n<p>These happen to be some of the <a href=\"https://www.4nums.com/game/difficulties/\" target=\"_blank\">hardest 24 puzzles known</a>, though not all of them. Note this is assuming that <code>(X + Y)</code> and <code>(Y + X)</code> are <em>different</em> solutions. 
If we say they're the same (by writing <code>A = $(X + Y), X <= Y</code> in our action) then we get a lot more puzzles, many of which are considered \"easy\". Other \"hard\" things we can look for include plans that require fractions:</p>\n<div class=\"codehilite\"><pre><span></span><code><span class=\"nf\">plan_with_no_fractions</span><span class=\"p\">(</span><span class=\"nv\">S</span><span class=\"p\">,</span> <span class=\"nv\">P</span><span class=\"p\">)</span> <span class=\"s s-Atom\">=></span> \n  <span class=\"nf\">best_plan_nondet</span><span class=\"p\">(</span><span class=\"nv\">S</span><span class=\"p\">,</span> <span class=\"mi\">4</span><span class=\"p\">,</span> <span class=\"nv\">P</span><span class=\"p\">)</span>\n  <span class=\"p\">,</span> <span class=\"o\">not</span><span class=\"p\">(</span>\n    <span class=\"nf\">member</span><span class=\"p\">(</span><span class=\"nv\">X</span><span class=\"p\">,</span> <span class=\"nv\">P</span><span class=\"p\">),</span>\n    <span class=\"nf\">round</span><span class=\"p\">(</span><span class=\"nf\">apply</span><span class=\"p\">(</span><span class=\"nv\">X</span><span class=\"p\">))</span> <span class=\"s s-Atom\">=\\=</span> <span class=\"nv\">X</span>\n  <span class=\"p\">).</span>\n\n<span class=\"c1\">% insert `not plan...` in valid24 as usual</span>\n</code></pre></div>\n<p>Finally, we could try seeing if a negative number is required:</p>\n<div class=\"codehilite\"><pre><span></span><code><span class=\"nf\">plan_with_no_negatives</span><span class=\"p\">(</span><span class=\"nv\">S</span><span class=\"p\">,</span> <span class=\"nv\">P</span><span class=\"p\">)</span> <span class=\"s s-Atom\">=></span> \n  <span class=\"nf\">best_plan_nondet</span><span class=\"p\">(</span><span class=\"nv\">S</span><span class=\"p\">,</span> <span class=\"mi\">4</span><span class=\"p\">,</span> <span class=\"nv\">P</span><span class=\"p\">)</span>\n  <span class=\"p\">,</span> <span 
class=\"o\">not</span><span class=\"p\">(</span>\n    <span class=\"nf\">member</span><span class=\"p\">(</span><span class=\"nv\">X</span><span class=\"p\">,</span> <span class=\"nv\">P</span><span class=\"p\">),</span>\n    <span class=\"nf\">apply</span><span class=\"p\">(</span><span class=\"nv\">X</span><span class=\"p\">)</span> <span class=\"o\"><</span> <span class=\"mi\">0</span>\n  <span class=\"p\">).</span>\n</code></pre></div>\n<p>Interestingly this one returns no solutions, so you are never required to construct a negative number as part of a standard 24 puzzle.</p>\n<div class=\"footnote\">\n<hr/>\n<ol>\n<li id=\"fn:complex\">\n<p>The code below is different from the old book version, as it uses fancier logic programming features that aren't good in learning material. <a class=\"footnote-backref\" href=\"#fnref:complex\" title=\"Jump back to footnote 1 in the text\">↩</a></p>\n</li>\n<li id=\"fn:efficiency\">\n<p><code>increasing</code> is a constraint predicate. We could alternatively write <code>sorted</code>, which is a Picat logical predicate and must be placed after <code>solve</code>. There don't seem to be any efficiency gains either way. <a class=\"footnote-backref\" href=\"#fnref:efficiency\" title=\"Jump back to footnote 2 in the text\">↩</a></p>\n</li>\n<li id=\"fn:not\">\n<p>I don't know what the standard is in Picat, but in Prolog, the convention is to use <code>\\+</code> instead of <code>not</code>. They mean the same thing, so I'm using <code>not</code> because it's clearer to non-LPers. <a class=\"footnote-backref\" href=\"#fnref:not\" title=\"Jump back to footnote 3 in the text\">↩</a></p>\n</li>\n</ol>\n</div>",
          "url": "https://buttondown.com/hillelwayne/archive/finding-hard-24-puzzles-with-planner-programming/",
          "published": "2025-05-20T18:21:01.000Z",
          "updated": "2025-05-20T18:21:01.000Z",
          "content": null,
          "image": null,
          "media": [],
          "authors": [],
          "categories": []
        },
        {
          "id": "https://buttondown.com/hillelwayne/archive/modeling-awkward-social-situations-with-tla/",
          "title": "Modeling Awkward Social Situations with TLA+",
          "description": "<p>You're walking down the street and need to pass someone going the opposite way. You take a step left, but they're thinking the same thing and take a step to their <em>right</em>, aka your left. You're still blocking each other. Then you take a step to the right, and they take a step to their left, and you're back to where you started. I've heard this called \"walkwarding\"</p>\n<p>Let's model this in <a href=\"https://lamport.azurewebsites.net/tla/tla.html\" target=\"_blank\">TLA+</a>. TLA+ is a <strong>formal methods</strong> tool for finding bugs in complex software designs, most often involving concurrency. Two people trying to get past each other just also happens to be a concurrent system. A gentler introduction to TLA+'s capabilities is <a href=\"https://www.hillelwayne.com/post/modeling-deployments/\" target=\"_blank\">here</a>, an in-depth guide teaching the language is <a href=\"https://learntla.com/\" target=\"_blank\">here</a>.</p>\n<h2>The spec</h2>\n<div class=\"codehilite\"><pre><span></span><code>---- MODULE walkward ----\nEXTENDS Integers\n\nVARIABLES pos\nvars == <<pos>>\n</code></pre></div>\n<p>Double equals defines a new operator, single equals is an equality check. <code><<pos>></code> is a sequence, aka array.</p>\n<div class=\"codehilite\"><pre><span></span><code>you == \"you\"\nme == \"me\"\nPeople == {you, me}\n\nMaxPlace == 4\n\nleft == 0\nright == 1\n</code></pre></div>\n<p>I've gotten into the habit of assigning string \"symbols\" to operators so that the compiler complains if I misspelled something. 
<code>left</code> and <code>right</code> are numbers so we can shift position with <code>right - pos</code>.</p>\n<div class=\"codehilite\"><pre><span></span><code>direction == [you |-> 1, me |-> -1]\ngoal == [you |-> MaxPlace, me |-> 1]\n\nInit ==\n  \\* left-right, forward-backward\n  pos = [you |-> [lr |-> left, fb |-> 1], me |-> [lr |-> left, fb |-> MaxPlace]]\n</code></pre></div>\n<p><code>direction</code>, <code>goal</code>, and <code>pos</code> are \"records\", or hash tables with string keys. I can get my left-right position with <code>pos.me.lr</code> or <code>pos[\"me\"][\"lr\"]</code> (or <code>pos[me].lr</code>, as <code>me == \"me\"</code>).</p>\n<div class=\"codehilite\"><pre><span></span><code>Juke(person) ==\n  pos' = [pos EXCEPT ![person].lr = right - @]\n</code></pre></div>\n<p>TLA+ breaks the world into a sequence of steps. In each step, <code>pos</code> is the value of <code>pos</code> in the <em>current</em> step and <code>pos'</code> is the value in the <em>next</em> step. The main outcome of this semantics is that we \"assign\" a new value to <code>pos</code> by declaring <code>pos'</code> equal to something. But the semantics also open up lots of cool tricks, like swapping two values with <code>x' = y /\\ y' = x</code>.</p>\n<p>TLA+ is a little weird about updating functions. To set <code>f[x] = 3</code>, you gotta write <code>f' = [f EXCEPT ![x] = 3]</code>. To make things a little easier, the rhs of a function update can contain <code>@</code> for the old value. 
<code>![me].lr = right - @</code> is the same as <code>right - pos[me].lr</code>, so it swaps left and right.</p>\n<p>(\"Juke\" comes from <a href=\"https://www.merriam-webster.com/dictionary/juke\" target=\"_blank\">here</a>)</p>\n<div class=\"codehilite\"><pre><span></span><code>Move(person) ==\n  LET new_pos == [pos[person] EXCEPT !.fb = @ + direction[person]]\n  IN\n    /\\ pos[person].fb # goal[person]\n    /\\ \\A p \\in People: pos[p] # new_pos\n    /\\ pos' = [pos EXCEPT ![person] = new_pos]\n</code></pre></div>\n<p>The <code>EXCEPT</code> syntax can be used in regular definitions, too. This lets someone move one step in their goal direction <em>unless</em> they are at the goal <em>or</em> someone is already in that space. <code>/\\</code> means \"and\".</p>\n<div class=\"codehilite\"><pre><span></span><code>Next ==\n  \\E p \\in People:\n    \\/ Move(p)\n    \\/ Juke(p)\n</code></pre></div>\n<p>I really like how TLA+ represents concurrency: \"In each step, there is a person who either moves or jukes.\" It can take a few uses to really wrap your head around but it can express extraordinarily complicated distributed systems.</p>\n<div class=\"codehilite\"><pre><span></span><code>Spec == Init /\\ [][Next]_vars\n\nLiveness == <>(pos[me].fb = goal[me])\n====\n</code></pre></div>\n<p><code>Spec</code> is our specification: we start at <code>Init</code> and take a <code>Next</code> step every step.</p>\n<p>Liveness is the generic term for \"something good is guaranteed to happen\", see <a href=\"https://www.hillelwayne.com/post/safety-and-liveness/\" target=\"_blank\">here</a> for more.  <code><></code> means \"eventually\", so <code>Liveness</code> means \"eventually my forward-backward position will be my goal\". 
I could extend it to \"both of us eventually reach our goal\" but I think this is good enough for a demo.</p>\n<h3>Checking the spec</h3>\n<p>Four years ago, everybody in TLA+ used the <a href=\"https://lamport.azurewebsites.net/tla/toolbox.html\" target=\"_blank\">toolbox</a>. Now the community has collectively shifted over to using the <a href=\"https://github.com/tlaplus/vscode-tlaplus/\" target=\"_blank\">VSCode extension</a>.<sup id=\"fnref:ltla\"><a class=\"footnote-ref\" href=\"#fn:ltla\">1</a></sup> VSCode requires we write a configuration file, which I will call <code>walkward.cfg</code>.</p>\n<div class=\"codehilite\"><pre><span></span><code>SPECIFICATION Spec\nPROPERTY Liveness\n</code></pre></div>\n<p>I then check the model with the VSCode command <code>TLA+: Check model with TLC</code>. Unsurprisingly, it finds an error:</p>\n<p><img alt=\"Screenshot 2025-05-12 153537.png\" class=\"newsletter-image\" src=\"https://assets.buttondown.email/images/af6f9e89-0bc6-4705-b293-4da5f5c16cfe.png?w=960&fit=max\"/></p>\n<p>The reason it fails is \"stuttering\": I can get one step away from my goal and then just stop moving forever. We say the spec is <a href=\"https://www.hillelwayne.com/post/fairness/\" target=\"_blank\">unfair</a>: it does not guarantee that if progress is always possible, progress will be made. If I want the spec to always make progress, I have to make some of the steps <strong>weakly fair</strong>.</p>\n<div class=\"codehilite\"><pre><span></span><code><span class=\"gi\">+ Fairness == WF_vars(Next)</span>\n\n<span class=\"gd\">- Spec == Init /\\ [][Next]_vars</span>\n<span class=\"gi\">+ Spec == Init /\\ [][Next]_vars /\\ Fairness</span>\n</code></pre></div>\n<p>Now the spec is weakly fair, so someone will always do <em>something</em>. 
New error:</p>\n<div class=\"codehilite\"><pre><span></span><code>\\* First six steps cut\n7: <Move(\"me\")>\npos = [you |-> [lr |-> 0, fb |-> 4], me |-> [lr |-> 1, fb |-> 2]]\n8: <Juke(\"me\")>\npos = [you |-> [lr |-> 0, fb |-> 4], me |-> [lr |-> 0, fb |-> 2]]\n9: <Juke(\"me\")> (back to state 7)\n</code></pre></div>\n<p>In this failure, I've successfully gotten past you, and then I spend the rest of my life endlessly juking back and forth. The <code>Next</code> step keeps happening, so weak fairness is satisfied. What I actually want is for my <code>Move</code> and my <code>Juke</code> to each be weakly fair, independently of each other.</p>\n<div class=\"codehilite\"><pre><span></span><code><span class=\"gd\">- Fairness == WF_vars(Next)</span>\n<span class=\"gi\">+ Fairness == WF_vars(Move(me)) /\\ WF_vars(Juke(me))</span>\n</code></pre></div>\n<p>If my liveness property also specified that <em>you</em> reached your goal, I could instead write <code>\\A p \\in People: WF_vars(Move(p)) etc</code>. I could also swap the <code>\\A</code> with a <code>\\E</code> to mean at least one of us is guaranteed to have fair actions, but not necessarily both of us. </p>\n<p>New error:</p>\n<div class=\"codehilite\"><pre><span></span><code>3: <Move(\"me\")>\npos = [you |-> [lr |-> 0, fb |-> 2], me |-> [lr |-> 0, fb |-> 3]]\n4: <Juke(\"you\")>\npos = [you |-> [lr |-> 1, fb |-> 2], me |-> [lr |-> 0, fb |-> 3]]\n5: <Juke(\"me\")>\npos = [you |-> [lr |-> 1, fb |-> 2], me |-> [lr |-> 1, fb |-> 3]]\n6: <Juke(\"me\")>\npos = [you |-> [lr |-> 1, fb |-> 2], me |-> [lr |-> 0, fb |-> 3]]\n7: <Juke(\"you\")> (back to state 3)\n</code></pre></div>\n<p>Now we're getting somewhere! This is the original walkwarding situation we wanted to capture. We're in each other's way, then you juke, but before either of us can move I juke, then we both juke back. We can repeat this forever, trapped in a social hell.</p>\n<p>Wait, but doesn't <code>WF(Move(me))</code> guarantee I will eventually move? 
Yes, but <em>only if a move is permanently available</em>. In this case, it's not permanently available, because every couple of steps it's made temporarily unavailable.</p>\n<p>How do I fix this? I can't add a rule saying that we only juke if we're blocked, because the whole point of walkwarding is that we're not coordinated. In the real world, walkwarding can go on for agonizing seconds. What I can do instead is say that Liveness holds <em>as long as <code>Move</code> is strongly fair</em>. Unlike weak fairness, <a href=\"https://www.hillelwayne.com/post/fairness/#strong-fairness\" target=\"_blank\">strong fairness</a> guarantees something happens if it keeps becoming possible, even with interruptions. </p>\n<div class=\"codehilite\"><pre><span></span><code>Liveness == \n<span class=\"gi\">+  SF_vars(Move(me)) => </span>\n<span class=\"w\"> </span>   <>(pos[me].fb = goal[me])\n</code></pre></div>\n<p>This makes the spec pass. Even if we weave back and forth for five minutes, as long as we eventually pass each other, I will reach my goal. Note we could also do this by making <code>Move</code> in <code>Fairness</code> strongly fair, which is preferable if we have a lot of different liveness properties to check.</p>\n<h3>A small exercise for the reader</h3>\n<p>There is a presumed invariant that is violated. Identify what it is, write it as a property in TLA+, and show the spec violates it. Then fix it.</p>\n<p>Answer (in <a href=\"https://rot13.com/\" target=\"_blank\">rot13</a>): Gur vainevnag vf \"ab gjb crbcyr ner va gur rknpg fnzr ybpngvba\". <code>Zbir</code> thnenagrrf guvf ohg <code>Whxr</code> <em>qbrf abg</em>.</p>\n<h3>More TLA+ Exercises</h3>\n<p>I've started work on <a href=\"https://github.com/hwayne/tlaplus-exercises/\" target=\"_blank\">an exercises repo</a>. 
There's only a handful of specific problems now but I'm planning on adding more over the summer.</p>\n<div class=\"footnote\">\n<hr/>\n<ol>\n<li id=\"fn:ltla\">\n<p><a href=\"https://learntla.com/\" target=\"_blank\">learntla</a> is still on the toolbox, but I'm hoping to get it all moved over this summer. <a class=\"footnote-backref\" href=\"#fnref:ltla\" title=\"Jump back to footnote 1 in the text\">↩</a></p>\n</li>\n</ol>\n</div>",
          "url": "https://buttondown.com/hillelwayne/archive/modeling-awkward-social-situations-with-tla/",
          "published": "2025-05-14T16:02:21.000Z",
          "updated": "2025-05-14T16:02:21.000Z",
          "content": null,
          "image": null,
          "media": [],
          "authors": [],
          "categories": []
        },
        {
          "id": "https://buttondown.com/hillelwayne/archive/write-the-most-clever-code-you-possibly-can/",
          "title": "Write the most clever code you possibly can",
          "description": "<p><em>I started writing this early last week but Real Life Stuff happened and now you're getting the first-draft late this week. Warning, unedited thoughts ahead!</em></p>\n<h2>New Logic for Programmers release!</h2>\n<p><a href=\"https://leanpub.com/logic/\" target=\"_blank\">v0.9 is out</a>! This is a big release, with a new cover design, several rewritten chapters, <a href=\"https://github.com/logicforprogrammers/book-assets/tree/master/code\" target=\"_blank\">online code samples</a> and much more. See the full release notes at the <a href=\"https://github.com/logicforprogrammers/book-assets/blob/master/CHANGELOG.md\" target=\"_blank\">changelog page</a>, and <a href=\"https://leanpub.com/logic/\" target=\"_blank\">get the book here</a>!</p>\n<p><img alt=\"The new cover! It's a lot nicer\" class=\"newsletter-image\" src=\"https://assets.buttondown.email/images/038a7092-5dc7-41a5-9a16-56bdef8b5d58.jpg?w=400&fit=max\"/></p>\n<h2>Write the cleverest code you possibly can</h2>\n<p>There are millions of articles online about how programmers should not write \"clever\" code, and instead write simple, maintainable code that everybody understands. 
Sometimes the example of \"clever\" code looks like this (<a href=\"https://codegolf.stackexchange.com/questions/57617/is-this-number-a-prime/57682#57682\" target=\"_blank\">src</a>):</p>\n<div class=\"codehilite\"><pre><span></span><code><span class=\"c1\"># Python</span>\n\n<span class=\"n\">p</span><span class=\"o\">=</span><span class=\"n\">n</span><span class=\"o\">=</span><span class=\"mi\">1</span>\n<span class=\"n\">exec</span><span class=\"p\">(</span><span class=\"s2\">\"p*=n*n;n+=1;\"</span><span class=\"o\">*~-</span><span class=\"nb\">int</span><span class=\"p\">(</span><span class=\"nb\">input</span><span class=\"p\">()))</span>\n<span class=\"nb\">print</span><span class=\"p\">(</span><span class=\"n\">p</span><span class=\"o\">%</span><span class=\"n\">n</span><span class=\"p\">)</span>\n</code></pre></div>\n<p>This is code-golfing, the sport of writing the most concise code possible. Obviously you shouldn't run this in production for the same reason you shouldn't eat dinner off a Rembrandt. 
</p>\n<p>Other times the example looks like this:</p>\n<div class=\"codehilite\"><pre><span></span><code><span class=\"k\">def</span><span class=\"w\"> </span><span class=\"nf\">is_prime</span><span class=\"p\">(</span><span class=\"n\">x</span><span class=\"p\">):</span>\n    <span class=\"k\">if</span> <span class=\"n\">x</span> <span class=\"o\">==</span> <span class=\"mi\">1</span><span class=\"p\">:</span>\n        <span class=\"k\">return</span> <span class=\"kc\">False</span>\n    <span class=\"k\">return</span> <span class=\"nb\">all</span><span class=\"p\">([</span><span class=\"n\">x</span><span class=\"o\">%</span><span class=\"n\">n</span> <span class=\"o\">!=</span> <span class=\"mi\">0</span> <span class=\"k\">for</span> <span class=\"n\">n</span> <span class=\"ow\">in</span> <span class=\"nb\">range</span><span class=\"p\">(</span><span class=\"mi\">2</span><span class=\"p\">,</span> <span class=\"n\">x</span><span class=\"p\">)])</span>\n</code></pre></div>\n<p>This is \"clever\" because it uses a single list comprehension, as opposed to a \"simple\" for loop. Yes, \"list comprehensions are too clever\" is something I've read in one of these articles. </p>\n<p>I've also talked to people who think that datatypes besides lists and hashmaps are too clever to use, that most optimizations are too clever to bother with, and even that functions and classes are too clever and code should be a linear script.<sup id=\"fnref:grad-students\"><a class=\"footnote-ref\" href=\"#fn:grad-students\">1</a></sup> Clever code is anything using features or domain concepts we don't understand. Something that seems unbearably clever to me might be utterly mundane for you, and vice versa. </p>\n<p>How do we make something utterly mundane? By using it and working at the boundaries of our skills. Almost everything I'm \"good at\" comes from banging my head against it more than is healthy. 
That suggests a really good reason to write clever code: it's an excellent form of purposeful practice. Writing clever code forces us to code outside of our comfort zone, developing our skills as software engineers. </p>\n<blockquote>\n<p>Debugging is twice as hard as writing the code in the first place. Therefore, if you write the code as cleverly as possible, you [will get excellent debugging practice at exactly the right level required to push your skills as a software engineer] — Brian Kernighan, probably</p>\n</blockquote>\n<p>There are other benefits, too, but first let's kill the elephant in the room:<sup id=\"fnref:bajillion\"><a class=\"footnote-ref\" href=\"#fn:bajillion\">2</a></sup></p>\n<h3>Don't <em>commit</em> clever code</h3>\n<p>I am proposing writing clever code as a means of practice. Being at work is a <em>job</em> with coworkers who will not appreciate it if your code is too clever. Similarly, don't use <a href=\"https://mcfunley.com/choose-boring-technology\" target=\"_blank\">too many innovative technologies</a>. Don't put anything in production you are <em>uncomfortable</em> with.</p>\n<p>We can still responsibly write clever code at work, though: </p>\n<ol>\n<li>Solve a problem in both a simple and a clever way, and then only commit the simple way. This works well for small scale problems where trying the \"clever way\" only takes a few minutes.</li>\n<li>Write our <em>personal</em> tools cleverly. I'm a big believer in the idea that most programmers would benefit from writing more scripts and support code customized to their particular work environment. This is a great place to practice new techniques, languages, etc.</li>\n<li>If clever code is absolutely the best way to solve a problem, then commit it with <strong>extensive documentation</strong> explaining how it works and why it's preferable to simpler solutions. 
Bonus: this potentially helps the whole team upskill.</li>\n</ol>\n<h2>Writing clever code...</h2>\n<div class=\"subscribe-form\"></div>\n<h3>...teaches simple solutions</h3>\n<p>Usually, code that's called too clever composes several powerful features together — the \"not a single list comprehension or function\" people are the exception. <a href=\"https://www.joshwcomeau.com/career/clever-code-considered-harmful/\" target=\"_blank\">Josh Comeau's</a> \"don't write clever code\" article gives this example of \"too clever\":</p>\n<div class=\"codehilite\"><pre><span></span><code><span class=\"kd\">const</span><span class=\"w\"> </span><span class=\"nx\">extractDataFromResponse</span><span class=\"w\"> </span><span class=\"o\">=</span><span class=\"w\"> </span><span class=\"p\">(</span><span class=\"nx\">response</span><span class=\"p\">)</span><span class=\"w\"> </span><span class=\"p\">=></span><span class=\"w\"> </span><span class=\"p\">{</span>\n<span class=\"w\">  </span><span class=\"kd\">const</span><span class=\"w\"> </span><span class=\"p\">[</span><span class=\"nx\">Component</span><span class=\"p\">,</span><span class=\"w\"> </span><span class=\"nx\">props</span><span class=\"p\">]</span><span class=\"w\"> </span><span class=\"o\">=</span><span class=\"w\"> </span><span class=\"nx\">response</span><span class=\"p\">;</span>\n\n<span class=\"w\">  </span><span class=\"kd\">const</span><span class=\"w\"> </span><span class=\"nx\">resultsEntries</span><span class=\"w\"> </span><span class=\"o\">=</span><span class=\"w\"> </span><span class=\"nb\">Object</span><span class=\"p\">.</span><span class=\"nx\">entries</span><span class=\"p\">({</span><span class=\"w\"> </span><span class=\"nx\">Component</span><span class=\"p\">,</span><span class=\"w\"> </span><span class=\"nx\">props</span><span class=\"w\"> </span><span class=\"p\">});</span>\n<span class=\"w\">  </span><span class=\"kd\">const</span><span class=\"w\"> </span><span 
class=\"nx\">assignIfValueTruthy</span><span class=\"w\"> </span><span class=\"o\">=</span><span class=\"w\"> </span><span class=\"p\">(</span><span class=\"nx\">o</span><span class=\"p\">,</span><span class=\"w\"> </span><span class=\"p\">[</span><span class=\"nx\">k</span><span class=\"p\">,</span><span class=\"w\"> </span><span class=\"nx\">v</span><span class=\"p\">])</span><span class=\"w\"> </span><span class=\"p\">=></span><span class=\"w\"> </span><span class=\"p\">(</span><span class=\"nx\">v</span>\n<span class=\"w\">    </span><span class=\"o\">?</span><span class=\"w\"> </span><span class=\"p\">{</span><span class=\"w\"> </span><span class=\"p\">...</span><span class=\"nx\">o</span><span class=\"p\">,</span><span class=\"w\"> </span><span class=\"p\">[</span><span class=\"nx\">k</span><span class=\"p\">]</span><span class=\"o\">:</span><span class=\"w\"> </span><span class=\"nx\">v</span><span class=\"w\"> </span><span class=\"p\">}</span>\n<span class=\"w\">    </span><span class=\"o\">:</span><span class=\"w\"> </span><span class=\"nx\">o</span>\n<span class=\"w\">  </span><span class=\"p\">);</span>\n\n<span class=\"w\">  </span><span class=\"k\">return</span><span class=\"w\"> </span><span class=\"nx\">resultsEntries</span><span class=\"p\">.</span><span class=\"nx\">reduce</span><span class=\"p\">(</span><span class=\"nx\">assignIfValueTruthy</span><span class=\"p\">,</span><span class=\"w\"> </span><span class=\"p\">{});</span>\n<span class=\"p\">}</span>\n</code></pre></div>\n<p>What makes this \"clever\"? I count eight language features composed together: <code>entries</code>, argument unpacking, implicit objects, splats, ternaries, higher-order functions, and reductions. Would code that used only one or two of these features still be \"clever\"? I don't think so. These features exist for a reason, and oftentimes they make code simpler than not using them.</p>\n<p>We can, of course, learn these features one at a time. 
Writing the clever version (but not <em>committing it</em>) gives us practice with all eight at once and also with how they compose together. That knowledge comes in handy when we want to apply a single one of the ideas.</p>\n<p>I've recently had to do a bit of pandas for a project. Whenever I have to do a new analysis, I try to write it as a single chain of transformations, and then as a more balanced set of updates.</p>\n<h3>...helps us master concepts</h3>\n<p>Even if the composite parts of a \"clever\" solution aren't by themselves useful, it still makes us better at the overall language, and that's inherently valuable. A few years ago I wrote <a href=\"https://www.hillelwayne.com/post/python-abc/\" target=\"_blank\">Crimes with Python's Pattern Matching</a>. It involves writing horrible code like this:</p>\n<div class=\"codehilite\"><pre><span></span><code><span class=\"kn\">from</span><span class=\"w\"> </span><span class=\"nn\">abc</span><span class=\"w\"> </span><span class=\"kn\">import</span> <span class=\"n\">ABC</span>\n\n<span class=\"k\">class</span><span class=\"w\"> </span><span class=\"nc\">NotIterable</span><span class=\"p\">(</span><span class=\"n\">ABC</span><span class=\"p\">):</span>\n\n    <span class=\"nd\">@classmethod</span>\n    <span class=\"k\">def</span><span class=\"w\"> </span><span class=\"nf\">__subclasshook__</span><span class=\"p\">(</span><span class=\"bp\">cls</span><span class=\"p\">,</span> <span class=\"n\">C</span><span class=\"p\">):</span>\n        <span class=\"k\">return</span> <span class=\"ow\">not</span> <span class=\"nb\">hasattr</span><span class=\"p\">(</span><span class=\"n\">C</span><span class=\"p\">,</span> <span class=\"s2\">\"__iter__\"</span><span class=\"p\">)</span>\n\n<span class=\"k\">def</span><span class=\"w\"> </span><span class=\"nf\">f</span><span class=\"p\">(</span><span class=\"n\">x</span><span class=\"p\">):</span>\n    <span class=\"k\">match</span> <span class=\"n\">x</span><span 
class=\"p\">:</span>\n        <span class=\"k\">case</span> <span class=\"n\">NotIterable</span><span class=\"p\">():</span>\n            <span class=\"nb\">print</span><span class=\"p\">(</span><span class=\"sa\">f</span><span class=\"s2\">\"</span><span class=\"si\">{</span><span class=\"n\">x</span><span class=\"si\">}</span><span class=\"s2\"> is not iterable\"</span><span class=\"p\">)</span>\n        <span class=\"k\">case</span><span class=\"w\"> </span><span class=\"k\">_</span><span class=\"p\">:</span>\n            <span class=\"nb\">print</span><span class=\"p\">(</span><span class=\"sa\">f</span><span class=\"s2\">\"</span><span class=\"si\">{</span><span class=\"n\">x</span><span class=\"si\">}</span><span class=\"s2\"> is iterable\"</span><span class=\"p\">)</span>\n\n<span class=\"k\">if</span> <span class=\"vm\">__name__</span> <span class=\"o\">==</span> <span class=\"s2\">\"__main__\"</span><span class=\"p\">:</span>\n    <span class=\"n\">f</span><span class=\"p\">(</span><span class=\"mi\">10</span><span class=\"p\">)</span>\n    <span class=\"n\">f</span><span class=\"p\">(</span><span class=\"s2\">\"string\"</span><span class=\"p\">)</span>\n    <span class=\"n\">f</span><span class=\"p\">([</span><span class=\"mi\">1</span><span class=\"p\">,</span> <span class=\"mi\">2</span><span class=\"p\">,</span> <span class=\"mi\">3</span><span class=\"p\">])</span>\n</code></pre></div>\n<p>This composes Python match statements, which are broadly useful, and abstract base classes, which are incredibly niche. But even if I never use ABCs in real production code, it helped me understand Python's match semantics and <a href=\"https://docs.python.org/3/howto/mro.html#python-2-3-mro\" target=\"_blank\">Method Resolution Order</a> better. </p>\n<h3>...prepares us for necessity</h3>\n<p>Sometimes the clever way is the <em>only</em> way. Maybe we need something faster than the simplest solution. 
Maybe we are working with constrained tools or frameworks that demand cleverness. Peter Norvig argued that design patterns compensate for missing language features. I'd argue that cleverness is another means of compensating: if our tools don't have an easy way to do something, we need to find a clever way.</p>\n<p>You see this a lot in formal methods like TLA+. Need to check a hyperproperty? <a href=\"https://www.hillelwayne.com/post/graphing-tla/\" target=\"_blank\">Cast your state space to a directed graph</a>. Need to compose ten specifications together? <a href=\"https://www.hillelwayne.com/post/composing-tla/\" target=\"_blank\">Combine refinements with state machines</a>. Most difficult problems have a \"clever\" solution. The real problem is that clever solutions have a skill floor. If normal use of the tool is at difficulty 3 out of 10, then basic clever solutions are at 5 out of 10, and it's hard to jump those two steps in the moment you need the cleverness.</p>\n<p>But if you've practiced with writing overly clever code, you're used to working at a 7 out of 10 level in short bursts, and then you can \"drop down\" to 5/10. I don't know if that makes too much sense, but I see it happen a lot in practice.</p>\n<h3>...builds comradery</h3>\n<p>On a few occasions, after getting a pull request merged, I pulled the reviewer over and said \"check out this horrible way of doing the same thing\". I find that as long as people know they're not going to be subjected to a clever solution in production, they enjoy seeing it!</p>\n<p><em>Next week's newsletter will probably also be late; after that we should be back to a regular schedule for the rest of the summer.</em></p>\n<div class=\"footnote\">\n<hr/>\n<ol>\n<li id=\"fn:grad-students\">\n<p>Mostly grad students outside of CS who have to write scripts to do research. And more than one data scientist. I think it's correlated with using Jupyter. 
<a class=\"footnote-backref\" href=\"#fnref:grad-students\" title=\"Jump back to footnote 1 in the text\">↩</a></p>\n</li>\n<li id=\"fn:bajillion\">\n<p>If I don't put this at the beginning, I'll get a bajillion responses like \"your team will hate you\" <a class=\"footnote-backref\" href=\"#fnref:bajillion\" title=\"Jump back to footnote 2 in the text\">↩</a></p>\n</li>\n</ol>\n</div>",
          "url": "https://buttondown.com/hillelwayne/archive/write-the-most-clever-code-you-possibly-can/",
          "published": "2025-05-08T15:04:42.000Z",
          "updated": "2025-05-08T15:04:42.000Z",
          "content": null,
          "image": null,
          "media": [],
          "authors": [],
          "categories": []
        },
        {
          "id": "https://buttondown.com/hillelwayne/archive/requirements-change-until-they-dont/",
          "title": "Requirements change until they don't",
          "description": "<p>Recently I got a question on formal methods<sup id=\"fnref:fs\"><a class=\"footnote-ref\" href=\"#fn:fs\">1</a></sup>: how does it help to mathematically model systems when the system requirements are constantly changing? It doesn't make sense to spend a lot of time proving a design works, and then deliver the product and find out it's not at all what the client needs. As the saying goes, the hard part is \"building the right thing\", not \"building the thing right\".</p>\n<p>One possible response: \"why write tests\"? You shouldn't write tests, <em>especially</em> <a href=\"https://en.wikipedia.org/wiki/Test-driven_development\" target=\"_blank\">lots of unit tests ahead of time</a>, if you might just throw them all away when the requirements change.</p>\n<p>This is a bad response because we all know the difference between writing tests and formal methods: testing is <em>easy</em> and FM is <em>hard</em>. Testing requires low cost for moderate correctness, FM requires high(ish) cost for high correctness. And when requirements are constantly changing, \"high(ish) cost\" isn't affordable and \"high correctness\" isn't worthwhile, because a kinda-okay solution that solves a customer's problem is infinitely better than a solid solution that doesn't.</p>\n<p>But eventually you get something that solves the problem, and what then?</p>\n<p>Most of us don't work for Google, we can't axe features and products <a href=\"https://killedbygoogle.com/\" target=\"_blank\">on a whim</a>. If the client is happy with your solution, you are expected to support it. It should work when your customers run into new edge cases, or migrate all their computers to the next OS version, or expand into a market with shoddy internet. It should work when 10x as many customers are using 10x as many features. It should work when <a href=\"https://www.hillelwayne.com/post/feature-interaction/\" target=\"_blank\">you add new features that come into conflict</a>. 
</p>\n<p>And just as importantly, <em>it should never stop solving their problem</em>. Canonical example: your feature involves processing requested tasks synchronously. At scale, this doesn't work, so to improve latency you make it asynchronous. Now it's eventually consistent, but your customers were depending on it being always consistent. Now it no longer does what they need, and has stopped solving their problems.</p>\n<p>Every successful requirement met spawns a new requirement: \"keep this working\". That requirement is permanent, or close enough to decide our long-term strategy. It takes active investment to keep a feature behaving the same as the world around it changes.</p>\n<p>(Is this all a pretentious way of saying \"software maintenance is hard?\" Maybe!)</p>\n<h3>Phase changes</h3>\n<div class=\"subscribe-form\"></div>\n<p>In physics there's a concept of a <a href=\"https://en.wikipedia.org/wiki/Phase_transition\" target=\"_blank\">phase transition</a>. To raise the temperature of a gram of liquid water by 1°C, you have to add 4.184 joules of energy.<sup id=\"fnref:calorie\"><a class=\"footnote-ref\" href=\"#fn:calorie\">2</a></sup> This continues until you raise it to 100°C, then it stops. After you've added two <em>thousand</em> joules to that gram, it suddenly turns into steam. The energy of the system changes continuously but the form, or phase, changes discretely.</p>\n<p><img alt=\"Phase_diagram_of_water_simplified.svg.png (from above link)\" class=\"newsletter-image\" src=\"https://assets.buttondown.email/images/31676a33-be6a-4c6d-a96f-425723dcb0d5.png?w=960&fit=max\"/></p>\n<p class=\"empty-line\" style=\"height:16px; margin:0px !important;\"></p>\n<p>Software isn't physics but the idea works as a metaphor. A certain architecture handles a certain level of load, and past that you need a new architecture. 
Or a bunch of similar features are independently hardcoded until the system becomes too messy to understand, and you remodel the internals into something unified and extendable. Etc etc etc. It doesn't have to be a totally discrete phase transition, but there's definitely a \"before\" and \"after\" in the system form. </p>\n<p>Phase changes tend to lead to more intricacy/complexity in the system, meaning it's likely that a phase change will introduce new bugs into existing behaviors. Take the synchronous vs asynchronous case. A very simple toy model of synchronous updates would be <code>Set(key, val)</code>, which updates <code>data[key]</code> to <code>val</code>.<sup id=\"fnref:tla\"><a class=\"footnote-ref\" href=\"#fn:tla\">3</a></sup> A model of asynchronous updates would be <code>AsyncSet(key, val, priority)</code>, which adds a <code>(key, val, priority, server_time())</code> tuple to a <code>tasks</code> set, and then another process asynchronously pulls a tuple (ordered by highest priority, then earliest time) and calls <code>Set(key, val)</code>. Here are some properties the client may need preserved as a requirement: </p>\n<ul>\n<li>If <code>AsyncSet(key, val, _)</code> is called, then <em>eventually</em> <code>data[key] = val</code> (possibly violated if higher-priority tasks keep coming in)</li>\n<li>If someone calls <code>AsyncSet(key1, val1, low)</code> and then <code>AsyncSet(key2, val2, low)</code>, they should see the first update and then the second (linearizability, possibly violated if the requests go to different servers with different clock times)</li>\n<li>If someone calls <code>AsyncSet(key, val, _)</code> and <em>immediately</em> reads <code>data[key]</code> they should get <code>val</code> (obviously violated, though the client may accept a <em>slightly</em> weaker property)</li>\n</ul>\n<p>If the new system doesn't satisfy an existing customer requirement, it's prudent to fix the bug <em>before</em> releasing the new system. 
The customer doesn't notice or care that your system underwent a phase change. They'll just see that one day your product solves their problems, and the next day it suddenly doesn't. </p>\n<p>This is one of the most common applications of formal methods. Both of those systems, and every one of those properties, is formally specifiable in a specification language. We can then automatically check that the new system satisfies the existing properties, and from there do things like <a href=\"https://arxiv.org/abs/2006.00915\" target=\"_blank\">automatically generate test suites</a>. This does take a lot of work, so if your requirements are constantly changing, FM may not be worth the investment. But eventually requirements <em>stop</em> changing, and then you're stuck with them forever. That's where models shine.</p>\n<div class=\"footnote\">\n<hr/>\n<ol>\n<li id=\"fn:fs\">\n<p>As always, I'm using formal methods to mean the subdiscipline of formal specification of designs, leaving out the formal verification of code. Mostly because \"formal specification\" is really awkward to say. <a class=\"footnote-backref\" href=\"#fnref:fs\" title=\"Jump back to footnote 1 in the text\">↩</a></p>\n</li>\n<li id=\"fn:calorie\">\n<p>Also called a \"calorie\". The US \"dietary Calorie\" is actually a kilocalorie. <a class=\"footnote-backref\" href=\"#fnref:calorie\" title=\"Jump back to footnote 2 in the text\">↩</a></p>\n</li>\n<li id=\"fn:tla\">\n<p>This is all directly translatable to a TLA+ specification, I'm just describing it in English to avoid paying the syntax tax <a class=\"footnote-backref\" href=\"#fnref:tla\" title=\"Jump back to footnote 3 in the text\">↩</a></p>\n</li>\n</ol>\n</div>",
          "url": "https://buttondown.com/hillelwayne/archive/requirements-change-until-they-dont/",
          "published": "2025-04-24T11:00:00.000Z",
          "updated": "2025-04-24T11:00:00.000Z",
          "content": null,
          "image": null,
          "media": [],
          "authors": [],
          "categories": []
        },
        {
          "id": "https://buttondown.com/hillelwayne/archive/the-halting-problem-is-a-terrible-example-of-np/",
          "title": "The Halting Problem is a terrible example of NP-Harder",
          "description": "<p><em>Short one this time because I have a lot going on this week.</em></p>\n<p>In computation complexity, <strong>NP</strong> is the class of all decision problems (yes/no) where a potential proof (or \"witness\") for \"yes\" can be <em>verified</em> in polynomial time. For example, \"does this set of numbers have a subset that sums to zero\" is in NP. If the answer is \"yes\", you can prove it by presenting a set of numbers. We would then verify the witness by 1) checking that all the numbers are present in the set (~linear time) and 2) adding up all the numbers (also linear).</p>\n<p><strong>NP-complete</strong> is the class of \"hardest possible\" NP problems. Subset sum is NP-complete. <strong>NP-hard</strong> is the set all problems <em>at least as hard</em> as NP-complete. Notably, NP-hard is <em>not</em> a subset of NP, as it contains problems that are <em>harder</em> than NP-complete. A natural question to ask is \"like what?\" And the canonical example of \"NP-harder\" is the halting problem (HALT): does program P halt on input C? As the argument goes, it's undecidable, so obviously not in NP.</p>\n<p>I think this is a bad example for two reasons:</p>\n<ol><li><p>All NP requires is that witnesses for \"yes\" can be verified in polynomial time. It does not require anything for the \"no\" case! And even though HP is undecidable, there <em>is</em> a decidable way to verify a \"yes\": let the witness be \"it halts in N steps\", then run the program for that many steps and see if it halted by then. To prove HALT is not in NP, you have to show that this verification process grows faster than polynomially. 
It does (as <a href=\"https://en.wikipedia.org/wiki/Busy_beaver\" rel=\"noopener noreferrer nofollow\" target=\"_blank\">busy beaver</a> is uncomputable), but this all makes the example needlessly confusing.<sup id=\"fnref:1\"><a class=\"footnote-ref\" data-id=\"37347adc-dba6-4629-9d24-c6252292ac6b\" data-reference-number=\"1\" href=\"#fn:1\">1</a></sup></p></li><li><p>\"What's bigger than a dog? THE MOON\"</p></li></ol>\n<p>Really (2) bothers me a lot more than (1) because it's just so inelegant. It suggests that NP-complete is the upper bound of \"solvable\" problems, and after that you're in full-on undecidability. I'd rather show intuitive problems that are harder than NP but not <em>that</em> much harder.</p>\n<p>But in looking for a \"slightly harder\" problem, I ran into an, ah, problem. It <em>seems</em> like the next-hardest class would be <a href=\"https://en.wikipedia.org/wiki/EXPTIME\" rel=\"noopener noreferrer nofollow\" target=\"_blank\">EXPTIME</a>, except we don't know <em>for sure</em> that NP != EXPTIME. We know <em>for sure</em> that NP != <a href=\"https://en.wikipedia.org/wiki/NEXPTIME\" rel=\"noopener noreferrer nofollow\" target=\"_blank\">NEXPTIME</a>, but NEXPTIME doesn't have any intuitive, easily explainable problems. Most \"definitely harder than NP\" problems require a nontrivial background in theoretical computer science or mathematics to understand.</p>\n<p>There is one problem, though, that I find easily explainable. Place a token at the bottom left corner of a grid that extends infinitely up and right, call that point (0, 0). You're given a list of valid displacement moves for the token, like <code>(+1, +0)</code>, <code>(-20, +13)</code>, <code>(-5, -6)</code>, etc, and a target point like <code>(700, 1)</code>. You may make any sequence of moves in any order, as long as no move ever puts the token off the grid. 
Does any sequence of moves bring you to the target?</p>\n<div class=\"subscribe-form\"></div>\n<p>This is PSPACE-complete, I think, which still isn't proven to be harder than NP-complete (though it's widely believed). But what if you increase the number of dimensions of the grid? Past a certain number of dimensions the problem jumps to being EXPSPACE-complete, and then TOWER-complete (grows <a href=\"https://en.wikipedia.org/wiki/Tetration\" rel=\"noopener noreferrer nofollow\" target=\"_blank\">tetrationally</a>), and then it keeps going. Some people might recognize this as looking a lot like the <a href=\"https://en.wikipedia.org/wiki/Ackermann_function\" rel=\"noopener noreferrer nofollow\" target=\"_blank\">Ackermann function</a>, and in fact this problem is <a href=\"https://arxiv.org/abs/2104.13866\" rel=\"noopener noreferrer nofollow\" target=\"_blank\">ACKERMANN-complete on the number of available dimensions</a>.</p>\n<p><a href=\"https://www.quantamagazine.org/an-easy-sounding-problem-yields-numbers-too-big-for-our-universe-20231204/\" rel=\"noopener noreferrer nofollow\" target=\"_blank\">A friend wrote a Quanta article about the whole mess</a>, you should read it.</p>\n<p>This problem is ludicrously bigger than NP (\"Chicago\" instead of \"The Moon\"), but at least it's clearly decidable, easily explainable, and definitely <em>not</em> in NP.</p>\n<div class=\"footnote\"><hr/><ol class=\"footnotes\"><li data-id=\"37347adc-dba6-4629-9d24-c6252292ac6b\" id=\"fn:1\"><p>It's less confusing if you're taught the alternate (and original!) definition of NP, \"the class of problems solvable in polynomial time by a nondeterministic Turing machine\". Then HALT can't be in NP because otherwise runtime would be bounded by an exponential function. <a class=\"footnote-backref\" href=\"#fnref:1\">↩</a></p></li></ol></div>",
          "url": "https://buttondown.com/hillelwayne/archive/the-halting-problem-is-a-terrible-example-of-np/",
          "published": "2025-04-16T17:39:23.000Z",
          "updated": "2025-04-16T17:39:23.000Z",
          "content": null,
          "image": null,
          "media": [],
          "authors": [],
          "categories": []
        },
        {
          "id": "https://buttondown.com/hillelwayne/archive/a48fce5b-8a05-4302-b620-9b26f057f145/",
          "title": "Solving a \"Layton Puzzle\" with Prolog",
          "description": "<p>I have a lot in the works for the this month's <a href=\"https://leanpub.com/logic/\" target=\"_blank\">Logic for Programmers</a> release. Among other things, I'm completely rewriting the chapter on Logic Programming Languages. </p>\n<p>I originally showcased the paradigm with puzzle solvers, like <a href=\"https://swish.swi-prolog.org/example/queens.pl\" target=\"_blank\">eight queens</a> or <a href=\"https://saksagan.ceng.metu.edu.tr/courses/ceng242/documents/prolog/jrfisher/2_1.html\" target=\"_blank\">four-coloring</a>. Lots of other demos do this too! It takes creativity and insight for humans to solve them, so a program doing it feels magical. But I'm trying to write a book about practical techniques and I want everything I talk about to be <em>useful</em>. So in v0.9 I'll be replacing these examples with a couple of new programs that might get people thinking that Prolog could help them in their day-to-day work.</p>\n<p>On the other hand, for a newsletter, showcasing a puzzle solver is pretty cool. And recently I stumbled into <a href=\"https://morepablo.com/2010/09/some-professor-layton-prolog.html\" target=\"_blank\">this post</a> by my friend <a href=\"https://morepablo.com/\" target=\"_blank\">Pablo Meier</a>, where he solves a videogame puzzle with Prolog:<sup id=\"fnref:path\"><a class=\"footnote-ref\" href=\"#fn:path\">1</a></sup></p>\n<p><img alt=\"See description below\" class=\"newsletter-image\" src=\"https://assets.buttondown.email/images/a4ee8689-bbce-4dc9-8175-a1de3bd8f2db.png?w=960&fit=max\"/></p>\n<p>Summary for the text-only readers: We have a test with 10 true/false questions (denoted <code>a/b</code>) and four student attempts. 
Given the scores of the first three students, we have to figure out the fourth student's score.</p>\n<div class=\"codehilite\"><pre><span></span><code>bbababbabb = 7\nbaaababaaa = 5\nbaaabbbaba = 3\nbbaaabbaaa = ???\n</code></pre></div>\n<p>You can see Pablo's solution <a href=\"https://morepablo.com/2010/09/some-professor-layton-prolog.html\" target=\"_blank\">here</a>, and try it in SWI-prolog <a href=\"https://swish.swi-prolog.org/p/Some%20Professor%20Layton%20Prolog.pl\" target=\"_blank\">here</a>. Pretty cool! But after way too long studying Prolog just to write this dang book chapter, I wanted to see if I could do it more elegantly than him. Code and puzzle spoilers to follow.</p>\n<p>(Normally here's where I'd link to a gentler introduction I wrote but I think this is my first time writing about Prolog online? Uh here's a <a href=\"https://www.hillelwayne.com/post/picat/\" target=\"_blank\">Picat intro</a> instead)</p>\n<h3>The Program</h3>\n<p>You can try this all online at <a href=\"https://swish.swi-prolog.org/p/\" target=\"_blank\">SWISH</a> or just jump to my final version <a href=\"https://swish.swi-prolog.org/p/layton_prolog_puzzle.pl\" target=\"_blank\">here</a>.</p>\n<div class=\"codehilite\"><pre><span></span><code><span class=\"p\">:-</span> <span class=\"nf\">use_module</span><span class=\"p\">(</span><span class=\"nf\">library</span><span class=\"p\">(</span><span class=\"s s-Atom\">dif</span><span class=\"p\">)).</span>    <span class=\"c1\">% Sound inequality</span>\n<span class=\"p\">:-</span> <span class=\"nf\">use_module</span><span class=\"p\">(</span><span class=\"nf\">library</span><span class=\"p\">(</span><span class=\"s s-Atom\">clpfd</span><span class=\"p\">)).</span>  <span class=\"c1\">% Finite domain constraints</span>\n</code></pre></div>\n<p>First some imports. <code>dif</code> lets us write <code>dif(A, B)</code>, which is true if <code>A</code> and <code>B</code> are <em>not</em> equal. 
<code>clpfd</code> lets us write <code>A #= B + 1</code> to say \"A is 1 more than B\".<sup id=\"fnref:superior\"><a class=\"footnote-ref\" href=\"#fn:superior\">2</a></sup></p>\n<p>We'll say both the student submission and the key will be lists, where each value is <code>a</code> or <code>b</code>. In Prolog, lowercase identifiers are <strong>atoms</strong> (like symbols in other languages) and identifiers that start with a capital are <strong>variables</strong>. Prolog finds values for variables that match equations (<strong>unification</strong>). The pattern matching is real real good.</p>\n<div class=\"codehilite\"><pre><span></span><code><span class=\"c1\">% ?- means query</span>\n<span class=\"s s-Atom\">?-</span> <span class=\"nv\">L</span> <span class=\"o\">=</span> <span class=\"p\">[</span><span class=\"s s-Atom\">a</span><span class=\"p\">,</span><span class=\"nv\">B</span><span class=\"p\">,</span><span class=\"s s-Atom\">c</span><span class=\"p\">],</span> <span class=\"p\">[</span><span class=\"nv\">Y</span><span class=\"p\">|</span><span class=\"nv\">X</span><span class=\"p\">]</span> <span class=\"o\">=</span> <span class=\"p\">[</span><span class=\"mi\">1</span><span class=\"p\">,</span><span class=\"mi\">2</span><span class=\"p\">|</span><span class=\"nv\">L</span><span class=\"p\">],</span> <span class=\"nv\">B</span> <span class=\"o\">+</span> <span class=\"mi\">1</span> <span class=\"s s-Atom\">#=</span> <span class=\"mf\">7.</span>\n\n<span class=\"nv\">B</span> <span class=\"o\">=</span> <span class=\"mi\">6</span><span class=\"p\">,</span>\n<span class=\"nv\">L</span> <span class=\"o\">=</span> <span class=\"p\">[</span><span class=\"s s-Atom\">a</span><span class=\"p\">,</span> <span class=\"mi\">6</span><span class=\"p\">,</span> <span class=\"s s-Atom\">c</span><span class=\"p\">],</span>\n<span class=\"nv\">X</span> <span class=\"o\">=</span> <span class=\"p\">[</span><span class=\"mi\">2</span><span class=\"p\">,</span> <span class=\"s 
s-Atom\">a</span><span class=\"p\">,</span> <span class=\"mi\">6</span><span class=\"p\">,</span> <span class=\"s s-Atom\">c</span><span class=\"p\">],</span>\n<span class=\"nv\">Y</span> <span class=\"o\">=</span> <span class=\"mi\">1</span>\n</code></pre></div>\n<p>Next, we define <code>score/3</code><sup id=\"fnref:arity\"><a class=\"footnote-ref\" href=\"#fn:arity\">3</a></sup> recursively. </p>\n<div class=\"codehilite\"><pre><span></span><code><span class=\"c1\">% The student's test score</span>\n<span class=\"c1\">% score(student answers, answer key, score)</span>\n<span class=\"nf\">score</span><span class=\"p\">([],</span> <span class=\"p\">[],</span> <span class=\"mi\">0</span><span class=\"p\">).</span>\n<span class=\"nf\">score</span><span class=\"p\">([</span><span class=\"nv\">A</span><span class=\"p\">|</span><span class=\"nv\">As</span><span class=\"p\">],</span> <span class=\"p\">[</span><span class=\"nv\">A</span><span class=\"p\">|</span><span class=\"nv\">Ks</span><span class=\"p\">],</span> <span class=\"nv\">N</span><span class=\"p\">)</span> <span class=\"p\">:-</span>\n   <span class=\"nv\">N</span> <span class=\"s s-Atom\">#=</span> <span class=\"nv\">M</span> <span class=\"o\">+</span> <span class=\"mi\">1</span><span class=\"p\">,</span> <span class=\"nf\">score</span><span class=\"p\">(</span><span class=\"nv\">As</span><span class=\"p\">,</span> <span class=\"nv\">Ks</span><span class=\"p\">,</span> <span class=\"nv\">M</span><span class=\"p\">).</span>\n<span class=\"nf\">score</span><span class=\"p\">([</span><span class=\"nv\">A</span><span class=\"p\">|</span><span class=\"nv\">As</span><span class=\"p\">],</span> <span class=\"p\">[</span><span class=\"nv\">K</span><span class=\"p\">|</span><span class=\"nv\">Ks</span><span class=\"p\">],</span> <span class=\"nv\">N</span><span class=\"p\">)</span> <span class=\"p\">:-</span> \n    <span class=\"nf\">dif</span><span class=\"p\">(</span><span class=\"nv\">A</span><span 
class=\"p\">,</span> <span class=\"nv\">K</span><span class=\"p\">),</span> <span class=\"nf\">score</span><span class=\"p\">(</span><span class=\"nv\">As</span><span class=\"p\">,</span> <span class=\"nv\">Ks</span><span class=\"p\">,</span> <span class=\"nv\">N</span><span class=\"p\">).</span>\n</code></pre></div>\n<p>The first argument is the student's answers, the second is the answer key, and the third is the final score. The base case is the empty test, which has score 0. Otherwise, we take the head values of each list and compare them. If they're the same, we add one to the score; otherwise we keep the same score. </p>\n<p>Notice we couldn't write <code>if x then y else z</code>; we instead used pattern matching to effectively express <code>(x && y) || (!x && z)</code>. Prolog does have a conditional operator, but it prevents backtracking so what's the point???</p>\n<p class=\"empty-line\" style=\"height:16px; margin:0px !important;\"></p>\n<h3>A quick break about bidirectionality</h3>\n<p>One of the coolest things about Prolog: all purely logical predicates are bidirectional. 
We can use <code>score</code> to check if our expected score is correct:</p>\n<div class=\"codehilite\"><pre><span></span><code><span class=\"s s-Atom\">?-</span> <span class=\"nf\">score</span><span class=\"p\">([</span><span class=\"s s-Atom\">a</span><span class=\"p\">,</span> <span class=\"s s-Atom\">b</span><span class=\"p\">,</span> <span class=\"s s-Atom\">b</span><span class=\"p\">],</span> <span class=\"p\">[</span><span class=\"s s-Atom\">b</span><span class=\"p\">,</span> <span class=\"s s-Atom\">b</span><span class=\"p\">,</span> <span class=\"s s-Atom\">b</span><span class=\"p\">],</span> <span class=\"mi\">2</span><span class=\"p\">).</span>\n<span class=\"s s-Atom\">true</span>\n</code></pre></div>\n<p>But we can also give it answers and a key and ask it for the score:</p>\n<div class=\"codehilite\"><pre><span></span><code><span class=\"s s-Atom\">?-</span> <span class=\"nf\">score</span><span class=\"p\">([</span><span class=\"s s-Atom\">a</span><span class=\"p\">,</span> <span class=\"s s-Atom\">b</span><span class=\"p\">,</span> <span class=\"s s-Atom\">b</span><span class=\"p\">],</span> <span class=\"p\">[</span><span class=\"s s-Atom\">b</span><span class=\"p\">,</span> <span class=\"s s-Atom\">b</span><span class=\"p\">,</span> <span class=\"s s-Atom\">b</span><span class=\"p\">],</span> <span class=\"nv\">X</span><span class=\"p\">).</span>\n<span class=\"nv\">X</span> <span class=\"o\">=</span> <span class=\"mi\">2</span>\n</code></pre></div>\n<p><em>Or</em> we could give it a key and a score and ask \"what test answers would have this score?\"</p>\n<div class=\"codehilite\"><pre><span></span><code><span class=\"s s-Atom\">?-</span> <span class=\"nf\">score</span><span class=\"p\">(</span><span class=\"nv\">X</span><span class=\"p\">,</span> <span class=\"p\">[</span><span class=\"s s-Atom\">b</span><span class=\"p\">,</span> <span class=\"s s-Atom\">b</span><span class=\"p\">,</span> <span class=\"s s-Atom\">b</span><span 
class=\"p\">],</span> <span class=\"mi\">2</span><span class=\"p\">).</span>\n<span class=\"nv\">X</span> <span class=\"o\">=</span> <span class=\"p\">[</span><span class=\"s s-Atom\">b</span><span class=\"p\">,</span> <span class=\"s s-Atom\">b</span><span class=\"p\">,</span> <span class=\"k\">_</span><span class=\"nv\">A</span><span class=\"p\">],</span>\n<span class=\"nf\">dif</span><span class=\"p\">(</span><span class=\"k\">_</span><span class=\"nv\">A</span><span class=\"p\">,</span><span class=\"s s-Atom\">b</span><span class=\"p\">)</span>\n<span class=\"nv\">X</span> <span class=\"o\">=</span> <span class=\"p\">[</span><span class=\"s s-Atom\">b</span><span class=\"p\">,</span> <span class=\"k\">_</span><span class=\"nv\">A</span><span class=\"p\">,</span> <span class=\"s s-Atom\">b</span><span class=\"p\">],</span>\n<span class=\"nf\">dif</span><span class=\"p\">(</span><span class=\"k\">_</span><span class=\"nv\">A</span><span class=\"p\">,</span><span class=\"s s-Atom\">b</span><span class=\"p\">)</span>\n<span class=\"nv\">X</span> <span class=\"o\">=</span> <span class=\"p\">[</span><span class=\"k\">_</span><span class=\"nv\">A</span><span class=\"p\">,</span> <span class=\"s s-Atom\">b</span><span class=\"p\">,</span> <span class=\"s s-Atom\">b</span><span class=\"p\">],</span>\n<span class=\"nf\">dif</span><span class=\"p\">(</span><span class=\"k\">_</span><span class=\"nv\">A</span><span class=\"p\">,</span><span class=\"s s-Atom\">b</span><span class=\"p\">)</span>\n</code></pre></div>\n<p>The different value is written <code>_A</code> because we never told Prolog that the array can <em>only</em> contain <code>a</code> and <code>b</code>. 
We'll fix this later.</p>\n<h3>Okay back to the program</h3>\n<p>Now that we have a way of computing scores, we want to find a possible answer key that matches all of our observations, ie gives everybody the correct scores.</p>\n<div class=\"codehilite\"><pre><span></span><code><span class=\"nf\">key</span><span class=\"p\">(</span><span class=\"nv\">Key</span><span class=\"p\">)</span> <span class=\"p\">:-</span>\n    <span class=\"c1\">% Figure it out</span>\n    <span class=\"nf\">score</span><span class=\"p\">([</span><span class=\"s s-Atom\">b</span><span class=\"p\">,</span> <span class=\"s s-Atom\">b</span><span class=\"p\">,</span> <span class=\"s s-Atom\">a</span><span class=\"p\">,</span> <span class=\"s s-Atom\">b</span><span class=\"p\">,</span> <span class=\"s s-Atom\">a</span><span class=\"p\">,</span> <span class=\"s s-Atom\">b</span><span class=\"p\">,</span> <span class=\"s s-Atom\">b</span><span class=\"p\">,</span> <span class=\"s s-Atom\">a</span><span class=\"p\">,</span> <span class=\"s s-Atom\">b</span><span class=\"p\">,</span> <span class=\"s s-Atom\">b</span><span class=\"p\">],</span> <span class=\"nv\">Key</span><span class=\"p\">,</span> <span class=\"mi\">7</span><span class=\"p\">),</span>\n    <span class=\"nf\">score</span><span class=\"p\">([</span><span class=\"s s-Atom\">b</span><span class=\"p\">,</span> <span class=\"s s-Atom\">a</span><span class=\"p\">,</span> <span class=\"s s-Atom\">a</span><span class=\"p\">,</span> <span class=\"s s-Atom\">a</span><span class=\"p\">,</span> <span class=\"s s-Atom\">b</span><span class=\"p\">,</span> <span class=\"s s-Atom\">a</span><span class=\"p\">,</span> <span class=\"s s-Atom\">b</span><span class=\"p\">,</span> <span class=\"s s-Atom\">a</span><span class=\"p\">,</span> <span class=\"s s-Atom\">a</span><span class=\"p\">,</span> <span class=\"s s-Atom\">a</span><span class=\"p\">],</span> <span class=\"nv\">Key</span><span class=\"p\">,</span> <span class=\"mi\">5</span><span 
class=\"p\">),</span>\n    <span class=\"nf\">score</span><span class=\"p\">([</span><span class=\"s s-Atom\">b</span><span class=\"p\">,</span> <span class=\"s s-Atom\">a</span><span class=\"p\">,</span> <span class=\"s s-Atom\">a</span><span class=\"p\">,</span> <span class=\"s s-Atom\">a</span><span class=\"p\">,</span> <span class=\"s s-Atom\">b</span><span class=\"p\">,</span> <span class=\"s s-Atom\">b</span><span class=\"p\">,</span> <span class=\"s s-Atom\">b</span><span class=\"p\">,</span> <span class=\"s s-Atom\">a</span><span class=\"p\">,</span> <span class=\"s s-Atom\">b</span><span class=\"p\">,</span> <span class=\"s s-Atom\">a</span><span class=\"p\">],</span> <span class=\"nv\">Key</span><span class=\"p\">,</span> <span class=\"mi\">3</span><span class=\"p\">).</span>\n</code></pre></div>\n<p>So far we haven't explicitly said that the <code>Key</code> length matches the student answer lengths. This is implicitly verified by <code>score</code> (both lists need to be empty at the same time) but it's a good idea to explicitly add <code>length(Key, 10)</code> as a clause of <code>key/1</code>. We should also explicitly say that every element of <code>Key</code> is either <code>a</code> or <code>b</code>.<sup id=\"fnref:explicit\"><a class=\"footnote-ref\" href=\"#fn:explicit\">4</a></sup> Now we <em>could</em> write a second predicate saying <code>Key</code> had the right 'type': </p>\n<div class=\"codehilite\"><pre><span></span><code>keytype([]).\nkeytype([K|Ks]) :- member(K, [a, b]), keytype(Ks).\n</code></pre></div>\n<p>But \"generating lists that match a constraint\" is a thing that comes up often enough that we don't want to write a separate predicate for each constraint! So after some digging, I found a more elegant solution: <code>maplist</code>. Let <code>L=[l1, l2]</code>. Then <code>maplist(p, L)</code> is equivalent to the clause <code>p(l1), p(l2)</code>. 
It also accepts partial predicates: <code>maplist(p(x), L)</code> is equivalent to <code>p(x, l1), p(x, l2)</code>. So we could write<sup id=\"fnref:yall\"><a class=\"footnote-ref\" href=\"#fn:yall\">5</a></sup></p>\n<div class=\"codehilite\"><pre><span></span><code><span class=\"nf\">contains</span><span class=\"p\">(</span><span class=\"nv\">L</span><span class=\"p\">,</span> <span class=\"nv\">X</span><span class=\"p\">)</span> <span class=\"p\">:-</span> <span class=\"nf\">member</span><span class=\"p\">(</span><span class=\"nv\">X</span><span class=\"p\">,</span> <span class=\"nv\">L</span><span class=\"p\">).</span>\n\n<span class=\"nf\">key</span><span class=\"p\">(</span><span class=\"nv\">Key</span><span class=\"p\">)</span> <span class=\"p\">:-</span>\n    <span class=\"nf\">length</span><span class=\"p\">(</span><span class=\"nv\">Key</span><span class=\"p\">,</span> <span class=\"mi\">10</span><span class=\"p\">),</span>\n    <span class=\"nf\">maplist</span><span class=\"p\">(</span><span class=\"nf\">contains</span><span class=\"p\">([</span><span class=\"s s-Atom\">a</span><span class=\"p\">,</span><span class=\"s s-Atom\">b</span><span class=\"p\">]),</span> <span class=\"nv\">Key</span><span class=\"p\">),</span>\n    <span class=\"c1\">% the score stuff</span>\n</code></pre></div>\n<p>Now, let's query for the Key:</p>\n<div class=\"codehilite\"><pre><span></span><code><span class=\"s s-Atom\">?-</span> <span class=\"nf\">key</span><span class=\"p\">(</span><span class=\"nv\">Key</span><span class=\"p\">).</span>\n<span class=\"nv\">Key</span> <span class=\"o\">=</span> <span class=\"p\">[</span><span class=\"s s-Atom\">a</span><span class=\"p\">,</span> <span class=\"s s-Atom\">b</span><span class=\"p\">,</span> <span class=\"s s-Atom\">a</span><span class=\"p\">,</span> <span class=\"s s-Atom\">b</span><span class=\"p\">,</span> <span class=\"s s-Atom\">a</span><span class=\"p\">,</span> <span class=\"s s-Atom\">a</span><span class=\"p\">,</span> 
<span class=\"s s-Atom\">b</span><span class=\"p\">,</span> <span class=\"s s-Atom\">a</span><span class=\"p\">,</span> <span class=\"s s-Atom\">a</span><span class=\"p\">,</span> <span class=\"s s-Atom\">b</span><span class=\"p\">]</span>\n<span class=\"nv\">Key</span> <span class=\"o\">=</span> <span class=\"p\">[</span><span class=\"s s-Atom\">b</span><span class=\"p\">,</span> <span class=\"s s-Atom\">b</span><span class=\"p\">,</span> <span class=\"s s-Atom\">a</span><span class=\"p\">,</span> <span class=\"s s-Atom\">b</span><span class=\"p\">,</span> <span class=\"s s-Atom\">a</span><span class=\"p\">,</span> <span class=\"s s-Atom\">a</span><span class=\"p\">,</span> <span class=\"s s-Atom\">a</span><span class=\"p\">,</span> <span class=\"s s-Atom\">a</span><span class=\"p\">,</span> <span class=\"s s-Atom\">a</span><span class=\"p\">,</span> <span class=\"s s-Atom\">b</span><span class=\"p\">]</span>\n<span class=\"nv\">Key</span> <span class=\"o\">=</span> <span class=\"p\">[</span><span class=\"s s-Atom\">b</span><span class=\"p\">,</span> <span class=\"s s-Atom\">b</span><span class=\"p\">,</span> <span class=\"s s-Atom\">a</span><span class=\"p\">,</span> <span class=\"s s-Atom\">b</span><span class=\"p\">,</span> <span class=\"s s-Atom\">a</span><span class=\"p\">,</span> <span class=\"s s-Atom\">a</span><span class=\"p\">,</span> <span class=\"s s-Atom\">b</span><span class=\"p\">,</span> <span class=\"s s-Atom\">b</span><span class=\"p\">,</span> <span class=\"s s-Atom\">a</span><span class=\"p\">,</span> <span class=\"s s-Atom\">b</span><span class=\"p\">]</span>\n<span class=\"nv\">Key</span> <span class=\"o\">=</span> <span class=\"p\">[</span><span class=\"s s-Atom\">b</span><span class=\"p\">,</span> <span class=\"s s-Atom\">b</span><span class=\"p\">,</span> <span class=\"s s-Atom\">b</span><span class=\"p\">,</span> <span class=\"s s-Atom\">b</span><span class=\"p\">,</span> <span class=\"s s-Atom\">a</span><span class=\"p\">,</span> <span 
class=\"s s-Atom\">a</span><span class=\"p\">,</span> <span class=\"s s-Atom\">b</span><span class=\"p\">,</span> <span class=\"s s-Atom\">a</span><span class=\"p\">,</span> <span class=\"s s-Atom\">a</span><span class=\"p\">,</span> <span class=\"s s-Atom\">b</span><span class=\"p\">]</span>\n</code></pre></div>\n<p>So there are actually four <em>different</em> keys that all explain our data. Does this mean the puzzle is broken and has multiple different answers?</p>\n<h3>Nope</h3>\n<p>The puzzle wasn't to find out what the answer key was, the point was to find the fourth student's score. And if we query for it, we see all four solutions give him the same score:</p>\n<div class=\"codehilite\"><pre><span></span><code><span class=\"s s-Atom\">?-</span> <span class=\"nf\">key</span><span class=\"p\">(</span><span class=\"nv\">Key</span><span class=\"p\">),</span> <span class=\"nf\">score</span><span class=\"p\">([</span><span class=\"s s-Atom\">b</span><span class=\"p\">,</span> <span class=\"s s-Atom\">b</span><span class=\"p\">,</span> <span class=\"s s-Atom\">a</span><span class=\"p\">,</span> <span class=\"s s-Atom\">a</span><span class=\"p\">,</span> <span class=\"s s-Atom\">a</span><span class=\"p\">,</span> <span class=\"s s-Atom\">b</span><span class=\"p\">,</span> <span class=\"s s-Atom\">b</span><span class=\"p\">,</span> <span class=\"s s-Atom\">a</span><span class=\"p\">,</span> <span class=\"s s-Atom\">a</span><span class=\"p\">,</span> <span class=\"s s-Atom\">a</span><span class=\"p\">],</span> <span class=\"nv\">Key</span><span class=\"p\">,</span> <span class=\"nv\">X</span><span class=\"p\">).</span>\n<span class=\"nv\">X</span> <span class=\"o\">=</span> <span class=\"mi\">6</span>\n<span class=\"nv\">X</span> <span class=\"o\">=</span> <span class=\"mi\">6</span>\n<span class=\"nv\">X</span> <span class=\"o\">=</span> <span class=\"mi\">6</span>\n<span class=\"nv\">X</span> <span class=\"o\">=</span> <span 
class=\"mi\">6</span>\n</code></pre></div>\n<p>Huh! I really like it when puzzles look like they're broken, but every \"alternate\" solution still gives the same puzzle answer.</p>\n<p>Total program length: 15 lines of code, compared to the original's 80 lines. <em>Suck it, Pablo.</em></p>\n<p>(Incidentally, you can get all of the answers at once by writing <code>findall(X, (key(Key), score($answer-array, Key, X)), L).</code>) </p>\n<p class=\"empty-line\" style=\"height:16px; margin:0px !important;\"></p>\n<h3>I still don't like puzzles for teaching</h3>\n<p>The actual examples I'm using in <a href=\"https://leanpub.com/logic/\" target=\"_blank\">the book</a> are \"analyzing a version control commit graph\" and \"planning a sequence of infrastructure changes\", which are somewhat more likely to occur at work than needing to solve a puzzle. You'll see them in the next release!</p>\n<div class=\"footnote\">\n<hr/>\n<ol>\n<li id=\"fn:path\">\n<p>I found it because he wrote <a href=\"https://morepablo.com/2025/04/gamer-games-for-lite-gamers.html\" target=\"_blank\">Gamer Games for Lite Gamers</a> as a response to my <a href=\"https://www.hillelwayne.com/post/vidja-games/\" target=\"_blank\">Gamer Games for Non-Gamers</a>. <a class=\"footnote-backref\" href=\"#fnref:path\" title=\"Jump back to footnote 1 in the text\">↩</a></p>\n</li>\n<li id=\"fn:superior\">\n<p>These are better versions of the core Prolog expressions <code>\\+ (A = B)</code> and <code>A is B + 1</code>, because they can <a href=\"https://eu.swi-prolog.org/pldoc/man?predicate=dif/2\" target=\"_blank\">defer unification</a>. <a class=\"footnote-backref\" href=\"#fnref:superior\" title=\"Jump back to footnote 2 in the text\">↩</a></p>\n</li>\n<li id=\"fn:arity\">\n<p>Prolog-descendants have a convention of writing the arity of the function after its name, so <code>score/3</code> means \"score has three parameters\". I think they do this because you can overload predicates with multiple different arities. 
Also Joe Armstrong used Prolog for prototyping, so Erlang and Elixir follow the same convention. <a class=\"footnote-backref\" href=\"#fnref:arity\" title=\"Jump back to footnote 3 in the text\">↩</a></p>\n</li>\n<li id=\"fn:explicit\">\n<p>It <em>still</em> gets the right answers without this type restriction, but I had no idea it did until I checked for myself. Probably better not to rely on this! <a class=\"footnote-backref\" href=\"#fnref:explicit\" title=\"Jump back to footnote 4 in the text\">↩</a></p>\n</li>\n<li id=\"fn:yall\">\n<p>We could make this even more compact by using a lambda function. First import module <code>yall</code>, then write <code>maplist([X]>>member(X, [a,b]), Key)</code>. But (1) it's not a shorter program because you replace the extra definition with an extra module import, and (2) <code>yall</code> is SWI-Prolog-specific and not an ISO-standard Prolog module. Using <code>contains</code> is more portable. <a class=\"footnote-backref\" href=\"#fnref:yall\" title=\"Jump back to footnote 5 in the text\">↩</a></p>\n</li>\n</ol>\n</div>",
          "url": "https://buttondown.com/hillelwayne/archive/a48fce5b-8a05-4302-b620-9b26f057f145/",
          "published": "2025-04-08T18:34:50.000Z",
          "updated": "2025-04-08T18:34:50.000Z",
          "content": null,
          "image": null,
          "media": [],
          "authors": [],
          "categories": []
        },
        {
          "id": "https://buttondown.com/hillelwayne/archive/april-cools-gaming-games-for-non-gamers/",
          "title": "[April Cools] Gaming Games for Non-Gamers",
          "description": "<p>My <em>April Cools</em> is out! <a href=\"https://www.hillelwayne.com/post/vidja-games/\" target=\"_blank\">Gaming Games for Non-Gamers</a> is a 3,000 word essay on video games worth playing if you've never enjoyed a video game before. <a href=\"https://www.patreon.com/posts/blog-notes-gamer-125654321?utm_medium=clipboard_copy&utm_source=copyLink&utm_campaign=postshare_creator&utm_content=join_link\" target=\"_blank\">Patreon notes here</a>.</p>\n<p>(April Cools is a project where we write genuine content on non-normal topics. You can see all the other April Cools posted so far <a href=\"https://www.aprilcools.club/\" target=\"_blank\">here</a>. There's still time to submit your own!)</p>\n<a class=\"embedded-link\" href=\"https://www.aprilcools.club/\"> <div style=\"width: 100%; background: #fff; border: 1px #ced3d9 solid; border-radius: 5px; margin-top: 1em; overflow: auto; margin-bottom: 1em;\"> <div style=\"float: left; border-bottom: 1px #ced3d9 solid;\"> <img class=\"link-image\" src=\"https://www.aprilcools.club/aprilcoolsclub.png\"/> </div> <div style=\"float: left; color: #393f48; padding-left: 1em; padding-right: 1em;\"> <h4 class=\"link-title\" style=\"margin-bottom: 0em; line-height: 1.25em; margin-top: 1em; font-size: 14px;\">                April Cools' Club</h4> </div> </div></a>",
          "url": "https://buttondown.com/hillelwayne/archive/april-cools-gaming-games-for-non-gamers/",
          "published": "2025-04-01T16:04:59.000Z",
          "updated": "2025-04-01T16:04:59.000Z",
          "content": null,
          "image": null,
          "media": [],
          "authors": [],
          "categories": []
        },
        {
          "id": "https://buttondown.com/hillelwayne/archive/betteridges-law-of-software-engineering/",
          "title": "Betteridge's Law of Software Engineering Specialness",
          "description": "<h3>Logic for Programmers v0.8 now out!</h3>\n<p>The new release has minor changes: new formatting for notes and a better introduction to predicates. I would have rolled it all into v0.9 next month but I like the monthly cadence. <a href=\"https://leanpub.com/logic/\" target=\"_blank\">Get it here!</a></p>\n<h1>Betteridge's Law of Software Engineering Specialness</h1>\n<p>In <a href=\"https://agileotter.blogspot.com/2025/03/there-is-no-automatic-reset-in.html\" target=\"_blank\">There is No Automatic Reset in Engineering</a>, Tim Ottinger asks:</p>\n<blockquote>\n<p>Do the other people have to live with January 2013 for the rest of their lives? Or is it only engineering that has to deal with every dirty hack since the beginning of the organization?</p>\n</blockquote>\n<p><strong>Betteridge's Law of Headlines</strong> says that if a journalism headline ends with a question mark, the answer is probably \"no\". I propose a similar law relating to software engineering specialness:<sup id=\"fnref:ottinger\"><a class=\"footnote-ref\" href=\"#fn:ottinger\">1</a></sup></p>\n<blockquote>\n<p>If someone asks if some aspect of software development is truly unique to just software development, the answer is probably \"no\".</p>\n</blockquote>\n<p>Take the idea that \"in software, hacks are forever.\" My favorite example of this comes from a different profession. The <a href=\"https://en.wikipedia.org/wiki/Dewey_Decimal_Classification\" target=\"_blank\">Dewey Decimal System</a> hierarchically categorizes books by discipline. For example, <em><a href=\"https://www.librarything.com/work/10143437/t/Covered-Bridges-of-Pennsylvania\" target=\"_blank\">Covered Bridges of Pennsylvania</a></em> has Dewey number <code>624.37</code>. <code>6--</code> is the technology discipline, <code>62-</code> is engineering, <code>624</code> is civil engineering, and <code>624.3</code> is \"special types of bridges\". 
I have no idea what the last <code>0.07</code> means, but you get the picture.</p>\n<p>Now if you look at the <a href=\"https://www.librarything.com/mds/6\" target=\"_blank\">6-- \"technology\" breakdown</a>, you'll see that there's no \"software\" subdiscipline. This is because Dewey preallocated the whole technology block in 1876. New topics were instead to be added to the <code>00-</code> \"general-knowledge\" catch-all. Eventually <code>005</code> was assigned to \"software development\", meaning <em>The C Programming Language</em> lives at <code>005.133</code>. </p>\n<p>Incidentally, another late addition to the general knowledge block is <code>001.9</code>: \"controversial knowledge\". </p>\n<p>And that's why my hometown library shelved the C++ books right next to <em>The Mothman Prophecies</em>.</p>\n<p>How's <em>that</em> for technical debt?</p>\n<p>If anything, fixing hacks in software is significantly <em>easier</em> than in other fields. This came up when I was <a href=\"https://www.hillelwayne.com/post/we-are-not-special/\" target=\"_blank\">interviewing classic engineers</a>. Kludges happened all the time, but \"refactoring\" them out is <em>expensive</em>. Need to house a machine that's just two inches taller than the room? Guess what, you're cutting a hole in the ceiling.</p>\n<p>(Even if we restrict the question to other departments in a <em>software company</em>, we can find kludges that are horrible to undo. I once worked for a company which landed an early contract by adding a bespoke support agreement for that one customer. 
That plagued them for years afterward.)</p>\n<p>That's not to say that there aren't things that are different about software vs other fields!<sup id=\"fnref:example\"><a class=\"footnote-ref\" href=\"#fn:example\">2</a></sup>  But I think that <em>most</em> of the time, when we say \"software development is the only profession that deals with XYZ\", it's only because we're ignorant of how those other professions work.</p>\n<hr/>\n<p>Short newsletter because I'm way behind on writing my <a href=\"https://www.aprilcools.club/\" target=\"_blank\">April Cools</a>. If you're interested in April Cools, you should try it out! I make it <em>way</em> harder on myself than it actually needs to be— everybody else who participates finds it pretty chill.</p>\n<div class=\"footnote\">\n<hr/>\n<ol>\n<li id=\"fn:ottinger\">\n<p>Ottinger caveats it with \"engineering, software or otherwise\", so I think he knows that other branches of <em>engineering</em>, at least, have kludges. <a class=\"footnote-backref\" href=\"#fnref:ottinger\" title=\"Jump back to footnote 1 in the text\">↩</a></p>\n</li>\n<li id=\"fn:example\">\n<p>The \"software is different\" idea that I'm most sympathetic to is that in software, the tools we use and the products we create are made from the same material. That's unusual at least in classic engineering. Then again, plenty of machinists have made their own lathes and mills! <a class=\"footnote-backref\" href=\"#fnref:example\" title=\"Jump back to footnote 2 in the text\">↩</a></p>\n</li>\n</ol>\n</div>",
          "url": "https://buttondown.com/hillelwayne/archive/betteridges-law-of-software-engineering/",
          "published": "2025-03-26T18:48:39.000Z",
          "updated": "2025-03-26T18:48:39.000Z",
          "content": null,
          "image": null,
          "media": [],
          "authors": [],
          "categories": []
        },
        {
          "id": "https://buttondown.com/hillelwayne/archive/verification-first-development/",
          "title": "Verification-First Development",
          "description": "<p>A while back I argued on the Blue Site<sup id=\"fnref:li\"><a class=\"footnote-ref\" href=\"#fn:li\">1</a></sup> that \"test-first development\" (TFD) was different than \"test-driven development\" (TDD). The former is \"write tests before you write code\", the latter is a paradigm, culture, and collection of norms that's based on TFD. More broadly, TFD is a special case of <strong>Verification-First Development</strong> and TDD is not.</p>\n<blockquote>\n<p>VFD: before writing code, put in place some means of verifying that the code is correct, or at least have an idea of what you'll do.</p>\n</blockquote>\n<p>\"Verifying\" could mean writing tests, or figuring out how to encode invariants in types, or <a href=\"https://blog.regehr.org/archives/1091\" target=\"_blank\">adding contracts</a>, or <a href=\"https://learntla.com/\" target=\"_blank\">making a formal model</a>, or writing a separate script that checks the output of the program. Just have <em>something</em> appropriate in place that you can run as you go building the code. Ideally, we'd have verification in place for every interesting property, but that's rarely possible in practice. </p>\n<p>Oftentimes we can't make the verification until the code is partially complete. In that case it still helps to figure out the verification we'll write later. The point is to have a <em>plan</em> and follow it promptly.</p>\n<p>I'm using \"code\" as a standin for anything we programmers make, not just software programs. When using constraint solvers, I try to find representative problems I know the answers to. When writing formal specifications, I figure out the system's properties before the design that satisfies those properties. There's probably equivalents in security and other topics, too.</p>\n<h3>The Benefits of VFD</h3>\n<ol>\n<li>Doing verification before coding makes it less likely we'll skip verification entirely. 
It's the professional equivalent of \"No TV until you do your homework.\"</li>\n<li>It's easier to make sure a verifier works properly if we start by running it on code we know doesn't pass it. Bebugging working code takes more discipline.</li>\n<li>We can run checks earlier in the development process. It's better to realize that our code is broken five minutes after we broke it rather than two hours after.</li>\n</ol>\n<p>That's it, those are the benefits of verification-first development. Those are also <em>big</em> benefits for relatively little investment. Specializations of VFD like test-first development can have more benefits, but also more drawbacks.</p>\n<h3>The drawbacks of VFD</h3>\n<ol>\n<li>It slows us down. I know lots of people say that \"no actually it makes you go faster in the long run,\" but that's the <em>long</em> run. Sometimes we do marathons, sometimes we sprint.</li>\n<li>Verification gets in the way of exploratory coding, where we don't know what exactly we want or how exactly to do something.</li>\n<li>Any specific form of verification exerts a pressure on our code to make it easier to verify with that method. For example, if we're mostly verifying via type invariants, we need to figure out how to express those things in our language's type system, which may not be suited for the specific invariants we need.<sup id=\"fnref:sphinx\"><a class=\"footnote-ref\" href=\"#fn:sphinx\">2</a></sup></li>\n</ol>\n<h2>Whether \"pressure\" is a real drawback is incredibly controversial</h2>\n<p>If I had to summarize what makes \"test-driven development\" different from VFD:<sup id=\"fnref:tdd\"><a class=\"footnote-ref\" href=\"#fn:tdd\">3</a></sup></p>\n<ol>\n<li>The form of verification should specifically be tests, and unit tests at that</li>\n<li>Testing pressure is invariably good. 
\"Making your code easier to unit test\" is the same as \"making your code better\".</li>\n</ol>\n<p>This is something all of the various \"drivens\"— TDD, Type Driven Development, Design by Contract— share in common, this idea that the purpose of the paradigm is to exert pressure. Lots of TDD experts claim that \"having a good test suite\" is only the secondary benefit of TDD and the real benefit is how it improves code quality.<sup id=\"fnref:docs\"><a class=\"footnote-ref\" href=\"#fn:docs\">4</a></sup></p>\n<p>Whether they're right or not is not something I want to argue: I've seen these approaches all improve my code structure, but also sometimes worsen it. Regardless, I consider pressure a drawback to VFD in general, though, for a somewhat idiosyncratic reason. If it <em>weren't</em> for pressure, VFD would be wholly independent of the code itself. It would <em>just</em> be about verification, and our decisions would exclusively be about how we want to verify. But the design pressure means that our means of verification affects the system we're checking. What if these conflict in some way?</p>\n<h3>VFD is a technique, not a paradigm</h3>\n<p>One of the main differences between \"techniques\" and \"paradigms\" is that paradigms don't play well with each other. If you tried to do both \"proper\" Test-Driven Development and \"proper\" Cleanroom, your head would explode. Whereas VFD being a \"technique\" means it works well with other techniques and even with many full paradigms.</p>\n<p>It also doesn't take a whole lot of practice to start using. It does take practice, both in thinking of verifications and in using the particular verification method involved, to <em>use well</em>, but we can use it poorly and still benefit.</p>\n<div class=\"footnote\">\n<hr/>\n<ol>\n<li id=\"fn:li\">\n<p>LinkedIn, what did you think I meant? 
<a class=\"footnote-backref\" href=\"#fnref:li\" title=\"Jump back to footnote 1 in the text\">↩</a></p>\n</li>\n<li id=\"fn:sphinx\">\n<p>This bit me in the butt when making my own <a href=\"https://www.sphinx-doc.org/en/master/\" target=\"_blank\">sphinx</a> extensions. The official guides do things in a highly dynamic way that Mypy can't statically check. I had to do things in a completely different way. Ended up being better though! <a class=\"footnote-backref\" href=\"#fnref:sphinx\" title=\"Jump back to footnote 2 in the text\">↩</a></p>\n</li>\n<li id=\"fn:tdd\">\n<p>Someone's going to yell at me that I completely missed the point of TDD, which is XYZ. Well guess what, someone else <em>already</em> yelled at me that only dumb idiot babies think XYZ is important in TDD. Put in whatever you want for XYZ. <a class=\"footnote-backref\" href=\"#fnref:tdd\" title=\"Jump back to footnote 3 in the text\">↩</a></p>\n</li>\n<li id=\"fn:docs\">\n<p>Another thing that weirdly all of the paradigms claim: that they lead to better documentation. I can see the argument, I just find it strange that <em>every single one</em> makes this claim! <a class=\"footnote-backref\" href=\"#fnref:docs\" title=\"Jump back to footnote 4 in the text\">↩</a></p>\n</li>\n</ol>\n</div>",
          "url": "https://buttondown.com/hillelwayne/archive/verification-first-development/",
          "published": "2025-03-18T16:22:20.000Z",
          "updated": "2025-03-18T16:22:20.000Z",
          "content": null,
          "image": null,
          "media": [],
          "authors": [],
          "categories": []
        },
        {
          "id": "https://buttondown.com/hillelwayne/archive/new-blog-post-a-perplexing-javascript-parsing/",
          "title": "New Blog Post: \"A Perplexing Javascript Parsing Puzzle\"",
          "description": "<p>I know I said we'd be back to normal newsletters this week and in fact had 80% of one already written. </p>\n<p>Then I unearthed something that was better left buried.</p>\n<p><a href=\"http://www.hillelwayne.com/post/javascript-puzzle/\" target=\"_blank\">Blog post here</a>, <a href=\"https://www.patreon.com/posts/blog-notes-124153641\" target=\"_blank\">Patreon notes here</a> (Mostly an explanation of how I found this horror in the first place). Next week I'll send what was supposed to be this week's piece.</p>\n<p>(PS: <a href=\"https://www.aprilcools.club/\" target=\"_blank\">April Cools</a> in three weeks!)</p>",
          "url": "https://buttondown.com/hillelwayne/archive/new-blog-post-a-perplexing-javascript-parsing/",
          "published": "2025-03-12T14:49:52.000Z",
          "updated": "2025-03-12T14:49:52.000Z",
          "content": null,
          "image": null,
          "media": [],
          "authors": [],
          "categories": []
        },
        {
          "id": "https://buttondown.com/hillelwayne/archive/five-kinds-of-nondeterminism/",
          "title": "Five Kinds of Nondeterminism",
          "description": "<p>No newsletter next week, I'm teaching a TLA+ workshop.</p>\n<p>Speaking of which: I spend a lot of time thinking about formal methods (and TLA+ specifically) because it's where the source of almost all my revenue. But I don't share most of the details because 90% of my readers don't use FM and never will. I think it's more interesting to talk about ideas <em>from</em> FM that would be useful to people outside that field. For example, the idea of \"property strength\" translates to the <a href=\"https://buttondown.com/hillelwayne/archive/some-tests-are-stronger-than-others/\" target=\"_blank\">idea that some tests are stronger than others</a>. </p>\n<p>Another possible export is how FM approaches nondeterminism. A <strong>nondeterministic</strong> algorithm is one that, from the same starting conditions, has multiple possible outputs. This is nondeterministic:</p>\n<div class=\"codehilite\"><pre><span></span><code># Pseudocode\n\ndef f() {\n    return rand()+1;\n}\n</code></pre></div>\n<p>When specifying systems, I may not <em>encounter</em> nondeterminism more often than in real systems, but I am definitely more aware of its presence. Modeling nondeterminism is a core part of formal specification. I mentally categorize nondeterminism into five buckets. Caveat, this is specifically about nondeterminism from the perspective of <em>system modeling</em>, not computer science as a whole. If I tried to include stuff on NFAs and amb operations this would be twice as long.<sup id=\"fnref:nondeterminism\"><a class=\"footnote-ref\" href=\"#fn:nondeterminism\">1</a></sup></p>\n<p style=\"height:16px; margin:0px !important;\"></p>\n<h2>1. True Randomness</h2>\n<p>Programs that literally make calls to a <code>random</code> function and then use the results. This the simplest type of nondeterminism and one of the most ubiquitous. </p>\n<p>Most of the time, <code>random</code> isn't <em>truly</em> nondeterministic. 
Most of the time computer randomness is actually <strong>pseudorandom</strong>, meaning we seed a deterministic algorithm that behaves \"randomly-enough\" for some use. You could \"lift\" a nondeterministic random function into a deterministic one by adding a fixed seed to the starting state.</p>\n<div class=\"codehilite\"><pre><span></span><code><span class=\"c1\"># Python</span>\n\n<span class=\"kn\">from</span> <span class=\"nn\">random</span> <span class=\"kn\">import</span> <span class=\"n\">random</span><span class=\"p\">,</span> <span class=\"n\">seed</span>\n<span class=\"k\">def</span> <span class=\"nf\">f</span><span class=\"p\">(</span><span class=\"n\">x</span><span class=\"p\">):</span>\n    <span class=\"n\">seed</span><span class=\"p\">(</span><span class=\"n\">x</span><span class=\"p\">)</span>\n    <span class=\"k\">return</span> <span class=\"n\">random</span><span class=\"p\">()</span>\n\n<span class=\"o\">>>></span> <span class=\"n\">f</span><span class=\"p\">(</span><span class=\"mi\">3</span><span class=\"p\">)</span>\n<span class=\"mf\">0.23796462709189137</span>\n<span class=\"o\">>>></span> <span class=\"n\">f</span><span class=\"p\">(</span><span class=\"mi\">3</span><span class=\"p\">)</span>\n<span class=\"mf\">0.23796462709189137</span>\n</code></pre></div>\n<p>Often we don't do this because the <em>point</em> of randomness is to provide nondeterminism! 
 We deliberately <em>abstract out</em> the starting state of the seed from our program, because it's easier to think about it as locally nondeterministic.</p>\n<p>(There's also \"true\" randomness, like using <a href=\"https://www.intel.com/content/www/us/en/developer/articles/guide/intel-digital-random-number-generator-drng-software-implementation-guide.html#inpage-nav-3-2\" target=\"_blank\">thermal noise</a> as an entropy source, which I think is mainly used for cryptography and seeding PRNGs.)</p>\n<p>Most formal specification languages don't deal with randomness (though some deal with <a href=\"https://buttondown.com/hillelwayne/archive/i-formally-modeled-dreidel-for-no-good-reason/\" target=\"_blank\">probability more broadly</a>). Instead, we treat it as a nondeterministic choice:</p>\n<div class=\"codehilite\"><pre><span></span><code># software\nif rand > 0.001 then return a else crash\n\n# specification\neither return a or crash\n</code></pre></div>\n<p>This is because we're looking at worst-case scenarios, so it doesn't matter if <code>crash</code> happens 50% of the time or 0.0001% of the time; it's still possible.</p>\n<h2>2. Concurrency</h2>\n<div class=\"codehilite\"><pre><span></span><code># Pseudocode\nglobal x = 1, y = 0;\n\ndef thread1() {\n    x++;\n    x++;\n    x++;\n}\n\ndef thread2() {\n    y = x;\n}\n</code></pre></div>\n<p>If <code>thread1()</code> and <code>thread2()</code> run sequentially, then (assuming the sequence is fixed) the final value of <code>y</code> is deterministic. If the two functions are started and run simultaneously, then depending on when <code>thread2</code> executes <code>y</code> can be 1, 2, 3, <em>or</em> 4. Both functions are locally sequential, but running them concurrently leads to global nondeterminism.</p>\n<p>Concurrency is arguably the most <em>dramatic</em> source of nondeterminism. 
 <a href=\"https://buttondown.com/hillelwayne/archive/what-makes-concurrency-so-hard/\" target=\"_blank\">Small amounts of concurrency lead to huge explosions in the state space</a>. We have words for the specific kinds of nondeterminism caused by concurrency, like \"race condition\" and \"dirty write\". Often we think about it as a separate <em>topic</em> from nondeterminism. To some extent it \"overshadows\" the other kinds: I have a much easier time teaching students about concurrency in models than nondeterminism in models.</p>\n<p>Many formal specification languages have special syntax/machinery for the concurrent aspects of a system, and generic syntax for other kinds of nondeterminism. In P that's <a href=\"https://p-org.github.io/P/manual/expressions/#choose\" target=\"_blank\">choose</a>. Others don't special-case concurrency, instead representing it as nondeterministic choices by a global coordinator. This is more flexible but also more inconvenient, as you have to implement process-local sequencing code yourself. </p>\n<h2>3. User Input</h2>\n<div class=\"subscribe-form\"></div>\n<p>One of the most famous and influential programming books is <em>The C Programming Language</em> by Kernighan and Ritchie. The first example of a nondeterministic program appears on page 14:</p>\n<p><img alt=\"Picture of the book page. 
Code reproduced below.\" class=\"newsletter-image\" src=\"https://assets.buttondown.email/images/94e6ad15-8d09-48df-b885-191318bfd179.jpg?w=960&fit=max\"/></p>\n<p>For the newsletter readers who get text only emails,<sup id=\"fnref:text-only\"><a class=\"footnote-ref\" href=\"#fn:text-only\">2</a></sup> here's the program:</p>\n<div class=\"codehilite\"><pre><span></span><code><span class=\"cp\">#include</span><span class=\"w\"> </span><span class=\"cpf\"><stdio.h></span>\n<span class=\"cm\">/* copy input to output; 1st version */</span>\n<span class=\"n\">main</span><span class=\"p\">()</span>\n<span class=\"p\">{</span>\n<span class=\"w\">    </span><span class=\"kt\">int</span><span class=\"w\"> </span><span class=\"n\">c</span><span class=\"p\">;</span>\n<span class=\"w\">    </span><span class=\"n\">c</span><span class=\"w\"> </span><span class=\"o\">=</span><span class=\"w\"> </span><span class=\"n\">getchar</span><span class=\"p\">();</span>\n<span class=\"w\">    </span><span class=\"k\">while</span><span class=\"w\"> </span><span class=\"p\">(</span><span class=\"n\">c</span><span class=\"w\"> </span><span class=\"o\">!=</span><span class=\"w\"> </span><span class=\"n\">EOF</span><span class=\"p\">)</span><span class=\"w\"> </span><span class=\"p\">{</span>\n<span class=\"w\">        </span><span class=\"n\">putchar</span><span class=\"p\">(</span><span class=\"n\">c</span><span class=\"p\">);</span>\n<span class=\"w\">        </span><span class=\"n\">c</span><span class=\"w\"> </span><span class=\"o\">=</span><span class=\"w\"> </span><span class=\"n\">getchar</span><span class=\"p\">();</span>\n<span class=\"w\">    </span><span class=\"p\">}</span>\n<span class=\"p\">}</span>\n</code></pre></div>\n<p>Yup, that's nondeterministic. 
Because the user can enter any string, any call of <code>main()</code> could have any output, meaning the number of possible outcomes is infinity.</p>\n<p>Okay that seems a little cheap, and I think it's because we tend to think of determinism in terms of how the user <em>experiences</em> the program. Yes, <code>main()</code> has an infinite number of user inputs, but for each input the user will experience only one possible output. It starts to feel more nondeterministic when modeling a long-standing system that's <em>reacting</em> to user input, for example a server that runs a script whenever the user uploads a file. This can be modeled with nondeterminism and concurrency: We have one execution that's the system, and one nondeterministic execution that represents the effects of our user.</p>\n<p>(One intrusive thought I sometimes have: any \"yes/no\" dialogue actually has <em>three</em> outcomes: yes, no, or the user getting up and walking away without picking a choice, permanently stalling the execution.)</p>\n<h2>4. External forces</h2>\n<p>The more general version of \"user input\": anything where either 1) some part of the execution outcome depends on retrieving external information, or 2) the external world can change some state outside of your system. I call the distinction between internal and external components of the system <a href=\"https://www.hillelwayne.com/post/world-vs-machine/\" target=\"_blank\">the world and the machine</a>. Simple examples: code that at some point reads an external temperature sensor. Unrelated code running on a system which quits programs if it gets too hot. API requests to a third party vendor. Code processing files but users can delete files before the script gets to them.</p>\n<p>Like with PRNGs, some of these cases don't <em>have</em> to be nondeterministic; we can argue that \"the temperature\" should be a virtual input into the function. 
 Like with PRNGs, we treat it as nondeterministic because it's useful to think in that way. Also, what if the temperature changes between starting a function and reading it?</p>\n<p>External forces are also a source of nondeterminism as <em>uncertainty</em>. Measurements in the real world often come with errors, so repeating a measurement can give two different answers. Sometimes operations fail for no discernible reason, or for a non-programmatic reason (like something physically blocking the sensor).</p>\n<p>All of these situations can be modeled in the same way as user input: a concurrent execution making nondeterministic choices.</p>\n<h2>5. Abstraction</h2>\n<p>This is where nondeterminism in system models and in \"real software\" differ the most. I said earlier that pseudorandomness is <em>arguably</em> deterministic, but we abstract it into nondeterminism. More generally, <strong>nondeterminism hides implementation details of deterministic processes</strong>.</p>\n<p>In one consulting project, we had a machine that received a message, parsed a lot of data from the message, went into a complicated workflow, and then entered one of three states. The final state was totally determined by the content of the message, but the actual process of determining that final state took tons and tons of code. None of that mattered at the scope we were modeling, so we abstracted it all away: \"on receiving message, nondeterministically enter state A, B, or C.\"</p>\n<p>Doing this makes the system easier to model. It also makes the model more sensitive to possible errors. What if the workflow is bugged and sends us to the wrong state? That's already covered by the nondeterministic choice! 
 Nondeterministic abstraction gives us the potential to pick the worst-case scenario for our system, so we can prove it's robust even under those conditions.</p>\n<p>I know I beat the \"nondeterminism as abstraction\" drum a whole lot but that's because it's the insight from formal methods I personally value the most, that nondeterminism is a powerful tool to <em>simplify reasoning about things</em>. You can see the same approach in how I approach modeling users and external forces: complex realities black-boxed and simplified into nondeterministic forces on the system.</p>\n<hr/>\n<p>Anyway, I hope this collection of ideas I got from formal methods is useful to my broader readership. Lemme know if it somehow helps you out!</p>\n<div class=\"footnote\">\n<hr/>\n<ol>\n<li id=\"fn:nondeterminism\">\n<p>I realized after writing this that I already wrote an essay about nondeterminism in formal specification <a href=\"https://buttondown.com/hillelwayne/archive/nondeterminism-in-formal-specification/\" target=\"_blank\">just under a year ago</a>. I hope this one covers enough new ground to be interesting! <a class=\"footnote-backref\" href=\"#fnref:nondeterminism\" title=\"Jump back to footnote 1 in the text\">↩</a></p>\n</li>\n<li id=\"fn:text-only\">\n<p>There is a surprising number of you. <a class=\"footnote-backref\" href=\"#fnref:text-only\" title=\"Jump back to footnote 2 in the text\">↩</a></p>\n</li>\n</ol>\n</div>",
          "url": "https://buttondown.com/hillelwayne/archive/five-kinds-of-nondeterminism/",
          "published": "2025-02-19T19:37:57.000Z",
          "updated": "2025-02-19T19:37:57.000Z",
          "content": null,
          "image": null,
          "media": [],
          "authors": [],
          "categories": []
        },
        {
          "id": "https://buttondown.com/hillelwayne/archive/are-efficiency-and-horizontal-scalability-at-odds/",
          "title": "Are Efficiency and Horizontal Scalability at odds?",
          "description": "<p>Sorry for missing the newsletter last week! I started writing on Monday as normal, and by Wednesday the piece (about the <a href=\"https://en.wikipedia.org/wiki/Hierarchy_of_hazard_controls\" target=\"_blank\">hierarchy of controls</a> ) was 2000 words and not <em>close</em> to done. So now it'll be a blog post sometime later this month.</p>\n<p>I also just released a new version of <a href=\"https://leanpub.com/logic/\" target=\"_blank\">Logic for Programmers</a>! 0.7 adds a bunch of new content (type invariants, modeling access policies, rewrites of the first chapters) but more importantly has new fonts that are more legible than the old ones. <a href=\"https://leanpub.com/logic/\" target=\"_blank\">Go check it out!</a></p>\n<p>For this week's newsletter I want to brainstorm an idea I've been noodling over for a while. Say we have a computational task, like running a simulation or searching a very large graph, and it's taking too long to complete on a computer. There's generally three things that we can do to make it faster:</p>\n<ol>\n<li>Buy a faster computer (\"vertical scaling\")</li>\n<li>Modify the software to use the computer's resources better (\"efficiency\")</li>\n<li>Modify the software to use multiple computers (\"horizontal scaling\")</li>\n</ol>\n<p>(Splitting single-threaded software across multiple threads/processes is sort of a blend of (2) and (3).)</p>\n<p>The big benefit of (1) is that we (usually) don't have to make any changes to the software to get a speedup. The downside is that for the past couple of decades computers haven't <em>gotten</em> much faster, except in ways that require recoding (like GPUs and multicore). This means we rely on (2) and (3), and we can do both to a point. I've noticed, though, that horizontal scaling seems to conflict with efficiency. Software optimized to scale well tends to be worse or the <code>N=1</code> case than software optimized to, um, be optimized. 
</p>\n<p>Are there reasons to <em>expect</em> this? It seems reasonable that design goals of software are generally in conflict, purely because exclusively optimizing for one property means making decisions that impede other properties. But is there something in the nature of \"efficiency\" and \"horizontal scalability\" that make them especially disjoint?</p>\n<p>This isn't me trying to explain a fully coherent idea, more me trying to figure this all out for myself. Also I'm probably getting some hardware stuff wrong.</p>\n<h3>Amdahl's Law</h3>\n<p>According to <a href=\"https://en.wikipedia.org/wiki/Amdahl%27s_law\" target=\"_blank\">Amdahl's Law</a>, the maximum speedup by parallelization is constrained by the proportion of the work that can be parallelized. If 80% of algorithm X is parallelizable, the maximum speedup from horizontal scaling is 5x. If algorithm Y is 25% parallelizable, the maximum speedup is only 1.3x. </p>\n<p>If you need horizontal scalability, you want to use algorithm X, <em>even if Y is naturally 3x faster</em>. But if Y was 4x faster, you'd prefer it to X. Maximal scalability means finding the optimal balance between baseline speed and parallelizability. Maximal efficiency means just optimizing baseline speed. </p>\n<h3>Coordination Overhead</h3>\n<p>Distributed algorithms require more coordination. To add a list of numbers in parallel via <a href=\"https://en.wikipedia.org/wiki/Fork%E2%80%93join_model\" target=\"_blank\">fork-join</a>, we'd do something like this:</p>\n<ol>\n<li>Split the list into N sublists</li>\n<li>Fork a new thread/process for each sublist</li>\n<li>Wait for each thread/process to finish</li>\n<li>Add the sums together.</li>\n</ol>\n<p>(1), (2), and (3) all add overhead to the algorithm. At the very least, it's extra lines of code to execute, but it can also mean inter-process communication or network hops. 
 Distribution also means you have fewer natural correctness guarantees, so you need more administrative overhead to avoid race conditions. </p>\n<p><strong>Real world example:</strong> Historically CPython has had a \"global interpreter lock\" (GIL). In multithreaded code, only one thread could execute Python code at a time (others could execute C code). The <a href=\"https://docs.python.org/3/howto/free-threading-python.html#single-threaded-performance\" target=\"_blank\">newest version</a> supports disabling the GIL, which comes at a 40% overhead for single-threaded programs. Supposedly the difference is because the <a href=\"https://docs.python.org/3/whatsnew/3.11.html#whatsnew311-pep659\" target=\"_blank\">specializing adaptive interpreter</a> optimization isn't thread-safe yet. The Python team is hoping to get it down to \"only\" 10%. </p>\n<p style=\"height:16px; margin:0px !important;\"></p>\n<h3>Scaling loses shared resources</h3>\n<p>I'd say that intra-machine scaling (multiple threads/processes) feels qualitatively <em>different</em> than inter-machine scaling. Part of that is that intra-machine scaling is \"capped\" while inter-machine is not. But there's also a difference in what assumptions you can make about shared resources. Starting from the baseline of a single-threaded program:</p>\n<ol>\n<li>Threads have a much harder time sharing CPU caches (you have to manually mess with affinities)</li>\n<li>Processes have a much harder time sharing RAM (I think you have to use <a href=\"https://en.wikipedia.org/wiki/Memory-mapped_file\" target=\"_blank\">mmap</a>?)</li>\n<li>Machines can't share cache, RAM, or disk, period.</li>\n</ol>\n<p>It's a lot easier to solve a problem when the whole thing fits in RAM. But if you split a 50 GB problem across three machines, it doesn't fit in RAM by default, even if the machines have 64 GB each. 
Scaling also means that separate machines can't reuse resources like database connections.</p>\n<h3>Efficiency comes from limits</h3>\n<p>I think the two previous points tie together in the idea that maximal efficiency comes from being able to make assumptions about the system. If we know the <em>exact</em> sequence of computations, we can aim to minimize cache misses. If we don't have to worry about thread-safety, <a href=\"https://www.playingwithpointers.com/blog/refcounting-harder-than-it-sounds.html\" target=\"_blank\">tracking references is dramatically simpler</a>. If we have all of the data in a single database, our query planner has more room to work with. At various tiers of scaling these assumptions are no longer guaranteed and we lose the corresponding optimizations.</p>\n<p>Sometimes these assumptions are implicit and crop up in odd places. Like if you're working at a scale where you need multiple synced databases, you might want to use UUIDs instead of numbers for keys. But then you lose the assumption \"recently inserted rows are close together in the index\", which I've read <a href=\"https://www.cybertec-postgresql.com/en/unexpected-downsides-of-uuid-keys-in-postgresql/\" target=\"_blank\">can lead to significant slowdowns</a>. </p>\n<p>This suggests that if you can find a limit somewhere else, you can get both high horizontal scaling and high efficiency. <del>Supposedly the <a href=\"https://tigerbeetle.com/\" target=\"_blank\">TigerBeetle database</a> has both, but that could be because they limit all records to <a href=\"https://docs.tigerbeetle.com/coding/\" target=\"_blank\">accounts and transfers</a>. 
This means every record fits in <a href=\"https://tigerbeetle.com/blog/2024-07-23-rediscovering-transaction-processing-from-history-and-first-principles/#transaction-processing-from-first-principles\" target=\"_blank\">exactly 128 bytes</a>.</del> [A TigerBeetle engineer reached out to tell me that they do <em>not</em> horizontally scale compute, they distribute across multiple nodes for redundancy. <a href=\"https://lobste.rs/s/5akiq3/are_efficiency_horizontal_scalability#c_ve8ud5\" target=\"_blank\">\"You can't make it faster by adding more machines.\"</a>]</p>\n<p>Does this mean that \"assumptions\" could be both \"assumptions about the computing environment\" and \"assumptions about the problem\"? In the famous essay <a href=\"http://www.frankmcsherry.org/graph/scalability/cost/2015/01/15/COST.html\" target=\"_blank\">Scalability! But at what COST</a>, Frank McSherry shows that his single-threaded laptop could outperform 128-node \"big data systems\" on PageRank and graph connectivity (via label propagation). Afterwards, he discusses how a different algorithm solves graph connectivity even faster: </p>\n<blockquote>\n<p>[Union find] is more line of code than label propagation, but it is 10x faster and 100x less embarassing. … The union-find algorithm is fundamentally incompatible with the graph computation approaches Giraph, GraphLab, and GraphX put forward (the so-called “think like a vertex” model).</p>\n</blockquote>\n<p>The interesting thing to me is that his alternate makes more \"assumptions\" than what he's comparing to. He can \"assume\" a fixed goal and optimize the code for that goal. The \"big data systems\" are trying to be general purpose compute platforms and have to pick a model that supports the widest range of possible problems. 
</p>\n<p>A few years back I wrote <a href=\"https://www.hillelwayne.com/post/cleverness/\" target=\"_blank\">clever vs insightful code</a>. I think what I'm trying to say here is that efficiency comes from having insight into your problem and environment.</p>\n<p>(Last thought to shove in here: to exploit assumptions, you need <em>control</em>. Carefully arranging your data to fit in L1 doesn't matter if your programming language doesn't let you control where things are stored!)</p>\n<h3>Is there a cultural aspect?</h3>\n<p>Maybe there's also a cultural element to this conflict. What if the engineers interested in \"efficiency\" are different from the engineers interested in \"horizontal scaling\"?</p>\n<p>At my first job the data scientists set up a <a href=\"https://en.wikipedia.org/wiki/Apache_Hadoop\" target=\"_blank\">Hadoop</a> cluster for their relatively small dataset, only a few dozen gigabytes or so. One of the senior software engineers saw this and said \"big data is stupid.\" To prove it, he took one of their example queries, wrote a script in Go to compute the same thing, and optimized it to run faster on his machine.</p>\n<p>At the time I was like \"yeah, you're right, big data IS stupid!\" But I think now that we both missed something obvious: with the \"scalable\" solution, the data scientists <em>didn't</em> have to write an optimized script for every single query. Optimizing code is hard, adding more machines is easy! </p>\n<p>The highest tier of horizontal scaling is usually something large businesses want, and large businesses like problems that can be solved purely with money. Maximizing efficiency requires a lot of knowledge-intensive human labour, so is less appealing as an investment. Then again, I've seen a lot of work on making the scalable systems more efficient, such as evenly balancing heterogeneous workloads. Maybe in the largest systems intra-machine efficiency is just too small-scale a problem. 
</p>\n<h3>I'm not sure where this fits in but scaling a volume of tasks conflicts less than scaling individual tasks</h3>\n<p>If you have 1,000 machines and need to crunch one big graph, you probably want the most scalable algorithm. If you instead have 50,000 small graphs, you probably want the most efficient algorithm, which you then run on all 1,000 machines. When we call a problem <a href=\"https://en.wikipedia.org/wiki/Embarrassingly_parallel\" target=\"_blank\">embarrassingly parallel</a>, we usually mean it's easy to horizontally scale. But it's also one that's easy to make more efficient, because local optimizations don't affect the scaling! </p>\n<hr/>\n<p>Okay that's enough brainstorming for one week.</p>\n<h3>Blog Rec</h3>\n<p>Whenever I think about optimization as a skill, the first article that comes to mind is <a href=\"https://matklad.github.io/\" target=\"_blank\">Mat Klad's</a> <a href=\"https://matklad.github.io/2023/11/15/push-ifs-up-and-fors-down.html\" target=\"_blank\">Push Ifs Up And Fors Down</a>. I'd never have considered on my own that inlining loops into functions could be such a huge performance win. The blog has a lot of other posts on the nuts-and-bolts of systems languages, optimization, and concurrency.</p>",
          "url": "https://buttondown.com/hillelwayne/archive/are-efficiency-and-horizontal-scalability-at-odds/",
          "published": "2025-02-12T18:26:20.000Z",
          "updated": "2025-02-12T18:26:20.000Z",
          "content": null,
          "image": null,
          "media": [],
          "authors": [],
          "categories": []
        },
        {
          "id": "https://buttondown.com/hillelwayne/archive/what-hard-thing-does-your-tech-make-easy/",
          "title": "What hard thing does your tech make easy?",
          "description": "<p>I occasionally receive emails asking me to look at the writer's new language/library/tool. Sometimes it's in an area I know well, like formal methods. Other times, I'm a complete stranger to the field. Regardless, I'm generally happy to check it out.</p>\n<p>When starting out, this is the biggest question I'm looking to answer:</p>\n<blockquote>\n<p>What does this technology make easy that's normally hard?</p>\n</blockquote>\n<p>What justifies me learning and migrating to a <em>new</em> thing as opposed to fighting through my problems with the tools I already know? The new thing has to have some sort of value proposition, which could be something like \"better performance\" or \"more secure\". The most universal value and the most direct to show is \"takes less time and mental effort to do something\". I can't accurately judge two benchmarks, but I can see two demos or code samples and compare which one feels easier to me.</p>\n<h2>Examples</h2>\n<h3>Functional programming</h3>\n<p>What drew me originally to functional programming was higher order functions. </p>\n<div class=\"codehilite\"><pre><span></span><code># Without HOFs\n\nout = []\nfor x in input {\n  if test(x) {\n    out.append(x)\n }\n}\n\n# With HOFs\n\nfilter(test, input)\n</code></pre></div>\n<p style=\"height:16px; margin:0px !important;\"></p>\n<p>We can also compare the easiness of various tasks between examples within the same paradigm. If I know FP via Clojure, what could be appealing about Haskell or F#? For one, null safety is a lot easier when I've got option types.</p>\n<h3>Array Programming</h3>\n<p>Array programming languages like APL or J make certain classes of computation easier. For example, finding all of the indices where two arrays <del>differ</del> match. 
Here it is in Python:</p>\n<div class=\"codehilite\"><pre><span></span><code><span class=\"n\">x</span> <span class=\"o\">=</span> <span class=\"p\">[</span><span class=\"mi\">1</span><span class=\"p\">,</span> <span class=\"mi\">4</span><span class=\"p\">,</span> <span class=\"mi\">2</span><span class=\"p\">,</span> <span class=\"mi\">3</span><span class=\"p\">,</span> <span class=\"mi\">4</span><span class=\"p\">,</span> <span class=\"mi\">1</span><span class=\"p\">,</span> <span class=\"mi\">0</span><span class=\"p\">,</span> <span class=\"mi\">0</span><span class=\"p\">,</span> <span class=\"mi\">0</span><span class=\"p\">,</span> <span class=\"mi\">4</span><span class=\"p\">]</span>\n<span class=\"n\">y</span> <span class=\"o\">=</span> <span class=\"p\">[</span><span class=\"mi\">2</span><span class=\"p\">,</span> <span class=\"mi\">3</span><span class=\"p\">,</span> <span class=\"mi\">1</span><span class=\"p\">,</span> <span class=\"mi\">1</span><span class=\"p\">,</span> <span class=\"mi\">2</span><span class=\"p\">,</span> <span class=\"mi\">3</span><span class=\"p\">,</span> <span class=\"mi\">2</span><span class=\"p\">,</span> <span class=\"mi\">0</span><span class=\"p\">,</span> <span class=\"mi\">2</span><span class=\"p\">,</span> <span class=\"mi\">4</span><span class=\"p\">]</span>\n\n<span class=\"o\">>>></span> <span class=\"p\">[</span><span class=\"n\">i</span> <span class=\"k\">for</span> <span class=\"n\">i</span><span class=\"p\">,</span> <span class=\"p\">(</span><span class=\"n\">a</span><span class=\"p\">,</span> <span class=\"n\">b</span><span class=\"p\">)</span> <span class=\"ow\">in</span> <span class=\"nb\">enumerate</span><span class=\"p\">(</span><span class=\"nb\">zip</span><span class=\"p\">(</span><span class=\"n\">x</span><span class=\"p\">,</span> <span class=\"n\">y</span><span class=\"p\">))</span> <span class=\"k\">if</span> <span class=\"n\">a</span> <span class=\"o\">==</span> <span class=\"n\">b</span><span 
class=\"p\">]</span>\n<span class=\"p\">[</span><span class=\"mi\">7</span><span class=\"p\">,</span> <span class=\"mi\">9</span><span class=\"p\">]</span>\n</code></pre></div>\n<p>And here it is in J:</p>\n<div class=\"codehilite\"><pre><span></span><code><span class=\"w\">  </span><span class=\"nv\">x</span><span class=\"w\"> </span><span class=\"o\">=:</span><span class=\"w\"> </span><span class=\"mi\">1</span><span class=\"w\"> </span><span class=\"mi\">4</span><span class=\"w\"> </span><span class=\"mi\">2</span><span class=\"w\"> </span><span class=\"mi\">3</span><span class=\"w\"> </span><span class=\"mi\">4</span><span class=\"w\"> </span><span class=\"mi\">1</span><span class=\"w\"> </span><span class=\"mi\">0</span><span class=\"w\"> </span><span class=\"mi\">0</span><span class=\"w\"> </span><span class=\"mi\">0</span><span class=\"w\"> </span><span class=\"mi\">4</span>\n<span class=\"w\">  </span><span class=\"nv\">y</span><span class=\"w\"> </span><span class=\"o\">=:</span><span class=\"w\"> </span><span class=\"mi\">2</span><span class=\"w\"> </span><span class=\"mi\">3</span><span class=\"w\"> </span><span class=\"mi\">1</span><span class=\"w\"> </span><span class=\"mi\">1</span><span class=\"w\"> </span><span class=\"mi\">2</span><span class=\"w\"> </span><span class=\"mi\">3</span><span class=\"w\"> </span><span class=\"mi\">2</span><span class=\"w\"> </span><span class=\"mi\">0</span><span class=\"w\"> </span><span class=\"mi\">2</span><span class=\"w\"> </span><span class=\"mi\">4</span>\n\n<span class=\"w\">  </span><span class=\"nv\">I</span><span class=\"o\">.</span><span class=\"w\"> </span><span class=\"nv\">x</span><span class=\"w\"> </span><span class=\"o\">=</span><span class=\"w\"> </span><span class=\"nv\">y</span>\n<span class=\"mi\">7</span><span class=\"w\"> </span><span class=\"mi\">9</span>\n</code></pre></div>\n<p>Not every tool is meant for every programmer, because you might not have any of the problems a tool makes easier. 
What comes up more often for you: filtering a list or finding all the indices where two lists match? Statistically speaking, functional programming is more useful to you than array programming.</p>\n<p>But <em>I</em> have this problem enough to justify learning array programming.</p>\n<h3>LLMs</h3>\n<p>I think a lot of the appeal of LLMs is they make a lot of specialist tasks easy for nonspecialists. One thing I recently did was convert some rst <a href=\"https://docutils.sourceforge.io/docs/ref/rst/directives.html#list-table\" target=\"_blank\">list tables</a> to <a href=\"https://docutils.sourceforge.io/docs/ref/rst/directives.html#csv-table-1\" target=\"_blank\">csv tables</a>. Normally I'd have to write some tricky parsing and serialization code to automatically convert between the two. With LLMs, it's just</p>\n<blockquote>\n<p>Convert the following rst list-table into a csv-table: [table]</p>\n</blockquote>\n<p>\"Easy\" can trump \"correct\" as a value. The LLM might get some translations wrong, but it's so convenient I'd rather manually review all the translations for errors than write a specialized script that is correct 100% of the time.</p>\n<h2>Let's not take this too far</h2>\n<p>A college friend once claimed that he cracked the secret of human behavior: humans do whatever makes them happiest. \"What about the martyr who dies for their beliefs?\" \"Well, in their last second of life they get REALLY happy.\"</p>\n<p>We can do the same here, fitting every value proposition into the frame of \"easy\". CUDA makes it easier to do matrix multiplication. Rust makes it easier to write low-level code without memory bugs. TLA+ makes it easier to find errors in your design. Monads make it easier to sequence computations in a lazy environment. 
Making everything about \"easy\" obscures other reasons for adopting new things.</p>\n<h3>That whole \"simple vs easy\" thing</h3>\n<p>Sometimes people think that \"simple\" is better than \"easy\", because \"simple\" is objective and \"easy\" is subjective. This comes from the famous talk <a href=\"https://www.infoq.com/presentations/Simple-Made-Easy/\" target=\"_blank\">Simple Made Easy</a>. I'm not sure I agree that simple is better <em>or</em> more objective: the speaker claims that polymorphism and typeclasses are \"simpler\" than conditionals, and I doubt everybody would agree with that.</p>\n<p>The problem is that \"simple\" is used to mean both \"not complicated\" <em>and</em> \"not complex\". And everybody agrees that \"complicated\" and \"complex\" are different, even if they can't agree <em>what</em> the difference is. This idea should probably be expanded into its own newsletter.</p>\n<p>It's also a lot harder to pitch a technology on being \"simpler\". Simplicity by itself doesn't make a tool better equipped to solve problems. Simplicity can unlock other benefits, like compositionality or <a href=\"https://buttondown.com/hillelwayne/archive/the-capability-tractability-tradeoff/\" target=\"_blank\">tractability</a>, that provide the actual value. And often that value is in the form of \"makes some tasks easier\". </p>",
          "url": "https://buttondown.com/hillelwayne/archive/what-hard-thing-does-your-tech-make-easy/",
          "published": "2025-01-29T18:09:47.000Z",
          "updated": "2025-01-29T18:09:47.000Z",
          "content": null,
          "image": null,
          "media": [],
          "authors": [],
          "categories": []
        },
        {
          "id": "https://buttondown.com/hillelwayne/archive/the-jugglers-curse/",
          "title": "The Juggler's Curse",
          "description": "<p>I'm making a more focused effort to juggle this year. Mostly <a href=\"https://youtu.be/PPhG_90VH5k?si=AxOO65PcX4ZwnxPQ&t=49\" target=\"_blank\">boxes</a>, but also classic balls.<sup id=\"fnref:boxes\"><a class=\"footnote-ref\" href=\"#fn:boxes\">1</a></sup> I've gotten to the point where I can almost consistently do a five-ball cascade, which I <em>thought</em> was the cutoff to being a \"good juggler\". \"Thought\" because I now know a \"good juggler\" is one who can do the five-ball cascade with <em>outside throws</em>. </p>\n<p>I know this because I can't do the outside five-ball cascade... yet. But it's something I can see myself eventually mastering, unlike the slightly more difficult trick of the five-ball mess, which is impossible for mere mortals like me. </p>\n<p><em>In theory</em> there is a spectrum of trick difficulties and skill levels. I could place myself on the axis like this:</p>\n<p><img alt=\"A crudely-drawn scale with 10 even ticks, I'm between 5 and 6\" class=\"newsletter-image\" src=\"https://assets.buttondown.email/images/8ee51aa1-5dd4-48b8-8110-2cdf9a273612.png?w=960&fit=max\"/></p>\n<p>In practice, there are three tiers:</p>\n<ol>\n<li>Toddlers</li>\n<li>Good jugglers who practice hard</li>\n<li>Genetic freaks and actual wizards</li>\n</ol>\n<p>And the graph always, <em>always</em> looks like this:</p>\n<p><img alt=\"The same graph, with the top compressed into \"wizards\" and bottom into \"toddlers\". I'm in toddlers.\" class=\"newsletter-image\" src=\"https://assets.buttondown.email/images/04c76cec-671e-4560-b64e-498b7652359e.png?w=960&fit=max\"/></p>\n<p>This is the juggler's curse, and it's a three-parter:</p>\n<ol>\n<li>The threshold between you and \"good\" is the next trick you cannot do.</li>\n<li>Everything below that level is trivial. 
Once you've gotten a trick down, you can never go back to not knowing it, to appreciating how difficult it was to learn in the first place.<sup id=\"fnref:expert-blindness\"><a class=\"footnote-ref\" href=\"#fn:expert-blindness\">2</a></sup></li>\n<li>Everything above that level is just \"impossible\". You don't have the knowledge needed to recognize the different tiers.<sup id=\"fnref:dk\"><a class=\"footnote-ref\" href=\"#fn:dk\">3</a></sup></li>\n</ol>\n<p>So as you get better, the stuff that was impossible becomes differentiable, and you can see that some of it <em>is</em> possible. And everything you learned becomes trivial. So you're never a good juggler until you learn \"just one more hard trick\".</p>\n<p>The more you know, the more you know you don't know and the less you know you know.</p>\n<h3>This is supposed to be a software newsletter</h3>\n<blockquote>\n<p>A monad is a monoid in the category of endofunctors, what's the problem? <a href=\"https://james-iry.blogspot.com/2009/05/brief-incomplete-and-mostly-wrong.html\" target=\"_blank\">(src)</a></p>\n</blockquote>\n<p>I think this applies to any difficult topic? Most fields don't have the same stark <a href=\"https://en.wikipedia.org/wiki/Spectral_line\" target=\"_blank\">spectral lines</a> as juggling, but there's still tiers of difficulty to techniques, which get compressed the further in either direction they are from your current level.</p>\n<p>Like, I'm not good at formal methods. I've written two books on it but I've never mastered a dependently-typed language or a theorem prover. Those are equally hard. And I'm not good at modeling concurrent systems because I don't understand the formal definition of bisimulation and haven't implemented a Raft. 
Those are also equally hard, in fact exactly as hard as mastering a theorem prover.</p>\n<p>At the same time, the skills I've already developed are easy: properly using refinement is <em>exactly as easy</em> as writing <a href=\"https://buttondown.com/hillelwayne/archive/what-are-the-rosettas-of-formal-specification/\" target=\"_blank\">a wrapped counter</a>. Then I get surprised when I try to explain strong fairness to someone and they just don't get how □◇(ENABLED〈A〉ᵥ) is <em>obviously</em> different from ◇□(ENABLED 〈A〉ᵥ).</p>\n<p>Juggler's curse!</p>\n<p>Now I don't know if this is actually how everybody experiences expertise or if it's just my particular personality: I was a juggler long before I was a software developer. Then again, I'd argue that lots of people talk about one consequence of the juggler's curse: imposter syndrome. If you constantly think what you know is \"trivial\" and what you don't know is \"impossible\", then yeah, you'd start feeling like an imposter at work real quick.</p>\n<p>I wonder if part of the cause is that a lot of skills you have to learn are invisible. One of my favorite blog posts ever is <a href=\"https://www.benkuhn.net/blub/\" target=\"_blank\">In Defense of Blub Studies</a>, which argues that software expertise comes through understanding \"boring\" topics like \"what all of the error messages mean\" and \"how to use a debugger well\".  Blub is a critical part of expertise and takes a lot of hard work to learn, but it <em>feels</em> like trivia. So looking back on a skill I mastered, I might think it was \"easy\" because I'm not including all of the blub that I had to learn, too.</p>\n<p>The takeaway, of course, is that the outside five-ball cascade <em>is</em> objectively the cutoff between good jugglers and toddlers.</p>\n<div class=\"footnote\">\n<hr/>\n<ol>\n<li id=\"fn:boxes\">\n<p>Rant time: I <em>love</em> cigar box juggling. It's fun, it's creative, it's totally unlike any other kind of juggling. 
And it's so niche I straight up cannot find anybody in Chicago to practice with. I once went to a juggling convention and was the only person with a cigar box set there. <a class=\"footnote-backref\" href=\"#fnref:boxes\" title=\"Jump back to footnote 1 in the text\">↩</a></p>\n</li>\n<li id=\"fn:expert-blindness\">\n<p>This particular part of the juggler's curse is also called <a href=\"https://en.wikipedia.org/wiki/Curse_of_knowledge\" target=\"_blank\">the curse of knowledge</a> or \"expert blindness\". <a class=\"footnote-backref\" href=\"#fnref:expert-blindness\" title=\"Jump back to footnote 2 in the text\">↩</a></p>\n</li>\n<li id=\"fn:dk\">\n<p>This isn't Dunning-Kruger, because DK says that people think they are <em>better</em> than they actually are, and also <a href=\"https://www.mcgill.ca/oss/article/critical-thinking/dunning-kruger-effect-probably-not-real\" target=\"_blank\">may not actually be real</a>. <a class=\"footnote-backref\" href=\"#fnref:dk\" title=\"Jump back to footnote 3 in the text\">↩</a></p>\n</li>\n</ol>\n</div>",
          "url": "https://buttondown.com/hillelwayne/archive/the-jugglers-curse/",
          "published": "2025-01-22T18:50:40.000Z",
          "updated": "2025-01-22T18:50:40.000Z",
          "content": null,
          "image": null,
          "media": [],
          "authors": [],
          "categories": []
        },
        {
          "id": "https://buttondown.com/hillelwayne/archive/what-are-the-rosettas-of-formal-specification/",
          "title": "What are the Rosettas of formal specification?",
          "description": "<p>First of all, I just released version 0.6 of <em>Logic for Programmers</em>! You can get it <a href=\"https://leanpub.com/logic/\" target=\"_blank\">here</a>. Release notes in the footnote.<sup id=\"fnref:release-notes\"><a class=\"footnote-ref\" href=\"#fn:release-notes\">1</a></sup></p>\n<p>I've been thinking about my next project after the book's done. One idea is to do a survey of new formal specification languages. There's been a lot of new ones in the past few years (P, Quint, etc), plus some old ones I haven't critically examined (SPIN, mcrl2). I'm thinking of a brief overview of each, what's interesting about it, and some examples of the corresponding models.</p>\n<p>For this I'd want a set of \"Rosetta\" examples. <a href=\"https://rosettacode.org/wiki/Rosetta_Code\" target=\"_blank\">Rosetta Code</a> is a collection of programming tasks done in different languages. For example, <a href=\"https://rosettacode.org/wiki/99_bottles_of_beer\" target=\"_blank\">\"99 bottles of beer on the wall\"</a> in over 300 languages. If I wanted to make a Rosetta Code for specifications of concurrent systems, what examples would I use? </p>\n<h3>What makes a good Rosetta example?</h3>\n<p>A good Rosetta example would be simple enough to understand and implement but also showcase the differences between the languages. </p>\n<p>A good example of a Rosetta example is <a href=\"https://github.com/hwayne/lets-prove-leftpad\" target=\"_blank\">leftpad for code verification</a>. Proving leftpad correct is short in whatever verification language you use. But the proofs themselves are different enough that you can compare what it's like to use code contracts vs dependent types, etc. </p>\n<p>A <em>bad</em> Rosetta example is \"hello world\". While it's good for showing how to run a language, it doesn't clearly differentiate languages. 
Haskell's \"hello world\" is almost identical to BASIC's \"hello world\".</p>\n<p>Rosetta examples don't have to be flashy, but I <em>want</em> mine to be flashy. Formal specification is niche enough that regardless of my medium, most of my audience hasn't used it and may be skeptical. I always have to be selling. This biases me away from using things like dining philosophers or two-phase commit.</p>\n<p>So with that in mind, three ideas:</p>\n<h3>1. Wrapped Counter</h3>\n<p>A counter that starts at 1 and counts to N, after which it wraps around to 1 again.</p>\n<h4>Why it's good</h4>\n<p>This is a good introductory formal specification: it's a minimal possible stateful system without concurrency or nondeterminism. You can use it to talk about the basic structure of a spec, how a verifier works, etc. It's also a good way of introducing \"boring\" semantics, like conditionals and arithmetic, and checking if the language does anything unusual with them. Alloy, for example, defaults to 4-bit signed integers, so you run into problems if you set N too high.<sup id=\"fnref:alloy\"><a class=\"footnote-ref\" href=\"#fn:alloy\">2</a></sup></p>\n<p>At the same time, wrapped counters are a common building block of complex systems. Lots of things can be represented this way: <code>N=1</code> is a flag or blinker, <code>N=3</code> is a traffic light, <code>N=24</code> is a clock, etc.</p>\n<p>The next example is better for showing basic <a href=\"https://www.hillelwayne.com/post/safety-and-liveness/\" target=\"_blank\">safety and liveness properties</a>, but this will do in a pinch. </p>\n<h3>2. Threads</h3>\n<p>A counter starts at 0. N threads each simultaneously try to update the counter. They do this nonatomically: first they read the value of the counter and store that in a thread-local <code>tmp</code>, then they increment <code>tmp</code>, then they set the counter to <code>tmp</code>. 
The expected behavior is that the final value of the counter will be N.</p>\n<h4>Why it's good</h4>\n<p>The system as described is bugged. If two threads interleave the setlocal commands, one thread's update can \"clobber\" the other and the counter can go backwards. To my surprise, most people <em>do not</em> see this error. So it's a good showcase of how the language actually finds real bugs, and how it can verify fixes.</p>\n<p>As to actual language topics: the spec covers concurrency and tracks process-local state. A good spec language should make it possible to adjust N without having to add any new variables. And it \"naturally\" introduces safety, liveness, and <a href=\"https://www.hillelwayne.com/post/action-properties/\" target=\"_blank\">action</a> properties.</p>\n<p>Finally, the thread spec is endlessly adaptable. I've used variations of it to teach refinement, resource starvation, fairness, livelocks, and hyperproperties. Tweak it a bit and you get dining philosophers.</p>\n<h3>3. Bounded buffer</h3>\n<p>We have a bounded buffer with maximum length <code>X</code>. We have <code>R</code> reader and <code>W</code> writer processes. Before writing, writers first check if the buffer is full. If full, the writer goes to sleep. Otherwise, the writer wakes up <em>a random</em> sleeping process, then pushes an arbitrary value. Readers work the same way, except they pop from the buffer (and go to sleep if the buffer is empty).</p>\n<p>The only way for a sleeping process to wake up is if another process successfully performs a read or write.</p>\n<h4>Why it's good</h4>\n<p>This shows process-local nondeterminism (in choosing which sleeping process to wake up), different behavior for different types of processes, and deadlocks: it's possible for every reader and writer to be asleep at the same time.</p>\n<p>The beautiful thing about this example: the spec can only deadlock if <code>X < 2*(R+W)</code>. This is the kind of bug you'd struggle to debug in real code. 
And in fact, people did struggle: even when presented with a minimal code sample and told there was a bug, many <a href=\"http://wiki.c2.com/?ExtremeProgrammingChallengeFourteen\" target=\"_blank\">testing experts couldn't find it</a>. Whereas a formal model of the same code <a href=\"https://www.hillelwayne.com/post/augmenting-agile/\" target=\"_blank\">finds the bug in seconds</a>. </p>\n<p>If a spec language can model the bounded buffer, then it's good enough for production systems.</p>\n<p>On top of that, the bug happens regardless of what writers actually put in the buffer, so you can abstract that all away. This example can demonstrate that you can leave implementation details out of a spec and still find critical errors.</p>\n<h2>Caveat</h2>\n<p>This is all with a <em>heavy</em> TLA+ bias. I've modeled all of these systems in TLA+ and it works pretty well for them. That is to say, none of these do things TLA+ is <em>bad</em> at: reachability, subtyping, transitive closures, unbound spaces, etc. I imagine that as I cover more specification languages I'll find new Rosettas.</p>\n<div class=\"footnote\">\n<hr/>\n<ol>\n<li id=\"fn:release-notes\">\n<ul>\n<li>Exercises are more compact, answers now show name of exercise in title</li>\n</ul>\n<ul>\n<li>\"Conditionals\" chapter has new section on nested conditionals</li>\n</ul>\n<ul>\n<li>\"Crash course\" chapter significantly rewritten</li>\n<li>Started migrating to consistently use <code>==</code> for equality and <code>=</code> for definition. 
Not everything is migrated yet</li>\n<li>\"Beyond Logic\" appendix does a <em>slightly</em> better job of covering HOL and constructive logic</li>\n<li>Addressed various reader feedback</li>\n<li>Two new exercises</li>\n</ul>\n<p><a class=\"footnote-backref\" href=\"#fnref:release-notes\" title=\"Jump back to footnote 1 in the text\">↩</a></p>\n</li>\n<li id=\"fn:alloy\">\n<p>You can change the int size in a model run, so this is more \"surprising footgun and inconvenience\" than \"fundamental limit of the specification language.\" Something still good to know! <a class=\"footnote-backref\" href=\"#fnref:alloy\" title=\"Jump back to footnote 2 in the text\">↩</a></p>\n</li>\n</ol>\n</div>",
          "url": "https://buttondown.com/hillelwayne/archive/what-are-the-rosettas-of-formal-specification/",
          "published": "2025-01-15T17:34:40.000Z",
          "updated": "2025-01-15T17:34:40.000Z",
          "content": null,
          "image": null,
          "media": [],
          "authors": [],
          "categories": []
        },
        {
          "id": "https://buttondown.com/hillelwayne/archive/logic-for-programmers-project-update/",
          "title": "\"Logic for Programmers\" Project Update",
          "description": "<p>Happy new year everyone!</p>\n<p>I released the first <em>Logic for Programmers</em> alpha six months ago. There have been four new versions since then, with the November release putting us in beta. Between work and holidays I didn't make much progress in December, but there will be a 0.6 release in the next week or two.</p>\n<p>People have asked me if the book will ever be available in print, and my answer to that is \"when it's done\". To keep \"when it's done\" from being \"never\", I'm committing myself to <strong>have the book finished by July.</strong> That means roughly six more releases between now and the official First Edition. Then I will start looking for a way to get it printed.</p>\n<h3>The Current State and What Needs to be Done</h3>\n<p>Right now the book is 26,000 words. For the most part, the structure is set: I don't plan to reorganize the chapters much. But I still need to fix shortcomings identified by the reader feedback. In particular, a few topics need more on real world applications, and the Alloy chapter is pretty weak. There's also a bunch of notes and todos and \"fix this\"s I need to go over.</p>\n<p>I also need to rewrite the introduction and predicate logic chapters. Those haven't changed much since 0.1 and I need to go over them <em>very carefully</em>.</p>\n<p>After that comes copyediting.</p>\n<h4>Ugh, Copyediting</h4>\n<p>Copyediting means going through the entire book to make word- and sentence-level changes to the flow. An example would be changing</p>\n<table>\n<thead>\n<tr>\n<th>From</th>\n<th>To</th>\n</tr>\n</thead>\n<tbody>\n<tr>\n<td>I said predicates are just “boolean functions”. That isn’t <em>quite</em> true.</td>\n<td>It's easy to think of predicates as just \"boolean\" functions, but there is a subtle and important difference.</td>\n</tr>\n</tbody>\n</table>\n<p>It's a tiny difference but it reads slightly better to me and makes the book slightly better. 
Now repeat that for all 3000-odd sentences in the book and I'm done with copyediting!</p>\n<p>For the first pass, anyway. Copyediting is miserable. </p>\n<p>Some of the changes I need to make come from reader feedback, but most will come from going through it line-by-line with a copyeditor. Someone's kindly offered to do some of this for free, but I want to find a professional too. If you know anybody, let me know.</p>\n<h4>Formatting</h4>\n<p>The book, if I'm being honest, looks ugly. I'm using the default sphinx/latex combination for layout and typesetting. My thinking is it's not worth making the book pretty until it's worth reading. But I also want the book, when it's eventually printed, to look <em>nice</em>. At the very least it shouldn't have \"self-published\" vibes. </p>\n<p>I've found someone who's been giving me excellent advice on layout and I'm slowly mastering the LaTeX formatting arcana. It's gonna take a few iterations to get things right.</p>\n<h4>Front cover</h4>\n<p>Currently the front cover is this:</p>\n<p><img alt=\"Front cover\" class=\"newsletter-image\" src=\"https://assets.buttondown.email/images/b42ee3de-9d8a-4729-809e-a8739741f0cf.png?w=960&fit=max\"/></p>\n<p>It works but gives \"programmer spent ten minutes in Inkscape\" vibes. I have a vision in my head for what would be nicer. A few people have recommended using Fiverr. So far the results haven't been that good.</p>\n<h4>Fixing Epub</h4>\n<p><em>Ugh</em></p>\n<p>I thought making an epub version would be kinder for phone reading, but it's such a painful format to develop for. Did you know that epub backlinks work totally differently on Kindle vs other ereaders? Did you know the only way to test if you got em working right is to load them up in a virtual Kindle? The feedback loops are miserable. 
So I've been treating epub as a second-class citizen for now and only fixing the <em>worst</em> errors (like math not rendering properly), but that'll have to change as the book finalizes.</p>\n<h3>What comes next?</h3>\n<p>After 1.0, I get my book an ISBN and figure out how to make print copies. The margin on print is <em>way</em> lower than ebooks, especially if it's on-demand: the net royalties for <a href=\"https://kdp.amazon.com/en_US/help/topic/G201834330\" target=\"_blank\">Amazon direct publishing</a> would be 7 dollars on a 20-dollar book (as opposed to Leanpub's 16 dollars). Would having a print version double the sales? I hope so! Either way, a lot of people have been asking about a print version, so I want to make that possible.</p>\n<p>(I also want to figure out how to give people who already have the ebook a discount on print, but I don't know if that's feasible.)</p>\n<p>Then, I dunno, maybe make a talk or a workshop I can pitch to conferences. Once I have that I think I can call <em>LfP</em> complete... at least until the second edition.</p>\n<hr/>\n<p>Anyway none of that is actually technical so here's a quick fun thing. I spent a good chunk of my break reading the <a href=\"https://www.mcrl2.org/web/index.html\" target=\"_blank\">mCRL2 book</a>. mCRL2 defines an \"algebra\" for \"communicating processes\". As a very broad explanation, that's defining what it means to \"add\" and \"multiply\" two processes. What's interesting is that according to their definition, the algebra follows the distributive law, <em>but only if you multiply on the right</em>. eg</p>\n<div class=\"codehilite\"><pre><span></span><code>// VALID\n(a+b)*c = a*c + b*c\n\n// INVALID\na*(b+c) = a*b + a*c\n</code></pre></div>\n<p>This is the first time I've ever seen this in practice! Jury's still out on the rest of the language.</p>\n<hr/>\n<h3>Videos and Stuff</h3>\n<ul>\n<li>My <em>DDD Europe</em> talk is now out! 
<a href=\"https://www.youtube.com/watch?v=uRmNSuYBUOU\" target=\"_blank\">What We Know We Don't Know</a> is about empirical software engineering in general, and software engineering research on Domain Driven Design in particular.</li>\n<li>I was interviewed in the last video on <a href=\"https://www.youtube.com/watch?v=yXxmSI9SlwM\" target=\"_blank\">Craft vs Cruft</a>'s \"Year of Formal Methods\". Check it out!</li>\n</ul>",
          "url": "https://buttondown.com/hillelwayne/archive/logic-for-programmers-project-update/",
          "published": "2025-01-07T18:49:40.000Z",
          "updated": "2025-01-07T18:49:40.000Z",
          "content": null,
          "image": null,
          "media": [],
          "authors": [],
          "categories": []
        }
      ]
    }