RSS.Style RSS/Atom Feed Analysis


Analysis of https://buttondown.email/hillelwayne/rss

Feed fetched in 1,076 ms.
Warning Content type is application/rss+xml; charset=utf-8, not text/xml.
Feed is 394,264 characters long.
Warning Feed is missing an ETag.
Feed has a last modified date of Thu, 12 Jun 2025 15:43:25 GMT.
Warning This feed does not have a stylesheet.
This appears to be an RSS feed.
Feed title: Computer Things
Feed self link matches feed URL.
Feed has 30 items.
First item published on 2025-06-12T15:43:25.000Z
Last item published on 2024-09-10T19:40:29.000Z
Home page URL: https://buttondown.com/hillelwayne
Error Home page does not have a matching feed discovery link in the <head>.

1 feed links in <head>
  • https://buttondown.com/hillelwayne/rss

  • Error Home page does not have a link to the feed in the <body>.

    Formatted XML
    <?xml version="1.0" encoding="utf-8"?>
    <rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom">
        <channel>
            <title>Computer Things</title>
            <link>https://buttondown.com/hillelwayne</link>
            <description>Hi, I'm Hillel. This is the newsletter version of [my website](https://www.hillelwayne.com). I post all website updates here. I also post weekly content just for the newsletter, on topics like
    
    * Formal Methods
    
    * Software History and Culture
    
    * Fringetech and exotic tooling
    
    * The philosophy and theory of software engineering
    
    You can see the archive of all public essays [here](https://buttondown.email/hillelwayne/archive/).</description>
            <atom:link href="https://buttondown.email/hillelwayne/rss" rel="self"/>
            <language>en-us</language>
            <lastBuildDate>Thu, 12 Jun 2025 15:43:25 +0000</lastBuildDate>
            <item>
                <title>Solving LinkedIn Queens with SMT</title>
                <link>https://buttondown.com/hillelwayne/archive/solving-linkedin-queens-with-smt/</link>
                <description>&lt;h3&gt;No newsletter next week&lt;/h3&gt;
    &lt;p&gt;I’ll be speaking at &lt;a href="https://systemsdistributed.com/" target="_blank"&gt;Systems Distributed&lt;/a&gt;. My talk isn't close to done yet, which is why this newsletter is both late and short. &lt;/p&gt;
    &lt;h1&gt;Solving LinkedIn Queens in SMT&lt;/h1&gt;
    &lt;p&gt;The article &lt;a href="https://codingnest.com/modern-sat-solvers-fast-neat-underused-part-1-of-n/" target="_blank"&gt;Modern SAT solvers: fast, neat and underused&lt;/a&gt; claims that SAT solvers&lt;sup id="fnref:SAT"&gt;&lt;a class="footnote-ref" href="#fn:SAT"&gt;1&lt;/a&gt;&lt;/sup&gt; are "criminally underused by the industry". A while back on the newsletter I asked "why": how come they're so powerful and yet nobody uses them? Many experts responded saying the reason is that encoding SAT kinda sucked and they'd rather use tools that compile to SAT. &lt;/p&gt;
    &lt;p&gt;I was reminded of this when I read &lt;a href="https://ryanberger.me/posts/queens/" target="_blank"&gt;Ryan Berger's post&lt;/a&gt; on solving “LinkedIn Queens” as a SAT problem. &lt;/p&gt;
    &lt;p&gt;A quick overview of Queens. You’re presented with an NxN grid divided into N regions, and have to place N queens so that there is exactly one queen in each row, column, and region. While queens can be on the same diagonal, they &lt;em&gt;cannot&lt;/em&gt; be diagonally adjacent.&lt;/p&gt;
    &lt;p&gt;(Important note: LinkedIn “Queens” is a variation on the puzzle game &lt;a href="https://starbattle.puzzlebaron.com/" target="_blank"&gt;Star Battle&lt;/a&gt;, which is the same except the number of stars you place in each row/column/region varies per puzzle, and is usually two. This is also why 'queens' don’t capture like chess queens.)&lt;/p&gt;
    &lt;p&gt;&lt;img alt="An image of a solved queens board. Copied from https://ryanberger.me/posts/queens" class="newsletter-image" src="https://assets.buttondown.email/images/96f6f923-331f-424d-8641-fe6753e1c2ca.png?w=960&amp;amp;fit=max"/&gt;&lt;/p&gt;
    &lt;p&gt;Ryan solved this by writing Queens as a SAT problem, expressing properties like "there is exactly one queen in row 3" as a large number of boolean clauses. &lt;a href="https://ryanberger.me/posts/queens/" target="_blank"&gt;Go read his post, it's pretty cool&lt;/a&gt;. What leapt out to me was that he used &lt;a href="https://cvc5.github.io/" target="_blank"&gt;CVC5&lt;/a&gt;, an &lt;strong&gt;SMT&lt;/strong&gt; solver.&lt;sup id="fnref:SMT"&gt;&lt;a class="footnote-ref" href="#fn:SMT"&gt;2&lt;/a&gt;&lt;/sup&gt; SMT solvers are "higher-level" than SAT, capable of handling more data types than just boolean variables. It's a lot easier to solve the problem at the SMT level than at the SAT level. To show this, I whipped up a short demo of solving the same problem in &lt;a href="https://github.com/Z3Prover/z3/wiki" target="_blank"&gt;Z3&lt;/a&gt; (via the &lt;a href="https://pypi.org/project/z3-solver/" target="_blank"&gt;Python API&lt;/a&gt;).&lt;/p&gt;
    &lt;p&gt;&lt;a href="https://gist.github.com/hwayne/c5de7bc52e733995311236666bedecd3" target="_blank"&gt;Full code here&lt;/a&gt;, which you can compare to Ryan's SAT solution &lt;a href="https://github.com/ryan-berger/queens/blob/master/main.py" target="_blank"&gt;here&lt;/a&gt;. I didn't do a whole lot of cleanup on it (again, time crunch!), but there's a short explanation below.&lt;/p&gt;
    &lt;h3&gt;The code&lt;/h3&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nn"&gt;z3&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="c1"&gt;# type: ignore&lt;/span&gt;
    &lt;span class="kn"&gt;from&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nn"&gt;itertools&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;combinations&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;chain&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;product&lt;/span&gt;
    &lt;span class="n"&gt;solver&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;Solver&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="n"&gt;size&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;9&lt;/span&gt; &lt;span class="c1"&gt;# N&lt;/span&gt;
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;Initial setup and modules. &lt;code&gt;size&lt;/code&gt; is the number of rows/columns/regions in the board, which I'll call &lt;code&gt;N&lt;/code&gt; below.&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="c1"&gt;# queens[n] = col of queen on row n&lt;/span&gt;
    &lt;span class="c1"&gt;# by construction, not on same row&lt;/span&gt;
    &lt;span class="n"&gt;queens&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;IntVector&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'q'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;size&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; 
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;SAT represents the queen positions via N² booleans: &lt;code&gt;q_00&lt;/code&gt; means that a Queen is on row 0 and column 0, &lt;code&gt;!q_05&lt;/code&gt; means a queen &lt;em&gt;isn't&lt;/em&gt; on row 0 col 5, etc. In SMT we can instead encode it as N integers: &lt;code&gt;q_0 = 5&lt;/code&gt; means that the queen on row 0 is positioned at column 5. This immediately enforces one class of constraints for us: we don't need any constraints saying "exactly one queen per row", because that's embedded in the definition of &lt;code&gt;queens&lt;/code&gt;!&lt;/p&gt;
    &lt;p&gt;(Incidentally, using 0-based indexing for the board was a mistake on my part: it makes correctly encoding the regions later really painful.)&lt;/p&gt;
    &lt;p&gt;To actually make the variables &lt;code&gt;[q_0, q_1, …]&lt;/code&gt;, we use the Z3 affordance &lt;code&gt;IntVector(str, n)&lt;/code&gt; for making &lt;code&gt;n&lt;/code&gt; variables at once.&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="n"&gt;solver&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;add&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="n"&gt;And&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;=&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="n"&gt;size&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;queens&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
    &lt;span class="c1"&gt;# not on same column&lt;/span&gt;
    &lt;span class="n"&gt;solver&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;add&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;Distinct&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;queens&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;First we constrain all the integers to &lt;code&gt;[0, N)&lt;/code&gt;, then use the &lt;em&gt;incredibly&lt;/em&gt; handy &lt;code&gt;Distinct&lt;/code&gt; constraint to force all the integers to have different values. This guarantees at most one queen per column, which by the &lt;a href="https://en.wikipedia.org/wiki/Pigeonhole_principle" target="_blank"&gt;pigeonhole principle&lt;/a&gt; means there is exactly one queen per column.&lt;/p&gt;
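    &lt;p&gt;The pigeonhole step can be checked by brute force on a small board. This is a plain-Python sketch rather than anything from the gist, and &lt;code&gt;distinct_assignments&lt;/code&gt; is a hypothetical helper name: it keeps only the assignments whose columns are all in range and all different, then confirms that every survivor uses each column exactly once.&lt;/p&gt;

```python
from itertools import product
from math import factorial


def distinct_assignments(n):
    """All ways to pick a column in range(n) for each of n rows with no repeats."""
    return [cols for cols in product(range(n), repeat=n) if len(set(cols)) == n]


n = 4
solutions = distinct_assignments(n)

# Every surviving assignment uses each column exactly once: it is a permutation.
assert all(sorted(cols) == list(range(n)) for cols in solutions)

# So the number of survivors is n!, out of n**n candidate assignments.
print(len(solutions), factorial(n))  # prints "24 24"
```

    &lt;p&gt;Of the 256 candidate assignments for a 4x4 board, only the 24 permutations survive, which is the "at most one per column means exactly one per column" claim in miniature.&lt;/p&gt;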
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="c1"&gt;# not diagonally adjacent&lt;/span&gt;
    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;size&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="n"&gt;q1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;q2&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;queens&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="n"&gt;queens&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="o"&gt;+&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
        &lt;span class="n"&gt;solver&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;add&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;Abs&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;q1&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="n"&gt;q2&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;!=&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;One of the rules is that queens can't be adjacent. We already know that they can't be horizontally or vertically adjacent via other constraints, which leaves the diagonals. We only need to add constraints that, for each queen, there is no queen in the lower-left or lower-right corner, aka &lt;code&gt;q_3 != q_2 ± 1&lt;/code&gt;. We don't need to check the top corners because if &lt;code&gt;q_1&lt;/code&gt; is in the upper-left corner of &lt;code&gt;q_2&lt;/code&gt;, then &lt;code&gt;q_2&lt;/code&gt; is in the lower-right corner of &lt;code&gt;q_1&lt;/code&gt;!&lt;/p&gt;
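    &lt;p&gt;To see that checking only the row below is enough, here is a small plain-Python sketch (the function names are made up for illustration and don't appear in the gist). It compares the one-directional check with a check of both the row above and the row below, over every 5x5 placement that has one queen per row and column.&lt;/p&gt;

```python
from itertools import permutations


def touches_next_row(cols):
    """cols[r] is the queen's column on row r; compare each row only with the row below."""
    return any(abs(a - b) == 1 for a, b in zip(cols, cols[1:]))


def touches_any_corner(cols):
    """Same question, but look at the rows above and below every queen."""
    n = len(cols)
    for row in range(n):
        for other in (row - 1, row + 1):
            if other in range(n) and abs(cols[row] - cols[other]) == 1:
                return True
    return False


# The two checks agree on every 5x5 placement with one queen per row and column.
assert all(touches_next_row(p) == touches_any_corner(p) for p in permutations(range(5)))

print(touches_next_row((0, 2, 4, 1, 3)), touches_next_row((0, 1, 3, 2, 4)))  # prints "False True"
```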
    &lt;p&gt;That covers everything except the "one queen per region" constraint. But the regions are the tricky part, which we should expect because we vary the difficulty of queens games by varying the regions.&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="n"&gt;regions&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="s2"&gt;"purple"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[(&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;6&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;7&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;8&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
                       &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;6&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;7&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;8&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
                       &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;8&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)],&lt;/span&gt;
            &lt;span class="s2"&gt;"red"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;6&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;6&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;7&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;7&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;8&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;8&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;),],&lt;/span&gt;
            &lt;span class="c1"&gt;# you get the picture&lt;/span&gt;
            &lt;span class="p"&gt;}&lt;/span&gt;
    
    &lt;span class="c1"&gt;# Some checking code left out, see below&lt;/span&gt;
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;The regions have to be manually coded in, which is a huge pain.&lt;/p&gt;
    &lt;p&gt;(In the link, some validation code follows. Since it breaks up explaining the model I put it in the next section.)&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;r&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;regions&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;values&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
        &lt;span class="n"&gt;solver&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;add&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;Or&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
            &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;queens&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;row&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="n"&gt;col&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;row&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;col&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;r&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
            &lt;span class="p"&gt;))&lt;/span&gt;
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;Finally we have the region constraint. The easiest way I found to say "there is exactly one queen in each region" is to say "there is a queen in region 1 and a queen in region 2 and a queen in region 3", etc. Since there are N queens and N non-overlapping regions, having at least one queen in every region means each region has exactly one. Then to say "there is a queen in region &lt;code&gt;purple&lt;/code&gt;" I wrote "&lt;code&gt;q_0 = 0&lt;/code&gt; OR &lt;code&gt;q_0 = 1&lt;/code&gt; OR … OR &lt;code&gt;q_1 = 0&lt;/code&gt; etc."&lt;/p&gt;
    &lt;p&gt;Why iterate over every position in the region instead of doing something like &lt;code&gt;(0, q[0]) in r&lt;/code&gt;? I tried that but it's not an expression that Z3 supports.&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;solver&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;check&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="n"&gt;sat&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;m&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;solver&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
        &lt;span class="nb"&gt;print&lt;/span&gt;&lt;span class="p"&gt;([(&lt;/span&gt;&lt;span class="n"&gt;l&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;m&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;l&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;l&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;queens&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;Finally, we solve and print the positions. Running this gives me:&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="p"&gt;[(&lt;/span&gt;&lt;span class="n"&gt;q__0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;q__1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;q__2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;8&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; 
     &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;q__3&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;q__4&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;7&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;q__5&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; 
     &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;q__6&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;q__7&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;q__8&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;6&lt;/span&gt;&lt;span class="p"&gt;)]&lt;/span&gt;
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;Which is the correct solution to the queens puzzle. I didn't benchmark the solution times, but I imagine it's considerably slower than a raw SAT solver. &lt;a href="https://github.com/audemard/glucose" target="_blank"&gt;Glucose&lt;/a&gt; is really, really fast.&lt;/p&gt;
    &lt;p&gt;But even so, solving the problem with SMT was a lot &lt;em&gt;easier&lt;/em&gt; than solving it with SAT. That satisfies me as an explanation for why people prefer it to SAT.&lt;/p&gt;
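    &lt;p&gt;The printed assignment can also be double-checked without a solver. The sketch below is plain Python and not part of the gist: &lt;code&gt;GRID&lt;/code&gt; is the region map copied from the output of &lt;code&gt;render_regions&lt;/code&gt; in the sanity-check section, &lt;code&gt;SOLUTION&lt;/code&gt; is the list of columns printed above, and &lt;code&gt;check_solution&lt;/code&gt; is a hypothetical helper that re-tests the three constraints directly.&lt;/p&gt;

```python
# Region map as printed by render_regions: one digit per square, one string per row.
GRID = [
    "111111111",
    "112333999",
    "122439999",
    "124437799",
    "124666779",
    "124467799",
    "122467899",
    "122555889",
    "112258899",
]

# SOLUTION[row] is the column of the queen on that row, taken from the solver output.
SOLUTION = [0, 5, 8, 2, 7, 4, 1, 3, 6]


def check_solution(cols, grid):
    """Return True if cols satisfies the column, adjacency and region rules."""
    n = len(grid)
    one_per_column = sorted(cols) == list(range(n))
    no_diagonal_touch = all(abs(a - b) != 1 for a, b in zip(cols, cols[1:]))
    regions_hit = {grid[row][col] for row, col in enumerate(cols)}
    all_regions = {ch for line in grid for ch in line}
    # n queens that hit all n regions must hit each region exactly once.
    one_per_region = regions_hit == all_regions
    return one_per_column and no_diagonal_touch and one_per_region


print(check_solution(SOLUTION, GRID))  # prints "True"
```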
    &lt;h3&gt;Sanity checks&lt;/h3&gt;
    &lt;p&gt;One bit I glossed over earlier was the sanity checking code. I &lt;em&gt;knew for sure&lt;/em&gt; that I was going to make a mistake encoding &lt;code&gt;regions&lt;/code&gt;, and the solver wasn't going to provide useful information about what I did wrong. In cases like these, I like adding small tests and checks to catch mistakes early, because the solver certainly isn't going to catch them!&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="n"&gt;all_squares&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;set&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;product&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;size&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="n"&gt;repeat&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
    &lt;span class="k"&gt;def&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nf"&gt;test_i_set_up_problem_right&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
        &lt;span class="k"&gt;assert&lt;/span&gt; &lt;span class="n"&gt;all_squares&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="nb"&gt;set&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;chain&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;from_iterable&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;regions&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;values&lt;/span&gt;&lt;span class="p"&gt;()))&lt;/span&gt;
    
        &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;r1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;r2&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;combinations&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;regions&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;values&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
            &lt;span class="k"&gt;assert&lt;/span&gt; &lt;span class="ow"&gt;not&lt;/span&gt; &lt;span class="nb"&gt;set&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;r1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&lt;/span&gt; &lt;span class="nb"&gt;set&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;r2&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="nb"&gt;set&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;r1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&lt;/span&gt; &lt;span class="nb"&gt;set&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;r2&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;The first check was a quick test that I didn't leave any squares out, or accidentally put the same square in two regions. Converting the values into sets makes both checks a lot easier. Honestly I don't know why I didn't just use sets from the start, sets are great.&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="k"&gt;def&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nf"&gt;render_regions&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
        &lt;span class="n"&gt;colormap&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"purple"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;  &lt;span class="s2"&gt;"red"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;"brown"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;"white"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;"green"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;"yellow"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;"orange"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;"blue"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;"pink"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
        &lt;span class="n"&gt;board&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;_&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;size&lt;/span&gt;&lt;span class="p"&gt;)]&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;_&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;size&lt;/span&gt;&lt;span class="p"&gt;)]&lt;/span&gt; 
        &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;row&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;col&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;all_squares&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;color&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;region&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;regions&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;items&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
                &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;row&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;col&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;region&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
                    &lt;span class="n"&gt;board&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;row&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="n"&gt;col&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;colormap&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;index&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;color&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;+&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;
    
        &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;row&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;board&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="nb"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;""&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;join&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;map&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;row&lt;/span&gt;&lt;span class="p"&gt;)))&lt;/span&gt;
    
    &lt;span class="n"&gt;render_regions&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;The second check is something that prints out the regions. It produces something like this:&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;111111111
    112333999
    122439999
    124437799
    124666779
    124467799
    122467899
    122555889
    112258899
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;I can compare this to the picture of the board to make sure I got it right. I guess a more advanced solution would be to print emoji squares like 🟥 instead.&lt;/p&gt;
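    &lt;p&gt;As a rough sketch of that emoji idea (not code from the solver itself): the helper below, hypothetically named &lt;code&gt;render_emoji&lt;/code&gt;, takes the digit rows printed above and swaps each region number for a colored square. The digit-to-color assignment is an arbitrary assumption.&lt;/p&gt;

```python
# Hypothetical mapping from region digit to a colored square; the order is arbitrary.
SQUARES = dict(zip("123456789", "🟥🟧🟨🟩🟦🟪🟫⬛⬜"))


def render_emoji(rows):
    """Turn rows of region digits (e.g. "112333999") into rows of emoji squares."""
    return ["".join(SQUARES[digit] for digit in row) for row in rows]


# Render the first three rows of the board shown above.
for line in render_emoji(["111111111", "112333999", "122439999"]):
    print(line)
```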
    &lt;p&gt;Neither check is quality code but it's throwaway and it gets the job done so eh.&lt;/p&gt;
    &lt;h3&gt;Update for the Internet&lt;/h3&gt;
    &lt;p&gt;This was sent as a weekly newsletter, which is usually on topics like &lt;a href="https://buttondown.com/hillelwayne/archive/why-do-we-call-it-boilerplate-code" target="_blank"&gt;software history&lt;/a&gt;, &lt;a href="https://buttondown.com/hillelwayne/archive/the-seven-specification-ur-languages/" target="_blank"&gt;formal methods&lt;/a&gt;, &lt;a href="https://buttondown.com/hillelwayne/archive/i-formally-modeled-dreidel-for-no-good-reason/" target="_blank"&gt;unusual technologies&lt;/a&gt;, and the &lt;a href="https://buttondown.com/hillelwayne/archive/be-suspicious-of-success/" target="_blank"&gt;theory of software engineering&lt;/a&gt;. You &lt;a href="https://buttondown.email/hillelwayne/" target="_blank"&gt;can subscribe here&lt;/a&gt;.&lt;/p&gt;
    &lt;div class="footnote"&gt;
    &lt;hr/&gt;
    &lt;ol&gt;
    &lt;li id="fn:SAT"&gt;
    &lt;p&gt;"Boolean &lt;strong&gt;SAT&lt;/strong&gt;isfiability Solver", aka a solver that can find assignments that make complex boolean expressions true. I write a bit more about them &lt;a href="https://www.hillelwayne.com/post/np-hard/" target="_blank"&gt;here&lt;/a&gt;. &lt;a class="footnote-backref" href="#fnref:SAT" title="Jump back to footnote 1 in the text"&gt;↩&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id="fn:SMT"&gt;
    &lt;p&gt;"Satisfiability Modulo Theories" &lt;a class="footnote-backref" href="#fnref:SMT" title="Jump back to footnote 2 in the text"&gt;↩&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;/ol&gt;
    &lt;/div&gt;</description>
                <pubDate>Thu, 12 Jun 2025 15:43:25 +0000</pubDate>
                <guid>https://buttondown.com/hillelwayne/archive/solving-linkedin-queens-with-smt/</guid>
            </item>
            <item>
                <title>AI is a gamechanger for TLA+ users</title>
                <link>https://buttondown.com/hillelwayne/archive/ai-is-a-gamechanger-for-tla-users/</link>
                <description>&lt;h3&gt;New Logic for Programmers Release&lt;/h3&gt;
    &lt;p&gt;&lt;a href="https://leanpub.com/logic/" target="_blank"&gt;v0.10 is now available&lt;/a&gt;! This is a minor release, mostly focused on logic-based refactoring, with new material on set types and testing that refactors are correct. See the full release notes at &lt;a href="https://github.com/logicforprogrammers/book-assets/blob/master/CHANGELOG.md" target="_blank"&gt;the changelog page&lt;/a&gt;. Due to &lt;a href="https://systemsdistributed.com/" target="_blank"&gt;conference pressure&lt;/a&gt; v0.11 will also likely be a minor release. &lt;/p&gt;
    &lt;p&gt;&lt;img alt="The book cover" class="newsletter-image" src="https://assets.buttondown.email/images/29d4ae9d-bcb9-4d8b-99d4-8a35c0990ad5.jpg?w=300&amp;amp;fit=max"/&gt;&lt;/p&gt;
    &lt;h1&gt;AI is a gamechanger for TLA+ users&lt;/h1&gt;
    &lt;p&gt;&lt;a href="https://lamport.azurewebsites.net/tla/tla.html" target="_blank"&gt;TLA+&lt;/a&gt; is a specification language to model and debug distributed systems. While very powerful, it's also hard for programmers to learn, and there are always questions about how to connect specifications with actual code. &lt;/p&gt;
    &lt;p&gt;That's why &lt;a href="https://zfhuang99.github.io/github%20copilot/formal%20verification/tla+/2025/05/24/ai-revolution-in-distributed-systems.html" target="_blank"&gt;The Coming AI Revolution in Distributed Systems&lt;/a&gt; caught my interest. In the post, Cheng Huang claims that Azure successfully used LLMs to examine an existing codebase, derive a TLA+ spec, and find a production bug in that spec. "After a decade of manually crafting TLA+ specifications", he wrote, "I must acknowledge that this AI-generated specification rivals human work".&lt;/p&gt;
    &lt;p&gt;This inspired me to experiment with LLMs in TLA+ myself. My goals are a little less ambitious than Cheng's: I wanted to see how LLMs could help junior specifiers write TLA+, rather than handling the entire spec automatically. Details on what did and didn't work below, but my takeaway is that &lt;strong&gt;LLMs are an immense specification force multiplier.&lt;/strong&gt;&lt;/p&gt;
    &lt;p&gt;All tests were done with a standard VSCode Copilot subscription, using Claude 3.7 in Agent mode. Other LLMs or IDEs may be more or less effective, etc.&lt;/p&gt;
    &lt;h2&gt;Things Claude was good at&lt;/h2&gt;
    &lt;h3&gt;Fixing syntax errors&lt;/h3&gt;
    &lt;p&gt;TLA+ uses a very different syntax than mainstream programming languages, meaning beginners make a lot of mistakes where they write "programming syntax" instead of TLA+ syntax:&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;NotThree(x) = \* should be ==, not =
        x != 3 \* should be #, not !=
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;The problem is that the TLA+ syntax checker, SANY, is 30 years old and doesn't provide good information. Here's what it says for that snippet:&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;Was expecting "==== or more Module body"
    Encountered "NotThree" at line 6, column 1
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;That only isolates one error and doesn't tell us what the problem is, only where it is. Experienced TLA+ users get "error eyes" and can quickly see what the problem is, but beginners really struggle with this.&lt;/p&gt;
    &lt;p&gt;The TLA+ foundation has made LLM integration a priority, so the VSCode extension &lt;a href="https://github.com/tlaplus/vscode-tlaplus/blob/master/src/main.ts#L174" target="_blank"&gt;naturally supports several agent actions&lt;/a&gt;. One of these is running SANY, meaning an agent can get an error, fix it, get another error, fix it, etc. Provided the above sample and asked to make it work, Claude successfully fixed both errors. It also fixed many errors in a larger spec, as well as figuring out why PlusCal specs weren't compiling to TLA+.&lt;/p&gt;
    &lt;p&gt;This by itself is already enough to make LLMs a worthwhile tool, as it fixes one of the biggest barriers to entry.&lt;/p&gt;
    &lt;h3&gt;Understanding error traces&lt;/h3&gt;
    &lt;p&gt;When TLA+ finds a violated property, it outputs the sequence of steps that leads to the error. This starts in plaintext, and VSCode parses it into an interactive table:&lt;/p&gt;
    &lt;p&gt;&lt;img alt="An example error trace" class="newsletter-image" src="https://assets.buttondown.email/images/f7f16d0e-c61f-4286-ae49-67e03f844126.png?w=960&amp;amp;fit=max"/&gt;&lt;/p&gt;
    &lt;p&gt;Learning to read these error traces is a skill in itself. You have to understand what's happening in each step and how it relates back to the actually broken property. It takes a long time for people to learn how to do this well.&lt;/p&gt;
    &lt;p&gt;Claude was successful here, too, accurately reading 20+ step error traces and giving a high-level explanation of what went wrong. It also could condense error traces: if ten steps of the error trace could be condensed into a one-sentence summary (which can happen if you're modeling a lot of process internals) Claude would do it.&lt;/p&gt;
    &lt;p&gt;I did have issues here with doing this in agent mode: while the extension does provide a "run model checker" command, the agent would regularly ignore this and prefer to run a terminal command instead. This would be fine except that the LLM consistently hallucinated invalid commands. I had to amend every prompt with "run the model checker via vscode, do not use a terminal command". You can skip this if you're willing to copy and paste the error trace into the prompt.&lt;/p&gt;
    &lt;p&gt;As with syntax checking, if this was the &lt;em&gt;only&lt;/em&gt; thing LLMs could effectively do, that would already be enough&lt;sup id="fnref:dayenu"&gt;&lt;a class="footnote-ref" href="#fn:dayenu"&gt;1&lt;/a&gt;&lt;/sup&gt; to earn a strong recommend. Even as a TLA+ expert I expect I'll be using this trick regularly. &lt;/p&gt;
    &lt;h3&gt;Boilerplate tasks&lt;/h3&gt;
    &lt;p&gt;TLA+ has a lot of boilerplate. One of the most notorious examples is &lt;code&gt;UNCHANGED&lt;/code&gt; rules. Specifications are extremely precise — so precise that you have to specify what variables &lt;em&gt;don't&lt;/em&gt; change in every step. This takes the form of an &lt;code&gt;UNCHANGED&lt;/code&gt; clause at the end of relevant actions:&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;RemoveObjectFromStore(srv, o, s) ==
      /\ o \in stored[s]
      /\ stored' = [stored EXCEPT ![s] = @ \ {o}]
      /\ UNCHANGED &amp;lt;&amp;lt;capacity, log, objectsize, pc&amp;gt;&amp;gt;
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;Writing this is really annoying. Updating these whenever you change an action, or add a new variable to the spec, is doubly so. Syntax checking and error analysis are important for beginners, but this is what I wanted for &lt;em&gt;myself&lt;/em&gt;. I took a spec and prompted Claude&lt;/p&gt;
    &lt;blockquote&gt;
    &lt;p&gt;Add UNCHANGED &amp;lt;&amp;lt;v1, v2, etc&amp;gt;&amp;gt; for each variable not changed in an action.&lt;/p&gt;
    &lt;/blockquote&gt;
    &lt;p&gt;And it worked! It successfully updated the &lt;code&gt;UNCHANGED&lt;/code&gt; in every action. &lt;/p&gt;
    &lt;p&gt;(Note, though, that it was a "well-behaved" spec in this regard: only one "action" happened at a time. In TLA+ you can have two actions happen simultaneously, that each update half of the variables, meaning neither of them should have an &lt;code&gt;UNCHANGED&lt;/code&gt; clause. I haven't tested how Claude handles that!)&lt;/p&gt;
    &lt;p&gt;That's the most obvious win, but Claude was good at handling other tedious work, too. Some examples include updating &lt;code&gt;vars&lt;/code&gt; (the conventional collection of all state variables), lifting a hard-coded value into a model parameter, and changing data formats. Most impressive to me, though, was rewriting a spec designed for one process to instead handle multiple processes. This means taking all of the process variables, which originally have types like &lt;code&gt;Int&lt;/code&gt;, converting them to types like &lt;code&gt;[Process -&amp;gt; Int]&lt;/code&gt;, and then updating the uses of all of those variables in the spec. It didn't account for race conditions in the new concurrent behavior, but it was an excellent scaffold to do more work.&lt;/p&gt;
    &lt;h3&gt;Writing properties from an informal description&lt;/h3&gt;
    &lt;p&gt;You have to be pretty precise with your intended property description, but Claude handles converting that precise description into TLA+'s formal syntax, which is something beginners often struggle with.&lt;/p&gt;
    &lt;h2&gt;Things it is less good at&lt;/h2&gt;
    &lt;h3&gt;Generating model config files&lt;/h3&gt;
    &lt;p&gt;To model check TLA+, you need both a specification (&lt;code&gt;.tla&lt;/code&gt;) and a model config file (&lt;code&gt;.cfg&lt;/code&gt;), which have separate syntaxes. Asking the agent to generate the second often led to it using TLA+ syntax. It automatically fixed this after getting parsing errors, though. &lt;/p&gt;
    &lt;h3&gt;Fixing specs&lt;/h3&gt;
    &lt;p&gt;Whenever the agent ran the model checker and discovered a bug, it would naturally propose a change to either the invalid property or the spec. Sometimes the changes were good, other times the changes were not physically realizable. For example, if it found that a bug was due to a race condition between processes, it would often suggest fixing it by saying race conditions were okay. I mean yes, if you say bugs are okay, then the spec finds that bugs are okay! Or it would alternatively suggest adding a constraint to the spec saying that race conditions don't happen. &lt;a href="https://www.hillelwayne.com/post/alloy-facts/" target="_blank"&gt;But that's a huge mistake in specification&lt;/a&gt;, because race conditions happen if we don't have coordination. We need to specify the &lt;em&gt;mechanism&lt;/em&gt; that is supposed to prevent them.&lt;/p&gt;
    &lt;h3&gt;Finding properties of the spec&lt;/h3&gt;
    &lt;p&gt;After seeing how capable it was at translating my properties to TLA+, I started prompting Claude to come up with properties on its own. Unfortunately, almost everything I got back was either trivial, uninteresting, or too coupled to implementation details. I haven't tested if it would work better to ask it for "properties that may be violated".&lt;/p&gt;
    &lt;h3&gt;Generating code from specs&lt;/h3&gt;
    &lt;p&gt;I have to be specific here: Claude &lt;em&gt;could&lt;/em&gt; sometimes convert Python into a passable spec, and vice versa. It &lt;em&gt;wasn't&lt;/em&gt; good at recognizing abstraction. For example, TLA+ specifications often represent sequential operations with a state variable, commonly called &lt;code&gt;pc&lt;/code&gt;. If modeling code that nonatomically retrieves a counter value and increments it, we'd have one action that requires &lt;code&gt;pc = "Get"&lt;/code&gt; and sets the new value to &lt;code&gt;"Inc"&lt;/code&gt;, then another that requires it be &lt;code&gt;"Inc"&lt;/code&gt; and sets it to &lt;code&gt;"Done"&lt;/code&gt;.&lt;/p&gt;
    &lt;p&gt;I found that Claude would try to somehow convert &lt;code&gt;pc&lt;/code&gt; into part of the Python program's state, rather than recognize it as a TLA+ abstraction. On the other side, when converting Python code to TLA+ it would often try to translate things like &lt;code&gt;sleep&lt;/code&gt; into some part of the spec, not recognizing that it is abstractable into a distinct action. I didn't test other possible misconceptions, like converting randomness to nondeterminism.&lt;/p&gt;
    &lt;p&gt;For the record, when converting TLA+ to Python Claude tended to make simulators of the spec, rather than possible production code implementing the spec. I really wasn't expecting otherwise though.&lt;/p&gt;
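    &lt;p&gt;To make the &lt;code&gt;pc&lt;/code&gt; idea concrete, here is a minimal hand-written Python simulator of the counter example above. It is a sketch, not Claude's output and not production code: the names &lt;code&gt;step&lt;/code&gt;, &lt;code&gt;run&lt;/code&gt;, and &lt;code&gt;tmp&lt;/code&gt; are illustrative assumptions, and &lt;code&gt;pc&lt;/code&gt; exists only to say which atomic action each process may take next.&lt;/p&gt;

```python
def step(state, proc):
    """Advance one process by one atomic action, chosen by its pc label."""
    label = state["pc"][proc]
    if label == "Get":
        # Read the shared counter into a process-local temporary.
        state["tmp"][proc] = state["counter"]
        state["pc"][proc] = "Inc"
    elif label == "Inc":
        # Write back the (possibly stale) local value plus one.
        state["counter"] = state["tmp"][proc] + 1
        state["pc"][proc] = "Done"


def run(schedule):
    """Run the processes in the given interleaving and return the final counter."""
    procs = set(schedule)
    state = {
        "counter": 0,
        "pc": {p: "Get" for p in procs},
        "tmp": {p: 0 for p in procs},
    }
    for proc in schedule:
        step(state, proc)
    return state["counter"]


print(run(["a", "a", "b", "b"]))  # one process after the other: both increments land
print(run(["a", "b", "a", "b"]))  # interleaved: both read 0, so one increment is lost
```

    &lt;p&gt;An implementation of the same design would have no &lt;code&gt;pc&lt;/code&gt; variable at all; the ordering it encodes would come from ordinary control flow.&lt;/p&gt;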
    &lt;h2&gt;Unexplored Applications&lt;/h2&gt;
    &lt;p&gt;Things I haven't explored thoroughly but could possibly be effective, based on what I know about TLA+ and AI:&lt;/p&gt;
    &lt;h3&gt;Writing Java Overrides&lt;/h3&gt;
    &lt;p&gt;Most TLA+ operators are resolved via TLA+ interpreters, but you can also implement them in "native" Java. This lets you escape the standard language semantics and add capabilities like &lt;a href="https://github.com/tlaplus/CommunityModules/blob/master/modules/IOUtils.tla" target="_blank"&gt;executing programs during model-checking&lt;/a&gt; or &lt;a href="https://github.com/tlaplus/tlaplus/blob/master/tlatools/org.lamport.tlatools/src/tla2sany/StandardModules/TLC.tla#L62" target="_blank"&gt;dynamically constrain the depth of the searched state space&lt;/a&gt;. There's a lot of cool things I think would be possible with overrides. The problem is there's only a handful of people in the world who know how to write them. But that handful have written quite a few overrides and I think there's enough there for Claude to work with. &lt;/p&gt;
    &lt;h3&gt;Writing specs, given a reference mechanism&lt;/h3&gt;
    &lt;p&gt;In all my experiments, the LLM only had my prompts and the occasional Python script as information. That makes me suspect that some of its problems with writing and fixing specs come down to not having a system model. Maybe it wouldn't suggest fixes like "these processes never race" if it had a design doc saying that the processes can't coordinate. &lt;/p&gt;
    &lt;p&gt;(Could a Sufficiently Powerful LLM derive some TLA+ specification from a design document?)&lt;/p&gt;
    &lt;h3&gt;Connecting specs and code&lt;/h3&gt;
    &lt;p&gt;This is the holy grail of TLA+: taking a codebase and showing it correctly implements a spec. Currently the best ways to do this are by either using TLA+ to generate a test suite, or by taking logged production traces and matching them to TLA+ behaviors. &lt;a href="https://www.mongodb.com/blog/post/engineering/conformance-checking-at-mongodb-testing-our-code-matches-our-tla-specs" target="_blank"&gt;This blog post discusses both&lt;/a&gt;. While I've seen a lot of academic research into these approaches there are no industry-ready tools. So if you want trace validation you have to do a lot of manual labour tailored to your specific product. &lt;/p&gt;
    &lt;p&gt;If LLMs could do some of this work for us then that'd really amplify the usefulness of TLA+ to many companies.&lt;/p&gt;
    &lt;h2&gt;Thoughts&lt;/h2&gt;
    &lt;p&gt;&lt;em&gt;Right now&lt;/em&gt;, agents seem good at the tedious and routine parts of TLA+ and worse at the strategic and abstraction parts. But, since the routine parts are often a huge barrier to beginners, this means that LLMs have the potential to make TLA+ far, far more accessible than it previously was.&lt;/p&gt;
    &lt;p&gt;I have mixed thoughts on this. As an &lt;em&gt;advocate&lt;/em&gt;, this is incredible. I want more people using formal specifications because I believe it leads to cheaper, safer, more reliable software. Anything that gets people comfortable with specs is great for our industry. As a &lt;em&gt;professional TLA+ consultant&lt;/em&gt;, I'm worried that this obsoletes me. Most of my income comes from training and coaching, which companies will have far less demand for now. Then again, maybe this is an opportunity to pitch "agentic TLA+ training" to companies!&lt;/p&gt;
    &lt;p&gt;Anyway, if you're interested in TLA+, there has never been a better time to try it. I mean it, these tools handle so much of the hard part now. I've got a &lt;a href="https://learntla.com/" target="_blank"&gt;free book available online&lt;/a&gt;, as does &lt;a href="https://lamport.azurewebsites.net/tla/book.html" target="_blank"&gt;the inventor of TLA+&lt;/a&gt;. I like &lt;a href="https://elliotswart.github.io/pragmaticformalmodeling/" target="_blank"&gt;this guide too&lt;/a&gt;. Happy modeling!&lt;/p&gt;
    &lt;div class="footnote"&gt;
    &lt;hr/&gt;
    &lt;ol&gt;
    &lt;li id="fn:dayenu"&gt;
    &lt;p&gt;Dayenu. &lt;a class="footnote-backref" href="#fnref:dayenu" title="Jump back to footnote 1 in the text"&gt;↩&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;/ol&gt;
    &lt;/div&gt;</description>
                <pubDate>Thu, 05 Jun 2025 14:59:11 +0000</pubDate>
                <guid>https://buttondown.com/hillelwayne/archive/ai-is-a-gamechanger-for-tla-users/</guid>
            </item>
            <item>
                <title>What does "Undecidable" mean, anyway</title>
                <link>https://buttondown.com/hillelwayne/archive/what-does-undecidable-mean-anyway/</link>
                <description>&lt;h3&gt;Systems Distributed&lt;/h3&gt;
    &lt;p&gt;I'll be speaking at &lt;a href="https://systemsdistributed.com/" target="_blank"&gt;Systems Distributed&lt;/a&gt; next month! The talk is brand new and will aim to showcase some of the formal methods mental models that would be useful in mainstream software development. It has added some extra stress on my schedule, though, so expect the next two monthly releases of &lt;em&gt;Logic for Programmers&lt;/em&gt; to be mostly minor changes.&lt;/p&gt;
    &lt;h2&gt;What does "Undecidable" mean, anyway&lt;/h2&gt;
    &lt;p&gt;Last week I read &lt;a href="https://liamoc.net/forest/loc-000S/index.xml" target="_blank"&gt;Against Curry-Howard Mysticism&lt;/a&gt;, which is a solid article I recommend reading. But this newsletter is actually about &lt;a href="https://lobste.rs/s/n0whur/against_curry_howard_mysticism#c_lbts57" target="_blank"&gt;one comment&lt;/a&gt;:&lt;/p&gt;
    &lt;blockquote&gt;
    &lt;p&gt;I like to see posts like this because I often feel like I can’t tell the difference between BS and a point I’m missing. Can we get one for questions like “Isn’t XYZ (Undecidable|NP-Complete|PSPACE-Complete)?” &lt;/p&gt;
    &lt;/blockquote&gt;
    &lt;p&gt;I've already written one of these for &lt;a href="https://www.hillelwayne.com/post/np-hard/" target="_blank"&gt;NP-complete&lt;/a&gt;, so let's do one for "undecidable". Step one is to pull a technical definition from the book &lt;a href="https://link.springer.com/book/10.1007/978-1-4612-1844-9" target="_blank"&gt;&lt;em&gt;Automata and Computability&lt;/em&gt;&lt;/a&gt;:&lt;/p&gt;
    &lt;blockquote&gt;
    &lt;p&gt;A property P of strings is said to be decidable if ... there is a total Turing machine that accepts input strings that have property P and rejects those that do not. (pg 220)&lt;/p&gt;
    &lt;/blockquote&gt;
    &lt;p&gt;Step two is to translate the technical computer science definition into more conventional programmer terms. Warning, because this is a newsletter and not a blog post, I might be a little sloppy with terms.&lt;/p&gt;
    &lt;h3&gt;Machines and Decision Problems&lt;/h3&gt;
    &lt;p&gt;In automata theory, all inputs to a "program" are strings of characters, and all outputs are "true" or "false". A program "accepts" a string if it outputs "true", and "rejects" if it outputs "false". You can think of this as automata theory studying all pure functions of type &lt;code&gt;f :: string -&amp;gt; boolean&lt;/code&gt;. Problems solvable by finding such an &lt;code&gt;f&lt;/code&gt; are called "decision problems".&lt;/p&gt;
    &lt;p&gt;This covers more than you'd think, because we can bootstrap more powerful functions from these. First, as anyone who's programmed in bash knows, strings can represent any other data. Second, we can fake non-boolean outputs by instead checking if a certain computation gives a certain result. For example, I can reframe the function &lt;code&gt;add(x, y) = x + y&lt;/code&gt; as a decision problem like this:&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;IS_SUM(str) {
        x, y, z = split(str, "#")
        return x + y == z
    }
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;Then because &lt;code&gt;IS_SUM("2#3#5")&lt;/code&gt; returns true, we know &lt;code&gt;2 + 3 == 5&lt;/code&gt;, while &lt;code&gt;IS_SUM("2#3#6")&lt;/code&gt; is false. Since we can bootstrap parameters out of strings, I'll just say it's &lt;code&gt;IS_SUM(x, y, z)&lt;/code&gt; going forward.&lt;/p&gt;
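    &lt;p&gt;For readers who prefer something runnable, a rough Python rendering of the string-based version might look like this. The choice to reject malformed input instead of raising is an assumption; the pseudocode above doesn't say what happens there.&lt;/p&gt;

```python
def is_sum(s: str) -> bool:
    """Accept strings of the form "x#y#z" where x + y == z; reject everything else."""
    parts = s.split("#")
    if len(parts) != 3 or not all(p.isdecimal() for p in parts):
        return False  # assumption: malformed input is rejected, not an error
    x, y, z = map(int, parts)
    return x + y == z


print(is_sum("2#3#5"))  # accepted, because 2 + 3 == 5
print(is_sum("2#3#6"))  # rejected
```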
    &lt;p&gt;A big part of automata theory is studying different models of computation with different strengths. One of the weakest is called &lt;a href="https://en.wikipedia.org/wiki/Deterministic_finite_automaton" target="_blank"&gt;"DFA"&lt;/a&gt;. I won't go into any details about what a DFA actually can do, but the important thing is that it &lt;em&gt;can't&lt;/em&gt; solve &lt;code&gt;IS_SUM&lt;/code&gt;. That is, if you give me a DFA that takes inputs of form &lt;code&gt;x#y#z&lt;/code&gt;, I can always find an input where the DFA returns true when &lt;code&gt;x + y != z&lt;/code&gt;, &lt;em&gt;or&lt;/em&gt; an input which returns false when &lt;code&gt;x + y == z&lt;/code&gt;.&lt;/p&gt;
    &lt;p&gt;It's really important to keep this model of "solve" in mind: a program solves a problem if it correctly returns true on all true inputs and correctly returns false on all false inputs.&lt;/p&gt;
    &lt;h3&gt;(total) Turing Machines&lt;/h3&gt;
    &lt;p&gt;A Turing Machine (TM) is a particular type of computation model. It's important for two reasons: &lt;/p&gt;
    &lt;ol&gt;
    &lt;li&gt;
    &lt;p&gt;By the &lt;a href="https://en.wikipedia.org/wiki/Church%E2%80%93Turing_thesis" target="_blank"&gt;Church-Turing thesis&lt;/a&gt;, a Turing Machine is the "upper bound" of how powerful (physically realizable) computational models can get. This means that if an actual real-world programming language can solve a particular decision problem, so can a TM. Conversely, if the TM &lt;em&gt;can't&lt;/em&gt; solve it, neither can the programming language.&lt;sup id="fnref:caveat"&gt;&lt;a class="footnote-ref" href="#fn:caveat"&gt;1&lt;/a&gt;&lt;/sup&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li&gt;
    &lt;p&gt;It's possible to write a Turing machine that takes &lt;em&gt;a textual representation of another Turing machine&lt;/em&gt; as input, and then simulates that Turing machine as part of its computations. &lt;/p&gt;
    &lt;/li&gt;
    &lt;/ol&gt;
    &lt;p&gt;Property (1) means that we can move between different computational models of equal strength, proving things about one to learn things about another. That's why I'm able to write &lt;code&gt;IS_SUM&lt;/code&gt; in a pseudocode instead of writing it in terms of the TM computational model (and why I was able to use &lt;code&gt;split&lt;/code&gt; for convenience). &lt;/p&gt;
    &lt;p&gt;Property (2) does several interesting things. First of all, it makes it possible to compose Turing machines. Here's how I can roughly ask if a given number is the sum of two primes, with "just" addition and boolean functions:&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;IS_SUM_TWO_PRIMES(z):
        x := 1
        y := 1
        loop {
            if x &amp;gt; z {return false}
            if IS_PRIME(x) {
                if IS_PRIME(y) {
                    if IS_SUM(x, y, z) {
                        return true;
                    }
                }
            }
            y := y + 1
            if y &amp;gt; x {
                x := x + 1
                y := 0
            }
        }
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;Notice that without the &lt;code&gt;if x &amp;gt; z {return false}&lt;/code&gt;, the program would loop forever on &lt;code&gt;z=2&lt;/code&gt;. A TM that always halts for all inputs is called &lt;strong&gt;total&lt;/strong&gt;.&lt;/p&gt;
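    &lt;p&gt;The same search can be sketched in runnable Python. This is an illustration, not a translation the original commits to: &lt;code&gt;is_prime&lt;/code&gt; is a simple trial-division helper assumed here for completeness, and the inner counter restarts at 1 instead of 0, which doesn't change the result.&lt;/p&gt;

```python
from math import isqrt


def is_prime(n: int) -> bool:
    """Trial division for non-negative n; fine for a small illustration."""
    if n in (0, 1):
        return False
    return all(n % d != 0 for d in range(2, isqrt(n) + 1))


def is_sum_two_primes(z: int) -> bool:
    """Search pairs (x, y) with y no larger than x for two primes summing to z."""
    x, y = 1, 1
    while True:
        if x > z:
            return False  # this bound is what makes the search halt on every input
        if is_prime(x) and is_prime(y) and x + y == z:
            return True
        y += 1
        if y > x:
            x += 1
            y = 1


print(is_sum_two_primes(4))  # 2 + 2
print(is_sum_two_primes(2))  # no such pair; the bound on x ends the search
```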
    &lt;p&gt;Property (2) also makes "Turing machines" a possible input to functions, meaning that we can now make decision problems about the behavior of Turing machines. For example, "does the TM &lt;code&gt;M&lt;/code&gt; either accept or reject &lt;code&gt;x&lt;/code&gt; within ten steps?"&lt;sup id="fnref:backticks"&gt;&lt;a class="footnote-ref" href="#fn:backticks"&gt;2&lt;/a&gt;&lt;/sup&gt;&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;IS_DONE_IN_TEN_STEPS(M, x) {
        for (i = 0; i &amp;lt; 10; i++) {
            `simulate M(x) for one step`
            if(`M accepted or rejected`) {
                return true
            }
        }
        return false
    }
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
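    &lt;p&gt;One way to make the backticked parts concrete, purely as a modeling assumption: represent a "machine" as a Python generator function that yields once per step and returns its verdict when it halts. The names below are hypothetical, and real Turing machine simulation involves much more than this.&lt;/p&gt;

```python
def is_done_in_ten_steps(machine, x):
    """Return True if machine(x) halts (accepts or rejects) within ten steps."""
    run = machine(x)
    for _ in range(10):
        try:
            next(run)  # simulate one step
        except StopIteration:
            return True  # the machine halted; its verdict doesn't matter here
    return False


def loops_forever(x):
    """A machine that never halts."""
    while True:
        yield


def accepts_after_three(x):
    """A machine that takes three steps and then accepts."""
    for _ in range(3):
        yield
    return True


print(is_done_in_ten_steps(accepts_after_three, "abc"))  # halts in time
print(is_done_in_ten_steps(loops_forever, "abc"))  # still running after ten steps
```

    &lt;p&gt;The checker itself always finishes after at most ten simulated steps, even when the machine it inspects never halts.&lt;/p&gt;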
    &lt;h3&gt;Decidability and Undecidability&lt;/h3&gt;
    &lt;p&gt;Now we have all of the pieces to understand our original definition:&lt;/p&gt;
    &lt;blockquote&gt;
    &lt;p&gt;A property P of strings is said to be decidable if ... there is a total Turing machine that accepts input strings that have property P and rejects those that do not. (220)&lt;/p&gt;
    &lt;/blockquote&gt;
    &lt;p&gt;Let &lt;code&gt;IS_P&lt;/code&gt; be the decision problem "Does the input satisfy P?" Then &lt;code&gt;IS_P&lt;/code&gt; is decidable if it can be solved by a total Turing machine, i.e., I can provide some &lt;code&gt;IS_P(x)&lt;/code&gt; machine that &lt;em&gt;always&lt;/em&gt; accepts if &lt;code&gt;x&lt;/code&gt; has property P, and always rejects if &lt;code&gt;x&lt;/code&gt; doesn't have property P. If I can't do that, then &lt;code&gt;IS_P&lt;/code&gt; is undecidable. &lt;/p&gt;
    &lt;p&gt;&lt;code&gt;IS_SUM(x, y, z)&lt;/code&gt; and &lt;code&gt;IS_DONE_IN_TEN_STEPS(M, x)&lt;/code&gt; are decidable properties. Is &lt;code&gt;IS_SUM_TWO_PRIMES(z)&lt;/code&gt; decidable? Some analysis shows that our corresponding program will either find a solution, or have &lt;code&gt;x&amp;gt;z&lt;/code&gt; and return false. So yes, it is decidable.&lt;/p&gt;
    &lt;p&gt;Notice there's an asymmetry here. To prove some property is decidable, I just need to find &lt;em&gt;one&lt;/em&gt; program that correctly solves it. To prove some property is undecidable, I need to show that any possible program, no matter what it is, doesn't solve it.&lt;/p&gt;
    &lt;p&gt;So with that asymmetry in mind, are there &lt;em&gt;any&lt;/em&gt; undecidable problems? Yes, quite a lot. Recall that Turing machines can accept encodings of other TMs as input, meaning we can write a TM that checks &lt;em&gt;properties of Turing machines&lt;/em&gt;. And, by &lt;a href="https://en.wikipedia.org/wiki/Rice%27s_theorem" target="_blank"&gt;Rice's Theorem&lt;/a&gt;, every nontrivial semantic&lt;sup id="fnref:nontrivial"&gt;&lt;a class="footnote-ref" href="#fn:nontrivial"&gt;3&lt;/a&gt;&lt;/sup&gt; property of Turing machines is undecidable. The conventional way to prove this is to first find a single undecidable property &lt;code&gt;H&lt;/code&gt;, and then use that to bootstrap undecidability of other properties.&lt;/p&gt;
    &lt;p&gt;The canonical and most famous example of an undecidable problem is the &lt;a href="https://en.wikipedia.org/wiki/Halting_problem" target="_blank"&gt;Halting problem&lt;/a&gt;: "does machine M halt on input i?" It's pretty easy to prove undecidable, and easy to use it to bootstrap other undecidability properties. But again, &lt;em&gt;any&lt;/em&gt; nontrivial semantic property is undecidable. Checking a TM is total is undecidable. Checking a TM accepts &lt;em&gt;any&lt;/em&gt; inputs is undecidable. Checking a TM solves &lt;code&gt;IS_SUM&lt;/code&gt; is undecidable. Etc etc etc.&lt;/p&gt;
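    &lt;p&gt;The usual proof for the halting problem is a diagonal argument, sketched here in Python under simplifying assumptions: programs take no input, and &lt;code&gt;halts&lt;/code&gt; is a hypothetical decider passed in as an ordinary function. Whatever &lt;code&gt;halts&lt;/code&gt; predicts about &lt;code&gt;paradox&lt;/code&gt;, &lt;code&gt;paradox&lt;/code&gt; does the opposite, so no candidate can be right about every program.&lt;/p&gt;

```python
def make_paradox(halts):
    """Build a program that does the opposite of what `halts` predicts for it."""
    def paradox():
        if halts(paradox):
            while True:  # predicted to halt, so loop forever
                pass
        return "halted"  # predicted to loop, so halt immediately
    return paradox


def always_no(program):
    """A (wrong) candidate decider that claims no program ever halts."""
    return False


paradox = make_paradox(always_no)
print(paradox())  # the candidate said "doesn't halt", yet this call returns
```

    &lt;p&gt;Only the "claims it never halts" case can be shown by running it. A candidate that answers true would send &lt;code&gt;paradox&lt;/code&gt; into an infinite loop, which is the other half of the contradiction.&lt;/p&gt;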
    &lt;h3&gt;What this doesn't mean in practice&lt;/h3&gt;
    &lt;p&gt;I often see the halting problem misconstrued as "it's impossible to tell if a program will halt before running it." &lt;strong&gt;This is wrong&lt;/strong&gt;. The halting problem says that we cannot create an algorithm that, when applied to an arbitrary program, tells us whether the program will halt or not. It is absolutely possible to tell if many programs will halt or not. It's possible to find entire subcategories of programs that are guaranteed to halt. It's possible to say "a program constructed following constraints XYZ is guaranteed to halt." &lt;/p&gt;
    &lt;p&gt;The actual consequence of undecidability is more subtle. If we want to know if a program has property P, undecidability tells us&lt;/p&gt;
    &lt;ol&gt;
    &lt;li&gt;We will have to spend time and mental effort to determine if it has P&lt;/li&gt;
    &lt;li&gt;We may not be successful.&lt;/li&gt;
    &lt;/ol&gt;
    &lt;p&gt;This is subtle because we're so used to living in a world where everything's undecidable that we don't really consider what the counterfactual would be like. In such a world there might be no need for Rust, because "does this C program guarantee memory-safety" is a decidable property. The entire field of formal verification could be unnecessary, as we could just check properties of arbitrary programs directly. We could automatically check if a change in a program preserves all existing behavior. Lots of famous math problems could be solved overnight. &lt;/p&gt;
    &lt;p&gt;(This to me is a strong "intuitive" argument for why the halting problem is undecidable: a halt detector can be trivially repurposed as a program optimizer / theorem-prover / bcrypt cracker / chess engine. It's &lt;em&gt;too powerful&lt;/em&gt;, so we should expect it to be impossible.)&lt;/p&gt;
    &lt;p&gt;But because we don't live in that world, all of those things are hard problems that take effort and ingenuity to solve, and even then we often fail.&lt;/p&gt;
    &lt;h3&gt;Update for the Internet&lt;/h3&gt;
    &lt;p&gt;This was sent as a weekly newsletter, which is usually on topics like &lt;a href="https://buttondown.com/hillelwayne/archive/why-do-we-call-it-boilerplate-code" target="_blank"&gt;software history&lt;/a&gt;, &lt;a href="https://buttondown.com/hillelwayne/archive/the-seven-specification-ur-languages/" target="_blank"&gt;formal methods&lt;/a&gt;, &lt;a href="https://buttondown.com/hillelwayne/archive/i-formally-modeled-dreidel-for-no-good-reason/" target="_blank"&gt;unusual technologies&lt;/a&gt;, and the &lt;a href="https://buttondown.com/hillelwayne/archive/be-suspicious-of-success/" target="_blank"&gt;theory of software engineering&lt;/a&gt;. You &lt;a href="https://buttondown.email/hillelwayne/" target="_blank"&gt;can subscribe here&lt;/a&gt;.&lt;/p&gt;
    &lt;div class="footnote"&gt;
    &lt;hr/&gt;
    &lt;ol&gt;
    &lt;li id="fn:caveat"&gt;
    &lt;p&gt;To be pedantic, a TM can't do things like "scrape a webpage" or "render a bitmap", but we're only talking about computational decision problems here. &lt;a class="footnote-backref" href="#fnref:caveat" title="Jump back to footnote 1 in the text"&gt;↩&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id="fn:backticks"&gt;
    &lt;p&gt;One notation I've adopted in &lt;em&gt;Logic for Programmers&lt;/em&gt; is marking abstract sections of pseudocode with backticks. It's really handy! &lt;a class="footnote-backref" href="#fnref:backticks" title="Jump back to footnote 2 in the text"&gt;↩&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id="fn:nontrivial"&gt;
    &lt;p&gt;Nontrivial meaning "at least one TM has this property and at least one TM doesn't have this property". Semantic meaning "related to whether the TM accepts, rejects, or runs forever on a class of inputs". &lt;code&gt;IS_DONE_IN_TEN_STEPS&lt;/code&gt; is &lt;em&gt;not&lt;/em&gt; a semantic property, as it doesn't tell us anything about inputs that take longer than ten steps. &lt;a class="footnote-backref" href="#fnref:nontrivial" title="Jump back to footnote 3 in the text"&gt;↩&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;/ol&gt;
    &lt;/div&gt;</description>
                <pubDate>Wed, 28 May 2025 19:34:02 +0000</pubDate>
                <guid>https://buttondown.com/hillelwayne/archive/what-does-undecidable-mean-anyway/</guid>
            </item>
            <item>
                <title>Finding hard 24 puzzles with planner programming</title>
                <link>https://buttondown.com/hillelwayne/archive/finding-hard-24-puzzles-with-planner-programming/</link>
                <description>&lt;p&gt;&lt;strong&gt;Planner programming&lt;/strong&gt; is a programming technique where you solve problems by providing a goal and actions, and letting the planner find actions that reach the goal. In a previous edition of &lt;a href="https://leanpub.com/logic/" target="_blank"&gt;Logic for Programmers&lt;/a&gt;, I demonstrated how this worked by solving the 
    &lt;a href="https://en.wikipedia.org/wiki/24_(puzzle)" target="_blank"&gt;24 puzzle&lt;/a&gt; with planning. For &lt;a href="https://buttondown.com/hillelwayne/archive/a48fce5b-8a05-4302-b620-9b26f057f145/" target="_blank"&gt;reasons discussed here&lt;/a&gt; I replaced that example with something more practical (orchestrating deployments), but left the &lt;a href="https://github.com/logicforprogrammers/book-assets/tree/master/code/chapter-misc" target="_blank"&gt;code online&lt;/a&gt; for posterity.&lt;/p&gt;
    &lt;p&gt;Recently I saw a family member try and fail to vibe code a tool that would find all valid 24 puzzles, and realized I could adapt the puzzle solver to also be a puzzle generator. First I'll explain the puzzle rules, then the original solver, then the generator.&lt;sup id="fnref:complex"&gt;&lt;a class="footnote-ref" href="#fn:complex"&gt;1&lt;/a&gt;&lt;/sup&gt; For a much longer intro to planning, see &lt;a href="https://www.hillelwayne.com/post/picat/" target="_blank"&gt;here&lt;/a&gt;.&lt;/p&gt;
    &lt;h3&gt;The rules of 24&lt;/h3&gt;
    &lt;p&gt;You're given four numbers and have to find some elementary equation (&lt;code&gt;+-*/&lt;/code&gt;+groupings) that uses all four numbers and results in 24. Each number must be used exactly once, but the numbers do not need to be used in the starting puzzle order. Some examples:&lt;/p&gt;
    &lt;ul&gt;
    &lt;li&gt;&lt;code&gt;[6, 6, 6, 6]&lt;/code&gt; -&amp;gt; &lt;code&gt;6+6+6+6=24&lt;/code&gt;&lt;/li&gt;
    &lt;li&gt;&lt;code&gt;[1, 1, 6, 6]&lt;/code&gt; -&amp;gt; &lt;code&gt;(6+6)*(1+1)=24&lt;/code&gt;&lt;/li&gt;
    &lt;li&gt;&lt;code&gt;[4, 4, 4, 5]&lt;/code&gt; -&amp;gt; &lt;code&gt;4*(5+4/4)=24&lt;/code&gt;&lt;/li&gt;
    &lt;/ul&gt;
    &lt;p&gt;Some setups are impossible, like &lt;code&gt;[1, 1, 1, 1]&lt;/code&gt;. Others are possible only with non-elementary operations, like &lt;code&gt;[1, 5, 5, 324]&lt;/code&gt; (which requires exponentiation).&lt;/p&gt;
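    &lt;p&gt;For readers who want to check the examples above without Picat, here is a minimal brute-force sketch in Python. It is not the planner-based solver described next; &lt;code&gt;solve24&lt;/code&gt; and its internals are hypothetical names, and it uses exact fractions so that division never introduces rounding.&lt;/p&gt;

```python
from fractions import Fraction
from itertools import permutations

def solve24(nums, target=24):
    """Return one expression string that reaches `target`, or None."""
    def search(items):
        # items: (value, expression) pairs that still have to be combined
        if len(items) == 1:
            value, expr = items[0]
            return expr if value == target else None
        for (i, (x, ex)), (j, (y, ey)) in permutations(enumerate(items), 2):
            rest = [item for k, item in enumerate(items) if k not in (i, j)]
            candidates = [(x + y, f"({ex}+{ey})"),
                          (x - y, f"({ex}-{ey})"),
                          (x * y, f"({ex}*{ey})")]
            if y != 0:
                candidates.append((x / y, f"({ex}/{ey})"))
            for value, expr in candidates:
                found = search(rest + [(value, expr)])
                if found is not None:
                    return found
        return None
    return search([(Fraction(n), str(n)) for n in nums])

for puzzle in ([6, 6, 6, 6], [1, 1, 6, 6], [4, 4, 4, 5], [1, 1, 1, 1]):
    print(puzzle, solve24(puzzle))  # the last one prints None
```

    &lt;p&gt;The expression it prints may differ from the ones listed above, since it stops at the first solution it finds.&lt;/p&gt;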
    &lt;h2&gt;The solver&lt;/h2&gt;
    &lt;p&gt;We will use &lt;a href="http://picat-lang.org/" target="_blank"&gt;Picat&lt;/a&gt;, the only language that I know has a built-in planner module. The current state of our plan will be represented by a single list with all of the numbers.&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="s s-Atom"&gt;import&lt;/span&gt; &lt;span class="s s-Atom"&gt;planner&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;math&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;
    &lt;span class="s s-Atom"&gt;import&lt;/span&gt; &lt;span class="s s-Atom"&gt;cp&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;
    
    &lt;span class="nf"&gt;action&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;S0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;S1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;Action&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;Cost&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="s s-Atom"&gt;?=&amp;gt;&lt;/span&gt;
      &lt;span class="nf"&gt;member&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;X&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;S0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
      &lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;S0&lt;/span&gt; &lt;span class="s s-Atom"&gt;:=&lt;/span&gt; &lt;span class="nf"&gt;delete&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;S0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;X&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="c1"&gt;% , is `and`&lt;/span&gt;
      &lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nf"&gt;member&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;Y&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;S0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
      &lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;S0&lt;/span&gt; &lt;span class="s s-Atom"&gt;:=&lt;/span&gt; &lt;span class="nf"&gt;delete&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;S0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;Y&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
      &lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
          &lt;span class="nv"&gt;A&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="err"&gt;$&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;X&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="nv"&gt;Y&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; 
        &lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="nv"&gt;A&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="err"&gt;$&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;X&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="nv"&gt;Y&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="nv"&gt;A&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="err"&gt;$&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;X&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="nv"&gt;Y&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="nv"&gt;A&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="err"&gt;$&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;X&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="nv"&gt;Y&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="nv"&gt;Y&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;
        &lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;S1&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nv"&gt;S0&lt;/span&gt; &lt;span class="s s-Atom"&gt;++&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nf"&gt;apply&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;A&lt;/span&gt;&lt;span class="p"&gt;)]&lt;/span&gt;
      &lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;Action&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nv"&gt;A&lt;/span&gt;
      &lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;Cost&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;
      &lt;span class="p"&gt;.&lt;/span&gt;
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;This is our "action", and it works in three steps:&lt;/p&gt;
    &lt;ol&gt;
    &lt;li&gt;Nondeterministically pull two different values out of the input, deleting them&lt;/li&gt;
    &lt;li&gt;Nondeterministically pick one of the basic operations&lt;/li&gt;
    &lt;li&gt;The new state is the remaining elements, appended with that operation applied to our two picks.&lt;/li&gt;
    &lt;/ol&gt;
    &lt;p&gt;Let's walk through this with &lt;code&gt;[1, 6, 1, 7]&lt;/code&gt;. There are four choices for &lt;code&gt;X&lt;/code&gt; and three for &lt;code&gt;Y&lt;/code&gt;. If the planner chooses &lt;code&gt;X=6&lt;/code&gt; and &lt;code&gt;Y=7&lt;/code&gt;, &lt;code&gt;A = $(6 + 7)&lt;/code&gt;. This is an uncomputed term in the same way lisps might use quotation. We can resolve the computation with &lt;code&gt;apply&lt;/code&gt;, as in the line &lt;code&gt;S1 = S0 ++ [apply(A)]&lt;/code&gt;.&lt;/p&gt;
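    &lt;p&gt;The three steps of the action can be mimicked in Python to see how many moves are available from one state. This is only an analogy: &lt;code&gt;successors&lt;/code&gt; is a hypothetical helper that enumerates every choice eagerly, whereas the Picat planner explores them by backtracking. It also picks entries by position rather than by value, so the repeated &lt;code&gt;1&lt;/code&gt; produces duplicate moves.&lt;/p&gt;

```python
from fractions import Fraction
from itertools import permutations

def successors(state):
    """Yield (action, next_state) for every way to combine two entries of `state`."""
    for i, j in permutations(range(len(state)), 2):
        x, y = state[i], state[j]
        rest = [v for k, v in enumerate(state) if k not in (i, j)]
        options = [("+", x + y), ("-", x - y), ("*", x * y)]
        if y > 0:  # mirrors the `Y > 0` guard on division in the action
            options.append(("/", Fraction(x) / Fraction(y)))
        for op, value in options:
            yield (x, op, y), rest + [value]

moves = list(successors([1, 6, 1, 7]))
print(len(moves))                          # 48: 4 * 3 ordered picks, 4 operations each
print(((6, "+", 7), [1, 1, 13]) in moves)  # True
```

    &lt;p&gt;Each move shrinks the list by one element, so a four-number puzzle is always finished after three actions.&lt;/p&gt;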
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="nf"&gt;final&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="nv"&gt;N&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt; &lt;span class="s s-Atom"&gt;=&amp;gt;&lt;/span&gt;
      &lt;span class="nv"&gt;N&lt;/span&gt; &lt;span class="o"&gt;=:=&lt;/span&gt; &lt;span class="mf"&gt;24.&lt;/span&gt; &lt;span class="c1"&gt;% handle floating point&lt;/span&gt;
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;Our final goal is just a list where the only element is 24. This has to be a little floating point-sensitive to handle floating point division, done by &lt;code&gt;=:=&lt;/code&gt;.&lt;/p&gt;
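    &lt;p&gt;To see why division makes the goal check delicate, consider the expression &lt;code&gt;8/(3-8/3)&lt;/code&gt;, which is exactly 24 on paper. The Python snippet below shows what happens with binary floating point. It describes Python's behaviour only and makes no claim about how Picat compares numbers.&lt;/p&gt;

```python
import math
from fractions import Fraction

as_float = 8 / (3 - 8 / 3)                 # ordinary double-precision floats
as_fraction = 8 / (3 - Fraction(8, 3))     # exact rational arithmetic

print(as_float == 24)              # False: the float result lands just below 24
print(math.isclose(as_float, 24))  # True: comparison with a tolerance
print(as_fraction == 24)           # True: no rounding happened at all
```

    &lt;p&gt;A solver that divides either needs a tolerant comparison or an exact number type, otherwise it can silently miss puzzles like this one.&lt;/p&gt;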
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="s s-Atom"&gt;main&lt;/span&gt; &lt;span class="s s-Atom"&gt;=&amp;gt;&lt;/span&gt;
      &lt;span class="nv"&gt;Start&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;6&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
      &lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nf"&gt;best_plan&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;Start&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;Plan&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
      &lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nf"&gt;printf&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;"%w %w%n"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;Start&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;Plan&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
      &lt;span class="p"&gt;.&lt;/span&gt;
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;For &lt;code&gt;main&lt;/code&gt;, we just find the best plan with the maximum cost of &lt;code&gt;4&lt;/code&gt; and print it. When run from the command line, &lt;code&gt;picat&lt;/code&gt; automatically executes whatever is in &lt;code&gt;main&lt;/code&gt;.&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;$ picat 24.pi
    [1,5,5,6] [1 + 5,5 * 6,30 - 6]
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;I don't want to spoil any more 24 puzzles, so let's stop showing the plan:&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;main =&amp;gt;
    &lt;span class="gd"&gt;- , printf("%w %w%n", Start, Plan)&lt;/span&gt;
    &lt;span class="gi"&gt;+ , printf("%w%n", Start)&lt;/span&gt;
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;h3&gt;Generating puzzles&lt;/h3&gt;
    &lt;p&gt;Picat provides a &lt;code&gt;find_all(X, p(X))&lt;/code&gt; function, which returns all &lt;code&gt;X&lt;/code&gt; for which &lt;code&gt;p(X)&lt;/code&gt; is true. In theory, we could write &lt;code&gt;find_all(S, best_plan(S, 4, _))&lt;/code&gt;. In practice, there are an infinite number of valid puzzles, so we need to bound S somewhat. We also don't want to find any redundant puzzles, such as &lt;code&gt;[6, 6, 6, 4]&lt;/code&gt; and &lt;code&gt;[4, 6, 6, 6]&lt;/code&gt;. &lt;/p&gt;
    &lt;p&gt;We can solve both issues by writing a helper &lt;code&gt;valid24(S)&lt;/code&gt;, which will check that &lt;code&gt;S&lt;/code&gt; is a sorted list of integers within some bounds, like &lt;code&gt;1..8&lt;/code&gt;, and also has a valid solution.&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="nf"&gt;valid24&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;Start&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="s s-Atom"&gt;=&amp;gt;&lt;/span&gt;
      &lt;span class="nv"&gt;Start&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;new_list&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
      &lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;Start&lt;/span&gt; &lt;span class="s s-Atom"&gt;::&lt;/span&gt; &lt;span class="mf"&gt;1..8&lt;/span&gt; &lt;span class="c1"&gt;% every value in 1..8&lt;/span&gt;
      &lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nf"&gt;increasing&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;Start&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="c1"&gt;% sorted ascending&lt;/span&gt;
      &lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nf"&gt;solve&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;Start&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="c1"&gt;% turn into values&lt;/span&gt;
      &lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nf"&gt;best_plan&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;Start&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;Plan&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
      &lt;span class="p"&gt;.&lt;/span&gt;
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;This leans on Picat's constraint solving features to automatically find bounded sorted lists, which is why we need the &lt;code&gt;solve&lt;/code&gt; step.&lt;sup id="fnref:efficiency"&gt;&lt;a class="footnote-ref" href="#fn:efficiency"&gt;2&lt;/a&gt;&lt;/sup&gt; Now we can just loop through all of the values in &lt;code&gt;find_all&lt;/code&gt; to get all solutions:&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="s s-Atom"&gt;main&lt;/span&gt; &lt;span class="s s-Atom"&gt;=&amp;gt;&lt;/span&gt;
      &lt;span class="nf"&gt;foreach&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="nv"&gt;S&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="s s-Atom"&gt;in&lt;/span&gt; &lt;span class="nf"&gt;find_all&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;Start&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
        &lt;span class="nf"&gt;valid24&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;Start&lt;/span&gt;&lt;span class="p"&gt;)))&lt;/span&gt;
        &lt;span class="nf"&gt;printf&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;"%w%n"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;S&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
      &lt;span class="s s-Atom"&gt;end&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;$ picat 24.pi
    
    [1,1,1,8]
    [1,1,2,6]
    [1,1,2,7]
    [1,1,2,8]
    # etc
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
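    &lt;p&gt;The "bounded sorted list" half of &lt;code&gt;valid24&lt;/code&gt; has a direct counterpart in Python's standard library, which gives a feel for the size of the search space. This sketch only enumerates candidates; it does not check whether each one is solvable.&lt;/p&gt;

```python
from itertools import combinations_with_replacement

# Every non-decreasing 4-tuple with entries in 1..8, i.e. each puzzle once
# regardless of the order its numbers are written in.
candidates = list(combinations_with_replacement(range(1, 9), 4))

print(len(candidates))                # 330
print(candidates[0], candidates[-1])  # (1, 1, 1, 1) (8, 8, 8, 8)
```

    &lt;p&gt;With only 330 candidates, running a full plan search on each one is cheap.&lt;/p&gt;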
    &lt;h3&gt;Finding hard puzzles&lt;/h3&gt;
    &lt;p&gt;Last Friday I realized I could do something more interesting with this. Once I have found a plan, I can apply further constraints to the plan, for example to find problems that can be solved with division:&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;valid24(Start, Plan) =&amp;gt;
    &lt;span class="w"&gt; &lt;/span&gt; Start = new_list(4)
    &lt;span class="w"&gt; &lt;/span&gt; , Start :: 1..8
    &lt;span class="w"&gt; &lt;/span&gt; , increasing(Start)
    &lt;span class="w"&gt; &lt;/span&gt; , solve(Start)
    &lt;span class="w"&gt; &lt;/span&gt; , best_plan(Start, 4, Plan)
    &lt;span class="gi"&gt;+ , member($(_ / _), Plan)&lt;/span&gt;
    &lt;span class="w"&gt; &lt;/span&gt; .
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;In playing with this, though, I noticed something weird: there are some solutions that appear if I sort &lt;em&gt;up&lt;/em&gt; but not &lt;em&gt;down&lt;/em&gt;. For example, &lt;code&gt;[3,3,4,5]&lt;/code&gt; appears in the solution set, but &lt;code&gt;[5, 4, 3, 3]&lt;/code&gt; doesn't appear if I replace &lt;code&gt;increasing&lt;/code&gt; with &lt;code&gt;decreasing&lt;/code&gt;.&lt;/p&gt;
    &lt;p&gt;As far as I can tell, this is because Picat only finds one best plan, and &lt;code&gt;[5, 4, 3, 3]&lt;/code&gt; has &lt;em&gt;two&lt;/em&gt; solutions: &lt;code&gt;4*(5-3/3)&lt;/code&gt; and &lt;code&gt;3*(5+4)-3&lt;/code&gt;. &lt;code&gt;best_plan&lt;/code&gt; is a &lt;em&gt;deterministic&lt;/em&gt; operator, so Picat commits to the first best plan it finds. So if it finds &lt;code&gt;3*(5+4)-3&lt;/code&gt; first, it sees that the solution doesn't contain a division, throws &lt;code&gt;[5, 4, 3, 3]&lt;/code&gt; away as a candidate, and moves on to the next puzzle.&lt;/p&gt;
    &lt;p&gt;There's a couple ways we can fix this. We could replace &lt;code&gt;best_plan&lt;/code&gt; with &lt;code&gt;best_plan_nondet&lt;/code&gt;, which can backtrack to find new plans (at the cost of an enormous number of duplicates). Or we could modify our &lt;code&gt;final&lt;/code&gt; to only accept plans with a division: &lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;% Hypothetical change
    final([N]) =&amp;gt;
    &lt;span class="gi"&gt;+ member($(_ / _), current_plan()),&lt;/span&gt;
    &lt;span class="w"&gt; &lt;/span&gt; N =:= 24.
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;My favorite "fix" is to ask another question entirely. While I was looking for puzzles that can be solved with division, what I actually want is puzzles that &lt;em&gt;must&lt;/em&gt; be solved with division. What if I rejected any puzzle that has a solution &lt;em&gt;without&lt;/em&gt; division?&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="gi"&gt;+ plan_with_no_div(S, P) =&amp;gt; best_plan_nondet(S, 4, P), not member($(_ / _), P).&lt;/span&gt;
    
    valid24(Start, Plan) =&amp;gt;
    &lt;span class="w"&gt; &lt;/span&gt; Start = new_list(4)
    &lt;span class="w"&gt; &lt;/span&gt; , Start :: 1..8
    &lt;span class="w"&gt; &lt;/span&gt; , increasing(Start)
    &lt;span class="w"&gt; &lt;/span&gt; , solve(Start)
    &lt;span class="w"&gt; &lt;/span&gt; , best_plan(Start, 4, Plan)
    &lt;span class="gd"&gt;- , member($(_ / _), Plan)&lt;/span&gt;
    &lt;span class="gi"&gt;+ , not plan_with_no_div(Start, _)&lt;/span&gt;
    &lt;span class="w"&gt; &lt;/span&gt; .
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;The new line's a bit tricky. &lt;code&gt;plan_with_no_div&lt;/code&gt; nondeterministically finds a plan, and then fails if the plan contains a division.&lt;sup id="fnref:not"&gt;&lt;a class="footnote-ref" href="#fn:not"&gt;3&lt;/a&gt;&lt;/sup&gt; Since I used &lt;code&gt;best_plan_nondet&lt;/code&gt;, it can backtrack from there and find a new plan. This means &lt;code&gt;plan_with_no_div&lt;/code&gt; only fails if no such plan exists. And in &lt;code&gt;valid24&lt;/code&gt;, we only succeed if &lt;code&gt;plan_with_no_div&lt;/code&gt; fails, guaranteeing that the only existing plans use division. Since this doesn't depend on the plan found via &lt;code&gt;best_plan&lt;/code&gt;, it doesn't matter how the values in &lt;code&gt;Start&lt;/code&gt; are arranged: this will not miss any valid puzzles.&lt;/p&gt;
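    &lt;p&gt;The same "a plan exists, and no division-free plan exists" shape can be sketched in Python by running an exhaustive search twice with different operator sets. The names &lt;code&gt;solvable&lt;/code&gt; and &lt;code&gt;requires_division&lt;/code&gt; are hypothetical, and this brute-force search stands in for the planner rather than reproducing it.&lt;/p&gt;

```python
import operator
from fractions import Fraction
from itertools import permutations

OPS = {"+": operator.add, "-": operator.sub,
       "*": operator.mul, "/": operator.truediv}

def solvable(nums, ops="+-*/", target=24):
    """True if some plan using only `ops` combines all of `nums` into `target`."""
    def search(values):
        if len(values) == 1:
            return values[0] == target
        for i, j in permutations(range(len(values)), 2):
            x, y = values[i], values[j]
            rest = [v for k, v in enumerate(values) if k not in (i, j)]
            for op in ops:
                if op == "/" and y == 0:
                    continue
                if search(rest + [OPS[op](x, y)]):
                    return True
        return False
    return search([Fraction(n) for n in nums])

def requires_division(nums):
    # A plan exists, and it is not the case that a division-free plan exists.
    return solvable(nums) and not solvable(nums, ops="+-*")

print(requires_division([3, 3, 8, 8]))  # True
print(requires_division([5, 4, 3, 3]))  # False: 3*(5+4)-3 avoids division
```

    &lt;p&gt;Because both searches are exhaustive, the answer does not depend on the order of the input numbers.&lt;/p&gt;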
    &lt;h4&gt;Aside for my &lt;a href="https://leanpub.com/logic/" target="_blank"&gt;logic book readers&lt;/a&gt;&lt;/h4&gt;
    &lt;p&gt;The new clause is equivalent to &lt;code&gt;!(some p: Plan(p) &amp;amp;&amp;amp; !(div in p))&lt;/code&gt;. Applying the simplifications we learned:&lt;/p&gt;
    &lt;ol&gt;
    &lt;li&gt;&lt;code&gt;!(some p: Plan(p) &amp;amp;&amp;amp; !(div in p))&lt;/code&gt; (init)&lt;/li&gt;
    &lt;li&gt;&lt;code&gt;all p: !(Plan(p) &amp;amp;&amp;amp; !(div in p))&lt;/code&gt; (all/some duality)&lt;/li&gt;
    &lt;li&gt;&lt;code&gt;all p: !Plan(p) || div in p&lt;/code&gt; (De Morgan's law)&lt;/li&gt;
    &lt;li&gt;&lt;code&gt;all p: Plan(p) =&amp;gt; div in p&lt;/code&gt; (implication definition)&lt;/li&gt;
    &lt;/ol&gt;
    &lt;p&gt;Which more obviously means "if P is a valid plan, then it contains a division".&lt;/p&gt;
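    &lt;p&gt;The four steps can be sanity-checked by brute force. The Python sketch below treats a "world" as three candidate objects, each of which may or may not be a plan and may or may not contain a division, and confirms that all four formulas agree in every such world. The domain size of three is an arbitrary choice for illustration; the check is not a proof for all domains.&lt;/p&gt;

```python
from itertools import product

pairs = list(product([False, True], repeat=2))  # (is_plan, has_div) per object
checked = 0
for world in product(pairs, repeat=3):
    init = not any(p and not d for p, d in world)          # !(some p: ...)
    dual = all(not (p and not d) for p, d in world)        # all/some duality
    de_morgan = all((not p) or d for p, d in world)        # De Morgan's law
    implication = all(d if p else True for p, d in world)  # plan => div
    assert init == dual == de_morgan == implication
    checked += 1

print(checked)  # 64 worlds, all consistent
```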
    &lt;h4&gt;Back to finding hard puzzles&lt;/h4&gt;
    &lt;p&gt;&lt;em&gt;Anyway&lt;/em&gt;, with &lt;code&gt;not plan_with_no_div&lt;/code&gt;, we are filtering puzzles on the set of possible solutions, not just specific solutions. And this gives me an idea: what if we find puzzles that have only one solution? &lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="gh"&gt;different_plan(S, P) =&amp;gt; best_plan_nondet(S, 4, P2), P2 != P.&lt;/span&gt;
    
    valid24(Start, Plan) =&amp;gt;
    &lt;span class="gi"&gt;+ , not different_plan(Start, Plan)&lt;/span&gt;
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;I tried this from &lt;code&gt;1..8&lt;/code&gt; and got:&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;[1,2,7,7]
    [1,3,4,6]
    [1,6,6,8]
    [3,3,8,8]
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;These happen to be some of the &lt;a href="https://www.4nums.com/game/difficulties/" target="_blank"&gt;hardest 24 puzzles known&lt;/a&gt;, though not all of them. Note this is assuming that &lt;code&gt;(X + Y)&lt;/code&gt; and &lt;code&gt;(Y + X)&lt;/code&gt; are &lt;em&gt;different&lt;/em&gt; solutions. If we say they're the same (by writing &lt;code&gt;A = $(X + Y), X &amp;lt;= Y&lt;/code&gt; in our action) then we get a lot more puzzles, many of which are considered "easy". Other "hard" things we can look for include plans that require fractions:&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="nf"&gt;plan_with_no_fractions&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;S&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;P&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="s s-Atom"&gt;=&amp;gt;&lt;/span&gt; 
      &lt;span class="nf"&gt;best_plan_nondet&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;S&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;P&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
      &lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="o"&gt;not&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="nf"&gt;member&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;X&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;P&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
        &lt;span class="nf"&gt;round&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;apply&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;X&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="s s-Atom"&gt;=\=&lt;/span&gt; &lt;span class="nv"&gt;X&lt;/span&gt;
      &lt;span class="p"&gt;).&lt;/span&gt;
    
    &lt;span class="c1"&gt;% insert `not plan...` in valid24 as usual&lt;/span&gt;
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;Finally, we could try seeing if a negative number is required:&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="nf"&gt;plan_with_no_negatives&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;S&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;P&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="s s-Atom"&gt;=&amp;gt;&lt;/span&gt; 
      &lt;span class="nf"&gt;best_plan_nondet&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;S&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;P&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
      &lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="o"&gt;not&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="nf"&gt;member&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;X&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;P&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
        &lt;span class="nf"&gt;apply&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;X&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;
      &lt;span class="p"&gt;).&lt;/span&gt;
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;Interestingly this one returns no solutions, so you are never required to construct a negative number as part of a standard 24 puzzle.&lt;/p&gt;
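    &lt;p&gt;The earlier point about whether &lt;code&gt;(X + Y)&lt;/code&gt; and &lt;code&gt;(Y + X)&lt;/code&gt; count as different solutions can be explored with a small Python sketch that collects every distinct plan. Everything here is an assumption for illustration: a plan is modelled as a tuple of &lt;code&gt;(x, op, y)&lt;/code&gt; steps, &lt;code&gt;plans&lt;/code&gt; and &lt;code&gt;break_symmetry&lt;/code&gt; are hypothetical names, and the symmetry rule is applied to both addition and multiplication. The counts need not match what Picat's &lt;code&gt;best_plan_nondet&lt;/code&gt; would enumerate.&lt;/p&gt;

```python
from fractions import Fraction
from itertools import permutations

def plans(nums, break_symmetry=False, target=24):
    """Set of distinct plans; each plan is a tuple of (x, op, y) steps."""
    found = set()
    def search(values, steps):
        if len(values) == 1:
            if values[0] == target:
                found.add(steps)
            return
        for i, j in permutations(range(len(values)), 2):
            x, y = values[i], values[j]
            rest = [v for k, v in enumerate(values) if k not in (i, j)]
            options = [("+", x + y), ("-", x - y), ("*", x * y)]
            if y > 0:  # same division guard as the action
                options.append(("/", x / y))
            for op, value in options:
                if break_symmetry and op in "+*" and x > y:
                    continue  # count X+Y and Y+X (or X*Y and Y*X) once
                search(rest + [value], steps + ((x, op, y),))
    search([Fraction(n) for n in nums], ())
    return found

print(len(plans([3, 3, 8, 8])))  # 1
print(len(plans([6, 6, 6, 6])) > len(plans([6, 6, 6, 6], break_symmetry=True)))  # True
```

    &lt;p&gt;Merging mirrored steps shrinks the plan count of every puzzle that uses a commutative operation, which is why more puzzles end up with a single remaining plan.&lt;/p&gt;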
    &lt;div class="footnote"&gt;
    &lt;hr/&gt;
    &lt;ol&gt;
    &lt;li id="fn:complex"&gt;
    &lt;p&gt;The code below is different from the old book version, as it uses more fancy logic programming features that aren't good in learning material. &lt;a class="footnote-backref" href="#fnref:complex" title="Jump back to footnote 1 in the text"&gt;↩&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id="fn:efficiency"&gt;
    &lt;p&gt;&lt;code&gt;increasing&lt;/code&gt; is a constraint predicate. We could alternatively write &lt;code&gt;sorted&lt;/code&gt;, which is a Picat logical predicate and must be placed after &lt;code&gt;solve&lt;/code&gt;. There don't seem to be any efficiency gains either way. &lt;a class="footnote-backref" href="#fnref:efficiency" title="Jump back to footnote 2 in the text"&gt;↩&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id="fn:not"&gt;
    &lt;p&gt;I don't know what the standard is in Picat, but in Prolog, the convention is to use &lt;code&gt;\+&lt;/code&gt; instead of &lt;code&gt;not&lt;/code&gt;. They mean the same thing, so I'm using &lt;code&gt;not&lt;/code&gt; because it's clearer to non-LPers. &lt;a class="footnote-backref" href="#fnref:not" title="Jump back to footnote 3 in the text"&gt;↩&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;/ol&gt;
    &lt;/div&gt;</description>
                <pubDate>Tue, 20 May 2025 18:21:01 +0000</pubDate>
                <guid>https://buttondown.com/hillelwayne/archive/finding-hard-24-puzzles-with-planner-programming/</guid>
            </item>
            <item>
                <title>Modeling Awkward Social Situations with TLA+</title>
                <link>https://buttondown.com/hillelwayne/archive/modeling-awkward-social-situations-with-tla/</link>
                <description>&lt;p&gt;You're walking down the street and need to pass someone going the opposite way. You take a step left, but they're thinking the same thing and take a step to their &lt;em&gt;right&lt;/em&gt;, aka your left. You're still blocking each other. Then you take a step to the right, and they take a step to their left, and you're back to where you started. I've heard this called "walkwarding."&lt;/p&gt;
    &lt;p&gt;Let's model this in &lt;a href="https://lamport.azurewebsites.net/tla/tla.html" target="_blank"&gt;TLA+&lt;/a&gt;. TLA+ is a &lt;strong&gt;formal methods&lt;/strong&gt; tool for finding bugs in complex software designs, most often involving concurrency. Two people trying to get past each other just also happens to be a concurrent system. A gentler introduction to TLA+'s capabilities is &lt;a href="https://www.hillelwayne.com/post/modeling-deployments/" target="_blank"&gt;here&lt;/a&gt;; an in-depth guide teaching the language is &lt;a href="https://learntla.com/" target="_blank"&gt;here&lt;/a&gt;.&lt;/p&gt;
    &lt;h2&gt;The spec&lt;/h2&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;---- MODULE walkward ----
    EXTENDS Integers
    
    VARIABLES pos
    vars == &amp;lt;&amp;lt;pos&amp;gt;&amp;gt;
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;Double equals defines a new operator, single equals is an equality check. &lt;code&gt;&amp;lt;&amp;lt;pos&amp;gt;&amp;gt;&lt;/code&gt; is a sequence, aka array.&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;you == "you"
    me == "me"
    People == {you, me}
    
    MaxPlace == 4
    
    left == 0
    right == 1
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;I've gotten into the habit of assigning string "symbols" to operators so that the compiler complains if I misspelled something. &lt;code&gt;left&lt;/code&gt; and &lt;code&gt;right&lt;/code&gt; are numbers so we can shift position with &lt;code&gt;right - pos&lt;/code&gt;.&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;direction == [you |-&amp;gt; 1, me |-&amp;gt; -1]
    goal == [you |-&amp;gt; MaxPlace, me |-&amp;gt; 1]
    
    Init ==
      \* left-right, forward-backward
      pos = [you |-&amp;gt; [lr |-&amp;gt; left, fb |-&amp;gt; 1], me |-&amp;gt; [lr |-&amp;gt; left, fb |-&amp;gt; MaxPlace]]
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;&lt;code&gt;direction&lt;/code&gt;, &lt;code&gt;goal&lt;/code&gt;, and &lt;code&gt;pos&lt;/code&gt; are "records", or hash tables with string keys. I can get my left-right position with &lt;code&gt;pos.me.lr&lt;/code&gt; or &lt;code&gt;pos["me"]["lr"]&lt;/code&gt; (or &lt;code&gt;pos[me].lr&lt;/code&gt;, as &lt;code&gt;me == "me"&lt;/code&gt;).&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;Juke(person) ==
      pos' = [pos EXCEPT ![person].lr = right - @]
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;TLA+ breaks the world into a sequence of steps. In each step, &lt;code&gt;pos&lt;/code&gt; is the value of &lt;code&gt;pos&lt;/code&gt; in the &lt;em&gt;current&lt;/em&gt; step and &lt;code&gt;pos'&lt;/code&gt; is the value in the &lt;em&gt;next&lt;/em&gt; step. The main outcome of this semantics is that we "assign" a new value to &lt;code&gt;pos&lt;/code&gt; by declaring &lt;code&gt;pos'&lt;/code&gt; equal to something. But the semantics also open up lots of cool tricks, like swapping two values with &lt;code&gt;x' = y /\ y' = x&lt;/code&gt;.&lt;/p&gt;
    &lt;p&gt;TLA+ is a little weird about updating functions. To set &lt;code&gt;f[x] = 3&lt;/code&gt;, you gotta write &lt;code&gt;f' = [f EXCEPT ![x] = 3]&lt;/code&gt;. To make things a little easier, the rhs of a function update can contain &lt;code&gt;@&lt;/code&gt; for the old value. &lt;code&gt;![me].lr = right - @&lt;/code&gt; is the same as &lt;code&gt;![me].lr = right - pos[me].lr&lt;/code&gt;, so it swaps left and right.&lt;/p&gt;
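    &lt;p&gt;For readers who think in Python, one way to picture &lt;code&gt;EXCEPT&lt;/code&gt; is as building a fresh mapping instead of mutating the old one. The sketch below is only an analogy and not part of the spec: the function name &lt;code&gt;juke&lt;/code&gt; and the nested-dict layout are assumptions made for illustration.&lt;/p&gt;

```python
LEFT, RIGHT = 0, 1

def juke(pos, person):
    """Rough analogue of pos' = [pos EXCEPT ![person].lr = right - @]."""
    old = pos[person]
    # old["lr"] plays the role of @, the value before the update.
    updated = {**old, "lr": RIGHT - old["lr"]}
    # Every other key is copied over unchanged, which is what EXCEPT promises.
    return {**pos, person: updated}

pos = {"you": {"lr": LEFT, "fb": 1}, "me": {"lr": LEFT, "fb": 4}}
next_pos = juke(pos, "me")

print(next_pos["me"])  # {'lr': 1, 'fb': 4}
print(pos["me"])       # {'lr': 0, 'fb': 4}, the current state is untouched
```

    &lt;p&gt;Keeping the old mapping intact mirrors how a TLA+ step relates &lt;code&gt;pos&lt;/code&gt; and &lt;code&gt;pos'&lt;/code&gt; as two separate values.&lt;/p&gt;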
    &lt;p&gt;("Juke" comes from &lt;a href="https://www.merriam-webster.com/dictionary/juke" target="_blank"&gt;here&lt;/a&gt;)&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;Move(person) ==
      LET new_pos == [pos[person] EXCEPT !.fb = @ + direction[person]]
      IN
        /\ pos[person].fb # goal[person]
        /\ \A p \in People: pos[p] # new_pos
        /\ pos' = [pos EXCEPT ![person] = new_pos]
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;The &lt;code&gt;EXCEPT&lt;/code&gt; syntax can be used in regular definitions, too. This lets someone move one step in their goal direction &lt;em&gt;unless&lt;/em&gt; they are at the goal &lt;em&gt;or&lt;/em&gt; someone is already in that space. &lt;code&gt;/\&lt;/code&gt; means "and".&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;Next ==
      \E p \in People:
        \/ Move(p)
        \/ Juke(p)
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;I really like how TLA+ represents concurrency: "In each step, there is a person who either moves or jukes." It can take a few uses to really wrap your head around but it can express extraordinarily complicated distributed systems.&lt;/p&gt;
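    &lt;p&gt;Read operationally, &lt;code&gt;Next&lt;/code&gt; describes every state that can follow the current one. A rough Python rendering of that reading is sketched here, using the same constants as the spec. The dict encoding and the names &lt;code&gt;move&lt;/code&gt;, &lt;code&gt;juke&lt;/code&gt;, and &lt;code&gt;successors&lt;/code&gt; are hypothetical, and this is not a description of how TLC is implemented.&lt;/p&gt;

```python
LEFT, RIGHT = 0, 1
MAX_PLACE = 4
PEOPLE = ("you", "me")
DIRECTION = {"you": 1, "me": -1}
GOAL = {"you": MAX_PLACE, "me": 1}

INIT = {"you": {"lr": LEFT, "fb": 1}, "me": {"lr": LEFT, "fb": MAX_PLACE}}

def juke(pos, person):
    """Swap the person's left-right position; always possible."""
    old = pos[person]
    return {**pos, person: {**old, "lr": RIGHT - old["lr"]}}

def move(pos, person):
    """Return the next state, or None when Move's guards do not hold."""
    old = pos[person]
    new = {**old, "fb": old["fb"] + DIRECTION[person]}
    at_goal = old["fb"] == GOAL[person]
    occupied = any(pos[p] == new for p in PEOPLE)
    if at_goal or occupied:
        return None
    return {**pos, person: new}

def successors(pos):
    """Every state reachable in one Next step: some person moves or jukes."""
    candidates = []
    for p in PEOPLE:
        candidates.append((f"Move({p})", move(pos, p)))
        candidates.append((f"Juke({p})", juke(pos, p)))
    return [(name, state) for name, state in candidates if state is not None]

for name, state in successors(INIT):
    print(name, state)
```

    &lt;p&gt;From the initial state all four actions are possible, so the loop prints four successor states. A model checker explores all of them, not just one.&lt;/p&gt;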
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;Spec == Init /\ [][Next]_vars
    
    Liveness == &amp;lt;&amp;gt;(pos[me].fb = goal[me])
    ====
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;&lt;code&gt;Spec&lt;/code&gt; is our specification: we start at &lt;code&gt;Init&lt;/code&gt; and take a &lt;code&gt;Next&lt;/code&gt; step every step.&lt;/p&gt;
    &lt;p&gt;Liveness is the generic term for "something good is guaranteed to happen", see &lt;a href="https://www.hillelwayne.com/post/safety-and-liveness/" target="_blank"&gt;here&lt;/a&gt; for more. &lt;code&gt;&amp;lt;&amp;gt;&lt;/code&gt; means "eventually", so &lt;code&gt;Liveness&lt;/code&gt; means "eventually my forward-backward position will be my goal". I could extend it to "both of us eventually reach our goal" but I think this is good enough for a demo.&lt;/p&gt;
    &lt;h3&gt;Checking the spec&lt;/h3&gt;
    &lt;p&gt;Four years ago, everybody in TLA+ used the &lt;a href="https://lamport.azurewebsites.net/tla/toolbox.html" target="_blank"&gt;toolbox&lt;/a&gt;. Now the community has collectively shifted over to using the &lt;a href="https://github.com/tlaplus/vscode-tlaplus/" target="_blank"&gt;VSCode extension&lt;/a&gt;.&lt;sup id="fnref:ltla"&gt;&lt;a class="footnote-ref" href="#fn:ltla"&gt;1&lt;/a&gt;&lt;/sup&gt; VSCode requires we write a configuration file, which I will call &lt;code&gt;walkward.cfg&lt;/code&gt;.&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;SPECIFICATION Spec
    PROPERTY Liveness
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;I then check the model with the VSCode command &lt;code&gt;TLA+: Check model with TLC&lt;/code&gt;. Unsurprisingly, it finds an error:&lt;/p&gt;
    &lt;p&gt;&lt;img alt="Screenshot 2025-05-12 153537.png" class="newsletter-image" src="https://assets.buttondown.email/images/af6f9e89-0bc6-4705-b293-4da5f5c16cfe.png?w=960&amp;amp;fit=max"/&gt;&lt;/p&gt;
    &lt;p&gt;The reason it fails is "stuttering": I can get one step away from my goal and then just stop moving forever. We say the spec is &lt;a href="https://www.hillelwayne.com/post/fairness/" target="_blank"&gt;unfair&lt;/a&gt;: it does not guarantee that if progress is always possible, progress will be made. If I want the spec to always make progress, I have to make some of the steps &lt;strong&gt;weakly fair&lt;/strong&gt;.&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="gi"&gt;+ Fairness == WF_vars(Next)&lt;/span&gt;
    
    &lt;span class="gd"&gt;- Spec == Init /\ [][Next]_vars&lt;/span&gt;
    &lt;span class="gi"&gt;+ Spec == Init /\ [][Next]_vars /\ Fairness&lt;/span&gt;
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;Now the spec is weakly fair, so someone will always do &lt;em&gt;something&lt;/em&gt;. New error:&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;\* First six steps cut
    7: &amp;lt;Move("me")&amp;gt;
    pos = [you |-&amp;gt; [lr |-&amp;gt; 0, fb |-&amp;gt; 4], me |-&amp;gt; [lr |-&amp;gt; 1, fb |-&amp;gt; 2]]
    8: &amp;lt;Juke("me")&amp;gt;
    pos = [you |-&amp;gt; [lr |-&amp;gt; 0, fb |-&amp;gt; 4], me |-&amp;gt; [lr |-&amp;gt; 0, fb |-&amp;gt; 2]]
    9: &amp;lt;Juke("me")&amp;gt; (back to state 7)
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;In this failure, I've successfully gotten past you, and then I spend the rest of my life endlessly juking back and forth. The &lt;code&gt;Next&lt;/code&gt; step keeps happening, so weak fairness is satisfied. What I actually want is for my &lt;code&gt;Move&lt;/code&gt; and my &lt;code&gt;Juke&lt;/code&gt; to each be weakly fair, independently of each other.&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="gd"&gt;- Fairness == WF_vars(Next)&lt;/span&gt;
    &lt;span class="gi"&gt;+ Fairness == WF_vars(Move(me)) /\ WF_vars(Juke(me))&lt;/span&gt;
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;If my liveness property also specified that &lt;em&gt;you&lt;/em&gt; reached your goal, I could instead write &lt;code&gt;\A p \in People: WF_vars(Move(p)) etc&lt;/code&gt;. I could also swap the &lt;code&gt;\A&lt;/code&gt; with a &lt;code&gt;\E&lt;/code&gt; to mean at least one of us is guaranteed to have fair actions, but not necessarily both of us. &lt;/p&gt;
    &lt;p&gt;New error:&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;3: &amp;lt;Move("me")&amp;gt;
    pos = [you |-&amp;gt; [lr |-&amp;gt; 0, fb |-&amp;gt; 2], me |-&amp;gt; [lr |-&amp;gt; 0, fb |-&amp;gt; 3]]
    4: &amp;lt;Juke("you")&amp;gt;
    pos = [you |-&amp;gt; [lr |-&amp;gt; 1, fb |-&amp;gt; 2], me |-&amp;gt; [lr |-&amp;gt; 0, fb |-&amp;gt; 3]]
    5: &amp;lt;Juke("me")&amp;gt;
    pos = [you |-&amp;gt; [lr |-&amp;gt; 1, fb |-&amp;gt; 2], me |-&amp;gt; [lr |-&amp;gt; 1, fb |-&amp;gt; 3]]
    6: &amp;lt;Juke("me")&amp;gt;
    pos = [you |-&amp;gt; [lr |-&amp;gt; 1, fb |-&amp;gt; 2], me |-&amp;gt; [lr |-&amp;gt; 0, fb |-&amp;gt; 3]]
    7: &amp;lt;Juke("you")&amp;gt; (back to state 3)
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;Now we're getting somewhere! This is the original walkwarding situation we wanted to capture. We're in each other's way, then you juke, but before either of us can move I juke too, then we both juke back. We can repeat this forever, trapped in a social hell.&lt;/p&gt;
    &lt;p&gt;Wait, but doesn't &lt;code&gt;WF(Move(me))&lt;/code&gt; guarantee I will eventually move? Yes, but &lt;em&gt;only if a move is permanently available&lt;/em&gt;. In this case, it's not permanently available, because every couple of steps it's made temporarily unavailable.&lt;/p&gt;
    &lt;p&gt;How do I fix this? I can't add a rule saying that we only juke if we're blocked, because the whole point of walkwarding is that we're not coordinated. In the real world, walkwarding can go on for agonizing seconds. What I can do instead is say that Liveness holds &lt;em&gt;as long as &lt;code&gt;Move&lt;/code&gt; is strongly fair&lt;/em&gt;. Unlike weak fairness, &lt;a href="https://www.hillelwayne.com/post/fairness/#strong-fairness" target="_blank"&gt;strong fairness&lt;/a&gt; guarantees something happens if it keeps becoming possible, even with interruptions. &lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;Liveness == 
    &lt;span class="gi"&gt;+  SF_vars(Move(me)) =&amp;gt; &lt;/span&gt;
    &lt;span class="w"&gt; &lt;/span&gt;   &amp;lt;&amp;gt;(pos[me].fb = goal[me])
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;This makes the spec pass. Even if we weave back and forth for five minutes, as long as we eventually pass each other, I will reach my goal. Note we could also do this by making &lt;code&gt;Move&lt;/code&gt; in &lt;code&gt;Fairness&lt;/code&gt; strongly fair, which is preferable if we have a lot of different liveness properties to check.&lt;/p&gt;
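    &lt;p&gt;The difference between the two kinds of fairness can be checked by hand on the looping part of the walkwarding counterexample (states 3 to 6). The Python sketch below does that bookkeeping. The "enabled" flags were worked out manually from &lt;code&gt;Move&lt;/code&gt;'s guard and are not TLC output, and the function names are made up for this illustration.&lt;/p&gt;

```python
MOVE_ME = "Move(me)"

# Each entry: (is Move(me) enabled in this state?, action taken to leave it).
loop = [
    (False, "Juke(you)"),  # state 3: you stand directly in front of me
    (True, "Juke(me)"),    # state 4: my way is clear, but I juke instead
    (False, "Juke(me)"),   # state 5: blocked again
    (True, "Juke(you)"),   # state 6: clear again, but you juke
]

def weakly_fair(loop):
    """On a repeating loop, WF fails only if Move(me) is enabled in every state yet never taken."""
    taken = any(action == MOVE_ME for _, action in loop)
    always_enabled = all(enabled for enabled, _ in loop)
    return taken or not always_enabled

def strongly_fair(loop):
    """On a repeating loop, SF fails if Move(me) is enabled in some state yet never taken."""
    taken = any(action == MOVE_ME for _, action in loop)
    sometimes_enabled = any(enabled for enabled, _ in loop)
    return taken or not sometimes_enabled

print("weak fairness satisfied:", weakly_fair(loop))      # True: WF does not rule this loop out
print("strong fairness satisfied:", strongly_fair(loop))  # False: SF does
```

    &lt;p&gt;Because the loop violates strong fairness, it no longer counts as a counterexample once &lt;code&gt;Move(me)&lt;/code&gt; is assumed strongly fair.&lt;/p&gt;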
    &lt;h3&gt;A small exercise for the reader&lt;/h3&gt;
    &lt;p&gt;There is a presumed invariant that is violated. Identify what it is, write it as a property in TLA+, and show the spec violates it. Then fix it.&lt;/p&gt;
    &lt;p&gt;Answer (in &lt;a href="https://rot13.com/" target="_blank"&gt;rot13&lt;/a&gt;): Gur vainevnag vf "ab gjb crbcyr ner va gur rknpg fnzr ybpngvba". &lt;code&gt;Zbir&lt;/code&gt; thnenagrrf guvf ohg &lt;code&gt;Whxr&lt;/code&gt; &lt;em&gt;qbrf abg&lt;/em&gt;.&lt;/p&gt;
    &lt;h3&gt;More TLA+ Exercises&lt;/h3&gt;
    &lt;p&gt;I've started work on &lt;a href="https://github.com/hwayne/tlaplus-exercises/" target="_blank"&gt;an exercises repo&lt;/a&gt;. There's only a handful of specific problems now but I'm planning on adding more over the summer.&lt;/p&gt;
    &lt;div class="footnote"&gt;
    &lt;hr/&gt;
    &lt;ol&gt;
    &lt;li id="fn:ltla"&gt;
    &lt;p&gt;&lt;a href="https://learntla.com/" target="_blank"&gt;learntla&lt;/a&gt; is still on the toolbox, but I'm hoping to get it all moved over this summer. &lt;a class="footnote-backref" href="#fnref:ltla" title="Jump back to footnote 1 in the text"&gt;↩&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;/ol&gt;
    &lt;/div&gt;</description>
                <pubDate>Wed, 14 May 2025 16:02:21 +0000</pubDate>
                <guid>https://buttondown.com/hillelwayne/archive/modeling-awkward-social-situations-with-tla/</guid>
            </item>
            <item>
                <title>Write the most clever code you possibly can</title>
                <link>https://buttondown.com/hillelwayne/archive/write-the-most-clever-code-you-possibly-can/</link>
                <description>&lt;p&gt;&lt;em&gt;I started writing this early last week but Real Life Stuff happened and now you're getting the first-draft late this week. Warning, unedited thoughts ahead!&lt;/em&gt;&lt;/p&gt;
    &lt;h2&gt;New Logic for Programmers release!&lt;/h2&gt;
    &lt;p&gt;&lt;a href="https://leanpub.com/logic/" target="_blank"&gt;v0.9 is out&lt;/a&gt;! This is a big release, with a new cover design, several rewritten chapters, &lt;a href="https://github.com/logicforprogrammers/book-assets/tree/master/code" target="_blank"&gt;online code samples&lt;/a&gt; and much more. See the full release notes at the &lt;a href="https://github.com/logicforprogrammers/book-assets/blob/master/CHANGELOG.md" target="_blank"&gt;changelog page&lt;/a&gt;, and &lt;a href="https://leanpub.com/logic/" target="_blank"&gt;get the book here&lt;/a&gt;!&lt;/p&gt;
    &lt;p&gt;&lt;img alt="The new cover! It's a lot nicer" class="newsletter-image" src="https://assets.buttondown.email/images/038a7092-5dc7-41a5-9a16-56bdef8b5d58.jpg?w=400&amp;amp;fit=max"/&gt;&lt;/p&gt;
    &lt;h2&gt;Write the cleverest code you possibly can&lt;/h2&gt;
    &lt;p&gt;There are millions of articles online about how programmers should not write "clever" code, and instead write simple, maintainable code that everybody understands. Sometimes the example of "clever" code looks like this (&lt;a href="https://codegolf.stackexchange.com/questions/57617/is-this-number-a-prime/57682#57682" target="_blank"&gt;src&lt;/a&gt;):&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="c1"&gt;# Python&lt;/span&gt;
    
    &lt;span class="n"&gt;p&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;n&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;
    &lt;span class="n"&gt;exec&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;"p*=n*n;n+=1;"&lt;/span&gt;&lt;span class="o"&gt;*~-&lt;/span&gt;&lt;span class="nb"&gt;int&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;input&lt;/span&gt;&lt;span class="p"&gt;()))&lt;/span&gt;
    &lt;span class="nb"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;p&lt;/span&gt;&lt;span class="o"&gt;%&lt;/span&gt;&lt;span class="n"&gt;n&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;This is code-golfing, the sport of writing the most concise code possible. Obviously you shouldn't run this in production for the same reason you shouldn't eat dinner off a Rembrandt. &lt;/p&gt;
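    &lt;p&gt;For reference, the golfed snippet appears to lean on Wilson's theorem: the squared factorial of &lt;code&gt;k - 1&lt;/code&gt; leaves remainder 1 modulo &lt;code&gt;k&lt;/code&gt; exactly when &lt;code&gt;k&lt;/code&gt; is prime. A rough ungolfed equivalent might look like this, with &lt;code&gt;wilson_is_prime&lt;/code&gt; being a made-up name and a function argument standing in for &lt;code&gt;input()&lt;/code&gt;:&lt;/p&gt;

```python
def wilson_is_prime(k):
    """Ungolfed sketch: square (k-1)! step by step, then reduce it mod k."""
    p = n = 1
    for _ in range(k - 1):  # the golfed code repeats its statement ~-k == k - 1 times
        p *= n * n          # after this line p equals (n!) squared
        n += 1
    # n is now k, and the remainder is 1 exactly when k is prime.
    return p % n == 1

print([k for k in range(1, 20) if wilson_is_prime(k)])
# [2, 3, 5, 7, 11, 13, 17, 19]
```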
    &lt;p&gt;Other times the example looks like this:&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="k"&gt;def&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nf"&gt;is_prime&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;x&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="kc"&gt;False&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nb"&gt;all&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="o"&gt;%&lt;/span&gt;&lt;span class="n"&gt;n&lt;/span&gt; &lt;span class="o"&gt;!=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;n&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;)])&lt;/span&gt;
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;This is "clever" because it uses a single list comprehension, as opposed to a "simple" for loop. Yes, "list comprehensions are too clever" is something I've read in one of these articles. &lt;/p&gt;
    &lt;p&gt;I've also talked to people who think that datatypes besides lists and hashmaps are too clever to use, that most optimizations are too clever to bother with, and even that functions and classes are too clever and code should be a linear script.&lt;sup id="fnref:grad-students"&gt;&lt;a class="footnote-ref" href="#fn:grad-students"&gt;1&lt;/a&gt;&lt;/sup&gt; Clever code is anything using features or domain concepts we don't understand. Something that seems unbearably clever to me might be utterly mundane for you, and vice versa. &lt;/p&gt;
    &lt;p&gt;How do we make something utterly mundane? By using it and working at the boundaries of our skills. Almost everything I'm "good at" comes from banging my head against it more than is healthy. That suggests a really good reason to write clever code: it's an excellent form of purposeful practice. Writing clever code forces us to code outside of our comfort zone, developing our skills as software engineers. &lt;/p&gt;
    &lt;blockquote&gt;
    &lt;p&gt;Debugging is twice as hard as writing the code in the first place. Therefore, if you write the code as cleverly as possible, you [will get excellent debugging practice at exactly the right level required to push your skills as a software engineer] — Brian Kernighan, probably&lt;/p&gt;
    &lt;/blockquote&gt;
    &lt;p&gt;There are other benefits, too, but first let's kill the elephant in the room:&lt;sup id="fnref:bajillion"&gt;&lt;a class="footnote-ref" href="#fn:bajillion"&gt;2&lt;/a&gt;&lt;/sup&gt;&lt;/p&gt;
    &lt;h3&gt;Don't &lt;em&gt;commit&lt;/em&gt; clever code&lt;/h3&gt;
    &lt;p&gt;I am proposing writing clever code as a means of practice. Being at work is a &lt;em&gt;job&lt;/em&gt; with coworkers who will not appreciate it if your code is too clever. Similarly, don't use &lt;a href="https://mcfunley.com/choose-boring-technology" target="_blank"&gt;too many innovative technologies&lt;/a&gt;. Don't put anything in production you are &lt;em&gt;uncomfortable&lt;/em&gt; with.&lt;/p&gt;
    &lt;p&gt;We can still responsibly write clever code at work, though: &lt;/p&gt;
    &lt;ol&gt;
    &lt;li&gt;Solve a problem in both a simple and a clever way, and then only commit the simple way. This works well for small scale problems where trying the "clever way" only takes a few minutes.&lt;/li&gt;
    &lt;li&gt;Write our &lt;em&gt;personal&lt;/em&gt; tools cleverly. I'm a big believer of the idea that most programmers would benefit from writing more scripts and support code customized to their particular work environment. This is a great place to practice new techniques, languages, etc.&lt;/li&gt;
    &lt;li&gt;If clever code is absolutely the best way to solve a problem, then commit it with &lt;strong&gt;extensive documentation&lt;/strong&gt; explaining how it works and why it's preferable to simpler solutions. Bonus: this potentially helps the whole team upskill.&lt;/li&gt;
    &lt;/ol&gt;
    &lt;h2&gt;Writing clever code...&lt;/h2&gt;
    &lt;div class="subscribe-form"&gt;&lt;/div&gt;
    &lt;h3&gt;...teaches simple solutions&lt;/h3&gt;
    &lt;p&gt;Usually, code that's called too clever composes several powerful features together — the "not a single list comprehension or function" people are the exception. &lt;a href="https://www.joshwcomeau.com/career/clever-code-considered-harmful/" target="_blank"&gt;Josh Comeau's&lt;/a&gt; "don't write clever code" article gives this example of "too clever":&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="kd"&gt;const&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nx"&gt;extractDataFromResponse&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;response&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;=&amp;gt;&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="kd"&gt;const&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nx"&gt;Component&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nx"&gt;props&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nx"&gt;response&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    
    &lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="kd"&gt;const&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nx"&gt;resultsEntries&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nb"&gt;Object&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;entries&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nx"&gt;Component&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nx"&gt;props&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;});&lt;/span&gt;
    &lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="kd"&gt;const&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nx"&gt;assignIfValueTruthy&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;o&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nx"&gt;k&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nx"&gt;v&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;=&amp;gt;&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;v&lt;/span&gt;
    &lt;span class="w"&gt;    &lt;/span&gt;&lt;span class="o"&gt;?&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;...&lt;/span&gt;&lt;span class="nx"&gt;o&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nx"&gt;k&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nx"&gt;v&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="w"&gt;    &lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nx"&gt;o&lt;/span&gt;
    &lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    
    &lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="k"&gt;return&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nx"&gt;resultsEntries&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;reduce&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;assignIfValueTruthy&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{});&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;What makes this "clever"? I count eight language features composed together: &lt;code&gt;entries&lt;/code&gt;, argument unpacking, implicit objects, splats, ternaries, higher-order functions, and reductions. Would code that used only one or two of these features still be "clever"? I don't think so. These features exist for a reason, and oftentimes they make code simpler than not using them.&lt;/p&gt;
    &lt;p&gt;We can, of course, learn these features one at a time. Writing the clever version (but not &lt;em&gt;committing it&lt;/em&gt;) gives us practice with all eight at once and also with how they compose together. That knowledge comes in handy when we want to apply a single one of the ideas.&lt;/p&gt;
    &lt;p&gt;I've recently had to do a bit of pandas for a project. Whenever I have to do a new analysis, I try to write it as a single chain of transformations, and then as a more balanced set of updates.&lt;/p&gt;
    &lt;h3&gt;...helps us master concepts&lt;/h3&gt;
    &lt;p&gt;Even if the composite parts of a "clever" solution aren't by themselves useful, it still makes us better at the overall language, and that's inherently valuable. A few years ago I wrote &lt;a href="https://www.hillelwayne.com/post/python-abc/" target="_blank"&gt;Crimes with Python's Pattern Matching&lt;/a&gt;. It involves writing horrible code like this:&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nn"&gt;abc&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;ABC&lt;/span&gt;
    
    &lt;span class="k"&gt;class&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nc"&gt;NotIterable&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ABC&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    
        &lt;span class="nd"&gt;@classmethod&lt;/span&gt;
        &lt;span class="k"&gt;def&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nf"&gt;__subclasshook__&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="bp"&gt;cls&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;C&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
            &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="ow"&gt;not&lt;/span&gt; &lt;span class="nb"&gt;hasattr&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;C&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;"__iter__"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    
    &lt;span class="k"&gt;def&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nf"&gt;f&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="k"&gt;match&lt;/span&gt; &lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="k"&gt;case&lt;/span&gt; &lt;span class="n"&gt;NotIterable&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
                &lt;span class="nb"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s2"&gt; is not iterable"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="k"&gt;case&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="k"&gt;_&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
                &lt;span class="nb"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s2"&gt; is iterable"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="vm"&gt;__name__&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="s2"&gt;"__main__"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;"string"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;This composes Python match statements, which are broadly useful, and abstract base classes, which are incredibly niche. But even if I never use ABCs in real production code, it helped me understand Python's match semantics and &lt;a href="https://docs.python.org/3/howto/mro.html#python-2-3-mro" target="_blank"&gt;Method Resolution Order&lt;/a&gt; better. &lt;/p&gt;
    &lt;h3&gt;...prepares us for necessity&lt;/h3&gt;
    &lt;p&gt;Sometimes the clever way is the &lt;em&gt;only&lt;/em&gt; way. Maybe we need something faster than the simplest solution. Maybe we are working with constrained tools or frameworks that demand cleverness. Peter Norvig argued that design patterns compensate for missing language features. I'd argue that cleverness is another means of compensating: if our tools don't have an easy way to do something, we need to find a clever way.&lt;/p&gt;
    &lt;p&gt;You see this a lot in formal methods like TLA+. Need to check a hyperproperty? &lt;a href="https://www.hillelwayne.com/post/graphing-tla/" target="_blank"&gt;Cast your state space to a directed graph&lt;/a&gt;. Need to compose ten specifications together? &lt;a href="https://www.hillelwayne.com/post/composing-tla/" target="_blank"&gt;Combine refinements with state machines&lt;/a&gt;. Most difficult problems have a "clever" solution. The real problem is that clever solutions have a skill floor. If normal use of the tool is at difficulty 3 out of 10, then basic clever solutions are at 5 out of 10, and it's hard to jump those two steps in the moment you need the cleverness.&lt;/p&gt;
    &lt;p&gt;But if you've practiced with writing overly clever code, you're used to working at a 7 out of 10 level in short bursts, and then you can "drop down" to 5/10. I don't know if that makes too much sense, but I see it happen a lot in practice.&lt;/p&gt;
    &lt;h3&gt;...builds comradery&lt;/h3&gt;
    &lt;p&gt;On a few occasions, after getting a pull request merged, I pulled the reviewer over and said "check out this horrible way of doing the same thing". I find that as long as people know they're not going to be subjected to a clever solution in production, they enjoy seeing it!&lt;/p&gt;
    &lt;p&gt;&lt;em&gt;Next week's newsletter will probably also be late, after that we should be back to a regular schedule for the rest of the summer.&lt;/em&gt;&lt;/p&gt;
    &lt;div class="footnote"&gt;
    &lt;hr/&gt;
    &lt;ol&gt;
    &lt;li id="fn:grad-students"&gt;
    &lt;p&gt;Mostly grad students outside of CS who have to write scripts to do research. And more than one data scientist. I think it's correlated with using Jupyter. &lt;a class="footnote-backref" href="#fnref:grad-students" title="Jump back to footnote 1 in the text"&gt;↩&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id="fn:bajillion"&gt;
    &lt;p&gt;If I don't put this at the beginning, I'll get a bajillion responses like "your team will hate you". &lt;a class="footnote-backref" href="#fnref:bajillion" title="Jump back to footnote 2 in the text"&gt;↩&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;/ol&gt;
    &lt;/div&gt;</description>
                <pubDate>Thu, 08 May 2025 15:04:42 +0000</pubDate>
                <guid>https://buttondown.com/hillelwayne/archive/write-the-most-clever-code-you-possibly-can/</guid>
            </item>
            <item>
                <title>Requirements change until they don't</title>
                <link>https://buttondown.com/hillelwayne/archive/requirements-change-until-they-dont/</link>
                <description>&lt;p&gt;Recently I got a question on formal methods&lt;sup id="fnref:fs"&gt;&lt;a class="footnote-ref" href="#fn:fs"&gt;1&lt;/a&gt;&lt;/sup&gt;: how does it help to mathematically model systems when the system requirements are constantly changing? It doesn't make sense to spend a lot of time proving a design works, and then deliver the product and find out it's not at all what the client needs. As the saying goes, the hard part is "building the right thing", not "building the thing right".&lt;/p&gt;
    &lt;p&gt;One possible response: "why write tests"? You shouldn't write tests, &lt;em&gt;especially&lt;/em&gt; &lt;a href="https://en.wikipedia.org/wiki/Test-driven_development" target="_blank"&gt;lots of unit tests ahead of time&lt;/a&gt;, if you might just throw them all away when the requirements change.&lt;/p&gt;
    &lt;p&gt;This is a bad response because we all know the difference between writing tests and formal methods: testing is &lt;em&gt;easy&lt;/em&gt; and FM is &lt;em&gt;hard&lt;/em&gt;. Testing requires low cost for moderate correctness, FM requires high(ish) cost for high correctness. And when requirements are constantly changing, "high(ish) cost" isn't affordable and "high correctness" isn't worthwhile, because a kinda-okay solution that solves a customer's problem is infinitely better than a solid solution that doesn't.&lt;/p&gt;
    &lt;p&gt;But eventually you get something that solves the problem, and what then?&lt;/p&gt;
    &lt;p&gt;Most of us don't work for Google, we can't axe features and products &lt;a href="https://killedbygoogle.com/" target="_blank"&gt;on a whim&lt;/a&gt;. If the client is happy with your solution, you are expected to support it. It should work when your customers run into new edge cases, or migrate all their computers to the next OS version, or expand into a market with shoddy internet. It should work when 10x as many customers are using 10x as many features. It should work when &lt;a href="https://www.hillelwayne.com/post/feature-interaction/" target="_blank"&gt;you add new features that come into conflict&lt;/a&gt;. &lt;/p&gt;
    &lt;p&gt;And just as importantly, &lt;em&gt;it should never stop solving their problem&lt;/em&gt;. Canonical example: your feature involves processing requested tasks synchronously. At scale, this doesn't work, so to improve latency you make it asynchronous. Now it's eventually consistent, but your customers were depending on it being always consistent. Now it no longer does what they need, and has stopped solving their problems.&lt;/p&gt;
    &lt;p&gt;Every successful requirement met spawns a new requirement: "keep this working". That requirement is permanent, or close enough to decide our long-term strategy. It takes active investment to keep a feature behaving the same as the world around it changes.&lt;/p&gt;
    &lt;p&gt;(Is this all a pretentious way of saying "software maintenance is hard?" Maybe!)&lt;/p&gt;
    &lt;h3&gt;Phase changes&lt;/h3&gt;
    &lt;div class="subscribe-form"&gt;&lt;/div&gt;
    &lt;p&gt;In physics there's a concept of a &lt;a href="https://en.wikipedia.org/wiki/Phase_transition" target="_blank"&gt;phase transition&lt;/a&gt;. To raise the temperature of a gram of liquid water by 1°C, you have to add 4.184 joules of energy.&lt;sup id="fnref:calorie"&gt;&lt;a class="footnote-ref" href="#fn:calorie"&gt;2&lt;/a&gt;&lt;/sup&gt; This continues until you raise it to 100°C, then the temperature stops rising. After you've added over two &lt;em&gt;thousand&lt;/em&gt; more joules to that gram, it suddenly turns into steam. The energy of the system changes continuously but the form, or phase, changes discretely.&lt;/p&gt;
    &lt;p&gt;&lt;img alt="Phase_diagram_of_water_simplified.svg.png (from above link)" class="newsletter-image" src="https://assets.buttondown.email/images/31676a33-be6a-4c6d-a96f-425723dcb0d5.png?w=960&amp;amp;fit=max"/&gt;&lt;/p&gt;
    &lt;p class="empty-line" style="height:16px; margin:0px !important;"&gt;&lt;/p&gt;
    &lt;p&gt;Software isn't physics but the idea works as a metaphor. A certain architecture handles a certain level of load, and past that you need a new architecture. Or a bunch of similar features are independently hardcoded until the system becomes too messy to understand, so you remodel the internals into something unified and extendable. Etc etc etc. It doesn't have to be a totally discrete phase transition, but there's definitely a "before" and "after" in the system's form. &lt;/p&gt;
    &lt;p&gt;Phase changes tend to lead to more intricacy/complexity in the system, meaning it's likely that a phase change will introduce new bugs into existing behaviors. Take the synchronous vs asynchronous case. A very simple toy model of synchronous updates would be &lt;code&gt;Set(key, val)&lt;/code&gt;, which updates &lt;code&gt;db[key]&lt;/code&gt; to &lt;code&gt;val&lt;/code&gt;.&lt;sup id="fnref:tla"&gt;&lt;a class="footnote-ref" href="#fn:tla"&gt;3&lt;/a&gt;&lt;/sup&gt; A model of asynchronous updates would be &lt;code&gt;AsyncSet(key, val, priority)&lt;/code&gt;, which adds a &lt;code&gt;(key, val, priority, server_time())&lt;/code&gt; tuple to a &lt;code&gt;tasks&lt;/code&gt; set, and then another process asynchronously pulls a tuple (ordered by highest priority, then earliest time) and calls &lt;code&gt;Set(key, val)&lt;/code&gt;. Here are some properties the client may need preserved as a requirement: &lt;/p&gt;
    &lt;ul&gt;
    &lt;li&gt;If &lt;code&gt;AsyncSet(key, val, _)&lt;/code&gt; is called, then &lt;em&gt;eventually&lt;/em&gt; &lt;code&gt;db[key] = val&lt;/code&gt; (possibly violated if higher-priority tasks keep coming in)&lt;/li&gt;
    &lt;li&gt;If someone calls &lt;code&gt;AsyncSet(key1, val1, low)&lt;/code&gt; and then &lt;code&gt;AsyncSet(key2, val2, low)&lt;/code&gt;, they should see the first update and then the second (linearizability, possibly violated if the requests go to different servers with different clock times)&lt;/li&gt;
    &lt;li&gt;If someone calls &lt;code&gt;AsyncSet(key, val, _)&lt;/code&gt; and &lt;em&gt;immediately&lt;/em&gt; reads &lt;code&gt;db[key]&lt;/code&gt; they should get &lt;code&gt;val&lt;/code&gt; (obviously violated, though the client may accept a &lt;em&gt;slightly&lt;/em&gt; weaker property)&lt;/li&gt;
    &lt;/ul&gt;
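    &lt;p&gt;As a rough illustration, here is a Python sketch of the two toy models and a check of the third property. This is not the TLA+ specification: the class names, the numeric priorities (bigger means more urgent), and the counter standing in for &lt;code&gt;server_time()&lt;/code&gt; are all assumptions made for the example.&lt;/p&gt;

```python
import heapq
import itertools


class SyncStore:
    """Toy synchronous model: Set(key, val) updates db[key] immediately."""

    def __init__(self):
        self.db = {}

    def set(self, key, val):
        self.db[key] = val


class AsyncStore:
    """Toy asynchronous model: AsyncSet queues a task, a worker applies it later."""

    def __init__(self):
        self.db = {}
        self.tasks = []                 # heap of (-priority, time, key, val)
        self.clock = itertools.count()  # hypothetical stand-in for server_time()

    def async_set(self, key, val, priority):
        # Negating the priority makes the heap pop the highest priority first,
        # with ties broken by the earliest time.
        heapq.heappush(self.tasks, (-priority, next(self.clock), key, val))

    def step(self):
        """The worker process: pull one task and apply it, like calling Set."""
        if self.tasks:
            _, _, key, val = heapq.heappop(self.tasks)
            self.db[key] = val


def read_after_write_holds(store, write):
    """Third property: a read immediately after a write returns the new value."""
    write("x", 1)
    return store.db.get("x") == 1


sync_store = SyncStore()
sync_ok = read_after_write_holds(sync_store, sync_store.set)

async_store = AsyncStore()
async_ok = read_after_write_holds(
    async_store, lambda k, v: async_store.async_set(k, v, priority=0)
)

async_store.step()  # once the worker runs, the write becomes visible
eventually_ok = async_store.db.get("x") == 1
```

    &lt;p&gt;In this sketch &lt;code&gt;sync_ok&lt;/code&gt; is true and &lt;code&gt;async_ok&lt;/code&gt; is false: the same client code passes against the old model and fails against the new one, and the value only shows up after the worker takes a step. A model checker explores every interleaving instead of the single one tried here.&lt;/p&gt;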
    &lt;p&gt;If the new system doesn't satisfy an existing customer requirement, it's prudent to fix the bug &lt;em&gt;before&lt;/em&gt; releasing the new system. The customer doesn't notice or care that your system underwent a phase change. They'll just see that one day your product solves their problems, and the next day it suddenly doesn't. &lt;/p&gt;
    &lt;p&gt;This is one of the most common applications of formal methods. Both of those systems, and every one of those properties, are formally specifiable in a specification language. We can then automatically check that the new system satisfies the existing properties, and from there do things like &lt;a href="https://arxiv.org/abs/2006.00915" target="_blank"&gt;automatically generate test suites&lt;/a&gt;. This does take a lot of work, so if your requirements are constantly changing, FM may not be worth the investment. But eventually requirements &lt;em&gt;stop&lt;/em&gt; changing, and then you're stuck with them forever. That's where models shine.&lt;/p&gt;
    &lt;div class="footnote"&gt;
    &lt;hr/&gt;
    &lt;ol&gt;
    &lt;li id="fn:fs"&gt;
    &lt;p&gt;As always, I'm using formal methods to mean the subdiscipline of formal specification of designs, leaving out the formal verification of code. Mostly because "formal specification" is really awkward to say. &lt;a class="footnote-backref" href="#fnref:fs" title="Jump back to footnote 1 in the text"&gt;↩&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id="fn:calorie"&gt;
    &lt;p&gt;Also called a "calorie". The US "dietary Calorie" is actually a kilocalorie. &lt;a class="footnote-backref" href="#fnref:calorie" title="Jump back to footnote 2 in the text"&gt;↩&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id="fn:tla"&gt;
    &lt;p&gt;This is all directly translatable to a TLA+ specification, I'm just describing it in English to avoid paying the syntax tax &lt;a class="footnote-backref" href="#fnref:tla" title="Jump back to footnote 3 in the text"&gt;↩&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;/ol&gt;
    &lt;/div&gt;</description>
                <pubDate>Thu, 24 Apr 2025 11:00:00 +0000</pubDate>
                <guid>https://buttondown.com/hillelwayne/archive/requirements-change-until-they-dont/</guid>
            </item>
            <item>
                <title>The Halting Problem is a terrible example of NP-Harder</title>
                <link>https://buttondown.com/hillelwayne/archive/the-halting-problem-is-a-terrible-example-of-np/</link>
                <description>&lt;p&gt;&lt;em&gt;Short one this time because I have a lot going on this week.&lt;/em&gt;&lt;/p&gt;
    &lt;p&gt;In computational complexity, &lt;strong&gt;NP&lt;/strong&gt; is the class of all decision problems (yes/no) where a potential proof (or "witness") for "yes" can be &lt;em&gt;verified&lt;/em&gt; in polynomial time. For example, "does this set of numbers have a subset that sums to zero" is in NP. If the answer is "yes", you can prove it by presenting such a subset. We would then verify the witness by 1) checking that all the numbers are present in the set (~linear time) and 2) adding up all the numbers (also linear).&lt;/p&gt;
    &lt;p&gt;&lt;strong&gt;NP-complete&lt;/strong&gt; is the class of "hardest possible" NP problems. Subset sum is NP-complete. &lt;strong&gt;NP-hard&lt;/strong&gt; is the set of all problems &lt;em&gt;at least as hard&lt;/em&gt; as NP-complete. Notably, NP-hard is &lt;em&gt;not&lt;/em&gt; a subset of NP, as it contains problems that are &lt;em&gt;harder&lt;/em&gt; than NP-complete. A natural question to ask is "like what?" And the canonical example of "NP-harder" is the halting problem (HALT): does program P halt on input C? As the argument goes, it's undecidable, so obviously not in NP.&lt;/p&gt;
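    &lt;p&gt;The two verification steps can be sketched in Python. The function name is made up for this example, and it assumes the empty subset doesn't count as a witness.&lt;/p&gt;

```python
from collections import Counter


def verify_subset_sum(numbers, witness):
    """Check a claimed "yes" witness for zero-subset-sum in roughly linear time."""
    if not witness:
        return False  # assumption: the empty subset is not a valid witness
    # 1) every witness element must come from the input, respecting duplicates
    not_in_input = Counter(witness) - Counter(numbers)
    if not_in_input:
        return False
    # 2) the witness must add up to zero
    return sum(witness) == 0
```

    &lt;p&gt;For instance, &lt;code&gt;verify_subset_sum([3, -1, 8, -2, 5], [3, -1, -2])&lt;/code&gt; accepts the witness. Note the verifier never has to &lt;em&gt;find&lt;/em&gt; a subset, it only checks the one it's handed.&lt;/p&gt;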
    &lt;p&gt;I think this is a bad example for two reasons:&lt;/p&gt;
    &lt;ol&gt;&lt;li&gt;&lt;p&gt;All NP requires is that witnesses for "yes" can be verified in polynomial time. It does not require anything for the "no" case! And even though HALT is undecidable, there &lt;em&gt;is&lt;/em&gt; a decidable way to verify a "yes": let the witness be "it halts in N steps", then run the program for that many steps and see if it halted by then. To prove HALT is not in NP, you have to show that this verification process grows faster than polynomially. It does (as &lt;a href="https://en.wikipedia.org/wiki/Busy_beaver" rel="noopener noreferrer nofollow" target="_blank"&gt;busy beaver&lt;/a&gt; is uncomputable), but this all makes the example needlessly confusing.&lt;sup id="fnref:1"&gt;&lt;a class="footnote-ref" data-id="37347adc-dba6-4629-9d24-c6252292ac6b" data-reference-number="1" href="#fn:1"&gt;1&lt;/a&gt;&lt;/sup&gt;&lt;/p&gt;&lt;/li&gt;&lt;li&gt;&lt;p&gt;"What's bigger than a dog? THE MOON"&lt;/p&gt;&lt;/li&gt;&lt;/ol&gt;
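    &lt;p&gt;To make the witness in (1) concrete, here's a minimal Python sketch. It assumes, purely for illustration, that a "program" is a generator function and that each &lt;code&gt;yield&lt;/code&gt; counts as one step; all the names are hypothetical.&lt;/p&gt;

```python
def halts_within(program, program_input, n_steps):
    """Verify the witness "it halts in n_steps steps" by running the program that long."""
    run = program(program_input)
    for _ in range(n_steps + 1):
        try:
            next(run)  # execute up to the next step boundary
        except StopIteration:
            return True  # the program finished within the claimed bound
    return False  # still running after n_steps steps, so the witness is rejected


def countdown(n):
    """Halts after exactly n steps (for nonnegative n)."""
    while n != 0:
        n -= 1
        yield


def loop_forever(_):
    """Never halts."""
    while True:
        yield
```

    &lt;p&gt;The check always terminates, which is why verifying a "yes" is decidable. The catch is that its running time is N, and nothing bounds N by a polynomial in the size of the program and input.&lt;/p&gt;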
    &lt;p&gt;Really (2) bothers me a lot more than (1) because it's just so inelegant. It suggests that NP-complete is the upper bound of "solvable" problems, and after that you're in full-on undecidability. I'd rather show intuitive problems that are harder than NP but not &lt;em&gt;that&lt;/em&gt; much harder.&lt;/p&gt;
    &lt;p&gt;But in looking for a "slightly harder" problem, I ran into an, ah, problem. It &lt;em&gt;seems&lt;/em&gt; like the next-hardest class would be &lt;a href="https://en.wikipedia.org/wiki/EXPTIME" rel="noopener noreferrer nofollow" target="_blank"&gt;EXPTIME&lt;/a&gt;, except we don't know &lt;em&gt;for sure&lt;/em&gt; that NP != EXPTIME. We know &lt;em&gt;for sure&lt;/em&gt; that NP != &lt;a href="https://en.wikipedia.org/wiki/NEXPTIME" rel="noopener noreferrer nofollow" target="_blank"&gt;NEXPTIME&lt;/a&gt;, but NEXPTIME doesn't have any intuitive, easily explainable problems. Most "definitely harder than NP" problems require a nontrivial background in theoretical computer science or mathematics to understand.&lt;/p&gt;
    &lt;p&gt;There is one problem, though, that I find easily explainable. Place a token at the bottom left corner of a grid that extends infinitely up and right, call that point (0, 0). You're given a list of valid displacement moves for the token, like &lt;code&gt;(+1, +0)&lt;/code&gt;, &lt;code&gt;(-20, +13)&lt;/code&gt;, &lt;code&gt;(-5, -6)&lt;/code&gt;, etc, and a target point like &lt;code&gt;(700, 1)&lt;/code&gt;. You may make any sequence of moves in any order, as long as no move ever puts the token off the grid. Does any sequence of moves bring you to the target?&lt;/p&gt;
    &lt;div class="subscribe-form"&gt;&lt;/div&gt;
    &lt;p&gt;This is PSPACE-complete, I think, which still isn't proven to be harder than NP-complete (though it's widely believed). But what if you increase the number of dimensions of the grid? Past a certain number of dimensions the problem jumps to being EXPSPACE-complete, and then TOWER-complete (grows &lt;a href="https://en.wikipedia.org/wiki/Tetration" rel="noopener noreferrer nofollow" target="_blank"&gt;tetrationally&lt;/a&gt;), and then it keeps going. Some people might recognize this as looking a lot like the &lt;a href="https://en.wikipedia.org/wiki/Ackermann_function" rel="noopener noreferrer nofollow" target="_blank"&gt;Ackermann function&lt;/a&gt;, and in fact this problem is &lt;a href="https://arxiv.org/abs/2104.13866" rel="noopener noreferrer nofollow" target="_blank"&gt;ACKERMANN-complete on the number of available dimensions&lt;/a&gt;.&lt;/p&gt;
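    &lt;p&gt;If you want to play with the token puzzle, here's a naive Python sketch that works for any number of dimensions. It is a bounded breadth-first search, not a decision procedure: the function name and the &lt;code&gt;max_moves&lt;/code&gt; cutoff are inventions for this example, and hitting the cutoff tells you nothing about whether the target is reachable.&lt;/p&gt;

```python
def reachable_within(moves, target, max_moves):
    """Breadth-first search over token positions, giving up after max_moves moves.

    True means some move sequence reaches the target. False only means none
    was found within the bound, not that the target is unreachable.
    """
    start = tuple(0 for _ in target)
    seen = {start}
    frontier = [start]  # positions first reached after exactly i moves
    for _ in range(max_moves + 1):
        next_frontier = []
        for point in frontier:
            if point == target:
                return True
            for move in moves:
                new_point = tuple(p + m for p, m in zip(point, move))
                # stay on the grid: every coordinate must be nonnegative
                on_grid = all(c == abs(c) for c in new_point)
                if on_grid and new_point not in seen:
                    seen.add(new_point)
                    next_frontier.append(new_point)
        frontier = next_frontier
    return False
```

    &lt;p&gt;With moves &lt;code&gt;[(1, 0), (0, 2), (-1, -1)]&lt;/code&gt; the target &lt;code&gt;(0, 1)&lt;/code&gt; needs all three moves, so the search succeeds with &lt;code&gt;max_moves=3&lt;/code&gt; and fails with &lt;code&gt;max_moves=2&lt;/code&gt;. The hardness results come from how long the shortest successful sequence can get.&lt;/p&gt;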
    &lt;p&gt;&lt;a href="https://www.quantamagazine.org/an-easy-sounding-problem-yields-numbers-too-big-for-our-universe-20231204/" rel="noopener noreferrer nofollow" target="_blank"&gt;A friend wrote a Quanta article about the whole mess&lt;/a&gt;, you should read it.&lt;/p&gt;
    &lt;p&gt;This problem is ludicrously bigger than NP ("Chicago" instead of "The Moon"), but at least it's clearly decidable, easily explainable, and definitely &lt;em&gt;not&lt;/em&gt; in NP.&lt;/p&gt;
    &lt;div class="footnote"&gt;&lt;hr/&gt;&lt;ol class="footnotes"&gt;&lt;li data-id="37347adc-dba6-4629-9d24-c6252292ac6b" id="fn:1"&gt;&lt;p&gt;It's less confusing if you're taught the alternate (and original!) definition of NP, "the class of problems solvable in polynomial time by a nondeterministic Turing machine". Then HALT can't be in NP because otherwise runtime would be bounded by an exponential function. &lt;a class="footnote-backref" href="#fnref:1"&gt;↩&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;&lt;/ol&gt;&lt;/div&gt;</description>
                <pubDate>Wed, 16 Apr 2025 17:39:23 +0000</pubDate>
                <guid>https://buttondown.com/hillelwayne/archive/the-halting-problem-is-a-terrible-example-of-np/</guid>
            </item>
            <item>
                <title>Solving a "Layton Puzzle" with Prolog</title>
                <link>https://buttondown.com/hillelwayne/archive/a48fce5b-8a05-4302-b620-9b26f057f145/</link>
                <description>&lt;p&gt;I have a lot in the works for the this month's &lt;a href="https://leanpub.com/logic/" target="_blank"&gt;Logic for Programmers&lt;/a&gt; release. Among other things, I'm completely rewriting the chapter on Logic Programming Languages. &lt;/p&gt;
    &lt;p&gt;I originally showcased the paradigm with puzzle solvers, like &lt;a href="https://swish.swi-prolog.org/example/queens.pl" target="_blank"&gt;eight queens&lt;/a&gt; or &lt;a href="https://saksagan.ceng.metu.edu.tr/courses/ceng242/documents/prolog/jrfisher/2_1.html" target="_blank"&gt;four-coloring&lt;/a&gt;. Lots of other demos do this too! It takes creativity and insight for humans to solve them, so a program doing it feels magical. But I'm trying to write a book about practical techniques and I want everything I talk about to be &lt;em&gt;useful&lt;/em&gt;. So in v0.9 I'll be replacing these examples with a couple of new programs that might get people thinking that Prolog could help them in their day-to-day work.&lt;/p&gt;
    &lt;p&gt;On the other hand, for a newsletter, showcasing a puzzle solver is pretty cool. And recently I stumbled into &lt;a href="https://morepablo.com/2010/09/some-professor-layton-prolog.html" target="_blank"&gt;this post&lt;/a&gt; by my friend &lt;a href="https://morepablo.com/" target="_blank"&gt;Pablo Meier&lt;/a&gt;, where he solves a videogame puzzle with Prolog:&lt;sup id="fnref:path"&gt;&lt;a class="footnote-ref" href="#fn:path"&gt;1&lt;/a&gt;&lt;/sup&gt;&lt;/p&gt;
    &lt;p&gt;&lt;img alt="See description below" class="newsletter-image" src="https://assets.buttondown.email/images/a4ee8689-bbce-4dc9-8175-a1de3bd8f2db.png?w=960&amp;amp;fit=max"/&gt;&lt;/p&gt;
    &lt;p&gt;Summary for the text-only readers: We have a test with 10 true/false questions (denoted &lt;code&gt;a/b&lt;/code&gt;) and four student attempts. Given the scores of the first three students, we have to figure out the fourth student's score.&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;bbababbabb = 7
    baaababaaa = 5
    baaabbbaba = 3
    bbaaabbaaa = ???
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;You can see Pablo's solution &lt;a href="https://morepablo.com/2010/09/some-professor-layton-prolog.html" target="_blank"&gt;here&lt;/a&gt;, and try it in SWI-prolog &lt;a href="https://swish.swi-prolog.org/p/Some%20Professor%20Layton%20Prolog.pl" target="_blank"&gt;here&lt;/a&gt;. Pretty cool! But after way too long studying Prolog just to write this dang book chapter, I wanted to see if I could do it more elegantly than him. Code and puzzle spoilers to follow.&lt;/p&gt;
    &lt;p&gt;(Normally here's where I'd link to a gentler introduction I wrote but I think this is my first time writing about Prolog online? Uh here's a &lt;a href="https://www.hillelwayne.com/post/picat/" target="_blank"&gt;Picat intro&lt;/a&gt; instead)&lt;/p&gt;
    &lt;h3&gt;The Program&lt;/h3&gt;
    &lt;p&gt;You can try this all online at &lt;a href="https://swish.swi-prolog.org/p/" target="_blank"&gt;SWISH&lt;/a&gt; or just jump to my final version &lt;a href="https://swish.swi-prolog.org/p/layton_prolog_puzzle.pl" target="_blank"&gt;here&lt;/a&gt;.&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="p"&gt;:-&lt;/span&gt; &lt;span class="nf"&gt;use_module&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;library&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s s-Atom"&gt;dif&lt;/span&gt;&lt;span class="p"&gt;)).&lt;/span&gt;    &lt;span class="c1"&gt;% Sound inequality&lt;/span&gt;
    &lt;span class="p"&gt;:-&lt;/span&gt; &lt;span class="nf"&gt;use_module&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;library&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s s-Atom"&gt;clpfd&lt;/span&gt;&lt;span class="p"&gt;)).&lt;/span&gt;  &lt;span class="c1"&gt;% Finite domain constraints&lt;/span&gt;
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;First some imports. &lt;code&gt;dif&lt;/code&gt; lets us write &lt;code&gt;dif(A, B)&lt;/code&gt;, which is true if &lt;code&gt;A&lt;/code&gt; and &lt;code&gt;B&lt;/code&gt; are &lt;em&gt;not&lt;/em&gt; equal. &lt;code&gt;clpfd&lt;/code&gt; lets us write &lt;code&gt;A #= B + 1&lt;/code&gt; to say "A is 1 more than B".&lt;sup id="fnref:superior"&gt;&lt;a class="footnote-ref" href="#fn:superior"&gt;2&lt;/a&gt;&lt;/sup&gt;&lt;/p&gt;
    &lt;p&gt;We'll say both the student submission and the key will be lists, where each value is &lt;code&gt;a&lt;/code&gt; or &lt;code&gt;b&lt;/code&gt;. In Prolog, lowercase identifiers are &lt;strong&gt;atoms&lt;/strong&gt; (like symbols in other languages) and identifiers that start with a capital are &lt;strong&gt;variables&lt;/strong&gt;. Prolog finds values for variables that match equations (&lt;strong&gt;unification&lt;/strong&gt;). The pattern matching is real real good.&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="c1"&gt;% ?- means query&lt;/span&gt;
    &lt;span class="s s-Atom"&gt;?-&lt;/span&gt; &lt;span class="nv"&gt;L&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s s-Atom"&gt;a&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="nv"&gt;B&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="s s-Atom"&gt;c&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;Y&lt;/span&gt;&lt;span class="p"&gt;|&lt;/span&gt;&lt;span class="nv"&gt;X&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;|&lt;/span&gt;&lt;span class="nv"&gt;L&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="nv"&gt;B&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt; &lt;span class="s s-Atom"&gt;#=&lt;/span&gt; &lt;span class="mf"&gt;7.&lt;/span&gt;
    
    &lt;span class="nv"&gt;B&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;6&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="nv"&gt;L&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s s-Atom"&gt;a&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;6&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;c&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
    &lt;span class="nv"&gt;X&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;a&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;6&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;c&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
    &lt;span class="nv"&gt;Y&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;Next, we define &lt;code&gt;score/3&lt;/code&gt;&lt;sup id="fnref:arity"&gt;&lt;a class="footnote-ref" href="#fn:arity"&gt;3&lt;/a&gt;&lt;/sup&gt; recursively. &lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="c1"&gt;% The student's test score&lt;/span&gt;
    &lt;span class="c1"&gt;% score(student answers, answer key, score)&lt;/span&gt;
    &lt;span class="nf"&gt;score&lt;/span&gt;&lt;span class="p"&gt;([],&lt;/span&gt; &lt;span class="p"&gt;[],&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;
    &lt;span class="nf"&gt;score&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="nv"&gt;A&lt;/span&gt;&lt;span class="p"&gt;|&lt;/span&gt;&lt;span class="nv"&gt;As&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;A&lt;/span&gt;&lt;span class="p"&gt;|&lt;/span&gt;&lt;span class="nv"&gt;Ks&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="nv"&gt;N&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;:-&lt;/span&gt;
        &lt;span class="nv"&gt;N&lt;/span&gt; &lt;span class="s s-Atom"&gt;#=&lt;/span&gt; &lt;span class="nv"&gt;M&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nf"&gt;score&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;As&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;Ks&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;M&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;
    &lt;span class="nf"&gt;score&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="nv"&gt;A&lt;/span&gt;&lt;span class="p"&gt;|&lt;/span&gt;&lt;span class="nv"&gt;As&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;K&lt;/span&gt;&lt;span class="p"&gt;|&lt;/span&gt;&lt;span class="nv"&gt;Ks&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="nv"&gt;N&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;:-&lt;/span&gt; 
        &lt;span class="nf"&gt;dif&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;A&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;K&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="nf"&gt;score&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;As&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;Ks&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;N&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;First argument is the student's answers, second is the answer key, third is the final score. The base case is the empty test, which has score 0. Otherwise, we take the head values of each list and compare them. If they're the same, we add one to the score, otherwise we keep the same score. &lt;/p&gt;
    &lt;p&gt;Notice we couldn't write &lt;code&gt;if x then y else z&lt;/code&gt;, we instead used pattern matching to effectively express &lt;code&gt;(x &amp;amp;&amp;amp; y) || (!x &amp;amp;&amp;amp; z)&lt;/code&gt;. Prolog does have a conditional operator, but it prevents backtracking so what's the point???&lt;/p&gt;
    &lt;p class="empty-line" style="height:16px; margin:0px !important;"&gt;&lt;/p&gt;
    &lt;h3&gt;A quick break about bidirectionality&lt;/h3&gt;
    &lt;p&gt;One of the coolest things about Prolog: all purely logical predicates are bidirectional. We can use &lt;code&gt;score&lt;/code&gt; to check if our expected score is correct:&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="s s-Atom"&gt;?-&lt;/span&gt; &lt;span class="nf"&gt;score&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="s s-Atom"&gt;a&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;b&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;b&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s s-Atom"&gt;b&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;b&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;b&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;
    &lt;span class="s s-Atom"&gt;true&lt;/span&gt;
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;But we can also give it answers and a key and ask it for the score:&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="s s-Atom"&gt;?-&lt;/span&gt; &lt;span class="nf"&gt;score&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="s s-Atom"&gt;a&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;b&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;b&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s s-Atom"&gt;b&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;b&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;b&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="nv"&gt;X&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;
    &lt;span class="nv"&gt;X&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;&lt;em&gt;Or&lt;/em&gt; we could give it a key and a score and ask "what test answers would have this score?"&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="s s-Atom"&gt;?-&lt;/span&gt; &lt;span class="nf"&gt;score&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;X&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s s-Atom"&gt;b&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;b&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;b&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;
    &lt;span class="nv"&gt;X&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s s-Atom"&gt;b&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;b&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;_&lt;/span&gt;&lt;span class="nv"&gt;A&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
    &lt;span class="nf"&gt;dif&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;_&lt;/span&gt;&lt;span class="nv"&gt;A&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="s s-Atom"&gt;b&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="nv"&gt;X&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s s-Atom"&gt;b&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;_&lt;/span&gt;&lt;span class="nv"&gt;A&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;b&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
    &lt;span class="nf"&gt;dif&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;_&lt;/span&gt;&lt;span class="nv"&gt;A&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="s s-Atom"&gt;b&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="nv"&gt;X&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="k"&gt;_&lt;/span&gt;&lt;span class="nv"&gt;A&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;b&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;b&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
    &lt;span class="nf"&gt;dif&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;_&lt;/span&gt;&lt;span class="nv"&gt;A&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="s s-Atom"&gt;b&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;The different value is written &lt;code&gt;_A&lt;/code&gt; because we never told Prolog that the list can &lt;em&gt;only&lt;/em&gt; contain &lt;code&gt;a&lt;/code&gt; and &lt;code&gt;b&lt;/code&gt;. We'll fix this later.&lt;/p&gt;
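    &lt;p&gt;For contrast, here's roughly what the last query looks like in a language without bidirectional predicates. This is a hypothetical Python sketch, not part of the Prolog program: scoring only runs "forward", so to go from a key and a score back to answers we have to enumerate every candidate ourselves, and we have to fix the alphabet to &lt;code&gt;a&lt;/code&gt; and &lt;code&gt;b&lt;/code&gt; up front.&lt;/p&gt;

```python
from itertools import product


def score(answers, key):
    """Count the positions where the student's answer matches the key."""
    return sum(1 for a, k in zip(answers, key) if a == k)


def answers_with_score(key, target, alphabet="ab"):
    """Enumerate every answer sheet over alphabet that scores target against key."""
    return [
        "".join(sheet)
        for sheet in product(alphabet, repeat=len(key))
        if score(sheet, key) == target
    ]


matches = answers_with_score("bbb", 2)
```

    &lt;p&gt;Here &lt;code&gt;matches&lt;/code&gt; comes out as &lt;code&gt;abb&lt;/code&gt;, &lt;code&gt;bab&lt;/code&gt;, and &lt;code&gt;bba&lt;/code&gt;: the same three shapes Prolog found, but with the unknown slot pinned to &lt;code&gt;a&lt;/code&gt;.&lt;/p&gt;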
    &lt;h3&gt;Okay back to the program&lt;/h3&gt;
    &lt;p&gt;Now that we have a way of computing scores, we want to find a possible answer key that matches all of our observations, ie gives everybody the correct scores.&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="nf"&gt;key&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;Key&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;:-&lt;/span&gt;
        &lt;span class="c1"&gt;% Figure it out&lt;/span&gt;
        &lt;span class="nf"&gt;score&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="s s-Atom"&gt;b&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;b&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;a&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;b&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;a&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;b&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;b&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;a&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;b&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;b&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="nv"&gt;Key&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;7&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
        &lt;span class="nf"&gt;score&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="s s-Atom"&gt;b&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;a&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;a&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;a&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;b&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;a&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;b&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;a&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;a&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;a&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="nv"&gt;Key&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
        &lt;span class="nf"&gt;score&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="s s-Atom"&gt;b&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;a&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;a&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;a&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;b&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;b&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;b&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;a&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;b&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;a&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="nv"&gt;Key&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;So far we haven't explicitly said that the &lt;code&gt;Key&lt;/code&gt; length matches the student answer lengths. This is implicitly verified by &lt;code&gt;score&lt;/code&gt; (both lists need to be empty at the same time) but it's a good idea to explicitly add &lt;code&gt;length(Key, 10)&lt;/code&gt; as a clause of &lt;code&gt;key/1&lt;/code&gt;. We should also explicitly say that every element of &lt;code&gt;Key&lt;/code&gt; is either &lt;code&gt;a&lt;/code&gt; or &lt;code&gt;b&lt;/code&gt;.&lt;sup id="fnref:explicit"&gt;&lt;a class="footnote-ref" href="#fn:explicit"&gt;4&lt;/a&gt;&lt;/sup&gt; Now we &lt;em&gt;could&lt;/em&gt; write a second predicate saying &lt;code&gt;Key&lt;/code&gt; had the right 'type': &lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;keytype([]).
    keytype([K|Ks]) :- member(K, [a, b]), keytype(Ks).
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;But "generating lists that match a constraint" is a thing that comes up often enough that we don't want to write a separate predicate for each constraint! So after some digging, I found a more elegant solution: &lt;code&gt;maplist&lt;/code&gt;. Let &lt;code&gt;L=[l1, l2]&lt;/code&gt;. Then &lt;code&gt;maplist(p, L)&lt;/code&gt; is equivalent to the clause &lt;code&gt;p(l1), p(l2)&lt;/code&gt;. It also accepts partial predicates: &lt;code&gt;maplist(p(x), L)&lt;/code&gt; is equivalent to &lt;code&gt;p(x, l1), p(x, l2)&lt;/code&gt;. So we could write&lt;sup id="fnref:yall"&gt;&lt;a class="footnote-ref" href="#fn:yall"&gt;5&lt;/a&gt;&lt;/sup&gt;&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="nf"&gt;contains&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;L&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;X&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;:-&lt;/span&gt; &lt;span class="nf"&gt;member&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;X&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;L&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;
    
    &lt;span class="nf"&gt;key&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;Key&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;:-&lt;/span&gt;
        &lt;span class="nf"&gt;length&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;Key&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
        &lt;span class="nf"&gt;maplist&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;contains&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="s s-Atom"&gt;a&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="s s-Atom"&gt;b&lt;/span&gt;&lt;span class="p"&gt;]),&lt;/span&gt; &lt;span class="nv"&gt;L&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
        &lt;span class="c1"&gt;% the score stuff&lt;/span&gt;
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
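    &lt;p&gt;For readers more at home outside Prolog, here is a rough Python analogue of the &lt;code&gt;maplist&lt;/code&gt; call above, offered as a sketch rather than an exact equivalent. &lt;code&gt;functools.partial&lt;/code&gt; plays the role of the partial predicate &lt;code&gt;contains([a,b])&lt;/code&gt;, and &lt;code&gt;all&lt;/code&gt; plays the role of the conjunction. The function names are illustrative only. The big difference: Python can only check a list that already exists, while the Prolog version can also generate lists that satisfy the constraint.&lt;/p&gt;

```python
from functools import partial


def contains(allowed, x):
    """Mirror of contains(L, X) :- member(X, L)."""
    return x in allowed


def key_has_right_type(key):
    # maplist(contains([a,b]), Key) becomes: apply the partially applied
    # function to every element and require that every call succeeds.
    is_a_or_b = partial(contains, ["a", "b"])
    # length(Key, 10) becomes a plain length check.
    return len(key) == 10 and all(map(is_a_or_b, key))
```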
    &lt;p&gt;Now, let's query for the Key:&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="s s-Atom"&gt;?-&lt;/span&gt; &lt;span class="nf"&gt;key&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;Key&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="nv"&gt;Key&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s s-Atom"&gt;a&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;b&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;a&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;b&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;a&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;a&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;b&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;a&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;a&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;b&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
    &lt;span class="nv"&gt;Key&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s s-Atom"&gt;b&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;b&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;a&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;b&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;a&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;a&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;a&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;a&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;a&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;b&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
    &lt;span class="nv"&gt;Key&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s s-Atom"&gt;b&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;b&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;a&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;b&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;a&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;a&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;b&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;b&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;a&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;b&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
    &lt;span class="nv"&gt;Key&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s s-Atom"&gt;b&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;b&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;b&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;b&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;a&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;a&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;b&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;a&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;a&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;b&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;So there are actually four &lt;em&gt;different&lt;/em&gt; keys that all explain our data. Does this mean the puzzle is broken and has multiple different answers?&lt;/p&gt;
    &lt;h3&gt;Nope&lt;/h3&gt;
    &lt;p&gt;The puzzle wasn't to find out what the answer key was, the point was to find the fourth student's score. And if we query for it, we see all four solutions give him the same score:&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="s s-Atom"&gt;?-&lt;/span&gt; &lt;span class="nf"&gt;key&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;Key&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="nf"&gt;score&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="s s-Atom"&gt;b&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;b&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;a&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;a&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;a&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;b&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;b&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;a&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;a&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;a&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="nv"&gt;Key&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;X&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;
    &lt;span class="nv"&gt;X&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;6&lt;/span&gt;
    &lt;span class="nv"&gt;X&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;6&lt;/span&gt;
    &lt;span class="nv"&gt;X&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;6&lt;/span&gt;
    &lt;span class="nv"&gt;X&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;6&lt;/span&gt;
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;Huh! I really like it when puzzles look like they're broken, but every "alternate" solution still gives the same puzzle answer.&lt;/p&gt;
    &lt;p&gt;Total program length: 15 lines of code, compared to the original's 80 lines. &lt;em&gt;Suck it, Pablo.&lt;/em&gt;&lt;/p&gt;
    &lt;p&gt;(Incidentally, you can get all of the answers at once by writing &lt;code&gt;findall(X, (key(Key), score($answer-array, Key, X)), L).&lt;/code&gt;)&lt;/p&gt;
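    &lt;p&gt;As a sanity check on the Prolog results, the same search can be brute-forced, since there are only 2^10 = 1024 candidate keys. The sketch below is plain Python and not part of the original program. &lt;code&gt;score&lt;/code&gt;, &lt;code&gt;OBSERVED&lt;/code&gt; and &lt;code&gt;FOURTH&lt;/code&gt; are illustrative names, and it assumes a score is simply the number of positions where a student's answer matches the key.&lt;/p&gt;

```python
from itertools import product

# Answer sheets and known scores, copied from the key/1 clauses above.
OBSERVED = [
    ("bbababbabb", 7),
    ("baaababaaa", 5),
    ("baaabbbaba", 3),
]
# The fourth student's sheet, whose score is the actual puzzle answer.
FOURTH = "bbaaabbaaa"


def score(answers, key):
    """Number of positions where the student's answer matches the key."""
    return sum(a == k for a, k in zip(answers, key))


# Every candidate key over {a, b} that reproduces all three known scores.
keys = [
    "".join(candidate)
    for candidate in product("ab", repeat=10)
    if all(score(sheet, candidate) == s for sheet, s in OBSERVED)
]

# Like the findall query: collect the fourth student's score under each key.
fourth_scores = [score(FOURTH, key) for key in keys]

print(keys)
print(fourth_scores)
```

    &lt;p&gt;This finds the same four keys as the Prolog query, and each of them gives the fourth student a score of 6.&lt;/p&gt;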
    &lt;p class="empty-line" style="height:16px; margin:0px !important;"&gt;&lt;/p&gt;
    &lt;h3&gt;I still don't like puzzles for teaching&lt;/h3&gt;
    &lt;p&gt;The actual examples I'm using in &lt;a href="https://leanpub.com/logic/" target="_blank"&gt;the book&lt;/a&gt; are "analyzing a version control commit graph" and "planning a sequence of infrastructure changes", which are somewhat more likely to occur at work than needing to solve a puzzle. You'll see them in the next release!&lt;/p&gt;
    &lt;div class="footnote"&gt;
    &lt;hr/&gt;
    &lt;ol&gt;
    &lt;li id="fn:path"&gt;
    &lt;p&gt;I found it because he wrote &lt;a href="https://morepablo.com/2025/04/gamer-games-for-lite-gamers.html" target="_blank"&gt;Gamer Games for Lite Gamers&lt;/a&gt; as a response to my &lt;a href="https://www.hillelwayne.com/post/vidja-games/" target="_blank"&gt;Gaming Games for Non-Gamers&lt;/a&gt;. &lt;a class="footnote-backref" href="#fnref:path" title="Jump back to footnote 1 in the text"&gt;↩&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id="fn:superior"&gt;
    &lt;p&gt;These are better versions of the core Prolog expressions &lt;code&gt;\+ (A = B)&lt;/code&gt; and &lt;code&gt;A is B + 1&lt;/code&gt;, because they can &lt;a href="https://eu.swi-prolog.org/pldoc/man?predicate=dif/2" target="_blank"&gt;defer unification&lt;/a&gt;. &lt;a class="footnote-backref" href="#fnref:superior" title="Jump back to footnote 2 in the text"&gt;↩&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id="fn:arity"&gt;
    &lt;p&gt;Prolog-descendants have a convention of writing the arity of the function after its name, so &lt;code&gt;score/3&lt;/code&gt; means "score has three parameters". I think they do this because you can overload predicates with multiple different arities. Also Joe Armstrong used Prolog for prototyping, so Erlang and Elixir follow the same convention. &lt;a class="footnote-backref" href="#fnref:arity" title="Jump back to footnote 3 in the text"&gt;↩&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id="fn:explicit"&gt;
    &lt;p&gt;It &lt;em&gt;still&lt;/em&gt; gets the right answers without this type restriction, but I had no idea it did until I checked for myself. Probably better not to rely on this! &lt;a class="footnote-backref" href="#fnref:explicit" title="Jump back to footnote 4 in the text"&gt;↩&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id="fn:yall"&gt;
    &lt;p&gt;We could make this even more compact by using a lambda function. First import module &lt;code&gt;yall&lt;/code&gt;, then write &lt;code&gt;maplist([X]&amp;gt;&amp;gt;member(X, [a,b]), Key)&lt;/code&gt;. But (1) it's not a shorter program because you replace the extra definition with an extra module import, and (2) &lt;code&gt;yall&lt;/code&gt; is SWI-Prolog specific and not an ISO-standard Prolog module. Using &lt;code&gt;contains&lt;/code&gt; is more portable. &lt;a class="footnote-backref" href="#fnref:yall" title="Jump back to footnote 5 in the text"&gt;↩&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;/ol&gt;
    &lt;/div&gt;</description>
                <pubDate>Tue, 08 Apr 2025 18:34:50 +0000</pubDate>
                <guid>https://buttondown.com/hillelwayne/archive/a48fce5b-8a05-4302-b620-9b26f057f145/</guid>
            </item>
            <item>
                <title>[April Cools] Gaming Games for Non-Gamers</title>
                <link>https://buttondown.com/hillelwayne/archive/april-cools-gaming-games-for-non-gamers/</link>
                <description>&lt;p&gt;My &lt;em&gt;April Cools&lt;/em&gt; is out! &lt;a href="https://www.hillelwayne.com/post/vidja-games/" target="_blank"&gt;Gaming Games for Non-Gamers&lt;/a&gt; is a 3,000 word essay on video games worth playing if you've never enjoyed a video game before. &lt;a href="https://www.patreon.com/posts/blog-notes-gamer-125654321?utm_medium=clipboard_copy&amp;amp;utm_source=copyLink&amp;amp;utm_campaign=postshare_creator&amp;amp;utm_content=join_link" target="_blank"&gt;Patreon notes here&lt;/a&gt;.&lt;/p&gt;
    &lt;p&gt;(April Cools is a project where we write genuine content on non-normal topics. You can see all the other April Cools posted so far &lt;a href="https://www.aprilcools.club/" target="_blank"&gt;here&lt;/a&gt;. There's still time to submit your own!)&lt;/p&gt;
    &lt;a class="embedded-link" href="https://www.aprilcools.club/"&gt; &lt;div style="width: 100%; background: #fff; border: 1px #ced3d9 solid; border-radius: 5px; margin-top: 1em; overflow: auto; margin-bottom: 1em;"&gt; &lt;div style="float: left; border-bottom: 1px #ced3d9 solid;"&gt; &lt;img class="link-image" src="https://www.aprilcools.club/aprilcoolsclub.png"/&gt; &lt;/div&gt; &lt;div style="float: left; color: #393f48; padding-left: 1em; padding-right: 1em;"&gt; &lt;h4 class="link-title" style="margin-bottom: 0em; line-height: 1.25em; margin-top: 1em; font-size: 14px;"&gt;                April Cools' Club&lt;/h4&gt; &lt;/div&gt; &lt;/div&gt;&lt;/a&gt;</description>
                <pubDate>Tue, 01 Apr 2025 16:04:59 +0000</pubDate>
                <guid>https://buttondown.com/hillelwayne/archive/april-cools-gaming-games-for-non-gamers/</guid>
            </item>
            <item>
                <title>Betteridge's Law of Software Engineering Specialness</title>
                <link>https://buttondown.com/hillelwayne/archive/betteridges-law-of-software-engineering/</link>
                <description>&lt;h3&gt;Logic for Programmers v0.8 now out!&lt;/h3&gt;
    &lt;p&gt;The new release has minor changes: new formatting for notes and a better introduction to predicates. I would have rolled it all into v0.9 next month but I like the monthly cadence. &lt;a href="https://leanpub.com/logic/" target="_blank"&gt;Get it here!&lt;/a&gt;&lt;/p&gt;
    &lt;h1&gt;Betteridge's Law of Software Engineering Specialness&lt;/h1&gt;
    &lt;p&gt;In &lt;a href="https://agileotter.blogspot.com/2025/03/there-is-no-automatic-reset-in.html" target="_blank"&gt;There is No Automatic Reset in Engineering&lt;/a&gt;, Tim Ottinger asks:&lt;/p&gt;
    &lt;blockquote&gt;
    &lt;p&gt;Do the other people have to live with January 2013 for the rest of their lives? Or is it only engineering that has to deal with every dirty hack since the beginning of the organization?&lt;/p&gt;
    &lt;/blockquote&gt;
    &lt;p&gt;&lt;strong&gt;Betteridge's Law of Headlines&lt;/strong&gt; says that if a journalism headline ends with a question mark, the answer is probably "no". I propose a similar law relating to software engineering specialness:&lt;sup id="fnref:ottinger"&gt;&lt;a class="footnote-ref" href="#fn:ottinger"&gt;1&lt;/a&gt;&lt;/sup&gt;&lt;/p&gt;
    &lt;blockquote&gt;
    &lt;p&gt;If someone asks if some aspect of software development is truly unique to just software development, the answer is probably "no".&lt;/p&gt;
    &lt;/blockquote&gt;
    &lt;p&gt;Take the idea that "in software, hacks are forever." My favorite example of this comes from a different profession. The &lt;a href="https://en.wikipedia.org/wiki/Dewey_Decimal_Classification" target="_blank"&gt;Dewey Decimal System&lt;/a&gt; hierarchically categorizes books by discipline. For example, &lt;em&gt;&lt;a href="https://www.librarything.com/work/10143437/t/Covered-Bridges-of-Pennsylvania" target="_blank"&gt;Covered Bridges of Pennsylvania&lt;/a&gt;&lt;/em&gt; has Dewey number &lt;code&gt;624.37&lt;/code&gt;. &lt;code&gt;6--&lt;/code&gt; is the technology discipline, &lt;code&gt;62-&lt;/code&gt; is engineering, &lt;code&gt;624&lt;/code&gt; is civil engineering, and &lt;code&gt;624.3&lt;/code&gt; is "special types of bridges". I have no idea what the last &lt;code&gt;0.07&lt;/code&gt; means, but you get the picture.&lt;/p&gt;
    &lt;p&gt;Now if you look at the &lt;a href="https://www.librarything.com/mds/6" target="_blank"&gt;6-- "technology" breakdown&lt;/a&gt;, you'll see that there's no "software" subdiscipline. This is because Dewey preallocated the whole technology block in 1876. New topics were instead to be added to the &lt;code&gt;00-&lt;/code&gt; "general-knowledge" catch-all. Eventually &lt;code&gt;005&lt;/code&gt; was assigned to "software development", meaning &lt;em&gt;The C Programming Language&lt;/em&gt; lives at &lt;code&gt;005.133&lt;/code&gt;.&lt;/p&gt;
    &lt;p&gt;Incidentally, another late addition to the general knowledge block is &lt;code&gt;001.9&lt;/code&gt;: "controversial knowledge". &lt;/p&gt;
    &lt;p&gt;And that's why my hometown library shelved the C++ books right next to &lt;em&gt;The Mothman Prophecies&lt;/em&gt;.&lt;/p&gt;
    &lt;p&gt;How's &lt;em&gt;that&lt;/em&gt; for technical debt?&lt;/p&gt;
    &lt;p&gt;If anything, fixing hacks in software is significantly &lt;em&gt;easier&lt;/em&gt; than in other fields. This came up when I was &lt;a href="https://www.hillelwayne.com/post/we-are-not-special/" target="_blank"&gt;interviewing classic engineers&lt;/a&gt;. Kludges happened all the time, but "refactoring" them out is &lt;em&gt;expensive&lt;/em&gt;. Need to house a machine that's just two inches taller than the room? Guess what, you're cutting a hole in the ceiling.&lt;/p&gt;
    &lt;p&gt;(Even if we restrict the question to other departments in a &lt;em&gt;software company&lt;/em&gt;, we can find kludges that are horrible to undo. I once worked for a company which landed an early contract by adding a bespoke support agreement for that one customer. That plagued them for years afterward.)&lt;/p&gt;
    &lt;p&gt;That's not to say that there aren't things that are different about software vs other fields!&lt;sup id="fnref:example"&gt;&lt;a class="footnote-ref" href="#fn:example"&gt;2&lt;/a&gt;&lt;/sup&gt;  But I think that &lt;em&gt;most&lt;/em&gt; of the time, when we say "software development is the only profession that deals with XYZ", it's only because we're ignorant of how those other professions work.&lt;/p&gt;
    &lt;hr/&gt;
    &lt;p&gt;Short newsletter because I'm way behind on writing my &lt;a href="https://www.aprilcools.club/" target="_blank"&gt;April Cools&lt;/a&gt;. If you're interested in April Cools, you should try it out! I make it &lt;em&gt;way&lt;/em&gt; harder on myself than it actually needs to be— everybody else who participates finds it pretty chill.&lt;/p&gt;
    &lt;div class="footnote"&gt;
    &lt;hr/&gt;
    &lt;ol&gt;
    &lt;li id="fn:ottinger"&gt;
    &lt;p&gt;Ottinger caveats it with "engineering, software or otherwise", so I think he knows that other branches of &lt;em&gt;engineering&lt;/em&gt;, at least, have kludges. &lt;a class="footnote-backref" href="#fnref:ottinger" title="Jump back to footnote 1 in the text"&gt;↩&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id="fn:example"&gt;
    &lt;p&gt;The "software is different" idea that I'm most sympathetic to is that in software, the tools we use and the products we create are made from the same material. That's unusual at least in classic engineering. Then again, plenty of machinists have made their own lathes and mills! &lt;a class="footnote-backref" href="#fnref:example" title="Jump back to footnote 2 in the text"&gt;↩&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;/ol&gt;
    &lt;/div&gt;</description>
                <pubDate>Wed, 26 Mar 2025 18:48:39 +0000</pubDate>
                <guid>https://buttondown.com/hillelwayne/archive/betteridges-law-of-software-engineering/</guid>
            </item>
            <item>
                <title>Verification-First Development</title>
                <link>https://buttondown.com/hillelwayne/archive/verification-first-development/</link>
                <description>&lt;p&gt;A while back I argued on the Blue Site&lt;sup id="fnref:li"&gt;&lt;a class="footnote-ref" href="#fn:li"&gt;1&lt;/a&gt;&lt;/sup&gt; that "test-first development" (TFD) was different than "test-driven development" (TDD). The former is "write tests before you write code", the latter is a paradigm, culture, and collection of norms that's based on TFD. More broadly, TFD is a special case of &lt;strong&gt;Verification-First Development&lt;/strong&gt; and TDD is not.&lt;/p&gt;
    &lt;blockquote&gt;
    &lt;p&gt;VFD: before writing code, put in place some means of verifying that the code is correct, or at least have an idea of what you'll do.&lt;/p&gt;
    &lt;/blockquote&gt;
    &lt;p&gt;"Verifying" could mean writing tests, or figuring out how to encode invariants in types, or &lt;a href="https://blog.regehr.org/archives/1091" target="_blank"&gt;adding contracts&lt;/a&gt;, or &lt;a href="https://learntla.com/" target="_blank"&gt;making a formal model&lt;/a&gt;, or writing a separate script that checks the output of the program. Just have &lt;em&gt;something&lt;/em&gt; appropriate in place that you can run as you go building the code. Ideally, we'd have verification in place for every interesting property, but that's rarely possible in practice. &lt;/p&gt;
    &lt;p&gt;Oftentimes we can't make the verification until the code is partially complete. In that case it still helps to figure out the verification we'll write later. The point is to have a &lt;em&gt;plan&lt;/em&gt; and follow it promptly.&lt;/p&gt;
    &lt;p&gt;I'm using "code" as a standin for anything we programmers make, not just software programs. When using constraint solvers, I try to find representative problems I know the answers to. When writing formal specifications, I figure out the system's properties before the design that satisfies those properties. There's probably equivalents in security and other topics, too.&lt;/p&gt;
    &lt;h3&gt;The Benefits of VFD&lt;/h3&gt;
    &lt;ol&gt;
    &lt;li&gt;Doing verification before coding makes it less likely we'll skip verification entirely. It's the professional equivalent of "No TV until you do your homework."&lt;/li&gt;
    &lt;li&gt;It's easier to make sure a verifier works properly if we start by running it on code we know doesn't pass it. Bebugging working code takes more discipline.&lt;/li&gt;
    &lt;li&gt;We can run checks earlier in the development process. It's better to realize that our code is broken five minutes after we broke it rather than two hours after.&lt;/li&gt;
    &lt;/ol&gt;
    &lt;p&gt;That's it, those are the benefits of verification-first development. Those are also &lt;em&gt;big&lt;/em&gt; benefits for relatively little investment. Specializations of VFD like test-first development can have more benefits, but also more drawbacks.&lt;/p&gt;
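    &lt;p&gt;To make the second benefit concrete, here is a minimal sketch in Python. It is only an illustration: the function names are hypothetical, and "deduplicating a list" stands in for whatever we are actually building. The verifier is written first, then run against a deliberately wrong implementation to confirm it can fail, and only then against the real code.&lt;/p&gt;

```python
def check_dedupe(original, result):
    """Verifier, written before dedupe() exists."""
    no_duplicates = len(result) == len(set(result))
    same_values = set(result) == set(original)
    return no_duplicates and same_values


def broken_dedupe(items):
    # Deliberately wrong: returns the input unchanged.
    return list(items)


def dedupe(items):
    # Real implementation: keep the first occurrence of each value.
    return list(dict.fromkeys(items))


sample = [3, 1, 3, 2, 1]

# Step 1: the verifier must reject code we know is wrong.
verifier_catches_bug = not check_dedupe(sample, broken_dedupe(sample))

# Step 2: only then do we trust it when it accepts the real code.
real_code_passes = check_dedupe(sample, dedupe(sample))

print(verifier_catches_bug, real_code_passes)
```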
    &lt;h3&gt;The drawbacks of VFD&lt;/h3&gt;
    &lt;ol&gt;
    &lt;li&gt;It slows us down. I know lots of people say that "no actually it makes you go faster in the long run," but that's the &lt;em&gt;long&lt;/em&gt; run. Sometimes we do marathons, sometimes we sprint.&lt;/li&gt;
    &lt;li&gt;Verification gets in the way of exploratory coding, where we don't know what exactly we want or how exactly to do something.&lt;/li&gt;
    &lt;li&gt;Any specific form of verification exerts a pressure on our code to make it easier to verify with that method. For example, if we're mostly verifying via type invariants, we need to figure out how to express those things in our language's type system, which may not be suited for the specific invariants we need.&lt;sup id="fnref:sphinx"&gt;&lt;a class="footnote-ref" href="#fn:sphinx"&gt;2&lt;/a&gt;&lt;/sup&gt;&lt;/li&gt;
    &lt;/ol&gt;
    &lt;h3&gt;Whether "pressure" is a real drawback is incredibly controversial&lt;/h3&gt;
    &lt;p&gt;If I had to summarize what makes "test-driven development" different from VFD:&lt;sup id="fnref:tdd"&gt;&lt;a class="footnote-ref" href="#fn:tdd"&gt;3&lt;/a&gt;&lt;/sup&gt;&lt;/p&gt;
    &lt;ol&gt;
    &lt;li&gt;The form of verification should specifically be tests, and unit tests at that&lt;/li&gt;
    &lt;li&gt;Testing pressure is invariably good. "Making your code easier to unit test" is the same as "making your code better".&lt;/li&gt;
    &lt;/ol&gt;
    &lt;p&gt;This is something all of the various "drivens"— TDD, Type Driven Development, Design by Contract— share in common, this idea that the purpose of the paradigm is to exert pressure. Lots of TDD experts claim that "having a good test suite" is only the secondary benefit of TDD and the real benefit is how it improves code quality.&lt;sup id="fnref:docs"&gt;&lt;a class="footnote-ref" href="#fn:docs"&gt;4&lt;/a&gt;&lt;/sup&gt;&lt;/p&gt;
    &lt;p&gt;Whether they're right or not is not something I want to argue: I've seen these approaches all improve my code structure, but also sometimes worsen it. Regardless, I consider pressure a drawback to VFD in general, for a somewhat idiosyncratic reason. If it &lt;em&gt;weren't&lt;/em&gt; for pressure, VFD would be wholly independent of the code itself. It would &lt;em&gt;just&lt;/em&gt; be about verification, and our decisions would exclusively be about how we want to verify. But the design pressure means that our means of verification affects the system we're checking. What if these conflict in some way?&lt;/p&gt;
    &lt;h3&gt;VFD is a technique, not a paradigm&lt;/h3&gt;
    &lt;p&gt;One of the main differences between "techniques" and "paradigms" is that paradigms don't play well with each other. If you tried to do both "proper" Test-Driven Development and "proper" Cleanroom, your head would explode. Whereas VFD being a "technique" means it works well with other techniques and even with many full paradigms.&lt;/p&gt;
    &lt;p&gt;It also doesn't take a whole lot of practice to start using. It does take practice, both in thinking of verifications and in using the particular verification method involved, to &lt;em&gt;use well&lt;/em&gt;, but we can use it poorly and still benefit.&lt;/p&gt;
    &lt;div class="footnote"&gt;
    &lt;hr/&gt;
    &lt;ol&gt;
    &lt;li id="fn:li"&gt;
    &lt;p&gt;LinkedIn, what did you think I meant? &lt;a class="footnote-backref" href="#fnref:li" title="Jump back to footnote 1 in the text"&gt;↩&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id="fn:sphinx"&gt;
    &lt;p&gt;This bit me in the butt when making my own &lt;a href="https://www.sphinx-doc.org/en/master/" target="_blank"&gt;sphinx&lt;/a&gt; extensions. The official guides do things in a highly dynamic way that Mypy can't statically check. I had to do things in a completely different way. Ended up being better though! &lt;a class="footnote-backref" href="#fnref:sphinx" title="Jump back to footnote 2 in the text"&gt;↩&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id="fn:tdd"&gt;
    &lt;p&gt;Someone's going to yell at me that I completely missed the point of TDD, which is XYZ. Well guess what, someone else &lt;em&gt;already&lt;/em&gt; yelled at me that only dumb idiot babies think XYZ is important in TDD. Put in whatever you want for XYZ. &lt;a class="footnote-backref" href="#fnref:tdd" title="Jump back to footnote 3 in the text"&gt;↩&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id="fn:docs"&gt;
    &lt;p&gt;Another thing that weirdly all of the paradigms claim: that they lead to better documentation. I can see the argument, I just find it strange that &lt;em&gt;every single one&lt;/em&gt; makes this claim! &lt;a class="footnote-backref" href="#fnref:docs" title="Jump back to footnote 4 in the text"&gt;↩&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;/ol&gt;
    &lt;/div&gt;</description>
                <pubDate>Tue, 18 Mar 2025 16:22:20 +0000</pubDate>
                <guid>https://buttondown.com/hillelwayne/archive/verification-first-development/</guid>
            </item>
            <item>
                <title>New Blog Post: "A Perplexing Javascript Parsing Puzzle"</title>
                <link>https://buttondown.com/hillelwayne/archive/new-blog-post-a-perplexing-javascript-parsing/</link>
                <description>&lt;p&gt;I know I said we'd be back to normal newsletters this week and in fact had 80% of one already written. &lt;/p&gt;
    &lt;p&gt;Then I unearthed something that was better left buried.&lt;/p&gt;
    &lt;p&gt;&lt;a href="http://www.hillelwayne.com/post/javascript-puzzle/" target="_blank"&gt;Blog post here&lt;/a&gt;, &lt;a href="https://www.patreon.com/posts/blog-notes-124153641" target="_blank"&gt;Patreon notes here&lt;/a&gt; (Mostly an explanation of how I found this horror in the first place). Next week I'll send what was supposed to be this week's piece.&lt;/p&gt;
    &lt;p&gt;(PS: &lt;a href="https://www.aprilcools.club/" target="_blank"&gt;April Cools&lt;/a&gt; in three weeks!)&lt;/p&gt;</description>
                <pubDate>Wed, 12 Mar 2025 14:49:52 +0000</pubDate>
                <guid>https://buttondown.com/hillelwayne/archive/new-blog-post-a-perplexing-javascript-parsing/</guid>
            </item>
            <item>
                <title>Five Kinds of Nondeterminism</title>
                <link>https://buttondown.com/hillelwayne/archive/five-kinds-of-nondeterminism/</link>
                <description>&lt;p&gt;No newsletter next week, I'm teaching a TLA+ workshop.&lt;/p&gt;
    &lt;p&gt;Speaking of which: I spend a lot of time thinking about formal methods (and TLA+ specifically) because it's the source of almost all my revenue. But I don't share most of the details because 90% of my readers don't use FM and never will. I think it's more interesting to talk about ideas &lt;em&gt;from&lt;/em&gt; FM that would be useful to people outside that field. For example, the idea of "property strength" translates to the &lt;a href="https://buttondown.com/hillelwayne/archive/some-tests-are-stronger-than-others/" target="_blank"&gt;idea that some tests are stronger than others&lt;/a&gt;. &lt;/p&gt;
    &lt;p&gt;Another possible export is how FM approaches nondeterminism. A &lt;strong&gt;nondeterministic&lt;/strong&gt; algorithm is one that, from the same starting conditions, has multiple possible outputs. This is nondeterministic:&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;# Pseudocode
    
    def f() {
        return rand()+1;
    }
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;When specifying systems, I may not &lt;em&gt;encounter&lt;/em&gt; nondeterminism more often than in real systems, but I am definitely more aware of its presence. Modeling nondeterminism is a core part of formal specification. I mentally categorize nondeterminism into five buckets. Caveat, this is specifically about nondeterminism from the perspective of &lt;em&gt;system modeling&lt;/em&gt;, not computer science as a whole. If I tried to include stuff on NFAs and amb operations this would be twice as long.&lt;sup id="fnref:nondeterminism"&gt;&lt;a class="footnote-ref" href="#fn:nondeterminism"&gt;1&lt;/a&gt;&lt;/sup&gt;&lt;/p&gt;
    &lt;p style="height:16px; margin:0px !important;"&gt;&lt;/p&gt;
    &lt;h2&gt;1. True Randomness&lt;/h2&gt;
    &lt;p&gt;Programs that literally make calls to a &lt;code&gt;random&lt;/code&gt; function and then use the results. This is the simplest type of nondeterminism and one of the most ubiquitous. &lt;/p&gt;
    &lt;p&gt;Most of the time, &lt;code&gt;random&lt;/code&gt; isn't &lt;em&gt;truly&lt;/em&gt; nondeterministic. Most of the time computer randomness is actually &lt;strong&gt;pseudorandom&lt;/strong&gt;, meaning we seed a deterministic algorithm that behaves "randomly-enough" for some use. You could "lift" a nondeterministic random function into a deterministic one by adding a fixed seed to the starting state.&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="c1"&gt;# Python&lt;/span&gt;
    
    &lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="nn"&gt;random&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;random&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;seed&lt;/span&gt;
    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;f&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="n"&gt;seed&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;random&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    
    &lt;span class="o"&gt;&gt;&gt;&gt;&lt;/span&gt; &lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="mf"&gt;0.23796462709189137&lt;/span&gt;
    &lt;span class="o"&gt;&gt;&gt;&gt;&lt;/span&gt; &lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="mf"&gt;0.23796462709189137&lt;/span&gt;
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;Often we don't do this because the &lt;em&gt;point&lt;/em&gt; of randomness is to provide nondeterminism! We deliberately &lt;em&gt;abstract out&lt;/em&gt; the starting state of the seed from our program, because it's easier to think about it as locally nondeterministic.&lt;/p&gt;
    &lt;p&gt;(There's also "true" randomness, like using &lt;a href="https://www.intel.com/content/www/us/en/developer/articles/guide/intel-digital-random-number-generator-drng-software-implementation-guide.html#inpage-nav-3-2" target="_blank"&gt;thermal noise&lt;/a&gt; as an entropy source, which I think is mainly used for cryptography and seeding PRNGs.)&lt;/p&gt;
    &lt;p&gt;Most formal specification languages don't deal with randomness (though some deal with &lt;a href="https://buttondown.com/hillelwayne/archive/i-formally-modeled-dreidel-for-no-good-reason/" target="_blank"&gt;probability more broadly&lt;/a&gt;). Instead, we treat it as a nondeterministic choice:&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;# software
    if rand &gt; 0.001 then return a else crash
    
    # specification
    either return a or crash
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;This is because we're looking at worst-case scenarios, so it doesn't matter if &lt;code&gt;crash&lt;/code&gt; happens 50% of the time or 0.0001% of the time, it's still possible.  &lt;/p&gt;
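    &lt;p&gt;Here's a rough Python sketch of that difference. The names (&lt;code&gt;software&lt;/code&gt;, &lt;code&gt;specification&lt;/code&gt;, &lt;code&gt;is_safe&lt;/code&gt;) are made up for illustration and aren't from any real specification language. Running the implementation only samples outcomes, while the specification view checks every outcome that's possible at all:&lt;/p&gt;

```python
import random

OUTCOMES = ["a", "crash"]


def software(rng):
    """Implementation view: crash with probability 0.001, otherwise return a."""
    return rng.choices(OUTCOMES, weights=[0.999, 0.001])[0]


def specification():
    """Specification view: every outcome is simply possible, no probabilities."""
    return set(OUTCOMES)


def is_safe(outcome):
    """Hypothetical property we care about: the system never crashes."""
    return outcome != "crash"


# Sampling the implementation a hundred times will usually miss the crash...
rng = random.Random(0)
sampled = {software(rng) for _ in range(100)}

# ...but checking every outcome the specification allows always finds it.
violations = [outcome for outcome in specification() if not is_safe(outcome)]
print(violations)  # ['crash']
```

    &lt;p&gt;The weights never appear in the specification view, which is the point: the rare branch gets exactly as much scrutiny as the common one.&lt;/p&gt;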
    &lt;h2&gt;2. Concurrency&lt;/h2&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;# Pseudocode
    global x = 1, y = 0;
    
    def thread1() {
        x++;
        x++;
        x++;
    }
    
    def thread2() {
        y := x;
    }
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;If &lt;code&gt;thread1()&lt;/code&gt; and &lt;code&gt;thread2()&lt;/code&gt; run sequentially, then (assuming the sequence is fixed) the final value of &lt;code&gt;y&lt;/code&gt; is deterministic. If the two functions are started and run simultaneously, then depending on when &lt;code&gt;thread2&lt;/code&gt; executes &lt;code&gt;y&lt;/code&gt; can be 1, 2, 3, &lt;em&gt;or&lt;/em&gt; 4. Both functions are locally sequential, but running them concurrently leads to global nondeterminism.&lt;/p&gt;
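    &lt;p&gt;A quick way to convince yourself of those four values is to enumerate the interleavings by hand. This is a minimal sketch that assumes each &lt;code&gt;x++&lt;/code&gt; is atomic, so the only thing that varies is how many increments have happened when &lt;code&gt;thread2&lt;/code&gt; reads &lt;code&gt;x&lt;/code&gt;. The function name is mine:&lt;/p&gt;

```python
def possible_y_values():
    """Enumerate every point at which thread2's single read can land."""
    results = set()
    # thread2 can run after 0, 1, 2, or 3 of thread1's increments.
    for increments_before_read in range(4):
        x = 1
        for _ in range(increments_before_read):
            x += 1  # one of thread1's x++ steps
        y = x  # thread2 executes here
        results.add(y)
    return results


print(sorted(possible_y_values()))  # [1, 2, 3, 4]
```

    &lt;p&gt;If the increments weren't atomic there would be even more interleavings to consider, which is part of why the state space grows so quickly.&lt;/p&gt;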
    &lt;p&gt;Concurrency is arguably the most &lt;em&gt;dramatic&lt;/em&gt; source of nondeterminism. &lt;a href="https://buttondown.com/hillelwayne/archive/what-makes-concurrency-so-hard/" target="_blank"&gt;Small amounts of concurrency lead to huge explosions in the state space&lt;/a&gt;. We have words for the specific kinds of nondeterminism caused by concurrency, like "race condition" and "dirty write". Often we think about it as a separate &lt;em&gt;topic&lt;/em&gt; from nondeterminism. To some extent it "overshadows" the other kinds: I have a much easier time teaching students about concurrency in models than nondeterminism in models.&lt;/p&gt;
    &lt;p&gt;Many formal specification languages have special syntax/machinery for the concurrent aspects of a system, and generic syntax for other kinds of nondeterminism. In P that's &lt;a href="https://p-org.github.io/P/manual/expressions/#choose" target="_blank"&gt;choose&lt;/a&gt;. Others don't special-case concurrency, instead representing it as nondeterministic choices by a global coordinator. This is more flexible but also more inconvenient, as you have to implement process-local sequencing code yourself. &lt;/p&gt;
    &lt;h2&gt;3. User Input&lt;/h2&gt;
    &lt;p&gt;One of the most famous and influential programming books is &lt;em&gt;The C Programming Language&lt;/em&gt; by Kernighan and Ritchie. The first example of a nondeterministic program appears on page 14:&lt;/p&gt;
    &lt;p&gt;&lt;img alt="Picture of the book page. Code reproduced below." class="newsletter-image" src="https://assets.buttondown.email/images/94e6ad15-8d09-48df-b885-191318bfd179.jpg?w=960&amp;fit=max"/&gt;&lt;/p&gt;
    &lt;p&gt;For the newsletter readers who get text only emails,&lt;sup id="fnref:text-only"&gt;&lt;a class="footnote-ref" href="#fn:text-only"&gt;2&lt;/a&gt;&lt;/sup&gt; here's the program:&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="cp"&gt;#include&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="cpf"&gt;&lt;stdio.h&gt;&lt;/span&gt;
    &lt;span class="cm"&gt;/* copy input to output; 1st version */&lt;/span&gt;
    &lt;span class="n"&gt;main&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="w"&gt;    &lt;/span&gt;&lt;span class="kt"&gt;int&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;c&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="w"&gt;    &lt;/span&gt;&lt;span class="n"&gt;c&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;getchar&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
    &lt;span class="w"&gt;    &lt;/span&gt;&lt;span class="k"&gt;while&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;c&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;!=&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;EOF&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="w"&gt;        &lt;/span&gt;&lt;span class="n"&gt;putchar&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;c&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="w"&gt;        &lt;/span&gt;&lt;span class="n"&gt;c&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;getchar&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
    &lt;span class="w"&gt;    &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;Yup, that's nondeterministic. Because the user can enter any string, any call of &lt;code&gt;main()&lt;/code&gt; could have any output, meaning the number of possible outcomes is infinite.&lt;/p&gt;
    &lt;p&gt;Okay that seems a little cheap, and I think it's because we tend to think of determinism in terms of how the user &lt;em&gt;experiences&lt;/em&gt; the program. Yes, &lt;code&gt;main()&lt;/code&gt; has an infinite number of user inputs, but for each input the user will experience only one possible output. It starts to feel more nondeterministic when modeling a long-standing system that's &lt;em&gt;reacting&lt;/em&gt; to user input, for example a server that runs a script whenever the user uploads a file. This can be modeled with nondeterminism and concurrency: We have one execution that's the system, and one nondeterministic execution that represents the effects of our user.&lt;/p&gt;
    &lt;p&gt;(One intrusive thought I sometimes have: any "yes/no" dialogue actually has &lt;em&gt;three&lt;/em&gt; outcomes: yes, no, or the user getting up and walking away without picking a choice, permanently stalling the execution.)&lt;/p&gt;
    &lt;h2&gt;4. External forces&lt;/h2&gt;
    &lt;p&gt;The more general version of "user input": anything where either 1) some part of the execution outcome depends on retrieving external information, or 2) the external world can change some state outside of your system. I call the distinction between internal and external components of the system &lt;a href="https://www.hillelwayne.com/post/world-vs-machine/" target="_blank"&gt;the world and the machine&lt;/a&gt;. Simple examples: code that at some point reads an external temperature sensor. Unrelated code running on a system which quits programs if it gets too hot. API requests to a third party vendor. Code processing files but users can delete files before the script gets to them.&lt;/p&gt;
    &lt;p&gt;Like with PRNGs, some of these cases don't &lt;em&gt;have&lt;/em&gt; to be nondeterministic; we can argue that "the temperature" should be a virtual input into the function. Like with PRNGs, we treat it as nondeterministic because it's useful to think in that way. Also, what if the temperature changes between starting a function and reading it?&lt;/p&gt;
    &lt;p&gt;External forces are also a source of nondeterminism as &lt;em&gt;uncertainty&lt;/em&gt;. Measurements in the real world often come with errors, so repeating a measurement twice can give two different answers. Sometimes operations fail for no discernible reason, or for a non-programmatic reason (like something physically blocks the sensor).&lt;/p&gt;
    &lt;p&gt;All of these situations can be modeled in the same way as user input: a concurrent execution making nondeterministic choices.&lt;/p&gt;
    &lt;h2&gt;5. Abstraction&lt;/h2&gt;
    &lt;p&gt;This is where nondeterminism in system models and in "real software" differ the most. I said earlier that pseudorandomness is &lt;em&gt;arguably&lt;/em&gt; deterministic, but we abstract it into nondeterminism. More generally, &lt;strong&gt;nondeterminism hides implementation details of deterministic processes&lt;/strong&gt;.&lt;/p&gt;
    &lt;p&gt;In one consulting project, we had a machine that received a message, parsed a lot of data from the message, went into a complicated workflow, and then entered one of three states. The final state was totally deterministic on the content of the message, but the actual process of determining that final state took tons and tons of code. None of that mattered at the scope we were modeling, so we abstracted it all away: "on receiving message, nondeterministically enter state A, B, or C."&lt;/p&gt;
    &lt;p&gt;Doing this makes the system easier to model. It also makes the model more sensitive to possible errors. What if the workflow is bugged and sends us to the wrong state? That's already covered by the nondeterministic choice! Nondeterministic abstraction gives us the potential to pick the worst-case scenario for our system, so we can prove it's robust even under those conditions.&lt;/p&gt;
    &lt;p&gt;I know I beat the "nondeterminism as abstraction" drum a whole lot but that's because it's the insight from formal methods I personally value the most, that nondeterminism is a powerful tool to &lt;em&gt;simplify reasoning about things&lt;/em&gt;. You can see the same approach in how I approach modeling users and external forces: complex realities black-boxed and simplified into nondeterministic forces on the system.&lt;/p&gt;
    &lt;hr/&gt;
    &lt;p&gt;Anyway, I hope this collection of ideas I got from formal methods is useful to my broader readership. Lemme know if it somehow helps you out!&lt;/p&gt;
    &lt;div class="footnote"&gt;
    &lt;hr/&gt;
    &lt;ol&gt;
    &lt;li id="fn:nondeterminism"&gt;
    &lt;p&gt;I realized after writing this that I already wrote an essay about nondeterminism in formal specification &lt;a href="https://buttondown.com/hillelwayne/archive/nondeterminism-in-formal-specification/" target="_blank"&gt;just under a year ago&lt;/a&gt;. I hope this one covers enough new ground to be interesting! &lt;a class="footnote-backref" href="#fnref:nondeterminism" title="Jump back to footnote 1 in the text"&gt;↩&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id="fn:text-only"&gt;
    &lt;p&gt;There is a surprising number of you. &lt;a class="footnote-backref" href="#fnref:text-only" title="Jump back to footnote 2 in the text"&gt;↩&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;/ol&gt;
    &lt;/div&gt;</description>
                <pubDate>Wed, 19 Feb 2025 19:37:57 +0000</pubDate>
                <guid>https://buttondown.com/hillelwayne/archive/five-kinds-of-nondeterminism/</guid>
            </item>
            <item>
                <title>Are Efficiency and Horizontal Scalability at odds?</title>
                <link>https://buttondown.com/hillelwayne/archive/are-efficiency-and-horizontal-scalability-at-odds/</link>
                <description>&lt;p&gt;Sorry for missing the newsletter last week! I started writing on Monday as normal, and by Wednesday the piece (about the &lt;a href="https://en.wikipedia.org/wiki/Hierarchy_of_hazard_controls" target="_blank"&gt;hierarchy of controls&lt;/a&gt; ) was 2000 words and not &lt;em&gt;close&lt;/em&gt; to done. So now it'll be a blog post sometime later this month.&lt;/p&gt;
    &lt;p&gt;I also just released a new version of &lt;a href="https://leanpub.com/logic/" target="_blank"&gt;Logic for Programmers&lt;/a&gt;! 0.7 adds a bunch of new content (type invariants, modeling access policies, rewrites of the first chapters) but more importantly has new fonts that are more legible than the old ones. &lt;a href="https://leanpub.com/logic/" target="_blank"&gt;Go check it out!&lt;/a&gt;&lt;/p&gt;
    &lt;p&gt;For this week's newsletter I want to brainstorm an idea I've been noodling over for a while. Say we have a computational task, like running a simulation or searching a very large graph, and it's taking too long to complete on a computer. There's generally three things that we can do to make it faster:&lt;/p&gt;
    &lt;ol&gt;
    &lt;li&gt;Buy a faster computer ("vertical scaling")&lt;/li&gt;
    &lt;li&gt;Modify the software to use the computer's resources better ("efficiency")&lt;/li&gt;
    &lt;li&gt;Modify the software to use multiple computers ("horizontal scaling")&lt;/li&gt;
    &lt;/ol&gt;
    &lt;p&gt;(Splitting single-threaded software across multiple threads/processes is sort of a blend of (2) and (3).)&lt;/p&gt;
    &lt;p&gt;The big benefit of (1) is that we (usually) don't have to make any changes to the software to get a speedup. The downside is that for the past couple of decades computers haven't &lt;em&gt;gotten&lt;/em&gt; much faster, except in ways that require recoding (like GPUs and multicore). This means we rely on (2) and (3), and we can do both to a point. I've noticed, though, that horizontal scaling seems to conflict with efficiency. Software optimized to scale well tends to be worse for the &lt;code&gt;N=1&lt;/code&gt; case than software optimized to, um, be optimized. &lt;/p&gt;
    &lt;p&gt;Are there reasons to &lt;em&gt;expect&lt;/em&gt; this? It seems reasonable that design goals of software are generally in conflict, purely because exclusively optimizing for one property means making decisions that impede other properties. But is there something in the nature of "efficiency" and "horizontal scalability" that makes them especially disjoint?&lt;/p&gt;
    &lt;p&gt;This isn't me trying to explain a fully coherent idea, more me trying to figure this all out for myself. Also I'm probably getting some hardware stuff wrong.&lt;/p&gt;
    &lt;h3&gt;Amdahl's Law&lt;/h3&gt;
    &lt;p&gt;According to &lt;a href="https://en.wikipedia.org/wiki/Amdahl%27s_law" target="_blank"&gt;Amdahl's Law&lt;/a&gt;, the maximum speedup by parallelization is constrained by the proportion of the work that can be parallelized. If 80% of algorithm X is parallelizable, the maximum speedup from horizontal scaling is 5x. If algorithm Y is 25% parallelizable, the maximum speedup is only 1.3x. &lt;/p&gt;
    &lt;p&gt;If you need horizontal scalability, you want to use algorithm X, &lt;em&gt;even if Y is naturally 3x faster&lt;/em&gt;. But if Y was 4x faster, you'd prefer it to X. Maximal scalability means finding the optimal balance between baseline speed and parallelizability. Maximal efficiency means just optimizing baseline speed. &lt;/p&gt;
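    &lt;p&gt;To make the arithmetic concrete, here's a small sketch of the formula using the numbers above. The function names are mine. "3x faster" and "4x faster" are treated as multipliers on Y's single-machine speed relative to X:&lt;/p&gt;

```python
def amdahl_speedup(parallel_fraction, workers):
    """Speedup over one worker when only `parallel_fraction` of the work scales."""
    serial_fraction = 1 - parallel_fraction
    return 1 / (serial_fraction + parallel_fraction / workers)


def max_speedup(parallel_fraction):
    """Limit of amdahl_speedup as the number of workers grows without bound."""
    return 1 / (1 - parallel_fraction)


x_limit = max_speedup(0.80)  # algorithm X: roughly 5x
y_limit = max_speedup(0.25)  # algorithm Y: roughly 1.33x

print(round(x_limit, 2))      # 5.0
print(round(3 * y_limit, 2))  # 4.0, so X still wins at scale
print(round(4 * y_limit, 2))  # 5.33, so Y overtakes X
```

    &lt;p&gt;These are limits with unbounded machines. With a finite number of workers, &lt;code&gt;amdahl_speedup&lt;/code&gt; gives a smaller speedup for both algorithms, so the crossover point shifts in Y's favor.&lt;/p&gt;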
    &lt;h3&gt;Coordination Overhead&lt;/h3&gt;
    &lt;p&gt;Distributed algorithms require more coordination. To add a list of numbers in parallel via &lt;a href="https://en.wikipedia.org/wiki/Fork%E2%80%93join_model" target="_blank"&gt;fork-join&lt;/a&gt;, we'd do something like this:&lt;/p&gt;
    &lt;ol&gt;
    &lt;li&gt;Split the list into N sublists&lt;/li&gt;
    &lt;li&gt;Fork a new thread/process for each sublist&lt;/li&gt;
    &lt;li&gt;Wait for each thread/process to finish&lt;/li&gt;
    &lt;li&gt;Add the sums together.&lt;/li&gt;
    &lt;/ol&gt;
    &lt;p&gt;(1), (2), and (3) all add overhead to the algorithm. At the very least, it's extra lines of code to execute, but it can also mean inter-process communication or network hops. Distribution also means you have fewer natural correctness guarantees, so you need more administrative overhead to avoid race conditions. &lt;/p&gt;
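    &lt;p&gt;As a rough illustration, the four steps might look like this in Python with the standard library's &lt;code&gt;concurrent.futures&lt;/code&gt;. &lt;code&gt;parallel_sum&lt;/code&gt; and its &lt;code&gt;workers&lt;/code&gt; parameter are names I picked for the sketch:&lt;/p&gt;

```python
from concurrent.futures import ThreadPoolExecutor


def parallel_sum(numbers, workers=4):
    """Fork-join sum: split, fork one task per sublist, join, then combine."""
    # 1. Split the list into `workers` sublists (striding keeps them balanced).
    sublists = [numbers[i::workers] for i in range(workers)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # 2. Fork: submit one task per sublist.
        futures = [pool.submit(sum, sublist) for sublist in sublists]
        # 3. Join: wait for every task to finish and collect its result.
        partial_sums = [future.result() for future in futures]
    # 4. Add the partial sums together.
    return sum(partial_sums)


print(parallel_sum(list(range(1, 101))))  # 5050
```

    &lt;p&gt;Everything except the calls to &lt;code&gt;sum&lt;/code&gt; is coordination that the single-threaded version doesn't pay for. This sketch is only meant to show the shape of the algorithm; for a list this small the overhead will almost certainly outweigh any gain.&lt;/p&gt;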
    &lt;p&gt;&lt;strong&gt;Real world example:&lt;/strong&gt; Historically CPython has had a "global interpreter lock" (GIL). In multithreaded code, only one thread could execute Python code at a time (others could execute C code). The &lt;a href="https://docs.python.org/3/howto/free-threading-python.html#single-threaded-performance" target="_blank"&gt;newest version&lt;/a&gt; supports disabling the GIL, which comes at a 40% overhead for single-threaded programs. Supposedly the difference is because the &lt;a href="https://docs.python.org/3/whatsnew/3.11.html#whatsnew311-pep659" target="_blank"&gt;specializing adaptive interpreter&lt;/a&gt; optimization isn't thread-safe yet. The Python team is hoping to get it down to "only" 10%. &lt;/p&gt;
    &lt;p style="height:16px; margin:0px !important;"&gt;&lt;/p&gt;
    &lt;h3&gt;Scaling loses shared resources&lt;/h3&gt;
    &lt;p&gt;I'd say that intra-machine scaling (multiple threads/processes) feels qualitatively &lt;em&gt;different&lt;/em&gt; than inter-machine scaling. Part of that is that intra-machine scaling is "capped" while inter-machine is not. But there's also a difference in what assumptions you can make about shared resources. Starting from the baseline of a single-threaded program:&lt;/p&gt;
    &lt;ol&gt;
    &lt;li&gt;Threads have a much harder time sharing CPU caches (you have to manually mess with affinities)&lt;/li&gt;
    &lt;li&gt;Processes have a much harder time sharing RAM (I think you have to use &lt;a href="https://en.wikipedia.org/wiki/Memory-mapped_file" target="_blank"&gt;mmap&lt;/a&gt;?)&lt;/li&gt;
    &lt;li&gt;Machines can't share cache, RAM, or disk, period.&lt;/li&gt;
    &lt;/ol&gt;
    &lt;p&gt;It's a lot easier to solve a problem when the whole thing fits in RAM. But if you split a 50 GB problem across three machines, it doesn't fit in RAM by default, even if the machines have 64 GB each. Scaling also means that separate machines can't reuse resources like database connections.&lt;/p&gt;
    &lt;h3&gt;Efficiency comes from limits&lt;/h3&gt;
    &lt;p&gt;I think the two previous points tie together in the idea that maximal efficiency comes from being able to make assumptions about the system. If we know the &lt;em&gt;exact&lt;/em&gt; sequence of computations, we can aim to minimize cache misses. If we don't have to worry about thread-safety, &lt;a href="https://www.playingwithpointers.com/blog/refcounting-harder-than-it-sounds.html" target="_blank"&gt;tracking references is dramatically simpler&lt;/a&gt;. If we have all of the data in a single database, our query planner has more room to work with. At various tiers of scaling these assumptions are no longer guaranteed and we lose the corresponding optimizations.&lt;/p&gt;
    &lt;p&gt;Sometimes these assumptions are implicit and crop up in odd places. Like if you're working at a scale where you need multiple synced databases, you might want to use UUIDs instead of numbers for keys. But then you lose the assumption "recently inserted rows are close together in the index", which I've read &lt;a href="https://www.cybertec-postgresql.com/en/unexpected-downsides-of-uuid-keys-in-postgresql/" target="_blank"&gt;can lead to significant slowdowns&lt;/a&gt;. &lt;/p&gt;
    &lt;p&gt;This suggests that if you can find a limit somewhere else, you can get both high horizontal scaling and high efficiency. &lt;del&gt;Supposedly the &lt;a href="https://tigerbeetle.com/" target="_blank"&gt;TigerBeetle database&lt;/a&gt; has both, but that could be because they limit all records to &lt;a href="https://docs.tigerbeetle.com/coding/" target="_blank"&gt;accounts and transfers&lt;/a&gt;. This means every record fits in &lt;a href="https://tigerbeetle.com/blog/2024-07-23-rediscovering-transaction-processing-from-history-and-first-principles/#transaction-processing-from-first-principles" target="_blank"&gt;exactly 128 bytes&lt;/a&gt;.&lt;/del&gt; [A TigerBeetle engineer reached out to tell me that they do &lt;em&gt;not&lt;/em&gt; horizontally scale compute, they distribute across multiple nodes for redundancy. &lt;a href="https://lobste.rs/s/5akiq3/are_efficiency_horizontal_scalability#c_ve8ud5" target="_blank"&gt;"You can't make it faster by adding more machines."&lt;/a&gt;]&lt;/p&gt;
    &lt;p&gt;Does this mean that "assumptions" could be both "assumptions about the computing environment" and "assumptions about the problem"? In the famous essay &lt;a href="http://www.frankmcsherry.org/graph/scalability/cost/2015/01/15/COST.html" target="_blank"&gt;Scalability! But at what COST&lt;/a&gt;, Frank McSherry shows that his single-threaded laptop could outperform 128-node "big data systems" on PageRank and graph connectivity (via label propagation). Afterwards, he discusses how a different algorithm solves graph connectivity even faster: &lt;/p&gt;
    &lt;blockquote&gt;
    &lt;p&gt;[Union find] is more line of code than label propagation, but it is 10x faster and 100x less embarassing. … The union-find algorithm is fundamentally incompatible with the graph computation approaches Giraph, GraphLab, and GraphX put forward (the so-called “think like a vertex” model).&lt;/p&gt;
    &lt;/blockquote&gt;
    &lt;p&gt;The interesting thing to me is that his alternate makes more "assumptions" than what he's comparing to. He can "assume" a fixed goal and optimize the code for that goal. The "big data systems" are trying to be general purpose compute platforms and have to pick a model that supports the widest range of possible problems. &lt;/p&gt;
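    &lt;p&gt;For anyone who hasn't run into union-find before, here's a minimal Python sketch of using it to count connected components. This is not McSherry's code, and the function name and toy graph are mine. Note that it's one sequential pass over the edges that keeps mutating a single shared &lt;code&gt;parent&lt;/code&gt; array, which is the kind of assumption a vertex-centric distributed model can't make:&lt;/p&gt;

```python
def connected_components(num_nodes, edges):
    """Count connected components with a basic union-find (disjoint set)."""
    parent = list(range(num_nodes))  # every node starts as its own root

    def find(node):
        # Walk up to the root, halving the path along the way.
        while parent[node] != node:
            parent[node] = parent[parent[node]]
            node = parent[node]
        return node

    for a, b in edges:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_a] = root_b  # union: merge the two components

    return len({find(node) for node in range(num_nodes)})


# Toy graph: {0, 1, 2}, {3, 4}, and the isolated node 5.
print(connected_components(6, [(0, 1), (1, 2), (3, 4)]))  # 3
```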
    &lt;p&gt;A few years back I wrote &lt;a href="https://www.hillelwayne.com/post/cleverness/" target="_blank"&gt;clever vs insightful code&lt;/a&gt;. I think what I'm trying to say here is that efficiency comes from having insight into your problem and environment.&lt;/p&gt;
    &lt;p&gt;(Last thought to shove in here: to exploit assumptions, you need &lt;em&gt;control&lt;/em&gt;. Carefully arranging your data to fit in L1 doesn't matter if your programming language doesn't let you control where things are stored!)&lt;/p&gt;
    &lt;h3&gt;Is there a cultural aspect?&lt;/h3&gt;
    &lt;p&gt;Maybe there's also a cultural element to this conflict. What if the engineers interested in "efficiency" are different from the engineers interested in "horizontal scaling"?&lt;/p&gt;
    &lt;p&gt;At my first job the data scientists set up a &lt;a href="https://en.wikipedia.org/wiki/Apache_Hadoop" target="_blank"&gt;Hadoop&lt;/a&gt; cluster for their relatively small dataset, only a few dozen gigabytes or so. One of the senior software engineers saw this and said "big data is stupid." To prove it, he took one of their example queries, wrote a script in Go to compute the same thing, and optimized it to run faster on his machine.&lt;/p&gt;
    &lt;p&gt;At the time I was like "yeah, you're right, big data IS stupid!" But I think now that we both missed something obvious: with the "scalable" solution, the data scientists &lt;em&gt;didn't&lt;/em&gt; have to write an optimized script for every single query. Optimizing code is hard, adding more machines is easy! &lt;/p&gt;
    &lt;p&gt;The highest-tier of horizontal scaling is usually something large businesses want, and large businesses like problems that can be solved purely with money. Maximizing efficiency requires a lot of knowledge-intensive human labour, so is less appealing as an investment. Then again, I've seen a lot of work on making the scalable systems more efficient, such as evenly balancing heterogeneous workloads. Maybe in the largest systems intra-machine efficiency is just too small-scale a problem. &lt;/p&gt;
    &lt;h3&gt;I'm not sure where this fits in but scaling a volume of tasks conflicts less than scaling individual tasks&lt;/h3&gt;
    &lt;p&gt;If you have 1,000 machines and need to crunch one big graph, you probably want the most scalable algorithm. If you instead have 50,000 small graphs, you probably want the most efficient algorithm, which you then run on all 1,000 machines. When we call a problem &lt;a href="https://en.wikipedia.org/wiki/Embarrassingly_parallel" target="_blank"&gt;embarrassingly parallel&lt;/a&gt;, we usually mean it's easy to horizontally scale. But it's also one that's easy to make more efficient, because local optimizations don't affect the scaling! &lt;/p&gt;
    &lt;hr/&gt;
    &lt;p&gt;Okay that's enough brainstorming for one week.&lt;/p&gt;
    &lt;h3&gt;Blog Rec&lt;/h3&gt;
    &lt;p&gt;Whenever I think about optimization as a skill, the first article that comes to mind is &lt;a href="https://matklad.github.io/" target="_blank"&gt;matklad's&lt;/a&gt; &lt;a href="https://matklad.github.io/2023/11/15/push-ifs-up-and-fors-down.html" target="_blank"&gt;Push Ifs Up And Fors Down&lt;/a&gt;. I'd never have considered on my own that inlining loops into functions could be such a huge performance win. The blog has a lot of other posts on the nuts-and-bolts of systems languages, optimization, and concurrency.&lt;/p&gt;</description>
                <pubDate>Wed, 12 Feb 2025 18:26:20 +0000</pubDate>
                <guid>https://buttondown.com/hillelwayne/archive/are-efficiency-and-horizontal-scalability-at-odds/</guid>
            </item>
            <item>
                <title>What hard thing does your tech make easy?</title>
                <link>https://buttondown.com/hillelwayne/archive/what-hard-thing-does-your-tech-make-easy/</link>
                <description>&lt;p&gt;I occasionally receive emails asking me to look at the writer's new language/library/tool. Sometimes it's in an area I know well, like formal methods. Other times, I'm a complete stranger to the field. Regardless, I'm generally happy to check it out.&lt;/p&gt;
    &lt;p&gt;When starting out, this is the biggest question I'm looking to answer:&lt;/p&gt;
    &lt;blockquote&gt;
    &lt;p&gt;What does this technology make easy that's normally hard?&lt;/p&gt;
    &lt;/blockquote&gt;
    &lt;p&gt;What justifies me learning and migrating to a &lt;em&gt;new&lt;/em&gt; thing as opposed to fighting through my problems with the tools I already know? The new thing has to have some sort of value proposition, which could be something like "better performance" or "more secure". The most universal value and the most direct to show is "takes less time and mental effort to do something". I can't accurately judge two benchmarks, but I can see two demos or code samples and compare which one feels easier to me.&lt;/p&gt;
    &lt;h2&gt;Examples&lt;/h2&gt;
    &lt;h3&gt;Functional programming&lt;/h3&gt;
    &lt;p&gt;What drew me originally to functional programming was higher order functions. &lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;# Without HOFs
    
    out = []
    for x in input {
      if test(x) {
        out.append(x)
      }
    }
    
    # With HOFs
    
    filter(test, input)
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;We can also compare the easiness of various tasks between examples within the same paradigm. If I know FP via Clojure, what could be appealing about Haskell or F#? For one, null safety is a lot easier when I've got option types.&lt;/p&gt;
    &lt;h3&gt;Array Programming&lt;/h3&gt;
    &lt;p&gt;Array programming languages like APL or J make certain classes of computation easier. For example, finding all of the indices where two arrays &lt;del&gt;differ&lt;/del&gt; match. Here it is in Python:&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="n"&gt;x&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
    &lt;span class="n"&gt;y&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
    
    &lt;span class="o"&gt;&gt;&gt;&gt;&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;a&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;b&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;enumerate&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;zip&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;y&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;a&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="n"&gt;b&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
    &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;7&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;9&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;And here it is in J:&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="nv"&gt;x&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;=:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;4&lt;/span&gt;
    &lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="nv"&gt;y&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;=:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;4&lt;/span&gt;
    
    &lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="nv"&gt;I&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;x&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;y&lt;/span&gt;
    &lt;span class="mi"&gt;7&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;9&lt;/span&gt;
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;Not every tool is meant for every programmer, because you might not have any of the problems a tool makes easier. What comes up more often for you: filtering a list or finding all the indices where two lists match? Statistically speaking, functional programming is more useful to you than array programming.&lt;/p&gt;
    &lt;p&gt;But &lt;em&gt;I&lt;/em&gt; have this problem enough to justify learning array programming.&lt;/p&gt;
    &lt;h3&gt;LLMs&lt;/h3&gt;
    &lt;p&gt;I think a lot of the appeal of LLMs is they make a lot of specialist tasks easy for nonspecialists. One thing I recently did was convert some rst &lt;a href="https://docutils.sourceforge.io/docs/ref/rst/directives.html#list-table" target="_blank"&gt;list tables&lt;/a&gt; to &lt;a href="https://docutils.sourceforge.io/docs/ref/rst/directives.html#csv-table-1" target="_blank"&gt;csv tables&lt;/a&gt;. Normally I'd have to write some tricky parsing and serialization code to automatically convert between the two. With LLMs, it's just&lt;/p&gt;
    &lt;blockquote&gt;
    &lt;p&gt;Convert the following rst list-table into a csv-table: [table]&lt;/p&gt;
    &lt;/blockquote&gt;
    &lt;p&gt;"Easy" can trump "correct" as a value. The LLM might get some translations wrong, but it's so convenient I'd rather manually review all the translations for errors than write specialized script that is correct 100% of the time.&lt;/p&gt;
    &lt;h2&gt;Let's not take this too far&lt;/h2&gt;
    &lt;p&gt;A college friend once claimed that he cracked the secret of human behavior: humans do whatever makes them happiest. "What about the martyr who dies for their beliefs?" "Well, in their last second of life they get REALLY happy."&lt;/p&gt;
    &lt;p&gt;We can do the same here, fitting every value proposition into the frame of "easy". CUDA makes it easier to do matrix multiplication. Rust makes it easier to write low-level code without memory bugs. TLA+ makes it easier to find errors in your design. Monads make it easier to sequence computations in a lazy environment. Making everything about "easy" obscures other reasons for adopting new things.&lt;/p&gt;
    &lt;h3&gt;That whole "simple vs easy" thing&lt;/h3&gt;
    &lt;p&gt;Sometimes people think that "simple" is better than "easy", because "simple" is objective and "easy" is subjective. This comes from the famous talk &lt;a href="https://www.infoq.com/presentations/Simple-Made-Easy/" target="_blank"&gt;Simple Made Easy&lt;/a&gt;. I'm not sure I agree that simple is better &lt;em&gt;or&lt;/em&gt; more objective: the speaker claims that polymorphism and typeclasses are "simpler" than conditionals, and I doubt everybody would agree with that.&lt;/p&gt;
    &lt;p&gt;The problem is that "simple" is used to mean both "not complicated" &lt;em&gt;and&lt;/em&gt; "not complex". And everybody agrees that "complicated" and "complex" are different, even if they can't agree &lt;em&gt;what&lt;/em&gt; the difference is. This idea should probably be expanded into its own newsletter.&lt;/p&gt;
    &lt;p&gt;It's also a lot harder to pitch a technology on being "simpler". Simplicity by itself doesn't make a tool better equipped to solve problems. Simplicity can unlock other benefits, like compositionality or &lt;a href="https://buttondown.com/hillelwayne/archive/the-capability-tractability-tradeoff/" target="_blank"&gt;tractability&lt;/a&gt;, that provide the actual value. And often that value is in the form of "makes some tasks easier". &lt;/p&gt;</description>
                <pubDate>Wed, 29 Jan 2025 18:09:47 +0000</pubDate>
                <guid>https://buttondown.com/hillelwayne/archive/what-hard-thing-does-your-tech-make-easy/</guid>
            </item>
            <item>
                <title>The Juggler's Curse</title>
                <link>https://buttondown.com/hillelwayne/archive/the-jugglers-curse/</link>
                <description>&lt;p&gt;I'm making a more focused effort to juggle this year. Mostly &lt;a href="https://youtu.be/PPhG_90VH5k?si=AxOO65PcX4ZwnxPQ&amp;t=49" target="_blank"&gt;boxes&lt;/a&gt;, but also classic balls too.&lt;sup id="fnref:boxes"&gt;&lt;a class="footnote-ref" href="#fn:boxes"&gt;1&lt;/a&gt;&lt;/sup&gt; I've gotten to the point where I can almost consistently do a five-ball cascade, which I &lt;em&gt;thought&lt;/em&gt; was the cutoff to being a "good juggler". "Thought" because I now know a "good juggler" is one who can do the five-ball cascade with &lt;em&gt;outside throws&lt;/em&gt;. &lt;/p&gt;
    &lt;p&gt;I know this because I can't do the outside five-ball cascade... yet. But it's something I can see myself eventually mastering, unlike the slightly more difficult trick of the five-ball mess, which is impossible for mere mortals like me. &lt;/p&gt;
    &lt;p&gt;&lt;em&gt;In theory&lt;/em&gt; there is a spectrum of trick difficulties and skill levels. I could place myself on the axis like this:&lt;/p&gt;
    &lt;p&gt;&lt;img alt="A crudely-drawn scale with 10 even ticks, I'm between 5 and 6" class="newsletter-image" src="https://assets.buttondown.email/images/8ee51aa1-5dd4-48b8-8110-2cdf9a273612.png?w=960&amp;fit=max"/&gt;&lt;/p&gt;
    &lt;p&gt;In practice, there are three tiers:&lt;/p&gt;
    &lt;ol&gt;
    &lt;li&gt;Toddlers&lt;/li&gt;
    &lt;li&gt;Good jugglers who practice hard&lt;/li&gt;
    &lt;li&gt;Genetic freaks and actual wizards&lt;/li&gt;
    &lt;/ol&gt;
    &lt;p&gt;And the graph always, &lt;em&gt;always&lt;/em&gt; looks like this:&lt;/p&gt;
    &lt;p&gt;&lt;img alt="The same graph, with the top compressed into "wizards" and bottom into "toddlers". I'm in toddlers." class="newsletter-image" src="https://assets.buttondown.email/images/04c76cec-671e-4560-b64e-498b7652359e.png?w=960&amp;fit=max"/&gt;&lt;/p&gt;
    &lt;p&gt;This is the juggler's curse, and it's a three-parter:&lt;/p&gt;
    &lt;ol&gt;
    &lt;li&gt;The threshold between you and "good" is the next trick you cannot do.&lt;/li&gt;
    &lt;li&gt;Everything below that level is trivial. Once you've gotten a trick down, you can never go back to not knowing it, to appreciating how difficult it was to learn in the first place.&lt;sup id="fnref:expert-blindness"&gt;&lt;a class="footnote-ref" href="#fn:expert-blindness"&gt;2&lt;/a&gt;&lt;/sup&gt;&lt;/li&gt;
    &lt;li&gt;Everything above that level is just "impossible". You don't have the knowledge needed to recognize the different tiers.&lt;sup id="fnref:dk"&gt;&lt;a class="footnote-ref" href="#fn:dk"&gt;3&lt;/a&gt;&lt;/sup&gt;&lt;/li&gt;
    &lt;/ol&gt;
    &lt;p&gt;So as you get better, the stuff that was impossible becomes differentiable, and you can see that some of it &lt;em&gt;is&lt;/em&gt; possible. And everything you learned becomes trivial. So you're never a good juggler until you learn "just one more hard trick".&lt;/p&gt;
    &lt;p&gt;The more you know, the more you know you don't know and the less you know you know.&lt;/p&gt;
    &lt;h3&gt;This is supposed to be a software newsletter&lt;/h3&gt;
    &lt;blockquote&gt;
    &lt;p&gt;A monad is a monoid in the category of endofunctors, what's the problem? &lt;a href="https://james-iry.blogspot.com/2009/05/brief-incomplete-and-mostly-wrong.html" target="_blank"&gt;(src)&lt;/a&gt;&lt;/p&gt;
    &lt;/blockquote&gt;
    &lt;p&gt;I think this applies to any difficult topic? Most fields don't have the same stark &lt;a href="https://en.wikipedia.org/wiki/Spectral_line" target="_blank"&gt;spectral lines&lt;/a&gt; as juggling, but there are still tiers of difficulty to techniques, which get compressed the further in either direction they are from your current level.&lt;/p&gt;
    &lt;p&gt;Like, I'm not good at formal methods. I've written two books on it but I've never mastered a dependently-typed language or a theorem prover. Those are equally hard. And I'm not good at modeling concurrent systems because I don't understand the formal definition of bisimulation and haven't implemented a Raft. Those are also equally hard, in fact exactly as hard as mastering a theorem prover.&lt;/p&gt;
    &lt;p&gt;At the same time, the skills I've already developed are easy: properly using refinement is &lt;em&gt;exactly as easy&lt;/em&gt; as writing &lt;a href="https://buttondown.com/hillelwayne/archive/what-are-the-rosettas-of-formal-specification/" target="_blank"&gt;a wrapped counter&lt;/a&gt;. Then I get surprised when I try to explain strong fairness to someone and they just don't get how □◇(ENABLED 〈A〉ᵥ) is &lt;em&gt;obviously&lt;/em&gt; different from ◇□(ENABLED 〈A〉ᵥ).&lt;/p&gt;
    &lt;p&gt;Juggler's curse!&lt;/p&gt;
    &lt;p&gt;Now I don't know if this is actually how everybody experiences expertise or if it's just my particular personality. I was a juggler long before I was a software developer. Then again, I'd argue that lots of people talk about one consequence of the juggler's curse: imposter syndrome. If you constantly think what you know is "trivial" and what you don't know is "impossible", then yeah, you'd start feeling like an imposter at work real quick.&lt;/p&gt;
    &lt;p&gt;I wonder if part of the cause is that a lot of skills you have to learn are invisible. One of my favorite blog posts ever is &lt;a href="https://www.benkuhn.net/blub/" target="_blank"&gt;In Defense of Blub Studies&lt;/a&gt;, which argues that software expertise comes through understanding "boring" topics like "what all of the error messages mean" and "how to use a debugger well".  Blub is a critical part of expertise and takes a lot of hard work to learn, but it &lt;em&gt;feels&lt;/em&gt; like trivia. So looking back on a skill I mastered, I might think it was "easy" because I'm not including all of the blub that I had to learn, too.&lt;/p&gt;
    &lt;p&gt;The takeaway, of course, is that the outside five-ball cascade &lt;em&gt;is&lt;/em&gt; objectively the cutoff between good jugglers and toddlers.&lt;/p&gt;
    &lt;div class="footnote"&gt;
    &lt;hr/&gt;
    &lt;ol&gt;
    &lt;li id="fn:boxes"&gt;
    &lt;p&gt;Rant time: I &lt;em&gt;love&lt;/em&gt; cigar box juggling. It's fun, it's creative, it's totally unlike any other kind of juggling. And it's so niche I straight up cannot find anybody in Chicago to practice with. I once went to a juggling convention and was the only person with a cigar box set there. &lt;a class="footnote-backref" href="#fnref:boxes" title="Jump back to footnote 1 in the text"&gt;↩&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id="fn:expert-blindness"&gt;
    &lt;p&gt;This particular part of the juggler's curse is also called &lt;a href="https://en.wikipedia.org/wiki/Curse_of_knowledge" target="_blank"&gt;the curse of knowledge&lt;/a&gt; or "expert blindness". &lt;a class="footnote-backref" href="#fnref:expert-blindness" title="Jump back to footnote 2 in the text"&gt;↩&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id="fn:dk"&gt;
    &lt;p&gt;This isn't Dunning-Kruger, because DK says that people think they are &lt;em&gt;better&lt;/em&gt; than they actually are, and also &lt;a href="https://www.mcgill.ca/oss/article/critical-thinking/dunning-kruger-effect-probably-not-real" target="_blank"&gt;may not actually be real&lt;/a&gt;. &lt;a class="footnote-backref" href="#fnref:dk" title="Jump back to footnote 3 in the text"&gt;↩&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;/ol&gt;
    &lt;/div&gt;</description>
                <pubDate>Wed, 22 Jan 2025 18:50:40 +0000</pubDate>
                <guid>https://buttondown.com/hillelwayne/archive/the-jugglers-curse/</guid>
            </item>
            <item>
                <title>What are the Rosettas of formal specification?</title>
                <link>https://buttondown.com/hillelwayne/archive/what-are-the-rosettas-of-formal-specification/</link>
                <description>&lt;p&gt;First of all, I just released version 0.6 of &lt;em&gt;Logic for Programmers&lt;/em&gt;! You can get it &lt;a href="https://leanpub.com/logic/" target="_blank"&gt;here&lt;/a&gt;. Release notes in the footnote.&lt;sup id="fnref:release-notes"&gt;&lt;a class="footnote-ref" href="#fn:release-notes"&gt;1&lt;/a&gt;&lt;/sup&gt;&lt;/p&gt;
    &lt;p&gt;I've been thinking about my next project after the book's done. One idea is to do a survey of new formal specification languages. There's been a lot of new ones in the past few years (P, Quint, etc), plus some old ones I haven't critically examined (SPIN, mcrl2). I'm thinking of a brief overview of each, what's interesting about it, and some examples of the corresponding models.&lt;/p&gt;
    &lt;p&gt;For this I'd want a set of "Rosetta" examples. &lt;a href="https://rosettacode.org/wiki/Rosetta_Code" target="_blank"&gt;Rosetta Code&lt;/a&gt; is a collection of programming tasks done in different languages. For example, &lt;a href="https://rosettacode.org/wiki/99_bottles_of_beer" target="_blank"&gt;"99 bottles of beer on the wall"&lt;/a&gt; in over 300 languages. If I wanted to make a Rosetta Code for specifications of concurrent systems, what examples would I use? &lt;/p&gt;
    &lt;h3&gt;What makes a good Rosetta example?&lt;/h3&gt;
    &lt;p&gt;A good Rosetta example would be simple enough to understand and implement but also showcase the differences between the languages. &lt;/p&gt;
    &lt;p&gt;A good example of a Rosetta example is &lt;a href="https://github.com/hwayne/lets-prove-leftpad" target="_blank"&gt;leftpad for code verification&lt;/a&gt;. Proving leftpad correct is short in whatever verification language you use. But the proofs themselves are different enough that you can compare what it's like to use code contracts vs dependent types, etc. &lt;/p&gt;
    &lt;p&gt;A &lt;em&gt;bad&lt;/em&gt; Rosetta example is "hello world". While it's good for showing how to run a language, it doesn't clearly differentiate languages. Haskell's "hello world" is almost identical to BASIC's "hello world".&lt;/p&gt;
    &lt;p&gt;Rosetta examples don't have to be flashy, but I &lt;em&gt;want&lt;/em&gt; mine to be flashy. Formal specification is niche enough that regardless of my medium, most of my audience hasn't used it and may be skeptical. I always have to be selling. This biases me away from using things like dining philosophers or two-phase commit.&lt;/p&gt;
    &lt;p&gt;So with that in mind, three ideas:&lt;/p&gt;
    &lt;h3&gt;1. Wrapped Counter&lt;/h3&gt;
    &lt;p&gt;A counter that starts at 1 and counts to N, after which it wraps around to 1 again.&lt;/p&gt;
    &lt;h4&gt;Why it's good&lt;/h4&gt;
    &lt;p&gt;This is a good introductory formal specification: it's a minimal possible stateful system without concurrency or nondeterminism. You can use it to talk about the basic structure of a spec, how a verifier works, etc. It's also a good way of introducing "boring" semantics, like conditionals and arithmetic, and checking if the language does anything unusual with them. Alloy, for example, defaults to 4-bit signed integers, so you run into problems if you set N too high.&lt;sup id="fnref:alloy"&gt;&lt;a class="footnote-ref" href="#fn:alloy"&gt;2&lt;/a&gt;&lt;/sup&gt;&lt;/p&gt;
    &lt;p&gt;At the same time, wrapped counters are a common building block of complex systems. Lots of things can be represented this way: &lt;code&gt;N=2&lt;/code&gt; is a flag or blinker, &lt;code&gt;N=3&lt;/code&gt; is a traffic light, &lt;code&gt;N=24&lt;/code&gt; is a clock, etc.&lt;/p&gt;
    &lt;p&gt;The next example is better for showing basic &lt;a href="https://www.hillelwayne.com/post/safety-and-liveness/" target="_blank"&gt;safety and liveness properties&lt;/a&gt;, but this will do in a pinch. &lt;/p&gt;
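    &lt;p&gt;For a rough feel of how little there is to the wrapped counter, here is a minimal Python sketch. It is not a formal spec and isn't written in any of the specification languages mentioned above. The names &lt;code&gt;next_state&lt;/code&gt; and &lt;code&gt;reachable&lt;/code&gt; are made up for illustration; the sketch just enumerates the states the counter can visit and checks that it never leaves 1..N.&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;def next_state(count, n):
        """One step of the wrapped counter: 1, 2, ..., n, then back to 1."""
        return 1 if count == n else count + 1
    
    
    def reachable(n):
        """Collect every state the counter visits, starting from 1."""
        seen = []
        count = 1
        while count not in seen:
            seen.append(count)
            count = next_state(count, n)
        return seen
    
    
    states = reachable(3)  # N=3, the "traffic light"
    print(states)  # [1, 2, 3]
    # Safety-style check: the counter always stays within 1..N.
    assert all(s in range(1, 4) for s in states)
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;A real spec language would do this enumeration for you; the point is only that the whole state space is N values and one transition rule.&lt;/p&gt;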
    &lt;h3&gt;2. Threads&lt;/h3&gt;
    &lt;p&gt;A counter starts at 0. N threads each simultaneously try to update the counter. They do this nonatomically: first they read the value of the counter and store that in a thread-local &lt;code&gt;tmp&lt;/code&gt;, then they increment &lt;code&gt;tmp&lt;/code&gt;, then they set the counter to &lt;code&gt;tmp&lt;/code&gt;. The expected behavior is that the final value of the counter will be N.&lt;/p&gt;
    &lt;h4&gt;Why it's good&lt;/h4&gt;
    &lt;p&gt;The system as described is bugged. If two threads interleave the setlocal commands, one thread's update can "clobber" the other and the counter can go backwards. To my surprise, most people &lt;em&gt;do not&lt;/em&gt; see this error. So it's a good showcase of how the language actually finds real bugs, and how it can verify fixes.&lt;/p&gt;
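    &lt;p&gt;To make the clobbering concrete, here is a small Python sketch that brute-forces every interleaving of the read / increment / write steps. It's a toy stand-in for what a model checker does, not a spec in any particular language, and the function name &lt;code&gt;final_values&lt;/code&gt; and the program-counter encoding are assumptions made up for this example.&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;def final_values(n):
        """Explore every interleaving of n threads doing read / increment / write.
    
        Each thread is a (pc, tmp) pair: pc 0 = about to read, 1 = about to
        increment tmp, 2 = about to write back, 3 = done.
        Returns the set of possible final counter values.
        """
        start = (0, tuple((0, 0) for _ in range(n)))  # (counter, threads)
        seen = {start}
        frontier = [start]
        finals = set()
        while frontier:
            counter, threads = frontier.pop()
            if all(pc == 3 for pc, _ in threads):
                finals.add(counter)
                continue
            for i, (pc, tmp) in enumerate(threads):
                if pc == 3:
                    continue  # this thread has finished
                new_counter = counter
                if pc == 0:
                    tmp = counter      # read the shared counter
                elif pc == 1:
                    tmp = tmp + 1      # increment the thread-local copy
                else:
                    new_counter = tmp  # write the local copy back
                stepped = threads[:i] + ((pc + 1, tmp),) + threads[i + 1:]
                state = (new_counter, stepped)
                if state not in seen:
                    seen.add(state)
                    frontier.append(state)
        return finals
    
    
    print(sorted(final_values(3)))  # [1, 2, 3]
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;With three threads the counter can finish at 1, 2, or 3, so "the final value is N" does not hold on every run.&lt;/p&gt;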
    &lt;p&gt;As to actual language topics: the spec covers concurrency and tracks process-local state. A good spec language should make it possible to adjust N without having to add any new variables. And it "naturally" introduces safety, liveness, and &lt;a href="https://www.hillelwayne.com/post/action-properties/" target="_blank"&gt;action&lt;/a&gt; properties.&lt;/p&gt;
    &lt;p&gt;Finally, the thread spec is endlessly adaptable. I've used variations of it to teach refinement, resource starvation, fairness, livelocks, and hyperproperties. Tweak it a bit and you get dining philosophers.&lt;/p&gt;
    &lt;h3&gt;3. Bounded buffer&lt;/h3&gt;
    &lt;p&gt;We have a bounded buffer with maximum length &lt;code&gt;X&lt;/code&gt;. We have &lt;code&gt;R&lt;/code&gt; reader and &lt;code&gt;W&lt;/code&gt; writer processes. Before writing, writers first check if the buffer is full. If full, the writer goes to sleep. Otherwise, the writer wakes up &lt;em&gt;a random&lt;/em&gt; sleeping process, then pushes an arbitrary value. Readers work the same way, except they pop from the buffer (and go to sleep if the buffer is empty).&lt;/p&gt;
    &lt;p&gt;The only way for a sleeping process to wake up is if another process successfully performs a read or write.&lt;/p&gt;
    &lt;h4&gt;Why it's good&lt;/h4&gt;
    &lt;p&gt;This shows process-local nondeterminism (in choosing which sleeping process to wake up), different behavior for different types of processes, and deadlocks: it's possible for every reader and writer to be asleep at the same time.&lt;/p&gt;
    &lt;p&gt;The beautiful thing about this example: the spec can only deadlock if &lt;code&gt;X &amp;lt; 2*(R+W)&lt;/code&gt;. This is the kind of bug you'd struggle to debug in real code. And in fact, people did struggle: even when presented with a minimal code sample and told there was a bug, many &lt;a href="http://wiki.c2.com/?ExtremeProgrammingChallengeFourteen" target="_blank"&gt;testing experts couldn't find it&lt;/a&gt;. Whereas a formal model of the same code &lt;a href="https://www.hillelwayne.com/post/augmenting-agile/" target="_blank"&gt;finds the bug in seconds&lt;/a&gt;. &lt;/p&gt;
    &lt;p&gt;If a spec language can model the bounded buffer, then it's good enough for production systems.&lt;/p&gt;
    &lt;p&gt;On top of that, the bug happens regardless of what writers actually put in the buffer, so you can abstract that all away. This example can demonstrate that you can leave implementation details out of a spec and still find critical errors.&lt;/p&gt;
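    &lt;p&gt;As a rough sketch of that abstraction, the Python below searches the state space of the buffer while tracking only its length and which processes are asleep, never the values written. This is a hand-rolled toy, not one of the linked models: the name &lt;code&gt;can_deadlock&lt;/code&gt; is hypothetical, and it assumes that "check, wake one sleeper, push or pop" happens as a single atomic step.&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;def can_deadlock(capacity, readers, writers):
        """Exhaustive search over (buffer length, set of sleeping processes)."""
        procs = [("r", i) for i in range(readers)]
        procs += [("w", i) for i in range(writers)]
        start = (0, frozenset())
        seen = {start}
        frontier = [start]
        while frontier:
            length, asleep = frontier.pop()
            if len(asleep) == len(procs):
                return True  # everybody is asleep: deadlock
            for proc in procs:
                if proc in asleep:
                    continue  # sleeping processes cannot act
                kind = proc[0]
                blocked = length == 0 if kind == "r" else length == capacity
                if blocked:
                    successors = [(length, asleep | {proc})]
                else:
                    new_length = length - 1 if kind == "r" else length + 1
                    # Wake any one sleeper (nondeterministic), or nobody if none sleep.
                    wake_choices = [asleep - {s} for s in asleep] or [asleep]
                    successors = [(new_length, a) for a in wake_choices]
                for state in successors:
                    if state not in seen:
                        seen.add(state)
                        frontier.append(state)
        return False
    
    
    print(can_deadlock(1, 1, 1))  # False
    print(can_deadlock(1, 2, 1))  # True
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;In this toy model, a buffer of length 1 with one reader and one writer never gets stuck, but adding a second reader makes the all-asleep state reachable, and the search never needed to know what was in the buffer.&lt;/p&gt;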
    &lt;h2&gt;Caveat&lt;/h2&gt;
    &lt;p&gt;This is all with a &lt;em&gt;heavy&lt;/em&gt; TLA+ bias. I've modeled all of these systems in TLA+ and it works pretty well for them. That is to say, none of these do things TLA+ is &lt;em&gt;bad&lt;/em&gt; at: reachability, subtyping, transitive closures, unbound spaces, etc. I imagine that as I cover more specification languages I'll find new Rosettas.&lt;/p&gt;
    &lt;div class="footnote"&gt;
    &lt;hr/&gt;
    &lt;ol&gt;
    &lt;li id="fn:release-notes"&gt;
    &lt;ul&gt;
    &lt;li&gt;Exercises are more compact, answers now show name of exercise in title&lt;/li&gt;
    &lt;/ul&gt;
    &lt;ul&gt;
    &lt;li&gt;"Conditionals" chapter has new section on nested conditionals&lt;/li&gt;
    &lt;/ul&gt;
    &lt;ul&gt;
    &lt;li&gt;"Crash course" chapter significantly rewritten&lt;/li&gt;
    &lt;li&gt;Started migrating to consistently use &lt;code&gt;==&lt;/code&gt; for equality and &lt;code&gt;=&lt;/code&gt; for definition. Not everything is migrated yet&lt;/li&gt;
    &lt;li&gt;"Beyond Logic" appendix does a &lt;em&gt;slightly&lt;/em&gt; better job of covering HOL and constructive logic&lt;/li&gt;
    &lt;li&gt;Addressed various reader feedback&lt;/li&gt;
    &lt;li&gt;Two new exercises&lt;/li&gt;
    &lt;/ul&gt;
    &lt;p&gt;&lt;a class="footnote-backref" href="#fnref:release-notes" title="Jump back to footnote 1 in the text"&gt;↩&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id="fn:alloy"&gt;
    &lt;p&gt;You can change the int size in a model run, so this is more "surprising footgun and inconvenience" than "fundamental limit of the specification language." Something still good to know! &lt;a class="footnote-backref" href="#fnref:alloy" title="Jump back to footnote 2 in the text"&gt;↩&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;/ol&gt;
    &lt;/div&gt;</description>
                <pubDate>Wed, 15 Jan 2025 17:34:40 +0000</pubDate>
                <guid>https://buttondown.com/hillelwayne/archive/what-are-the-rosettas-of-formal-specification/</guid>
            </item>
            <item>
                <title>"Logic for Programmers" Project Update</title>
                <link>https://buttondown.com/hillelwayne/archive/logic-for-programmers-project-update/</link>
                <description>&lt;p&gt;Happy new year everyone!&lt;/p&gt;
    &lt;p&gt;I released the first &lt;em&gt;Logic for Programmers&lt;/em&gt; alpha six months ago. There have been four new versions since then, with the November release putting us in beta. Between work and holidays I didn't make much progress in December, but there will be a 0.6 release in the next week or two.&lt;/p&gt;
    &lt;p&gt;People have asked me if the book will ever be available in print, and my answer to that is "when it's done". To keep "when it's done" from being "never", I'm committing myself to &lt;strong&gt;have the book finished by July.&lt;/strong&gt; That means roughly six more releases between now and the official First Edition. Then I will start looking for a way to get it printed.&lt;/p&gt;
    &lt;h3&gt;The Current State and What Needs to be Done&lt;/h3&gt;
    &lt;p&gt;Right now the book is 26,000 words. For the most part, the structure is set— I don't plan to reorganize the chapters much. But I still need to fix shortcomings identified by the reader feedback. In particular, a few topics need more on real world applications, and the Alloy chapter is pretty weak. There's also a bunch of notes and todos and "fix this"s I need to go over.&lt;/p&gt;
    &lt;p&gt;I also need to rewrite the introduction and predicate logic chapters. Those haven't changed much since 0.1 and I need to go over them &lt;em&gt;very carefully&lt;/em&gt;.&lt;/p&gt;
    &lt;p&gt;After that comes copyediting.&lt;/p&gt;
    &lt;h4&gt;Ugh, Copyediting&lt;/h4&gt;
    &lt;p&gt;Copyediting means going through the entire book to make word- and sentence-level changes to the flow. An example would be changing&lt;/p&gt;
    &lt;table&gt;
    &lt;thead&gt;
    &lt;tr&gt;
    &lt;th&gt;From&lt;/th&gt;
    &lt;th&gt;To&lt;/th&gt;
    &lt;/tr&gt;
    &lt;/thead&gt;
    &lt;tbody&gt;
    &lt;tr&gt;
    &lt;td&gt;I said predicates are just “boolean functions”. That isn’t &lt;em&gt;quite&lt;/em&gt; true.&lt;/td&gt;
    &lt;td&gt;It's easy to think of predicates as just "boolean" functions, but there is a subtle and important difference.&lt;/td&gt;
    &lt;/tr&gt;
    &lt;/tbody&gt;
    &lt;/table&gt;
    &lt;p&gt;It's a tiny difference but it reads slightly better to me and makes the book slightly better. Now repeat that for all 3000-odd sentences in the book and I'm done with copyediting!&lt;/p&gt;
    &lt;p&gt;For the first pass, anyway. Copyediting is miserable. &lt;/p&gt;
    &lt;p&gt;Some of the changes I need to make come from reader feedback, but most will come from going through it line-by-line with a copyeditor. Someone's kindly offered to do some of this for free, but I want to find a professional too. If you know anybody, let me know.&lt;/p&gt;
    &lt;h4&gt;Formatting&lt;/h4&gt;
    &lt;p&gt;The book, if I'm being honest, looks ugly. I'm using the default sphinx/latex combination for layout and typesetting. My thinking is it's not worth making the book pretty until it's worth reading. But I also want the book, when it's eventually printed, to look &lt;em&gt;nice&lt;/em&gt;. At the very least it shouldn't have "self-published" vibes. &lt;/p&gt;
    &lt;p&gt;I've found someone who's been giving me excellent advice on layout and I'm slowly mastering the LaTeX formatting arcana. It's gonna take a few iterations to get things right.&lt;/p&gt;
    &lt;h4&gt;Front cover&lt;/h4&gt;
    &lt;p&gt;Currently the front cover is this:&lt;/p&gt;
    &lt;p&gt;&lt;img alt="Front cover" class="newsletter-image" src="https://assets.buttondown.email/images/b42ee3de-9d8a-4729-809e-a8739741f0cf.png?w=960&amp;fit=max"/&gt;&lt;/p&gt;
    &lt;p&gt;It works but gives "programmer spent ten minutes in Inkscape" vibes. I have a vision in my head for what would be nicer. A few people have recommended using Fiverr. So far the results haven't been that good.&lt;/p&gt;
    &lt;h4&gt;Fixing Epub&lt;/h4&gt;
    &lt;p&gt;&lt;em&gt;Ugh&lt;/em&gt;&lt;/p&gt;
    &lt;p&gt;I thought making an epub version would be kinder for phone reading, but it's such a painful format to develop for. Did you know that epub backlinks work totally differently on Kindle vs other ereaders? Did you know the only way to test if you got em working right is to load them up in a virtual Kindle? The feedback loops are miserable. So I've been treating epub as a second-class citizen for now and only fixing the &lt;em&gt;worst&lt;/em&gt; errors (like math not rendering properly), but that'll have to change as the book finalizes.&lt;/p&gt;
    &lt;h3&gt;What comes next?&lt;/h3&gt;
    &lt;p&gt;After 1.0, I get my book an ISBN and figure out how to make print copies. The margin on print is &lt;em&gt;way&lt;/em&gt; lower than ebooks, especially if it's on-demand: the net royalties for &lt;a href="https://kdp.amazon.com/en_US/help/topic/G201834330" target="_blank"&gt;Amazon direct publishing&lt;/a&gt; would be 7 dollars on a 20-dollar book (as opposed to Leanpub's 16 dollars). Would having a print version double the sales? I hope so! Either way, a lot of people have been asking about a print version so I want to make that possible.&lt;/p&gt;
    &lt;p&gt;(I also want to figure out how to give people who already have the ebook a discount on print, but I don't know if that's feasible.)&lt;/p&gt;
    &lt;p&gt;Then, I dunno, maybe make a talk or a workshop I can pitch to conferences. Once I have that I think I can call &lt;em&gt;LfP&lt;/em&gt; complete... at least until the second edition.&lt;/p&gt;
    &lt;hr/&gt;
    &lt;p&gt;Anyway none of that is actually technical so here's a quick fun thing. I spent a good chunk of my break reading the &lt;a href="https://www.mcrl2.org/web/index.html" target="_blank"&gt;mCRL2 book&lt;/a&gt;. mCRL2 defines an "algebra" for "communicating processes". As a very broad explanation, that's defining what it means to "add" and "multiply" two processes. What's interesting is that according to their definition, the algebra follows the distributive law, &lt;em&gt;but only if you multiply on the right&lt;/em&gt;. eg&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;// VALID
    (a+b)*c = a*c + b*c
    
    // INVALID
    a*(b+c) = a*b + a*c
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;This is the first time I've ever seen this in practice! Jury's still out on the rest of the language.&lt;/p&gt;
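    &lt;p&gt;Here's a rough way to see why only one side distributes. This is a toy model, not mCRL2 itself: it assumes a process is just a tree of (action, continuation) branches, and the names &lt;code&gt;act&lt;/code&gt;, &lt;code&gt;choice&lt;/code&gt; and &lt;code&gt;seq&lt;/code&gt; are made up for the sketch. "Add" is a choice between two processes and "multiply" is running one after the other.&lt;/p&gt;

```python
# Toy model (an assumption, not mCRL2 itself): a process is a frozenset of
# (action, continuation) branches, and the empty frozenset is termination.
DONE = frozenset()


def act(name: str):
    """A process that performs one action and then terminates."""
    return frozenset({(name, DONE)})


def choice(p, q):
    """'Addition': offer every branch of p and every branch of q."""
    return p | q


def seq(p, q):
    """'Multiplication': run p, then continue as q wherever p terminates."""
    if p == DONE:
        return q
    return frozenset((name, seq(rest, q)) for name, rest in p)


a, b, c = act("a"), act("b"), act("c")

# Multiplying on the right distributes: both sides are the same tree.
right_holds = seq(choice(a, b), c) == choice(seq(a, c), seq(b, c))

# Multiplying on the left does not: a*(b+c) chooses between b and c after
# doing a, while a*b + a*c commits to one of them before doing a.
left_holds = seq(a, choice(b, c)) == choice(seq(a, b), seq(a, c))

print(right_holds, left_holds)
```

    &lt;p&gt;In this sketch the two sides of the left-hand law perform the same sequences of actions, but they make the choice at different moments, so the trees differ.&lt;/p&gt;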
    &lt;hr/&gt;
    &lt;h3&gt;Videos and Stuff&lt;/h3&gt;
    &lt;ul&gt;
    &lt;li&gt;My &lt;em&gt;DDD Europe&lt;/em&gt; talk is now out! &lt;a href="https://www.youtube.com/watch?v=uRmNSuYBUOU" target="_blank"&gt;What We Know We Don't Know&lt;/a&gt; is about empirical software engineering in general, and software engineering research on Domain Driven Design in particular.&lt;/li&gt;
    &lt;li&gt;I was interviewed in the last video on &lt;a href="https://www.youtube.com/watch?v=yXxmSI9SlwM" target="_blank"&gt;Craft vs Cruft&lt;/a&gt;'s "Year of Formal Methods". Check it out!&lt;/li&gt;
    &lt;/ul&gt;</description>
                <pubDate>Tue, 07 Jan 2025 18:49:40 +0000</pubDate>
                <guid>https://buttondown.com/hillelwayne/archive/logic-for-programmers-project-update/</guid>
            </item>
            <item>
                <title>Formally modeling dreidel, the sequel</title>
                <link>https://buttondown.com/hillelwayne/archive/formally-modeling-dreidel-the-sequel/</link>
                <description>&lt;p&gt;Channukah's next week and that means my favorite pastime, complaining about how &lt;a href="https://en.wikipedia.org/wiki/Dreidel#" target="_blank"&gt;Dreidel&lt;/a&gt; is a bad game. Last year I formally modeled it in &lt;a href="https://www.prismmodelchecker.org/" target="_blank"&gt;PRISM&lt;/a&gt; to prove the game's not fun. But because I limited the model to only a small case, I couldn't prove the game was &lt;em&gt;truly&lt;/em&gt; bad. &lt;/p&gt;
    &lt;p&gt;It's time to finish the job.&lt;/p&gt;
    &lt;p&gt;&lt;img alt="A flaming dreidel, from https://pixelsmerch.com/featured/flaming-dreidel-ilan-rosen.html" class="newsletter-image" src="https://assets.buttondown.email/images/61233445-69a7-4fd4-a024-ee0dca0281c1.jpg?w=960&amp;fit=max"/&gt;&lt;/p&gt;
    &lt;h2&gt;The Story so far&lt;/h2&gt;
    &lt;p&gt;You can read last year's newsletter &lt;a href="https://buttondown.com/hillelwayne/archive/i-formally-modeled-dreidel-for-no-good-reason/" target="_blank"&gt;here&lt;/a&gt; but here are the high-level notes.&lt;/p&gt;
    &lt;h3&gt;The Game of Dreidel&lt;/h3&gt;
    &lt;ol&gt;
    &lt;li&gt;Every player starts with N pieces (usually chocolate coins). This is usually 10-15 pieces per player.&lt;/li&gt;
    &lt;li&gt;At the beginning of the game, and whenever the pot is empty, every player antes one coin into the pot.&lt;/li&gt;
    &lt;li&gt;
    &lt;p&gt;Turns consist of spinning the dreidel. Outcomes are:&lt;/p&gt;
    &lt;ul&gt;
    &lt;li&gt;נ (Nun): nothing happens.&lt;/li&gt;
    &lt;li&gt;ה (He): player takes half the pot, rounded up.&lt;/li&gt;
    &lt;li&gt;ג (Gimmel): player takes the whole pot, everybody antes.&lt;/li&gt;
    &lt;li&gt;ש (Shin): player adds one of their coins to the pot.&lt;/li&gt;
    &lt;/ul&gt;
    &lt;/li&gt;
    &lt;li&gt;
    &lt;p&gt;If a player ever has zero coins, they are eliminated. Play continues until only one player remains.&lt;/p&gt;
    &lt;/li&gt;
    &lt;/ol&gt;
    &lt;p&gt;If you don't have a dreidel, you can instead use a four-sided die, but for the authentic experience you should wait eight seconds before looking at your roll.&lt;/p&gt;
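    &lt;p&gt;The rules above are small enough to play out in a few lines of Python. This is only a simulation sketch, not the PRISM model: the function names, the seed parameter, and the choice to skip eliminated players and stop once at most one player holds coins are all assumptions made for illustration.&lt;/p&gt;

```python
import random

FACES = ("nun", "he", "gimmel", "shin")


def players_left(stacks):
    """Count the players who still hold at least one coin."""
    return sum(1 for s in stacks if s != 0)


def play_dreidel(players: int = 4, coins: int = 10, seed: int = 0):
    """Play one game; return (spins, final stacks, coins left in the pot)."""
    rng = random.Random(seed)
    stacks = [coins] * players
    pot, turn, spins = 0, 0, 0
    # Stop once at most one player still has coins.
    while players_left(stacks) not in (0, 1):
        if pot == 0:
            # Ante: everyone still in the game puts one coin in the pot.
            for i in range(players):
                if stacks[i] != 0:
                    stacks[i] -= 1
                    pot += 1
            continue
        if stacks[turn] != 0:  # eliminated players are skipped
            spins += 1
            face = rng.choice(FACES)
            if face == "he":
                take = (pot + 1) // 2  # half the pot, rounded up
                stacks[turn] += take
                pot -= take
            elif face == "gimmel":
                stacks[turn] += pot
                pot = 0
            elif face == "shin":
                stacks[turn] -= 1
                pot += 1
        turn = (turn + 1) % players
    return spins, stacks, pot


spins, stacks, pot = play_dreidel(seed=1)
print(spins, stacks, pot)
```

    &lt;p&gt;A single run only gives one sample, so averaging over many seeds would be needed before comparing against any exact figure.&lt;/p&gt;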
    &lt;h3&gt;PRISM&lt;/h3&gt;
    &lt;p&gt;&lt;a href="https://www.prismmodelchecker.org/" target="_blank"&gt;PRISM&lt;/a&gt; is a probabilistic modeling language, meaning you can encode a system with random chances of doing things and it can answer questions like "on average, how many spins does it take before one player loses" (64, for 4 players/10 coins) and "what's more likely to knock the first player out, shin or ante" (ante is 2.4x more likely). You can see last year's model &lt;a href="https://gist.github.com/hwayne/f8724f0c83393c576b1e20ee4b76966d#file-01-dreidel-prism" target="_blank"&gt;here&lt;/a&gt;.&lt;/p&gt;
    &lt;p&gt;The problem with PRISM is that it is absurdly inexpressive: it's a thin abstraction for writing giant &lt;a href="https://en.wikipedia.org/wiki/Stochastic_matrix" target="_blank"&gt;stochastic matrices&lt;/a&gt; and lacks basic affordances like lists or functions. I had to hardcode every possible roll for every player. This meant last year's model had two limits. First, it only handles four players, and I would have to write a new model for three or five players. Second, I made the game end as soon as one player &lt;em&gt;lost&lt;/em&gt;:&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;formula done = (p1=0) | (p2=0) | (p3=0) | (p4=0);
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;To fix both of these things, I thought I'd have to treat PRISM as a compilation target, writing a program that took a player count and output the corresponding model. But then December got super busy and I ran out of time to write a program. Instead, I stuck with four hardcoded players and extended the old model to run until victory.&lt;/p&gt;
    &lt;h2&gt;The new model&lt;/h2&gt;
    &lt;p&gt;These are all changes to &lt;a href="https://gist.github.com/hwayne/f8724f0c83393c576b1e20ee4b76966d#file-01-dreidel-prism" target="_blank"&gt;last year's model&lt;/a&gt;.&lt;/p&gt;
    &lt;p&gt;First, instead of running until one player is out of money, we run until three players are out of money.&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="gd"&gt;- formula done = (p1=0) | (p2=0) | (p3=0) | (p4=0);&lt;/span&gt;
    &lt;span class="gi"&gt;+ formula done = &lt;/span&gt;
    &lt;span class="gi"&gt;+  ((p1=0) &amp; (p2=0) &amp; (p3=0)) |&lt;/span&gt;
    &lt;span class="gi"&gt;+  ((p1=0) &amp; (p2=0) &amp; (p4=0)) |&lt;/span&gt;
    &lt;span class="gi"&gt;+  ((p1=0) &amp; (p3=0) &amp; (p4=0)) |&lt;/span&gt;
    &lt;span class="gi"&gt;+  ((p2=0) &amp; (p3=0) &amp; (p4=0));&lt;/span&gt;
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;Next, we change the ante formula. Instead of adding four coins to the pot and subtracting a coin from each player, we add one coin for each player left. &lt;code&gt;min(p1, 1)&lt;/code&gt; is 1 if player 1 is still in the game, and 0 otherwise. &lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="gi"&gt;+ formula ante_left = min(p1, 1) + min(p2, 1) + min(p3, 1) + min(p4, 1);&lt;/span&gt;
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;We also have to make sure anteing doesn't leave a player with negative money. &lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="gd"&gt;- [ante] (pot = 0) &amp; !done -&gt; (pot'=pot+4) &amp; (p1' = p1-1) &amp; (p2' = p2-1) &amp; (p3' = p3-1) &amp; (p4' = p4-1);&lt;/span&gt;
    &lt;span class="gi"&gt;+ [ante] (pot = 0) &amp; !done -&gt; (pot'=pot+ante_left) &amp; (p1' = max(p1-1, 0)) &amp; (p2' = max(p2-1, 0)) &amp; (p3' = max(p3-1, 0)) &amp; (p4' = max(p4-1, 0));&lt;/span&gt;
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
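    &lt;p&gt;The same &lt;code&gt;min&lt;/code&gt;/&lt;code&gt;max&lt;/code&gt; trick can be checked outside PRISM. Below is a small Python mirror of the updated ante step; the function name &lt;code&gt;ante&lt;/code&gt; and the list-of-stacks representation are assumptions for the sketch, not part of the model.&lt;/p&gt;

```python
def ante(pot: int, stacks: list):
    """Mirror of the updated [ante] command: returns (new_pot, new_stacks)."""
    ante_left = sum(min(p, 1) for p in stacks)    # players still in the game
    new_stacks = [max(p - 1, 0) for p in stacks]  # nobody drops below zero
    return pot + ante_left, new_stacks


pot, stacks = ante(0, [3, 0, 1, 7])
print(pot, stacks)
```

    &lt;p&gt;The two pieces have to agree: every coin added to the pot comes from a player who actually had one, so the total number of coins stays the same.&lt;/p&gt;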
    &lt;p&gt;Finally, we have to add logic for a player being "out". Instead of moving to the next player after each turn, we move to the next player still in the game. Also, if someone starts their turn without any coins (e.g. if they just anted their last coin), we just skip their turn. &lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="gi"&gt;+ formula p1n = (p2 &gt; 0 ? 2 : p3 &gt; 0 ? 3 : 4);&lt;/span&gt;
    
    &lt;span class="gi"&gt;+ [lost] ((pot != 0) &amp; !done &amp; (turn = 1) &amp; (p1 = 0)) -&gt; (turn' = p1n);&lt;/span&gt;
    &lt;span class="gd"&gt;- [spin] ((pot != 0) &amp; !done &amp; (turn = 1)) -&gt;&lt;/span&gt;
    &lt;span class="gi"&gt;+ [spin] ((pot != 0) &amp; !done &amp; (turn = 1) &amp; (p1 != 0)) -&gt;&lt;/span&gt;
    &lt;span class="w"&gt; &lt;/span&gt;   0.25: (p1' = p1-1) 
    &lt;span class="w"&gt; &lt;/span&gt;          &amp; (pot' = min(pot+1, maxval)) 
    &lt;span class="gd"&gt;-          &amp; (turn' = 2) //shin&lt;/span&gt;
    &lt;span class="gi"&gt;+          &amp; (turn' = p1n) //shin&lt;/span&gt;
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;We make similar changes for all of the other players. You can see the final model &lt;a href="https://gist.github.com/hwayne/f8724f0c83393c576b1e20ee4b76966d#file-02-dreidel-prism" target="_blank"&gt;here&lt;/a&gt;.&lt;/p&gt;
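    &lt;p&gt;In a language with loops, the per-player &lt;code&gt;p1n&lt;/code&gt;-style formulas collapse into one function. This is a hypothetical Python sketch of the same "next player still in the game" rule; &lt;code&gt;next_player&lt;/code&gt; is a made-up name, and unlike the PRISM formula it returns the current player when everyone else is out.&lt;/p&gt;

```python
def next_player(turn: int, stacks: list):
    """1-based index of the next player after `turn` who still has coins."""
    n = len(stacks)
    for step in range(1, n):
        candidate = (turn - 1 + step) % n + 1
        if stacks[candidate - 1] != 0:
            return candidate
    return turn  # everyone else is out


print(next_player(1, [5, 0, 3, 2]))
```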
    &lt;h3&gt;Querying the model&lt;/h3&gt;
    &lt;p&gt;So now we have a full game of Dreidel that runs until the game ends. And now, &lt;em&gt;finally&lt;/em&gt;, we can see the average number of spins a 4-player game will last.&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;./prism&lt;span class="w"&gt; &lt;/span&gt;dreidel.prism&lt;span class="w"&gt; &lt;/span&gt;-const&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;M&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="m"&gt;10&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;-pf&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s1"&gt;'R=? [F done]'&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;In English: each player starts with ten coins. &lt;code&gt;R=?&lt;/code&gt; means "expected value of the 'reward'", where 'reward' in this case means number of spins. &lt;code&gt;[F done]&lt;/code&gt; weights the reward over all behaviors that reach ("&lt;strong&gt;F&lt;/strong&gt;inally") the &lt;code&gt;done&lt;/code&gt; state.&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;Result: 760.5607582661091
    Time for model checking: 384.17 seconds.
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;So there's the number: 760 spins.&lt;sup id="fnref:ben"&gt;&lt;a class="footnote-ref" href="#fn:ben"&gt;1&lt;/a&gt;&lt;/sup&gt; At 8 seconds a spin, that's almost two hours for &lt;em&gt;one&lt;/em&gt; game.&lt;/p&gt;
    &lt;p&gt;…Jesus, look at that runtime. Six minutes to test one query.&lt;/p&gt;
    &lt;p&gt;PRISM has over a hundred settings that affect model checking, with descriptions like "Pareto curve threshold" and "Use Backwards Pseudo SOR". After looking through them all, I found this perfect combination of configurations that gets the runtime to a more manageable level: &lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;./prism dreidel.prism 
    &lt;span class="w"&gt; &lt;/span&gt;   -const M=10 
    &lt;span class="w"&gt; &lt;/span&gt;   -pf 'R=? [F done]' 
    &lt;span class="gi"&gt;+   -heuristic speed&lt;/span&gt;
    
    Result: 760.816255997373
    Time for model checking: 13.44 seconds.
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;Yes, that's a literal "make it faster" flag.&lt;/p&gt;
    &lt;p&gt;Anyway, that's only the "average" number of spins, weighted across all games. Dreidel has a very long tail. To find that out, we'll use a variation on our query:&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;const C0; P=? [F &lt;=C0 done]
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;&lt;code&gt;P=?&lt;/code&gt; is the &lt;strong&gt;P&lt;/strong&gt;robability something happens. &lt;code&gt;F &lt;=C0 done&lt;/code&gt; means we &lt;strong&gt;F&lt;/strong&gt;inally reach state &lt;code&gt;done&lt;/code&gt; in at most &lt;code&gt;C0&lt;/code&gt; steps. By passing in different values of &lt;code&gt;C0&lt;/code&gt; we can get a sense of how long a game takes. Since "steps" includes passes and antes, this will overestimate the length of the game. But antes take time too and it should only "pass" on a player once per player, so this should still be a good metric for game length.&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;./prism dreidel.prism 
        -const M=10 
        -const C0=1000:1000:5000
        -pf 'const C0; P=? [F &lt;=C0 done]' 
        -heuristic speed
    
    C0      Result
    1000    0.6259953274918795
    2000    0.9098575028069353
    3000    0.9783122218576754
    4000    0.994782069562932
    5000    0.9987446018004976
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;A full 10% of games don't finish in 2000 steps, and 2% pass the 3000 step barrier. At 8 seconds a roll/ante, 3000 steps is over &lt;strong&gt;six hours&lt;/strong&gt;.&lt;/p&gt;
    &lt;p&gt;Dreidel is a bad game.&lt;/p&gt;
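    &lt;p&gt;For anyone who wants to redo the arithmetic, here is a short Python sketch that converts the table into hours and tail probabilities. The probabilities are copied from the PRISM output above and the 8 seconds per step is the estimate used in the text; the names are made up for the sketch.&lt;/p&gt;

```python
SECONDS_PER_STEP = 8  # the article's estimate for one spin or ante

# Cumulative probabilities copied from the PRISM output above.
FINISHED_BY = {
    1000: 0.6259953274918795,
    2000: 0.9098575028069353,
    3000: 0.9783122218576754,
    4000: 0.994782069562932,
    5000: 0.9987446018004976,
}


def hours(steps: int):
    """Wall-clock hours for a given number of steps."""
    return steps * SECONDS_PER_STEP / 3600


for steps, p_done in FINISHED_BY.items():
    print(f"{steps} steps: {hours(steps):.1f} h, still playing: {1 - p_done:.1%}")
```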
    &lt;h3&gt;More fun properties&lt;/h3&gt;
    &lt;p&gt;As a sanity check, let's confirm last year's result, that it takes an average of 64ish spins before one player is out. In that model, we just needed to get the total reward. Now we instead want to get the reward until the first state where any of the players have zero coins. &lt;sup id="fnref:co-safe"&gt;&lt;a class="footnote-ref" href="#fn:co-safe"&gt;2&lt;/a&gt;&lt;/sup&gt;&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;./prism dreidel.prism 
        -const M=10 
        -pf 'R=? [F (p1=0 | p2=0 | p3=0 | p4=0)]' 
        -heuristic speed
    
    Result: 63.71310116083396
    Time for model checking: 2.017 seconds.
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;Yep, looks good. With our new model we can also get the average point where two players are out and two players are left. PRISM's lack of abstraction makes expressing the condition directly a little painful, but we can cheat and look for the first state where &lt;code&gt;ante_left &lt;= 2&lt;/code&gt;.&lt;sup id="fnref:ante_left"&gt;&lt;a class="footnote-ref" href="#fn:ante_left"&gt;3&lt;/a&gt;&lt;/sup&gt;&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;./prism dreidel.prism 
        -const M=10 
        -pf 'R=? [F (ante_left &lt;= 2)]' 
        -heuristic speed
    
    Result: 181.92839196680023
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;It takes twice as long to eliminate the second player as it takes to eliminate the first, and the remaining two players have to go for another 600 spins.&lt;/p&gt;
    &lt;p&gt;Dreidel is a bad game.&lt;/p&gt;
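    &lt;p&gt;The phase lengths come from subtracting the three expected values reported above. A quick Python sketch of that subtraction, using the numbers from the &lt;code&gt;-heuristic speed&lt;/code&gt; runs (the variable names are made up):&lt;/p&gt;

```python
first_out = 63.71310116083396    # expected spins until one player is out
second_out = 181.92839196680023  # expected spins until two players are out
game_over = 760.816255997373     # expected spins until the game ends

phase_two = second_out - first_out    # eliminating the second player
phase_three = game_over - second_out  # the last two players fighting it out

print(round(phase_two, 1), round(phase_two / first_out, 2), round(phase_three, 1))
```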
    &lt;h2&gt;The future&lt;/h2&gt;
    &lt;p&gt;There's two things I want to do next with this model. The first is script up something that can generate the PRISM model for me, so I can easily adjust the number of players to 3 or 5. The second is that PRISM has a &lt;a href="https://www.prismmodelchecker.org/manual/PropertySpecification/Filters" target="_blank"&gt;filter-query&lt;/a&gt; feature I don't understand but I &lt;em&gt;think&lt;/em&gt; it could be used for things like "if a player gets 75% of the pot, what's the probability they lose anyway". Otherwise you have to write wonky queries like &lt;code&gt;(P =? [F p1 = 30 &amp; (F p1 = 0)]) / (P =? [F p1 = 0])&lt;/code&gt;.&lt;sup id="fnref:lose"&gt;&lt;a class="footnote-ref" href="#fn:lose"&gt;4&lt;/a&gt;&lt;/sup&gt; But I'm out of time again, so this saga will have to conclude next year.&lt;/p&gt;
    &lt;p&gt;I'm also faced with the terrible revelation that I might be the biggest non-academic user of PRISM.&lt;/p&gt;
    &lt;hr/&gt;
    &lt;h4&gt;&lt;em&gt;Logic for Programmers&lt;/em&gt; Khanukah Sale&lt;/h4&gt;
    &lt;p&gt;Still going on! You can get &lt;em&gt;LFP&lt;/em&gt; for &lt;a href="https://leanpub.com/logic/c/hannukah-presents" target="_blank"&gt;40% off here&lt;/a&gt; from now until the end of Xannukkah (Jan 2).&lt;sup id="fnref:joke"&gt;&lt;a class="footnote-ref" href="#fn:joke"&gt;5&lt;/a&gt;&lt;/sup&gt;&lt;/p&gt;
    &lt;h4&gt;I'm in the Raku Advent Calendar!&lt;/h4&gt;
    &lt;p&gt;My piece is called &lt;a href="https://raku-advent.blog/2024/12/11/day-11-counting-up-concurrency/" target="_blank"&gt;counting up concurrencies&lt;/a&gt;. It's about using Raku to do some combinatorics! Read the rest of the blog too, it's great.&lt;/p&gt;
    &lt;div class="footnote"&gt;
    &lt;hr/&gt;
    &lt;ol&gt;
    &lt;li id="fn:ben"&gt;
    &lt;p&gt;This is different from the &lt;a href="https://www.slate.com/articles/life/holidays/2014/12/rules_of_dreidel_the_hannukah_game_is_way_too_slow_let_s_speed_it_up.html" target="_blank"&gt;original anti-Dreidel article&lt;/a&gt;: Ben got &lt;em&gt;860&lt;/em&gt; spins. That's the average number of spins if you round &lt;em&gt;down&lt;/em&gt; on He, not up. Rounding up on He leads to a shorter game because it means He can empty the pot, which means more antes, and antes are what knock most players out. &lt;a class="footnote-backref" href="#fnref:ben" title="Jump back to footnote 1 in the text"&gt;↩&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id="fn:co-safe"&gt;
    &lt;p&gt;PRISM calls this &lt;a href="https://www.prismmodelchecker.org/manual/PropertySpecification/Reward-basedProperties" target="_blank"&gt;"co-safe LTL reward"&lt;/a&gt; and does &lt;em&gt;not&lt;/em&gt; explain what that means, nor do most of the papers I found referencing "co-safe LTL". &lt;a href="https://mengguo.github.io/personal_site/papers/pdf/guo2016task.pdf" target="_blank"&gt;Eventually&lt;/a&gt; I found one that defined it as "any property that only uses X, U, F". &lt;a class="footnote-backref" href="#fnref:co-safe" title="Jump back to footnote 2 in the text"&gt;↩&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id="fn:ante_left"&gt;
    &lt;p&gt;Here's the exact point where I realize I could have defined &lt;code&gt;done&lt;/code&gt; as &lt;code&gt;ante_left = 1&lt;/code&gt;. Also checking for &lt;code&gt;F (ante_left = 2)&lt;/code&gt; gives an expected number of spins as "infinity". I have no idea why. &lt;a class="footnote-backref" href="#fnref:ante_left" title="Jump back to footnote 3 in the text"&gt;↩&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id="fn:lose"&gt;
    &lt;p&gt;A 10% chance at 4 players / 10 coins. And it takes a minute even &lt;em&gt;with&lt;/em&gt; fast mode enabled. &lt;a class="footnote-backref" href="#fnref:lose" title="Jump back to footnote 4 in the text"&gt;↩&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id="fn:joke"&gt;
    &lt;p&gt;This joke was funnier before I made the whole newsletter about Chanukahh. &lt;a class="footnote-backref" href="#fnref:joke" title="Jump back to footnote 5 in the text"&gt;↩&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;/ol&gt;
    &lt;/div&gt;</description>
                <pubDate>Wed, 18 Dec 2024 16:58:59 +0000</pubDate>
                <guid>https://buttondown.com/hillelwayne/archive/formally-modeling-dreidel-the-sequel/</guid>
            </item>
            <item>
                <title>Stroustrup's Rule</title>
                <link>https://buttondown.com/hillelwayne/archive/stroustrups-rule/</link>
                <description>&lt;p&gt;Just finished two weeks of workshops and am &lt;em&gt;exhausted&lt;/em&gt;, so this one will be light. &lt;/p&gt;
    &lt;h3&gt;Hanuka Sale&lt;/h3&gt;
    &lt;p&gt;&lt;em&gt;Logic for Programmers&lt;/em&gt; is on sale until the end of Chanukah! That's Jan 2nd if you're not Jewish. &lt;a href="https://leanpub.com/logic/c/hannukah-presents" target="_blank"&gt;Get it for 40% off here&lt;/a&gt;.&lt;/p&gt;
    &lt;h1&gt;Stroustrup's Rule&lt;/h1&gt;
    &lt;p&gt;I first encountered &lt;strong&gt;Stroustrup's Rule&lt;/strong&gt; on this &lt;a href="https://web.archive.org/web/20240914141601/https:/www.thefeedbackloop.xyz/stroustrups-rule-and-layering-over-time/" target="_blank"&gt;defunct webpage&lt;/a&gt;:&lt;/p&gt;
    &lt;blockquote&gt;
    &lt;p&gt;One of my favorite insights about syntax design appeared in a &lt;a href="https://learn.microsoft.com/en-us/shows/lang-next-2014/keynote" target="_blank"&gt;retrospective on C++&lt;/a&gt;&lt;sup id="fnref:timing"&gt;&lt;a class="footnote-ref" href="#fn:timing"&gt;1&lt;/a&gt;&lt;/sup&gt; by Bjarne Stroustrup:&lt;/p&gt;
    &lt;ul&gt;
    &lt;li&gt;For new features, people insist on &lt;strong&gt;LOUD&lt;/strong&gt; explicit syntax. &lt;/li&gt;
    &lt;li&gt;For established features, people want terse notation.&lt;/li&gt;
    &lt;/ul&gt;
    &lt;/blockquote&gt;
    &lt;p&gt;The blogger gives the example of &lt;code&gt;Result&lt;/code&gt; types in Rust. Originally, the idea of returning errors as ordinary values was new for programmers, so the syntax for passing an error along was very explicit:&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="kd"&gt;let&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;file&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="k"&gt;match&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;File&lt;/span&gt;&lt;span class="p"&gt;::&lt;/span&gt;&lt;span class="n"&gt;open&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"file.txt"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="w"&gt;    &lt;/span&gt;&lt;span class="nb"&gt;Ok&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;file&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;=&gt;&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;file&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="w"&gt;    &lt;/span&gt;&lt;span class="nb"&gt;Err&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;err&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;=&gt;&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="k"&gt;return&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nb"&gt;Err&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;err&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="p"&gt;};&lt;/span&gt;
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;Once people were more familiar with it, Rust added the &lt;code&gt;try!&lt;/code&gt; macro to reduce boilerplate, and finally the &lt;a href="https://github.com/rust-lang/rfcs/blob/master/text/0243-trait-based-exception-handling.md" target="_blank"&gt;&lt;code&gt;?&lt;/code&gt; operator&lt;/a&gt; to streamline error handling further.&lt;/p&gt;
    &lt;p&gt;I see this as a special case of &lt;a href="http://teachtogether.tech/en/index.html#s:models" target="_blank"&gt;mental model development&lt;/a&gt;: when a feature is new to you, you don't have an internal mental model so need all of the explicit information you can get. Once you're familiar with it, explicit syntax is visual clutter and hinders how quickly you can parse out information.&lt;/p&gt;
    &lt;p&gt;(One example I like: which is more explicit, &lt;code&gt;user_id&lt;/code&gt; or &lt;code&gt;user_identifier&lt;/code&gt;? Which do experienced programmers prefer?)&lt;/p&gt;
    &lt;p&gt;What's interesting is that it's often the &lt;em&gt;same people&lt;/em&gt; on both sides of the spectrum. Beginners need explicit syntax, and as they become experts, they prefer terse syntax. &lt;/p&gt;
    &lt;p&gt;The rule applies to the overall community, too. At the beginning of a language's life, everybody's a beginner. Over time the ratio of experts to beginners changes, and this leads to more focus on "expert-friendly" features, like terser syntax.&lt;/p&gt;
    &lt;p&gt;This can make it harder for beginners to learn the language. There was a lot of drama in Python over the &lt;a href="https://peps.python.org/pep-0572/" target="_blank"&gt;"walrus" assignment operator&lt;/a&gt;:&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="c1"&gt;# Without walrus&lt;/span&gt;
    &lt;span class="n"&gt;val&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;dict&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;key&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="c1"&gt;# `None` if key absent&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;val&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="nb"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;val&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    
    
    &lt;span class="c1"&gt;# With walrus&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;val&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="nb"&gt;dict&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;key&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="nb"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;val&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;Experts supported it because it made code more elegant, teachers and beginners opposed it because it made the language harder to learn. Explicit syntax vs terse notation.&lt;/p&gt;
    &lt;p&gt;Does this lead to languages bloating over time?&lt;/p&gt;
    &lt;h3&gt;In Teaching&lt;/h3&gt;
    &lt;p&gt;I find that when I teach language workshops I have to actively work against Stroustrup's Rule. The terse notation that's easiest for &lt;em&gt;me&lt;/em&gt; to read is bad for beginners, who need the explicit syntax that I find grating.&lt;/p&gt;
    &lt;p&gt;One good example is type invariants in TLA+. Say you have a set of workers, and each worker has a counter. Here's two ways to say that every worker's counter is a non-negative integer:&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;\* Bad
    \A w \in Workers: counter[w] &gt;= 0
    
    \* Good
    counter \in [Workers -&gt; Nat]
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;The first way literally tests that for every worker, &lt;code&gt;counter[w]&lt;/code&gt; is non-negative. The second way tests that the &lt;code&gt;counter&lt;/code&gt; mapping as a whole is an element of the appropriate "function set": the set of all functions from workers to natural numbers.&lt;/p&gt;
    &lt;p&gt;The function set approach is terser, more elegant, and preferred by TLA+ experts. But I teach the "bad" way because it makes more sense to beginners.&lt;/p&gt;
    &lt;div class="footnote"&gt;
    &lt;hr/&gt;
    &lt;ol&gt;
    &lt;li id="fn:timing"&gt;
    &lt;p&gt;Starts minute 23. &lt;a class="footnote-backref" href="#fnref:timing" title="Jump back to footnote 1 in the text"&gt;↩&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;/ol&gt;
    &lt;/div&gt;</description>
                <pubDate>Wed, 11 Dec 2024 17:32:53 +0000</pubDate>
                <guid>https://buttondown.com/hillelwayne/archive/stroustrups-rule/</guid>
            </item>
            <item>
                <title>Hyperproperties</title>
                <link>https://buttondown.com/hillelwayne/archive/hyperproperties/</link>
                <description>&lt;p&gt;I wrote about &lt;a href="https://hillelwayne.com/post/hyperproperties/" target="_blank"&gt;hyperproperties on my blog&lt;/a&gt; four years ago, but now an intriguing client problem got me thinking about them again.&lt;sup id="fnref:client"&gt;&lt;a class="footnote-ref" href="#fn:client"&gt;1&lt;/a&gt;&lt;/sup&gt;&lt;/p&gt;
    &lt;p&gt;We're using TLA+ to model a system that starts in state A, and under certain complicated conditions &lt;code&gt;P&lt;/code&gt;, transitions to state B. They also had a flag &lt;code&gt;f&lt;/code&gt; that, when set, used a different complicated condition &lt;code&gt;Q&lt;/code&gt; to check the transitions. As a quick &lt;a href="https://www.hillelwayne.com/post/decision-tables/" target="_blank"&gt;decision table&lt;/a&gt; (from state &lt;code&gt;A&lt;/code&gt;):&lt;/p&gt;
    &lt;table&gt;
    &lt;thead&gt;
    &lt;tr&gt;
    &lt;th&gt;f&lt;/th&gt;
    &lt;th&gt;P&lt;/th&gt;
    &lt;th&gt;Q&lt;/th&gt;
    &lt;th&gt;state'&lt;/th&gt;
    &lt;/tr&gt;
    &lt;/thead&gt;
    &lt;tbody&gt;
    &lt;tr&gt;
    &lt;td&gt;F&lt;/td&gt;
    &lt;td&gt;F&lt;/td&gt;
    &lt;td&gt;-&lt;/td&gt;
    &lt;td&gt;A&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
    &lt;td&gt;F&lt;/td&gt;
    &lt;td&gt;T&lt;/td&gt;
    &lt;td&gt;-&lt;/td&gt;
    &lt;td&gt;B&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
    &lt;td&gt;T&lt;/td&gt;
    &lt;td&gt;F&lt;/td&gt;
    &lt;td&gt;F&lt;/td&gt;
    &lt;td&gt;A&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
    &lt;td&gt;T&lt;/td&gt;
    &lt;td&gt;F&lt;/td&gt;
    &lt;td&gt;T&lt;/td&gt;
    &lt;td&gt;B&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
    &lt;td&gt;T&lt;/td&gt;
    &lt;td&gt;T&lt;/td&gt;
    &lt;td&gt;F&lt;/td&gt;
    &lt;td&gt;&lt;strong&gt;impossible&lt;/strong&gt;&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
    &lt;td&gt;T&lt;/td&gt;
    &lt;td&gt;T&lt;/td&gt;
    &lt;td&gt;T&lt;/td&gt;
    &lt;td&gt;B&lt;/td&gt;
    &lt;/tr&gt;
    &lt;/tbody&gt;
    &lt;/table&gt;
    &lt;p&gt;The interesting bit is the second-to-last row: Q has to be &lt;em&gt;strictly&lt;/em&gt; more permissive than P. The client wanted to verify the property that "the system more aggressively transitions when &lt;code&gt;f&lt;/code&gt; is set", i.e. there is no case where the machine transitions &lt;em&gt;only if &lt;code&gt;f&lt;/code&gt; is false&lt;/em&gt;.&lt;/p&gt;
    &lt;p&gt;&lt;a href="https://www.hillelwayne.com/post/safety-and-liveness/" target="_blank"&gt;Regular system properties&lt;/a&gt; are specified over states in a single sequence of states (behaviors). &lt;strong&gt;Hyperproperties&lt;/strong&gt; can hold over &lt;em&gt;sets&lt;/em&gt; of sequences of states. Here the hyperproperties are:&lt;/p&gt;
    &lt;blockquote&gt;
    &lt;ol&gt;
    &lt;li&gt;For any two states X and Y in separate behaviors, if the only difference in variable-state between X and Y is that &lt;code&gt;X.f = TRUE&lt;/code&gt;, then whenever Y transitions to B, so does X.&lt;/li&gt;
    &lt;li&gt;There is at least one such case where X transitions and Y does not.&lt;/li&gt;
    &lt;/ol&gt;
    &lt;/blockquote&gt;
    &lt;p&gt;That's pretty convoluted, which is par for the course with hyperproperties! It makes a little more sense if you have all of the domain knowledge and specifics. &lt;/p&gt;
    &lt;p&gt;The key thing that makes this a hyperproperty is that you can't &lt;em&gt;just&lt;/em&gt; look at individual behaviors to verify it. Imagine if, when &lt;code&gt;f&lt;/code&gt; is true, we &lt;em&gt;never&lt;/em&gt; transition to state B. Is that a violation of (1)? Not if we never transition when &lt;code&gt;f&lt;/code&gt; is false either! To prove a violation, you need to find a behavior where &lt;code&gt;f&lt;/code&gt; is false &lt;em&gt;and&lt;/em&gt; the state is otherwise the same &lt;em&gt;and&lt;/em&gt; we transition to B anyway.&lt;/p&gt;
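    &lt;p&gt;To make the pairing concrete, here is a brute-force Python sketch of the two hyperproperties over a tiny state space. Everything in it is hypothetical: &lt;code&gt;P&lt;/code&gt; and &lt;code&gt;Q&lt;/code&gt; are made-up stand-ins for the client's conditions, and the "rest of the state" is just two small integers. The point is only that each check compares a state with &lt;code&gt;f&lt;/code&gt; set against its twin with &lt;code&gt;f&lt;/code&gt; unset.&lt;/p&gt;

```python
from itertools import product


def P(x: int, y: int):
    """Hypothetical stand-in for the client's condition P."""
    return x + y == 4


def Q(x: int, y: int):
    """Hypothetical stand-in for condition Q, meant to be weaker than P."""
    return x + y in (3, 4)


def transitions(f: bool, x: int, y: int):
    """Does the system move from A to B in this variable-state?"""
    return Q(x, y) if f else P(x, y)


def check_more_aggressive(domain):
    """Compare paired states that differ only in the flag f."""
    pairs = list(product(domain, repeat=2))
    # (1) Whenever the f=FALSE state transitions, so does its f=TRUE twin.
    never_less = all(
        transitions(True, x, y) for x, y in pairs if transitions(False, x, y)
    )
    # (2) Somewhere the f=TRUE state transitions and its twin does not.
    sometimes_more = any(
        transitions(True, x, y) and not transitions(False, x, y) for x, y in pairs
    )
    return never_less, sometimes_more


print(check_more_aggressive(range(4)))
```

    &lt;p&gt;Neither check can be answered by looking at a single call to &lt;code&gt;transitions&lt;/code&gt;; both need the two twins side by side.&lt;/p&gt;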
    &lt;h4&gt;Aside: states in states in states&lt;/h4&gt;
    &lt;p&gt;I dislike how "state" refers to three things:&lt;/p&gt;
    &lt;ol&gt;
    &lt;li&gt;The high-level "transition state" of a state-machine&lt;/li&gt;
    &lt;li&gt;A single point in time of a system (the "state space")&lt;/li&gt;
    &lt;li&gt;The mutable data inside your system's &lt;a href="https://www.hillelwayne.com/post/world-vs-machine/" target="_blank"&gt;machine&lt;/a&gt;.&lt;/li&gt;
    &lt;/ol&gt;
    &lt;p&gt;These are all "close" to each other but &lt;em&gt;just&lt;/em&gt; different enough to make conversations confusing. Software is pretty bad about reusing colloquial words like this; don't even get me &lt;em&gt;started&lt;/em&gt; on the word "design".&lt;/p&gt;
    &lt;h3&gt;There's a reason we don't talk about hyperproperties&lt;/h3&gt;
    &lt;p&gt;Or three reasons. First of all, hyperproperties make up a &lt;em&gt;vanishingly small&lt;/em&gt; percentage of the stuff in a system we care about. We only got to "&lt;code&gt;f&lt;/code&gt; makes the system more aggressive" after checking at least a dozen other simpler and &lt;em&gt;more important&lt;/em&gt; not-hyper properties.&lt;/p&gt;
    &lt;p&gt;Second, &lt;em&gt;most&lt;/em&gt; formal specification languages can't express hyperproperties, and the ones that can are all academic research projects. Modeling systems is hard enough without a generalized behavior notation!&lt;/p&gt;
    &lt;p&gt;Third, hyperproperties are astoundingly expensive to check. As an informal estimate, for a state space of size &lt;code&gt;N&lt;/code&gt;, regular properties are checked across &lt;code&gt;N&lt;/code&gt; individual states and 2-behavior hyperproperties (2-props) are checked across &lt;code&gt;N²&lt;/code&gt; pairs. So for a small state space of just a million states, the 2-prop needs to be checked across a &lt;em&gt;trillion&lt;/em&gt; pairs. &lt;/p&gt;
    &lt;p&gt;These problems don't apply to "hyperproperties" of functions, just systems. Functions have a lot of interesting hyperproperties, there's an easy way to represent them (call the function twice in a test), and quadratic scaling isn't so bad if you're only testing 100 inputs or so. That's why so-called &lt;a href="https://www.hillelwayne.com/post/metamorphic-testing/" target="_blank"&gt;metamorphic testing&lt;/a&gt; of functions can be useful.&lt;/p&gt;
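    &lt;p&gt;As a rough illustration of "call the function twice in a test", here is a minimal Python sketch of one metamorphic check. The function under test (&lt;code&gt;total_price&lt;/code&gt;) and the checker name are hypothetical, chosen only to show the shape of such a test.&lt;/p&gt;

```python
import random

def total_price(items):
    """Hypothetical function under test: sum of (price, quantity) lines."""
    return sum(price * qty for price, qty in items)

def check_order_independence(fn, items, trials=100, seed=0):
    """Metamorphic check: permuting the input must not change the output."""
    rng = random.Random(seed)
    baseline = fn(items)          # first call
    for _ in range(trials):
        shuffled = items[:]
        rng.shuffle(shuffled)
        if fn(shuffled) != baseline:   # second call, compared to the first
            return False
    return True

cart = [(5, 2), (3, 1), (10, 4)]
print(check_order_independence(total_price, cart))  # True
```

    &lt;p&gt;The property relates two runs of the same function, so no single input/output pair could confirm or refute it on its own.&lt;/p&gt;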
    &lt;h3&gt;Checking Hyperproperties Anyway&lt;/h3&gt;
    &lt;p&gt;If we &lt;em&gt;do&lt;/em&gt; need to check a hyperproperty, there's a few ways we can approach it. &lt;/p&gt;
    &lt;p&gt;The easiest way is to cheat and find a regular prop that implies the hyperproperty. In the client's case, we can abstract &lt;code&gt;P&lt;/code&gt; and &lt;code&gt;Q&lt;/code&gt; into pure functions and then test that there's no input where &lt;code&gt;P&lt;/code&gt; is true and &lt;code&gt;Q&lt;/code&gt; is false. In TLA+, this would look something like&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;\* TLA+
    QLooserThanP ==
      \A i1 \in InputSet1, i2 \in Set2: \* ...
        P(i1, i2, …) =&gt; Q(i1, i2, …)
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;Of course we can't always encapsulate this way, and this can't catch bugs like "we accidentally use &lt;code&gt;P&lt;/code&gt; even if &lt;code&gt;f&lt;/code&gt; is true". But it gets the job done.&lt;/p&gt;
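    &lt;p&gt;The same "cheat" can be sketched outside TLA+ as a brute-force check over a finite input set. In this Python version, &lt;code&gt;P&lt;/code&gt;, &lt;code&gt;Q&lt;/code&gt;, and the input domains are hypothetical stand-ins, since the real guards aren't shown here.&lt;/p&gt;

```python
from itertools import product

# Hypothetical stand-ins for the abstracted guards; the real P and Q
# belong to the client's spec and are not shown in this post.
def P(temp, pressure):
    return temp == "high" and pressure == "high"

def Q(temp, pressure):
    return temp == "high" or pressure == "high"

TEMPS = ["low", "high"]
PRESSURES = ["low", "high"]
INPUTS = list(product(TEMPS, PRESSURES))

def q_looser_than_p(inputs):
    """No input where P holds and Q does not (P implies Q everywhere)."""
    return all(Q(*i) for i in inputs if P(*i))

def q_strictly_looser(inputs):
    """Additionally, some input where Q holds and P does not."""
    return q_looser_than_p(inputs) and any(Q(*i) and not P(*i) for i in inputs)

print(q_looser_than_p(INPUTS), q_strictly_looser(INPUTS))  # True True
```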
    &lt;p&gt;Another way is something I talked about in the &lt;a href="https://hillelwayne.com/post/hyperproperties/" target="_blank"&gt;original hyperproperty post&lt;/a&gt;: lifting specs into hyperspecs. We create a new spec that initializes two copies of our main spec, runs them in parallel, and then compares their behaviors. See the post for an example. Writing a hyperspec keeps us entirely in TLA+ but takes a lot of work and is &lt;em&gt;very&lt;/em&gt; expensive to check. Depending on the property we want to check, we can sometimes find simple optimizations.&lt;/p&gt;
    &lt;p&gt;The last way is something &lt;a href="https://hillelwayne.com/post/graphing-tla/" target="_blank"&gt;I explored last year&lt;/a&gt;: dump the state graph to disk and treat the hyperproperty as a graph property. In this case, the graph property would be something like &lt;/p&gt;
    &lt;blockquote&gt;
    &lt;p&gt;Find all graph edges representing an A → B transition. Take all the source nodes of each where &lt;code&gt;f = false&lt;/code&gt;. For each such source node, find the corresponding node that's identical except for &lt;code&gt;f = true&lt;/code&gt;. That node should be the source of an A → B edge.&lt;/p&gt;
    &lt;/blockquote&gt;
    &lt;p&gt;Upside is you don't have to make any changes to the original spec. Downside is you have to use another programming language for analysis. Also, &lt;a href="https://hillelwayne.com/post/graph-types/" target="_blank"&gt;analyzing graphs is terrible&lt;/a&gt;. But I think this is overall the most robust approach to handling hyperproperties, to be used when "cheating" fails.&lt;/p&gt;
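    &lt;p&gt;A minimal Python sketch of that graph property, assuming the state graph has already been parsed into &lt;code&gt;(source, target)&lt;/code&gt; edges. The node shape &lt;code&gt;(mode, f, load)&lt;/code&gt; and the edge list are invented for illustration; a real dump would need its own parsing step.&lt;/p&gt;

```python
# Hypothetical state-graph dump: each edge is (source, target), and each
# node is a tuple (mode, f, load). A real dump would need parsing first.
EDGES = [
    (("A", False, 3), ("B", False, 3)),
    (("A", True, 3), ("B", True, 3)),
    (("A", True, 2), ("B", True, 2)),
    (("A", False, 2), ("A", False, 3)),
]

def is_a_to_b(edge):
    source, target = edge
    return source[0] == "A" and target[0] == "B"

def violations(edges):
    """Sources with f false that go A to B while their f-true twin does not."""
    a_to_b_sources = {src for src, dst in edges if is_a_to_b((src, dst))}
    bad = []
    for mode, f, load in sorted(a_to_b_sources):
        if f:
            continue
        twin = (mode, True, load)   # identical node except f is true
        if twin not in a_to_b_sources:
            bad.append((mode, f, load))
    return bad

print(violations(EDGES))  # []
```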
    &lt;hr/&gt;
    &lt;p&gt;What fascinates me most about this is the four-year gap between "I learned and wrote about hyperproperties" and "I have to deal with hyperproperties in my job." This is one reason learning for the sake of learning can have a lot of long-term benefits.&lt;/p&gt;
    &lt;hr/&gt;
    &lt;h3&gt;Blog Rec&lt;/h3&gt;
    &lt;p&gt;This week's rec is &lt;a href="https://robertheaton.com/" target="_blank"&gt;Robert Heaton&lt;/a&gt;. It's a "general interest" software engineering blog with a focus on math, algorithms, and security. Some of my favorites:&lt;/p&gt;
    &lt;ul&gt;
    &lt;li&gt;&lt;a href="https://robertheaton.com/preventing-impossible-game-levels-using-cryptography/" target="_blank"&gt;Preventing impossible game levels using cryptography&lt;/a&gt; and the whole "Steve Steveington" series&lt;/li&gt;
    &lt;li&gt;&lt;a href="https://robertheaton.com/2019/06/24/i-was-7-words-away-from-being-spear-phished/" target="_blank"&gt;I was 7 words away from being spear-phished&lt;/a&gt; is a great deep dive into one targeted scam&lt;/li&gt;
    &lt;li&gt;&lt;a href="https://robertheaton.com/2019/02/24/making-peace-with-simpsons-paradox/" target="_blank"&gt;Making peace with Simpson's Paradox&lt;/a&gt; is the best explanation of Simpson's Paradox I've ever read.&lt;/li&gt;
    &lt;/ul&gt;
    &lt;p&gt;Other good ones are &lt;a href="https://robertheaton.com/pyskywifi/" target="_blank"&gt;PySkyWiFi: completely free, unbelievably stupid wi-fi on long-haul flights&lt;/a&gt; and &lt;a href="https://robertheaton.com/interview/" target="_blank"&gt;How to pass a coding interview with me&lt;/a&gt;. The guy's got &lt;em&gt;breadth&lt;/em&gt;.&lt;/p&gt;
    &lt;div class="footnote"&gt;
    &lt;hr/&gt;
    &lt;ol&gt;
    &lt;li id="fn:client"&gt;
    &lt;p&gt;I do formal methods consulting btw. &lt;a href="https://www.hillelwayne.com/consulting/" target="_blank"&gt;Hire me!&lt;/a&gt; &lt;a class="footnote-backref" href="#fnref:client" title="Jump back to footnote 1 in the text"&gt;↩&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;/ol&gt;
    &lt;/div&gt;</description>
                <pubDate>Tue, 19 Nov 2024 19:34:54 +0000</pubDate>
                <guid>https://buttondown.com/hillelwayne/archive/hyperproperties/</guid>
            </item>
            <item>
                <title>Five Unusual Raku Features</title>
                <link>https://buttondown.com/hillelwayne/archive/five-unusual-raku-features/</link>
                <description>&lt;h3&gt;&lt;a href="https://leanpub.com/logic/" target="_blank"&gt;&lt;em&gt;Logic for Programmers&lt;/em&gt;&lt;/a&gt; is now in Beta!&lt;/h3&gt;
    &lt;p&gt;&lt;a href="https://leanpub.com/logic/" target="_blank"&gt;v0.5 marks the official end of alpha&lt;/a&gt;! With the new version, all of the content I wanted to put in the book is now present, and all that's left is copyediting, proofreading, and formatting. Which will probably take as long as it took to actually write the book. You can see the release notes in the footnote.&lt;sup id="fnref:release-notes"&gt;&lt;a class="footnote-ref" href="#fn:release-notes"&gt;1&lt;/a&gt;&lt;/sup&gt;&lt;/p&gt;
    &lt;p&gt;And I've got a snazzy new cover:&lt;/p&gt;
    &lt;p&gt;&lt;img alt="The logic for programmers cover, a 40x zoom of a bird feather" class="newsletter-image" src="https://assets.buttondown.email/images/26c75f1e-e60a-4328-96e5-9878d96d3e53.png?w=960&amp;fit=max"/&gt;&lt;/p&gt;
    &lt;p&gt;(I don't actually like the cover that much but it &lt;em&gt;looks&lt;/em&gt; official enough until I can pay an actual cover designer.)&lt;/p&gt;
    &lt;h1&gt;"Five" Unusual Raku Features&lt;/h1&gt;
    &lt;p&gt;Last year I started learning Raku, and the sheer bizarreness of the language left me describing it as &lt;a href="https://buttondown.com/hillelwayne/archive/raku-a-language-for-gremlins/" target="_blank"&gt;a language for gremlins&lt;/a&gt;. Now that I've used it in anger for over a year, I have a better way of describing it:&lt;/p&gt;
    &lt;blockquote&gt;
    &lt;p&gt;Raku is a laboratory for language features.&lt;/p&gt;
    &lt;/blockquote&gt;
    &lt;p&gt;This is why it has &lt;a href="https://docs.raku.org/language/concurrency" target="_blank"&gt;five different models of concurrency&lt;/a&gt; and eighteen ways of doing anything else, because the point is to &lt;em&gt;see&lt;/em&gt; what happens. It also explains why many of the features interact so strangely and why there's all that odd edge-case behavior. Getting 100 experiments polished and playing nicely with each other is much harder than running 100 experiments; we can sort out the polish &lt;em&gt;after&lt;/em&gt; we figure out which ideas are good ones.&lt;/p&gt;
    &lt;p&gt;So here are "five" Raku experiments you could imagine seeing in another programming language. If you squint.&lt;/p&gt;
    &lt;h3&gt;&lt;a href="https://docs.raku.org/type/Junction" target="_blank"&gt;Junctions&lt;/a&gt;&lt;/h3&gt;
    &lt;p&gt;Junctions are "superpositions of possible values". Applying an operation to a junction instead applies it to every value inside the junction.  &lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="o"&gt;|&lt;/span&gt;&lt;span class="mi"&gt;10&lt;/span&gt;
    &lt;span class="nb"&gt;any&lt;/span&gt;(&lt;span class="mi"&gt;2&lt;/span&gt;, &lt;span class="mi"&gt;10&lt;/span&gt;)
    
    &gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="nv"&gt;&amp;10&lt;/span&gt; + &lt;span class="mi"&gt;3&lt;/span&gt;
    &lt;span class="nb"&gt;all&lt;/span&gt;(&lt;span class="mi"&gt;5&lt;/span&gt;, &lt;span class="mi"&gt;13&lt;/span&gt;)
    
    &gt;(&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="nv"&gt;&amp;2&lt;/span&gt;) + (&lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="o"&gt;^&lt;/span&gt;&lt;span class="mi"&gt;20&lt;/span&gt;)
    &lt;span class="nb"&gt;all&lt;/span&gt;(&lt;span class="nb"&gt;one&lt;/span&gt;(&lt;span class="mi"&gt;11&lt;/span&gt;, &lt;span class="mi"&gt;21&lt;/span&gt;), &lt;span class="nb"&gt;one&lt;/span&gt;(&lt;span class="mi"&gt;12&lt;/span&gt;, &lt;span class="mi"&gt;22&lt;/span&gt;))
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;As you can probably tell from the &lt;code&gt;all&lt;/code&gt;s and &lt;code&gt;any&lt;/code&gt;s, junctions are a feature meant for representing boolean formulas. There's no way to destructure a junction, and the only way to use it is to collapse it to a boolean first.&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&gt; (&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="nv"&gt;&amp;2&lt;/span&gt;) + (&lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="o"&gt;^&lt;/span&gt;&lt;span class="mi"&gt;20&lt;/span&gt;) &lt; &lt;span class="mi"&gt;15&lt;/span&gt;
    &lt;span class="nb"&gt;all&lt;/span&gt;(&lt;span class="nb"&gt;one&lt;/span&gt;(&lt;span class="nb"&gt;True&lt;/span&gt;, &lt;span class="nb"&gt;False&lt;/span&gt;), &lt;span class="nb"&gt;one&lt;/span&gt;(&lt;span class="nb"&gt;True&lt;/span&gt;, &lt;span class="nb"&gt;False&lt;/span&gt;))
    
    &lt;span class="c1"&gt;# so coerces junctions to booleans&lt;/span&gt;
    &gt; &lt;span class="nb"&gt;so&lt;/span&gt; (&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="nv"&gt;&amp;2&lt;/span&gt;) + (&lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="o"&gt;^&lt;/span&gt;&lt;span class="mi"&gt;20&lt;/span&gt;) &lt; &lt;span class="mi"&gt;15&lt;/span&gt;
    &lt;span class="nb"&gt;True&lt;/span&gt;
    
    &gt; &lt;span class="nb"&gt;so&lt;/span&gt; (&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="nv"&gt;&amp;2&lt;/span&gt;) + (&lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="o"&gt;^&lt;/span&gt;&lt;span class="mi"&gt;20&lt;/span&gt;) &gt; &lt;span class="mi"&gt;0&lt;/span&gt;
    &lt;span class="nb"&gt;False&lt;/span&gt;
    
    &gt; &lt;span class="mi"&gt;16&lt;/span&gt; %% (&lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="nv"&gt;&amp;5&lt;/span&gt;) ?? &lt;span class="s"&gt;"fizzbuzz"&lt;/span&gt; !! *
    *
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;The real interesting thing for me is how Raku elegantly uses junctions to represent quantifiers. In most languages, you either have the function &lt;code&gt;all(list[T], T -&gt; bool)&lt;/code&gt; or the method &lt;code&gt;[T].all(T -&gt; bool)&lt;/code&gt;, both of which apply the test to every element of the list. In Raku, though, &lt;code&gt;list.all&lt;/code&gt; doesn't take &lt;em&gt;anything&lt;/em&gt;, it's just a niladic method that turns the list into a junction. &lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&gt; &lt;span class="k"&gt;my&lt;/span&gt; &lt;span class="nv"&gt;$x&lt;/span&gt; = &lt;span class="s"&gt;&lt;1 2 3&gt;&lt;/span&gt;.&lt;span class="nb"&gt;all&lt;/span&gt;
    &lt;span class="nb"&gt;all&lt;/span&gt;(&lt;span class="mi"&gt;1&lt;/span&gt;, &lt;span class="mi"&gt;2&lt;/span&gt;, &lt;span class="mi"&gt;3&lt;/span&gt;)
    &gt; &lt;span class="nb"&gt;is-prime&lt;/span&gt;(&lt;span class="nv"&gt;$x&lt;/span&gt;)
    &lt;span class="nb"&gt;all&lt;/span&gt;(&lt;span class="nb"&gt;False&lt;/span&gt;, &lt;span class="nb"&gt;True&lt;/span&gt;, &lt;span class="nb"&gt;True&lt;/span&gt;)
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;This means we can combine junctions. If Raku didn't already have a &lt;code&gt;unique&lt;/code&gt; method, we could build it by saying "are all elements equal to exactly one element?"&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&gt; &lt;span class="nb"&gt;so&lt;/span&gt; {.&lt;span class="nb"&gt;all&lt;/span&gt; == .&lt;span class="nb"&gt;one&lt;/span&gt;}(&lt;span class="s"&gt;&lt;1 2 3 7&gt;&lt;/span&gt;)
    &lt;span class="nb"&gt;True&lt;/span&gt;
    
    &gt; &lt;span class="nb"&gt;so&lt;/span&gt; {.&lt;span class="nb"&gt;all&lt;/span&gt; == .&lt;span class="nb"&gt;one&lt;/span&gt;}(&lt;span class="s"&gt;&lt;1 2 3 7 2&gt;&lt;/span&gt;)
    &lt;span class="nb"&gt;False&lt;/span&gt;
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
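    &lt;p&gt;For comparison, a rough Python equivalent of "all elements equal exactly one element". Python has no junctions, so the two quantifiers have to be spelled out as nested loops; the function name is made up for this sketch.&lt;/p&gt;

```python
def all_equal_exactly_one(items):
    """Every element x is equal to exactly one element of the list (itself)."""
    return all(sum(1 for y in items if x == y) == 1 for x in items)

print(all_equal_exactly_one([1, 2, 3, 7]))     # True
print(all_equal_exactly_one([1, 2, 3, 7, 2]))  # False
```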
    &lt;h3&gt;&lt;a href="https://docs.raku.org/type/Whatever" target="_blank"&gt;Whatevers&lt;/a&gt;&lt;/h3&gt;
    &lt;p&gt;&lt;code&gt;*&lt;/code&gt; is the "whatever" symbol and has a lot of different roles in Raku.&lt;sup id="fnref:analogs"&gt;&lt;a class="footnote-ref" href="#fn:analogs"&gt;2&lt;/a&gt;&lt;/sup&gt; Some functions and operators have special behavior when passed a &lt;code&gt;*&lt;/code&gt;. In a range or sequence, &lt;code&gt;*&lt;/code&gt; means "unbounded".&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&gt; &lt;span class="mi"&gt;1&lt;/span&gt;..*
    &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="o"&gt;..&lt;/span&gt;&lt;span class="n"&gt;Inf&lt;/span&gt;
    
    &gt; (&lt;span class="mi"&gt;2&lt;/span&gt;,&lt;span class="mi"&gt;4&lt;/span&gt;,&lt;span class="mi"&gt;8&lt;/span&gt;...*)[&lt;span class="mi"&gt;17&lt;/span&gt;]
    &lt;span class="mi"&gt;262144&lt;/span&gt;
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;The main built-in use, though, is that expressions with &lt;code&gt;*&lt;/code&gt; are lifted into anonymous functions. This is called "whatever-priming" and produces a &lt;code&gt;WhateverCode&lt;/code&gt;, which is indistinguishable from other functions except for the type.&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&gt; {&lt;span class="nv"&gt;$_&lt;/span&gt; + &lt;span class="mi"&gt;10&lt;/span&gt;}(&lt;span class="mi"&gt;2&lt;/span&gt;)
    &lt;span class="mi"&gt;12&lt;/span&gt;
    
    &gt; (* + &lt;span class="mi"&gt;10&lt;/span&gt;)(&lt;span class="mi"&gt;2&lt;/span&gt;)
    &lt;span class="mi"&gt;12&lt;/span&gt;
    
    &gt; (^&lt;span class="mi"&gt;10&lt;/span&gt;).&lt;span class="n"&gt;map&lt;/span&gt;(* % &lt;span class="mi"&gt;2&lt;/span&gt;)
    (&lt;span class="mi"&gt;0&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;)
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;There's actually a bit of weird behavior here: if &lt;em&gt;two&lt;/em&gt; whatevers appear in the expression, they become separate positional variables. &lt;code&gt;(2, 30, 4, 50).map(* + *)&lt;/code&gt; returns &lt;code&gt;(32, 54)&lt;/code&gt;. This makes it easy to express &lt;a href="https://docs.raku.org/language/operators#infix_..." target="_blank"&gt;a tricky Fibonacci definition&lt;/a&gt; but otherwise I don't see how it's better than making each &lt;code&gt;*&lt;/code&gt; the same value.&lt;/p&gt;
    &lt;p&gt;Regardless, priming is useful because &lt;em&gt;so many&lt;/em&gt; Raku methods are overloaded to take functions. You get the last element of a list with &lt;code&gt;l[*-1]&lt;/code&gt;. This &lt;em&gt;looks&lt;/em&gt; like standard negative-index syntax, but what actually happens is that when &lt;code&gt;[]&lt;/code&gt; is passed a function, it passes in list length and looks up the result. So if the list has 10 elements, &lt;code&gt;l[*-1] = l[10-1] = l[9]&lt;/code&gt;, aka the last element. Similarly, &lt;code&gt;l.head(2)&lt;/code&gt; is the first two elements of a list, &lt;code&gt;l.head(*-2)&lt;/code&gt; is all-but-the-last-two.&lt;/p&gt;
    &lt;p&gt;We can pass other functions to &lt;code&gt;[]&lt;/code&gt;, which e.g. makes implementing ring buffers easy.&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&gt; &lt;span class="k"&gt;my&lt;/span&gt; &lt;span class="nv"&gt;@x&lt;/span&gt; = ^&lt;span class="mi"&gt;10&lt;/span&gt;
    [&lt;span class="mi"&gt;0&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt; &lt;span class="mi"&gt;4&lt;/span&gt; &lt;span class="mi"&gt;5&lt;/span&gt; &lt;span class="mi"&gt;6&lt;/span&gt; &lt;span class="mi"&gt;7&lt;/span&gt; &lt;span class="mi"&gt;8&lt;/span&gt; &lt;span class="mi"&gt;9&lt;/span&gt;]
    
    &gt; &lt;span class="nv"&gt;@x&lt;/span&gt;[&lt;span class="mi"&gt;95&lt;/span&gt; % *]--; &lt;span class="nv"&gt;@x&lt;/span&gt;
    [&lt;span class="mi"&gt;0&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt; &lt;span class="mi"&gt;4&lt;/span&gt; &lt;span class="mi"&gt;4&lt;/span&gt; &lt;span class="mi"&gt;6&lt;/span&gt; &lt;span class="mi"&gt;7&lt;/span&gt; &lt;span class="mi"&gt;8&lt;/span&gt; &lt;span class="mi"&gt;9&lt;/span&gt;]
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
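    &lt;p&gt;To make the "index with a function of the length" idea concrete, here is a small Python sketch. &lt;code&gt;FnIndexList&lt;/code&gt; is a hypothetical helper written for this example, not anything built into Python or Raku.&lt;/p&gt;

```python
class FnIndexList(list):
    """List whose [] also accepts a function of the list length (hypothetical helper)."""

    def _resolve(self, index):
        # A callable index is handed the length and must return a real index.
        return index(len(self)) if callable(index) else index

    def __getitem__(self, index):
        return super().__getitem__(self._resolve(index))

    def __setitem__(self, index, value):
        super().__setitem__(self._resolve(index), value)

xs = FnIndexList(range(10))
last = xs[lambda n: n - 1]     # like l[*-1]
xs[lambda n: 95 % n] -= 1      # like @x[95 % *]--
print(last, list(xs))          # 9 [0, 1, 2, 3, 4, 4, 6, 7, 8, 9]
```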
    &lt;h3&gt;&lt;a href="https://docs.raku.org/language/regexes" target="_blank"&gt;Regular Expressions&lt;/a&gt;&lt;/h3&gt;
    &lt;p&gt;There are two basic standards for regexes: POSIX regexes and Perl-compatible regexes (PCRE). POSIX regexes are a terrible mess of backslashes and punctuation. PCRE is backwards compatible with POSIX and is a more terrible mess of backslashes and punctuation. Most languages follow the PCRE standard, but Raku (formerly Perl 6) breaks backwards compatibility with an entirely new regex syntax. &lt;/p&gt;
    &lt;p&gt;The most obvious improvement: &lt;a href="https://docs.raku.org/language/regexes#Subrules" target="_blank"&gt;composability&lt;/a&gt;. In most languages you "combine" two regexes by concatenating their strings together, which is terrible for many, many reasons. Raku has the standard "embed another regex" syntax: &lt;code&gt;/&lt; foo &gt;+/&lt;/code&gt; matches one-or-more of the &lt;code&gt;foo&lt;/code&gt; regex without &lt;code&gt;foo&lt;/code&gt; "leaking" into the top regex. &lt;/p&gt;
    &lt;p&gt;This already does a lot to make regexes more tractable: you can break a complicated regular expression down into simpler and more legible parts. And in fact this is how Raku supports &lt;a href="https://docs.raku.org/language/grammars" target="_blank"&gt;parsing grammars&lt;/a&gt; as a builtin language feature. I've only used grammars once but it &lt;a href="https://www.hillelwayne.com/post/picat/" target="_blank"&gt;was quite helpful&lt;/a&gt;.&lt;/p&gt;
    &lt;p&gt;Since we're breaking backwards compatibility anyway, we can now add lots of small QOLs. There's a &lt;a href="https://docs.raku.org/language/regexes#Modified_quantifier:_%,_%%" target="_blank"&gt;value separator&lt;/a&gt; modifier: &lt;code&gt;\d+ % ','&lt;/code&gt; matches &lt;code&gt;1&lt;/code&gt; / &lt;code&gt;1,2&lt;/code&gt; / &lt;code&gt;1,1,4&lt;/code&gt; but not &lt;code&gt;1,&lt;/code&gt; or &lt;code&gt;12&lt;/code&gt;. &lt;a href="https://docs.raku.org/language/regexes#Lookaround_assertions" target="_blank"&gt;Lookaheads&lt;/a&gt; and non-capturing groups aren't nonsense glyphs. &lt;code&gt;r1 &amp;&amp; r2&lt;/code&gt; only matches strings that match &lt;em&gt;both&lt;/em&gt; &lt;code&gt;r1&lt;/code&gt; and &lt;code&gt;r2&lt;/code&gt;. Backtracking can be stopped with &lt;a href="https://docs.raku.org/language/regexes#Preventing_backtracking:_:" target="_blank"&gt;:&lt;/a&gt;. Whitespace is ignored by default and has to be explicitly enabled in match patterns.&lt;/p&gt;
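    &lt;p&gt;For a sense of what the separator modifier saves, here is roughly the same pattern hand-expanded with Python's &lt;code&gt;re&lt;/code&gt; module. The &lt;code&gt;%&lt;/code&gt; modifies the quantifier, so &lt;code&gt;\d+ % ','&lt;/code&gt; reads as "one or more &lt;code&gt;\d&lt;/code&gt;, separated by commas". The &lt;code&gt;separated&lt;/code&gt; helper is a made-up name for this sketch.&lt;/p&gt;

```python
import re

def separated(item, sep):
    """Pattern for one or more `item`s with the literal `sep` between them."""
    return re.compile(f"(?:{item})(?:{re.escape(sep)}(?:{item}))*")

digits_by_comma = separated(r"\d", ",")

for text in ["1", "1,2", "1,1,4", "1,", "12"]:
    print(text, bool(digits_by_comma.fullmatch(text)))
# 1 True
# 1,2 True
# 1,1,4 True
# 1, False
# 12 False
```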
    &lt;p&gt;There's more stuff Raku does with actually &lt;em&gt;processing&lt;/em&gt; regular expressions, but the regex notation is something that might actually appear in another language someday. &lt;/p&gt;
    &lt;p style="height:16px; margin:0px !important;"&gt;&lt;/p&gt;
    &lt;h3&gt;&lt;a href="https://docs.raku.org/language/operators#Hyper_operators" target="_blank"&gt;Hyperoperators&lt;/a&gt;&lt;/h3&gt;
    &lt;p&gt;This is a small one compared to the other features, but it's also the thing I miss most often in other languages. The most basic form &lt;code&gt;l&gt;&gt;.method&lt;/code&gt; is basically equivalent to &lt;code&gt;map&lt;/code&gt;, except it also recursively descends into sublists.&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&gt; [&lt;span class="mi"&gt;1&lt;/span&gt;, [&lt;span class="mi"&gt;2&lt;/span&gt;, &lt;span class="mi"&gt;3&lt;/span&gt;], &lt;span class="mi"&gt;4&lt;/span&gt;]&gt;&gt;.&lt;span class="nb"&gt;succ&lt;/span&gt;
    [&lt;span class="mi"&gt;2&lt;/span&gt; [&lt;span class="mi"&gt;3&lt;/span&gt; &lt;span class="mi"&gt;4&lt;/span&gt;] &lt;span class="mi"&gt;5&lt;/span&gt;]
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;This is more useful than it looks because any function call &lt;code&gt;f(list, *args)&lt;/code&gt; can be rewritten in "method form" &lt;code&gt;list.&amp;f(*args)&lt;/code&gt;, so &lt;code&gt;&gt;&gt;.&lt;/code&gt; becomes the generalized mapping operator. You can use it with whatevers, too.&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&gt; [&lt;span class="mi"&gt;1&lt;/span&gt;, [&lt;span class="mi"&gt;2&lt;/span&gt;, &lt;span class="mi"&gt;3&lt;/span&gt;], &lt;span class="mi"&gt;4&lt;/span&gt;]&gt;&gt;.&amp;(*+&lt;span class="mi"&gt;1&lt;/span&gt;)
    [&lt;span class="mi"&gt;2&lt;/span&gt; [&lt;span class="mi"&gt;3&lt;/span&gt; &lt;span class="mi"&gt;4&lt;/span&gt;] &lt;span class="mi"&gt;5&lt;/span&gt;]
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;Anyway, the more generalized &lt;em&gt;binary&lt;/em&gt; hyperoperator &lt;code&gt;l1 &lt;&lt; op &gt;&gt; l2&lt;/code&gt;&lt;sup id="fnref:spaces"&gt;&lt;a class="footnote-ref" href="#fn:spaces"&gt;3&lt;/a&gt;&lt;/sup&gt; applies &lt;code&gt;op&lt;/code&gt; elementwise to the two lists, looping the shorter list until the longer list is exhausted. &lt;code&gt;&gt;&gt;op&gt;&gt;&lt;/code&gt; / &lt;code&gt;&lt;&lt; op&lt;&lt;&lt;/code&gt; are the same except they instead loop until the lhs/rhs list is exhausted. Whew!&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&gt; [&lt;span class="mi"&gt;1&lt;/span&gt;, &lt;span class="mi"&gt;2&lt;/span&gt;, &lt;span class="mi"&gt;3&lt;/span&gt;, &lt;span class="mi"&gt;4&lt;/span&gt;, &lt;span class="mi"&gt;5&lt;/span&gt;] &lt;span class="s"&gt;&lt;&lt;+&gt;&lt;/span&gt;&gt; [&lt;span class="mi"&gt;10&lt;/span&gt;, &lt;span class="mi"&gt;20&lt;/span&gt;]
    [&lt;span class="mi"&gt;11&lt;/span&gt; &lt;span class="mi"&gt;22&lt;/span&gt; &lt;span class="mi"&gt;13&lt;/span&gt; &lt;span class="mi"&gt;24&lt;/span&gt; &lt;span class="mi"&gt;15&lt;/span&gt;]
    
    &gt; [&lt;span class="mi"&gt;1&lt;/span&gt;, &lt;span class="mi"&gt;2&lt;/span&gt;, &lt;span class="mi"&gt;3&lt;/span&gt;, &lt;span class="mi"&gt;4&lt;/span&gt;, &lt;span class="mi"&gt;5&lt;/span&gt;] &lt;span class="s"&gt;&lt;&lt;+&lt;&lt; [10, 20]&lt;/span&gt;
    &lt;span class="s"&gt;[11 22]&lt;/span&gt;
    
    &lt;span class="s"&gt;&gt; [1, 2, 3, 4, 5] &gt;&gt;&lt;/span&gt;+&gt;&gt; [&lt;span class="mi"&gt;10&lt;/span&gt;, &lt;span class="mi"&gt;20&lt;/span&gt;]
    [&lt;span class="mi"&gt;11&lt;/span&gt; &lt;span class="mi"&gt;22&lt;/span&gt; &lt;span class="mi"&gt;13&lt;/span&gt; &lt;span class="mi"&gt;24&lt;/span&gt; &lt;span class="mi"&gt;15&lt;/span&gt;]
    
    &lt;span class="c1"&gt;# Also works with single values&lt;/span&gt;
    &gt; [&lt;span class="mi"&gt;1&lt;/span&gt;, &lt;span class="mi"&gt;2&lt;/span&gt;, &lt;span class="mi"&gt;3&lt;/span&gt;, &lt;span class="mi"&gt;4&lt;/span&gt;, &lt;span class="mi"&gt;5&lt;/span&gt;] &lt;span class="s"&gt;&lt;&lt;+&gt;&lt;/span&gt;&gt; &lt;span class="mi"&gt;10&lt;/span&gt;
    [&lt;span class="mi"&gt;11&lt;/span&gt; &lt;span class="mi"&gt;12&lt;/span&gt; &lt;span class="mi"&gt;13&lt;/span&gt; &lt;span class="mi"&gt;14&lt;/span&gt; &lt;span class="mi"&gt;15&lt;/span&gt;]
    
    &lt;span class="c1"&gt;# Does weird things with nested lists too&lt;/span&gt;
    &gt; [&lt;span class="mi"&gt;1&lt;/span&gt;, [&lt;span class="mi"&gt;2&lt;/span&gt;, &lt;span class="mi"&gt;3&lt;/span&gt;], &lt;span class="mi"&gt;4&lt;/span&gt;, &lt;span class="mi"&gt;5&lt;/span&gt;] &lt;span class="s"&gt;&lt;&lt;+&gt;&lt;/span&gt;&gt; [&lt;span class="mi"&gt;10&lt;/span&gt;, &lt;span class="mi"&gt;20&lt;/span&gt;]
    [&lt;span class="mi"&gt;11&lt;/span&gt; [&lt;span class="mi"&gt;22&lt;/span&gt; &lt;span class="mi"&gt;23&lt;/span&gt;] &lt;span class="mi"&gt;14&lt;/span&gt; &lt;span class="mi"&gt;25&lt;/span&gt;]
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
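    &lt;p&gt;A rough Python sketch of the "loop the shorter list, descend into sublists" behavior shown above. It only aims to reproduce these examples for the both-sides-flexible form; it is not a faithful model of Raku's full hyperoperator semantics, and &lt;code&gt;hyper&lt;/code&gt; is a made-up function name.&lt;/p&gt;

```python
from itertools import cycle, islice
import operator

def hyper(op, left, right):
    """Apply op elementwise, cycling the shorter side and recursing into sublists."""
    left_is_list = isinstance(left, list)
    right_is_list = isinstance(right, list)
    if not left_is_list and not right_is_list:
        return op(left, right)
    if not left_is_list:
        return [hyper(op, left, r) for r in right]
    if not right_is_list:
        return [hyper(op, l, right) for l in left]
    size = max(len(left), len(right))
    ls = islice(cycle(left), size)    # repeat the shorter list as needed
    rs = islice(cycle(right), size)
    return [hyper(op, l, r) for l, r in zip(ls, rs)]

print(hyper(operator.add, [1, 2, 3, 4, 5], [10, 20]))    # [11, 22, 13, 24, 15]
print(hyper(operator.add, [1, 2, 3, 4, 5], 10))          # [11, 12, 13, 14, 15]
print(hyper(operator.add, [1, [2, 3], 4, 5], [10, 20]))  # [11, [22, 23], 14, 25]
```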
    &lt;p&gt;Also for some reason the hyperoperators have separate behaviors on two hashes, either applying &lt;code&gt;op&lt;/code&gt; to the union/intersection/hash difference. &lt;/p&gt;
    &lt;p&gt;Anyway it's a super weird (meta)operator but it's also quite useful! It's the closest thing I've seen to &lt;a href="https://hillelwayne.com/post/j-notation/" target="_blank"&gt;J verbs&lt;/a&gt; outside an APL. I like using it to run the same formula on multiple possible inputs at once.&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;(&lt;span class="mi"&gt;20&lt;/span&gt; * &lt;span class="mi"&gt;10&lt;/span&gt; &lt;span class="s"&gt;&lt;&lt;-&gt;&lt;/span&gt;&gt; (&lt;span class="mi"&gt;21&lt;/span&gt;, &lt;span class="mi"&gt;24&lt;/span&gt;)) &lt;span class="s"&gt;&lt;&lt;*&gt;&lt;/span&gt;&gt; (&lt;span class="mi"&gt;10&lt;/span&gt;, &lt;span class="mi"&gt;100&lt;/span&gt;)
    (&lt;span class="mi"&gt;1790&lt;/span&gt; &lt;span class="mi"&gt;17600&lt;/span&gt;)
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;Incidentally, it's called the hyperoperator because it evaluates all of the operations in parallel. Explicit loops can be parallelized by prefixing them with &lt;a href="https://docs.raku.org/language/statement-prefixes#hyper,_race" target="_blank"&gt;&lt;code&gt;hyper&lt;/code&gt;&lt;/a&gt;.&lt;/p&gt;
    &lt;h3&gt;&lt;a href="https://docs.raku.org/type/Pair" target="_blank"&gt;Pair Syntax&lt;/a&gt;&lt;/h3&gt;
    &lt;p&gt;I've talked about pairs a little in &lt;a href="https://buttondown.com/hillelwayne/archive/unusual-basis-types-in-programming-languages/" target="_blank"&gt;this newsletter&lt;/a&gt;, but the gist is that Raku hashes are composed of a set of pairs &lt;code&gt;key =&gt; value&lt;/code&gt;. The pair is the basis type, the hash is the collection of pairs. There's also a &lt;em&gt;ton&lt;/em&gt; of syntactic sugar for concisely specifying pairs via "colon syntax":&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&gt; &lt;span class="k"&gt;my&lt;/span&gt; &lt;span class="nv"&gt;$x&lt;/span&gt; = &lt;span class="mi"&gt;3&lt;/span&gt;; :&lt;span class="nv"&gt;$x&lt;/span&gt;
    &lt;span class="nb"&gt;x&lt;/span&gt; =&gt; &lt;span class="mi"&gt;3&lt;/span&gt;
    
    &gt; :&lt;span class="n"&gt;a&lt;/span&gt;&lt;span class="s"&gt;&lt;$x&gt;&lt;/span&gt;
    &lt;span class="n"&gt;a&lt;/span&gt; =&gt; &lt;span class="s"&gt;"$x"&lt;/span&gt;
    
    &gt; :&lt;span class="n"&gt;a&lt;/span&gt;(&lt;span class="nv"&gt;$x&lt;/span&gt;)
    &lt;span class="n"&gt;a&lt;/span&gt; =&gt; &lt;span class="mi"&gt;3&lt;/span&gt;
    
    &gt; :&lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="n"&gt;a&lt;/span&gt;
    &lt;span class="n"&gt;a&lt;/span&gt; =&gt; &lt;span class="mi"&gt;3&lt;/span&gt;
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;The most important sugars are &lt;code&gt;:key&lt;/code&gt; and &lt;code&gt;:!key&lt;/code&gt;, which map to &lt;code&gt;key =&gt; True&lt;/code&gt; and &lt;code&gt;key =&gt; False&lt;/code&gt;. This is a really elegant way to add flags to a method! Take the definition of &lt;a href="https://docs.raku.org/type/Str#method_match" target="_blank"&gt;match&lt;/a&gt;:&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="k"&gt;method&lt;/span&gt; &lt;span class="nb"&gt;match&lt;/span&gt;(&lt;span class="nv"&gt;$pat&lt;/span&gt;, 
        :&lt;span class="n"&gt;continue&lt;/span&gt;(:&lt;span class="nv"&gt;$c&lt;/span&gt;), :&lt;span class="n"&gt;pos&lt;/span&gt;(:&lt;span class="nv"&gt;$p&lt;/span&gt;), :&lt;span class="n"&gt;global&lt;/span&gt;(:&lt;span class="nv"&gt;$g&lt;/span&gt;), 
        :&lt;span class="n"&gt;overlap&lt;/span&gt;(:&lt;span class="nv"&gt;$ov&lt;/span&gt;), :&lt;span class="n"&gt;exhaustive&lt;/span&gt;(:&lt;span class="nv"&gt;$ex&lt;/span&gt;), 
        :&lt;span class="n"&gt;st&lt;/span&gt;(:&lt;span class="nv"&gt;$nd&lt;/span&gt;), :&lt;span class="n"&gt;rd&lt;/span&gt;(:&lt;span class="nv"&gt;$th&lt;/span&gt;), :&lt;span class="nv"&gt;$nth&lt;/span&gt;, :&lt;span class="nv"&gt;$x&lt;/span&gt; --&gt; &lt;span class="nb"&gt;Match&lt;/span&gt;)
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;Probably should also mention that in a definition, &lt;code&gt;:f(:$foo)&lt;/code&gt; defines the parameter &lt;code&gt;$foo&lt;/code&gt; but &lt;a href="https://docs.raku.org/language/signatures#Argument_aliases" target="_blank"&gt;also aliases it&lt;/a&gt; to &lt;code&gt;:f&lt;/code&gt;, so you can set the flag with &lt;code&gt;:f&lt;/code&gt; or &lt;code&gt;:foo&lt;/code&gt;. Colon-pairs defined in the signature can be passed in anywhere, or even stuck together:&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&gt; &lt;span class="s"&gt;"abab"&lt;/span&gt;.&lt;span class="nb"&gt;match&lt;/span&gt;(&lt;span class="sr"&gt;/../&lt;/span&gt;)
    「&lt;span class="n"&gt;ab&lt;/span&gt;」
    &gt; &lt;span class="s"&gt;"abab"&lt;/span&gt;.&lt;span class="nb"&gt;match&lt;/span&gt;(&lt;span class="sr"&gt;/../&lt;/span&gt;, :&lt;span class="n"&gt;g&lt;/span&gt;)
    (「&lt;span class="n"&gt;ab&lt;/span&gt;」 「&lt;span class="n"&gt;ab&lt;/span&gt;」)
    &gt; &lt;span class="s"&gt;"abab"&lt;/span&gt;.&lt;span class="nb"&gt;match&lt;/span&gt;(&lt;span class="sr"&gt;/../&lt;/span&gt;, :&lt;span class="n"&gt;g&lt;/span&gt;, :&lt;span class="n"&gt;ov&lt;/span&gt;)
    (「&lt;span class="n"&gt;ab&lt;/span&gt;」 「&lt;span class="n"&gt;ba&lt;/span&gt;」 「&lt;span class="n"&gt;ab&lt;/span&gt;」)
    
    &lt;span class="c1"&gt;# Out of order stuck together&lt;/span&gt;
    &gt; &lt;span class="s"&gt;"abab"&lt;/span&gt;.&lt;span class="nb"&gt;match&lt;/span&gt;(:&lt;span class="n"&gt;g:ov&lt;/span&gt;,&lt;span class="sr"&gt; /../&lt;/span&gt;)
    (「&lt;span class="n"&gt;ab&lt;/span&gt;」 「&lt;span class="n"&gt;ba&lt;/span&gt;」 「&lt;span class="n"&gt;ab&lt;/span&gt;」)
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;So that leads to extremely concise method configuration. Definitely beats &lt;code&gt;match(global=True, overlap=True)&lt;/code&gt;!&lt;/p&gt;
    &lt;p&gt;And for some reason you can place keyword arguments &lt;em&gt;after&lt;/em&gt; the function call:&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&gt; &lt;span class="s"&gt;"abab"&lt;/span&gt;.&lt;span class="nb"&gt;match&lt;/span&gt;(:&lt;span class="n"&gt;g&lt;/span&gt;,&lt;span class="sr"&gt; /../&lt;/span&gt;):&lt;span class="n"&gt;ov:2nd&lt;/span&gt;
    「&lt;span class="n"&gt;ba&lt;/span&gt;」
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;h2&gt;The next-gen lab: Slangs and RakuAST&lt;/h2&gt;
    &lt;p&gt;These are features I have no experience in and &lt;em&gt;certainly&lt;/em&gt; are not making their way into other languages, but they really expand the explorable space of new features. &lt;a href="https://raku.land/zef:lizmat/Slangify" target="_blank"&gt;Slangs&lt;/a&gt; are modifications to the Raku syntax. This can be used for things like &lt;a href="https://raku.land/zef:elcaro/Slang::Otherwise" target="_blank"&gt;modifying loop syntax&lt;/a&gt;, &lt;a href="https://raku.land/zef:raku-community-modules/Slang::Piersing" target="_blank"&gt;changing identifiers&lt;/a&gt;, or adding &lt;a href="https://raku.land/zef:raku-community-modules/OO::Actors" target="_blank"&gt;actors&lt;/a&gt; or &lt;a href="https://raku.land/github:MattOates/BioInfo" target="_blank"&gt;DNA sequences&lt;/a&gt; to the base language.&lt;/p&gt;
    &lt;p&gt;I &lt;em&gt;barely&lt;/em&gt; understand &lt;a href="https://dev.to/lizmat/rakuast-for-early-adopters-576n" target="_blank"&gt;RakuAST&lt;/a&gt;. I &lt;em&gt;think&lt;/em&gt; the idea is that all Raku expressions can be parsed as an AST from inside Raku itself.&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&gt; &lt;span class="s"&gt;Q/my $x; $x++/&lt;/span&gt;.&lt;span class="nb"&gt;AST&lt;/span&gt;
    &lt;span class="n"&gt;RakuAST::StatementList&lt;/span&gt;.&lt;span class="nb"&gt;new&lt;/span&gt;(
      &lt;span class="n"&gt;RakuAST::Statement::Expression&lt;/span&gt;.&lt;span class="nb"&gt;new&lt;/span&gt;(
        &lt;span class="n"&gt;expression&lt;/span&gt; =&gt; &lt;span class="n"&gt;RakuAST::VarDeclaration::Simple&lt;/span&gt;.&lt;span class="nb"&gt;new&lt;/span&gt;(
          &lt;span class="nb"&gt;sigil&lt;/span&gt;       =&gt; &lt;span class="s"&gt;"\$"&lt;/span&gt;,
          &lt;span class="n"&gt;desigilname&lt;/span&gt; =&gt; &lt;span class="n"&gt;RakuAST::Name&lt;/span&gt;.&lt;span class="n"&gt;from-identifier&lt;/span&gt;(&lt;span class="s"&gt;"x"&lt;/span&gt;)
        )
      ),
      &lt;span class="n"&gt;RakuAST::Statement::Expression&lt;/span&gt;.&lt;span class="nb"&gt;new&lt;/span&gt;(
        &lt;span class="n"&gt;expression&lt;/span&gt; =&gt; &lt;span class="n"&gt;RakuAST::ApplyPostfix&lt;/span&gt;.&lt;span class="nb"&gt;new&lt;/span&gt;(
          &lt;span class="n"&gt;operand&lt;/span&gt; =&gt; &lt;span class="n"&gt;RakuAST::Var::Lexical&lt;/span&gt;.&lt;span class="nb"&gt;new&lt;/span&gt;(&lt;span class="s"&gt;"\$x"&lt;/span&gt;),
          &lt;span class="nb"&gt;postfix&lt;/span&gt; =&gt; &lt;span class="n"&gt;RakuAST::Postfix&lt;/span&gt;.&lt;span class="nb"&gt;new&lt;/span&gt;(&lt;span class="s"&gt;"++"&lt;/span&gt;)
        )
      )
    )
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;This allows for things like writing Raku in different languages:&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="nb"&gt;say&lt;/span&gt; &lt;span class="s"&gt;Q/my $x; put $x/&lt;/span&gt;.&lt;span class="nb"&gt;AST&lt;/span&gt;.&lt;span class="n"&gt;DEPARSE&lt;/span&gt;(&lt;span class="s"&gt;"NL"&lt;/span&gt;)
    &lt;span class="n"&gt;mijn&lt;/span&gt; &lt;span class="nv"&gt;$x&lt;/span&gt;;
    &lt;span class="n"&gt;zeg-het&lt;/span&gt; &lt;span class="nv"&gt;$x&lt;/span&gt;
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;h3&gt;Bonus experiment&lt;/h3&gt;
    &lt;p&gt;Raku comes with a "&lt;a href="https://rakudo.org/star" target="_blank"&gt;Rakudo Star&lt;/a&gt;" installation, which comes with a set of &lt;a href="https://github.com/rakudo/star/blob/master/etc/modules.txt" target="_blank"&gt;blessed third party modules&lt;/a&gt; preinstalled. I love this! It's a great compromise between the maintainer burdens of a large standard library and the user burdens of making everybody find the right packages in the ecosystem.&lt;/p&gt;
    &lt;hr/&gt;
    &lt;h2&gt;Blog Rec&lt;/h2&gt;
    &lt;p&gt;Feel obligated to recommend some Raku blogs! Elizabeth Mattijsen posts &lt;a href="https://dev.to/lizmat" target="_blank"&gt;a ton of stuff&lt;/a&gt; to dev.to about Raku internals. &lt;a href="https://www.codesections.com/blog/" target="_blank"&gt;Codesections&lt;/a&gt; has a pretty good blog; he's the person who eventually got me to try out Raku. Finally, the &lt;a href="https://raku-advent.blog/" target="_blank"&gt;Raku Advent Calendar&lt;/a&gt; is a great dive into advanced Raku techniques. Bad news is it only updates once a year, good news is it's 25 updates that once a year.&lt;/p&gt;
    &lt;div class="footnote"&gt;
    &lt;hr/&gt;
    &lt;ol&gt;
    &lt;li id="fn:release-notes"&gt;
    &lt;ul&gt;
    &lt;li&gt;All techniques chapters now have a "Further Reading" section&lt;/li&gt;
    &lt;li&gt;"System modeling" chapter significantly rewritten&lt;/li&gt;
    &lt;li&gt;"Conditionals" chapter expanded, now a real chapter&lt;/li&gt;
    &lt;li&gt;"Logic Programming" chapter now covers datalog, deductive databases&lt;/li&gt;
    &lt;li&gt;"Solvers" chapter has diagram explaining problem&lt;/li&gt;
    &lt;li&gt;Eight new exercises&lt;/li&gt;
    &lt;li&gt;Tentative front cover (will probably change)&lt;/li&gt;
    &lt;li&gt;Fixed some epub issues with math rendering&lt;/li&gt;
    &lt;/ul&gt;
    &lt;p&gt;&lt;a class="footnote-backref" href="#fnref:release-notes" title="Jump back to footnote 1 in the text"&gt;↩&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id="fn:analogs"&gt;
    &lt;p&gt;Analogues are &lt;a href="https://stackoverflow.com/questions/8000903/what-are-all-the-uses-of-an-underscore-in-scala/8001065#8001065" target="_blank"&gt;Scala's underscore&lt;/a&gt;, except unlike Scala it's a value and not syntax, and Python's &lt;a href="https://docs.python.org/3/library/constants.html#Ellipsis" target="_blank"&gt;Ellipsis&lt;/a&gt;, except it has additional semantics. &lt;a class="footnote-backref" href="#fnref:analogs" title="Jump back to footnote 2 in the text"&gt;↩&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id="fn:spaces"&gt;
    &lt;p&gt;Spaces added so buttondown doesn't think they're tags &lt;a class="footnote-backref" href="#fnref:spaces" title="Jump back to footnote 3 in the text"&gt;↩&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;/ol&gt;
    &lt;/div&gt;</description>
                <pubDate>Tue, 12 Nov 2024 20:06:55 +0000</pubDate>
                <guid>https://buttondown.com/hillelwayne/archive/five-unusual-raku-features/</guid>
            </item>
            <item>
                <title>A list of ternary operators</title>
                <link>https://buttondown.com/hillelwayne/archive/a-list-of-ternary-operators/</link>
                <description>&lt;p&gt;Sup nerds, I'm back from SREcon! I had a blast, despite knowing nothing about site reliability engineering and being way over my head in half the talks. I'm trying to catch up on &lt;a href="https://leanpub.com/logic/" target="_blank"&gt;The Book&lt;/a&gt; and contract work now so I'll do something silly here: ternary operators.&lt;/p&gt;
    &lt;p&gt;Almost all operations on values in programming languages fall into one of three buckets: &lt;/p&gt;
    &lt;ol&gt;
    &lt;li&gt;&lt;strong&gt;Unary operators&lt;/strong&gt;, where the operator goes &lt;em&gt;before&lt;/em&gt; or &lt;em&gt;after&lt;/em&gt; exactly one argument. Examples are &lt;code&gt;x++&lt;/code&gt; and &lt;code&gt;-y&lt;/code&gt; and &lt;code&gt;!bool&lt;/code&gt;. Most languages have a few critical unary operators hardcoded into the grammar. They are almost always symbols, but sometimes are string-identifiers (&lt;code&gt;not&lt;/code&gt;).&lt;/li&gt;
    &lt;li&gt;&lt;strong&gt;Binary operators&lt;/strong&gt;, which are placed &lt;em&gt;between&lt;/em&gt; exactly two arguments. Things like &lt;code&gt;+&lt;/code&gt; or &lt;code&gt;&amp;&amp;&lt;/code&gt; or &lt;code&gt;&gt;=&lt;/code&gt;. Languages have a lot more of these than unary operators, because there's more fundamental things we want to do with two values than one value. These can be symbols or identifiers (&lt;code&gt;and&lt;/code&gt;).&lt;/li&gt;
    &lt;li&gt;Functions/methods that &lt;em&gt;prefix&lt;/em&gt; any number of arguments. &lt;code&gt;func(a, b, c)&lt;/code&gt;, &lt;code&gt;obj.method(a, b, c, d)&lt;/code&gt;, anything in a lisp. These are how we extend the language, and they almost-exclusively use identifiers and not symbols.&lt;sup id="fnref:lisp"&gt;&lt;a class="footnote-ref" href="#fn:lisp"&gt;1&lt;/a&gt;&lt;/sup&gt;&lt;/li&gt;
    &lt;/ol&gt;
    &lt;p&gt;There's one widespread exception to this categorization: the &lt;strong&gt;ternary operator&lt;/strong&gt; &lt;code&gt;bool ? x : y&lt;/code&gt;.&lt;sup id="fnref:ternary"&gt;&lt;a class="footnote-ref" href="#fn:ternary"&gt;2&lt;/a&gt;&lt;/sup&gt; It's an infix operator that takes exactly &lt;em&gt;three&lt;/em&gt; arguments and can't be decomposed into two sequential binary operators. &lt;code&gt;bool ? x&lt;/code&gt; makes no sense on its own, nor does &lt;code&gt;x : y&lt;/code&gt;. &lt;/p&gt;
    &lt;p&gt;Other ternary operators are &lt;em&gt;extremely&lt;/em&gt; rare, which is why conditional expressions got to monopolize the name "ternary". But I like how exceptional they are and want to compile some of them. A long long time ago I asked &lt;a href="https://twitter.com/hillelogram/status/1378509881498603527" target="_blank"&gt;Twitter&lt;/a&gt; for other ternary operators; this is a compilation of some applicable responses plus my own research.&lt;/p&gt;
    &lt;p&gt;(Most of these are a &lt;em&gt;bit&lt;/em&gt; of a stretch.)&lt;/p&gt;
    &lt;h3&gt;Stepped Ranges&lt;/h3&gt;
    &lt;p&gt;Many languages have some kind of "stepped range" function:&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="c1"&gt;# Python&lt;/span&gt;
    &lt;span class="o"&gt;&gt;&gt;&gt;&lt;/span&gt; &lt;span class="nb"&gt;list&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
    &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;7&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;9&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;There's the "base case" of start and endpoints, and an optional step. Many languages have a binary infix op for the base case, but a few also have a ternary for the optional step:&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;# Frink
    &gt; map[{|a| a*2}, (1 to 100 step 15) ] 
    [2, 32, 62, 92, 122, 152, 182]
    
    # Elixir
    &gt; IO.puts Enum.join(1..10//2, " ")
    1 3 5 7 9
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;This isn't decomposable into two binary ops because you can't assign the range to a value and then step the value later.&lt;/p&gt;
    &lt;h3&gt;Graph ops&lt;/h3&gt;
    &lt;p&gt;In &lt;a href="https://graphviz.org/" target="_blank"&gt;Graphviz&lt;/a&gt;, a basic edge between two nodes is either the binary &lt;code&gt;node1 -&gt; node2&lt;/code&gt; or the ternary &lt;code&gt;node1 -&gt; node2 [edge_props]&lt;/code&gt;:&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="k"&gt;digraph&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nt"&gt;G&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="nt"&gt;a1&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;-&gt;&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nt"&gt;a2&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="na"&gt;color&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"green"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;&lt;img alt="Output of the above graphviz" class="newsletter-image" src="https://assets.buttondown.email/images/d1a0f894-59d5-45d3-8702-967e94672371.png?w=960&amp;fit=max"/&gt;&lt;/p&gt;
    &lt;p&gt;Graphs seem ternary-friendly because there are three elements involved with any graph connection: the two nodes and the connecting edge. So you also see ternaries in some graph database query languages, with separate places to specify each node and the edge.&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;# GSQL (https://docs.tigergraph.com/gsql-ref/4.1/tutorials/gsql-101/parameterized-gsql-query)
    SELECT tgt
        FROM start:s -(Friendship:e)- Person:tgt;
    
    # Cypher (https://neo4j.com/docs/cypher-manual/current/introduction/cypher-overview/)
    MATCH (actor:Actor)-[:ACTED_IN]-&gt;(movie:Movie {title: 'The Matrix'})
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;Obligatory plug for my &lt;a href="https://www.hillelwayne.com/post/graph-types/" target="_blank"&gt;graph datatype essay&lt;/a&gt;.&lt;/p&gt;
    &lt;h3&gt;Metaoperators&lt;/h3&gt;
    &lt;p&gt;Both &lt;a href="https://raku.org/" target="_blank"&gt;Raku&lt;/a&gt; and &lt;a href="https://www.jsoftware.com/#/README" target="_blank"&gt;J&lt;/a&gt; have special higher-order functions that apply to binary infixes. Raku calls them &lt;em&gt;metaoperators&lt;/em&gt;, while J calls them &lt;em&gt;adverbs&lt;/em&gt; and &lt;em&gt;conjunctions&lt;/em&gt;.&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="c1"&gt;# Raku&lt;/span&gt;
    
    &lt;span class="c1"&gt;# `a «op» b` is map, "cycling" shorter list&lt;/span&gt;
    &lt;span class="nb"&gt;say&lt;/span&gt; &lt;span class="s"&gt;&lt;10 20 30&gt;&lt;/span&gt; «+» &lt;span class="s"&gt;&lt;4 5&gt;&lt;/span&gt;
    (&lt;span class="mi"&gt;14&lt;/span&gt; &lt;span class="mi"&gt;25&lt;/span&gt; &lt;span class="mi"&gt;34&lt;/span&gt;)
    
    &lt;span class="c1"&gt;# `a Rop b` is `b op a`&lt;/span&gt;
    &lt;span class="nb"&gt;say&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt; &lt;span class="n"&gt;R-&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt;
    &lt;span class="mi"&gt;1&lt;/span&gt;
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="c1"&gt;NB. J&lt;/span&gt;
    
    &lt;span class="c1"&gt;NB. x f/ y creates a "table" of x f y&lt;/span&gt;
    &lt;span class="w"&gt;   &lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;+/&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;20&lt;/span&gt;
    &lt;span class="mi"&gt;11&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;21&lt;/span&gt;
    &lt;span class="mi"&gt;12&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;22&lt;/span&gt;
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;The Raku metaoperators are closer to what I'm looking for, since I don't think you can assign the "created operator" directly to a callable variable. J lets you, though!&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="w"&gt;   &lt;/span&gt;&lt;span class="nv"&gt;h&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;=:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;+/&lt;/span&gt;
    &lt;span class="w"&gt;   &lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;h&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;4&lt;/span&gt;
    &lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;5&lt;/span&gt;
    &lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;6&lt;/span&gt;
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;That said, J has some "decomposable" ternaries that feel &lt;em&gt;spiritually&lt;/em&gt; like ternaries, like &lt;a href="https://code.jsoftware.com/wiki/Vocabulary/curlyrt#dyadic" target="_blank"&gt;amend&lt;/a&gt; and &lt;a href="https://code.jsoftware.com/wiki/Vocabulary/fcap" target="_blank"&gt;fold&lt;/a&gt;. It also has a special ternary-ish construct called the "fork".&lt;sup id="fnref:ternaryish"&gt;&lt;a class="footnote-ref" href="#fn:ternaryish"&gt;3&lt;/a&gt;&lt;/sup&gt; &lt;code&gt;x (f g h) y&lt;/code&gt; is parsed as &lt;code&gt;(x f y) g (x h y)&lt;/code&gt;:&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="c1"&gt;NB. Max - min&lt;/span&gt;
    &lt;span class="w"&gt;   &lt;/span&gt;&lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;&gt;.&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;&lt;.&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;
    &lt;span class="mi"&gt;3&lt;/span&gt;
    &lt;span class="w"&gt;   &lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;&gt;.&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;&lt;.&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;5&lt;/span&gt;
    &lt;span class="mi"&gt;3&lt;/span&gt;
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;So at the top level that's just a binary operator, but the binary op is constructed via a ternary op. That's pretty cool IMO.&lt;/p&gt;
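    &lt;p&gt;If you don't read J, here's a rough Python sketch of what the dyadic fork does. The &lt;code&gt;fork&lt;/code&gt; helper and the &lt;code&gt;spread&lt;/code&gt; name are made up for illustration; this only imitates the three-verb case and says nothing about how J actually implements trains.&lt;/p&gt;

```python
import operator


def fork(f, g, h):
    """Build a binary function that computes g(f(x, y), h(x, y))."""
    def combined(x, y):
        return g(f(x, y), h(x, y))
    return combined


# Max minus min, mirroring the J fork in the example above.
spread = fork(max, operator.sub, min)

print(spread(5, 2))  # 3
print(spread(2, 5))  # 3
```

    &lt;p&gt;The three arguments to &lt;code&gt;fork&lt;/code&gt; are all functions, and the result is an ordinary two-argument function, which is the "binary op constructed via a ternary op" shape.&lt;/p&gt;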
    &lt;h3&gt;Assignment Ternaries&lt;/h3&gt;
    &lt;p&gt;Bob Nystrom points out that in many languages, &lt;code&gt;a[b] = c&lt;/code&gt; is a ternary operation: it is &lt;em&gt;not&lt;/em&gt; the same as &lt;code&gt;x = a[b]; x = c&lt;/code&gt;.&lt;/p&gt;
    &lt;p&gt;A weirder case shows up in &lt;a href="https://github.com/betaveros/noulith/" target="_blank"&gt;Noulith&lt;/a&gt; and Raku (again): update operators. Most languages have the &lt;code&gt;+=&lt;/code&gt; &lt;em&gt;binary operator&lt;/em&gt;, these two have the &lt;code&gt;f=&lt;/code&gt; &lt;em&gt;ternary operator&lt;/em&gt;. &lt;code&gt;a f= b&lt;/code&gt; is the same as &lt;code&gt;a = f(a, b)&lt;/code&gt;.&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="c1"&gt;# Raku&lt;/span&gt;
    &gt; &lt;span class="k"&gt;my&lt;/span&gt; &lt;span class="nv"&gt;$x&lt;/span&gt; = &lt;span class="mi"&gt;2&lt;/span&gt;; &lt;span class="nv"&gt;$x&lt;/span&gt; &lt;span class="nb"&gt;max&lt;/span&gt;= &lt;span class="mi"&gt;3&lt;/span&gt;; &lt;span class="nb"&gt;say&lt;/span&gt; &lt;span class="nv"&gt;$x&lt;/span&gt;
    &lt;span class="mi"&gt;3&lt;/span&gt;
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;Arguably this is just syntactic sugar, but I don't think it's decomposable into binary operations.&lt;/p&gt;
    &lt;h3&gt;Custom user ternaries&lt;/h3&gt;
    &lt;p&gt;Tikhon Jelvis pointed out that &lt;a href="https://agda.readthedocs.io/en/v2.7.0.1/language/mixfix-operators.html" target="_blank"&gt;Agda&lt;/a&gt;  lets you define &lt;em&gt;custom&lt;/em&gt; mixfix operators, which can be ternary or even tetranary or pentanary. I later found out that &lt;a href="https://docs.racket-lang.org/mixfix/index.html" target="_blank"&gt;Racket&lt;/a&gt; has this, too. &lt;a href="https://developer.apple.com/library/archive/documentation/Cocoa/Conceptual/ProgrammingWithObjectiveC/Introduction/Introduction.html" target="_blank"&gt;Objective-C&lt;/a&gt; &lt;em&gt;looks&lt;/em&gt; like this, too, but feels different somehow. &lt;/p&gt;
    &lt;h3&gt;Near Misses&lt;/h3&gt;
    &lt;p&gt;All of these are arguable, I've just got to draw a line in the sand &lt;em&gt;somewhere&lt;/em&gt;.&lt;/p&gt;
    &lt;ul&gt;
    &lt;li&gt;Regular expression substitutions: &lt;code&gt;s/from/to/flags&lt;/code&gt; seems like a ternary, but I'd argue it's a datatype constructor, not an expression operator.&lt;/li&gt;
    &lt;li&gt;Comprehensions like &lt;code&gt;[x + 1 | x &lt;- list]&lt;/code&gt;: looks like the ternary &lt;code&gt;[expr1 | expr2 &lt;- expr3]&lt;/code&gt;, but &lt;code&gt;expr2&lt;/code&gt; is only binding a name. Arguably a ternary if you can map &lt;em&gt;and filter&lt;/em&gt; in the same expression a la Python or Haskell, but should that be considered sugar for&lt;/li&gt;
    &lt;li&gt;Python's operator chaining (&lt;code&gt;1 &lt; x &lt; 5&lt;/code&gt;): syntactic sugar for &lt;code&gt;1 &lt; x and x &lt; 5&lt;/code&gt;.&lt;/li&gt;
    &lt;li&gt;Someone suggested &lt;a href="https://stackoverflow.com/questions/7251772/what-exactly-constitutes-swizzling-in-opengl-es-2-0-powervr-sgx-specifically" target="_blank"&gt;glsl swizzles&lt;/a&gt;, which are very cool but binary operators.&lt;/li&gt;
    &lt;/ul&gt;
    &lt;h2&gt;Why are ternaries so rare?&lt;/h2&gt;
    &lt;p&gt;Ternaries are &lt;em&gt;somewhat&lt;/em&gt; more common in math and physics, f.ex in integrals and sums. That's because they were historically done on paper, where you have a 2D canvas, so you can do stuff like this easily:&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;10
    Σ    n
    n=0
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;We express the ternary by putting arguments above and below the operator. All mainstream programming languages are linear, though, so any given symbol has only two sides. Plus functions are more regular and universal than infix operators so you might as well write &lt;code&gt;Sum(n=0, 10, n)&lt;/code&gt;. The conditional ternary slips through purely because it's just so darn useful. Though now I'm wondering where it comes from in the first place. Different newsletter, maybe.&lt;/p&gt;
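    &lt;p&gt;For a concrete version of that function form, here's a minimal Python sketch. The name &lt;code&gt;summation&lt;/code&gt; and its argument order (lower bound, upper bound, then the term as a function of &lt;code&gt;n&lt;/code&gt;) are my own assumptions, and both bounds are treated as inclusive to match the sigma notation.&lt;/p&gt;

```python
def summation(lower, upper, term):
    """Sum term(n) for n from lower to upper, inclusive on both ends."""
    return sum(term(n) for n in range(lower, upper + 1))


# The sum from the sigma example: n for n = 0..10.
print(summation(0, 10, lambda n: n))  # 55
```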
    &lt;p&gt;But I still find ternary operators super interesting, please let me know if you know any I haven't covered!&lt;/p&gt;
    &lt;hr/&gt;
    &lt;h3&gt;Blog Rec&lt;/h3&gt;
    &lt;p&gt;This week's blog rec is &lt;a href="https://lexi-lambda.github.io/" target="_blank"&gt;Alexis King&lt;/a&gt;! Generally, Alexis's work spans the theory, practice, and implementation of programming languages, aimed at a popular audience and not an academic one. If you know her for one thing, it's probably &lt;a href="https://lexi-lambda.github.io/blog/2019/11/05/parse-don-t-validate" target="_blank"&gt;Parse, don't validate&lt;/a&gt;, which is now so mainstream most people haven't read the original post. Another good one is about &lt;a href="https://lexi-lambda.github.io/blog/2020/01/19/no-dynamic-type-systems-are-not-inherently-more-open/" target="_blank"&gt;modeling open-world systems with static types&lt;/a&gt;. &lt;/p&gt;
    &lt;p&gt;Nowadays she is &lt;em&gt;far&lt;/em&gt; more active on &lt;a href="https://langdev.stackexchange.com/users/861/alexis-king" target="_blank"&gt;Programming Languages Stack Exchange&lt;/a&gt;, where she has blog-length answers on &lt;a href="https://langdev.stackexchange.com/questions/2692/how-should-i-read-type-system-notation/2693#2693" target="_blank"&gt;reading type notations&lt;/a&gt;, &lt;a href="https://langdev.stackexchange.com/questions/3942/what-are-the-ways-compilers-recognize-complex-patterns/3945#3945" target="_blank"&gt;compiler design&lt;/a&gt;, and &lt;a href="https://langdev.stackexchange.com/questions/2069/what-is-an-arrow-and-what-powers-would-it-give-as-a-first-class-concept-in-a-pro/2372#2372" target="_blank"&gt;why arrows&lt;/a&gt;.&lt;/p&gt;
    &lt;div class="footnote"&gt;
    &lt;hr/&gt;
    &lt;ol&gt;
    &lt;li id="fn:lisp"&gt;
    &lt;p&gt;Unless it's a lisp. &lt;a class="footnote-backref" href="#fnref:lisp" title="Jump back to footnote 1 in the text"&gt;↩&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id="fn:ternary"&gt;
    &lt;p&gt;Or &lt;code&gt;x if bool else y&lt;/code&gt;, same thing. &lt;a class="footnote-backref" href="#fnref:ternary" title="Jump back to footnote 2 in the text"&gt;↩&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id="fn:ternaryish"&gt;
    &lt;p&gt;I say "ish" because trains can be arbitrarily long: &lt;code&gt;x (f1 f2 f3 f4 f5) y&lt;/code&gt; is something I have &lt;em&gt;no idea&lt;/em&gt; &lt;a href="https://code.jsoftware.com/wiki/Vocabulary/fork" target="_blank"&gt;how to parse&lt;/a&gt;. &lt;a class="footnote-backref" href="#fnref:ternaryish" title="Jump back to footnote 3 in the text"&gt;↩&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;/ol&gt;
    &lt;/div&gt;</description>
                <pubDate>Tue, 05 Nov 2024 18:40:33 +0000</pubDate>
                <guid>https://buttondown.com/hillelwayne/archive/a-list-of-ternary-operators/</guid>
            </item>
            <item>
                <title>TLA from first principles</title>
                <link>https://buttondown.com/hillelwayne/archive/tla-from-first-principles/</link>
                <description>&lt;h3&gt;No Newsletter next week&lt;/h3&gt;
    &lt;p&gt;I'll be speaking at &lt;a href="https://www.usenix.org/conference/srecon24emea/presentation/wayne" target="_blank"&gt;USENIX SRECon&lt;/a&gt;!&lt;/p&gt;
    &lt;h2&gt;TLA from first principles&lt;/h2&gt;
    &lt;p&gt;I'm working on v0.5 of &lt;a href="https://leanpub.com/logic/" target="_blank"&gt;Logic for Programmers&lt;/a&gt;. In the process of revising the "System Modeling" chapter, I stumbled on a great way to explain the &lt;strong&gt;T&lt;/strong&gt;emporal &lt;strong&gt;L&lt;/strong&gt;ogic of &lt;strong&gt;A&lt;/strong&gt;ctions that TLA+ is based on. I'm reproducing that bit here with some changes to fit the newsletter format.&lt;/p&gt;
    &lt;p&gt;Note that by this point the reader has already encountered property testing, formal verification, decision tables, and nontemporal specifications, and should already have a lot of practice expressing things as predicates. &lt;/p&gt;
    &lt;hr/&gt;
    &lt;h3&gt;The intro&lt;/h3&gt;
    &lt;p&gt;We have some bank users, each with an account balance. Bank users can wire money
    to each other. We have overdraft protection, so wires cannot reduce an
    account value below zero. &lt;/p&gt;
    &lt;p&gt;For the purposes of introducing the ideas, we'll assume an extremely simple system: two hardcoded
    variables &lt;code&gt;alice&lt;/code&gt; and &lt;code&gt;bob&lt;/code&gt;, both start with 10 dollars, and transfers
    are only from Alice to Bob. Also, the transfer is totally atomic: we
    check for adequate funds, withdraw, and deposit all in a single moment
    of time. Later [in the chapter] we'll allow for multiple nonatomic transfers at the same time.&lt;/p&gt;
    &lt;p&gt;First, let's look at a valid &lt;strong&gt;behavior&lt;/strong&gt; of the system, or a possible way it can evolve.&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;alice   10 -&gt;  5 -&gt; 3  -&gt; 3  -&gt; ...
    bob     10 -&gt; 15 -&gt; 17 -&gt; 17 -&gt; ...
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;In programming, we'd think of &lt;code&gt;alice&lt;/code&gt; and &lt;code&gt;bob&lt;/code&gt; as variables that change. How do we represent those variables &lt;em&gt;purely&lt;/em&gt; in terms of predicate logic? One way is to instead think of them as &lt;em&gt;arrays&lt;/em&gt; of values. &lt;code&gt;alice[0]&lt;/code&gt; is the initial state of &lt;code&gt;alice&lt;/code&gt;, &lt;code&gt;alice[1]&lt;/code&gt; is after the first time step, etc. Time, then, is "just" the set of natural numbers.&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;Time  = {0, 1, 2, 3, ...}
    alice = [10, 5, 3, 3, ...]
    bob   = [10, 15, 17, 17, ...]
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;In comparison to our valid behavior, here are some &lt;em&gt;invalid&lt;/em&gt; behaviors:&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;alice = [10, 3,  ...]
    bob   = [10, 15, ...]
    
    alice = [10, -1,  ...]
    bob   = [10, 21,  ...]
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;The first is invalid because Bob received more money than Alice lost.
    The second is invalid because it violates our proposed invariant, that
    accounts cannot go negative. Can we write a predicate that is &lt;em&gt;true&lt;/em&gt; for
    valid transitions and &lt;em&gt;false&lt;/em&gt; for our two invalid behaviors?&lt;/p&gt;
    &lt;p&gt;Here's one way:&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;Time = Nat // {0, 1, 2, etc}
    
    Transfer(t: Time) =
      some value in 0..=alice[t]:
        1. alice[t+1] = alice[t] - value
        2. bob[t+1] = bob[t] + value
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;Go through and check that this is true for every &lt;code&gt;t&lt;/code&gt; in the valid
    behavior and false for at least one &lt;code&gt;t&lt;/code&gt; in each invalid behavior. Note
    that the steps where Alice &lt;em&gt;doesn't&lt;/em&gt; send a transfer also pass
    &lt;code&gt;Transfer&lt;/code&gt;; we just pick &lt;code&gt;value = 0&lt;/code&gt;.&lt;/p&gt;
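    &lt;p&gt;If you'd rather not check by hand, here's a minimal Python sketch of the same exercise. It only looks at finite prefixes of the behaviors above (the &lt;code&gt;...&lt;/code&gt; tails are dropped), and the names &lt;code&gt;transfer&lt;/code&gt; and &lt;code&gt;transfer_holds_everywhere&lt;/code&gt; are made up for illustration:&lt;/p&gt;

```python
# Finite prefixes of the behaviors shown above (the "..." tails are left out).
VALID = {"alice": [10, 5, 3, 3], "bob": [10, 15, 17, 17]}
INVALID_MISMATCH = {"alice": [10, 3], "bob": [10, 15]}
INVALID_OVERDRAFT = {"alice": [10, -1], "bob": [10, 21]}


def transfer(behavior: dict[str, list[int]], t: int) -> bool:
    """Transfer(t): some value in 0..=alice[t] explains the step from t to t+1."""
    alice, bob = behavior["alice"], behavior["bob"]
    return any(
        alice[t + 1] == alice[t] - value and bob[t + 1] == bob[t] + value
        for value in range(0, alice[t] + 1)  # inclusive upper bound, like 0..=alice[t]
    )


def transfer_holds_everywhere(behavior: dict[str, list[int]]) -> bool:
    """Check Transfer(t) for every step that the finite prefix contains."""
    steps = len(behavior["alice"]) - 1
    return all(transfer(behavior, t) for t in range(steps))


print(transfer_holds_everywhere(VALID))              # True
print(transfer_holds_everywhere(INVALID_MISMATCH))   # False
print(transfer_holds_everywhere(INVALID_OVERDRAFT))  # False
```

    &lt;p&gt;The valid prefix passes at every step (the last one by picking &lt;code&gt;value = 0&lt;/code&gt;), while each invalid prefix already fails at &lt;code&gt;t = 0&lt;/code&gt;.&lt;/p&gt;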
    &lt;p&gt;I can now write a predicate that perfectly describes a valid behavior:&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;Spec = 
      1. alice[0] = 10
      2. bob[0]   = 10
      3. all t in Time:
        Transfer(t)
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;Now allowing "nothing happens" as "Alice sends an empty transfer" is
    a little bit weird. In the real system, we probably don't want people
    to constantly be sending each other zero dollars:&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;Transfer(t: Time) =
    &lt;span class="gd"&gt;- some value in 0..=alice[t]:&lt;/span&gt;
    &lt;span class="gi"&gt;+ some value in 1..=alice[t]:&lt;/span&gt;
    &lt;span class="w"&gt; &lt;/span&gt;   1. alice[t+1] = alice[t] - value
    &lt;span class="w"&gt; &lt;/span&gt;   2. bob[t+1] = bob[t] + value
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;But now there can't be a timestep where nothing happens. And that means
    &lt;em&gt;no&lt;/em&gt; behavior is valid! At every step, Alice &lt;em&gt;must&lt;/em&gt; transfer at least one dollar to Bob.
    Eventually there is some &lt;code&gt;t&lt;/code&gt; where &lt;code&gt;alice[t] = 0 &amp;&amp; bob[t] = 20&lt;/code&gt;. Then
    Alice can't make a transfer, &lt;code&gt;Transfer(t)&lt;/code&gt; is false, and so &lt;code&gt;Spec&lt;/code&gt; is
    false.&lt;sup id="fnref:exercise"&gt;&lt;a class="footnote-ref" href="#fn:exercise"&gt;1&lt;/a&gt;&lt;/sup&gt;&lt;/p&gt;
    &lt;p&gt;So typically when modeling we add a &lt;strong&gt;stutter step&lt;/strong&gt;, like this:&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;Spec =
      1. alice[0] = 10
      2. bob[0]   = 10
      3. all t in Time:
        || Transfer(t)
        || 1. alice[t+1] = alice[t]
           2. bob[t+1] = bob[t]
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;(This is also why we can use infinite behaviors to model a finite algorithm. If the algorithm completes at &lt;code&gt;t=21&lt;/code&gt;, &lt;code&gt;t=22,23,24...&lt;/code&gt; are all stutter steps.)&lt;/p&gt;
    &lt;p&gt;There's enough moving parts here that I'd want to break it into
    subpredicates.&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;Init =
      1. alice[0] = 10
      2. bob[0]   = 10
    
    Stutter(t) =
      1. alice[t+1] = alice[t]
      2. bob[t+1] = bob[t]
    
    Next(t) = Transfer(t) // foreshadowing
    
    Spec =
      1. Init
      2. all t in Time:
        Next(t) || Stutter(t)
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;Now finally, how do we represent the property &lt;code&gt;NoOverdrafts&lt;/code&gt;? It's an
    &lt;em&gt;invariant&lt;/em&gt; that has to be true at all times. So we do the same thing we
    did in &lt;code&gt;Spec&lt;/code&gt;: write a predicate over all times.&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;property NoOverdrafts =
      all t in Time:
        alice[t] &gt;= 0 &amp;&amp; bob[t] &gt;= 0
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;We can even say that &lt;code&gt;Spec =&gt; NoOverdrafts&lt;/code&gt;, i.e. if a behavior is valid
    under &lt;code&gt;Spec&lt;/code&gt;, it satisfies &lt;code&gt;NoOverdrafts&lt;/code&gt;.&lt;/p&gt;
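    &lt;p&gt;We can't enumerate infinite behaviors, but we can at least sanity-check the implication on short prefixes. The Python below is a rough sketch under that assumption: it builds every five-state prefix that satisfies &lt;code&gt;Spec&lt;/code&gt; (using the &lt;code&gt;1..=alice[t]&lt;/code&gt; version of &lt;code&gt;Transfer&lt;/code&gt;) from a pool of candidate states that includes negative balances, then checks &lt;code&gt;NoOverdrafts&lt;/code&gt; on each. The function names and the candidate range are my own choices for illustration:&lt;/p&gt;

```python
State = tuple[int, int]  # (alice, bob) at one point in time


def init(state: State) -> bool:
    return state == (10, 10)


def transfer(cur: State, nxt: State) -> bool:
    """Next: Alice sends Bob some value in 1..=alice."""
    alice, bob = cur
    return any(nxt == (alice - value, bob + value) for value in range(1, alice + 1))


def stutter(cur: State, nxt: State) -> bool:
    return nxt == cur


def behaviors_satisfying_spec(candidates: list[State], length: int) -> list[tuple[State, ...]]:
    """Every sequence of `length` candidate states that satisfies Spec."""
    prefixes = [(state,) for state in candidates if init(state)]
    for _ in range(length - 1):
        prefixes = [
            prefix + (nxt,)
            for prefix in prefixes
            for nxt in candidates
            if transfer(prefix[-1], nxt) or stutter(prefix[-1], nxt)
        ]
    return prefixes


def no_overdrafts(behavior: tuple[State, ...]) -> bool:
    return all(alice >= 0 and bob >= 0 for alice, bob in behavior)


# Candidate states deliberately include negative balances, so the check is not vacuous.
CANDIDATES = [(a, b) for a in range(-2, 23) for b in range(-2, 23)]

valid = behaviors_satisfying_spec(CANDIDATES, length=5)
print(all(no_overdrafts(behavior) for behavior in valid))  # True
```

    &lt;p&gt;This is a bounded check, not a proof: it says nothing about what happens after the fifth state.&lt;/p&gt;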
    &lt;h4&gt;One of the exercises&lt;/h4&gt;
    &lt;p&gt;Modify the &lt;code&gt;Next&lt;/code&gt; so that Bob can send Alice transfers, too. Don't try
    to be too clever, just do this in the most direct way possible.&lt;/p&gt;
    &lt;p&gt;Bonus: can Alice and Bob transfer to each other in the same step?&lt;/p&gt;
    &lt;p&gt;&lt;strong&gt;Solution&lt;/strong&gt; [in back of book]: We can rename &lt;code&gt;Transfer(t)&lt;/code&gt; to &lt;code&gt;TransferAliceToBob(t)&lt;/code&gt;, write the
    converse as a new predicate, and then add it to &lt;code&gt;Next&lt;/code&gt;, like this:&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;TransferBobToAlice(t: Time) =
      some value in 1..=bob[t]:
        1. alice[t+1] = alice[t] + value
        2. bob[t+1] = bob[t] - value
    
    Next(t) =
      || TransferAliceToBob(t)
      || TransferBobToAlice(t)
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;Now, can Alice and Bob transfer to each other in the same step? No.
    Let's say they both start with 10 dollars and each try to transfer five
    dollars to each other. By &lt;code&gt;TransferAliceToBob&lt;/code&gt; we have:&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;1. alice[1] = alice[0] - 5 = 5
    2. bob[1] = bob[0] + 5 = 15
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;And by &lt;code&gt;TransferBobToAlice&lt;/code&gt;, we have:&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;1. bob[1] = bob[0] - 5 = 5
    2. alice[1] = alice[0] + 5 = 15
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;So now we have &lt;code&gt;alice[1] = 5 &amp;&amp; alice[1] = 15&lt;/code&gt;, which is always false.&lt;/p&gt;
    &lt;h3&gt;Temporal Logic&lt;/h3&gt;
    &lt;p&gt;This is good and all, but in practice, there's two downsides to
    treating time as a set we can quantify over:&lt;/p&gt;
    &lt;ol&gt;
    &lt;li&gt;It's cumbersome. We have to write &lt;code&gt;var[t]&lt;/code&gt; and &lt;code&gt;var[t+1]&lt;/code&gt; all over
        the place.&lt;/li&gt;
    &lt;li&gt;It's too powerful. We can write expressions like
        &lt;code&gt;alice[t^2-5] = alice[t] + t&lt;/code&gt;.&lt;/li&gt;
    &lt;/ol&gt;
    &lt;p&gt;Problem (2) might seem like a good thing; isn't the whole &lt;em&gt;point&lt;/em&gt; of
    logic to be expressive? But we have a long-term goal in mind: getting a
    computer to check our formal specification. We need to limit the
    expressivity of our model so that we can make it checkable. &lt;/p&gt;
    &lt;p&gt;In practice, this will mean making time implicit to our model, instead of
    explicitly quantifying over it.&lt;/p&gt;
    &lt;p&gt;The first thing we need to do is limit how we can use time. At a
    given point in time, all we can look at is the &lt;em&gt;current&lt;/em&gt; value of a
    variable (&lt;code&gt;var[t]&lt;/code&gt;) and the &lt;em&gt;next&lt;/em&gt; value (&lt;code&gt;var[t+1]&lt;/code&gt;). No &lt;code&gt;var[t+16]&lt;/code&gt; or
    &lt;code&gt;var[t-1]&lt;/code&gt; or anything else complicated.&lt;/p&gt;
    &lt;p&gt;And it turns out we've already seen a mathematical convention for
    expressing this: &lt;strong&gt;priming&lt;/strong&gt;!&lt;sup id="fnref:priming"&gt;&lt;a class="footnote-ref" href="#fn:priming"&gt;2&lt;/a&gt;&lt;/sup&gt; For a
    given time &lt;code&gt;t&lt;/code&gt;, we can define &lt;code&gt;var&lt;/code&gt; to mean &lt;code&gt;var[t]&lt;/code&gt; and &lt;code&gt;var'&lt;/code&gt; to mean
    &lt;code&gt;var[t+1]&lt;/code&gt;. Then &lt;code&gt;Transfer(t)&lt;/code&gt; becomes&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;Transfer =
      some value in 1..=alice:
        1. alice' = alice - value
        2. bob' = bob + value
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;Next we have the construct &lt;code&gt;all t in Time: P(t)&lt;/code&gt; in both &lt;code&gt;Spec&lt;/code&gt; and
    &lt;code&gt;NoOverdrafts&lt;/code&gt;. In other words, "P is always true". So we can add
    &lt;code&gt;always&lt;/code&gt; as a new term. Logicians conventionally use □ or &lt;code&gt;[]&lt;/code&gt;
    to mean the same thing.&lt;sup id="fnref:beyond"&gt;&lt;a class="footnote-ref" href="#fn:beyond"&gt;3&lt;/a&gt;&lt;/sup&gt;&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;property NoOverdrafts =
      always (alice &gt;= 0 &amp;&amp; bob &gt;= 0)
      // or [](alice &gt;= 0 &amp;&amp; bob &gt;= 0)
    
    Spec =
      Init &amp;&amp; always (Next || Stutter)
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;Now time is &lt;em&gt;almost&lt;/em&gt; completely implicit in our spec, with just one
    exception: &lt;code&gt;Init&lt;/code&gt; has &lt;code&gt;alice[0]&lt;/code&gt; and &lt;code&gt;bob[0]&lt;/code&gt;. We just need one more
    convention: if a variable is referenced &lt;em&gt;outside&lt;/em&gt; of the scope of a
    temporal operator, it means &lt;code&gt;var[0]&lt;/code&gt;. Since &lt;code&gt;Init&lt;/code&gt; is outside of the &lt;code&gt;[]&lt;/code&gt;, it becomes&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;Init =
      1. alice = 10
      2. bob = 10
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;And with that, we've removed &lt;code&gt;Time&lt;/code&gt; as an explicit value in our model.&lt;/p&gt;
    &lt;p&gt;The addition of primes and &lt;code&gt;always&lt;/code&gt; makes this a &lt;strong&gt;temporal logic&lt;/strong&gt;, one that can model how things change over time. And that makes it ideal for modeling software systems.&lt;/p&gt;
    &lt;h3&gt;Modeling with TLA+&lt;/h3&gt;
    &lt;p&gt;One of the most popular specification languages for modeling these kinds
    of concurrent systems is &lt;strong&gt;TLA+&lt;/strong&gt;. TLA+ was invented by the Turing award-winner Leslie Lamport, who also invented a wide variety of concurrency algorithms and LaTeX. Here's our current
    spec in TLA+:&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;---- MODULE transfers ----
    EXTENDS Integers
    
    VARIABLES alice, bob
    vars == &lt;&lt;alice, bob&gt;&gt;
    
    Init ==
      alice = 10 
      /\ bob = 10
    
    AliceToBob ==
      \E amnt \in 1..alice:
        alice' = alice - amnt
        /\ bob' = bob + amnt
    
    BobToAlice ==
      \E amnt \in 1..bob:
        alice' = alice + amnt
        /\ bob' = bob - amnt
    
    Next ==
      AliceToBob
      \/ BobToAlice
    
    Spec == Init /\ [][Next]_vars
    
    NoOverdrafts ==
      [](alice &gt;= 0 /\ bob &gt;= 0)
    
    ====
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;TLA+ uses ASCII versions of mathematical notation: &lt;code&gt;/\&lt;/code&gt;/&lt;code&gt;\/&lt;/code&gt; for
    &lt;code&gt;&amp;&amp;/||&lt;/code&gt;, &lt;code&gt;\A \E&lt;/code&gt; for &lt;code&gt;all/some&lt;/code&gt;, etc. The only thing that's "unusual"
    (besides &lt;code&gt;==&lt;/code&gt; for definition) is the &lt;code&gt;[][Next]_vars&lt;/code&gt; bit. That's TLA+
    notation for &lt;code&gt;[](Next || Stutter)&lt;/code&gt;: &lt;code&gt;Next&lt;/code&gt; or &lt;code&gt;Stutter&lt;/code&gt; always happens.&lt;/p&gt;
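    &lt;p&gt;To get a feel for what checking &lt;code&gt;NoOverdrafts&lt;/code&gt; against this spec involves, here's a small Python sketch of a breadth-first search over the reachable states. It is only an illustration of the idea, not how any real model checker is implemented, and the Python names are my own. Stutter steps are left out because they never reach a new state:&lt;/p&gt;

```python
from collections import deque

State = tuple[int, int]  # (alice, bob)

INIT: State = (10, 10)


def next_states(state: State) -> list[State]:
    """All successors allowed by Next, i.e. AliceToBob or BobToAlice."""
    alice, bob = state
    alice_to_bob = [(alice - amnt, bob + amnt) for amnt in range(1, alice + 1)]
    bob_to_alice = [(alice + amnt, bob - amnt) for amnt in range(1, bob + 1)]
    return alice_to_bob + bob_to_alice


def overdrawing_next_states(state: State) -> list[State]:
    """A deliberately broken AliceToBob that allows sending one dollar too many."""
    alice, bob = state
    return [(alice - amnt, bob + amnt) for amnt in range(1, alice + 2)]


def no_overdrafts(state: State) -> bool:
    alice, bob = state
    return alice >= 0 and bob >= 0


def check_invariant(init, successors, invariant):
    """Breadth-first search over reachable states; returns a violating state or None."""
    seen = {init}
    queue = deque([init])
    while queue:
        state = queue.popleft()
        if not invariant(state):
            return state
        for nxt in successors(state):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return None


print(check_invariant(INIT, next_states, no_overdrafts))              # None
print(check_invariant(INIT, overdrawing_next_states, no_overdrafts))  # (-1, 21)
```

    &lt;p&gt;The search terminates because only 21 states are reachable: every split of the 20 dollars between Alice and Bob.&lt;/p&gt;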
    &lt;hr/&gt;
    &lt;p&gt;The rest of the chapter goes on to explain model checking, PlusCal (for modeling nonatomic transactions without needing to explain the exotic TLA+ function syntax), and liveness properties. But this is the intuition behind the "temporal logic of actions": temporal operators are operations on the set of points of time, and we restrict what we can do with those operators to make reasoning about the specification feasible.&lt;/p&gt;
    &lt;p&gt;Honestly I like it enough that I'm thinking of redesigning my TLA+ workshop to start with this explanation. Then again, maybe it only seems good to me because I already know TLA+. Please let me know what you think about it!&lt;/p&gt;
    &lt;p&gt;Anyway, the new version of the chapter will be in v0.5, which should be out mid-November.&lt;/p&gt;
    &lt;hr/&gt;
    &lt;h3&gt;Blog Rec&lt;/h3&gt;
    &lt;p&gt;This one is really dear to me: &lt;a href="https://muratbuffalo.blogspot.com/" target="_blank"&gt;Metadata&lt;/a&gt;, by Murat Demirbas. When I was first trying to learn TLA+ back in 2016, his post &lt;a href="https://muratbuffalo.blogspot.com/2015/01/my-experience-with-using-tla-in.html" target="_blank"&gt;on using TLA+ in a distributed systems class&lt;/a&gt; was one of, like... &lt;em&gt;three&lt;/em&gt; public posts on TLA+. I must have spent hours rereading that post and puzzling out this weird language I stumbled into. Later I emailed Murat with some questions and he was super nice in answering them. Don't think I would have ever grokked TLA+ without him.&lt;/p&gt;
    &lt;p&gt;In addition to TLA+ content, a lot of the blog is also breakdowns of papers he read, like &lt;a href="https://blog.acolyer.org/" target="_blank"&gt;the morning paper&lt;/a&gt;, except with a focus on distributed systems (and still active). If you're interested in learning more about the science of distributed systems, he has an excellent page on &lt;a href="https://muratbuffalo.blogspot.com/2021/02/foundational-distributed-systems-papers.html" target="_blank"&gt;foundational distributed systems papers&lt;/a&gt;. But definitely check out &lt;a href="https://muratbuffalo.blogspot.com/2023/09/metastable-failures-in-wild.html" target="_blank"&gt;his deep readings&lt;/a&gt;, too!&lt;/p&gt;
    &lt;div class="footnote"&gt;
    &lt;hr/&gt;
    &lt;ol&gt;
    &lt;li id="fn:exercise"&gt;
    &lt;p&gt;In the book this is presented as an exercise (with the solution in back). The exercise also clarifies that since &lt;code&gt;Time = Nat&lt;/code&gt;, all behaviors have an &lt;em&gt;infinite&lt;/em&gt; number of steps. &lt;a class="footnote-backref" href="#fnref:exercise" title="Jump back to footnote 1 in the text"&gt;↩&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id="fn:priming"&gt;
    &lt;p&gt;Priming is introduced in the chapter on decision tables, and again in the chapter on database invariants. &lt;code&gt;x'&lt;/code&gt; is "the next value of &lt;code&gt;x&lt;/code&gt;", so you can use it to express database invariants like "jobs only move from &lt;code&gt;ready&lt;/code&gt; to &lt;code&gt;started&lt;/code&gt; or &lt;code&gt;aborted&lt;/code&gt;." &lt;a class="footnote-backref" href="#fnref:priming" title="Jump back to footnote 2 in the text"&gt;↩&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id="fn:beyond"&gt;
    &lt;p&gt;I'm still vacillating on whether I want a "beyond logic" appendix that covers higher order logic, constructive logic, and modal logic (which is what we're sneakily doing right now!)&lt;/p&gt;
    &lt;p&gt;While I'm here, this explanation of &lt;code&gt;always&lt;/code&gt; as &lt;code&gt;all t in Time&lt;/code&gt; isn't &lt;em&gt;100%&lt;/em&gt; accurate, since it doesn't explain why things like &lt;code&gt;[](P =&gt; []Q)&lt;/code&gt; or &lt;code&gt;&lt;&gt;[]P&lt;/code&gt; make sense. But it's accurate in most cases and is a great intuition pump. &lt;a class="footnote-backref" href="#fnref:beyond" title="Jump back to footnote 3 in the text"&gt;↩&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;/ol&gt;
    &lt;/div&gt;</description>
                <pubDate>Tue, 22 Oct 2024 17:14:21 +0000</pubDate>
                <guid>https://buttondown.com/hillelwayne/archive/tla-from-first-principles/</guid>
            </item>
            <item>
                <title>Be Suspicious of Success</title>
                <link>https://buttondown.com/hillelwayne/archive/be-suspicious-of-success/</link>
                <description>&lt;p&gt;From Leslie Lamport's &lt;em&gt;Specifying Systems&lt;/em&gt;:&lt;/p&gt;
    &lt;blockquote&gt;
    &lt;p&gt;You should be suspicious if [the model checker] does not find a violation of a liveness property... you should also be suspicious if [it] finds no errors when checking safety properties. &lt;/p&gt;
    &lt;/blockquote&gt;
    &lt;p&gt;This is specifically in the context of model-checking a formal specification, but it's a widely applicable software principle. It's not enough for a program to work, it has to work for the &lt;em&gt;right reasons&lt;/em&gt;. Code working for the wrong reasons is code that's going to break when you least expect it. And since "correct for right reasons" is a much narrower target than "correct for any possible reason", we can't assume our first success is actually our intended success.&lt;/p&gt;
    &lt;p&gt;Hence, BSOS: &lt;strong&gt;Be Suspicious of Success&lt;/strong&gt;.&lt;/p&gt;
    &lt;h3&gt;Some useful BSOS practices&lt;/h3&gt;
    &lt;p&gt;The standard way of dealing with BSOS is verification. Tests, static checks, model checking, etc. We get more confident in our code if our verifications succeed. But then we also have to be suspicious of &lt;em&gt;that&lt;/em&gt; success, too! How do I know whether my tests are passing because they're properly testing correct code or because they're failing to test incorrect code?&lt;/p&gt;
    &lt;p&gt;This is why test-driven development gurus tell people to write a failing test first. Then at least we know the tests are doing &lt;em&gt;something&lt;/em&gt; (even if they still might not be testing what they want).&lt;/p&gt;
    &lt;p&gt;The other limit of verification is that it can't tell us &lt;em&gt;why&lt;/em&gt; something succeeds. Mainstream verification methods are good at explaining why things &lt;em&gt;fail&lt;/em&gt;— expected vs actual test output, type mismatches, specification error traces. Success isn't as "information-rich" as failure. How do you distinguish a faithful implementation of &lt;a href="https://en.wikipedia.org/wiki/Collatz_conjecture" target="_blank"&gt;&lt;code&gt;is_collatz_counterexample&lt;/code&gt;&lt;/a&gt; from &lt;code&gt;return false&lt;/code&gt;?&lt;/p&gt;
    &lt;p&gt;A broader technique I follow is &lt;em&gt;make it work, make it break&lt;/em&gt;. If code is working for the right reasons, I should be able to predict how to break it. This can be either a change in the runtime (this will livelock if we 10x the number of connections), or a change to the code itself (commenting out &lt;em&gt;this&lt;/em&gt; line will cause property X to fail). &lt;sup id="fnref:superproperties"&gt;&lt;a class="footnote-ref" href="#fn:superproperties"&gt;1&lt;/a&gt;&lt;/sup&gt; If the code still works even after the change, my model of the code is wrong and it was succeeding for the wrong reasons.&lt;/p&gt;
    &lt;h3&gt;Happy and Sad Paths&lt;/h3&gt;
    &lt;p&gt;A related topic (possibly subset?) is "happy and sad paths". The happy path of your code is the behavior when everything's going right: correct inputs, preconditions are satisfied, the data sources are present, etc. The sad path is all of the code that handles things going wrong. Retry mechanisms, insufficient user authority, database constraint violation, etc. In most software, the code supporting the sad paths dwarfs the code in the happy path.&lt;/p&gt;
    &lt;p&gt;BSOS says that I can't just show code works in the happy path, I also need to check it works in the sad path. &lt;/p&gt;
    &lt;p&gt;BSOS also says that I have to be suspicious when the sad path works properly, too. &lt;/p&gt;
    &lt;p&gt;Say I add a retry mechanism to my code to handle the failure mode of timeouts. I test the code and it works. Did the retry code actually &lt;em&gt;run&lt;/em&gt;? Did it run &lt;em&gt;regardless&lt;/em&gt; of the original response? Is it really doing exponential backoff? Will it stop after the maximum retry limit? Is the sad path code &lt;em&gt;after&lt;/em&gt; the maximum retry limit working properly?&lt;/p&gt;
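    &lt;p&gt;Here's a sketch of how a test could answer a few of those questions directly instead of trusting a green checkmark. Everything in it is hypothetical: &lt;code&gt;call_with_retry&lt;/code&gt;, &lt;code&gt;FlakyService&lt;/code&gt;, and &lt;code&gt;RetriesExhausted&lt;/code&gt; are made-up names, and passing in &lt;code&gt;sleep&lt;/code&gt; is just one way to observe the backoff without actually waiting:&lt;/p&gt;

```python
import time


class RetriesExhausted(Exception):
    """Raised once the retry limit is hit without a successful call."""


def call_with_retry(operation, max_retries=3, base_delay=1.0, sleep=time.sleep):
    """Call operation(); on TimeoutError, back off exponentially and retry."""
    for attempt in range(max_retries + 1):
        try:
            return operation()
        except TimeoutError:
            if attempt == max_retries:
                raise RetriesExhausted(f"gave up after {max_retries} retries")
            sleep(base_delay * 2**attempt)  # 1s, 2s, 4s, ...


class FlakyService:
    """Test double that times out a fixed number of times, then succeeds."""

    def __init__(self, failures):
        self.failures = failures
        self.calls = 0

    def __call__(self):
        self.calls += 1
        if self.calls > self.failures:
            return "ok"
        raise TimeoutError


def test_retry_actually_runs():
    delays = []
    service = FlakyService(failures=2)
    assert call_with_retry(service, sleep=delays.append) == "ok"
    assert service.calls == 3    # the retry path really executed
    assert delays == [1.0, 2.0]  # and the backoff really doubled


def test_gives_up_after_limit():
    delays = []
    service = FlakyService(failures=10)
    try:
        call_with_retry(service, max_retries=3, sleep=delays.append)
    except RetriesExhausted:
        pass
    else:
        raise AssertionError("expected RetriesExhausted")
    assert service.calls == 4    # one initial call plus three retries
    assert delays == [1.0, 2.0, 4.0]
```

    &lt;p&gt;Asserting on the call count and the recorded delays is what separates "the call eventually succeeded" from "the retry code ran the way I think it does".&lt;/p&gt;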
    &lt;p&gt;&lt;a href="https://www.usenix.org/system/files/conference/osdi14/osdi14-paper-yuan.pdf" target="_blank"&gt;One paper&lt;/a&gt; found that 35% of catastrophic distributed system failures were caused by "trivial mistakes in error handlers" (pg 9). These were in mature, battle-hardened programs. Be suspicious of success. Be more suspicious of sad path success.&lt;/p&gt;
    &lt;hr/&gt;
    &lt;h2&gt;Blog Rec&lt;/h2&gt;
    &lt;p&gt;This week's blog rec is &lt;a href="https://www.redblobgames.com/" target="_blank"&gt;Red Blob Games&lt;/a&gt;!&lt;sup id="fnref:blogs-vs-articles"&gt;&lt;a class="footnote-ref" href="#fn:blogs-vs-articles"&gt;2&lt;/a&gt;&lt;/sup&gt; While primarily about computer game programming, the meat of the content is beautiful, interactive guides to general CS algorithms. Some highlights:&lt;/p&gt;
    &lt;ul&gt;
    &lt;li&gt;&lt;a href="https://www.redblobgames.com/pathfinding/a-star/introduction.html" target="_blank"&gt;Introduction to the A* Algorithm&lt;/a&gt; was really illuminating when I was a baby programmer.&lt;/li&gt;
    &lt;li&gt;I'm sure this &lt;a href="https://www.redblobgames.com/articles/noise/introduction.html" target="_blank"&gt;overview of noise functions&lt;/a&gt; will be useful to me &lt;em&gt;someday&lt;/em&gt;. Maybe for test data generation?&lt;/li&gt;
    &lt;li&gt;If you're also an explainer type he has a lot of great stuff on &lt;a href="https://www.redblobgames.com/making-of/line-drawing/" target="_blank"&gt;his process&lt;/a&gt; and his &lt;a href="https://www.redblobgames.com/making-of/little-things/" target="_blank"&gt;little tricks&lt;/a&gt; to make things more understandable.&lt;/li&gt;
    &lt;/ul&gt;
    &lt;p&gt;(I don't think his &lt;a href="https://www.redblobgames.com/blog/posts.xml" target="_blank"&gt;rss feed&lt;/a&gt; covers new interactive articles, only the &lt;a href="https://www.redblobgames.com/blog/" target="_blank"&gt;blog&lt;/a&gt; specifically.)&lt;/p&gt;
    &lt;div class="footnote"&gt;
    &lt;hr/&gt;
    &lt;ol&gt;
    &lt;li id="fn:superproperties"&gt;
    &lt;p&gt;&lt;a href="https://www.jameskoppel.com/" target="_blank"&gt;Jimmy Koppel&lt;/a&gt; once proposed that just as code has properties, code variations have &lt;a href="https://groups.csail.mit.edu/sdg/pubs/2020/demystifying_dependence_published.pdf" target="_blank"&gt;&lt;strong&gt;superproperties&lt;/strong&gt;&lt;/a&gt;. For example, "no modification to the codebase causes us to use a greater number of deprecated APIs." &lt;a class="footnote-backref" href="#fnref:superproperties" title="Jump back to footnote 1 in the text"&gt;↩&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id="fn:blogs-vs-articles"&gt;
    &lt;p&gt;Okay, it's more an &lt;em&gt;article&lt;/em&gt; site, because there's also a &lt;a href="https://www.redblobgames.com/blog/" target="_blank"&gt;Red Blob &lt;em&gt;blog&lt;/em&gt;&lt;/a&gt; (which covers a lot of neat stuff, too). Maybe I should just rename this section to "site rec". &lt;a class="footnote-backref" href="#fnref:blogs-vs-articles" title="Jump back to footnote 2 in the text"&gt;↩&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;/ol&gt;
    &lt;/div&gt;</description>
                <pubDate>Wed, 16 Oct 2024 15:08:39 +0000</pubDate>
                <guid>https://buttondown.com/hillelwayne/archive/be-suspicious-of-success/</guid>
            </item>
            <item>
                <title>How to convince engineers that formal methods is cool</title>
                <link>https://buttondown.com/hillelwayne/archive/how-to-convince-engineers-that-formal-methods-is/</link>
                <description>&lt;p&gt;Sorry there was no newsletter last week! I got COVID. Still got it, which is why this one's also short.&lt;/p&gt;
    &lt;h3&gt;Logic for Programmers v0.4&lt;/h3&gt;
    &lt;p&gt;&lt;a href="https://leanpub.com/logic/" target="_blank"&gt;Now available&lt;/a&gt;! This version adds a chapter on TLA+, significantly expands the constraint solver chapter, and adds a "planner programming" section to the Logic Programming chapter. You can see the full release notes on the &lt;a href="https://leanpub.com/logic/" target="_blank"&gt;book page&lt;/a&gt;.&lt;/p&gt;
    &lt;h1&gt;How to convince engineers that formal methods is cool&lt;/h1&gt;
    &lt;p&gt;I have an open email for answering questions about formal methods,&lt;sup id="fnref:fs-fv"&gt;&lt;a class="footnote-ref" href="#fn:fs-fv"&gt;1&lt;/a&gt;&lt;/sup&gt; and one of the most common questions I get is "how do I convince my coworkers that this is worth doing?" Usually the context is that the reader is really into the idea of FM but their coworkers don't know it exists. The asker's goal is to both introduce FM and persuade their coworkers that it's useful.&lt;/p&gt;
    &lt;p&gt;In my experience as a consultant and advocate, I've found that there's only two consistently-effective ways to successfully pitch FM:&lt;/p&gt;
    &lt;ol&gt;
    &lt;li&gt;Use FM to find an &lt;em&gt;existing&lt;/em&gt; bug in a work system.&lt;/li&gt;
    &lt;li&gt;Show how FM finds a historical bug that's already been fixed.&lt;/li&gt;
    &lt;/ol&gt;
    &lt;h4&gt;Why this works&lt;/h4&gt;
    &lt;p&gt;There's two main objections to FM that we need to address. The first is that FM is too academic and doesn't provide a tangible, practical benefit. The second is that FM is too hard; only PhDs and rocket scientists can economically use it. (Showing use cases from AWS &lt;em&gt;et al&lt;/em&gt; isn't broadly persuasive because skeptics don't have any insight into how AWS functions.) Finding an existing bug hits both: it helped the team with a real problem, and it was done by a mere mortal.&lt;/p&gt;
    &lt;p&gt;Demonstrating FM on a historical bug isn't &lt;em&gt;as&lt;/em&gt; effective: it only shows that formal methods &lt;em&gt;could have&lt;/em&gt; helped, not that it actually does help. But people will usually remember the misery of debugging that problem. Bug war stories are popular for a reason!&lt;/p&gt;
    &lt;h3&gt;Making historical bugs persuasive&lt;/h3&gt;
    &lt;p&gt;So "live bug" is a stronger rec, but "historical bug" tends to be easier to show. This is because &lt;em&gt;you know what you're looking for&lt;/em&gt;. It's easier to write a high-level spec on a system you already know, and show it finds a bug you already know about.&lt;/p&gt;
    &lt;p&gt;The trick to make it look convincing is to make the spec and bug as "natural" as possible. You can't make it seem like FM only found the bug because you had foreknowledge of what it was— then the whole exercise is too contrived. People will already know you had foreknowledge, of course, and are factoring that into their observations. You want to make the case that the spec you're writing is clear and obvious enough that an "ignorant" person could have written it. That means nothing contrived or suspicious.&lt;/p&gt;
    &lt;p&gt;This is a bit of a fuzzy definition, more a vibe than anything. Ask yourself "does this spec look like something that was tailor-made around this bug, or does it find the bug as a byproduct of being a regular spec?"&lt;/p&gt;
    &lt;p&gt;A good example of a "natural" spec is &lt;a href="https://www.hillelwayne.com/post/augmenting-agile/" target="_blank"&gt;the bounded queue problem&lt;/a&gt;. It's a straight translation of some Java code with no properties besides deadlock checking. Usually you'll be at a higher level of abstraction, though.&lt;/p&gt;
    &lt;hr/&gt;
    &lt;h3&gt;Blog rec: &lt;a href="https://www.argmin.net/" target="_blank"&gt;arg min&lt;/a&gt;&lt;/h3&gt;
    &lt;p&gt;This is a new section I want to try for a bit: recommending tech(/-adjacent) blogs that I like. This first one is going to be a bit niche: &lt;a href="https://www.argmin.net/" target="_blank"&gt;arg min&lt;/a&gt; is writing up lecture notes on "convex optimization". It's a cool look into the theory behind constraint solving. I don't understand most of the math but the prose is pretty approachable. Couple of highlights:&lt;/p&gt;
    &lt;ul&gt;
    &lt;li&gt;&lt;a href="https://www.argmin.net/p/modeling-dystopia" target="_blank"&gt;Modeling Dystopia&lt;/a&gt; about why constraint solving isn't a mainstream technology.&lt;/li&gt;
    &lt;li&gt;&lt;a href="https://www.argmin.net/p/convex-optimization-live-blog" target="_blank"&gt;Table of Contents&lt;/a&gt; to see all of the posts.&lt;/li&gt;
    &lt;/ul&gt;
    &lt;p&gt;The blogger also talks about some other topics but I haven't read those posts much.&lt;/p&gt;
    &lt;div class="footnote"&gt;
    &lt;hr/&gt;
    &lt;ol&gt;
    &lt;li id="fn:fs-fv"&gt;
    &lt;p&gt;As always, talking primarily about formal specification of systems (TLA+/Alloy/Spin), not formal verification of code (Dafny/SPARK/Agda). I talk about the differences a bit &lt;a href="https://www.hillelwayne.com/post/why-dont-people-use-formal-methods/" target="_blank"&gt;here&lt;/a&gt; (but I really need to write a more focused piece). &lt;a class="footnote-backref" href="#fnref:fs-fv" title="Jump back to footnote 1 in the text"&gt;↩&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;/ol&gt;
    &lt;/div&gt;</description>
                <pubDate>Tue, 08 Oct 2024 16:18:55 +0000</pubDate>
                <guid>https://buttondown.com/hillelwayne/archive/how-to-convince-engineers-that-formal-methods-is/</guid>
            </item>
            <item>
                <title>Refactoring Invariants</title>
                <link>https://buttondown.com/hillelwayne/archive/refactoring-invariants/</link>
                <description>&lt;p&gt;(Feeling a little sick so this one will be short.)&lt;/p&gt;
    &lt;p&gt;I'm often asked by clients to review their (usually TLA+) formal specifications. These specs are generally slower and more convoluted than an expert would write. I want to fix them up without changing the overall behavior of the spec or introducing subtle bugs.&lt;/p&gt;
    &lt;p&gt;To do this, I use a rather lovely feature of TLA+. Say I see a 100-line &lt;code&gt;Foo&lt;/code&gt; action that I think I can refactor down to 20 lines. I'll first write a refactored version as a separate action &lt;code&gt;NewFoo&lt;/code&gt;, then I run the model checker with the property&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;RefactorProp ==
        [][Foo &lt;=&gt; NewFoo]_vars
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;That's an intimidating nest of symbols but all it's saying is that every &lt;code&gt;Foo&lt;/code&gt; step must also be a &lt;code&gt;NewFoo&lt;/code&gt; step, and vice versa. If the refactor ever does something different from the original action, the model checker will report the exact behavior and transition it fails for. Conversely, if the model checker passes, I can safely assume they have identical behaviors.&lt;/p&gt;
    &lt;p&gt;This is a &lt;strong&gt;refactoring invariant&lt;/strong&gt;:&lt;sup id="fnref:invariant"&gt;&lt;a class="footnote-ref" href="#fn:invariant"&gt;1&lt;/a&gt;&lt;/sup&gt; the old and new versions of functions have identical behavior. Refactoring invariants are superbly useful in formal specification. Software devs spend enough time refactoring that they'd be useful for coding, too.&lt;/p&gt;
    &lt;p&gt;Alas, refactoring invariants are a little harder to express in code. In TLA+ we're working with bounded state spaces, so the model checker can check the invariant for every possible state. Even a simple program can have an unbounded state space via an infinite number of possible function inputs. &lt;/p&gt;
    &lt;p&gt;(Also formal specifications are "pure" simulations while programs have side effects.)&lt;/p&gt;
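    &lt;p&gt;When the input domain happens to be small, though, the model checker's approach carries over directly: run both versions on every input and compare. A toy Python sketch, where &lt;code&gt;clamp_old&lt;/code&gt;, &lt;code&gt;clamp_new&lt;/code&gt;, and &lt;code&gt;clamp_bad&lt;/code&gt; are made-up pure functions standing in for real code, and the input range is an arbitrary bound:&lt;/p&gt;

```python
def clamp_old(quality: int) -> int:
    """Original, branchy version: keep quality within 0..50."""
    if quality >= 0:
        if quality >= 50:
            return 50
        return quality
    return 0


def clamp_new(quality: int) -> int:
    """Refactored version of clamp_old."""
    return max(0, min(quality, 50))


def clamp_bad(quality: int) -> int:
    """A plausible-looking refactor that forgets the lower bound."""
    return min(quality, 50)


def first_disagreement(old, new, inputs):
    """Return the first input on which the two versions differ, or None."""
    for value in inputs:
        if old(value) != new(value):
            return value
    return None


BOUNDED_INPUTS = range(-100, 101)
print(first_disagreement(clamp_old, clamp_new, BOUNDED_INPUTS))  # None
print(first_disagreement(clamp_old, clamp_bad, BOUNDED_INPUTS))  # -100
```

    &lt;p&gt;Like a model checker's error trace, the failing case comes back as a concrete input I can replay. It stops scaling as soon as the inputs are lists, strings, or anything else unbounded.&lt;/p&gt;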
    &lt;p&gt;The "normal" way to verify a program refactoring is to start out with a huge suite of &lt;a href="https://buttondown.com/hillelwayne/archive/oracle-testing/" target="_blank"&gt;oracle tests&lt;/a&gt;. This &lt;em&gt;should&lt;/em&gt; catch a bad refactor via failing tests. There are two downsides. First, you might not have the test suite in the first place, or not one that covers your particular refactoring. Second, even if the test suite does cover it, it only indirectly tests the invariant: it catches the refactoring error as a consequence of testing other stuff. What if we want to directly test the refactoring invariant?&lt;/p&gt;
    &lt;h3&gt;Two ways of doing this&lt;/h3&gt;
    &lt;p&gt;One: by pulling in formal methods. Ray Myers has a &lt;a href="https://www.youtube.com/watch?v=UdB3XBf219Y" target="_blank"&gt;neat video&lt;/a&gt; on formally proving a refactoring is correct. That one's in the niche language ACL2, but he's also got one on &lt;a href="https://www.youtube.com/watch?v=_7RXQE-pCMo" target="_blank"&gt;refactoring C&lt;/a&gt;. You might not even need to prove the refactoring correct; you could probably get away with using an &lt;a href="https://github.com/pschanely/CrossHair" target="_blank"&gt;SMT solver&lt;/a&gt; to find counterexamples.&lt;/p&gt;
    &lt;p&gt;Two: by using property-based testing. Generate random inputs, pass them to both functions, and check that the outputs are identical. Using the python &lt;a href="https://hypothesis.readthedocs.io/en/latest/" target="_blank"&gt;Hypothesis&lt;/a&gt; library:&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="nn"&gt;hypothesis&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;given&lt;/span&gt;
    &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;hypothesis.strategies&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="nn"&gt;st&lt;/span&gt;
    
    &lt;span class="c1"&gt;# from the `gilded rose kata`&lt;/span&gt;
    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;update_quality&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;items&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;list&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;Item&lt;/span&gt;&lt;span class="p"&gt;]):&lt;/span&gt;
        &lt;span class="o"&gt;...&lt;/span&gt;
    
    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;update_quality_new&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;items&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;list&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;Item&lt;/span&gt;&lt;span class="p"&gt;]):&lt;/span&gt;
        &lt;span class="o"&gt;...&lt;/span&gt;
    
    &lt;span class="nd"&gt;@given&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;st&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;lists&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;st&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;builds&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;Item&lt;/span&gt;&lt;span class="p"&gt;)))&lt;/span&gt;
    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;test_refactoring&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;l&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="k"&gt;assert&lt;/span&gt; &lt;span class="n"&gt;update_quality&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;l&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="n"&gt;update_quality_new&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;l&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;One tricky bit is if the function is part of a long call chain &lt;code&gt;A -&gt; B -&gt; C&lt;/code&gt;, and you want to test that refactoring &lt;code&gt;C'&lt;/code&gt; doesn't change the behavior of &lt;code&gt;A&lt;/code&gt;. You'd have to add a &lt;code&gt;B'&lt;/code&gt; that uses &lt;code&gt;C'&lt;/code&gt; and then an &lt;code&gt;A'&lt;/code&gt; that uses &lt;code&gt;B'&lt;/code&gt;. Maybe you could instead create a branch, commit the change to &lt;code&gt;C'&lt;/code&gt; in that branch, and then run a &lt;a href="https://www.hillelwayne.com/post/cross-branch-testing/" target="_blank"&gt;cross-branch test&lt;/a&gt; against each branch's &lt;code&gt;A&lt;/code&gt;.&lt;/p&gt;
    &lt;p&gt;Impure functions are harder. The test now makes some side effect twice, which could spuriously break the refactoring invariant. You could instead test that the changes are the same, or try to get the functions to affect different entities and then compare the updates of each entity. There's no general solution here though, and there might be No Good Way for a particular effectful refactoring.&lt;/p&gt;
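    &lt;p&gt;Here's a minimal sketch of the "different entities" idea, using only the standard library instead of Hypothesis. The &lt;code&gt;Account&lt;/code&gt; class and both &lt;code&gt;apply_fee&lt;/code&gt; functions are hypothetical stand-ins, not from any real codebase. Each version mutates its own deep copy of the input, and the check compares the two resulting states:&lt;/p&gt;

```python
import copy
import random
from dataclasses import dataclass


# Hypothetical entity, invented for illustration.
@dataclass
class Account:
    balance: int
    fees_paid: int = 0


def apply_fee(account):
    """Old version: charges a fee of up to 10, mutating the account in place."""
    fee = min(account.balance, 10)
    account.balance = account.balance - fee
    account.fees_paid = account.fees_paid + fee


def apply_fee_new(account):
    """Refactored version: same effect, written with augmented assignment."""
    fee = min(10, account.balance)
    account.balance -= fee
    account.fees_paid += fee


def check_refactoring(account):
    # Give each version its own copy so the side effect happens once per copy.
    old, new = copy.deepcopy(account), copy.deepcopy(account)
    apply_fee(old)
    apply_fee_new(new)
    # Dataclasses compare field by field, so this compares the final states.
    return old == new


def run_random_checks(trials=1000, seed=0):
    # Stand-in for a property-based test: random balances, fixed seed.
    rng = random.Random(seed)
    return all(
        check_refactoring(Account(balance=rng.randint(0, 100)))
        for _ in range(trials)
    )
```

    &lt;p&gt;This only works when the effect lands on something you can copy. A database write or a network call would need a different trick.&lt;/p&gt;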
    &lt;h3&gt;Behavior-changing rewrites&lt;/h3&gt;
    &lt;p&gt;We can apply similar ideas for rewrites that change &lt;em&gt;behavior&lt;/em&gt;. Say we have an API, and v1 returns a list of user names while v2 returns a &lt;code&gt;{version, userids}&lt;/code&gt; dict. Then we can find some transformation of v2 into v1, and run the refactoring invariant on that:&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;v2_to_v1&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;v2_resp&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;User&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;user_id&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;name&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;user_id&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;v2_resp&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"userids"&lt;/span&gt;&lt;span class="p"&gt;]]&lt;/span&gt;
    
    &lt;span class="nd"&gt;@given&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;some_query_generator&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;test_refactoring&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;q&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="k"&gt;assert&lt;/span&gt; &lt;span class="n"&gt;v1&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;q&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="n"&gt;v2_to_v1&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;v2&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;q&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;Fun fact: &lt;code&gt;v2_to_v1&lt;/code&gt; is a &lt;a href="https://buttondown.com/hillelwayne/archive/software-isomorphisms/" target="_blank"&gt;software homomorphism&lt;/a&gt;!&lt;/p&gt;
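    &lt;p&gt;For a version you can actually run, here's a rough sketch with a hypothetical in-memory user table. &lt;code&gt;USERS&lt;/code&gt;, the prefix-style query, and the bodies of &lt;code&gt;v1&lt;/code&gt; and &lt;code&gt;v2&lt;/code&gt; are all assumptions made for illustration, and the random loop stands in for a Hypothesis strategy:&lt;/p&gt;

```python
import random

# Hypothetical in-memory user table mapping user id to user name.
USERS = {1: "ada", 2: "alan", 3: "barbara", 4: "grace"}


def v1(prefix):
    """v1 API: returns a sorted list of user names matching the prefix."""
    return sorted(name for name in USERS.values() if name.startswith(prefix))


def v2(prefix):
    """v2 API: returns a versioned dict of matching user ids."""
    ids = [uid for uid, name in USERS.items() if name.startswith(prefix)]
    return {"version": 2, "userids": ids}


def v2_to_v1(v2_resp):
    """Map a v2 response back into the shape of a v1 response."""
    return sorted(USERS[user_id] for user_id in v2_resp["userids"])


def check_rewrite(trials=200, seed=0):
    # Random queries with a fixed seed, in place of a real query generator.
    rng = random.Random(seed)
    for _ in range(trials):
        prefix = rng.choice(["", "a", "al", "b", "g", "z"])
        if v1(prefix) != v2_to_v1(v2(prefix)):
            return False
    return True
```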
    &lt;div class="footnote"&gt;
    &lt;hr/&gt;
    &lt;ol&gt;
    &lt;li id="fn:invariant"&gt;
    &lt;p&gt;Well technically it's an &lt;em&gt;action property&lt;/em&gt; since it's on the transitions of states, not the states, but "refactoring invariant" gets the idea across better. &lt;a class="footnote-backref" href="#fnref:invariant" title="Jump back to footnote 1 in the text"&gt;↩&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;/ol&gt;
    &lt;/div&gt;</description>
                <pubDate>Tue, 24 Sep 2024 20:06:10 +0000</pubDate>
                <guid>https://buttondown.com/hillelwayne/archive/refactoring-invariants/</guid>
            </item>
            <item>
                <title>Goodhart's Law in Software Engineering</title>
                <link>https://buttondown.com/hillelwayne/archive/goodharts-law-in-software-engineering/</link>
                <description>&lt;h3&gt;Blog Hiatus&lt;/h3&gt;
    &lt;p&gt;You might have noticed I haven't been updating my website. I haven't even &lt;em&gt;looked&lt;/em&gt; at any of my drafts for the past three months. All that time is instead going into &lt;em&gt;Logic for Programmers&lt;/em&gt;. I'll get back to the site when that's done or in 2025, whichever comes first. Newsletter and &lt;a href="https://www.patreon.com/hillelwayne" target="_blank"&gt;Patreon&lt;/a&gt; will still get regular updates.&lt;/p&gt;
    &lt;p&gt;(As a comparison, the book is now 22k words. That's like 11 blog posts!)&lt;/p&gt;
    &lt;h2&gt;Goodhart's Law in Software Engineering&lt;/h2&gt;
    &lt;p&gt;I recently got into an argument with some people about whether small functions were &lt;em&gt;mostly&lt;/em&gt; a good idea or &lt;em&gt;always 100%&lt;/em&gt; a good idea, and it reminded me a lot about &lt;a href="https://en.wikipedia.org/wiki/Goodhart%27s_law" target="_blank"&gt;Goodhart's Law&lt;/a&gt;:&lt;/p&gt;
    &lt;blockquote&gt;
    &lt;p&gt;When a measure becomes a target, it ceases to be a good measure.&lt;/p&gt;
    &lt;/blockquote&gt;
    &lt;p&gt;The &lt;em&gt;weak&lt;/em&gt; version of this is that people have perverse incentives to game the metrics. If your metric is "number of bugs in the bug tracker", people will start spuriously closing bugs just to get the number down. &lt;/p&gt;
    &lt;p&gt;The &lt;em&gt;strong&lt;/em&gt; version of the law is that even 100% honest pursuit of a metric, taken far enough, is harmful to your goals, and this is an inescapable consequence of the difference between metrics and values. We have metrics in the first place because what we actually &lt;em&gt;care about&lt;/em&gt; is nonquantifiable. There's some &lt;em&gt;thing&lt;/em&gt; we want more of, but we have no way of directly measuring that thing. We &lt;em&gt;can&lt;/em&gt; measure something that looks like a rough approximation for our goal. But it's &lt;em&gt;not&lt;/em&gt; our goal, and if we replace the metric with the goal, we start taking actions that favor the metric over the goal.&lt;/p&gt;
    &lt;p&gt;Say we want more reliable software. How do you measure "reliability"? You can't. But you &lt;em&gt;can&lt;/em&gt; measure the number of bugs in the bug tracker, because fewer open bugs roughly means more reliability. &lt;strong&gt;This is not the same thing&lt;/strong&gt;. I've seen bugs fixed in ways that made the system &lt;em&gt;less&lt;/em&gt; reliable, but not in ways that translated into tracked bugs.&lt;/p&gt;
    &lt;p&gt;I am a firm believer in the strong version of Goodhart's law. Mostly because of this:&lt;/p&gt;
    &lt;p&gt;&lt;img alt="A peacock with its feathers out. The peacock is scremming" class="newsletter-image" src="https://assets.buttondown.email/images/2573503d-bc57-49ce-aa26-9d399d801118.jpg?w=960&amp;fit=max"/&gt;&lt;/p&gt;
    &lt;p&gt;What does a peahen look for in a mate? A male with maximum fitness. What's a metric that approximates fitness? How nice the plumage is, because nicer plumage = more calories to waste on plumage.&lt;sup id="fnref:peacock"&gt;&lt;a class="footnote-ref" href="#fn:peacock"&gt;1&lt;/a&gt;&lt;/sup&gt; But that only &lt;em&gt;approximates&lt;/em&gt; fitness, and over generations the plumage itself becomes the point at the cost of overall bird fitness. Sexual selection is Goodhart's law in action.&lt;/p&gt;
    &lt;div class="subscribe-form"&gt;&lt;/div&gt;
    &lt;p&gt;If the blind watchmaker can fall for Goodhart, people can too.&lt;/p&gt;
    &lt;h3&gt;Examples in Engineering&lt;/h3&gt;
    &lt;p&gt;Goodhart's law is a warning for pointy-haired bosses who come up with terrible metrics: lines added, feature points done, etc. I'm more interested in how it affects the metrics we set for ourselves that our bosses might never know about.&lt;/p&gt;
    &lt;ul&gt;
    &lt;li&gt;"Test coverage" is a proxy for how thoroughly we've tested our software. It diverges when we need to test lots of properties of the same lines of code, or when our worst bugs are emergent at the integration level.&lt;/li&gt;
    &lt;li&gt;"Cyclomatic complexity" and "function size" are proxies for code legibility. They diverge when we think about global module legibility, not local function legibility. Then too many functions can obscure the code and data flow.&lt;/li&gt;
    &lt;li&gt;Benchmarks are proxies for performant programs, and diverge when improving benchmarks slows down unbenchmarked operations.&lt;/li&gt;
    &lt;li&gt;Amount of time spent pairing/code reviewing/debugging/whatever proxies "being productive".&lt;/li&gt;
    &lt;li&gt;&lt;a href="https://dora.dev/" target="_blank"&gt;The DORA report&lt;/a&gt; is an interesting case, because it claims four metrics&lt;sup id="fnref:metrics"&gt;&lt;a class="footnote-ref" href="#fn:metrics"&gt;2&lt;/a&gt;&lt;/sup&gt; are proxies for ineffable goals like "elite performance" and &lt;em&gt;employee satisfaction&lt;/em&gt;. It also argues that you should minimize commit size to improve the DORA metrics. A proxy of a proxy of a goal!&lt;/li&gt;
    &lt;/ul&gt;
    &lt;h3&gt;What can we do about this?&lt;/h3&gt;
    &lt;p&gt;No, I do not know how to avoid a law that can hijack the process of evolution.&lt;/p&gt;
    &lt;p&gt;The 2023 DORA report suggests readers should avoid Goodhart's law and "assess a team's strength across a wide range of people, processes, and technical capabilities" (pg 10), which is kind of like saying the fix to production bugs is "don't write bugs". It's a guiding principle but not actionable advice that gets to that principle.&lt;/p&gt;
    &lt;p&gt;They also say "to use a combination of metrics to drive deeper understanding" (ibid), which makes more sense at first. If you have metrics X and Y to approximate goal G, then overoptimizing X &lt;em&gt;might&lt;/em&gt; hurt Y, indicating you're getting further from G. In practice I've seen it turn into "we can't improve X because it'll hurt Y and we can't improve Y because it'll hurt X." This &lt;em&gt;could&lt;/em&gt; mean we're at the best possible spot for G, but more often it means we're trapped very far from our goal. You could come up with a weighted combination of X and Y, like 0.7X + 0.3Y, but &lt;em&gt;that too&lt;/em&gt; is a metric subject to Goodhart. &lt;/p&gt;
    &lt;p&gt;I guess the best I can do is say "use your best engineering judgement"? Evolution is mindless, people aren't. Again, not an actionable or scalable bit of advice, but as I grow older I keep finding "use your best judgement" is all we can do. Knowledge work is ineffable and irreducible.&lt;/p&gt;
    &lt;div class="footnote"&gt;
    &lt;hr/&gt;
    &lt;ol&gt;
    &lt;li id="fn:peacock"&gt;
    &lt;p&gt;This sent me down a rabbit hole; turns out scientists are still debating what &lt;em&gt;exactly&lt;/em&gt; the peacock's tail is used for! Is it sexual selection? Adverse signalling? Something else??? &lt;a class="footnote-backref" href="#fnref:peacock" title="Jump back to footnote 1 in the text"&gt;↩&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id="fn:metrics"&gt;
    &lt;p&gt;How soon commits get to production, deployment frequency, percent of deployments that cause errors in production, and mean time to recovery. &lt;a class="footnote-backref" href="#fnref:metrics" title="Jump back to footnote 2 in the text"&gt;↩&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;/ol&gt;
    &lt;/div&gt;</description>
                <pubDate>Tue, 17 Sep 2024 16:33:40 +0000</pubDate>
                <guid>https://buttondown.com/hillelwayne/archive/goodharts-law-in-software-engineering/</guid>
            </item>
            <item>
                <title>Why Not Comments</title>
                <link>https://buttondown.com/hillelwayne/archive/why-not-comments/</link>
                <description>&lt;h2&gt;Logic For Programmers v0.3&lt;/h2&gt;
    &lt;p&gt;&lt;a href="https://leanpub.com/logic/" target="_blank"&gt;Now available&lt;/a&gt;! It's a light release as I learn more about formatting a nice-looking book. You can see some of the differences between v2 and v3 &lt;a href="https://bsky.app/profile/hillelwayne.com/post/3l3egdqnqj62o" target="_blank"&gt;here&lt;/a&gt;.&lt;/p&gt;
    &lt;h2&gt;Why Not Comments&lt;/h2&gt;
    &lt;p&gt;Code is written in a structured machine language, comments are written in an expressive human language. The "human language" bit makes comments more expressive and communicative than code. Code has a limited amount of something &lt;em&gt;like&lt;/em&gt; human language contained in identifiers. "Comment the why, not the what" means to push as much information as possible into identifiers. &lt;a href="https://buttondown.com/hillelwayne/archive/3866bd6e-22c3-4098-92ef-4d47ef287ed8" target="_blank"&gt;Not all "what" can be embedded like this&lt;/a&gt;, but a lot can.&lt;/p&gt;
    &lt;p&gt;In recent years I see more people arguing that &lt;em&gt;whys&lt;/em&gt; do not belong in comments either, that they can be embedded into &lt;code&gt;LongFunctionNames&lt;/code&gt; or the names of test cases. Virtually all "self-documenting" codebases add documentation through the addition of identifiers.&lt;sup id="fnref:exception"&gt;&lt;a class="footnote-ref" href="#fn:exception"&gt;1&lt;/a&gt;&lt;/sup&gt;&lt;/p&gt;
    &lt;p&gt;So what's something in the range of human expression that &lt;em&gt;cannot&lt;/em&gt; be represented with more code?&lt;/p&gt;
    &lt;p&gt;Negative information, drawing attention to what's &lt;em&gt;not&lt;/em&gt; there. The "why nots" of the system.&lt;/p&gt;
    &lt;h3&gt;A Recent Example&lt;/h3&gt;
    &lt;p&gt;This one comes from &lt;em&gt;Logic for Programmers&lt;/em&gt;. For convoluted technical reasons the epub build wasn't translating math notation (&lt;code&gt;\forall&lt;/code&gt;) into symbols (&lt;code&gt;∀&lt;/code&gt;). I wrote a script to manually go through and replace tokens in math strings with unicode equivalents. The easiest way to do this is to call &lt;code&gt;string = string.replace(old, new)&lt;/code&gt; for each one of the 16 math symbols I need to replace (some math strings have multiple symbols).&lt;/p&gt;
    &lt;p&gt;This is incredibly inefficient and I could instead do all 16 replacements in a single pass. But that would be a more complicated solution. So I did the simple way with a comment:&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;Does 16 passes over each string
    BUT there are only 25 math strings in the book so far and most are &amp;lt;5 characters.
    So it's still fast enough.
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;You can think of this as a "why I'm using slow code", but you can also think of it as "why not fast code". It's calling attention to something that's &lt;em&gt;not there&lt;/em&gt;.&lt;/p&gt;
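    &lt;p&gt;To make the tradeoff concrete, here's a rough sketch of both options. This is not the actual script: the &lt;code&gt;SYMBOLS&lt;/code&gt; table is a hypothetical four-symbol subset and both function names are made up. The first function is the simple multi-pass approach with its "why not" comment, the second is the single-pass alternative it decided against:&lt;/p&gt;

```python
import re

# Hypothetical subset of the symbol table; the real script handles 16 symbols.
SYMBOLS = {r"\forall": "∀", r"\exists": "∃", r"\land": "∧", r"\lor": "∨"}


def fix_math_simple(string):
    # Does one pass over the string per symbol.
    # BUT there are only a few dozen short math strings in the book,
    # so it's still fast enough.
    for old, new in SYMBOLS.items():
        string = string.replace(old, new)
    return string


# Longest tokens first, so a token that is a prefix of another cannot win.
_PATTERN = re.compile(
    "|".join(re.escape(tok) for tok in sorted(SYMBOLS, key=len, reverse=True))
)


def fix_math_single_pass(string):
    # The more complicated alternative: one regex pass over the string.
    # A function replacement avoids backslash handling in the replacement text.
    return _PATTERN.sub(lambda m: SYMBOLS[m.group(0)], string)
```

    &lt;p&gt;Nothing in the second function tells a reader it was considered and rejected. Only the comment in the first one does.&lt;/p&gt;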
    &lt;h3&gt;Why the comment&lt;/h3&gt;
    &lt;p&gt;If the slow code isn't causing any problems, why have a comment at all?&lt;/p&gt;
    &lt;div class="subscribe-form"&gt;&lt;/div&gt;
    &lt;p&gt;Well first of all the code might be a problem later. If a future version of &lt;em&gt;LfP&lt;/em&gt; has hundreds of math strings instead of a couple dozen then this build step will bottleneck the whole build. Good to lay a signpost now so I know exactly what to fix later.&lt;/p&gt;
    &lt;p&gt;But even if the code is fine forever, the comment still does something important: it shows &lt;em&gt;I'm aware of the tradeoff&lt;/em&gt;. Say I come back to my project two years from now, open &lt;code&gt;epub_math_fixer.py&lt;/code&gt; and see my terrible slow code. I ask "why did I write something so terrible?" Was it inexperience, time crunch, or just a random mistake?&lt;/p&gt;
    &lt;p&gt;The negative comment tells me that I &lt;em&gt;knew&lt;/em&gt; this was slow code, looked into the alternatives, and decided against optimizing. I don't have to spend a bunch of time reinvestigating only to come to the same conclusion. &lt;/p&gt;
    &lt;h2&gt;Why this can't be self-documented&lt;/h2&gt;
    &lt;p&gt;When I was first playing with this idea, someone told me that my negative comment isn't necessary, just name the function &lt;code&gt;RunFewerTimesSlowerAndSimplerAlgorithmAfterConsideringTradeOffs&lt;/code&gt;. Aside from the issues of being long, not explaining the tradeoffs, and that I'd have to change it everywhere if I ever optimize the code... This would make the code &lt;em&gt;less&lt;/em&gt; self-documenting. It doesn't tell you what the function actually &lt;em&gt;does&lt;/em&gt;.&lt;/p&gt;
    &lt;p&gt;The core problem is that function and variable identifiers can only contain one clause of information. I can't store "what the function does" and "what tradeoffs it makes" in the same identifier. &lt;/p&gt;
    &lt;p&gt;What about replacing the comment with a test? I guess you could make a test that greps for math blocks in the book and fails if there's more than 80? But that's not testing &lt;code&gt;EpubMathFixer&lt;/code&gt; directly. There's nothing in the function itself you can hook into. &lt;/p&gt;
    &lt;p&gt;That's the fundamental problem with self-documenting negative information. "Self-documentation" rides along with written code, and so describes what the code is doing. Negative information is about what the code is &lt;em&gt;not&lt;/em&gt; doing. &lt;/p&gt;
    &lt;h3&gt;End of newsletter speculation&lt;/h3&gt;
    &lt;p&gt;I wonder if you can think of "why not" comments as a case of counterfactuals. If so, are "abstractions of human communication" impossible to self-document in general? Can you self-document an analogy? Uncertainty? An ethical claim?&lt;/p&gt;
    &lt;div class="footnote"&gt;
    &lt;hr/&gt;
    &lt;ol&gt;
    &lt;li id="fn:exception"&gt;
    &lt;p&gt;One interesting exception someone told me: they make code "more self-documenting" by turning comments into &lt;em&gt;logging&lt;/em&gt;. I encouraged them to write it up as a blog post but so far they haven't. If they ever do I will link it here. &lt;a class="footnote-backref" href="#fnref:exception" title="Jump back to footnote 1 in the text"&gt;↩&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;/ol&gt;
    &lt;/div&gt;</description>
                <pubDate>Tue, 10 Sep 2024 19:40:29 +0000</pubDate>
                <guid>https://buttondown.com/hillelwayne/archive/why-not-comments/</guid>
            </item>
        </channel>
    </rss>
    Raw text
    <?xml version="1.0" encoding="utf-8"?>
    <rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Computer Things</title><link>https://buttondown.com/hillelwayne</link><description>Hi, I'm Hillel. This is the newsletter version of [my website](https://www.hillelwayne.com). I post all website updates here. I also post weekly content just for the newsletter, on topics like
    
    * Formal Methods
    
    * Software History and Culture
    
    * Fringetech and exotic tooling
    
    * The philosophy and theory of software engineering
    
    You can see the archive of all public essays [here](https://buttondown.email/hillelwayne/archive/).</description><atom:link href="https://buttondown.email/hillelwayne/rss" rel="self"/><language>en-us</language><lastBuildDate>Thu, 12 Jun 2025 15:43:25 +0000</lastBuildDate><item><title>Solving LinkedIn Queens with SMT</title><link>https://buttondown.com/hillelwayne/archive/solving-linkedin-queens-with-smt/</link><description>
    &lt;h3&gt;No newsletter next week&lt;/h3&gt;
    &lt;p&gt;I’ll be speaking at &lt;a href="https://systemsdistributed.com/" target="_blank"&gt;Systems Distributed&lt;/a&gt;. My talk isn't close to done yet, which is why this newsletter is both late and short. &lt;/p&gt;
    &lt;h1&gt;Solving LinkedIn Queens in SMT&lt;/h1&gt;
    &lt;p&gt;The article &lt;a href="https://codingnest.com/modern-sat-solvers-fast-neat-underused-part-1-of-n/" target="_blank"&gt;Modern SAT solvers: fast, neat and underused&lt;/a&gt; claims that SAT solvers&lt;sup id="fnref:SAT"&gt;&lt;a class="footnote-ref" href="#fn:SAT"&gt;1&lt;/a&gt;&lt;/sup&gt; are "criminally underused by the industry". A while back on the newsletter I asked "why": how come they're so powerful and yet nobody uses them? Many experts responded saying the reason is that encoding SAT kinda sucked and they rather prefer using tools that compile to SAT. &lt;/p&gt;
    &lt;p&gt;I was reminded of this when I read &lt;a href="https://ryanberger.me/posts/queens/" target="_blank"&gt;Ryan Berger's post&lt;/a&gt; on solving “LinkedIn Queens” as a SAT problem. &lt;/p&gt;
    &lt;p&gt;A quick overview of Queens. You’re presented with an NxN grid divided into N regions, and have to place N queens so that there is exactly one queen in each row, column, and region. While queens can be on the same diagonal, they &lt;em&gt;cannot&lt;/em&gt; be adjacently diagonal.&lt;/p&gt;
    &lt;p&gt;(Important note: Linkedin “Queens” is a variation on the puzzle game &lt;a href="https://starbattle.puzzlebaron.com/" target="_blank"&gt;Star Battle&lt;/a&gt;, which is the same except the number of stars you place in each row/column/region varies per puzzle, and is usually two. This is also why 'queens' don’t capture like chess queens.)&lt;/p&gt;
    &lt;p&gt;&lt;img alt="An image of a solved queens board. Copied from https://ryanberger.me/posts/queens" class="newsletter-image" src="https://assets.buttondown.email/images/96f6f923-331f-424d-8641-fe6753e1c2ca.png?w=960&amp;amp;fit=max"/&gt;&lt;/p&gt;
    &lt;p&gt;Ryan solved this by writing Queens as a SAT problem, expressing properties like "there is exactly one queen in row 3" as a large number of boolean clauses. &lt;a href="https://ryanberger.me/posts/queens/" target="_blank"&gt;Go read his post, it's pretty cool&lt;/a&gt;. What leapt out to me was that he used &lt;a href="https://cvc5.github.io/" target="_blank"&gt;CVC5&lt;/a&gt;, an &lt;strong&gt;SMT&lt;/strong&gt; solver.&lt;sup id="fnref:SMT"&gt;&lt;a class="footnote-ref" href="#fn:SMT"&gt;2&lt;/a&gt;&lt;/sup&gt; SMT solvers are "higher-level" than SAT, capable of handling more data types than just boolean variables. It's a lot easier to solve the problem at the SMT level than at the SAT level. To show this, I whipped up a short demo of solving the same problem in &lt;a href="https://github.com/Z3Prover/z3/wiki" target="_blank"&gt;Z3&lt;/a&gt; (via the &lt;a href="https://pypi.org/project/z3-solver/" target="_blank"&gt;Python API&lt;/a&gt;).&lt;/p&gt;
    &lt;p&gt;&lt;a href="https://gist.github.com/hwayne/c5de7bc52e733995311236666bedecd3" target="_blank"&gt;Full code here&lt;/a&gt;, which you can compare to Ryan's SAT solution &lt;a href="https://github.com/ryan-berger/queens/blob/master/main.py" target="_blank"&gt;here&lt;/a&gt;. I didn't do a whole lot of cleanup on it (again, time crunch!), but short explanation below.&lt;/p&gt;
    &lt;h3&gt;The code&lt;/h3&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nn"&gt;z3&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="c1"&gt;# type: ignore&lt;/span&gt;
    &lt;span class="kn"&gt;from&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nn"&gt;itertools&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;combinations&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;chain&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;product&lt;/span&gt;
    &lt;span class="n"&gt;solver&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;Solver&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="n"&gt;size&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;9&lt;/span&gt; &lt;span class="c1"&gt;# N&lt;/span&gt;
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;Initial setup and modules. &lt;code&gt;size&lt;/code&gt; is the number of rows/columns/regions in the board, which I'll call &lt;code&gt;N&lt;/code&gt; below.&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="c1"&gt;# queens[n] = col of queen on row n&lt;/span&gt;
    &lt;span class="c1"&gt;# by construction, not on same row&lt;/span&gt;
    &lt;span class="n"&gt;queens&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;IntVector&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'q'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;size&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; 
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;SAT represents the queen positions via N² booleans: &lt;code&gt;q_00&lt;/code&gt; means that a Queen is on row 0 and column 0, &lt;code&gt;!q_05&lt;/code&gt; means a queen &lt;em&gt;isn't&lt;/em&gt; on row 0 col 5, etc. In SMT we can instead encode it as N integers: &lt;code&gt;q_0 = 5&lt;/code&gt; means that the queen on row 0 is positioned at column 5. This immediately enforces one class of constraints for us: we don't need any constraints saying "exactly one queen per row", because that's embedded in the definition of &lt;code&gt;queens&lt;/code&gt;!&lt;/p&gt;
    &lt;p&gt;(Incidentally, using 0-based indexing for the board was a mistake on my part, it makes correctly encoding the regions later really painful.)&lt;/p&gt;
    &lt;p&gt;To actually make the variables &lt;code&gt;[q_0, q_1, …]&lt;/code&gt;, we use the Z3 affordance &lt;code&gt;IntVector(str, n)&lt;/code&gt; for making &lt;code&gt;n&lt;/code&gt; variables at once.&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="n"&gt;solver&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;add&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="n"&gt;And&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;=&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="n"&gt;size&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;queens&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
    &lt;span class="c1"&gt;# not on same column&lt;/span&gt;
    &lt;span class="n"&gt;solver&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;add&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;Distinct&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;queens&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;First we constrain all the integers to &lt;code&gt;[0, N)&lt;/code&gt;, then use the &lt;em&gt;incredibly&lt;/em&gt; handy &lt;code&gt;Distinct&lt;/code&gt; constraint to force all the integers to have different values. This guarantees at most one queen per column, which by the &lt;a href="https://en.wikipedia.org/wiki/Pigeonhole_principle" target="_blank"&gt;pigeonhole principle&lt;/a&gt; means there is exactly one queen per column.&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="c1"&gt;# not diagonally adjacent&lt;/span&gt;
    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;size&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="n"&gt;q1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;q2&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;queens&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="n"&gt;queens&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="o"&gt;+&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
        &lt;span class="n"&gt;solver&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;add&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;Abs&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;q1&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="n"&gt;q2&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;!=&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;One of the rules is that queens can't be adjacent. We already know that they can't be horizontally or vertically adjacent via other constraints, which leaves the diagonals. We only need to add constraints that, for each queen, there is no queen in the lower-left or lower-right corner, aka &lt;code&gt;q_3 != q_2 ± 1&lt;/code&gt;. We don't need to check the top corners because if &lt;code&gt;q_1&lt;/code&gt; is in the upper-left corner of &lt;code&gt;q_2&lt;/code&gt;, then &lt;code&gt;q_2&lt;/code&gt; is in the lower-right corner of &lt;code&gt;q_1&lt;/code&gt;!&lt;/p&gt;
    &lt;p&gt;That covers everything except the "one queen per region" constraint. But the regions are the tricky part, which we should expect because we vary the difficulty of queens games by varying the regions.&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="n"&gt;regions&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="s2"&gt;"purple"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[(&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;6&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;7&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;8&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
                       &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;6&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;7&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;8&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
                       &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;8&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)],&lt;/span&gt;
            &lt;span class="s2"&gt;"red"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;6&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;6&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;7&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;7&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span 
class="mi"&gt;8&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;8&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;),],&lt;/span&gt;
            &lt;span class="c1"&gt;# you get the picture&lt;/span&gt;
            &lt;span class="p"&gt;}&lt;/span&gt;
    
    &lt;span class="c1"&gt;# Some checking code left out, see below&lt;/span&gt;
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;The region has to be manually coded in, which is a huge pain.&lt;/p&gt;
    &lt;p&gt;(In the link, some validation code follows. Since it breaks up explaining the model I put it in the next section.)&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;r&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;regions&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;values&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
        &lt;span class="n"&gt;solver&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;add&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;Or&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
            &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;queens&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;row&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="n"&gt;col&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;row&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;col&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;r&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
            &lt;span class="p"&gt;))&lt;/span&gt;
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;Finally we have the region constraint. The easiest way I found to say "there is exactly one queen in each region" is to say "there is a queen in region 1 and a queen in region 2 and a queen in region 3", etc. Since there are exactly as many queens as regions, "at least one queen per region" forces exactly one. Then to say "there is a queen in region &lt;code&gt;purple&lt;/code&gt;" I wrote "&lt;code&gt;q_0 = 0&lt;/code&gt; OR &lt;code&gt;q_0 = 1&lt;/code&gt; OR … OR &lt;code&gt;q_1 = 0&lt;/code&gt; etc."&lt;/p&gt;
    &lt;p&gt;Why iterate over every position in the region instead of doing something like &lt;code&gt;(0, q[0]) in r&lt;/code&gt;? I tried that but it's not an expression that Z3 supports.&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;solver&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;check&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="n"&gt;sat&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;m&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;solver&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
        &lt;span class="nb"&gt;print&lt;/span&gt;&lt;span class="p"&gt;([(&lt;/span&gt;&lt;span class="n"&gt;l&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;m&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;l&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;l&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;queens&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;Finally, we solve and print the positions. Running this gives me:&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="p"&gt;[(&lt;/span&gt;&lt;span class="n"&gt;q__0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;q__1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;q__2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;8&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; 
     &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;q__3&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;q__4&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;7&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;q__5&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; 
     &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;q__6&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;q__7&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;q__8&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;6&lt;/span&gt;&lt;span class="p"&gt;)]&lt;/span&gt;
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;Which is the correct solution to the queens puzzle. I didn't benchmark the solution times, but I imagine it's considerably slower than a raw SAT solver. &lt;a href="https://github.com/audemard/glucose" target="_blank"&gt;Glucose&lt;/a&gt; is really, really fast.&lt;/p&gt;
    &lt;p&gt;But even so, solving the problem with SMT was a lot &lt;em&gt;easier&lt;/em&gt; than solving it with SAT. That satisfies me as an explanation for why people prefer it to SAT.&lt;/p&gt;
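    &lt;p&gt;The printed solution can also be checked without a solver. The sketch below is not part of the original script: it assumes the region grid printed by &lt;code&gt;render_regions&lt;/code&gt; in the sanity checks section is the actual board (the post only says the output looks "something like" it), and &lt;code&gt;check&lt;/code&gt; is a hypothetical helper name.&lt;/p&gt;

```python
# Column of the queen in each row, copied from the solver output above.
solution = [0, 5, 8, 2, 7, 4, 1, 3, 6]

# Region ids per square, copied from the rendered board (assumed accurate).
grid = [
    "111111111",
    "112333999",
    "122439999",
    "124437799",
    "124666779",
    "124467799",
    "122467899",
    "122555889",
    "112258899",
]


def check(solution, grid):
    size = len(grid)
    # One queen per row is implied by the list representation.
    one_per_column = sorted(solution) == list(range(size))
    # No two queens in neighbouring rows touch diagonally.
    no_diagonal_touch = all(abs(a - b) != 1 for a, b in zip(solution, solution[1:]))
    # Each queen lands in a different region, and every region is covered.
    regions_hit = [grid[row][col] for row, col in enumerate(solution)]
    one_per_region = len(set(regions_hit)) == len(set("".join(grid))) == size
    return one_per_column and no_diagonal_touch and one_per_region


print(check(solution, grid))
```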
    &lt;h3&gt;Sanity checks&lt;/h3&gt;
    &lt;p&gt;One bit I glossed over earlier was the sanity checking code. I &lt;em&gt;knew for sure&lt;/em&gt; that I was going to make a mistake encoding the &lt;code&gt;regions&lt;/code&gt;, and the solver wasn't going to provide useful information about what I did wrong. In cases like these, I like adding small tests and checks to catch mistakes early, because the solver certainly isn't going to catch them!&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="n"&gt;all_squares&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;set&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;product&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;size&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="n"&gt;repeat&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
    &lt;span class="k"&gt;def&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nf"&gt;test_i_set_up_problem_right&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
        &lt;span class="k"&gt;assert&lt;/span&gt; &lt;span class="n"&gt;all_squares&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="nb"&gt;set&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;chain&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;from_iterable&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;regions&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;values&lt;/span&gt;&lt;span class="p"&gt;()))&lt;/span&gt;
    
        &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;r1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;r2&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;combinations&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;regions&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;values&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
            &lt;span class="k"&gt;assert&lt;/span&gt; &lt;span class="ow"&gt;not&lt;/span&gt; &lt;span class="nb"&gt;set&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;r1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&lt;/span&gt; &lt;span class="nb"&gt;set&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;r2&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="nb"&gt;set&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;r1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&lt;/span&gt; &lt;span class="nb"&gt;set&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;r2&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;The first check was a quick test that I didn't leave any squares out, or accidentally put the same square in two regions. Converting the values into sets makes both checks a lot easier. Honestly I don't know why I didn't just use sets from the start, sets are great.&lt;/p&gt;
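    &lt;p&gt;For illustration, here is a rough variant of the same partition check that reports &lt;em&gt;which&lt;/em&gt; squares are wrong instead of only failing an assertion. The &lt;code&gt;check_partition&lt;/code&gt; helper and the deliberately broken 2x2 &lt;code&gt;toy&lt;/code&gt; board are invented for this sketch and are not from the original script.&lt;/p&gt;

```python
from itertools import chain, combinations, product


def check_partition(regions, size):
    """Return (squares in no region, squares shared by a pair of regions)."""
    all_squares = set(product(range(size), repeat=2))
    listed = set(chain.from_iterable(regions.values()))
    missing = all_squares - listed
    overlaps = {
        (name1, name2): set(r1).intersection(r2)
        for (name1, r1), (name2, r2) in combinations(regions.items(), 2)
        if not set(r1).isdisjoint(r2)
    }
    return missing, overlaps


# A deliberately wrong 2x2 board: (1, 1) is missing and (0, 1) is listed twice.
toy = {"purple": [(0, 0), (0, 1)], "red": [(0, 1), (1, 0)]}
missing, overlaps = check_partition(toy, size=2)
print(missing)
print(overlaps)
```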
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="k"&gt;def&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nf"&gt;render_regions&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
        &lt;span class="n"&gt;colormap&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"purple"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;  &lt;span class="s2"&gt;"red"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;"brown"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;"white"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;"green"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;"yellow"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;"orange"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;"blue"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;"pink"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
        &lt;span class="n"&gt;board&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;_&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;size&lt;/span&gt;&lt;span class="p"&gt;)]&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;_&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;size&lt;/span&gt;&lt;span class="p"&gt;)]&lt;/span&gt; 
        &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;row&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;col&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;all_squares&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;color&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;region&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;regions&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;items&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
                &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;row&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;col&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;region&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
                    &lt;span class="n"&gt;board&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;row&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="n"&gt;col&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;colormap&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;index&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;color&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;+&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;
    
        &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;row&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;board&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="nb"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;""&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;join&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;map&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;row&lt;/span&gt;&lt;span class="p"&gt;)))&lt;/span&gt;
    
    &lt;span class="n"&gt;render_regions&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;The second check is something that prints out the regions. It produces something like this:&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;111111111
    112333999
    122439999
    124437799
    124666779
    124467799
    122467899
    122555889
    112258899
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;I can compare this to the picture of the board to make sure I got it right. I guess a more advanced solution would be to print emoji squares like 🟥 instead.&lt;/p&gt;
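    &lt;p&gt;A minimal sketch of that emoji idea, assuming the digit grid above as input. The digit-to-emoji mapping follows the order of &lt;code&gt;colormap&lt;/code&gt;. Unicode has no pink square, so a black square stands in for "pink" here, which is purely an assumption of this sketch.&lt;/p&gt;

```python
# Region number (as printed by render_regions) to a coloured square.
# "9" is pink in the colormap; a black square is used as a stand-in.
EMOJI = {
    "1": "🟪", "2": "🟥", "3": "🟫", "4": "⬜", "5": "🟩",
    "6": "🟨", "7": "🟧", "8": "🟦", "9": "⬛",
}


def render_emoji(rows):
    """Turn rows of region digits into rows of emoji squares."""
    return ["".join(EMOJI[digit] for digit in row) for row in rows]


for line in render_emoji(["111111111", "112333999", "122439999"]):
    print(line)
```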
    &lt;p&gt;Neither check is quality code but it's throwaway and it gets the job done so eh.&lt;/p&gt;
    &lt;h3&gt;Update for the Internet&lt;/h3&gt;
    &lt;p&gt;This was sent as a weekly newsletter, which is usually on topics like &lt;a href="https://buttondown.com/hillelwayne/archive/why-do-we-call-it-boilerplate-code" target="_blank"&gt;software history&lt;/a&gt;, &lt;a href="https://buttondown.com/hillelwayne/archive/the-seven-specification-ur-languages/" target="_blank"&gt;formal methods&lt;/a&gt;, &lt;a href="https://buttondown.com/hillelwayne/archive/i-formally-modeled-dreidel-for-no-good-reason/" target="_blank"&gt;unusual technologies&lt;/a&gt;, and the &lt;a href="https://buttondown.com/hillelwayne/archive/be-suspicious-of-success/" target="_blank"&gt;theory of software engineering&lt;/a&gt;. You &lt;a href="https://buttondown.email/hillelwayne/" target="_blank"&gt;can subscribe here&lt;/a&gt;.&lt;/p&gt;
    &lt;div class="footnote"&gt;
    &lt;hr/&gt;
    &lt;ol&gt;
    &lt;li id="fn:SAT"&gt;
    &lt;p&gt;"Boolean &lt;strong&gt;SAT&lt;/strong&gt;isfiability Solver", aka a solver that can find assignments that make complex boolean expressions true. I write a bit more about them &lt;a href="https://www.hillelwayne.com/post/np-hard/" target="_blank"&gt;here&lt;/a&gt;. &lt;a class="footnote-backref" href="#fnref:SAT" title="Jump back to footnote 1 in the text"&gt;↩&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id="fn:SMT"&gt;
    &lt;p&gt;"Satisfiability Modulo Theories" &lt;a class="footnote-backref" href="#fnref:SMT" title="Jump back to footnote 2 in the text"&gt;↩&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;/ol&gt;
    &lt;/div&gt;
    </description><pubDate>Thu, 12 Jun 2025 15:43:25 +0000</pubDate><guid>https://buttondown.com/hillelwayne/archive/solving-linkedin-queens-with-smt/</guid></item><item><title>AI is a gamechanger for TLA+ users</title><link>https://buttondown.com/hillelwayne/archive/ai-is-a-gamechanger-for-tla-users/</link><description>
    &lt;h3&gt;New Logic for Programmers Release&lt;/h3&gt;
    &lt;p&gt;&lt;a href="https://leanpub.com/logic/" target="_blank"&gt;v0.10 is now available&lt;/a&gt;! This is a minor release, mostly focused on logic-based refactoring, with new material on set types and testing that refactors are correct. See the full release notes at &lt;a href="https://github.com/logicforprogrammers/book-assets/blob/master/CHANGELOG.md" target="_blank"&gt;the changelog page&lt;/a&gt;. Due to &lt;a href="https://systemsdistributed.com/" target="_blank"&gt;conference pressure&lt;/a&gt; v0.11 will also likely be a minor release.&lt;/p&gt;
    &lt;p&gt;&lt;img alt="The book cover" class="newsletter-image" src="https://assets.buttondown.email/images/29d4ae9d-bcb9-4d8b-99d4-8a35c0990ad5.jpg?w=300&amp;amp;fit=max"/&gt;&lt;/p&gt;
    &lt;h1&gt;AI is a gamechanger for TLA+ users&lt;/h1&gt;
    &lt;p&gt;&lt;a href="https://lamport.azurewebsites.net/tla/tla.html" target="_blank"&gt;TLA+&lt;/a&gt; is a specification language to model and debug distributed systems. While very powerful, it's also hard for programmers to learn, and there are always questions about connecting specifications with actual code.&lt;/p&gt;
    &lt;p&gt;That's why &lt;a href="https://zfhuang99.github.io/github%20copilot/formal%20verification/tla+/2025/05/24/ai-revolution-in-distributed-systems.html" target="_blank"&gt;The Coming AI Revolution in Distributed Systems&lt;/a&gt; caught my interest. In the post, Cheng Huang claims that Azure successfully used LLMs to examine an existing codebase, derive a TLA+ spec, and find a production bug in that spec. "After a decade of manually crafting TLA+ specifications", he wrote, "I must acknowledge that this AI-generated specification rivals human work".&lt;/p&gt;
    &lt;p&gt;This inspired me to experiment with LLMs in TLA+ myself. My goals are a little less ambitious than Cheng's: I wanted to see how LLMs could help junior specifiers write TLA+, rather than handling the entire spec automatically. Details on what did and didn't work below, but my takeaway is that &lt;strong&gt;LLMs are an immense specification force multiplier.&lt;/strong&gt;&lt;/p&gt;
    &lt;p&gt;All tests were done with a standard VSCode Copilot subscription, using Claude 3.7 in Agent mode. Other LLMs or IDEs may be more or less effective, etc.&lt;/p&gt;
    &lt;h2&gt;Things Claude was good at&lt;/h2&gt;
    &lt;h3&gt;Fixing syntax errors&lt;/h3&gt;
    &lt;p&gt;TLA+ uses a very different syntax than mainstream programming languages, meaning beginners make a lot of mistakes where they use "programming syntax" instead of TLA+ syntax:&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;NotThree(x) = \* should be ==, not =
        x != 3 \* should be #, not !=
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;The problem is that the TLA+ syntax checker, SANY, is 30 years old and doesn't provide good information. Here's what it says for that snippet:&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;Was expecting "==== or more Module body"
    Encountered "NotThree" at line 6, column 1
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;That only isolates one error and doesn't tell us what the problem is, only where it is. Experienced TLA+ users get "error eyes" and can quickly see what the problem is, but beginners really struggle with this.&lt;/p&gt;
    &lt;p&gt;The TLA+ foundation has made LLM integration a priority, so the VSCode extension &lt;a href="https://github.com/tlaplus/vscode-tlaplus/blob/master/src/main.ts#L174" target="_blank"&gt;naturally supports several agent actions&lt;/a&gt;. One of these is running SANY, meaning an agent can get an error, fix it, get another error, fix it, etc. Provided the above sample and asked to make it work, Claude successfully fixed both errors. It also fixed many errors in a larger spec, and figured out why PlusCal specs weren't compiling to TLA+.&lt;/p&gt;
    &lt;p&gt;This by itself is already enough to make LLMs a worthwhile tool, as it fixes one of the biggest barriers to entry.&lt;/p&gt;
    &lt;h3&gt;Understanding error traces&lt;/h3&gt;
    &lt;p&gt;When TLA+ finds a violated property, it outputs the sequence of steps that leads to the error. This starts in plaintext, and VSCode parses it into an interactive table:&lt;/p&gt;
    &lt;p&gt;&lt;img alt="An example error trace" class="newsletter-image" src="https://assets.buttondown.email/images/f7f16d0e-c61f-4286-ae49-67e03f844126.png?w=960&amp;amp;fit=max"/&gt;&lt;/p&gt;
    &lt;p&gt;Learning to read these error traces is a skill in itself. You have to understand what's happening in each step and how it relates back to the actually broken property. It takes a long time for people to learn how to do this well.&lt;/p&gt;
    &lt;p&gt;Claude was successful here, too, accurately reading 20+ step error traces and giving a high-level explanation of what went wrong. It also could condense error traces: if ten steps of the error trace could be condensed into a one-sentence summary (which can happen if you're modeling a lot of process internals) Claude would do it.&lt;/p&gt;
    &lt;p&gt;I did have issues here with doing this in agent mode: while the extension does provide a "run model checker" command, the agent would regularly ignore this and prefer to run a terminal command instead. This would be fine except that the LLM consistently hallucinated invalid commands. I had to amend every prompt with "run the model checker via vscode, do not use a terminal command". You can skip this if you're willing to copy and paste the error trace into the prompt.&lt;/p&gt;
    &lt;p&gt;As with syntax checking, if this was the &lt;em&gt;only&lt;/em&gt; thing LLMs could effectively do, that would already be enough&lt;sup id="fnref:dayenu"&gt;&lt;a class="footnote-ref" href="#fn:dayenu"&gt;1&lt;/a&gt;&lt;/sup&gt; to earn a strong recommend. Even as a TLA+ expert I expect I'll be using this trick regularly. &lt;/p&gt;
    &lt;h3&gt;Boilerplate tasks&lt;/h3&gt;
    &lt;p&gt;TLA+ has a lot of boilerplate. One of the most notorious examples is &lt;code&gt;UNCHANGED&lt;/code&gt; rules. Specifications are extremely precise — so precise that you have to specify what variables &lt;em&gt;don't&lt;/em&gt; change in every step. This takes the form of an &lt;code&gt;UNCHANGED&lt;/code&gt; clause at the end of relevant actions:&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;RemoveObjectFromStore(srv, o, s) ==
      /\ o \in stored[s]
      /\ stored' = [stored EXCEPT ![s] = @ \ {o}]
      /\ UNCHANGED &amp;lt;&amp;lt;capacity, log, objectsize, pc&amp;gt;&amp;gt;
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;Writing this is really annoying. Updating these whenever you change an action, or add a new variable to the spec, is doubly so. Syntax checking and error analysis are important for beginners, but this is what I wanted for &lt;em&gt;myself&lt;/em&gt;. I took a spec and prompted Claude&lt;/p&gt;
    &lt;blockquote&gt;
    &lt;p&gt;Add UNCHANGED &amp;lt;&amp;lt;v1, v2, etc&amp;gt;&amp;gt; for each variable not changed in an action.&lt;/p&gt;
    &lt;/blockquote&gt;
    &lt;p&gt;And it worked! It successfully updated the &lt;code&gt;UNCHANGED&lt;/code&gt; in every action. &lt;/p&gt;
    &lt;p&gt;(Note, though, that it was a "well-behaved" spec in this regard: only one "action" happened at a time. In TLA+ you can have two actions happen simultaneously, that each update half of the variables, meaning neither of them should have an &lt;code&gt;UNCHANGED&lt;/code&gt; clause. I haven't tested how Claude handles that!)&lt;/p&gt;
    &lt;p&gt;That's the most obvious win, but Claude was good at handling other tedious work, too. Some examples include updating &lt;code&gt;vars&lt;/code&gt; (the conventional collection of all state variables), lifting a hard-coded value into a model parameter, and changing data formats. Most impressive to me, though, was rewriting a spec designed for one process to instead handle multiple processes. This means taking all of the process variables, which originally have types like &lt;code&gt;Int&lt;/code&gt;, converting them to types like &lt;code&gt;[Process -&amp;gt; Int]&lt;/code&gt;, and then updating the uses of all of those variables in the spec. It didn't account for race conditions in the new concurrent behavior, but it was an excellent scaffold to do more work.&lt;/p&gt;
    &lt;h3&gt;Writing properties from an informal description&lt;/h3&gt;
    &lt;p&gt;You have to be pretty precise with your intended property description, but Claude handles converting that precise description into TLA+'s formalized syntax, which is something beginners often struggle with.&lt;/p&gt;
    &lt;h2&gt;Things it is less good at&lt;/h2&gt;
    &lt;h3&gt;Generating model config files&lt;/h3&gt;
    &lt;p&gt;To model check TLA+, you need both a specification (&lt;code&gt;.tla&lt;/code&gt;) and a model config file (&lt;code&gt;.cfg&lt;/code&gt;), which have separate syntaxes. Asking the agent to generate the second often led to it using TLA+ syntax. It automatically fixed this after getting parsing errors, though.&lt;/p&gt;
    &lt;h3&gt;Fixing specs&lt;/h3&gt;
    &lt;p&gt;Whenever the agent ran model checking and discovered a bug, it would naturally propose a change to either the invalid property or the spec. Sometimes the changes were good, other times the changes were not physically realizable. For example, if it found that a bug was due to a race condition between processes, it would often suggest fixing it by saying race conditions were okay. I mean yes, if you say bugs are okay, then the spec finds that bugs are okay! Or it would alternatively suggest adding a constraint to the spec saying that race conditions don't happen. &lt;a href="https://www.hillelwayne.com/post/alloy-facts/" target="_blank"&gt;But that's a huge mistake in specification&lt;/a&gt;, because race conditions happen if we don't have coordination. We need to specify the &lt;em&gt;mechanism&lt;/em&gt; that is supposed to prevent them.&lt;/p&gt;
    &lt;h3&gt;Finding properties of the spec&lt;/h3&gt;
    &lt;p&gt;After seeing how capable it was at translating my properties to TLA+, I started prompting Claude to come up with properties on its own. Unfortunately, almost everything I got back was either trivial, uninteresting, or too coupled to implementation details. I haven't tested if it would work better to ask it for "properties that may be violated".&lt;/p&gt;
    &lt;h3&gt;Generating code from specs&lt;/h3&gt;
    &lt;p&gt;I have to be specific here: Claude &lt;em&gt;could&lt;/em&gt; sometimes convert Python into a passable spec, and vice versa. It &lt;em&gt;wasn't&lt;/em&gt; good at recognizing abstraction. For example, TLA+ specifications often represent sequential operations with a state variable, commonly called &lt;code&gt;pc&lt;/code&gt;. If modeling code that nonatomically retrieves a counter value and increments it, we'd have one action that requires &lt;code&gt;pc = "Get"&lt;/code&gt; and sets the new value to &lt;code&gt;"Inc"&lt;/code&gt;, then another that requires it be &lt;code&gt;"Inc"&lt;/code&gt; and sets it to &lt;code&gt;"Done"&lt;/code&gt;.&lt;/p&gt;
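    &lt;p&gt;As a rough illustration of that pattern, here is a minimal Python sketch (a simulator, not TLA+). All of the names (&lt;code&gt;new_state&lt;/code&gt;, &lt;code&gt;get&lt;/code&gt;, &lt;code&gt;inc&lt;/code&gt;, &lt;code&gt;run&lt;/code&gt;) and the per-process &lt;code&gt;pc&lt;/code&gt; and &lt;code&gt;tmp&lt;/code&gt; maps are assumptions made up for the example. Each action is guarded by the current &lt;code&gt;pc&lt;/code&gt; value, and interleaving two processes shows the lost update that the nonatomic split allows:&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;# Sketch only: a tiny simulator of the two pc-guarded actions described above.
    # The names (new_state, get, inc, run) are hypothetical, not from a real spec.
    
    def new_state(procs):
        return {
            "counter": 0,
            "pc": {p: "Get" for p in procs},  # per-process program counter
            "tmp": {p: 0 for p in procs},     # per-process local copy of counter
        }
    
    def get(state, p):
        # Enabled only when pc[p] == "Get": read the counter, then move to "Inc".
        assert state["pc"][p] == "Get"
        state["tmp"][p] = state["counter"]
        state["pc"][p] = "Inc"
    
    def inc(state, p):
        # Enabled only when pc[p] == "Inc": write back tmp + 1, then move to "Done".
        assert state["pc"][p] == "Inc"
        state["counter"] = state["tmp"][p] + 1
        state["pc"][p] = "Done"
    
    def run(schedule):
        # schedule is one interleaving: a list of (action, process) pairs.
        state = new_state({p for _, p in schedule})
        for action, p in schedule:
            action(state, p)
        return state["counter"]
    
    sequential = run([(get, "a"), (inc, "a"), (get, "b"), (inc, "b")])
    interleaved = run([(get, "a"), (get, "b"), (inc, "a"), (inc, "b")])
    print(sequential, interleaved)  # 2 1
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;In this sketch &lt;code&gt;pc&lt;/code&gt; exists only to sequence the two steps. In an implementation that ordering would normally come from the order of statements, so &lt;code&gt;pc&lt;/code&gt; is a modeling device and not data the program needs to store.&lt;/p&gt;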
    &lt;p&gt;I found that Claude would try to somehow convert &lt;code&gt;pc&lt;/code&gt; into part of the Python program's state, rather than recognize it as a TLA+ abstraction. On the other side, when converting Python code to TLA+ it would often try to translate things like &lt;code&gt;sleep&lt;/code&gt; into some part of the spec, not recognizing that it is abstractable into a distinct action. I didn't test other possible misconceptions, like converting randomness to nondeterminism.&lt;/p&gt;
    &lt;p&gt;For the record, when converting TLA+ to Python Claude tended to make simulators of the spec, rather than possible production code implementing the spec. I really wasn't expecting otherwise though.&lt;/p&gt;
    &lt;h2&gt;Unexplored Applications&lt;/h2&gt;
    &lt;p&gt;Things I haven't explored thoroughly but could possibly be effective, based on what I know about TLA+ and AI:&lt;/p&gt;
    &lt;h3&gt;Writing Java Overrides&lt;/h3&gt;
    &lt;p&gt;Most TLA+ operators are resolved via TLA+ interpreters, but you can also implement them in "native" Java. This lets you escape the standard language semantics and add capabilities like &lt;a href="https://github.com/tlaplus/CommunityModules/blob/master/modules/IOUtils.tla" target="_blank"&gt;executing programs during model-checking&lt;/a&gt; or &lt;a href="https://github.com/tlaplus/tlaplus/blob/master/tlatools/org.lamport.tlatools/src/tla2sany/StandardModules/TLC.tla#L62" target="_blank"&gt;dynamically constraining the depth of the searched state space&lt;/a&gt;. There's a lot of cool things I think would be possible with overrides. The problem is there's only a handful of people in the world who know how to write them. But that handful have written quite a few overrides and I think there's enough there for Claude to work with.&lt;/p&gt;
    &lt;h3&gt;Writing specs, given a reference mechanism&lt;/h3&gt;
    &lt;p&gt;In all my experiments, the LLM only had my prompts and the occasional Python script as information. That makes me suspect that some of its problems with writing and fixing specs come down to not having a system model. Maybe it wouldn't suggest fixes like "these processes never race" if it had a design doc saying that the processes can't coordinate. &lt;/p&gt;
    &lt;p&gt;(Could a Sufficiently Powerful LLM derive some TLA+ specification from a design document?)&lt;/p&gt;
    &lt;h3&gt;Connecting specs and code&lt;/h3&gt;
    &lt;p&gt;This is the holy grail of TLA+: taking a codebase and showing it correctly implements a spec. Currently the best ways to do this are by either using TLA+ to generate a test suite, or by taking logged production traces and matching them to TLA+ behaviors. &lt;a href="https://www.mongodb.com/blog/post/engineering/conformance-checking-at-mongodb-testing-our-code-matches-our-tla-specs" target="_blank"&gt;This blog post discusses both&lt;/a&gt;. While I've seen a lot of academic research into these approaches there are no industry-ready tools. So if you want trace validation you have to do a lot of manual labour tailored to your specific product. &lt;/p&gt;
    &lt;p&gt;If LLMs could do some of this work for us then that'd really amplify the usefulness of TLA+ to many companies.&lt;/p&gt;
    &lt;h2&gt;Thoughts&lt;/h2&gt;
    &lt;p&gt;&lt;em&gt;Right now&lt;/em&gt;, agents seem good at the tedious and routine parts of TLA+ and worse at the strategic and abstraction parts. But, since the routine parts are often a huge barrier to beginners, this means that LLMs have the potential to make TLA+ far, far more accessible than it previously was.&lt;/p&gt;
    &lt;p&gt;I have mixed thoughts on this. As an &lt;em&gt;advocate&lt;/em&gt;, this is incredible. I want more people using formal specifications because I believe it leads to cheaper, safer, more reliable software. Anything that gets people comfortable with specs is great for our industry. As a &lt;em&gt;professional TLA+ consultant&lt;/em&gt;, I'm worried that this obsoletes me. Most of my income comes from training and coaching, which companies will have far less demand for now. Then again, maybe this is an opportunity to pitch "agentic TLA+ training" to companies!&lt;/p&gt;
    &lt;p&gt;Anyway, if you're interested in TLA+, there has never been a better time to try it. I mean it, these tools handle so much of the hard part now. I've got a &lt;a href="https://learntla.com/" target="_blank"&gt;free book available online&lt;/a&gt;, as does &lt;a href="https://lamport.azurewebsites.net/tla/book.html" target="_blank"&gt;the inventor of TLA+&lt;/a&gt;. I like &lt;a href="https://elliotswart.github.io/pragmaticformalmodeling/" target="_blank"&gt;this guide too&lt;/a&gt;. Happy modeling!&lt;/p&gt;
    &lt;div class="footnote"&gt;
    &lt;hr/&gt;
    &lt;ol&gt;
    &lt;li id="fn:dayenu"&gt;
    &lt;p&gt;Dayenu. &lt;a class="footnote-backref" href="#fnref:dayenu" title="Jump back to footnote 1 in the text"&gt;↩&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;/ol&gt;
    &lt;/div&gt;
    </description><pubDate>Thu, 05 Jun 2025 14:59:11 +0000</pubDate><guid>https://buttondown.com/hillelwayne/archive/ai-is-a-gamechanger-for-tla-users/</guid></item><item><title>What does "Undecidable" mean, anyway</title><link>https://buttondown.com/hillelwayne/archive/what-does-undecidable-mean-anyway/</link><description>
    &lt;h3&gt;Systems Distributed&lt;/h3&gt;
    &lt;p&gt;I'll be speaking at &lt;a href="https://systemsdistributed.com/" target="_blank"&gt;Systems Distributed&lt;/a&gt; next month! The talk is brand new and will aim to showcase some of the formal methods mental models that would be useful in mainstream software development. It has added some extra stress on my schedule, though, so expect the next two monthly releases of &lt;em&gt;Logic for Programmers&lt;/em&gt; to be mostly minor changes.&lt;/p&gt;
    &lt;h2&gt;What does "Undecidable" mean, anyway&lt;/h2&gt;
    &lt;p&gt;Last week I read &lt;a href="https://liamoc.net/forest/loc-000S/index.xml" target="_blank"&gt;Against Curry-Howard Mysticism&lt;/a&gt;, which is a solid article I recommend reading. But this newsletter is actually about &lt;a href="https://lobste.rs/s/n0whur/against_curry_howard_mysticism#c_lbts57" target="_blank"&gt;one comment&lt;/a&gt;:&lt;/p&gt;
    &lt;blockquote&gt;
    &lt;p&gt;I like to see posts like this because I often feel like I can’t tell the difference between BS and a point I’m missing. Can we get one for questions like “Isn’t XYZ (Undecidable|NP-Complete|PSPACE-Complete)?” &lt;/p&gt;
    &lt;/blockquote&gt;
    &lt;p&gt;I've already written one of these for &lt;a href="https://www.hillelwayne.com/post/np-hard/" target="_blank"&gt;NP-complete&lt;/a&gt;, so let's do one for "undecidable". Step one is to pull a technical definition from the book &lt;a href="https://link.springer.com/book/10.1007/978-1-4612-1844-9" target="_blank"&gt;&lt;em&gt;Automata and Computability&lt;/em&gt;&lt;/a&gt;:&lt;/p&gt;
    &lt;blockquote&gt;
    &lt;p&gt;A property P of strings is said to be decidable if ... there is a total Turing machine that accepts input strings that have property P and rejects those that do not. (pg 220)&lt;/p&gt;
    &lt;/blockquote&gt;
    &lt;p&gt;Step two is to translate the technical computer science definition into more conventional programmer terms. Warning, because this is a newsletter and not a blog post, I might be a little sloppy with terms.&lt;/p&gt;
    &lt;h3&gt;Machines and Decision Problems&lt;/h3&gt;
    &lt;p&gt;In automata theory, all inputs to a "program" are strings of characters, and all outputs are "true" or "false". A program "accepts" a string if it outputs "true", and "rejects" if it outputs "false". You can think of this as automata studying all pure functions of type &lt;code&gt;f :: string -&amp;gt; boolean&lt;/code&gt;. Problems solvable by finding such an &lt;code&gt;f&lt;/code&gt; are called "decision problems".&lt;/p&gt;
    &lt;p&gt;This covers more than you'd think, because we can bootstrap more powerful functions from these. First, as anyone who's programmed in bash knows, strings can represent any other data. Second, we can fake non-boolean outputs by instead checking if a certain computation gives a certain result. For example, I can reframe the function &lt;code&gt;add(x, y) = x + y&lt;/code&gt; as a decision problem like this:&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;IS_SUM(str) {
        x, y, z = split(str, "#")
        return x + y == z
    }
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;Then because &lt;code&gt;IS_SUM("2#3#5")&lt;/code&gt; returns true, we know &lt;code&gt;2 + 3 == 5&lt;/code&gt;, while &lt;code&gt;IS_SUM("2#3#6")&lt;/code&gt; is false. Since we can bootstrap parameters out of strings, I'll just say it's &lt;code&gt;IS_SUM(x, y, z)&lt;/code&gt; going forward.&lt;/p&gt;
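    &lt;p&gt;For readers who want to run this, here is a minimal Python sketch of the same decision problem. The function name &lt;code&gt;is_sum&lt;/code&gt; and the choice to reject malformed input are assumptions; the pseudocode above doesn't say what happens on strings that aren't of the form &lt;code&gt;x#y#z&lt;/code&gt;.&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;# Sketch: IS_SUM as a pure string to boolean function.
    # Assumes non-negative decimal integers separated by "#".
    
    def is_sum(s):
        parts = s.split("#")
        # Reject anything that is not exactly three runs of decimal digits.
        if len(parts) != 3 or not all(p.isdecimal() for p in parts):
            return False
        x, y, z = (int(p) for p in parts)
        return x + y == z
    
    print(is_sum("2#3#5"), is_sum("2#3#6"))  # True False
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;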
    &lt;p&gt;A big part of automata theory is studying different models of computation with different strengths. One of the weakest is called &lt;a href="https://en.wikipedia.org/wiki/Deterministic_finite_automaton" target="_blank"&gt;"DFA"&lt;/a&gt;. I won't go into any details about what DFA actually can do, but the important thing is that it &lt;em&gt;can't&lt;/em&gt; solve &lt;code&gt;IS_SUM&lt;/code&gt;. That is, if you give me a DFA that takes inputs of form &lt;code&gt;x#y#z&lt;/code&gt;, I can always find an input where the DFA returns true when &lt;code&gt;x + y != z&lt;/code&gt;, &lt;em&gt;or&lt;/em&gt; an input which returns false when &lt;code&gt;x + y == z&lt;/code&gt;.&lt;/p&gt;
    &lt;p&gt;It's really important to keep this model of "solve" in mind: a program solves a problem if it correctly returns true on all true inputs and correctly returns false on all false inputs.&lt;/p&gt;
    &lt;h3&gt;(total) Turing Machines&lt;/h3&gt;
    &lt;p&gt;A Turing Machine (TM) is a particular type of computation model. It's important for two reasons: &lt;/p&gt;
    &lt;ol&gt;
    &lt;li&gt;
    &lt;p&gt;By the &lt;a href="https://en.wikipedia.org/wiki/Church%E2%80%93Turing_thesis" target="_blank"&gt;Church-Turing thesis&lt;/a&gt;, a Turing Machine is the "upper bound" of how powerful (physically realizable) computational models can get. This means that if an actual real-world programming language can solve a particular decision problem, so can a TM. Conversely, if the TM &lt;em&gt;can't&lt;/em&gt; solve it, neither can the programming language.&lt;sup id="fnref:caveat"&gt;&lt;a class="footnote-ref" href="#fn:caveat"&gt;1&lt;/a&gt;&lt;/sup&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li&gt;
    &lt;p&gt;It's possible to write a Turing machine that takes &lt;em&gt;a textual representation of another Turing machine&lt;/em&gt; as input, and then simulates that Turing machine as part of its computations. &lt;/p&gt;
    &lt;/li&gt;
    &lt;/ol&gt;
    &lt;p&gt;Property (1) means that we can move between different computational models of equal strength, proving things about one to learn things about another. That's why I'm able to write &lt;code&gt;IS_SUM&lt;/code&gt; in a pseudocode instead of writing it in terms of the TM computational model (and why I was able to use &lt;code&gt;split&lt;/code&gt; for convenience). &lt;/p&gt;
    &lt;p&gt;Property (2) does several interesting things. First of all, it makes it possible to compose Turing machines. Here's how I can roughly ask if a given number is the sum of two primes, with "just" addition and boolean functions:&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;IS_SUM_TWO_PRIMES(z):
        x := 1
        y := 1
        loop {
            if x &amp;gt; z {return false}
            if IS_PRIME(x) {
                if IS_PRIME(y) {
                    if IS_SUM(x, y, z) {
                        return true;
                    }
                }
            }
            y := y + 1
            if y &amp;gt; x {
                x := x + 1
                y := 0
            }
        }
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;Notice that without the &lt;code&gt;if x &amp;gt; z {return false}&lt;/code&gt;, the program would loop forever on &lt;code&gt;z=2&lt;/code&gt;. A TM that always halts for all inputs is called &lt;strong&gt;total&lt;/strong&gt;.&lt;/p&gt;
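    &lt;p&gt;One way to see totality more directly is to write the search with bounded loops. This is a Python sketch under my own assumptions (hypothetical names &lt;code&gt;is_prime&lt;/code&gt; and &lt;code&gt;is_sum_two_primes&lt;/code&gt;, non-negative integer inputs, naive trial division), not a translation of the pseudocode line by line. Because both loops range over finitely many values, it halts on every input:&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;# Sketch: a total decision procedure for "z is the sum of two primes".
    # Assumes z is a non-negative integer.
    
    def is_prime(n):
        if n in (0, 1):
            return False
        # Naive trial division; fine for a small illustration.
        return all(n % d != 0 for d in range(2, n))
    
    def is_sum_two_primes(z):
        for x in range(2, z + 1):
            for y in range(2, x + 1):
                if is_prime(x) and is_prime(y) and x + y == z:
                    return True
        # Search space exhausted: reject instead of looping forever.
        return False
    
    print([z for z in range(12) if is_sum_two_primes(z)])  # [4, 5, 6, 7, 8, 9, 10]
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;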
    &lt;p&gt;Property (2) also makes "Turing machines" a possible input to functions, meaning that we can now make decision problems about the behavior of Turing machines. For example, "does the TM &lt;code&gt;M&lt;/code&gt; either accept or reject &lt;code&gt;x&lt;/code&gt; within ten steps?"&lt;sup id="fnref:backticks"&gt;&lt;a class="footnote-ref" href="#fn:backticks"&gt;2&lt;/a&gt;&lt;/sup&gt;&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;IS_DONE_IN_TEN_STEPS(M, x) {
        for (i = 0; i &amp;lt; 10; i++) {
            `simulate M(x) for one step`
            if(`M accepted or rejected`) {
                return true
            }
        }
        return false
    }
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
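    &lt;p&gt;To make the backticked parts concrete, here is a runnable Python sketch. It does not simulate real Turing machines. Instead it assumes a stand-in encoding where a "machine" is a generator function, each &lt;code&gt;yield&lt;/code&gt; counts as one step, and finishing counts as accepting or rejecting. The example machines &lt;code&gt;loops_forever&lt;/code&gt; and &lt;code&gt;counts_down&lt;/code&gt; are made up for illustration.&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;# Sketch: machines are modeled as generator functions (an assumption for this
    # example). Each yield is one step; returning means accept or reject.
    
    def is_done_in_ten_steps(machine, x):
        run = machine(x)
        for _ in range(10):
            try:
                next(run)          # simulate one step
            except StopIteration:  # the machine accepted or rejected
                return True
        return False
    
    def loops_forever(x):
        while True:
            yield
    
    def counts_down(x):
        while x != 0:
            x -= 1
            yield
        return True
    
    print(is_done_in_ten_steps(loops_forever, 3))  # False
    print(is_done_in_ten_steps(counts_down, 3))    # True
    print(is_done_in_ten_steps(counts_down, 50))   # False
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;The checker itself always returns after at most ten simulated steps, even when the machine it is given never halts.&lt;/p&gt;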
    &lt;h3&gt;Decidability and Undecidability&lt;/h3&gt;
    &lt;p&gt;Now we have all of the pieces to understand our original definition:&lt;/p&gt;
    &lt;blockquote&gt;
    &lt;p&gt;A property P of strings is said to be decidable if ... there is a total Turing machine that accepts input strings that have property P and rejects those that do not. (220)&lt;/p&gt;
    &lt;/blockquote&gt;
    &lt;p&gt;Let &lt;code&gt;IS_P&lt;/code&gt; be the decision problem "Does the input satisfy P"? Then &lt;code&gt;IS_P&lt;/code&gt; is decidable if it can be solved by a Turing machine, ie, I can provide some &lt;code&gt;IS_P(x)&lt;/code&gt; machine that &lt;em&gt;always&lt;/em&gt; accepts if &lt;code&gt;x&lt;/code&gt; has property P, and always rejects if &lt;code&gt;x&lt;/code&gt; doesn't have property P. If I can't do that, then &lt;code&gt;IS_P&lt;/code&gt; is undecidable. &lt;/p&gt;
    &lt;p&gt;&lt;code&gt;IS_SUM(x, y, z)&lt;/code&gt; and &lt;code&gt;IS_DONE_IN_TEN_STEPS(M, x)&lt;/code&gt; are decidable properties. Is &lt;code&gt;IS_SUM_TWO_PRIMES(z)&lt;/code&gt; decidable? Some analysis shows that our corresponding program will either find a solution, or have &lt;code&gt;x&amp;gt;z&lt;/code&gt; and return false. So yes, it is decidable.&lt;/p&gt;
    &lt;p&gt;Notice there's an asymmetry here. To prove some property is decidable, I just need to find &lt;em&gt;one&lt;/em&gt; program that correctly solves it. To prove some property is undecidable, I need to show that any possible program, no matter what it is, doesn't solve it.&lt;/p&gt;
    &lt;p&gt;So with that asymmetry in mind, are there &lt;em&gt;any&lt;/em&gt; undecidable problems? Yes, quite a lot. Recall that Turing machines can accept encodings of other TMs as input, meaning we can write a TM that checks &lt;em&gt;properties of Turing machines&lt;/em&gt;. And, by &lt;a href="https://en.wikipedia.org/wiki/Rice%27s_theorem" target="_blank"&gt;Rice's Theorem&lt;/a&gt;, almost every nontrivial semantic&lt;sup id="fnref:nontrivial"&gt;&lt;a class="footnote-ref" href="#fn:nontrivial"&gt;3&lt;/a&gt;&lt;/sup&gt; property of Turing machines is undecidable. The conventional way to prove this is to first find a single undecidable property &lt;code&gt;H&lt;/code&gt;, and then use that to bootstrap undecidability of other properties.&lt;/p&gt;
    &lt;p&gt;The canonical and most famous example of an undecidable problem is the &lt;a href="https://en.wikipedia.org/wiki/Halting_problem" target="_blank"&gt;Halting problem&lt;/a&gt;: "does machine M halt on input i?" It's pretty easy to prove undecidable, and easy to use it to bootstrap other undecidability properties. But again, &lt;em&gt;any&lt;/em&gt; nontrivial semantic property is undecidable. Checking a TM is total is undecidable. Checking a TM accepts &lt;em&gt;any&lt;/em&gt; inputs is undecidable. Checking a TM solves &lt;code&gt;IS_SUM&lt;/code&gt; is undecidable. Etc etc etc.&lt;/p&gt;
    &lt;h3&gt;What this doesn't mean in practice&lt;/h3&gt;
    &lt;p&gt;I often see the halting problem misconstrued as "it's impossible to tell if a program will halt before running it." &lt;strong&gt;This is wrong&lt;/strong&gt;. The halting problem says that we cannot create an algorithm that, when applied to an arbitrary program, tells us whether the program will halt or not. It is absolutely possible to tell if many programs will halt or not. It's possible to find entire subcategories of programs that are guaranteed to halt. It's possible to say "a program constructed following constraints XYZ is guaranteed to halt." &lt;/p&gt;
    &lt;p&gt;The actual consequence of undecidability is more subtle. If we want to know if a program has property P, undecidability tells us&lt;/p&gt;
    &lt;ol&gt;
    &lt;li&gt;We will have to spend time and mental effort to determine if it has P&lt;/li&gt;
    &lt;li&gt;We may not be successful.&lt;/li&gt;
    &lt;/ol&gt;
    &lt;p&gt;This is subtle because we're so used to living in a world where everything's undecidable that we don't really consider what the counterfactual would be like. In such a world there might be no need for Rust, because "does this C program guarantee memory-safety" is a decidable property. The entire field of formal verification could be unnecessary, as we could just check properties of arbitrary programs directly. We could automatically check if a change in a program preserves all existing behavior. Lots of famous math problems could be solved overnight. &lt;/p&gt;
    &lt;p&gt;(This to me is a strong "intuitive" argument for why the halting problem is undecidable: a halt detector can be trivially repurposed as a program optimizer / theorem-prover / bcrypt cracker / chess engine. It's &lt;em&gt;too powerful&lt;/em&gt;, so we should expect it to be impossible.)&lt;/p&gt;
    &lt;p&gt;But because we don't live in that world, all of those things are hard problems that take effort and ingenuity to solve, and even then we often fail.&lt;/p&gt;
    &lt;h3&gt;Update for the Internet&lt;/h3&gt;
    &lt;p&gt;This was sent as a weekly newsletter, which is usually on topics like &lt;a href="https://buttondown.com/hillelwayne/archive/why-do-we-call-it-boilerplate-code" target="_blank"&gt;software history&lt;/a&gt;, &lt;a href="https://buttondown.com/hillelwayne/archive/the-seven-specification-ur-languages/" target="_blank"&gt;formal methods&lt;/a&gt;, &lt;a href="https://buttondown.com/hillelwayne/archive/i-formally-modeled-dreidel-for-no-good-reason/" target="_blank"&gt;unusual technologies&lt;/a&gt;, and the &lt;a href="https://buttondown.com/hillelwayne/archive/be-suspicious-of-success/" target="_blank"&gt;theory of software engineering&lt;/a&gt;. You &lt;a href="https://buttondown.email/hillelwayne/" target="_blank"&gt;can subscribe here&lt;/a&gt;.&lt;/p&gt;
    &lt;div class="footnote"&gt;
    &lt;hr/&gt;
    &lt;ol&gt;
    &lt;li id="fn:caveat"&gt;
    &lt;p&gt;To be pedantic, a TM can't do things like "scrape a webpage" or "render a bitmap", but we're only talking about computational decision problems here. &lt;a class="footnote-backref" href="#fnref:caveat" title="Jump back to footnote 1 in the text"&gt;↩&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id="fn:backticks"&gt;
    &lt;p&gt;One notation I've adopted in &lt;em&gt;Logic for Programmers&lt;/em&gt; is marking abstract sections of pseudocode with backticks. It's really handy! &lt;a class="footnote-backref" href="#fnref:backticks" title="Jump back to footnote 2 in the text"&gt;↩&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id="fn:nontrivial"&gt;
    &lt;p&gt;Nontrivial meaning "at least one TM has this property and at least one TM doesn't have this property". Semantic meaning "related to whether the TM accepts, rejects, or runs forever on a class of inputs". &lt;code&gt;IS_DONE_IN_TEN_STEPS&lt;/code&gt; is &lt;em&gt;not&lt;/em&gt; a semantic property, as it doesn't tell us anything about inputs that take longer than ten steps. &lt;a class="footnote-backref" href="#fnref:nontrivial" title="Jump back to footnote 3 in the text"&gt;↩&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;/ol&gt;
    &lt;/div&gt;
    </description><pubDate>Wed, 28 May 2025 19:34:02 +0000</pubDate><guid>https://buttondown.com/hillelwayne/archive/what-does-undecidable-mean-anyway/</guid></item><item><title>Finding hard 24 puzzles with planner programming</title><link>https://buttondown.com/hillelwayne/archive/finding-hard-24-puzzles-with-planner-programming/</link><description>
    &lt;p&gt;&lt;strong&gt;Planner programming&lt;/strong&gt; is a programming technique where you solve problems by providing a goal and actions, and letting the planner find actions that reach the goal. In a previous edition of &lt;a href="https://leanpub.com/logic/" target="_blank"&gt;Logic for Programmers&lt;/a&gt;, I demonstrated how this worked by solving the 
    &lt;a href="https://en.wikipedia.org/wiki/24_(puzzle)" target="_blank"&gt;24 puzzle&lt;/a&gt; with planning. For &lt;a href="https://buttondown.com/hillelwayne/archive/a48fce5b-8a05-4302-b620-9b26f057f145/" target="_blank"&gt;reasons discussed here&lt;/a&gt; I replaced that example with something more practical (orchestrating deployments), but left the &lt;a href="https://github.com/logicforprogrammers/book-assets/tree/master/code/chapter-misc" target="_blank"&gt;code online&lt;/a&gt; for posterity.&lt;/p&gt;
    &lt;p&gt;Recently I saw a family member try and fail to vibe code a tool that would find all valid 24 puzzles, and realized I could adapt the puzzle solver to also be a puzzle generator. First I'll explain the puzzle rules, then the original solver, then the generator.&lt;sup id="fnref:complex"&gt;&lt;a class="footnote-ref" href="#fn:complex"&gt;1&lt;/a&gt;&lt;/sup&gt; For a much longer intro to planning, see &lt;a href="https://www.hillelwayne.com/post/picat/" target="_blank"&gt;here&lt;/a&gt;.&lt;/p&gt;
    &lt;h3&gt;The rules of 24&lt;/h3&gt;
    &lt;p&gt;You're given four numbers and have to find some elementary equation (&lt;code&gt;+-*/&lt;/code&gt;+groupings) that uses all four numbers and results in 24. Each number must be used exactly once, but they do not need to be used in the starting puzzle order. Some examples:&lt;/p&gt;
    &lt;ul&gt;
    &lt;li&gt;&lt;code&gt;[6, 6, 6, 6]&lt;/code&gt; -&amp;gt; &lt;code&gt;6+6+6+6=24&lt;/code&gt;&lt;/li&gt;
    &lt;li&gt;&lt;code&gt;[1, 1, 6, 6]&lt;/code&gt; -&amp;gt; &lt;code&gt;(6+6)*(1+1)=24&lt;/code&gt;&lt;/li&gt;
    &lt;li&gt;&lt;code&gt;[4, 4, 4, 5]&lt;/code&gt; -&amp;gt; &lt;code&gt;4*(5+4/4)=24&lt;/code&gt;&lt;/li&gt;
    &lt;/ul&gt;
    &lt;p&gt;Some setups are impossible, like &lt;code&gt;[1, 1, 1, 1]&lt;/code&gt;. Others are possible only with non-elementary operations, like &lt;code&gt;[1, 5, 5, 324]&lt;/code&gt; (which requires exponentiation).&lt;/p&gt;
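    &lt;p&gt;As a quick way to check examples like these, here is a brute-force Python sketch. It is not the planner approach used in the rest of this post, and the name &lt;code&gt;solvable&lt;/code&gt; is made up. It repeatedly picks two numbers, combines them with one of the four operations, and recurses on the shorter list, using exact fractions so that division doesn't need a floating point tolerance.&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;# Sketch: brute-force check of whether four numbers can make 24 using + - * /.
    from fractions import Fraction
    from itertools import permutations
    
    def solvable(nums, target=24):
        def search(values):
            if len(values) == 1:
                return values[0] == target
            # Pick an ordered pair, combine it, and recurse on the shorter list.
            for i, j in permutations(range(len(values)), 2):
                x, y = values[i], values[j]
                rest = [v for k, v in enumerate(values) if k not in (i, j)]
                results = [x + y, x - y, x * y]
                if y != 0:
                    results.append(x / y)
                if any(search(rest + [r]) for r in results):
                    return True
            return False
        return search([Fraction(n) for n in nums])
    
    for puzzle in ([6, 6, 6, 6], [1, 1, 6, 6], [4, 4, 4, 5], [1, 1, 1, 1]):
        print(puzzle, solvable(puzzle))  # True, True, True, then False
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;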
    &lt;h2&gt;The solver&lt;/h2&gt;
    &lt;p&gt;We will use &lt;a href="http://picat-lang.org/" target="_blank"&gt;Picat&lt;/a&gt;, the only language that I know has a built-in planner module. The current state of our plan will be represented by a single list with all of the numbers.&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="s s-Atom"&gt;import&lt;/span&gt; &lt;span class="s s-Atom"&gt;planner&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;math&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;
    &lt;span class="s s-Atom"&gt;import&lt;/span&gt; &lt;span class="s s-Atom"&gt;cp&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;
    
    &lt;span class="nf"&gt;action&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;S0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;S1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;Action&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;Cost&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="s s-Atom"&gt;?=&amp;gt;&lt;/span&gt;
      &lt;span class="nf"&gt;member&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;X&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;S0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
      &lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;S0&lt;/span&gt; &lt;span class="s s-Atom"&gt;:=&lt;/span&gt; &lt;span class="nf"&gt;delete&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;S0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;X&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="c1"&gt;% , is `and`&lt;/span&gt;
      &lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nf"&gt;member&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;Y&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;S0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
      &lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;S0&lt;/span&gt; &lt;span class="s s-Atom"&gt;:=&lt;/span&gt; &lt;span class="nf"&gt;delete&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;S0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;Y&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
      &lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
          &lt;span class="nv"&gt;A&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="err"&gt;$&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;X&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="nv"&gt;Y&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; 
        &lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="nv"&gt;A&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="err"&gt;$&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;X&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="nv"&gt;Y&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="nv"&gt;A&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="err"&gt;$&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;X&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="nv"&gt;Y&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="nv"&gt;A&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="err"&gt;$&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;X&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="nv"&gt;Y&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="nv"&gt;Y&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;
        &lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;S1&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nv"&gt;S0&lt;/span&gt; &lt;span class="s s-Atom"&gt;++&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nf"&gt;apply&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;A&lt;/span&gt;&lt;span class="p"&gt;)]&lt;/span&gt;
      &lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;Action&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nv"&gt;A&lt;/span&gt;
      &lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;Cost&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;
      &lt;span class="p"&gt;.&lt;/span&gt;
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;This is our "action", and it works in three steps:&lt;/p&gt;
    &lt;ol&gt;
    &lt;li&gt;Nondeterministically pull two different values out of the input, deleting them&lt;/li&gt;
    &lt;li&gt;Nondeterministically pick one of the basic operations&lt;/li&gt;
    &lt;li&gt;The new state is the remaining elements, appended with that operation applied to our two picks.&lt;/li&gt;
    &lt;/ol&gt;
    &lt;p&gt;Let's walk through this with &lt;code&gt;[1, 6, 1, 7]&lt;/code&gt;. There are four choices for &lt;code&gt;X&lt;/code&gt; and three for &lt;code&gt;Y&lt;/code&gt;. If the planner chooses &lt;code&gt;X=6&lt;/code&gt; and &lt;code&gt;Y=7&lt;/code&gt;, &lt;code&gt;A = $(6 + 7)&lt;/code&gt;. This is an uncomputed term in the same way lisps might use quotation. We can resolve the computation with &lt;code&gt;apply&lt;/code&gt;, as in the line &lt;code&gt;S1 = S0 ++ [apply(A)]&lt;/code&gt;.&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="nf"&gt;final&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="nv"&gt;N&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt; &lt;span class="s s-Atom"&gt;=&amp;gt;&lt;/span&gt;
      &lt;span class="nv"&gt;N&lt;/span&gt; &lt;span class="o"&gt;=:=&lt;/span&gt; &lt;span class="mf"&gt;24.&lt;/span&gt; &lt;span class="c1"&gt;% handle floating point&lt;/span&gt;
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;Our final goal is just a list where the only element is 24. This has to be a little floating point-sensitive to handle floating point division, done by &lt;code&gt;=:=&lt;/code&gt;.&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="s s-Atom"&gt;main&lt;/span&gt; &lt;span class="s s-Atom"&gt;=&amp;gt;&lt;/span&gt;
      &lt;span class="nv"&gt;Start&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;6&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
      &lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nf"&gt;best_plan&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;Start&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;Plan&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
      &lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nf"&gt;printf&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;"%w %w%n"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;Start&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;Plan&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
      &lt;span class="p"&gt;.&lt;/span&gt;
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;For &lt;code&gt;main&lt;/code&gt;, we just find the best plan with the maximum cost of &lt;code&gt;4&lt;/code&gt; and print it. When run from the command line, &lt;code&gt;picat&lt;/code&gt; automatically executes whatever is in &lt;code&gt;main&lt;/code&gt;.&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;$ picat 24.pi
    [1,5,5,6] [1 + 5,5 * 6,30 - 6]
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;I don't want to spoil any more 24 puzzles, so let's stop showing the plan:&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;main =&amp;gt;
    &lt;span class="gd"&gt;- , printf("%w %w%n", Start, Plan)&lt;/span&gt;
    &lt;span class="gi"&gt;+ , printf("%w%n", Start)&lt;/span&gt;
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;h3&gt;Generating puzzles&lt;/h3&gt;
    &lt;p&gt;Picat provides a &lt;code&gt;find_all(X, p(X))&lt;/code&gt; function, which returns all &lt;code&gt;X&lt;/code&gt; for which &lt;code&gt;p(X)&lt;/code&gt; is true. In theory, we could write &lt;code&gt;find_all(S, best_plan(S, 4, _))&lt;/code&gt;. In practice, there are an infinite number of valid puzzles, so we need to bound S somewhat. We also don't want to find any redundant puzzles, such as &lt;code&gt;[6, 6, 6, 4]&lt;/code&gt; and &lt;code&gt;[4, 6, 6, 6]&lt;/code&gt;. &lt;/p&gt;
    &lt;p&gt;We can solve both issues by writing a helper &lt;code&gt;valid24(S)&lt;/code&gt;, which will check that &lt;code&gt;S&lt;/code&gt; is a sorted list of integers within some bounds, like &lt;code&gt;1..8&lt;/code&gt;, and also has a valid solution.&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="nf"&gt;valid24&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;Start&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="s s-Atom"&gt;=&amp;gt;&lt;/span&gt;
      &lt;span class="nv"&gt;Start&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;new_list&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
      &lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;Start&lt;/span&gt; &lt;span class="s s-Atom"&gt;::&lt;/span&gt; &lt;span class="mf"&gt;1..8&lt;/span&gt; &lt;span class="c1"&gt;% every value in 1..8&lt;/span&gt;
      &lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nf"&gt;increasing&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;Start&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="c1"&gt;% sorted ascending&lt;/span&gt;
      &lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nf"&gt;solve&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;Start&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="c1"&gt;% turn into values&lt;/span&gt;
      &lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nf"&gt;best_plan&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;Start&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;Plan&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
      &lt;span class="p"&gt;.&lt;/span&gt;
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;This leans on Picat's constraint solving features to automatically find bounded sorted lists, which is why we need the &lt;code&gt;solve&lt;/code&gt; step.&lt;sup id="fnref:efficiency"&gt;&lt;a class="footnote-ref" href="#fn:efficiency"&gt;2&lt;/a&gt;&lt;/sup&gt; Now we can just loop through all of the values in &lt;code&gt;find_all&lt;/code&gt; to get all solutions:&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="s s-Atom"&gt;main&lt;/span&gt; &lt;span class="s s-Atom"&gt;=&amp;gt;&lt;/span&gt;
      &lt;span class="nf"&gt;foreach&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="nv"&gt;S&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="s s-Atom"&gt;in&lt;/span&gt; &lt;span class="nf"&gt;find_all&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;Start&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
        &lt;span class="nf"&gt;valid24&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;Start&lt;/span&gt;&lt;span class="p"&gt;)))&lt;/span&gt;
        &lt;span class="nf"&gt;printf&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;"%w%n"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;S&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
      &lt;span class="s s-Atom"&gt;end&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;$ picat 24.pi
    
    [1,1,1,8]
    [1,1,2,6]
    [1,1,2,7]
    [1,1,2,8]
    # etc
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
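    &lt;p&gt;For readers who don't have Picat handy, here is a rough Python sketch of the same idea: enumerate sorted, bounded lists and keep the ones with at least one solution. It is a brute-force search and not the planner the Picat code uses. The names &lt;code&gt;reachable&lt;/code&gt;, &lt;code&gt;solvable&lt;/code&gt;, and &lt;code&gt;valid_puzzles&lt;/code&gt; are made up for this illustration, and it uses exact fractions as an assumption to sidestep floating point.&lt;/p&gt;

```python
from fractions import Fraction
from itertools import combinations_with_replacement


def reachable(nums):
    """Yield every value obtainable by repeatedly combining two numbers."""
    if len(nums) == 1:
        yield nums[0]
        return
    for i, x in enumerate(nums):
        for j, y in enumerate(nums):
            if i == j:
                continue
            # Everything except the two numbers being combined.
            rest = [n for k, n in enumerate(nums) if k not in (i, j)]
            combined = [x + y, x - y, x * y]
            if y != 0:  # skip division by zero
                combined.append(x / y)
            for value in combined:
                yield from reachable(rest + [value])


def solvable(puzzle, target=24):
    """True if some sequence of operations turns the puzzle into target."""
    start = [Fraction(n) for n in puzzle]
    return any(value == target for value in reachable(start))


def valid_puzzles(low, high, size=4):
    """Sorted, bounded puzzles that have at least one solution."""
    # combinations_with_replacement over a sorted range only produces
    # non-decreasing tuples, so [6, 6, 6, 4] never shows up next to [4, 6, 6, 6].
    candidates = combinations_with_replacement(range(low, high + 1), size)
    return [list(c) for c in candidates if solvable(c)]
```

    &lt;p&gt;Calling &lt;code&gt;valid_puzzles(1, 8)&lt;/code&gt; plays the role of the &lt;code&gt;find_all&lt;/code&gt; loop above. Generating the candidates already sorted does the job of &lt;code&gt;increasing&lt;/code&gt;.&lt;/p&gt;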
    &lt;h3&gt;Finding hard puzzles&lt;/h3&gt;
    &lt;p&gt;Last Friday I realized I could do something more interesting with this. Once I have found a plan, I can apply further constraints to the plan, for example to find problems that can be solved with division:&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;valid24(Start, Plan) =&amp;gt;
    &lt;span class="w"&gt; &lt;/span&gt; Start = new_list(4)
    &lt;span class="w"&gt; &lt;/span&gt; , Start :: 1..8
    &lt;span class="w"&gt; &lt;/span&gt; , increasing(Start)
    &lt;span class="w"&gt; &lt;/span&gt; , solve(Start)
    &lt;span class="w"&gt; &lt;/span&gt; , best_plan(Start, 4, Plan)
    &lt;span class="gi"&gt;+ , member($(_ / _), Plan)&lt;/span&gt;
    &lt;span class="w"&gt; &lt;/span&gt; .
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;In playing with this, though, I noticed something weird: there are some solutions that appear if I sort &lt;em&gt;up&lt;/em&gt; but not &lt;em&gt;down&lt;/em&gt;. For example, &lt;code&gt;[3,3,4,5]&lt;/code&gt; appears in the solution set, but &lt;code&gt;[5, 4, 3, 3]&lt;/code&gt; doesn't appear if I replace &lt;code&gt;increasing&lt;/code&gt; with &lt;code&gt;decreasing&lt;/code&gt;.&lt;/p&gt;
    &lt;p&gt;As far as I can tell, this is because Picat only finds one best plan, and &lt;code&gt;[5, 4, 3, 3]&lt;/code&gt; has &lt;em&gt;two&lt;/em&gt; solutions: &lt;code&gt;4*(5-3/3)&lt;/code&gt; and &lt;code&gt;3*(5+4)-3&lt;/code&gt;. &lt;code&gt;best_plan&lt;/code&gt; is a &lt;em&gt;deterministic&lt;/em&gt; operator, so Picat commits to the first best plan it finds. So if it finds &lt;code&gt;3*(5+4)-3&lt;/code&gt; first, it sees that the solution doesn't contain a division, throws &lt;code&gt;[5, 4, 3, 3]&lt;/code&gt; away as a candidate, and moves on to the next puzzle.&lt;/p&gt;
    &lt;p&gt;There's a couple ways we can fix this. We could replace &lt;code&gt;best_plan&lt;/code&gt; with &lt;code&gt;best_plan_nondet&lt;/code&gt;, which can backtrack to find new plans (at the cost of an enormous number of duplicates). Or we could modify our &lt;code&gt;final&lt;/code&gt; to only accept plans with a division: &lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;% Hypothetical change
    final([N]) =&amp;gt;
    &lt;span class="gi"&gt;+ member($(_ / _), current_plan()),&lt;/span&gt;
    &lt;span class="w"&gt; &lt;/span&gt; N =:= 24.
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;My favorite "fix" is to ask another question entirely. While I was looking for puzzles that can be solved with division, what I actually want is puzzles that &lt;em&gt;must&lt;/em&gt; be solved with division. What if I rejected any puzzle that has a solution &lt;em&gt;without&lt;/em&gt; division?&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="gi"&gt;+ plan_with_no_div(S, P) =&amp;gt; best_plan_nondet(S, 4, P), not member($(_ / _), P).&lt;/span&gt;
    
    valid24(Start, Plan) =&amp;gt;
    &lt;span class="w"&gt; &lt;/span&gt; Start = new_list(4)
    &lt;span class="w"&gt; &lt;/span&gt; , Start :: 1..8
    &lt;span class="w"&gt; &lt;/span&gt; , increasing(Start)
    &lt;span class="w"&gt; &lt;/span&gt; , solve(Start)
    &lt;span class="w"&gt; &lt;/span&gt; , best_plan(Start, 4, Plan)
    &lt;span class="gd"&gt;- , member($(_ / _), Plan)&lt;/span&gt;
    &lt;span class="gi"&gt;+ , not plan_with_no_div(Start, _)&lt;/span&gt;
    &lt;span class="w"&gt; &lt;/span&gt; .
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;The new line's a bit tricky. &lt;code&gt;plan_with_no_div&lt;/code&gt; nondeterministically finds a plan, and then fails if the plan contains a division.&lt;sup id="fnref:not"&gt;&lt;a class="footnote-ref" href="#fn:not"&gt;3&lt;/a&gt;&lt;/sup&gt; Since I used &lt;code&gt;best_plan_nondet&lt;/code&gt;, it can backtrack from there and find a new plan. This means &lt;code&gt;plan_with_no_div&lt;/code&gt; only fails if no such plan exists. And in &lt;code&gt;valid24&lt;/code&gt;, we only succeed if &lt;code&gt;plan_with_no_div&lt;/code&gt; fails, guaranteeing that the only existing plans use division. Since this doesn't depend on the plan found via &lt;code&gt;best_plan&lt;/code&gt;, it doesn't matter how the values in &lt;code&gt;Start&lt;/code&gt; are arranged: this will not miss any valid puzzles.&lt;/p&gt;
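    &lt;p&gt;The "succeed only if no division-free plan exists" logic can be sketched in Python too. This is an illustration under my own assumptions, not a translation of the Picat planner: &lt;code&gt;plans&lt;/code&gt; and &lt;code&gt;requires_division&lt;/code&gt; are hypothetical names, the search is brute force, and it uses exact fractions.&lt;/p&gt;

```python
from fractions import Fraction


def plans(nums, used_division=False):
    """Yield (final value, used_division) for every way of combining nums."""
    if len(nums) == 1:
        yield nums[0], used_division
        return
    for i, x in enumerate(nums):
        for j, y in enumerate(nums):
            if i == j:
                continue
            rest = [n for k, n in enumerate(nums) if k not in (i, j)]
            for value in (x + y, x - y, x * y):
                yield from plans(rest + [value], used_division)
            if y != 0:
                # Remember that this branch of the search used a division.
                yield from plans(rest + [x / y], True)


def requires_division(puzzle, target=24):
    """True if the puzzle is solvable and every solution uses division."""
    start = [Fraction(n) for n in puzzle]
    solutions = [div for value, div in plans(start) if value == target]
    # "There is no plan without division" is the same statement as
    # "all plans use division", which is what all() checks here.
    return bool(solutions) and all(solutions)
```

    &lt;p&gt;As in &lt;code&gt;valid24&lt;/code&gt;, the order of the numbers doesn't matter, because the check looks at every solution and not just the first one found.&lt;/p&gt;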
    &lt;h4&gt;Aside for my &lt;a href="https://leanpub.com/logic/" target="_blank"&gt;logic book readers&lt;/a&gt;&lt;/h4&gt;
    &lt;p&gt;The new clause is equivalent to &lt;code&gt;!(some p: Plan(p) &amp;amp;&amp;amp; !(div in p))&lt;/code&gt;. Applying the simplifications we learned:&lt;/p&gt;
    &lt;ol&gt;
    &lt;li&gt;&lt;code&gt;!(some p: Plan(p) &amp;amp;&amp;amp; !(div in p))&lt;/code&gt; (init)&lt;/li&gt;
    &lt;li&gt;&lt;code&gt;all p: !(Plan(p) &amp;amp;&amp;amp; !(div in p))&lt;/code&gt; (all/some duality)&lt;/li&gt;
    &lt;li&gt;&lt;code&gt;all p: !Plan(p) || (div in p)&lt;/code&gt; (De Morgan's law)&lt;/li&gt;
    &lt;li&gt;&lt;code&gt;all p: Plan(p) =&amp;gt; div in p&lt;/code&gt; (implication definition)&lt;/li&gt;
    &lt;/ol&gt;
    &lt;p&gt;Which more obviously means "if P is a valid plan, then it contains a division".&lt;/p&gt;
    &lt;h4&gt;Back to finding hard puzzles&lt;/h4&gt;
    &lt;p&gt;&lt;em&gt;Anyway&lt;/em&gt;, with &lt;code&gt;not plan_with_no_div&lt;/code&gt;, we are filtering puzzles on the set of possible solutions, not just specific solutions. And this gives me an idea: what if we find puzzles that have only one solution? &lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="gh"&gt;different_plan(S, P) =&amp;gt; best_plan_nondet(S, 4, P2), P2 != P.&lt;/span&gt;
    
    valid24(Start, Plan) =&amp;gt;
    &lt;span class="gi"&gt;+ , not different_plan(Start, Plan)&lt;/span&gt;
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;I tried this from &lt;code&gt;1..8&lt;/code&gt; and got:&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;[1,2,7,7]
    [1,3,4,6]
    [1,6,6,8]
    [3,3,8,8]
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;These happen to be some of the &lt;a href="https://www.4nums.com/game/difficulties/" target="_blank"&gt;hardest 24 puzzles known&lt;/a&gt;, though not all of them. Note this is assuming that &lt;code&gt;(X + Y)&lt;/code&gt; and &lt;code&gt;(Y + X)&lt;/code&gt; are &lt;em&gt;different&lt;/em&gt; solutions. If we say they're the same (by writing &lt;code&gt;A = $(X + Y), X &amp;lt;= Y&lt;/code&gt; in our action) then we get a lot more puzzles, many of which are considered "easy". Other "hard" things we can look for include plans that require fractions:&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="nf"&gt;plan_with_no_fractions&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;S&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;P&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="s s-Atom"&gt;=&amp;gt;&lt;/span&gt; 
      &lt;span class="nf"&gt;best_plan_nondet&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;S&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;P&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
      &lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="o"&gt;not&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="nf"&gt;member&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;X&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;P&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
        &lt;span class="nf"&gt;round&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;apply&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;X&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="s s-Atom"&gt;=\=&lt;/span&gt; &lt;span class="nv"&gt;X&lt;/span&gt;
      &lt;span class="p"&gt;).&lt;/span&gt;
    
    &lt;span class="c1"&gt;% insert `not plan...` in valid24 as usual&lt;/span&gt;
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;Finally, we could try seeing if a negative number is required:&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="nf"&gt;plan_with_no_negatives&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;S&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;P&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="s s-Atom"&gt;=&amp;gt;&lt;/span&gt; 
      &lt;span class="nf"&gt;best_plan_nondet&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;S&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;P&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
      &lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="o"&gt;not&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="nf"&gt;member&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;X&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;P&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
        &lt;span class="nf"&gt;apply&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;X&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;
      &lt;span class="p"&gt;).&lt;/span&gt;
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;Interestingly this one returns no solutions, so you are never required to construct a negative number as part of a standard 24 puzzle.&lt;/p&gt;
    &lt;div class="footnote"&gt;
    &lt;hr/&gt;
    &lt;ol&gt;
    &lt;li id="fn:complex"&gt;
    &lt;p&gt;The code below is different from the old book version, as it uses more fancy logic programming features that aren't good in learning material. &lt;a class="footnote-backref" href="#fnref:complex" title="Jump back to footnote 1 in the text"&gt;↩&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id="fn:efficiency"&gt;
    &lt;p&gt;&lt;code&gt;increasing&lt;/code&gt; is a constraint predicate. We could alternatively write &lt;code&gt;sorted&lt;/code&gt;, which is a Picat logical predicate and must be placed after &lt;code&gt;solve&lt;/code&gt;. There don't seem to be any efficiency gains either way. &lt;a class="footnote-backref" href="#fnref:efficiency" title="Jump back to footnote 2 in the text"&gt;↩&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id="fn:not"&gt;
    &lt;p&gt;I don't know what the standard is in Picat, but in Prolog, the convention is to use &lt;code&gt;\+&lt;/code&gt; instead of &lt;code&gt;not&lt;/code&gt;. They mean the same thing, so I'm using &lt;code&gt;not&lt;/code&gt; because it's clearer to non-LPers. &lt;a class="footnote-backref" href="#fnref:not" title="Jump back to footnote 3 in the text"&gt;↩&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;/ol&gt;
    &lt;/div&gt;
    </description><pubDate>Tue, 20 May 2025 18:21:01 +0000</pubDate><guid>https://buttondown.com/hillelwayne/archive/finding-hard-24-puzzles-with-planner-programming/</guid></item><item><title>Modeling Awkward Social Situations with TLA+</title><link>https://buttondown.com/hillelwayne/archive/modeling-awkward-social-situations-with-tla/</link><description>
    &lt;p&gt;You're walking down the street and need to pass someone going the opposite way. You take a step left, but they're thinking the same thing and take a step to their &lt;em&gt;right&lt;/em&gt;, aka your left. You're still blocking each other. Then you take a step to the right, and they take a step to their left, and you're back to where you started. I've heard this called "walkwarding".&lt;/p&gt;
    &lt;p&gt;Let's model this in &lt;a href="https://lamport.azurewebsites.net/tla/tla.html" target="_blank"&gt;TLA+&lt;/a&gt;. TLA+ is a &lt;strong&gt;formal methods&lt;/strong&gt; tool for finding bugs in complex software designs, most often involving concurrency. Two people trying to get past each other just also happens to be a concurrent system. A gentler introduction to TLA+'s capabilities is &lt;a href="https://www.hillelwayne.com/post/modeling-deployments/" target="_blank"&gt;here&lt;/a&gt;, an in-depth guide teaching the language is &lt;a href="https://learntla.com/" target="_blank"&gt;here&lt;/a&gt;.&lt;/p&gt;
    &lt;h2&gt;The spec&lt;/h2&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;---- MODULE walkward ----
    EXTENDS Integers
    
    VARIABLES pos
    vars == &amp;lt;&amp;lt;pos&amp;gt;&amp;gt;
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;Double equals defines a new operator, single equals is an equality check. &lt;code&gt;&amp;lt;&amp;lt;pos&amp;gt;&amp;gt;&lt;/code&gt; is a sequence, aka array.&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;you == "you"
    me == "me"
    People == {you, me}
    
    MaxPlace == 4
    
    left == 0
    right == 1
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;I've gotten into the habit of assigning string "symbols" to operators so that the compiler complains if I misspelled something. &lt;code&gt;left&lt;/code&gt; and &lt;code&gt;right&lt;/code&gt; are numbers so we can shift position with &lt;code&gt;right - pos&lt;/code&gt;.&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;direction == [you |-&amp;gt; 1, me |-&amp;gt; -1]
    goal == [you |-&amp;gt; MaxPlace, me |-&amp;gt; 1]
    
    Init ==
      \* left-right, forward-backward
      pos = [you |-&amp;gt; [lr |-&amp;gt; left, fb |-&amp;gt; 1], me |-&amp;gt; [lr |-&amp;gt; left, fb |-&amp;gt; MaxPlace]]
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;&lt;code&gt;direction&lt;/code&gt;, &lt;code&gt;goal&lt;/code&gt;, and &lt;code&gt;pos&lt;/code&gt; are "records", or hash tables with string keys. I can get my left-right position with &lt;code&gt;pos.me.lr&lt;/code&gt; or &lt;code&gt;pos["me"]["lr"]&lt;/code&gt; (or &lt;code&gt;pos[me].lr&lt;/code&gt;, as &lt;code&gt;me == "me"&lt;/code&gt;).&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;Juke(person) ==
      pos' = [pos EXCEPT ![person].lr = right - @]
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;TLA+ breaks the world into a sequence of steps. In each step, &lt;code&gt;pos&lt;/code&gt; is the value of &lt;code&gt;pos&lt;/code&gt; in the &lt;em&gt;current&lt;/em&gt; step and &lt;code&gt;pos'&lt;/code&gt; is the value in the &lt;em&gt;next&lt;/em&gt; step. The main outcome of this semantics is that we "assign" a new value to &lt;code&gt;pos&lt;/code&gt; by declaring &lt;code&gt;pos'&lt;/code&gt; equal to something. But the semantics also open up lots of cool tricks, like swapping two values with &lt;code&gt;x' = y /\ y' = x&lt;/code&gt;.&lt;/p&gt;
    &lt;p&gt;TLA+ is a little weird about updating functions. To set &lt;code&gt;f[x] = 3&lt;/code&gt;, you gotta write &lt;code&gt;f' = [f EXCEPT ![x] = 3]&lt;/code&gt;. To make things a little easier, the rhs of a function update can contain &lt;code&gt;@&lt;/code&gt; for the old value. &lt;code&gt;![me].lr = right - @&lt;/code&gt; is the same as &lt;code&gt;right - pos[me].lr&lt;/code&gt;, so it swaps left and right.&lt;/p&gt;
    &lt;p&gt;("Juke" comes from &lt;a href="https://www.merriam-webster.com/dictionary/juke" target="_blank"&gt;here&lt;/a&gt;)&lt;/p&gt;
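    &lt;p&gt;If the &lt;code&gt;EXCEPT&lt;/code&gt; and &lt;code&gt;@&lt;/code&gt; notation is unfamiliar, a loose Python analogy may help. This is only a sketch: TLA+ describes a relation between the current and next state, while the hypothetical &lt;code&gt;juke&lt;/code&gt; function below just builds a new dictionary and leaves the old one untouched.&lt;/p&gt;

```python
LEFT, RIGHT = 0, 1


def juke(pos, person):
    """Return the next-state pos with person's lr flipped, loosely like
    pos' = [pos EXCEPT ![person].lr = right - @]."""
    old = pos[person]["lr"]          # plays the role of @
    updated = {**pos[person], "lr": RIGHT - old}
    return {**pos, person: updated}  # everything else is unchanged


pos = {
    "you": {"lr": LEFT, "fb": 1},
    "me": {"lr": LEFT, "fb": 4},
}
next_pos = juke(pos, "me")
```

    &lt;p&gt;Here &lt;code&gt;pos&lt;/code&gt; stands in for the current state and &lt;code&gt;next_pos&lt;/code&gt; for &lt;code&gt;pos'&lt;/code&gt;. Juking twice gets you back to where you started, since &lt;code&gt;right - (right - x)&lt;/code&gt; is &lt;code&gt;x&lt;/code&gt;.&lt;/p&gt;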
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;Move(person) ==
      LET new_pos == [pos[person] EXCEPT !.fb = @ + direction[person]]
      IN
        /\ pos[person].fb # goal[person]
        /\ \A p \in People: pos[p] # new_pos
        /\ pos' = [pos EXCEPT ![person] = new_pos]
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;The &lt;code&gt;EXCEPT&lt;/code&gt; syntax can be used in regular definitions, too. This lets someone move one step in their goal direction &lt;em&gt;unless&lt;/em&gt; they are at the goal &lt;em&gt;or&lt;/em&gt; someone is already in that space. &lt;code&gt;/\&lt;/code&gt; means "and".&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;Next ==
      \E p \in People:
        \/ Move(p)
        \/ Juke(p)
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;I really like how TLA+ represents concurrency: "In each step, there is a person who either moves or jukes." It can take a few uses to really wrap your head around but it can express extraordinarily complicated distributed systems.&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;Spec == Init /\ [][Next]_vars
    
    Liveness == &amp;lt;&amp;gt;(pos[me].fb = goal[me])
    ====
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;&lt;code&gt;Spec&lt;/code&gt; is our specification: we start at &lt;code&gt;Init&lt;/code&gt; and take a &lt;code&gt;Next&lt;/code&gt; step every step.&lt;/p&gt;
    &lt;p&gt;Liveness is the generic term for "something good is guaranteed to happen", see &lt;a href="https://www.hillelwayne.com/post/safety-and-liveness/" target="_blank"&gt;here&lt;/a&gt; for more.  &lt;code&gt;&amp;lt;&amp;gt;&lt;/code&gt; means "eventually", so &lt;code&gt;Liveness&lt;/code&gt; means "eventually my forward-backward position will be my goal". I could extend it to "both of us eventually reach our goal" but I think this is good enough for a demo.&lt;/p&gt;
    &lt;h3&gt;Checking the spec&lt;/h3&gt;
    &lt;p&gt;Four years ago, everybody in TLA+ used the &lt;a href="https://lamport.azurewebsites.net/tla/toolbox.html" target="_blank"&gt;toolbox&lt;/a&gt;. Now the community has collectively shifted over to using the &lt;a href="https://github.com/tlaplus/vscode-tlaplus/" target="_blank"&gt;VSCode extension&lt;/a&gt;.&lt;sup id="fnref:ltla"&gt;&lt;a class="footnote-ref" href="#fn:ltla"&gt;1&lt;/a&gt;&lt;/sup&gt; VSCode requires we write a configuration file, which I will call &lt;code&gt;walkward.cfg&lt;/code&gt;.&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;SPECIFICATION Spec
    PROPERTY Liveness
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;I then check the model with the VSCode command &lt;code&gt;TLA+: Check model with TLC&lt;/code&gt;. Unsurprisingly, it finds an error:&lt;/p&gt;
    &lt;p&gt;&lt;img alt="Screenshot 2025-05-12 153537.png" class="newsletter-image" src="https://assets.buttondown.email/images/af6f9e89-0bc6-4705-b293-4da5f5c16cfe.png?w=960&amp;amp;fit=max"/&gt;&lt;/p&gt;
    &lt;p&gt;The reason it fails is "stuttering": I can get one step away from my goal and then just stop moving forever. We say the spec is &lt;a href="https://www.hillelwayne.com/post/fairness/" target="_blank"&gt;unfair&lt;/a&gt;: it does not guarantee that if progress is always possible, progress will be made. If I want the spec to always make progress, I have to make some of the steps &lt;strong&gt;weakly fair&lt;/strong&gt;.&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="gi"&gt;+ Fairness == WF_vars(Next)&lt;/span&gt;
    
    &lt;span class="gd"&gt;- Spec == Init /\ [][Next]_vars&lt;/span&gt;
    &lt;span class="gi"&gt;+ Spec == Init /\ [][Next]_vars /\ Fairness&lt;/span&gt;
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;Now the spec is weakly fair, so someone will always do &lt;em&gt;something&lt;/em&gt;. New error:&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;\* First six steps cut
    7: &amp;lt;Move("me")&amp;gt;
    pos = [you |-&amp;gt; [lr |-&amp;gt; 0, fb |-&amp;gt; 4], me |-&amp;gt; [lr |-&amp;gt; 1, fb |-&amp;gt; 2]]
    8: &amp;lt;Juke("me")&amp;gt;
    pos = [you |-&amp;gt; [lr |-&amp;gt; 0, fb |-&amp;gt; 4], me |-&amp;gt; [lr |-&amp;gt; 0, fb |-&amp;gt; 2]]
    9: &amp;lt;Juke("me")&amp;gt; (back to state 7)
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;In this failure, I've successfully gotten past you, and then spend the rest of my life endlessly juking back and forth. The &lt;code&gt;Next&lt;/code&gt; step keeps happening, so weak fairness is satisfied. What I actually want is for my &lt;code&gt;Move&lt;/code&gt; and my &lt;code&gt;Juke&lt;/code&gt; to each be weakly fair, independently of each other.&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="gd"&gt;- Fairness == WF_vars(Next)&lt;/span&gt;
    &lt;span class="gi"&gt;+ Fairness == WF_vars(Move(me)) /\ WF_vars(Juke(me))&lt;/span&gt;
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;If my liveness property also specified that &lt;em&gt;you&lt;/em&gt; reached your goal, I could instead write &lt;code&gt;\A p \in People: WF_vars(Move(p)) etc&lt;/code&gt;. I could also swap the &lt;code&gt;\A&lt;/code&gt; with a &lt;code&gt;\E&lt;/code&gt; to mean at least one of us is guaranteed to have fair actions, but not necessarily both of us. &lt;/p&gt;
    &lt;p&gt;New error:&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;3: &amp;lt;Move("me")&amp;gt;
    pos = [you |-&amp;gt; [lr |-&amp;gt; 0, fb |-&amp;gt; 2], me |-&amp;gt; [lr |-&amp;gt; 0, fb |-&amp;gt; 3]]
    4: &amp;lt;Juke("you")&amp;gt;
    pos = [you |-&amp;gt; [lr |-&amp;gt; 1, fb |-&amp;gt; 2], me |-&amp;gt; [lr |-&amp;gt; 0, fb |-&amp;gt; 3]]
    5: &amp;lt;Juke("me")&amp;gt;
    pos = [you |-&amp;gt; [lr |-&amp;gt; 1, fb |-&amp;gt; 2], me |-&amp;gt; [lr |-&amp;gt; 1, fb |-&amp;gt; 3]]
    6: &amp;lt;Juke("me")&amp;gt;
    pos = [you |-&amp;gt; [lr |-&amp;gt; 1, fb |-&amp;gt; 2], me |-&amp;gt; [lr |-&amp;gt; 0, fb |-&amp;gt; 3]]
    7: &amp;lt;Juke("you")&amp;gt; (back to state 3)
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;Now we're getting somewhere! This is the original walkwarding situation we wanted to capture. We're in each other's way, then you juke, but before either of us can move I juke too, then we both juke back. We can repeat this forever, trapped in a social hell.&lt;/p&gt;
    &lt;p&gt;Wait, but doesn't &lt;code&gt;WF(Move(me))&lt;/code&gt; guarantee I will eventually move? Yes, but &lt;em&gt;only if a move is permanently available&lt;/em&gt;. In this case, it's not permanently available, because every couple of steps it's made temporarily unavailable.&lt;/p&gt;
    &lt;p&gt;How do I fix this? I can't add a rule saying that we only juke if we're blocked, because the whole point of walkwarding is that we're not coordinated. In the real world, walkwarding can go on for agonizing seconds. What I can do instead is say that Liveness holds &lt;em&gt;as long as &lt;code&gt;Move&lt;/code&gt; is strongly fair&lt;/em&gt;. Unlike weak fairness, &lt;a href="https://www.hillelwayne.com/post/fairness/#strong-fairness" target="_blank"&gt;strong fairness&lt;/a&gt; guarantees something happens if it keeps becoming possible, even with interruptions. &lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;Liveness == 
    &lt;span class="gi"&gt;+  SF_vars(Move(me)) =&amp;gt; &lt;/span&gt;
    &lt;span class="w"&gt; &lt;/span&gt;   &amp;lt;&amp;gt;(pos[me].fb = goal[me])
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;This makes the spec pass. Even if we weave back and forth for five minutes, as long as we eventually pass each other, I will reach my goal. Note we could also do this by making &lt;code&gt;Move&lt;/code&gt; in &lt;code&gt;Fairness&lt;/code&gt; strongly fair, which is preferable if we have a lot of different liveness properties to check.&lt;/p&gt;
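    &lt;p&gt;To see why weak fairness wasn't enough, it can help to check by hand where &lt;code&gt;Move(me)&lt;/code&gt; is enabled in the repeating part of the last error trace. The Python below is a sketch of that check and not how TLC works internally. The tuple representation and the names &lt;code&gt;move_enabled&lt;/code&gt;, &lt;code&gt;DIRECTION&lt;/code&gt;, and &lt;code&gt;GOAL&lt;/code&gt; are my own, chosen to mirror the spec.&lt;/p&gt;

```python
DIRECTION = {"you": 1, "me": -1}
GOAL = {"you": 4, "me": 1}


def move_enabled(pos, person):
    """Mirror of Move's preconditions: not at the goal, target square free."""
    lr, fb = pos[person]
    target = (lr, fb + DIRECTION[person])
    return fb != GOAL[person] and target not in pos.values()


# The repeating part of the error trace, mapping each person to (lr, fb).
cycle = [
    {"you": (0, 2), "me": (0, 3)},  # state 3
    {"you": (1, 2), "me": (0, 3)},  # state 4
    {"you": (1, 2), "me": (1, 3)},  # state 5
    {"you": (1, 2), "me": (0, 3)},  # state 6
]

enabled = [move_enabled(pos, "me") for pos in cycle]

# Weak fairness only forces Move(me) if it stays enabled from some point on.
always_enabled = all(enabled)
# Strong fairness forces Move(me) if it keeps becoming enabled again.
repeatedly_enabled = any(enabled)
```

    &lt;p&gt;In this loop &lt;code&gt;always_enabled&lt;/code&gt; is false and &lt;code&gt;repeatedly_enabled&lt;/code&gt; is true. So the loop is allowed under weak fairness of &lt;code&gt;Move(me)&lt;/code&gt; but ruled out under strong fairness.&lt;/p&gt;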
    &lt;h3&gt;A small exercise for the reader&lt;/h3&gt;
    &lt;p&gt;There is a presumed invariant that is violated. Identify what it is, write it as a property in TLA+, and show the spec violates it. Then fix it.&lt;/p&gt;
    &lt;p&gt;Answer (in &lt;a href="https://rot13.com/" target="_blank"&gt;rot13&lt;/a&gt;): Gur vainevnag vf "ab gjb crbcyr ner va gur rknpg fnzr ybpngvba". &lt;code&gt;Zbir&lt;/code&gt; thnenagrrf guvf ohg &lt;code&gt;Whxr&lt;/code&gt; &lt;em&gt;qbrf abg&lt;/em&gt;.&lt;/p&gt;
    &lt;h3&gt;More TLA+ Exercises&lt;/h3&gt;
    &lt;p&gt;I've started work on &lt;a href="https://github.com/hwayne/tlaplus-exercises/" target="_blank"&gt;an exercises repo&lt;/a&gt;. There's only a handful of specific problems now but I'm planning on adding more over the summer.&lt;/p&gt;
    &lt;div class="footnote"&gt;
    &lt;hr/&gt;
    &lt;ol&gt;
    &lt;li id="fn:ltla"&gt;
    &lt;p&gt;&lt;a href="https://learntla.com/" target="_blank"&gt;learntla&lt;/a&gt; is still on the toolbox, but I'm hoping to get it all moved over this summer. &lt;a class="footnote-backref" href="#fnref:ltla" title="Jump back to footnote 1 in the text"&gt;↩&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;/ol&gt;
    &lt;/div&gt;
    </description><pubDate>Wed, 14 May 2025 16:02:21 +0000</pubDate><guid>https://buttondown.com/hillelwayne/archive/modeling-awkward-social-situations-with-tla/</guid></item><item><title>Write the most clever code you possibly can</title><link>https://buttondown.com/hillelwayne/archive/write-the-most-clever-code-you-possibly-can/</link><description>
    &lt;p&gt;&lt;em&gt;I started writing this early last week but Real Life Stuff happened and now you're getting the first-draft late this week. Warning, unedited thoughts ahead!&lt;/em&gt;&lt;/p&gt;
    &lt;h2&gt;New Logic for Programmers release!&lt;/h2&gt;
    &lt;p&gt;&lt;a href="https://leanpub.com/logic/" target="_blank"&gt;v0.9 is out&lt;/a&gt;! This is a big release, with a new cover design, several rewritten chapters, &lt;a href="https://github.com/logicforprogrammers/book-assets/tree/master/code" target="_blank"&gt;online code samples&lt;/a&gt; and much more. See the full release notes at the &lt;a href="https://github.com/logicforprogrammers/book-assets/blob/master/CHANGELOG.md" target="_blank"&gt;changelog page&lt;/a&gt;, and &lt;a href="https://leanpub.com/logic/" target="_blank"&gt;get the book here&lt;/a&gt;!&lt;/p&gt;
    &lt;p&gt;&lt;img alt="The new cover! It's a lot nicer" class="newsletter-image" src="https://assets.buttondown.email/images/038a7092-5dc7-41a5-9a16-56bdef8b5d58.jpg?w=400&amp;amp;fit=max"/&gt;&lt;/p&gt;
    &lt;h2&gt;Write the cleverest code you possibly can&lt;/h2&gt;
    &lt;p&gt;There are millions of articles online about how programmers should not write "clever" code, and instead write simple, maintainable code that everybody understands. Sometimes the example of "clever" code looks like this (&lt;a href="https://codegolf.stackexchange.com/questions/57617/is-this-number-a-prime/57682#57682" target="_blank"&gt;src&lt;/a&gt;):&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="c1"&gt;# Python&lt;/span&gt;
    
    &lt;span class="n"&gt;p&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;n&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;
    &lt;span class="n"&gt;exec&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;"p*=n*n;n+=1;"&lt;/span&gt;&lt;span class="o"&gt;*~-&lt;/span&gt;&lt;span class="nb"&gt;int&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;input&lt;/span&gt;&lt;span class="p"&gt;()))&lt;/span&gt;
    &lt;span class="nb"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;p&lt;/span&gt;&lt;span class="o"&gt;%&lt;/span&gt;&lt;span class="n"&gt;n&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;This is code-golfing, the sport of writing the most concise code possible. Obviously you shouldn't run this in production for the same reason you shouldn't eat dinner off a Rembrandt. &lt;/p&gt;
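    &lt;p&gt;For readers who want to see what the golfed snippet is doing, here is a rough ungolfed sketch. The function name &lt;code&gt;is_prime_wilson&lt;/code&gt; is made up for illustration, and it takes the number as an argument instead of reading it from input. The &lt;code&gt;~-x&lt;/code&gt; trick is just &lt;code&gt;x - 1&lt;/code&gt;, so the loop builds the square of (x-1)! and the final remainder relies on Wilson's theorem.&lt;/p&gt;

```python
def is_prime_wilson(x: int) -> int:
    """Ungolfed sketch of the snippet above: returns 1 if x is prime, else 0."""
    p = 1
    n = 1
    # "~-x" in the golfed code is x - 1, so the body repeats x - 1 times.
    for _ in range(x - 1):
        p *= n * n  # after the loop, p is ((x - 1)!) squared
        n += 1      # after the loop, n equals x
    # Wilson's theorem: (x - 1)! is -1 mod x exactly when x is prime,
    # so the squared factorial leaves remainder 1 for primes.
    return p % n
```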
    &lt;p&gt;Other times the example looks like this:&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="k"&gt;def&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nf"&gt;is_prime&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;x&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="kc"&gt;False&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nb"&gt;all&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="o"&gt;%&lt;/span&gt;&lt;span class="n"&gt;n&lt;/span&gt; &lt;span class="o"&gt;!=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;n&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;)])&lt;/span&gt;
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;This is "clever" because it uses a single list comprehension, as opposed to a "simple" for loop. Yes, "list comprehensions are too clever" is something I've read in one of these articles. &lt;/p&gt;
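    &lt;p&gt;For comparison, here is roughly what the "simple" for-loop version would look like. This is my own sketch, not taken from any of those articles, and &lt;code&gt;is_prime_loop&lt;/code&gt; is a made-up name. The two behave the same on positive integers; the loop just stops at the first divisor it finds, while &lt;code&gt;all([...])&lt;/code&gt; builds the whole list first.&lt;/p&gt;

```python
def is_prime(x):
    # The comprehension version from above, repeated for comparison.
    if x == 1:
        return False
    return all([x % n != 0 for n in range(2, x)])


def is_prime_loop(x):
    # The same check written as an explicit loop.
    if x == 1:
        return False
    for n in range(2, x):
        if x % n == 0:
            return False  # found a divisor, stop early
    return True
```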
    &lt;p&gt;I've also talked to people who think that datatypes besides lists and hashmaps are too clever to use, that most optimizations are too clever to bother with, and even that functions and classes are too clever and code should be a linear script.&lt;sup id="fnref:grad-students"&gt;&lt;a class="footnote-ref" href="#fn:grad-students"&gt;1&lt;/a&gt;&lt;/sup&gt; Clever code is anything using features or domain concepts we don't understand. Something that seems unbearably clever to me might be utterly mundane for you, and vice versa. &lt;/p&gt;
    &lt;p&gt;How do we make something utterly mundane? By using it and working at the boundaries of our skills. Almost everything I'm "good at" comes from banging my head against it more than is healthy. That suggests a really good reason to write clever code: it's an excellent form of purposeful practice. Writing clever code forces us to code outside of our comfort zone, developing our skills as software engineers. &lt;/p&gt;
    &lt;blockquote&gt;
    &lt;p&gt;Debugging is twice as hard as writing the code in the first place. Therefore, if you write the code as cleverly as possible, you [will get excellent debugging practice at exactly the right level required to push your skills as a software engineer] — Brian Kernighan, probably&lt;/p&gt;
    &lt;/blockquote&gt;
    &lt;p&gt;There are other benefits, too, but first let's kill the elephant in the room:&lt;sup id="fnref:bajillion"&gt;&lt;a class="footnote-ref" href="#fn:bajillion"&gt;2&lt;/a&gt;&lt;/sup&gt;&lt;/p&gt;
    &lt;h3&gt;Don't &lt;em&gt;commit&lt;/em&gt; clever code&lt;/h3&gt;
    &lt;p&gt;I am proposing writing clever code as a means of practice. Work is a &lt;em&gt;job&lt;/em&gt;, with coworkers who will not appreciate it if your code is too clever. Similarly, don't use &lt;a href="https://mcfunley.com/choose-boring-technology" target="_blank"&gt;too many innovative technologies&lt;/a&gt;. Don't put anything in production you are &lt;em&gt;uncomfortable&lt;/em&gt; with.&lt;/p&gt;
    &lt;p&gt;We can still responsibly write clever code at work, though: &lt;/p&gt;
    &lt;ol&gt;
    &lt;li&gt;Solve a problem in both a simple and a clever way, and then only commit the simple way. This works well for small scale problems where trying the "clever way" only takes a few minutes.&lt;/li&gt;
    &lt;li&gt;Write our &lt;em&gt;personal&lt;/em&gt; tools cleverly. I'm a big believer in the idea that most programmers would benefit from writing more scripts and support code customized to their particular work environment. This is a great place to practice new techniques, languages, etc.&lt;/li&gt;
    &lt;li&gt;If clever code is absolutely the best way to solve a problem, then commit it with &lt;strong&gt;extensive documentation&lt;/strong&gt; explaining how it works and why it's preferable to simpler solutions. Bonus: this potentially helps the whole team upskill.&lt;/li&gt;
    &lt;/ol&gt;
    &lt;h2&gt;Writing clever code...&lt;/h2&gt;
    &lt;div class="subscribe-form"&gt;&lt;/div&gt;
    &lt;h3&gt;...teaches simple solutions&lt;/h3&gt;
    &lt;p&gt;Usually, code that's called too clever composes several powerful features together — the "not a single list comprehension or function" people are the exception. &lt;a href="https://www.joshwcomeau.com/career/clever-code-considered-harmful/" target="_blank"&gt;Josh Comeau's&lt;/a&gt; "don't write clever code" article gives this example of "too clever":&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="kd"&gt;const&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nx"&gt;extractDataFromResponse&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;response&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;=&amp;gt;&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="kd"&gt;const&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nx"&gt;Component&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nx"&gt;props&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nx"&gt;response&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    
    &lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="kd"&gt;const&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nx"&gt;resultsEntries&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nb"&gt;Object&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;entries&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nx"&gt;Component&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nx"&gt;props&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;});&lt;/span&gt;
    &lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="kd"&gt;const&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nx"&gt;assignIfValueTruthy&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;o&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nx"&gt;k&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nx"&gt;v&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;=&amp;gt;&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;v&lt;/span&gt;
    &lt;span class="w"&gt;    &lt;/span&gt;&lt;span class="o"&gt;?&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;...&lt;/span&gt;&lt;span class="nx"&gt;o&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nx"&gt;k&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nx"&gt;v&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="w"&gt;    &lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nx"&gt;o&lt;/span&gt;
    &lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    
    &lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="k"&gt;return&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nx"&gt;resultsEntries&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;reduce&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;assignIfValueTruthy&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{});&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;What makes this "clever"? I count seven language features composed together: &lt;code&gt;entries&lt;/code&gt;, argument unpacking, implicit objects, splats, ternaries, higher-order functions, and reductions. Would code that used only one or two of these features still be "clever"? I don't think so. These features exist for a reason, and oftentimes they make code simpler than not using them.&lt;/p&gt;
    &lt;p&gt;We can, of course, learn these features one at a time. Writing the clever version (but not &lt;em&gt;committing it&lt;/em&gt;) gives us practice with all seven at once and also with how they compose together. That knowledge comes in handy when we want to apply a single one of the ideas.&lt;/p&gt;
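    &lt;p&gt;To make "only one or two of these features" concrete, here is a loose Python analogue of the example. It is a sketch, not a translation of Josh's code: the snake_case name is mine, and I'm assuming the response is a two-element pair. It leans on just unpacking and one dict comprehension.&lt;/p&gt;

```python
def extract_data_from_response(response):
    """Keep only the truthy fields of a (Component, props) pair."""
    component, props = response  # plain tuple unpacking
    fields = {"Component": component, "props": props}
    # One comprehension replaces the reduce + spread + ternary combination.
    return {key: value for key, value in fields.items() if value}
```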
    &lt;p&gt;I've recently had to do a bit of pandas for a project. Whenever I have to do a new analysis, I try to write it as a single chain of transformations, and then as a more balanced set of updates.&lt;/p&gt;
    &lt;h3&gt;...helps us master concepts&lt;/h3&gt;
    &lt;p&gt;Even if the composite parts of a "clever" solution aren't by themselves useful, it still makes us better at the overall language, and that's inherently valuable. A few years ago I wrote &lt;a href="https://www.hillelwayne.com/post/python-abc/" target="_blank"&gt;Crimes with Python's Pattern Matching&lt;/a&gt;. It involves writing horrible code like this:&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nn"&gt;abc&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;ABC&lt;/span&gt;
    
    &lt;span class="k"&gt;class&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nc"&gt;NotIterable&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ABC&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    
        &lt;span class="nd"&gt;@classmethod&lt;/span&gt;
        &lt;span class="k"&gt;def&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nf"&gt;__subclasshook__&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="bp"&gt;cls&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;C&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
            &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="ow"&gt;not&lt;/span&gt; &lt;span class="nb"&gt;hasattr&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;C&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;"__iter__"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    
    &lt;span class="k"&gt;def&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nf"&gt;f&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="k"&gt;match&lt;/span&gt; &lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="k"&gt;case&lt;/span&gt; &lt;span class="n"&gt;NotIterable&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
                &lt;span class="nb"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s2"&gt; is not iterable"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="k"&gt;case&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="k"&gt;_&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
                &lt;span class="nb"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s2"&gt; is iterable"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="vm"&gt;__name__&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="s2"&gt;"__main__"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;"string"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;This composes Python match statements, which are broadly useful, and abstract base classes, which are incredibly niche. But even if I never use ABCs in real production code, it helped me understand Python's match semantics and &lt;a href="https://docs.python.org/3/howto/mro.html#python-2-3-mro" target="_blank"&gt;Method Resolution Order&lt;/a&gt; better. &lt;/p&gt;
    &lt;h3&gt;...prepares us for necessity&lt;/h3&gt;
    &lt;p&gt;Sometimes the clever way is the &lt;em&gt;only&lt;/em&gt; way. Maybe we need something faster than the simplest solution. Maybe we are working with constrained tools or frameworks that demand cleverness. Peter Norvig argued that design patterns compensate for missing language features. I'd argue that cleverness is another means of compensating: if our tools don't have an easy way to do something, we need to find a clever way.&lt;/p&gt;
    &lt;p&gt;You see this a lot in formal methods like TLA+. Need to check a hyperproperty? &lt;a href="https://www.hillelwayne.com/post/graphing-tla/" target="_blank"&gt;Cast your state space to a directed graph&lt;/a&gt;. Need to compose ten specifications together? &lt;a href="https://www.hillelwayne.com/post/composing-tla/" target="_blank"&gt;Combine refinements with state machines&lt;/a&gt;. Most difficult problems have a "clever" solution. The real problem is that clever solutions have a skill floor. If normal use of the tool is at difficulty 3 out of 10, then basic clever solutions are at 5 out of 10, and it's hard to jump those two steps in the moment you need the cleverness.&lt;/p&gt;
    &lt;p&gt;But if you've practiced with writing overly clever code, you're used to working at a 7 out of 10 level in short bursts, and then you can "drop down" to 5/10. I don't know if that makes too much sense, but I see it happen a lot in practice.&lt;/p&gt;
    &lt;h3&gt;...builds comradery&lt;/h3&gt;
    &lt;p&gt;On a few occasions, after getting a pull request merged, I pulled the reviewer over and said "check out this horrible way of doing the same thing". I find that as long as people know they're not going to be subjected to a clever solution in production, they enjoy seeing it!&lt;/p&gt;
    &lt;p&gt;&lt;em&gt;Next week's newsletter will probably also be late, after that we should be back to a regular schedule for the rest of the summer.&lt;/em&gt;&lt;/p&gt;
    &lt;div class="footnote"&gt;
    &lt;hr/&gt;
    &lt;ol&gt;
    &lt;li id="fn:grad-students"&gt;
    &lt;p&gt;Mostly grad students outside of CS who have to write scripts to do research. And more than one data scientist. I think it's correlated with using Jupyter. &lt;a class="footnote-backref" href="#fnref:grad-students" title="Jump back to footnote 1 in the text"&gt;↩&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id="fn:bajillion"&gt;
    &lt;p&gt;If I don't put this at the beginning, I'll get a bajillion responses like "your team will hate you" &lt;a class="footnote-backref" href="#fnref:bajillion" title="Jump back to footnote 2 in the text"&gt;↩&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;/ol&gt;
    &lt;/div&gt;
    </description><pubDate>Thu, 08 May 2025 15:04:42 +0000</pubDate><guid>https://buttondown.com/hillelwayne/archive/write-the-most-clever-code-you-possibly-can/</guid></item><item><title>Requirements change until they don't</title><link>https://buttondown.com/hillelwayne/archive/requirements-change-until-they-dont/</link><description>
    &lt;p&gt;Recently I got a question on formal methods&lt;sup id="fnref:fs"&gt;&lt;a class="footnote-ref" href="#fn:fs"&gt;1&lt;/a&gt;&lt;/sup&gt;: how does it help to mathematically model systems when the system requirements are constantly changing? It doesn't make sense to spend a lot of time proving a design works, and then deliver the product and find out it's not at all what the client needs. As the saying goes, the hard part is "building the right thing", not "building the thing right".&lt;/p&gt;
    &lt;p&gt;One possible response: "why write tests"? You shouldn't write tests, &lt;em&gt;especially&lt;/em&gt; &lt;a href="https://en.wikipedia.org/wiki/Test-driven_development" target="_blank"&gt;lots of unit tests ahead of time&lt;/a&gt;, if you might just throw them all away when the requirements change.&lt;/p&gt;
    &lt;p&gt;This is a bad response because we all know the difference between writing tests and formal methods: testing is &lt;em&gt;easy&lt;/em&gt; and FM is &lt;em&gt;hard&lt;/em&gt;. Testing requires low cost for moderate correctness, FM requires high(ish) cost for high correctness. And when requirements are constantly changing, "high(ish) cost" isn't affordable and "high correctness" isn't worthwhile, because a kinda-okay solution that solves a customer's problem is infinitely better than a solid solution that doesn't.&lt;/p&gt;
    &lt;p&gt;But eventually you get something that solves the problem, and what then?&lt;/p&gt;
    &lt;p&gt;Most of us don't work for Google, we can't axe features and products &lt;a href="https://killedbygoogle.com/" target="_blank"&gt;on a whim&lt;/a&gt;. If the client is happy with your solution, you are expected to support it. It should work when your customers run into new edge cases, or migrate all their computers to the next OS version, or expand into a market with shoddy internet. It should work when 10x as many customers are using 10x as many features. It should work when &lt;a href="https://www.hillelwayne.com/post/feature-interaction/" target="_blank"&gt;you add new features that come into conflict&lt;/a&gt;. &lt;/p&gt;
    &lt;p&gt;And just as importantly, &lt;em&gt;it should never stop solving their problem&lt;/em&gt;. Canonical example: your feature involves processing requested tasks synchronously. At scale, this doesn't work, so to improve latency you make it asynchronous. Now it's eventually consistent, but your customers were depending on it being always consistent. Now it no longer does what they need, and has stopped solving their problems.&lt;/p&gt;
    &lt;p&gt;Every successful requirement met spawns a new requirement: "keep this working". That requirement is permanent, or close enough to decide our long-term strategy. It takes active investment to keep a feature behaving the same as the world around it changes.&lt;/p&gt;
    &lt;p&gt;(Is this all a pretentious way of saying "software maintenance is hard?" Maybe!)&lt;/p&gt;
    &lt;h3&gt;Phase changes&lt;/h3&gt;
    &lt;div class="subscribe-form"&gt;&lt;/div&gt;
    &lt;p&gt;In physics there's a concept of a &lt;a href="https://en.wikipedia.org/wiki/Phase_transition" target="_blank"&gt;phase transition&lt;/a&gt;. To raise the temperature of a gram of liquid water by 1° C, you have to add 4.184 joules of energy.&lt;sup id="fnref:calorie"&gt;&lt;a class="footnote-ref" href="#fn:calorie"&gt;2&lt;/a&gt;&lt;/sup&gt; This continues until you raise it to 100°C, then it stops. After you've added two &lt;em&gt;thousand&lt;/em&gt; joules to that gram, it suddenly turns into steam. The energy of the system changes continuously but the form, or phase, changes discretely.&lt;/p&gt;
    &lt;p&gt;&lt;img alt="Phase_diagram_of_water_simplified.svg.png (from above link)" class="newsletter-image" src="https://assets.buttondown.email/images/31676a33-be6a-4c6d-a96f-425723dcb0d5.png?w=960&amp;amp;fit=max"/&gt;&lt;/p&gt;
    &lt;p class="empty-line" style="height:16px; margin:0px !important;"&gt;&lt;/p&gt;
    &lt;p&gt;Software isn't physics but the idea works as a metaphor. A certain architecture handles a certain level of load, and past that you need a new architecture. Or a bunch of similar features are independently hardcoded until the system becomes too messy to understand, and you remodel the internals into something unified and extendable. Etc, etc, etc. It doesn't have to be a totally discrete phase transition, but there's definitely a "before" and "after" in the system form. &lt;/p&gt;
    &lt;p&gt;Phase changes tend to lead to more intricacy/complexity in the system, meaning it's likely that a phase change will introduce new bugs into existing behaviors. Take the synchronous vs asynchronous case. A very simple toy model of synchronous updates would be &lt;code&gt;Set(key, val)&lt;/code&gt;, which updates &lt;code&gt;data[key]&lt;/code&gt; to &lt;code&gt;val&lt;/code&gt;.&lt;sup id="fnref:tla"&gt;&lt;a class="footnote-ref" href="#fn:tla"&gt;3&lt;/a&gt;&lt;/sup&gt; A model of asynchronous updates would be &lt;code&gt;AsyncSet(key, val, priority)&lt;/code&gt;, which adds a &lt;code&gt;(key, val, priority, server_time())&lt;/code&gt; tuple to a &lt;code&gt;tasks&lt;/code&gt; set, and then another process asynchronously pulls a tuple (ordered by highest priority, then earliest time) and calls &lt;code&gt;Set(key, val)&lt;/code&gt;. Here are some properties the client may need preserved as a requirement: &lt;/p&gt;
    &lt;ul&gt;
    &lt;li&gt;If &lt;code&gt;AsyncSet(key, val, _)&lt;/code&gt; is called, then &lt;em&gt;eventually&lt;/em&gt; &lt;code&gt;data[key] = val&lt;/code&gt; (possibly violated if higher-priority tasks keep coming in)&lt;/li&gt;
    &lt;li&gt;If someone calls &lt;code&gt;AsyncSet(key1, val1, low)&lt;/code&gt; and then &lt;code&gt;AsyncSet(key2, val2, low)&lt;/code&gt;, they should see the first update and then the second (linearizability, possibly violated if the requests go to different servers with different clock times)&lt;/li&gt;
    &lt;li&gt;If someone calls &lt;code&gt;AsyncSet(key, val, _)&lt;/code&gt; and &lt;em&gt;immediately&lt;/em&gt; reads &lt;code&gt;data[key]&lt;/code&gt; they should get &lt;code&gt;val&lt;/code&gt; (obviously violated, though the client may accept a &lt;em&gt;slightly&lt;/em&gt; weaker property)&lt;/li&gt;
    &lt;/ul&gt;
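    &lt;p&gt;The toy model above can be sketched in a few lines of Python. This is only an illustration under my own assumptions: the &lt;code&gt;Store&lt;/code&gt; class and &lt;code&gt;process_one&lt;/code&gt; method are made-up names, a heap stands in for the &lt;code&gt;tasks&lt;/code&gt; set, and a simple counter stands in for &lt;code&gt;server_time()&lt;/code&gt; on a single server. Even at this size it shows the third property failing: a read right after &lt;code&gt;async_set&lt;/code&gt; doesn't see the value.&lt;/p&gt;

```python
import heapq
import itertools


class Store:
    def __init__(self):
        self.data = {}
        self.tasks = []                  # heap of pending async updates
        self._clock = itertools.count()  # stand-in for server_time()

    def set(self, key, val):
        # Synchronous update: visible immediately.
        self.data[key] = val

    def async_set(self, key, val, priority):
        # Negate priority so the highest priority pops first,
        # with ties broken by the earliest timestamp.
        heapq.heappush(self.tasks, (-priority, next(self._clock), key, val))

    def process_one(self):
        # The "other process": apply one pending update, if there is one.
        if not self.tasks:
            return False
        _, _, key, val = heapq.heappop(self.tasks)
        self.set(key, val)
        return True
```

    &lt;p&gt;Calling &lt;code&gt;async_set("x", 1, priority=0)&lt;/code&gt; and then reading &lt;code&gt;data&lt;/code&gt; before any &lt;code&gt;process_one()&lt;/code&gt; call finds no &lt;code&gt;"x"&lt;/code&gt; key at all, and a steady stream of higher-priority tasks would keep that update waiting indefinitely.&lt;/p&gt;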
    &lt;p&gt;If the new system doesn't satisfy an existing customer requirement, it's prudent to fix the bug &lt;em&gt;before&lt;/em&gt; releasing the new system. The customer doesn't notice or care that your system underwent a phase change. They'll just see that one day your product solves their problems, and the next day it suddenly doesn't. &lt;/p&gt;
    &lt;p&gt;This is one of the most common applications of formal methods. Both of those systems, and every one of those properties, are formally specifiable in a specification language. We can then automatically check that the new system satisfies the existing properties, and from there do things like &lt;a href="https://arxiv.org/abs/2006.00915" target="_blank"&gt;automatically generate test suites&lt;/a&gt;. This does take a lot of work, so if your requirements are constantly changing, FM may not be worth the investment. But eventually requirements &lt;em&gt;stop&lt;/em&gt; changing, and then you're stuck with them forever. That's where models shine.&lt;/p&gt;
    &lt;div class="footnote"&gt;
    &lt;hr/&gt;
    &lt;ol&gt;
    &lt;li id="fn:fs"&gt;
    &lt;p&gt;As always, I'm using formal methods to mean the subdiscipline of formal specification of designs, leaving out the formal verification of code. Mostly because "formal specification" is really awkward to say. &lt;a class="footnote-backref" href="#fnref:fs" title="Jump back to footnote 1 in the text"&gt;↩&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id="fn:calorie"&gt;
    &lt;p&gt;Also called a "calorie". The US "dietary Calorie" is actually a kilocalorie. &lt;a class="footnote-backref" href="#fnref:calorie" title="Jump back to footnote 2 in the text"&gt;↩&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id="fn:tla"&gt;
    &lt;p&gt;This is all directly translatable to a TLA+ specification, I'm just describing it in English to avoid paying the syntax tax &lt;a class="footnote-backref" href="#fnref:tla" title="Jump back to footnote 3 in the text"&gt;↩&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;/ol&gt;
    &lt;/div&gt;
    </description><pubDate>Thu, 24 Apr 2025 11:00:00 +0000</pubDate><guid>https://buttondown.com/hillelwayne/archive/requirements-change-until-they-dont/</guid></item><item><title>The Halting Problem is a terrible example of NP-Harder</title><link>https://buttondown.com/hillelwayne/archive/the-halting-problem-is-a-terrible-example-of-np/</link><description>
    &lt;p&gt;&lt;em&gt;Short one this time because I have a lot going on this week.&lt;/em&gt;&lt;/p&gt;
    &lt;p&gt;In computational complexity, &lt;strong&gt;NP&lt;/strong&gt; is the class of all decision problems (yes/no) where a potential proof (or "witness") for "yes" can be &lt;em&gt;verified&lt;/em&gt; in polynomial time. For example, "does this set of numbers have a subset that sums to zero" is in NP. If the answer is "yes", you can prove it by presenting a set of numbers. We would then verify the witness by 1) checking that all the numbers are present in the set (~linear time) and 2) adding up all the numbers (also linear).&lt;/p&gt;
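    &lt;p&gt;As a sketch, the two verification steps look something like this. The function name is made up, and I'm assuming the usual convention that the witness has to be non-empty (otherwise the empty subset trivially sums to zero).&lt;/p&gt;

```python
from collections import Counter


def verify_subset_sum(numbers, witness):
    """Check a claimed "yes" witness for the subset-sum-to-zero problem."""
    if not witness:
        return False  # assumption: the empty subset doesn't count
    # Step 1: every number in the witness is present in the original
    # collection. Counter subtraction keeps only what is missing.
    missing = Counter(witness) - Counter(numbers)
    if missing:
        return False
    # Step 2: add up the numbers.
    return sum(witness) == 0
```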
    &lt;p&gt;&lt;strong&gt;NP-complete&lt;/strong&gt; is the class of "hardest possible" NP problems. Subset sum is NP-complete. &lt;strong&gt;NP-hard&lt;/strong&gt; is the set of all problems &lt;em&gt;at least as hard&lt;/em&gt; as NP-complete. Notably, NP-hard is &lt;em&gt;not&lt;/em&gt; a subset of NP, as it contains problems that are &lt;em&gt;harder&lt;/em&gt; than NP-complete. A natural question to ask is "like what?" And the canonical example of "NP-harder" is the halting problem (HALT): does program P halt on input C? As the argument goes, it's undecidable, so obviously not in NP.&lt;/p&gt;
    &lt;p&gt;I think this is a bad example for two reasons:&lt;/p&gt;
    &lt;ol&gt;&lt;li&gt;&lt;p&gt;All NP requires is that witnesses for "yes" can be verified in polynomial time. It does not require anything for the "no" case! And even though HALT is undecidable, there &lt;em&gt;is&lt;/em&gt; a decidable way to verify a "yes": let the witness be "it halts in N steps", then run the program for that many steps and see if it halted by then. To prove HALT is not in NP, you have to show that this verification process grows faster than polynomially. It does (as &lt;a href="https://en.wikipedia.org/wiki/Busy_beaver" rel="noopener noreferrer nofollow" target="_blank"&gt;busy beaver&lt;/a&gt; is uncomputable), but this all makes the example needlessly confusing.&lt;sup id="fnref:1"&gt;&lt;a class="footnote-ref" data-id="37347adc-dba6-4629-9d24-c6252292ac6b" data-reference-number="1" href="#fn:1"&gt;1&lt;/a&gt;&lt;/sup&gt;&lt;/p&gt;&lt;/li&gt;&lt;li&gt;&lt;p&gt;"What's bigger than a dog? THE MOON"&lt;/p&gt;&lt;/li&gt;&lt;/ol&gt;
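    &lt;p&gt;Here's a rough sketch of the "it halts in N steps" check from (1). It's a toy under a big assumption of mine: a "program" is modeled as a Python generator function where each &lt;code&gt;yield&lt;/code&gt; counts as one step. The names &lt;code&gt;halts_within&lt;/code&gt;, &lt;code&gt;countdown&lt;/code&gt; and &lt;code&gt;loop_forever&lt;/code&gt; are made up. The check always terminates, because it never runs more than the claimed number of steps.&lt;/p&gt;

```python
def halts_within(program, steps):
    """Verify the witness "program halts within `steps` steps"."""
    run = program()
    for _ in range(steps):
        try:
            next(run)  # advance the program by one step
        except StopIteration:
            return True  # it halted inside the budget
    return False  # still running: this witness is rejected


def countdown(n):
    while n > 0:
        n -= 1
        yield


def loop_forever():
    while True:
        yield
```

    &lt;p&gt;A rejected witness says nothing about whether the program halts later, which is exactly why this only covers the "yes" case.&lt;/p&gt;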
    &lt;p&gt;Really (2) bothers me a lot more than (1) because it's just so inelegant. It suggests that NP-complete is the upper bound of "solvable" problems, and after that you're in full-on undecidability. I'd rather show intuitive problems that are harder than NP but not &lt;em&gt;that&lt;/em&gt; much harder.&lt;/p&gt;
    &lt;p&gt;But in looking for a "slightly harder" problem, I ran into an, ah, problem. It &lt;em&gt;seems&lt;/em&gt; like the next-hardest class would be &lt;a href="https://en.wikipedia.org/wiki/EXPTIME" rel="noopener noreferrer nofollow" target="_blank"&gt;EXPTIME&lt;/a&gt;, except we don't know &lt;em&gt;for sure&lt;/em&gt; that NP != EXPTIME. We know &lt;em&gt;for sure&lt;/em&gt; that NP != &lt;a href="https://en.wikipedia.org/wiki/NEXPTIME" rel="noopener noreferrer nofollow" target="_blank"&gt;NEXPTIME&lt;/a&gt;, but NEXPTIME doesn't have any intuitive, easily explainable problems. Most "definitely harder than NP" problems require a nontrivial background in theoretical computer science or mathematics to understand.&lt;/p&gt;
    &lt;p&gt;There is one problem, though, that I find easily explainable. Place a token at the bottom left corner of a grid that extends infinitely up and right, call that point (0, 0). You're given a list of valid displacement moves for the token, like &lt;code&gt;(+1, +0)&lt;/code&gt;, &lt;code&gt;(-20, +13)&lt;/code&gt;, &lt;code&gt;(-5, -6)&lt;/code&gt;, etc, and a target point like &lt;code&gt;(700, 1)&lt;/code&gt;. You may make any sequence of moves in any order, as long as no move ever puts the token off the grid. Does any sequence of moves bring you to the target?&lt;/p&gt;
    &lt;div class="subscribe-form"&gt;&lt;/div&gt;
    &lt;p&gt;This is PSPACE-complete, I think, which still isn't proven to be harder than NP-complete (though it's widely believed). But what if you increase the number of dimensions of the grid? Past a certain number of dimensions the problem jumps to being EXPSPACE-complete, and then TOWER-complete (grows &lt;a href="https://en.wikipedia.org/wiki/Tetration" rel="noopener noreferrer nofollow" target="_blank"&gt;tetrationally&lt;/a&gt;), and then it keeps going. Some people might recognize this as looking a lot like the &lt;a href="https://en.wikipedia.org/wiki/Ackermann_function" rel="noopener noreferrer nofollow" target="_blank"&gt;Ackermann function&lt;/a&gt;, and in fact this problem is &lt;a href="https://arxiv.org/abs/2104.13866" rel="noopener noreferrer nofollow" target="_blank"&gt;ACKERMANN-complete on the number of available dimensions&lt;/a&gt;.&lt;/p&gt;
    &lt;p&gt;&lt;a href="https://www.quantamagazine.org/an-easy-sounding-problem-yields-numbers-too-big-for-our-universe-20231204/" rel="noopener noreferrer nofollow" target="_blank"&gt;A friend wrote a Quanta article about the whole mess&lt;/a&gt;, you should read it.&lt;/p&gt;
    &lt;p&gt;This problem is ludicrously bigger than NP ("Chicago" instead of "The Moon"), but at least it's clearly decidable, easily explainable, and definitely &lt;em&gt;not&lt;/em&gt; in NP.&lt;/p&gt;
    &lt;div class="footnote"&gt;&lt;hr/&gt;&lt;ol class="footnotes"&gt;&lt;li data-id="37347adc-dba6-4629-9d24-c6252292ac6b" id="fn:1"&gt;&lt;p&gt;It's less confusing if you're taught the alternate (and original!) definition of NP, "the class of problems solvable in polynomial time by a nondeterministic Turing machine". Then HALT can't be in NP because otherwise runtime would be bounded by an exponential function. &lt;a class="footnote-backref" href="#fnref:1"&gt;↩&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;&lt;/ol&gt;&lt;/div&gt;
    </description><pubDate>Wed, 16 Apr 2025 17:39:23 +0000</pubDate><guid>https://buttondown.com/hillelwayne/archive/the-halting-problem-is-a-terrible-example-of-np/</guid></item><item><title>Solving a "Layton Puzzle" with Prolog</title><link>https://buttondown.com/hillelwayne/archive/a48fce5b-8a05-4302-b620-9b26f057f145/</link><description>
    &lt;p&gt;I have a lot in the works for this month's &lt;a href="https://leanpub.com/logic/" target="_blank"&gt;Logic for Programmers&lt;/a&gt; release. Among other things, I'm completely rewriting the chapter on Logic Programming Languages. &lt;/p&gt;
    &lt;p&gt;I originally showcased the paradigm with puzzle solvers, like &lt;a href="https://swish.swi-prolog.org/example/queens.pl" target="_blank"&gt;eight queens&lt;/a&gt; or &lt;a href="https://saksagan.ceng.metu.edu.tr/courses/ceng242/documents/prolog/jrfisher/2_1.html" target="_blank"&gt;four-coloring&lt;/a&gt;. Lots of other demos do this too! It takes creativity and insight for humans to solve them, so a program doing it feels magical. But I'm trying to write a book about practical techniques and I want everything I talk about to be &lt;em&gt;useful&lt;/em&gt;. So in v0.9 I'll be replacing these examples with a couple of new programs that might get people thinking that Prolog could help them in their day-to-day work.&lt;/p&gt;
    &lt;p&gt;On the other hand, for a newsletter, showcasing a puzzle solver is pretty cool. And recently I stumbled into &lt;a href="https://morepablo.com/2010/09/some-professor-layton-prolog.html" target="_blank"&gt;this post&lt;/a&gt; by my friend &lt;a href="https://morepablo.com/" target="_blank"&gt;Pablo Meier&lt;/a&gt;, where he solves a videogame puzzle with Prolog:&lt;sup id="fnref:path"&gt;&lt;a class="footnote-ref" href="#fn:path"&gt;1&lt;/a&gt;&lt;/sup&gt;&lt;/p&gt;
    &lt;p&gt;&lt;img alt="See description below" class="newsletter-image" src="https://assets.buttondown.email/images/a4ee8689-bbce-4dc9-8175-a1de3bd8f2db.png?w=960&amp;amp;fit=max"/&gt;&lt;/p&gt;
    &lt;p&gt;Summary for the text-only readers: We have a test with 10 true/false questions (denoted &lt;code&gt;a/b&lt;/code&gt;) and four student attempts. Given the scores of the first three students, we have to figure out the fourth student's score.&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;bbababbabb = 7
    baaababaaa = 5
    baaabbbaba = 3
    bbaaabbaaa = ???
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;You can see Pablo's solution &lt;a href="https://morepablo.com/2010/09/some-professor-layton-prolog.html" target="_blank"&gt;here&lt;/a&gt;, and try it in SWI-prolog &lt;a href="https://swish.swi-prolog.org/p/Some%20Professor%20Layton%20Prolog.pl" target="_blank"&gt;here&lt;/a&gt;. Pretty cool! But after way too long studying Prolog just to write this dang book chapter, I wanted to see if I could do it more elegantly than him. Code and puzzle spoilers to follow.&lt;/p&gt;
    &lt;p&gt;(Normally here's where I'd link to a gentler introduction I wrote but I think this is my first time writing about Prolog online? Uh here's a &lt;a href="https://www.hillelwayne.com/post/picat/" target="_blank"&gt;Picat intro&lt;/a&gt; instead)&lt;/p&gt;
    &lt;h3&gt;The Program&lt;/h3&gt;
    &lt;p&gt;You can try this all online at &lt;a href="https://swish.swi-prolog.org/p/" target="_blank"&gt;SWISH&lt;/a&gt; or just jump to my final version &lt;a href="https://swish.swi-prolog.org/p/layton_prolog_puzzle.pl" target="_blank"&gt;here&lt;/a&gt;.&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="p"&gt;:-&lt;/span&gt; &lt;span class="nf"&gt;use_module&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;library&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s s-Atom"&gt;dif&lt;/span&gt;&lt;span class="p"&gt;)).&lt;/span&gt;    &lt;span class="c1"&gt;% Sound inequality&lt;/span&gt;
    &lt;span class="p"&gt;:-&lt;/span&gt; &lt;span class="nf"&gt;use_module&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;library&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s s-Atom"&gt;clpfd&lt;/span&gt;&lt;span class="p"&gt;)).&lt;/span&gt;  &lt;span class="c1"&gt;% Finite domain constraints&lt;/span&gt;
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;First some imports. &lt;code&gt;dif&lt;/code&gt; lets us write &lt;code&gt;dif(A, B)&lt;/code&gt;, which is true if &lt;code&gt;A&lt;/code&gt; and &lt;code&gt;B&lt;/code&gt; are &lt;em&gt;not&lt;/em&gt; equal. &lt;code&gt;clpfd&lt;/code&gt; lets us write &lt;code&gt;A #= B + 1&lt;/code&gt; to say "A is 1 more than B".&lt;sup id="fnref:superior"&gt;&lt;a class="footnote-ref" href="#fn:superior"&gt;2&lt;/a&gt;&lt;/sup&gt;&lt;/p&gt;
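    &lt;p&gt;To make the difference from the core operators concrete, here's a minimal sketch, assuming SWI-Prolog with both libraries loaded. The queries are illustrative and aren't part of the puzzle program; the outcome of each one is noted in a comment.&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;% dif/2 can be stated before A is bound; it is checked once A gets a value.
    ?- dif(A, b), A = a.     % succeeds with A = a
    ?- dif(A, b), A = b.     % fails
    ?- \+ (A = b), A = a.    % fails: an unbound A can unify with b, so \+ fails right away

    % #= can also be stated before its arguments are known.
    ?- X #= Y + 1, Y = 2.    % succeeds with X = 3
    ?- X is Y + 1, Y = 2.    % instantiation error: Y is unbound when is/2 runs
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;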
    &lt;p&gt;We'll say both the student submission and the key will be lists, where each value is &lt;code&gt;a&lt;/code&gt; or &lt;code&gt;b&lt;/code&gt;. In Prolog, lowercase identifiers are &lt;strong&gt;atoms&lt;/strong&gt; (like symbols in other languages) and identifiers that start with a capital are &lt;strong&gt;variables&lt;/strong&gt;. Prolog finds values for variables that match equations (&lt;strong&gt;unification&lt;/strong&gt;). The pattern matching is real real good.&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="c1"&gt;% ?- means query&lt;/span&gt;
    &lt;span class="s s-Atom"&gt;?-&lt;/span&gt; &lt;span class="nv"&gt;L&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s s-Atom"&gt;a&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="nv"&gt;B&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="s s-Atom"&gt;c&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;Y&lt;/span&gt;&lt;span class="p"&gt;|&lt;/span&gt;&lt;span class="nv"&gt;X&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;|&lt;/span&gt;&lt;span class="nv"&gt;L&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="nv"&gt;B&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt; &lt;span class="s s-Atom"&gt;#=&lt;/span&gt; &lt;span class="mf"&gt;7.&lt;/span&gt;
    
    &lt;span class="nv"&gt;B&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;6&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="nv"&gt;L&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s s-Atom"&gt;a&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;6&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;c&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
    &lt;span class="nv"&gt;X&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;a&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;6&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;c&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
    &lt;span class="nv"&gt;Y&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;Next, we define &lt;code&gt;score/3&lt;/code&gt;&lt;sup id="fnref:arity"&gt;&lt;a class="footnote-ref" href="#fn:arity"&gt;3&lt;/a&gt;&lt;/sup&gt; recursively. &lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="c1"&gt;% The student's test score&lt;/span&gt;
    &lt;span class="c1"&gt;% score(student answers, answer key, score)&lt;/span&gt;
    &lt;span class="nf"&gt;score&lt;/span&gt;&lt;span class="p"&gt;([],&lt;/span&gt; &lt;span class="p"&gt;[],&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;
    &lt;span class="nf"&gt;score&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="nv"&gt;A&lt;/span&gt;&lt;span class="p"&gt;|&lt;/span&gt;&lt;span class="nv"&gt;As&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;A&lt;/span&gt;&lt;span class="p"&gt;|&lt;/span&gt;&lt;span class="nv"&gt;Ks&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="nv"&gt;N&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;:-&lt;/span&gt;
       &lt;span class="nv"&gt;N&lt;/span&gt; &lt;span class="s s-Atom"&gt;#=&lt;/span&gt; &lt;span class="nv"&gt;M&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nf"&gt;score&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;As&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;Ks&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;M&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;
    &lt;span class="nf"&gt;score&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="nv"&gt;A&lt;/span&gt;&lt;span class="p"&gt;|&lt;/span&gt;&lt;span class="nv"&gt;As&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;K&lt;/span&gt;&lt;span class="p"&gt;|&lt;/span&gt;&lt;span class="nv"&gt;Ks&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="nv"&gt;N&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;:-&lt;/span&gt; 
        &lt;span class="nf"&gt;dif&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;A&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;K&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="nf"&gt;score&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;As&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;Ks&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;N&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;The first argument is the student's answers, the second is the answer key, and the third is the final score. The base case is the empty test, which has score 0. Otherwise, we take the head values of each list and compare them. If they're the same, we add one to the score, otherwise we keep the same score. &lt;/p&gt;
    &lt;p&gt;Notice we couldn't write &lt;code&gt;if x then y else z&lt;/code&gt;, we instead used pattern matching to effectively express &lt;code&gt;(x &amp;amp;&amp;amp; y) || (!x &amp;amp;&amp;amp; z)&lt;/code&gt;. Prolog does have a conditional operator, but it prevents backtracking so what's the point???&lt;/p&gt;
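    &lt;p&gt;Here's a rough sketch of what goes wrong with the conditional. The predicate &lt;code&gt;score_ite/3&lt;/code&gt; is a made-up name for illustration and is not part of the program; it assumes &lt;code&gt;clpfd&lt;/code&gt; is loaded.&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;% Hypothetical variant of score/3 written with if-then-else.
    score_ite([], [], 0).
    score_ite([A|As], [K|Ks], N) :-
        (   A = K
        -&amp;gt;  N #= M + 1
        ;   N #= M
        ),
        score_ite(As, Ks, M).

    ?- score_ite([a, b], [a, c], N).   % fine when both lists are known: N = 1
    ?- score_ite([a, b], Key, 1).      % fails: A = K commits to Key = [a, b], which scores 2
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;Once &lt;code&gt;A = K&lt;/code&gt; succeeds, the else branch is never tried, so the second query can't discover that a key differing from the answers in one position would work. The pattern-matching &lt;code&gt;score/3&lt;/code&gt; answers the same query with two keys, one for each position that could be the mismatch.&lt;/p&gt;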
    &lt;p class="empty-line" style="height:16px; margin:0px !important;"&gt;&lt;/p&gt;
    &lt;h3&gt;A quick break about bidirectionality&lt;/h3&gt;
    &lt;p&gt;One of the coolest things about Prolog: all purely logical predicates are bidirectional. We can use &lt;code&gt;score&lt;/code&gt; to check if our expected score is correct:&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="s s-Atom"&gt;?-&lt;/span&gt; &lt;span class="nf"&gt;score&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="s s-Atom"&gt;a&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;b&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;b&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s s-Atom"&gt;b&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;b&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;b&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;
    &lt;span class="s s-Atom"&gt;true&lt;/span&gt;
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;But we can also give it answers and a key and ask it for the score:&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="s s-Atom"&gt;?-&lt;/span&gt; &lt;span class="nf"&gt;score&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="s s-Atom"&gt;a&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;b&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;b&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s s-Atom"&gt;b&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;b&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;b&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="nv"&gt;X&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;
    &lt;span class="nv"&gt;X&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;&lt;em&gt;Or&lt;/em&gt; we could give it a key and a score and ask "what test answers would have this score?"&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="s s-Atom"&gt;?-&lt;/span&gt; &lt;span class="nf"&gt;score&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;X&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s s-Atom"&gt;b&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;b&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;b&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;
    &lt;span class="nv"&gt;X&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s s-Atom"&gt;b&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;b&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;_&lt;/span&gt;&lt;span class="nv"&gt;A&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
    &lt;span class="nf"&gt;dif&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;_&lt;/span&gt;&lt;span class="nv"&gt;A&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="s s-Atom"&gt;b&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="nv"&gt;X&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s s-Atom"&gt;b&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;_&lt;/span&gt;&lt;span class="nv"&gt;A&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;b&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
    &lt;span class="nf"&gt;dif&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;_&lt;/span&gt;&lt;span class="nv"&gt;A&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="s s-Atom"&gt;b&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="nv"&gt;X&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="k"&gt;_&lt;/span&gt;&lt;span class="nv"&gt;A&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;b&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;b&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
    &lt;span class="nf"&gt;dif&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;_&lt;/span&gt;&lt;span class="nv"&gt;A&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="s s-Atom"&gt;b&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;The different value is written &lt;code&gt;_A&lt;/code&gt; because we never told Prolog that the list can &lt;em&gt;only&lt;/em&gt; contain &lt;code&gt;a&lt;/code&gt; and &lt;code&gt;b&lt;/code&gt;. We'll fix this later.&lt;/p&gt;
    &lt;h3&gt;Okay back to the program&lt;/h3&gt;
    &lt;p&gt;Now that we have a way of computing scores, we want to find a possible answer key that matches all of our observations, ie gives everybody the correct scores.&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="nf"&gt;key&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;Key&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;:-&lt;/span&gt;
        &lt;span class="c1"&gt;% Figure it out&lt;/span&gt;
        &lt;span class="nf"&gt;score&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="s s-Atom"&gt;b&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;b&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;a&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;b&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;a&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;b&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;b&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;a&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;b&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;b&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="nv"&gt;Key&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;7&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
        &lt;span class="nf"&gt;score&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="s s-Atom"&gt;b&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;a&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;a&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;a&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;b&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;a&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;b&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;a&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;a&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;a&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="nv"&gt;Key&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
        &lt;span class="nf"&gt;score&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="s s-Atom"&gt;b&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;a&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;a&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;a&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;b&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;b&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;b&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;a&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;b&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;a&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="nv"&gt;Key&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;So far we haven't explicitly said that the &lt;code&gt;Key&lt;/code&gt; length matches the student answer lengths. This is implicitly verified by &lt;code&gt;score&lt;/code&gt; (both lists need to be empty at the same time) but it's a good idea to explicitly add &lt;code&gt;length(Key, 10)&lt;/code&gt; as a clause of &lt;code&gt;key/1&lt;/code&gt;. We should also explicitly say that every element of &lt;code&gt;Key&lt;/code&gt; is either &lt;code&gt;a&lt;/code&gt; or &lt;code&gt;b&lt;/code&gt;.&lt;sup id="fnref:explicit"&gt;&lt;a class="footnote-ref" href="#fn:explicit"&gt;4&lt;/a&gt;&lt;/sup&gt; Now we &lt;em&gt;could&lt;/em&gt; write a second predicate saying &lt;code&gt;Key&lt;/code&gt; had the right 'type': &lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;keytype([]).
    keytype([K|Ks]) :- member(K, [a, b]), keytype(Ks).
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;But "generating lists that match a constraint" is a thing that comes up often enough that we don't want to write a separate predicate for each constraint! So after some digging, I found a more elegant solution: &lt;code&gt;maplist&lt;/code&gt;. Let &lt;code&gt;L=[l1, l2]&lt;/code&gt;. Then &lt;code&gt;maplist(p, L)&lt;/code&gt; is equivalent to the clause &lt;code&gt;p(l1), p(l2)&lt;/code&gt;. It also accepts partial predicates: &lt;code&gt;maplist(p(x), L)&lt;/code&gt; is equivalent to &lt;code&gt;p(x, l1), p(x, l2)&lt;/code&gt;. So we could write&lt;sup id="fnref:yall"&gt;&lt;a class="footnote-ref" href="#fn:yall"&gt;5&lt;/a&gt;&lt;/sup&gt;&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="nf"&gt;contains&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;L&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;X&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;:-&lt;/span&gt; &lt;span class="nf"&gt;member&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;X&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;L&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;
    
    &lt;span class="nf"&gt;key&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;Key&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;:-&lt;/span&gt;
        &lt;span class="nf"&gt;length&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;Key&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
        &lt;span class="nf"&gt;maplist&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;contains&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="s s-Atom"&gt;a&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="s s-Atom"&gt;b&lt;/span&gt;&lt;span class="p"&gt;]),&lt;/span&gt; &lt;span class="nv"&gt;Key&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
        &lt;span class="c1"&gt;% the score stuff&lt;/span&gt;
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
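    &lt;p&gt;A smaller query shows what the &lt;code&gt;length&lt;/code&gt; and &lt;code&gt;maplist&lt;/code&gt; goals do by themselves. This is only a sketch on a hypothetical two-question test: &lt;code&gt;length/2&lt;/code&gt; builds a list of fresh variables, &lt;code&gt;maplist&lt;/code&gt; restricts each one to &lt;code&gt;a&lt;/code&gt; or &lt;code&gt;b&lt;/code&gt;, and backtracking then enumerates every candidate key.&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;contains(L, X) :- member(X, L).

    % Ks starts as two unbound variables; each is then drawn from [a, b].
    ?- length(Ks, 2), maplist(contains([a, b]), Ks).
    % Ks = [a, a] ; Ks = [a, b] ; Ks = [b, a] ; Ks = [b, b]
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;For the real ten-question test that's 1,024 candidate keys, which the &lt;code&gt;score&lt;/code&gt; goals then filter.&lt;/p&gt;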
    &lt;p&gt;Now, let's query for the Key:&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="s s-Atom"&gt;?-&lt;/span&gt; &lt;span class="nf"&gt;key&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;Key&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;
    &lt;span class="nv"&gt;Key&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s s-Atom"&gt;a&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;b&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;a&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;b&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;a&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;a&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;b&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;a&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;a&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;b&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
    &lt;span class="nv"&gt;Key&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s s-Atom"&gt;b&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;b&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;a&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;b&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;a&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;a&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;a&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;a&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;a&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;b&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
    &lt;span class="nv"&gt;Key&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s s-Atom"&gt;b&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;b&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;a&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;b&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;a&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;a&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;b&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;b&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;a&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;b&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
    &lt;span class="nv"&gt;Key&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s s-Atom"&gt;b&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;b&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;b&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;b&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;a&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;a&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;b&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;a&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;a&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;b&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;So there are actually four &lt;em&gt;different&lt;/em&gt; keys that all explain our data. Does this mean the puzzle is broken and has multiple different answers?&lt;/p&gt;
    &lt;h3&gt;Nope&lt;/h3&gt;
    &lt;p&gt;The puzzle wasn't to find out what the answer key was, the point was to find the fourth student's score. And if we query for it, we see all four solutions give him the same score:&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="s s-Atom"&gt;?-&lt;/span&gt; &lt;span class="nf"&gt;key&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;Key&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="nf"&gt;score&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="s s-Atom"&gt;b&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;b&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;a&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;a&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;a&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;b&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;b&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;a&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;a&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s s-Atom"&gt;a&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="nv"&gt;Key&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;X&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;
    &lt;span class="nv"&gt;X&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;6&lt;/span&gt;
    &lt;span class="nv"&gt;X&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;6&lt;/span&gt;
    &lt;span class="nv"&gt;X&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;6&lt;/span&gt;
    &lt;span class="nv"&gt;X&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;6&lt;/span&gt;
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;Huh! I really like it when puzzles look like they're broken, but every "alternate" solution still gives the same puzzle answer.&lt;/p&gt;
    &lt;p&gt;Total program length: 15 lines of code, compared to the original's 80 lines. &lt;em&gt;Suck it, Pablo.&lt;/em&gt;&lt;/p&gt;
    &lt;p&gt;(Incidentally, you can get all of the answers at once by writing &lt;code&gt;findall(X, (key(Key), score($answer-array, Key, X)), L).&lt;/code&gt;) &lt;/p&gt;
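    &lt;p&gt;For reference, here's one way the fragments above fit together into a single loadable file. Treat it as a sketch assembled from the snippets in this post; the final version linked on SWISH may differ in its details. The &lt;code&gt;findall&lt;/code&gt; query is left in a comment so the file loads cleanly.&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;:- use_module(library(dif)).    % Sound inequality
    :- use_module(library(clpfd)).  % Finite domain constraints

    % score(student answers, answer key, score)
    score([], [], 0).
    score([A|As], [A|Ks], N) :-
        N #= M + 1, score(As, Ks, M).
    score([A|As], [K|Ks], N) :-
        dif(A, K), score(As, Ks, N).

    contains(L, X) :- member(X, L).

    % A key is ten answers, each a or b, that gives the three known scores.
    key(Key) :-
        length(Key, 10),
        maplist(contains([a, b]), Key),
        score([b, b, a, b, a, b, b, a, b, b], Key, 7),
        score([b, a, a, a, b, a, b, a, a, a], Key, 5),
        score([b, a, a, a, b, b, b, a, b, a], Key, 3).

    % Collect the fourth student's score under every consistent key:
    % ?- findall(X, (key(Key), score([b, b, a, a, a, b, b, a, a, a], Key, X)), Xs).
    % Given the four keys found above, Xs should come back as [6, 6, 6, 6].
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;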
    &lt;p class="empty-line" style="height:16px; margin:0px !important;"&gt;&lt;/p&gt;
    &lt;h3&gt;I still don't like puzzles for teaching&lt;/h3&gt;
    &lt;p&gt;The actual examples I'm using in &lt;a href="https://leanpub.com/logic/" target="_blank"&gt;the book&lt;/a&gt; are "analyzing a version control commit graph" and "planning a sequence of infrastructure changes", which are somewhat more likely to occur at work than needing to solve a puzzle. You'll see them in the next release!&lt;/p&gt;
    &lt;div class="footnote"&gt;
    &lt;hr/&gt;
    &lt;ol&gt;
    &lt;li id="fn:path"&gt;
    &lt;p&gt;I found it because he wrote &lt;a href="https://morepablo.com/2025/04/gamer-games-for-lite-gamers.html" target="_blank"&gt;Gamer Games for Lite Gamers&lt;/a&gt; as a response to my &lt;a href="https://www.hillelwayne.com/post/vidja-games/" target="_blank"&gt;Gamer Games for Non-Gamers&lt;/a&gt;. &lt;a class="footnote-backref" href="#fnref:path" title="Jump back to footnote 1 in the text"&gt;↩&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id="fn:superior"&gt;
    &lt;p&gt;These are better versions of the core Prolog expressions &lt;code&gt;\+ (A = B)&lt;/code&gt; and &lt;code&gt;A is B + 1&lt;/code&gt;, because they can &lt;a href="https://eu.swi-prolog.org/pldoc/man?predicate=dif/2" target="_blank"&gt;defer unification&lt;/a&gt;. &lt;a class="footnote-backref" href="#fnref:superior" title="Jump back to footnote 2 in the text"&gt;↩&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id="fn:arity"&gt;
    &lt;p&gt;Prolog-descendants have a convention of writing the arity of the function after its name, so &lt;code&gt;score/3&lt;/code&gt; means "score has three parameters". I think they do this because you can overload predicates with multiple different arities. Also Joe Armstrong used Prolog for prototyping, so Erlang and Elixir follow the same convention. &lt;a class="footnote-backref" href="#fnref:arity" title="Jump back to footnote 3 in the text"&gt;↩&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id="fn:explicit"&gt;
    &lt;p&gt;It &lt;em&gt;still&lt;/em&gt; gets the right answers without this type restriction, but I had no idea it did until I checked for myself. Probably better not to rely on this! &lt;a class="footnote-backref" href="#fnref:explicit" title="Jump back to footnote 4 in the text"&gt;↩&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id="fn:yall"&gt;
    &lt;p&gt;We could make this even more compact by using a lambda function. First import module &lt;code&gt;yall&lt;/code&gt;, then write &lt;code&gt;maplist([X]&amp;gt;&amp;gt;member(X, [a,b]), Key)&lt;/code&gt;. But (1) it's not a shorter program because you replace the extra definition with an extra module import, and (2) &lt;code&gt;yall&lt;/code&gt; is SWI-Prolog specific and not an ISO-standard prolog module. Using &lt;code&gt;contains&lt;/code&gt; is more portable. &lt;a class="footnote-backref" href="#fnref:yall" title="Jump back to footnote 5 in the text"&gt;↩&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;/ol&gt;
    &lt;/div&gt;
    </description><pubDate>Tue, 08 Apr 2025 18:34:50 +0000</pubDate><guid>https://buttondown.com/hillelwayne/archive/a48fce5b-8a05-4302-b620-9b26f057f145/</guid></item><item><title>[April Cools] Gaming Games for Non-Gamers</title><link>https://buttondown.com/hillelwayne/archive/april-cools-gaming-games-for-non-gamers/</link><description>
    &lt;p&gt;My &lt;em&gt;April Cools&lt;/em&gt; is out! &lt;a href="https://www.hillelwayne.com/post/vidja-games/" target="_blank"&gt;Gaming Games for Non-Gamers&lt;/a&gt; is a 3,000 word essay on video games worth playing if you've never enjoyed a video game before. &lt;a href="https://www.patreon.com/posts/blog-notes-gamer-125654321?utm_medium=clipboard_copy&amp;amp;utm_source=copyLink&amp;amp;utm_campaign=postshare_creator&amp;amp;utm_content=join_link" target="_blank"&gt;Patreon notes here&lt;/a&gt;.&lt;/p&gt;
    &lt;p&gt;(April Cools is a project where we write genuine content on non-normal topics. You can see all the other April Cools posted so far &lt;a href="https://www.aprilcools.club/" target="_blank"&gt;here&lt;/a&gt;. There's still time to submit your own!)&lt;/p&gt;
    &lt;a class="embedded-link" href="https://www.aprilcools.club/"&gt; &lt;div style="width: 100%; background: #fff; border: 1px #ced3d9 solid; border-radius: 5px; margin-top: 1em; overflow: auto; margin-bottom: 1em;"&gt; &lt;div style="float: left; border-bottom: 1px #ced3d9 solid;"&gt; &lt;img class="link-image" src="https://www.aprilcools.club/aprilcoolsclub.png"/&gt; &lt;/div&gt; &lt;div style="float: left; color: #393f48; padding-left: 1em; padding-right: 1em;"&gt; &lt;h4 class="link-title" style="margin-bottom: 0em; line-height: 1.25em; margin-top: 1em; font-size: 14px;"&gt;                April Cools' Club&lt;/h4&gt; &lt;/div&gt; &lt;/div&gt;&lt;/a&gt;
    </description><pubDate>Tue, 01 Apr 2025 16:04:59 +0000</pubDate><guid>https://buttondown.com/hillelwayne/archive/april-cools-gaming-games-for-non-gamers/</guid></item><item><title>Betteridge's Law of Software Engineering Specialness</title><link>https://buttondown.com/hillelwayne/archive/betteridges-law-of-software-engineering/</link><description>
    &lt;h3&gt;Logic for Programmers v0.8 now out!&lt;/h3&gt;
    &lt;p&gt;The new release has minor changes: new formatting for notes and a better introduction to predicates. I would have rolled it all into v0.9 next month but I like the monthly cadence. &lt;a href="https://leanpub.com/logic/" target="_blank"&gt;Get it here!&lt;/a&gt;&lt;/p&gt;
    &lt;h1&gt;Betteridge's Law of Software Engineering Specialness&lt;/h1&gt;
    &lt;p&gt;In &lt;a href="https://agileotter.blogspot.com/2025/03/there-is-no-automatic-reset-in.html" target="_blank"&gt;There is No Automatic Reset in Engineering&lt;/a&gt;, Tim Ottinger asks:&lt;/p&gt;
    &lt;blockquote&gt;
    &lt;p&gt;Do the other people have to live with January 2013 for the rest of their lives? Or is it only engineering that has to deal with every dirty hack since the beginning of the organization?&lt;/p&gt;
    &lt;/blockquote&gt;
    &lt;p&gt;&lt;strong&gt;Betteridge's Law of Headlines&lt;/strong&gt; says that if a journalism headline ends with a question mark, the answer is probably "no". I propose a similar law relating to software engineering specialness:&lt;sup id="fnref:ottinger"&gt;&lt;a class="footnote-ref" href="#fn:ottinger"&gt;1&lt;/a&gt;&lt;/sup&gt;&lt;/p&gt;
    &lt;blockquote&gt;
    &lt;p&gt;If someone asks if some aspect of software development is truly unique to just software development, the answer is probably "no".&lt;/p&gt;
    &lt;/blockquote&gt;
    &lt;p&gt;Take the idea that "in software, hacks are forever." My favorite example of this comes from a different profession. The &lt;a href="https://en.wikipedia.org/wiki/Dewey_Decimal_Classification" target="_blank"&gt;Dewey Decimal System&lt;/a&gt; hierarchically categorizes books by discipline. For example, &lt;em&gt;&lt;a href="https://www.librarything.com/work/10143437/t/Covered-Bridges-of-Pennsylvania" target="_blank"&gt;Covered Bridges of Pennsylvania&lt;/a&gt;&lt;/em&gt; has Dewey number &lt;code&gt;624.37&lt;/code&gt;. &lt;code&gt;6--&lt;/code&gt; is the technology discipline, &lt;code&gt;62-&lt;/code&gt; is engineering, &lt;code&gt;624&lt;/code&gt; is civil engineering, and &lt;code&gt;624.3&lt;/code&gt; is "special types of bridges". I have no idea what the last &lt;code&gt;0.07&lt;/code&gt; means, but you get the picture.&lt;/p&gt;
    &lt;p&gt;Now if you look at the &lt;a href="https://www.librarything.com/mds/6" target="_blank"&gt;6-- "technology" breakdown&lt;/a&gt;, you'll see that there's no "software" subdiscipline. This is because Dewey preallocated the whole technology block in 1876. New topics were instead to be added to the &lt;code&gt;00-&lt;/code&gt; "general-knowledge" catch-all. Eventually &lt;code&gt;005&lt;/code&gt; was assigned to "software development", meaning &lt;em&gt;The C Programming Language&lt;/em&gt; lives at &lt;code&gt;005.133&lt;/code&gt;. &lt;/p&gt;
    &lt;p&gt;Incidentally, another late addition to the general knowledge block is &lt;code&gt;001.9&lt;/code&gt;: "controversial knowledge". &lt;/p&gt;
    &lt;p&gt;And that's why my hometown library shelved the C++ books right next to &lt;em&gt;The Mothman Prophecies&lt;/em&gt;.&lt;/p&gt;
    &lt;p&gt;How's &lt;em&gt;that&lt;/em&gt; for technical debt?&lt;/p&gt;
    &lt;p&gt;If anything, fixing hacks in software is significantly &lt;em&gt;easier&lt;/em&gt; than in other fields. This came up when I was &lt;a href="https://www.hillelwayne.com/post/we-are-not-special/" target="_blank"&gt;interviewing classic engineers&lt;/a&gt;. Kludges happened all the time, but "refactoring" them out is &lt;em&gt;expensive&lt;/em&gt;. Need to house a machine that's just two inches taller than the room? Guess what, you're cutting a hole in the ceiling.&lt;/p&gt;
    &lt;p&gt;(Even if we restrict the question to other departments in a &lt;em&gt;software company&lt;/em&gt;, we can find kludges that are horrible to undo. I once worked for a company which landed an early contract by adding a bespoke support agreement for that one customer. That plagued them for years afterward.)&lt;/p&gt;
    &lt;p&gt;That's not to say that there aren't things that are different about software vs other fields!&lt;sup id="fnref:example"&gt;&lt;a class="footnote-ref" href="#fn:example"&gt;2&lt;/a&gt;&lt;/sup&gt;  But I think that &lt;em&gt;most&lt;/em&gt; of the time, when we say "software development is the only profession that deals with XYZ", it's only because we're ignorant of how those other professions work.&lt;/p&gt;
    &lt;hr/&gt;
    &lt;p&gt;Short newsletter because I'm way behind on writing my &lt;a href="https://www.aprilcools.club/" target="_blank"&gt;April Cools&lt;/a&gt;. If you're interested in April Cools, you should try it out! I make it &lt;em&gt;way&lt;/em&gt; harder on myself than it actually needs to be— everybody else who participates finds it pretty chill.&lt;/p&gt;
    &lt;div class="footnote"&gt;
    &lt;hr/&gt;
    &lt;ol&gt;
    &lt;li id="fn:ottinger"&gt;
    &lt;p&gt;Ottinger caveats it with "engineering, software or otherwise", so I think he knows that other branches of &lt;em&gt;engineering&lt;/em&gt;, at least, have kludges. &lt;a class="footnote-backref" href="#fnref:ottinger" title="Jump back to footnote 1 in the text"&gt;↩&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id="fn:example"&gt;
    &lt;p&gt;The "software is different" idea that I'm most sympathetic to is that in software, the tools we use and the products we create are made from the same material. That's unusual at least in classic engineering. Then again, plenty of machinists have made their own lathes and mills! &lt;a class="footnote-backref" href="#fnref:example" title="Jump back to footnote 2 in the text"&gt;↩&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;/ol&gt;
    &lt;/div&gt;
    </description><pubDate>Wed, 26 Mar 2025 18:48:39 +0000</pubDate><guid>https://buttondown.com/hillelwayne/archive/betteridges-law-of-software-engineering/</guid></item><item><title>Verification-First Development</title><link>https://buttondown.com/hillelwayne/archive/verification-first-development/</link><description>
    &lt;p&gt;A while back I argued on the Blue Site&lt;sup id="fnref:li"&gt;&lt;a class="footnote-ref" href="#fn:li"&gt;1&lt;/a&gt;&lt;/sup&gt; that "test-first development" (TFD) was different than "test-driven development" (TDD). The former is "write tests before you write code", the latter is a paradigm, culture, and collection of norms that's based on TFD. More broadly, TFD is a special case of &lt;strong&gt;Verification-First Development&lt;/strong&gt; and TDD is not.&lt;/p&gt;
    &lt;blockquote&gt;
    &lt;p&gt;VFD: before writing code, put in place some means of verifying that the code is correct, or at least have an idea of what you'll do.&lt;/p&gt;
    &lt;/blockquote&gt;
    &lt;p&gt;"Verifying" could mean writing tests, or figuring out how to encode invariants in types, or &lt;a href="https://blog.regehr.org/archives/1091" target="_blank"&gt;adding contracts&lt;/a&gt;, or &lt;a href="https://learntla.com/" target="_blank"&gt;making a formal model&lt;/a&gt;, or writing a separate script that checks the output of the program. Just have &lt;em&gt;something&lt;/em&gt; appropriate in place that you can run as you go building the code. Ideally, we'd have verification in place for every interesting property, but that's rarely possible in practice. &lt;/p&gt;
    &lt;p&gt;Oftentimes we can't make the verification until the code is partially complete. In that case it still helps to figure out the verification we'll write later. The point is to have a &lt;em&gt;plan&lt;/em&gt; and follow it promptly.&lt;/p&gt;
    &lt;p&gt;I'm using "code" as a standin for anything we programmers make, not just software programs. When using constraint solvers, I try to find representative problems I know the answers to. When writing formal specifications, I figure out the system's properties before the design that satisfies those properties. There's probably equivalents in security and other topics, too.&lt;/p&gt;
    &lt;h3&gt;The Benefits of VFD&lt;/h3&gt;
    &lt;ol&gt;
    &lt;li&gt;Doing verification before coding makes it less likely we'll skip verification entirely. It's the professional equivalent of "No TV until you do your homework."&lt;/li&gt;
    &lt;li&gt;It's easier to make sure a verifier works properly if we start by running it on code we know doesn't pass it. Bebugging working code takes more discipline.&lt;/li&gt;
    &lt;li&gt;We can run checks earlier in the development process. It's better to realize that our code is broken five minutes after we broke it rather than two hours after.&lt;/li&gt;
    &lt;/ol&gt;
    &lt;p&gt;That's it, those are the benefits of verification-first development. Those are also &lt;em&gt;big&lt;/em&gt; benefits for relatively little investment. Specializations of VFD like test-first development can have more benefits, but also more drawbacks.&lt;/p&gt;
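    &lt;p&gt;Here is a minimal Python sketch of the second benefit, assuming the task is a hypothetical &lt;code&gt;dedupe&lt;/code&gt; function that removes duplicates while keeping order. The checker is written first, run against a deliberately wrong stub to confirm it can fail, and only then run against the real code. Every name here is made up for illustration.&lt;/p&gt;

```python
def check_dedupe(dedupe):
    """Verifier written before the implementation; returns a list of failures."""
    cases = [
        ([], []),
        ([1, 1, 2], [1, 2]),
        ([3, 1, 3, 2, 1], [3, 1, 2]),
    ]
    failures = []
    for given, expected in cases:
        actual = dedupe(list(given))
        if actual != expected:
            failures.append((given, expected, actual))
    return failures


def dedupe_stub(items):
    # Deliberately wrong placeholder: lets us confirm the checker can fail.
    return items


def dedupe(items):
    # Real implementation, written after the checker was seen failing.
    seen = set()
    result = []
    for item in items:
        if item not in seen:
            seen.add(item)
            result.append(item)
    return result


stub_failures = check_dedupe(dedupe_stub)
real_failures = check_dedupe(dedupe)
print(len(stub_failures), "failures for the stub")
print(len(real_failures), "failures for the real code")
```

    &lt;p&gt;Seeing the stub fail first is the cheap evidence that the checker is actually checking something.&lt;/p&gt;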
    &lt;h3&gt;The drawbacks of VFD&lt;/h3&gt;
    &lt;ol&gt;
    &lt;li&gt;It slows us down. I know lots of people say that "no actually it makes you go faster in the long run," but that's the &lt;em&gt;long&lt;/em&gt; run. Sometimes we do marathons, sometimes we sprint.&lt;/li&gt;
    &lt;li&gt;Verification gets in the way of exploratory coding, where we don't know what exactly we want or how exactly to do something.&lt;/li&gt;
    &lt;li&gt;Any specific form of verification exerts a pressure on our code to make it easier to verify with that method. For example, if we're mostly verifying via type invariants, we need to figure out how to express those things in our language's type system, which may not be suited for the specific invariants we need.&lt;sup id="fnref:sphinx"&gt;&lt;a class="footnote-ref" href="#fn:sphinx"&gt;2&lt;/a&gt;&lt;/sup&gt;&lt;/li&gt;
    &lt;/ol&gt;
    &lt;h2&gt;Whether "pressure" is a real drawback is incredibly controversial&lt;/h2&gt;
    &lt;p&gt;If I had to summarize what makes "test-driven development" different from VFD:&lt;sup id="fnref:tdd"&gt;&lt;a class="footnote-ref" href="#fn:tdd"&gt;3&lt;/a&gt;&lt;/sup&gt;&lt;/p&gt;
    &lt;ol&gt;
    &lt;li&gt;The form of verification should specifically be tests, and unit tests at that&lt;/li&gt;
    &lt;li&gt;Testing pressure is invariably good. "Making your code easier to unit test" is the same as "making your code better".&lt;/li&gt;
    &lt;/ol&gt;
    &lt;p&gt;This is something all of the various "drivens"— TDD, Type Driven Development, Design by Contract— share in common, this idea that the purpose of the paradigm is to exert pressure. Lots of TDD experts claim that "having a good test suite" is only the secondary benefit of TDD and the real benefit is how it improves code quality.&lt;sup id="fnref:docs"&gt;&lt;a class="footnote-ref" href="#fn:docs"&gt;4&lt;/a&gt;&lt;/sup&gt;&lt;/p&gt;
    &lt;p&gt;Whether they're right or not is not something I want to argue: I've seen these approaches all improve my code structure, but also sometimes worsen it. Regardless, I consider pressure a drawback to VFD in general, for a somewhat idiosyncratic reason. If it &lt;em&gt;weren't&lt;/em&gt; for pressure, VFD would be wholly independent of the code itself. It would &lt;em&gt;just&lt;/em&gt; be about verification, and our decisions would exclusively be about how we want to verify. But the design pressure means that our means of verification affects the system we're checking. What if these conflict in some way?&lt;/p&gt;
    &lt;h3&gt;VFD is a technique, not a paradigm&lt;/h3&gt;
    &lt;p&gt;One of the main differences between "techniques" and "paradigms" is that paradigms don't play well with each other. If you tried to do both "proper" Test-Driven Development and "proper" Cleanroom, your head would explode. Whereas VFD being a "technique" means it works well with other techniques and even with many full paradigms.&lt;/p&gt;
    &lt;p&gt;It also doesn't take a whole lot of practice to start using. It does take practice, both in thinking of verifications and in using the particular verification method involved, to &lt;em&gt;use well&lt;/em&gt;, but we can use it poorly and still benefit.&lt;/p&gt;
    &lt;div class="footnote"&gt;
    &lt;hr/&gt;
    &lt;ol&gt;
    &lt;li id="fn:li"&gt;
    &lt;p&gt;LinkedIn, what did you think I meant? &lt;a class="footnote-backref" href="#fnref:li" title="Jump back to footnote 1 in the text"&gt;↩&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id="fn:sphinx"&gt;
    &lt;p&gt;This bit me in the butt when making my own &lt;a href="https://www.sphinx-doc.org/en/master/" target="_blank"&gt;sphinx&lt;/a&gt; extensions. The official guides do things in a highly dynamic way that Mypy can't statically check. I had to do things in a completely different way. Ended up being better though! &lt;a class="footnote-backref" href="#fnref:sphinx" title="Jump back to footnote 2 in the text"&gt;↩&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id="fn:tdd"&gt;
    &lt;p&gt;Someone's going to yell at me that I completely missed the point of TDD, which is XYZ. Well guess what, someone else &lt;em&gt;already&lt;/em&gt; yelled at me that only dumb idiot babies think XYZ is important in TDD. Put in whatever you want for XYZ. &lt;a class="footnote-backref" href="#fnref:tdd" title="Jump back to footnote 3 in the text"&gt;↩&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id="fn:docs"&gt;
    &lt;p&gt;Another thing that weirdly all of the paradigms claim: that they lead to better documentation. I can see the argument, I just find it strange that &lt;em&gt;every single one&lt;/em&gt; makes this claim! &lt;a class="footnote-backref" href="#fnref:docs" title="Jump back to footnote 4 in the text"&gt;↩&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;/ol&gt;
    &lt;/div&gt;
    </description><pubDate>Tue, 18 Mar 2025 16:22:20 +0000</pubDate><guid>https://buttondown.com/hillelwayne/archive/verification-first-development/</guid></item><item><title>New Blog Post: "A Perplexing Javascript Parsing Puzzle"</title><link>https://buttondown.com/hillelwayne/archive/new-blog-post-a-perplexing-javascript-parsing/</link><description>
    &lt;p&gt;I know I said we'd be back to normal newsletters this week and in fact had 80% of one already written. &lt;/p&gt;
    &lt;p&gt;Then I unearthed something that was better left buried.&lt;/p&gt;
    &lt;p&gt;&lt;a href="http://www.hillelwayne.com/post/javascript-puzzle/" target="_blank"&gt;Blog post here&lt;/a&gt;, &lt;a href="https://www.patreon.com/posts/blog-notes-124153641" target="_blank"&gt;Patreon notes here&lt;/a&gt; (Mostly an explanation of how I found this horror in the first place). Next week I'll send what was supposed to be this week's piece.&lt;/p&gt;
    &lt;p&gt;(PS: &lt;a href="https://www.aprilcools.club/" target="_blank"&gt;April Cools&lt;/a&gt; in three weeks!)&lt;/p&gt;
    </description><pubDate>Wed, 12 Mar 2025 14:49:52 +0000</pubDate><guid>https://buttondown.com/hillelwayne/archive/new-blog-post-a-perplexing-javascript-parsing/</guid></item><item><title>Five Kinds of Nondeterminism</title><link>https://buttondown.com/hillelwayne/archive/five-kinds-of-nondeterminism/</link><description>
    &lt;p&gt;No newsletter next week, I'm teaching a TLA+ workshop.&lt;/p&gt;
    &lt;p&gt;Speaking of which: I spend a lot of time thinking about formal methods (and TLA+ specifically) because it's the source of almost all my revenue. But I don't share most of the details because 90% of my readers don't use FM and never will. I think it's more interesting to talk about ideas &lt;em&gt;from&lt;/em&gt; FM that would be useful to people outside that field. For example, the idea of "property strength" translates to the &lt;a href="https://buttondown.com/hillelwayne/archive/some-tests-are-stronger-than-others/" target="_blank"&gt;idea that some tests are stronger than others&lt;/a&gt;. &lt;/p&gt;
    &lt;p&gt;Another possible export is how FM approaches nondeterminism. A &lt;strong&gt;nondeterministic&lt;/strong&gt; algorithm is one that, from the same starting conditions, has multiple possible outputs. This is nondeterministic:&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;# Pseudocode
    
    def f() {
        return rand()+1;
    }
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;When specifying systems, I may not &lt;em&gt;encounter&lt;/em&gt; nondeterminism more often than in real systems, but I am definitely more aware of its presence. Modeling nondeterminism is a core part of formal specification. I mentally categorize nondeterminism into five buckets. Caveat, this is specifically about nondeterminism from the perspective of &lt;em&gt;system modeling&lt;/em&gt;, not computer science as a whole. If I tried to include stuff on NFAs and amb operations this would be twice as long.&lt;sup id="fnref:nondeterminism"&gt;&lt;a class="footnote-ref" href="#fn:nondeterminism"&gt;1&lt;/a&gt;&lt;/sup&gt;&lt;/p&gt;
    &lt;p style="height:16px; margin:0px !important;"&gt;&lt;/p&gt;
    &lt;h2&gt;1. True Randomness&lt;/h2&gt;
    &lt;p&gt;Programs that literally make calls to a &lt;code&gt;random&lt;/code&gt; function and then use the results. This is the simplest type of nondeterminism and one of the most ubiquitous. &lt;/p&gt;
    &lt;p&gt;Most of the time, &lt;code&gt;random&lt;/code&gt; isn't &lt;em&gt;truly&lt;/em&gt; nondeterministic. Most of the time computer randomness is actually &lt;strong&gt;pseudorandom&lt;/strong&gt;, meaning we seed a deterministic algorithm that behaves "randomly-enough" for some use. You could "lift" a nondeterministic random function into a deterministic one by adding a fixed seed to the starting state.&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="c1"&gt;# Python&lt;/span&gt;
    
    &lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="nn"&gt;random&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;random&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;seed&lt;/span&gt;
    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;f&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="n"&gt;seed&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;random&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    
    &lt;span class="o"&gt;&gt;&gt;&gt;&lt;/span&gt; &lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="mf"&gt;0.23796462709189137&lt;/span&gt;
    &lt;span class="o"&gt;&gt;&gt;&gt;&lt;/span&gt; &lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="mf"&gt;0.23796462709189137&lt;/span&gt;
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;Often we don't do this because the &lt;em&gt;point&lt;/em&gt; of randomness is to provide nondeterminism! We deliberately &lt;em&gt;abstract out&lt;/em&gt; the starting state of the seed from our program, because it's easier to think about it as locally nondeterministic.&lt;/p&gt;
    &lt;p&gt;(There's also "true" randomness, like using &lt;a href="https://www.intel.com/content/www/us/en/developer/articles/guide/intel-digital-random-number-generator-drng-software-implementation-guide.html#inpage-nav-3-2" target="_blank"&gt;thermal noise&lt;/a&gt; as an entropy source, which I think are mainly used for cryptography and seeding PRNGs.)&lt;/p&gt;
    &lt;p&gt;Most formal specification languages don't deal with randomness (though some deal with &lt;a href="https://buttondown.com/hillelwayne/archive/i-formally-modeled-dreidel-for-no-good-reason/" target="_blank"&gt;probability more broadly&lt;/a&gt;). Instead, we treat it as a nondeterministic choice:&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;# software
    if rand &gt; 0.001 then return a else crash
    
    # specification
    either return a or crash
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;This is because we're looking at worst-case scenarios, so it doesn't matter if &lt;code&gt;crash&lt;/code&gt; happens 50% of the time or 0.0001% of the time, it's still possible.  &lt;/p&gt;
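    &lt;p&gt;To make the contrast concrete, here is a rough Python sketch. The function names and the "never crashes" property are assumptions for illustration, not part of any specification language: the "software" view samples the branch, while the "specification" view simply lists both branches as possible.&lt;/p&gt;

```python
import random


def run_with_sampling(rng):
    # "Software" view: the crash branch is taken with probability about 0.001.
    return "a" if rng.random() > 0.001 else "crash"


def outcomes_of_spec():
    # "Specification" view: both branches are simply possible outcomes.
    return {"a", "crash"}


def never_crashes(outcomes):
    # Safety property we would like to hold in every possible behavior.
    return "crash" not in outcomes


rng = random.Random(0)  # fixed seed so the sampling run is repeatable
sampled = {run_with_sampling(rng) for _ in range(100)}

print("sampled outcomes:", sampled)
print("spec outcomes:", outcomes_of_spec())
print("property holds for spec:", never_crashes(outcomes_of_spec()))
```

    &lt;p&gt;A hundred samples will usually miss the crash branch. The specification view reports it no matter how unlikely it is.&lt;/p&gt;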
    &lt;h2&gt;2. Concurrency&lt;/h2&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;# Pseudocode
    global x = 1, y = 0;
    
    def thread1() {
       x++;
       x++;
       x++;
    }
    
    def thread2() {
        y := x;
    }
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;If &lt;code&gt;thread1()&lt;/code&gt; and &lt;code&gt;thread2()&lt;/code&gt; run sequentially, then (assuming the sequence is fixed) the final value of &lt;code&gt;y&lt;/code&gt; is deterministic. If the two functions are started and run simultaneously, then depending on when &lt;code&gt;thread2&lt;/code&gt; executes &lt;code&gt;y&lt;/code&gt; can be 1, 2, 3, &lt;em&gt;or&lt;/em&gt; 4. Both functions are locally sequential, but running them concurrently leads to global nondeterminism.&lt;/p&gt;
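    &lt;p&gt;One way to check that claim is to enumerate every interleaving by brute force. This Python sketch assumes each statement runs atomically; the helper names (&lt;code&gt;interleavings&lt;/code&gt;, &lt;code&gt;increment&lt;/code&gt;, &lt;code&gt;read&lt;/code&gt;) are mine, not from any library.&lt;/p&gt;

```python
def interleavings(a, b):
    """Yield every merge of sequences a and b that preserves each one's order."""
    if not a or not b:
        yield list(a) + list(b)
        return
    for rest in interleavings(a[1:], b):
        yield [a[0]] + rest
    for rest in interleavings(a, b[1:]):
        yield [b[0]] + rest


def increment(state):
    state["x"] += 1          # thread1: x++


def read(state):
    state["y"] = state["x"]  # thread2: y := x


thread1 = [increment, increment, increment]
thread2 = [read]

final_ys = set()
for schedule in interleavings(thread1, thread2):
    state = {"x": 1, "y": 0}
    for step in schedule:
        step(state)
    final_ys.add(state["y"])

print(sorted(final_ys))  # [1, 2, 3, 4]
```

    &lt;p&gt;With three steps against one there are only four schedules. Add a few more steps to &lt;code&gt;thread2&lt;/code&gt; and the count grows quickly, which is the state space explosion in miniature.&lt;/p&gt;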
    &lt;p&gt;Concurrency is arguably the most &lt;em&gt;dramatic&lt;/em&gt; source of nondeterminism. &lt;a href="https://buttondown.com/hillelwayne/archive/what-makes-concurrency-so-hard/" target="_blank"&gt;Small amounts of concurrency lead to huge explosions in the state space&lt;/a&gt;. We have words for the specific kinds of nondeterminism caused by concurrency, like "race condition" and "dirty write". Often we think about it as a separate &lt;em&gt;topic&lt;/em&gt; from nondeterminism. To some extent it "overshadows" the other kinds: I have a much easier time teaching students about concurrency in models than nondeterminism in models.&lt;/p&gt;
    &lt;p&gt;Many formal specification languages have special syntax/machinery for the concurrent aspects of a system, and generic syntax for other kinds of nondeterminism. In P that's &lt;a href="https://p-org.github.io/P/manual/expressions/#choose" target="_blank"&gt;choose&lt;/a&gt;. Others don't special-case concurrency, instead representing it as nondeterministic choices by a global coordinator. This is more flexible but also more inconvenient, as you have to implement process-local sequencing code yourself. &lt;/p&gt;
    &lt;h2&gt;3. User Input&lt;/h2&gt;
    &lt;div class="subscribe-form"&gt;&lt;/div&gt;
    &lt;p&gt;One of the most famous and influential programming books is &lt;em&gt;The C Programming Language&lt;/em&gt; by Kernighan and Ritchie. The first example of a nondeterministic program appears on page 14:&lt;/p&gt;
    &lt;p&gt;&lt;img alt="Picture of the book page. Code reproduced below." class="newsletter-image" src="https://assets.buttondown.email/images/94e6ad15-8d09-48df-b885-191318bfd179.jpg?w=960&amp;fit=max"/&gt;&lt;/p&gt;
    &lt;p&gt;For the newsletter readers who get text only emails,&lt;sup id="fnref:text-only"&gt;&lt;a class="footnote-ref" href="#fn:text-only"&gt;2&lt;/a&gt;&lt;/sup&gt; here's the program:&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="cp"&gt;#include&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="cpf"&gt;&lt;stdio.h&gt;&lt;/span&gt;
    &lt;span class="cm"&gt;/* copy input to output; 1st version */&lt;/span&gt;
    &lt;span class="n"&gt;main&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="w"&gt;    &lt;/span&gt;&lt;span class="kt"&gt;int&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;c&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="w"&gt;    &lt;/span&gt;&lt;span class="n"&gt;c&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;getchar&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
    &lt;span class="w"&gt;    &lt;/span&gt;&lt;span class="k"&gt;while&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;c&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;!=&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;EOF&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="w"&gt;        &lt;/span&gt;&lt;span class="n"&gt;putchar&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;c&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="w"&gt;        &lt;/span&gt;&lt;span class="n"&gt;c&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;getchar&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
    &lt;span class="w"&gt;    &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;Yup, that's nondeterministic. Because the user can enter any string, any call of &lt;code&gt;main()&lt;/code&gt; could have any output, meaning the number of possible outcomes is infinity.&lt;/p&gt;
    &lt;p&gt;Okay that seems a little cheap, and I think it's because we tend to think of determinism in terms of how the user &lt;em&gt;experiences&lt;/em&gt; the program. Yes, &lt;code&gt;main()&lt;/code&gt; has an infinite number of user inputs, but for each input the user will experience only one possible output. It starts to feel more nondeterministic when modeling a long-standing system that's &lt;em&gt;reacting&lt;/em&gt; to user input, for example a server that runs a script whenever the user uploads a file. This can be modeled with nondeterminism and concurrency: We have one execution that's the system, and one nondeterministic execution that represents the effects of our user.&lt;/p&gt;
    &lt;p&gt;(One intrusive thought I sometimes have: any "yes/no" dialogue actually has &lt;em&gt;three&lt;/em&gt; outcomes: yes, no, or the user getting up and walking away without picking a choice, permanently stalling the execution.)&lt;/p&gt;
    &lt;h2&gt;4. External forces&lt;/h2&gt;
    &lt;p&gt;The more general version of "user input": anything where either 1) some part of the execution outcome depends on retrieving external information, or 2) the external world can change some state outside of your system. I call the distinction between internal and external components of the system &lt;a href="https://www.hillelwayne.com/post/world-vs-machine/" target="_blank"&gt;the world and the machine&lt;/a&gt;. Simple examples: code that at some point reads an external temperature sensor. Unrelated code running on a system which quits programs if it gets too hot. API requests to a third party vendor. Code processing files but users can delete files before the script gets to them.&lt;/p&gt;
    &lt;p&gt;Like with PRNGs, some of these cases don't &lt;em&gt;have&lt;/em&gt; to be nondeterministic; we can argue that "the temperature" should be a virtual input into the function. Like with PRNGs, we treat it as nondeterministic because it's useful to think in that way. Also, what if the temperature changes between starting a function and reading it?&lt;/p&gt;
    &lt;p&gt;External forces are also a source of nondeterminism as &lt;em&gt;uncertainty&lt;/em&gt;. Measurements in the real world often come with errors, so repeating a measurement twice can give two different answers. Sometimes operations fail for no discernable reason, or for a non-programmatic reason (like something physically blocks the sensor).&lt;/p&gt;
    &lt;p&gt;All of these situations can be modeled in the same way as user input: a concurrent execution making nondeterministic choices.&lt;/p&gt;
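    &lt;p&gt;As a rough sketch of what that looks like, here is the "users can delete files" example as a tiny state-space search in Python. The file names, the state layout, and the helper functions are all assumptions for illustration. At every step, either the machine processes its next file or the world deletes one, and we collect every state reachable that way.&lt;/p&gt;

```python
from collections import deque

FILES = ("a.txt", "b.txt")  # hypothetical file names


def next_states(state):
    """All states reachable in one step by either the machine or the world."""
    todo, present, failed = state
    if failed:
        return
    # Machine step: the script processes the next file on its list.
    if todo:
        current, rest = todo[0], todo[1:]
        yield (rest, present, current not in present)
    # World step: a user may delete any file that still exists.
    for name in present:
        yield (todo, present - {name}, failed)


def reachable(initial):
    """Breadth-first search over every interleaving of machine and world."""
    seen = {initial}
    queue = deque([initial])
    while queue:
        state = queue.popleft()
        for successor in next_states(state):
            if successor not in seen:
                seen.add(successor)
                queue.append(successor)
    return seen


initial = (FILES, frozenset(FILES), False)
states = reachable(initial)
failing = [s for s in states if s[2]]
print(len(states), "reachable states,", len(failing), "where the script hits a missing file")
```

    &lt;p&gt;The search finds the bad schedules (delete first, process second) without anyone having to think of them in advance.&lt;/p&gt;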
    &lt;h2&gt;5. Abstraction&lt;/h2&gt;
    &lt;p&gt;This is where nondeterminism in system models and in "real software" differ the most. I said earlier that pseudorandomness is &lt;em&gt;arguably&lt;/em&gt; deterministic, but we abstract it into nondeterminism. More generally, &lt;strong&gt;nondeterminism hides implementation details of deterministic processes&lt;/strong&gt;.&lt;/p&gt;
    &lt;p&gt;In one consulting project, we had a machine that received a message, parsed a lot of data from the message, went into a complicated workflow, and then entered one of three states. The final state was totally deterministic on the content of the message, but the actual process of determining that final state took tons and tons of code. None of that mattered at the scope we were modeling, so we abstracted it all away: "on receiving message, nondeterministically enter state A, B, or C."&lt;/p&gt;
    &lt;p&gt;Doing this makes the system easier to model. It also makes the model more sensitive to possible errors. What if the workflow is bugged and sends us to the wrong state? That's already covered by the nondeterministic choice! Nondeterministic abstraction gives us the potential to pick the worst-case scenario for our system, so we can prove it's robust even under those conditions.&lt;/p&gt;
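    &lt;p&gt;Here's a rough sketch of that idea in Python rather than a real specification language. The state names and the &lt;code&gt;handle&lt;/code&gt; function are hypothetical stand-ins, not the actual system: the point is only that replacing the workflow with "any of A, B, or C" lets us check a property against every possible choice, including the ones a bugged workflow would make.&lt;/p&gt;

```python
from itertools import product

FINAL_STATES = ("A", "B", "C")  # hypothetical stand-ins for the three states


def handle(state):
    """Hypothetical downstream behavior we want to check for every final state."""
    return {"A": "archive", "B": "retry", "C": "alert"}[state]


def all_behaviors(num_messages):
    """Enumerate every outcome of num_messages nondeterministic choices."""
    return list(product(FINAL_STATES, repeat=num_messages))


def property_holds(num_messages):
    """Check that every message gets a known action under every possible choice."""
    known_actions = {"archive", "retry", "alert"}
    return all(
        handle(state) in known_actions
        for behavior in all_behaviors(num_messages)
        for state in behavior
    )
```

    &lt;p&gt;A model checker does this enumeration for you. The sketch just makes the "pick the worst case" part explicit.&lt;/p&gt;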
    &lt;p&gt;I know I beat the "nondeterminism as abstraction" drum a whole lot but that's because it's the insight from formal methods I personally value the most, that nondeterminism is a powerful tool to &lt;em&gt;simplify reasoning about things&lt;/em&gt;. You can see the same approach in how I approach modeling users and external forces: complex realities black-boxed and simplified into nondeterministic forces on the system.&lt;/p&gt;
    &lt;hr/&gt;
    &lt;p&gt;Anyway, I hope this collection of ideas I got from formal methods is useful to my broader readership. Lemme know if it somehow helps you out!&lt;/p&gt;
    &lt;div class="footnote"&gt;
    &lt;hr/&gt;
    &lt;ol&gt;
    &lt;li id="fn:nondeterminism"&gt;
    &lt;p&gt;I realized after writing this that I already wrote an essay about nondeterminism in formal specification &lt;a href="https://buttondown.com/hillelwayne/archive/nondeterminism-in-formal-specification/" target="_blank"&gt;just under a year ago&lt;/a&gt;. I hope this one covers enough new ground to be interesting! &lt;a class="footnote-backref" href="#fnref:nondeterminism" title="Jump back to footnote 1 in the text"&gt;↩&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id="fn:text-only"&gt;
    &lt;p&gt;There is a surprising number of you. &lt;a class="footnote-backref" href="#fnref:text-only" title="Jump back to footnote 2 in the text"&gt;↩&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;/ol&gt;
    &lt;/div&gt;
    </description><pubDate>Wed, 19 Feb 2025 19:37:57 +0000</pubDate><guid>https://buttondown.com/hillelwayne/archive/five-kinds-of-nondeterminism/</guid></item><item><title>Are Efficiency and Horizontal Scalability at odds?</title><link>https://buttondown.com/hillelwayne/archive/are-efficiency-and-horizontal-scalability-at-odds/</link><description>
    &lt;p&gt;Sorry for missing the newsletter last week! I started writing on Monday as normal, and by Wednesday the piece (about the &lt;a href="https://en.wikipedia.org/wiki/Hierarchy_of_hazard_controls" target="_blank"&gt;hierarchy of controls&lt;/a&gt; ) was 2000 words and not &lt;em&gt;close&lt;/em&gt; to done. So now it'll be a blog post sometime later this month.&lt;/p&gt;
    &lt;p&gt;I also just released a new version of &lt;a href="https://leanpub.com/logic/" target="_blank"&gt;Logic for Programmers&lt;/a&gt;! 0.7 adds a bunch of new content (type invariants, modeling access policies, rewrites of the first chapters) but more importantly has new fonts that are more legible than the old ones. &lt;a href="https://leanpub.com/logic/" target="_blank"&gt;Go check it out!&lt;/a&gt;&lt;/p&gt;
    &lt;p&gt;For this week's newsletter I want to brainstorm an idea I've been noodling over for a while. Say we have a computational task, like running a simulation or searching a very large graph, and it's taking too long to complete on a computer. There's generally three things that we can do to make it faster:&lt;/p&gt;
    &lt;ol&gt;
    &lt;li&gt;Buy a faster computer ("vertical scaling")&lt;/li&gt;
    &lt;li&gt;Modify the software to use the computer's resources better ("efficiency")&lt;/li&gt;
    &lt;li&gt;Modify the software to use multiple computers ("horizontal scaling")&lt;/li&gt;
    &lt;/ol&gt;
    &lt;p&gt;(Splitting single-threaded software across multiple threads/processes is sort of a blend of (2) and (3).)&lt;/p&gt;
    &lt;p&gt;The big benefit of (1) is that we (usually) don't have to make any changes to the software to get a speedup. The downside is that for the past couple of decades computers haven't &lt;em&gt;gotten&lt;/em&gt; much faster, except in ways that require recoding (like GPUs and multicore). This means we rely on (2) and (3), and we can do both to a point. I've noticed, though, that horizontal scaling seems to conflict with efficiency. Software optimized to scale well tends to be worse for the &lt;code&gt;N=1&lt;/code&gt; case than software optimized to, um, be optimized. &lt;/p&gt;
    &lt;p&gt;Are there reasons to &lt;em&gt;expect&lt;/em&gt; this? It seems reasonable that design goals of software are generally in conflict, purely because exclusively optimizing for one property means making decisions that impede other properties. But is there something in the nature of "efficiency" and "horizontal scalability" that makes them especially disjoint?&lt;/p&gt;
    &lt;p&gt;This isn't me trying to explain a fully coherent idea, more me trying to figure this all out for myself. Also I'm probably getting some hardware stuff wrong.&lt;/p&gt;
    &lt;h3&gt;Amdahl's Law&lt;/h3&gt;
    &lt;p&gt;According to &lt;a href="https://en.wikipedia.org/wiki/Amdahl%27s_law" target="_blank"&gt;Amdahl's Law&lt;/a&gt;, the maximum speedup by parallelization is constrained by the proportion of the work that can be parallelized. If 80% of algorithm X is parallelizable, the maximum speedup from horizontal scaling is 5x. If algorithm Y is 25% parallelizable, the maximum speedup is only 1.3x. &lt;/p&gt;
    &lt;p&gt;If you need horizontal scalability, you want to use algorithm X, &lt;em&gt;even if Y is naturally 3x faster&lt;/em&gt;. But if Y was 4x faster, you'd prefer it to X. Maximal scalability means finding the optimal balance between baseline speed and parallelizability. Maximal efficiency means just optimizing baseline speed. &lt;/p&gt;
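    &lt;p&gt;To make the arithmetic concrete, here's a small sketch of Amdahl's Law. The function names are made up for illustration, and "k times faster" is read as "Y's single-machine runtime is 1/k of X's":&lt;/p&gt;

```python
def max_speedup(parallel_fraction):
    """Amdahl's Law limit as the number of machines goes to infinity."""
    return 1 / (1 - parallel_fraction)


def speedup(parallel_fraction, machines):
    """Amdahl's Law speedup for a finite number of machines."""
    serial_fraction = 1 - parallel_fraction
    return 1 / (serial_fraction + parallel_fraction / machines)


x_limit = max_speedup(0.80)  # algorithm X: 80% parallelizable, limit is about 5x
y_limit = max_speedup(0.25)  # algorithm Y: 25% parallelizable, limit is about 1.33x

# Best case for Y, measured against single-machine X, if Y starts out k times faster.
y_if_3x_faster = 3 * y_limit  # about 4x, still behind X's 5x
y_if_4x_faster = 4 * y_limit  # about 5.33x, now ahead of X
```

    &lt;p&gt;Under these assumptions the crossover is at 5 / 1.33, so Y has to start out roughly 3.75x faster before it beats a fully scaled-out X.&lt;/p&gt;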
    &lt;h3&gt;Coordination Overhead&lt;/h3&gt;
    &lt;p&gt;Distributed algorithms require more coordination. To add a list of numbers in parallel via &lt;a href="https://en.wikipedia.org/wiki/Fork%E2%80%93join_model" target="_blank"&gt;fork-join&lt;/a&gt;, we'd do something like this:&lt;/p&gt;
    &lt;ol&gt;
    &lt;li&gt;Split the list into N sublists&lt;/li&gt;
    &lt;li&gt;Fork a new thread/process for each sublist&lt;/li&gt;
    &lt;li&gt;Wait for each thread/process to finish&lt;/li&gt;
    &lt;li&gt;Add the sums together.&lt;/li&gt;
    &lt;/ol&gt;
    &lt;p&gt;(1), (2), and (3) all add overhead to the algorithm. At the very least, it's extra lines of code to execute, but it can also mean inter-process communication or network hops. Distribution also means you have fewer natural correctness guarantees, so you need more administrative overhead to avoid race conditions. &lt;/p&gt;
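    &lt;p&gt;The four steps above can be sketched with Python's standard &lt;code&gt;concurrent.futures&lt;/code&gt; module. The helper names &lt;code&gt;split&lt;/code&gt; and &lt;code&gt;parallel_sum&lt;/code&gt; are hypothetical, and threads are used only to show the shape of the coordination: on a standard CPython build with the GIL this won't actually add numbers any faster.&lt;/p&gt;

```python
from concurrent.futures import ThreadPoolExecutor


def split(numbers, n):
    """Step 1: split the list into at most n roughly equal sublists."""
    size = -(-len(numbers) // n)  # ceiling division
    return [numbers[i:i + size] for i in range(0, len(numbers), size)]


def parallel_sum(numbers, n=4):
    if not numbers:
        return 0
    chunks = split(numbers, n)
    with ThreadPoolExecutor(max_workers=n) as pool:
        # Step 2: fork one task per sublist.
        futures = [pool.submit(sum, chunk) for chunk in chunks]
        # Step 3: wait for each task to finish.
        partial_sums = [future.result() for future in futures]
    # Step 4: add the partial sums together.
    return sum(partial_sums)
```

    &lt;p&gt;Everything except the calls to &lt;code&gt;sum&lt;/code&gt; is coordination that the plain single-threaded &lt;code&gt;sum(numbers)&lt;/code&gt; never has to pay for.&lt;/p&gt;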
    &lt;p&gt;&lt;strong&gt;Real world example:&lt;/strong&gt; Historically CPython has had a "global interpreter lock" (GIL). In multithreaded code, only one thread could execute Python code at a time (others could execute C code). The &lt;a href="https://docs.python.org/3/howto/free-threading-python.html#single-threaded-performance" target="_blank"&gt;newest version&lt;/a&gt; supports disabling the GIL, which comes at a 40% overhead for single-threaded programs. Supposedly the difference is because the &lt;a href="https://docs.python.org/3/whatsnew/3.11.html#whatsnew311-pep659" target="_blank"&gt;specializing adaptive interpreter&lt;/a&gt; optimization isn't thread-safe yet. The Python team is hoping to get it down to "only" 10%. &lt;/p&gt;
    &lt;p style="height:16px; margin:0px !important;"&gt;&lt;/p&gt;
    &lt;h3&gt;Scaling loses shared resources&lt;/h3&gt;
    &lt;p&gt;I'd say that intra-machine scaling (multiple threads/processes) feels qualitatively &lt;em&gt;different&lt;/em&gt; than inter-machine scaling. Part of that is that intra-machine scaling is "capped" while inter-machine is not. But there's also a difference in what assumptions you can make about shared resources. Starting from the baseline of a single-threaded program:&lt;/p&gt;
    &lt;ol&gt;
    &lt;li&gt;Threads have a much harder time sharing CPU caches (you have to manually mess with affinities)&lt;/li&gt;
    &lt;li&gt;Processes have a much harder time sharing RAM (I think you have to use &lt;a href="https://en.wikipedia.org/wiki/Memory-mapped_file" target="_blank"&gt;mmap&lt;/a&gt;?)&lt;/li&gt;
    &lt;li&gt;Machines can't share cache, RAM, or disk, period.&lt;/li&gt;
    &lt;/ol&gt;
    &lt;p&gt;It's a lot easier to solve a problem when the whole thing fits in RAM. But if you split a 50 GB problem across three machines, it doesn't fit in RAM by default, even if the machines have 64 GB each. Scaling also means that separate machines can't reuse resources like database connections.&lt;/p&gt;
    &lt;h3&gt;Efficiency comes from limits&lt;/h3&gt;
    &lt;p&gt;I think the two previous points tie together in the idea that maximal efficiency comes from being able to make assumptions about the system. If we know the &lt;em&gt;exact&lt;/em&gt; sequence of computations, we can aim to minimize cache misses. If we don't have to worry about thread-safety, &lt;a href="https://www.playingwithpointers.com/blog/refcounting-harder-than-it-sounds.html" target="_blank"&gt;tracking references is dramatically simpler&lt;/a&gt;. If we have all of the data in a single database, our query planner has more room to work with. At various tiers of scaling these assumptions are no longer guaranteed and we lose the corresponding optimizations.&lt;/p&gt;
    &lt;p&gt;Sometimes these assumptions are implicit and crop up in odd places. Like if you're working at a scale where you need multiple synced databases, you might want to use UUIDs instead of numbers for keys. But then you lose the assumption "recently inserted rows are close together in the index", which I've read &lt;a href="https://www.cybertec-postgresql.com/en/unexpected-downsides-of-uuid-keys-in-postgresql/" target="_blank"&gt;can lead to significant slowdowns&lt;/a&gt;. &lt;/p&gt;
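    &lt;p&gt;A toy illustration of that lost assumption, using a sorted Python list as a stand-in for an index. This is not how a real B-tree works and it says nothing about actual PostgreSQL performance; the random 128-bit integers are only a hypothetical substitute for random UUIDs. It just shows where new keys land:&lt;/p&gt;

```python
import bisect
import random


def insert_positions(keys):
    """Insert keys into a sorted list; return each insertion point as a fraction of the index size."""
    index = []
    positions = []
    for key in keys:
        pos = bisect.bisect_left(index, key)
        positions.append(pos / len(index) if index else 1.0)
        index.insert(pos, key)
    return positions


rng = random.Random(0)  # fixed seed so the sketch is repeatable
sequential_keys = list(range(1000))
random_keys = [rng.getrandbits(128) for _ in range(1000)]  # stand-in for random UUIDs

sequential_positions = insert_positions(sequential_keys)  # always 1.0: the tail of the index
random_positions = insert_positions(random_keys)          # scattered across the whole index
```

    &lt;p&gt;Sequential keys always land at the end, so recent rows stay together. Random keys land anywhere, so every insert can touch a different part of the index.&lt;/p&gt;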
    &lt;p&gt;This suggests that if you can find a limit somewhere else, you can get both high horizontal scaling and high efficiency. &lt;del&gt;Supposedly the &lt;a href="https://tigerbeetle.com/" target="_blank"&gt;TigerBeetle database&lt;/a&gt; has both, but that could be because they limit all records to &lt;a href="https://docs.tigerbeetle.com/coding/" target="_blank"&gt;accounts and transfers&lt;/a&gt;. This means every record fits in &lt;a href="https://tigerbeetle.com/blog/2024-07-23-rediscovering-transaction-processing-from-history-and-first-principles/#transaction-processing-from-first-principles" target="_blank"&gt;exactly 128 bytes&lt;/a&gt;.&lt;/del&gt; [A TigerBeetle engineer reached out to tell me that they do &lt;em&gt;not&lt;/em&gt; horizontally scale compute, they distribute across multiple nodes for redundancy. &lt;a href="https://lobste.rs/s/5akiq3/are_efficiency_horizontal_scalability#c_ve8ud5" target="_blank"&gt;"You can't make it faster by adding more machines."&lt;/a&gt;]&lt;/p&gt;
    &lt;p&gt;Does this mean that "assumptions" could be both "assumptions about the computing environment" and "assumptions about the problem"? In the famous essay &lt;a href="http://www.frankmcsherry.org/graph/scalability/cost/2015/01/15/COST.html" target="_blank"&gt;Scalability! But at what COST&lt;/a&gt;, Frank McSherry shows that his single-threaded laptop could outperform 128-node "big data systems" on PageRank and graph connectivity (via label propagation). Afterwards, he discusses how a different algorithm solves graph connectivity even faster: &lt;/p&gt;
    &lt;blockquote&gt;
    &lt;p&gt;[Union find] is more line of code than label propagation, but it is 10x faster and 100x less embarassing. … The union-find algorithm is fundamentally incompatible with the graph computation approaches Giraph, GraphLab, and GraphX put forward (the so-called “think like a vertex” model).&lt;/p&gt;
    &lt;/blockquote&gt;
    &lt;p&gt;The interesting thing to me is that his alternate makes more "assumptions" than what he's comparing to. He can "assume" a fixed goal and optimize the code for that goal. The "big data systems" are trying to be general purpose compute platforms and have to pick a model that supports the widest range of possible problems. &lt;/p&gt;
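    &lt;p&gt;For reference, here's a minimal union-find sketch for counting connected components. It is not McSherry's code, just a textbook version with path halving and union by size. Note that it is one sequential pass over the edges, mutating two shared arrays, which gives a feel for why it doesn't map onto a model where each vertex only talks to its neighbors.&lt;/p&gt;

```python
def connected_components(num_nodes, edges):
    """Count connected components with union-find (path halving, union by size)."""
    parent = list(range(num_nodes))
    size = [1] * num_nodes

    def find(node):
        while parent[node] != node:
            parent[node] = parent[parent[node]]  # path halving
            node = parent[node]
        return node

    components = num_nodes
    for a, b in edges:
        root_a, root_b = find(a), find(b)
        if root_a == root_b:
            continue  # already in the same component
        if size[root_b] > size[root_a]:
            root_a, root_b = root_b, root_a  # attach the smaller tree to the larger
        parent[root_b] = root_a
        size[root_a] += size[root_b]
        components -= 1
    return components
```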
    &lt;p&gt;A few years back I wrote &lt;a href="https://www.hillelwayne.com/post/cleverness/" target="_blank"&gt;clever vs insightful code&lt;/a&gt;. I think what I'm trying to say here is that efficiency comes from having insight into your problem and environment.&lt;/p&gt;
    &lt;p&gt;(Last thought to shove in here: to exploit assumptions, you need &lt;em&gt;control&lt;/em&gt;. Carefully arranging your data to fit in L1 doesn't matter if your programming language doesn't let you control where things are stored!)&lt;/p&gt;
    &lt;h3&gt;Is there a cultural aspect?&lt;/h3&gt;
    &lt;p&gt;Maybe there's also a cultural element to this conflict. What if the engineers interested in "efficiency" are different from the engineers interested in "horizontal scaling"?&lt;/p&gt;
    &lt;p&gt;At my first job the data scientists set up a &lt;a href="https://en.wikipedia.org/wiki/Apache_Hadoop" target="_blank"&gt;Hadoop&lt;/a&gt; cluster for their relatively small dataset, only a few dozen gigabytes or so. One of the senior software engineers saw this and said "big data is stupid." To prove it, he took one of their example queries, wrote a script in Go to compute the same thing, and optimized it to run faster on his machine.&lt;/p&gt;
    &lt;p&gt;At the time I was like "yeah, you're right, big data IS stupid!" But I think now that we both missed something obvious: with the "scalable" solution, the data scientists &lt;em&gt;didn't&lt;/em&gt; have to write an optimized script for every single query. Optimizing code is hard, adding more machines is easy! &lt;/p&gt;
    &lt;p&gt;The highest tier of horizontal scaling is usually something large businesses want, and large businesses like problems that can be solved purely with money. Maximizing efficiency requires a lot of knowledge-intensive human labour, so is less appealing as an investment. Then again, I've seen a lot of work on making the scalable systems more efficient, such as evenly balancing heterogeneous workloads. Maybe in the largest systems intra-machine efficiency is just too small-scale a problem. &lt;/p&gt;
    &lt;h3&gt;I'm not sure where this fits in but scaling a volume of tasks conflicts less than scaling individual tasks&lt;/h3&gt;
    &lt;p&gt;If you have 1,000 machines and need to crunch one big graph, you probably want the most scalable algorithm. If you instead have 50,000 small graphs, you probably want the most efficient algorithm, which you then run on all 1,000 machines. When we call a problem &lt;a href="https://en.wikipedia.org/wiki/Embarrassingly_parallel" target="_blank"&gt;embarrassingly parallel&lt;/a&gt;, we usually mean it's easy to horizontally scale. But it's also one that's easy to make more efficient, because local optimizations don't affect the scaling! &lt;/p&gt;
    &lt;hr/&gt;
    &lt;p&gt;Okay that's enough brainstorming for one week.&lt;/p&gt;
    &lt;h3&gt;Blog Rec&lt;/h3&gt;
    &lt;p&gt;Whenever I think about optimization as a skill, the first article that comes to mind is &lt;a href="https://matklad.github.io/" target="_blank"&gt;matklad's&lt;/a&gt; &lt;a href="https://matklad.github.io/2023/11/15/push-ifs-up-and-fors-down.html" target="_blank"&gt;Push Ifs Up And Fors Down&lt;/a&gt;. I'd never have considered on my own that inlining loops into functions could be such a huge performance win. The blog has a lot of other posts on the nuts-and-bolts of systems languages, optimization, and concurrency.&lt;/p&gt;
    </description><pubDate>Wed, 12 Feb 2025 18:26:20 +0000</pubDate><guid>https://buttondown.com/hillelwayne/archive/are-efficiency-and-horizontal-scalability-at-odds/</guid></item><item><title>What hard thing does your tech make easy?</title><link>https://buttondown.com/hillelwayne/archive/what-hard-thing-does-your-tech-make-easy/</link><description>
    &lt;p&gt;I occasionally receive emails asking me to look at the writer's new language/library/tool. Sometimes it's in an area I know well, like formal methods. Other times, I'm a complete stranger to the field. Regardless, I'm generally happy to check it out.&lt;/p&gt;
    &lt;p&gt;When starting out, this is the biggest question I'm looking to answer:&lt;/p&gt;
    &lt;blockquote&gt;
    &lt;p&gt;What does this technology make easy that's normally hard?&lt;/p&gt;
    &lt;/blockquote&gt;
    &lt;p&gt;What justifies me learning and migrating to a &lt;em&gt;new&lt;/em&gt; thing as opposed to fighting through my problems with the tools I already know? The new thing has to have some sort of value proposition, which could be something like "better performance" or "more secure". The most universal value and the most direct to show is "takes less time and mental effort to do something". I can't accurately judge two benchmarks, but I can see two demos or code samples and compare which one feels easier to me.&lt;/p&gt;
    &lt;h2&gt;Examples&lt;/h2&gt;
    &lt;h3&gt;Functional programming&lt;/h3&gt;
    &lt;p&gt;What drew me originally to functional programming was higher order functions. &lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;# Without HOFs
    
    out = []
    for x in input {
      if test(x) {
        out.append(x)
      }
    }
    
    # With HOFs
    
    filter(test, input)
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p style="height:16px; margin:0px !important;"&gt;&lt;/p&gt;
    &lt;p&gt;We can also compare the easiness of various tasks between examples within the same paradigm. If I know FP via Clojure, what could be appealing about Haskell or F#? For one, null safety is a lot easier when I've got option types.&lt;/p&gt;
    &lt;h3&gt;Array Programming&lt;/h3&gt;
    &lt;p&gt;Array programming languages like APL or J make certain classes of computation easier. For example, finding all of the indices where two arrays &lt;del&gt;differ&lt;/del&gt; match. Here it is in Python:&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="n"&gt;x&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
    &lt;span class="n"&gt;y&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
    
    &lt;span class="o"&gt;&gt;&gt;&gt;&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;a&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;b&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;enumerate&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;zip&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;y&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;a&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="n"&gt;b&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
    &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;7&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;9&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;And here it is in J:&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="nv"&gt;x&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;=:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;4&lt;/span&gt;
    &lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="nv"&gt;y&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;=:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;4&lt;/span&gt;
    
    &lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="nv"&gt;I&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;x&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;y&lt;/span&gt;
    &lt;span class="mi"&gt;7&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;9&lt;/span&gt;
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;Not every tool is meant for every programmer, because you might not have any of the problems a tool makes easier. What comes up more often for you: filtering a list or finding all the indices where two lists match? Statistically speaking, functional programming is more useful to you than array programming.&lt;/p&gt;
    &lt;p&gt;But &lt;em&gt;I&lt;/em&gt; have this problem enough to justify learning array programming.&lt;/p&gt;
    &lt;h3&gt;LLMs&lt;/h3&gt;
    &lt;p&gt;I think a lot of the appeal of LLMs is they make a lot of specialist tasks easy for nonspecialists. One thing I recently did was convert some rst &lt;a href="https://docutils.sourceforge.io/docs/ref/rst/directives.html#list-table" target="_blank"&gt;list tables&lt;/a&gt; to &lt;a href="https://docutils.sourceforge.io/docs/ref/rst/directives.html#csv-table-1" target="_blank"&gt;csv tables&lt;/a&gt;. Normally I'd have to write some tricky parsing and serialization code to automatically convert between the two. With LLMs, it's just&lt;/p&gt;
    &lt;blockquote&gt;
    &lt;p&gt;Convert the following rst list-table into a csv-table: [table]&lt;/p&gt;
    &lt;/blockquote&gt;
    &lt;p&gt;"Easy" can trump "correct" as a value. The LLM might get some translations wrong, but it's so convenient I'd rather manually review all the translations for errors than write specialized script that is correct 100% of the time.&lt;/p&gt;
    &lt;h2&gt;Let's not take this too far&lt;/h2&gt;
    &lt;p&gt;A college friend once claimed that he cracked the secret of human behavior: humans do whatever makes them happiest. "What about the martyr who dies for their beliefs?" "Well, in their last second of life they get REALLY happy."&lt;/p&gt;
    &lt;p&gt;We can do the same here, fitting every value proposition into the frame of "easy". CUDA makes it easier to do matrix multiplication. Rust makes it easier to write low-level code without memory bugs. TLA+ makes it easier to find errors in your design. Monads make it easier to sequence computations in a lazy environment. Making everything about "easy" obscures other reasons for adopting new things.&lt;/p&gt;
    &lt;h3&gt;That whole "simple vs easy" thing&lt;/h3&gt;
    &lt;p&gt;Sometimes people think that "simple" is better than "easy", because "simple" is objective and "easy" is subjective. This comes from the famous talk &lt;a href="https://www.infoq.com/presentations/Simple-Made-Easy/" target="_blank"&gt;Simple Made Easy&lt;/a&gt;. I'm not sure I agree that simple is better &lt;em&gt;or&lt;/em&gt; more objective: the speaker claims that polymorphism and typeclasses are "simpler" than conditionals, and I doubt everybody would agree with that.&lt;/p&gt;
    &lt;p&gt;The problem is that "simple" is used to mean both "not complicated" &lt;em&gt;and&lt;/em&gt; "not complex". And everybody agrees that "complicated" and "complex" are different, even if they can't agree &lt;em&gt;what&lt;/em&gt; the difference is. This idea should probably be expanded into its own newsletter.&lt;/p&gt;
    &lt;p&gt;It's also a lot harder to pitch a technology on being "simpler". Simplicity by itself doesn't make a tool better equipped to solve problems. Simplicity can unlock other benefits, like compositionality or &lt;a href="https://buttondown.com/hillelwayne/archive/the-capability-tractability-tradeoff/" target="_blank"&gt;tractability&lt;/a&gt;, that provide the actual value. And often that value is in the form of "makes some tasks easier". &lt;/p&gt;
    </description><pubDate>Wed, 29 Jan 2025 18:09:47 +0000</pubDate><guid>https://buttondown.com/hillelwayne/archive/what-hard-thing-does-your-tech-make-easy/</guid></item><item><title>The Juggler's Curse</title><link>https://buttondown.com/hillelwayne/archive/the-jugglers-curse/</link><description>
    &lt;p&gt;I'm making a more focused effort to juggle this year. Mostly &lt;a href="https://youtu.be/PPhG_90VH5k?si=AxOO65PcX4ZwnxPQ&amp;t=49" target="_blank"&gt;boxes&lt;/a&gt;, but also classic balls.&lt;sup id="fnref:boxes"&gt;&lt;a class="footnote-ref" href="#fn:boxes"&gt;1&lt;/a&gt;&lt;/sup&gt; I've gotten to the point where I can almost consistently do a five-ball cascade, which I &lt;em&gt;thought&lt;/em&gt; was the cutoff to being a "good juggler". "Thought" because I now know a "good juggler" is one who can do the five-ball cascade with &lt;em&gt;outside throws&lt;/em&gt;. &lt;/p&gt;
    &lt;p&gt;I know this because I can't do the outside five-ball cascade... yet. But it's something I can see myself eventually mastering, unlike the slightly more difficult trick of the five-ball mess, which is impossible for mere mortals like me. &lt;/p&gt;
    &lt;p&gt;&lt;em&gt;In theory&lt;/em&gt; there is a spectrum of trick difficulties and skill levels. I could place myself on the axis like this:&lt;/p&gt;
    &lt;p&gt;&lt;img alt="A crudely-drawn scale with 10 even ticks, I'm between 5 and 6" class="newsletter-image" src="https://assets.buttondown.email/images/8ee51aa1-5dd4-48b8-8110-2cdf9a273612.png?w=960&amp;fit=max"/&gt;&lt;/p&gt;
    &lt;p&gt;In practice, there are three tiers:&lt;/p&gt;
    &lt;ol&gt;
    &lt;li&gt;Toddlers&lt;/li&gt;
    &lt;li&gt;Good jugglers who practice hard&lt;/li&gt;
    &lt;li&gt;Genetic freaks and actual wizards&lt;/li&gt;
    &lt;/ol&gt;
    &lt;p&gt;And the graph always, &lt;em&gt;always&lt;/em&gt; looks like this:&lt;/p&gt;
    &lt;p&gt;&lt;img alt="The same graph, with the top compressed into "wizards" and bottom into "toddlers". I'm in toddlers." class="newsletter-image" src="https://assets.buttondown.email/images/04c76cec-671e-4560-b64e-498b7652359e.png?w=960&amp;fit=max"/&gt;&lt;/p&gt;
    &lt;p&gt;This is the jugglers curse, and it's a three-parter:&lt;/p&gt;
    &lt;ol&gt;
    &lt;li&gt;The threshold between you and "good" is the next trick you cannot do.&lt;/li&gt;
    &lt;li&gt;Everything below that level is trivial. Once you've gotten a trick down, you can never go back to not knowing it, to appreciating how difficult it was to learn in the first place.&lt;sup id="fnref:expert-blindness"&gt;&lt;a class="footnote-ref" href="#fn:expert-blindness"&gt;2&lt;/a&gt;&lt;/sup&gt;&lt;/li&gt;
    &lt;li&gt;Everything above that level is just "impossible". You don't have the knowledge needed to recognize the different tiers.&lt;sup id="fnref:dk"&gt;&lt;a class="footnote-ref" href="#fn:dk"&gt;3&lt;/a&gt;&lt;/sup&gt;&lt;/li&gt;
    &lt;/ol&gt;
    &lt;p&gt;So as you get better, the stuff that was impossible becomes differentiable, and you can see that some of it &lt;em&gt;is&lt;/em&gt; possible. And everything you learned becomes trivial. So you're never a good juggler until you learn "just one more hard trick".&lt;/p&gt;
    &lt;p&gt;The more you know, the more you know you don't know and the less you know you know.&lt;/p&gt;
    &lt;h3&gt;This is supposed to be a software newsletter&lt;/h3&gt;
    &lt;blockquote&gt;
    &lt;p&gt;A monad is a monoid in the category of endofunctors, what's the problem? &lt;a href="https://james-iry.blogspot.com/2009/05/brief-incomplete-and-mostly-wrong.html" target="_blank"&gt;(src)&lt;/a&gt;&lt;/p&gt;
    &lt;/blockquote&gt;
    &lt;p&gt;I think this applies to any difficult topic? Most fields don't have the same stark &lt;a href="https://en.wikipedia.org/wiki/Spectral_line" target="_blank"&gt;spectral lines&lt;/a&gt; as juggling, but there are still tiers of difficulty to techniques, which get compressed the further in either direction they are from your current level.&lt;/p&gt;
    &lt;p&gt;Like, I'm not good at formal methods. I've written two books on it but I've never mastered a dependently-typed language or a theorem prover. Those are equally hard. And I'm not good at modeling concurrent systems because I don't understand the formal definition of bisimulation and haven't implemented a Raft. Those are also equally hard, in fact exactly as hard as mastering a theorem prover.&lt;/p&gt;
    &lt;p&gt;At the same time, the skills I've already developed are easy: properly using refinement is &lt;em&gt;exactly as easy&lt;/em&gt; as writing &lt;a href="https://buttondown.com/hillelwayne/archive/what-are-the-rosettas-of-formal-specification/" target="_blank"&gt;a wrapped counter&lt;/a&gt;. Then I get surprised when I try to explain strong fairness to someone and they just don't get how □◇(ENABLED〈A〉ᵥ) is &lt;em&gt;obviously&lt;/em&gt; different from ◇□(ENABLED 〈A〉ᵥ).&lt;/p&gt;
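    &lt;p&gt;For anyone on the other side of that explanation: □◇P means "P is true infinitely often" and ◇□P means "P eventually becomes true and stays true". Strong fairness is conditioned on the first applied to ENABLED, weak fairness on the second. A small Python sketch (not TLA+) on "lasso" behaviors, meaning a finite prefix followed by a cycle that repeats forever. The function and variable names are made up for illustration:&lt;/p&gt;

```python
def infinitely_often(prefix, cycle):
    """□◇P: P holds at infinitely many steps, so it must hold somewhere in the cycle."""
    return any(cycle)


def eventually_always(prefix, cycle):
    """◇□P: from some step on P holds forever, so it must hold throughout the cycle."""
    return all(cycle)


# Each list entry says whether action A is enabled at that step.
# The behavior is the prefix followed by the cycle repeated forever.
flickering = ([False], [True, False])  # enabled, disabled, enabled, disabled, ...
settles_on = ([False, False], [True])  # disabled twice, then enabled forever
```

    &lt;p&gt;On &lt;code&gt;flickering&lt;/code&gt;, A is enabled infinitely often but never stays enabled, so only strong fairness forces A to eventually happen. On &lt;code&gt;settles_on&lt;/code&gt; both conditions hold.&lt;/p&gt;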
    &lt;p&gt;Juggler's curse!&lt;/p&gt;
    &lt;p&gt;Now I don't actually know if this is how everybody experiences expertise or if it's just my particular personality; I was a juggler long before I was a software developer. Then again, I'd argue that lots of people talk about one consequence of the juggler's curse: imposter syndrome. If you constantly think what you know is "trivial" and what you don't know is "impossible", then yeah, you'd start feeling like an imposter at work real quick.&lt;/p&gt;
    &lt;p&gt;I wonder if part of the cause is that a lot of skills you have to learn are invisible. One of my favorite blog posts ever is &lt;a href="https://www.benkuhn.net/blub/" target="_blank"&gt;In Defense of Blub Studies&lt;/a&gt;, which argues that software expertise comes through understanding "boring" topics like "what all of the error messages mean" and "how to use a debugger well".  Blub is a critical part of expertise and takes a lot of hard work to learn, but it &lt;em&gt;feels&lt;/em&gt; like trivia. So looking back on a skill I mastered, I might think it was "easy" because I'm not including all of the blub that I had to learn, too.&lt;/p&gt;
    &lt;p&gt;The takeaway, of course, is that the outside five-ball cascade &lt;em&gt;is&lt;/em&gt; objectively the cutoff between good jugglers and toddlers.&lt;/p&gt;
    &lt;div class="footnote"&gt;
    &lt;hr/&gt;
    &lt;ol&gt;
    &lt;li id="fn:boxes"&gt;
    &lt;p&gt;Rant time: I &lt;em&gt;love&lt;/em&gt; cigar box juggling. It's fun, it's creative, it's totally unlike any other kind of juggling. And it's so niche I straight up cannot find anybody in Chicago to practice with. I once went to a juggling convention and was the only person with a cigar box set there. &lt;a class="footnote-backref" href="#fnref:boxes" title="Jump back to footnote 1 in the text"&gt;↩&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id="fn:expert-blindness"&gt;
    &lt;p&gt;This particular part of the juggler's curse is also called &lt;a href="https://en.wikipedia.org/wiki/Curse_of_knowledge" target="_blank"&gt;the curse of knowledge&lt;/a&gt; or "expert blindness". &lt;a class="footnote-backref" href="#fnref:expert-blindness" title="Jump back to footnote 2 in the text"&gt;↩&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id="fn:dk"&gt;
    &lt;p&gt;This isn't Dunning-Kruger, because DK says that people think they are &lt;em&gt;better&lt;/em&gt; than they actually are, and also &lt;a href="https://www.mcgill.ca/oss/article/critical-thinking/dunning-kruger-effect-probably-not-real" target="_blank"&gt;may not actually be real&lt;/a&gt;. &lt;a class="footnote-backref" href="#fnref:dk" title="Jump back to footnote 3 in the text"&gt;↩&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;/ol&gt;
    &lt;/div&gt;
    </description><pubDate>Wed, 22 Jan 2025 18:50:40 +0000</pubDate><guid>https://buttondown.com/hillelwayne/archive/the-jugglers-curse/</guid></item><item><title>What are the Rosettas of formal specification?</title><link>https://buttondown.com/hillelwayne/archive/what-are-the-rosettas-of-formal-specification/</link><description>
    &lt;p&gt;First of all, I just released version 0.6 of &lt;em&gt;Logic for Programmers&lt;/em&gt;! You can get it &lt;a href="https://leanpub.com/logic/" target="_blank"&gt;here&lt;/a&gt;. Release notes in the footnote.&lt;sup id="fnref:release-notes"&gt;&lt;a class="footnote-ref" href="#fn:release-notes"&gt;1&lt;/a&gt;&lt;/sup&gt;&lt;/p&gt;
    &lt;p&gt;I've been thinking about my next project after the book's done. One idea is to do a survey of new formal specification languages. There's been a lot of new ones in the past few years (P, Quint, etc), plus some old ones I haven't critically examined (SPIN, mcrl2). I'm thinking of a brief overview of each, what's interesting about it, and some examples of the corresponding models.&lt;/p&gt;
    &lt;p&gt;For this I'd want a set of "Rosetta" examples. &lt;a href="https://rosettacode.org/wiki/Rosetta_Code" target="_blank"&gt;Rosetta Code&lt;/a&gt; is a collection of programming tasks done in different languages. For example, &lt;a href="https://rosettacode.org/wiki/99_bottles_of_beer" target="_blank"&gt;"99 bottles of beer on the wall"&lt;/a&gt; in over 300 languages. If I wanted to make a Rosetta Code for specifications of concurrent systems, what examples would I use? &lt;/p&gt;
    &lt;h3&gt;What makes a good Rosetta example?&lt;/h3&gt;
    &lt;p&gt;A good Rosetta example would be simple enough to understand and implement but also showcase the differences between the languages. &lt;/p&gt;
    &lt;p&gt;A good example of a Rosetta example is &lt;a href="https://github.com/hwayne/lets-prove-leftpad" target="_blank"&gt;leftpad for code verification&lt;/a&gt;. Proving leftpad correct is short in whatever verification language you use. But the proofs themselves are different enough that you can compare what it's like to use code contracts vs dependent types, etc. &lt;/p&gt;
    &lt;p&gt;A &lt;em&gt;bad&lt;/em&gt; Rosetta example is "hello world". While it's good for showing how to run a language, it doesn't clearly differentiate languages. Haskell's "hello world" is almost identical to BASIC's "hello world".&lt;/p&gt;
    &lt;p&gt;Rosetta examples don't have to be flashy, but I &lt;em&gt;want&lt;/em&gt; mine to be flashy. Formal specification is niche enough that regardless of my medium, most of my audience hasn't used it and may be skeptical. I always have to be selling. This biases me away from using things like dining philosophers or two-phase commit.&lt;/p&gt;
    &lt;p&gt;So with that in mind, three ideas:&lt;/p&gt;
    &lt;h3&gt;1. Wrapped Counter&lt;/h3&gt;
    &lt;p&gt;A counter that starts at 1 and counts to N, after which it wraps around to 1 again.&lt;/p&gt;
    &lt;h4&gt;Why it's good&lt;/h4&gt;
    &lt;p&gt;This is a good introductory formal specification: it's a minimal possible stateful system without concurrency or nondeterminism. You can use it to talk about the basic structure of a spec, how a verifier works, etc. It's also a good way of introducing "boring" semantics, like conditionals and arithmetic, and checking if the language does anything unusual with them. Alloy, for example, defaults to 4-bit signed integers, so you run into problems if you set N too high.&lt;sup id="fnref:alloy"&gt;&lt;a class="footnote-ref" href="#fn:alloy"&gt;2&lt;/a&gt;&lt;/sup&gt;&lt;/p&gt;
    &lt;p&gt;At the same time, wrapped counters are a common building block of complex systems. Lots of things can be represented this way: &lt;code&gt;N=2&lt;/code&gt; is a flag or blinker, &lt;code&gt;N=3&lt;/code&gt; is a traffic light, &lt;code&gt;N=24&lt;/code&gt; is a clock, etc.&lt;/p&gt;
    &lt;p&gt;The next example is better for showing basic &lt;a href="https://www.hillelwayne.com/post/safety-and-liveness/" target="_blank"&gt;safety and liveness properties&lt;/a&gt;, but this will do in a pinch. &lt;/p&gt;
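    &lt;p&gt;As a rough sketch of what a checker does with this spec, here is the wrapped counter as a tiny Python state machine. The names &lt;code&gt;step&lt;/code&gt; and &lt;code&gt;reachable_states&lt;/code&gt; are made up for illustration, and walking the states by hand is only a stand-in for a real model checker.&lt;/p&gt;

```python
def step(value, n):
    """Next counter value: count up to n, then wrap around to 1."""
    return 1 if value == n else value + 1


def reachable_states(n):
    """Follow the single deterministic behavior until a state repeats."""
    seen = []
    value = 1  # initial state
    while value not in seen:
        seen.append(value)
        value = step(value, n)
    return seen


# Safety-style check: the counter never leaves the range 1..n.
states = reachable_states(3)
assert all(s in range(1, 4) for s in states)
print(states)
```

    &lt;p&gt;Because the system is deterministic, there is exactly one behavior to explore, which is what makes it a gentle first spec.&lt;/p&gt;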
    &lt;h3&gt;2. Threads&lt;/h3&gt;
    &lt;p&gt;A counter starts at 0. N threads each simultaneously try to update the counter. They do this nonatomically: first they read the value of the counter and store that in a thread-local &lt;code&gt;tmp&lt;/code&gt;, then they increment &lt;code&gt;tmp&lt;/code&gt;, then they set the counter to &lt;code&gt;tmp&lt;/code&gt;. The expected behavior is that the final value of the counter will be N.&lt;/p&gt;
    &lt;h4&gt;Why it's good&lt;/h4&gt;
    &lt;p&gt;The system as described is bugged. If two threads interleave the setlocal commands, one thread update can "clobber" the other and the counter can go backwards. To my surprise, most people &lt;em&gt;do not&lt;/em&gt; see this error. So it's a good showcase of how the language actually finds real bugs, and how it can verify fixes.&lt;/p&gt;
    &lt;p&gt;As to actual language topics: the spec covers concurrency and tracks process-local state. A good spec language should make it possible to adjust N without having to add any new variables. And it "naturally" introduces safety, liveness, and &lt;a href="https://www.hillelwayne.com/post/action-properties/" target="_blank"&gt;action&lt;/a&gt; properties.&lt;/p&gt;
    &lt;p&gt;Finally, the thread spec is endlessly adaptable. I've used variations of it to teach refinement, resource starvation, fairness, livelocks, and hyperproperties. Tweak it a bit and you get dining philosophers.&lt;/p&gt;
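    &lt;p&gt;To make the bug concrete, here is a brute-force sketch in Python that tries every interleaving of the three steps (read, increment, write) and collects the possible final counter values. The encoding of each thread as a &lt;code&gt;(pc, tmp)&lt;/code&gt; pair and the function name are assumptions for illustration, not taken from any particular spec language.&lt;/p&gt;

```python
def final_counter_values(n):
    """Explore every interleaving of n threads; return all possible final values."""
    # State is (counter, ((pc, tmp), ...)) with pc 0=read, 1=increment, 2=write, 3=done.
    start = (0, tuple((0, 0) for _ in range(n)))
    seen = {start}
    stack = [start]
    finals = set()
    while stack:
        counter, threads = stack.pop()
        if all(pc == 3 for pc, _ in threads):
            finals.add(counter)
            continue
        for i, (pc, tmp) in enumerate(threads):
            if pc == 3:
                continue  # this thread is finished
            if pc == 0:  # read the counter into thread-local tmp
                new_counter, new_thread = counter, (1, counter)
            elif pc == 1:  # increment tmp
                new_counter, new_thread = counter, (2, tmp + 1)
            else:  # write tmp back to the counter
                new_counter, new_thread = tmp, (3, tmp)
            nxt = (new_counter, threads[:i] + (new_thread,) + threads[i + 1:])
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return finals


print(sorted(final_counter_values(2)))
```

    &lt;p&gt;With two threads the set contains both 1 and 2, so "the final value is N" does not hold on every behavior.&lt;/p&gt;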
    &lt;h3&gt;3. Bounded buffer&lt;/h3&gt;
    &lt;p&gt;We have a bounded buffer with maximum length &lt;code&gt;X&lt;/code&gt;. We have &lt;code&gt;R&lt;/code&gt; reader and &lt;code&gt;W&lt;/code&gt; writer processes. Before writing, writers first check if the buffer is full. If full, the writer goes to sleep. Otherwise, the writer wakes up &lt;em&gt;a random&lt;/em&gt; sleeping process, then pushes an arbitrary value. Readers work the same way, except they pop from the buffer (and go to sleep if the buffer is empty).&lt;/p&gt;
    &lt;p&gt;The only way for a sleeping process to wake up is if another process successfully performs a read or write.&lt;/p&gt;
    &lt;h4&gt;Why it's good&lt;/h4&gt;
    &lt;p&gt;This shows process-local nondeterminism (in choosing which sleeping process to wake up), different behavior for different types of processes, and deadlocks: it's possible for every reader and writer to be asleep at the same time.&lt;/p&gt;
    &lt;p&gt;The beautiful thing about this example: the spec can only deadlock if &lt;code&gt;X &lt; 2*(R+W)&lt;/code&gt;. This is the kind of bug you'd struggle to debug in real code. And in fact, people did struggle: even when presented with a minimal code sample and told there was a bug, many &lt;a href="http://wiki.c2.com/?ExtremeProgrammingChallengeFourteen" target="_blank"&gt;testing experts couldn't find it&lt;/a&gt;. Whereas a formal model of the same code &lt;a href="https://www.hillelwayne.com/post/augmenting-agile/" target="_blank"&gt;finds the bug in seconds&lt;/a&gt;. &lt;/p&gt;
    &lt;p&gt;If a spec language can model the bounded buffer, then it's good enough for production systems.&lt;/p&gt;
    &lt;p&gt;On top of that, the bug happens regardless of what writers actually put in the buffer, so you can abstract that all away. This example can demonstrate that you can leave implementation details out of a spec and still find critical errors.&lt;/p&gt;
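    &lt;p&gt;Here is a rough Python sketch of that abstraction: the state is only the buffer length plus the set of sleeping processes, and a brute-force search looks for a state where everybody is asleep. This is a hand-rolled stand-in for a model checker, and &lt;code&gt;can_deadlock&lt;/code&gt; and its parameters are hypothetical names.&lt;/p&gt;

```python
def can_deadlock(capacity, readers, writers):
    """Search every reachable state for one where all processes are asleep."""
    procs = [("r", i) for i in range(readers)] + [("w", i) for i in range(writers)]
    start = (0, frozenset())  # (items in buffer, sleeping processes)
    seen = {start}
    stack = [start]
    while stack:
        length, asleep = stack.pop()
        if len(asleep) == len(procs):
            return True  # nobody is left to wake anyone up
        for proc in procs:
            if proc in asleep:
                continue
            kind = proc[0]
            # Writers block on a full buffer, readers on an empty one.
            blocked = length == (capacity if kind == "w" else 0)
            if blocked:
                successors = [(length, asleep | {proc})]
            else:
                new_length = length + (1 if kind == "w" else -1)
                # Wake one arbitrary sleeper, or nobody if none are asleep.
                wake_choices = [asleep - {p} for p in asleep] or [asleep]
                successors = [(new_length, a) for a in wake_choices]
            for nxt in successors:
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
    return False


print(can_deadlock(1, 1, 1), can_deadlock(1, 2, 1))
```

    &lt;p&gt;Note that the values in the buffer never appear in the state. One reader and one writer on a one-slot buffer cannot deadlock here, while adding a second reader is enough to make it possible.&lt;/p&gt;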
    &lt;h2&gt;Caveat&lt;/h2&gt;
    &lt;p&gt;This is all with a &lt;em&gt;heavy&lt;/em&gt; TLA+ bias. I've modeled all of these systems in TLA+ and it works pretty well for them. That is to say, none of these do things TLA+ is &lt;em&gt;bad&lt;/em&gt; at: reachability, subtyping, transitive closures, unbound spaces, etc. I imagine that as I cover more specification languages I'll find new Rosettas.&lt;/p&gt;
    &lt;div class="footnote"&gt;
    &lt;hr/&gt;
    &lt;ol&gt;
    &lt;li id="fn:release-notes"&gt;
    &lt;ul&gt;
    &lt;li&gt;Exercises are more compact, answers now show name of exercise in title&lt;/li&gt;
    &lt;/ul&gt;
    &lt;ul&gt;
    &lt;li&gt;"Conditionals" chapter has new section on nested conditionals&lt;/li&gt;
    &lt;/ul&gt;
    &lt;ul&gt;
    &lt;li&gt;"Crash course" chapter significantly rewritten&lt;/li&gt;
    &lt;li&gt;Started migrating to consistently use &lt;code&gt;==&lt;/code&gt; for equality and &lt;code&gt;=&lt;/code&gt; for definition. Not everything is migrated yet&lt;/li&gt;
    &lt;li&gt;"Beyond Logic" appendix does a &lt;em&gt;slightly&lt;/em&gt; better job of covering HOL and constructive logic&lt;/li&gt;
    &lt;li&gt;Addressed various reader feedback&lt;/li&gt;
    &lt;li&gt;Two new exercises&lt;/li&gt;
    &lt;/ul&gt;
    &lt;p&gt;&lt;a class="footnote-backref" href="#fnref:release-notes" title="Jump back to footnote 1 in the text"&gt;↩&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id="fn:alloy"&gt;
    &lt;p&gt;You can change the int size in a model run, so this is more "surprising footgun and inconvenience" than "fundamental limit of the specification language." Something still good to know! &lt;a class="footnote-backref" href="#fnref:alloy" title="Jump back to footnote 2 in the text"&gt;↩&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;/ol&gt;
    &lt;/div&gt;
    </description><pubDate>Wed, 15 Jan 2025 17:34:40 +0000</pubDate><guid>https://buttondown.com/hillelwayne/archive/what-are-the-rosettas-of-formal-specification/</guid></item><item><title>"Logic for Programmers" Project Update</title><link>https://buttondown.com/hillelwayne/archive/logic-for-programmers-project-update/</link><description>
    &lt;p&gt;Happy new year everyone!&lt;/p&gt;
    &lt;p&gt;I released the first &lt;em&gt;Logic for Programmers&lt;/em&gt; alpha six months ago. There have been four new versions since then, with the November release putting us in beta. Between work and holidays I didn't make much progress in December, but there will be a 0.6 release in the next week or two.&lt;/p&gt;
    &lt;p&gt;People have asked me if the book will ever be available in print, and my answer to that is "when it's done". To keep "when it's done" from being "never", I'm committing myself to &lt;strong&gt;have the book finished by July.&lt;/strong&gt; That means roughly six more releases between now and the official First Edition. Then I will start looking for a way to get it printed.&lt;/p&gt;
    &lt;h3&gt;The Current State and What Needs to be Done&lt;/h3&gt;
    &lt;p&gt;Right now the book is 26,000 words. For the most part, the structure is set— I don't plan to reorganize the chapters much. But I still need to fix shortcomings identified by the reader feedback. In particular, a few topics need more on real world applications, and the Alloy chapter is pretty weak. There's also a bunch of notes and todos and "fix this"s I need to go over.&lt;/p&gt;
    &lt;p&gt;I also need to rewrite the introduction and predicate logic chapters. Those haven't changed much since 0.1 and I need to go over them &lt;em&gt;very carefully&lt;/em&gt;.&lt;/p&gt;
    &lt;p&gt;After that comes copyediting.&lt;/p&gt;
    &lt;h4&gt;Ugh, Copyediting&lt;/h4&gt;
    &lt;p&gt;Copyediting means going through the entire book to make word- and sentence-level changes to the flow. An example would be changing&lt;/p&gt;
    &lt;table&gt;
    &lt;thead&gt;
    &lt;tr&gt;
    &lt;th&gt;From&lt;/th&gt;
    &lt;th&gt;To&lt;/th&gt;
    &lt;/tr&gt;
    &lt;/thead&gt;
    &lt;tbody&gt;
    &lt;tr&gt;
    &lt;td&gt;I said predicates are just “boolean functions”. That isn’t &lt;em&gt;quite&lt;/em&gt; true.&lt;/td&gt;
    &lt;td&gt;It's easy to think of predicates as just "boolean" functions, but there is a subtle and important difference.&lt;/td&gt;
    &lt;/tr&gt;
    &lt;/tbody&gt;
    &lt;/table&gt;
    &lt;p&gt;It's a tiny difference but it reads slightly better to me and makes the book slightly better. Now repeat that for all 3000-odd sentences in the book and I'm done with copyediting!&lt;/p&gt;
    &lt;p&gt;For the first pass, anyway. Copyediting is miserable. &lt;/p&gt;
    &lt;p&gt;Some of the changes I need to make come from reader feedback, but most will come from going through it line-by-line with a copyeditor. Someone's kindly offered to do some of this for free, but I want to find a professional too. If you know anybody, let me know.&lt;/p&gt;
    &lt;h4&gt;Formatting&lt;/h4&gt;
    &lt;p&gt;The book, if I'm being honest, looks ugly. I'm using the default sphinx/latex combination for layout and typesetting. My thinking is it's not worth making the book pretty until it's worth reading. But I also want the book, when it's eventually printed, to look &lt;em&gt;nice&lt;/em&gt;. At the very least it shouldn't have "self-published" vibes. &lt;/p&gt;
    &lt;p&gt;I've found someone who's been giving me excellent advice on layout and I'm slowly mastering the LaTeX formatting arcana. It's gonna take a few iterations to get things right.&lt;/p&gt;
    &lt;h4&gt;Front cover&lt;/h4&gt;
    &lt;p&gt;Currently the front cover is this:&lt;/p&gt;
    &lt;p&gt;&lt;img alt="Front cover" class="newsletter-image" src="https://assets.buttondown.email/images/b42ee3de-9d8a-4729-809e-a8739741f0cf.png?w=960&amp;fit=max"/&gt;&lt;/p&gt;
    &lt;p&gt;It works but gives "programmer spent ten minutes in Inkscape" vibes. I have a vision in my head for what would be nicer. A few people have recommended using Fiverr. So far the results haven't been that good.&lt;/p&gt;
    &lt;h4&gt;Fixing Epub&lt;/h4&gt;
    &lt;p&gt;&lt;em&gt;Ugh&lt;/em&gt;&lt;/p&gt;
    &lt;p&gt;I thought making an epub version would be kinder for phone reading, but it's such a painful format to develop for. Did you know that epub backlinks work totally differently on Kindle vs other ereaders? Did you know the only way to test if you got em working right is to load them up in a virtual Kindle? The feedback loops are miserable. So I've been treating epub as a second-class citizen for now and only fixing the &lt;em&gt;worst&lt;/em&gt; errors (like math not rendering properly), but that'll have to change as the book finalizes.&lt;/p&gt;
    &lt;h3&gt;What comes next?&lt;/h3&gt;
    &lt;p&gt;After 1.0, I get my book an ISBN and figure out how to make print copies. The margin on print is &lt;em&gt;way&lt;/em&gt; lower than ebooks, especially if it's on-demand: the net royalties for &lt;a href="https://kdp.amazon.com/en_US/help/topic/G201834330" target="_blank"&gt;Amazon direct publishing&lt;/a&gt; would be 7 dollars on a 20-dollar book (as opposed to Leanpub's 16 dollars). Would having a print version double the sales? I hope so! Either way, a lot of people have been asking about a print version so I want to make that possible.&lt;/p&gt;
    &lt;p&gt;(I also want to figure out how to give people who already have the ebook a discount on print, but I don't know if that's feasible.)&lt;/p&gt;
    &lt;p&gt;Then, I dunno, maybe make a talk or a workshop I can pitch to conferences. Once I have that I think I can call &lt;em&gt;LfP&lt;/em&gt; complete... at least until the second edition.&lt;/p&gt;
    &lt;hr/&gt;
    &lt;p&gt;Anyway none of that is actually technical so here's a quick fun thing. I spent a good chunk of my break reading the &lt;a href="https://www.mcrl2.org/web/index.html" target="_blank"&gt;mCRL2 book&lt;/a&gt;. mCRL2 defines an "algebra" for "communicating processes". As a very broad explanation, that's defining what it means to "add" and "multiply" two processes. What's interesting is that according to their definition, the algebra follows the distributive law, &lt;em&gt;but only if you multiply on the right&lt;/em&gt;. eg&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;// VALID
    (a+b)*c = a*c + b*c
    
    // INVALID
    a*(b+c) = a*b + a*c
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;This is the first time I've ever seen this in practice! Jury's still out on the rest of the language.&lt;/p&gt;
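    &lt;p&gt;One way to see why the two sides can differ, assuming processes are compared by their branching structure rather than by the sequences of actions they produce: in &lt;code&gt;a*(b+c)&lt;/code&gt; the choice happens after &lt;code&gt;a&lt;/code&gt;, while in &lt;code&gt;a*b + a*c&lt;/code&gt; it happens before. The Python below is a toy encoding of finite processes as nested sets, not mCRL2 syntax or its actual semantics, and &lt;code&gt;act&lt;/code&gt;, &lt;code&gt;alt&lt;/code&gt;, and &lt;code&gt;seq&lt;/code&gt; are hypothetical names.&lt;/p&gt;

```python
DONE = frozenset()  # the process that has nothing left to do


def act(name):
    """A single action followed by termination."""
    return frozenset({(name, DONE)})


def alt(p, q):
    """Choice ("+"): behave like either p or q."""
    return p | q


def seq(p, q):
    """Sequencing ("*"): run p, then q."""
    if p == DONE:
        return q
    return frozenset((name, seq(rest, q)) for name, rest in p)


a, b, c = act("a"), act("b"), act("c")

# Right distributivity holds: both sides choose between a and b up front.
assert seq(alt(a, b), c) == alt(seq(a, c), seq(b, c))

# Left distributivity fails: a*(b+c) chooses after a, a*b + a*c chooses before it.
assert seq(a, alt(b, c)) != alt(seq(a, b), seq(a, c))
```

    &lt;p&gt;Both sides of the second law produce the same traces (&lt;code&gt;ab&lt;/code&gt; and &lt;code&gt;ac&lt;/code&gt;), so the difference only shows up when the moment of choice matters.&lt;/p&gt;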
    &lt;hr/&gt;
    &lt;h3&gt;Videos and Stuff&lt;/h3&gt;
    &lt;ul&gt;
    &lt;li&gt;My &lt;em&gt;DDD Europe&lt;/em&gt; talk is now out! &lt;a href="https://www.youtube.com/watch?v=uRmNSuYBUOU" target="_blank"&gt;What We Know We Don't Know&lt;/a&gt; is about empirical software engineering in general, and software engineering research on Domain Driven Design in particular.&lt;/li&gt;
    &lt;li&gt;I was interviewed in the last video on &lt;a href="https://www.youtube.com/watch?v=yXxmSI9SlwM" target="_blank"&gt;Craft vs Cruft&lt;/a&gt;'s "Year of Formal Methods". Check it out!&lt;/li&gt;
    &lt;/ul&gt;
    </description><pubDate>Tue, 07 Jan 2025 18:49:40 +0000</pubDate><guid>https://buttondown.com/hillelwayne/archive/logic-for-programmers-project-update/</guid></item><item><title>Formally modeling dreidel, the sequel</title><link>https://buttondown.com/hillelwayne/archive/formally-modeling-dreidel-the-sequel/</link><description>
    &lt;p&gt;Channukah's next week and that means my favorite pastime, complaining about how &lt;a href="https://en.wikipedia.org/wiki/Dreidel#" target="_blank"&gt;Dreidel&lt;/a&gt; is a bad game. Last year I formally modeled it in &lt;a href="https://www.prismmodelchecker.org/" target="_blank"&gt;PRISM&lt;/a&gt; to prove the game's not fun. But because I limited the model to only a small case, I couldn't prove the game was &lt;em&gt;truly&lt;/em&gt; bad. &lt;/p&gt;
    &lt;p&gt;It's time to finish the job.&lt;/p&gt;
    &lt;p&gt;&lt;img alt="A flaming dreidel, from https://pixelsmerch.com/featured/flaming-dreidel-ilan-rosen.html" class="newsletter-image" src="https://assets.buttondown.email/images/61233445-69a7-4fd4-a024-ee0dca0281c1.jpg?w=960&amp;fit=max"/&gt;&lt;/p&gt;
    &lt;h2&gt;The Story so far&lt;/h2&gt;
    &lt;p&gt;You can read last year's newsletter &lt;a href="https://buttondown.com/hillelwayne/archive/i-formally-modeled-dreidel-for-no-good-reason/" target="_blank"&gt;here&lt;/a&gt; but here are the high-level notes.&lt;/p&gt;
    &lt;h3&gt;The Game of Dreidel&lt;/h3&gt;
    &lt;ol&gt;
    &lt;li&gt;Every player starts with N pieces (usually chocolate coins). This is usually 10-15 pieces per player.&lt;/li&gt;
    &lt;li&gt;At the beginning of the game, and whenever the pot is empty, every player antes one coin into the pot.&lt;/li&gt;
    &lt;li&gt;
    &lt;p&gt;Turns consist of spinning the dreidel. Outcomes are:&lt;/p&gt;
    &lt;ul&gt;
    &lt;li&gt;נ (Nun): nothing happens.&lt;/li&gt;
    &lt;li&gt;ה (He): player takes half the pot, rounded up.&lt;/li&gt;
    &lt;li&gt;ג (Gimmel): player takes the whole pot, everybody antes.&lt;/li&gt;
    &lt;li&gt;ש (Shin): player adds one of their coins to the pot.&lt;/li&gt;
    &lt;/ul&gt;
    &lt;/li&gt;
    &lt;li&gt;
    &lt;p&gt;If a player ever has zero coins, they are eliminated. Play continues until only one player remains.&lt;/p&gt;
    &lt;/li&gt;
    &lt;/ol&gt;
    &lt;p&gt;If you don't have a dreidel, you can instead use a four-sided die, but for the authentic experience you should wait eight seconds before looking at your roll.&lt;/p&gt;
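    &lt;p&gt;For a feel of how these rules play out, here is a minimal Python simulation of a single game. It is a sketch of the rules as listed above, not of the PRISM model, and the function name, the seed, and the &lt;code&gt;max_spins&lt;/code&gt; safety cap are assumptions added for illustration.&lt;/p&gt;

```python
import random


def play(players=4, coins_each=10, seed=0, max_spins=100_000):
    """Play one game; return (spins, coins, pot)."""
    rng = random.Random(seed)
    coins = [coins_each] * players
    pot, turn, spins = 0, 0, 0
    for _ in range(max_spins):
        if pot == 0:
            # Everyone still in the game antes one coin.
            for i in range(players):
                if coins[i]:
                    coins[i] -= 1
                    pot += 1
        if sum(1 for c in coins if c) in (0, 1):
            break  # at most one player still has coins
        while coins[turn] == 0:
            turn = (turn + 1) % players  # skip eliminated players
        face = rng.choice(["nun", "he", "gimmel", "shin"])
        if face == "he":
            take = (pot + 1) // 2  # half the pot, rounded up
            coins[turn] += take
            pot -= take
        elif face == "gimmel":
            coins[turn] += pot  # the empty pot triggers an ante on the next loop
            pot = 0
        elif face == "shin":
            coins[turn] -= 1
            pot += 1
        spins += 1
        turn = (turn + 1) % players
    return spins, coins, pot


spins, coins, pot = play()
print(spins, coins, pot)
```

    &lt;p&gt;The pot can be nonempty when the game ends, so the last player standing does not necessarily hold every coin.&lt;/p&gt;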
    &lt;h3&gt;PRISM&lt;/h3&gt;
    &lt;p&gt;&lt;a href="https://www.prismmodelchecker.org/" target="_blank"&gt;PRISM&lt;/a&gt; is a probabilistic modeling language, meaning you can encode a system with random chances of doing things and it can answer questions like "on average, how many spins does it take before one player loses" (64, for 4 players/10 coins) and "what's more likely to knock the first player out, shin or ante" (ante is 2.4x more likely).  You can see last year's model &lt;a href="https://gist.github.com/hwayne/f8724f0c83393c576b1e20ee4b76966d#file-01-dreidel-prism" target="_blank"&gt;here&lt;/a&gt;.&lt;/p&gt;
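    &lt;p&gt;To make "on average, how many steps" concrete, here is a toy example in Python. It is deliberately not dreidel and not how PRISM works internally: it is a fair coin-flip walk on the states 0 to 4, and the expected number of flips before hitting either end is found by repeatedly applying "one step, plus the average of what comes next". All names are made up for the sketch.&lt;/p&gt;

```python
def expected_steps(start=2, top=4, sweeps=10_000):
    """Expected number of fair coin flips until a walk from `start` hits 0 or `top`."""
    # steps[i] is the current estimate of the expected steps remaining from state i.
    # States 0 and top are absorbing, so their entries stay at 0.
    steps = [0.0] * (top + 1)
    for _ in range(sweeps):
        for i in range(1, top):
            # One step is taken, then we are in i-1 or i+1 with probability 1/2 each.
            steps[i] = 1 + 0.5 * steps[i - 1] + 0.5 * steps[i + 1]
    return steps[start]


print(round(expected_steps(), 6))
```

    &lt;p&gt;For this walk the known closed form is &lt;code&gt;start * (top - start)&lt;/code&gt;, which gives 4 from the middle state, so the iteration has something to be checked against.&lt;/p&gt;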
    &lt;p&gt;The problem with PRISM is that it is absurdly inexpressive: it's a thin abstraction for writing giant &lt;a href="https://en.wikipedia.org/wiki/Stochastic_matrix" target="_blank"&gt;stochastic matrices&lt;/a&gt; and lacks basic affordances like lists or functions. I had to hardcode every possible roll for every player. This meant last year's model had two limits. First, it only handles four players, and I would have to write a new model for three or five players. Second, I made the game end as soon as one player &lt;em&gt;lost&lt;/em&gt;:&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;formula done = (p1=0) | (p2=0) | (p3=0) | (p4=0);
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;To fix both of these things, I thought I'd have to treat PRISM as a compilation target, writing a program that took a player count and output the corresponding model. But then December got super busy and I ran out of time to write a program. Instead, I stuck with four hardcoded players and extended the old model to run until victory.&lt;/p&gt;
    &lt;h2&gt;The new model&lt;/h2&gt;
    &lt;p&gt;These are all changes to &lt;a href="https://gist.github.com/hwayne/f8724f0c83393c576b1e20ee4b76966d#file-01-dreidel-prism" target="_blank"&gt;last year's model&lt;/a&gt;.&lt;/p&gt;
    &lt;p&gt;First, instead of running until one player is out of money, we run until three players are out of money.&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="gd"&gt;- formula done = (p1=0) | (p2=0) | (p3=0) | (p4=0);&lt;/span&gt;
    &lt;span class="gi"&gt;+ formula done = &lt;/span&gt;
    &lt;span class="gi"&gt;+  ((p1=0) &amp; (p2=0) &amp; (p3=0)) |&lt;/span&gt;
    &lt;span class="gi"&gt;+  ((p1=0) &amp; (p2=0) &amp; (p4=0)) |&lt;/span&gt;
    &lt;span class="gi"&gt;+  ((p1=0) &amp; (p3=0) &amp; (p4=0)) |&lt;/span&gt;
    &lt;span class="gi"&gt;+  ((p2=0) &amp; (p3=0) &amp; (p4=0));&lt;/span&gt;
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;Next, we change the ante formula. Instead of adding four coins to the pot and subtracting a coin from each player, we add one coin for each player left. &lt;code&gt;min(p1, 1)&lt;/code&gt; is 1 if player 1 is still in the game, and 0 otherwise. &lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="gi"&gt;+ formula ante_left = min(p1, 1) + min(p2, 1) + min(p3, 1) + min(p4, 1);&lt;/span&gt;
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;We also have to make sure anteing doesn't leave a player with negative money. &lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="gd"&gt;- [ante] (pot = 0) &amp; !done -&gt; (pot'=pot+4) &amp; (p1' = p1-1) &amp; (p2' = p2-1) &amp; (p3' = p3-1) &amp; (p4' = p4-1);&lt;/span&gt;
    &lt;span class="gi"&gt;+ [ante] (pot = 0) &amp; !done -&gt; (pot'=pot+ante_left) &amp; (p1' = max(p1-1, 0)) &amp; (p2' = max(p2-1, 0)) &amp; (p3' = max(p3-1, 0)) &amp; (p4' = max(p4-1, 0));&lt;/span&gt;
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
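    &lt;p&gt;The same ante step, sketched in Python to show that the &lt;code&gt;min&lt;/code&gt; and &lt;code&gt;max&lt;/code&gt; tricks keep the coin count consistent. The list-based representation is an assumption for illustration; the PRISM model has to spell out each player by hand.&lt;/p&gt;

```python
def ante(players, pot):
    """Return (new_players, new_pot) after every player still in the game antes."""
    ante_left = sum(min(p, 1) for p in players)  # 1 per player with coins, else 0
    new_players = [max(p - 1, 0) for p in players]  # eliminated players stay at 0
    return new_players, pot + ante_left


players, pot = ante([3, 0, 1, 2], 0)
assert (players, pot) == ([2, 0, 0, 1], 3)
assert sum(players) + pot == 6  # no coins created or destroyed
```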
    &lt;p&gt;Finally, we have to add logic for a player being "out". Instead of moving to the next player after each turn, we move to the next player still in the game. Also, if someone starts their turn without any coins (f.ex if they just anted their last coin), we just skip their turn. &lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="gi"&gt;+ formula p1n = (p2 &gt; 0 ? 2 : p3 &gt; 0 ? 3 : 4);&lt;/span&gt;
    
    &lt;span class="gi"&gt;+ [lost] ((pot != 0) &amp; !done &amp; (turn = 1) &amp; (p1 = 0)) -&gt; (turn' = p1n);&lt;/span&gt;
    &lt;span class="gd"&gt;- [spin] ((pot != 0) &amp; !done &amp; (turn = 1)) -&gt;&lt;/span&gt;
    &lt;span class="gi"&gt;+ [spin] ((pot != 0) &amp; !done &amp; (turn = 1) &amp; (p1 != 0)) -&gt;&lt;/span&gt;
    &lt;span class="w"&gt; &lt;/span&gt;   0.25: (p1' = p1-1) 
    &lt;span class="w"&gt; &lt;/span&gt;          &amp; (pot' = min(pot+1, maxval)) 
    &lt;span class="gd"&gt;-          &amp; (turn' = 2) //shin&lt;/span&gt;
    &lt;span class="gi"&gt;+          &amp; (turn' = p1n) //shin&lt;/span&gt;
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;We make similar changes for all of the other players. You can see the final model &lt;a href="https://gist.github.com/hwayne/f8724f0c83393c576b1e20ee4b76966d#file-02-dreidel-prism" target="_blank"&gt;here&lt;/a&gt;.&lt;/p&gt;
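    &lt;p&gt;For comparison, this is roughly what the per-player "next player" formulas collapse into once you have lists and loops. It is a Python sketch with a hypothetical &lt;code&gt;next_player&lt;/code&gt; helper, written to mirror the fallback in &lt;code&gt;p1n&lt;/code&gt; where the last candidate is returned even if they are out.&lt;/p&gt;

```python
def next_player(turn, coins):
    """1-based index of the next player after `turn` who still has coins."""
    n = len(coins)
    # Candidates in turn order, excluding the current player.
    order = [(turn - 1 + step) % n + 1 for step in range(1, n)]
    for candidate in order:
        if coins[candidate - 1] != 0:
            return candidate
    return order[-1]  # everyone else is out; mirrors the formula's final fallback


assert next_player(1, [5, 2, 0, 3]) == 2
assert next_player(1, [5, 0, 0, 3]) == 4
assert next_player(4, [5, 0, 1, 1]) == 1
```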
    &lt;h3&gt;Querying the model&lt;/h3&gt;
    &lt;div class="subscribe-form"&gt;&lt;/div&gt;
    &lt;p&gt;So now we have a full game of Dreidel that runs until the game ends. And now, &lt;em&gt;finally&lt;/em&gt;, we can see the average number of spins a 4-player game will last.&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;./prism&lt;span class="w"&gt; &lt;/span&gt;dreidel.prism&lt;span class="w"&gt; &lt;/span&gt;-const&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;M&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="m"&gt;10&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;-pf&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s1"&gt;'R=? [F done]'&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;In English: each player starts with ten coins. &lt;code&gt;R=?&lt;/code&gt; means "expected value of the 'reward'", where 'reward' in this case means number of spins. &lt;code&gt;[F done]&lt;/code&gt; weights the reward over all behaviors that reach ("&lt;strong&gt;F&lt;/strong&gt;inally") the &lt;code&gt;done&lt;/code&gt; state.&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;Result: 760.5607582661091
    Time for model checking: 384.17 seconds.
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;So there's the number: 760 spins.&lt;sup id="fnref:ben"&gt;&lt;a class="footnote-ref" href="#fn:ben"&gt;1&lt;/a&gt;&lt;/sup&gt; At 8 seconds a spin, that's almost two hours for &lt;em&gt;one&lt;/em&gt; game.&lt;/p&gt;
    &lt;p&gt;…Jesus, look at that runtime. Six minutes to test one query.&lt;/p&gt;
    &lt;p&gt;PRISM has over a hundred settings that affect model checking, with descriptions like "Pareto curve threshold" and "Use Backwards Pseudo SOR". After looking through them all, I found this perfect combination of configurations that gets the runtime to a more manageable level: &lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;./prism dreidel.prism 
    &lt;span class="w"&gt; &lt;/span&gt;   -const M=10 
    &lt;span class="w"&gt; &lt;/span&gt;   -pf 'R=? [F done]' 
    &lt;span class="gi"&gt;+   -heuristic speed&lt;/span&gt;
    
    Result: 760.816255997373
    Time for model checking: 13.44 seconds.
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;Yes, that's a literal "make it faster" flag.&lt;/p&gt;
    &lt;p&gt;Anyway, that's only the "average" number of spins, weighted across all games. Dreidel has a very long tail. To see how long, we'll use a variation on our query:&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;const C0; P=? [F &lt;=C0 done]
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;&lt;code&gt;P=?&lt;/code&gt; is the &lt;strong&gt;P&lt;/strong&gt;robability something happens. &lt;code&gt;F &lt;=C0 done&lt;/code&gt; means we &lt;strong&gt;F&lt;/strong&gt;inally reach state &lt;code&gt;done&lt;/code&gt; in at most &lt;code&gt;C0&lt;/code&gt; steps. By passing in different values of &lt;code&gt;C0&lt;/code&gt; we can get a sense of how long a game takes. Since "steps" includes passes and antes, this will overestimate the length of the game. But antes take time too and it should only "pass" on a player once per player, so this should still be a good metric for game length.&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;./prism dreidel.prism 
        -const M=10 
        -const C0=1000:1000:5000
        -pf 'const C0; P=? [F &lt;=C0 done]' 
        -heuristic speed
    
    C0      Result
    1000    0.6259953274918795
    2000    0.9098575028069353
    3000    0.9783122218576754
    4000    0.994782069562932
    5000    0.9987446018004976
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;A full 10% of games don't finish in 2000 steps, and 2% pass the 3000 step barrier. At 8 seconds a roll/ante, 3000 steps is over &lt;strong&gt;six hours&lt;/strong&gt;.&lt;/p&gt;
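    &lt;p&gt;The arithmetic behind those figures, as a small Python sketch. The probabilities are copied from the table above, and the eight seconds per step is the same assumption used throughout.&lt;/p&gt;

```python
SECONDS_PER_STEP = 8  # the eight-second spin from the rules

# Probability the game is done within C0 steps, copied from the PRISM output.
done_within = {
    1000: 0.6259953274918795,
    2000: 0.9098575028069353,
    3000: 0.9783122218576754,
    4000: 0.994782069562932,
    5000: 0.9987446018004976,
}

for steps, p_done in done_within.items():
    hours = steps * SECONDS_PER_STEP / 3600
    print(f"{steps} steps = {hours:.1f} h, still playing: {1 - p_done:.1%}")
```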
    &lt;p&gt;Dreidel is a bad game.&lt;/p&gt;
    &lt;h3&gt;More fun properties&lt;/h3&gt;
    &lt;p&gt;As a sanity check, let's confirm last year's result, that it takes an average of 64ish spins before one player is out. In that model, we just needed to get the total reward. Now we instead want to get the reward until the first state where any of the players have zero coins. &lt;sup id="fnref:co-safe"&gt;&lt;a class="footnote-ref" href="#fn:co-safe"&gt;2&lt;/a&gt;&lt;/sup&gt;&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;./prism dreidel.prism 
        -const M=10 
        -pf 'R=? [F (p1=0 | p2=0 | p3=0 | p4=0)]' 
        -heuristic speed
    
    Result: 63.71310116083396
    Time for model checking: 2.017 seconds.
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;Yep, looks good. With our new model we can also get the average point where two players are out and two players are left. PRISM's lack of abstraction makes expressing the condition directly a little painful, but we can cheat and look for the first state where &lt;code&gt;ante_left &lt;= 2&lt;/code&gt;.&lt;sup id="fnref:ante_left"&gt;&lt;a class="footnote-ref" href="#fn:ante_left"&gt;3&lt;/a&gt;&lt;/sup&gt;&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;./prism dreidel.prism 
        -const M=10 
        -pf 'R=? [F (ante_left &lt;= 2)]' 
        -heuristic speed
    
    Result: 181.92839196680023
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;It takes twice as long to eliminate the second player as it takes to eliminate the first, and the remaining two players have to go for another 600 spins.&lt;/p&gt;
    &lt;p&gt;Dreidel is a bad game.&lt;/p&gt;
    &lt;h2&gt;The future&lt;/h2&gt;
    &lt;p&gt;There's two things I want to do next with this model. The first is script up something that can generate the PRISM model for me, so I can easily adjust the number of players to 3 or 5. The second is that PRISM has a &lt;a href="https://www.prismmodelchecker.org/manual/PropertySpecification/Filters" target="_blank"&gt;filter-query&lt;/a&gt; feature I don't understand but I &lt;em&gt;think&lt;/em&gt; it could be used for things like "if a player gets 75% of the pot, what's the probability they lose anyway". Otherwise you have to write wonky queries like &lt;code&gt;(P =? [F p1 = 30 &amp; (F p1 = 0)]) / (P =? [F p1 = 0])&lt;/code&gt;.&lt;sup id="fnref:lose"&gt;&lt;a class="footnote-ref" href="#fn:lose"&gt;4&lt;/a&gt;&lt;/sup&gt; But I'm out of time again, so this saga will have to conclude next year.&lt;/p&gt;
    &lt;p&gt;I'm also faced with the terrible revelation that I might be the biggest non-academic user of PRISM.&lt;/p&gt;
    &lt;hr/&gt;
    &lt;h4&gt;&lt;em&gt;Logic for Programmers&lt;/em&gt; Khanukah Sale&lt;/h4&gt;
    &lt;p&gt;Still going on! You can get &lt;em&gt;LFP&lt;/em&gt; for &lt;a href="https://leanpub.com/logic/c/hannukah-presents" target="_blank"&gt;40% off here&lt;/a&gt; from now until the end of Xannukkah (Jan 2).&lt;sup id="fnref:joke"&gt;&lt;a class="footnote-ref" href="#fn:joke"&gt;5&lt;/a&gt;&lt;/sup&gt;&lt;/p&gt;
    &lt;h4&gt;I'm in the Raku Advent Calendar!&lt;/h4&gt;
    &lt;p&gt;My piece is called &lt;a href="https://raku-advent.blog/2024/12/11/day-11-counting-up-concurrency/" target="_blank"&gt;counting up concurrencies&lt;/a&gt;. It's about using Raku to do some combinatorics! Read the rest of the blog too, it's great.&lt;/p&gt;
    &lt;div class="footnote"&gt;
    &lt;hr/&gt;
    &lt;ol&gt;
    &lt;li id="fn:ben"&gt;
    &lt;p&gt;This is different from the &lt;a href="https://www.slate.com/articles/life/holidays/2014/12/rules_of_dreidel_the_hannukah_game_is_way_too_slow_let_s_speed_it_up.html" target="_blank"&gt;original anti-Dreidel article&lt;/a&gt;: Ben got &lt;em&gt;860&lt;/em&gt; spins. That's the average spins if you round &lt;em&gt;down&lt;/em&gt; on He, not up. Rounding up on He leads to a shorter game because it means He can empty the pot, which means more antes, and antes are what knock most players out. &lt;a class="footnote-backref" href="#fnref:ben" title="Jump back to footnote 1 in the text"&gt;↩&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id="fn:co-safe"&gt;
    &lt;p&gt;PRISM calls this &lt;a href="https://www.prismmodelchecker.org/manual/PropertySpecification/Reward-basedProperties" target="_blank"&gt;"co-safe LTL reward"&lt;/a&gt; and does &lt;em&gt;not&lt;/em&gt; explain what that means, nor do most of the papers I found referencing "co-safe LTL". &lt;a href="https://mengguo.github.io/personal_site/papers/pdf/guo2016task.pdf" target="_blank"&gt;Eventually&lt;/a&gt; I found one that defined it as "any property that only uses X, U, F". &lt;a class="footnote-backref" href="#fnref:co-safe" title="Jump back to footnote 2 in the text"&gt;↩&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id="fn:ante_left"&gt;
    &lt;p&gt;Here's the exact point where I realize I could have defined &lt;code&gt;done&lt;/code&gt; as &lt;code&gt;ante_left = 1&lt;/code&gt;. Also checking for &lt;code&gt;F (ante_left = 2)&lt;/code&gt; gives an expected number of spins as "infinity". I have no idea why. &lt;a class="footnote-backref" href="#fnref:ante_left" title="Jump back to footnote 3 in the text"&gt;↩&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id="fn:lose"&gt;
    &lt;p&gt;10% chances at 4 players / 10 coins. And it takes a minute even &lt;em&gt;with&lt;/em&gt; fast mode enabled. &lt;a class="footnote-backref" href="#fnref:lose" title="Jump back to footnote 4 in the text"&gt;↩&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id="fn:joke"&gt;
    &lt;p&gt;This joke was funnier before I made the whole newsletter about Chanukahh. &lt;a class="footnote-backref" href="#fnref:joke" title="Jump back to footnote 5 in the text"&gt;↩&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;/ol&gt;
    &lt;/div&gt;
    </description><pubDate>Wed, 18 Dec 2024 16:58:59 +0000</pubDate><guid>https://buttondown.com/hillelwayne/archive/formally-modeling-dreidel-the-sequel/</guid></item><item><title>Stroustrup's Rule</title><link>https://buttondown.com/hillelwayne/archive/stroustrups-rule/</link><description>
    &lt;p&gt;Just finished two weeks of workshops and am &lt;em&gt;exhausted&lt;/em&gt;, so this one will be light. &lt;/p&gt;
    &lt;h3&gt;Hanuka Sale&lt;/h3&gt;
    &lt;p&gt;&lt;em&gt;Logic for Programmers&lt;/em&gt; is on sale until the end of Chanukah! That's Jan 2nd if you're not Jewish. &lt;a href="https://leanpub.com/logic/c/hannukah-presents" target="_blank"&gt;Get it for 40% off here&lt;/a&gt;.&lt;/p&gt;
    &lt;h1&gt;Stroustrup's Rule&lt;/h1&gt;
    &lt;p&gt;I first encountered &lt;strong&gt;Stroustrup's Rule&lt;/strong&gt; on this &lt;a href="https://web.archive.org/web/20240914141601/https:/www.thefeedbackloop.xyz/stroustrups-rule-and-layering-over-time/" target="_blank"&gt;defunct webpage&lt;/a&gt;:&lt;/p&gt;
    &lt;blockquote&gt;
    &lt;p&gt;One of my favorite insights about syntax design appeared in a &lt;a href="https://learn.microsoft.com/en-us/shows/lang-next-2014/keynote" target="_blank"&gt;retrospective on C++&lt;/a&gt;&lt;sup id="fnref:timing"&gt;&lt;a class="footnote-ref" href="#fn:timing"&gt;1&lt;/a&gt;&lt;/sup&gt; by Bjarne Stroustrup:&lt;/p&gt;
    &lt;ul&gt;
    &lt;li&gt;For new features, people insist on &lt;strong&gt;LOUD&lt;/strong&gt; explicit syntax. &lt;/li&gt;
    &lt;li&gt;For established features, people want terse notation.&lt;/li&gt;
    &lt;/ul&gt;
    &lt;/blockquote&gt;
    &lt;p&gt;The blogger gives the example of option types in Rust. Originally, the idea of using option types to store errors was new for programmers, so the syntax for passing an error was very explicit:&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="kd"&gt;let&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;file&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="k"&gt;match&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;File&lt;/span&gt;&lt;span class="p"&gt;::&lt;/span&gt;&lt;span class="n"&gt;open&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"file.txt"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="w"&gt;    &lt;/span&gt;&lt;span class="nb"&gt;Ok&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;file&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;=&gt;&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;file&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="w"&gt;    &lt;/span&gt;&lt;span class="nb"&gt;Err&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;err&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;=&gt;&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="k"&gt;return&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nb"&gt;Err&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;err&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;Once people were more familiar with it, Rust added the &lt;code&gt;try!&lt;/code&gt; macro to reduce boilerplate, and finally the &lt;a href="https://github.com/rust-lang/rfcs/blob/master/text/0243-trait-based-exception-handling.md" target="_blank"&gt;&lt;code&gt;?&lt;/code&gt; operator&lt;/a&gt; to streamline error handling further.&lt;/p&gt;
    &lt;p&gt;I see this as a special case of &lt;a href="http://teachtogether.tech/en/index.html#s:models" target="_blank"&gt;mental model development&lt;/a&gt;: when a feature is new to you, you don't have an internal mental model so need all of the explicit information you can get. Once you're familiar with it, explicit syntax is visual clutter and hinders how quickly you can parse out information.&lt;/p&gt;
    &lt;p&gt;(One example I like: which is more explicit, &lt;code&gt;user_id&lt;/code&gt; or &lt;code&gt;user_identifier&lt;/code&gt;? Which do experienced programmers prefer?)&lt;/p&gt;
    &lt;p&gt;What's interesting is that it's often the &lt;em&gt;same people&lt;/em&gt; on both sides of the spectrum. Beginners need explicit syntax, and as they become experts, they prefer terse syntax. &lt;/p&gt;
    &lt;p&gt;The rule applies to the overall community, too. At the beginning of a language's life, everybody's a beginner. Over time the ratio of experts to beginners changes, and this leads to more focus on "expert-friendly" features, like terser syntax.&lt;/p&gt;
    &lt;p&gt;This can make it harder for beginners to learn the language. There was a lot of drama in Python over the &lt;a href="https://peps.python.org/pep-0572/" target="_blank"&gt;"walrus" assignment operator&lt;/a&gt;:&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="c1"&gt;# Without walrus&lt;/span&gt;
    &lt;span class="n"&gt;val&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;dict&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;key&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="c1"&gt;# `None` if key absent&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;val&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="nb"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;val&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    
    
    &lt;span class="c1"&gt;# With walrus&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;val&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="nb"&gt;dict&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;key&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="nb"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;val&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;Experts supported it because it made code more elegant, teachers and beginners opposed it because it made the language harder to learn. Explicit syntax vs terse notation.&lt;/p&gt;
    &lt;p&gt;Does this lead to languages bloating over time?&lt;/p&gt;
    &lt;h3&gt;In Teaching&lt;/h3&gt;
    &lt;p&gt;I find that when I teach language workshops I have to actively work against Stroustrup's Rule. The terse notation that's easiest for &lt;em&gt;me&lt;/em&gt; to read is bad for beginners, who need the explicit syntax that I find grating.&lt;/p&gt;
    &lt;p&gt;One good example is type invariants in TLA+. Say you have a set of workers, and each worker has a counter. Here's two ways to say that every worker's counter is a non-negative integer:&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;\* Bad
    \A w \in Workers: counter[w] &gt;= 0
    
    \* Good
    counter \in [Workers -&gt; Nat]
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;The first way literally tests that for every worker, &lt;code&gt;counter[w]&lt;/code&gt; is non-negative. The second way tests that the &lt;code&gt;counter&lt;/code&gt; mapping as a whole is an element of the appropriate "function set"— all functions between workers and natural numbers.&lt;/p&gt;
    &lt;p&gt;The function set approach is terser, more elegant, and preferred by TLA+ experts. But I teach the "bad" way because it makes more sense to beginners.&lt;/p&gt;
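    &lt;p&gt;For readers more at home in Python than TLA+, the two invariants can be sketched roughly as below. This is only an analogy: &lt;code&gt;workers&lt;/code&gt;, &lt;code&gt;counter&lt;/code&gt;, and the helper names are made up for illustration, and Python has no built-in function set, so the second check is spelled out by hand as "the keys are exactly the workers and every value is a natural number".&lt;/p&gt;

```python
# Hypothetical example data: a set of workers and one counter per worker.
workers = {"w1", "w2", "w3"}
counter = {"w1": 0, "w2": 3, "w3": 7}


def is_nat(x):
    """True for non-negative ints (bool is excluded on purpose)."""
    return isinstance(x, int) and not isinstance(x, bool) and x >= 0


def elementwise_invariant(counter, workers):
    """The explicit style: for every worker, counter[w] >= 0."""
    return all(counter[w] >= 0 for w in workers)


def function_set_invariant(counter, workers):
    """The terse style: counter is an element of [workers -> Nat]."""
    same_domain = set(counter) == set(workers)
    return same_domain and all(is_nat(v) for v in counter.values())
```

    &lt;p&gt;In this sketch the function-set version also pins down the domain of &lt;code&gt;counter&lt;/code&gt;, which the element-by-element version leaves implicit.&lt;/p&gt;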
    &lt;div class="footnote"&gt;
    &lt;hr/&gt;
    &lt;ol&gt;
    &lt;li id="fn:timing"&gt;
    &lt;p&gt;Starts minute 23. &lt;a class="footnote-backref" href="#fnref:timing" title="Jump back to footnote 1 in the text"&gt;↩&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;/ol&gt;
    &lt;/div&gt;
    </description><pubDate>Wed, 11 Dec 2024 17:32:53 +0000</pubDate><guid>https://buttondown.com/hillelwayne/archive/stroustrups-rule/</guid></item><item><title>Hyperproperties</title><link>https://buttondown.com/hillelwayne/archive/hyperproperties/</link><description>
    &lt;p&gt;I wrote about &lt;a href="https://hillelwayne.com/post/hyperproperties/" target="_blank"&gt;hyperproperties on my blog&lt;/a&gt; four years ago, but now an intriguing client problem got me thinking about them again.&lt;sup id="fnref:client"&gt;&lt;a class="footnote-ref" href="#fn:client"&gt;1&lt;/a&gt;&lt;/sup&gt;&lt;/p&gt;
    &lt;p&gt;We're using TLA+ to model a system that starts in state A, and under certain complicated conditions &lt;code&gt;P&lt;/code&gt;, transitions to state B. They also had a flag &lt;code&gt;f&lt;/code&gt; that, when set, used a different complicated condition &lt;code&gt;Q&lt;/code&gt; to check the transitions. As a quick &lt;a href="https://www.hillelwayne.com/post/decision-tables/" target="_blank"&gt;decision table&lt;/a&gt; (from state &lt;code&gt;A&lt;/code&gt;):&lt;/p&gt;
    &lt;table&gt;
    &lt;thead&gt;
    &lt;tr&gt;
    &lt;th&gt;f&lt;/th&gt;
    &lt;th&gt;P&lt;/th&gt;
    &lt;th&gt;Q&lt;/th&gt;
    &lt;th&gt;state'&lt;/th&gt;
    &lt;/tr&gt;
    &lt;/thead&gt;
    &lt;tbody&gt;
    &lt;tr&gt;
    &lt;td&gt;F&lt;/td&gt;
    &lt;td&gt;F&lt;/td&gt;
    &lt;td&gt;-&lt;/td&gt;
    &lt;td&gt;A&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
    &lt;td&gt;F&lt;/td&gt;
    &lt;td&gt;T&lt;/td&gt;
    &lt;td&gt;-&lt;/td&gt;
    &lt;td&gt;B&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
    &lt;td&gt;T&lt;/td&gt;
    &lt;td&gt;F&lt;/td&gt;
    &lt;td&gt;F&lt;/td&gt;
    &lt;td&gt;A&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
    &lt;td&gt;T&lt;/td&gt;
    &lt;td&gt;F&lt;/td&gt;
    &lt;td&gt;T&lt;/td&gt;
    &lt;td&gt;B&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
    &lt;td&gt;T&lt;/td&gt;
    &lt;td&gt;T&lt;/td&gt;
    &lt;td&gt;F&lt;/td&gt;
    &lt;td&gt;&lt;strong&gt;impossible&lt;/strong&gt;&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
    &lt;td&gt;T&lt;/td&gt;
    &lt;td&gt;T&lt;/td&gt;
    &lt;td&gt;T&lt;/td&gt;
    &lt;td&gt;B&lt;/td&gt;
    &lt;/tr&gt;
    &lt;/tbody&gt;
    &lt;/table&gt;
    &lt;p&gt;The interesting bit is the second-to-last row: Q has to be &lt;em&gt;strictly&lt;/em&gt; more permissive than P. The client wanted to verify the property that "the system more aggressively transitions when &lt;code&gt;f&lt;/code&gt; is set", ie there is no case where the machine transitions &lt;em&gt;only if &lt;code&gt;f&lt;/code&gt; is false&lt;/em&gt;.&lt;/p&gt;
    &lt;p&gt;&lt;a href="https://www.hillelwayne.com/post/safety-and-liveness/" target="_blank"&gt;Regular system properties&lt;/a&gt; are specified over states in a single sequence of states (behaviors). &lt;strong&gt;Hyperproperties&lt;/strong&gt; can hold over &lt;em&gt;sets&lt;/em&gt; of sequences of states. Here the hyperproperties are:&lt;/p&gt;
    &lt;blockquote&gt;
    &lt;ol&gt;
    &lt;li&gt;For any two states X and Y in separate behaviors, if the only difference in variable-state between X and Y is that &lt;code&gt;X.f = TRUE&lt;/code&gt;, then whenever Y transitions to B, so does X.&lt;/li&gt;
    &lt;li&gt;There is at least one such case where X transitions and Y does not.&lt;/li&gt;
    &lt;/ol&gt;
    &lt;/blockquote&gt;
    &lt;p&gt;That's pretty convoluted, which is par for the course with hyperproperties! It makes a little more sense if you have all of the domain knowledge and specifics. &lt;/p&gt;
    &lt;p&gt;The key thing that makes this a hyperproperty is that you can't &lt;em&gt;just&lt;/em&gt; look at individual behaviors to verify it. Imagine if, when &lt;code&gt;f&lt;/code&gt; is true, we &lt;em&gt;never&lt;/em&gt; transition to state B. Is that a violation of (1)? Not if we never transition when &lt;code&gt;f&lt;/code&gt; is false either! To prove a violation, you need to find a behavior where &lt;code&gt;f&lt;/code&gt; is false &lt;em&gt;and&lt;/em&gt; the state is otherwise the same &lt;em&gt;and&lt;/em&gt; we transition to B anyway.&lt;/p&gt;
    &lt;h4&gt;Aside: states in states in states&lt;/h4&gt;
    &lt;p&gt;I dislike how "state" refers to three things:&lt;/p&gt;
    &lt;ol&gt;
    &lt;li&gt;The high-level "transition state" of a state-machine&lt;/li&gt;
    &lt;li&gt;A single point in time of a system (the "state space")&lt;/li&gt;
    &lt;li&gt;The mutable data inside your system's &lt;a href="https://www.hillelwayne.com/post/world-vs-machine/" target="_blank"&gt;machine&lt;/a&gt;.&lt;/li&gt;
    &lt;/ol&gt;
    &lt;p&gt;These are all "close" to each other but &lt;em&gt;just&lt;/em&gt; different enough to make conversations confusing. Software is pretty bad about reusing colloquial words like this; don't even get me &lt;em&gt;started&lt;/em&gt; on the word "design".&lt;/p&gt;
    &lt;h3&gt;There's a reason we don't talk about hyperproperties&lt;/h3&gt;
    &lt;p&gt;Or three reasons. First of all, hyperproperties make up a &lt;em&gt;vanishingly small&lt;/em&gt; percentage of the stuff in a system we care about. We only got to "&lt;code&gt;f&lt;/code&gt; makes the system more aggressive" after checking at least a dozen other simpler and &lt;em&gt;more important&lt;/em&gt; not-hyper properties.&lt;/p&gt;
    &lt;p&gt;Second, &lt;em&gt;most&lt;/em&gt; formal specification languages can't express hyperproperties, and the ones that can are all academic research projects. Modeling systems is hard enough without a generalized behavior notation!&lt;/p&gt;
    &lt;p&gt;Third, hyperproperties are astoundingly expensive to check. As an informal estimation, for a state space of size &lt;code&gt;N&lt;/code&gt; regular properties are checked across &lt;code&gt;N&lt;/code&gt; individual states and 2-behavior hyperproperties (2-props) are checked across &lt;code&gt;N²&lt;/code&gt; pairs. So for a small state space of just a million states, the 2-prop needs to be checked across a &lt;em&gt;trillion&lt;/em&gt; pairs. &lt;/p&gt;
    &lt;p&gt;These problems don't apply to "hyperproperties" of functions, just systems. Functions have a lot of interesting hyperproperties, there's an easy way to represent them (call the function twice in a test), and quadratic scaling isn't so bad if you're only testing 100 inputs or so. That's why so-called &lt;a href="https://www.hillelwayne.com/post/metamorphic-testing/" target="_blank"&gt;metamorphic testing&lt;/a&gt; of functions can be useful.&lt;/p&gt;
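    &lt;p&gt;As a rough sketch of what "call the function twice in a test" can look like, here is a small metamorphic check in Python. The relation chosen (shuffling the input must not change the output) and the helper names are illustrative assumptions, not examples taken from the linked post.&lt;/p&gt;

```python
import random


def check_shuffle_invariance(f, xs, rng):
    """Metamorphic relation: f(xs) must equal f(a shuffled copy of xs)."""
    shuffled = list(xs)
    rng.shuffle(shuffled)
    # Two calls to the same function, compared against each other
    # rather than against a known-correct answer.
    return f(xs) == f(shuffled)


def run_checks(f, trials=100, seed=0):
    """Try the relation on `trials` random integer lists."""
    rng = random.Random(seed)
    for _ in range(trials):
        xs = [rng.randint(-50, 50) for _ in range(rng.randint(0, 20))]
        if not check_shuffle_invariance(f, xs, rng):
            return False
    return True
```

    &lt;p&gt;Under these assumptions &lt;code&gt;run_checks(sorted)&lt;/code&gt; should pass, while a function that preserves input order, such as &lt;code&gt;list&lt;/code&gt;, should fail the relation.&lt;/p&gt;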
    &lt;h3&gt;Checking Hyperproperties Anyway&lt;/h3&gt;
    &lt;p&gt;If we &lt;em&gt;do&lt;/em&gt; need to check a hyperproperty, there's a few ways we can approach it. &lt;/p&gt;
    &lt;p&gt;The easiest way is to cheat and find a regular prop that implies the hyperproperty. In the client's case, we can abstract &lt;code&gt;P&lt;/code&gt; and &lt;code&gt;Q&lt;/code&gt; into pure functions and then test that there's no input where &lt;code&gt;P&lt;/code&gt; is true and &lt;code&gt;Q&lt;/code&gt; is false. In TLA+, this would look something like&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;\* TLA+
    QLooserThanP ==
      \A i1 \in InputSet1, i2 \in InputSet2: \* ...
        P(i1, i2, …) =&gt; Q(i1, i2, …)
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;Of course we can't always encapsulate this way, and this can't catch bugs like "we accidentally use &lt;code&gt;P&lt;/code&gt; even if &lt;code&gt;f&lt;/code&gt; is true". But it gets the job done.&lt;/p&gt;
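    &lt;p&gt;The same "cheat" can be sketched outside TLA+ as an exhaustive check over small input sets. Everything here is hypothetical: &lt;code&gt;p&lt;/code&gt; and &lt;code&gt;q&lt;/code&gt; are stand-ins for the real conditions, and the input ranges are arbitrary.&lt;/p&gt;

```python
from itertools import product


def find_counterexamples(p, q, *input_sets):
    """Return every input tuple where p holds but q does not.

    An empty list means p(x) implies q(x) for all x in the product
    of the given input sets.
    """
    return [args for args in product(*input_sets) if p(*args) and not q(*args)]


# Hypothetical stand-ins for the two transition conditions.
def p(load, retries):
    return load > 8 and retries >= 2


def q(load, retries):
    return load > 5 or retries >= 2


counterexamples = find_counterexamples(p, q, range(11), range(4))
```

    &lt;p&gt;With these stand-ins &lt;code&gt;counterexamples&lt;/code&gt; comes back empty, and swapping the arguments finds inputs where &lt;code&gt;q&lt;/code&gt; holds and &lt;code&gt;p&lt;/code&gt; does not, which is the "strictly looser" half of the property.&lt;/p&gt;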
    &lt;p&gt;Another way is something I talked about in the &lt;a href="https://hillelwayne.com/post/hyperproperties/" target="_blank"&gt;original hyperproperty post&lt;/a&gt;: lifting specs into hyperspecs. We create a new spec that initializes two copies of our main spec, runs them in parallel, and then compares their behaviors. See the post for an example. Writing a hyperspec keeps us entirely in TLA+ but takes a lot of work and is &lt;em&gt;very&lt;/em&gt; expensive to check. Depending on the property we want to check, we can sometimes find simple optimizations.&lt;/p&gt;
    &lt;p&gt;The last way is something &lt;a href="https://hillelwayne.com/post/graphing-tla/" target="_blank"&gt;I explored last year&lt;/a&gt;: dump the state graph to disk and treat the hyperproperty as a graph property. In this case, the graph property would be something like &lt;/p&gt;
    &lt;blockquote&gt;
    &lt;p&gt;Find all graph edges representing an A → B transition. Take all the source nodes of each where &lt;code&gt;f = false&lt;/code&gt;. For each such source node, find the corresponding node that's identical except for &lt;code&gt;f = true&lt;/code&gt;. That node should be the source of an A → B edge.&lt;/p&gt;
    &lt;/blockquote&gt;
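    &lt;p&gt;A minimal sketch of that graph check in Python, assuming the state graph has already been loaded into a plain list of edges. The node shape &lt;code&gt;(state, f, rest)&lt;/code&gt; is an assumption made for illustration and is not the format of any real state-graph dump.&lt;/p&gt;

```python
def violations(edges):
    """Find f=False sources of an A -> B edge whose f=True twin has none.

    edges: iterable of (source, target) pairs. Each node is a
    (state, f, rest) tuple, where `rest` stands in for every other
    variable in the spec.
    """
    a_to_b_sources = {
        src for src, dst in edges if src[0] == "A" and dst[0] == "B"
    }
    bad = []
    for state, f, rest in sorted(a_to_b_sources):
        if f:
            continue
        # The node that is identical except for f = True.
        twin = (state, True, rest)
        if twin not in a_to_b_sources:
            bad.append((state, f, rest))
    return bad


# Hand-built example graphs (hypothetical).
edges_ok = [
    (("A", False, 1), ("B", False, 1)),
    (("A", True, 1), ("B", True, 1)),
    (("A", True, 2), ("B", True, 2)),
]
edges_bad = [
    (("A", False, 1), ("B", False, 1)),
    (("A", True, 1), ("A", True, 1)),
]
```

    &lt;p&gt;Here &lt;code&gt;violations(edges_ok)&lt;/code&gt; is empty, while &lt;code&gt;violations(edges_bad)&lt;/code&gt; reports the node that transitions only when &lt;code&gt;f&lt;/code&gt; is false.&lt;/p&gt;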
    &lt;p&gt;Upside is you don't have to make any changes to the original spec. Downside is you have to use another programming language for analysis. Also, &lt;a href="https://hillelwayne.com/post/graph-types/" target="_blank"&gt;analyzing graphs is terrible&lt;/a&gt;. But I think this is overall the most robust approach to handling hyperproperties, to be used when "cheating" fails.&lt;/p&gt;
    &lt;hr/&gt;
    &lt;p&gt;What fascinates me most about this is the four-year gap between "I learned and wrote about hyperproperties" and "I have to deal with hyperproperties in my job." This is one reason learning for the sake of learning can have a lot of long-term benefits.&lt;/p&gt;
    &lt;hr/&gt;
    &lt;h3&gt;Blog Rec&lt;/h3&gt;
    &lt;p&gt;This week's rec is &lt;a href="https://robertheaton.com/" target="_blank"&gt;Robert Heaton&lt;/a&gt;. It's a "general interest" software engineering blog with a focus on math, algorithms, and security. Some of my favorites:&lt;/p&gt;
    &lt;ul&gt;
    &lt;li&gt;&lt;a href="https://robertheaton.com/preventing-impossible-game-levels-using-cryptography/" target="_blank"&gt;Preventing impossible game levels using cryptography&lt;/a&gt; and the whole "Steve Steveington" series&lt;/li&gt;
    &lt;li&gt;&lt;a href="https://robertheaton.com/2019/06/24/i-was-7-words-away-from-being-spear-phished/" target="_blank"&gt;I was 7 words away from being spear-phished&lt;/a&gt; is a great deep dive into one targeted scam&lt;/li&gt;
    &lt;li&gt;&lt;a href="https://robertheaton.com/2019/02/24/making-peace-with-simpsons-paradox/" target="_blank"&gt;Making peace with Simpson's Paradox&lt;/a&gt; is the best explanation of Simpson's Paradox I've ever read.&lt;/li&gt;
    &lt;/ul&gt;
    &lt;p&gt;Other good ones are &lt;a href="https://robertheaton.com/pyskywifi/" target="_blank"&gt;PySkyWiFi: completely free, unbelievably stupid wi-fi on long-haul flights&lt;/a&gt; and &lt;a href="https://robertheaton.com/interview/" target="_blank"&gt;How to pass a coding interview with me&lt;/a&gt;. The guy's got &lt;em&gt;breadth&lt;/em&gt;.&lt;/p&gt;
    &lt;div class="footnote"&gt;
    &lt;hr/&gt;
    &lt;ol&gt;
    &lt;li id="fn:client"&gt;
    &lt;p&gt;I do formal methods consulting btw. &lt;a href="https://www.hillelwayne.com/consulting/" target="_blank"&gt;Hire me!&lt;/a&gt; &lt;a class="footnote-backref" href="#fnref:client" title="Jump back to footnote 1 in the text"&gt;↩&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;/ol&gt;
    &lt;/div&gt;
    </description><pubDate>Tue, 19 Nov 2024 19:34:54 +0000</pubDate><guid>https://buttondown.com/hillelwayne/archive/hyperproperties/</guid></item><item><title>Five Unusual Raku Features</title><link>https://buttondown.com/hillelwayne/archive/five-unusual-raku-features/</link><description>
    &lt;h3&gt;&lt;a href="https://leanpub.com/logic/" target="_blank"&gt;&lt;em&gt;Logic for Programmers&lt;/em&gt;&lt;/a&gt; is now in Beta!&lt;/h3&gt;
    &lt;p&gt;&lt;a href="https://leanpub.com/logic/" target="_blank"&gt;v0.5 marks the official end of alpha&lt;/a&gt;! With the new version, all of the content I wanted to put in the book is now present, and all that's left is copyediting, proofreading, and formatting. Which will probably take as long as it took to actually write the book. You can see the release notes in the footnote.&lt;sup id="fnref:release-notes"&gt;&lt;a class="footnote-ref" href="#fn:release-notes"&gt;1&lt;/a&gt;&lt;/sup&gt;&lt;/p&gt;
    &lt;p&gt;And I've got a snazzy new cover:&lt;/p&gt;
    &lt;p&gt;&lt;img alt="The logic for programmers cover, a 40x zoom of a bird feather" class="newsletter-image" src="https://assets.buttondown.email/images/26c75f1e-e60a-4328-96e5-9878d96d3e53.png?w=960&amp;fit=max"/&gt;&lt;/p&gt;
    &lt;p&gt;(I don't actually like the cover that much but it &lt;em&gt;looks&lt;/em&gt; official enough until I can pay an actual cover designer.)&lt;/p&gt;
    &lt;h1&gt;"Five" Unusual Raku Features&lt;/h1&gt;
    &lt;p&gt;Last year I started learning Raku, and the sheer bizarreness of the language left me describing it as &lt;a href="https://buttondown.com/hillelwayne/archive/raku-a-language-for-gremlins/" target="_blank"&gt;a language for gremlins&lt;/a&gt;. Now that I've used it in anger for over a year, I have a better way of describing it:&lt;/p&gt;
    &lt;blockquote&gt;
    &lt;p&gt;Raku is a laboratory for language features.&lt;/p&gt;
    &lt;/blockquote&gt;
    &lt;p&gt;This is why it has &lt;a href="https://docs.raku.org/language/concurrency" target="_blank"&gt;five different models of concurrency&lt;/a&gt; and eighteen ways of doing anything else, because the point is to &lt;em&gt;see&lt;/em&gt; what happens. It also explains why many of the features interact so strangely and why there's all that odd edge-case behavior. Getting 100 experiments polished and playing nicely with each other is much harder than running 100 experiments; we can sort out the polish &lt;em&gt;after&lt;/em&gt; we figure out which ideas are good ones.&lt;/p&gt;
    &lt;p&gt;So here are "five" Raku experiments you could imagine seeing in another programming language. If you squint.&lt;/p&gt;
    &lt;h3&gt;&lt;a href="https://docs.raku.org/type/Junction" target="_blank"&gt;Junctions&lt;/a&gt;&lt;/h3&gt;
    &lt;p&gt;Junctions are "superpositions of possible values". Applying an operation to a junction instead applies it to every value inside the junction.  &lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="o"&gt;|&lt;/span&gt;&lt;span class="mi"&gt;10&lt;/span&gt;
    &lt;span class="nb"&gt;any&lt;/span&gt;(&lt;span class="mi"&gt;2&lt;/span&gt;, &lt;span class="mi"&gt;10&lt;/span&gt;)
    
    &gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="nv"&gt;&amp;10&lt;/span&gt; + &lt;span class="mi"&gt;3&lt;/span&gt;
    &lt;span class="nb"&gt;all&lt;/span&gt;(&lt;span class="mi"&gt;5&lt;/span&gt;, &lt;span class="mi"&gt;13&lt;/span&gt;)
    
    &gt;(&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="nv"&gt;&amp;2&lt;/span&gt;) + (&lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="o"&gt;^&lt;/span&gt;&lt;span class="mi"&gt;20&lt;/span&gt;)
    &lt;span class="nb"&gt;all&lt;/span&gt;(&lt;span class="nb"&gt;one&lt;/span&gt;(&lt;span class="mi"&gt;11&lt;/span&gt;, &lt;span class="mi"&gt;21&lt;/span&gt;), &lt;span class="nb"&gt;one&lt;/span&gt;(&lt;span class="mi"&gt;12&lt;/span&gt;, &lt;span class="mi"&gt;22&lt;/span&gt;))
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;As you can probably tell from the &lt;code&gt;all&lt;/code&gt;s and &lt;code&gt;any&lt;/code&gt;s, junctions are a feature meant for representing boolean formulas. There's no way to destructure a junction, and the only way to use it is to collapse it to a boolean first.&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&gt; (&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="nv"&gt;&amp;2&lt;/span&gt;) + (&lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="o"&gt;^&lt;/span&gt;&lt;span class="mi"&gt;20&lt;/span&gt;) &lt; &lt;span class="mi"&gt;15&lt;/span&gt;
    &lt;span class="nb"&gt;all&lt;/span&gt;(&lt;span class="nb"&gt;one&lt;/span&gt;(&lt;span class="nb"&gt;True&lt;/span&gt;, &lt;span class="nb"&gt;False&lt;/span&gt;), &lt;span class="nb"&gt;one&lt;/span&gt;(&lt;span class="nb"&gt;True&lt;/span&gt;, &lt;span class="nb"&gt;False&lt;/span&gt;))
    
    &lt;span class="c1"&gt;# so coerces junctions to booleans&lt;/span&gt;
    &gt; &lt;span class="nb"&gt;so&lt;/span&gt; (&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="nv"&gt;&amp;2&lt;/span&gt;) + (&lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="o"&gt;^&lt;/span&gt;&lt;span class="mi"&gt;20&lt;/span&gt;) &lt; &lt;span class="mi"&gt;15&lt;/span&gt;
    &lt;span class="nb"&gt;True&lt;/span&gt;
    
    &gt; &lt;span class="nb"&gt;so&lt;/span&gt; (&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="nv"&gt;&amp;2&lt;/span&gt;) + (&lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="o"&gt;^&lt;/span&gt;&lt;span class="mi"&gt;20&lt;/span&gt;) &gt; &lt;span class="mi"&gt;0&lt;/span&gt;
    &lt;span class="nb"&gt;False&lt;/span&gt;
    
    &gt; &lt;span class="mi"&gt;16&lt;/span&gt; %% (&lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="nv"&gt;&amp;5&lt;/span&gt;) ?? &lt;span class="s"&gt;"fizzbuzz"&lt;/span&gt; !! *
    *
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;The real interesting thing for me is how Raku elegantly uses junctions to represent quantifiers. In most languages, you either have the function &lt;code&gt;all(list[T], T -&gt; bool)&lt;/code&gt; or the method &lt;code&gt;[T].all(T -&gt; bool)&lt;/code&gt;, both of which apply the test to every element of the list. In Raku, though, &lt;code&gt;list.all&lt;/code&gt; doesn't take &lt;em&gt;anything&lt;/em&gt;, it's just a niladic method that turns the list into a junction. &lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&gt; &lt;span class="k"&gt;my&lt;/span&gt; &lt;span class="nv"&gt;$x&lt;/span&gt; = &lt;span class="s"&gt;&lt;1 2 3&gt;&lt;/span&gt;.&lt;span class="nb"&gt;all&lt;/span&gt;
    &lt;span class="nb"&gt;all&lt;/span&gt;(&lt;span class="mi"&gt;1&lt;/span&gt;, &lt;span class="mi"&gt;2&lt;/span&gt;, &lt;span class="mi"&gt;3&lt;/span&gt;)
    &gt; &lt;span class="nb"&gt;is-prime&lt;/span&gt;(&lt;span class="nv"&gt;$x&lt;/span&gt;)
    &lt;span class="nb"&gt;all&lt;/span&gt;(&lt;span class="nb"&gt;False&lt;/span&gt;, &lt;span class="nb"&gt;True&lt;/span&gt;, &lt;span class="nb"&gt;True&lt;/span&gt;)
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;This means we can combine junctions. If Raku didn't already have a &lt;code&gt;unique&lt;/code&gt; method, we could build it by saying "are all elements equal to exactly one element?"&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&gt; &lt;span class="nb"&gt;so&lt;/span&gt; {.&lt;span class="nb"&gt;all&lt;/span&gt; == .&lt;span class="nb"&gt;one&lt;/span&gt;}(&lt;span class="s"&gt;&lt;1 2 3 7&gt;&lt;/span&gt;)
    &lt;span class="nb"&gt;True&lt;/span&gt;
    
    &gt; &lt;span class="nb"&gt;so&lt;/span&gt; {.&lt;span class="nb"&gt;all&lt;/span&gt; == .&lt;span class="nb"&gt;one&lt;/span&gt;}(&lt;span class="s"&gt;&lt;1 2 3 7 2&gt;&lt;/span&gt;)
    &lt;span class="nb"&gt;False&lt;/span&gt;
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
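    &lt;p&gt;For comparison, here is roughly what &lt;code&gt;.all == .one&lt;/code&gt; spells out to without junctions: two nested quantifiers. This is only a Python sketch of the logic, not how Raku evaluates junctions, and the name &lt;code&gt;all_equal_exactly_one&lt;/code&gt; is made up for illustration.&lt;/p&gt;

```python
def all_equal_exactly_one(items):
    """True when every element equals exactly one element of the list (itself)."""
    # The outer all(...) plays the role of `.all`;
    # counting the matches and requiring exactly 1 plays the role of `.one`.
    return all(sum(1 for y in items if x == y) == 1 for x in items)


print(all_equal_exactly_one([1, 2, 3, 7]))     # True
print(all_equal_exactly_one([1, 2, 3, 7, 2]))  # False
```

    &lt;p&gt;The junction version says the same thing with the quantifiers attached to the data instead of wrapped around the comparison.&lt;/p&gt;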
    &lt;h3&gt;&lt;a href="https://docs.raku.org/type/Whatever" target="_blank"&gt;Whatevers&lt;/a&gt;&lt;/h3&gt;
    &lt;p&gt;&lt;code&gt;*&lt;/code&gt; is the "whatever" symbol and has a lot of different roles in Raku.&lt;sup id="fnref:analogs"&gt;&lt;a class="footnote-ref" href="#fn:analogs"&gt;2&lt;/a&gt;&lt;/sup&gt; Some functions and operators have special behavior when passed a &lt;code&gt;*&lt;/code&gt;. In a range or sequence, &lt;code&gt;*&lt;/code&gt; means "unbound".&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&gt; &lt;span class="mi"&gt;1&lt;/span&gt;..*
    &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="o"&gt;..&lt;/span&gt;&lt;span class="n"&gt;Inf&lt;/span&gt;
    
    &gt; (&lt;span class="mi"&gt;2&lt;/span&gt;,&lt;span class="mi"&gt;4&lt;/span&gt;,&lt;span class="mi"&gt;8&lt;/span&gt;...*)[&lt;span class="mi"&gt;17&lt;/span&gt;]
    &lt;span class="mi"&gt;262144&lt;/span&gt;
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;The main built-in use, though, is that expressions with &lt;code&gt;*&lt;/code&gt; are lifted into anonymous functions. This is called "whatever-priming" and produces a &lt;code&gt;WhateverCode&lt;/code&gt;, which is indistinguishable from other functions except for the type.&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&gt; {&lt;span class="nv"&gt;$_&lt;/span&gt; + &lt;span class="mi"&gt;10&lt;/span&gt;}(&lt;span class="mi"&gt;2&lt;/span&gt;)
    &lt;span class="mi"&gt;12&lt;/span&gt;
    
    &gt; (* + &lt;span class="mi"&gt;10&lt;/span&gt;)(&lt;span class="mi"&gt;2&lt;/span&gt;)
    &lt;span class="mi"&gt;12&lt;/span&gt;
    
    &gt; (^&lt;span class="mi"&gt;10&lt;/span&gt;).&lt;span class="n"&gt;map&lt;/span&gt;(* % &lt;span class="mi"&gt;2&lt;/span&gt;)
    (&lt;span class="mi"&gt;0&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;)
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;There's actually a bit of weird behavior here: if &lt;em&gt;two&lt;/em&gt; whatevers appear in the expression, they become separate positional variables. &lt;code&gt;(2, 30, 4, 50).map(* + *)&lt;/code&gt; returns &lt;code&gt;(32, 54)&lt;/code&gt;. This makes it easy to express &lt;a href="https://docs.raku.org/language/operators#infix_..." target="_blank"&gt;a tricky Fibonacci definition&lt;/a&gt; but otherwise I don't see how it's better than making each &lt;code&gt;*&lt;/code&gt; the same value.&lt;/p&gt;
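    &lt;p&gt;One way to picture the two-whatever case is a map that hands the function as many elements at a time as it has parameters. A minimal Python sketch, assuming the arity is passed in explicitly (&lt;code&gt;map_by_arity&lt;/code&gt; is a hypothetical helper, and unlike Raku it silently drops a trailing incomplete group):&lt;/p&gt;

```python
def map_by_arity(func, items, arity):
    """Apply func to consecutive groups of `arity` elements."""
    it = iter(items)
    # Zipping the same iterator with itself `arity` times pulls
    # the elements out in consecutive, non-overlapping groups.
    return [func(*group) for group in zip(*[it] * arity)]


# Mirrors (2, 30, 4, 50).map(* + *)
print(map_by_arity(lambda a, b: a + b, [2, 30, 4, 50], 2))  # [32, 54]
```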
    &lt;p&gt;Regardless, priming is useful because &lt;em&gt;so many&lt;/em&gt; Raku methods are overloaded to take functions. You get the last element of a list with &lt;code&gt;l[*-1]&lt;/code&gt;. This &lt;em&gt;looks&lt;/em&gt; like standard negative-index syntax, but what actually happens is that when &lt;code&gt;[]&lt;/code&gt; is passed a function, it passes in list length and looks up the result. So if the list has 10 elements, &lt;code&gt;l[*-1] = l[10-1] = l[9]&lt;/code&gt;, aka the last element. Similarly, &lt;code&gt;l.head(2)&lt;/code&gt; is the first two elements of a list, &lt;code&gt;l.head(*-2)&lt;/code&gt; is all-but-the-last-two.&lt;/p&gt;
    &lt;p&gt;We can pass other functions to &lt;code&gt;[]&lt;/code&gt;, which e.g. makes implementing ring buffers easy.&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&gt; &lt;span class="k"&gt;my&lt;/span&gt; &lt;span class="nv"&gt;@x&lt;/span&gt; = ^&lt;span class="mi"&gt;10&lt;/span&gt;
    [&lt;span class="mi"&gt;0&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt; &lt;span class="mi"&gt;4&lt;/span&gt; &lt;span class="mi"&gt;5&lt;/span&gt; &lt;span class="mi"&gt;6&lt;/span&gt; &lt;span class="mi"&gt;7&lt;/span&gt; &lt;span class="mi"&gt;8&lt;/span&gt; &lt;span class="mi"&gt;9&lt;/span&gt;]
    
    &gt; &lt;span class="nv"&gt;@x&lt;/span&gt;[&lt;span class="mi"&gt;95&lt;/span&gt; % *]--; &lt;span class="nv"&gt;@x&lt;/span&gt;
    [&lt;span class="mi"&gt;0&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt; &lt;span class="mi"&gt;4&lt;/span&gt; &lt;span class="mi"&gt;4&lt;/span&gt; &lt;span class="mi"&gt;6&lt;/span&gt; &lt;span class="mi"&gt;7&lt;/span&gt; &lt;span class="mi"&gt;8&lt;/span&gt; &lt;span class="mi"&gt;9&lt;/span&gt;]
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
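    &lt;p&gt;The "pass a function to &lt;code&gt;[]&lt;/code&gt;" trick is easy to imitate elsewhere. Here is a small Python sketch of the lookup rule described above: if the index is callable, call it with the list length and use the result. &lt;code&gt;FnIndexList&lt;/code&gt; is a hypothetical class for illustration, and it only covers reads, not the in-place decrement from the Raku example.&lt;/p&gt;

```python
class FnIndexList(list):
    """A list whose [] also accepts a function of the list length."""

    def __getitem__(self, index):
        if callable(index):
            # Same rule as l[*-1]: feed the length in, look up the result.
            index = index(len(self))
        return super().__getitem__(index)


xs = FnIndexList(range(10))
print(xs[lambda n: n - 1])   # last element: 9
print(xs[lambda n: 95 % n])  # ring-buffer style wraparound: 5
```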
    &lt;h3&gt;&lt;a href="https://docs.raku.org/language/regexes" target="_blank"&gt;Regular Expressions&lt;/a&gt;&lt;/h3&gt;
    &lt;p&gt;There are two basic standards for regexes: POSIX regexes and Perl-compatible regexes (PCRE). POSIX regexes are a terrible mess of backslashes and punctuation. PCRE is backwards compatible with POSIX and is a more terrible mess of backslashes and punctuation. Most languages follow the PCRE standard, but Perl 6 breaks backwards compatibility with an entirely new regex syntax. &lt;/p&gt;
    &lt;p&gt;The most obvious improvement: &lt;a href="https://docs.raku.org/language/regexes#Subrules" target="_blank"&gt;composability&lt;/a&gt;. In most languages you "combine" two regexes by concatenating their strings together, which is terrible for many, many reasons. Raku has the standard "embed another regex" syntax: &lt;code&gt;/&lt; foo &gt;+/&lt;/code&gt; matches one-or-more of the &lt;code&gt;foo&lt;/code&gt; regex without &lt;code&gt;foo&lt;/code&gt; "leaking" into the top regex. &lt;/p&gt;
    &lt;p&gt;This already does a lot to make regexes more tractable: you can break a complicated regular expression down into simpler and more legible parts. And in fact this is how Raku supports &lt;a href="https://docs.raku.org/language/grammars" target="_blank"&gt;parsing grammars&lt;/a&gt; as a builtin language feature. I've only used grammars once but it &lt;a href="https://www.hillelwayne.com/post/picat/" target="_blank"&gt;was quite helpful&lt;/a&gt;.&lt;/p&gt;
    &lt;p&gt;Since we're breaking backwards compatibility anyway, we can now add lots of small QOLs. There's a &lt;a href="https://docs.raku.org/language/regexes#Modified_quantifier:_%,_%%" target="_blank"&gt;value separator&lt;/a&gt; modifier: &lt;code&gt;\d+ % ','&lt;/code&gt; matches &lt;code&gt;1&lt;/code&gt; / &lt;code&gt;1,2&lt;/code&gt; / &lt;code&gt;1,1,4&lt;/code&gt; but not &lt;code&gt;1,&lt;/code&gt; or &lt;code&gt;12&lt;/code&gt;. &lt;a href="https://docs.raku.org/language/regexes#Lookaround_assertions" target="_blank"&gt;Lookaheads&lt;/a&gt; and non-capturing groups aren't nonsense glyphs. &lt;code&gt;r1 &amp;&amp; r2&lt;/code&gt; only matches strings that match &lt;em&gt;both&lt;/em&gt; &lt;code&gt;r1&lt;/code&gt; and &lt;code&gt;r2&lt;/code&gt;. Backtracking can be stopped with &lt;a href="https://docs.raku.org/language/regexes#Preventing_backtracking:_:" target="_blank"&gt;:&lt;/a&gt;. Whitespace is ignored by default and has to be explicitly enabled in match patterns.&lt;/p&gt;
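    &lt;p&gt;To see what the separator modifier saves you, here is the same "digits separated by commas" pattern written PCRE-style with Python's &lt;code&gt;re&lt;/code&gt; module. This is a sketch that assumes the examples above are whole-string matches; the item has to be written twice, once on its own and once inside the repeated group.&lt;/p&gt;

```python
import re

# PCRE-style equivalent of Raku's  \d+ % ','
# one digit, then any number of ",digit" repetitions.
SEPARATED_DIGITS = re.compile(r"\d(?:,\d)*")


def is_separated_digits(text):
    """True when the whole string is digits separated by single commas."""
    return SEPARATED_DIGITS.fullmatch(text) is not None


for sample in ["1", "1,2", "1,1,4", "1,", "12"]:
    print(repr(sample), is_separated_digits(sample))
```

    &lt;p&gt;The first three samples match and the last two do not, which lines up with the list above.&lt;/p&gt;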
    &lt;p&gt;There's more stuff Raku does with actually &lt;em&gt;processing&lt;/em&gt; regular expressions, but the regex notation is something that might actually appear in another language someday. &lt;/p&gt;
    &lt;h3&gt;&lt;a href="https://docs.raku.org/language/operators#Hyper_operators" target="_blank"&gt;Hyperoperators&lt;/a&gt;&lt;/h3&gt;
    &lt;p&gt;This is a small one compared to the other features, but it's also the thing I miss most often in other languages. The most basic form &lt;code&gt;l&gt;&gt;.method&lt;/code&gt; is basically equivalent to &lt;code&gt;map&lt;/code&gt;, except it also recursively descends into sublists.&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&gt; [&lt;span class="mi"&gt;1&lt;/span&gt;, [&lt;span class="mi"&gt;2&lt;/span&gt;, &lt;span class="mi"&gt;3&lt;/span&gt;], &lt;span class="mi"&gt;4&lt;/span&gt;]&gt;&gt;.&lt;span class="nb"&gt;succ&lt;/span&gt;
    [&lt;span class="mi"&gt;2&lt;/span&gt; [&lt;span class="mi"&gt;3&lt;/span&gt; &lt;span class="mi"&gt;4&lt;/span&gt;] &lt;span class="mi"&gt;5&lt;/span&gt;]
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;This is more useful than it looks because any function call &lt;code&gt;f(list, *args)&lt;/code&gt; can be rewritten in "method form" &lt;code&gt;list.&amp;f(*args)&lt;/code&gt;, so &lt;code&gt;&gt;&gt;.&lt;/code&gt; becomes the generalized mapping operator. You can use it with whatevers, too.&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&gt; [&lt;span class="mi"&gt;1&lt;/span&gt;, [&lt;span class="mi"&gt;2&lt;/span&gt;, &lt;span class="mi"&gt;3&lt;/span&gt;], &lt;span class="mi"&gt;4&lt;/span&gt;]&gt;&gt;.&amp;(*+&lt;span class="mi"&gt;1&lt;/span&gt;)
    [&lt;span class="mi"&gt;2&lt;/span&gt; [&lt;span class="mi"&gt;3&lt;/span&gt; &lt;span class="mi"&gt;4&lt;/span&gt;] &lt;span class="mi"&gt;5&lt;/span&gt;]
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;Anyway, the more generalized &lt;em&gt;binary&lt;/em&gt; hyperoperator &lt;code&gt;l1 &lt;&lt; op &gt;&gt; l2&lt;/code&gt;&lt;sup id="fnref:spaces"&gt;&lt;a class="footnote-ref" href="#fn:spaces"&gt;3&lt;/a&gt;&lt;/sup&gt; applies &lt;code&gt;op&lt;/code&gt; elementwise to the two lists, looping the shorter list until the longer list is exhausted. &lt;code&gt;&gt;&gt;op&gt;&gt;&lt;/code&gt; / &lt;code&gt;&lt;&lt; op&lt;&lt;&lt;/code&gt; are the same except they instead loop until the lhs/rhs list is exhausted. Whew!&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&gt; [&lt;span class="mi"&gt;1&lt;/span&gt;, &lt;span class="mi"&gt;2&lt;/span&gt;, &lt;span class="mi"&gt;3&lt;/span&gt;, &lt;span class="mi"&gt;4&lt;/span&gt;, &lt;span class="mi"&gt;5&lt;/span&gt;] &lt;span class="s"&gt;&lt;&lt;+&gt;&lt;/span&gt;&gt; [&lt;span class="mi"&gt;10&lt;/span&gt;, &lt;span class="mi"&gt;20&lt;/span&gt;]
    [&lt;span class="mi"&gt;11&lt;/span&gt; &lt;span class="mi"&gt;22&lt;/span&gt; &lt;span class="mi"&gt;13&lt;/span&gt; &lt;span class="mi"&gt;24&lt;/span&gt; &lt;span class="mi"&gt;15&lt;/span&gt;]
    
    &gt; [&lt;span class="mi"&gt;1&lt;/span&gt;, &lt;span class="mi"&gt;2&lt;/span&gt;, &lt;span class="mi"&gt;3&lt;/span&gt;, &lt;span class="mi"&gt;4&lt;/span&gt;, &lt;span class="mi"&gt;5&lt;/span&gt;] &lt;span class="s"&gt;&lt;&lt;+&lt;&lt; [10, 20]&lt;/span&gt;
    &lt;span class="s"&gt;[11 22]&lt;/span&gt;
    
    &lt;span class="s"&gt;&gt; [1, 2, 3, 4, 5] &gt;&gt;&lt;/span&gt;+&gt;&gt; [&lt;span class="mi"&gt;10&lt;/span&gt;, &lt;span class="mi"&gt;20&lt;/span&gt;]
    [&lt;span class="mi"&gt;11&lt;/span&gt; &lt;span class="mi"&gt;22&lt;/span&gt; &lt;span class="mi"&gt;13&lt;/span&gt; &lt;span class="mi"&gt;24&lt;/span&gt; &lt;span class="mi"&gt;15&lt;/span&gt;]
    
    &lt;span class="c1"&gt;# Also works with single values&lt;/span&gt;
    &gt; [&lt;span class="mi"&gt;1&lt;/span&gt;, &lt;span class="mi"&gt;2&lt;/span&gt;, &lt;span class="mi"&gt;3&lt;/span&gt;, &lt;span class="mi"&gt;4&lt;/span&gt;, &lt;span class="mi"&gt;5&lt;/span&gt;] &lt;span class="s"&gt;&lt;&lt;+&gt;&lt;/span&gt;&gt; &lt;span class="mi"&gt;10&lt;/span&gt;
    [&lt;span class="mi"&gt;11&lt;/span&gt; &lt;span class="mi"&gt;12&lt;/span&gt; &lt;span class="mi"&gt;13&lt;/span&gt; &lt;span class="mi"&gt;14&lt;/span&gt; &lt;span class="mi"&gt;15&lt;/span&gt;]
    
    &lt;span class="c1"&gt;# Does weird things with nested lists too&lt;/span&gt;
    &gt; [&lt;span class="mi"&gt;1&lt;/span&gt;, [&lt;span class="mi"&gt;2&lt;/span&gt;, &lt;span class="mi"&gt;3&lt;/span&gt;], &lt;span class="mi"&gt;4&lt;/span&gt;, &lt;span class="mi"&gt;5&lt;/span&gt;] &lt;span class="s"&gt;&lt;&lt;+&gt;&lt;/span&gt;&gt; [&lt;span class="mi"&gt;10&lt;/span&gt;, &lt;span class="mi"&gt;20&lt;/span&gt;]
    [&lt;span class="mi"&gt;11&lt;/span&gt; [&lt;span class="mi"&gt;22&lt;/span&gt; &lt;span class="mi"&gt;23&lt;/span&gt;] &lt;span class="mi"&gt;14&lt;/span&gt; &lt;span class="mi"&gt;25&lt;/span&gt;]
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
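    &lt;p&gt;The cycling rule for flat lists can be sketched in a few lines of Python. This is an illustration of the behavior described above, not Raku's implementation: &lt;code&gt;hyper&lt;/code&gt; and its &lt;code&gt;length&lt;/code&gt; parameter are made-up names, and the sketch does not descend into nested lists or handle single values.&lt;/p&gt;

```python
from itertools import cycle, islice
import operator


def hyper(op, left, right, length=None):
    """Apply op elementwise, cycling both lists up to `length` results.

    With no length given, run until the longer list is exhausted.
    """
    if length is None:
        length = max(len(left), len(right))
    pairs = zip(cycle(left), cycle(right))
    return [op(x, y) for x, y in islice(pairs, length)]


a, b = [1, 2, 3, 4, 5], [10, 20]
print(hyper(operator.add, a, b))          # [11, 22, 13, 24, 15]
print(hyper(operator.add, a, b, len(b)))  # stop at the rhs length: [11, 22]
print(hyper(operator.add, a, b, len(a)))  # stop at the lhs length: [11, 22, 13, 24, 15]
```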
    &lt;p&gt;Also, for some reason, the hyperoperators behave differently on two hashes: depending on which variant you use, &lt;code&gt;op&lt;/code&gt; is applied over the union, the intersection, or the hash difference.&lt;/p&gt;
    &lt;p&gt;Anyway it's a super weird (meta)operator but it's also quite useful! It's the closest thing I've seen to &lt;a href="https://hillelwayne.com/post/j-notation/" target="_blank"&gt;J verbs&lt;/a&gt; outside an APL. I like using it to run the same formula on multiple possible inputs at once.&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;(&lt;span class="mi"&gt;20&lt;/span&gt; * &lt;span class="mi"&gt;10&lt;/span&gt; &lt;span class="s"&gt;&lt;&lt;-&gt;&lt;/span&gt;&gt; (&lt;span class="mi"&gt;21&lt;/span&gt;, &lt;span class="mi"&gt;24&lt;/span&gt;)) &lt;span class="s"&gt;&lt;&lt;*&gt;&lt;/span&gt;&gt; (&lt;span class="mi"&gt;10&lt;/span&gt;, &lt;span class="mi"&gt;100&lt;/span&gt;)
    (&lt;span class="mi"&gt;1790&lt;/span&gt; &lt;span class="mi"&gt;17600&lt;/span&gt;)
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;Incidentally, it's called the hyperoperator because it evaluates all of the operations in parallel. Explicit loops can be parallelized by prefixing them with &lt;a href="https://docs.raku.org/language/statement-prefixes#hyper,_race" target="_blank"&gt;&lt;code&gt;hyper&lt;/code&gt;&lt;/a&gt;.&lt;/p&gt;
    &lt;h3&gt;&lt;a href="https://docs.raku.org/type/Pair" target="_blank"&gt;Pair Syntax&lt;/a&gt;&lt;/h3&gt;
    &lt;p&gt;I've talked about pairs a little in &lt;a href="https://buttondown.com/hillelwayne/archive/unusual-basis-types-in-programming-languages/" target="_blank"&gt;this newsletter&lt;/a&gt;, but the gist is that Raku hashes are composed of a set of pairs &lt;code&gt;key =&gt; value&lt;/code&gt;. The pair is the basis type, the hash is the collection of pairs. There's also a &lt;em&gt;ton&lt;/em&gt; of syntactic sugar for concisely specifying pairs via "colon syntax":&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&gt; &lt;span class="k"&gt;my&lt;/span&gt; &lt;span class="nv"&gt;$x&lt;/span&gt; = &lt;span class="mi"&gt;3&lt;/span&gt;; :&lt;span class="nv"&gt;$x&lt;/span&gt;
    &lt;span class="nb"&gt;x&lt;/span&gt; =&gt; &lt;span class="mi"&gt;3&lt;/span&gt;
    
    &gt; :&lt;span class="n"&gt;a&lt;/span&gt;&lt;span class="s"&gt;&lt;$x&gt;&lt;/span&gt;
    &lt;span class="n"&gt;a&lt;/span&gt; =&gt; &lt;span class="s"&gt;"$x"&lt;/span&gt;
    
    &gt; :&lt;span class="n"&gt;a&lt;/span&gt;(&lt;span class="nv"&gt;$x&lt;/span&gt;)
    &lt;span class="n"&gt;a&lt;/span&gt; =&gt; &lt;span class="mi"&gt;3&lt;/span&gt;
    
    &gt; :&lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="n"&gt;a&lt;/span&gt;
    &lt;span class="n"&gt;a&lt;/span&gt; =&gt; &lt;span class="mi"&gt;3&lt;/span&gt;
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;The most important sugars are &lt;code&gt;:key&lt;/code&gt; and &lt;code&gt;:!key&lt;/code&gt;, which map to &lt;code&gt;key =&gt; True&lt;/code&gt; and &lt;code&gt;key =&gt; False&lt;/code&gt;. This is a really elegant way to add flags to a method! Take the definition of &lt;a href="https://docs.raku.org/type/Str#method_match" target="_blank"&gt;match&lt;/a&gt;:&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="k"&gt;method&lt;/span&gt; &lt;span class="nb"&gt;match&lt;/span&gt;(&lt;span class="nv"&gt;$pat&lt;/span&gt;, 
        :&lt;span class="n"&gt;continue&lt;/span&gt;(:&lt;span class="nv"&gt;$c&lt;/span&gt;), :&lt;span class="n"&gt;pos&lt;/span&gt;(:&lt;span class="nv"&gt;$p&lt;/span&gt;), :&lt;span class="n"&gt;global&lt;/span&gt;(:&lt;span class="nv"&gt;$g&lt;/span&gt;), 
        :&lt;span class="n"&gt;overlap&lt;/span&gt;(:&lt;span class="nv"&gt;$ov&lt;/span&gt;), :&lt;span class="n"&gt;exhaustive&lt;/span&gt;(:&lt;span class="nv"&gt;$ex&lt;/span&gt;), 
        :&lt;span class="n"&gt;st&lt;/span&gt;(:&lt;span class="nv"&gt;$nd&lt;/span&gt;), :&lt;span class="n"&gt;rd&lt;/span&gt;(:&lt;span class="nv"&gt;$th&lt;/span&gt;), :&lt;span class="nv"&gt;$nth&lt;/span&gt;, :&lt;span class="nv"&gt;$x&lt;/span&gt; --&gt; &lt;span class="nb"&gt;Match&lt;/span&gt;)
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;Probably should also mention that in a definition, &lt;code&gt;:f(:$foo)&lt;/code&gt; defines the parameter &lt;code&gt;$foo&lt;/code&gt; but &lt;a href="https://docs.raku.org/language/signatures#Argument_aliases" target="_blank"&gt;also aliases it&lt;/a&gt; to &lt;code&gt;:f&lt;/code&gt;, so you can set the flag with &lt;code&gt;:f&lt;/code&gt; or &lt;code&gt;:foo&lt;/code&gt;. Colon-pairs defined in the signature can be passed in anywhere, or even stuck together:&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&gt; &lt;span class="s"&gt;"abab"&lt;/span&gt;.&lt;span class="nb"&gt;match&lt;/span&gt;(&lt;span class="sr"&gt;/../&lt;/span&gt;)
    「&lt;span class="n"&gt;ab&lt;/span&gt;」
    &gt; &lt;span class="s"&gt;"abab"&lt;/span&gt;.&lt;span class="nb"&gt;match&lt;/span&gt;(&lt;span class="sr"&gt;/../&lt;/span&gt;, :&lt;span class="n"&gt;g&lt;/span&gt;)
    (「&lt;span class="n"&gt;ab&lt;/span&gt;」 「&lt;span class="n"&gt;ab&lt;/span&gt;」)
    &gt; &lt;span class="s"&gt;"abab"&lt;/span&gt;.&lt;span class="nb"&gt;match&lt;/span&gt;(&lt;span class="sr"&gt;/../&lt;/span&gt;, :&lt;span class="n"&gt;g&lt;/span&gt;, :&lt;span class="n"&gt;ov&lt;/span&gt;)
    (「&lt;span class="n"&gt;ab&lt;/span&gt;」 「&lt;span class="n"&gt;ba&lt;/span&gt;」 「&lt;span class="n"&gt;ab&lt;/span&gt;」)
    
    &lt;span class="c1"&gt;# Out of order stuck together&lt;/span&gt;
    &gt; &lt;span class="s"&gt;"abab"&lt;/span&gt;.&lt;span class="nb"&gt;match&lt;/span&gt;(:&lt;span class="n"&gt;g:ov&lt;/span&gt;,&lt;span class="sr"&gt; /../&lt;/span&gt;)
    (「&lt;span class="n"&gt;ab&lt;/span&gt;」 「&lt;span class="n"&gt;ba&lt;/span&gt;」 「&lt;span class="n"&gt;ab&lt;/span&gt;」)
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;So that leads to extremely concise method configuration. Definitely beats &lt;code&gt;match(global=True, overlap=True)&lt;/code&gt;!&lt;/p&gt;
    &lt;p&gt;And for some reason you can place keyword arguments &lt;em&gt;after&lt;/em&gt; the function call:&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&gt; &lt;span class="s"&gt;"abab"&lt;/span&gt;.&lt;span class="nb"&gt;match&lt;/span&gt;(:&lt;span class="n"&gt;g&lt;/span&gt;,&lt;span class="sr"&gt; /../&lt;/span&gt;):&lt;span class="n"&gt;ov:2nd&lt;/span&gt;
    「&lt;span class="n"&gt;ba&lt;/span&gt;」
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;h2&gt;The next-gen lab: Slangs and RakuAST&lt;/h2&gt;
    &lt;p&gt;These are features I have no experience in and &lt;em&gt;certainly&lt;/em&gt; are not making their way into other languages, but they really expand the explorable space of new features. &lt;a href="https://raku.land/zef:lizmat/Slangify" target="_blank"&gt;Slangs&lt;/a&gt; are modifications to the Raku syntax. This can be used for things like &lt;a href="https://raku.land/zef:elcaro/Slang::Otherwise" target="_blank"&gt;modifying loop syntax&lt;/a&gt;, &lt;a href="https://raku.land/zef:raku-community-modules/Slang::Piersing" target="_blank"&gt;changing identifiers&lt;/a&gt;, or adding &lt;a href="https://raku.land/zef:raku-community-modules/OO::Actors" target="_blank"&gt;actors&lt;/a&gt; or &lt;a href="https://raku.land/github:MattOates/BioInfo" target="_blank"&gt;DNA sequences&lt;/a&gt; to the base language.&lt;/p&gt;
    &lt;p&gt;I &lt;em&gt;barely&lt;/em&gt; understand &lt;a href="https://dev.to/lizmat/rakuast-for-early-adopters-576n" target="_blank"&gt;RakuAST&lt;/a&gt;. I &lt;em&gt;think&lt;/em&gt; the idea is that all Raku expressions can be parsed as an AST from inside Raku itself.&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&gt; &lt;span class="s"&gt;Q/my $x; $x++/&lt;/span&gt;.&lt;span class="nb"&gt;AST&lt;/span&gt;
    &lt;span class="n"&gt;RakuAST::StatementList&lt;/span&gt;.&lt;span class="nb"&gt;new&lt;/span&gt;(
      &lt;span class="n"&gt;RakuAST::Statement::Expression&lt;/span&gt;.&lt;span class="nb"&gt;new&lt;/span&gt;(
        &lt;span class="n"&gt;expression&lt;/span&gt; =&gt; &lt;span class="n"&gt;RakuAST::VarDeclaration::Simple&lt;/span&gt;.&lt;span class="nb"&gt;new&lt;/span&gt;(
          &lt;span class="nb"&gt;sigil&lt;/span&gt;       =&gt; &lt;span class="s"&gt;"\$"&lt;/span&gt;,
          &lt;span class="n"&gt;desigilname&lt;/span&gt; =&gt; &lt;span class="n"&gt;RakuAST::Name&lt;/span&gt;.&lt;span class="n"&gt;from-identifier&lt;/span&gt;(&lt;span class="s"&gt;"x"&lt;/span&gt;)
        )
      ),
      &lt;span class="n"&gt;RakuAST::Statement::Expression&lt;/span&gt;.&lt;span class="nb"&gt;new&lt;/span&gt;(
        &lt;span class="n"&gt;expression&lt;/span&gt; =&gt; &lt;span class="n"&gt;RakuAST::ApplyPostfix&lt;/span&gt;.&lt;span class="nb"&gt;new&lt;/span&gt;(
          &lt;span class="n"&gt;operand&lt;/span&gt; =&gt; &lt;span class="n"&gt;RakuAST::Var::Lexical&lt;/span&gt;.&lt;span class="nb"&gt;new&lt;/span&gt;(&lt;span class="s"&gt;"\$x"&lt;/span&gt;),
          &lt;span class="nb"&gt;postfix&lt;/span&gt; =&gt; &lt;span class="n"&gt;RakuAST::Postfix&lt;/span&gt;.&lt;span class="nb"&gt;new&lt;/span&gt;(&lt;span class="s"&gt;"++"&lt;/span&gt;)
        )
      )
    )
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;This allows for things like writing Raku in different languages:&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="nb"&gt;say&lt;/span&gt; &lt;span class="s"&gt;Q/my $x; put $x/&lt;/span&gt;.&lt;span class="nb"&gt;AST&lt;/span&gt;.&lt;span class="n"&gt;DEPARSE&lt;/span&gt;(&lt;span class="s"&gt;"NL"&lt;/span&gt;)
    &lt;span class="n"&gt;mijn&lt;/span&gt; &lt;span class="nv"&gt;$x&lt;/span&gt;;
    &lt;span class="n"&gt;zeg-het&lt;/span&gt; &lt;span class="nv"&gt;$x&lt;/span&gt;
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;h3&gt;Bonus experiment&lt;/h3&gt;
    &lt;p&gt;Raku comes with a "&lt;a href="https://rakudo.org/star" target="_blank"&gt;Rakudo Star&lt;/a&gt;" installation, which comes with a set of &lt;a href="https://github.com/rakudo/star/blob/master/etc/modules.txt" target="_blank"&gt;blessed third party modules&lt;/a&gt; preinstalled. I love this! It's a great compromise between the maintainer burdens of a large standard library and the user burdens of making everybody find the right packages in the ecosystem.&lt;/p&gt;
    &lt;hr/&gt;
    &lt;h2&gt;Blog Rec&lt;/h2&gt;
    &lt;p&gt;Feel obligated to recommend some Raku blogs! Elizabeth Mattijsen posts &lt;a href="https://dev.to/lizmat" target="_blank"&gt;a ton of stuff&lt;/a&gt; to dev.to about Raku internals. &lt;a href="https://www.codesections.com/blog/" target="_blank"&gt;Codesections&lt;/a&gt; has a pretty good blog; he's the person who eventually got me to try out Raku. Finally, the &lt;a href="https://raku-advent.blog/" target="_blank"&gt;Raku Advent Calendar&lt;/a&gt; is a great dive into advanced Raku techniques. Bad news is it only updates once a year, good news is it's 25 updates that once a year.&lt;/p&gt;
    &lt;div class="footnote"&gt;
    &lt;hr/&gt;
    &lt;ol&gt;
    &lt;li id="fn:release-notes"&gt;
    &lt;ul&gt;
    &lt;li&gt;All techniques chapters now have a "Further Reading" section&lt;/li&gt;
    &lt;li&gt;"System modeling" chapter significantly rewritten&lt;/li&gt;
    &lt;li&gt;"Conditionals" chapter expanded, now a real chapter&lt;/li&gt;
    &lt;li&gt;"Logic Programming" chapter now covers datalog, deductive databases&lt;/li&gt;
    &lt;li&gt;"Solvers" chapter has diagram explaining problem&lt;/li&gt;
    &lt;li&gt;Eight new exercises&lt;/li&gt;
    &lt;li&gt;Tentative front cover (will probably change)&lt;/li&gt;
    &lt;li&gt;Fixed some epub issues with math rendering&lt;/li&gt;
    &lt;/ul&gt;
    &lt;p&gt;&lt;a class="footnote-backref" href="#fnref:release-notes" title="Jump back to footnote 1 in the text"&gt;↩&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id="fn:analogs"&gt;
    &lt;p&gt;The closest analogues are &lt;a href="https://stackoverflow.com/questions/8000903/what-are-all-the-uses-of-an-underscore-in-scala/8001065#8001065" target="_blank"&gt;Scala's underscore&lt;/a&gt;, except that Raku's &lt;code&gt;*&lt;/code&gt; is a value and not syntax, and Python's &lt;a href="https://docs.python.org/3/library/constants.html#Ellipsis" target="_blank"&gt;Ellipsis&lt;/a&gt;, except that &lt;code&gt;*&lt;/code&gt; has additional semantics. &lt;a class="footnote-backref" href="#fnref:analogs" title="Jump back to footnote 2 in the text"&gt;↩&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id="fn:spaces"&gt;
    &lt;p&gt;Spaces added so buttondown doesn't think they're tags &lt;a class="footnote-backref" href="#fnref:spaces" title="Jump back to footnote 3 in the text"&gt;↩&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;/ol&gt;
    &lt;/div&gt;
    </description><pubDate>Tue, 12 Nov 2024 20:06:55 +0000</pubDate><guid>https://buttondown.com/hillelwayne/archive/five-unusual-raku-features/</guid></item><item><title>A list of ternary operators</title><link>https://buttondown.com/hillelwayne/archive/a-list-of-ternary-operators/</link><description>
    &lt;p&gt;Sup nerds, I'm back from SREcon! I had a blast, despite knowing nothing about site reliability engineering and being way over my head in half the talks. I'm trying to catch up on &lt;a href="https://leanpub.com/logic/" target="_blank"&gt;The Book&lt;/a&gt; and contract work now so I'll do something silly here: ternary operators.&lt;/p&gt;
    &lt;p&gt;Almost all operations on values in programming languages fall into one of three buckets: &lt;/p&gt;
    &lt;ol&gt;
    &lt;li&gt;&lt;strong&gt;Unary operators&lt;/strong&gt;, where the operator goes &lt;em&gt;before&lt;/em&gt; or &lt;em&gt;after&lt;/em&gt; exactly one argument. Examples are &lt;code&gt;x++&lt;/code&gt; and &lt;code&gt;-y&lt;/code&gt; and &lt;code&gt;!bool&lt;/code&gt;. Most languages have a few critical unary operators hardcoded into the grammar. They are almost always symbols, but sometimes are string-identifiers (&lt;code&gt;not&lt;/code&gt;).&lt;/li&gt;
    &lt;li&gt;&lt;strong&gt;Binary operators&lt;/strong&gt;, which are placed &lt;em&gt;between&lt;/em&gt; exactly two arguments. Things like &lt;code&gt;+&lt;/code&gt; or &lt;code&gt;&amp;&amp;&lt;/code&gt; or &lt;code&gt;&gt;=&lt;/code&gt;. Languages have a lot more of these than unary operators, because there's more fundamental things we want to do with two values than one value. These can be symbols or identifiers (&lt;code&gt;and&lt;/code&gt;).&lt;/li&gt;
    &lt;li&gt;Functions/methods that &lt;em&gt;prefix&lt;/em&gt; any number of arguments. &lt;code&gt;func(a, b, c)&lt;/code&gt;, &lt;code&gt;obj.method(a, b, c, d)&lt;/code&gt;, anything in a lisp. These are how we extend the language, and they almost-exclusively use identifiers and not symbols.&lt;sup id="fnref:lisp"&gt;&lt;a class="footnote-ref" href="#fn:lisp"&gt;1&lt;/a&gt;&lt;/sup&gt;&lt;/li&gt;
    &lt;/ol&gt;
    &lt;p&gt;There's one widespread exception to this categorization: the &lt;strong&gt;ternary operator&lt;/strong&gt; &lt;code&gt;bool ? x : y&lt;/code&gt;.&lt;sup id="fnref:ternary"&gt;&lt;a class="footnote-ref" href="#fn:ternary"&gt;2&lt;/a&gt;&lt;/sup&gt; It's an infix operator that takes exactly &lt;em&gt;three&lt;/em&gt; arguments and can't be decomposed into two sequential binary operators. &lt;code&gt;bool ? x&lt;/code&gt; makes no sense on its own, nor does &lt;code&gt;x : y&lt;/code&gt;. &lt;/p&gt;
    &lt;p&gt;Other ternary operators are &lt;em&gt;extremely&lt;/em&gt; rare, which is why conditional expressions got to monopolize the name "ternary". But I like how exceptional they are and want to compile some of them. A long long time ago I asked &lt;a href="https://twitter.com/hillelogram/status/1378509881498603527" target="_blank"&gt;Twitter&lt;/a&gt; for other ternary operators; this is a compilation of some applicable responses plus my own research.&lt;/p&gt;
    &lt;p&gt;(Most of these are a &lt;em&gt;bit&lt;/em&gt; of a stretch.)&lt;/p&gt;
    &lt;h3&gt;Stepped Ranges&lt;/h3&gt;
    &lt;p&gt;Many languages have some kind of "stepped range" function:&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="c1"&gt;# Python&lt;/span&gt;
    &lt;span class="o"&gt;&gt;&gt;&gt;&lt;/span&gt; &lt;span class="nb"&gt;list&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
    &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;7&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;9&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;There's the "base case" of start and endpoints, and an optional step. Many languages have a binary infix op for the base case, but a few also have a ternary for the optional step:&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;# Frink
    &gt; map[{|a| a*2}, (1 to 100 step 15) ] 
    [2, 32, 62, 92, 122, 152, 182]
    
    # Elixir
    &gt; IO.puts Enum.join(1..10//2, " ")
    1 3 5 7 9
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;This isn't decomposable into two binary ops because you can't assign the range to a value and then step the value later.&lt;/p&gt;
    &lt;h3&gt;Graph ops&lt;/h3&gt;
    &lt;p&gt;In &lt;a href="https://graphviz.org/" target="_blank"&gt;Graphviz&lt;/a&gt;, a basic edge between two nodes is either the binary &lt;code&gt;node1 -&gt; node2&lt;/code&gt; or the ternary &lt;code&gt;node1 -&gt; node2 [edge_props]&lt;/code&gt;:&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="k"&gt;digraph&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nt"&gt;G&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="nt"&gt;a1&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;-&gt;&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nt"&gt;a2&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="na"&gt;color&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"green"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;&lt;img alt="Output of the above graphviz" class="newsletter-image" src="https://assets.buttondown.email/images/d1a0f894-59d5-45d3-8702-967e94672371.png?w=960&amp;fit=max"/&gt;&lt;/p&gt;
    &lt;p&gt;Graphs seem ternary-friendly because there are three elements involved with any graph connection: the two nodes and the connecting edge. So you also see ternaries in some graph database query languages, with separate places to specify each node and the edge.&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;# GSQL (https://docs.tigergraph.com/gsql-ref/4.1/tutorials/gsql-101/parameterized-gsql-query)
    SELECT tgt
        FROM start:s -(Friendship:e)- Person:tgt;
    
    # Cypher (https://neo4j.com/docs/cypher-manual/current/introduction/cypher-overview/)
    MATCH (actor:Actor)-[:ACTED_IN]-&gt;(movie:Movie {title: 'The Matrix'})
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;Obligatory plug for my &lt;a href="https://www.hillelwayne.com/post/graph-types/" target="_blank"&gt;graph datatype essay&lt;/a&gt;.&lt;/p&gt;
    &lt;h3&gt;Metaoperators&lt;/h3&gt;
    &lt;p&gt;Both &lt;a href="https://raku.org/" target="_blank"&gt;Raku&lt;/a&gt; and &lt;a href="https://www.jsoftware.com/#/README" target="_blank"&gt;J&lt;/a&gt; have special higher-order functions that apply to binary infixes. Raku calls them &lt;em&gt;metaoperators&lt;/em&gt;, while J calls them &lt;em&gt;adverbs&lt;/em&gt; and &lt;em&gt;conjunctions&lt;/em&gt;.&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="c1"&gt;# Raku&lt;/span&gt;
    
    &lt;span class="c1"&gt;# `a «op» b` is map, "cycling" shorter list&lt;/span&gt;
    &lt;span class="nb"&gt;say&lt;/span&gt; &lt;span class="s"&gt;&lt;10 20 30&gt;&lt;/span&gt; «+» &lt;span class="s"&gt;&lt;4 5&gt;&lt;/span&gt;
    (&lt;span class="mi"&gt;14&lt;/span&gt; &lt;span class="mi"&gt;25&lt;/span&gt; &lt;span class="mi"&gt;34&lt;/span&gt;)
    
    &lt;span class="c1"&gt;# `a Rop b` is `b op a`&lt;/span&gt;
    &lt;span class="nb"&gt;say&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt; &lt;span class="n"&gt;R-&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt;
    &lt;span class="mi"&gt;1&lt;/span&gt;
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="c1"&gt;NB. J&lt;/span&gt;
    
    &lt;span class="c1"&gt;NB. x f/ y creates a "table" of x f y&lt;/span&gt;
    &lt;span class="w"&gt;   &lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;+/&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;20&lt;/span&gt;
    &lt;span class="mi"&gt;11&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;21&lt;/span&gt;
    &lt;span class="mi"&gt;12&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;22&lt;/span&gt;
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;The Raku metaoperators are closer to what I'm looking for, since I don't think you can assign the "created operator" directly to a callable variable. J lets you, though!&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="w"&gt;   &lt;/span&gt;&lt;span class="nv"&gt;h&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;=:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;+/&lt;/span&gt;
    &lt;span class="w"&gt;   &lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;h&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;4&lt;/span&gt;
    &lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;5&lt;/span&gt;
    &lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;6&lt;/span&gt;
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;That said, J has some "decomposable" ternaries that feel &lt;em&gt;spiritually&lt;/em&gt; like ternaries, like &lt;a href="https://code.jsoftware.com/wiki/Vocabulary/curlyrt#dyadic" target="_blank"&gt;amend&lt;/a&gt; and &lt;a href="https://code.jsoftware.com/wiki/Vocabulary/fcap" target="_blank"&gt;fold&lt;/a&gt;. It also has a special ternary-ish construct called the "fork".&lt;sup id="fnref:ternaryish"&gt;&lt;a class="footnote-ref" href="#fn:ternaryish"&gt;3&lt;/a&gt;&lt;/sup&gt; &lt;code&gt;x (f g h) y&lt;/code&gt; is parsed as &lt;code&gt;(x f y) g (x h y)&lt;/code&gt;:&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="c1"&gt;NB. Max - min&lt;/span&gt;
    &lt;span class="w"&gt;   &lt;/span&gt;&lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;&gt;.&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;&lt;.&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;
    &lt;span class="mi"&gt;3&lt;/span&gt;
    &lt;span class="w"&gt;   &lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;&gt;.&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;&lt;.&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;5&lt;/span&gt;
    &lt;span class="mi"&gt;3&lt;/span&gt;
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;So at the top level that's just a binary operator, but the binary op is constructed via a ternary op. That's pretty cool IMO.&lt;/p&gt;
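    &lt;p&gt;If you don't read J, here is a rough Python sketch of the same construction. The helper name &lt;code&gt;fork&lt;/code&gt; is made up for illustration (it is not a J or Python builtin), and it only covers the two-argument case shown above:&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;import operator


    def fork(f, g, h):
        """Return a two-argument function computing g(f(x, y), h(x, y))."""
        def forked(x, y):
            return g(f(x, y), h(x, y))
        return forked


    # Max minus min, mirroring the J example: max plays f, subtraction plays g, min plays h
    spread = fork(max, operator.sub, min)
    print(spread(5, 2))  # 3
    print(spread(2, 5))  # 3
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;The three functions go in together and a single binary function comes out, which is the "ternary that builds a binary" shape.&lt;/p&gt;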
    &lt;h3&gt;Assignment Ternaries&lt;/h3&gt;
    &lt;p&gt;Bob Nystrom points out that in many languages, &lt;code&gt;a[b] = c&lt;/code&gt; is a ternary operation: it is &lt;em&gt;not&lt;/em&gt; the same as &lt;code&gt;x = a[b]; x = c&lt;/code&gt;.&lt;/p&gt;
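    &lt;p&gt;Python makes the three operands easy to see, because &lt;code&gt;a[b] = c&lt;/code&gt; is dispatched as one call to &lt;code&gt;__setitem__&lt;/code&gt; that receives the container, the key, and the value together. A small sketch, where the &lt;code&gt;Recorder&lt;/code&gt; class is a hypothetical name used only for this illustration:&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;class Recorder:
        """Records every (key, value) pair assigned through a[key] = value."""

        def __init__(self):
            self.calls = []

        def __setitem__(self, key, value):
            # Python passes the container (self), the key, and the value in one call
            self.calls.append((key, value))


    a = Recorder()
    a["b"] = "c"
    print(a.calls)  # [('b', 'c')]
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;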
    &lt;p&gt;A weirder case shows up in &lt;a href="https://github.com/betaveros/noulith/" target="_blank"&gt;Noulith&lt;/a&gt; and Raku (again): update operators. Most languages have the &lt;code&gt;+=&lt;/code&gt; &lt;em&gt;binary operator&lt;/em&gt;; these two have the &lt;code&gt;f=&lt;/code&gt; &lt;em&gt;ternary operator&lt;/em&gt;. &lt;code&gt;a f= b&lt;/code&gt; is the same as &lt;code&gt;a = f(a, b)&lt;/code&gt;.&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="c1"&gt;# Raku&lt;/span&gt;
    &gt; &lt;span class="k"&gt;my&lt;/span&gt; &lt;span class="nv"&gt;$x&lt;/span&gt; = &lt;span class="mi"&gt;2&lt;/span&gt;; &lt;span class="nv"&gt;$x&lt;/span&gt; &lt;span class="nb"&gt;max&lt;/span&gt;= &lt;span class="mi"&gt;3&lt;/span&gt;; &lt;span class="nb"&gt;say&lt;/span&gt; &lt;span class="nv"&gt;$x&lt;/span&gt;
    &lt;span class="mi"&gt;3&lt;/span&gt;
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;Arguably this is just syntactic sugar, but I don't think it's decomposable into binary operations.&lt;/p&gt;
    &lt;h3&gt;Custom user ternaries&lt;/h3&gt;
    &lt;p&gt;Tikhon Jelvis pointed out that &lt;a href="https://agda.readthedocs.io/en/v2.7.0.1/language/mixfix-operators.html" target="_blank"&gt;Agda&lt;/a&gt;  lets you define &lt;em&gt;custom&lt;/em&gt; mixfix operators, which can be ternary or even tetranary or pentanary. I later found out that &lt;a href="https://docs.racket-lang.org/mixfix/index.html" target="_blank"&gt;Racket&lt;/a&gt; has this, too. &lt;a href="https://developer.apple.com/library/archive/documentation/Cocoa/Conceptual/ProgrammingWithObjectiveC/Introduction/Introduction.html" target="_blank"&gt;Objective-C&lt;/a&gt; &lt;em&gt;looks&lt;/em&gt; like this, too, but feels different somehow. &lt;/p&gt;
    &lt;h3&gt;Near Misses&lt;/h3&gt;
    &lt;p&gt;All of these are arguable, I've just got to draw a line in the sand &lt;em&gt;somewhere&lt;/em&gt;.&lt;/p&gt;
    &lt;ul&gt;
    &lt;li&gt;Regular expression substitutions: &lt;code&gt;s/from/to/flags&lt;/code&gt; seems like a ternary, but I'd argue it's a datatype constructor, not an expression operator.&lt;/li&gt;
    &lt;li&gt;Comprehensions like &lt;code&gt;[x + 1 | x &lt;- list]&lt;/code&gt;: looks like the ternary &lt;code&gt;[expr1 | expr2 &lt;- expr3]&lt;/code&gt;, but &lt;code&gt;expr2&lt;/code&gt; is only binding a name. Arguably a ternary if you can map &lt;em&gt;and filter&lt;/em&gt; in the same expression a la Python or Haskell, but should that be considered sugar for&lt;/li&gt;
    &lt;li&gt;Python's operator chaining (&lt;code&gt;1 &lt; x &lt; 5&lt;/code&gt;): syntactic sugar for &lt;code&gt;1 &lt; x and x &lt; 5&lt;/code&gt;.&lt;/li&gt;
    &lt;li&gt;Someone suggested &lt;a href="https://stackoverflow.com/questions/7251772/what-exactly-constitutes-swizzling-in-opengl-es-2-0-powervr-sgx-specifically" target="_blank"&gt;GLSL swizzles&lt;/a&gt;, which are very cool but binary operators.&lt;/li&gt;
    &lt;/ul&gt;
    &lt;h2&gt;Why are ternaries so rare?&lt;/h2&gt;
    &lt;p&gt;Ternaries are &lt;em&gt;somewhat&lt;/em&gt; more common in math and physics, f.ex in integrals and sums. That's because they were historically done on paper, where you have a 2D canvas, so you can do stuff like this easily:&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;10
    Σ    n
    n=0
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;We express the ternary by putting arguments above and below the operator. All mainstream programming languages are linear, though, so any given symbol has only two sides. Plus functions are more regular and universal than infix operators so you might as well write &lt;code&gt;Sum(n=0, 10, n)&lt;/code&gt;. The conditional ternary slips through purely because it's just so darn useful. Though now I'm wondering where it comes from in the first place. Different newsletter, maybe.&lt;/p&gt;
    &lt;p&gt;But I still find ternary operators super interesting. Please let me know if you know any I haven't covered!&lt;/p&gt;
    &lt;hr/&gt;
    &lt;h3&gt;Blog Rec&lt;/h3&gt;
    &lt;p&gt;This week's blog rec is &lt;a href="https://lexi-lambda.github.io/" target="_blank"&gt;Alexis King&lt;/a&gt;! Generally, Alexis's work spans the theory, practice, and implementation of programming languages, aimed at a popular audience and not an academic one. If you know her for one thing, it's probably &lt;a href="https://lexi-lambda.github.io/blog/2019/11/05/parse-don-t-validate" target="_blank"&gt;Parse, don't validate&lt;/a&gt;, which is now so mainstream most people haven't read the original post. Another good one is about &lt;a href="https://lexi-lambda.github.io/blog/2020/01/19/no-dynamic-type-systems-are-not-inherently-more-open/" target="_blank"&gt;modeling open-world systems with static types&lt;/a&gt;. &lt;/p&gt;
    &lt;p&gt;Nowadays she is &lt;em&gt;far&lt;/em&gt; more active on &lt;a href="https://langdev.stackexchange.com/users/861/alexis-king" target="_blank"&gt;Programming Languages Stack Exchange&lt;/a&gt;, where she has blog-length answers on &lt;a href="https://langdev.stackexchange.com/questions/2692/how-should-i-read-type-system-notation/2693#2693" target="_blank"&gt;reading type notations&lt;/a&gt;, &lt;a href="https://langdev.stackexchange.com/questions/3942/what-are-the-ways-compilers-recognize-complex-patterns/3945#3945" target="_blank"&gt;compiler design&lt;/a&gt;, and &lt;a href="https://langdev.stackexchange.com/questions/2069/what-is-an-arrow-and-what-powers-would-it-give-as-a-first-class-concept-in-a-pro/2372#2372" target="_blank"&gt;why arrows&lt;/a&gt;.&lt;/p&gt;
    &lt;div class="footnote"&gt;
    &lt;hr/&gt;
    &lt;ol&gt;
    &lt;li id="fn:lisp"&gt;
    &lt;p&gt;Unless it's a lisp. &lt;a class="footnote-backref" href="#fnref:lisp" title="Jump back to footnote 1 in the text"&gt;↩&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id="fn:ternary"&gt;
    &lt;p&gt;Or &lt;code&gt;x if bool else y&lt;/code&gt;, same thing. &lt;a class="footnote-backref" href="#fnref:ternary" title="Jump back to footnote 2 in the text"&gt;↩&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id="fn:ternaryish"&gt;
    &lt;p&gt;I say "ish" because trains can be arbitrarily long: &lt;code&gt;x (f1 f2 f3 f4 f5) y&lt;/code&gt; is something I have &lt;em&gt;no idea&lt;/em&gt; &lt;a href="https://code.jsoftware.com/wiki/Vocabulary/fork" target="_blank"&gt;how to parse&lt;/a&gt;. &lt;a class="footnote-backref" href="#fnref:ternaryish" title="Jump back to footnote 3 in the text"&gt;↩&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;/ol&gt;
    &lt;/div&gt;
    </description><pubDate>Tue, 05 Nov 2024 18:40:33 +0000</pubDate><guid>https://buttondown.com/hillelwayne/archive/a-list-of-ternary-operators/</guid></item><item><title>TLA from first principles</title><link>https://buttondown.com/hillelwayne/archive/tla-from-first-principles/</link><description>
    &lt;h3&gt;No Newsletter next week&lt;/h3&gt;
    &lt;p&gt;I'll be speaking at &lt;a href="https://www.usenix.org/conference/srecon24emea/presentation/wayne" target="_blank"&gt;USENIX SRECon&lt;/a&gt;!&lt;/p&gt;
    &lt;h2&gt;TLA from first principles&lt;/h2&gt;
    &lt;p&gt;I'm working on v0.5 of &lt;a href="https://leanpub.com/logic/" target="_blank"&gt;Logic for Programmers&lt;/a&gt;. In the process of revising the "System Modeling" chapter, I stumbled on a great way to explain the &lt;strong&gt;T&lt;/strong&gt;emporal &lt;strong&gt;L&lt;/strong&gt;ogic of &lt;strong&gt;A&lt;/strong&gt;ctions that TLA+ is based on. I'm reproducing that bit here with some changes to fit the newsletter format.&lt;/p&gt;
    &lt;p&gt;Note that by this point the reader has already encountered property testing, formal verification, decision tables, and nontemporal specifications, and should already have a lot of practice expressing things as predicates. &lt;/p&gt;
    &lt;hr/&gt;
    &lt;h3&gt;The intro&lt;/h3&gt;
    &lt;p&gt;We have some bank users, each with an account balance. Bank users can wire money
    to each other. We have overdraft protection, so wires cannot reduce an
    account value below zero. &lt;/p&gt;
    &lt;p&gt;For the purposes of introducing the ideas, we'll assume an extremely simple system: two hardcoded
    variables &lt;code&gt;alice&lt;/code&gt; and &lt;code&gt;bob&lt;/code&gt;, both start with 10 dollars, and transfers
    are only from Alice to Bob. Also, the transfer is totally atomic: we
    check for adequate funds, withdraw, and deposit all in a single moment
    of time. Later [in the chapter] we'll allow for multiple nonatomic transfers at the same time.&lt;/p&gt;
    &lt;p&gt;First, let's look at a valid &lt;strong&gt;behavior&lt;/strong&gt; of the system, or a possible way it can evolve.&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;alice   10 -&gt;  5 -&gt; 3  -&gt; 3  -&gt; ...
    bob     10 -&gt; 15 -&gt; 17 -&gt; 17 -&gt; ...
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;In programming, we'd think of &lt;code&gt;alice&lt;/code&gt; and &lt;code&gt;bob&lt;/code&gt; as variables that change. How do we represent those variables &lt;em&gt;purely&lt;/em&gt; in terms of predicate logic? One way is to instead think of them as &lt;em&gt;arrays&lt;/em&gt; of values. &lt;code&gt;alice[0]&lt;/code&gt; is the initial state of &lt;code&gt;alice&lt;/code&gt;, &lt;code&gt;alice[1]&lt;/code&gt; is after the first time step, etc. Time, then, is "just" the set of natural numbers.&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;Time  = {0, 1, 2, 3, ...}
    alice = [10, 5, 3, 3, ...]
    bob   = [10, 15, 17, 17, ...]
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;In comparison to our valid behavior, here are some &lt;em&gt;invalid&lt;/em&gt; behaviors:&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;alice = [10, 3,  ...]
    bob   = [10  15, ...]
    
    alice = [10, -1,  ...]
    bob   = [10  21,  ...]
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;The first is invalid because Bob received less money than Alice lost:
    Alice lost 7 dollars but Bob only gained 5.
    The second is invalid because it violates our proposed invariant, that
    accounts cannot go negative. Can we write a predicate that is &lt;em&gt;true&lt;/em&gt; for
    valid transitions and &lt;em&gt;false&lt;/em&gt; for our two invalid behaviors?&lt;/p&gt;
    &lt;p&gt;Here's one way:&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;Time = Nat // {0, 1, 2, etc}
    
    Transfer(t: Time) =
      some value in 0..=alice[t]:
        1. alice[t+1] = alice[t] - value
        2. bob[t+1] = bob[t] + value
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;Go through and check that this is true for every &lt;code&gt;t&lt;/code&gt; in the valid
    behavior and false for at least one &lt;code&gt;t&lt;/code&gt; in each invalid behavior. Note
    that the steps where Alice &lt;em&gt;doesn't&lt;/em&gt; send a transfer also pass
    &lt;code&gt;Transfer&lt;/code&gt;; we just pick &lt;code&gt;value = 0&lt;/code&gt;.&lt;/p&gt;
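    &lt;p&gt;That check can also be done mechanically. The Python sketch below is only an illustration: it looks at finite prefixes of the behaviors above rather than infinite behaviors, and the function name &lt;code&gt;transfer&lt;/code&gt; is a stand-in for &lt;code&gt;Transfer(t)&lt;/code&gt;.&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;def transfer(alice, bob, t):
        """Transfer(t): some value in 0..=alice[t] explains the step from t to t+1."""
        return any(
            alice[t + 1] == alice[t] - value and bob[t + 1] == bob[t] + value
            for value in range(0, alice[t] + 1)
        )


    valid = ([10, 5, 3, 3], [10, 15, 17, 17])
    invalid_1 = ([10, 3], [10, 15])
    invalid_2 = ([10, -1], [10, 21])

    # True for every step of the valid prefix (the last step picks value = 0)
    print(all(transfer(*valid, t) for t in range(3)))  # True
    # False for the single step of each invalid prefix
    print(transfer(*invalid_1, 0))  # False
    print(transfer(*invalid_2, 0))  # False
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;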
    &lt;p&gt;I can now write a predicate that perfectly describes a valid behavior:&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;Spec = 
      1. alice[0] = 10
      2. bob[0]   = 10
      3. all t in Time:
        Transfer(t)
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;Now allowing "nothing happens" as "Alice sends an empty transfer" is
    a little bit weird. In the real system, we probably don't want people
    to constantly be sending each other zero dollars:&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;Transfer(t: Time) =
    &lt;span class="gd"&gt;- some value in 0..=alice[t]:&lt;/span&gt;
    &lt;span class="gi"&gt;+ some value in 1..=alice[t]:&lt;/span&gt;
    &lt;span class="w"&gt; &lt;/span&gt;   1. alice[t+1] = alice[t] - value
    &lt;span class="w"&gt; &lt;/span&gt;   2. bob[t+1] = bob[t] + value
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;But now there can't be a timestep where nothing happens. And that means
    &lt;em&gt;no&lt;/em&gt; behavior is valid! At every step, Alice &lt;em&gt;must&lt;/em&gt; transfer at least one dollar to Bob.
    Eventually there is some &lt;code&gt;t&lt;/code&gt; where &lt;code&gt;alice[t] = 0 &amp;&amp; bob[t] = 20&lt;/code&gt;. Then
    Alice can't make a transfer, &lt;code&gt;Transfer(t)&lt;/code&gt; is false, and so &lt;code&gt;Spec&lt;/code&gt; is
    false.&lt;sup id="fnref:exercise"&gt;&lt;a class="footnote-ref" href="#fn:exercise"&gt;1&lt;/a&gt;&lt;/sup&gt;&lt;/p&gt;
    &lt;p&gt;So typically when modeling we add a &lt;strong&gt;stutter step&lt;/strong&gt;, like this:&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;Spec =
      1. alice[0] = 10
      2. bob[0]   = 10
      3. all t in Time:
        || Transfer(t)
        || 1. alice[t+1] = alice[t]
           2. bob[t+1] = bob[t]
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;(This is also why we can use infinite behaviors to model a finite algorithm. If the algorithm completes at &lt;code&gt;t=21&lt;/code&gt;, &lt;code&gt;t=22,23,24...&lt;/code&gt; are all stutter steps.)&lt;/p&gt;
    &lt;p&gt;There's enough moving parts here that I'd want to break it into
    subpredicates.&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;Init =
      1. alice[0] = 10
      2. bob[0]   = 10
    
    Stutter(t) =
      1. alice[t+1] = alice[t]
      2. bob[t+1] = bob[t]
    
    Next(t) = Transfer(t) // foreshadowing
    
    Spec =
      1. Init
      2. all t in Time:
        Next(t) || Stutter(t)
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;Now finally, how do we represent the property &lt;code&gt;NoOverdrafts&lt;/code&gt;? It's an
    &lt;em&gt;invariant&lt;/em&gt; that has to be true at all times. So we do the same thing we
    did in &lt;code&gt;Spec&lt;/code&gt;, write a predicate over all times.&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;property NoOverdrafts =
      all t in Time:
        alice[t] &gt;= 0 &amp;&amp; bob[t] &gt;= 0
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;We can even say that &lt;code&gt;Spec =&gt; NoOverdrafts&lt;/code&gt;, ie if a behavior is valid
    under &lt;code&gt;Spec&lt;/code&gt;, it satisfies &lt;code&gt;NoOverdrafts&lt;/code&gt;.&lt;/p&gt;
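    &lt;p&gt;For a system this small, that implication can be sanity-checked by brute force. The sketch below is plain Python, not TLA+ tooling, and the helper names &lt;code&gt;next_states&lt;/code&gt; and &lt;code&gt;reachable&lt;/code&gt; are assumptions made for illustration. It enumerates every state a behavior satisfying &lt;code&gt;Spec&lt;/code&gt; can visit (Alice-to-Bob transfers of at least one dollar, plus stutter steps) and checks the invariant on each one:&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;def next_states(state):
        """States allowed after `state`: one transfer of 1..=alice dollars, or a stutter."""
        alice, bob = state
        transfers = {(alice - value, bob + value) for value in range(1, alice + 1)}
        return transfers | {state}  # the stutter step leaves the state unchanged


    def reachable(init):
        """Every state that some behavior starting at `init` can visit."""
        seen, frontier = {init}, [init]
        while frontier:
            state = frontier.pop()
            for nxt in next_states(state) - seen:
                seen.add(nxt)
                frontier.append(nxt)
        return seen


    states = reachable((10, 10))
    # min(0, alice, bob) == 0 holds exactly when neither balance is negative
    no_overdrafts = all(min(0, alice, bob) == 0 for alice, bob in states)
    print(len(states), no_overdrafts)  # 11 True
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;This only works because the state space here is tiny and finite; it's a sketch of the idea, not a substitute for a real model checker.&lt;/p&gt;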
    &lt;h4&gt;One of the exercises&lt;/h4&gt;
    &lt;p&gt;Modify the &lt;code&gt;Next&lt;/code&gt; so that Bob can send Alice transfers, too. Don't try
    to be too clever, just do this in the most direct way possible.&lt;/p&gt;
    &lt;p&gt;Bonus: can Alice and Bob transfer to each other in the same step?&lt;/p&gt;
    &lt;p&gt;&lt;strong&gt;Solution&lt;/strong&gt; [in back of book]: We can rename &lt;code&gt;Transfer(t)&lt;/code&gt; to &lt;code&gt;TransferAliceToBob(t)&lt;/code&gt;, write the
    converse as a new predicate, and then add it to &lt;code&gt;Next&lt;/code&gt;. Like this:&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;TransferBobToAlice(t: Time) =
      some value in 1..=bob[t]:
        1. bob[t+1] = bob[t] - value
        2. alice[t+1] = alice[t] + value
    
    Next(t) =
      || TransferAliceToBob(t)
      || TransferBobToAlice(t)
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;Now, can Alice and Bob transfer to each other in the same step? No.
    Let's say they both start with 10 dollars and each try to transfer five
    dollars to each other. By &lt;code&gt;TransferAliceToBob&lt;/code&gt; we have:&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;1. alice[1] = alice[0] - 5 = 5
    2. bob[1] = bob[0] + 5 = 15
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;And by &lt;code&gt;TransferBobToAlice&lt;/code&gt;, we have:&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;1. bob[1] = bob[0] - 5 = 5
    2. alice[1] = alice[0] + 5 = 15
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;So now we have &lt;code&gt;alice[1] = 5 &amp;&amp; alice[1] = 15&lt;/code&gt;, which is always false.&lt;/p&gt;
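    &lt;p&gt;The same contradiction shows up if we search for a next state by brute force. This is a minimal Python sketch under the assumptions of the example (both start with 10 dollars, both try to send 5); the function names are illustrative only:&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;from itertools import product

    alice0, bob0 = 10, 10


    def alice_to_bob(alice1, bob1, value=5):
        """TransferAliceToBob for one fixed value, as a predicate on the next state."""
        return alice1 == alice0 - value and bob1 == bob0 + value


    def bob_to_alice(alice1, bob1, value=5):
        """TransferBobToAlice for one fixed value, as a predicate on the next state."""
        return bob1 == bob0 - value and alice1 == alice0 + value


    # Try every candidate next state with balances in 0..20
    candidates = list(product(range(0, 21), repeat=2))
    only_first = [s for s in candidates if alice_to_bob(*s)]
    both = [s for s in candidates if alice_to_bob(*s) and bob_to_alice(*s)]
    print(only_first)  # [(5, 15)]
    print(both)        # []
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;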
    &lt;h3&gt;Temporal Logic&lt;/h3&gt;
    &lt;div class="subscribe-form"&gt;&lt;/div&gt;
    &lt;p&gt;This is good and all, but in practice, there's two downsides to
    treating time as a set we can quantify over:&lt;/p&gt;
    &lt;ol&gt;
    &lt;li&gt;It's cumbersome. We have to write &lt;code&gt;var[t]&lt;/code&gt; and &lt;code&gt;var[t+1]&lt;/code&gt; all over
        the place.&lt;/li&gt;
    &lt;li&gt;It's too powerful. We can write expressions like
        &lt;code&gt;alice[t^2-5] = alice[t] + t&lt;/code&gt;.&lt;/li&gt;
    &lt;/ol&gt;
    &lt;p&gt;Problem (2) might seem like a good thing; isn't the whole &lt;em&gt;point&lt;/em&gt; of
    logic to be expressive? But we have a long-term goal in mind: getting a
    computer to check our formal specification. We need to limit the
    expressivity of our model so that we can make it checkable. &lt;/p&gt;
    &lt;p&gt;In practice, this will mean making time implicit to our model, instead of
    explicitly quantifying over it.&lt;/p&gt;
    &lt;p&gt;The first thing we need to do is limit how we can use time. At a
    given point in time, all we can look at is the &lt;em&gt;current&lt;/em&gt; value of a
    variable (&lt;code&gt;var[t]&lt;/code&gt;) and the &lt;em&gt;next&lt;/em&gt; value (&lt;code&gt;var[t+1]&lt;/code&gt;). No &lt;code&gt;var[t+16]&lt;/code&gt; or
    &lt;code&gt;var[t-1]&lt;/code&gt; or anything else complicated.&lt;/p&gt;
    &lt;p&gt;And it turns out we've already seen a mathematical convention for
    expressing this: &lt;strong&gt;priming&lt;/strong&gt;!&lt;sup id="fnref:priming"&gt;&lt;a class="footnote-ref" href="#fn:priming"&gt;2&lt;/a&gt;&lt;/sup&gt; For a
    given time &lt;code&gt;t&lt;/code&gt;, we can define &lt;code&gt;var&lt;/code&gt; to mean &lt;code&gt;var[t]&lt;/code&gt; and &lt;code&gt;var'&lt;/code&gt; to mean
    &lt;code&gt;var[t+1]&lt;/code&gt;. Then &lt;code&gt;Transfer(t)&lt;/code&gt; becomes&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;Transfer =
      some value in 1..=alice:
        1. alice' = alice - value
        2. bob' = bob + value
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;Next we have the construct &lt;code&gt;all t in Time: P(t)&lt;/code&gt; in both &lt;code&gt;Spec&lt;/code&gt; and
    &lt;code&gt;NoOverdrafts&lt;/code&gt;. In other words, "P is always true". So we can add
    &lt;code&gt;always&lt;/code&gt; as a new term. Logicians conventionally use □ or &lt;code&gt;[]&lt;/code&gt;
    to mean the same thing.&lt;sup id="fnref:beyond"&gt;&lt;a class="footnote-ref" href="#fn:beyond"&gt;3&lt;/a&gt;&lt;/sup&gt;&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;property NoOverdrafts =
      always (alice &gt;= 0 &amp;&amp; bob &gt;= 0)
      // or [](alice &gt;= 0 &amp;&amp; bob &gt;= 0)
    
    Spec =
      Init &amp;&amp; always (Next || Stutter)
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;Now time is &lt;em&gt;almost&lt;/em&gt; completely implicit in our spec, with just one
    exception: &lt;code&gt;Init&lt;/code&gt; has &lt;code&gt;alice[0]&lt;/code&gt; and &lt;code&gt;bob[0]&lt;/code&gt;. We just need one more
    convention: if a variable is referenced &lt;em&gt;outside&lt;/em&gt; of the scope of a
    temporal operator, it means &lt;code&gt;var[0]&lt;/code&gt;. Since &lt;code&gt;Init&lt;/code&gt; is outside of the &lt;code&gt;[]&lt;/code&gt;, it becomes&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;Init =
      1. alice = 10
      2. bob = 10
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;And with that, we've removed &lt;code&gt;Time&lt;/code&gt; as an explicit value in our model.&lt;/p&gt;
    &lt;p&gt;The addition of primes and &lt;code&gt;always&lt;/code&gt; makes this a &lt;strong&gt;temporal logic&lt;/strong&gt;, one that can model how things change over time. And that makes it ideal for modeling software systems.&lt;/p&gt;
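    &lt;p&gt;One way to see what the two conventions buy us is to mimic them in ordinary code. In the Python sketch below, which is an illustration and not how TLA+ is implemented, an action is a predicate over a (current, next) pair of states, standing in for unprimed and primed variables, and &lt;code&gt;always_step&lt;/code&gt; is a hypothetical helper that applies an action to every consecutive pair of a finite prefix:&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;def transfer(cur, nxt):
        """Transfer as an action: cur holds alice/bob, nxt holds alice'/bob'."""
        return any(
            nxt["alice"] == cur["alice"] - value and nxt["bob"] == cur["bob"] + value
            for value in range(1, cur["alice"] + 1)
        )


    def stutter(cur, nxt):
        """Stutter: nothing changes between the current and next state."""
        return nxt == cur


    def always_step(action, behavior):
        """'always action' on a finite prefix: it holds for every consecutive pair."""
        return all(action(cur, nxt) for cur, nxt in zip(behavior, behavior[1:]))


    behavior = [
        {"alice": 10, "bob": 10},
        {"alice": 5, "bob": 15},
        {"alice": 3, "bob": 17},
        {"alice": 3, "bob": 17},
    ]
    # Init is evaluated outside any temporal operator, so it looks at the first state
    init = behavior[0] == {"alice": 10, "bob": 10}
    spec = init and always_step(lambda c, n: transfer(c, n) or stutter(c, n), behavior)
    print(spec)  # True
    print(always_step(transfer, behavior))  # False: the last step is a stutter
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;Notice that no function here ever takes a time index: each one sees only the current and next state.&lt;/p&gt;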
    &lt;h3&gt;Modeling with TLA+&lt;/h3&gt;
    &lt;p&gt;One of the most popular specification languages for modeling these kinds
    of concurrent systems is &lt;strong&gt;TLA+&lt;/strong&gt;. TLA+ was invented by the Turing award-winner Leslie Lamport, who also invented a wide variety of concurrency algorithms and LaTeX. Here's our current
    spec in TLA+:&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;---- MODULE transfers ----
    EXTENDS Integers
    
    VARIABLES alice, bob
    vars == &lt;&lt;alice, bob&gt;&gt;
    
    Init ==
      alice = 10 
      /\ bob = 10
    
    AliceToBob ==
      \E amnt \in 1..alice:
        alice' = alice - amnt
        /\ bob' = bob + amnt
    
    BobToAlice ==
      \E amnt \in 1..bob:
        alice' = alice + amnt
        /\ bob' = bob - amnt
    
    Next ==
      AliceToBob
      \/ BobToAlice
    
    Spec == Init /\ [][Next]_vars
    
    NoOverdrafts ==
      [](alice &gt;= 0 /\ bob &gt;= 0)
    
    ====
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;TLA+ uses ASCII versions of mathematicians' notation: &lt;code&gt;/\&lt;/code&gt;/&lt;code&gt;\/&lt;/code&gt; for
    &lt;code&gt;&amp;&amp;/||&lt;/code&gt;, &lt;code&gt;\A \E&lt;/code&gt; for &lt;code&gt;all/some&lt;/code&gt;, etc. The only thing that's "unusual"
    (besides &lt;code&gt;==&lt;/code&gt; for definition) is the &lt;code&gt;[][Next]_vars&lt;/code&gt; bit. That's TLA+
    notation for &lt;code&gt;[](Next || Stutter)&lt;/code&gt;: &lt;code&gt;Next&lt;/code&gt; or &lt;code&gt;Stutter&lt;/code&gt; always happens.&lt;/p&gt;
    &lt;hr/&gt;
    &lt;p&gt;The rest of the chapter goes on to explain model checking, PlusCal (for modeling nonatomic transactions without needing to explain the exotic TLA+ function syntax), and liveness properties. But this is the intuition behind the "temporal logic of actions": temporal operators are operations on the set of points of time, and we restrict what we can do with those operators to make reasoning about the specification feasible.&lt;/p&gt;
    &lt;p&gt;Honestly I like it enough that I'm thinking of redesigning my TLA+ workshop to start with this explanation. Then again, maybe it only seems good to me because I already know TLA+. Please let me know what you think about it!&lt;/p&gt;
    &lt;p&gt;Anyway, the new version of the chapter will be in v0.5, which should be out mid-November.&lt;/p&gt;
    &lt;hr/&gt;
    &lt;h3&gt;Blog Rec&lt;/h3&gt;
    &lt;p&gt;This one is really dear to me: &lt;a href="https://muratbuffalo.blogspot.com/" target="_blank"&gt;Metadata&lt;/a&gt;, by Murat Demirbas. When I was first trying to learn TLA+ back in 2016, his post &lt;a href="https://muratbuffalo.blogspot.com/2015/01/my-experience-with-using-tla-in.html" target="_blank"&gt;on using TLA+ in a distributed systems class&lt;/a&gt; was one of, like... &lt;em&gt;three&lt;/em&gt; public posts on TLA+. I must have spent hours rereading that post and puzzling out this weird language I stumbled into. Later I emailed Murat with some questions and he was super nice in answering them. Don't think I would have ever grokked TLA+ without him.&lt;/p&gt;
    &lt;p&gt;In addition to TLA+ content, a lot of the blog is also breakdowns of papers he read, like &lt;a href="https://blog.acolyer.org/" target="_blank"&gt;the morning paper&lt;/a&gt;, except with a focus on distributed systems (and still active). If you're interested in learning more about the science of distributed systems, he has an excellent page on &lt;a href="https://muratbuffalo.blogspot.com/2021/02/foundational-distributed-systems-papers.html" target="_blank"&gt;foundational distributed systems papers&lt;/a&gt;. But definitely check out &lt;a href="https://muratbuffalo.blogspot.com/2023/09/metastable-failures-in-wild.html" target="_blank"&gt;his deep readings&lt;/a&gt;, too!&lt;/p&gt;
    &lt;div class="footnote"&gt;
    &lt;hr/&gt;
    &lt;ol&gt;
    &lt;li id="fn:exercise"&gt;
    &lt;p&gt;In the book this is presented as an exercise (with the solution in back). The exercise also clarifies that since &lt;code&gt;Time = Nat&lt;/code&gt;, all behaviors have an &lt;em&gt;infinite&lt;/em&gt; number of steps. &lt;a class="footnote-backref" href="#fnref:exercise" title="Jump back to footnote 1 in the text"&gt;↩&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id="fn:priming"&gt;
    &lt;p&gt;Priming is introduced in the chapter on decision tables, and again in the chapter on database invariants. &lt;code&gt;x'&lt;/code&gt; is "the next value of &lt;code&gt;x&lt;/code&gt;", so you can use it to express database invariants like "jobs only move from &lt;code&gt;ready&lt;/code&gt; to &lt;code&gt;started&lt;/code&gt; or &lt;code&gt;aborted&lt;/code&gt;." &lt;a class="footnote-backref" href="#fnref:priming" title="Jump back to footnote 2 in the text"&gt;↩&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id="fn:beyond"&gt;
    &lt;p&gt;I'm still vacillating on whether I want a "beyond logic" appendix that covers higher order logic, constructive logic, and modal logic (which is what we're sneakily doing right now!)&lt;/p&gt;
    &lt;p&gt;While I'm here, this explanation of &lt;code&gt;always&lt;/code&gt; as &lt;code&gt;all t in Time&lt;/code&gt; isn't &lt;em&gt;100%&lt;/em&gt; accurate, since it doesn't explain why things like &lt;code&gt;[](P =&gt; []Q)&lt;/code&gt; or &lt;code&gt;&lt;&gt;[]P&lt;/code&gt; make sense. But it's accurate in most cases and is a great intuition pump. &lt;a class="footnote-backref" href="#fnref:beyond" title="Jump back to footnote 3 in the text"&gt;↩&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;/ol&gt;
    &lt;/div&gt;
    </description><pubDate>Tue, 22 Oct 2024 17:14:21 +0000</pubDate><guid>https://buttondown.com/hillelwayne/archive/tla-from-first-principles/</guid></item><item><title>Be Suspicious of Success</title><link>https://buttondown.com/hillelwayne/archive/be-suspicious-of-success/</link><description>
    &lt;p&gt;From Leslie Lamport's &lt;em&gt;Specifying Systems&lt;/em&gt;:&lt;/p&gt;
    &lt;blockquote&gt;
    &lt;p&gt;You should be suspicious if [the model checker] does not find a violation of a liveness property... you should also be suspicious if [it] finds no errors when checking safety properties. &lt;/p&gt;
    &lt;/blockquote&gt;
    &lt;p&gt;This is specifically in the context of model-checking a formal specification, but it's a widely applicable software principle. It's not enough for a program to work, it has to work for the &lt;em&gt;right reasons&lt;/em&gt;. Code working for the wrong reasons is code that's going to break when you least expect it. And since "correct for right reasons" is a much narrower target than "correct for any possible reason", we can't assume our first success is actually our intended success.&lt;/p&gt;
    &lt;p&gt;Hence, BSOS: &lt;strong&gt;Be Suspicious of Success&lt;/strong&gt;.&lt;/p&gt;
    &lt;h3&gt;Some useful BSOS practices&lt;/h3&gt;
    &lt;p&gt;The standard way of dealing with BSOS is verification. Tests, static checks, model checking, etc. We get more confident in our code if our verifications succeed. But then we also have to be suspicious of &lt;em&gt;that&lt;/em&gt; success, too! How do I know whether my tests are passing because they're properly testing correct code or because they're failing to test incorrect code?&lt;/p&gt;
    &lt;p&gt;This is why test-driven development gurus tell people to write a failing test first. Then at least we know the tests are doing &lt;em&gt;something&lt;/em&gt; (even if they still might not be testing what they want).&lt;/p&gt;
    &lt;p&gt;The other limit of verification is that it can't tell us &lt;em&gt;why&lt;/em&gt; something succeeds. Mainstream verification methods are good at explaining why things &lt;em&gt;fail&lt;/em&gt;— expected vs actual test output, type mismatches, specification error traces. Success isn't as "information-rich" as failure. How do you distinguish a faithful implementation of &lt;a href="https://en.wikipedia.org/wiki/Collatz_conjecture" target="_blank"&gt;&lt;code&gt;is_collatz_counterexample&lt;/code&gt;&lt;/a&gt; from &lt;code&gt;return false&lt;/code&gt;?&lt;/p&gt;
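    &lt;p&gt;As a rough illustration, here is a minimal Python sketch. Both function bodies are hypothetical, written only for this example, and the "faithful" version is itself limited: it detects cycles but cannot detect a sequence that grows forever. Every input tried gives the same answer from both, so a passing check says nothing about which one we actually have:&lt;/p&gt;

```python
def is_collatz_counterexample(n):
    """Return True if n's Collatz sequence enters a cycle that avoids 1.

    Hypothetical sketch: it cannot detect a sequence that grows forever.
    """
    seen = set()
    while n != 1:
        if n in seen:
            return True
        seen.add(n)
        n = n // 2 if n % 2 == 0 else 3 * n + 1
    return False


def is_collatz_counterexample_stub(n):
    """Does no work at all."""
    return False


# Compare the two implementations on every input in a bounded range.
results = [
    is_collatz_counterexample(n) == is_collatz_counterexample_stub(n)
    for n in range(1, 10_000)
]
print(all(results))
```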
    &lt;p&gt;A broader technique I follow is &lt;em&gt;make it work, make it break&lt;/em&gt;. If code is working for the right reasons, I should be able to predict how to break it. This can be either a change in the runtime (this will livelock if we 10x the number of connections), or a change to the code itself (commenting out &lt;em&gt;this&lt;/em&gt; line will cause property X to fail). &lt;sup id="fnref:superproperties"&gt;&lt;a class="footnote-ref" href="#fn:superproperties"&gt;1&lt;/a&gt;&lt;/sup&gt; If the code still works even after the change, my model of the code is wrong and it was succeeding for the wrong reasons.&lt;/p&gt;
    &lt;h3&gt;Happy and Sad Paths&lt;/h3&gt;
    &lt;div class="subscribe-form"&gt;&lt;/div&gt;
    &lt;p&gt;A related topic (possibly subset?) is "happy and sad paths". The happy path of your code is the behavior when everything's going right: correct inputs, preconditions are satisfied, the data sources are present, etc. The sad path is all of the code that handles things going wrong. Retry mechanisms, insufficient user authority, database constraint violation, etc. In most software, the code supporting the sad paths dwarfs the code in the happy path.&lt;/p&gt;
    &lt;p&gt;BSOS says that I can't just show code works in the happy path, I also need to check it works in the sad path. &lt;/p&gt;
    &lt;p&gt;BSOS also says that I have to be suspicious when the sad path works properly, too. &lt;/p&gt;
    &lt;p&gt;Say I add a retry mechanism to my code to handle the failure mode of timeouts. I test the code and it works. Did the retry code actually &lt;em&gt;run&lt;/em&gt;? Did it run &lt;em&gt;regardless&lt;/em&gt; of the original response? Is it really doing exponential backoff? Will it stop after the maximum retry limit? Is the sad path code &lt;em&gt;after&lt;/em&gt; the maximum retry limit working properly?&lt;/p&gt;
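    &lt;p&gt;A minimal sketch of what checking those questions could look like, in Python. The names &lt;code&gt;call_with_retries&lt;/code&gt; and &lt;code&gt;make_flaky&lt;/code&gt; are hypothetical, not from any library, and the sleep function is passed in as an assumption so the test can record delays instead of waiting:&lt;/p&gt;

```python
import itertools


def call_with_retries(operation, max_retries, base_delay, sleep):
    """Call operation, retrying on TimeoutError with exponential backoff."""
    for attempt in itertools.count():
        try:
            return operation()
        except TimeoutError:
            if attempt == max_retries:
                raise  # Out of retries: surface the original error.
            sleep(base_delay * 2 ** attempt)


def make_flaky(failures):
    """Build an operation that times out `failures` times, then succeeds."""
    calls = []
    outcomes = iter([TimeoutError()] * failures + ["ok"])

    def operation():
        calls.append(1)
        outcome = next(outcomes)
        if isinstance(outcome, Exception):
            raise outcome
        return outcome

    return operation, calls


# Did the retry code actually run, and did it back off exponentially?
delays = []
operation, calls = make_flaky(failures=2)
result = call_with_retries(operation, 3, 1, delays.append)
assert result == "ok"
assert len(calls) == 3
assert delays == [1, 2]

# Does it stay out of the way when the first call succeeds?
quiet_delays = []
operation, quiet_calls = make_flaky(failures=0)
call_with_retries(operation, 3, 1, quiet_delays.append)
assert len(quiet_calls) == 1
assert quiet_delays == []

# Does it stop at the retry limit and re-raise the timeout?
exhausted_delays = []
operation, exhausted_calls = make_flaky(failures=10)
try:
    call_with_retries(operation, 3, 1, exhausted_delays.append)
    gave_up = False
except TimeoutError:
    gave_up = True
assert gave_up
assert len(exhausted_calls) == 4
assert exhausted_delays == [1, 2, 4]
```

    &lt;p&gt;Each assertion targets one of the questions above. Deleting the &lt;code&gt;sleep&lt;/code&gt; call or changing the limit check should make a specific assertion fail, which is a quick way to confirm the test is exercising the sad path at all.&lt;/p&gt;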
    &lt;p&gt;&lt;a href="https://www.usenix.org/system/files/conference/osdi14/osdi14-paper-yuan.pdf" target="_blank"&gt;One paper&lt;/a&gt; found that 35% of catastrophic distributed system failures were caused by "trivial mistakes in error handlers" (pg 9). These were in mature, battle-hardened programs. Be suspicious of success. Be more suspicious of sad path success.&lt;/p&gt;
    &lt;hr/&gt;
    &lt;h2&gt;Blog Rec&lt;/h2&gt;
    &lt;p&gt;This week's blog rec is &lt;a href="https://www.redblobgames.com/" target="_blank"&gt;Red Blob Games&lt;/a&gt;!&lt;sup id="fnref:blogs-vs-articles"&gt;&lt;a class="footnote-ref" href="#fn:blogs-vs-articles"&gt;2&lt;/a&gt;&lt;/sup&gt; While primarily about computer game programming, the meat of the content is beautiful, interactive guides to general CS algorithms. Some highlights:&lt;/p&gt;
    &lt;ul&gt;
    &lt;li&gt;&lt;a href="https://www.redblobgames.com/pathfinding/a-star/introduction.html" target="_blank"&gt;Introduction to the A* Algorithm&lt;/a&gt; was really illuminating when I was a baby programmer.&lt;/li&gt;
    &lt;li&gt;I'm sure this &lt;a href="https://www.redblobgames.com/articles/noise/introduction.html" target="_blank"&gt;overview of noise functions&lt;/a&gt; will be useful to me &lt;em&gt;someday&lt;/em&gt;. Maybe for test data generation?&lt;/li&gt;
    &lt;li&gt;If you're also an explainer type he has a lot of great stuff on &lt;a href="https://www.redblobgames.com/making-of/line-drawing/" target="_blank"&gt;his process&lt;/a&gt; and his &lt;a href="https://www.redblobgames.com/making-of/little-things/" target="_blank"&gt;little tricks&lt;/a&gt; to make things more understandable.&lt;/li&gt;
    &lt;/ul&gt;
    &lt;p&gt;(I don't think his &lt;a href="https://www.redblobgames.com/blog/posts.xml" target="_blank"&gt;RSS feed&lt;/a&gt; covers new interactive articles, only the &lt;a href="https://www.redblobgames.com/blog/" target="_blank"&gt;blog&lt;/a&gt; specifically.)&lt;/p&gt;
    &lt;div class="footnote"&gt;
    &lt;hr/&gt;
    &lt;ol&gt;
    &lt;li id="fn:superproperties"&gt;
    &lt;p&gt;&lt;a href="https://www.jameskoppel.com/" target="_blank"&gt;Jimmy Koppel&lt;/a&gt; once proposed that just as code has properties, code variations have &lt;a href="https://groups.csail.mit.edu/sdg/pubs/2020/demystifying_dependence_published.pdf" target="_blank"&gt;&lt;strong&gt;superproperties&lt;/strong&gt;&lt;/a&gt;. For example, "no modification to the codebase causes us to use a greater number of deprecated APIs." &lt;a class="footnote-backref" href="#fnref:superproperties" title="Jump back to footnote 1 in the text"&gt;↩&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id="fn:blogs-vs-articles"&gt;
    &lt;p&gt;Okay, it's more an &lt;em&gt;article&lt;/em&gt; site, because there's also a &lt;a href="https://www.redblobgames.com/blog/" target="_blank"&gt;Red Blob &lt;em&gt;blog&lt;/em&gt;&lt;/a&gt; (which covers a lot of neat stuff, too). Maybe I should just rename this section to "site rec". &lt;a class="footnote-backref" href="#fnref:blogs-vs-articles" title="Jump back to footnote 2 in the text"&gt;↩&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;/ol&gt;
    &lt;/div&gt;
    </description><pubDate>Wed, 16 Oct 2024 15:08:39 +0000</pubDate><guid>https://buttondown.com/hillelwayne/archive/be-suspicious-of-success/</guid></item><item><title>How to convince engineers that formal methods is cool</title><link>https://buttondown.com/hillelwayne/archive/how-to-convince-engineers-that-formal-methods-is/</link><description>
    &lt;p&gt;Sorry there was no newsletter last week! I got COVID. Still got it, which is why this one's also short.&lt;/p&gt;
    &lt;h3&gt;Logic for Programmers v0.4&lt;/h3&gt;
    &lt;p&gt;&lt;a href="https://leanpub.com/logic/" target="_blank"&gt;Now available&lt;/a&gt;! This version adds a chapter on TLA+, significantly expands the constraint solver chapter, and adds a "planner programming" section to the Logic Programming chapter. You can see the full release notes on the &lt;a href="https://leanpub.com/logic/" target="_blank"&gt;book page&lt;/a&gt;.&lt;/p&gt;
    &lt;h1&gt;How to convince engineers that formal methods is cool&lt;/h1&gt;
    &lt;p&gt;I have an open email for answering questions about formal methods,&lt;sup id="fnref:fs-fv"&gt;&lt;a class="footnote-ref" href="#fn:fs-fv"&gt;1&lt;/a&gt;&lt;/sup&gt; and one of the most common questions I get is "how do I convince my coworkers that this is worth doing?" Usually the context is that the reader is really into the idea of FM but their coworkers don't know it exists. The goal of the asker is to both introduce FM to their coworkers and persuade them that FM's useful. &lt;/p&gt;
    &lt;p&gt;In my experience as a consultant and advocate, I've found that there's only two consistently-effective ways to successfully pitch FM:&lt;/p&gt;
    &lt;ol&gt;
    &lt;li&gt;Use FM to find an &lt;em&gt;existing&lt;/em&gt; bug in a work system.&lt;/li&gt;
    &lt;li&gt;Show how FM finds a historical bug that's already been fixed.&lt;/li&gt;
    &lt;/ol&gt;
    &lt;h4&gt;Why this works&lt;/h4&gt;
    &lt;p&gt;There's two main objections to FM that we need to address. The first is that FM is too academic and doesn't provide a tangible, practical benefit. The second is that FM is too hard; only PhDs and rocket scientists can economically use it. (Showing use cases from AWS &lt;em&gt;et al.&lt;/em&gt; isn't broadly persuasive because skeptics don't have any insight into how AWS functions.) Finding an existing bug hits both: it helped the team with a real problem, and it was done by a mere mortal. &lt;/p&gt;
    &lt;div class="subscribe-form"&gt;&lt;/div&gt;
    &lt;p&gt;Demonstrating FM on a historical bug isn't &lt;em&gt;as&lt;/em&gt; effective: it only shows that formal methods &lt;em&gt;could have&lt;/em&gt; helped, not that it actually does help. But people will usually remember the misery of debugging that problem. Bug war stories are popular for a reason!&lt;/p&gt;
    &lt;h3&gt;Making historical bugs persuasive&lt;/h3&gt;
    &lt;p&gt;So "live bug" is a stronger rec, but "historical bug" tends to be easier to show. This is because &lt;em&gt;you know what you're looking for&lt;/em&gt;. It's easier to write a high-level spec on a system you already know, and show it finds a bug you already know about.&lt;/p&gt;
    &lt;p&gt;The trick to make it look convincing is to make the spec and bug as "natural" as possible. You can't make it seem like FM only found the bug because you had foreknowledge of what it was— then the whole exercise is too contrived. People will already know you had foreknowledge, of course, and are factoring that into their observations. You want to make the case that the spec you're writing is clear and obvious enough that an "ignorant" person could have written it. That means nothing contrived or suspicious.&lt;/p&gt;
    &lt;p&gt;This is a bit of a fuzzy definition, more a vibe than anything. Ask yourself "does this spec look like something that was tailor-made around this bug, or does it find the bug as a byproduct of being a regular spec?"&lt;/p&gt;
    &lt;p&gt;A good example of a "natural" spec is &lt;a href="https://www.hillelwayne.com/post/augmenting-agile/" target="_blank"&gt;the bounded queue problem&lt;/a&gt;. It's a straight translation of some Java code with no properties besides deadlock checking. Usually you'll be at a higher level of abstraction, though.&lt;/p&gt;
    &lt;hr/&gt;
    &lt;h3&gt;Blog rec: &lt;a href="https://www.argmin.net/" target="_blank"&gt;arg min&lt;/a&gt;&lt;/h3&gt;
    &lt;p&gt;This is a new section I want to try for a bit: recommending tech(/-adjacent) blogs that I like. This first one is going to be a bit niche: &lt;a href="https://www.argmin.net/" target="_blank"&gt;arg min&lt;/a&gt; is writing up lecture notes on "convex optimization". It's a cool look into the theory behind constraint solving. I don't understand most of the math but the prose is pretty approachable. Couple of highlights:&lt;/p&gt;
    &lt;ul&gt;
    &lt;li&gt;&lt;a href="https://www.argmin.net/p/modeling-dystopia" target="_blank"&gt;Modeling Dystopia&lt;/a&gt; about why constraint solving isn't a mainstream technology.&lt;/li&gt;
    &lt;li&gt;&lt;a href="https://www.argmin.net/p/convex-optimization-live-blog" target="_blank"&gt;Table of Contents&lt;/a&gt; to see all of the posts.&lt;/li&gt;
    &lt;/ul&gt;
    &lt;p&gt;The blogger also talks about some other topics but I haven't read those posts much.&lt;/p&gt;
    &lt;div class="footnote"&gt;
    &lt;hr/&gt;
    &lt;ol&gt;
    &lt;li id="fn:fs-fv"&gt;
    &lt;p&gt;As always, talking primarily about formal specification of systems (TLA+/Alloy/Spin), not formal verification of code (Dafny/SPARK/Agda). I talk about the differences a bit &lt;a href="https://www.hillelwayne.com/post/why-dont-people-use-formal-methods/" target="_blank"&gt;here&lt;/a&gt; (but I really need to write a more focused piece). &lt;a class="footnote-backref" href="#fnref:fs-fv" title="Jump back to footnote 1 in the text"&gt;↩&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;/ol&gt;
    &lt;/div&gt;
    </description><pubDate>Tue, 08 Oct 2024 16:18:55 +0000</pubDate><guid>https://buttondown.com/hillelwayne/archive/how-to-convince-engineers-that-formal-methods-is/</guid></item><item><title>Refactoring Invariants</title><link>https://buttondown.com/hillelwayne/archive/refactoring-invariants/</link><description>
    &lt;p&gt;(Feeling a little sick so this one will be short.)&lt;/p&gt;
    &lt;p&gt;I'm often asked by clients to review their (usually TLA+) formal specifications. These specs are generally slower and more convoluted than an expert would write. I want to fix them up without changing the overall behavior of the spec or introducing subtle bugs.&lt;/p&gt;
    &lt;p&gt;To do this, I use a rather lovely feature of TLA+. Say I see a 100-line &lt;code&gt;Foo&lt;/code&gt; action that I think I can refactor down to 20 lines. I'll first write a refactored version as a separate action &lt;code&gt;NewFoo&lt;/code&gt;, then I run the model checker with the property&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;RefactorProp ==
        [][Foo &lt;=&gt; NewFoo]_vars
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;That's an intimidating nest of symbols but all it's saying is that every &lt;code&gt;Foo&lt;/code&gt; step must also be a &lt;code&gt;NewFoo&lt;/code&gt; step, and vice versa. If the refactor ever does something different from the original action, the model checker will report the exact behavior and transition it fails for. Conversely, if the model checker passes, I can safely assume they have identical behaviors.&lt;/p&gt;
    &lt;p&gt;This is a &lt;strong&gt;refactoring invariant&lt;/strong&gt;:&lt;sup id="fnref:invariant"&gt;&lt;a class="footnote-ref" href="#fn:invariant"&gt;1&lt;/a&gt;&lt;/sup&gt; the old and new versions of functions have identical behavior. Refactoring invariants are superbly useful in formal specification. Software devs spend enough time refactoring that they'd be useful for coding, too.&lt;/p&gt;
    &lt;p&gt;Alas, refactoring invariants are a little harder to express in code. In TLA+ we're working with bounded state spaces, so the model checker can check the invariant for every possible state. Even a simple program can have an unbounded state space via an infinite number of possible function inputs. &lt;/p&gt;
    &lt;p&gt;(Also formal specifications are "pure" simulations while programs have side effects.)&lt;/p&gt;
    &lt;p&gt;The "normal" way to verify a program refactoring is to start out with a huge suite of &lt;a href="https://buttondown.com/hillelwayne/archive/oracle-testing/" target="_blank"&gt;oracle tests&lt;/a&gt;. This &lt;em&gt;should&lt;/em&gt; catch a bad refactor via failing tests. The downside is that you might not have the test suite in the first place, or not one that covers your particular refactoring. Second, even if the test suite does, it only indirectly tests the invariant. It catches the refactoring error as a consequence of testing other stuff. What if we want to directly test the refactoring invariant?&lt;/p&gt;
    &lt;h3&gt;Two ways of doing this&lt;/h3&gt;
    &lt;p&gt;One: by pulling in formal methods. Ray Myers has a &lt;a href="https://www.youtube.com/watch?v=UdB3XBf219Y" target="_blank"&gt;neat video&lt;/a&gt; on formally proving a refactoring is correct. That one's in the niche language ACL2, but he's also got one on &lt;a href="https://www.youtube.com/watch?v=_7RXQE-pCMo" target="_blank"&gt;refactoring C&lt;/a&gt;. You might not even need to prove the refactoring correct, you could probably get away with using an &lt;a href="https://github.com/pschanely/CrossHair" target="_blank"&gt;SMT solver&lt;/a&gt; to find counterexamples.&lt;/p&gt;
    &lt;p&gt;Two: by using property-based testing. Generate random inputs, pass them to both functions, and check that the outputs are identical. Using the Python &lt;a href="https://hypothesis.readthedocs.io/en/latest/" target="_blank"&gt;Hypothesis&lt;/a&gt; library:&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="nn"&gt;hypothesis&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;given&lt;/span&gt;
    &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;hypothesis.strategies&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="nn"&gt;st&lt;/span&gt;
    
    &lt;span class="c1"&gt;# from the `gilded rose kata`&lt;/span&gt;
    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;update_quality&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;items&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;list&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;Item&lt;/span&gt;&lt;span class="p"&gt;]):&lt;/span&gt;
        &lt;span class="o"&gt;...&lt;/span&gt;
    
    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;update_quality_new&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;items&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;list&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;Item&lt;/span&gt;&lt;span class="p"&gt;]):&lt;/span&gt;
        &lt;span class="o"&gt;...&lt;/span&gt;
    
    &lt;span class="nd"&gt;@given&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;st&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;lists&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;st&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;builds&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;Item&lt;/span&gt;&lt;span class="p"&gt;)))&lt;/span&gt;
    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;test_refactoring&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;l&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="k"&gt;assert&lt;/span&gt; &lt;span class="n"&gt;update_quality&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;l&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="n"&gt;update_quality_new&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;l&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;One tricky bit is if the function is part of a long call chain &lt;code&gt;A -&gt; B -&gt; C&lt;/code&gt;, and you want to test that refactoring &lt;code&gt;C'&lt;/code&gt; doesn't change the behavior of &lt;code&gt;A&lt;/code&gt;. You'd have to add a &lt;code&gt;B'&lt;/code&gt; that uses &lt;code&gt;C'&lt;/code&gt; and then an &lt;code&gt;A'&lt;/code&gt; that uses &lt;code&gt;B'&lt;/code&gt;. Maybe you could instead create a branch, commit the change to &lt;code&gt;C'&lt;/code&gt; in that branch, and then run a &lt;a href="https://www.hillelwayne.com/post/cross-branch-testing/" target="_blank"&gt;cross-branch test&lt;/a&gt; against each branch's &lt;code&gt;A&lt;/code&gt;.&lt;/p&gt;
    &lt;p&gt;Impure functions are harder. The test now makes some side effect twice, which could spuriously break the refactoring invariant. You could instead test that the changes are the same, or try to get the functions to affect different entities and then compare the updates of each entity. There's no general solution here though, and there might be No Good Way for a particular effectful refactoring.&lt;/p&gt;
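    &lt;p&gt;One way the "different entities" idea might look, as a sketch only: give each version its own deep copy of the same starting state, run both, and compare the resulting states. The &lt;code&gt;restock&lt;/code&gt; functions and the inventory shape are hypothetical, and the inputs come from the standard library's &lt;code&gt;random&lt;/code&gt; module rather than Hypothesis to keep the example dependency-free. This only works when the side effect is confined to state you can copy:&lt;/p&gt;

```python
import copy
import random


def restock(inventory, order):
    """Original (hypothetical) impure function: mutates inventory in place."""
    for name in order:
        if name in inventory:
            inventory[name] = inventory[name] + order[name]
        else:
            inventory[name] = order[name]


def restock_new(inventory, order):
    """Refactored version that should cause the same mutation."""
    for name, amount in order.items():
        inventory[name] = inventory.get(name, 0) + amount


rng = random.Random(0)  # Fixed seed so any failure is reproducible.
names = ["apple", "pear", "plum"]
mismatches = []
for _ in range(500):
    inventory = {n: rng.randint(0, 5) for n in rng.sample(names, rng.randint(0, 3))}
    order = {n: rng.randint(1, 5) for n in rng.sample(names, rng.randint(0, 3))}
    # Each version gets its own copy, so neither sees the other's side effect.
    old_copy = copy.deepcopy(inventory)
    new_copy = copy.deepcopy(inventory)
    restock(old_copy, order)
    restock_new(new_copy, order)
    if old_copy != new_copy:
        mismatches.append((inventory, order))

print(mismatches)
```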
    &lt;h3&gt;Behavior-changing rewrites&lt;/h3&gt;
    &lt;p&gt;We can apply similar ideas for rewrites that change &lt;em&gt;behavior&lt;/em&gt;. Say we have an API, and v1 returns a list of user names while v2 returns a &lt;code&gt;{version, userids}&lt;/code&gt; dict. Then we can find some transformation of v2 into v1, and run the refactoring invariant on that:&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;v2_to_v1&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;v2_resp&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;User&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;user_id&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;name&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;user_id&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;v2_resp&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"userids"&lt;/span&gt;&lt;span class="p"&gt;]]&lt;/span&gt;
    
    &lt;span class="nd"&gt;@given&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;some_query_generator&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;test_refactoring&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;q&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="k"&gt;assert&lt;/span&gt; &lt;span class="n"&gt;v1&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;q&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="n"&gt;v2_to_v1&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;v2&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;q&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;Fun fact: &lt;code&gt;v2_to_v1&lt;/code&gt; is a &lt;a href="https://buttondown.com/hillelwayne/archive/software-isomorphisms/" target="_blank"&gt;software homomorphism&lt;/a&gt;!&lt;/p&gt;
    &lt;div class="footnote"&gt;
    &lt;hr/&gt;
    &lt;ol&gt;
    &lt;li id="fn:invariant"&gt;
    &lt;p&gt;Well technically it's an &lt;em&gt;action property&lt;/em&gt; since it's on the transitions of states, not the states, but "refactoring invariant" gets the idea across better. &lt;a class="footnote-backref" href="#fnref:invariant" title="Jump back to footnote 1 in the text"&gt;↩&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;/ol&gt;
    &lt;/div&gt;
    </description><pubDate>Tue, 24 Sep 2024 20:06:10 +0000</pubDate><guid>https://buttondown.com/hillelwayne/archive/refactoring-invariants/</guid></item><item><title>Goodhart's Law in Software Engineering</title><link>https://buttondown.com/hillelwayne/archive/goodharts-law-in-software-engineering/</link><description>
    &lt;h3&gt;Blog Hiatus&lt;/h3&gt;
    &lt;p&gt;You might have noticed I haven't been updating my website. I haven't even &lt;em&gt;looked&lt;/em&gt; at any of my drafts for the past three months. All that time is instead going into &lt;em&gt;Logic for Programmers&lt;/em&gt;. I'll get back to the site when that's done or in 2025, whichever comes first. Newsletter and &lt;a href="https://www.patreon.com/hillelwayne" target="_blank"&gt;Patreon&lt;/a&gt; will still get regular updates.&lt;/p&gt;
    &lt;p&gt;(As a comparison, the book is now 22k words. That's like 11 blog posts!)&lt;/p&gt;
    &lt;h2&gt;Goodhart's Law in Software Engineering&lt;/h2&gt;
    &lt;p&gt;I recently got into an argument with some people about whether small functions were &lt;em&gt;mostly&lt;/em&gt; a good idea or &lt;em&gt;always 100%&lt;/em&gt; a good idea, and it reminded me a lot of &lt;a href="https://en.wikipedia.org/wiki/Goodhart%27s_law" target="_blank"&gt;Goodhart's Law&lt;/a&gt;:&lt;/p&gt;
    &lt;blockquote&gt;
    &lt;p&gt;When a measure becomes a target, it ceases to be a good measure.&lt;/p&gt;
    &lt;/blockquote&gt;
    &lt;p&gt;The &lt;em&gt;weak&lt;/em&gt; version of this is that people have perverse incentives to game the metrics. If your metric is "number of bugs in the bug tracker", people will start spuriously closing bugs just to get the number down. &lt;/p&gt;
    &lt;p&gt;The &lt;em&gt;strong&lt;/em&gt; version of the law is that even 100% honest pursuit of a metric, taken far enough, is harmful to your goals, and this is an inescapable consequence of the difference between metrics and values. We have metrics in the first place because what we actually &lt;em&gt;care about&lt;/em&gt; is nonquantifiable. There's some &lt;em&gt;thing&lt;/em&gt; we want more of, but we have no way of directly measuring that thing. We &lt;em&gt;can&lt;/em&gt; measure something that looks like a rough approximation for our goal. But it's &lt;em&gt;not&lt;/em&gt; our goal, and if we replace the metric with the goal, we start taking actions that favor the metric over the goal.&lt;/p&gt;
    &lt;p&gt;Say we want more reliable software. How do you measure "reliability"? You can't. But you &lt;em&gt;can&lt;/em&gt; measure the number of bugs in the bug tracker, because fewer open bugs roughly means more reliability. &lt;strong&gt;This is not the same thing&lt;/strong&gt;. I've seen bugs fixed in ways that made the system &lt;em&gt;less&lt;/em&gt; reliable, but not in ways that translated into tracked bugs.&lt;/p&gt;
    &lt;p&gt;I am a firm believer in the strong version of Goodhart's law. Mostly because of this:&lt;/p&gt;
    &lt;p&gt;&lt;img alt="A peacock with its feathers out. The peacock is scremming" class="newsletter-image" src="https://assets.buttondown.email/images/2573503d-bc57-49ce-aa26-9d399d801118.jpg?w=960&amp;fit=max"/&gt;&lt;/p&gt;
    &lt;p&gt;What does a peahen look for in a mate? A male with maximum fitness. What's a metric that approximates fitness? How nice the plumage is, because nicer plumage = more energy to waste on plumage.&lt;sup id="fnref:peacock"&gt;&lt;a class="footnote-ref" href="#fn:peacock"&gt;1&lt;/a&gt;&lt;/sup&gt; But that only &lt;em&gt;approximates&lt;/em&gt; fitness, and over generations the plumage itself becomes the point at the cost of overall bird fitness. Sexual selection is Goodhart's law in action.&lt;/p&gt;
    &lt;div class="subscribe-form"&gt;&lt;/div&gt;
    &lt;p&gt;If the blind watchmaker can fall for Goodhart, people can too.&lt;/p&gt;
    &lt;h3&gt;Examples in Engineering&lt;/h3&gt;
    &lt;p&gt;Goodhart's law is a warning for pointy-haired bosses who come up with terrible metrics: lines added, feature points done, etc. I'm more interested in how it affects the metrics we set for ourselves that our bosses might never know about.&lt;/p&gt;
    &lt;ul&gt;
    &lt;li&gt;"Test coverage" is a proxy for how thoroughly we've tested our software. It diverges when we need to test lots of properties of the same lines of code, or when our worst bugs are emergent at the integration level.&lt;/li&gt;
    &lt;li&gt;"Cyclomatic complexity" and "function size" are proxies for code legibility. They diverges when we think about global module legibility, not local function legibility. Then too many functions can obscure the code and data flow.&lt;/li&gt;
    &lt;li&gt;Benchmarks are proxies for performant programs, and diverge when improving benchmarks slows down unbenchmarked operations.&lt;/li&gt;
    &lt;li&gt;Amount of time spent pairing/code reviewing/debugging/whatever proxies "being productive".&lt;/li&gt;
    &lt;li&gt;&lt;a href="https://dora.dev/" target="_blank"&gt;The DORA report&lt;/a&gt; is an interesting case, because it claims four metrics&lt;sup id="fnref:metrics"&gt;&lt;a class="footnote-ref" href="#fn:metrics"&gt;2&lt;/a&gt;&lt;/sup&gt; are proxies for ineffable goals like "elite performance" and &lt;em&gt;employee satisfaction&lt;/em&gt;. It also argues that you should minimize commit size to improve the DORA metrics. A proxy of a proxy of a goal!&lt;/li&gt;
    &lt;/ul&gt;
    &lt;h3&gt;What can we do about this?&lt;/h3&gt;
    &lt;p&gt;No, I do not know how to avoid a law that can hijack the process of evolution.&lt;/p&gt;
    &lt;p&gt;The 2023 DORA report suggests readers should avoid Goodhart's law and "assess a team's strength across a wide range of people, processes, and technical capabilities" (pg 10), which is kind of like saying the fix to production bugs is "don't write bugs". It's a guiding principle but not actionable advice that gets to that principle.&lt;/p&gt;
    &lt;p&gt;They also say "to use a combination of metrics to drive deeper understanding" (ibid), which makes more sense at first. If you have metrics X and Y to approximate goal G, then overoptimizing X &lt;em&gt;might&lt;/em&gt; hurt Y, indicating you're getting further from G. In practice I've seen it turn into "we can't improve X because it'll hurt Y and we can't improve Y because it'll hurt X." This &lt;em&gt;could&lt;/em&gt; mean we're at the best possible spot for G, but more often it means we're trapped very far from our goal. You could come up with a weighted combination of X and Y, like 0.7X + 0.3Y, but &lt;em&gt;that too&lt;/em&gt; is a metric subject to Goodhart. &lt;/p&gt;
    &lt;p&gt;I guess the best I can do is say "use your best engineering judgement"? Evolution is mindless, people aren't. Again, not an actionable or scalable bit of advice, but as I grow older I keep finding "use your best judgement" is all we can do. Knowledge work is ineffable and irreducible.&lt;/p&gt;
    &lt;div class="footnote"&gt;
    &lt;hr/&gt;
    &lt;ol&gt;
    &lt;li id="fn:peacock"&gt;
    &lt;p&gt;This sent me down a rabbit hole; turns out scientists are still debating what &lt;em&gt;exactly&lt;/em&gt; the peacock's tail is used for! Is it sexual selection? Adverse signalling? Something else??? &lt;a class="footnote-backref" href="#fnref:peacock" title="Jump back to footnote 1 in the text"&gt;↩&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id="fn:metrics"&gt;
    &lt;p&gt;How soon commits get to production, deployment frequency, percent of deployments that cause errors in production, and mean time to recovery. &lt;a class="footnote-backref" href="#fnref:metrics" title="Jump back to footnote 2 in the text"&gt;↩&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;/ol&gt;
    &lt;/div&gt;
    </description><pubDate>Tue, 17 Sep 2024 16:33:40 +0000</pubDate><guid>https://buttondown.com/hillelwayne/archive/goodharts-law-in-software-engineering/</guid></item><item><title>Why Not Comments</title><link>https://buttondown.com/hillelwayne/archive/why-not-comments/</link><description>
    &lt;h2&gt;Logic For Programmers v0.3&lt;/h2&gt;
    &lt;p&gt;&lt;a href="https://leanpub.com/logic/" target="_blank"&gt;Now available&lt;/a&gt;! It's a light release as I learn more about formatting a nice-looking book. You can see some of the differences between v0.2 and v0.3 &lt;a href="https://bsky.app/profile/hillelwayne.com/post/3l3egdqnqj62o" target="_blank"&gt;here&lt;/a&gt;.&lt;/p&gt;
    &lt;h2&gt;Why Not Comments&lt;/h2&gt;
    &lt;p&gt;Code is written in a structured machine language, comments are written in an expressive human language. The "human language" bit makes comments more expressive and communicative than code. Code has a limited amount of something &lt;em&gt;like&lt;/em&gt; human language contained in identifiers. "Comment the why, not the what" means to push as much information as possible into identifiers. &lt;a href="https://buttondown.com/hillelwayne/archive/3866bd6e-22c3-4098-92ef-4d47ef287ed8" target="_blank"&gt;Not all "what" can be embedded like this&lt;/a&gt;, but a lot can.&lt;/p&gt;
    &lt;p&gt;In recent years I see more people arguing that &lt;em&gt;whys&lt;/em&gt; do not belong in comments either, that they can be embedded into &lt;code&gt;LongFunctionNames&lt;/code&gt; or the names of test cases. Virtually all "self-documenting" codebases add documentation through the addition of identifiers.&lt;sup id="fnref:exception"&gt;&lt;a class="footnote-ref" href="#fn:exception"&gt;1&lt;/a&gt;&lt;/sup&gt;&lt;/p&gt;
    &lt;p&gt;So what's something in the range of human expression that &lt;em&gt;cannot&lt;/em&gt; be represented with more code?&lt;/p&gt;
    &lt;p&gt;Negative information, drawing attention to what's &lt;em&gt;not&lt;/em&gt; there. The "why nots" of the system.&lt;/p&gt;
    &lt;h3&gt;A Recent Example&lt;/h3&gt;
    &lt;p&gt;This one comes from &lt;em&gt;Logic for Programmers&lt;/em&gt;. For convoluted technical reasons the epub build wasn't translating math notation (&lt;code&gt;\forall&lt;/code&gt;) into symbols (&lt;code&gt;∀&lt;/code&gt;). I wrote a script to manually go through and replace tokens in math strings with their Unicode equivalents. The easiest way to do this is to call &lt;code&gt;string = string.replace(old, new)&lt;/code&gt; for each of the 16 math symbols I need to replace (some math strings have multiple symbols).&lt;/p&gt;
    &lt;p&gt;This is incredibly inefficient and I could instead do all 16 replacements in a single pass. But that would be a more complicated solution. So I did it the simple way, with a comment:&lt;/p&gt;
    &lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;Does 16 passes over each string
    BUT there are only 25 math strings in the book so far and most are &amp;lt;5 characters.
    So it's still fast enough.
    &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
    &lt;p&gt;You can think of this as a "why I'm using slow code", but you can also think of it as "why not fast code". It's calling attention to something that's &lt;em&gt;not there&lt;/em&gt;.&lt;/p&gt;
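    &lt;p&gt;To make the tradeoff concrete, here is a minimal sketch of both approaches. It is not the actual script: the function names and the symbol table are assumptions for illustration (the post names only &lt;code&gt;\forall&lt;/code&gt;), and the single-pass version uses one compiled regular expression, which is just one possible way to do it.&lt;/p&gt;

```python
import re

# Hypothetical symbol table. The post mentions 16 symbols but names only \forall;
# the other entries are assumptions made for this example.
SYMBOLS = {
    r"\forall": "∀",
    r"\exists": "∃",
    r"\in": "∈",
}


def replace_multi_pass(text, symbols=SYMBOLS):
    # Does one pass over the string per symbol.
    # Why not a single pass? With only a few short math strings,
    # the simple version is still fast enough.
    for old, new in symbols.items():
        text = text.replace(old, new)
    return text


def replace_single_pass(text, symbols=SYMBOLS):
    # Build one alternation over all tokens, longest first, so a short
    # token can never match the prefix of a longer one.
    tokens = sorted(symbols, key=len, reverse=True)
    pattern = re.compile("|".join(re.escape(token) for token in tokens))
    # A function replacement avoids backslash handling in the replacement text.
    return pattern.sub(lambda match: symbols[match.group(0)], text)


print(replace_multi_pass(r"\forall x \in S"))
print(replace_single_pass(r"\forall x \in S"))
```

    &lt;p&gt;For a table like this one, where no token is a prefix of another, both functions give the same result. The single-pass version needs a regex, an escaping step, and a longest-first ordering, which is the extra complexity the comment says was not worth it.&lt;/p&gt;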
    &lt;h3&gt;Why the comment&lt;/h3&gt;
    &lt;p&gt;If the slow code isn't causing any problems, why have a comment at all?&lt;/p&gt;
    &lt;p&gt;Well, first of all, the code might be a problem later. If a future version of &lt;em&gt;LfP&lt;/em&gt; has hundreds of math strings instead of a couple dozen, then this build step will bottleneck the whole build. Good to lay a signpost now so I know exactly what to fix later.&lt;/p&gt;
    &lt;p&gt;But even if the code is fine forever, the comment still does something important: it shows &lt;em&gt;I'm aware of the tradeoff&lt;/em&gt;. Say I come back to my project two years from now, open &lt;code&gt;epub_math_fixer.py&lt;/code&gt; and see my terrible slow code. I ask "why did I write something so terrible?" Was it inexperience, time crunch, or just a random mistake?&lt;/p&gt;
    &lt;p&gt;The negative comment tells me that I &lt;em&gt;knew&lt;/em&gt; this was slow code, looked into the alternatives, and decided against optimizing. I don't have to spend a bunch of time reinvestigating only to come to the same conclusion. &lt;/p&gt;
    &lt;h2&gt;Why this can't be self-documented&lt;/h2&gt;
    &lt;p&gt;When I was first playing with this idea, someone told me that my negative comment wasn't necessary: I could just name the function &lt;code&gt;RunFewerTimesSlowerAndSimplerAlgorithmAfterConsideringTradeOffs&lt;/code&gt;. Aside from the name being long, not explaining the tradeoffs, and needing to be changed everywhere if I ever optimize the code, this would make the code &lt;em&gt;less&lt;/em&gt; self-documenting. It doesn't tell you what the function actually &lt;em&gt;does&lt;/em&gt;.&lt;/p&gt;
    &lt;p&gt;The core problem is that function and variable identifiers can only contain one clause of information. I can't store "what the function does" and "what tradeoffs it makes" in the same identifier. &lt;/p&gt;
    &lt;p&gt;What about replacing the comment with a test? I guess you could make a test that greps for math blocks in the book and fails if there are more than 80. But that's not testing &lt;code&gt;EpubMathFixer&lt;/code&gt; directly. There's nothing in the function itself you can hook into. &lt;/p&gt;
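    &lt;p&gt;A rough sketch of what such a test might look like, assuming math blocks are delimited by single dollar signs and the book text is available as one string. Neither detail is stated in the post, and &lt;code&gt;count_math_blocks&lt;/code&gt; and &lt;code&gt;check_math_block_budget&lt;/code&gt; are hypothetical names.&lt;/p&gt;

```python
import re

# Assumption: a math block looks like $...$ with no dollar sign inside it.
MATH_BLOCK = re.compile(r"\$[^$]+\$")
MAX_MATH_BLOCKS = 80  # the threshold mentioned in the text


def count_math_blocks(book_text):
    # Count non-overlapping $...$ spans in the book source.
    return len(MATH_BLOCK.findall(book_text))


def check_math_block_budget(book_text, limit=MAX_MATH_BLOCKS):
    # Fails once the book grows past the point where the slow
    # multi-pass replacement was judged to be acceptable.
    count = count_math_blocks(book_text)
    over_budget = max(count - limit, 0)
    assert over_budget == 0, f"{count} math blocks, {over_budget} over the limit of {limit}"


check_math_block_budget(r"Some prose, then $\forall x$ and $y$.")
```

    &lt;p&gt;Note that the check only reads the book text. Nothing in it exercises the replacement code, so it guards the assumption behind the comment without documenting the function.&lt;/p&gt;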
    &lt;p&gt;That's the fundamental problem with self-documenting negative information. "Self-documentation" rides along with written code, and so describes what the code is doing. Negative information is about what the code is &lt;em&gt;not&lt;/em&gt; doing. &lt;/p&gt;
    &lt;h3&gt;End of newsletter speculation&lt;/h3&gt;
    &lt;p&gt;I wonder if you can think of "why not" comments as a case of counterfactuals. If so, are "abstractions of human communication" impossible to self-document in general? Can you self-document an analogy? Uncertainty? An ethical claim?&lt;/p&gt;
    &lt;div class="footnote"&gt;
    &lt;hr/&gt;
    &lt;ol&gt;
    &lt;li id="fn:exception"&gt;
    &lt;p&gt;One interesting exception someone told me about: they make code "more self-documenting" by turning comments into &lt;em&gt;logging&lt;/em&gt;. I encouraged them to write it up as a blog post but so far they haven't. If they ever do I will link it here. &lt;a class="footnote-backref" href="#fnref:exception" title="Jump back to footnote 1 in the text"&gt;↩&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;/ol&gt;
    &lt;/div&gt;
    </description><pubDate>Tue, 10 Sep 2024 19:40:29 +0000</pubDate><guid>https://buttondown.com/hillelwayne/archive/why-not-comments/</guid></item></channel></rss>
    Raw headers
    {
      "cf-cache-status": "DYNAMIC",
      "cf-ray": "9509089616c6630c-ORD",
      "connection": "keep-alive",
      "content-security-policy-report-only": "default-src 'self'; script-src 'self' 'unsafe-inline' https://static.addtoany.com https://embed.bsky.app https://platform.twitter.com https://www.tiktok.com https://embedr.flickr.com https://scripts.simpleanalyticscdn.com https://cdn.usefathom.com https://plausible.io https://cloud.umami.is https://connect.facebook.net https://www.instagram.com https://sniperl.ink https://cdn.tailwindcss.com; style-src 'self' 'unsafe-inline' https:; img-src 'self' data: https: http: blob:; media-src 'self' data: https: http: blob:; font-src 'self' data: https:; frame-src https: blob:; connect-src 'self' https:; manifest-src 'self'; object-src 'none'; base-uri 'self'; form-action 'self'; report-uri https://o97520.ingest.us.sentry.io/api/6063581/security/?sentry_key=98d0ca1c1c554806b630fa9caf185b1f; report-to csp-endpoint",
      "content-type": "application/rss+xml; charset=utf-8",
      "cross-origin-opener-policy": "same-origin",
      "date": "Mon, 16 Jun 2025 08:45:53 GMT",
      "last-modified": "Thu, 12 Jun 2025 15:43:25 GMT",
      "nel": "{\"report_to\":\"heroku-nel\",\"response_headers\":[\"Via\"],\"max_age\":3600,\"success_fraction\":0.01,\"failure_fraction\":0.1}, {\"report_to\":\"heroku-nel\",\"response_headers\":[\"Via\"],\"max_age\":3600,\"success_fraction\":0.01,\"failure_fraction\":0.1}",
      "referrer-policy": "strict-origin-when-cross-origin",
      "report-to": "{\"group\":\"heroku-nel\",\"endpoints\":[{\"url\":\"https://nel.heroku.com/reports?s=z3lVuY7dL4lCr6%2BiDJg7u4X7Fj6TnhZL0MBkXat3KnI%3D\\u0026sid=929419e7-33ea-4e2f-85f0-7d8b7cd5cbd6\\u0026ts=1750063553\"}],\"max_age\":3600}, {\"group\":\"heroku-nel\",\"endpoints\":[{\"url\":\"https://nel.heroku.com/reports?s=%2BO7MVfNLMMRrQlv%2FCfZ%2FoJhk33aemrCbh%2BUH7pkgB68%3D\\u0026sid=e11707d5-02a7-43ef-b45e-2cf4d2036f7d\\u0026ts=1750063553\"}],\"max_age\":3600}, {\"group\": \"csp-endpoint\", \"max_age\": 86400, \"endpoints\": [{\"url\": \"https://o97520.ingest.us.sentry.io/api/6063581/security/?sentry_key=98d0ca1c1c554806b630fa9caf185b1f\"}], \"include_subdomains\": true}",
      "reporting-endpoints": "heroku-nel=\"https://nel.heroku.com/reports?s=z3lVuY7dL4lCr6%2BiDJg7u4X7Fj6TnhZL0MBkXat3KnI%3D&sid=929419e7-33ea-4e2f-85f0-7d8b7cd5cbd6&ts=1750063553\", heroku-nel=\"https://nel.heroku.com/reports?s=%2BO7MVfNLMMRrQlv%2FCfZ%2FoJhk33aemrCbh%2BUH7pkgB68%3D&sid=e11707d5-02a7-43ef-b45e-2cf4d2036f7d&ts=1750063553\", csp-endpoint=\"https://o97520.ingest.us.sentry.io/api/6063581/security/?sentry_key=98d0ca1c1c554806b630fa9caf185b1f\"",
      "server": "cloudflare",
      "set-cookie": "initial_path=\"/hillelwayne/rss\"; expires=Wed, 16 Jul 2025 08:45:53 GMT; Max-Age=2592000; Path=/",
      "transfer-encoding": "chunked",
      "vary": "Cookie, Host, origin, Accept-Encoding",
      "via": "1.1 heroku-router, 2.0 heroku-router",
      "x-content-type-options": "nosniff",
      "x-frame-options": "DENY"
    }
    Parsed with @rowanmanning/feed-parser
    {
      "meta": {
        "type": "rss",
        "version": "2.0"
      },
      "language": "en-us",
      "title": "Computer Things",
      "description": "Hi, I'm Hillel. This is the newsletter version of [my website](https://www.hillelwayne.com). I post all website updates here. I also post weekly content just for the newsletter, on topics like\n\n* Formal Methods\n\n* Software History and Culture\n\n* Fringetech and exotic tooling\n\n* The philosophy and theory of software engineering\n\nYou can see the archive of all public essays [here](https://buttondown.email/hillelwayne/archive/).",
      "copyright": null,
      "url": "https://buttondown.com/hillelwayne",
      "self": "https://buttondown.email/hillelwayne/rss",
      "published": null,
      "updated": "2025-06-12T15:43:25.000Z",
      "generator": null,
      "image": null,
      "authors": [],
      "categories": [],
      "items": [
        {
          "id": "https://buttondown.com/hillelwayne/archive/solving-linkedin-queens-with-smt/",
          "title": "Solving LinkedIn Queens with SMT",
          "description": "<h3>No newsletter next week</h3>\n<p>I’ll be speaking at <a href=\"https://systemsdistributed.com/\" target=\"_blank\">Systems Distributed</a>. My talk isn't close to done yet, which is why this newsletter is both late and short. </p>\n<h1>Solving LinkedIn Queens in SMT</h1>\n<p>The article <a href=\"https://codingnest.com/modern-sat-solvers-fast-neat-underused-part-1-of-n/\" target=\"_blank\">Modern SAT solvers: fast, neat and underused</a> claims that SAT solvers<sup id=\"fnref:SAT\"><a class=\"footnote-ref\" href=\"#fn:SAT\">1</a></sup> are \"criminally underused by the industry\". A while back on the newsletter I asked \"why\": how come they're so powerful and yet nobody uses them? Many experts responded saying the reason is that encoding SAT kinda sucked and they rather prefer using tools that compile to SAT. </p>\n<p>I was reminded of this when I read <a href=\"https://ryanberger.me/posts/queens/\" target=\"_blank\">Ryan Berger's post</a> on solving “LinkedIn Queens” as a SAT problem. </p>\n<p>A quick overview of Queens. You’re presented with an NxN grid divided into N regions, and have to place N queens so that there is exactly one queen in each row, column, and region. While queens can be on the same diagonal, they <em>cannot</em> be adjacently diagonal.</p>\n<p>(Important note: Linkedin “Queens” is a variation on the puzzle game <a href=\"https://starbattle.puzzlebaron.com/\" target=\"_blank\">Star Battle</a>, which is the same except the number of stars you place in each row/column/region varies per puzzle, and is usually two. This is also why 'queens' don’t capture like chess queens.)</p>\n<p><img alt=\"An image of a solved queens board. 
Copied from https://ryanberger.me/posts/queens\" class=\"newsletter-image\" src=\"https://assets.buttondown.email/images/96f6f923-331f-424d-8641-fe6753e1c2ca.png?w=960&fit=max\"/></p>\n<p>Ryan solved this by writing Queens as a SAT problem, expressing properties like \"there is exactly one queen in row 3\" as a large number of boolean clauses. <a href=\"https://ryanberger.me/posts/queens/\" target=\"_blank\">Go read his post, it's pretty cool</a>. What leapt out to me was that he used <a href=\"https://cvc5.github.io/\" target=\"_blank\">CVC5</a>, an <strong>SMT</strong> solver.<sup id=\"fnref:SMT\"><a class=\"footnote-ref\" href=\"#fn:SMT\">2</a></sup> SMT solvers are \"higher-level\" than SAT, capable of handling more data types than just boolean variables. It's a lot easier to solve the problem at the SMT level than at the SAT level. To show this, I whipped up a short demo of solving the same problem in <a href=\"https://github.com/Z3Prover/z3/wiki\" target=\"_blank\">Z3</a> (via the <a href=\"https://pypi.org/project/z3-solver/\" target=\"_blank\">Python API</a>).</p>\n<p><a href=\"https://gist.github.com/hwayne/c5de7bc52e733995311236666bedecd3\" target=\"_blank\">Full code here</a>, which you can compare to Ryan's SAT solution <a href=\"https://github.com/ryan-berger/queens/blob/master/main.py\" target=\"_blank\">here</a>. 
I didn't do a whole lot of cleanup on it (again, time crunch!), but short explanation below.</p>\n<h3>The code</h3>\n<div class=\"codehilite\"><pre><span></span><code><span class=\"kn\">from</span><span class=\"w\"> </span><span class=\"nn\">z3</span><span class=\"w\"> </span><span class=\"kn\">import</span> <span class=\"o\">*</span> <span class=\"c1\"># type: ignore</span>\n<span class=\"kn\">from</span><span class=\"w\"> </span><span class=\"nn\">itertools</span><span class=\"w\"> </span><span class=\"kn\">import</span> <span class=\"n\">combinations</span><span class=\"p\">,</span> <span class=\"n\">chain</span><span class=\"p\">,</span> <span class=\"n\">product</span>\n<span class=\"n\">solver</span> <span class=\"o\">=</span> <span class=\"n\">Solver</span><span class=\"p\">()</span>\n<span class=\"n\">size</span> <span class=\"o\">=</span> <span class=\"mi\">9</span> <span class=\"c1\"># N</span>\n</code></pre></div>\n<p>Initial setup and modules. <code>size</code> is the number of rows/columns/regions in the board, which I'll call <code>N</code> below.</p>\n<div class=\"codehilite\"><pre><span></span><code><span class=\"c1\"># queens[n] = col of queen on row n</span>\n<span class=\"c1\"># by construction, not on same row</span>\n<span class=\"n\">queens</span> <span class=\"o\">=</span> <span class=\"n\">IntVector</span><span class=\"p\">(</span><span class=\"s1\">'q'</span><span class=\"p\">,</span> <span class=\"n\">size</span><span class=\"p\">)</span> \n</code></pre></div>\n<p>SAT represents the queen positions via N² booleans: <code>q_00</code> means that a Queen is on row 0 and column 0, <code>!q_05</code> means a queen <em>isn't</em> on row 0 col 5, etc. In SMT we can instead encode it as N integers: <code>q_0 = 5</code> means that the queen on row 0 is positioned at column 5. 
This immediately enforces one class of constraints for us: we don't need any constraints saying \"exactly one queen per row\", because that's embedded in the definition of <code>queens</code>!</p>\n<p>(Incidentally, using 0-based indexing for the board was a mistake on my part, it makes correctly encoding the regions later really painful.)</p>\n<p>To actually make the variables <code>[q_0, q_1, …]</code>, we use the Z3 affordance <code>IntVector(str, n)</code> for making <code>n</code> variables at once.</p>\n<div class=\"codehilite\"><pre><span></span><code><span class=\"n\">solver</span><span class=\"o\">.</span><span class=\"n\">add</span><span class=\"p\">([</span><span class=\"n\">And</span><span class=\"p\">(</span><span class=\"mi\">0</span> <span class=\"o\"><=</span> <span class=\"n\">i</span><span class=\"p\">,</span> <span class=\"n\">i</span> <span class=\"o\"><</span> <span class=\"n\">size</span><span class=\"p\">)</span> <span class=\"k\">for</span> <span class=\"n\">i</span> <span class=\"ow\">in</span> <span class=\"n\">queens</span><span class=\"p\">])</span>\n<span class=\"c1\"># not on same column</span>\n<span class=\"n\">solver</span><span class=\"o\">.</span><span class=\"n\">add</span><span class=\"p\">(</span><span class=\"n\">Distinct</span><span class=\"p\">(</span><span class=\"n\">queens</span><span class=\"p\">))</span>\n</code></pre></div>\n<p>First we constrain all the integers to <code>[0, N)</code>, then use the <em>incredibly</em> handy <code>Distinct</code> constraint to force all the integers to have different values. 
This guarantees at most one queen per column, which by the <a href=\"https://en.wikipedia.org/wiki/Pigeonhole_principle\" target=\"_blank\">pigeonhole principle</a> means there is exactly one queen per column.</p>\n<div class=\"codehilite\"><pre><span></span><code><span class=\"c1\"># not diagonally adjacent</span>\n<span class=\"k\">for</span> <span class=\"n\">i</span> <span class=\"ow\">in</span> <span class=\"nb\">range</span><span class=\"p\">(</span><span class=\"n\">size</span><span class=\"o\">-</span><span class=\"mi\">1</span><span class=\"p\">):</span>\n    <span class=\"n\">q1</span><span class=\"p\">,</span> <span class=\"n\">q2</span> <span class=\"o\">=</span> <span class=\"n\">queens</span><span class=\"p\">[</span><span class=\"n\">i</span><span class=\"p\">],</span> <span class=\"n\">queens</span><span class=\"p\">[</span><span class=\"n\">i</span><span class=\"o\">+</span><span class=\"mi\">1</span><span class=\"p\">]</span>\n    <span class=\"n\">solver</span><span class=\"o\">.</span><span class=\"n\">add</span><span class=\"p\">(</span><span class=\"n\">Abs</span><span class=\"p\">(</span><span class=\"n\">q1</span> <span class=\"o\">-</span> <span class=\"n\">q2</span><span class=\"p\">)</span> <span class=\"o\">!=</span> <span class=\"mi\">1</span><span class=\"p\">)</span>\n</code></pre></div>\n<p>One of the rules is that queens can't be adjacent. We already know that they can't be horizontally or vertically adjacent via other constraints, which leaves the diagonals. We only need to add constraints that, for each queen, there is no queen in the lower-left or lower-right corner, aka <code>q_3 != q_2 ± 1</code>. We don't need to check the top corners because if <code>q_1</code> is in the upper-left corner of <code>q_2</code>, then <code>q_2</code> is in the lower-right corner of <code>q_1</code>!</p>\n<p>That covers everything except the \"one queen per region\" constraint. 
But the regions are the tricky part, which we should expect because we vary the difficulty of queens games by varying the regions.</p>\n<div class=\"codehilite\"><pre><span></span><code><span class=\"n\">regions</span> <span class=\"o\">=</span> <span class=\"p\">{</span>\n        <span class=\"s2\">\"purple\"</span><span class=\"p\">:</span> <span class=\"p\">[(</span><span class=\"mi\">0</span><span class=\"p\">,</span> <span class=\"mi\">0</span><span class=\"p\">),</span> <span class=\"p\">(</span><span class=\"mi\">0</span><span class=\"p\">,</span> <span class=\"mi\">1</span><span class=\"p\">),</span> <span class=\"p\">(</span><span class=\"mi\">0</span><span class=\"p\">,</span> <span class=\"mi\">2</span><span class=\"p\">),</span> <span class=\"p\">(</span><span class=\"mi\">0</span><span class=\"p\">,</span> <span class=\"mi\">3</span><span class=\"p\">),</span> <span class=\"p\">(</span><span class=\"mi\">0</span><span class=\"p\">,</span> <span class=\"mi\">4</span><span class=\"p\">),</span> <span class=\"p\">(</span><span class=\"mi\">0</span><span class=\"p\">,</span> <span class=\"mi\">5</span><span class=\"p\">),</span> <span class=\"p\">(</span><span class=\"mi\">0</span><span class=\"p\">,</span> <span class=\"mi\">6</span><span class=\"p\">),</span> <span class=\"p\">(</span><span class=\"mi\">0</span><span class=\"p\">,</span> <span class=\"mi\">7</span><span class=\"p\">),</span> <span class=\"p\">(</span><span class=\"mi\">0</span><span class=\"p\">,</span> <span class=\"mi\">8</span><span class=\"p\">),</span>\n                   <span class=\"p\">(</span><span class=\"mi\">1</span><span class=\"p\">,</span> <span class=\"mi\">0</span><span class=\"p\">),</span> <span class=\"p\">(</span><span class=\"mi\">2</span><span class=\"p\">,</span> <span class=\"mi\">0</span><span class=\"p\">),</span> <span class=\"p\">(</span><span class=\"mi\">3</span><span class=\"p\">,</span> <span class=\"mi\">0</span><span class=\"p\">),</span> <span 
class=\"p\">(</span><span class=\"mi\">4</span><span class=\"p\">,</span> <span class=\"mi\">0</span><span class=\"p\">),</span> <span class=\"p\">(</span><span class=\"mi\">5</span><span class=\"p\">,</span> <span class=\"mi\">0</span><span class=\"p\">),</span> <span class=\"p\">(</span><span class=\"mi\">6</span><span class=\"p\">,</span> <span class=\"mi\">0</span><span class=\"p\">),</span> <span class=\"p\">(</span><span class=\"mi\">7</span><span class=\"p\">,</span> <span class=\"mi\">0</span><span class=\"p\">),</span> <span class=\"p\">(</span><span class=\"mi\">8</span><span class=\"p\">,</span> <span class=\"mi\">0</span><span class=\"p\">),</span>\n                   <span class=\"p\">(</span><span class=\"mi\">1</span><span class=\"p\">,</span> <span class=\"mi\">1</span><span class=\"p\">),</span> <span class=\"p\">(</span><span class=\"mi\">8</span><span class=\"p\">,</span> <span class=\"mi\">1</span><span class=\"p\">)],</span>\n        <span class=\"s2\">\"red\"</span><span class=\"p\">:</span> <span class=\"p\">[(</span><span class=\"mi\">1</span><span class=\"p\">,</span> <span class=\"mi\">2</span><span class=\"p\">),</span> <span class=\"p\">(</span><span class=\"mi\">2</span><span class=\"p\">,</span> <span class=\"mi\">2</span><span class=\"p\">),</span> <span class=\"p\">(</span><span class=\"mi\">2</span><span class=\"p\">,</span> <span class=\"mi\">1</span><span class=\"p\">),</span> <span class=\"p\">(</span><span class=\"mi\">3</span><span class=\"p\">,</span> <span class=\"mi\">1</span><span class=\"p\">),</span> <span class=\"p\">(</span><span class=\"mi\">4</span><span class=\"p\">,</span> <span class=\"mi\">1</span><span class=\"p\">),</span> <span class=\"p\">(</span><span class=\"mi\">5</span><span class=\"p\">,</span> <span class=\"mi\">1</span><span class=\"p\">),</span> <span class=\"p\">(</span><span class=\"mi\">6</span><span class=\"p\">,</span> <span class=\"mi\">1</span><span class=\"p\">),</span> <span 
class=\"p\">(</span><span class=\"mi\">6</span><span class=\"p\">,</span> <span class=\"mi\">2</span><span class=\"p\">),</span> <span class=\"p\">(</span><span class=\"mi\">7</span><span class=\"p\">,</span> <span class=\"mi\">1</span><span class=\"p\">),</span> <span class=\"p\">(</span><span class=\"mi\">7</span><span class=\"p\">,</span> <span class=\"mi\">2</span><span class=\"p\">),</span> <span class=\"p\">(</span><span class=\"mi\">8</span><span class=\"p\">,</span> <span class=\"mi\">2</span><span class=\"p\">),</span> <span class=\"p\">(</span><span class=\"mi\">8</span><span class=\"p\">,</span> <span class=\"mi\">3</span><span class=\"p\">),],</span>\n        <span class=\"c1\"># you get the picture</span>\n        <span class=\"p\">}</span>\n\n<span class=\"c1\"># Some checking code left out, see below</span>\n</code></pre></div>\n<p>The region has to be manually coded in, which is a huge pain.</p>\n<p>(In the link, some validation code follows. Since it breaks up explaining the model I put it in the next section.)</p>\n<div class=\"codehilite\"><pre><span></span><code><span class=\"k\">for</span> <span class=\"n\">r</span> <span class=\"ow\">in</span> <span class=\"n\">regions</span><span class=\"o\">.</span><span class=\"n\">values</span><span class=\"p\">():</span>\n    <span class=\"n\">solver</span><span class=\"o\">.</span><span class=\"n\">add</span><span class=\"p\">(</span><span class=\"n\">Or</span><span class=\"p\">(</span>\n        <span class=\"o\">*</span><span class=\"p\">[</span><span class=\"n\">queens</span><span class=\"p\">[</span><span class=\"n\">row</span><span class=\"p\">]</span> <span class=\"o\">==</span> <span class=\"n\">col</span> <span class=\"k\">for</span> <span class=\"p\">(</span><span class=\"n\">row</span><span class=\"p\">,</span> <span class=\"n\">col</span><span class=\"p\">)</span> <span class=\"ow\">in</span> <span class=\"n\">r</span><span class=\"p\">]</span>\n        <span 
class=\"p\">))</span>\n</code></pre></div>\n<p>Finally we have the region constraint. The easiest way I found to say \"there is exactly one queen in each region\" is to say \"there is a queen in region 1 and a queen in region 2 and a queen in region 3\", etc. Then to say \"there is a queen in region <code>purple</code>\" I wrote \"<code>q_0 = 0</code> OR <code>q_0 = 1</code> OR … OR <code>q_1 = 0</code> etc.\" </p>\n<p>Why iterate over every position in the region instead of doing something like <code>(0, q[0]) in r</code>? I tried that but it's not an expression that Z3 supports.</p>\n<div class=\"codehilite\"><pre><span></span><code><span class=\"k\">if</span> <span class=\"n\">solver</span><span class=\"o\">.</span><span class=\"n\">check</span><span class=\"p\">()</span> <span class=\"o\">==</span> <span class=\"n\">sat</span><span class=\"p\">:</span>\n    <span class=\"n\">m</span> <span class=\"o\">=</span> <span class=\"n\">solver</span><span class=\"o\">.</span><span class=\"n\">model</span><span class=\"p\">()</span>\n    <span class=\"nb\">print</span><span class=\"p\">([(</span><span class=\"n\">l</span><span class=\"p\">,</span> <span class=\"n\">m</span><span class=\"p\">[</span><span class=\"n\">l</span><span class=\"p\">])</span> <span class=\"k\">for</span> <span class=\"n\">l</span> <span class=\"ow\">in</span> <span class=\"n\">queens</span><span class=\"p\">])</span>\n</code></pre></div>\n<p>Finally, we solve and print the positions. 
Running this gives me:</p>\n<div class=\"codehilite\"><pre><span></span><code><span class=\"p\">[(</span><span class=\"n\">q__0</span><span class=\"p\">,</span> <span class=\"mi\">0</span><span class=\"p\">),</span> <span class=\"p\">(</span><span class=\"n\">q__1</span><span class=\"p\">,</span> <span class=\"mi\">5</span><span class=\"p\">),</span> <span class=\"p\">(</span><span class=\"n\">q__2</span><span class=\"p\">,</span> <span class=\"mi\">8</span><span class=\"p\">),</span> \n <span class=\"p\">(</span><span class=\"n\">q__3</span><span class=\"p\">,</span> <span class=\"mi\">2</span><span class=\"p\">),</span> <span class=\"p\">(</span><span class=\"n\">q__4</span><span class=\"p\">,</span> <span class=\"mi\">7</span><span class=\"p\">),</span> <span class=\"p\">(</span><span class=\"n\">q__5</span><span class=\"p\">,</span> <span class=\"mi\">4</span><span class=\"p\">),</span> \n <span class=\"p\">(</span><span class=\"n\">q__6</span><span class=\"p\">,</span> <span class=\"mi\">1</span><span class=\"p\">),</span> <span class=\"p\">(</span><span class=\"n\">q__7</span><span class=\"p\">,</span> <span class=\"mi\">3</span><span class=\"p\">),</span> <span class=\"p\">(</span><span class=\"n\">q__8</span><span class=\"p\">,</span> <span class=\"mi\">6</span><span class=\"p\">)]</span>\n</code></pre></div>\n<p>Which is the correct solution to the queens puzzle. I didn't benchmark the solution times, but I imagine it's considerably slower than a raw SAT solver. <a href=\"https://github.com/audemard/glucose\" target=\"_blank\">Glucose</a> is really, really fast.</p>\n<p>But even so, solving the problem with SMT was a lot <em>easier</em> than solving it with SAT. That satisfies me as an explanation for why people prefer it to SAT.</p>\n<h3>Sanity checks</h3>\n<p>One bit I glossed over earlier was the sanity checking code. 
I <em>knew for sure</em> that I was going to make a mistake encoding the <code>region</code>, and the solver wasn't going to provide useful information about what I did wrong. In cases like these, I like adding small tests and checks to catch mistakes early, because the solver certainly isn't going to catch them!</p>\n<div class=\"codehilite\"><pre><span></span><code><span class=\"n\">all_squares</span> <span class=\"o\">=</span> <span class=\"nb\">set</span><span class=\"p\">(</span><span class=\"n\">product</span><span class=\"p\">(</span><span class=\"nb\">range</span><span class=\"p\">(</span><span class=\"n\">size</span><span class=\"p\">),</span> <span class=\"n\">repeat</span><span class=\"o\">=</span><span class=\"mi\">2</span><span class=\"p\">))</span>\n<span class=\"k\">def</span><span class=\"w\"> </span><span class=\"nf\">test_i_set_up_problem_right</span><span class=\"p\">():</span>\n    <span class=\"k\">assert</span> <span class=\"n\">all_squares</span> <span class=\"o\">==</span> <span class=\"nb\">set</span><span class=\"p\">(</span><span class=\"n\">chain</span><span class=\"o\">.</span><span class=\"n\">from_iterable</span><span class=\"p\">(</span><span class=\"n\">regions</span><span class=\"o\">.</span><span class=\"n\">values</span><span class=\"p\">()))</span>\n\n    <span class=\"k\">for</span> <span class=\"n\">r1</span><span class=\"p\">,</span> <span class=\"n\">r2</span> <span class=\"ow\">in</span> <span class=\"n\">combinations</span><span class=\"p\">(</span><span class=\"n\">regions</span><span class=\"o\">.</span><span class=\"n\">values</span><span class=\"p\">(),</span> <span class=\"mi\">2</span><span class=\"p\">):</span>\n        <span class=\"k\">assert</span> <span class=\"ow\">not</span> <span class=\"nb\">set</span><span class=\"p\">(</span><span class=\"n\">r1</span><span class=\"p\">)</span> <span class=\"o\">&</span> <span class=\"nb\">set</span><span class=\"p\">(</span><span class=\"n\">r2</span><span 
class=\"p\">),</span> <span class=\"nb\">set</span><span class=\"p\">(</span><span class=\"n\">r1</span><span class=\"p\">)</span> <span class=\"o\">&</span> <span class=\"nb\">set</span><span class=\"p\">(</span><span class=\"n\">r2</span><span class=\"p\">)</span>\n</code></pre></div>\n<p>The first check was a quick test that I didn't leave any squares out, or accidentally put the same square in both regions. Converting the values into sets makes both checks a lot easier. Honestly I don't know why I didn't just use sets from the start, sets are great.</p>\n<div class=\"codehilite\"><pre><span></span><code><span class=\"k\">def</span><span class=\"w\"> </span><span class=\"nf\">render_regions</span><span class=\"p\">():</span>\n    <span class=\"n\">colormap</span> <span class=\"o\">=</span> <span class=\"p\">[</span><span class=\"s2\">\"purple\"</span><span class=\"p\">,</span>  <span class=\"s2\">\"red\"</span><span class=\"p\">,</span> <span class=\"s2\">\"brown\"</span><span class=\"p\">,</span> <span class=\"s2\">\"white\"</span><span class=\"p\">,</span> <span class=\"s2\">\"green\"</span><span class=\"p\">,</span> <span class=\"s2\">\"yellow\"</span><span class=\"p\">,</span> <span class=\"s2\">\"orange\"</span><span class=\"p\">,</span> <span class=\"s2\">\"blue\"</span><span class=\"p\">,</span> <span class=\"s2\">\"pink\"</span><span class=\"p\">]</span>\n    <span class=\"n\">board</span> <span class=\"o\">=</span> <span class=\"p\">[[</span><span class=\"mi\">0</span> <span class=\"k\">for</span> <span class=\"n\">_</span> <span class=\"ow\">in</span> <span class=\"nb\">range</span><span class=\"p\">(</span><span class=\"n\">size</span><span class=\"p\">)]</span> <span class=\"k\">for</span> <span class=\"n\">_</span> <span class=\"ow\">in</span> <span class=\"nb\">range</span><span class=\"p\">(</span><span class=\"n\">size</span><span class=\"p\">)]</span> \n    <span class=\"k\">for</span> <span class=\"p\">(</span><span class=\"n\">row</span><span 
class=\"p\">,</span> <span class=\"n\">col</span><span class=\"p\">)</span> <span class=\"ow\">in</span> <span class=\"n\">all_squares</span><span class=\"p\">:</span>\n        <span class=\"k\">for</span> <span class=\"n\">color</span><span class=\"p\">,</span> <span class=\"n\">region</span> <span class=\"ow\">in</span> <span class=\"n\">regions</span><span class=\"o\">.</span><span class=\"n\">items</span><span class=\"p\">():</span>\n            <span class=\"k\">if</span> <span class=\"p\">(</span><span class=\"n\">row</span><span class=\"p\">,</span> <span class=\"n\">col</span><span class=\"p\">)</span> <span class=\"ow\">in</span> <span class=\"n\">region</span><span class=\"p\">:</span>\n                <span class=\"n\">board</span><span class=\"p\">[</span><span class=\"n\">row</span><span class=\"p\">][</span><span class=\"n\">col</span><span class=\"p\">]</span> <span class=\"o\">=</span> <span class=\"n\">colormap</span><span class=\"o\">.</span><span class=\"n\">index</span><span class=\"p\">(</span><span class=\"n\">color</span><span class=\"p\">)</span><span class=\"o\">+</span><span class=\"mi\">1</span>\n\n    <span class=\"k\">for</span> <span class=\"n\">row</span> <span class=\"ow\">in</span> <span class=\"n\">board</span><span class=\"p\">:</span>\n        <span class=\"nb\">print</span><span class=\"p\">(</span><span class=\"s2\">\"\"</span><span class=\"o\">.</span><span class=\"n\">join</span><span class=\"p\">(</span><span class=\"nb\">map</span><span class=\"p\">(</span><span class=\"nb\">str</span><span class=\"p\">,</span> <span class=\"n\">row</span><span class=\"p\">)))</span>\n\n<span class=\"n\">render_regions</span><span class=\"p\">()</span>\n</code></pre></div>\n<p>The second check is something that prints out the regions. 
It produces something like this:</p>\n<div class=\"codehilite\"><pre><span></span><code>111111111\n112333999\n122439999\n124437799\n124666779\n124467799\n122467899\n122555889\n112258899\n</code></pre></div>\n<p>I can compare this to the picture of the board to make sure I got it right. I guess a more advanced solution would be to print emoji squares like 🟥 instead.</p>\n<p>Neither check is quality code but it's throwaway and it gets the job done so eh.</p>\n<h3>Update for the Internet</h3>\n<p>This was sent as a weekly newsletter, which is usually on topics like <a href=\"https://buttondown.com/hillelwayne/archive/why-do-we-call-it-boilerplate-code\" target=\"_blank\">software history</a>, <a href=\"https://buttondown.com/hillelwayne/archive/the-seven-specification-ur-languages/\" target=\"_blank\">formal methods</a>, <a href=\"https://buttondown.com/hillelwayne/archive/i-formally-modeled-dreidel-for-no-good-reason/\" target=\"_blank\">unusual technologies</a>, and the <a href=\"https://buttondown.com/hillelwayne/archive/be-suspicious-of-success/\" target=\"_blank\">theory of software engineering</a>. You <a href=\"https://buttondown.email/hillelwayne/\" target=\"_blank\">can subscribe here</a>.</p>\n<div class=\"footnote\">\n<hr/>\n<ol>\n<li id=\"fn:SAT\">\n<p>\"Boolean <strong>SAT</strong>isfiability Solver\", aka a solver that can find assignments that make complex boolean expressions true. I write a bit more about them <a href=\"https://www.hillelwayne.com/post/np-hard/\" target=\"_blank\">here</a>. <a class=\"footnote-backref\" href=\"#fnref:SAT\" title=\"Jump back to footnote 1 in the text\">↩</a></p>\n</li>\n<li id=\"fn:SMT\">\n<p>\"Satisfiability Modulo Theories\" <a class=\"footnote-backref\" href=\"#fnref:SMT\" title=\"Jump back to footnote 2 in the text\">↩</a></p>\n</li>\n</ol>\n</div>",
          "url": "https://buttondown.com/hillelwayne/archive/solving-linkedin-queens-with-smt/",
          "published": "2025-06-12T15:43:25.000Z",
          "updated": "2025-06-12T15:43:25.000Z",
          "content": null,
          "image": null,
          "media": [],
          "authors": [],
          "categories": []
        },
        {
          "id": "https://buttondown.com/hillelwayne/archive/ai-is-a-gamechanger-for-tla-users/",
          "title": "AI is a gamechanger for TLA+ users",
          "description": "<h3>New Logic for Programmers Release</h3>\n<p><a href=\"https://leanpub.com/logic/\" target=\"_blank\">v0.10 is now available</a>! This is a minor release, mostly focused on logic-based refactoring, with new material on set types and testing that refactors are correct. See the full release notes at <a href=\"https://github.com/logicforprogrammers/book-assets/blob/master/CHANGELOG.md\" target=\"_blank\">the changelog page</a>. Due to <a href=\"https://systemsdistributed.com/\" target=\"_blank\">conference pressure</a> v0.11 will also likely be a minor release. </p>\n<p><img alt=\"The book cover\" class=\"newsletter-image\" src=\"https://assets.buttondown.email/images/29d4ae9d-bcb9-4d8b-99d4-8a35c0990ad5.jpg?w=300&fit=max\"/></p>\n<h1>AI is a gamechanger for TLA+ users</h1>\n<p><a href=\"https://lamport.azurewebsites.net/tla/tla.html\" target=\"_blank\">TLA+</a> is a specification language to model and debug distributed systems. While very powerful, it's also hard for programmers to learn, and there are always questions about connecting specifications with actual code. </p>\n<p>That's why <a href=\"https://zfhuang99.github.io/github%20copilot/formal%20verification/tla+/2025/05/24/ai-revolution-in-distributed-systems.html\" target=\"_blank\">The Coming AI Revolution in Distributed Systems</a> caught my interest. In the post, Cheng Huang claims that Azure successfully used LLMs to examine an existing codebase, derive a TLA+ spec, and find a production bug in that spec. \"After a decade of manually crafting TLA+ specifications\", he wrote, \"I must acknowledge that this AI-generated specification rivals human work\".</p>\n<p>This inspired me to experiment with LLMs in TLA+ myself. My goals are a little less ambitious than Cheng's: I wanted to see how LLMs could help junior specifiers write TLA+, rather than handling the entire spec automatically. 
Details on what did and didn't work below, but my takeaway is that <strong>LLMs are an immense specification force multiplier.</strong></p>\n<p>All tests were done with a standard VSCode Copilot subscription, running Claude 3.7 in Agent mode. Other LLMs or IDEs may be more or less effective, etc.</p>\n<h2>Things Claude was good at</h2>\n<h3>Fixing syntax errors</h3>\n<p>TLA+ uses a very different syntax than mainstream programming languages, meaning beginners make a lot of mistakes where they use \"programming syntax\" instead of TLA+ syntax:</p>\n<div class=\"codehilite\"><pre><span></span><code>NotThree(x) = \\* should be ==, not =\n    x != 3 \\* should be #, not !=\n</code></pre></div>\n<p>The problem is that the TLA+ syntax checker, SANY, is 30 years old and doesn't provide good information. Here's what it says for that snippet:</p>\n<div class=\"codehilite\"><pre><span></span><code>Was expecting \"==== or more Module body\"\nEncountered \"NotThree\" at line 6, column 1\n</code></pre></div>\n<p>That only isolates one error and doesn't tell us what the problem is, only where it is. Experienced TLA+ users get \"error eyes\" and can quickly see what the problem is, but beginners really struggle with this.</p>\n<p>The TLA+ Foundation has made LLM integration a priority, so the VSCode extension <a href=\"https://github.com/tlaplus/vscode-tlaplus/blob/master/src/main.ts#L174\" target=\"_blank\">naturally supports several agent actions</a>. One of these is running SANY, meaning an agent can get an error, fix it, get another error, fix it, etc. Provided the above sample and asked to make it work, Claude successfully fixed both errors. 
It also fixed many errors in a larger spec, and figured out why PlusCal specs weren't compiling to TLA+.</p>\n<p>This by itself is already enough to make LLMs a worthwhile tool, as it fixes one of the biggest barriers to entry.</p>\n<h3>Understanding error traces</h3>\n<p>When TLA+ finds a violated property, it outputs the sequence of steps that leads to the error. This starts in plaintext, and VSCode parses it into an interactive table:</p>\n<p><img alt=\"An example error trace\" class=\"newsletter-image\" src=\"https://assets.buttondown.email/images/f7f16d0e-c61f-4286-ae49-67e03f844126.png?w=960&fit=max\"/></p>\n<p>Learning to read these error traces is a skill in itself. You have to understand what's happening in each step and how it relates back to the actually broken property. It takes a long time for people to learn how to do this well.</p>\n<p>Claude was successful here, too, accurately reading 20+ step error traces and giving a high-level explanation of what went wrong. It also could condense error traces: if ten steps of the error trace could be condensed into a one-sentence summary (which can happen if you're modeling a lot of process internals) Claude would do it.</p>\n<p>I did have issues with doing this in agent mode: while the extension does provide a \"run model checker\" command, the agent would regularly ignore this and prefer to run a terminal command instead. This would be fine except that the LLM consistently hallucinated invalid commands. I had to amend every prompt with \"run the model checker via vscode, do not use a terminal command\". You can skip this if you're willing to copy and paste the error trace into the prompt.</p>\n<p>As with syntax checking, if this was the <em>only</em> thing LLMs could effectively do, that would already be enough<sup id=\"fnref:dayenu\"><a class=\"footnote-ref\" href=\"#fn:dayenu\">1</a></sup> to earn a strong recommend. Even as a TLA+ expert I expect I'll be using this trick regularly. 
</p>\n<h3>Boilerplate tasks</h3>\n<p>TLA+ has a lot of boilerplate. One of the most notorious examples is <code>UNCHANGED</code> rules. Specifications are extremely precise — so precise that you have to specify what variables <em>don't</em> change in every step. This takes the form of an <code>UNCHANGED</code> clause at the end of relevant actions:</p>\n<div class=\"codehilite\"><pre><span></span><code>RemoveObjectFromStore(srv, o, s) ==\n  /\\ o \\in stored[s]\n  /\\ stored' = [stored EXCEPT ![s] = @ \\ {o}]\n  /\\ UNCHANGED <<capacity, log, objectsize, pc>>\n</code></pre></div>\n<p>Writing this is really annoying. Updating these whenever you change an action, or add a new variable to the spec, is doubly so. Syntax checking and error analysis are important for beginners, but this is what I wanted for <em>myself</em>. I took a spec and prompted Claude</p>\n<blockquote>\n<p>Add UNCHANGED &lt;&lt;v1, v2, etc&gt;&gt; for each variable not changed in an action.</p>\n</blockquote>\n<p>And it worked! It successfully updated the <code>UNCHANGED</code> in every action. </p>\n<p>(Note, though, that it was a \"well-behaved\" spec in this regard: only one \"action\" happened at a time. In TLA+ you can have two actions happen simultaneously, that each update half of the variables, meaning neither of them should have an <code>UNCHANGED</code> clause. I haven't tested how Claude handles that!)</p>\n<p>That's the most obvious win, but Claude was good at handling other tedious work, too. Some examples include updating <code>vars</code> (the conventional collection of all state variables), lifting a hard-coded value into a model parameter, and changing data formats. Most impressive to me, though, was rewriting a spec designed for one process to instead handle multiple processes. 
This means taking all of the process variables, which originally have types like <code>Int</code>, converting them to types like <code>[Process -> Int]</code>, and then updating the uses of all of those variables in the spec. It didn't account for race conditions in the new concurrent behavior, but it was an excellent scaffold to do more work.</p>\n<h3>Writing properties from an informal description</h3>\n<p>You have to be pretty precise with your intended property description but it handles converting that precise description into TLA+'s formalized syntax, which is something beginners often struggle with.</p>\n<h2>Things it is less good at</h2>\n<h3>Generating model config files</h3>\n<p>To model check TLA+, you need both a specification (<code>.tla</code>) and a model config file (<code>.cfg</code>), which have separate syntaxes. Asking the agent to generate the second often led to it using TLA+ syntax. It automatically fixed this after getting parsing errors, though. </p>\n<h3>Fixing specs</h3>\n<p>Whenever the agent ran the model checker and discovered a bug, it would naturally propose a change to either the invalid property or the spec. Sometimes the changes were good, other times the changes were not physically realizable. For example, if it found that a bug was due to a race condition between processes, it would often suggest fixing it by saying race conditions were okay. I mean yes, if you say bugs are okay, then the spec finds that bugs are okay! Or it would alternatively suggest adding a constraint to the spec saying that race conditions don't happen. <a href=\"https://www.hillelwayne.com/post/alloy-facts/\" target=\"_blank\">But that's a huge mistake in specification</a>, because race conditions happen if we don't have coordination. 
We need to specify the <em>mechanism</em> that is supposed to prevent them.</p>\n<h3>Finding properties of the spec</h3>\n<p>After seeing how capable it was at translating my properties to TLA+, I started prompting Claude to come up with properties on its own. Unfortunately, almost everything I got back was either trivial, uninteresting, or too coupled to implementation details. I haven't tested if it would work better to ask it for \"properties that may be violated\".</p>\n<h3>Generating code from specs</h3>\n<p>I have to be specific here: Claude <em>could</em> sometimes convert Python into a passable spec, and vice versa. It <em>wasn't</em> good at recognizing abstraction. For example, TLA+ specifications often represent sequential operations with a state variable, commonly called <code>pc</code>. If modeling code that nonatomically retrieves a counter value and increments it, we'd have one action that requires <code>pc = \"Get\"</code> and sets the new value to <code>\"Inc\"</code>, then another that requires it be <code>\"Inc\"</code> and sets it to <code>\"Done\"</code>.</p>\n<p>I found that Claude would try to somehow convert <code>pc</code> into part of the Python program's state, rather than recognize it as a TLA+ abstraction. On the other side, when converting Python code to TLA+ it would often try to translate things like <code>sleep</code> into some part of the spec, not recognizing that it is abstractable into a distinct action. I didn't test other possible misconceptions, like converting randomness to nondeterminism.</p>\n<p>For the record, when converting TLA+ to Python Claude tended to make simulators of the spec, rather than possible production code implementing the spec. 
I really wasn't expecting otherwise though.</p>\n<h2>Unexplored Applications</h2>\n<p>Things I haven't explored thoroughly but could possibly be effective, based on what I know about TLA+ and AI:</p>\n<h3>Writing Java Overrides</h3>\n<p>Most TLA+ operators are resolved via TLA+ interpreters, but you can also implement them in \"native\" Java. This lets you escape the standard language semantics and add capabilities like <a href=\"https://github.com/tlaplus/CommunityModules/blob/master/modules/IOUtils.tla\" target=\"_blank\">executing programs during model-checking</a> or <a href=\"https://github.com/tlaplus/tlaplus/blob/master/tlatools/org.lamport.tlatools/src/tla2sany/StandardModules/TLC.tla#L62\" target=\"_blank\">dynamically constrain the depth of the searched state space</a>. There's a lot of cool things I think would be possible with overrides. The problem is there's only a handful of people in the world who know how to write them. But that handful have written quite a few overrides and I think there's enough there for Claude to work with. </p>\n<h3>Writing specs, given a reference mechanism</h3>\n<p>In all my experiments, the LLM only had my prompts and the occasional Python script as information. That makes me suspect that some of its problems with writing and fixing specs come down to not having a system model. Maybe it wouldn't suggest fixes like \"these processes never race\" if it had a design doc saying that the processes can't coordinate. </p>\n<p>(Could a Sufficiently Powerful LLM derive some TLA+ specification from a design document?)</p>\n<h3>Connecting specs and code</h3>\n<p>This is the holy grail of TLA+: taking a codebase and showing it correctly implements a spec. Currently the best ways to do this are by either using TLA+ to generate a test suite, or by taking logged production traces and matching them to TLA+ behaviors. 
<a href=\"https://www.mongodb.com/blog/post/engineering/conformance-checking-at-mongodb-testing-our-code-matches-our-tla-specs\" target=\"_blank\">This blog post discusses both</a>. While I've seen a lot of academic research into these approaches there are no industry-ready tools. So if you want trace validation you have to do a lot of manual labour tailored to your specific product. </p>\n<p>If LLMs could do some of this work for us then that'd really amplify the usefulness of TLA+ to many companies.</p>\n<h2>Thoughts</h2>\n<p><em>Right now</em>, agents seem good at the tedious and routine parts of TLA+ and worse at the strategic and abstraction parts. But, since the routine parts are often a huge barrier to beginners, this means that LLMs have the potential to make TLA+ far, far more accessible than it previously was.</p>\n<p>I have mixed thoughts on this. As an <em>advocate</em>, this is incredible. I want more people using formal specifications because I believe it leads to cheaper, safer, more reliable software. Anything that gets people comfortable with specs is great for our industry. As a <em>professional TLA+ consultant</em>, I'm worried that this obsoletes me. Most of my income comes from training and coaching, which companies will have far less demand for now. Then again, maybe this is an opportunity to pitch \"agentic TLA+ training\" to companies!</p>\n<p>Anyway, if you're interested in TLA+, there has never been a better time to try it. I mean it, these tools handle so much of the hard part now. I've got a <a href=\"https://learntla.com/\" target=\"_blank\">free book available online</a>, as does <a href=\"https://lamport.azurewebsites.net/tla/book.html\" target=\"_blank\">the inventor of TLA+</a>. I like <a href=\"https://elliotswart.github.io/pragmaticformalmodeling/\" target=\"_blank\">this guide too</a>. Happy modeling!</p>\n<div class=\"footnote\">\n<hr/>\n<ol>\n<li id=\"fn:dayenu\">\n<p>Dayenu. 
<a class=\"footnote-backref\" href=\"#fnref:dayenu\" title=\"Jump back to footnote 1 in the text\">↩</a></p>\n</li>\n</ol>\n</div>",
          "url": "https://buttondown.com/hillelwayne/archive/ai-is-a-gamechanger-for-tla-users/",
          "published": "2025-06-05T14:59:11.000Z",
          "updated": "2025-06-05T14:59:11.000Z",
          "content": null,
          "image": null,
          "media": [],
          "authors": [],
          "categories": []
        },
        {
          "id": "https://buttondown.com/hillelwayne/archive/what-does-undecidable-mean-anyway/",
          "title": "What does \"Undecidable\" mean, anyway",
          "description": "<h3>Systems Distributed</h3>\n<p>I'll be speaking at <a href=\"https://systemsdistributed.com/\" target=\"_blank\">Systems Distributed</a> next month! The talk is brand new and will aim to showcase some of the formal methods mental models that would be useful in mainstream software development. It has added some extra stress on my schedule, though, so expect the next two monthly releases of <em>Logic for Programmers</em> to be mostly minor changes.</p>\n<h2>What does \"Undecidable\" mean, anyway</h2>\n<p>Last week I read <a href=\"https://liamoc.net/forest/loc-000S/index.xml\" target=\"_blank\">Against Curry-Howard Mysticism</a>, which is a solid article I recommend reading. But this newsletter is actually about <a href=\"https://lobste.rs/s/n0whur/against_curry_howard_mysticism#c_lbts57\" target=\"_blank\">one comment</a>:</p>\n<blockquote>\n<p>I like to see posts like this because I often feel like I can’t tell the difference between BS and a point I’m missing. Can we get one for questions like “Isn’t XYZ (Undecidable|NP-Complete|PSPACE-Complete)?” </p>\n</blockquote>\n<p>I've already written one of these for <a href=\"https://www.hillelwayne.com/post/np-hard/\" target=\"_blank\">NP-complete</a>, so let's do one for \"undecidable\". Step one is to pull a technical definition from the book <a href=\"https://link.springer.com/book/10.1007/978-1-4612-1844-9\" target=\"_blank\"><em>Automata and Computability</em></a>:</p>\n<blockquote>\n<p>A property P of strings is said to be decidable if ... there is a total Turing machine that accepts input strings that have property P and rejects those that do not. (pg 220)</p>\n</blockquote>\n<p>Step two is to translate the technical computer science definition into more conventional programmer terms. 
Warning, because this is a newsletter and not a blog post, I might be a little sloppy with terms.</p>\n<h3>Machines and Decision Problems</h3>\n<p>In automata theory, all inputs to a \"program\" are strings of characters, and all outputs are \"true\" or \"false\". A program \"accepts\" a string if it outputs \"true\", and \"rejects\" if it outputs \"false\". You can think of this as automata studying all pure functions of type <code>f :: string -> boolean</code>. Problems solvable by finding such an <code>f</code> are called \"decision problems\".</p>\n<p>This covers more than you'd think, because we can bootstrap more powerful functions from these. First, as anyone who's programmed in bash knows, strings can represent any other data. Second, we can fake non-boolean outputs by instead checking if a certain computation gives a certain result. For example, I can reframe the function <code>add(x, y) = x + y</code> as a decision problem like this:</p>\n<div class=\"codehilite\"><pre><span></span><code>IS_SUM(str) {\n    x, y, z = split(str, \"#\")\n    return x + y == z\n}\n</code></pre></div>\n<p>Then because <code>IS_SUM(\"2#3#5\")</code> returns true, we know <code>2 + 3 == 5</code>, while <code>IS_SUM(\"2#3#6\")</code> is false. Since we can bootstrap parameters out of strings, I'll just say it's <code>IS_SUM(x, y, z)</code> going forward.</p>\n<p>A big part of automata theory is studying different models of computation with different strengths. One of the weakest is called <a href=\"https://en.wikipedia.org/wiki/Deterministic_finite_automaton\" target=\"_blank\">\"DFA\"</a>. I won't go into any details about what DFA actually can do, but the important thing is that it <em>can't</em> solve <code>IS_SUM</code>. 
That is, if you give me a DFA that takes inputs of form <code>x#y#z</code>, I can always find an input where the DFA returns true when <code>x + y != z</code>, <em>or</em> an input which returns false when <code>x + y == z</code>.</p>\n<p>It's really important to keep this model of \"solve\" in mind: a program solves a problem if it correctly returns true on all true inputs and correctly returns false on all false inputs.</p>\n<h3>(total) Turing Machines</h3>\n<p>A Turing Machine (TM) is a particular type of computation model. It's important for two reasons: </p>\n<ol>\n<li>\n<p>By the <a href=\"https://en.wikipedia.org/wiki/Church%E2%80%93Turing_thesis\" target=\"_blank\">Church-Turing thesis</a>, a Turing Machine is the \"upper bound\" of how powerful (physically realizable) computational models can get. This means that if an actual real-world programming language can solve a particular decision problem, so can a TM. Conversely, if the TM <em>can't</em> solve it, neither can the programming language.<sup id=\"fnref:caveat\"><a class=\"footnote-ref\" href=\"#fn:caveat\">1</a></sup></p>\n</li>\n<li>\n<p>It's possible to write a Turing machine that takes <em>a textual representation of another Turing machine</em> as input, and then simulates that Turing machine as part of its computations. </p>\n</li>\n</ol>\n<p>Property (1) means that we can move between different computational models of equal strength, proving things about one to learn things about another. That's why I'm able to write <code>IS_SUM</code> in a pseudocode instead of writing it in terms of the TM computational model (and why I was able to use <code>split</code> for convenience). </p>\n<p>Property (2) does several interesting things. First of all, it makes it possible to compose Turing machines. 
Here's how I can roughly ask if a given number is the sum of two primes, with \"just\" addition and boolean functions:</p>\n<div class=\"codehilite\"><pre><span></span><code>IS_SUM_TWO_PRIMES(z):\n    x := 1\n    y := 1\n    loop {\n        if x > z {return false}\n        if IS_PRIME(x) {\n            if IS_PRIME(y) {\n                if IS_SUM(x, y, z) {\n                    return true;\n                }\n            }\n        }\n        y := y + 1\n        if y > x {\n            x := x + 1\n            y := 0\n        }\n    }\n</code></pre></div>\n<p>Notice that without the <code>if x > z {return false}</code>, the program would loop forever on <code>z=2</code>. A TM that always halts for all inputs is called <strong>total</strong>.</p>\n<p>Property (2) also makes \"Turing machines\" a possible input to functions, meaning that we can now make decision problems about the behavior of Turing machines. For example, \"does the TM <code>M</code> either accept or reject <code>x</code> within ten steps?\"<sup id=\"fnref:backticks\"><a class=\"footnote-ref\" href=\"#fn:backticks\">2</a></sup></p>\n<div class=\"codehilite\"><pre><span></span><code>IS_DONE_IN_TEN_STEPS(M, x) {\n    for (i = 0; i < 10; i++) {\n        `simulate M(x) for one step`\n        if(`M accepted or rejected`) {\n            return true\n        }\n    }\n    return false\n}\n</code></pre></div>\n<h3>Decidability and Undecidability</h3>\n<p>Now we have all of the pieces to understand our original definition:</p>\n<blockquote>\n<p>A property P of strings is said to be decidable if ... there is a total Turing machine that accepts input strings that have property P and rejects those that do not. (220)</p>\n</blockquote>\n<p>Let <code>IS_P</code> be the decision problem \"Does the input satisfy P\"? 
Then <code>IS_P</code> is decidable if it can be solved by a Turing machine, ie, I can provide some <code>IS_P(x)</code> machine that <em>always</em> accepts if <code>x</code> has property P, and always rejects if <code>x</code> doesn't have property P. If I can't do that, then <code>IS_P</code> is undecidable. </p>\n<p><code>IS_SUM(x, y, z)</code> and <code>IS_DONE_IN_TEN_STEPS(M, x)</code> are decidable properties. Is <code>IS_SUM_TWO_PRIMES(z)</code> decidable? Some analysis shows that our corresponding program will either find a solution, or have <code>x>z</code> and return false. So yes, it is decidable.</p>\n<p>Notice there's an asymmetry here. To prove some property is decidable, I just need to find <em>one</em> program that correctly solves it. To prove some property is undecidable, I need to show that any possible program, no matter what it is, doesn't solve it.</p>\n<p>So with that asymmetry in mind, are there <em>any</em> undecidable problems? Yes, quite a lot. Recall that Turing machines can accept encodings of other TMs as input, meaning we can write a TM that checks <em>properties of Turing machines</em>. And, by <a href=\"https://en.wikipedia.org/wiki/Rice%27s_theorem\" target=\"_blank\">Rice's Theorem</a>, almost every nontrivial semantic<sup id=\"fnref:nontrivial\"><a class=\"footnote-ref\" href=\"#fn:nontrivial\">3</a></sup> property of Turing machines is undecidable. The conventional way to prove this is to first find a single undecidable property <code>H</code>, and then use that to bootstrap undecidability of other properties.</p>\n<p>The canonical and most famous example of an undecidable problem is the <a href=\"https://en.wikipedia.org/wiki/Halting_problem\" target=\"_blank\">Halting problem</a>: \"does machine M halt on input i?\" It's pretty easy to prove undecidable, and easy to use it to bootstrap other undecidability properties. But again, <em>any</em> nontrivial property is undecidable. Checking a TM is total is undecidable. 
Checking a TM accepts <em>any</em> inputs is undecidable. Checking a TM solves <code>IS_SUM</code> is undecidable. Etc etc etc.</p>\n<h3>What this doesn't mean in practice</h3>\n<p>I often see the halting problem misconstrued as \"it's impossible to tell if a program will halt before running it.\" <strong>This is wrong</strong>. The halting problem says that we cannot create an algorithm that, when applied to an arbitrary program, tells us whether the program will halt or not. It is absolutely possible to tell if many programs will halt or not. It's possible to find entire subcategories of programs that are guaranteed to halt. It's possible to say \"a program constructed following constraints XYZ is guaranteed to halt.\" </p>\n<p>The actual consequence of undecidability is more subtle. If we want to know if a program has property P, undecidability tells us</p>\n<ol>\n<li>We will have to spend time and mental effort to determine if it has P</li>\n<li>We may not be successful.</li>\n</ol>\n<p>This is subtle because we're so used to living in a world where everything's undecidable that we don't really consider what the counterfactual would be like. In such a world there might be no need for Rust, because \"does this C program guarantee memory-safety\" is a decidable property. The entire field of formal verification could be unnecessary, as we could just check properties of arbitrary programs directly. We could automatically check if a change in a program preserves all existing behavior. Lots of famous math problems could be solved overnight. </p>\n<p>(This to me is a strong \"intuitive\" argument for why the halting problem is undecidable: a halt detector can be trivially repurposed as a program optimizer / theorem-prover / bcrypt cracker / chess engine. 
It's <em>too powerful</em>, so we should expect it to be impossible.)</p>\n<p>But because we don't live in that world, all of those things are hard problems that take effort and ingenuity to solve, and even then we often fail.</p>\n<h3>Update for the Internet</h3>\n<p>This was sent as a weekly newsletter, which is usually on topics like <a href=\"https://buttondown.com/hillelwayne/archive/why-do-we-call-it-boilerplate-code\" target=\"_blank\">software history</a>, <a href=\"https://buttondown.com/hillelwayne/archive/the-seven-specification-ur-languages/\" target=\"_blank\">formal methods</a>, <a href=\"https://buttondown.com/hillelwayne/archive/i-formally-modeled-dreidel-for-no-good-reason/\" target=\"_blank\">unusual technologies</a>, and the <a href=\"https://buttondown.com/hillelwayne/archive/be-suspicious-of-success/\" target=\"_blank\">theory of software engineering</a>. You <a href=\"https://buttondown.email/hillelwayne/\" target=\"_blank\">can subscribe here</a>.</p>\n<div class=\"footnote\">\n<hr/>\n<ol>\n<li id=\"fn:caveat\">\n<p>To be pedantic, a TM can't do things like \"scrape a webpage\" or \"render a bitmap\", but we're only talking about computational decision problems here. <a class=\"footnote-backref\" href=\"#fnref:caveat\" title=\"Jump back to footnote 1 in the text\">↩</a></p>\n</li>\n<li id=\"fn:backticks\">\n<p>One notation I've adopted in <em>Logic for Programmers</em> is marking abstract sections of pseudocode with backticks. It's really handy! <a class=\"footnote-backref\" href=\"#fnref:backticks\" title=\"Jump back to footnote 2 in the text\">↩</a></p>\n</li>\n<li id=\"fn:nontrivial\">\n<p>Nontrivial meaning \"at least one TM has this property and at least one TM doesn't have this property\". Semantic meaning \"related to whether the TM accepts, rejects, or runs forever on a class of inputs\". <code>IS_DONE_IN_TEN_STEPS</code> is <em>not</em> a semantic property, as it doesn't tell us anything about inputs that take longer than ten steps. 
<a class=\"footnote-backref\" href=\"#fnref:nontrivial\" title=\"Jump back to footnote 3 in the text\">↩</a></p>\n</li>\n</ol>\n</div>",
          "url": "https://buttondown.com/hillelwayne/archive/what-does-undecidable-mean-anyway/",
          "published": "2025-05-28T19:34:02.000Z",
          "updated": "2025-05-28T19:34:02.000Z",
          "content": null,
          "image": null,
          "media": [],
          "authors": [],
          "categories": []
        },
        {
          "id": "https://buttondown.com/hillelwayne/archive/finding-hard-24-puzzles-with-planner-programming/",
          "title": "Finding hard 24 puzzles with planner programming",
          "description": "<p><strong>Planner programming</strong> is a programming technique where you solve problems by providing a goal and actions, and letting the planner find actions that reach the goal. In a previous edition of <a href=\"https://leanpub.com/logic/\" target=\"_blank\">Logic for Programmers</a>, I demonstrated how this worked by solving the \n<a href=\"https://en.wikipedia.org/wiki/24_(puzzle)\" target=\"_blank\">24 puzzle</a> with planning. For <a href=\"https://buttondown.com/hillelwayne/archive/a48fce5b-8a05-4302-b620-9b26f057f145/\" target=\"_blank\">reasons discussed here</a> I replaced that example with something more practical (orchestrating deployments), but left the <a href=\"https://github.com/logicforprogrammers/book-assets/tree/master/code/chapter-misc\" target=\"_blank\">code online</a> for posterity.</p>\n<p>Recently I saw a family member try and fail to vibe code a tool that would find all valid 24 puzzles, and realized I could adapt the puzzle solver to also be a puzzle generator. First I'll explain the puzzle rules, then the original solver, then the generator.<sup id=\"fnref:complex\"><a class=\"footnote-ref\" href=\"#fn:complex\">1</a></sup> For a much longer intro to planning, see <a href=\"https://www.hillelwayne.com/post/picat/\" target=\"_blank\">here</a>.</p>\n<h3>The rules of 24</h3>\n<p>You're given four numbers and have to find some elementary equation (<code>+-*/</code>+groupings) that uses all four numbers and results in 24. Each number must be used exactly once, but do not need to be used in the starting puzzle order. Some examples:</p>\n<ul>\n<li><code>[6, 6, 6, 6]</code> -> <code>6+6+6+6=24</code></li>\n<li><code>[1, 1, 6, 6]</code> -> <code>(6+6)*(1+1)=24</code></li>\n<li><code>[4, 4, 4, 5]</code> -> <code>4*(5+4/4)=24</code></li>\n</ul>\n<p>Some setups are impossible, like <code>[1, 1, 1, 1]</code>. 
Others are possible only with non-elementary operations, like <code>[1, 5, 5, 324]</code> (which requires exponentiation).</p>\n<h2>The solver</h2>\n<p>We will use <a href=\"http://picat-lang.org/\" target=\"_blank\">Picat</a>, the only language that I know has a built-in planner module. The current state of our plan will be represented by a single list with all of the numbers.</p>\n<div class=\"codehilite\"><pre><span></span><code><span class=\"s s-Atom\">import</span> <span class=\"s s-Atom\">planner</span><span class=\"p\">,</span> <span class=\"s s-Atom\">math</span><span class=\"p\">.</span>\n<span class=\"s s-Atom\">import</span> <span class=\"s s-Atom\">cp</span><span class=\"p\">.</span>\n\n<span class=\"nf\">action</span><span class=\"p\">(</span><span class=\"nv\">S0</span><span class=\"p\">,</span> <span class=\"nv\">S1</span><span class=\"p\">,</span> <span class=\"nv\">Action</span><span class=\"p\">,</span> <span class=\"nv\">Cost</span><span class=\"p\">)</span> <span class=\"s s-Atom\">?=></span>\n  <span class=\"nf\">member</span><span class=\"p\">(</span><span class=\"nv\">X</span><span class=\"p\">,</span> <span class=\"nv\">S0</span><span class=\"p\">)</span>\n  <span class=\"p\">,</span> <span class=\"nv\">S0</span> <span class=\"s s-Atom\">:=</span> <span class=\"nf\">delete</span><span class=\"p\">(</span><span class=\"nv\">S0</span><span class=\"p\">,</span> <span class=\"nv\">X</span><span class=\"p\">)</span> <span class=\"c1\">% , is `and`</span>\n  <span class=\"p\">,</span> <span class=\"nf\">member</span><span class=\"p\">(</span><span class=\"nv\">Y</span><span class=\"p\">,</span> <span class=\"nv\">S0</span><span class=\"p\">)</span>\n  <span class=\"p\">,</span> <span class=\"nv\">S0</span> <span class=\"s s-Atom\">:=</span> <span class=\"nf\">delete</span><span class=\"p\">(</span><span class=\"nv\">S0</span><span class=\"p\">,</span> <span class=\"nv\">Y</span><span class=\"p\">)</span>\n  <span class=\"p\">,</span> <span 
class=\"p\">(</span>\n      <span class=\"nv\">A</span> <span class=\"o\">=</span> <span class=\"err\">$</span><span class=\"p\">(</span><span class=\"nv\">X</span> <span class=\"o\">+</span> <span class=\"nv\">Y</span><span class=\"p\">)</span> \n    <span class=\"p\">;</span> <span class=\"nv\">A</span> <span class=\"o\">=</span> <span class=\"err\">$</span><span class=\"p\">(</span><span class=\"nv\">X</span> <span class=\"o\">-</span> <span class=\"nv\">Y</span><span class=\"p\">)</span>\n    <span class=\"p\">;</span> <span class=\"nv\">A</span> <span class=\"o\">=</span> <span class=\"err\">$</span><span class=\"p\">(</span><span class=\"nv\">X</span> <span class=\"o\">*</span> <span class=\"nv\">Y</span><span class=\"p\">)</span>\n    <span class=\"p\">;</span> <span class=\"nv\">A</span> <span class=\"o\">=</span> <span class=\"err\">$</span><span class=\"p\">(</span><span class=\"nv\">X</span> <span class=\"o\">/</span> <span class=\"nv\">Y</span><span class=\"p\">),</span> <span class=\"nv\">Y</span> <span class=\"o\">></span> <span class=\"mi\">0</span>\n    <span class=\"p\">)</span>\n    <span class=\"p\">,</span> <span class=\"nv\">S1</span> <span class=\"o\">=</span> <span class=\"nv\">S0</span> <span class=\"s s-Atom\">++</span> <span class=\"p\">[</span><span class=\"nf\">apply</span><span class=\"p\">(</span><span class=\"nv\">A</span><span class=\"p\">)]</span>\n  <span class=\"p\">,</span> <span class=\"nv\">Action</span> <span class=\"o\">=</span> <span class=\"nv\">A</span>\n  <span class=\"p\">,</span> <span class=\"nv\">Cost</span> <span class=\"o\">=</span> <span class=\"mi\">1</span>\n  <span class=\"p\">.</span>\n</code></pre></div>\n<p>This is our \"action\", and it works in three steps:</p>\n<ol>\n<li>Nondeterministically pull two different values out of the input, deleting them</li>\n<li>Nondeterministically pick one of the basic operations</li>\n<li>The new state is the remaining elements, appended with that operation applied to our 
two picks.</li>\n</ol>\n<p>Let's walk through this with <code>[1, 6, 1, 7]</code>. There are four choices for <code>X</code> and three for <code>Y</code>. If the planner chooses <code>X=6</code> and <code>Y=7</code>, <code>A = $(6 + 7)</code>. This is an uncomputed term in the same way lisps might use quotation. We can resolve the computation with <code>apply</code>, as in the line <code>S1 = S0 ++ [apply(A)]</code>.</p>\n<div class=\"codehilite\"><pre><span></span><code><span class=\"nf\">final</span><span class=\"p\">([</span><span class=\"nv\">N</span><span class=\"p\">])</span> <span class=\"s s-Atom\">=></span>\n  <span class=\"nv\">N</span> <span class=\"o\">=:=</span> <span class=\"mf\">24.</span> <span class=\"c1\">% handle floating point</span>\n</code></pre></div>\n<p>Our final goal is just a list where the only element is 24. This has to be a little floating point-sensitive to handle floating point division, done by <code>=:=</code>.</p>\n<div class=\"codehilite\"><pre><span></span><code><span class=\"s s-Atom\">main</span> <span class=\"s s-Atom\">=></span>\n  <span class=\"nv\">Start</span> <span class=\"o\">=</span> <span class=\"p\">[</span><span class=\"mi\">1</span><span class=\"p\">,</span> <span class=\"mi\">5</span><span class=\"p\">,</span> <span class=\"mi\">5</span><span class=\"p\">,</span> <span class=\"mi\">6</span><span class=\"p\">]</span>\n  <span class=\"p\">,</span> <span class=\"nf\">best_plan</span><span class=\"p\">(</span><span class=\"nv\">Start</span><span class=\"p\">,</span> <span class=\"mi\">4</span><span class=\"p\">,</span> <span class=\"nv\">Plan</span><span class=\"p\">)</span>\n  <span class=\"p\">,</span> <span class=\"nf\">printf</span><span class=\"p\">(</span><span class=\"s2\">\"%w %w%n\"</span><span class=\"p\">,</span> <span class=\"nv\">Start</span><span class=\"p\">,</span> <span class=\"nv\">Plan</span><span class=\"p\">)</span>\n  <span class=\"p\">.</span>\n</code></pre></div>\n<p>For <code>main</code>, we 
just find the best plan with the maximum cost of <code>4</code> and print it. When run from the command line, <code>picat</code> automatically executes whatever is in <code>main</code>.</p>\n<div class=\"codehilite\"><pre><span></span><code>$ picat 24.pi\n[1,5,5,6] [1 + 5,5 * 6,30 - 6]\n</code></pre></div>\n<p>I don't want to spoil any more 24 puzzles, so let's stop showing the plan:</p>\n<div class=\"codehilite\"><pre><span></span><code>main =>\n<span class=\"gd\">- , printf(\"%w %w%n\", Start, Plan)</span>\n<span class=\"gi\">+ , printf(\"%w%n\", Start)</span>\n</code></pre></div>\n<h3>Generating puzzles</h3>\n<p>Picat provides a <code>find_all(X, p(X))</code> function, which returns all <code>X</code> for which <code>p(X)</code> is true. In theory, we could write <code>find_all(S, best_plan(S, 4, _))</code>. In practice, there are an infinite number of valid puzzles, so we need to bound S somewhat. We also don't want to find any redundant puzzles, such as <code>[6, 6, 6, 4]</code> and <code>[4, 6, 6, 6]</code>. 
</p>\n<p>We can solve both issues by writing a helper <code>valid24(S)</code>, which will check that <code>S</code> is a sorted list of integers within some bounds, like <code>1..8</code>, and also has a valid solution.</p>\n<div class=\"codehilite\"><pre><span></span><code><span class=\"nf\">valid24</span><span class=\"p\">(</span><span class=\"nv\">Start</span><span class=\"p\">)</span> <span class=\"s s-Atom\">=></span>\n  <span class=\"nv\">Start</span> <span class=\"o\">=</span> <span class=\"nf\">new_list</span><span class=\"p\">(</span><span class=\"mi\">4</span><span class=\"p\">)</span>\n  <span class=\"p\">,</span> <span class=\"nv\">Start</span> <span class=\"s s-Atom\">::</span> <span class=\"mf\">1..8</span> <span class=\"c1\">% every value in 1..8</span>\n  <span class=\"p\">,</span> <span class=\"nf\">increasing</span><span class=\"p\">(</span><span class=\"nv\">Start</span><span class=\"p\">)</span> <span class=\"c1\">% sorted ascending</span>\n  <span class=\"p\">,</span> <span class=\"nf\">solve</span><span class=\"p\">(</span><span class=\"nv\">Start</span><span class=\"p\">)</span> <span class=\"c1\">% turn into values</span>\n  <span class=\"p\">,</span> <span class=\"nf\">best_plan</span><span class=\"p\">(</span><span class=\"nv\">Start</span><span class=\"p\">,</span> <span class=\"mi\">4</span><span class=\"p\">,</span> <span class=\"nv\">Plan</span><span class=\"p\">)</span>\n  <span class=\"p\">.</span>\n</code></pre></div>\n<p>This leans on Picat's constraint solving features to automatically find bounded sorted lists, which is why we need the <code>solve</code> step.<sup id=\"fnref:efficiency\"><a class=\"footnote-ref\" href=\"#fn:efficiency\">2</a></sup> Now we can just loop through all of the values in <code>find_all</code> to get all solutions:</p>\n<div class=\"codehilite\"><pre><span></span><code><span class=\"s s-Atom\">main</span> <span class=\"s s-Atom\">=></span>\n  <span class=\"nf\">foreach</span><span class=\"p\">([</span><span 
class=\"nv\">S</span><span class=\"p\">]</span> <span class=\"s s-Atom\">in</span> <span class=\"nf\">find_all</span><span class=\"p\">(</span>\n    <span class=\"p\">[</span><span class=\"nv\">Start</span><span class=\"p\">],</span>\n    <span class=\"nf\">valid24</span><span class=\"p\">(</span><span class=\"nv\">Start</span><span class=\"p\">)))</span>\n    <span class=\"nf\">printf</span><span class=\"p\">(</span><span class=\"s2\">\"%w%n\"</span><span class=\"p\">,</span> <span class=\"nv\">S</span><span class=\"p\">)</span>\n  <span class=\"s s-Atom\">end</span><span class=\"p\">.</span>\n</code></pre></div>\n<div class=\"codehilite\"><pre><span></span><code>$ picat 24.pi\n\n[1,1,1,8]\n[1,1,2,6]\n[1,1,2,7]\n[1,1,2,8]\n# etc\n</code></pre></div>\n<h3>Finding hard puzzles</h3>\n<p>Last Friday I realized I could do something more interesting with this. Once I have found a plan, I can apply further constraints to the plan, for example to find problems that can be solved with division:</p>\n<div class=\"codehilite\"><pre><span></span><code>valid24(Start, Plan) =>\n<span class=\"w\"> </span> Start = new_list(4)\n<span class=\"w\"> </span> , Start :: 1..8\n<span class=\"w\"> </span> , increasing(Start)\n<span class=\"w\"> </span> , solve(Start)\n<span class=\"w\"> </span> , best_plan(Start, 4, Plan)\n<span class=\"gi\">+ , member($(_ / _), Plan)</span>\n<span class=\"w\"> </span> .\n</code></pre></div>\n<p>In playing with this, though, I noticed something weird: there are some solutions that appear if I sort <em>up</em> but not <em>down</em>. For example, <code>[3,3,4,5]</code> appears in the solution set, but <code>[5, 4, 3, 3]</code> doesn't appear if I replace <code>increasing</code> with <code>decreasing</code>.</p>\n<p>As far as I can tell, this is because Picat only finds one best plan, and <code>[5, 4, 3, 3]</code> has <em>two</em> solutions: <code>4*(5-3/3)</code> and <code>3*(5+4)-3</code>. 
<code>best_plan</code> is a <em>deterministic</em> operator, so Picat commits to the first best plan it finds. So if it finds <code>3*(5+4)-3</code> first, it sees that the solution doesn't contain a division, throws <code>[5, 4, 3, 3]</code> away as a candidate, and moves on to the next puzzle.</p>\n<p>There are a couple of ways we can fix this. We could replace <code>best_plan</code> with <code>best_plan_nondet</code>, which can backtrack to find new plans (at the cost of an enormous number of duplicates). Or we could modify our <code>final</code> to only accept plans with a division: </p>\n<div class=\"codehilite\"><pre><span></span><code>% Hypothetical change\nfinal([N]) =>\n<span class=\"gi\">+ member($(_ / _), current_plan()),</span>\n<span class=\"w\"> </span> N =:= 24.\n</code></pre></div>\n<p>My favorite \"fix\" is to ask another question entirely. While I was looking for puzzles that can be solved with division, what I actually want is puzzles that <em>must</em> be solved with division. What if I rejected any puzzle that has a solution <em>without</em> division?</p>\n<div class=\"codehilite\"><pre><span></span><code><span class=\"gi\">+ plan_with_no_div(S, P) => best_plan_nondet(S, 4, P), not member($(_ / _), P).</span>\n\nvalid24(Start, Plan) =>\n<span class=\"w\"> </span> Start = new_list(4)\n<span class=\"w\"> </span> , Start :: 1..8\n<span class=\"w\"> </span> , increasing(Start)\n<span class=\"w\"> </span> , solve(Start)\n<span class=\"w\"> </span> , best_plan(Start, 4, Plan)\n<span class=\"gd\">- , member($(_ / _), Plan)</span>\n<span class=\"gi\">+ , not plan_with_no_div(Start, _)</span>\n<span class=\"w\"> </span> .\n</code></pre></div>\n<p>The new line's a bit tricky. <code>plan_with_no_div</code> nondeterministically finds a plan, and then fails if the plan contains a division.<sup id=\"fnref:not\"><a class=\"footnote-ref\" href=\"#fn:not\">3</a></sup> Since I used <code>best_plan_nondet</code>, it can backtrack from there and find a new plan. 
This means <code>plan_with_no_div</code> only fails if no such plan exists. And in <code>valid24</code>, we only succeed if <code>plan_with_no_div</code> fails, guaranteeing that the only existing plans use division. Since this doesn't depend on the plan found via <code>best_plan</code>, it doesn't matter how the values in <code>Start</code> are arranged: this will not miss any valid puzzles.</p>\n<h4>Aside for my <a href=\"https://leanpub.com/logic/\" target=\"_blank\">logic book readers</a></h4>\n<p>The new clause is equivalent to <code>!(some p: plan(p) && !(div in p))</code>. Applying the simplifications we learned:</p>\n<ol>\n<li><code>!(some p: plan(p) && !(div in p))</code> (init)</li>\n<li><code>all p: !(plan(p) && !(div in p))</code> (all/some duality)</li>\n<li><code>all p: !plan(p) || div in p</code> (De Morgan's law)</li>\n<li><code>all p: plan(p) => div in p</code> (implication definition)</li>\n</ol>\n<p>Which more obviously means \"if p is a valid plan, then it contains a division\".</p>\n<h4>Back to finding hard puzzles</h4>\n<p><em>Anyway</em>, with <code>not plan_with_no_div</code>, we are filtering puzzles on the set of possible solutions, not just specific solutions. And this gives me an idea: what if we find puzzles that have only one solution? </p>\n<div class=\"codehilite\"><pre><span></span><code><span class=\"gh\">different_plan(S, P) => best_plan_nondet(S, 4, P2), P2 != P.</span>\n\nvalid24(Start, Plan) =>\n<span class=\"gi\">+ , not different_plan(Start, Plan)</span>\n</code></pre></div>\n<p>I tried this from <code>1..8</code> and got:</p>\n<div class=\"codehilite\"><pre><span></span><code>[1,2,7,7]\n[1,3,4,6]\n[1,6,6,8]\n[3,3,8,8]\n</code></pre></div>\n<p>These happen to be some of the <a href=\"https://www.4nums.com/game/difficulties/\" target=\"_blank\">hardest 24 puzzles known</a>, though not all of them. Note this is assuming that <code>(X + Y)</code> and <code>(Y + X)</code> are <em>different</em> solutions. 
If we say they're the same (by writing <code>A = $(X + Y), X <= Y</code> in our action) then we get a lot more puzzles, many of which are considered \"easy\". Other \"hard\" things we can look for include plans that require fractions:</p>\n<div class=\"codehilite\"><pre><span></span><code><span class=\"nf\">plan_with_no_fractions</span><span class=\"p\">(</span><span class=\"nv\">S</span><span class=\"p\">,</span> <span class=\"nv\">P</span><span class=\"p\">)</span> <span class=\"s s-Atom\">=></span> \n  <span class=\"nf\">best_plan_nondet</span><span class=\"p\">(</span><span class=\"nv\">S</span><span class=\"p\">,</span> <span class=\"mi\">4</span><span class=\"p\">,</span> <span class=\"nv\">P</span><span class=\"p\">)</span>\n  <span class=\"p\">,</span> <span class=\"o\">not</span><span class=\"p\">(</span>\n    <span class=\"nf\">member</span><span class=\"p\">(</span><span class=\"nv\">X</span><span class=\"p\">,</span> <span class=\"nv\">P</span><span class=\"p\">),</span>\n    <span class=\"nf\">round</span><span class=\"p\">(</span><span class=\"nf\">apply</span><span class=\"p\">(</span><span class=\"nv\">X</span><span class=\"p\">))</span> <span class=\"s s-Atom\">=\\=</span> <span class=\"nv\">X</span>\n  <span class=\"p\">).</span>\n\n<span class=\"c1\">% insert `not plan...` in valid24 as usual</span>\n</code></pre></div>\n<p>Finally, we could try seeing if a negative number is required:</p>\n<div class=\"codehilite\"><pre><span></span><code><span class=\"nf\">plan_with_no_negatives</span><span class=\"p\">(</span><span class=\"nv\">S</span><span class=\"p\">,</span> <span class=\"nv\">P</span><span class=\"p\">)</span> <span class=\"s s-Atom\">=></span> \n  <span class=\"nf\">best_plan_nondet</span><span class=\"p\">(</span><span class=\"nv\">S</span><span class=\"p\">,</span> <span class=\"mi\">4</span><span class=\"p\">,</span> <span class=\"nv\">P</span><span class=\"p\">)</span>\n  <span class=\"p\">,</span> <span 
class=\"o\">not</span><span class=\"p\">(</span>\n    <span class=\"nf\">member</span><span class=\"p\">(</span><span class=\"nv\">X</span><span class=\"p\">,</span> <span class=\"nv\">P</span><span class=\"p\">),</span>\n    <span class=\"nf\">apply</span><span class=\"p\">(</span><span class=\"nv\">X</span><span class=\"p\">)</span> <span class=\"o\"><</span> <span class=\"mi\">0</span>\n  <span class=\"p\">).</span>\n</code></pre></div>\n<p>Interestingly this one returns no solutions, so you are never required to construct a negative number as part of a standard 24 puzzle.</p>\n<div class=\"footnote\">\n<hr/>\n<ol>\n<li id=\"fn:complex\">\n<p>The code below is different from the old book version, as it uses more fancy logic programming features that aren't good in learning material. <a class=\"footnote-backref\" href=\"#fnref:complex\" title=\"Jump back to footnote 1 in the text\">↩</a></p>\n</li>\n<li id=\"fn:efficiency\">\n<p><code>increasing</code> is a constraint predicate. We could alternatively write <code>sorted</code>, which is a Picat logical predicate and must be placed after <code>solve</code>. There don't seem to be any efficiency gains either way. <a class=\"footnote-backref\" href=\"#fnref:efficiency\" title=\"Jump back to footnote 2 in the text\">↩</a></p>\n</li>\n<li id=\"fn:not\">\n<p>I don't know what the standard is in Picat, but in Prolog, the convention is to use <code>\\+</code> instead of <code>not</code>. They mean the same thing, so I'm using <code>not</code> because it's clearer to non-LPers. <a class=\"footnote-backref\" href=\"#fnref:not\" title=\"Jump back to footnote 3 in the text\">↩</a></p>\n</li>\n</ol>\n</div>",
          "url": "https://buttondown.com/hillelwayne/archive/finding-hard-24-puzzles-with-planner-programming/",
          "published": "2025-05-20T18:21:01.000Z",
          "updated": "2025-05-20T18:21:01.000Z",
          "content": null,
          "image": null,
          "media": [],
          "authors": [],
          "categories": []
        },
        {
          "id": "https://buttondown.com/hillelwayne/archive/modeling-awkward-social-situations-with-tla/",
          "title": "Modeling Awkward Social Situations with TLA+",
          "description": "<p>You're walking down the street and need to pass someone going the opposite way. You take a step left, but they're thinking the same thing and take a step to their <em>right</em>, aka your left. You're still blocking each other. Then you take a step to the right, and they take a step to their left, and you're back to where you started. I've heard this called \"walkwarding\"</p>\n<p>Let's model this in <a href=\"https://lamport.azurewebsites.net/tla/tla.html\" target=\"_blank\">TLA+</a>. TLA+ is a <strong>formal methods</strong> tool for finding bugs in complex software designs, most often involving concurrency. Two people trying to get past each other just also happens to be a concurrent system. A gentler introduction to TLA+'s capabilities is <a href=\"https://www.hillelwayne.com/post/modeling-deployments/\" target=\"_blank\">here</a>, an in-depth guide teaching the language is <a href=\"https://learntla.com/\" target=\"_blank\">here</a>.</p>\n<h2>The spec</h2>\n<div class=\"codehilite\"><pre><span></span><code>---- MODULE walkward ----\nEXTENDS Integers\n\nVARIABLES pos\nvars == <<pos>>\n</code></pre></div>\n<p>Double equals defines a new operator, single equals is an equality check. <code><<pos>></code> is a sequence, aka array.</p>\n<div class=\"codehilite\"><pre><span></span><code>you == \"you\"\nme == \"me\"\nPeople == {you, me}\n\nMaxPlace == 4\n\nleft == 0\nright == 1\n</code></pre></div>\n<p>I've gotten into the habit of assigning string \"symbols\" to operators so that the compiler complains if I misspelled something. 
<code>left</code> and <code>right</code> are numbers so we can shift position with <code>right - pos</code>.</p>\n<div class=\"codehilite\"><pre><span></span><code>direction == [you |-> 1, me |-> -1]\ngoal == [you |-> MaxPlace, me |-> 1]\n\nInit ==\n  \\* left-right, forward-backward\n  pos = [you |-> [lr |-> left, fb |-> 1], me |-> [lr |-> left, fb |-> MaxPlace]]\n</code></pre></div>\n<p><code>direction</code>, <code>goal</code>, and <code>pos</code> are \"records\", or hash tables with string keys. I can get my left-right position with <code>pos.me.lr</code> or <code>pos[\"me\"][\"lr\"]</code> (or <code>pos[me].lr</code>, as <code>me == \"me\"</code>).</p>\n<div class=\"codehilite\"><pre><span></span><code>Juke(person) ==\n  pos' = [pos EXCEPT ![person].lr = right - @]\n</code></pre></div>\n<p>TLA+ breaks the world into a sequence of steps. In each step, <code>pos</code> is the value of <code>pos</code> in the <em>current</em> step and <code>pos'</code> is the value in the <em>next</em> step. The main outcome of this semantics is that we \"assign\" a new value to <code>pos</code> by declaring <code>pos'</code> equal to something. But the semantics also open up lots of cool tricks, like swapping two values with <code>x' = y /\\ y' = x</code>.</p>\n<p>TLA+ is a little weird about updating functions. To set <code>f[x] = 3</code>, you gotta write <code>f' = [f EXCEPT ![x] = 3]</code>. To make things a little easier, the rhs of a function update can contain <code>@</code> for the old value. 
<code>![me].lr = right - @</code> is the same as <code>right - pos[me].lr</code>, so it swaps left and right.</p>\n<p>(\"Juke\" comes from <a href=\"https://www.merriam-webster.com/dictionary/juke\" target=\"_blank\">here</a>)</p>\n<div class=\"codehilite\"><pre><span></span><code>Move(person) ==\n  LET new_pos == [pos[person] EXCEPT !.fb = @ + direction[person]]\n  IN\n    /\\ pos[person].fb # goal[person]\n    /\\ \\A p \\in People: pos[p] # new_pos\n    /\\ pos' = [pos EXCEPT ![person] = new_pos]\n</code></pre></div>\n<p>The <code>EXCEPT</code> syntax can be used in regular definitions, too. This lets someone move one step in their goal direction <em>unless</em> they are at the goal <em>or</em> someone is already in that space. <code>/\\</code> means \"and\".</p>\n<div class=\"codehilite\"><pre><span></span><code>Next ==\n  \\E p \\in People:\n    \\/ Move(p)\n    \\/ Juke(p)\n</code></pre></div>\n<p>I really like how TLA+ represents concurrency: \"In each step, there is a person who either moves or jukes.\" It can take a few uses to really wrap your head around but it can express extraordinarily complicated distributed systems.</p>\n<div class=\"codehilite\"><pre><span></span><code>Spec == Init /\\ [][Next]_vars\n\nLiveness == <>(pos[me].fb = goal[me])\n====\n</code></pre></div>\n<p><code>Spec</code> is our specification: we start at <code>Init</code> and take a <code>Next</code> step every step.</p>\n<p>Liveness is the generic term for \"something good is guaranteed to happen\", see <a href=\"https://www.hillelwayne.com/post/safety-and-liveness/\" target=\"_blank\">here</a> for more.  <code><></code> means \"eventually\", so <code>Liveness</code> means \"eventually my forward-backward position will be my goal\". 
I could extend it to \"both of us eventually reach our goal\" but I think this is good enough for a demo.</p>\n<h3>Checking the spec</h3>\n<p>Four years ago, everybody in TLA+ used the <a href=\"https://lamport.azurewebsites.net/tla/toolbox.html\" target=\"_blank\">toolbox</a>. Now the community has collectively shifted over to using the <a href=\"https://github.com/tlaplus/vscode-tlaplus/\" target=\"_blank\">VSCode extension</a>.<sup id=\"fnref:ltla\"><a class=\"footnote-ref\" href=\"#fn:ltla\">1</a></sup> VSCode requires we write a configuration file, which I will call <code>walkward.cfg</code>.</p>\n<div class=\"codehilite\"><pre><span></span><code>SPECIFICATION Spec\nPROPERTY Liveness\n</code></pre></div>\n<p>I then check the model with the VSCode command <code>TLA+: Check model with TLC</code>. Unsurprisingly, it finds an error:</p>\n<p><img alt=\"Screenshot 2025-05-12 153537.png\" class=\"newsletter-image\" src=\"https://assets.buttondown.email/images/af6f9e89-0bc6-4705-b293-4da5f5c16cfe.png?w=960&fit=max\"/></p>\n<p>The reason it fails is \"stuttering\": I can get one step away from my goal and then just stop moving forever. We say the spec is <a href=\"https://www.hillelwayne.com/post/fairness/\" target=\"_blank\">unfair</a>: it does not guarantee that if progress is always possible, progress will be made. If I want the spec to always make progress, I have to make some of the steps <strong>weakly fair</strong>.</p>\n<div class=\"codehilite\"><pre><span></span><code><span class=\"gi\">+ Fairness == WF_vars(Next)</span>\n\n<span class=\"gd\">- Spec == Init /\\ [][Next]_vars</span>\n<span class=\"gi\">+ Spec == Init /\\ [][Next]_vars /\\ Fairness</span>\n</code></pre></div>\n<p>Now the spec is weakly fair, so someone will always do <em>something</em>. 
New error:</p>\n<div class=\"codehilite\"><pre><span></span><code>\\* First six steps cut\n7: <Move(\"me\")>\npos = [you |-> [lr |-> 0, fb |-> 4], me |-> [lr |-> 1, fb |-> 2]]\n8: <Juke(\"me\")>\npos = [you |-> [lr |-> 0, fb |-> 4], me |-> [lr |-> 0, fb |-> 2]]\n9: <Juke(\"me\")> (back to state 7)\n</code></pre></div>\n<p>In this failure, I've successfully gotten past you, and then spend the rest of my life endlessly juking back and forth. The <code>Next</code> step keeps happening, so weak fairness is satisfied. What I actually want is for my <code>Move</code> and my <code>Juke</code> to both be weakly fair independently of each other.</p>\n<div class=\"codehilite\"><pre><span></span><code><span class=\"gd\">- Fairness == WF_vars(Next)</span>\n<span class=\"gi\">+ Fairness == WF_vars(Move(me)) /\\ WF_vars(Juke(me))</span>\n</code></pre></div>\n<p>If my liveness property also specified that <em>you</em> reached your goal, I could instead write <code>\\A p \\in People: WF_vars(Move(p)) etc</code>. I could also swap the <code>\\A</code> with a <code>\\E</code> to mean at least one of us is guaranteed to have fair actions, but not necessarily both of us. </p>\n<p>New error:</p>\n<div class=\"codehilite\"><pre><span></span><code>3: <Move(\"me\")>\npos = [you |-> [lr |-> 0, fb |-> 2], me |-> [lr |-> 0, fb |-> 3]]\n4: <Juke(\"you\")>\npos = [you |-> [lr |-> 1, fb |-> 2], me |-> [lr |-> 0, fb |-> 3]]\n5: <Juke(\"me\")>\npos = [you |-> [lr |-> 1, fb |-> 2], me |-> [lr |-> 1, fb |-> 3]]\n6: <Juke(\"me\")>\npos = [you |-> [lr |-> 1, fb |-> 2], me |-> [lr |-> 0, fb |-> 3]]\n7: <Juke(\"you\")> (back to state 3)\n</code></pre></div>\n<p>Now we're getting somewhere! This is the original walkwarding situation we wanted to capture. We're in each other's way, then you juke, but before either of us can move I juke, then we both juke back. We can repeat this forever, trapped in a social hell.</p>\n<p>Wait, but doesn't <code>WF(Move(me))</code> guarantee I will eventually move? 
Yes, but <em>only if a move is permanently available</em>. In this case, it's not permanently available, because every couple of steps it's made temporarily unavailable.</p>\n<p>How do I fix this? I can't add a rule saying that we only juke if we're blocked, because the whole point of walkwarding is that we're not coordinated. In the real world, walkwarding can go on for agonizing seconds. What I can do instead is say that Liveness holds <em>as long as <code>Move</code> is strongly fair</em>. Unlike weak fairness, <a href=\"https://www.hillelwayne.com/post/fairness/#strong-fairness\" target=\"_blank\">strong fairness</a> guarantees something happens if it keeps becoming possible, even with interruptions. </p>\n<div class=\"codehilite\"><pre><span></span><code>Liveness == \n<span class=\"gi\">+  SF_vars(Move(me)) => </span>\n<span class=\"w\"> </span>   <>(pos[me].fb = goal[me])\n</code></pre></div>\n<p>This makes the spec pass. Even if we weave back and forth for five minutes, as long as we eventually pass each other, I will reach my goal. Note we could also do this by making <code>Move</code> in <code>Fairness</code> strongly fair, which is preferable if we have a lot of different liveness properties to check.</p>\n<h3>A small exercise for the reader</h3>\n<p>There is a presumed invariant that is violated. Identify what it is, write it as a property in TLA+, and show the spec violates it. Then fix it.</p>\n<p>Answer (in <a href=\"https://rot13.com/\" target=\"_blank\">rot13</a>): Gur vainevnag vf \"ab gjb crbcyr ner va gur rknpg fnzr ybpngvba\". <code>Zbir</code> thnenagrrf guvf ohg <code>Whxr</code> <em>qbrf abg</em>.</p>\n<h3>More TLA+ Exercises</h3>\n<p>I've started work on <a href=\"https://github.com/hwayne/tlaplus-exercises/\" target=\"_blank\">an exercises repo</a>. 
There's only a handful of specific problems now but I'm planning on adding more over the summer.</p>\n<div class=\"footnote\">\n<hr/>\n<ol>\n<li id=\"fn:ltla\">\n<p><a href=\"https://learntla.com/\" target=\"_blank\">learntla</a> is still on the toolbox, but I'm hoping to get it all moved over this summer. <a class=\"footnote-backref\" href=\"#fnref:ltla\" title=\"Jump back to footnote 1 in the text\">↩</a></p>\n</li>\n</ol>\n</div>",
          "url": "https://buttondown.com/hillelwayne/archive/modeling-awkward-social-situations-with-tla/",
          "published": "2025-05-14T16:02:21.000Z",
          "updated": "2025-05-14T16:02:21.000Z",
          "content": null,
          "image": null,
          "media": [],
          "authors": [],
          "categories": []
        },
        {
          "id": "https://buttondown.com/hillelwayne/archive/write-the-most-clever-code-you-possibly-can/",
          "title": "Write the most clever code you possibly can",
          "description": "<p><em>I started writing this early last week but Real Life Stuff happened and now you're getting the first-draft late this week. Warning, unedited thoughts ahead!</em></p>\n<h2>New Logic for Programmers release!</h2>\n<p><a href=\"https://leanpub.com/logic/\" target=\"_blank\">v0.9 is out</a>! This is a big release, with a new cover design, several rewritten chapters, <a href=\"https://github.com/logicforprogrammers/book-assets/tree/master/code\" target=\"_blank\">online code samples</a> and much more. See the full release notes at the <a href=\"https://github.com/logicforprogrammers/book-assets/blob/master/CHANGELOG.md\" target=\"_blank\">changelog page</a>, and <a href=\"https://leanpub.com/logic/\" target=\"_blank\">get the book here</a>!</p>\n<p><img alt=\"The new cover! It's a lot nicer\" class=\"newsletter-image\" src=\"https://assets.buttondown.email/images/038a7092-5dc7-41a5-9a16-56bdef8b5d58.jpg?w=400&fit=max\"/></p>\n<h2>Write the cleverest code you possibly can</h2>\n<p>There are millions of articles online about how programmers should not write \"clever\" code, and instead write simple, maintainable code that everybody understands. 
Sometimes the example of \"clever\" code looks like this (<a href=\"https://codegolf.stackexchange.com/questions/57617/is-this-number-a-prime/57682#57682\" target=\"_blank\">src</a>):</p>\n<div class=\"codehilite\"><pre><span></span><code><span class=\"c1\"># Python</span>\n\n<span class=\"n\">p</span><span class=\"o\">=</span><span class=\"n\">n</span><span class=\"o\">=</span><span class=\"mi\">1</span>\n<span class=\"n\">exec</span><span class=\"p\">(</span><span class=\"s2\">\"p*=n*n;n+=1;\"</span><span class=\"o\">*~-</span><span class=\"nb\">int</span><span class=\"p\">(</span><span class=\"nb\">input</span><span class=\"p\">()))</span>\n<span class=\"nb\">print</span><span class=\"p\">(</span><span class=\"n\">p</span><span class=\"o\">%</span><span class=\"n\">n</span><span class=\"p\">)</span>\n</code></pre></div>\n<p>This is code-golfing, the sport of writing the most concise code possible. Obviously you shouldn't run this in production for the same reason you shouldn't eat dinner off a Rembrandt. 
</p>\n<p>Other times the example looks like this:</p>\n<div class=\"codehilite\"><pre><span></span><code><span class=\"k\">def</span><span class=\"w\"> </span><span class=\"nf\">is_prime</span><span class=\"p\">(</span><span class=\"n\">x</span><span class=\"p\">):</span>\n    <span class=\"k\">if</span> <span class=\"n\">x</span> <span class=\"o\">==</span> <span class=\"mi\">1</span><span class=\"p\">:</span>\n        <span class=\"k\">return</span> <span class=\"kc\">False</span>\n    <span class=\"k\">return</span> <span class=\"nb\">all</span><span class=\"p\">([</span><span class=\"n\">x</span><span class=\"o\">%</span><span class=\"n\">n</span> <span class=\"o\">!=</span> <span class=\"mi\">0</span> <span class=\"k\">for</span> <span class=\"n\">n</span> <span class=\"ow\">in</span> <span class=\"nb\">range</span><span class=\"p\">(</span><span class=\"mi\">2</span><span class=\"p\">,</span> <span class=\"n\">x</span><span class=\"p\">)])</span>\n</code></pre></div>\n<p>This is \"clever\" because it uses a single list comprehension, as opposed to a \"simple\" for loop. Yes, \"list comprehensions are too clever\" is something I've read in one of these articles. </p>\n<p>I've also talked to people who think that datatypes besides lists and hashmaps are too clever to use, that most optimizations are too clever to bother with, and even that functions and classes are too clever and code should be a linear script.<sup id=\"fnref:grad-students\"><a class=\"footnote-ref\" href=\"#fn:grad-students\">1</a></sup>. Clever code is anything using features or domain concepts we don't understand. Something that seems unbearably clever to me might be utterly mundane for you, and vice versa. </p>\n<p>How do we make something utterly mundane? By using it and working at the boundaries of our skills. Almost everything I'm \"good at\" comes from banging my head against it more than is healthy. 
That suggests a really good reason to write clever code: it's an excellent form of purposeful practice. Writing clever code forces us to code outside of our comfort zone, developing our skills as software engineers. </p>\n<blockquote>\n<p>Debugging is twice as hard as writing the code in the first place. Therefore, if you write the code as cleverly as possible, you [will get excellent debugging practice at exactly the right level required to push your skills as a software engineer] — Brian Kernighan, probably</p>\n</blockquote>\n<p>There are other benefits, too, but first let's kill the elephant in the room:<sup id=\"fnref:bajillion\"><a class=\"footnote-ref\" href=\"#fn:bajillion\">2</a></sup></p>\n<h3>Don't <em>commit</em> clever code</h3>\n<p>I am proposing writing clever code as a means of practice. Being at work is a <em>job</em> with coworkers who will not appreciate it if your code is too clever. Similarly, don't use <a href=\"https://mcfunley.com/choose-boring-technology\" target=\"_blank\">too many innovative technologies</a>. Don't put anything in production you are <em>uncomfortable</em> with.</p>\n<p>We can still responsibly write clever code at work, though: </p>\n<ol>\n<li>Solve a problem in both a simple and a clever way, and then only commit the simple way. This works well for small scale problems where trying the \"clever way\" only takes a few minutes.</li>\n<li>Write our <em>personal</em> tools cleverly. I'm a big believer in the idea that most programmers would benefit from writing more scripts and support code customized to their particular work environment. This is a great place to practice new techniques, languages, etc.</li>\n<li>If clever code is absolutely the best way to solve a problem, then commit it with <strong>extensive documentation</strong> explaining how it works and why it's preferable to simpler solutions. 
Bonus: this potentially helps the whole team upskill.</li>\n</ol>\n<h2>Writing clever code...</h2>\n<div class=\"subscribe-form\"></div>\n<h3>...teaches simple solutions</h3>\n<p>Usually, code that's called too clever composes several powerful features together — the \"not a single list comprehension or function\" people are the exception. <a href=\"https://www.joshwcomeau.com/career/clever-code-considered-harmful/\" target=\"_blank\">Josh Comeau's</a> \"don't write clever code\" article gives this example of \"too clever\":</p>\n<div class=\"codehilite\"><pre><span></span><code><span class=\"kd\">const</span><span class=\"w\"> </span><span class=\"nx\">extractDataFromResponse</span><span class=\"w\"> </span><span class=\"o\">=</span><span class=\"w\"> </span><span class=\"p\">(</span><span class=\"nx\">response</span><span class=\"p\">)</span><span class=\"w\"> </span><span class=\"p\">=></span><span class=\"w\"> </span><span class=\"p\">{</span>\n<span class=\"w\">  </span><span class=\"kd\">const</span><span class=\"w\"> </span><span class=\"p\">[</span><span class=\"nx\">Component</span><span class=\"p\">,</span><span class=\"w\"> </span><span class=\"nx\">props</span><span class=\"p\">]</span><span class=\"w\"> </span><span class=\"o\">=</span><span class=\"w\"> </span><span class=\"nx\">response</span><span class=\"p\">;</span>\n\n<span class=\"w\">  </span><span class=\"kd\">const</span><span class=\"w\"> </span><span class=\"nx\">resultsEntries</span><span class=\"w\"> </span><span class=\"o\">=</span><span class=\"w\"> </span><span class=\"nb\">Object</span><span class=\"p\">.</span><span class=\"nx\">entries</span><span class=\"p\">({</span><span class=\"w\"> </span><span class=\"nx\">Component</span><span class=\"p\">,</span><span class=\"w\"> </span><span class=\"nx\">props</span><span class=\"w\"> </span><span class=\"p\">});</span>\n<span class=\"w\">  </span><span class=\"kd\">const</span><span class=\"w\"> </span><span 
class=\"nx\">assignIfValueTruthy</span><span class=\"w\"> </span><span class=\"o\">=</span><span class=\"w\"> </span><span class=\"p\">(</span><span class=\"nx\">o</span><span class=\"p\">,</span><span class=\"w\"> </span><span class=\"p\">[</span><span class=\"nx\">k</span><span class=\"p\">,</span><span class=\"w\"> </span><span class=\"nx\">v</span><span class=\"p\">])</span><span class=\"w\"> </span><span class=\"p\">=></span><span class=\"w\"> </span><span class=\"p\">(</span><span class=\"nx\">v</span>\n<span class=\"w\">    </span><span class=\"o\">?</span><span class=\"w\"> </span><span class=\"p\">{</span><span class=\"w\"> </span><span class=\"p\">...</span><span class=\"nx\">o</span><span class=\"p\">,</span><span class=\"w\"> </span><span class=\"p\">[</span><span class=\"nx\">k</span><span class=\"p\">]</span><span class=\"o\">:</span><span class=\"w\"> </span><span class=\"nx\">v</span><span class=\"w\"> </span><span class=\"p\">}</span>\n<span class=\"w\">    </span><span class=\"o\">:</span><span class=\"w\"> </span><span class=\"nx\">o</span>\n<span class=\"w\">  </span><span class=\"p\">);</span>\n\n<span class=\"w\">  </span><span class=\"k\">return</span><span class=\"w\"> </span><span class=\"nx\">resultsEntries</span><span class=\"p\">.</span><span class=\"nx\">reduce</span><span class=\"p\">(</span><span class=\"nx\">assignIfValueTruthy</span><span class=\"p\">,</span><span class=\"w\"> </span><span class=\"p\">{});</span>\n<span class=\"p\">}</span>\n</code></pre></div>\n<p>What makes this \"clever\"? I count eight language features composed together: <code>entries</code>, argument unpacking, implicit objects, splats, ternaries, higher-order functions, and reductions. Would code that used only one or two of these features still be \"clever\"? I don't think so. These features exist for a reason, and oftentimes they make code simpler than not using them.</p>\n<p>We can, of course, learn these features one at a time. 
Writing the clever version (but not <em>committing it</em>) gives us practice with all eight at once and also with how they compose together. That knowledge comes in handy when we want to apply a single one of the ideas.</p>\n<p>I've recently had to do a bit of pandas for a project. Whenever I have to do a new analysis, I try to write it as a single chain of transformations, and then as a more balanced set of updates.</p>\n<h3>...helps us master concepts</h3>\n<p>Even if the composite parts of a \"clever\" solution aren't by themselves useful, it still makes us better at the overall language, and that's inherently valuable. A few years ago I wrote <a href=\"https://www.hillelwayne.com/post/python-abc/\" target=\"_blank\">Crimes with Python's Pattern Matching</a>. It involves writing horrible code like this:</p>\n<div class=\"codehilite\"><pre><span></span><code><span class=\"kn\">from</span><span class=\"w\"> </span><span class=\"nn\">abc</span><span class=\"w\"> </span><span class=\"kn\">import</span> <span class=\"n\">ABC</span>\n\n<span class=\"k\">class</span><span class=\"w\"> </span><span class=\"nc\">NotIterable</span><span class=\"p\">(</span><span class=\"n\">ABC</span><span class=\"p\">):</span>\n\n    <span class=\"nd\">@classmethod</span>\n    <span class=\"k\">def</span><span class=\"w\"> </span><span class=\"nf\">__subclasshook__</span><span class=\"p\">(</span><span class=\"bp\">cls</span><span class=\"p\">,</span> <span class=\"n\">C</span><span class=\"p\">):</span>\n        <span class=\"k\">return</span> <span class=\"ow\">not</span> <span class=\"nb\">hasattr</span><span class=\"p\">(</span><span class=\"n\">C</span><span class=\"p\">,</span> <span class=\"s2\">\"__iter__\"</span><span class=\"p\">)</span>\n\n<span class=\"k\">def</span><span class=\"w\"> </span><span class=\"nf\">f</span><span class=\"p\">(</span><span class=\"n\">x</span><span class=\"p\">):</span>\n    <span class=\"k\">match</span> <span class=\"n\">x</span><span 
class=\"p\">:</span>\n        <span class=\"k\">case</span> <span class=\"n\">NotIterable</span><span class=\"p\">():</span>\n            <span class=\"nb\">print</span><span class=\"p\">(</span><span class=\"sa\">f</span><span class=\"s2\">\"</span><span class=\"si\">{</span><span class=\"n\">x</span><span class=\"si\">}</span><span class=\"s2\"> is not iterable\"</span><span class=\"p\">)</span>\n        <span class=\"k\">case</span><span class=\"w\"> </span><span class=\"k\">_</span><span class=\"p\">:</span>\n            <span class=\"nb\">print</span><span class=\"p\">(</span><span class=\"sa\">f</span><span class=\"s2\">\"</span><span class=\"si\">{</span><span class=\"n\">x</span><span class=\"si\">}</span><span class=\"s2\"> is iterable\"</span><span class=\"p\">)</span>\n\n<span class=\"k\">if</span> <span class=\"vm\">__name__</span> <span class=\"o\">==</span> <span class=\"s2\">\"__main__\"</span><span class=\"p\">:</span>\n    <span class=\"n\">f</span><span class=\"p\">(</span><span class=\"mi\">10</span><span class=\"p\">)</span>\n    <span class=\"n\">f</span><span class=\"p\">(</span><span class=\"s2\">\"string\"</span><span class=\"p\">)</span>\n    <span class=\"n\">f</span><span class=\"p\">([</span><span class=\"mi\">1</span><span class=\"p\">,</span> <span class=\"mi\">2</span><span class=\"p\">,</span> <span class=\"mi\">3</span><span class=\"p\">])</span>\n</code></pre></div>\n<p>This composes Python match statements, which are broadly useful, and abstract base classes, which are incredibly niche. But even if I never use ABCs in real production code, it helped me understand Python's match semantics and <a href=\"https://docs.python.org/3/howto/mro.html#python-2-3-mro\" target=\"_blank\">Method Resolution Order</a> better. </p>\n<h3>...prepares us for necessity</h3>\n<p>Sometimes the clever way is the <em>only</em> way. Maybe we need something faster than the simplest solution. 
Maybe we are working with constrained tools or frameworks that demand cleverness. Peter Norvig argued that design patterns compensate for missing language features. I'd argue that cleverness is another means of compensating: if our tools don't have an easy way to do something, we need to find a clever way.</p>\n<p>You see this a lot in formal methods like TLA+. Need to check a hyperproperty? <a href=\"https://www.hillelwayne.com/post/graphing-tla/\" target=\"_blank\">Cast your state space to a directed graph</a>. Need to compose ten specifications together? <a href=\"https://www.hillelwayne.com/post/composing-tla/\" target=\"_blank\">Combine refinements with state machines</a>. Most difficult problems have a \"clever\" solution. The real problem is that clever solutions have a skill floor. If normal use of the tool is at difficulty 3 out of 10, then basic clever solutions are at 5 out of 10, and it's hard to jump those two steps in the moment you need the cleverness.</p>\n<p>But if you've practiced writing overly clever code, you're used to working at a 7 out of 10 level in short bursts, and then you can \"drop down\" to 5/10. I don't know if that makes too much sense, but I see it happen a lot in practice.</p>\n<h3>...builds comradery</h3>\n<p>On a few occasions, after getting a pull request merged, I pulled the reviewer over and said \"check out this horrible way of doing the same thing\". I find that as long as people know they're not going to be subjected to a clever solution in production, they enjoy seeing it!</p>\n<p><em>Next week's newsletter will probably also be late; after that we should be back to a regular schedule for the rest of the summer.</em></p>\n<div class=\"footnote\">\n<hr/>\n<ol>\n<li id=\"fn:grad-students\">\n<p>Mostly grad students outside of CS who have to write scripts to do research. And more than one data scientist. I think it's correlated with using Jupyter. 
<a class=\"footnote-backref\" href=\"#fnref:grad-students\" title=\"Jump back to footnote 1 in the text\">↩</a></p>\n</li>\n<li id=\"fn:bajillion\">\n<p>If I don't put this at the beginning, I'll get a bajillion responses like \"your team will hate you\" <a class=\"footnote-backref\" href=\"#fnref:bajillion\" title=\"Jump back to footnote 2 in the text\">↩</a></p>\n</li>\n</ol>\n</div>",
          "url": "https://buttondown.com/hillelwayne/archive/write-the-most-clever-code-you-possibly-can/",
          "published": "2025-05-08T15:04:42.000Z",
          "updated": "2025-05-08T15:04:42.000Z",
          "content": null,
          "image": null,
          "media": [],
          "authors": [],
          "categories": []
        },
        {
          "id": "https://buttondown.com/hillelwayne/archive/requirements-change-until-they-dont/",
          "title": "Requirements change until they don't",
          "description": "<p>Recently I got a question on formal methods<sup id=\"fnref:fs\"><a class=\"footnote-ref\" href=\"#fn:fs\">1</a></sup>: how does it help to mathematically model systems when the system requirements are constantly changing? It doesn't make sense to spend a lot of time proving a design works, and then deliver the product and find out it's not at all what the client needs. As the saying goes, the hard part is \"building the right thing\", not \"building the thing right\".</p>\n<p>One possible response: \"why write tests\"? You shouldn't write tests, <em>especially</em> <a href=\"https://en.wikipedia.org/wiki/Test-driven_development\" target=\"_blank\">lots of unit tests ahead of time</a>, if you might just throw them all away when the requirements change.</p>\n<p>This is a bad response because we all know the difference between writing tests and formal methods: testing is <em>easy</em> and FM is <em>hard</em>. Testing requires low cost for moderate correctness, FM requires high(ish) cost for high correctness. And when requirements are constantly changing, \"high(ish) cost\" isn't affordable and \"high correctness\" isn't worthwhile, because a kinda-okay solution that solves a customer's problem is infinitely better than a solid solution that doesn't.</p>\n<p>But eventually you get something that solves the problem, and what then?</p>\n<p>Most of us don't work for Google, we can't axe features and products <a href=\"https://killedbygoogle.com/\" target=\"_blank\">on a whim</a>. If the client is happy with your solution, you are expected to support it. It should work when your customers run into new edge cases, or migrate all their computers to the next OS version, or expand into a market with shoddy internet. It should work when 10x as many customers are using 10x as many features. It should work when <a href=\"https://www.hillelwayne.com/post/feature-interaction/\" target=\"_blank\">you add new features that come into conflict</a>. 
</p>\n<p>And just as importantly, <em>it should never stop solving their problem</em>. Canonical example: your feature involves processing requested tasks synchronously. At scale, this doesn't work, so to improve latency you make it asynchronous. Now it's eventually consistent, but your customers were depending on it being always consistent. Now it no longer does what they need, and has stopped solving their problems.</p>\n<p>Every successful requirement met spawns a new requirement: \"keep this working\". That requirement is permanent, or close enough to decide our long-term strategy. It takes active investment to keep a feature behaving the same as the world around it changes.</p>\n<p>(Is this all a pretentious way of saying \"software maintenance is hard?\" Maybe!)</p>\n<h3>Phase changes</h3>\n<div class=\"subscribe-form\"></div>\n<p>In physics there's a concept of a <a href=\"https://en.wikipedia.org/wiki/Phase_transition\" target=\"_blank\">phase transition</a>. To raise the temperature of a gram of liquid water by 1°C, you have to add 4.184 joules of energy.<sup id=\"fnref:calorie\"><a class=\"footnote-ref\" href=\"#fn:calorie\">2</a></sup> This continues until you raise it to 100°C, then it stops. After you've added two <em>thousand</em> joules to that gram, it suddenly turns into steam. The energy of the system changes continuously but the form, or phase, changes discretely.</p>\n<p><img alt=\"Phase_diagram_of_water_simplified.svg.png (from above link)\" class=\"newsletter-image\" src=\"https://assets.buttondown.email/images/31676a33-be6a-4c6d-a96f-425723dcb0d5.png?w=960&fit=max\"/></p>\n<p class=\"empty-line\" style=\"height:16px; margin:0px !important;\"></p>\n<p>Software isn't physics but the idea works as a metaphor. A certain architecture handles a certain level of load, and past that you need a new architecture. 
Or a bunch of similar features are independently hardcoded until the system becomes too messy to understand, so you remodel the internals into something unified and extendable. Etc etc etc. It doesn't have to be a totally discrete phase transition, but there's definitely a \"before\" and \"after\" in the system form. </p>\n<p>Phase changes tend to lead to more intricacy/complexity in the system, meaning it's likely that a phase change will introduce new bugs into existing behaviors. Take the synchronous vs asynchronous case. A very simple toy model of synchronous updates would be <code>Set(key, val)</code>, which updates <code>data[key]</code> to <code>val</code>.<sup id=\"fnref:tla\"><a class=\"footnote-ref\" href=\"#fn:tla\">3</a></sup> A model of asynchronous updates would be <code>AsyncSet(key, val, priority)</code>, which adds a <code>(key, val, priority, server_time())</code> tuple to a <code>tasks</code> set, and then another process asynchronously pulls a tuple (ordered by highest priority, then earliest time) and calls <code>Set(key, val)</code>. Here are some properties the client may need preserved as a requirement: </p>\n<ul>\n<li>If <code>AsyncSet(key, val, _)</code> is called, then <em>eventually</em> <code>data[key] = val</code> (possibly violated if higher-priority tasks keep coming in)</li>\n<li>If someone calls <code>AsyncSet(key1, val1, low)</code> and then <code>AsyncSet(key2, val2, low)</code>, they should see the first update and then the second (linearizability, possibly violated if the requests go to different servers with different clock times)</li>\n<li>If someone calls <code>AsyncSet(key, val, _)</code> and <em>immediately</em> reads <code>data[key]</code> they should get <code>val</code> (obviously violated, though the client may accept a <em>slightly</em> weaker property)</li>\n</ul>\n<p>If the new system doesn't satisfy an existing customer requirement, it's prudent to fix the bug <em>before</em> releasing the new system. 
The customer doesn't notice or care that your system underwent a phase change. They'll just see that one day your product solves their problems, and the next day it suddenly doesn't. </p>\n<p>This is one of the most common applications of formal methods. Both of those systems, and every one of those properties, are formally specifiable in a specification language. We can then automatically check that the new system satisfies the existing properties, and from there do things like <a href=\"https://arxiv.org/abs/2006.00915\" target=\"_blank\">automatically generate test suites</a>. This does take a lot of work, so if your requirements are constantly changing, FM may not be worth the investment. But eventually requirements <em>stop</em> changing, and then you're stuck with them forever. That's where models shine.</p>\n<div class=\"footnote\">\n<hr/>\n<ol>\n<li id=\"fn:fs\">\n<p>As always, I'm using formal methods to mean the subdiscipline of formal specification of designs, leaving out the formal verification of code. Mostly because \"formal specification\" is really awkward to say. <a class=\"footnote-backref\" href=\"#fnref:fs\" title=\"Jump back to footnote 1 in the text\">↩</a></p>\n</li>\n<li id=\"fn:calorie\">\n<p>Also called a \"calorie\". The US \"dietary Calorie\" is actually a kilocalorie. <a class=\"footnote-backref\" href=\"#fnref:calorie\" title=\"Jump back to footnote 2 in the text\">↩</a></p>\n</li>\n<li id=\"fn:tla\">\n<p>This is all directly translatable to a TLA+ specification; I'm just describing it in English to avoid paying the syntax tax. <a class=\"footnote-backref\" href=\"#fnref:tla\" title=\"Jump back to footnote 3 in the text\">↩</a></p>\n</li>\n</ol>\n</div>",
          "url": "https://buttondown.com/hillelwayne/archive/requirements-change-until-they-dont/",
          "published": "2025-04-24T11:00:00.000Z",
          "updated": "2025-04-24T11:00:00.000Z",
          "content": null,
          "image": null,
          "media": [],
          "authors": [],
          "categories": []
        },
        {
          "id": "https://buttondown.com/hillelwayne/archive/the-halting-problem-is-a-terrible-example-of-np/",
          "title": "The Halting Problem is a terrible example of NP-Harder",
          "description": "<p><em>Short one this time because I have a lot going on this week.</em></p>\n<p>In computational complexity, <strong>NP</strong> is the class of all decision problems (yes/no) where a potential proof (or \"witness\") for \"yes\" can be <em>verified</em> in polynomial time. For example, \"does this set of numbers have a subset that sums to zero\" is in NP. If the answer is \"yes\", you can prove it by presenting such a subset. We would then verify the witness by 1) checking that all the numbers are present in the set (~linear time) and 2) adding up all the numbers (also linear).</p>\n<p><strong>NP-complete</strong> is the class of \"hardest possible\" NP problems. Subset sum is NP-complete. <strong>NP-hard</strong> is the set of all problems <em>at least as hard</em> as NP-complete. Notably, NP-hard is <em>not</em> a subset of NP, as it contains problems that are <em>harder</em> than NP-complete. A natural question to ask is \"like what?\" And the canonical example of \"NP-harder\" is the halting problem (HALT): does program P halt on input C? As the argument goes, it's undecidable, so obviously not in NP.</p>\n<p>I think this is a bad example for two reasons:</p>\n<ol><li><p>All NP requires is that witnesses for \"yes\" can be verified in polynomial time. It does not require anything for the \"no\" case! And even though HALT is undecidable, there <em>is</em> a decidable way to verify a \"yes\": let the witness be \"it halts in N steps\", then run the program for that many steps and see if it halted by then. To prove HALT is not in NP, you have to show that this verification process grows faster than polynomially. 
It does (as <a href=\"https://en.wikipedia.org/wiki/Busy_beaver\" rel=\"noopener noreferrer nofollow\" target=\"_blank\">busy beaver</a> is uncomputable), but this all makes the example needlessly confusing.<sup id=\"fnref:1\"><a class=\"footnote-ref\" data-id=\"37347adc-dba6-4629-9d24-c6252292ac6b\" data-reference-number=\"1\" href=\"#fn:1\">1</a></sup></p></li><li><p>\"What's bigger than a dog? THE MOON\"</p></li></ol>\n<p>Really (2) bothers me a lot more than (1) because it's just so inelegant. It suggests that NP-complete is the upper bound of \"solvable\" problems, and after that you're in full-on undecidability. I'd rather show intuitive problems that are harder than NP but not <em>that</em> much harder.</p>\n<p>But in looking for a \"slightly harder\" problem, I ran into an, ah, problem. It <em>seems</em> like the next-hardest class would be <a href=\"https://en.wikipedia.org/wiki/EXPTIME\" rel=\"noopener noreferrer nofollow\" target=\"_blank\">EXPTIME</a>, except we don't know <em>for sure</em> that NP != EXPTIME. We know <em>for sure</em> that NP != <a href=\"https://en.wikipedia.org/wiki/NEXPTIME\" rel=\"noopener noreferrer nofollow\" target=\"_blank\">NEXPTIME</a>, but NEXPTIME doesn't have any intuitive, easily explainable problems. Most \"definitely harder than NP\" problems require a nontrivial background in theoretical computer science or mathematics to understand.</p>\n<p>There is one problem, though, that I find easily explainable. Place a token at the bottom left corner of a grid that extends infinitely up and right, and call that point (0, 0). You're given a list of valid displacement moves for the token, like <code>(+1, +0)</code>, <code>(-20, +13)</code>, <code>(-5, -6)</code>, etc, and a target point like <code>(700, 1)</code>. You may make any sequence of moves in any order, as long as no move ever puts the token off the grid. 
Does any sequence of moves bring you to the target?</p>\n<div class=\"subscribe-form\"></div>\n<p>This is PSPACE-complete, I think, which still isn't proven to be harder than NP-complete (though it's widely believed). But what if you increase the number of dimensions of the grid? Past a certain number of dimensions the problem jumps to being EXPSPACE-complete, and then TOWER-complete (grows <a href=\"https://en.wikipedia.org/wiki/Tetration\" rel=\"noopener noreferrer nofollow\" target=\"_blank\">tetrationally</a>), and then it keeps going. Some people might recognize this as looking a lot like the <a href=\"https://en.wikipedia.org/wiki/Ackermann_function\" rel=\"noopener noreferrer nofollow\" target=\"_blank\">Ackermann function</a>, and in fact this problem is <a href=\"https://arxiv.org/abs/2104.13866\" rel=\"noopener noreferrer nofollow\" target=\"_blank\">ACKERMANN-complete on the number of available dimensions</a>.</p>\n<p><a href=\"https://www.quantamagazine.org/an-easy-sounding-problem-yields-numbers-too-big-for-our-universe-20231204/\" rel=\"noopener noreferrer nofollow\" target=\"_blank\">A friend wrote a Quanta article about the whole mess</a>, you should read it.</p>\n<p>This problem is ludicrously bigger than NP (\"Chicago\" instead of \"The Moon\"), but at least it's clearly decidable, easily explainable, and definitely <em>not</em> in NP.</p>\n<div class=\"footnote\"><hr/><ol class=\"footnotes\"><li data-id=\"37347adc-dba6-4629-9d24-c6252292ac6b\" id=\"fn:1\"><p>It's less confusing if you're taught the alternate (and original!) definition of NP, \"the class of problems solvable in polynomial time by a nondeterministic Turing machine\". Then HALT can't be in NP because otherwise runtime would be bounded by an exponential function. <a class=\"footnote-backref\" href=\"#fnref:1\">↩</a></p></li></ol></div>",
          "url": "https://buttondown.com/hillelwayne/archive/the-halting-problem-is-a-terrible-example-of-np/",
          "published": "2025-04-16T17:39:23.000Z",
          "updated": "2025-04-16T17:39:23.000Z",
          "content": null,
          "image": null,
          "media": [],
          "authors": [],
          "categories": []
        },
        {
          "id": "https://buttondown.com/hillelwayne/archive/a48fce5b-8a05-4302-b620-9b26f057f145/",
          "title": "Solving a \"Layton Puzzle\" with Prolog",
          "description": "<p>I have a lot in the works for this month's <a href=\"https://leanpub.com/logic/\" target=\"_blank\">Logic for Programmers</a> release. Among other things, I'm completely rewriting the chapter on Logic Programming Languages. </p>\n<p>I originally showcased the paradigm with puzzle solvers, like <a href=\"https://swish.swi-prolog.org/example/queens.pl\" target=\"_blank\">eight queens</a> or <a href=\"https://saksagan.ceng.metu.edu.tr/courses/ceng242/documents/prolog/jrfisher/2_1.html\" target=\"_blank\">four-coloring</a>. Lots of other demos do this too! It takes creativity and insight for humans to solve them, so a program doing it feels magical. But I'm trying to write a book about practical techniques and I want everything I talk about to be <em>useful</em>. So in v0.9 I'll be replacing these examples with a couple of new programs that might get people thinking that Prolog could help them in their day-to-day work.</p>\n<p>On the other hand, for a newsletter, showcasing a puzzle solver is pretty cool. And recently I stumbled into <a href=\"https://morepablo.com/2010/09/some-professor-layton-prolog.html\" target=\"_blank\">this post</a> by my friend <a href=\"https://morepablo.com/\" target=\"_blank\">Pablo Meier</a>, where he solves a videogame puzzle with Prolog:<sup id=\"fnref:path\"><a class=\"footnote-ref\" href=\"#fn:path\">1</a></sup></p>\n<p><img alt=\"See description below\" class=\"newsletter-image\" src=\"https://assets.buttondown.email/images/a4ee8689-bbce-4dc9-8175-a1de3bd8f2db.png?w=960&fit=max\"/></p>\n<p>Summary for the text-only readers: We have a test with 10 true/false questions (denoted <code>a/b</code>) and four student attempts. 
Given the scores of the first three students, we have to figure out the fourth student's score.</p>\n<div class=\"codehilite\"><pre><span></span><code>bbababbabb = 7\nbaaababaaa = 5\nbaaabbbaba = 3\nbbaaabbaaa = ???\n</code></pre></div>\n<p>You can see Pablo's solution <a href=\"https://morepablo.com/2010/09/some-professor-layton-prolog.html\" target=\"_blank\">here</a>, and try it in SWI-prolog <a href=\"https://swish.swi-prolog.org/p/Some%20Professor%20Layton%20Prolog.pl\" target=\"_blank\">here</a>. Pretty cool! But after way too long studying Prolog just to write this dang book chapter, I wanted to see if I could do it more elegantly than him. Code and puzzle spoilers to follow.</p>\n<p>(Normally here's where I'd link to a gentler introduction I wrote but I think this is my first time writing about Prolog online? Uh here's a <a href=\"https://www.hillelwayne.com/post/picat/\" target=\"_blank\">Picat intro</a> instead)</p>\n<h3>The Program</h3>\n<p>You can try this all online at <a href=\"https://swish.swi-prolog.org/p/\" target=\"_blank\">SWISH</a> or just jump to my final version <a href=\"https://swish.swi-prolog.org/p/layton_prolog_puzzle.pl\" target=\"_blank\">here</a>.</p>\n<div class=\"codehilite\"><pre><span></span><code><span class=\"p\">:-</span> <span class=\"nf\">use_module</span><span class=\"p\">(</span><span class=\"nf\">library</span><span class=\"p\">(</span><span class=\"s s-Atom\">dif</span><span class=\"p\">)).</span>    <span class=\"c1\">% Sound inequality</span>\n<span class=\"p\">:-</span> <span class=\"nf\">use_module</span><span class=\"p\">(</span><span class=\"nf\">library</span><span class=\"p\">(</span><span class=\"s s-Atom\">clpfd</span><span class=\"p\">)).</span>  <span class=\"c1\">% Finite domain constraints</span>\n</code></pre></div>\n<p>First some imports. <code>dif</code> lets us write <code>dif(A, B)</code>, which is true if <code>A</code> and <code>B</code> are <em>not</em> equal. 
<code>clpfd</code> lets us write <code>A #= B + 1</code> to say \"A is 1 more than B\".<sup id=\"fnref:superior\"><a class=\"footnote-ref\" href=\"#fn:superior\">2</a></sup></p>\n<p>We'll say both the student submission and the key will be lists, where each value is <code>a</code> or <code>b</code>. In Prolog, lowercase identifiers are <strong>atoms</strong> (like symbols in other languages) and identifiers that start with a capital are <strong>variables</strong>. Prolog finds values for variables that match equations (<strong>unification</strong>). The pattern matching is real real good.</p>\n<div class=\"codehilite\"><pre><span></span><code><span class=\"c1\">% ?- means query</span>\n<span class=\"s s-Atom\">?-</span> <span class=\"nv\">L</span> <span class=\"o\">=</span> <span class=\"p\">[</span><span class=\"s s-Atom\">a</span><span class=\"p\">,</span><span class=\"nv\">B</span><span class=\"p\">,</span><span class=\"s s-Atom\">c</span><span class=\"p\">],</span> <span class=\"p\">[</span><span class=\"nv\">Y</span><span class=\"p\">|</span><span class=\"nv\">X</span><span class=\"p\">]</span> <span class=\"o\">=</span> <span class=\"p\">[</span><span class=\"mi\">1</span><span class=\"p\">,</span><span class=\"mi\">2</span><span class=\"p\">|</span><span class=\"nv\">L</span><span class=\"p\">],</span> <span class=\"nv\">B</span> <span class=\"o\">+</span> <span class=\"mi\">1</span> <span class=\"s s-Atom\">#=</span> <span class=\"mf\">7.</span>\n\n<span class=\"nv\">B</span> <span class=\"o\">=</span> <span class=\"mi\">6</span><span class=\"p\">,</span>\n<span class=\"nv\">L</span> <span class=\"o\">=</span> <span class=\"p\">[</span><span class=\"s s-Atom\">a</span><span class=\"p\">,</span> <span class=\"mi\">6</span><span class=\"p\">,</span> <span class=\"s s-Atom\">c</span><span class=\"p\">],</span>\n<span class=\"nv\">X</span> <span class=\"o\">=</span> <span class=\"p\">[</span><span class=\"mi\">2</span><span class=\"p\">,</span> <span class=\"s 
s-Atom\">a</span><span class=\"p\">,</span> <span class=\"mi\">6</span><span class=\"p\">,</span> <span class=\"s s-Atom\">c</span><span class=\"p\">],</span>\n<span class=\"nv\">Y</span> <span class=\"o\">=</span> <span class=\"mi\">1</span>\n</code></pre></div>\n<p>Next, we define <code>score/3</code><sup id=\"fnref:arity\"><a class=\"footnote-ref\" href=\"#fn:arity\">3</a></sup> recursively. </p>\n<div class=\"codehilite\"><pre><span></span><code><span class=\"c1\">% The student's test score</span>\n<span class=\"c1\">% score(student answers, answer key, score)</span>\n<span class=\"nf\">score</span><span class=\"p\">([],</span> <span class=\"p\">[],</span> <span class=\"mi\">0</span><span class=\"p\">).</span>\n<span class=\"nf\">score</span><span class=\"p\">([</span><span class=\"nv\">A</span><span class=\"p\">|</span><span class=\"nv\">As</span><span class=\"p\">],</span> <span class=\"p\">[</span><span class=\"nv\">A</span><span class=\"p\">|</span><span class=\"nv\">Ks</span><span class=\"p\">],</span> <span class=\"nv\">N</span><span class=\"p\">)</span> <span class=\"p\">:-</span>\n   <span class=\"nv\">N</span> <span class=\"s s-Atom\">#=</span> <span class=\"nv\">M</span> <span class=\"o\">+</span> <span class=\"mi\">1</span><span class=\"p\">,</span> <span class=\"nf\">score</span><span class=\"p\">(</span><span class=\"nv\">As</span><span class=\"p\">,</span> <span class=\"nv\">Ks</span><span class=\"p\">,</span> <span class=\"nv\">M</span><span class=\"p\">).</span>\n<span class=\"nf\">score</span><span class=\"p\">([</span><span class=\"nv\">A</span><span class=\"p\">|</span><span class=\"nv\">As</span><span class=\"p\">],</span> <span class=\"p\">[</span><span class=\"nv\">K</span><span class=\"p\">|</span><span class=\"nv\">Ks</span><span class=\"p\">],</span> <span class=\"nv\">N</span><span class=\"p\">)</span> <span class=\"p\">:-</span> \n    <span class=\"nf\">dif</span><span class=\"p\">(</span><span class=\"nv\">A</span><span 
class=\"p\">,</span> <span class=\"nv\">K</span><span class=\"p\">),</span> <span class=\"nf\">score</span><span class=\"p\">(</span><span class=\"nv\">As</span><span class=\"p\">,</span> <span class=\"nv\">Ks</span><span class=\"p\">,</span> <span class=\"nv\">N</span><span class=\"p\">).</span>\n</code></pre></div>\n<p>First argument is the student's answers, second is the answer key, third is the final score. The base case is the empty test, which has score 0. Otherwise, we take the head values of each list and compare them. If they're the same, we add one to the score, otherwise we keep the same score. </p>\n<p>Notice we couldn't write <code>if x then y else z</code>, we instead used pattern matching to effectively express <code>(x && y) || (!x && z)</code>. Prolog does have a conditional operator, but it prevents backtracking so what's the point???</p>\n<p class=\"empty-line\" style=\"height:16px; margin:0px !important;\"></p>\n<h3>A quick break about bidirectionality</h3>\n<p>One of the coolest things about Prolog: all purely logical predicates are bidirectional. 
We can use <code>score</code> to check if our expected score is correct:</p>\n<div class=\"codehilite\"><pre><span></span><code><span class=\"s s-Atom\">?-</span> <span class=\"nf\">score</span><span class=\"p\">([</span><span class=\"s s-Atom\">a</span><span class=\"p\">,</span> <span class=\"s s-Atom\">b</span><span class=\"p\">,</span> <span class=\"s s-Atom\">b</span><span class=\"p\">],</span> <span class=\"p\">[</span><span class=\"s s-Atom\">b</span><span class=\"p\">,</span> <span class=\"s s-Atom\">b</span><span class=\"p\">,</span> <span class=\"s s-Atom\">b</span><span class=\"p\">],</span> <span class=\"mi\">2</span><span class=\"p\">).</span>\n<span class=\"s s-Atom\">true</span>\n</code></pre></div>\n<p>But we can also give it answers and a key and ask it for the score:</p>\n<div class=\"codehilite\"><pre><span></span><code><span class=\"s s-Atom\">?-</span> <span class=\"nf\">score</span><span class=\"p\">([</span><span class=\"s s-Atom\">a</span><span class=\"p\">,</span> <span class=\"s s-Atom\">b</span><span class=\"p\">,</span> <span class=\"s s-Atom\">b</span><span class=\"p\">],</span> <span class=\"p\">[</span><span class=\"s s-Atom\">b</span><span class=\"p\">,</span> <span class=\"s s-Atom\">b</span><span class=\"p\">,</span> <span class=\"s s-Atom\">b</span><span class=\"p\">],</span> <span class=\"nv\">X</span><span class=\"p\">).</span>\n<span class=\"nv\">X</span> <span class=\"o\">=</span> <span class=\"mi\">2</span>\n</code></pre></div>\n<p><em>Or</em> we could give it a key and a score and ask \"what test answers would have this score?\"</p>\n<div class=\"codehilite\"><pre><span></span><code><span class=\"s s-Atom\">?-</span> <span class=\"nf\">score</span><span class=\"p\">(</span><span class=\"nv\">X</span><span class=\"p\">,</span> <span class=\"p\">[</span><span class=\"s s-Atom\">b</span><span class=\"p\">,</span> <span class=\"s s-Atom\">b</span><span class=\"p\">,</span> <span class=\"s s-Atom\">b</span><span 
class=\"p\">],</span> <span class=\"mi\">2</span><span class=\"p\">).</span>\n<span class=\"nv\">X</span> <span class=\"o\">=</span> <span class=\"p\">[</span><span class=\"s s-Atom\">b</span><span class=\"p\">,</span> <span class=\"s s-Atom\">b</span><span class=\"p\">,</span> <span class=\"k\">_</span><span class=\"nv\">A</span><span class=\"p\">],</span>\n<span class=\"nf\">dif</span><span class=\"p\">(</span><span class=\"k\">_</span><span class=\"nv\">A</span><span class=\"p\">,</span><span class=\"s s-Atom\">b</span><span class=\"p\">)</span>\n<span class=\"nv\">X</span> <span class=\"o\">=</span> <span class=\"p\">[</span><span class=\"s s-Atom\">b</span><span class=\"p\">,</span> <span class=\"k\">_</span><span class=\"nv\">A</span><span class=\"p\">,</span> <span class=\"s s-Atom\">b</span><span class=\"p\">],</span>\n<span class=\"nf\">dif</span><span class=\"p\">(</span><span class=\"k\">_</span><span class=\"nv\">A</span><span class=\"p\">,</span><span class=\"s s-Atom\">b</span><span class=\"p\">)</span>\n<span class=\"nv\">X</span> <span class=\"o\">=</span> <span class=\"p\">[</span><span class=\"k\">_</span><span class=\"nv\">A</span><span class=\"p\">,</span> <span class=\"s s-Atom\">b</span><span class=\"p\">,</span> <span class=\"s s-Atom\">b</span><span class=\"p\">],</span>\n<span class=\"nf\">dif</span><span class=\"p\">(</span><span class=\"k\">_</span><span class=\"nv\">A</span><span class=\"p\">,</span><span class=\"s s-Atom\">b</span><span class=\"p\">)</span>\n</code></pre></div>\n<p>The different value is written <code>_A</code> because we never told Prolog that the array can <em>only</em> contain <code>a</code> and <code>b</code>. 
We'll fix this later.</p>\n<h3>Okay back to the program</h3>\n<p>Now that we have a way of computing scores, we want to find a possible answer key that matches all of our observations, ie gives everybody the correct scores.</p>\n<div class=\"codehilite\"><pre><span></span><code><span class=\"nf\">key</span><span class=\"p\">(</span><span class=\"nv\">Key</span><span class=\"p\">)</span> <span class=\"p\">:-</span>\n    <span class=\"c1\">% Figure it out</span>\n    <span class=\"nf\">score</span><span class=\"p\">([</span><span class=\"s s-Atom\">b</span><span class=\"p\">,</span> <span class=\"s s-Atom\">b</span><span class=\"p\">,</span> <span class=\"s s-Atom\">a</span><span class=\"p\">,</span> <span class=\"s s-Atom\">b</span><span class=\"p\">,</span> <span class=\"s s-Atom\">a</span><span class=\"p\">,</span> <span class=\"s s-Atom\">b</span><span class=\"p\">,</span> <span class=\"s s-Atom\">b</span><span class=\"p\">,</span> <span class=\"s s-Atom\">a</span><span class=\"p\">,</span> <span class=\"s s-Atom\">b</span><span class=\"p\">,</span> <span class=\"s s-Atom\">b</span><span class=\"p\">],</span> <span class=\"nv\">Key</span><span class=\"p\">,</span> <span class=\"mi\">7</span><span class=\"p\">),</span>\n    <span class=\"nf\">score</span><span class=\"p\">([</span><span class=\"s s-Atom\">b</span><span class=\"p\">,</span> <span class=\"s s-Atom\">a</span><span class=\"p\">,</span> <span class=\"s s-Atom\">a</span><span class=\"p\">,</span> <span class=\"s s-Atom\">a</span><span class=\"p\">,</span> <span class=\"s s-Atom\">b</span><span class=\"p\">,</span> <span class=\"s s-Atom\">a</span><span class=\"p\">,</span> <span class=\"s s-Atom\">b</span><span class=\"p\">,</span> <span class=\"s s-Atom\">a</span><span class=\"p\">,</span> <span class=\"s s-Atom\">a</span><span class=\"p\">,</span> <span class=\"s s-Atom\">a</span><span class=\"p\">],</span> <span class=\"nv\">Key</span><span class=\"p\">,</span> <span class=\"mi\">5</span><span 
class=\"p\">),</span>\n    <span class=\"nf\">score</span><span class=\"p\">([</span><span class=\"s s-Atom\">b</span><span class=\"p\">,</span> <span class=\"s s-Atom\">a</span><span class=\"p\">,</span> <span class=\"s s-Atom\">a</span><span class=\"p\">,</span> <span class=\"s s-Atom\">a</span><span class=\"p\">,</span> <span class=\"s s-Atom\">b</span><span class=\"p\">,</span> <span class=\"s s-Atom\">b</span><span class=\"p\">,</span> <span class=\"s s-Atom\">b</span><span class=\"p\">,</span> <span class=\"s s-Atom\">a</span><span class=\"p\">,</span> <span class=\"s s-Atom\">b</span><span class=\"p\">,</span> <span class=\"s s-Atom\">a</span><span class=\"p\">],</span> <span class=\"nv\">Key</span><span class=\"p\">,</span> <span class=\"mi\">3</span><span class=\"p\">).</span>\n</code></pre></div>\n<p>So far we haven't explicitly said that the <code>Key</code> length matches the student answer lengths. This is implicitly verified by <code>score</code> (both lists need to be empty at the same time) but it's a good idea to explicitly add <code>length(Key, 10)</code> as a clause of <code>key/1</code>. We should also explicitly say that every element of <code>Key</code> is either <code>a</code> or <code>b</code>.<sup id=\"fnref:explicit\"><a class=\"footnote-ref\" href=\"#fn:explicit\">4</a></sup> Now we <em>could</em> write a second predicate saying <code>Key</code> had the right 'type': </p>\n<div class=\"codehilite\"><pre><span></span><code>keytype([]).\nkeytype([K|Ks]) :- member(K, [a, b]), keytype(Ks).\n</code></pre></div>\n<p>But \"generating lists that match a constraint\" is a thing that comes up often enough that we don't want to write a separate predicate for each constraint! So after some digging, I found a more elegant solution: <code>maplist</code>. Let <code>L=[l1, l2]</code>. Then <code>maplist(p, L)</code> is equivalent to the clause <code>p(l1), p(l2)</code>. 
It also accepts partial predicates: <code>maplist(p(x), L)</code> is equivalent to <code>p(x, l1), p(x, l2)</code>. So we could write<sup id=\"fnref:yall\"><a class=\"footnote-ref\" href=\"#fn:yall\">5</a></sup></p>\n<div class=\"codehilite\"><pre><span></span><code><span class=\"nf\">contains</span><span class=\"p\">(</span><span class=\"nv\">L</span><span class=\"p\">,</span> <span class=\"nv\">X</span><span class=\"p\">)</span> <span class=\"p\">:-</span> <span class=\"nf\">member</span><span class=\"p\">(</span><span class=\"nv\">X</span><span class=\"p\">,</span> <span class=\"nv\">L</span><span class=\"p\">).</span>\n\n<span class=\"nf\">key</span><span class=\"p\">(</span><span class=\"nv\">Key</span><span class=\"p\">)</span> <span class=\"p\">:-</span>\n    <span class=\"nf\">length</span><span class=\"p\">(</span><span class=\"nv\">Key</span><span class=\"p\">,</span> <span class=\"mi\">10</span><span class=\"p\">),</span>\n    <span class=\"nf\">maplist</span><span class=\"p\">(</span><span class=\"nf\">contains</span><span class=\"p\">([</span><span class=\"s s-Atom\">a</span><span class=\"p\">,</span><span class=\"s s-Atom\">b</span><span class=\"p\">]),</span> <span class=\"nv\">Key</span><span class=\"p\">),</span>\n    <span class=\"c1\">% the score stuff</span>\n</code></pre></div>\n<p>Now, let's query for the Key:</p>\n<div class=\"codehilite\"><pre><span></span><code><span class=\"s s-Atom\">?-</span> <span class=\"nf\">key</span><span class=\"p\">(</span><span class=\"nv\">Key</span><span class=\"p\">)</span>\n<span class=\"nv\">Key</span> <span class=\"o\">=</span> <span class=\"p\">[</span><span class=\"s s-Atom\">a</span><span class=\"p\">,</span> <span class=\"s s-Atom\">b</span><span class=\"p\">,</span> <span class=\"s s-Atom\">a</span><span class=\"p\">,</span> <span class=\"s s-Atom\">b</span><span class=\"p\">,</span> <span class=\"s s-Atom\">a</span><span class=\"p\">,</span> <span class=\"s s-Atom\">a</span><span class=\"p\">,</span> 
<span class=\"s s-Atom\">b</span><span class=\"p\">,</span> <span class=\"s s-Atom\">a</span><span class=\"p\">,</span> <span class=\"s s-Atom\">a</span><span class=\"p\">,</span> <span class=\"s s-Atom\">b</span><span class=\"p\">]</span>\n<span class=\"nv\">Key</span> <span class=\"o\">=</span> <span class=\"p\">[</span><span class=\"s s-Atom\">b</span><span class=\"p\">,</span> <span class=\"s s-Atom\">b</span><span class=\"p\">,</span> <span class=\"s s-Atom\">a</span><span class=\"p\">,</span> <span class=\"s s-Atom\">b</span><span class=\"p\">,</span> <span class=\"s s-Atom\">a</span><span class=\"p\">,</span> <span class=\"s s-Atom\">a</span><span class=\"p\">,</span> <span class=\"s s-Atom\">a</span><span class=\"p\">,</span> <span class=\"s s-Atom\">a</span><span class=\"p\">,</span> <span class=\"s s-Atom\">a</span><span class=\"p\">,</span> <span class=\"s s-Atom\">b</span><span class=\"p\">]</span>\n<span class=\"nv\">Key</span> <span class=\"o\">=</span> <span class=\"p\">[</span><span class=\"s s-Atom\">b</span><span class=\"p\">,</span> <span class=\"s s-Atom\">b</span><span class=\"p\">,</span> <span class=\"s s-Atom\">a</span><span class=\"p\">,</span> <span class=\"s s-Atom\">b</span><span class=\"p\">,</span> <span class=\"s s-Atom\">a</span><span class=\"p\">,</span> <span class=\"s s-Atom\">a</span><span class=\"p\">,</span> <span class=\"s s-Atom\">b</span><span class=\"p\">,</span> <span class=\"s s-Atom\">b</span><span class=\"p\">,</span> <span class=\"s s-Atom\">a</span><span class=\"p\">,</span> <span class=\"s s-Atom\">b</span><span class=\"p\">]</span>\n<span class=\"nv\">Key</span> <span class=\"o\">=</span> <span class=\"p\">[</span><span class=\"s s-Atom\">b</span><span class=\"p\">,</span> <span class=\"s s-Atom\">b</span><span class=\"p\">,</span> <span class=\"s s-Atom\">b</span><span class=\"p\">,</span> <span class=\"s s-Atom\">b</span><span class=\"p\">,</span> <span class=\"s s-Atom\">a</span><span class=\"p\">,</span> <span 
class=\"s s-Atom\">a</span><span class=\"p\">,</span> <span class=\"s s-Atom\">b</span><span class=\"p\">,</span> <span class=\"s s-Atom\">a</span><span class=\"p\">,</span> <span class=\"s s-Atom\">a</span><span class=\"p\">,</span> <span class=\"s s-Atom\">b</span><span class=\"p\">]</span>\n</code></pre></div>\n<p>So there are actually four <em>different</em> keys that all explain our data. Does this mean the puzzle is broken and has multiple different answers?</p>\n<h3>Nope</h3>\n<p>The puzzle wasn't to find out what the answer key was, the point was to find the fourth student's score. And if we query for it, we see all four solutions give him the same score:</p>\n<div class=\"codehilite\"><pre><span></span><code><span class=\"s s-Atom\">?-</span> <span class=\"nf\">key</span><span class=\"p\">(</span><span class=\"nv\">Key</span><span class=\"p\">),</span> <span class=\"nf\">score</span><span class=\"p\">([</span><span class=\"s s-Atom\">b</span><span class=\"p\">,</span> <span class=\"s s-Atom\">b</span><span class=\"p\">,</span> <span class=\"s s-Atom\">a</span><span class=\"p\">,</span> <span class=\"s s-Atom\">a</span><span class=\"p\">,</span> <span class=\"s s-Atom\">a</span><span class=\"p\">,</span> <span class=\"s s-Atom\">b</span><span class=\"p\">,</span> <span class=\"s s-Atom\">b</span><span class=\"p\">,</span> <span class=\"s s-Atom\">a</span><span class=\"p\">,</span> <span class=\"s s-Atom\">a</span><span class=\"p\">,</span> <span class=\"s s-Atom\">a</span><span class=\"p\">],</span> <span class=\"nv\">Key</span><span class=\"p\">,</span> <span class=\"nv\">X</span><span class=\"p\">).</span>\n<span class=\"nv\">X</span> <span class=\"o\">=</span> <span class=\"mi\">6</span>\n<span class=\"nv\">X</span> <span class=\"o\">=</span> <span class=\"mi\">6</span>\n<span class=\"nv\">X</span> <span class=\"o\">=</span> <span class=\"mi\">6</span>\n<span class=\"nv\">X</span> <span class=\"o\">=</span> <span 
class=\"mi\">6</span>\n</code></pre></div>\n<p>Huh! I really like it when puzzles look like they're broken, but every \"alternate\" solution still gives the same puzzle answer.</p>\n<p>Total program length: 15 lines of code, compared to the original's 80 lines. <em>Suck it, Pablo.</em></p>\n<p>(Incidentally, you can get all of the answers at once by writing <code>findall(X, (key(Key), score($answer-array, Key, X)), L).</code>) </p>\n<p class=\"empty-line\" style=\"height:16px; margin:0px !important;\"></p>\n<h3>I still don't like puzzles for teaching</h3>\n<p>The actual examples I'm using in <a href=\"https://leanpub.com/logic/\" target=\"_blank\">the book</a> are \"analyzing a version control commit graph\" and \"planning a sequence of infrastructure changes\", which are somewhat more likely to occur at work than needing to solve a puzzle. You'll see them in the next release!</p>\n<div class=\"footnote\">\n<hr/>\n<ol>\n<li id=\"fn:path\">\n<p>I found it because he wrote <a href=\"https://morepablo.com/2025/04/gamer-games-for-lite-gamers.html\" target=\"_blank\">Gamer Games for Lite Gamers</a> as a response to my <a href=\"https://www.hillelwayne.com/post/vidja-games/\" target=\"_blank\">Gamer Games for Non-Gamers</a>. <a class=\"footnote-backref\" href=\"#fnref:path\" title=\"Jump back to footnote 1 in the text\">↩</a></p>\n</li>\n<li id=\"fn:superior\">\n<p>These are better versions of the core Prolog expressions <code>\\+ (A = B)</code> and <code>A is B + 1</code>, because they can <a href=\"https://eu.swi-prolog.org/pldoc/man?predicate=dif/2\" target=\"_blank\">defer unification</a>. <a class=\"footnote-backref\" href=\"#fnref:superior\" title=\"Jump back to footnote 2 in the text\">↩</a></p>\n</li>\n<li id=\"fn:arity\">\n<p>Prolog-descendants have a convention of writing the arity of the function after its name, so <code>score/3</code> means \"score has three parameters\". I think they do this because you can overload predicates with multiple different arities. 
Also Joe Armstrong used Prolog for prototyping, so Erlang and Elixir follow the same convention. <a class=\"footnote-backref\" href=\"#fnref:arity\" title=\"Jump back to footnote 3 in the text\">↩</a></p>\n</li>\n<li id=\"fn:explicit\">\n<p>It <em>still</em> gets the right answers without this type restriction, but I had no idea it did until I checked for myself. Probably better not to rely on this! <a class=\"footnote-backref\" href=\"#fnref:explicit\" title=\"Jump back to footnote 4 in the text\">↩</a></p>\n</li>\n<li id=\"fn:yall\">\n<p>We could make this even more compact by using a lambda function. First import module <code>yall</code>, then write <code>maplist([X]>>member(X, [a,b]), Key)</code>. But (1) it's not a shorter program because you replace the extra definition with an extra module import, and (2) <code>yall</code> is SWI-Prolog specific and not an ISO-standard Prolog module. Using <code>contains</code> is more portable. <a class=\"footnote-backref\" href=\"#fnref:yall\" title=\"Jump back to footnote 5 in the text\">↩</a></p>\n</li>\n</ol>\n</div>",
          "url": "https://buttondown.com/hillelwayne/archive/a48fce5b-8a05-4302-b620-9b26f057f145/",
          "published": "2025-04-08T18:34:50.000Z",
          "updated": "2025-04-08T18:34:50.000Z",
          "content": null,
          "image": null,
          "media": [],
          "authors": [],
          "categories": []
        },
        {
          "id": "https://buttondown.com/hillelwayne/archive/april-cools-gaming-games-for-non-gamers/",
          "title": "[April Cools] Gaming Games for Non-Gamers",
          "description": "<p>My <em>April Cools</em> is out! <a href=\"https://www.hillelwayne.com/post/vidja-games/\" target=\"_blank\">Gaming Games for Non-Gamers</a> is a 3,000 word essay on video games worth playing if you've never enjoyed a video game before. <a href=\"https://www.patreon.com/posts/blog-notes-gamer-125654321?utm_medium=clipboard_copy&utm_source=copyLink&utm_campaign=postshare_creator&utm_content=join_link\" target=\"_blank\">Patreon notes here</a>.</p>\n<p>(April Cools is a project where we write genuine content on non-normal topics. You can see all the other April Cools posted so far <a href=\"https://www.aprilcools.club/\" target=\"_blank\">here</a>. There's still time to submit your own!)</p>\n<a class=\"embedded-link\" href=\"https://www.aprilcools.club/\"> <div style=\"width: 100%; background: #fff; border: 1px #ced3d9 solid; border-radius: 5px; margin-top: 1em; overflow: auto; margin-bottom: 1em;\"> <div style=\"float: left; border-bottom: 1px #ced3d9 solid;\"> <img class=\"link-image\" src=\"https://www.aprilcools.club/aprilcoolsclub.png\"/> </div> <div style=\"float: left; color: #393f48; padding-left: 1em; padding-right: 1em;\"> <h4 class=\"link-title\" style=\"margin-bottom: 0em; line-height: 1.25em; margin-top: 1em; font-size: 14px;\">                April Cools' Club</h4> </div> </div></a>",
          "url": "https://buttondown.com/hillelwayne/archive/april-cools-gaming-games-for-non-gamers/",
          "published": "2025-04-01T16:04:59.000Z",
          "updated": "2025-04-01T16:04:59.000Z",
          "content": null,
          "image": null,
          "media": [],
          "authors": [],
          "categories": []
        },
        {
          "id": "https://buttondown.com/hillelwayne/archive/betteridges-law-of-software-engineering/",
          "title": "Betteridge's Law of Software Engineering Specialness",
          "description": "<h3>Logic for Programmers v0.8 now out!</h3>\n<p>The new release has minor changes: new formatting for notes and a better introduction to predicates. I would have rolled it all into v0.9 next month but I like the monthly cadence. <a href=\"https://leanpub.com/logic/\" target=\"_blank\">Get it here!</a></p>\n<h1>Betteridge's Law of Software Engineering Specialness</h1>\n<p>In <a href=\"https://agileotter.blogspot.com/2025/03/there-is-no-automatic-reset-in.html\" target=\"_blank\">There is No Automatic Reset in Engineering</a>, Tim Ottinger asks:</p>\n<blockquote>\n<p>Do the other people have to live with January 2013 for the rest of their lives? Or is it only engineering that has to deal with every dirty hack since the beginning of the organization?</p>\n</blockquote>\n<p><strong>Betteridge's Law of Headlines</strong> says that if a journalism headline ends with a question mark, the answer is probably \"no\". I propose a similar law relating to software engineering specialness:<sup id=\"fnref:ottinger\"><a class=\"footnote-ref\" href=\"#fn:ottinger\">1</a></sup></p>\n<blockquote>\n<p>If someone asks if some aspect of software development is truly unique to just software development, the answer is probably \"no\".</p>\n</blockquote>\n<p>Take the idea that \"in software, hacks are forever.\" My favorite example of this comes from a different profession. The <a href=\"https://en.wikipedia.org/wiki/Dewey_Decimal_Classification\" target=\"_blank\">Dewey Decimal System</a> hierarchically categorizes books by discipline. For example, <em><a href=\"https://www.librarything.com/work/10143437/t/Covered-Bridges-of-Pennsylvania\" target=\"_blank\">Covered Bridges of Pennsylvania</a></em> has Dewey number <code>624.37</code>. <code>6--</code> is the technology discipline, <code>62-</code> is engineering, <code>624</code> is civil engineering, and <code>624.3</code> is \"special types of bridges\". 
 I have no idea what the last <code>0.07</code> means, but you get the picture.</p>\n<p>Now if you look at the <a href=\"https://www.librarything.com/mds/6\" target=\"_blank\">6-- \"technology\" breakdown</a>, you'll see that there's no \"software\" subdiscipline. This is because Dewey preallocated the whole technology block in 1876. New topics were instead to be added to the <code>00-</code> \"general-knowledge\" catch-all. Eventually <code>005</code> was assigned to \"software development\", meaning <em>The C Programming Language</em> lives at <code>005.133</code>. </p>\n<p>Incidentally, another late addition to the general knowledge block is <code>001.9</code>: \"controversial knowledge\". </p>\n<p>And that's why my hometown library shelved the C++ books right next to <em>The Mothman Prophecies</em>.</p>\n<p>How's <em>that</em> for technical debt?</p>\n<p>If anything, fixing hacks in software is significantly <em>easier</em> than in other fields. This came up when I was <a href=\"https://www.hillelwayne.com/post/we-are-not-special/\" target=\"_blank\">interviewing classic engineers</a>. Kludges happened all the time, but \"refactoring\" them out is <em>expensive</em>. Need to house a machine that's just two inches taller than the room? Guess what, you're cutting a hole in the ceiling.</p>\n<p>(Even if we restrict the question to other departments in a <em>software company</em>, we can find kludges that are horrible to undo. I once worked for a company which landed an early contract by adding a bespoke support agreement for that one customer. 
That plagued them for years afterward.)</p>\n<p>That's not to say that there aren't things that are different about software vs other fields!<sup id=\"fnref:example\"><a class=\"footnote-ref\" href=\"#fn:example\">2</a></sup>  But I think that <em>most</em> of the time, when we say \"software development is the only profession that deals with XYZ\", it's only because we're ignorant of how those other professions work.</p>\n<hr/>\n<p>Short newsletter because I'm way behind on writing my <a href=\"https://www.aprilcools.club/\" target=\"_blank\">April Cools</a>. If you're interested in April Cools, you should try it out! I make it <em>way</em> harder on myself than it actually needs to be— everybody else who participates finds it pretty chill.</p>\n<div class=\"footnote\">\n<hr/>\n<ol>\n<li id=\"fn:ottinger\">\n<p>Ottinger caveats it with \"engineering, software or otherwise\", so I think he knows that other branches of <em>engineering</em>, at least, have kludges. <a class=\"footnote-backref\" href=\"#fnref:ottinger\" title=\"Jump back to footnote 1 in the text\">↩</a></p>\n</li>\n<li id=\"fn:example\">\n<p>The \"software is different\" idea that I'm most sympathetic to is that in software, the tools we use and the products we create are made from the same material. That's unusual at least in classic engineering. Then again, plenty of machinists have made their own lathes and mills! <a class=\"footnote-backref\" href=\"#fnref:example\" title=\"Jump back to footnote 2 in the text\">↩</a></p>\n</li>\n</ol>\n</div>",
          "url": "https://buttondown.com/hillelwayne/archive/betteridges-law-of-software-engineering/",
          "published": "2025-03-26T18:48:39.000Z",
          "updated": "2025-03-26T18:48:39.000Z",
          "content": null,
          "image": null,
          "media": [],
          "authors": [],
          "categories": []
        },
        {
          "id": "https://buttondown.com/hillelwayne/archive/verification-first-development/",
          "title": "Verification-First Development",
          "description": "<p>A while back I argued on the Blue Site<sup id=\"fnref:li\"><a class=\"footnote-ref\" href=\"#fn:li\">1</a></sup> that \"test-first development\" (TFD) was different than \"test-driven development\" (TDD). The former is \"write tests before you write code\", the latter is a paradigm, culture, and collection of norms that's based on TFD. More broadly, TFD is a special case of <strong>Verification-First Development</strong> and TDD is not.</p>\n<blockquote>\n<p>VFD: before writing code, put in place some means of verifying that the code is correct, or at least have an idea of what you'll do.</p>\n</blockquote>\n<p>\"Verifying\" could mean writing tests, or figuring out how to encode invariants in types, or <a href=\"https://blog.regehr.org/archives/1091\" target=\"_blank\">adding contracts</a>, or <a href=\"https://learntla.com/\" target=\"_blank\">making a formal model</a>, or writing a separate script that checks the output of the program. Just have <em>something</em> appropriate in place that you can run as you go building the code. Ideally, we'd have verification in place for every interesting property, but that's rarely possible in practice. </p>\n<p>Oftentimes we can't make the verification until the code is partially complete. In that case it still helps to figure out the verification we'll write later. The point is to have a <em>plan</em> and follow it promptly.</p>\n<p>I'm using \"code\" as a standin for anything we programmers make, not just software programs. When using constraint solvers, I try to find representative problems I know the answers to. When writing formal specifications, I figure out the system's properties before the design that satisfies those properties. There's probably equivalents in security and other topics, too.</p>\n<h3>The Benefits of VFD</h3>\n<ol>\n<li>Doing verification before coding makes it less likely we'll skip verification entirely. 
It's the professional equivalent of \"No TV until you do your homework.\"</li>\n<li>It's easier to make sure a verifier works properly if we start by running it on code we know doesn't pass it. Bebugging working code takes more discipline.</li>\n<li>We can run checks earlier in the development process. It's better to realize that our code is broken five minutes after we broke it rather than two hours after.</li>\n</ol>\n<p>That's it, those are the benefits of verification-first development. Those are also <em>big</em> benefits for relatively little investment. Specializations of VFD like test-first development can have more benefits, but also more drawbacks.</p>\n<h3>The drawbacks of VFD</h3>\n<ol>\n<li>It slows us down. I know lots of people say that \"no actually it makes you go faster in the long run,\" but that's the <em>long</em> run. Sometimes we do marathons, sometimes we sprint.</li>\n<li>Verification gets in the way of exploratory coding, where we don't know what exactly we want or how exactly to do something.</li>\n<li>Any specific form of verification exerts a pressure on our code to make it easier to verify with that method. For example, if we're mostly verifying via type invariants, we need to figure out how to express those things in our language's type system, which may not be suited for the specific invariants we need.<sup id=\"fnref:sphinx\"><a class=\"footnote-ref\" href=\"#fn:sphinx\">2</a></sup></li>\n</ol>\n<h2>Whether \"pressure\" is a real drawback is incredibly controversial</h2>\n<p>If I had to summarize what makes \"test-driven development\" different from VFD:<sup id=\"fnref:tdd\"><a class=\"footnote-ref\" href=\"#fn:tdd\">3</a></sup></p>\n<ol>\n<li>The form of verification should specifically be tests, and unit tests at that</li>\n<li>Testing pressure is invariably good. 
\"Making your code easier to unit test\" is the same as \"making your code better\".</li>\n</ol>\n<p>This is something all of the various \"drivens\"— TDD, Type Driven Development, Design by Contract— share in common, this idea that the purpose of the paradigm is to exert pressure. Lots of TDD experts claim that \"having a good test suite\" is only the secondary benefit of TDD and the real benefit is how it improves code quality.<sup id=\"fnref:docs\"><a class=\"footnote-ref\" href=\"#fn:docs\">4</a></sup></p>\n<p>Whether they're right or not is not something I want to argue: I've seen these approaches all improve my code structure, but also sometimes worsen it. Regardless, I consider pressure a drawback to VFD in general, though, for a somewhat idiosyncratic reason. If it <em>weren't</em> for pressure, VFD would be wholly independent of the code itself. It would <em>just</em> be about verification, and our decisions would exclusively be about how we want to verify. But the design pressure means that our means of verification affects the system we're checking. What if these conflict in some way?</p>\n<h3>VFD is a technique, not a paradigm</h3>\n<p>One of the main differences between \"techniques\" and \"paradigms\" is that paradigms don't play well with each other. If you tried to do both \"proper\" Test-Driven Development and \"proper\" Cleanroom, your head would explode. Whereas VFD being a \"technique\" means it works well with other techniques and even with many full paradigms.</p>\n<p>It also doesn't take a whole lot of practice to start using. It does take practice, both in thinking of verifications and in using the particular verification method involved, to <em>use well</em>, but we can use it poorly and still benefit.</p>\n<div class=\"footnote\">\n<hr/>\n<ol>\n<li id=\"fn:li\">\n<p>LinkedIn, what did you think I meant? 
<a class=\"footnote-backref\" href=\"#fnref:li\" title=\"Jump back to footnote 1 in the text\">↩</a></p>\n</li>\n<li id=\"fn:sphinx\">\n<p>This bit me in the butt when making my own <a href=\"https://www.sphinx-doc.org/en/master/\" target=\"_blank\">sphinx</a> extensions. The official guides do things in a highly dynamic way that Mypy can't statically check. I had to do things in a completely different way. Ended up being better though! <a class=\"footnote-backref\" href=\"#fnref:sphinx\" title=\"Jump back to footnote 2 in the text\">↩</a></p>\n</li>\n<li id=\"fn:tdd\">\n<p>Someone's going to yell at me that I completely missed the point of TDD, which is XYZ. Well guess what, someone else <em>already</em> yelled at me that only dumb idiot babies think XYZ is important in TDD. Put in whatever you want for XYZ. <a class=\"footnote-backref\" href=\"#fnref:tdd\" title=\"Jump back to footnote 3 in the text\">↩</a></p>\n</li>\n<li id=\"fn:docs\">\n<p>Another thing that weirdly all of the paradigms claim: that they lead to better documentation. I can see the argument, I just find it strange that <em>every single one</em> makes this claim! <a class=\"footnote-backref\" href=\"#fnref:docs\" title=\"Jump back to footnote 4 in the text\">↩</a></p>\n</li>\n</ol>\n</div>",
          "url": "https://buttondown.com/hillelwayne/archive/verification-first-development/",
          "published": "2025-03-18T16:22:20.000Z",
          "updated": "2025-03-18T16:22:20.000Z",
          "content": null,
          "image": null,
          "media": [],
          "authors": [],
          "categories": []
        },
        {
          "id": "https://buttondown.com/hillelwayne/archive/new-blog-post-a-perplexing-javascript-parsing/",
          "title": "New Blog Post: \"A Perplexing Javascript Parsing Puzzle\"",
          "description": "<p>I know I said we'd be back to normal newsletters this week and in fact had 80% of one already written. </p>\n<p>Then I unearthed something that was better left buried.</p>\n<p><a href=\"http://www.hillelwayne.com/post/javascript-puzzle/\" target=\"_blank\">Blog post here</a>, <a href=\"https://www.patreon.com/posts/blog-notes-124153641\" target=\"_blank\">Patreon notes here</a> (Mostly an explanation of how I found this horror in the first place). Next week I'll send what was supposed to be this week's piece.</p>\n<p>(PS: <a href=\"https://www.aprilcools.club/\" target=\"_blank\">April Cools</a> in three weeks!)</p>",
          "url": "https://buttondown.com/hillelwayne/archive/new-blog-post-a-perplexing-javascript-parsing/",
          "published": "2025-03-12T14:49:52.000Z",
          "updated": "2025-03-12T14:49:52.000Z",
          "content": null,
          "image": null,
          "media": [],
          "authors": [],
          "categories": []
        },
        {
          "id": "https://buttondown.com/hillelwayne/archive/five-kinds-of-nondeterminism/",
          "title": "Five Kinds of Nondeterminism",
          "description": "<p>No newsletter next week, I'm teaching a TLA+ workshop.</p>\n<p>Speaking of which: I spend a lot of time thinking about formal methods (and TLA+ specifically) because it's where the source of almost all my revenue. But I don't share most of the details because 90% of my readers don't use FM and never will. I think it's more interesting to talk about ideas <em>from</em> FM that would be useful to people outside that field. For example, the idea of \"property strength\" translates to the <a href=\"https://buttondown.com/hillelwayne/archive/some-tests-are-stronger-than-others/\" target=\"_blank\">idea that some tests are stronger than others</a>. </p>\n<p>Another possible export is how FM approaches nondeterminism. A <strong>nondeterministic</strong> algorithm is one that, from the same starting conditions, has multiple possible outputs. This is nondeterministic:</p>\n<div class=\"codehilite\"><pre><span></span><code># Pseudocode\n\ndef f() {\n    return rand()+1;\n}\n</code></pre></div>\n<p>When specifying systems, I may not <em>encounter</em> nondeterminism more often than in real systems, but I am definitely more aware of its presence. Modeling nondeterminism is a core part of formal specification. I mentally categorize nondeterminism into five buckets. Caveat, this is specifically about nondeterminism from the perspective of <em>system modeling</em>, not computer science as a whole. If I tried to include stuff on NFAs and amb operations this would be twice as long.<sup id=\"fnref:nondeterminism\"><a class=\"footnote-ref\" href=\"#fn:nondeterminism\">1</a></sup></p>\n<p style=\"height:16px; margin:0px !important;\"></p>\n<h2>1. True Randomness</h2>\n<p>Programs that literally make calls to a <code>random</code> function and then use the results. This the simplest type of nondeterminism and one of the most ubiquitous. </p>\n<p>Most of the time, <code>random</code> isn't <em>truly</em> nondeterministic. 
Most of the time computer randomness is actually <strong>pseudorandom</strong>, meaning we seed a deterministic algorithm that behaves \"randomly-enough\" for some use. You could \"lift\" a nondeterministic random function into a deterministic one by adding a fixed seed to the starting state.</p>\n<div class=\"codehilite\"><pre><span></span><code><span class=\"c1\"># Python</span>\n\n<span class=\"kn\">from</span> <span class=\"nn\">random</span> <span class=\"kn\">import</span> <span class=\"n\">random</span><span class=\"p\">,</span> <span class=\"n\">seed</span>\n<span class=\"k\">def</span> <span class=\"nf\">f</span><span class=\"p\">(</span><span class=\"n\">x</span><span class=\"p\">):</span>\n    <span class=\"n\">seed</span><span class=\"p\">(</span><span class=\"n\">x</span><span class=\"p\">)</span>\n    <span class=\"k\">return</span> <span class=\"n\">random</span><span class=\"p\">()</span>\n\n<span class=\"o\">>>></span> <span class=\"n\">f</span><span class=\"p\">(</span><span class=\"mi\">3</span><span class=\"p\">)</span>\n<span class=\"mf\">0.23796462709189137</span>\n<span class=\"o\">>>></span> <span class=\"n\">f</span><span class=\"p\">(</span><span class=\"mi\">3</span><span class=\"p\">)</span>\n<span class=\"mf\">0.23796462709189137</span>\n</code></pre></div>\n<p>Often we don't do this because the <em>point</em> of randomness is to provide nondeterminism! 
We deliberately <em>abstract out</em> the starting state of the seed from our program, because it's easier to think about it as locally nondeterministic.</p>\n<p>(There's also \"true\" randomness, like using <a href=\"https://www.intel.com/content/www/us/en/developer/articles/guide/intel-digital-random-number-generator-drng-software-implementation-guide.html#inpage-nav-3-2\" target=\"_blank\">thermal noise</a> as an entropy source, which I think is mainly used for cryptography and seeding PRNGs.)</p>\n<p>Most formal specification languages don't deal with randomness (though some deal with <a href=\"https://buttondown.com/hillelwayne/archive/i-formally-modeled-dreidel-for-no-good-reason/\" target=\"_blank\">probability more broadly</a>). Instead, we treat it as a nondeterministic choice:</p>\n<div class=\"codehilite\"><pre><span></span><code># software\nif rand > 0.001 then return a else crash\n\n# specification\neither return a or crash\n</code></pre></div>\n<p>This is because we're looking at worst-case scenarios, so it doesn't matter if <code>crash</code> happens 50% of the time or 0.0001% of the time, it's still possible.  </p>\n<h2>2. Concurrency</h2>\n<div class=\"codehilite\"><pre><span></span><code># Pseudocode\nglobal x = 1, y = 0;\n\ndef thread1() {\n   x++;\n   x++;\n   x++;\n}\n\ndef thread2() {\n    y := x;\n}\n</code></pre></div>\n<p>If <code>thread1()</code> and <code>thread2()</code> run sequentially, then (assuming the sequence is fixed) the final value of <code>y</code> is deterministic. If the two functions are started and run simultaneously, then depending on when <code>thread2</code> executes <code>y</code> can be 1, 2, 3, <em>or</em> 4. Both functions are locally sequential, but running them concurrently leads to global nondeterminism.</p>\n<p>Concurrency is arguably the most <em>dramatic</em> source of nondeterminism. 
<a href=\"https://buttondown.com/hillelwayne/archive/what-makes-concurrency-so-hard/\" target=\"_blank\">Small amounts of concurrency lead to huge explosions in the state space</a>. We have words for the specific kinds of nondeterminism caused by concurrency, like \"race condition\" and \"dirty write\". Often we think about it as a separate <em>topic</em> from nondeterminism. To some extent it \"overshadows\" the other kinds: I have a much easier time teaching students about concurrency in models than nondeterminism in models.</p>\n<p>Many formal specification languages have special syntax/machinery for the concurrent aspects of a system, and generic syntax for other kinds of nondeterminism. In P that's <a href=\"https://p-org.github.io/P/manual/expressions/#choose\" target=\"_blank\">choose</a>. Others don't special-case concurrency, instead representing it as nondeterministic choices by a global coordinator. This is more flexible but also more inconvenient, as you have to implement process-local sequencing code yourself. </p>\n<h2>3. User Input</h2>\n<div class=\"subscribe-form\"></div>\n<p>One of the most famous and influential programming books is <em>The C Programming Language</em> by Kernighan and Ritchie. The first example of a nondeterministic program appears on page 14:</p>\n<p><img alt=\"Picture of the book page. 
Code reproduced below.\" class=\"newsletter-image\" src=\"https://assets.buttondown.email/images/94e6ad15-8d09-48df-b885-191318bfd179.jpg?w=960&fit=max\"/></p>\n<p>For the newsletter readers who get text only emails,<sup id=\"fnref:text-only\"><a class=\"footnote-ref\" href=\"#fn:text-only\">2</a></sup> here's the program:</p>\n<div class=\"codehilite\"><pre><span></span><code><span class=\"cp\">#include</span><span class=\"w\"> </span><span class=\"cpf\"><stdio.h></span>\n<span class=\"cm\">/* copy input to output; 1st version */</span>\n<span class=\"n\">main</span><span class=\"p\">()</span>\n<span class=\"p\">{</span>\n<span class=\"w\">    </span><span class=\"kt\">int</span><span class=\"w\"> </span><span class=\"n\">c</span><span class=\"p\">;</span>\n<span class=\"w\">    </span><span class=\"n\">c</span><span class=\"w\"> </span><span class=\"o\">=</span><span class=\"w\"> </span><span class=\"n\">getchar</span><span class=\"p\">();</span>\n<span class=\"w\">    </span><span class=\"k\">while</span><span class=\"w\"> </span><span class=\"p\">(</span><span class=\"n\">c</span><span class=\"w\"> </span><span class=\"o\">!=</span><span class=\"w\"> </span><span class=\"n\">EOF</span><span class=\"p\">)</span><span class=\"w\"> </span><span class=\"p\">{</span>\n<span class=\"w\">        </span><span class=\"n\">putchar</span><span class=\"p\">(</span><span class=\"n\">c</span><span class=\"p\">);</span>\n<span class=\"w\">        </span><span class=\"n\">c</span><span class=\"w\"> </span><span class=\"o\">=</span><span class=\"w\"> </span><span class=\"n\">getchar</span><span class=\"p\">();</span>\n<span class=\"w\">    </span><span class=\"p\">}</span>\n<span class=\"p\">}</span>\n</code></pre></div>\n<p>Yup, that's nondeterministic. 
Because the user can enter any string, any call of <code>main()</code> could have any output, meaning the number of possible outcomes is infinite.</p>\n<p>Okay that seems a little cheap, and I think it's because we tend to think of determinism in terms of how the user <em>experiences</em> the program. Yes, <code>main()</code> has an infinite number of user inputs, but for each input the user will experience only one possible output. It starts to feel more nondeterministic when modeling a long-standing system that's <em>reacting</em> to user input, for example a server that runs a script whenever the user uploads a file. This can be modeled with nondeterminism and concurrency: We have one execution that's the system, and one nondeterministic execution that represents the effects of our user.</p>\n<p>(One intrusive thought I sometimes have: any \"yes/no\" dialogue actually has <em>three</em> outcomes: yes, no, or the user getting up and walking away without picking a choice, permanently stalling the execution.)</p>\n<h2>4. External forces</h2>\n<p>The more general version of \"user input\": anything where either 1) some part of the execution outcome depends on retrieving external information, or 2) the external world can change some state outside of your system. I call the distinction between internal and external components of the system <a href=\"https://www.hillelwayne.com/post/world-vs-machine/\" target=\"_blank\">the world and the machine</a>. Simple examples: code that at some point reads an external temperature sensor. Unrelated code running on a system which quits programs if it gets too hot. API requests to a third party vendor. Code processing files but users can delete files before the script gets to them.</p>\n<p>Like with PRNGs, some of these cases don't <em>have</em> to be nondeterministic; we can argue that \"the temperature\" should be a virtual input into the function. 
Like with PRNGs, we treat it as nondeterministic because it's useful to think in that way. Also, what if the temperature changes between starting a function and reading it?</p>\n<p>External forces are also a source of nondeterminism as <em>uncertainty</em>. Measurements in the real world often come with errors, so repeating a measurement twice can give two different answers. Sometimes operations fail for no discernable reason, or for a non-programmatic reason (like something physically blocks the sensor).</p>\n<p>All of these situations can be modeled in the same way as user input: a concurrent execution making nondeterministic choices.</p>\n<h2>5. Abstraction</h2>\n<p>This is where nondeterminism in system models and in \"real software\" differ the most. I said earlier that pseudorandomness is <em>arguably</em> deterministic, but we abstract it into nondeterminism. More generally, <strong>nondeterminism hides implementation details of deterministic processes</strong>.</p>\n<p>In one consulting project, we had a machine that received a message, parsed a lot of data from the message, went into a complicated workflow, and then entered one of three states. The final state was totally deterministic on the content of the message, but the actual process of determining that final state took tons and tons of code. None of that mattered at the scope we were modeling, so we abstracted it all away: \"on receiving message, nondeterministically enter state A, B, or C.\"</p>\n<p>Doing this makes the system easier to model. It also makes the model more sensitive to possible errors. What if the workflow is bugged and sends us to the wrong state? That's already covered by the nondeterministic choice! 
Nondeterministic abstraction gives us the potential to pick the worst-case scenario for our system, so we can prove it's robust even under those conditions.</p>\n<p>I know I beat the \"nondeterminism as abstraction\" drum a whole lot but that's because it's the insight from formal methods I personally value the most, that nondeterminism is a powerful tool to <em>simplify reasoning about things</em>. You can see the same approach in how I approach modeling users and external forces: complex realities black-boxed and simplified into nondeterministic forces on the system.</p>\n<hr/>\n<p>Anyway, I hope this collection of ideas I got from formal methods is useful to my broader readership. Lemme know if it somehow helps you out!</p>\n<div class=\"footnote\">\n<hr/>\n<ol>\n<li id=\"fn:nondeterminism\">\n<p>I realized after writing this that I already wrote an essay about nondeterminism in formal specification <a href=\"https://buttondown.com/hillelwayne/archive/nondeterminism-in-formal-specification/\" target=\"_blank\">just under a year ago</a>. I hope this one covers enough new ground to be interesting! <a class=\"footnote-backref\" href=\"#fnref:nondeterminism\" title=\"Jump back to footnote 1 in the text\">↩</a></p>\n</li>\n<li id=\"fn:text-only\">\n<p>There is a surprising number of you. <a class=\"footnote-backref\" href=\"#fnref:text-only\" title=\"Jump back to footnote 2 in the text\">↩</a></p>\n</li>\n</ol>\n</div>",
          "url": "https://buttondown.com/hillelwayne/archive/five-kinds-of-nondeterminism/",
          "published": "2025-02-19T19:37:57.000Z",
          "updated": "2025-02-19T19:37:57.000Z",
          "content": null,
          "image": null,
          "media": [],
          "authors": [],
          "categories": []
        },
        {
          "id": "https://buttondown.com/hillelwayne/archive/are-efficiency-and-horizontal-scalability-at-odds/",
          "title": "Are Efficiency and Horizontal Scalability at odds?",
          "description": "<p>Sorry for missing the newsletter last week! I started writing on Monday as normal, and by Wednesday the piece (about the <a href=\"https://en.wikipedia.org/wiki/Hierarchy_of_hazard_controls\" target=\"_blank\">hierarchy of controls</a> ) was 2000 words and not <em>close</em> to done. So now it'll be a blog post sometime later this month.</p>\n<p>I also just released a new version of <a href=\"https://leanpub.com/logic/\" target=\"_blank\">Logic for Programmers</a>! 0.7 adds a bunch of new content (type invariants, modeling access policies, rewrites of the first chapters) but more importantly has new fonts that are more legible than the old ones. <a href=\"https://leanpub.com/logic/\" target=\"_blank\">Go check it out!</a></p>\n<p>For this week's newsletter I want to brainstorm an idea I've been noodling over for a while. Say we have a computational task, like running a simulation or searching a very large graph, and it's taking too long to complete on a computer. There's generally three things that we can do to make it faster:</p>\n<ol>\n<li>Buy a faster computer (\"vertical scaling\")</li>\n<li>Modify the software to use the computer's resources better (\"efficiency\")</li>\n<li>Modify the software to use multiple computers (\"horizontal scaling\")</li>\n</ol>\n<p>(Splitting single-threaded software across multiple threads/processes is sort of a blend of (2) and (3).)</p>\n<p>The big benefit of (1) is that we (usually) don't have to make any changes to the software to get a speedup. The downside is that for the past couple of decades computers haven't <em>gotten</em> much faster, except in ways that require recoding (like GPUs and multicore). This means we rely on (2) and (3), and we can do both to a point. I've noticed, though, that horizontal scaling seems to conflict with efficiency. Software optimized to scale well tends to be worse or the <code>N=1</code> case than software optimized to, um, be optimized. 
</p>\n<p>Are there reasons to <em>expect</em> this? It seems reasonable that design goals of software are generally in conflict, purely because exclusively optimizing for one property means making decisions that impede other properties. But is there something in the nature of \"efficiency\" and \"horizontal scalability\" that makes them especially disjoint?</p>\n<p>This isn't me trying to explain a fully coherent idea, more me trying to figure this all out for myself. Also I'm probably getting some hardware stuff wrong.</p>\n<h3>Amdahl's Law</h3>\n<p>According to <a href=\"https://en.wikipedia.org/wiki/Amdahl%27s_law\" target=\"_blank\">Amdahl's Law</a>, the maximum speedup by parallelization is constrained by the proportion of the work that can be parallelized. If 80% of algorithm X is parallelizable, the maximum speedup from horizontal scaling is 5x. If algorithm Y is 25% parallelizable, the maximum speedup is only 1.3x. </p>\n<p>If you need horizontal scalability, you want to use algorithm X, <em>even if Y is naturally 3x faster</em>. But if Y was 4x faster, you'd prefer it to X. Maximal scalability means finding the optimal balance between baseline speed and parallelizability. Maximal efficiency means just optimizing baseline speed. </p>\n<h3>Coordination Overhead</h3>\n<p>Distributed algorithms require more coordination. To add a list of numbers in parallel via <a href=\"https://en.wikipedia.org/wiki/Fork%E2%80%93join_model\" target=\"_blank\">fork-join</a>, we'd do something like this:</p>\n<ol>\n<li>Split the list into N sublists</li>\n<li>Fork a new thread/process for each sublist</li>\n<li>Wait for each thread/process to finish</li>\n<li>Add the sums together.</li>\n</ol>\n<p>(1), (2), and (3) all add overhead to the algorithm. At the very least, it's extra lines of code to execute, but it can also mean inter-process communication or network hops. 
Distribution also means you have fewer natural correctness guarantees, so you need more administrative overhead to avoid race conditions. </p>\n<p><strong>Real world example:</strong> Historically CPython has had a \"global interpreter lock\" (GIL). In multithreaded code, only one thread could execute Python code at a time (others could execute C code). The <a href=\"https://docs.python.org/3/howto/free-threading-python.html#single-threaded-performance\" target=\"_blank\">newest version</a> supports disabling the GIL, which comes at a 40% overhead for single-threaded programs. Supposedly the difference is because the <a href=\"https://docs.python.org/3/whatsnew/3.11.html#whatsnew311-pep659\" target=\"_blank\">specializing adaptive interpreter</a> optimization isn't thread-safe yet. The Python team is hoping to get it down to \"only\" 10%. </p>\n<p style=\"height:16px; margin:0px !important;\"></p>\n<h3>Scaling loses shared resources</h3>\n<p>I'd say that intra-machine scaling (multiple threads/processes) feels qualitatively <em>different</em> than inter-machine scaling. Part of that is that intra-machine scaling is \"capped\" while inter-machine is not. But there's also a difference in what assumptions you can make about shared resources. Starting from the baseline of a single-threaded program:</p>\n<ol>\n<li>Threads have a much harder time sharing CPU caches (you have to manually mess with affinities)</li>\n<li>Processes have a much harder time sharing RAM (I think you have to use <a href=\"https://en.wikipedia.org/wiki/Memory-mapped_file\" target=\"_blank\">mmap</a>?)</li>\n<li>Machines can't share cache, RAM, or disk, period.</li>\n</ol>\n<p>It's a lot easier to solve a problem when the whole thing fits in RAM. But if you split a 50 GB problem across three machines, it doesn't fit in RAM by default, even if the machines have 64 GB each. 
Scaling also means that separate machines can't reuse resources like database connections.</p>\n<h3>Efficiency comes from limits</h3>\n<p>I think the two previous points tie together in the idea that maximal efficiency comes from being able to make assumptions about the system. If we know the <em>exact</em> sequence of computations, we can aim to minimize cache misses. If we don't have to worry about thread-safety, <a href=\"https://www.playingwithpointers.com/blog/refcounting-harder-than-it-sounds.html\" target=\"_blank\">tracking references is dramatically simpler</a>. If we have all of the data in a single database, our query planner has more room to work with. At various tiers of scaling these assumptions are no longer guaranteed and we lose the corresponding optimizations.</p>\n<p>Sometimes these assumptions are implicit and crop up in odd places. Like if you're working at a scale where you need multiple synced databases, you might want to use UUIDs instead of numbers for keys. But then you lose the assumption \"recently inserted rows are close together in the index\", which I've read <a href=\"https://www.cybertec-postgresql.com/en/unexpected-downsides-of-uuid-keys-in-postgresql/\" target=\"_blank\">can lead to significant slowdowns</a>. </p>\n<p>This suggests that if you can find a limit somewhere else, you can get both high horizontal scaling and high efficiency. <del>Supposedly the <a href=\"https://tigerbeetle.com/\" target=\"_blank\">TigerBeetle database</a> has both, but that could be because they limit all records to <a href=\"https://docs.tigerbeetle.com/coding/\" target=\"_blank\">accounts and transfers</a>. 
This means every record fits in <a href=\"https://tigerbeetle.com/blog/2024-07-23-rediscovering-transaction-processing-from-history-and-first-principles/#transaction-processing-from-first-principles\" target=\"_blank\">exactly 128 bytes</a>.</del> [A TigerBeetle engineer reached out to tell me that they do <em>not</em> horizontally scale compute, they distribute across multiple nodes for redundancy. <a href=\"https://lobste.rs/s/5akiq3/are_efficiency_horizontal_scalability#c_ve8ud5\" target=\"_blank\">\"You can't make it faster by adding more machines.\"</a>]</p>\n<p>Does this mean that \"assumptions\" could be both \"assumptions about the computing environment\" and \"assumptions about the problem\"? In the famous essay <a href=\"http://www.frankmcsherry.org/graph/scalability/cost/2015/01/15/COST.html\" target=\"_blank\">Scalability! But at what COST</a>, Frank McSherry shows that his single-threaded laptop could outperform 128-node \"big data systems\" on PageRank and graph connectivity (via label propagation). Afterwards, he discusses how a different algorithm solves graph connectivity even faster: </p>\n<blockquote>\n<p>[Union find] is more line of code than label propagation, but it is 10x faster and 100x less embarassing. … The union-find algorithm is fundamentally incompatible with the graph computation approaches Giraph, GraphLab, and GraphX put forward (the so-called “think like a vertex” model).</p>\n</blockquote>\n<p>The interesting thing to me is that his alternate makes more \"assumptions\" than what he's comparing to. He can \"assume\" a fixed goal and optimize the code for that goal. The \"big data systems\" are trying to be general purpose compute platforms and have to pick a model that supports the widest range of possible problems. 
</p>\n<p>A few years back I wrote <a href=\"https://www.hillelwayne.com/post/cleverness/\" target=\"_blank\">clever vs insightful code</a>. I think what I'm trying to say here is that efficiency comes from having insight into your problem and environment.</p>\n<p>(Last thought to shove in here: to exploit assumptions, you need <em>control</em>. Carefully arranging your data to fit in L1 doesn't matter if your programming language doesn't let you control where things are stored!)</p>\n<h3>Is there a cultural aspect?</h3>\n<p>Maybe there's also a cultural element to this conflict. What if the engineers interested in \"efficiency\" are different from the engineers interested in \"horizontal scaling\"?</p>\n<p>At my first job the data scientists set up a <a href=\"https://en.wikipedia.org/wiki/Apache_Hadoop\" target=\"_blank\">Hadoop</a> cluster for their relatively small dataset, only a few dozen gigabytes or so. One of the senior software engineers saw this and said \"big data is stupid.\" To prove it, he took one of their example queries, wrote a script in Go to compute the same thing, and optimized it to run faster on his machine.</p>\n<p>At the time I was like \"yeah, you're right, big data IS stupid!\" But I think now that we both missed something obvious: with the \"scalable\" solution, the data scientists <em>didn't</em> have to write an optimized script for every single query. Optimizing code is hard, adding more machines is easy! </p>\n<p>The highest tier of horizontal scaling is usually something large businesses want, and large businesses like problems that can be solved purely with money. Maximizing efficiency requires a lot of knowledge-intensive human labour, so is less appealing as an investment. Then again, I've seen a lot of work on making the scalable systems more efficient, such as evenly balancing heterogeneous workloads. Maybe in the largest systems intra-machine efficiency is just too small-scale a problem. 
</p>\n<h3>I'm not sure where this fits in but scaling a volume of tasks conflicts less than scaling individual tasks</h3>\n<p>If you have 1,000 machines and need to crunch one big graph, you probably want the most scalable algorithm. If you instead have 50,000 small graphs, you probably want the most efficient algorithm, which you then run on all 1,000 machines. When we call a problem <a href=\"https://en.wikipedia.org/wiki/Embarrassingly_parallel\" target=\"_blank\">embarrassingly parallel</a>, we usually mean it's easy to horizontally scale. But it's also one that's easy to make more efficient, because local optimizations don't affect the scaling! </p>\n<hr/>\n<p>Okay that's enough brainstorming for one week.</p>\n<h3>Blog Rec</h3>\n<p>Whenever I think about optimization as a skill, the first article that comes to mind is <a href=\"https://matklad.github.io/\" target=\"_blank\">matklad's</a> <a href=\"https://matklad.github.io/2023/11/15/push-ifs-up-and-fors-down.html\" target=\"_blank\">Push Ifs Up And Fors Down</a>. I'd never have considered on my own that inlining loops into functions could be such a huge performance win. The blog has a lot of other posts on the nuts-and-bolts of systems languages, optimization, and concurrency.</p>",
          "url": "https://buttondown.com/hillelwayne/archive/are-efficiency-and-horizontal-scalability-at-odds/",
          "published": "2025-02-12T18:26:20.000Z",
          "updated": "2025-02-12T18:26:20.000Z",
          "content": null,
          "image": null,
          "media": [],
          "authors": [],
          "categories": []
        },
        {
          "id": "https://buttondown.com/hillelwayne/archive/what-hard-thing-does-your-tech-make-easy/",
          "title": "What hard thing does your tech make easy?",
          "description": "<p>I occasionally receive emails asking me to look at the writer's new language/library/tool. Sometimes it's in an area I know well, like formal methods. Other times, I'm a complete stranger to the field. Regardless, I'm generally happy to check it out.</p>\n<p>When starting out, this is the biggest question I'm looking to answer:</p>\n<blockquote>\n<p>What does this technology make easy that's normally hard?</p>\n</blockquote>\n<p>What justifies me learning and migrating to a <em>new</em> thing as opposed to fighting through my problems with the tools I already know? The new thing has to have some sort of value proposition, which could be something like \"better performance\" or \"more secure\". The most universal value and the most direct to show is \"takes less time and mental effort to do something\". I can't accurately judge two benchmarks, but I can see two demos or code samples and compare which one feels easier to me.</p>\n<h2>Examples</h2>\n<h3>Functional programming</h3>\n<p>What drew me originally to functional programming was higher order functions. </p>\n<div class=\"codehilite\"><pre><span></span><code># Without HOFs\n\nout = []\nfor x in input {\n  if test(x) {\n    out.append(x)\n }\n}\n\n# With HOFs\n\nfilter(test, input)\n</code></pre></div>\n<p style=\"height:16px; margin:0px !important;\"></p>\n<p>We can also compare the easiness of various tasks between examples within the same paradigm. If I know FP via Clojure, what could be appealing about Haskell or F#? For one, null safety is a lot easier when I've got option types.</p>\n<h3>Array Programming</h3>\n<p>Array programming languages like APL or J make certain classes of computation easier. For example, finding all of the indices where two arrays <del>differ</del> match. 
Here it is in Python:</p>\n<div class=\"codehilite\"><pre><span></span><code><span class=\"n\">x</span> <span class=\"o\">=</span> <span class=\"p\">[</span><span class=\"mi\">1</span><span class=\"p\">,</span> <span class=\"mi\">4</span><span class=\"p\">,</span> <span class=\"mi\">2</span><span class=\"p\">,</span> <span class=\"mi\">3</span><span class=\"p\">,</span> <span class=\"mi\">4</span><span class=\"p\">,</span> <span class=\"mi\">1</span><span class=\"p\">,</span> <span class=\"mi\">0</span><span class=\"p\">,</span> <span class=\"mi\">0</span><span class=\"p\">,</span> <span class=\"mi\">0</span><span class=\"p\">,</span> <span class=\"mi\">4</span><span class=\"p\">]</span>\n<span class=\"n\">y</span> <span class=\"o\">=</span> <span class=\"p\">[</span><span class=\"mi\">2</span><span class=\"p\">,</span> <span class=\"mi\">3</span><span class=\"p\">,</span> <span class=\"mi\">1</span><span class=\"p\">,</span> <span class=\"mi\">1</span><span class=\"p\">,</span> <span class=\"mi\">2</span><span class=\"p\">,</span> <span class=\"mi\">3</span><span class=\"p\">,</span> <span class=\"mi\">2</span><span class=\"p\">,</span> <span class=\"mi\">0</span><span class=\"p\">,</span> <span class=\"mi\">2</span><span class=\"p\">,</span> <span class=\"mi\">4</span><span class=\"p\">]</span>\n\n<span class=\"o\">>>></span> <span class=\"p\">[</span><span class=\"n\">i</span> <span class=\"k\">for</span> <span class=\"n\">i</span><span class=\"p\">,</span> <span class=\"p\">(</span><span class=\"n\">a</span><span class=\"p\">,</span> <span class=\"n\">b</span><span class=\"p\">)</span> <span class=\"ow\">in</span> <span class=\"nb\">enumerate</span><span class=\"p\">(</span><span class=\"nb\">zip</span><span class=\"p\">(</span><span class=\"n\">x</span><span class=\"p\">,</span> <span class=\"n\">y</span><span class=\"p\">))</span> <span class=\"k\">if</span> <span class=\"n\">a</span> <span class=\"o\">==</span> <span class=\"n\">b</span><span 
class=\"p\">]</span>\n<span class=\"p\">[</span><span class=\"mi\">7</span><span class=\"p\">,</span> <span class=\"mi\">9</span><span class=\"p\">]</span>\n</code></pre></div>\n<p>And here it is in J:</p>\n<div class=\"codehilite\"><pre><span></span><code><span class=\"w\">  </span><span class=\"nv\">x</span><span class=\"w\"> </span><span class=\"o\">=:</span><span class=\"w\"> </span><span class=\"mi\">1</span><span class=\"w\"> </span><span class=\"mi\">4</span><span class=\"w\"> </span><span class=\"mi\">2</span><span class=\"w\"> </span><span class=\"mi\">3</span><span class=\"w\"> </span><span class=\"mi\">4</span><span class=\"w\"> </span><span class=\"mi\">1</span><span class=\"w\"> </span><span class=\"mi\">0</span><span class=\"w\"> </span><span class=\"mi\">0</span><span class=\"w\"> </span><span class=\"mi\">0</span><span class=\"w\"> </span><span class=\"mi\">4</span>\n<span class=\"w\">  </span><span class=\"nv\">y</span><span class=\"w\"> </span><span class=\"o\">=:</span><span class=\"w\"> </span><span class=\"mi\">2</span><span class=\"w\"> </span><span class=\"mi\">3</span><span class=\"w\"> </span><span class=\"mi\">1</span><span class=\"w\"> </span><span class=\"mi\">1</span><span class=\"w\"> </span><span class=\"mi\">2</span><span class=\"w\"> </span><span class=\"mi\">3</span><span class=\"w\"> </span><span class=\"mi\">2</span><span class=\"w\"> </span><span class=\"mi\">0</span><span class=\"w\"> </span><span class=\"mi\">2</span><span class=\"w\"> </span><span class=\"mi\">4</span>\n\n<span class=\"w\">  </span><span class=\"nv\">I</span><span class=\"o\">.</span><span class=\"w\"> </span><span class=\"nv\">x</span><span class=\"w\"> </span><span class=\"o\">=</span><span class=\"w\"> </span><span class=\"nv\">y</span>\n<span class=\"mi\">7</span><span class=\"w\"> </span><span class=\"mi\">9</span>\n</code></pre></div>\n<p>Not every tool is meant for every programmer, because you might not have any of the problems a tool makes easier. 
What comes up more often for you: filtering a list or finding all the indices where two lists match? Statistically speaking, functional programming is more useful to you than array programming.</p>\n<p>But <em>I</em> have this problem enough to justify learning array programming.</p>\n<h3>LLMs</h3>\n<p>I think a lot of the appeal of LLMs is they make a lot of specialist tasks easy for nonspecialists. One thing I recently did was convert some rst <a href=\"https://docutils.sourceforge.io/docs/ref/rst/directives.html#list-table\" target=\"_blank\">list tables</a> to <a href=\"https://docutils.sourceforge.io/docs/ref/rst/directives.html#csv-table-1\" target=\"_blank\">csv tables</a>. Normally I'd have to write some tricky parsing and serialization code to automatically convert between the two. With LLMs, it's just</p>\n<blockquote>\n<p>Convert the following rst list-table into a csv-table: [table]</p>\n</blockquote>\n<p>\"Easy\" can trump \"correct\" as a value. The LLM might get some translations wrong, but it's so convenient I'd rather manually review all the translations for errors than write a specialized script that is correct 100% of the time.</p>\n<h2>Let's not take this too far</h2>\n<p>A college friend once claimed that he cracked the secret of human behavior: humans do whatever makes them happiest. \"What about the martyr who dies for their beliefs?\" \"Well, in their last second of life they get REALLY happy.\"</p>\n<p>We can do the same here, fitting every value proposition into the frame of \"easy\". CUDA makes it easier to do matrix multiplication. Rust makes it easier to write low-level code without memory bugs. TLA+ makes it easier to find errors in your design. Monads make it easier to sequence computations in a lazy environment. 
Making everything about \"easy\" obscures other reasons for adopting new things.</p>\n<h3>That whole \"simple vs easy\" thing</h3>\n<p>Sometimes people think that \"simple\" is better than \"easy\", because \"simple\" is objective and \"easy\" is subjective. This comes from the famous talk <a href=\"https://www.infoq.com/presentations/Simple-Made-Easy/\" target=\"_blank\">Simple Made Easy</a>. I'm not sure I agree that simple is better <em>or</em> more objective: the speaker claims that polymorphism and typeclasses are \"simpler\" than conditionals, and I doubt everybody would agree with that.</p>\n<p>The problem is that \"simple\" is used to mean both \"not complicated\" <em>and</em> \"not complex\". And everybody agrees that \"complicated\" and \"complex\" are different, even if they can't agree <em>what</em> the difference is. This idea should probably be expanded into its own newsletter.</p>\n<p>It's also a lot harder to pitch a technology on being \"simpler\". Simplicity by itself doesn't make a tool better equipped to solve problems. Simplicity can unlock other benefits, like compositionality or <a href=\"https://buttondown.com/hillelwayne/archive/the-capability-tractability-tradeoff/\" target=\"_blank\">tractability</a>, that provide the actual value. And often that value is in the form of \"makes some tasks easier\". </p>",
          "url": "https://buttondown.com/hillelwayne/archive/what-hard-thing-does-your-tech-make-easy/",
          "published": "2025-01-29T18:09:47.000Z",
          "updated": "2025-01-29T18:09:47.000Z",
          "content": null,
          "image": null,
          "media": [],
          "authors": [],
          "categories": []
        },
        {
          "id": "https://buttondown.com/hillelwayne/archive/the-jugglers-curse/",
          "title": "The Juggler's Curse",
          "description": "<p>I'm making a more focused effort to juggle this year. Mostly <a href=\"https://youtu.be/PPhG_90VH5k?si=AxOO65PcX4ZwnxPQ&t=49\" target=\"_blank\">boxes</a>, but also classic balls too.<sup id=\"fnref:boxes\"><a class=\"footnote-ref\" href=\"#fn:boxes\">1</a></sup> I've gotten to the point where I can almost consistently do a five-ball cascade, which I <em>thought</em> was the cutoff to being a \"good juggler\". \"Thought\" because I now know a \"good juggler\" is one who can do the five-ball cascade with <em>outside throws</em>. </p>\n<p>I know this because I can't do the outside five-ball cascade... yet. But it's something I can see myself eventually mastering, unlike the slightly more difficult trick of the five-ball mess, which is impossible for mere mortals like me. </p>\n<p><em>In theory</em> there is a spectrum of trick difficulties and skill levels. I could place myself on the axis like this:</p>\n<p><img alt=\"A crudely-drawn scale with 10 even ticks, I'm between 5 and 6\" class=\"newsletter-image\" src=\"https://assets.buttondown.email/images/8ee51aa1-5dd4-48b8-8110-2cdf9a273612.png?w=960&fit=max\"/></p>\n<p>In practice, there are three tiers:</p>\n<ol>\n<li>Toddlers</li>\n<li>Good jugglers who practice hard</li>\n<li>Genetic freaks and actual wizards</li>\n</ol>\n<p>And the graph always, <em>always</em> looks like this:</p>\n<p><img alt=\"The same graph, with the top compressed into \"wizards\" and bottom into \"toddlers\". I'm in toddlers.\" class=\"newsletter-image\" src=\"https://assets.buttondown.email/images/04c76cec-671e-4560-b64e-498b7652359e.png?w=960&fit=max\"/></p>\n<p>This is the jugglers curse, and it's a three-parter:</p>\n<ol>\n<li>The threshold between you and \"good\" is the next trick you cannot do.</li>\n<li>Everything below that level is trivial. 
Once you've gotten a trick down, you can never go back to not knowing it, to appreciating how difficult it was to learn in the first place.<sup id=\"fnref:expert-blindness\"><a class=\"footnote-ref\" href=\"#fn:expert-blindness\">2</a></sup></li>\n<li>Everything above that level is just \"impossible\". You don't have the knowledge needed to recognize the different tiers.<sup id=\"fnref:dk\"><a class=\"footnote-ref\" href=\"#fn:dk\">3</a></sup></li>\n</ol>\n<p>So as you get better, the stuff that was impossible becomes differentiable, and you can see that some of it <em>is</em> possible. And everything you learned becomes trivial. So you're never a good juggler until you learn \"just one more hard trick\".</p>\n<p>The more you know, the more you know you don't know and the less you know you know.</p>\n<h3>This is supposed to be a software newsletter</h3>\n<blockquote>\n<p>A monad is a monoid in the category of endofunctors, what's the problem? <a href=\"https://james-iry.blogspot.com/2009/05/brief-incomplete-and-mostly-wrong.html\" target=\"_blank\">(src)</a></p>\n</blockquote>\n<p>I think this applies to any difficult topic? Most fields don't have the same stark <a href=\"https://en.wikipedia.org/wiki/Spectral_line\" target=\"_blank\">spectral lines</a> as juggling, but there's still tiers of difficulty to techniques, which get compressed the further in either direction they are from your current level.</p>\n<p>Like, I'm not good at formal methods. I've written two books on it but I've never mastered a dependently-typed language or a theorem prover. Those are equally hard. And I'm not good at modeling concurrent systems because I don't understand the formal definition of bisimulation and haven't implemented a Raft. 
Those are also equally hard, in fact exactly as hard as mastering a theorem prover.</p>\n<p>At the same time, the skills I've already developed are easy: properly using refinement is <em>exactly as easy</em> as writing <a href=\"https://buttondown.com/hillelwayne/archive/what-are-the-rosettas-of-formal-specification/\" target=\"_blank\">a wrapped counter</a>. Then I get surprised when I try to explain strong fairness to someone and they just don't get how □◇(ENABLED〈A〉ᵥ) is <em>obviously</em> different from ◇□(ENABLED 〈A〉ᵥ).</p>\n<p>Juggler's curse!</p>\n<p>Now I don't know if this is actually how everybody experiences expertise or if it's just my particular personality; I was a juggler long before I was a software developer. Then again, I'd argue that lots of people talk about one consequence of the juggler's curse: imposter syndrome. If you constantly think what you know is \"trivial\" and what you don't know is \"impossible\", then yeah, you'd start feeling like an imposter at work real quick.</p>\n<p>I wonder if part of the cause is that a lot of skills you have to learn are invisible. One of my favorite blog posts ever is <a href=\"https://www.benkuhn.net/blub/\" target=\"_blank\">In Defense of Blub Studies</a>, which argues that software expertise comes through understanding \"boring\" topics like \"what all of the error messages mean\" and \"how to use a debugger well\".  Blub is a critical part of expertise and takes a lot of hard work to learn, but it <em>feels</em> like trivia. So looking back on a skill I mastered, I might think it was \"easy\" because I'm not including all of the blub that I had to learn, too.</p>\n<p>The takeaway, of course, is that the outside five-ball cascade <em>is</em> objectively the cutoff between good jugglers and toddlers.</p>\n<div class=\"footnote\">\n<hr/>\n<ol>\n<li id=\"fn:boxes\">\n<p>Rant time: I <em>love</em> cigar box juggling. It's fun, it's creative, it's totally unlike any other kind of juggling. 
And it's so niche I straight up cannot find anybody in Chicago to practice with. I once went to a juggling convention and was the only person with a cigar box set there. <a class=\"footnote-backref\" href=\"#fnref:boxes\" title=\"Jump back to footnote 1 in the text\">↩</a></p>\n</li>\n<li id=\"fn:expert-blindness\">\n<p>This particular part of the juggler's curse is also called <a href=\"https://en.wikipedia.org/wiki/Curse_of_knowledge\" target=\"_blank\">the curse of knowledge</a> or \"expert blindness\". <a class=\"footnote-backref\" href=\"#fnref:expert-blindness\" title=\"Jump back to footnote 2 in the text\">↩</a></p>\n</li>\n<li id=\"fn:dk\">\n<p>This isn't Dunning-Kruger, because DK says that people think they are <em>better</em> than they actually are, and also <a href=\"https://www.mcgill.ca/oss/article/critical-thinking/dunning-kruger-effect-probably-not-real\" target=\"_blank\">may not actually be real</a>. <a class=\"footnote-backref\" href=\"#fnref:dk\" title=\"Jump back to footnote 3 in the text\">↩</a></p>\n</li>\n</ol>\n</div>",
          "url": "https://buttondown.com/hillelwayne/archive/the-jugglers-curse/",
          "published": "2025-01-22T18:50:40.000Z",
          "updated": "2025-01-22T18:50:40.000Z",
          "content": null,
          "image": null,
          "media": [],
          "authors": [],
          "categories": []
        },
        {
          "id": "https://buttondown.com/hillelwayne/archive/what-are-the-rosettas-of-formal-specification/",
          "title": "What are the Rosettas of formal specification?",
          "description": "<p>First of all, I just released version 0.6 of <em>Logic for Programmers</em>! You can get it <a href=\"https://leanpub.com/logic/\" target=\"_blank\">here</a>. Release notes in the footnote.<sup id=\"fnref:release-notes\"><a class=\"footnote-ref\" href=\"#fn:release-notes\">1</a></sup></p>\n<p>I've been thinking about my next project after the book's done. One idea is to do a survey of new formal specification languages. There's been a lot of new ones in the past few years (P, Quint, etc), plus some old ones I haven't critically examined (SPIN, mcrl2). I'm thinking of a brief overview of each, what's interesting about it, and some examples of the corresponding models.</p>\n<p>For this I'd want a set of \"Rosetta\" examples. <a href=\"https://rosettacode.org/wiki/Rosetta_Code\" target=\"_blank\">Rosetta Code</a> is a collection of programming tasks done in different languages. For example, <a href=\"https://rosettacode.org/wiki/99_bottles_of_beer\" target=\"_blank\">\"99 bottles of beer on the wall\"</a> in over 300 languages. If I wanted to make a Rosetta Code for specifications of concurrent systems, what examples would I use? </p>\n<h3>What makes a good Rosetta examples?</h3>\n<p>A good Rosetta example would be simple enough to understand and implement but also showcase the differences between the languages. </p>\n<p>A good example of a Rosetta example is <a href=\"https://github.com/hwayne/lets-prove-leftpad\" target=\"_blank\">leftpad for code verification</a>. Proving leftpad correct is short in whatever verification language you use. But the proofs themselves are different enough that you can compare what it's like to use code contracts vs with dependent types, etc. </p>\n<p>A <em>bad</em> Rosetta example is \"hello world\". While it's good for showing how to run a language, it doesn't clearly differentiate languages. 
Haskell's \"hello world\" is almost identical to BASIC's \"hello world\".</p>\n<p>Rosetta examples don't have to be flashy, but I <em>want</em> mine to be flashy. Formal specification is niche enough that regardless of my medium, most of my audience hasn't used it and may be skeptical. I always have to be selling. This biases me away from using things like dining philosophers or two-phase commit.</p>\n<p>So with that in mind, three ideas:</p>\n<h3>1. Wrapped Counter</h3>\n<p>A counter that starts at 1 and counts to N, after which it wraps around to 1 again.</p>\n<h4>Why it's good</h4>\n<p>This is a good introductory formal specification: it's a minimal possible stateful system without concurrency or nondeterminism. You can use it to talk about the basic structure of a spec, how a verifier works, etc. It's also a good way of introducing \"boring\" semantics, like conditionals and arithmetic, and checking if the language does anything unusual with them. Alloy, for example, defaults to 4-bit signed integers, so you run into problems if you set N too high.<sup id=\"fnref:alloy\"><a class=\"footnote-ref\" href=\"#fn:alloy\">2</a></sup></p>\n<p>At the same time, wrapped counters are a common building block of complex systems. Lots of things can be represented this way: <code>N=1</code> is a flag or blinker, <code>N=3</code> is a traffic light, <code>N=24</code> is a clock, etc.</p>\n<p>The next example is better for showing basic <a href=\"https://www.hillelwayne.com/post/safety-and-liveness/\" target=\"_blank\">safety and liveness properties</a>, but this will do in a pinch. </p>\n<h3>2. Threads</h3>\n<p>A counter starts at 0. N threads each simultaneously try to update the counter. They do this nonatomically: first they read the value of the counter and store that in a thread-local <code>tmp</code>, then they increment <code>tmp</code>, then they set the counter to <code>tmp</code>. 
The expected behavior is that the final value of the counter will be N.</p>\n<h4>Why it's good</h4>\n<p>The system as described is bugged. If two threads interleave the setlocal commands, one thread's update can \"clobber\" the other and the counter can go backwards. To my surprise, most people <em>do not</em> see this error. So it's a good showcase of how the language actually finds real bugs, and how it can verify fixes.</p>\n<p>As to actual language topics: the spec covers concurrency and tracks process-local state. A good spec language should make it possible to adjust N without having to add any new variables. And it \"naturally\" introduces safety, liveness, and <a href=\"https://www.hillelwayne.com/post/action-properties/\" target=\"_blank\">action</a> properties.</p>\n<p>Finally, the thread spec is endlessly adaptable. I've used variations of it to teach refinement, resource starvation, fairness, livelocks, and hyperproperties. Tweak it a bit and you get dining philosophers.</p>\n<h3>3. Bounded buffer</h3>\n<p>We have a bounded buffer with maximum length <code>X</code>. We have <code>R</code> reader and <code>W</code> writer processes. Before writing, writers first check if the buffer is full. If full, the writer goes to sleep. Otherwise, the writer wakes up <em>a random</em> sleeping process, then pushes an arbitrary value. Readers work the same way, except they pop from the buffer (and go to sleep if the buffer is empty).</p>\n<p>The only way for a sleeping process to wake up is if another process successfully performs a read or write.</p>\n<h4>Why it's good</h4>\n<p>This shows process-local nondeterminism (in choosing which sleeping process to wake up), different behavior for different types of processes, and deadlocks: it's possible for every reader and writer to be asleep at the same time.</p>\n<p>The beautiful thing about this example: the spec can only deadlock if <code>X < 2*(R+W)</code>. This is the kind of bug you'd struggle to debug in real code. 
And in fact, people did struggle: even when presented with a minimal code sample and told there was a bug, many <a href=\"http://wiki.c2.com/?ExtremeProgrammingChallengeFourteen\" target=\"_blank\">testing experts couldn't find it</a>. Whereas a formal model of the same code <a href=\"https://www.hillelwayne.com/post/augmenting-agile/\" target=\"_blank\">finds the bug in seconds</a>. </p>\n<p>If a spec language can model the bounded buffer, then it's good enough for production systems.</p>\n<p>On top of that, the bug happens regardless of what writers actually put in the buffer, so you can abstract that all away. This example can demonstrate that you can leave implementation details out of a spec and still find critical errors.</p>\n<h2>Caveat</h2>\n<p>This is all with a <em>heavy</em> TLA+ bias. I've modeled all of these systems in TLA+ and it works pretty well for them. That is to say, none of these do things TLA+ is <em>bad</em> at: reachability, subtyping, transitive closures, unbound spaces, etc. I imagine that as I cover more specification languages I'll find new Rosettas.</p>\n<div class=\"footnote\">\n<hr/>\n<ol>\n<li id=\"fn:release-notes\">\n<ul>\n<li>Exercises are more compact, answers now show name of exercise in title</li>\n</ul>\n<ul>\n<li>\"Conditionals\" chapter has new section on nested conditionals</li>\n</ul>\n<ul>\n<li>\"Crash course\" chapter significantly rewritten</li>\n<li>Started migrating to consistently use <code>==</code> for equality and <code>=</code> for definition. 
Not everything is migrated yet</li>\n<li>\"Beyond Logic\" appendix does a <em>slightly</em> better job of covering HOL and constructive logic</li>\n<li>Addressed various reader feedback</li>\n<li>Two new exercises</li>\n</ul>\n<p><a class=\"footnote-backref\" href=\"#fnref:release-notes\" title=\"Jump back to footnote 1 in the text\">↩</a></p>\n</li>\n<li id=\"fn:alloy\">\n<p>You can change the int size in a model run, so this is more \"surprising footgun and inconvenience\" than \"fundamental limit of the specification language.\" Something still good to know! <a class=\"footnote-backref\" href=\"#fnref:alloy\" title=\"Jump back to footnote 2 in the text\">↩</a></p>\n</li>\n</ol>\n</div>",
          "url": "https://buttondown.com/hillelwayne/archive/what-are-the-rosettas-of-formal-specification/",
          "published": "2025-01-15T17:34:40.000Z",
          "updated": "2025-01-15T17:34:40.000Z",
          "content": null,
          "image": null,
          "media": [],
          "authors": [],
          "categories": []
        },
        {
          "id": "https://buttondown.com/hillelwayne/archive/logic-for-programmers-project-update/",
          "title": "\"Logic for Programmers\" Project Update",
          "description": "<p>Happy new year everyone!</p>\n<p>I released the first <em>Logic for Programmers</em> alpha six months ago. There's since been four new versions since then, with the November release putting us in beta. Between work and holidays I didn't make much progress in December, but there will be a 0.6 release in the next week or two.</p>\n<p>People have asked me if the book will ever be available in print, and my answer to that is \"when it's done\". To keep \"when it's done\" from being \"never\", I'm committing myself to <strong>have the book finished by July.</strong> That means roughly six more releases between now and the official First Edition. Then I will start looking for a way to get it printed.</p>\n<h3>The Current State and What Needs to be Done</h3>\n<p>Right now the book is 26,000 words. For the most part, the structure is set— I don't plan to reorganize the chapters much. But I still need to fix shortcomings identified by the reader feedback. In particular, a few topics need more on real world applications, and the Alloy chapter is pretty weak. There's also a bunch of notes and todos and \"fix this\"s I need to go over.</p>\n<p>I also need to rewrite the introduction and predicate logic chapters. Those haven't changed much since 0.1 and I need to go over them <em>very carefully</em>.</p>\n<p>After that comes copyediting.</p>\n<h4>Ugh, Copyediting</h4>\n<p>Copyediting means going through the entire book to make word and sentence sentence level changes to the flow. An example would be changing</p>\n<table>\n<thead>\n<tr>\n<th>From</th>\n<th>To</th>\n</tr>\n</thead>\n<tbody>\n<tr>\n<td>I said predicates are just “boolean functions”. That isn’t <em>quite</em> true.</td>\n<td>It's easy to think of predicates as just \"boolean\" functions, but there is a subtle and important difference.</td>\n</tr>\n</tbody>\n</table>\n<p>It's a tiny difference but it reads slightly better to me and makes the book slghtly better. 
Now repeat that for all 3000-odd sentences in the book and I'm done with copyediting!</p>\n<p>For the first pass, anyway. Copyediting is miserable. </p>\n<p>Some of the changes I need to make come from reader feedback, but most will come from going through it line-by-line with a copyeditor. Someone's kindly offered to do some of this for free, but I want to find a professional too. If you know anybody, let me know.</p>\n<h4>Formatting</h4>\n<p>The book, if I'm being honest, looks ugly. I'm using the default sphinx/latex combination for layout and typesetting. My thinking is it's not worth making the book pretty until it's worth reading. But I also want the book, when it's eventually printed, to look <em>nice</em>. At the very least it shouldn't have \"self-published\" vibes. </p>\n<p>I've found someone who's been giving me excellent advice on layout and I'm slowly mastering the LaTeX formatting arcana. It's gonna take a few iterations to get things right.</p>\n<h4>Front cover</h4>\n<p>Currently the front cover is this:</p>\n<p><img alt=\"Front cover\" class=\"newsletter-image\" src=\"https://assets.buttondown.email/images/b42ee3de-9d8a-4729-809e-a8739741f0cf.png?w=960&fit=max\"/></p>\n<p>It works but gives \"programmer spent ten minutes in Inkscape\" vibes. I have a vision in my head for what would be nicer. A few people have recommended using Fiverr. So far the results haven't been that good.</p>\n<h4>Fixing Epub</h4>\n<p><em>Ugh</em></p>\n<p>I thought making an epub version would be kinder for phone reading, but it's such a painful format to develop for. Did you know that epub backlinks work totally differently on Kindle vs other ereaders? Did you know the only way to test if you got em working right is to load them up in a virtual Kindle? The feedback loops are miserable. 
So I've been treating epub as a second-class citizen for now and only fixing the <em>worst</em> errors (like math not rendering properly), but that'll have to change as the book finalizes.</p>\n<h3>What comes next?</h3>\n<p>After 1.0, I get my book an ISBN and figure out how to make print copies. The margin on print is <em>way</em> lower than ebooks, especially if it's on-demand: the net royalties for <a href=\"https://kdp.amazon.com/en_US/help/topic/G201834330\" target=\"_blank\">Amazon direct publishing</a> would be 7 dollars on a 20-dollar book (as opposed to Leanpub's 16 dollars). Would having a print version double the sales? I hope so! Either way, a lot of people have been asking about a print version so I want to make that possible.</p>\n<p>(I also want to figure out how to give people who already have the ebook a discount on print, but I don't know if that's feasible.)</p>\n<p>Then, I dunno, maybe make a talk or a workshop I can pitch to conferences. Once I have that I think I can call <em>LfP</em> complete... at least until the second edition.</p>\n<hr/>\n<p>Anyway none of that is actually technical so here's a quick fun thing. I spent a good chunk of my break reading the <a href=\"https://www.mcrl2.org/web/index.html\" target=\"_blank\">mCRL2 book</a>. mCRL2 defines an \"algebra\" for \"communicating processes\". As a very broad explanation, that's defining what it means to \"add\" and \"multiply\" two processes. What's interesting is that according to their definition, the algebra follows the distributive law, <em>but only if you multiply on the right</em>. eg</p>\n<div class=\"codehilite\"><pre><span></span><code>// VALID\n(a+b)*c = a*c + b*c\n\n// INVALID\na*(b+c) = a*b + a*c\n</code></pre></div>\n<p>This is the first time I've ever seen this in practice! Jury's still out on the rest of the language.</p>\n<hr/>\n<h3>Videos and Stuff</h3>\n<ul>\n<li>My <em>DDD Europe</em> talk is now out! 
<a href=\"https://www.youtube.com/watch?v=uRmNSuYBUOU\" target=\"_blank\">What We Know We Don't Know</a> is about empirical software engineering in general, and software engineering research on Domain Driven Design in particular.</li>\n<li>I was interviewed in the last video on <a href=\"https://www.youtube.com/watch?v=yXxmSI9SlwM\" target=\"_blank\">Craft vs Cruft</a>'s \"Year of Formal Methods\". Check it out!</li>\n</ul>",
          "url": "https://buttondown.com/hillelwayne/archive/logic-for-programmers-project-update/",
          "published": "2025-01-07T18:49:40.000Z",
          "updated": "2025-01-07T18:49:40.000Z",
          "content": null,
          "image": null,
          "media": [],
          "authors": [],
          "categories": []
        },
        {
          "id": "https://buttondown.com/hillelwayne/archive/formally-modeling-dreidel-the-sequel/",
          "title": "Formally modeling dreidel, the sequel",
          "description": "<p>Channukah's next week and that means my favorite pastime, complaining about how <a href=\"https://en.wikipedia.org/wiki/Dreidel#\" target=\"_blank\">Dreidel</a> is a bad game. Last year I formally modeled it in <a href=\"https://www.prismmodelchecker.org/\" target=\"_blank\">PRISM</a> to prove the game's not fun. But because I limited the model to only a small case, I couldn't prove the game was <em>truly</em> bad. </p>\n<p>It's time to finish the job.</p>\n<p><img alt=\"A flaming dreidel, from https://pixelsmerch.com/featured/flaming-dreidel-ilan-rosen.html\" class=\"newsletter-image\" src=\"https://assets.buttondown.email/images/61233445-69a7-4fd4-a024-ee0dca0281c1.jpg?w=960&fit=max\"/></p>\n<h2>The Story so far</h2>\n<p>You can read last year's newsletter <a href=\"https://buttondown.com/hillelwayne/archive/i-formally-modeled-dreidel-for-no-good-reason/\" target=\"_blank\">here</a> but here are the high-level notes.</p>\n<h3>The Game of Dreidel</h3>\n<ol>\n<li>Every player starts with N pieces (usually chocolate coins). This is usually 10-15 pieces per player.</li>\n<li>At the beginning of the game, and whenever the pot is empty, every player antes one coin into the pot.</li>\n<li>\n<p>Turns consist of spinning the dreidel. Outcomes are:</p>\n<ul>\n<li>נ (Nun): nothing happens.</li>\n<li>ה (He): player takes half the pot, rounded up.</li>\n<li>ג (Gimmel): player takes the whole pot, everybody antes.</li>\n<li>ש (Shin): player adds one of their coins to the pot.</li>\n</ul>\n</li>\n<li>\n<p>If a player ever has zero coins, they are eliminated. 
Play continues until only one player remains.</p>\n</li>\n</ol>\n<p>If you don't have a dreidel, you can instead use a four-sided die, but for the authentic experience you should wait eight seconds before looking at your roll.</p>\n<h3>PRISM</h3>\n<p><a href=\"https://www.prismmodelchecker.org/\" target=\"_blank\">PRISM</a> is a probabilistic modeling language, meaning you can encode a system with random chances of doing things and it can answer questions like \"on average, how many spins does it take before one player loses\" (64, for 4 players/10 coins) and \"what's more likely to knock the first player out, shin or ante\" (ante is 2.4x more likely). You can see last year's model <a href=\"https://gist.github.com/hwayne/f8724f0c83393c576b1e20ee4b76966d#file-01-dreidel-prism\" target=\"_blank\">here</a>.</p>\n<p>The problem with PRISM is that it is absurdly inexpressive: it's a thin abstraction for writing giant <a href=\"https://en.wikipedia.org/wiki/Stochastic_matrix\" target=\"_blank\">stochastic matrices</a> and lacks basic affordances like lists or functions. I had to hardcode every possible roll for every player. This meant last year's model had two limits. First, it only handled four players, and I would have to write a new model for three or five players. Second, I made the game end as soon as one player <em>lost</em>:</p>\n<div class=\"codehilite\"><pre><span></span><code>formula done = (p1=0) | (p2=0) | (p3=0) | (p4=0);\n</code></pre></div>\n<p>To fix both of these things, I thought I'd have to treat PRISM as a compilation target, writing a program that took a player count and output the corresponding model. But then December got super busy and I ran out of time to write a program. 
Instead, I stuck with four hardcoded players and extended the old model to run until victory.</p>\n<h2>The new model</h2>\n<p>These are all changes to <a href=\"https://gist.github.com/hwayne/f8724f0c83393c576b1e20ee4b76966d#file-01-dreidel-prism\" target=\"_blank\">last year's model</a>.</p>\n<p>First, instead of running until one player is out of money, we run until three players are out of money.</p>\n<div class=\"codehilite\"><pre><span></span><code><span class=\"gd\">- formula done = (p1=0) | (p2=0) | (p3=0) | (p4=0);</span>\n<span class=\"gi\">+ formula done = </span>\n<span class=\"gi\">+  ((p1=0) & (p2=0) & (p3=0)) |</span>\n<span class=\"gi\">+  ((p1=0) & (p2=0) & (p4=0)) |</span>\n<span class=\"gi\">+  ((p1=0) & (p3=0) & (p4=0)) |</span>\n<span class=\"gi\">+  ((p2=0) & (p3=0) & (p4=0));</span>\n</code></pre></div>\n<p>Next, we change the ante formula. Instead of adding four coins to the pot and subtracting a coin from each player, we add one coin for each player left. <code>min(p1, 1)</code> is 1 if player 1 is still in the game, and 0 otherwise. </p>\n<div class=\"codehilite\"><pre><span></span><code><span class=\"gi\">+ formula ante_left = min(p1, 1) + min(p2, 1) + min(p3, 1) + min(p4, 1);</span>\n</code></pre></div>\n<p>We also have to make sure anteing doesn't end a player with negative money. </p>\n<div class=\"codehilite\"><pre><span></span><code><span class=\"gd\">- [ante] (pot = 0) & !done -> (pot'=pot+4) & (p1' = p1-1) & (p2' = p2-1) & (p3' = p3-1) & (p4' = p4-1);</span>\n<span class=\"gi\">+ [ante] (pot = 0) & !done -> (pot'=pot+ante_left) & (p1' = max(p1-1, 0)) & (p2' = max(p2-1, 0)) & (p3' = max(p3-1, 0)) & (p4' = max(p4-1, 0));</span>\n</code></pre></div>\n<p>Finally, we have to add logic for a player being \"out\". Instead of moving to the next player after each turn, we move to the next player still in the game. Also, if someone starts their turn without any coins (f.ex if they just anted their last coin), we just skip their turn. 
</p>\n<div class=\"codehilite\"><pre><span></span><code><span class=\"gi\">+ formula p1n = (p2 > 0 ? 2 : p3 > 0 ? 3 : 4);</span>\n\n<span class=\"gi\">+ [lost] ((pot != 0) & !done & (turn = 1) & (p1 = 0)) -> (turn' = p1n);</span>\n<span class=\"gd\">- [spin] ((pot != 0) & !done & (turn = 1)) -></span>\n<span class=\"gi\">+ [spin] ((pot != 0) & !done & (turn = 1) & (p1 != 0)) -></span>\n<span class=\"w\"> </span>   0.25: (p1' = p1-1) \n<span class=\"w\"> </span>          & (pot' = min(pot+1, maxval)) \n<span class=\"gd\">-          & (turn' = 2) //shin</span>\n<span class=\"gi\">+          & (turn' = p1n) //shin</span>\n</code></pre></div>\n<p>We make similar changes for all of the other players. You can see the final model <a href=\"https://gist.github.com/hwayne/f8724f0c83393c576b1e20ee4b76966d#file-02-dreidel-prism\" target=\"_blank\">here</a>.</p>\n<h3>Querying the model</h3>\n<div class=\"subscribe-form\"></div>\n<p>So now we have a full game of Dreidel that runs until the game ends. And now, <em>finally</em>, we can see the average number of spins a 4 player game will last.</p>\n<div class=\"codehilite\"><pre><span></span><code>./prism<span class=\"w\"> </span>dreidel.prism<span class=\"w\"> </span>-const<span class=\"w\"> </span><span class=\"nv\">M</span><span class=\"o\">=</span><span class=\"m\">10</span><span class=\"w\"> </span>-pf<span class=\"w\"> </span><span class=\"s1\">'R=? [F done]'</span><span class=\"w\"> </span>\n</code></pre></div>\n<p>In English: each player starts with ten coins. <code>R=?</code> means \"expected value of the 'reward'\", where 'reward' in this case means number of spins. 
<code>[F done]</code> weights the reward over all behaviors that reach (\"<strong>F</strong>inally\") the <code>done</code> state.</p>\n<div class=\"codehilite\"><pre><span></span><code>Result: 760.5607582661091\nTime for model checking: 384.17 seconds.\n</code></pre></div>\n<p>So there's the number: 760 spins.<sup id=\"fnref:ben\"><a class=\"footnote-ref\" href=\"#fn:ben\">1</a></sup> At 8 seconds a spin, that's almost two hours for <em>one</em> game.</p>\n<p>…Jesus, look at that runtime. Six minutes to test one query.</p>\n<p>PRISM has over a hundred settings that affect model checking, with descriptions like \"Pareto curve threshold\" and \"Use Backwards Pseudo SOR\". After looking through them all, I found this perfect combination of configurations that gets the runtime to a more manageable level: </p>\n<div class=\"codehilite\"><pre><span></span><code>./prism dreidel.prism \n<span class=\"w\"> </span>   -const M=10 \n<span class=\"w\"> </span>   -pf 'R=? [F done]' \n<span class=\"gi\">+   -heuristic speed</span>\n\nResult: 760.816255997373\nTime for model checking: 13.44 seconds.\n</code></pre></div>\n<p>Yes, that's a literal \"make it faster\" flag.</p>\n<p>Anyway, that's only the \"average\" number of spins, weighted across all games. Dreidel has a very long tail. To find that out, we'll use a variation on our query:</p>\n<div class=\"codehilite\"><pre><span></span><code>const C0; P=? [F <=C0 done]\n</code></pre></div>\n<p><code>P=?</code> is the <strong>P</strong>robability something happens. <code>F <=C0 done</code> means we <strong>F</strong>inally reach state <code>done</code> in at most <code>C0</code> steps. By passing in different values of <code>C0</code> we can get a sense of how long a game takes. Since \"steps\" includes passes and antes, this will overestimate the length of the game. 
But antes take time too and it should only \"pass\" on a player once per player, so this should still be a good metric for game length.</p>\n<div class=\"codehilite\"><pre><span></span><code>./prism dreidel.prism \n    -const M=10 \n    -const C0=1000:1000:5000\n    -pf 'const C0; P=? [F <=C0 done]' \n    -heuristic speed\n\nC0      Result\n1000    0.6259953274918795\n2000    0.9098575028069353\n3000    0.9783122218576754\n4000    0.994782069562932\n5000    0.9987446018004976\n</code></pre></div>\n<p>A full 10% of games don't finish in 2000 steps, and 2% pass the 3000 step barrier. At 8 seconds a roll/ante, 3000 steps is over <strong>six hours</strong>.</p>\n<p>Dreidel is a bad game.</p>\n<h3>More fun properties</h3>\n<p>As a sanity check, let's confirm last year's result, that it takes an average of 64ish spins before one player is out. In that model, we just needed to get the total reward. Now we instead want to get the reward until the first state where any of the players have zero coins. <sup id=\"fnref:co-safe\"><a class=\"footnote-ref\" href=\"#fn:co-safe\">2</a></sup></p>\n<div class=\"codehilite\"><pre><span></span><code>./prism dreidel.prism \n    -const M=10 \n    -pf 'R=? [F (p1=0 | p2=0 | p3=0 | p4=0)]' \n    -heuristic speed\n\nResult: 63.71310116083396\nTime for model checking: 2.017 seconds.\n</code></pre></div>\n<p>Yep, looks good. With our new model we can also get the average point where two players are out and two players are left. PRISM's lack of abstraction makes expressing the condition directly a little painful, but we can cheat and look for the first state where <code>ante_left <= 2</code>.<sup id=\"fnref:ante_left\"><a class=\"footnote-ref\" href=\"#fn:ante_left\">3</a></sup></p>\n<div class=\"codehilite\"><pre><span></span><code>./prism dreidel.prism \n    -const M=10 \n    -pf 'R=? 
[F (ante_left <= 2)]' \n    -heuristic speed\n\nResult: 181.92839196680023\n</code></pre></div>\n<p>It takes twice as long to eliminate the second player as it takes to eliminate the first, and the remaining two players have to go for another 600 spins.</p>\n<p>Dreidel is a bad game.</p>\n<h2>The future</h2>\n<p>There's two things I want to do next with this model. The first is script up something that can generate the PRISM model for me, so I can easily adjust the number of players to 3 or 5. The second is that PRISM has a <a href=\"https://www.prismmodelchecker.org/manual/PropertySpecification/Filters\" target=\"_blank\">filter-query</a> feature I don't understand but I <em>think</em> it could be used for things like \"if a player gets 75% of the pot, what's the probability they lose anyway\". Otherwise you have to write wonky queries like <code>(P =? [F p1 = 30 & (F p1 = 0)]) / (P =? [F p1 = 0])</code>.<sup id=\"fnref:lose\"><a class=\"footnote-ref\" href=\"#fn:lose\">4</a></sup> But I'm out of time again, so this saga will have to conclude next year.</p>\n<p>I'm also faced with the terrible revelation that I might be the biggest non-academic user of PRISM.</p>\n<hr/>\n<h4><em>Logic for Programmers</em> Khanukah Sale</h4>\n<p>Still going on! You can get <em>LFP</em> for <a href=\"https://leanpub.com/logic/c/hannukah-presents\" target=\"_blank\">40% off here</a> from now until the end of Xannukkah (Jan 2).<sup id=\"fnref:joke\"><a class=\"footnote-ref\" href=\"#fn:joke\">5</a></sup></p>\n<h4>I'm in the Raku Advent Calendar!</h4>\n<p>My piece is called <a href=\"https://raku-advent.blog/2024/12/11/day-11-counting-up-concurrency/\" target=\"_blank\">counting up concurrencies</a>. It's about using Raku to do some combinatorics! 
Read the rest of the blog too, it's great.</p>\n<div class=\"footnote\">\n<hr/>\n<ol>\n<li id=\"fn:ben\">\n<p>This is different from the <a href=\"https://www.slate.com/articles/life/holidays/2014/12/rules_of_dreidel_the_hannukah_game_is_way_too_slow_let_s_speed_it_up.html\" target=\"_blank\">original anti-Dreidel article</a>: Ben got <em>860</em> spins. That's the average number of spins if you round <em>down</em> on He, not up. Rounding up on He leads to a shorter game because it means He can empty the pot, which means more antes, and antes are what knock most players out. <a class=\"footnote-backref\" href=\"#fnref:ben\" title=\"Jump back to footnote 1 in the text\">↩</a></p>\n</li>\n<li id=\"fn:co-safe\">\n<p>PRISM calls this <a href=\"https://www.prismmodelchecker.org/manual/PropertySpecification/Reward-basedProperties\" target=\"_blank\">\"co-safe LTL reward\"</a> and does <em>not</em> explain what that means, nor do most of the papers I found referencing \"co-safe LTL\". <a href=\"https://mengguo.github.io/personal_site/papers/pdf/guo2016task.pdf\" target=\"_blank\">Eventually</a> I found one that defined it as \"any property that only uses X, U, F\". <a class=\"footnote-backref\" href=\"#fnref:co-safe\" title=\"Jump back to footnote 2 in the text\">↩</a></p>\n</li>\n<li id=\"fn:ante_left\">\n<p>Here's the exact point where I realized I could have defined <code>done</code> as <code>ante_left = 1</code>. Also checking for <code>F (ante_left = 2)</code> gives the expected number of spins as \"infinity\". I have no idea why. <a class=\"footnote-backref\" href=\"#fnref:ante_left\" title=\"Jump back to footnote 3 in the text\">↩</a></p>\n</li>\n<li id=\"fn:lose\">\n<p>A 10% chance at 4 players / 10 coins. And it takes a minute even <em>with</em> fast mode enabled. <a class=\"footnote-backref\" href=\"#fnref:lose\" title=\"Jump back to footnote 4 in the text\">↩</a></p>\n</li>\n<li id=\"fn:joke\">\n<p>This joke was funnier before I made the whole newsletter about Chanukahh. 
<a class=\"footnote-backref\" href=\"#fnref:joke\" title=\"Jump back to footnote 5 in the text\">↩</a></p>\n</li>\n</ol>\n</div>",
          "url": "https://buttondown.com/hillelwayne/archive/formally-modeling-dreidel-the-sequel/",
          "published": "2024-12-18T16:58:59.000Z",
          "updated": "2024-12-18T16:58:59.000Z",
          "content": null,
          "image": null,
          "media": [],
          "authors": [],
          "categories": []
        },
        {
          "id": "https://buttondown.com/hillelwayne/archive/stroustrups-rule/",
          "title": "Stroustrup's Rule",
          "description": "<p>Just finished two weeks of workshops and am <em>exhausted</em>, so this one will be light. </p>\n<h3>Hanuka Sale</h3>\n<p><em>Logic for Programmers</em> is on sale until the end of Chanukah! That's Jan 2nd if you're not Jewish. <a href=\"https://leanpub.com/logic/c/hannukah-presents\" target=\"_blank\">Get it for 40% off here</a>.</p>\n<h1>Stroustrup's Rule</h1>\n<p>I first encountered <strong>Stroustrup's Rule</strong> on this <a href=\"https://web.archive.org/web/20240914141601/https:/www.thefeedbackloop.xyz/stroustrups-rule-and-layering-over-time/\" target=\"_blank\">defunct webpage</a>:</p>\n<blockquote>\n<p>One of my favorite insights about syntax design appeared in a <a href=\"https://learn.microsoft.com/en-us/shows/lang-next-2014/keynote\" target=\"_blank\">retrospective on C++</a><sup id=\"fnref:timing\"><a class=\"footnote-ref\" href=\"#fn:timing\">1</a></sup> by Bjarne Stroustrup:</p>\n<ul>\n<li>For new features, people insist on <strong>LOUD</strong> explicit syntax. </li>\n<li>For established features, people want terse notation.</li>\n</ul>\n</blockquote>\n<p>The blogger gives the example of option types in Rust. 
Originally, the idea of using option types to store errors was new for programmers, so the syntax for passing an error was very explicit:</p>\n<div class=\"codehilite\"><pre><span></span><code><span class=\"kd\">let</span><span class=\"w\"> </span><span class=\"n\">file</span><span class=\"w\"> </span><span class=\"o\">=</span><span class=\"w\"> </span><span class=\"k\">match</span><span class=\"w\"> </span><span class=\"n\">File</span><span class=\"p\">::</span><span class=\"n\">open</span><span class=\"p\">(</span><span class=\"s\">\"file.txt\"</span><span class=\"p\">)</span><span class=\"w\"> </span><span class=\"p\">{</span>\n<span class=\"w\">    </span><span class=\"nb\">Ok</span><span class=\"p\">(</span><span class=\"n\">file</span><span class=\"p\">)</span><span class=\"w\"> </span><span class=\"o\">=></span><span class=\"w\"> </span><span class=\"n\">file</span><span class=\"p\">,</span>\n<span class=\"w\">    </span><span class=\"nb\">Err</span><span class=\"p\">(</span><span class=\"n\">err</span><span class=\"p\">)</span><span class=\"w\"> </span><span class=\"o\">=></span><span class=\"w\"> </span><span class=\"p\">{</span><span class=\"w\"> </span><span class=\"k\">return</span><span class=\"w\"> </span><span class=\"nb\">Err</span><span class=\"p\">(</span><span class=\"n\">err</span><span class=\"p\">);</span><span class=\"w\"> </span><span class=\"p\">}</span>\n<span class=\"p\">};</span>\n</code></pre></div>\n<p>Once people were more familiar with it, Rust added the <code>try!</code> macro to reduce boilerplate, and finally the <a href=\"https://github.com/rust-lang/rfcs/blob/master/text/0243-trait-based-exception-handling.md\" target=\"_blank\"><code>?</code> operator</a> to streamline error handling further.</p>\n<p>I see this as a special case of <a href=\"http://teachtogether.tech/en/index.html#s:models\" target=\"_blank\">mental model development</a>: when a feature is new to you, you don't have an internal mental model so you need all of the explicit information you can get. 
Once you're familiar with it, explicit syntax is visual clutter and hinders how quickly you can parse out information.</p>\n<p>(One example I like: which is more explicit, <code>user_id</code> or <code>user_identifier</code>? Which do experienced programmers prefer?)</p>\n<p>What's interesting is that it's often the <em>same people</em> on both sides of the spectrum. Beginners need explicit syntax, and as they become experts, they prefer terse syntax. </p>\n<p>The rule applies to the overall community, too. At the beginning of a language's life, everybody's a beginner. Over time the ratio of experts to beginners changes, and this leads to more focus on \"expert-friendly\" features, like terser syntax.</p>\n<p>This can make it harder for beginners to learn the language. There was a lot of drama in Python over the <a href=\"https://peps.python.org/pep-0572/\" target=\"_blank\">\"walrus\" assignment operator</a>:</p>\n<div class=\"codehilite\"><pre><span></span><code><span class=\"c1\"># Without walrus</span>\n<span class=\"n\">val</span> <span class=\"o\">=</span> <span class=\"nb\">dict</span><span class=\"o\">.</span><span class=\"n\">get</span><span class=\"p\">(</span><span class=\"n\">key</span><span class=\"p\">)</span> <span class=\"c1\"># `None` if key absent</span>\n<span class=\"k\">if</span> <span class=\"n\">val</span><span class=\"p\">:</span>\n    <span class=\"nb\">print</span><span class=\"p\">(</span><span class=\"n\">val</span><span class=\"p\">)</span>\n\n\n<span class=\"c1\"># With walrus</span>\n<span class=\"k\">if</span> <span class=\"n\">val</span> <span class=\"o\">:=</span> <span class=\"nb\">dict</span><span class=\"o\">.</span><span class=\"n\">get</span><span class=\"p\">(</span><span class=\"n\">key</span><span class=\"p\">):</span>\n    <span class=\"nb\">print</span><span class=\"p\">(</span><span class=\"n\">val</span><span class=\"p\">)</span>\n</code></pre></div>\n<p>Experts supported it because it made code more elegant, teachers 
and beginners opposed it because it made the language harder to learn. Explicit syntax vs terse notation.</p>\n<p>Does this lead to languages bloating over time?</p>\n<h3>In Teaching</h3>\n<p>I find that when I teach language workshops I have to actively work against Stroustrup's Rule. The terse notation that's easiest for <em>me</em> to read is bad for beginners, who need the explicit syntax that I find grating.</p>\n<p>One good example is type invariants in TLA+. Say you have a set of workers, and each worker has a counter. Here's two ways to say that every worker's counter is a non-negative integer:</p>\n<div class=\"codehilite\"><pre><span></span><code>\\* Bad\n\\A w \\in Workers: counter[w] >= 0\n\n\\* Good\ncounter \\in [Workers -> Nat]\n</code></pre></div>\n<p>The first way literally tests that for every worker, <code>counter[w]</code> is non-negative. The second way tests that the <code>counter</code> mapping as a whole is an element of the appropriate \"function set\": all functions from workers to natural numbers.</p>\n<p>The function set approach is terser, more elegant, and preferred by TLA+ experts. But I teach the \"bad\" way because it makes more sense to beginners.</p>\n<div class=\"footnote\">\n<hr/>\n<ol>\n<li id=\"fn:timing\">\n<p>Starts at minute 23. <a class=\"footnote-backref\" href=\"#fnref:timing\" title=\"Jump back to footnote 1 in the text\">↩</a></p>\n</li>\n</ol>\n</div>",
          "url": "https://buttondown.com/hillelwayne/archive/stroustrups-rule/",
          "published": "2024-12-11T17:32:53.000Z",
          "updated": "2024-12-11T17:32:53.000Z",
          "content": null,
          "image": null,
          "media": [],
          "authors": [],
          "categories": []
        },
        {
          "id": "https://buttondown.com/hillelwayne/archive/hyperproperties/",
          "title": "Hyperproperties",
          "description": "<p>I wrote about <a href=\"https://hillelwayne.com/post/hyperproperties/\" target=\"_blank\">hyperproperties on my blog</a> four years ago, but now an intriguing client problem got me thinking about them again.<sup id=\"fnref:client\"><a class=\"footnote-ref\" href=\"#fn:client\">1</a></sup></p>\n<p>We're using TLA+ to model a system that starts in state A, and under certain complicated conditions <code>P</code>, transitions to state B. They also had a flag <code>f</code> that, when set, used a different complicated condition <code>Q</code> to check the transitions. As a quick <a href=\"https://www.hillelwayne.com/post/decision-tables/\" target=\"_blank\">decision table</a> (from state <code>A</code>):</p>\n<table>\n<thead>\n<tr>\n<th>f</th>\n<th>P</th>\n<th>Q</th>\n<th>state'</th>\n</tr>\n</thead>\n<tbody>\n<tr>\n<td>F</td>\n<td>F</td>\n<td>-</td>\n<td>A</td>\n</tr>\n<tr>\n<td>F</td>\n<td>T</td>\n<td>-</td>\n<td>B</td>\n</tr>\n<tr>\n<td>T</td>\n<td>F</td>\n<td>F</td>\n<td>A</td>\n</tr>\n<tr>\n<td>T</td>\n<td>F</td>\n<td>T</td>\n<td>B</td>\n</tr>\n<tr>\n<td>T</td>\n<td>T</td>\n<td>F</td>\n<td><strong>impossible</strong></td>\n</tr>\n<tr>\n<td>T</td>\n<td>T</td>\n<td>T</td>\n<td>B</td>\n</tr>\n</tbody>\n</table>\n<p>The interesting bit is the second-to-last row: Q has to be <em>strictly</em> more permissible than P. The client wanted to verify the property that \"the system more aggressively transitions when <code>f</code> is set\", ie there is no case where the machine transitions <em>only if <code>f</code> is false</em>.</p>\n<p><a href=\"https://www.hillelwayne.com/post/safety-and-liveness/\" target=\"_blank\">Regular system properties</a> are specified over states in a single sequence of states (behaviors). <strong>Hyperproperties</strong> can hold over <em>sets</em> of sequences of states. 
Here the hyperproperties are:</p>\n<blockquote>\n<ol>\n<li>For any two states X and Y in separate behaviors, if the only difference in variable-state between X and Y is that <code>X.f = TRUE</code>, then whenever Y transitions to B, so does X.</li>\n<li>There is at least one such case where X transitions and Y does not.</li>\n</ol>\n</blockquote>\n<p>That's pretty convoluted, which is par for the course with hyperproperties! It makes a little more sense if you have all of the domain knowledge and specifics. </p>\n<p>The key thing that makes this a hyperproperty is that you can't <em>just</em> look at individual behaviors to verify it. Imagine if, when <code>f</code> is true, we <em>never</em> transition to state B. Is that a violation of (1)? Not if we never transition when <code>f</code> is false either! To prove a violation, you need to find a second behavior where <code>f</code> is false <em>and</em> the state is otherwise the same <em>and</em> we transition to B anyway.</p>\n<h4>Aside: states in states in states</h4>\n<p>I dislike how \"state\" refers to three things:</p>\n<ol>\n<li>The high-level \"transition state\" of a state-machine</li>\n<li>A single point in time of a system (the \"state space\")</li>\n<li>The mutable data inside your system's <a href=\"https://www.hillelwayne.com/post/world-vs-machine/\" target=\"_blank\">machine</a>.</li>\n</ol>\n<p>These are all \"close\" to each other but <em>just</em> different enough to make conversations confusing. Software is pretty bad about reusing colloquial words like this; don't even get me <em>started</em> on the word \"design\".</p>\n<h3>There's a reason we don't talk about hyperproperties</h3>\n<p>Or three reasons. First of all, hyperproperties make up a <em>vanishingly small</em> percentage of the stuff in a system we care about. 
We only got to \"<code>f</code> makes the system more aggressive\" after checking at least a dozen other simpler and <em>more important</em> not-hyper properties.</p>\n<p>Second, <em>most</em> formal specification languages can't express hyperproperties, and the ones that can are all academic research projects. Modeling systems is hard enough without a generalized behavior notation!</p>\n<p>Third, hyperproperties are astoundingly expensive to check. As an informal estimation, for a state space of size <code>N</code>, regular properties are checked across <code>N</code> individual states and 2-behavior hyperproperties (2-props) are checked across <code>N²</code> pairs. So for a small state space of just a million states, the 2-prop needs to be checked across a <em>trillion</em> pairs. </p>\n<p>These problems don't apply to \"hyperproperties\" of functions, just systems. Functions have a lot of interesting hyperproperties, there's an easy way to represent them (call the function twice in a test), and quadratic scaling isn't so bad if you're only testing 100 inputs or so. That's why so-called <a href=\"https://www.hillelwayne.com/post/metamorphic-testing/\" target=\"_blank\">metamorphic testing</a> of functions can be useful.</p>\n<h3>Checking Hyperproperties Anyway</h3>\n<p>If we <em>do</em> need to check a hyperproperty, there's a few ways we can approach it. </p>\n<p>The easiest way is to cheat and find a regular prop that implies the hyperproperty. In the client's case, we can abstract <code>P</code> and <code>Q</code> into pure functions and then test that there's no input where <code>P</code> is true and <code>Q</code> is false. 
In TLA+, this would look something like</p>\n<div class=\"codehilite\"><pre><span></span><code>\\* TLA+\nQLooserThanP ==\n  \\A i1 \\in InputSet1, i2 \\in Set2: \\* ...\n    P(i1, i2, …) => Q(i1, i2, …)\n</code></pre></div>\n<p>Of course we can't always encapsulate this way, and this can't catch bugs like \"we accidentally use <code>P</code> even if <code>f</code> is true\". But it gets the job done.</p>\n<p>Another way is something I talked about in the <a href=\"https://hillelwayne.com/post/hyperproperties/\" target=\"_blank\">original hyperproperty post</a>: lifting specs into hyperspecs. We create a new spec that initializes two copies of our main spec, runs them in parallel, and then compares their behaviors. See the post for an example. Writing a hyperspec keeps us entirely in TLA+ but takes a lot of work and is <em>very</em> expensive to check. Depending on the property we want to check, we can sometimes find simple optimizations.</p>\n<p>The last way is something <a href=\"https://hillelwayne.com/post/graphing-tla/\" target=\"_blank\">I explored last year</a>: dump the state graph to disk and treat the hyperproperty as a graph property. In this case, the graph property would be something like </p>\n<blockquote>\n<p>Find all graph edges representing an A → B transition. Take all the source nodes of each where <code>f = false</code>. For each such source node, find the corresponding node that's identical except for <code>f = true</code>. That node should be the source of an A → B edge.</p>\n</blockquote>\n<p>Upside is you don't have to make any changes to the original spec. Downside is you have to use another programming language for analysis. Also, <a href=\"https://hillelwayne.com/post/graph-types/\" target=\"_blank\">analyzing graphs is terrible</a>. 
But I think this is overall the most robust approach to handling hyperproperties, to be used when \"cheating\" fails.</p>\n<hr/>\n<p>What fascinates me most about this is the four-year gap between \"I learned and wrote about hyperproperties\" and \"I have to deal with hyperproperties in my job.\" This is one reason learning for the sake of learning can have a lot of long-term benefits.</p>\n<hr/>\n<h3>Blog Rec</h3>\n<p>This week's rec is <a href=\"https://robertheaton.com/\" target=\"_blank\">Robert Heaton</a>. It's a \"general interest\" software engineering blog with a focus on math, algorithms, and security. Some of my favorites:</p>\n<ul>\n<li><a href=\"https://robertheaton.com/preventing-impossible-game-levels-using-cryptography/\" target=\"_blank\">Preventing impossible game levels using cryptography</a> and the whole \"Steve Steveington\" series</li>\n<li><a href=\"https://robertheaton.com/2019/06/24/i-was-7-words-away-from-being-spear-phished/\" target=\"_blank\">I was 7 words away from being spear-phished</a> is a great deep dive into one targeted scam</li>\n<li><a href=\"https://robertheaton.com/2019/02/24/making-peace-with-simpsons-paradox/\" target=\"_blank\">Making peace with Simpson's Paradox</a> is the best explanation of Simpson's Paradox I've ever read.</li>\n</ul>\n<p>Other good ones are <a href=\"https://robertheaton.com/pyskywifi/\" target=\"_blank\">PySkyWiFi: completely free, unbelievably stupid wi-fi on long-haul flights</a> and <a href=\"https://robertheaton.com/interview/\" target=\"_blank\">How to pass a coding interview with me</a>. The guy's got <em>breadth</em>.</p>\n<div class=\"footnote\">\n<hr/>\n<ol>\n<li id=\"fn:client\">\n<p>I do formal methods consulting btw. <a href=\"https://www.hillelwayne.com/consulting/\" target=\"_blank\">Hire me!</a> <a class=\"footnote-backref\" href=\"#fnref:client\" title=\"Jump back to footnote 1 in the text\">↩</a></p>\n</li>\n</ol>\n</div>",
          "url": "https://buttondown.com/hillelwayne/archive/hyperproperties/",
          "published": "2024-11-19T19:34:54.000Z",
          "updated": "2024-11-19T19:34:54.000Z",
          "content": null,
          "image": null,
          "media": [],
          "authors": [],
          "categories": []
        },
        {
          "id": "https://buttondown.com/hillelwayne/archive/five-unusual-raku-features/",
          "title": "Five Unusual Raku Features",
          "description": "<h3><a href=\"https://leanpub.com/logic/\" target=\"_blank\"><em>Logic for Programmers</em></a> is now in Beta!</h3>\n<p><a href=\"https://leanpub.com/logic/\" target=\"_blank\">v0.5 marks the official end of alpha</a>! With the new version, all of the content I wanted to put in the book is now present, and all that's left is copyediting, proofreading, and formatting. Which will probably take as long as it took to actually write the book. You can see the release notes in the footnote.<sup id=\"fnref:release-notes\"><a class=\"footnote-ref\" href=\"#fn:release-notes\">1</a></sup></p>\n<p>And I've got a snazzy new cover:</p>\n<p><img alt=\"The logic for programmers cover, a 40x zoom of a bird feather\" class=\"newsletter-image\" src=\"https://assets.buttondown.email/images/26c75f1e-e60a-4328-96e5-9878d96d3e53.png?w=960&fit=max\"/></p>\n<p>(I don't actually like the cover that much but it <em>looks</em> official enough until I can pay an actual cover designer.)</p>\n<h1>\"Five\" Unusual Raku Features</h1>\n<p>Last year I started learning Raku, and the sheer bizarreness of the language left me describing it as <a href=\"https://buttondown.com/hillelwayne/archive/raku-a-language-for-gremlins/\" target=\"_blank\">a language for gremlins</a>. Now that I've used it in anger for over a year, I have a better way of describing it:</p>\n<blockquote>\n<p>Raku is a laboratory for language features.</p>\n</blockquote>\n<p>This is why it has <a href=\"https://docs.raku.org/language/concurrency\" target=\"_blank\">five different models of concurrency</a> and eighteen ways of doing anything else, because the point is to <em>see</em> what happens. It also explains why many of the features interact so strangely and why there's all that odd edge-case behavior. 
Getting 100 experiments polished and playing nicely with each other is much harder than running 100 experiments; we can sort out the polish <em>after</em> we figure out which ideas are good ones.</p>\n<p>So here are \"five\" Raku experiments you could imagine seeing in another programming language. If you squint.</p>\n<h3><a href=\"https://docs.raku.org/type/Junction\" target=\"_blank\">Junctions</a></h3>\n<p>Junctions are \"superpositions of possible values\". Applying an operation to a junction instead applies it to every value inside the junction.  </p>\n<div class=\"codehilite\"><pre><span></span><code>> <span class=\"mi\">2</span><span class=\"o\">|</span><span class=\"mi\">10</span>\n<span class=\"nb\">any</span>(<span class=\"mi\">2</span>, <span class=\"mi\">10</span>)\n\n> <span class=\"mi\">2</span><span class=\"nv\">&10</span> + <span class=\"mi\">3</span>\n<span class=\"nb\">all</span>(<span class=\"mi\">5</span>, <span class=\"mi\">13</span>)\n\n>(<span class=\"mi\">1</span><span class=\"nv\">&2</span>) + (<span class=\"mi\">10</span><span class=\"o\">^</span><span class=\"mi\">20</span>)\n<span class=\"nb\">all</span>(<span class=\"nb\">one</span>(<span class=\"mi\">11</span>, <span class=\"mi\">21</span>), <span class=\"nb\">one</span>(<span class=\"mi\">12</span>, <span class=\"mi\">22</span>))\n</code></pre></div>\n<p>As you can probably tell from the <code>all</code>s and <code>any</code>s, junctions are a feature meant for representing boolean formula. 
There's no way to destructure a junction, and the only way to use it is to collapse it to a boolean first.</p>\n<div class=\"codehilite\"><pre><span></span><code>> (<span class=\"mi\">1</span><span class=\"nv\">&2</span>) + (<span class=\"mi\">10</span><span class=\"o\">^</span><span class=\"mi\">20</span>) < <span class=\"mi\">15</span>\n<span class=\"nb\">all</span>(<span class=\"nb\">one</span>(<span class=\"nb\">True</span>, <span class=\"nb\">False</span>), <span class=\"nb\">one</span>(<span class=\"nb\">True</span>, <span class=\"nb\">False</span>))\n\n<span class=\"c1\"># so coerces junctions to booleans</span>\n> <span class=\"nb\">so</span> (<span class=\"mi\">1</span><span class=\"nv\">&2</span>) + (<span class=\"mi\">10</span><span class=\"o\">^</span><span class=\"mi\">20</span>) < <span class=\"mi\">15</span>\n<span class=\"nb\">True</span>\n\n> <span class=\"nb\">so</span> (<span class=\"mi\">1</span><span class=\"nv\">&2</span>) + (<span class=\"mi\">10</span><span class=\"o\">^</span><span class=\"mi\">20</span>) > <span class=\"mi\">0</span>\n<span class=\"nb\">False</span>\n\n> <span class=\"mi\">16</span> %% (<span class=\"mi\">3</span><span class=\"nv\">&5</span>) ?? <span class=\"s\">\"fizzbuzz\"</span> !! *\n*\n</code></pre></div>\n<p>The real interesting thing for me is how Raku elegantly uses junctions to represent quantifiers. In most languages, you either have the function <code>all(list[T], T -> bool)</code> or the method <code>[T].all(T -> bool)</code>, both of which apply the test to every element of the list. In Raku, though, <code>list.all</code> doesn't take <em>anything</em>, it's just a niladic method that turns the list into a junction. 
</p>\n<div class=\"codehilite\"><pre><span></span><code>> <span class=\"k\">my</span> <span class=\"nv\">$x</span> = <span class=\"s\"><1 2 3></span>.<span class=\"nb\">all</span>\n<span class=\"nb\">all</span>(<span class=\"mi\">1</span>, <span class=\"mi\">2</span>, <span class=\"mi\">3</span>)\n> <span class=\"nb\">is-prime</span>(<span class=\"nv\">$x</span>)\n<span class=\"nb\">all</span>(<span class=\"nb\">False</span>, <span class=\"nb\">True</span>, <span class=\"nb\">True</span>)\n</code></pre></div>\n<p>This means we can combine junctions. If Raku didn't already have a <code>unique</code> method, we could build it by saying \"are all elements equal to exactly one element?\"</p>\n<div class=\"codehilite\"><pre><span></span><code>> <span class=\"nb\">so</span> {.<span class=\"nb\">all</span> == .<span class=\"nb\">one</span>}(<span class=\"s\"><1 2 3 7></span>)\n<span class=\"nb\">True</span>\n\n> <span class=\"nb\">so</span> {.<span class=\"nb\">all</span> == .<span class=\"nb\">one</span>}(<span class=\"s\"><1 2 3 7 2></span>)\n<span class=\"nb\">False</span>\n</code></pre></div>\n<h3><a href=\"https://docs.raku.org/type/Whatever\" target=\"_blank\">Whatevers</a></h3>\n<p><code>*</code> is the \"whatever\" symbol and has a lot of different roles in Raku.<sup id=\"fnref:analogs\"><a class=\"footnote-ref\" href=\"#fn:analogs\">2</a></sup> Some functions and operators have special behavior when passed a <code>*</code>. In a range or sequence, <code>*</code> means \"unbound\".</p>\n<div class=\"codehilite\"><pre><span></span><code>> <span class=\"mi\">1</span>..*\n<span class=\"mi\">1</span><span class=\"o\">..</span><span class=\"n\">Inf</span>\n\n> (<span class=\"mi\">2</span>,<span class=\"mi\">4</span>,<span class=\"mi\">8</span>...*)[<span class=\"mi\">17</span>]\n<span class=\"mi\">262144</span>\n</code></pre></div>\n<p>The main built-in use, though, is that expressions with <code>*</code> are lifted into anonymous functions. 
This is called \"whatever-priming\" and produces a <code>WhateverCode</code>, which is indistinguishable from other functions except for the type.</p>\n<div class=\"codehilite\"><pre><span></span><code>> {<span class=\"nv\">$_</span> + <span class=\"mi\">10</span>}(<span class=\"mi\">2</span>)\n<span class=\"mi\">12</span>\n\n> (* + <span class=\"mi\">10</span>)(<span class=\"mi\">2</span>)\n<span class=\"mi\">12</span>\n\n> (^<span class=\"mi\">10</span>).<span class=\"n\">map</span>(* % <span class=\"mi\">2</span>)\n(<span class=\"mi\">0</span> <span class=\"mi\">1</span> <span class=\"mi\">0</span> <span class=\"mi\">1</span> <span class=\"mi\">0</span> <span class=\"mi\">1</span> <span class=\"mi\">0</span> <span class=\"mi\">1</span> <span class=\"mi\">0</span> <span class=\"mi\">1</span>)\n</code></pre></div>\n<p>There's actually a bit of weird behavior here: if <em>two</em> whatevers appear in the expression, they become separate positional variables. <code>(2, 30, 4, 50).map(* + *)</code> returns <code>(32, 54)</code>. This makes it easy to express <a href=\"https://docs.raku.org/language/operators#infix_...\" target=\"_blank\">a tricky Fibonacci definition</a> but otherwise I don't see how it's better than making each <code>*</code> the same value.</p>\n<p>Regardless, priming is useful because <em>so many</em> Raku methods are overloaded to take functions. You get the last element of a list with <code>l[*-1]</code>. This <em>looks</em> like standard negative-index syntax, but what actually happens is that when <code>[]</code> is passed a function, it passes in list length and looks up the result. So if the list has 10 elements, <code>l[*-1] = l[10-1] = l[9]</code>, aka the last element. Similarly, <code>l.head(2)</code> is the first two elements of a list, <code>l.head(*-2)</code> is all-but-the-last-two.</p>\n<p>We can pass other functions to <code>[]</code>, which e.g. 
makes implementing ring buffers easy.</p>\n<div class=\"codehilite\"><pre><span></span><code>> <span class=\"k\">my</span> <span class=\"nv\">@x</span> = ^<span class=\"mi\">10</span>\n[<span class=\"mi\">0</span> <span class=\"mi\">1</span> <span class=\"mi\">2</span> <span class=\"mi\">3</span> <span class=\"mi\">4</span> <span class=\"mi\">5</span> <span class=\"mi\">6</span> <span class=\"mi\">7</span> <span class=\"mi\">8</span> <span class=\"mi\">9</span>]\n\n> <span class=\"nv\">@x</span>[<span class=\"mi\">95</span> % *]--; <span class=\"nv\">@x</span>\n[<span class=\"mi\">0</span> <span class=\"mi\">1</span> <span class=\"mi\">2</span> <span class=\"mi\">3</span> <span class=\"mi\">4</span> <span class=\"mi\">4</span> <span class=\"mi\">6</span> <span class=\"mi\">7</span> <span class=\"mi\">8</span> <span class=\"mi\">9</span>]\n</code></pre></div>\n<h3><a href=\"https://docs.raku.org/language/regexes\" target=\"_blank\">Regular Expressions</a></h3>\n<p>There are two basic standards for regexes: POSIX regexes and Perl-compatible regexes (PCRE). POSIX regexes are a terrible mess of backslashes and punctuation. PCRE is backwards compatible with POSIX and is a more terrible mess of backslashes and punctuation. Most languages follow the PCRE standard, but Perl 6 breaks backwards compatibility with an entirely new regex syntax. </p>\n<p>The most obvious improvement: <a href=\"https://docs.raku.org/language/regexes#Subrules\" target=\"_blank\">composability</a>. In most languages you \"combine\" two regexes by concatenating their strings together, which is terrible for many, many reasons. Raku has the standard \"embed another regex\" syntax: <code>/< foo >+/</code> matches one-or-more of the <code>foo</code> regex without <code>foo</code> \"leaking\" into the top regex. </p>\n<p>This already does a lot to make regexes more tractable: you can break a complicated regular expression down into simpler and more legible parts. 
And in fact this is how Raku supports <a href=\"https://docs.raku.org/language/grammars\" target=\"_blank\">parsing grammars</a> as a builtin language feature. I've only used grammars once but it <a href=\"https://www.hillelwayne.com/post/picat/\" target=\"_blank\">was quite helpful</a>.</p>\n<p>Since we're breaking backwards compatibility anyway, we can now add lots of small QOLs. There's a <a href=\"https://docs.raku.org/language/regexes#Modified_quantifier:_%,_%%\" target=\"_blank\">value separator</a> modifier: <code>\\d+ % ','</code> matches <code>1</code> / <code>1,2</code> / <code>1,1,4</code> but not <code>1,</code> or <code>12</code>. <a href=\"https://docs.raku.org/language/regexes#Lookaround_assertions\" target=\"_blank\">Lookaheads</a> and non-capturing groups aren't nonsense glyphs. <code>r1 && r2</code> only matches strings that match <em>both</em> <code>r1</code> and <code>r2</code>. Backtracking can be stopped with <a href=\"https://docs.raku.org/language/regexes#Preventing_backtracking:_:\" target=\"_blank\">:</a>. Whitespace is ignored by default and has to be explicitly enabled in match patterns.</p>\n<p>There's more stuff Raku does with actually <em>processing</em> regular expressions, but the regex notation is something that might actually appear in another language someday. </p>\n<p style=\"height:16px; margin:0px !important;\"></p>\n<h3><a href=\"https://docs.raku.org/language/operators#Hyper_operators\" target=\"_blank\">Hyperoperators</a></h3>\n<p>This is a small one compared to the other features, but it's also the thing I miss most often in other languages. 
The most basic form <code>l>>.method</code> is basically equivalent to <code>map</code>, except it also recursively descends into sublists.</p>\n<div class=\"codehilite\"><pre><span></span><code>> [<span class=\"mi\">1</span>, [<span class=\"mi\">2</span>, <span class=\"mi\">3</span>], <span class=\"mi\">4</span>]>>.<span class=\"nb\">succ</span>\n[<span class=\"mi\">2</span> [<span class=\"mi\">3</span> <span class=\"mi\">4</span>] <span class=\"mi\">5</span>]\n</code></pre></div>\n<p>This is more useful than it looks because any function call <code>f(list, *args)</code> can be rewritten in \"method form\" <code>list.&f(*args)</code>, so <code>>>.</code> becomes the generalized mapping operator. You can use it with whatevers, too.</p>\n<div class=\"codehilite\"><pre><span></span><code>> [<span class=\"mi\">1</span>, [<span class=\"mi\">2</span>, <span class=\"mi\">3</span>], <span class=\"mi\">4</span>]>>.&(*+<span class=\"mi\">1</span>)\n[<span class=\"mi\">2</span> [<span class=\"mi\">3</span> <span class=\"mi\">4</span>] <span class=\"mi\">5</span>]\n</code></pre></div>\n<p>Anyway, the more generalized <em>binary</em> hyperoperator <code>l1 << op >> l2</code><sup id=\"fnref:spaces\"><a class=\"footnote-ref\" href=\"#fn:spaces\">3</a></sup> applies <code>op</code> elementwise to the two lists, looping the shorter list until the longer list is exhausted. <code>>>op>></code> / <code><< op<<</code> are the same except they instead loop until the lhs/rhs list is exhausted. 
Whew!</p>\n<div class=\"codehilite\"><pre><span></span><code>> [<span class=\"mi\">1</span>, <span class=\"mi\">2</span>, <span class=\"mi\">3</span>, <span class=\"mi\">4</span>, <span class=\"mi\">5</span>] <span class=\"s\"><<+></span>> [<span class=\"mi\">10</span>, <span class=\"mi\">20</span>]\n[<span class=\"mi\">11</span> <span class=\"mi\">22</span> <span class=\"mi\">13</span> <span class=\"mi\">24</span> <span class=\"mi\">15</span>]\n\n> [<span class=\"mi\">1</span>, <span class=\"mi\">2</span>, <span class=\"mi\">3</span>, <span class=\"mi\">4</span>, <span class=\"mi\">5</span>] <span class=\"s\"><<+<< [10, 20]</span>\n<span class=\"s\">[11 22]</span>\n\n<span class=\"s\">> [1, 2, 3, 4, 5] >></span>+>> [<span class=\"mi\">10</span>, <span class=\"mi\">20</span>]\n[<span class=\"mi\">11</span> <span class=\"mi\">22</span> <span class=\"mi\">13</span> <span class=\"mi\">24</span> <span class=\"mi\">15</span>]\n\n<span class=\"c1\"># Also works with single values</span>\n> [<span class=\"mi\">1</span>, <span class=\"mi\">2</span>, <span class=\"mi\">3</span>, <span class=\"mi\">4</span>, <span class=\"mi\">5</span>] <span class=\"s\"><<+></span>> <span class=\"mi\">10</span>\n[<span class=\"mi\">11</span> <span class=\"mi\">12</span> <span class=\"mi\">13</span> <span class=\"mi\">14</span> <span class=\"mi\">15</span>]\n\n<span class=\"c1\"># Does weird things with nested lists too</span>\n> [<span class=\"mi\">1</span>, [<span class=\"mi\">2</span>, <span class=\"mi\">3</span>], <span class=\"mi\">4</span>, <span class=\"mi\">5</span>] <span class=\"s\"><<+></span>> [<span class=\"mi\">10</span>, <span class=\"mi\">20</span>]\n[<span class=\"mi\">11</span> [<span class=\"mi\">22</span> <span class=\"mi\">23</span>] <span class=\"mi\">14</span> <span class=\"mi\">25</span>]\n</code></pre></div>\n<p>Also for some reason the hyperoperators have separate behaviors on two hashes, either applying <code>op</code> to the union/intersection/hash difference. 
</p>\n<p>Anyway it's a super weird (meta)operator but it's also quite useful! It's the closest thing I've seen to <a href=\"https://hillelwayne.com/post/j-notation/\" target=\"_blank\">J verbs</a> outside an APL. I like using it to run the same formula on multiple possible inputs at once.</p>\n<div class=\"codehilite\"><pre><span></span><code>(<span class=\"mi\">20</span> * <span class=\"mi\">10</span> <span class=\"s\"><<-></span>> (<span class=\"mi\">21</span>, <span class=\"mi\">24</span>)) <span class=\"s\"><<*></span>> (<span class=\"mi\">10</span>, <span class=\"mi\">100</span>)\n(<span class=\"mi\">1790</span> <span class=\"mi\">17600</span>)\n</code></pre></div>\n<p>Incidentally, it's called the hyperoperator because it evaluates all of the operations in parallel. Explicit loops can be parallelized by prefixing them with <a href=\"https://docs.raku.org/language/statement-prefixes#hyper,_race\" target=\"_blank\"><code>hyper</code></a>.</p>\n<h3><a href=\"https://docs.raku.org/type/Pair\" target=\"_blank\">Pair Syntax</a></h3>\n<p>I've talked about pairs a little in <a href=\"https://buttondown.com/hillelwayne/archive/unusual-basis-types-in-programming-languages/\" target=\"_blank\">this newsletter</a>, but the gist is that Raku hashes are composed of a set of pairs <code>key => value</code>. The pair is the basis type, the hash is the collection of pairs. 
There's also a <em>ton</em> of syntactic sugar for concisely specifying pairs via \"colon syntax\":</p>\n<div class=\"codehilite\"><pre><span></span><code>> <span class=\"k\">my</span> <span class=\"nv\">$x</span> = <span class=\"mi\">3</span>; :<span class=\"nv\">$x</span>\n<span class=\"nb\">x</span> => <span class=\"mi\">3</span>\n\n> :<span class=\"n\">a</span><span class=\"s\"><$x></span>\n<span class=\"n\">a</span> => <span class=\"s\">\"$x\"</span>\n\n> :<span class=\"n\">a</span>(<span class=\"nv\">$x</span>)\n<span class=\"n\">a</span> => <span class=\"mi\">3</span>\n\n> :<span class=\"mi\">3</span><span class=\"n\">a</span>\n<span class=\"n\">a</span> => <span class=\"mi\">3</span>\n</code></pre></div>\n<p>The most important sugars are <code>:key</code> and <code>:!key</code>, which map to <code>key => True</code> and <code>key => False</code>. This is a really elegant way to add flags to methods! Take the definition of <a href=\"https://docs.raku.org/type/Str#method_match\" target=\"_blank\">match</a>:</p>\n<div class=\"codehilite\"><pre><span></span><code><span class=\"k\">method</span> <span class=\"nb\">match</span>(<span class=\"nv\">$pat</span>, \n    :<span class=\"n\">continue</span>(:<span class=\"nv\">$c</span>), :<span class=\"n\">pos</span>(:<span class=\"nv\">$p</span>), :<span class=\"n\">global</span>(:<span class=\"nv\">$g</span>), \n    :<span class=\"n\">overlap</span>(:<span class=\"nv\">$ov</span>), :<span class=\"n\">exhaustive</span>(:<span class=\"nv\">$ex</span>), \n    :<span class=\"n\">st</span>(:<span class=\"nv\">$nd</span>), :<span class=\"n\">rd</span>(:<span class=\"nv\">$th</span>), :<span class=\"nv\">$nth</span>, :<span class=\"nv\">$x</span> --> <span class=\"nb\">Match</span>)\n</code></pre></div>\n<p>Probably should also mention that in a definition, <code>:f(:$foo)</code> defines the parameter <code>$foo</code> but <a href=\"https://docs.raku.org/language/signatures#Argument_aliases\" target=\"_blank\">also aliases 
it</a> to <code>:f</code>, so you can set the flag with <code>:f</code> or <code>:foo</code>. Colon-pairs defined in the signature can be passed in anywhere, or even stuck together:</p>\n<div class=\"codehilite\"><pre><span></span><code>> <span class=\"s\">\"abab\"</span>.<span class=\"nb\">match</span>(<span class=\"sr\">/../</span>)\n「<span class=\"n\">ab</span>」\n> <span class=\"s\">\"abab\"</span>.<span class=\"nb\">match</span>(<span class=\"sr\">/../</span>, :<span class=\"n\">g</span>)\n(「<span class=\"n\">ab</span>」 「<span class=\"n\">ab</span>」)\n> <span class=\"s\">\"abab\"</span>.<span class=\"nb\">match</span>(<span class=\"sr\">/../</span>, :<span class=\"n\">g</span>, :<span class=\"n\">ov</span>)\n(「<span class=\"n\">ab</span>」 「<span class=\"n\">ba</span>」 「<span class=\"n\">ab</span>」)\n\n<span class=\"c1\"># Out of order stuck together</span>\n> <span class=\"s\">\"abab\"</span>.<span class=\"nb\">match</span>(:<span class=\"n\">g:ov</span>,<span class=\"sr\"> /../</span>)\n(「<span class=\"n\">ab</span>」 「<span class=\"n\">ba</span>」 「<span class=\"n\">ab</span>」)\n</code></pre></div>\n<p>So that leads to extremely concise method configuration. Definitely beats <code>match(global=True, overlap=True)</code>!</p>\n<p>And for some reason you can place keyword arguments <em>after</em> the function call:</p>\n<div class=\"codehilite\"><pre><span></span><code>> <span class=\"s\">\"abab\"</span>.<span class=\"nb\">match</span>(:<span class=\"n\">g</span>,<span class=\"sr\"> /../</span>):<span class=\"n\">ov:2nd</span>\n「<span class=\"n\">ba</span>」\n</code></pre></div>\n<h2>The next-gen lab: Slangs and RakuAST</h2>\n<p>These are features I have no experience in and <em>certainly</em> are not making their way into other languages, but they really expand the explorable space of new features. <a href=\"https://raku.land/zef:lizmat/Slangify\" target=\"_blank\">Slangs</a> are modifications to the Raku syntax. 
This can be used for things like <a href=\"https://raku.land/zef:elcaro/Slang::Otherwise\" target=\"_blank\">modifying loop syntax</a>, <a href=\"https://raku.land/zef:raku-community-modules/Slang::Piersing\" target=\"_blank\">changing identifiers</a>, or adding <a href=\"https://raku.land/zef:raku-community-modules/OO::Actors\" target=\"_blank\">actors</a> or <a href=\"https://raku.land/github:MattOates/BioInfo\" target=\"_blank\">DNA sequences</a> to the base language.</p>\n<p>I <em>barely</em> understand <a href=\"https://dev.to/lizmat/rakuast-for-early-adopters-576n\" target=\"_blank\">RakuAST</a>. I <em>think</em> the idea is that all Raku expressions can be parsed as an AST from inside Raku itself.</p>\n<div class=\"codehilite\"><pre><span></span><code>> <span class=\"s\">Q/my $x; $x++/</span>.<span class=\"nb\">AST</span>\n<span class=\"n\">RakuAST::StatementList</span>.<span class=\"nb\">new</span>(\n  <span class=\"n\">RakuAST::Statement::Expression</span>.<span class=\"nb\">new</span>(\n    <span class=\"n\">expression</span> => <span class=\"n\">RakuAST::VarDeclaration::Simple</span>.<span class=\"nb\">new</span>(\n      <span class=\"nb\">sigil</span>       => <span class=\"s\">\"\\$\"</span>,\n      <span class=\"n\">desigilname</span> => <span class=\"n\">RakuAST::Name</span>.<span class=\"n\">from-identifier</span>(<span class=\"s\">\"x\"</span>)\n    )\n  ),\n  <span class=\"n\">RakuAST::Statement::Expression</span>.<span class=\"nb\">new</span>(\n    <span class=\"n\">expression</span> => <span class=\"n\">RakuAST::ApplyPostfix</span>.<span class=\"nb\">new</span>(\n      <span class=\"n\">operand</span> => <span class=\"n\">RakuAST::Var::Lexical</span>.<span class=\"nb\">new</span>(<span class=\"s\">\"\\$x\"</span>),\n      <span class=\"nb\">postfix</span> => <span class=\"n\">RakuAST::Postfix</span>.<span class=\"nb\">new</span>(<span class=\"s\">\"++\"</span>)\n    )\n  )\n)\n</code></pre></div>\n<p>This allows for things like writing Raku in 
different languages:</p>\n<div class=\"codehilite\"><pre><span></span><code><span class=\"nb\">say</span> <span class=\"s\">Q/my $x; put $x/</span>.<span class=\"nb\">AST</span>.<span class=\"n\">DEPARSE</span>(<span class=\"s\">\"NL\"</span>)\n<span class=\"n\">mijn</span> <span class=\"nv\">$x</span>;\n<span class=\"n\">zeg-het</span> <span class=\"nv\">$x</span>\n</code></pre></div>\n<h3>Bonus experiment</h3>\n<p>Raku comes with a \"<a href=\"https://rakudo.org/star\" target=\"_blank\">Rakudo Star</a>\" installation, which comes with a set of <a href=\"https://github.com/rakudo/star/blob/master/etc/modules.txt\" target=\"_blank\">blessed third party modules</a> preinstalled. I love this! It's a great compromise between the maintainer burdens of a large standard library and the user burdens of making everybody find the right packages in the ecosystem.</p>\n<hr/>\n<h2>Blog Rec</h2>\n<p>Feel obligated to recommend some Raku blogs! Elizabeth Mattijsen posts <a href=\"https://dev.to/lizmat\" target=\"_blank\">a ton of stuff</a> to dev.to about Raku internals. <a href=\"https://www.codesections.com/blog/\" target=\"_blank\">Codesections</a> has a pretty good blog; he's the person who eventually got me to try out Raku. Finally, the <a href=\"https://raku-advent.blog/\" target=\"_blank\">Raku Advent Calendar</a> is a great dive into advanced Raku techniques. 
Bad news is it only updates once a year, good news is it's 25 updates that once a year.</p>\n<div class=\"footnote\">\n<hr/>\n<ol>\n<li id=\"fn:release-notes\">\n<ul>\n<li>All techniques chapters now have a \"Further Reading\" section</li>\n<li>\"System modeling\" chapter significantly rewritten</li>\n<li>\"Conditionals\" chapter expanded, now a real chapter</li>\n<li>\"Logic Programming\" chapter now covers datalog, deductive databases</li>\n<li>\"Solvers\" chapter has diagram explaining problem</li>\n<li>Eight new exercises</li>\n<li>Tentative front cover (will probably change)</li>\n<li>Fixed some epub issues with math rendering</li>\n</ul>\n<p><a class=\"footnote-backref\" href=\"#fnref:release-notes\" title=\"Jump back to footnote 1 in the text\">↩</a></p>\n</li>\n<li id=\"fn:analogs\">\n<p>Analogues are <a href=\"https://stackoverflow.com/questions/8000903/what-are-all-the-uses-of-an-underscore-in-scala/8001065#8001065\" target=\"_blank\">Scala's underscore</a>, except unlike Scala it's a value and not syntax, and like Python's <a href=\"https://docs.python.org/3/library/constants.html#Ellipsis\" target=\"_blank\">Ellipses</a>, except it has additional semantics. <a class=\"footnote-backref\" href=\"#fnref:analogs\" title=\"Jump back to footnote 2 in the text\">↩</a></p>\n</li>\n<li id=\"fn:spaces\">\n<p>Spaces added so buttondown doesn't think they're tags <a class=\"footnote-backref\" href=\"#fnref:spaces\" title=\"Jump back to footnote 3 in the text\">↩</a></p>\n</li>\n</ol>\n</div>",
          "url": "https://buttondown.com/hillelwayne/archive/five-unusual-raku-features/",
          "published": "2024-11-12T20:06:55.000Z",
          "updated": "2024-11-12T20:06:55.000Z",
          "content": null,
          "image": null,
          "media": [],
          "authors": [],
          "categories": []
        },
        {
          "id": "https://buttondown.com/hillelwayne/archive/a-list-of-ternary-operators/",
          "title": "A list of ternary operators",
          "description": "<p>Sup nerds, I'm back from SREcon! I had a blast, despite knowing nothing about site reliability engineering and being way over my head in half the talks. I'm trying to catch up on <a href=\"https://leanpub.com/logic/\" target=\"_blank\">The Book</a> and contract work now so I'll do something silly here: ternary operators.</p>\n<p>Almost all operations on values in programming languages fall into one of three buckets: </p>\n<ol>\n<li><strong>Unary operators</strong>, where the operator goes <em>before</em> or <em>after</em> exactly one argument. Examples are <code>x++</code> and <code>-y</code> and <code>!bool</code>. Most languages have a few critical unary operators hardcoded into the grammar. They are almost always symbols, but sometimes are string-identifiers (<code>not</code>).</li>\n<li><strong>Binary operators</strong>, which are placed <em>between</em> exactly two arguments. Things like <code>+</code> or <code>&&</code> or <code>>=</code>. Languages have a lot more of these than unary operators, because there's more fundamental things we want to do with two values than one value. These can be symbols or identifiers (<code>and</code>).</li>\n<li>Functions/methods that <em>prefix</em> any number of arguments. <code>func(a, b, c)</code>, <code>obj.method(a, b, c, d)</code>, anything in a lisp. These are how we extend the language, and they almost-exclusively use identifiers and not symbols.<sup id=\"fnref:lisp\"><a class=\"footnote-ref\" href=\"#fn:lisp\">1</a></sup></li>\n</ol>\n<p>There's one widespread exception to this categorization: the <strong>ternary operator</strong> <code>bool ? x : y</code>.<sup id=\"fnref:ternary\"><a class=\"footnote-ref\" href=\"#fn:ternary\">2</a></sup> It's an infix operator that takes exactly <em>three</em> arguments and can't be decomposed into two sequential binary operators. <code>bool ? x</code> makes no sense on its own, nor does <code>x : y</code>. 
</p>\n<p>Other ternary operators are <em>extremely</em> rare, which is why conditional expressions got to monopolize the name \"ternary\". But I like how exceptional they are and want to compile some of them. A long long time ago I asked <a href=\"https://twitter.com/hillelogram/status/1378509881498603527\" target=\"_blank\">Twitter</a> for other ternary operators; this is a compilation of some applicable responses plus my own research.</p>\n<p>(Most of these are a <em>bit</em> of a stretch.)</p>\n<h3>Stepped Ranges</h3>\n<p>Many languages have some kind of \"stepped range\" function:</p>\n<div class=\"codehilite\"><pre><span></span><code><span class=\"c1\"># Python</span>\n<span class=\"o\">>>></span> <span class=\"nb\">list</span><span class=\"p\">(</span><span class=\"nb\">range</span><span class=\"p\">(</span><span class=\"mi\">1</span><span class=\"p\">,</span> <span class=\"mi\">10</span><span class=\"p\">,</span> <span class=\"mi\">2</span><span class=\"p\">))</span>\n<span class=\"p\">[</span><span class=\"mi\">1</span><span class=\"p\">,</span> <span class=\"mi\">3</span><span class=\"p\">,</span> <span class=\"mi\">5</span><span class=\"p\">,</span> <span class=\"mi\">7</span><span class=\"p\">,</span> <span class=\"mi\">9</span><span class=\"p\">]</span>\n</code></pre></div>\n<p>There's the \"base case\" of start and endpoints, and an optional step. 
Many languages have a binary infix op for the base case, but a few also have a ternary for the optional step:</p>\n<div class=\"codehilite\"><pre><span></span><code># Frink\n> map[{|a| a*2}, (1 to 100 step 15) ] \n[2, 32, 62, 92, 122, 152, 182]\n\n# Elixir\n> IO.puts Enum.join(1..10//2, \" \")\n1 3 5 7 9\n</code></pre></div>\n<p>This isn't decomposable into two binary ops because you can't assign the range to a value and then step the value later.</p>\n<h3>Graph ops</h3>\n<p>In <a href=\"https://graphviz.org/\" target=\"_blank\">Graphviz</a>, a basic edge between two nodes is either the binary <code>node1 -> node2</code> or the ternary <code>node1 -> node2 [edge_props]</code>:</p>\n<div class=\"codehilite\"><pre><span></span><code><span class=\"k\">digraph</span><span class=\"w\"> </span><span class=\"nt\">G</span><span class=\"w\"> </span><span class=\"p\">{</span>\n<span class=\"w\">  </span><span class=\"nt\">a1</span><span class=\"w\"> </span><span class=\"o\">-></span><span class=\"w\"> </span><span class=\"nt\">a2</span><span class=\"w\"> </span><span class=\"p\">[</span><span class=\"na\">color</span><span class=\"p\">=</span><span class=\"s2\">\"green\"</span><span class=\"p\">]</span>\n<span class=\"p\">}</span>\n</code></pre></div>\n<p><img alt=\"Output of the above graphviz\" class=\"newsletter-image\" src=\"https://assets.buttondown.email/images/d1a0f894-59d5-45d3-8702-967e94672371.png?w=960&fit=max\"/></p>\n<p>Graphs seem ternary-friendly because there are three elements involved with any graph connection: the two nodes and the connecting edge. 
So you also see ternaries in some graph database query languages, with separate places to specify each node and the edge.</p>\n<div class=\"codehilite\"><pre><span></span><code># GSQL (https://docs.tigergraph.com/gsql-ref/4.1/tutorials/gsql-101/parameterized-gsql-query)\nSELECT tgt\n    FROM start:s -(Friendship:e)- Person:tgt;\n\n# Cypher (https://neo4j.com/docs/cypher-manual/current/introduction/cypher-overview/)\nMATCH (actor:Actor)-[:ACTED_IN]->(movie:Movie {title: 'The Matrix'})\n</code></pre></div>\n<p>Obligatory plug for my <a href=\"https://www.hillelwayne.com/post/graph-types/\" target=\"_blank\">graph datatype essay</a>.</p>\n<h3>Metaoperators</h3>\n<p>Both <a href=\"https://raku.org/\" target=\"_blank\">Raku</a> and <a href=\"https://www.jsoftware.com/#/README\" target=\"_blank\">J</a> have special higher-order functions that apply to binary infixes. Raku calls them <em>metaoperators</em>, while J calls them <em>adverbs</em> and <em>conjunctions</em>.</p>\n<div class=\"codehilite\"><pre><span></span><code><span class=\"c1\"># Raku</span>\n\n<span class=\"c1\"># `a «op» b` is map, \"cycling\" shorter list</span>\n<span class=\"nb\">say</span> <span class=\"s\"><10 20 30></span> «+» <span class=\"s\"><4 5></span>\n(<span class=\"mi\">14</span> <span class=\"mi\">25</span> <span class=\"mi\">34</span>)\n\n<span class=\"c1\"># `a Rop b` is `b op a`</span>\n<span class=\"nb\">say</span> <span class=\"mi\">2</span> <span class=\"n\">R-</span> <span class=\"mi\">3</span>\n<span class=\"mi\">1</span>\n</code></pre></div>\n<div class=\"codehilite\"><pre><span></span><code><span class=\"c1\">NB. J</span>\n\n<span class=\"c1\">NB. 
x f/ y creates a \"table\" of x f y</span>\n<span class=\"w\">   </span><span class=\"mi\">1</span><span class=\"w\"> </span><span class=\"mi\">2</span><span class=\"w\"> </span><span class=\"o\">+/</span><span class=\"w\"> </span><span class=\"mi\">10</span><span class=\"w\"> </span><span class=\"mi\">20</span>\n<span class=\"mi\">11</span><span class=\"w\"> </span><span class=\"mi\">21</span>\n<span class=\"mi\">12</span><span class=\"w\"> </span><span class=\"mi\">22</span>\n</code></pre></div>\n<p>The Raku metaoperators are closer to what I'm looking for, since I don't think you can assign the \"created operator\" directly to a callable variable. J lets you, though!</p>\n<div class=\"codehilite\"><pre><span></span><code><span class=\"w\">   </span><span class=\"nv\">h</span><span class=\"w\"> </span><span class=\"o\">=:</span><span class=\"w\"> </span><span class=\"o\">+/</span>\n<span class=\"w\">   </span><span class=\"mi\">1</span><span class=\"w\"> </span><span class=\"mi\">2</span><span class=\"w\"> </span><span class=\"nv\">h</span><span class=\"w\"> </span><span class=\"mi\">3</span><span class=\"w\"> </span><span class=\"mi\">4</span>\n<span class=\"mi\">4</span><span class=\"w\"> </span><span class=\"mi\">5</span>\n<span class=\"mi\">5</span><span class=\"w\"> </span><span class=\"mi\">6</span>\n</code></pre></div>\n<p>That said, J has some \"decomposable\" ternaries that feel <em>spiritually</em> like ternaries, like <a href=\"https://code.jsoftware.com/wiki/Vocabulary/curlyrt#dyadic\" target=\"_blank\">amend</a> and <a href=\"https://code.jsoftware.com/wiki/Vocabulary/fcap\" target=\"_blank\">fold</a>. It also has a special ternary-ish construct called the \"fork\".<sup id=\"fnref:ternaryish\"><a class=\"footnote-ref\" href=\"#fn:ternaryish\">3</a></sup> <code>x (f g h) y</code> is parsed as <code>(x f y) g (x h y)</code>:</p>\n<div class=\"codehilite\"><pre><span></span><code><span class=\"c1\">NB. 
Max - min</span>\n<span class=\"w\">   </span><span class=\"mi\">5</span><span class=\"w\"> </span><span class=\"p\">(</span><span class=\"o\">>.</span><span class=\"w\"> </span><span class=\"o\">-</span><span class=\"w\"> </span><span class=\"o\"><.</span><span class=\"p\">)</span><span class=\"w\"> </span><span class=\"mi\">2</span>\n<span class=\"mi\">3</span>\n<span class=\"w\">   </span><span class=\"mi\">2</span><span class=\"w\"> </span><span class=\"p\">(</span><span class=\"o\">>.</span><span class=\"w\"> </span><span class=\"o\">-</span><span class=\"w\"> </span><span class=\"o\"><.</span><span class=\"p\">)</span><span class=\"w\"> </span><span class=\"mi\">5</span>\n<span class=\"mi\">3</span>\n</code></pre></div>\n<p>So at the top level that's just a binary operator, but the binary op is constructed via a ternary op. That's pretty cool IMO.</p>\n<h3>Assignment Ternaries</h3>\n<p>Bob Nystrom points out that in many languages, <code>a[b] = c</code> is a ternary operation: it is <em>not</em> the same as <code>x = a[b]; x = c</code>.</p>\n<p>A weirder case shows up in <a href=\"https://github.com/betaveros/noulith/\" target=\"_blank\">Noulith</a> and Raku (again): update operators. Most languages have the <code>+=</code> <em>binary operator</em>, these two have the <code>f=</code> <em>ternary operator</em>. 
<code>a f= b</code> is the same as <code>a = f(a, b)</code>.</p>\n<div class=\"codehilite\"><pre><span></span><code><span class=\"c1\"># Raku</span>\n> <span class=\"k\">my</span> <span class=\"nv\">$x</span> = <span class=\"mi\">2</span>; <span class=\"nv\">$x</span> <span class=\"nb\">max</span>= <span class=\"mi\">3</span>; <span class=\"nb\">say</span> <span class=\"nv\">$x</span>\n<span class=\"mi\">3</span>\n</code></pre></div>\n<p>Arguably this is just syntactic sugar, but I don't think it's decomposable into binary operations.</p>\n<h3>Custom user ternaries</h3>\n<p>Tikhon Jelvis pointed out that <a href=\"https://agda.readthedocs.io/en/v2.7.0.1/language/mixfix-operators.html\" target=\"_blank\">Agda</a> lets you define <em>custom</em> mixfix operators, which can be ternary or even tetranary or pentanary. I later found out that <a href=\"https://docs.racket-lang.org/mixfix/index.html\" target=\"_blank\">Racket</a> has this, too. <a href=\"https://developer.apple.com/library/archive/documentation/Cocoa/Conceptual/ProgrammingWithObjectiveC/Introduction/Introduction.html\" target=\"_blank\">Objective-C</a> <em>looks</em> like this, too, but feels different somehow. </p>\n<h3>Near Misses</h3>\n<p>All of these are arguable, I've just got to draw a line in the sand <em>somewhere</em>.</p>\n<ul>\n<li>Regular expression substitutions: <code>s/from/to/flags</code> seems like a ternary, but I'd argue it's a datatype constructor, not an expression operator.</li>\n<li>Comprehensions like <code>[x + 1 | x <- list]</code>: looks like the ternary <code>[expr1 | expr2 <- expr3]</code>, but <code>expr2</code> is only binding a name. 
Arguably a ternary if you can map <em>and filter</em> in the same expression a la Python or Haskell, but should that be considered sugar for</li>\n<li>Python's operator chaining (<code>1 < x < 5</code>): syntactic sugar for <code>1 < x and x < 5</code>.</li>\n<li>Someone suggested <a href=\"https://stackoverflow.com/questions/7251772/what-exactly-constitutes-swizzling-in-opengl-es-2-0-powervr-sgx-specifically\" target=\"_blank\">glsl swizzles</a>, which are very cool but binary operators.</li>\n</ul>\n<h2>Why are ternaries so rare?</h2>\n<p>Ternaries are <em>somewhat</em> more common in math and physics, f.ex in integrals and sums. That's because they were historically done on paper, where you have a 2D canvas, so you can do stuff like this easily:</p>\n<div class=\"codehilite\"><pre><span></span><code>10\nΣ    n\nn=0\n</code></pre></div>\n<p>We express the ternary by putting arguments above and below the operator. All mainstream programming languages are linear, though, so any given symbol has only two sides. Plus functions are more regular and universal than infix operators so you might as well write <code>Sum(n=0, 10, n)</code>. The conditional ternary slips through purely because it's just so darn useful. Though now I'm wondering where it comes from in the first place. Different newsletter, maybe.</p>\n<p>But I still find ternary operators super interesting, please let me know if you know any I haven't covered!</p>\n<hr/>\n<h3>Blog Rec</h3>\n<p>This week's blog rec is <a href=\"https://lexi-lambda.github.io/\" target=\"_blank\">Alexis King</a>! Generally, Alexis's work spans the theory, practice, and implementation of programming languages, aimed at a popular audience and not an academic one. If you know her for one thing, it's probably <a href=\"https://lexi-lambda.github.io/blog/2019/11/05/parse-don-t-validate\" target=\"_blank\">Parse, don't validate</a>, which is now so mainstream most people haven't read the original post. 
Another good one is about <a href=\"https://lexi-lambda.github.io/blog/2020/01/19/no-dynamic-type-systems-are-not-inherently-more-open/\" target=\"_blank\">modeling open-world systems with static types</a>. </p>\n<p>Nowadays she is <em>far</em> more active on <a href=\"https://langdev.stackexchange.com/users/861/alexis-king\" target=\"_blank\">Programming Languages Stack Exchange</a>, where she has blog-length answers on <a href=\"https://langdev.stackexchange.com/questions/2692/how-should-i-read-type-system-notation/2693#2693\" target=\"_blank\">reading type notations</a>, <a href=\"https://langdev.stackexchange.com/questions/3942/what-are-the-ways-compilers-recognize-complex-patterns/3945#3945\" target=\"_blank\">compiler design</a>, and <a href=\"https://langdev.stackexchange.com/questions/2069/what-is-an-arrow-and-what-powers-would-it-give-as-a-first-class-concept-in-a-pro/2372#2372\" target=\"_blank\">why arrows</a>.</p>\n<div class=\"footnote\">\n<hr/>\n<ol>\n<li id=\"fn:lisp\">\n<p>Unless it's a lisp. <a class=\"footnote-backref\" href=\"#fnref:lisp\" title=\"Jump back to footnote 1 in the text\">↩</a></p>\n</li>\n<li id=\"fn:ternary\">\n<p>Or <code>x if bool else y</code>, same thing. <a class=\"footnote-backref\" href=\"#fnref:ternary\" title=\"Jump back to footnote 2 in the text\">↩</a></p>\n</li>\n<li id=\"fn:ternaryish\">\n<p>I say \"ish\" because trains can be arbitrarily long: <code>x (f1 f2 f3 f4 f5) y</code> is something I have <em>no idea</em> <a href=\"https://code.jsoftware.com/wiki/Vocabulary/fork\" target=\"_blank\">how to parse</a>. <a class=\"footnote-backref\" href=\"#fnref:ternaryish\" title=\"Jump back to footnote 3 in the text\">↩</a></p>\n</li>\n</ol>\n</div>",
          "url": "https://buttondown.com/hillelwayne/archive/a-list-of-ternary-operators/",
          "published": "2024-11-05T18:40:33.000Z",
          "updated": "2024-11-05T18:40:33.000Z",
          "content": null,
          "image": null,
          "media": [],
          "authors": [],
          "categories": []
        },
        {
          "id": "https://buttondown.com/hillelwayne/archive/tla-from-first-principles/",
          "title": "TLA from first principles",
          "description": "<h3>No Newsletter next week</h3>\n<p>I'll be speaking at <a href=\"https://www.usenix.org/conference/srecon24emea/presentation/wayne\" target=\"_blank\">USENIX SRECon</a>!</p>\n<h2>TLA from first principles</h2>\n<p>I'm working on v0.5 of <a href=\"https://leanpub.com/logic/\" target=\"_blank\">Logic for Programmers</a>. In the process of revising the \"System Modeling\" chapter, I stumbled on a great way to explain the <strong>T</strong>emporal <strong>L</strong>ogic of <strong>A</strong>ctions that TLA+ is based on. I'm reproducing that bit here with some changes to fit the newsletter format.</p>\n<p>Note that by this point the reader has already encountered property testing, formal verification, decision tables, and nontemporal specifications, and should already have a lot of practice expressing things as predicates. </p>\n<hr/>\n<h3>The intro</h3>\n<p>We have some bank users, each with an account balance. Bank users can wire money\nto each other. We have overdraft protection, so wires cannot reduce an\naccount value below zero. </p>\n<p>For the purposes of introducing the ideas, we'll assume an extremely simple system: two hardcoded\nvariables <code>alice</code> and <code>bob</code>, both start with 10 dollars, and transfers\nare only from Alice to Bob. Also, the transfer is totally atomic: we\ncheck for adequate funds, withdraw, and deposit all in a single moment\nof time. Later [in the chapter] we'll allow for multiple nonatomic transfers at the same time.</p>\n<p>First, let's look at a valid <strong>behavior</strong> of the system, or possible way it can evolve.</p>\n<div class=\"codehilite\"><pre><span></span><code>alice   10 ->  5 -> 3  -> 3  -> ...\nbob     10 -> 15 -> 17 -> 17 -> ...\n</code></pre></div>\n<p>In programming, we'd think of <code>alice</code> and <code>bob</code> as variables that change. How do we represent those variables <em>purely</em> in terms of predicate logic? 
One way is to instead think of them as <em>arrays</em> of values. <code>alice[0]</code> is the initial state of <code>alice</code>, <code>alice[1]</code> is after the first time step, etc. Time, then, is \"just\" the set of natural numbers.</p>\n<div class=\"codehilite\"><pre><span></span><code>Time  = {0, 1, 2, 3, ...}\nalice = [10, 5, 3, 3, ...]\nbob   = [10, 15, 17, 17, ...]\n</code></pre></div>\n<p>In comparison to our valid behavior, here are some <em>invalid</em> behaviors:</p>\n<div class=\"codehilite\"><pre><span></span><code>alice = [10, 3,  ...]\nbob   = [10, 15, ...]\n\nalice = [10, -1,  ...]\nbob   = [10, 21,  ...]\n</code></pre></div>\n<p>The first is invalid because Bob received more money than Alice lost.\nThe second is invalid because it violates our proposed invariant, that\naccounts cannot go negative. Can we write a predicate that is <em>true</em> for\nvalid transitions and <em>false</em> for our two invalid behaviors?</p>\n<p>Here's one way:</p>\n<div class=\"codehilite\"><pre><span></span><code>Time = Nat // {0, 1, 2, etc}\n\nTransfer(t: Time) =\n  some value in 0..=alice[t]:\n    1. alice[t+1] = alice[t] - value\n    2. bob[t+1] = bob[t] + value\n</code></pre></div>\n<p>Go through and check that this is true for every <code>t</code> in the valid\nbehavior and false for at least one <code>t</code> in the invalid behavior. Note\nthat the steps where Alice <em>doesn't</em> send a transfer also pass\n<code>Transfer</code>; we just pick <code>value = 0</code>.</p>\n<p>I can now write a predicate that perfectly describes a valid behavior:</p>\n<div class=\"codehilite\"><pre><span></span><code>Spec = \n  1. alice[0] = 10\n  2. bob[0]   = 10\n  3. all t in Time:\n    Transfer(t)\n</code></pre></div>\n<p>Now allowing \"nothing happens\" as \"Alice sends an empty transfer\" is\na little bit weird. 
In the real system, we probably don't want people\nto constantly be sending each other zero dollars:</p>\n<div class=\"codehilite\"><pre><span></span><code>Transfer(t: Time) =\n<span class=\"gd\">- some value in 0..=alice[t]:</span>\n<span class=\"gi\">+ some value in 1..=alice[t]:</span>\n<span class=\"w\"> </span>   1. alice[t+1] = alice[t] - value\n<span class=\"w\"> </span>   2. bob[t+1] = bob[t] + value\n</code></pre></div>\n<p>But now there can't be a timestep where nothing happens. And that means\n<em>no</em> behavior is valid! At every step, Alice <em>must</em> transfer at least one dollar to Bob.\nEventually there is some <code>t</code> where <code>alice[t] = 0 && bob[t] = 20</code>. Then\nAlice can't make a transfer, <code>Transfer(t)</code> is false, and so <code>Spec</code> is\nfalse.<sup id=\"fnref:exercise\"><a class=\"footnote-ref\" href=\"#fn:exercise\">1</a></sup></p>\n<p>So typically when modeling we add a <strong>stutter step</strong>, like this:</p>\n<div class=\"codehilite\"><pre><span></span><code>Spec =\n  1. alice[0] = 10\n  2. bob[0]   = 10\n  3. all t in Time:\n    || Transfer(t)\n    || 1. alice[t+1] = alice[t]\n       2. bob[t+1] = bob[t]\n</code></pre></div>\n<p>(This is also why we can use infinite behaviors to model a finite algorithm. If the algorithm completes at <code>t=21</code>, <code>t=22,23,24...</code> are all stutter steps.)</p>\n<p>There's enough moving parts here that I'd want to break it into\nsubpredicates.</p>\n<div class=\"codehilite\"><pre><span></span><code>Init =\n  1. alice[0] = 10\n  2. bob[0]   = 10\n\nStutter(t) =\n  1. alice[t+1] = alice[t]\n  2. bob[t+1] = bob[t]\n\nNext(t) = Transfer(t) // foreshadowing\n\nSpec =\n  1. Init\n  2. all t in Time:\n    Next(t) || Stutter(t)\n</code></pre></div>\n<p>Now finally, how do we represent the property <code>NoOverdrafts</code>? It's an\n<em>invariant</em> that has to be true at all times. 
So we do the same thing we\ndid in <code>Spec</code>, write a predicate over all times.</p>\n<div class=\"codehilite\"><pre><span></span><code>property NoOverdrafts =\n  all t in Time:\n    alice[t] >= 0 && bob[t] >= 0\n</code></pre></div>\n<p>We can even say that <code>Spec => NoOverdrafts</code>, ie if a behavior is valid\nunder <code>Spec</code>, it satisfies <code>NoOverdrafts</code>.</p>\n<h4>One of the exercises</h4>\n<p>Modify the <code>Next</code> so that Bob can send Alice transfers, too. Don't try\nto be too clever, just do this in the most direct way possible.</p>\n<p>Bonus: can Alice and Bob transfer to each other in the same step?</p>\n<p><strong>Solution</strong> [in back of book]: We can rename <code>Transfer(t)</code> to <code>TransferAliceToBob(t)</code>, write the\nconverse as a new predicate, and then add it to <code>Next</code>. Like this:</p>\n<div class=\"codehilite\"><pre><span></span><code>TransferBobToAlice(t: Time) =\n  some value in 1..=bob[t]:\n    1. alice[t+1] = alice[t] + value\n    2. bob[t+1] = bob[t] - value\n\nNext(t) =\n  || TransferAliceToBob(t)\n  || TransferBobToAlice(t)\n</code></pre></div>\n<p>Now, can Alice and Bob transfer to each other in the same step? No.\nLet's say they both start with 10 dollars and each try to transfer five\ndollars to each other. By <code>TransferAliceToBob</code> we have:</p>\n<div class=\"codehilite\"><pre><span></span><code>1. alice[1] = alice[0] - 5 = 5\n2. bob[1] = bob[0] + 5 = 15\n</code></pre></div>\n<p>And by <code>TransferBobToAlice</code>, we have:</p>\n<div class=\"codehilite\"><pre><span></span><code>1. bob[1] = bob[0] - 5 = 5\n2. alice[1] = alice[0] + 5 = 15\n</code></pre></div>\n<p>So now we have <code>alice[1] = 5 && alice[1] = 15</code>, which is always false.</p>\n<h3>Temporal Logic</h3>\n<div class=\"subscribe-form\"></div>\n<p>This is good and all, but in practice, there's two downsides to\ntreating time as a set we can quantify over:</p>\n<ol>\n<li>It's cumbersome. 
We have to write <code>var[t]</code> and <code>var[t+1]</code> all over\n    the place.</li>\n<li>It's too powerful. We can write expressions like\n    <code>alice[t^2-5] = alice[t] + t</code>.</li>\n</ol>\n<p>Problem (2) might seem like a good thing; isn't the whole <em>point</em> of\nlogic to be expressive? But we have a long-term goal in mind: getting a\ncomputer to check our formal specification. We need to limit the\nexpressivity of our model so that we can make it checkable. </p>\n<p>In practice, this will mean making time implicit to our model, instead of\nexplicitly quantifying over it.</p>\n<p>The first thing we need to do is limit how we can use time. At a\ngiven point in time, all we can look at is the <em>current</em> value of a\nvariable (<code>var[t]</code>) and the <em>next</em> value (<code>var[t+1]</code>). No <code>var[t+16]</code> or\n<code>var[t-1]</code> or anything else complicated.</p>\n<p>And it turns out we've already seen a mathematical convention for\nexpressing this: <strong>priming</strong>!<sup id=\"fnref:priming\"><a class=\"footnote-ref\" href=\"#fn:priming\">2</a></sup> For a\ngiven time <code>t</code>, we can define <code>var</code> to mean <code>var[t]</code> and <code>var'</code> to mean\n<code>var[t+1]</code>. Then <code>Transfer(t)</code> becomes</p>\n<div class=\"codehilite\"><pre><span></span><code>Transfer =\n  some value in 1..=alice:\n    1. alice' = alice - value\n    2. bob' = bob + value\n</code></pre></div>\n<p>Next we have the construct <code>all t in Time: P(t)</code> in both <code>Spec</code> and\n<code>NoOverdrafts</code>. In other words, \"P is always true\". So we can add\n<code>always</code> as a new term. 
Logicians conventionally use □ or <code>[]</code>\nto mean the same thing.<sup id=\"fnref:beyond\"><a class=\"footnote-ref\" href=\"#fn:beyond\">3</a></sup></p>\n<div class=\"codehilite\"><pre><span></span><code>property NoOverdrafts =\n  always (alice >= 0 && bob >= 0)\n  // or [](alice >= 0 && bob >= 0)\n\nSpec =\n  Init && always (Next || Stutter)\n</code></pre></div>\n<p>Now time is <em>almost</em> completely implicit in our spec, with just one\nexception: <code>Init</code> has <code>alice[0]</code> and <code>bob[0]</code>. We just need one more\nconvention: if a variable is referenced <em>outside</em> of the scope of a\ntemporal operator, it means <code>var[0]</code>. Since <code>Init</code> is outside of the <code>[]</code>, it becomes</p>\n<div class=\"codehilite\"><pre><span></span><code>Init =\n  1. alice = 10\n  2. bob = 10\n</code></pre></div>\n<p>And with that, we've removed <code>Time</code> as an explicit value in our model.</p>\n<p>The addition of primes and <code>always</code> makes this a <strong>temporal logic</strong>, one that can model how things change over time. And that makes it ideal for modeling software systems.</p>\n<h3>Modeling with TLA+</h3>\n<p>One of the most popular specification languages for modeling these kinds\nof concurrent systems is <strong>TLA+</strong>. TLA+ was invented by the Turing award-winner Leslie Lamport, who also invented a wide variety of concurrency algorithms and LaTeX. 
Here's our current\nspec in TLA+:</p>\n<div class=\"codehilite\"><pre><span></span><code>---- MODULE transfers ----\nEXTENDS Integers\n\nVARIABLES alice, bob\nvars == <<alice, bob>>\n\nInit ==\n  alice = 10 \n  /\\ bob = 10\n\nAliceToBob ==\n  \\E amnt \\in 1..alice:\n    alice' = alice - amnt\n    /\\ bob' = bob + amnt\n\nBobToAlice ==\n  \\E amnt \\in 1..bob:\n    alice' = alice + amnt\n    /\\ bob' = bob - amnt\n\nNext ==\n  AliceToBob\n  \\/ BobToAlice\n\nSpec == Init /\\ [][Next]_vars\n\nNoOverdrafts ==\n  [](alice >= 0 /\\ bob >= 0)\n\n====\n</code></pre></div>\n<p>TLA+ uses ASCII versions of mathematicians' notation: <code>/\\</code>/<code>\\/</code> for\n<code>&&/||</code>, <code>\\A \\E</code> for <code>all/some</code>, etc. The only thing that's \"unusual\"\n(besides <code>==</code> for definition) is the <code>[][Next]_vars</code> bit. That's TLA+\nnotation for <code>[](Next || Stutter)</code>: <code>Next</code> or <code>Stutter</code> always happens.</p>\n<hr/>\n<p>The rest of the chapter goes on to explain model checking, PlusCal (for modeling nonatomic transactions without needing to explain the exotic TLA+ function syntax), and liveness properties. But this is the intuition behind the \"temporal logic of actions\": temporal operators are operations on the set of points of time, and we restrict what we can do with those operators to make reasoning about the specification feasible.</p>\n<p>Honestly I like it enough that I'm thinking of redesigning my TLA+ workshop to start with this explanation. Then again, maybe it only seems good to me because I already know TLA+. Please let me know what you think about it!</p>\n<p>Anyway, the new version of the chapter will be in v0.5, which should be out mid-November.</p>\n<hr/>\n<h3>Blog Rec</h3>\n<p>This one is really dear to me: <a href=\"https://muratbuffalo.blogspot.com/\" target=\"_blank\">Metadata</a>, by Murat Demirbas. 
When I was first trying to learn TLA+ back in 2016, his post <a href=\"https://muratbuffalo.blogspot.com/2015/01/my-experience-with-using-tla-in.html\" target=\"_blank\">on using TLA+ in a distributed systems class</a> was one of, like... <em>three</em> public posts on TLA+. I must have spent hours rereading that post and puzzling out this weird language I stumbled into. Later I emailed Murat with some questions and he was super nice in answering them. Don't think I would have ever grokked TLA+ without him.</p>\n<p>In addition to TLA+ content, a lot of the blog is also breakdowns of papers he read, like <a href=\"https://blog.acolyer.org/\" target=\"_blank\">the morning paper</a>, except with a focus on distributed systems (and still active). If you're interested in learning more about the science of distributed systems, he has an excellent page on <a href=\"https://muratbuffalo.blogspot.com/2021/02/foundational-distributed-systems-papers.html\" target=\"_blank\">foundational distributed systems papers</a>. But definitely check out his <a href=\"https://muratbuffalo.blogspot.com/2023/09/metastable-failures-in-wild.html\" target=\"_blank\">deep readings</a>, too!</p>\n<div class=\"footnote\">\n<hr/>\n<ol>\n<li id=\"fn:exercise\">\n<p>In the book this is presented as an exercise (with the solution in back). The exercise also clarifies that since <code>Time = Nat</code>, all behaviors have an <em>infinite</em> number of steps. <a class=\"footnote-backref\" href=\"#fnref:exercise\" title=\"Jump back to footnote 1 in the text\">↩</a></p>\n</li>\n<li id=\"fn:priming\">\n<p>Priming is introduced in the chapter on decision tables, and again in the chapter on database invariants. 
<code>x'</code> is \"the next value of <code>x</code>\", so you can use it to express database invariants like \"jobs only move from <code>ready</code> to <code>started</code> or <code>aborted</code>.\" <a class=\"footnote-backref\" href=\"#fnref:priming\" title=\"Jump back to footnote 2 in the text\">↩</a></p>\n</li>\n<li id=\"fn:beyond\">\n<p>I'm still vacillating on whether I want a \"beyond logic\" appendix that covers higher order logic, constructive logic, and modal logic (which is what we're sneakily doing right now!)</p>\n<p>While I'm here, this explanation of <code>always</code> as <code>all t in Time</code> isn't <em>100%</em> accurate, since it doesn't explain why things like <code>[](P => []Q)</code> or <code><>[]P</code> make sense. But it's accurate in most cases and is a great intuition pump. <a class=\"footnote-backref\" href=\"#fnref:beyond\" title=\"Jump back to footnote 3 in the text\">↩</a></p>\n</li>\n</ol>\n</div>",
          "url": "https://buttondown.com/hillelwayne/archive/tla-from-first-principles/",
          "published": "2024-10-22T17:14:21.000Z",
          "updated": "2024-10-22T17:14:21.000Z",
          "content": null,
          "image": null,
          "media": [],
          "authors": [],
          "categories": []
        },
        {
          "id": "https://buttondown.com/hillelwayne/archive/be-suspicious-of-success/",
          "title": "Be Suspicious of Success",
          "description": "<p>From Leslie Lamport's <em>Specifying Systems</em>:</p>\n<blockquote>\n<p>You should be suspicious if [the model checker] does not find a violation of a liveness property... you should also be suspicious if [it] finds no errors when checking safety properties. </p>\n</blockquote>\n<p>This is specifically in the context of model-checking a formal specification, but it's a widely applicable software principle. It's not enough for a program to work, it has to work for the <em>right reasons</em>. Code working for the wrong reasons is code that's going to break when you least expect it. And since \"correct for right reasons\" is a much narrower target than \"correct for any possible reason\", we can't assume our first success is actually our intended success.</p>\n<p>Hence, BSOS: <strong>Be Suspicious of Success</strong>.</p>\n<h3>Some useful BSOS practices</h3>\n<p>The standard way of dealing with BSOS is verification. Tests, static checks, model checking, etc. We get more confident in our code if our verifications succeed. But then we also have to be suspicious of <em>that</em> success, too! How do I know whether my tests are passing because they're properly testing correct code or because they're failing to test incorrect code?</p>\n<p>This is why test-driven development gurus tell people to write a failing test first. Then at least we know the tests are doing <em>something</em> (even if they still might not be testing what they want).</p>\n<p>The other limit of verification is that it can't tell us <em>why</em> something succeeds. Mainstream verification methods are good at explaining why things <em>fail</em>— expected vs actual test output, type mismatches, specification error traces. Success isn't as \"information-rich\" as failure. 
How do you distinguish a faithful implementation of <a href=\"https://en.wikipedia.org/wiki/Collatz_conjecture\" target=\"_blank\"><code>is_collatz_counterexample</code></a> from <code>return false</code>?</p>\n<p>A broader technique I follow is <em>make it work, make it break</em>. If code is working for the right reasons, I should be able to predict how to break it. This can be either a change in the runtime (this will livelock if we 10x the number of connections), or a change to the code itself (commenting out <em>this</em> line will cause property X to fail). <sup id=\"fnref:superproperties\"><a class=\"footnote-ref\" href=\"#fn:superproperties\">1</a></sup> If the code still works even after the change, my model of the code is wrong and it was succeeding for the wrong reasons.</p>\n<h3>Happy and Sad Paths</h3>\n<div class=\"subscribe-form\"></div>\n<p>A related topic (possibly subset?) is \"happy and sad paths\". The happy path of your code is the behavior when everything's going right: correct inputs, preconditions are satisfied, the data sources are present, etc. The sad path is all of the code that handles things going wrong. Retry mechanisms, insufficient user authority, database constraint violation, etc. In most software, the code supporting the sad paths dwarfs the code in the happy path.</p>\n<p>BSOS says that I can't just show code works in the happy path, I also need to check it works in the sad path. </p>\n<p>BSOS also says that I have to be suspicious when the sad path works properly, too. </p>\n<p>Say I add a retry mechanism to my code to handle the failure mode of timeouts. I test the code and it works. Did the retry code actually <em>run</em>? Did it run <em>regardless</em> of the original response? Is it really doing exponential backoff? Will it stop after the maximum retry limit? 
Is the sad path code <em>after</em> the maximum retry limit working properly?</p>\n<p><a href=\"https://www.usenix.org/system/files/conference/osdi14/osdi14-paper-yuan.pdf\" target=\"_blank\">One paper</a> found that 35% of catastrophic distributed system failures were caused by \"trivial mistakes in error handlers\" (pg 9). These were in mature, battle-hardened programs. Be suspicious of success. Be more suspicious of sad path success.</p>\n<hr/>\n<h2>Blog Rec</h2>\n<p>This week's blog rec is <a href=\"https://www.redblobgames.com/\" target=\"_blank\">Red Blob Games</a>!<sup id=\"fnref:blogs-vs-articles\"><a class=\"footnote-ref\" href=\"#fn:blogs-vs-articles\">2</a></sup> While primarily about computer game programming, the meat of the content is beautiful, interactive guides to general CS algorithms. Some highlights:</p>\n<ul>\n<li><a href=\"https://www.redblobgames.com/pathfinding/a-star/introduction.html\" target=\"_blank\">Introduction to the A* Algorithm</a> was really illuminating when I was a baby programmer.</li>\n<li>I'm sure this <a href=\"https://www.redblobgames.com/articles/noise/introduction.html\" target=\"_blank\">overview of noise functions</a> will be useful to me <em>someday</em>. 
Maybe for test data generation?</li>\n<li>If you're also an explainer type he has a lot of great stuff on <a href=\"https://www.redblobgames.com/making-of/line-drawing/\" target=\"_blank\">his process</a> and his <a href=\"https://www.redblobgames.com/making-of/little-things/\" target=\"_blank\">little tricks</a> to make things more understandable.</li>\n</ul>\n<p>(I don't think his <a href=\"https://www.redblobgames.com/blog/posts.xml\" target=\"_blank\">rss feed</a> covers new interactive articles, only the <a href=\"https://www.redblobgames.com/blog/\" target=\"_blank\">blog</a> specifically.)</p>\n<div class=\"footnote\">\n<hr/>\n<ol>\n<li id=\"fn:superproperties\">\n<p><a href=\"https://www.jameskoppel.com/\" target=\"_blank\">Jimmy Koppel</a> once proposed that just as code has properties, code variations have <a href=\"https://groups.csail.mit.edu/sdg/pubs/2020/demystifying_dependence_published.pdf\" target=\"_blank\"><strong>superproperties</strong></a>. For example, \"no modification to the codebase causes us to use a greater number of deprecated APIs.\" <a class=\"footnote-backref\" href=\"#fnref:superproperties\" title=\"Jump back to footnote 1 in the text\">↩</a></p>\n</li>\n<li id=\"fn:blogs-vs-articles\">\n<p>Okay, it's more an <em>article</em> site, because there's also a <a href=\"https://www.redblobgames.com/blog/\" target=\"_blank\">Red Blob <em>blog</em></a> (which covers a lot of neat stuff, too). Maybe I should just rename this section to \"site rec\". <a class=\"footnote-backref\" href=\"#fnref:blogs-vs-articles\" title=\"Jump back to footnote 2 in the text\">↩</a></p>\n</li>\n</ol>\n</div>",
          "url": "https://buttondown.com/hillelwayne/archive/be-suspicious-of-success/",
          "published": "2024-10-16T15:08:39.000Z",
          "updated": "2024-10-16T15:08:39.000Z",
          "content": null,
          "image": null,
          "media": [],
          "authors": [],
          "categories": []
        },
        {
          "id": "https://buttondown.com/hillelwayne/archive/how-to-convince-engineers-that-formal-methods-is/",
          "title": "How to convince engineers that formal methods is cool",
          "description": "<p>Sorry there was no newsletter last week! I got COVID. Still got it, which is why this one's also short.</p>\n<h3>Logic for Programmers v0.4</h3>\n<p><a href=\"https://leanpub.com/logic/\" target=\"_blank\">Now available</a>! This version adds a chapter on TLA+, significantly expands the constraint solver chapter, and adds a \"planner programming\" section to the Logic Programming chapter. You can see the full release notes on the <a href=\"https://leanpub.com/logic/\" target=\"_blank\">book page</a>.</p>\n<h1>How to convince engineers that formal methods is cool</h1>\n<p>I have an open email for answering questions about formal methods,<sup id=\"fnref:fs-fv\"><a class=\"footnote-ref\" href=\"#fn:fs-fv\">1</a></sup> and one of the most common questions I get is \"how do I convince my coworkers that this is worth doing?\" usually the context is the reader is really into the idea of FM but their coworkers don't know it exists. The goal of the asker is to both introduce FM and persuade them that FM's useful. </p>\n<p>In my experience as a consultant and advocate, I've found that there's only two consistently-effective ways to successfully pitch FM:</p>\n<ol>\n<li>Use FM to find an <em>existing</em> bug in a work system</li>\n<li>Show how FM finds a historical bug that's already been fixed.</li>\n</ol>\n<h4>Why this works</h4>\n<p>There's two main objections to FM that we need to address. The first is that FM is too academic and doesn't provide a tangible, practical benefit. The second is that FM is too hard; only PhDs and rocket scientists can economically use it. (Showing use cases from AWS <em>et al</em> aren't broadly persuasive because skeptics don't have any insight into how AWS functions.) Finding an existing bug hits both: it helped the team with a real problem, and it was done by a mere mortal. 
</p>\n<div class=\"subscribe-form\"></div>\n<p>Demonstrating FM on a historical bug isn't <em>as</em> effective: it only shows that formal methods <em>could have</em> helped, not that it actually does help. But people will usually remember the misery of debugging that problem. Bug war stories are popular for a reason!</p>\n<h3>Making historical bugs persuasive</h3>\n<p>So \"live bug\" is a stronger rec, but \"historical bug\" tends to be easier to show. This is because <em>you know what you're looking for</em>. It's easier to write a high-level spec on a system you already know, and show it finds a bug you already know about.</p>\n<p>The trick to make it look convincing is to make the spec and bug as \"natural\" as possible. You can't make it seem like FM only found the bug because you had foreknowledge of what it was— then the whole exercise is too contrived. People will already know you had foreknowledge, of course, and are factoring that into their observations. You want to make the case that the spec you're writing is clear and obvious enough that an \"ignorant\" person could have written it. That means nothing contrived or suspicious.</p>\n<p>This is a bit of a fuzzy definition, more a vibe than anything. Ask yourself \"does this spec look like something that was tailor-made around this bug, or does it find the bug as a byproduct of being a regular spec?\"</p>\n<p>A good example of a \"natural\" spec is <a href=\"https://www.hillelwayne.com/post/augmenting-agile/\" target=\"_blank\">the bounded queue problem</a>. It's a straight translation of some Java code with no properties besides deadlock checking. Usually you'll be at a higher level of abstraction, though.</p>\n<hr/>\n<h3>Blog rec: <a href=\"https://www.argmin.net/\" target=\"_blank\">arg min</a></h3>\n<p>This is a new section I want to try for a bit: recommending tech(/-adjacent) blogs that I like. 
This first one is going to be a bit niche: <a href=\"https://www.argmin.net/\" target=\"_blank\">arg min</a> is writing up lecture notes on \"convex optimization\". It's a cool look into the theory behind constraint solving. I don't understand most of the math but the prose is pretty approachable. Couple of highlights:</p>\n<ul>\n<li><a href=\"https://www.argmin.net/p/modeling-dystopia\" target=\"_blank\">Modeling Dystopia</a> about why constraint solving isn't a mainstream technology.</li>\n<li><a href=\"https://www.argmin.net/p/convex-optimization-live-blog\" target=\"_blank\">Table of Contents</a> to see all of the posts.</li>\n</ul>\n<p>The blogger also talks about some other topics but I haven't read those posts much.</p>\n<div class=\"footnote\">\n<hr/>\n<ol>\n<li id=\"fn:fs-fv\">\n<p>As always, talking primarily about formal specification of systems (TLA+/Alloy/Spin), not formal verification of code (Dafny/SPARK/Agda). I talk about the differences a bit <a href=\"https://www.hillelwayne.com/post/why-dont-people-use-formal-methods/\" target=\"_blank\">here</a> (but I really need to write a more focused piece). <a class=\"footnote-backref\" href=\"#fnref:fs-fv\" title=\"Jump back to footnote 1 in the text\">↩</a></p>\n</li>\n</ol>\n</div>",
          "url": "https://buttondown.com/hillelwayne/archive/how-to-convince-engineers-that-formal-methods-is/",
          "published": "2024-10-08T16:18:55.000Z",
          "updated": "2024-10-08T16:18:55.000Z",
          "content": null,
          "image": null,
          "media": [],
          "authors": [],
          "categories": []
        },
        {
          "id": "https://buttondown.com/hillelwayne/archive/refactoring-invariants/",
          "title": "Refactoring Invariants",
          "description": "<p>(Feeling a little sick so this one will be short.)</p>\n<p>I'm often asked by clients to review their (usually TLA+) formal specifications. These specs are generally slower and more convoluted than an expert would write. I want to fix them up without changing the overall behavior of the spec or introducing subtle bugs.</p>\n<p>To do this, I use a rather lovely feature of TLA+. Say I see a 100-line <code>Foo</code> action that I think I can refactor down to 20 lines. I'll first write a refactored version as a separate action <code>NewFoo</code>, then I run the model checker with the property</p>\n<div class=\"codehilite\"><pre><span></span><code>RefactorProp ==\n    [][Foo <=> NewFoo]_vars\n</code></pre></div>\n<p>That's an intimidating nest of symbols but all it's saying is that every <code>Foo</code> step must also be a <code>NewFoo</code> step, and vice versa. If the refactor ever does something different from the original action, the model checker will report the exact behavior and transition it fails for. Conversely, if the model checker passes, I can safely assume they have identical behaviors.</p>\n<p>This is a <strong>refactoring invariant</strong>:<sup id=\"fnref:invariant\"><a class=\"footnote-ref\" href=\"#fn:invariant\">1</a></sup> the old and new versions of functions have identical behavior. Refactoring invariants are superbly useful in formal specification. Software devs spend enough time refactoring that they'd be useful for coding, too.</p>\n<p>Alas, refactoring invariants are a little harder to express in code. In TLA+ we're working with bounded state spaces, so the model checker can check the invariant for every possible state. Even a simple program can have an unbounded state space via an infinite number of possible function inputs. 
</p>\n<p>(Also formal specifications are \"pure\" simulations while programs have side effects.)</p>\n<p>The \"normal\" way to verify a program refactoring is to start out with a huge suite of <a href=\"https://buttondown.com/hillelwayne/archive/oracle-testing/\" target=\"_blank\">oracle tests</a>. This <em>should</em> catch a bad refactor via failing tests. One downside is that you might not have the test suite in the first place, or not one that covers your particular refactoring. Another is that even if the test suite does cover it, it only indirectly tests the invariant. It catches the refactoring error as a consequence of testing other stuff. What if we want to directly test the refactoring invariant?</p>\n<h3>Two ways of doing this</h3>\n<p>One: by pulling in formal methods. Ray Myers has a <a href=\"https://www.youtube.com/watch?v=UdB3XBf219Y\" target=\"_blank\">neat video</a> on formally proving a refactoring is correct. That one's in the niche language ACL2, but he's also got one on <a href=\"https://www.youtube.com/watch?v=_7RXQE-pCMo\" target=\"_blank\">refactoring C</a>. You might not even need to prove the refactoring correct; you could probably get away with using an <a href=\"https://github.com/pschanely/CrossHair\" target=\"_blank\">SMT solver</a> to find counterexamples.</p>\n<p>Two: by using property-based testing. Generate random inputs, pass them to both functions, and check that the outputs are identical. 
Using the Python <a href=\"https://hypothesis.readthedocs.io/en/latest/\" target=\"_blank\">Hypothesis</a> library:</p>\n<div class=\"codehilite\"><pre><span></span><code><span class=\"kn\">from</span> <span class=\"nn\">hypothesis</span> <span class=\"kn\">import</span> <span class=\"n\">given</span>\n<span class=\"kn\">import</span> <span class=\"nn\">hypothesis.strategies</span> <span class=\"k\">as</span> <span class=\"nn\">st</span>\n\n<span class=\"c1\"># from the `gilded rose kata`</span>\n<span class=\"k\">def</span> <span class=\"nf\">update_quality</span><span class=\"p\">(</span><span class=\"n\">items</span><span class=\"p\">:</span> <span class=\"nb\">list</span><span class=\"p\">[</span><span class=\"n\">Item</span><span class=\"p\">]):</span>\n    <span class=\"o\">...</span>\n\n<span class=\"k\">def</span> <span class=\"nf\">update_quality_new</span><span class=\"p\">(</span><span class=\"n\">items</span><span class=\"p\">:</span> <span class=\"nb\">list</span><span class=\"p\">[</span><span class=\"n\">Item</span><span class=\"p\">]):</span>\n    <span class=\"o\">...</span>\n\n<span class=\"nd\">@given</span><span class=\"p\">(</span><span class=\"n\">st</span><span class=\"o\">.</span><span class=\"n\">lists</span><span class=\"p\">(</span><span class=\"n\">st</span><span class=\"o\">.</span><span class=\"n\">builds</span><span class=\"p\">(</span><span class=\"n\">Item</span><span class=\"p\">)))</span>\n<span class=\"k\">def</span> <span class=\"nf\">test_refactoring</span><span class=\"p\">(</span><span class=\"n\">l</span><span class=\"p\">):</span>\n    <span class=\"k\">assert</span> <span class=\"n\">update_quality</span><span class=\"p\">(</span><span class=\"n\">l</span><span class=\"p\">)</span> <span class=\"o\">==</span> <span class=\"n\">update_quality_new</span><span class=\"p\">(</span><span class=\"n\">l</span><span class=\"p\">)</span>\n</code></pre></div>\n<p>One tricky bit is if the function is part of a long call chain <code>A -> B -> C</code>, and you want to test that refactoring <code>C'</code> doesn't change the behavior of 
<code>A</code>. You'd have to add a <code>B'</code> that uses <code>C'</code> and then an <code>A'</code> that uses <code>B'</code>. Maybe you could instead create a branch, commit the change to <code>C'</code> in that branch, and then run a <a href=\"https://www.hillelwayne.com/post/cross-branch-testing/\" target=\"_blank\">cross-branch test</a> against each branch's <code>A</code>.</p>\n<p>Impure functions are harder. The test now makes some side effect twice, which could spuriously break the refactoring invariant. You could instead test that the changes are the same, or try to get the functions to affect different entities and then compare the updates of each entity. There's no general solution here though, and there might be No Good Way for a particular effectful refactoring.</p>\n<h3>Behavior-changing rewrites</h3>\n<p>We can apply similar ideas for rewrites that change <em>behavior</em>. Say we have an API, and v1 returns a list of user names while v2 returns a <code>{version, userids}</code> dict. 
Then we can find some transformation of v2 into v1, and run the refactoring invariant on that:</p>\n<div class=\"codehilite\"><pre><span></span><code><span class=\"k\">def</span> <span class=\"nf\">v2_to_v1</span><span class=\"p\">(</span><span class=\"n\">v2_resp</span><span class=\"p\">):</span>\n    <span class=\"k\">return</span> <span class=\"p\">[</span><span class=\"n\">User</span><span class=\"p\">(</span><span class=\"n\">userid</span><span class=\"p\">)</span><span class=\"o\">.</span><span class=\"n\">name</span> <span class=\"k\">for</span> <span class=\"n\">userid</span> <span class=\"ow\">in</span> <span class=\"n\">v2_resp</span><span class=\"p\">[</span><span class=\"s2\">\"userids\"</span><span class=\"p\">]]</span>\n\n<span class=\"nd\">@given</span><span class=\"p\">(</span><span class=\"n\">some_query_generator</span><span class=\"p\">)</span>\n<span class=\"k\">def</span> <span class=\"nf\">test_refactoring</span><span class=\"p\">(</span><span class=\"n\">q</span><span class=\"p\">):</span>\n    <span class=\"k\">assert</span> <span class=\"n\">v1</span><span class=\"p\">(</span><span class=\"n\">q</span><span class=\"p\">)</span> <span class=\"o\">==</span> <span class=\"n\">v2_to_v1</span><span class=\"p\">(</span><span class=\"n\">v2</span><span class=\"p\">(</span><span class=\"n\">q</span><span class=\"p\">))</span>\n</code></pre></div>\n<p>Fun fact: <code>v2_to_v1</code> is a <a href=\"https://buttondown.com/hillelwayne/archive/software-isomorphisms/\" target=\"_blank\">software homomorphism</a>!</p>\n<div class=\"footnote\">\n<hr/>\n<ol>\n<li id=\"fn:invariant\">\n<p>Well technically it's an <em>action property</em> since it's on the transitions of states, not the states, but \"refactor invariant\" gets the idea across better. <a class=\"footnote-backref\" href=\"#fnref:invariant\" title=\"Jump back to footnote 1 in the text\">↩</a></p>\n</li>\n</ol>\n</div>",
          "url": "https://buttondown.com/hillelwayne/archive/refactoring-invariants/",
          "published": "2024-09-24T20:06:10.000Z",
          "updated": "2024-09-24T20:06:10.000Z",
          "content": null,
          "image": null,
          "media": [],
          "authors": [],
          "categories": []
        },
        {
          "id": "https://buttondown.com/hillelwayne/archive/goodharts-law-in-software-engineering/",
          "title": "Goodhart's Law in Software Engineering",
          "description": "<h3>Blog Hiatus</h3>\n<p>You might have noticed I haven't been updating my website. I haven't even <em>looked</em> at any of my drafts for the past three months. All that time is instead going into <em>Logic for Programmers</em>. I'll get back to the site when that's done or in 2025, whichever comes first. Newsletter and <a href=\"https://www.patreon.com/hillelwayne\" target=\"_blank\">Patreon</a> will still get regular updates.</p>\n<p>(As a comparison, the book is now 22k words. That's like 11 blog posts!)</p>\n<h2>Goodhart's Law in Software Engineering</h2>\n<p>I recently got into an argument with some people about whether small functions were <em>mostly</em> a good idea or <em>always 100%</em> a good idea, and it reminded me a lot about <a href=\"https://en.wikipedia.org/wiki/Goodhart%27s_law\" target=\"_blank\">Goodhart's Law</a>:</p>\n<blockquote>\n<p>When a measure becomes a target, it ceases to be a good measure.</p>\n</blockquote>\n<p>The <em>weak</em> version of this is that people have perverse incentives to game the metrics. If your metric is \"number of bugs in the bug tracker\", people will start spuriously closing bugs just to get the number down. </p>\n<p>The <em>strong</em> version of the law is that even 100% honest pursuit of a metric, taken far enough, is harmful to your goals, and this is an inescapable consequence of the difference between metrics and values. We have metrics in the first place because what we actually <em>care about</em> is nonquantifiable. There's some <em>thing</em> we want more of, but we have no way of directly measuring that thing. We <em>can</em> measure something that looks like a rough approximation for our goal. But it's <em>not</em> our goal, and if we replace the metric with the goal, we start taking actions that favor the metric over the goal.</p>\n<p>Say we want more reliable software. How do you measure \"reliability\"? You can't. 
But you <em>can</em> measure the number of bugs in the bug tracker, because fewer open bugs roughly means more reliability. <strong>This is not the same thing</strong>. I've seen bugs fixed in ways that made the system <em>less</em> reliable, but not in ways that translated into tracked bugs.</p>\n<p>I am a firm believer in the strong version of Goodhart's law. Mostly because of this:</p>\n<p><img alt=\"A peacock with its feathers out. The peacock is scremming\" class=\"newsletter-image\" src=\"https://assets.buttondown.email/images/2573503d-bc57-49ce-aa26-9d399d801118.jpg?w=960&fit=max\"/></p>\n<p>What does a peahen look for in a mate? A male with maximum fitness. What's a metric that approximates fitness? How nice the plumage is, because nicer plumage = more calories to waste on plumage.<sup id=\"fnref:peacock\"><a class=\"footnote-ref\" href=\"#fn:peacock\">1</a></sup> But that only <em>approximates</em> fitness, and over generations the plumage itself becomes the point at the cost of overall bird fitness. Sexual selection is Goodhart's law in action.</p>\n<div class=\"subscribe-form\"></div>\n<p>If the blind watchmaker can fall for Goodhart, people can too.</p>\n<h3>Examples in Engineering</h3>\n<p>Goodhart's law is a warning for pointy-haired bosses who come up with terrible metrics: lines added, feature points done, etc. I'm more interested in how it affects the metrics we set for ourselves that our bosses might never know about.</p>\n<ul>\n<li>\"Test coverage\" is a proxy for how thoroughly we've tested our software. It diverges when we need to test lots of properties of the same lines of code, or when our worst bugs are emergent at the integration level.</li>\n<li>\"Cyclomatic complexity\" and \"function size\" are proxies for code legibility. They diverge when we think about global module legibility, not local function legibility. 
Then too many functions can obscure the code and data flow.</li>\n<li>Benchmarks are proxies for performant programs, and diverge when improving benchmarks slows down unbenchmarked operations.</li>\n<li>Amount of time spent pairing/code reviewing/debugging/whatever proxies \"being productive\".</li>\n<li><a href=\"https://dora.dev/\" target=\"_blank\">The DORA report</a> is an interesting case, because it claims four metrics<sup id=\"fnref:metrics\"><a class=\"footnote-ref\" href=\"#fn:metrics\">2</a></sup> are proxies to ineffable goals like \"elite performance\" and <em>employee satisfaction</em>. It also argues that you should minimize commit size to improve the DORA metrics. A proxy of a proxy of a goal!</li>\n</ul>\n<h3>What can we do about this?</h3>\n<p>No, I do not know how to avoid a law that can hijack the process of evolution.</p>\n<p>The 2023 DORA report suggests readers should avoid Goodhart's law and \"assess a team's strength across a wide range of people, processes, and technical capabilities\" (pg 10), which is kind of like saying the fix to production bugs is \"don't write bugs\". It's a guiding principle but not actionable advice that gets to that principle.</p>\n<p>They also say \"to use a combination of metrics to drive deeper understanding\" (ibid), which makes more sense at first. If you have metrics X and Y to approximate goal G, then overoptimizing X <em>might</em> hurt Y, indicating you're getting further from G. In practice I've seen it turn into \"we can't improve X because it'll hurt Y and we can't improve Y because it'll hurt X.\" This <em>could</em> mean we're at the best possible spot for G, but more often it means we're trapped very far from our goal. You could come up with a weighted combination of X and Y, like 0.7X + 0.3Y, but <em>that too</em> is a metric subject to Goodhart. </p>\n<p>I guess the best I can do is say \"use your best engineering judgement\"? Evolution is mindless, people aren't. 
Again, not an actionable or scalable bit of advice, but as I grow older I keep finding \"use your best judgement\" is all we can do. Knowledge work is ineffable and irreducible.</p>\n<div class=\"footnote\">\n<hr/>\n<ol>\n<li id=\"fn:peacock\">\n<p>This sent me down a rabbit hole; turns out scientists are still debating what <em>exactly</em> the peacock's tail is used for! Is it sexual selection? Adverse signalling? Something else??? <a class=\"footnote-backref\" href=\"#fnref:peacock\" title=\"Jump back to footnote 1 in the text\">↩</a></p>\n</li>\n<li id=\"fn:metrics\">\n<p>How soon commits get to production, deployment frequency, percent of deployments that cause errors in production, and mean time to recovery. <a class=\"footnote-backref\" href=\"#fnref:metrics\" title=\"Jump back to footnote 2 in the text\">↩</a></p>\n</li>\n</ol>\n</div>",
          "url": "https://buttondown.com/hillelwayne/archive/goodharts-law-in-software-engineering/",
          "published": "2024-09-17T16:33:40.000Z",
          "updated": "2024-09-17T16:33:40.000Z",
          "content": null,
          "image": null,
          "media": [],
          "authors": [],
          "categories": []
        },
        {
          "id": "https://buttondown.com/hillelwayne/archive/why-not-comments/",
          "title": "Why Not Comments",
          "description": "<h2>Logic For Programmers v0.3</h2>\n<p><a href=\"https://leanpub.com/logic/\" target=\"_blank\">Now available</a>! It's a light release as I learn more about formatting a nice-looking book. You can see some of the differences between v2 and v3 <a href=\"https://bsky.app/profile/hillelwayne.com/post/3l3egdqnqj62o\" target=\"_blank\">here</a>.</p>\n<h2>Why Not Comments</h2>\n<p>Code is written in a structured machine language, comments are written in an expressive human language. The \"human language\" bit makes comments more expressive and communicative than code. Code has a limited amount of something <em>like</em> human language contained in identifiers. \"Comment the why, not the what\" means to push as much information as possible into identifiers. <a href=\"https://buttondown.com/hillelwayne/archive/3866bd6e-22c3-4098-92ef-4d47ef287ed8\" target=\"_blank\">Not all \"what\" can be embedded like this</a>, but a lot can.</p>\n<p>In recent years I see more people arguing that <em>whys</em> do not belong in comments either, that they can be embedded into <code>LongFunctionNames</code> or the names of test cases. Virtually all \"self-documenting\" codebases add documentation through the addition of identifiers.<sup id=\"fnref:exception\"><a class=\"footnote-ref\" href=\"#fn:exception\">1</a></sup></p>\n<p>So what's something in the range of human expression that <em>cannot</em> be represented with more code?</p>\n<p>Negative information, drawing attention to what's <em>not</em> there. The \"why nots\" of the system.</p>\n<h3>A Recent Example</h3>\n<p>This one comes from <em>Logic for Programmers</em>. For convoluted technical reasons the epub build wasn't translating math notation (<code>\\forall</code>) into symbols (<code>∀</code>). I wrote a script to manually go through and replace tokens in math strings with unicode equivalents. 
The easiest way to do this is to call <code>string = string.replace(old, new)</code> for each one of the 16 math symbols I need to replace (some math strings have multiple symbols).</p>\n<p>This is incredibly inefficient and I could instead do all 16 replacements in a single pass. But that would be a more complicated solution. So I did the simple way with a comment:</p>\n<div class=\"codehilite\"><pre><span></span><code>Does 16 passes over each string\nBUT there are only 25 math strings in the book so far and most are <5 characters.\nSo it's still fast enough.\n</code></pre></div>\n<p>You can think of this as a \"why I'm using slow code\", but you can also think of it as \"why not fast code\". It's calling attention to something that's <em>not there</em>.</p>\n<h3>Why the comment</h3>\n<p>If the slow code isn't causing any problems, why have a comment at all?</p>\n<div class=\"subscribe-form\"></div>\n<p>Well first of all the code might be a problem later. If a future version of <em>LfP</em> has hundreds of math strings instead of a couple dozen then this build step will bottleneck the whole build. Good to lay a signpost now so I know exactly what to fix later.</p>\n<p>But even if the code is fine forever, the comment still does something important: it shows <em>I'm aware of the tradeoff</em>. Say I come back to my project two years from now, open <code>epub_math_fixer.py</code> and see my terrible slow code. I ask \"why did I write something so terrible?\" Was it inexperience, time crunch, or just a random mistake?</p>\n<p>The negative comment tells me that I <em>knew</em> this was slow code, looked into the alternatives, and decided against optimizing. I don't have to spend a bunch of time reinvestigating only to come to the same conclusion. 
</p>\n<h2>Why this can't be self-documented</h2>\n<p>When I was first playing with this idea, someone told me that my negative comment isn't necessary, just name the function <code>RunFewerTimesSlowerAndSimplerAlgorithmAfterConsideringTradeOffs</code>. Aside from the issues of being long, not explaining the tradeoffs, and that I'd have to change it everywhere if I ever optimize the code... This would make the code <em>less</em> self-documenting. It doesn't tell you what the function actually <em>does</em>.</p>\n<p>The core problem is that function and variable identifiers can only contain one clause of information. I can't store \"what the function does\" and \"what tradeoffs it makes\" in the same identifier. </p>\n<p>What about replacing the comment with a test? I guess you could make a test that greps for math blocks in the book and fails if there's more than 80? But that's not testing <code>EpubMathFixer</code> directly. There's nothing in the function itself you can hook into. </p>\n<p>That's the fundamental problem with self-documenting negative information. \"Self-documentation\" rides along with written code, and so describes what the code is doing. Negative information is about what the code is <em>not</em> doing. </p>\n<h3>End of newsletter speculation</h3>\n<p>I wonder if you can think of \"why not\" comments as a case of counterfactuals. If so, are \"abstractions of human communication\" impossible to self-document in general? Can you self-document an analogy? Uncertainty? An ethical claim?</p>\n<div class=\"footnote\">\n<hr/>\n<ol>\n<li id=\"fn:exception\">\n<p>One interesting exception someone told me: they make code \"more self-documenting\" by turning comments into <em>logging</em>. I encouraged them to write it up as a blog post but so far they haven't. If they ever do I will link it here. <a class=\"footnote-backref\" href=\"#fnref:exception\" title=\"Jump back to footnote 1 in the text\">↩</a></p>\n</li>\n</ol>\n</div>",
          "url": "https://buttondown.com/hillelwayne/archive/why-not-comments/",
          "published": "2024-09-10T19:40:29.000Z",
          "updated": "2024-09-10T19:40:29.000Z",
          "content": null,
          "image": null,
          "media": [],
          "authors": [],
          "categories": []
        }
      ]
    }