RSS/Atom Feed Analyzer

Analysis of https://gwern.substack.com/feed

Feed fetched in 97 ms.
Content type is application/xml; charset=utf-8.
Feed is 140,961 characters long.
Feed has an ETag of W/"226a1-zxWjOCmRMWBJkH3/QyhK3exuzZ0".
Warning Feed is missing the Last-Modified HTTP header.
Feed is well-formed XML.
Warning Feed has no styling.
This is an RSS feed.
Feed title: Gwern.net Newsletter
Feed self link matches feed URL.
Feed has an image at https://substackcdn.com/image/fetch/$s_!rpC9!,w_256,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F5860611c-2de0-45a7-99a0-dc1b248b0199_1280x1280.png.
Feed has 14 items.
First item published on 2021-06-11T14:16:22.000Z
Last item published on 2020-05-01T00:00:00.000Z
All items have published dates.
Newest item was published on 2021-06-11T14:16:22.000Z.
Home page URL: https://gwern.substack.com/
Home page has feed discovery link in <head>.
Error Home page does not have a link to the feed in the <body>.

Formatted XML

<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:atom="http://www.w3.org/2005/Atom" version="2.0" xmlns:itunes="http://www.itunes.com/dtds/podcast-1.0.dtd" xmlns:googleplay="http://www.google.com/schemas/play-podcasts/1.0">
    <channel>
        <title><![CDATA[Gwern.net Newsletter]]></title>
        <description><![CDATA[Latest gwern.net updates, interesting links, and reviews]]></description>
        <link>https://gwern.substack.com</link>
        <image>
            <url>https://substackcdn.com/image/fetch/$s_!rpC9!,w_256,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F5860611c-2de0-45a7-99a0-dc1b248b0199_1280x1280.png</url>
            <title>Gwern.net Newsletter</title>
            <link>https://gwern.substack.com</link>
        </image>
        <generator>Substack</generator>
        <lastBuildDate>Sat, 14 Mar 2026 18:37:46 GMT</lastBuildDate>
        <atom:link href="https://gwern.substack.com/feed" rel="self" type="application/rss+xml"/>
        <copyright><![CDATA[Gwern Branwen]]></copyright>
        <language><![CDATA[en]]></language>
        <webMaster><![CDATA[[email protected]]]></webMaster>
        <itunes:owner>
            <itunes:email><![CDATA[[email protected]]]></itunes:email>
            <itunes:name><![CDATA[gwern]]></itunes:name>
        </itunes:owner>
        <itunes:author><![CDATA[gwern]]></itunes:author>
        <googleplay:owner><![CDATA[[email protected]]]></googleplay:owner>
        <googleplay:email><![CDATA[[email protected]]]></googleplay:email>
        <googleplay:author><![CDATA[gwern]]></googleplay:author>
        <itunes:block><![CDATA[Yes]]></itunes:block>
        <item>
            <title><![CDATA[May 2021 Gwern.net Newsletter]]></title>
            <description><![CDATA[links on AI hardware, diffusion models, optogenetics, brain scanning.]]></description>
            <link>https://gwern.substack.com/p/may-2021-gwernnet-newsletter</link>
            <guid isPermaLink="false">https://gwern.substack.com/p/may-2021-gwernnet-newsletter</guid>
            <dc:creator><![CDATA[gwern]]></dc:creator>
            <pubDate>Fri, 11 Jun 2021 14:16:22 GMT</pubDate>
            <enclosure url="https://substackcdn.com/image/fetch/$s_!rpC9!,w_256,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F5860611c-2de0-45a7-99a0-dc1b248b0199_1280x1280.png" length="0" type="image/jpeg"/>
            <content:encoded><![CDATA[<p>May 2021&#8217;s <a href="https://www.gwern.net/newsletter/2021/05">Gwern.net</a> <a href="https://gwern.substack.com">newsletter</a> is now out; previous, <a href="https://www.gwern.net/newsletter/2021/04">April 2021</a> (<a href="https://www.gwern.net/tags/newsletter">archives</a>). This is a collation of links and summary of major changes, overlapping with my <a href="https://www.gwern.net/Changelog">Changelog</a>; brought to you by my donors on <a href="https://www.patreon.com/gwern">Patreon</a>.</p><p>Note: I will be in Denver 12&#8211;13 June 2021 for a conference.</p><h1>1 Writings</h1><ul><li><p><strong>Proposal</strong>: <a href="https://www.gwern.net/CYOA">&#8220;Choose Your Own Adventure AI Dungeon&#8221;</a>; <a href="https://www.gwern.net/GPT-2-preference-learning#decision-transformers-preference-learning-as-simple-as-possible">&#8220;Decision Transformers: Preference Learning As Simple As Possible&#8221;</a></p></li></ul><h1>2 Links</h1><h2>2.1 AI</h2><p><a href="https://old.reddit.com/r/mlscaling/">Matters Of Scale</a>:</p><ul><li><p><strong>Hardware</strong>:</p><ul><li><p><a href="https://arxiv.org/abs/2104.06272#deepmind">&#8220;Podracer architectures for scalable Reinforcement Learning&#8221;</a>, Hessel et al 2021 (highly-efficient TPU pod use: eg solving Pong in &lt;1min at 43 million FPS on a TPUv3-2048); <a href="https://venturebeat.com/2021/05/18/google-details-new-ai-accelerator-chips/">&#8220;Google details new TPUv4 AI accelerator chips&#8221;</a> (2.7&#215; TPUv3 chips; up to TPUv4-4096 pods, yielding &gt;1 ExaFLOPS; public access later in 2021)x</p></li><li><p><a href="https://arxiv.org/abs/2104.07857#microsoft">&#8220;ZeRO-Infinity: Breaking the GPU Memory Wall for Extreme Scale Deep Learning&#8221;</a>, Rajbhandari et al 2021 (~1 trillion parameters per 16 GPUs/DGX-2-node, scaling to &gt;512 GPUs ~40% efficiency)</p></li><li><p><a href="https://arxiv.org/abs/2105.04663#google">&#8220;GSPMD: General and Scalable Parallelization for ML Computation Graphs&#8221;</a>, Xu et al 2021 (Google upgrade of <a href="https://arxiv.org/abs/1811.06965#google" title="'GPipe: Easy Scaling with Micro-Batch Pipeline Parallelism', Huang et al 2018">GPipe</a>/<a href="https://arxiv.org/abs/2006.16668#google" title="'GShard: Scaling Giant Models with Conditional Computation and Automatic Sharding', Lepikhin et al 2020">GShard</a> arch to match <a href="https://www.microsoft.com/en-us/research/blog/deepspeed-extreme-scale-model-training-for-everyone/" title="DeepSpeed: Extreme-scale model training for everyone">MS DeepSpeed</a>: &#8220;&#8230;50%&#8211;62% compute utilization on 128&#8211;2048 Cloud TPUv3 cores for models with up to one trillion parameters&#8221;)</p></li><li><p><a href="https://arxiv.org/abs/2104.05158#facebook">&#8220;DLRM: High-performance, Distributed Training of Large-scale Deep Learning Recommendation Models&#8221;</a>,  Mudigere et al 2021 (ZionEX software/hardware platform for training  extremely large embeddings&#8212;while embeddings aren&#8217;t &#8216;real&#8217; parameters  &amp; things like <a href="https://arxiv.org/abs/2004.08366#google" title="'DynamicEmbedding: Extending TensorFlow for Colossal-Scale Applications', Zeng et al 2020">DynamicEmbedding</a> will never learn tricks like GPT-3 no matter how big, they present similar challenges); <a href="https://arxiv.org/abs/2105.08820#facebook">&#8220;RecPipe: Co-designing Models and Hardware to Jointly Optimize Recommendation Quality and Performance&#8221;</a>, Gupta et al 2021</p></li></ul></li><li><p><a href="https://arxiv.org/abs/2105.12196#deepmind">&#8220;From Motor Control to Team Play in Simulated Humanoid Football&#8221;</a>,  Liu et al 2021 (curriculum training of a single NN from raw humanoid  control to coordinated team-wide soccer strategy; neat to compare with <a href="https://arxiv.org/abs/2009.01719#deepmind" title="Grounded Language Learning Fast and Slow">Hill et al 2020</a> in terms of agent abilities)</p></li><li><p><a href="https://arxiv.org/abs/2105.11084#facebook">&#8220;Wav2vec-U: Unsupervised Speech Recognition&#8221;</a>, Baevski et al 2021</p></li><li><p><a href="https://www.anthropic.com/news/announcement">&#8220;Anthropic&#8221; public-benefit-corp/startup launched</a> (founded by the Amodeis; $124M investment for scaling &#8220;reliable and steerable AI systems&#8221;); <a href="https://www.cooperativeai.com/foundation">&#8220;Cooperative AI Foundation&#8221; (CAIF)</a> launched</p></li><li><p><a href="https://arxiv.org/abs/2105.01601#google">&#8220;MLP-Mixer: An all-MLP Architecture for Vision&#8221;</a>, Tolstikhin et al 2021 (another <a href="https://www.gwern.net/notes/FC">FC paper</a> removing even more inductive biases&#8212;ponies are all you need: &#8220;Mixer <a href="http://www.incompleteideas.net/IncIdeas/BitterLesson.html">improves more rapidly with data</a>  than ResNets, or even ViT, and the gap between large scale Mixer and  ViT models shrinks until the performance is matched on the entire  dataset&#8230;&#8221; The Bitter Lesson truly is the single bitterest lesson in ML,  isn&#8217;t it? The more people tweet about how MLP-Mixer is overhyped because  is &#8722;X% worse than the ultra-hand-optimized baseline or requires Y&#215; more  FLOPS, the more they demonstrate <em>precisely why</em> this sort of  research is so important! And showing, incidentally, that Transformers  are still under-researched if such a fundamental fact could have been  missed for so long.)</p></li><li><p><a href="https://arxiv.org/abs/2104.08945#facebook">&#8220;Data-Efficient Language-Supervised Zero-Shot Learning with Self-Distillation&#8221;</a>, Cheng et al 2021 (<a href="https://openai.com/blog/clip/">CLIP</a>-like performance scaled down to <em>n</em> = 3m using <a href="https://arxiv.org/abs/1503.02531#google" title="'Distilling the knowledge in a neural network', Hinton et al 2015">soft labels</a> generated by a <a href="https://www.gwern.net/docs/ai/2018-sharma.pdf#google" title="Conceptual Captions: A Cleaned, Hypernymed, Image Alt-text Dataset For Automatic Image Captioning">Conceptual Captions</a>-pretrained model)</p></li><li><p><a href="https://arxiv.org/abs/2104.07636#google">&#8220;SR3: Image Super-Resolution via Iterative Refinement&#8221;</a>, Saharia et al 2021; <a href="https://arxiv.org/abs/2105.05233#openai">&#8220;Diffusion Models Beat GANs on Image Synthesis&#8221;</a>, Dhariwal &amp; Nichol 2021 (<a href="https://arxiv.org/abs/2006.11239" title="'Denoising Diffusion Probabilistic Models', Ho et al 2020">DDPM</a>^<a href="file:///tmp/burlbC6ws6.html#fn1">1</a>^ finally surpass <a href="https://arxiv.org/abs/1809.11096#deepmind" title="'BigGAN: Large Scale GAN Training for High Fidelity Natural Image Synthesis', Brock et al 2018">BigGAN-deep</a> on ImageNet 512px images at similar compute-cost, as <a href="https://arxiv.org/abs/2102.09672" title="'Improved Denoising Diffusion Probabilistic Models', Nichol &amp; Dhariwal 2021">expected from their</a><a href="https://www.gwern.net/notes/Scaling">good scaling</a>); <a href="https://cascaded-diffusion.github.io/">&#8220;Cascaded Diffusion Models for High Fidelity Image Generation&#8221;</a>, Ho et al 2021</p></li><li><p><a href="https://arxiv.org/abs/2009.01325#openai">&#8220;Learning to summarize from human feedback&#8221;</a>, Stiennon et al 2020</p></li><li><p><a href="https://www.gwern.net/docs/ai/2021-power.pdf#openai">&#8220;Grokking: Generalization Beyond Overfitting On Small Algorithmic Data Sets&#8221;</a>, Power et al 2021 (<a href="https://old.reddit.com/r/mlscaling/comments/n78584/grokking_generalization_beyond_overfitting_on/">discussion</a>;  new scaling effect, &#8216;grokking&#8217;: sudden perfect generalization emerging  many epochs after training-set overfitting on algorithmic tasks when  training in <a href="https://www.gwern.net/docs/ai/2021-power-poster.png#openai">flat shallow loss landscapes</a>); <a href="https://arxiv.org/abs/2106.05237#google">&#8220;Knowledge distillation: A good teacher is patient and consistent&#8221;</a>, Beyer et al 2021 (training much smaller models merely requires hundreds of thousands or millions of epochs)</p></li><li><p><a href="https://arxiv.org/abs/2104.14830#google">&#8220;Scaling End-to-End Models for Large-Scale Multilingual ASR&#8221;</a>, Li et al 2021</p></li><li><p><a href="https://arxiv.org/abs/2103.10948">&#8220;The Shape of Learning Curves: a Review&#8221;</a>, Viering &amp; Loog 2021</p></li><li><p><a href="https://www.sciencedirect.com/science/article/pii/S0004370221000862#deepmind">&#8220;Reward is enough&#8221;</a>,  Silver et al 2021 (a DRL manifesto: reward losses enough at scale of  compute/parameters/tasks to induce all important capabilities like  memory/exploration/generalization/imitation/reasoning)</p></li><li><p><strong>Scaling Down</strong>: <a href="https://github.com/nshepperd/lazy"><code>lazy</code>: a tool for running processes in idle time</a> (how to train on a GPU without destroying your GUI&#8217;s usability! <code>lazy</code>  pauses runs briefly while you interact with your desktop, letting you  do months-long runs without going crazy or resorting to Colab etc. This  enables hobbyists to go after previously-infeasible model sizes);  EleutherAI releases <a href="https://arankomatsuzaki.wordpress.com/2021/06/04/gpt-j/">a 6b-parameter GPT-3 model, GPT-J</a> (are you still using GPT-2/GPT-Neo? upgrade!); <a href="https://arxiv.org/abs/2105.12723">&#8220;Aggregating Nested Transformers&#8221;</a>, Zhang et al 2021/<a href="https://arxiv.org/abs/2105.14217">&#8220;Less is More: Pay Less Attention in Vision Transformers&#8221;</a>, Pan et al 2021</p></li></ul><ul><li><p><a href="https://arxiv.org/abs/2105.13626#google">&#8220;ByT5: Towards a token-free future with pre-trained byte-to-byte models&#8221;</a>, Xue et al 2021 (character models&#8212;not just feasible but desirable; we&#8217;ll get our rhyming &amp; pun-making language models yet!)</p></li><li><p><a href="https://www.gwern.net/docs/ai/2008-golle.pdf">&#8220;Machine Learning Attacks Against the Asirra CAPTCHA&#8221;</a>,  Golle 2008 (a look back on a decade of CV progress: months of work for  80% cat vs dog with SVM ensembles in 2008; 5min in Fast.ai for 99%  accuracy in 2018; for even more perspective, <a href="https://www.gwern.net/docs/ai/2012-ciresan.pdf" title="Deep big multilayer perceptrons for digit recognition">Cire&#351;an 2012</a>)</p></li></ul><h2>2.2 Genetics</h2><p>Everything Is Heritable:</p><ul><li><p><a href="https://www.gwern.net/docs/genetics/heritable/2021-levey.pdf">&#8220;Bi-ancestral  depression GWAS in the Million Veteran Program and meta-analysis in  &gt;1.2 million individuals highlight new therapeutic directions&#8221;</a>, Levey et al 2021</p></li><li><p><a href="https://www.biorxiv.org/content/10.1101/2021.05.26.445798v1">&#8220;The complete sequence of a human genome&#8221;</a>, Nurk et al 2021 (<a href="https://www.nature.com/articles/d41586-021-01506-w" title="A complete human genome sequence is close: how scientists filled in the gaps; researchers added 200 million DNA base pairs and 115 protein-coding genes &#8212; but they&#8217;ve yet to entirely sequence the Y chromosome">media</a>)</p></li><li><p><a href="https://www.gwern.net/docs/iq/2021-vonstumm.pdf">&#8220;Using DNA to predict intelligence&#8221;</a>, von Stumm &amp; Plomin 2021 (review)</p></li><li><p><a href="https://www.biorxiv.org/content/10.1101/848366v2.full">&#8220;Long  read sequencing of 3,622 Icelanders provides insight into the role of  structural variants in human diseases and other traits&#8221;</a>, Beyter et al 2021</p></li><li><p><a href="https://www.gwern.net/docs/genetics/heritable/2021-owen.pdf">&#8220;Rapid Sequencing&#8211;Based Diagnosis of Thiamine Metabolism Dysfunction Syndrome&#8221;</a> (sequence everyone!)</p></li></ul><p>Engineering:</p><ul><li><p><a href="https://www.gwern.net/docs/genetics/editing/2021-robertson.pdf">&#8220;Sense codon reassignment enables viral resistance and encoded polymer synthesis&#8221;</a>,  Robertson et al 2021 (&#8220;ultra-safe cells&#8221;: synthesizing an entire E.  coli genome with swapped codons for complete viral immunity)</p></li><li><p><a href="https://www.gwern.net/docs/genetics/editing/2021-musunuru.pdf">&#8220;In vivo CRISPR base editing of </a><em><a href="https://www.gwern.net/docs/genetics/editing/2021-musunuru.pdf">PCSK9</a></em><a href="https://www.gwern.net/docs/genetics/editing/2021-musunuru.pdf"> durably lowers cholesterol in primates&#8221;</a>, Musunuru et al 2021</p></li><li><p><strong><a href="https://en.wikipedia.org/wiki/Optogenetics">Optogenetics</a></strong>: <a href="https://www.gwern.net/docs/genetics/editing/2021-sahel.pdf">&#8220;Partial recovery of visual function in a blind patient after optogenetic therapy&#8221;</a>, Sahel et al 2021 (<a href="https://www.statnews.com/2021/05/24/scientists-use-optogenetics-for-first-time-to-help-blind-patient-see/" title="With engineered proteins, scientists use optogenetics for the first time to help a blind patient see again">media</a>); <a href="https://www.gwern.net/docs/biology/2021-yang.pdf">&#8220;Wireless multilateral devices for optogenetic studies of individual and social behaviors&#8221;</a>, Yang et al 2021 (<a href="https://www.nytimes.com/2021/05/25/science/optogenetics-brain-social-behavior.html" title="Scientists Drove Mice to Bond by Zapping Their Brains With Light: The study, a tour de force in bioengineering, comes after 2 decades of research on brain-to-brain synchrony in people">media</a>)</p></li><li><p><a href="https://www.pnas.org/content/118/18/e2018181118">&#8220;Retron Library Recombineering (RLR): High-throughput functional variant screens via in vivo production of single-stranded DNA&#8221;</a>, Schubert et al 2021</p></li><li><p><a href="https://www.nature.com/articles/d41586-021-01186-6">&#8220;First genetically modified Oxitec mosquitoes released in the United States&#8221;</a></p></li><li><p><a href="https://www.biorxiv.org/content/10.1101/2021.05.28.446207v1">&#8220;Genomic characterization of world&#8217;s longest selection experiment in mouse reveals the complexity of polygenic traits&#8221;</a>, Palma-Vera et al 2021</p></li><li><p><a href="https://www.sciencedirect.com/science/article/pii/S0734975021000628">&#8220;Surrogate broodstock to enhance biotechnology research and applications in aquaculture&#8221;</a>, Jin et al 2021</p></li><li><p><a href="https://www.biorxiv.org/content/10.1101/2020.11.05.370478v3">&#8220;Utility of polygenic embryo screening for disease depends on the selection strategy&#8221;</a>, Lencz et al 2021</p></li><li><p><a href="https://www.nature.com/articles/d41586-021-01423-y">&#8220;Limit  on lab-grown human embryos dropped by stem-cell body: The International  Society for Stem Cell Research relaxed the famous 14-day rule on  culturing human embryos in its latest research guidelines&#8221;</a></p></li><li><p><a href="https://www.nytimes.com/2007/08/28/science/28crop.html">&#8220;Useful Mutants, Bred With Radiation&#8221;</a> (on <a href="https://en.wikipedia.org/wiki/Atomic_gardening">atomic gardening</a>)</p></li></ul><h2>2.3 Statistics/Meta-Science</h2><ul><li><p><a href="https://blog.dshr.org/2021/03/correlated-failures.html">&#8220;Correlated Failures&#8221; in HDDs/SSDs</a></p></li><li><p><a href="https://www.gwern.net/docs/statistics/bias/1992-rogers.pdf">&#8220;How a Publicity Blitz Created The Myth of Subliminal Advertising&#8221;</a>, Rogers 1992 (the famous movie-theater/popcorn-sales experiment never happened)</p></li></ul><h2>2.4 Politics/Religion</h2><ul><li><p><a href="https://www.gwern.net/docs/sociology/2021-costello.pdf">&#8220;Clarifying the Structure and Nature of Left-Wing Authoritarianism (LWA)&#8221;</a>, Costello et al 2021</p></li><li><p><a href="https://fantasticanachronism.com/2021/04/28/book-review-the-decline-and-fall-of-the-roman-empire/">&#8220;Book Review: </a><em><a href="https://fantasticanachronism.com/2021/04/28/book-review-the-decline-and-fall-of-the-roman-empire/">The Decline and Fall of the Roman Empire</a></em><a href="https://fantasticanachronism.com/2021/04/28/book-review-the-decline-and-fall-of-the-roman-empire/">&#8221;</a> (<a href="https://fantasticanachronism.com/2021/05/03/highlights-from-the-decline-and-fall-of-the-roman-empire/">excerpts</a>)</p></li></ul><h2>2.5 Psychology/Biology</h2><ul><li><p><a href="https://www.biorxiv.org/content/10.1101/2021.05.29.446289v1">&#8220;A connectomic study of a petascale fragment of human cerebral cortex&#8221;</a>,  Shapson-Coe et al 2021 (&#8220;&#8230;This &#8220;digital tissue&#8221; is a ~660,000&#215; scale up  of an earlier saturated reconstruction from a small region of mouse  cortex, published in 2015 (<a href="https://www.sciencedirect.com/science/article/pii/S0092867415008247" title="Saturated Reconstruction of a Volume of Neocortex">Kasthuri et al 2015</a>).  Although this scaleup was difficult, it was not hundreds of thousands  of times more difficult and took about the same amount of time as the  previous data set (~4 years)&#8230;The rapid improvements over the past few  years&#8230;argues that analyzing volumes that are even 3 orders of magnitude  larger, such as an exascale whole mouse brain connectome, will likely be  in reach within a decade." See also <a href="https://xcorr.net/2021/04/27/accelerating-progress-in-brain-recording-tech/">&#8220;Accelerating progress in brain recording tech&#8221;</a>.)</p></li><li><p><a href="https://www.nature.com/articles/s41467-021-22199-9">&#8220;Neuroimaging evidence for a network sampling theory of individual differences in human intelligence test performance&#8221;</a>, Soreq et al 2021; <a href="https://elifesciences.org/articles/64058">&#8220;The neural basis of intelligence in fine-grained cortical topographies&#8221;</a>, Feilong et al 2021; <a href="https://link.springer.com/article/10.1007/s00429-020-02113-7">&#8220;Predicting intelligence from brain gray matter volume&#8221;</a>, Hilger et al 2020 (towards the mechanistic reification of <em>g</em>: per <a href="https://www.gwern.net/docs/iq/2007-jung.pdf" title="'The Parieto-Frontal Integration Theory (P-FIT) of intelligence: Converging neuroimaging evidence', Jung &amp; Haier 2007">P-FIT</a>,  it is global efficiency/total cognitive resources which can be spent on  learning &amp; orchestrating specialized capabilities); if we consider  recent human brain imaging studies, cross-species comparisons, and deep  learning as converging, I would offer as a speculation the following:</p><p>The Master Synthesis: intelligence  is execution of small simplicity-weighted programs, best discovered by  search over smooth loss landscapes like that of <a href="https://www.gwern.net/notes/Sparsity">highly-overparameterized</a> differentiable networks containing lottery-ticket subnetworks which are ensembled/averaged over, <a href="https://www.gwern.net/Backstop#deep-bayes">approaching Bayes-optimal</a>  reasoning in the limit (as nearest-neighbors-like high dimensional  interpolation / memorization gives way to algorithmic generalization /  interpolation on a more abstract level); this can be implemented by  large numbers of similar neurons trained using any of the many  approximations to backprop; human intelligence&#8217;s <em>g</em> is real but  is the overall &#8216;pool&#8217; of neural resources which derives from overall  body integrity because the number of neurons, their density, their  myelination, resistance to damage and infection etc, is causally  downstream of all body and developmental systems, creating a huge  mutational target; the brain regions specialize and differentiate, and  their orchestration (or lack thereof) contributes to observed  performance on tasks tapping into multiple specialized regions; as tasks  rely on fewer regions or approach intrinsic ceiling, <em>g</em> ceases to be observable and task-specific influences matter most.</p></li><li><p><a href="https://www.nature.com/articles/s41591-021-01336-3">&#8220;MDMA-assisted therapy for severe PTSD: a randomized, double-blind, placebo-controlled phase 3 study&#8221;</a>, Mitchell et al 2021 (<em>d</em> = 0.9 over therapy); <a href="https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7643046/">&#8220;Effects of Psilocybin-Assisted Therapy on Major Depressive Disorder&#8221;</a>, Davis et al 2021</p></li><li><p><a href="https://www.newyorker.com/magazine/2021/04/05/why-animals-dont-get-lost">&#8220;Why  Animals Don&#8217;t Get Lost: Birds do it. Bees do it. Learning about the  astounding navigational feats of wild creatures can teach us a lot about  where we&#8217;re going&#8221;</a> (on spectacular but still mysterious feats of <a href="https://en.wikipedia.org/wiki/Animal_navigation">animal navigation</a>)</p></li><li><p><a href="https://defector.com/in-the-future-of-collecting-is-anyone-having-fun/">&#8220;In The Future Of Collecting, Is Anyone Having Fun?&#8221;</a> (on <a href="https://en.wikipedia.org/wiki/Bobblehead">Bobblehead</a> collectors)</p></li><li><p><a href="https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8114859/">&#8220;Linking Brain Biology to Intellectual Endowment: A Review on the Associations of Human Intelligence With Neuroimaging Data&#8221;</a>, Dizaji et al 2021</p></li><li><p><a href="https://www.gwern.net/docs/economics/2012-oboyle.pdf">&#8220;The Best And The Rest: Revisiting The Norm Of Normality Of Individual Performance&#8221;</a>, O&#8217;Boyle &amp; Aguinis 2012 (performance is <a href="https://www.gwern.net/notes/Pipeline">log-normal</a>)</p></li><li><p><a href="https://www.biorxiv.org/content/10.1101/2020.11.21.392720v1">&#8220;A conserved strategy for inducing appendage regeneration&#8221;</a>, Abrams et al 2021 (slight regrowth of damaged mouse limbs by drinking sugar+amino-acid-supplemented water)</p></li><li><p><a href="https://astralcodexten.substack.com/p/know-your-amphetamines">&#8220;Know Your Amphetamines&#8221;</a>, Scott Alexander</p></li><li><p><a href="https://www.nature.com/articles/srep02617">&#8220;Feeling Small: Exploring the Tactile Perception Limits [of Humans]&#8221;</a>, Skedung et al 2013</p></li><li><p><a href="http://grantland.com/features/diplomacy-the-board-game-of-the-alpha-nerds/" title="One writer enters international competition to play the world-conquering game that redefines what it means to be a geek (and a person)">&#8220;The Board Game of the Alpha Nerds: Before </a><em><a href="http://grantland.com/features/diplomacy-the-board-game-of-the-alpha-nerds/" title="One writer enters international competition to play the world-conquering game that redefines what it means to be a geek (and a person)">Risk</a></em><a href="http://grantland.com/features/diplomacy-the-board-game-of-the-alpha-nerds/" title="One writer enters international competition to play the world-conquering game that redefines what it means to be a geek (and a person)">, before </a><em><a href="http://grantland.com/features/diplomacy-the-board-game-of-the-alpha-nerds/" title="One writer enters international competition to play the world-conquering game that redefines what it means to be a geek (and a person)">Dungeons &amp; Dragons</a></em><a href="http://grantland.com/features/diplomacy-the-board-game-of-the-alpha-nerds/" title="One writer enters international competition to play the world-conquering game that redefines what it means to be a geek (and a person)">, before </a><em><a href="http://grantland.com/features/diplomacy-the-board-game-of-the-alpha-nerds/" title="One writer enters international competition to play the world-conquering game that redefines what it means to be a geek (and a person)">Magic: The Gathering</a></em><a href="http://grantland.com/features/diplomacy-the-board-game-of-the-alpha-nerds/" title="One writer enters international competition to play the world-conquering game that redefines what it means to be a geek (and a person)">, there was </a><em><a href="http://grantland.com/features/diplomacy-the-board-game-of-the-alpha-nerds/" title="One writer enters international competition to play the world-conquering game that redefines what it means to be a geek (and a person)">Diplomacy</a></em><a href="http://grantland.com/features/diplomacy-the-board-game-of-the-alpha-nerds/" title="One writer enters international competition to play the world-conquering game that redefines what it means to be a geek (and a person)">&#8221;</a> (<a href="https://en.wikipedia.org/wiki/Diplomacy_(game)">WP</a>;  &#8220;I still don&#8217;t know whom I should have trusted, if anyone. All I know  is that I felt stupid, stressed out, humiliated, and sad.&#8221;)</p></li></ul><h2>2.6 Technology</h2><ul><li><p><a href="https://rootsofprogress.org/nuclear-physics">&#8220;I walk the (beta-stability) line: How counting neutrons explains nuclear waste&#8221;</a></p></li><li><p><a href="https://alexdanco.com/2020/10/08/making-is-show-business-now/">&#8220;Making is Show Business now&#8221;</a>, Alex Danco</p></li><li><p><a href="https://www.thenewatlantis.com/publications/shop-class-as-soulcraft">&#8220;Shop Class as Soulcraft: The case for the manual trades&#8221;</a>, Crawford 2006</p></li><li><p><a href="https://www.kickstarter.com/projects/upperstory/spintronics-build-mechanical-circuits">&#8220;Spintronics: Build mechanical circuits&#8221;</a>, Kickstarter (followup to <a href="https://en.wikipedia.org/wiki/Turing_Tumble">Turing Tumble</a>)</p></li></ul><h2>2.7 Economics</h2><ul><li><p><a href="https://www.gwern.net/docs/sociology/2020-dellavigna.pdf">&#8220;RCTs to Scale: Comprehensive Evidence from 2 Nudge Units&#8221;</a>, DellaVigna &amp; Linos 2020 (nudge effects overestimated by 6.2&#215; due to publication bias)</p></li><li><p><a href="https://academic.oup.com/ije/advance-article/doi/10.1093/ije/dyab099/6288123">&#8220;No  causal associations between childhood family income and subsequent  psychiatric disorders, substance misuse and violent crime arrests: a  nationwide Finnish study of &gt;650,000 individuals and their siblings&#8221;</a>, Sariaslan et al 2021; <a href="https://academic.oup.com/ije/advance-article/doi/10.1093/ije/dyab066/6274255">&#8220;Parental income and mental disorders in children and adolescents: prospective register-based study&#8221;</a>, Kinge et al 2021</p></li><li><p><a href="https://mattlakeman.org/2021/06/01/everything-you-might-want-to-know-about-whaling/">&#8220;Everything You Might Want to Know about Whaling&#8221;</a>, Matt Lakeman</p></li><li><p><a href="https://www.gwern.net/notes/Nash">Exploding Nash Equilibrium For Trustless Trade</a></p></li></ul><h2>2.8 Fiction</h2><ul><li><p><a href="https://www.lightspeedmagazine.com/fiction/love-is-the-plan-the-plan-is-death/">&#8220;Love Is the Plan the Plan Is Death&#8221;</a>, <a href="https://en.wikipedia.org/wiki/James_Tiptree_Jr.">James Tiptree, Jr.</a> (<a href="https://en.wikipedia.org/wiki/Love_Is_the_Plan_the_Plan_Is_Death">WP</a>)</p></li></ul><h2>2.9 Miscellaneous</h2><ul><li><p><a href="https://www.newyorker.com/news/dispatch/the-strange-story-of-dagobert-the-ducktales-bandit">&#8220;The Strange Story of Dagobert, the </a><em><a href="https://www.newyorker.com/news/dispatch/the-strange-story-of-dagobert-the-ducktales-bandit">Duck Tales</a></em><a href="https://www.newyorker.com/news/dispatch/the-strange-story-of-dagobert-the-ducktales-bandit">  Bandit: In the &#8217;90s, a frustrated artist in Berlin went on a crime  spree&#8212;building bombs, extorting high-end stores, and styling his persona  after Scrooge McDuck. He soon became a German folk hero.&#8221;</a> (<a href="https://en.wikipedia.org/wiki/Arno_Funke">WP</a>; another reminder for Americans&#8212;odd as it may seem, Donald Duck is <em>extremely</em> popular overseas; see also the unknown-in-the-USA character <a href="https://en.wikipedia.org/wiki/John_D._Rockerduck">John D. Rockerduck</a> or <a href="https://slate.com/culture/2009/12/sweden-s-bizarre-tradition-of-watching-donald-duck-kalle-anka-cartoons-on-christmas-eve.html">beloved Scandinavian tradition</a><em><a href="https://en.wikipedia.org/wiki/From_All_of_Us_to_All_of_You">From All of Us to All of You</a></em> who 2020 airing set an all-time record of &gt;4.5m viewers)</p></li><li><p><a href="https://en.wikipedia.org/wiki/Atmospheric_optics#List">List of atmospheric optical phenomena</a> (How many would you recognize from a distance or plane? How many have you even heard of?)</p></li><li><p><a href="https://en.wikipedia.org/wiki/Franz_Nopcsa_von_Fels%C5%91-Szilv%C3%A1s">Baron Franz Nopcsa von Fels&#337;-Szilv&#225;s</a> (noted geologist, paleontologist, anthropologist, homosexual, &amp; skyjacker)</p></li><li><p><a href="https://en.wikipedia.org/wiki/Krishnacore">Krishnacore</a></p></li></ul><div><hr></div><ol><li><p>What is a diffusion model like DDPM? To try to explain it as simply as possible <a href="https://yang-song.github.io/blog/2021/score/" title="Generative Modeling by Estimating Gradients of the Data Distribution">without the math</a>:</p><p>DDPM is a neural net which is trained to fix noise in an image: it  takes a noisy image and &#8216;sharpens&#8217; it to produce a new image. You train  it by adding dirt to a normal image, and teaching it to turn the dirty  version into the original. As it gets better, it learns what the images  all tend to look like so it can &#8216;see through&#8217; ever more noise, to turn  smudged hints of the original image into its best guess. Once it&#8217;s done  training, what happens if you give it a completely dirty photo, which is  pure static noise? Well, it produces a slightly less dirty &#8216;photo&#8217;. And  if you do it again? it&#8217;s a little cleaner still. Now, what if you do  this many times? It has to get cleaner each time. The end result: the  static noise goes in, and a face pops out! The DDPM has hallucinated a  face out of the noise. One little blob of static here turned into a  nose, and another blob turned into an ear, and it went from there.</p></li></ol>]]></content:encoded>
        </item>
        <item>
            <title><![CDATA[April 2021 newsletter]]></title>
            <description><![CDATA[with links on AI scaling, particular new East Asian record-breaking work & deep reinforcement learning.]]></description>
            <link>https://gwern.substack.com/p/april-2021-newsletter</link>
            <guid isPermaLink="false">https://gwern.substack.com/p/april-2021-newsletter</guid>
            <dc:creator><![CDATA[gwern]]></dc:creator>
            <pubDate>Thu, 03 Jun 2021 15:45:24 GMT</pubDate>
            <enclosure url="https://bucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com/public/images/33773d07-4631-4a6b-91c2-44a2b1082385_1164x702.png" length="0" type="image/jpeg"/>
            <content:encoded><![CDATA[<p>April 2021&#8217;s <a href="https://www.gwern.net/newsletter/2021/04">Gwern.net</a> <a href="https://gwern.substack.com">newsletter</a> is now out; previous, <a href="https://www.gwern.net/newsletter/2021/03">March 2021</a> (<a href="https://www.gwern.net/tags/newsletter">archives</a>). This is a collation of links and summary of major changes, overlapping with my <a href="https://www.gwern.net/Changelog">Changelog</a>; brought to you by my donors on <a href="https://www.patreon.com/gwern">Patreon</a>.</p><h1>1 Writings</h1><ul><li><p><a href="https://www.gwern.net/Variables">Better Greek Variable Suggestions</a> (use &#1008;, &#962;, &#965;, &#982;, &#933;, &#926;, &#953;, &#1009;, &#977;, or &#928; instead)</p></li></ul><h1>2 Links</h1><h2>2.1 AI</h2><ul><li><p><a href="https://arxiv.org/abs/1810.00825">&#8220;Set Transformer: A Framework for Attention-based Permutation-Invariant Neural Networks&#8221;</a>, Lee et al 2018; <a href="https://arxiv.org/abs/2103.03206#deepmind">&#8220;Perceiver: General Perception with Iterative Attention&#8221;</a>, Jaegle et al 2021 (skinny Transformers applied recurrently; given reinvention, one might ask &#8220;is <a href="https://arxiv.org/abs/1706.03762#google" title="'Attention Is All You Need', Vaswani et al 2017">attention</a>, getting too much attention?&#8221;, especially given how many Transformer tweaks <a href="https://arxiv.org/abs/2102.11972#google" title="'Do Transformer Modifications Transfer Across Implementations and Applications?', Narang et al 2021">don&#8217;t pan out</a>  or have antecedents, indicating a gold rush? Probably not: if the  marginal return on this research direction had fallen below that of  competitors, we would see those neglected directions invade Transformer  topics&#8212;while we continue to see the reverse, and many applications as  yet untouched by all the new approaches, suggesting that we <em>still</em> don&#8217;t pay enough attention)</p></li><li><p><a href="https://arxiv.org/abs/2103.04689">&#8220;Z-IL: Predictive Coding Can Do Exact Backpropagation on Any Neural Network&#8221;</a>, Salvatori et al 2021 (scaling local learning rules to ImageNet AlexNet/Resnet &amp; ALE DRL at similar compute cost)</p></li><li><p><a href="https://arxiv.org/abs/1708.07120">&#8220;Super-Convergence: Very Fast Training of Neural Networks Using Large Learning Rates&#8221;</a>,  Smith &amp; Topin 2017 (the lingering mystery of super-convergence,  saving 50&#8211;90% compute with LRs as high as 20 (!): what is it, why does  it work only sometimes, is there any connection to <a href="https://www.gwern.net/docs/ai/2021-power.pdf#openai" title="'Grokking: Generalization Beyond Overfitting On Small Algorithmic Data Sets', Powers et al 2021">grokking</a> &amp; can it work for large models like GPT-3 given the <a href="https://old.reddit.com/r/MachineLearning/comments/ba1wg5/d_thoughts_about_superconvergence_and/">tunneling hypothesis</a>?)</p></li><li><p><a href="http://www.offconvex.org/2021/04/07/ripvanwinkle/">&#8220;Rip van Winkle&#8217;s Razor, a Simple New Estimate for Adaptive Data Analysis&#8221;</a>  (an unusual approach to estimating generalization&#8212;by quantifying the  information-theoretic simplicity of all the powerful DL research  discoveries since 2012, into ~1 kilobyte. And yet, <em>what</em> a kilobyte&#8230;)</p></li><li><p><a href="https://github.com/golanlevin/AmbigrammaticFigures">&#8220;Ambigrammatic Figures&#8221;</a>, Levin &amp; Huang 2020 (making horrifying StyleGAN faces that can be <a href="https://en.wikipedia.org/wiki/Ambigram">rotated 180&#176;</a> by projection &amp; then <a href="https://www.gwern.net/Faces#reversing-stylegan-to-control-modify-images">gradient-ascent</a> towards an upside-down face)</p></li></ul><p><a href="https://old.reddit.com/r/mlscaling/">Matters Of Scale</a>:</p><ul><li><p><strong><a href="https://lair.lighton.ai/akronomicon/" title="The Akronomicon: an Extreme-Scale Leaderboard">Large Models</a></strong>:</p><ul><li><p>Congratulations to OpenAI on 1 year of GPT-3 &amp; OA API. Has it really only been a year?&#8212;it has truly exceeded expectations.</p></li><li><p><a href="https://en.wikipedia.org/wiki/Naver">Naver</a> announces 204b-parameter Korean-language NN, <a href="http://m.koreaherald.com/view.php?ud=20210525000824">&#8220;HyperCLOVA&#8221;</a>  (KO; unknown arch although apparently dense, or training-compute or  benchmark/loss performance; 650b token training dataset. Who knew Naver  was even trying? &#8220;And we are here as on a darkling plain / Swept with  confused alarms of struggle and flight, / Where ignorant armies clash by  night.&#8221;)</p></li><li><p><a href="https://arxiv.org/abs/2104.12369#huawei">&#8220;PanGu-&#945;: Large-scale Autoregressive Pretrained Chinese Language Models with Auto-parallel Computation&#8221;</a>,  Zeng et al 2021 (Zh; Huawei&#8217;s GPT-3-200b prototype, trained on  indigenous Chinese GPU+DL stack; a partial replication, due to  incomplete training on ~43b tokens; the <a href="https://git.openi.org.cn/PCL-Platform.Intelligence/PanGu-Alpha#user-content-%E6%A8%A1%E5%9E%8B%E4%B8%8B%E8%BD%BD">13b-parameter</a> model checkpoint has been released for download, and they are considering releasing the 200b-parameter model&#8230; <a href="https://chinai.substack.com/p/chinai-141-the-pangu-origin-story">Ding commentary</a>)</p></li><li><p>New &#119978;(100b)-parameter Transformer models announced at Google I/O &#8217;2021: <a href="https://blog.google/technology/ai/lamda/" title="LaMDA: our breakthrough conversation technology">LaMDA</a> (EN; chatbot), <a href="https://blog.google/products/search/introducing-mum/">MUM</a> (multimodal multilingual search/translation/Q&amp;A)</p></li><li><p><a href="https://www.infoq.cn/article/EFIHo75sQsVqLvFTruKE#alibaba">&#8220;PLUG&#8221;</a> (Zh): a 27b parameter BERT-like Chinese language model, targeting 200b next (AliBaba followup to <a href="https://arxiv.org/abs/1908.04577#alibaba" title="'StructBERT: Incorporating Language Structures into Pre-training for Deep Language Understanding', Wang et al 2019">StructBERT</a>/<a href="https://arxiv.org/abs/2004.07159#alibaba" title="'PALM: Pre-training an Autoencoding&amp;Autoregressive Language Model for Context-conditioned Generation', Bi et al 2020">PALM</a>)</p></li><li><p><a href="https://arxiv.org/abs/2105.13290">&#8220;CogView: Mastering Text-to-Image Generation via Transformers&#8221;</a>, Ding et al 2021 (another Chinese <a href="https://openai.com/blog/dall-e/">DALL&#183;E</a> clone, post-<a href="https://arxiv.org/abs/2103.00823#alibaba" title="'M6: A Chinese Multimodal Pretrainer', Lin et al 2021">M6</a>: <em>n</em> = <a href="https://wudaoai.cn/data-detail/1" title="WuDaoCorpus: the largest Chinese corpus data set, with about 2TB of text and 725 billion Chinese characters">30m text-image pairs</a>, 4b-parameter GPT, models to be released)</p></li><li><p><a href="https://arxiv.org/abs/2104.10157">&#8220;VideoGPT: Video Generation using VQ-VAE and Transformers&#8221;</a>, Yan et al 2021; <a href="https://arxiv.org/abs/2104.14806#microsoft">&#8220;GODIVA: </a><em><a href="https://arxiv.org/abs/2104.14806#microsoft">G</a></em><a href="https://arxiv.org/abs/2104.14806#microsoft">enerating </a><em><a href="https://arxiv.org/abs/2104.14806#microsoft">O</a></em><a href="https://arxiv.org/abs/2104.14806#microsoft">pen-</a><em><a href="https://arxiv.org/abs/2104.14806#microsoft">D</a></em><a href="https://arxiv.org/abs/2104.14806#microsoft">oma</a><em><a href="https://arxiv.org/abs/2104.14806#microsoft">I</a></em><a href="https://arxiv.org/abs/2104.14806#microsoft">n </a><em><a href="https://arxiv.org/abs/2104.14806#microsoft">V</a></em><a href="https://arxiv.org/abs/2104.14806#microsoft">ideos from n</a><em><a href="https://arxiv.org/abs/2104.14806#microsoft">A</a></em><a href="https://arxiv.org/abs/2104.14806#microsoft">tural Descriptions&#8221;</a>, Wu et al 2021 (DALL&#183;E for video on Howto100M: <a href="https://arxiv.org/abs/1906.00446#deepmind" title="'Generating Diverse High-Fidelity Images with VQ-VAE-2', Razavi et al 2019">VQ-VAE</a> + sparse attention)</p></li><li><p><a href="https://arxiv.org/abs/2104.04473#nvidia">&#8220;Efficient Large-Scale Language Model Training on GPU Clusters&#8221;</a>, Narayanan et al 2021 (Nvidia <a href="https://github.com/nvidia/megatron-lm">&#8216;Megatron-LM&#8217; software</a> for scaling up to 3072 A100 GPUs; allows 1t-parameter models at 502 petaFLOP/s or 50% efficiency, cf TPU rival, <a href="https://arxiv.org/abs/2105.04663#google" title="'GSPMD: General and Scalable Parallelization for ML Computation Graphs', Xu et al 2021: '50% to 62% compute utilization on 128 to 2048 Cloud TPUv3 cores for models with up to one trillion parameters'">GSPMD</a>, and note <a href="file:///tmp/burlyHGiKo.html#patterson-et-al-2021">Patterson et al 2021</a> estimates GPT-3 at ~3.5m V100 GPU-hours, so OA got ~20% efficiency?); <a href="https://www.youtube.com/watch?v=eAn_oiZwUXA&amp;t=2998s" title="GTC 2021 Keynote with NVIDIA CEO Jensen Huang: NVIDIA CEO Jensen Huang delivers the #GTC21&#8203; keynote, where he introduced amazing breakthroughs in building virtual worlds with NVIDIA Omniverse; in advancing enterprise computing with new NVIDIA DGX systems and software; in turning the data center into the new unit of computing with the new NVIDIA Grace CPU, BlueField-3 DPU, and DOCA 1.0 SDK; in broadening the reach of AI to all companies and industries with NVIDIA EGX and Aerial 5G; and in transforming transportation with NVIDIA DRIVE Orin and Atlan.">&#8220;We expect to see multi-trillion-parameter models by next year, and 100 trillion+ parameter models by 2023&#8221;</a> &#8212;Nvidia CEO <a href="https://en.wikipedia.org/wiki/Jensen_Huang">Jensen Huang</a> (<a href="https://www.gwern.net/docs/ai/2021-04-12-jensenhuang-gtc2021keynote-eAn_oiZwUXA.en.vtt.txt">subtitles</a>)</p></li><li><p>Mixture-Of-Experts:</p><ul><li><p><a href="https://en.pingwest.com/a/8693">BAAI&#8217;s &#8220;Wudao Wensu&#8221;: 1.75-trillion parameters &amp; multimodal!</a> (<a href="https://syncedreview.com/2021/03/23/chinas-gpt-3-baai-introduces-superscale-intelligence-model-wu-dao-1-0/">prologue</a>)</p></li><li><p><a href="https://arxiv.org/abs/2105.15082#alibaba">&#8220;Exploring Sparse Expert Models and Beyond&#8221;</a>, Yang et al 2021 (1t-parameter hierarchical Switch Transformer trained on 480 V100 GPUs)</p></li></ul></li></ul></li><li><p><strong><a href="https://arxiv.org/abs/1911.08265#deepmind">MuZero</a></strong>:</p><ul><li><p><a href="https://arxiv.org/abs/2104.06294#deepmind">&#8220;MuZero Unplugged: Online and Offline Reinforcement Learning by Planning with a Learned Model&#8221;</a>, Schrittwieser et al 2021 (Reanalyze+MuZero; <a href="https://www.gwern.net/images/ai/2021-schrittwieser-figure1-mspacmanmuzerologrewardscaling.png" title="Figure 1: Final scores in Ms. Pac-Man for different Reanalyse fractions. By scaling the Reanalyse fraction, MuZero can be trained at any desired data budget. All other parameters are held constant. Note the logarithmic x-axis: Linear improvements in score require exponentially more data, matching scaling laws such as described by Kaplan et al 2020 for language models.">smooth log-scaling</a> of <em>Ms.&nbsp;Pacman</em> reward with sample size, 107&#8211;1010, showing that DRL for arcade games parallels board games)</p></li><li><p><a href="https://sites.google.com/berkeley.edu/decision-transformer">&#8220;Decision Transformer: Reinforcement Learning via Sequence Modeling&#8221;</a>, Chen et al 2021</p></li><li><p><a href="https://arxiv.org/abs/2104.06303#deepmind">&#8220;Sampled MuZero: Learning and Planning in Complex Action Spaces&#8221;</a>, Hubert et al 2021 (MuZero for continuous domains: DM Control Suite/Real-World RL Suite); <a href="https://arxiv.org/abs/2006.07430">&#8220;Continuous Control for Searching and Planning with a Learned Model&#8221;</a>, Yang et al 2020</p></li><li><p><a href="https://arxiv.org/abs/2104.06159">&#8220;Muesli: Combining Improvements in Policy Optimization&#8221;</a>, Hessel et al 2020 (catching up with original MuZero)</p></li><li><p><a href="https://arxiv.org/abs/2102.12924">&#8220;Visualizing MuZero Models&#8221;</a>, de Vries et al 2021 (reimplementing &amp; introspecting a MuZero)</p></li></ul></li><li><p><a href="https://arxiv.org/abs/2104.03113">&#8220;Scaling Scaling Laws with Board Games&#8221;</a>, <a href="https://andyljones.com/">Jones</a> 2021 (AlphaZero/<a href="https://en.wikipedia.org/wiki/Hex_(board_game)">Hex</a>: <a href="https://www.gwern.net/notes/Faster">highly-optimized</a> GPU implementation enables showing <a href="https://www.gwern.net/notes/Scaling">smooth scaling</a>  across 6 OOM of compute&#8212;2&#215; FLOPS = 66% victory; amortization of  training &#8594; runtime tree-search, where 10&#215; training = 15&#215; runtime)</p></li><li><p><a href="https://christina.kim/2021/04/11/scaling-laws-for-language-transfer-learning/#openai">&#8220;Scaling Laws for Language Transfer Learning&#8221;</a>, Christina Kim (<a href="https://arxiv.org/abs/2102.01293#openai" title="Scaling Laws for Transfer">Hernandez et al 2021</a> followup: smooth scaling for En &#8594; De/Es/Zh)</p></li><li><p><a href="https://arxiv.org/abs/2104.10350#google">&#8220;Carbon Emissions and Large Neural Network Training&#8221;</a>,  Patterson et al 2021 (&#8220;&#8230;choice of DNN/datacenter/processor can reduce  the carbon footprint up to ~100&#8211;1000&#215;. These large factors make  retroactive estimates difficult.&#8221;)</p></li><li><p><a href="https://arxiv.org/abs/2104.07705">&#8220;How to Train BERT with an Academic Budget&#8221;</a>, Izsak et al 2021 (<a href="https://arxiv.org/abs/1810.04805#google" title="'BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding', Devlin et al 2018">BERT</a> in 8 GPU-days&#8212;R&amp;D iteration allows finding efficiency; there&#8217;s nothing so expensive as demanding research be cheap.^1^)</p></li></ul><h2>2.2 Genetics</h2><p>Everything Is Heritable:</p><ul><li><p><a href="https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6818669/">&#8220;Precision exercise medicine: understanding exercise response variability&#8221;</a>,  Ross et al 2019 (&#8220;large individual differences in CRF response (range:  &#8722;33% to +118%) have been observed across the 8 exercise training studies  independent of exercise duration&#8221;&#8212;nothing in psychology, or medicine,  makes sense except in the light of individual differences&#8230;)</p></li></ul><p>Recent Evolution:</p><ul><li><p><a href="https://academic.oup.com/mbe/advance-article/doi/10.1093/molbev/msab147/6277411">&#8220;Analysis of genomic DNA from medieval plague victims suggests long-term effect of </a><em><a href="https://academic.oup.com/mbe/advance-article/doi/10.1093/molbev/msab147/6277411">Yersinia pestis</a></em><a href="https://academic.oup.com/mbe/advance-article/doi/10.1093/molbev/msab147/6277411"> on human immunity genes&#8221;</a>, Immel et al 2021</p></li></ul><p>Engineering:</p><ul><li><p><a href="https://biohackinfo.com/news-china-gene-editing-criminal-law-article-336-march-2021/">&#8220;China officially bans CRISPR babies, human clones and animal-human hybrids&#8221;</a>? (another blow to attempts to project fears &amp; fantasies onto China)</p></li></ul><h2>2.3 Politics/Religion</h2><ul><li><p><em><a href="https://www.nap.edu/catalog/25762/reflecting-sunlight-recommendations-for-solar-geoengineering-research-and-research-governance">Reflecting Sunlight: Recommendations for Solar Geoengineering Research and Research Governance</a></em>, National Academies 2021 (<a href="https://www.nytimes.com/2021/03/25/climate/geoengineering-sunlight.html">media</a>)</p></li><li><p><a href="https://www.gwern.net/docs/sociology/2020-muralidharan.pdf">&#8220;Improving Public Sector Management at Scale? Experimental Evidence on School Governance India&#8221;</a>, Muralidharan &amp; Singh 2020</p></li><li><p><a href="https://www.gwern.net/docs/fiction/2012-mason.pdf">&#8220;Jay-Z&#8217;s </a><em><a href="https://www.gwern.net/docs/fiction/2012-mason.pdf">99 Problems</a></em><a href="https://www.gwern.net/docs/fiction/2012-mason.pdf">, Verse 2: A Close Reading with 4th Amendment Guidance for Cops and Perps&#8221;</a>, Mason 2012</p></li></ul><h2>2.4 Psychology/Biology</h2><ul><li><p><a href="https://www.gwern.net/docs/longevity/2021-wiley.pdf">&#8220;Oxylipin biosynthesis reinforces cellular senescence and allows detection of senolysis&#8221;</a>, Wiley et al 2021</p></li><li><p><a href="https://www.nytimes.com/2019/02/26/magazine/psychics-skeptics-facebook.html" title="Are some celebrity mediums fooling their audience members by reading social media pages in advance? A group of online vigilantes is out to prove it">&#8220;Inside the Secret Sting Operations to Expose Celebrity Psychics&#8221;</a></p></li><li><p><a href="https://www.gwern.net/docs/catnip/2021-smith.pdf">&#8220;If I fits I sits: A citizen science investigation into illusory contour susceptibility in domestic cats (</a><em><a href="https://www.gwern.net/docs/catnip/2021-smith.pdf">Felis silvestris catus</a></em><a href="https://www.gwern.net/docs/catnip/2021-smith.pdf">)&#8221;</a>, Smith et al 2021</p></li><li><p><a href="https://www.gwern.net/docs/biology/2005-paxton.pdf">&#8220;Cetaceans,  sex and sea serpents: an analysis of the Egede accounts of a &#8216;most  dreadful monster&#8217; seen off the coast of Greenland in 1734&#8221;</a>, Paxton et al 2005 (is that a legendary cryptid in your pocket, or are you just happy to see me?)</p></li><li><p><a href="https://www.gwern.net/docs/psychology/writing/2020-reilly.pdf">&#8220;Building the perfect curse word: A psycholinguistic investigation of the form and meaning of taboo words&#8221;</a>, Reilly et al 2020</p></li><li><p><a href="https://en.wikipedia.org/wiki/Tarrare">Tarrare</a></p></li></ul><h2>2.5 Technology</h2><ul><li><p><a href="https://arxiv.org/abs/2103.07487">&#8220;How Developers Choose Names&#8221;</a>,  Feitelson et al 2021 (&#8220;Another example concerned the function  &#8216;arrangeFilesByName(files)&#8217;. When asked the return value&#8230;one suggested  the number of files reordered&#8221;)</p></li><li><p><a href="https://arxiv.org/abs/2004.02504">&#8220;Bringing GNU Emacs to Native Code&#8221;</a>,  Corallo et al 2020 (using libgccjit to make Emacs 2.3&#215; to 42&#215; faster;  gccemacs has been merged into Emacs HEAD &amp; will be available soon)</p></li><li><p><a href="https://phiresky.github.io/blog/2021/hosting-sqlite-databases-on-github-pages/">&#8220;Hosting SQLite databases on Github Pages (or any static file hoster)&#8221;</a> (a revolution in static website technology: eg running a query <a href="https://nitter.cc/simonw/status/1388933800445452290" title="Check out this demo: I run the SQL query &quot;select country_code, long_name from wdi_country order by rowid desc limit 100&quot; and it fetches just 54.2KB of new data (across 49 small HTTP requests) to return 100 results---from a statically hosted database file that's 668.8MB!">need download only 54kb of a 670MB database</a>; fulltext site search is just the beginning of the possibilities of this clever use of <a href="https://en.wikipedia.org/wiki/Byte_serving">range requests</a>)</p></li><li><p><a href="https://www.coderelay.io/fontemon.html">&#8220;</a><em><a href="https://www.coderelay.io/fontemon.html">Fontemon</a></em><a href="https://www.coderelay.io/fontemon.html">: World&#8217;s first video game in a font!&#8221;</a> (a <em>Pokemon</em>-like CYOA <a href="https://github.com/mmulet/code-relay/blob/main/markdown/HowIDidIt.md">implemented as an OpenType font file</a>; play in browser or text editor&#8212;still not quite <a href="https://www.gwern.net/Turing-complete">Turing-complete</a> but definitely the most impressive thing implemented in a font so far)</p><ul><li><p><em>Fontemon</em> is by far the highlight of <a href="http://sigbovik.org/2021/proceedings.pdf">SIGBOVIK 2021</a>; but also worth noting: <a href="http://sigbovik.org/2021/proceedings.pdf#page=8">&#8220;Back to Square One: Superhuman Performance in Chutes and Ladders Through Deep Neural Networks and Tree Search&#8221;</a> &#183; <a href="http://sigbovik.org/2021/proceedings.pdf#page=83">&#8220;Deep Deterministic Policy Gradient Boosted Decision Trees&#8221;</a> &#183; <a href="http://sigbovik.org/2021/proceedings.pdf#page=126">&#8220;Lowestcase and uppestcase letters: Advances in derp learning&#8221;</a> &#183; <a href="http://sigbovik.org/2021/proceedings.pdf#page=167">&#8220;openCHEAT: Computationally Helped Error bar Approximation Tool&#8212;Kick-starting Science 4.0&#8221;</a> &#183; <a href="http://sigbovik.org/2021/proceedings.pdf#page=216">&#8220;The Newcomb-Benford Law, Applied to Binary Data: An Empirical and Theoretic Analysis&#8221;</a> &#183; <a href="http://sigbovik.org/2021/proceedings.pdf#page=252">&#8220;Inverted Code Theory: Manipulating Program Entropy&#8221;</a> (<em><a href="https://en.wikipedia.org/wiki/Tenet_(film)">Tenet</a></em> fans only&#8212;possibly inferior to <a href="http://www.frc.ri.cmu.edu/~hpm/project.archive/general.articles/1991/TempComp.html" title="Time Travel and Computing">Moravec 1991</a>?) &#183; <a href="http://sigbovik.org/2021/proceedings.pdf#page=282">&#8220;Build your own 8-bit busy beaver on a breadboard!&#8221;</a></p></li></ul><p>Incidentally, it&#8217;s curious that while STEM fields have entire annual issues, journals, &amp; conferences devoted to satire (<a href="http://sigbovik.org/">SIGBOVIK</a>; Arxiv April Fools papers like <a href="https://arxiv.org/abs/1703.10987" title="On the Impossibility of Supersized Machines">Garfinkel et al 2017</a>; <a href="https://www108.lamp.le.ac.uk/ojs1/index.php/pst/issue/archive">Special Topics</a>; the <a href="https://www.bmj.com/about-bmj/resources-authors/article-types/christmas-issue">BMJ Christmas issue</a>; the <a href="https://en.wikipedia.org/wiki/Ig_Nobel_Prize">Ig Nobel Prizes</a> &amp; <a href="https://bahfest.com/">BAHFest</a>), after asking in several places, I have found no instances in the humanities. (I know of many entertaining <em>papers</em>, like <a href="https://www.gwern.net/docs/philo/2008-sinhababu.pdf" title="Possible Girls">Sinhababu 2008</a> on waifus, but no <em>regular organized</em> publication, with the possible exception of the annual <a href="https://en.wikipedia.org/wiki/Latke%E2%80%93Hamantash_Debate">&#8220;Latke-Hamantash Debate&#8221;</a>.)</p></li></ul><h2>2.6 Economics</h2><ul><li><p><a href="https://www.gwern.net/docs/statistics/decision/2006-thorp.pdf">&#8220;The Kelly Criterion in Blackjack Sports Betting, and the Stock Market&#8221;</a>, Thorp 2006</p></li><li><p><a href="https://marginalrevolution.com/marginalrevolution/2016/10/performance-pay-nobel.html">&#8220;The Performance Pay Nobel&#8221;</a> (CEO pay as <a href="https://www.gwern.net/Backstop">blackbox optimization problem</a>)</p></li><li><p><a href="https://www.gwern.net/docs/economics/2008-josephson.pdf">&#8220;The Ocean&#8217;s Hot Dog: The Development of the Fish Stick&#8221;</a>,  Kelly 2008 (out of nostalgia, I bought some fish sticks for the first  time in decades; better than I remembered, even if I had no <a href="https://en.wikipedia.org/wiki/Tartar_sauce">tartar</a> handy)</p></li></ul><h2>2.7 Philosophy</h2><ul><li><p><a href="https://www.gwern.net/docs/culture/2007-shiner.pdf">&#8220;The Aesthetics of Smelly Art&#8221;</a>, Shiner &amp; Kriskovets 2007; <a href="https://www.gwern.net/docs/culture/2019-kraft.pdf">&#8220;The Odor Value Concept in the Formal Analysis of Olfactory Art&#8221;</a>, Kraft 2019; <a href="https://qualiacomputing.com/2020/02/21/perfumery-as-an-art-form/" title="Hedonic Tone, memetics, scent, sex, spirituality">&#8220;Perfumery as an art form&#8221;</a>/<a href="https://qualiacomputing.com/2020/08/14/qualia-research-diary-scents/" title="Qualia Research Diary: Scents [consciousness research, Experiment, genetics, memetics, scent, valence]">notes</a>, Qualia Computing 2020 (more: manufacturing: <a href="https://www.newyorker.com/magazine/2005/03/14/scent-nile" title="Chandler Burr 2005">&#8220;The Scent of the Nile: Jean-Claude Ellena creates a new perfume&#8221;</a>; human smell is better than you think: <a href="https://www.gwern.net/docs/psychology/2006-porter.pdf">&#8220;Mechanisms of Scent-tracking in Humans&#8221;</a>, Porter et al 2006 (<a href="https://www.gwern.net/images/psychology/2006-porter-humanscenttracking-41593_2007_bfnn1819_moesm2_esm.mp4">video</a>; see also <a href="https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5512720/">&#8220;Poor Human Olfaction is a 19th Century Myth&#8221;</a>, McGann 2017); <a href="https://www.pnas.org/content/109/49/19959.full" title="'Perceptual convergence of multi-component mixtures in olfaction implies an olfactory white', Weiss et al 2012">olfactory white</a>; <em><a href="https://en.wikipedia.org/wiki/K%C5%8Dd%C5%8D">K&#333;d&#333;</a></em>, which unexpectedly appears in <a href="https://www.gwern.net/docs/cs/2005-knuth-taocp-v4-prefascicle4b.pdf#page=22" title="7.2.1.7: History of Combinatorial Generation: Set Partitions">Knuth</a>. <a href="https://threadreaderapp.com/thread/1357071738731814912.html" title="https://twitter.com/add_hawk/status/1357071738731814912">C. Thi Nguyen</a>&#8217;s description of the more bizarre &amp; avant-garde perfumes made me curious enough to nose around &amp; order 39 <a href="https://www.luckyscent.com/">LuckyScent</a> samplers.)</p></li></ul><h2>2.8 Miscellaneous</h2><ul><li><p><a href="https://en.wikipedia.org/wiki/Bog_butter">Bog butter</a></p></li><li><p><a href="https://en.wikipedia.org/wiki/Sarah_Bernhardt">Sarah Bernhardt</a> (Lions. Lots of lions.)</p></li></ul><div><hr></div><ol><li><p>Another thought, looking at <a href="https://bls.gov/news.release/ecec.nr0.htm">&#8216;Employer Costs for Employee Compensation&#8217;</a> (<a href="https://bls.gov/news.release/archives/ecec_031986.pdf">PDF</a>):</p><ol><li><p>&#8220;Moore&#8217;s Law&#8221;: the cost of a transistor halves every ~19 months;</p></li><li><p>&#8220;Anti-Moore&#8217;s Law&#8221;: the cost of a synapse doubles every ~119 years.</p></li></ol></li></ol>]]></content:encoded>
        </item>
        <item>
            <title><![CDATA[March 2021 Gwern.net Newsletter]]></title>
            <description><![CDATA[2 major new site features: 'popins' and recursive Wikipedia popups]]></description>
            <link>https://gwern.substack.com/p/march-2021-gwernnet-newsletter</link>
            <guid isPermaLink="false">https://gwern.substack.com/p/march-2021-gwernnet-newsletter</guid>
            <dc:creator><![CDATA[gwern]]></dc:creator>
            <pubDate>Tue, 06 Apr 2021 15:31:01 GMT</pubDate>
            <enclosure url="https://bucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com/public/images/f10eb6e5-7674-4465-b223-2f254bc50ddb_685x1368.png" length="0" type="image/jpeg"/>
            <content:encoded><![CDATA[<p><a href="https://www.gwern.net/newsletter/2021/03">March 2021&#8217;s Gwern.net</a> <a href="https://gwern.substack.com">newsletter</a> is now out; previous, <a href="https://www.gwern.net/newsletter/2021/02">February 2021</a> (<a href="https://www.gwern.net/tags/newsletter">archives</a>). This is a summary of the revision-history RSS feed, overlapping with my <a href="https://www.gwern.net/Changelog">Changelog</a> &amp; <a href="https://old.reddit.com/r/gwern/">/r/gwern</a>; brought to you by my donors on <a href="https://www.patreon.com/gwern">Patreon</a>.</p><h1>1 Writings</h1><ul><li><p><strong>Gwern.net</strong>: mobile &#8220;popins&#8221; are finally enabled! (<a href="https://www.gwern.net/images/design/2021-03-28-gwern.net-annotations-mobilepopins-darkmode.png">example</a>); new Wikipedia popups (this 7th implementation enables <em><a href="https://www.gwern.net/images/design/2021-04-01-gwern.net-annotations-popups-recursivewikipediapopups.png">recursive</a></em><a href="https://www.gwern.net/images/design/2021-04-01-gwern.net-annotations-popups-recursivewikipediapopups.png"> WP popups</a>)</p></li></ul><h1>2 Links</h1><h2>2.1 AI</h2><ul><li><p><a href="https://distill.pub/2021/multimodal-neurons/#openai">&#8220;Multimodal Neurons in Artificial Neural Networks&#8221;</a>, Goh et al 2021 (dissecting <a href="https://openai.com/blog/clip/" title="CLIP (Contrastive Language-Image Pre-training): Connecting Text and Images">CLIP</a> concepts, discovering typographical classification &#8216;attacks&#8217;^1^ and a <a href="https://en.wikipedia.org/wiki/Stroop_effect">Stroop effect</a>! Is there anything CLIP can&#8217;t do?)</p></li><li><p><a href="https://arxiv.org/abs/2101.03958#google">&#8220;Evolving Reinforcement Learning Algorithms&#8221;</a>, Co-Reyes et al 2021 (evolving eg <a href="https://en.wikipedia.org/wiki/Temporal_difference_learning">TD-learning</a>)</p></li><li><p><a href="https://www.gwern.net/docs/rl/2021-scanlon.pdf">&#8220;Waymo Simulated Driving Behavior in Reconstructed Fatal Crashes within an Autonomous Vehicle Operating Domain&#8221;</a>, Scanlon et al 2021 (<a href="https://blog.waymo.com/2021/03/replaying-real-life.html">blog</a>; hard negative mining&#8212;self-driving cars, being inhuman, can learn not just from their mistakes but humans&#8217; mistakes too)</p></li><li><p><a href="https://andyljones.com/posts/rl-debugging.html">&#8220;Debugging Reinforcement Learning Systems Without The Agonizing Pain&#8221;</a>, Andy L. Jones; <a href="https://clemenswinter.com/2021/03/24/my-reinforcement-learning-learnings/">&#8220;My Reinforcement Learning Learnings&#8221;</a>, Clemens Winter</p></li></ul><p><a href="https://old.reddit.com/r/mlscaling/">Matters Of Scale</a>:</p><ul><li><p><a href="https://arxiv.org/abs/2103.01988#facebook">&#8220;SEER: Self-supervised Pretraining of Visual Features in the Wild&#8221;</a>, Goyal&nbsp;et&nbsp;al&nbsp;2021 (<a href="https://ai.facebook.com/blog/self-supervised-learning-the-dark-matter-of-intelligence" title="Self-supervised learning: The dark matter of intelligence">blog</a>;  near-SOTA by training 1b-param CNN on 1b unfiltered unlabeled Internet  images&#8212;another reminder that unsupervised learning is really working!); <a href="https://ai.facebook.com/blog/learning-from-videos-to-understand-the-world">&#8220;&#8216;Learning From Videos&#8217; to understand the world&#8221;</a> (rapid FB expansion of self-supervised training to millions of photos/videos/hours-of-speech); <a href="https://arxiv.org/abs/2103.14005">&#8220;Contrasting Contrastive Self-Supervised Representation Learning Models&#8221;</a>,  Kotar et al 2021 (Supervised learning from ImageNet is now obsolete for  transfer learning, and ImageNet just a contaminated validation set)</p></li><li><p><a href="https://arxiv.org/abs/2103.14586#google">&#8220;Understanding Robustness of Transformers for Image Classification&#8221;</a>, Bhojanapalli et al 2021 (<a href="https://openreview.net/forum?id=YicbFdNTTy#google">Vision Transformers</a> gain robustness faster than CNNs as dataset size increases)</p></li><li><p><a href="https://aiindex.stanford.edu/wp-content/uploads/2021/03/2021-AI-Index-Report_Master.pdf#page=41">&#8220;Artificial Intelligence Index Report 2021&#8221;</a>: technical performance and cost (<a href="https://chinai.substack.com/p/chinai-137-year-3-of-chinai" title="ChinAI #137: Year 3 of ChinAI: Reflections on the newsworthiness of machine translation">Ding questions</a>  whether this shows China catching up on AI at all, as we are  incessantly told it is doing; one question to ask: ignoring  fast-following, what, out of the thousands upon thousands of  publications flooding out these days, are the last 3 <em>major novel</em>  AI breakthroughs coming out of all pure-Chinese labs combined which  could be plausibly equated in importance with, say, just OpenAI&#8217;s recent  output of <a href="https://arxiv.org/abs/2005.14165#openai">GPT-3</a>/<a href="https://openai.com/blog/dall-e/">DALL&#183;E</a>/CLIP?)</p></li><li><p><a href="https://openai.com/blog/gpt-3-apps/">OA GPT-3 API: &gt;300 apps, &gt;10k developers, &gt;4.5b words per day</a></p></li><li><p><a href="https://www.pnas.org/content/116/23/11537">&#8220;A mathematical theory of semantic development in deep neural networks&#8221;</a>, Saxe et al 2019 (are jumps in NN capabilities to be expected when scaling? see also <a href="https://arxiv.org/pdf/2103.10948.pdf#page=22" title="The Shape of Learning Curves: a Review: 6. Ill-behaved learning curves: 6.1. Phase transitions">Viering &amp; Loog 2021</a>&#8217;s discussion of phase transitions &amp; averaging of exponentials giving power-laws)</p></li><li><p><a href="https://www.cell.com/cell/fulltext/S0092-8674(21)00239-7">&#8220;An early cell shape transition drives evolutionary expansion of the human forebrain&#8221;</a>, Benito-Kwiecinski et al 2021 (<a href="https://www.theguardian.com/science/2021/mar/24/scientists-discover-why-the-human-brain-is-so-big" title="Scientists discover why the human brain is so big: Molecular switch makes human organ three times larger than great apes&#8217;, study finds">media</a>; a simple switch for the <a href="https://www.gwern.net/docs/psychology/2012-herculanohouzel.pdf" title="'The remarkable, yet not extraordinary, human brain as a scaled-up primate brain and its associated cost', Herculano-Houzel 2012">scaling up</a> of the primate brain)</p><ul><li><p><a href="https://www.statnews.com/2020/09/24/crows-possess-higher-intelligence-long-thought-primarily-human/">&#8220;Crows possess higher intelligence long thought primarily human&#8221;</a> (the remarkable, yet not extraordinary, crow/raven brain as scaled-up <a href="https://en.wikipedia.org/wiki/Bird_intelligence">bird brain</a>)</p></li></ul></li></ul><h2>2.2 Genetics</h2><p>Everything Is Heritable:</p><ul><li><p><a href="https://advances.sciencemag.org/content/7/11/eabd1239">&#8220;GWAS in almost 195,000 individuals identifies 50 previously unidentified genetic loci for eye color&#8221;</a>, Simcoe et al 2021</p></li><li><p><a href="https://www.gwern.net/docs/genetics/heritable/2021-fagereng.pdf">&#8220;Why Do Wealthy Parents Have Wealthy Children?&#8221;</a>,  Fagereng&nbsp;et&nbsp;al&nbsp;2021 (I&#8217;m always impressed just how difficult it is for  rich people to pass on wealth&#8212;&#8220;shirtsleeves to shirtsleeves in 3  generations&#8221; etc)</p></li></ul><p>Evolution:</p><ul><li><p><a href="https://www.biorxiv.org/content/10.1101/2021.02.25.432891v1">&#8220;Nothing in evolution makes sense except in the light of parasites&#8221;</a>, Hickinbotham et al 2021</p></li></ul><p>Engineering:</p><ul><li><p><a href="https://www.sierraclub.org/sierra/2021-2-march-april/feature/demise-and-potential-revival-american-chestnut" title="Before a disastrous blight, the American chestnut was a keystone species in eastern forests. Could genetic engineering help bring it back?">&#8220;The Demise and Potential Revival of the American Chestnut&#8221;</a></p></li></ul><h2>2.3 Statistics/Meta-Science</h2><ul><li><p><a href="https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7831807/">&#8220;Broad cross-national public support for accelerated COVID-19 vaccine trial designs&#8221;</a>,  Broockman et al 2021 (&#8220;we can&#8217;t do challenge trials with volunteers in  February 2020 to save countless thousands of lives because ordinary  people might think it unethical&#8221;&#8212;have you tried <em>asking</em> them, or was that irrelevant because it was just another noble lie?)</p></li><li><p><a href="https://crystalprisonzone.blogspot.com/2021/01/i-tried-to-report-scientific-misconduct.html">&#8220;This is the story of how I found what I believe to be scientific misconduct and what happened when I reported it&#8221;</a>, Joe Hilgard</p></li><li><p><a href="https://www.newyorker.com/culture/cultural-comment/the-revolution-in-classic-tetris">&#8220;The Revolution in Classic Tetris: How a younger generation used the Internet to master the falling blocks&#8221;</a> (how achieving classic Tetris maximum-scores, first done in 2010, became routine thanks to YouTube &amp; <a href="https://www.gwern.net/Bakewell#external-links">online competition for excellence</a>)</p></li></ul><h2>2.4 Politics/Religion</h2><ul><li><p><a href="https://www.gwern.net/docs/sociology/2021-singh.pdf">&#8220;Magic, Explanations, and Evil: The Origins and Design of Witches and Sorcerers&#8221;</a>, Singh 2021 (doubtless even cavemen were all &#8220;Og: sus.&#8221;)</p></li><li><p><a href="https://elifesciences.org/articles/62878">&#8220;Self-blinding citizen science to explore psychedelic microdosing&#8221;</a>, Szigeti et al 2021 (related to <a href="https://www.nature.com/articles/s41598-021-81446-7" title="Positive expectations predict improved mental-health outcomes linked to psychedelic microdosing">Kaertner et al 2021</a>; a self-blinding study, similar to my old self-blinding protocols, confirms that microdosing is just placebo effect, as <a href="https://www.gwern.net/LSD-microdosing">I said in 2012</a>, and I&#8217;m reminded of DNB studies like <a href="https://www.gwern.net/docs/dnb/2016-foroughi.pdf" title="Placebo effects in cognitive training">Foroughi et al 2016</a>)</p></li><li><p><a href="https://en.wikipedia.org/wiki/2019%E2%80%932020_vaping_lung_illness_outbreak">The 2019&#8211;2020 vaping moral panic</a>  over adulterated black-market THC products (depressing to see how  irresponsibly reported &amp; alarmist this was, and how everyone  attempted to frame nicotine for it<a href="file:///tmp/burlVELRZZ.html#fn2">2</a>. Naturally, no one involved has apologized or admitted fault&#8212;after all, their <em><a href="https://en.wikipedia.org/wiki/Noble_lie">intentions</a></em><a href="https://en.wikipedia.org/wiki/Noble_lie"> were good</a>,  &#8220;won&#8217;t someone think of the children&#8221;&#8253; The incompetence and/or  dishonesty here emphasizes how 2020&#8211;2021 was business as usual, and the  only unusual part is that reality happened so fast we saw some of <a href="https://en.wikipedia.org/wiki/Parable_of_the_broken_window">the unseen</a>.)</p></li><li><p><a href="https://en.wikipedia.org/wiki/Mark_Hofmann">Mark Hofmann</a></p></li><li><p><a href="https://en.wikipedia.org/wiki/Alexandra_David-N%C3%A9el">Alexandra David-N&#233;el</a> (one of <em>those</em> 1800&#8211;1900s biographies)</p></li><li><p><a href="https://en.wikipedia.org/wiki/John_Harvey_Kellogg">John Harvey Kellogg</a></p></li></ul><h2>2.5 Psychology/Biology</h2><ul><li><p><a href="https://www.gwern.net/docs/iq/2021-brown.pdf">&#8220;Can You Ever Be Too Smart for Your Own Good? Comparing Linear and Nonlinear Effects of Cognitive Ability on Life Outcomes&#8221;</a>, Brown et al 2021</p></li><li><p><a href="https://psyarxiv.com/g8f9s/">&#8220;The pandemic fallacy: Inaccuracy of social scientists&#8217; and lay judgments about COVID-19&#8217;s societal consequences in America&#8221;</a>, Hutcherson et al 2021 (highly-inaccurate even retrospectively, typically grossly overestimating)</p></li><li><p><a href="https://psyarxiv.com/hc8je/">&#8220;Training Working Memory for Two Years&#8212;No Evidence of Latent Transfer to Intelligence&#8221;</a>, Watrin et al 2021 (fade-out of expectancy/placebo effects)</p></li><li><p><a href="https://www.cell.com/current-biology/fulltext/S0960-9822(21)00059-2">&#8220;Real-time dialogue between experimenters and dreamers during REM sleep&#8221;</a>, Konkoly et al 2021</p></li><li><p><a href="https://www.sciencedirect.com/science/article/pii/S0149763421001068">&#8220;Leroy&#8217;s elusive little people: A systematic review on lilliputian hallucinations&#8221;</a>, Blom 2021 (<a href="https://en.wikipedia.org/wiki/Alice_in_Wonderland_syndrome">Alice in Wonderland syndrome</a>)</p></li><li><p><a href="https://www.theatlantic.com/science/archive/2021/01/orcas-killer-whale-resident-transient/617862/">&#8220;A  Group of Orca Outcasts Is Now Dominating an Entire Sea: &#8216;Transient&#8217;  killer whales that feast on seals and hunt in small packs are thriving  while their widely beloved &#8216;Resident&#8217; siblings are dying out&#8221;</a> (I wonder how the third <a href="https://en.wikipedia.org/wiki/Killer_whale">orca</a> type, <a href="https://en.wikipedia.org/wiki/Killer_whale#Types">&#8216;offshore&#8217;</a>, are doing?)</p></li><li><p><a href="https://www.gwern.net/docs/biology/1995-watanabe.pdf">&#8220;Estimation of the total saliva volume produced per day in 5-year-old children&#8221;</a>, Watanabe et al 1995</p></li></ul><h2>2.6 Technology</h2><ul><li><p><a href="https://www.nngroup.com/articles/aesthetic-usability-effect/">&#8220;The Aesthetic-Usability Effect&#8221;</a>, Moran 2017 (<a href="https://pointersgonewild.com/2019/11/02/they-might-never-tell-you-its-broken/">&#8220;They Might Never Tell You It&#8217;s Broken&#8221;</a> if it&#8217;s pretty enough; see also <a href="https://asktog.com/atc/the-third-user/" title="'The Third User, or, Exactly Why Apple Keeps Doing Foolish Things">&#8220;The Third User&#8221;</a>)</p></li><li><p><a href="https://ciechanow.ski/cameras-and-lenses/">&#8220;Cameras and Lenses&#8221;</a>, Bartosz Ciechanowski (explorable; followup to <a href="https://ciechanow.ski/lights-and-shadows/">&#8220;Lights and Shadows&#8221;</a>)</p></li><li><p><a href="https://arxiv.org/abs/2103.07013">&#8220;Large Batch Simulation for Deep Reinforcement Learning&#8221;</a>, Shacklett et al 2021 (your computer is faster than you think)</p></li><li><p><a href="https://obscuritory.com/essay/incredible-boxes-of-hock-wah-yeo/">&#8220;The incredible boxes of Hock Wah Yeo&#8221;</a> (unusual video game packaging design)</p></li><li><p><a href="https://www.gwern.net/docs/technology/2017-post.pdf">&#8220;Stone Walls That Stay Built: A master waller shares how to dry-lay stone walls that hold their ground for centuries&#8221;</a>, Post 2017</p></li><li><p><a href="https://en.wikipedia.org/wiki/Automated_storage_and_retrieval_system">Automated storage and retrieval system</a></p></li><li><p><a href="https://en.wikipedia.org/wiki/Visual_cryptography">Visual cryptography</a></p></li></ul><h2>2.7 Economics</h2><ul><li><p><a href="https://www.gwern.net/docs/economics/2021-meyer.pdf">&#8220;The Use and Misuse of Income Data and Extreme Poverty in the United States&#8221;</a>, Meyer et al 2021 (measurement error in non-registry surveys of population extremes&#8212;not quite <a href="https://www.gwern.net/GPT-3#lizardman-constant">&#8220;lizardman&#8221;</a> but similar problem)</p></li><li><p><a href="https://www.gwern.net/docs/economics/2006-mackenzie.pdf">&#8220;Is economics performative? Option theory and the construction of derivatives markets&#8221;</a>, Mackenzie 2006 (the mechanics of how the <a href="https://en.wikipedia.org/wiki/Black%E2%80%93Scholes_model">Black-Scholes model</a> changed markets: <a href="https://en.wikipedia.org/wiki/Fischer_Black">Black</a>  ran a service printing &#8220;paper&#8221; estimating optimal prices for all  options which traders could consult &amp; use with simple heuristics to  try to arbitrage the market)</p></li><li><p><a href="https://www.cabinetmagazine.org/issues/52/hodes.php">&#8220;Whitewood under Siege: On the front lines of the pallet wars&#8221;</a> (the competition between the two ecosystems of shipping <a href="https://en.wikipedia.org/wiki/Pallet">pallets</a>: &#8216;whitewood&#8217; &amp; &#8216;blue pallet&#8217;)</p></li><li><p><em><a href="https://en.wikipedia.org/wiki/Mautam">Mautam</a></em></p></li></ul><h2>2.8 Philosophy</h2><ul><li><p><a href="https://www.tandfonline.com/doi/full/10.1080/03949370.2021.1893826">&#8220;Coping with mortality: responses of monkeys and great apes to collapsed, inanimate and dead conspecifics&#8221;</a>, De Marco et al 2021</p></li><li><p><a href="https://en.wikipedia.org/wiki/Braitenberg_vehicle">Braitenberg vehicle</a></p></li></ul><h2>2.9 Fiction</h2><ul><li><p><a href="https://en.wikipedia.org/wiki/Reply_of_the_Zaporozhian_Cossacks">&#8220;Reply of the Zaporozhian Cossacks&#8221;</a></p></li></ul><h2>2.10 Miscellaneous</h2><ul><li><p>America&#8217;s top ace, <a href="https://en.wikipedia.org/wiki/Dick_Bong">Major Dick Bong</a></p></li></ul><h1>3 Film/TV</h1><p><strong>Live-action:</strong></p><ul><li><p><em><a href="https://en.wikipedia.org/wiki/North_by_Northwest">North by Northwest</a></em> (<a href="https://en.wikipedia.org/wiki/Alfred_Hitchcock">Hitchcock</a>  1959; for such a extremely respected movie, it felt oddly formless and  like it was bouncing through genres as more of a comedic B-movie romp  than a serious auteur&#8217;s effort&#8212;since James Bond started in 1953, with a  TV adaptation in 1954, NbN comes off as almost a satire. I mean, really,  monkeying around in Presidential noses!)</p></li></ul><div><hr></div><ol><li><p>While interesting, these are &#8216;attacks&#8217; only in the most generous interpretation possible (since it <a href="https://nitter.cc/NoaNabeshima/status/1368662246885265409" title="The new CLIP adversarial examples are partially from the use-mention distinction. CLIP was trained to predict which caption from a list matches an image. It makes sense that a picture of an apple with a large 'iPod' label would be captioned with 'iPod', not 'Granny Smith'! This can be somewhat fixed with a list of labels that are more explicit about this, at least for a small set of pictures I've tried. After some experimentation, I found this prompt that seems to work with CLIP ViT-B-32: ...">does know</a> <a href="https://www.youtube.com/watch?v=Rk3MBx20z24&amp;t=35s" title="'Apple or iPod? Easy Fix for Adversarial Textual Attacks on OpenAI's CLIP Model!', Yannic Kilcher">the difference</a>),  and the fact that CLIP can read text in images to note the semantic  similarity, is to considerable credit. As the CLIP authors <a href="https://www.gwern.net/images/ai/2021-radford-clip-figure4-promptengineering.png" title="Radford et al 2021 (CLIP): **Figure 4**. _Prompt engineering and ensembling improve zero-shot performance_. Compared to the baseline of using contextless class names, prompt engineering and ensembling boost zero-shot classification performance by almost 5 points on average across 36 datasets. This improvement is similar to the gain from using 4&#215; more compute with the baseline zero-shot method but is &#8220;free&#8221; when amortized over many predictions.">note</a>,  some queries benefit from ensembling, more context than a single word  class name such as prefixing &#8220;A photograph of a&#8221;, and class names can be  highly ambiguous: in ImageNet, the class name &#8220;crane&#8221; could refer to  the bird or construction equipment; and the Oxford-IIIT Pet dataset  labels one class &#8220;boxer&#8221;. (CLIP is still <a href="https://stanislavfort.github.io/2021/03/05/OpenAI_CLIP_stickers_and_adversarial_examples.html" title="Pixels still beat text: Attacking the OpenAI CLIP model with text patches and adversarial pixel perturbations">vulnerable to regular adversarial examples</a>, of course.)&#8617;</p></li><li><p>It <em>couldn&#8217;t&#8217;ve</em> been  nicotine because people had been vaping for a decade and a half without  widespread near-instantaneous lung-related fatalities! It <em>had</em>  to be a new adulterant, and as soon as the first few black-market THC  links surfaced, that meant the problem had to be THC-products-only  because how would the same adulterant simultaneously get into the  different supply chains? And yet, every article, health official, and  activist did their paternalist best to suggest otherwise to pin the  blame on regular vaping, no matter how many tests turned up clean, and  it was the nicotine vaping products which got summarily banned&#8230;. One  must assume many of those laws are still on the books, inasmuch as <a href="https://old.reddit.com/r/electronic_cigarette/comments/lkhewr/usa_vape_mail_ban_newssales_megathread/">the shipping bans keep expanding</a>.&#8617;</p></li></ol>]]></content:encoded>
        </item>
        <item>
            <title><![CDATA[February 2021 Gwern.net Newsletter]]></title>
            <description><![CDATA[links on AI scaling, semaglutide, and ethicist ethics]]></description>
            <link>https://gwern.substack.com/p/february-2021-gwernnet-newsletter</link>
            <guid isPermaLink="false">https://gwern.substack.com/p/february-2021-gwernnet-newsletter</guid>
            <dc:creator><![CDATA[gwern]]></dc:creator>
            <pubDate>Sat, 13 Mar 2021 15:18:44 GMT</pubDate>
            <enclosure url="https://bucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com/public/images/ef890e58-1193-4984-a1a5-8aca6141b85d_1108x691.png" length="0" type="image/jpeg"/>
            <content:encoded><![CDATA[<p>February 2021&#8217;s <a href="https://www.gwern.net/newsletter/2021/02">Gwern.net</a> <a href="https://gwern.substack.com">newsletter</a> is now out; previous, <a href="https://www.gwern.net/newsletter/2021/01">January 2021</a> (<a href="https://www.gwern.net/tags/newsletter">archives</a>). This is a summary of the revision-history RSS feed, overlapping with my <a href="https://www.gwern.net/Changelog">Changelog</a> &amp; <a href="https://old.reddit.com/r/gwern/">/r/gwern</a>; brought to you by my donors on <a href="https://www.patreon.com/gwern">Patreon</a>.</p><h1>1 Writings</h1><ul><li><p><strong>Gwern.net</strong>: popups: can now be moved, stickied, and full-screened (another step towards our ambition of Windows-95-in-the-browser!)</p></li></ul><h1>2 Links</h1><h2>2.1 AI</h2><ul><li><p><a href="https://lilianweng.github.io/lil-log/2021/01/02/controllable-neural-text-generation.html">&#8220;Controllable Neural Text Generation&#8221;</a>, Lilian Weng; <a href="https://ruder.io/recent-advances-lm-fine-tuning/" title="This article provides an overview of recent methods to fine-tune large pre-trained language models">&#8220;Recent Advances in Language Model Fine-tuning&#8221;</a>, Sebastian Ruder (review)</p><ul><li><p><a href="https://arxiv.org/abs/2102.07350">&#8220;Prompt Programming for Large Language Models: Beyond the Few-Shot Paradigm&#8221;</a>,  Reynolds &amp; McDonell 2021 (original 10-shot Fr &#8594; En translation can  be beaten by the better 0-shot prompt: &#8220;French: XYZ / English:&#8230;&#8221;; this  is &#8220;true of most worst-performing prompts&#8230;&#8221;); <a href="https://arxiv.org/abs/2102.09690">&#8220;Calibrate Before Use: Improving Few-Shot Performance of Language Models&#8221;</a>, Zhao et al 2021 (huge boost from calibrating unstable prompts; both demonstrate, <a href="https://www.gwern.net/GPT-3#prompts-as-programming">as always</a>, that &#8220;sampling can prove the presence of knowledge but not the absence.&#8221;)</p></li></ul></li><li><p><a href="https://arxiv.org/abs/2102.07074">&#8220;TransGAN: Two Transformers Can Make One Strong GAN&#8221;</a>, Jiang et al 2021 (Transformer-only GAN: attention is all you need)</p></li><li><p><a href="https://arxiv.org/abs/2102.06203">&#8220;PACT: Proof Artifact Co-training for Theorem Proving with Language Models&#8221;</a>, Han et al 2021 (<a href="https://arxiv.org/abs/2009.03393#openai" title="'GPT-f: Generative Language Modeling for Automated Theorem Proving', Polu &amp; Sutskever 2020">GPT-f</a> for <a href="https://en.wikipedia.org/wiki/Lean_(proof_assistant)">Lean</a>)</p></li><li><p><a href="https://arxiv.org/abs/2010.10648#google">&#8220;Towards End-to-End In-Image Neural Machine Translation&#8221;</a>, Mansimov et al 2020 (sure why not)</p></li><li><p><strong>Brains</strong>:</p><ul><li><p><a href="https://www.quantamagazine.org/artificial-neural-nets-finally-yield-clues-to-how-brains-learn-20210218/" title="The learning algorithm that enables the runaway success of deep neural networks doesn&#8217;t work in biological brains, but researchers are finding alternatives that could">&#8220;Artificial Neural Nets Finally Yield Clues to How Brains Learn&#8221;</a>  (short overview of biologically-plausible backprop: feedback alignment,  target propagation, predictive coding, &amp; attentional feedback; also  of recent interest, <a href="https://arxiv.org/abs/2012.14905" title="'VS-ML: Meta Learning Backpropagation And Improving It', Kirsch &amp; Schmidhuber 2021">VS-ML</a>;  given their increasing success in training while respecting more  biological constraints, the increasing power of backprop-trained ANNs  and the neurological success of ANNs in predicting &amp; imitating brain  signals, it is increasingly clear that brains <em>really do</em> do backprop in some sense)</p></li><li><p><a href="https://www.biorxiv.org/content/10.1101/2021.02.22.432340v1">&#8220;NSD: A massive 7-tesla fMRI dataset to bridge cognitive and computational neuroscience&#8221;</a>,  Jean et al 2021 (&#8220;&#8230;The availability of NSD thus opens the door to using  brain activity to directly guide the optimization of deep neural  networks.&#8221;)</p></li><li><p><a href="https://www.biorxiv.org/content/10.1101/2021.02.02.429430v1">&#8220;Brain2Pix: Fully convolutional naturalistic video reconstruction from brain activity&#8221;</a>, Le et al 2021 (reconstructing <em><a href="https://www.biorxiv.org/content/10.1101/687681v1.full" title="'A large single-participant fMRI dataset for probing brain responses to naturalistic stimuli in space and time', Seeliger et al 2019">Dr.&nbsp;Who</a></em>)</p></li><li><p><a href="https://www.biorxiv.org/content/10.1101/2020.07.01.183384v1.full">&#8220;High-performance brain-to-text communication via imagined handwriting&#8221;</a>, Willett et al 2020</p></li><li><p><a href="https://www.gwern.net/docs/rl/2021-spape.pdf">&#8220;Brain-computer interface for generating personally attractive images&#8221;</a>, Spape et al 2021 (many ways to improve this&#8230;)</p></li></ul></li></ul><p><a href="https://old.reddit.com/r/mlscaling/">Matters Of Scale</a>:</p><ul><li><p><a href="https://arxiv.org/abs/2102.01293#openai">&#8220;Scaling Laws for Transfer&#8221;</a>,  Hernandez et al 2021 (&#8220;We find that pre-training effectively multiplies  the fine-tuning dataset size&#8221;; a shot across the bow of anyone floating  on a proprietary-dataset moat: large models can drop data requirements  by orders of magnitude overnight, even surpassing you)</p></li><li><p><a href="https://arxiv.org/abs/2102.05918#google">&#8220;ALIGN: Scaling Up Visual and Vision-Language Representation Learning With Noisy Text Supervision&#8221;</a>, Jia et al 2021 (see also <a href="https://arxiv.org/abs/2102.08981#google" title="'Conceptual 12M: Pushing Web-Scale Image-Text Pre-Training To Recognize Long-Tail Visual Concepts', Changpinyo et al 2021">CC-12M</a>; <a href="https://openai.com/blog/clip/">CLIP</a>-like w/EfficientNet trained on 1.8 billion images on a TPUv3-1024&#8212;<a href="https://arxiv.org/abs/2102.00529#deepmind" title="'Decoupling the Role of Data, Attention, and Losses in Multimodal Transformers', Hendricks et al 2021">DM</a> argues that fancier cross-modal Transformers are better, nevertheless, <a href="http://www.incompleteideas.net/IncIdeas/BitterLesson.html">&#8216;TPUs go brrr&#8217;</a>. Given DALL&#183;E, CLIP, ALIGN, <a href="https://arxiv.org/abs/2011.10650#openai" title="'VDVAE: Very Deep VAEs Generalize Autoregressive Models and Can Outperform Them on Images', Child 2020">VDVAE</a>, <a href="https://arxiv.org/abs/2102.09532" title="'Clockwork Variational Autoencoders', Saxena et al 2021">CW-VAE</a>, <a href="https://arxiv.org/abs/2102.12037" title="'AIPO: Image Completion via Inference in Deep Generative Models', Harvey et al 2021">AIPO</a>  et al, are GANs already dead, and just don&#8217;t realize it yet? Or at  least soon to be relegated to only DRL-like uses as a final finetuning  phase to sharpen up a self-supervised model?); <a href="https://arxiv.org/abs/2103.06561">&#8220;WenLan: Bridging Vision and Language by Large-Scale Multi-Modal Pre-Training&#8221;</a>, Huo et al 2021</p></li><li><p><a href="https://arxiv.org/abs/2102.12092#openai">&#8220;DALL&#183;E: Zero-Shot Text-to-Image Generation&#8221;</a>, Ramesh et al 2021 (<a href="https://openai.com/blog/dall-e/">original blog</a>); <a href="https://arxiv.org/abs/2103.00823#alibaba">&#8220;M6: A Chinese Multimodal Pretrainer&#8221;</a>,  Lin et al 2021 (Chinese DALL&#183;E: 1.9TB images/0.29TB text for  10b-parameter dense/100b-parameter MoE Transformer; shockingly fast Chinese replication of DALL&#183;E/CLIP)</p></li><li><p><a href="https://arxiv.org/abs/2102.06701#google">&#8220;Explaining Neural Scaling Laws&#8221;</a>, Bahri et al 2021/<a href="https://arxiv.org/abs/2102.04074#deepmind">&#8220;Learning Curve Theory&#8221;</a>, Hutter 2021 (<a href="https://www.lesswrong.com/posts/Yt5wAXMc7D2zLpQqx/an-140-theoretical-models-that-predict-scaling-laws#HIGHLIGHTS">Rohin Shah commentary</a>; more on the manifold hypothesis)</p></li></ul><h2>2.2 Genetics</h2><p>Everything Is Heritable:</p><ul><li><p><a href="https://www.nature.com/articles/s41467-021-21283-4">&#8220;Phenotypic covariance across the entire spectrum of relatedness for 86 billion pairs of individuals&#8221;</a>, Kemper et al 2021</p></li><li><p><a href="https://www.nature.com/articles/s41380-021-01027-y">&#8220;Genetic variation, brain, and intelligence differences&#8221;</a>, Deary et al 2021</p></li><li><p><a href="https://www.biorxiv.org/content/10.1101/2021.02.10.430571v1">&#8220;Pathfinder: A gamified measure to integrate general cognitive ability into the biological, medical and behavioural sciences&#8221;</a>, Malanchini et al 2021 (not the focus, but the IQ PGS is a slight improvement over <a href="https://www.biorxiv.org/content/early/2018/09/17/418210" title="Genomic prediction of cognitive traits in childhood and adolescence">Allegrini et al 2018</a> due to less phenotype measurement error?)</p></li><li><p><a href="https://www.nature.com/articles/s41380-021-01026-z">&#8220;Polygenic  burden has broader impact on health, cognition, and socioeconomic  outcomes than most rare and high-risk copy number variants&#8221;</a>, Saarentaus et al 2021</p></li><li><p><a href="http://www.scielo.br/scielo.php?script=sci_arttext&amp;pid=S1516-44462021005006201" title="'Ditching candidate gene association studies: lessons from psychiatric genetics', Duarte et al 2021">On candidate-genes &amp; COMT</a></p></li></ul><p>Recent Evolution:</p><ul><li><p><a href="https://www.nytimes.com/2021/02/17/science/DNA-mammoth.html">&#8220;Million-Year-Old  DNA Rewrites the Mammoth Family Tree: Genomic data&#8212;the oldest ever  recovered from a fossil&#8212;reveals the origin and evolution of the  Columbian mammoth&#8221;</a></p></li><li><p><a href="https://www.pnas.org/content/118/6/e2016046118">&#8220;Kin selection explains the evolution of cooperation in the gut microbiota&#8221;</a>, Simonet &amp; McNally 2021</p></li></ul><p>Engineering:</p><ul><li><p><a href="https://www.nytimes.com/2021/02/18/science/black-footed-ferret-clone.html" title="&quot;Meet Elizabeth Ann, the First Cloned Black-Footed Ferret: Her birth represents the first cloning of an endangered species native to North America, and may bring needed genetic diversity to the species&quot;">First Black-Footed Ferret cloned</a></p></li></ul><h2>2.3 Statistics/Meta-Science</h2><ul><li><p><a href="https://www.lesswrong.com/posts/9YDk52NPrfq7nqLvd/lessons-from-the-book-of-my-life">&#8220;Lessons from Gerolamo Cardano&#8217;s </a><em><a href="https://www.lesswrong.com/posts/9YDk52NPrfq7nqLvd/lessons-from-the-book-of-my-life">The Book of My Life</a></em><a href="https://www.lesswrong.com/posts/9YDk52NPrfq7nqLvd/lessons-from-the-book-of-my-life">&#8221;</a> (progress studies; see also <a href="https://www.gwern.net/Newton">Newton&#8217;s anthropic argument</a>, <a href="https://www.gwern.net/Bakewell">Bakewell &amp; inventing progress</a>, <em><a href="https://www.gwern.net/Book-reviews#the-autobiography-of-benvenuto-cellini-cellini-1999">The Autobiography of Benvenuto Cellini</a></em>)</p></li><li><p><a href="https://www.wired.com/story/group-house-covid-risk-points/">&#8220;How Many Microcovids Would You Spend on a Burrito?&#8221;</a> (on the <a href="https://www.microcovid.org/">microCOVID Project Calculator</a>)</p></li><li><p><a href="https://www.gwern.net/docs/math/1968-hammersley.pdf">&#8220;On  the enfeeblement of mathematical skills by &#8216;Modern Mathematics&#8217; and by  similar soft intellectual trash in schools and universities&#8221;</a>, Hammersley 1968 (<a href="https://www.gwern.net/docs/math/1973-knuth.pdf" title="The Dangers of Computer--Science Theory">Knuth</a> highlights as also amusing: <a href="https://www.gwern.net/docs/math/1967-austin.pdf">&#8220;A Note on Piffles&#8221;</a>, Smith 1967; <a href="https://www.gwern.net/docs/math/1980-farlow.pdf">&#8220;A rebuke of A. B. Smith&#8217;s paper, &#8216;A Note on Piffles&#8217;&#8221;</a>, Farlow 1980)</p></li><li><p><a href="https://www.gwern.net/docs/statistics/bias/2011-tatum.pdf">&#8220;Artifact and Recording Concepts in EEG&#8221;</a>, Tatum et al 2011 (on the <a href="https://en.wikipedia.org/wiki/Electroencephalography">EEG</a> signals of <a href="https://en.wikipedia.org/wiki/Jell-O">Jell-O</a>, or, the importance of <a href="https://en.wikipedia.org/wiki/Scientific_control#Negative">negative controls</a>)</p></li></ul><h2>2.4 Politics/Religion</h2><ul><li><p><a href="https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0032541">&#8220;The Logic of Fashion Cycles&#8221;</a>, Acerbi et al 2012; <a href="https://royalsocietypublishing.org/doi/10.1098/rsif.2018.0731">&#8220;Fashion and art cycles are driven by counter-dominance signals of elite competition: quantitative evidence from music styles&#8221;</a>, Klimek et al 2019; <a href="https://arxiv.org/abs/1410.8001">&#8220;The hipster effect: When anti-conformists all look the same&#8221;</a>, Touboul 2019; <a href="https://slatestarcodex.com/2014/04/22/right-is-the-new-left">&#8220;Right Is The New Left&#8221;</a>, Scott Alexander (see also <a href="https://www.gwern.net/docs/culture/2010-han.pdf" title="Signaling Status with Luxury Goods: The Role of Brand Prominence">Han et al 2010</a>, <a href="https://www.gwern.net/docs/sociology/1972-downs.pdf" title="Up and down with ecology---the 'issue-attention cycle'">Downs 1972</a>/<a href="https://www.gwern.net/docs/sociology/2015-gupta.pdf" title="On Anthony Downs's 'Up and Down with Ecology: The &quot;Issue-Attention&quot; Cycle'">Gupta &amp; Jenkins-Smith 2015</a>, <a href="https://www.nature.com/articles/s41467-019-09311-w" title="Accelerating dynamics of collective attention">Lorenz-Spreen et al 2019</a>/<a href="https://www.gwern.net/docs/culture/2019-candia.pdf" title="The universal decay of collective memory and attention">Candia et al 2019</a>, <a href="https://www.gwern.net/docs/sociology/1994-loury.pdf" title="Self-Censorship in Public Discourse: A Theory of 'Political Correctness' and Related Phenomena">Loury 1994</a>)</p></li><li><p><a href="https://aeon.co/essays/what-can-we-learn-from-the-lunar-pandemic-that-never-was">&#8220;What can we learn from the lunar pandemic that never was?&#8221;</a>  (NASA&#8217;s lunar quarantine was a sham intended to mollify the public as  they covered up repeated major failures &amp; lab leaks both before  &amp; after&#8212;had there been any dangerous lunar organisms, they would  have escaped easily)</p></li><li><p><a href="https://en.wikipedia.org/wiki/MrBeast">MrBeast</a> (the new aristocracy of <a href="https://meltingasphalt.com/social-status-down-the-rabbit-hole/">prestige</a>? Borrowed plumage, perhaps, but effective&#8230;)</p></li><li><p><a href="https://www.cell.com/current-biology/fulltext/S0960-9822(17)30949-1">&#8220;Russia&#8217;s new Lysenkoism&#8221;</a>, Kolchinsky et al 2017</p></li></ul><h2>2.5 Psychology/Biology</h2><ul><li><p><strong><a href="https://en.wikipedia.org/wiki/Semaglutide">Semaglutide</a></strong>: <a href="https://www.gwern.net/docs/longevity/2021-wilding.pdf">&#8220;Once-Weekly Semaglutide in Adults with Overweight or Obesity&#8221;</a>, Wilding et al 2021; <a href="https://www.gwern.net/docs/longevity/2021-wadden.pdf">&#8220;Effect  of Subcutaneous Semaglutide vs Placebo as an Adjunct to Intensive  Behavioral Therapy on Body Weight in Adults With Overweight or Obesity:  The STEP 3 Randomized Clinical Trial&#8221;</a>, Wadden et al 2021</p><p>A longer-acting version of the insulin/appetite peptide <a href="https://en.wikipedia.org/wiki/Liraglutide">liraglutide</a>, semaglutide greatly reduces weight, fat, blood sugar, cholesterol etc, with an <a href="https://link.springer.com/article/10.1007/s40262-018-0728-4" title="'Safety and pharmacokinetics of single and multiple ascending doses of the novel oral human GLP-1 analogue, oral semaglutide, in healthy subjects and subjects with type 2 diabetes', Granhall et al 2019">upcoming oral version</a>; background: <a href="https://www.gwern.net/docs/longevity/2020-kushner.pdf" title="Semaglutide 2.4 mg for the Treatment of Obesity: Key Elements of the STEP Trials 1 to 5">Kushner et al 2020</a>, <a href="https://www.gwern.net/docs/longevity/2019-aroda.pdf" title="Comparative efficacy, safety, and cardiovascular outcomes with once-weekly subcutaneous semaglutide in the treatment of type 2 diabetes: Insights from the SUSTAIN 1--7 trials">Aroda et al 2019</a>, <a href="https://www.gwern.net/docs/longevity/2019-nauck.pdf" title="Management Of Endocrine Disease: Are all GLP-1 agonists equal in the treatment of type 2 diabetes?">Nauck &amp; Meier 2019</a>, <a href="https://www.gwern.net/docs/longevity/2018-oneil.pdf" title="Efficacy and safety of semaglutide compared with liraglutide and placebo for weight loss in patients with obesity: a randomized, double-blind, placebo and active controlled, dose-ranging, phase 2 trial">O&#8217;Neil et al 2018</a>, <a href="https://www.gwern.net/docs/longevity/2017-blundell.pdf" title="Effects of once-weekly semaglutide on appetite, energy intake, control of eating, food preference and body weight in subjects with obesity">Blundell et al 2017</a>, <a href="https://www.gwern.net/docs/longevity/2016-nauck.pdf" title="A Phase 2, Randomized, Dose-Finding Study of the Novel Once-Weekly Human GLP-1 Analog, Semaglutide, Compared With Placebo and Open-Label Liraglutide in Patients With Type 2 Diabetes">Nauck et al 2016</a>, <a href="https://www.gwern.net/docs/longevity/2015-lau.pdf" title="Discovery of the Once-Weekly Glucagon-Like Peptide-1 (GLP-1) Analogue Semaglutide">Lau et al 2015</a>.</p></li><li><p><a href="https://www.gwern.net/docs/biology/2020-irving.pdf">&#8220;Lessons from the host defences of bats, a unique viral reservoir&#8221;</a>, Irving et al 2021 (<a href="https://en.wikipedia.org/wiki/Bat-borne_virus">bat-borne viruses</a>; previously, <a href="https://get21stnight.com/2020/03/30/why-do-we-keep-getting-diseases-from-bats/">Trevor Klee</a>)</p></li><li><p><a href="https://www.frontiersin.org/articles/10.3389/fcell.2021.628157/full">&#8220;Beneficial  &amp; Detrimental Effects of Reactive Oxygen Species on Lifespan: A  Comprehensive Review of Comparative &amp; Experimental Studies&#8221;</a>,  Shields et al 2021 (antioxidants still aren&#8217;t the fountain of youth, and  may be harmful; animal studies still frequently inconsistent)</p></li><li><p><a href="https://www.nature.com/articles/s41598-021-81446-7">&#8220;Positive expectations predict improved mental-health outcomes linked to psychedelic microdosing&#8221;</a>, Kaertner et al 2021 (placebo)</p></li><li><p><a href="https://www.gwern.net/docs/iq/2021-aggeborn.pdf">&#8220;The Effects of Fluoride in Drinking Water&#8221;</a>, Aggeborn &amp; &#214;hman 2021</p></li><li><p><a href="https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1978350/">&#8220;Sleep  &amp; Sex: What Can Go Wrong? A Review of the Literature on Sleep  Related Disorders and Abnormal Sexual Behaviors &amp; Experiences&#8221;</a>, Schenck et al 2007</p></li></ul><h2>2.6 Technology</h2><ul><li><p><a href="https://www.xprize.org/prizes/elonmusk">New X-Prize: $100m in prizes for Carbon Removal</a></p></li><li><p><a href="https://en.wikipedia.org/wiki/Gauge_block">Wringing gauge blocks</a>  (&#8220;With their precisely-flat metal faces, gauge blocks can be stuck  together non-magnetically via a process calling &#8216;wringing&#8217;, requiring  substantial effort to separate. Scientists are still uncertain exactly  how wringing works.&#8221;)</p></li><li><p><a href="https://en.wikipedia.org/wiki/Armoured_train">Armored train</a></p></li></ul><h2>2.7 Economics</h2><ul><li><p><a href="https://ourworldindata.org/cheap-renewables-growth">&#8220;Why did renewables become so cheap so fast? And what can we do to use this global opportunity for green growth?&#8221;</a>, Max Roser (specifically, why such an extreme <a href="https://en.wikipedia.org/wiki/Experience_curve_effects">experience curve</a>?)</p></li><li><p><a href="https://www.gwern.net/docs/iq/2012-grinblatt.pdf">&#8220;IQ, trading behavior, and performance&#8221;</a>, Grinblatt et al 2012; <a href="https://www.gwern.net/docs/economics/2020-barth.pdf">&#8220;Genetic Endowments and Wealth Inequality&#8221;</a>,  Barth et al 2020 (why, despite notorious setbacks, did Isaac Newton  &amp; LTCM&#8217;s founders die wealthy? Why, in general, are more intelligent  people so much better investors? &#8216;The indifference of the indicator&#8217;:  it&#8217;s not one thing, it&#8217;s everything&#8212;more intelligent people have lower  discount rates, save more for longer &amp; are less risk-averse, more  accurately predict future growth or inflation, are more likely to  participate in +EV opportunities like the stock market, to use low-fee  rather than high-fee (and thus, underperforming) mutual funds, succumb  less to biases like herding as they trade better &amp; at better times,  trade less, and harvest losses more efficiently when trading poorly.)</p></li></ul><h2>2.8 Philosophy</h2><ul><li><p>Are <strong>ethics experts more ethical</strong>? <a href="https://www.gwern.net/docs/philo/2016-schwitzgebel.pdf">&#8220;The Behavior of Ethicists&#8221;</a>, Schwitzgebel &amp; Rust 2016 (most recently: <a href="https://www.gwern.net/docs/philo/2019-schonegger.pdf">&#8220;The moral behavior of ethics professors: A replication-extension in German-speaking countries&#8221;</a>,  Sch&#246;negger et al 2019; given moral licensing &amp; activism, perhaps we  should be surprised we don&#8217;t hear about more ethicists doing things  like posting enemy lists or trying to dox reviewers. &#8220;Woe to you  Pharisees!&#8221;)</p></li><li><p><a href="https://psyarxiv.com/quwgr">&#8220;Meta-analysis on belief in free will manipulations&#8221;</a>, Genschow et al 2021 (another noble lie turns out to be ignoble)</p></li><li><p><a href="https://en.wikipedia.org/wiki/Cooperative_principle">Gricean maxims of communication</a></p></li></ul><h2>2.9 Fiction</h2><ul><li><p><em><a href="https://en.wikipedia.org/wiki/Bunnies_%26_Burrows">Bunnies &amp; Burrows</a></em></p></li></ul><h2>2.10 Miscellaneous</h2><ul><li><p><a href="https://www.gwern.net/docs/history/1995-pop.pdf">&#8220;Caesar Lives&#8221;</a>, <a href="https://en.wikipedia.org/wiki/Iggy_Pop">Iggy Pop</a> 1995 (on <a href="https://en.wikipedia.org/wiki/The_History_of_the_Decline_and_Fall_of_the_Roman_Empire">Gibbon</a>)</p></li><li><p><a href="https://en.wikipedia.org/wiki/Grayanotoxin#Mad_honey_intoxication">Mad honey</a></p></li><li><p><a href="https://en.wikipedia.org/wiki/Imperial_Court_System">Imperial Court System</a></p></li></ul>]]></content:encoded>
        </item>
        <item>
            <title><![CDATA[Jan 2021 Gwern.net Newsletter]]></title>
            <description><![CDATA[January 2021 gwern.net newsletter with links on AI scaling up and down.]]></description>
            <link>https://gwern.substack.com/p/jan-2021-gwernnet-newsletter</link>
            <guid isPermaLink="false">https://gwern.substack.com/p/jan-2021-gwernnet-newsletter</guid>
            <dc:creator><![CDATA[gwern]]></dc:creator>
            <pubDate>Thu, 04 Feb 2021 20:23:01 GMT</pubDate>
            <enclosure url="https://substackcdn.com/image/fetch/$s_!rpC9!,w_256,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F5860611c-2de0-45a7-99a0-dc1b248b0199_1280x1280.png" length="0" type="image/jpeg"/>
            <content:encoded><![CDATA[<p>January 2021&#8217;s <a href="https://www.gwern.net/newsletter/2021/01">Gwern.net</a> <a href="https://gwern.substack.com">newsletter</a> is now out; previous, <a href="https://www.gwern.net/newsletter/2020/12">December 2020</a> (<a href="https://www.gwern.net/tags/newsletter">archives</a>). This is a summary of the revision-history RSS feed, overlapping with my <a href="https://www.gwern.net/Changelog">Changelog</a> &amp; /r/gwern; brought to you by my donors on <a href="https://www.patreon.com/gwern">Patreon</a>.</p><h1>1 Writings</h1><ul><li><p><a href="https://www.gwern.net/Danbooru2020" title="Danbooru2020 is a large-scale anime image database with 4.2m+ images annotated with 130m+ tags; it can be useful for machine learning purposes such as image recognition and generation.">&#8220;Danbooru2020: A Large-Scale Crowdsourced and Tagged Anime Illustration Dataset&#8221;</a></p></li><li><p><a href="https://thisanimedoesnotexist.ai/">This Anime Does Not Exist.ai (TADNE)</a> (<a href="https://www.gwern.net/Faces#extended-stylegan2-danbooru2019-aydao">discussion</a>)</p></li><li><p><strong>Gwern.net</strong>: +return-to-top floating button; <em>popups</em>:  can now be disabled (use the &#8216;gear&#8217; icon); final reimplementation  (dynamic JS now; memoizing the recursive inlining, however clever &amp;  elegant, turns out to have painful edge-cases &amp; still not be  efficient enough&#8212;web browsers <em>really</em> don&#8217;t like loading hundreds of kilobytes of extra HTML)</p></li></ul><h1>2 Links</h1><h2>2.1 AI</h2><p><a href="https://old.reddit.com/r/mlscaling/">Matters Of Scale</a>:</p><ul><li><p><strong>Scaling up</strong>:</p><ul><li><p><a href="https://openai.com/blog/dall-e/">&#8220;DALL&#183;E: Creating Images from Text&#8221;</a>, OpenAI (GPT-3-12.5b generating 1280 tokens &#8594; <a href="https://arxiv.org/abs/1906.00446#deepmind" title="'Generating Diverse High-Fidelity Images with VQ-VAE-2', Razavi et al 2019">VQ-VAE</a> pixels; generates illustration &amp; photos); <a href="https://openai.com/blog/clip/">&#8220;CLIP (Contrastive Language-Image Pre-training): Connecting Text and Images&#8221;</a>, OpenAI (<a href="https://cdn.openai.com/papers/Learning_Transferable_Visual_Models_From_Natural_Language_Supervision.pdf" title="Learning Transferable Visual Models From Natural Language Supervision">Radford et al 2021</a>: zero-shot image understanding via text description&#8212;useful for much more than just ranking DALL&#183;E samples by quality)</p><p>Further <a href="https://www.gwern.net/newsletter/2020/05#blessings-of-scale">blessings of scale</a>: simple <a href="https://arxiv.org/abs/2010.05113" title="'Contrastive Representation Learning: A Framework and Review', Le-Khac et al 2020">contrastive</a> training on <em>n</em>  = 400m leads to remarkable generalization &amp; combinatorial  flexibility of image generation by DALL&#183;E, and CLIP learns to reach  image classification SOTA by zero-shot on many datasets, with more  human-like errors &amp; less degradation out of samples than rivals,  while costing the same to train. OpenAI released their smallest CLIP  model (the &#8220;<a href="https://openreview.net/forum?id=YicbFdNTTy#google" title="Vision Transformer (ViT): An Image is Worth 16&#215;16 Words: Transformers for Image Recognition at Scale">ViT</a>-B/32&#8221;-equivalent)  and people are discovering it seems able to do just about anything  without any further training&#8212;the paper notes that it does everything  from &#8220;fine-grained object classification, geo-localization, action  recognition in videos, and OCR&#8221;, but there&#8217;s so much more, and you can  use it to generate image captions/descriptions, classify your anime  images, pull a specific target image description by gradient ascent or  out of another neural network such as an ImageNet <a href="https://arxiv.org/abs/1809.11096#deepmind" title="'BigGAN: Large Scale GAN Training for High Fidelity Natural Image Synthesis', Brock et al 2018">BigGAN</a>  or TADNE StyleGAN2-ext (or, why not, synthesize images images embodying  abstract concepts like emoji or words like &#8220;nightmare fuel&#8221; or  &#8220;confusion&#8221;!), search your image datasets by embedding, find mislabeled  images (eg by <a href="https://twitter.com/quasimondo/status/1351191660059832320">using &#8220;upside down&#8221; as the prompt</a>)&#8230;  One wonders, like GPT-3, how much better the largest CLIP  (&#8220;L/14-336px&#8221;) is and how many ways of using it (or DALL&#183;E) remain to be  found? And why prediction losses work so well in one place, but then  contrastive elsewhere?</p><p>For perspective: there are newly-minted PhDs going on the job market who got excited about deep learning because of these new <a href="https://arxiv.org/abs/1512.03385" title="'Deep Residual Learning for Image Recognition', He et al 2015">&#8220;resnet&#8221;</a> things; undergrads who applied to grad school because <a href="https://arxiv.org/abs/1810.04805#google" title="'BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding', Devlin et al 2018">BERT</a>  et al were blowing open NLP &amp; extending neural supremacy to natural  language would not yet have passed quals; and it has been only 1  academic semester since <a href="https://arxiv.org/abs/2005.14165#openai" title="'GPT-3: Language Models are Few-Shot Learners', Brown et al 2020">GPT-3</a> was announced. Or to put it quantitatively, for just sequence modeling: it has been 8,478 days since <a href="https://www.gwern.net/docs/ai/1997-hochreiter.pdf" title="'Long Short-Term Memory', Hochreiter &amp; Schmidhuber 1997">LSTM</a> RNNs were published; 3,045 days since <a href="https://papers.nips.cc/paper/2012/file/c399862d3b9d6b76c8436e924a68c45b-Paper.pdf" title="'ImageNet Classification with Deep Convolutional Neural Networks', Krizhevsky et al 2012">AlexNet&#8217;s</a> ImageNet scores were released; 1,880 days since residual networks were published in a paper; 1,330 days since <a href="https://arxiv.org/abs/1706.03762#google" title="Vaswani et al 2017">&#8220;Attention Is All You Need&#8221;</a> hit Arxiv; 844 days since BERT&#8217;s paper was published; 718 days since <a href="https://openai.com/blog/better-language-models/" title="'Better Language Models and Their Implications', OpenAI 2019">GPT-2</a> was announced; 353 days since <a href="https://arxiv.org/abs/2002.05709#google" title="'A Simple Framework for Contrastive Learning of Visual Representations', Chen et al 2020">SimCLR</a>, and 249 days since GPT-3 was; and 27 days since CLIP/DALL&#183;E.^1^ <a href="https://jetpress.org/volume1/moravec.htm" title="'When will computer hardware match the human brain?', Moravec 1998">Spring is coming.</a> (Some still insist we need not worry about &#8220;overpopulation on Mars&#8221; for &gt;18,264 more days&#8230;)</p></li><li><p><a href="https://arxiv.org/abs/2003.10580#google">&#8220;Meta Pseudo Labels&#8221;</a>, Pham et al 2020 (90% on ImageNet by pretraining a meta-learning teacher using JFT-300M on a TPUv3-2048)</p></li><li><p><a href="https://arxiv.org/abs/2101.03961#google">&#8220;Switch Transformers: Scaling to Trillion Parameter Models with Simple and Efficient Sparsity&#8221;</a>, Fedus et al 2021 (1.57t-parameter <a href="https://arxiv.org/abs/2006.16668#google" title="'GShard: Scaling Giant Models with Conditional Computation and Automatic Sharding', Lepikhin et al 2020">GShard</a> followup; the mixture-of-experts approach, while scaling stably, starts showing its limits)</p></li></ul></li><li><p><strong>Scaling down</strong>:</p><ul><li><p><a href="https://arxiv.org/abs/2012.12877#facebook">&#8220;DeiT: Training data-efficient image transformers &amp; distillation through attention&#8221;</a>, Touvron et al 2020 (scaling Transformer classifiers down to ImageNet+1-GPU); <a href="https://arxiv.org/abs/2101.11605#google">&#8220;BoTNet: Bottleneck Transformers for Visual Recognition&#8221;</a>, Srinivas et al 2021/<a href="https://arxiv.org/abs/2101.11986">&#8220;Tokens-to-Token ViT: Training Vision Transformers from Scratch on ImageNet&#8221;</a>, Yuan et al 2021 (hybrids); <a href="https://arxiv.org/abs/2009.04433">&#8220;not-so-BigGAN: Generating High-Fidelity Images on Small Compute with Wavelet-based Super-Resolution&#8221;</a>, Han et al 2020/<a href="https://compvis.github.io/taming-transformers/">&#8220;VQGAN: Taming Transformers for High-Resolution Image Synthesis&#8221;</a>, Esser et al 2020 (training &gt;1024px Transformer GANs on just 2 GPUs)</p><p>Transformer supremacy in image-related tasks continues, and GANs  are becoming increasingly hybridized. Do pure-GANs have a future, now  that VAEs and autoregressive models are making such inroads into both  the highest-quality &amp; lowest-compute sample generation? To take the  GAN/DRL analogy seriously, perhaps they were they ultimately a dead end,  akin to trying to learn everything from rewards, and an adversarial GAN  loss ought to be only <a href="https://www.gwern.net/images/ai/2019-lecun-isscctalk-cake.png">the cherry on the cake</a> of a large unsupervised/semi-supervised generative model.</p></li><li><p><a href="https://arxiv.org/abs/2101.06840#microsoft">&#8220;ZeRO-Offload: Democratizing Billion-Scale Model Training&#8221;</a>, Ren et al 2021 (partial CPU training for 13b-parameter models on 1 V100 GPU, scaling to 128 GPUs)</p></li><li><p><a href="https://arxiv.org/abs/2101.00190">&#8220;Prefix-Tuning: Optimizing Continuous Prompts for Generation&#8221;</a>, Li &amp; Liang 2021 (could the <a href="https://arxiv.org/abs/2009.07118" title="'It's Not Just Size That Matters: Small Language Models Are Also Few-Shot Learners', Schick &amp; Sch&#252;tze et al 2020">PET</a>  &amp; CLIP trick of averaging multiple embeddings to yield much better  performance be reused for GPT-3 prompts to greatly improve prompting?  The fact that the prefix-tuning, by directly optimizing the prompt  embeddings, yields better performance than even single optimized text  prompts, suggests so. The user could provide 3 or 4 similar prompts, and  synthesize them into a single super-prompt to better program GPT-3&#8230;)</p></li><li><p><a href="https://greydanus.github.io/2020/12/01/scaling-down/">&#8220;Scaling down Deep Learning&#8221;</a>,  Greydanus 2020 (cute: parametric simplified-MNIST for rapid iteration  on tiny NNs: experiments in lottery-ticket &amp; meta-learning of  LRs/activations)</p></li><li><p><a href="https://cp4space.hatsya.com/2021/01/08/the-neural-network-of-the-stockfish-chess-engine/">&#8220;The neural network of the Stockfish chess engine&#8221;</a> (very lightweight NN designed for incremental recomputation over changing board states)</p></li></ul></li><li><p><a href="https://arxiv.org/abs/2101.01169">&#8220;Transformers in Vision: A Survey&#8221;</a>, Khan et al 2021</p></li><li><p><a href="https://openai.com/blog/organizational-update/">OpenAI departures</a>:  Dario Amodei, Sam McCandlish, Tom Brown, Tom Henighan, Chris Olah, Jack  Clark, Ben Mann, Paul Christiano et al leave&#8212;most for an unspecified  new entity (<a href="https://steveblank.com/2009/12/21/the-elves-leave-middle-earth-%E2%80%93-soda%E2%80%99s-are-no-longer-free/">&#8220;the elves leave Middle Earth&#8221;</a>?)</p></li></ul><p>And the rest:</p><ul><li><p><a href="https://www.lesswrong.com/posts/pTYDdcag9pTzFQ7vw/2020-ai-alignment-literature-review-and-charity-comparison">&#8220;2020 AI Alignment Literature Review and Charity Comparison&#8221;</a>, Larks</p></li><li><p><a href="https://arxiv.org/abs/2009.01719#deepmind">&#8220;Grounded Language Learning Fast and Slow&#8221;</a>, Hill et al 2020</p></li><li><p><a href="https://arxiv.org/abs/2006.03654#microsoft">&#8220;DeBERTa: Decoding-enhanced BERT with Disentangled Attention&#8221;</a>, He et al 2020 (<a href="https://arxiv.org/abs/1905.00537" title="'SuperGLUE: A Stickier Benchmark for General-Purpose Language Understanding Systems', Wang et al 2019">SuperGLUE</a> falls)</p></li><li><p><a href="https://arxiv.org/abs/2012.13349#deepmind">&#8220;Solving Mixed Integer Programs Using Neural Networks&#8221;</a>, Nair et al 2020</p></li><li><p><a href="https://arxiv.org/abs/2012.14271">&#8220;Towards Fully Automated Manga Translation&#8221;</a>, Hinami et al 2020</p></li><li><p><a href="https://arxiv.org/abs/2101.08001#baidu">&#8220;UPDeT: Universal Multi-agent Reinforcement Learning via Policy Decoupling with Transformers&#8221;</a>, Hu et al 2021</p></li><li><p><a href="https://arxiv.org/abs/2012.07975#bair">&#8220;FERM: A Framework for Efficient Robotic Manipulation&#8221;</a>, Zhan et al 2021 (contrastive semi-supervised learning + data augmentation for sample-efficiency)</p></li><li><p><a href="https://arxiv.org/abs/2101.04702#google">&#8220;XMC-GAN: Cross-Modal Contrastive Learning for Text-to-Image Generation&#8221;</a>, Zhang et al 2021</p></li></ul><h2>2.2 Genetics</h2><p>Everything Is Heritable:</p><ul><li><p><a href="https://www.nature.com/articles/s41539-020-00079-z">&#8220;Nurture might be nature: cautionary tales and proposed solutions&#8221;</a>, Hart et al 2021</p></li><li><p><a href="https://www.sciencedirect.com/science/article/pii/S1755296620300624">&#8220;A genetic perspective on the association between exercise and mental health in the era of genome-wide association studies&#8221;</a>, de Geus 2020; <a href="https://www.gwern.net/docs/genetics/correlation/2020-schnurr.pdf">&#8220;Evidence for shared genetics between physical activity, sedentary behaviour and adiposity-related traits&#8221;</a>, Schnurr et al 2020</p></li><li><p><a href="https://www.medrxiv.org/content/10.1101/2020.12.11.20245035v1">&#8220;Antidepressant Response in Major Depressive Disorder: A Genome-wide Association Study&#8221;</a>, Pain et al 2020</p></li><li><p><a href="https://www.biorxiv.org/content/10.1101/2020.04.03.024554v3">&#8220;Genome wide analysis of gene dosage in 24,092 individuals shows that 10,000 genes modulate cognitive ability&#8221;</a>, Huguet et al 2020 (yep, still polygenic)</p></li><li><p><a href="https://www.biorxiv.org/content/10.1101/2020.04.20.051631v2">&#8220;GWAS of three molecular traits highlights core genes and pathways alongside a highly polygenic background&#8221;</a>, Sinnott-Armstrong et al 2021</p></li><li><p><a href="https://www.biorxiv.org/content/10.1101/2021.01.08.425895v1">&#8220;Genome-scale sequencing and analysis of human, wolf and bison DNA from 25,000 year-old sediment&#8221;</a>, Gelabert et al 2021 (incredible this is possible)</p></li><li><p><a href="https://www.medrxiv.org/content/10.1101/2021.01.25.21249961v1">&#8220;Disentangling  sex differences in the shared genetic architecture of PTSD, traumatic  experiences, and social support with body size and composition&#8221;</a>, Carvalho et al 2021 (<a href="https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6684375/" title="'Distinguishing genetic correlation from causation across 52 diseases and complex traits', O'Connor &amp; Price 2018">LCV</a>)</p></li></ul><p>Recent Evolution:</p><ul><li><p><a href="https://www.gwern.net/docs/genetics/selection/2021-pereira.pdf">&#8220;African genetic diversity and adaptation inform a precision medicine agenda&#8221;</a>, Pereira et al 2021; <a href="https://www.nature.com/articles/s41576-020-00305-9">&#8220;The influence of evolutionary history on human health and disease&#8221;</a>, Benton et al 2021; <a href="https://www.biorxiv.org/content/10.1101/2021.01.26.428314v1">&#8220;Local adaptation and archaic introgression shape global diversity at human structural variant loci&#8221;</a>, Yan et al 2021</p></li><li><p><a href="https://www.biorxiv.org/content/10.1101/2020.07.19.211078v2">&#8220;Genome scans of dog behavior implicate a gene network underlying psychopathology in mammals, including humans&#8221;</a>, Zapata et al 2021</p></li><li><p><a href="https://ideas.repec.org/p/uea/ueaeco/2021-02.html">&#8220;Natural Selection in Contemporary Humans is Linked to Income and Substitution Effects&#8221;</a>, Hugh-Jones &amp; Abdellaoui 2021</p></li><li><p><a href="https://elifesciences.org/articles/61644">&#8220;The diversity and function of sourdough starter microbiomes&#8221;</a>, Landis et al 2021 (crowdsourced sourdough show little trace of geographic origins?)</p></li></ul><p>Engineering:</p><ul><li><p><a href="https://www.gwern.net/docs/genetics/editing/2021-koblan.pdf">&#8220;In vivo base editing rescues Hutchinson-Gilford progeria syndrome in mice&#8221;</a>, Koblan et al 2021</p></li><li><p><a href="https://arxiv.org/abs/2101.05870">&#8220;From Genotype to Phenotype: polygenic prediction of complex human traits&#8221;</a>, Raben et al 2021</p></li></ul><h2>2.3 Statistics/Meta-Science/Math</h2><ul><li><p><a href="https://arxiv.org/abs/2101.07884">&#8220;The Quantum Field Theory on Which the Everyday World Supervenes&#8221;</a>,  Carroll 2021 (&#8220;&#8230;we have reason to be confident that the laws of physics  underlying the phenomena of everyday life are completely known&#8221; because  all unknown particles/fields are constrained to being extremely  rare/weak, eg by <a href="https://www.gwern.net/docs/science/2009-adelberger.pdf" title="Torsion balance experiments: A low--energy frontier of particle physics">Adelberger et al 2009</a>)</p></li><li><p><a href="https://www.biorxiv.org/content/10.1101/2020.12.10.419424v1">&#8220;How accurate are citations of frequently cited papers in biomedical literature?&#8221;</a>, Pavlovic et al 2020 (includes original author&#8217;s evaluation of whether a citation of their work is correct)</p></li><li><p><a href="https://arxiv.org/abs/1605.08448">&#8220;Energy-Efficient Algorithms&#8221;</a>, Demaine et al 2016 (<a href="https://en.wikipedia.org/wiki/Reversible_computing">reversible computing</a> asymptotics: constant-factor <a href="https://en.wikipedia.org/wiki/Stack_(abstract_data_type)">stacks</a>/<a href="https://en.wikipedia.org/wiki/Dynamic_array">arrays</a>, &#119978;(log <em>n</em>) time/energy <a href="https://en.wikipedia.org/wiki/AVL_tree">AVL trees</a>, &#119978;(<em>n</em>) space <a href="https://en.wikipedia.org/wiki/Comparison_sort">sorts</a>, &amp; various &#119978;(Vertex+Edge) time/space/energy <a href="https://en.wikipedia.org/wiki/Graph_traversal">graph searches</a>)</p></li><li><p><a href="https://www.gwern.net/docs/statistics/decision/2006-smith.pdf">&#8220;The Optimizer&#8217;s Curse: Skepticism and Postdecision Surprise in Decision Analysis&#8221;</a>,  Smith &amp; Winkler 2006 (regression to the mean is everywhere; another  example of why Bayes &amp; decision theory are two great flavors that  go great together)</p></li></ul><h2>2.4 Politics/Religion</h2><ul><li><p><a href="https://papers.ssrn.com/sol3/papers.cfm?abstract_id=3650704">&#8220;The Mechanisms of Cult Production: An Overview&#8221;</a>, Xavier Marquez 2020 (see previously his <a href="https://www.gwern.net/newsletter/2019/02#abandoned-footnotes">blog roundup</a>)</p></li><li><p><a href="https://www.gwern.net/docs/sociology/1999-dawson.pdf">&#8220;When Prophecy Fails and Faith Persists: A Theoretical Overview&#8221;</a>, Dawson 1999</p></li><li><p><a href="https://www.overcomingbias.com/2020/11/why-we-fight-over-fiction.html">&#8220;Why We Fight Over Fiction&#8221;</a>, Robin Hanson</p></li><li><p><a href="https://en.wikipedia.org/wiki/All-Woman_Supreme_Court">The All-Woman Supreme Court</a></p></li></ul><h2>2.5 Psychology/Biology</h2><ul><li><p><a href="https://astralcodexten.substack.com/p/still-alive">&#8220;Still Alive&#8221;</a>,  Scott Alexander (announcement of SSC return as Substack newsletter  &#8216;Astral Codex Ten&#8217; &amp; launching a low-cost psychiatry clinic &#8216;Lorien  Psychiatry&#8217;)</p></li><li><p><a href="https://www.biorxiv.org/content/10.1101/2020.09.08.287276v1">&#8220;The Temporal Dynamics of Opportunity Costs: A Normative Account of Cognitive Fatigue and Boredom&#8221;</a>, Agrawal et al 2020</p></li><li><p><a href="https://onlinelibrary.wiley.com/doi/full/10.1002/hbm.25109">&#8220;A unified framework for association and prediction from vertex-wise grey-matter structure&#8221;</a>, Couvy-Duchesne et al 2020 (more <a href="https://www.gwern.net/Questions#variance-components">morphometricity</a>)</p></li><li><p><strong>Common phenomena</strong>: <a href="https://www.gwern.net/docs/psychology/2018-fassnidge.pdf">&#8220;Sounds from seeing silent motion: Who hears them, and what looks loudest?&#8221;</a>, Fassnidge &amp; Freeman 2018 (on &#8216;visual ear&#8217;; previously: <a href="https://www.sciencedirect.com/science/article/pii/S0960982208007343" title="The sound of change: visually-induced auditory synaesthesia">Saenz &amp; Koch 2008</a>, <a href="https://www.gwern.net/docs/psychology/2017-fassnidge.pdf" title="A deafening flash! Visual interference of auditory signal detection">Fassnidge et al 2017</a>)</p></li><li><p><a href="https://online.ucpress.edu/collabra/article/7/1/18731/115925/Predicting-Mental-Health-From-Followed-Accounts-on">&#8220;Predicting Mental Health From Followed Accounts on Twitter&#8221;</a>, Costelli et al 2021 (<a href="https://en.wikipedia.org/wiki/Preregistration_(science)#Registered_reports">Registered Report</a>: who you choose to follow says a lot about you&#8212;<a href="https://www.gwern.net/Everything">everything is correlated</a>)</p></li><li><p><a href="https://www.biorxiv.org/content/10.1101/2021.01.08.425841v1">&#8220;No evidence for general intelligence in a fish&#8221;</a>, Aellen et al 2021</p></li><li><p><a href="https://en.wikipedia.org/wiki/Delirium_tremens">Delirium tremens</a></p></li><li><p><a href="https://www.gwern.net/docs/biology/2021-asnicar.pdf">&#8220;Microbiome connections with host metabolism and habitual diet from 1,098 deeply phenotyped individuals&#8221;</a>, Asnicar et al 2021</p></li><li><p><a href="https://www.biorxiv.org/content/10.1101/2021.01.18.426733v1">&#8220;Universal DNA methylation age across mammalian tissues&#8221;</a>, Lu et al 2021; <a href="https://onlinelibrary.wiley.com/doi/full/10.1111/acel.13296">&#8220;Whole-body senescent cell clearance alleviates age-related brain inflammation and cognitive impairment in mice&#8221;</a>, Ogrodnik et al 2021</p></li><li><p><a href="https://arxiv.org/abs/2101.12037">&#8220;BENDR: using transformers and a contrastive self-supervised learning task to learn from massive amounts of EEG data&#8221;</a>, Kostas et al 2021 (towards brain imitation learning)</p></li><li><p><a href="https://en.wikipedia.org/wiki/Parker%E2%80%93Hulme_murder_case">Parker-Hulme murder case</a>; <a href="https://en.wikipedia.org/wiki/Slender_Man_stabbing">The Slender Man stabbing</a> (<a href="https://en.wikipedia.org/wiki/Paracosm">paracosms?</a>)</p></li><li><p><strong>Correction</strong>: <a href="https://news.ycombinator.com/item?id=25426329">Programming competition skills do not inversely correlate with job performance</a> after all</p></li></ul><h2>2.6 Technology</h2><ul><li><p><a href="https://en.wikipedia.org/wiki/Natural_nuclear_fission_reactor">Natural nuclear fission reactors (Oklo)</a></p></li><li><p><a href="https://www.gwern.net/docs/history/2007-keeley.pdf">&#8220;Baffles and Bastions: The Universal Features of Fortifications&#8221;</a>, Keeley et al 2007</p></li><li><p><a href="https://en.wikipedia.org/wiki/Corrupted_Blood_incident">The Corrupted Blood incident</a></p></li><li><p><em><a href="https://www.gwern.net/docs/design/2020-jeremytankard-footnote-36-redisturbed.pdf">Footnote</a></em><a href="https://www.gwern.net/docs/design/2020-jeremytankard-footnote-36-redisturbed.pdf"> 36: &#8220;Redisturbed&#8221;</a>: a <em>unicase</em> font experiment</p></li></ul><h2>2.7 Economics</h2><ul><li><p><a href="https://www.nytimes.com/2021/01/18/climate/carbon-removal-technology.html">&#8220;Businesses Aim to Pull Greenhouse Gases From the Air. It&#8217;s a Gamble&#8221;</a></p></li><li><p><a href="https://freakonomics.com/podcast/advertising-part-1/">"Does Advertising</a> <a href="https://freakonomics.com/podcast/advertising-part-2/">Actually Work?"</a>  (what could be more obvious than &#8220;advertising works&#8221;, and trivial to  confirm with correlational data? Yet, the tedious saying &#8220;correlation &#8800;  causation&#8221; stubbornly insists on being true); <a href="https://www.gwern.net/docs/traffic/2020-aral.pdf">&#8220;Digital Paywall Design: Implications for Content Demand and Subscriptions&#8221;</a>, Aral &amp; Dhillon 2020 (NYT nag-paywall caused &#8722;9.9% reading; in line with <a href="https://www.gwern.net/Ads">all the other results</a>)</p></li><li><p><a href="https://www.gwern.net/docs/economics/2010-schuh.pdf">&#8220;Who Gains and Who Loses from Credit Card Payments? Theory and Calibrations&#8221;</a>, Schuh et al 2010 (a compelling case for getting a rewards credit card if you&#8217;re a <a href="https://en.wikipedia.org/wiki/Debit_card">debit card</a> user&#8212;why subsidize them so much?)</p></li><li><p><a href="https://www.gwern.net/docs/economics/2019-quinn.pdf">&#8220;Squeezing the bears: cornering risk and limits on arbitrage during the &#8216;British bicycle mania&#8217;, 1896&#8211;1898&#8221;</a>, Quinn 2019</p></li></ul><h2>2.8 Fiction</h2><ul><li><p><a href="https://www.tabletmag.com/sections/arts-letters/articles/on-venus-have-we-got-a-rabbi" title="A long-lost space age satire about what it means to be a Jew from one of science fiction&#8217;s greatest humorists">&#8220;On Venus, Have We Got a Rabbi!&#8221;</a>, <a href="https://en.wikipedia.org/wiki/William_Tenn">William Tenn</a> 2016</p></li><li><p><a href="https://www.gwern.net/docs/history/2013-dubin-fabliauxtranslations-stmartinsfourwishes.pdf">&#8220;St Martin&#8217;s Four Wishes&#8221;</a>, Anonymous <a href="https://en.wikipedia.org/wiki/Fabliau">medieval poet</a> (trans. Dubin 2013)</p></li></ul><h2>2.9 Miscellaneous</h2><ul><li><p>The <a href="https://en.wikipedia.org/wiki/Anglo-Japanese_style">Anglo-Japanese style</a></p></li><li><p><a href="https://en.wikipedia.org/wiki/Stalag_Luft_III">Stalag Luft III</a></p></li><li><p><a href="https://en.wikipedia.org/wiki/Graham_Island_(Mediterranean_Sea)">Ferdinandea</a></p></li></ul><div><hr></div><ol><li><p>But it&#8217;ll still be too many days &#8217;till we say we&#8217;re sorry.</p></li></ol>]]></content:encoded>
        </item>
        <item>
            <title><![CDATA[December newsletter]]></title>
            <description><![CDATA[December 2020 gwern.net newsletter with links on AI and technology; major new site feature: fully-generalized recursive popups.]]></description>
            <link>https://gwern.substack.com/p/december-newsletter</link>
            <guid isPermaLink="false">https://gwern.substack.com/p/december-newsletter</guid>
            <dc:creator><![CDATA[gwern]]></dc:creator>
            <pubDate>Sun, 10 Jan 2021 17:31:06 GMT</pubDate>
            <enclosure url="https://substackcdn.com/image/fetch/$s_!rpC9!,w_256,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F5860611c-2de0-45a7-99a0-dc1b248b0199_1280x1280.png" length="0" type="image/jpeg"/>
            <content:encoded><![CDATA[<p>Please see the canonical version of the December 2020 newsletter on <a href="https://www.gwern.net/newsletter/2020/12">Gwern.net</a>.</p>]]></content:encoded>
        </item>
        <item>
            <title><![CDATA[November newsletter]]></title>
            <description><![CDATA[November 2020 gwern.net newsletter with links on DL and genomics scaling, dark mode rewrite, 1 essay, and 1 opera review ('The Ring' cycle).]]></description>
            <link>https://gwern.substack.com/p/november-newsletter</link>
            <guid isPermaLink="false">https://gwern.substack.com/p/november-newsletter</guid>
            <dc:creator><![CDATA[gwern]]></dc:creator>
            <pubDate>Fri, 04 Dec 2020 00:40:13 GMT</pubDate>
            <enclosure url="https://substackcdn.com/image/fetch/$s_!rpC9!,w_256,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F5860611c-2de0-45a7-99a0-dc1b248b0199_1280x1280.png" length="0" type="image/jpeg"/>
            <content:encoded><![CDATA[<p><strong>Please see the <a href="https://www.gwern.net/newsletter/2020/11">canonical November 2020 gwern.net</a></strong><a href="https://www.gwern.net/newsletter/2020/11"> newsletter link.</a></p>]]></content:encoded>
        </item>
        <item>
            <title><![CDATA[October 2020 news]]></title>
            <description><![CDATA[October 2020 gwern.net newsletter with links on AI scaling, Euclid; further site reorganization & improvement.]]></description>
            <link>https://gwern.substack.com/p/october-2020-news</link>
            <guid isPermaLink="false">https://gwern.substack.com/p/october-2020-news</guid>
            <dc:creator><![CDATA[gwern]]></dc:creator>
            <pubDate>Sun, 01 Nov 2020 21:42:39 GMT</pubDate>
            <enclosure url="https://substackcdn.com/image/fetch/$s_!rpC9!,w_256,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F5860611c-2de0-45a7-99a0-dc1b248b0199_1280x1280.png" length="0" type="image/jpeg"/>
            <content:encoded><![CDATA[<p>Please see the <a href="https://www.gwern.net/newsletter/2020/10">canonical web October 2020</a> edition of <a href="https://gwern.substack.com">the <code>gwern.net</code> newsletter</a>.</p>]]></content:encoded>
        </item>
        <item>
            <title><![CDATA[September 2020 News]]></title>
            <description><![CDATA[September 2020 gwern.net newsletter with links on DRL and AI scaling, psychiatric disorders; no reviews.]]></description>
            <link>https://gwern.substack.com/p/september-2020-news</link>
            <guid isPermaLink="false">https://gwern.substack.com/p/september-2020-news</guid>
            <dc:creator><![CDATA[gwern]]></dc:creator>
            <pubDate>Mon, 26 Oct 2020 13:40:32 GMT</pubDate>
            <enclosure url="https://substackcdn.com/image/fetch/$s_!rpC9!,w_256,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F5860611c-2de0-45a7-99a0-dc1b248b0199_1280x1280.png" length="0" type="image/jpeg"/>
            <content:encoded><![CDATA[<p>Please see the&nbsp;<a href="https://www.gwern.net/newsletter/2020/09">canonical web September 2020</a>&nbsp;edition of&nbsp;<a href="https://gwern.substack.com">the&nbsp;<code>gwern.net</code>&nbsp;newsletter</a>.</p>]]></content:encoded>
        </item>
        <item>
            <title><![CDATA[August 2020 gwern.net newsletter]]></title>
            <description><![CDATA[with an essay on sidenotes; links on human competence, efficient-computing/hardware-overhangs; no reviews.]]></description>
            <link>https://gwern.substack.com/p/august-2020-gwernnet-newsletter</link>
            <guid isPermaLink="false">https://gwern.substack.com/p/august-2020-gwernnet-newsletter</guid>
            <dc:creator><![CDATA[gwern]]></dc:creator>
            <pubDate>Tue, 01 Sep 2020 23:18:28 GMT</pubDate>
            <enclosure url="https://substackcdn.com/image/fetch/$s_!rpC9!,w_256,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F5860611c-2de0-45a7-99a0-dc1b248b0199_1280x1280.png" length="0" type="image/jpeg"/>
            <content:encoded><![CDATA[<p>Please see the <a href="https://www.gwern.net/newsletter/2020/08">canonical on-site August 2020</a> edition of <a href="https://gwern.substack.com">the <code>gwern.net</code> newsletter</a>.</p>]]></content:encoded>
        </item>
        <item>
            <title><![CDATA[July 2020 gwern.net newsletter]]></title>
            <description><![CDATA[Links on the Uighurs, authoritarianism, negative emissions, AI overhang; 1 movie & 2 anime reviews]]></description>
            <link>https://gwern.substack.com/p/july-2020-gwernnet-newsletter</link>
            <guid isPermaLink="false">https://gwern.substack.com/p/july-2020-gwernnet-newsletter</guid>
            <dc:creator><![CDATA[gwern]]></dc:creator>
            <pubDate>Thu, 20 Aug 2020 20:09:50 GMT</pubDate>
            <enclosure url="https://substackcdn.com/image/fetch/$s_!rpC9!,w_256,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F5860611c-2de0-45a7-99a0-dc1b248b0199_1280x1280.png" length="0" type="image/jpeg"/>
            <content:encoded><![CDATA[<p>Please see the <a href="https://www.gwern.net/newsletter/2020/07">on-gwern.net canonical July 2020</a> edition of <a href="https://gwern.substack.com">the <code>gwern.net</code> newsletter</a>.</p>]]></content:encoded>
        </item>
        <item>
            <title><![CDATA[June gwern.net newsletter]]></title>
            <description><![CDATA[June 2020 gwern.net newsletter with 3 new pages/essays, and links on CRISPR, population screening, AI scaling, politics, and technological unemployment.]]></description>
            <link>https://gwern.substack.com/p/june-gwernnet-newsletter</link>
            <guid isPermaLink="false">https://gwern.substack.com/p/june-gwernnet-newsletter</guid>
            <dc:creator><![CDATA[gwern]]></dc:creator>
            <pubDate>Thu, 02 Jul 2020 14:34:53 GMT</pubDate>
            <enclosure url="https://substackcdn.com/image/fetch/$s_!rpC9!,w_256,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F5860611c-2de0-45a7-99a0-dc1b248b0199_1280x1280.png" length="0" type="image/jpeg"/>
            <content:encoded><![CDATA[<p>See the canonical <a href="https://www.gwern.net/newsletter/2020/06">on-gwern.net June 2020</a> edition of <a href="https://gwern.substack.com">the <code>gwern.net</code> newsletter</a>.</p>]]></content:encoded>
        </item>
        <item>
            <title><![CDATA[May Gwern.net Newsletter]]></title>
            <description><![CDATA[Link compilation newsletter with anime GAN updates, links on AI scaling, discussion of GPT-3, and 1 book review.]]></description>
            <link>https://gwern.substack.com/p/may-gwernnet-newsletter</link>
            <guid isPermaLink="false">https://gwern.substack.com/p/may-gwernnet-newsletter</guid>
            <dc:creator><![CDATA[gwern]]></dc:creator>
            <pubDate>Sat, 06 Jun 2020 18:44:15 GMT</pubDate>
            <enclosure url="https://substackcdn.com/image/fetch/$s_!rpC9!,w_256,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F5860611c-2de0-45a7-99a0-dc1b248b0199_1280x1280.png" length="0" type="image/jpeg"/>
            <content:encoded><![CDATA[<p>Due to extensive editing &amp; expansion of the GPT-3 discussion, please see the canonical newsletter version at <a href="https://www.gwern.net/newsletter/2020/05">https://www.gwern.net/newsletter/2020/05</a></p>]]></content:encoded>
        </item>
        <item>
            <title><![CDATA[April 2020 gwern.net newsletter]]></title>
            <description><![CDATA[This is the April 2020 edition of the gwern.net newsletter; previous, March 2020 (archives).]]></description>
            <link>https://gwern.substack.com/p/april-2020-gwern-net-newsletter</link>
            <guid isPermaLink="false">https://gwern.substack.com/p/april-2020-gwern-net-newsletter</guid>
            <pubDate>Fri, 01 May 2020 00:00:00 GMT</pubDate>
            <enclosure url="https://substackcdn.com/image/fetch/$s_!rpC9!,w_256,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F5860611c-2de0-45a7-99a0-dc1b248b0199_1280x1280.png" length="0" type="image/jpeg"/>
            <content:encoded><![CDATA[<p>This is the <a href="https://www.gwern.net/newsletter/2020/04">April 2020</a> edition of <a href="https://tinyletter.com/gwern">the <code>gwern.net</code> newsletter</a>; previous, <a href="https://www.gwern.net/newsletter/2020/03">March 2020</a> (<a href="https://www.gwern.net/tags/newsletter">archives</a>). Please see the canonical gwern.net version.</p>]]></content:encoded>
        </item>
    </channel>
</rss>

Raw text

<?xml version="1.0" encoding="UTF-8"?><rss xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:atom="http://www.w3.org/2005/Atom" version="2.0" xmlns:itunes="http://www.itunes.com/dtds/podcast-1.0.dtd" xmlns:googleplay="http://www.google.com/schemas/play-podcasts/1.0"><channel><title><![CDATA[Gwern.net Newsletter]]></title><description><![CDATA[Latest gwern.net updates, interesting links, and reviews]]></description><link>https://gwern.substack.com</link><image><url>https://substackcdn.com/image/fetch/$s_!rpC9!,w_256,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F5860611c-2de0-45a7-99a0-dc1b248b0199_1280x1280.png</url><title>Gwern.net Newsletter</title><link>https://gwern.substack.com</link></image><generator>Substack</generator><lastBuildDate>Sat, 14 Mar 2026 18:37:46 GMT</lastBuildDate><atom:link href="https://gwern.substack.com/feed" rel="self" type="application/rss+xml"/><copyright><![CDATA[Gwern Branwen]]></copyright><language><![CDATA[en]]></language><webMaster><![CDATA[[email protected]]]></webMaster><itunes:owner><itunes:email><![CDATA[[email protected]]]></itunes:email><itunes:name><![CDATA[gwern]]></itunes:name></itunes:owner><itunes:author><![CDATA[gwern]]></itunes:author><googleplay:owner><![CDATA[[email protected]]]></googleplay:owner><googleplay:email><![CDATA[[email protected]]]></googleplay:email><googleplay:author><![CDATA[gwern]]></googleplay:author><itunes:block><![CDATA[Yes]]></itunes:block><item><title><![CDATA[May 2021 Gwern.net Newsletter]]></title><description><![CDATA[links on AI hardware, diffusion models, optogenetics, brain scanning.]]></description><link>https://gwern.substack.com/p/may-2021-gwernnet-newsletter</link><guid isPermaLink="false">https://gwern.substack.com/p/may-2021-gwernnet-newsletter</guid><dc:creator><![CDATA[gwern]]></dc:creator><pubDate>Fri, 11 Jun 2021 14:16:22 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!rpC9!,w_256,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F5860611c-2de0-45a7-99a0-dc1b248b0199_1280x1280.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>May 2021&#8217;s <a href="https://www.gwern.net/newsletter/2021/05">Gwern.net</a> <a href="https://gwern.substack.com">newsletter</a> is now out; previous, <a href="https://www.gwern.net/newsletter/2021/04">April 2021</a> (<a href="https://www.gwern.net/tags/newsletter">archives</a>). This is a collation of links and summary of major changes, overlapping with my <a href="https://www.gwern.net/Changelog">Changelog</a>; brought to you by my donors on <a href="https://www.patreon.com/gwern">Patreon</a>.</p><p>Note: I will be in Denver 12&#8211;13 June 2021 for a conference.</p><h1>1 Writings</h1><ul><li><p><strong>Proposal</strong>: <a href="https://www.gwern.net/CYOA">&#8220;Choose Your Own Adventure AI Dungeon&#8221;</a>; <a href="https://www.gwern.net/GPT-2-preference-learning#decision-transformers-preference-learning-as-simple-as-possible">&#8220;Decision Transformers: Preference Learning As Simple As Possible&#8221;</a></p></li></ul><h1>2 Links</h1><h2>2.1 AI</h2><p><a href="https://old.reddit.com/r/mlscaling/">Matters Of Scale</a>:</p><ul><li><p><strong>Hardware</strong>:</p><ul><li><p><a href="https://arxiv.org/abs/2104.06272#deepmind">&#8220;Podracer architectures for scalable Reinforcement Learning&#8221;</a>, Hessel et al 2021 (highly-efficient TPU pod use: eg solving Pong in &lt;1min at 43 million FPS on a TPUv3-2048); <a href="https://venturebeat.com/2021/05/18/google-details-new-ai-accelerator-chips/">&#8220;Google details new TPUv4 AI accelerator chips&#8221;</a> (2.7&#215; TPUv3 chips; up to TPUv4-4096 pods, yielding &gt;1 ExaFLOPS; public access later in 2021)x</p></li><li><p><a href="https://arxiv.org/abs/2104.07857#microsoft">&#8220;ZeRO-Infinity: Breaking the GPU Memory Wall for Extreme Scale Deep Learning&#8221;</a>, Rajbhandari et al 2021 (~1 trillion parameters per 16 GPUs/DGX-2-node, scaling to &gt;512 GPUs ~40% efficiency)</p></li><li><p><a href="https://arxiv.org/abs/2105.04663#google">&#8220;GSPMD: General and Scalable Parallelization for ML Computation Graphs&#8221;</a>, Xu et al 2021 (Google upgrade of <a href="https://arxiv.org/abs/1811.06965#google" title="'GPipe: Easy Scaling with Micro-Batch Pipeline Parallelism', Huang et al 2018">GPipe</a>/<a href="https://arxiv.org/abs/2006.16668#google" title="'GShard: Scaling Giant Models with Conditional Computation and Automatic Sharding', Lepikhin et al 2020">GShard</a> arch to match <a href="https://www.microsoft.com/en-us/research/blog/deepspeed-extreme-scale-model-training-for-everyone/" title="DeepSpeed: Extreme-scale model training for everyone">MS DeepSpeed</a>: &#8220;&#8230;50%&#8211;62% compute utilization on 128&#8211;2048 Cloud TPUv3 cores for models with up to one trillion parameters&#8221;)</p></li><li><p><a href="https://arxiv.org/abs/2104.05158#facebook">&#8220;DLRM: High-performance, Distributed Training of Large-scale Deep Learning Recommendation Models&#8221;</a>,  Mudigere et al 2021 (ZionEX software/hardware platform for training  extremely large embeddings&#8212;while embeddings aren&#8217;t &#8216;real&#8217; parameters  &amp; things like <a href="https://arxiv.org/abs/2004.08366#google" title="'DynamicEmbedding: Extending TensorFlow for Colossal-Scale Applications', Zeng et al 2020">DynamicEmbedding</a> will never learn tricks like GPT-3 no matter how big, they present similar challenges); <a href="https://arxiv.org/abs/2105.08820#facebook">&#8220;RecPipe: Co-designing Models and Hardware to Jointly Optimize Recommendation Quality and Performance&#8221;</a>, Gupta et al 2021</p></li></ul></li><li><p><a href="https://arxiv.org/abs/2105.12196#deepmind">&#8220;From Motor Control to Team Play in Simulated Humanoid Football&#8221;</a>,  Liu et al 2021 (curriculum training of a single NN from raw humanoid  control to coordinated team-wide soccer strategy; neat to compare with <a href="https://arxiv.org/abs/2009.01719#deepmind" title="Grounded Language Learning Fast and Slow">Hill et al 2020</a> in terms of agent abilities)</p></li><li><p><a href="https://arxiv.org/abs/2105.11084#facebook">&#8220;Wav2vec-U: Unsupervised Speech Recognition&#8221;</a>, Baevski et al 2021</p></li><li><p><a href="https://www.anthropic.com/news/announcement">&#8220;Anthropic&#8221; public-benefit-corp/startup launched</a> (founded by the Amodeis; $124M investment for scaling &#8220;reliable and steerable AI systems&#8221;); <a href="https://www.cooperativeai.com/foundation">&#8220;Cooperative AI Foundation&#8221; (CAIF)</a> launched</p></li><li><p><a href="https://arxiv.org/abs/2105.01601#google">&#8220;MLP-Mixer: An all-MLP Architecture for Vision&#8221;</a>, Tolstikhin et al 2021 (another <a href="https://www.gwern.net/notes/FC">FC paper</a> removing even more inductive biases&#8212;ponies are all you need: &#8220;Mixer <a href="http://www.incompleteideas.net/IncIdeas/BitterLesson.html">improves more rapidly with data</a>  than ResNets, or even ViT, and the gap between large scale Mixer and  ViT models shrinks until the performance is matched on the entire  dataset&#8230;&#8221; The Bitter Lesson truly is the single bitterest lesson in ML,  isn&#8217;t it? The more people tweet about how MLP-Mixer is overhyped because  is &#8722;X% worse than the ultra-hand-optimized baseline or requires Y&#215; more  FLOPS, the more they demonstrate <em>precisely why</em> this sort of  research is so important! And showing, incidentally, that Transformers  are still under-researched if such a fundamental fact could have been  missed for so long.)</p></li><li><p><a href="https://arxiv.org/abs/2104.08945#facebook">&#8220;Data-Efficient Language-Supervised Zero-Shot Learning with Self-Distillation&#8221;</a>, Cheng et al 2021 (<a href="https://openai.com/blog/clip/">CLIP</a>-like performance scaled down to <em>n</em> = 3m using <a href="https://arxiv.org/abs/1503.02531#google" title="'Distilling the knowledge in a neural network', Hinton et al 2015">soft labels</a> generated by a <a href="https://www.gwern.net/docs/ai/2018-sharma.pdf#google" title="Conceptual Captions: A Cleaned, Hypernymed, Image Alt-text Dataset For Automatic Image Captioning">Conceptual Captions</a>-pretrained model)</p></li><li><p><a href="https://arxiv.org/abs/2104.07636#google">&#8220;SR3: Image Super-Resolution via Iterative Refinement&#8221;</a>, Saharia et al 2021; <a href="https://arxiv.org/abs/2105.05233#openai">&#8220;Diffusion Models Beat GANs on Image Synthesis&#8221;</a>, Dhariwal &amp; Nichol 2021 (<a href="https://arxiv.org/abs/2006.11239" title="'Denoising Diffusion Probabilistic Models', Ho et al 2020">DDPM</a>^<a href="file:///tmp/burlbC6ws6.html#fn1">1</a>^ finally surpass <a href="https://arxiv.org/abs/1809.11096#deepmind" title="'BigGAN: Large Scale GAN Training for High Fidelity Natural Image Synthesis', Brock et al 2018">BigGAN-deep</a> on ImageNet 512px images at similar compute-cost, as <a href="https://arxiv.org/abs/2102.09672" title="'Improved Denoising Diffusion Probabilistic Models', Nichol &amp; Dhariwal 2021">expected from their</a><a href="https://www.gwern.net/notes/Scaling">good scaling</a>); <a href="https://cascaded-diffusion.github.io/">&#8220;Cascaded Diffusion Models for High Fidelity Image Generation&#8221;</a>, Ho et al 2021</p></li><li><p><a href="https://arxiv.org/abs/2009.01325#openai">&#8220;Learning to summarize from human feedback&#8221;</a>, Stiennon et al 2020</p></li><li><p><a href="https://www.gwern.net/docs/ai/2021-power.pdf#openai">&#8220;Grokking: Generalization Beyond Overfitting On Small Algorithmic Data Sets&#8221;</a>, Power et al 2021 (<a href="https://old.reddit.com/r/mlscaling/comments/n78584/grokking_generalization_beyond_overfitting_on/">discussion</a>;  new scaling effect, &#8216;grokking&#8217;: sudden perfect generalization emerging  many epochs after training-set overfitting on algorithmic tasks when  training in <a href="https://www.gwern.net/docs/ai/2021-power-poster.png#openai">flat shallow loss landscapes</a>); <a href="https://arxiv.org/abs/2106.05237#google">&#8220;Knowledge distillation: A good teacher is patient and consistent&#8221;</a>, Beyer et al 2021 (training much smaller models merely requires hundreds of thousands or millions of epochs)</p></li><li><p><a href="https://arxiv.org/abs/2104.14830#google">&#8220;Scaling End-to-End Models for Large-Scale Multilingual ASR&#8221;</a>, Li et al 2021</p></li><li><p><a href="https://arxiv.org/abs/2103.10948">&#8220;The Shape of Learning Curves: a Review&#8221;</a>, Viering &amp; Loog 2021</p></li><li><p><a href="https://www.sciencedirect.com/science/article/pii/S0004370221000862#deepmind">&#8220;Reward is enough&#8221;</a>,  Silver et al 2021 (a DRL manifesto: reward losses enough at scale of  compute/parameters/tasks to induce all important capabilities like  memory/exploration/generalization/imitation/reasoning)</p></li><li><p><strong>Scaling Down</strong>: <a href="https://github.com/nshepperd/lazy"><code>lazy</code>: a tool for running processes in idle time</a> (how to train on a GPU without destroying your GUI&#8217;s usability! <code>lazy</code>  pauses runs briefly while you interact with your desktop, letting you  do months-long runs without going crazy or resorting to Colab etc. This  enables hobbyists to go after previously-infeasible model sizes);  EleutherAI releases <a href="https://arankomatsuzaki.wordpress.com/2021/06/04/gpt-j/">a 6b-parameter GPT-3 model, GPT-J</a> (are you still using GPT-2/GPT-Neo? upgrade!); <a href="https://arxiv.org/abs/2105.12723">&#8220;Aggregating Nested Transformers&#8221;</a>, Zhang et al 2021/<a href="https://arxiv.org/abs/2105.14217">&#8220;Less is More: Pay Less Attention in Vision Transformers&#8221;</a>, Pan et al 2021</p></li></ul><ul><li><p><a href="https://arxiv.org/abs/2105.13626#google">&#8220;ByT5: Towards a token-free future with pre-trained byte-to-byte models&#8221;</a>, Xue et al 2021 (character models&#8212;not just feasible but desirable; we&#8217;ll get our rhyming &amp; pun-making language models yet!)</p></li><li><p><a href="https://www.gwern.net/docs/ai/2008-golle.pdf">&#8220;Machine Learning Attacks Against the Asirra CAPTCHA&#8221;</a>,  Golle 2008 (a look back on a decade of CV progress: months of work for  80% cat vs dog with SVM ensembles in 2008; 5min in Fast.ai for 99%  accuracy in 2018; for even more perspective, <a href="https://www.gwern.net/docs/ai/2012-ciresan.pdf" title="Deep big multilayer perceptrons for digit recognition">Cire&#351;an 2012</a>)</p></li></ul><h2>2.2 Genetics</h2><p>Everything Is Heritable:</p><ul><li><p><a href="https://www.gwern.net/docs/genetics/heritable/2021-levey.pdf">&#8220;Bi-ancestral  depression GWAS in the Million Veteran Program and meta-analysis in  &gt;1.2 million individuals highlight new therapeutic directions&#8221;</a>, Levey et al 2021</p></li><li><p><a href="https://www.biorxiv.org/content/10.1101/2021.05.26.445798v1">&#8220;The complete sequence of a human genome&#8221;</a>, Nurk et al 2021 (<a href="https://www.nature.com/articles/d41586-021-01506-w" title="A complete human genome sequence is close: how scientists filled in the gaps; researchers added 200 million DNA base pairs and 115 protein-coding genes &#8212; but they&#8217;ve yet to entirely sequence the Y chromosome">media</a>)</p></li><li><p><a href="https://www.gwern.net/docs/iq/2021-vonstumm.pdf">&#8220;Using DNA to predict intelligence&#8221;</a>, von Stumm &amp; Plomin 2021 (review)</p></li><li><p><a href="https://www.biorxiv.org/content/10.1101/848366v2.full">&#8220;Long  read sequencing of 3,622 Icelanders provides insight into the role of  structural variants in human diseases and other traits&#8221;</a>, Beyter et al 2021</p></li><li><p><a href="https://www.gwern.net/docs/genetics/heritable/2021-owen.pdf">&#8220;Rapid Sequencing&#8211;Based Diagnosis of Thiamine Metabolism Dysfunction Syndrome&#8221;</a> (sequence everyone!)</p></li></ul><p>Engineering:</p><ul><li><p><a href="https://www.gwern.net/docs/genetics/editing/2021-robertson.pdf">&#8220;Sense codon reassignment enables viral resistance and encoded polymer synthesis&#8221;</a>,  Robertson et al 2021 (&#8220;ultra-safe cells&#8221;: synthesizing an entire E.  coli genome with swapped codons for complete viral immunity)</p></li><li><p><a href="https://www.gwern.net/docs/genetics/editing/2021-musunuru.pdf">&#8220;In vivo CRISPR base editing of </a><em><a href="https://www.gwern.net/docs/genetics/editing/2021-musunuru.pdf">PCSK9</a></em><a href="https://www.gwern.net/docs/genetics/editing/2021-musunuru.pdf"> durably lowers cholesterol in primates&#8221;</a>, Musunuru et al 2021</p></li><li><p><strong><a href="https://en.wikipedia.org/wiki/Optogenetics">Optogenetics</a></strong>: <a href="https://www.gwern.net/docs/genetics/editing/2021-sahel.pdf">&#8220;Partial recovery of visual function in a blind patient after optogenetic therapy&#8221;</a>, Sahel et al 2021 (<a href="https://www.statnews.com/2021/05/24/scientists-use-optogenetics-for-first-time-to-help-blind-patient-see/" title="With engineered proteins, scientists use optogenetics for the first time to help a blind patient see again">media</a>); <a href="https://www.gwern.net/docs/biology/2021-yang.pdf">&#8220;Wireless multilateral devices for optogenetic studies of individual and social behaviors&#8221;</a>, Yang et al 2021 (<a href="https://www.nytimes.com/2021/05/25/science/optogenetics-brain-social-behavior.html" title="Scientists Drove Mice to Bond by Zapping Their Brains With Light: The study, a tour de force in bioengineering, comes after 2 decades of research on brain-to-brain synchrony in people">media</a>)</p></li><li><p><a href="https://www.pnas.org/content/118/18/e2018181118">&#8220;Retron Library Recombineering (RLR): High-throughput functional variant screens via in vivo production of single-stranded DNA&#8221;</a>, Schubert et al 2021</p></li><li><p><a href="https://www.nature.com/articles/d41586-021-01186-6">&#8220;First genetically modified Oxitec mosquitoes released in the United States&#8221;</a></p></li><li><p><a href="https://www.biorxiv.org/content/10.1101/2021.05.28.446207v1">&#8220;Genomic characterization of world&#8217;s longest selection experiment in mouse reveals the complexity of polygenic traits&#8221;</a>, Palma-Vera et al 2021</p></li><li><p><a href="https://www.sciencedirect.com/science/article/pii/S0734975021000628">&#8220;Surrogate broodstock to enhance biotechnology research and applications in aquaculture&#8221;</a>, Jin et al 2021</p></li><li><p><a href="https://www.biorxiv.org/content/10.1101/2020.11.05.370478v3">&#8220;Utility of polygenic embryo screening for disease depends on the selection strategy&#8221;</a>, Lencz et al 2021</p></li><li><p><a href="https://www.nature.com/articles/d41586-021-01423-y">&#8220;Limit  on lab-grown human embryos dropped by stem-cell body: The International  Society for Stem Cell Research relaxed the famous 14-day rule on  culturing human embryos in its latest research guidelines&#8221;</a></p></li><li><p><a href="https://www.nytimes.com/2007/08/28/science/28crop.html">&#8220;Useful Mutants, Bred With Radiation&#8221;</a> (on <a href="https://en.wikipedia.org/wiki/Atomic_gardening">atomic gardening</a>)</p></li></ul><h2>2.3 Statistics/Meta-Science</h2><ul><li><p><a href="https://blog.dshr.org/2021/03/correlated-failures.html">&#8220;Correlated Failures&#8221; in HDDs/SSDs</a></p></li><li><p><a href="https://www.gwern.net/docs/statistics/bias/1992-rogers.pdf">&#8220;How a Publicity Blitz Created The Myth of Subliminal Advertising&#8221;</a>, Rogers 1992 (the famous movie-theater/popcorn-sales experiment never happened)</p></li></ul><h2>2.4 Politics/Religion</h2><ul><li><p><a href="https://www.gwern.net/docs/sociology/2021-costello.pdf">&#8220;Clarifying the Structure and Nature of Left-Wing Authoritarianism (LWA)&#8221;</a>, Costello et al 2021</p></li><li><p><a href="https://fantasticanachronism.com/2021/04/28/book-review-the-decline-and-fall-of-the-roman-empire/">&#8220;Book Review: </a><em><a href="https://fantasticanachronism.com/2021/04/28/book-review-the-decline-and-fall-of-the-roman-empire/">The Decline and Fall of the Roman Empire</a></em><a href="https://fantasticanachronism.com/2021/04/28/book-review-the-decline-and-fall-of-the-roman-empire/">&#8221;</a> (<a href="https://fantasticanachronism.com/2021/05/03/highlights-from-the-decline-and-fall-of-the-roman-empire/">excerpts</a>)</p></li></ul><h2>2.5 Psychology/Biology</h2><ul><li><p><a href="https://www.biorxiv.org/content/10.1101/2021.05.29.446289v1">&#8220;A connectomic study of a petascale fragment of human cerebral cortex&#8221;</a>,  Shapson-Coe et al 2021 (&#8220;&#8230;This &#8220;digital tissue&#8221; is a ~660,000&#215; scale up  of an earlier saturated reconstruction from a small region of mouse  cortex, published in 2015 (<a href="https://www.sciencedirect.com/science/article/pii/S0092867415008247" title="Saturated Reconstruction of a Volume of Neocortex">Kasthuri et al 2015</a>).  Although this scaleup was difficult, it was not hundreds of thousands  of times more difficult and took about the same amount of time as the  previous data set (~4 years)&#8230;The rapid improvements over the past few  years&#8230;argues that analyzing volumes that are even 3 orders of magnitude  larger, such as an exascale whole mouse brain connectome, will likely be  in reach within a decade." See also <a href="https://xcorr.net/2021/04/27/accelerating-progress-in-brain-recording-tech/">&#8220;Accelerating progress in brain recording tech&#8221;</a>.)</p></li><li><p><a href="https://www.nature.com/articles/s41467-021-22199-9">&#8220;Neuroimaging evidence for a network sampling theory of individual differences in human intelligence test performance&#8221;</a>, Soreq et al 2021; <a href="https://elifesciences.org/articles/64058">&#8220;The neural basis of intelligence in fine-grained cortical topographies&#8221;</a>, Feilong et al 2021; <a href="https://link.springer.com/article/10.1007/s00429-020-02113-7">&#8220;Predicting intelligence from brain gray matter volume&#8221;</a>, Hilger et al 2020 (towards the mechanistic reification of <em>g</em>: per <a href="https://www.gwern.net/docs/iq/2007-jung.pdf" title="'The Parieto-Frontal Integration Theory (P-FIT) of intelligence: Converging neuroimaging evidence', Jung &amp; Haier 2007">P-FIT</a>,  it is global efficiency/total cognitive resources which can be spent on  learning &amp; orchestrating specialized capabilities); if we consider  recent human brain imaging studies, cross-species comparisons, and deep  learning as converging, I would offer as a speculation the following:</p><p>The Master Synthesis: intelligence  is execution of small simplicity-weighted programs, best discovered by  search over smooth loss landscapes like that of <a href="https://www.gwern.net/notes/Sparsity">highly-overparameterized</a> differentiable networks containing lottery-ticket subnetworks which are ensembled/averaged over, <a href="https://www.gwern.net/Backstop#deep-bayes">approaching Bayes-optimal</a>  reasoning in the limit (as nearest-neighbors-like high dimensional  interpolation / memorization gives way to algorithmic generalization /  interpolation on a more abstract level); this can be implemented by  large numbers of similar neurons trained using any of the many  approximations to backprop; human intelligence&#8217;s <em>g</em> is real but  is the overall &#8216;pool&#8217; of neural resources which derives from overall  body integrity because the number of neurons, their density, their  myelination, resistance to damage and infection etc, is causally  downstream of all body and developmental systems, creating a huge  mutational target; the brain regions specialize and differentiate, and  their orchestration (or lack thereof) contributes to observed  performance on tasks tapping into multiple specialized regions; as tasks  rely on fewer regions or approach intrinsic ceiling, <em>g</em> ceases to be observable and task-specific influences matter most.</p></li><li><p><a href="https://www.nature.com/articles/s41591-021-01336-3">&#8220;MDMA-assisted therapy for severe PTSD: a randomized, double-blind, placebo-controlled phase 3 study&#8221;</a>, Mitchell et al 2021 (<em>d</em> = 0.9 over therapy); <a href="https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7643046/">&#8220;Effects of Psilocybin-Assisted Therapy on Major Depressive Disorder&#8221;</a>, Davis et al 2021</p></li><li><p><a href="https://www.newyorker.com/magazine/2021/04/05/why-animals-dont-get-lost">&#8220;Why  Animals Don&#8217;t Get Lost: Birds do it. Bees do it. Learning about the  astounding navigational feats of wild creatures can teach us a lot about  where we&#8217;re going&#8221;</a> (on spectacular but still mysterious feats of <a href="https://en.wikipedia.org/wiki/Animal_navigation">animal navigation</a>)</p></li><li><p><a href="https://defector.com/in-the-future-of-collecting-is-anyone-having-fun/">&#8220;In The Future Of Collecting, Is Anyone Having Fun?&#8221;</a> (on <a href="https://en.wikipedia.org/wiki/Bobblehead">Bobblehead</a> collectors)</p></li><li><p><a href="https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8114859/">&#8220;Linking Brain Biology to Intellectual Endowment: A Review on the Associations of Human Intelligence With Neuroimaging Data&#8221;</a>, Dizaji et al 2021</p></li><li><p><a href="https://www.gwern.net/docs/economics/2012-oboyle.pdf">&#8220;The Best And The Rest: Revisiting The Norm Of Normality Of Individual Performance&#8221;</a>, O&#8217;Boyle &amp; Aguinis 2012 (performance is <a href="https://www.gwern.net/notes/Pipeline">log-normal</a>)</p></li><li><p><a href="https://www.biorxiv.org/content/10.1101/2020.11.21.392720v1">&#8220;A conserved strategy for inducing appendage regeneration&#8221;</a>, Abrams et al 2021 (slight regrowth of damaged mouse limbs by drinking sugar+amino-acid-supplemented water)</p></li><li><p><a href="https://astralcodexten.substack.com/p/know-your-amphetamines">&#8220;Know Your Amphetamines&#8221;</a>, Scott Alexander</p></li><li><p><a href="https://www.nature.com/articles/srep02617">&#8220;Feeling Small: Exploring the Tactile Perception Limits [of Humans]&#8221;</a>, Skedung et al 2013</p></li><li><p><a href="http://grantland.com/features/diplomacy-the-board-game-of-the-alpha-nerds/" title="One writer enters international competition to play the world-conquering game that redefines what it means to be a geek (and a person)">&#8220;The Board Game of the Alpha Nerds: Before </a><em><a href="http://grantland.com/features/diplomacy-the-board-game-of-the-alpha-nerds/" title="One writer enters international competition to play the world-conquering game that redefines what it means to be a geek (and a person)">Risk</a></em><a href="http://grantland.com/features/diplomacy-the-board-game-of-the-alpha-nerds/" title="One writer enters international competition to play the world-conquering game that redefines what it means to be a geek (and a person)">, before </a><em><a href="http://grantland.com/features/diplomacy-the-board-game-of-the-alpha-nerds/" title="One writer enters international competition to play the world-conquering game that redefines what it means to be a geek (and a person)">Dungeons &amp; Dragons</a></em><a href="http://grantland.com/features/diplomacy-the-board-game-of-the-alpha-nerds/" title="One writer enters international competition to play the world-conquering game that redefines what it means to be a geek (and a person)">, before </a><em><a href="http://grantland.com/features/diplomacy-the-board-game-of-the-alpha-nerds/" title="One writer enters international competition to play the world-conquering game that redefines what it means to be a geek (and a person)">Magic: The Gathering</a></em><a href="http://grantland.com/features/diplomacy-the-board-game-of-the-alpha-nerds/" title="One writer enters international competition to play the world-conquering game that redefines what it means to be a geek (and a person)">, there was </a><em><a href="http://grantland.com/features/diplomacy-the-board-game-of-the-alpha-nerds/" title="One writer enters international competition to play the world-conquering game that redefines what it means to be a geek (and a person)">Diplomacy</a></em><a href="http://grantland.com/features/diplomacy-the-board-game-of-the-alpha-nerds/" title="One writer enters international competition to play the world-conquering game that redefines what it means to be a geek (and a person)">&#8221;</a> (<a href="https://en.wikipedia.org/wiki/Diplomacy_(game)">WP</a>;  &#8220;I still don&#8217;t know whom I should have trusted, if anyone. All I know  is that I felt stupid, stressed out, humiliated, and sad.&#8221;)</p></li></ul><h2>2.6 Technology</h2><ul><li><p><a href="https://rootsofprogress.org/nuclear-physics">&#8220;I walk the (beta-stability) line: How counting neutrons explains nuclear waste&#8221;</a></p></li><li><p><a href="https://alexdanco.com/2020/10/08/making-is-show-business-now/">&#8220;Making is Show Business now&#8221;</a>, Alex Danco</p></li><li><p><a href="https://www.thenewatlantis.com/publications/shop-class-as-soulcraft">&#8220;Shop Class as Soulcraft: The case for the manual trades&#8221;</a>, Crawford 2006</p></li><li><p><a href="https://www.kickstarter.com/projects/upperstory/spintronics-build-mechanical-circuits">&#8220;Spintronics: Build mechanical circuits&#8221;</a>, Kickstarter (followup to <a href="https://en.wikipedia.org/wiki/Turing_Tumble">Turing Tumble</a>)</p></li></ul><h2>2.7 Economics</h2><ul><li><p><a href="https://www.gwern.net/docs/sociology/2020-dellavigna.pdf">&#8220;RCTs to Scale: Comprehensive Evidence from 2 Nudge Units&#8221;</a>, DellaVigna &amp; Linos 2020 (nudge effects overestimated by 6.2&#215; due to publication bias)</p></li><li><p><a href="https://academic.oup.com/ije/advance-article/doi/10.1093/ije/dyab099/6288123">&#8220;No  causal associations between childhood family income and subsequent  psychiatric disorders, substance misuse and violent crime arrests: a  nationwide Finnish study of &gt;650,000 individuals and their siblings&#8221;</a>, Sariaslan et al 2021; <a href="https://academic.oup.com/ije/advance-article/doi/10.1093/ije/dyab066/6274255">&#8220;Parental income and mental disorders in children and adolescents: prospective register-based study&#8221;</a>, Kinge et al 2021</p></li><li><p><a href="https://mattlakeman.org/2021/06/01/everything-you-might-want-to-know-about-whaling/">&#8220;Everything You Might Want to Know about Whaling&#8221;</a>, Matt Lakeman</p></li><li><p><a href="https://www.gwern.net/notes/Nash">Exploding Nash Equilibrium For Trustless Trade</a></p></li></ul><h2>2.8 Fiction</h2><ul><li><p><a href="https://www.lightspeedmagazine.com/fiction/love-is-the-plan-the-plan-is-death/">&#8220;Love Is the Plan the Plan Is Death&#8221;</a>, <a href="https://en.wikipedia.org/wiki/James_Tiptree_Jr.">James Tiptree, Jr.</a> (<a href="https://en.wikipedia.org/wiki/Love_Is_the_Plan_the_Plan_Is_Death">WP</a>)</p></li></ul><h2>2.9 Miscellaneous</h2><ul><li><p><a href="https://www.newyorker.com/news/dispatch/the-strange-story-of-dagobert-the-ducktales-bandit">&#8220;The Strange Story of Dagobert, the </a><em><a href="https://www.newyorker.com/news/dispatch/the-strange-story-of-dagobert-the-ducktales-bandit">Duck Tales</a></em><a href="https://www.newyorker.com/news/dispatch/the-strange-story-of-dagobert-the-ducktales-bandit">  Bandit: In the &#8217;90s, a frustrated artist in Berlin went on a crime  spree&#8212;building bombs, extorting high-end stores, and styling his persona  after Scrooge McDuck. He soon became a German folk hero.&#8221;</a> (<a href="https://en.wikipedia.org/wiki/Arno_Funke">WP</a>; another reminder for Americans&#8212;odd as it may seem, Donald Duck is <em>extremely</em> popular overseas; see also the unknown-in-the-USA character <a href="https://en.wikipedia.org/wiki/John_D._Rockerduck">John D. Rockerduck</a> or <a href="https://slate.com/culture/2009/12/sweden-s-bizarre-tradition-of-watching-donald-duck-kalle-anka-cartoons-on-christmas-eve.html">beloved Scandinavian tradition</a><em><a href="https://en.wikipedia.org/wiki/From_All_of_Us_to_All_of_You">From All of Us to All of You</a></em> who 2020 airing set an all-time record of &gt;4.5m viewers)</p></li><li><p><a href="https://en.wikipedia.org/wiki/Atmospheric_optics#List">List of atmospheric optical phenomena</a> (How many would you recognize from a distance or plane? How many have you even heard of?)</p></li><li><p><a href="https://en.wikipedia.org/wiki/Franz_Nopcsa_von_Fels%C5%91-Szilv%C3%A1s">Baron Franz Nopcsa von Fels&#337;-Szilv&#225;s</a> (noted geologist, paleontologist, anthropologist, homosexual, &amp; skyjacker)</p></li><li><p><a href="https://en.wikipedia.org/wiki/Krishnacore">Krishnacore</a></p></li></ul><div><hr></div><ol><li><p>What is a diffusion model like DDPM? To try to explain it as simply as possible <a href="https://yang-song.github.io/blog/2021/score/" title="Generative Modeling by Estimating Gradients of the Data Distribution">without the math</a>:</p><p>DDPM is a neural net which is trained to fix noise in an image: it  takes a noisy image and &#8216;sharpens&#8217; it to produce a new image. You train  it by adding dirt to a normal image, and teaching it to turn the dirty  version into the original. As it gets better, it learns what the images  all tend to look like so it can &#8216;see through&#8217; ever more noise, to turn  smudged hints of the original image into its best guess. Once it&#8217;s done  training, what happens if you give it a completely dirty photo, which is  pure static noise? Well, it produces a slightly less dirty &#8216;photo&#8217;. And  if you do it again? it&#8217;s a little cleaner still. Now, what if you do  this many times? It has to get cleaner each time. The end result: the  static noise goes in, and a face pops out! The DDPM has hallucinated a  face out of the noise. One little blob of static here turned into a  nose, and another blob turned into an ear, and it went from there.</p></li></ol>]]></content:encoded></item><item><title><![CDATA[April 2021 newsletter]]></title><description><![CDATA[with links on AI scaling, particular new East Asian record-breaking work & deep reinforcement learning.]]></description><link>https://gwern.substack.com/p/april-2021-newsletter</link><guid isPermaLink="false">https://gwern.substack.com/p/april-2021-newsletter</guid><dc:creator><![CDATA[gwern]]></dc:creator><pubDate>Thu, 03 Jun 2021 15:45:24 GMT</pubDate><enclosure url="https://bucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com/public/images/33773d07-4631-4a6b-91c2-44a2b1082385_1164x702.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>April 2021&#8217;s <a href="https://www.gwern.net/newsletter/2021/04">Gwern.net</a> <a href="https://gwern.substack.com">newsletter</a> is now out; previous, <a href="https://www.gwern.net/newsletter/2021/03">March 2021</a> (<a href="https://www.gwern.net/tags/newsletter">archives</a>). This is a collation of links and summary of major changes, overlapping with my <a href="https://www.gwern.net/Changelog">Changelog</a>; brought to you by my donors on <a href="https://www.patreon.com/gwern">Patreon</a>.</p><h1>1 Writings</h1><ul><li><p><a href="https://www.gwern.net/Variables">Better Greek Variable Suggestions</a> (use &#1008;, &#962;, &#965;, &#982;, &#933;, &#926;, &#953;, &#1009;, &#977;, or &#928; instead)</p></li></ul><h1>2 Links</h1><h2>2.1 AI</h2><ul><li><p><a href="https://arxiv.org/abs/1810.00825">&#8220;Set Transformer: A Framework for Attention-based Permutation-Invariant Neural Networks&#8221;</a>, Lee et al 2018; <a href="https://arxiv.org/abs/2103.03206#deepmind">&#8220;Perceiver: General Perception with Iterative Attention&#8221;</a>, Jaegle et al 2021 (skinny Transformers applied recurrently; given reinvention, one might ask &#8220;is <a href="https://arxiv.org/abs/1706.03762#google" title="'Attention Is All You Need', Vaswani et al 2017">attention</a>, getting too much attention?&#8221;, especially given how many Transformer tweaks <a href="https://arxiv.org/abs/2102.11972#google" title="'Do Transformer Modifications Transfer Across Implementations and Applications?', Narang et al 2021">don&#8217;t pan out</a>  or have antecedents, indicating a gold rush? Probably not: if the  marginal return on this research direction had fallen below that of  competitors, we would see those neglected directions invade Transformer  topics&#8212;while we continue to see the reverse, and many applications as  yet untouched by all the new approaches, suggesting that we <em>still</em> don&#8217;t pay enough attention)</p></li><li><p><a href="https://arxiv.org/abs/2103.04689">&#8220;Z-IL: Predictive Coding Can Do Exact Backpropagation on Any Neural Network&#8221;</a>, Salvatori et al 2021 (scaling local learning rules to ImageNet AlexNet/Resnet &amp; ALE DRL at similar compute cost)</p></li><li><p><a href="https://arxiv.org/abs/1708.07120">&#8220;Super-Convergence: Very Fast Training of Neural Networks Using Large Learning Rates&#8221;</a>,  Smith &amp; Topin 2017 (the lingering mystery of super-convergence,  saving 50&#8211;90% compute with LRs as high as 20 (!): what is it, why does  it work only sometimes, is there any connection to <a href="https://www.gwern.net/docs/ai/2021-power.pdf#openai" title="'Grokking: Generalization Beyond Overfitting On Small Algorithmic Data Sets', Powers et al 2021">grokking</a> &amp; can it work for large models like GPT-3 given the <a href="https://old.reddit.com/r/MachineLearning/comments/ba1wg5/d_thoughts_about_superconvergence_and/">tunneling hypothesis</a>?)</p></li><li><p><a href="http://www.offconvex.org/2021/04/07/ripvanwinkle/">&#8220;Rip van Winkle&#8217;s Razor, a Simple New Estimate for Adaptive Data Analysis&#8221;</a>  (an unusual approach to estimating generalization&#8212;by quantifying the  information-theoretic simplicity of all the powerful DL research  discoveries since 2012, into ~1 kilobyte. And yet, <em>what</em> a kilobyte&#8230;)</p></li><li><p><a href="https://github.com/golanlevin/AmbigrammaticFigures">&#8220;Ambigrammatic Figures&#8221;</a>, Levin &amp; Huang 2020 (making horrifying StyleGAN faces that can be <a href="https://en.wikipedia.org/wiki/Ambigram">rotated 180&#176;</a> by projection &amp; then <a href="https://www.gwern.net/Faces#reversing-stylegan-to-control-modify-images">gradient-ascent</a> towards an upside-down face)</p></li></ul><p><a href="https://old.reddit.com/r/mlscaling/">Matters Of Scale</a>:</p><ul><li><p><strong><a href="https://lair.lighton.ai/akronomicon/" title="The Akronomicon: an Extreme-Scale Leaderboard">Large Models</a></strong>:</p><ul><li><p>Congratulations to OpenAI on 1 year of GPT-3 &amp; OA API. Has it really only been a year?&#8212;it has truly exceeded expectations.</p></li><li><p><a href="https://en.wikipedia.org/wiki/Naver">Naver</a> announces 204b-parameter Korean-language NN, <a href="http://m.koreaherald.com/view.php?ud=20210525000824">&#8220;HyperCLOVA&#8221;</a>  (KO; unknown arch although apparently dense, or training-compute or  benchmark/loss performance; 650b token training dataset. Who knew Naver  was even trying? &#8220;And we are here as on a darkling plain / Swept with  confused alarms of struggle and flight, / Where ignorant armies clash by  night.&#8221;)</p></li><li><p><a href="https://arxiv.org/abs/2104.12369#huawei">&#8220;PanGu-&#945;: Large-scale Autoregressive Pretrained Chinese Language Models with Auto-parallel Computation&#8221;</a>,  Zeng et al 2021 (Zh; Huawei&#8217;s GPT-3-200b prototype, trained on  indigenous Chinese GPU+DL stack; a partial replication, due to  incomplete training on ~43b tokens; the <a href="https://git.openi.org.cn/PCL-Platform.Intelligence/PanGu-Alpha#user-content-%E6%A8%A1%E5%9E%8B%E4%B8%8B%E8%BD%BD">13b-parameter</a> model checkpoint has been released for download, and they are considering releasing the 200b-parameter model&#8230; <a href="https://chinai.substack.com/p/chinai-141-the-pangu-origin-story">Ding commentary</a>)</p></li><li><p>New &#119978;(100b)-parameter Transformer models announced at Google I/O &#8217;2021: <a href="https://blog.google/technology/ai/lamda/" title="LaMDA: our breakthrough conversation technology">LaMDA</a> (EN; chatbot), <a href="https://blog.google/products/search/introducing-mum/">MUM</a> (multimodal multilingual search/translation/Q&amp;A)</p></li><li><p><a href="https://www.infoq.cn/article/EFIHo75sQsVqLvFTruKE#alibaba">&#8220;PLUG&#8221;</a> (Zh): a 27b parameter BERT-like Chinese language model, targeting 200b next (AliBaba followup to <a href="https://arxiv.org/abs/1908.04577#alibaba" title="'StructBERT: Incorporating Language Structures into Pre-training for Deep Language Understanding', Wang et al 2019">StructBERT</a>/<a href="https://arxiv.org/abs/2004.07159#alibaba" title="'PALM: Pre-training an Autoencoding&amp;Autoregressive Language Model for Context-conditioned Generation', Bi et al 2020">PALM</a>)</p></li><li><p><a href="https://arxiv.org/abs/2105.13290">&#8220;CogView: Mastering Text-to-Image Generation via Transformers&#8221;</a>, Ding et al 2021 (another Chinese <a href="https://openai.com/blog/dall-e/">DALL&#183;E</a> clone, post-<a href="https://arxiv.org/abs/2103.00823#alibaba" title="'M6: A Chinese Multimodal Pretrainer', Lin et al 2021">M6</a>: <em>n</em> = <a href="https://wudaoai.cn/data-detail/1" title="WuDaoCorpus: the largest Chinese corpus data set, with about 2TB of text and 725 billion Chinese characters">30m text-image pairs</a>, 4b-parameter GPT, models to be released)</p></li><li><p><a href="https://arxiv.org/abs/2104.10157">&#8220;VideoGPT: Video Generation using VQ-VAE and Transformers&#8221;</a>, Yan et al 2021; <a href="https://arxiv.org/abs/2104.14806#microsoft">&#8220;GODIVA: </a><em><a href="https://arxiv.org/abs/2104.14806#microsoft">G</a></em><a href="https://arxiv.org/abs/2104.14806#microsoft">enerating </a><em><a href="https://arxiv.org/abs/2104.14806#microsoft">O</a></em><a href="https://arxiv.org/abs/2104.14806#microsoft">pen-</a><em><a href="https://arxiv.org/abs/2104.14806#microsoft">D</a></em><a href="https://arxiv.org/abs/2104.14806#microsoft">oma</a><em><a href="https://arxiv.org/abs/2104.14806#microsoft">I</a></em><a href="https://arxiv.org/abs/2104.14806#microsoft">n </a><em><a href="https://arxiv.org/abs/2104.14806#microsoft">V</a></em><a href="https://arxiv.org/abs/2104.14806#microsoft">ideos from n</a><em><a href="https://arxiv.org/abs/2104.14806#microsoft">A</a></em><a href="https://arxiv.org/abs/2104.14806#microsoft">tural Descriptions&#8221;</a>, Wu et al 2021 (DALL&#183;E for video on Howto100M: <a href="https://arxiv.org/abs/1906.00446#deepmind" title="'Generating Diverse High-Fidelity Images with VQ-VAE-2', Razavi et al 2019">VQ-VAE</a> + sparse attention)</p></li><li><p><a href="https://arxiv.org/abs/2104.04473#nvidia">&#8220;Efficient Large-Scale Language Model Training on GPU Clusters&#8221;</a>, Narayanan et al 2021 (Nvidia <a href="https://github.com/nvidia/megatron-lm">&#8216;Megatron-LM&#8217; software</a> for scaling up to 3072 A100 GPUs; allows 1t-parameter models at 502 petaFLOP/s or 50% efficiency, cf TPU rival, <a href="https://arxiv.org/abs/2105.04663#google" title="'GSPMD: General and Scalable Parallelization for ML Computation Graphs', Xu et al 2021: '50% to 62% compute utilization on 128 to 2048 Cloud TPUv3 cores for models with up to one trillion parameters'">GSPMD</a>, and note <a href="file:///tmp/burlyHGiKo.html#patterson-et-al-2021">Patterson et al 2021</a> estimates GPT-3 at ~3.5m V100 GPU-hours, so OA got ~20% efficiency?); <a href="https://www.youtube.com/watch?v=eAn_oiZwUXA&amp;t=2998s" title="GTC 2021 Keynote with NVIDIA CEO Jensen Huang: NVIDIA CEO Jensen Huang delivers the #GTC21&#8203; keynote, where he introduced amazing breakthroughs in building virtual worlds with NVIDIA Omniverse; in advancing enterprise computing with new NVIDIA DGX systems and software; in turning the data center into the new unit of computing with the new NVIDIA Grace CPU, BlueField-3 DPU, and DOCA 1.0 SDK; in broadening the reach of AI to all companies and industries with NVIDIA EGX and Aerial 5G; and in transforming transportation with NVIDIA DRIVE Orin and Atlan.">&#8220;We expect to see multi-trillion-parameter models by next year, and 100 trillion+ parameter models by 2023&#8221;</a> &#8212;Nvidia CEO <a href="https://en.wikipedia.org/wiki/Jensen_Huang">Jensen Huang</a> (<a href="https://www.gwern.net/docs/ai/2021-04-12-jensenhuang-gtc2021keynote-eAn_oiZwUXA.en.vtt.txt">subtitles</a>)</p></li><li><p>Mixture-Of-Experts:</p><ul><li><p><a href="https://en.pingwest.com/a/8693">BAAI&#8217;s &#8220;Wudao Wensu&#8221;: 1.75-trillion parameters &amp; multimodal!</a> (<a href="https://syncedreview.com/2021/03/23/chinas-gpt-3-baai-introduces-superscale-intelligence-model-wu-dao-1-0/">prologue</a>)</p></li><li><p><a href="https://arxiv.org/abs/2105.15082#alibaba">&#8220;Exploring Sparse Expert Models and Beyond&#8221;</a>, Yang et al 2021 (1t-parameter hierarchical Switch Transformer trained on 480 V100 GPUs)</p></li></ul></li></ul></li><li><p><strong><a href="https://arxiv.org/abs/1911.08265#deepmind">MuZero</a></strong>:</p><ul><li><p><a href="https://arxiv.org/abs/2104.06294#deepmind">&#8220;MuZero Unplugged: Online and Offline Reinforcement Learning by Planning with a Learned Model&#8221;</a>, Schrittwieser et al 2021 (Reanalyze+MuZero; <a href="https://www.gwern.net/images/ai/2021-schrittwieser-figure1-mspacmanmuzerologrewardscaling.png" title="Figure 1: Final scores in Ms. Pac-Man for different Reanalyse fractions. By scaling the Reanalyse fraction, MuZero can be trained at any desired data budget. All other parameters are held constant. Note the logarithmic x-axis: Linear improvements in score require exponentially more data, matching scaling laws such as described by Kaplan et al 2020 for language models.">smooth log-scaling</a> of <em>Ms.&nbsp;Pacman</em> reward with sample size, 107&#8211;1010, showing that DRL for arcade games parallels board games)</p></li><li><p><a href="https://sites.google.com/berkeley.edu/decision-transformer">&#8220;Decision Transformer: Reinforcement Learning via Sequence Modeling&#8221;</a>, Chen et al 2021</p></li><li><p><a href="https://arxiv.org/abs/2104.06303#deepmind">&#8220;Sampled MuZero: Learning and Planning in Complex Action Spaces&#8221;</a>, Hubert et al 2021 (MuZero for continuous domains: DM Control Suite/Real-World RL Suite); <a href="https://arxiv.org/abs/2006.07430">&#8220;Continuous Control for Searching and Planning with a Learned Model&#8221;</a>, Yang et al 2020</p></li><li><p><a href="https://arxiv.org/abs/2104.06159">&#8220;Muesli: Combining Improvements in Policy Optimization&#8221;</a>, Hessel et al 2020 (catching up with original MuZero)</p></li><li><p><a href="https://arxiv.org/abs/2102.12924">&#8220;Visualizing MuZero Models&#8221;</a>, de Vries et al 2021 (reimplementing &amp; introspecting a MuZero)</p></li></ul></li><li><p><a href="https://arxiv.org/abs/2104.03113">&#8220;Scaling Scaling Laws with Board Games&#8221;</a>, <a href="https://andyljones.com/">Jones</a> 2021 (AlphaZero/<a href="https://en.wikipedia.org/wiki/Hex_(board_game)">Hex</a>: <a href="https://www.gwern.net/notes/Faster">highly-optimized</a> GPU implementation enables showing <a href="https://www.gwern.net/notes/Scaling">smooth scaling</a>  across 6 OOM of compute&#8212;2&#215; FLOPS = 66% victory; amortization of  training &#8594; runtime tree-search, where 10&#215; training = 15&#215; runtime)</p></li><li><p><a href="https://christina.kim/2021/04/11/scaling-laws-for-language-transfer-learning/#openai">&#8220;Scaling Laws for Language Transfer Learning&#8221;</a>, Christina Kim (<a href="https://arxiv.org/abs/2102.01293#openai" title="Scaling Laws for Transfer">Hernandez et al 2021</a> followup: smooth scaling for En &#8594; De/Es/Zh)</p></li><li><p><a href="https://arxiv.org/abs/2104.10350#google">&#8220;Carbon Emissions and Large Neural Network Training&#8221;</a>,  Patterson et al 2021 (&#8220;&#8230;choice of DNN/datacenter/processor can reduce  the carbon footprint up to ~100&#8211;1000&#215;. These large factors make  retroactive estimates difficult.&#8221;)</p></li><li><p><a href="https://arxiv.org/abs/2104.07705">&#8220;How to Train BERT with an Academic Budget&#8221;</a>, Izsak et al 2021 (<a href="https://arxiv.org/abs/1810.04805#google" title="'BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding', Devlin et al 2018">BERT</a> in 8 GPU-days&#8212;R&amp;D iteration allows finding efficiency; there&#8217;s nothing so expensive as demanding research be cheap.^1^)</p></li></ul><h2>2.2 Genetics</h2><p>Everything Is Heritable:</p><ul><li><p><a href="https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6818669/">&#8220;Precision exercise medicine: understanding exercise response variability&#8221;</a>,  Ross et al 2019 (&#8220;large individual differences in CRF response (range:  &#8722;33% to +118%) have been observed across the 8 exercise training studies  independent of exercise duration&#8221;&#8212;nothing in psychology, or medicine,  makes sense except in the light of individual differences&#8230;)</p></li></ul><p>Recent Evolution:</p><ul><li><p><a href="https://academic.oup.com/mbe/advance-article/doi/10.1093/molbev/msab147/6277411">&#8220;Analysis of genomic DNA from medieval plague victims suggests long-term effect of </a><em><a href="https://academic.oup.com/mbe/advance-article/doi/10.1093/molbev/msab147/6277411">Yersinia pestis</a></em><a href="https://academic.oup.com/mbe/advance-article/doi/10.1093/molbev/msab147/6277411"> on human immunity genes&#8221;</a>, Immel et al 2021</p></li></ul><p>Engineering:</p><ul><li><p><a href="https://biohackinfo.com/news-china-gene-editing-criminal-law-article-336-march-2021/">&#8220;China officially bans CRISPR babies, human clones and animal-human hybrids&#8221;</a>? (another blow to attempts to project fears &amp; fantasies onto China)</p></li></ul><h2>2.3 Politics/Religion</h2><ul><li><p><em><a href="https://www.nap.edu/catalog/25762/reflecting-sunlight-recommendations-for-solar-geoengineering-research-and-research-governance">Reflecting Sunlight: Recommendations for Solar Geoengineering Research and Research Governance</a></em>, National Academies 2021 (<a href="https://www.nytimes.com/2021/03/25/climate/geoengineering-sunlight.html">media</a>)</p></li><li><p><a href="https://www.gwern.net/docs/sociology/2020-muralidharan.pdf">&#8220;Improving Public Sector Management at Scale? Experimental Evidence on School Governance India&#8221;</a>, Muralidharan &amp; Singh 2020</p></li><li><p><a href="https://www.gwern.net/docs/fiction/2012-mason.pdf">&#8220;Jay-Z&#8217;s </a><em><a href="https://www.gwern.net/docs/fiction/2012-mason.pdf">99 Problems</a></em><a href="https://www.gwern.net/docs/fiction/2012-mason.pdf">, Verse 2: A Close Reading with 4th Amendment Guidance for Cops and Perps&#8221;</a>, Mason 2012</p></li></ul><h2>2.4 Psychology/Biology</h2><ul><li><p><a href="https://www.gwern.net/docs/longevity/2021-wiley.pdf">&#8220;Oxylipin biosynthesis reinforces cellular senescence and allows detection of senolysis&#8221;</a>, Wiley et al 2021</p></li><li><p><a href="https://www.nytimes.com/2019/02/26/magazine/psychics-skeptics-facebook.html" title="Are some celebrity mediums fooling their audience members by reading social media pages in advance? A group of online vigilantes is out to prove it">&#8220;Inside the Secret Sting Operations to Expose Celebrity Psychics&#8221;</a></p></li><li><p><a href="https://www.gwern.net/docs/catnip/2021-smith.pdf">&#8220;If I fits I sits: A citizen science investigation into illusory contour susceptibility in domestic cats (</a><em><a href="https://www.gwern.net/docs/catnip/2021-smith.pdf">Felis silvestris catus</a></em><a href="https://www.gwern.net/docs/catnip/2021-smith.pdf">)&#8221;</a>, Smith et al 2021</p></li><li><p><a href="https://www.gwern.net/docs/biology/2005-paxton.pdf">&#8220;Cetaceans,  sex and sea serpents: an analysis of the Egede accounts of a &#8216;most  dreadful monster&#8217; seen off the coast of Greenland in 1734&#8221;</a>, Paxton et al 2005 (is that a legendary cryptid in your pocket, or are you just happy to see me?)</p></li><li><p><a href="https://www.gwern.net/docs/psychology/writing/2020-reilly.pdf">&#8220;Building the perfect curse word: A psycholinguistic investigation of the form and meaning of taboo words&#8221;</a>, Reilly et al 2020</p></li><li><p><a href="https://en.wikipedia.org/wiki/Tarrare">Tarrare</a></p></li></ul><h2>2.5 Technology</h2><ul><li><p><a href="https://arxiv.org/abs/2103.07487">&#8220;How Developers Choose Names&#8221;</a>,  Feitelson et al 2021 (&#8220;Another example concerned the function  &#8216;arrangeFilesByName(files)&#8217;. When asked the return value&#8230;one suggested  the number of files reordered&#8221;)</p></li><li><p><a href="https://arxiv.org/abs/2004.02504">&#8220;Bringing GNU Emacs to Native Code&#8221;</a>,  Corallo et al 2020 (using libgccjit to make Emacs 2.3&#215; to 42&#215; faster;  gccemacs has been merged into Emacs HEAD &amp; will be available soon)</p></li><li><p><a href="https://phiresky.github.io/blog/2021/hosting-sqlite-databases-on-github-pages/">&#8220;Hosting SQLite databases on Github Pages (or any static file hoster)&#8221;</a> (a revolution in static website technology: eg running a query <a href="https://nitter.cc/simonw/status/1388933800445452290" title="Check out this demo: I run the SQL query &quot;select country_code, long_name from wdi_country order by rowid desc limit 100&quot; and it fetches just 54.2KB of new data (across 49 small HTTP requests) to return 100 results---from a statically hosted database file that's 668.8MB!">need download only 54kb of a 670MB database</a>; fulltext site search is just the beginning of the possibilities of this clever use of <a href="https://en.wikipedia.org/wiki/Byte_serving">range requests</a>)</p></li><li><p><a href="https://www.coderelay.io/fontemon.html">&#8220;</a><em><a href="https://www.coderelay.io/fontemon.html">Fontemon</a></em><a href="https://www.coderelay.io/fontemon.html">: World&#8217;s first video game in a font!&#8221;</a> (a <em>Pokemon</em>-like CYOA <a href="https://github.com/mmulet/code-relay/blob/main/markdown/HowIDidIt.md">implemented as an OpenType font file</a>; play in browser or text editor&#8212;still not quite <a href="https://www.gwern.net/Turing-complete">Turing-complete</a> but definitely the most impressive thing implemented in a font so far)</p><ul><li><p><em>Fontemon</em> is by far the highlight of <a href="http://sigbovik.org/2021/proceedings.pdf">SIGBOVIK 2021</a>; but also worth noting: <a href="http://sigbovik.org/2021/proceedings.pdf#page=8">&#8220;Back to Square One: Superhuman Performance in Chutes and Ladders Through Deep Neural Networks and Tree Search&#8221;</a> &#183; <a href="http://sigbovik.org/2021/proceedings.pdf#page=83">&#8220;Deep Deterministic Policy Gradient Boosted Decision Trees&#8221;</a> &#183; <a href="http://sigbovik.org/2021/proceedings.pdf#page=126">&#8220;Lowestcase and uppestcase letters: Advances in derp learning&#8221;</a> &#183; <a href="http://sigbovik.org/2021/proceedings.pdf#page=167">&#8220;openCHEAT: Computationally Helped Error bar Approximation Tool&#8212;Kick-starting Science 4.0&#8221;</a> &#183; <a href="http://sigbovik.org/2021/proceedings.pdf#page=216">&#8220;The Newcomb-Benford Law, Applied to Binary Data: An Empirical and Theoretic Analysis&#8221;</a> &#183; <a href="http://sigbovik.org/2021/proceedings.pdf#page=252">&#8220;Inverted Code Theory: Manipulating Program Entropy&#8221;</a> (<em><a href="https://en.wikipedia.org/wiki/Tenet_(film)">Tenet</a></em> fans only&#8212;possibly inferior to <a href="http://www.frc.ri.cmu.edu/~hpm/project.archive/general.articles/1991/TempComp.html" title="Time Travel and Computing">Moravec 1991</a>?) &#183; <a href="http://sigbovik.org/2021/proceedings.pdf#page=282">&#8220;Build your own 8-bit busy beaver on a breadboard!&#8221;</a></p></li></ul><p>Incidentally, it&#8217;s curious that while STEM fields have entire annual issues, journals, &amp; conferences devoted to satire (<a href="http://sigbovik.org/">SIGBOVIK</a>; Arxiv April Fools papers like <a href="https://arxiv.org/abs/1703.10987" title="On the Impossibility of Supersized Machines">Garfinkel et al 2017</a>; <a href="https://www108.lamp.le.ac.uk/ojs1/index.php/pst/issue/archive">Special Topics</a>; the <a href="https://www.bmj.com/about-bmj/resources-authors/article-types/christmas-issue">BMJ Christmas issue</a>; the <a href="https://en.wikipedia.org/wiki/Ig_Nobel_Prize">Ig Nobel Prizes</a> &amp; <a href="https://bahfest.com/">BAHFest</a>), after asking in several places, I have found no instances in the humanities. (I know of many entertaining <em>papers</em>, like <a href="https://www.gwern.net/docs/philo/2008-sinhababu.pdf" title="Possible Girls">Sinhababu 2008</a> on waifus, but no <em>regular organized</em> publication, with the possible exception of the annual <a href="https://en.wikipedia.org/wiki/Latke%E2%80%93Hamantash_Debate">&#8220;Latke-Hamantash Debate&#8221;</a>.)</p></li></ul><h2>2.6 Economics</h2><ul><li><p><a href="https://www.gwern.net/docs/statistics/decision/2006-thorp.pdf">&#8220;The Kelly Criterion in Blackjack Sports Betting, and the Stock Market&#8221;</a>, Thorp 2006</p></li><li><p><a href="https://marginalrevolution.com/marginalrevolution/2016/10/performance-pay-nobel.html">&#8220;The Performance Pay Nobel&#8221;</a> (CEO pay as <a href="https://www.gwern.net/Backstop">blackbox optimization problem</a>)</p></li><li><p><a href="https://www.gwern.net/docs/economics/2008-josephson.pdf">&#8220;The Ocean&#8217;s Hot Dog: The Development of the Fish Stick&#8221;</a>,  Kelly 2008 (out of nostalgia, I bought some fish sticks for the first  time in decades; better than I remembered, even if I had no <a href="https://en.wikipedia.org/wiki/Tartar_sauce">tartar</a> handy)</p></li></ul><h2>2.7 Philosophy</h2><ul><li><p><a href="https://www.gwern.net/docs/culture/2007-shiner.pdf">&#8220;The Aesthetics of Smelly Art&#8221;</a>, Shiner &amp; Kriskovets 2007; <a href="https://www.gwern.net/docs/culture/2019-kraft.pdf">&#8220;The Odor Value Concept in the Formal Analysis of Olfactory Art&#8221;</a>, Kraft 2019; <a href="https://qualiacomputing.com/2020/02/21/perfumery-as-an-art-form/" title="Hedonic Tone, memetics, scent, sex, spirituality">&#8220;Perfumery as an art form&#8221;</a>/<a href="https://qualiacomputing.com/2020/08/14/qualia-research-diary-scents/" title="Qualia Research Diary: Scents [consciousness research, Experiment, genetics, memetics, scent, valence]">notes</a>, Qualia Computing 2020 (more: manufacturing: <a href="https://www.newyorker.com/magazine/2005/03/14/scent-nile" title="Chandler Burr 2005">&#8220;The Scent of the Nile: Jean-Claude Ellena creates a new perfume&#8221;</a>; human smell is better than you think: <a href="https://www.gwern.net/docs/psychology/2006-porter.pdf">&#8220;Mechanisms of Scent-tracking in Humans&#8221;</a>, Porter et al 2006 (<a href="https://www.gwern.net/images/psychology/2006-porter-humanscenttracking-41593_2007_bfnn1819_moesm2_esm.mp4">video</a>; see also <a href="https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5512720/">&#8220;Poor Human Olfaction is a 19th Century Myth&#8221;</a>, McGann 2017); <a href="https://www.pnas.org/content/109/49/19959.full" title="'Perceptual convergence of multi-component mixtures in olfaction implies an olfactory white', Weiss et al 2012">olfactory white</a>; <em><a href="https://en.wikipedia.org/wiki/K%C5%8Dd%C5%8D">K&#333;d&#333;</a></em>, which unexpectedly appears in <a href="https://www.gwern.net/docs/cs/2005-knuth-taocp-v4-prefascicle4b.pdf#page=22" title="7.2.1.7: History of Combinatorial Generation: Set Partitions">Knuth</a>. <a href="https://threadreaderapp.com/thread/1357071738731814912.html" title="https://twitter.com/add_hawk/status/1357071738731814912">C. Thi Nguyen</a>&#8217;s description of the more bizarre &amp; avant-garde perfumes made me curious enough to nose around &amp; order 39 <a href="https://www.luckyscent.com/">LuckyScent</a> samplers.)</p></li></ul><h2>2.8 Miscellaneous</h2><ul><li><p><a href="https://en.wikipedia.org/wiki/Bog_butter">Bog butter</a></p></li><li><p><a href="https://en.wikipedia.org/wiki/Sarah_Bernhardt">Sarah Bernhardt</a> (Lions. Lots of lions.)</p></li></ul><div><hr></div><ol><li><p>Another thought, looking at <a href="https://bls.gov/news.release/ecec.nr0.htm">&#8216;Employer Costs for Employee Compensation&#8217;</a> (<a href="https://bls.gov/news.release/archives/ecec_031986.pdf">PDF</a>):</p><ol><li><p>&#8220;Moore&#8217;s Law&#8221;: the cost of a transistor halves every ~19 months;</p></li><li><p>&#8220;Anti-Moore&#8217;s Law&#8221;: the cost of a synapse doubles every ~119 years.</p></li></ol></li></ol>]]></content:encoded></item><item><title><![CDATA[March 2021 Gwern.net Newsletter]]></title><description><![CDATA[2 major new site features: 'popins' and recursive Wikipedia popups]]></description><link>https://gwern.substack.com/p/march-2021-gwernnet-newsletter</link><guid isPermaLink="false">https://gwern.substack.com/p/march-2021-gwernnet-newsletter</guid><dc:creator><![CDATA[gwern]]></dc:creator><pubDate>Tue, 06 Apr 2021 15:31:01 GMT</pubDate><enclosure url="https://bucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com/public/images/f10eb6e5-7674-4465-b223-2f254bc50ddb_685x1368.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p><a href="https://www.gwern.net/newsletter/2021/03">March 2021&#8217;s Gwern.net</a> <a href="https://gwern.substack.com">newsletter</a> is now out; previous, <a href="https://www.gwern.net/newsletter/2021/02">February 2021</a> (<a href="https://www.gwern.net/tags/newsletter">archives</a>). This is a summary of the revision-history RSS feed, overlapping with my <a href="https://www.gwern.net/Changelog">Changelog</a> &amp; <a href="https://old.reddit.com/r/gwern/">/r/gwern</a>; brought to you by my donors on <a href="https://www.patreon.com/gwern">Patreon</a>.</p><h1>1 Writings</h1><ul><li><p><strong>Gwern.net</strong>: mobile &#8220;popins&#8221; are finally enabled! (<a href="https://www.gwern.net/images/design/2021-03-28-gwern.net-annotations-mobilepopins-darkmode.png">example</a>); new Wikipedia popups (this 7th implementation enables <em><a href="https://www.gwern.net/images/design/2021-04-01-gwern.net-annotations-popups-recursivewikipediapopups.png">recursive</a></em><a href="https://www.gwern.net/images/design/2021-04-01-gwern.net-annotations-popups-recursivewikipediapopups.png"> WP popups</a>)</p></li></ul><h1>2 Links</h1><h2>2.1 AI</h2><ul><li><p><a href="https://distill.pub/2021/multimodal-neurons/#openai">&#8220;Multimodal Neurons in Artificial Neural Networks&#8221;</a>, Goh et al 2021 (dissecting <a href="https://openai.com/blog/clip/" title="CLIP (Contrastive Language-Image Pre-training): Connecting Text and Images">CLIP</a> concepts, discovering typographical classification &#8216;attacks&#8217;^1^ and a <a href="https://en.wikipedia.org/wiki/Stroop_effect">Stroop effect</a>! Is there anything CLIP can&#8217;t do?)</p></li><li><p><a href="https://arxiv.org/abs/2101.03958#google">&#8220;Evolving Reinforcement Learning Algorithms&#8221;</a>, Co-Reyes et al 2021 (evolving eg <a href="https://en.wikipedia.org/wiki/Temporal_difference_learning">TD-learning</a>)</p></li><li><p><a href="https://www.gwern.net/docs/rl/2021-scanlon.pdf">&#8220;Waymo Simulated Driving Behavior in Reconstructed Fatal Crashes within an Autonomous Vehicle Operating Domain&#8221;</a>, Scanlon et al 2021 (<a href="https://blog.waymo.com/2021/03/replaying-real-life.html">blog</a>; hard negative mining&#8212;self-driving cars, being inhuman, can learn not just from their mistakes but humans&#8217; mistakes too)</p></li><li><p><a href="https://andyljones.com/posts/rl-debugging.html">&#8220;Debugging Reinforcement Learning Systems Without The Agonizing Pain&#8221;</a>, Andy L. Jones; <a href="https://clemenswinter.com/2021/03/24/my-reinforcement-learning-learnings/">&#8220;My Reinforcement Learning Learnings&#8221;</a>, Clemens Winter</p></li></ul><p><a href="https://old.reddit.com/r/mlscaling/">Matters Of Scale</a>:</p><ul><li><p><a href="https://arxiv.org/abs/2103.01988#facebook">&#8220;SEER: Self-supervised Pretraining of Visual Features in the Wild&#8221;</a>, Goyal&nbsp;et&nbsp;al&nbsp;2021 (<a href="https://ai.facebook.com/blog/self-supervised-learning-the-dark-matter-of-intelligence" title="Self-supervised learning: The dark matter of intelligence">blog</a>;  near-SOTA by training 1b-param CNN on 1b unfiltered unlabeled Internet  images&#8212;another reminder that unsupervised learning is really working!); <a href="https://ai.facebook.com/blog/learning-from-videos-to-understand-the-world">&#8220;&#8216;Learning From Videos&#8217; to understand the world&#8221;</a> (rapid FB expansion of self-supervised training to millions of photos/videos/hours-of-speech); <a href="https://arxiv.org/abs/2103.14005">&#8220;Contrasting Contrastive Self-Supervised Representation Learning Models&#8221;</a>,  Kotar et al 2021 (Supervised learning from ImageNet is now obsolete for  transfer learning, and ImageNet just a contaminated validation set)</p></li><li><p><a href="https://arxiv.org/abs/2103.14586#google">&#8220;Understanding Robustness of Transformers for Image Classification&#8221;</a>, Bhojanapalli et al 2021 (<a href="https://openreview.net/forum?id=YicbFdNTTy#google">Vision Transformers</a> gain robustness faster than CNNs as dataset size increases)</p></li><li><p><a href="https://aiindex.stanford.edu/wp-content/uploads/2021/03/2021-AI-Index-Report_Master.pdf#page=41">&#8220;Artificial Intelligence Index Report 2021&#8221;</a>: technical performance and cost (<a href="https://chinai.substack.com/p/chinai-137-year-3-of-chinai" title="ChinAI #137: Year 3 of ChinAI: Reflections on the newsworthiness of machine translation">Ding questions</a>  whether this shows China catching up on AI at all, as we are  incessantly told it is doing; one question to ask: ignoring  fast-following, what, out of the thousands upon thousands of  publications flooding out these days, are the last 3 <em>major novel</em>  AI breakthroughs coming out of all pure-Chinese labs combined which  could be plausibly equated in importance with, say, just OpenAI&#8217;s recent  output of <a href="https://arxiv.org/abs/2005.14165#openai">GPT-3</a>/<a href="https://openai.com/blog/dall-e/">DALL&#183;E</a>/CLIP?)</p></li><li><p><a href="https://openai.com/blog/gpt-3-apps/">OA GPT-3 API: &gt;300 apps, &gt;10k developers, &gt;4.5b words per day</a></p></li><li><p><a href="https://www.pnas.org/content/116/23/11537">&#8220;A mathematical theory of semantic development in deep neural networks&#8221;</a>, Saxe et al 2019 (are jumps in NN capabilities to be expected when scaling? see also <a href="https://arxiv.org/pdf/2103.10948.pdf#page=22" title="The Shape of Learning Curves: a Review: 6. Ill-behaved learning curves: 6.1. Phase transitions">Viering &amp; Loog 2021</a>&#8217;s discussion of phase transitions &amp; averaging of exponentials giving power-laws)</p></li><li><p><a href="https://www.cell.com/cell/fulltext/S0092-8674(21)00239-7">&#8220;An early cell shape transition drives evolutionary expansion of the human forebrain&#8221;</a>, Benito-Kwiecinski et al 2021 (<a href="https://www.theguardian.com/science/2021/mar/24/scientists-discover-why-the-human-brain-is-so-big" title="Scientists discover why the human brain is so big: Molecular switch makes human organ three times larger than great apes&#8217;, study finds">media</a>; a simple switch for the <a href="https://www.gwern.net/docs/psychology/2012-herculanohouzel.pdf" title="'The remarkable, yet not extraordinary, human brain as a scaled-up primate brain and its associated cost', Herculano-Houzel 2012">scaling up</a> of the primate brain)</p><ul><li><p><a href="https://www.statnews.com/2020/09/24/crows-possess-higher-intelligence-long-thought-primarily-human/">&#8220;Crows possess higher intelligence long thought primarily human&#8221;</a> (the remarkable, yet not extraordinary, crow/raven brain as scaled-up <a href="https://en.wikipedia.org/wiki/Bird_intelligence">bird brain</a>)</p></li></ul></li></ul><h2>2.2 Genetics</h2><p>Everything Is Heritable:</p><ul><li><p><a href="https://advances.sciencemag.org/content/7/11/eabd1239">&#8220;GWAS in almost 195,000 individuals identifies 50 previously unidentified genetic loci for eye color&#8221;</a>, Simcoe et al 2021</p></li><li><p><a href="https://www.gwern.net/docs/genetics/heritable/2021-fagereng.pdf">&#8220;Why Do Wealthy Parents Have Wealthy Children?&#8221;</a>,  Fagereng&nbsp;et&nbsp;al&nbsp;2021 (I&#8217;m always impressed just how difficult it is for  rich people to pass on wealth&#8212;&#8220;shirtsleeves to shirtsleeves in 3  generations&#8221; etc)</p></li></ul><p>Evolution:</p><ul><li><p><a href="https://www.biorxiv.org/content/10.1101/2021.02.25.432891v1">&#8220;Nothing in evolution makes sense except in the light of parasites&#8221;</a>, Hickinbotham et al 2021</p></li></ul><p>Engineering:</p><ul><li><p><a href="https://www.sierraclub.org/sierra/2021-2-march-april/feature/demise-and-potential-revival-american-chestnut" title="Before a disastrous blight, the American chestnut was a keystone species in eastern forests. Could genetic engineering help bring it back?">&#8220;The Demise and Potential Revival of the American Chestnut&#8221;</a></p></li></ul><h2>2.3 Statistics/Meta-Science</h2><ul><li><p><a href="https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7831807/">&#8220;Broad cross-national public support for accelerated COVID-19 vaccine trial designs&#8221;</a>,  Broockman et al 2021 (&#8220;we can&#8217;t do challenge trials with volunteers in  February 2020 to save countless thousands of lives because ordinary  people might think it unethical&#8221;&#8212;have you tried <em>asking</em> them, or was that irrelevant because it was just another noble lie?)</p></li><li><p><a href="https://crystalprisonzone.blogspot.com/2021/01/i-tried-to-report-scientific-misconduct.html">&#8220;This is the story of how I found what I believe to be scientific misconduct and what happened when I reported it&#8221;</a>, Joe Hilgard</p></li><li><p><a href="https://www.newyorker.com/culture/cultural-comment/the-revolution-in-classic-tetris">&#8220;The Revolution in Classic Tetris: How a younger generation used the Internet to master the falling blocks&#8221;</a> (how achieving classic Tetris maximum-scores, first done in 2010, became routine thanks to YouTube &amp; <a href="https://www.gwern.net/Bakewell#external-links">online competition for excellence</a>)</p></li></ul><h2>2.4 Politics/Religion</h2><ul><li><p><a href="https://www.gwern.net/docs/sociology/2021-singh.pdf">&#8220;Magic, Explanations, and Evil: The Origins and Design of Witches and Sorcerers&#8221;</a>, Singh 2021 (doubtless even cavemen were all &#8220;Og: sus.&#8221;)</p></li><li><p><a href="https://elifesciences.org/articles/62878">&#8220;Self-blinding citizen science to explore psychedelic microdosing&#8221;</a>, Szigeti et al 2021 (related to <a href="https://www.nature.com/articles/s41598-021-81446-7" title="Positive expectations predict improved mental-health outcomes linked to psychedelic microdosing">Kaertner et al 2021</a>; a self-blinding study, similar to my old self-blinding protocols, confirms that microdosing is just placebo effect, as <a href="https://www.gwern.net/LSD-microdosing">I said in 2012</a>, and I&#8217;m reminded of DNB studies like <a href="https://www.gwern.net/docs/dnb/2016-foroughi.pdf" title="Placebo effects in cognitive training">Foroughi et al 2016</a>)</p></li><li><p><a href="https://en.wikipedia.org/wiki/2019%E2%80%932020_vaping_lung_illness_outbreak">The 2019&#8211;2020 vaping moral panic</a>  over adulterated black-market THC products (depressing to see how  irresponsibly reported &amp; alarmist this was, and how everyone  attempted to frame nicotine for it<a href="file:///tmp/burlVELRZZ.html#fn2">2</a>. Naturally, no one involved has apologized or admitted fault&#8212;after all, their <em><a href="https://en.wikipedia.org/wiki/Noble_lie">intentions</a></em><a href="https://en.wikipedia.org/wiki/Noble_lie"> were good</a>,  &#8220;won&#8217;t someone think of the children&#8221;&#8253; The incompetence and/or  dishonesty here emphasizes how 2020&#8211;2021 was business as usual, and the  only unusual part is that reality happened so fast we saw some of <a href="https://en.wikipedia.org/wiki/Parable_of_the_broken_window">the unseen</a>.)</p></li><li><p><a href="https://en.wikipedia.org/wiki/Mark_Hofmann">Mark Hofmann</a></p></li><li><p><a href="https://en.wikipedia.org/wiki/Alexandra_David-N%C3%A9el">Alexandra David-N&#233;el</a> (one of <em>those</em> 1800&#8211;1900s biographies)</p></li><li><p><a href="https://en.wikipedia.org/wiki/John_Harvey_Kellogg">John Harvey Kellogg</a></p></li></ul><h2>2.5 Psychology/Biology</h2><ul><li><p><a href="https://www.gwern.net/docs/iq/2021-brown.pdf">&#8220;Can You Ever Be Too Smart for Your Own Good? Comparing Linear and Nonlinear Effects of Cognitive Ability on Life Outcomes&#8221;</a>, Brown et al 2021</p></li><li><p><a href="https://psyarxiv.com/g8f9s/">&#8220;The pandemic fallacy: Inaccuracy of social scientists&#8217; and lay judgments about COVID-19&#8217;s societal consequences in America&#8221;</a>, Hutcherson et al 2021 (highly-inaccurate even retrospectively, typically grossly overestimating)</p></li><li><p><a href="https://psyarxiv.com/hc8je/">&#8220;Training Working Memory for Two Years&#8212;No Evidence of Latent Transfer to Intelligence&#8221;</a>, Watrin et al 2021 (fade-out of expectancy/placebo effects)</p></li><li><p><a href="https://www.cell.com/current-biology/fulltext/S0960-9822(21)00059-2">&#8220;Real-time dialogue between experimenters and dreamers during REM sleep&#8221;</a>, Konkoly et al 2021</p></li><li><p><a href="https://www.sciencedirect.com/science/article/pii/S0149763421001068">&#8220;Leroy&#8217;s elusive little people: A systematic review on lilliputian hallucinations&#8221;</a>, Blom 2021 (<a href="https://en.wikipedia.org/wiki/Alice_in_Wonderland_syndrome">Alice in Wonderland syndrome</a>)</p></li><li><p><a href="https://www.theatlantic.com/science/archive/2021/01/orcas-killer-whale-resident-transient/617862/">&#8220;A  Group of Orca Outcasts Is Now Dominating an Entire Sea: &#8216;Transient&#8217;  killer whales that feast on seals and hunt in small packs are thriving  while their widely beloved &#8216;Resident&#8217; siblings are dying out&#8221;</a> (I wonder how the third <a href="https://en.wikipedia.org/wiki/Killer_whale">orca</a> type, <a href="https://en.wikipedia.org/wiki/Killer_whale#Types">&#8216;offshore&#8217;</a>, are doing?)</p></li><li><p><a href="https://www.gwern.net/docs/biology/1995-watanabe.pdf">&#8220;Estimation of the total saliva volume produced per day in 5-year-old children&#8221;</a>, Watanabe et al 1995</p></li></ul><h2>2.6 Technology</h2><ul><li><p><a href="https://www.nngroup.com/articles/aesthetic-usability-effect/">&#8220;The Aesthetic-Usability Effect&#8221;</a>, Moran 2017 (<a href="https://pointersgonewild.com/2019/11/02/they-might-never-tell-you-its-broken/">&#8220;They Might Never Tell You It&#8217;s Broken&#8221;</a> if it&#8217;s pretty enough; see also <a href="https://asktog.com/atc/the-third-user/" title="'The Third User, or, Exactly Why Apple Keeps Doing Foolish Things">&#8220;The Third User&#8221;</a>)</p></li><li><p><a href="https://ciechanow.ski/cameras-and-lenses/">&#8220;Cameras and Lenses&#8221;</a>, Bartosz Ciechanowski (explorable; followup to <a href="https://ciechanow.ski/lights-and-shadows/">&#8220;Lights and Shadows&#8221;</a>)</p></li><li><p><a href="https://arxiv.org/abs/2103.07013">&#8220;Large Batch Simulation for Deep Reinforcement Learning&#8221;</a>, Shacklett et al 2021 (your computer is faster than you think)</p></li><li><p><a href="https://obscuritory.com/essay/incredible-boxes-of-hock-wah-yeo/">&#8220;The incredible boxes of Hock Wah Yeo&#8221;</a> (unusual video game packaging design)</p></li><li><p><a href="https://www.gwern.net/docs/technology/2017-post.pdf">&#8220;Stone Walls That Stay Built: A master waller shares how to dry-lay stone walls that hold their ground for centuries&#8221;</a>, Post 2017</p></li><li><p><a href="https://en.wikipedia.org/wiki/Automated_storage_and_retrieval_system">Automated storage and retrieval system</a></p></li><li><p><a href="https://en.wikipedia.org/wiki/Visual_cryptography">Visual cryptography</a></p></li></ul><h2>2.7 Economics</h2><ul><li><p><a href="https://www.gwern.net/docs/economics/2021-meyer.pdf">&#8220;The Use and Misuse of Income Data and Extreme Poverty in the United States&#8221;</a>, Meyer et al 2021 (measurement error in non-registry surveys of population extremes&#8212;not quite <a href="https://www.gwern.net/GPT-3#lizardman-constant">&#8220;lizardman&#8221;</a> but similar problem)</p></li><li><p><a href="https://www.gwern.net/docs/economics/2006-mackenzie.pdf">&#8220;Is economics performative? Option theory and the construction of derivatives markets&#8221;</a>, Mackenzie 2006 (the mechanics of how the <a href="https://en.wikipedia.org/wiki/Black%E2%80%93Scholes_model">Black-Scholes model</a> changed markets: <a href="https://en.wikipedia.org/wiki/Fischer_Black">Black</a>  ran a service printing &#8220;paper&#8221; estimating optimal prices for all  options which traders could consult &amp; use with simple heuristics to  try to arbitrage the market)</p></li><li><p><a href="https://www.cabinetmagazine.org/issues/52/hodes.php">&#8220;Whitewood under Siege: On the front lines of the pallet wars&#8221;</a> (the competition between the two ecosystems of shipping <a href="https://en.wikipedia.org/wiki/Pallet">pallets</a>: &#8216;whitewood&#8217; &amp; &#8216;blue pallet&#8217;)</p></li><li><p><em><a href="https://en.wikipedia.org/wiki/Mautam">Mautam</a></em></p></li></ul><h2>2.8 Philosophy</h2><ul><li><p><a href="https://www.tandfonline.com/doi/full/10.1080/03949370.2021.1893826">&#8220;Coping with mortality: responses of monkeys and great apes to collapsed, inanimate and dead conspecifics&#8221;</a>, De Marco et al 2021</p></li><li><p><a href="https://en.wikipedia.org/wiki/Braitenberg_vehicle">Braitenberg vehicle</a></p></li></ul><h2>2.9 Fiction</h2><ul><li><p><a href="https://en.wikipedia.org/wiki/Reply_of_the_Zaporozhian_Cossacks">&#8220;Reply of the Zaporozhian Cossacks&#8221;</a></p></li></ul><h2>2.10 Miscellaneous</h2><ul><li><p>America&#8217;s top ace, <a href="https://en.wikipedia.org/wiki/Dick_Bong">Major Dick Bong</a></p></li></ul><h1>3 Film/TV</h1><p><strong>Live-action:</strong></p><ul><li><p><em><a href="https://en.wikipedia.org/wiki/North_by_Northwest">North by Northwest</a></em> (<a href="https://en.wikipedia.org/wiki/Alfred_Hitchcock">Hitchcock</a>  1959; for such a extremely respected movie, it felt oddly formless and  like it was bouncing through genres as more of a comedic B-movie romp  than a serious auteur&#8217;s effort&#8212;since James Bond started in 1953, with a  TV adaptation in 1954, NbN comes off as almost a satire. I mean, really,  monkeying around in Presidential noses!)</p></li></ul><div><hr></div><ol><li><p>While interesting, these are &#8216;attacks&#8217; only in the most generous interpretation possible (since it <a href="https://nitter.cc/NoaNabeshima/status/1368662246885265409" title="The new CLIP adversarial examples are partially from the use-mention distinction. CLIP was trained to predict which caption from a list matches an image. It makes sense that a picture of an apple with a large 'iPod' label would be captioned with 'iPod', not 'Granny Smith'! This can be somewhat fixed with a list of labels that are more explicit about this, at least for a small set of pictures I've tried. After some experimentation, I found this prompt that seems to work with CLIP ViT-B-32: ...">does know</a> <a href="https://www.youtube.com/watch?v=Rk3MBx20z24&amp;t=35s" title="'Apple or iPod? Easy Fix for Adversarial Textual Attacks on OpenAI's CLIP Model!', Yannic Kilcher">the difference</a>),  and the fact that CLIP can read text in images to note the semantic  similarity, is to considerable credit. As the CLIP authors <a href="https://www.gwern.net/images/ai/2021-radford-clip-figure4-promptengineering.png" title="Radford et al 2021 (CLIP): **Figure 4**. _Prompt engineering and ensembling improve zero-shot performance_. Compared to the baseline of using contextless class names, prompt engineering and ensembling boost zero-shot classification performance by almost 5 points on average across 36 datasets. This improvement is similar to the gain from using 4&#215; more compute with the baseline zero-shot method but is &#8220;free&#8221; when amortized over many predictions.">note</a>,  some queries benefit from ensembling, more context than a single word  class name such as prefixing &#8220;A photograph of a&#8221;, and class names can be  highly ambiguous: in ImageNet, the class name &#8220;crane&#8221; could refer to  the bird or construction equipment; and the Oxford-IIIT Pet dataset  labels one class &#8220;boxer&#8221;. (CLIP is still <a href="https://stanislavfort.github.io/2021/03/05/OpenAI_CLIP_stickers_and_adversarial_examples.html" title="Pixels still beat text: Attacking the OpenAI CLIP model with text patches and adversarial pixel perturbations">vulnerable to regular adversarial examples</a>, of course.)&#8617;</p></li><li><p>It <em>couldn&#8217;t&#8217;ve</em> been  nicotine because people had been vaping for a decade and a half without  widespread near-instantaneous lung-related fatalities! It <em>had</em>  to be a new adulterant, and as soon as the first few black-market THC  links surfaced, that meant the problem had to be THC-products-only  because how would the same adulterant simultaneously get into the  different supply chains? And yet, every article, health official, and  activist did their paternalist best to suggest otherwise to pin the  blame on regular vaping, no matter how many tests turned up clean, and  it was the nicotine vaping products which got summarily banned&#8230;. One  must assume many of those laws are still on the books, inasmuch as <a href="https://old.reddit.com/r/electronic_cigarette/comments/lkhewr/usa_vape_mail_ban_newssales_megathread/">the shipping bans keep expanding</a>.&#8617;</p></li></ol>]]></content:encoded></item><item><title><![CDATA[February 2021 Gwern.net Newsletter]]></title><description><![CDATA[links on AI scaling, semaglutide, and ethicist ethics]]></description><link>https://gwern.substack.com/p/february-2021-gwernnet-newsletter</link><guid isPermaLink="false">https://gwern.substack.com/p/february-2021-gwernnet-newsletter</guid><dc:creator><![CDATA[gwern]]></dc:creator><pubDate>Sat, 13 Mar 2021 15:18:44 GMT</pubDate><enclosure url="https://bucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com/public/images/ef890e58-1193-4984-a1a5-8aca6141b85d_1108x691.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>February 2021&#8217;s <a href="https://www.gwern.net/newsletter/2021/02">Gwern.net</a> <a href="https://gwern.substack.com">newsletter</a> is now out; previous, <a href="https://www.gwern.net/newsletter/2021/01">January 2021</a> (<a href="https://www.gwern.net/tags/newsletter">archives</a>). This is a summary of the revision-history RSS feed, overlapping with my <a href="https://www.gwern.net/Changelog">Changelog</a> &amp; <a href="https://old.reddit.com/r/gwern/">/r/gwern</a>; brought to you by my donors on <a href="https://www.patreon.com/gwern">Patreon</a>.</p><h1>1 Writings</h1><ul><li><p><strong>Gwern.net</strong>: popups: can now be moved, stickied, and full-screened (another step towards our ambition of Windows-95-in-the-browser!)</p></li></ul><h1>2 Links</h1><h2>2.1 AI</h2><ul><li><p><a href="https://lilianweng.github.io/lil-log/2021/01/02/controllable-neural-text-generation.html">&#8220;Controllable Neural Text Generation&#8221;</a>, Lilian Weng; <a href="https://ruder.io/recent-advances-lm-fine-tuning/" title="This article provides an overview of recent methods to fine-tune large pre-trained language models">&#8220;Recent Advances in Language Model Fine-tuning&#8221;</a>, Sebastian Ruder (review)</p><ul><li><p><a href="https://arxiv.org/abs/2102.07350">&#8220;Prompt Programming for Large Language Models: Beyond the Few-Shot Paradigm&#8221;</a>,  Reynolds &amp; McDonell 2021 (original 10-shot Fr &#8594; En translation can  be beaten by the better 0-shot prompt: &#8220;French: XYZ / English:&#8230;&#8221;; this  is &#8220;true of most worst-performing prompts&#8230;&#8221;); <a href="https://arxiv.org/abs/2102.09690">&#8220;Calibrate Before Use: Improving Few-Shot Performance of Language Models&#8221;</a>, Zhao et al 2021 (huge boost from calibrating unstable prompts; both demonstrate, <a href="https://www.gwern.net/GPT-3#prompts-as-programming">as always</a>, that &#8220;sampling can prove the presence of knowledge but not the absence.&#8221;)</p></li></ul></li><li><p><a href="https://arxiv.org/abs/2102.07074">&#8220;TransGAN: Two Transformers Can Make One Strong GAN&#8221;</a>, Jiang et al 2021 (Transformer-only GAN: attention is all you need)</p></li><li><p><a href="https://arxiv.org/abs/2102.06203">&#8220;PACT: Proof Artifact Co-training for Theorem Proving with Language Models&#8221;</a>, Han et al 2021 (<a href="https://arxiv.org/abs/2009.03393#openai" title="'GPT-f: Generative Language Modeling for Automated Theorem Proving', Polu &amp; Sutskever 2020">GPT-f</a> for <a href="https://en.wikipedia.org/wiki/Lean_(proof_assistant)">Lean</a>)</p></li><li><p><a href="https://arxiv.org/abs/2010.10648#google">&#8220;Towards End-to-End In-Image Neural Machine Translation&#8221;</a>, Mansimov et al 2020 (sure why not)</p></li><li><p><strong>Brains</strong>:</p><ul><li><p><a href="https://www.quantamagazine.org/artificial-neural-nets-finally-yield-clues-to-how-brains-learn-20210218/" title="The learning algorithm that enables the runaway success of deep neural networks doesn&#8217;t work in biological brains, but researchers are finding alternatives that could">&#8220;Artificial Neural Nets Finally Yield Clues to How Brains Learn&#8221;</a>  (short overview of biologically-plausible backprop: feedback alignment,  target propagation, predictive coding, &amp; attentional feedback; also  of recent interest, <a href="https://arxiv.org/abs/2012.14905" title="'VS-ML: Meta Learning Backpropagation And Improving It', Kirsch &amp; Schmidhuber 2021">VS-ML</a>;  given their increasing success in training while respecting more  biological constraints, the increasing power of backprop-trained ANNs  and the neurological success of ANNs in predicting &amp; imitating brain  signals, it is increasingly clear that brains <em>really do</em> do backprop in some sense)</p></li><li><p><a href="https://www.biorxiv.org/content/10.1101/2021.02.22.432340v1">&#8220;NSD: A massive 7-tesla fMRI dataset to bridge cognitive and computational neuroscience&#8221;</a>,  Jean et al 2021 (&#8220;&#8230;The availability of NSD thus opens the door to using  brain activity to directly guide the optimization of deep neural  networks.&#8221;)</p></li><li><p><a href="https://www.biorxiv.org/content/10.1101/2021.02.02.429430v1">&#8220;Brain2Pix: Fully convolutional naturalistic video reconstruction from brain activity&#8221;</a>, Le et al 2021 (reconstructing <em><a href="https://www.biorxiv.org/content/10.1101/687681v1.full" title="'A large single-participant fMRI dataset for probing brain responses to naturalistic stimuli in space and time', Seeliger et al 2019">Dr.&nbsp;Who</a></em>)</p></li><li><p><a href="https://www.biorxiv.org/content/10.1101/2020.07.01.183384v1.full">&#8220;High-performance brain-to-text communication via imagined handwriting&#8221;</a>, Willett et al 2020</p></li><li><p><a href="https://www.gwern.net/docs/rl/2021-spape.pdf">&#8220;Brain-computer interface for generating personally attractive images&#8221;</a>, Spape et al 2021 (many ways to improve this&#8230;)</p></li></ul></li></ul><p><a href="https://old.reddit.com/r/mlscaling/">Matters Of Scale</a>:</p><ul><li><p><a href="https://arxiv.org/abs/2102.01293#openai">&#8220;Scaling Laws for Transfer&#8221;</a>,  Hernandez et al 2021 (&#8220;We find that pre-training effectively multiplies  the fine-tuning dataset size&#8221;; a shot across the bow of anyone floating  on a proprietary-dataset moat: large models can drop data requirements  by orders of magnitude overnight, even surpassing you)</p></li><li><p><a href="https://arxiv.org/abs/2102.05918#google">&#8220;ALIGN: Scaling Up Visual and Vision-Language Representation Learning With Noisy Text Supervision&#8221;</a>, Jia et al 2021 (see also <a href="https://arxiv.org/abs/2102.08981#google" title="'Conceptual 12M: Pushing Web-Scale Image-Text Pre-Training To Recognize Long-Tail Visual Concepts', Changpinyo et al 2021">CC-12M</a>; <a href="https://openai.com/blog/clip/">CLIP</a>-like w/EfficientNet trained on 1.8 billion images on a TPUv3-1024&#8212;<a href="https://arxiv.org/abs/2102.00529#deepmind" title="'Decoupling the Role of Data, Attention, and Losses in Multimodal Transformers', Hendricks et al 2021">DM</a> argues that fancier cross-modal Transformers are better, nevertheless, <a href="http://www.incompleteideas.net/IncIdeas/BitterLesson.html">&#8216;TPUs go brrr&#8217;</a>. Given DALL&#183;E, CLIP, ALIGN, <a href="https://arxiv.org/abs/2011.10650#openai" title="'VDVAE: Very Deep VAEs Generalize Autoregressive Models and Can Outperform Them on Images', Child 2020">VDVAE</a>, <a href="https://arxiv.org/abs/2102.09532" title="'Clockwork Variational Autoencoders', Saxena et al 2021">CW-VAE</a>, <a href="https://arxiv.org/abs/2102.12037" title="'AIPO: Image Completion via Inference in Deep Generative Models', Harvey et al 2021">AIPO</a>  et al, are GANs already dead, and just don&#8217;t realize it yet? Or at  least soon to be relegated to only DRL-like uses as a final finetuning  phase to sharpen up a self-supervised model?); <a href="https://arxiv.org/abs/2103.06561">&#8220;WenLan: Bridging Vision and Language by Large-Scale Multi-Modal Pre-Training&#8221;</a>, Huo et al 2021</p></li><li><p><a href="https://arxiv.org/abs/2102.12092#openai">&#8220;DALL&#183;E: Zero-Shot Text-to-Image Generation&#8221;</a>, Ramesh et al 2021 (<a href="https://openai.com/blog/dall-e/">original blog</a>); <a href="https://arxiv.org/abs/2103.00823#alibaba">&#8220;M6: A Chinese Multimodal Pretrainer&#8221;</a>,  Lin et al 2021 (Chinese DALL&#183;E: 1.9TB images/0.29TB text for  10b-parameter dense/100b-parameter MoE Transformer; shockingly fast Chinese replication of DALL&#183;E/CLIP)</p></li><li><p><a href="https://arxiv.org/abs/2102.06701#google">&#8220;Explaining Neural Scaling Laws&#8221;</a>, Bahri et al 2021/<a href="https://arxiv.org/abs/2102.04074#deepmind">&#8220;Learning Curve Theory&#8221;</a>, Hutter 2021 (<a href="https://www.lesswrong.com/posts/Yt5wAXMc7D2zLpQqx/an-140-theoretical-models-that-predict-scaling-laws#HIGHLIGHTS">Rohin Shah commentary</a>; more on the manifold hypothesis)</p></li></ul><h2>2.2 Genetics</h2><p>Everything Is Heritable:</p><ul><li><p><a href="https://www.nature.com/articles/s41467-021-21283-4">&#8220;Phenotypic covariance across the entire spectrum of relatedness for 86 billion pairs of individuals&#8221;</a>, Kemper et al 2021</p></li><li><p><a href="https://www.nature.com/articles/s41380-021-01027-y">&#8220;Genetic variation, brain, and intelligence differences&#8221;</a>, Deary et al 2021</p></li><li><p><a href="https://www.biorxiv.org/content/10.1101/2021.02.10.430571v1">&#8220;Pathfinder: A gamified measure to integrate general cognitive ability into the biological, medical and behavioural sciences&#8221;</a>, Malanchini et al 2021 (not the focus, but the IQ PGS is a slight improvement over <a href="https://www.biorxiv.org/content/early/2018/09/17/418210" title="Genomic prediction of cognitive traits in childhood and adolescence">Allegrini et al 2018</a> due to less phenotype measurement error?)</p></li><li><p><a href="https://www.nature.com/articles/s41380-021-01026-z">&#8220;Polygenic  burden has broader impact on health, cognition, and socioeconomic  outcomes than most rare and high-risk copy number variants&#8221;</a>, Saarentaus et al 2021</p></li><li><p><a href="http://www.scielo.br/scielo.php?script=sci_arttext&amp;pid=S1516-44462021005006201" title="'Ditching candidate gene association studies: lessons from psychiatric genetics', Duarte et al 2021">On candidate-genes &amp; COMT</a></p></li></ul><p>Recent Evolution:</p><ul><li><p><a href="https://www.nytimes.com/2021/02/17/science/DNA-mammoth.html">&#8220;Million-Year-Old  DNA Rewrites the Mammoth Family Tree: Genomic data&#8212;the oldest ever  recovered from a fossil&#8212;reveals the origin and evolution of the  Columbian mammoth&#8221;</a></p></li><li><p><a href="https://www.pnas.org/content/118/6/e2016046118">&#8220;Kin selection explains the evolution of cooperation in the gut microbiota&#8221;</a>, Simonet &amp; McNally 2021</p></li></ul><p>Engineering:</p><ul><li><p><a href="https://www.nytimes.com/2021/02/18/science/black-footed-ferret-clone.html" title="&quot;Meet Elizabeth Ann, the First Cloned Black-Footed Ferret: Her birth represents the first cloning of an endangered species native to North America, and may bring needed genetic diversity to the species&quot;">First Black-Footed Ferret cloned</a></p></li></ul><h2>2.3 Statistics/Meta-Science</h2><ul><li><p><a href="https://www.lesswrong.com/posts/9YDk52NPrfq7nqLvd/lessons-from-the-book-of-my-life">&#8220;Lessons from Gerolamo Cardano&#8217;s </a><em><a href="https://www.lesswrong.com/posts/9YDk52NPrfq7nqLvd/lessons-from-the-book-of-my-life">The Book of My Life</a></em><a href="https://www.lesswrong.com/posts/9YDk52NPrfq7nqLvd/lessons-from-the-book-of-my-life">&#8221;</a> (progress studies; see also <a href="https://www.gwern.net/Newton">Newton&#8217;s anthropic argument</a>, <a href="https://www.gwern.net/Bakewell">Bakewell &amp; inventing progress</a>, <em><a href="https://www.gwern.net/Book-reviews#the-autobiography-of-benvenuto-cellini-cellini-1999">The Autobiography of Benvenuto Cellini</a></em>)</p></li><li><p><a href="https://www.wired.com/story/group-house-covid-risk-points/">&#8220;How Many Microcovids Would You Spend on a Burrito?&#8221;</a> (on the <a href="https://www.microcovid.org/">microCOVID Project Calculator</a>)</p></li><li><p><a href="https://www.gwern.net/docs/math/1968-hammersley.pdf">&#8220;On  the enfeeblement of mathematical skills by &#8216;Modern Mathematics&#8217; and by  similar soft intellectual trash in schools and universities&#8221;</a>, Hammersley 1968 (<a href="https://www.gwern.net/docs/math/1973-knuth.pdf" title="The Dangers of Computer--Science Theory">Knuth</a> highlights as also amusing: <a href="https://www.gwern.net/docs/math/1967-austin.pdf">&#8220;A Note on Piffles&#8221;</a>, Smith 1967; <a href="https://www.gwern.net/docs/math/1980-farlow.pdf">&#8220;A rebuke of A. B. Smith&#8217;s paper, &#8216;A Note on Piffles&#8217;&#8221;</a>, Farlow 1980)</p></li><li><p><a href="https://www.gwern.net/docs/statistics/bias/2011-tatum.pdf">&#8220;Artifact and Recording Concepts in EEG&#8221;</a>, Tatum et al 2011 (on the <a href="https://en.wikipedia.org/wiki/Electroencephalography">EEG</a> signals of <a href="https://en.wikipedia.org/wiki/Jell-O">Jell-O</a>, or, the importance of <a href="https://en.wikipedia.org/wiki/Scientific_control#Negative">negative controls</a>)</p></li></ul><h2>2.4 Politics/Religion</h2><ul><li><p><a href="https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0032541">&#8220;The Logic of Fashion Cycles&#8221;</a>, Acerbi et al 2012; <a href="https://royalsocietypublishing.org/doi/10.1098/rsif.2018.0731">&#8220;Fashion and art cycles are driven by counter-dominance signals of elite competition: quantitative evidence from music styles&#8221;</a>, Klimek et al 2019; <a href="https://arxiv.org/abs/1410.8001">&#8220;The hipster effect: When anti-conformists all look the same&#8221;</a>, Touboul 2019; <a href="https://slatestarcodex.com/2014/04/22/right-is-the-new-left">&#8220;Right Is The New Left&#8221;</a>, Scott Alexander (see also <a href="https://www.gwern.net/docs/culture/2010-han.pdf" title="Signaling Status with Luxury Goods: The Role of Brand Prominence">Han et al 2010</a>, <a href="https://www.gwern.net/docs/sociology/1972-downs.pdf" title="Up and down with ecology---the 'issue-attention cycle'">Downs 1972</a>/<a href="https://www.gwern.net/docs/sociology/2015-gupta.pdf" title="On Anthony Downs's 'Up and Down with Ecology: The &quot;Issue-Attention&quot; Cycle'">Gupta &amp; Jenkins-Smith 2015</a>, <a href="https://www.nature.com/articles/s41467-019-09311-w" title="Accelerating dynamics of collective attention">Lorenz-Spreen et al 2019</a>/<a href="https://www.gwern.net/docs/culture/2019-candia.pdf" title="The universal decay of collective memory and attention">Candia et al 2019</a>, <a href="https://www.gwern.net/docs/sociology/1994-loury.pdf" title="Self-Censorship in Public Discourse: A Theory of 'Political Correctness' and Related Phenomena">Loury 1994</a>)</p></li><li><p><a href="https://aeon.co/essays/what-can-we-learn-from-the-lunar-pandemic-that-never-was">&#8220;What can we learn from the lunar pandemic that never was?&#8221;</a>  (NASA&#8217;s lunar quarantine was a sham intended to mollify the public as  they covered up repeated major failures &amp; lab leaks both before  &amp; after&#8212;had there been any dangerous lunar organisms, they would  have escaped easily)</p></li><li><p><a href="https://en.wikipedia.org/wiki/MrBeast">MrBeast</a> (the new aristocracy of <a href="https://meltingasphalt.com/social-status-down-the-rabbit-hole/">prestige</a>? Borrowed plumage, perhaps, but effective&#8230;)</p></li><li><p><a href="https://www.cell.com/current-biology/fulltext/S0960-9822(17)30949-1">&#8220;Russia&#8217;s new Lysenkoism&#8221;</a>, Kolchinsky et al 2017</p></li></ul><h2>2.5 Psychology/Biology</h2><ul><li><p><strong><a href="https://en.wikipedia.org/wiki/Semaglutide">Semaglutide</a></strong>: <a href="https://www.gwern.net/docs/longevity/2021-wilding.pdf">&#8220;Once-Weekly Semaglutide in Adults with Overweight or Obesity&#8221;</a>, Wilding et al 2021; <a href="https://www.gwern.net/docs/longevity/2021-wadden.pdf">&#8220;Effect  of Subcutaneous Semaglutide vs Placebo as an Adjunct to Intensive  Behavioral Therapy on Body Weight in Adults With Overweight or Obesity:  The STEP 3 Randomized Clinical Trial&#8221;</a>, Wadden et al 2021</p><p>A longer-acting version of the insulin/appetite peptide <a href="https://en.wikipedia.org/wiki/Liraglutide">liraglutide</a>, semaglutide greatly reduces weight, fat, blood sugar, cholesterol etc, with an <a href="https://link.springer.com/article/10.1007/s40262-018-0728-4" title="'Safety and pharmacokinetics of single and multiple ascending doses of the novel oral human GLP-1 analogue, oral semaglutide, in healthy subjects and subjects with type 2 diabetes', Granhall et al 2019">upcoming oral version</a>; background: <a href="https://www.gwern.net/docs/longevity/2020-kushner.pdf" title="Semaglutide 2.4 mg for the Treatment of Obesity: Key Elements of the STEP Trials 1 to 5">Kushner et al 2020</a>, <a href="https://www.gwern.net/docs/longevity/2019-aroda.pdf" title="Comparative efficacy, safety, and cardiovascular outcomes with once-weekly subcutaneous semaglutide in the treatment of type 2 diabetes: Insights from the SUSTAIN 1--7 trials">Aroda et al 2019</a>, <a href="https://www.gwern.net/docs/longevity/2019-nauck.pdf" title="Management Of Endocrine Disease: Are all GLP-1 agonists equal in the treatment of type 2 diabetes?">Nauck &amp; Meier 2019</a>, <a href="https://www.gwern.net/docs/longevity/2018-oneil.pdf" title="Efficacy and safety of semaglutide compared with liraglutide and placebo for weight loss in patients with obesity: a randomized, double-blind, placebo and active controlled, dose-ranging, phase 2 trial">O&#8217;Neil et al 2018</a>, <a href="https://www.gwern.net/docs/longevity/2017-blundell.pdf" title="Effects of once-weekly semaglutide on appetite, energy intake, control of eating, food preference and body weight in subjects with obesity">Blundell et al 2017</a>, <a href="https://www.gwern.net/docs/longevity/2016-nauck.pdf" title="A Phase 2, Randomized, Dose-Finding Study of the Novel Once-Weekly Human GLP-1 Analog, Semaglutide, Compared With Placebo and Open-Label Liraglutide in Patients With Type 2 Diabetes">Nauck et al 2016</a>, <a href="https://www.gwern.net/docs/longevity/2015-lau.pdf" title="Discovery of the Once-Weekly Glucagon-Like Peptide-1 (GLP-1) Analogue Semaglutide">Lau et al 2015</a>.</p></li><li><p><a href="https://www.gwern.net/docs/biology/2020-irving.pdf">&#8220;Lessons from the host defences of bats, a unique viral reservoir&#8221;</a>, Irving et al 2021 (<a href="https://en.wikipedia.org/wiki/Bat-borne_virus">bat-borne viruses</a>; previously, <a href="https://get21stnight.com/2020/03/30/why-do-we-keep-getting-diseases-from-bats/">Trevor Klee</a>)</p></li><li><p><a href="https://www.frontiersin.org/articles/10.3389/fcell.2021.628157/full">&#8220;Beneficial  &amp; Detrimental Effects of Reactive Oxygen Species on Lifespan: A  Comprehensive Review of Comparative &amp; Experimental Studies&#8221;</a>,  Shields et al 2021 (antioxidants still aren&#8217;t the fountain of youth, and  may be harmful; animal studies still frequently inconsistent)</p></li><li><p><a href="https://www.nature.com/articles/s41598-021-81446-7">&#8220;Positive expectations predict improved mental-health outcomes linked to psychedelic microdosing&#8221;</a>, Kaertner et al 2021 (placebo)</p></li><li><p><a href="https://www.gwern.net/docs/iq/2021-aggeborn.pdf">&#8220;The Effects of Fluoride in Drinking Water&#8221;</a>, Aggeborn &amp; &#214;hman 2021</p></li><li><p><a href="https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1978350/">&#8220;Sleep  &amp; Sex: What Can Go Wrong? A Review of the Literature on Sleep  Related Disorders and Abnormal Sexual Behaviors &amp; Experiences&#8221;</a>, Schenck et al 2007</p></li></ul><h2>2.6 Technology</h2><ul><li><p><a href="https://www.xprize.org/prizes/elonmusk">New X-Prize: $100m in prizes for Carbon Removal</a></p></li><li><p><a href="https://en.wikipedia.org/wiki/Gauge_block">Wringing gauge blocks</a>  (&#8220;With their precisely-flat metal faces, gauge blocks can be stuck  together non-magnetically via a process calling &#8216;wringing&#8217;, requiring  substantial effort to separate. Scientists are still uncertain exactly  how wringing works.&#8221;)</p></li><li><p><a href="https://en.wikipedia.org/wiki/Armoured_train">Armored train</a></p></li></ul><h2>2.7 Economics</h2><ul><li><p><a href="https://ourworldindata.org/cheap-renewables-growth">&#8220;Why did renewables become so cheap so fast? And what can we do to use this global opportunity for green growth?&#8221;</a>, Max Roser (specifically, why such an extreme <a href="https://en.wikipedia.org/wiki/Experience_curve_effects">experience curve</a>?)</p></li><li><p><a href="https://www.gwern.net/docs/iq/2012-grinblatt.pdf">&#8220;IQ, trading behavior, and performance&#8221;</a>, Grinblatt et al 2012; <a href="https://www.gwern.net/docs/economics/2020-barth.pdf">&#8220;Genetic Endowments and Wealth Inequality&#8221;</a>,  Barth et al 2020 (why, despite notorious setbacks, did Isaac Newton  &amp; LTCM&#8217;s founders die wealthy? Why, in general, are more intelligent  people so much better investors? &#8216;The indifference of the indicator&#8217;:  it&#8217;s not one thing, it&#8217;s everything&#8212;more intelligent people have lower  discount rates, save more for longer &amp; are less risk-averse, more  accurately predict future growth or inflation, are more likely to  participate in +EV opportunities like the stock market, to use low-fee  rather than high-fee (and thus, underperforming) mutual funds, succumb  less to biases like herding as they trade better &amp; at better times,  trade less, and harvest losses more efficiently when trading poorly.)</p></li></ul><h2>2.8 Philosophy</h2><ul><li><p>Are <strong>ethics experts more ethical</strong>? <a href="https://www.gwern.net/docs/philo/2016-schwitzgebel.pdf">&#8220;The Behavior of Ethicists&#8221;</a>, Schwitzgebel &amp; Rust 2016 (most recently: <a href="https://www.gwern.net/docs/philo/2019-schonegger.pdf">&#8220;The moral behavior of ethics professors: A replication-extension in German-speaking countries&#8221;</a>,  Sch&#246;negger et al 2019; given moral licensing &amp; activism, perhaps we  should be surprised we don&#8217;t hear about more ethicists doing things  like posting enemy lists or trying to dox reviewers. &#8220;Woe to you  Pharisees!&#8221;)</p></li><li><p><a href="https://psyarxiv.com/quwgr">&#8220;Meta-analysis on belief in free will manipulations&#8221;</a>, Genschow et al 2021 (another noble lie turns out to be ignoble)</p></li><li><p><a href="https://en.wikipedia.org/wiki/Cooperative_principle">Gricean maxims of communication</a></p></li></ul><h2>2.9 Fiction</h2><ul><li><p><em><a href="https://en.wikipedia.org/wiki/Bunnies_%26_Burrows">Bunnies &amp; Burrows</a></em></p></li></ul><h2>2.10 Miscellaneous</h2><ul><li><p><a href="https://www.gwern.net/docs/history/1995-pop.pdf">&#8220;Caesar Lives&#8221;</a>, <a href="https://en.wikipedia.org/wiki/Iggy_Pop">Iggy Pop</a> 1995 (on <a href="https://en.wikipedia.org/wiki/The_History_of_the_Decline_and_Fall_of_the_Roman_Empire">Gibbon</a>)</p></li><li><p><a href="https://en.wikipedia.org/wiki/Grayanotoxin#Mad_honey_intoxication">Mad honey</a></p></li><li><p><a href="https://en.wikipedia.org/wiki/Imperial_Court_System">Imperial Court System</a></p></li></ul>]]></content:encoded></item><item><title><![CDATA[Jan 2021 Gwern.net Newsletter]]></title><description><![CDATA[January 2021 gwern.net newsletter with links on AI scaling up and down.]]></description><link>https://gwern.substack.com/p/jan-2021-gwernnet-newsletter</link><guid isPermaLink="false">https://gwern.substack.com/p/jan-2021-gwernnet-newsletter</guid><dc:creator><![CDATA[gwern]]></dc:creator><pubDate>Thu, 04 Feb 2021 20:23:01 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!rpC9!,w_256,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F5860611c-2de0-45a7-99a0-dc1b248b0199_1280x1280.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>January 2021&#8217;s <a href="https://www.gwern.net/newsletter/2021/01">Gwern.net</a> <a href="https://gwern.substack.com">newsletter</a> is now out; previous, <a href="https://www.gwern.net/newsletter/2020/12">December 2020</a> (<a href="https://www.gwern.net/tags/newsletter">archives</a>). This is a summary of the revision-history RSS feed, overlapping with my <a href="https://www.gwern.net/Changelog">Changelog</a> &amp; /r/gwern; brought to you by my donors on <a href="https://www.patreon.com/gwern">Patreon</a>.</p><h1>1 Writings</h1><ul><li><p><a href="https://www.gwern.net/Danbooru2020" title="Danbooru2020 is a large-scale anime image database with 4.2m+ images annotated with 130m+ tags; it can be useful for machine learning purposes such as image recognition and generation.">&#8220;Danbooru2020: A Large-Scale Crowdsourced and Tagged Anime Illustration Dataset&#8221;</a></p></li><li><p><a href="https://thisanimedoesnotexist.ai/">This Anime Does Not Exist.ai (TADNE)</a> (<a href="https://www.gwern.net/Faces#extended-stylegan2-danbooru2019-aydao">discussion</a>)</p></li><li><p><strong>Gwern.net</strong>: +return-to-top floating button; <em>popups</em>:  can now be disabled (use the &#8216;gear&#8217; icon); final reimplementation  (dynamic JS now; memoizing the recursive inlining, however clever &amp;  elegant, turns out to have painful edge-cases &amp; still not be  efficient enough&#8212;web browsers <em>really</em> don&#8217;t like loading hundreds of kilobytes of extra HTML)</p></li></ul><h1>2 Links</h1><h2>2.1 AI</h2><p><a href="https://old.reddit.com/r/mlscaling/">Matters Of Scale</a>:</p><ul><li><p><strong>Scaling up</strong>:</p><ul><li><p><a href="https://openai.com/blog/dall-e/">&#8220;DALL&#183;E: Creating Images from Text&#8221;</a>, OpenAI (GPT-3-12.5b generating 1280 tokens &#8594; <a href="https://arxiv.org/abs/1906.00446#deepmind" title="'Generating Diverse High-Fidelity Images with VQ-VAE-2', Razavi et al 2019">VQ-VAE</a> pixels; generates illustration &amp; photos); <a href="https://openai.com/blog/clip/">&#8220;CLIP (Contrastive Language-Image Pre-training): Connecting Text and Images&#8221;</a>, OpenAI (<a href="https://cdn.openai.com/papers/Learning_Transferable_Visual_Models_From_Natural_Language_Supervision.pdf" title="Learning Transferable Visual Models From Natural Language Supervision">Radford et al 2021</a>: zero-shot image understanding via text description&#8212;useful for much more than just ranking DALL&#183;E samples by quality)</p><p>Further <a href="https://www.gwern.net/newsletter/2020/05#blessings-of-scale">blessings of scale</a>: simple <a href="https://arxiv.org/abs/2010.05113" title="'Contrastive Representation Learning: A Framework and Review', Le-Khac et al 2020">contrastive</a> training on <em>n</em>  = 400m leads to remarkable generalization &amp; combinatorial  flexibility of image generation by DALL&#183;E, and CLIP learns to reach  image classification SOTA by zero-shot on many datasets, with more  human-like errors &amp; less degradation out of samples than rivals,  while costing the same to train. OpenAI released their smallest CLIP  model (the &#8220;<a href="https://openreview.net/forum?id=YicbFdNTTy#google" title="Vision Transformer (ViT): An Image is Worth 16&#215;16 Words: Transformers for Image Recognition at Scale">ViT</a>-B/32&#8221;-equivalent)  and people are discovering it seems able to do just about anything  without any further training&#8212;the paper notes that it does everything  from &#8220;fine-grained object classification, geo-localization, action  recognition in videos, and OCR&#8221;, but there&#8217;s so much more, and you can  use it to generate image captions/descriptions, classify your anime  images, pull a specific target image description by gradient ascent or  out of another neural network such as an ImageNet <a href="https://arxiv.org/abs/1809.11096#deepmind" title="'BigGAN: Large Scale GAN Training for High Fidelity Natural Image Synthesis', Brock et al 2018">BigGAN</a>  or TADNE StyleGAN2-ext (or, why not, synthesize images images embodying  abstract concepts like emoji or words like &#8220;nightmare fuel&#8221; or  &#8220;confusion&#8221;!), search your image datasets by embedding, find mislabeled  images (eg by <a href="https://twitter.com/quasimondo/status/1351191660059832320">using &#8220;upside down&#8221; as the prompt</a>)&#8230;  One wonders, like GPT-3, how much better the largest CLIP  (&#8220;L/14-336px&#8221;) is and how many ways of using it (or DALL&#183;E) remain to be  found? And why prediction losses work so well in one place, but then  contrastive elsewhere?</p><p>For perspective: there are newly-minted PhDs going on the job market who got excited about deep learning because of these new <a href="https://arxiv.org/abs/1512.03385" title="'Deep Residual Learning for Image Recognition', He et al 2015">&#8220;resnet&#8221;</a> things; undergrads who applied to grad school because <a href="https://arxiv.org/abs/1810.04805#google" title="'BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding', Devlin et al 2018">BERT</a>  et al were blowing open NLP &amp; extending neural supremacy to natural  language would not yet have passed quals; and it has been only 1  academic semester since <a href="https://arxiv.org/abs/2005.14165#openai" title="'GPT-3: Language Models are Few-Shot Learners', Brown et al 2020">GPT-3</a> was announced. Or to put it quantitatively, for just sequence modeling: it has been 8,478 days since <a href="https://www.gwern.net/docs/ai/1997-hochreiter.pdf" title="'Long Short-Term Memory', Hochreiter &amp; Schmidhuber 1997">LSTM</a> RNNs were published; 3,045 days since <a href="https://papers.nips.cc/paper/2012/file/c399862d3b9d6b76c8436e924a68c45b-Paper.pdf" title="'ImageNet Classification with Deep Convolutional Neural Networks', Krizhevsky et al 2012">AlexNet&#8217;s</a> ImageNet scores were released; 1,880 days since residual networks were published in a paper; 1,330 days since <a href="https://arxiv.org/abs/1706.03762#google" title="Vaswani et al 2017">&#8220;Attention Is All You Need&#8221;</a> hit Arxiv; 844 days since BERT&#8217;s paper was published; 718 days since <a href="https://openai.com/blog/better-language-models/" title="'Better Language Models and Their Implications', OpenAI 2019">GPT-2</a> was announced; 353 days since <a href="https://arxiv.org/abs/2002.05709#google" title="'A Simple Framework for Contrastive Learning of Visual Representations', Chen et al 2020">SimCLR</a>, and 249 days since GPT-3 was; and 27 days since CLIP/DALL&#183;E.^1^ <a href="https://jetpress.org/volume1/moravec.htm" title="'When will computer hardware match the human brain?', Moravec 1998">Spring is coming.</a> (Some still insist we need not worry about &#8220;overpopulation on Mars&#8221; for &gt;18,264 more days&#8230;)</p></li><li><p><a href="https://arxiv.org/abs/2003.10580#google">&#8220;Meta Pseudo Labels&#8221;</a>, Pham et al 2020 (90% on ImageNet by pretraining a meta-learning teacher using JFT-300M on a TPUv3-2048)</p></li><li><p><a href="https://arxiv.org/abs/2101.03961#google">&#8220;Switch Transformers: Scaling to Trillion Parameter Models with Simple and Efficient Sparsity&#8221;</a>, Fedus et al 2021 (1.57t-parameter <a href="https://arxiv.org/abs/2006.16668#google" title="'GShard: Scaling Giant Models with Conditional Computation and Automatic Sharding', Lepikhin et al 2020">GShard</a> followup; the mixture-of-experts approach, while scaling stably, starts showing its limits)</p></li></ul></li><li><p><strong>Scaling down</strong>:</p><ul><li><p><a href="https://arxiv.org/abs/2012.12877#facebook">&#8220;DeiT: Training data-efficient image transformers &amp; distillation through attention&#8221;</a>, Touvron et al 2020 (scaling Transformer classifiers down to ImageNet+1-GPU); <a href="https://arxiv.org/abs/2101.11605#google">&#8220;BoTNet: Bottleneck Transformers for Visual Recognition&#8221;</a>, Srinivas et al 2021/<a href="https://arxiv.org/abs/2101.11986">&#8220;Tokens-to-Token ViT: Training Vision Transformers from Scratch on ImageNet&#8221;</a>, Yuan et al 2021 (hybrids); <a href="https://arxiv.org/abs/2009.04433">&#8220;not-so-BigGAN: Generating High-Fidelity Images on Small Compute with Wavelet-based Super-Resolution&#8221;</a>, Han et al 2020/<a href="https://compvis.github.io/taming-transformers/">&#8220;VQGAN: Taming Transformers for High-Resolution Image Synthesis&#8221;</a>, Esser et al 2020 (training &gt;1024px Transformer GANs on just 2 GPUs)</p><p>Transformer supremacy in image-related tasks continues, and GANs  are becoming increasingly hybridized. Do pure-GANs have a future, now  that VAEs and autoregressive models are making such inroads into both  the highest-quality &amp; lowest-compute sample generation? To take the  GAN/DRL analogy seriously, perhaps they were they ultimately a dead end,  akin to trying to learn everything from rewards, and an adversarial GAN  loss ought to be only <a href="https://www.gwern.net/images/ai/2019-lecun-isscctalk-cake.png">the cherry on the cake</a> of a large unsupervised/semi-supervised generative model.</p></li><li><p><a href="https://arxiv.org/abs/2101.06840#microsoft">&#8220;ZeRO-Offload: Democratizing Billion-Scale Model Training&#8221;</a>, Ren et al 2021 (partial CPU training for 13b-parameter models on 1 V100 GPU, scaling to 128 GPUs)</p></li><li><p><a href="https://arxiv.org/abs/2101.00190">&#8220;Prefix-Tuning: Optimizing Continuous Prompts for Generation&#8221;</a>, Li &amp; Liang 2021 (could the <a href="https://arxiv.org/abs/2009.07118" title="'It's Not Just Size That Matters: Small Language Models Are Also Few-Shot Learners', Schick &amp; Sch&#252;tze et al 2020">PET</a>  &amp; CLIP trick of averaging multiple embeddings to yield much better  performance be reused for GPT-3 prompts to greatly improve prompting?  The fact that the prefix-tuning, by directly optimizing the prompt  embeddings, yields better performance than even single optimized text  prompts, suggests so. The user could provide 3 or 4 similar prompts, and  synthesize them into a single super-prompt to better program GPT-3&#8230;)</p></li><li><p><a href="https://greydanus.github.io/2020/12/01/scaling-down/">&#8220;Scaling down Deep Learning&#8221;</a>,  Greydanus 2020 (cute: parametric simplified-MNIST for rapid iteration  on tiny NNs: experiments in lottery-ticket &amp; meta-learning of  LRs/activations)</p></li><li><p><a href="https://cp4space.hatsya.com/2021/01/08/the-neural-network-of-the-stockfish-chess-engine/">&#8220;The neural network of the Stockfish chess engine&#8221;</a> (very lightweight NN designed for incremental recomputation over changing board states)</p></li></ul></li><li><p><a href="https://arxiv.org/abs/2101.01169">&#8220;Transformers in Vision: A Survey&#8221;</a>, Khan et al 2021</p></li><li><p><a href="https://openai.com/blog/organizational-update/">OpenAI departures</a>:  Dario Amodei, Sam McCandlish, Tom Brown, Tom Henighan, Chris Olah, Jack  Clark, Ben Mann, Paul Christiano et al leave&#8212;most for an unspecified  new entity (<a href="https://steveblank.com/2009/12/21/the-elves-leave-middle-earth-%E2%80%93-soda%E2%80%99s-are-no-longer-free/">&#8220;the elves leave Middle Earth&#8221;</a>?)</p></li></ul><p>And the rest:</p><ul><li><p><a href="https://www.lesswrong.com/posts/pTYDdcag9pTzFQ7vw/2020-ai-alignment-literature-review-and-charity-comparison">&#8220;2020 AI Alignment Literature Review and Charity Comparison&#8221;</a>, Larks</p></li><li><p><a href="https://arxiv.org/abs/2009.01719#deepmind">&#8220;Grounded Language Learning Fast and Slow&#8221;</a>, Hill et al 2020</p></li><li><p><a href="https://arxiv.org/abs/2006.03654#microsoft">&#8220;DeBERTa: Decoding-enhanced BERT with Disentangled Attention&#8221;</a>, He et al 2020 (<a href="https://arxiv.org/abs/1905.00537" title="'SuperGLUE: A Stickier Benchmark for General-Purpose Language Understanding Systems', Wang et al 2019">SuperGLUE</a> falls)</p></li><li><p><a href="https://arxiv.org/abs/2012.13349#deepmind">&#8220;Solving Mixed Integer Programs Using Neural Networks&#8221;</a>, Nair et al 2020</p></li><li><p><a href="https://arxiv.org/abs/2012.14271">&#8220;Towards Fully Automated Manga Translation&#8221;</a>, Hinami et al 2020</p></li><li><p><a href="https://arxiv.org/abs/2101.08001#baidu">&#8220;UPDeT: Universal Multi-agent Reinforcement Learning via Policy Decoupling with Transformers&#8221;</a>, Hu et al 2021</p></li><li><p><a href="https://arxiv.org/abs/2012.07975#bair">&#8220;FERM: A Framework for Efficient Robotic Manipulation&#8221;</a>, Zhan et al 2021 (contrastive semi-supervised learning + data augmentation for sample-efficiency)</p></li><li><p><a href="https://arxiv.org/abs/2101.04702#google">&#8220;XMC-GAN: Cross-Modal Contrastive Learning for Text-to-Image Generation&#8221;</a>, Zhang et al 2021</p></li></ul><h2>2.2 Genetics</h2><p>Everything Is Heritable:</p><ul><li><p><a href="https://www.nature.com/articles/s41539-020-00079-z">&#8220;Nurture might be nature: cautionary tales and proposed solutions&#8221;</a>, Hart et al 2021</p></li><li><p><a href="https://www.sciencedirect.com/science/article/pii/S1755296620300624">&#8220;A genetic perspective on the association between exercise and mental health in the era of genome-wide association studies&#8221;</a>, de Geus 2020; <a href="https://www.gwern.net/docs/genetics/correlation/2020-schnurr.pdf">&#8220;Evidence for shared genetics between physical activity, sedentary behaviour and adiposity-related traits&#8221;</a>, Schnurr et al 2020</p></li><li><p><a href="https://www.medrxiv.org/content/10.1101/2020.12.11.20245035v1">&#8220;Antidepressant Response in Major Depressive Disorder: A Genome-wide Association Study&#8221;</a>, Pain et al 2020</p></li><li><p><a href="https://www.biorxiv.org/content/10.1101/2020.04.03.024554v3">&#8220;Genome wide analysis of gene dosage in 24,092 individuals shows that 10,000 genes modulate cognitive ability&#8221;</a>, Huguet et al 2020 (yep, still polygenic)</p></li><li><p><a href="https://www.biorxiv.org/content/10.1101/2020.04.20.051631v2">&#8220;GWAS of three molecular traits highlights core genes and pathways alongside a highly polygenic background&#8221;</a>, Sinnott-Armstrong et al 2021</p></li><li><p><a href="https://www.biorxiv.org/content/10.1101/2021.01.08.425895v1">&#8220;Genome-scale sequencing and analysis of human, wolf and bison DNA from 25,000 year-old sediment&#8221;</a>, Gelabert et al 2021 (incredible this is possible)</p></li><li><p><a href="https://www.medrxiv.org/content/10.1101/2021.01.25.21249961v1">&#8220;Disentangling  sex differences in the shared genetic architecture of PTSD, traumatic  experiences, and social support with body size and composition&#8221;</a>, Carvalho et al 2021 (<a href="https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6684375/" title="'Distinguishing genetic correlation from causation across 52 diseases and complex traits', O'Connor &amp; Price 2018">LCV</a>)</p></li></ul><p>Recent Evolution:</p><ul><li><p><a href="https://www.gwern.net/docs/genetics/selection/2021-pereira.pdf">&#8220;African genetic diversity and adaptation inform a precision medicine agenda&#8221;</a>, Pereira et al 2021; <a href="https://www.nature.com/articles/s41576-020-00305-9">&#8220;The influence of evolutionary history on human health and disease&#8221;</a>, Benton et al 2021; <a href="https://www.biorxiv.org/content/10.1101/2021.01.26.428314v1">&#8220;Local adaptation and archaic introgression shape global diversity at human structural variant loci&#8221;</a>, Yan et al 2021</p></li><li><p><a href="https://www.biorxiv.org/content/10.1101/2020.07.19.211078v2">&#8220;Genome scans of dog behavior implicate a gene network underlying psychopathology in mammals, including humans&#8221;</a>, Zapata et al 2021</p></li><li><p><a href="https://ideas.repec.org/p/uea/ueaeco/2021-02.html">&#8220;Natural Selection in Contemporary Humans is Linked to Income and Substitution Effects&#8221;</a>, Hugh-Jones &amp; Abdellaoui 2021</p></li><li><p><a href="https://elifesciences.org/articles/61644">&#8220;The diversity and function of sourdough starter microbiomes&#8221;</a>, Landis et al 2021 (crowdsourced sourdough show little trace of geographic origins?)</p></li></ul><p>Engineering:</p><ul><li><p><a href="https://www.gwern.net/docs/genetics/editing/2021-koblan.pdf">&#8220;In vivo base editing rescues Hutchinson-Gilford progeria syndrome in mice&#8221;</a>, Koblan et al 2021</p></li><li><p><a href="https://arxiv.org/abs/2101.05870">&#8220;From Genotype to Phenotype: polygenic prediction of complex human traits&#8221;</a>, Raben et al 2021</p></li></ul><h2>2.3 Statistics/Meta-Science/Math</h2><ul><li><p><a href="https://arxiv.org/abs/2101.07884">&#8220;The Quantum Field Theory on Which the Everyday World Supervenes&#8221;</a>,  Carroll 2021 (&#8220;&#8230;we have reason to be confident that the laws of physics  underlying the phenomena of everyday life are completely known&#8221; because  all unknown particles/fields are constrained to being extremely  rare/weak, eg by <a href="https://www.gwern.net/docs/science/2009-adelberger.pdf" title="Torsion balance experiments: A low--energy frontier of particle physics">Adelberger et al 2009</a>)</p></li><li><p><a href="https://www.biorxiv.org/content/10.1101/2020.12.10.419424v1">&#8220;How accurate are citations of frequently cited papers in biomedical literature?&#8221;</a>, Pavlovic et al 2020 (includes original author&#8217;s evaluation of whether a citation of their work is correct)</p></li><li><p><a href="https://arxiv.org/abs/1605.08448">&#8220;Energy-Efficient Algorithms&#8221;</a>, Demaine et al 2016 (<a href="https://en.wikipedia.org/wiki/Reversible_computing">reversible computing</a> asymptotics: constant-factor <a href="https://en.wikipedia.org/wiki/Stack_(abstract_data_type)">stacks</a>/<a href="https://en.wikipedia.org/wiki/Dynamic_array">arrays</a>, &#119978;(log <em>n</em>) time/energy <a href="https://en.wikipedia.org/wiki/AVL_tree">AVL trees</a>, &#119978;(<em>n</em>) space <a href="https://en.wikipedia.org/wiki/Comparison_sort">sorts</a>, &amp; various &#119978;(Vertex+Edge) time/space/energy <a href="https://en.wikipedia.org/wiki/Graph_traversal">graph searches</a>)</p></li><li><p><a href="https://www.gwern.net/docs/statistics/decision/2006-smith.pdf">&#8220;The Optimizer&#8217;s Curse: Skepticism and Postdecision Surprise in Decision Analysis&#8221;</a>,  Smith &amp; Winkler 2006 (regression to the mean is everywhere; another  example of why Bayes &amp; decision theory are two great flavors that  go great together)</p></li></ul><h2>2.4 Politics/Religion</h2><ul><li><p><a href="https://papers.ssrn.com/sol3/papers.cfm?abstract_id=3650704">&#8220;The Mechanisms of Cult Production: An Overview&#8221;</a>, Xavier Marquez 2020 (see previously his <a href="https://www.gwern.net/newsletter/2019/02#abandoned-footnotes">blog roundup</a>)</p></li><li><p><a href="https://www.gwern.net/docs/sociology/1999-dawson.pdf">&#8220;When Prophecy Fails and Faith Persists: A Theoretical Overview&#8221;</a>, Dawson 1999</p></li><li><p><a href="https://www.overcomingbias.com/2020/11/why-we-fight-over-fiction.html">&#8220;Why We Fight Over Fiction&#8221;</a>, Robin Hanson</p></li><li><p><a href="https://en.wikipedia.org/wiki/All-Woman_Supreme_Court">The All-Woman Supreme Court</a></p></li></ul><h2>2.5 Psychology/Biology</h2><ul><li><p><a href="https://astralcodexten.substack.com/p/still-alive">&#8220;Still Alive&#8221;</a>,  Scott Alexander (announcement of SSC return as Substack newsletter  &#8216;Astral Codex Ten&#8217; &amp; launching a low-cost psychiatry clinic &#8216;Lorien  Psychiatry&#8217;)</p></li><li><p><a href="https://www.biorxiv.org/content/10.1101/2020.09.08.287276v1">&#8220;The Temporal Dynamics of Opportunity Costs: A Normative Account of Cognitive Fatigue and Boredom&#8221;</a>, Agrawal et al 2020</p></li><li><p><a href="https://onlinelibrary.wiley.com/doi/full/10.1002/hbm.25109">&#8220;A unified framework for association and prediction from vertex-wise grey-matter structure&#8221;</a>, Couvy-Duchesne et al 2020 (more <a href="https://www.gwern.net/Questions#variance-components">morphometricity</a>)</p></li><li><p><strong>Common phenomena</strong>: <a href="https://www.gwern.net/docs/psychology/2018-fassnidge.pdf">&#8220;Sounds from seeing silent motion: Who hears them, and what looks loudest?&#8221;</a>, Fassnidge &amp; Freeman 2018 (on &#8216;visual ear&#8217;; previously: <a href="https://www.sciencedirect.com/science/article/pii/S0960982208007343" title="The sound of change: visually-induced auditory synaesthesia">Saenz &amp; Koch 2008</a>, <a href="https://www.gwern.net/docs/psychology/2017-fassnidge.pdf" title="A deafening flash! Visual interference of auditory signal detection">Fassnidge et al 2017</a>)</p></li><li><p><a href="https://online.ucpress.edu/collabra/article/7/1/18731/115925/Predicting-Mental-Health-From-Followed-Accounts-on">&#8220;Predicting Mental Health From Followed Accounts on Twitter&#8221;</a>, Costelli et al 2021 (<a href="https://en.wikipedia.org/wiki/Preregistration_(science)#Registered_reports">Registered Report</a>: who you choose to follow says a lot about you&#8212;<a href="https://www.gwern.net/Everything">everything is correlated</a>)</p></li><li><p><a href="https://www.biorxiv.org/content/10.1101/2021.01.08.425841v1">&#8220;No evidence for general intelligence in a fish&#8221;</a>, Aellen et al 2021</p></li><li><p><a href="https://en.wikipedia.org/wiki/Delirium_tremens">Delirium tremens</a></p></li><li><p><a href="https://www.gwern.net/docs/biology/2021-asnicar.pdf">&#8220;Microbiome connections with host metabolism and habitual diet from 1,098 deeply phenotyped individuals&#8221;</a>, Asnicar et al 2021</p></li><li><p><a href="https://www.biorxiv.org/content/10.1101/2021.01.18.426733v1">&#8220;Universal DNA methylation age across mammalian tissues&#8221;</a>, Lu et al 2021; <a href="https://onlinelibrary.wiley.com/doi/full/10.1111/acel.13296">&#8220;Whole-body senescent cell clearance alleviates age-related brain inflammation and cognitive impairment in mice&#8221;</a>, Ogrodnik et al 2021</p></li><li><p><a href="https://arxiv.org/abs/2101.12037">&#8220;BENDR: using transformers and a contrastive self-supervised learning task to learn from massive amounts of EEG data&#8221;</a>, Kostas et al 2021 (towards brain imitation learning)</p></li><li><p><a href="https://en.wikipedia.org/wiki/Parker%E2%80%93Hulme_murder_case">Parker-Hulme murder case</a>; <a href="https://en.wikipedia.org/wiki/Slender_Man_stabbing">The Slender Man stabbing</a> (<a href="https://en.wikipedia.org/wiki/Paracosm">paracosms?</a>)</p></li><li><p><strong>Correction</strong>: <a href="https://news.ycombinator.com/item?id=25426329">Programming competition skills do not inversely correlate with job performance</a> after all</p></li></ul><h2>2.6 Technology</h2><ul><li><p><a href="https://en.wikipedia.org/wiki/Natural_nuclear_fission_reactor">Natural nuclear fission reactors (Oklo)</a></p></li><li><p><a href="https://www.gwern.net/docs/history/2007-keeley.pdf">&#8220;Baffles and Bastions: The Universal Features of Fortifications&#8221;</a>, Keeley et al 2007</p></li><li><p><a href="https://en.wikipedia.org/wiki/Corrupted_Blood_incident">The Corrupted Blood incident</a></p></li><li><p><em><a href="https://www.gwern.net/docs/design/2020-jeremytankard-footnote-36-redisturbed.pdf">Footnote</a></em><a href="https://www.gwern.net/docs/design/2020-jeremytankard-footnote-36-redisturbed.pdf"> 36: &#8220;Redisturbed&#8221;</a>: a <em>unicase</em> font experiment</p></li></ul><h2>2.7 Economics</h2><ul><li><p><a href="https://www.nytimes.com/2021/01/18/climate/carbon-removal-technology.html">&#8220;Businesses Aim to Pull Greenhouse Gases From the Air. It&#8217;s a Gamble&#8221;</a></p></li><li><p><a href="https://freakonomics.com/podcast/advertising-part-1/">"Does Advertising</a> <a href="https://freakonomics.com/podcast/advertising-part-2/">Actually Work?"</a>  (what could be more obvious than &#8220;advertising works&#8221;, and trivial to  confirm with correlational data? Yet, the tedious saying &#8220;correlation &#8800;  causation&#8221; stubbornly insists on being true); <a href="https://www.gwern.net/docs/traffic/2020-aral.pdf">&#8220;Digital Paywall Design: Implications for Content Demand and Subscriptions&#8221;</a>, Aral &amp; Dhillon 2020 (NYT nag-paywall caused &#8722;9.9% reading; in line with <a href="https://www.gwern.net/Ads">all the other results</a>)</p></li><li><p><a href="https://www.gwern.net/docs/economics/2010-schuh.pdf">&#8220;Who Gains and Who Loses from Credit Card Payments? Theory and Calibrations&#8221;</a>, Schuh et al 2010 (a compelling case for getting a rewards credit card if you&#8217;re a <a href="https://en.wikipedia.org/wiki/Debit_card">debit card</a> user&#8212;why subsidize them so much?)</p></li><li><p><a href="https://www.gwern.net/docs/economics/2019-quinn.pdf">&#8220;Squeezing the bears: cornering risk and limits on arbitrage during the &#8216;British bicycle mania&#8217;, 1896&#8211;1898&#8221;</a>, Quinn 2019</p></li></ul><h2>2.8 Fiction</h2><ul><li><p><a href="https://www.tabletmag.com/sections/arts-letters/articles/on-venus-have-we-got-a-rabbi" title="A long-lost space age satire about what it means to be a Jew from one of science fiction&#8217;s greatest humorists">&#8220;On Venus, Have We Got a Rabbi!&#8221;</a>, <a href="https://en.wikipedia.org/wiki/William_Tenn">William Tenn</a> 2016</p></li><li><p><a href="https://www.gwern.net/docs/history/2013-dubin-fabliauxtranslations-stmartinsfourwishes.pdf">&#8220;St Martin&#8217;s Four Wishes&#8221;</a>, Anonymous <a href="https://en.wikipedia.org/wiki/Fabliau">medieval poet</a> (trans. Dubin 2013)</p></li></ul><h2>2.9 Miscellaneous</h2><ul><li><p>The <a href="https://en.wikipedia.org/wiki/Anglo-Japanese_style">Anglo-Japanese style</a></p></li><li><p><a href="https://en.wikipedia.org/wiki/Stalag_Luft_III">Stalag Luft III</a></p></li><li><p><a href="https://en.wikipedia.org/wiki/Graham_Island_(Mediterranean_Sea)">Ferdinandea</a></p></li></ul><div><hr></div><ol><li><p>But it&#8217;ll still be too many days &#8217;till we say we&#8217;re sorry.</p></li></ol>]]></content:encoded></item><item><title><![CDATA[December newsletter]]></title><description><![CDATA[December 2020 gwern.net newsletter with links on AI and technology; major new site feature: fully-generalized recursive popups.]]></description><link>https://gwern.substack.com/p/december-newsletter</link><guid isPermaLink="false">https://gwern.substack.com/p/december-newsletter</guid><dc:creator><![CDATA[gwern]]></dc:creator><pubDate>Sun, 10 Jan 2021 17:31:06 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!rpC9!,w_256,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F5860611c-2de0-45a7-99a0-dc1b248b0199_1280x1280.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Please see the canonical version of the December 2020 newsletter on <a href="https://www.gwern.net/newsletter/2020/12">Gwern.net</a>.</p>]]></content:encoded></item><item><title><![CDATA[November newsletter]]></title><description><![CDATA[November 2020 gwern.net newsletter with links on DL and genomics scaling, dark mode rewrite, 1 essay, and 1 opera review ('The Ring' cycle).]]></description><link>https://gwern.substack.com/p/november-newsletter</link><guid isPermaLink="false">https://gwern.substack.com/p/november-newsletter</guid><dc:creator><![CDATA[gwern]]></dc:creator><pubDate>Fri, 04 Dec 2020 00:40:13 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!rpC9!,w_256,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F5860611c-2de0-45a7-99a0-dc1b248b0199_1280x1280.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p><strong>Please see the <a href="https://www.gwern.net/newsletter/2020/11">canonical November 2020 gwern.net</a></strong><a href="https://www.gwern.net/newsletter/2020/11"> newsletter link.</a></p>]]></content:encoded></item><item><title><![CDATA[October 2020 news]]></title><description><![CDATA[October 2020 gwern.net newsletter with links on AI scaling, Euclid; further site reorganization & improvement.]]></description><link>https://gwern.substack.com/p/october-2020-news</link><guid isPermaLink="false">https://gwern.substack.com/p/october-2020-news</guid><dc:creator><![CDATA[gwern]]></dc:creator><pubDate>Sun, 01 Nov 2020 21:42:39 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!rpC9!,w_256,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F5860611c-2de0-45a7-99a0-dc1b248b0199_1280x1280.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Please see the <a href="https://www.gwern.net/newsletter/2020/10">canonical web October 2020</a> edition of <a href="https://gwern.substack.com">the <code>gwern.net</code> newsletter</a>.</p>]]></content:encoded></item><item><title><![CDATA[September 2020 News]]></title><description><![CDATA[September 2020 gwern.net newsletter with links on DRL and AI scaling, psychiatric disorders; no reviews.]]></description><link>https://gwern.substack.com/p/september-2020-news</link><guid isPermaLink="false">https://gwern.substack.com/p/september-2020-news</guid><dc:creator><![CDATA[gwern]]></dc:creator><pubDate>Mon, 26 Oct 2020 13:40:32 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!rpC9!,w_256,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F5860611c-2de0-45a7-99a0-dc1b248b0199_1280x1280.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Please see the&nbsp;<a href="https://www.gwern.net/newsletter/2020/09">canonical web September 2020</a>&nbsp;edition of&nbsp;<a href="https://gwern.substack.com">the&nbsp;<code>gwern.net</code>&nbsp;newsletter</a>.</p>]]></content:encoded></item><item><title><![CDATA[August 2020 gwern.net newsletter]]></title><description><![CDATA[with an essay on sidenotes; links on human competence, efficient-computing/hardware-overhangs; no reviews.]]></description><link>https://gwern.substack.com/p/august-2020-gwernnet-newsletter</link><guid isPermaLink="false">https://gwern.substack.com/p/august-2020-gwernnet-newsletter</guid><dc:creator><![CDATA[gwern]]></dc:creator><pubDate>Tue, 01 Sep 2020 23:18:28 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!rpC9!,w_256,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F5860611c-2de0-45a7-99a0-dc1b248b0199_1280x1280.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Please see the <a href="https://www.gwern.net/newsletter/2020/08">canonical on-site August 2020</a> edition of <a href="https://gwern.substack.com">the <code>gwern.net</code> newsletter</a>.</p>]]></content:encoded></item><item><title><![CDATA[July 2020 gwern.net newsletter]]></title><description><![CDATA[Links on the Uighurs, authoritarianism, negative emissions, AI overhang; 1 movie & 2 anime reviews]]></description><link>https://gwern.substack.com/p/july-2020-gwernnet-newsletter</link><guid isPermaLink="false">https://gwern.substack.com/p/july-2020-gwernnet-newsletter</guid><dc:creator><![CDATA[gwern]]></dc:creator><pubDate>Thu, 20 Aug 2020 20:09:50 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!rpC9!,w_256,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F5860611c-2de0-45a7-99a0-dc1b248b0199_1280x1280.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Please see the <a href="https://www.gwern.net/newsletter/2020/07">on-gwern.net canonical July 2020</a> edition of <a href="https://gwern.substack.com">the <code>gwern.net</code> newsletter</a>.</p>]]></content:encoded></item><item><title><![CDATA[June gwern.net newsletter]]></title><description><![CDATA[June 2020 gwern.net newsletter with 3 new pages/essays, and links on CRISPR, population screening, AI scaling, politics, and technological unemployment.]]></description><link>https://gwern.substack.com/p/june-gwernnet-newsletter</link><guid isPermaLink="false">https://gwern.substack.com/p/june-gwernnet-newsletter</guid><dc:creator><![CDATA[gwern]]></dc:creator><pubDate>Thu, 02 Jul 2020 14:34:53 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!rpC9!,w_256,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F5860611c-2de0-45a7-99a0-dc1b248b0199_1280x1280.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>See the canonical <a href="https://www.gwern.net/newsletter/2020/06">on-gwern.net June 2020</a> edition of <a href="https://gwern.substack.com">the <code>gwern.net</code> newsletter</a>.</p>]]></content:encoded></item><item><title><![CDATA[May Gwern.net Newsletter]]></title><description><![CDATA[Link compilation newsletter with anime GAN updates, links on AI scaling, discussion of GPT-3, and 1 book review.]]></description><link>https://gwern.substack.com/p/may-gwernnet-newsletter</link><guid isPermaLink="false">https://gwern.substack.com/p/may-gwernnet-newsletter</guid><dc:creator><![CDATA[gwern]]></dc:creator><pubDate>Sat, 06 Jun 2020 18:44:15 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!rpC9!,w_256,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F5860611c-2de0-45a7-99a0-dc1b248b0199_1280x1280.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Due to extensive editing &amp; expansion of the GPT-3 discussion, please see the canonical newsletter version at <a href="https://www.gwern.net/newsletter/2020/05">https://www.gwern.net/newsletter/2020/05</a></p>]]></content:encoded></item><item><title><![CDATA[April 2020 gwern.net newsletter]]></title><description><![CDATA[This is the April 2020 edition of the gwern.net newsletter; previous, March 2020 (archives).]]></description><link>https://gwern.substack.com/p/april-2020-gwern-net-newsletter</link><guid isPermaLink="false">https://gwern.substack.com/p/april-2020-gwern-net-newsletter</guid><pubDate>Fri, 01 May 2020 00:00:00 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!rpC9!,w_256,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F5860611c-2de0-45a7-99a0-dc1b248b0199_1280x1280.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>This is the <a href="https://www.gwern.net/newsletter/2020/04">April 2020</a> edition of <a href="https://tinyletter.com/gwern">the <code>gwern.net</code> newsletter</a>; previous, <a href="https://www.gwern.net/newsletter/2020/03">March 2020</a> (<a href="https://www.gwern.net/tags/newsletter">archives</a>). Please see the canonical gwern.net version.</p>]]></content:encoded></item></channel></rss>

Raw headers

{
  "age": "719",
  "cache-control": "no-cache",
  "cf-cache-status": "HIT",
  "cf-ray": "9dc5c48c25a95751-CMH",
  "connection": "keep-alive",
  "content-type": "application/xml; charset=utf-8",
  "date": "Sat, 14 Mar 2026 19:43:31 GMT",
  "etag": "W/\"226a1-zxWjOCmRMWBJkH3/QyhK3exuzZ0\"",
  "server": "cloudflare",
  "set-cookie": "__cf_bm=AlvNDAWNQ7OjfX9IhgK110JbCrFF5mQLfn26qM_iV1I-1773517411-1.0.1.1-eXDnoIsUHkZy9R_Pb2sBgodmdTg.NQpnvFzp60DTju5VfcM77VPu4jgyGUctuE5crSr0mTr6041hx9BYKgvFsaxZ846D4qtyJEjRrowd97A; path=/; expires=Sat, 14-Mar-26 20:13:31 GMT; domain=.substack.com; HttpOnly; Secure; SameSite=None",
  "strict-transport-security": "max-age=31536000; includeSubDomains; preload",
  "transfer-encoding": "chunked",
  "vary": "Accept-Encoding",
  "x-cluster": "substack",
  "x-deploy": "538433cd52",
  "x-powered-by": "Express",
  "x-served-by": "Substack",
  "x-service": "web",
  "x-sub": "gwern"
}

Parsed with @rowanmanning/feed-parser

{
  "meta": {
    "type": "rss",
    "version": "2.0"
  },
  "language": "en",
  "title": "Gwern.net Newsletter",
  "description": "Latest gwern.net updates, interesting links, and reviews",
  "copyright": "Gwern Branwen",
  "url": "https://gwern.substack.com",
  "self": "https://gwern.substack.com/feed",
  "published": null,
  "updated": "2026-03-14T18:37:46.000Z",
  "generator": {
    "label": "Substack",
    "version": null,
    "url": null
  },
  "image": {
    "title": "Gwern.net Newsletter",
    "url": "https://substackcdn.com/image/fetch/$s_!rpC9!,w_256,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F5860611c-2de0-45a7-99a0-dc1b248b0199_1280x1280.png"
  },
  "authors": [
    {
      "name": "gwern",
      "email": null,
      "url": null
    },
    {
      "name": "gwern",
      "email": null,
      "url": null
    }
  ],
  "categories": [],
  "items": [
    {
      "id": "https://gwern.substack.com/p/may-2021-gwernnet-newsletter",
      "title": "May 2021 Gwern.net Newsletter",
      "description": "links on AI hardware, diffusion models, optogenetics, brain scanning.",
      "url": "https://gwern.substack.com/p/may-2021-gwernnet-newsletter",
      "published": "2021-06-11T14:16:22.000Z",
      "updated": "2021-06-11T14:16:22.000Z",
      "content": "<p>May 2021’s <a href=\"https://www.gwern.net/newsletter/2021/05\">Gwern.net</a> <a href=\"https://gwern.substack.com\">newsletter</a> is now out; previous, <a href=\"https://www.gwern.net/newsletter/2021/04\">April 2021</a> (<a href=\"https://www.gwern.net/tags/newsletter\">archives</a>). This is a collation of links and summary of major changes, overlapping with my <a href=\"https://www.gwern.net/Changelog\">Changelog</a>; brought to you by my donors on <a href=\"https://www.patreon.com/gwern\">Patreon</a>.</p><p>Note: I will be in Denver 12–13 June 2021 for a conference.</p><h1>1 Writings</h1><ul><li><p><strong>Proposal</strong>: <a href=\"https://www.gwern.net/CYOA\">“Choose Your Own Adventure AI Dungeon”</a>; <a href=\"https://www.gwern.net/GPT-2-preference-learning#decision-transformers-preference-learning-as-simple-as-possible\">“Decision Transformers: Preference Learning As Simple As Possible”</a></p></li></ul><h1>2 Links</h1><h2>2.1 AI</h2><p><a href=\"https://old.reddit.com/r/mlscaling/\">Matters Of Scale</a>:</p><ul><li><p><strong>Hardware</strong>:</p><ul><li><p><a href=\"https://arxiv.org/abs/2104.06272#deepmind\">“Podracer architectures for scalable Reinforcement Learning”</a>, Hessel et al 2021 (highly-efficient TPU pod use: eg solving Pong in <1min at 43 million FPS on a TPUv3-2048); <a href=\"https://venturebeat.com/2021/05/18/google-details-new-ai-accelerator-chips/\">“Google details new TPUv4 AI accelerator chips”</a> (2.7× TPUv3 chips; up to TPUv4-4096 pods, yielding >1 ExaFLOPS; public access later in 2021)x</p></li><li><p><a href=\"https://arxiv.org/abs/2104.07857#microsoft\">“ZeRO-Infinity: Breaking the GPU Memory Wall for Extreme Scale Deep Learning”</a>, Rajbhandari et al 2021 (~1 trillion parameters per 16 GPUs/DGX-2-node, scaling to >512 GPUs ~40% efficiency)</p></li><li><p><a href=\"https://arxiv.org/abs/2105.04663#google\">“GSPMD: General and Scalable Parallelization for ML Computation Graphs”</a>, Xu et al 2021 (Google upgrade of <a href=\"https://arxiv.org/abs/1811.06965#google\" title=\"'GPipe: Easy Scaling with Micro-Batch Pipeline Parallelism', Huang et al 2018\">GPipe</a>/<a href=\"https://arxiv.org/abs/2006.16668#google\" title=\"'GShard: Scaling Giant Models with Conditional Computation and Automatic Sharding', Lepikhin et al 2020\">GShard</a> arch to match <a href=\"https://www.microsoft.com/en-us/research/blog/deepspeed-extreme-scale-model-training-for-everyone/\" title=\"DeepSpeed: Extreme-scale model training for everyone\">MS DeepSpeed</a>: “…50%–62% compute utilization on 128–2048 Cloud TPUv3 cores for models with up to one trillion parameters”)</p></li><li><p><a href=\"https://arxiv.org/abs/2104.05158#facebook\">“DLRM: High-performance, Distributed Training of Large-scale Deep Learning Recommendation Models”</a>,  Mudigere et al 2021 (ZionEX software/hardware platform for training  extremely large embeddings—while embeddings aren’t ‘real’ parameters  & things like <a href=\"https://arxiv.org/abs/2004.08366#google\" title=\"'DynamicEmbedding: Extending TensorFlow for Colossal-Scale Applications', Zeng et al 2020\">DynamicEmbedding</a> will never learn tricks like GPT-3 no matter how big, they present similar challenges); <a href=\"https://arxiv.org/abs/2105.08820#facebook\">“RecPipe: Co-designing Models and Hardware to Jointly Optimize Recommendation Quality and Performance”</a>, Gupta et al 2021</p></li></ul></li><li><p><a href=\"https://arxiv.org/abs/2105.12196#deepmind\">“From Motor Control to Team Play in Simulated Humanoid Football”</a>,  Liu et al 2021 (curriculum training of a single NN from raw humanoid  control to coordinated team-wide soccer strategy; neat to compare with <a href=\"https://arxiv.org/abs/2009.01719#deepmind\" title=\"Grounded Language Learning Fast and Slow\">Hill et al 2020</a> in terms of agent abilities)</p></li><li><p><a href=\"https://arxiv.org/abs/2105.11084#facebook\">“Wav2vec-U: Unsupervised Speech Recognition”</a>, Baevski et al 2021</p></li><li><p><a href=\"https://www.anthropic.com/news/announcement\">“Anthropic” public-benefit-corp/startup launched</a> (founded by the Amodeis; $124M investment for scaling “reliable and steerable AI systems”); <a href=\"https://www.cooperativeai.com/foundation\">“Cooperative AI Foundation” (CAIF)</a> launched</p></li><li><p><a href=\"https://arxiv.org/abs/2105.01601#google\">“MLP-Mixer: An all-MLP Architecture for Vision”</a>, Tolstikhin et al 2021 (another <a href=\"https://www.gwern.net/notes/FC\">FC paper</a> removing even more inductive biases—ponies are all you need: “Mixer <a href=\"http://www.incompleteideas.net/IncIdeas/BitterLesson.html\">improves more rapidly with data</a>  than ResNets, or even ViT, and the gap between large scale Mixer and  ViT models shrinks until the performance is matched on the entire  dataset…” The Bitter Lesson truly is the single bitterest lesson in ML,  isn’t it? The more people tweet about how MLP-Mixer is overhyped because  is −X% worse than the ultra-hand-optimized baseline or requires Y× more  FLOPS, the more they demonstrate <em>precisely why</em> this sort of  research is so important! And showing, incidentally, that Transformers  are still under-researched if such a fundamental fact could have been  missed for so long.)</p></li><li><p><a href=\"https://arxiv.org/abs/2104.08945#facebook\">“Data-Efficient Language-Supervised Zero-Shot Learning with Self-Distillation”</a>, Cheng et al 2021 (<a href=\"https://openai.com/blog/clip/\">CLIP</a>-like performance scaled down to <em>n</em> = 3m using <a href=\"https://arxiv.org/abs/1503.02531#google\" title=\"'Distilling the knowledge in a neural network', Hinton et al 2015\">soft labels</a> generated by a <a href=\"https://www.gwern.net/docs/ai/2018-sharma.pdf#google\" title=\"Conceptual Captions: A Cleaned, Hypernymed, Image Alt-text Dataset For Automatic Image Captioning\">Conceptual Captions</a>-pretrained model)</p></li><li><p><a href=\"https://arxiv.org/abs/2104.07636#google\">“SR3: Image Super-Resolution via Iterative Refinement”</a>, Saharia et al 2021; <a href=\"https://arxiv.org/abs/2105.05233#openai\">“Diffusion Models Beat GANs on Image Synthesis”</a>, Dhariwal & Nichol 2021 (<a href=\"https://arxiv.org/abs/2006.11239\" title=\"'Denoising Diffusion Probabilistic Models', Ho et al 2020\">DDPM</a>^<a href=\"file:///tmp/burlbC6ws6.html#fn1\">1</a>^ finally surpass <a href=\"https://arxiv.org/abs/1809.11096#deepmind\" title=\"'BigGAN: Large Scale GAN Training for High Fidelity Natural Image Synthesis', Brock et al 2018\">BigGAN-deep</a> on ImageNet 512px images at similar compute-cost, as <a href=\"https://arxiv.org/abs/2102.09672\" title=\"'Improved Denoising Diffusion Probabilistic Models', Nichol & Dhariwal 2021\">expected from their</a><a href=\"https://www.gwern.net/notes/Scaling\">good scaling</a>); <a href=\"https://cascaded-diffusion.github.io/\">“Cascaded Diffusion Models for High Fidelity Image Generation”</a>, Ho et al 2021</p></li><li><p><a href=\"https://arxiv.org/abs/2009.01325#openai\">“Learning to summarize from human feedback”</a>, Stiennon et al 2020</p></li><li><p><a href=\"https://www.gwern.net/docs/ai/2021-power.pdf#openai\">“Grokking: Generalization Beyond Overfitting On Small Algorithmic Data Sets”</a>, Power et al 2021 (<a href=\"https://old.reddit.com/r/mlscaling/comments/n78584/grokking_generalization_beyond_overfitting_on/\">discussion</a>;  new scaling effect, ‘grokking’: sudden perfect generalization emerging  many epochs after training-set overfitting on algorithmic tasks when  training in <a href=\"https://www.gwern.net/docs/ai/2021-power-poster.png#openai\">flat shallow loss landscapes</a>); <a href=\"https://arxiv.org/abs/2106.05237#google\">“Knowledge distillation: A good teacher is patient and consistent”</a>, Beyer et al 2021 (training much smaller models merely requires hundreds of thousands or millions of epochs)</p></li><li><p><a href=\"https://arxiv.org/abs/2104.14830#google\">“Scaling End-to-End Models for Large-Scale Multilingual ASR”</a>, Li et al 2021</p></li><li><p><a href=\"https://arxiv.org/abs/2103.10948\">“The Shape of Learning Curves: a Review”</a>, Viering & Loog 2021</p></li><li><p><a href=\"https://www.sciencedirect.com/science/article/pii/S0004370221000862#deepmind\">“Reward is enough”</a>,  Silver et al 2021 (a DRL manifesto: reward losses enough at scale of  compute/parameters/tasks to induce all important capabilities like  memory/exploration/generalization/imitation/reasoning)</p></li><li><p><strong>Scaling Down</strong>: <a href=\"https://github.com/nshepperd/lazy\"><code>lazy</code>: a tool for running processes in idle time</a> (how to train on a GPU without destroying your GUI’s usability! <code>lazy</code>  pauses runs briefly while you interact with your desktop, letting you  do months-long runs without going crazy or resorting to Colab etc. This  enables hobbyists to go after previously-infeasible model sizes);  EleutherAI releases <a href=\"https://arankomatsuzaki.wordpress.com/2021/06/04/gpt-j/\">a 6b-parameter GPT-3 model, GPT-J</a> (are you still using GPT-2/GPT-Neo? upgrade!); <a href=\"https://arxiv.org/abs/2105.12723\">“Aggregating Nested Transformers”</a>, Zhang et al 2021/<a href=\"https://arxiv.org/abs/2105.14217\">“Less is More: Pay Less Attention in Vision Transformers”</a>, Pan et al 2021</p></li></ul><ul><li><p><a href=\"https://arxiv.org/abs/2105.13626#google\">“ByT5: Towards a token-free future with pre-trained byte-to-byte models”</a>, Xue et al 2021 (character models—not just feasible but desirable; we’ll get our rhyming & pun-making language models yet!)</p></li><li><p><a href=\"https://www.gwern.net/docs/ai/2008-golle.pdf\">“Machine Learning Attacks Against the Asirra CAPTCHA”</a>,  Golle 2008 (a look back on a decade of CV progress: months of work for  80% cat vs dog with SVM ensembles in 2008; 5min in Fast.ai for 99%  accuracy in 2018; for even more perspective, <a href=\"https://www.gwern.net/docs/ai/2012-ciresan.pdf\" title=\"Deep big multilayer perceptrons for digit recognition\">Cireşan 2012</a>)</p></li></ul><h2>2.2 Genetics</h2><p>Everything Is Heritable:</p><ul><li><p><a href=\"https://www.gwern.net/docs/genetics/heritable/2021-levey.pdf\">“Bi-ancestral  depression GWAS in the Million Veteran Program and meta-analysis in  >1.2 million individuals highlight new therapeutic directions”</a>, Levey et al 2021</p></li><li><p><a href=\"https://www.biorxiv.org/content/10.1101/2021.05.26.445798v1\">“The complete sequence of a human genome”</a>, Nurk et al 2021 (<a href=\"https://www.nature.com/articles/d41586-021-01506-w\" title=\"A complete human genome sequence is close: how scientists filled in the gaps; researchers added 200 million DNA base pairs and 115 protein-coding genes — but they’ve yet to entirely sequence the Y chromosome\">media</a>)</p></li><li><p><a href=\"https://www.gwern.net/docs/iq/2021-vonstumm.pdf\">“Using DNA to predict intelligence”</a>, von Stumm & Plomin 2021 (review)</p></li><li><p><a href=\"https://www.biorxiv.org/content/10.1101/848366v2.full\">“Long  read sequencing of 3,622 Icelanders provides insight into the role of  structural variants in human diseases and other traits”</a>, Beyter et al 2021</p></li><li><p><a href=\"https://www.gwern.net/docs/genetics/heritable/2021-owen.pdf\">“Rapid Sequencing–Based Diagnosis of Thiamine Metabolism Dysfunction Syndrome”</a> (sequence everyone!)</p></li></ul><p>Engineering:</p><ul><li><p><a href=\"https://www.gwern.net/docs/genetics/editing/2021-robertson.pdf\">“Sense codon reassignment enables viral resistance and encoded polymer synthesis”</a>,  Robertson et al 2021 (“ultra-safe cells”: synthesizing an entire E.  coli genome with swapped codons for complete viral immunity)</p></li><li><p><a href=\"https://www.gwern.net/docs/genetics/editing/2021-musunuru.pdf\">“In vivo CRISPR base editing of </a><em><a href=\"https://www.gwern.net/docs/genetics/editing/2021-musunuru.pdf\">PCSK9</a></em><a href=\"https://www.gwern.net/docs/genetics/editing/2021-musunuru.pdf\"> durably lowers cholesterol in primates”</a>, Musunuru et al 2021</p></li><li><p><strong><a href=\"https://en.wikipedia.org/wiki/Optogenetics\">Optogenetics</a></strong>: <a href=\"https://www.gwern.net/docs/genetics/editing/2021-sahel.pdf\">“Partial recovery of visual function in a blind patient after optogenetic therapy”</a>, Sahel et al 2021 (<a href=\"https://www.statnews.com/2021/05/24/scientists-use-optogenetics-for-first-time-to-help-blind-patient-see/\" title=\"With engineered proteins, scientists use optogenetics for the first time to help a blind patient see again\">media</a>); <a href=\"https://www.gwern.net/docs/biology/2021-yang.pdf\">“Wireless multilateral devices for optogenetic studies of individual and social behaviors”</a>, Yang et al 2021 (<a href=\"https://www.nytimes.com/2021/05/25/science/optogenetics-brain-social-behavior.html\" title=\"Scientists Drove Mice to Bond by Zapping Their Brains With Light: The study, a tour de force in bioengineering, comes after 2 decades of research on brain-to-brain synchrony in people\">media</a>)</p></li><li><p><a href=\"https://www.pnas.org/content/118/18/e2018181118\">“Retron Library Recombineering (RLR): High-throughput functional variant screens via in vivo production of single-stranded DNA”</a>, Schubert et al 2021</p></li><li><p><a href=\"https://www.nature.com/articles/d41586-021-01186-6\">“First genetically modified Oxitec mosquitoes released in the United States”</a></p></li><li><p><a href=\"https://www.biorxiv.org/content/10.1101/2021.05.28.446207v1\">“Genomic characterization of world’s longest selection experiment in mouse reveals the complexity of polygenic traits”</a>, Palma-Vera et al 2021</p></li><li><p><a href=\"https://www.sciencedirect.com/science/article/pii/S0734975021000628\">“Surrogate broodstock to enhance biotechnology research and applications in aquaculture”</a>, Jin et al 2021</p></li><li><p><a href=\"https://www.biorxiv.org/content/10.1101/2020.11.05.370478v3\">“Utility of polygenic embryo screening for disease depends on the selection strategy”</a>, Lencz et al 2021</p></li><li><p><a href=\"https://www.nature.com/articles/d41586-021-01423-y\">“Limit  on lab-grown human embryos dropped by stem-cell body: The International  Society for Stem Cell Research relaxed the famous 14-day rule on  culturing human embryos in its latest research guidelines”</a></p></li><li><p><a href=\"https://www.nytimes.com/2007/08/28/science/28crop.html\">“Useful Mutants, Bred With Radiation”</a> (on <a href=\"https://en.wikipedia.org/wiki/Atomic_gardening\">atomic gardening</a>)</p></li></ul><h2>2.3 Statistics/Meta-Science</h2><ul><li><p><a href=\"https://blog.dshr.org/2021/03/correlated-failures.html\">“Correlated Failures” in HDDs/SSDs</a></p></li><li><p><a href=\"https://www.gwern.net/docs/statistics/bias/1992-rogers.pdf\">“How a Publicity Blitz Created The Myth of Subliminal Advertising”</a>, Rogers 1992 (the famous movie-theater/popcorn-sales experiment never happened)</p></li></ul><h2>2.4 Politics/Religion</h2><ul><li><p><a href=\"https://www.gwern.net/docs/sociology/2021-costello.pdf\">“Clarifying the Structure and Nature of Left-Wing Authoritarianism (LWA)”</a>, Costello et al 2021</p></li><li><p><a href=\"https://fantasticanachronism.com/2021/04/28/book-review-the-decline-and-fall-of-the-roman-empire/\">“Book Review: </a><em><a href=\"https://fantasticanachronism.com/2021/04/28/book-review-the-decline-and-fall-of-the-roman-empire/\">The Decline and Fall of the Roman Empire</a></em><a href=\"https://fantasticanachronism.com/2021/04/28/book-review-the-decline-and-fall-of-the-roman-empire/\">”</a> (<a href=\"https://fantasticanachronism.com/2021/05/03/highlights-from-the-decline-and-fall-of-the-roman-empire/\">excerpts</a>)</p></li></ul><h2>2.5 Psychology/Biology</h2><ul><li><p><a href=\"https://www.biorxiv.org/content/10.1101/2021.05.29.446289v1\">“A connectomic study of a petascale fragment of human cerebral cortex”</a>,  Shapson-Coe et al 2021 (“…This “digital tissue” is a ~660,000× scale up  of an earlier saturated reconstruction from a small region of mouse  cortex, published in 2015 (<a href=\"https://www.sciencedirect.com/science/article/pii/S0092867415008247\" title=\"Saturated Reconstruction of a Volume of Neocortex\">Kasthuri et al 2015</a>).  Although this scaleup was difficult, it was not hundreds of thousands  of times more difficult and took about the same amount of time as the  previous data set (~4 years)…The rapid improvements over the past few  years…argues that analyzing volumes that are even 3 orders of magnitude  larger, such as an exascale whole mouse brain connectome, will likely be  in reach within a decade.\" See also <a href=\"https://xcorr.net/2021/04/27/accelerating-progress-in-brain-recording-tech/\">“Accelerating progress in brain recording tech”</a>.)</p></li><li><p><a href=\"https://www.nature.com/articles/s41467-021-22199-9\">“Neuroimaging evidence for a network sampling theory of individual differences in human intelligence test performance”</a>, Soreq et al 2021; <a href=\"https://elifesciences.org/articles/64058\">“The neural basis of intelligence in fine-grained cortical topographies”</a>, Feilong et al 2021; <a href=\"https://link.springer.com/article/10.1007/s00429-020-02113-7\">“Predicting intelligence from brain gray matter volume”</a>, Hilger et al 2020 (towards the mechanistic reification of <em>g</em>: per <a href=\"https://www.gwern.net/docs/iq/2007-jung.pdf\" title=\"'The Parieto-Frontal Integration Theory (P-FIT) of intelligence: Converging neuroimaging evidence', Jung & Haier 2007\">P-FIT</a>,  it is global efficiency/total cognitive resources which can be spent on  learning & orchestrating specialized capabilities); if we consider  recent human brain imaging studies, cross-species comparisons, and deep  learning as converging, I would offer as a speculation the following:</p><p>The Master Synthesis: intelligence  is execution of small simplicity-weighted programs, best discovered by  search over smooth loss landscapes like that of <a href=\"https://www.gwern.net/notes/Sparsity\">highly-overparameterized</a> differentiable networks containing lottery-ticket subnetworks which are ensembled/averaged over, <a href=\"https://www.gwern.net/Backstop#deep-bayes\">approaching Bayes-optimal</a>  reasoning in the limit (as nearest-neighbors-like high dimensional  interpolation / memorization gives way to algorithmic generalization /  interpolation on a more abstract level); this can be implemented by  large numbers of similar neurons trained using any of the many  approximations to backprop; human intelligence’s <em>g</em> is real but  is the overall ‘pool’ of neural resources which derives from overall  body integrity because the number of neurons, their density, their  myelination, resistance to damage and infection etc, is causally  downstream of all body and developmental systems, creating a huge  mutational target; the brain regions specialize and differentiate, and  their orchestration (or lack thereof) contributes to observed  performance on tasks tapping into multiple specialized regions; as tasks  rely on fewer regions or approach intrinsic ceiling, <em>g</em> ceases to be observable and task-specific influences matter most.</p></li><li><p><a href=\"https://www.nature.com/articles/s41591-021-01336-3\">“MDMA-assisted therapy for severe PTSD: a randomized, double-blind, placebo-controlled phase 3 study”</a>, Mitchell et al 2021 (<em>d</em> = 0.9 over therapy); <a href=\"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7643046/\">“Effects of Psilocybin-Assisted Therapy on Major Depressive Disorder”</a>, Davis et al 2021</p></li><li><p><a href=\"https://www.newyorker.com/magazine/2021/04/05/why-animals-dont-get-lost\">“Why  Animals Don’t Get Lost: Birds do it. Bees do it. Learning about the  astounding navigational feats of wild creatures can teach us a lot about  where we’re going”</a> (on spectacular but still mysterious feats of <a href=\"https://en.wikipedia.org/wiki/Animal_navigation\">animal navigation</a>)</p></li><li><p><a href=\"https://defector.com/in-the-future-of-collecting-is-anyone-having-fun/\">“In The Future Of Collecting, Is Anyone Having Fun?”</a> (on <a href=\"https://en.wikipedia.org/wiki/Bobblehead\">Bobblehead</a> collectors)</p></li><li><p><a href=\"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8114859/\">“Linking Brain Biology to Intellectual Endowment: A Review on the Associations of Human Intelligence With Neuroimaging Data”</a>, Dizaji et al 2021</p></li><li><p><a href=\"https://www.gwern.net/docs/economics/2012-oboyle.pdf\">“The Best And The Rest: Revisiting The Norm Of Normality Of Individual Performance”</a>, O’Boyle & Aguinis 2012 (performance is <a href=\"https://www.gwern.net/notes/Pipeline\">log-normal</a>)</p></li><li><p><a href=\"https://www.biorxiv.org/content/10.1101/2020.11.21.392720v1\">“A conserved strategy for inducing appendage regeneration”</a>, Abrams et al 2021 (slight regrowth of damaged mouse limbs by drinking sugar+amino-acid-supplemented water)</p></li><li><p><a href=\"https://astralcodexten.substack.com/p/know-your-amphetamines\">“Know Your Amphetamines”</a>, Scott Alexander</p></li><li><p><a href=\"https://www.nature.com/articles/srep02617\">“Feeling Small: Exploring the Tactile Perception Limits [of Humans]”</a>, Skedung et al 2013</p></li><li><p><a href=\"http://grantland.com/features/diplomacy-the-board-game-of-the-alpha-nerds/\" title=\"One writer enters international competition to play the world-conquering game that redefines what it means to be a geek (and a person)\">“The Board Game of the Alpha Nerds: Before </a><em><a href=\"http://grantland.com/features/diplomacy-the-board-game-of-the-alpha-nerds/\" title=\"One writer enters international competition to play the world-conquering game that redefines what it means to be a geek (and a person)\">Risk</a></em><a href=\"http://grantland.com/features/diplomacy-the-board-game-of-the-alpha-nerds/\" title=\"One writer enters international competition to play the world-conquering game that redefines what it means to be a geek (and a person)\">, before </a><em><a href=\"http://grantland.com/features/diplomacy-the-board-game-of-the-alpha-nerds/\" title=\"One writer enters international competition to play the world-conquering game that redefines what it means to be a geek (and a person)\">Dungeons & Dragons</a></em><a href=\"http://grantland.com/features/diplomacy-the-board-game-of-the-alpha-nerds/\" title=\"One writer enters international competition to play the world-conquering game that redefines what it means to be a geek (and a person)\">, before </a><em><a href=\"http://grantland.com/features/diplomacy-the-board-game-of-the-alpha-nerds/\" title=\"One writer enters international competition to play the world-conquering game that redefines what it means to be a geek (and a person)\">Magic: The Gathering</a></em><a href=\"http://grantland.com/features/diplomacy-the-board-game-of-the-alpha-nerds/\" title=\"One writer enters international competition to play the world-conquering game that redefines what it means to be a geek (and a person)\">, there was </a><em><a href=\"http://grantland.com/features/diplomacy-the-board-game-of-the-alpha-nerds/\" title=\"One writer enters international competition to play the world-conquering game that redefines what it means to be a geek (and a person)\">Diplomacy</a></em><a href=\"http://grantland.com/features/diplomacy-the-board-game-of-the-alpha-nerds/\" title=\"One writer enters international competition to play the world-conquering game that redefines what it means to be a geek (and a person)\">”</a> (<a href=\"https://en.wikipedia.org/wiki/Diplomacy_(game)\">WP</a>;  “I still don’t know whom I should have trusted, if anyone. All I know  is that I felt stupid, stressed out, humiliated, and sad.”)</p></li></ul><h2>2.6 Technology</h2><ul><li><p><a href=\"https://rootsofprogress.org/nuclear-physics\">“I walk the (beta-stability) line: How counting neutrons explains nuclear waste”</a></p></li><li><p><a href=\"https://alexdanco.com/2020/10/08/making-is-show-business-now/\">“Making is Show Business now”</a>, Alex Danco</p></li><li><p><a href=\"https://www.thenewatlantis.com/publications/shop-class-as-soulcraft\">“Shop Class as Soulcraft: The case for the manual trades”</a>, Crawford 2006</p></li><li><p><a href=\"https://www.kickstarter.com/projects/upperstory/spintronics-build-mechanical-circuits\">“Spintronics: Build mechanical circuits”</a>, Kickstarter (followup to <a href=\"https://en.wikipedia.org/wiki/Turing_Tumble\">Turing Tumble</a>)</p></li></ul><h2>2.7 Economics</h2><ul><li><p><a href=\"https://www.gwern.net/docs/sociology/2020-dellavigna.pdf\">“RCTs to Scale: Comprehensive Evidence from 2 Nudge Units”</a>, DellaVigna & Linos 2020 (nudge effects overestimated by 6.2× due to publication bias)</p></li><li><p><a href=\"https://academic.oup.com/ije/advance-article/doi/10.1093/ije/dyab099/6288123\">“No  causal associations between childhood family income and subsequent  psychiatric disorders, substance misuse and violent crime arrests: a  nationwide Finnish study of >650,000 individuals and their siblings”</a>, Sariaslan et al 2021; <a href=\"https://academic.oup.com/ije/advance-article/doi/10.1093/ije/dyab066/6274255\">“Parental income and mental disorders in children and adolescents: prospective register-based study”</a>, Kinge et al 2021</p></li><li><p><a href=\"https://mattlakeman.org/2021/06/01/everything-you-might-want-to-know-about-whaling/\">“Everything You Might Want to Know about Whaling”</a>, Matt Lakeman</p></li><li><p><a href=\"https://www.gwern.net/notes/Nash\">Exploding Nash Equilibrium For Trustless Trade</a></p></li></ul><h2>2.8 Fiction</h2><ul><li><p><a href=\"https://www.lightspeedmagazine.com/fiction/love-is-the-plan-the-plan-is-death/\">“Love Is the Plan the Plan Is Death”</a>, <a href=\"https://en.wikipedia.org/wiki/James_Tiptree_Jr.\">James Tiptree, Jr.</a> (<a href=\"https://en.wikipedia.org/wiki/Love_Is_the_Plan_the_Plan_Is_Death\">WP</a>)</p></li></ul><h2>2.9 Miscellaneous</h2><ul><li><p><a href=\"https://www.newyorker.com/news/dispatch/the-strange-story-of-dagobert-the-ducktales-bandit\">“The Strange Story of Dagobert, the </a><em><a href=\"https://www.newyorker.com/news/dispatch/the-strange-story-of-dagobert-the-ducktales-bandit\">Duck Tales</a></em><a href=\"https://www.newyorker.com/news/dispatch/the-strange-story-of-dagobert-the-ducktales-bandit\">  Bandit: In the ’90s, a frustrated artist in Berlin went on a crime  spree—building bombs, extorting high-end stores, and styling his persona  after Scrooge McDuck. He soon became a German folk hero.”</a> (<a href=\"https://en.wikipedia.org/wiki/Arno_Funke\">WP</a>; another reminder for Americans—odd as it may seem, Donald Duck is <em>extremely</em> popular overseas; see also the unknown-in-the-USA character <a href=\"https://en.wikipedia.org/wiki/John_D._Rockerduck\">John D. Rockerduck</a> or <a href=\"https://slate.com/culture/2009/12/sweden-s-bizarre-tradition-of-watching-donald-duck-kalle-anka-cartoons-on-christmas-eve.html\">beloved Scandinavian tradition</a><em><a href=\"https://en.wikipedia.org/wiki/From_All_of_Us_to_All_of_You\">From All of Us to All of You</a></em> who 2020 airing set an all-time record of >4.5m viewers)</p></li><li><p><a href=\"https://en.wikipedia.org/wiki/Atmospheric_optics#List\">List of atmospheric optical phenomena</a> (How many would you recognize from a distance or plane? How many have you even heard of?)</p></li><li><p><a href=\"https://en.wikipedia.org/wiki/Franz_Nopcsa_von_Fels%C5%91-Szilv%C3%A1s\">Baron Franz Nopcsa von Felső-Szilvás</a> (noted geologist, paleontologist, anthropologist, homosexual, & skyjacker)</p></li><li><p><a href=\"https://en.wikipedia.org/wiki/Krishnacore\">Krishnacore</a></p></li></ul><div><hr></div><ol><li><p>What is a diffusion model like DDPM? To try to explain it as simply as possible <a href=\"https://yang-song.github.io/blog/2021/score/\" title=\"Generative Modeling by Estimating Gradients of the Data Distribution\">without the math</a>:</p><p>DDPM is a neural net which is trained to fix noise in an image: it  takes a noisy image and ‘sharpens’ it to produce a new image. You train  it by adding dirt to a normal image, and teaching it to turn the dirty  version into the original. As it gets better, it learns what the images  all tend to look like so it can ‘see through’ ever more noise, to turn  smudged hints of the original image into its best guess. Once it’s done  training, what happens if you give it a completely dirty photo, which is  pure static noise? Well, it produces a slightly less dirty ‘photo’. And  if you do it again? it’s a little cleaner still. Now, what if you do  this many times? It has to get cleaner each time. The end result: the  static noise goes in, and a face pops out! The DDPM has hallucinated a  face out of the noise. One little blob of static here turned into a  nose, and another blob turned into an ear, and it went from there.</p></li></ol>",
      "image": {
        "url": "https://substackcdn.com/image/fetch/$s_!rpC9!,w_256,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F5860611c-2de0-45a7-99a0-dc1b248b0199_1280x1280.png",
        "title": null
      },
      "media": [
        {
          "url": "https://substackcdn.com/image/fetch/$s_!rpC9!,w_256,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F5860611c-2de0-45a7-99a0-dc1b248b0199_1280x1280.png",
          "image": "https://substackcdn.com/image/fetch/$s_!rpC9!,w_256,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F5860611c-2de0-45a7-99a0-dc1b248b0199_1280x1280.png",
          "title": null,
          "length": 0,
          "type": "image",
          "mimeType": "image/jpeg"
        }
      ],
      "authors": [
        {
          "name": "gwern",
          "email": null,
          "url": null
        }
      ],
      "categories": []
    },
    {
      "id": "https://gwern.substack.com/p/april-2021-newsletter",
      "title": "April 2021 newsletter",
      "description": "with links on AI scaling, particular new East Asian record-breaking work & deep reinforcement learning.",
      "url": "https://gwern.substack.com/p/april-2021-newsletter",
      "published": "2021-06-03T15:45:24.000Z",
      "updated": "2021-06-03T15:45:24.000Z",
      "content": "<p>April 2021’s <a href=\"https://www.gwern.net/newsletter/2021/04\">Gwern.net</a> <a href=\"https://gwern.substack.com\">newsletter</a> is now out; previous, <a href=\"https://www.gwern.net/newsletter/2021/03\">March 2021</a> (<a href=\"https://www.gwern.net/tags/newsletter\">archives</a>). This is a collation of links and summary of major changes, overlapping with my <a href=\"https://www.gwern.net/Changelog\">Changelog</a>; brought to you by my donors on <a href=\"https://www.patreon.com/gwern\">Patreon</a>.</p><h1>1 Writings</h1><ul><li><p><a href=\"https://www.gwern.net/Variables\">Better Greek Variable Suggestions</a> (use ϰ, ς, υ, ϖ, Υ, Ξ, ι, ϱ, ϑ, or Π instead)</p></li></ul><h1>2 Links</h1><h2>2.1 AI</h2><ul><li><p><a href=\"https://arxiv.org/abs/1810.00825\">“Set Transformer: A Framework for Attention-based Permutation-Invariant Neural Networks”</a>, Lee et al 2018; <a href=\"https://arxiv.org/abs/2103.03206#deepmind\">“Perceiver: General Perception with Iterative Attention”</a>, Jaegle et al 2021 (skinny Transformers applied recurrently; given reinvention, one might ask “is <a href=\"https://arxiv.org/abs/1706.03762#google\" title=\"'Attention Is All You Need', Vaswani et al 2017\">attention</a>, getting too much attention?”, especially given how many Transformer tweaks <a href=\"https://arxiv.org/abs/2102.11972#google\" title=\"'Do Transformer Modifications Transfer Across Implementations and Applications?', Narang et al 2021\">don’t pan out</a>  or have antecedents, indicating a gold rush? Probably not: if the  marginal return on this research direction had fallen below that of  competitors, we would see those neglected directions invade Transformer  topics—while we continue to see the reverse, and many applications as  yet untouched by all the new approaches, suggesting that we <em>still</em> don’t pay enough attention)</p></li><li><p><a href=\"https://arxiv.org/abs/2103.04689\">“Z-IL: Predictive Coding Can Do Exact Backpropagation on Any Neural Network”</a>, Salvatori et al 2021 (scaling local learning rules to ImageNet AlexNet/Resnet & ALE DRL at similar compute cost)</p></li><li><p><a href=\"https://arxiv.org/abs/1708.07120\">“Super-Convergence: Very Fast Training of Neural Networks Using Large Learning Rates”</a>,  Smith & Topin 2017 (the lingering mystery of super-convergence,  saving 50–90% compute with LRs as high as 20 (!): what is it, why does  it work only sometimes, is there any connection to <a href=\"https://www.gwern.net/docs/ai/2021-power.pdf#openai\" title=\"'Grokking: Generalization Beyond Overfitting On Small Algorithmic Data Sets', Powers et al 2021\">grokking</a> & can it work for large models like GPT-3 given the <a href=\"https://old.reddit.com/r/MachineLearning/comments/ba1wg5/d_thoughts_about_superconvergence_and/\">tunneling hypothesis</a>?)</p></li><li><p><a href=\"http://www.offconvex.org/2021/04/07/ripvanwinkle/\">“Rip van Winkle’s Razor, a Simple New Estimate for Adaptive Data Analysis”</a>  (an unusual approach to estimating generalization—by quantifying the  information-theoretic simplicity of all the powerful DL research  discoveries since 2012, into ~1 kilobyte. And yet, <em>what</em> a kilobyte…)</p></li><li><p><a href=\"https://github.com/golanlevin/AmbigrammaticFigures\">“Ambigrammatic Figures”</a>, Levin & Huang 2020 (making horrifying StyleGAN faces that can be <a href=\"https://en.wikipedia.org/wiki/Ambigram\">rotated 180°</a> by projection & then <a href=\"https://www.gwern.net/Faces#reversing-stylegan-to-control-modify-images\">gradient-ascent</a> towards an upside-down face)</p></li></ul><p><a href=\"https://old.reddit.com/r/mlscaling/\">Matters Of Scale</a>:</p><ul><li><p><strong><a href=\"https://lair.lighton.ai/akronomicon/\" title=\"The Akronomicon: an Extreme-Scale Leaderboard\">Large Models</a></strong>:</p><ul><li><p>Congratulations to OpenAI on 1 year of GPT-3 & OA API. Has it really only been a year?—it has truly exceeded expectations.</p></li><li><p><a href=\"https://en.wikipedia.org/wiki/Naver\">Naver</a> announces 204b-parameter Korean-language NN, <a href=\"http://m.koreaherald.com/view.php?ud=20210525000824\">“HyperCLOVA”</a>  (KO; unknown arch although apparently dense, or training-compute or  benchmark/loss performance; 650b token training dataset. Who knew Naver  was even trying? “And we are here as on a darkling plain / Swept with  confused alarms of struggle and flight, / Where ignorant armies clash by  night.”)</p></li><li><p><a href=\"https://arxiv.org/abs/2104.12369#huawei\">“PanGu-α: Large-scale Autoregressive Pretrained Chinese Language Models with Auto-parallel Computation”</a>,  Zeng et al 2021 (Zh; Huawei’s GPT-3-200b prototype, trained on  indigenous Chinese GPU+DL stack; a partial replication, due to  incomplete training on ~43b tokens; the <a href=\"https://git.openi.org.cn/PCL-Platform.Intelligence/PanGu-Alpha#user-content-%E6%A8%A1%E5%9E%8B%E4%B8%8B%E8%BD%BD\">13b-parameter</a> model checkpoint has been released for download, and they are considering releasing the 200b-parameter model… <a href=\"https://chinai.substack.com/p/chinai-141-the-pangu-origin-story\">Ding commentary</a>)</p></li><li><p>New 𝒪(100b)-parameter Transformer models announced at Google I/O ’2021: <a href=\"https://blog.google/technology/ai/lamda/\" title=\"LaMDA: our breakthrough conversation technology\">LaMDA</a> (EN; chatbot), <a href=\"https://blog.google/products/search/introducing-mum/\">MUM</a> (multimodal multilingual search/translation/Q&A)</p></li><li><p><a href=\"https://www.infoq.cn/article/EFIHo75sQsVqLvFTruKE#alibaba\">“PLUG”</a> (Zh): a 27b parameter BERT-like Chinese language model, targeting 200b next (AliBaba followup to <a href=\"https://arxiv.org/abs/1908.04577#alibaba\" title=\"'StructBERT: Incorporating Language Structures into Pre-training for Deep Language Understanding', Wang et al 2019\">StructBERT</a>/<a href=\"https://arxiv.org/abs/2004.07159#alibaba\" title=\"'PALM: Pre-training an Autoencoding&Autoregressive Language Model for Context-conditioned Generation', Bi et al 2020\">PALM</a>)</p></li><li><p><a href=\"https://arxiv.org/abs/2105.13290\">“CogView: Mastering Text-to-Image Generation via Transformers”</a>, Ding et al 2021 (another Chinese <a href=\"https://openai.com/blog/dall-e/\">DALL·E</a> clone, post-<a href=\"https://arxiv.org/abs/2103.00823#alibaba\" title=\"'M6: A Chinese Multimodal Pretrainer', Lin et al 2021\">M6</a>: <em>n</em> = <a href=\"https://wudaoai.cn/data-detail/1\" title=\"WuDaoCorpus: the largest Chinese corpus data set, with about 2TB of text and 725 billion Chinese characters\">30m text-image pairs</a>, 4b-parameter GPT, models to be released)</p></li><li><p><a href=\"https://arxiv.org/abs/2104.10157\">“VideoGPT: Video Generation using VQ-VAE and Transformers”</a>, Yan et al 2021; <a href=\"https://arxiv.org/abs/2104.14806#microsoft\">“GODIVA: </a><em><a href=\"https://arxiv.org/abs/2104.14806#microsoft\">G</a></em><a href=\"https://arxiv.org/abs/2104.14806#microsoft\">enerating </a><em><a href=\"https://arxiv.org/abs/2104.14806#microsoft\">O</a></em><a href=\"https://arxiv.org/abs/2104.14806#microsoft\">pen-</a><em><a href=\"https://arxiv.org/abs/2104.14806#microsoft\">D</a></em><a href=\"https://arxiv.org/abs/2104.14806#microsoft\">oma</a><em><a href=\"https://arxiv.org/abs/2104.14806#microsoft\">I</a></em><a href=\"https://arxiv.org/abs/2104.14806#microsoft\">n </a><em><a href=\"https://arxiv.org/abs/2104.14806#microsoft\">V</a></em><a href=\"https://arxiv.org/abs/2104.14806#microsoft\">ideos from n</a><em><a href=\"https://arxiv.org/abs/2104.14806#microsoft\">A</a></em><a href=\"https://arxiv.org/abs/2104.14806#microsoft\">tural Descriptions”</a>, Wu et al 2021 (DALL·E for video on Howto100M: <a href=\"https://arxiv.org/abs/1906.00446#deepmind\" title=\"'Generating Diverse High-Fidelity Images with VQ-VAE-2', Razavi et al 2019\">VQ-VAE</a> + sparse attention)</p></li><li><p><a href=\"https://arxiv.org/abs/2104.04473#nvidia\">“Efficient Large-Scale Language Model Training on GPU Clusters”</a>, Narayanan et al 2021 (Nvidia <a href=\"https://github.com/nvidia/megatron-lm\">‘Megatron-LM’ software</a> for scaling up to 3072 A100 GPUs; allows 1t-parameter models at 502 petaFLOP/s or 50% efficiency, cf TPU rival, <a href=\"https://arxiv.org/abs/2105.04663#google\" title=\"'GSPMD: General and Scalable Parallelization for ML Computation Graphs', Xu et al 2021: '50% to 62% compute utilization on 128 to 2048 Cloud TPUv3 cores for models with up to one trillion parameters'\">GSPMD</a>, and note <a href=\"file:///tmp/burlyHGiKo.html#patterson-et-al-2021\">Patterson et al 2021</a> estimates GPT-3 at ~3.5m V100 GPU-hours, so OA got ~20% efficiency?); <a href=\"https://www.youtube.com/watch?v=eAn_oiZwUXA&t=2998s\" title=\"GTC 2021 Keynote with NVIDIA CEO Jensen Huang: NVIDIA CEO Jensen Huang delivers the #GTC21 keynote, where he introduced amazing breakthroughs in building virtual worlds with NVIDIA Omniverse; in advancing enterprise computing with new NVIDIA DGX systems and software; in turning the data center into the new unit of computing with the new NVIDIA Grace CPU, BlueField-3 DPU, and DOCA 1.0 SDK; in broadening the reach of AI to all companies and industries with NVIDIA EGX and Aerial 5G; and in transforming transportation with NVIDIA DRIVE Orin and Atlan.\">“We expect to see multi-trillion-parameter models by next year, and 100 trillion+ parameter models by 2023”</a> —Nvidia CEO <a href=\"https://en.wikipedia.org/wiki/Jensen_Huang\">Jensen Huang</a> (<a href=\"https://www.gwern.net/docs/ai/2021-04-12-jensenhuang-gtc2021keynote-eAn_oiZwUXA.en.vtt.txt\">subtitles</a>)</p></li><li><p>Mixture-Of-Experts:</p><ul><li><p><a href=\"https://en.pingwest.com/a/8693\">BAAI’s “Wudao Wensu”: 1.75-trillion parameters & multimodal!</a> (<a href=\"https://syncedreview.com/2021/03/23/chinas-gpt-3-baai-introduces-superscale-intelligence-model-wu-dao-1-0/\">prologue</a>)</p></li><li><p><a href=\"https://arxiv.org/abs/2105.15082#alibaba\">“Exploring Sparse Expert Models and Beyond”</a>, Yang et al 2021 (1t-parameter hierarchical Switch Transformer trained on 480 V100 GPUs)</p></li></ul></li></ul></li><li><p><strong><a href=\"https://arxiv.org/abs/1911.08265#deepmind\">MuZero</a></strong>:</p><ul><li><p><a href=\"https://arxiv.org/abs/2104.06294#deepmind\">“MuZero Unplugged: Online and Offline Reinforcement Learning by Planning with a Learned Model”</a>, Schrittwieser et al 2021 (Reanalyze+MuZero; <a href=\"https://www.gwern.net/images/ai/2021-schrittwieser-figure1-mspacmanmuzerologrewardscaling.png\" title=\"Figure 1: Final scores in Ms. Pac-Man for different Reanalyse fractions. By scaling the Reanalyse fraction, MuZero can be trained at any desired data budget. All other parameters are held constant. Note the logarithmic x-axis: Linear improvements in score require exponentially more data, matching scaling laws such as described by Kaplan et al 2020 for language models.\">smooth log-scaling</a> of <em>Ms. Pacman</em> reward with sample size, 107–1010, showing that DRL for arcade games parallels board games)</p></li><li><p><a href=\"https://sites.google.com/berkeley.edu/decision-transformer\">“Decision Transformer: Reinforcement Learning via Sequence Modeling”</a>, Chen et al 2021</p></li><li><p><a href=\"https://arxiv.org/abs/2104.06303#deepmind\">“Sampled MuZero: Learning and Planning in Complex Action Spaces”</a>, Hubert et al 2021 (MuZero for continuous domains: DM Control Suite/Real-World RL Suite); <a href=\"https://arxiv.org/abs/2006.07430\">“Continuous Control for Searching and Planning with a Learned Model”</a>, Yang et al 2020</p></li><li><p><a href=\"https://arxiv.org/abs/2104.06159\">“Muesli: Combining Improvements in Policy Optimization”</a>, Hessel et al 2020 (catching up with original MuZero)</p></li><li><p><a href=\"https://arxiv.org/abs/2102.12924\">“Visualizing MuZero Models”</a>, de Vries et al 2021 (reimplementing & introspecting a MuZero)</p></li></ul></li><li><p><a href=\"https://arxiv.org/abs/2104.03113\">“Scaling Scaling Laws with Board Games”</a>, <a href=\"https://andyljones.com/\">Jones</a> 2021 (AlphaZero/<a href=\"https://en.wikipedia.org/wiki/Hex_(board_game)\">Hex</a>: <a href=\"https://www.gwern.net/notes/Faster\">highly-optimized</a> GPU implementation enables showing <a href=\"https://www.gwern.net/notes/Scaling\">smooth scaling</a>  across 6 OOM of compute—2× FLOPS = 66% victory; amortization of  training → runtime tree-search, where 10× training = 15× runtime)</p></li><li><p><a href=\"https://christina.kim/2021/04/11/scaling-laws-for-language-transfer-learning/#openai\">“Scaling Laws for Language Transfer Learning”</a>, Christina Kim (<a href=\"https://arxiv.org/abs/2102.01293#openai\" title=\"Scaling Laws for Transfer\">Hernandez et al 2021</a> followup: smooth scaling for En → De/Es/Zh)</p></li><li><p><a href=\"https://arxiv.org/abs/2104.10350#google\">“Carbon Emissions and Large Neural Network Training”</a>,  Patterson et al 2021 (“…choice of DNN/datacenter/processor can reduce  the carbon footprint up to ~100–1000×. These large factors make  retroactive estimates difficult.”)</p></li><li><p><a href=\"https://arxiv.org/abs/2104.07705\">“How to Train BERT with an Academic Budget”</a>, Izsak et al 2021 (<a href=\"https://arxiv.org/abs/1810.04805#google\" title=\"'BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding', Devlin et al 2018\">BERT</a> in 8 GPU-days—R&D iteration allows finding efficiency; there’s nothing so expensive as demanding research be cheap.^1^)</p></li></ul><h2>2.2 Genetics</h2><p>Everything Is Heritable:</p><ul><li><p><a href=\"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6818669/\">“Precision exercise medicine: understanding exercise response variability”</a>,  Ross et al 2019 (“large individual differences in CRF response (range:  −33% to +118%) have been observed across the 8 exercise training studies  independent of exercise duration”—nothing in psychology, or medicine,  makes sense except in the light of individual differences…)</p></li></ul><p>Recent Evolution:</p><ul><li><p><a href=\"https://academic.oup.com/mbe/advance-article/doi/10.1093/molbev/msab147/6277411\">“Analysis of genomic DNA from medieval plague victims suggests long-term effect of </a><em><a href=\"https://academic.oup.com/mbe/advance-article/doi/10.1093/molbev/msab147/6277411\">Yersinia pestis</a></em><a href=\"https://academic.oup.com/mbe/advance-article/doi/10.1093/molbev/msab147/6277411\"> on human immunity genes”</a>, Immel et al 2021</p></li></ul><p>Engineering:</p><ul><li><p><a href=\"https://biohackinfo.com/news-china-gene-editing-criminal-law-article-336-march-2021/\">“China officially bans CRISPR babies, human clones and animal-human hybrids”</a>? (another blow to attempts to project fears & fantasies onto China)</p></li></ul><h2>2.3 Politics/Religion</h2><ul><li><p><em><a href=\"https://www.nap.edu/catalog/25762/reflecting-sunlight-recommendations-for-solar-geoengineering-research-and-research-governance\">Reflecting Sunlight: Recommendations for Solar Geoengineering Research and Research Governance</a></em>, National Academies 2021 (<a href=\"https://www.nytimes.com/2021/03/25/climate/geoengineering-sunlight.html\">media</a>)</p></li><li><p><a href=\"https://www.gwern.net/docs/sociology/2020-muralidharan.pdf\">“Improving Public Sector Management at Scale? Experimental Evidence on School Governance India”</a>, Muralidharan & Singh 2020</p></li><li><p><a href=\"https://www.gwern.net/docs/fiction/2012-mason.pdf\">“Jay-Z’s </a><em><a href=\"https://www.gwern.net/docs/fiction/2012-mason.pdf\">99 Problems</a></em><a href=\"https://www.gwern.net/docs/fiction/2012-mason.pdf\">, Verse 2: A Close Reading with 4th Amendment Guidance for Cops and Perps”</a>, Mason 2012</p></li></ul><h2>2.4 Psychology/Biology</h2><ul><li><p><a href=\"https://www.gwern.net/docs/longevity/2021-wiley.pdf\">“Oxylipin biosynthesis reinforces cellular senescence and allows detection of senolysis”</a>, Wiley et al 2021</p></li><li><p><a href=\"https://www.nytimes.com/2019/02/26/magazine/psychics-skeptics-facebook.html\" title=\"Are some celebrity mediums fooling their audience members by reading social media pages in advance? A group of online vigilantes is out to prove it\">“Inside the Secret Sting Operations to Expose Celebrity Psychics”</a></p></li><li><p><a href=\"https://www.gwern.net/docs/catnip/2021-smith.pdf\">“If I fits I sits: A citizen science investigation into illusory contour susceptibility in domestic cats (</a><em><a href=\"https://www.gwern.net/docs/catnip/2021-smith.pdf\">Felis silvestris catus</a></em><a href=\"https://www.gwern.net/docs/catnip/2021-smith.pdf\">)”</a>, Smith et al 2021</p></li><li><p><a href=\"https://www.gwern.net/docs/biology/2005-paxton.pdf\">“Cetaceans,  sex and sea serpents: an analysis of the Egede accounts of a ‘most  dreadful monster’ seen off the coast of Greenland in 1734”</a>, Paxton et al 2005 (is that a legendary cryptid in your pocket, or are you just happy to see me?)</p></li><li><p><a href=\"https://www.gwern.net/docs/psychology/writing/2020-reilly.pdf\">“Building the perfect curse word: A psycholinguistic investigation of the form and meaning of taboo words”</a>, Reilly et al 2020</p></li><li><p><a href=\"https://en.wikipedia.org/wiki/Tarrare\">Tarrare</a></p></li></ul><h2>2.5 Technology</h2><ul><li><p><a href=\"https://arxiv.org/abs/2103.07487\">“How Developers Choose Names”</a>,  Feitelson et al 2021 (“Another example concerned the function  ‘arrangeFilesByName(files)’. When asked the return value…one suggested  the number of files reordered”)</p></li><li><p><a href=\"https://arxiv.org/abs/2004.02504\">“Bringing GNU Emacs to Native Code”</a>,  Corallo et al 2020 (using libgccjit to make Emacs 2.3× to 42× faster;  gccemacs has been merged into Emacs HEAD & will be available soon)</p></li><li><p><a href=\"https://phiresky.github.io/blog/2021/hosting-sqlite-databases-on-github-pages/\">“Hosting SQLite databases on Github Pages (or any static file hoster)”</a> (a revolution in static website technology: eg running a query <a href=\"https://nitter.cc/simonw/status/1388933800445452290\" title=\"Check out this demo: I run the SQL query \"select country_code, long_name from wdi_country order by rowid desc limit 100\" and it fetches just 54.2KB of new data (across 49 small HTTP requests) to return 100 results---from a statically hosted database file that's 668.8MB!\">need download only 54kb of a 670MB database</a>; fulltext site search is just the beginning of the possibilities of this clever use of <a href=\"https://en.wikipedia.org/wiki/Byte_serving\">range requests</a>)</p></li><li><p><a href=\"https://www.coderelay.io/fontemon.html\">“</a><em><a href=\"https://www.coderelay.io/fontemon.html\">Fontemon</a></em><a href=\"https://www.coderelay.io/fontemon.html\">: World’s first video game in a font!”</a> (a <em>Pokemon</em>-like CYOA <a href=\"https://github.com/mmulet/code-relay/blob/main/markdown/HowIDidIt.md\">implemented as an OpenType font file</a>; play in browser or text editor—still not quite <a href=\"https://www.gwern.net/Turing-complete\">Turing-complete</a> but definitely the most impressive thing implemented in a font so far)</p><ul><li><p><em>Fontemon</em> is by far the highlight of <a href=\"http://sigbovik.org/2021/proceedings.pdf\">SIGBOVIK 2021</a>; but also worth noting: <a href=\"http://sigbovik.org/2021/proceedings.pdf#page=8\">“Back to Square One: Superhuman Performance in Chutes and Ladders Through Deep Neural Networks and Tree Search”</a> · <a href=\"http://sigbovik.org/2021/proceedings.pdf#page=83\">“Deep Deterministic Policy Gradient Boosted Decision Trees”</a> · <a href=\"http://sigbovik.org/2021/proceedings.pdf#page=126\">“Lowestcase and uppestcase letters: Advances in derp learning”</a> · <a href=\"http://sigbovik.org/2021/proceedings.pdf#page=167\">“openCHEAT: Computationally Helped Error bar Approximation Tool—Kick-starting Science 4.0”</a> · <a href=\"http://sigbovik.org/2021/proceedings.pdf#page=216\">“The Newcomb-Benford Law, Applied to Binary Data: An Empirical and Theoretic Analysis”</a> · <a href=\"http://sigbovik.org/2021/proceedings.pdf#page=252\">“Inverted Code Theory: Manipulating Program Entropy”</a> (<em><a href=\"https://en.wikipedia.org/wiki/Tenet_(film)\">Tenet</a></em> fans only—possibly inferior to <a href=\"http://www.frc.ri.cmu.edu/~hpm/project.archive/general.articles/1991/TempComp.html\" title=\"Time Travel and Computing\">Moravec 1991</a>?) · <a href=\"http://sigbovik.org/2021/proceedings.pdf#page=282\">“Build your own 8-bit busy beaver on a breadboard!”</a></p></li></ul><p>Incidentally, it’s curious that while STEM fields have entire annual issues, journals, & conferences devoted to satire (<a href=\"http://sigbovik.org/\">SIGBOVIK</a>; Arxiv April Fools papers like <a href=\"https://arxiv.org/abs/1703.10987\" title=\"On the Impossibility of Supersized Machines\">Garfinkel et al 2017</a>; <a href=\"https://www108.lamp.le.ac.uk/ojs1/index.php/pst/issue/archive\">Special Topics</a>; the <a href=\"https://www.bmj.com/about-bmj/resources-authors/article-types/christmas-issue\">BMJ Christmas issue</a>; the <a href=\"https://en.wikipedia.org/wiki/Ig_Nobel_Prize\">Ig Nobel Prizes</a> & <a href=\"https://bahfest.com/\">BAHFest</a>), after asking in several places, I have found no instances in the humanities. (I know of many entertaining <em>papers</em>, like <a href=\"https://www.gwern.net/docs/philo/2008-sinhababu.pdf\" title=\"Possible Girls\">Sinhababu 2008</a> on waifus, but no <em>regular organized</em> publication, with the possible exception of the annual <a href=\"https://en.wikipedia.org/wiki/Latke%E2%80%93Hamantash_Debate\">“Latke-Hamantash Debate”</a>.)</p></li></ul><h2>2.6 Economics</h2><ul><li><p><a href=\"https://www.gwern.net/docs/statistics/decision/2006-thorp.pdf\">“The Kelly Criterion in Blackjack Sports Betting, and the Stock Market”</a>, Thorp 2006</p></li><li><p><a href=\"https://marginalrevolution.com/marginalrevolution/2016/10/performance-pay-nobel.html\">“The Performance Pay Nobel”</a> (CEO pay as <a href=\"https://www.gwern.net/Backstop\">blackbox optimization problem</a>)</p></li><li><p><a href=\"https://www.gwern.net/docs/economics/2008-josephson.pdf\">“The Ocean’s Hot Dog: The Development of the Fish Stick”</a>,  Kelly 2008 (out of nostalgia, I bought some fish sticks for the first  time in decades; better than I remembered, even if I had no <a href=\"https://en.wikipedia.org/wiki/Tartar_sauce\">tartar</a> handy)</p></li></ul><h2>2.7 Philosophy</h2><ul><li><p><a href=\"https://www.gwern.net/docs/culture/2007-shiner.pdf\">“The Aesthetics of Smelly Art”</a>, Shiner & Kriskovets 2007; <a href=\"https://www.gwern.net/docs/culture/2019-kraft.pdf\">“The Odor Value Concept in the Formal Analysis of Olfactory Art”</a>, Kraft 2019; <a href=\"https://qualiacomputing.com/2020/02/21/perfumery-as-an-art-form/\" title=\"Hedonic Tone, memetics, scent, sex, spirituality\">“Perfumery as an art form”</a>/<a href=\"https://qualiacomputing.com/2020/08/14/qualia-research-diary-scents/\" title=\"Qualia Research Diary: Scents [consciousness research, Experiment, genetics, memetics, scent, valence]\">notes</a>, Qualia Computing 2020 (more: manufacturing: <a href=\"https://www.newyorker.com/magazine/2005/03/14/scent-nile\" title=\"Chandler Burr 2005\">“The Scent of the Nile: Jean-Claude Ellena creates a new perfume”</a>; human smell is better than you think: <a href=\"https://www.gwern.net/docs/psychology/2006-porter.pdf\">“Mechanisms of Scent-tracking in Humans”</a>, Porter et al 2006 (<a href=\"https://www.gwern.net/images/psychology/2006-porter-humanscenttracking-41593_2007_bfnn1819_moesm2_esm.mp4\">video</a>; see also <a href=\"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5512720/\">“Poor Human Olfaction is a 19th Century Myth”</a>, McGann 2017); <a href=\"https://www.pnas.org/content/109/49/19959.full\" title=\"'Perceptual convergence of multi-component mixtures in olfaction implies an olfactory white', Weiss et al 2012\">olfactory white</a>; <em><a href=\"https://en.wikipedia.org/wiki/K%C5%8Dd%C5%8D\">Kōdō</a></em>, which unexpectedly appears in <a href=\"https://www.gwern.net/docs/cs/2005-knuth-taocp-v4-prefascicle4b.pdf#page=22\" title=\"7.2.1.7: History of Combinatorial Generation: Set Partitions\">Knuth</a>. <a href=\"https://threadreaderapp.com/thread/1357071738731814912.html\" title=\"https://twitter.com/add_hawk/status/1357071738731814912\">C. Thi Nguyen</a>’s description of the more bizarre & avant-garde perfumes made me curious enough to nose around & order 39 <a href=\"https://www.luckyscent.com/\">LuckyScent</a> samplers.)</p></li></ul><h2>2.8 Miscellaneous</h2><ul><li><p><a href=\"https://en.wikipedia.org/wiki/Bog_butter\">Bog butter</a></p></li><li><p><a href=\"https://en.wikipedia.org/wiki/Sarah_Bernhardt\">Sarah Bernhardt</a> (Lions. Lots of lions.)</p></li></ul><div><hr></div><ol><li><p>Another thought, looking at <a href=\"https://bls.gov/news.release/ecec.nr0.htm\">‘Employer Costs for Employee Compensation’</a> (<a href=\"https://bls.gov/news.release/archives/ecec_031986.pdf\">PDF</a>):</p><ol><li><p>“Moore’s Law”: the cost of a transistor halves every ~19 months;</p></li><li><p>“Anti-Moore’s Law”: the cost of a synapse doubles every ~119 years.</p></li></ol></li></ol>",
      "image": {
        "url": "https://bucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com/public/images/33773d07-4631-4a6b-91c2-44a2b1082385_1164x702.png",
        "title": null
      },
      "media": [
        {
          "url": "https://bucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com/public/images/33773d07-4631-4a6b-91c2-44a2b1082385_1164x702.png",
          "image": "https://bucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com/public/images/33773d07-4631-4a6b-91c2-44a2b1082385_1164x702.png",
          "title": null,
          "length": 0,
          "type": "image",
          "mimeType": "image/jpeg"
        }
      ],
      "authors": [
        {
          "name": "gwern",
          "email": null,
          "url": null
        }
      ],
      "categories": []
    },
    {
      "id": "https://gwern.substack.com/p/march-2021-gwernnet-newsletter",
      "title": "March 2021 Gwern.net Newsletter",
      "description": "2 major new site features: 'popins' and recursive Wikipedia popups",
      "url": "https://gwern.substack.com/p/march-2021-gwernnet-newsletter",
      "published": "2021-04-06T15:31:01.000Z",
      "updated": "2021-04-06T15:31:01.000Z",
      "content": "<p><a href=\"https://www.gwern.net/newsletter/2021/03\">March 2021’s Gwern.net</a> <a href=\"https://gwern.substack.com\">newsletter</a> is now out; previous, <a href=\"https://www.gwern.net/newsletter/2021/02\">February 2021</a> (<a href=\"https://www.gwern.net/tags/newsletter\">archives</a>). This is a summary of the revision-history RSS feed, overlapping with my <a href=\"https://www.gwern.net/Changelog\">Changelog</a> & <a href=\"https://old.reddit.com/r/gwern/\">/r/gwern</a>; brought to you by my donors on <a href=\"https://www.patreon.com/gwern\">Patreon</a>.</p><h1>1 Writings</h1><ul><li><p><strong>Gwern.net</strong>: mobile “popins” are finally enabled! (<a href=\"https://www.gwern.net/images/design/2021-03-28-gwern.net-annotations-mobilepopins-darkmode.png\">example</a>); new Wikipedia popups (this 7th implementation enables <em><a href=\"https://www.gwern.net/images/design/2021-04-01-gwern.net-annotations-popups-recursivewikipediapopups.png\">recursive</a></em><a href=\"https://www.gwern.net/images/design/2021-04-01-gwern.net-annotations-popups-recursivewikipediapopups.png\"> WP popups</a>)</p></li></ul><h1>2 Links</h1><h2>2.1 AI</h2><ul><li><p><a href=\"https://distill.pub/2021/multimodal-neurons/#openai\">“Multimodal Neurons in Artificial Neural Networks”</a>, Goh et al 2021 (dissecting <a href=\"https://openai.com/blog/clip/\" title=\"CLIP (Contrastive Language-Image Pre-training): Connecting Text and Images\">CLIP</a> concepts, discovering typographical classification ‘attacks’^1^ and a <a href=\"https://en.wikipedia.org/wiki/Stroop_effect\">Stroop effect</a>! Is there anything CLIP can’t do?)</p></li><li><p><a href=\"https://arxiv.org/abs/2101.03958#google\">“Evolving Reinforcement Learning Algorithms”</a>, Co-Reyes et al 2021 (evolving eg <a href=\"https://en.wikipedia.org/wiki/Temporal_difference_learning\">TD-learning</a>)</p></li><li><p><a href=\"https://www.gwern.net/docs/rl/2021-scanlon.pdf\">“Waymo Simulated Driving Behavior in Reconstructed Fatal Crashes within an Autonomous Vehicle Operating Domain”</a>, Scanlon et al 2021 (<a href=\"https://blog.waymo.com/2021/03/replaying-real-life.html\">blog</a>; hard negative mining—self-driving cars, being inhuman, can learn not just from their mistakes but humans’ mistakes too)</p></li><li><p><a href=\"https://andyljones.com/posts/rl-debugging.html\">“Debugging Reinforcement Learning Systems Without The Agonizing Pain”</a>, Andy L. Jones; <a href=\"https://clemenswinter.com/2021/03/24/my-reinforcement-learning-learnings/\">“My Reinforcement Learning Learnings”</a>, Clemens Winter</p></li></ul><p><a href=\"https://old.reddit.com/r/mlscaling/\">Matters Of Scale</a>:</p><ul><li><p><a href=\"https://arxiv.org/abs/2103.01988#facebook\">“SEER: Self-supervised Pretraining of Visual Features in the Wild”</a>, Goyal et al 2021 (<a href=\"https://ai.facebook.com/blog/self-supervised-learning-the-dark-matter-of-intelligence\" title=\"Self-supervised learning: The dark matter of intelligence\">blog</a>;  near-SOTA by training 1b-param CNN on 1b unfiltered unlabeled Internet  images—another reminder that unsupervised learning is really working!); <a href=\"https://ai.facebook.com/blog/learning-from-videos-to-understand-the-world\">“‘Learning From Videos’ to understand the world”</a> (rapid FB expansion of self-supervised training to millions of photos/videos/hours-of-speech); <a href=\"https://arxiv.org/abs/2103.14005\">“Contrasting Contrastive Self-Supervised Representation Learning Models”</a>,  Kotar et al 2021 (Supervised learning from ImageNet is now obsolete for  transfer learning, and ImageNet just a contaminated validation set)</p></li><li><p><a href=\"https://arxiv.org/abs/2103.14586#google\">“Understanding Robustness of Transformers for Image Classification”</a>, Bhojanapalli et al 2021 (<a href=\"https://openreview.net/forum?id=YicbFdNTTy#google\">Vision Transformers</a> gain robustness faster than CNNs as dataset size increases)</p></li><li><p><a href=\"https://aiindex.stanford.edu/wp-content/uploads/2021/03/2021-AI-Index-Report_Master.pdf#page=41\">“Artificial Intelligence Index Report 2021”</a>: technical performance and cost (<a href=\"https://chinai.substack.com/p/chinai-137-year-3-of-chinai\" title=\"ChinAI #137: Year 3 of ChinAI: Reflections on the newsworthiness of machine translation\">Ding questions</a>  whether this shows China catching up on AI at all, as we are  incessantly told it is doing; one question to ask: ignoring  fast-following, what, out of the thousands upon thousands of  publications flooding out these days, are the last 3 <em>major novel</em>  AI breakthroughs coming out of all pure-Chinese labs combined which  could be plausibly equated in importance with, say, just OpenAI’s recent  output of <a href=\"https://arxiv.org/abs/2005.14165#openai\">GPT-3</a>/<a href=\"https://openai.com/blog/dall-e/\">DALL·E</a>/CLIP?)</p></li><li><p><a href=\"https://openai.com/blog/gpt-3-apps/\">OA GPT-3 API: >300 apps, >10k developers, >4.5b words per day</a></p></li><li><p><a href=\"https://www.pnas.org/content/116/23/11537\">“A mathematical theory of semantic development in deep neural networks”</a>, Saxe et al 2019 (are jumps in NN capabilities to be expected when scaling? see also <a href=\"https://arxiv.org/pdf/2103.10948.pdf#page=22\" title=\"The Shape of Learning Curves: a Review: 6. Ill-behaved learning curves: 6.1. Phase transitions\">Viering & Loog 2021</a>’s discussion of phase transitions & averaging of exponentials giving power-laws)</p></li><li><p><a href=\"https://www.cell.com/cell/fulltext/S0092-8674(21)00239-7\">“An early cell shape transition drives evolutionary expansion of the human forebrain”</a>, Benito-Kwiecinski et al 2021 (<a href=\"https://www.theguardian.com/science/2021/mar/24/scientists-discover-why-the-human-brain-is-so-big\" title=\"Scientists discover why the human brain is so big: Molecular switch makes human organ three times larger than great apes’, study finds\">media</a>; a simple switch for the <a href=\"https://www.gwern.net/docs/psychology/2012-herculanohouzel.pdf\" title=\"'The remarkable, yet not extraordinary, human brain as a scaled-up primate brain and its associated cost', Herculano-Houzel 2012\">scaling up</a> of the primate brain)</p><ul><li><p><a href=\"https://www.statnews.com/2020/09/24/crows-possess-higher-intelligence-long-thought-primarily-human/\">“Crows possess higher intelligence long thought primarily human”</a> (the remarkable, yet not extraordinary, crow/raven brain as scaled-up <a href=\"https://en.wikipedia.org/wiki/Bird_intelligence\">bird brain</a>)</p></li></ul></li></ul><h2>2.2 Genetics</h2><p>Everything Is Heritable:</p><ul><li><p><a href=\"https://advances.sciencemag.org/content/7/11/eabd1239\">“GWAS in almost 195,000 individuals identifies 50 previously unidentified genetic loci for eye color”</a>, Simcoe et al 2021</p></li><li><p><a href=\"https://www.gwern.net/docs/genetics/heritable/2021-fagereng.pdf\">“Why Do Wealthy Parents Have Wealthy Children?”</a>,  Fagereng et al 2021 (I’m always impressed just how difficult it is for  rich people to pass on wealth—“shirtsleeves to shirtsleeves in 3  generations” etc)</p></li></ul><p>Evolution:</p><ul><li><p><a href=\"https://www.biorxiv.org/content/10.1101/2021.02.25.432891v1\">“Nothing in evolution makes sense except in the light of parasites”</a>, Hickinbotham et al 2021</p></li></ul><p>Engineering:</p><ul><li><p><a href=\"https://www.sierraclub.org/sierra/2021-2-march-april/feature/demise-and-potential-revival-american-chestnut\" title=\"Before a disastrous blight, the American chestnut was a keystone species in eastern forests. Could genetic engineering help bring it back?\">“The Demise and Potential Revival of the American Chestnut”</a></p></li></ul><h2>2.3 Statistics/Meta-Science</h2><ul><li><p><a href=\"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7831807/\">“Broad cross-national public support for accelerated COVID-19 vaccine trial designs”</a>,  Broockman et al 2021 (“we can’t do challenge trials with volunteers in  February 2020 to save countless thousands of lives because ordinary  people might think it unethical”—have you tried <em>asking</em> them, or was that irrelevant because it was just another noble lie?)</p></li><li><p><a href=\"https://crystalprisonzone.blogspot.com/2021/01/i-tried-to-report-scientific-misconduct.html\">“This is the story of how I found what I believe to be scientific misconduct and what happened when I reported it”</a>, Joe Hilgard</p></li><li><p><a href=\"https://www.newyorker.com/culture/cultural-comment/the-revolution-in-classic-tetris\">“The Revolution in Classic Tetris: How a younger generation used the Internet to master the falling blocks”</a> (how achieving classic Tetris maximum-scores, first done in 2010, became routine thanks to YouTube & <a href=\"https://www.gwern.net/Bakewell#external-links\">online competition for excellence</a>)</p></li></ul><h2>2.4 Politics/Religion</h2><ul><li><p><a href=\"https://www.gwern.net/docs/sociology/2021-singh.pdf\">“Magic, Explanations, and Evil: The Origins and Design of Witches and Sorcerers”</a>, Singh 2021 (doubtless even cavemen were all “Og: sus.”)</p></li><li><p><a href=\"https://elifesciences.org/articles/62878\">“Self-blinding citizen science to explore psychedelic microdosing”</a>, Szigeti et al 2021 (related to <a href=\"https://www.nature.com/articles/s41598-021-81446-7\" title=\"Positive expectations predict improved mental-health outcomes linked to psychedelic microdosing\">Kaertner et al 2021</a>; a self-blinding study, similar to my old self-blinding protocols, confirms that microdosing is just placebo effect, as <a href=\"https://www.gwern.net/LSD-microdosing\">I said in 2012</a>, and I’m reminded of DNB studies like <a href=\"https://www.gwern.net/docs/dnb/2016-foroughi.pdf\" title=\"Placebo effects in cognitive training\">Foroughi et al 2016</a>)</p></li><li><p><a href=\"https://en.wikipedia.org/wiki/2019%E2%80%932020_vaping_lung_illness_outbreak\">The 2019–2020 vaping moral panic</a>  over adulterated black-market THC products (depressing to see how  irresponsibly reported & alarmist this was, and how everyone  attempted to frame nicotine for it<a href=\"file:///tmp/burlVELRZZ.html#fn2\">2</a>. Naturally, no one involved has apologized or admitted fault—after all, their <em><a href=\"https://en.wikipedia.org/wiki/Noble_lie\">intentions</a></em><a href=\"https://en.wikipedia.org/wiki/Noble_lie\"> were good</a>,  “won’t someone think of the children”‽ The incompetence and/or  dishonesty here emphasizes how 2020–2021 was business as usual, and the  only unusual part is that reality happened so fast we saw some of <a href=\"https://en.wikipedia.org/wiki/Parable_of_the_broken_window\">the unseen</a>.)</p></li><li><p><a href=\"https://en.wikipedia.org/wiki/Mark_Hofmann\">Mark Hofmann</a></p></li><li><p><a href=\"https://en.wikipedia.org/wiki/Alexandra_David-N%C3%A9el\">Alexandra David-Néel</a> (one of <em>those</em> 1800–1900s biographies)</p></li><li><p><a href=\"https://en.wikipedia.org/wiki/John_Harvey_Kellogg\">John Harvey Kellogg</a></p></li></ul><h2>2.5 Psychology/Biology</h2><ul><li><p><a href=\"https://www.gwern.net/docs/iq/2021-brown.pdf\">“Can You Ever Be Too Smart for Your Own Good? Comparing Linear and Nonlinear Effects of Cognitive Ability on Life Outcomes”</a>, Brown et al 2021</p></li><li><p><a href=\"https://psyarxiv.com/g8f9s/\">“The pandemic fallacy: Inaccuracy of social scientists’ and lay judgments about COVID-19’s societal consequences in America”</a>, Hutcherson et al 2021 (highly-inaccurate even retrospectively, typically grossly overestimating)</p></li><li><p><a href=\"https://psyarxiv.com/hc8je/\">“Training Working Memory for Two Years—No Evidence of Latent Transfer to Intelligence”</a>, Watrin et al 2021 (fade-out of expectancy/placebo effects)</p></li><li><p><a href=\"https://www.cell.com/current-biology/fulltext/S0960-9822(21)00059-2\">“Real-time dialogue between experimenters and dreamers during REM sleep”</a>, Konkoly et al 2021</p></li><li><p><a href=\"https://www.sciencedirect.com/science/article/pii/S0149763421001068\">“Leroy’s elusive little people: A systematic review on lilliputian hallucinations”</a>, Blom 2021 (<a href=\"https://en.wikipedia.org/wiki/Alice_in_Wonderland_syndrome\">Alice in Wonderland syndrome</a>)</p></li><li><p><a href=\"https://www.theatlantic.com/science/archive/2021/01/orcas-killer-whale-resident-transient/617862/\">“A  Group of Orca Outcasts Is Now Dominating an Entire Sea: ‘Transient’  killer whales that feast on seals and hunt in small packs are thriving  while their widely beloved ‘Resident’ siblings are dying out”</a> (I wonder how the third <a href=\"https://en.wikipedia.org/wiki/Killer_whale\">orca</a> type, <a href=\"https://en.wikipedia.org/wiki/Killer_whale#Types\">‘offshore’</a>, are doing?)</p></li><li><p><a href=\"https://www.gwern.net/docs/biology/1995-watanabe.pdf\">“Estimation of the total saliva volume produced per day in 5-year-old children”</a>, Watanabe et al 1995</p></li></ul><h2>2.6 Technology</h2><ul><li><p><a href=\"https://www.nngroup.com/articles/aesthetic-usability-effect/\">“The Aesthetic-Usability Effect”</a>, Moran 2017 (<a href=\"https://pointersgonewild.com/2019/11/02/they-might-never-tell-you-its-broken/\">“They Might Never Tell You It’s Broken”</a> if it’s pretty enough; see also <a href=\"https://asktog.com/atc/the-third-user/\" title=\"'The Third User, or, Exactly Why Apple Keeps Doing Foolish Things\">“The Third User”</a>)</p></li><li><p><a href=\"https://ciechanow.ski/cameras-and-lenses/\">“Cameras and Lenses”</a>, Bartosz Ciechanowski (explorable; followup to <a href=\"https://ciechanow.ski/lights-and-shadows/\">“Lights and Shadows”</a>)</p></li><li><p><a href=\"https://arxiv.org/abs/2103.07013\">“Large Batch Simulation for Deep Reinforcement Learning”</a>, Shacklett et al 2021 (your computer is faster than you think)</p></li><li><p><a href=\"https://obscuritory.com/essay/incredible-boxes-of-hock-wah-yeo/\">“The incredible boxes of Hock Wah Yeo”</a> (unusual video game packaging design)</p></li><li><p><a href=\"https://www.gwern.net/docs/technology/2017-post.pdf\">“Stone Walls That Stay Built: A master waller shares how to dry-lay stone walls that hold their ground for centuries”</a>, Post 2017</p></li><li><p><a href=\"https://en.wikipedia.org/wiki/Automated_storage_and_retrieval_system\">Automated storage and retrieval system</a></p></li><li><p><a href=\"https://en.wikipedia.org/wiki/Visual_cryptography\">Visual cryptography</a></p></li></ul><h2>2.7 Economics</h2><ul><li><p><a href=\"https://www.gwern.net/docs/economics/2021-meyer.pdf\">“The Use and Misuse of Income Data and Extreme Poverty in the United States”</a>, Meyer et al 2021 (measurement error in non-registry surveys of population extremes—not quite <a href=\"https://www.gwern.net/GPT-3#lizardman-constant\">“lizardman”</a> but similar problem)</p></li><li><p><a href=\"https://www.gwern.net/docs/economics/2006-mackenzie.pdf\">“Is economics performative? Option theory and the construction of derivatives markets”</a>, Mackenzie 2006 (the mechanics of how the <a href=\"https://en.wikipedia.org/wiki/Black%E2%80%93Scholes_model\">Black-Scholes model</a> changed markets: <a href=\"https://en.wikipedia.org/wiki/Fischer_Black\">Black</a>  ran a service printing “paper” estimating optimal prices for all  options which traders could consult & use with simple heuristics to  try to arbitrage the market)</p></li><li><p><a href=\"https://www.cabinetmagazine.org/issues/52/hodes.php\">“Whitewood under Siege: On the front lines of the pallet wars”</a> (the competition between the two ecosystems of shipping <a href=\"https://en.wikipedia.org/wiki/Pallet\">pallets</a>: ‘whitewood’ & ‘blue pallet’)</p></li><li><p><em><a href=\"https://en.wikipedia.org/wiki/Mautam\">Mautam</a></em></p></li></ul><h2>2.8 Philosophy</h2><ul><li><p><a href=\"https://www.tandfonline.com/doi/full/10.1080/03949370.2021.1893826\">“Coping with mortality: responses of monkeys and great apes to collapsed, inanimate and dead conspecifics”</a>, De Marco et al 2021</p></li><li><p><a href=\"https://en.wikipedia.org/wiki/Braitenberg_vehicle\">Braitenberg vehicle</a></p></li></ul><h2>2.9 Fiction</h2><ul><li><p><a href=\"https://en.wikipedia.org/wiki/Reply_of_the_Zaporozhian_Cossacks\">“Reply of the Zaporozhian Cossacks”</a></p></li></ul><h2>2.10 Miscellaneous</h2><ul><li><p>America’s top ace, <a href=\"https://en.wikipedia.org/wiki/Dick_Bong\">Major Dick Bong</a></p></li></ul><h1>3 Film/TV</h1><p><strong>Live-action:</strong></p><ul><li><p><em><a href=\"https://en.wikipedia.org/wiki/North_by_Northwest\">North by Northwest</a></em> (<a href=\"https://en.wikipedia.org/wiki/Alfred_Hitchcock\">Hitchcock</a>  1959; for such a extremely respected movie, it felt oddly formless and  like it was bouncing through genres as more of a comedic B-movie romp  than a serious auteur’s effort—since James Bond started in 1953, with a  TV adaptation in 1954, NbN comes off as almost a satire. I mean, really,  monkeying around in Presidential noses!)</p></li></ul><div><hr></div><ol><li><p>While interesting, these are ‘attacks’ only in the most generous interpretation possible (since it <a href=\"https://nitter.cc/NoaNabeshima/status/1368662246885265409\" title=\"The new CLIP adversarial examples are partially from the use-mention distinction. CLIP was trained to predict which caption from a list matches an image. It makes sense that a picture of an apple with a large 'iPod' label would be captioned with 'iPod', not 'Granny Smith'! This can be somewhat fixed with a list of labels that are more explicit about this, at least for a small set of pictures I've tried. After some experimentation, I found this prompt that seems to work with CLIP ViT-B-32: ...\">does know</a> <a href=\"https://www.youtube.com/watch?v=Rk3MBx20z24&t=35s\" title=\"'Apple or iPod? Easy Fix for Adversarial Textual Attacks on OpenAI's CLIP Model!', Yannic Kilcher\">the difference</a>),  and the fact that CLIP can read text in images to note the semantic  similarity, is to considerable credit. As the CLIP authors <a href=\"https://www.gwern.net/images/ai/2021-radford-clip-figure4-promptengineering.png\" title=\"Radford et al 2021 (CLIP): **Figure 4**. _Prompt engineering and ensembling improve zero-shot performance_. Compared to the baseline of using contextless class names, prompt engineering and ensembling boost zero-shot classification performance by almost 5 points on average across 36 datasets. This improvement is similar to the gain from using 4× more compute with the baseline zero-shot method but is “free” when amortized over many predictions.\">note</a>,  some queries benefit from ensembling, more context than a single word  class name such as prefixing “A photograph of a”, and class names can be  highly ambiguous: in ImageNet, the class name “crane” could refer to  the bird or construction equipment; and the Oxford-IIIT Pet dataset  labels one class “boxer”. (CLIP is still <a href=\"https://stanislavfort.github.io/2021/03/05/OpenAI_CLIP_stickers_and_adversarial_examples.html\" title=\"Pixels still beat text: Attacking the OpenAI CLIP model with text patches and adversarial pixel perturbations\">vulnerable to regular adversarial examples</a>, of course.)↩</p></li><li><p>It <em>couldn’t’ve</em> been  nicotine because people had been vaping for a decade and a half without  widespread near-instantaneous lung-related fatalities! It <em>had</em>  to be a new adulterant, and as soon as the first few black-market THC  links surfaced, that meant the problem had to be THC-products-only  because how would the same adulterant simultaneously get into the  different supply chains? And yet, every article, health official, and  activist did their paternalist best to suggest otherwise to pin the  blame on regular vaping, no matter how many tests turned up clean, and  it was the nicotine vaping products which got summarily banned…. One  must assume many of those laws are still on the books, inasmuch as <a href=\"https://old.reddit.com/r/electronic_cigarette/comments/lkhewr/usa_vape_mail_ban_newssales_megathread/\">the shipping bans keep expanding</a>.↩</p></li></ol>",
      "image": {
        "url": "https://bucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com/public/images/f10eb6e5-7674-4465-b223-2f254bc50ddb_685x1368.png",
        "title": null
      },
      "media": [
        {
          "url": "https://bucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com/public/images/f10eb6e5-7674-4465-b223-2f254bc50ddb_685x1368.png",
          "image": "https://bucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com/public/images/f10eb6e5-7674-4465-b223-2f254bc50ddb_685x1368.png",
          "title": null,
          "length": 0,
          "type": "image",
          "mimeType": "image/jpeg"
        }
      ],
      "authors": [
        {
          "name": "gwern",
          "email": null,
          "url": null
        }
      ],
      "categories": []
    },
    {
      "id": "https://gwern.substack.com/p/february-2021-gwernnet-newsletter",
      "title": "February 2021 Gwern.net Newsletter",
      "description": "links on AI scaling, semaglutide, and ethicist ethics",
      "url": "https://gwern.substack.com/p/february-2021-gwernnet-newsletter",
      "published": "2021-03-13T15:18:44.000Z",
      "updated": "2021-03-13T15:18:44.000Z",
      "content": "<p>February 2021’s <a href=\"https://www.gwern.net/newsletter/2021/02\">Gwern.net</a> <a href=\"https://gwern.substack.com\">newsletter</a> is now out; previous, <a href=\"https://www.gwern.net/newsletter/2021/01\">January 2021</a> (<a href=\"https://www.gwern.net/tags/newsletter\">archives</a>). This is a summary of the revision-history RSS feed, overlapping with my <a href=\"https://www.gwern.net/Changelog\">Changelog</a> & <a href=\"https://old.reddit.com/r/gwern/\">/r/gwern</a>; brought to you by my donors on <a href=\"https://www.patreon.com/gwern\">Patreon</a>.</p><h1>1 Writings</h1><ul><li><p><strong>Gwern.net</strong>: popups: can now be moved, stickied, and full-screened (another step towards our ambition of Windows-95-in-the-browser!)</p></li></ul><h1>2 Links</h1><h2>2.1 AI</h2><ul><li><p><a href=\"https://lilianweng.github.io/lil-log/2021/01/02/controllable-neural-text-generation.html\">“Controllable Neural Text Generation”</a>, Lilian Weng; <a href=\"https://ruder.io/recent-advances-lm-fine-tuning/\" title=\"This article provides an overview of recent methods to fine-tune large pre-trained language models\">“Recent Advances in Language Model Fine-tuning”</a>, Sebastian Ruder (review)</p><ul><li><p><a href=\"https://arxiv.org/abs/2102.07350\">“Prompt Programming for Large Language Models: Beyond the Few-Shot Paradigm”</a>,  Reynolds & McDonell 2021 (original 10-shot Fr → En translation can  be beaten by the better 0-shot prompt: “French: XYZ / English:…”; this  is “true of most worst-performing prompts…”); <a href=\"https://arxiv.org/abs/2102.09690\">“Calibrate Before Use: Improving Few-Shot Performance of Language Models”</a>, Zhao et al 2021 (huge boost from calibrating unstable prompts; both demonstrate, <a href=\"https://www.gwern.net/GPT-3#prompts-as-programming\">as always</a>, that “sampling can prove the presence of knowledge but not the absence.”)</p></li></ul></li><li><p><a href=\"https://arxiv.org/abs/2102.07074\">“TransGAN: Two Transformers Can Make One Strong GAN”</a>, Jiang et al 2021 (Transformer-only GAN: attention is all you need)</p></li><li><p><a href=\"https://arxiv.org/abs/2102.06203\">“PACT: Proof Artifact Co-training for Theorem Proving with Language Models”</a>, Han et al 2021 (<a href=\"https://arxiv.org/abs/2009.03393#openai\" title=\"'GPT-f: Generative Language Modeling for Automated Theorem Proving', Polu & Sutskever 2020\">GPT-f</a> for <a href=\"https://en.wikipedia.org/wiki/Lean_(proof_assistant)\">Lean</a>)</p></li><li><p><a href=\"https://arxiv.org/abs/2010.10648#google\">“Towards End-to-End In-Image Neural Machine Translation”</a>, Mansimov et al 2020 (sure why not)</p></li><li><p><strong>Brains</strong>:</p><ul><li><p><a href=\"https://www.quantamagazine.org/artificial-neural-nets-finally-yield-clues-to-how-brains-learn-20210218/\" title=\"The learning algorithm that enables the runaway success of deep neural networks doesn’t work in biological brains, but researchers are finding alternatives that could\">“Artificial Neural Nets Finally Yield Clues to How Brains Learn”</a>  (short overview of biologically-plausible backprop: feedback alignment,  target propagation, predictive coding, & attentional feedback; also  of recent interest, <a href=\"https://arxiv.org/abs/2012.14905\" title=\"'VS-ML: Meta Learning Backpropagation And Improving It', Kirsch & Schmidhuber 2021\">VS-ML</a>;  given their increasing success in training while respecting more  biological constraints, the increasing power of backprop-trained ANNs  and the neurological success of ANNs in predicting & imitating brain  signals, it is increasingly clear that brains <em>really do</em> do backprop in some sense)</p></li><li><p><a href=\"https://www.biorxiv.org/content/10.1101/2021.02.22.432340v1\">“NSD: A massive 7-tesla fMRI dataset to bridge cognitive and computational neuroscience”</a>,  Jean et al 2021 (“…The availability of NSD thus opens the door to using  brain activity to directly guide the optimization of deep neural  networks.”)</p></li><li><p><a href=\"https://www.biorxiv.org/content/10.1101/2021.02.02.429430v1\">“Brain2Pix: Fully convolutional naturalistic video reconstruction from brain activity”</a>, Le et al 2021 (reconstructing <em><a href=\"https://www.biorxiv.org/content/10.1101/687681v1.full\" title=\"'A large single-participant fMRI dataset for probing brain responses to naturalistic stimuli in space and time', Seeliger et al 2019\">Dr. Who</a></em>)</p></li><li><p><a href=\"https://www.biorxiv.org/content/10.1101/2020.07.01.183384v1.full\">“High-performance brain-to-text communication via imagined handwriting”</a>, Willett et al 2020</p></li><li><p><a href=\"https://www.gwern.net/docs/rl/2021-spape.pdf\">“Brain-computer interface for generating personally attractive images”</a>, Spape et al 2021 (many ways to improve this…)</p></li></ul></li></ul><p><a href=\"https://old.reddit.com/r/mlscaling/\">Matters Of Scale</a>:</p><ul><li><p><a href=\"https://arxiv.org/abs/2102.01293#openai\">“Scaling Laws for Transfer”</a>,  Hernandez et al 2021 (“We find that pre-training effectively multiplies  the fine-tuning dataset size”; a shot across the bow of anyone floating  on a proprietary-dataset moat: large models can drop data requirements  by orders of magnitude overnight, even surpassing you)</p></li><li><p><a href=\"https://arxiv.org/abs/2102.05918#google\">“ALIGN: Scaling Up Visual and Vision-Language Representation Learning With Noisy Text Supervision”</a>, Jia et al 2021 (see also <a href=\"https://arxiv.org/abs/2102.08981#google\" title=\"'Conceptual 12M: Pushing Web-Scale Image-Text Pre-Training To Recognize Long-Tail Visual Concepts', Changpinyo et al 2021\">CC-12M</a>; <a href=\"https://openai.com/blog/clip/\">CLIP</a>-like w/EfficientNet trained on 1.8 billion images on a TPUv3-1024—<a href=\"https://arxiv.org/abs/2102.00529#deepmind\" title=\"'Decoupling the Role of Data, Attention, and Losses in Multimodal Transformers', Hendricks et al 2021\">DM</a> argues that fancier cross-modal Transformers are better, nevertheless, <a href=\"http://www.incompleteideas.net/IncIdeas/BitterLesson.html\">‘TPUs go brrr’</a>. Given DALL·E, CLIP, ALIGN, <a href=\"https://arxiv.org/abs/2011.10650#openai\" title=\"'VDVAE: Very Deep VAEs Generalize Autoregressive Models and Can Outperform Them on Images', Child 2020\">VDVAE</a>, <a href=\"https://arxiv.org/abs/2102.09532\" title=\"'Clockwork Variational Autoencoders', Saxena et al 2021\">CW-VAE</a>, <a href=\"https://arxiv.org/abs/2102.12037\" title=\"'AIPO: Image Completion via Inference in Deep Generative Models', Harvey et al 2021\">AIPO</a>  et al, are GANs already dead, and just don’t realize it yet? Or at  least soon to be relegated to only DRL-like uses as a final finetuning  phase to sharpen up a self-supervised model?); <a href=\"https://arxiv.org/abs/2103.06561\">“WenLan: Bridging Vision and Language by Large-Scale Multi-Modal Pre-Training”</a>, Huo et al 2021</p></li><li><p><a href=\"https://arxiv.org/abs/2102.12092#openai\">“DALL·E: Zero-Shot Text-to-Image Generation”</a>, Ramesh et al 2021 (<a href=\"https://openai.com/blog/dall-e/\">original blog</a>); <a href=\"https://arxiv.org/abs/2103.00823#alibaba\">“M6: A Chinese Multimodal Pretrainer”</a>,  Lin et al 2021 (Chinese DALL·E: 1.9TB images/0.29TB text for  10b-parameter dense/100b-parameter MoE Transformer; shockingly fast Chinese replication of DALL·E/CLIP)</p></li><li><p><a href=\"https://arxiv.org/abs/2102.06701#google\">“Explaining Neural Scaling Laws”</a>, Bahri et al 2021/<a href=\"https://arxiv.org/abs/2102.04074#deepmind\">“Learning Curve Theory”</a>, Hutter 2021 (<a href=\"https://www.lesswrong.com/posts/Yt5wAXMc7D2zLpQqx/an-140-theoretical-models-that-predict-scaling-laws#HIGHLIGHTS\">Rohin Shah commentary</a>; more on the manifold hypothesis)</p></li></ul><h2>2.2 Genetics</h2><p>Everything Is Heritable:</p><ul><li><p><a href=\"https://www.nature.com/articles/s41467-021-21283-4\">“Phenotypic covariance across the entire spectrum of relatedness for 86 billion pairs of individuals”</a>, Kemper et al 2021</p></li><li><p><a href=\"https://www.nature.com/articles/s41380-021-01027-y\">“Genetic variation, brain, and intelligence differences”</a>, Deary et al 2021</p></li><li><p><a href=\"https://www.biorxiv.org/content/10.1101/2021.02.10.430571v1\">“Pathfinder: A gamified measure to integrate general cognitive ability into the biological, medical and behavioural sciences”</a>, Malanchini et al 2021 (not the focus, but the IQ PGS is a slight improvement over <a href=\"https://www.biorxiv.org/content/early/2018/09/17/418210\" title=\"Genomic prediction of cognitive traits in childhood and adolescence\">Allegrini et al 2018</a> due to less phenotype measurement error?)</p></li><li><p><a href=\"https://www.nature.com/articles/s41380-021-01026-z\">“Polygenic  burden has broader impact on health, cognition, and socioeconomic  outcomes than most rare and high-risk copy number variants”</a>, Saarentaus et al 2021</p></li><li><p><a href=\"http://www.scielo.br/scielo.php?script=sci_arttext&pid=S1516-44462021005006201\" title=\"'Ditching candidate gene association studies: lessons from psychiatric genetics', Duarte et al 2021\">On candidate-genes & COMT</a></p></li></ul><p>Recent Evolution:</p><ul><li><p><a href=\"https://www.nytimes.com/2021/02/17/science/DNA-mammoth.html\">“Million-Year-Old  DNA Rewrites the Mammoth Family Tree: Genomic data—the oldest ever  recovered from a fossil—reveals the origin and evolution of the  Columbian mammoth”</a></p></li><li><p><a href=\"https://www.pnas.org/content/118/6/e2016046118\">“Kin selection explains the evolution of cooperation in the gut microbiota”</a>, Simonet & McNally 2021</p></li></ul><p>Engineering:</p><ul><li><p><a href=\"https://www.nytimes.com/2021/02/18/science/black-footed-ferret-clone.html\" title=\"\"Meet Elizabeth Ann, the First Cloned Black-Footed Ferret: Her birth represents the first cloning of an endangered species native to North America, and may bring needed genetic diversity to the species\"\">First Black-Footed Ferret cloned</a></p></li></ul><h2>2.3 Statistics/Meta-Science</h2><ul><li><p><a href=\"https://www.lesswrong.com/posts/9YDk52NPrfq7nqLvd/lessons-from-the-book-of-my-life\">“Lessons from Gerolamo Cardano’s </a><em><a href=\"https://www.lesswrong.com/posts/9YDk52NPrfq7nqLvd/lessons-from-the-book-of-my-life\">The Book of My Life</a></em><a href=\"https://www.lesswrong.com/posts/9YDk52NPrfq7nqLvd/lessons-from-the-book-of-my-life\">”</a> (progress studies; see also <a href=\"https://www.gwern.net/Newton\">Newton’s anthropic argument</a>, <a href=\"https://www.gwern.net/Bakewell\">Bakewell & inventing progress</a>, <em><a href=\"https://www.gwern.net/Book-reviews#the-autobiography-of-benvenuto-cellini-cellini-1999\">The Autobiography of Benvenuto Cellini</a></em>)</p></li><li><p><a href=\"https://www.wired.com/story/group-house-covid-risk-points/\">“How Many Microcovids Would You Spend on a Burrito?”</a> (on the <a href=\"https://www.microcovid.org/\">microCOVID Project Calculator</a>)</p></li><li><p><a href=\"https://www.gwern.net/docs/math/1968-hammersley.pdf\">“On  the enfeeblement of mathematical skills by ‘Modern Mathematics’ and by  similar soft intellectual trash in schools and universities”</a>, Hammersley 1968 (<a href=\"https://www.gwern.net/docs/math/1973-knuth.pdf\" title=\"The Dangers of Computer--Science Theory\">Knuth</a> highlights as also amusing: <a href=\"https://www.gwern.net/docs/math/1967-austin.pdf\">“A Note on Piffles”</a>, Smith 1967; <a href=\"https://www.gwern.net/docs/math/1980-farlow.pdf\">“A rebuke of A. B. Smith’s paper, ‘A Note on Piffles’”</a>, Farlow 1980)</p></li><li><p><a href=\"https://www.gwern.net/docs/statistics/bias/2011-tatum.pdf\">“Artifact and Recording Concepts in EEG”</a>, Tatum et al 2011 (on the <a href=\"https://en.wikipedia.org/wiki/Electroencephalography\">EEG</a> signals of <a href=\"https://en.wikipedia.org/wiki/Jell-O\">Jell-O</a>, or, the importance of <a href=\"https://en.wikipedia.org/wiki/Scientific_control#Negative\">negative controls</a>)</p></li></ul><h2>2.4 Politics/Religion</h2><ul><li><p><a href=\"https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0032541\">“The Logic of Fashion Cycles”</a>, Acerbi et al 2012; <a href=\"https://royalsocietypublishing.org/doi/10.1098/rsif.2018.0731\">“Fashion and art cycles are driven by counter-dominance signals of elite competition: quantitative evidence from music styles”</a>, Klimek et al 2019; <a href=\"https://arxiv.org/abs/1410.8001\">“The hipster effect: When anti-conformists all look the same”</a>, Touboul 2019; <a href=\"https://slatestarcodex.com/2014/04/22/right-is-the-new-left\">“Right Is The New Left”</a>, Scott Alexander (see also <a href=\"https://www.gwern.net/docs/culture/2010-han.pdf\" title=\"Signaling Status with Luxury Goods: The Role of Brand Prominence\">Han et al 2010</a>, <a href=\"https://www.gwern.net/docs/sociology/1972-downs.pdf\" title=\"Up and down with ecology---the 'issue-attention cycle'\">Downs 1972</a>/<a href=\"https://www.gwern.net/docs/sociology/2015-gupta.pdf\" title=\"On Anthony Downs's 'Up and Down with Ecology: The \"Issue-Attention\" Cycle'\">Gupta & Jenkins-Smith 2015</a>, <a href=\"https://www.nature.com/articles/s41467-019-09311-w\" title=\"Accelerating dynamics of collective attention\">Lorenz-Spreen et al 2019</a>/<a href=\"https://www.gwern.net/docs/culture/2019-candia.pdf\" title=\"The universal decay of collective memory and attention\">Candia et al 2019</a>, <a href=\"https://www.gwern.net/docs/sociology/1994-loury.pdf\" title=\"Self-Censorship in Public Discourse: A Theory of 'Political Correctness' and Related Phenomena\">Loury 1994</a>)</p></li><li><p><a href=\"https://aeon.co/essays/what-can-we-learn-from-the-lunar-pandemic-that-never-was\">“What can we learn from the lunar pandemic that never was?”</a>  (NASA’s lunar quarantine was a sham intended to mollify the public as  they covered up repeated major failures & lab leaks both before  & after—had there been any dangerous lunar organisms, they would  have escaped easily)</p></li><li><p><a href=\"https://en.wikipedia.org/wiki/MrBeast\">MrBeast</a> (the new aristocracy of <a href=\"https://meltingasphalt.com/social-status-down-the-rabbit-hole/\">prestige</a>? Borrowed plumage, perhaps, but effective…)</p></li><li><p><a href=\"https://www.cell.com/current-biology/fulltext/S0960-9822(17)30949-1\">“Russia’s new Lysenkoism”</a>, Kolchinsky et al 2017</p></li></ul><h2>2.5 Psychology/Biology</h2><ul><li><p><strong><a href=\"https://en.wikipedia.org/wiki/Semaglutide\">Semaglutide</a></strong>: <a href=\"https://www.gwern.net/docs/longevity/2021-wilding.pdf\">“Once-Weekly Semaglutide in Adults with Overweight or Obesity”</a>, Wilding et al 2021; <a href=\"https://www.gwern.net/docs/longevity/2021-wadden.pdf\">“Effect  of Subcutaneous Semaglutide vs Placebo as an Adjunct to Intensive  Behavioral Therapy on Body Weight in Adults With Overweight or Obesity:  The STEP 3 Randomized Clinical Trial”</a>, Wadden et al 2021</p><p>A longer-acting version of the insulin/appetite peptide <a href=\"https://en.wikipedia.org/wiki/Liraglutide\">liraglutide</a>, semaglutide greatly reduces weight, fat, blood sugar, cholesterol etc, with an <a href=\"https://link.springer.com/article/10.1007/s40262-018-0728-4\" title=\"'Safety and pharmacokinetics of single and multiple ascending doses of the novel oral human GLP-1 analogue, oral semaglutide, in healthy subjects and subjects with type 2 diabetes', Granhall et al 2019\">upcoming oral version</a>; background: <a href=\"https://www.gwern.net/docs/longevity/2020-kushner.pdf\" title=\"Semaglutide 2.4 mg for the Treatment of Obesity: Key Elements of the STEP Trials 1 to 5\">Kushner et al 2020</a>, <a href=\"https://www.gwern.net/docs/longevity/2019-aroda.pdf\" title=\"Comparative efficacy, safety, and cardiovascular outcomes with once-weekly subcutaneous semaglutide in the treatment of type 2 diabetes: Insights from the SUSTAIN 1--7 trials\">Aroda et al 2019</a>, <a href=\"https://www.gwern.net/docs/longevity/2019-nauck.pdf\" title=\"Management Of Endocrine Disease: Are all GLP-1 agonists equal in the treatment of type 2 diabetes?\">Nauck & Meier 2019</a>, <a href=\"https://www.gwern.net/docs/longevity/2018-oneil.pdf\" title=\"Efficacy and safety of semaglutide compared with liraglutide and placebo for weight loss in patients with obesity: a randomized, double-blind, placebo and active controlled, dose-ranging, phase 2 trial\">O’Neil et al 2018</a>, <a href=\"https://www.gwern.net/docs/longevity/2017-blundell.pdf\" title=\"Effects of once-weekly semaglutide on appetite, energy intake, control of eating, food preference and body weight in subjects with obesity\">Blundell et al 2017</a>, <a href=\"https://www.gwern.net/docs/longevity/2016-nauck.pdf\" title=\"A Phase 2, Randomized, Dose-Finding Study of the Novel Once-Weekly Human GLP-1 Analog, Semaglutide, Compared With Placebo and Open-Label Liraglutide in Patients With Type 2 Diabetes\">Nauck et al 2016</a>, <a href=\"https://www.gwern.net/docs/longevity/2015-lau.pdf\" title=\"Discovery of the Once-Weekly Glucagon-Like Peptide-1 (GLP-1) Analogue Semaglutide\">Lau et al 2015</a>.</p></li><li><p><a href=\"https://www.gwern.net/docs/biology/2020-irving.pdf\">“Lessons from the host defences of bats, a unique viral reservoir”</a>, Irving et al 2021 (<a href=\"https://en.wikipedia.org/wiki/Bat-borne_virus\">bat-borne viruses</a>; previously, <a href=\"https://get21stnight.com/2020/03/30/why-do-we-keep-getting-diseases-from-bats/\">Trevor Klee</a>)</p></li><li><p><a href=\"https://www.frontiersin.org/articles/10.3389/fcell.2021.628157/full\">“Beneficial  & Detrimental Effects of Reactive Oxygen Species on Lifespan: A  Comprehensive Review of Comparative & Experimental Studies”</a>,  Shields et al 2021 (antioxidants still aren’t the fountain of youth, and  may be harmful; animal studies still frequently inconsistent)</p></li><li><p><a href=\"https://www.nature.com/articles/s41598-021-81446-7\">“Positive expectations predict improved mental-health outcomes linked to psychedelic microdosing”</a>, Kaertner et al 2021 (placebo)</p></li><li><p><a href=\"https://www.gwern.net/docs/iq/2021-aggeborn.pdf\">“The Effects of Fluoride in Drinking Water”</a>, Aggeborn & Öhman 2021</p></li><li><p><a href=\"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1978350/\">“Sleep  & Sex: What Can Go Wrong? A Review of the Literature on Sleep  Related Disorders and Abnormal Sexual Behaviors & Experiences”</a>, Schenck et al 2007</p></li></ul><h2>2.6 Technology</h2><ul><li><p><a href=\"https://www.xprize.org/prizes/elonmusk\">New X-Prize: $100m in prizes for Carbon Removal</a></p></li><li><p><a href=\"https://en.wikipedia.org/wiki/Gauge_block\">Wringing gauge blocks</a>  (“With their precisely-flat metal faces, gauge blocks can be stuck  together non-magnetically via a process calling ‘wringing’, requiring  substantial effort to separate. Scientists are still uncertain exactly  how wringing works.”)</p></li><li><p><a href=\"https://en.wikipedia.org/wiki/Armoured_train\">Armored train</a></p></li></ul><h2>2.7 Economics</h2><ul><li><p><a href=\"https://ourworldindata.org/cheap-renewables-growth\">“Why did renewables become so cheap so fast? And what can we do to use this global opportunity for green growth?”</a>, Max Roser (specifically, why such an extreme <a href=\"https://en.wikipedia.org/wiki/Experience_curve_effects\">experience curve</a>?)</p></li><li><p><a href=\"https://www.gwern.net/docs/iq/2012-grinblatt.pdf\">“IQ, trading behavior, and performance”</a>, Grinblatt et al 2012; <a href=\"https://www.gwern.net/docs/economics/2020-barth.pdf\">“Genetic Endowments and Wealth Inequality”</a>,  Barth et al 2020 (why, despite notorious setbacks, did Isaac Newton  & LTCM’s founders die wealthy? Why, in general, are more intelligent  people so much better investors? ‘The indifference of the indicator’:  it’s not one thing, it’s everything—more intelligent people have lower  discount rates, save more for longer & are less risk-averse, more  accurately predict future growth or inflation, are more likely to  participate in +EV opportunities like the stock market, to use low-fee  rather than high-fee (and thus, underperforming) mutual funds, succumb  less to biases like herding as they trade better & at better times,  trade less, and harvest losses more efficiently when trading poorly.)</p></li></ul><h2>2.8 Philosophy</h2><ul><li><p>Are <strong>ethics experts more ethical</strong>? <a href=\"https://www.gwern.net/docs/philo/2016-schwitzgebel.pdf\">“The Behavior of Ethicists”</a>, Schwitzgebel & Rust 2016 (most recently: <a href=\"https://www.gwern.net/docs/philo/2019-schonegger.pdf\">“The moral behavior of ethics professors: A replication-extension in German-speaking countries”</a>,  Schönegger et al 2019; given moral licensing & activism, perhaps we  should be surprised we don’t hear about more ethicists doing things  like posting enemy lists or trying to dox reviewers. “Woe to you  Pharisees!”)</p></li><li><p><a href=\"https://psyarxiv.com/quwgr\">“Meta-analysis on belief in free will manipulations”</a>, Genschow et al 2021 (another noble lie turns out to be ignoble)</p></li><li><p><a href=\"https://en.wikipedia.org/wiki/Cooperative_principle\">Gricean maxims of communication</a></p></li></ul><h2>2.9 Fiction</h2><ul><li><p><em><a href=\"https://en.wikipedia.org/wiki/Bunnies_%26_Burrows\">Bunnies & Burrows</a></em></p></li></ul><h2>2.10 Miscellaneous</h2><ul><li><p><a href=\"https://www.gwern.net/docs/history/1995-pop.pdf\">“Caesar Lives”</a>, <a href=\"https://en.wikipedia.org/wiki/Iggy_Pop\">Iggy Pop</a> 1995 (on <a href=\"https://en.wikipedia.org/wiki/The_History_of_the_Decline_and_Fall_of_the_Roman_Empire\">Gibbon</a>)</p></li><li><p><a href=\"https://en.wikipedia.org/wiki/Grayanotoxin#Mad_honey_intoxication\">Mad honey</a></p></li><li><p><a href=\"https://en.wikipedia.org/wiki/Imperial_Court_System\">Imperial Court System</a></p></li></ul>",
      "image": {
        "url": "https://bucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com/public/images/ef890e58-1193-4984-a1a5-8aca6141b85d_1108x691.png",
        "title": null
      },
      "media": [
        {
          "url": "https://bucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com/public/images/ef890e58-1193-4984-a1a5-8aca6141b85d_1108x691.png",
          "image": "https://bucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com/public/images/ef890e58-1193-4984-a1a5-8aca6141b85d_1108x691.png",
          "title": null,
          "length": 0,
          "type": "image",
          "mimeType": "image/jpeg"
        }
      ],
      "authors": [
        {
          "name": "gwern",
          "email": null,
          "url": null
        }
      ],
      "categories": []
    },
    {
      "id": "https://gwern.substack.com/p/jan-2021-gwernnet-newsletter",
      "title": "Jan 2021 Gwern.net Newsletter",
      "description": "January 2021 gwern.net newsletter with links on AI scaling up and down.",
      "url": "https://gwern.substack.com/p/jan-2021-gwernnet-newsletter",
      "published": "2021-02-04T20:23:01.000Z",
      "updated": "2021-02-04T20:23:01.000Z",
      "content": "<p>January 2021’s <a href=\"https://www.gwern.net/newsletter/2021/01\">Gwern.net</a> <a href=\"https://gwern.substack.com\">newsletter</a> is now out; previous, <a href=\"https://www.gwern.net/newsletter/2020/12\">December 2020</a> (<a href=\"https://www.gwern.net/tags/newsletter\">archives</a>). This is a summary of the revision-history RSS feed, overlapping with my <a href=\"https://www.gwern.net/Changelog\">Changelog</a> & /r/gwern; brought to you by my donors on <a href=\"https://www.patreon.com/gwern\">Patreon</a>.</p><h1>1 Writings</h1><ul><li><p><a href=\"https://www.gwern.net/Danbooru2020\" title=\"Danbooru2020 is a large-scale anime image database with 4.2m+ images annotated with 130m+ tags; it can be useful for machine learning purposes such as image recognition and generation.\">“Danbooru2020: A Large-Scale Crowdsourced and Tagged Anime Illustration Dataset”</a></p></li><li><p><a href=\"https://thisanimedoesnotexist.ai/\">This Anime Does Not Exist.ai (TADNE)</a> (<a href=\"https://www.gwern.net/Faces#extended-stylegan2-danbooru2019-aydao\">discussion</a>)</p></li><li><p><strong>Gwern.net</strong>: +return-to-top floating button; <em>popups</em>:  can now be disabled (use the ‘gear’ icon); final reimplementation  (dynamic JS now; memoizing the recursive inlining, however clever &  elegant, turns out to have painful edge-cases & still not be  efficient enough—web browsers <em>really</em> don’t like loading hundreds of kilobytes of extra HTML)</p></li></ul><h1>2 Links</h1><h2>2.1 AI</h2><p><a href=\"https://old.reddit.com/r/mlscaling/\">Matters Of Scale</a>:</p><ul><li><p><strong>Scaling up</strong>:</p><ul><li><p><a href=\"https://openai.com/blog/dall-e/\">“DALL·E: Creating Images from Text”</a>, OpenAI (GPT-3-12.5b generating 1280 tokens → <a href=\"https://arxiv.org/abs/1906.00446#deepmind\" title=\"'Generating Diverse High-Fidelity Images with VQ-VAE-2', Razavi et al 2019\">VQ-VAE</a> pixels; generates illustration & photos); <a href=\"https://openai.com/blog/clip/\">“CLIP (Contrastive Language-Image Pre-training): Connecting Text and Images”</a>, OpenAI (<a href=\"https://cdn.openai.com/papers/Learning_Transferable_Visual_Models_From_Natural_Language_Supervision.pdf\" title=\"Learning Transferable Visual Models From Natural Language Supervision\">Radford et al 2021</a>: zero-shot image understanding via text description—useful for much more than just ranking DALL·E samples by quality)</p><p>Further <a href=\"https://www.gwern.net/newsletter/2020/05#blessings-of-scale\">blessings of scale</a>: simple <a href=\"https://arxiv.org/abs/2010.05113\" title=\"'Contrastive Representation Learning: A Framework and Review', Le-Khac et al 2020\">contrastive</a> training on <em>n</em>  = 400m leads to remarkable generalization & combinatorial  flexibility of image generation by DALL·E, and CLIP learns to reach  image classification SOTA by zero-shot on many datasets, with more  human-like errors & less degradation out of samples than rivals,  while costing the same to train. OpenAI released their smallest CLIP  model (the “<a href=\"https://openreview.net/forum?id=YicbFdNTTy#google\" title=\"Vision Transformer (ViT): An Image is Worth 16×16 Words: Transformers for Image Recognition at Scale\">ViT</a>-B/32”-equivalent)  and people are discovering it seems able to do just about anything  without any further training—the paper notes that it does everything  from “fine-grained object classification, geo-localization, action  recognition in videos, and OCR”, but there’s so much more, and you can  use it to generate image captions/descriptions, classify your anime  images, pull a specific target image description by gradient ascent or  out of another neural network such as an ImageNet <a href=\"https://arxiv.org/abs/1809.11096#deepmind\" title=\"'BigGAN: Large Scale GAN Training for High Fidelity Natural Image Synthesis', Brock et al 2018\">BigGAN</a>  or TADNE StyleGAN2-ext (or, why not, synthesize images images embodying  abstract concepts like emoji or words like “nightmare fuel” or  “confusion”!), search your image datasets by embedding, find mislabeled  images (eg by <a href=\"https://twitter.com/quasimondo/status/1351191660059832320\">using “upside down” as the prompt</a>)…  One wonders, like GPT-3, how much better the largest CLIP  (“L/14-336px”) is and how many ways of using it (or DALL·E) remain to be  found? And why prediction losses work so well in one place, but then  contrastive elsewhere?</p><p>For perspective: there are newly-minted PhDs going on the job market who got excited about deep learning because of these new <a href=\"https://arxiv.org/abs/1512.03385\" title=\"'Deep Residual Learning for Image Recognition', He et al 2015\">“resnet”</a> things; undergrads who applied to grad school because <a href=\"https://arxiv.org/abs/1810.04805#google\" title=\"'BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding', Devlin et al 2018\">BERT</a>  et al were blowing open NLP & extending neural supremacy to natural  language would not yet have passed quals; and it has been only 1  academic semester since <a href=\"https://arxiv.org/abs/2005.14165#openai\" title=\"'GPT-3: Language Models are Few-Shot Learners', Brown et al 2020\">GPT-3</a> was announced. Or to put it quantitatively, for just sequence modeling: it has been 8,478 days since <a href=\"https://www.gwern.net/docs/ai/1997-hochreiter.pdf\" title=\"'Long Short-Term Memory', Hochreiter & Schmidhuber 1997\">LSTM</a> RNNs were published; 3,045 days since <a href=\"https://papers.nips.cc/paper/2012/file/c399862d3b9d6b76c8436e924a68c45b-Paper.pdf\" title=\"'ImageNet Classification with Deep Convolutional Neural Networks', Krizhevsky et al 2012\">AlexNet’s</a> ImageNet scores were released; 1,880 days since residual networks were published in a paper; 1,330 days since <a href=\"https://arxiv.org/abs/1706.03762#google\" title=\"Vaswani et al 2017\">“Attention Is All You Need”</a> hit Arxiv; 844 days since BERT’s paper was published; 718 days since <a href=\"https://openai.com/blog/better-language-models/\" title=\"'Better Language Models and Their Implications', OpenAI 2019\">GPT-2</a> was announced; 353 days since <a href=\"https://arxiv.org/abs/2002.05709#google\" title=\"'A Simple Framework for Contrastive Learning of Visual Representations', Chen et al 2020\">SimCLR</a>, and 249 days since GPT-3 was; and 27 days since CLIP/DALL·E.^1^ <a href=\"https://jetpress.org/volume1/moravec.htm\" title=\"'When will computer hardware match the human brain?', Moravec 1998\">Spring is coming.</a> (Some still insist we need not worry about “overpopulation on Mars” for >18,264 more days…)</p></li><li><p><a href=\"https://arxiv.org/abs/2003.10580#google\">“Meta Pseudo Labels”</a>, Pham et al 2020 (90% on ImageNet by pretraining a meta-learning teacher using JFT-300M on a TPUv3-2048)</p></li><li><p><a href=\"https://arxiv.org/abs/2101.03961#google\">“Switch Transformers: Scaling to Trillion Parameter Models with Simple and Efficient Sparsity”</a>, Fedus et al 2021 (1.57t-parameter <a href=\"https://arxiv.org/abs/2006.16668#google\" title=\"'GShard: Scaling Giant Models with Conditional Computation and Automatic Sharding', Lepikhin et al 2020\">GShard</a> followup; the mixture-of-experts approach, while scaling stably, starts showing its limits)</p></li></ul></li><li><p><strong>Scaling down</strong>:</p><ul><li><p><a href=\"https://arxiv.org/abs/2012.12877#facebook\">“DeiT: Training data-efficient image transformers & distillation through attention”</a>, Touvron et al 2020 (scaling Transformer classifiers down to ImageNet+1-GPU); <a href=\"https://arxiv.org/abs/2101.11605#google\">“BoTNet: Bottleneck Transformers for Visual Recognition”</a>, Srinivas et al 2021/<a href=\"https://arxiv.org/abs/2101.11986\">“Tokens-to-Token ViT: Training Vision Transformers from Scratch on ImageNet”</a>, Yuan et al 2021 (hybrids); <a href=\"https://arxiv.org/abs/2009.04433\">“not-so-BigGAN: Generating High-Fidelity Images on Small Compute with Wavelet-based Super-Resolution”</a>, Han et al 2020/<a href=\"https://compvis.github.io/taming-transformers/\">“VQGAN: Taming Transformers for High-Resolution Image Synthesis”</a>, Esser et al 2020 (training >1024px Transformer GANs on just 2 GPUs)</p><p>Transformer supremacy in image-related tasks continues, and GANs  are becoming increasingly hybridized. Do pure-GANs have a future, now  that VAEs and autoregressive models are making such inroads into both  the highest-quality & lowest-compute sample generation? To take the  GAN/DRL analogy seriously, perhaps they were they ultimately a dead end,  akin to trying to learn everything from rewards, and an adversarial GAN  loss ought to be only <a href=\"https://www.gwern.net/images/ai/2019-lecun-isscctalk-cake.png\">the cherry on the cake</a> of a large unsupervised/semi-supervised generative model.</p></li><li><p><a href=\"https://arxiv.org/abs/2101.06840#microsoft\">“ZeRO-Offload: Democratizing Billion-Scale Model Training”</a>, Ren et al 2021 (partial CPU training for 13b-parameter models on 1 V100 GPU, scaling to 128 GPUs)</p></li><li><p><a href=\"https://arxiv.org/abs/2101.00190\">“Prefix-Tuning: Optimizing Continuous Prompts for Generation”</a>, Li & Liang 2021 (could the <a href=\"https://arxiv.org/abs/2009.07118\" title=\"'It's Not Just Size That Matters: Small Language Models Are Also Few-Shot Learners', Schick & Schütze et al 2020\">PET</a>  & CLIP trick of averaging multiple embeddings to yield much better  performance be reused for GPT-3 prompts to greatly improve prompting?  The fact that the prefix-tuning, by directly optimizing the prompt  embeddings, yields better performance than even single optimized text  prompts, suggests so. The user could provide 3 or 4 similar prompts, and  synthesize them into a single super-prompt to better program GPT-3…)</p></li><li><p><a href=\"https://greydanus.github.io/2020/12/01/scaling-down/\">“Scaling down Deep Learning”</a>,  Greydanus 2020 (cute: parametric simplified-MNIST for rapid iteration  on tiny NNs: experiments in lottery-ticket & meta-learning of  LRs/activations)</p></li><li><p><a href=\"https://cp4space.hatsya.com/2021/01/08/the-neural-network-of-the-stockfish-chess-engine/\">“The neural network of the Stockfish chess engine”</a> (very lightweight NN designed for incremental recomputation over changing board states)</p></li></ul></li><li><p><a href=\"https://arxiv.org/abs/2101.01169\">“Transformers in Vision: A Survey”</a>, Khan et al 2021</p></li><li><p><a href=\"https://openai.com/blog/organizational-update/\">OpenAI departures</a>:  Dario Amodei, Sam McCandlish, Tom Brown, Tom Henighan, Chris Olah, Jack  Clark, Ben Mann, Paul Christiano et al leave—most for an unspecified  new entity (<a href=\"https://steveblank.com/2009/12/21/the-elves-leave-middle-earth-%E2%80%93-soda%E2%80%99s-are-no-longer-free/\">“the elves leave Middle Earth”</a>?)</p></li></ul><p>And the rest:</p><ul><li><p><a href=\"https://www.lesswrong.com/posts/pTYDdcag9pTzFQ7vw/2020-ai-alignment-literature-review-and-charity-comparison\">“2020 AI Alignment Literature Review and Charity Comparison”</a>, Larks</p></li><li><p><a href=\"https://arxiv.org/abs/2009.01719#deepmind\">“Grounded Language Learning Fast and Slow”</a>, Hill et al 2020</p></li><li><p><a href=\"https://arxiv.org/abs/2006.03654#microsoft\">“DeBERTa: Decoding-enhanced BERT with Disentangled Attention”</a>, He et al 2020 (<a href=\"https://arxiv.org/abs/1905.00537\" title=\"'SuperGLUE: A Stickier Benchmark for General-Purpose Language Understanding Systems', Wang et al 2019\">SuperGLUE</a> falls)</p></li><li><p><a href=\"https://arxiv.org/abs/2012.13349#deepmind\">“Solving Mixed Integer Programs Using Neural Networks”</a>, Nair et al 2020</p></li><li><p><a href=\"https://arxiv.org/abs/2012.14271\">“Towards Fully Automated Manga Translation”</a>, Hinami et al 2020</p></li><li><p><a href=\"https://arxiv.org/abs/2101.08001#baidu\">“UPDeT: Universal Multi-agent Reinforcement Learning via Policy Decoupling with Transformers”</a>, Hu et al 2021</p></li><li><p><a href=\"https://arxiv.org/abs/2012.07975#bair\">“FERM: A Framework for Efficient Robotic Manipulation”</a>, Zhan et al 2021 (contrastive semi-supervised learning + data augmentation for sample-efficiency)</p></li><li><p><a href=\"https://arxiv.org/abs/2101.04702#google\">“XMC-GAN: Cross-Modal Contrastive Learning for Text-to-Image Generation”</a>, Zhang et al 2021</p></li></ul><h2>2.2 Genetics</h2><p>Everything Is Heritable:</p><ul><li><p><a href=\"https://www.nature.com/articles/s41539-020-00079-z\">“Nurture might be nature: cautionary tales and proposed solutions”</a>, Hart et al 2021</p></li><li><p><a href=\"https://www.sciencedirect.com/science/article/pii/S1755296620300624\">“A genetic perspective on the association between exercise and mental health in the era of genome-wide association studies”</a>, de Geus 2020; <a href=\"https://www.gwern.net/docs/genetics/correlation/2020-schnurr.pdf\">“Evidence for shared genetics between physical activity, sedentary behaviour and adiposity-related traits”</a>, Schnurr et al 2020</p></li><li><p><a href=\"https://www.medrxiv.org/content/10.1101/2020.12.11.20245035v1\">“Antidepressant Response in Major Depressive Disorder: A Genome-wide Association Study”</a>, Pain et al 2020</p></li><li><p><a href=\"https://www.biorxiv.org/content/10.1101/2020.04.03.024554v3\">“Genome wide analysis of gene dosage in 24,092 individuals shows that 10,000 genes modulate cognitive ability”</a>, Huguet et al 2020 (yep, still polygenic)</p></li><li><p><a href=\"https://www.biorxiv.org/content/10.1101/2020.04.20.051631v2\">“GWAS of three molecular traits highlights core genes and pathways alongside a highly polygenic background”</a>, Sinnott-Armstrong et al 2021</p></li><li><p><a href=\"https://www.biorxiv.org/content/10.1101/2021.01.08.425895v1\">“Genome-scale sequencing and analysis of human, wolf and bison DNA from 25,000 year-old sediment”</a>, Gelabert et al 2021 (incredible this is possible)</p></li><li><p><a href=\"https://www.medrxiv.org/content/10.1101/2021.01.25.21249961v1\">“Disentangling  sex differences in the shared genetic architecture of PTSD, traumatic  experiences, and social support with body size and composition”</a>, Carvalho et al 2021 (<a href=\"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6684375/\" title=\"'Distinguishing genetic correlation from causation across 52 diseases and complex traits', O'Connor & Price 2018\">LCV</a>)</p></li></ul><p>Recent Evolution:</p><ul><li><p><a href=\"https://www.gwern.net/docs/genetics/selection/2021-pereira.pdf\">“African genetic diversity and adaptation inform a precision medicine agenda”</a>, Pereira et al 2021; <a href=\"https://www.nature.com/articles/s41576-020-00305-9\">“The influence of evolutionary history on human health and disease”</a>, Benton et al 2021; <a href=\"https://www.biorxiv.org/content/10.1101/2021.01.26.428314v1\">“Local adaptation and archaic introgression shape global diversity at human structural variant loci”</a>, Yan et al 2021</p></li><li><p><a href=\"https://www.biorxiv.org/content/10.1101/2020.07.19.211078v2\">“Genome scans of dog behavior implicate a gene network underlying psychopathology in mammals, including humans”</a>, Zapata et al 2021</p></li><li><p><a href=\"https://ideas.repec.org/p/uea/ueaeco/2021-02.html\">“Natural Selection in Contemporary Humans is Linked to Income and Substitution Effects”</a>, Hugh-Jones & Abdellaoui 2021</p></li><li><p><a href=\"https://elifesciences.org/articles/61644\">“The diversity and function of sourdough starter microbiomes”</a>, Landis et al 2021 (crowdsourced sourdough show little trace of geographic origins?)</p></li></ul><p>Engineering:</p><ul><li><p><a href=\"https://www.gwern.net/docs/genetics/editing/2021-koblan.pdf\">“In vivo base editing rescues Hutchinson-Gilford progeria syndrome in mice”</a>, Koblan et al 2021</p></li><li><p><a href=\"https://arxiv.org/abs/2101.05870\">“From Genotype to Phenotype: polygenic prediction of complex human traits”</a>, Raben et al 2021</p></li></ul><h2>2.3 Statistics/Meta-Science/Math</h2><ul><li><p><a href=\"https://arxiv.org/abs/2101.07884\">“The Quantum Field Theory on Which the Everyday World Supervenes”</a>,  Carroll 2021 (“…we have reason to be confident that the laws of physics  underlying the phenomena of everyday life are completely known” because  all unknown particles/fields are constrained to being extremely  rare/weak, eg by <a href=\"https://www.gwern.net/docs/science/2009-adelberger.pdf\" title=\"Torsion balance experiments: A low--energy frontier of particle physics\">Adelberger et al 2009</a>)</p></li><li><p><a href=\"https://www.biorxiv.org/content/10.1101/2020.12.10.419424v1\">“How accurate are citations of frequently cited papers in biomedical literature?”</a>, Pavlovic et al 2020 (includes original author’s evaluation of whether a citation of their work is correct)</p></li><li><p><a href=\"https://arxiv.org/abs/1605.08448\">“Energy-Efficient Algorithms”</a>, Demaine et al 2016 (<a href=\"https://en.wikipedia.org/wiki/Reversible_computing\">reversible computing</a> asymptotics: constant-factor <a href=\"https://en.wikipedia.org/wiki/Stack_(abstract_data_type)\">stacks</a>/<a href=\"https://en.wikipedia.org/wiki/Dynamic_array\">arrays</a>, 𝒪(log <em>n</em>) time/energy <a href=\"https://en.wikipedia.org/wiki/AVL_tree\">AVL trees</a>, 𝒪(<em>n</em>) space <a href=\"https://en.wikipedia.org/wiki/Comparison_sort\">sorts</a>, & various 𝒪(Vertex+Edge) time/space/energy <a href=\"https://en.wikipedia.org/wiki/Graph_traversal\">graph searches</a>)</p></li><li><p><a href=\"https://www.gwern.net/docs/statistics/decision/2006-smith.pdf\">“The Optimizer’s Curse: Skepticism and Postdecision Surprise in Decision Analysis”</a>,  Smith & Winkler 2006 (regression to the mean is everywhere; another  example of why Bayes & decision theory are two great flavors that  go great together)</p></li></ul><h2>2.4 Politics/Religion</h2><ul><li><p><a href=\"https://papers.ssrn.com/sol3/papers.cfm?abstract_id=3650704\">“The Mechanisms of Cult Production: An Overview”</a>, Xavier Marquez 2020 (see previously his <a href=\"https://www.gwern.net/newsletter/2019/02#abandoned-footnotes\">blog roundup</a>)</p></li><li><p><a href=\"https://www.gwern.net/docs/sociology/1999-dawson.pdf\">“When Prophecy Fails and Faith Persists: A Theoretical Overview”</a>, Dawson 1999</p></li><li><p><a href=\"https://www.overcomingbias.com/2020/11/why-we-fight-over-fiction.html\">“Why We Fight Over Fiction”</a>, Robin Hanson</p></li><li><p><a href=\"https://en.wikipedia.org/wiki/All-Woman_Supreme_Court\">The All-Woman Supreme Court</a></p></li></ul><h2>2.5 Psychology/Biology</h2><ul><li><p><a href=\"https://astralcodexten.substack.com/p/still-alive\">“Still Alive”</a>,  Scott Alexander (announcement of SSC return as Substack newsletter  ‘Astral Codex Ten’ & launching a low-cost psychiatry clinic ‘Lorien  Psychiatry’)</p></li><li><p><a href=\"https://www.biorxiv.org/content/10.1101/2020.09.08.287276v1\">“The Temporal Dynamics of Opportunity Costs: A Normative Account of Cognitive Fatigue and Boredom”</a>, Agrawal et al 2020</p></li><li><p><a href=\"https://onlinelibrary.wiley.com/doi/full/10.1002/hbm.25109\">“A unified framework for association and prediction from vertex-wise grey-matter structure”</a>, Couvy-Duchesne et al 2020 (more <a href=\"https://www.gwern.net/Questions#variance-components\">morphometricity</a>)</p></li><li><p><strong>Common phenomena</strong>: <a href=\"https://www.gwern.net/docs/psychology/2018-fassnidge.pdf\">“Sounds from seeing silent motion: Who hears them, and what looks loudest?”</a>, Fassnidge & Freeman 2018 (on ‘visual ear’; previously: <a href=\"https://www.sciencedirect.com/science/article/pii/S0960982208007343\" title=\"The sound of change: visually-induced auditory synaesthesia\">Saenz & Koch 2008</a>, <a href=\"https://www.gwern.net/docs/psychology/2017-fassnidge.pdf\" title=\"A deafening flash! Visual interference of auditory signal detection\">Fassnidge et al 2017</a>)</p></li><li><p><a href=\"https://online.ucpress.edu/collabra/article/7/1/18731/115925/Predicting-Mental-Health-From-Followed-Accounts-on\">“Predicting Mental Health From Followed Accounts on Twitter”</a>, Costelli et al 2021 (<a href=\"https://en.wikipedia.org/wiki/Preregistration_(science)#Registered_reports\">Registered Report</a>: who you choose to follow says a lot about you—<a href=\"https://www.gwern.net/Everything\">everything is correlated</a>)</p></li><li><p><a href=\"https://www.biorxiv.org/content/10.1101/2021.01.08.425841v1\">“No evidence for general intelligence in a fish”</a>, Aellen et al 2021</p></li><li><p><a href=\"https://en.wikipedia.org/wiki/Delirium_tremens\">Delirium tremens</a></p></li><li><p><a href=\"https://www.gwern.net/docs/biology/2021-asnicar.pdf\">“Microbiome connections with host metabolism and habitual diet from 1,098 deeply phenotyped individuals”</a>, Asnicar et al 2021</p></li><li><p><a href=\"https://www.biorxiv.org/content/10.1101/2021.01.18.426733v1\">“Universal DNA methylation age across mammalian tissues”</a>, Lu et al 2021; <a href=\"https://onlinelibrary.wiley.com/doi/full/10.1111/acel.13296\">“Whole-body senescent cell clearance alleviates age-related brain inflammation and cognitive impairment in mice”</a>, Ogrodnik et al 2021</p></li><li><p><a href=\"https://arxiv.org/abs/2101.12037\">“BENDR: using transformers and a contrastive self-supervised learning task to learn from massive amounts of EEG data”</a>, Kostas et al 2021 (towards brain imitation learning)</p></li><li><p><a href=\"https://en.wikipedia.org/wiki/Parker%E2%80%93Hulme_murder_case\">Parker-Hulme murder case</a>; <a href=\"https://en.wikipedia.org/wiki/Slender_Man_stabbing\">The Slender Man stabbing</a> (<a href=\"https://en.wikipedia.org/wiki/Paracosm\">paracosms?</a>)</p></li><li><p><strong>Correction</strong>: <a href=\"https://news.ycombinator.com/item?id=25426329\">Programming competition skills do not inversely correlate with job performance</a> after all</p></li></ul><h2>2.6 Technology</h2><ul><li><p><a href=\"https://en.wikipedia.org/wiki/Natural_nuclear_fission_reactor\">Natural nuclear fission reactors (Oklo)</a></p></li><li><p><a href=\"https://www.gwern.net/docs/history/2007-keeley.pdf\">“Baffles and Bastions: The Universal Features of Fortifications”</a>, Keeley et al 2007</p></li><li><p><a href=\"https://en.wikipedia.org/wiki/Corrupted_Blood_incident\">The Corrupted Blood incident</a></p></li><li><p><em><a href=\"https://www.gwern.net/docs/design/2020-jeremytankard-footnote-36-redisturbed.pdf\">Footnote</a></em><a href=\"https://www.gwern.net/docs/design/2020-jeremytankard-footnote-36-redisturbed.pdf\"> 36: “Redisturbed”</a>: a <em>unicase</em> font experiment</p></li></ul><h2>2.7 Economics</h2><ul><li><p><a href=\"https://www.nytimes.com/2021/01/18/climate/carbon-removal-technology.html\">“Businesses Aim to Pull Greenhouse Gases From the Air. It’s a Gamble”</a></p></li><li><p><a href=\"https://freakonomics.com/podcast/advertising-part-1/\">\"Does Advertising</a> <a href=\"https://freakonomics.com/podcast/advertising-part-2/\">Actually Work?\"</a>  (what could be more obvious than “advertising works”, and trivial to  confirm with correlational data? Yet, the tedious saying “correlation ≠  causation” stubbornly insists on being true); <a href=\"https://www.gwern.net/docs/traffic/2020-aral.pdf\">“Digital Paywall Design: Implications for Content Demand and Subscriptions”</a>, Aral & Dhillon 2020 (NYT nag-paywall caused −9.9% reading; in line with <a href=\"https://www.gwern.net/Ads\">all the other results</a>)</p></li><li><p><a href=\"https://www.gwern.net/docs/economics/2010-schuh.pdf\">“Who Gains and Who Loses from Credit Card Payments? Theory and Calibrations”</a>, Schuh et al 2010 (a compelling case for getting a rewards credit card if you’re a <a href=\"https://en.wikipedia.org/wiki/Debit_card\">debit card</a> user—why subsidize them so much?)</p></li><li><p><a href=\"https://www.gwern.net/docs/economics/2019-quinn.pdf\">“Squeezing the bears: cornering risk and limits on arbitrage during the ‘British bicycle mania’, 1896–1898”</a>, Quinn 2019</p></li></ul><h2>2.8 Fiction</h2><ul><li><p><a href=\"https://www.tabletmag.com/sections/arts-letters/articles/on-venus-have-we-got-a-rabbi\" title=\"A long-lost space age satire about what it means to be a Jew from one of science fiction’s greatest humorists\">“On Venus, Have We Got a Rabbi!”</a>, <a href=\"https://en.wikipedia.org/wiki/William_Tenn\">William Tenn</a> 2016</p></li><li><p><a href=\"https://www.gwern.net/docs/history/2013-dubin-fabliauxtranslations-stmartinsfourwishes.pdf\">“St Martin’s Four Wishes”</a>, Anonymous <a href=\"https://en.wikipedia.org/wiki/Fabliau\">medieval poet</a> (trans. Dubin 2013)</p></li></ul><h2>2.9 Miscellaneous</h2><ul><li><p>The <a href=\"https://en.wikipedia.org/wiki/Anglo-Japanese_style\">Anglo-Japanese style</a></p></li><li><p><a href=\"https://en.wikipedia.org/wiki/Stalag_Luft_III\">Stalag Luft III</a></p></li><li><p><a href=\"https://en.wikipedia.org/wiki/Graham_Island_(Mediterranean_Sea)\">Ferdinandea</a></p></li></ul><div><hr></div><ol><li><p>But it’ll still be too many days ’till we say we’re sorry.</p></li></ol>",
      "image": {
        "url": "https://substackcdn.com/image/fetch/$s_!rpC9!,w_256,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F5860611c-2de0-45a7-99a0-dc1b248b0199_1280x1280.png",
        "title": null
      },
      "media": [
        {
          "url": "https://substackcdn.com/image/fetch/$s_!rpC9!,w_256,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F5860611c-2de0-45a7-99a0-dc1b248b0199_1280x1280.png",
          "image": "https://substackcdn.com/image/fetch/$s_!rpC9!,w_256,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F5860611c-2de0-45a7-99a0-dc1b248b0199_1280x1280.png",
          "title": null,
          "length": 0,
          "type": "image",
          "mimeType": "image/jpeg"
        }
      ],
      "authors": [
        {
          "name": "gwern",
          "email": null,
          "url": null
        }
      ],
      "categories": []
    },
    {
      "id": "https://gwern.substack.com/p/december-newsletter",
      "title": "December newsletter",
      "description": "December 2020 gwern.net newsletter with links on AI and technology; major new site feature: fully-generalized recursive popups.",
      "url": "https://gwern.substack.com/p/december-newsletter",
      "published": "2021-01-10T17:31:06.000Z",
      "updated": "2021-01-10T17:31:06.000Z",
      "content": "<p>Please see the canonical version of the December 2020 newsletter on <a href=\"https://www.gwern.net/newsletter/2020/12\">Gwern.net</a>.</p>",
      "image": {
        "url": "https://substackcdn.com/image/fetch/$s_!rpC9!,w_256,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F5860611c-2de0-45a7-99a0-dc1b248b0199_1280x1280.png",
        "title": null
      },
      "media": [
        {
          "url": "https://substackcdn.com/image/fetch/$s_!rpC9!,w_256,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F5860611c-2de0-45a7-99a0-dc1b248b0199_1280x1280.png",
          "image": "https://substackcdn.com/image/fetch/$s_!rpC9!,w_256,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F5860611c-2de0-45a7-99a0-dc1b248b0199_1280x1280.png",
          "title": null,
          "length": 0,
          "type": "image",
          "mimeType": "image/jpeg"
        }
      ],
      "authors": [
        {
          "name": "gwern",
          "email": null,
          "url": null
        }
      ],
      "categories": []
    },
    {
      "id": "https://gwern.substack.com/p/november-newsletter",
      "title": "November newsletter",
      "description": "November 2020 gwern.net newsletter with links on DL and genomics scaling, dark mode rewrite, 1 essay, and 1 opera review ('The Ring' cycle).",
      "url": "https://gwern.substack.com/p/november-newsletter",
      "published": "2020-12-04T00:40:13.000Z",
      "updated": "2020-12-04T00:40:13.000Z",
      "content": "<p><strong>Please see the <a href=\"https://www.gwern.net/newsletter/2020/11\">canonical November 2020 gwern.net</a></strong><a href=\"https://www.gwern.net/newsletter/2020/11\"> newsletter link.</a></p>",
      "image": {
        "url": "https://substackcdn.com/image/fetch/$s_!rpC9!,w_256,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F5860611c-2de0-45a7-99a0-dc1b248b0199_1280x1280.png",
        "title": null
      },
      "media": [
        {
          "url": "https://substackcdn.com/image/fetch/$s_!rpC9!,w_256,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F5860611c-2de0-45a7-99a0-dc1b248b0199_1280x1280.png",
          "image": "https://substackcdn.com/image/fetch/$s_!rpC9!,w_256,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F5860611c-2de0-45a7-99a0-dc1b248b0199_1280x1280.png",
          "title": null,
          "length": 0,
          "type": "image",
          "mimeType": "image/jpeg"
        }
      ],
      "authors": [
        {
          "name": "gwern",
          "email": null,
          "url": null
        }
      ],
      "categories": []
    },
    {
      "id": "https://gwern.substack.com/p/october-2020-news",
      "title": "October 2020 news",
      "description": "October 2020 gwern.net newsletter with links on AI scaling, Euclid; further site reorganization & improvement.",
      "url": "https://gwern.substack.com/p/october-2020-news",
      "published": "2020-11-01T21:42:39.000Z",
      "updated": "2020-11-01T21:42:39.000Z",
      "content": "<p>Please see the <a href=\"https://www.gwern.net/newsletter/2020/10\">canonical web October 2020</a> edition of <a href=\"https://gwern.substack.com\">the <code>gwern.net</code> newsletter</a>.</p>",
      "image": {
        "url": "https://substackcdn.com/image/fetch/$s_!rpC9!,w_256,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F5860611c-2de0-45a7-99a0-dc1b248b0199_1280x1280.png",
        "title": null
      },
      "media": [
        {
          "url": "https://substackcdn.com/image/fetch/$s_!rpC9!,w_256,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F5860611c-2de0-45a7-99a0-dc1b248b0199_1280x1280.png",
          "image": "https://substackcdn.com/image/fetch/$s_!rpC9!,w_256,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F5860611c-2de0-45a7-99a0-dc1b248b0199_1280x1280.png",
          "title": null,
          "length": 0,
          "type": "image",
          "mimeType": "image/jpeg"
        }
      ],
      "authors": [
        {
          "name": "gwern",
          "email": null,
          "url": null
        }
      ],
      "categories": []
    },
    {
      "id": "https://gwern.substack.com/p/september-2020-news",
      "title": "September 2020 News",
      "description": "September 2020 gwern.net newsletter with links on DRL and AI scaling, psychiatric disorders; no reviews.",
      "url": "https://gwern.substack.com/p/september-2020-news",
      "published": "2020-10-26T13:40:32.000Z",
      "updated": "2020-10-26T13:40:32.000Z",
      "content": "<p>Please see the <a href=\"https://www.gwern.net/newsletter/2020/09\">canonical web September 2020</a> edition of <a href=\"https://gwern.substack.com\">the <code>gwern.net</code> newsletter</a>.</p>",
      "image": {
        "url": "https://substackcdn.com/image/fetch/$s_!rpC9!,w_256,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F5860611c-2de0-45a7-99a0-dc1b248b0199_1280x1280.png",
        "title": null
      },
      "media": [
        {
          "url": "https://substackcdn.com/image/fetch/$s_!rpC9!,w_256,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F5860611c-2de0-45a7-99a0-dc1b248b0199_1280x1280.png",
          "image": "https://substackcdn.com/image/fetch/$s_!rpC9!,w_256,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F5860611c-2de0-45a7-99a0-dc1b248b0199_1280x1280.png",
          "title": null,
          "length": 0,
          "type": "image",
          "mimeType": "image/jpeg"
        }
      ],
      "authors": [
        {
          "name": "gwern",
          "email": null,
          "url": null
        }
      ],
      "categories": []
    },
    {
      "id": "https://gwern.substack.com/p/august-2020-gwernnet-newsletter",
      "title": "August 2020 gwern.net newsletter",
      "description": "with an essay on sidenotes; links on human competence, efficient-computing/hardware-overhangs; no reviews.",
      "url": "https://gwern.substack.com/p/august-2020-gwernnet-newsletter",
      "published": "2020-09-01T23:18:28.000Z",
      "updated": "2020-09-01T23:18:28.000Z",
      "content": "<p>Please see the <a href=\"https://www.gwern.net/newsletter/2020/08\">canonical on-site August 2020</a> edition of <a href=\"https://gwern.substack.com\">the <code>gwern.net</code> newsletter</a>.</p>",
      "image": {
        "url": "https://substackcdn.com/image/fetch/$s_!rpC9!,w_256,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F5860611c-2de0-45a7-99a0-dc1b248b0199_1280x1280.png",
        "title": null
      },
      "media": [
        {
          "url": "https://substackcdn.com/image/fetch/$s_!rpC9!,w_256,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F5860611c-2de0-45a7-99a0-dc1b248b0199_1280x1280.png",
          "image": "https://substackcdn.com/image/fetch/$s_!rpC9!,w_256,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F5860611c-2de0-45a7-99a0-dc1b248b0199_1280x1280.png",
          "title": null,
          "length": 0,
          "type": "image",
          "mimeType": "image/jpeg"
        }
      ],
      "authors": [
        {
          "name": "gwern",
          "email": null,
          "url": null
        }
      ],
      "categories": []
    },
    {
      "id": "https://gwern.substack.com/p/july-2020-gwernnet-newsletter",
      "title": "July 2020 gwern.net newsletter",
      "description": "Links on the Uighurs, authoritarianism, negative emissions, AI overhang; 1 movie & 2 anime reviews",
      "url": "https://gwern.substack.com/p/july-2020-gwernnet-newsletter",
      "published": "2020-08-20T20:09:50.000Z",
      "updated": "2020-08-20T20:09:50.000Z",
      "content": "<p>Please see the <a href=\"https://www.gwern.net/newsletter/2020/07\">on-gwern.net canonical July 2020</a> edition of <a href=\"https://gwern.substack.com\">the <code>gwern.net</code> newsletter</a>.</p>",
      "image": {
        "url": "https://substackcdn.com/image/fetch/$s_!rpC9!,w_256,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F5860611c-2de0-45a7-99a0-dc1b248b0199_1280x1280.png",
        "title": null
      },
      "media": [
        {
          "url": "https://substackcdn.com/image/fetch/$s_!rpC9!,w_256,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F5860611c-2de0-45a7-99a0-dc1b248b0199_1280x1280.png",
          "image": "https://substackcdn.com/image/fetch/$s_!rpC9!,w_256,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F5860611c-2de0-45a7-99a0-dc1b248b0199_1280x1280.png",
          "title": null,
          "length": 0,
          "type": "image",
          "mimeType": "image/jpeg"
        }
      ],
      "authors": [
        {
          "name": "gwern",
          "email": null,
          "url": null
        }
      ],
      "categories": []
    },
    {
      "id": "https://gwern.substack.com/p/june-gwernnet-newsletter",
      "title": "June gwern.net newsletter",
      "description": "June 2020 gwern.net newsletter with 3 new pages/essays, and links on CRISPR, population screening, AI scaling, politics, and technological unemployment.",
      "url": "https://gwern.substack.com/p/june-gwernnet-newsletter",
      "published": "2020-07-02T14:34:53.000Z",
      "updated": "2020-07-02T14:34:53.000Z",
      "content": "<p>See the canonical <a href=\"https://www.gwern.net/newsletter/2020/06\">on-gwern.net June 2020</a> edition of <a href=\"https://gwern.substack.com\">the <code>gwern.net</code> newsletter</a>.</p>",
      "image": {
        "url": "https://substackcdn.com/image/fetch/$s_!rpC9!,w_256,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F5860611c-2de0-45a7-99a0-dc1b248b0199_1280x1280.png",
        "title": null
      },
      "media": [
        {
          "url": "https://substackcdn.com/image/fetch/$s_!rpC9!,w_256,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F5860611c-2de0-45a7-99a0-dc1b248b0199_1280x1280.png",
          "image": "https://substackcdn.com/image/fetch/$s_!rpC9!,w_256,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F5860611c-2de0-45a7-99a0-dc1b248b0199_1280x1280.png",
          "title": null,
          "length": 0,
          "type": "image",
          "mimeType": "image/jpeg"
        }
      ],
      "authors": [
        {
          "name": "gwern",
          "email": null,
          "url": null
        }
      ],
      "categories": []
    },
    {
      "id": "https://gwern.substack.com/p/may-gwernnet-newsletter",
      "title": "May Gwern.net Newsletter",
      "description": "Link compilation newsletter with anime GAN updates, links on AI scaling, discussion of GPT-3, and 1 book review.",
      "url": "https://gwern.substack.com/p/may-gwernnet-newsletter",
      "published": "2020-06-06T18:44:15.000Z",
      "updated": "2020-06-06T18:44:15.000Z",
      "content": "<p>Due to extensive editing & expansion of the GPT-3 discussion, please see the canonical newsletter version at <a href=\"https://www.gwern.net/newsletter/2020/05\">https://www.gwern.net/newsletter/2020/05</a></p>",
      "image": {
        "url": "https://substackcdn.com/image/fetch/$s_!rpC9!,w_256,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F5860611c-2de0-45a7-99a0-dc1b248b0199_1280x1280.png",
        "title": null
      },
      "media": [
        {
          "url": "https://substackcdn.com/image/fetch/$s_!rpC9!,w_256,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F5860611c-2de0-45a7-99a0-dc1b248b0199_1280x1280.png",
          "image": "https://substackcdn.com/image/fetch/$s_!rpC9!,w_256,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F5860611c-2de0-45a7-99a0-dc1b248b0199_1280x1280.png",
          "title": null,
          "length": 0,
          "type": "image",
          "mimeType": "image/jpeg"
        }
      ],
      "authors": [
        {
          "name": "gwern",
          "email": null,
          "url": null
        }
      ],
      "categories": []
    },
    {
      "id": "https://gwern.substack.com/p/april-2020-gwern-net-newsletter",
      "title": "April 2020 gwern.net newsletter",
      "description": "This is the April 2020 edition of the gwern.net newsletter; previous, March 2020 (archives).",
      "url": "https://gwern.substack.com/p/april-2020-gwern-net-newsletter",
      "published": "2020-05-01T00:00:00.000Z",
      "updated": "2020-05-01T00:00:00.000Z",
      "content": "<p>This is the <a href=\"https://www.gwern.net/newsletter/2020/04\">April 2020</a> edition of <a href=\"https://tinyletter.com/gwern\">the <code>gwern.net</code> newsletter</a>; previous, <a href=\"https://www.gwern.net/newsletter/2020/03\">March 2020</a> (<a href=\"https://www.gwern.net/tags/newsletter\">archives</a>). Please see the canonical gwern.net version.</p>",
      "image": {
        "url": "https://substackcdn.com/image/fetch/$s_!rpC9!,w_256,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F5860611c-2de0-45a7-99a0-dc1b248b0199_1280x1280.png",
        "title": null
      },
      "media": [
        {
          "url": "https://substackcdn.com/image/fetch/$s_!rpC9!,w_256,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F5860611c-2de0-45a7-99a0-dc1b248b0199_1280x1280.png",
          "image": "https://substackcdn.com/image/fetch/$s_!rpC9!,w_256,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F5860611c-2de0-45a7-99a0-dc1b248b0199_1280x1280.png",
          "title": null,
          "length": 0,
          "type": "image",
          "mimeType": "image/jpeg"
        }
      ],
      "authors": [
        {
          "name": "gwern",
          "email": null,
          "url": null
        },
        {
          "name": "gwern",
          "email": null,
          "url": null
        }
      ],
      "categories": []
    }
  ]
}

Analyze Another View with RSS.Style