Design Of This Website

Meta page de­scrib­ing Gwern.net, the self-documenting web­site’s im­ple­men­ta­tion and ex­per­i­ments for bet­ter ‘se­man­tic zoom’ of hy­per­text; tech­ni­cal de­ci­sions using Mark­down and sta­tic host­ing.

Gwern.net is im­ple­mented as a sta­tic web­site com­piled via Hakyll from Pan­doc Mark­down and hosted on a ded­i­cated server (due to ex­pen­sive cloud band­width).

It stands out from your stan­dard Mark­down sta­tic web­site by aim­ing at good ty­pog­ra­phy, fast per­for­mance, and ad­vanced (ie. 1980s-era) hy­per­text brows­ing fea­tures (at the cost of great im­ple­men­ta­tion com­plex­ity); the ⁠4 de­sign prin­ci­ples are: aesthetically-pleasing min­i­mal­ism, ac­ces­si­bil­ity/progressive-enhancement, speed, and a ‘se­man­tic zoom’ ap­proach to hy­per­text use.

Un­usual fea­tures in­clude the mono­chrome es­thet­ics, side­notes in­stead of foot­notes on wide win­dows, ef­fi­cient drop­caps, small­caps, col­lapsi­ble sec­tions, au­to­matic inflation-adjusted cur­rency, Wikipedia-style link icons & in­foboxes, cus­tom syn­tax high­light­ing, ex­ten­sive local archives to fight linkrot, and an ecosys­tem of “popup”/“popover” an­no­ta­tions & pre­views of links for fric­tion­less brows­ing—the net ef­fect of hi­er­ar­chi­cal struc­tures with col­laps­ing and in­stant popup ac­cess to ex­cerpts en­ables iceberg-like pages where most in­for­ma­tion is hid­den but the reader can eas­ily drill down as deep as they wish. (For a demo of all fea­tures & stress-test page, see Lorem Ipsum; for de­tailed guide­lines, the Man­ual of Style.)

Also dis­cussed are the many failed ex­per­i­ments/changes made along the way.

Screenshot of Gwern.net demonstrating recursive popup functionality, allowing arbitrarily deep hypertext exploration of references and links.

What does it take to present, for the long-term, com­plex, highly-referenced, link-intensive, long-form text on­line as ef­fec­tively as pos­si­ble, while con­serv­ing the reader’s time & at­ten­tion?

Benefit

Peo­ple who are re­ally se­ri­ous about soft­ware should make their own hard­ware.

Alan Kay, “Cre­ative Think” 1982

Screenshot of SpongeBob SquarePants, ‘Procrastination’ episode (season 2, episode 37): SpongeBob has spent hours writing a fancy calligraphic dropcap of the word ‘the’, and failed to write the rest of his 800-word homework essay on ‘what not to do at a stoplight’, illustrating the dangers for a writer of yak-shaving & typographic design.

Time well-spent.

The sorrow of web design & typography is that how you present your pages can matter just a little. A page can be terribly designed and render as typewriter text in 80-column ASCII monospace, and readers will still read it, even if they complain about it. And the most tastefully-designed page, with true smallcaps and correct use of em-dashes vs en-dashes vs hyphens vs minuses and all, which loads in a fraction of a second and is SEO-optimized, is of little avail if the page has nothing worth reading; no amount of typography can rescue a page of dreck. Perhaps 1% of readers could even name any of these details, much less recognize them. If we added up all the small touches, they surely make a difference to the reader's happiness, but it would have to be a small one—say, 5%.1 It's hardly worth it for writing just a few things.

But the joy of web de­sign & ty­pog­ra­phy is that just its pre­sen­ta­tion can mat­ter a lit­tle to all your pages. Writ­ing is hard work, and any new piece of writ­ing will gen­er­ally add to the pile of ex­ist­ing ones, rather than mul­ti­ply­ing it all; it’s an enor­mous amount of work to go through all one’s ex­ist­ing writ­ings and im­prove them some­how, so it usu­ally doesn’t hap­pen. De­sign im­prove­ments, on the other hand, ben­e­fit one’s en­tire web­site & all fu­ture read­ers, and so at a cer­tain scale, can be quite use­ful. I feel I’ve reached the point where it’s worth sweat­ing the small stuff, ty­po­graph­i­cally.

Principles

There are 4 de­sign prin­ci­ples:

  1. Aesthetically-pleasing Min­i­mal­ism

    The de­sign es­thetic is min­i­mal­ist, with a dash of Art Nou­veau. I be­lieve that min­i­mal­ism helps one focus on the con­tent. Any­thing be­sides the con­tent is dis­trac­tion and not de­sign. ‘At­ten­tion!’, as Ikkyu would say.⁠⁠2⁠

    The palette is de­lib­er­ately kept to grayscale as an ex­per­i­ment in con­sis­tency and whether this con­straint per­mits a read­able aesthetically-pleasing web­site. Clas­sic ty­po­graph­i­cal tools, like drop­caps and small caps are used for them­ing or em­pha­sis.⁠⁠3⁠

    This does not mean lack­ing fea­tures; many ‘min­i­mal­ist’ de­signs proud of their sim­plic­ity are merely simple-minded.⁠⁠4⁠

  2. Ac­ces­si­bil­ity & Pro­gres­sive En­hance­ment

    Se­man­tic markup is used where Mark­down per­mits. JavaScript is not re­quired for the core read­ing ex­pe­ri­ence, only for (mostly) op­tional fea­tures: pop­ups & tran­sclu­sions, table-sorting, side­notes, and so on. Pages can even be read with­out much prob­lem in a smart­phone or a text browser like elinks.

  3. Speed & Ef­fi­ciency

    On an increasingly-bloated In­ter­net, a web­site which is any­where re­motely as fast as it could be is a breath of fresh air. Read­ers de­serve bet­ter. Gwern.net uses many tricks to offer nice fea­tures like side­notes or LaTeX math at min­i­mal cost.

  4. ⁠Se­man­tic Zoom

    How should we present texts on­line? A web page, un­like many medi­ums such as print mag­a­zines, lets us pro­vide an un­lim­ited amount of text. We need not limit our­selves to overly con­cise con­struc­tions, which coun­te­nance con­tem­pla­tion but not con­vic­tion.

    The prob­lem then be­comes tam­ing com­plex­ity and length, lest we hang our­selves with our own rope. Some read­ers want to read every last word about a par­tic­u­lar topic, while most read­ers want the sum­mary or are skim­ming through on their way to some­thing else. A tree struc­ture is help­ful in or­ga­niz­ing the con­cepts, but doesn’t solve the pre­sen­ta­tion prob­lem: a book or ar­ti­cle may be hi­er­ar­chi­cally or­ga­nized, but it still must present every last leaf node at 100% size. Tricks like foot­notes or ap­pen­dices only go so far—hav­ing thou­sands of end­notes or 20 ap­pen­dices to tame the size of the ‘main text’ is un­sat­is­fac­tory as while any spe­cific reader is un­likely to want to read any spe­cific ap­pen­dix, they will cer­tainly want to read an ap­pen­dix & pos­si­bly many. The clas­sic hy­per­text par­a­digm of sim­ply hav­ing a rat’s-nest of links to hun­dreds of tiny pages to avoid any page being too big also breaks down, be­cause how gran­u­lar does one want to go? Should every sec­tion be a sep­a­rate page? Every para­graph? (Any­one who at­tempted to read a GNU Info man­ual knows how te­dious it can be.⁠⁠5⁠) What about every ref­er­ence in the bib­li­og­ra­phy, should there be 100 dif­fer­ent pages for 100 dif­fer­ent ref­er­ences?

    A web page, how­ever, can be dy­namic. The so­lu­tion to the length prob­lem is to pro­gres­sively ex­pose more be­yond the de­fault as the reader re­quests it, and make re­quest­ing as easy as pos­si­ble. For lack of a well-known term (Nel­son’s “Stretch­Text” never caught on) and by anal­ogy to code fold­ing in struc­tural ed­i­tors/out­lin­ers, I call this se­man­tic zoom: the hi­er­ar­chy is made vis­i­ble & mal­leable to allow read­ing at mul­ti­ple lev­els of the struc­ture.

    A Gwern.net page can be read at mul­ti­ple struc­tural lev­els, high to low: title, meta­data block, ab­stracts, sec­tion head­ers, mar­gin notes, em­pha­sized key­words in list items, foot­notes/side­notes, col­lapsed sec­tions or para­graphs, in­ter­nal cross-referencing links to other sec­tions (such as ap­pen­dices) which popup for im­me­di­ate read­ing, and full­text links or in­ter­nal links to other pages (also pop­ping up).

    So the reader can read (in in­creas­ing depth) the title/meta­data, or the page ab­stract, or skim the head­ers/Table of Con­tents, then skim mar­gin notes+item sum­maries, then read the body text, then click to un­col­lapse re­gions to read in-depth sec­tions too, and then if they still want more, they can mouse over ref­er­ences to pull up the ab­stracts or ex­cerpts, and then they can go even deeper by click­ing the full­text link to read the orig­i­nal. Thus, a page may look short, and the reader can un­der­stand & nav­i­gate it eas­ily, but like an ice­berg, those read­ers who want to know more about any spe­cific point will find more under the sur­face.

Mis­cel­la­neous prin­ci­ples:

  • vi­sual dif­fer­ences should be se­man­tic dif­fer­ences

  • UI el­e­ments that can react should change on hover

  • all UI el­e­ments should have tooltips/sum­maries; in­ter­ac­tive links should be ei­ther un­der­lined or small­caps

  • hi­er­ar­chies & pro­gres­sions should come in cy­cles of 3 (eg. bold > small­caps > ital­ics)

  • all num­bers should be 0, 1, or ∞

  • func­tion > form

  • more > less

  • self-contained > frag­mented

  • con­ven­tion (lin­ters/check­ers) > con­straints

  • hy­per­text is a great idea, we should try that!

  • local > re­mote—every link dies some­day

    • archives are ex­pen­sive short-term but cheap long-term

  • reader > au­thor

  • give the reader agency

  • speed is the 2nd-most im­por­tant fea­ture after cor­rect­ness

  • ⁠al­ways bet on text

  • you must earn your or­na­ments

    • if you go over­board on min­i­mal­ism, you may barely be mediocre

  • “users won’t tell you when it’s bro­ken”

  • UI con­sis­tency is un­der­rated

  • when in doubt, copy Wikipedia

  • be as mag­i­cal as Wikipedia was

  • if you find your­self ⁠doing some­thing 3 times, fix it.

  • web­site con­tent: good, FLOSS, un­re­stricted topic—choose two

Features

Software is under a constant tension. Being symbolic it is arbitrarily perfectible; but also it is arbitrarily changeable.

Alan Perlis (epigram #56)

No­table fea­tures (com­pared to stan­dard Mark­down sta­tic sites):

  • link popup an­no­ta­tions (⁠all types demo; ⁠‘popover’ on small screens or mo­bile):

    An­no­ta­tions can be au­to­mat­i­cally ex­tracted from sources (eg. Arxiv/BioRxiv/MedRxiv/Cross­ref), or writ­ten by hand (for­mat­ting is kept con­sis­tent by an ex­ten­sive se­ries of rewrite rules & checks, in­clud­ing ⁠ma­chine learn­ing to break up mono­lithic ab­stracts for read­abil­ity); pop­ups can be re­cur­sive, and can be ma­nip­u­lated in many ways—moved, fullscreened, ‘stick­ied’ (an­chored in place), etc. ⁠Wikipedia pages are specially-supported, en­abling them to be re­cur­sively nav­i­gated as well. Local Gwern.net pages & whitelisted do­mains can be popped up and viewed in full; PDFs can be read in­side a PDF viewer; and sup­ported source code for­mats can pop up syntax-highlighted ver­sions (⁠eg. LinkMetadata.hs).

  • client-side tran­sclu­sion

    Tran­sclu­sion sup­ports within-page or cross-page, ar­bi­trary IDs or ranges in pages, links, an­no­ta­tions, etc. Tran­sclu­sions are lazy by de­fault, but can be made strict; this en­ables ex­tremely large index pages, like the tags.

  • code folding-style col­lapse/dis­clo­sure sup­port (both in­line & block)

    These are used heav­ily with lazy tran­sclu­sions, as they let one cre­ate arbitrarily-large ‘vir­tual’ pages which are dis­played on de­mand purely by writ­ing or­di­nary hy­per­linked text.

  • au­to­matic local archive/mir­rors of most links to elim­i­nate linkrot from the start while pro­vid­ing a bet­ter read­ing ex­pe­ri­ence

  • side­notes using both mar­gins, fall­back to float­ing foot­notes

  • true bidi­rec­tional ⁠back­links, which can pop up the con­text

    • also sup­ported at the sec­tion level, so one can eas­ily see dis­cus­sions else­where of a spe­cific part of a page, rather than the page as a whole

  • reader mode (alternative view removing most UI elements such as hyperlinks; toggleable from the page toolbar)

  • source code syn­tax high­light­ing

  • JavaScript-free LaTeX math ren­der­ing (⁠ex­am­ples; but where pos­si­ble, it is ⁠com­piled to na­tive HTML+CSS+Uni­code in­stead like “√4” or “1⁄2”, as that is more ef­fi­cient & natural-looking)

  • ⁠dark mode (with a theme switcher and AI clas­si­fi­ca­tion of whether to in­vert im­ages using ⁠In­ver­tOrNot.com)

  • click-to-zoom im­ages & slideshows; width-full ta­bles/im­ages

  • sortable ta­bles; ta­bles of var­i­ous sizes

  • automatically inflation-adjust dollar amounts, exchange-rate Bitcoin amounts (eg. '$1 in 1950 is $11 today', presenting both the nominal & the adjusted amounts)

  • link icons for clas­si­fy­ing links by file­type/do­main/topic/au­thor (⁠ex­am­ples)

  • “ad­mo­ni­tions” in­foboxes (Wikipedia-like by way of Markdeep)

  • light­weight drop­caps

    With sup­port for AI-generated sets, like the “drop­cats” used on cat-themed pages

  • multi-column lists

  • in­ter­wiki link syn­tax

  • au­to­matic link-ification of key­words (⁠LinkAuto.hs)

  • com­pact ci­ta­tion type­set­ting (using sub­scripts)

  • print sup­port

  • epigraphs

  • in­ter­view for­mat­ting

  • blogroll im­ple­mented as “Site/quote/link of the day” in page foot­ers

  • demo-mode: track the use-count of site fea­tures, in order to dis­able or change them after n uses.

    This al­lows for ob­tru­sive newbie-friendly fea­tures or ap­pear­ances, which au­to­mat­i­cally sim­plify for reg­u­lar read­ers. It is loosely in­spired by clas­sic elec­tron­ics “demo mode” set­tings, which loop through all the fea­tures of a de­vice. It uses Lo­cal­Stor­age to avoid any server in­te­gra­tion.

    We use it primarily to highlight the existence of the theme switcher toolbar, so readers know how to enable the other features like dark-mode or reader-mode: the animation would be highly distracting if we ran it on every page load, but if we don't, how do readers discover it? (We've found out that a lot of readers do not notice it on their own, probably due to general web-clutter-blindness.) Our solution: we simply use demo-mode to disable it after a few times.

    We also use it to slim down the UI. For ex­am­ple, the dis­clo­sure/col­lapse re­gions are un­usual, so we write out a de­scrip­tion ex­plic­itly for new read­ers; but they are so widely used that leav­ing the de­scrip­tion in place is a lot of clut­ter for read­ers who have learned them.

  • 404 page: the 404 error page uses the requested (missing) URL, plus the sitemap + Levenshtein distance, to try to guess the intended URL, as well as pointing to the main index and providing site-search shortcuts.

    (It also in­cludes a cu­rated se­lec­tion of epigraphs & il­lus­tra­tions for the reader de­pressed by linkrot.)
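
    The guessing step is essentially 'rank the sitemap entries by edit distance to the requested path'. A minimal Haskell sketch of that idea, purely illustrative (the function names and example paths are invented, and the real page does more, such as the site-search shortcuts):

        import Data.List (sortOn)

        -- Standard dynamic-programming Levenshtein edit distance.
        levenshtein :: Eq a => [a] -> [a] -> Int
        levenshtein xs ys = last (foldl nextRow [0 .. length xs] ys)
          where
            nextRow prevRow y = scanl step (head prevRow + 1) (zip3 xs prevRow (tail prevRow))
              where
                step left (x, diag, up) =
                  minimum [left + 1, up + 1, diag + if x == y then 0 else 1]

        -- Rank the sitemap by closeness to the requested (missing) path.
        suggestUrls :: Int -> String -> [String] -> [String]
        suggestUrls n missing sitemap = take n (sortOn (levenshtein missing) sitemap)

        -- usage (hypothetical paths): suggestUrls 3 "/doc/ai/scalng" sitemap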

Much of Gwern.net de­sign and JavaScript/CSS was de­vel­oped by ⁠Said Achmiz, 2017–202?. Some in­spi­ra­tion has come from ⁠Tufte CSS & Matthew But­t­er­ick’s Prac­ti­cal Ty­pog­ra­phy.

Tags

Gwern.net im­ple­ments a sim­ple hi­er­ar­chi­cal/DAG tag sys­tem for all links, mod­eled on Wikipedia’s cat­e­gories (see ⁠Tags.hs).

It is de­signed to be browsed via pop-ups, and in­te­grate nat­u­rally with the filesys­tem.

These hierarchical tags correspond to the filesystem hierarchy: a URL can be 'tagged' with a string foo, in which case it is assigned to the /doc/foo/ directory.13 If the tag string has a forward-slash in it, then it refers to a nested tag: foo/bar corresponds to /doc/foo/bar/. Thus, a file added to a Gwern.net directory like /doc/foo/bar/2023-smith.pdf is inferred to automatically be tagged foo/bar. (Because the tags are hierarchical, it cannot be tagged both foo and foo/bar; that is interpreted as simply foo/bar.) As it would be a bad idea to copy/symlink files around, a given URL (such as a file) can be tagged arbitrarily many times. This is tracked in the same metadata database as the annotations, and can be edited like any other part of the annotation.
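
The directory convention can be summarized in a few lines of Haskell. This is a simplified sketch of the mapping just described, not the actual code (which, with its special cases, lives in Tags.hs):

    import Data.List (stripPrefix)
    import System.FilePath (takeDirectory)

    -- A tag like 'foo/bar' corresponds to the directory '/doc/foo/bar/'.
    tagToDirectory :: String -> FilePath
    tagToDirectory tag = "/doc/" ++ tag ++ "/"

    -- Conversely, a locally-hosted file inherits a tag from its path:
    -- inferTag "/doc/foo/bar/2023-smith.pdf" == Just "foo/bar"
    inferTag :: FilePath -> Maybe String
    inferTag = stripPrefix "/doc/" . takeDirectory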

It is im­ple­mented as a ⁠stand­alone batch process, which reads a list of di­rec­to­ries, queries the an­no­ta­tion data­base for an­no­ta­tions which match each di­rec­tory’s im­plied tag, and writes out a Mark­down index.md file with a list of tran­sclu­sions, which is then com­piled nor­mally.

Properties

Tags are first-class cit­i­zens, in that they are pages/es­says of their own, and can be tagged like any other URL:

  1. Tags as pages:

    Tags can have in­tro­duc­tions/dis­cus­sions of the topic, for cases where the mean­ing of the tag might not be ob­vi­ous (eg. “inner-monologue” or “dark knowl­edge”).

    These in­tro­duc­tions are processed like an essay, and in­deed, may just be tran­scluded from a reg­u­lar page (eg. the boogers tag or ef­fi­cient Trans­former at­ten­tion).

  2. Tagged tags:

    Tags can them­selves be ‘tagged’, and ap­pear under that tag (and vice-versa); these tags are not re­cur­sive, how­ever, and make no at­tempt to avoid cy­cles. They are meant more in the spirit of a ‘see also’ cross-reference.

  3. Tag­ging any URL: as tags are on URLs/an­no­ta­tions, they treat dif­fer­ent URLs+an­chor-fragments as dif­fer­ent.

    Thus, you can cre­ate an­no­ta­tions for mul­ti­ple an­chors on a page or mul­ti­ple pages in­side a PDF, and an­no­tate & tag them sep­a­rately. See the ⁠back­links dis­cus­sion of how use­ful this hack can be. (Be­cause an­chors do not need to exist in­side a URL—the browser will sim­ply load the URL nor­mally if the an­chor can’t be found—you can even treat an­chors as a way to ‘tag’ URLs with ar­bi­trary meta­data while re­quir­ing no data­base or other soft­ware sup­port⁠⁠14⁠.)

Use & Appearance

The pri­mary way to browse tags is via pop­ups on an­no­tated URLs:

Ex­am­ple of a re­search paper with 3 tags opened, which are them­selves tagged, with ToCs for fast pop­ups of spe­cific tag en­tries.

The tag pop­ups give an overview of the tag as a whole: how many tagged items there are of what type, how it’s tagged & ac­cess to its broader par­ent tag, the raw tag name, image thumb­nails (ex­tracted from the most re­cent an­no­ta­tion with an image), and a com­pact table-of-contents which will pop up those an­no­ta­tions. Stan­dard fea­tures like link-bibliographies are sup­ported, and it’s all im­ple­mented by popup and/or tran­sclu­sion.⁠⁠15⁠ Like the back­links, there is lit­tle dif­fer­ence be­tween tags and every­thing else—it all Just Works™ from the reader’s per­spec­tive.

For more in-depth read­ing, the tags are avail­able as stand­alone HTML pages. The idea for these pages is that one might be search­ing for a key ref­er­ence, or try­ing to catch up on the lat­est re­search.

So, these are or­ga­nized as the pref­ace/in­tro­duc­tion (if any), then the parental & chil­dren & cross-referenced tags (with ar­rows to in­di­cate which), then the an­no­ta­tions in re­verse chrono­log­i­cal order (re­quir­ing a date & title), then the Wikipedia links (which come last, inas­much as they do not have well-defined ‘dates’); then, a ‘mis­cel­la­neous’ sec­tion lists URLs which have at least 1 tag but oth­er­wise lack key meta­data like a title, au­thor, or date; fi­nally, the link-bibliography, which is the con­cate­na­tion of all of the an­no­tated en­tries’ in­di­vid­ual link-bibliographies. These items all make heavy use of the lazi­ness of tran­scludes & col­lapses to ren­der ac­cept­ably—they are so dense with hy­per­links that a fully-transcluded page would bring a web browser to its knees (likely one rea­son that web­sites like Wikipedia do not even try to pro­vide sim­i­lar in­ter­faces).
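
A pared-down sketch of that middle ordering; the Ann record and its fields are hypothetical stand-ins for the real annotation metadata, and the preface, tag cross-references, and link-bibliography stages are omitted:

    import Data.List (partition, sortOn)
    import Data.Ord (Down(..))

    -- Hypothetical, pared-down stand-in for an annotated link.
    data Ann = Ann { url :: String, title :: Maybe String
                   , date :: Maybe String, isWikipedia :: Bool }

    -- Split a tag's entries into the three middle sections described above:
    -- dated & titled annotations reverse-chronologically, then Wikipedia links,
    -- then the metadata-poor 'miscellaneous' remainder.
    orderTagEntries :: [Ann] -> ([Ann], [Ann], [Ann])
    orderTagEntries anns = (sortOn (Down . date) dated, wikis, misc)
      where
        (wikis, rest) = partition isWikipedia anns
        (dated, misc) = partition (\a -> date a /= Nothing && title a /= Nothing) rest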

Tags are a key way of or­ga­niz­ing large num­bers of an­no­ta­tions. In some cases, they re­place sec­tions of pages or en­tire pages, where there would oth­er­wise be a hand-maintained bib­li­og­ra­phy. For ex­am­ple, I try to track uses of the ⁠DNM Archive & ⁠Dan­booru20xx datasets to help es­tab­lish their value & archive uses of them; I used to hand-link each reverse-citation, while hav­ing to also tag/an­no­tate them man­u­ally. But with tags+tran­sclu­sions, I can sim­ply set up a tag solely for URLs in­volv­ing uses of the dataset (darknet-market/dnm-archive & ai/anime/danbooru), and tran­sclude the tag into a sec­tion. Now each URL will ap­pear au­to­mat­i­cally when I tag it, with no fur­ther ef­fort.

Features

Con­ve­nience fea­tures:

  1. Gen­er­ated tags: There are two spe­cial tags which are ‘gen­er­ated’ (more so):

    • newest, which lists the most re­cently added an­no­ta­tions (as a sort of live equiv­a­lent of the monthly newslet­ter & let me eas­ily proof­read recently-written an­no­ta­tions)

    • and the root tag-directory it­self, doc, which lists all tags by path & human-readable short-name (to show the reader the full breadth of tags avail­able).

  2. Short ↔︎ Long tag-names:

    For brevity, the Gwern.net tag tax­on­omy does not at­tempt to be a per­fect cat­e­gor­i­cal pyra­mid. It sup­ports ‘short’ or ‘pet names’, which are short human-readable ver­sions of otherwise-opaque long tag names. (For ex­am­ple, genetics/heritable/correlation/mendelian-randomization → “Mendelian Ran­dom­iza­tion”.)

    In the other di­rec­tion, it at­tempts to in­tel­li­gently guess what any short tag might refer to: if I at­tempt on the CLI to run the com­mand to up­load a new doc­u­ment like ⁠upload 1981-knuth.pdf latex, the tag code will guess that design/typography/tex is meant, and up­load to that tag-directory in­stead.

  3. In­ferred or au­to­matic tags:

    • To boot­strap the tag tax­on­omy, I de­fined rules that any URL linked by a page would get a spe­cific tag; for ex­am­ple, the DNB FAQ would im­pose the dual-n-back tag. This proved to be too free with the tags, and has been re­moved.

    • A locally-hosted file typ­i­cally has a tag en­coded into its path, as dis­cussed be­fore. (This ex­cludes the special-case of the local mir­rors in /doc/www/, and a few mir­rors or projects.)

    • Do­main matches can trig­ger a tag, in cases where ei­ther a do­main is a tag of its own (eg. The Pub­lic Do­main Re­view has its own tag at history/public-domain-review so it is con­ve­nient to auto-tag any URL match­ing publicdomainreview.org), or where the web­site is single-topic (any link to ⁠Eva­Mon­key.com will be an anime/eva tag).

  4. CLI tools: ⁠changeTag.hs and upload allow edit­ing & cre­at­ing an­no­ta­tions in large batches, and ⁠annotation-dump.hs en­ables search/brows­ing:

    I use changeTag.hs (shortcut: gwt) as a kind of bookmarking tool to 'tag' any URLs I come across. (Tab-completion is easily provided by listing all the directory names & turning them into tags.) For example, an interesting Arxiv link will get a quick gwt https://arxiv.org/abs/2106.11297 attention/compression t5; this will create the annotation for it, pulling all its metadata from Arxiv, running all the formatting passes like paragraphizing, generating an embedding for it which will be included in all future similar-links recommendations, adding it to the local-archiving queue, and tagging it under ai/nn/transformer/attention/compression & ai/nn/transformer/t5. Beats doing that manually!

    Mean­while, annotation-dump.hs (short­cut: gwa) helps me make some ac­tual use of tag­ging to re­find things eg. gwa https://arxiv.org/abs/2106.11297 | fold --spaces --width=100:

    Ex­am­ple of query­ing an­no­ta­tions in Bash, show­ing syn­tax high­light­ing, short­cuts like a full Gwern.net URL to view the an­no­ta­tion, tags, which YAML file data­base it’s in, etc.

    These can be grepped, piped, edited in a text ed­i­tor, etc. This can be com­bined with gwt for bulk edits: grep for par­tic­u­lar key­words, fil­ter­ing out already-tagged an­no­ta­tions, pipe into less, re­view by hand, and copy the URLs of ones to tag/un-tag. Which can be fur­ther com­bined with link-extractor.hs to ex­tract links from given Mark­down pages, see if they are al­ready tagged with a tag, and present just the un­tagged ones for re­view.

    For ex­am­ple, when I wanted to pop­u­late my Frank Her­bert tag, I ex­tracted the links from my two Dune-related pages, grepped for any an­no­ta­tion men­tion­ing any of those links, fil­tered out any an­no­ta­tion which was al­ready tagged ‘Frank Her­bert’, and printed out just the URL of the re­main­der for re­view, and tagged many of them:

    # collect all unique links from the two Dune pages into a temporary file
    TMP=$(mktemp /tmp/urls.txt.XXXX)

    cat ./dune-genetics.md ./dune.md | pandoc -f markdown -w markdown | \
        runghc -i/home/gwern/wiki/static/build/ ~/static/build/link-extractor.hs | \
        sort --unique | grep -E -v -e '^#' >> "$TMP"
    cat "$TMP"

    # find annotations mentioning any of those links, drop ones already tagged
    # 'frank-herbert', and review the remainder in less
    gwa | grep -F --color=always --file="$TMP" | \
        grep -F -v -e '"fiction/science-fiction/frank-herbert"' | \
        cut --delimiter=';' --fields='1-4' | less

    # finally, tag the URLs chosen during review ('[]' stands in for the pasted URLs)
    gwt 'frank-herbert' []

Future Tag Features

Fu­ture work: the Gwern.net tag sys­tem is in­com­plete due to poor tool­ing. Tags are an un­solved prob­lem, cur­rently solved by human brute force, but a far bet­ter tag­ging fu­ture is pos­si­ble with deep learn­ing.

When I look at the his­tory of tag­ging, folk­sonomies, and book­mark­ing ser­vices like Wikipedia, del.icio.us & Archive of Our Own (see ⁠“Fan is A Tool-Using An­i­mal”), what I see is a tool with too much fric­tion for in­di­vid­ual read­ers.

It is easy to set up a sim­ple tag sys­tem, and han­dle a few hun­dred or thou­sand links. (In­deed, any­one who starts keep­ing book­marks at all will quickly de­velop an ad hoc cat­e­gory or tag sys­tem.) It is not so easy to keep using it for years pro­duc­tively; all tag GUIs are clunky and re­quire many sec­onds to do any­thing, and their ‘au­toma­tion’ is min­i­mal.

Such sys­tems should get eas­ier & faster & smarter to use over time, but usu­ally get harder & slower & dumber. Like spaced rep­e­ti­tion or ⁠com­plex per­son­al­ized soft­ware, the av­er­age user tastes the ini­tial fruits of using tags, goes per­haps a bit over­board, be­gins to bog down under the bur­den of main­te­nance, doesn’t quite have time to split up tags or pop­u­late ob­scure tags, and the sys­tem be­gins to ca­reen out of con­trol with ‘mega’ tags con­tain­ing half the world while ob­scure tags have only 1 entry—as the tech­ni­cal debt es­ca­lates, the user gets ever less value out of it, and sim­ply look­ing at the tags be­comes more painful as one sees the un­done work pile up, like an email inbox.

One sees many blogs which use ‘tags’ but in a com­pletely point­less way: there is one tag which is on every other post and has thou­sands of en­tries, and then each entry will have 1 tag which is used pretty much only on that entry & never again. No one reads or uses those tags, in­clud­ing their au­thors. In this fail­ure mode, par­tic­u­larly ev­i­dent on Tum­blr & In­sta­gram, as Hil­lel Wayne notes, tags be­come ut­terly de­based ⁠“metacrap” by huge swathes of re­dun­dant tags being slapped onto every post—if you ever need to re­find a par­tic­u­lar post, it sure won’t be through the tags…⁠⁠16⁠ At the other ex­treme is Twit­ter: Twit­ter’s fa­mous ‘hash­tags’ used to be widely used, and were key or­ga­niz­ing tools… but some­where along the way, real Twit­ter users seemed to stop using them & they be­came spam. Is it any won­der that most users even­tu­ally ac­knowl­edge that it’s a waste of time un­less they begin a sec­ond ca­reer as ref­er­ence li­brar­i­ans, and give up, de­pend­ing on their search skills to re­find any­thing they need? (And to be fair, for many users, they prob­a­bly did not re­ally need tag­ging to begin with. It was an at­trac­tive nui­sance for them, an il­lu­sion of pro­duc­tiv­ity—like al­pha­bet­iz­ing one’s book­shelf.)

This is why sites that do make pro­duc­tive use of tags tend to be sites cater­ing to niches, with power users, and highly ac­tive cu­ra­tors (named things like ‘li­brar­i­ans’ or WikiG­nomes) of some sort which will clean up & en­force stan­dards. For ex­am­ple, Wikipedia ed­i­tors put an enor­mous amount of ef­fort into main­tain­ing an in­cred­i­bly elab­o­rate cat­e­gory sys­tem, with ex­ten­sive bot tool­ing; and Wikipedia ed­i­tors (if not reg­u­lar read­ers) ben­e­fit, be­cause they do use it ex­ten­sively for both con­tent edit­ing and or­ga­niz­ing the in­fi­nite amount of meta-editing con­tent (like cat­e­gories of ⁠over­loaded ⁠cat­e­gories). Archive of Our Own like­wise is renowned for its ex­ten­sive tag sys­tem, which is ‘wran­gled’ by rabid fans into a rea­son­ably consistently-applied folk­son­omy of char­ac­ters/fran­chises/top­ics, and those are used heav­ily by its read­ers to nav­i­gate the >10m fanfics (in lieu of any more leg­i­ble way to nav­i­gate the seas of fanfics—after all, the point of fan­fic is that any­one can do it, pos­ing cu­ra­tion prob­lems which don’t exist for the orig­i­nal works).

Or to put it an­other way, cur­rent tag sys­tems are not like a Ted Nel­son or Mi­nor­ity Re­port-esque ex­pe­ri­ence of wiz­ards weav­ing with the stuff of thought, ca­su­ally group­ing & or­ches­trat­ing flocks of items on the screen; but more like going to the DMV to fill out forms in trip­li­cate, or using tweez­ers to move a pile of sand grain by grain. Suc­cess­ful tag sys­tems are like the Pyra­mids of Egypt: mon­u­men­tal feats of labor by thou­sands of peo­ple la­bor­ing for years to push un­gainly build­ing blocks pre­cisely into place. Tags are pow­ered by herds of human brains, te­diously, one by one, learn­ing a tiny frag­ment of the total tag folk­son­omy, adding it via clunky soft­ware, drum­ming their fin­gers while their web browsers clunk through the back-and-forth, spend­ing hours refac­tor­ing a list of a thou­sand en­tries in 1 tag into 2 tags after star­ing at the list for a while try­ing to imag­ine what those 2 tags could be, and doing this all with no vis­i­ble re­ward. For a shared re­source like Wikipedia, this is worth­while; for your own per­sonal files… the re­turn on in­vest­ment is du­bi­ous. (Also, peo­ple are lazy & for­get­ful.)

But with modern tools, particularly DL NLP tools like document embeddings, the tag experience could be so much better—even magical. The 4 major pain points of a tag system are tagging new items, refactoring big tags into smaller tags (typically a single tag into 2–4), populating a new/small tag with existing items, and reading/searching large tags. All of these can be made drastically better. Implemented with care and an eye to performance, these 4 techniques would remove most of the pain of a tag system; indeed, curating tags might even be pleasant, in a popping-bubblewrap sort of way:

  1. to au­to­mat­i­cally tag a doc­u­ment with high ac­cu­racy is well within 2023 ca­pa­bil­i­ties.

    Many doc­u­ments come with a sum­mary or ab­stract which can be em­bed­ded. For those which don’t, 2023+ LLMs gen­er­ally have long enough con­text win­dows to embed en­tire doc­u­ments (at some ex­pense, and per­haps lower em­bed­ding qual­ity); they are also gen­er­ally ca­pa­ble of writ­ing ac­cu­rate sum­maries.

    For >90% of my an­no­ta­tions, the ap­pro­pri­ate tags would be ex­tremely ob­vi­ous to a clas­si­fier (eg. ran­dom forests) trained on my ex­ist­ing cor­pus of tags+em­bed­dings⁠⁠17⁠, and the tags could be au­to­mat­i­cally gen­er­ated.⁠⁠18⁠ In­deed, I view my tag­ging ef­forts as par­tially jus­ti­fied by train­ing a fu­ture clas­si­fier on it, and this is just the costly boot­strap phase.

    And as the tag col­lec­tion gets larger, the ac­cu­racy of tag­ging im­proves & the user will be asked to tag fewer items, re­ward­ing the user.

  2. Refac­tor­ing tags into sub-tags is harder, but prob­a­bly doable as an in­ter­ac­tive clus­ter­ing prob­lem.

    After a few hours re-tagging one’s doc­u­ments, one wants to yell at the com­puter: “look, it’s ob­vi­ous what this tag should be, just split them up the ob­vi­ous way and do what I mean!” It’s often so ob­vi­ous one can get halfway there with reg­exps… but not the other half.

    One can take the em­bed­dings of an overly-large tag, and run a clus­ter­ing al­go­rithm like k-means clus­ter­ing⁠19⁠ on it for var­i­ous k, like 2–10. Then one can present the dif­fer­ent clus­ter­ings as sets, with the most cen­tral items in each clus­ter as its pro­to­typ­i­cal ex­am­ples. It should be ob­vi­ous to an ex­pert user what the clus­ters are, and what the best k is: ‘these 3 clus­ters make sense, but then at 4 it breaks down and I can’t tell what #3 and #4 in that are sup­posed to be.’

    Given a clus­ter­ing, one can then put the pro­to­typ­i­cal ex­am­ples into a tool like GPT to ask it what the name of the tag for those ex­am­ples ought to be. (This is sim­i­lar to how the Ope­nAI ⁠Chat­GPT in­ter­face will au­to­mat­i­cally ‘title’ each Chat­GPT ses­sion to pro­vide mean­ing­ful sum­maries, with­out the user hav­ing to do so.⁠⁠20⁠ Could they? Of course. But that is work.) The user will ap­prove or pro­vide his own.

    With the clus­ter­ing cho­sen & la­bels, the tag can be au­to­mat­i­cally refac­tored into the new tags. The ef­fort to refac­tor a tag goes from ‘sev­eral hours of ex­tremely te­dious work ac­tively read­ing through thou­sands of items to try to infer some good tags and then apply it, one by one’ to ‘a minute of pleas­ant con­sid­er­a­tion of sev­eral op­tions pre­sented to one’.

    As the num­ber of tags in­creases, the num­ber of nec­es­sary refac­tor­ings will de­crease (power law, ap­par­ently, which makes sense given Zipf’s law) and the au­to­matic tag­ging of fu­ture items will im­prove (both be­cause the se­man­tics be­comes richer and be­cause if a tag-cluster could be found in an un­su­per­vised fash­ion by clus­ter­ing em­bed­dings, then it would be even eas­ier to pre­dict those tag-clusters given a la­beled dataset), again re­ward­ing the user and im­prov­ing in qual­ity over time.

    (Much less fre­quently, we will want to com­bine tags. But that’s triv­ial to au­to­mate.)

  3. Pop­u­lat­ing a rare tag:

    Sometimes a user will create a useful tag, or a small cluster will pop out of the clustering because it is so distinct. If it's a good tag, it may have many valid instances, but scattered across the whole dataset. In this case, the automatic tagging of new items, or refactoring existing tags, will not help. You need to go back over existing items.

    In this case, cre­at­ing the rare tag would in­te­grate well with ac­tive learn­ing ap­proaches.

    The simplest active learning approach (uncertainty sampling) would look something like this: the user creates a new tag, and adds a few initial examples. The tag classifier immediately retrains on this, and creates a ranked list of all untagged instances, sorted by its estimated probability. The user looks over the list, and tags a few on the first screen's worth. They are tagged, the rest on that screen ignored henceforth21, and the classifier immediately retrains, and produces another ranked list. A tag classifier like a random forest can train on quite large datasets in seconds, so this could proceed in near-realtime, or indeed, asynchronously, with the user just tapping 'yes'/'no' on instances as they pop up on the screen while the classifier trains & reclassifies in a loop in the background. Like the refactoring, this demands much less of the user than the traditional manual approach of 'work really hard'. (Such semi-automated tagging approaches are widely used in ML industry to create datasets like JFT-300M because they make labeling vastly more efficient, but have not been seen much in end-user software.)

    Within min­utes, the new tag would be fully pop­u­lated and look as if it had been there all along.

  4. Pre­sent­ing a tag’s con­tents in a log­i­cal order—sort­ing by se­man­tic sim­i­lar­ity, which looks like a “sort by magic”:

    But we can go fur­ther. Why is a mega-tag such a pain to read or search? Well, one prob­lem is that they tend to be a giant messy pile with no order aside from reverse-chronological. Reverse-chronological order is bad in many cases, even blogs (con­sider a multi-part se­ries where you can only reach them all by the tag, which of course shows you the se­ries in the worst pos­si­ble order!), and is used sim­ply be­cause… how else are you going to sort them? At least that shows you the newest ones, which is not al­ways a good order, but at least is an order. To go be­yond that, you’d need some sort of se­man­tic un­der­stand­ing, the sort of deeper un­der­stand­ing that a human would have (and of course, human brains, par­tic­u­larly the brain of your human user, are too ex­pen­sive to use to pro­vide some sen­si­ble order).

    For­tu­nately, we have those doc­u­ment em­bed­dings at hand. We could try clus­ter­ing with k-means & ti­tling with an LLM again, and dis­play each clus­ter one after an­other, treat­ing the clus­ters as ‘tem­po­rary’ or ‘pseudo’ tags.⁠⁠22⁠ (We can eas­ily name the anony­mous clus­ters ⁠with an LLM by feed­ing in the meta­data like ti­tles and ask­ing for a tag name. The user may re­ject it, but even a wrong tag name is ex­tremely help­ful for mak­ing it ob­vi­ous what the right one is, and breaks “the tyranny of the blank page” and de­ci­sion fa­tigue.) Clus­ters don’t re­spect our 2D read­ing order, but there are al­ter­nate ways of clus­ter­ing which are in­tended to project the high-dimensional em­bed­ding’s clus­ter­ing geom­e­try down to fewer di­men­sions, like 2D, for use in graphs, or even 1D—eg. t-SNE or UMAP.

    I don’t know if they work well in 1D, but if they work bet­ter at slightly higher di­men­sion­al­ity, then it can be eas­ily turned into a se­quence with min­i­mum total dis­tance as a trav­el­ing sales­man prob­lem. A sim­ple way to ‘sort’ which doesn’t re­quire heavy-weight ma­chin­ery is to ‘sort by se­man­tics’: how­ever, not by dis­tance from a spe­cific point, but greed­ily pair­wise. One se­lects an ar­bi­trary start­ing point (‘most re­cent item’ is a log­i­cal start­ing point for a tag), finds the ‘near­est’ point, adds it to the list, and then the near­est un­used point to that point, and so on re­cur­sively.

    I find that with Gwern.net annotations, the greedy list-sorting algorithm works surprisingly well. It naturally produces a fairly logical sequence with occasional 'jumps' as the latent cluster changes, in contrast to the naive 'sort by distance', which would tend to 'ping-pong' or 'zig-zag' back & forth across clusters based on slight differences in distance.23 (A minimal sketch of this greedy ordering is given at the end of this list item.)

    This would im­plic­itly ex­pose the un­der­ly­ing struc­ture by pre­serv­ing the local geom­e­try (even if the ‘global’ shape doesn’t make sense), and help a reader skim through, as they feel ‘hot’ and ‘cold’, and can focus on the re­gion of the tag which seems clos­est to what they want. (And if the tag re­ally needs to be chrono­log­i­cal, or the em­bed­ding lin­eariza­tion is bad, there can just be a set­ting to over­ride that.)

    This ap­proach would work for any­thing that can be use­fully em­bed­ded, and would prob­a­bly work even bet­ter for im­ages given how hard it is to or­ga­nize im­ages & how good image em­bed­dings like CLIP have be­come (eg. Con­cept or SOOT). This will wind up in­evitably pro­duc­ing some abrupt tran­si­tions be­tween clus­ters, but that tells you where the nat­ural cat­e­gories are, and you can eas­ily drag-and-drop the clus­ter of im­ages into fold­ers & redo the trick in­side each di­rec­tory. This would make it easy to drag-and-drop a set of dat­a­points, and se­lect them, and de­fine a new tag which ap­plies to them.

    And be­cause em­bed­dings are such widely-used tools, there are many tricks one can use. For ex­am­ple, the de­fault em­bed­ding might not put enough weight on what you want, and might wind up clus­ter­ing by some­thing like ‘av­er­age color’ or ‘real-world lo­ca­tion’. But em­bed­ders can be prompted to tar­get spe­cific use-cases, and if that is not pos­si­ble, you can ma­nip­u­late the em­bed­ding di­rectly based on the em­bed­ding of a spe­cific point such as a pro­to­typ­i­cal file (or a query/key­word prompt if text-only or using a cross-modal em­bed­ding like CLIP’s image+text): embed the new file or prompt, weighted mul­ti­ply all the oth­ers by it (or some­thing), then re-organize. (And you can of course fine­tune any em­bed­ding model with the user’s im­prove­ments, con­trastively: push fur­ther apart the points that the user in­di­cated were not as alike as the em­bed­ding in­di­cated, and vice-versa. This works most eas­ily with the model that gen­er­ated the em­bed­ding, but one can come up with tricks to fine­tune other mod­els as well on the user ac­tions.)

    Or you can ex­per­i­ment with “em­bed­ding arith­metic”: if the de­fault 2D lay­out is un­help­ful, be­cause the most vis­i­ble vari­a­tion is fo­cused on un­help­ful parts, one can ‘sub­tract’ em­bed­dings to change what shows up. And you can do this with any num­ber of em­bed­dings by doing arith­metic on them first. For ex­am­ple, you can ‘sub­tract’ a tag X from every dat­a­point to ig­nore their X-ness, by av­er­ag­ing every dat­a­point with tag X to get a “pro­to­typ­i­cal X”; the new em­bed­dings are now “what those dat­a­points mean be­sides the con­cept en­coded tag X”.⁠⁠24⁠ (If the right tag or dat­a­point doesn’t exist which em­pha­sises the right thing—just make one up!) By se­quen­tially sub­tract­ing, one can look through the dataset as a whole for ‘miss­ing’ tags; in­deed, if every tag is sub­tracted, the resid­ual clus­ters might still be sur­pris­ingly mean­ing­ful, be­cause they have struc­ture that no tag yet en­coded. One could also try adding in order to em­pha­size a spe­cific X.⁠⁠25⁠
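
    A minimal sketch of this greedy 'nearest unused neighbor' ordering, assuming each item already comes with an embedding vector; the names and the choice of Euclidean distance are illustrative rather than the site's actual code:

        type Embedding = [Double]

        euclidean :: Embedding -> Embedding -> Double
        euclidean a b = sqrt (sum (zipWith (\x y -> (x - y) ^ 2) a b))

        -- Greedy pairwise ordering: given the embedding of the last item placed,
        -- repeatedly append the nearest not-yet-used item. O(n^2) distances,
        -- which is fine for one tag's worth of annotations.
        greedySort :: Embedding -> [(a, Embedding)] -> [a]
        greedySort _   []    = []
        greedySort cur items = fst best : greedySort (snd best) rest
          where (best, rest) = extractMinOn (euclidean cur . snd) items

        -- Pull out the element minimizing the given cost, returning it and the rest.
        extractMinOn :: Ord b => (x -> b) -> [x] -> (x, [x])
        extractMinOn _ [x]    = (x, [])
        extractMinOn f (x:xs) = let (m, rest) = extractMinOn f xs
                                in if f x <= f m then (x, m : rest) else (m, x : rest)
        extractMinOn _ []     = error "extractMinOn: empty list"

        -- Usage: seed with the most recent item's embedding, eg.
        --   orderTag (newest : others) = fst newest : greedySort (snd newest) others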

Abandoned

Cartoon drawing of Captain Picard facepalming, expressing my frustration with web development, my website readers, and the world in general.

Meta page de­scrib­ing Gwern.net web­site de­sign ex­per­i­ments and post-mortem analy­ses.

Often the most in­ter­est­ing part of any de­sign are the parts that are in­vis­i­ble—what was tried but did not work. Some­times they were un­nec­es­sary, other times read­ers didn’t un­der­stand them be­cause it was too idio­syn­cratic, and some­times we just can’t have nice things.

Some post-mortems of things I tried on Gwern.net but aban­doned (in chrono­log­i­cal order).

Tools

Soft­ware tools & li­braries used in the site as a whole:

  • The source files are writ­ten in ⁠Pan­doc Mark­down (Pan­doc: John Mac­Far­lane et al; GPL) (source files: Gwern Bran­wen, CC-0). The Pan­doc Mark­down uses a num­ber of ex­ten­sions; pipe ta­bles are pre­ferred for any­thing but the sim­plest ta­bles; and I use ⁠se­man­tic line­feeds (also called ⁠“se­man­tic line breaks” or ⁠“ven­ti­lated prose”) for­mat­ting.

  • math is writ­ten in LaTeX which com­piles to MathML, ren­dered sta­t­i­cally by Math­Jax (Apache li­cense) into HTML/CSS/fonts; copy-paste of the orig­i­nal math ex­pres­sion is han­dled by a JavaScript copy-paste lis­tener

  • syn­tax high­light­ing: we orig­i­nally used Pan­doc’s builtin Kate-derived themes, but most clashed with the over­all ap­pear­ance; after look­ing through all the ex­ist­ing themes, we took in­spi­ra­tion from Pyg­ments’s ⁠algol_nu (BSD) based on the orig­i­nal ALGOL re­port, and type­set it in the IBM Plex Mono font⁠⁠26⁠

  • the site is compiled with the Hakyll v4+ static site generator, used to generate Gwern.net, written in Haskell (Jasper Van der Jeugt et al; BSD); for the gory details, see hakyll.hs which implements the compilation, RSS feed generation, & parsing of interwiki links as well. This just generates the basic website; I do many additional optimizations/tests before & after uploading, which is handled by sync.sh (Gwern Branwen, CC-0)

    My preferred method of use is to browse & edit locally using Emacs, and then distribute using Hakyll. The simplest way to use Hakyll is to cd into your repository and runghc hakyll.hs build (with hakyll.hs having whatever options you like). Hakyll will build a static HTML/CSS hierarchy inside _site/; you can then do something like firefox _site/index. (Because HTML extensions are not specified in the interest of cool URIs, you cannot use the Hakyll watch webserver as of January 2014.) Hakyll's main advantage for me is relatively straightforward integration with the Pandoc Markdown libraries; Hakyll is not that easy to use, and so I do not recommend use of Hakyll as a general static site generator unless one is already adept with Haskell.

  • the CSS is bor­rowed from a mot­ley of sources and has been heav­ily mod­i­fied, but its ori­gin was the ⁠Hakyll home­page & Gitit; for specifics, see ⁠default.css

  • Mark­down syn­tax ex­ten­sions:

    • I implemented a Pandoc Markdown plugin for a custom syntax for interwiki links in Gitit, and then ported it to Hakyll (defined in hakyll.hs); it allows linking to the English Wikipedia (among others) with syntax like [malefits](!Wiktionary) or [antonym of 'benefits'](!Wiktionary "Malefits"). CC-0. (A sketch of such a rewrite pass is given after this list.)

    • in­fla­tion ad­just­ment: Inflation.hs pro­vides a Pan­doc Mark­down plu­gin which al­lows au­to­matic in­fla­tion ad­just­ing of dol­lar amounts, pre­sent­ing the nom­i­nal amount & a cur­rent real amount, with a syn­tax like [$5]($1980).

    • Book af­fil­i­ate links are through an Ama­zon Af­fil­i­ates tag ap­pended in the hakyll.hs

    • image di­men­sions are looked up at com­pi­la­tion time & in­serted into <img> tags as browser hints

  • JavaScript:

    • the HTML ta­bles are sortable via ⁠ta­ble­sorter (Chris­t­ian Bach; MIT/GPL)

    • the MathML is ren­dered using Math­Jax

    • an­a­lyt­ics are han­dled by Google An­a­lyt­ics

    • A/B test­ing is done using AB­a­lyt­ics (Daniele Mazz­ini; MIT) which hooks into Google An­a­lyt­ics (see test­ing notes) for individual-level test­ing; when doing site-level long-term test­ing like in the ad­ver­tis­ing A/B tests, I sim­ply write the JavaScript man­u­ally.

    • Gen­er­al­ized tooltip pop­ups for load­ing in­tro­duc­tions/sum­maries/pre­views of all links when one mouses-over a link; reads an­no­ta­tions, which are man­u­ally writ­ten & au­to­mat­i­cally pop­u­lated from many sources (Wikipedia, Pubmed, BioRxiv, Arxiv, hand-written…), with spe­cial han­dling of YouTube videos (Said Achmiz, ⁠Shawn Presser; MIT).

      Note that ‘links’ here is in­ter­preted broadly: al­most every­thing can be ‘popped up’. This in­cludes links to sec­tions (or div IDs) on the cur­rent or other pages, PDFs (often page-linked using the ob­scure but handy #page=N fea­ture), source code files (which are syntax-highlighted by Pan­doc), locally-mirrored web pages, foot­notes/side­notes, any such links within the pop­ups them­selves re­cur­sively…

      • the float­ing foot­notes are han­dled by the gen­er­al­ized tooltip pop­ups (they were orig­i­nally im­ple­mented via ⁠footnotes.js); when the browser win­dow is wide enough, the float­ing foot­notes are in­stead re­placed with mar­ginal notes/side­notes⁠27⁠ using a cus­tom li­brary, ⁠sidenotes.js (Said Achmiz, MIT)

        Image of a webpage with 1-column layout but footnotes typeset in the left and right margins as ‘sidenotes’, near the text that they annotate.

        Demon­stra­tion of side­notes on Ra­di­ance.

    • image size: full-scale im­ages (fig­ures) can be clicked on to zoom into them with slideshow mode—use­ful for fig­ures or graphs which do not com­fort­ably fit into the nar­row body—using an­other cus­tom li­brary, ⁠image-focus.js (Said Achmiz; GPL)

  • error checking: problems such as broken links are checked in 3 phases.
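
As a concrete illustration of the interwiki syntax described above under 'Markdown syntax extensions': such a rewrite can be expressed as a short Pandoc AST pass. The sketch below is written as a standalone pandoc JSON filter for self-containedness, handles only Wiktionary, and uses invented helper names; the real implementation is an in-process pass inside hakyll.hs covering many more wikis and edge-cases.

    {-# LANGUAGE OverloadedStrings #-}
    import Text.Pandoc.JSON              -- toJSONFilter + the Pandoc AST types
    import qualified Data.Text as T

    -- Turn [malefits](!Wiktionary) or [antonym of 'benefits'](!Wiktionary "Malefits")
    -- into ordinary links to the English Wiktionary.
    interwiki :: Inline -> Inline
    interwiki (Link attr inlines (url, title))
      | url == "!Wiktionary" =
          Link attr inlines ("https://en.wiktionary.org/wiki/" <> article, "")
      where
        article = T.replace " " "_" (if T.null title then plain inlines else title)
        plain   = T.concat . map toText
        toText (Str s) = s
        toText Space   = " "
        toText _       = ""
    interwiki x = x

    main :: IO ()
    main = toJSONFilter interwiki

(A script like this can be run via pandoc's --filter mechanism; in the Hakyll build, the equivalent transformation is simply applied to the Pandoc AST directly.)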

Implementation Details

The Pro­gram­mers’ Credo: “We do these things not be­cause they are easy, but be­cause we thought they were going to be easy.”

Ma­ciej Cegłowski (2016-08-05)

There are some tricks & de­tails that web de­sign­ers might find in­ter­est­ing.

Ef­fi­ciency:

  • fonts:

    • Adobe Source Serif/Sans (orig­i­nally Gwern.net used Baskerville)

      Why use our own web­fonts in­stead of just using pre-existing web-safe/sys­tem fonts? One might ask if the font over­head (non-blocking down­load of ~0.5MB of fonts for the most com­plex pages like the GPT-3 fic­tion page) is worth it, com­pared to trust­ing in fonts that may be in­stalled al­ready and are ‘free’ network-wise. This is what our web­fonts buys us:

      • cor­rect­ness (con­sis­tent ren­der­ing):

        The fun­da­men­tal rea­son for not using sys­tem fonts is that there are not many of them, they vary across op­er­at­ing sys­tems & de­vices, usu­ally aren’t great (lack­ing al­ter­na­tives & fea­tures like small­caps, & often ba­sics like Uni­code), and can be buggy (eg. Apple ships a Gill Sans dig­i­ti­za­tion—not an ob­scure font!—which is >22 years old & has bro­ken kern­ing).

        I ini­tially used sys­tem “Baskerville”, but they looked bad on some screens (sim­i­lar issue to peo­ple im­i­tat­ing LaTeX by using Com­puter Mod­ern on screens) and the highly lim­ited se­lec­tion of sys­tem fonts didn’t give me many op­tions. The Google Fonts Baskerville was OK but lacked many fea­tures & was slower than host­ing my own web­font, so Said Achmiz con­vinced me to just switch to self-hosting the ⁠‘screen serif’ Source fam­ily, whose ap­pear­ance I liked, which could be sub­set down to only nec­es­sary char­ac­ters to be faster than Google Fonts & not a bot­tle­neck, and which wasn’t widely used then de­spite being FLOSS & high-quality & ac­tively main­tained (so helped my per­sonal brand­ing).

        We were then re­peat­edly forced to add more fonts to fix dis­play bugs: fonts could look quite dif­fer­ent on Linux & Mac, and the sys­tem “sans” for the Table of Con­tents looked bad on Win­dows. The more care­fully de­signed the ap­pear­ance, the more small dif­fer­ences in sizes or ap­pear­ance be­tween the ‘same’ font on dif­fer­ent plat­forms screwed things up. Link-icons, side­notes, emoji, return-arrows, mis­cel­la­neous Uni­code look­ing rather dif­fer­ent or break­ing—all of these have run into plat­form is­sues, and later fea­tures like ci­ta­tion sub­scripts or ⁠inflation-adjustments surely would if we couldn’t tune their CSS to a known font.

        (Should we let readers set their own fonts? Reader, be real. It is 2023, not 1993. No one today sets their own fonts or writes custom CSS stylesheets—which would break layout & icons on many sites anyway—and they especially do not on the mobile devices which are ~50% of my site traffic.)

      • small­caps: used ex­ten­sively in the site de­sign for vary­ing lev­els of em­pha­sis in be­tween bold & ital­ics, so high qual­ity small­caps is crit­i­cal; true small­caps is pro­vided by Source (ital­ics may yet be added at my re­quest), while un­avail­able in most fonts

      • con­sis­tent mono­chrome emoji via Noto Emoji (be­fore, emoji would be dif­fer­ent sizes in link-icons, some would be un­pre­dictably col­ored on some plat­forms and scream on the page)

      • IBM Plex Mono for source code: more dis­tinct confusable-characters in IBM Plex Mono com­pared to an or­di­nary mono­space sys­tem font (Plex Mono has an Open­Type fea­ture for slashed-zeros which can be en­abled in just source code), and looks good on Macs.⁠⁠28⁠

        • the Source fam­ily also pro­vides both tab­u­lar & pro­por­tional num­bers (also called “old-style”), which most fonts don’t, and which makes ta­bles vs text more read­able (pro­por­tional num­bers would break vi­sual align­ment in­side ta­bles anal­o­gous to pro­por­tional vs mono­space fonts for code, while tab­u­lar num­bers look large & ob­tru­sive in­side reg­u­lar text)

      • icons via ⁠Quivira font, and rare char­ac­ters like in­ter­robang, as­ter­ism, back­wards ques­tion mark, shield-with-cross (for link icon), BLACK SQUARE ■, and con­sis­tency in fall­back char­ac­ters for rare Uni­code points not in the sub­sets.

    • ef­fi­cient drop­caps fonts ⁠by sub­set­ting

  • image op­ti­miza­tion: PNGs are op­ti­mized by pngnq/advpng, JPEGs with mozjpeg, SVGs are mini­fied, PDFs are com­pressed with ocrmypdf’s JBIG2 sup­port. (GIFs are not used at all in favor of WebM/MP4 <video>s.)

  • JavaScript/CSS mini­fi­ca­tion: be­cause Cloud­flare does Brotli com­pres­sion, mini­fi­ca­tion of JavaScript/CSS has lit­tle ad­van­tage⁠⁠29⁠ and makes de­vel­op­ment harder, so no mini­fi­ca­tion is done; the font files don’t need any spe­cial com­pres­sion ei­ther be­yond the sub­set­ting.

  • Math­Jax: get­ting well-rendered math­e­mat­i­cal equa­tions re­quires Math­Jax or a sim­i­lar heavy­weight JavaScript li­brary; worse, even after dis­abling fea­tures, the load & ren­der time is ex­tremely high—a page like the em­bryo se­lec­tion page which is both large & has a lot of equa­tions can vis­i­bly take >5s (as a progress bar that help­fully pops up in­forms the reader).

    The so­lu­tion here is to ⁠pre­ren­der Math­Jax lo­cally after Hakyll com­pi­la­tion, using the local tool mathjax-node-page to load the final HTML files, parse the page to find all the math, com­pile the ex­pres­sions, de­fine the nec­es­sary CSS, and write the HTML back out. Pages still need to down­load the fonts but the over­all speed goes from >5s to <0.5s, and JavaScript is not nec­es­sary at all.

  • Au­to­matic Link-Ification Reg­exps: I wrote ⁠LinkAuto.hs, a Pan­doc li­brary for au­to­mat­i­cally turn­ing user-defined regexp-matching strings into links, to au­to­mat­i­cally turn all the sci­en­tific jar­gon into Wikipedia or paper links. (There are too many to an­no­tate by hand, es­pe­cially as new terms are added to the list or ab­stracts are gen­er­ated for pop­ups.)

    “Test all strings against a list of regexps and rewrite if they match” may sound simple and easy, but the naive approach is costly: n strings, r regexps tested on each, so 𝒪(n × r) regexp matches in total. With >600 regexps initially & millions of words on Gwern.net… Regexp matching is fast, but it's not that fast. Getting this into the range of 'acceptable' (~3× slowdown) required a few tricks.

    The major trick is that each doc­u­ment is con­verted to a sim­ple plain text for­mat, and the reg­exps are run against the en­tire doc­u­ment; in the av­er­age case (think of short pages or popup an­no­ta­tions), there will be zero matches, and the doc­u­ment can be skipped en­tirely. Only the match­ing reg­exps get used in the full-strength AST tra­ver­sal. While it is ex­pen­sive to check a reg­exp against an en­tire doc­u­ment, it is an order of mag­ni­tude or two less ex­pen­sive than check­ing that reg­exp against every string node in­side that doc­u­ment!
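
    A simplified sketch of this prescreening trick; it is not the actual LinkAuto.hs code (whose per-node rewriting is considerably more involved), and the Rule format & helper names are illustrative:

        {-# LANGUAGE OverloadedStrings #-}
        import Data.List (find)
        import qualified Data.Text as T
        import Text.Pandoc.Definition
        import Text.Pandoc.Walk (query, walk)
        import Text.Regex.TDFA ((=~))

        type Rule = (String, T.Text)   -- (regexp, URL to link matches to)

        -- Cheap plain-text rendering of the whole document, used only for prescreening.
        plainText :: Pandoc -> String
        plainText = T.unpack . T.unwords . query strs
          where strs (Str s) = [s]
                strs _       = []

        -- Keep only the rules whose regexp matches somewhere in the document; in the
        -- common case none do, and the expensive AST traversal is skipped entirely.
        linkAuto :: [Rule] -> Pandoc -> Pandoc
        linkAuto rules doc =
          case filter (\(re, _) -> txt =~ re) rules of
            []   -> doc
            live -> walk (linkify live) doc
          where txt = plainText doc

        -- Simplified per-node rewrite: wrap a matching word in a link.
        linkify :: [Rule] -> Inline -> Inline
        linkify live (Str s)
          | Just (_, u) <- find (\(re, _) -> T.unpack s =~ re) live = Link nullAttr [Str s] (u, "")
        linkify _ x = x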

Cor­rect­ness:

  • Dark mode: our dark mode is custom, and tries to make dark mode a first-class citizen.

    1. Avoiding Flashing & Laggy Scrolling: it is implemented in the standard best-practice way of creating two color palettes (associating a set of color variables with every element for light mode, and then automatically generating the dark-mode colors by inverting & gamma-correcting), and using JavaScript to toggle the media-query to instantly enable the chosen palette.

      This avoids the ‘flash of white’ on page loads which reg­u­lar JavaScript-based ap­proaches incur (be­cause the CSS media-queries can only im­ple­ment auto-dark-mode, and the dark mode wid­get re­quires JavaScript; how­ever, the JavaScript, when it de­cides to in­ject dark mode CSS into the page, is too late and that CSS will be ren­dered last after the reader has al­ready been ex­posed to the flash). The sep­a­rate color palette ap­proach also avoids the lag & jank of using in­vert CSS fil­ters (one would think that invert(100%) would be free from a per­for­mance stand­point, since what pixel ma­nip­u­la­tion could be sim­pler than negat­ing the color?—but it is not).

    2. Na­tive Dark Mode Color Scheme: we mod­ify the color scheme as nec­es­sary.

      Be­cause of the changes in con­trast, in­vert­ing the color scheme only mostly works. In par­tic­u­lar, in­line & code blocks tend to dis­ap­pear. To fix this, we allow a small de­vi­a­tion from pure-monochrome to add some blue, and the source code syn­tax high­light­ing is tweaked with a few blue/pur­ple/red col­ors for dark mode vis­i­bil­ity (since there’s not any log­i­cal dark-mode equiv­a­lent of the ALGOL syntax-highlighting style).

    3. In­verted Im­ages: color im­ages are de­sat­u­rated & grayscaled by de­fault to re­duce their bright­ness; grayscale/mono­chrome im­ages, are au­to­mat­i­cally in­verted by a machine-learning API, ⁠In­ver­tOrNot.com.

      This avoids the com­mon fail­ure mode where a blog uses a dark mode li­brary which im­ple­ments the class ap­proach cor­rectly… but then all of their im­ages still have blind­ing bright white back­grounds or over­all col­oration, de­feat­ing the point! How­ever, one also can­not just blindly in­vert im­ages be­cause many im­ages, pho­tographs of peo­ple es­pe­cially, are garbage as ‘photo-negatives’.

      De­fault Your De­vices To Dark Mode

      If you add a dark mode to your app or web­site, set your de­vices to dark mode on it—even if you don’t like dark mode or it’s in­ap­pro­pri­ate.

      You will have dark mode-only bugs, but your read­ers will never tell you about the bugs, par­tic­u­larly the odd one-off bugs. You will see your light-mode often enough due to logged-out de­vices or screen­shots or reg­u­lar de­vel­op­ment, so you need to force your­self to use dark mode.

    4. Three-Way Dark Mode Tog­gle: Many dark modes are im­ple­mented with a sim­ple bi­nary on/off logic stored in a cookie, ig­nor­ing browser/OS pref­er­ences, or sim­ply defin­ing ‘dark mode’ as the nega­tion of the cur­rent browser/OS pref­er­ence.

      This is in­cor­rect, and leads to odd sit­u­a­tions like a web­site en­abling dark mode dur­ing the day, and then light mode dur­ing the night! Using an auto/dark/light three-way tog­gle means that read­ers can force dark/light mode but also leave it on ‘auto’ to fol­low the browser/OS pref­er­ence over the course of the day.

      This re­quires a UI wid­get & it still in­curs some of the prob­lems of an ⁠auto-only dark mode, but over­all strikes the best bal­ance be­tween en­abling dark mode unasked, reader con­trol/con­fu­sion, and avoid­ing dark mode at the wrong time.

  • collapsible sections: managing the complexity of pages is a balancing act. It is good to provide all the code necessary to reproduce results, but does the reader really want to look at a big block of code? Sometimes they will; often only the few readers interested in the gory details will want to read the code. Similarly, a section might go into detail on a tangential topic or provide additional justification which most readers don’t want to plow through to continue with the main theme. Should the code or section be deleted? No. But relegating it to an appendix, or to another page entirely, is not satisfactory either: for code blocks particularly, one loses the literate-programming aspect if code blocks are being shuffled around out of order.

    A nice solution is to use a little JavaScript to implement a code-folding approach where sections or code blocks can be visually shrunk or collapsed, and expanded on demand by a mouse click. Collapsed sections are specified by an HTML class (eg. <div class="collapse"></div>), and summaries of a collapsed section can be displayed, defined by another class (<div class="abstract-collapse">). This allows code blocks to be collapsed by default where they are lengthy or distracting, and entire regions to be collapsed & summarized, without resorting to many appendices or forcing the reader onto an entirely separate page.

  • Sidenotes: one might wonder why sidenotes.js is necessary when most sidenote implementations, like Tufte-CSS, use a static HTML/CSS approach, which would avoid a JavaScript library entirely and avoid visibly repainting the page after load.

    The problem is that Tufte-CSS-style sidenotes do not reflow and sit solely in the right margin (wasting the considerable whitespace on the left), and depending on the implementation, may overlap, be pushed far down the page away from their referents, break when the browser window is too narrow, or not work on smartphones/tablets at all. (This is fixable; Tufte-CSS’s maintainers just haven’t done so.) The JavaScript library is able to handle all these cases, including the most difficult ones like my annotated edition of Radiance. (Tufte-CSS-style epigraphs, however, pose no such problems, and we take the same approach of defining an HTML class & styling with CSS.)

  • Link icons: icons are defined for all filetypes used in Gwern.net and for the most commonly-linked websites such as Wikipedia, as well as for Gwern.net itself (within-page section links get up/down-arrows to indicate relative position, with ‘¶’ as a JavaScript-less fallback; cross-page links get the logo icon).

    They are implemented in a scalable compile-time approach, adopted after the standard approach failed; a sketch of the general shape follows.
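
    A hedged sketch of such a compile-time pass (the rule table is a made-up fragment, not the site’s actual rules; the real code also handles SVG vs text icons, filetype rules, and rule ordering):

      {-# LANGUAGE OverloadedStrings #-}
      import qualified Data.Text as T
      import Text.Pandoc.Definition (Inline (..), Pandoc)
      import Text.Pandoc.Walk (walk)

      -- (URL fragment, CSS class); the stylesheet maps each class to an icon.
      iconRules :: [(T.Text, T.Text)]
      iconRules = [ ("wikipedia.org", "link-icon-wikipedia")
                  , ("arxiv.org",     "link-icon-arxiv")
                  , (".pdf",          "link-icon-pdf") ]

      -- At compile time, match each link's URL against the table and attach the
      -- first matching class to the link's attributes.
      addLinkIcons :: Pandoc -> Pandoc
      addLinkIcons = walk addIcon
        where
          addIcon (Link (ident, classes, kvs) label (url, title)) =
            case [cls | (frag, cls) <- iconRules, frag `T.isInfixOf` url] of
              (cls:_) -> Link (ident, cls : classes, kvs) label (url, title)
              []      -> Link (ident, classes, kvs) label (url, title)
          addIcon x = x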

  • Redirects: static sites have trouble with redirects, as they are just static files; AWS S3 does not support a .htaccess-like mechanism for rewriting URLs. To allow moving pages & fixing broken links, I wrote Hakyll.Web.Redirect for generating simple HTML pages with redirect metadata+JavaScript, which simply redirect from URL 1 to URL 2 (sketched below). After moving to Nginx hosting, I converted all the redirects to regular Nginx rewrite rules.
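
    A minimal sketch of the kind of stub page involved (illustrative; not the actual Hakyll.Web.Redirect output):

      -- For each (old path, new URL) pair, write a tiny HTML page at the old path
      -- that bounces the browser to the new URL via meta-refresh, with a JavaScript
      -- fallback and a plain link for good measure.
      redirectPage :: String -> String
      redirectPage newUrl = unlines
        [ "<!DOCTYPE html><html><head><meta charset=\"utf-8\">"
        , "<meta http-equiv=\"refresh\" content=\"0; url=" ++ newUrl ++ "\">"
        , "<link rel=\"canonical\" href=\"" ++ newUrl ++ "\">"
        , "<script>window.location.replace(\"" ++ newUrl ++ "\");</script>"
        , "</head><body><p>Moved: <a href=\"" ++ newUrl ++ "\">" ++ newUrl ++ "</a></p></body></html>"
        ]

      writeRedirects :: [(FilePath, String)] -> IO ()
      writeRedirects = mapM_ (\(oldPath, newUrl) -> writeFile oldPath (redirectPage newUrl))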

    In ad­di­tion to page re­names, I mon­i­tor 404 hits in Google An­a­lyt­ics to fix er­rors where pos­si­ble, and Nginx logs. There are an as­ton­ish­ing num­ber of ways to mis­spell Gwern.net URLs, it turns out, and I have de­fined >20k redi­rects so far (in ad­di­tion to generic reg­exp rewrites to fix pat­terns of er­rors).

Appendix

Returns To Design?

What is the ‘shape’ of re­turns on in­vest­ment in in­dus­trial de­sign, UI/UX, ty­pog­ra­phy etc? Is it a sig­moid with a golden mean of ef­fort vs re­turn… or a parabola with an un­happy val­ley of medi­oc­rity?

My ex­pe­ri­ence with Gwern.net de­sign im­prove­ments is that read­ers ap­pre­ci­ated changes mod­er­ately early on in mak­ing its con­tent more pleas­ant to read (if only by com­par­i­son to the rest of the In­ter­net!), but after a cer­tain point, it all ‘came to­gether’, in some sense, and read­ers started rav­ing over the de­sign and point­ing to Gwern.net’s de­sign rather than its con­tent. This is in­con­sis­tent with the de­fault, in­tu­itive model of ‘di­min­ish­ing re­turns’, where each suc­ces­sive de­sign tweak should be worth less than the pre­vi­ous one.

Is there a ‘per­fec­tion pre­mium’ (per­haps as a sig­nal of un­der­ly­ing un­ob­serv­able qual­ity, or per­haps reader in­ter­ac­tion is like an O-ring process)?

Im­pute—the process by which an im­pres­sion of a prod­uct, com­pany or per­son is formed by men­tally trans­fer­ring the char­ac­ter­is­tics of the com­mu­ni­cat­ing media…Peo­ple do judge a book by its cover…The gen­eral im­pres­sion of Apple Com­puter Inc. (our image) is the com­bined re­sult of every­thing the cus­tomer sees, hears or feels from Apple, not nec­es­sar­ily what Apple ac­tu­ally is! We may have the best prod­uct, the high­est qual­ity, the most use­ful soft­ware etc.; if we present them in a slip­shod man­ner, they will be per­ceived as slip­shod; if we present them in a cre­ative, pro­fes­sional man­ner, we will im­pute the de­sired qual­i­ties.

Mike Markkula, “The Apple Mar­ket­ing Phi­los­o­phy: Em­pa­thy Focus Im­pute” (1977-12)

Si paulum summo de­ces­sit, ver­git ad imum

Ho­race, Ars Po­et­ica

Particularly with typography, there seems to be an infinite number of finicky details one could spend time on (much of which appears to be for novelty’s sake, while vastly more important things like advertising harms go ignored by so-called designers). One’s initial guess is that it’d be diminishing returns like most things: it’d look something like a log curve, where every additional tweak costs more effort as one approaches the Platonic ideal. A more sophisticated guess would be that it’d look like a sigmoid: at first, something is so awful that any fixes are irrelevant to the reader, because they just suffer from a different problem (it doesn’t matter much if a website doesn’t render because of a JavaScript bug if the text, when it does render, is so light-shaded that one can’t read it); then each improvement makes a difference to some readers as the design approaches a respectable mediocrity; and after that, it’s back to diminishing returns.

My experience with improving the design of Gwern.net & reading about design has made me wonder whether either of those is right. The shape may resemble more of a parabola: perhaps the sigmoid, at some point, spikes up, and returns increase rather than diminish.

I no­ticed that for the first half-decade or so, no one paid much at­ten­tion to the tweaks I made, as it was an or­di­nary Markdown-based sta­tic site. As I kept tin­ker­ing, a com­ment would be made once in a while. When Said Achmiz lent his tal­ents to adding fea­tures & en­hance­ments and ex­plor­ing novel tweaks, com­ments cropped up more fre­quently (con­sis­tent with the enor­mous in­crease in time spent on it); by 2019, the re­design had mostly sta­bi­lized and most of the sig­na­ture fea­tures & vi­sual de­sign had been im­ple­mented, and 2020 was more about bug fixes than adding piz­zazz. Under the in­tu­itive the­o­ries, the rate of com­ments would be about the same: while the bug fixes may in­volve huge ef­fort—the dark mode rewrite was a 3-month agony—the im­prove­ments are ever smaller—said rewrite had no reader-visible change other than re­mov­ing slow­ness. But while site traf­fic re­mained steady, 2020 at­tracted more com­pli­ments than ever!

Similarly, the LW team put an unusual amount of effort into designing a 2018 essay compilation, making it stylish (even redrawing all the images to match the color themes), and they were surprised by how unusually large the preorders were: not a few percentage points larger, but many times larger. (There are many books on data visualization, but I suspect Edward Tufte’s books outsell them, even the best, by similar magnitudes.) And what should we make of Apple & design, whose devices & software have glaring flaws and yet, by making more of an attempt, command a premium and are regarded well by the public? Or Stripe?30

If the sig­moid were right, just how much more ef­fort would be nec­es­sary to elicit such jumps? Or­ders of mag­ni­tude more? I & Said have in­vested ef­fort, cer­tainly, but there are count­less sites (even con­fin­ing the com­par­i­son to just per­sonal web­sites and ex­clud­ing sites with pro­fes­sional full-time de­vel­op­ers/de­sign­ers), whose cre­ators have surely in­vested more time; mil­lions of books are self-published every year; and Apple is cer­tainly not the only tech com­pany which tries to de­sign things well.

What might be going on is re­lated to the “aesthetic-usability ef­fect”: at a cer­tain level, the de­sign it­self be­comes no­tice­able to the reader for its es­thetic ef­fect and the es­thet­ics it­self be­comes a fea­ture adding to the ex­pe­ri­ence. That is, at the bot­tom of the sig­moid, on a web­site strewn with typos and bro­ken links and con­fus­ing col­ors, the reader thinks “this web­site sucks!”, while in the mid­dle, the reader ceases to think of the web­site at all and just gets on with using it, only oc­ca­sion­ally ir­ri­tated by de­sign flaws; fi­nally, at a cer­tain level, when all the flaws have been re­moved and the site it­self is gen­uinely uniron­i­cally beau­ti­ful, both the beauty & ab­sence of flaws them­selves be­come no­tice­able, and the reader thinks, “this web­site, it is—pretty awe­some!” The spike is where sud­denly the de­sign it­self is per­ceived as a dis­tinct thing, not merely how the thing hap­pens to be. De­sign­ers often as­pire to an end-state of sprez­zatura or the “crys­tal gob­let”, where they do their job so well the reader doesn’t re­al­ize there was a job to be done at all—but in this fallen world, where ex­cel­lence seems so rare, the bet­ter one does the job, the more the con­trast with all the botched jobs in­evitably draws at­ten­tion.

It is dif­fi­cult for even the reader least in­ter­ested in the topic to open a Tufte book, or walk into an Apple store, and not be struck by first im­pres­sions of el­e­gance and care­ful de­sign—which is not nec­es­sar­ily a good thing if that can­not be lived up to. (Any per­son struck by this must also re­al­ize that other peo­ple will be sim­i­larly im­pressed, using their own re­sponse as a proxy for the gen­eral re­ac­tion⁠⁠31⁠, and will take it as a model for as­pi­ra­tion; lik­ing Apple or Tufte sig­nals your good taste, and that makes them lux­ury prod­ucts as much as any­thing.)

The rea­son it makes an im­pres­sion might be that it serves as a costly sig­nal that if you care enough to vis­i­bly “get it right”, even where that re­quires un­rea­son­able ef­fort, then you prob­a­bly can be trusted to get it right on things where other peo­ple can’t eas­ily see that. Since it’s so hard to judge soft­ware qual­ity with­out ex­ten­sive use (and is bor­der­line im­pos­si­ble for things like se­cu­rity & pri­vacy), as op­posed to fur­ni­ture⁠⁠32⁠, peo­ple es­pe­cially rely on these sorts of heuris­tics.

This sug­gests a dan­ger­ous idea (dan­ger­ous be­cause a good ex­cuse for com­pla­cency & medi­oc­rity, es­pe­cially for those who do not man­age even medi­oc­rity but be­lieve oth­er­wise): if you are going to in­vest in de­sign, half-measures yield less than half-results. If the de­sign is ter­ri­ble, then one should con­tinue; but if the de­sign is al­ready rea­son­able, then in­stead of there being sub­stan­tial re­turns, the di­min­ish­ing re­turns have al­ready set in, and it may be a too-long slog from where you are to the point where peo­ple are im­pressed enough by the de­sign for the es­thetic ef­fect to kick in. Those mod­er­ate im­prove­ments may not be worth­while if one can only mod­estly im­prove on medi­oc­rity; and a sufficiently-flawed de­sign may not be able to reach the es­thetic level at all, re­quir­ing a rad­i­cal new de­sign.


  1.  

    Rutter argues for this point in Web Typography, which is consistent with my own A/B tests, where even lousy changes are difficult to distinguish from zero effect despite large n, and with the general shambolic state of the Internet (eg. as reviewed in the 2019 Web Almanac). If readers will not install adblock, and loading times of multiple seconds cause only relatively modest traffic reductions, then things like aligning columns properly or using section signs or sidenotes must have effects on behavior so close to zero as to be unobservable.

  2.  

    Para­phrased from Di­a­logues of the Zen Mas­ters as quoted in pg11 of the Ed­i­tor’s In­tro­duc­tion to Three Pil­lars of Zen:

    One day a man of the peo­ple said to Mas­ter Ikkyu: “Mas­ter, will you please write for me max­ims of the high­est wis­dom?” Ikkyu im­me­di­ately brushed out the word ‘At­ten­tion’. “Is that all? Will you not write some more?” Ikkyu then brushed out twice: ‘At­ten­tion. At­ten­tion.’ The man re­marked ir­ri­ta­bly that there wasn’t much depth or sub­tlety to that. Then Ikkyu wrote the same word 3 times run­ning: ‘At­ten­tion. At­ten­tion. At­ten­tion.’ Half-angered, the man de­manded: “What does ‘At­ten­tion’ mean any­way?” And Ikkyu an­swered gen­tly: “At­ten­tion means at­ten­tion.”

  3.  

    And also, ad­mit­tedly, for es­thetic value. One earns the right to add ‘ex­tra­ne­ous’ de­tails by first putting in the hard work of re­mov­ing the ac­tual ex­tra­ne­ous de­tails; only after the ground has been cleared—the ‘data-ink ratio’ max­i­mized, the ‘chartjunk’ re­moved—can one see what is ac­tu­ally beau­ti­ful to add.

  4.  

    Good de­sign may be “as lit­tle de­sign as pos­si­ble” which gets the job done, to para­phrase Di­eter Rams; the prob­lem comes when de­sign­ers focus on the first part, and for­get the sec­ond part. If a min­i­mal­ist de­sign can­not han­dle more con­tent than a few para­graphs of text & a generic ‘hero image’, then it has not solved the de­sign prob­lem, and is merely a sub-genre of il­lus­tra­tion. (Like pho­tographs of el­e­gant min­i­mal­ist Scan­di­na­vian or Japan­ese ar­chi­tec­ture which leave one won­der­ing whether any human could live in­side them, and how those build­ings would learn.) And if a min­i­mal­ist web­site can­not even present some text well, you can be sure they have not solved any of the hard prob­lems of web de­sign like link rot or cross-referencing!

  5.  

    The de­fault pre­sen­ta­tion of sep­a­rate pages means that an en­tire page may con­tain only a sin­gle para­graph or sen­tence. The HTML ver­sions of many tech­ni­cal man­u­als (typ­i­cally com­piled from LaTeX, Doc­Book, or GNU Info) are even worse, be­cause they fail to ⁠ex­ploit prefetch­ing & are slower than local doc­u­men­ta­tion, and take away all of the use­ful key­bind­ings which makes nav­i­gat­ing info man­u­als fast & con­ve­nient. Read­ing such doc­u­men­ta­tion in a web browser is Chi­nese water tor­ture. (That, decades later, the GNU project keeps gen­er­at­ing doc­u­men­ta­tion in that for­mat, rather than at least as large single-page man­u­als with hy­per­linked table-of-contents, is a good ex­am­ple of how bad they are at UI/UX de­sign.) And it’s not clear that it’s that much worse than the other ex­treme, the mono­lithic man page which in­cludes every de­tail under the sun and is im­pos­si­ble to nav­i­gate with­out one’s eyes glaz­ing over even using in­cre­men­tal search to nav­i­gate through dozens of ir­rel­e­vant hits—every sin­gle time!

  6.  

    Also known as “back­ward or re­verse ci­ta­tions”, “what links here”, & “back­links”.

  7.  

    This fixes the biggest problem with the MediaWiki wiki system’s ‘what links here’ implementation of backlinks: a simplistic implementation which has nevertheless become the standard way for wiki software to display backlinks.

    The WhatLinksHere page (⁠eg. En WP) will tell you that sev­eral hun­dred other Wikipedia ar­ti­cles link to your cur­rent Wikipedia ar­ti­cle, yes, but you have no idea what the con­text is (on ei­ther page!), and if it is an im­por­tant link or a minor link, or even where in the ar­ti­cle it might be—it might be hid­den under some un­pre­dictable dis­played text, and you have to search the Me­di­aWiki markup it­self just to find it!

    This is only par­tially fixed by tools like Lupin’s Tool which try to lo­cate the link by load­ing the other page, be­cause those are used by few ed­i­tors, and still re­quire ef­fort. Be­cause Me­di­aWiki ren­ders every­thing server-side, there is no rea­son it could not do some­thing sim­i­lar and dis­play con­tex­tu­al­iz­ing ex­cerpts next to each link. It just doesn’t. (It doesn’t need true bidi­rec­tional links—even a heuris­tic hack of as­sum­ing the first link in each ar­ti­cle is the ‘real’ link, and ig­nor­ing du­pli­cates, would be a major im­prove­ment.)

  8.  

    Roam ap­par­ently might do some­thing like our ‘in­lin­ing’, but I know too lit­tle about it to say. Mag­gie Ap­ple­ton mocks up such a ⁠“spec­u­la­tive in­ter­face”, but ap­pears to not know of any im­ple­men­ta­tions.

    A lim­ited ex­am­ple is ⁠Greater­Wrong, which does back­links on ⁠posts and ⁠in­di­vid­ual com­ments. How­ever, while back­links on in­di­vid­ual com­ments are rea­son­ably atomic, they do not show the call­ing con­text, and the pop­ups on the links only show the stan­dard whole-item view. (GW’s back­links were in­tro­duced in 2019 at the re­quest of Wei Dai, well be­fore Gwern.net’s back­links were in­tro­duced in 2021 to take ad­van­tage of the new tran­sclu­sion fea­ture, and they are mostly in­de­pen­dent in de­sign.)

  9.  

    The orig­i­nal rea­son I began au­to­mat­i­cally gen­er­at­ing IDs on all Gwern.net hy­per­links was minor: I wanted to use the within-page pop­ups (like the lit­tle up/down-arrows) to re­move re­dun­dant links & show more con­text.

    A re­search paper might be dis­cussed at length in one sec­tion, but then cited else­where; it would be bad to not hy­per­link it, so usu­ally, I would make a re­dun­dant hy­per­link. How­ever, if the first dis­cus­sion had a unique ID, then I could sim­ply link later ref­er­ences to the ID in­stead, and the reader could hover over it to pop up that dis­cus­sion, read it, and then click through. (So it would look like this in Mark­down: [Foo 2020](URL){#foo-2020} proved ABC, which is interesting because of DEF … [thousands of words & many sections later] … see also [Foo 2020](#foo-2020).)

    One could do this man­u­ally on a case-by-case basis, but there are so many links, and the ID can be in­ferred from the meta­data, so why not gen­er­ate them au­to­mat­i­cally, so one could al­ways be sure that #foo-2020 was valid?

    And once most links had IDs that were unique within pages, that meant they could be unique across pages as well… So the popups led to the bidirectional backlinks.

  10.  

    Multiple links to the same URL are not unusual on Gwern.net, particularly when making use of IDs for precise links: one might easily link not just /foo but /foo#bar, /foo#quux, and maybe even /foo#baz, so why not?

    In fact, this can be a good way to handle complex annotations: you can break them up into multiple annotations & link each version. For example, imagine a complex, in-depth machine learning paper like the BigGAN paper, where the abstract is important but omits some key parts on page 6 and page 8 that I want to highlight for other purposes.

    I could set­tle for not an­no­tat­ing them at all; or I could try to jam them all into just one an­no­ta­tion; or I could link to the exact pages in the paper PDF using the #page=n trick & set­tle for the PDF pop­ping up with no an­no­ta­tion pos­si­ble (this also works if you cre­ate ar­bi­trary IDs solely for the pur­pose of writ­ing mul­ti­ple dis­tinct an­no­ta­tions); or I could cre­ate an­no­ta­tions for the exact page links & sim­ply cross-reference them! The back­links en­able cross-referencing at a glance, and nav­i­gat­ing at a hover. And since this is all fully re­cur­sive, an­no­ta­tions are first-class cit­i­zens, the tar­gets can be ar­bi­trary IDs of ar­bi­trary URLs or <div>/<span>s, back­links & links in­ter­op­er­ate etc, it all Just Works™ seam­lessly on the part of both au­thor & reader.

    But a sys­tem which threw away the meta­data of an­chors & IDs would strug­gle to do any of this: the 1:1 links or the dis­tinct URL-anchors would col­lapse down to hopelessly-ambiguous many:many maps.

  11.  

    I use au­thor names for my IDs, be­cause that meta­data is usu­ally avail­able due to an­no­ta­tions and is eas­ily guessed & writ­ten. But other im­ple­men­ta­tions might pre­fer to in­stead gen­er­ate consistent-but-unique IDs by sim­ply strip­ping or es­cap­ing the URL in ques­tion (eg. into Base64 or URL-encoding), or by feed­ing it into a web-browser-supported hash func­tion like SHA-256 (trun­cated to 8 chars—there are not nearly enough URLs on any page to worry about col­li­sions).
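
    For concreteness, a sketch of those two alternative schemes in Haskell (assuming the base64-bytestring & cryptonite packages; a browser implementation would use crypto.subtle instead):

      import           Crypto.Hash                (Digest, SHA256, hash)
      import qualified Data.ByteString.Base64.URL as B64
      import qualified Data.ByteString.Char8      as B

      -- Scheme 1: escape the URL itself, so the ID remains decodable by eye.
      idFromEscaping :: String -> String
      idFromEscaping = B.unpack . B64.encode . B.pack

      -- Scheme 2: hash the URL and truncate to 8 hex chars; collisions are not a
      -- realistic concern at the scale of links-per-page.
      idFromHash :: String -> String
      idFromHash url = take 8 (show digest)
        where digest = hash (B.pack url) :: Digest SHA256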

  12.  

    Show­ing the con­text for the back­link re­quires down­load­ing ei­ther the an­no­ta­tion or page, to nar­row down to the ID’s con­text. Show­ing back­link con­text can use up a lot of space, and ren­der­ing all that HTML is ex­pen­sive, par­tic­u­larly for back­link sec­tions which have scores of back­links.

    So the col­laps­ing serves as lazy eval­u­a­tion, and avoids doing that un­less the reader re­quests it. (Since back­links are all known at compile-time, it would be pos­si­ble to pre­com­pute the con­text, but not too easy.)

  13.  

    This re­places an ear­lier ⁠Hakyll-based tag sys­tem. The Hakyll ap­proach was quite sim­ple and in­tended just for small blogs, and had no way to han­dle tag­ging local files, much less ar­bi­trary URLs. (The tag code was also black magic I couldn’t mod­ify.) Mean­while, the evolv­ing filesys­tem hi­er­ar­chy for my local files al­ready looked like a tag sys­tem, and the evo­lu­tion was easy.

  14.  

    I abuse an­chors in this way to track ‘af­fil­i­a­tion’ of URLs, both for eas­ier ref­er­ence/search and for set­ting link-icons. For ex­am­ple, I re­gard Deep­Mind au­thor­ship of a paper as being a help­ful thing to know, and so I ap­pend #deepmind to any DeepMind-related URL (eg. https://arxiv.org/abs/1704.03073#deepmind). I find it par­tic­u­larly help­ful for track­ing Chi­nese AI re­search, where they have a habit of qui­etly drop­ping re­veal­ing pa­pers on Arxiv with no PR or West­ern at­ten­tion.

    When I link that URL, the link will get the DeepMind logo as its link-icon, and it is easier for me to search if I can remember that it was DM-related. This will not break the link, because the anchor is client-side only (unlike if you wanted to abuse query-parameters this way: many servers would ignore a malformed URL like foo?deepmind, but many others would throw an error); thus, I can copy-paste back and forth between Gwern.net and Reddit or Twitter, and the latter will continue to work normally (they track the full URL but usually drop the anchor for purposes like searching). Because it’s overloading the anchor, I can define new affiliations any time I please, and am up to 51 affiliations as of 2023-04-19; and I could encode anything else I might want to encode by using a new convention. I could encode the date by writing #2023-04-19, or the author, #john-smith, or little notes like #todo. As long as they do not happen to be a real anchor, they will work. (This is why a number of past web-design hacks like ‘#!’ URLs or “Text Fragments” (crude content-addressable URLs) have also exploited anchors, for their backwards-compatibility.)

    This hack does come with costs. First, it creates spurious anchors which my linkcheckers will warn about but which must be ignored as deliberate errors. Second and more seriously, while it works fine on external URLs, it begins to cause problems with local URLs: consider a case like /doc/reinforcement-learning/model-free/2016-graves.pdf#deepmind. This URL is not itself a problem for the annotation system, which does everything by URL, but it is a problem for anything operating at the file level, which sees only .../2016-graves.pdf. There is no #deepmind at the file level! This requires flaky hacks like looking up every annotation with the file as a prefix to see if there is an annotation with some anchor. I intend to remove this hack in favor of storing affiliations inside the annotation metadata; however, I may keep it as a convenient way to input affiliations.
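
    A toy helper showing the kind of anchor-splitting this hack requires (the affiliation list here is a hypothetical fragment; the real list has ~51 entries):

      -- Strip a known affiliation anchor off a URL, so file-level code can see the
      -- bare path while annotation-level code keeps the full '#deepmind' form.
      affiliations :: [String]
      affiliations = ["deepmind", "openai", "facebook"]   -- hypothetical subset

      -- splitAffiliation "https://arxiv.org/abs/1704.03073#deepmind"
      --   == ("https://arxiv.org/abs/1704.03073", Just "deepmind")
      splitAffiliation :: String -> (String, Maybe String)
      splitAffiliation url =
        case break (== '#') url of
          (base, '#' : frag) | frag `elem` affiliations -> (base, Just frag)
          _                                             -> (url, Nothing)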

  15.  

    The tag pop­ups used to sim­ply tran­sclude/load the tag page into the popup. This turned out to be un­pre­dictable & slow for the mega-tags like psychology which had hun­dreds to thou­sands of en­tries (and are in des­per­ate need of refac­tor­ing), which were also not tran­sclu­sions at the time, and so could take >10s to load.

  16.  

    Amus­ingly, while use­less for their os­ten­si­ble in­tended pur­pose, In­sta­gram tags were use­ful for early land­mark re­sults in deep learn­ing scal­ing (par­tic­u­larly Ma­ha­jan et al 2018), help­ing to es­tab­lish that neural nets could learn from bil­lions of im­ages when the ex­pert con­ven­tional wis­dom was that mil­lions was all that was use­ful & neural nets were ‘fun­da­men­tally un­able to scale’.

  17.  

    It’s worth noting that the embeddings are enhanced by including each item’s internal links, backlinks, manually-curated similar-links (the ‘see-also’ sections of annotations), and its prior tags: all of these should enhance the taggability & clusterability of the embeddings.

  18.  

    You can use LLMs di­rectly for tag­ging, with tricks like fine­tun­ing or in­clud­ing a list of valid tags to choose from, but these will prob­a­bly be less ac­cu­rate than a clas­si­fier, will tend to be slower & less suit­able for ⁠real-time active-learning of tags, and em­bed­dings are reusable for other pur­poses.

  19.  

    An al­ter­na­tive would be k-medoids, which would con­struct clus­ters whose ‘cen­ter’ is a spe­cific dat­a­point (with k-means, there is not nec­es­sar­ily a dat­a­point at the cen­ter of a clus­ter), mak­ing in­ter­pretabil­ity eas­ier for the user and pos­si­bly cre­at­ing higher-quality clus­ters. We wouldn’t want to use DB­SCAN be­cause it would ig­nore many points as ‘out­liers’; this is rea­son­able with real-world data where dat­a­points may be un­pre­dictable or out­right garbage, but in tag­ging, we can as­sume that all data is valid, and so we want to keep ‘out­liers’ and con­sider as­sign­ing them their own tag—maybe they are sim­ply im­ma­ture.

  20.  

    For ex­am­ple, my last few ses­sions’ au­to­matic ti­tles/sum­maries: “In­su­lated con­tainer tem­per­a­ture”, “Dan­ger­ous meta­data dates”, “Haskell Re­cur­sive File List­ing”, “Plas­tic Pipes Re­duce Build-up”, “Tem­plate for un­sup­ported ci­ta­tions”, “Buffer rewrite con­di­tion mod­i­fi­ca­tion”, “Git pull with au­to­matic merge”, “Grep for non-printable char­ac­ters”, “Al­ter­na­tive to (keyboard-quit)”, “Re­verse Mark­down List Order”, “Goal Set­ting The­ory & Con­sci­en­tious­ness”, “Odin’s Fa­vorite Spice?”, “Cat loung­ing against chair”, ⁠“De­crypt Poem Mes­sage”, “La­dy­bug Fork­lift Cer­ti­fi­ca­tion”.

    Why not do this for everything, like filenames, and end the ‘blinking 12’ problem of Untitled (89).doc? After all, the problem with these is due less to the intrinsic complexity or difficulty (it takes all of a minute to read the manual, or figure it out, or decide what to call a file) than to the irksomeness of taking a minute to do so every single time one has to: every power outage, Daylight Savings, VCR upgrade, random document, for what is ultimately a tiny benefit per-instance. (A real clerical benefit, which adds up in aggregate, but still, small individually, and thus easily outweighed by the hassle.) But an LLM can do it easily and won’t complain.

  21.  

    One idea I have not seen much, but which would be use­ful as au­toma­tion is added, is a con­cept of ‘neg­a­tive tags’ or ‘anti-tags’: as­sert­ing that an item is def­i­nitely not a tag.

    Tags are typically presented as a two-valued binary variable, but because the default for tag systems is typically to be untagged, and because most tag systems are incomplete, errors are highly skewed towards errors of omission rather than commission. An item with tag x but not tag y is almost always indeed an instance of x; however, it will often be an instance of y too. So the absence of a tag is much less informative than the presence of a tag. But there is no way to distinguish between “this item is not tagged y because no one has gotten around to it” and “because someone looked closely and it’s definitely not y”.

    In reg­u­lar tag use, this merely re­sults in some wasted ef­fort as users pe­ri­od­i­cally look at the item and double-check that it’s not y. With au­toma­tion, this can be a se­ri­ous ob­sta­cle to things like ac­tive learn­ing work­ing at all: if we have no way of mark­ing ‘it’s def­i­nitely not y’, then when we at­tempt to find in­stances of y which are not la­beled y, we will every time have to ig­nore the same false pos­i­tives. (And we also can’t train our clas­si­fier to ig­nore those false pos­i­tives, even though those would be the most valu­able to train on be­cause they are the ones which most fooled our clas­si­fier.)
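
    The distinction is easy to make representable; a sketch of the sort of three-valued tag state this implies:

      -- Three-valued tag state: the point is to make "definitely not y" distinct
      -- from the default "no one has looked yet", so classifiers & active learning
      -- can train on confirmed negatives instead of re-surfacing the same items.
      data TagState
        = Untagged   -- default: weak evidence of absence
        | Applied    -- item is an instance of the tag
        | Rejected   -- a human checked: definitely not this tag
        deriving (Eq, Show)

      -- Only items still in the default state are worth surfacing for review;
      -- confirmed negatives stop wasting curator time and become training data.
      needsReview :: TagState -> Bool
      needsReview Untagged = True
      needsReview _        = False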

  22.  

    This could, in fact, be how we do the clus­ter­ing refac­tor­ing to begin with: we sim­ply clus­ter by de­fault (ar­bi­trar­ily sort­ing across-clusters, and then semantic-sort within-cluster), and the user can hit a lit­tle but­ton to ‘can­on­ize’ a clus­ter as a new tag.

  23.  

    Imag­ine you are sort­ing a list of items A–Z ([ABCDE­FGHIJKMLNOPQRSTU­VWXYZ]), where you have pair­wise dis­tances like ‘A is closer to B than C or Z’. If you pick, say, ‘H’, and then sim­ply ‘sort by dis­tance’ to ‘H’ to form a 1D list, the re­sult would merely ‘ping-pong’ back and forth: [HIGJFKEM­DOCPB…]. (The more equidis­tant clus­ters there are, the worse this ping-pong ef­fect is.) This would be mean­ing­ful in that ‘P’ re­ally is slightly closer to ‘H’ than ‘B’, but the back-and-forth would look con­fus­ing to any reader, who wouldn’t see the un­der­ly­ing A–Z. How­ever, if you sorted pair­wise greed­ily, you would get a list like [HIJKMLN…YZ­ABCDEFG], and aside from the ‘jump’ at A/Z, this would be mean­ing­ful to the reader and much more pleas­ant to browse. It would also be eas­ier to cu­rate, as you can see the se­quence and also the ‘jump’, and, say, de­cide to edit A–G into its own tag, and then fur­ther re­fine what’s left into H–L & M–Z tags.
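
    A toy Haskell version of that greedy pairwise ordering (the pairwise distance function is left abstract; the real use case would plug in embedding distances):

      import Data.List (delete, minimumBy)
      import Data.Ord  (comparing)

      -- Start from a seed item and repeatedly append whichever remaining item is
      -- closest to the item just placed, instead of sorting everything by its
      -- distance to the seed (which produces the 'ping-pong' ordering).
      greedyOrder :: Eq a => (a -> a -> Double) -> [a] -> [a]
      greedyOrder _    []       = []
      greedyOrder dist (x : xs) = go x xs
        where
          go cur []   = [cur]
          go cur rest = cur : go next (delete next rest)
            where next = minimumBy (comparing (dist cur)) rest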

  24.  

    If you had a cor­pus of restau­rant re­views, you might want to clus­ter them by the type of food, but your de­fault em­bed­ding keeps group­ing them by ge­og­ra­phy when re­duced to a 2D GUI; no prob­lem, just sub­tract the tags for all of the cities like “New York City”, and then what’s left af­ter­wards will prob­a­bly be clump­ing by French vs Asian fu­sion vs Chi­nese etc and you can eas­ily lasso each clus­ter and name them and cre­ate new tags with hardly any toil.
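
    A toy sketch of that subtraction (treating embeddings as plain vectors; a real implementation would also re-normalize and use proper linear-algebra types):

      type Embedding = [Double]

      -- Average of the embeddings carrying the confounding tag (eg. all reviews
      -- tagged with a city name).
      centroid :: [Embedding] -> Embedding
      centroid es = map (/ fromIntegral (length es)) (foldr1 (zipWith (+)) es)

      -- Subtract that direction from every embedding, so the remaining variation
      -- (and hence the next clustering) is driven by something else, eg. cuisine.
      removeDirection :: [Embedding] -> [Embedding] -> [Embedding]
      removeDirection tagged allEmbeddings = map (\e -> zipWith (-) e dir) allEmbeddings
        where dir = centroid tagged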

  25.  

    This is ab­stract enough that I doubt it could be eas­ily ex­plained to a user, but I think it could be en­coded into the UI in a us­able way. Like one could present a 2D plot, where each dat­a­point and averaged-tag is pre­sented, and they can be clicked on to ‘strengthen’ or ‘weaken’ them (where each de­gree of strength cor­re­sponds to an em­bed­ding arith­metic op­er­a­tion, but weighted, like 10% per click). If you are in­ter­ested in the X-ness of all your dat­a­points, you sim­ply click on the X dot to ‘strengthen’ it a few times until the up­dated 2D plot makes sense.

  26.  

    An un­usual choice, as one does not as­so­ciate IBM with font de­sign ex­cel­lence, but nev­er­the­less, it was our choice after blind com­par­i­son of ~20 code fonts with vari­ant ze­roes (which we con­sider a re­quire­ment for code). An ap­peal­ing newer al­ter­na­tive is Jet­Brains Mono (which doesn’t work as well with Gwern.net’s style, but may suit other web­sites).

  27.  

    Side­notes have long been used as a ty­po­graphic so­lu­tion to densely-annotated texts such as the Geneva Bible (first 2 pages), but have not shown up much on­line yet.

    Screenshot of Google Books https://books.google.com/books?id=JmtXAAAAYAAJ&pg=PA900 , showing advanced typography in a single page which contains body text, footnotes, and (recursively) sidenotes to footnotes, of Pierre Bayle’s famous Enlightenment text, the ‘Historical and Critical Dictionary’ (pg900 of volume 4 of the 1737 English edition).

    Pierre Bayle’s Historical and Critical Dictionary, demonstrating recursive footnotes/sidenotes (1737, volume 4, pg901; source: Google Books)

    An early & in­spir­ing use of mar­gin/side notes.

  28.  

    IBM Plex Mono was cho­sen in part via using the Cod­ing­Font ‘tour­na­ment’; Adobe Source Code Pro also ranked high, and we used it ini­tially, but Plex Mono edged it out, with its use­ful al­ter­na­tives and a some­what bet­ter over­all ap­pear­ance. (Who knew IBM could com­mis­sion such a nice mono­space font?)

  29.  

    Or at least, so we think? Google Page­Speed keeps claim­ing that mini­fi­ca­tion would cut as much as half a sec­ond off total time.

  30.  

    Per­haps the re­turns to de­sign are also going up with time as In­ter­net de­sign­ers in­creas­ingly get all the rope they need to hang them­selves? What browser devs & Moore’s Law giveth, semi-malicious web de­sign­ers take away. Every year, the range of worst to best web­site gets broader, as ever new ways to de­grade the brows­ing ex­pe­ri­ence—not 1 but 100 track­ers! newslet­ter pop­ups! sup­port chat! Taboola chum­box! ‘browser no­ti­fi­ca­tions re­quested’! 50MB of hero im­ages! lay­out shifts right as you click on some­thing!—are in­vented. 80-column ASCII text files on BBSes offer lit­tle de­sign great­ness, but they are also hard to screw up. To make an out­stand­ingly bad web­site re­quires the lat­est CMSes, A/B test­ing in­fra­struc­ture to ⁠Schlitz your way to prof­itabil­ity, CDNs, ad net­work auc­tion­ing tech­nol­ogy, and high-paid web de­sign­ers using only Apple lap­tops. (A ⁠2021 satire; note that you need to dis­able ad­block.) Given the sub­tlety of this creep to­wards degra­da­tion & short-term prof­its and the rel­a­tively weak cor­re­la­tion with fit­ness/prof­itabil­ity, we can’t ex­pect any rapid evo­lu­tion to­wards bet­ter de­sign, un­for­tu­nately, but there is an op­por­tu­nity for those busi­nesses with taste.

  31.  

    Which might ac­count for why im­prove­ments in Gwern.net de­sign also seem to cor­re­late with more com­ments where the com­menter ap­pears in­fu­ri­ated by the de­sign—that’s cheat­ing!

  32.  

    One anecdote Steve Jobs cited for his perfectionism, even in things the user would ostensibly not see, is his father’s hobbyist carpentry, where he cared about making even the backs of fences & cabinets look good; from Walter Isaacson’s 2011 Steve Jobs:

    Jobs re­mem­bered being im­pressed by his fa­ther’s focus on crafts­man­ship. “I thought my dad’s sense of de­sign was pretty good”, he said, “be­cause he knew how to build any­thing. If we needed a cab­i­net, he would build it. When he built our fence, he gave me a ham­mer so I could work with him.”

    50 years later the fence still sur­rounds the back and side yards of the house in Moun­tain View. As Jobs showed it off to me, he ca­ressed the stock­ade pan­els and re­called a les­son that his fa­ther im­planted deeply in him. It was im­por­tant, his fa­ther said, to craft the backs of cab­i­nets and fences prop­erly, even though they were hid­den. “He loved doing things right. He even cared about the look of the parts you couldn’t see.”

    His fa­ther con­tin­ued to re­fur­bish and re­sell used cars, and he fes­tooned the garage with pic­tures of his fa­vorites. He would point out the de­tail­ing of the de­sign to his son: the lines, the vents, the chrome, the trim of the seats. After work each day, he would change into his dun­ga­rees and re­treat to the garage, often with Steve tag­ging along. “I fig­ured I could get him nailed down with a lit­tle me­chan­i­cal abil­ity, but he re­ally wasn’t in­ter­ested in get­ting his hands dirty”, Paul later re­called. “He never re­ally cared too much about me­chan­i­cal things. I wasn’t that into fix­ing cars”, Jobs ad­mit­ted. “But I was eager to hang out with my dad.”

    Cyn­i­cal com­men­ta­tors point out that many great pro­fes­sional fur­ni­ture mak­ers did not put much work into the back of cab­i­nets, as it was a waste; I would point out that since it is rel­a­tively easy to judge fur­ni­ture com­pared to soft­ware, their crit­i­cism ac­tu­ally re­veals why this at­ti­tude could be bril­liant mar­ket­ing for soft­ware.

    Whose soft­ware would you trust more (and pay a pre­mium for): the guy who clearly slacks off in the few places you can spot slack­ing, but swears he’s just ef­fi­ciently fo­cus­ing on the im­por­tant parts, “trust me!”—or the guy who is so neu­rot­i­cally per­fec­tion­ist that he costs his com­pany mil­lions of dol­lars at the last minute ‘fix­ing’ some minor ug­li­ness in­side the case where you might never have looked? (Or con­sider sig­na­ture Jobs moves like hav­ing ⁠“sig­na­tures” in­side the Mac­in­tosh case, where “no one” would see it; but of course, plenty of peo­ple would see it, as even Mac­in­toshes would be opened up rou­tinely, and those who saw it would tell oth­ers, and those would tell oth­ers, and the sig­na­tures be­came fa­mous enough that I am link­ing the story many decades later.)
