ASCII Input Standard (v1.0)
- Version 1.0
- Fallback behaviors
- Notes for font makers
- Conventions for some obscure features
- Standard glyph numbering
Version 1.0
This is a standard for typing Sitelen Pona using ASCII. It's used in the following situations:
- Someone is composing a Sitelen Pona document with a preprocessor that converts ASCII to Sitelen Pona.
- A font is converting ASCII strings into Sitelen Pona glyphs using ligatures. This standard was primarily created by font makers for this purpose.
- A developer of an Input Method Editor (IME) for Sitelen Pona may opt to use this standard as a basis for their input method.
To use a font's default glyph for a word, type the word, such as sewi.
To type a font's alternate glyph for a word, append a number, such as sewi2. This numbering differs between fonts and technologies. For example, some fonts assign the "upside-down anpa" version of sewi to sewi1; others assign it to sewi2.
To type a specific glyph for a word, in a way that's consistent across fonts and technologies, append a number with a leading zero, such as sewi02. (The leading zero will still be present after 10, e.g. word09, word010, word011.) kulupu Linku will publish and maintain a database of standardized glyph numberings. The initial set is below, under Standard glyph numbering.
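The three forms above (bare word, font-specific number, leading-zero standard number) can be told apart mechanically. The following is a minimal sketch of how a preprocessor might do that, not part of the standard itself. The names GlyphRequest and parse_glyph_token are hypothetical, and treating a lone trailing 0 as a font-specific number is an assumption, since the standard doesn't cover that case.

```python
import re
from typing import NamedTuple, Optional


class GlyphRequest(NamedTuple):
    """Hypothetical result type: which glyph of a word was requested."""
    word: str
    number: Optional[int]   # None means "use the font's default glyph"
    standardized: bool      # True when the leading-zero form was used


_TOKEN = re.compile(r"([a-z]+)(\d*)")


def parse_glyph_token(token: str) -> GlyphRequest:
    """Split a token such as 'sewi', 'sewi2' or 'sewi02' into its parts."""
    match = _TOKEN.fullmatch(token)
    if match is None:
        raise ValueError(f"not a word with an optional number: {token!r}")
    word, digits = match.groups()
    if not digits:
        return GlyphRequest(word, None, False)
    # A leading zero followed by at least one digit marks standard numbering,
    # so 'word010' is standard variant 10. A lone '0' is treated as a
    # font-specific number here (assumption, not covered by the standard).
    if digits.startswith("0") and len(digits) > 1:
        return GlyphRequest(word, int(digits[1:]), True)
    return GlyphRequest(word, int(digits), False)
```

For example, parse_glyph_token("sewi02") reports standard variant 2 of sewi, while parse_glyph_token("sewi2") reports font-specific variant 2.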
There are three ways of combining glyphs.
- To stack two glyphs on top of each other, use a hyphen (-). For example, kala-lili will show a kala with a lili above it.
- To nest one glyph inside another, use a plus (+). For example, kala+lili will show a kala with a lili inside it.
- To request a non-standard way of combining glyphs, use an ampersand (&). For example, kulupu&kili might draw three kili in the arrangement of the kulupu glyph. Also use the ampersand to allow a font designer to pick for you if there is no special combination. For example, in one font, kala&lili might stack; but in another, it might nest.
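As an illustration of the three combining symbols, here is a small sketch of how a preprocessor might split a combined token into its words and the requested combining methods. The function name parse_compound and the labels "stack", "nest" and "font-defined" are hypothetical, chosen only for this example.

```python
import re

# Hypothetical labels for the three combining symbols described above.
COMBINERS = {"-": "stack", "+": "nest", "&": "font-defined"}


def parse_compound(token):
    """Return (words, methods) for a token such as 'kala-lili'."""
    # The capturing group keeps the combining symbols in the split result,
    # so words sit at even indices and symbols at odd indices.
    parts = re.split(r"([-+&])", token)
    words = parts[0::2]
    methods = [COMBINERS[symbol] for symbol in parts[1::2]]
    if any(not word for word in words):
        raise ValueError(f"combining symbol without a word on both sides: {token!r}")
    return words, methods
```

For example, parse_compound("kala+lili") gives the words kala and lili with the method "nest", and a plain word such as "kala" comes back with an empty list of methods.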
To write a cartouche, use the symbols [ and ]. For example, "jan Itan" might be written jan [ijo tan anpa nanpa].
To use nasin sitelen kalama inside cartouches, use . and :. For example, "jan Itan" might be written jan [ijo tan:].
To use tally marks below cartouches, use ,. For example, "jan Itan" might be written jan [ijo, tan,,,].
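For the plain cartouche form, the first example above spells "Itan" with the first letters of ijo, tan, anpa and nanpa. The sketch below reads names that way. It is only an illustration: the function name read_plain_cartouches is hypothetical, and it deliberately rejects cartouches that use the ., : or , markers, because it doesn't interpret them.

```python
import re

# Matches one cartouche and captures the text between [ and ].
CARTOUCHE = re.compile(r"\[([^\[\]]*)\]")


def read_plain_cartouches(text):
    """Return the names spelled by plain cartouches in text, first letters only."""
    names = []
    for content in CARTOUCHE.findall(text):
        words = content.split()
        # Markers such as '.', ':' and ',' change how a word is read, and
        # this sketch does not interpret them, so it refuses such input.
        if not all(word.isalpha() for word in words):
            raise ValueError(f"cartouche uses markers not handled here: [{content}]")
        names.append("".join(word[0] for word in words).capitalize())
    return names
```

For example, read_plain_cartouches("jan [ijo tan anpa nanpa]") returns a list containing "Itan".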
To write quotation marks, use te for opening quote, and to for closing quote.
To separate words, type a single space, e.g. kala ma. This space is collapsed to zero-width. For example, kala ma will show the "kala" glyph followed by the "ma" glyph, with no space in between.
To write a full-width space, type |. For example, to write two sentences on one line without punctuation, type something like soweli li pona | o pona tawa ona. To indent a sentence fragment, type something like:
soweli mi li suwi
| | li pona tawa mi
| | | | tawa jan ante
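The two kinds of space can be sketched as a small tokenizer: ordinary spaces only separate words and produce nothing visible, while each | becomes a full-width space. This is an illustrative sketch, not a required implementation. The function name layout_tokens and the FULL_WIDTH_SPACE marker are hypothetical.

```python
import re

# Hypothetical marker standing in for whatever full-width space a renderer uses.
FULL_WIDTH_SPACE = "<full-width space>"


def layout_tokens(line):
    """Turn one ASCII line into a list of words and full-width space markers."""
    tokens = []
    # ASCII spaces only separate words; they never produce visible space.
    for chunk in line.split(" "):
        # A chunk may hold '|' next to a word (e.g. 'pona|o'), so split on
        # '|' while keeping it in the result.
        for piece in re.split(r"(\|)", chunk):
            if piece == "|":
                tokens.append(FULL_WIDTH_SPACE)
            elif piece:
                tokens.append(piece)
    return tokens
```

With this sketch, the indented line "| | li pona tawa mi" starts with two full-width space markers followed by the four words.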
To write the word "pi" wrapping under other words, use ( and ). For example, "tomo pi mama mi li suwi" might be written tomo pi(mama mi) li suwi.
There are two ways of rotating the word "ni":
- Use leading-zero standard numbering, such as ni02 for "ni" pointing right.
- Use v, >, ^, and <, such as ni> for "ni" pointing right.
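Because both notations name the same glyphs, a tool could normalize the arrow form to the numbered form. The sketch below covers only the four single-arrow directions and uses the numbers given under Standard glyph numbering (ni01 down, ni02 right, ni03 up, ni04 left). The function name normalize_ni is hypothetical.

```python
# Arrow suffix to standard number, per the Standard glyph numbering list.
NI_ARROWS = {"v": "ni01", ">": "ni02", "^": "ni03", "<": "ni04"}


def normalize_ni(token):
    """Rewrite 'niv', 'ni>', 'ni^' or 'ni<' to leading-zero numbering."""
    if token.startswith("ni") and token[2:] in NI_ARROWS:
        return NI_ARROWS[token[2:]]
    # Anything else (plain 'ni', 'ni02', other words) is left untouched.
    return token
```

For example, normalize_ni("ni>") returns "ni02", and a word such as "nimi" is returned unchanged.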
Fallback behaviors
- When a word isn't supported by a font or technology, the fallback is undefined and varies. For example, for an unsupported word "kokosila", a font might display a [?] symbol, a string like "koTok koTok s i laTok", or a cartouche like "[ko.ko.sina.la.]".
- If a numbered glyph variant or directional glyph variant isn't available, the font or technology picks a different variant to use.
- If long pi isn't available, ( and ) are treated as invisible.
- If the user-specified method for combining glyphs isn't available, then a font or technology can pick a different combo strategy, or not combine the characters at all.
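One possible reading of the variant fallback is sketched below. The standard leaves the choice of replacement variant to the font or technology, so falling back to the first (default) variant is an assumption made for this example. The function name resolve_variant and the font_glyphs table are hypothetical.

```python
from typing import Dict, List, Optional


def resolve_variant(font_glyphs: Dict[str, List[str]],
                    word: str,
                    number: Optional[int]) -> Optional[str]:
    """Pick a glyph for word; number is a 1-based variant or None for default."""
    variants = font_glyphs.get(word)
    if not variants:
        # Unsupported word: the standard leaves this fallback undefined.
        return None
    if number is not None and 1 <= number <= len(variants):
        return variants[number - 1]
    # Requested variant is missing (or none was requested): fall back to the
    # first variant. Any other variant would be equally permitted.
    return variants[0]
```

With a hypothetical table holding two sewi glyphs, a request for variant 5 returns the first glyph instead of failing.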
Notes for font makers
If you're making a font that converts ASCII into Sitelen Pona using ligatures, you should make the standard space show a zero-width space. This collapses the space between two words, such as jan pona. Intentional spaces between words can be typed with |. This design choice ensures that sequences of spaces show up properly in web browsers, avoiding the space-collapsing design of HTML itself, and avoiding an Apple bug that prevents ligatures like p o n a space from being used.
If a tally mark is used outside a cartouche, the resulting behavior is undefined and varies. Since sentences like tenpo ni la, mi pilin pona are common, and would otherwise result in an unintended tally mark below la, we recommend hiding the tally mark outside of cartouches, and producing a zero-width or full-width space.
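The recommendation above can be sketched as a text-level filter that drops commas outside cartouches and leaves those inside untouched. A font would do this with its own substitution rules; the Python below only illustrates the logic. The function name hide_tallies_outside_cartouches is hypothetical, and the replacement argument stands for whichever zero-width or full-width space is chosen.

```python
def hide_tallies_outside_cartouches(text, replacement=""):
    """Replace ',' outside [ ... ] with replacement; keep ',' inside cartouches."""
    result = []
    inside_cartouche = False
    for char in text:
        if char == "[":
            inside_cartouche = True
        elif char == "]":
            inside_cartouche = False
        if char == "," and not inside_cartouche:
            # Outside a cartouche the comma is ordinary punctuation, not a tally.
            result.append(replacement)
        else:
            result.append(char)
    return "".join(result)
```

For example, the comma in "tenpo ni la, mi pilin pona" is removed, while "jan [ijo, tan,,,]" is returned unchanged.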
Supporting a space between pi and ( is optional. To guarantee compatibility, authors should write pi( without a space.
Conventions for some obscure features
These ASCII symbols aren't standardized by this document, and may be repurposed for something else in a future revision if a more important or popular feature needs them. But they're included here to help font makers to experiment:
To wrap a glyph backwards around other words, use { and }. For example, "mi tawa tomo mi" might be written mi tawa {tomo}mi. (Many font makers use { and } for other purposes, such as custom cartouches, and that isn't a violation of this standard.)
To type dakuten in a cartouche, use " and *. For example, "jan Itan" might be written jan [ijo" tan*].
To manually insert a combining cartouche extension (the middle of a cartouche), use =. To manually insert a combining long glyph extension (the middle of long pi), use _. Combining extensions are applied to the preceding glyph, so mi toki= pona shows "toki" with a line above and below it.
Standard glyph numbering
The following list is maintained by kulupu Linku, not the Association. It’s not part of this standard, and is only included for reference. Glyphs are included based on frequency of their inclusion in fonts, and ordered historically.
For every glyph included in "Toki Pona: The Language of Good" (2014), word01 produces the original version of that glyph. For example, sama01 produces an equals sign.
- a01: pu a
- a03 (not a typo): triple-stick a
- akesi01: 6 legged akesi
- akesi02: 4 legged akesi
- ante01: pu ante
- ante02: skew ante
- apeja01: kulupu apeja
- epiku01: upvote epiku
- epiku02: emitters epiku
- isipin01: lawa emitters isipin
- jami01: suwi jami
- jan01: pu jan
- jan02: eyes jan
- kala01: pu kala
- kala02: eyes kala
- kamalawala01: anarchist kamalawala
- kapesi01: loje jelo laso style kapesi
- kiki01: explosion kiki
- kiki02: triangle kiki
- ko01: blobby ko
- ko02: flower ko
- kokosila02 (not a typo): toki kokosila
- konwe01: glider konwe
- kulijo01: lete kulijo
- lanpan01: jo lanpan
- lanpan02: pana lanpan
- lape01: lying down lape
- lape02: u.u lape
- linluwi01: emitters linluwi
- linluwi02: sitelen linluwi
- linluwi03: kulupu linluwi
- lupa01: pu lupa
- lupa02: sitelen sa lupa
- majuna01: turned sin majuna
- majuna02: lotus/book majuna
- meli01: pu meli
- meli02: venus meli
- melome01: right-side-up melome
- meso01: tu meso
- mije01: pu mije
- mije02: mars mije
- mijomi01: right-side-up mijomi
- misa01: legless earless misa
- misikeke01: capsule misikeke
- misikeke02: mortar and pestle misikeke
- moli01: pu moli
- moli02: x_x moli
- monsi01: pu monsi
- monsi02: tail monsi
- monsuta01: jagged line monsuta
- mu01: pu mu
- mu02: dialogue punctuation
- mute01: pu mute
- mute02 or luka&luka&luka&luka: four hands mute
- namako01: 4-line sin namako
- namako02: pepper namako
- nena01: pu nena
- nena02: sitelen sa nena
- ni01 or niv: down ni
- ni02 or ni>: right ni
- ni03 or ni^: up ni
- ni04 or ni<: left ni
- ni05, niv>, or ni>v: down right ni
- ni06, ni^v, or ni>^: up right ni
- ni07, ni^<, or ni<^: up left ni
- ni08, niv<, or ni<v: down left ni
- oko01: side oko
- olin01: pu olin
- olin02: overlapping olin
- olin03: emitters olin
- omekapo01: kala emitters omekapo
- owe01: lukin emitters oko
- pake01: T pake
- penpo01: toki pona with 8 emitters penpo
- pika01: lightning pika
- po01: numeral 4 po
- powe01: lon ala powe
- puwa01: cloud puwa
- san01: triple backwards wan san
- sewi01: pu sewi
- sewi02: turned anpa sewi
- sinpin01: pu sinpin
- sinpin02: face sinpin
- sitelen01: pu sitelen
- sitelen02: line sitelen
- soko01: thin stem soko
- soko02: thick stem soko
- soko03: annulus soko
- soto01: box soto
- su01: oz su
- su02: unicorn su
- su03: suno su
- sutopatikuna01: head sutopatikuna
- taki01: yin yang taki
- taki02: magnet taki
- teje01: box teje
- tenpo01: pu tenpo
- tenpo02: hourglass tenpo
- tomo01: pu tomo
- tomo02: overhang tomo
- tonsi01: transgender symbol tonsi
- tonsi02: nonbinary symbol tonsi
- unu01: kule mun unu
- usawi01: nasa emitters usawi
- uta01: pu uta
- uta02: dotless uta
- wa01: exclamation wile wa
- wasoweli01: 4 legged wasoweli
- wekama01: weka kama wekama
- wile01: pu wile
- wile02: turned heart wile
- wuwojiti01: box wuwojiti
- yupekosi01: y yupekosi