Unicode versus Texas symbols

Unicode symbols and their Texas (ASCII) counterparts

The following unicode symbols can be used in Perl 6 without needing to load any additional modules. Please note that the Since column applies to the version of Perl the symbol was known to be available. It is only mentioned if it is different from 6.c.

Reference is made below to various properties of unicode codepoints. The definitive list can be found here: http://www.unicode.org/Public/UCD/latest/ucd/PropList.txt.

Alphabetic Characters

Any codepoint that has the Ll (Letter, lowercase), Lu (Letter, uppercase), Lt (Letter, titlecase), Lm (Letter, modifier), or the Lo (Letter, other) property can be used just like any other alphabetic character from the ASCII range.

Numeric characters

Any codepoint that has the Nd (Number, decimal digit) property, can be used as a digit in any number. For example:

my $var = 19# U+FF11 U+FF19 
say $var + 2;  # OUTPUT: «21␤» 

Numeric values

Any codepoint that has the No (Number, other) or Nl (Number, letter) property can be used standalone as a numeric value, such as ½ and ⅓. (These aren't decimal digit characters, so can't be combined.) For example:

my $var = ⅒ + 2 + Ⅻ; # here ⅒ is No and Rat and Ⅻ is Nl and Int 
say $var;            # OUTPUT: «14.1␤» 

Whitespace characters

Besides spaces and tabs you can use any other unicode whitespace character that has the Zs (Separator, space), Zl (Separator, line), or Zp (Separator, paragraph) property.

Other acceptable single codepoints

This list contains the single codepoints [and their Texas (ASCII) equivalents] that have a special meaning in Perl 6.

Symbol Codepoint Texas Since Remarks
« U+00AB << v6.c as part of «» or .« or regex left word boundary
» U+00BB >> v6.c as part of «» or .» or regex right word boundary
× U+00D7 * v6.c
÷ U+00F7 / v6.c
U+2264 <= v6.c
U+2265 >= v6.c
U+2260 != v6.c
U+2212 - v6.c
U+2218 o v6.c
U+2245 =~= v6.c
π U+03C0 pi v6.c 3.14159_26535_89793_238e0
τ U+03C4 tau v6.c 6.28318_53071_79586_476e0
𝑒 U+1D452 e v6.c 2.71828_18284_59045_235e0
U+221E Inf v6.c
U+2026 ... v6.c
U+2018 ' v6.c as part of ‘’ or ’‘
U+2019 ' v6.c as part of ‘’ or ‚’ or ’‘
U+201A ' v6.c as part of ‚‘ or ‚’
U+201C " v6.c as part of “” or ”“
U+201D " v6.c as part of “” or ”“ or ””
U+201E " v6.c as part of „“ or „”
U+FF62 Q// v6.c as part of 「」 (Note: Q// variant cannot be used bare in regexes)
U+FF63 Q// v6.c as part of 「」 (Note: Q// variant cannot be used bare in regexes)
U+207A + v6.c (must use explicit number) as part of exponentiation
U+207B - v6.c (must use explicit number) as part of exponentiation
¯ U+00AF - v6.c (must use explicit number) as part of exponentiation (macron is an alternative way of writing a minus)
U+2070 **0 v6.c can be combined with ⁰..⁹
¹ U+00B9 **1 v6.c can be combined with ⁰..⁹
² U+00B2 **2 v6.c can be combined with ⁰..⁹
³ U+00B3 **3 v6.c can be combined with ⁰..⁹
U+2074 **4 v6.c can be combined with ⁰..⁹
U+2075 **5 v6.c can be combined with ⁰..⁹
U+2076 **6 v6.c can be combined with ⁰..⁹
U+2077 **7 v6.c can be combined with ⁰..⁹
U+2078 **8 v6.c can be combined with ⁰..⁹
U+2079 **9 v6.c can be combined with ⁰..⁹
U+2205 set() v6.c (empty set)
U+2208 (elem) v6.c
U+2209 !(elem) v6.c
U+220B (cont) v6.c
U+220C !(cont) v6.c
U+2286 (<=) v6.c
U+2288 !(<=) v6.c
U+2282 (<) v6.c
U+2284 !(<) v6.c
U+2287 (>=) v6.c
U+2289 !(>=) v6.c
U+2283 (>) v6.c
U+2285 !(>) v6.c
U+227C (<+) v6.c
U+227D (>+) v6.c
U+222A (|) v6.c
U+2229 (&) v6.c
U+2216 (-) v6.c
U+2296 (^) v6.c
U+228D (.) v6.c
U+228E (+) v6.c

Multiple codepoints

This list contains multiple-codepoint operators that require special composition for their Texas (ASCII) equivalents. Note the codepoints are shown space-separated but should be entered as adjacent codepoints when used.

Symbol Codepoints Texas Since Remarks
»=» U+00BB = U+00BB >>[=]>> v6.c uses ASCII '='
«=» U+00AB = U+00BB <<[=]>> v6.c uses ASCII '='