The London Perl and Raku Workshop takes place on 26th Oct 2024. If your company depends on Perl, please consider sponsoring and/or attending.
How to enter various non-ASCII characters in editors

Note: you will need to enter the “TeX” input method to enter emacs key
combinations starting with “\”, see below.

 Unicode       ASCII                        key sequence
  char      fallback        Vim        Emacs       Compose sequence

    «           <<          ^k < <     C-x 8 <     Compose < <
    »           >>          ^k > >     C-x 8 >     Compose > >
    ¥           Y           ^k Y e     C-x 8 Y     Compose Y =

Set.pm operators (included for reference):
    ≠           !=          ^k ! =     \ne
    ∩           *           ^k ( U     \cap
    ∪           +           ^k ) U     \cup
    ∖           -                      \setminus
    ⊂           <           ^k ( C     \subset
    ⊃           >           ^k ) C     \supset
    ⊆           <=          ^k ( _     \subseteq
    ⊇           >=          ^k ) _     \supseteq
    ⊄      !( $a < $b )                \nsubset
    ⊅      !( $a > $b )                \nsupset
    ⊈     !( $a <= $b )                \nsubseteq
    ⊉     !( $a >= $b )                \nsupseteq
    ⊊           <                      \subsetneq
    ⊋           >                      \supsetneq
    ∋/∍   $a.includes($b)   ^k ) -     \ni
    ∈/∊   $b.includes($a)   ^k ( -     \in
    ∌    !$a.includes($b)              C-q 1236051
    ∉    !$b.includes($a)              \notin
    ø         set()                    C-x 8 / o

Other unicode characters used as operators in Perl 6 programs should
be added here (for review, not necessarily for inclusion in Perl 6),
so long as the /intent/ of the Unicode code point is to depict an
operator that has a well known function.  That is, this shouldn't be a
‘that glyph looks the most like this concept’ thing.  If it's not
their intended use, don't use them :).

So, these *might* be considered not too awful;

    ×           *           ^k * X     C-x 8 x
    ¬           !           ^k N O     C-x 8 ~ ~
    ∕           /           ^k / f     C-q 1236065
    ÷           /           ^k :-      C-x 8 / /
    ≡          =:=          ^k = 3     \equiv
    ≔           :=          ^v u 2254  \coloneq
  ⩴ or ≝       ::=                     C-q 1113524 / C-q 1236175
  ≈ or ≊        ~~          ^k ? 2     \approx / \approxeq
    …          ...          ^v u 2026  \ldots
    √          sqrt()       ^k R T     \surd
    ∧           &&          ^k A N     \wedge
    ∨           ||          ^k O R     \vee
    ∣           mod                    \mid
   ⌈$x⌉        ceil($x)     ^k </> 7   \lceil / \rceil
   ⌊$x⌋        floor($x)    ^k 7 </> 7 \lfloor / \rfloor


However I think it is a BAD idea that the following unicode characters
be abused; however ❝cute❞ it might be…

    ╳           *
    ∗           *
    ⊗           *
✱✲✳✴✵✶✷✸✹ etc   *
   ⟨ ⟩         < >
   ⟪ ⟫         « »        (er, I mean << >> ;-) )
    →           ->
    ‼          !(!($a))   (i.e., cast to boolean (better: ?$a))
   ⁇ ‼        ?? !!
    ∥         || or //
    ¹           **1       (as these glyphs are just "superscript 1" etc)
    ²           **2                    C-x ^ 2
    ³           **3                    C-x ^ 3
    ☂         eval{ }
    ☝         warn
    ⋯          ...
    ☣          ...
    ☠          !!!
    ☢          ???
   ✁          HERE-doc
   ✃
    ☮         exit(0);                 C-q 1110656


Vim
~~~
Vim users see ":help digraph-table" and ":digraphs".

The digraphs used in vim come from "Character Mnemonics & Character
Sets", RFC1345 (http://www.ietf.org/rfc/rfc1345.txt). After doing :set
digraph, the digraph ^k A B may also be entered as A <BS> B.
    
Emacs
~~~~~
Most of these characters can be entered using the TeX input method,
though, which you can select using:

  M-x set-input-method <enter> TeX <enter>

You can also bind a key, such as C-# (C-S-3 on US layout) to
ucs-insert, which allows you to enter a unicode character by its
hexadecimal codepoint;

  (global-set-key [?\C-#] 'ucs-insert)

The C-q numbers above there are emacs codepoints; emacs has an
internationalisation system similar to Unicode that predates Unicode.
You can see the emacs codepoint of a character by moving the cursor
over the character and typing: C-u C-x =.  M-x describe-char-after
gives more information, but is still missing the Unicode codepoint in
the information (emacs 21.4).

X11R6 / Xorg
~~~~~~~~~~~~
There are two ways for Unix/X11 users to enable the compose key:

Either see: 
http://www.gentoo.org/news/en/gwn/20041206-newsletter.xml#doc_chap8

Or run 'xev' to figure out the keycode of whatever key you want to use
as your compose key (bring the program into focus and type the key,
then wade through the garbage to find the keycode), then put:

    keycode 115 = Multi_key

(Substituting for 115 the keycode you found) Into your
$HOME/.xmodmaprc, and run:

    % xmodmap ~/.xmodmaprc

Note that the compose combinations here are an X11R6 standard, and do
not necessarily correspond to the compose combinations available when
you use your "compose" key.  (FIXME - perhaps someone can research and
find the history of them?  maybe they have roots in a standard
somewhere)

Of course, being able to raise an input event that corresponds to the
character you want does not guarantee that it will end up in your
editor intact :-).

Fonts
~~~~~
The `6x13' and `9x15' bitmap fonts supplied with XFree86 have some of
the best coverage of Unicode.  `10x20' also has most of the characters
in this document.