How to enter various non-ASCII characters in editors
Note: you will need to enter the “TeX” input method to enter emacs key
combinations starting with “\”, see below.
Unicode ASCII key sequence
char fallback Vim Emacs Compose sequence
« << ^k < < C-x 8 < Compose < <
» >> ^k > > C-x 8 > Compose > >
¥ Y ^k Y e C-x 8 Y Compose Y =
Set.pm operators (included for reference):
≠ != ^k ! = \ne
∩ * ^k ( U \cap
∪ + ^k ) U \cup
∖ - \setminus
⊂ < ^k ( C \subset
⊃ > ^k ) C \supset
⊆ <= ^k ( _ \subseteq
⊇ >= ^k ) _ \supseteq
⊄ !( $a < $b ) \nsubset
⊅ !( $a > $b ) \nsupset
⊈ !( $a <= $b ) \nsubseteq
⊉ !( $a >= $b ) \nsupseteq
⊊ < \subsetneq
⊋ > \supsetneq
∋/∍ $a.includes($b) ^k ) - \ni
∈/∊ $b.includes($a) ^k ( - \in
∌ !$a.includes($b) C-q 1236051
∉ !$b.includes($a) \notin
ø set() C-x 8 / o
Other unicode characters used as operators in Perl 6 programs should
be added here (for review, not necessarily for inclusion in Perl 6),
so long as the /intent/ of the Unicode code point is to depict an
operator that has a well known function. That is, this shouldn't be a
‘that glyph looks the most like this concept’ thing. If it's not
their intended use, don't use them :).
So, these *might* be considered not too awful;
× * ^k * X C-x 8 x
¬ ! ^k N O C-x 8 ~ ~
∕ / ^k / f C-q 1236065
÷ / ^k :- C-x 8 / /
≡ =:= ^k = 3 \equiv
≔ := ^v u 2254 \coloneq
⩴ or ≝ ::= C-q 1113524 / C-q 1236175
≈ or ≊ ~~ ^k ? 2 \approx / \approxeq
… ... ^v u 2026 \ldots
√ sqrt() ^k R T \surd
∧ && ^k A N \wedge
∨ || ^k O R \vee
∣ mod \mid
⌈$x⌉ ceil($x) ^k </> 7 \lceil / \rceil
⌊$x⌋ floor($x) ^k 7 </> 7 \lfloor / \rfloor
However I think it is a BAD idea that the following unicode characters
be abused; however ❝cute❞ it might be…
╳ *
∗ *
⊗ *
✱✲✳✴✵✶✷✸✹ etc *
⟨ ⟩ < >
⟪ ⟫ « » (er, I mean << >> ;-) )
→ ->
‼ !(!($a)) (i.e., cast to boolean (better: ?$a))
⁇ ‼ ?? !!
∥ || or //
¹ **1 (as these glyphs are just "superscript 1" etc)
² **2 C-x ^ 2
³ **3 C-x ^ 3
☂ eval{ }
☝ warn
⋯ ...
☣ ...
☠ !!!
☢ ???
✁ HERE-doc
✃
☮ exit(0); C-q 1110656
Vim
~~~
Vim users see ":help digraph-table" and ":digraphs".
The digraphs used in vim come from "Character Mnemonics & Character
Sets", RFC1345 (http://www.ietf.org/rfc/rfc1345.txt). After doing :set
digraph, the digraph ^k A B may also be entered as A <BS> B.
Emacs
~~~~~
Most of these characters can be entered using the TeX input method,
though, which you can select using:
M-x set-input-method <enter> TeX <enter>
You can also bind a key, such as C-# (C-S-3 on US layout) to
ucs-insert, which allows you to enter a unicode character by its
hexadecimal codepoint;
(global-set-key [?\C-#] 'ucs-insert)
The C-q numbers above there are emacs codepoints; emacs has an
internationalisation system similar to Unicode that predates Unicode.
You can see the emacs codepoint of a character by moving the cursor
over the character and typing: C-u C-x =. M-x describe-char-after
gives more information, but is still missing the Unicode codepoint in
the information (emacs 21.4).
X11R6 / Xorg
~~~~~~~~~~~~
There are two ways for Unix/X11 users to enable the compose key:
Either see:
http://www.gentoo.org/news/en/gwn/20041206-newsletter.xml#doc_chap8
Or run 'xev' to figure out the keycode of whatever key you want to use
as your compose key (bring the program into focus and type the key,
then wade through the garbage to find the keycode), then put:
keycode 115 = Multi_key
(Substituting for 115 the keycode you found) Into your
$HOME/.xmodmaprc, and run:
% xmodmap ~/.xmodmaprc
Note that the compose combinations here are an X11R6 standard, and do
not necessarily correspond to the compose combinations available when
you use your "compose" key. (FIXME - perhaps someone can research and
find the history of them? maybe they have roots in a standard
somewhere)
Of course, being able to raise an input event that corresponds to the
character you want does not guarantee that it will end up in your
editor intact :-).
Fonts
~~~~~
The `6x13' and `9x15' bitmap fonts supplied with XFree86 have some of
the best coverage of Unicode. `10x20' also has most of the characters
in this document.