Most Latin-based European languages were supported in LaTeX by
introducing the T1 font encoding and by using the fontenc and
inputenc packages; these use only standard TeX means to support
any 8-bit input encoding and this one standard font encoding. The
restriction to a single font encoding guarantees that multiple
languages can happily coexist in one document (e.g., hyphenation
will be correct for all languages).
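
As a minimal sketch of this setup, a preamble might look as follows. The choice of latin1 as the input encoding and the use of babel with the german and english options are assumptions made for illustration; neither is prescribed by this text. The German hyphenation patterns must also be available in the installed format.

```latex
\documentclass{article}
\usepackage[T1]{fontenc}           % the single standard font encoding
\usepackage[latin1]{inputenc}      % assumed input encoding; must match how the file is saved
\usepackage[german,english]{babel} % assumption: babel selects the hyphenation patterns

\begin{document}
English text, followed by
\foreignlanguage{german}{ein Beispiel f\"ur deutsche Silbentrennung}.
\end{document}
```

The body uses only ASCII and the accent command \"u, so the example compiles regardless of how the file is saved. Both languages are typeset in T1, so each can be hyphenated with its own patterns.
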
Starting with the December 1998 release, LaTeX finally supports
Cyrillic languages. This support is based on the new standard
Cyrillic TeX font encodings: T2A, T2B, T2C, and X2. The first
three of these satisfy some basic requirements for LaTeX T*
encodings, and thus can be used in multilingual documents
with other languages based on standard font encodings.
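
The following sketch shows one way to mix a T2 encoding with T1 in a single document. It assumes that fonts for T2A are installed. The Cyrillic word is written with encoding-independent letter commands (\CYRP, \cyrr, and so on), so the example does not depend on any particular input encoding.

```latex
\documentclass{article}
% Load both encodings; the last one listed (T1) becomes the default.
\usepackage[T2A,T1]{fontenc}

\begin{document}
Latin text in T1, then a Cyrillic word in T2A:
{\fontencoding{T2A}\selectfont
 \CYRP\cyrr\cyri\cyrv\cyre\cyrt}.
\end{document}
```

In a real document the encoding switch would normally be handled by a language package rather than by explicit \fontencoding calls; the explicit form is used here only to make the mechanism visible.
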
Four different Cyrillic font encodings are needed because together they must support all the Cyrillic languages that have been used during the twentieth century (see Section 4). The number of Cyrillic glyphs is too large to fit into the 128 character slots available for them in a single encoding; the other (lower) 128 slots are reserved for Latin letters and other invariant symbols that are needed for the encoding to be a conformant LaTeX T* encoding.
There are some glyphs in the T2* encodings which do not yet have
associated characters in Unicode, the world-wide character
standard. Also, one more font encoding, T2D, is planned for a
forthcoming release of LaTeX. A lot of Cyrillic input encodings
are already supported (see Section 5), and additional
encodings could be added easily.