TeX is a typesetting system: it was especially designed to handle complex mathematics, as well as most ordinary text typesetting.
TeX is a batch language, like C or Pascal, and not an interactive "word processor": you compile a TeX input file into a corresponding device-independent (DVI) file (and then translate the DVI file to the commands for a particular output device). This approach has both considerable disadvantages and considerable advantages. For a complete description of the TeX language, see The TeXbook (see section B. References). Many other books on TeX, introductory and otherwise, are available.
tex invocation
TeX (usually invoked as tex) formats the given text and commands, and outputs a corresponding device-independent representation of the typeset document. This section merely describes the options available in the Web2c implementation. For a complete description of the TeX typesetting language, see The TeXbook (see section B. References).
TeX, Metafont, and MetaPost process the command line (described here) and determine their memory dump (fmt) file in the same way (see section 3.5.2 Memory dumps). Synopses:
tex [option]... [texname[.tex]] [tex-commands]
tex [option]... \first-line
tex [option]... &fmt args
TeX searches the usual places for the main input file texname (see section `Supported file formats' in Kpathsea), extending texname with `.tex' if necessary. To see all the relevant paths, set the environment variable KPATHSEA_DEBUG to `-1' before running the program.
After texname is read, TeX processes any remaining tex-commands on the command line as regular TeX input. Also, if the first non-option argument begins with a TeX escape character (usually \), TeX processes all non-option command-line arguments as a line of regular TeX input.
If no arguments or options are specified, TeX prompts for an input file name with `**'.
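The argument handling described above can be modelled roughly as follows. This is only an illustrative Python sketch, not code from Web2c: the function name classify_args and the keys of the returned dictionary are hypothetical, options are assumed to have been removed already, and the text does not say what happens when only &fmt is given, so that case is simply labelled "none".

```python
def classify_args(args, escape="\\"):
    """Classify non-option command-line arguments per the three synopses.

    Hypothetical helper; `args` is assumed to contain no options.
    """
    fmt = None
    if args and args[0].startswith("&"):
        # Third synopsis: &fmt selects the memory dump (format) file.
        fmt, args = args[0][1:], args[1:]
    if not args:
        # With nothing at all, TeX prompts with `**'. The &fmt-only case
        # is not described in the text, so it is just labelled "none".
        form = "prompt" if fmt is None else "none"
        return {"fmt": fmt, "form": form, "file": None, "input": ""}
    if args[0].startswith(escape):
        # Second synopsis: everything is one line of regular TeX input.
        return {"fmt": fmt, "form": "input-line", "file": None,
                "input": " ".join(args)}
    # First synopsis: a file name, then optional TeX commands.
    return {"fmt": fmt, "form": "file", "file": args[0],
            "input": " ".join(args[1:])}
```

For instance, classify_args(["story", "\\bye"]) reports the file form with `story' as the input file and `\bye' as the remaining TeX input.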
TeX writes the main DVI output to the file `basetexname.dvi', where basetexname is the basename of texname, or `texput' if no input file was specified. A DVI file is a device-independent binary representation of your TeX document. The idea is that after running TeX, you translate the DVI file using a separate program to the commands for a particular output device, such as a PostScript printer (see section `Introduction' in Dvips) or an X Window System display (see xdvi(1)).
TeX also reads TFM files for any fonts you load in your document with the \font primitive. By default, it runs an external program named `mktextfm' to create any nonexistent TFM files. You can disable this at configure-time or runtime (see section `mktex configuration' in Kpathsea). This is enabled mostly for the sake of the EC fonts, which can be generated at any size.
TeX can write output files, via the \openout primitive; this opens a security hole vulnerable to Trojan horse attack: an unwitting user could run a TeX program that overwrites, say, `~/.rhosts'. (MetaPost has a write primitive with similar implications.) To alleviate this, there is a configuration variable openout_any, which selects one of three levels of security. When it is set to `a' (for "any"), no restrictions are imposed. When it is set to `r' (for "restricted"), filenames beginning with `.' are disallowed (except `.tex' because LaTeX needs it). When it is set to `p' (for "paranoid"), additional restrictions are imposed: an absolute filename must refer to a file in (a subdirectory of) TEXMFOUTPUT, and any attempt to go up a directory level is forbidden (that is, paths may not contain a `..' component). The paranoid setting is the default. (For backwards compatibility, `y' and `1' are synonyms of `a', while `n' and `0' are synonyms for `r'.)
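As a rough illustration of the three openout_any levels, here is a small Python model of the rules as stated above. It is a sketch under assumptions, not the actual openoutnameok function from Web2c: the name openout_allowed is hypothetical, the leading-dot test is assumed to apply to the last path component, and an unset TEXMFOUTPUT is assumed to forbid every absolute filename at the paranoid level.

```python
import posixpath


def openout_allowed(filename, level="p", texmfoutput=None):
    """Model the openout_any levels 'a', 'r' and 'p' (hypothetical helper)."""
    # Backwards-compatible synonyms mentioned in the text.
    level = {"y": "a", "1": "a", "n": "r", "0": "r"}.get(level, level)
    if level not in ("a", "r", "p"):
        raise ValueError("unknown openout_any value: %r" % level)
    if level == "a":
        return True  # "any": no restrictions

    # "restricted" and "paranoid": reject dot files, except `.tex'.
    # Assumption: the test applies to the final path component.
    base = posixpath.basename(filename)
    if base.startswith(".") and base != ".tex":
        return False
    if level == "r":
        return True

    # "paranoid": no `..' component anywhere in the path.
    if ".." in filename.split("/"):
        return False
    # "paranoid": absolute names must lie below TEXMFOUTPUT.
    if filename.startswith("/"):
        if not texmfoutput:
            return False  # assumption: unset means no absolute names
        return filename.startswith(texmfoutput.rstrip("/") + "/")
    return True
```

With this model, `../notes.aux' is accepted at level `r' but rejected at level `p', and `.rhosts' is accepted only at level `a'.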
In any case, all \openout filenames are recorded in the log file, except those opened on the first line of input, which is processed when the log file has not yet been opened. (If you as a TeX administrator wish to implement more stringent rules on \openout, modifying the function openoutnameok in `web2c/lib/texmfmp.c' is intended to suffice.)
The program accepts the following options, as well as the standard `-help' and `-version' (see section 3.2 Common options):
configure during installation of Web2c.
`-mltex'
If running as INITEX (see section 3.5.1 Initial and virgin), enable MLTeX extensions such as \charsubdef. Implicitly set if the program name is mltex. See section 4.5.1 MLTeX: Multi-lingual TeX.
\write. (If you as a TeX administrator wish to implement more stringent rules on what can be executed, you will need to modify `tex.ch'.)
initex invocation
initex is the "initial" form of TeX, which does lengthy initializations avoided by the "virgin" (vir) form, so as to be capable of dumping `.fmt' files (see section 3.5.2 Memory dumps). For a detailed comparison of virgin and initial forms, see section 3.5.1 Initial and virgin.
For a list of options and other information, see section 4.1 tex invocation.
Unlike Metafont and MetaPost, many format files are commonly used with TeX. The standard one implementing the features described in the TeXbook is `plain.fmt', also known as `tex.fmt' (again, see section 3.5.2 Memory dumps). It is created by default during installation, but you can also do so by hand if necessary (e.g., if an update to `plain.tex' is issued):
initex '\input plain \dump'
(The quotes prevent the shell from interpreting the backslashes.) Then install the resulting `plain.fmt' in `$(fmtdir)' (`/usr/local/share/texmf/web2c' by default), and link `tex.fmt' to it.
The necessary invocation for generating a format file differs for each format, so instructions that come with the format should explain. The top-level `web2c' Makefile has targets for making most common formats: plain latex amstex texinfo eplain. See section 4.4 Formats, for more details on TeX formats.
virtex invocation
virtex is the "virgin" form of TeX, which avoids the lengthy initializations done by the "initial" (ini) form, and is thus what is generally used for production work. For a detailed comparison of virgin and initial forms, see section 3.5.1 Initial and virgin.
For a list of options and other information, see section 4.1 tex invocation.
TeX formats are large collections of macros, possibly dumped into a `.fmt' file (see section 3.5.2 Memory dumps) by initex (see section 4.2 initex invocation). A number of formats are in reasonably widespread use, and the Web2c Makefile has targets to make the versions current at the time of release. You can change which formats are automatically built by setting the fmts Make variable; by default, only the `plain' and `latex' formats are made.
You can get the latest versions of most of these formats from the CTAN archives in subdirectories of `CTAN:/macros' (for CTAN info, see section `unixtex.ftp' in Kpathsea). The archive ftp://ftp.tug.org/tex/lib.tar.gz (also available from CTAN) contains most of these formats (although perhaps not the absolute latest version), among other things.
TeX supports most natural languages. See also section 4.7 TeX extensions.
Multi-lingual TeX (mltex) is an extension of TeX originally written by Michael Ferguson and now updated and maintained by Bernd Raichle. It allows the use of non-existing glyphs in a font by declaring glyph substitutions. These are restricted to substitutions of an accented character glyph, which need not be defined in the current font, by its appropriate \accent construction using a base and accent character glyph, which do have to exist in the current font. This substitution is automatically done behind the scenes, if necessary, and thus MLTeX additionally supports hyphenation of words containing an accented character glyph for fonts missing this glyph (e.g., Computer Modern). Standard TeX suppresses hyphenation in this case.
MLTeX works at `.fmt'-creation time: the basic idea is to specify the `-mltex' option to TeX when you \dump a format. Then, when you subsequently invoke TeX and read that .fmt file, the MLTeX features described below will be enabled.
Generally, you use special macro files to create an MLTeX .fmt file. See:
CTAN:/systems/generic/mltex
ftp://ftp.univ-rennes1.fr/pub/GUTenberg/french/
The sections below describe the two new primitives that MLTeX defines. Aside from these, MLTeX is completely compatible with standard TeX.
\charsubdef: Character substitutions
The most important primitive MLTeX adds is \charsubdef, used in a way reminiscent of \chardef:
\charsubdef composite [=] accent base
Each of composite, accent, and base is a font glyph number, expressed in the usual TeX syntax: `\e symbolically, '145 for octal, "65 for hex, 101 for decimal.
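The four notations above all denote the same glyph number, 101. The following Python sketch makes the arithmetic explicit. It is only an illustration: tex_number is a hypothetical helper, not part of TeX or MLTeX, and it covers just the four forms shown here.

```python
def tex_number(token):
    """Convert one of the four notations shown above to an integer.

    Hypothetical helper for illustration. Python's int() is more lenient
    than TeX (for example, it accepts lowercase hex digits).
    """
    if token.startswith("`"):
        # Symbolic form: a backquote, an optional backslash, one character.
        char = token[1:]
        if char.startswith("\\") and len(char) > 1:
            char = char[1:]
        if len(char) != 1:
            raise ValueError("expected a single character: %r" % token)
        return ord(char)
    if token.startswith("'"):
        return int(token[1:], 8)    # octal
    if token.startswith('"'):
        return int(token[1:], 16)   # hexadecimal
    return int(token, 10)           # decimal


# All four spellings of lowercase `e' agree.
codes = [tex_number(t) for t in ("`\\e", "'145", '"65', "101")]
```

Here every entry of codes is 101, the ASCII position of lowercase `e'.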
MLTeX's \charsubdef declares how to construct an accented character glyph (not necessarily existing in the current font) using two character glyphs (that do exist). Thus it defines whether a character glyph code, either typed as a single character or using the \char primitive, will be mapped to a font glyph or to an \accent glyph construction.
For example, if you assume glyph code 138 (decimal) for an e-circumflex and you are using the Computer Modern fonts, which have the circumflex accent in position 18 and lowercase `e' in the usual ASCII position 101 decimal, you would use \charsubdef as follows:
\charsubdef 138 = 18 101
For the plain TeX format to make use of this substitution, you have to redefine the circumflex accent macro \^ in such a way that if its argument is character `e' the expansion \char138 is used instead of \accent18 e. Similar \charsubdef declarations and macro redefinitions have to be done for all other accented characters.
To disable a previous \charsubdef c, redefine c as a pair of zeros. For example:
\charsubdef '321 = 0 0 % disable N tilde
(Octal '321 is the ISO Latin-1 value for the Spanish N tilde.)
\charsubdef commands should only be given once. Although in principle you can use \charsubdef at any time, the result is unspecified. If \charsubdef declarations are changed, usually either incorrect character dimensions will be used or MLTeX will output missing character warnings. (The substitution of a \charsubdef is used by TeX when appending the character node to the current horizontal list, to compute the width of a horizontal box when the box gets packed, and when building the \accent construction at \shipout-time. In summary, the substitution is accessed often, so changing it is not desirable, nor generally useful.)
\tracingcharsubdef: Substitution diagnostics
To help diagnose problems with `\charsubdef', MLTeX provides a new primitive parameter, \tracingcharsubdef. If positive, every use of \charsubdef will be reported. This can help track down when a character is redefined.
In addition, if the TeX parameter \tracinglostchars is 100 or more, the character substitutions actually performed at \shipout-time will be recorded.
TCX (TeX character translation) files help TeX support direct input of 8-bit international characters if fonts containing those characters are being used. Specifically, they map an input (keyboard) character code to the internal TeX character code (a superset of ASCII).
Of the various proposals for handling more than one input encoding, TCX files were chosen because they follow Knuth's original ideas for the use of the `xchr' and `xord' tables. He ventured that these would be changed in the WEB source in order to adjust the actual version to a given environment. It turned out, however, that recompiling the WEB sources is not as simple a task as Knuth predicted; therefore, TCX files, which make it possible to change the conversion tables on the fly, were implemented instead.
This approach limits the portability of TeX documents, as some implementations do not support it (or use a different method for input-internal reencoding). It may also be problematic to determine the encoding to use for a TeX document of unknown provenance; in the worst case, failure to do so correctly may result in subtle errors in the typeset output.
While TCX files can be used with any format, using them breaks the LaTeX `inputenc' package. You should therefore use either tcxfile or `inputenc' in LaTeX files, but never both.
This is entirely independent of the MLTeX extension (see section 4.5.1 MLTeX: Multi-lingual TeX): whereas a TCX file defines how an input keyboard character is mapped to TeX's internal code, MLTeX defines substitutions for a non-existing character glyph in a font with an \accent construction made out of two separate character glyphs. TCX files involve no new primitives; it is not possible to specify that an input (keyboard) character maps to more than one character.
WEB2C path.
INITEX ignores TCX files.
The Web2c distribution comes with at least two TCX files, `il1-t1.tcx' and `il2-t1.tcx'. These support ISO Latin 1 and ISO Latin 2, respectively, with Cork-encoded fonts (a.k.a. the T1 encoding). TCX files for Czech, Polish, and Slovak are also provided.
src [dest]
Finally, here's what happens: when TeX sees an input character with code src, it 1) changes src to dest; and 2) makes code dest "printable", i.e., printed as-is in diagnostics and the log file instead of in `^^' notation.
By default, no characters are translated, and character codes between 32 and 126 inclusive (decimal) are printable. It is not possible to make these (or any) characters unprintable.
Specifying translations for the printable ASCII characters (codes 32--127) will yield unpredictable results. Additionally you shouldn't make the following characters printable: ^^I (TAB), ^^J (line feed), ^^M (carriage return), and ^^? (delete), since TeX uses them in various ways.
Thus, the idea is to specify the input (keyboard) character code for src, and the output (font) character code for dest.
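The effect of a TCX file on the translation and printability tables can be modelled with a short Python sketch. This is an illustration only, not Web2c's own reader: the names parse_code and load_tcx are hypothetical, and several syntax details are assumptions not stated in this section, namely that `%' starts a comment, that codes may be written in decimal, with a leading 0 for octal or a leading 0x for hexadecimal, and that an omitted dest equals src.

```python
def parse_code(token):
    """Parse one character code (C-style prefixes are an assumption)."""
    if token.lower().startswith("0x"):
        value = int(token[2:], 16)
    elif len(token) > 1 and token.startswith("0"):
        value = int(token[1:], 8)
    else:
        value = int(token, 10)
    if not 0 <= value <= 255:
        raise ValueError("character code out of range: %s" % token)
    return value


def load_tcx(text):
    """Build (translate, printable) from lines of the form `src [dest]'."""
    translate = list(range(256))     # default: no characters translated
    printable = set(range(32, 127))  # default: codes 32 to 126 inclusive
    for line in text.splitlines():
        # Assumption: `%' starts a comment; blank lines are skipped.
        fields = line.split("%", 1)[0].split()
        if not fields:
            continue
        src = parse_code(fields[0])
        # Assumption: a missing dest means "same as src".
        dest = parse_code(fields[1]) if len(fields) > 1 else src
        translate[src] = dest        # 1) src is changed to dest
        printable.add(dest)          # 2) dest becomes printable
    return translate, printable


# Hypothetical two-entry file: one remapped code, one made printable as-is.
translate, printable = load_tcx("% sample\n0xb1 0xa1\n233\n")
```

After this call, input code 0xb1 maps to 0xa1, both 0xa1 and 233 are printable, and all other codes keep their defaults. The model only ever adds to the printable set, matching the statement that characters cannot be made unprintable.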
Patgen creates hyphenation patterns from dictionary files for use with TeX. Synopsis:
patgen dictionary patterns output translate
Each argument is a filename. No path searching is done. The output is written to the file output.
In addition, Patgen prompts interactively for other values.
For more information, see Word hy-phen-a-tion by com-puter by Frank Liang (see section B. References), and also the `patgen.web' source file.
The only options are `-help' and `-version' (see section 3.2 Common options).
(Sorry, but I'm not going to write this unless someone actually uses this feature. Let me know.)
This functionality is available only if the `--enable-ipc' option was specified to configure during installation of Web2c (see section 2. Installation).
If you define IPC_DEBUG before compilation (e.g., with `make XCFLAGS=-DIPC_DEBUG'), TeX will print messages to standard error about its socket operations. This may be helpful if you are, well, debugging.
The base TeX program has been extended in many ways. Here's a partial list. Please send information on extensions not listed here to the address in section `Reporting bugs' in Kpathsea.