

3. Commonalities

Many aspects of the TeX system are the same among more than one program, so we describe all those pieces together, here.

3.1 Option conventions

To provide a clean and consistent behavior, we chose to have all these programs use the GNU function getopt_long_only to parse command lines.

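As a result, you can:

  1. use `-' or `--' to start an option name;
  2. use any unambiguous abbreviation for an option name;
  3. separate option names and values with either `=' or one or more spaces;
  4. use filenames that would otherwise look like options by putting them after an option `--'.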

By convention, non-option arguments, if specified, generally define the name of an input file, as documented for each program.

If a particular option with a value is given more than once, it is the last value that counts.

For example, the following command line specifies the options `foo', `bar', and `verbose'; gives the value `baz' to the `abc' option, and the value `xyz' to the `quux' option; and specifies the filename `-myfile-'.

-foo --bar -verb -abc=baz -quux karl --quux xyz -- -myfile-
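To make these conventions concrete, here is a minimal sketch of how a program might parse the example command line above with getopt_long_only. The option table, the struct demo_opts type, and the function name parse_demo are all made up for this illustration; they are not taken from any Web2c program. The sketch assumes a C library that provides the GNU getopt_long_only function, and it is meant to be called once per process.

```c
#include <assert.h>
#include <getopt.h>
#include <stddef.h>

/* Results of parsing the example command line (hypothetical type).  */
struct demo_opts
{
  int foo, bar, verbose;
  const char *abc;   /* value given to `abc', or NULL */
  const char *quux;  /* value given to `quux'; the last one wins */
  const char *file;  /* first non-option argument, or NULL */
};

/* Parse ARGV into *O.  Return 0 on success, -1 on an unknown or
   ambiguous option.  */
int
parse_demo (int argc, char **argv, struct demo_opts *o)
{
  static const struct option longopts[] = {
    { "foo",     no_argument,       NULL, 'f' },
    { "bar",     no_argument,       NULL, 'b' },
    { "verbose", no_argument,       NULL, 'v' },
    { "abc",     required_argument, NULL, 'a' },
    { "quux",    required_argument, NULL, 'q' },
    { NULL, 0, NULL, 0 }
  };
  int c;

  assert (argv != NULL && o != NULL);
  o->foo = o->bar = o->verbose = 0;
  o->abc = o->quux = o->file = NULL;

  /* An empty short-option string: every option is a long option,
     whether it is introduced by `-' or by `--'.  */
  while ((c = getopt_long_only (argc, argv, "", longopts, NULL)) != -1)
    switch (c)
      {
      case 'f': o->foo = 1; break;
      case 'b': o->bar = 1; break;
      case 'v': o->verbose = 1; break;
      case 'a': o->abc = optarg; break;
      case 'q': o->quux = optarg; break;  /* overwrites earlier values */
      default:  return -1;
      }

  /* Whatever remains after the options (or after `--') is a
     non-option argument.  */
  if (optind < argc)
    o->file = argv[optind];
  return 0;
}
```

Given the example command line, this sketch sets all three flags, stores `baz' for `abc' and `xyz' (not `karl') for `quux', and reports `-myfile-' as the filename.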

3.2 Common options

All of these programs accept the standard GNU `--help' and `--version' options, and several programs accept `--verbose'. Rather than repeating identical descriptions in every node, we describe them here.

`--help'
Print a usage message listing basic usage and all available options to standard output, then exit successfully.
`--verbose'
Print progress reports to standard output.
`--version'
Print the version number to standard output, then exit successfully.

TeX, Metafont, and MetaPost have additional options in common:

`-kpathsea-debug=number'
Set path searching debugging flags according to the bits of number (see section `Debugging' in Kpathsea). You can also specify this in the KPATHSEA_DEBUG environment variable (for all Web2c programs). (The command line value overrides.) The most useful value is `-1', to get all available output.
`-ini'
Enable the "initial" form of the program (see section 3.5.1 Initial and virgin). This is implicitly set if the program name is initex resp. inimf resp. inimpost.
`-interaction=string'
Set the interaction mode from the command line. The string must be one of `batchmode', `nonstopmode', `scrollmode', or `errorstopmode'.
`-fmt=dumpname'
`-base=dumpname'
`-mem=dumpname'
Use dumpname instead of the program name or a `%&' line to determine the name of the memory dump file read (`fmt' for TeX, `base' for Metafont, `mem' for MetaPost). See section 3.5.2 Memory dumps. Also sets the program name to dumpname if no `-progname' option was given. When creating a dump, this option will also set the name of the dump file.
`-progname=string'
Set program (and memory dump) name to string. This may affect the search paths and other values used (see section `Config files' in Kpathsea). Using this option is equivalent to making a link named string to the binary and then invoking the binary under that name. See section 3.5.2 Memory dumps.
`-translate-file=tcxfile'
Use tcxfile to define which characters are printable and translations between the internal and external character sets. Moreover, tcxfile can be explicitly declared in the first line of the main input file `%& -translate-file=tcxfile'. This is the recommended method for portability reasons. See section 4.5.2 TCX files: Character translations.
`-c-style-errors'
Change the way error messages are printed. The alternate style looks like error messages from C/C++ compilers and is easier to parse for some editors that drive TeX compilers.
`-oem'
This option is specific to win32. When used, TeX engines will use the OEM code page rather than the ANSI one to display their messages.
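
As a small illustration of the `-interaction=string' option described above, the following sketch checks a string against the four legal mode names. The function name parse_interaction and the numeric return values are assumptions made for this example, not the code Web2c uses.

```c
#include <assert.h>
#include <string.h>

/* Return 0..3 for the four legal interaction modes, in the order the
   text lists them, or -1 if S is not one of them.  (Hypothetical
   helper; the numbering is for this sketch only.)  */
int
parse_interaction (const char *s)
{
  static const char *const modes[] = {
    "batchmode", "nonstopmode", "scrollmode", "errorstopmode"
  };
  int i;

  assert (s != NULL);
  for (i = 0; i < 4; i++)
    if (strcmp (s, modes[i]) == 0)
      return i;
  return -1;
}
```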

3.3 Path searching

All of the Web2c programs, including TeX, which do path searching use the Kpathsea routines to do so. The precise names of the environment and configuration file variables which get searched for particular file formats are therefore documented in the Kpathsea manual (see section `Supported file formats' in Kpathsea). Reading `texmf.cnf' (see section `Config files' in Kpathsea), invoking mktex... scripts (see section `mktex scripts' in Kpathsea), and so on are all handled by Kpathsea.

The programs which read fonts make use of another Kpathsea feature: `texfonts.map', which allows arbitrary aliases for the actual names of font files; for example, `Times-Roman' for `ptmr8r.tfm'. The distributed (and installed by default) `texfonts.map' includes aliases for many widely available PostScript fonts by their PostScript names.

3.4 Output file location

All the programs generally follow the usual convention for output files. Namely, they are placed in the directory that is current when the program is run, regardless of any input file location; or, in a few cases, output is to standard output.

For example, if you run `tex /tmp/foo', the output will be in `./foo.dvi' and `./foo.log', not `/tmp/foo.dvi' and `/tmp/foo.log'.

However, if the current directory is not writable, the main programs (TeX, Metafont, MetaPost, and BibTeX) make an exception: if the environment variable or config file value TEXMFOUTPUT is set (it is not by default), output files are written to the directory specified. This is useful when you are in some read-only distribution directory, perhaps on a CD-ROM, and want to TeX some documentation, for example.
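
The rule above can be sketched as a small decision function. This is only an illustration under stated assumptions: the name choose_output_dir is hypothetical, and the caller is assumed to have already tested whether the current directory is writable and looked up TEXMFOUTPUT.

```c
#include <stddef.h>

/* Decide where output files go.  CWD_WRITABLE is nonzero if the
   current directory can be written; TEXMFOUTPUT is the value of that
   variable, or NULL if it is unset.  Return "." for the current
   directory, the TEXMFOUTPUT value as the fallback, or NULL if there
   is nowhere to write.  (Hypothetical helper for this sketch.)  */
const char *
choose_output_dir (int cwd_writable, const char *texmfoutput)
{
  if (cwd_writable)
    return ".";
  if (texmfoutput != NULL && texmfoutput[0] != '\0')
    return texmfoutput;
  return NULL;
}
```

On a POSIX system a caller might pass `access (".", W_OK) == 0' and `getenv ("TEXMFOUTPUT")' as the two arguments; a config file value would need a Kpathsea lookup instead.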

3.5 Three programs: Metafont, MetaPost, and TeX

TeX, Metafont, and MetaPost have a number of features in common. Besides the ones here, the common command-line options are described in section 3.2 Common options. The configuration file options that let you control some array sizes and other features are described in section 2.5 Runtime options.

3.5.1 Initial and virgin

The TeX, Metafont, and MetaPost programs each have two main variants, called initial and virgin. As of Web2c 7, one executable suffices for both variants.

The initial form is enabled if:

  1. the `-ini' option was specified; or
  2. the program name is `initex' resp. `inimf' resp. `inimpost'; or
  3. the first line of the main input file is `%&ini';

otherwise, the virgin form is used.
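
The three conditions can be written down as a single test. The sketch below is not the Web2c source; the function name use_initial_form is hypothetical, and we assume that the `%&ini' line may be followed by whitespace or a line terminator.

```c
#include <string.h>

/* Return 1 if the initial form should be used, 0 for the virgin form.
   INI_OPTION is nonzero if `-ini' was given; PROGNAME is the program
   name; FIRST_LINE is the first line of the main input file, or NULL
   if there is none.  (Hypothetical helper for this sketch.)  */
int
use_initial_form (int ini_option, const char *progname,
                  const char *first_line)
{
  if (ini_option)
    return 1;

  if (progname != NULL
      && (strcmp (progname, "initex") == 0
          || strcmp (progname, "inimf") == 0
          || strcmp (progname, "inimpost") == 0))
    return 1;

  if (first_line != NULL && strncmp (first_line, "%&ini", 5) == 0)
    {
      /* Assumption: only whitespace or end of line may follow.  */
      char next = first_line[5];
      if (next == '\0' || next == '\n' || next == '\r'
          || next == ' ' || next == '\t')
        return 1;
    }

  return 0;
}
```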

The virgin form is the one generally invoked for production use. The first thing it does is read a memory dump (see section 3.5.2.2 Determining the memory dump to use); it then proceeds with the main job.

The initial form is generally used only to create memory dumps (see the next section). It starts up more slowly than the virgin form, because it must do lengthy initializations that are encapsulated in the memory dump file.

In the past, there was a third form, preloaded executables. This is no longer recommended or widely used; but see the section below if you're interested anyway. In this case, the memory dump file was read in to the virgin form, a core dump of the running executable was done, and the undump program run to create a new binary. Nowadays, reading memory dumps is fast enough that this is generally no longer worth the cost in disk space and unshared executables.

3.5.1.1 Preloaded executables

Specifying `--enable-auto-core' to configure tells TeX, Metafont, and MetaPost to suicide with a SIGQUIT on an input filename of `HackyInputFileNameForCoreDump.tex' (all three programs use the `.tex' suffix). This produces a memory dump of the running executable in a file `core'. (This is unrelated to the standard memory dump feature in these programs; see section 3.5.2 Memory dumps.)

You don't actually need to do this to produce a core dump. Just typing your quit character (usually CTRL-\) when the program is waiting for input (at `**') will have the same result. But a few sites want to reliably generate a core dump without human intervention; that's what --enable-auto-core is for.

With the program undump, you can use `core' to reconstitute a preloaded executable, which does not need to read a `.fmt' file to get started. Although preloaded executables save startup time, they have a big disadvantage: neither the disk space to store them nor their code segments (at runtime) can be shared. Therefore, if both tex and latex are running, twice as much memory will be consumed, to the general detriment of performance.

The undump program is not part of the Web2c distribution, but you can get it from the CTAN archives as `CTAN:/support/undump', and it is included in several TeX distributions (see section `unixtex.ftp' in Kpathsea).

3.5.2 Memory dumps

In typical use, TeX, Metafont, and MetaPost require a large number of macros to be predefined; therefore, they support memory dump files, which can be read much more efficiently than ordinary source code.

3.5.2.1 Creating memory dumps

The programs all create memory dumps in a slightly idiosyncratic (though substantially similar) way, so we describe the details in separate sections (references below). The basic idea is to run the initial version of the program (see section 3.5.1 Initial and virgin), read the source file to define the macros, and then execute the \dump primitive.

Also, each program uses a different filename extension for its memory dumps, since although they are completely analogous they are not interchangeable (TeX cannot read a Metafont memory dump, for example).

Here is a list of filename extensions with references to examples of creating memory dumps:

TeX
(`.fmt') See section 4.2 initex invocation.
Metafont
(`.base') See section 5.2 inimf invocation.
MetaPost
(`.mem') See section 6.2 inimpost invocation.

When making memory dumps, the programs read environment variables and configuration files for path searching and other values as usual. If you are making a new installation and have environment variables pointing to an old one, for example, you will probably run into difficulties.

3.5.2.2 Determining the memory dump to use

The virgin form (see section 3.5.1 Initial and virgin) of each program always reads a memory dump before processing normal source input. All three programs determine the memory dump to use in the same way:

  1. If the first non-option command-line argument begins with `&', the program uses the remainder of that argument as the memory dump name. For example, running `tex \&super' reads `super.fmt'. (The backslash protects the `&' against interpretation by the shell.)
  2. If the `-fmt' resp. `-base' resp. `-mem' option is specified, its value is used.
  3. If the `-progname' option is specified, its value is used.
  4. If the first line of the main input file (which must be specified on the command line, not in response to `**') is %&dump, and dump is an existing memory dump of the appropriate type, dump is used. The first line of the main input file can also specify which character translation file is to be used: %&-translate-file=tcxfile (see section 4.5.2 TCX files: Character translations). These two roles can be combined: %&dump -translate-file=tcxfile. If this is done, the name of the dump must be given first.
  5. Otherwise, the program uses the program invocation name, most commonly `tex' resp. `mf' resp. `mpost'. For example, if `latex' is a link to `tex', and the user runs `latex foo', `latex.fmt' will be used.
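
Reading the numbered list above as an order of precedence (an assumption of this sketch), the selection can be expressed as follows. The struct dump_inputs type and the function choose_dump_name are hypothetical; the caller is assumed to have gathered each candidate already, including checking that a `%&dump' line names an existing dump.

```c
#include <assert.h>
#include <stddef.h>

/* The candidates for the dump name; unused ones are NULL.
   (Hypothetical type for this sketch.)  */
struct dump_inputs
{
  const char *first_arg;        /* first non-option argument */
  const char *dump_option;      /* value of -fmt, -base, or -mem */
  const char *progname_opt;     /* value of -progname */
  const char *first_line_dump;  /* existing dump named by a %& line */
  const char *invocation_name;  /* e.g. "tex", "mf", "latex" */
};

/* Return the memory dump name, without its extension.  */
const char *
choose_dump_name (const struct dump_inputs *in)
{
  assert (in != NULL);
  if (in->first_arg != NULL
      && in->first_arg[0] == '&' && in->first_arg[1] != '\0')
    return in->first_arg + 1;     /* 1. `&name' on the command line */
  if (in->dump_option != NULL)
    return in->dump_option;       /* 2. -fmt, -base, or -mem */
  if (in->progname_opt != NULL)
    return in->progname_opt;      /* 3. -progname */
  if (in->first_line_dump != NULL)
    return in->first_line_dump;   /* 4. %&dump in the input file */
  return in->invocation_name;     /* 5. the invocation name */
}
```

The appropriate extension (`.fmt', `.base', or `.mem') would then be appended by the caller.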

3.5.2.3 Hardware and memory dumps

By default, memory dump files are generally sharable between architectures of different types; specifically, on machines of different endianness (see section `Byte order' in GNU C Library). (This is a feature of the Web2c implementation, and is not true of all TeX implementations.) If you specify `--disable-dump-share' to configure, however, memory dumps will be endian-dependent.

The reason to do this is speed. To achieve endian-independence, the reading of memory dumps on LittleEndian architectures, such as PC's and DEC architectures, is somewhat slowed (all the multibyte values have to be swapped). Usually, this is not noticeable, and the advantage of being able to share memory dumps across all platforms at a site far outweighs the speed loss. But if you're installing Web2c for use on LittleEndian machines only, perhaps on a PC being used only by you, you may wish to get maximum speed.
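
To show what swapping a multibyte value involves, here is a generic byte-reversal routine. It is a sketch of the general technique only, not the code Web2c uses when reading dumps; the function name swap_bytes is made up for this example.

```c
#include <assert.h>
#include <stddef.h>

/* Reverse the N bytes of the value stored at P, in place.  Applying
   this to a value written in one byte order yields the same value in
   the opposite byte order.  (Hypothetical helper for this sketch.)  */
void
swap_bytes (unsigned char *p, size_t n)
{
  size_t i;

  assert (p != NULL);
  for (i = 0; i < n / 2; i++)
    {
      unsigned char tmp = p[i];
      p[i] = p[n - 1 - i];
      p[n - 1 - i] = tmp;
    }
}
```

The cost mentioned above comes from doing this for every multibyte item in the dump file.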

TeXnically, even without `--disable-dump-share', sharing of `.fmt' files cannot be guaranteed to work. Floating-point values are always written in native format, and hence will generally not be readable across platforms. Fortunately, TeX uses floating point only to represent glue ratios, and all common formats (plain, LaTeX, AMSTeX, ...) do not do any glue setting at `.fmt'-creation time. Metafont and MetaPost do not use floating point in any dumped value at all.

Incidentally, different memory dump files will never compare equal byte-for-byte, because the program always dumps the current date and time. So don't be alarmed by just a few bytes difference.

If you don't know what endianness your machine is, and you're curious, here is a little C program to tell you. (The configure script contains a similar program.) This is from the book C: A Reference Manual, by Samuel P. Harbison and Guy L. Steele Jr. (see section B. References).

#include <stdio.h>
#include <stdlib.h>

int
main (void)
{
  /* Are we little or big endian?  From Harbison&Steele.  */
  union
  {
    long l;
    char c[sizeof (long)];
  } u;
  u.l = 1;
  if (u.c[0] == 1)
    printf ("LittleEndian\n");
  else if (u.c[sizeof (long) - 1] == 1)
    printf ("BigEndian\n");
  else
    printf ("unknownEndian\n");

  /* The exit status is 1 on a big-endian machine, 0 otherwise.  */
  exit (u.c[sizeof (long) - 1] == 1);
}

3.5.3 Editor invocation

TeX, Metafont, and MetaPost all (by default) stop and ask for user intervention at an error. If the user responds with e or E, the program invokes an editor.

Specifying `--with-editor=cmd' to configure sets the default editor command string to cmd. The environment variables/configuration values TEXEDIT, MFEDIT, and MPEDIT (respectively) override this. If `--with-editor' is not specified, the default is vi +%d %s.

In this string, `%d' is replaced by the line number of the error, and `%s' is replaced by the name of the current input file.
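
The substitution can be sketched as follows. This is an illustration only, under the assumption that `%d' and `%s' are the only sequences treated specially; the function name expand_editor_cmd is hypothetical and the code is not taken from Web2c.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Expand TMPL into OUT (of size N), replacing `%d' with LINE and `%s'
   with FILE.  Return 0 on success, -1 if OUT is too small.
   (Hypothetical helper for this sketch.)  */
int
expand_editor_cmd (const char *tmpl, int line, const char *file,
                   char *out, size_t n)
{
  size_t used = 0;

  assert (tmpl != NULL && file != NULL && out != NULL);
  if (n == 0)
    return -1;

  while (*tmpl != '\0')
    {
      char num[32];
      const char *piece;
      size_t len;

      if (tmpl[0] == '%' && tmpl[1] == 'd')
        {
          snprintf (num, sizeof num, "%d", line);
          piece = num;
          len = strlen (num);
          tmpl += 2;
        }
      else if (tmpl[0] == '%' && tmpl[1] == 's')
        {
          piece = file;
          len = strlen (file);
          tmpl += 2;
        }
      else
        {
          piece = tmpl;   /* copy one ordinary character */
          len = 1;
          tmpl += 1;
        }

      if (len >= n - used)   /* keep room for the final null */
        return -1;
      memcpy (out + used, piece, len);
      used += len;
    }

  out[used] = '\0';
  return 0;
}
```

With the default string `vi +%d %s', line 42, and file `foo.tex', this produces `vi +42 foo.tex'.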

3.5.4 \input filenames

TeX, Metafont, and MetaPost source programs can all read other source files with the \input (TeX) and input (MF and MP) primitives:

\input name % in TeX

The file name can always be terminated with whitespace; for Metafont and MetaPost, the statement terminator `;' also works. (LaTeX and other macro packages provide other interfaces to \input that allow different notation; here we are concerned only with the primitive operation.) This means that \input filenames cannot directly contain whitespace, even though Unix has no trouble. Sorry.

On the other hand, various C library routines and Unix itself use the null byte (character code zero, ASCII NUL) to terminate strings. So filenames in Web2c cannot contain nulls, even though TeX itself does not treat NUL specially.

Furthermore, some older Unix variants do not allow eight-bit characters (codes 128--255) in filenames.

For maximal portability of your document across systems, use only the characters `a'--`z', `0'--`9', and `.', and restrict your filenames to at most eight characters (not including the extension), and at most a three-character extension. Do not use anything but simple filenames, since directory separators vary among systems; instead, add the necessary directories to the appropriate search path.
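
If you want to check names mechanically, the recommendation above can be turned into a simple test. The sketch below assumes an ASCII-compatible character set and allows at most one `.'; the function name is_portable_filename is made up for this example.

```c
#include <assert.h>
#include <string.h>

/* Return 1 if NAME uses only `a'-`z', `0'-`9', and a single `.', with
   a base of one to eight characters and an extension of at most three;
   otherwise return 0.  (Hypothetical helper for this sketch.)  */
int
is_portable_filename (const char *name)
{
  const char *dot, *p;
  size_t base_len, ext_len;

  assert (name != NULL);
  dot = strchr (name, '.');
  base_len = dot != NULL ? (size_t) (dot - name) : strlen (name);
  ext_len = dot != NULL ? strlen (dot + 1) : 0;

  if (base_len == 0 || base_len > 8 || ext_len > 3)
    return 0;

  for (p = name; *p != '\0'; p++)
    {
      int ok = (*p >= 'a' && *p <= 'z')
               || (*p >= '0' && *p <= '9')
               || p == dot;   /* only the first `.' is accepted */
      if (!ok)
        return 0;
    }
  return 1;
}
```

For example, `story.tex' passes, while `chapter10.tex' (nine-character base), `Story.tex' (uppercase letter), and `a/b.tex' (directory separator) do not.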

Finally, the present Web2c implementation does `~' and `$' expansion on name, unlike Knuth's original implementation and older versions of Web2c. Thus:

\input ~jsmith/$foo.bar

will dereference the environment variable or Kpathsea config file value `foo' and read that file extended with `.bar' in user `jsmith''s home directory. (You can also use braces, as in `${foo}bar' if you want to follow the variable name with a letter, numeral, or `_'.)

(So you could define an environment variable value including whitespace and get the program to read such a filename that way, if you need to.)
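
The `$' part of this expansion, including the brace form, can be sketched as below. This is not the Kpathsea implementation: the variable table, struct var, and the function expand_dollar are hypothetical; the real programs consult the environment and the Kpathsea config files; `~' expansion is left out; and treating an undefined variable as empty is an assumption of the sketch.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* One variable definition (hypothetical type for this sketch).  */
struct var { const char *name; const char *value; };

static const char *
lookup (const struct var *vars, size_t nvars, const char *name)
{
  size_t i;
  for (i = 0; i < nvars; i++)
    if (strcmp (vars[i].name, name) == 0)
      return vars[i].value;
  return "";   /* assumption: undefined variables expand to nothing */
}

/* Expand `$name' and `${name}' in IN using VARS, writing to OUT (of
   size N).  Return 0 on success, -1 on a malformed reference or if
   OUT is too small.  */
int
expand_dollar (const char *in, const struct var *vars, size_t nvars,
               char *out, size_t n)
{
  size_t used = 0;

  assert (in != NULL && out != NULL);
  while (*in != '\0')
    {
      char name[64];
      const char *piece;
      size_t len;

      if (*in == '$')
        {
          const char *start, *end;
          if (in[1] == '{')
            {
              start = in + 2;
              end = strchr (start, '}');
              if (end == NULL)
                return -1;
              in = end + 1;
            }
          else
            {
              /* A bare name runs over letters, numerals, and `_'.  */
              start = end = in + 1;
              while (isalnum ((unsigned char) *end) || *end == '_')
                end++;
              in = end;
            }
          len = (size_t) (end - start);
          if (len == 0 || len >= sizeof name)
            return -1;
          memcpy (name, start, len);
          name[len] = '\0';
          piece = lookup (vars, nvars, name);
          len = strlen (piece);
        }
      else
        {
          piece = in;   /* copy one ordinary character */
          len = 1;
          in++;
        }

      if (len >= n - used)   /* keep room for the final null */
        return -1;
      memcpy (out + used, piece, len);
      used += len;
    }
  if (used >= n)
    return -1;
  out[used] = '\0';
  return 0;
}
```

With `foo' defined as `chap1', the string `$foo.bar' becomes `chap1.bar' and `${foo}bar' becomes `chap1bar', whereas `$foobar' refers to a different variable named `foobar'.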

In all the common TeX formats (plain TeX, LaTeX, AMSTeX), the characters `~' and `$' have special category codes, so to actually use these in a document you have to change their catcodes or use \string. (The result is unportable anyway; see the suggestions above.) The place where they are most likely to be useful is when typing interactively.

