This is README.TeX-to-C.

Most of the information here was supplied by Tim Morgan (address below).
Only a few lines have been changed to reflect the arrangement chosen for
the Unix distribution tape. It is strongly suggested that all Unix sites
try this out before falling back on the old Pascal compilation. TeX-to-C
appears to remove the old need for machine-dependent change files; the
generated code is smaller and faster, and it passes the trip test. If you
want a really HUGE TeX, you should study the listing in Computers and
Typesetting Volume B, TeX the Program, and adjust |max_halfword| and all
dependent values. This option is not open on Unix pc Pascal, owing to the
16-bit limitation in array sizes. (PAM, 1/19/88)

Relevant new files are in

./tex82:
        Contains Makefile.TeX-to-C and the ctex.ch changefile for TeX 2.9,
        in addition to the regular Pascal-based files. TeX-to-C uses the
        standard tex.web file. Setup.TeX-to-C gives further instructions.
        The texedit.h and texplaces.h files in this directory set up
        various default search paths and strings. The textrip.h file
        contains a single definition that governs whether the special
        triptex form of initex is compiled.

There are two directories added to the ./tex82 path:

./tex82/textoc:
        Contains the conversion programs which translate ctex.p into
        several .h and .c files, which are stored in the ctex directory.

./tex82/ctex:
        Contains the extra stuff needed to build a running TeX.
        ``tex.h'' contains possibly-system-dependent definitions of
        data structures and other types. It assumes that ``long'' is at
        least 32 bits and ``short'' is at least 16 bits. This is as
        specified by the ANSI Draft C standard, and most compilers (at
        least on larger machines) work this way. ``extra.c'' contains
        some possibly-system-dependent code as well, such as opening
        files, reading the input, getting things started, etc.
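The size assumptions that ``tex.h'' is said to make about ``long'' and
``short'' can be checked on a new machine before building anything. The
following is only a sketch: the function names are made up for this note
and do not appear in the distribution, and it assumes a compiler that
supplies <limits.h>.

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical helpers, not part of the TeX-to-C sources. */

/* Nonzero if ``long'' can hold every 32-bit signed value. */
int long_has_32_bits(void)
{
    return LONG_MAX >= 2147483647L;
}

/* Nonzero if ``short'' can hold every 16-bit signed value. */
int short_has_16_bits(void)
{
    return SHRT_MAX >= 32767;
}

/* Abort at run time if either assumption fails on this compiler. */
void check_type_sizes(void)
{
    assert(long_has_32_bits());
    assert(short_has_16_bits());
}
```

Calling check_type_sizes() from a small test program is enough; if it
returns quietly, the two assumptions hold for that compiler.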
To create TeX in C and compile it, you should just be able to type
``make'' in the current directory if you're running 4.2BSD or an
equivalent. Here's an English explanation of what needs to be done, and
what you have to do for System V or other non-BSD environments:

------------------------------------------

To compile TeX from scratch, check the contents of ./tex82/texplaces.h
and texedit.h, and change them if necessary. Make sure ``tangle'' is in
your search path (if it isn't, you may have to compile it from the
bootstrapping tangle.[*}.p file appropriate to your system). Make sure
you are back in directory ``tex82'' and type ``make''. This should create
the files ctex.p and ctex.pool. ``ctex.p'' is basically Pascal, with some
C-isms, and with some routines and type definitions left out (they're in
tex.h and extra.c).

When ctex.p has been created, make will move into the ``textoc''
directory and run ``make'' there to create all the conversion programs.
Then make will run ``convert'' (also in textoc). This completes the
conversion process, creating ten .c files and some .h files in the
``ctex'' directory.

These files have been run through lint and proved to be practically
lint-free. There are some things, like unused variables, which are in
Knuth's code. There are also some ``unreached statements.'' Some of these
come from conditional compilation, and some come from the addition of
unnecessary ``return'' statements in functions which actually don't exit
at the bottom of the routine. Don't worry about any of this.

By default, the ``convert'' script runs a program which splits TeX into
10 different C modules. This is primarily to speed compilation, since
much of the code is common to initex and virtex. If you remove the line

        | (cd ../ctex; ../textoc/splitup)

from the pipeline, then you'll get one monolithic module. Either way,
there are embedded #ifdef's for INITEX, DEBUG, and STATS.
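The embedded #ifdef's mentioned above let one source file serve both
initex and virtex. The fragment below is a minimal sketch of that idea
only; the function names are invented for this note and are not taken
from the generated ctex sources.

```c
/* Hypothetical illustration of conditional compilation; these routines
   do not exist in the generated code. */

/* Returns 1 when the file is compiled with INITEX defined, else 0. */
int is_initex(void)
{
#ifdef INITEX
    return 1;   /* initex-only code would be compiled in here */
#else
    return 0;   /* virtex: the initex-only code is left out entirely */
#endif
}

/* Returns 1 when the file is compiled with DEBUG defined, else 0. */
int debug_enabled(void)
{
#ifdef DEBUG
    return 1;
#else
    return 0;
#endif
}
```

Compiling the same file once with the symbol defined (for example with
the usual ``-DINITEX'' option to cc) and once without yields the two
variants from a single source.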
While doing the conversion process, you have the option of making all the
local variables of appropriate types ``register'' variables. This can
speed up execution by 10% on some machines/compilers. If you want to do
this, remove the comment character in the convert script at the beginning
of the ``regfix'' line. On SunOS 3.2, doing this will cause the compiler
to produce incorrect code on an MC68020 machine, but correct code on an
MC68010 machine. ULTRIX accepts the code adjusted with ``regfix'' and
produces executables that are about half the size of the Pascal
executables, and faster as well.

Fixwrites.c contains most of the tex.web dependencies, since it has to
generate %s, %d, or whatever for printf strings depending on the type of
the object to be written out. Since it doesn't maintain a symbol table,
it is possible that a major change in TeX's variables could break it.
Similarly, it knows that the Pascal READ statement is used only to read
integers (other READ's are eliminated by the changefile). Its function
should be incorporated into the program textoc, since textoc maintains a
symbol table and therefore knows the types of the identifiers being
written.

``make'' will then go to the ``ctex'' directory and ``make'' both initex
and virtex.

To create a triptex, type ``make triptex''. Since this causes make to
move into the ctex directory and ``make initex'', you will want to move
any initex you have already successfully compiled somewhere else first.
After ``initex'' has been successfully made, it is renamed to
``triptex''. Remove the .o files from this compilation. Triptex is a
special version of initex which is set up for running the trip test. To
run the test itself, follow the instructions which come with the trip.tex
file.
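The job described above for fixwrites.c, choosing a printf conversion
from the type of each item written, can be pictured with the sketch
below. It is not the code in fixwrites.c: the type tags and function
names are hypothetical, and integers are given ``%ld'' on the strength of
this README's statement that all printf's print integers that way.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical type tags; the real fixwrites.c is not reproduced here. */
enum arg_kind { ARG_INTEGER, ARG_STRING, ARG_CHAR };

/* Pick the printf conversion for one written item. */
const char *conversion_for(enum arg_kind kind)
{
    switch (kind) {
    case ARG_INTEGER: return "%ld";
    case ARG_STRING:  return "%s";
    case ARG_CHAR:    return "%c";
    }
    return "";  /* unknown tag: contribute nothing */
}

/* Concatenate the conversions for n items into out.
   Returns 0 on success, -1 if out is too small. */
int build_format(const enum arg_kind *kinds, int n, char *out, size_t outsize)
{
    int i;
    size_t used = 0;

    if (outsize == 0)
        return -1;
    out[0] = '\0';
    for (i = 0; i < n; i++) {
        const char *conv = conversion_for(kinds[i]);
        size_t len = strlen(conv);

        if (used + len + 1 > outsize)
            return -1;
        memcpy(out + used, conv, len + 1);  /* copies the trailing NUL too */
        used += len;
    }
    return 0;
}
```

A table of types is exactly what a symbol table would supply, which is
why the text suggests folding this work into textoc.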
The generated C code has passed the trip test for TeX 2.[23579] on the
following systems:

        Sequent Balance, Dynix 2.1
        Sun3 (MC68020), SunOS 3.2
        Sun2 (MC68010), SunOS 3.2 (works with regfix)
        Integrated Solutions, Unix 3.07, both cc and gcc compilers
        DEC VAX-11/750, 4.2BSD, 4.3BSD
        DEC VAX-8530, ULTRIX 2.0 (works with regfix)
        Pyramid 90x, OSx 4.0

If you're running on something else, you should make sure that the C code
passes the trip test. If it doesn't, you can't call it TeX! Also, I'd
like to know if you port this code to another type of system, so please
drop me a note if you do.

Other compile-time options that you can set in tex.h include STAT and
DEBUG. STAT adds TeX's statistics-gathering code, and DEBUG adds the
special debugging code. You should have STAT turned on when doing the
trip test (i.e., when TRIP is defined in tex.h). DEBUG is only useful
when you're going to debug TeX itself.

I also tried changing the type definition of halfword from unsigned short
to unsigned long in ``ctex/tex.h''. The resulting program still passed
the trip test (and lint) on SunOS 3.2. This means you can build a TeX
with a mem array larger than 65535 words for those really big jobs.

You can edit the manifest constants in ctex/texd.h, as long as you obey
any restrictions on the range of values allowed in tex.web (e.g., the
relationship between memmax and memtop). The values set by ctex.ch are
those used at UCI.

If your native integer size is two bytes, your C compiler may complain
about constants larger in absolute value than 65535. They really should
be written as nnnL. There may be other problems with such compilers; I
don't have one available to try. To be safe, all printf's use ``%ld''
when printing integers.

Finally, I have also tested ctex under the System V compiler available on
the Sequent in the ``ATT Universe,'' and with the cc command available in
SunOS 3.2 in /usr/5bin. These are the only System V compilers available
to me.
As far as I know, since the rest of the code passes lint, it should port
very easily to any other type of Unix system. To compile it for System V,
you should edit ``extra.c'' and change the #undef to a #define for the
cpp symbol SYSV. I would like to know if System V users find other
incompatibilities.

The original version of textoc.{c,yacc,lex} was written by Tomas Rokicki,
now at Stanford University. It was mostly finished, and he put it into
the public domain. I took this code, made quite a few changes in it, and
simultaneously created a TeX changefile compatible with it, borrowing
from existing Unix TeX changefiles. I wrote all the auxiliary programs
which are now used in converting TeX to C, and I made further changes in
the textoc program to produce better (== faster and more lintable) C
code. Tomas suggested the method of producing ``#ifdef INITEX'' instead
of having two change files and producing two completely different .c
files, and he also suggested automatically splitting up the file into
several separately-compilable files.

If you make any additions or improvements in this code, please send the
changes to me so I can distribute them to others.

        Tim Morgan
        Department of Information and Computer Science
        University of California, Irvine 92717
        morgan@ics.uci.edu, morgan@uci.bitnet
        (714) 856-7553