html2latex -- convert HTML markup to LaTeX markup
The original author, Nathan Torkington, wrote:
"The source is available here",
but this is obviously no longer the case.
Instead, I (W. Hennings) put it here. This
zip file includes an MS-DOS executable.
There is another compiled version on CTAN, which I also put
here (see description), but both result from the same source.
html2latex [opt ...] [file ...]
For each file argument, html2latex converts the text from HTML markup to LaTeX markup. If no files are specified, a usage message is given. Input is taken from standard input for files named -. Output is written to a similarly named file with a .tex extension (html2latex recognises .html extensions).
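The naming rule above can be sketched in a few lines of Python. This is only an illustration of how the rule reads, not the tool's actual code: the function name output_name is hypothetical, and it assumes that a trailing .html is replaced by .tex, that any other name simply gets .tex appended (as with html-intro in the second example below), and that the comparison is case-sensitive. For the argument -, it returns None, on the assumption that there is no file name to derive from standard input.

```python
import os


def output_name(input_name):
    """Guess the output file name html2latex would use (illustrative only)."""
    if input_name == "-":
        # Assumption: standard input has no name to derive an output file from.
        return None
    root, ext = os.path.splitext(input_name)
    if ext == ".html":
        # Recognised extension: swap .html for .tex.
        return root + ".tex"
    # Assumption: any other name just gets .tex appended.
    return input_name + ".tex"
```

Under these assumptions, output_name("file.html") gives "file.tex" and output_name("html-intro") gives "html-intro.tex".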
Options modify the action of html2latex. The options are:
An example of use is:
html2latex -n - < file.html | less
This converts file.html to LaTeX and pages through the output. The sections (corresponding to heading tags in the HTML source) will be numbered.
Another example is:
html2latex -t 'Introduction to HTML' -a gnat -p -c html-intro
This takes input from the file html-intro, writes to html-intro.tex, and adds a title page (with title Introduction to HTML and author gnat) and a table of contents, with page breaks after both. The sections of the document are not numbered.
Currently the only HTML tags supported are: TITLE, H1, H2, H3, H4, H5, H6, UL, OL, DL, DT, DD, LI, B, I, U, EM, STRONG, CODE, SAMP, KBD, VAR, DFN, CITE, LISTING. The only recognised SGML escapes are &amp;, &lt;, &gt;. ADDRESS tags are handled badly.
The COMPACT attribute to a DL tag is not recognised. MENU and DIR styles are not handled well. TITLE text is ignored.
Currently PRE tags are not handled at all.
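Because the list of supported tags is short, it can help to check a document before converting it. The following Python sketch is not part of html2latex. It uses the standard html.parser module to report tags that are not in the supported list given above. The names SUPPORTED, TagAudit and unsupported_tags are hypothetical, and structural tags such as HTML or BODY are reported too, since the list does not mention them.

```python
from html.parser import HTMLParser

# Tags listed as supported in this page, in lower case
# (HTMLParser reports tag names in lower case).
SUPPORTED = {
    "title", "h1", "h2", "h3", "h4", "h5", "h6",
    "ul", "ol", "dl", "dt", "dd", "li",
    "b", "i", "u", "em", "strong",
    "code", "samp", "kbd", "var", "dfn", "cite", "listing",
}


class TagAudit(HTMLParser):
    """Collect the names of start tags that are not in SUPPORTED."""

    def __init__(self):
        super().__init__()
        self.unsupported = set()

    def handle_starttag(self, tag, attrs):
        if tag not in SUPPORTED:
            self.unsupported.add(tag)


def unsupported_tags(html_text):
    """Return a sorted list of tag names html2latex is not documented to support."""
    audit = TagAudit()
    audit.feed(html_text)
    audit.close()
    return sorted(audit.unsupported)
```

For instance, a document containing H1, PRE, P and B tags would be reported as using the unsupported tags p and pre.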
The entire file is read into memory. For long HTML documents on machines with little memory, this may cause problems.
Nathan Torkington adapted the HTML parser from NCSA's Xmosaic package (file://ncsa.uiuc.edu/Web/xmosaic) and wrote the conversion code. The HTML parser code is subject to the NCSA restrictions. The conversion code is subject to the VUW restrictions. Enquiries should be sent via e-mail to Nathan.Torkington@vuw.ac.nz.
This HTML page is part of the texcnv site.
Copyright © 1998, 1999, 2000 Wilfried Hennings
You may copy and redistribute it under the following conditions: