Vertical lists in TeX and how they could be improved along with the horizontal ones

My previous post described the elements of a horizontal list in TeX. This one describes the elements that are broken into pages, and some improvements that should be more feasible now than in the 1980s, when TeX was implemented.

Chapter 15 of The TeXbook by Donald E. Knuth explains TeX's page breaking algorithm and how it may be used to produce beautiful books. The first paragraph (page 109) states that page breaking is much more difficult than line breaking, since ‘pages often have much less flexibility than lines do’. Line breaking uses a total-fit algorithm that finds optimal breaks for a whole paragraph; page breaking uses a first-fit algorithm, so TeX ‘sees’ only the current page when selecting a break. As Knuth explains (page 110), this design difference stems from the lack of enough high-speed memory to store several pages. This was certainly true in the 1980s, but now many complete books fit in the modern equivalents of the high-speed memory of earlier days.
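The difference between the two strategies can be sketched in a few lines of Python. This is a toy model, not TeX's page builder: items are rigid blocks with integer heights, there is no glue, penalty or insertion, and the cost of a page is simply the square of its unused height (the last page is free). The function names and the cost function are my assumptions, chosen only for illustration, and every item is assumed to fit on a page by itself.

```python
def first_fit(heights, page_height):
    """Greedy breaking: close the current page as soon as the next item does not fit."""
    pages, current, used = [], [], 0
    for h in heights:
        if current and used + h > page_height:
            pages.append(current)
            current, used = [], 0
        current.append(h)
        used += h
    if current:
        pages.append(current)
    return pages


def total_fit(heights, page_height):
    """Dynamic programming over all feasible breaks, minimizing the summed page cost."""
    n = len(heights)

    def page_cost(i, j):
        # Cost of putting items i..j-1 on one page; None means the page is overfull.
        used = sum(heights[i:j])
        if used > page_height:
            return None
        if j == n:
            return 0  # the last page may be short (assumed toy rule)
        return (page_height - used) ** 2

    # best[j] = (lowest total cost of breaking after item j-1, previous break)
    best = [None] * (n + 1)
    best[0] = (0, None)
    for j in range(1, n + 1):
        for i in range(j):
            if best[i] is None:
                continue
            cost = page_cost(i, j)
            if cost is None:
                continue
            total = best[i][0] + cost
            if best[j] is None or total < best[j][0]:
                best[j] = (total, i)
    if best[n] is None:
        raise ValueError("some item is taller than the page")

    # Walk the chain of chosen breaks backwards to recover the pages.
    pages, j = [], n
    while j > 0:
        i = best[j][1]
        pages.append(heights[i:j])
        j = i
    return pages[::-1]


items = [5, 5, 2, 9, 9]
print(first_fit(items, 10))  # [[5, 5], [2], [9], [9]]
print(total_fit(items, 10))  # [[5], [5, 2], [9], [9]]
```

The greedy version fills the first page perfectly and then has to leave the third item almost alone on a page; the total-fit version accepts a slightly worse first page to avoid that. The total-fit version has to keep the whole list in memory, which is the cost Knuth was avoiding.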

Both vertical and horizontal lists contain boxes, glue, kerns and penalties. I described them previously; there are no interesting differences here except for the direction of typesetting. Whatsits and marks were also explained in that post, since they are passed from horizontal lists to vertical ones.

Two types of material occur in only one of these two modes. Discretionary breaks exist only in horizontal mode; in vertical mode, output routines do special tricks instead. Insertions are used to put material in special places on pages (most commonly footnotes, floating tables and figures). Discretionary breaks in vertical lists would probably simplify some things that now require complicated output routines, for example typesetting indices with the entry text repeated on pages beginning with subentries (a solution using marks is explained in The TeXbook, pages 261–263).

The output routine is one of the new ideas in TeX. It allows nearly arbitrary modification of the page built from the vertical list before the resulting box is shipped out to the output file. Output routines enable things like multicolumn typesetting, special headers and footers, footnotes and correctly floating figures.

An output routine is so useful in vertical mode; would something similar be useful in horizontal mode? Lines are just boxes of a certain width and shift (chosen by e.g. \parshape), with special glue on both sides (to allow e.g. ragged-right typesetting) and content determined by the total-fit algorithm (pdfTeX also adds margin kerning). It would be interesting to have an arbitrary TeX token list produce such boxes. This would probably make things like line counting or a repeated opening quote mark simpler. It could also determine how nice the line is and possibly change it according to the number of previous lines. Is there a nice TeX solution for typesetting the first line of a paragraph in small caps? According to a TUG interview with Werner Lemberg, it is simple in Troff. A ‘line routine’ would make it simple in a TeX-like system.
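To make the idea more concrete, here is a rough Python sketch of what a line routine could look like. Nothing like it exists in TeX; the names `break_with_line_routine`, `plain` and `numbered` are hypothetical, widths are counted in characters, and the breaking is greedy rather than total-fit to keep the example short. The point is only that the routine builds each candidate line, so whatever it adds (here a line number) takes part in the decision where to break.

```python
def break_with_line_routine(words, width, line_routine):
    """Greedy line breaking where a callback renders every candidate line.

    line_routine(line_number, words) returns the finished line as a string;
    its length is what gets compared with the available width.
    """
    lines, current = [], []
    for word in words:
        candidate = line_routine(len(lines), current + [word])
        if current and len(candidate) > width:
            # The candidate is too wide: finish the line without this word.
            lines.append(line_routine(len(lines), current))
            current = [word]
        else:
            current.append(word)
    if current:
        lines.append(line_routine(len(lines), current))
    return lines


def plain(line_number, words):
    """A line routine that does nothing special."""
    return " ".join(words)


def numbered(line_number, words):
    """A line routine that counts lines by prefixing each with its number."""
    return f"{line_number + 1:>2} " + " ".join(words)


words = "a bb ccc dd e".split()
print(break_with_line_routine(words, 8, plain))     # ['a bb ccc', 'dd e']
print(break_with_line_routine(words, 8, numbered))  # [' 1 a bb', ' 2 ccc', ' 3 dd e']
```

A routine that repeats an opening quote mark, or one that sets line 0 in small caps, would have the same shape: it receives the line number and the material, and returns the box.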

The line routine would determine the badness of a line. Ragged-right text clearly has different optimal breaks than justified text (compare the output of the broken LaTeX ragged text commands with normal justified text; use the ragged2e package instead). In vertical mode the badness of a page break is determined before the output routine is called, but the routine may decide to change the break. Wouldn’t it be simpler if an output routine were called for each feasible break to determine its badness?
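A small Python sketch shows why the choice of badness function matters. The justified version uses the usual approximation of TeX's badness, about 100 times the cube of the glue ratio and capped at 10000; TeX itself computes this with integer arithmetic, so the values are not exact. The ragged version is purely my assumption of what a line routine might supply: interword glue is rigid and only the right margin stretches, by a made-up amount `margin_stretch`. The function names and the candidate lines are hypothetical as well.

```python
import math


def justified_badness(natural, target, stretch, shrink):
    """Approximate TeX badness: 100 * (glue ratio)**3, capped at 10000."""
    if natural > target:
        needed, available = natural - target, shrink
        if needed > available:
            return math.inf  # overfull line: never acceptable in this sketch
    else:
        needed, available = target - natural, stretch
    if needed == 0:
        return 0
    if available <= 0:
        return 10000  # the line must stretch but has nothing stretchable
    return min(10000, round(100 * (needed / available) ** 3))


def ragged_badness(natural, target, stretch, shrink, margin_stretch=20):
    """Hypothetical ragged-right badness: rigid spaces, only the margin stretches."""
    if natural > target:
        return math.inf  # rigid spaces cannot shrink
    return min(10000, round(100 * ((target - natural) / margin_stretch) ** 3))


def best_break(candidates, target, badness):
    """Pick the candidate line with the lowest badness under the given function."""
    return min(
        candidates,
        key=lambda c: badness(c["natural"], target, c["stretch"], c["shrink"]),
    )


# Two ways to end the same line, in arbitrary units, for a target width of 100.
candidates = [
    {"name": "tight", "natural": 103, "stretch": 10, "shrink": 5},
    {"name": "loose", "natural": 92, "stretch": 10, "shrink": 5},
]
print(best_break(candidates, 100, justified_badness)["name"])  # tight
print(best_break(candidates, 100, ragged_badness)["name"])     # loose
```

The same two candidates are ranked in opposite order, so a breaker that asked a routine for the badness of each feasible break would pick different breaks for ragged and justified text without any other change.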

There are two possible solutions to the problems of the current page breaking algorithm. One is total-fit page breaking, which would also make the typesetting system simpler (the same total-fit algorithm could be used for both lines and pages). The other is better cooperation between line breaking and page breaking (proposed at least once for the NTS, the project which led to e-TeX). Maybe if badness were calculated for a chapter as a whole, things like adjusting \looseness by hand to prevent bad page breaks could be automated in a way not possible with TeX?
