Most non-fiction texts have numbered sections and many references to them, sometimes also stating which page is referred to. With LaTeX, the numbering of pages, sections and all such references is completely automatic, so these numbers are nearly always correct. However, the method by which this is implemented is not optimal.
To use it, write the \label{id} command in the
section to be referred to, where id identifies
the section; it should be easy to remember and rarely
changed. This makes it possible to use the
\ref{id}
and \pageref{id} commands, which typeset
the number of the section or its page number. (References may
lead to the page of any text, or to the number of an equation, list item,
theorem, etc.; I refer to all of them as ‘sections’ in
this post.)
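As a minimal sketch of these commands in use (the label name sec:intro and the section titles are my own, chosen only for illustration):

```latex
\documentclass{article}
\begin{document}
\section{Introduction}\label{sec:intro}
% The label records the current section number and the page it lands on.
Some introductory text.

\section{Details}
% \ref typesets the section number, \pageref the page number.
See section~\ref{sec:intro} on page~\pageref{sec:intro}.
\end{document}
```

Placing \label directly after \section is a common convention, since the label then picks up that section's number.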
During the first run of LaTeX the text is completely typeset,
using ‘??’ in place of the numbers being referred to. Each
\label writes its identifier, section number and page
number to the .aux file. At the beginning of the
second run this file is read, all references are typeset with the values
that were correct in the previous run, and new values are written to
the file.
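To give an idea of what is stored, a label named sec:intro (a hypothetical name) placed in section 1 on page 1 would traditionally produce an .aux line roughly like the one below. Treat this as a sketch: the exact form depends on the LaTeX version and on loaded packages such as hyperref, which record additional fields.

```latex
% Excerpt from a hypothetical example.aux:
% \newlabel{<id>}{{<section number>}{<page number>}}
\newlabel{sec:intro}{{1}{1}}
```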
Here an important feature of TeX shows:
typesetting words, breaking paragraphs into lines, joining lines
into pages, and outputting pages are done asynchronously. To learn
the page number of a given \label, LaTeX uses the
primitive TeX command \write, which evaluates the
appropriate commands only during page output. This makes it impossible
to change the text depending on the current page number, so the number
from the previous run must be used instead.
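The effect of this delay can be seen with a small experiment. This is only a sketch: \fillcount is a counter name I made up, and the loop merely produces a paragraph long enough to span several pages. Both messages are written to the .log file.

```latex
\documentclass{article}
\newcount\fillcount % hypothetical counter, used only to generate filler text
\begin{document}
% Build one paragraph that is much longer than a page.
\fillcount=0
\loop\ifnum\fillcount<2000 word \advance\fillcount by 1 \repeat
% Executed at once, while the paragraph is still being collected,
% so \thepage is the page on which the paragraph started.
\immediate\write-1{immediate write: page \thepage}%
% Stored in the paragraph and executed only when the page that
% contains this point is output, so \thepage is the final page.
\write-1{delayed write: page \thepage}%
end.
\end{document}
```

In the log, the immediate message should report the page on which the paragraph began, while the delayed one reports the page on which that point finally ended up. \label needs the latter, which is why its value becomes known only once the page has been output.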
The same method is used for other things, like tables of
contents, bibliographic references, indices and the correct placement
of margin notes in two-sided documents by the
mparhack package.
However, using multiple passes for cross references has several
disadvantages. The most visible one is that the time needed to
produce correct output is several times longer, although only one
output file is needed and error messages are useful for only one
pass (this can be improved by using \batchmode for
all runs after the first and pdfTeX’s -draftmode
option for all runs before the last). Even so, it is clear
that most of the work done in the earlier passes is unnecessary
(especially when referring only to section numbers, not
pages).
Since the same files are modified in each pass, it is difficult
to use make or another generic build system optimally
with LaTeX. This leads to longer processing than necessary and
to uncertainty about whether the document has outdated references.
It is even possible to make a document whose references are always incorrect. The following document shows this problem:
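\documentclass{minimal}
\pagestyle{empty}
% Roman page numbers: "ix" is wider than "x".
\pagenumbering{roman}
% A tiny text area, so the width of the reference decides the page breaks.
\setlength{\textwidth}{8pt}
\setlength{\textheight}{10pt}
\setlength{\parindent}{0pt}
\makeatletter
% Under pdfTeX only: use small pages that are easy to view side by side.
\@ifundefined{pdfpagewidth}{}{%
\pdfpagewidth=2in
\pdfpageheight=2in
}
\makeatother
\begin{document}
\setcounter{page}{9}
% The reference, a permitted line break, then the labelled letter "i".
\pageref{x}\hspace{0pt}i\label{x}
\end{document}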
(The part with \pdfpagewidth makes it easier
to see both pages at once in a PDF viewer; I have described it in a previous
post.)
From the second pass onwards, this document oscillates between having one page and two. When the reference says page x, the ‘i’ lands on page ix; but when the reference says page ix, the wider reference pushes the ‘i’ onto page x. (Leslie Lamport states in LaTeX: A Document Preparation System that using Roman numerals may lead to this problem; I did not know of any specific example of such a document before writing the one above.)
So despite being very useful, automatic cross references in LaTeX have some disadvantages. Usually a good enough solution is to run LaTeX on a document a few more times than may be necessary, and to change the text to avoid infinite loops in this process. Could it be improved? I’ll write about some other ways to avoid these problems in a separate post.
