%PDF-1.4
%ÐÔÅØ
3 0 obj <<
/Length 152
>>
stream
BT
/F51 9.9626 Tf 91.925 759.927 Td [(W)80(elcome)-250(to)-250(pdfT)]TJ 67.818 -2.241 Td [(E)]TJ 4.842 2.241 Td [(X!)]TJ 138.924 -654.747 Td [(1)]TJ
ET
endstream
endobj
2 0 obj <<
/Type /Page
/Contents 3 0 R
/Resources 1 0 R
/MediaBox [0 0 595.276 841.89]
/Parent 5 0 R
>> endobj
1 0 obj <<
/Font << /F51 4 0 R >>
/ProcSet [ /PDF /Text ]
>> endobj
7 0 obj
[333 408 500 500 833 778 333 333 333 500 564 250 333 250 278 500 500 500 500 500 500 500 500 500 500 278 278 564 564 564 444 921 722 667 667 722 611 556 722 722 333 389 722 611 889 722 722 556 722 667 556 611 722 722 944 722 722 611 333 278 333 469 500 333 444 500 444 500 444 333 500 500 278 278 500 278 778 500 500 500 500 333 389 278]
endobj
8 0 obj <<
/Type /FontDescriptor
/FontName /Times-Roman
/Flags 34
/FontBBox [0 -216 1000 678]
/Ascent 678
/CapHeight 651
/Descent -216
/ItalicAngle 0
/StemV 83
/XHeight 450
>> endobj
6 0 obj <<
/Type /Encoding
/Differences [33/exclam 49/one 69/E 84/T 87/W/X 99/c/d/e/f 108/l/m 111/o/p 116/t]
>> endobj
4 0 obj <<
/Type /Font
/Subtype /Type1
/BaseFont /Times-Roman
/FontDescriptor 8 0 R
/FirstChar 33
/LastChar 116
/Widths 7 0 R
/Encoding 6 0 R
>> endobj
5 0 obj <<
/Type /Pages
/Count 1
/Kids [2 0 R]
>> endobj
9 0 obj <<
/Type /Catalog
/Pages 5 0 R
>> endobj
10 0 obj <<
/Producer (pdfTeX-1.40.1)
/Creator (TeX)
/CreationDate (D:20070125120125+01'00')
/ModDate (D:20070125120125+01'00')
/Trapped /False
/PTEX.Fullbanner (This is pdfTeX, Version 3.141592-1.40.1-2.2 (Web2C 7.5.6) kpathsea version 3.5.6)
>> endobj
xref
0 11
0000000000 65535 f
0000000335 00000 n
0000000224 00000 n
0000000015 00000 n
0000001058 00000 n
0000001210 00000 n
0000000939 00000 n
0000000403 00000 n
0000000756 00000 n
0000001267 00000 n
0000001316 00000 n
trailer
<< /Size 11
/Root 9 0 R
/Info 10 0 R
/ID [<C0AA7E97DD300D37AF4A42A0553DC9E6> <C0AA7E97DD300D37AF4A42A0553DC9E6>] >>
startxref
1570
%%EOF

The pdfTeX user manual

Hàn Thế Thành, Sebastian Rahtz, Hans Hagen, Hartmut Henkel, Paweł Jackowski, Martin Schröder
January 25, 2007
Rev. 1.675

The title page of this manual represents the plain TeX coded text “Welcome to pdfTeX!”

\pdfoutput=1
\pdfcompresslevel=0
\font\tenrm=ptmr8r
\tenrm
Welcome to pdf\TeX!
\bye


Contents
1 Introduction
2 About PDF
3 Getting started
4 Macro packages supporting pdfTeX
5 Setting up fonts
6 Formal syntax specification
7 pdfTeX primitives
8 Graphics and color
9 Character translation
Abbreviations
Examples of HZ and protruding
Additional PDF keys
Colophon
GNU Free Documentation License

1 Introduction
The main purpose of the pdfTeX project is to create and maintain an extension of TeX that can produce PDF directly from TeX source files and improve the result of TeX typesetting with the help of PDF. When PDF output is not selected, pdfTeX produces normal DVI output; otherwise it generates PDF output that looks identical to the DVI output. An important aspect of this project is to investigate alternative justification algorithms (e.g. a font expansion algorithm akin to the hz micro-typography algorithm by Prof. Hermann Zapf), optionally making use of Multiple Master fonts.

pdfTeX is based on the original TeX sources and Web2c, and has been successfully compiled on Unix, Win32 and MSDos systems. It is under active development, with new features trickling in. Great care is taken to keep new pdfTeX versions backward compatible with earlier ones.

For some years there has been a ‘moderate’ successor to TeX available, called ε-TeX. Because mainstream macro packages such as LaTeX have started supporting this welcome extension, the ε-TeX functionality has also been integrated into the pdfTeX code. For a while (TeX Live 2004 and 2005) pdfTeX therefore came in two flavours: the ε-TeX-enabled pdfeTeX engine and the standard one, pdfTeX. The ability to produce both PDF and DVI output made pdfeTeX the primary TeX engine in these distributions. As of pdfTeX version 1.40 the ε-TeX extensions are part of the pdfTeX engine itself, so there is no longer any need to ship pdfeTeX. The ε-TeX functionality of pdfTeX can be disabled if not required. Other extensions are MLTeX and encTeX; these are also included in the current pdfTeX code.

pdfTeX is maintained by Hàn Thế Thành, Martin Schröder, Hans Hagen, Taco Hoekwater, Hartmut Henkel, and others. The pdfTeX homepage is http://www.pdftex.org. Please send pdfTeX comments and bug reports to the mailing list pdftex@tug.org.

We thank all readers who send us corrections and suggestions. We also wish to express the hope that pdfTeX will be of as much use to you as it is to us. Since pdfTeX is still being improved and extended, we suggest that you keep track of updates.

1.1 About this manual
This manual revision (1.675) tries to keep up with recent pdfTeX development up to version 1.40.0. The main text updates concern the new configuration scheme, font mapping, and new or updated primitives. The primary repository for the manual and its sources is at http://sarovar.org/projects/pdftex/. Copies in PDF format can also be found on the CTAN network in directory ctan:systems/pdftex.

Thanks to Karl Berry for proofreading and submitting a long list of changes. Any new errors that slipped in afterwards are the editor’s. Please send questions or suggestions by email to pdftex@tug.org.

1.2 Legal Notice
Copyright © 1996–2007 Hàn Thế Thành. Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation; with no Invariant Sections, no Front-Cover Texts, and no Back-Cover Texts. A copy of the license is included in the section entitled “GNU Free Documentation License”.

2 About PDF
The cover of this manual lists an almost minimal PDF file generated by pdfTeX, with the corresponding source file on the next page. Unless compression is enabled, such a PDF file is rather verbose and readable. The first line specifies the version used; currently pdfTeX produces level 1.4 output by default. PDF viewers are supposed to silently skip over all elements they cannot handle.

A PDF file consists of objects. These objects can be recognized by their number and keywords:

9 0 obj << /Type /Catalog /Pages 5 0 R >> endobj
Here 9 0 obj ... endobj is the object capsule. The first number is the object number. The sequence 5 0 R is an object reference, a pointer to another object (no. 5). The second number (here a zero) is currently not used in pdfTeX; it is the version number of the object. It is used, for instance, by PDF editors when they replace objects by new ones.

When a viewer opens a PDF file, it goes right to the end of the file, looking for the keyword startxref. The number after startxref gives the absolute position (byte offset from the file start) of the so-called ‘object cross-reference table’ that begins with the keyword xref. This table in turn gives the byte offsets of all objects that make up the PDF file, providing fast random access to the individual objects (here the xref table shows 11 objects, numbered from 0 to 10; object no. 0 is always unused).

The actual starting point of the file’s object structure is defined after the trailer: the /Root entry points to the /Catalog object (no. 9). In this object the viewer can find the pointer /Pages to the page list object (no. 5). In our example we have only one page. The trailer also holds an /Info entry, which points to an object (no. 10) with a bit more information about the document. Just follow the thread:

/Root → object 9 → /Pages → object 5 → /Kids → object 2 → /Contents → object 3
As soon as we add annotations, a fancy word for hyperlinks and the like, some more entries will be present in the catalog. We invite users to take a look at the PDF code of this file to get an impression of that.

The page content is a stream of drawing operations. Such a stream can be compressed, where the level of compression can be set with \pdfcompresslevel (compression is switched off for the title page). Let’s take a closer look at this stream in object 3. Often (but not in our example) there is a transformation matrix, six numbers followed by cm. As in PostScript, the operator comes after the operands. Between BT and ET comes the text. A font is selected by a Tf operator, which is given a font resource name /F.. and the font size. The actual text goes into () bracket pairs so that it creates a PostScript string. The numbers in between bracket pairs provide horizontal movements like spaces and fine glyph positioning (kerning).

When one analyzes a file produced by a less sophisticated typesetting engine, whole sequences of words can be recognized. In PDF files generated by pdfTeX, however, many words come out rather fragmented, mainly because a lot of kerning takes place; in our example the 80 moves the text (elcome) left towards the letter (W) by 80/1000 of the font size. PDF viewers in search mode simply ignore the kerning information in these text streams. When a document is searched, the search engine reconstructs the text from these (string) snippets.

Every /Page object also points to a /Resources object (no. 1) that gives all ingredients needed to assemble the page. In our example only a /Font object (no. 4) is referenced, which in turn tells that the text is typeset in /Font /Times-Roman. The /Font object also points to a /Widths array (object no. 7) that tells for each character by how much the viewer must move forward horizontally after typesetting a glyph. More details about the font can be found in the /FontDescriptor object (no. 8); if a font file is embedded, this object points to the font program stream. But as the Times-Roman font used for our example is one of the 14 so-called standard fonts that should always be present in any PDF viewer and therefore need not be embedded in the PDF file, it is left out here for brevity. However, when we use for instance a Computer Modern Roman font, we have to make sure that this font is later available to the PDF viewer, and the best way to do this is to embed the font. It’s highly recommended nowadays to embed even the standard fonts, as modern viewers often don’t use the original 14 standard fonts anymore, but instead approximate them by instances of built-in Multiple Master fonts (e.g. the Adobe Reader 7 approximates the Times-Roman variants by the Minion font). So you never really know how the document looks on the viewer side unless you embed every font.

In this simple file we don’t specify in what way the file should be opened, for instance full screen or clipped. A closer look at the page object no. 2 (/Type /Page) shows that a mediabox (/MediaBox) is part of the page description. A mediabox acts like the (high-resolution) bounding box in a PostScript file. pdfTeX users can add dictionary entries to page objects with the \pdfpageattr primitive.

Although in most cases macro packages will shield users from these internals, pdfTeX provides access to many of the entries described here, either automatically by translating the TeX data structures into PDF ones, or manually by pushing entries to the catalog, page, info or self-created objects. One can for instance create an object by using \pdfobj, after which \pdflastobj returns the number. So

\pdfobj{/Type /Catalog /Pages 5 0 R}
inserts an object into the PDF file, while \pdflastobj returns the number pdfTeX assigned to this object. Unless objects are referenced by others, they will just end up as isolated entities, not doing any real harm but bloating the PDF file. In general this rather direct way of pushing objects into the PDF file with primitives like \pdfobj is not very useful, and only makes sense when implementing, say, fill-in field support or annotation content reuse. We will come to that later.

For those who want to learn more about the gory PDF details, the best bet is to read the PDF Reference. As of the time of writing you can download this book as a big PDF file from Adobe’s PDF Technology Center, http://www.adobe.com/devnet/pdf/pdf_reference.html, or get the heavy paper version.

Those who, after this introduction, feel unsure how to proceed are advised to read on but skip section 7. Before we come to that section, we will describe how to get started with pdfTeX.
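As a concrete illustration of the \pdfobj and \pdflastobj mechanism described earlier in this section, the following plain TeX file is a minimal sketch, not a recommended practice. It creates one raw object, remembers its number, and references it from the catalog so that it does not end up isolated. The key /ManualNote, the macro name \noteobj and the string content are made up for illustration; the behaviour of \immediate noted in the comment reflects our understanding of pdfTeX 1.40 and should be checked against the description of \pdfobj in the primitives section.

```tex
% Minimal plain TeX sketch; process with pdftex.
\pdfoutput=1          % select PDF output
\pdfcompresslevel=0   % leave streams uncompressed so the file stays readable

% Create a raw object holding a small dictionary. With \immediate the
% object is written to the file right away (assumption: without it the
% object is written only once it is referenced with \pdfrefobj).
\immediate\pdfobj{<< /Note (stored by pdfobj) >>}

% Remember the number pdfTeX assigned to the object just created.
\edef\noteobj{\the\pdflastobj}

% Reference the object from the catalog so it is not an isolated entity.
% /ManualNote is a hypothetical key, used here only for illustration.
\pdfcatalog{/ManualNote \noteobj\space 0 R}

The object created above has number \noteobj.
\bye
```

Because compression is switched off, the resulting file can be opened in a text editor to see the new object and the extra catalog entry next to the structures discussed in this section.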

3 Getting started
This section describes the steps needed to get pdfTeX running on a system where pdfTeX is not yet installed. Nowadays virtually all TeX distributions have pdfTeX as a component, such as TeX Live, teTeX, XEmTeX, MiKTeX, proTeXt, and CMacTeX. The ready-to-run TeX Live distribution comes with pdfTeX versions for many Unix, Win32, and Mac OS X systems; more information can be found at http://www.tug.org/tex-live/. teTeX by Thomas Esser is a source distribution with an automated compilation process for Unix systems; see http://www.tug.org/teTeX/. For Win32 systems there are also three separate distributions that contain pdfTeX, all in ctan:systems/win32: XEmTeX by Fabrice Popineau, MiKTeX by Christian Schenk, and proTeXt (based on MiKTeX) by Thomas Feuerstack. So when you use any of these distributions, you don’t need to bother with the pdfTeX installation procedure in the next sections.

If there is no precompiled pdfTeX binary for your system, or the version coming with a distribution is not the current one and you would like to try out a fresh pdfTeX immediately, you will need to build pdfTeX from sources; read on. You should already have a working TeX system, e.g. TeX Live or teTeX, into which the freshly compiled pdfTeX will be integrated. Note that the installation description in this manual is Web2c-specific.