LuaTeX: visualizing interword glue calculations

Introduction

In this example we will use the “post_linebreak_filter” to iterate over all the lines in a typeset paragraph and use the data provided by LuaTeX to insert pdf_literal nodes which draw a box to show the size of the glue between words. It is based on a nice piece of code on the LuaTeX wiki by (I believe) Patrick Gundlach.

Note: The code does not recurse into nested lists; so if you have further boxes within the lines of your paragraph you should note that this code is not designed to handle them. An exercise for the reader ;-).

If you extend the code to deal with nested boxes and other glue types (such as \baselineskip), and account for glyph widths, heights and depths, you gain the ability to calculate precise (x,y) positions within many types of node list.

Code and explanations

Here, we are placing a paragraph into a \vbox{There are many variations of ….} and hooking into the post_linebreak_filter to access the node list after TeX has broken the paragraph into lines. Each line in the paragraph is accessible from LuaTeX as an hlist which is traversed using the LuaTeX API call node.traverse_id(…). We then scan through the nodes contained in each hlist, looking for glue. As mentioned, the code does not recurse so we are just choosing the individual paragraph lines and not exploring anything below that level.

Each hlist contains its glue setting parameters which let us calculate by how much the individual glue items are stretched or shrunk within each line of the paragraph. From those values you can calculate the size of the set glue and draw an appropriately sized box with a pdf_literal, whose zero size does not affect the positioning of the text. The pdf_literal is inserted into the hlist node tree just before each glue node.

For further reading on TeX’s glue calculations see page 77 of The TeXbook.
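The calculation described above can be tried out in plain Lua, outside LuaTeX. The following is only a minimal sketch: the function names set_glue_width and sp_to_bp are my own, and ordinary Lua tables stand in for the real glue and hlist nodes, on the assumption that their fields mean the same as the node fields used in the listing (glue_sign 1 = stretching, 2 = shrinking). The sample glue values are made up.

```lua
-- Sketch only: plain Lua tables stand in for LuaTeX nodes.
-- glue: width, stretch, shrink (scaled points) plus stretch_order, shrink_order
-- line: glue_set, glue_sign, glue_order as stored on the enclosing hlist
function set_glue_width(glue, line)
  local size = glue.width
  if line.glue_sign == 1 and glue.stretch_order == line.glue_order then
    size = size + glue.stretch * line.glue_set   -- the line was stretched
  elseif line.glue_sign == 2 and glue.shrink_order == line.glue_order then
    size = size - glue.shrink * line.glue_set    -- the line was shrunk
  end
  return size
end

-- 65536 sp = 1 pt, and 72.27 pt = 72 bp (the unit used in PDF code)
function sp_to_bp(sp)
  return (sp / 65536) * (72 / 72.27)
end

-- Hypothetical interword glue: roughly 3.33pt plus 1.66pt minus 1.11pt
local glue = { width = 218235, stretch = 108790, shrink = 72745,
               stretch_order = 0, shrink_order = 0 }
local line = { glue_set = 0.5, glue_sign = 1, glue_order = 0 }

print(sp_to_bp(set_glue_width(glue, line)))
```

Note that only glue whose stretch (or shrink) order matches the line's glue_order takes part in the setting; lower-order glue keeps its natural width.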

\pdfoutput=1
\hoffset-1in
\voffset-1in
\nopagenumbers
\directlua{

function scanskips(hlist)

local size=0
% Some debug output
print(string.char(10), "glue_order = "..hlist.glue_order, "glue_set = "..hlist.glue_set, "glue_sign = "..hlist.glue_sign)

id = node.id("glue")

for n in node.traverse_id(id, hlist.head) do 
% Some more debug output
          print("subtype="..n.subtype, "width: ", n.spec.width,  "stretch: ", n.spec.stretch,  "shrink: ", 
                  n.spec.shrink, "stretch order: ", n.spec.stretch_order, 
                  "shrink order: ", n.spec.shrink_order, string.char(10)) 

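          local order=hlist.glue_order
          local set=hlist.glue_set
          local sign=hlist.glue_sign

          % glue_sign: 0 = not set, 1 = stretching, 2 = shrinking.
          % Only glue whose order matches glue_order is stretched or shrunk.
          size=n.spec.width
          if sign==1 and n.spec.stretch_order==order then
                size=size + n.spec.stretch*set
          elseif sign==2 and n.spec.shrink_order==order then
                size=size - n.spec.shrink*set
          end

          % Convert scaled points to PDF big points
          local bp=(size/65536)*(72.0/72.27)

          local pdf = node.new("whatsit","pdf_literal")
          pdf.mode = 0
          pdf.data = "q 0 0 "..bp.." 12 re 0.3 w S Q"
          % insert_before maintains the prev/next links and copes with
          % glue that happens to be the first node in the line
          hlist.head = node.insert_before(hlist.head, n, pdf)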

      end %for loop
end % function
}

\directlua{
          linelist=function(head)
                    for line in node.traverse_id('hlist',head) do
                              scanskips(line)
                    end
          return head
          end

          callback.register("post_linebreak_filter",linelist)
}
 
\setbox1000=\vbox{\hsize=50mm There are many variations of passages of Lorem Ipsum available, but the majority have suffered alteration in some form, by injected humour, or randomised words which don't look even slightly believable. If you are going to use a passage of Lorem Ipsum, you need to be sure there isn't anything embarrassing hidden in the middle of text. All the Lorem Ipsum generators on the Internet tend to repeat predefined chunks as necessary, making this the first true generator on the Internet. It uses a dictionary of over 200 Latin words, combined with a handful of model sentence structures, to generate Lorem Ipsum which looks reasonable. The generated Lorem Ipsum is therefore always free from repetition, injected humour, or non-characteristic words etc.}
\pdfpagewidth=\wd1000
\vsize=\ht1000
\advance\vsize by 65536 sp
\pdfpageheight=\vsize
\box1000
\bye

Resulting PDF

As usual, through the Google Docs viewer or download here.

Producing printers crop marks with MetaPost and LuaTeX nodes

Introduction

In this article I’ll show a technique for producing crop marks using the MetaPost library (MPlib) which is built into LuaTeX. There is a lot of ground to cover so I’ll try to focus on/summarise the most important/useful areas to prevent this article becoming far too long! It has already taken a few days to write it and prepare the graphics… I’m not going to attempt a tutorial on the MetaPost language because there are many excellent articles already written by people who are best qualified to produce that material. A great starting point is the TeX Users Group page on MetaPost.

SVG-enabled browser needed: This post uses inline SVG and SVG via an iframe, which may not work in all browsers.

Citing various sources: a few thanks are in order

Much of the material in this post is derived from existing work by members of the LuaTeX community, so I’d like to acknowledge those sources. Firstly, a huge thank you to Hans Hagen for creating the Lua code which converts the MPlib data to raw PDF code. It is a really excellent and hugely useful piece of code which is part of the ConTeXt distribution. In addition, I learnt a lot by reading the source code of luamplib by Hans Hagen, Taco Hoekwater, Elie Roux and Manuel Pégourié-Gonnard. I also discovered some code by Dohyun Kim which makes some very helpful additions to luamplib through which you can support the traditional btex … etex construct for including TeX within MetaPost code processed by MPlib (more of that in another post).

Pieces of the process

The key elements of the technique I’ll describe are:

  1. Embedding MetaPost code (to draw a crop mark) within your LuaTeX document.
  2. Converting the MetaPost output to PDF code (using the code from Hans Hagen’s ConTeXt).
  3. Generating pdf_literal nodes representing the crop marks (modification to luamplib).
  4. Placing the crop marks in the appropriate locations through an output routine.

What are crop marks?

Crop marks, also referred to as “printers marks”, “cut marks” or “trim marks”, are small graphics placed at the corners of a page to indicate the physical size of the final printed document’s pages. They are used during commercial printing activities, such as page imposition, colour separation, folding and trimming. The physical appearance of crop marks will vary depending on the application used to generate the pages but, of course, with LuaTeX and other TeX engines you are free to create your own designs. Advanced readers will be aware that for multi-colour separations (spot and 4 colour CMYK) the crop marks must appear on all plates, but I’ll not cover that topic here. It would be fairly easy to do by injecting appropriate PDF code to set the colour space for the crop marks, or by using the Printer’s Mark Annotations feature of PDF.

The following graphic (produced with Inkscape) shows the general idea.

The following inline SVG graphic (produced directly by exporting from MPlib) shows the design of the crop mark we will be producing via MetaPost code.


The MetaPost code

I’ll readily admit that I’m an amateur when it comes to programming with MetaPost so due apologies to any experts reading the code :-). The idea here is that the MetaPost code is inline in the LuaTeX document and I’m using the process_input_buffer callback (see this post) to store the MetaPost code into a buffer which will be processed via MPlib. The \startbuffer and \stopbuffer commands were described in an earlier article.

\def\startbuffer{\directlua{callback.register("process_input_buffer",dobuffer)}}
\def\stopbuffer{\directlua{callback.register("process_input_buffer",nil)}}
\startbuffer
beginfig(1);
numeric n,g,tl, alpha; 
pair dl,zc,zl,zr,zt,zb,db;
path c, t;
alpha:=0.75;
n:=2;
g:=72*(3/25.4);
tl:=(n+1)*g;
pickup pensquare scaled 0.5bp;
draw (0,g)--(0,tl);
draw (-g,0)--(-tl,0);
rd:=0.5*tl;
rc:=0.5*rd;
dl=(-(1+alpha)*rc,0);
db=(0,-(1+alpha)*rc);
zc=(-rd,rd);
zl=zc+dl;
zr=zc-dl;
zt=zc+db;
zb=zc-db;
t=zc+(0.9*rc,0)--zc-(0.9*rc,0);
c = fullcircle scaled 2rc shifted(-rd,rd);
pickup pensquare scaled 0.5bp;
draw zl--zr;
draw zt--zb;
fill c withcolor black;
pickup pensquare scaled 1bp;
draw t withcolor white;
draw t  rotatedabout(zc, 90) withcolor white;
endfig;
end;
\stopbuffer%

Very brief introduction to MPlib

As mentioned, MPlib is the library version of the MetaPost interpreter built into LuaTeX. Instead of using MetaPost as a standalone executable (e.g., mpost.exe) you access it through an API provided by the LuaTeX engine. The value of the integration of MetaPost with LuaTeX is well demonstrated by the truly stunning results achieved by Hans Hagen and the ConTeXt distribution. The ambitions of this article are rather more modest.

In outline the steps are as follows.

  1. You need to create a “finder” function which MPlib will call to locate any files it needs.
  2. Provide the finder function as one of the arguments to the API call mplib.new() which
    is responsible for creating an instance of the interpreter.
  3. Load the “format file” containing the macro package you want to use (e.g., plain.mp).
  4. Use the interpreter instance returned by mplib.new() to execute your code.
  5. Process the figure objects generated and returned by MPlib (if your code worked without error).

Here is some sample code for a finder function and creating an instance of the MetaPost interpreter.

-- finder function for MPlib
function finder(name, mode, ftype)
        local found 
                if mode=="w" then found = name else 
                        found = kpse.find_file(name,ftype) 
                end 
                if found then
                        print("MPlib finder: " .. name .. " -> " .. found) 
                end
        return found 
end 

-- create new interpreter instance
function newmp (memname)
        local preamble = "let dump = endinput ; input \%s ;"
        local mp = mplib.new {
            ini_version = true,
            find_file = finder,
        }
        mp:execute(string.format(preamble, memname))
        return mp
 end

MetaPost to PDF or SVG

One nice feature of MPlib is that it will automatically generate an SVG representation of the graphic produced from the MetaPost code, and that is how the inline SVG crop mark (above) was produced. The MPlib library will also generate PostScript code and an “object representation” of the graphic. The “object representation” can be parsed to convert the graphic into other data formats and this is how the PDF data is generated by the ConTeXt Lua code contained in luamplib. It runs over the collection of objects and converts them to the equivalent representation in PDF data/structures. I really admire that code and it’s great to have it available.

Re-using code and ideas in luamplib

Within the luamplib distribution is the core Lua code from ConTeXt which does the “heavy lifting” to convert MPlib data structures to PDF code. For the purposes of the work described in this article I re-used that Lua code and placed it into a Lua file (mpnodes.lua) which can be downloaded here. To use it you’ll need to load it as a Lua module:

\directlua{require("mpnodes")}

Using mpnodes.lua

The mpnodes.lua module contains functions which create a new instance of the MetaPost interpreter (via MPlib), execute MetaPost code and use the functions from ConTeXt to generate the PDF data. During the process of generating PDF data, the Lua code makes a number of calls to defined TeX macros (originally in luamplib.sty) which, in effect, pass the PDF data to the (Lua)TeX engine. The TeX macros of interest here are:

  • \def\mplibstarttoPDF#1#2#3#4{….}
  • \def\mplibtoPDF#1{…}
  • \def\mplibstoptoPDF{…}

As part of this implementation those macros were redefined to work with LuaTeX nodes. Examples will be provided later in the article.

Note: the mpnodes.lua module also contains other functions which I won’t describe: including work based on the code from Dohyun Kim which implements the btex … etex functionality of the standalone MetaPost interpreter. The MPlib version of MetaPost does not directly support the btex … etex construct; other methods have to be employed to include TeX code within MetaPost graphics processed by MPlib. Again, those methods are based on the pioneering work of the ConTeXt distribution. I’ll write about this in a future post.

Creating nodes to store crop marks

If you look at the figure above, which shows 4 crop marks on a page, you can see that you only need to create 1 crop mark graphic and then rotate it in increments of 90 degrees as you place it at the 4 corners. In the approach used here, the PDF data generated from the MetaPost graphic is stored in LuaTeX pdf_literal nodes, which are then drawn at each corner by the \output routine described later in the article.
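To make the rotation idea concrete, here is a small sketch in plain Lua of how a rotation can be expressed with the PDF cm operator. The function names rotation_cm and rotated are my own and are not part of luamplib or mpnodes.lua. The code is written for a standalone .lua file; inside \directlua each % in the format string would have to be written as \%.

```lua
-- Sketch only: build the PDF "cm" operator for a rotation about the origin.
-- A positive angle (in degrees) rotates counter-clockwise.
local function round3(x)
  -- round to 3 decimals; also turns tiny values such as 6e-17 into a clean 0
  return math.floor(x * 1000 + 0.5) / 1000
end

function rotation_cm(angle)
  local a = math.rad(angle)
  local c  = round3(math.cos(a))
  local s  = round3(math.sin(a))
  local ms = round3(-math.sin(a))
  -- PDF matrix [a b c d e f] = [cos sin -sin cos 0 0]
  return string.format("%.3f %.3f %.3f %.3f 0 0 cm", c, s, ms, c)
end

-- Wrap PDF drawing code in q ... Q so that the rotation does not leak out.
function rotated(angle, pdfcode)
  return "q " .. rotation_cm(angle) .. " " .. pdfcode .. " Q"
end

-- The four orientations needed for the four corners of the page
for _, angle in ipairs({ 0, -90, 180, -270 }) do
  print(angle, rotation_cm(angle))
end
```

Because the same drawing code is reused with four different matrices, the MetaPost graphic only has to be converted to PDF once.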

Enough description, here’s the code

Here’s the Plain-TeX-based code with inline comments.

\pdfoutput=1
\pdfcompresslevel=0
\hoffset-1in
\voffset-1in
\pdfpageheight=200mm
\pdfpagewidth=300mm
\vsize=100mm
\hsize=200mm
\topskip=0pt
\maxdepth=0pt

% This output routine centres \box255 horizontally and vertically
% on the PDF page and ships out 4 boxes (2000--2003) onto every page.
% These boxes contain the crop mark graphic, suitably rotated.
% The \output routine is described in the article text.

\output={\shipout\vbox to \pdfpageheight{%
\vfill%
\vbox to \vsize{\offinterlineskip%
\vfill%
\hbox to\pdfpagewidth{\hfill\hbox to \hsize{\copy2000\hfill\copy2001}\hfill}%
\hbox to \pdfpagewidth{\hfill\box255 \hfill}%
\hbox to \pdfpagewidth{\hfill\hbox to \hsize{\copy2003\hfill\copy2002}\hfill}%
\vfill}%
\vfill%
}%pdfpageheight
}%output

% Load the mpnodes.lua module

\directlua{require("mpnodes")}

% Here we redefine the macros in luamplib to 
% work with nodes. 


% \mplibstarttoPDF simply stores the coordinates of the bounding box of the MetaPost
% graphic in 4 TeX token registers (\toks500 to \toks503), but they are not used in the code below.

\def\mplibstarttoPDF#1#2#3#4{%
\directlua{
		tex.toks[500]=#1
		tex.toks[501]=#2
		tex.toks[502]=#3
		tex.toks[503]=#4		 
%tex.print(tex.toks[500],tex.toks[501],tex.toks[502],tex.toks[503])
}}

% \mplibtoPDF stores each line of PDF data generated from Lua
% and builds it up into one long string. Note I am adding a line
% feed (string.char(10)) before each line

\def\mplibtoPDF#1{%
\directlua{
		buffy= buffy or ""
		buffy=buffy..string.char(10).."#1"
}}

% Here's the main work. Once the PDF data describing the crop mark has been collected 
% the final macro call made by the Lua code is to \mplibstoptoPDF. In this macro 
% we create 4 pdf_literal nodes which only differ by adding extra PDF data to rotate the  
% crop mark in increments of 90 degrees. The PDF data has been collected by \mplibtoPDF
% in a text string called "buffy".

\def\mplibstoptoPDF{%
\directlua{
	rad=math.rad
	sin=math.sin
	cos=math.cos

% Function to generate PDF transformation (rotation) matrices.
function rotate(angle)
	local d=string.format("q 0 0 m \%3.3f  \%3.3f  \%3.3f \%3.3f  0 0 cm ", 
	cos(rad(angle)), 
	sin(rad(angle)),
	-sin(rad(angle)), 
	cos(rad(angle)))
	return d
end

% The names of the pdf_literal nodes reflect their position on the page.
% Note: PDF rotations are counter-clockwise hence negative angles used
% in the code below.

pdftopleft = node.new("whatsit","pdf_literal")
pdftopleft.mode=0
pdftopleft.data=buffy

pdftopright = node.new("whatsit","pdf_literal")
pdftopright.mode=0

rot90=rotate(-90)
print(rot90)%..buffy.." Q")
pdftopright.data= rot90..buffy.." Q"

pdfbottomright = node.new("whatsit","pdf_literal") 
pdfbottomright.mode=0
rot180=rotate(180)
pdfbottomright.data=rot180..buffy.." Q"

pdfbottomleft = node.new("whatsit","pdf_literal") 
pdfbottomleft.mode=0
rot270=rotate(-270)
pdfbottomleft.data=rot270..buffy.." Q"

% We have our nodes now pack them into boxes for shipping
% out via the \output routine as \copy2000....\copy2003

tex.box[2000]= node.hpack(pdftopleft)
tex.box[2001]= node.hpack(pdftopright)
tex.box[2002]= node.hpack(pdfbottomright)
tex.box[2003]= node.hpack(pdfbottomleft)

}}

% Macro to create an instance of the MPlib interpreter and execute
% the MetaPost code collected by the \startbuffer and \stopbuffer macros.
% Lua code is stored in the mpnodes.lua module hence you prefix the functions
% with "mpnodes" to call them: mpnodes.function_name(...)

\def\runmpcode{%
\directlua{
%print(buffer)

%Error checking could be improved here...
mp=mpnodes.newmpx("plain")
res,err=mp:execute(buffer)
%print(res,err)
if res then 
	mpnodes.outputpdf(res) %core function to produce PDF data
		else
		print("No figures")
	end
}}

% Code to implement buffering the inline MetaPost code
\directlua{
function addline(line)
	line=line..string.char(10)
	%print("called with "..line)
	buffer = buffer..line
end}

% Code to implement buffering the inline MetaPost code
\directlua{
buffer=""
function dobuffer(line)
if string.match(line,"stopbuffer") then
	callback.register("process_input_buffer",nil)
	return ""
end
	addline(line)
%	print(line)
	return ""
end
}

% Code to implement buffering the inline MetaPost code through 
% process_input_buffer callback 
\def\startbuffer{\directlua{callback.register("process_input_buffer",dobuffer)}}
\def\stopbuffer{\directlua{callback.register("process_input_buffer",nil)}}
% Here's the inline MetaPost code
\startbuffer
beginfig(1);
numeric n,g,tl, alpha; 
pair dl,zc,zl,zr,zt,zb,db;
path c, t;
alpha:=0.75;
n:=2;
g:=72*(3/25.4);
tl:=(n+1)*g;
pickup pensquare scaled 0.5bp;
draw (0,g)--(0,tl);
draw (-g,0)--(-tl,0);
rd:=0.5*tl;
rc:=0.5*rd;
dl=(-(1+alpha)*rc,0);
db=(0,-(1+alpha)*rc);
zc=(-rd,rd);
zl=zc+dl;
zr=zc-dl;
zt=zc+db;
zb=zc-db;
t=zc+(0.9*rc,0)--zc-(0.9*rc,0);
c = fullcircle scaled 2rc shifted(-rd,rd);
pickup pensquare scaled 0.5bp;
draw zl--zr;
draw zt--zb;
fill c withcolor black;
pickup pensquare scaled 1bp;
draw t withcolor white;
draw t  rotatedabout(zc, 90) withcolor white;
endfig;
end;
\stopbuffer%
\runmpcode
\def\apar{Contrary to popular belief, Lorem Ipsum is not simply random text. It has roots in a piece of classical Latin literature from 45 BC, making it over 2000 years old. Richard McClintock, a Latin professor at Hampden-Sydney College in Virginia, looked up one of the more obscure Latin words, consectetur, from a Lorem Ipsum passage, and going through the cites of the word in classical literature, discovered the undoubtable source. Lorem Ipsum comes from sections 1.10.32 and 1.10.33 of "de Finibus Bonorum et Malorum" (The Extremes of Good and Evil) by Cicero, written in 45 BC. This book is a treatise on the theory of ethics, very popular during the Renaissance. The first line of Lorem Ipsum, "Lorem ipsum dolor sit amet.."\par}%

\apar\apar\apar\apar\apar\apar\apar\apar\apar\apar\apar\apar
\bye

Sample PDF

Here’s an example PDF produced by the code above.

Have you spotted the flaw?

Readers who have a background in print production may have spotted a problem with the positioning of the crop marks relative to the text area defined by \box255: there is no white space between the two, so the crop marks are too close to the live text material. Can this be fixed? Yes, very easily. Insert some vertical kerns to move the crop marks vertically, and offset them horizontally by increasing the width of the horizontal boxes that contain them. These actions create offsets between the crop marks and the text area. Of course, the offsets will have to be factored into any calculations for your page design but that should not be difficult. The page size defined by the crop marks then becomes:

  • width of page = \hsize + 2*(horizontal offset)
  • height of page = \vsize + 2*(vertical kern offset)

We’ll look at this below.

The output routine

A crucial part of the process is to assemble everything onto the final PDF page with the crop marks placed as required. In the following I’ll discuss a basic \output routine that ignores many complications, such as inserts, but which could form the basis for your own experiments. It could, for example, be used as a starting point for simpler documents such as business cards. I’ll also give an example of fixing the “crop marks offset” problem mentioned above.

So where do we start?

The \output routine is responsible for assembling the page components to achieve your desired page design. The following output routine works by wrapping \box255 in a series of \vboxes and \hboxes with flexible glue to centre \box255 on the page. The overall structure is an outer \vbox with the same height as the PDF page, containing another \vbox set to the value of \vsize. Very flexible (stretchy) vertical glue is used to fill the “space” above and below the inner \vbox, i.e., the space that needs to be filled due to the different heights of the two \vboxes (\pdfpageheight versus \vsize). This glue is responsible for vertically centring the inner \vbox.

Inside the second \vbox (the one to \vsize) we place a series of horizontal boxes (to a width of \pdfpagewidth) to contain the crop marks above and below \box255, plus the actual typeset content of \box255 itself. We use horizontal glue to centre everything… horizontally.

\output={\shipout\vbox to \pdfpageheight{%
\vfill%
\vbox to \vsize{\offinterlineskip%
\vfill%
\hbox to\pdfpagewidth{\hfill\hbox to \hsize{\copy2000\hfill\copy2001}\hfill}%
\hbox to \pdfpagewidth{\hfill\box255 \hfill}%
\hbox to \pdfpagewidth{\hfill\hbox to \hsize{\copy2003\hfill\copy2002}\hfill}%
\vfill}%
\vfill%
}%pdfpageheight
}%output

Placing the crop marks

Two of the three “\hboxes to \pdfpagewidth” contain yet another \hbox (of width \hsize) and the purpose of those is to place the crop marks above and below \box255.

\hbox to\pdfpagewidth{\hfill\hbox to \hsize{\copy2000\hfill\copy2001}\hfill}%
\hbox to \pdfpagewidth{\hfill\box255 \hfill}%
\hbox to \pdfpagewidth{\hfill\hbox to \hsize{\copy2003\hfill\copy2002}\hfill}%

Let’s take a look at the first one:

\hbox to\pdfpagewidth{\hfill\hbox to \hsize{\copy2000\hfill\copy2001}\hfill}%

The inner \hbox to \hsize{\copy2000\hfill\copy2001} contains an infinitely stretchable \hfill glue which stretches to fill the \hbox. The key point is that the pdf_literal nodes from which boxes 2000 and 2001 are built have zero width, and so the \hfill glue pushes them to the far left and right of the containing \hbox.

If you look back at the Lua code which created the pdf_literal nodes:

pdftopleft = node.new("whatsit","pdf_literal")
pdftopleft.mode=0
pdftopleft.data=buffy

you’ll see that the “mode” is set to 0. This defines the origin for drawing them as the point on the page where they appear, which is just above and below \box255 thanks to the actions of the glue.

And finally, offsetting the crop marks

  • Vertically: one solution is simply to absorb some of the space which is occupied by the stretchable glues, thus preventing the crop marks being pushed up against \box255.
  • Horizontally: adjust the size of the \hboxes containing the crop marks (make them wider).

Here’s one very quick example where we add a 20pt offset vertically and increase the \hboxes containing the crop marks to 1.25 x \hsize. A proper solution would of course parameterise everything.

\output={\shipout\vbox to \pdfpageheight{%
\vfill%
\vbox to \vsize{\offinterlineskip%
\vfill%
\hbox to\pdfpagewidth{\hfill\hbox to 1.25\hsize{\copy2000\hfill\copy2001}\hfill}%
\kern20pt%
\hbox to \pdfpagewidth{\hfill\box255 \hfill}%
\kern20pt%
\hbox to \pdfpagewidth{\hfill\hbox to 1.25\hsize{\copy2003\hfill\copy2002}\hfill}%
\vfill}%
\vfill%
}%pdfpageheight
}%output
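As a quick check on the numbers in this \output routine, the two page-size formulas given earlier can be evaluated in a few lines of plain Lua. This is only a sketch: trim_size is a name I made up, and the values assume \hsize=200mm and \vsize=100mm as in the full listing.

```lua
-- Sketch only: page size implied by the crop marks, using
--   width  = hsize + 2 * (horizontal offset)
--   height = vsize + 2 * (vertical kern offset)
-- All lengths are in millimetres.
local MM_PER_PT = 25.4 / 72.27            -- one TeX point expressed in mm

function trim_size(hsize, vsize, hoffset, voffset)
  return hsize + 2 * hoffset, vsize + 2 * voffset
end

local hsize, vsize = 200, 100             -- \hsize=200mm, \vsize=100mm
-- The crop-mark boxes are 1.25\hsize wide and centred, so half of the
-- extra width falls on each side of the text block.
local hoffset = (1.25 * hsize - hsize) / 2
local voffset = 20 * MM_PER_PT            -- \kern20pt

local w, h = trim_size(hsize, vsize, hoffset, voffset)
print(string.format("%.2f mm x %.2f mm", w, h))
```

With these settings the marks describe a page of roughly 250 mm by 114 mm, which still fits comfortably inside the 300 mm by 200 mm PDF page.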

And the resulting PDF:

Download mpnodes.lua

You can download it here.

Creating PDF pattern fills with LuaTeX nodes

Introduction

This is a short example to introduce two very useful LuaTeX API functions which let you work with low-level PDF objects. Here we’ll use them to explore a PDF feature called pattern fills which (from the PDF specification) are a way to

“… apply “paint” that consists of a repeating graphical figure or a smoothly varying color gradient instead of a simple color. Such a repeating figure or smooth gradient is called a pattern. Patterns are quite general, and have many uses; for example, they can be used to create various graphical textures, such as weaves, brick walls, sunbursts, and similar geometrical and chromatic effects”

I’m not going to attempt any explanation of pattern fills because the PDF specification explains them, and their many options, in great detail. Hopefully, these small code examples will be helpful should you want to explore using them in your work with LuaTeX.

LuaTeX API

Here are the functions we’ll be using:

pdf.immediateobj(…)
Quoting from The LuaTeX Reference Manual: This function creates a pdf object and immediately writes it to the pdf file. It is modelled after pdfTeX’s \immediate\pdfobj primitives. All function variants return the object number of the newly generated object.

  • n = pdf.immediateobj( objtext)
  • n = pdf.immediateobj("file", filename)
  • n = pdf.immediateobj("stream", streamtext, attrtext)

pdf.pageresources =…
This lets you add named resources to the page /Resources dictionary so that they can be used within page content streams.
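The value assigned to pdf.pageresources is just a string of PDF dictionary entries, so it can be assembled with ordinary string handling. The following sketch shows one way to do that in plain Lua; resource_entry is a hypothetical helper of my own, and the object numbers are invented (in LuaTeX they would be the values returned by pdf.immediateobj). It is written for a standalone .lua file; inside \directlua each % would need to be written as \%.

```lua
-- Sketch only: build a resource entry such as
--   /Pattern << /P1 12 0 R >>
-- from a table mapping resource names to PDF object numbers.
function resource_entry(category, objects)
  local names = {}
  for name in pairs(objects) do names[#names + 1] = name end
  table.sort(names)                        -- predictable output order
  local parts = {}
  for _, name in ipairs(names) do
    parts[#parts + 1] = string.format("/%s %d 0 R", name, objects[name])
  end
  return string.format("/%s << %s >>", category, table.concat(parts, " "))
end

-- Made-up object numbers, for illustration only
local resources = resource_entry("Pattern", { P1 = 12 })
               .. " " .. resource_entry("ColorSpace", { Cs12 = 11 })
print(resources)

-- Inside LuaTeX one would then assign:  pdf.pageresources = resources
```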

In outline…

There are two parts to the approach:

  1. Writing the appropriate pattern fill objects and data to the PDF file.
  2. Creating a pdf_literal node to use our newly defined pattern fills.

The code

Here’s the \directlua code with in-line comments.

\directlua{
% If you want to quickly view the LuaTeX PDF data in a text editor you should 
% set the compression level to 0. 
tex.pdfobjcompresslevel=0

% We'll use the object number o to store the object reference in the page /Resources dictionary.
o = pdf.immediateobj("[/Pattern /DeviceRGB] ")

% Here we use pdf.immediateobj(...) to write the data which actually defines the pattern fill. 
% We'll use the object number n to store the object reference in the page /Resources dictionary.
 
n = pdf.immediateobj("stream", "1 J
.5 w
1 1 4 4 re 5 5 3 3 re f", "/Type /Pattern
/PatternType 1
/PaintType 2
/TilingType 1
/BBox [0 0 10 10]
/XStep 5
/YStep 5
/Resources << >>")

% Here we add the appropriate named resources (for the pattern  fill) to the page /Resources dictionary so
% that we can use the pattern fill in content streams within any page.

pdf.pageresources =  "/Pattern << /P1 "..n.." 0 R >> /ColorSpace << /Cs12 "..o.." 0 R >> "

% Here we create a pdf_literal node which simply draws a box (0 0 12 12  re)
% and fills it using our pattern. Note that if you omit the q ... Q construct to save and 
% restore then the pattern fill will affect other content on your page, with very strange
% results...

pdfdata = node.new("whatsit","pdf_literal")
pdfdata.mode=0
pdfdata.data=" q 0 0 12 12  re  /Cs12 cs 0.77 0.20 0.00 /P1 scn f Q"

% Package our pdf_literal into a box so that we can use it with TeX
% code such as \copy999

tex.box[999]= node.vpack(pdfdata)
}

Full example minus comments

\pdfoutput=1
\pdfcompresslevel=0
\hoffset-1in
\voffset-1in
\pdfpageheight=200mm
\pdfpagewidth=300mm
\vsize=100mm
\hsize=200mm

\directlua{
tex.pdfobjcompresslevel=0

o = pdf.immediateobj("[/Pattern /DeviceRGB] ")

n = pdf.immediateobj("stream", "1 J
.5 w
1 1 4 4 re 5 5 3 3 re f", "/Type /Pattern
/PatternType 1
/PaintType 2
/TilingType 1
/BBox [0 0 10 10]
/XStep 5
/YStep 5
/Resources << >>")

pdf.pageresources =  "/Pattern << /P1 "..n.." 0 R >> /ColorSpace << /Cs12 "..o.." 0 R >> "

pdfdata = node.new("whatsit","pdf_literal")
pdfdata.mode=0
pdfdata.data=" q 0 0 12 12  re  /Cs12 cs 0.77 0.20 0.00 /P1 scn f Q"

tex.box[999]= node.vpack(pdfdata)
}

Hip \copy999 Hip Hooray \copy999

\bye

Resulting PDF

You can download the PDF output from the above example. It displays OK with my version of Adobe Reader (8.2.1 for Windows) and Evince (2.28 for Windows).

Evince PDF viewer: a Windows productivity tool

Adobe’s PDF reader (Adobe Reader) is certainly a very nice tool for viewing PDFs but it has one annoying “feature” (certainly on Windows): it puts some form of “lock” on the PDF file you are viewing. If you have the file open in Adobe Reader then LuaTeX will see the file is non-writable and will output the PDF in another location using settings in texmf.cnf: TEXMFOUTPUT. This behaviour of Adobe Reader forces you to tediously close and re-open the PDF file each time you update the PDF from a fresh run of LuaTeX.

If, like me, you are doing a lot of “edit–run LuaTeX–view PDF” cycles then you may find that the free Evince PDF viewer can save you a lot of time, and tedium. Evince does not lock the PDF file it is displaying so that LuaTeX can overwrite it and Evince will automatically refresh the display with the new PDF.

For sure, a number of TeX/LaTeX editors have in-built PDF viewers but sometimes you may prefer, or need, to use a standalone PDF viewer; if so, Evince is a nice solution and the Windows version can be downloaded from http://live.gnome.org/Evince/Downloads.

Basic example of LuaTeX’s process_input_buffer callback

Introduction

As mentioned in a previous post, LuaTeX provides a facility called callbacks in which you write a Lua function (your callback) and register it with LuaTeX through the callback.register() API function.

callback.register() takes two parameters:

  • A predefined callback name: as defined by LuaTeX. This defines the action or purpose of your function and is the “hook” into LuaTeX telling it where/why/when your function should be called.
  • Your actual callback function: either a named or anonymous Lua function.

In this post I’ll give an example/framework for one simple way to use the process_input_buffer callback. This callback allows you to hook into LuaTeX’s input buffer before LuaTeX actually starts looking at it. With this callback you can, for example, completely re-process the input and return the processed results back to LuaTeX, or filter out entire sections of the input–perhaps to store it elsewhere for later use or processing.
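As a small illustration of the “re-process the input” case, here is a sketch of a callback that rewrites every line before LuaTeX reads it. The function name expandtabs is my own invention; the only LuaTeX-specific call is callback.register, which is guarded so that the file can also be run with a standalone Lua interpreter.

```lua
-- Sketch only: a filter that rewrites each input line before TeX sees it.
-- Here every tab character is replaced by a single space.
function expandtabs(line)
  return (line:gsub("\t", " "))   -- the parentheses drop gsub's second result
end

-- callback.register exists only inside LuaTeX; the guard lets this file
-- be tested outside LuaTeX as well.
if callback then
  callback.register("process_input_buffer", expandtabs)
end

print(expandtabs("a\tb"))
```

Whatever string the function returns replaces the original line in LuaTeX's input; the example later in this post uses the same mechanism but returns an empty string to swallow the line entirely.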

Code example and explanation

We will register a callback function which strips out the input sandwiched between two macros called \startbuffer{..} and \stopbuffer. Again, this is a simple example and there are undoubtedly many much more sophisticated ways to do this.

The basic idea here is that the macro

\def\startbuffer#1{%
\directlua{
     dosomething("#1")
     callback.register("process_input_buffer",dobuffer)}
}

switches on intercepting the input, and a dummy macro (\stopbuffer{}) is “intercepted” to switch off processing the input.

There’s a very important point here in that after we make a call to \startbuffer{…} all our input bypasses LuaTeX processing from that point on and is passed to the function we have registered: “dobuffer”. To detect the end of the buffer, and switch off the callback, dobuffer must scan the input for the string “stopbuffer” and “unregister” the callback so that LuaTeX reverts back to its standard processing of the input. For sure, this approach could fail if the string stopbuffer occurs elsewhere within the input stream. I did say it was a simple example ;-).

In the example you can see a string of characters “^ & # _ $$ % \” whose \catcode settings could, without their \catcode being reset temporarily, cause problems if input and processed directly by LuaTeX.

\startbuffer{figure1}
Let's have some characters with various catcodes:
^
&
#
_
$$
%
\
\stopbuffer

However, in the example these characters are being intercepted long before LuaTeX starts to process them and so the idea of catcodes, at this point, has no meaning. Of course, they can still trip you up later in your processing.

Explanation

After the call to \startbuffer{…}, every line in the input is passed to the function dobuffer() which in our example does very little except print it out to the screen. Because the dobuffer() function returns the empty string “” the entire input between \startbuffer and \stopbuffer is removed from LuaTeX’s input: it never reaches the typesetting engine. You could use this as a simple way to implement long comment sections in your document.

You could implement the dobuffer function to perform all sorts of tasks, such as saving a copy of the filtered input and storing it in, say, a table indexed by the value passed to \startbuffer{…..}. The options and possibilities are endless!
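As a rough illustration of the "store it in a table" idea, here is a minimal sketch. It is my own variation and makes several assumptions: the code sits in a separate .lua file, \startbuffer passes its argument to dosomething as a Lua string, and the names buffers, current and getbuffer are hypothetical. The guard around callback.register is there only so that the file can also be tried out with a plain Lua interpreter.

```lua
-- Hypothetical storage; none of these names come from the LuaTeX API.
local buffers = {}   -- captured lines, indexed by buffer name
local current = nil  -- name of the buffer being filled, or nil

-- Called from \startbuffer: start a fresh buffer called `name`.
function dosomething(name)
  current = name
  buffers[name] = {}
end

-- Callback body: store each line; stop when "stopbuffer" is seen.
function dobuffer(line)
  if string.find(line, "stopbuffer", 1, true) then
    current = nil
    -- `callback` only exists inside LuaTeX, hence the guard.
    if callback then
      callback.register("process_input_buffer", nil)
    end
    return ""
  end
  if current then
    table.insert(buffers[current], line)
  end
  return ""  -- remove the line from LuaTeX's input
end

-- Return the captured text of a buffer as a single string.
function getbuffer(name)
  return table.concat(buffers[name] or {}, "\n")
end
```

After the buffer has been closed, getbuffer("figure1") would hand back the captured lines, which you could then write to a file or process further.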

\pdfoutput=1
\hoffset-1in
\voffset-1in
\pdfpagewidth=100mm
\pdfpageheight=100mm

\directlua{

function addline(line)
     print("I was called with "..line)
end}

\directlua{

function dobuffer(line)
   if string.match(line,"stopbuffer") then
      callback.register("process_input_buffer",nil)
      return ""
   end
   addline(line)
   return "" % this removes line from LuaTeX's input buffer
end
}

\directlua{
	function dosomething(str)
	% do something useful with the parameter to \startbuffer{.....}
	end
}

\def\startbuffer#1{\directlua{dosomething("#1");callback.register("process_input_buffer",dobuffer)}}
\def\stopbuffer{}

\startbuffer{figure1}
Let's have some characters with various catcodes:
^
&
#
_ 
$$ 
% 
\
\stopbuffer

Hello

\bye

Wow, it works! (or, nodes and output routines)

Introduction

As I explore more of LuaTeX I’m constantly amazed by its versatility. Over the last few days I have been reading about \output routines which are, deservedly, deemed some of the more complex parts of TeX. Caveat to readers: I do not profess to understand very much about \output routines and am not going to cover the genuinely complicated areas such as inserts (footnotes, graphics etc), main vertical lists, page breaking, recent contributions, etc. The following is based on typesetting pure text with no bells or whistles such as headers, footers, page numbers etc.

My objective was to experiment with Plain TeX to understand more about the relationship between the size of the PDF page and just how you control where the typeset results are physically placed within the PDF page area. With the clear caveats above in mind, the way I think about this is that TeX is busily building the typeset output into a box (number 255) with width \hsize and height \vsize and, when it decides there is enough to dump out a page, it will “call” the \output routine. It is through the \output routine that you control the physical location of TeX’s page (box 255) within your PDF page area. This is undoubtedly a pretty simple-minded view but it works for me (until I know better…).

Setting the page parameters via Lua code

To start with, here is a small LuaTeX routine to set up the PDF page size together with \hsize and \vsize, again for use with Plain TeX. Note you can also set \hoffset, \voffset and \topskip through the LuaTeX API.

\setpageparams#1#2#3#4{…} takes four values (assumed to be in mm), converts them to TeX’s scaled points (sp), and assigns the values to the appropriate dimension parameter.
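The conversion itself is just a chain of unit factors: 1 inch = 25.4 mm = 72.27 pt, and 1 pt = 65536 sp. A minimal sketch of it as a standalone Lua function follows; the name mm_to_sp is hypothetical, and rounding to a whole number of sp is my own choice rather than something the listing does.

```lua
-- Hypothetical helper, not part of the LuaTeX API.
-- 1 in = 25.4 mm = 72.27 pt, and 1 pt = 65536 sp.
function mm_to_sp(mm)
  local sp = (mm / 25.4) * 72.27 * 65536
  return math.floor(sp + 0.5)  -- round to a whole number of sp
end
```

With a helper like this, the assignments in the macro could be written along the lines of tex.hsize = mm_to_sp(90).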

\pdfoutput=1
\hoffset-1in
\voffset-1in
\topskip=0pt
\nopagenumbers

\def\setpageparams#1#2#3#4{%
\directlua{
        local mm = 25.4
        tex.pdfpagewidth=(#1/mm)*72.27*65536
        tex.pdfpageheight= (#2/mm)*72.27*65536
        tex.hsize=(#3/mm)*72.27*65536
        tex.vsize=(#4/mm)*72.27*65536
}}

\setpageparams{95}{200}{90}{180}
\def\para{Contrary to popular belief, Lorem Ipsum is not simply random text. It has roots in a piece of classical Latin literature from 45 BC, making it over 2000 years old. Richard McClintock, a Latin professor at Hampden-Sydney College in Virginia, looked up one of the more obscure Latin words, consectetur, from a Lorem Ipsum passage, and going through the cites of the word in classical literature, discovered the undoubtable source. Lorem Ipsum comes from sections 1.10.32 and 1.10.33 of "de Finibus Bonorum et Malorum" (The Extremes of Good and Evil) by Cicero, written in 45 BC. This book is a treatise on the theory of ethics, very popular during the Renaissance.\par}

\para\para\para\para\para

\bye

Centering \box255 within our PDF page area

The following code performs three main tasks:

  1. It creates an \output routine which places \box255 horizontally and vertically centred within the PDF page area.
  2. It sets up a Lua table (coords) which contains the corner coordinates of the centred \box255 relative to the PDF page origin, which I’m assuming is the standard lower-left corner of the PDF page.
  3. Using LuaTeX nodes (pdf_literal) it creates a shaded box that sits behind \box255 and is shipped out on every page using the \output routine.
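The arithmetic behind step 2 can be sketched on its own, away from the TeX macro. This is only an illustration under my own assumptions: the function names sp_to_bp and centred_corners are hypothetical, all inputs are in scaled points, and the results are plain numbers in PDF big points (1 in = 72 bp = 72.27 pt) rather than the formatted strings used in the listing.

```lua
-- Hypothetical helpers; the names are illustrative only.
-- Convert scaled points to PDF big points.
function sp_to_bp(sp)
  return (sp * 72) / (72.27 * 65536)
end

-- All arguments are in sp. The result holds the corner coordinates in
-- bp, measured from the lower-left corner of the PDF page, for a box of
-- size hsize x vsize centred on a page of size pagewidth x pageheight.
function centred_corners(pagewidth, pageheight, hsize, vsize)
  local left   = sp_to_bp(0.5 * (pagewidth  - hsize))
  local bottom = sp_to_bp(0.5 * (pageheight - vsize))
  local right  = left   + sp_to_bp(hsize)
  local top    = bottom + sp_to_bp(vsize)
  return {
    BL = { x = left,  y = bottom },
    BR = { x = right, y = bottom },
    TL = { x = left,  y = top },
    TR = { x = right, y = top },
  }
end
```

Inside LuaTeX the four arguments would come from tex.pdfpagewidth, tex.pdfpageheight, tex.hsize and tex.vsize, as in the listing.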
\pdfoutput=1
\hoffset-1in
\voffset-1in
\topskip=0pt
\maxdepth=0pt

\def\setpageparams#1#2#3#4{%
\directlua{

function sptobp(sp)
	return string.format("\%3.3f", (sp*72.0)/(72.27*65536.0))
end

% The function makebox() accepts the array of coordinates and uses them to
% construct the PDF code to draw a shaded box the same size as \box 255

function makebox(coords)

% print to terminal for debugging
print(coords["BL"].x, coords["BL"].y)
print(coords["BR"].x, coords["BR"].y)
print(coords["TL"].x, coords["TL"].y)
print(coords["TR"].x, coords["TR"].y)

data=string.format("q \%3.3f \%3.3f  m \%3.3f \%3.3f  l \%3.3f \%3.3f  l \%3.3f \%3.3f  l   \%3.3f \%3.3f  l 0.5 w 0.75 g f Q",
 coords["BL"].x, coords["BL"].y,
 coords["TL"].x, coords["TL"].y, 
 coords["TR"].x, coords["TR"].y, 
 coords["BR"].x, coords["BR"].y,
 coords["BL"].x, coords["BL"].y
  )

% A pdf_literal node to draw the box
% VERY IMPORTANT: the node's mode field is set to 2 so that
% the coordinates in the literal are interpreted relative to
% the lower-left corner of the PDF page rather than the
% current position.

pdf1 = node.new("whatsit","pdf_literal")  
pdf1.mode = 2
pdf1.data=data
return(pdf1)
end

local mm = 25.4
tex.pdfpagewidth=(#1/mm)*72.27*65536
tex.pdfpageheight= (#2/mm)*72.27*65536
tex.hsize=(#3/mm)*72.27*65536
tex.vsize=(#4/mm)*72.27*65536

% The use of TeX token registers in the following code is really not
% necessary I just added this as an example to show that you can
% use them in this way

tex.toks[500]=sptobp(0.5*(tex.pdfpagewidth - tex.hsize))
tex.toks[501]=sptobp(0.5*(tex.pdfpageheight - tex.vsize))

% The next part of the code creates a Lua table which stores the
% corner coordinates of \box255 relative to the PDF page
% origin, based on \box255 being horizontally and vertically centred
% within the PDF page area. 

coords={}

coords["BL"]={x=tex.toks[500], y=tex.toks[501]}
coords["BR"]={x=tex.toks[500]+sptobp(tex.hsize), y=tex.toks[501]} 
coords["TL"]={x=tex.toks[500], y=tex.toks[501]+sptobp(tex.vsize)}
coords["TR"]={x=tex.toks[500]+sptobp(tex.hsize), y=tex.toks[501]+sptobp(tex.vsize)} 

% Here we create a pdf_literal and store it in an hbox.
% The pdf_literal is created by the function makebox() and the
% result is packed into \box2001 which we ship out on every
% page via the \output routine.

tex.box[2001]= node.hpack(makebox(coords))

}}


\setpageparams{95}{200}{80}{175}

% OK, here is my first attempt at an output routine. This is responsible for
% horizontally and vertically centering \box255 within the PDF page area
% and shipping out our \box2001 to draw the shaded box on each page.
% In outline, it starts by creating a \vbox the same size as \pdfpageheight
% and sandwiches \box255 with \vfill glue top and bottom to centre
% it vertically. Inside that \vbox is an \hbox to \pdfpagewidth with \hfill
% glue to centre \box255 horizontally. \copy2001 is used to add our
% shaded box onto every page.

\output={\shipout\vbox to \pdfpageheight{\vfill\vbox to \vsize{%
\offinterlineskip\copy2001\vfill\hbox to \pdfpagewidth{\hfill\box255 \hfill}%
\vfill}\vfill}}

\def\para{Hello Contrary to popular belief, Lorem Ipsum is not simply random text. It has roots in a piece of classical Latin literature from 45 BC, making it over 2000 years old. Richard McClintock, a Latin professor at Hampden-Sydney College in Virginia, looked up one of the more obscure Latin words, consectetur, from a Lorem Ipsum passage, and going through the cites of the word in classical literature, discovered the undoubtable source. Lorem Ipsum comes from sections 1.10.32 and 1.10.33 of "de Finibus Bonorum et Malorum" (The Extremes of Good and Evil) by Cicero, written in 45 BC. This book is a treatise on the theory of ethics, very popular during the Renaissance. The first line of Lorem Ipsum, "Lorem ipsum dolor sit amet.."\par}

\para\para\para
\bye

Example results

Again I prefer to use Google Docs PDF viewer. Download here if you can’t see the resulting PDF. I hope this example is useful and, as always, I’d appreciate any notice of any technical errors.