FriBidi and HarfBuzz: bidirectional text and text-shaping

It’s been more than a week since my last post, so you may be forgiven for thinking that the blog novelty has worn off, but not so 🙂 Over the last week or so I have been working on building the GNU FriBidi and HarfBuzz engines under Windows using Visual Studio rather than MSYS and MinGW. I have been rather short of time recently, but today I got them both to build. GNU FriBidi provides support for bidirectional text (e.g., Hebrew or Arabic mixed with English) and HarfBuzz provides an OpenType text-shaping engine for complex scripts. In theory, GNU FriBidi and HarfBuzz could be plugged into LuaTeX to provide typesetting solutions for languages such as Arabic, or for any other complex script that the HarfBuzz engine supports. So, the next step is to create a Lua binding for FriBidi and HarfBuzz and figure out the best way to communicate with the LuaTeX engine. Once I have some working code I’ll post some further notes based on what I find out. Stay tuned…

Compiling the FriBidi Unicode bidi algorithm on Windows

I’m exploring the Unicode Bidi Algorithm (UBA) and found the GNU FriBidi implementation of the UBA (in addition to Unicode’s own implementation). The Unicode implementation compiles quite easily with Visual Studio but GNU FriBidi requires MSYS and some code edits, as documented beautifully on kemovitra.blogspot.com. I know very little about Linux-based builds and usually have to resort to all sorts of edits to get some distros to build with Visual Studio but the excellent notes on kemovitra.blogspot.com worked perfectly, first time, so a huge thank you to the author of that blog post.

Enabling LuaTeX’s use of \pdfoutput

In the tutorial series A minimal LuaTeX setup on Windows I showed how to get started with a very basic LuaTeX setup for plain TeX. Actually, it is so minimal that we have not even enabled the many new primitives (low-level commands) that the LuaTeX engine provides. Without “switching on” these new commands, LuaTeX supports just the original TeX82 primitives plus the \directlua{...} command. It does not even understand the pdfTeX primitive \pdfoutput which enables choosing between PDF and DVI output via your TeX file, rather than LuaTeX’s command line options. For example, before enabling the full LuaTeX command set, if you run the following plain TeX file (say demo.tex):

\pdfoutput=1 %select PDF output
Hello \TeX
\bye

using the command luatex --fmt=plain demo.tex, you will see an error like this:

c:\luatexblog\myfiles>luatex --fmt=plain demo.tex
This is LuaTeX, Version beta-0.65.0-2010121314 (rev 4033)
(./demo.tex
! Undefined control sequence.
l.1 \pdfoutput
=1 %select PDF output
?

Hmmm, something is strange because LuaTeX is based on pdfTeX and that, of course, understands \pdfoutput=1.
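
If you would rather have a file that carries on instead of stopping with this error, one common TeX idiom is to test whether a command is defined before using it. Here is a minimal sketch of demo.tex written that way; the wording of the message is my own, and it assumes that \undefined really is undefined, as it is in plain TeX:

```tex
% demo.tex (defensive variant): only touch \pdfoutput if the format knows it.
\ifx\pdfoutput\undefined
  % \string stops TeX from trying to expand the undefined command here.
  \message{\string\pdfoutput\space is not available in this format}
\else
  \pdfoutput=1 %select PDF output
\fi
Hello \TeX
\bye
```

With the plain format this should take the first branch and print the message on the terminal rather than raising an error; whether you then get DVI or PDF depends on LuaTeX’s command line options.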

Creating a new format: luaplain

So, how do we get access to \pdfoutput? One way to do this is to create a new format, derived from plain TeX, but which contains a little bit of code to enable the full LuaTeX command set. We will call this format “luaplain” because it is no longer Knuth’s original plain TeX. The task is to create the luaplain.fmt file so that we can access the full power of LuaTeX using plain TeX (later we will extend this to LaTeX).

  1. Within the minimal TDS tree (see Part 6), create a directory called c:\luatexblog\texmf\tex\luaplain\
  2. Create a TeX file called luaplain.tex containing the following TeX code:
    \input plain
    \directlua {tex.enableprimitives('', tex.extraprimitives())}
  3. Start a DOS session and change the current directory to c:\luatexblog\texmf\web2c\ (which is where we store .fmt files, as we want to create luaplain.fmt).
  4. Type the command:
    c:\luatexblog\texmf\web2c>luatex --ini luaplain.tex \dump
  5. All being well, you should see the file luaplain.fmt created in c:\luatexblog\texmf\web2c\
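
As a variation on the steps above, the \dump can live inside luaplain.tex itself, so that the command in step 4 shrinks to luatex --ini luaplain.tex. This is only a sketch of that variant, not a replacement for the recipe above. The commented-out line is an assumption based on my reading of the LuaTeX manual, which describes tex.extraprimitives as accepting set names such as 'pdftex'; I have not checked it against every LuaTeX version.

```tex
% luaplain.tex (variant): build with   luatex --ini luaplain.tex
\input plain
% Enable all of the extra primitives with an empty prefix, so that
% \pdfoutput and friends keep their usual names.
\directlua{tex.enableprimitives('', tex.extraprimitives())}
% Assumed alternative: enable only the pdfTeX-derived primitives.
% \directlua{tex.enableprimitives('', tex.extraprimitives('pdftex'))}
\dump
```

The empty string is the prefix put in front of each primitive name. A non-empty prefix would give the new commands prefixed names instead, which is one way to avoid clashes with existing macros.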

Back to our example (demo.tex), which failed under plain TeX:

\pdfoutput=1 %select PDF output
Hello \TeX
\bye

Now run LuaTeX with the command line luatex --fmt=luaplain demo.tex and you should see something like this:

c:\luatexblog\myfiles>luatex --fmt=luaplain demo.tex
This is LuaTeX, Version beta-0.65.0-2010121314 (rev 4033)
(./demo.tex [1{c:/luatexblog/texmf/fonts/map/pdftex.map}] )<c:/luatexblog/texmf
/fonts/type1/public/cm/cmr10.PFB>
Output written on demo.pdf (1 page, 11487 bytes).
Transcript written on demo.log.

Success!

Detexify: such a clever tool for LaTeX

Today I came across a really clever use of HTML5 technologies being applied to a common problem in LaTeX: finding the package and code for special symbols. It is called Detexify – it relies on HTML5 features so you need a fairly recent browser. I am really excited by the potential offered through the rising technologies of HTML5, SVG and the canvas element (among many others) and I can’t help wondering what wonderful authorship tools and interfaces will be developed to work in the browser. I know that SVG has been around for a long time, indeed I first started working with it more than 6 years ago, so it is nice to see it becoming “potentially mainstream” real soon… Already, canvasdemos.com gives a flavour of the possible with the canvas element. I can see innumerable applications of these technologies, especially in the domains of creating, writing and sharing scientific content. It is easy to imagine complete authoring applications being written to work in the browser, and surely it cannot be that far away: writing in the medium, for the medium.