Post-publication update: GNU gawk
Since publication of the article below, further investigation with a member of the TeX Live team has identified the exact cause of the problem: an outdated version of GNU’s gawk command-line tool, which is used during compilation. I had been using version 3.1.7 of gawk (supplied with the MSYS distribution I was using), but after updating it to version 4.0.2 the line-ending problem no longer arises. If you are using MSYS on Windows and want to compile TeX Live…, check the version of gawk installed on your machine. As I say, you live and (re)learn…
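For anyone wanting to run that check from an MSYS shell, here is a minimal sketch. The helper names (version_at_least, gawk_version) are my own, and using 4.0.2 as the threshold is an assumption: it is simply the version that worked for me, not a documented minimum. The comparison relies on the -V (version sort) option of GNU sort.

```shell
#!/bin/sh
# Hypothetical helper: succeeds if version $1 is at least version $2.
# Relies on GNU sort's -V (version sort) option.
version_at_least() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Hypothetical helper: print the version number found on the first
# line of `gawk --version`, which looks like "GNU Awk 4.0.2".
gawk_version() {
    gawk --version 2>/dev/null | head -n 1 | sed 's/^GNU Awk \([0-9.]*\).*/\1/'
}

if command -v gawk >/dev/null 2>&1; then
    v=$(gawk_version)
    # 4.0.2 is the version reported to work in this article (an assumption,
    # not an official minimum).
    if version_at_least "$v" 4.0.2; then
        echo "gawk $v: at or above the version that worked for me"
    else
        echo "gawk $v: older than 4.0.2, consider updating"
    fi
else
    echo "gawk not found on PATH"
fi
```

Run it from the same shell you use for the build, so that the gawk it finds is the one the configure scripts will actually pick up.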
Original article
Just a short note to share the solution to a problem I experienced when trying to compile TeX Live from the C/C++ source distribution… on Windows. I have a bit of relevant experience because I regularly compile LuaTeX from source and have built other TeX engines, including Knuthian TeX from raw WEB code and some versions of XeTeX.
So, with that experience, I decided to have a go at building TeX Live from the source file distribution: it’s useful to be able to build and use the latest versions of TeX-related software. Using SVN (via the TortoiseSVN client) I checked out the TeX Live source directory and tried to build it using MinGW64/MSYS64. I read through the notes in README.2building (supplied with the TeX Live source) and followed the example to build dvipdfm-x. Running the Build/configure scripts (using the --disable-all-pkgs option) worked fine but, sadly, compilation failed with a cascade of errors… so I wanted to find out why.
Unquestionably, TeX Live is a truly impressive work of considerable complexity and, of course, it should build OK on Windows, so I figured that the problem must be a relatively minor one to do with my setup. However, tracking it down initially felt like “looking for a needle in a haystack”, to quote a well-known English figure of speech. Well, after a couple of days I found the problem… line endings in some key text files! When I checked out the source via SVN, some key files (config.h.in and similar *.in files) had been saved with Windows line endings (CR+LF) rather than Linux line endings (LF only). Running the top-level TeX Live Build/configure scripts generates a config.status shell script for each component/sub-system that has to be compiled. As the config.status scripts execute, they create a number of temporary files which are processed and deleted on the fly. To stop these temporary files being deleted (to assist my bug hunt) I used a simple trick: adding the line alias rm='echo' at the start of one of the config.status shell scripts (which are generated by configure).
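Had I known what to look for, a quick scan of the checkout would have saved me a lot of time. The following is a minimal sketch of such a scan, not something taken from the TeX Live build system: the function names has_crlf and list_crlf_templates are hypothetical, and it assumes a grep that does not silently strip CR characters from its input.

```shell
#!/bin/sh
# Hypothetical helper: succeeds if the file named by $1 contains
# at least one line that ends with a carriage return (i.e. CR+LF).
has_crlf() {
    cr=$(printf '\r')
    grep -q "${cr}\$" "$1"
}

# Hypothetical helper: list every *.in file below directory $1
# that has CR+LF line endings.
list_crlf_templates() {
    find "$1" -type f -name '*.in' | while IFS= read -r f; do
        if has_crlf "$f"; then
            printf '%s\n' "$f"
        fi
    done
}
```

After loading these definitions, calling list_crlf_templates with the top of the checked-out source tree as its argument prints the template files that need attention. An empty result means no *.in file with Windows line endings was found.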
I discovered that the config.status scripts generate a temporary file called defines.awk, which is a script designed to be executed by the AWK program. The purpose of defines.awk is to process “template” configuration files (config.h.in and similar) to generate the various config.h files that contain important settings (#defines) detected during the configuration process (i.e., during the execution of configure). These config.h files vary for each program you are building and are essential for successful compilation. Well, it turned out that the defines.awk script was failing to parse the config.h.in files (and similar) correctly, simply because the Windows line endings were causing a vital regular expression in defines.awk to fail to match. As a result, each config.h file was just a copy of its config.h.in template: none of the text replacements had been made. Not surprisingly, these erroneous config.h files caused the spectacular compilation failure I experienced on my first attempt. Re-saving the config.h.in files (and some other *.in files) with Linux line endings seems to have solved the problems.
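To illustrate the failure mode, here is a deliberately simplified sketch. The AWK pattern is my own stand-in, not the regular expression from the real defines.awk, and HAVE_FOO is a made-up macro name. It assumes an awk that passes CR characters through untouched, which appears to be how my old gawk behaved. The strip_cr and convert_to_lf helpers are likewise hypothetical, and show one way of re-saving a file with LF-only line endings.

```shell
#!/bin/sh
# Simplified stand-in for the kind of substitution defines.awk performs:
# rewrite "#undef HAVE_FOO" as "#define HAVE_FOO 1" and copy every
# other line through unchanged. HAVE_FOO is a hypothetical macro.
apply_defines() {
    awk '/^#undef HAVE_FOO$/ { print "#define HAVE_FOO 1"; next } { print }'
}

# Strip one trailing CR from each line of standard input (CR+LF -> LF).
strip_cr() {
    sed "s/$(printf '\r')\$//"
}

# Rewrite the file named by $1 in place with LF-only line endings.
convert_to_lf() {
    tmp=$(mktemp) || return 1
    if strip_cr < "$1" > "$tmp"; then
        cat "$tmp" > "$1"
    fi
    rm -f "$tmp"
}

# LF input: the anchored pattern matches, so the line is rewritten.
printf '#undef HAVE_FOO\n' | apply_defines

# CR+LF input: with an awk that keeps the CR, the CR sits between the
# macro name and the end of the line, so "$" cannot match there and the
# template line is copied through as-is.
printf '#undef HAVE_FOO\r\n' | apply_defines

# CR+LF input with the CR stripped first: the pattern matches again.
printf '#undef HAVE_FOO\r\n' | strip_cr | apply_defines
```

The second case is the one that bit me: the output looks like the template because the substitution never fired. Converting the templates first (for example with convert_to_lf on each affected *.in file) is the command-line equivalent of re-saving them in an editor.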
And yes, so far all the TeX-related programs I have tried to build have compiled successfully. This is not the first time I have been “bitten” by problems caused by Linux/Windows line endings… so I guess you always live and (re)learn.