Type Design in Holland

A very well crafted overview of a year studying type and media in Holland :-)

Schooling Digital Natives

Pete sent me an interesting link about one of my favourite topics: “school vs internet”. Choice quote:

This is just the beginning. The crisis in education we’re seeing today will only get worse over the next several years as the true digital natives begin to come of age.

I dislike the “digital natives” term, but it seems to be sticking, heh.

Lettersoup Graduation Show Videos

Ricardo has posted two videos of his MA show’s Lettersoup demo (part one, part two) which hints at things to come :-)

(youtube-dl is a great tool for saving GooTube videos, because GooTube has a nasty habit of sending videos down the Memory Hole.. :-)

Lovely Spiro Video by Desino Libre

Desino Libre have published a LOVELY video of Inkscape’s Spiro used for a spot of lettering!

Gnash as Auto-Updating Firefox Extension

Gnash is now being distributed as a hefty 18Mb Firefox Extension, which wins sensible auto-updating :-)

Mercurial Support Inside FontForge

Max Rabkin writes to the fontforge-users mailing list:

“I’ve been using this version-control plugin for some time, and I thought some others might find it useful. It’s quite simple, allowing one to access some features of the Mercurial VCS from inside FF. Features:

The repository still needs to be set up before it can be used. I have tried to detect the cases where the current file is not version-controlled, but like the license (which is the same as Fontforge’s) says, there is no guarantee.

Obviously you need to have mercurial installed. Once you’ve got that, just drop this file into your ~/.FontForge/python directory. If you want to edit this, say to adapt it to other VCSes, you can get the repo by running:

 hg clone http://freehg.org/u/taejo/ffhg/

Notes on “Here Comes Everybody” by Conrad Taylor

Been a few years since I hit up Clay Shirky, and he’s better than ever.

Totally loved Gin, Television, and Social Surplus and my good friend Conrad Taylor just posted a great summary of Shirky’s recently published book to the private-archived KIDMM list. Reproduced here with permission, enjoy:

I’ve just finished reading the Clay Shirky book, “Here Comes Everybody”, which Bob Bater loaned to me. Shirky teaches New Media at New York University and writes on the social and economic effects of Internet technologies. Some people on this list will know him for his “Ontology is Overrated” paper at an O’Reilly conference in 2005.

The book is about trends in social organisation that are being made possible by the advent of new communication technologies. As the book jacket says, “In the same way the printing press amplified the individual mind and the telephone amplified two-way conversation, now a host of new tools, from instant messages and mobile phones to weblogs and wikis, amplify group communication. And because we are natively good at working in groups, this amplification of group effort will change more than business models: it will change society.”

I found the book to be entertaining and thought-provoking, but at the same time somewhat frustrating. It is meant to be based around a series of examples: the blog campaign that got the thief of a mobile phone arrested, the Voice Of The Faithful campaign that organised online against sexual abuse by priests, the Stay At Home Moms group on Meetup, the flash mobs that made the Belarus regime look stupid. What I find Shirky is less good at is drawing out the lessons and learning points in a way that makes sense cumulatively and is memorable. So in effect to consolidate my reading, I had to go through the book a second time and take notes. “Fortunately”, I’ve spent a fair bit of time in hospital waiting rooms and had time for this.

Maybe this will get you interested in reading the book; maybe it will be a substitute for reading it!

So, here are a few of the points I’ve been able to pin down…

Then there is a substantial focus on what is happening to content creation.

The editorial model was: filter, then publish. The new model is more like: Publish, then filter!

(This is the kind of Shirky point that irritates librarians and information scientists!)

Shirky swings by knowledge management and Communities of Practice briefly, citing John Seely Brown and Paul Duguid’s “The Social Life of Information” and Etienne Wenger’s “Communities of Practice”.

An interesting chapter is the one entitled “Personal motivation meets collaborative production”. His main point here is that new tools allow large groups to collaborate with minimal or no management, “taking advantage of nonfinancial motivations and… allowing for wildly differing levels of contribution.” Here, his main example is Wikipedia.

There is a powerful little subheading on page 172: “Replace planning with co-ordination.” His example here is Blitzkrieg. In the assault on France in 1940, the German Panzer IIIs and IVs had inferior guns and armour to the French Char B tanks; what they did have that the French hadn’t was radio. But more than that, the Germans re-thought tank warfare, based on appreciation of what radio was useful for. The French Army used the tank as a mobile gun platform accompanying infantry, following predefined troop movements. The German strategy gave a lot more autonomy to the field commanders, and the radio let them react to the situation and improvise co-ordinated attacks.

Shirky follows that introduction with examples of social action co-ordinated with social tools, e.g. the Facebook revolt of UK students against HSBC’s revoking of interest-free overdrafts, and Egyptian political activists using Twitter to alert each other to police harassment.

Something Bob would like is the section on “Small World Networks” (Watts and Strogatz), which points out that small networks tend to be densely connected, but each additional person greatly increases the number of possible connections, to the point where having everyone in contact with everyone else is unfeasible. Therefore, large networks will tend to be sparsely connected.

But there is a way round this. If there are people in the small, dense clusters who are also connected to other clusters, the effect is a large network that puts everybody just three or four degrees away from everybody else. Analysis of social networks reveals another kind of Power Law Distribution: there can be found a few people who, by connecting to a large number of clusters and networks, account for a “wildly disproportionate amount of the overall connectivity”. Malcolm Gladwell, in “The Tipping Point”, calls such people Connectors.
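As a back-of-the-envelope illustration of why dense connection fails at scale (my own sketch, not from the book): the number of possible pairwise connections in a group of n people is n(n-1)/2, which grows quadratically.

```python
# A sketch (mine, not Shirky's) of why every-to-every contact is
# unfeasible in large networks: possible pairwise connections grow
# quadratically with group size.

def possible_connections(n):
    """Number of distinct pairs among n people: n * (n - 1) / 2."""
    return n * (n - 1) // 2

for n in (5, 50, 500):
    print(n, possible_connections(n))
# 5 people -> 10 pairs, 50 -> 1225, 500 -> 124750
```

A group of 5 can plausibly all know each other; a group of 500 cannot, which is where the sparse-but-bridged structure comes in.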

In milieux like MySpace, Meetup or Facebook, where groups can form, this form of Small World networking is commonplace.

One term Shirky uses is “social capital”, derived from the work of sociologist Robert Putnam:

“When your neighbor walks your dog while you are ill, or the guy behind the counter trusts you to pay him next time, social capital is at work. It is the shadow of the future [a term derived from Robert Axelrod’s work on The Prisoner’s Dilemma] on a societal scale. Individuals in groups with more social capital (which is to say, more habits of cooperation) are better off on a large number of metrics, from health and happiness to earning potential…”

Shirky then draws a distinction, within Social Capital, between Bonding Capital and Bridging Capital:

“Bonding capital is an increase in the depth of connections and trust within a relatively homogeneous group; bridging capital is an increase in connections among relatively heterogeneous groups.”

Then there’s a chapter called “Failure for free” where the main set of cases is drawn from the world of Open Source software development. “Open Source doesn’t reduce the likelihood of failure, it reduces the cost of failure; it essentially gets failure for free.” Cheap failure encourages experimentation.

The chapter where I finally thought it was all beginning to make some kind of coherent sense is the last one, called “Promise, Tool, Bargain.”

The closing section explores the social dilemmas that can arise within online communities.

Hope that’s useful as a summary… it is certainly not complete.

I certainly think that one thing we might aspire to for Know.Ware, but further down the line, is a space for posting mini-reviews of books and other resources, and commenting on them!

Conrad (This article copyright to Conrad Taylor)

Notes from TUG2008 in Cork: Day 4


These are rough notes from a public event, and any errors or stupidity should be attributed to me and my poor note taking; I hope these notes are useful despite their obvious flaws, and everything should be double checked :)

LuaTeX Images

Hartmut Henkel

[09:50 damn overslept again]

Q: Type3 fonts can’t work well in PDF, and this can get PDF submissions of theses rejected…


Hans Hagen

Why are there so many variants of one program? Well, there are lots of people, and there is no need for everyone to be the same.

We have heard about LuaTeX the program; what are the consequences for macro packages? Where does TeX end and Lua start?

We see no reason to really change TeX, it works and people are happy with it. We don’t provide solutions, we provide the way to provide solutions. There are no common views on how pending issues can be solved.

We open up the internals of TeX for the macro programmers. ConTeXt mkIV shows this off. ConTeXt started off as calling Lua and piping its output to TeX - a caller primitive and a print function.

Then we opened up registers, to access counters and box dimensions. Then we replaced file handling, normally done by the kpse library; reading zip files, HTTP or FTP, and so on became possible. TeX itself is unaware of all that, and we moved multipass data to Lua.
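The kpse-replacement idea can be sketched outside TeX too; here is a hypothetical Python analogy (not ConTeXt’s actual Lua code) of a resolver whose callers stay unaware of whether a file came from disk, a zip archive, or the network:

```python
# A hypothetical Python analogy (not ConTeXt's Lua code) of replacing
# kpse-style file handling: one resolver that can read a local file,
# a member inside a zip archive, or a remote URL, while the caller
# just asks for a name and gets bytes back.

import zipfile
import urllib.request

def resolve(name):
    """Return the contents of `name`, whatever scheme it uses."""
    if name.startswith(("http://", "ftp://")):
        with urllib.request.urlopen(name) as response:
            return response.read()
    if ".zip/" in name:                      # e.g. "texmf.zip/fonts/foo.tfm"
        archive, _, member = name.partition(".zip/")
        with zipfile.ZipFile(archive + ".zip") as zf:
            return zf.read(member)
    with open(name, "rb") as f:              # plain local file
        return f.read()
```

The point of the design is exactly what the talk describes: the engine (here, the caller) never learns where the bytes came from.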

Then input encodings were done in Lua, even supporting UTF-16, and this was the first time the TeX part became smaller and the Lua part larger.

More complex Lua scripting was added for MetaPost conversion: pipe data to Lua and print back PDF literals.

Then we got access to node lists, so we could manipulate glyph related data, for robust case swapping and such.

Then we used Lua for font loading, including preparing for OpenType processing. Reading from AFM replaces TFM, for great Type1 support, and ‘font encoding’ has been removed from ConTeXt mkIV now and only maths still needs TFM files - but that will go soon too.

For the Oriental TeX project, we started writing support for advanced OpenType features; everything is Lua controlled, and Lua is so fast that major node-crunching is okay. Zapfino can take millions of node operations with the complex layout of that font. This is great for a developer, as you can afford to implement things that waste time. Runtime virtual font building can be used to construct missing glyphs, and features can be supported dynamically. This kind of stuff is really new in TeX.

The generic attribute mechanism was redone. This was used to specify fonts, but when redone can be used for other things - no big speedup, but more robust and flexible.

XSL-FO has a good vertical spacing model.

We really need the TeX-Gyre math fonts!

The stream-driven MkII XML handling was replaced by a tree-based mechanism for arbitrary access/flushing/manipulation of XML nodes, so you can say ‘give me all of these tags with that attribute value’ - MathML support was done as a use case example. It’s very fast.

There is a new img library that is powerful.

Sectioning, numbering and lists are done in Lua, apart from the typesetting part; more data is carried around and retained, and it’s hard to keep it compatible, but possible.

And there are some minor parts of the typesetting operations being done by Lua, but mostly that is left to TeX that does it well.

Authors can just use Lua, forget about the TeX internals, and simply pass data to and return data from TeX.

Macro authors can get information from TeX, use it for their own calculations, and add the results back to TeX; or even manipulate that data with Lua and replace the original TeX dataset. You can replace components of TeX (like file handling) with Lua code to be more flexible and modern, make use of OpenType features, and even replace typesetting parts like hyphenation, kerning and paragraph building - and add new features not seen before :)

Macro writing will become more powerful and convenient.

LaTeX Announcement

Frank Mittelbach

The LaTeX sources have been in SVN for a while; yesterday they went public, so development now happens live online - if there are bug fixes, people can get them quickly and see how things develop. http://www.latex-project.org/svnroot/ or something; there will be a post on the website.

And for the last 2 years we’ve been saying we’ll move from TeX to another engine; we are going to e-TeX officially now.

Don’s Punk Font

Hans Hagen


How did it start? We had a mechanism for making virtual fonts, and long ago I ran into a font by DEK, “punk,” which was a METAFONT bitmap font. Nowadays no one uses bitmap fonts, and pdfTeX can pretend to make bitmap fonts using glyph containers. When MPLib started, I tried to give that a go. Taco converted the METAFONT file into something more MetaPost-like, and I wrote a virtual font for that.

How does it work? The MetaPost file is processed using the MFPLAIN format, and if you look carefully you can see there are differences. There are 10 variants in fact, for various sizes. These pictures are converted to a PDF stream and stored in the mkIV font cache. It looks like a simple font, but it’s quite complex actually. At run-time a font is assembled from these pictures, and we can add missing glyphs, composed ones like diacritics. We use an attribute signal to say some text has to be punked, and one of the node parsers picks up this signal and randomly chooses a font. The shapes end up in the stream as in-line PDF code as a result of the virtual font. But this means it’s not searchable.

Virtual fonts have great potential. Not a real reason to start using LuaTeX, but this shows the potential. The Type3 options of LuaTeX are rather minimal (not there yet), and we need proper PDF text stream support to make it work.

The font needs some subtle fine-tuning, and more character support; symbols and math support for example ;) and perhaps we could ask Hermann Zapf to do Punk Nova.

And the MetaPost library needs to be made suitable for making fonts, kerning and such. LuaTeX needs to be extended for proper Type 3 handling.

In the future Taco will provide MetaPost features for outputting char strings, so you can have run time glyph generation in LuaTeX.

Eventually, we can apply this mechanism to sophisticated ‘hand writing’ like fonts.

The font itself is punkfont.mf and is available now

Here is a quick walk through of the code.

LuaTeX’s impact on the TeX world: a Lua file and a TeX file that together do what was never before possible. This is an example of that.

Q: how to search?

A: you can use “invisible” text to duplicate the content, which makes it searchable.

Languages for Bibliography Styles

Jean-Michel Hufflen


Many programming languages in computer science are specialised, and they relate to software qualities. “Object-Oriented Software Construction” (1997) by Bertrand Meyer is a book about the quality of software. There are 2 kinds of qualities: external ones (seen by users) like correctness in performing its tasks, robustness in abnormal conditions, extendability for some changes, and reusability in other programs. Efficiency, portability, and usability too.

Then internal quality is seen only by developers: readability, modularity. Modularity is decomposability and composability - can parts be combined with each other? Can you understand a part of the program without knowing the rest? And continuity - can a small change in the specification result in a small change in the program?

The task of a bibliography processor is to walk a bibliographic database for ‘keys’, sort them into an order, and arrange each reference in a bibliographic style - which may be language dependent.

From a user perspective, you want to design and implement your own bibliographic styles, and perhaps add new features to the processor to do this.

A bibliographic processor ought to work with LaTeX and ConTeXt and XeTeX and LuaTeX; and output HTML and RTF also.

BibTeX is the LaTeX bibliographic processor, using the BST language for bibliographic style. It uses a stack model and a postfix notation.
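For readers who haven’t seen BST, its stack-and-postfix flavour can be sketched with a toy evaluator. This is a hypothetical illustration, not BibTeX’s actual interpreter, though `*` (concatenate) and `add.period$` are real BST function names:

```python
# Toy postfix evaluator in the spirit of BST's stack model (a sketch,
# not BibTeX's interpreter). Each function pops its arguments off the
# stack and pushes its result back, as BST functions do.

def run(program):
    stack = []
    for token in program:
        if token == "*":                  # concatenate top two strings
            b, a = stack.pop(), stack.pop()
            stack.append(a + b)
        elif token == "add.period$":      # add a period if none present
            a = stack.pop()
            stack.append(a if a.endswith(".") else a + ".")
        else:                             # anything else is a literal
            stack.append(token)
    return stack

print(run(["Here Comes ", "Everybody", "*", "add.period$"]))
# ['Here Comes Everybody.']
```

Everything in a .bst style - even control flow - is expressed in this popping-and-pushing style, which is exactly why Hufflen calls its modularity poor.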

It has good qualities: correctness - I never crashed it! - and robustness - a big file with many syntax errors is no problem - but its extendability is limited, and its reusability is too. Extending BibTeX is tedious for programming-like things. Its modularity is poor; some functions can be changed, but that needs stack programming, popping and pushing things on the stack. Continuity is average; changes are easy, especially for layout, but other kinds of changes are tedious.

Can BibTeX be used with word processors other than LaTeX? Theoretically yes, but practically it is hard, as many users put LaTeX commands inside field values. And some notations are hard-wired, like “~” for the non-breaking spaces that are important in authors’ names.

BibTeX’s output is marked up with commands and these interface with citation and formatting functions in TeX; natbib, jurabib, biblatex, etc. BibTeX is used to search databases and sort references in that case.

Another approach is to use XML-like formats; .bib files can be converted to an XML format that is then processed by XSLT. bib2xml is such a tool. Most accent commands are expanded to the Unicode characters.

mlBibTeX uses XSLT for bibliography styles, and introduces inheritance for natural languages’ specification; it uses Scheme to do some programmatic things. It has a compatibility mode for .BST styles, and it’s usable if you know Scheme, although it needs better integration.

DSSSL is an old style-specification language that uses s-expressions, and was used with SGML. Its “Core Expression Language” is a subset of Scheme, and it has the full power of a programming language.

Another choice is Perl as a bibliography processor. Bibulus is such a tool. Compact, modular, extensible, efficient and good for HTML especially. But multilingual features are limited; the homepage says “only for developers.”

Tib is old, for Plain TeX, but easy to use.

BibTeX++ compiles BST into Java classes, but this has similar limitations to BibTeX.

Common Lisp too - cl-bibtex allows the use of Common Lisp functions, and that’s great if you are a Common Lisp hacker ;)

TeX is a great typesetting engine, but not a general purpose programming language, and so delegating that part is good - LuaTeX does this very well.

I think the XML approach of mlBibTeX is promising and adds something these alternatives don’t have. I think the Lua/TeX balance is similar to mlBibTeX’s Scheme/XSLT balance.

Observations of a TeXnician for hire

Boris Veytsman


A story in 6 lessons.

For the old timers, perhaps my comments are debatable, but here are some personal observations; take what you can.

  1. “The usefulness of audacity”

My work as a consultant started years ago; I looked at the rates in TUGboat, and it’s $35 a year to be listed as a consultant there. So for $35 I thought I’d be a consultant. I had read some books and used LaTeX, but thought I’d give it a try. Nothing happened. So I wrote to Karl Berry and asked if there was anything I could do to help. He said, why don’t I help translate documentation into Russian. I said okay, and started this large and unpaid work. After that work, I got a call from a scientific society that accepts submissions for publication in TeX, and they needed someone to teach the support team TeX. I had a day job as an engineer, and a night job as a professor, but I love teaching, and so I said okay. I asked how they heard about me, and they said Karl Berry recommended me.

  2. “Volunteering helps to get paid”

With word of mouth, I got more customers, and eventually I earned a lot of money, not as much as Karl ;) but some.

Then I had to pay taxes. But I can claim business expenses against those taxes, and in 2006 there was a PracTeX conference a few hours’ drive away. Quite expensive to get there, but I met a lot of nice people whom I’d known only via email and whom I respect a lot. And I got some work from the people I met there.

  3. “Going to conferences helps to get paid”

We have a niche market: most of my work has been with publishers, and it’s boring work, but they give you specs and you write a style file - simple, clean work. Clients expect you to be an expert in TeX, the friends of TeX, and the friends of friends of TeX. You can’t tell them “I don’t know about that”. This is the lesson:

  4. “The way of the Omnivore”

But also,

  5. “You cannot do everything”

Typography is both an art and a science; you work with compositors, opinionated people - they are the boss, and you can have your opinion but should only give it when asked. A few times people have commended me on my deep knowledge of typography. My secret is that, whatever they say, I will do.

  6. “Why this is fun”

Firstly, I was always learning. I’ve learned about Hebrew typography, about PostScript and PDF. It was fun to learn; I’d like to learn it even if I wasn’t being paid. TeX is totally different to my other work in engineering and teaching, so it’s a kind of distraction for me.

If you do engineering, most companies have their own opinions about secrecy, doing things like patents even on software ideas, and it’s bad for your karma. But if your client is a publisher, they want things to be published and available as free software - it’s good for my well-being and for enjoying what I am doing.

Some practical advice:

I do work and teaching, but consulting is different work. The most important part is:

  1. Communication

You must listen to what they want, and make them happy. Publishers have editors, authors, compositors, and you have to make them all happy. The main part of the work is communicating with everyone, and to do that, you must understand their emotions. The typical emotion of a TeX user?


TeX is a big intimidating system, they are afraid it will break in the middle of their production and they will have no way out. So the next important thing is:

  2. Support

You can’t say, “here is my system, its $50 a question” - they will be afraid and you don’t want that.

The third thing? Suppose a user comes to you with a problem, and you expected it: on page 53 of the manual you explained the problem and the solution. They already paid for the system and the manual.

I see this as my problem, as an author. I didn’t explain it well, or make it fun to read, so they didn’t read it well. So I will always apologise, say I will rewrite the manual, and explain the solution right away too. So the third most important thing in TeX consulting:

  3. Documentation

Talking, explaining, explaining again - that’s what you are doing. So when writing the documentation, remember that your support is free: good documentation will save you money by requiring less support.

In conclusion:

Do what you enjoy doing and be generous. At some point somebody
will pay you!

Don’t chase money for every single thing you do.

If you are a consultant, I hope this will be useful for you; if you are not, you have my name… ;)

If you ask me a question and I can’t find the answer in 15 minutes, it’s free. If it’s such an interesting question that I want to know the answer, it’s also free.

Q: Have you worked for the government?

A: Maryland has.

Q: If you work for the government, it can take a long time for them to pay you ;)

A: Perhaps - I found them to be one of my best customers in terms of prompt payment :-)


What are the upcoming TeX conferences?

EuroTeX 2009, in Delft, August 31 - Sept 4. This will also be the 3rd ConTeXt meeting.

TUG2009 will be at the University of Notre Dame in South Bend, Indiana (100 miles east of Chicago, on an Amtrak route, on the major interstate highway 80, and on the Internet too - during the conference). Hope to see you there. It’s July 28th-31st.

Notes from TUG2008 in Cork: Day 3


These are rough notes from a public event, and any errors or stupidity should be attributed to me and my poor note taking; I hope these notes are useful despite their obvious flaws, and everything should be double checked :)


Jonathan Fine

11:10 [Overslept oops]

We have a GSOC project that has implemented a prototype for auto-completion of TeX math symbols, eg $\alpha\beta$ and this can be integrated with the other MathTran stuff.

I want this to be shared with TeXWorks, so users can start dabbling with the web service and move to a full authoring environment without a bump.

The future is an online TeX tutorial environment, important for growing our community.

Q: i18n, e.g. French?

A: I’d like to do something that can handle i18n, sure!

Publishing Mathematics on the Web and PDF

Ross Moore


Copy and paste from a PDF in Adobe Acrobat on Mac OS X to plain text can work in a minimally acceptable way, where the fine stuff like subscripts is lost - or you can get a load of strange symbols. Every letter is mapped into the private use area, so unless you use the same font with the same private area codes, you can’t understand it.

What do people think of LaTeX? The author of that paper learnt LaTeX 20 years ago, and thinks what he does is what everyone does, but it’s not as generic as he thought.

Here’s another example, with some math symbols like \infty, and these get converted to a number of other characters in the copy & paste operation.

Now let’s look at copy and paste out of the Cocoa PDF viewer in TeXShop. It’s better, but still has a load of problems. No idea what is going wrong.

What can be done? We can beg the proprietary companies to fix things.

My Q: Why care about proprietary software? They don’t care. Free software offers a real chance of getting things fixed.

A: Sure. Xpdf used to crash doing these copy & paste operations, and now it supports things properly, thanks to my work here.

Q: They aren’t interested in fine maths, but they are interested in supporting standards. HTML5 is thinking about supporting maths better, and part of that is telling browser people how to do things right. I suggest you get involved in such standards efforts.

A: Sure

The ‘galley’ module

Morten Hogholm


What is a galley? A rectangle that fills with text and other material from the top. Most galleys are unrestricted vertically, though some are restricted: a vertical list, a vbox, a minipage. Galleys are separated by vertical things like penalties, spaces, specials, and writes.

There are wars on 2 fronts: inter-paragraph material and paragraph shapes.


\end{itemize} \vspace{3pt} \begin{itemize}

gives 13pt of space, not 3pt. There are other examples.

LaTeX inserts all inter-paragraph material when it receives it, and TeX has limited tools for dealing with this.

The ‘galley’ module is intrusive: it NOOPs TeX’s spacing and does its own. It’s in expl3, and works okay. But it needs work:

It was written before e-TeX was the default engine that it is now, which provides arrays instead of single values. It needs a data structure for paragraph shapes. It cuts through much of the LaTeX internals, so they need to be rewritten - and perhaps Taco can add the necessary data structures to LuaTeX so we don’t have to do it in macros.

Q: Is your thesis available?

A: Sure

Q: In a good model of documents-as-lists, the invisibles ought to be attached to visibles. This is true of horizontal lists as well as verticals.

A: Yes. Galley does this now.

Q: LaTeX2e took things as far as they could go; if we want to go further, we have to scrap LaTeX and start again.


pdftex, xetex, miktex - TeXLive2008

Its enabled by putting \syntex=1 in your preamble.

History: TeX in 78, pdfTeX in 98, and Visual TeX in 99, a proprietary TeX system that claims the ability to sync the TeX input and DVI output; Textures, another proprietary system, gained the same ability in 00.

srcltex.sty (<98) is free and tries to do this too. vpe.sty (00) also syncs PDF with TeX source. And I wrote pdfsync (03) to sync TeX source and PDF; iTeXMac2 is a front end I wrote for Mac OS X that contains a PDF viewer. SyncTeX comes from this work.

In pdfsync, all input has a file name and line number; output has a page number and a location on the page. These two things are combined and given a unique tag. These are stored in special nodes at every math switch and every paragraph.

pdfsync isn’t compatible with some packages; the preview-latex package, for example, could be fixed, but others could not be. And the mapping of input to output is not 1:1.
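The record structure being described can be sketched like this (a hypothetical data layout for illustration; the real pdfsync/SyncTeX file formats differ):

```python
# Hypothetical sketch of the sync-record idea (the real pdfsync and
# SyncTeX formats differ): each record ties (source file, line) to
# (page, x, y) via a tag, and lookups go in either direction.

sync_points = [
    # (tag, source file, line, page, x, y)
    (1, "thesis.tex", 12, 1,  72.0, 700.0),
    (2, "thesis.tex", 12, 1, 210.5, 700.0),  # same line, two spots: not 1:1
    (3, "intro.tex",   3, 2,  72.0, 680.0),
]

def forward(source, line):
    """Source -> output: where does this input line land on the page?"""
    return [(p, x, y) for _, s, l, p, x, y in sync_points
            if s == source and l == line]

def inverse(page, x, y):
    """Output -> source: which input line is nearest this click?"""
    candidates = [(abs(px - x) + abs(py - y), s, l)
                  for _, s, l, p, px, py in sync_points if p == page]
    return min(candidates)[1:]

print(inverse(1, 75.0, 698.0))
# ('thesis.tex', 12)
```

Forward search is what editors use to jump to the PDF; inverse search is what a click in the PDF viewer uses to jump back to the source.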

SyncTeX addresses these problems. Here is a lightweight PDF viewer, “SyncTeX Viewer”, that I wrote for Mac OS X.

I use the kern and glue parts of the output to tag the sync meta-data too, as this happens at the right time in the typesetting process.

This allows syncing at a word or even character level between output and input - demonstrated with iTeXMac2. Ligatures are a bit tricky for character selection but it works okay.

SyncTeX is in TeXLive, since Hàn Thế Thành added SyncTeX to pdfTeX. Jonathan Kew added it to XeTeX, and finally we arrived at the TeXLive implementation.

This is a segmented implementation; each part is targeted at one task. First is memory management; collecting all the input meta-data needs memory and has to be efficient. And there are variations depending on the TeX engine used.

And it’s an orthogonal implementation; there is ZERO SyncTeX code in any engine source. SyncTeX must not change the typesetting process, so it’s possible to build any engine with or without SyncTeX; features that clash with SyncTeX can be developed, and then SyncTeX patched to work with them once they are stable.

This is done with autotools Make files, and I learned autotools for that. ;)

To make use of SyncTeX you need a PDF viewer that supports it; the SyncTeX parser library is used in TeXWorks for GNU/Linux, Windows and Mac OS X, Sumatra-PDF on Windows, and iTeXMac2 on Mac OS X.

You can also parse the SyncTeX output directly, and AucTeX and WinEdt do parse the SyncTeX output in this way.

There is a SyncTeX command line tool that acts as an intermediate controller between text editor and PDF viewer.

Xpdf will hopefully support SyncTeX in its next version.

Benefits of SyncTeX?

It’s precise - no bad line breaks as in the past - with no package incompatibilities; it’s the same for DVI, xdv and PDF, the same for Plain, LaTeX and ConTeXt, and easier for developers of tools to use.

What is next?

For input we have file name and line number, but what about column number? That is tricky; TeX doesn’t know about it, so the TeX engine sources would need to be adapted. A big job.

Also, the SyncTeX output data, with all its information about the hboxes and vboxes, could be used to embed output into HTML that is vertically aligned, perhaps.

Thanks to the pdftex, xetex, texlive and itexmac2 developers!

[I wonder why they looked at Xpdf not Poppler?]

Notes from TUG2008 in Cork: Day 2


These are rough notes from a public event, and any errors or stupidity should be attributed to me and my poor note taking; I hope these notes are useful despite their obvious flaws, and everything should be double checked :)

Unicode and TeX

Arthur Reutenauer

9:05 [5 mins late]

XeTeX does on the fly translation from UTF8 to TeX’s “legacy” encodings.

RFC 4646 is a language naming scheme standard that covers everything; the ISO 2- or 3-character codes don’t cover enough language variants. E.g., “uk” could be read as British English or Ukrainian.
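The ambiguity is easy to see: in ISO 639-1 “uk” means Ukrainian, while the United Kingdom’s region subtag is “GB”, so RFC 4646 tags spell the intent out with explicit subtags. A minimal sketch (it naively treats the second subtag as a region; real tags may also carry script subtags like “zh-Hant”):

```python
# Minimal sketch of splitting an RFC 4646 language tag into subtags:
# "uk" alone is Ukrainian (ISO 639-1), while British English is
# written "en-GB" -- the region is an explicit second subtag.
# (Naive: real tags may have script subtags, e.g. "zh-Hant".)

def describe(tag):
    """Return (language, region) from the first two subtags of a tag."""
    parts = tag.split("-")
    language = parts[0].lower()
    region = parts[1].upper() if len(parts) > 1 else None
    return language, region

print(describe("en-GB"))  # ('en', 'GB') -- English as used in the UK
print(describe("uk"))     # ('uk', None) -- Ukrainian
```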

xindy: UTF8 indexes

Joachim Schrod


If you create an index, that usually means page numbers. But not always: music pieces have names, Bibles have named sections that matter. Ranges over structured location references - xindy allows for this. We have a declarative style language both for declaring these locations and for defining the output style. We have pre-made modules for common tasks.

Perhaps the most important contribution of xindy is its theoretical model for index creation. Something that LuaTeX could take on?

We have a set of predefined languages - even Klingon ;) although that’s not in Unicode! ;p - but this isn’t a very wide selection; it’s Eurocentric (because it’s a community effort).

We have markup normalisation for the index; we made a TeX introductory book that has “\MF” and so on instead of “\index{METAFONT@\MF}”
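If I remember the xindy style language right, that kind of normalisation is done with a merge-rule in the style file; a hedged sketch from memory, so double check the exact syntax against the xindy documentation:

```
(merge-rule "\\MF" "METAFONT" :string)
```

This maps the raw markup “\MF” seen in the index entries onto the sort key “METAFONT”, so authors can write the short form in the source.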

Do we need a ‘Cork’ math font encoding?

Ulrik Vieth


Returning to Cork this year, I thought back to the last time, when the Cork encoding was developed. It provided a model for further 8-bit font encodings, supported many European languages, and started further developments. Its complete 7-bit ASCII support was good … but there were shortcomings: it didn’t follow any other standards like ISO Latin 1 or 2; input and output encodings were different (solved in 93/94 by LaTeX2e with inputenc and fontenc); it created a lot of local encoding forks (solved by the TeX Gyre fonts); and it left out text symbols and the glyphs commonly available in PostScript fonts. So there was a big mess of font encodings.
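For reference, the 93/94 fix looks like this in a LaTeX2e preamble: inputenc declares what encoding the source file is typed in, and fontenc selects the output (Cork/T1) font encoding.

```latex
% The classic LaTeX2e encoding pair:
\usepackage[latin1]{inputenc} % what the source file is typed in
\usepackage[T1]{fontenc}      % the Cork (T1) output encoding
```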

This is only resolved by moving to Unicode and OpenType fonts. The TeX Gyre project provides a consistent implementation of many encodings, with a root in Unicode/OpenType.

Today, TeX is transitioning again - from DVI/PS to PDF, scalable fonts have replaced bitmap PK fonts, Unicode and OpenType are replacing 8 bit encoded fonts thanks to the new engines that are widely available.

The 7-bit text and math fonts were developed at the same time; DEK needed them to typeset TAOCP. 8-bit text fonts were developed by European users for their own needs, but math fonts weren’t. There were reasons for doing them though, hence the ‘Aston’ project in 1993 and then the ‘newmath’ prototype in 1997/98.

OpenType math in MS Office 2007: while we were waiting for the STIX fonts, MS added a MATH table to OpenType, and the Cambria Math font is a reference implementation.

There is acceptance of OpenType math: many concepts and ideas from TeX were adopted by Microsoft; it’s officially still experimental but already a de facto standard; FontForge and XeTeX already support it, and LuaTeX is likely to follow. It’s likely that OpenType math support will be adopted in new TeX engines and new TeX fonts. And Unicode sorts out the issue of ‘math font encodings’ - the issue is now developing OpenType math fonts.

The OpenType font format: developed by Adobe and Microsoft, it’s a vendor-controlled specification and isn’t really open. It combines concepts from Type1 and TrueType fonts: the table structure of TrueType, Unicode encoding, and advanced typographic features like glyph positioning (GPOS) and glyph ….

The OpenType MATH table holds font-specific global parameters - some have direct relations to TeX parameters, others are simplifications, and a few TeX parameters have no clear correspondence, though TeX engines can use workarounds for that - plus glyph-specific metric information.
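By way of illustration, XeTeX’s OpenType MATH support is reachable from LaTeX via the unicode-math package; a hedged sketch, assuming the Cambria Math font is installed:

```latex
% Hedged sketch: OpenType math under XeLaTeX.
\documentclass{article}
\usepackage{unicode-math}
\setmathfont{Cambria Math} % reads the font's MATH table
\begin{document}
\[ \int_0^\infty e^{-x^2}\,dx = \frac{\sqrt{\pi}}{2} \]
\end{document}
```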

Optical sizing is important for super/sub scripts, and METAFONTs typically have 5/7/10pt adjusted for readability.

Challenges presented by OpenType math fonts: the scope of the project - a huge set of geometric symbols and alphabetical font shapes to be designed. Organisational issues: the font extends across multiple Unicode planes (> 16 bits), and there are size variants and optical sizes to be packaged in un-encoded slots. Technical issues: matching fontdimens and other TeX parameters to the MATH table, mapping TFMs to glyph-specific metrics, and font substitutions too.


Q: 10-20 person years put into the OpenType MATH stuff, including Cambria implementation. They don’t claim their MATH table is generic; its specific to Cambria, and its an ongoing and infinite task…

A: Sure

Q: You left out something in the summary: interface issues. It’s useful to have Unicode math, and STIX fonts. But what about higher level interfaces?

A: Sure

Three Typefaces for Mathematics

Dan Rhatigan


This is not about technology, it’s about design issues.

I’ve been typesetting for a long time; using a lot of core configurations for dealing with math; as I get more into type design, I knew I had problems with type as a compositor/designer. So I was casting about for things to look at for this, and I found 3 case studies that bring up different issues.

  1. Times 4-line Mathematics Series 569, Monotype Corporation 1957 (based on Times New Roman, Stanley Morison and Victor Lardent, et al, 1931)

  2. AMS Euler, Zapf DEK et al, 1985

  3. Cambria Math, Jelle Bosma, Ross Mills, 2004

Tricky things about maths? Legibility in paragraphs is different to legibility in equations; equations combine multiple styles, scripts and symbols; and the positioning and spacing is a kind of script of its own, moving vertically and horizontally and even back and forth.

Legibility, of letters, and readability of paragraphs.


Here’s a hand-set equation using Modern Series 7, and here’s a machine-set equation using Times Series 569. You can see the x-height was normalised, among other changes, but the big change was the italic’s slant, reduced to 4° to be more upright; Times had a 16° slant, which is quite a lot.

Here’s photos of the pattern drawings, with shapes highlighted, and overlapped, and you can really see the difference.

So Knuth also made a font of Modern Series 7, Computer Modern, and then had an idea for a new kind of approach: a CONTRAST of style, rather than a seamless blend. Zapf did the drawings that the typeface was based on, but there was a rich correspondence between Zapf and DEK also, and the design pushed the boundaries of the technology it was meant for. “An upright italic with a casual twist” that reflected the tone of handwriting a mathematician would use, eliminating the problem of how to fit all the pieces together with a slanted shape. It has the characteristics of an italic shape, though. The calligraphic forms also help. A notion Zapf got behind was not capturing a sense of fine formal broad-nib calligraphy, but the rough quick pen work of someone jotting down an equation. They started with book typography but moved away from it in the process.

Then there were the problems of making the subtleties of Zapf’s drawings come across in the digitisation, done with METAFONT by a team at Stanford. Here’s photos of the final drawings that Zapf submitted. There were subtle modulations.

The team decided to draw the OUTLINES with METAFONT instead of a stroke/nib skeleton/flesh model.

Cambria, the default Math font in Office 2007+ until more math fonts are developed. A focus on ClearType rendering; curves that move quickly from horizontal to vertical, avoiding large diagonal gestures wherever possible - so things render sharp and crisp on screen with ClearType.

Minion Math

I wanted a math font that improves over existing math fonts: something that is very consistent (Computer Modern uses some AMS Math glyphs…) and comprehensive and versatile (not just one width, one optical size, one weight)

Why start with Minion? I like it, and it already has Greek letters and optical sizes. It’s a 1990 Adobe font that had Multiple Master versions and later gained Greek glyphs.

Weights: Regular-Medium-Semi bold-Bold

Optical Sizes: Display-Subhead-Regular-Caption-Tiny

In the final release, they will offer full Unicode math support, full math alphabets, and a real Math italic. I plan to fill the Unicode block for mathematical characters totally.

Consistent look, consistent metrics.

Q: legal status?

A: yes I have a legal agreement, I’m licensed to use their trademark and to publish my font.

Cuneiform with METAFONT

The starting point for cuneiform is its basic elements, the wedges. I didn’t scan images of clay tablets; I constructed the shapes in 3 variants: Classic, Filled and Academic.

I used MetaType1 to produce Type 1 fonts, then FontForge to generate OpenType, and I also use t1utils and others for the final result.

The MetaType1 package was developed for the TeX Gyre project. It runs MetaPost (any available version) to produce EPS files with outlines for all the glyphs, and collects the data together into one Type1 file. The MetaPost source files describe the glyph designs, and additional macros - defined in a MetaType1 macro extension or appended by the user - combine them into a font.

Compound elements with intersections require “remove overlap” during compilation to Type1 and OpenType font formats.

TODO: I wish MetaType1 would be extended to MetaOpenType to produce OpenType directly.

Meta-Designing Parametrized Arabic Fonts For AlQalam

Ameer M Sherif

Hossam A H Fahmy

Here’s a reed pen nib, the traditional Arabic writing tool. Here’s the Naskh style of Arabic script: written right to left, and most letters connect - only 6 do not. And you can write the same word wider or shorter to justify the line as you like. Lines are not justified by the spaces between the words, as in Latin, but inside the words. There are a lot of ligatures, and the same letter can have a very different shape depending on its position in a word. The 2nd and 3rd lines of this slide are images from an Arabic calligraphy handbook.

There are other styles of Arabic, like roman, italic and fraktur for Latin. In Naskh you have a unit like an em, a scalable unit, and the base pen nib shape is a square at 45°.

A vertical stroke is not really vertical; it’s not just two points, but 4 points describe it well (“z1..z2..z3..z4”). A 5th point is often redundant, although sharp bends and asymmetric strokes can require one.

There are primitives for Latin glyphs; vertical (stem, bow) and horizontal (arm, bay, turn, elbow) then secondary (nose, bar, dot) and then specialised parts (Q tail, R tail, a belly, g tail)

DEK used a simple set of primitives, and parametrised them to get a large set of glyphs. We want primitives to make letters more flexible and better connected.

We used 3 kinds of primitives:

  1. Some are used without any modifications in many letters

  2. Some are dynamic but change shape only a little

  3. Some are dynamic and change a lot

There are ‘approximate’ directions in calligraphy books, where ligatures are pretty different shapes to their component characters. METAFONT isn’t that smart yet, to learn over time ;), so we have to put that into the design. These are the 2nd kind above.

The 3rd kind are tricky; eg the “kashida” that doesn’t belong to one of the two letters, its a connection between the two. OpenType is buggy; you cannot have glyphs that change width on the fly; you have to predefine sizes. But the line-breaking algorithm ought to tell the font what width an Arabic character it wants.

The best OpenType fonts in Arabic, from Decotype in Holland, have a predefined width. This will create poor connections between joined up glyphs, but if you can have a smart font and line breaker, it will be smooth. (?)

Urdu is totally oblique, and so you need to look at the different Arabic writing styles for each font. Arabic is the most commonly used script after Latin; used for about 15 languages.

Taco and Hans were asking about when Arabic letters stack up; The baseline is the base; for combining letters, we benefit from the declarative nature of METAFONT. The horizontal positioning starts from the right, the vertical positioning starts from the left at the baseline, and the writing starts from the right.

Flexing and contracting with kashidas is a matter of personal taste of a calligrapher, so with type its something the type designer/typographer can decide. The length of the kashida is the length of the word, minus the minimum width of the letters.
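As a formula, the rule just stated (my restatement, not the speakers’ notation): if w_word is the width allotted to the word and the w_i are the minimum widths of its n letters, then

```latex
\[
  \ell_{\text{kashida}}
    \;=\; w_{\text{word}} \;-\; \sum_{i=1}^{n} w^{\min}_{i}
\]
```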

We wrote a simple GUI for this: it reads input word(s) and parses them into character streams, lists the chars, lets you manually select the letter-forms and lengths, then outputs files with the selected letter-forms, lengths and order in the word(s), and finally runs METAFONT and a DVI viewer. So we get complete words out of METAFONT using these primitives.

We tested 16 words with 30 people on a comfort scale of 1 to 5, and took the mean of their opinions. We compared Simplified Arabic and Traditional Arabic, which Microsoft ship, DecoType Naskh - said to be the best available - and ours. We get 3.9/5, DecoType gets 3.2, Traditional 2.4 and Simplified 2.3. The big difference is the kerning, and DecoType isn’t doing a good job with the kerning right now.


We want automatic selection of the most suitable glyph shapes and sizes.

We want contextual analysis to choose the form, and line justification analysis to choose the size and ligatures. This will take a whole paragraph, and process the whole thing. You won’t know the shape of the first character of the first line until you’ve taken into account the last character of the last line. Very complex!

We want to meta-design all possible letter forms

We want to automatically place dots and other diacritic marks.

We’re not sure if it’s worth modelling the ink spread and movement speed of human calligraphers.

We want to embed METAFONT sources into PDFs: if you want to re-flow things, you need to re-justify them, so the METAFONT sources should be available in the PDF, and PDF viewers would need a METAFONT engine to re-typeset the paragraphs. PDF viewers have an OpenType engine, so why not a METAFONT engine? METAFONT is much, much better than the tables of OpenType.

Finally, we want to support other Arabic writing styles. We haven’t finished this one yet, but plan to move forward

Q: Tom Milo (behind DecoType) has the ACE text layout engine as an InDesign plug-in that uses a special font format to set text, and these fonts can be ‘frozen’ into OpenType fonts for general use.


Writing Gregg Shorthand with LaTeX and METAFONT

Gregg shorthand was created in 1888; the current version is the centennial version. It is a simplified alphabet for phonetic writing, plus brief forms and phrases. text2gregg.php at http://www3.rz.tu-clausthal.de/~rzsjs/steno/Gregg.php shows how this works; let’s input “once upon a time there was a family that lived happily ever after” with a proof of 23 to make it larger.

Gregg has a lot of ligatures, and so we need to join curved (C) and vertical (V) strokes together - basically in 3 ways: CV, VC and CVC. We use Hermite interpolation with Bezier splines to do this smoothly. …
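The Hermite-to-Bezier conversion is standard; for the record (the textbook identity, not from the talk itself), a cubic Hermite segment with endpoints P0, P1 and tangents m0, m1 is the Bezier curve with control points:

```latex
\[
  B_0 = P_0, \qquad
  B_1 = P_0 + \tfrac{1}{3}m_0, \qquad
  B_2 = P_1 - \tfrac{1}{3}m_1, \qquad
  B_3 = P_1 .
\]
```

Sharing the tangent at the join of a C and a V stroke then gives the smooth connection between them.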

Phonetic writing is done with “unisyn” from http://www.cstr.ed.ac.uk/projects/unisyn/

The 15 most frequent words in any text make up 25% of it.

text2Gregg works great!

CAVE CANEM - at Pompeii before 79 AD there is old Roman cursive; then DEK, Herout-Mikulik, Gregg and Pitman. These meta-notations or shorthand notations are machine drawn; do not confuse pen stenography with the machine stenography products used in the US.

There is a book “Gregg shorthand adapted to Irish”, with copy inscribed “courtesy of john r Gregg 1930” (?)

My talk

Multidimensional Text

John Plaice


What is text? In many ways we are stuck in the typewriter age: most formatting systems assume input and output strongly resemble each other - typewriter, telegraph, WYSIWYG, TeX/LaTeX, Unicode/XML.

Between a sequence of typeset glyphs and the characters that generate it, there is indeed such a resemblance.

But documents need to be edited (many times over), annotated and searched.

This needs versioning (revisions and variants), input methods (SMS, typing, audio), output methods (raw text, formatted text, audio), and various kinds of text processing (spell checking, typesetting, searching, morphological analysis…).

What do we know? We need to move from one representation to another with only the inherent complexity of each process… We need separate input, output and internal representations (note the plural)

Chris Rowley (Kyoto 2003) wrote about this.

The solution already exists; it took a while to invent it and realise it was already invented: AVMs, or Attribute Value Matrices. Everything is an attribute-value list; values themselves can be AVMs, and any value is reachable through an index (“iterator”). AKA feature structures.

3 common structures: ordered sequences (‘flat’ structures), ordered trees (‘hierarchical’ structures), and matrices of multi-ordered data.

Parallel Typesetting

Toby Rahilly


Uses a physics model of forces to lay out text! Cool!

Notes from TUG2008 in Cork: Day 1


20 years of TeX Development; a personal perspective


Frank Mittelbach

Late 80s

Early enthusiasm!

Saw the program but couldn’t use it, then later got LaTeX 2.08 with PCTeX, but I had 512k RAM and it couldn’t run. So I wrote FM-TeX as a mini-LaTeX, focused on speed and lightness.

doc and docstrip; about 85% of the LaTeX code on CTAN uses this for documentation and installation
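The doc/docstrip convention interleaves documentation and code in one .dtx file; a minimal sketch of what that looks like (illustrative, not from the talk):

```latex
% \section{A tiny example package}
% The lines below, between macrocode markers, are the actual code;
% docstrip extracts them and strips the commented documentation.
%    \begin{macrocode}
\ProvidesPackage{tiny}
\newcommand{\hello}{Hello!}
%    \end{macrocode}
```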

Other packages - array, multicol, theorem, varioref, etc - and I won the Don Knuth Scholarship for multicol.

I identified LaTeX’s shortcomings - poor math support, hardwired fonts, no colour or graphics support, no extension interface, missing non-English language support, no input-encoding support, no consistent internal programming language, (nearly) no high level internal interfaces - and a fairly simple page model.

So we discussed at a conference with Don and talked him into various changes, and TeX 3.0 was announced. This opened up non-English usage, but basically froze the processor for a long time.

The Early 90s

The community was running at full speed!

Babel development, NFSS development, and e-TeX: guidelines for future TeX extensions (1990). Also in 1990, the Cork encoding: a single community-wide standard that supported many languages, although it over-did some things which still haunt us today. PostScript fonts exploded, so fontinst came along; AMS-Math development was started; LaTeX2e beta was released in 1993, then bug fixing and input encoding support, and then the first official release in June 1994.

The shortcomings I identified were mostly fixed: maths, fonts, languages, input encodings, colour and graphics, extension interface, etc.

But some remained: no consistent programming language, no high level internal interface, and still a fairly simple page model.

And so then there was the first consolidation phase.

The LaTeX Companion book’s 1st Edition was translated into German, French, Russian and Japanese, and sold over 100,000 copies. And the Graphics Companion was also important.

The Late 90s

Going mainstream!

We tried to establish LaTeX as a product to build up a big user base. We developed regression test suites and had regular maintenance releases - 17 by now. Test-driven development is typical today, but we were doing it in the early 90s.

We did a lot of hacks to optimise things, which was necessary in those days because computers were slow, but comes back to haunt us. We developed a new kernel in expl3 in 1992, but didn’t publish it until 1995 or so, which was a mistake. And we tried to solve some of the license issues with the LPPL in 1999, which became accepted as a free software license in 2003 after 1600 messages on debian-legal.

47% of CTAN is LPPL, 25% unknown, 17% GPL, 5% Public Domain…

But at this time maintenance becomes more and more rigid and slow.

The Early 00s

This was the second consolidation phase (03-07), and we released The LaTeX Companion 2nd Edition, a 90% rewrite that doubled the size, translated into German and French. And the Graphics Companion was also released.


expl3 is now available and usable, solving the lack of a consistent internal programming language, and a template interface is available within LaTeX2e, giving more high-level interfaces via packages.

But the big problem is the simple page model; this is a limitation of the underlying TeX engine. But, would a replacement system reach the critical mass of users needed to make people switch?

So, is it dead?

  1. Not yet, but the sharks are circling.

  2. Not yet, but it is fragile.

  3. It’s still strong, but old, and changing it might kill it.

So the real answer is a mix of 2 and 3. We have solved many large issues in the past, and the consolidation efforts helped. A semi-optimal standard that is used is better than nothing.

A big dilemma in TeX is that all major developments have been by individuals or small groups - students, academics, or old dinosaurs hanging around - and all the large projects with committees have failed. This is good for development but not so good for maintenance.

LaTeX is an exchange protocol; if I send something to you, I expect it to look identical on both our computers.

For LaTeX to evolve, we need to identify what would make people switch (many USA users still use 2.09 seeing no need for updates), we need a clean update and upgrade path for software and documents.


Q: Can you replace the underlying engine and still have it compatible?

A: Gradual stuff, yes. But a different model? No.

Q: Is LaTeX doomed?

A: This is what I meant by 3; there are a lot of old people who have used it a long time, and few new users.

A Pragmatic Toolchain

Steve Peter


I consult for http://www.pragprog.com. Who are they? Andy Hunt and Dave Thomas were programmers working for years and years, AT&T and elsewhere, and they started comparing notes, and realised their jobs were the same jobs over and over. They wrote “The Pragmatic Programmer” and people didn’t read it and follow the advice and they kept doing the same jobs. They suggest documenting things, and using LaTeX to do that. They fell in love with Ruby, thought directly in it, and at the time all the docs were in Japanese, and they wrote “Programming Ruby” (now in 3rd edition). My mind works more like Perl, I’m a linguist by training, I figure something out and 6 months later have no idea how it works :)

So these guys bought back the rights to the books and started their own publishing house, the Pragmatic Bookshelf. They typeset TPP themselves, the original was in TROFF, and the publisher suggested using TeX - their in house designer used TeX - and so when they started the Pragmatic Bookshelf, they inherited that tool chain.

So I inherited that tool chain. A few years in, I got an email asking if I knew anyone who knew TeX and XSLT. So I went to the bookstore for a book on XSLT and here I am :)

Our tool chain: the source our authors write is XML, in our own DTD, “Pragmatic Programmers Book” (PPB), and we use make, Ruby, XSLT and TeX to process that into PDF. We sell all our books in print and as PDF downloads. Both are PDFs, but via two different routes: the print PDFs go through Acrobat because our printers don’t like PDFs from other sources, while the online PDFs go via dvips and then ps2pdf.

We don’t use PDF’s DRM although we do watermarking.

Since our tool chain is almost all free software, all our authors can have copies and test things out. It’s cross-platform, UNIX-centred. So when I have my author XML, I just run “make all” and the PDF is built. Normal make switches affect the build pathway for print or screen.

pragprog.sty (with the memoir class file) is how we format the TeX part.
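A guess at the shape of that setup (pragprog.sty’s actual contents aren’t shown, so every line below is hypothetical):

```latex
% Hypothetical sketch of a memoir-based house style.
\documentclass{memoir}
\usepackage{pragprog} % the in-house style described in the talk
\begin{document}
Body text generated from the XML source.
\end{document}
```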

We try to keep the XML as the canonical source, with a pipeline transformation to reading formats, so we can expand to other eBook formats as they become established.

What are our “issues”?

Principles: XML is always the canonical source; everything must be automated.

URLs and hyperrefs in general are a pain.

The biggest problem is making re-flowable PDFs for eBooks. We get an email a day asking us when we’ll support Kindle and so on. I’m under pressure to produce re-flowable PDFs. I can’t do that with stock TeX, and I hope I can figure this out this week ;) LuaTeX? No. XeTeX? No.

And do we used LaTeX? ConTeXt? Eplain? A new format?

We can provide some funding (in dollars…) to do this.

We might switch away from TeX to do this. XSL-FO does have tool chains to produce re-flowable PDFs. It doesn’t look as good as our TeX based chain, so perhaps we’ll add a whole new pipeline with XSL-FO for eBooks.

Q: How do authors write the XML?

A: We offer authoring packages for TextMate and Emacs. I used to use Emacs but now use TextMate.

Q: How do you process XSLT?

A: Xerces I think.

Q: We have not packaged TeX so it can be used as a reflow engine for PDF files; perhaps we ought to. Can we make a BOF session about this?

A: Sure

Q: Why not use pdfTeX?

A: The original team used psTricks for lots of things so we can’t switch to that, but yes that is the right thing to create re-flowable PDFs.

Kindle has a black and white screen, and has only 2 fonts - serif and sans, no fixed width font or advanced math support.

Q: Could you get XSL-FO out of TeX?

A: Perhaps, not sure..?

Q: Does the optimal route pass through XSL-FO?

A: Not sure either :)

Developing your own Document Class

Niall Mansfield


I wrote “The Joy of X” in LaTeX in 1990, in the STOP format from Hughes Aircraft. Every section is 2 pages: it has a title, then a theme summary, and can have tables, ordered and unordered lists. There is also a per-chapter TOC. Subsections are also 2 pages, with the section title repeated.

I wrote “Practical TCP/IP” for Addison Wesley initially, got the rights back, and wanted to set it myself with LaTeX. I used Emacs, Inkscape and Subversion. My old TeXspert was an Amazon founder, so he wasn’t around any more and I was on my own :)

So this was LaTeX2.09, 1400 lines of code including 550 comments.

We tried to convert the class file. This was dreadful: writing a whole class is impossible for normal people; you don’t know how the components work, or what to include or omit…

LaTeX2e is wonderful, it looks like normal programming (instead of stack programming) and it has lots of useful packages.

So the new style file: we assume book.cls; it’s 300 lines of code plus comments, with 34 \requirepackage’s, 7 \newenvironment convenience functions, and 54 \newcommands. We do have a tricky 34-line \@sect{} override though.
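The convenience functions mentioned are presumably things along these lines (the names below are purely illustrative, not from the actual style file):

```latex
% Hypothetical \newenvironment / \newcommand convenience wrappers.
\newenvironment{themesummary}
  {\begin{quote}\itshape}
  {\end{quote}}
\newcommand{\product}[1]{\textsf{#1}}
```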

We rely on the class to do as much as possible; we can rely on standard classes and packages, stuff that has been around for years and is stable and well tested. Wherever we changed anything, we reused other people’s work and changed as little as possible.

We keep changes to the absolute minimum.

No “tidying up”!

Lessons learned?

I had an 800-page book in Word, and used a proprietary word2tex converter.

Hard stuff: \@sect{} is not for sissies; macro processing is not C, Java, etc. The TeX syntax is crazy; one time, I couldn’t work out how to change a “==” to a “>=” expression. Will LuaTeX or PyTeX solve this?

But having done 3 different books with 3 different tool chains - POST, Word and LaTeX - I think LaTeX is best.

Desirables? Annotatable source, where 2 authors can comment without changing the text directly, and “live” diffs where you can see the changes in the text clearly. Word has this, as it’s WYSIWYG.

And does XML mean we’re wasting our time with LaTeX?


LaTeX, Lilypond, Perl

Joe McCool


I am a LaTeX user. An amateur: I try things until they work, and don’t really know what I am doing. I work in the countryside; yesterday was the first time I met someone who knew what LaTeX was.

I am an amateur musician too, and wanted to make a book of tunes with some connection to boating/water/maritime stuff. I had made a book with LaTeX about a canal we revived, and loved it, so I wanted to make a cookbook or songbook next.

Folk music is often learned by ear; it’s copied by ear, person to person. Folk players are not strong sight readers. This affected my requirements:

Integer number of tunes per page, cross referencing for grouping songs, text snippets associated with the tunes, nice big fonts and colour, easy to input the song data, good control of output, good community support, and a high quality output.

GNU Lilypond’s /lilypond-learning/Engraving.html documentation has some images of various musical fonts. It’s called “engraving” rather than “typesetting” for traditional reasons, although it’s really typesetting. For the typesetting of classical music, the quality must be 100%, because anything less can affect the quality of the music being played.

I had other requirements too: MIDI production (a standard format for storing music data, so people can listen to things), a process from source to output that is as automated as possible, and web publishing.

There are some proprietary software packages: Finale and Sibelius are the main ones; Noteworthy Composer and Cakewalk are others. I rejected these quickly because they are proprietary. There is free software too: MusicTeX, psTricks, GNU Lilypond. I looked at these and GNU Lilypond is the best IMO.

The classical music world regards Lilypond as very high quality. Finale and Sibelius have nice GUI interfaces and various interesting features, but their printed output is poor; in fact they will output in GNU Lilypond format.

GNU Lilypond is under heavy active development. Started by some Dutch musicians in 1996, it has strong community development and a strong free software ethic, and is heavily based in hacker culture, which can be a strain for musicians without that background. It has partial compilation, a mode for beloved Emacs, and a classical Unix command line interface. It incorporates Scheme; I haven’t got far enough to play with that, but it’s powerful. The documentation has some good examples of what it is capable of: a lot of space to be handled, a lot of ink to be placed, and it meets the challenge well. A solid UNIX-like tool.


LaTeX3 Project

Jonathan Fine


I work at the Open University - basically a strange kind of publishing house - in the LTS - I don’t teach, I prepare course materials - and we wanted PageMaker style layout but done with LaTeX.

Here’s a page full of figures, and the editors are keen the figures be on the same page as the reference in the text.

From 1992 to 1995, the members of the LaTeX3 project cleaned up LaTeX2.09 to create 2e, identified remaining problems in 2e, and published statements of their goals. The Open University has been working on their SGML/XML goals: allowing document elements to have attributes, and solving real-world production problems. We plan to release this as free software.

At the 1997 TUG, Frank Mittelbach and Chris Rowley presented a paper on ‘The LaTeX3 Project’ that was later published in XML Coverpages. They wanted a syntax to automatically convert popular SGML DTDs into LaTeX.

Attributes are metadata belonging to an element; e.g. a figure element has attributes like caption, source image file, copyright, location on the page (margin, body), and raise/lower. E.g.,

    \caption{a spinning top}
    \copyright{Ty Coon}

Attributes are keys for building a dictionary, not control sequences to be executed. Elements have a list of allowed keys, required keys and default values; keys can’t be specified twice; the parser can normalise and validate their values; and if there is an error, \do@Figure is not called.
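The dictionary-of-keys idea can be sketched with the standard keyval package (the talk’s actual LaTeX3 syntax differs; \do@Figure is the hook named in the notes, the rest is illustrative):

```latex
% Hedged sketch: attributes as keys, using the keyval package.
\makeatletter
\RequirePackage{keyval}
\define@key{Figure}{caption}{\def\Fig@caption{#1}}
\define@key{Figure}{copyright}{\def\Fig@copyright{#1}}
% Parsing an attribute list, then calling the element handler:
% \setkeys{Figure}{caption={a spinning top}, copyright={Ty Coon}}
% \do@Figure
\makeatother
```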

docx2tex: Word 2007 to TeX


Krisztian Pocza, Mihaly Biczo, Zoltan Porkolab

    c:\bin\docx2tex.exe paper.docx paper.tex

This is our PhD work, in Hungary


This is a small tool that uses ECMA and ISO standard formats so we can trust them ;) and it’s free software.


Word 2007 is not very good at typography. TeX is better. But Word is easier to use, WYSIWYG, and it supports collaboration and team work. So we wanted to connect them.

Features & benefits

Existing solutions: proprietary programs exist, mostly as Word plug-ins that cannot be tool-chained; some only support RTF; and OpenOffice.org can read old Word formats and output some TeX, but its format support in both directions has many shortcomings.

docx2tex is free software. It supports most parts of documents, although not Word Equations or Drawings (yet): normal text, text formatting, alignment, lists, figures, tables, listings, cross references, and image conversion (via ImageMagick).

Applications & Use Cases

We use it for scientific publications … Our work flow might have several authors, and Word 2007 handles this well, tracking changes and merging forks, and so when we arrive at a final docx file, we convert it to TeX, polish that up, and submit it.

Technical Details

License and availability

The license is X11; GPL is cancer. http://codeplex.com/docx2tex/ — CodePlex is Microsoft’s SourceForge. It’s written in C# with Microsoft Visual Studio 2008.



Q: use Mono to run it?

A: Depends on Miguel. We use .NET 2.0 and I don’t know how much Mono supports that. Maybe. We use the System.IO.Packaging namespace that is part of .NET 3.0.

Q: The OOXML specification is complex; what areas did you have trouble with?

A: …

Q: Roundtrip back to OOXML?

A: Interesting.

TeXWorks: Lowering the Barrier To Entry

Jonathan Kew


Been working on XeTeX for the last few years, but recently I’ve been doing something different. There is a link, though: part of what drove XeTeX was that it made things easy that used to be hard.

Using a new font in LaTeX used to be intimidating, and there was fontinst to help, but many users never got over that initial barrier: it felt hard and complex, and TeX meant Computer Modern, Palatino, or Times.

A big reason for XeTeX adoption is using any font you want easily.

Another big issue: TeX is known for being really good, especially at maths and science; but on the other side, people typesetting things who are not doing maths and science are frightened off by $ and so on.

So how can we make TeX more accessible to people who aren’t typesetting equations daily?

What does TeX look like when a newcomer arrives? They have to write a TeX document, and they need a text editor to do that. Let’s look at some examples of TeX editors:

TeXnicCenter. Lots and lots of buttons, cryptic abbreviations for processes you might run…

WinEdt. Very similar: lots of buttons, and only a quarter of the screen is used for writing your document. For someone used to double-clicking an icon on the desktop and writing their document immediately, this is frightening.

Kile. The same again. Lots of Greek mathematical symbols…

LaTeXEditor. Same.

We know what they are like. There are other kinds, like Emacs, but I got out of the UNIX world and don’t use it any more.

And then there is Dick Koch’s TeXShop. TeX on Mac OS X has been very popular, largely because of TeXShop. This is not frightening. It is easy to start and to use initially.

There didn’t seem to be anything else like this.

Maybe there is a place for something with a similar interface that is cross platform?

Wikipedia says “The introduction of TeXShop caused a TeX-boom among Macintosh users”, and this is true. It presents only the essentials, it has a simplified workflow straight to PDF instead of DVI and so on, and it has a few really neat user interface features: the magnifying glass [from XDVI!] and the ability to click somewhere in the output and be taken to the area of source that created it.

TeXWorks is an effort to build a similar program, to give a similar experience, in a portable way. It builds on portable free software; I’ve been working on it in a spare-time, hobbyist manner, using existing components and putting them together. I’m using Poppler, the most popular free software PDF rendering library, and Qt, a popular free software GUI toolkit. Those are the key pieces.

What do we hope to do?

  1. A simple text editor: Unicode support, using standard OpenType fonts, multi-level undo/redo, search and replace with regexes; the usual stuff.

  2. Tools to execute TeX to create PDF.

  3. A preview window to view the output. This is unusual: an integrated PDF viewer, with anti-aliased rendering, that opens automatically when TeX finishes and auto-refreshes when you rerun TeX, staying at the same page/view; the magnifying glass feature; and Jérôme Laurens’ “SyncTeX” technology to jump around from source to output and back.

  4. Power user features, but that must not complicate the UI for the newcomer. Code folding, interaction with external editors/viewers, and so on.

This presentation is being done with the TeXWorks viewer, but here’s a demo of the UI with sample2e.tex

If we command-click a selection in the source, the viewer jumps to that spot, and likewise the other way around. This is SyncTeX at work.
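SyncTeX information has to be written out when TeX runs; editors like TeXWorks normally arrange this by passing -synctex=1 on the command line, but as a minimal sketch, the (real) \synctex primitive in pdfTeX/XeTeX can also switch it on from inside the document:

```latex
% Enabling SyncTeX from within the document; pdflatex will then write
% a .synctex.gz file alongside the PDF, which the editor uses to map
% between source positions and PDF positions.
\synctex=1
\documentclass{article}
\begin{document}
Command-clicking here in the source jumps to this spot in the PDF.
\end{document}
```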

Q: What about page numbers?

A: Right, there is no source to jump to. Jérôme’s presentation will show off what SyncTeX can do; it’s just integrated with TeXWorks.

Here’s the sample for the polyglossia package (a Unicode replacement for babel), and this shows the on-the-fly spellchecker red-underlining the source. We’re set to English, so almost everything is marked misspelled, but if we change the spelling language to German, the German paragraph checks out fine; the same for Greek; or we can turn the integrated spellchecker off. It’s based on Hunspell and of course is free software.

I’m using Mac OS X here, where I could use TeXShop, but here’s a GNU/Linux machine; let’s load it up there.

Here’s the templates feature: it includes templates for common kinds of documents. Here is a new document based on a template; let’s pick the Beamer class and run TeX on it. So if you install it and have any standard TeX distribution like TeX Live, it will just run straight away.

And for completeness, lets see it on Windows XP and on Vista. You can see it works precisely the same way.

An earlier presentation took a long time to start up because it used a lot of .NET libraries, but this starts very quickly even though it uses a lot of free software libraries.

So the main thing I want people to do is join in development!

I can’t do everything. There are a lot of ways I’d like to see the TeX community contribute. It’s C++, hosted at Google Code. The command completion specifically needs work. The Qt Linguist tool allows localisation and translation.

You can also use it for real work and provide feedback.

There is currently zero documentation and no tutorials. Pages suitable for integrated help would be great.

If you can package it for your OS, please do, I haven’t done any of that.

http://tug.org/texworks/ is the homepage

http://code.google.com/p/texworks/ is the development home (source code repo, downloads of binary packages, issue tracker, wiki for developer notes…)

Q: Other areas for development?

A: How to integrate images is tricky and scares new users. Old-timers who have an investment in PostScript-based workflows aren’t well supported today; they aren’t the users we’re aiming at, but I’d be happy to accept contributions that support this.

Q: My son is computer illiterate. We told him to throw out Word, took an hour to teach him TeX, and his teachers praise him for the high quality of his work, and he’ll never go back to Word. It only takes an hour.

A: It should!

Q: What about Lyx, ScientificWord? Aren’t they meant for novices?

A: They have a place. But they constrain the kinds of documents you can write. They work great for a narrowly defined domain, but they are not general purpose. This can do anything that can be done with TeX.



MPLib

Taco Hoekwater

MetaPost is John Hobby’s program, written in Pascal, which is unpleasant to work with. We thought it would be good to modernise the whole system, at least a little bit, and we got some funding to do this.

We wanted a reusable MetaPost component that is completely re-entrant, so several programs can use the library without duplicating the same program in system memory; indirect I/O, so you can use the library without ever touching the hard disk; a simplified subsystem for labels; and totally dynamic memory allocation (the original MetaPost used static memory allocation, so for a big job you had to recompile it to get access to more memory).

Problems we ran into?

Pascal WEB is very old-fashioned: many global variables, static arrays, a string pool, and complicated compilation…

CWEB is a single language replacing Pascal and C, and compilation only depends on ctangle; we now have a single C library, with an “mpost” front-end…

In restructuring, we redid the instance structure, revisited the string pool, and isolated the PostScript back-end: the core creates a set of objects, which are then converted to PostScript, and could be converted to something else. We have a C namespace (mp_…).

Usage: Set up MPLib options, create an MPLib instance, …

Example C program to take input at a command line and run it.

It has Lua bindings, so here’s a Lua example that will work as Lua code within LuaTeX. The core of the program is a string delimited by [[ and ]].
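The example itself wasn’t captured in these notes, but a sketch of what the Lua binding looks like inside LuaTeX follows; the function names are from the LuaTeX mplib library as I recall it, so check the current manual before relying on them:

```lua
-- Hedged sketch of driving MPLib from Lua inside LuaTeX.
local mp = mplib.new { }                 -- create an MPLib instance
local result = mp:execute([[
  beginfig(1);
    draw fullcircle scaled 2cm withpen pencircle scaled 1bp;
  endfig;
]])
if result and result.fig then
  for _, fig in ipairs(result.fig) do
    -- each figure is a list of graphical objects, ready for a back-end
    print(fig:boundingbox())
  end
end
mp:finish()                              -- release the instance
```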

TODO: Dynamic memory allocation for everything, MegaPost (range and precision limits), configurable error strategies, and internationalisation of messages (a nice feature to have, via GNU gettext) and finally, expand the API.

Right now there is no way to ask the library “here’s an equation; what does it look like?” Expanding the API would let other programs make better use of MPLib.

http://www.tug.org/metapost/ is the homepage

MPLib generated the graphics for these slides; the background bookshelf image is generated with random variables, if you look closely.

Q: Other output back-ends other than PostScript?

A: Well, I wrote the PostScript back-end in 2 days; I have a private prototype using Lua to create PDF directly, and I think SVG would be very feasible.

[I wonder if it could output Spiro splines…]


ConTeXt MkIV

Hans Hagen

Was going to be a demo but now just a talk…

This follows up on what Taco said about MPLib, and is about ConTeXt MkIV.

This is a large rewrite of ConTeXt; ConTeXt has become quite large, so MkIV is a slimming-down release.

We started using MetaPost over 10 years ago, when graphics were embedded as EPS. Sebastian Rahtz challenged me to write a MetaPost-to-PDF converter in TeX so we could use them directly. This meant you could use TeX fonts inside MetaPost. We added some extensions (shading, transparency, etc.).

Embedded text was taken care of very efficiently, avoiding reruns totally in the end. We managed to include MetaPost source in a document source, and this allows reusing graphics with awareness of the state of the document (dimensions, colors, etc.). Such graphics can be really cool: well integrated with background mechanisms and adapted to situations… layout, font, and other contextual variables are available to graphics.

These features are stable and frequently used by users, with no real in-depth knowledge of how it all works required.

The MetaPost run time is now almost zero overhead.