<\body> ||<\author-affiliation> Laboratoire d'informatique, UMR 7161 CNRS Campus de l'École polytechnique 1, rue Honoré d'Estienne d'Orves Bâtiment Alan Turing, CS35003 91120 Palaiseau >>|>> Currently, only a limited number of fonts are available for high-quality mathematical typesetting, such as Knuth's Computer Modern font, the font, and several fonts from the family. An interesting challenge is to develop tools that allow users to pick any existing favorite font and use it for writing mathematical texts. We present progress on this problem as part of recent developments in the GNU scientific text editor. >

For a long period, most documents with mathematical formulas were typeset using Knuth's font. Recently, a few alternative fonts were designed, such as the font and the fonts. These handcrafted fonts are all of high quality, but they required a substantial development effort. Meanwhile, thousands of fonts exist for non-mathematical purposes. To what extent is it possible to use such fonts for mathematical texts or presentations, or on the web?

In this paper we describe recent developments inside the GNU scientific text editor which aim at better support for general-purpose fonts, thereby making life a bit more colorful. The focus is on fully automatic techniques for using existing fonts inside structured documents with mathematical formulas. Further fine-tuning for specific characters in particular fonts is another interesting topic, which will not be discussed here.

There are obvious limitations to what we can do with a font if bold and italic declinations or glyphs for various important characters are missing. Nevertheless, we will see that quite a lot is often possible, even though the resulting quality may be inferior to what can be achieved by manual design.
Since various special characters or font effects are often used in only a few places inside actual documents, the occasional loss of quality may remain within acceptable bounds, even for professional purposes.

Our general strategy for turning existing fonts into full-fledged mathematical font families is to remedy each of the font's insufficiencies. The most common problems are the following:

<\itemize>
Lack of the most important font declinations needed in scientific documents: , , , , .

Lack of specific glyphs: non-English languages, mathematical symbols, and in particular big operators, extensible brackets and wide accents.

Inconsistencies: sloppy design of some glyphs that are important for mathematics (such as , , ).

The main countermeasures are font substitution and font emulation. The first technique (see Section ) consists of borrowing missing glyphs from other fonts. This can be done either at the level of an entire font (e.g. for obtaining bold or italic declinations) or for individual characters (e.g. a missing symbol, or lacking Greek characters). Font emulation consists of combining and altering the glyphs of symbols in a font in order to generate new ones. This can again be done for entire fonts (Section ) or individual glyphs (Sections and ).

All techniques described in this paper have been implemented in , version 1.99.5 and beyond. The software can be freely downloaded from our website . The virtual character definitions described in Section below can be found in the directory; interested users may play with these definitions. Longer examples of what can be obtained using the techniques described in this paper are available here:

<\padded> <\indent>

In the / universe, there have also been several efforts towards better support for modern fonts, most notably and . The first system also contains features that are similar to those described in Section . However, these systems do not support full mathematical font emulation as presented in this paper. and also tend to diverge from standard through the introduction of incompatibilities.

In order to borrow missing characters from other fonts, it is important to be able to determine fonts with a similar design, so that the alien glyphs fit nicely into the main text:

<\padded> <\with|par-first|0tab> <\indent> >, >, > are acceptable inside +y+\+z+\>>.()

>,>,>> do not look very good inside >+y+>+z+>>.>()

Usually, rules for font substitution are specified manually for each individual font. Although this often yields the most precise and predictable results, writing such rules can be tedious. For this reason, we also implemented a more automatic mechanism for determining good substitutes.

A prerequisite for our automatic font substitution algorithm is a detailed analysis of the main characteristics of all supported fonts. The results of this analysis are stored in a database. Using this database, we can then compute the distance between two fonts. When a symbol is missing in a font, it then suffices to find the closest font that supports this symbol. Notice that the best substitution font may depend on the fonts installed on your system.

In our database we use both discrete font characteristics (e.g. sans serif, small capitals, handwritten, ancient, gothic, etc.) and continuous ones (e.g. italic slant, height of an “x” symbol, etc.). Most characteristics are determined automatically by analyzing the name of the font (for some of the discrete characteristics) or individual glyphs (for the continuous ones). Some “font categories” (such as handwritten, gothic, etc.) can be specified manually.

One of the most important font characteristics is the height of the “x” symbol (with respect to the design size). When one font borrows a symbol from another font, we first scale it by the quotient of the x-heights of the two fonts. In the first example above this was done correctly, contrary to the second one. Other common font characteristics are also taken into account in our database, such as the italic slant, the width of the “M” symbol, the ascent and descent (above and below the “x” symbol), etc.
In addition, we carefully analyze the glyphs themselves in order to determine the horizontal and vertical stroke widths for the “o” and “O” symbols, the average aspect ratios of uppercase and lowercase letters, and the average filled area of the glyphs (how much ink will be used).

Our current implementation manages to find reasonably good font substitutions. Notice that this may even be a problem on certain occasions. For instance, in the example below, the sans serif font is such a good match that it can barely be distinguished from the serif font, thereby defeating its purpose:

<\padded> <\with|par-first|0tab> <\indent> text is a bit too good.>()
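As a rough illustration of the substitution mechanism described in this section, the following Python sketch computes a distance between fonts from discrete and continuous characteristics and picks the closest font that supports a missing symbol. The `FontInfo` record, the weights and the distance formula are assumptions made for this example; the text above does not specify them. The direction of the x-height quotient (main font over donor font) is likewise an assumption.

```python
from dataclasses import dataclass, field


@dataclass
class FontInfo:
    """Hypothetical database record for one font."""
    name: str
    categories: frozenset   # discrete characteristics, e.g. {"sans serif"}
    x_height: float         # continuous characteristics, relative to design size
    slant: float
    m_width: float
    glyphs: frozenset = field(default_factory=frozenset)


# Illustrative weights only; real weights would have to be tuned.
CATEGORY_PENALTY = 1.0
WEIGHTS = {"x_height": 4.0, "slant": 2.0, "m_width": 1.0}


def distance(a, b):
    """Penalty for differing categories plus weighted metric differences."""
    d = CATEGORY_PENALTY * len(a.categories ^ b.categories)
    for attr, weight in WEIGHTS.items():
        d += weight * abs(getattr(a, attr) - getattr(b, attr))
    return d


def closest_font(main, symbol, database):
    """Return the font closest to `main` that supports `symbol`, or None."""
    candidates = [f for f in database if f is not main and symbol in f.glyphs]
    return min(candidates, key=lambda f: distance(main, f), default=None)


def borrow_scale(main, donor):
    """Scaling factor that makes the donor's x-height match the main font's."""
    return main.x_height / donor.x_height
```

With such a database, a missing symbol would be looked up with `closest_font(main, symbol, database)` and the borrowed glyph rescaled by `borrow_scale(main, donor)` before it is inserted into the text.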

Various font alterations such as bold, italic and small capitals can be emulated in rather obvious ways, although with a significant loss of quality:

<\itemize>
Emboldening can be achieved by replacing pixels by small lines. In addition, it may be worthwhile to horizontally stretch certain characters such as “m”. The appropriate stretching factors are highly font and character dependent, but using the factors corresponding to the Computer Modern font usually leads to reasonable results.

Italic fonts can be approximated by slanted fonts, which may be further narrowed for a better result. The most important drawback of this method is that it often falls short of producing the correct italic versions of certain characters (a, f, g, etc.).

Small capitals can be emulated by rescaling capitals using a factor that roughly turns an “X” into an “x”. Instead of preserving the aspect ratio, we found it more pleasing to slightly widen the characters as well. The transformed version of “X” may also be made slightly taller than “x”.

With more work, the above “poor man's” strategies might be further enhanced. For instance, the italic might be better approximated using a shortened version of instead of . In order to improve bold font emulation, we might also replace pixels by small lines of cleverly adjusted lengths.

More elaborate emulation strategies might greatly benefit from a toolkit for “reverse-engineering” the design of existing fonts. For instance, given an outline, we might want to determine the curve(s) followed by a “pen” and the size (or shape) of the pen at each point of the curve. This would then make it easy to produce high-quality narrowed and widened versions of a font, as well as better emboldened fonts, or variants in which the pen's size is uniform (as needed for sans serif and typewriter fonts). Another interesting question is whether it is possible to automatically detect serifs and to add or remove them.
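The three basic emulation strategies listed above can be sketched on a glyph bitmap as follows. This is a minimal Python illustration and not the actual implementation: the representation of a glyph as a list of rows of 0/1 pixels (bottom row on the baseline), the function names and all default parameters are assumptions.

```python
def embolden(bitmap, extra=1):
    """Replace every set pixel by a horizontal line of extra + 1 pixels."""
    height, width = len(bitmap), len(bitmap[0])
    out = [[0] * (width + extra) for _ in range(height)]
    for y in range(height):
        for x in range(width):
            if bitmap[y][x]:
                for dx in range(extra + 1):
                    out[y][x + dx] = 1
    return out


def slant(bitmap, slope=0.25):
    """Shear each row to the right in proportion to its height above the last row."""
    height, width = len(bitmap), len(bitmap[0])
    max_shift = int(slope * (height - 1))
    out = [[0] * (width + max_shift) for _ in range(height)]
    for y in range(height):
        shift = int(slope * (height - 1 - y))
        for x in range(width):
            if bitmap[y][x]:
                out[y][x + shift] = 1
    return out


def small_caps_factors(cap_height, x_height, widen=1.1, raise_by=1.05):
    """Horizontal and vertical scaling factors for emulated small capitals.

    The vertical factor roughly maps the height of "X" to that of "x",
    made slightly taller by `raise_by`; the horizontal factor is widened
    further instead of preserving the aspect ratio.
    """
    vertical = raise_by * x_height / cap_height
    horizontal = widen * vertical
    return horizontal, vertical
```

A narrowed italic would combine `slant` with an additional horizontal scaling, and a per-character stretching table (for instance for “m”) could be applied after `embolden`.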
We have started to experiment with more elaborate emulation algorithms for the generation of “blackboard bold” variants of glyphs. The easiest strategy is to produce an outlined version of the possibly emboldened input glyph. The standard AMS blackboard bold font uses this method, but we consider the result inferior to what is obtained by adding a single stroke. We implemented an algorithm that detects the part of the contour to be “double stroked”. We next embolden this part and hollow it out.

<\big-figure|> Emulation of bold, italic, small capitals and blackboard bold. * These declinations are already supported by the original font.
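The easiest blackboard bold strategy mentioned above, outlining a (possibly emboldened) glyph, can be sketched on a bitmap as follows. This Python fragment is only an illustration under the assumption that a glyph is a list of rows of 0/1 pixels; it does not reproduce the detection of the part of the contour to be double stroked, which the text does not describe in detail.

```python
def outline(bitmap):
    """Keep only the set pixels that touch the background or the bitmap border."""
    height, width = len(bitmap), len(bitmap[0])

    def is_set(y, x):
        return 0 <= y < height and 0 <= x < width and bitmap[y][x] == 1

    neighbours = ((-1, 0), (1, 0), (0, -1), (0, 1))
    out = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            interior = all(is_set(y + dy, x + dx) for dy, dx in neighbours)
            if bitmap[y][x] and not interior:
                out[y][x] = 1
    return out
```

Applying `outline` to a thick vertical stem yields a hollow rectangle, which is the effect wanted for the double-stroked part of a blackboard bold letter.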

Missing glyphs can be generated automatically from existing ones using a combination of the following main techniques, listed by increasing complexity:

<\itemize>
Superposition of several glyphs: and can be combined into , and can be obtained by juxtaposing two symbols.

Clipping rectangular areas: cutting and in the middle and combining them yields .

Linear transformations: combining a crushed O and an I, we may produce the Greek capital Φ. Turning around , we obtain .

Simple graphical constructs such as circles and lines. This can for instance be used for producing the missing half circle of .

Special transformations that operate directly on the pixels of a glyph (or on its outlines if possible). For instance, we designed a special “curlyfication” method that turns into and into . Similarly, we implemented a “flood fill” algorithm for transforming into .

In a similar vein, we need various querying mechanisms: all glyphs come with logical and physical bounding boxes, but we may sometimes want to compute the exact width of some stroke or obtain other kinds of information.

We developed a small language that can be used for defining new “virtual” characters in terms of existing ones. The design of every new virtual glyph can be regarded as a puzzle: finding a clever way to combine existing glyphs into the desired one using the primitives from the language. Of course, we are looking for robust solutions, in the sense that they should work for any reasonable font in which the required basic glyphs are available.

Let us consider a few examples. For the construction of arrows, it turns out that the single ‹ and › are often well suited for the heads (the rescaled symbols and are acceptable fallbacks). The arrow bars are obtained from the minus sign, but determining an appropriate minus is nontrivial. For instance, the width of the dash - is usually too large, so we should avoid using this symbol. The underscore is a better candidate; one may also cut the plus sign into several pieces (avoiding the vertical stroke) and recombine them.

Assuming that we have an appropriate arrow bar and head, we may use the following code for producing an actual arrow:

<\flat-size> <\scm-code>
(rightarrow (right-fit arrowbar (align righthead arrowbar * 0.5)))

The primitive align is used to vertically align the arrow head at the center of the arrow bar. The primitive right-fit is less basic and corresponds to sliding the arrow head from the right to the left until the arrow bar goes past the head on its right. More direct ways to produce arrows turn out to be less robust. Left and left-right arrows can be defined using

<\flat-size> <\scm-code>
(leftarrow (left-flip rightarrow))
(leftrightarrow (join (part leftarrow * 0.5) (part rightarrow 0.5 *)))

These definitions potentially take advantage of an existing rightarrow in the base font. The primitive part performs two horizontal clippings between the middle and the extremities, whereas join is used for superposition.

An interesting challenge is the emulation of Greek characters. This seems intractable for the lowercase symbols, but is less hopeless for the capitals. For instance, Γ can be obtained by flipping the Roman L upside down, and we already mentioned how to obtain a reasonable Φ. More interesting is the case of Π, which can be obtained from H by moving the horizontal bar to the top. However, extracting this bar is not so easy in some fonts: consider . For a robust method, we therefore cut the H into pieces: we first extract suitable pieces and recombine them. We next take a tiny piece of the central bar, extend it to the desired length, and move it to the top.

<\big-figure||||||||||||||||||||||||||||||||||||||||||||||||||||