This is a README for the font compression reference code. There are several
compression-related modules in this repository.

brotli/ contains reference code for the Brotli byte-level compression
algorithm. Note that it is licensed under an Apache 2 license.

src/ contains prototype Java code for compressing fonts.

cpp/ contains prototype C++ code for decompressing fonts.

docs/ contains documents describing the proposed compression format.

= How to run the compression test tool =

This document describes how to run the compression reference code. At this
writing, the code is intended to produce a bytestream that can be
reconstructed into a working font, but the reference decompression code is not
done, and the exact format of that bytestream is subject to change.

== Building the tool ==

On a standard Unix-style environment, it should be as simple as running “ant”.

The tool depends on sfntly for much of the font work. The lib/ directory
contains a snapshot jar. If you want to use the latest sfntly sources, then cd
to the java subdirectory, run “ant”, then copy these files to
$(thisproject)/lib:

dist/lib/sfntly.jar
dist/tools/conversion/eot/eotconverter.jar
dist/tools/conversion/woff/woffconverter.jar

There’s also a dependency on guava (see references below).

The dependencies are subject to their own licenses.

== Setting up the test ==

A run of the tool evaluates a “base” configuration plus one or more test
configurations, for each font. It measures the file size of the test as a ratio
over the base file size, then graphs the value of that ratio sorted across all
files given on the command line.

The test parameters are set by command line options (an improvement from the
last snapshot).
The base is set by the -b command line option, and the
additional tests are specified by repeated -x command line options (see below).

Each test is specified by a string description. It is a colon-separated list of
stages. The final stage is entropy compression and can be one of “gzip”,
“lzma”, “bzip2”, “woff”, “eot” (with actual wire-format MTX compression), or
“uncomp” (for raw, uncompressed TTFs). Also, the new wire-format draft
WOFF2 spec is available as "woff2", and takes an entropy coding as an
optional argument, as in "woff2/gzip" or "woff2/lzma".

Other stages may optionally include subparameters (following a slash, and
comma-separated). The stages are:

glyf: performs glyf-table preprocessing based on MTX. There are subparameters:
1. cbbox (composite bounding box). When specified, the bounding box for
   composite glyphs is included, otherwise stripped.
2. sbbox (simple bounding box). When specified, the bounding box for simple
   glyphs is included.
3. code: the bytecode is separated out into a separate stream.
4. triplet: triplet coding (as in MTX) is used.
5. push: push sequences are separated; if unset, pushes are kept inline in the
   bytecode.
6. reslice: components of the glyf table are separated into individual
   streams, taking the MTX idea of separating the bytecodes further.

hmtx: strips lsbs from the hmtx table. Based on the idea that lsbs can be
reconstructed from bbox.

hdmx: performs the delta coding on hdmx, essentially the same as MTX.

cmap: compresses cmap table: wire format representation is inverse of cmap
table plus exceptions (one glyph encoded by multiple character codes).

kern: compresses kern table (not robust, intended just for rough testing).

strip: the subparameters are a list of tables to be stripped entirely
(comma-separated).
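
As a rough illustration of the description syntax above, the following Python
sketch splits a description into stages and their subparameters. It is not the
tool's actual parser (that lives in the Java sources). The function name
parse_description and the sample string are hypothetical; only the colon,
slash, and comma separators are taken from the text above.

```python
def parse_description(desc):
    """Split a test description into (stage, [subparameters]) pairs.

    Assumes the syntax described in this README: stages separated by ':',
    optional subparameters after '/', subparameters separated by ','.
    """
    stages = []
    for part in desc.split(":"):
        # partition() returns an empty params string when there is no '/'
        name, _, params = part.partition("/")
        stages.append((name, params.split(",") if params else []))
    return stages


# Hypothetical sample description, built only from stage names listed above.
print(parse_description("glyf/cbbox,triplet:hdmx:woff2/gzip"))
# → [('glyf', ['cbbox', 'triplet']), ('hdmx', []), ('woff2', ['gzip'])]
```

The sketch does no validation: it does not check that stage names are known or
that the final stage is an entropy coder.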

The string roughly corresponding to MTX is:

glyf/cbbox,code,triplet,push,hop:hdmx:gzip

Meaning: glyph encoding is used, with simple glyph bboxes stripped (but
composite glyph bboxes included), triplet coding, push sequences, and hop
codes. The hdmx table is compressed, and finally gzip is used as the entropy
coder.

This differs from MTX in a number of small ways: LZCOMP is not exactly the same
as gzip. MTX uses three separate compression streams (the base font including
triplet-coded glyph data, the bytecodes, and the push sequences), while this
test uses a single stream. MTX also compresses the CVT table (an upper bound on
the impact of this can be estimated by testing strip/cvt).

Lastly, as a point of methodology, the code by default strips the “dsig” table,
which would be invalidated by any non-bit-identical change to the font data. If
it is desired to keep this table, add the “keepdsig” stage.

The string representing the currently most aggressive optimization level is:

glyf/triplet,code,push,reslice:hdmx:hmtx:cmap:kern:lzma

In addition to the MTX one above, it strips the bboxes from composite glyphs,
reslices the glyf table, compresses the hmtx, cmap, and kern tables, and uses
lzma as the entropy coding.

The string corresponding to the current WOFF Ultra Condensed draft spec
document is:

glyf/cbbox,triplet,code,reslice:woff2/lzma

The current C++ codebase can roundtrip compressed files as long as no per-table
entropy coding is specified, as below (this will be fixed soon).

glyf/cbbox,triplet,code,reslice:woff2


== Running the tool ==

java -jar build/jar/compression.jar *.ttf > chart.html

The tool takes a list of OpenType fonts on the command line, and generates an
HTML chart, which it simply outputs to stdout. This chart uses the Google Chart
API for plotting.
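
The values plotted are the per-font size ratios described under "Setting up
the test". A minimal Python sketch of that computation follows. It assumes the
compressed sizes are already known. The function name sorted_ratios, the
ascending sort order, and the byte counts are all hypothetical and do not come
from the tool itself.

```python
def sorted_ratios(base_sizes, test_sizes):
    """Return test/base size ratios, one per font, sorted ascending.

    base_sizes and test_sizes are parallel lists of file sizes in bytes.
    Ascending order is an assumption; the README only says "sorted".
    """
    return sorted(test / base for base, test in zip(base_sizes, test_sizes))


# Hypothetical sizes for two fonts under the base and one test configuration.
print(sorted_ratios([1000, 2000], [800, 1000]))
# → [0.5, 0.8]
```

A ratio below 1.0 means the test configuration produced a smaller file than
the base for that font.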

Options:

-b <desc>

Sets the baseline experiment description.

[ -x <desc> ]...

Sets an experiment description. Can be used multiple times.

-o

Outputs the actual compressed file, substituting ".wof2" for ".ttf" in
the input file name. Only useful when a single -x parameter is specified.

= Decompressing the fonts =

See the cpp/ directory (including cpp/README) for the C++ implementation of
decompression. This code is based on OTS, and successfully roundtrips the
basic compression as described in the draft spec.

= References =

sfntly: http://code.google.com/p/sfntly/
Guava: http://code.google.com/p/guava-libraries/
MTX: http://www.w3.org/Submission/MTX/

Also please refer to documents (currently Google Docs):

WOFF Ultra Condensed file format: proposals and discussion of wire format
issues (PDF is in docs/ directory)

WOFF Ultra Condensed: more discussion of results and compression techniques.
This tool was used to prepare the data in that document.