The CFF 2 Charstring Format

1 Introduction

The CFF 2 CharString format provides a method for compact encoding of glyph procedures in an outline font program. CFF 2 charstrings are intended for use only with a CFF 2 font table in an OpenType font file.

The CFF 2 CharString format is closely descended from the Type 2 CharString format. Type 2 CharStrings are documented in Adobe Technical Note #5177, “The Type 2 Charstring Format”, which is available at Adobe Font Technical Notes.

The motivation for developing the CFF 2 CharString format was twofold:

Accordingly, the CFF 2 CharString format has added some new operators, and has removed many more operators. However, the encoding of operators and operands is the same between Type 2 and CFF 2 CharStrings; interpreters will need relatively little change to process both formats. See Appendix D Changes From Type 2 CharStrings for a complete list of the differences between Type 2 and CFF 2 CharString formats.

This document only describes how CFF 2 charstrings are encoded, and does not attempt to explain the reasons for choosing various options. CFF 2 charstrings are based on Type 1 font concepts, and this document assumes familiarity with the Type 1 font format specification. For more information, please see “Adobe Type 1 Font Format” at Adobe Font Technical Notes. Also, familiarity with the CFF 2 font format is assumed; please see the Compact Font Format Version 2 ('CFF 2') chapter for details.

2 CFF 2 Charstrings

The following sections describe the general concepts of encoding a CFF 2 charstring.

2.1 Hints

The CFF 2 charstring format supports six hint operators: hstem, vstem, hstemhm, vstemhm, hintmask, and cntrmask. The hint information must be declared at the beginning of a charstring (see section 3.1) using the hstem, vstem, hstemhm, and vstemhm operators, each of which may take arguments for multiple stem hints.

CFF 2 hint operators aid the rasterizer in recognizing and controlling stems and counter areas within a glyph. A stem generally consists of two positions (edges) and the associated width. Edge stem hints help to control character features where there is only a single edge (see section 4.3). The CFF 2 charstring format includes edge hints, which are equivalent to the Type 1 concept of ghost hints (see section on ghost hints, page 57, of “Adobe Type 1 Font Format”). They are used to locate an edge rather than a stem that has two edges. A stem width value of –20 is reserved for a top or right edge, and a value of –21 for a bottom or left edge. The operation of hints with other negative width values is undefined.

hintmask

The hintmask operator has the same function as that described in “Changing Hints within a Character,” section 8.1, page 69, of “Adobe Type 1 Font Format.” It provides a means for activating or deactivating stem hints so that only a set of non-overlapping hints are active at one time. The hintmask operator is followed by one or more data bytes that specify the stem hints which are to be active for the subsequent path construction. The number of data bytes must be exactly the number needed to represent the number of stems in the original stem list (those stems specified by the hstem, vstem, hstemhm, and vstemhm commands), using one bit in the data bytes for each stem in the original stem list. Bits with a value of one indicate stems that are active, and a value of zero indicates stems that are inactive.
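The byte-count rule above can be illustrated with a short Python sketch. The function names are hypothetical, and the bit order (stem 0 in the most significant bit of the first data byte) is an assumption carried over from the Type 2 convention rather than something stated in this section.

```python
def hintmask_byte_count(stem_count: int) -> int:
    """Number of mask data bytes needed when each stem uses one bit."""
    return (stem_count + 7) // 8


def active_stems(mask: bytes, stem_count: int) -> list[int]:
    """Return the indices of stems whose mask bit is set to one.

    Assumption: stem 0 maps to the most significant bit of the first byte.
    """
    if len(mask) != hintmask_byte_count(stem_count):
        raise ValueError("mask length does not match the stem count")
    return [i for i in range(stem_count) if mask[i // 8] & (0x80 >> (i % 8))]
```

For a glyph with three stems, a single data byte of binary 10100000 would mark the first and third stems as active under this assumed bit order.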

cntrmask

The cntrmask (countermask) hint causes arbitrary but nonoverlapping collections of counter spaces in a character to be controlled in a manner similar to how stem widths are controlled by the stem hint commands (see Adobe Technical Note #5015, “The Type 1 Font Format Supplement” for more information). The cntrmask operator is followed by one or more data bytes that specify the index number of the stem hints on both sides of a counter space. The number of data bytes must be exactly the number needed to represent the number of stems in the original stem list (those stems specified by the hstem, vstem, hstemhm, or vstemhm commands), using one bit in the data bytes for each stem in the original stem list.

For the example shown in Figure 1, the stem list for the glyph would be:

and the following cntrmask commands would be used to control the counter spaces between those stems:

The bits set in the data bytes indicate that the corresponding stem hints delimit the desired set of counters. Hints specified in the first command have a higher priority than those in the second command. Notice that the V4 stem does not delimit an appropriate counter space, and hence is not referenced in this example.

Note that hints are just that, hints, or recommendations. They are additional guidelines to an intelligent rasterizer.

If the font’s LanguageGroup is not equal to 1 (a LanguageGroup value of 1 indicates complex Asian language glyphs), the cntrmask operator, with three stems, can be used in place of the hstem3 and vstem3 hints in the Type 1 format, as long as the related conditions specified in the Type 1 specification are met. For more information on Counter Control hints, see Adobe Technical Note #5015, “Type 1 Font Format Supplement.”

2.2 The Flex Mechanism

The flex mechanism is provided to improve the rendering of shallow curves, representing them as line segments at small sizes rather than as small humps or dents in the character shape. It is essentially a path construction mechanism: the arguments describe the construction of two curves, with an additional argument that is used as a hint for when the curves should be rendered as a straight line at smaller sizes and resolutions.

The CFF 2 flex mechanism is general; there are no restrictions on what type or orientation of curve may be expressed with a flex operator. The flex operator is used for the general case; special cases can use the flex1, hflex, or hflex1 operators for a more efficient encoding. Figure 2 shows an example of the flex mechanism used for a horizontal curve, and Figure 3 shows an example of flex curves at non-standard angles.

The flex operators can be used for any curved character feature, in any orientation or depth, that meets the following requirements:

2.3 Subroutines

A CFF 2 font program can use subroutines to reduce the storage requirements by combining the program statements that describe common elements of the characters in the font.

Subroutines may be local or global. Local subroutines are only accessible from the charstring programs in the current font. Global subroutines are those that are shared amongst the various fonts in a FontSet (see Adobe Technical Note #5176, “The CFF Font Format Specification” for more information).

Subroutines may contain sections of charstrings, and are encoded the same as CFF 2 charstrings. They are called with the callsubr (for a local subroutine) or callgsubr (for a global subroutine) operator, using a biased index into the local or global Subrs array as the argument.

Note 1. Unlike the biasing in the Type 1 format, in CFF 2 the bias is not optional, and is fixed — based on the number of subroutines.

Charstring subroutines may call other subroutines, to the depth allowed by the implementation limits (see Appendix B). A charstring subroutine must end with either an endchar or a return operator. If the subroutine ends with an endchar operator, the return is not necessary.

3 Charstring Encoding

A CFF 2 charstring program is a sequence of unsigned 8-bit bytes that encode numbers and operators. The byte value specifies an operator, a number, or subsequent bytes that are to be interpreted in a specific manner.

The bytes are decoded into numbers and operators. One reason the format is more economical than Type 1 is that the CFF 2 charstring interpreter is required to count the number of arguments on the argument stack. It can thus detect additional sets of arguments for a single operator. The stack depth implementation limit is specified in Appendix B.

A number, decoded from a charstring, is pushed onto the CFF 2 argument stack. An operator expects its arguments in order from this argument stack with all arguments generally taken from the bottom of the stack (first argument bottom-most); however, some operators, particularly the subroutine operators, normally work from the top of the stack. If an operator returns results, they are pushed onto the CFF 2 argument stack (last result topmost).

In the following discussion, all numeric constants are decimal numbers, except where indicated.

3.1 CFF 2 Charstring Organization

The sequence and form of a CFF 2 charstring program may be represented as:

Where:

and the following symbols indicate specific usage:

Stated in words, the constraints on the sequence of operators in a charstring are as follows:

CFF 2 charstrings must be structured with operators, or classes of operators, sequenced in the following specific order:

  1. Hints: zero or more of each of the following hint operators, in exactly the following order: hstem, hstemhm, vstem, vstemhm, cntrmask, hintmask. Each entry is optional, and each may be expressed by one or more occurrences of the operator. The hint operators cntrmask and/or hintmask must not occur if the charstring has no stem hints.
  2. Path Construction: The first path of a charstring that contains no hints must begin with one of the moveto operators so that the preceding width can be detected properly.
  3. Zero or more path construction operators are used to draw the path of the character; the second and all subsequent subpaths must also begin with one of the moveto operators. The hintmask operator may be used as needed.

Note 2. Charstrings may contain subr and gsubr calls as desired at any point between complete tokens (operators or numbers). This means that a subr (gsubr) call must not occur between the bytes of a multibyte command (for example, hintmask).

3.2 Charstring Number Encoding

A charstring byte containing the values from 32 through 254 inclusive indicates an integer. These values are decoded in three ranges (also see Table 1):

If the charstring byte contains the value 255, the next four bytes indicate a two’s complement signed number. The first of these four bytes contains the highest order bits, the second byte contains the next higher order bits and the fourth byte contains the lowest order bits. This number is interpreted as a Fixed; that is, a signed number with 16 bits of fraction.

Note 3. The CFF 2 interpretation of a number encoded in five bytes (those with an initial byte value of 255) differs from how it is interpreted in the Type 1 format.

In addition to the 32 to 255 range of values, a ShortInt value is specified by using the operator (28) followed by two bytes which represent numbers between –32768 and +32767. The most significant byte follows the (28). This allows a more compact representation of large numbers which occur occasionally in fonts, but perhaps more importantly, this will allow more compact encoding of numbers which may be used as arguments to callsubr and callgsubr.


Table 1. CFF 2 Charstring Encoding Values

Charstring Byte Value | Interpretation | Number Range Represented | Bytes Required
0 – 11 | operators | operators 0 to 11 | 1
12 | escape: next byte interpreted as additional operators | additional 0 to 255 range for operator codes | 2
13 – 18 | operators | operators 13 to 18 | 1
19, 20 | operators (hintmask and cntrmask) | operators 19, 20 | 2 or more
21 – 27 | operators | operators 21 to 27 | 1
28 | following 2 bytes interpreted as a 16-bit two’s complement number | –32768 to +32767 | 3
29 – 31 | operators | operators 29 to 31 | 1
32 – 246 | result = v–139 | –107 to +107 | 1
247 – 250 | with next byte, w, result = (v–247)*256+w+108 | +108 to +1131 | 2
251 – 254 | with next byte, w, result = –[(v–251)*256]–w–108 | –108 to –1131 | 2
255 | next 4 bytes interpreted as a 32-bit two’s-complement number | 16-bit signed integer with 16 bits of fraction | 5
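
The number encodings described in this section can be sketched in Python as follows. This is a minimal illustration, not a reference implementation: the function name decode_number is hypothetical, and the sketch assumes that the negative two-byte range uses initial byte values 251 through 254.

```python
def decode_number(data: bytes, pos: int = 0):
    """Decode one number starting at data[pos]; return (value, next_pos)."""
    v = data[pos]
    if 32 <= v <= 246:
        # One-byte integer in the range -107 to +107.
        return v - 139, pos + 1
    if 247 <= v <= 250:
        # Two-byte positive integer in the range +108 to +1131.
        return (v - 247) * 256 + data[pos + 1] + 108, pos + 2
    if 251 <= v <= 254:
        # Two-byte negative integer in the range -108 to -1131.
        return -(v - 251) * 256 - data[pos + 1] - 108, pos + 2
    if v == 28:
        # ShortInt: 16-bit two's complement, most significant byte first.
        return int.from_bytes(data[pos + 1:pos + 3], "big", signed=True), pos + 3
    if v == 255:
        # Fixed: 32-bit two's complement with 16 bits of fraction.
        raw = int.from_bytes(data[pos + 1:pos + 5], "big", signed=True)
        return raw / 65536, pos + 5
    raise ValueError(f"byte value {v} is an operator, not a number")
```

For example, the bytes 255, 0, 1, 128, 0 decode to the Fixed value 1.5, and the bytes 28, 255, 254 decode to the ShortInt value -2.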

3.3 CharString Operator Encoding

Charstring operators are encoded in one or two bytes.

Single byte operators are encoded in one byte that contains a value between 0 and 31 inclusive, excluding 12 and 28. Not all possible operator encoding values are defined (see Appendix A for a list of operator encoding values). The behavior of undefined operators is unspecified.

If an operator byte contains the value 12, then the value in the next byte specifies an operator. This escape mechanism allows many extra operators to be encoded.
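
As an illustration of the one-byte and two-byte operator encodings, the following Python sketch reads a single operator from a byte string. The name read_operator and the tuple representation of operator codes are assumptions made for this example. The sketch does not consume the mask data bytes that follow hintmask and cntrmask.

```python
def read_operator(data: bytes, pos: int):
    """Read one operator at data[pos]; return (code, next_pos).

    The code is a tuple: (b0,) for a one-byte operator,
    (12, b1) for a two-byte (escaped) operator.
    """
    b0 = data[pos]
    if b0 == 28 or b0 > 31:
        raise ValueError("byte starts a number, not an operator")
    if b0 == 12:
        # Escape: the next byte selects the operator.
        return (12, data[pos + 1]), pos + 2
    return (b0,), pos + 1
```

Returning tuples keeps one-byte code 35 (a number byte, never an operator) distinct from the escaped operator 12 35.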

4 Charstring Operators

CFF 2 charstring operators are divided into five groups, classified by function: 1) path construction; 2) finishing a path; 3) hints; 4) subroutine; and 5) variation data support.

The following definitions use a format similar to that used in the PostScript Language Reference Manual. Parentheses following the operator name either include the operator value that represents this operator in a charstring byte, or the two values (beginning with 12) that represent a two-byte operator.

Many operators take their arguments from the bottom-most entries in the CFF 2 argument stack; this behavior is indicated by the stack bottom symbol ‘|-’ appearing to the left of the first argument. Operators that clear the argument stack are indicated by the stack bottom symbol ‘|-’ in the result position of the operator definition.

Because of this stack-clearing behavior, in general, arguments are not accumulated on the CFF 2 argument stack for later removal by a sequence of operators; arguments generally may be supplied only for the next operator. Notable exceptions occur with subroutine calls and with arithmetic and conditional operators. All stack operations must observe the stack limit (see Appendix B).

4.1 Path Construction Operators

In a CFF 2 charstring, a path is constructed by sequential application of one or more path construction operators. The current point is initially the (0, 0) point of the character coordinate system. The operators listed in this section cause the current point to change, either by a moveto operation, or by appending one or more curve or line segments to the current point. Upon completion of the operation, the current point is updated to the position to which the move was made, or to the last point on the segment or segments.

Many of the operators can take multiple sets of arguments, which indicate a series of path construction operations. The number of operations is limited only by the limit on the stack size (see Appendix B).

All Bézier curve path segments are drawn using six arguments, dxa, dya, dxb, dyb, dxc, dyc; where dxa and dya are relative to the current point, and all subsequent arguments are relative to the previous point. A number of the curve operators take advantage of the situation where some tangent points are horizontal or vertical (and hence the value is zero), thus reducing the number of arguments needed.
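
The relative-coordinate convention for the six curve arguments can be sketched in Python. The function name curve_points is hypothetical; the sketch simply accumulates each delta pair onto the previous point, as described above.

```python
def curve_points(current, args):
    """Convert dxa dya dxb dyb dxc dyc into three absolute points.

    The first pair is relative to the current point; each later pair
    is relative to the point produced by the pair before it.
    """
    if len(args) != 6:
        raise ValueError("a single curve segment takes six arguments")
    x, y = current
    points = []
    for i in range(0, 6, 2):
        x += args[i]
        y += args[i + 1]
        points.append((x, y))
    return points
```

The last point returned becomes the new current point for the next path construction operation.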

The flex operators are considered path construction commands because they specify the drawing of two curves. There is also an additional argument that serves as a hint as to when to render the curves as a straight line at small sizes and low resolutions.

The following are three types of moveto operators. For the initial moveto operators in a charstring, the arguments are relative to the (0, 0) point in the character’s coordinate system; subsequent moveto operators’ arguments are relative to the current point.

Every character path and subpath must begin with one of the moveto operators. If the current path is open when a moveto operator is encountered, the path is closed before performing the moveto operation.

4.2 Finishing a CharString Outline Definition

CFF 2 CharStrings differ from Type 2 CharStrings in that there is no operator for finishing a charstring outline definition. The end of the CharString data implies the end of the path, and serves the same purpose as the Type 2 endchar operator.

The smallest legal CharString is simply an empty byte string.

4.3 Hint Operators

All hints must be declared at the beginning of the charstring program, after the width (see section 3.1 for details).

4.4 Subroutine Operators

The numbering of subroutines is encoded more compactly by using the negative half of the number space, which effectively doubles the number of compactly encodable subroutine numbers. The bias applied depends on the number of subrs (gsubrs). If the number of subrs (gsubrs) is less than 1240, the bias is 107. Otherwise if it is less than 33900, it is 1131; otherwise it is 32768. This bias is added to the encoded subr (gsubr) number to find the appropriate entry in the subr (gsubr) array. Global subroutines may be used in a FontSet even if it only contains one font.
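
The bias rule above can be expressed as a short Python sketch. The function names subr_bias and subr_index are hypothetical.

```python
def subr_bias(count: int) -> int:
    """Bias for a subr (or gsubr) array holding `count` subroutines."""
    if count < 1240:
        return 107
    if count < 33900:
        return 1131
    return 32768


def subr_index(encoded: int, count: int) -> int:
    """Array index for an encoded callsubr or callgsubr operand."""
    return encoded + subr_bias(count)
```

With fewer than 1240 subroutines, the encoded operand -107 selects entry 0 of the array, so indices 0 through 214 all fit in the one-byte number range.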

4.5 Variation Data Operators

In order to support variation data in CFF 2 CharStrings, two new operators are added in CFF 2 CharStrings: vsindex and blend.

A variable font holds data representing the equivalent of several distinct design variations, and uses algorithms for interpolation — or blending — between these designs to derive a continuous range of design instances. This allows an entire family of fonts to be represented by a single variable font. For example, a variable font may contain data equivalent to Light and Heavy designs from a family, which can then be interpolated to derive instances for any weight in between Light and Heavy. Moreover, the single variable font provides even greater functionality since it supports a continuous range of variation between those source designs.

See the chapter, OpenType Font Variations Overview for general background on OpenType Font Variations, details on the tables used to support a variable font, terminology, and a specification of the interpolation algorithm used to blend values to derive specific design instances.

Outline data for a variable font in the CFF 2 format is built much like a non-variable CFF 2 table would be built, with exactly the same structure and operators as would be used for the default design representation. However, wherever a value occurs in the default design, the single value for the one design is replaced with a set of delta values, followed by the blend operator. (For efficiency, the blend operator is actually called only at the end of a series of such delta sets, rather than after each one.)

Within a variable font, different glyphs can use different sets of regions and associated delta values for the blending operation. When processing a given glyph, the interpreter must determine which set to use. These sets are stored in the CFF 2 table in an ItemVariationStore structure. The ItemVariationStore contains one or more ItemVariationData subtables, each of which contains a list of Variation Regions. The first ItemVariationData subtable defines the default subtable to use, when no other has been specified. When an ItemVariationData subtable other than the default is needed for a set of delta values, the vsindex operator is used. When this operator is used in a Private DICT to set a non-default itemVariationData index, this then becomes the default Item Variation Data index for not only the Private DICT, but also all CharStrings that reference that Private DICT, until the vsindex operator is used again.

Syntax for Font Variations support operators.

vsindex ivs vsindex (15)

Sets the scalars for blending a set of delta values, using the itemVariationData index specified by the unsigned integer index ivs. The index ivs is interpreted as a variation store itemVariationData index for the VariationStore structure. If the vsindex operator is not present in a Private DICT, then the default vsindex value is 0. When present, it must be the first operator in the Private DICT. It also sets the default vsindex value for all CharStrings that reference the Private DICT.

blend num(0)...num(n-1), num(0,0)...num(k-1,0), ..., num(0,n-1)...num(k-1,n-1) n blend (16) val(0)...val(n-1)

For k regions, produces n interpolated result value(s) from n default values and n*k delta values.

The blend operator uses the same set of delta values as used in the TrueType blending model. n is the number of values that will be left on the stack for the next operator. This is passed as the last argument on the stack, and lets the blend operator know that it is preceded by n+1 groups of operands. The first group contains n arguments that are the actual n values on the stack from the default design. The following n groups each contain k delta values, where k is the number of regions other than the default design. In each of these following groups, the num(i,j) value is for item j on the default design operand stack, and is the delta value for region i. For example, consider an rmoveto operator in a glyph with 3 master source designs and two design axes, Weight and Width, where the 'Light' master source design is used to provide the default design, and 'Bold' and 'Condensed' are used to provide the regions and delta values for the Bold and Condensed regions.

The first group of arguments are the n values from the default design:

100 150

For each following group i for i = 0 to n-1, the values are for stack element i from each design other than the default. Parentheses are added here to denote each group.

(100 150) (100 50) (300 100) 2 blend rmoveto

The actual arguments in the groups after the first group are delta values, rather than the actual design values: the difference between the value for the region and the default design value:

(100 150) (0 -50) (150 -50) 2 blend rmoveto

See the chapter OpenType Font Variations Overview for a full discussion of the calculation of delta values, which is usually more complex than shown in this simple case.
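
The operand layout of the rmoveto example can be made concrete with a Python sketch. It assumes that each result is the default value plus the sum of each region's scalar multiplied by the corresponding delta, which is the general interpolation model of OpenType Font Variations. The function name blend and the scalars parameter (one scalar per region, as selected by vsindex) are hypothetical; deriving the scalars from a design instance is outside the scope of the sketch.

```python
def blend(stack, scalars):
    """Apply a blend to the top of `stack` and return the new stack.

    Expected layout at the top of the stack:
      n default values, then n groups of k deltas, then n,
    where k == len(scalars).
    """
    k = len(scalars)
    n = int(stack[-1])
    needed = n * (k + 1) + 1
    if n < 0 or len(stack) < needed:
        raise ValueError("not enough operands for blend")
    base = len(stack) - needed
    defaults = stack[base:base + n]
    deltas = stack[base + n:base + n + n * k]
    results = [
        defaults[j] + sum(s * d for s, d in zip(scalars, deltas[j * k:(j + 1) * k]))
        for j in range(n)
    ]
    # Anything below the blend operands is left untouched.
    return stack[:base] + results
```

With the operands 100 150 0 -50 150 -50 2, a scalar of 1.0 for the Bold region and 0.0 for the Condensed region gives 100 and 300, the Bold design values. Swapping the scalars gives 50 and 100, the Condensed design values.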

Appendix A. CFF 2 Charstring Command Codes

One-byte CFF 2 Operators

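Dec | Hex | Operator | Dec | Hex | Operator
0 | 00 | –Reserved– | 18 | 12 | hstemhm
1 | 01 | hstem | 19 | 13 | hintmask
2 | 02 | –Reserved– | 20 | 14 | cntrmask
3 | 03 | vstem | 21 | 15 | rmoveto
4 | 04 | vmoveto | 22 | 16 | hmoveto
5 | 05 | rlineto | 23 | 17 | vstemhm
6 | 06 | hlineto | 24 | 18 | rcurveline
7 | 07 | vlineto | 25 | 19 | rlinecurve
8 | 08 | rrcurveto | 26 | 1a | vvcurveto
9 | 09 | –Reserved– | 27 | 1b | hhcurveto
10 | 0a | callsubr | 28 | 1c | shortint
11 | 0b | return | 29 | 1d | callgsubr
12 | 0c | escape | 30 | 1e | vhcurveto
13 | 0d | –Reserved– | 31 | 1f | hvcurveto
14 | 0e | –Reserved– | 32–246 | 20–f6 | <numbers>
15 | 0f | vsindex | 247–254 | f7–fe | <numbers>
16 | 10 | blend | 255 | ff | <number>
17 | 11 | –Reserved– | | |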

Two-byte CFF 2 Operators

Note 6. CFF 2 Charstrings do not support any of the arithmetic, storage, or conditional operators that are supported by Type 2 CharStrings.

Dec | Hex | Operator | Dec | Hex | Operator
12 0 | 0c 00 | –Reserved– | 12 20 | 0c 14 | –Reserved–
12 1 | 0c 01 | –Reserved– | 12 21 | 0c 15 | –Reserved–
12 2 | 0c 02 | –Reserved– | 12 22 | 0c 16 | –Reserved–
12 3 | 0c 03 | –Reserved– | 12 23 | 0c 17 | –Reserved–
12 4 | 0c 04 | –Reserved– | 12 24 | 0c 18 | –Reserved–
12 5 | 0c 05 | –Reserved– | 12 25 | 0c 19 | –Reserved–
12 6 | 0c 06 | –Reserved– | 12 26 | 0c 1a | –Reserved–
12 7 | 0c 07 | –Reserved– | 12 27 | 0c 1b | –Reserved–
12 8 | 0c 08 | –Reserved– | 12 28 | 0c 1c | –Reserved–
12 9 | 0c 09 | –Reserved– | 12 29 | 0c 1d | –Reserved–
12 10 | 0c 0a | –Reserved– | 12 30 | 0c 1e | –Reserved–
12 11 | 0c 0b | –Reserved– | 12 31 | 0c 1f | –Reserved–
12 12 | 0c 0c | –Reserved– | 12 32 | 0c 20 | –Reserved–
12 13 | 0c 0d | –Reserved– | 12 33 | 0c 21 | –Reserved–
12 14 | 0c 0e | –Reserved– | 12 34 | 0c 22 | hflex
12 15 | 0c 0f | –Reserved– | 12 35 | 0c 23 | flex
12 16 | 0c 10 | –Reserved– | 12 36 | 0c 24 | hflex1
12 17 | 0c 11 | –Reserved– | 12 37 | 0c 25 | flex1
12 18 | 0c 12 | –Reserved– | 12 38–12 255 | 0c 26–0c ff | –Reserved–
12 19 | 0c 13 | –Reserved– | | |

Appendix B. CFF 2 Charstring Implementation Limits

The following are the implementation limits of the CFF 2 charstring interpreter:

Description | Limit
Argument stack | 1931
Number of stem hints (H/V total) | 96
Subr nesting, stack limit | 10
Charstring length | 65535
maximum (g)subrs count | 65536
TransientArray elements | 32

Appendix C. Compatibility and Deprecated Operators

CFF 2 CharStrings may not be used with CFF font tables. The CFF 2 CharStrings will be missing the endchar and return operators required by Type 2 CharStrings, and do not provide the width values that are required by CFF 1.0 font tables. In addition, CFF 2 Charstrings may include operators (blend and vsindex) that are not supported by interpreters for CFF 1.0 font tables.

Appendix D Changes From Type 2 CharStrings

Appendix E Changes Since Earlier Versions

The following changes and revisions have been made since the initial publication date of 8 August 2016.


This page was last updated 6 September 2016.

© 2016 Microsoft Corporation. All rights reserved. Terms of use.