Stylesheet

A stylesheet is a collection of styles. In Word, each document has its own stylesheet.

A style is a set of formatting information collected together and given a name. Word 6.0 supports paragraph and character styles; previous versions supported only paragraph styles. Character styles have just character formatting, whereas paragraph styles have both character and paragraph formatting. The stylesheet establishes a correspondence between a style code and a style definition.

Note that the storage and behavior of styles have changed radically since Word 2 for Windows, beginning with nFib 63. Some of the differences are:

  • Character styles are supported.
  • The style code is called an istd, rather than an stc.
  • The istd is a short, where the stc was a byte.
  • The range of the istd is 0-4095, where 4095 is the null style. The range of the stc was 0-255, with 222 as the null style.
  • PAPX's have a short istd at the beginning, rather than a byte stc.
  • CHPX's are a grpprl, not a CHP.
  • Many other changes...
This document describes only the final Word 6.0 version of the stylesheet, not the Word 2.x version.

The styles for a document (both paragraph and character styles) are stored in an array in each document. When new styles are created, they are added to the end of the array. The array can have unused slots. Some slots at the beginning of the array are reserved for specific styles, whether they have been created yet or not. Paragraph and character styles are stored in the same array. Each document has a separate array, so the same style will usually have a different istd in two different documents. Thus style matching between documents must be done by name (or by sti if the styles are built-in.)

Styles are usually referred to using an istd. The istd is an index into an array of STD's (STyle Descriptions). A (doc, istd) pair uniquely identifies a style because it tells which style in which array.

Parts of a style (for more information, see the STD structure below):

  • sti: A style identifier. Built-in styles have an sti that indicates which built-in style they are. User-defined styles all have stiUser.
  • sgc: The type of style, either paragraph or character.
  • istdBase: The style that this style is based on.
  • istdNext: The style that should be applied after this one.
  • stzName: The name of a style, unique within its stylesheet.
  • UPX: The difference between this style and the one it is based on.
  • UPE: The properties of this style (a PAP, CHP, and/or grpprl).
Every paragraph has a paragraph style. Every character has a character style. The default paragraph style is Normal (stiNormal, istdNormal). The default character style is Default Paragraph Font (stiNormalChar, istdNormalChar).

The formatting of a paragraph (the PAP) and of a character (the CHP) depends on the paragraph and character styles applied to them, as well as any additional formatting stored in the FKPs. The PAP and CHP are constructed in a layered fashion:

For a PAP:

An initial PAP is determined by getting the PAP from the paragraph's style.

Any paragraph formatting stored in the file (the FKP papx's) is then applied to that PAP.

For a CHP:

An initial CHP is determined by getting the CHP from the paragraph's style.

Properties from the character's style (the UPX.chpx.grpprl) are then applied to that CHP.

Any character formatting stored in the file (the FKP chpx's) is then applied to that CHP.

Note that the resulting PAP and CHP have fields that indicate what style was applied: PAP.istd, CHP.istd.
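
The layering order for a CHP can be sketched as follows. This is only an illustration under loud assumptions: ToyChp, ToyChpLayer, apply_layer and build_chp are hypothetical names, the CHP is reduced to two properties, and each layer is modeled as a set of optional overrides rather than a real grpprl.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, drastically simplified CHP: just two properties. */
typedef struct {
    bool fBold;
    int  hps;               /* font size in half-points */
} ToyChp;

/* Hypothetical stand-in for a grpprl: each property may be overridden. */
typedef struct {
    bool setBold; bool fBold;
    bool setHps;  int  hps;
} ToyChpLayer;

/* Apply one layer of overrides on top of an existing CHP. */
static ToyChp apply_layer(ToyChp chp, const ToyChpLayer *layer)
{
    if (layer->setBold) chp.fBold = layer->fBold;
    if (layer->setHps)  chp.hps   = layer->hps;
    return chp;
}

/* Build a character's CHP in the order described above:
   paragraph style CHP, then character style, then FKP chpx. */
static ToyChp build_chp(ToyChp paraStyleChp,
                        const ToyChpLayer *charStyle,
                        const ToyChpLayer *fkpChpx)
{
    ToyChp chp = paraStyleChp;
    chp = apply_layer(chp, charStyle);
    chp = apply_layer(chp, fkpChpx);
    return chp;
}
```

Later layers win: a property set by the FKP chpx overrides the same property coming from the character style or the paragraph style.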

Stylesheet File Format

The style sheet (STSH) is stored in the file in two parts, a STSHI and then an array of STDs. The STSHI contains general information about the following stylesheet, including how many styles are in it. After the STSHI, each style is written as an STD. Both the STSHI and each STD are preceded by a ushort that indicates their length.

Field      Size        Comment
cbStshi    2 bytes     size of the following STSHI structure
STSHI      (cbStshi)   Stylesheet Information


Then for each style in the stylesheet (stshi.cstd), the following is stored:

cbStd      2 bytes     size of the following STD structure
STD        (cbStd)     the style description
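
A reader for this layout might walk the length-prefixed records as sketched below. The function names (rd16, walk_stsh) are hypothetical, and the sketch assumes little-endian byte order and that the (cbStd, STD) pairs follow one another with no padding between them, which the text above does not state either way.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Read a little-endian 16-bit value (byte order is an assumption). */
static uint16_t rd16(const uint8_t *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

/* Walk the (cbStd, STD) pairs of a stylesheet held in buf[0..len).
   Returns the number of non-empty styles, or -1 if the data is truncated.
   If offsets is non-NULL, offsets[istd] receives the offset of the STD
   data, or 0 for an empty slot (at most maxOffsets entries are written). */
static int walk_stsh(const uint8_t *buf, size_t len,
                     size_t *offsets, size_t maxOffsets)
{
    if (len < 2) return -1;
    uint16_t cbStshi = rd16(buf);
    size_t pos = 2;
    if (cbStshi < 2 || pos + cbStshi > len) return -1;
    uint16_t cstd = rd16(buf + pos);      /* first field of the STSHI */
    pos += cbStshi;                       /* skip the whole STSHI */

    int used = 0;
    for (uint16_t istd = 0; istd < cstd; istd++) {
        if (pos + 2 > len) return -1;
        uint16_t cbStd = rd16(buf + pos);
        pos += 2;
        if (pos + cbStd > len) return -1;
        if (offsets && istd < maxOffsets)
            offsets[istd] = cbStd ? pos : 0;
        if (cbStd) used++;                /* cbStd == 0 is an empty slot */
        pos += cbStd;
    }
    return used;
}
```

Skipping by cbStshi and cbStd rather than by sizeof keeps the walk valid even when the file was written with a longer or shorter structure than the reader expects.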


STSHI:

The STSHI structure has the following format:

// STSHI: STyleSHeet Information, as stored in a file
//  Note that new fields can be added to the STSHI without invalidating
//  the file format, because it is stored preceded by its length.
//  When reading a STSHI from an older version, new fields will be zero.
typedef struct _STSHI
    {
    ushort  cstd;                          // Count of styles in stylesheet
    ushort  cbSTDBaseInFile;               // Length of STD Base as stored in a file
    BF      fStdStylenamesWritten : 1;     // Are built-in stylenames stored?
    BF   :  15;                            // Spare flags
    ushort  stiMaxWhenSaved;               // Max sti known when this file was written
    ushort  istdMaxFixedWhenSaved;         // How many fixed-index istds are there?
    ushort  nVerBuiltInNamesWhenSaved;     // Current version of built-in stylenames
    FTC     rgftcStandardChpStsh[3];       // ftc used by StandardChpStsh for this document
    } STSHI;
The cb preceding the STSHI in the file is the length of the STSHI as stored in the file. The current definition of the STSHI structure might be longer or shorter than that stored in the file, so the stylesheet reader routine needs to take this into account.
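
One way a reader could tolerate a shorter or longer STSHI is sketched below. StshiMem and read_stshi are hypothetical names; the sketch assumes little-endian storage, a 16-bit FTC, and that fStdStylenamesWritten occupies the lowest bit of its flag word.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* In-memory copy of the STSHI fields, unpacked to plain integers.
   (Hypothetical layout for this sketch; FTC is assumed to be 16 bits.) */
typedef struct {
    uint16_t cstd;
    uint16_t cbSTDBaseInFile;
    uint16_t fStdStylenamesWritten;     /* assumed to be bit 0 of word 2 */
    uint16_t stiMaxWhenSaved;
    uint16_t istdMaxFixedWhenSaved;
    uint16_t nVerBuiltInNamesWhenSaved;
    uint16_t rgftcStandardChpStsh[3];
} StshiMem;

#define STSHI_WORDS 9   /* number of 16-bit words this reader knows about */

/* Copy at most cbStshi bytes from the file image. Fields absent from the
   file stay zero; bytes beyond the known fields are ignored. */
static void read_stshi(const uint8_t *p, uint16_t cbStshi, StshiMem *out)
{
    uint16_t w[STSHI_WORDS] = {0};
    size_t avail = cbStshi / 2;
    if (avail > STSHI_WORDS) avail = STSHI_WORDS;
    for (size_t i = 0; i < avail; i++)
        w[i] = (uint16_t)(p[2 * i] | (p[2 * i + 1] << 8));

    out->cstd                      = w[0];
    out->cbSTDBaseInFile           = w[1];
    out->fStdStylenamesWritten     = (uint16_t)(w[2] & 1);
    out->stiMaxWhenSaved           = w[3];
    out->istdMaxFixedWhenSaved     = w[4];
    out->nVerBuiltInNamesWhenSaved = w[5];
    out->rgftcStandardChpStsh[0]   = w[6];
    out->rgftcStandardChpStsh[1]   = w[7];
    out->rgftcStandardChpStsh[2]   = w[8];
}
```

Zero-filling first and then copying only what the file provides matches the rule in the structure comment that new fields read as zero when the file comes from an older version.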

stshi.cstd: The number of styles in this stylesheet. There will be stshi.cstd (cbSTD, STD) pairs in the file following the STSHI. Note that styles can be empty, i.e. cbSTD == 0.

stshi.cbSTDBaseInFile: The STD structure (see below) is divided into a fixed-length "base", and a variable length part. The stshi.cbSTDBaseInFile indicates the size in bytes of the fixed-length base of the STD as it was written in this file. If the STD base is grown in a future version, the file format doesn't change, because the stylesheet reader can discard parts it doesn't know about, or use defaults if the file's STD is not as large as it was expecting. (Currently, stshi.cbSTDBaseInFile is 8.)

stshi.fStdStylenamesWritten: Previous versions of Word did not store the style name if the style was a built-in style; Word 6.0 does, for compatibility with future versions. Note that the built-in stylenames may need to be "regenerated" if the file is opened in a different language or if stshi.nVerBuiltInNamesWhenSaved doesn't match the expected value.

stshi.stiMaxWhenSaved: This indicates the last built-in style known to the version of Word that saved this file.

stshi.istdMaxFixedWhenSaved: Each array of styles has some fixed-index styles at the beginning. This indicates the number of fixed-index positions reserved in the stylesheet when it was saved.

stshi.nVerBuiltInNamesWhenSaved: Since built-in stylenames are saved with the document, this provides a way to see if the saved names are the same "version" as the names in the version of Word that is loading the file. If not, the built-in stylenames need to be "regenerated", i.e. the old names need to be replaced with the new.

stshi.rgftcStandardChpStsh: These are the default fonts for this stylesheet. The first is for ASCII characters (0-127), the second is for Far East characters, and the third is the default font for non-Far East, non-ASCII text. See notes on sprmCRgftcX for details.

STD:

The style description is stored in an STD structure as follows:

// STD: STyle Definition
//   The STD contains the entire definition of a style.
//   It has two parts, a fixed-length base (cbSTDBase bytes long)
//   and a variable length remainder holding the name, and the upx and upe
//   arrays (a upx and upe for each type stored in the style, std.cupx)
//   Note that new fields can be added to the BASE of the STD without
//   invalidating the file format, because the STSHI contains the length
//   that is stored in the file.  When reading STDs from an older version,
//   new fields will be zero.
typedef struct _STD
    {
    // Base part of STD:
    ushort    sti : 12;          /* invariant style identifier */
    ushort    fScratch : 1;      /* spare field for any temporary use,
                                    always reset back to zero! */
    ushort    fInvalHeight : 1;  /* PHEs of all text with this style are wrong */
    ushort    fHasUpe : 1;       /* UPEs have been generated */
    ushort    fMassCopy : 1;     /* std has been mass-copied; if unused at
                                    save time, style should be deleted */
    ushort    sgc : 4;           /* style type code */
    ushort    istdBase : 12;     /* base style */
    ushort    cupx : 4;          /* # of UPXs (and UPEs) */
    ushort    istdNext : 12;     /* next style */
    ushort    bchUpe;            /* offset to end of upx's, start of upe's */

    ushort    fAutoRedef : 1;    /* auto redefine style when appropriate */
    ushort    fHidden : 1;       /* hidden from UI? */
    ushort : 14;                 /* unused bits */

    // Variable length part of STD:
    XCHAR    xstzName[2];        /* sub-names are separated by chDelimStyle */
    /* char  grupx[]; */
    /* the UPEs are not stored on the file; they are a cache of the based-on
       chain */
    /* char  grupe[]; */
    } STD;
The cb preceding each STD is the length of the data, which includes all of the STD except the grupe array (which is derived after the file is read in, by building each UPE from the base style UPE plus the exceptions in the UPX.) A cb of zero indicates an empty slot in the style array, i.e. no style has that istd. Note that the STD structure may be longer or shorter than the one stored in the file; stshi.cbSTDBaseInFile indicates the length of the base of the STD (up to stzName) as stored in the file. The stylesheet reader routine has to take this into account.
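
Because C bit-field layout is compiler-dependent, a portable reader may prefer to unpack the base by hand. The sketch below decodes the first four words of the STD base; StdBase, rd16 and decode_std_base are hypothetical names, and the sketch assumes little-endian words with bit-fields allocated from the least significant bit upward.

```c
#include <assert.h>
#include <stdint.h>

/* Unpacked copy of the first four words of the STD base (hypothetical type). */
typedef struct {
    unsigned sti, fScratch, fInvalHeight, fHasUpe, fMassCopy;
    unsigned sgc, istdBase;
    unsigned cupx, istdNext;
    unsigned bchUpe;
} StdBase;

/* Read a little-endian 16-bit value (byte order is an assumption). */
static uint16_t rd16(const uint8_t *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

/* Decode the base from p; the caller must ensure that at least 8 bytes
   are available (check cbStd and stshi.cbSTDBaseInFile first). */
static void decode_std_base(const uint8_t *p, StdBase *b)
{
    uint16_t w0 = rd16(p), w1 = rd16(p + 2), w2 = rd16(p + 4), w3 = rd16(p + 6);

    b->sti          = w0 & 0x0FFF;        /* low 12 bits */
    b->fScratch     = (w0 >> 12) & 1;
    b->fInvalHeight = (w0 >> 13) & 1;
    b->fHasUpe      = (w0 >> 14) & 1;
    b->fMassCopy    = (w0 >> 15) & 1;

    b->sgc          = w1 & 0x000F;        /* low 4 bits */
    b->istdBase     = w1 >> 4;            /* high 12 bits */

    b->cupx         = w2 & 0x000F;
    b->istdNext     = w2 >> 4;

    b->bchUpe       = w3;
}
```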

The variable-length part of the STD actually has three variable-length subparts, the xstzName, the grupx, and the grupe. Since this doesn't fit well into a C structure declaration, some processing is needed to figure out where one part stops and the next part begins. An important note is that all variable-length parts and subparts of the STD begin on EVEN-BYTE OFFSETS within the STD, even if the length of the preceding variable-length part was odd.

std.sti: The sti identifies which built-in style this is, or is stiUser for a user-defined style. An sti is intended to be permanent through versions of Word, although new sti's may be added in new versions. The sti definitions are:

// standard sti codes - these are invariant identifiers for built-in styles
// and must remain the same (i.e. don't renumber them, or old files will be
// messed up.)
// NOTE: sti and istd are the same for Normal and level styles
// If you want to define a new built-in style:
//   1) Decide if you really need one--it will exist in all future versions!
//   2) Add a new sti below.  You can take the first available slot.
//   3) Change stiMax, and stiPapMax or stiChpMax
//   4) Add entry to _dnsti, and the two ids's in strman.pp
//   5) Add case in GetDefaultUpdForSti
//   6) Change cstiMaxBuiltinDependents if necessary
// If you want to change the definition of a built-in style
//   1) In order to make WinWord 2 documents that use the style look like
//      they did in WinWord 2, add a case in GetDefaultUpdForSti to handle
//      fOldDef.  This definition will be used when converting WinWord 2
//      stylesheets.
//   2) If you change the name of a built-in style, increment nVerBuiltInNames
#define stiNormal      0     // 0x0000

#define stiLev1        1     // 0x0001
#define stiLev2        2     // 0x0002
#define stiLev3        3     // 0x0003
#define stiLev4        4     // 0x0004
#define stiLev5        5     // 0x0005
#define stiLev6        6     // 0x0006
#define stiLev7        7     // 0x0007
#define stiLev8        8     // 0x0008
#define stiLev9        9     // 0x0009
#define stiLevFirst    stiLev1
#define stiLevLast     stiLev9

#define stiIndex1      10    // 0x000A
#define stiIndex2      11    // 0x000B
#define stiIndex3      12    // 0x000C
#define stiIndex4      13    // 0x000D
#define stiIndex5      14    // 0x000E
#define stiIndex6      15    // 0x000F
#define stiIndex7      16    // 0x0010
#define stiIndex8      17    // 0x0011
#define stiIndex9      18    // 0x0012
#define stiIndexFirst  stiIndex1
#define stiIndexLast   stiIndex9

#define stiToc1        19    // 0x0013
#define stiToc2        20    // 0x0014
#define stiToc3        21    // 0x0015
#define stiToc4        22    // 0x0016
#define stiToc5        23    // 0x0017
#define stiToc6        24    // 0x0018
#define stiToc7        25    // 0x0019
#define stiToc8        26    // 0x001A
#define stiToc9        27    // 0x001B
#define stiTocFirst    stiToc1
#define stiTocLast     stiToc9

#define stiNormIndent  28    // 0x001C
#define stiFtnText     29    // 0x001D
#define stiAtnText     30    // 0x001E
#define stiHeader      31    // 0x001F
#define stiFooter      32    // 0x0020
#define stiIndexHeading 33   // 0x0021
#define stiCaption     34    // 0x0022
#define stiToCaption   35    // 0x0023
#define stiEnvAddr     36    // 0x0024
#define stiEnvRet      37    // 0x0025
#define stiFtnRef      38    // 0x0026  char style
#define stiAtnRef      39    // 0x0027  char style
#define stiLnn         40    // 0x0028  char style
#define stiPgn         41    // 0x0029  char style
#define stiEdnRef      42    // 0x002A  char style
#define stiEdnText     43    // 0x002B
#define stiToa         44    // 0x002C
#define stiMacro       45    // 0x002D
#define stiToaHeading  46    // 0x002E
#define stiList        47    // 0x002F
#define stiListBullet  48    // 0x0030
#define stiListNumber  49    // 0x0031
#define stiList2       50    // 0x0032
#define stiList3       51    // 0x0033
#define stiList4       52    // 0x0034
#define stiList5       53    // 0x0035
#define stiListBullet2 54    // 0x0036
#define stiListBullet3 55    // 0x0037
#define stiListBullet4 56    // 0x0038
#define stiListBullet5 57    // 0x0039
#define stiListNumber2 58    // 0x003A
#define stiListNumber3 59    // 0x003B
#define stiListNumber4 60    // 0x003C
#define stiListNumber5 61    // 0x003D
#define stiTitle       62    // 0x003E
#define stiClosing     63    // 0x003F
#define stiSignature   64    // 0x0040
#define stiNormalChar  65    // 0x0041  char style
#define stiBodyText    66    // 0x0042
#define stiBodyTextInd 67    // 0x0043
#define stiListCont    68    // 0x0044
#define stiListCont2   69    // 0x0045
#define stiListCont3   70    // 0x0046
#define stiListCont4   71    // 0x0047
#define stiListCont5   72    // 0x0048
#define stiMsgHeader   73    // 0x0049
#define stiSubtitle    74    // 0x004A
#define stiSalutation  75    // 0x004B
#define stiDate        76    // 0x004C
#define stiBodyText1I  77    // 0x004D
#define stiBodyText1I2 78    // 0x004E
#define stiNoteHeading 79    // 0x004F
#define stiBodyText2   80    // 0x0050
#define stiBodyText3   81    // 0x0051
#define stiBodyTextInd2 82   // 0x0052
#define stiBodyTextInd3 83   // 0x0053
#define stiBlockQuote  84    // 0x0054
#define stiHyperlink   85    // 0x0055  char style
#define stiHyperlinkFollowed 86 // 0x0056   char style
#define stiStrong      87    // 0x0057  char style
#define stiEmphasis    88    // 0x0058  char style
#define stiNavPane     89    // 0x0059  char style
#define stiPlainText   90    // 0x005A
#define stiMax         91    // number of defined sti's

#define stiUser      0x0ffe  // user styles are distinguished by name
#define stiNil       0x0fff  // max for 12 bits
See below for the names of these styles.

std.sgc: The type of each style is indicated by std.sgc. The two types currently in use are:

sgcPara    1    // A paragraph style
sgcChp     2    // A character style


More style types may exist in the future, so styles of an unknown type should be discarded.

std.istdBase: The style that this style is based on. A style is always based on another style or the null style (istdNil). Following a "chain" of based-on styles will always end at the null style, because a based-on chain cannot have a loop in it. A style can have up to 11 "ancestors" in its based-on chain, including the null style. A style's definition is built up from the style that it is based on. See std.cupx, std.grupx, std.grupe.
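
A reader may want to validate based-on chains before building style definitions. The sketch below counts ancestors and rejects chains that loop or run too long; count_ancestors and ANCESTORS_MAX are hypothetical names, and the styles are reduced to a plain array of istdBase values. The value used for istdNil (0x0FFF) follows the statement earlier that 4095 is the null style.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define istdNil        0x0FFF   /* the null style (4095) */
#define ANCESTORS_MAX  11       /* hypothetical name for the limit above */

/* rgistdBase[istd] holds the istdBase of each style in the array.
   Returns the number of ancestors of istd, counting the null style,
   or -1 if the chain leaves the array, loops, or is too long. */
static int count_ancestors(const uint16_t *rgistdBase, size_t cstd,
                           uint16_t istd)
{
    int n = 0;
    uint16_t cur = istd;
    while (cur != istdNil) {
        if (cur >= cstd) return -1;           /* index outside the array */
        cur = rgistdBase[cur];
        if (++n > ANCESTORS_MAX) return -1;   /* loop or over-long chain */
    }
    return n;
}
```

The length limit doubles as loop detection: a chain containing a cycle never reaches istdNil, so it necessarily exceeds the limit.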

std.istdNext: The style that should be applied after this one. For a paragraph style, this is the style that is applied when Enter is pressed at the end of a paragraph. For a character style, the next style is essentially ignored, but should be the same as the current style.

std.xstzName: The name of the style, including aliases. The name is stored as an xstz (preceded by a length byte, followed by a null-terminator.) A style name can contain multiple "aliases", separated by commas. Aliases are alternate names for the same style (e.g. a style named "a,b,c" has three aliases, and can be referred to by "a", "b", or "c", or any combination.) WinWord 2.x did not have aliases, but MacWord 5.x did. If a style is a built-in style, the built-in stylename is always stored first.

All names (and aliases) must be unique within a stylesheet (e.g. styles "a,b" and "b,c" should not exist in the same stylesheet, as "b" matches multiple stylenames.)

A stylename (including all its aliases and comma separators) can be up to 253 characters long. So the xstz format of that name can be up to 255 characters. Stylenames are case sensitive.
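
The uniqueness rule for names and aliases can be checked with a case-sensitive comparison of every alias pair. The sketch below uses hypothetical helpers (has_alias, names_conflict) and, as a simplification, plain null-terminated char strings rather than the xstz form stored in the file.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Return true if alias (n characters, not null-terminated) appears as a
   complete alias in the comma-separated list. Comparison is case sensitive. */
static bool has_alias(const char *list, const char *alias, size_t n)
{
    const char *p = list;
    for (;;) {
        const char *end = strchr(p, ',');
        size_t len = end ? (size_t)(end - p) : strlen(p);
        if (len == n && memcmp(p, alias, n) == 0) return true;
        if (!end) return false;
        p = end + 1;
    }
}

/* Return true if stylenames a and b share at least one alias, which
   means they may not coexist in the same stylesheet. */
static bool names_conflict(const char *a, const char *b)
{
    const char *p = a;
    for (;;) {
        const char *end = strchr(p, ',');
        size_t len = end ? (size_t)(end - p) : strlen(p);
        if (has_alias(b, p, len)) return true;
        if (!end) return false;
        p = end + 1;
    }
}
```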

The built-in stylenames (corresponding to each sti above) are defined for each language version of Word. For the USA, the names are:

// These are the names of the built-in styles as we want to present them
// to the user.
Normal
Heading 1
Heading 2
Heading 3
Heading 4
Heading 5
Heading 6
Heading 7
Heading 8
Heading 9
Index 1
Index 2
Index 3
Index 4
Index 5
Index 6
Index 7
Index 8
Index 9
TOC 1
TOC 2
TOC 3
TOC 4
TOC 5
TOC 6
TOC 7
TOC 8
TOC 9
Normal Indent
Footnote Text
Annotation Text
Header
Footer
Index Heading
Caption
Table of Figures
Envelope Address
Envelope Return
Footnote Reference
Annotation Reference
Line Number
Page Number
Endnote Reference
Endnote Text
Table of Authorities
Macro Text
TOA Heading
List
List Bullet
List Number
List 2
List 3
List 4
List 5
List Bullet 2
List Bullet 3
List Bullet 4
List Bullet 5
List Number 2
List Number 3
List Number 4
List Number 5
Title
Closing
Signature
Default Paragraph Font
Body Text
Body Text Indent
List Continue
List Continue 2
List Continue 3
List Continue 4
List Continue 5
Message Header
Subtitle
Salutation
Date
Body Text First Indent
Body Text First Indent 2
Note Heading
Body Text 2
Body Text 3
Body Text Indent 2
Body Text Indent 3
Block Text
Hyperlink
Followed Hyperlink
Strong
Emphasis
Document Map
Plain Text
std.cupx: This is the number of UPXs in the std.grupx array. See below.

std.grupx: This is an array of variable-length UPXs, with std.cupx UPXs in the array. This array begins after the variable-length xstzName field, at the next even-byte offset within the STD. A UPX (Universal Property eXception) describes the difference in formatting of this style as compared to its based-on style. The UPX structure looks like this:

typedef union _UPX
    {
    struct
            {
            uchar grpprl[cbMaxGrpprlStyleChpx];
            } chpx;
    struct
            {
            ushort istd;
            uchar grpprl[cbMaxGrpprlStylePapx];
            } papx;
    uchar rgb[1];
    } UPX;
Each UPX stored in a file is not a complete UPX, rather it is a UPX with all trailing zero bytes lopped off, and preceded by a ushort length field. So it is stored like:

Field      Size        Comment
cbUPX      2 bytes     size of the following UPX structure
UPX        (cbUPX)     Nonzero prefix of a UPX structure


Each UPX begins on an even-byte offset within the STD, even if the length of the previous UPX (cbUPX) was odd.

The meaning of each UPX depends on the style type (std.sgc). For a paragraph style, std.cupx is 2. The first UPX is a paragraph UPX (UPX.papx) and the second UPX is a character UPX (UPX.chpx). For a character style, std.cupx is 1, and that UPX is a character UPX (UPX.chpx). Note that new UPXs may be added in the future, so std.cupx might be larger than expected. Any UPXs past those expected should be discarded.
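
Locating each UPX inside an STD, including the even-byte alignment rule, might look like the sketch below. walk_grupx and rd16 are hypothetical names; the sketch assumes little-endian length fields and that the caller already knows the offset at which the style name ends.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Read a little-endian 16-bit value (byte order is an assumption). */
static uint16_t rd16(const uint8_t *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

/* Walk cupx length-prefixed UPXs inside an STD of cbStd bytes, starting
   at offset start (the end of the name). Each UPX begins on an even
   offset. For the first max entries, offs[i] and lens[i] receive the
   offset and length of the UPX body. Returns the number of UPXs walked,
   or -1 if the STD is too short. */
static int walk_grupx(const uint8_t *std, size_t cbStd, size_t start,
                      unsigned cupx, size_t *offs, uint16_t *lens,
                      unsigned max)
{
    size_t pos = start;
    unsigned i;
    for (i = 0; i < cupx; i++) {
        pos = (pos + 1) & ~(size_t)1;         /* round up to even offset */
        if (pos + 2 > cbStd) return -1;
        uint16_t cbUpx = rd16(std + pos);
        pos += 2;
        if (pos + cbUpx > cbStd) return -1;
        if (i < max) { offs[i] = pos; lens[i] = cbUpx; }
        pos += cbUpx;
    }
    return (int)i;
}
```

Passing a max smaller than cupx lets a reader record only the UPXs it understands while still skipping any extra ones correctly.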

The grpprl within each UPX contains the differences of this property type for this style from the UPE of that property type for the based on style. For example, if two paragraph styles, A and B, were identical except that B was bold where A was not, and B was based on A, B would have two UPXs, where the paragraph UPX would have an empty grpprl, and the character UPX would have a bold sprm in the grpprl. Thus B looks just like A (since B is based on A), with the exception that B is bold.

std.grupe: This is an array (group) of variable-length UPEs. These are not stored in the file! Rather, they are constructed using the std.istdBase and std.grupx fields. A UPE (Universal Property Expansion) describes the "end-result" of the property formatting, i.e. what the style looks like. The UPE structure is the non-zero prefix of a UPD structure. The UPD structure looks like this:

typedef union _UPD
    {
    PAP pap;
    CHP chp;
    struct
            {
            ushort istd;
            uchar cbGrpprl;
            uchar grpprl[cbMaxGrpprlStyleChpx];
            } chpx;
    } UPD;
The std.grupe and std.grupx arrays are similar: there is one UPE for each UPX, and internally they are stored similarly (a length ushort followed by a non-zero prefix), though remember that the UPEs are not stored in the file. The meaning of each UPE depends on the style type (std.sgc). For a paragraph style, the first UPE is a PAP (UPE.pap). The second UPE is a CHP (UPE.chp). For a character style, the first UPE is a CHPX (UPE.chpx).

The UPEs for a style are constructed by taking the UPEs from the based-on style, and applying the UPXs to them. Obviously, if the UPEs for the based-on style haven't yet been constructed, that style's UPE needs to be constructed first. Eventually by following the based-on chain, a style will be based on the null style (istdNil). The UPEs for the null style are predefined:

  • The UPE.pap for the null style is all zeros, except fWidowControl which is 1, dyaLine which is 240, and fMultLinespace which is 1.
  • The UPE.chp for the null style is all zeros, except istd which is 10 (istdNormalChar), hps which is 20, lid which is 0x0400, and ftc which is set to the STSHI.ftcStandardChpStsh.
  • The UPE.chpx for the null style has an istd of zero, a cbGrpprl of zero (and an empty grpprl).
So, for a paragraph style, the first UPE is a UPE.pap. It can be constructed by starting with the first UPE from the based-on style (std.istdBase), and then applying the first UPX (UPX.papx) in std.grupx to that UPE. To apply a UPX.papx to a UPE.pap, set UPE.pap.istd equal to UPX.papx.istd, and then apply the UPX.papx.grpprl to UPE.pap. Similarly, the second UPE is a UPE.chp. It can be constructed by starting with the second UPE from the based-on style, and then applying the second UPX (UPX.chpx) in std.grupx to that UPE. To apply a UPX.chpx to a UPE.chp, apply the UPX.chpx.grpprl to UPE.chp. Note that a UPE.chp for a paragraph style should always have UPE.chp.istd == istdNormalChar.

For a character style, the first (and only) UPE (a UPE.chpx) can be constructed by starting with the first UPE from the based-on style (std.istdBase), and then applying the first UPX (UPX.chpx) in std.grupx to that UPE. To apply a UPX.chpx to a UPE.chpx, take the grpprl in UPE.chpx.grpprl (which has a length of UPE.chpx.cbGrpprl) and merge the grpprl in UPX.chpx.grpprl into it. Merging grpprls is a tricky business, but for character styles it is easy because no prls in character style grpprls should interact with each other. Each prl from the source (the UPX.chpx.grpprl) should be inserted into the destination (the UPE.chpx.grpprl) so that the sprms of the prls are in increasing order, and any prl in the destination that has the same sprm is replaced by the prl from the source. UPE.chpx.cbGrpprl is then set to the length of the resulting grpprl, and UPE.chpx.istd is set to the style's istd.
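
The merge rule for character style grpprls can be sketched as an ordered insert with replacement. ToyPrl and merge_prls are hypothetical names, and the sketch simplifies heavily: each prl is a fixed-size record with a one-byte operand, whereas real prls carry operands of varying length and are stored as a packed byte stream.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical fixed-size prl: a sprm code plus a one-byte operand. */
typedef struct {
    uint16_t sprm;
    uint8_t  operand;
} ToyPrl;

/* Merge the cSrc prls of src into dst, which holds cDst prls sorted by
   ascending sprm and has room for cap prls. A source prl replaces a
   destination prl with the same sprm; otherwise it is inserted so that
   dst stays sorted. Returns the new count, or -1 if dst is full. */
static int merge_prls(ToyPrl *dst, int cDst, int cap,
                      const ToyPrl *src, int cSrc)
{
    for (int s = 0; s < cSrc; s++) {
        int i = 0;
        while (i < cDst && dst[i].sprm < src[s].sprm)
            i++;
        if (i < cDst && dst[i].sprm == src[s].sprm) {
            dst[i] = src[s];                  /* same sprm: source wins */
        } else {
            if (cDst == cap) return -1;
            memmove(&dst[i + 1], &dst[i],
                    (size_t)(cDst - i) * sizeof *dst);
            dst[i] = src[s];
            cDst++;
        }
    }
    return cDst;
}
```

In terms of the text above, dst plays the role of UPE.chpx.grpprl and src the role of UPX.chpx.grpprl; the returned count corresponds to the updated UPE.chpx.cbGrpprl in this simplified model.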