Viewing and Editing Hyperlinks as OLE Properties

Berthold von Freyberg
Microsoft Office Program Management

June, 1996

Abstract

In Microsoft® Office 97, we want to store hyperlinks in the compound documents created by Microsoft Word, Microsoft Excel or Microsoft PowerPoint® in such a way that external tools can read, modify, and delete them.

This article provides a detailed description of the format in which the applications write the hyperlinks to a compound file stream and how external tools should access, modify and delete hyperlinks in compound documents.

Hyperlinks in Office 97 Documents

In the following, "hyperlink" refers to the set of TargetAddress (URL/UNC) and SubAddress (such as a cell range in Microsoft Excel, a slide name in PowerPoint, or a bookmark in Word).

This section summarizes the implementation and storage of hyperlinks in the various applications:

Exposing Hyperlinks

This section describes the OLE properties stream to which Office 97 applications write hyperlink information, and the format in which that information is written.

Office 97 stores the Standard Summary Information property set in an IStream off the root IStorage, named "\005SummaryInformation". In addition, Office 97 stores two sections, one for the Office Summary Information (FormatID_DocumentSummaryInformation) and one for the user-defined properties (FormatID_UserDefinedProperties), in another IStream named "\005DocumentSummaryInformation".

We add one property, PID_HYPERLINKSCHANGED, to the existing twenty properties of the Office DocumentSummaryInformation section and one property, PID_HYPERLINKS, to the UserDefinedProperties section. Microsoft Excel, Word, and PowerPoint each write the PID_HYPERLINKS array when saving a document and, when opening the document later, read the array and reconcile the hyperlinks in the document with any changes to the array.

Property Name      Property ID            Property ID code  Type     What is stored
Hyperlinks         PID_HYPERLINKS         _PID_HLINKS       VT_BLOB  Six array elements per hyperlink. For the format, see below.
HyperlinksChanged  PID_HYPERLINKSCHANGED  0x00000016        VT_BOOL  The "dirty" bit: 0 = false = no links changed; 1 = true = links changed


When saving a document, the application enumerates all hyperlinks (both its own and Office Art’s) and writes an array (with several array elements for each hyperlink, see below) in the same order in which it will later load and reconcile them. The application writes PID_HYPERLINKS as one VT_BLOB, but the internal structure of the blob, which is how the application will later read the array, is VT_VARIANT | VT_VECTOR.

Note that, for a given picture, Office Art might write up to three Hlinks in Office 97 to the OLE properties stream:

  1. the linked picture file

  2. the linked fill file

  3. the hyperlink

In addition, in a future version, Office Art might also write a fourth Hlink to the OLE properties stream for a linked line fill file. Internally, Office Art already supports this, but the Office applications themselves do not expose this functionality yet.

When saving a document, the application also sets PID_HYPERLINKSCHANGED to False.

A related property that should be used in conjunction with PID_HYPERLINKS is the new custom property PID_LINKBASE that stores the base URL/UNC of a document. This is important in instances where PID_HYPERLINKS contains relative links.

Property Name   Property ID   Property ID code  Type     What is stored
Hyperlink Base  PID_LINKBASE  _PID_LINKBASE     VT_BLOB  Base address to be prepended to all relative hyperlinks, internally stored as VT_LPWSTR
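The way a tool might combine PID_LINKBASE with a relative TargetAddress can be sketched as follows. This is only an illustration: the function name resolve_hyperlink and the test used to decide whether an address is relative (no URL scheme, no UNC prefix, no drive letter) are assumptions, not part of the Office 97 format.

```python
def resolve_hyperlink(link_base: str, target_address: str) -> str:
    """Prepend link_base to target_address when the target looks relative.

    The "looks relative" heuristic below is an assumption made for this
    sketch; the article only states that the base is prepended to
    relative hyperlinks.
    """
    is_unc = target_address.startswith("\\\\")            # \\server\share\...
    has_scheme = "://" in target_address                  # http://, ftp://, ...
    has_drive = (
        len(target_address) >= 2
        and target_address[0].isalpha()
        and target_address[1] == ":"                      # C:\...
    )
    if not link_base or is_unc or has_scheme or has_drive:
        return target_address                             # already absolute
    return link_base + target_address                     # relative: prepend base
```

For example, with the base "http://example.com/docs/" the relative address "report.doc" would resolve to "http://example.com/docs/report.doc", while a UNC path or a full URL would be returned unchanged.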

Structure of the PID_HYPERLINKS Array

The PID_HYPERLINKS array property has the following format: the DWORD CElements indicates the number of array elements. This is equal to six times the number of hyperlinks because for every hyperlink that follows in PID_HYPERLINKS, there are six array elements, in this order: three DWORDs, the DWORD Info, TargetAddress, and SubAddress.

Note   TargetAddress and SubAddress are padded so that they are DWORD-aligned.

The first three DWORDs are private to the application that writes the file and should not be modified by an external tool.
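As a rough sketch of how an external tool might read this array, the following Python code walks the contents of the blob and groups every six elements into one record. It assumes the usual OLE property-set serialization for a vector of variants: each element is a DWORD type tag followed by its value, VT_I4 (3) for the DWORDs and VT_LPWSTR (31) for the strings, with each string stored as a character count (including the terminator), UTF-16 data, and padding to a DWORD boundary. The names HyperlinkRecord and parse_hyperlinks are hypothetical, and the input is assumed to start at CElements, after the byte count of the VT_BLOB itself.

```python
import struct
from dataclasses import dataclass
from typing import List, Tuple

VT_I4 = 3                    # type tag assumed for the DWORD elements
VT_LPWSTR = 31               # type tag assumed for TargetAddress / SubAddress
ELEMENTS_PER_HYPERLINK = 6

@dataclass
class HyperlinkRecord:       # hypothetical name, for illustration only
    private: Tuple[int, int, int]   # application-private DWORDs; do not modify
    info: int                       # HIWORD = action, LOWORD = link kind
    target_address: str
    sub_address: str

def _read_element(data: bytes, pos: int):
    """Read one typed element at pos; return (value, next position)."""
    (vt,) = struct.unpack_from("<I", data, pos)
    pos += 4
    if vt == VT_I4:
        (value,) = struct.unpack_from("<I", data, pos)
        return value, pos + 4
    if vt == VT_LPWSTR:
        (cch,) = struct.unpack_from("<I", data, pos)   # count includes the NUL
        pos += 4
        raw = data[pos:pos + 2 * cch]
        pos += 2 * cch
        pos += -pos % 4                                # skip DWORD-alignment padding
        return raw.decode("utf-16-le").split("\x00", 1)[0], pos
    raise ValueError(f"unexpected element type {vt}")

def parse_hyperlinks(data: bytes) -> List[HyperlinkRecord]:
    """Parse the blob contents, starting at the DWORD CElements."""
    (count,) = struct.unpack_from("<I", data, 0)
    if count % ELEMENTS_PER_HYPERLINK:
        raise ValueError("CElements is not a multiple of six")
    pos, values = 4, []
    for _ in range(count):
        value, pos = _read_element(data, pos)
        values.append(value)
    records = []
    for i in range(0, count, ELEMENTS_PER_HYPERLINK):
        p1, p2, p3, info, target, sub = values[i:i + ELEMENTS_PER_HYPERLINK]
        records.append(HyperlinkRecord((p1, p2, p3), info, target, sub))
    return records
```

The number of hyperlinks is then simply len(parse_hyperlinks(data)), that is, CElements divided by six. A tool that writes the array back would need to preserve the three private DWORDs of every record exactly as read.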

The DWORD Info holds two pieces of information:

  1. HIWORD: an external tool should write this number to communicate whether a hyperlink should be left as is, be modified, or be deleted when the application opens the document.

    0 - do not change anything
    1 - replace the hyperlink with the TargetAddress and SubAddress in the following two array elements (VT_LPWSTR)
    2 - delete the hyperlink

  2. LOWORD: an application should write and, upon opening, read this number to check whether the hyperlink is associated with a field, a shape, a file, etc. This provides the application with an additional safety check when reconciling the array with the hyperlinks in the document. Only the application should modify the LOWORD; external tools should not. The following are the LOWORD values:

    0 - graphic shown as background of doc (link to a picture file)
    1 - graphic shown as shape in doc (link to a picture file)
    2 - graphic used to fill a shape (link to a fill file: picture fill, texture fill, or pattern fill)
    3 - graphic used for shape outline (link to a line fill file: for future use only)
    4 - hyperlink attached to a shape
    5 - hyperlink attached to a (Word) field
    6 - hyperlink attached to an (Excel) range
    7 - hyperlink attached to a (PPT) text range
    8 - hyperlink attached to a (Project) task
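The split of the Info DWORD into HIWORD and LOWORD can be illustrated with a few helper functions. This is a minimal sketch: the names Action, hiword, loword, with_action, and reconcile are hypothetical, and reconcile only imitates, under simplifying assumptions, what an application might do with the HIWORD values when it opens a document.

```python
from enum import IntEnum

class Action(IntEnum):       # HIWORD values written by an external tool
    KEEP = 0                 # do not change anything
    REPLACE = 1              # take TargetAddress/SubAddress from the array
    DELETE = 2               # delete the hyperlink

def hiword(info: int) -> int:
    return (info >> 16) & 0xFFFF

def loword(info: int) -> int:
    return info & 0xFFFF

def with_action(info: int, action: Action) -> int:
    """Return info with a new HIWORD; the LOWORD is left untouched."""
    return ((int(action) & 0xFFFF) << 16) | loword(info)

def reconcile(doc_links, array_entries):
    """Apply the HIWORD actions to the document's hyperlinks.

    doc_links:     list of (target_address, sub_address) in the document
    array_entries: list of (info, target_address, sub_address) read from
                   PID_HYPERLINKS, assumed to be in the same order
    """
    result = []
    for current, (info, target, sub) in zip(doc_links, array_entries):
        action = hiword(info)
        if action == Action.DELETE:
            continue                        # hyperlink is removed
        if action == Action.REPLACE:
            result.append((target, sub))    # addresses come from the array
        else:
            result.append(current)          # KEEP, or a value this sketch ignores
    return result
```

For instance, with_action(0x00000005, Action.REPLACE) yields 0x00010005: the hyperlink is still marked as attached to a Word field, and the application is asked to replace it. A tool that changes any HIWORD would presumably also set PID_HYPERLINKSCHANGED to true so that the application knows the array has been edited.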

Note   Although they are not currently used, negative values of HIWORD and LOWORD are reserved for Microsoft applications.

Note   The property array only comprises hyperlinks and links to pictures, textured fills and textured line files, not shortcuts, cross-references in Word, or cell references to other workbooks. Hyperlinks that appear in Word’s Undo document or in the AutoText table will also not be exposed. Finally, Office 97 applications do not expose hyperlinks from data path properties of ActiveX Controls in the OLE properties stream.

Note   In contrast to other custom OLE properties, the File::Properties UI does not expose PID_HYPERLINKS.