IFilter Properties and Pseudo-Properties

[This is preliminary documentation and subject to change.]

Text extracted through the IFilter interface can be tagged with many different attributes, but each chunk of text carries only one attribute at a time. When such an attribute refers to a chunk of text, the content index treats it as a property, but the system does not. These attributes are known as pseudo-properties.

Pseudo-properties are not accessible through the standard OLE IPropertyStorage interface. Pseudo-properties allow the user to search for documents based on the value of some internal field in the document that has not been exposed as a property to the system. For example, a spreadsheet describing monthly sales for an employee might export employee-id and total-sales pseudo-properties. This would enable a query for all spreadsheets (months) in which some employee sold more than x dollars.
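
The spreadsheet query described above can be pictured with a small in-memory model. This is only a sketch under stated assumptions: IndexedDocument and FindSalesAbove are hypothetical names invented for illustration, values are simplified to whole numbers, and nothing here reflects how the content index actually stores or queries pseudo-properties.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Hypothetical model of one indexed document (for example, one spreadsheet
// per month) and the pseudo-properties its filter exported.
// This type is not part of the IFilter API.
struct IndexedDocument {
    std::string path;
    std::map<std::string, long> pseudoProperties; // name -> value (assumed integral)
};

// Return the paths of documents whose employee-id equals employeeId and whose
// total-sales is strictly greater than thresholdDollars.
std::vector<std::string> FindSalesAbove(const std::vector<IndexedDocument>& index,
                                        long employeeId, long thresholdDollars) {
    assert(thresholdDollars >= 0); // a sales threshold is assumed non-negative
    std::vector<std::string> hits;
    for (const IndexedDocument& doc : index) {
        std::map<std::string, long>::const_iterator id =
            doc.pseudoProperties.find("employee-id");
        std::map<std::string, long>::const_iterator sales =
            doc.pseudoProperties.find("total-sales");
        if (id == doc.pseudoProperties.end() || sales == doc.pseudoProperties.end())
            continue; // the document did not export both pseudo-properties
        if (id->second == employeeId && sales->second > thresholdDollars)
            hits.push_back(doc.path);
    }
    return hits;
}
```

Calling FindSalesAbove(index, 17, 5000) on such a model would return only the spreadsheets in which employee 17 sold more than 5000 dollars. Documents that export neither pseudo-property are skipped rather than treated as errors.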

Pseudo-property names must follow OLE property naming conventions. Each pseudo-property must be specified in the form property set\property. Failure to follow this naming convention results in unpredictable query behavior. Specifying a pseudo-property name that matches a true property name may also result in undefined query behavior.
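
A filter author might check names against the property set\property form before emitting them. The sketch below is a hypothetical helper, not part of any Microsoft API: PseudoPropertyName and ParsePseudoPropertyName are invented names, and the rules it enforces (exactly one backslash, both halves non-empty) are an assumption about the minimum the form implies. It does not check the wider OLE property naming conventions or collisions with true property names.

```cpp
#include <cassert>
#include <string>

// Hypothetical holder for the two halves of a pseudo-property name.
struct PseudoPropertyName {
    std::string propertySet;
    std::string property;
};

// Split a name of the form "property set\property" at its backslash.
// Returns false (leaving *out untouched) when the separator is missing,
// appears more than once, or either half is empty.
bool ParsePseudoPropertyName(const std::string& name, PseudoPropertyName* out) {
    assert(out != nullptr);
    const std::string::size_type sep = name.find('\\');
    if (sep == std::string::npos)
        return false; // no separator at all
    if (name.find('\\', sep + 1) != std::string::npos)
        return false; // more than one separator (assumed invalid here)
    if (sep == 0 || sep + 1 == name.size())
        return false; // empty property set or empty property
    out->propertySet = name.substr(0, sep);
    out->property = name.substr(sep + 1);
    return true;
}
```

With this helper, a name such as Sales\total-sales splits into the property set Sales and the property total-sales, while a bare total-sales is rejected.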

The IFilter implementation may also publish OLE-style properties through IFilter. These properties are retrieved using the IFilter::GetValue method call. Logically, they should be considered external annotations of a document. For example, this mechanism can be used to publish HTML anchors. If a class supports retrieval of OLE properties through IPropertyStorage, the IFilter implementation has the option of requesting the caller of IFilter to use IPropertyStorage to enumerate OLE properties, either in lieu of or to supplement properties emitted via IFilter::GetValue.
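
From the caller's side, the two sources of properties can be pictured as follows. This is a portable model rather than real COM code: Chunk, ChunkKind, Collected, and CollectProperties are hypothetical stand-ins, the value path only represents what a real caller would obtain from IFilter::GetValue, and the filterRequestsStorage flag is an assumed way of representing the filter's request that the caller also enumerate properties through IPropertyStorage.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-ins for this sketch; these are not the real IFilter declarations.
enum class ChunkKind { Text, Value };

struct Chunk {
    ChunkKind kind;
    std::string attribute; // the single attribute this chunk is tagged with
    std::string payload;   // extracted text, or a value rendered as a string
};

struct Collected {
    std::vector<std::string> text;   // payloads of text chunks
    std::vector<std::string> values; // payloads of value chunks (the GetValue path)
    bool alsoEnumerateStorage;       // caller should additionally walk IPropertyStorage
};

// Sort chunks by kind and record whether the filter asked the caller to
// enumerate OLE properties through IPropertyStorage as well.
Collected CollectProperties(const std::vector<Chunk>& chunks, bool filterRequestsStorage) {
    Collected out;
    out.alsoEnumerateStorage = filterRequestsStorage;
    for (const Chunk& chunk : chunks) {
        assert(!chunk.attribute.empty()); // each chunk carries exactly one attribute
        if (chunk.kind == ChunkKind::Text)
            out.text.push_back(chunk.payload);
        else
            out.values.push_back(chunk.payload);
    }
    return out;
}
```

In this model, a published HTML anchor would arrive as a Value chunk. When alsoEnumerateStorage is set, the caller would merge these values with whatever it reads from IPropertyStorage; how the two sets are reconciled is left open here.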