String Manipulation

The ActiveX controls framework provides a robust set of macros for the string operations you will encounter while working with an ActiveX control.

Types of Strings

There are several types of strings in Automation. It is important to understand them when working with COM, because strings are a common source of memory leaks and bugs.

There are two fundamental types of strings: multibyte strings (which can be ANSI or double-byte) and Unicode strings. With the former, you almost always work with some sort of char * pointer (LPSTR, LPCSTR). With the latter, a few types are commonly used, most notably WCHAR * (LPWSTR, LPCWSTR), BSTR, and OLESTR strings.

An LPWSTR is just that: a pointer to a wide string. An LPOLESTR pointer is much the same, with some additional COM rules attached. An OLESTR is merely a wide string, but when it is an out parameter of a function, it should be allocated using the host's IMalloc allocator (for example, CoTaskMemAlloc).

A BSTR is a string with a length prefix stored in the memory immediately preceding the string data. To work with a BSTR, you need to use the special APIs designed exclusively for it, notably SysAllocString, SysFreeString, and SysStringLen. For more details, see the Automation Programmer’s Reference, available on MSDN and from Microsoft Press®.
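The length-prefixed layout can be illustrated with a small portable model. This is only a sketch under stated assumptions: model_alloc, model_len, and model_free are hypothetical stand-ins for SysAllocString, SysStringLen, and SysFreeString, and the 4-byte byte-count prefix is assumed from the usual description of a BSTR. Real code should call the system APIs and never build or inspect the prefix by hand.

```cpp
#include <cassert>
#include <cstdint>
#include <cstdlib>
#include <cstring>

// Hypothetical model of a length-prefixed wide string (not the real BSTR API).
// The caller's pointer addresses the first character; the byte count is
// stored in the 4 bytes just before it.
using ModelChar = char16_t;

ModelChar* model_alloc(const ModelChar* src) {
    assert(src != nullptr);
    std::size_t n = 0;
    while (src[n] != 0) ++n;  // count characters up to the terminator
    const std::uint32_t bytes =
        static_cast<std::uint32_t>(n * sizeof(ModelChar));
    // Room for: prefix + characters + terminating null character.
    unsigned char* raw = static_cast<unsigned char*>(
        std::malloc(sizeof(std::uint32_t) + bytes + sizeof(ModelChar)));
    if (raw == nullptr) return nullptr;
    std::memcpy(raw, &bytes, sizeof bytes);  // write the length prefix
    ModelChar* text =
        reinterpret_cast<ModelChar*>(raw + sizeof(std::uint32_t));
    std::memcpy(text, src, bytes);           // copy the characters
    text[n] = 0;                             // keep it null-terminated
    return text;                             // points past the prefix
}

// Reads the length from the prefix instead of scanning for a terminator.
std::size_t model_len(const ModelChar* s) {
    if (s == nullptr) return 0;
    std::uint32_t bytes = 0;
    std::memcpy(&bytes,
                reinterpret_cast<const unsigned char*>(s) - sizeof bytes,
                sizeof bytes);
    return bytes / sizeof(ModelChar);
}

// Frees the whole block, which starts at the prefix, not at the characters.
void model_free(ModelChar* s) {
    if (s != nullptr) {
        std::free(reinterpret_cast<unsigned char*>(s) -
                  sizeof(std::uint32_t));
    }
}
```

In this model, handing model_free a plain wide string pointer would step back over bytes that are not a prefix and free an address the allocator never returned, which suggests why the free functions for the different string types cannot be mixed.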

These data types are interchangeable for comparing and copying, but they are not interchangeable for allocating and freeing. For example, it is not acceptable to call SysFreeString on an OLESTR or LPWSTR string.

Following standard OLE COM conventions, a function should not free a BSTR or OLESTR that is passed to it as an in parameter. A BSTR or OLESTR returned as an out parameter will be freed by the caller, and should therefore be allocated with the allocator that matches its type.
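The in and out parameter conventions can be sketched in portable code. Everything named here is an assumption made for illustration: task_alloc and task_free stand in for CoTaskMemAlloc and CoTaskMemFree (implemented with malloc and free only so the sketch runs anywhere), and measure_caption, get_caption, and caller_demo are hypothetical functions, not part of the framework.

```cpp
#include <cassert>
#include <cstdlib>
#include <cwchar>

// Stand-ins (assumption): real COM code would call CoTaskMemAlloc and
// CoTaskMemFree here. malloc/free keep the sketch portable.
void* task_alloc(std::size_t bytes) { return std::malloc(bytes); }
void  task_free(void* p)            { std::free(p); }

// In parameter: the callee reads the string but must not free it.
std::size_t measure_caption(const wchar_t* caption) {
    return caption != nullptr ? std::wcslen(caption) : 0;
}

// Out parameter: the callee allocates, and the caller becomes the owner.
bool get_caption(wchar_t** out) {
    if (out == nullptr) return false;
    static const wchar_t text[] = L"Hello";
    *out = static_cast<wchar_t*>(task_alloc(sizeof text));
    if (*out == nullptr) return false;
    // Copy the characters including the terminating null.
    std::wmemcpy(*out, text, sizeof text / sizeof text[0]);
    return true;
}

// Caller side: whatever comes back through the out parameter is freed by
// the caller, using the allocator that matches how it was created.
std::size_t caller_demo() {
    wchar_t* caption = nullptr;
    if (!get_caption(&caption)) return 0;
    assert(caption != nullptr);
    const std::size_t n = measure_caption(caption);  // passed as in parameter
    task_free(caption);                              // caller frees out value
    return n;
}
```

The same shape applies to a BSTR, with the system BSTR functions taking the place of the task allocator.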

Working with Strings

For the most part, your controls will be working with multibyte strings, except when you work with OLE. Therefore, in various scenarios you'll either be given a wide string, and need the multibyte version of it, or you'll have a multibyte string, and need a wide string for it.
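The two directions of conversion can be sketched with the C standard library. This is an illustration under assumptions, not the framework's implementation: std::mbstowcs and std::wcstombs are used so the sketch is portable (Windows code would more typically call MultiByteToWideChar and WideCharToMultiByte), the helper names to_wide and to_multibyte are hypothetical, and the result depends on the current C locale.

```cpp
#include <cassert>
#include <cstdlib>
#include <cstring>
#include <cwchar>
#include <string>
#include <vector>

// Multibyte -> wide. Returns an empty string if the input is not valid
// in the current locale.
std::wstring to_wide(const char* src) {
    assert(src != nullptr);
    // A multibyte string of N bytes converts to at most N wide characters.
    std::vector<wchar_t> buf(std::strlen(src) + 1);
    const std::size_t n = std::mbstowcs(buf.data(), src, buf.size());
    if (n == static_cast<std::size_t>(-1)) return std::wstring();
    return std::wstring(buf.data(), n);
}

// Wide -> multibyte. Returns an empty string if a character cannot be
// represented in the current locale.
std::string to_multibyte(const wchar_t* src) {
    assert(src != nullptr);
    // Each wide character needs at most MB_CUR_MAX bytes in this locale.
    std::vector<char> buf(std::wcslen(src) * MB_CUR_MAX + 1);
    const std::size_t n = std::wcstombs(buf.data(), src, buf.size());
    if (n == static_cast<std::size_t>(-1)) return std::string();
    return std::string(buf.data(), n);
}
```

Both helpers allocate a temporary buffer and copy the result, which gives a sense of why each conversion has a real cost.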

To solve these problems, the ActiveX controls framework includes the following macros:

The first two macros take a string of a given type and a name, create a variable with the new name (do not declare a variable of this name yourself), and then convert the other string into the new variable. Because each macro expands to a declaration, it cannot be used as an rvalue in a C/C++ expression, nor can it be an lvalue; it needs to sit on a line by itself.
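A macro of this kind has to stand alone because it expands to declarations rather than to an expression. The sketch below uses a hypothetical DEMO_MAKE_WIDEPTR macro and a hypothetical ASCII-only helper, demo_widen; neither is the framework's definition, and the real macros may be implemented quite differently.

```cpp
#include <cassert>
#include <cwchar>
#include <string>

// Hypothetical helper: ASCII-only widening, enough for the illustration.
std::wstring demo_widen(const char* s) {
    assert(s != nullptr);
    std::wstring out;
    for (; *s != '\0'; ++s) {
        out.push_back(static_cast<wchar_t>(static_cast<unsigned char>(*s)));
    }
    return out;
}

// Hypothetical macro, not the framework's definition. It expands to two
// declarations, so it is only legal where a declaration statement is legal.
#define DEMO_MAKE_WIDEPTR(ptrname, ansistr)                 \
    std::wstring ptrname##_storage = demo_widen(ansistr);   \
    const wchar_t* ptrname = ptrname##_storage.c_str()

std::size_t caption_length(const char* ansiCaption) {
    // Correct: the macro sits on a line by itself and declares wideCaption.
    // Declaring wideCaption beforehand would be a redeclaration error.
    DEMO_MAKE_WIDEPTR(wideCaption, ansiCaption);

    // Incorrect (would not compile): a declaration is not an expression.
    //   std::wcslen(DEMO_MAKE_WIDEPTR(wideCaption, ansiCaption));
    //   const wchar_t* p = DEMO_MAKE_WIDEPTR(wideCaption, ansiCaption);

    return std::wcslen(wideCaption);
}
```

The converted string in this sketch lives only as long as the enclosing scope, so the pointer should not be returned or stored beyond it.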

The last set of macros does the remaining interesting work. From an ANSI string, you can create either a BSTR or an OLESTR allocated through the IMalloc interface, and you can also copy an existing OLESTR or BSTR. The only other macros of interest are those that take a WORD resource ID, load the corresponding string from your localization DLL (or from the main DLL if you don't use satellite localization), and make either a BSTR or an OLESTR out of it. These prove useful in the few places where you need a localized string.

Remember that although these macros were designed with speed in mind, converting strings is not a cheap operation. Control writers should avoid string conversions wherever possible.