Q & A ActiveX/COM

by Don Box

Don Box has been working in networking and distributed object systems since 1989. He is currently chronicling the COM lifestyle in book form for Addison Wesley, and gives seminars on OLE and COM across the globe. Don can be reached at dbox@braintrust.com.

Q I am having a reliability problem with my COM server. It houses a large number of transient objects that live for only a short time. Occasionally, the server will begin shutting down just as a new instantiation request comes in via CoCreateInstance. I have tried using the LockServer method on my class factory to keep the server around, but I am still having the problem. What can I do to fix this?

A It sounds as if your server is not managing its lifetime correctly. This is one of the most difficult pieces of COM to get right, as there are three areas that can cause problems. Your own code may have defects. Your clients may have defects. The implementation of COM on your platform may have inherent problems that you may or may not be able to code around.

Managing the lifetime of a server process in COM is the responsibility of the server implementor. The Service Control Manager (SCM) will start your process, but it is your server’s job to decide when to shut down and exit. Unless your server is unusual, you will want your server process to remain running as long as there is at least one object with an outstanding interface pointer held by an external client. To facilitate this behavior, most programmers use two functions to lock and unlock their module, where the number of locks reflects how many objects are running. These lock and unlock functions, which would be called from each object’s constructor and destructor respectively, manipulate the server’s lock count. When the unlock routine decrements the lock count to zero, some action is taken to shut down the server in an orderly fashion. Figure 1 shows a simple version of these two functions that call PostQuitMessage on the calling thread when the final unlock is performed. MFC provides its own version of these routines (AfxOleLockApp and AfxOleUnlockApp) and the Active Template Library (ATL) has two member functions on the global module object (Lock/Unlock) that perform the same service.

Figure 1 Lock Module/Unlock Module

    ////////////////////////////////////////////
    //
    // Simple Module Locking Routines for
    // out-of-process servers
    //

    #include <windows.h>

    // how many reasons do we have to remain running?
    LONG g_cLocks = 0;

    // Routine to call from instance constructors
    void LockModule(void)
    {
        InterlockedIncrement(&g_cLocks);
    }

    // Routine to call from instance destructors
    void UnlockModule(void)
    {
        if (InterlockedDecrement(&g_cLocks) == 0)
        {
            // lock count is now zero, so shutdown nicely
            PostQuitMessage(0);
        }
    }
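To make the calling pattern concrete, here is a minimal, portable sketch of an object that takes a module lock in its constructor and gives it back in its destructor. It is a model rather than COM code: std::atomic stands in for the Interlocked functions, the hypothetical g_quitRequested flag stands in for the PostQuitMessage call in Figure 1, and the names CTransientObject and PeakLocksFor are invented for illustration.

```cpp
#include <atomic>
#include <cassert>

// Portable stand-ins for the Figure 1 globals (assumptions, not COM APIs).
std::atomic<long> g_cLocks{0};
std::atomic<bool> g_quitRequested{false};  // models PostQuitMessage(0)

void LockModule() { ++g_cLocks; }

void UnlockModule() {
    // Pre-decrement yields the new value, like InterlockedDecrement.
    if (--g_cLocks == 0)
        g_quitRequested = true;
}

// Hypothetical object: exactly one module lock per live instance.
class CTransientObject {
public:
    CTransientObject() { LockModule(); }
    ~CTransientObject() { UnlockModule(); }
    CTransientObject(const CTransientObject&) = delete;
    CTransientObject& operator=(const CTransientObject&) = delete;
};

// Creates n objects, samples the lock count while they are alive,
// destroys them, and returns the sampled count.
long PeakLocksFor(int n) {
    assert(n >= 0);
    CTransientObject* objs = new CTransientObject[n];
    long peak = g_cLocks.load();
    delete[] objs;  // runs n destructors, hence n UnlockModule calls
    return peak;
}
```

With three objects the count rises to three, and the destructor of the last one is what requests shutdown. Tying the lock to construction and destruction means no code path can create an object without also holding the server alive.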

This is where the trouble begins. According to the constraints listed above, what is the correct way to handle class factories? Remember that in an out-of-process server, the class objects are created at startup and registered with OLE32 and the SCM by calling CoRegisterClassObject, as shown in Figure 2.

Figure 2 Creating and Registering Objects

    int WINAPI WinMain(HINSTANCE, HINSTANCE,
                       LPSTR, int) {
        CoInitialize(0);
        CClassFactory<CFoo> cfFoo;
        CClassFactory<CBar> cfBar;
        DWORD dwFoo, dwBar;
        CoRegisterClassObject(CLSID_Foo, &cfFoo,
                              CLSCTX_SERVER,
                              REGCLS_MULTIPLEUSE,
                              &dwFoo);
        CoRegisterClassObject(CLSID_Bar, &cfBar,
                              CLSCTX_SERVER,
                              REGCLS_MULTIPLEUSE,
                              &dwBar);
        MSG msg;
        while (GetMessage(&msg, 0, 0, 0))
            DispatchMessage(&msg);
        CoRevokeClassObject(dwFoo);
        CoRevokeClassObject(dwBar);
        CoUninitialize();
        return 0;
    }

Assume that the class objects’ constructors call the LockModule function shown in Figure 1. If this happens, the initial server lock count would begin at two, not zero. According to Figure 1, the server will only terminate the message pump when the lock count transitions from one to zero. However, the lock count will never be less than two due to the existence of the two class objects, so the server will run indefinitely. You might think that only locking the server when the class objects’ reference counts are nonzero would suffice, but because the CoRegisterClassObject API AddRef’s the class object, the same problem would result.
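The arithmetic of this circular lock can be seen in a small portable model. Nothing below is COM: std::atomic replaces the Interlocked functions, a boolean replaces PostQuitMessage, and LockingFactory is a hypothetical class factory that locks the module for its own lifetime, which is exactly the design being warned against here.

```cpp
#include <atomic>
#include <cassert>

// Stand-in for the module lock count of Figure 1 (an assumption).
struct Module {
    std::atomic<long> locks{0};
    std::atomic<bool> quit{false};  // models PostQuitMessage
    void Lock() { ++locks; }
    void Unlock() { if (--locks == 0) quit = true; }
};

// Hypothetical factory that (wrongly) holds a module lock while it exists.
struct LockingFactory {
    Module& m;
    explicit LockingFactory(Module& mod) : m(mod) { m.Lock(); }
    ~LockingFactory() { m.Unlock(); }
};

// Simulates a server with two such factories that creates and destroys
// `clients` short-lived objects. Returns true if shutdown was requested
// while the factories were still alive, that is, while the message pump
// would still be running.
bool ShutdownRequestedWhileServing(int clients) {
    assert(clients >= 0);
    Module m;
    LockingFactory foo(m), bar(m);  // lock count starts at two
    for (int i = 0; i < clients; ++i) {
        m.Lock();    // object created: count is three
        m.Unlock();  // object destroyed: count is back to two
    }
    return m.quit.load();
}
```

However many objects come and go, the count never drops below two while the factories live, so the quit request that would end the message pump is never posted.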

Because of this circular locking problem, most implementors (including the ATL and MFC) simply ignore outstanding references to class objects, allowing the server to shut down while outstanding class factory references remain. This means that simply holding an outstanding class object pointer is not sufficient to keep a server in memory. What’s worse is that this situation is reinforced by COM itself. The IClassFactory interface, which is used for object instantiation, includes a LockServer method that allows clients to indicate that the outstanding class factory reference should indeed hold the server running. Most implementations of LockServer would simply use the module locking routines as follows:

    HRESULT CClassFactory::LockServer(BOOL bLock) {
        if (bLock)
            LockModule();
        else
            UnlockModule();
        return 0;
    }

This implementation has the effect of keeping a dummy instance around just to hold the server running.

So, given the availability of the LockServer method, you might think that a client could reliably keep a server running by holding a class factory reference and calling its LockServer method. It turns out that things are not quite this simple. In order to call LockServer, you need to have an interface pointer to the class object, usually obtained via a call to CoGetClassObject. However, between the time that a client calls CoGetClassObject and the time that the LockServer method is called, it is possible for a second client to come along and call CoCreateInstance on the same server and release an object. As shown in Figure 3, by the time the first client calls LockServer, the second client has caused the server to shut down. While it would be possible to retry the sequence of calls in the client, like this:

    HRESULT GetLockedClassObject(REFCLSID rclsid,
                                 IClassFactory **ppcf)
    {
        HRESULT hr = E_FAIL;
        *ppcf = 0;
        do {
            if (*ppcf)
                (*ppcf)->Release(); // free prev. proxy
            *ppcf = 0;
            hr = CoGetClassObject(rclsid,
                                  CLSCTX_LOCAL_SERVER, 0,
                                  IID_IClassFactory,
                                  (void**)ppcf);
            if (SUCCEEDED(hr))
                hr = (*ppcf)->LockServer(TRUE);
        } while (FACILITY_RPC == HRESULT_FACILITY(hr));
        return hr;
    }

this code is far from practical and certainly not very elegant. To avoid the need for this nightmarish aberration in the client, something needs to be done, either in the server process or in COM itself.

Figure 3

If you know that your code will only run under Windows NT 4.0 or higher, you are in luck, as the newest release of COM has solved the problem for you. When the Windows NT 4.0 SCM first reaches into your server process to extract the class object as part of a CoGetClassObject call, it calls LockServer(TRUE) while inside your process to ensure that the server remains running. When the last client releases the IClassFactory interface, the stub manager in your server’s address space makes an implicit LockServer(FALSE) call to the class object, releasing the lock and allowing your server to shut down. The net effect is that LockServer will be called atomically as part of a CoGetClassObject request, plugging the hole mentioned above.

If your server must run under pre-Windows NT 4.0 releases of COM, is there a way to achieve the same effect? The answer is yes. The solution relies on the fact that when your server calls CoRegisterClassObject, OLE32 does not marshal your class object’s interface pointer. Instead, it simply AddRef’s the pointer, adds it to an in-process lookup table that maps CLSIDs to IUnknown pointers, and sends a notification to the SCM that the class is now available. When the SCM receives a CoGetClassObject request from a client, it reaches into your server process, finds the appropriate class object from the lookup table and marshals the interface pointer into a buffer that will be returned to the client for subsequent unmarshaling. Given this implementation, all that is needed is a way for your object to be notified when it is marshaled. This is the role of the IExternalConnection interface, a fairly esoteric interface that is used by the stub manager to notify objects of external locks. Objects that implement IExternalConnection are notified by the stub manager whenever a new interface is marshaled on the object and when the stub manager releases its connections. If your class factory implements this interface, then its AddConnection method will be called when the SCM marshals the first pointer on behalf of CoGetClassObject. The stub manager will call your class factory’s ReleaseConnection method when the final release occurs. Given this behavior, older versions of COM can approach the reliability of Windows NT 4.0 simply by implementing these two methods as follows:

    DWORD CClassFactory::AddConnection(DWORD extconn, DWORD res)
    {
        if (extconn & EXTCONN_STRONG)
            LockModule();
        return g_cLocks;
    }

    DWORD CClassFactory::ReleaseConnection(DWORD extconn, DWORD res,
                                           BOOL bLastUnlockReleases)
    {
        if (extconn & EXTCONN_STRONG)
            UnlockModule();
        return g_cLocks;
    }

As with the Windows NT 4.0 implicit LockServer solution, this technique guarantees that your server will be locked atomically as part of the first CoGetClassObject call, ensuring that the server will remain running until the last external class factory pointer is released. If you know that your code will only run under Windows NT 4.0, the code above is redundant, but not harmful.

So, assuming you have ported your application to Windows NT 4.0, added support for IExternalConnection, or both, you might be thinking that your shutdown problems are behind you. Think again. Note that the code in Figure 1 uses the atomic increment and decrement API functions to manipulate the server’s lock count. This code gives you a false sense of security; you’re lulled into thinking that this code is rock-solid thread-safe software suitable for 24-hour-a-day, seven-day-a-week service. This is simply not the case. The remaining race condition can occur between the time that the server decides to shut down (the call to PostQuitMessage) and the time that the server notifies the SCM of its decision (the call to CoRevokeClassObject at the end of WinMain). In this gap, if a client decides to create a new object, the SCM would make an additional request on your server, which is in the process of shutting down. The SCM request would probably succeed, but by the time the client makes the first method call on the new object, the server is likely to have shut down. You could code defensively around this problem by revoking the class objects immediately upon deciding to shut down in UnlockModule (instead of waiting for the message pump to terminate):

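    // g_dwFoo and g_dwBar are the registration cookies returned by
    // CoRegisterClassObject, kept in globals so this routine can use them
    void UnlockModule(void) {
        if (InterlockedDecrement(&g_cLocks) == 0) {
            CoRevokeClassObject(g_dwFoo);
            CoRevokeClassObject(g_dwBar);
            PostQuitMessage(0);
        }
    }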

This leaves a much smaller window of exposure (especially when combined with the Windows NT 4.0 implicit LockServer call). Once you call CoRevokeClassObject, the SCM will no longer activate objects in your server but instead will start a new instance to satisfy the activation request. It is possible for an activation request to come in from the SCM between the time that your lock count transitions to zero and the call to CoRevokeClassObject is entered. If the server is apartment-threaded, this is not a problem, as the incoming SCM call translates to a PostMessage call and will be dispatched after the call to CoRevokeClassObject has completed. Since CoRevokeClassObject would have already marked the CLSID as canceled by the time the SCM call is serviced, the activation request would fail with a distinguished error code and the SCM would then start a new instance of the server. Because freethreaded servers dispatch incoming calls immediately upon reception, there is still the possibility that an incoming SCM activation request could be serviced after the InterlockedDecrement call but prior to the call to CoRevokeClassObject. To address this race condition, the Windows NT 4.0 release of COM added two additional APIs, CoAddRefServerProcess and CoReleaseServerProcess, that are mandatory for freethreaded servers.

CoAddRefServerProcess and CoReleaseServerProcess are meant to replace the global lock count that most server implementors manage themselves. By allowing OLE32 to manage this lock count for a server, these API functions can temporarily lock the in-process lookup table used by the SCM to ensure that no activation requests are serviced while the lock count is being modified. Moreover, when the internal lock count transitions to zero, CoReleaseServerProcess atomically suspends all class objects exported by the process, so that when CoReleaseServerProcess returns, the SCM will no longer be able to access the server’s class objects. This atomic decrement and suspend closes the gap between the decision to shut down and notifying the SCM.
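The idea of an atomic decrement and suspend can be modeled in portable C++. The sketch below is not how OLE32 is implemented (this column does not say), and the names ServerProcessModel, AddRef, Release, and TryActivate are invented. It only shows why guarding the lock count and the activation check with the same lock leaves no gap between reaching zero and refusing new activations.

```cpp
#include <cassert>
#include <mutex>

class ServerProcessModel {
public:
    // Models CoAddRefServerProcess: returns the new lock count.
    long AddRef() {
        std::lock_guard<std::mutex> guard(m_);
        return ++locks_;
    }

    // Models CoReleaseServerProcess: when the count reaches zero, the
    // class objects are marked suspended before the mutex is released.
    long Release() {
        std::lock_guard<std::mutex> guard(m_);
        assert(locks_ > 0);
        if (--locks_ == 0)
            suspended_ = true;
        return locks_;
    }

    // Models an incoming activation request. It takes the same mutex, so
    // it observes either a running server or a suspended one, never the
    // moment in between.
    bool TryActivate() {
        std::lock_guard<std::mutex> guard(m_);
        if (suspended_)
            return false;  // a fresh server instance would be started
        ++locks_;          // the newly created object holds a lock
        return true;
    }

private:
    std::mutex m_;
    long locks_ = 0;
    bool suspended_ = false;
};

// Single-threaded walk-through: an activation is accepted while objects
// are alive and refused once the final release has happened.
bool ActivationRefusedAfterFinalRelease() {
    ServerProcessModel server;
    server.AddRef();                       // first object
    bool accepted = server.TryActivate();  // second object, server running
    server.Release();
    long remaining = server.Release();     // final release suspends
    return accepted && remaining == 0 && !server.TryActivate();
}
```

Contrast this with a bare interlocked counter, where an activation can slip in after the decrement but before the revoke because the two steps are not covered by one lock.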

The (hopefully) sole remaining race condition related to class objects only affects multi-CLSID servers that are freethreaded or that have more than one single-threaded apartment. The race condition occurs due to the default behavior of CoRegisterClassObject. Remember, CoRegisterClassObject adds the class factory to an in-process lookup table and sends a notification to the SCM. However, once a freethreaded server exposes its first class object by calling CoRegisterClassObject, incoming activation requests can enter the server immediately. As shown in Figure 4, if the server is attempting to register more than one class object, it is possible that a new object will be created and destroyed on the first CLSID prior to the second CLSID ever being registered. To avoid this situation, CoRegisterClassObject now accepts an additional flag, REGCLS_SUSPENDED, which adds the CLSID entry to OLE32’s lookup table, but does not send a notification to the SCM. By registering all class factories with REGCLS_SUSPENDED, no incoming calls will occur before the last class object is registered. To explicitly send the notification to the SCM for all suspended class objects, COM provides an API function, CoResumeClassObjects, that walks the in-process lookup table and sends a message to the SCM for each suspended class object. For completeness, COM also provides an API, CoSuspendClassObjects, for marking all class objects in a process as suspended. Figure 5 shows some sample code for a simple freethreaded server that demonstrates these APIs.

Figure 4  The Race to CoRegisterClassObject

 

Figure 5 Sample Freethreaded Server

    #define _WIN32_DCOM
    #include <windows.h>
    #include <initguid.h>

    HANDLE g_heventDone = CreateEvent(0, TRUE, FALSE, 0);

    void LockModule(void)
    {
        CoAddRefServerProcess();
    }

    void UnlockModule(void)
    {
        if (CoReleaseServerProcess() == 0)
            SetEvent(g_heventDone);
    }

    int WINAPI WinMain(HINSTANCE, HINSTANCE, LPSTR szCmdParam, int)
    {
        HRESULT hr = CoInitializeEx(0, COINIT_MULTITHREADED);
        if (FAILED(hr))
            return int(hr);

        DWORD dw1, dw2;
        CClassFactory cf1(CMTObject1::Create);
        CClassFactory cf2(CMTObject2::Create);

        hr = CoRegisterClassObject(CLSID_MTObject1, &cf1,
                                   CLSCTX_LOCAL_SERVER,
                                   REGCLS_MULTIPLEUSE|REGCLS_SUSPENDED,
                                   &dw1);
        if (SUCCEEDED(hr))
        {
            hr = CoRegisterClassObject(CLSID_MTObject2, &cf2,
                                       CLSCTX_LOCAL_SERVER,
                                       REGCLS_MULTIPLEUSE|REGCLS_SUSPENDED,
                                       &dw2);
            if (SUCCEEDED(hr))
            {
                // all class objects registered suspended, so tell the SCM we are
                // ready and wait for our internal shutdown notification
                hr = CoResumeClassObjects();
                if (SUCCEEDED(hr))
                    WaitForSingleObject(g_heventDone, INFINITE);

                hr = CoRevokeClassObject(dw2);
            }
            hr = CoRevokeClassObject(dw1);
        }

        Sleep(500);
        CoUninitialize();
        return 0;
    }
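The effect of REGCLS_SUSPENDED and CoResumeClassObjects can be pictured with a small model of the lookup table. This is an illustrative sketch only: ClassTable, Register, ResumeAll, and IsActivatable are invented names, plain integers stand in for CLSIDs, and the real table inside OLE32 is not described in this column.

```cpp
#include <cassert>
#include <map>

class ClassTable {
public:
    // Adds an entry. A suspended entry is present in the table but has
    // not been announced, so activation requests cannot reach it yet.
    void Register(int clsid, bool suspended) {
        assert(entries_.count(clsid) == 0);  // one registration per class here
        entries_[clsid] = !suspended;
    }

    // Models CoResumeClassObjects: announce every suspended entry.
    void ResumeAll() {
        for (auto& entry : entries_)
            entry.second = true;
    }

    // Would an activation request for this class be serviced right now?
    bool IsActivatable(int clsid) const {
        auto it = entries_.find(clsid);
        return it != entries_.end() && it->second;
    }

private:
    std::map<int, bool> entries_;  // clsid -> announced to the SCM?
};

// Registers two classes suspended, optionally resumes them, and counts
// how many are reachable by activation requests at that point.
int ReachableAfterSuspendedRegistration(bool resume) {
    ClassTable table;
    table.Register(1, true);
    table.Register(2, true);
    if (resume)
        table.ResumeAll();
    return (table.IsActivatable(1) ? 1 : 0) + (table.IsActivatable(2) ? 1 : 0);
}
```

Before the resume step neither class is reachable, and afterward both are, so there is no interval in which only the first class can be activated.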

By this point, you may be thinking, "Gee, I’m glad I only write in-process servers where none of these problems affect me." If so, I’ve got some bad news for you. While Windows NT 4.0 provides enough help to write a thread-safe local server, there is a race condition inherent in in-process servers for which there is virtually no good solution. Here is the problem.

In-process servers do not unload themselves from their client’s address space. Instead, they export a callback function called DllCanUnloadNow that returns either S_OK, indicating that it is OK to unload the DLL, or S_FALSE, indicating that now would be a really bad time to unload the DLL. The client triggers this call by calling CoFreeUnusedLibraries at idle time. In a single-threaded world, this is no problem, as there will only be one thread using your DLL for the lifetime of the process. In fact, even in multithreaded processes, if your DLL does not have a "ThreadingModel" named value in the registry (all MFC 4.1 or earlier DLLs fall into this category), it will be loaded by the main apartment of the process and all calls into your DLL will be via an intraprocess proxy/stub connection. Of course, the performance of your objects will be severely hampered by this proxy/stub connection, so, like most programmers, you want to thread-enable your DLL. Here is where the trouble starts. (See the "Multithreading and Inproc Servers" sidebar for more information.)

Assuming that your DLL has a valid "ThreadingModel" registry entry, you now are faced with a dilemma regarding DllCanUnloadNow. You can count on the following guarantees from COM:

· DllCanUnloadNow will always execute in the context of the main apartment, irrespective of which thread calls CoFreeUnusedLibraries.

· All calls to CoGetClassObject or CoCreateInstance in the client process will be blocked for the duration of CoFreeUnusedLibraries.

Given these two constraints, there is still an unavoidable race condition when one thread is performing a final release while the main apartment is executing DllCanUnloadNow.

To get a grasp on the problem, assume that a client is using your server from more than one thread. You must deal with the possibility that one client thread will be in the process of performing the final release on the last object (and thus decrementing your lock count to zero) while a second client thread is in the process of executing CoFreeUnusedLibraries. As shown in Figure 6, when the thread executing CoFreeUnusedLibraries receives S_OK from your DllCanUnloadNow, it will promptly free your library using FreeLibrary as expected. Of course, the original thread is still in the process of executing the destructor of your object, but unfortunately, the code for the destructor has just been yanked out from under the CPU by CoFreeUnusedLibraries. Maybe you’re considering using a global critical section or mutex to lock something in your destructor that would prevent DllCanUnloadNow from returning S_OK, but unfortunately, no matter how long you postpone releasing such a mutex in your destructor, there are always at least a few instructions after the destructor’s last curly brace that you are not in control of (at the very least, operator delete needs to free the memory). It is this destructor tail code that makes implementing DllCanUnloadNow correctly virtually impossible. So, given this state of affairs, what can you do?

Figure 6  The Race to DllCanUnloadNow

First, you can tell yourself and your customers that the window of exposure is sufficiently small that it is not worth worrying about. I wrote a simple test program that had one thread spinning in a tight loop calling CoFreeUnusedLibraries and a second thread in a tight loop calling CoCreateInstance followed by Release. I found that about 1 in 1500 Release calls fired an ACCESS_VIOLATION exception. My test program was the most brutal client imaginable, so you may feel comfortable with this approach. This is the approach taken by both MFC and the ATL.

Another option is to never allow your server to unload. This is the simplest to implement, as your DllCanUnloadNow can simply return S_FALSE in all situations. This is completely thread-safe; your DLL will never be unloaded from the client’s address space.

A third approach is to use timing information in DllCanUnloadNow. Acknowledging that the window of exposure to a crash is just the amount of time it takes to execute the tail-end of the destructor of your object (that is, the code following the InterlockedDecrement in UnlockModule), you could store the time that the server’s lock count transitions to zero. In the implementation of DllCanUnloadNow, compare the current time to the time of the final unlock and only allow an unload if some acceptable window of time has elapsed. The problem with this approach is that no matter how long a time interval you use to postpone unloading, there is no guarantee that the thread that was executing the final destructor has had a chance to complete execution due to either a low thread priority or waiting on some operating system resource. This technique, illustrated in Figure 7, is only a heuristic solution, as you can always write a destructor that can cause a crash (a call to Sleep with a sufficient time interval would suffice).
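One detail worth checking in any implementation of this heuristic is the tick arithmetic. GetTickCount returns a 32-bit millisecond count that wraps to zero after roughly 49.7 days, so a test of the form "last + window < now" can misfire near the wrap. A common remedy, sketched here in portable C++ with the hypothetical helpers WindowElapsed and WindowElapsedNaive, is to subtract first and compare the difference, since unsigned subtraction is well defined modulo 2^32.

```cpp
#include <cassert>
#include <cstdint>

// True when more than windowMs milliseconds separate lastMs from nowMs.
// Assumes nowMs was sampled after lastMs and the real gap is below 2^32 ms.
bool WindowElapsed(std::uint32_t lastMs, std::uint32_t nowMs,
                   std::uint32_t windowMs) {
    return static_cast<std::uint32_t>(nowMs - lastMs) > windowMs;
}

// The addition-based form, kept only for comparison. The sum can wrap
// past zero, which makes the comparison meaningless near the boundary.
bool WindowElapsedNaive(std::uint32_t lastMs, std::uint32_t nowMs,
                        std::uint32_t windowMs) {
    return static_cast<std::uint32_t>(lastMs + windowMs) < nowMs;
}
```

For example, with the last unlock at tick 0xFFFFFF00 and the check made 16 ms later, the subtraction form correctly reports that a 1000 ms window has not elapsed, while the addition form wraps and reports that it has.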

Figure 7 Avoiding a Race Condition

    ////////////////////////////////////////////
    // Simple Module Locking Routines for
    // in-process servers

    #include <windows.h>

    // how many reasons do we have to remain running?
    LONG g_cLocks = 0;

    // when was the final unlock called?
    DWORD g_dwLastUnlockTime = 0;

    // how long should we postpone unloading?
    enum { SAFETY_WINDOW = 1000 };

    // Routine to call from instance constructors
    void LockModule(void)
    {
        InterlockedIncrement(&g_cLocks);
    }

    // Routine to call from instance destructors
    void UnlockModule(void)
    {
        // record current time
        InterlockedExchange((LONG*)&g_dwLastUnlockTime, GetTickCount());
        InterlockedDecrement(&g_cLocks);
    }

    STDAPI DllCanUnloadNow(void)
    {
        // only say "unload" if no locks and SAFETY_WINDOW msecs
        // have elapsed since the last unlock; subtracting first keeps
        // the test valid when the tick count wraps around
        if (g_cLocks == 0 &&
            GetTickCount() - g_dwLastUnlockTime > SAFETY_WINDOW)
            return S_OK;
        else
            return S_FALSE;
    }

Microsoft is aware of this problem and will likely provide some support in a subsequent release of COM. Until then, you must decide which of the three approaches best suits you. If you are writing multithreaded code that will be a client to DLLs you are not in control of, you may want to protect yourself against DLLs that use a technique to implement DllCanUnloadNow that is more liberal than you find acceptable. One way to do this is to wrap all calls to Release in a try/except block, as follows:

    BOOL ReleaseInterface(IUnknown **ppunk) {
        __try {
            if (!*ppunk) return TRUE;
            (*ppunk)->Release();
            *ppunk = 0;
            return TRUE;
        }
        __except(EXCEPTION_EXECUTE_HANDLER) {
            // silently ignore all exceptions
            return FALSE;
        }
    }

This code is not necessary if the DLLs are truly thread-ignorant and lack a "ThreadingModel" registry entry, as all calls will be dispatched to a single thread. You should be aware that if the function above returns FALSE, the destructor was not allowed to complete execution, potentially leaving something important in an indeterminate state. Whether this is preferable to allowing the client program to crash is largely a matter of personal taste. Both are fairly obnoxious.

Armed with an understanding of the pitfalls of multithreading, here are some guidelines for implementing out-of-process servers that are as solid as possible:

· Be sure to use a thread-safe version of the C runtime library, which is not necessarily the default for your development environment.

· Use InterlockedIncrement and InterlockedDecrement on your server’s lock count.

· Implement LockServer correctly on your class factories.

· Implement AddRef and Release in a thread-safe manner if your server is freethreaded.

· Implement IExternalConnection on your class factories if compatibility with Windows NT 3.51 and Windows 95 is needed.

· In apartment-threaded servers, be sure to revoke class objects immediately after deciding to shut down but before servicing the Windows message queue to ensure no new objects are created.

· Use the REGCLS_SUSPENDED flag along with CoResumeClassObjects to ensure all class factories are exposed to the SCM simultaneously (Windows NT 4.0 only).

· Use CoAddRefServerProcess and CoReleaseServerProcess to ensure that the SCM is notified the instant your process decides to shut down to prevent further activation requests (Windows NT 4.0 only).

· Ensure that calls to your class factory’s CreateInstance method return CO_E_SERVER_STOPPING if they arrive after you have decided to shut down (Windows NT 4.0 only).

For in-process servers, the first four guidelines apply, plus one more: be aware of the race condition inherent in DllCanUnloadNow and choose a strategy that is acceptable for your domain.

I hope I’ve impressed upon you the downside of using threads. Threads, like chainsaws, are extremely useful for solving a variety of problems, but if used with carelessness or under the influence of mind-numbing chemicals, can cause a tremendous amount of damage to both the user and innocent bystanders. If you would like to see how adept (or inept) the author is with chainsaws, point your browser to http://www.develop.com/dbox/msj/0197.htm to stress-test a DCOM freethreaded server.

To obtain complete source code listings, see Editor's page.

Have a question about programming in ActiveX? Send your questions via e-mail to Don Box: dbox@braintrust.com

This article is reproduced from Microsoft Systems Journal. Copyright © 1997 by Miller Freeman, Inc. All rights are reserved. No part of this article may be reproduced in any fashion (except in brief quotations used in critical articles and reviews) without the prior consent of Miller Freeman.
