Multithreaded Programming

As you'll recall from Chapter 10, a process is a running program that owns its own memory, file handles, and other system resources. An individual process can contain separate execution paths, called threads. Don't look for separate code for separate threads, however, because a single function can be called from many threads. For the most part, all of a process's code and data space is available to all of the threads in the process. Two threads, for example, can access the same global variables. Threads are managed by the operating system, and each thread has its own stack.

Windows offers two kinds of threads, worker threads and user interface threads. The Microsoft Foundation Class (MFC) Library supports both. A user interface thread has windows, and therefore it has its own message loop. A worker thread doesn't have windows, so it doesn't need to process messages. Worker threads are easier to program and are generally more useful. The remaining examples in this chapter illustrate worker threads. At the end of the chapter, however, an application for a user interface thread is described.

Don't forget that even a single-threaded application has one thread—the main thread. In the MFC hierarchy, CWinApp is derived from CWinThread. Back in Chapter 2, I told you that InitInstance and m_pMainWnd are members of CWinApp. Well, I lied. The members are declared in CWinThread, but of course they're inherited by CWinApp. The important thing to remember here is that an application is a thread.

Writing the Worker Thread Function and Starting the Thread

If you haven't guessed already, using a worker thread for a long computation is more efficient than using a message handler that contains a PeekMessage call. Before you start a worker thread, however, you must write a global function for your thread's main program. This global function should return a UINT, and it should take a single 32-bit value (declared LPVOID) as a parameter. You can use the parameter to pass anything at all to your thread when you start it. The thread does its computation, and when the global function returns, the thread terminates. The thread would also be terminated if the process terminated, but it's preferable to ensure that the worker thread terminates first, so that it can release its own resources and you avoid memory leaks.

To start the thread (with function name ComputeThreadProc), your program makes the following call:

CWinThread* pThread =
    AfxBeginThread(ComputeThreadProc, GetSafeHwnd(),
                   THREAD_PRIORITY_NORMAL);

The compute thread code looks like this:

UINT ComputeThreadProc(LPVOID pParam)
{
    // Do thread processing
    return 0;
}

The AfxBeginThread function returns immediately; the return value is a pointer to the newly created thread object. You can use that pointer to suspend and resume the thread (CWinThread::SuspendThread and ResumeThread), but the thread object has no member function to terminate the thread. The second parameter is the 32-bit value that gets passed to the global function, and the third parameter is the thread's priority code. Once the worker thread starts, both threads run independently. Windows divides the time between the two threads (and among the threads that belong to other processes) according to their priority. If the main thread is waiting for a message, the compute thread can still run.
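The parameter-passing idea described above can be tried outside MFC. The following is a minimal sketch that substitutes standard C++ std::thread for AfxBeginThread so that it compiles anywhere; the ComputeParams structure and the RunCompute helper are hypothetical names invented for this illustration, not part of MFC or of the chapter's sample programs.

```cpp
#include <cassert>
#include <thread>

// Hypothetical parameter block. In the MFC version, the single LPVOID
// argument would carry a pointer like this one (or a window handle).
struct ComputeParams {
    int  nIterations;   // input read by the thread
    long nResult;       // output written by the thread
};

// Same shape as a worker thread function: one untyped pointer in,
// an unsigned exit code out.
unsigned ComputeThreadProc(void* pParam)
{
    assert(pParam != nullptr);
    ComputeParams* p = static_cast<ComputeParams*>(pParam);
    p->nResult = 0;
    for (int i = 1; i <= p->nIterations; i++) {
        p->nResult += i;    // stand-in for real computation
    }
    return 0;               // returning ends the thread
}

// Starts the worker, waits for it to finish, and returns the sum.
long RunCompute(int nIterations)
{
    ComputeParams params = { nIterations, 0 };
    std::thread worker(ComputeThreadProc, &params);
    worker.join();          // params must outlive the thread
    return params.nResult;
}
```

The point to notice is the lifetime rule: whatever the pointer refers to must stay valid until the thread has finished using it, which is why RunCompute joins the thread before params goes out of scope.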

How the Main Thread Talks to a Worker Thread

The main thread (your application program) can communicate with the subsidiary worker thread in many different ways. One option that will not work, however, is a Windows message; the worker thread doesn't have a message loop. The simplest means of communication is a global variable because all the threads in the process have access to all the globals. Suppose the worker thread increments and tests a global integer as it computes and then exits when the value reaches 100. The main thread could force the worker thread to terminate by setting the global variable to 100 or higher.

The code below looks as though it should work, and when you test it, it probably will:

UINT ComputeThreadProc(LPVOID pParam)
{
    g_nCount = 0;
    while (g_nCount++ < 100) {
        // Do some computation here
    }
    return 0;
}

There's a problem, however, that you could detect only by looking at the generated assembly code. The value of g_nCount gets loaded into a register, the register is incremented, and then the register value is stored back in g_nCount. Suppose g_nCount is 40 and Windows interrupts the worker thread just after the worker thread loads 40 into the register. Now the main thread gets control and sets g_nCount to 100. When the worker thread resumes, it increments the register value and stores 41 back into g_nCount, obliterating the previous value of 100. The thread loop doesn't terminate!

If you turn on the compiler's optimization switch, you'll have an additional problem. The compiler uses a register for g_nCount, and the register stays loaded for the duration of the loop. If the main thread changes the value of g_nCount in memory, it will have no effect on the worker thread's compute loop. (You can ensure that the counter isn't stored in a register, however, by declaring g_nCount as volatile.)

But suppose you rewrite the thread procedure as shown here:

UINT ComputeThreadProc(LPVOID pParam)
{
    g_nCount = 0;
    while (g_nCount < 100) {
        // Do some computation here
        ::InterlockedIncrement((long*) &g_nCount);
    }
    return 0;
}

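The InterlockedIncrement function performs the increment as a single atomic operation, blocking other threads from accessing the variable while it is being incremented. A value stored by the main thread can therefore no longer be overwritten by a stale register value: if the main thread sets g_nCount to 100, the worker thread at worst increments it to 101, the loop test fails, and the thread exits. The main thread can safely stop the worker thread.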
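For readers who want to experiment without the Win32 headers, here is a rough portable equivalent of the interlocked loop. It assumes that std::atomic<int>::fetch_add is an acceptable stand-in for InterlockedIncrement; the StopWorkerEarly driver is a hypothetical helper that plays the part of the main thread.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

std::atomic<int> g_nCount(0);   // shared counter, visible to all threads

// Worker: counts upward atomically and exits once the count reaches 100.
unsigned ComputeThreadProc(void* /*pParam*/)
{
    while (g_nCount.load() < 100) {
        // Do some computation here
        g_nCount.fetch_add(1);  // atomic read-modify-write
    }
    return 0;
}

// Hypothetical driver: starts the worker and immediately forces it to stop.
int StopWorkerEarly()
{
    g_nCount.store(0);
    std::thread worker(ComputeThreadProc, nullptr);
    g_nCount.store(100);        // this store cannot be lost
    worker.join();
    int nFinal = g_nCount.load();
    assert(nFinal >= 100);      // 100, or 101 if an increment followed the store
    return nFinal;
}
```

Because the increment is indivisible, the store of 100 lands either before it or after it, never in the middle, so the worker always sees a value of at least 100 on its next test.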

Now you've seen some of the pitfalls of using global variables for communication. Using global variables is sometimes appropriate, as the next example illustrates, but there are alternative methods that are more flexible, as you'll see later in this chapter.

How the Worker Thread Talks to the Main Thread

It makes sense for the worker thread to check a global variable in a loop, but what if the main thread did that? Remember the pig function? You definitely don't want your main thread to enter a loop because that would waste CPU cycles and stop your program's message processing. A Windows message is the preferred way for a worker thread to communicate with the main thread because the main thread always has a message loop. This implies, however, that the main thread has a window (visible or invisible) and that the worker thread has a handle to that window.

How does the worker thread get the handle? That's what the 32-bit thread function parameter is for. You pass the handle in the AfxBeginThread call. Why not pass the C++ window pointer instead? Doing so would be dangerous because you can't depend on the continued existence of the object and you're not allowed to share objects of MFC classes among threads. (This rule does not apply to objects derived directly from CObject or to simple classes such as CRect and CString.)

Do you send the message or post it? Better to post it, because sending it could cause reentry of the main thread's MFC message pump code, and that would create problems in modal dialogs. What kind of message do you post? Any user-defined message will do.
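The difference between posting and sending is easier to see in a stripped-down model. The sketch below is not the Windows message queue; it is a hypothetical MessageQueue class built from a standard mutex and condition variable, with MSG_THREADFINISHED as an invented message ID. Post returns at once without waiting for the receiver, which is the property that makes posting safe for a worker thread.

```cpp
#include <cassert>
#include <condition_variable>
#include <mutex>
#include <queue>
#include <thread>

const int MSG_THREADFINISHED = 1;   // hypothetical message ID

// Minimal thread-safe queue standing in for a window's message queue.
class MessageQueue {
public:
    // Like posting: enqueue the message and return immediately.
    void Post(int nMsg)
    {
        {
            std::lock_guard<std::mutex> lock(m_mutex);
            m_queue.push(nMsg);
        }
        m_cond.notify_one();
    }
    // Called by the receiving thread; blocks until a message arrives.
    int Get()
    {
        std::unique_lock<std::mutex> lock(m_mutex);
        m_cond.wait(lock, [this] { return !m_queue.empty(); });
        assert(!m_queue.empty());
        int nMsg = m_queue.front();
        m_queue.pop();
        return nMsg;
    }
private:
    std::mutex m_mutex;
    std::condition_variable m_cond;
    std::queue<int> m_queue;
};

// The worker posts a completion message; the caller plays the main thread.
int RunWorkerAndWait()
{
    MessageQueue queue;
    std::thread worker([&queue] {
        // ... computation would go here ...
        queue.Post(MSG_THREADFINISHED);
    });
    int nMsg = queue.Get();     // the "message loop" retrieves it
    worker.join();
    return nMsg;
}
```

In the real program the main thread's message loop does the retrieving, and the message is dispatched to a handler in the window class.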

The EX12B Program

The EX12B program looks exactly like the EX12A program when you run it. When you look at the code, however, you'll see some differences. The computation is done in a worker thread instead of in the main thread. The count value is stored in a global variable g_nCount, which is set to the maximum value in the dialog window's Cancel button handler. When the thread exits, it posts a message to the dialog, which causes DoModal to exit.

The document, view, frame, and application classes are the same except for their names, and the dialog resource is the same. The modal dialog class is still named CComputeDlg, but the code inside is quite different. The constructor, timer handler, and data exchange functions are pretty much the same. The following code fragment shows the global variable definition and the global thread function as given in the \ex12b\ComputeDlg.cpp file on the companion CD-ROM. Note that the function exits (and the thread terminates) when g_nCount is greater than a constant maximum value. Before it exits, however, the function posts a user-defined message to the dialog window.

int g_nCount = 0; 

UINT ComputeThreadProc(LPVOID pParam)
{
    volatile int nTemp; // volatile else compiler optimizes too much

    for (g_nCount = 0; g_nCount < CComputeDlg::nMaxCount;
                       ::InterlockedIncrement((long*) &g_nCount)) {
        for (nTemp = 0; nTemp < 10000; nTemp++) {
            // uses up CPU cycles
        }
    }
    // WM_THREADFINISHED is user-defined message
    ::PostMessage((HWND) pParam, WM_THREADFINISHED, 0, 0);
    g_nCount = 0;
    return 0; // ends the thread
}

The OnStart handler below is mapped to the dialog's Start button. Its job is to start the timer and the worker thread. You can change the worker thread's priority by changing the third parameter of AfxBeginThread—for example, the computation runs a little more slowly if you set the priority to THREAD_PRIORITY_LOWEST.

void CComputeDlg::OnStart()
{
    m_nTimer = SetTimer(1, 100, NULL); // 1/10 second
    ASSERT(m_nTimer != 0);
    GetDlgItem(IDC_START)->EnableWindow(FALSE);
    AfxBeginThread(ComputeThreadProc, GetSafeHwnd(),
                   THREAD_PRIORITY_NORMAL);
}

The OnCancel handler below is mapped to the dialog's Cancel button. It sets the g_nCount variable to the maximum value, causing the thread to terminate.

void CComputeDlg::OnCancel()
{
    if (g_nCount == 0) { // prior to Start button
        CDialog::OnCancel();
    }
    else { // computation in progress
        g_nCount = nMaxCount; // Force thread to exit
    }
}

The OnThreadFinished handler below is mapped to the dialog's WM_THREADFINISHED user-defined message. It causes the dialog's DoModal function to exit.

LRESULT CComputeDlg::OnThreadFinished(WPARAM wParam, LPARAM lParam)
{
    CDialog::OnOK();
    return 0;
}

Using Events for Thread Synchronization

The global variable is a crude but effective means of interthread communication. Now let's try something more sophisticated. We want to think in terms of thread synchronization instead of simple communication. Our threads must carefully synchronize their interactions with one another.

An event is one type of kernel object (processes and threads are also kernel objects) that Windows provides for thread synchronization. An event is identified by a unique 32-bit handle within a process. It can be identified by name, or its handle can be duplicated for sharing among processes. An event can be either in the signaled (or true) state or in the unsignaled (or false) state. Events come in two types: manual reset and autoreset. We'll be looking at autoreset events here because they're ideal for the synchronization of two threads.

Let's go back to our worker thread example. We want the main (user interface) thread to "signal" the worker thread to make it start or stop, so we'll need a "start" event and a "kill" event. MFC provides a handy CEvent class that's derived from CSyncObject. By default, the constructor creates a Win32 autoreset event object in the unsignaled state. If you declare your events as global objects, any thread can easily access them. When the main thread wants to start or terminate the worker thread, it sets the appropriate event to the signaled state by calling CEvent::SetEvent.

Now the worker thread must monitor the two events and respond when one of them is signaled. MFC provides the CSingleLock class for this purpose, but it's easier to use the Win32 WaitForSingleObject function. This function suspends the thread until the specified object becomes signaled. When the thread is suspended, it's not using any CPU cycles—which is good. The first WaitForSingleObject parameter is the event handle. You can use a CEvent object for this parameter; the object inherits from CSyncObject an operator HANDLE that returns the event handle it has stored as a public data member. The second parameter is the time-out interval. If you set this parameter to INFINITE, the function waits forever until the event becomes signaled. If you set the time-out to 0, WaitForSingleObject returns immediately, with a return value of WAIT_OBJECT_0 if the event was signaled.
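To make the autoreset behavior and the two time-out styles concrete, here is a small portable model. It is only a sketch: AutoResetEvent is a hypothetical class built on a standard mutex and condition variable, not the CEvent class, and a negative time-out is used here as an assumed stand-in for INFINITE.

```cpp
#include <cassert>
#include <chrono>
#include <condition_variable>
#include <mutex>

// Hypothetical portable stand-in for an autoreset event.
class AutoResetEvent {
public:
    AutoResetEvent() : m_bSignaled(false) {}    // starts unsignaled

    // Puts the event in the signaled state and wakes one waiter.
    void Set()
    {
        {
            std::lock_guard<std::mutex> lock(m_mutex);
            m_bSignaled = true;
        }
        m_cond.notify_one();
    }
    // Waits up to nMillis milliseconds; a negative value waits forever.
    // Returns true if the event was signaled, and resets it on the way out.
    bool Wait(int nMillis)
    {
        std::unique_lock<std::mutex> lock(m_mutex);
        if (nMillis < 0) {
            m_cond.wait(lock, [this] { return m_bSignaled; });
        }
        else if (!m_cond.wait_for(lock, std::chrono::milliseconds(nMillis),
                                  [this] { return m_bSignaled; })) {
            return false;       // timed out, still unsignaled
        }
        assert(m_bSignaled);
        m_bSignaled = false;    // the "autoreset" part
        return true;
    }
private:
    std::mutex m_mutex;
    std::condition_variable m_cond;
    bool m_bSignaled;
};

// Exercises the event on a single thread.
bool DemoAutoReset()
{
    AutoResetEvent ev;
    bool bBefore = ev.Wait(0);  // false: not yet signaled
    ev.Set();
    bool bAfter = ev.Wait(0);   // true: signaled, and now reset
    bool bAgain = ev.Wait(0);   // false: the successful wait reset it
    return !bBefore && bAfter && !bAgain;
}
```

A zero time-out turns the wait into a quick poll, while the infinite wait suspends the thread until another thread calls Set.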

The EX12C Program

The EX12C program uses two events to synchronize the worker thread with the main thread. Most of the EX12C code is the same as EX12B, but the CComputeDlg class is quite different. The StdAfx.h file contains the following line for the CEvent class:

#include <afxmt.h>

There are two global event objects, as shown below. Note that the constructors create the Windows events prior to the execution of the main program.

CEvent g_eventStart; // creates autoreset events
CEvent g_eventKill;

It's best to look at the worker thread global function first. The function increments g_nCount just as it did in EX12B. The worker thread is started by the OnInitDialog function instead of by the Start button handler. The first WaitForSingleObject call waits for the start event, which is signaled by the Start button handler. The INFINITE parameter means that the thread waits as long as necessary. The second WaitForSingleObject call is different—it has a 0 time-out value. It's located in the main compute loop and simply makes a quick test to see whether the kill event was signaled by the Cancel button handler. If the event was signaled, the thread terminates.

UINT ComputeThreadProc(LPVOID pParam)
{
    volatile int nTemp;

    ::WaitForSingleObject(g_eventStart, INFINITE);
    TRACE("starting computation\n");
    for (g_nCount = 0; g_nCount < CComputeDlg::nMaxCount;
                       g_nCount++) {
        for (nTemp = 0; nTemp < 10000; nTemp++) {
            // Simulate computation
        }
        if (::WaitForSingleObject(g_eventKill, 0) == WAIT_OBJECT_0) {
            break;
        }
    }
    // Tell owner window we're finished
    ::PostMessage((HWND) pParam, WM_THREADFINISHED, 0, 0);
    g_nCount = 0;
    return 0; // ends the thread
}

Here is the OnInitDialog function that's called when the dialog is initialized. Note that it starts the worker thread, which doesn't do anything until the start event is signaled.

BOOL CComputeDlg::OnInitDialog()
{
    CDialog::OnInitDialog();
    AfxBeginThread(ComputeThreadProc, GetSafeHwnd());
    return TRUE;  // Return TRUE unless you set the focus to a control
                  // EXCEPTION: OCX Property Pages should return FALSE
}

The following Start button handler sets the start event to the signaled state, thereby starting the worker thread's compute loop:

void CComputeDlg::OnStart()
{
    m_nTimer = SetTimer(1, 100, NULL); // 1/10 second
    ASSERT(m_nTimer != 0);
    GetDlgItem(IDC_START)->EnableWindow(FALSE);
    g_eventStart.SetEvent();
}

The following Cancel button handler sets the kill event to the signaled state, causing the worker thread's compute loop to terminate:

void CComputeDlg::OnCancel()
{
    if (g_nCount == 0) { // prior to Start button
        // Must start it before we can kill it
        g_eventStart.SetEvent();
    }
    g_eventKill.SetEvent();
}

Note the awkward use of the start event when the user cancels without starting the compute process. It might be neater to define a new cancel event and then replace the first WaitForSingleObject call with a WaitForMultipleObjects call in the ComputeThreadProc function. If WaitForMultipleObjects detected a cancel event, it could cause an immediate thread termination.
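The wait-on-either-signal idea suggested above can be modeled without the Win32 API. This sketch does not call WaitForMultipleObjects; it assumes a hypothetical StartCancelGate class that lets a thread sleep until either of two flags is raised, and it gives cancel priority when both are set, which is a design choice made for this illustration.

```cpp
#include <cassert>
#include <condition_variable>
#include <mutex>
#include <thread>

enum GateSignal { SIGNAL_START, SIGNAL_CANCEL };

// Hypothetical gate that a worker waits on for either of two signals.
class StartCancelGate {
public:
    StartCancelGate() : m_bStart(false), m_bCancel(false) {}

    void Raise(GateSignal s)
    {
        {
            std::lock_guard<std::mutex> lock(m_mutex);
            if (s == SIGNAL_START) m_bStart = true;
            else                   m_bCancel = true;
        }
        m_cond.notify_all();
    }
    // Blocks until start or cancel is raised; cancel wins if both are set.
    GateSignal WaitAny()
    {
        std::unique_lock<std::mutex> lock(m_mutex);
        m_cond.wait(lock, [this] { return m_bStart || m_bCancel; });
        assert(m_bStart || m_bCancel);
        return m_bCancel ? SIGNAL_CANCEL : SIGNAL_START;
    }
private:
    std::mutex m_mutex;
    std::condition_variable m_cond;
    bool m_bStart;
    bool m_bCancel;
};

// Runs a worker that exits at once on cancel, or counts to 100 on start.
int RunWorker(GateSignal first)
{
    StartCancelGate gate;
    int nCount = -1;
    std::thread worker([&] {
        nCount = 0;
        if (gate.WaitAny() == SIGNAL_CANCEL) {
            return;             // canceled before it ever started
        }
        for (int i = 0; i < 100; i++) {
            nCount++;           // stand-in for the compute loop
        }
    });
    gate.Raise(first);
    worker.join();
    return nCount;
}
```

With a single wait that covers both signals, the Cancel handler no longer needs to fake a start signal just to wake the thread.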

Thread Blocking

The first WaitForSingleObject call in the ComputeThreadProc function above is an example of thread blocking. The thread simply stops executing until an event becomes signaled. A thread could be blocked in many other ways. You could call the Win32 Sleep function, for example, to put your thread to "sleep" for 500 milliseconds. Many functions block threads, particularly those functions that access hardware devices or Internet hosts. Back in the Win16 days, those functions took over the CPU until they were finished. In Win32, they allow other processes and threads to run.

You should avoid putting blocking calls in your main user interface thread. Remember that if your main thread is blocked, it can't process its messages, and that makes the program appear sluggish. If you have a task that requires heavy file I/O, put the code in a worker thread and synchronize it with your main thread.

Be careful of calls in your worker thread that could block indefinitely. Check the online documentation to determine whether you have the option of setting a time-out value for a particular I/O operation. If a call does block forever, the thread will be terminated when the main process exits, but then you'll have some memory leaks. You could call the Win32 TerminateThread function from your main thread, but you'd still have the memory-leak problem.
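The value of a time-out can be shown with a small, portable sketch. It assumes the blocking operation delivers its result through a standard std::future, which is a substitution made for illustration and not how Win32 I/O calls report completion; WaitForResult and DemoTimeout are hypothetical helper names.

```cpp
#include <cassert>
#include <chrono>
#include <future>

// Waits up to nMillis milliseconds for a result.
// Returns false on time-out so that the caller can decide what to do next.
bool WaitForResult(std::future<int>& result, int nMillis, int& nValue)
{
    assert(result.valid());
    if (result.wait_for(std::chrono::milliseconds(nMillis))
            != std::future_status::ready) {
        return false;           // gave up; the thread is free to move on
    }
    nValue = result.get();
    return true;
}

// One result that never arrives and one that is already available.
bool DemoTimeout()
{
    int nValue = 0;

    std::promise<int> neverSet;
    std::future<int> pending = neverSet.get_future();
    bool bTimedOut = !WaitForResult(pending, 10, nValue);

    std::promise<int> done;
    std::future<int> ready = done.get_future();
    done.set_value(42);
    bool bGot = WaitForResult(ready, 10, nValue) && nValue == 42;

    return bTimedOut && bGot;
}
```

A bounded wait lets the worker thread notice that something has gone wrong, clean up, and return normally instead of hanging until the process exits.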

Critical Sections

Remember the problems with access to the g_nCount global variable? If you want to share global data among threads and you need more flexibility than simple instructions like InterlockedIncrement can provide, critical sections might be the synchronization tools for you. Events are good for signaling, but critical sections (sections of code that require exclusive access to shared data) are good for controlling access to data.

MFC provides the CCriticalSection class that wraps the Windows critical section handle. The constructor calls the Win32 InitializeCriticalSection function, the Lock and Unlock member functions call EnterCriticalSection and LeaveCriticalSection, and the destructor calls DeleteCriticalSection. Here's how you use the class to protect global data:

CCriticalSection g_cs;    // global variables accessible from all threads
int g_nCount;
void func()
{
    g_cs.Lock();
    g_nCount++;
    g_cs.Unlock();
}
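The Lock and Unlock pairing above can also be written with a scoped lock, so that the unlock cannot be forgotten on an early return. The sketch below assumes std::mutex and std::lock_guard as portable stand-ins for CCriticalSection; the CountWithTwoThreads helper is hypothetical and exists only to exercise the lock from two threads.

```cpp
#include <cassert>
#include <mutex>
#include <thread>

std::mutex g_mutex;     // stands in for the CCriticalSection object
int g_nCount = 0;       // shared data protected by g_mutex

void func()
{
    std::lock_guard<std::mutex> lock(g_mutex);  // locks here
    g_nCount++;
}   // unlocks here, even if the body returns early or throws

// Two threads each call func() nPerThread times.
int CountWithTwoThreads(int nPerThread)
{
    assert(nPerThread >= 0);
    g_nCount = 0;
    auto body = [nPerThread] {
        for (int i = 0; i < nPerThread; i++) {
            func();
        }
    };
    std::thread a(body);
    std::thread b(body);
    a.join();
    b.join();
    return g_nCount;    // no increments are lost
}
```

Because every increment happens while the lock is held, the final count is exactly the number of calls made by both threads.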

Suppose your program tracks time values as hours, minutes, and seconds, each stored in a separate integer, and suppose two threads are sharing time values. Thread A is changing a time value but is interrupted by thread B after it has updated hours but before it has updated minutes and seconds. Thread B will have an invalid time value.

If you write a C++ class for your time format, it's easy to control data access by making the data members private and providing public member functions. The CHMS class, shown in Figure 12-2, does exactly that. Notice that the class has a data member of type CCriticalSection. Thus, a critical section object is associated with each CHMS object.

Notice that the other member functions call the Lock and Unlock member functions. If thread A is executing in the middle of SetTime, thread B will be blocked by the Lock call in GetTotalSecs until thread A calls Unlock. The IncrementSecs function calls SetTime, resulting in nested locks on the critical section. That's okay because Windows keeps track of the nesting level.

The CHMS class works well if you use it to construct global objects. If you share pointers to objects on the heap, you have another set of problems. Each thread must determine whether another thread has deleted the object, and that means you must synchronize access to the pointers.

HMS.H
#include "StdAfx.h"

class CHMS
{
private:
    int m_nHr, m_nMn, m_nSc;
    CCriticalSection m_cs;
public:
    CHMS() : m_nHr(0), m_nMn(0), m_nSc(0) {}
    ~CHMS() {}

    void SetTime(int nSecs)
    {
        m_cs.Lock();
        m_nSc = nSecs % 60;
        m_nMn = (nSecs / 60) % 60;
        m_nHr = nSecs / 3600;
        m_cs.Unlock();
    }

    int GetTotalSecs()
    {
        int nTotalSecs;
        m_cs.Lock();
        nTotalSecs = m_nHr * 3600 + m_nMn * 60 + m_nSc;
        m_cs.Unlock();
        return nTotalSecs;
    }

    void IncrementSecs()
    {
        m_cs.Lock();
        SetTime(GetTotalSecs() + 1);
        m_cs.Unlock();
    }
};

Figure 12-2. The CHMS class listing.

No sample program is provided that uses the CHMS class, but the file hms.h is included in the \vcpp32\ex12c subdirectory on the companion CD-ROM. If you write a multithreaded program, you can share global objects of the class. You don't need any other calls to the thread-related functions.
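If you want to try the class without MFC, a portable approximation is possible. The sketch below assumes std::recursive_mutex as the stand-in for CCriticalSection, because it permits the same kind of nested locking by one thread that IncrementSecs relies on; a plain std::mutex would not. The DemoHMS function is a hypothetical test helper.

```cpp
#include <cassert>
#include <mutex>

// Portable approximation of the CHMS class.
class CHMS {
public:
    CHMS() : m_nHr(0), m_nMn(0), m_nSc(0) {}

    void SetTime(int nSecs)
    {
        assert(nSecs >= 0);
        std::lock_guard<std::recursive_mutex> lock(m_mutex);
        m_nSc = nSecs % 60;
        m_nMn = (nSecs / 60) % 60;
        m_nHr = nSecs / 3600;
    }
    int GetTotalSecs()
    {
        std::lock_guard<std::recursive_mutex> lock(m_mutex);
        return m_nHr * 3600 + m_nMn * 60 + m_nSc;
    }
    void IncrementSecs()
    {
        // Outer lock keeps the read and the write together as one step.
        std::lock_guard<std::recursive_mutex> lock(m_mutex);
        SetTime(GetTotalSecs() + 1);    // nested locks on the same thread
    }
private:
    int m_nHr, m_nMn, m_nSc;
    std::recursive_mutex m_mutex;       // one lock per CHMS object
};

// Rolls 0:59:59 over to 1:00:00 and returns the total in seconds.
int DemoHMS()
{
    CHMS time;
    time.SetTime(3599);
    time.IncrementSecs();
    return time.GetTotalSecs();
}
```

The outer lock in IncrementSecs matters: without it, another thread could change the time between the call to GetTotalSecs and the call to SetTime, and its update would be lost.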

Mutexes and Semaphores

As I mentioned, I'm leaving these synchronization objects to Jeffrey Richter's Advanced Windows. You might need a mutex or a semaphore if you're controlling access to data across different processes because a critical section is accessible only within a single process. Mutexes and semaphores (along with events) are shareable by name.

User Interface Threads

The MFC library provides good support for UI threads. You derive a class from CWinThread, and you use an overloaded version of AfxBeginThread to start the thread. Your derived CWinThread class has its own InitInstance function, and most important, it has its own message loop. You can construct windows and map messages as required.

Why might you want a user interface thread? If you want multiple top-level windows, you can create and manage them from your main thread. Suppose you allow the user to run multiple instances of your application, but you want all instances to share memory. You can configure a single process to run multiple UI threads such that users think they are running separate processes. That's exactly what Windows Explorer does. Check it out with SPY++.

Starting the second and subsequent threads is a little tricky because the user actually launches a new process for each copy of Windows Explorer. When the second process starts, it signals the first process to start a new thread, and then it exits. The second process can locate the first process either by calling the Win32 FindWindow function or by declaring a shared data section. Shared data sections are explained in detail in Jeffrey Richter's book.