One of the most important elements of performance tuning is to maintain Performance Monitor logs. If you don't record this information, you will have nothing to work with. You can set these logs to record only every two or five minutes; this is sufficient for most performance tuning work and doesn't require too much storage space.
There are three major areas to watch when you are monitoring a server with live users.
Observe whether your users are sending more messages now than they did a few months ago, or whether the average message size is increasing. These factors will change the load on the server, and if you are not aware of them, the load can slowly increase until you have a response-time problem. Microsoft Exchange Server provides counters that give an indication of the overall workload.
If you monitor a server alone, it is nearly impossible to calculate the overall response times that users will experience. However, you can observe the server components of the response time, and get a general idea of whether slow server response times are becoming an issue. Microsoft Exchange Server provides counters that indicate server response times.
By observing the utilization of various resources, you can see where the bottleneck in a system is, and even get an idea of where the next bottleneck might occur after you relieve the current one.
The following sections describe the counters of most interest to an administrator, performance analyst, or capacity planner when performing system tuning.
These counters don't provide a complete picture of the load on your Microsoft Exchange Server computer, but they will indicate trends over time if you track them.
Object | Counter | Description |
MSExchangeIS | User Count | The number of connected client sessions. |
MSExchangeIS | Active User Count | The number of users who have been active in the last 10 minutes. |
MSExchangeIS Private | Messages Submitted/min | The rate of messages being submitted to the private information store. |
MSExchangeIS Private | Message Recipients Delivered/min | The rate of messages being delivered by the private information store. This will be higher than the submission rate because many messages have multiple recipients. |
MSExchangeIS Public | Messages Submitted/min | The rate of messages being submitted to the public information store. |
MSExchangeIS Public | Message Recipients Delivered/min | The rate of messages being delivered by the public information store. This will be higher than the submission rate because many messages have multiple recipients. |
MSExchangeMTA | Messages/sec | The rate at which the MTA is processing messages. |
MSExchangeMTA | Messages Bytes/sec | The number of bytes in the messages being processed by the MTA. Divide this by the Messages/sec counter, and you can determine the average message size. |
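The average message size calculation described for the MTA counters can be sketched as follows. This is a minimal illustration, not part of Microsoft Exchange Server or Performance Monitor; the function name and the sample values are hypothetical, and the inputs are assumed to be readings of the two MTA counters taken over the same logging interval.

```python
def average_message_size(message_bytes_per_sec: float, messages_per_sec: float) -> float:
    """Return the average bytes per message for one logging interval.

    Returns 0.0 when no messages were processed, to avoid dividing by zero.
    """
    if messages_per_sec <= 0:
        return 0.0
    return message_bytes_per_sec / messages_per_sec


# Hypothetical readings: 12,000 bytes/sec at 4 messages/sec.
print(average_message_size(12_000.0, 4.0))
```

Tracking this value across several months of logs is one way to see whether message sizes are growing.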
These counters don't provide a complete picture of the responsiveness of your Microsoft Exchange Server computer, but they will indicate trends over time if you track them.
Object | Counter | Description |
MSExchangeIS Private | Send Queue Size | Indicates whether the information store is keeping up with the submitted load. The queue can be non-zero at peak traffic times, but it shouldn't stay there long after the peak has passed. |
MSExchangeIS Private | Average Time for Delivery | Indicates how long it takes the information store to deliver messages. |
MSExchangeIS Public | Send Queue Size | Indicates whether the information store is keeping up with the submitted load. The queue can be non-zero at peak traffic times, but it shouldn't stay there long after the peak has passed. |
MSExchangeIS Public | Average Time for Delivery | Indicates how long it takes the information store to deliver messages. |
MSExchangeMTA | Work Queue Length | Indicates whether the MTA is keeping up with the submitted load. The queue can be non-zero at peak traffic times, but it shouldn't stay there long after the peak has passed. |
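One way to check whether a queue "stays there long" is to measure the longest unbroken run of non-zero samples in a log. The sketch below assumes the queue counter was logged at a fixed interval and exported as a list of integers; the function name and the sample data are hypothetical.

```python
from typing import Sequence


def longest_backlog_seconds(queue_samples: Sequence[int], interval_seconds: int) -> int:
    """Return the longest continuous period, in seconds, with a non-zero queue."""
    longest = 0
    current = 0
    for depth in queue_samples:
        # Extend the current run while the queue is non-zero; reset when it drains.
        current = current + 1 if depth > 0 else 0
        longest = max(longest, current)
    return longest * interval_seconds


# Hypothetical samples logged every 120 seconds.
samples = [0, 3, 5, 0, 1, 2, 4, 0]
print(longest_backlog_seconds(samples, 120))
```

How long is too long depends on your traffic pattern; the value is most useful when compared against earlier logs from the same server.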
Windows NT provides additional counters that can help you analyze processor usage, but many are more useful to a developer than to an administrator. The following counters are relevant to bottleneck analysis.
Object | Counter |
System | % Total Processor Time |
Process | % Processor Time |
If you observe processor utilization at a fine granularity, for example every one or five seconds, note that the counters fluctuate rapidly and will frequently hit 100 percent for short periods of time. For this reason, monitoring processor usage is more useful when averaged over a longer period of time. If you are monitoring for longer periods of time and you find that the processor usage reaches 100 percent and stays there for minutes or hours, your users are probably becoming impatient with response times. You may want to size your system for around 60 percent or 70 percent processor utilization during peak times, so that there is extra room for unexpected demands and for growth.
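The averaging described above can be sketched as follows. This is an illustration under stated assumptions: the samples are % Total Processor Time readings exported from a log at a fixed interval, and the function names, the window size, and the 70 percent default are hypothetical choices based on the sizing guidance in the preceding paragraph.

```python
from statistics import mean
from typing import List, Sequence


def windowed_averages(samples: Sequence[float], window: int) -> List[float]:
    """Average consecutive, non-overlapping groups of `window` samples.

    A trailing partial group is dropped so every average covers the same span.
    """
    if window <= 0:
        raise ValueError("window must be positive")
    return [
        mean(samples[start:start + window])
        for start in range(0, len(samples) - window + 1, window)
    ]


def windows_over_target(averages: Sequence[float], target_percent: float = 70.0) -> List[int]:
    """Return the indexes of the windows whose average exceeds the target."""
    return [index for index, avg in enumerate(averages) if avg > target_percent]


# Hypothetical fine-grained samples: short spikes to 100 percent are expected.
cpu = [20.0, 100.0, 30.0, 50.0, 100.0, 100.0, 90.0, 70.0]
averages = windowed_averages(cpu, 4)
print(averages, windows_over_target(averages))
```

In this sample the first window contains a spike to 100 percent but averages well under the target, while the second window is over it throughout.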
When you are running other services in addition to Microsoft Exchange Server on the server computer, it is recommended that you analyze per-process processor usage. This enables you to determine which services are using most of the CPU time, and how to appropriately balance the load.
If the processor is your bottleneck, consider taking the following actions:
Recommendations
There are two sets of disk counters: LogicalDisk and PhysicalDisk. Either is fine to use, but LogicalDisk makes it easier to track drive usage. In either case, you must enable the disk counters. They are turned off by default because they impose a small performance cost.
To enable Performance Monitor disk counters
Following are important disk counters:
Object | Counter |
LogicalDisk | Disk Bytes Written/sec |
LogicalDisk | Disk Bytes Read/sec |
LogicalDisk | Disk Reads/sec |
LogicalDisk | Disk Writes/sec |
LogicalDisk | Avg. Disk Queue Length |
LogicalDisk | % Disk Time (general indicator only; not a reliable indicator of disk saturation) |
Compare the disk operations per second with the specifications for sustained operations provided by your vendor. If your disk operations per second are getting close to the vendor's specifications, you're nearing capacity. Note that the % Disk Time counter is not a fair indication of disk saturation. A disk that is busy 100 percent of the time may actually be capable of doing much more work, due to smart disk controllers and scheduling methods such as elevator algorithms.
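The comparison against the vendor's rating is simple arithmetic, sketched below. The function name and the figures are hypothetical; the vendor rating is whatever sustained operations-per-second figure your disk vendor publishes.

```python
def disk_capacity_fraction(
    reads_per_sec: float,
    writes_per_sec: float,
    vendor_sustained_ops_per_sec: float,
) -> float:
    """Return observed disk operations per second as a fraction of the vendor rating."""
    if vendor_sustained_ops_per_sec <= 0:
        raise ValueError("vendor rating must be positive")
    return (reads_per_sec + writes_per_sec) / vendor_sustained_ops_per_sec


# Hypothetical readings of Disk Reads/sec and Disk Writes/sec against a
# hypothetical rating of 100 sustained operations per second.
print(disk_capacity_fraction(45.0, 30.0, 100.0))
```

A result approaching 1.0 suggests the drive is nearing the vendor's stated capacity.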
Recommendations
If the disk subsystem is your bottleneck, consider taking the following actions:
Object | Counter |
MSExchangeDB | Buffer Asynchronous Reads/sec |
MSExchangeDB | Buffer Asynchronous Writes/sec |
MSExchangeDB | Buffer Synchronous Reads/sec |
MSExchangeDB | Buffer Synchronous Writes/sec |
Check the amount of disk activity generated by the information store. If you add up the read/write counters shown above, you can determine how much of the disk's activity on your information store database drive is due to the information store, and how much is generated by other services that might share the same drive.
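As a rough sketch of that comparison, the four MSExchangeDB buffer counters can be summed and divided by the total operations seen on the drive. The function name and values are hypothetical, and the sketch assumes that one buffer read or write corresponds to roughly one disk operation, which may not hold exactly on every configuration.

```python
def information_store_share(
    async_reads: float,
    async_writes: float,
    sync_reads: float,
    sync_writes: float,
    disk_reads_per_sec: float,
    disk_writes_per_sec: float,
) -> float:
    """Return the fraction of drive operations attributable to the information store."""
    disk_total = disk_reads_per_sec + disk_writes_per_sec
    if disk_total <= 0:
        return 0.0  # an idle drive has no activity to attribute
    store_total = async_reads + async_writes + sync_reads + sync_writes
    return store_total / disk_total


# Hypothetical readings taken over the same interval.
print(information_store_share(10.0, 5.0, 3.0, 2.0, 25.0, 15.0))
```

The remainder of the drive's activity is generated by other services that share the drive.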
Following are important memory counters:
Object | Counter |
Memory | Pages/sec |
Memory | Page Faults/sec |
Memory | Available Bytes |
Memory | Committed Bytes |
Process | Page Faults/sec |
Process | Working Set |
The Pages/sec counter indicates the rate at which pages are physically read or written on the paging drive. This indicates the contribution that paging makes to the demand for the disk.
The Page Faults/sec counter indicates the rate at which pages are faulted into the working sets of processes. Due to the page pool in the virtual memory system, the number of pages actually being read and written to disk is much less than the number of page faults. The page faults are of interest if you have services in addition to Microsoft Exchange Server running. In such a case, you can examine the rate of page faults on a per-process basis and determine where they are occurring. With this information, you can make application-tuning changes. For example, you might consider (carefully) adjusting the tradeoff between the information store buffer pool and the system memory pool. Also, by checking the per-process working sets, you can identify the major memory allocations.
Recommendation
The Available Bytes counter indicates how much physical memory is available at any given time. The system adjusts working sets of processes to keep this above a certain threshold, generally 4 MB. If this level is approached, you should see higher paging and page fault rates. The Committed Bytes counter indicates the amount of virtual address space that the system has committed to applications. This must be backed by the paging file on the disk, so make sure that there is space in the paging file.
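The two checks above can be expressed as simple headroom calculations. This is a sketch only: the function names and sample values are hypothetical, the 4 MB default comes from the threshold mentioned above, and the commit limit argument is an assumption standing in for whatever figure you use for the memory that physical RAM and the paging file can back.

```python
FOUR_MB = 4 * 1024 * 1024


def available_headroom_bytes(available_bytes: int, threshold_bytes: int = FOUR_MB) -> int:
    """Return how far Available Bytes sits above the working-set trimming threshold."""
    return available_bytes - threshold_bytes


def commit_fraction(committed_bytes: int, commit_limit_bytes: int) -> float:
    """Return Committed Bytes as a fraction of the assumed commit limit."""
    if commit_limit_bytes <= 0:
        raise ValueError("commit limit must be positive")
    return committed_bytes / commit_limit_bytes


# Hypothetical readings: 6 MB available, 96 MB committed against a 128 MB limit.
print(available_headroom_bytes(6 * 1024 * 1024))
print(commit_fraction(96 * 1024 * 1024, 128 * 1024 * 1024))
```

Small headroom together with rising Pages/sec is the pattern the paragraph above describes.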
Recommendation
Following are important buffer counters.
Object | Counter |
MSExchangeDB | % Buffer Cache Hit |
MSExchangeDB | Buffer Asynchronous Reads/sec |
MSExchangeDB | Buffer Asynchronous Writes/sec |
MSExchangeDB | Buffer Synchronous Reads/sec |
MSExchangeDB | Buffer Synchronous Writes/sec |
Depending on your usage patterns, you may be able to optimize the use of server memory by adjusting the number of information store buffers. Monitor the % Buffer Cache Hit counter. If it is consistently very close to 100 percent, try decreasing the number of buffers. You should also monitor the information store disk activity. If the activity doesn't increase, your setting is correct. However, if you notice that the cache hit rate is less than 95 percent, try increasing the number of buffers. As long as the paging activity does not increase, the setting is correct. Be careful when making these changes! Make small adjustments and monitor the results until you're confident in the changes made.
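The decision rule in the preceding paragraph can be written down as a small sketch. The function names are hypothetical, and the 99 percent cutoff for "very close to 100 percent" is an assumption chosen for illustration; only the 95 percent figure comes from the text.

```python
def suggest_buffer_change(
    cache_hit_percent: float,
    near_full_percent: float = 99.0,  # assumed cutoff for "very close to 100 percent"
    low_percent: float = 95.0,
) -> str:
    """Suggest a direction for the next small buffer adjustment."""
    if cache_hit_percent < low_percent:
        return "increase"
    if cache_hit_percent >= near_full_percent:
        return "decrease"
    return "keep"


def change_held(
    direction: str,
    disk_ops_before: float,
    disk_ops_after: float,
    pages_before: float,
    pages_after: float,
) -> bool:
    """Check the follow-up condition for the adjustment that was made."""
    if direction == "decrease":
        # Fewer buffers are acceptable only if store disk activity did not rise.
        return disk_ops_after <= disk_ops_before
    if direction == "increase":
        # More buffers are acceptable only if paging activity did not rise.
        return pages_after <= pages_before
    return True


print(suggest_buffer_change(99.6), suggest_buffer_change(93.0))
```

Real counter readings fluctuate, so compare averages over comparable periods rather than single samples before deciding whether a change held.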
Recommendations
To change the number of information store buffers
At the command prompt, type PerfWiz -V.
Change the setting number for the information store buffers, but be careful when doing this. Make small adjustments and monitor the results until you are confident with the changes made.
The Network Interface object can be obtained by installing the Windows NT Resource Kit. If your network connection is a point-to-point link, the counters below will show all traffic on the link. If the connection is a LAN, these counters will show the traffic to and from the server being monitored.
Object | Counter |
Network Interface | Bytes Received/sec |
Network Interface | Bytes Sent/sec |
Network Interface | Packets Received/sec |
Network Interface | Packets Sent/sec |
If you know the capacity ratings of your network and network interface card (NIC), you can compare these ratings to the values for the counters shown above and determine how close to capacity you are operating. For a more complete overview of network traffic, you can also use the Network Monitor tool available with Microsoft Systems Management Server (SMS). Note that Network Monitor is a stand-alone tool. You do not need to have other SMS components installed to run it.
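The comparison of counter values to rated capacity is a unit conversion, sketched below. The function name and sample values are hypothetical. The sketch assumes a shared, half-duplex link, where sent and received traffic compete for the same capacity; on a full-duplex link you would compare each direction to the rating separately.

```python
def network_utilization(
    bytes_received_per_sec: float,
    bytes_sent_per_sec: float,
    link_bits_per_sec: float,
) -> float:
    """Return observed traffic as a fraction of the rated link speed."""
    if link_bits_per_sec <= 0:
        raise ValueError("link speed must be positive")
    # The counters report bytes; link ratings are quoted in bits per second.
    observed_bits = (bytes_received_per_sec + bytes_sent_per_sec) * 8
    return observed_bits / link_bits_per_sec


# Hypothetical readings on a 10-Mbps Ethernet segment.
print(network_utilization(400_000.0, 225_000.0, 10_000_000.0))
```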
If you find that the server is operating at or near network capacity, you can upgrade the network speed, for example, by moving from 10-Mbps Ethernet to 100-Mbps Ethernet, or by moving from a 64-Kbps line to a T1 line. You may also want to consider using multiple Ethernet connections for the server, or multiple 64-Kbps lines, rather than one faster connection or line.
Recommendation
If you suspect that bus saturation is an issue that affects your server, you can monitor its activity on Pentium and Pentium Pro servers. Use the p5ctrs from the Windows NT Resource Kit. To view Pentium counters, you must run Pperf and then change the configuration.
Following are important bus counters:
Object | Counter |
Pentium | Bus Utilization (clks)/sec |
Pentium | % Code Cache Misses |
Pentium | % Data Cache Misses |
If your system bus is near saturation, you have two options:
Recommendations