Volume Configuration

The most confusing decision when configuring a system is how an array of physical drives should be partitioned into logical volumes. In reality, this is a simple step in the configuration process once the methodology is understood. The key is to let an array do what it is designed to do: distribute I/O requests among all available drives.

The basic principles:

  1. Put all randomly accessed data on a single volume.
  2. Isolate purely sequential I/O to avoid excessive head movement.

Group Randomly Accessed Data

As the environment definitions covered earlier make clear, most data access on database servers is essentially random. When the goal is to optimize these application classes, database data and index files should be grouped onto a single array volume for each disk array controller in a machine. For example, if an OLTP database contains a number of large tables and their associated indexes totaling 5GB, these should all be placed on a single volume of six 1GB drives. This ensures that all drives carry approximately equal I/O load during normal operation of the system. The load balancing is done by the disk controller, through low-level striping of files across all disks in the volume, with no additional effort required by the System Administrator or DBA.
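
To make the striping behavior concrete, the sketch below models a controller that lays data onto the volume in fixed-size chunks, round-robin across the drives. The chunk size, drive count, and simulated request addresses are illustrative assumptions, not the parameters of any particular controller.

```python
# Minimal sketch of how low-level striping spreads I/O across a volume.
# Chunk size and request addresses are assumed values for illustration.
import random
from collections import Counter

CHUNK_SIZE_KB = 64   # assumed stripe unit
DRIVES = 6           # six 1GB drives, as in the example volume

def drive_for_block(logical_addr_kb: int) -> int:
    """Map a logical address to a physical drive, round-robin by chunk."""
    return (logical_addr_kb // CHUNK_SIZE_KB) % DRIVES

# Simulate random reads scattered across a 5GB data set.
random.seed(1)
requests = [random.randrange(0, 5 * 1024 * 1024) for _ in range(10_000)]

load = Counter(drive_for_block(addr) for addr in requests)
for drive, count in sorted(load.items()):
    print(f"drive {drive}: {count} requests")
# Each drive ends up with roughly one sixth of the requests,
# with no manual placement decisions by the administrator.
```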

In contrast, techniques carried over from traditional, non-array systems include separating data and index files onto separate physical drives and giving large data tables their own dedicated drives. The rationale was to allow simultaneous access to multiple drives. The problem with these techniques is that they put the burden of load balancing on the System Administrator. By carefully studying the applications accessing the database, the data can be distributed so that each drive receives an approximately equal number of disk requests. Unfortunately, this relies on two assumptions that are difficult to satisfy: that the System Administrator and DBA can determine the actual I/O profile of an application, and that once determined, the profile will remain constant throughout a day's processing and over the life of the application.

For example, an application will often use different tables at different times of the day. Table A may see heavy access in the morning as a business gets started for the day; as the day wears on, most access may move to Table B. If these tables were separated onto different physical disks, the application would overload the disk holding Table A in the morning and overload the disk holding Table B later. With an array controller, these two tables would be placed on the same logical volume spread across multiple drives, allowing all drives to share the I/O load equally throughout the day.

Data growth provides another example. Again using Table A and Table B, suppose that when the database is initially deployed the two tables are approximately the same size and receive the same number of I/O requests. At that time, placing each table on its own drive seems reasonable. Over time, however, Table B grows by 10 percent per month in both size and I/O requests. After just nine months, Table B has more than doubled in access frequency and is now a bottleneck to system performance. If both tables are placed on a single array volume, this growth is shared among all drives in the volume, avoiding the disk bottleneck.
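
A quick calculation makes the growth example concrete. Only the 10 percent monthly growth rate comes from the text; the starting request rate and the six-drive volume are assumed figures for illustration.

```python
# Illustrative arithmetic for the Table A / Table B growth example.
START_RATE = 100.0      # assumed requests/sec for each table at deployment
GROWTH = 1.10           # Table B grows 10% per month (from the text)
MONTHS = 9
DRIVES_IN_VOLUME = 6    # shared array volume from the earlier example

table_a = START_RATE
table_b = START_RATE * GROWTH ** MONTHS

print(f"Table B after {MONTHS} months: {table_b:.0f} req/s "
      f"({table_b / START_RATE:.1f}x its original load)")

# Dedicated drives: one disk carries Table B's full, growing load.
print(f"dedicated drives -> busiest disk: {max(table_a, table_b):.0f} req/s")

# Shared striped volume: both tables' load is spread over all drives.
shared_per_drive = (table_a + table_b) / DRIVES_IN_VOLUME
print(f"shared volume    -> each disk:    {shared_per_drive:.0f} req/s")
```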

The case covered thus far applies when all disks are attached to a single controller. When the amount of data and indexes exceeds the capacity of a single controller, you will have multiple volumes in the system. At that point, you have three options for distributing the data across the controllers.

  1. Use an operating system facility to stripe the volume across controllers.
  2. Use the database facilities to stripe the data across the controllers.
  3. Segment different data to each controller.

There is no single best solution to this situation; it is a matter of preference for the Administrator. General trade-offs are listed below to help you make your own decision, and the degree of each trade-off may vary depending upon the operating system and database software chosen. The pros and cons of each option are shown in Table 1.

Option 1: Use an operating system facility to stripe the volume across controllers.
  Pros: ease of space management as the database grows; ease of backup with everything on one file system.
  Cons: more difficult volume management; very long tape recovery times for the volume; performance penalty associated with file system striping.

Option 2: Use the database facilities to stripe the data across the controllers.
  Pros: little to no negative performance impact; ease of space management as the database grows.
  Cons: most complicated database configuration process; more complicated backup/recovery process.

Option 3: Segment different data to each controller.
  Pros: best performance when balanced properly; fastest, easiest tape recovery of a volume; fastest tape backup, allowing multiple tape usage.
  Cons: requires the administrator to understand access patterns.

Table 1: Multiple Controller Options

Isolate Sequentially Accessed Data

Even though most data in a database server is read and/or written randomly, there are several cases of sequential access. During a long sequential read or write process, if the drive head can stay on the same physical disk track, performance will be dramatically better than if the head must continually seek between tracks. This principle is the same for traditional, non-array controllers and array controllers.

In fact, the largest single time component in reading or writing a block of data on disk is the time required to move the drive head between tracks. The best drives on the market today range from 8 to 10 milliseconds (ms) average seek time, and many drives from just a couple of years ago can be as high as 16 to 19 ms. The remaining activities in an I/O routine include the following: the database makes a request to the operating system, the operating system requests the data from the controller, the controller processes the request and issues it to the drive, the block rotates under the drive head, and the data is transferred to the controller, back to the operating system, and finally to the application. All of this combined takes only 10 to 15 ms, depending upon the particular controller, operating system, and database. Minimizing seek time, where possible, therefore has a significant benefit.
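
The back-of-the-envelope sketch below puts these figures together. Only the 8 to 10 ms seek range and the 10 to 15 ms for the rest of the I/O path come from the text; using their midpoints and a 5 percent seek rate for well-isolated sequential access are assumptions made for illustration.

```python
# Rough service-time arithmetic using the figures quoted in the text.
AVG_SEEK_MS = 9.0        # midpoint of the 8-10 ms seek range
OTHER_OVERHEAD_MS = 12.5 # midpoint of the 10-15 ms for the rest of the path

def io_time_ms(seek_fraction: float) -> float:
    """Average time per I/O when only `seek_fraction` of requests seek."""
    return seek_fraction * AVG_SEEK_MS + OTHER_OVERHEAD_MS

random_io = io_time_ms(1.0)       # random access: nearly every request seeks
sequential_io = io_time_ms(0.05)  # sequential access kept on-track: rare seeks

print(f"random I/O    : {random_io:.1f} ms per request")
print(f"sequential I/O: {sequential_io:.1f} ms per request")
print(f"gain from avoiding seeks: {random_io / sequential_io:.2f}x")
```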

So much for low-level numbers; what does this mean to your application, and how do you take advantage of it? There are three basic database operations that are characteristically sequential: transaction logging, bulk loading and unloading of the database, and batch reporting or updating that requires table scans. In many cases, these sequential operations will not be the primary function of the system, yet they may be the most important target for your tuning because of nightly batch windows or backup time constraints. If the typical random access to the data is relatively light or done by few users, you may want to give priority to tuning sequential operations over random I/O distribution.

The transaction log is the most obvious sequential I/O routine. In any update environment the transaction log is written sequentially from beginning to end, so you should isolate the log file from any other system I/O activity during heavy update periods. However, a typical log file may not need the space of an entire physical disk. You can usually share a single large disk between the log file and other files that are not active during heavy update activity. Items that coexist well with the log file include the operating system, database executables, and database dump files.

Bulk loads and batch updates are often associated with decision support systems. The typical scenario involves extracting large volumes of data from production transaction processing systems on a pre-defined schedule, such as daily or weekly. The data is then moved to the DSS server and loaded into a database. Performance of this load operation can be critical because of nightly time constraints, and optimizing it is very different from optimizing the typical system. In contrast to the random nature of most multi-user systems, a batch load or update can have up to four concurrent sequential operations: reading the input file, writing the database data file, writing the database index file, and writing the transaction log. To optimize load time, each of these file types should be isolated on its own logical volume. In contrast, pure optimization for daily multi-user decision support would put ALL of the above files on a single, large volume so that every read operation has concurrent access to all drives on the server. The impact on daily access of optimizing for load time should be less than 20%, while the gain in load time can be over 200%, which is often well worth the small random access penalty.
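
The toy model below illustrates why isolating the four load-time streams helps: when the streams share a volume, nearly every I/O forces the head to seek away from another stream's track, while isolated volumes let each head stay on its own sequential run. The stream names, I/O counts, and round-robin interleaving are assumptions made purely for illustration.

```python
# Toy model: count head movements between streams that share a volume.
from itertools import chain, repeat

STREAMS = ["input_read", "data_write", "index_write", "log_write"]
IOS_PER_STREAM = 1000

def head_switches(placement: dict[str, int]) -> int:
    """Count head moves between different streams on the same volume."""
    # Round-robin interleaving approximates four concurrent sequential streams.
    schedule = list(chain.from_iterable(repeat(STREAMS, IOS_PER_STREAM)))
    switches = 0
    last_stream_on_volume: dict[int, str] = {}
    for stream in schedule:
        vol = placement[stream]
        if last_stream_on_volume.get(vol) not in (None, stream):
            switches += 1  # head must seek away from another stream's track
        last_stream_on_volume[vol] = stream
    return switches

shared = {s: 1 for s in STREAMS}                  # everything on one volume
isolated = {s: i for i, s in enumerate(STREAMS)}  # one volume per stream

print("head switches, shared volume   :", head_switches(shared))
print("head switches, isolated volumes:", head_switches(isolated))
```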

For example, assume you have a decision support system that is refreshed nightly. Table 2 gives the application characteristics and an optimal configuration for loading the database.


Data File        Space Requirement   Disk Volume   Drive Count   Drive Size
DB Log & OS      500MB               Volume 1      1             1GB
DB Data File     2.5GB               Volume 2      3             1GB
DB Index File    1GB                 Volume 3      2             1GB
Extract File     1.5GB               Volume 4      1             2GB

Table 2: Batch Processing Example
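
As a quick sanity check on Table 2, the sketch below verifies that each volume's raw capacity covers its space requirement with some headroom. The figures come directly from the table; treating each volume as a simple stripe with no parity or mirroring overhead is an assumption for illustration.

```python
# Sanity check of the Table 2 layout: capacity versus space requirement.
GB = 1024  # MB per GB, for simple arithmetic

volumes = [
    # (contents,        required MB, drive count, drive size MB)
    ("DB Log & OS",     500,         1,           1 * GB),
    ("DB Data File",    2.5 * GB,    3,           1 * GB),
    ("DB Index File",   1 * GB,      2,           1 * GB),
    ("Extract File",    1.5 * GB,    1,           2 * GB),
]

for name, required, drives, drive_size in volumes:
    capacity = drives * drive_size
    headroom = capacity - required
    print(f"{name:<14} capacity {capacity / GB:4.1f}GB, "
          f"required {required / GB:4.1f}GB, headroom {headroom / GB:4.1f}GB")
```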

The final general application classes that rely on sequential I/O performance are batch reporting and database dumps. These are fairly simple operations to optimize, and there are only a couple of concepts to keep in mind. The profile of reporting and unloading is sequential table scans, with the resulting output written to a non-database file. In the case of a detailed analysis report, a report file of several megabytes may be spooled to a large disk file. A database dump sequentially reads the tables and writes the results to either a tape device or a disk file. When the output is written to a disk file on the server, the target disk should be a different physical volume from the source of the data. If this is the only activity at the time, the volume used for the log file is a good target; otherwise, another volume of sufficient size to accommodate the output file should be added to the system.