Data Protection Levels

Data protection here refers to the degree to which database integrity is protected from a physical disk failure. The scheme chosen depends on how much downtime and/or data loss the organization can afford. The higher the degree of protection, the higher the cost of the system. However, a few thousand dollars of redundant disks may cost much less than a system being down. This section does not compare RAID-1 with RAID-5 as fault-tolerant modes; it covers only the trade-offs of using fault tolerance or not. The next section discusses the choice of RAID level.
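The cost trade-off can be made concrete with a rough break-even calculation. The following is a minimal sketch only: the function names and every figure in it are hypothetical and chosen purely for illustration, and a real assessment would use the organization's own failure rates and outage costs.

```python
def expected_downtime_cost(failures_per_year, hours_down_per_failure, cost_per_hour):
    """Expected yearly cost of outages caused by disk failures."""
    return failures_per_year * hours_down_per_failure * cost_per_hour


def redundancy_pays_off(redundant_disk_cost, failures_per_year,
                        hours_down_per_failure, cost_per_hour):
    """True when one year of expected outage cost exceeds the cost of the extra disks."""
    outage_cost = expected_downtime_cost(
        failures_per_year, hours_down_per_failure, cost_per_hour
    )
    return outage_cost > redundant_disk_cost


# Hypothetical figures, for illustration only.
print(redundancy_pays_off(
    redundant_disk_cost=5000,
    failures_per_year=0.5,
    hours_down_per_failure=8,
    cost_per_hour=2000,
))
```

With these assumed inputs the expected outage cost exceeds the price of the redundant disks, so the comparison favors fault tolerance. The sketch ignores the cost of lost data, which would push the result further in the same direction.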

The safest option is to configure the entire drive subsystem to run in a fault-tolerant mode. In this case, no single disk failure will stop the system or risk any loss of data. If the drive subsystem supports hot-swap drives, such as the SMART Controller with a ProLiant system, the failed drive can be replaced without bringing the system down. If the system does not support this feature, downtime must be scheduled for the repair. Most business-critical applications, particularly OLTP systems, use this level of fault tolerance.

The second option uses fault tolerance on some volumes but leaves others unprotected from a single disk failure. All database transaction logs, critical database system tables, the operating system, and the boot partition are placed on a volume with some form of fault tolerance. The remainder of the system is configured with volumes using no fault tolerance. These volumes contain general data tables, index structures, and other non-critical database objects such as temporary space. If a protected drive fails, the system continues to run in the same manner as described for the previous configuration. If an unprotected drive fails, the system will eventually report an error and the current instance of the database will be lost. At this point the system is brought down and repaired, the database is restored from a previous backup, and the database is then recovered to the instant of failure by rolling forward committed transactions from the protected transaction logs. The operating system and the database system tables are protected in order to ease and speed the recovery process. If the operating system or system catalogs are lost, the database can still be recovered, but the process is more complicated and takes much longer.
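The restore and roll-forward sequence described above can be sketched with a toy model. This is an illustration under stated assumptions, not the recovery mechanism of any particular database product: the `recover` function, the dictionary used as a backup, and the `(txn_id, op, payload)` log record layout are all hypothetical.

```python
def recover(backup, log):
    """Rebuild table state from a backup plus the protected transaction log.

    backup: dict of key -> value as of the last backup.
    log: list of (txn_id, op, payload) records written after the backup.
         op is "write" with payload (key, value), or "commit" with payload None.
    """
    # Only transactions with a commit record in the log are rolled forward.
    committed = {txn for txn, op, _ in log if op == "commit"}

    # Step 1: restore the database from the previous backup.
    state = dict(backup)

    # Step 2: replay the writes of committed transactions in log order.
    for txn, op, payload in log:
        if op == "write" and txn in committed:
            key, value = payload
            state[key] = value
    return state


backup = {"a": 1, "b": 2}
log = [
    (1, "write", ("a", 10)),
    (1, "commit", None),
    (2, "write", ("b", 20)),  # transaction 2 had not committed at the failure
]
print(recover(backup, log))  # {'a': 10, 'b': 2}
```

The sketch shows why the transaction logs belong on the protected volume: the backup alone yields only the state at backup time, and the log supplies everything committed since.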

The final configuration option is to run with the disk subsystem completely unprotected. In this situation, any disk failure will usually require restoring the full database from a previous backup, or reloading it if it was originally loaded from an ASCII dump of another system. All updates made since the backup or original load are lost and are NOT recoverable. This may be acceptable for many types of operations, such as static weekly loads of a decision support system.
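Whether the unprotected configuration is acceptable depends on how many updates would fall between the last backup or load and a failure. A minimal sketch of that check follows; the `lost_updates` function and the dates are hypothetical and serve only to illustrate the reasoning.

```python
from datetime import datetime


def lost_updates(last_restore_point, update_times):
    """Return the updates made after the last backup or load.

    With no protected transaction log, these updates cannot be recovered.
    """
    return [t for t in update_times if t > last_restore_point]


# Hypothetical dates, for illustration only.
weekly_load = datetime(2024, 1, 7)

# A static decision support system receives no updates after its load,
# so a disk failure costs a reload but no data.
print(len(lost_updates(weekly_load, [])))

# A system updated after its last backup loses every one of those updates.
updates = [datetime(2024, 1, 6), datetime(2024, 1, 8), datetime(2024, 1, 9)]
print(len(lost_updates(weekly_load, updates)))
```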