Architecture and Terminology

To start a successful migration, you should understand the basic architecture and terminology associated with SQL Server. The first part of this paper presents these concepts from an Oracle point of view. All of the comparisons have been made from the point of view of an experienced Oracle developer. Many of the examples have been drawn from the sample Oracle and SQL Server applications included as part of this paper.

It is assumed that you are familiar with the terms, concepts, and tools associated with the Oracle DBMS. For more information about the Oracle DBMS and its architecture, see the Oracle 7 Server Concepts Manual. For using the Oracle scripts and examples, it is also assumed that you are familiar with the Oracle Server Manager and Oracle SQL*Plus tools. For more information about these tools, see your Oracle documentation.

System Administrator (SA) Account

The SQL Server system administrator (SA) is similar to an Oracle DBA. The SA account is the centralized administrator for the SQL Server environment and has the following responsibilities:

SQL Server performs no permission checking for the SA, allowing the SA account to access any application or database on the system. Like an Oracle DBA, there are several commands and system procedures that only the SA can issue.

SQL Enterprise Manager and SQL Executive

SQL Server's SQL Enterprise Manager is the SA's administration tool. It offers similar functionality to Server Manager used by the Oracle DBA. SQL Enterprise Manager provides powerful administration tools for managing multiple servers.

Its alert engine enables you to set alerts for various SQL Server events and either notify someone, through e-mail or paging, or automatically execute a task. Additionally, a replication service enables you to set up and administer replication for SQL Servers across your network.

SQL Executive is a scheduling engine that offers robust and varied task scheduling and alerting abilities, and is capable of handling large client/server environments. It is similar in function to the DBMS_JOB job queue scheduling package provided with Oracle. SQL Executive is integrated within SQL Enterprise Manager. Scheduling information is maintained in the msdb database (which is created automatically at SQL Server setup). For information about SQL Enterprise Manager and SQL Executive, see the Microsoft SQL Server Administrator's Companion.

Oracle Database

The term database does not mean the same thing in SQL Server as it does in Oracle. In Oracle, database refers to the entire Oracle DBMS environment and includes the components shown in the illustration.

SQL Server Database

A SQL Server database is used to provide a logical separation of data, applications, and security mechanisms (much like a tablespace). Where Oracle supports multiple tablespaces, SQL Server supports multiple databases.

Database Owner (DBO)

A SQL Server database is similar to an Oracle tablespace in use, but is administered differently. Each SQL Server database is a self-contained administrative domain. It is assigned a Database Owner (DBO) who is responsible for managing the administrative tasks related to that database (unlike Oracle, where one DBA manages the administrative tasks for all tablespaces). These tasks include:

The DBO has permission to do anything within the database. All other database users must be assigned permissions by the DBO. The SA account acts as DBO in any database. It is common in SQL Server installations to only use the SA account for all administration tasks, both server-wide and database-specific. This saves the complexity of managing separate DBO accounts for each database. 

Database System Catalog

Within Oracle, there is one centralized system catalog (data dictionary) for all of the tablespaces on the system. It resides in the SYSTEM tablespace. In SQL Server, each database maintains its own system catalog, which contains information about the following entities within the database:

Centralized System Catalog (master Database)

In addition to providing each database with its own system catalog, SQL Server also contains a centralized system catalog in the master database. The master database is a special database that contains the same system catalogs as any other database, but also has information regarding all other databases on the system and configuration information about the system itself:

Like the SYSTEM tablespace in Oracle, the SQL Server master database must be up and running in order to access any other database. As such, it is important to protect against failures by backing up the master database after any changes have been made to it. The DBO for the master database must always be the SA account (just as the Oracle SYS account is always the owner of the SYSTEM data dictionary tables).

Default Databases

In addition to the master database, SQL Server installs the following databases by default.

model

Used as a template for all newly created user databases.

tempdb

Similar to an Oracle temporary tablespace. It is used for temporary working storage and sort operations. Unlike the Oracle temporary tablespace, users can create temporary tables that are automatically dropped when the user logs off.

msdb

Used to support SQL Executive and its scheduling information.

pubs

Provided as a sample learning tool.

For more information about the default databases, see the Microsoft SQL Server Administrator's Companion.

Devices

When an Oracle tablespace is defined, one or more operating-system files are created for its use. After a file has been created for a given tablespace, it cannot be used or shared by any other tablespace.

A SQL Server database is assigned space on one or more devices. A device is simply an operating-system file that has been preallocated for database use. A pool of database devices is typically created for the SQL Server environment.

Unlike Oracle, a given device can be shared by multiple databases. Additionally, as is shown in the following illustration, any database can store data on multiple devices. Despite this flexibility, SQL Server installations commonly use a device for only a single database to simplify space administration.

Blocks, Pages, and Extents

The Oracle DBMS formats tablespace data files into internal units termed blocks. The block size, set by the DBA when the Oracle database is first created, can range between 512 to 8192 bytes. When an object is created in an Oracle tablespace, the user specifies its space in units called extents (initial extent, next extent, min extents, and max extents). An Oracle extent varies in size, and must contain a chain of at least five contiguous blocks.

SQL Server formats device files into internal units called pages. The page size is fixed at 2048 bytes. When an object is created in a SQL Server database, it is automatically allocated one extent. A SQL Server extent is fixed in size and always contains a chain of eight contiguous pages.

Segments

The term segment in Oracle is used to define a logical structure, such as a data segment (table or cluster), index segment (index), or rollback segment. Segment is simply a logical term used to refer to an Oracle object.

SQL Server segments are defined at the database level and are used to control the physical placement of tables and indexes. Segments are names or labels that are used to point to one or more of the database devices that have been assigned to a database.

Segments allow you to strategically locate database objects on disk. If database segments are not defined and used, database objects are placed wherever space is available within the database. Segments allow:

Using RAID

While segments are familiar to Oracle administrators and are useful in some high-end configurations, they are not needed for most SQL Server installations. Using segments introduces administrative complexity and the possibility for error. In most cases, hardware-based RAID or Windows NT software-based RAID solutions are recommended for use with SQL Server. Although Oracle supports RAID, its multiplatform nature often precludes its use.

Windows NT software-based RAID or hardware-based RAID can set up stripe sets consisting of multiple disk drives that appear as one logical drive. If database devices are created on this stripe set, the disk subsystem assumes responsibility for distributing I/O load across multiple disks.

Randomly spreading out the data over multiple physical disks using RAID is usually as effective at improving performance as carefully placing objects using segments, and using RAID is much simpler.

The recommended RAID configuration for SQL Server is RAID 0 (stripe sets) or RAID 5 (stripe sets with an extra parity drive, for redundancy) to hold database devices. Stripe sets work very well at spreading out the usually random I/O done on database devices.

Transaction log devices (see the following section) must be optimized for sequential I/O and must be secured against a single device failure. Accordingly, RAID 1 (mirroring) is recommended for transaction logs. The size of this drive should be at least as large as the sum of the size of the online redo logs and the rollback segment tablespace(s). After this has been completed, create one log device that takes up all the space defined on this logical drive.

For more information about RAID, see the Microsoft SQL Server Administrator's Companion, your Windows NT Server documentation, and the Microsoft Windows NT Resource Kit.

Transaction Logs, Checkpoints, and Automatic Recovery

The Oracle DBMS performs automatic recovery each time it is started. It verifies that the contents of the tablespace files are coordinated with the contents of the online redo log files. If they are not, Oracle applies the contents of the online redo log files to the tablespace files (roll forward), and then removes any uncommitted transactions that are found in the rollback segments (roll back). If Oracle cannot obtain the information it requires from the online redo log files, it consults the archived redo log files.

SQL Server also performs automatic data recovery. Each time SQL Server is started, it checks each database in the system. It starts with the master database and launches threads to recover all of the other databases in the system.

In each database, the automatic recovery mechanism checks the transaction log. If the transaction log contains any uncommitted transactions, the transactions are rolled back. The recovery mechanism then checks the transaction log for committed transactions that have not yet been written out to the database. If it finds any, it performs those transactions again, rolling forward.

From an Oracle perspective, each transaction log functions both as a rollback segment and as an online redo log. Each database has its own transaction log, which is shared by all users of that database. The transaction log is actually a uniquely designed table that exists in each database.

If you have a small database, the transaction log is often stored on the same device as the database. If you have a large database, it is recommended that the transaction log be stored on a separate device. Storing the transaction log on a separate device significantly improves performance and allows incremental backups of the database. The placement of the transaction log is usually determined when the database is created.

The transaction log records all changes to the database. When a transaction begins, a BEGIN TRANSACTION event is recorded in the log. This event is used during automatic recovery to determine the starting point of a transaction. As each data modification statement is received, the changes are written to the transaction log prior to being written to the database itself. It functions as a write-ahead log. For more information, see the "Transactions, Locking, and Concurrency" section later in this paper.

Like Oracle, an automatic checkpoint mechanism is used to ensure that completed transactions are regularly written from SQL Server's own disk cache to the database device. A checkpoint writes all dirty pages to the database device. A dirty page is any cached page that has been modified since the last checkpoint. The checkpointing of dirty pages onto the database device means that all completed transactions are written out. This process shortens the amount of time that it takes to recover from a system failure, such as a power outage.

Backups

When you begin to use SQL Server, you may encounter a "transaction log full" message. This means that the transaction log is full and can no longer track changes to the database. When this happens, activity cannot take place in the SQL Server database.

This is similar to the error that occurs if the Oracle Archiver cannot copy the contents of an online redo log file to an archived redo log file. When an online redo log file fills, the Oracle Archiver automatically copies its contents to an archived redo log file. It then marks the online redo log file as available for reuse by the log writer. If the archiver cannot complete this task, all Oracle DBMS activity is halted.

The SQL Server error occurs because the transaction log does not automatically back up itself. A transaction log continues to fill up until its contents are backed up to disk. A backup is used to create a copy of a database, transaction log, or table onto a dump device.

A backup cannot be sent to any type of file. Just as Oracle requires a specialized format for its online and archived redo log files, SQL Server requires a specially formatted file for backups. This file is called a dump device. Dump devices are created using SQL Enterprise Manager or the equivalent statement syntax.

If the transaction log and database share the same device, the DUMP DATABASE statement is used to perform full backups. This statement creates a copy of both the database and its transaction log.

The transaction log must also be kept from filling up. This can be accomplished by setting the trunc. log on chkpt. option (see the "Automatically Truncating Transaction Logs" section that follows) or manually by using the DUMP TRANSACTION WITH TRUNCATE ONLY statement. This statement removes committed transactions from the log. Any transactions that are currently in process are not deleted.

If you have placed your transaction log and database on separate devices, you can use both the DUMP DATABASE statement to perform a full backup, and the DUMP TRANSACTION statement to perform an incremental backup of the database. Either statement can be issued while users are actively using the database. In the process of incremental backup, DUMP TRANSACTION frees up space in the transaction log, so a separate truncate step is not required.

Although backups can be performed manually, it is recommended that you use SQL Enterprise Manager to schedule regular backups. With its scheduling facility, you can back up your databases on a recurring basis.

Loads

A load is used to restore a database from a backup (dump) of that database and (if applicable) one or more backups of its transaction log (which re-creates lost transactions). A load overwrites any information in the database with the information that has been backed up. Loads are also performed using SQL Enterprise Manager.

Automatically Truncating Transaction Logs

You are not required to back up the contents of a database. In Oracle, the archiver can be turned off. In SQL Server, the DBO for a database can force the transaction log to erase its contents every time a checkpoint occurs. This is accomplished by configuring the transaction log to truncate itself whenever a checkpoint operation is performed. For more information about this option, see the Microsoft SQL Server Administrator's Companion.

DBCC (Database Consistency Checker)

It is a good idea to use the DBCC (database consistency checker) prior to performing any database backups. The DBCC statement verifies the integrity of the data in the database. If bad data is backed up, the same errors return when the data is restored. It is important to note, however, that the DBCC can only do limited repairs to the contents of a database. Damaged databases usually require a restore from previous clean backups of the database and transaction log.

The DBCC statement is also used to check memory usage, decrease the size of a database, check performance statistics, and so on. For more information about the DBCC statement, see the Microsoft SQL Server Transact-SQL Reference.

Networks

In the Oracle environment, SQL*Net is the software used to support networked connections between Oracle database servers and their clients. It communicates with the Transparent Network Substrate (TNS) data stream protocol. The combination of these two components allows users to run with many different network protocols without needing to write specialized program code.

Net-Libraries (network libraries) are used to support the networked connections between clients and SQL Server. These network libraries communicate between client and server using the Tabular Data Stream (TDS) protocol.

SQL Server includes a set of server Net-Libraries that enable simultaneous connections from clients running named pipes or other inter-process communication (IPC) mechanisms. SQL Server also ships with all client Net-Libraries in the box; there is no need to purchase these libraries separately.

SQL Server Net-Library options can be changed after installation with SQL Server Setup. The SQL Client Configuration Utility is used to configure the DB-Library, default Net-Library, and server connection information for a client running the Windows NT or Windows® 95 operating system. ODBC client applications use the same default Net-Library and server connection information, unless it is changed during ODBC data source setup. For more information about Net-Libraries, see the Microsoft SQL Server Administrator's Companion.

Login Accounts

To access SQL Server, login accounts must be created. A login account allows a user to access the SQL Server environment. The login account cannot access a database and its objects until it is added as a database user to that database.

SQL Server offers two basic types of login security: integrated and standard. The integrated security option uses the security mechanisms within Windows NT when validating login connections. Users do not need to enter login IDS or passwords. This functions like the IDENTIFIED EXTERNALLY option associated with Oracle user accounts. When using Oracle, a forward slash (/) must be entered as the login name to make this option work.

Integrated security requires network protocols that support authenticated connections between clients and servers. These are referred to as trusted connections. The Multi-Protocol and named pipe Net-Libraries provide trusted connections. Trusted connections can be either requested by the client application or required by the server when set in Integrated Security mode using SQL Server Setup.

The standard security mechanism requires that a user enter a login ID and password when requesting access to SQL Server. This functions like the IDENTIFIED BY PASSWORD option associated with Oracle user accounts.

For more information about these security mechanisms, see the Microsoft SQL Server Administrator's Companion.

Groups, Roles, and Permissions

SQL Server and Oracle both use permissions to enforce database security. SQL Server statement-level permissions (also called command permissions) are used to restrict access to Transact-SQL statements. They are similar to the Oracle system-level permissions.

SQL Server also offers object-level permissions.  As in Oracle, object-level ownership is assigned to the creator of the object and cannot be transferred. Object-level permissions must be granted to other database users before they can access the object.

SQL Server statement and object level permissions can be granted directly to database user accounts. However, from an administrative standpoint, it is often simpler to grant permissions to database groups. Groups are similar in function to Oracle roles. Groups are used as collective names for granting and revoking privileges (much like Oracle roles). Groups are created on a database by database basis.

A database may have any number of groups. The default group public is always found in every database and cannot be removed. The public group functions much like the PUBLIC account in Oracle. Each database user is always a member of the PUBLIC group. Unlike Oracle roles, a database user can be a member of only one other group in addition to public. This can be an important consideration when migrating a security system from Oracle to SQL Server.

Database Users, Aliases, and the Guest Account

A user login account must be authorized to use a database. If the user login account is not authorized, it cannot access the database or its objects.

One of the following methods can be used by a login account to access a database:

A database user account is created by the DBO or SA. When creating this account, the DBO or SA specifies a SQL Server login ID, database username, and group name. The database username and group name are optional. The database username does not have to be the same as the user's login ID. If a database username is not provided, the user's login ID and database username are identical. If a group name is not provided, the database user is only a member of the public group.

The SA or DBO can also create a database alias. A database alias maps a login account to a currently existing database user in the database. A database alias allows multiple SQL Server login accounts to share the same database username. Upon logging into a database, each login account shares the same object permissions and database privileges.

A guest account can also be created by the SA or DBO for a database. The guest account allows any valid SQL Server login account to access a database even if it does not have a database user account or alias. The guest account inherits any privileges that have been assigned to the public group.