CREATE INDEX Statement

For SQL Server 6.5 information, see CREATE INDEX Statement in What's New for SQL Server 6.5.

Creates an index on a given table that either changes the physical ordering of the table or provides the optimizer with a logical ordering of the table to increase efficiency for queries. When creating an index for the primary key, use the table- and column-level PRIMARY KEY constraint provided with the CREATE TABLE statement.

Syntax

CREATE [UNIQUE] [CLUSTERED | NONCLUSTERED] INDEX index_name
    ON [[database.]owner.]table_name (column_name [, column_name]...)
[WITH
    [FILLFACTOR = x]
    [[,] IGNORE_DUP_KEY]
    [[,] {SORTED_DATA | SORTED_DATA_REORG}]
    [[,] {IGNORE_DUP_ROW | ALLOW_DUP_ROW}]]
[ON segment_name]

where

UNIQUE

Specifies to create a unique index (one in which no two rows are permitted to have the same index value). The system checks for duplicate values when the index is created (if data already exists) and checks each time data is added with an INSERT or UPDATE statement. If there are duplicate key values, the statement is canceled and an error message giving the first duplicate is returned. You cannot create a unique index on a single column or multiple columns (composite index) where the complete key (all columns of that key) is NULL in more than one row; these are treated as duplicate values for indexing purposes.

Specifying a unique index makes sense only when uniqueness is a characteristic of the data itself. If uniqueness is required, create a UNIQUE or PRIMARY KEY constraint on the column. Specify a unique index only when manually enforcing a primary key (not recommended).

When a unique index exists, UPDATE or INSERT statements that would generate duplicate key values are rolled back, and SQL Server displays an error message. This is true even if the UPDATE or INSERT statement would have changed many rows but caused only one duplicate. If you try to change data on which there is a unique index and you have used the IGNORE_DUP_KEY option, only the rows that violate the UNIQUE index will fail. (For details, see the IGNORE_DUP_KEY option, later in this section.)

You cannot create a unique index on columns that already include duplicate values, whether or not IGNORE_DUP_KEY is set. If you attempt to do so, SQL Server displays an error message and lists the duplicate values. You must eliminate duplicates before you can create a unique index on the column.

CLUSTERED

Creates an object where the physical order of rows is the same as the indexed order of the rows, and the bottom, or leaf, level of the clustered index contains the actual data pages. Because a clustered index changes the physical order of the rows, create the clustered index before creating any nonclustered indexes so that the nonclustered indexes will not have to be rebuilt. By definition, only one clustered index is permitted per table (regardless of how it is created: implicitly, with constraints, or explicitly, with CREATE INDEX). Often it is created on the column(s) that is frequently accessed in groupings. Before you create indexes, it is important to know how your data will be accessed.

Using a clustered index to find data is almost always faster than using a nonclustered index. In addition, using a clustered index is advantageous when many rows with contiguous key values are being retrieved ¾ that is, on columns that are often searched for ranges of values. Once the row with the first key value is found, rows with subsequent indexed values are guaranteed to be physically adjacent.

For example, it would be helpful on an employee table to include a nonclustered index on the primary key of emp_id; however, the clustered index could be created on lname, fname (last name, first name) as that is often how employees are grouped.

If you do not specify CLUSTERED, a nonclustered index will be created.

Note Because the leaf level of a clustered index and its data pages are the same by definition, creating a clustered index and using the ON segment_name clause effectively moves a table from the device on which the table was created to the new segment. Before creating tables or indexes on specific segments, verify which segments are available and that they have enough empty space for the index. It is important that the segment have at least 1.2 times the space required for the entire table.

NONCLUSTERED

Creates an object that specifies the logical ordering of a table. With a nonclustered index, the physical order of the rows is not the same as their indexed order. The leaf level of a nonclustered index contains pointers to rows on data pages. That is, each leaf page contains an indexed value and a pointer to the row with that value. A nonclustered index, in comparison with a clustered index, has an extra level between the index structure and the data itself.

You can have as many as 249 nonclustered indexes per table (regardless of how they are created: implicitly, with constraints, or explicitly, with CREATE INDEX). Each can provide access to the data in a different sorted order.

index_name

Is the name of the index. Index names must be unique within a table but need not be unique within a database. Index names must follow the rules of identifiers. You must be the owner of a table to create an index on it. The owner of a table can create an index at any time, whether or not there is data in the table. Indexes can be created on tables in another database by qualifying the database name.

table_name

Specifies the table that contains the column or columns to be indexed.

column_name

Specifies the column or columns to which the index applies. Specify two or more column names if you want to create a composite index on the combined values in the specified columns. List the columns to be included in the composite index (in sort-priority order) inside the parentheses after table_name.

Composite indexes are used when two or more columns are best searched as a unit. You can combine as many as 16 columns into a single composite index. All the columns in a composite index must be in the same table. The maximum allowable size of the combined index values is 256 bytes. (That is, the sum of the lengths of the columns that make up the composite index cannot exceed 256.)

FILLFACTOR

Specifies how full SQL Server should make each index page. It is seldom necessary to include the FILLFACTOR option in a CREATE INDEX statement. It is provided for fine-tuning performance and is useful only when you are creating a new index on a table with existing data, and then it is useful only when you can accurately predict future changes in that data.

The amount of empty space on an index page is important because when an index page fills up, the system must take time to split it to make room for new rows. The value of the original FILLFACTOR is stored with the index in the sysindexes table.

Note Using an explicit FILLFACTOR setting applies only when the index is first created. SQL Server does not dynamically keep the specified percentage of empty space in the pages.

User-specified FILLFACTOR values can be from 1 through 100. If you don't specify a value, the default is 0. When FILLFACTOR is set to 0, only the leaf pages are filled. Space is left in nonleaf pages for at least one entry (two for nonunique clustered indexes). You can change the default with the sp_configure system stored procedure.

Use a FILLFACTOR of 100 only when there will be no inserts or updates. If FILLFACTOR is 100, SQL Server creates indexes with each page 100 percent full. A FILLFACTOR of 100 makes sense only for read-only tables. An INSERT or UPDATE made after the creation of an index with a 100 percent FILLFACTOR would cause page splits for each INSERT and possibly each UPDATE. The default FILLFACTOR algorithm assumes that there will be at least enough space for one more entry in an internal page.

Smaller FILLFACTOR values (except 0) cause SQL Server to create new indexes with pages that are not completely full. For example, a FILLFACTOR of 10 can be a reasonable choice if you are creating an index on a table that you know contains a small portion of the data it will eventually hold. Smaller FILLFACTOR values also cause each index to take more storage space.

When FILLFACTOR is set to any value other than 0 or 100, space is left in nonleaf pages for one entry (two for nonunique clustered indexes), and the space is left in leaf pages such that no leaf page is more full than the percentage specified by FILLFACTOR, as shown in the following table.

FILLFACTOR Internal page Leaf page

0% One free slot* 100% full

1 - 99% One free slot* <= FILLFACTOR % full

100% 100% full 100% full

*Two free slots for nonunique clustered indexes



FILLFACTOR	Internal page	Leaf page
0%	One free slot*	100% full
1 - 99%	One free slot*	<= FILLFACTOR % full
100%	100% full	100% full
*Two free slots for nonunique clustered indexes

Important Creating a clustered index with a FILLFACTOR affects the amount of storage space your data occupies, since SQL Server redistributes the data when it creates the clustered index.

IGNORE_DUP_KEY

Controls what happens when you attempt to enter a duplicate key in a unique clustered index. It is meaningful only when the UPDATE or INSERT statement affects multiple rows. SQL Server issues a warning and does not insert the row containing the duplicate. If IGNORE_DUP_KEY is set and you issue an UPDATE or INSERT statement that creates duplicate keys, the row that causes the duplicates is ignored. In the case of UPDATE, the row is discarded.

Other changes to the database caused by the UPDATE or INSERT attempt (for example, changes to index pages) are also backed out. However, if the UPDATE or INSERT attempt affects multiple rows, the other rows are added or changed as usual. If IGNORE_DUP_KEY is not specified, no rows will be inserted by the UPDATE or INSERT statement. For more information on duplicate rows, see the IGNORE_DUP_ROW | ALLOW_DUP_ROW option, later in this section.

You cannot create a unique index on a column that already includes duplicate values, whether or not IGNORE_DUP_KEY is set. If you attempt to do so, SQL Server displays an error message and lists the duplicate values. You must eliminate duplicates before you can create a unique index on the column.

Caution With IGNORE_DUP_KEY set, if you try to update a row in a way that creates a duplicate key, that row is discarded. Neither the new value nor the original value of the row that would produce the duplicate exists in the updated table. For example, if you try to update "Smith" to "Jones" and "Jones" already exists, you have one "Jones" and no "Smith." Essentially, the original row is lost because an UPDATE statement is actually a DELETE followed by an INSERT. SQL Server has no way to know about the disallowed duplicate when it deletes the row, and the whole transaction can't be rolled back because the purpose of IGNORE_DUP_KEY (and of the IGNORE_DUP_ROW option) is to allow a transaction to proceed in spite of the presence of duplicates.

SORTED_DATA | SORTED_DATA_REORG

Are options that eliminate the sort performed when a clustered index is created. Both options verify that the data has been sorted by checking each index value to determine whether it is higher than the previous one. If any row fails this check, the CREATE INDEX statement terminates. You can either fix the data or rerun the CREATE INDEX statement without the SORTED_DATA option (in which case the data will be sorted). The SORTED_DATA option, when the data satisfies the checks, will always be faster than the SORTED_DATA_REORG because the data is not copied and nonclustered indexes will not be rebuilt.

SORTED_DATA_REORG differs from SORTED_DATA because it physically reorganizes the data. This option is useful when a FILLFACTOR is specified to compact or expand the pages on which a table is stored. The effects of these options change slightly if used with the ON segment_name option. For details, see the ON segment_name section, later in this section.

Reorganizing the data is a good idea when a table becomes fragmented. To determine whether or not a table is contiguous, use the DBCC statement's SHOW_CONTIG. For details about what causes table fragmentation and recommended solutions, see the DBCC statement.

IGNORE_DUP_ROW | ALLOW_DUP_ROW

Are options for creating a nonunique clustered index; they are mutually exclusive. When creating a nonunique nonclustered index, these options are irrelevant because SQL Server attaches a unique row identification number internally; it never checks for duplicate rows or for identical data values.

The IGNORE_DUP_KEY, IGNORE_DUP_ROW, and ALLOW_DUP_ROW index options control what happens when a duplicate key or duplicate row is created with the INSERT or UPDATE statement. This table shows when these options can be used.

Index type Options

Clustered IGNORE_DUP_ROW or ALLOW_DUP_ROW

Unique clustered IGNORE_DUP_KEY

Nonclustered None

Unique nonclustered IGNORE_DUP_KEY




Index type	Options

Clustered	IGNORE_DUP_ROW or ALLOW_DUP_ROW
Unique clustered	IGNORE_DUP_KEY
Nonclustered	None
Unique nonclustered	IGNORE_DUP_KEY

This table illustrates how ALLOW_DUP_ROW and IGNORE_DUP_ROW affect attempts to create a nonunique clustered index on a table that includes duplicate rows and attempts to enter duplicate rows into a table.

Option set Table has duplicate
rows
Insert duplicate rows

Neither option CREATE INDEX statement fails. INSERT statement fails.

ALLOW_DUP_ROW Statement is completed. Statement is completed.

IGNORE_DUP_ROW Index created but duplicate rows deleted; error message returned. All rows accepted except duplicates; error message returned (see earlier caution).

ON segment_name



Option set	Table has duplicate rows	Insert duplicate rows
Neither option	CREATE INDEX statement fails.	INSERT statement fails.
ALLOW_DUP_ROW	Statement is completed.	Statement is completed.
IGNORE_DUP_ROW	Index created but duplicate rows deleted; error message returned.	All rows accepted except duplicates; error message returned (see earlier caution).

Specifies the database segment on which the index is to be created. A nonclustered index can be created on a different segment from the data pages. Before creating an index on a segment, verify which segments you can use. Certain segments can be allocated to specific tables or indexes for performance reasons or for other considerations.

The ON segment_name clause can affect the placement of the data portion and the index portions of a table:

Creating a nonclustered index on a specified segment_name ensures that index pages (current and future) will be placed on that segment. All data pages will continue to be allocated from the original segment on which the table was created.
Creating a clustered index with SORTED_DATA_REORG on a specified segment_name causes the entire table to be moved to that segment.
Creating a clustered index with SORTED_DATA on specified segment_name causes all future data pages to be allocated from the new segment_name only. The index pages (the B-tree) is created on the new seqment_name only. Current page allocations will not be affected.

Caution If you create a table on a segment and then create a clustered index on that table without specifying a segment name, the entire table migrates to the default segment (unless the SORTED_DATA option is used).

Remarks

Space is allocated to tables and indexes in increments of one extent (8 2K pages) at a time. Each time an extent is filled, another is allocated. For a report on the amount of space allocated and used by an index, use the sp_spaceused system stored procedure.

Creating a clustered index requires space available in your database equal to approximately 1.2 times the size of the data. This is space in addition to the space used by the existing table; the data is duplicated (unless SORTED_DATA is specified) in order to create the clustered index, and the old, nonindexed data is deleted when the index is complete.

You can neither create an index on a view nor create indexes on columns of bit, text, and image types. As the syntax signifies, you can create an index on a table in another database as long as you are the owner of that table.

You can create an index on a temporary table. When the table is dropped or the session ends, all indexes and triggers are dropped.

If there is no data in the table when an index is created, run the UPDATE STATISTICS statement after data is added. To check when the statistics were last updated, use DBCC SHOW_STATISTICS.

A composite index, like any other index, is represented by one row in the sysindexes table.

To display a report on an object's indexes, execute the sp_helpindex system stored procedure.

Permission

CREATE INDEX permission defaults to the table owner and is not transferable.

Examples

A. Simple Index

This example creates an index on the au_id column of the authors table.

CREATE INDEX au_id_ind
    ON authors (au_id)

B. Unique Clustered Index

This example creates an index on the au_id column of the authors table that enforces uniqueness. This index will physically order the data on disk because the CLUSTERED option is specified.

CREATE UNIQUE CLUSTERED INDEX au_id_ind
    ON authors (au_id)

C. Simple Composite Index

This example creates an index on the au_id and title_id columns of the authors table.

CREATE INDEX ind1
    ON titleauthor (au_id, title_id)

D. FILLFACTOR

This example uses the FILLFACTOR option set to 100. A FILLFACTOR of 100 fills every page completely and is useful only when you know that no index values in the table will ever change.

CREATE NONCLUSTERED INDEX zip_ind
    ON authors (zip)
    WITH FILLFACTOR = 100

E. IGNORE_DUP_KEY

This example creates a unique clustered index on the stores table. If a duplicate key is entered, the INSERT or UPDATE statement will be ignored.

CREATE UNIQUE CLUSTERED INDEX stor_id_ind
    ON stores(stor_id)
    WITH IGNORE_DUP_KEY

CREATE TABLE	sp_helpindex
DROP INDEX	sp_spaceused
INSERT	UPDATE
RECONFIGURE	UPDATE STATISTICS
SET