Use Efficient Index Design

Unlike many nonrelational systems, relational indexes are not considered part of the logical database design. Indexes can be dropped, added, and changed without impacting the database schema or application design in any way other than performance. Efficient index design is paramount to achieving good SQL Server performance. For these reasons, you should not hesitate to experiment with different indexes.

The optimizer reliably chooses the most effective index in the majority of cases. The overall index design strategy should be to provide a good selection of indexes to the optimizer, and trust it to make the right decision. This reduces analysis time and gives good performance over a wide variety of situations.

The following are index design recommendations:

SELECT * FROM PHONEBOOK

WHERE LASTNAME='SMITH'

Or

SELECT * FROM MEMBERTABLE

WHERE MEMBER_NO > 5000

AND MEMBER_NO < 6000

By contrast, the LASTNAME or MEMBER_NO columns mentioned above are probably not good candidates for a nonclustered index if this type of query were common. Try to use nonclustered indexes on columns where few rows are returned.

SELECT COUNT (DISTINCT COLNAME), COUNT(*)

FROM TABLENAME

This returns the number of unique values in the column and the total number of rows. Compare these values. On a 10,000-row table, 5,000 unique values would make the column a good candidate for a nonclustered index. On the same table, 20 unique values would better suit a clustered index. Three unique values should not be indexed at all. These are only examples, not hard-and-fast rules. Remember to place the indexes on the individual columns listed in the WHERE and FROM clauses of the queries.

SELECT COL1, COL2, COUNT(*)

FROM TABLENAME

GROUP BY COL1, COL2

This returns one row for each key value along with a count of the instances of each value. To reduce the number of rows returned, it may be helpful to exclude some with a HAVING clause. For example, to exclude all rows that have a unique key, use the following HAVING clause:

HAVING COUNT(*) > 1

The number of rows returned in a query is also an important factor in index selection. The optimizer considers a nonclustered index to cost at least one page I/O per returned row. At this rate, it quickly becomes more efficient to scan the entire table. This is another reason to restrict the size of the result set or to locate a large result set with a clustered index.

Do not always equate index usage with good performance, and vice versa. If using an index always produced the best performance, the optimizer's job would be simple—always use any available index. In reality, incorrect choice of indexed retrieval can result in poor performance. Therefore, the optimizer's task is to select indexed retrieval where it will help performance and avoid indexed retrieval where it will hurt performance.