Declaring an Index as unique in SQL Server

If I know an index will have unique values, how will it affect performance on inserts or selects if I declare it as such.

If the optimiser knows the index is unique how will that affect the query plan?

I understand that specifying uniquenes can serve to preserve integrity, but leaving that discussion aside for the moment, what are the perfomance consequences.

7 thoughts on “Declaring an Index as unique in SQL Server

  1. user

    Perhaps more important: the uniqueness will protect the data integrity. Performance would a reason to ignore this.

    Performance could be affected positively or negatively or not at all: it would depends on the query, if the index is used etc

    Reply
  2. user

    Long story short: if your data are intrinsically UNIQUE, you will benefit from creating a UNIQIE index on them.

    See the article in my blog for detailed explanation:


    Now, the gory details.

    As @Mehrdad said, UNIQUENESS affects the estimated row count in the plan builder.

    UNIQUE index has maximal possible selectivity, that’s why:

    SELECT  *
    FROM    table1 t2, table2 t2
    WHERE   t1.id = :myid
            AND t2.unique_indexed_field = t1.value
    

    almost surely will use NESTED LOOPS, while

    SELECT  *
    FROM    table1 t2, table2 t2
    WHERE   t1.id = :myid
            AND t2.non_unique_indexed_field = t1.value
    

    may benefit from a HASH JOIN if the optimizer thinks that non_unique_indexed_field is not selective.

    If your index is CLUSTERED (i. e. the rows theirselves are contained in the index leaves) and non-UNIQUE, then a special hidden column called uniquifier is added to each index key, thus making the key larger and the index slower.

    That’s why UNIQUE CLUSTERED index is in fact a little more efficicent than a non-UNIQUE CLUSTERED one.

    In Oracle, a join on UNIQUE INDEX is required for a such called key preservation, which ensures that each row from a table will be selected at most once and makes a view updatable.

    This query:

    UPDATE  (
            SELECT  *
            FROM    mytable t1, mytable t2
            WHERE   t2.reference = t1.unique_indexed_field
            )
    SET     value = other_value
    

    will work in Oracle, while this one:

    UPDATE  (
            SELECT  *
            FROM    mytable t1, mytable t2
            WHERE   t2.reference = t1.non_unique_indexed_field
            )
    SET     value = other_value
    

    will fail.

    This is not an issue with SQL Server, though.

    One more thing: for a table like this,

    CREATE TABLE t_indexer (id INT NOT NULL PRIMARY KEY, uval INT NOT NULL, ival INT NOT NULL)
    CREATE UNIQUE INDEX ux_indexer_ux ON t_indexer (uval)
    CREATE INDEX ix_indexer_ux ON t_indexer (ival)
    

    , this query:

    /* Sorts on the non-unique index first */
    SELECT  TOP 1 *
    FROM    t_indexer
    ORDER BY
            ival, uval
    

    will use a TOP N SORT, while this one:

    /* Sorts on the unique index first */
    SELECT  TOP 1 *
    FROM    t_indexer
    ORDER BY
            uval, ival
    

    will use just an index scan.

    For the latter query, there is no point in additional sorting on ival, since uval are unique anyway, and the optimizer takes this into account.

    On sample data of 200,000 rows (id == uval == ival), the former query runs for 15 seconds, while the latter one is instant.

    Reply
  3. user

    I’ve just tested this on my machine for a Production table containing more than 1 million rows because I figured it was a good test. The results were interesting, here’s the raw numbers:

    — No Index:

        Setup Time: 8888, Insert Time: 501690
    

    — Unique Constraint:

        Setup Time:   42, Insert Time: 488030
    

    The Setup consisted of getting the Maximum of the field that I was adding the Unique Constraint to – so logically the performance was dramatically increased by adding the constraint. This would also improve performance when searching by this foreign key.

    Interestingly the Insert Time improved slightly as well (by 2.7228%), so only positive impacts [in my test case] of adding the Constraint (+ inherent index).

    Testing shows only positive impacts from adding the constraint – no performance impact.

    NOTE: For our test system I expect the values to almost always be unique, so I didn’t test inserting non-unique values, in this data it truly is an exception – and not something we need to be performant.

    Reply

Leave a Reply

Your email address will not be published.