Developing Matt

My Technical Journal

Indexing Indexes

leave a comment »

A cluster is a group.  Socially it is a loose term, a cluster can be a group of people huddled together.  But remember, there is always a tie that binds.  Clusters always have a common theme.  Technically a computer cluster is a group of linked computers.  When you say CLUSTER I think PIGEONS.  Not sure why, but I picture a bunch of pigeons sitting on the wire at a car wash.  Clustered by the wire. 

A Clustered index is an ordered index: always stored in order, one at a time, physically arranged in order.  There is only one per table and unlike the other indexes, it’s not a copy of the data, it IS the data (this is why a clustered index seek is mainly i/o bound as displayed in the execution plan).  It is a balanced tree where all the data is in a leaf level.  When querying on a column that has this index the order doesn’t matter, as each leaf node has a forward and backward pointer, each bird knowing the other bird sitting next on either side.  It is very efficient.

Non-clustered index.  It stores all the fields that comprise the index and a reference to the actual record.  There isn’t an order and the ‘leaf nodes’ are index pages (pointing to the actual record) instead of data pages (containing the actual record like a clustered index).  What does the pointer contain?  If the table has a clustered index it uses that, otherwise it (being a HEAP) uses a combination of the file identifier, page number, and row number of the row in the page.  This pointer is affectionately named RID (row id).  Watch for this in your execution plans.  In our pigeon example our pigeons would be in a big oak tree, not a streamlined wire.  The birds are sitting on different branches.  All the queries that utilize this index go through the trunk and find thier way to the bird…and then it flies to the bird on the wire to get the data (that is, if there are birds on a wire in a clustered index)

A covering index is a non-clustered index that utilizes ‘included columns’ in the index to help with performance.  If you are always querying a table using the dateordered column then include it so that it doesn’t have to perform a costly lookup on the actual data row.  You see, if the index has the data already it doesn’t have to run to go to the actual data to get the records.  I refuse to let my pigeon analogy die here.  So, maybe it’s a pigeon with a note around it’s next that includes your extra column?  Oh heck, I don’t know.


Written by matt

January 20, 2009 at 8:35 pm

Posted in Sql Performance

Leave a Reply

Fill in your details below or click an icon to log in: Logo

You are commenting using your account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

%d bloggers like this: