Tuesday, February 14, 2012

Clustered Indexes

We use uniqueidentifiers as our primary key because we have many oltp
systems that get merged into one corporate reporting structure.
Some tables accepted the default behaviour of SQL Server and made the
primary key the clustered index.
Does a clustered index on a GUID sort them in order or does it use them like
a heap table and insert all rows at the end?
Thanks in advance.Yes, the values are sorted, which can also lead to frequent page splits,
which in effect downgrades insert performance. Columns of type
UNIQUEIDENTIFIER are IMHO a lousy choice for clustered indexes for two main
reasons:
1) the size of the data; and
2) high selectiveness (pages get reordered quite frequently).
A performance gain can be achieved, though, by spreading the table on
several files, while keeping the FILL FACTOR of the clustered index low. Thi
s
would increase possibility of paralelism by enabling paralel inserts to occu
r
on different files concurrently.
I'd use a different column for a clustered index - where inserts are
frequent the best candidate IMHO is a column with a small datatype (int,
bigint) with incremental values (such as an IDENTITY column).
ML
http://milambda.blogspot.com/|||Rather than always inserting at the bottom page, a clustered index will
attempt to insert the row within a page so that the rows are clustered
(sorted) based on the value of the key. It's best not to use a clustered
index unless you are attempting to achieve a specific outcome.
You can use DBCC SHOWCONTIG to check for index fragmentation:
http://www.microsoft.com/technet/pr...n/ss2kidbp.mspx
"JI" <jidawgs@.gmail.com> wrote in message
news:OW%23SXUMLGHA.2704@.TK2MSFTNGP15.phx.gbl...
> We use uniqueidentifiers as our primary key because we have many oltp
> systems that get merged into one corporate reporting structure.
> Some tables accepted the default behaviour of SQL Server and made the
> primary key the clustered index.
> Does a clustered index on a GUID sort them in order or does it use them
> like a heap table and insert all rows at the end?
> Thanks in advance.
>|||GUIDs are probably the worst choice for a clustered index, at least as far
as INSERTs go. IDENTITY is probably the best choice for a clustered index
because of its increasing nature, but with many data sources feeding one, it
can be tricky to manage. If the database is designed correctly, each table
should have at least one alternate key, so I'd put the clustered index
there, unless you expect a lot of ranged queries, in which case it may be
better to put the clustered index on the target of the ranged query.
Neither solution may offer much INSERT performance improvement, but it may
significantly improve query performance.
"JI" <jidawgs@.gmail.com> wrote in message
news:OW%23SXUMLGHA.2704@.TK2MSFTNGP15.phx.gbl...
> We use uniqueidentifiers as our primary key because we have many oltp
> systems that get merged into one corporate reporting structure.
> Some tables accepted the default behaviour of SQL Server and made the
> primary key the clustered index.
> Does a clustered index on a GUID sort them in order or does it use them
> like a heap table and insert all rows at the end?
> Thanks in advance.
>|||For sql2k, you can use Gert's xp_new_sequential_guid. This should eliminate
the issues (fragmentation, page split, etc.) with using guid as the primary
key.
http://sqldev.net/xp/xpguid.htm
-oj
"Brian Selzer" <brian@.selzer-software.com> wrote in message
news:eb86LrNLGHA.2780@.tk2msftngp13.phx.gbl...
> GUIDs are probably the worst choice for a clustered index, at least as far
> as INSERTs go. IDENTITY is probably the best choice for a clustered index
> because of its increasing nature, but with many data sources feeding one,
> it can be tricky to manage. If the database is designed correctly, each
> table should have at least one alternate key, so I'd put the clustered
> index there, unless you expect a lot of ranged queries, in which case it
> may be better to put the clustered index on the target of the ranged
> query. Neither solution may offer much INSERT performance improvement, but
> it may significantly improve query performance.
> "JI" <jidawgs@.gmail.com> wrote in message
> news:OW%23SXUMLGHA.2704@.TK2MSFTNGP15.phx.gbl...
>|||This is interesting. It generates a GUID based on the MAC address of the
NIC in the server. The only problem I see is that there are NICs out there
with the same MAC address--even though that's not supposed to happen.
Therefore it's possible for duplicates to be generated on different
machines. Barring that extremely remote possibility, this looks like a good
answer to the problem. SQL 2005 apparently has a NEWSEQUENTIALID() function
that provides this same functionality.
"oj" <nospam_ojngo@.home.com> wrote in message
news:%23GEoJ4NLGHA.668@.TK2MSFTNGP11.phx.gbl...
> For sql2k, you can use Gert's xp_new_sequential_guid. This should
> eliminate the issues (fragmentation, page split, etc.) with using guid as
> the primary key.
> http://sqldev.net/xp/xpguid.htm
>
> --
> -oj
>
> "Brian Selzer" <brian@.selzer-software.com> wrote in message
> news:eb86LrNLGHA.2780@.tk2msftngp13.phx.gbl...
>|||Actually it does not depend on the NIC (as with newid()). ;-)
http://msdn.microsoft.com/library/e...esequential.asp
-oj
"Brian Selzer" <brian@.selzer-software.com> wrote in message
news:%238WfHXOLGHA.360@.TK2MSFTNGP12.phx.gbl...
> This is interesting. It generates a GUID based on the MAC address of the
> NIC in the server. The only problem I see is that there are NICs out
> there with the same MAC address--even though that's not supposed to
> happen. Therefore it's possible for duplicates to be generated on
> different machines. Barring that extremely remote possibility, this looks
> like a good answer to the problem. SQL 2005 apparently has a
> NEWSEQUENTIALID() function that provides this same functionality.
> "oj" <nospam_ojngo@.home.com> wrote in message
> news:%23GEoJ4NLGHA.668@.TK2MSFTNGP11.phx.gbl...
>|||Only if you don't have a NIC in the box. If there's a NIC, then it uses the
MAC.
"oj" <nospam_ojngo@.home.com> wrote in message
news:eYvlNvOLGHA.2124@.TK2MSFTNGP14.phx.gbl...
> Actually it does not depend on the NIC (as with newid()). ;-)
> http://msdn.microsoft.com/library/e...esequential.asp
> --
> -oj
>
> "Brian Selzer" <brian@.selzer-software.com> wrote in message
> news:%238WfHXOLGHA.360@.TK2MSFTNGP12.phx.gbl...
>

No comments:

Post a Comment