Clustered DB Server: Clustered Index or ORDER BY?

Hello,
I'll ask my question up front followed by all the details in case they are
necessary.
In reviewing SQL Books On Line it indicates a clustered index is suitable
for ordering data, instead of using ORDER BY in the query. I have seen some
cases where a clustered index does not work this way, and the only solution
was to recreate the clustered index, or use ORDER BY (or alternatively set
MAX_DOP=1). Some industry forum discussions indicate that ORDER BY is the
only way to guarantee the order of the rows, which supports by own
experiences in the matter. I'd like to know what is correct, and why.
My problem table, TableA, holds temporary data. The older data is deleted
from the back of the table and newer data is inserted in at the front. The
table uses a unique index on one column and a unique clustered index on
another column. Both columns are Ints. I have seen some cases where queries
don't return the rows in the correct order, and it is very important they be
in the correct order.
TableA
/* Col1 has a nonclustered unique index. Col1 increments like an identity,
but it's not one. */
Col1 int
Col2 varchar(100) -- user data
/*Col3 has a clustered index (not defined as unique) */
Col3 int identity(1,1)
At some point in time, only queries like this work correctly:
Select * from TableA
--or
Select * from TableA ORDER BY Col1
--or
Select * from TableA ORDER BY Col3
But queries with filters such as LIKE don't return the rows in the correct
order.
Select * from TableA Where Col3 LIKE 'SOMEVALUE%'
This behavior was found on a mutli-processor server running SQL 2000 SP4 and
was recreated on a copy of the database on another server, also with multiple
processors and SQL 2000 SP4. I've found that specifying OPTION (MAXDOP 1)
fixes the problem when using LIKE. (I have not tested the same situation on
other platforms such as single-processor servers or previous SP's.)
Again I'm interested in knowing if it is correct/valid to use clustered
indexes as a way to order query results, or if it is truly necessary to
specify ORDER BY to get the results in the correct order, and the reasons for
it. Thanks in advance!
Correction: the query on TableA using LIKE should read this:
Select * from TableA Where Col2 LIKE 'SOMEVALUE%'
"Mike K" wrote:

> Hello,
> I'll ask my question up front followed by all the details in case they are
> necessary.
> In reviewing SQL Books On Line it indicates a clustered index is suitable
> for ordering data, instead of using ORDER BY in the query. I have seen some
> cases where a clustered index does not work this way, and the only solution
> was to recreate the clustered index, or use ORDER BY (or alternatively set
> MAX_DOP=1). Some industry forum discussions indicate that ORDER BY is the
> only way to guarantee the order of the rows, which supports by own
> experiences in the matter. I'd like to know what is correct, and why.
> My problem table, TableA, holds temporary data. The older data is deleted
> from the back of the table and newer data is inserted in at the front. The
> table uses a unique index on one column and a unique clustered index on
> another column. Both columns are Ints. I have seen some cases where queries
> don't return the rows in the correct order, and it is very important they be
> in the correct order.
> TableA
> /* Col1 has a nonclustered unique index. Col1 increments like an identity,
> but it's not one. */
> Col1 int
> Col2 varchar(100) -- user data
> /*Col3 has a clustered index (not defined as unique) */
> Col3 int identity(1,1)
> At some point in time, only queries like this work correctly:
> Select * from TableA
> --or
> Select * from TableA ORDER BY Col1
> --or
> Select * from TableA ORDER BY Col3
> But queries with filters such as LIKE don't return the rows in the correct
> order.
> Select * from TableA Where Col3 LIKE 'SOMEVALUE%'
> This behavior was found on a mutli-processor server running SQL 2000 SP4 and
> was recreated on a copy of the database on another server, also with multiple
> processors and SQL 2000 SP4. I've found that specifying OPTION (MAXDOP 1)
> fixes the problem when using LIKE. (I have not tested the same situation on
> other platforms such as single-processor servers or previous SP's.)
> Again I'm interested in knowing if it is correct/valid to use clustered
> indexes as a way to order query results, or if it is truly necessary to
> specify ORDER BY to get the results in the correct order, and the reasons for
> it. Thanks in advance!
|||Ive read on the web several times that ORDER BY is the ONLY way to gaurantee
the resulst coming back a certain way. Ive NEVER read anything suggesting
that a clustered index is used for this.
TIA,
ChrisR
"Mike K" wrote:
[vbcol=seagreen]
> Correction: the query on TableA using LIKE should read this:
> Select * from TableA Where Col2 LIKE 'SOMEVALUE%'
> "Mike K" wrote:
|||Mike K wrote:
> Hello,
> I'll ask my question up front followed by all the details in case
> they are necessary.
> In reviewing SQL Books On Line it indicates a clustered index is
> suitable for ordering data, instead of using ORDER BY in the query. I
> have seen some cases where a clustered index does not work this way,
> and the only solution was to recreate the clustered index, or use
> ORDER BY (or alternatively set MAX_DOP=1). Some industry forum
> discussions indicate that ORDER BY is the only way to guarantee the
> order of the rows, which supports by own experiences in the matter.
> I'd like to know what is correct, and why.
> My problem table, TableA, holds temporary data. The older data is
> deleted from the back of the table and newer data is inserted in at
> the front. The table uses a unique index on one column and a unique
> clustered index on another column. Both columns are Ints. I have seen
> some cases where queries don't return the rows in the correct order,
> and it is very important they be in the correct order.
> TableA
> /* Col1 has a nonclustered unique index. Col1 increments like an
> identity, but it's not one. */
> Col1 int
> Col2 varchar(100) -- user data
> /*Col3 has a clustered index (not defined as unique) */
> Col3 int identity(1,1)
> At some point in time, only queries like this work correctly:
> Select * from TableA
> --or
> Select * from TableA ORDER BY Col1
> --or
> Select * from TableA ORDER BY Col3
> But queries with filters such as LIKE don't return the rows in the
> correct order.
> Select * from TableA Where Col3 LIKE 'SOMEVALUE%'
> This behavior was found on a mutli-processor server running SQL 2000
> SP4 and was recreated on a copy of the database on another server,
> also with multiple processors and SQL 2000 SP4. I've found that
> specifying OPTION (MAXDOP 1) fixes the problem when using LIKE. (I
> have not tested the same situation on other platforms such as
> single-processor servers or previous SP's.)
> Again I'm interested in knowing if it is correct/valid to use
> clustered indexes as a way to order query results, or if it is truly
> necessary to specify ORDER BY to get the results in the correct
> order, and the reasons for it. Thanks in advance!
You must use an ORDER BY to guarantee order.
David Gugick
Quest Software
www.imceda.com
www.quest.com
|||ORDER BY is indeed the only way to guarantee that results are coming back in
a specific order. If the ORDER BY is on the column(s) that make up the
clustered index, SQL Server will likely use that clustered index to look up
the rows, and the order in which to return them, although it could use a
different index if an appropriate one exists. For example if you have a
table with a large number of columns, and you only need a column that has a
non-clustered index plus the column(s) from the clustered index, on which
you order, it is possible that SQL Server decides it is more efficient to
use the narrower non clustered index (which includes the columns of the
clustered index by definition) and sort it, rather than using the wider
clustered index.
IIRC, until SQL Server version 6.0 or 6.5, selecting from a table that had a
clustered index would always return the rows in the order of a clustered
index, unless you specified an ORDER BY clause with a different order. The
Query Optimizer in later versions is more advanced and efficient, and
doesn't do the work of accessing the rows in the order of the clustered
index unless you specifically say so.
Jacco Schalkwijk
SQL Server MVP
"ChrisR" <ChrisR@.discussions.microsoft.com> wrote in message
news:90CF9131-3F8A-44F0-9CCF-DCA1D0A2D2E8@.microsoft.com...[vbcol=seagreen]
> Ive read on the web several times that ORDER BY is the ONLY way to
> gaurantee
> the resulst coming back a certain way. Ive NEVER read anything suggesting
> that a clustered index is used for this.
> --
> TIA,
> ChrisR
>
> "Mike K" wrote:
|||> In reviewing SQL Books On Line it indicates a clustered index is suitable
> for ordering data, instead of using ORDER BY in the query.
Do you happen to remember which topic(s) you looked at? This is inaccurate
information. I believe we have corrected this in SQL Server 2005 Books
Online, but it would be very useful if you could provide the topic that gave
you that idea.
Thanks,
Gail Erickson [MS]
SQL Server Documentation Team
This posting is provided "AS IS" with no warranties, and confers no rights
"Mike K" <MikeK@.discussions.microsoft.com> wrote in message
news:92F631A4-F035-4E60-B6D7-D23EEEEB3A21@.microsoft.com...
> Hello,
> I'll ask my question up front followed by all the details in case they are
> necessary.
> In reviewing SQL Books On Line it indicates a clustered index is suitable
> for ordering data, instead of using ORDER BY in the query. I have seen
> some
> cases where a clustered index does not work this way, and the only
> solution
> was to recreate the clustered index, or use ORDER BY (or alternatively set
> MAX_DOP=1). Some industry forum discussions indicate that ORDER BY is the
> only way to guarantee the order of the rows, which supports by own
> experiences in the matter. I'd like to know what is correct, and why.
> My problem table, TableA, holds temporary data. The older data is deleted
> from the back of the table and newer data is inserted in at the front.
> The
> table uses a unique index on one column and a unique clustered index on
> another column. Both columns are Ints. I have seen some cases where
> queries
> don't return the rows in the correct order, and it is very important they
> be
> in the correct order.
> TableA
> /* Col1 has a nonclustered unique index. Col1 increments like an identity,
> but it's not one. */
> Col1 int
> Col2 varchar(100) -- user data
> /*Col3 has a clustered index (not defined as unique) */
> Col3 int identity(1,1)
> At some point in time, only queries like this work correctly:
> Select * from TableA
> --or
> Select * from TableA ORDER BY Col1
> --or
> Select * from TableA ORDER BY Col3
> But queries with filters such as LIKE don't return the rows in the correct
> order.
> Select * from TableA Where Col3 LIKE 'SOMEVALUE%'
> This behavior was found on a mutli-processor server running SQL 2000 SP4
> and
> was recreated on a copy of the database on another server, also with
> multiple
> processors and SQL 2000 SP4. I've found that specifying OPTION (MAXDOP 1)
> fixes the problem when using LIKE. (I have not tested the same situation
> on
> other platforms such as single-processor servers or previous SP's.)
> Again I'm interested in knowing if it is correct/valid to use clustered
> indexes as a way to order query results, or if it is truly necessary to
> specify ORDER BY to get the results in the correct order, and the reasons
> for
> it. Thanks in advance!
|||This is a pretty simple question. The ONLY way to guarantee a defined
output is to use an ORDER BY clause. The word "only" in this sentence
cannot be stressed too much!!
In general, when using SQL, you should never rely on particular
implementation details of an RDBMS, because the implementation may
change with every version, fix or service pack. An example of this is
the implied ordering in SQL-Server 6.5 when using a GROUP BY which was
no longer the case in SQL-Server 7.0.
The implementation may even behave differently on different platforms or
different configurations. An example is the predictability of the
NEWID() function on Windows NT versus the 'random' implementation on
Windows 2000/2003.
Gert-Jan
Mike K wrote:
> Hello,
> I'll ask my question up front followed by all the details in case they are
> necessary.
> In reviewing SQL Books On Line it indicates a clustered index is suitable
> for ordering data, instead of using ORDER BY in the query. I have seen some
> cases where a clustered index does not work this way, and the only solution
> was to recreate the clustered index, or use ORDER BY (or alternatively set
> MAX_DOP=1). Some industry forum discussions indicate that ORDER BY is the
> only way to guarantee the order of the rows, which supports by own
> experiences in the matter. I'd like to know what is correct, and why.
> My problem table, TableA, holds temporary data. The older data is deleted
> from the back of the table and newer data is inserted in at the front. The
> table uses a unique index on one column and a unique clustered index on
> another column. Both columns are Ints. I have seen some cases where queries
> don't return the rows in the correct order, and it is very important they be
> in the correct order.
> TableA
> /* Col1 has a nonclustered unique index. Col1 increments like an identity,
> but it's not one. */
> Col1 int
> Col2 varchar(100) -- user data
> /*Col3 has a clustered index (not defined as unique) */
> Col3 int identity(1,1)
> At some point in time, only queries like this work correctly:
> Select * from TableA
> --or
> Select * from TableA ORDER BY Col1
> --or
> Select * from TableA ORDER BY Col3
> But queries with filters such as LIKE don't return the rows in the correct
> order.
> Select * from TableA Where Col3 LIKE 'SOMEVALUE%'
> This behavior was found on a mutli-processor server running SQL 2000 SP4 and
> was recreated on a copy of the database on another server, also with multiple
> processors and SQL 2000 SP4. I've found that specifying OPTION (MAXDOP 1)
> fixes the problem when using LIKE. (I have not tested the same situation on
> other platforms such as single-processor servers or previous SP's.)
> Again I'm interested in knowing if it is correct/valid to use clustered
> indexes as a way to order query results, or if it is truly necessary to
> specify ORDER BY to get the results in the correct order, and the reasons for
> it. Thanks in advance!
|||I found it under the topic "clustered indexes, overview". However I beleive I
may have misread or misinterpreted what it said.
I originally interpreted it to mean a clustered index can replace the ORDER
BY. But in re-reading it, I'm interpreting it as meaning queries using ORDER
BY with a clustered index column would run faster than if the clustered index
hadn't been there. This interpretation agrees with everyone's responses.
"Gail Erickson [MS]" wrote:

> Do you happen to remember which topic(s) you looked at? This is inaccurate
> information. I believe we have corrected this in SQL Server 2005 Books
> Online, but it would be very useful if you could provide the topic that gave
> you that idea.
> Thanks,
> --
> Gail Erickson [MS]
> SQL Server Documentation Team
> This posting is provided "AS IS" with no warranties, and confers no rights
> "Mike K" <MikeK@.discussions.microsoft.com> wrote in message
> news:92F631A4-F035-4E60-B6D7-D23EEEEB3A21@.microsoft.com...
>
>

Sunday, February 12, 2012

Clustered Index or ORDER BY?

No comments:

Post a Comment

Clustered DB Server

Blog Archive

About Me