Clustered DB Server: clustered-index issue

hi,
yesterday i realized (by those helpfull explenations of a
few guys of this forum) that clustered index contains all
of the fields of the table (it looks like the index is
actually the table itself...?), and also that updateing
varing field types like varchar may change the the length
of a row, and therefor may cause a page-split in the index.
in the first place i described a case of a clustered index
on an IDENTITY column (IDENTITY_INSERT is never ON so it
is actually always auto-incremented), so the page split
may happen only because of an Update, and never because of
an Insert.
well, my table is acutally has only fields of
int,float,and datetime types, and only 1 vharchar(40). the
maximum length of a raw is exaclty 200 (so the minimum is
160 when the varchar is actually empty), so 40 is 20% of
200.
finally, for my questions:
1. if i want to avoid any chance for page-split, should i
set the fill-factor of the clustered index to 80?
2. what would be considered as better performance for
reading & updating (amount of storage place is not an
issue): A. chaging the type of the varchar(40) to char
(40), and setting fill-factor to 100. B. leaving the
varchar(40) as it is, and setting fill-factor to 80.
Thanks in advance.
edo.
edo
Setting fillfactor=100 is sutiable for 'read-only' tables because the page
is 100 percent full and I/O is lower as well
By having clustrered index on the table, you eliminate page splitting
entirely because all new records will be added to the end of the table. But
only do this if you know that a clustered index on the primary key is the
best option for you when it comes to query performance. But just thinking if
you have datapage with 1,5,7 index structure and a new row (let me say 3) is
added to index page. Then 5 and 7 to be moved on a new datapage (created by
SQL Server and allocated anywhere) in order to make room for 3. Now, your
data is not in logical order (external fragmentation).
"edo" <anonymous@.discussions.microsoft.com> wrote in message
news:1d37601c4535b$957e53b0$a001280a@.phx.gbl...
> hi,
> yesterday i realized (by those helpfull explenations of a
> few guys of this forum) that clustered index contains all
> of the fields of the table (it looks like the index is
> actually the table itself...?), and also that updateing
> varing field types like varchar may change the the length
> of a row, and therefor may cause a page-split in the index.
> in the first place i described a case of a clustered index
> on an IDENTITY column (IDENTITY_INSERT is never ON so it
> is actually always auto-incremented), so the page split
> may happen only because of an Update, and never because of
> an Insert.
> well, my table is acutally has only fields of
> int,float,and datetime types, and only 1 vharchar(40). the
> maximum length of a raw is exaclty 200 (so the minimum is
> 160 when the varchar is actually empty), so 40 is 20% of
> 200.
> finally, for my questions:
> 1. if i want to avoid any chance for page-split, should i
> set the fill-factor of the clustered index to 80?
> 2. what would be considered as better performance for
> reading & updating (amount of storage place is not an
> issue): A. chaging the type of the varchar(40) to char
> (40), and setting fill-factor to 100. B. leaving the
> varchar(40) as it is, and setting fill-factor to 80.
> Thanks in advance.
> edo.
>
>
>
|||> clustered index contains all of the fields of the table (it looks like the
index is actually the table itself...?),
It is the table itself, and the keys in the clustered index determine how
the rows are ordered. And you only avoid page splits during inserts if the
primary key of the table is also the clustered index keys of the table i.e.
a clustered primary key, as you could have a non-clustered primary key.
I'm not too sure of the various options you are contemplating, though they
appear sound. I'll probably go the char(40) route since space is not an
issue and I believe there are (slight) overheads when dealing with varchar
columns.
Peter Yeoh
http://www.yohz.com
Need smaller SQL2K backup files? Try MiniSQLBackup
"edo" <anonymous@.discussions.microsoft.com> wrote in message
news:1d37601c4535b$957e53b0$a001280a@.phx.gbl...
> hi,
> yesterday i realized (by those helpfull explenations of a
> few guys of this forum) that clustered index contains all
> of the fields of the table (it looks like the index is
> actually the table itself...?), and also that updateing
> varing field types like varchar may change the the length
> of a row, and therefor may cause a page-split in the index.
> in the first place i described a case of a clustered index
> on an IDENTITY column (IDENTITY_INSERT is never ON so it
> is actually always auto-incremented), so the page split
> may happen only because of an Update, and never because of
> an Insert.
> well, my table is acutally has only fields of
> int,float,and datetime types, and only 1 vharchar(40). the
> maximum length of a raw is exaclty 200 (so the minimum is
> 160 when the varchar is actually empty), so 40 is 20% of
> 200.
> finally, for my questions:
> 1. if i want to avoid any chance for page-split, should i
> set the fill-factor of the clustered index to 80?
> 2. what would be considered as better performance for
> reading & updating (amount of storage place is not an
> issue): A. chaging the type of the varchar(40) to char
> (40), and setting fill-factor to 100. B. leaving the
> varchar(40) as it is, and setting fill-factor to 80.
> Thanks in advance.
> edo.
>
>
>
|||> it probably save place for at least 1 WHOLE row, so it will be able to
place it as a whole without page-split.
Why would it do that if the clustered index is also your primary key?
Say a page is 8020 bytes. If all new rows reached their max size during
inserts, you could've inserted 32 rows and wasted 1604 bytes. If all new
rows were created with no value for the varchar column, you could've
inserted approx. 40 rows. Now if all 40 rows were then updated to fill the
varchar column to 40 chars, you would need 1600 bytes, which still fits on
your page. So the worst case scenario is to waste 1604 bytes per page, and
the best case is to fill the page entirely. NOTE: above calculations
ignored page overheads.
Peter Yeoh
http://www.yohz.com
Need smaller SQL2K backup files? Try MiniSQLBackup
"edo" <anonymous@.discussions.microsoft.com> wrote in message
news:1d3bd01c45367$8b11c130$a001280a@.phx.gbl...
> hi,
> i guess i made a basic mistake while trying to "calculate"
> the fill-factor percentage needed:
> i guess the server probably does not gain sagnificant
> improvment by saveing only 20% space of the possible row
> size. it probably save place for at least 1 WHOLE row, so
> it will be able to place it as a whole without page-split.
> actually, i believe 20 percent means that there is enough
> space for much more than 1 row.
> so, as to my questions:
> 1. may be i can even set fill-factor to 99% and still be
> sure that there is no chance for a split-page at all?
> 2. same question with no principle changes.
>
> thanks again.
> edo.
|||hi,
you wrote:
But just thinking if you have datapage with 1,5,7 index
structure and a new row (let me say 3) is added to index
page....
i'm not sure i understand this line.
i ensured that 3 can not come after 7, since the clustered
index in on an IDENTITY column, and i never intend to set
IDENTITY_INSERT to ON.
so, maybe you are talking about a kind of a temporary
result-set built by the server as SELECT SQL query? or
what?
anyway, you mentioned "query performance". as for that,
one of the most common tasks my application is doing is to
find raws accordig to this field, like for example:
select * from my_table where recid=X
thanks again,
edo.
|||edo
Yep, I was assuming that you don't have an Identity property on the table.
Does recid column has a clustered index?
If your output is large set I suggets you to create a clustered index on
this column otherwise I would create a non clustered.
Also run SET STATISTICS IO ON to see what is going on when SQL Server is
perfoming the query.
"edo" <anonymous@.discussions.microsoft.com> wrote in message
news:1ccff01c4536a$2327dbb0$a301280a@.phx.gbl...
> hi,
> you wrote:
> But just thinking if you have datapage with 1,5,7 index
> structure and a new row (let me say 3) is added to index
> page....
> i'm not sure i understand this line.
> i ensured that 3 can not come after 7, since the clustered
> index in on an IDENTITY column, and i never intend to set
> IDENTITY_INSERT to ON.
> so, maybe you are talking about a kind of a temporary
> result-set built by the server as SELECT SQL query? or
> what?
> anyway, you mentioned "query performance". as for that,
> one of the most common tasks my application is doing is to
> find raws accordig to this field, like for example:
> select * from my_table where recid=X
> thanks again,
> edo.
>
>
>
|||hi,
well, it's much more clear for me now.
indeed, it seems that the bottom line is that in order to
avoid page-split at all, the fill-factor should be
(minimum_row_size/maximum_row_size)*100.
another question, if i may:
thinking of the search issue ALONE, would it be faster for
the SQL-Server to find a row if the index has larger fill-
factor, or it's only a metter of a wasted space?
for example, given two identical tables, both filled
exactly with the same data, but the fill-factor of the
first is 50% and the fill-factor of the second is 100%.
dose a search in the second table would be faster
(significantly or at all)?
another issue, just to make sure: do a fixed-size type
(like Integer) may influence the actual row size if it
alows NULL values? dose a NULL value take the same amount
of space as "real" Value? if not, in order to ensure a
minimum_row_size of 160, i must set all the fields to NOT
NULL, right?
thanks a lot.
edo.
|||I guess it all boils down to what Uri mentioned in the other post i.e. I/O.
The less I/O SQL Server has to read, the faster your operations can
complete. Thus, the more data that can fit on a page, the faster your
search will execute. Whether it is significant would depend on the
difference in the number of pages between the 2 tables, taking into account
SQL Server reads 8 pages at a time (extents), the OS probably reads it in
bigger chunks, and your disk controller probably has its own caching system.
Not forgetting that the pages may be cached in SQL Server buffers where size
permits, further diminishing the difference if the entire table can fit into
the cache.
Re your second question, based on a simple (unscientific) test I did, a NULL
value does not take up more space than a real value. If you're interested,
look up the DBCC PAGE command to view the contents of a data page.
Peter Yeoh
http://www.yohz.com
Need smaller SQL2K backup files? Try MiniSQLBackup
"edo" <anonymous@.discussions.microsoft.com> wrote in message
news:1d0fd01c45374$6ad1c2a0$a501280a@.phx.gbl...
> hi,
> well, it's much more clear for me now.
> indeed, it seems that the bottom line is that in order to
> avoid page-split at all, the fill-factor should be
> (minimum_row_size/maximum_row_size)*100.
> another question, if i may:
> thinking of the search issue ALONE, would it be faster for
> the SQL-Server to find a row if the index has larger fill-
> factor, or it's only a metter of a wasted space?
> for example, given two identical tables, both filled
> exactly with the same data, but the fill-factor of the
> first is 50% and the fill-factor of the second is 100%.
> dose a search in the second table would be faster
> (significantly or at all)?
> another issue, just to make sure: do a fixed-size type
> (like Integer) may influence the actual row size if it
> alows NULL values? dose a NULL value take the same amount
> of space as "real" Value? if not, in order to ensure a
> minimum_row_size of 160, i must set all the fields to NOT
> NULL, right?
> thanks a lot.
> edo.
>
|||
>If you're interested,
>look up the DBCC PAGE command to view the contents of a
Data page.
i'll check this out.
thanks (again),
edo.
|||There is a bit for each field which allows null, and if the field is null
the flag is turned on... storage for nulls is very very small.
One thing you should consider before changing the varchar into char is
whether or not you intend to index the column... The index on the char
column would be larger than the varchar, and may slow performance..
Wayne Snyder, MCDBA, SQL Server MVP
Mariner, Charlotte, NC
www.mariner-usa.com
(Please respond only to the newsgroups.)
I support the Professional Association of SQL Server (PASS) and it's
community of SQL Server professionals.
www.sqlpass.org
"edo" <anonymous@.discussions.microsoft.com> wrote in message
news:1d37601c4535b$957e53b0$a001280a@.phx.gbl...
> hi,
> yesterday i realized (by those helpfull explenations of a
> few guys of this forum) that clustered index contains all
> of the fields of the table (it looks like the index is
> actually the table itself...?), and also that updateing
> varing field types like varchar may change the the length
> of a row, and therefor may cause a page-split in the index.
> in the first place i described a case of a clustered index
> on an IDENTITY column (IDENTITY_INSERT is never ON so it
> is actually always auto-incremented), so the page split
> may happen only because of an Update, and never because of
> an Insert.
> well, my table is acutally has only fields of
> int,float,and datetime types, and only 1 vharchar(40). the
> maximum length of a raw is exaclty 200 (so the minimum is
> 160 when the varchar is actually empty), so 40 is 20% of
> 200.
> finally, for my questions:
> 1. if i want to avoid any chance for page-split, should i
> set the fill-factor of the clustered index to 80?
> 2. what would be considered as better performance for
> reading & updating (amount of storage place is not an
> issue): A. chaging the type of the varchar(40) to char
> (40), and setting fill-factor to 100. B. leaving the
> varchar(40) as it is, and setting fill-factor to 80.
> Thanks in advance.
> edo.
>
>
>

Thursday, February 16, 2012

clustered-index issue

No comments:

Post a Comment

Clustered DB Server

Blog Archive

About Me