Clustered DB Server: Clustering for Performance

Hello All. I need to set up quite a large SQL System 2005 system that needs
to deal with 40,000 concurrent users, and one of the tables will contain blob
data. 40,000 users throwing around Mb's of data at the same time worries me a
little bit!
My question is to do with Database Clustering and Mirroring. From what I can
see, there is still no load-balancing with SQL Server 2005, so does this mean
even in a clustered environment I am still basically only using a single
database server? I have seen many posts that tell me that clustering is ONLY
for failover and not for performance. I understand that with Active/Passive
this is the case, but how about Active/Active? If I can set up Active/Active
(2 nodes? 4 nodes? 8 nodes? how many are possible?) with a SAN and NOT have
failover implemented (can I turn failover off?) then would I have a load
balanced environment? I could then have all nodes running up to 100% (because
I don't have to worry about the failover) and therefore give me a dramatic
increase in performance compared with using a single server?
If I can do this then I can set up 2 identical clusters and an extra server
for the witness, and use database mirroring for failover? Of course I
understand that mirroring will decrease performance on the clusters. But
would this give me a super-fast database system that might cope with what I
need?
Also, I am thinking about taking the blob data of of the database and create
a new database that just deals with the blobs. How would this affect my
clustered/mirrored environment?
Thanks
Richard
Active/Active is a holdover from a specific technical implementation of
clustering for SQL 7.0. It was not scale-out either. The correct current
term is multi-instance. As you have read, SQL clustering is a failover and
availability technology only. SQL clustering does not load balance. SQL
does not have a native, automatic load balancing technology. There are some
third party virtualization technologies, but like any scale-out technology,
adding nodes does not equate to linear performance gains, and there are many
situations where such solutions do not work.
As with clustering, Database Mirroring is a failover technology and actually
has negative impacts on performance. There is one area where it can be used
to scale out, but that only works for read-only queries.
Segregating the blob data is not a bad idea, provided you do it down to the
physical disk level.
To summarize, there are no shortcuts to scalability, just like you can't
shortcut availability. Having said that, it sounds like you need a
reasonably large SQL server. That is not an impossible proposition. SQL
Server scales up very well. There are several vendors that can sell you an
adequate system. Some will even help you size it. You might think about
hiring an experienced SQL consultant who specializes in large-scale SQL
Systems to guide you through the process. (No, I am not trying to drum up
business, my time is fully booked).
Geoff N. Hiten
Senior Database Administrator
Microsoft SQL Server MVP
"Richard" <Richard@.discussions.microsoft.com> wrote in message
news:A115E5ED-8801-4142-B921-FD25665FE3C0@.microsoft.com...
> Hello All. I need to set up quite a large SQL System 2005 system that
> needs
> to deal with 40,000 concurrent users, and one of the tables will contain
> blob
> data. 40,000 users throwing around Mb's of data at the same time worries
> me a
> little bit!
> My question is to do with Database Clustering and Mirroring. From what I
> can
> see, there is still no load-balancing with SQL Server 2005, so does this
> mean
> even in a clustered environment I am still basically only using a single
> database server? I have seen many posts that tell me that clustering is
> ONLY
> for failover and not for performance. I understand that with
> Active/Passive
> this is the case, but how about Active/Active? If I can set up
> Active/Active
> (2 nodes? 4 nodes? 8 nodes? how many are possible?) with a SAN and NOT
> have
> failover implemented (can I turn failover off?) then would I have a load
> balanced environment? I could then have all nodes running up to 100%
> (because
> I don't have to worry about the failover) and therefore give me a dramatic
> increase in performance compared with using a single server?
> If I can do this then I can set up 2 identical clusters and an extra
> server
> for the witness, and use database mirroring for failover? Of course I
> understand that mirroring will decrease performance on the clusters. But
> would this give me a super-fast database system that might cope with what
> I
> need?
> Also, I am thinking about taking the blob data of of the database and
> create
> a new database that just deals with the blobs. How would this affect my
> clustered/mirrored environment?
> Thanks
> Richard
|||Richard,
Try not to take this the wrong way but if you are asking questions like
these and need to implement a system that large you are probably a bit over
your head. I suggest you seriously consider hiring a consultant who has
been thru things like this before. Quite honestly there aren't many people
who have dealt with systems that large. And if not done correctly it will
almost certainly turn belly up and die when you even get close to that many
users. That said here are a few comments. One is I doubt you will really
have 40K concurrent users especially since SQL Server only allows 32,767<g>.
Even with heavy use web based apps you rarely have as many concurrent
connections as you would think. And if you are talking anywhere near this
amount you are talking some serious hardware to support it. As I mentioned
in another post Clustering is not a load balancing option. Only one node at
a time has control over a specific disk resource in the cluster. So even
with Active / Active (or more correctly Multi-Instance) you can't share a
database since it resides on only one disk resource. You can't design a
system like this in a newsgroup and if you try you will fail. It requires a
very careful and well laid out plan to implement a large scale database
application.
Andrew J. Kelly SQL MVP
"Richard" <Richard@.discussions.microsoft.com> wrote in message
news:A115E5ED-8801-4142-B921-FD25665FE3C0@.microsoft.com...
> Hello All. I need to set up quite a large SQL System 2005 system that
> needs
> to deal with 40,000 concurrent users, and one of the tables will contain
> blob
> data. 40,000 users throwing around Mb's of data at the same time worries
> me a
> little bit!
> My question is to do with Database Clustering and Mirroring. From what I
> can
> see, there is still no load-balancing with SQL Server 2005, so does this
> mean
> even in a clustered environment I am still basically only using a single
> database server? I have seen many posts that tell me that clustering is
> ONLY
> for failover and not for performance. I understand that with
> Active/Passive
> this is the case, but how about Active/Active? If I can set up
> Active/Active
> (2 nodes? 4 nodes? 8 nodes? how many are possible?) with a SAN and NOT
> have
> failover implemented (can I turn failover off?) then would I have a load
> balanced environment? I could then have all nodes running up to 100%
> (because
> I don't have to worry about the failover) and therefore give me a dramatic
> increase in performance compared with using a single server?
> If I can do this then I can set up 2 identical clusters and an extra
> server
> for the witness, and use database mirroring for failover? Of course I
> understand that mirroring will decrease performance on the clusters. But
> would this give me a super-fast database system that might cope with what
> I
> need?
> Also, I am thinking about taking the blob data of of the database and
> create
> a new database that just deals with the blobs. How would this affect my
> clustered/mirrored environment?
> Thanks
> Richard
|||Wait, who said you were an experienced SQL consultant who specializes in
large-scale SQL
Systems? Oh yeah, I have and firmly believe it - DOH!
Cheers,
Rod
MVP - Windows Server - Clustering
http://www.nw-america.com - Clustering Website
http://msmvps.com/clustering - Blog
http://www.clusterhelp.com - Cluster Training
"Geoff N. Hiten" <SQLCraftsman@.gmail.com> wrote in message
news:e9MrWhrFGHA.3936@.TK2MSFTNGP12.phx.gbl...
> Active/Active is a holdover from a specific technical implementation of
> clustering for SQL 7.0. It was not scale-out either. The correct current
> term is multi-instance. As you have read, SQL clustering is a failover
> and availability technology only. SQL clustering does not load balance.
> SQL does not have a native, automatic load balancing technology. There
> are some third party virtualization technologies, but like any scale-out
> technology, adding nodes does not equate to linear performance gains, and
> there are many situations where such solutions do not work.
> As with clustering, Database Mirroring is a failover technology and
> actually has negative impacts on performance. There is one area where it
> can be used to scale out, but that only works for read-only queries.
> Segregating the blob data is not a bad idea, provided you do it down to
> the physical disk level.
> To summarize, there are no shortcuts to scalability, just like you can't
> shortcut availability. Having said that, it sounds like you need a
> reasonably large SQL server. That is not an impossible proposition. SQL
> Server scales up very well. There are several vendors that can sell you
> an adequate system. Some will even help you size it. You might think
> about hiring an experienced SQL consultant who specializes in large-scale
> SQL Systems to guide you through the process. (No, I am not trying to
> drum up business, my time is fully booked).
> --
> Geoff N. Hiten
> Senior Database Administrator
> Microsoft SQL Server MVP
>
> "Richard" <Richard@.discussions.microsoft.com> wrote in message
> news:A115E5ED-8801-4142-B921-FD25665FE3C0@.microsoft.com...
>
|||Thank you Geoff and Andrew for you swift responses.
We are actually looking for a consultant to come in and help with this.
However I am just doing some ground work beforehand.
I was hoping that SQL Server 2005 would give us some performance
improvements (such as true Load Balancing like Oracle Real Application
Clusters), but the improvements over 2000 seem to be mostly for failover
rather than scaling.
Thanks again for your input
Richard
"Andrew J. Kelly" wrote:

> Richard,
> Try not to take this the wrong way but if you are asking questions like
> these and need to implement a system that large you are probably a bit over
> your head. I suggest you seriously consider hiring a consultant who has
> been thru things like this before. Quite honestly there aren't many people
> who have dealt with systems that large. And if not done correctly it will
> almost certainly turn belly up and die when you even get close to that many
> users. That said here are a few comments. One is I doubt you will really
> have 40K concurrent users especially since SQL Server only allows 32,767<g>.
> Even with heavy use web based apps you rarely have as many concurrent
> connections as you would think. And if you are talking anywhere near this
> amount you are talking some serious hardware to support it. As I mentioned
> in another post Clustering is not a load balancing option. Only one node at
> a time has control over a specific disk resource in the cluster. So even
> with Active / Active (or more correctly Multi-Instance) you can't share a
> database since it resides on only one disk resource. You can't design a
> system like this in a newsgroup and if you try you will fail. It requires a
> very careful and well laid out plan to implement a large scale database
> application.
> --
> Andrew J. Kelly SQL MVP
>
> "Richard" <Richard@.discussions.microsoft.com> wrote in message
> news:A115E5ED-8801-4142-B921-FD25665FE3C0@.microsoft.com...
>
>
|||SQL 2005 has lots of performance enhancements among other things. Just
because it does not work like RAC does not mean it can not scale or handle
large workloads. I just finished working on a system that was doing over 25K
batch requests per second with upwards of 1000 concurrent (active and real
connections) and it was hardly breaking a sweat. Please don't make any
decisions based on appearance or misconceptions.
Andrew J. Kelly SQL MVP
"Richard" <Richard@.discussions.microsoft.com> wrote in message
news:427BB31F-57CA-43A9-965C-E5F18196D696@.microsoft.com...[vbcol=seagreen]
> Thank you Geoff and Andrew for you swift responses.
> We are actually looking for a consultant to come in and help with this.
> However I am just doing some ground work beforehand.
> I was hoping that SQL Server 2005 would give us some performance
> improvements (such as true Load Balancing like Oracle Real Application
> Clusters), but the improvements over 2000 seem to be mostly for failover
> rather than scaling.
> Thanks again for your input
> Richard
> "Andrew J. Kelly" wrote:
|||I second what Andrew said.
Scale-out computing only works for certain workload profiles. SQL will
scale up to almost any real-world database problem. I would start a project
like this by determining the service level requirements such as expected
workload, availability requirements, budget, in-house skills, etc. Then I
would build a solution based on the actual business requirements. After
all, your company could care less how you build the system as long as it
meets its service requirements.
I understand the trepidation you face when designing such a large-scale
system. There is nothing worse than a brand new database system that is
just a little bit too small or slow. Given the likely cost of your system,
getting a good consultant on board early before some irrevocable decisions
are made will save money, time, and frustration.
Geoff N. Hiten
Senior Database Administrator
Microsoft SQL Server MVP
"Andrew J. Kelly" <sqlmvpnooospam@.shadhawk.com> wrote in message
news:%23GTouptFGHA.752@.TK2MSFTNGP12.phx.gbl...
> SQL 2005 has lots of performance enhancements among other things. Just
> because it does not work like RAC does not mean it can not scale or handle
> large workloads. I just finished working on a system that was doing over
> 25K batch requests per second with upwards of 1000 concurrent (active and
> real connections) and it was hardly breaking a sweat. Please don't make
> any decisions based on appearance or misconceptions.
> --
> Andrew J. Kelly SQL MVP
>
> "Richard" <Richard@.discussions.microsoft.com> wrote in message
> news:427BB31F-57CA-43A9-965C-E5F18196D696@.microsoft.com...
>
|||Now you know why I hate hearing active/active and active/passive with
respect to SQL Server.
"Rodney R. Fournier [MVP]" <rod@.die.spam.die.nw-america.com> wrote in
message news:%23PTVCQsFGHA.216@.TK2MSFTNGP15.phx.gbl...
> Wait, who said you were an experienced SQL consultant who specializes in
> large-scale SQL
> Systems? Oh yeah, I have and firmly believe it - DOH!
> Cheers,
> Rod
> MVP - Windows Server - Clustering
> http://www.nw-america.com - Clustering Website
> http://msmvps.com/clustering - Blog
> http://www.clusterhelp.com - Cluster Training
> "Geoff N. Hiten" <SQLCraftsman@.gmail.com> wrote in message
> news:e9MrWhrFGHA.3936@.TK2MSFTNGP12.phx.gbl...
>
|||Lots of questions and LOTS of things to consider. No, a cluster doesn't
give you performance. You have a single database, no copies anywhere.
Now, if the data your users are reading is truely static or at least static
within a given timeframe AND you have a SAN, you can get increased capacity
by using the Scalable Shared Database feature keeping in mind that all reads
are going to the same set of disk devices.
If the data isn't static, but you need increased read capacity, you can
leverage the replication engine which can provide not only a readable copy
of your database, but it can also be used for failover.
40,000 concurrent users is doable, even on small systems. It really depends
upon what the data volume looks like, how the network is setup to handle the
traffic, and how the code is written that will run against it.
"Richard" <Richard@.discussions.microsoft.com> wrote in message
news:A115E5ED-8801-4142-B921-FD25665FE3C0@.microsoft.com...
> Hello All. I need to set up quite a large SQL System 2005 system that
> needs
> to deal with 40,000 concurrent users, and one of the tables will contain
> blob
> data. 40,000 users throwing around Mb's of data at the same time worries
> me a
> little bit!
> My question is to do with Database Clustering and Mirroring. From what I
> can
> see, there is still no load-balancing with SQL Server 2005, so does this
> mean
> even in a clustered environment I am still basically only using a single
> database server? I have seen many posts that tell me that clustering is
> ONLY
> for failover and not for performance. I understand that with
> Active/Passive
> this is the case, but how about Active/Active? If I can set up
> Active/Active
> (2 nodes? 4 nodes? 8 nodes? how many are possible?) with a SAN and NOT
> have
> failover implemented (can I turn failover off?) then would I have a load
> balanced environment? I could then have all nodes running up to 100%
> (because
> I don't have to worry about the failover) and therefore give me a dramatic
> increase in performance compared with using a single server?
> If I can do this then I can set up 2 identical clusters and an extra
> server
> for the witness, and use database mirroring for failover? Of course I
> understand that mirroring will decrease performance on the clusters. But
> would this give me a super-fast database system that might cope with what
> I
> need?
> Also, I am thinking about taking the blob data of of the database and
> create
> a new database that just deals with the blobs. How would this affect my
> clustered/mirrored environment?
> Thanks
> Richard
|||The operative word is CAN. If you are running Database Mirroring in HA
mode, it can have a negative impact on performance. The impact on
performance is related to how far apart the principal and mirror are as well
as the networking between the two. It is also impacted by the volume of
transactions in the system which is directly related to how much of the
bandwidth between principal and mirror is being used. The impact is further
impacted by the size of the transaction. Tiny, point transactions will be
impacted more by mirroring that will large transactions which either process
a large number of rows or take a long time to execute.
Database Mirroring in HP mode isn't going to have any impact to your
application that an end user will ever be able to notice.
You can create a Database Snapshot against a mirror, but that is a
point-in-time read-only copy of the database which means that in order for
users to see updated data on the other side, the Database Snapshot has to be
dropped and recreated.
"Geoff N. Hiten" <SQLCraftsman@.gmail.com> wrote in message
news:e9MrWhrFGHA.3936@.TK2MSFTNGP12.phx.gbl...
> Active/Active is a holdover from a specific technical implementation of
> clustering for SQL 7.0. It was not scale-out either. The correct current
> term is multi-instance. As you have read, SQL clustering is a failover
> and availability technology only. SQL clustering does not load balance.
> SQL does not have a native, automatic load balancing technology. There
> are some third party virtualization technologies, but like any scale-out
> technology, adding nodes does not equate to linear performance gains, and
> there are many situations where such solutions do not work.
> As with clustering, Database Mirroring is a failover technology and
> actually has negative impacts on performance. There is one area where it
> can be used to scale out, but that only works for read-only queries.
> Segregating the blob data is not a bad idea, provided you do it down to
> the physical disk level.
> To summarize, there are no shortcuts to scalability, just like you can't
> shortcut availability. Having said that, it sounds like you need a
> reasonably large SQL server. That is not an impossible proposition. SQL
> Server scales up very well. There are several vendors that can sell you
> an adequate system. Some will even help you size it. You might think
> about hiring an experienced SQL consultant who specializes in large-scale
> SQL Systems to guide you through the process. (No, I am not trying to
> drum up business, my time is fully booked).
> --
> Geoff N. Hiten
> Senior Database Administrator
> Microsoft SQL Server MVP
>
> "Richard" <Richard@.discussions.microsoft.com> wrote in message
> news:A115E5ED-8801-4142-B921-FD25665FE3C0@.microsoft.com...
>

Sunday, February 19, 2012

Clustering for Performance

No comments:

Post a Comment

Clustered DB Server

Blog Archive

About Me