Saturday, February 25, 2012

Replicating very large articles but need to avoid locks from snapshot

I have a very large and very active table which I want to replicate to
another server. How can I set up a publication for this table and avoid the
locking during the initial snapshot in SQL 2005? SQL 2000 allowed
configuring replication without generating a snapshot - it assumed the data
was already present. I used this technique and would update unique columns
or the primary key on the source to send over complete row deletes and
inserts, finally achieving sync. I'm looking for a similar work-around in
SQL 2005 but don't see it. I'm aware of initializing from a backup, but that
seems extreme for just ~5 of the 400 tables I want to replicate. Any ideas
how to skin this cat would be appreciated. Thanks. -CqlBoy
The locking which occurs in SQL 2005 is much less than the default locking
which would occur in SQL 2000.
In SQL 2005 you also have the no-sync option. To do this, create your
publication and then create your subscription as normal.
Then, when you get to the Initialize Subscriptions dialog, uncheck Initialize.
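If you prefer to script it, here is a minimal sketch (publication, table,
server, and database names are made up); @sync_type = N'replication support
only' is the T-SQL equivalent of unchecking Initialize and assumes the data
already exists at the subscriber:

-- Minimal sketch: publish one table without an initial snapshot.
-- 'BigTablePub', 'dbo.BigTable', 'SUBSCRIBER1', 'ReplTarget' are placeholders.
EXEC sp_addpublication
    @publication = N'BigTablePub',
    @sync_method = N'concurrent',   -- transactional default in SQL 2005
    @repl_freq = N'continuous',
    @status = N'active';
EXEC sp_addarticle
    @publication = N'BigTablePub',
    @article = N'BigTable',
    @source_owner = N'dbo',
    @source_object = N'BigTable';
-- 'replication support only' tells SQL Server the subscriber already has
-- the data, so no snapshot is generated or applied.
EXEC sp_addsubscription
    @publication = N'BigTablePub',
    @subscriber = N'SUBSCRIBER1',
    @destination_db = N'ReplTarget',
    @sync_type = N'replication support only';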
"Cqlboy" <Cqlboy@.discussions.microsoft.com> wrote in message
news:47464A31-3AD8-4A5A-9558-3009CE84A498@.microsoft.com...
>I have a very large and very active table which I want to replicate to
> another server. How can I set-up a Publication for this table and avoid
> the
> locking during the initial snap-shot in SQL 2005? SQL 2000 allowed
> configuring replication without generating a snap-shot - assumed data was
> already present. I used this technique and would update unique columns on
> the source or primary key to send over complete row delete & inserts to
> finally achieve sync'. I'm looking for a similiar work-around in SQL 2005
> but don't see it. I'm aware of initializing from a backup but that seems
> extreme for just a ~5 of 400 tables I want to replicate. Any ideas how to
> skin this cat would be appreciated. Thanks. -CqlBoy
|||Is it also possible to force the Publisher, like in SQL 2000, to send a
pair of DELETE/INSERT statements for an entire row by performing an UPDATE
on a primary key or unique column? This was an issue in SQL Server 2000 when
the PK was an identity column: I couldn't update the PK to force the
DELETE/INSERT, but there was a trace flag that would trigger this behavior
on any column, thus making it possible to sync an entire row.
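To illustrate, here is a hypothetical sketch of the statement I mean (table
and column names are made up; whether the engine logs it as a DELETE/INSERT
pair or a plain UPDATE depends on the version and trace-flag settings):

-- Hypothetical "touch the key" trick: update the PK (or a unique column)
-- to itself so replication re-sends the whole row. On SQL 2000 an update
-- touching a unique key could be processed as a deferred update, i.e. a
-- DELETE/INSERT pair; this behavior is version- and trace-flag-dependent.
DECLARE @first int, @last int;
SELECT @first = 1, @last = 10000;
UPDATE dbo.Orders
SET OrderID = OrderID
WHERE OrderID BETWEEN @first AND @last;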
Thanks.
-Blake
"Hilary Cotter" wrote:

> The locking which occurs in SQL 2005 is much less than the default locking
> which would occur in SQL 2000.
> In SQL 2005 you have the no-sync option as well. To do this create your
> publication and then create your subscription as per normal.
> Then when you get to the initialize Subscriptions dialog uncheck initialize.
> "Cqlboy" <Cqlboy@.discussions.microsoft.com> wrote in message
> news:47464A31-3AD8-4A5A-9558-3009CE84A498@.microsoft.com...
>
>
|||I think this is what you are referring to in SQL 2000. It is my
understanding that in SQL 2005, deferred updates are now replicated as an
update, not a delete/insert pair; however, I can't seem to find any official
documentation supporting that.
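One way to verify it empirically is to run a test update of a unique column
at the publisher and then inspect the pending commands at the Distributor;
a deferred update shows up as a delete/insert procedure pair rather than a
single update call. A rough sketch:

-- Run in the distribution database after a test UPDATE at the publisher.
-- A deferred update appears as an sp_MSdel_... / sp_MSins_... pair;
-- a normal update appears as a single sp_MSupd_... call.
USE distribution;
EXEC sp_browsereplcmds;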
"Cqlboy" <Cqlboy@.discussions.microsoft.com> wrote in message
news:CC9DE155-8328-4CBF-8BF5-EE68900A59B1@.microsoft.com...[vbcol=seagreen]
> Is it also possible to force at the Publisher like in SQL 2000 to send a
> pair
> of DELETE/INSERT statements for an entire row by performing an UPDATE on a
> primary key or column that was unique? This was an issue in SQL Server
> 2000
> when the PK was an identiy column, I couldn't update the PK and force the
> DELETE/INSERT but their was a trace flag that would trigger this behavior
> on
> any column, thus making it possible to sync' an entire row.
> Thanks.
> -Blake
> "Hilary Cotter" wrote:
|||Hi Blake, as the person responsible for most of snapshot processing in
transactional\snapshot replication, I must admit that it is a bit difficult
for me to read your posts. That said, in the interest of product
improvement, I must ask what sort of problems you experienced in SQL 2000
(I can tell locking was one of them) that caused you to develop such an
intense aversion to using replication snapshot processing at all.
As Hilary mentioned, locking during snapshot generation for transactional
replication is much reduced by default in SQL 2005, and you can reduce it
further by using the new 'database snapshot' sync_method if you are running
Enterprise Edition. I have also put quite a few performance improvements
into snapshot processing in SQL 2005, which should hopefully make it less
painful to use. As such, I would really appreciate it if you would give
replication snapshot processing a try in SQL 2005.
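As a concrete sketch (the publication name is made up, and this sync_method
requires Enterprise Edition), the option is set when the publication is
created:

-- Hypothetical publication using the Enterprise-only sync_method. The
-- snapshot agent bulk-copies from a database snapshot instead of holding
-- locks on the published tables while it reads them.
EXEC sp_addpublication
    @publication = N'BigTablePub',
    @sync_method = N'database snapshot',
    @repl_freq = N'continuous',
    @status = N'active';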
Thanks much,
-Raymond
"Cqlboy" <Cqlboy@.discussions.microsoft.com> wrote in message
news:CC9DE155-8328-4CBF-8BF5-EE68900A59B1@.microsoft.com...[vbcol=seagreen]
> Is it also possible to force at the Publisher like in SQL 2000 to send a
> pair
> of DELETE/INSERT statements for an entire row by performing an UPDATE on a
> primary key or column that was unique? This was an issue in SQL Server
> 2000
> when the PK was an identiy column, I couldn't update the PK and force the
> DELETE/INSERT but their was a trace flag that would trigger this behavior
> on
> any column, thus making it possible to sync' an entire row.
> Thanks.
> -Blake
> "Hilary Cotter" wrote:
|||Raymond, thank you for the response and the tip regarding snapshot
processing. I need to investigate and experiment more with this; I am
admittedly not yet familiar with all of SQL 2005's offerings. Also, let me
add that I'm not at all averse to replication - I used SQL 2000
transactional replication extensively at my former employer, Experian Corp.
Key things I noted from my replication experience:
1) Taking an initial snapshot against several large and important
order-related tables incurring heavy insert/update/delete activity
effectively took the company down until the snapshot completed. This could
be hours, something upper management would never tolerate. Heaven forbid if
we lost sync!
2) Publications containing hot articles presented significant loads to the
disk IO subsystem, something very few DBAs are aware of or understand. For
example, if your publication contains articles responsible for much of the
server IO, be aware that this resource drain will effectively double via the
Distributor when replicated. This was a surprise to me when I first
discovered it, but it makes perfect sense in hindsight.
3) The last issue came from the Distributor clean-up job. Whenever it
executed, it produced a massive IO spike against its RAID group, affecting
everything associated with those disks. Yep, I tinkered with the clean-up
job schedule and tweaked the agent profiles but could never seem to
alleviate these massive spikes. I never made it this far, but I was
seriously contemplating customizing the code in the Distributor clean-up
proc to break up and spread the IO demand over a longer interval - anything
to avoid or reduce these sharp, massive hits against the disks (the job step
in question is sketched just after this list).
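For reference, the clean-up job step I mean is essentially this call (the
retention values shown are the defaults, in hours):

-- The "Distribution clean up: distribution" job step, roughly as installed
-- by default. It purges delivered commands older than @max_distretention
-- hours from the distribution database in one pass, hence the IO spike.
EXEC distribution.dbo.sp_MSdistribution_cleanup
    @min_distretention = 0,
    @max_distretention = 72;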
I learned and became more sensitized to these issues when tasked to
re-architect SQL Server on our EMC CLARiiON CX500 and CX700 SAN. I captured
file-level IO stats using the little-known SQL Server function
fn_virtualfilestats as a source, sampling at 1-minute intervals. Querying
against one week's data was very enlightening.
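For anyone curious, the sampling query was along these lines (the counters
are cumulative, so you diff consecutive samples; on SQL 2000 the call form
is ::fn_virtualfilestats(dbid, fileid), and a job would insert each sample
into a logging table):

-- Snapshot cumulative file-level IO counters for all databases and files.
-- NULL, NULL means "everything" on SQL 2005.
SELECT GETDATE() AS SampleTime,
       DbId, FileId,
       NumberReads, BytesRead, IoStallReadMS,
       NumberWrites, BytesWritten, IoStallWriteMS
FROM sys.fn_virtualfilestats(NULL, NULL);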
I am aware of moving the Distributor to its own dedicated box/disks but
could never convince management to do so. BTW, it was a heavy bang against
the SAN cache, and we seemed to have evidence that the Distributor clean-up
job caused global issues via the SAN. Evidence we outgrew our SAN, I guess.
I apologize for this lengthy response, but the key answer I am looking for
is how to initialize replication for very large, critical, order-related
articles without taking the company down. Secondly, is there a practical
means of determining the additional burden on TempDB, the log, and the
underlying disk subsystem - i.e., can I handle the load? I actually have a
rough idea on this last point, but if you have anything to share I'd love to
hear it.
Thanks again for taking the time to respond.
Blake Colson a.k.a CqlBoy
Lead DBA of Operations
RealtyTrac, Inc.
Irvine, CA
"Raymond Mak [MSFT]" wrote:

> Hi Blake, being the person responsible for most of snapshot processing in
> transactional\snapshot replication, I must admit that it is a bit difficult
> for me to read your posts. That said, in the interest of product
> improvement, I must ask what sort of problems you have experienced in
> SQL2000 (I can tell locking being one of them) that causes you to develop
> such an intense aversion against using replication snapshot processing at
> all.
> As Hilary had mentioned, locking during snapshot generation for
> transactional replication is much reduced by default in SQL2005, and you can
> further reduce it by using the new 'database snapshot' sync_method if you
> are running Enterprise Edition. In addition, I had also put in quite a few
> performance improvements for snapshot processing in SQL2005 which should
> hopefully make replication snapshot processing less painful to use. As such,
> I would really appreciate if you can give replication snapshot processing a
> try in SQL2005.
> Thanks much,
> -Raymond
> "Cqlboy" <Cqlboy@.discussions.microsoft.com> wrote in message
> news:CC9DE155-8328-4CBF-8BF5-EE68900A59B1@.microsoft.com...
>
>
|||Blake, I am probably not reading your previous post (the one before your
latest response) correctly, but it seems to me that doing a (full table)
bulk update on the PK or unique key will incur rather significant
resource/locking contention at the publisher database, and the
logreader/distribution agent will need to transfer almost twice the amount
of data (delete/insert plus command-formatting overhead), thereby putting
further strain on the network and the cleanup agent. That seems a bit of an
extreme measure to avoid replication snapshot processing. And as I mentioned
before, locking overhead for snapshot generation is much reduced in SQL 2005
by default, and you can use sync_method = 'database snapshot' to reduce it
further. Paul Ibison also has a nice article on his website
(http://www.replicationanswers.com/BCPPartitioning.asp) about an enhancement
in SQL 2005 that allows a large table to be bulk-copied by the snapshot
agent in parallel. That said, it is not my intention to be pushy about this,
and there are indeed cases where all the enhancements in SQL 2005 snapshot
processing will be inadequate to scale (it is not a pretty sight, but I have
seen people manage 100GB snapshots on SQL 2000). I am merely... intrigued...
by your comments, and I very much appreciate your (extremely detailed)
feedback.
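The parallel bulk copy in Paul's article is driven by the Snapshot Agent's
-MaxBcpThreads parameter. One way to set it, sketched with made-up job,
server, and publication names (inspect the existing command text with
msdb.dbo.sp_help_jobstep first, since this call replaces it outright):

-- Sketch: set -MaxBcpThreads on the Snapshot Agent job step. The "Run
-- agent." step of a replication agent job is typically step 2. Job name,
-- server, database, and publication below are all hypothetical.
EXEC msdb.dbo.sp_update_jobstep
    @job_name = N'MYSERVER-MyDb-BigTablePub-1',
    @step_id = 2,
    @command = N'-Publisher [MYSERVER] -PublisherDB [MyDb] -Publication [BigTablePub] -MaxBcpThreads 4';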
-Raymond
"Cqlboy" <Cqlboy@.discussions.microsoft.com> wrote in message
news:51FEF6A7-F5E5-4D54-BB9E-542EE39D5010@.microsoft.com...[vbcol=seagreen]
> Raymond, thank you for the response and tip regarding snapshot
> processing.
> I need to investigate/experiment more with this. I am admittingly not
> familiar yet with all of SQL 2005's offerings. Also, let me add, I'm not
> at
> all averse to replication and used SQL 2000 Transactional replication
> extensively at my former employer, Experian Corp.
> Key things noted from my replication experience is this:
> 1) Taking an initial snap-shot against several large and important
> order-related tables incurring heavy insert/update/delete activity
> effectively took the company down until the snap-shot completed. This
> could
> be hours, something upper management would never tolerate. Heaven forbid
> if
> we lost sync' !
> 2) Publications containing hot articles presented significant loads to
> the
> disk IO subsystem, something that very few DBA's are aware of or
> understand.
> For example, if your publication contained articles responsible for much
> of
> the server IO you need to be aware that this resource drain will
> effectively
> double via the Distributor when replicated. This was a surprize to me
> when I
> first discovered it but it makes perfect sense in hindsight.
> 3) The last issue came from the Distributor clean up job. Whenever this
> executed, it exercised a massive IO spike against its raid group, thus
> affecting everything associated with those disks. Yep, I tinkered with
> the
> Clean-Up job schedule and tweaked the agent profiles but could never seem
> to
> alleviate these massive spikes. I never made it this far, but I was
> seriously contemplating customizing the code in the Distributor Clean-up
> proc' to break up and spread the IO demand over a longer interval,
> anything
> to avoid or reduce this sharp massive hits against the disks.
> I learned and became more sensitized to these issues when tasked to
> re-architect SQL Server on our EMC Clariion CX500 and CX700 SAN. I
> captured
> file level IO stats using the little understood/known SQL Server function
> fn_virtualstats as a source, sampling @. 1 minute intervals. Querying
> against
> one weeks data was very enlightening.
> I am aware of moving the Distributor to its own dedicated box/disks but
> could never convince management to do so. BTW... it was a heavy bang
> against
> the SAN cache and it seemed we had evidence that the Distributor Clean-Up
> job
> cause global issues via the SAN. Evidence we out grew our SAN, I guess.
> I apoligize for this lengthy response but the key answers I am looking for
> is how to initialize replication for very large and critical,
> order-related
> articles without taking the company down. Secondly, is their a practical
> means of determing the additional burden to the TempDB ? Log ? ... and
> the
> underlying disk subsystem. i.e. can I handle the load? I actually have a
> rough idea on this last point but if you have anything to share I'd love
> to
> hear it.
> Thanks again for taking the time to respond.
> Blake Colson a.k.a CqlBoy
> Lead DBA of Operations
> RealtyTrac, Inc.
> Irvine, CA
>
> "Raymond Mak [MSFT]" wrote:
|||As a (perhaps not so useful) data point, I have seen from internal
testing that the SQL 2005 snapshot agent is able to pull 100 MB(ytes)/s
locally on an admittedly high-end server (8-way 3.2GHz Intel proc [P4
based?], 8GB RAM, and 3-disk RAID0). Granted, the data we used was
artificially generated, but still.
-Raymond
"Raymond Mak [MSFT]" <rmak@.online.microsoft.com> wrote in message
news:uc4Q3wSiHHA.1708@.TK2MSFTNGP03.phx.gbl...
> Blake, I am probably not reading you previous post (the one before you
> latest response) correctly but it seems to me that doing a (full table)
> bulk update on the pk or uq will incur rather significant resource\locking
> contention at the publisher database, and the logreader\distribution agent
> will need to transfer almost twice the amount of data (delete\insert +
> command formatting overhead) thereby putting further strain on the network
> and the cleanup agent. That seems to be a bit of an extreme measure to
> avoid replication snapshot processing. And as I have mentioned before,
> locking overhead for snapshot generation is much reduced in SQL2005 by
> default and you can use sync_method = 'database snapshot' to reduce that
> further. Paul Ibison also has a nice article up on his website
> (http://www.replicationanswers.com/BCPPartitioning.asp) talking about an
> enhancement in SQL2005 that allows a large table to be bulk-copied by the
> snapshot agent in parallel. That said, it is not my intention to be pushy
> about this and there are indeed cases where all the enhancement in SQL2005
> snapshot processing will be inadequate to scale (it is not a pretty sight
> but I have seen people managed 100GBs snapshot on SQL2000). I am merely
> ... intrigued... by your comments and I very much appreciate your
> (extremely detailed) feedback.
> -Ryamond
> "Cqlboy" <Cqlboy@.discussions.microsoft.com> wrote in message
> news:51FEF6A7-F5E5-4D54-BB9E-542EE39D5010@.microsoft.com...
>
