Friday, January 28, 2011

Troubleshooting SQL connectivity issues after database migration

We have a just completed a migration from SQL 2000 to SQL 2008 R2 and have started to intermittently receive SqlExceptions with the following two error messages:

  • A transport-level error has occurred when receiving results from the server. (provider: TCP Provider, error: 0 - The semaphore timeout period has expired.)
  • A transport-level error has occurred when receiving results from the server. (provider: TCP Provider, error: 0 - The semaphore timeout period has expired.)

We have 3 web servers connecting to this SQL Server running around 100 applications (all accessing the same 8 databases on the SQL Server).

Because these exceptions were not occurring on the 2000 server, we feel like it is unlikely to be an application issue (however, we are not ruling it out). Traffic on the web sites is typical, ruling out a high traffic issue. The old SQL 2000 box had 4 CPUs and 8 GB RAM, while the new one has 24 GB RAM and 16 CPUs (which is currently and during the issue underutilized).

These errors occurred for a period of about 5 minutes several hours ago and have not as yet reoccurred.

The sys.dm_os_ring_buffers system view does not show entries for these disconnects, and there are no corresponding event log entries on either the server or the client.

Some googling has found a few similar reports, however nothing seems definitive(see links below). Has anyone seen errors like this after migrating from SQL 2000 to SQL 2008 R2?

Links:

  • If I'm understanding correctly you've not only changed your software but also your hardware - so there are plenty of changes that could be causing this connection error. I've seen plenty of recommendations to double check your NIC drivers and motherboard firmware (!!) to fix this. Yikes!

    Anyway - you should be able to see this error in your server application log. From here you may be able to get an idea the date/time the exception occured so you can compare it to the individual client/application event to narrow down what's happening when this exception pops up.

    You can also use Netmon to trace the connections from the clients to the server. You'll want to give it a couple of days to reproduce the error. This should narrow it down a bit and at least give you and idea of what is failing.

    Chris Shaffer : There were no entries in the event log, and the errors occurred in various several of the applications for the duration of the event. I will look into Netmon to see if it can help.
    From shiitake

0 comments:

Post a Comment