Hi
I have a small application that reads XML files and inserts the information on a SQL DB.
There are ~ 300 000 files to import, each one with ~ 1000 records.
I started the application on 20% of the files and it has been running for 18 hours now, I hope I can improve this time for the rest of the files.
I'm not using a multi-thread approach, but since the computer I'm running the process on has 4 cores I was thinking on doing it to get some improvement on the performance (although I guess the main problem is the I/O and not only the processing).
I was thinking on using the BeginExecutingNonQuery() method on the SqlCommand object I create for each insertion, but I don't know if I should limit the max amount of simultaneous threads (nor I know how to do it).
What's your advice to get the best CPU utilization?
Thanks
-
Look into bulk insert.
-
Have you tried using SqlBulkCopy? Basically, you load your data into a DataTable instance, then use the SqlBulkCopy class to load it to SQL Server. Should offer a HUGE performance increase without as much change to your current process as using bcp or another utility.
willvv : The thing is, I have to do some small processing to the information before inserting it, basically the files contain information that goes in 4 different tables, but I'll take a look, it might be useful. ThanksHarper Shelby : It should help even in that case - perhaps you'd need 4 DataTables to hold the results of your processing, but in any case you should improve your performance. -
If I understand you correctly, you are reading those files on the same machine that runs the database. Although I don't know much about your machine, I bet that your bottleneck is disk IO. This doesn't sound terribly computation intensive to me.
willvv : I know the process isn't computation-intensive, but I'm guessing if I do 4 simultaneous inserts, SQL will take advantage of the multiple cores and do it in a more efficient way. And yes, the IO is intensive, but at least I want to keep SQL as busy as I can.innaM : Then make sure you're reading the files from another physical drive then the one holding the db.
0 comments:
Post a Comment