Friday, February 23, 2007

Can MS SQL Server handle 1 million records per sec?

Lot of them have asked questions about the credibility of handling such huge transactions like 1 to 2 million records per sec! The article below explains you that Yes, you can achieve...

Load 1TB in less than 1 hour
OVERVIEW

This project was done using SQL Server 2005 Enterprise Edition (with SP1 beta) using the BULK INSERT command with 60 parallel input files, each of approximately 16.67 GB. The best time I got was 50 minutes. Hewlett Packard was kind enough to loan me some time one night on the machine they use to do the TPC-C tests. It is an Itanium2 Superdome with 64 1.6GHZ 9M (Madison) CPUs. For the insert tests the system was equipped with 256GB of RAM. The HP SAN configuration is rated at 14GB per second throughput. You can find the specific configurations at http://tpc.org/tpcc/results/tpcc_result_detail.asp?id=103082701. (Note: The configurations at this site may be updated periodically).


BEST PRACTICES and LESSIONS LEARNED
Here is a list of things we learned during this project that should be helpful in your projects.

Run as many load processes as you have available CPUs. If you have 32 CPUs, run 32 parallel loads. If you have 8 CPUs, run 8 parallel loads.
If you have control over the creation of your input files, make them of a size that is evenly divisible by the number of load threads you want to run in parallel. Also make sure all records belong to one partition if you want to use the switch partition strategy.
Use BULK insert instead of BCP if you are running the process on the SQL Server machine.
Use table partitioning to gain another 8-10%, but only if your input files are GUARANTEED to match your partitioning function, meaning that all records in one file must be in the same partition.
Use TABLOCK to avoid row at a time locking.
Use ROWS PER BATCH = 2500, or something near this if you are importing multiple streams into one table.

NOT FINISED YET! Check more on this blog to find out how to do test setups and other configuration steps that you need to take care of - http://blogs.msdn.com/sqlcat/archive/2006/05/19/602142.aspx

Cheers - D

No comments: