I’m working on a VB6 application that has to quickly insert 32,000 records into a SQL Server 2000 database (connected using ODBC). Using ADO 2.7 I’ve tried the following:
**First the connection:**
Public gcnMain As ADODB.Connection
Dim strConnection As String
strConnection = "Provider=MSDASQL.1;Persist Security Info=False;Data Source=SepsisQStar"
Set gcnMain = New ADODB.Connection
gcnMain.Open strConnection
frmMain.Show
**Then the Insert Statement:**
For i = 1 To SpecData.GetNumberOfDataPoints
MoverZ = SpecData.GetDataPointXValue(i)
Intensity = SpecData.GetDataPointYValue(i)
strCom = "INSERT INTO RawData (FileName, FileTimeStamp, SampleName, MOverZ, Intensity) VALUES ('"
strCom = strCom & FileName & "','"
strCom = strCom & FileTimeStamp & "','"
strCom = strCom & SampleName & "',"
strCom = strCom & MoverZ & ","
strCom = strCom & Intensity & ")"
gcnMain.Execute strCom, , adCmdText + adExecuteNoRecords
Next
**Alternatively:**
'RawDataID = -1 never exists, so this opens an empty, insertable recordset
Dim rsRS As New ADODB.Recordset
rsRS.Open "select * from RawData where RawDataID = -1", gcnMain, adOpenForwardOnly, adLockPessimistic
For i = 1 To SpecData.GetNumberOfDataPoints
MoverZ = SpecData.GetDataPointXValue(i)
Intensity = SpecData.GetDataPointYValue(i)
rsRS.AddNew Array("FileName", "FileTimeStamp", "SampleName", "MOverZ", "Intensity"), Array(FileName, FileTimeStamp, SampleName, MoverZ, Intensity)
Next
End Sub
The For...Next loop runs about 32,000 times.
In my test environment, both methods take five minutes or more to do the inserts.
Is there any way to speed this up? Perhaps using DAO or some other database connectivity?
Try caching SpecData.GetNumberOfDataPoints in a variable so it isn't re-evaluated on every iteration of the loop.
Caching the reference to SpecData with a With statement may also improve speed:
Dim lngNumberOfDataPoints As Long
lngNumberOfDataPoints = SpecData.GetNumberOfDataPoints
With SpecData
For i = 1 To lngNumberOfDataPoints
MoverZ = .GetDataPointXValue(i)
Intensity = .GetDataPointYValue(i)
strCom = "INSERT INTO RawData (FileName, FileTimeStamp, SampleName, MOverZ, Intensity) VALUES ('"
strCom = strCom & FileName & "','"
strCom = strCom & FileTimeStamp & "','"
strCom = strCom & SampleName & "',"
strCom = strCom & MoverZ & ","
strCom = strCom & Intensity & ")"
gcnMain.Execute strCom, , adCmdText + adExecuteNoRecords
Next
End With
These changes may provide an improvement, but you could also investigate wrapping all of your inserts in a single ADO transaction, which can dramatically improve performance; instead of committing 32,000 individual statements, the server commits once. It's just BeginTrans before the loop and CommitTrans after.
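A rough, untested sketch around the INSERT loop above:

gcnMain.BeginTrans
For i = 1 To lngNumberOfDataPoints
'... build strCom and gcnMain.Execute exactly as above ...
Next
gcnMain.CommitTrans  'everything commits in one go (call RollbackTrans on error)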
The first two tips are very good, and the third (transaction) tip is also good.
I'm pretty sure doing 32,000 INSERT statements would be slower than 32,000 .AddNew calls, so I'd apply the tips to your second strategy:
Public gcnMain As ADODB.Connection
Sub Test()
Dim rsRS As New ADODB.Recordset
Dim strConnection As String
Dim lngDataPoints As Long
Dim i As Long
strConnection = "Provider=MSDASQL.1;Persist Security Info=False;Data Source=SepsisQStar"
Set gcnMain = New ADODB.Connection
gcnMain.Open strConnection
frmMain.Show
'RawDataID = -1 never exists, so this opens an empty, insertable recordset
rsRS.Open "select * from RawData where RawDataID = -1", gcnMain, adOpenForwardOnly, adLockPessimistic
With SpecData
lngDataPoints = .GetNumberOfDataPoints
For i = 1 To lngDataPoints
MoverZ = .GetDataPointXValue(i)
Intensity = .GetDataPointYValue(i)
rsRS.AddNew Array("FileName", "FileTimeStamp", "SampleName", "MOverZ", "Intensity"), Array(FileName, FileTimeStamp, SampleName, MoverZ, Intensity)
Next
End With
rsRS.Close
End Sub
I suspect the real bottleneck is the "GetDataPoint…" calls. As an exercise, try commenting out the actual AddNew line and see how long it takes to simply iterate the loop.
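Something like this, using VB's Timer function as a rough stopwatch:

Dim sngStart As Single
sngStart = Timer
With SpecData
For i = 1 To lngDataPoints
MoverZ = .GetDataPointXValue(i)
Intensity = .GetDataPointYValue(i)
'rsRS.AddNew ...   'commented out so only the COM calls are timed
Next
End With
Debug.Print "Loop alone: " & (Timer - sngStart) & " seconds"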
One thing I forgot to mention is that string concatenation is pretty slow in VB (all those & calls), though since you are only doing 32,000 iterations that delay shouldn't amount to much.
One technique you might want to look into is "bulk inserting". Basically, instead of inserting each record into the database one by one, you write the information into a text file, send the text file to the database server (if necessary), and then issue the BULK INSERT statement, which translates as "Hey database, find such-and-such file and import each line from it into this table, according to these format rules." Once I got it working, I found it to be very fast indeed.
The usual hitch is the process of getting the file from the client machine to the database server, or to somewhere it can read from easily enough. Thought I'd mention the idea, though.
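Writing the file is the easy part in VB; here's a sketch (the UNC path and tab-delimited layout are just assumptions, and the file has to end up somewhere SQL Server itself can read):

Dim intFile As Integer
intFile = FreeFile
Open "\\DBServer\Share\rawdata.txt" For Output As #intFile
With SpecData
For i = 1 To lngDataPoints
'one tab-delimited row per data point, columns in table order
Print #intFile, FileName & vbTab & FileTimeStamp & vbTab & SampleName & vbTab & .GetDataPointXValue(i) & vbTab & .GetDataPointYValue(i)
Next
End With
Close #intFile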
FWIW, I don’t think that & concatenation is that inefficient when you’re only dealing with strings of a relatively constant length. It’s when you’re building up a single string that gets longer and longer as the loop executes that you really feel the burn.
Wow, I can’t believe this question has resurfaced after a month. This application has already been put into production.
My eventual solution was the method chrisk hits on. A Bulk Insert drops the time down to mere seconds for 32,000 records. I use the ADO Command object to execute the Bulk Insert. Works like a charm.
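In case it helps anyone, the execution side looks roughly like this (the path and delimiters are placeholders; note that BULK INSERT needs bulkadmin or sysadmin rights, and a format file if the table has columns, such as an identity key, that aren't in the data file):

Dim cmd As New ADODB.Command
Set cmd.ActiveConnection = gcnMain
cmd.CommandType = adCmdText
cmd.CommandText = "BULK INSERT RawData FROM '\\DBServer\Share\rawdata.txt' " & _
"WITH (FIELDTERMINATOR = '\t', ROWTERMINATOR = '\n')"
cmd.Execute  'the server reads the file itself, so 32,000 rows take seconds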