EF Codefirst Bulk Insert

c# ef-code-first entity-framework sql-server

Question

I need to insert around 2500 rows using EF Code First.

My original code looked something like this:

foreach(var item in listOfItemsToBeAdded)
{
    //biz logic
    context.MyStuff.Add(i);
}

This took a very long time. It was around 2.2 seconds for each DBSet.Add() call, which equates to around 90 minutes.

I refactored the code to this:

var tempItemList = new List<MyStuff>();
foreach(var item in listOfItemsToBeAdded)
{
    //biz logic
    tempItemList.Add(item)
}
context.MyStuff.ToList().AddRange(tempItemList);

This only takes around 4 seconds to run. However, the .ToList() queries all the items currently in the table, which is extremely necessary and could be dangerous or even more time consuming in the long run. One workaround would be to do something like context.MyStuff.Where(x=>x.ID = *empty guid*).AddRange(tempItemList) because then I know there will never be anything returned.

But I'm curious if anyone else knows of an efficient way to to a bulk insert using EF Code First?

2
4
9/3/2013 7:45:27 PM

Accepted Answer

Validation is normally a very expensive portion of EF, I had great performance improvements by disabling it with:

context.Configuration.AutoDetectChangesEnabled = false;
context.Configuration.ValidateOnSaveEnabled = false;

I believe I found that in a similar SO question--perhaps it was this answer

Another answer on that question rightly points out that if you really need bulk insert performance you should look at using System.Data.SqlClient.SqlBulkCopy. The choice between EF and ADO.NET for this issue really revolves around your priorities.

13
5/23/2017 11:46:00 AM

Expert Answer

As STW pointed out, the DetectChanges method called every time you call the Add method is VERY expensive.

Common solution are:

  • Use AddRange over Add
  • SET AutoDetectChanges to false
  • SPLIT SaveChanges in multiple batches

See: Improve Entity Framework Add Performance

It's important to note that using AddRange doesn't perform a BulkInsert, it's simply invoke the DetecthChanges method once (after all entities is added) which greatly improve the performance.

But I'm curious if anyone else knows of an efficient way to to a bulk insert using EF Code First

There is some third party library supporting Bulk Insert available:

See: Entity Framework Bulk Insert library


Disclaimer: I'm the owner of Entity Framework Extensions

This library allows you to perform all bulk operations you need for your scenarios:

  • Bulk SaveChanges
  • Bulk Insert
  • Bulk Delete
  • Bulk Update
  • Bulk Merge

Example

// Easy to use
context.BulkSaveChanges();

// Easy to customize
context.BulkSaveChanges(bulk => bulk.BatchSize = 100);

// Perform Bulk Operations
context.BulkDelete(customers);
context.BulkInsert(customers);
context.BulkUpdate(customers);

// Customize Primary Key
context.BulkMerge(customers, operation => {
   operation.ColumnPrimaryKeyExpression = 
        customer => customer.Code;
});
1
2/27/2018 2:34:03 PM


Related Questions





Related

Licensed under: CC-BY-SA with attribution
Not affiliated with Stack Overflow
Licensed under: CC-BY-SA with attribution
Not affiliated with Stack Overflow