I'm looking for the fastest way of inserting into Entity Framework.
I'm asking this because of the scenario where you have an active TransactionScope and the insertion is huge (4000+). It can potentially last more than 10 minutes (default timeout of transactions), and this will lead to an incomplete transaction.
To your remark in the comments to your question:
"...SavingChanges (for each record)..."
That's the worst thing you can do! Calling SaveChanges()
for each record slows bulk inserts extremely down. I would do a few simple tests which will very likely improve the performance:
SaveChanges()
once after ALL records.SaveChanges()
after for example 100 records.SaveChanges()
after for example 100 records and dispose the context and create a new one.For bulk inserts I am working and experimenting with a pattern like this:
using (TransactionScope scope = new TransactionScope())
{
MyDbContext context = null;
try
{
context = new MyDbContext();
context.Configuration.AutoDetectChangesEnabled = false;
int count = 0;
foreach (var entityToInsert in someCollectionOfEntitiesToInsert)
{
++count;
context = AddToContext(context, entityToInsert, count, 100, true);
}
context.SaveChanges();
}
finally
{
if (context != null)
context.Dispose();
}
scope.Complete();
}
private MyDbContext AddToContext(MyDbContext context,
Entity entity, int count, int commitCount, bool recreateContext)
{
context.Set<Entity>().Add(entity);
if (count % commitCount == 0)
{
context.SaveChanges();
if (recreateContext)
{
context.Dispose();
context = new MyDbContext();
context.Configuration.AutoDetectChangesEnabled = false;
}
}
return context;
}
I have a test program which inserts 560.000 entities (9 scalar properties, no navigation properties) into the DB. With this code it works in less than 3 minutes.
For the performance it is important to call SaveChanges()
after "many" records ("many" around 100 or 1000). It also improves the performance to dispose the context after SaveChanges and create a new one. This clears the context from all entites, SaveChanges
doesn't do that, the entities are still attached to the context in state Unchanged
. It is the growing size of attached entities in the context what slows down the insertion step by step. So, it is helpful to clear it after some time.
Here are a few measurements for my 560000 entities:
The behaviour in the first test above is that the performance is very non-linear and decreases extremely over time. ("Many hours" is an estimation, I never finished this test, I stopped at 50.000 entities after 20 minutes.) This non-linear behaviour is not so significant in all other tests.
I'm looking for the fastest way of inserting into Entity Framework
There are some third-party libraries supporting Bulk Insert available:
See: Entity Framework Bulk Insert library
Be careful, when choosing a bulk insert library. Only Entity Framework Extensions supports all kind of associations and inheritances and it's the only one still supported.
Disclaimer: I'm the owner of Entity Framework Extensions
This library allows you to perform all bulk operations you need for your scenarios:
Example
// Easy to use
context.BulkSaveChanges();
// Easy to customize
context.BulkSaveChanges(bulk => bulk.BatchSize = 100);
// Perform Bulk Operations
context.BulkDelete(customers);
context.BulkInsert(customers);
context.BulkUpdate(customers);
// Customize Primary Key
context.BulkMerge(customers, operation => {
operation.ColumnPrimaryKeyExpression =
customer => customer.Code;
});