Avoiding the if-exists-update-else-insert race problem in Entity Framework

.net entity-framework race-condition transactions

Question

I've been reading other questions on how to implement if-exists-update-else-insert semantics in EF, but either I'm not understanding how the answers work, or they are in fact not addressing the issue. A common solution offered is to wrap the work in a transaction scope (e.g. Implementing if-not-exists-insert using Entity Framework without race conditions):

using (var scope = new TransactionScope()) // default isolation level is serializable
using(var context = new MyEntities())
{
    var user = context.Users.SingleOrDefault(u => u.Id == userId); // *
    if (user != null)
    {
        // update the user
        user.property = newProperty;
        context.SaveChanges();
    }
    else
    {
        user = new User
        {
             // etc
        };
        context.Users.AddObject(user);
        context.SaveChanges();
    }
    scope.Complete(); // commit; without this the TransactionScope rolls back on Dispose
}

But I fail to see how this solves anything: for this to work, the line I have starred above would have to block if a second thread tried to access the same user ID, unblocking only when the first thread had finished its work. Using a transaction will not cause this, however, and we'll get an UpdateException thrown due to the key violation that occurs when the second thread attempts to create the same user a second time.

Instead of catching the exception caused by the race condition, it would be better to prevent the race condition from happening in the first place. One way to do this would be for the starred line to take out an exclusive lock on the database row that matches its condition, meaning that in the context of this block, only one thread at a time could work with a user.
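For illustration only (my addition, not part of the original question): one way to get that blocking behaviour on SQL Server is to issue the read through raw SQL with the UPDLOCK and HOLDLOCK table hints, so the matching row (or, when no row exists yet, the key range) stays locked until the transaction completes. A sketch, assuming MyEntities is an ObjectContext, as the AddObject call above suggests:

```csharp
// Sketch: UPDLOCK takes an update lock; HOLDLOCK holds it (serializable,
// range-locking) until the transaction completes. A second thread running
// the same query for the same Id therefore blocks at the SELECT instead of
// reading concurrently.
using (var scope = new TransactionScope())
using (var context = new MyEntities())
{
    var user = context.ExecuteStoreQuery<User>(
            "SELECT * FROM Users WITH (UPDLOCK, HOLDLOCK) WHERE Id = {0}",
            "Users",                 // entity set name, so results are tracked
            MergeOption.AppendOnly,
            userId)
        .SingleOrDefault();

    if (user == null)
    {
        user = new User { Id = userId /* etc */ };
        context.Users.AddObject(user);
    }
    else
    {
        user.property = newProperty; // update path
    }

    context.SaveChanges();
    scope.Complete();
}
```

Note the tracked-query overload of ExecuteStoreQuery is needed here; the plain overload returns untracked entities, so the update path would not be picked up by SaveChanges.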

It seems that this must be a common problem for users of the EF, so I'm looking for a clean, generic solution that I can use everywhere.

I'd really like to avoid using a stored procedure to create my user if possible.

Any ideas?

EDIT: I tried executing the above code concurrently on two different threads using the same user ID, and despite taking out serializable transactions, both were able to enter the critical section (*) concurrently. This led to an UpdateException being thrown when the second thread attempted to insert the same user ID that the first had just inserted. This is because, as pointed out by Ladislav below, a serializable transaction takes exclusive locks only once it has begun modifying data, not while reading.


Accepted Answer

When using a serializable transaction, SQL Server issues shared locks on the records/tables it reads. Shared locks don't allow other transactions to modify the locked data (those transactions block), but they do allow other transactions to read the data before the transaction that issued the locks starts modifying it. That is the reason the example doesn't work: concurrent reads are allowed under shared locks until the first transaction starts modifying data.

You want isolation where the select command locks the whole table exclusively for a single client. It must lock the whole table because otherwise it will not solve concurrency for inserting "the same" record. Granular control over locking records or tables in select commands is possible with lock hints, but you must write direct SQL queries to use them; EF has no support for that. I described an approach for exclusively locking the table here, but it effectively serializes access to the table and affects all other clients using it.

If you are really sure that this operation happens only in this single method and that no other applications use your database, you can simply place the code in a critical section (.NET synchronization, for example with lock) and ensure on the .NET side that only a single thread can enter it. That is not the most reliable solution, but any playing with locks and transaction isolation levels has a big impact on database performance and throughput. You can combine this approach with optimistic concurrency (unique constraints, timestamps, etc.).
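As a sketch of the critical-section idea (my illustration; the class, lock object, and property names are made up, and this only serializes threads within a single process):

```csharp
public class UserService
{
    // One lock for the whole table is the simplest form; per-key locking
    // would allow more concurrency but needs a dictionary of lock objects.
    private static readonly object _userUpsertLock = new object();

    public void UpsertUser(int userId, string newProperty)
    {
        lock (_userUpsertLock) // only one thread in this process at a time
        {
            using (var context = new MyEntities())
            {
                var user = context.Users.SingleOrDefault(u => u.Id == userId);
                if (user == null)
                {
                    user = new User { Id = userId };
                    context.Users.AddObject(user);
                }
                user.property = newProperty;
                context.SaveChanges();
            }
        }
    }
}
```

Because the lock lives in the application, a second web server or console app hitting the same database bypasses it entirely; hence the suggestion to pair it with a unique constraint as a backstop.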


Popular Answer

Just to add my way: it doesn't really deal with the annoyance of exceptions being thrown, and transactions don't quite cut it as a scalable solution, but it does stop race conditions from causing problems where lock-type solutions are not possible (or not easily managed), such as in distributed systems.

I very simply try the insert first and use the exception to detect an existing record. Here is a modification of your original code as an example:

using (var context = new MyEntities())
{
    EntityEntry entityUser = null;
    try
    {
        var user = new User
        {
             // etc
        };
        entityUser = context.Users.Add(user);
        context.SaveChanges(); // Will throw if the entity already exists
    }
    catch (DbUpdateException x)
    when (x.InnerException != null && x.InnerException.Message.StartsWith("Cannot insert duplicate key row in object"))
    {
        if (entityUser != null)
        {
            // Detach the failed insert to stop it hanging around on the context
            entityUser.State = EntityState.Detached;
        }
        var user = context.Users.Find(userId); // re-read the row the other writer inserted
        if (user != null) // just in case someone deleted it in the meantime
        {
            // update the user
            user.property = newProperty;
            context.SaveChanges();
        }
    }
}

It's not pretty, but it works and might be of use to someone.
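One hardening tweak (my suggestion, not part of the answer above): the message-text filter is fragile across SQL Server versions and locales, and matching on the duplicate-key error numbers is more robust. A sketch of a helper for the when clause, assuming EF Core as the EntityEntry usage above suggests:

```csharp
using System.Data.SqlClient;            // SqlException
using Microsoft.EntityFrameworkCore;    // DbUpdateException

static class SqlErrors
{
    // 2601: duplicate key row in a unique index
    // 2627: violation of a PRIMARY KEY or UNIQUE constraint
    public static bool IsDuplicateKey(DbUpdateException x) =>
        x.GetBaseException() is SqlException sqlEx
        && (sqlEx.Number == 2601 || sqlEx.Number == 2627);
}
```

The filter then becomes `catch (DbUpdateException x) when (SqlErrors.IsDuplicateKey(x))`, which keeps the same flow while dropping the dependency on the exact error message text.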

Licensed under: CC-BY-SA with attribution
Not affiliated with Stack Overflow