Transient Fault Handling with SQL Azure using Entity Framework

.net enterprise-library entity-framework


I currently use SQL Azure and Entity SQL in my application.


Entities model = new Entities();
db_Item item = model.db_Item.First();

Now I want to use the Transient Fault Handling Application Block from the Enterprise Library, but I can't find any examples or solutions that would let me do something like override the Entities class, so that I don't have to update my code in hundreds of places.

Could someone please provide more information on how this could be done?

9/5/2012 1:20:50 PM

Accepted Answer

Here is what I have found so far.

  1. Entity Framework does not expose the points where the connection is opened and where the SQL is sent to the server, so it is currently impossible to wrap retry logic around them.

  2. The EF team is aware of this shortfall and is planning to integrate retry logic into EF itself, possibly in version 6.

  3. As per Case #3 of [1], you can send a SQL command to the database in OnContextCreated. This means, however, that every single call you make to the DB becomes two. I would recommend this in hardly any situation, unless you don't care about performance.

  4. The only viable option so far is implementing retry logic in the form of the Enterprise Library Transient Fault Handling Application Block [2] around every call you make to the database. In existing applications this is extremely tedious.

  5. When I get time I will look further into the EF source code to see whether anything more can be done while we wait for EF 6. I would keep an eye on [3].

  6. Some hope: it is currently under review by the EF team. [4]
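
To give a feel for what option 4 involves, here is a minimal hand-rolled sketch of a retry wrapper. It is not the Enterprise Library API: the `Retry` class, its `Execute` method, and the `maxAttempts` and `delayMs` parameters are hypothetical names made up for illustration, and the Application Block ships its own policy and error-detection types that this does not reproduce.

```csharp
using System;
using System.Threading;

// Hypothetical helper, not part of Enterprise Library or Entity Framework.
public static class Retry
{
    // Runs 'operation' and retries it while 'isTransient' says the
    // exception is worth retrying and attempts remain.
    public static T Execute<T>(Func<T> operation, Func<Exception, bool> isTransient,
                               int maxAttempts = 3, int delayMs = 200)
    {
        for (var attempt = 1; ; attempt++)
        {
            try
            {
                return operation();
            }
            catch (Exception ex)
            {
                // Give up on the last attempt or on a non-transient error.
                if (attempt >= maxAttempts || !isTransient(ex))
                {
                    throw;
                }

                // Fixed pause between attempts; a real policy would usually back off.
                Thread.Sleep(delayMs);
            }
        }
    }
}
```

With something like this in place, the call from the question would become `Retry.Execute(() => model.db_Item.First(), ex => ex is TimeoutException)`, and that change has to be repeated at every call site, which is why this is so tedious in an existing application.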

Update: 2013-11-14

Just thought I would update this post to let everyone know that EF6 has been released and supports connection resiliency out of the box.

No need for workarounds any more.

Update: 2013-03-23

EF 6 Alpha 3 has been released with Connection Resiliency.

Update: 2012-11-04

The EF team have officially announced it is planned for EF 6. [4]





11/14/2013 7:39:11 AM

Popular Answer

Since this seems one of the most popular questions on SO about Azure transient handling, I'll add this answer here.

Entity Framework does indeed have resiliency code built in (per Adam's answer). However:

1) You must add code to activate it manually:

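public class MyConfiguration : DbConfiguration
{
    public MyConfiguration()
    {
        SetExecutionStrategy(SqlProviderServices.ProviderInvariantName,
            () => new SqlAzureExecutionStrategy());
        SetTransactionHandler(SqlProviderServices.ProviderInvariantName,
            () => new CommitFailureHandler());
    }
}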

The first method call activates retries, the second call sets a handler to avoid duplicate updates when retries happen.

Note: This class will be found and instantiated automatically. Just make sure the class is in the same assembly as your DbContext class and has a public constructor with no parameters.

2) The built-in SqlAzureExecutionStrategy is not good enough. It doesn't cover all the transient errors. This is not surprising when you consider that the SQL Server team works independently of the Entity Framework team, so the two are unlikely ever to be completely in sync on which transient errors are possible. It's also difficult to figure that out yourself.

The solution we used, backed by a suggestion from another software company, is to create our own Execution Strategy, which retries every SqlException and TimeoutException, except for a few that we whitelist as not worth retrying (such as permission denied).

using System;
using System.Data.Entity.Infrastructure;
using System.Data.SqlClient;

public class WhiteListSqlAzureExecutionStrategy : DbExecutionStrategy
{
    public WhiteListSqlAzureExecutionStrategy()
    {
    }

    protected override bool ShouldRetryOn(Exception exception)
    {
        var sqlException = exception as SqlException;

        // If this is an SqlException then we always want to retry,
        // unless all of its errors are in the white list.
        // With those errors there is no point in retrying.
        if (sqlException != null)
        {
            var retry = false;
            foreach (SqlError err in sqlException.Errors)
            {
                // Exception white list.
                switch (err.Number)
                {
                    // Primary Key violation
                    case 2627:

                    // Constraint violation
                    case 547:

                    // Invalid column name. We have seen this happen when the Snapshot helper runs for a column 'CreatedOn'.
                    // This is not one of our columns and it appears to be using our execution strategy.
                    // An invalid column is also something that probably doesn't get resolved by retries.
                    case 207:

                    // The server principal "username" is not able to access the database "dbname" under the current security context
                    // May occur when using restricted user - Entity Framework wants to access master for something
                    // probably not transient
                    case 916:

                    // XXX permission denied on object. (XXX = select, etc)
                    // Should not occur if db access is correct, but occurred when using restricted user - EF accessing __MigrationHistory
                    case 229:

                    // Invalid object name 'xxx'.
                    // Occurs at startup because Entity Framework looks for EdmMetadata, an old table
                    // (Perhaps only if it can't access __MigrationHistory?)
                    case 208:
                        // White-listed error: not a reason to retry.
                        break;

                    default:
                        // Anything not on the white list is treated as transient.
                        retry = true;
                        break;
                }
            }
            return retry;
        }

        if (exception is TimeoutException)
        {
            return true;
        }

        return false;
    }
}
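
The custom strategy only takes effect once it is registered. A minimal configuration sketch follows, assuming it is plugged in exactly like the built-in strategy in point 1, with the `MyConfiguration` class from there simply swapping the factory. It needs the EF6 package and is not runnable on its own.

```csharp
using System.Data.Entity;
using System.Data.Entity.SqlServer;

public class MyConfiguration : DbConfiguration
{
    public MyConfiguration()
    {
        // Use the custom strategy in place of the built-in SqlAzureExecutionStrategy.
        SetExecutionStrategy(SqlProviderServices.ProviderInvariantName,
            () => new WhiteListSqlAzureExecutionStrategy());
    }
}
```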

3) There is a bug of sorts where EF runs the retries N^2 times instead of N, which makes for much longer delays than you'd expect. Retrying is supposed to take up to about 26 seconds in total, but the bug makes it take minutes. This isn't so bad in practice, though, because SQL Azure is regularly unavailable for more than a minute :(
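
The 26-second figure can be reproduced with a little arithmetic. The sketch below assumes the strategy makes up to 5 retries with an exponential delay of (2^i - 1) seconds before retry i, capped at 30 seconds each. Those constants are my understanding of the EF6 defaults rather than something stated above, so check them against the EF source. The small random jitter EF adds is left out, and `BackoffEstimate` is a made-up name.

```csharp
using System;

// Hypothetical helper for estimating the total back-off time.
public static class BackoffEstimate
{
    // Sums the nominal delays (2^i - 1) * coefficient for retries i = 0..maxRetries-1,
    // capping each individual delay at maxDelaySeconds. Jitter is ignored.
    public static double TotalSeconds(int maxRetries = 5, double coefficientSeconds = 1.0,
                                      double maxDelaySeconds = 30.0)
    {
        var total = 0.0;
        for (var i = 0; i < maxRetries; i++)
        {
            var delay = (Math.Pow(2, i) - 1) * coefficientSeconds;
            total += Math.Min(delay, maxDelaySeconds);
        }
        return total;
    }
}
```

Under those assumptions the delays are 0, 1, 3, 7 and 15 seconds, so `BackoffEstimate.TotalSeconds()` gives 26.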

4) If you haven't been doing so already, you really need to dispose of your DbContext after it's used. It seems this is the point at which the CommitFailureHandler runs its purging to tidy up the __TransactionHistory table; if you don't dispose, this table will grow forever (although see the next point).

5) You should probably call ClearTransactionHistory somewhere in your startup or in a background thread, to clear any leftovers in __TransactionHistory.

Licensed under: CC-BY-SA with attribution
Not affiliated with Stack Overflow