Entity Framework 5 performance concerns

asp.net-mvc c# entity-framework performance sql


Right now I'm working on a pretty complex database. Our object model is designed to be mapped to the database. We're using EF 5 with POCO classes, manually generated.

Everything is working, but there's some complaining about the performances. I've never had performance problems with EF so I'm wondering if this time I just did something terribly wrong, or the problem could reside somewhere else.

The main query may be composed of dynamic parameters. I have several if and switch blocks that are conceptually like this:

if (parameter != null) { query = query.Where(c => c.Field == parameter); }

Also, for some complex And/Or combinations I'm using LinqKit extensions from Albahari.

The query is against a big table of "Orders", containing years and years of data. The average use is a 2 months range filter though.

Now when the main query is composed, it gets paginated with a Skip/Take combination, where the Take is set to 10 elements.

After all this, the IQueryable is sent through layers, reaches the MVC layer where Automapper is employed.

Here, when Automapper starts iterating (and thus the query is really executed) it calls a bunch of navigation properties, which have their own navigation properties and so on. Everything is set to Lazy Loading according to EF recommendations to avoid eager loading if you have more than 3 or 4 distinct entities to include. My scenario is something like this:

  • Orders (maximum 10)
    • Many navigation properties under Order
      • Some of these have other navigation under them (localization entities)
    • Order details (many order details per order)
      • Many navigation properties under each Order detail
        • Some of these have other navigation under them (localization entities)

This easily leads to a total of 300+ queries for a single rendered "page". Each of those queries is very fast, running in a few milliseconds, but still there are 2 main concerns:

  • The lazy loaded properties are called in sequence and not parallelized, thus taking more time
  • As a consequence of previous point, there's some dead time between each query, as the database has to receive the sql, run it, return it and so on for each query.

Just to see how it went, I tried to make the same query with eager loading, and as I predicted it was a total disaster, with a translated sql of more than 7K lines (yes, seven thousands) and way more slow overall.

Now I'm reluctant to think that EF and Linq are not the right choice for this scenario. Some are saying that if they were to write a stored procedure which fetches all the needed data, it would run tens of times faster. I don't believe that to be true, and we would lose the automatic materialization of all related entities.

I thought of some things I could do to improve, like:

  • Table splitting to reduce the selected columns
  • Turn off object tracking, as this scenario is read only (have untracked entities)

With all of this said, the main complaint is that the result page (done in MVC 4) renders too slowly, and after a bit of diagnostics it seems all "Server Time" and not "Network Time", taking about from 8 to 12 seconds of server time.

From my experience, this should not be happening. I'm wondering if I'm approaching this query need in a wrong way, or if I have to turn my attention to something else (maybe a bad configured IIS server, or anything else I'm really clueless). Needles to say, the database has its indexes ok, checked very carefully by our dba.

So if anyone has any tip, advice, best practice I'm missing about this, or just can tell me that I'm dead wrong in using EF with Lazy Loading for this scenario... you're all welcome.

5/24/2013 7:04:28 PM

Popular Answer

For a very complex query that brings up tons of hierarchical data, stored procs won't generally help you performance-wise over LINQ/EF if you take the right approach. As you've noted, the two "out of the box" options with EF (lazy and eager loading) don't work well in this scenario. However, there are still several good ways to optimize this:

(1) Rather than reading a bunch of entities into memory and then mapping via automapper, do the "automapping" directly in the query where possible. For example:

var mapped = myOrdersQuery.Select(o => new OrderInfo { Order = o, DetailCount = o.Details.Count, ... })
    // by deferring the load until here, we can bring only the information we actually need 
    // into memory with a single query

This approach works really well if you only need a subset of the fields in your complex hierarchy. Also, EF's ability to select hierarchical data makes this much easier than using stored procs if you need to return something more complex than flat tabular data.

(2) Run multiple LINQ queries by hand and assemble the results in memory. For example:

// read with AsNoTracking() since we'll be manually setting associations
var myOrders = myOrdersQuery.AsNoTracking().ToList();
var orderIds = myOrders.Select(o => o.Id);
var myDetails = context.Details.Where(d => orderIds.Contains(d.OrderId)).ToLookup(d => d.OrderId);
// reassemble in memory
myOrders.ForEach(o => o.Details = myDetails[o.Id].ToList());

This works really well when you need all the data and still want to take advantage of as much EF materialization as possible. Note that, in most cases a stored proc approach can do no better than this (it's working with raw SQL, so it has to run multiple tabular queries) but can't reuse logic you've already written in LINQ.

(3) Use Include() to manually control which associations are eager-loaded. This can be combined with #2 to take advantage of EF loading for some associations while giving you the flexibility to manually load others.

8/13/2013 12:40:04 PM

Related Questions


Licensed under: CC-BY-SA with attribution
Not affiliated with Stack Overflow
Licensed under: CC-BY-SA with attribution
Not affiliated with Stack Overflow