LinQ to SQL throws Stackoverflow exception when using Any()

c# entity-framework-6 linq linq-to-sql postgresql

Question

I'm using a LinQ query that looks like this

public List<TEntity> GetEntities<TEntity>(int[] ids)
{
    var someDbSet = new DbSet<TEntity>();

    var resultQ = someDbSet.Where(t => !ids.Any() || ids.Contains(t.ID)); //<= crashing line

    return resultQ.toList();
}

It usually works, but for some case when ids size is ~ 7000 items it crashes. The thrown exception message is "Exception of type 'System.StackOverflowException' was thrown.". It has no stack trace or InnerException.

I also get this info: "EntityFramework.pdb not loaded... contains the debug information required to find the source for the module EntityFramework.dll"

Is this a known bug or can someone explain why it doesn't work when the array is bigger?

I'm using .NET Framework 4.5, EntityFramework 6.1.3, EntityFramework6.Npgsql 3.0.3

1
0
2/6/2020 11:34:02 AM

Accepted Answer

If we pass an array with only two values int[] ids = {1, 2} to your method GetEntities EntityFramework will generate the next query:

SELECT 
[Extent1].[Id] AS [Id], 
...
FROM [dbo].[Entity] AS [Extent1]
WHERE ( NOT EXISTS (SELECT 
    1 AS [C1]
    FROM  (SELECT 
        1 AS [C0]
        FROM  ( SELECT 1 AS X ) AS [SingleRowTable1]
    UNION ALL
        SELECT 
        1 AS [C0]
        FROM  ( SELECT 1 AS X ) AS [SingleRowTable2]) AS [UnionAll1]
)) OR (1 = [Extent1].[Id]) OR (2 = [Extent1].[Id])

If we increase the number of elements in ids array this query becomes more complex with more levels of nestings. I think that EntityFramework uses some recursive algorithm to generate SQL-code for !ids.Any() expression. When the number of elements in ids array increases the depth of the recursion also increases. Therefore it generates StackOverflowException when the number of elements in ids array (and also the depth of the recursion) is large.

If we delete !ids.Any() expression the next query will be generated:

SELECT 
[Extent1].[Id] AS [Id], 
...
FROM [dbo].[Entity] AS [Extent1]
WHERE [Extent1].[Id] IN (1,2) 

Such query does not generate StackOverflowException when the number of elements in the ids array is large. Therefore it would be better to extract !ids.Any() expression out of LINQ query:

public List<TEntity> GetEntities<TEntity>(int[] ids)
{
    var someDbSet = new DbSet<TEntity>();

    if (!ids.Any())
        return someDbSet.ToList();

    var resultQ = someDbSet.Where(t => ids.Contains(t.ID));
    return resultQ.toList();
}

You should also take to account that there is a limitation on number of items for WHERE IN condition: Limit on the WHERE col IN (...) condition.


ionutnespus wrote:

Yes, extracting the condition outside Where() is working. Still, I couldn't find any explanation why EF would use such a complicated algorithm for such a simple condition. Any thoughts on that?

I've decided to answer this question by extending this post because the asnwer is large and contains code.

I don't know for sure why EF generates such complex query but I've made some research and here are my thoughts. If we modify your GetEntites method and use next condition in LINQ query:

someDbSet.Where(t => !ids.Any(i => i == 3) || ids.Contains(t.ID));

the next SQL-query will be generated if ids = {1, 2}:

SELECT 
[Extent1].[Id] AS [Id], 
...
FROM [dbo].[Entity] AS [Extent1]
WHERE ( NOT EXISTS (
    SELECT 1 AS [C1]
    FROM (
        SELECT 1 AS [C0] FROM  ( SELECT 1 AS X ) AS [SingleRowTable1] WHERE 3 = 1
        UNION ALL
        SELECT 1 AS [C0] FROM  ( SELECT 1 AS X ) AS [SingleRowTable2] WHERE 3 = 2
        ) AS [UnionAll1]
)) OR (1 = [Extent1].[Id]) OR (2 = [Extent1].[Id])

Here you can see that NOT EXISTS condition contains two subqueries each of which checks if the next element of the ids array equals required value. I think that it is logical to use NOT EXISTS SQL-condition to represent Any() method. But why does EF generates one subquery for each array element? In my opinion EF does so because because EF Team tried to write algorithm that generates queries that are not depend on database type. But this is only my opinion. May be it would be better to ask this question EF Team on github.

1
2/7/2020 2:59:14 PM

Popular Answer

Can you try like this?

    public List<TEntity> GetEntities<TEntity>(int[] ids)
    {
        var someDbSet = new DbSet<TEntity>();
        var resultQ = new List<your_list_type>();
        foreach( var id in ids) {
          resultQ.Add(someDbSet.Where(prm => prm.ID == id).FirstOrDefault());
}
        return resultQ;

    }


Related Questions





Related

Licensed under: CC-BY-SA with attribution
Not affiliated with Stack Overflow
Licensed under: CC-BY-SA with attribution
Not affiliated with Stack Overflow