Tuesday, July 6, 2010

List.Find(Predicate <T>) Considered Harmful

Hold on, it's not goto!

I dare to say that every program have ever written on this entire planet needed some sort of searh functionality, and if it didn't, it's probably because it's too lame and frickin' uslelss.

Today I was working on a piece of code that part of it is concerned about finding items in a generic List<T>. The code I wrote was some like this:

var products = ProductsCollection.Find(p => p.Price > 500);

Doesn't that look concise and elegant. For me it does, the problem is, however, this code is not SAFE (yeah, i was surprised too).

When I run this code I got a System.NullReferenceException. WTF is that? ProductsCollection is totally a valid reference, however, it was empty.

OK, wait a second why should searching an empty list throw an exception? The expected result should be null or something but not an exception. After thinking about it a little, I thought, Oh, what if the list contains value types? In such case null is not a valid return type, so an exception makes sense to me now.

Here I thought I really got it and understood that List.Find will throw a null reference exception if called on an empty list. I was totally wrong if you must know. The call would simply return default(T) which is null for reference types. The exception was actually thrown when I used the return of the find!

Now I wondered "OK, now what if i'm searching a list that only contains value types and the item i'm searching for doesn't exist in that list? What would be the result in this case? Well, let's try it out":

List<int> evens = new List<int> { 0, 2, 4, 6, 8, 10};
var evenGreaterThan10 = evens.Find(e => e > 10);
Console.WriteLine(evenGreaterThan10);

What the search returned in this case is the value 0, yes zero! because nothing in the list is greater than 10 so Find will just give you default(T). This can lead to some really nasty bugs. Don't curse, this is frickin' documented you lazy sloths.

Important Note:
When searching a list containing value types, make sure the default value for the type does not satisfy the search predicate. Otherwise, there is no way to distinguish between a default value indicating that no match was found and a list element that happens to have the default value
for the type. If the default value satisfies the search predicate, use the FindIndex method instead.


Trying to be safe in this case you could simply check after finding to see if the returned value is what you actually asked for, something like that:

List<int> evens = new List<int> { 0, 2, 4, 6, 8, 10};
var evenGreaterThan10 = evens.Find(e => e > 10);
if(evenGreaterThan10 > 10)
{
   // valid value
}
else 
{
   // none found    
}

I'm not sure how do you feel about that, but for me, I really hate it! So what I ended up doing is something similar to the well known TryParse style, I overloaded the Find method with an extension method that would allow the usage to be something like this:

List<int> evens = new List<int> { 0, 2, 4, 6, 8, 10};
int i;
if(evens.Find(e => e > 6, out i))
  Console.WriteLine(i);
else 
  Console.WriteLine("None found");

The extension method is as simple as:

public static bool Find<T>(this List<T> list, Predicate<T> predicate, out T output)
{
  output = list.Find(predicate);
  return predicate(output);
}

EDIT:
As reported by Kevin Hall in the comment, this method has a serious bug. Consider the following code segment:
List<int> x = new List<int> { -10, -8, -6, -4 };
int myResult = -9999;
bool resultFound = x.Find(e => e > -3, out myResult);

In this case result found would be true and myResult would be zero! A yet better way to do this is by making use of the FindIndex method like so:

public static bool Find<T>(this List<T> list, Predicate<T> predicate, out T output)
{
  int index = list.FindIndex(predicate);
  if (index != -1)
  {
    output = list[index];
    return true;
  }
  output = default(T);
  return false;
}

Thanks Kevin for pointing that out.

I'm much happier now, I can drop the suicide thought for a while! What do you think dear reader?

13 comments:

Andrew Theken said...

I believe that you could also just declare it as List and then you'll get null back when the predicate doesn't find anything. Of course, the LINQ methods are probably the best way to go.

Andrew Theken said...

ehh.. blogger broke my comment. I was saying, you could just make it List of nullable int. (List<int?>)

Sebastian Sylvan said...

So does that mean that if the default value does match the predicate your find will return true? That's a bit horrible!

I'd use:

public static bool Find(this List list, Predicate predicate, ref T output)
{
int outputIndex = list.FindIndex(predicate);
if ( outputIndex != -1 )
{
output = list[outputIndex]
return true;
}
else
{
return false;
}

}

Bernard said...

I like the approach taken in D's std.algorithm; find returns the input range iterated up to the point of the first match, or an empty range other wise. So:

auto result = find!"a.Price > 500"(products);
if (result != []) {
writeln(result[0]);
} else {
writeln("None found.");
}

(this may be a double comment -- apologies if so!)

David said...

Don't use .Find()

Use List.Where().FirstOrDefault()

teppicymon said...

If you use the 'First' function, it will throw an exception if there is no matching element, but otherwise functionally equivalent for what you're trying to do.

System.InvalidOperationException: Sequence contains no matching element

teppicymon said...

...and the FirstOrDefault function will return you the 'default' of the type, which I tend to use for reference types and compare the result to null.

But still, your method is the only way that doesn't mean handling exceptions!

Matthew said...

Good idea, but because I dislike "out" arguments in functions I'd go one step further and have the extension method return a tuple, holding both the boolean and i results.

Scott Brickey said...

While I am thrilled with the TryParse approach (over simply wrapping Parse in my own Try/Catch block).. I have recently decided that I prefer a TryParse which returns nullable, such as:
static bool? TryParse(object s)
{
bool t;
if (Boolean.TryParse(s, out t))
return (bool?)t;
else
return (bool?)null;
}

it probably doesn't make much difference in code... it just means my initial variable is always assigned.


Similar to your issue, the SharePoint API has received some complaints that a lot of their functions would throw an exception when a value was not found (such as locating a list by name). While well documented, people preferred to work with nulls... MS decided to fix it w/ the new SharePoint 2010 API :)

Galilyou said...

@Andrew Theken I don't see how this helps! at the end I have to check if the return is not null! Again!

Galilyou said...

@Sebastian So what's worrying you is the double call to the predicate?

Kevin Hall said...

There's a serious bug in your code. Consider the following:

List x = new List { -10, -8, -6, -4};

int myResult = -9999;

bool resultFound = x.Find(e => e > -3, myResult);

// resultFound will be true
// myResult will be 0

Galilyou said...

@Kevin Hall Nice Catch! I think I would just use FindIndex instead!

public static bool Find(this List list, Predicate predicate, out T output)
{
int index = list.FindIndex(predicate);
if (index != -1)
{
output = list[index];
return true;
}
output = default(T);
return false;
}