DevTalk.net

ActiveMesa's software development blog. MathSharp, R2P, X2C and more!

Chained null checks and the Maybe monad

A great many programmers have met a situation where, while accessing a nested object property (e.g., person.Address.PostCode), they have to do several null checks. This requirement frequently pops up in XML parsing, where missing elements and attributes can return null when you attempt to access them (and subsequently trying to access Value throws a NullReferenceException). In this article, I’ll show how a take on the Maybe monad in C#, coupled with the use of extension methods, can be used to improve readability.

Problem Description

So, to start with, let’s look at the way to get a person’s post code (just imagine you’re working with XML or something). The code shown below does several null checks and assigns the value only if it is available.

string postCode = null;
if (person != null && person.Address != null && person.Address.PostCode != null)
{
  postCode = person.Address.PostCode.ToString();
}

What you’ve got up there is some fairly unreadable (and un-maintainable) code. Actually, we’re lucky to have all of our code fall under a single if – something that might not be possible in a more complex scenario. Let’s imagine a more complicated situation – say we need to perform some operation between the if evaluations. What do we get? That’s right – a chain of ifs.

string postCode = null;
if (person != null)
{
  if (HasMedicalRecord(person) && person.Address != null)
  {
    CheckAddress(person.Address);
    if (person.Address.PostCode != null)
      postCode = person.Address.PostCode.ToString();
    else
      postCode = "UNKNOWN";
  }
}

The code presented above contains a lot of redundancy – for example, person.Address.PostCode is mentioned twice. There’s nothing incorrect about the code per se; it is just noisier than it needs to be. To sum up, we want our code to communicate more clearly that

  • If the value is null, no further evaluations should be done; if the value is not null, then this is the value we’re going to work with
  • If we perform some action, it only happens on a valid object

So what am I suggesting? I propose that we create a fluent interface that will satisfy the above conditions without any nesting. To do that, we are going to employ the Maybe monad.

For those of you who know F#, the Maybe monad will be familiar as the Option type. For C# developers, let’s just assume that a variable can either have Some value or no value (None). Of course, C# doesn’t directly support this none-some duality except by using null. Which is precisely why I’m proposing the chained extension solution presented below.

With

Our primary concern is to shorten the null checks so they don’t pollute our code. For that, we’ll define a With() extension method:

public static TResult With<TInput, TResult>(this TInput o, Func<TInput, TResult> evaluator)
  where TResult : class where TInput : class
{
  if (o == null) return null;
  return evaluator(o);
}

The above method can be attached to any reference type (with the class constraint, TInput is effectively object). As a parameter, this method takes a function which defines the next value in the chain. If the method is called on null, we get null back without the function ever being invoked. Let’s rewrite our first example using this method:

string postCode = this.With(x => person)
                      .With(x => x.Address)
                      .With(x => x.PostCode);
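
To see the short-circuiting in isolation, here is a minimal sketch that compiles without anything else. The Person and Address classes and the WithDemo helper are hypothetical stand-ins I made up for illustration; only With() itself is taken from the definition above.

```csharp
using System;

// Hypothetical model classes, invented for this sketch.
public class Address { public string PostCode { get; set; } }
public class Person { public Address Address { get; set; } }

public static class MaybeExtensions
{
    // With() as defined above: null in, null out; otherwise evaluate.
    public static TResult With<TInput, TResult>(this TInput o, Func<TInput, TResult> evaluator)
        where TResult : class where TInput : class
    {
        if (o == null) return null;
        return evaluator(o);
    }
}

public static class WithDemo
{
    // Returns the post code, or null if any link in the chain is null.
    public static string PostCodeOf(Person person)
    {
        return person.With(x => x.Address)
                     .With(x => x.PostCode);
    }
}
```

Note that the sketch starts the chain from person directly instead of this.With(x => person). That works because an extension method is just a static call, so invoking it on a null reference is legal; starting from this is simply a way to begin a chain when you have no other object at hand.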

I suppose, in the above example, we could replace Func<> with Expression<> and try to pull properties, but I’ve seen this done and the resulting code is too slow and it’s also somewhat limiting – it assumes that you’re working with just one object, whereas my Maybe chains can (and do) drag in many objects.

Return

Here comes another piece of syntactic sugar – the Return() method. This method will return the ‘current’ value just like With() does, but in case null was passed, it will return a different value that we supply. Consider this a kind of «With() with fallback» method.

public static TResult Return<TInput,TResult>(this TInput o, 
  Func<TInput, TResult> evaluator, TResult failureValue) where TInput: class
{
  if (o == null) return failureValue;
  return evaluator(o);
}

So let’s assume now that, in the absence of a postcode, we want to return, say, string.Empty. Here’s how:

string postCode = this.With(x => person).With(x => x.Address)
                      .Return(x => x.PostCode, string.Empty);

By the way, you could rewrite the extension method so that failureValue would also be computed via a Func<> – I have yet to meet a scenario where this is required, though. It is typically the case that we never know at which stage the chain failed (and yielded null), so the terminal Return() is typically an indicator (either true/false or null/not null).
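
For the curious, the Func<>-based variant could look roughly like the sketch below. This overload is not part of the code I actually use; the parameter name failureValueFactory and the ReturnDemo class are assumptions made up for illustration. The point is that the fallback is only computed when the chain has actually yielded null.

```csharp
using System;

public static class MaybeReturnExtensions
{
    // Hypothetical overload: the fallback is produced lazily by a delegate.
    public static TResult Return<TInput, TResult>(this TInput o,
        Func<TInput, TResult> evaluator, Func<TResult> failureValueFactory)
        where TInput : class
    {
        if (o == null) return failureValueFactory();
        return evaluator(o);
    }
}

public static class ReturnDemo
{
    // Counts how often the fallback delegate actually ran.
    public static int FallbackCalls;

    // Length of the string, or -1 (computed on demand) for null.
    public static int LengthOrDefault(string s)
    {
        return s.Return(x => x.Length, () => { FallbackCalls++; return -1; });
    }
}
```

This would only pay off if the fallback is expensive to build or has side effects; for constants like string.Empty the plain-value version is simpler.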

If & Unless

Going through the call chain, you sometimes need to do checks not related to null. Theoretically, you could suspend the chain and use an if, or you could use an if in one of the delegates, but… you can simply define an If() extension method (and an Unless() if you feel like it) and plug it into the chain:

public static TInput If<TInput>(this TInput o, Func<TInput, bool> evaluator) 
  where TInput : class
{
  if (o == null) return null;
  return evaluator(o) ? o : null;
}
 
public static TInput Unless<TInput>(this TInput o, Func<TInput, bool> evaluator)
  where TInput : class
{
  if (o == null) return null;
  return evaluator(o) ? null : o;
}
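
Here is a short usage sketch for the two filters. The Account class, its properties and the FilterDemo helper are hypothetical; With(), If() and Unless() are repeated from above only so that the block compiles by itself.

```csharp
using System;

// Hypothetical type, invented for this sketch.
public class Account
{
    public string Owner { get; set; }
    public bool IsClosed { get; set; }
    public decimal Balance { get; set; }
}

public static class MaybeFilterExtensions
{
    public static TResult With<TInput, TResult>(this TInput o, Func<TInput, TResult> evaluator)
        where TResult : class where TInput : class
    {
        if (o == null) return null;
        return evaluator(o);
    }

    // Keeps the value only when the predicate holds.
    public static TInput If<TInput>(this TInput o, Func<TInput, bool> evaluator)
        where TInput : class
    {
        if (o == null) return null;
        return evaluator(o) ? o : null;
    }

    // Keeps the value only when the predicate does not hold.
    public static TInput Unless<TInput>(this TInput o, Func<TInput, bool> evaluator)
        where TInput : class
    {
        if (o == null) return null;
        return evaluator(o) ? null : o;
    }
}

public static class FilterDemo
{
    // Owner of an open account with a positive balance, otherwise null.
    public static string ActiveOwner(Account account)
    {
        return account.Unless(x => x.IsClosed)
                      .If(x => x.Balance > 0)
                      .With(x => x.Owner);
    }
}
```

Once a filter turns the value into null, every later link sees null and passes it along, so the predicates further down never run on an object that was rejected.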

Do

Seeing how we’re having a party here, let’s add yet another method that simply calls a delegate – and that’s it. Of course, this method is best used for one-line calls and not for evaluating 20-line algorithms with convoluted logic. Nevertheless, the call is quite useful in practice.

public static TInput Do<TInput>(this TInput o, Action<TInput> action) 
  where TInput: class
{
  if (o == null) return null;
  action(o);
  return o;
}

So, we’re done: we’ve got the infrastructure we need to get our post code extraction to be a bit more readable. Here is the end result:

string postCode = this.With(x => person)
    .If(x => HasMedicalRecord(x))
    .With(x => x.Address)
    .Do(x => CheckAddress(x))
    .With(x => x.PostCode)
    .Return(x => x.ToString(), "UNKNOWN");

As you can see, the depth of nesting has fallen to zero – no more curly braces!
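
If you want to try the whole thing end to end, the sketch below puts the four methods and the chain into one compilable unit. The Person and Address classes, the HasRecord flag, and the bodies of HasMedicalRecord() and CheckAddress() are all hypothetical placeholders, since the article never defines them.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical model classes, invented for this sketch.
public class Address { public string PostCode { get; set; } }
public class Person
{
    public Address Address { get; set; }
    public bool HasRecord { get; set; }
}

public static class MaybeChain
{
    public static TResult With<TInput, TResult>(this TInput o, Func<TInput, TResult> evaluator)
        where TResult : class where TInput : class
    {
        if (o == null) return null;
        return evaluator(o);
    }

    public static TInput If<TInput>(this TInput o, Func<TInput, bool> evaluator)
        where TInput : class
    {
        if (o == null) return null;
        return evaluator(o) ? o : null;
    }

    public static TInput Do<TInput>(this TInput o, Action<TInput> action)
        where TInput : class
    {
        if (o == null) return null;
        action(o);
        return o;
    }

    public static TResult Return<TInput, TResult>(this TInput o,
        Func<TInput, TResult> evaluator, TResult failureValue) where TInput : class
    {
        if (o == null) return failureValue;
        return evaluator(o);
    }
}

public static class ChainDemo
{
    // Records which addresses were checked, so the side effect is observable.
    public static readonly List<Address> Checked = new List<Address>();

    // Placeholder implementations (assumptions, not from the article).
    static bool HasMedicalRecord(Person p) { return p.HasRecord; }
    static void CheckAddress(Address a) { Checked.Add(a); }

    public static string PostCodeOf(Person person)
    {
        return person
            .If(x => HasMedicalRecord(x))
            .With(x => x.Address)
            .Do(x => CheckAddress(x))
            .With(x => x.PostCode)
            .Return(x => x.ToString(), "UNKNOWN");
    }
}
```

One behavioural difference is worth knowing about: the chain yields "UNKNOWN" whenever any link fails, whereas the nested-if version only assigns "UNKNOWN" when the post code itself is missing. If you need to tell those cases apart, split the chain in two.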

Discussion

I use these Maybe-monadic-chain-null-extension-methods (call them how you will) in my R2P software product. Here’s an example of real-life use of these constructs:

public override void VisitInvocationExpression(IInvocationExpression expression)
{
  base.VisitInvocationExpression(expression);
  string typeName = this.With(x => expression)
    .With(x => x.InvokedExpression)
    .With(x => x as IReferenceExpression)
    .With(x => x.Reference)
    .With(x => x.Resolve())
    .With(x => x.DeclaredElement)
    .With(x => x.GetContainingType())
    .Return(x => x.CLRName, null);
  this.If(x => Array.IndexOf(types, typeName) != -1)
    .With(x => ExpressionStatementNavigator.GetByExpression(expression))
    .Do(x =>
          {
            var suggestion = new SideEffectSuggestion(typeName);
            var highlightInfo = new HighlightingInfo(
              expression.GetDocumentRange(),
              suggestion);
            context.HighlightingInfos.Add(highlightInfo);
          });
}

I have to point out here that, at any point, you can stop the chain and start a new one. Why would you want that? Well, for example, you cannot define shared variables within the chain (unless you refactor it all to have a Dictionary<string,object>-like parameter).

By the way, quite frequently I find myself making additional, domain-specific methods to plug into this chain. For example:

public static IElement IsWithin<TContainingType>(this IElement self) 
  where TContainingType: class, IElement
{
  if (self == null) return self;
  var owner = self.GetContainingElement<TContainingType>(false);
  // Keep the element only if it really has a container of the requested type.
  return owner != null ? self : null;
}

One more thing: this type of notation is actually light obfuscation because, as I’m sure you’ve guessed, each extension method call will be shown as a static method call in Reflector:

public override void VisitInvocationExpression(IInvocationExpression expression)
{
    base.VisitInvocationExpression(expression);
    string typeName = this.With<SideEffectAnalyser, IInvocationExpression>(
    delegate (SideEffectAnalyser x) {
        return expression;
    }).With<IInvocationExpression, ICSharpExpression>(delegate (IInvocationExpression x) {
        return x.InvokedExpression;
    }).With<ICSharpExpression, IReferenceExpression>(delegate (ICSharpExpression x) {
        return (x as IReferenceExpression);
    }).With<IReferenceExpression, IReference>(delegate (IReferenceExpression x) {
        return x.Reference;
    })
    ⋮
    // and so on
}

This approach is easily extensible – for example, a colleague of mine does try-catch checks in his chains, too. Hey, this is kind of like AOP, but without post-build or dynamic proxies. Oh, and the performance hit for these chains is negligible compared to if statements.
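
I don’t have my colleague’s code to show, so the following is only my guess at what a try-catch link could look like. The name TryWith, its behaviour (swallow the exception and yield null) and the TryDemo helper are all assumptions.

```csharp
using System;

public static class MaybeTryExtensions
{
    // Hypothetical link: like With(), but an exception thrown by the
    // evaluator ends the chain with null instead of propagating.
    public static TResult TryWith<TInput, TResult>(this TInput o, Func<TInput, TResult> evaluator)
        where TResult : class where TInput : class
    {
        if (o == null) return null;
        try
        {
            return evaluator(o);
        }
        catch (Exception)
        {
            return null;
        }
    }
}

public static class TryDemo
{
    // Parses an absolute URI; null input and malformed text both yield null.
    public static Uri ParseUri(string text)
    {
        return text.TryWith(x => new Uri(x));
    }
}
```

Catching Exception wholesale is done here for brevity only. In real code you would probably want to catch a narrower type (UriFormatException in this case), otherwise genuine bugs get silently turned into null.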

That’s it! Comments are, as always, welcome! Oh, and if you like this article, please vote for it on CodeProject.  ■

Written by Dmitri Nesteruk

September 12th, 2010 at 9:05 am

Posted in CSharp

  • bsnote

    One thing is bad here: it doesn’t matter at what level the check fails, all of them are executed. Do you have an idea, how this can be avoided?

    • http://devtalk.net Dmitri Nesteruk

      I think the point is to go through with them. You could certainly break out of the execution by throwing an exception, but then you would have to catch it somewhere, which would lead to uglier top-level code. Not to mention the fact that throwing exceptions in this instance is just plain wrong.

      I imagine this could be done at the pure IL level (ehh… somehow). But personally I’m happy to execute the whole chain knowing it’s resistant to failure.

  • Андрей

    Your articles inspire me to study, study and study some more! :)

  • Rafi

    Maybe I missed something here, but it looks as ugly

  • http://blog.threenine.co.uk Gary Woodfine

    Excellent post. I agree with you in that it is probably better to execute the entire chain. However, off the top of my head, you could probably use the yield statement if you are looking to return errors.
    Sorry I can’t supply more information, it’s late and I really need to get to bed :-)

    • http://devtalk.net Dmitri Nesteruk

      Well that would imply that the method would have to return IEnumerable, which isn’t exactly what we want.

  • Shrike

    Nice.
    I also sometimes use the NullObject pattern.
    Then person.Address never returns null, but returns a NullAddress instead.
    It makes sense if the types are generated. Writing them by hand is tedious, of course.
    It’s not as flexible, but it plays nicely with WPF data binding.

  • Tim Collinson

    Dmitri,

    Great post. I was just playing around with this and noticed that while it works fine for things like a string, it falls apart with ints which are not nullable. Any ideas? Or is that the point to begin with since we wouldn’t be checking for nulls in ints?

    • http://devtalk.net Dmitri Nesteruk

      This is happening because of the :class constraint which is, IIRC, necessary for all of this to work. Naturally, you cannot compare int to null and so propagation of numeric data via these chains becomes problematic. The simplest solution I can think of is creating yet another chain method that foregoes the null check, e.g.:

      public static TResult WithValue<TInput, TResult>(this TInput o, Func<TInput, TResult> evaluator)
        where TInput:struct
      {
        return evaluator(o);
      }
      
  • karlssberg

    I wouldn’t worry too much about subsequent null test being executed when the first fails. If you could see how a compiler reorders the flow of your code to improve performance while still maintaining the program logic then you’d simply give up trying these little speed hacks. Compilers generally do a better job of tweaking our code to make it faster.

  • http://activeengine.wordpress.com David Robbins

    Wow – you just blew me away. Really nice work.

    • http://devtalk.net Dmitri Nesteruk

      Thanks! Glad you like it.

  • Guest

    just … why?!

  • Computersade

    Can someone explain to me how this:
    if (person != null)
    is ” fairly unreadable (and un-maintainable) code”? It reads as “If the person object is not null”. Seems perfectly readable (and maintainable) to me.
    Even in long hand “if open bracket the person object is not null close bracket” it still makes sense.

    Now this:
    this.With(x => person)
    is what? Readable? Maintainable? It reads… umm, not sure, erm let’s look at the “with” method… ah, when evaluating, using the with method, the object “this” - is the result, returned into variable x, equal to or greater than the person object?!? Is that right, erm, maybe I should have another crack at it…
    In English it actually reads “this dot with open bracket x is equal to or greater than the person object close bracket”. What does that mean??

    (Nice bit of code though)

    • http://devtalk.net Dmitri

      I think you’re missing the point. A single null check is no problem, but lots and lots of nested null checks take readability to zero.

    • Rodrigo

      You should maybe re-read this.

  • MichaelFreidgeim

    The extension name With is not meaningful for me, I like IfNotNull from http://smellegantcode.wordpress.com/2008/12/11/the-maybe-monad-in-c/

    • http://devtalk.net Dmitri

      I guess if all you’re doing is checking for nulls, it is maybe a bit more meaningful. However, that’s not always the use case. For example, you might have a Nullable<T> in there, or maybe a struct. In this case, calling the function With() makes a lot more sense.

  • http://profile.yahoo.com/WHZL5YYUNZDDDUP63M5OY7K2MQ Tormoz

    I had a colleague who wrote his own Maybe monad, and my initial reaction was, Why? And: I can’t read this. It took me a while truly to appreciate the flexibility and improved readability it provided. Yes, the terseness can be an initial barrier to understanding, but once I “got it” I really appreciated it. Sometimes it’s a matter of which verbs we select: some people like “With,” for example, and others won’t.

    In any event, thanks for making a complex topic very understandable — and for providing repeatable examples too.

  • http://twitter.com/gsscoder Giacomo S. S.

    Great post, although I prefer a simpler Maybe monad (like this http://blog.ploeh.dk/2011/02/04/TheBCLalreadyhasaMaybemonad/).

    In 2013, after the spread of many FP languages and after a lot of languages like C# got great functional constructs… there are still people asking “why?!”.

    Why should we avoid unreadable boilerplate code that checks for nulls? This post, and also the one I linked, explain “why”.

    I too have written a lot of code that checks for null over and over, I admit. But once I understood this, I became ashamed of that code, and in the future you’ll never see a null check in my code (except for parameter guard clauses).

    This is why.
