We’ve all had a null-pointer exception. We go to do something with a value, and the value is null, which causes the program to crash. We wanted something to be there…and it just wasn’t! Why and how does that even happen? One of the most common ways to trigger a null-pointer exception is to use an object you were handed without first checking whether it is null.
Let’s say we’re looking for an element in a list, and we want to find the element’s index. But we don’t just want the element’s index; that’s too boring. We want its index plus one, which is so much more exciting!!! For example, if our list was ["cow","duck","sheep","pig"], and the element we are looking for is "sheep", then we would want our final answer to be 3, since "sheep" is at index 2 (remember, computer scientists start counting at 0). Sounds easy enough, right?
Here’s some code that will do this in Python:
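A minimal sketch of two such functions. The parameter names and docstrings are assumptions; what matters for the rest of this post is that find_element_index returns None when the element is missing.

```python
def find_element_index(items, target):
    """Return the index of target in items, or None if it is absent."""
    for index, item in enumerate(items):
        if item == target:
            return index
    return None


def get_index_after_element(items, target):
    """Return the index of target in items, plus one."""
    # No check here: this line assumes find_element_index found something.
    return find_element_index(items, target) + 1


print(get_index_after_element(["cow", "duck", "sheep", "pig"], "sheep"))  # prints 3
```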
Sure enough, if we call get_index_after_element(["cow","duck","sheep","pig"], "sheep"), this function call will return 3.
These functions don’t look terrible. They’re short, documented, and relatively easy to understand. We even get some code re-use through function composition: get_index_after_element calls find_element_index. So what’s the problem here?
Null pointers. Null pointers are the problem (well…None in Python, but the concept is nearly identical). What if the element is not in the list? Then what happens? Here’s what happens when I execute it in my interpreter:
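A sketch that reproduces the failure, assuming we search for "horse" (a hypothetical element that is not in the list). The call is wrapped in try/except only so the message can be printed; left uncaught, the same TypeError ends the program with a traceback.

```python
def find_element_index(items, target):
    """Return the index of target in items, or None if it is absent."""
    for index, item in enumerate(items):
        if item == target:
            return index
    return None


def get_index_after_element(items, target):
    """Return the index of target in items, plus one (no None check)."""
    return find_element_index(items, target) + 1


try:
    get_index_after_element(["cow", "duck", "sheep", "pig"], "horse")
    crash_message = None
except TypeError as error:
    # None + 1 is not defined, so Python raises TypeError here.
    crash_message = str(error)
    print("TypeError:", crash_message)
```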
That’s nice. A crash. With a TypeError telling me that I can’t add a NoneType and an int. Now, we can avoid crashing by checking whether the value returned by find_element_index is None. If it is None, we don’t want to add one to it. The new method would look like this:
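A sketch of the guarded version. Passing None back to the caller when the element is missing is an assumption on my part; raising an exception or returning a sentinel would be equally plausible choices.

```python
def find_element_index(items, target):
    """Return the index of target in items, or None if it is absent."""
    for index, item in enumerate(items):
        if item == target:
            return index
    return None


def get_index_after_element(items, target):
    """Return the index of target plus one, or None if target is absent."""
    index = find_element_index(items, target)
    if index is None:
        # Nothing to add one to, so hand the None on to the caller.
        return None
    return index + 1


print(get_index_after_element(["cow", "duck", "sheep", "pig"], "sheep"))  # prints 3
print(get_index_after_element(["cow", "duck", "sheep", "pig"], "horse"))  # prints None
```

Note that the caller now inherits the same obligation: it has to check for None before using the result.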
Not only does this muddy up the code a good bit, making it harder to read and understand, but we have a serious problem on our hands. How do we deal with null pointers? Is there a general rule about checking for them? Do we have to check before every single operation? Some of the goofiest null-pointer crashes occur when checking for a null pointer (yes, this can and does happen). If I have an object A, and A has attribute b, and I say something like if (A.b != null) while A itself is null, guess what: our favorite friend is back again. Do I really have to check A first and then check A.b? What if b is an object too, and I want to access A.b.c? Hopefully you see where this is going…
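Here is how that chain of checks might look in Python. The classes A and B, the field types, and the helper names unsafe_has_b and read_c are all hypothetical, made up purely for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class B:
    c: Optional[str] = None


@dataclass
class A:
    b: Optional[B] = None


def unsafe_has_b(a):
    # Looks like a null check, but a.b is evaluated first,
    # so this raises AttributeError when a itself is None.
    return a.b is not None


def read_c(a: Optional[A]) -> Optional[str]:
    # Every level needs its own guard before we may look one level deeper.
    if a is None:
        return None
    if a.b is None:
        return None
    return a.b.c


print(read_c(None))               # prints None
print(read_c(A()))                # prints None
print(read_c(A(b=B(c="moo"))))    # prints moo
```

Each extra level of nesting adds another guard, and forgetting any one of them brings the crash back.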
This was obviously a toy example. Any software engineer knows not to do what I just did. However, it turns out that when you have a code base of a couple million lines, these null-pointer errors are easier to cause than you think. I believe it’s important for us as engineers to put ourselves in a position to minimize errors. The best way I’ve seen to minimize this particular error (in fact, to eliminate it) is the Maybe type from Haskell. Here’s an example of how to find the index of "sheep" in the list ["cow","duck","sheep","pig"] in Haskell:
This program outputs Just 2. A value of type Maybe a is either Just x, where x is a value of type a, or Nothing. So sheepIndex, which has type Maybe Int, is either Just n for some Int n, or Nothing. In this case, since "sheep" was in the list, it is Just 2.
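To make the shape of Maybe concrete for readers coming from the Python half of this post, here is a rough Python analogue. It is only a sketch: the Just and Nothing classes and the elem_index function are hypothetical names of mine, not how Haskell implements Maybe, and Python checks none of this at compile time.

```python
from dataclasses import dataclass
from typing import Any, Union


@dataclass
class Just:
    """Wraps a value that is really there."""
    value: Any


@dataclass
class Nothing:
    """Stands for the absence of a value."""


def elem_index(target, items) -> Union[Just, Nothing]:
    """Return Just(index) if target is in items, otherwise Nothing()."""
    for index, item in enumerate(items):
        if item == target:
            return Just(index)
    return Nothing()


sheep_index = elem_index("sheep", ["cow", "duck", "sheep", "pig"])
print(sheep_index)  # prints Just(value=2)

# The caller has to unwrap the result before doing arithmetic with it.
if isinstance(sheep_index, Just):
    print(sheep_index.value + 1)  # prints 3
```

The index never travels as a bare integer, so there is no way to reach it without first asking whether it is a Just or a Nothing.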
What if we try to add one to the index? Let’s give it a whirl like we did in Python:
Thankfully, this code doesn’t compile. Haskell gives us a nice error message:
It’s telling us we can’t use the + function with a Maybe type, and thank goodness for that. I wish my C++ and Java compilers would yell at me for doing this too. Yes, Haskell is getting in the way, but it’s getting in the way for the right reason. Code like this should NOT compile!
It looks like null pointers should just be a thing of the past. I hope other mainstream languages start supporting this feature by default as well, because it really shouldn’t be specific to Haskell. It’s a great idea, and just another one of Haskell’s amazing features!