Monday, May 16, 2011

Scala's Match is Not Switch++

Scala's match expression is a beautifully powerful beast. It can do some really sophisticated stuff with very little effort. However, if your introduction to 'match' was along the lines of, "It's like Java's 'switch', but better", then I want to offer a word of caution. I'll show an example where this thinking caused problems for me, briefly explain how match works, and then describe how I look at it now to avoid the confusion.

A (Bad) Example
Some things can look quite straightforward, but at runtime act quite differently from how they look. You might call these bugs. Take the following snippet as an example. What do you think this prints out?

    object TestingObject {
      def main(args: Array[String]): Unit = {
        val testValue = "Boris"
        val inputValue = "Natasha"
        List(inputValue) match {
          case List(testValue) => println("Match!")
          case _ => println("No match")
        }
      }
    }

Because you're a smart alec and you know what makes for an interesting blog, you have probably correctly guessed that this prints out "Match!".

Let's try and get a little more information...

    object TestingObject {
      def main(args: Array[String]): Unit = {
        val testValue = "Boris"
        val inputValue = "Natasha"
        List(inputValue) match {
          case List(testValue) => println("Match! testValue = " + testValue)
          case _ => println("No match")
        }
      }
    }

The code now prints "Match! testValue = Natasha". This gives us a bit more insight into what's going on. The testValue in the case expression is not a reference to the original val testValue; it declares a new variable that shadows the outer one for the duration of that case.

My mistake when I wrote this code was to believe that everything that I put between case and => was an input to the matching algorithm, i.e. it was going into some == or .equals() that happens under the hood. This is true in Java's switch statement, where everything in the case must be a constant, but it's certainly not how things work in Scala's match.
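
Here is a minimal sketch of the same behaviour as a pair of functions, so the two readings can be compared side by side. The names ShadowingDemo, describe and compare are made up for illustration. The first function repeats the mistake; the second uses a guard, which is one way to make the outer val a genuine input to the comparison.

```scala
object ShadowingDemo {
  // Repeats the mistake: lowercase `testValue` in the pattern is a fresh
  // variable that matches any single element and shadows the outer val.
  def describe(input: String): String = {
    val testValue = "Boris"
    List(input) match {
      case List(testValue) => "bound to " + testValue
      case _               => "no match"
    }
  }

  // Binds the element to a new name and compares it explicitly in a guard,
  // so the outer val really is an input to the test.
  def compare(input: String): String = {
    val testValue = "Boris"
    List(input) match {
      case List(x) if x == testValue => "match"
      case _                         => "no match"
    }
  }

  def main(args: Array[String]): Unit = {
    println(describe("Natasha")) // bound to Natasha
    println(compare("Natasha"))  // no match
    println(compare("Boris"))    // match
  }
}
```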

New Perspective
I prefer now to think of match the other way around to how it was first introduced to me: I now consider the use of an explicit extractor object to be the normal case, and I think of any other form of case expression, such as matching against a constant value, as syntactic sugar for some built-in, implicit extractor.

When you use an extractor object, you have an object that defines an unapply() function which accepts an object of the type you are matching as a parameter. The unapply() function can either return a Boolean or an Option. If it returns a Boolean, then true means the argument is a match and false means that it wasn't. If it returns an Option, then None means that it wasn't a match, while a Some indicates it did match. Not only this, but the object contained in the Some becomes available to the expression on the right of the =>, assuming you give it a name in the case expression. I think this is the most significant difference between switch and match: In Scala, the case expression has outputs.
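
To make the two return types concrete, here is a small sketch with one extractor of each kind. Negative, Twice and ExtractorDemo are hypothetical names invented for this example, not anything from the standard library.

```scala
// Boolean extractor: says whether the input matches, and produces no output.
object Negative {
  def unapply(n: Int): Boolean = n < 0
}

// Option extractor: Some(...) means a match and carries an output value,
// None means no match.
object Twice {
  def unapply(n: Int): Option[Int] =
    if (n % 2 == 0) Some(n / 2) else None
}

object ExtractorDemo {
  def describe(n: Int): String = n match {
    case Negative()  => "negative"       // Boolean extractors take empty parentheses
    case Twice(half) => "twice " + half  // `half` is the output of Twice.unapply
    case _           => "odd and non-negative"
  }

  def main(args: Array[String]): Unit = {
    println(describe(-4)) // negative
    println(describe(10)) // twice 5
    println(describe(7))  // odd and non-negative
  }
}
```

Notice that half never existed before the match: it is named in the case and filled in by the extractor, which is exactly the sense in which a case has outputs.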

My problem in the example at the start arose because I thought of List(testValue) in the case expression as an input, but it's not. It's actually the name of an Extractor object (List) and a name given to the output of that Extractor (testValue). Strictly speaking, List matches through unapplySeq(), the variable-length sibling of unapply(), but the idea is the same.

My new way of thinking is to pretend that everything in a Scala case is an extractor. So when I look at a case that is matching a plain old constant, I think to myself, "That's calling an extractor that returns true if the input value equals the constant." If I see case x: Int => I think, "That's calling an extractor that returns Some(x) if the object is an Int or None if it's not."
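
That mental model can be written down as code. The sketch below is only an illustration of the way of thinking, not a claim about what the compiler generates: FortyTwo and AnInt are hypothetical extractors standing in for the constant pattern and the typed pattern, and the two functions are meant to behave the same for the inputs shown.

```scala
// Hypothetical stand-in for the constant pattern `case 42 =>`.
object FortyTwo {
  def unapply(x: Any): Boolean = x == 42
}

// Hypothetical stand-in for the typed pattern `case i: Int =>`.
object AnInt {
  def unapply(x: Any): Option[Int] = x match {
    case i: Int => Some(i)
    case _      => None
  }
}

object MentalModelDemo {
  // The built-in forms.
  def builtIn(x: Any): String = x match {
    case 42     => "the constant"
    case i: Int => "an Int: " + i
    case _      => "something else"
  }

  // The same cases, spelled out with the pretend extractors.
  def viaExtractors(x: Any): String = x match {
    case FortyTwo() => "the constant"
    case AnInt(i)   => "an Int: " + i
    case _          => "something else"
  }
}
```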

Re-Program Yourself
So, my suggestion is to rid yourself of any notion that Scala's match is like "switch on steroids". That view can lead to the false belief that match is a switch that can deal with all kinds of interesting, non-constant input values. In actual fact, the cases in match are all about Extractors, not inputs, and matching against constant values in the same way as switch is best thought of as just a nice little trick that the compiler does for you to hide the Extractor.

4 comments:

  1. If you want to use a var/val/identifier as an input to the match, use backticks:

    List(inputValue) match {
    case List(`testValue`) => println("Match!")
    case _ => println("No match")
    }

    will match as you originally expected.

    It would sometimes be nice if there were more of a warning when you shadow an outer-scoped identifier.

  2. I did not know about the backticks, but another option is to give the value a name whose first letter is upper-case (here, TestValue rather than testValue):

    List(inputValue) match {
    case List(TestValue) => println("Match!")
    case _ => println("No match")
    }

  3. Hi Jed,

    Thanks heaps for adding that. I wasn't aware of that syntax.

    I had a play around with it, just to see what it's doing, and I see that it still interprets 'List' as an Extractor, and it compares the value in 'testValue' with the output of List.unapplySeq(). Though it looks like it, it's not creating a List with testValue in it and then calling .equals() (which is what I originally thought was going to happen in my Bad Example).

    But it's a handy thing to know. Cheers.

  4. Hi Joa,

    Thanks for the extra knowledge. Do you ever use that feature? It seems a bit 'magic' to me to use the casing of an identifier to change what the code does. I'd be afraid the next person to read the code wouldn't spot the subtle difference. What do you think?

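
The two suggestions from the comments can be tried side by side with a short sketch. StableIdentifierDemo and its method names are invented for this example, and both vals hold "Boris" as in the original post. In each case the identifier is treated as an existing value to compare against rather than as a new variable to bind.

```scala
object StableIdentifierDemo {
  val testValue = "Boris" // lowercase: needs backticks to be used as an input
  val TestValue = "Boris" // uppercase first letter: treated as an existing value

  def withBackticks(input: String): String = List(input) match {
    case List(`testValue`) => "match"
    case _                 => "no match"
  }

  def withUppercase(input: String): String = List(input) match {
    case List(TestValue) => "match"
    case _               => "no match"
  }

  def main(args: Array[String]): Unit = {
    println(withBackticks("Natasha")) // no match
    println(withBackticks("Boris"))   // match
    println(withUppercase("Natasha")) // no match
    println(withUppercase("Boris"))   // match
  }
}
```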