Pattern Matching in Scala distilled

As the name suggests, pattern matching checks a value, or a sequence of tokens, against a given pattern. Beyond Scala, pattern matching is a commonly employed technique in a number of (predominantly functional) programming languages, including Haskell, Erlang, ML, Clojure and Prolog. In this post I’ll cover the different flavours of pattern matching in Scala, and touch on typesafe nulls and extractors to round out the idiomatic usage of pattern matching in Scala.

Within the Scala arena, pattern matching is something of a ‘golden hammer‘ due to its suitability across a number of scenarios, with the overwhelming benefit being syntactic simplicity. Some common uses of pattern matching include type detection and casting, responding to Actor messages, recursive XML parsing, deconstructing collections and defining partial functions.
Scala supports several kinds of pattern, namely: constants, variables, constructors, sequences, tuples and types. So let’s see some examples!
Constant pattern matching:
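The snippet that accompanied this heading is not reproduced here, so what follows is only a minimal sketch. The constants FOO and BAR and the describe method are hypothetical names chosen for illustration.

```scala
object ConstantDemo {
  // FOO and BAR are hypothetical constants, not taken from the original post.
  val FOO = "foo"
  val BAR = "bar"

  def describe(x: String): String = x match {
    case FOO => "matched FOO" // capitalised name: x is compared with the value of FOO
    case BAR => "matched BAR"
    case _   => "no match"    // wildcard catches everything else
  }
}
```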
Interestingly, specifying the first case as foo (in lowercase) would cause the example to misbehave, as foo would be treated as a new variable bound to the value of x, and so would match anything! (In fact the compiler would state that case BAR is unreachable!)
Variable pattern matching:
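As a minimal sketch (the method name describe and the variable name other are assumptions made for illustration), a variable pattern matches any value and binds it to a name:

```scala
object VariableDemo {
  def describe(x: Int): String = x match {
    case 0     => "zero"
    case other => s"non-zero: $other" // 'other' is bound to whatever x holds
  }
}
```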
Constructor pattern matching:
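A minimal sketch of a constructor pattern, assuming a hypothetical Person case class that is not part of the original post:

```scala
object ConstructorDemo {
  // Person is a hypothetical case class used only for illustration.
  case class Person(name: String, age: Int)

  def greet(p: Person): String = p match {
    case Person("Alice", _) => "Hello again, Alice" // constant inside the constructor
    case Person(name, _)    => s"Good day, $name"   // binds the first field to 'name'
  }
}
```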
Pattern matching over a sequence:
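The original snippet is not reproduced here. One way it might look, assuming a hypothetical recursive sum function over a List[Int]:

```scala
object SequenceDemo {
  // Recursively sums a list by splitting it into its head and the rest.
  def sum(xs: List[Int]): Int = xs match {
    case Nil          => 0                // empty list ends the recursion
    case head :: rest => head + sum(rest) // 'rest' is passed back to the function
  }
}
```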
Notice how the list is ‘decapitated‘ (i.e. the head is removed) and the remainder of the list is bound to the ‘rest’ variable and passed back to the function.
This other snippet shows the use of the sequence wildcard.
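A minimal sketch of the sequence wildcard, assuming a hypothetical startsWithZero helper: ‘_*’ stands for any number of remaining elements, including none.

```scala
object WildcardDemo {
  def startsWithZero(xs: List[Int]): Boolean = xs match {
    case List(0, _*) => true  // first element is 0, followed by anything
    case _           => false
  }
}
```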
Pattern matching a Tuple:
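As an illustrative sketch (the classify method is an assumption, not the original snippet), a tuple pattern can mix constants and variables position by position:

```scala
object TupleDemo {
  def classify(point: (Int, Int)): String = point match {
    case (0, 0) => "origin"
    case (x, 0) => s"on the x-axis at $x"
    case (0, y) => s"on the y-axis at $y"
    case (x, y) => s"at ($x, $y)"
  }
}
```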
Pattern matching for Types:
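A minimal sketch of type patterns, assuming a hypothetical size method: each case tests the runtime type and, on success, gives access to the value already cast to that type.

```scala
object TypeDemo {
  def size(x: Any): Int = x match {
    case s: String    => s.length  // s is typed as String here, no cast needed
    case xs: Seq[_]   => xs.length
    case m: Map[_, _] => m.size
    case _            => -1        // anything else
  }
}
```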
Use of pattern matching in for() expressions:
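As a sketch with made-up data (the capitals list is purely illustrative), a pattern on the left of the generator arrow takes each element apart as it is iterated:

```scala
object ForDemo {
  // Hypothetical sample data for illustration.
  val capitals: List[(String, String)] = List(("France", "Paris"), ("Japan", "Tokyo"))

  // The tuple pattern (country, city) destructures each element in turn.
  val sentences: List[String] =
    for ((country, city) <- capitals) yield s"$city is the capital of $country"
}
```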
Pattern matching applied to XML:
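The original snippet is not reproduced here. The sketch below assumes Scala's XML literal support, which in recent Scala versions requires the separate scala-xml module on the classpath; the element names and the describe method are hypothetical.

```scala
import scala.xml.Node

object XmlDemo {
  // Assumes the scala-xml module is available; element names are made up.
  def describe(node: Node): String = node match {
    case <name>{n}</name>               => s"name: ${n.text}"            // exactly one child node
    case <items>{children @ _*}</items> => s"items: ${children.length}"  // any number of children
    case _                              => "unknown"
  }
}
```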
A couple of points of note here: pattern matching supports ‘deep matching‘, where patterns are nested to match down through the object graph. Also, a variable pattern binds the matched value to a name that can then be reused on the right-hand side, for instance as the return value. Alternatively, a name can be bound to a whole sub-pattern using the ‘@’ sign, and wildcards are available as ‘_’ for a single value or ‘_*’ for the remainder of a sequence. If a pattern is followed by a further constraint in an if clause, this is known as adding a ‘pattern guard‘ to the case.
Constant and variable patterns are told apart by the first letter of the name: a simple name starting with a capital letter is treated as a constant, whereas one starting with a lowercase letter is treated as a variable (a lowercase name can still be matched as a constant by enclosing it in backticks). Scala also inherits type erasure for generic collections from Java, so the type arguments of a collection cannot be checked in a type pattern: a pattern such as Map[Int, Int] will match any Map. Arrays are the exception, as they are the only collection type to retain their element type at runtime.
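These points can be sketched together in one small example. The Address and Person case classes and the summary method are hypothetical, introduced only for illustration.

```scala
object BindingDemo {
  // Hypothetical case classes for illustration.
  case class Address(city: String, country: String)
  case class Person(name: String, age: Int, address: Address)

  def summary(p: Person): String = p match {
    // Deep match into the nested Address, plus a pattern guard on age.
    case Person(name, age, Address(_, "UK")) if age >= 18 =>
      s"$name is an adult in the UK"
    // '@' binds the whole nested Address to 'addr' while still matching its parts.
    case Person(name, _, addr @ Address(city, _)) =>
      s"$name lives in $city (${addr.country})"
  }
}
```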
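The effect of erasure can be sketched as follows (the method names are assumptions). The element type of an array can be tested, whereas for a Map only the collection type itself can be checked:

```scala
object ErasureDemo {
  def isStringArray(x: Any): Boolean = x match {
    case _: Array[String] => true  // arrays keep their element type at runtime
    case _                => false
  }

  def isMap(x: Any): Boolean = x match {
    case _: Map[_, _] => true      // key and value types are erased, so only "is it a Map?" is testable
    case _            => false
  }
}
```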

Coming from a Java background, pattern matching appears to be a generalised switch/case statement, and syntactically there’s a fair similarity between the two. The differences between Scala’s pattern matching and Java’s switch/case are as follows: matches are expressions in Scala (so they always result in a value); matches do not ‘fall through’ in Scala; and if no pattern matches, a MatchError is thrown.
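A minimal sketch of these differences, using a hypothetical sign method that deliberately leaves negative numbers unhandled:

```scala
object MatchExprDemo {
  // The match is an expression: its result is the method's return value.
  def sign(n: Int): String = n match {
    case 0          => "zero"     // no fall-through into the next case
    case x if x > 0 => "positive"
    // Deliberately no case for negative numbers: calling sign(-1) throws scala.MatchError.
  }
}
```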
The Scala compiler can warn when a pattern match is not exhaustive over its possible input classes, and where such a warning is unwanted it can be suppressed by adding the @unchecked annotation to the value being matched. So what lets the compiler know the full set of input classes? Enter the sealed class!
The superclass of a set of case classes can be marked as sealed, which tells the compiler that the set of subclasses is closed, but one rule must be adhered to for this to take effect, namely: the concrete child case classes must be defined in the same source file as their parent superclass.
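As a sketch (the Shape hierarchy is hypothetical), a sealed superclass with its case classes defined alongside it looks like this. If one of the cases below were removed, the compiler would be able to warn that the match is not exhaustive.

```scala
object SealedDemo {
  // Hypothetical hierarchy; all subclasses live in the same source file as Shape.
  sealed abstract class Shape
  case class Circle(radius: Double) extends Shape
  case class Rectangle(width: Double, height: Double) extends Shape

  def area(s: Shape): Double = s match {
    case Circle(r)       => math.Pi * r * r
    case Rectangle(w, h) => w * h
  }
}
```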

Other Scala constructs that are often (idiomatically) juxtaposed with pattern matching are typesafe Nulls (aka Option classes), extractors and injectors.
Typesafe nulls effectively dispense with the typical ‘myReturnValue == null’ technique frequently employed in Java. Instead, Scala provides the facility to return either Some(x) or None from an operation. This makes ‘null’ checks redundant, as the possible absence of a value is expressed in the return type. Some(x) acts as a wrapper for the desired ‘x’ value, and None is the typesafe null equivalent. Some and None are both subtypes of Option, which is typically used as the return type of lookups such as get() on a Map. This has the benefit of being able to readily chain method calls together to form something akin to a Fluent Builder pattern, without having to intersperse the ‘building blocks’ with null checks.
Formally, an extractor is an object that includes a method called ‘unapply’ as one of its members. unapply takes a value and, if it matches, takes it apart into its constituent parts. Often the extractor is paired with an injector in the form of a sibling ‘apply’ method for building values, but the existence of unapply does not mandate that an apply method is available (though the unapply method MUST be implemented for the object to qualify as an extractor). Extractors are useful as a more flexible alternative to case classes as they break the dependency between the data representation and the pattern, a property known as ‘representation independence‘. The benefit of this is that implementation types can be changed without affecting existing clients: a number of patterns could be defined over a class without having to cater for each of them in the class itself, for example by overloading its constructor, which is also valuable when trying to adapt client code to library classes for which the source is not available. The cost of this added flexibility is a more verbose implementation and potentially degraded performance.
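A minimal sketch of Option in use, with a hypothetical ages map as sample data. Map.get returns an Option, which can either be pattern matched or chained with map and getOrElse:

```scala
object OptionDemo {
  // Hypothetical sample data for illustration.
  val ages: Map[String, Int] = Map("Ann" -> 30, "Bob" -> 12)

  // Pattern matching on the Option returned by Map.get.
  def describe(name: String): String = ages.get(name) match {
    case Some(age) => s"$name is $age"
    case None      => s"$name is unknown"
  }

  // Chaining calls without any null checks; falls back to 0 when the name is absent.
  def ageNextYear(name: String): Int = ages.get(name).map(_ + 1).getOrElse(0)
}
```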
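To make this concrete, here is a small sketch of an extractor with a matching injector. The Email object and the describe helper are hypothetical names. Note that the values being matched are plain Strings, so the pattern is independent of any case class representation.

```scala
object Email {
  // Injector: builds a value from its parts.
  def apply(user: String, domain: String): String = s"$user@$domain"

  // Extractor: takes a value apart, or returns None if it does not fit the pattern.
  def unapply(s: String): Option[(String, String)] = s.split("@") match {
    case Array(user, domain) => Some((user, domain))
    case _                   => None
  }
}

object EmailDemo {
  def describe(s: String): String = s match {
    case Email(user, domain) => s"user $user at $domain" // calls Email.unapply(s)
    case _                   => "not an email address"
  }
}
```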

And finally, putting it all together in an extended example!
(the example here uses the standard coding dojo bowling kata – http://codingdojo.org/cgi-bin/wiki.pl?KataBowling)
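The original extended example is not reproduced here. What follows is a sketch of one possible solution to the kata, not the author's code. It assumes the game is supplied as a complete list of pin counts per roll (rather than the kata's string notation), and the names Bowling and score are hypothetical.

```scala
object Bowling {
  // Scores a game given as the number of pins knocked down on each roll.
  // 'frame' tracks the current frame so the tenth frame can be treated specially.
  def score(rolls: List[Int], frame: Int = 1): Int = (frame, rolls) match {
    case (10, _) =>
      rolls.sum                                        // tenth frame: remaining rolls, bonus balls included
    case (_, 10 :: rest) =>
      10 + rest.take(2).sum + score(rest, frame + 1)   // strike: add the next two rolls
    case (_, a :: b :: rest) if a + b == 10 =>
      10 + rest.take(1).sum + score(rest, frame + 1)   // spare: add the next roll
    case (_, a :: b :: rest) =>
      a + b + score(rest, frame + 1)                   // open frame
    case _ =>
      rolls.sum                                        // incomplete game: count what is left
  }
}
```

For instance, Bowling.score(List.fill(12)(10)) scores twelve strikes in a row, and Bowling.score(List.fill(21)(5)) scores a game made up entirely of 5/5 spares.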
Happy hacking!
