Julien Truffaut
4th April 2023
One of the most convenient features of Scala is its ability to pattern match on data such as case classes, enumerations or literals. In this article, I will present my 10 most useful tips for working efficiently with pattern matching.
Most tips will use code examples with the following enumeration:
enum OrderType {
case Market
case Limit(limitPrice: Double)
case Stop(stopPrice: Double)
case StopLimit(limitPrice: Double, stopPrice: Double)
}
An OrderType represents a strategy to buy or sell financial instruments in the market. I only encoded four order types here, but more exist.
We should always pattern match on all possible cases. Otherwise, we risk getting a MatchError at runtime.
def getLimitPrice(orderType: OrderType): Option[Double] =
orderType match {
case Limit(limitPrice) => Some(limitPrice)
case StopLimit(limitPrice, _) => Some(limitPrice)
}
getLimitPrice(Limit(12.4))
// res: Option[Double] = Some(12.4)
getLimitPrice(Market)
// scala.MatchError: Market
Fortunately, the compiler warns us if we forget to handle some cases:
Warning: match may not be exhaustive.
It would fail on pattern case: Market, OrderType.Stop(_)
Note that this warning is only available when pattern matching on sealed types, such as enumerations and sealed traits. For example, if we match on String, we won’t get any warning:
def isLuckyNumber(s: String): Boolean =
s match {
case "3" => true
case "5" => false
case "7" => true
}
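One way to guard against this, shown here only as a sketch, is to end a match on an open type such as String with a default case. The name isLuckyNumberSafe is hypothetical and is not part of the original example.

```scala
// Original version: compiles without a warning, but throws for unlisted inputs.
def isLuckyNumber(s: String): Boolean =
  s match {
    case "3" => true
    case "5" => false
    case "7" => true
  }

// Hypothetical safer variant: the default case makes the match total.
def isLuckyNumberSafe(s: String): Boolean =
  s match {
    case "3" | "7" => true
    case _         => false
  }

// isLuckyNumber("42")     throws scala.MatchError
// isLuckyNumberSafe("42") returns false
```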
Warnings are easy to miss, but thankfully we can configure the compiler to turn some warnings into errors using the -Wconf option. We just need to add the following line to our build.sbt file:
scalacOptions += "-Wconf:cat=other-match-analysis:error"
When an enumeration contains many branches, it is tempting to use a case _ to handle all the other cases. For example:
def hasLimit(orderType: OrderType): Boolean =
orderType match {
case _: Limit => true
case _: StopLimit => true
case _ => false
}
While this code is correct, it is likely to cause bugs later on. In a few months or years, we may extend the OrderType enumeration with new cases. When this happens, we probably won’t remember to update the hasLimit function, and the compiler can’t warn us because we used a catch-all.
In practice, we can’t always avoid a catch-all. For example, if we pattern match on an Int, we don’t want to enumerate the 4 billion possible values. However, for a custom enumeration like OrderType, it makes sense to be explicit, like this:
def hasLimit(orderType: OrderType): Boolean =
orderType match {
case Market => false
case _: Limit => true
case _: Stop => false
case _: StopLimit => true
}
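To make the risk concrete, here is a sketch in which the enumeration gains a fifth case. ExtendedOrderType, TrailingStopLimit and the two function names are hypothetical and only serve the illustration.

```scala
// Hypothetical extension of OrderType with a fifth case (an assumption for illustration).
enum ExtendedOrderType {
  case Market
  case Limit(limitPrice: Double)
  case Stop(stopPrice: Double)
  case StopLimit(limitPrice: Double, stopPrice: Double)
  case TrailingStopLimit(limitPrice: Double, trailingAmount: Double)
}

import ExtendedOrderType.*

// Catch-all version: still compiles without a warning after the new case is added,
// but silently reports that a TrailingStopLimit has no limit price.
def hasLimitCatchAll(orderType: ExtendedOrderType): Boolean =
  orderType match {
    case _: Limit     => true
    case _: StopLimit => true
    case _            => false
  }

// Explicit version: if the TrailingStopLimit line were missing,
// the compiler would report that the match may not be exhaustive.
def hasLimitExplicit(orderType: ExtendedOrderType): Boolean =
  orderType match {
    case Market               => false
    case _: Limit             => true
    case _: Stop              => false
    case _: StopLimit         => true
    case _: TrailingStopLimit => true
  }
```

With the catch-all, the wrong answer for the new case only shows up at runtime. With the explicit version, the compiler points at every match that needs updating.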
When several cases lead to the same result, we can group them using pattern alternatives | and reduce code duplication:
def hasLimit(orderType: OrderType): Boolean =
orderType match {
case Market | _: Stop => false
case _: Limit | _: StopLimit => true
}
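One limitation worth knowing is that alternatives cannot bind variables, so they only combine with wildcards, type patterns and constants. The sketch below repeats the enumeration to stay self-contained; hasStop is a hypothetical function added for illustration.

```scala
enum OrderType {
  case Market
  case Limit(limitPrice: Double)
  case Stop(stopPrice: Double)
  case StopLimit(limitPrice: Double, stopPrice: Double)
}

import OrderType.*

// Alternatives cannot bind variables, so the following case does not compile:
//   case Limit(price) | StopLimit(price, _) => Some(price)
// Wildcards are fine because nothing is bound:
def hasStop(orderType: OrderType): Boolean =
  orderType match {
    case Stop(_) | StopLimit(_, _) => true
    case Market | Limit(_)         => false
  }
```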
When pattern matching on a class, we can either deconstruct all of the class’s fields or only match on the class itself. You can see both approaches below:
orderType match {
case Limit(limitPrice) => ...
case StopLimit(limitPrice, stopPrice) => ...
}
orderType match {
case x: Limit => ...
case x: StopLimit => ...
}
In my opinion, the latter approach is preferable because the code is more refactor-proof: it keeps compiling when we add or reorder fields inside the classes, and removing a field only breaks the branches that actually use it.
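As a sketch of what this buys us, suppose Limit later gains a second field. OrderTypeV2, the expiresInDays field and limitPriceOf are hypothetical names used only for this illustration.

```scala
// Hypothetical revision of OrderType: Limit gains a second field (an assumption for illustration).
enum OrderTypeV2 {
  case Market
  case Limit(limitPrice: Double, expiresInDays: Int)
  case Stop(stopPrice: Double)
  case StopLimit(limitPrice: Double, stopPrice: Double)
}

import OrderTypeV2.*

// Type patterns access fields by name, so this function is unaffected by the new field.
// A constructor pattern such as `case Limit(limitPrice)` would stop compiling
// because the number of arguments no longer matches.
def limitPriceOf(orderType: OrderTypeV2): Option[Double] =
  orderType match {
    case Market | _: Stop => None
    case x: Limit         => Some(x.limitPrice)
    case x: StopLimit     => Some(x.limitPrice)
  }
```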
Guards allow us to add conditions to pattern matching. They are written after the pattern and preceded by the keyword if. Here's an example:
def getLimitPrice(orderType: OrderType): Option[Double] =
orderType match {
case Limit(limitPrice) if limitPrice <= 0 => None
case Limit(limitPrice) if limitPrice > 0 => Some(limitPrice)
case Market => None
case _: Stop => None
case StopLimit(limitPrice, _) => Some(limitPrice)
}
The problem with guards is that they break the exhaustivity checker. The code above handles all the possible cases, yet the compiler emits the following warning:
Warning: match may not be exhaustive.
It would fail on pattern case: OrderType.Limit(_)
If you wish to use a condition, I suggest defining it inside the code block of the match. For example:
def getLimitPrice(orderType: OrderType): Option[Double] =
orderType match {
case Market | _: Stop => None
case x: Limit => if(x.limitPrice <= 0) None else Some(x.limitPrice)
case x: StopLimit => if(x.limitPrice <= 0) None else Some(x.limitPrice)
}
Or even better, with a filter after the match to avoid code duplication:
def getLimitPrice(orderType: OrderType): Option[Double] =
(orderType match {
case Market | _: Stop => None
case x: Limit => Some(x.limitPrice)
case x: StopLimit => Some(x.limitPrice)
}).filter(_ > 0)
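Edit 1: u/kag0 pointed out to me that the compiler is actually doing its job by emitting a warning in the example above. Guards are arbitrary boolean expressions that the compiler does not analyze, and here the two guards really do leave a gap: both are false when limitPrice is Double.NaN. Instead, I should have defined the second case Limit without a guard, similar to the else branch of an if-then-else.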
def getLimitPrice(orderType: OrderType): Option[Double] =
orderType match {
case Limit(limitPrice) if limitPrice <= 0 => None
case Limit(limitPrice) => Some(limitPrice)
case Market => None
case _: Stop => None
case StopLimit(limitPrice, _) => Some(limitPrice)
}
Pattern matching is great because it is simple and works on a large number of classes. However, it can be quite verbose. It is sometimes better to use combinators as opposed to multiple matches. For example, I find that using a for-comprehension and the method toRight is a more readable solution than a nested pattern-match.
// Assumption: UserEmailError is a plain error message; Role is defined elsewhere.
type UserEmailError = String
case class User(id: UserId, name: String, role: Role, email: Option[Email])
case class UserId(value: Long)
case class Email(value: String)
// Version 1: for-comprehension with toRight
def getUserEmail(id: UserId, users: Map[UserId, User]): Either[UserEmailError, Email] =
  for {
    user  <- users.get(id).toRight(s"User $id not found")
    email <- user.email.toRight(s"Email not found for user $id")
  } yield email
// Version 2: the equivalent nested pattern match
def getUserEmail(id: UserId, users: Map[UserId, User]): Either[UserEmailError, Email] =
  users.get(id) match {
    case None => Left(s"User $id not found")
    case Some(user) =>
      user.email match {
        case None        => Left(s"Email not found for user $id")
        case Some(email) => Right(email)
      }
  }
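The same idea applies to smaller matches. As a sketch, the two functions below compute a label for an optional email, first with a match and then with the standard Option combinators map and getOrElse. Both function names are hypothetical.

```scala
case class Email(value: String)

// Pattern-matching version.
def emailLabelMatch(email: Option[Email]): String =
  email match {
    case Some(e) => e.value.toLowerCase
    case None    => "no email"
  }

// Combinator version: map transforms the value if present,
// getOrElse supplies the default when it is absent.
def emailLabelCombinators(email: Option[Email]): String =
  email.map(_.value.toLowerCase).getOrElse("no email")
```

Both versions return the same result for every input. Which one reads better is a matter of taste for a single Option, but the combinator style tends to scale better once several optional values are chained.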
Open classes are classes or traits which aren’t sealed or final and can therefore be extended by anyone. For example:
trait OrderAPI {
def createOrder(instrument: Instrument, orderType: OrderType): Future[OrderId]
def deleteOrder(orderId: OrderId): Future[Unit]
}
Parametric types are type variables which are defined between square brackets, such as [A].
We should avoid pattern matching on both open classes and parametric types because it breaks encapsulation. For example, we shouldn’t do the following:
def increment(value: Any) =
value match {
case x: Int => x + 1
case x: Double => x + 1
case other => other
}
def increment[A](value: A) =
value match {
case x: Int => x + 1
case x: Double => x + 1
case other => other
}
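One possible alternative, offered only as a sketch and not necessarily the approach covered in the video, is to state the required capability in the signature using the standard library's Numeric type class. The name incrementNumeric is hypothetical.

```scala
// Requires evidence that A supports numeric operations,
// instead of inspecting the runtime type of the value.
def incrementNumeric[A](value: A)(using num: Numeric[A]): A =
  num.plus(value, num.one)

val nextInt: Int       = incrementNumeric(1)   // 2
val nextDouble: Double = incrementNumeric(1.5) // 2.5
// incrementNumeric("hello") does not compile: no Numeric[String] is available.
```

Unlike the versions above, the return type stays A rather than widening to Any, and unsupported types are rejected at compile time instead of being returned unchanged.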
The reasons to avoid this sort of code are subtle. That’s why I made a video to go over them in detail.
When you want to compare multiple values together, it can be useful to put all the values in a tuple and match on it. For example, you may want to sum two optional numbers together only if at least one of the values is defined.
def sumOptions(optA: Option[Int], optB: Option[Int]): Option[Int] =
(optA, optB) match {
case (Some(a), Some(b)) => Some(a + b)
case (None , Some(b)) => Some(b)
case (Some(a), None) => Some(a)
case (None , None) => None
}
Let me know on Reddit which tips I missed and I will update this post with the most popular recommendations!