Julien Truffaut
16th January 2023
Implicits are one of the most feared features of the Scala programming language and for good reasons!
First, the concept of implicits is fairly specific to Scala. No other mainstream programming language has a similar concept. This means that new Scala developers have no patterns to rely on to use implicits correctly.
Second, the keyword implicit is overused in Scala 2 (similar to _). Therefore, it takes a lot of time and practice to distinguish between the various usages of implicits. On that point, Scala 3 has made great improvements by introducing dedicated syntax for each use case of implicits.
This blog post will concentrate on Scala 2 as it is currently the most used Scala major version. However, I will mention the differences introduced in Scala 3 regarding implicits along the way.
Now, before we dive deep into the design patterns of implicit parameters, it is worth spending a few minutes reviewing how implicit parameters work.
A function or class constructor can have explicit and implicit parameters. Parameters are explicit by default; they are only implicit if we add the implicit keyword at the beginning of the parentheses.
def sorted[A](list: List[A])(implicit ordering: Ordering[A]): List[A]
class UserService(config: Config)(implicit ec: ExecutionContext) { }
In the above examples, list is an explicit parameter, and ordering is an implicit parameter of the function sorted. Class constructors behave identically to simple functions, so for the rest of the blog post, I will only use examples with simple functions.
Note that explicit and implicit parameters are always defined in separate sets of parentheses. In Scala 2, all the implicit parameters must be defined in the last set of parentheses. This restriction no longer exists in Scala 3.
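As a minimal sketch of how the sorted signature above can be called, here is one possible version. The body of sorted and the object name SortedExample are my own assumptions; the body simply delegates to the standard library.

```scala
object SortedExample {
  // Assumed implementation: delegate to the standard library and forward the ordering.
  def sorted[A](list: List[A])(implicit ordering: Ordering[A]): List[A] =
    list.sorted(ordering)

  // The compiler finds the standard Ordering[Int] and passes it for us.
  val ascending: List[Int] = sorted(List(3, 1, 2))

  // The implicit parameter can still be passed by hand, here with a reversed ordering.
  val descending: List[Int] = sorted(List(3, 1, 2))(Ordering[Int].reverse)
}
```

The second call shows that an implicit parameter list is still an ordinary parameter list: we can always fill it in ourselves.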
Let's say that we have a method createEmptyBlogPost which takes both an explicit and an implicit parameter (I'll explain later why I made this choice).
def createEmptyBlogPost(title: String)(implicit requesterId: UserId): BlogPost =
  BlogPost(
    author  = requesterId,
    title   = title,
    content = ""
  )
case class BlogPost(
  author : UserId,
  title  : String,
  content: String,
)
case class UserId(value: String)
How do we call the function createEmptyBlogPost
? The first option is to pass the implicit parameter explicitly.
createEmptyBlogPost("Scala Implicits: The complete guide")(UserId("john_1234"))
// res: BlogPost = BlogPost(
// author = UserId("john_1234"),
// title = "Scala Implicits: The complete guide",
// content = "",
// )
However, this isn’t idiomatic. Normally, implicit parameters are not specified by the developers. Instead, the compiler passes them automatically to the function. It is a form of dependency injection.
createEmptyBlogPost("Scala Implicits: The complete guide") // Implicit call
// res: BlogPost = BlogPost(
// author = UserId("john_1234"),
// title = "Scala Implicits: The complete guide",
// content = "",
// )
Now, the question is: how does the compiler know which value to inject?
The compiler maintains a map where the key is a type and the value, a value of the key’s type (this isn’t how it is really implemented in the compiler but it is a good mental model). For example,
val ImplicitValues: Map[Type, Value] = // pseudo-code
Map(
Int -> 5,
String -> "",
UserId -> UserId("john_1234"),
)
Then, when the compiler needs to pass an implicit parameter of type UserId, it looks up the value at the key UserId, which is UserId("john_1234"), and injects it into the function createEmptyBlogPost. All of this happens at compile time, which means that the program runtime is not affected by implicits!
Next, what happens if there is no key for the type UserId? In this case, the compiler reports a compile-time error. For example,
val ImplicitValues: Map[Type, Value] = // pseudo-code
Map(
Int -> 5,
String -> "",
// No entry for UserId
)
createEmptyBlogPost("Scala Implicits: The complete guide")
error: could not find implicit value for parameter requesterId: UserId
Finally, how can we inform the compiler that UserId("john_1234") should be the implicit value for the type UserId?
We need to use the implicit keyword, but this time before a val or def. For example,
implicit val requesterId: UserId = UserId("john_1234")
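If you want to check which value the compiler would inject for a given type, the standard library method implicitly can help. The sketch below is self-contained; the object name ImplicitLookup is only there for illustration.

```scala
object ImplicitLookup {
  case class UserId(value: String)

  implicit val requesterId: UserId = UserId("john_1234")

  // implicitly[T] asks the compiler for the implicit value of type T visible here.
  val injected: UserId = implicitly[UserId]
}
```

Here injected is the same value as requesterId, because it is the only implicit UserId in scope.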
Note that implicit definitions are scoped, similar to normal values. In the following example, the first call of createEmptyBlogPost compiles because the implicit requesterId is defined in the same curly braces as where the function is called, while the second call of createEmptyBlogPost fails to compile because the implicit value isn’t visible at this location.
class BlogPostTest extends AnyFunSuite {
  test("createEmptyBlogPost gets the author implicitly") {
    implicit val requesterId: UserId = UserId("john_1234")
    val result = createEmptyBlogPost("Scala Implicits: The complete guide") // ✅ Compile
    assert(result.author == requesterId)
  }
  test("createEmptyBlogPost has no content") {
    val result = createEmptyBlogPost("Scala Implicits: The complete guide") // ❌ could not find implicit value
    assert(result.content.isEmpty)
  }
}
If we move the requesterId definition one line up (first line inside the BlogPostTest class), then the two calls to createEmptyBlogPost would compile.
Implicit values can also be imported from another scope. For example,
object User {
  implicit val defaultUser: UserId = UserId("john_1234")
}
import User.defaultUser // or User._
createEmptyBlogPost("Scala Implicits: The complete guide") // ✅ Compile
Let’s summarise what we have seen so far about implicit parameters:
The compiler keeps track of all the implicits available within a scope.
At compile time, the compiler injects all implicit parameters not passed explicitly.
If an implicit is missing, we get a compile-time error: “could not find implicit value…”
If there are two or more implicit values of the same type in the same scope, we get another compile-time error: “ambiguous implicit …”
We haven’t seen an example for the last case yet, but it makes sense: the compiler injects values by looking up their type, so if there is more than one value per type, it can’t decide which one to choose. It is ambiguous, as the error message says.
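Here is a small sketch of the ambiguous case. The names currentUser, john and jane are made up for this example, and the failing call is commented out so that the snippet still compiles.

```scala
object AmbiguityExample {
  case class UserId(value: String)

  def currentUser(implicit requesterId: UserId): String = requesterId.value

  // Two implicit values of the same type in the same scope.
  implicit val john: UserId = UserId("john_1234")
  implicit val jane: UserId = UserId("jane_5678")

  // currentUser // does not compile: both john and jane match the type UserId

  // Passing the argument explicitly removes the ambiguity.
  val resolved: String = currentUser(john)
}
```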
Note that it is possible to create several implicits with the same type, and to define a sort of priority for the compiler. However, the use cases for this feature are really advanced. If you need this feature and you are not writing a generic library, the chances are you are misusing implicit parameters and you would be better off passing the arguments explicitly.
Now that we know how implicit parameters work, let’s have a look at two good use cases for this feature.
The environment pattern takes its name from environment variables used in shell scripts and CI/CD platforms. The idea is that most parameters change every time we call a function, but some parameters are static within a session, like JAVA_HOME or SBT_OPTS. These environment parameters are generally initialised at the beginning of a session and they stay the same until the session terminates.
Let’s have a look at how this pattern translates in Scala. Say that we work on an http service to manage blog posts.
val httpService = {
case req @ POST -> Root / "blog" => // create a blog
case req @ PUT -> Root / "blog" / id => // update a blog
case req @ DELETE -> Root / "blog" / id => // delete a blog
}
This REST interface is the only entry point in our application, which means that at any point in time, we should know the user who made the request. This information can be useful in many places, for example to ascertain the author of a blog post or the user who made the last update. Now, because this user ID is everywhere, we would like to avoid passing it around manually to every function. So let’s pass it implicitly!
The first thing we do in the implementation of the endpoint is to authenticate the user who made the request. Once authenticated, we assign the UserId to an implicit value and call the method create of the BlogAPI class.
case req @ POST -> Root / "blog" =>
implicit val requesterId: UserId = extractRequesterId(req)
for {
payload <- req.parseBodyAs[NewBlog]
_ <- blogAPI.create(payload.title)
} yield Ok()
BlogAPI.create in turn calls the pure function createEmptyBlogPost and saves the result to the database.
class BlogAPI(db: DB) {
  def create(title: String)(implicit requesterId: UserId): Future[Unit] = {
    val newBlog = createEmptyBlogPost(title)
    db.save(newBlog)
  }
}
def createEmptyBlogPost(title: String)(implicit requesterId: UserId): BlogPost =
  BlogPost(
    author  = requesterId,
    title   = title,
    content = "",
  )
To summarise, the environment pattern works as follows:
When the server receives an HTTP request, we mark the UserId of the requester as an implicit value.
All following methods take the UserId as an implicit parameter.
The benefits of this pattern are:
We don’t clutter the logic by passing a UserId everywhere. It is not a big deal for a single parameter but we may want to pass around other contextual values such as a correlation ID or a span for tracing.
We get the guarantee that all our logic will use the same requesterId within a request. This guarantee comes from the usage of implicit, which ensures a unique value per type, and the fact that we don’t pass implicit parameters explicitly.
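To illustrate the first benefit, here is a sketch with two contextual values travelling through two layers. CorrelationId, auditLine, createBlog and handleRequest are hypothetical names I introduce for this example; they are not part of the blog service above.

```scala
object RequestContext {
  case class UserId(value: String)
  case class CorrelationId(value: String) // hypothetical second contextual value

  def auditLine(action: String)(implicit requesterId: UserId, correlationId: CorrelationId): String =
    s"[${correlationId.value}] ${requesterId.value} $action"

  // The intermediate layer only declares the implicits; it never forwards them by hand.
  def createBlog(title: String)(implicit requesterId: UserId, correlationId: CorrelationId): String =
    auditLine(s"created '$title'")

  // Simulates the entry point of a request, where both values are fixed once.
  def handleRequest(user: String, correlation: String, title: String): String = {
    implicit val requesterId: UserId = UserId(user)
    implicit val correlationId: CorrelationId = CorrelationId(correlation)
    createBlog(title)
  }
}
```

Adding a third contextual value would only change the signatures, not the bodies of the intermediate functions.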
Let's have a look at a second example of the environment pattern using a different kind of context.
Imagine we want to extend our BlogPost data structure to include the timestamp at which it was created. The easiest way to do this is to add a field createdAt in the case class BlogPost and modify createEmptyBlogPost by initialising createdAt using Instant.now().
case class BlogPost(
  author   : UserId,
  title    : String,
  content  : String,
  createdAt: Instant,
)
def createEmptyBlogPost(title: String)(implicit requesterId: UserId): BlogPost =
  BlogPost(
    author    = requesterId,
    title     = title,
    content   = "",
    createdAt = Instant.now(),
  )
It works very well and it is simple. Unfortunately, Instant.now() is non-deterministic: every time we call this function, we get a different result, which makes our logic difficult to test.
test("create blog post") {
implicit val requesterId: UserId = UserId("john_1234")
val result = createEmptyBlogPost("Test")
assert(result == BlogPost(requesterId, "Test", "", ???)) // which timestamp?
}
There are ways to work around this issue. We could:
Ignore the timestamp when we compare two BlogPost values in our tests.
Intercept the call to Instant.now() using a mocking framework and override it.
However, these two solutions are error prone. In my opinion, a better approach consists of defining a Clock interface with two implementations, one for production and one for tests:
trait Clock {
  def now(): Instant
}
object Clock {
  val real: Clock = new Clock {
    def now(): Instant = Instant.now()
  }
  def static(timestamp: Instant): Clock = new Clock {
    def now(): Instant = timestamp
  }
}
Clock.real is the real system clock which uses Instant.now(), while Clock.static always returns the same time.
Then, we need to update createEmptyBlogPost to take an implicit Clock:
def createEmptyBlogPost(title: String)(implicit requesterId: UserId, clock: Clock): BlogPost =
  BlogPost(
    author    = requesterId,
    title     = title,
    content   = "",
    createdAt = clock.now(),
  )
Finally, we need to set the Clock environment value as an implicit value at the start of its context.
Clock.real is for all our production code, so we should initialise it in the Main class of our application:
object Main extends App {
implicit val clock: Clock = Clock.real
...
}
Clock.static is for our test code, so we can initialise it inside individual tests or at the beginning of a test suite. For example,
class BlogPostTest extends AnyFunSuite {
implicit val clock: Clock = Clock.static(Instant.EPOCH)
...
}
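With a static clock in scope, the test that was stuck on "which timestamp?" becomes deterministic. The following sketch repeats the definitions from above in a single object (ClockExample is just a container name for this example) so that it can be compiled on its own.

```scala
import java.time.Instant

object ClockExample {
  case class UserId(value: String)
  case class BlogPost(author: UserId, title: String, content: String, createdAt: Instant)

  trait Clock { def now(): Instant }
  object Clock {
    def static(timestamp: Instant): Clock = new Clock {
      def now(): Instant = timestamp
    }
  }

  def createEmptyBlogPost(title: String)(implicit requesterId: UserId, clock: Clock): BlogPost =
    BlogPost(requesterId, title, "", clock.now())

  // Test context: a fixed requester and a clock frozen at the epoch.
  implicit val requesterId: UserId = UserId("john_1234")
  implicit val clock: Clock = Clock.static(Instant.EPOCH)

  val result: BlogPost = createEmptyBlogPost("Test")
  val expected: BlogPost = BlogPost(requesterId, "Test", "", Instant.EPOCH)
}
```

Because the clock always returns Instant.EPOCH, result and expected are equal and we can compare the whole BlogPost in one assertion.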
If you have worked with Scala Futures, you have already seen this pattern before. Indeed, almost all methods of the Future API require an implicit ExecutionContext (a sort of thread pool). Typically, applications using Futures define the production ExecutionContext in the main of the application:
object Main extends App {
implicit val ec: ExecutionContext = ExecutionContext.global
...
}
Individual test files may decide to use a custom ExecutionContext, for example one with a single thread:
class BlogDatabaseTest extends AnyFunSuite {
  implicit val ec: ExecutionContext = fixedSizeExecutionContext(1)
  ...
}
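The helper fixedSizeExecutionContext is not part of the standard library. One possible definition, which is my own assumption and built only on java.util.concurrent and scala.concurrent, could look like this:

```scala
import java.util.concurrent.Executors
import scala.concurrent.duration._
import scala.concurrent.{Await, ExecutionContext, ExecutionContextExecutorService, Future}

object SingleThreadExample {
  // Assumed definition of the helper: wrap a fixed-size Java thread pool.
  def fixedSizeExecutionContext(threads: Int): ExecutionContextExecutorService =
    ExecutionContext.fromExecutorService(Executors.newFixedThreadPool(threads))

  def sumOnPool(): Int = {
    val pool = fixedSizeExecutionContext(1)
    implicit val ec: ExecutionContext = pool
    try {
      // Both Future.apply and map receive `ec` implicitly.
      val total = Future(20).map(_ + 22)
      Await.result(total, 5.seconds)
    } finally pool.shutdown() // release the thread so the JVM can exit
  }
}
```

Returning an ExecutionContextExecutorService rather than a plain ExecutionContext lets the caller shut the pool down at the end of the test suite.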
To summarise, we have seen that environment parameters are static in a given context, and a context can be a few different things. We have seen two context examples today: the per-request context and the production vs test context.
It is important that everyone in our team can clearly identify context boundaries, because environment parameters only change when the context changes. Therefore, I would recommend using clear and simple contexts.
Another important point is to use precise types for the environment variables. This is because implicit requires a unique value per type and generic types like Int, Boolean, String or LocalDate may be used to encode different things.
def createQueue[A](implicit size: Int): Queue[A] =
...
def connect(hostname: String)(implicit port: Int): Unit =
...
If you do want to pass a port number as an environment parameter, I recommend creating a wrapper type:
case class PortNumber(value: Int)
// or even better
case class HttpServerPortNumber(value: Int)
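As a sketch of what this buys us, here are the two functions from above rewritten with wrapper types. QueueSize and describeQueue are hypothetical names, and I made the functions return a String only so that the result is easy to observe.

```scala
object PortExample {
  case class HttpServerPortNumber(value: Int)
  case class QueueSize(value: Int) // hypothetical wrapper for the queue example

  def connect(hostname: String)(implicit port: HttpServerPortNumber): String =
    s"$hostname:${port.value}"

  def describeQueue(implicit size: QueueSize): String =
    s"queue of ${size.value}"

  // Two implicit values backed by an Int can now coexist without ambiguity.
  implicit val port: HttpServerPortNumber = HttpServerPortNumber(8080)
  implicit val size: QueueSize = QueueSize(100)

  val address: String = connect("localhost")
  val queue: String = describeQueue
}
```

With a bare implicit Int for both, the compiler would either report an ambiguity or, worse, inject the port number as the queue size.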
The most important takeaway about implicit parameters is that values injected by the compiler should be obvious! If you need to check your imports or run a debugger to figure out which value was injected, then it is not obvious and you would be better off passing the values explicitly.
In my next blog post, I present another useful pattern for implicit parameters. Check it out!
This is probably obvious but the simplest way to share parameters across multiple functions is to package them in a class and pass the parameters to the constructor of the class. This works well if the shared parameters are initialised at the beginning of the application (e.g. a thread pool or http client) but not so much for short lived parameters like a requester ID or correlation ID.
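For completeness, a minimal sketch of this approach could look as follows. BlogService and its two methods are names I made up for the example.

```scala
import java.time.Instant

object ClassParameterExample {
  trait Clock { def now(): Instant }

  // The clock is passed once to the constructor and shared by every method.
  class BlogService(clock: Clock) {
    def createdAt(): Instant = clock.now()
    def isOlderThan(timestamp: Instant): Boolean = timestamp.isBefore(clock.now())
  }

  // A clock frozen one minute after the epoch, for illustration.
  val service = new BlogService(new Clock {
    def now(): Instant = Instant.EPOCH.plusSeconds(60)
  })
}
```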
ThreadLocal lets us set and access variables within a thread. This method is nice because we don’t need to change the signature of our functions. However, it doesn’t work well with parallelism/concurrency because it is extremely difficult to propagate the parameters properly, and often results in errors.
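The propagation problem is easy to reproduce. In the sketch below (the requester holder and the method name are hypothetical), a value set on the calling thread is simply absent on a worker thread.

```scala
object ThreadLocalExample {
  // Hypothetical holder for the requester ID of the current request.
  val requester = new ThreadLocal[String]

  def readFromAnotherThread(): Option[String] = {
    requester.set("john_1234") // visible only on the calling thread
    var seen: Option[String] = None
    val worker = new Thread(new Runnable {
      // A plain ThreadLocal has no value on a new thread, so get() returns null here.
      def run(): Unit = seen = Option(requester.get())
    })
    worker.start()
    worker.join()
    seen
  }
}
```

The same thing happens when a Future runs on a thread pool: unless we copy the value across by hand, the worker sees nothing.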
A Reader is a type constructor: Reader[R, A] where R represents the type of the environment parameters and A the result of the computation.
Here is how our blog post example would look with Reader:
def createEmptyBlogPost(title: String): Reader[(UserId, Clock), BlogPost] = ...
As you can see, the return type changed from BlogPost to Reader[R, BlogPost] and the two implicit parameters moved to the R type parameter.
Reader lets us combine multiple Reader values together using a for-comprehension. However, this only works if all the Reader values have the same R. For example, this code doesn’t compile because createProject only requires a UserId and not a Clock.
def createProject(projectName: String): Reader[UserId, Project] = ...
for {
  blog    <- createEmptyBlogPost("Implicits for the noob")
  project <- createProject("tutorial") // ❌ doesn’t compile
} yield ...
So if we want to use the Reader approach, we either need to use the same environment variables for all functions or we need to narrow or expand the R type appropriately, which often ends up being more verbose than explicit parameters.
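To make this concrete, here is a hand-written, minimal Reader. It is only a sketch for illustration, not the implementation of any library, and I replaced the Clock interface with a simplified case class. The method local is my own helper that widens the environment manually.

```scala
object ReaderExample {
  // A minimal Reader written for this example.
  final case class Reader[R, A](run: R => A) {
    def map[B](f: A => B): Reader[R, B] = Reader((r: R) => f(run(r)))
    def flatMap[B](f: A => Reader[R, B]): Reader[R, B] = Reader((r: R) => f(run(r)).run(r))
    // Adapts a Reader to a larger environment by saying how to extract R from it.
    def local[R2](extract: R2 => R): Reader[R2, A] = Reader((r2: R2) => run(extract(r2)))
  }

  case class UserId(value: String)
  case class Clock(nowMillis: Long) // simplified stand-in for the Clock interface
  case class BlogPost(author: UserId, title: String, createdAtMillis: Long)
  case class Project(owner: UserId, name: String)

  def createEmptyBlogPost(title: String): Reader[(UserId, Clock), BlogPost] =
    Reader((env: (UserId, Clock)) => BlogPost(env._1, title, env._2.nowMillis))

  def createProject(projectName: String): Reader[UserId, Project] =
    Reader((userId: UserId) => Project(userId, projectName))

  // createProject must be widened by hand before the two can be combined.
  val program: Reader[(UserId, Clock), (BlogPost, Project)] =
    for {
      blog    <- createEmptyBlogPost("Implicits for the noob")
      project <- createProject("tutorial").local[(UserId, Clock)](_._1)
    } yield (blog, project)
}
```

The call to local is exactly the kind of plumbing that grows with every new environment type.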
In cats (typelevel), Reader is called Kleisli. You can find more details here.
ZIO is also a type constructor but with three type parameters: ZIO[R, E, A], where R also represents the type of the environment parameters. However, ZIO solves the shortcoming of Reader by using variance. In a nutshell, this means that we can compose ZIO values with different environment variables and the compiler automatically expands the R type appropriately.
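The variance trick can be sketched without ZIO itself. The Reader below is contravariant in R and its flatMap accepts a more specific environment. This is my own simplified illustration of the idea, not ZIO's actual implementation, and HasUserId and HasClock are hypothetical traits.

```scala
object VarianceExample {
  // A Reader that is contravariant in its environment R.
  final case class Reader[-R, +A](run: R => A) {
    def map[B](f: A => B): Reader[R, B] = Reader((r: R) => f(run(r)))
    // R1 <: R lets the next step require a more specific environment.
    def flatMap[R1 <: R, B](f: A => Reader[R1, B]): Reader[R1, B] =
      Reader((r: R1) => f(run(r)).run(r))
  }

  case class UserId(value: String)
  trait HasUserId { def userId: UserId }
  trait HasClock { def nowMillis: Long }

  case class BlogPost(author: UserId, title: String, createdAtMillis: Long)
  case class Project(owner: UserId, name: String)

  def createEmptyBlogPost(title: String): Reader[HasUserId with HasClock, BlogPost] =
    Reader((env: HasUserId with HasClock) => BlogPost(env.userId, title, env.nowMillis))

  def createProject(projectName: String): Reader[HasUserId, Project] =
    Reader((env: HasUserId) => Project(env.userId, projectName))

  // No manual widening: the combined environment is the intersection of both requirements.
  val program: Reader[HasUserId with HasClock, (BlogPost, Project)] =
    for {
      blog    <- createEmptyBlogPost("Implicits for the noob")
      project <- createProject("tutorial")
    } yield (blog, project)

  // A concrete environment provides both capabilities.
  val env: HasUserId with HasClock = new HasUserId with HasClock {
    val userId: UserId = UserId("john_1234")
    val nowMillis: Long = 0L
  }
  val result: (BlogPost, Project) = program.run(env)
}
```

Compared with the invariant Reader, the for-comprehension compiles as written, because a Reader that needs only a HasUserId is also usable wherever a HasUserId with HasClock is available.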
Find more details on the official ZIO website.