A common issue that comes up when reviewing pull requests is a misunderstanding of where the boundaries of a Future lie. People often take for granted that, given a function with this signature
def asyncJob(): Future[Int]
the full body of the function will be executed in the Future.
Let’s look at this more closely with a couple of examples.
Example 1
Here’s a question for you: when running this program, what’s the value of the I/O latency printed out: 5 secs or 7 secs?
    def main(): Unit = {

      def recordTime(requestTimestamp: Long): PartialFunction[Try[_], Unit] = {
        case Success(_) =>
          logger.info("running 'andThen' function in the future")
          logger.info(s"io latency: ${System.currentTimeMillis() - requestTimestamp}")
      }

      def asyncJob(): Future[Int] = {
        logger.info("executing some code in main thread")
        Thread.sleep(2000)
        //async call
        Future {
          logger.info("io call running in new thread")
          Thread.sleep(5000)
          3
        }
      }

      val f = asyncJob()
        .map {
          logger.info("running 'map' body in the present ...")
          i: Int => {
            logger.info("running 'map' function in the future")
            i + 1
          }
        }
        .andThen {
          logger.info("running 'andThen' body in the present ...")
          recordTime(System.currentTimeMillis())
        }

      logger.info(Await.result(f, 10 seconds).toString)
    }
If you said 5 secs, you are right! Let’s examine the output of the program:
    18:10:56.708 [main] INFO fjab.Main$ - executing some code in main thread
    18:10:59.070 [main] INFO fjab.Main$ - running 'map' body in the present ...
    18:10:59.071 [scala-execution-context-global-15] INFO fjab.Main$ - io call running in new thread
    18:10:59.072 [main] INFO fjab.Main$ - running 'andThen' body in the present ...
    18:11:04.078 [scala-execution-context-global-16] INFO fjab.Main$ - running 'map' function in the future
    18:11:04.079 [scala-execution-context-global-16] INFO fjab.Main$ - running 'andThen' function in the future
    18:11:04.079 [scala-execution-context-global-16] INFO fjab.Main$ - io latency: 5007
    18:11:04.084 [main] INFO fjab.Main$ - 4
The first Thread.sleep() in asyncJob() is executed by the main thread. The main thread then starts the Future, which runs in a different thread, and goes on to evaluate the body of map and the body of andThen. The request timestamp is captured while the body of andThen is evaluated, that is, after the 2-second sleep has already elapsed, which is why the reported latency is 5 seconds rather than 7. Finally, the main thread blocks to wait for the result of the Future.
Meanwhile, the Future is running in the new thread and, after completion, another thread executes the callbacks defined in map and andThen.
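The split between the body of a block passed to map and the function that block returns can be reproduced without logging or sleeps. The following is only an illustrative sketch, not the code from this post: the names EvaluationOrder, record and gate are made up, and a manually completed Promise stands in for the slow I/O call.

```scala
import scala.collection.mutable.ListBuffer
import scala.concurrent.{Await, Promise}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._

object EvaluationOrder {

  // Returns the order in which the different pieces of code ran.
  def run(): List[String] = {
    val events = ListBuffer.empty[String]
    // Hypothetical helper: thread-safe append, since callbacks run on pool threads.
    def record(event: String): Unit = events.synchronized { events += event; () }

    // Hypothetical stand-in for the I/O call: completes only when we say so.
    val gate = Promise[Int]()

    val f = gate.future.map {
      // Evaluated now, on the calling thread, while the argument to map is built.
      record("map body")
      (i: Int) => {
        // Evaluated later, on a pool thread, once the Future has a value.
        record("map function")
        i + 1
      }
    }
    record("after map call")

    gate.success(3)            // "I/O" finishes here
    Await.result(f, 5.seconds) // wait until the mapped Future is complete
    events.synchronized { events.toList }
  }
}
```

Calling EvaluationOrder.run() should return List("map body", "after map call", "map function"): only the function returned by the block is deferred, while everything else in the block runs immediately on the caller's thread.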
Example 2
Let’s consider this other example, where asyncJob() calls a function that can throw an exception.
    def main(): Unit = {

      def asyncJob(): Future[Int] = {
        evilFunction()
        //async call
        Future {
          logger.info("io call running in new thread")
          Thread.sleep(5000)
          3
        }
      }

      def evilFunction() = throw new Exception

      val f = asyncJob()

      f.onComplete {
        case Success(value) => logger.info(value.toString)
        case Failure(exception) => logger.info("error")
      }

      Await.result(f, 10 seconds)
    }
Given the signature of asyncJob(), we’d expect the following piece of code to cover any eventuality:
    f.onComplete {
      case Success(value) => logger.info(value.toString)
      case Failure(exception) => logger.info("error")
    }
and yet the exception thrown by evilFunction() never reaches the callback and crashes the program instead, because it is thrown in the main thread before the Future is even created, not inside the Future's execution context:
    Exception in thread "main" java.lang.Exception
        at fjab.Main$.evilFunction$1(Main.scala:28)
        at fjab.Main$.asyncJob$1(Main.scala:18)
        at fjab.Main$.example2(Main.scala:30)
        at fjab.Main$.main(Main.scala:13)
        at fjab.Main.main(Main.scala)
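When a function like asyncJob() cannot be changed, one option is for the caller to turn a synchronous throw into a failed Future, so that handlers such as onComplete see it. This is only a sketch under assumptions: liftSyncFailure is a hypothetical helper (it is not part of the standard library), and the asyncJob and evilFunction below are simplified stand-ins for the ones in Example 2.

```scala
import scala.concurrent.Future
import scala.concurrent.ExecutionContext.Implicits.global
import scala.util.control.NonFatal

object CallerSideGuard {

  // Hypothetical helper: evaluates `job` and converts a non-fatal exception
  // thrown before the Future exists into a failed Future.
  def liftSyncFailure[A](job: => Future[A]): Future[A] =
    try job
    catch { case NonFatal(e) => Future.failed(e) }

  // Simplified stand-in for the post's evilFunction().
  def evilFunction(): Unit = throw new Exception("thrown on the caller's thread")

  // Simplified stand-in for the post's asyncJob(): it throws before creating the Future.
  def asyncJob(): Future[Int] = {
    evilFunction()
    Future { 3 }
  }
}
```

With this in place, CallerSideGuard.liftSyncFailure(CallerSideGuard.asyncJob()) yields a Future that fails with the exception rather than letting it propagate up the caller's stack.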
The moral of the story? It is important to check the boundaries of your Futures. Most of these issues can be avoided by adopting the practice of wrapping the whole body of the function in a Future. If that’s not possible, the function should be broken down into smaller units until the full body of each asynchronous function can be wrapped in a Future.
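As a rough illustration of that practice, the sketch below contrasts a function that does work outside the Future with one whose whole body is wrapped. The object name WrappedJob and the two function names are invented for this example, and evilFunction is a simplified stand-in for the one in Example 2.

```scala
import scala.concurrent.Future
import scala.concurrent.ExecutionContext.Implicits.global

object WrappedJob {

  // Simplified stand-in for the post's evilFunction().
  def evilFunction(): Unit = throw new Exception("boom")

  // Leaky boundary: evilFunction() runs on the caller's thread,
  // so the exception is thrown at the call site and no Future is returned.
  def leakyJob(): Future[Int] = {
    evilFunction()
    Future { 3 }
  }

  // Whole body wrapped: evilFunction() runs inside the Future,
  // so the exception is captured and surfaces as a failed Future.
  def wrappedJob(): Future[Int] = Future {
    evilFunction()
    3
  }
}
```

With wrappedJob(), a handler such as onComplete receives a Failure, whereas calling leakyJob() throws before there is any Future to attach a handler to.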
The examples in this post can be found in my repo