Friday, January 25, 2013

Scala Try/Catch Lifting Proposal

TL;DR

Scalac currently deals with try/catch on a potentially non empty stack by lifting to a def. But that can lead to more boxing of vars captured in the lifted defs and a second pass to deal with try/catches that get put into the boxing constructor. I propose we use local vals to lift try/catch expressions instead of using defs. That will prevent the extra boxing and, perhaps later, allow us to inline more code.

Background

Scala makes try/catch an expression. But the JVM is actively hostile to using try/catch as an expression because the instant a throw happens the local stack frame is erased. So if we naively expand try/catch then an expression like "foo(bar(), try/catch)" leads to different stack heights before foo is finally applied. The "main" branch has both "this" and the bar() result on the stack, the catch branch does not. Boom, bytecode verify error. In fact that has been a source of several bugs as we've missed some corner case or other involving try/catch. In particular, nested try/catch can also break us in things like in foo(bar(), {val x = 42; blurg match {case whatever => if(something) x else try/catch}}) where the try/catch is nested in an if/else in a match in a block.

Current Situation

Currently UnCurry looks for try/catch in a spot where it could cause this problem and lifts such try/catch expressions into defs. In
other words

foo(bar(), try a catch {case b => c})

becomes

def liftedTry1 = try a catch {case b => c})
foo(bar(), liftedTry1)

And that's fine. But there's a nasty wrinkle. If the try/catch refers to a local var then suddenly that var is a captured var that needs boxing. That's a boxing that a user would probably not predict without a very subtle understanding of Scala internals.

Further, the boxing, which happens in LambdaLift, may well lead yet again to try/catch on a non-empty stack.

var x = try a catch {case b => c})
foo(bar(), try x catch{...})

After UnCurry

var x = try a catch {case b => c})
def litedTry1 = try x catch{...}
foo(bar(), liftedTry1)

After captured var boxing in LambdaLift

var x = new IntRef(try a catch {case b => c}))
def litedTry1 = try x.value catch{...}
foo(bar(), liftedTry1)

Having the try/catch in the constructor call means it's going to end up on a non-empty stack. To deal with that LambdaLift does a post-transform step to deal with try/catch yet again.

var x = try new IntRef(a) catch {case b => new IntRef(c)}))
def litedTry1 = try x.value catch{...}
foo(bar(), liftedTry1)

And again that post-transform needs to be smart about try/catch nested inside things.

Proposal Outline

I propose that we get rid of the try lifting logic in both UnCurry and LambdaLift and defer it to a later phase. That phase would look for try/catch problems and, when found, transform into local vals. An example:

val r = quux().foo(bar(), try a catch {case b => c}), baz())

So finding a try/catch inside an expression on a non-empty stack, we need to transform to local vals like so:

val r = {
  val temp0 = quux()
  val temp1 = bar()
  val temp2 = try a catch {case b => c})
  temp0.foo(temp1, temp2, baz())
}

It's important that quux() and bar() get evaluated and stuffed into vals in order to preserve order of evaluation. (By-name translation will have long since been done, so it need not be considered here)

We also need to deal recursively with nested try/catch

val r = quux().foo(bar(), qwerb().flurb(try a catch {case b => c})), baz())

step1:

val r = {
  val temp0 = quux()
  val temp1 = bar()
  val temp2 = qwerb().flurb(try a catch {case b => c}))
  temp0.foo(temp1, temp2, baz())
}

That leaves a try/catch expression on a non-empty stack, so
step2:

val r = {
  val temp0 = quux()
  val temp1 = bar()
  val temp2 = {
    val temp4 = qwerb()
    val temp5 = try a catch {case b => c})
    temp4.flurb(temp5)
  }
  temp0.foo(temp1, temp2, baz())
}

Details still need to be worked out. The existing logic in UnCurry seems to be a good starting point for finding problematic try/catch, though.

Inlining

The main body of the proposal stands alone and I think it's worth while no matter what. But it may allow some improvements in inlining

Because we want to inline code that may already be compiled, we defer inlining until we're working with i-code. As part of its analysis the inliner examines the code it wants to inline to see if it has any exception handlers. If it does then it refuses to inline the code into a spot that has a non-empty stack. In other words, it's a third place we deal with problematic try/catch, but this time by refusing to inline.

I propose that after try/catch lifting is implemented via local vals, we can make inlining decisions earlier in the compiler, perhaps right after UnCurry. Under this scheme, if we decide we want to inline then we would carry around a node representing the to-be inlined code. Then, when we get to the try/catch lifting phase it conservatively treat the code like any other node that has a try/catch, lifting to a local val if a potential try/catch would be problematic. Then the inlining phase would do the actual inlining.

Besides allowing code with exception handlers to be inlined, with this second part of the proposal the compiler should have to do less work over all. Currently we do a lot of work analyzing and transforming lambdas that may eventually be transformed away by the inliner.

I have to admit this second part of the proposal has no meat on it. It's not clear it's even workable. But it is at least some argument supporting lifting try/catch with local vals.