My Profile Photo

y-code


Blog about things I am mostly interested in code: functional programming, parsing grammars, compilers


opTreePF Function

In previous post was told how Parser works under the hood. Summarizing, rule is a macro that replaces AST of rule body with another code.

Let us have parser as follows:

class SimpleParser(val input: ParserInput) extends Parser {
  def abR = rule { "ab" }
}

Before expanding rule macro, str implicit would be applied to a nested string "ab" as everything inside rule body should be of type Rule:

  def abR = rule { SimpleParser.this.str("ab") }

And then it would be parsed by opTreePF:

  val opTreePF: PartialFunction[Tree, Tree] = {
    case q"$a.this.str($s)" => ???
      // `a` would be `SimpleParser` and `s` would be `"ab"` in pattern-matched body
  }

Some Options to Implement str Rule

opTreePF returns piece of code that would work at runtime. str might be implemented in several ways. Naive implementation is as follows:

case q"$a.this.str($s)" => qinput.sliceString(cursor, cursor + $s.lengith) == $s

A more effective imperative implementation is that one:

case q"$a.this.str($s)" => q“““
  var ix = 0
  while (ix < $s.length && cursorChar == $s.charAt(ix)) {
    ix += 1
    __advance()
  }
  ix == $s.length ”””

Optimization when s is empty string could be applied as follows:

case q"$a.this.str($s)" => q“““
  if ($s.isEmpty) true
  else {
    var ix = 0
    while (ix < $s.length && cursorChar == $s.charAt(ix)) {
      ix += 1
      __advance()
    }
    ix == $s.length
  } ”””

As well as specialization to s if it is a literal constant:

def unroll(s: String, ix: Int = 0): Tree =
  if (ix < s.length) { q“““
    if (cursorChar == ${s charAt ix}) {
      __advance()
      ${unroll(s, ix + 1)}
    } else false
  ””” } else qtrue

case q"$a.this.str($s)" => s match {
  case Literal(Constant(sc)) => unroll(sc)
  case _ => q“““
    if ($s.isEmpty) true
    else {
      var ix = 0
      while (ix < $s.length && cursorChar == $s.charAt(ix)) {
        ix += 1
        __advance()
      }
      ix == $s.length
    } ”””

Real-World Implementation

parboiled2 uses last implementation. Moreover it defines opTreePF signature differently:

val opTreePF: PartialFunction[Tree, OpTree]

where OpTree and its descendants implements code generation logic:

sealed abstract class OpTree {
  def ruleFrame: Tree

  def renderRule(ruleName: Tree): Tree = q“““
    // split out into separate method so as to not double the rule method size
    // which would effectively decrease method inlining by about 50%
    def wrapped: Boolean = ${render(wrapped = true, ruleName)}
    val matched =
      if (__collectingErrors) wrapped
      else ${render(wrapped = false)}
    // "matched" boolean is encoded as 'ruleResult ne null'
    if (matched) org.parboiled2.Rule else null”””

  // renders a Boolean Tree
  def render(wrapped: Boolean, ruleName: Tree = Literal(Constant(""))): Tree =
    if (wrapped) q“““
      try ${renderInner(wrapped)}
      catch {
        case e: org.parboiled2.Parser.CollectingRuleStackException 
          e.save(org.parboiled2.RuleFrame($ruleFrame, $ruleName))
      }”””
    else renderInner(wrapped)

  // renders a Boolean Tree
  protected def renderInner(wrapped: Boolean): Tree
}

Important thing to notice is that OpTree generates same method both for error-less execution and error collection. If __collectingErrors flag is set then renderInner code is wrapped in try/catch block that saves context information where parsing failed.

Summary

In this blog article we learned what opTreePF is, and its role on path from RuleDSL to code generation.

comments powered by Disqus