In previous post was told how Parser
works under the hood. Summarizing, rule
is a macro that replaces AST of rule body with another code.
Let us have parser as follows:
class SimpleParser(val input: ParserInput) extends Parser {
def abR = rule { "ab" }
}
Before expanding rule
macro, str
implicit would be applied to a nested string "ab"
as everything inside rule body should be of type Rule
:
def abR = rule { SimpleParser.this.str("ab") }
And then it would be parsed by opTreePF
:
val opTreePF: PartialFunction[Tree, Tree] = {
case q"$a.this.str($s)" => ???
// `a` would be `SimpleParser` and `s` would be `"ab"` in pattern-matched body
}
Some Options to Implement str
Rule
opTreePF
returns piece of code that would work at runtime. str
might be implemented in several ways. Naive implementation is as follows:
case q"$a.this.str($s)" => q“input.sliceString(cursor, cursor + $s.lengith) == $s”
A more effective imperative implementation is that one:
case q"$a.this.str($s)" => q“““
var ix = 0
while (ix < $s.length && cursorChar == $s.charAt(ix)) {
ix += 1
__advance()
}
ix == $s.length ”””
Optimization when s
is empty string could be applied as follows:
case q"$a.this.str($s)" => q“““
if ($s.isEmpty) true
else {
var ix = 0
while (ix < $s.length && cursorChar == $s.charAt(ix)) {
ix += 1
__advance()
}
ix == $s.length
} ”””
As well as specialization to s
if it is a literal constant:
def unroll(s: String, ix: Int = 0): Tree =
if (ix < s.length) { q“““
if (cursorChar == ${s charAt ix}) {
__advance()
${unroll(s, ix + 1)}
} else false
””” } else q“true”
case q"$a.this.str($s)" => s match {
case Literal(Constant(sc)) => unroll(sc)
case _ => q“““
if ($s.isEmpty) true
else {
var ix = 0
while (ix < $s.length && cursorChar == $s.charAt(ix)) {
ix += 1
__advance()
}
ix == $s.length
} ”””
Real-World Implementation
parboiled2
uses last implementation. Moreover it defines opTreePF
signature differently:
val opTreePF: PartialFunction[Tree, OpTree]
where OpTree
and its descendants implements code generation logic:
sealed abstract class OpTree {
def ruleFrame: Tree
def renderRule(ruleName: Tree): Tree = q“““
// split out into separate method so as to not double the rule method size
// which would effectively decrease method inlining by about 50%
def wrapped: Boolean = ${render(wrapped = true, ruleName)}
val matched =
if (__collectingErrors) wrapped
else ${render(wrapped = false)}
// "matched" boolean is encoded as 'ruleResult ne null'
if (matched) org.parboiled2.Rule else null”””
// renders a Boolean Tree
def render(wrapped: Boolean, ruleName: Tree = Literal(Constant(""))): Tree =
if (wrapped) q“““
try ${renderInner(wrapped)}
catch {
case e: org.parboiled2.Parser.CollectingRuleStackException ⇒
e.save(org.parboiled2.RuleFrame($ruleFrame, $ruleName))
}”””
else renderInner(wrapped)
// renders a Boolean Tree
protected def renderInner(wrapped: Boolean): Tree
}
Important thing to notice is that OpTree
generates same method both for error-less execution and error collection. If __collectingErrors
flag is set then renderInner
code is wrapped in try/catch
block that saves context information where parsing failed.
Summary
In this blog article we learned what opTreePF
is, and its role on path from RuleDSL to code generation.