Erlang Plugin for NetBeans in Scala #4: Minimal Parser Integration

With the lexer integrated, you have all the tokens of the source file. You can now write pretty formatting, indentation, automatic pair insertion, and pair matching based on those tokens. But I'll integrate an Erlang parser first, to get the basic infrastructure ready.

I could use Erlang's native compiler: via jInterface, Scala can make an RPC call to the Erlang runtime to parse the source file and return an AST. But Erlang's library functions for parsing are currently not good at error recovery, so I chose to write an Erlang parser in Rats! instead. A Rats!-generated parser is also weak on error messages and error recovery, but I know how to improve it.

Migrating a NetBeans Schliemann-based grammar definition to Rats! is almost straightforward. As I'm already familiar with Rats!, it took me one day. The preliminary Rats! definition for Erlang can be found at ParserErlang.rats. I then wrote a new ParserErlang.rats according to the latest Erlang spec in another half day; without the LL(k) limitation, the grammar rules now stay close to the original spec.

To generate ParserErlang.java from it, I added a corresponding Ant target to build.xml, which looks like:

    <target name="rats" depends="init" description="Scanner">
        <echo message="Rebuilding token scanner... ${rats.package.dir}"/>
        <java fork="yes"
             dir="${src.dir}/${rats.package.dir}"
             classname="xtc.parser.Rats"
             classpath="${rats.jar}">
            <arg value="-in"/>
            <arg value="${src.dir}"/>
            <arg value="${rats.lexer.file}"/>
        </java>

        <echo message="Rebuilding grammar parser... ${rats.package.dir}"/>
        <java fork="yes"
             dir="${src.dir}/${rats.package.dir}"
             classname="xtc.parser.Rats"
             classpath="${rats.jar}">
            <arg value="-in"/>
            <arg value="${src.dir}"/>
            <arg value="${rats.parser.file}"/>
        </java>
    </target>

The Rats!-related properties, such as "rats.lexer.file" and "rats.parser.file", are defined in nbproject/project.properties as:

rats.jar=${cluster}/modules/xtc.jar
rats.package.dir=org/netbeans/modules/erlang/editor/rats
rats.lexer.file=LexerErlang.rats
rats.parser.file=ParserErlang.rats

Running the "rats" target generates LexerErlang.java and ParserErlang.java. The first is the one we've used in ErlangLexer.scala; the latter is to be integrated into NetBeans' Parsing API as the parser.

Now you need to extend two Parsing API abstract classes: org.netbeans.modules.csl.spi.ParserResult and org.netbeans.modules.parsing.spi.Parser. The first carries the resulting AST node and the syntax errors the parser detected; the errors will be highlighted automatically in the editor. The second is the bridge between the NetBeans parsing task and your real parser (ParserErlang.java here).

There are tricks for doing some error recovery in these two classes: when a syntax error occurs, add a ".", "end", "}", etc. to the buffered source characters. This is called "sanitizing" the source to recover from the error. NetBeans' Ruby and JavaScript support have some good examples of this trick. In my case, I will do error recovery in the Rats! definition later, so I don't use this trick for now, but I leave some "sanitize"-related code in place.
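
To make the sanitize idea concrete, here is a minimal sketch on plain strings, independent of the NetBeans APIs. It is not the plugin's code: the names SanitizeSketch, candidates, insertAt and sanitize are hypothetical, and the parses function is an assumed stand-in for a real parser that reports whether the text parses without a syntax error.

```scala
// Hypothetical helper, not part of the plugin: shows the "sanitize" trick
// on a plain String. `parses` stands in for the real parser.
object SanitizeSketch {
  /** Terminators to try, in order, when the unmodified source fails to parse. */
  val candidates: List[String] = List(".", " end", "}")

  /** Inserts `patch` into `source` at `offset`, clamping the offset to the text bounds. */
  def insertAt(source: String, offset: Int, patch: String): String = {
    val at = math.max(0, math.min(offset, source.length))
    source.substring(0, at) + patch + source.substring(at)
  }

  /**
   * Tries the original source first, then the source with each candidate
   * inserted at the error offset. Returns the first text that parses,
   * or None when every attempt fails.
   */
  def sanitize(source: String, errorOffset: Int, parses: String => Boolean): Option[String] = {
    val attempts = source :: candidates.map(patch => insertAt(source, errorOffset, patch))
    attempts.find(parses)
  }
}
```

A real implementation would also remember which range was patched (compare sanitizedRange in the listings below), so that errors and offsets can be mapped back to the text the user actually typed.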

First, ErlangParserResult.scala, which extends ParserResult. The code is simple in this first phase.

ErlangParserResult.scala

/*
 * DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS HEADER.
 *
 * Copyright 1997-2007 Sun Microsystems, Inc. All rights reserved.
 *
 * The contents of this file are subject to the terms of either the GNU
 * General Public License Version 2 only ("GPL") or the Common
 * Development and Distribution License("CDDL") (collectively, the
 * "License"). You may not use this file except in compliance with the
 * License. You can obtain a copy of the License at
 * http://www.netbeans.org/cddl-gplv2.html
 * or nbbuild/licenses/CDDL-GPL-2-CP. See the License for the
 * specific language governing permissions and limitations under the
 * License.  When distributing the software, include this License Header
 * Notice in each file and include the License file at
 * nbbuild/licenses/CDDL-GPL-2-CP.  Sun designates this
 * particular file as subject to the "Classpath" exception as provided
 * by Sun in the GPL Version 2 section of the License file that
 * accompanied this code. If applicable, add the following below the
 * License Header, with the fields enclosed by brackets [] replaced by
 * your own identifying information:
 * "Portions Copyrighted [year] [name of copyright owner]"
 *
 * Contributor(s):
 *
 * The Original Software is NetBeans. The Initial Developer of the Original
 * Software is Sun Microsystems, Inc. Portions Copyright 1997-2006 Sun
 * Microsystems, Inc. All Rights Reserved.
 *
 * If you wish your version of this file to be governed by only the CDDL
 * or only the GPL Version 2, indicate your decision by adding
 * "[Contributor] elects to include this software in this distribution
 * under the [CDDL or GPL Version 2] license." If you do not indicate a
 * single choice of license, a recipient has the option to distribute
 * your version of this file under either the CDDL, the GPL Version 2 or
 * to extend the choice of license to its licensees as provided above.
 * However, if you add GPL Version 2 code and therefore, elected the GPL
 * Version 2 license, then the option applies only if the new code is
 * made subject to such option by the copyright holder.
 */
package org.netbeans.modules.erlang.editor

import _root_.java.util.{Collections, ArrayList, List}
import org.netbeans.api.lexer.{TokenHierarchy, TokenId}
import org.netbeans.modules.csl.api.Error
import org.netbeans.modules.csl.api.OffsetRange
import org.netbeans.modules.csl.spi.ParserResult
import org.netbeans.modules.parsing.api.Snapshot
import org.netbeans.modules.erlang.editor.rats.ParserErlang
import xtc.tree.{GNode}

/**
 *
 * @author Caoyuan Deng
 */
class ErlangParserResult(parser:ErlangParser,
                         snapshot:Snapshot,
                         val rootNode:GNode,
                         val th:TokenHierarchy[_]) extends ParserResult(snapshot) {

    override
    protected def invalidate :Unit = {
        // XXX: what exactly should we do here?
    }

    override
    def getDiagnostics :List[Error] = _errors

    private var _errors = Collections.emptyList[Error]
    
    def errors = _errors
    def errors_=(errors:List[Error]) = {
        this._errors = new ArrayList[Error](errors)
    }

    var source :String = _
    
    /**
     * Return whether the source code for the parse result was "cleaned"
     * or "sanitized" (modified to reduce chance of parser errors) or not.
     * This method returns OffsetRange.NONE if the source was not sanitized,
     * otherwise returns the actual sanitized range.
     */
    var sanitizedRange = OffsetRange.NONE
    var sanitizedContents :String = _
    var sanitized :Sanitize = NONE

    var isCommentsAdded :Boolean = false
    
    /**
     * Set the range of source that was sanitized, if any.
     */
    def setSanitized(sanitized:Sanitize, sanitizedRange:OffsetRange, sanitizedContents:String) :Unit = {
        this.sanitized = sanitized
        this.sanitizedRange = sanitizedRange
        this.sanitizedContents = sanitizedContents
    }

    override
    def toString = {
        "ErlangParserResult(file=" + snapshot.getSource.getFileObject + ",rootnode=" + rootNode + ")"
    }
}

Then ErlangParser.scala, which extends Parser:

ErlangParser.scala

/*
 * DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS HEADER.
 *
 * Copyright 1997-2007 Sun Microsystems, Inc. All rights reserved.
 *
 * The contents of this file are subject to the terms of either the GNU
 * General Public License Version 2 only ("GPL") or the Common
 * Development and Distribution License("CDDL") (collectively, the
 * "License"). You may not use this file except in compliance with the
 * License. You can obtain a copy of the License at
 * http://www.netbeans.org/cddl-gplv2.html
 * or nbbuild/licenses/CDDL-GPL-2-CP. See the License for the
 * specific language governing permissions and limitations under the
 * License.  When distributing the software, include this License Header
 * Notice in each file and include the License file at
 * nbbuild/licenses/CDDL-GPL-2-CP.  Sun designates this
 * particular file as subject to the "Classpath" exception as provided
 * by Sun in the GPL Version 2 section of the License file that
 * accompanied this code. If applicable, add the following below the
 * License Header, with the fields enclosed by brackets [] replaced by
 * your own identifying information:
 * "Portions Copyrighted [year] [name of copyright owner]"
 *
 * Contributor(s):
 *
 * The Original Software is NetBeans. The Initial Developer of the Original
 * Software is Sun Microsystems, Inc. Portions Copyright 1997-2006 Sun
 * Microsystems, Inc. All Rights Reserved.
 *
 * If you wish your version of this file to be governed by only the CDDL
 * or only the GPL Version 2, indicate your decision by adding
 * "[Contributor] elects to include this software in this distribution
 * under the [CDDL or GPL Version 2] license." If you do not indicate a
 * single choice of license, a recipient has the option to distribute
 * your version of this file under either the CDDL, the GPL Version 2 or
 * to extend the choice of license to its licensees as provided above.
 * However, if you add GPL Version 2 code and therefore, elected the GPL
 * Version 2 license, then the option applies only if the new code is
 * made subject to such option by the copyright holder.
 */
package org.netbeans.modules.erlang.editor


import _root_.java.io.{IOException, StringReader}
import _root_.java.util.ArrayList
import _root_.java.util.Collection
import _root_.java.util.List
import _root_.java.util.ListIterator
import _root_.javax.swing.event.ChangeListener
import _root_.javax.swing.text.BadLocationException

import org.netbeans.modules.csl.api.ElementHandle
import org.netbeans.modules.csl.api.Error
import org.netbeans.modules.csl.api.OffsetRange
import org.netbeans.modules.csl.api.Severity
import org.netbeans.modules.csl.spi.DefaultError
import org.netbeans.modules.parsing.api.Snapshot
import org.netbeans.modules.parsing.api.Task
import org.netbeans.modules.parsing.spi.ParseException
import org.netbeans.modules.csl.api.EditHistory
import org.netbeans.modules.csl.spi.GsfUtilities
import org.netbeans.modules.csl.spi.ParserResult
import org.netbeans.modules.parsing.api.Source
import org.netbeans.modules.parsing.spi.Parser
import org.netbeans.modules.parsing.spi.Parser.Result
import org.netbeans.modules.parsing.spi.ParserFactory
import org.netbeans.modules.parsing.spi.SourceModificationEvent
import org.openide.filesystems.FileObject
import org.openide.util.Exceptions

import org.netbeans.api.editor.EditorRegistry
import org.netbeans.api.lexer.{TokenHierarchy, TokenId}
import org.netbeans.editor.BaseDocument
import org.netbeans.modules.editor.NbEditorUtilities

import xtc.parser.{ParseError, SemanticValue}
import xtc.tree.{GNode, Location}

import org.netbeans.modules.erlang.editor.lexer.ErlangTokenId
import org.netbeans.modules.erlang.editor.rats.ParserErlang

/**
 *
 * @author Caoyuan Deng
 */
class ErlangParser extends Parser {

    private var lastResult :ErlangParserResult = _

    @throws(classOf[ParseException])
    override
    def parse(snapshot:Snapshot, task:Task, event:SourceModificationEvent) :Unit = {
        val context = new Context(snapshot, event)
        lastResult = parseBuffer(context, NONE)
        lastResult.errors = context.errors
    }

    @throws(classOf[ParseException])
    override
    def getResult(task:Task) :Result = {
        assert(lastResult != null, "getResult() called prior to parse()") //NOI18N
        lastResult
    }

    override
    def cancel :Unit = {}

    override
    def addChangeListener(changeListener:ChangeListener) :Unit = {
        // no-op, we don't support state changes
    }

    override
    def removeChangeListener(changeListener:ChangeListener) :Unit = {
        // no-op, we don't support state changes
    }

    private def lexToAst(source:Snapshot, offset:Int) :Int = source match {
        case null => offset
        case _ => source.getEmbeddedOffset(offset)
    }

    private def astToLex(source:Snapshot, offset:Int) :Int = source match {
        case null => offset
        case _ => source.getOriginalOffset(offset)
    }

    private def sanitizeSource(context:Context, sanitizing:Sanitize) :Boolean = {
        false
    }

    private def sanitize(context:Context, sanitizing:Sanitize) :ErlangParserResult = {
        sanitizing match {
            case NEVER =>
                createParseResult(context)
            case NONE =>
                createParseResult(context)
            case _ =>
                // we are out of tricks, just return the result as is
                createParseResult(context)
        }
    }

    protected def notifyError(context:Context, message:String, sourceName:String,
                              start:Int, lineSource:String, end:Int,
                              sanitizing:Sanitize, severity:Severity,
                              key:String, params:Object) :Unit = {

        val error = new DefaultError(key, message, null, context.fo, start, end, severity)

        params match {
            case null =>
            case x:Array[Object] => error.setParameters(x)
            case _ => error.setParameters(Array(params))
        }

        context.notifyError(error)

        if (sanitizing == NONE) {
            context.errorOffset = start
        }
    }

    protected def parseBuffer(context:Context, sanitizing:Sanitize) :ErlangParserResult = {
        var sanitizedSource = false
        var source = context.source

        sanitizing match {
            case NONE | NEVER =>
            case _ =>
                val ok = sanitizeSource(context, sanitizing)
                if (ok) {
                    assert(context.sanitizedSource != null)
                    sanitizedSource = true
                    source = context.sanitizedSource
                } else {
                    // Try next trick
                    return sanitize(context, sanitizing)
                }
        }

        if (sanitizing == NONE) {
            context.errorOffset = -1
        }

        val parser = createParser(context)

        val ignoreErrors = sanitizedSource
        var root :GNode = null
        try {
            var error :ParseError = null
            val r = parser.pS(0)
            if (r.hasValue) {
                val v = r.asInstanceOf[SemanticValue]
                root = v.value.asInstanceOf[GNode]
            } else {
                error = r.parseError
            }

            if (error != null && !ignoreErrors) {
                var start = 0
                if (error.index != -1) {
                    start = error.index
                }
                notifyError(context, error.msg, "Syntax error",
                            start, "", start,
                            sanitizing, Severity.ERROR,
                            "SYNTAX_ERROR", Array(error))

                System.err.println(error.msg)
            }

        } catch {
            case e:IOException => e.printStackTrace
            case e:IllegalArgumentException =>
                // An internal exception thrown by parser, just catch it and notify
                notifyError(context, e.getMessage, "",
                            0, "", 0,
                            sanitizing, Severity.ERROR,
                            "SYNTAX_ERROR", Array(e))
        }

        if (root != null) {
            context.sanitized = sanitizing
            context.root = root
            val r = createParseResult(context)
            r.setSanitized(context.sanitized, context.sanitizedRange, context.sanitizedContents)
            r.source = source
            r
        } else {
            sanitize(context, sanitizing)
        }
    }

    protected def createParser(context:Context) :ParserErlang = {
        val in = new StringReader(context.source)
        val fileName = if (context.fo != null) context.fo.getNameExt else ""

        val parser = new ParserErlang(in, fileName)
        context.parser = parser

        parser
    }

    private def createParseResult(context:Context) :ErlangParserResult = {
        new ErlangParserResult(this, context.snapshot, context.root, context.th)
    }

    /** Parsing context */
    class Context(val snapshot:Snapshot, event:SourceModificationEvent) {
        val errors :List[Error] = new ArrayList[Error]

        var source :String = ErlangParser.asString(snapshot.getText)
        var caretOffset :Int = GsfUtilities.getLastKnownCaretOffset(snapshot, event)

        var root :GNode = _
        var th :TokenHierarchy[_] = _
        var parser :ParserErlang = _
        var errorOffset :Int = _
        var sanitizedSource :String = _
        var sanitizedRange :OffsetRange = OffsetRange.NONE
        var sanitizedContents :String = _
        var sanitized :Sanitize = NONE

        def notifyError(error:Error) = errors.add(error)

        def fo = snapshot.getSource.getFileObject

        override
        def toString = "ErlangParser.Context(" + fo + ")" // NOI18N

    }
}

object ErlangParser {
    def asString(sequence:CharSequence) :String = sequence match {
        case s:String => s
        case _ => sequence.toString
    }

    def sourceUri(source:Source) :String = source.getFileObject match {
        case null => "fileless" //NOI18N
        case f => f.getNameExt
    }
}

/** Attempts to sanitize the input buffer */
sealed abstract class Sanitize
/** Only parse the current file accurately, don't try heuristics */
case object NEVER extends Sanitize
/** Perform no sanitization */
case object NONE extends Sanitize

The major tasks of these two classes are:

  • Parsing the source buffer, which the framework passes in automatically when the source is modified or first opened; the source text can be obtained from snapshot.getText. The parsing result is usually an AST, on which you can do semantic/structural analysis later to get more detailed information. As a first step, I just keep the AST in ErlangParserResult for later use.
  • Notifying the framework of errors. This is done by storing the parsing errors in the ParserResult and implementing getDiagnostics to return them.
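
The division of labour between the two classes can be sketched without the NetBeans APIs. The following is a simplified model, not the plugin's code: SyntaxError, ResultSketch and ParserSketch are hypothetical stand-ins for Error, ErlangParserResult and ErlangParser, the AST is reduced to a String, and realParser is an assumed function playing the role of ParserErlang.

```scala
// Simplified model with stand-in types; the real ones come from the
// NetBeans Parsing and CSL APIs.
object ParsingFlowSketch {
  /** Stand-in for a CSL Error: a message plus the offending offset range. */
  final case class SyntaxError(message: String, start: Int, end: Int)

  /** Plays the role of ErlangParserResult: carries the AST and the errors. */
  final class ResultSketch(val root: Option[String], errs: Seq[SyntaxError]) {
    // Defensive copy, so later changes by the caller do not leak in.
    private val copied: List[SyntaxError] = errs.toList
    def getDiagnostics: List[SyntaxError] = copied
  }

  /** Plays the role of ErlangParser: bridges the framework and the real parser. */
  final class ParserSketch(realParser: String => Either[SyntaxError, String]) {
    private var lastResult: Option[ResultSketch] = None

    /** Called by the framework with the current source text. */
    def parse(text: CharSequence): Unit = {
      val result = realParser(text.toString) match {
        case Right(ast) => new ResultSketch(Some(ast), Nil)
        case Left(err)  => new ResultSketch(None, List(err))
      }
      lastResult = Some(result)
    }

    /** Called by the framework afterwards to fetch the AST and diagnostics. */
    def getResult: ResultSketch =
      lastResult.getOrElse(throw new IllegalStateException("getResult called before parse"))
  }
}
```

In the real classes, the framework calls parse with a Snapshot and later reads the errors through getDiagnostics to highlight them in the editor.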

The last step is to register ErlangParser in ErlangLanguage.scala with one line of code:

override def getParser = new ErlangParser

Now open a .erl file; if there is a syntax error, the editor will indicate and highlight it.

My next step is to improve the Rats! definition, adding error recovery, etc.
