BNF Converter Java 1.5 Mode

By Björn Bringert, 2005. Based on documentation for the Java Mode by Michael Pellauer.

One of the best features of the LBNF file format is that it is completely independent of the target programming language. The new Java 1.5 mode allows the BNF Converter to output a compiler front end in Java 15 using JLex and CUP.

The BNFC Java 1.5 mode was developed for JLex version 1.2.6 and CUP version 0.10k. The generated Java code requires JDK version 1.5.0 or greater.

BNF Converter Homepage: http://www.cs.chalmers.se/~markus/BNFC/

JLex Homepage: http://www.cs.princeton.edu/~appel/modern/java/JLex/

CUP Homepage: http://www.cs.princeton.edu/~appel/modern/java/CUP/

USAGE

bnfc [-m] -java1.5 FILENAME.cf

To access the Java 1.5 mode simply pass the -java1.5 command to bnfc. It uses FILENAME as the package name of the generated Java code. The result will be the following files:

Filename Description
Absyn/*.java Abstract Syntax tree package (in subdirectory Absyn)
FILENAME.cup CUP file (parser specification)
Yylex JLex file (lexer specification)
PrettyPrinter.java A Pretty Printer for the abstract syntax tree
VisitSkel.java A code skeleton for traversing the syntax tree which uses the Visitor Design Pattern
Test.java A test program
Makefile Makefile (generated only with the -m flag)

Please ensure that your $CLASSPATH correctly points to JLex.Main and java_cup.Main. It may also be useful to include the current working directory in the $CLASSPATH.

Compiling the Compiler Front End

You can use the generated Makefile to compile the generated Java code with make (gmake on some systems).

If all goes well the following files should be generated:

File Description
Absyn/*.class Abstract syntax class files
Yylex.java Lexer generated by JLex
Yylex.class Compiled lexer class
parser.java Parser generated by CUP
sym.java CUP token symbols
parser.class, sym.class,
CUP$parser$actions.class
Compiled parser classes
PrettyPrinter.class Compiled pretty printer
Test.class Compiled test program

Testing the Compiler Front End

The Java class Test.java may be used to test the result of the generation. To use it simply give it the name of a file written in the target language. If parsing is correct, the abstract syntax tree (in Haskell-like syntax) and linearized pretty-printed result will be shown.

For example, if we had created a subset of Java called Javalette, in Javalette.cf:

> java Javalette.Test helloworld.jl

Parse Succesful!

[Abstract Syntax]

(PDefList [(FuncDef TInt(FFuncName "main")[][(SPrintString "Hello
World"), (SReturn NoExpression)]NoSemicolon)])

[Linearized Tree]

int main () 
{
  printString ("Hello World") ;
  return ;
}

If no argument is given then it attempts to read from stdin.

Note that the current directory may need to be included in your $CLASSPATH.

Packages in the Generated Code

The BNF Converter generates code that makes use of Java packages. For example, if your file name was Javalette.cf, then it will generate code with a package name of "Javalette". Because of this your Java compiler may expect that the current directory name matches the package name. So in our example above the Java compiler probably will expect the current directory to be named "Javalette" and be included in the $CLASSPATH somehow.

Generation of the Abstract Syntax tree can result in hundreds of classes (one for each rule in your LBNF file). Therefore they are placed in a subpackage called "Absyn". A subdirectory will be created if one does not already exist.

The Abstract Syntax Tree

The generated code supports the Visitor Design Pattern.

Here is an example. Let us say that we have made a simple LBNF file to describe lists of boolean expressions, called BoolExp.cf:

PROG.           PROG   ::= [EXP] ;

OrExp.          EXP     ::= EXP "||" EXP1 ;
AndExp.         EXP1    ::= EXP1 "&&" EXP2 ;
TVal.           EXP2    ::= "true" ;
FVal.           EXP2    ::= "false" ;

separator EXP ";" ;
coercions EXP 2 ;

The Absyn package will contain the following files:

AndExp.java  EXP.java  FVal.java  ListEXP.java  
OrExp.java  PROG.java  TVal.java

Note that there is one Java class for each Label in the LBNF file. The top of the tree is PROG, which is just a pointer to the top of a list:

package BoolExp.Absyn; // Java Package generated by the BNF Converter.

public class PROG implements java.io.Serializable
{
  public ListEXP listexp_;

  public PROG(ListEXP p1) { listexp_ = p1; }

  public <R,A> R accept(BoolExp.Absyn.PROG.Visitor<R,A> v, A arg) { return v.visit(this, arg); }

  public interface Visitor <R,A> {
    public R visit(BoolExp.Absyn.PROG p, A arg);
  }
}

The class ListEXP is automatically generated to represent the list of EXP. Using Java 1.5 generics this is trivial, but we still generate a class, so that the list implementation could be changed without changing any of the code which uses lists of EXP.

package BoolExp.Absyn; // Java Package generated by the BNF Converter.

public class ListEXP extends java.util.LinkedList<EXP> { }

It is interesting to note that ListEXP refers to EXP. But EXP is an abstract class:

package BoolExp.Absyn; // Java Package generated by the BNF Converter.

public abstract class EXP {
  public abstract <R,A> R accept(EXP.Visitor<R,A> v, A arg);
  public interface Visitor <R,A> {
    public R visit(BoolExp.Absyn.OrExp p, A arg);
    public R visit(BoolExp.Absyn.AndExp p, A arg);
    public R visit(BoolExp.Absyn.TVal p, A arg);
    public R visit(BoolExp.Absyn.FVal p, A arg);
  }
}

Abstract classes represent LBNF Categories, such as EXP. If a category does not have a label with the same name (as PROG did in our example) then it is declared abstract. All Labels of a category are represented as a subclass of the abstract category. Label names should not match category names if that category has more than one label.

Here are EXP's subtypes in our example:

package BoolExp.Absyn;

public class AndExp extends EXP
{
  public EXP exp_1, exp_2;

  public AndExp(EXP p1, EXP p2) { exp_1 = p1; exp_2 = p2; }
}
package BoolExp.Absyn;

public class OrExp extends EXP
{
  public EXP exp_1, exp_2;

  public OrExp(EXP p1, EXP p2) { exp_1 = p1; exp_2 = p2; }
}
package BoolExp.Absyn;

public class TVal extends EXP
{
  public TVal() { }
}
package BoolExp.Absyn;

public class FVal extends EXP
{
  public FVal() { }
}

The classes also contain some support code for the Visitor design pattern which is not shown above.

Using the Skeleton File

The generated file VisitSkel.java shows all the visitors and their methods. Since the generated Visitor interfaces are parametrized, their use can vary.

Known Issues

Here is a list of known issues with the BNF Converter's Java Mode.

Keywords

Java contains many more keywords than Haskell, and currently there is nothing to prevent a programmer from using these keywords in their LBNF grammar. For example:

Static.   Package   ::= Sychronized "." Switch ;

While this will be accepted by BNFC it might result in un-compilable code, as the Java back-end uses lowercase names for instance variables. As such programmers should generally avoid matching Label names with Java keywords, or the name of already existing packages.

Reuse of Label Names

Reusing label names currently results in a warning from BNFC. However, for Java this is a much more serious issue, as they are used for class names. For example:

public class FOO extends BAR
{
 ...
}

public class FOO extends BAZ //OOPS! That's two classes named FOO
{
 ...
}

The best solution is for the programmer to ensure that all label names are unique and not ignore the BNFC warning.

JLex Regular Expressions

BNFC's regular expressions (for user-defined token types) are exactly isomorphic to those of Alex for Haskell. Unfortunately JLex regular expressions are not as expressive as those of Alex. Specifically the following operators are not supported:

Currently using those operators in a regular expression will cause generation of the lexer by JLex to fail. In the future a solution might be found to this by transforming the regular expression, but currently the programmer should avoid those operators if they wish to generate Java code.

Shift-Reduce Conflicts in CUP

Certain grammar constructs result in shift-reduce conflicts in CUP that do not result in them in Happy. We are currently investigating exactly what these differences are and how to resolve them, however initial investigations seem to indicate that CUP's default policy of Shifting is sufficient. In the meantime, the generated Makefile is currently set to tell CUP to expect 100 conflicts before it aborts parser generation. Depending on the needs of your project this number could be made bigger or smaller.