Have a look at these generated parsers. YMTD!

All of my effort is still going into the JavaCC side of my development work. However, this still has a lot to do with FreeMarker, because the progress on resuscitating crufty old JavaCC is going to spill over pretty soon into a very significant transformation of crufty old FreeMarker! Still, just as most people who eat sausages do not much care to know what went into them, most FreeMarker users are not terribly interested in the parser generator technology used to build the parser. (There are some exceptions, but this is surely the general case.)

A lot of my energy on JavaCC over the last few days has been going towards making the code it generates more understandable. Truth be told, I'm not sure whether I'm doing it more for myself or more for users. (Probably both, in roughly equal measure...)

The overall problem space -- not just JavaCC, but parser generators more generally, and code generation even more generally -- is extremely difficult. You see, in this space, when you have a bug in your code (those of you who never do can ignore this, I suppose...) the bug typically manifests not in the code that you wrote, but in the code that your code generates. This makes things very difficult because, when you debug the generated code, you need to be able to see where it was generated from. So, one of the things you see is that this newer version of JavaCC, JavaCC 21, puts location information all over the place so that you can more easily track down the real source of a bug.
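To make the idea of location information concrete, here is a minimal sketch of a generator that stamps each generated method with a comment saying where in the grammar it came from. This is not JavaCC 21's actual output format or internal code; the class name, the method name `generateMethod`, the grammar file name, and the line number are all hypothetical and only illustrate the principle.

```java
class LocationCommentDemo {
    // Hypothetical generator step: emits a parser method for one production,
    // prefixed with a comment recording where the production was defined.
    static String generateMethod(String productionName, String grammarFile, int line) {
        return "// Generated from " + grammarFile + ", line " + line + "\n"
             + "void " + productionName + "() {\n"
             + "    // ... parsing logic for this production ...\n"
             + "}\n";
    }

    public static void main(String[] args) {
        // The file name and line number here are made up for illustration.
        System.out.print(generateMethod("Expression", "Example.javacc", 120));
    }
}
```

When something goes wrong inside the generated `Expression()` method, the comment points straight back at the grammar line to inspect, rather than leaving you to guess.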

For example, compare this code, the latest and greatest FTL Parser in "Apache FreeMarker" generated by the legacy JavaCC tool, with the parser generated from the exact same grammar by JavaCC 21. Just compare the code generated for any grammatical production and you'll see that we have a lot more clarity about where the code came from. There are various other improvements. The parser generated by JavaCC 21 uses type-safe enums rather than integer constants to represent the various Token types. This also allows the use of java.util.EnumSet in spots where you want to check whether the next Token matches a given grammatical expansion. (BTW, this is still actually under-utilized and offers a lot of low-hanging fruit that will be plucked soon!) Also, the generated parser maintains a call stack of locations relative to the actual grammar file and inserts that information into the stack trace of ParseExceptions!
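For readers who have not used java.util.EnumSet this way, here is a small sketch of the kind of check described above. The token types and the `EXPRESSION_START` set are invented for illustration and are not taken from the FTL grammar or from code that JavaCC 21 actually generates.

```java
import java.util.EnumSet;

class TokenTypeDemo {
    // Hypothetical token types, standing in for those a generated parser would define.
    enum TokenType { IDENTIFIER, NUMBER, STRING_LITERAL, OPEN_PAREN, CLOSE_PAREN, PLUS, EOF }

    // The token types that may begin a (hypothetical) expression.
    static final EnumSet<TokenType> EXPRESSION_START = EnumSet.of(
        TokenType.IDENTIFIER, TokenType.NUMBER, TokenType.STRING_LITERAL, TokenType.OPEN_PAREN);

    // Does the next token fit the start of the expansion?
    static boolean startsExpression(TokenType next) {
        return EXPRESSION_START.contains(next);
    }

    public static void main(String[] args) {
        System.out.println(startsExpression(TokenType.NUMBER));      // true
        System.out.println(startsExpression(TokenType.CLOSE_PAREN)); // false
    }
}
```

Compared with integer constants, a typo or a constant from the wrong family is caught by the compiler, and the set membership test reads like the grammar rule it implements.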

The older FTL grammars did not use any of the automatic tree-building in the JavaCC package (a.k.a. JJTree). I am referring to Apache FreeMarker, which is really just FreeMarker 2.3.x (which was basically done in 2004), and FreeMarker 2.4.x (which unfortunately received no attention after approximately 2008). (The FreeMarker 2.4.x release cycle was basically sabotaged by Daniel Dekany, which is a story I shall outline separately at some point.) The newest FTL grammar (which is still basically over a decade old), which uses more of the newer JavaCC (previously FreeCC) improvements, is here: https://github.com/javacc21/javacc21/blob/master/examples/freemarker/FTL.javacc

You can see the code generated from that here. Of course, the code for the grammatical productions is much more verbose than in the other cases because this uses automated tree-building.

But getting back to the question of tracking down where bugs came from: from a development standpoint, the biggest move forward in going from legacy JavaCC to JavaCC 21 is that all of the code generation is done by templates. (You guessed it! FreeMarker templates!)

If you are curious, here is the main FreeMarker template that generates the parser code. The funny thing about this is that, even with the code generation separated out into templates like this, JavaCC development is still hardly easy. Basically, internal development of JavaCC just went from being pretty much impossible to being feasible, yet still extremely difficult. (I suppose the use of JavaCC to develop complex grammars went from being very difficult to just moderately difficult.)

The situation with the internal development of legacy JavaCC, which (for whatever bizarre reasons) has always eschewed using a template engine like FreeMarker, is completely intractable. (I am probably the best qualified person on the face of the earth to say this, so you ought to believe me!) It is simply too difficult to move the codebase forward when all of the code is being generated with println statements and there is no easy way, when debugging the generated code, to trace it back to its real origin.
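To give a feel for the difference, here is a toy sketch that produces the same method two ways: once by appending fragments (the println style) and once from a single template string with named placeholders. The placeholder substitution is a crude stand-in written with String.replace; it is not FreeMarker's API and is not how either JavaCC codebase is actually written. All names in it are hypothetical.

```java
class GenerationStyleDemo {
    // Style 1: output assembled fragment by fragment, as with println calls.
    // The shape of the generated code is scattered across the statements.
    static String viaAppends(String name) {
        StringBuilder out = new StringBuilder();
        out.append("void ").append(name).append("() {\n");
        out.append("    consume(").append(name.toUpperCase()).append(");\n");
        out.append("}\n");
        return out.toString();
    }

    // Style 2: the output lives in one place and looks like the code it produces.
    static final String TEMPLATE =
        "void ${name}() {\n" +
        "    consume(${token});\n" +
        "}\n";

    // Crude placeholder substitution, standing in for a real template engine.
    static String viaTemplate(String name) {
        return TEMPLATE.replace("${name}", name).replace("${token}", name.toUpperCase());
    }

    public static void main(String[] args) {
        // Both styles produce identical text; only maintainability differs.
        System.out.println(viaAppends("identifier").equals(viaTemplate("identifier"))); // true
    }
}
```

With three lines of output the two styles look equally manageable. With thousands of lines of generated parser code, only the second stays readable.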

Well, getting back to FreeMarker, as JavaCC takes on new features like the ability to generate fault-tolerant parsers, FreeMarker should get a new lease on life, because there will be the potential for much better tool support. So... stay tuned....
