PanPG (formerly "PEG") 0.0.6 Released

PanPG version 0.0.6 is released.

There have been many changes since the version 0.0.5 release in March.

New Project Name

The most obvious change is the new project name.

The project was initially called simply "PEG", which was an internal name from when it was part of a larger project.

The project is now named PanPG.

Caveats

This release still generates slow parsers. If you're waiting for a faster version, the next few releases will be performance oriented. I am planning to release a version each time the average parser speed on medium-complexity grammars doubles. Hopefully there will be six or seven of these releases, which would compound to a speedup of 64 to 128 times overall. As described in the project roadmap, this release is the API release, intended to make the API comfortable to use but flexible enough to support the features planned for the future.

Generated parsers are believed to be correct and should be compatible with most reasonable JavaScript environments. Parsers generated by v0.0.6 for ECMAScript, JSON, and PanPG's own PEG format have been successfully tested in IE7, Opera, Safari, Firefox, Chrome, and node.js.

API Features

This release features numerous API enhancements.

Parsers can now be output as CommonJS modules with the commonjs option. These are hybrid modules that can also be used as scripts in the browser or with require in a CommonJS environment. This feature is off by default.
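To make the idea of a hybrid module concrete, here is a minimal sketch of one way a single file can serve both as a browser script and as a CommonJS module. It is not PanPG's generated output, which may be structured differently; exportHybrid, DemoParser, and parse are hypothetical names chosen for illustration, and both environments are simulated with plain objects.

```javascript
// Hypothetical sketch: exportHybrid, DemoParser and parse are illustrative
// names, not identifiers from PanPG or its generated parsers.
function exportHybrid(name, api, commonjsExports, globalObject) {
  if (commonjsExports && typeof commonjsExports === 'object') {
    // CommonJS: copy the API onto the module's exports object.
    for (var key in api) {
      if (Object.prototype.hasOwnProperty.call(api, key)) {
        commonjsExports[key] = api[key];
      }
    }
    return commonjsExports;
  }
  // Plain script: publish the API under a single global namespace.
  globalObject[name] = api;
  return api;
}

// Stand-in for a generated parser.
function parse(input) {
  return { input: input, length: input.length };
}

// Simulate both environments with plain objects.
var asModule = exportHybrid('DemoParser', { parse: parse }, {}, {});
var fakeWindow = {};
exportHybrid('DemoParser', { parse: parse }, null, fakeWindow);
```

In a real file the last two arguments would typically be something like the CommonJS exports object (when it exists) and the global object, so the same source works with require or with a script tag.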

The PanPG compiling and utility APIs themselves are now shipped as CommonJS modules. In the browser they use "PanPG" and "PanPG_util" as namespaces; in a CommonJS environment the same names serve as module names.

The throwing version of generateParser is now the default.

Parser usage is still the same: a parser is simply a function that returns a parse tree from an input string. However, the name array and input string are now attached to the parser result, which simplifies handling parse trees with functions like showTree.

New in this release is the treeWalker function, which takes a parse tree and a dictionary of callback functions, and walks the parse tree. This is the main API for dealing with parse trees without having to handle the parse tree format directly. However, the parse tree format itself has been stable since v0.0.5 and is intended to stay that way.
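The callback-dictionary style can be illustrated with a small, self-contained walker. This is only a sketch of the general technique: the node shape used here (name, start, end, children) and the walkTree function are assumptions made for the example, and they do not reflect PanPG's actual parse tree format or the real treeWalker signature, which the README documents.

```javascript
// Hypothetical tree shape: { name, start, end, children }.
// This is NOT PanPG's parse tree format; it only illustrates the idea.
function walkTree(node, input, callbacks) {
  // Visit children first so each callback sees its children's results.
  var results = [];
  for (var i = 0; i < node.children.length; i++) {
    results.push(walkTree(node.children[i], input, callbacks));
  }
  var text = input.slice(node.start, node.end);
  var callback = callbacks[node.name];
  // Rules without a callback simply pass their children's results upward.
  return callback ? callback(text, results) : results;
}

var input = '1+2+3';
var tree = {
  name: 'Sum', start: 0, end: 5, children: [
    { name: 'Num', start: 0, end: 1, children: [] },
    { name: 'Num', start: 2, end: 3, children: [] },
    { name: 'Num', start: 4, end: 5, children: [] }
  ]
};

var total = walkTree(tree, input, {
  Num: function (text) { return Number(text); },
  Sum: function (text, kids) {
    return kids.reduce(function (a, b) { return a + b; }, 0);
  }
});
```

Here total is the sum of the three numbers, computed without the callbacks ever touching the tree structure themselves.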

Also new is the ability to compose grammars by using opts.patches or equivalently by passing an array of grammars, rather than a single grammar, as the first argument to generateParser. This allows re-using parts of grammars, and easily dealing with closely related languages or versions of a language without unnecessary duplication between grammars.
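As a rough mental model of grammar composition, the sketch below merges several rule sets so that a later grammar can replace rules from an earlier one. The override-by-name behaviour, the object-based grammar representation, and the composeRules function are all assumptions made for illustration; PanPG takes grammars in its own PEG format and its actual patching semantics may differ.

```javascript
// Hypothetical model: each grammar is an object mapping rule names to
// rule bodies. PanPG itself does not represent grammars this way.
function composeRules(grammars) {
  var merged = {};
  grammars.forEach(function (grammar) {
    Object.keys(grammar).forEach(function (rule) {
      // Assumption: a later grammar's rule replaces an earlier one.
      merged[rule] = grammar[rule];
    });
  });
  return merged;
}

var base = {
  Value: 'Number / String',
  Number: '[0-9]+',
  String: '"\\"" [a-z]* "\\""'
};

// A small patch describing a closely related language variant.
var decimals = {
  Number: '[0-9]+ ( "." [0-9]+ )?'
};

var composed = composeRules([base, decimals]);
```

Only the Number rule differs between the two variants, so the patch stays a single line instead of a copy of the whole grammar.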

Another major new feature is explain, which is the industrial-strength complement to the showTree function. Explain takes the grammar, options, and input string as arguments, compiles a tracing version of the parser, runs the tracing parser over the input, and analyzes the trace to determine which rules were tried at every point in the input, in what order, and which ones succeeded and failed. Explain is slow and generates a lot of output, so it should be used with small inputs.
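The general idea of a tracing parser can be shown with a few hand-written rule functions. This sketch is not how explain is implemented; traced, the trace array, and the Digit, Letter, and Token rules are all hypothetical, and failure is represented here by returning -1.

```javascript
// Hypothetical tracing wrapper; none of these names come from PanPG.
var trace = [];

function traced(name, rule) {
  return function (input, pos) {
    var end = rule(input, pos); // -1 means the rule failed at pos
    trace.push({ rule: name, pos: pos, ok: end !== -1 });
    return end;
  };
}

var digit = traced('Digit', function (s, p) {
  return /[0-9]/.test(s.charAt(p)) ? p + 1 : -1;
});

var letter = traced('Letter', function (s, p) {
  return /[a-z]/.test(s.charAt(p)) ? p + 1 : -1;
});

// Ordered choice: try Digit first, then fall back to Letter.
var token = traced('Token', function (s, p) {
  var end = digit(s, p);
  return end !== -1 ? end : letter(s, p);
});

token('x', 0);

// Summarize which rules were tried at each position and how they fared.
var summary = trace.map(function (t) {
  return t.rule + '@' + t.pos + (t.ok ? ' ok' : ' fail');
});
```

Even this tiny input records three entries, which hints at why a full trace over a real grammar grows quickly and is best used with small inputs.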

Codegen Features

Like the previous release, this release uses the new "flat" code generator, called the v6 codegen (the codegen and release version numbers are independent, though they coincidentally both end in "6" with this release).

Parsers are generated faster than in v0.0.5, often much faster as grammar size increases. The ECMAScript 5 parser is generated about 8 times faster.

The v6 codegen now identifies many regular sub-grammars, which will enable some of the optimizations of the v5 codegen to begin to be ported.

There is one new code generation feature in this version, namely DFA generation, which I hope to write about soon, as the optimizations it potentially enables are implemented and parser speeds improve.

The new elide and drop options can be used to generate a parser that omits some of the rules in the grammar from generated parse trees and potentially parses faster.

The new opts.start option can be passed to specify a custom start rule. This can be used to generate a parser for a fragment of a grammar, to put the start rule somewhere other than at the top, or to compose grammars.

Documentation and Home Page

The accepted PEG language is still not documented (and will probably be simplified a bit before that happens) but the rest of the public API is now fully documented in the README file.

The project now has a small home page, which brings the demo, documentation, downloads, and other information together.

There is also a source code tarball, and this version can also be installed through npm for node.js users.

As always, the latest unstable source code is available from the online revision store using wget. There is a new rvs_get script which can also be used.

The demo page has been improved so that the grammar can be edited; this makes it suitable for quick tests and you can even write an entire grammar from scratch on the demo page and paste the generated parser into your project without ever needing to install PanPG.

Acknowledgments

Thanks to Jeremy Ashkenas, Ash Berlin, Steve Goguen, Daniel Ly, David Majda, and Isaac Schlueter for their suggestions and feedback.

Links

The project home page will always have links to the most current downloads, demos, and documentation.

The project roadmap from March is relevant (v0.0.6 is the "API Enhancements" release).

As always, suggestions and bug reports are welcomed at inimino@inimino.org, or stop by #inimino on Freenode IRC.