Cheat Sheet

The PDF Cheat Sheet is:

https://computingsimplicity.neocities.org/2021/PEG%20Cheat%20Sheet.pdf

It is incomplete, but maybe enough to get started.

PEG Editor

I favour Ohm-JS.

The grammar editor for Ohm-JS is a time-saver. https://ohmlang.github.io/editor/

The editor can be used with any PEG library, as long as you keep the differences in syntax in mind.

With any of the other libraries, the only way to debug a grammar is manually.

The Ohm Editor makes debugging a grammar very quick, maybe an order of magnitude faster than manual debugging.

In my experience, PEG parsers (aka recursive descent parsers) are fairly easy to debug. The main point of failure is knowing where to include whitespace.

Debugging a PEG grammar manually takes about a day.

Debugging a PEG grammar using the Ohm Editor takes tens of minutes.

IMO, you should always use the Ohm Editor, regardless of which PEG library[1] you are using.

Parser Combinators

https://en.wikipedia.org/wiki/Parser_combinator

Parser combinators are functions that help build parsers in a functional manner.
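
To make that concrete, here is a minimal sketch of the idea in Python. The names `literal`, `seq`, `choice` and `many` are illustrative choices, not taken from any particular combinator library, and the sketch assumes a parser is simply a function from (text, position) to (value, new position) or None.

```python
from typing import Any, Callable, Optional, Tuple

# Assumption: a parser takes (text, position) and returns
# (value, new_position) on success or None on failure.
Parser = Callable[[str, int], Optional[Tuple[Any, int]]]

def literal(s: str) -> Parser:
    """Match the exact string s."""
    def parse(text, pos):
        if text.startswith(s, pos):
            return s, pos + len(s)
        return None
    return parse

def seq(*parsers: Parser) -> Parser:
    """Match every parser in order; fail if any one of them fails."""
    def parse(text, pos):
        values = []
        for p in parsers:
            result = p(text, pos)
            if result is None:
                return None
            value, pos = result
            values.append(value)
        return values, pos
    return parse

def choice(*parsers: Parser) -> Parser:
    """Ordered choice: return the result of the first alternative that succeeds."""
    def parse(text, pos):
        for p in parsers:
            result = p(text, pos)
            if result is not None:
                return result
        return None
    return parse

def many(p: Parser) -> Parser:
    """Match p zero or more times (the * operator of a grammar)."""
    def parse(text, pos):
        values = []
        while True:
            result = p(text, pos)
            # Stop on failure, or when p succeeded without consuming input.
            if result is None or result[1] == pos:
                return values, pos
            value, pos = result
            values.append(value)
    return parse

# Larger parsers are built by combining smaller ones.
digit = choice(*[literal(d) for d in "0123456789"])
number = seq(digit, many(digit))
sum_expr = seq(number, literal("+"), number)
```

Calling `sum_expr("12+3", 0)` returns a nested list of the matched pieces together with the end position, and `sum_expr("12-3", 0)` returns None. Note that the grammar itself (`number "+" number`) is only visible indirectly, through the function calls.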

I believe that PEG is a better choice than using combinators.

I believe that a design should exhibit its intent - DI (Design Intent).

I do not subscribe to the idea that you should choose a language before expressing the solution to a problem.

I do not feel that FP is a one-size-fits-all solution to all programming problems.

PEG expresses the DI of a grammar more clearly than programs written in any functional language.

You might create a PEG layer that emits parser combinators.

In such a case, I would use a less-restrictive language, like Common Lisp, as the toolbox language[2], layer grammar constructs onto it using ESRAP, and not bother with parser combinators.

Or, I would consider using Ohm-JS to generate Haskell code in parser-combinator form.

Whitespace - Rule of Thumb

I have found that a consistent approach to parsing whitespace makes debugging easier.

I tend to put “ws*” at the end of sub-rules.

And, I put a “ws*” after every terminal item (e.g. a constant string or a constant character).
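
As a rough illustration of this rule of thumb, here is a hand-written recursive descent sketch in Python for a hypothetical grammar of bracketed number lists. The grammar in the comment is generic PEG notation (not Ohm-JS syntax), and every name in it is made up for the example. The point is only that each terminal and each sub-rule consumes the whitespace that follows it, so no rule ever has to worry about whitespace in front of itself.

```python
import re

# Hypothetical grammar, in generic PEG notation:
#   list   <- ws* "[" ws* (number ("," ws* number)*)? "]" ws*
#   number <- [0-9]+ ws*
#   ws     <- " " / "\t" / "\n"

WS = re.compile(r"[ \t\n]*")
DIGITS = re.compile(r"[0-9]+")

def skip_ws(text, pos):
    """ws*: consume zero or more whitespace characters, return the new position."""
    return WS.match(text, pos).end()

def terminal(text, pos, s):
    """Match the constant string s followed by ws*; return the new position or None."""
    if text.startswith(s, pos):
        return skip_ws(text, pos + len(s))
    return None

def number(text, pos):
    """number <- [0-9]+ ws* ; return (int value, new position) or None."""
    m = DIGITS.match(text, pos)
    if m is None:
        return None
    return int(m.group()), skip_ws(text, m.end())

def parse_list(text):
    """Parse a whole string as a list; return a list of ints, or None on failure."""
    pos = skip_ws(text, 0)  # leading whitespace is consumed once, at the top
    pos = terminal(text, pos, "[")
    if pos is None:
        return None
    values = []
    first = number(text, pos)
    if first is not None:
        value, pos = first
        values.append(value)
        while True:
            after_comma = terminal(text, pos, ",")
            if after_comma is None:
                break
            item = number(text, after_comma)
            if item is None:
                return None  # a dangling comma can never be followed by "]"
            value, pos = item
            values.append(value)
    pos = terminal(text, pos, "]")
    if pos is None or pos != len(text):
        return None
    return values
```

With this layout, `parse_list(" [ 1 ,2,  3 ] ")` and `parse_list("[1,2,3]")` give the same result, and the whitespace handling lives in exactly two places: `terminal` and the end of `number`.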

Of course, one could write a PEG parser that generates another PEG parser and inserts whitespace parsing at the “right” places.
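
A much cruder version of that idea can be sketched in Python. This is not a PEG parser: it swaps in a regular expression to find double-quoted terminals in grammar text and appends `ws*` after each one. It assumes a hypothetical grammar format with one rule per line, `<-` as the definition arrow, and a whitespace rule named `ws`.

```python
import re

# Matches a double-quoted terminal, allowing backslash escapes inside it.
TERMINAL = re.compile(r'"(?:[^"\\]|\\.)*"')

def add_whitespace(grammar: str) -> str:
    """Append ' ws*' after every quoted terminal that is not already followed by one."""
    def rewrite_line(line):
        # Assumption: one rule per line, written as `name <- body`.
        if line.split("<-")[0].strip() == "ws":
            return line  # leave the whitespace rule itself alone

        def repl(m):
            # Skip terminals that already have ws* right after them.
            if re.match(r"\s*ws\*", line[m.end():]):
                return m.group()
            return m.group() + " ws*"

        return TERMINAL.sub(repl, line)

    return "\n".join(rewrite_line(line) for line in grammar.split("\n"))
```

For example, the line `pair <- "(" item "," item ")"` becomes `pair <- "(" ws* item "," ws* item ")" ws*`. Running the function a second time changes nothing, because terminals already followed by `ws*` are skipped.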


  1. Or, if you are using parser combinators.

  2. https://guitarvydas.github.io/2021/03/16/Toolbox–Languages.html