I created a grammar and drew a diagram of the AST.

The diagram shows that certain constructs are missing from the textual representation

Grammar

Below is a grammar for PEG (Ohm-JS) that I am working on.

The point of this essay is to discuss the diagram of the grammar (not, to discuss the grammar itself).

ex3 {
  Specification = MatchSection OutputSection
  MatchSection = "#"+ "grok" Rule+
  OutputSection = "#"+ "emit" Emitter+

  Rule = Rid "<-" Patterns

  Patterns = MultiplePatterns | SinglePattern
  MultiplePatterns = Pattern Pattern+
  SinglePattern = Pattern

  Pattern = PatternWithOperator | PatternWithoutOperator
  PatternWithOperator = Primary Operator
  PatternWithoutOperator = Primary
  Emitter = Eid "->" codeToEOL

  Primary = ParenthesizedPrimary | Range | quoted | RuleReference
  ParenthesizedPrimary = "(" Patterns ")"
  RuleReference = id ~arrow
  Range = "[" alnum "-" alnum "]"
  quoted = "'" any "'"
  
  Operator = "+" | "?" | "*"
  
  Rid = id &arrow
  Eid = id &arrow
  
  arrow = "<-" | "->"
  
  keyword = "grok" | "emit"
  id = ~keyword letter alnum*
  
  codeToEOL = (~newline any)* newline+
  newline = "\n"

}

AST View 1

RuleRid<-orPatterns+MultiplePatternsSinglePatternPatternPatternorPatternPatternWithOperatorPatternWithoutOperatororPrimaryPrimaryorOperator?*+ParenthesizedPrimaryRangequotednotRuleReferencePatterns()<alnum>[]-<alnum><any>''notidorarrow<-->idarrow<alnum><letter>*orkeywordmatchoutput

AST1 View 2

Rule
Rule
Rid<-PatternsMultiplePatternsSinglePatternPatternPatternPatternPatternWithOperatorPatternWithoutOperatorPrimaryPrimaryOperator?*+ParenthesizedPrimaryRangequotedRuleReferencePatterns()<alnum>[
]
]
-
-
<alnum>
<alnum>
<any>''idarrow<-->idarrow<alnum><letter>keywordmatchoutput

Constructs

Nodes

A list of the Node kinds follows:
- constant (aka terminal) (gray circle)
- compound (aka non-terminal) (empty circle)
- compound node reference: node referring to another compound node (dotted circle)
- compound node target: node that is the target (referenced) from another node(s) (circle with 3pt outline, “come-from”)

Connectors

A list of the connector kinds follows:
- AND (black arrow)
- OR (red arrow labelled “or”)
- NOT (red arrow labelled “not”)
- one-or-more (black arrow labelled “+”)
- zero-or-more (black arrow labelled “*” (not shown))
- optional (black arrow labelled “?” (not shown))

Discussion

View 1

There are two kinds of nodes that do not appear, explicitly, in the grammar:

  1. reference
  2. target.

Given these two kinds of nodes, it should be possible to implement DRY1 automation, putting less onus on
the programmer(s) to create DRY explicitly.

Relationship to Text-only Languages

Text-only technologies (like OOP and FP) put the onus on the programmer to implement DRY explicitly.

I suggest that it may be possible to automate at least some of the DRY work by supplying at least two more constructs in text-based languages:

  1. Reference
  2. Target.

View 2

View 2 shows the links from reference nodes to target nodes.

This view shows a rats-nest of connections.

This view is not obvious from the text representation of the graph, but is immediately obvious in the drawing.

This view shows that that various grammar rules depend on other grammar rules. The implication is that the grammar must be treated as a single unit. This is not a surprise, because one tends to think of a grammar specification as a whole.

Grammars that contain inter-dependencies cannot be composed from smaller pieces.

See Also

Blog
References

  1. DRY means Don’t Repeat Yourself. DRY is a maintenance-engineering issue.