Attribute Elider

The job of this phase is to filter out attributes that are not needed by subsequent passes.

This sounds like it is too specific and needs to be generalized.

On the other hand, if the code is simple enough, it can be written afresh for every solution. Programmers are used to writing fresh bits of code, say in Python, for every solution. Abstraction and generalization cannot be planned ahead of time: one needs at least 3 iterations before a piece of code can be generalized¹.

There seems to be a very fine psychological line between writing code that you feel you need to keep and writing code that you don’t mind discarding and re-implementing. The coding process can be untangled into 2 portions: (1) thinking about the problem (DI/Architecting) and (2) implementing (Implementation). Once code is untangled from Architecture, it becomes possible to imagine re-implementing code while keeping what has been learned².

2021-07-30-attribute elider.png

Grok

The grok grammar for this pass is

AppAttributeelider{
  Diagrams = Diagram+
  Diagram = "<diagram" Attribute* ">" GraphModel "</diagram>"
  Attribute = NameAttribute | DiagramIDAttribute | OtherAttribute
  NameAttribute = "name" "=" string
  DiagramIDAttribute = "id" "=" string
  OtherAttribute = alnum+ "=" attributeValue
  string= "\"" notDQ* "\""
  notDQ = ~"\"" any
  encodedChar = ~"<" any
  attributeValue = number | string		  
  number = digit+

  GraphModel = "<mxGraphModel" Attribute+ ">" Root "</mxGraphModel>" 
  Root = "<root>" Cell+ "</root>"
  Cell = CellWithContent | CellWithoutContent
  CellWithoutContent = "<mxCell" CellAttribute+ "/>"
  CellWithContent = "<mxCell" CellAttribute+ ">" Geometry? "</mxCell>"	     
  Geometry = "<mxGeometry" GAttribute+ "/>"
  CellAttribute =   KindAttribute
                  | ValueAttribute
                  | EdgeAttribute
                  | VertexAttribute
                  | SourceAttribute
                  | TargetAttribute
                  | IDAttribute
                  | RedAttribute
                  | OtherCellAttribute
  KindAttribute = "kind" "=" string
  ValueAttribute = "value" "=" string
  SourceAttribute = "source" "=" string
  TargetAttribute = "target" "=" string
  IDAttribute = "id" "=" string
  EdgeAttribute = "edge" "=" quote "1" quote
  VertexAttribute = "vertex" "=" quote "1" quote
  RedAttribute = "fillColor" "=" quote "#f8cecc" quote
  OtherCellAttribute = alnum+ "=" attributeValue

  GAttribute = OtherGAttribute
  OtherGAttribute = alnum+ "=" attributeValue

  quote = "\""
}

Here, the pattern matching recognizes several attributes of interest and groups all other attributes in the OtherCellAttribute match.
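
The order of the alternatives matters: the specific rules are tried first and OtherCellAttribute only picks up what is left. A minimal sketch of that first-match-wins behaviour in Python follows. It uses regular expressions as a rough stand-in for the Ohm-JS rules, and the names CELL_ATTRIBUTE_RULES and classify are hypothetical, not part of the workbench.

```python
import re

# Same order as the CellAttribute alternatives in the grammar: specific
# rules first, the catch-all last. The regexes only approximate the rules.
CELL_ATTRIBUTE_RULES = [
    ("KindAttribute", r'kind="[^"]*"'),
    ("ValueAttribute", r'value="[^"]*"'),
    ("EdgeAttribute", r'edge="1"'),
    ("VertexAttribute", r'vertex="1"'),
    ("SourceAttribute", r'source="[^"]*"'),
    ("TargetAttribute", r'target="[^"]*"'),
    ("IDAttribute", r'id="[^"]*"'),
    ("RedAttribute", r'fillColor="#f8cecc"'),
    ("OtherCellAttribute", r'[A-Za-z0-9]+=(?:[0-9]+|"[^"]*")'),
]


def classify(attribute):
    """Return the name of the first rule that matches the whole attribute."""
    for name, pattern in CELL_ATTRIBUTE_RULES:
        if re.fullmatch(pattern, attribute):
            return name
    # Nothing matched: the input is not in the format this pass expects.
    raise ValueError("not a cell attribute: %r" % attribute)
```

With this sketch, fillColor="#f8cecc" is classified as RedAttribute, while any other fill colour falls through to OtherCellAttribute.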

Note that the increase in code is offset by the decrease in “error checking”. When the workflow reaches this point, it is known that the input code can be easily pattern matched. Errors in the code format are picked off by earlier passes (as Ohm-JS syntax errors) or (as a last resort) the grammar engine of this pass.

Emit

Again, the emit code seems to be larger than in the previous pass, but the amount of programming work was trivial - the pattern matcher has done the “heavy lifting” for us.

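In this code, we simply return the attributes that are of interest (value, source, target, id, etc.) and return nothing ([[]], the empty string) for the catch-all rule OtherCellAttribute. Only cell attributes are elided: the catch-all rules OtherAttribute and OtherGAttribute pass their attributes through unchanged.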
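
The net effect on the attributes of a single mxCell can be sketched in plain Python. This is only an approximation of what the grok and emit rules do together, written with a regular expression instead of Ohm-JS. The names KEEP, ATTRIBUTE and elide_cell_attributes are hypothetical.

```python
import re

# Attributes of interest that are re-emitted whenever they appear.
KEEP = {"kind", "value", "source", "target", "id"}
# One attribute: a name, "=", then a quoted string or a bare number.
ATTRIBUTE = re.compile(r'([A-Za-z0-9]+)=("[^"]*"|[0-9]+)')


def elide_cell_attributes(attribute_text):
    """Keep the attributes of interest, drop every other attribute."""
    kept = []
    for name, value in ATTRIBUTE.findall(attribute_text):
        if name == "fillColor" and value == '"#f8cecc"':
            # RedAttribute: the colour code is rewritten as "red".
            kept.append(' fillColor="red"')
        elif name in ("edge", "vertex") and value == '"1"':
            # EdgeAttribute / VertexAttribute only match the value "1".
            kept.append(" %s=%s" % (name, value))
        elif name in KEEP:
            kept.append(" %s=%s" % (name, value))
        # Anything else corresponds to OtherCellAttribute: emit nothing.
    return "".join(kept)
```

For example, the input id="3" value="x" style="rounded=0;" vertex="1" parent="1" comes out with only id, value and vertex remaining.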

If we were to add another attribute-of-interest, we would need to modify the grok grammar and the emit code. This is not ideal, but if this situation were to happen often, we could write another PEG parser for a new notation that outputs combined grok and emit code for the attributes-of-interest. As of now, this has not been required.
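
To give a feel for how small such a generator could be, here is a sketch in Python rather than as a PEG parser. It takes a table of attributes-of-interest and prints the matching grok rules, the emit rules and the CellAttribute alternatives. The function generate_rules and its input format are assumptions made up for this illustration. Attributes that need special treatment (edge, vertex, fillColor) are left out.

```python
def generate_rules(attributes):
    """Build grok rules, emit rules and the CellAttribute alternation.

    `attributes` maps a rule name to its keyword,
    e.g. {"ValueAttribute": "value"}.
    """
    grok_lines = []
    emit_lines = []
    for rule, keyword in attributes.items():
        # Grok rule, same shape as: ValueAttribute = "value" "=" string
        grok_lines.append('  %s = "%s" "=" string' % (rule, keyword))
        # Emit rule: re-emit the attribute unchanged, preceded by a space.
        emit_lines.append(rule + " [v eq s] = [[\\ ${v}${eq}${s}]]")
    # The catch-all always comes last so the specific rules win.
    alternatives = list(attributes) + ["OtherCellAttribute"]
    alternation = "  CellAttribute = " + " | ".join(alternatives)
    return "\n".join(grok_lines), "\n".join(emit_lines), alternation


grok, emit, alternation = generate_rules(
    {"ValueAttribute": "value", "ParentAttribute": "parent"}
)
print(alternation)
print(grok)
print(emit)
```

Adding an attribute-of-interest then means adding one entry to the table instead of editing two files by hand.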

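The code for each attribute-of-interest is an identity transform: it re-emits the attribute unchanged, preceded by a space, and requires little work. The one exception is RedAttribute, which rewrites the fillColor value #f8cecc as red.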

Diagrams [@ds] = [[${ds}]]
Diagram [k @a k2 graphmodel k3] = [[${k}${a}${k2}\n${graphmodel}\n${k3}\n]]
Attribute [a] = [[${a}]]
NameAttribute [an k s] = [[\ ${an}${k}${s}]]
DiagramIDAttribute [an k s] = [[\ ${an}${k}${s}]]
OtherAttribute [@an k s] = [[\ ${an}${k}${s}]]
string [q1 @cs q2] = [[${q1}${cs}${q2}]]
notDQ [c] = [[${c}]]
encodedChar [c] = [[${c}]]
attributeValue [x] = [[${x}]]
number [n] = [[${n}]]

GraphModel [k1 @as k2 root k3] = [[${k1}${as}${k2}${root}${k3}]]
Root [k1 @cells k2] = [[${k1}${cells}${k2}]]
Cell [c] = [[${c}]]
CellWithoutContent [k1 @as k2] = [[${k1}${as}${k2}]]
CellWithContent [k1 @as k2 @geometry k3] = [[${k1}${as}${k2}${geometry}${k3}]]
Geometry [k1 @as k2] = [[${k1}${as}${k2}]]

CellAttribute [a] = [[${a}]]
KindAttribute [kind eq s] = [[\ ${kind}${eq}${s}]]
ValueAttribute [v eq s] = [[\ ${v}${eq}${s}]]
SourceAttribute [v eq s] = [[\ ${v}${eq}${s}]]
TargetAttribute [v eq s] = [[\ ${v}${eq}${s}]]
IDAttribute [v eq s] = [[\ ${v}${eq}${s}]]
EdgeAttribute [v eq q1 s q2] = [[\ ${v}${eq}${q1}${s}${q2}]]
VertexAttribute [v eq q1 s q2] = [[\ ${v}${eq}${q1}${s}${q2}]]
RedAttribute [id k q1 s q2] = [[\ fillColor="red"]]
OtherCellAttribute [@an k s] = [[]]

GAttribute [a] = [[${a}]]
OtherGAttribute [@an k s] = [[\ ${an}${k}${s}]]

quote [c] = [[${c}]]

Parsing Diagrams - DaS Workbench 3 Attribute Elider Phase
