Transpiling Diagrams

Introduction

I discuss some diagram transpiler issues at the 50,000-foot level.

Ensure that all boxes are concurrent. The sequential paradigm (call/return) does not work for diagrams.
Allow text to be included in the diagram (text is better than diagrams for a certain class of programs, e.g. "a = b + c" should be written and not drawn).
Use an editor that allows drawing any shape and does not need to know details about what is being drawn.
Imagine Emacs (or vim, VS Code, etc.) for diagrams. Emacs, in general, knows only how to edit characters. It doesn't check that the code is consistent, and it won't stop you from saving an inconsistent program. Checking is the job of the compiler in later stages, which raises the error messages.
Imagine that diagrams are the same as text code.
Think box/arrow/ellipse/text instead of pixels.
Programmers' editors edit a grid of cells. Cells may not overlap. Cells are also called "characters".
Programmers' editors are not-quite-2D. They allow 2D layout of text, but insist on arranging cells in lines and columns.
Diagram editors, OTOH, allow cells to overlap. No strict grid structure is imposed.
Don't think in terms of pixels, think about larger constructs, like boxes. One doesn't need pixel/raster recognition algorithms to effectively use diagrams as syntax (DaS).
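
A minimal sketch of the "editor knows nothing, compiler checks later" idea, in Python. The names Shape, Diagram and check are made up for illustration, and the rule that boxes need a label is a hypothetical stand-in for whatever consistency rules a real transpiler would enforce.

```python
from dataclasses import dataclass, field


@dataclass
class Shape:
    kind: str            # "box", "arrow", "ellipse" or "text"
    x: float
    y: float
    width: float
    height: float
    text: str = ""


@dataclass
class Diagram:
    shapes: list = field(default_factory=list)

    def add(self, shape):
        # The editor stores whatever is drawn. No grid, no overlap check,
        # no consistency check.
        self.shapes.append(shape)
        return shape


def check(diagram):
    """Later 'compiler' stage. Hypothetical rule: every box needs a label."""
    return [f"box at ({s.x}, {s.y}) has no label"
            for s in diagram.shapes if s.kind == "box" and not s.text]


diagram = Diagram()
diagram.add(Shape("box", 0, 0, 100, 50, "producer"))
diagram.add(Shape("box", 80, 20, 100, 50))            # overlaps the first box, unlabeled
diagram.add(Shape("text", 10, 10, 60, 12, "a = b + c"))
errors = check(diagram)                               # complaints surface only here
```

The editor half (Diagram.add) accepts the inconsistent drawing without complaint; only the separate check pass reports it, much as a text editor leaves syntax errors to the compiler.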

Convert the diagram into XML form. I drew the diagrams in draw.io, which outputs a compressed XML file. Each tab on the diagram is contained in its own element, delimited by <diagram> … </diagram>.
Copy/Paste the compressed data into the tool https://jgraph.github.io/drawio-tools/tools/convert.html and press the decode button. This should result in human-readable XML of an mxGraph structure.
The structure contains graphical information about every item in the drawing.
Weed out the syntactic sugar.
Normalize the data - I like factbases (see another one of my essays).
Use code to transform the data into some very convenient form, e.g. JSON. I like using PEG-based parsers (Ohm-js for JavaScript, ESRAP for Common Lisp).
N.B. the graphical data can be considered to be a textual programming language, where details like X, Y, Width and Height have been added.
Convert the 2D graphical information into 1D textual information. E.g. find all boxes, then find all boxes that intersect other boxes (high school math). Small boxes that sit on the edges of larger boxes are "ports".
Use standard text-compilation techniques from that point on.
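
The steps above can be sketched end to end in Python. This is an illustration under stated assumptions, not the tooling described in this essay: it uses the standard library instead of a PEG parser, it assumes the compressed payload is Base64 over raw deflate over URL-encoded XML (the decoding the convert tool performs), and it assumes all boxes share one parent so that their coordinates are absolute. The function names and the SAMPLE drawing are invented for the example.

```python
import base64
import json
import zlib
import xml.etree.ElementTree as ET
from urllib.parse import quote, unquote


def decode_diagram(payload):
    """Base64 -> raw inflate -> URL-decode, giving readable mxGraph XML."""
    inflated = zlib.decompress(base64.b64decode(payload), -15)  # -15 = raw deflate
    return unquote(inflated.decode("utf-8"))


def encode_diagram(xml_text):
    """Inverse of decode_diagram; used here only to build sample input."""
    c = zlib.compressobj(9, zlib.DEFLATED, -15)
    data = c.compress(quote(xml_text, safe="").encode("utf-8")) + c.flush()
    return base64.b64encode(data).decode("ascii")


def to_facts(xml_text):
    """Drop styling (the sugar) and emit (relation, id, value) triples."""
    facts = []
    for cell in ET.fromstring(xml_text).iter("mxCell"):
        geometry = cell.find("mxGeometry")
        if cell.get("vertex") != "1" or geometry is None:
            continue
        facts.append(("box", cell.get("id"), cell.get("value", "")))
        for attr in ("x", "y", "width", "height"):
            facts.append((attr, cell.get("id"), float(geometry.get(attr, "0"))))
    return facts


def find_ports(facts):
    """A port is a smaller box that straddles the edge of a larger box."""
    boxes = {}
    for relation, cell_id, value in facts:
        boxes.setdefault(cell_id, {})[relation] = value
    ports = []
    for small_id, s in boxes.items():
        for big_id, b in boxes.items():
            if s["width"] * s["height"] >= b["width"] * b["height"]:
                continue  # not smaller (this also skips comparing a box with itself)
            overlaps = (s["x"] < b["x"] + b["width"] and b["x"] < s["x"] + s["width"]
                        and s["y"] < b["y"] + b["height"] and b["y"] < s["y"] + s["height"])
            inside = (s["x"] >= b["x"] and s["y"] >= b["y"]
                      and s["x"] + s["width"] <= b["x"] + b["width"]
                      and s["y"] + s["height"] <= b["y"] + b["height"])
            if overlaps and not inside:
                ports.append((small_id, big_id))
    return ports


SAMPLE = """
<mxGraphModel><root>
<mxCell id="0"/><mxCell id="1" parent="0"/>
<mxCell id="2" value="main" style="rounded=0;" vertex="1" parent="1">
<mxGeometry x="100" y="100" width="200" height="100" as="geometry"/></mxCell>
<mxCell id="3" value="in" style="rounded=0;" vertex="1" parent="1">
<mxGeometry x="90" y="140" width="20" height="20" as="geometry"/></mxCell>
<mxCell id="4" value="other" vertex="1" parent="1">
<mxGeometry x="400" y="100" width="100" height="100" as="geometry"/></mxCell>
</root></mxGraphModel>
""".strip()

xml_text = decode_diagram(encode_diagram(SAMPLE))
facts = to_facts(xml_text)
ports = find_ports(facts)
facts_json = json.dumps(facts)  # a convenient form for later passes
```

In the sample, box 3 pokes out of the left edge of box 2, so it is reported as a port of box 2, while box 4 touches nothing. Once the facts are in this flat form, the remaining passes no longer need to know that the source was a drawing.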