Parsing vs REGEXing

I wanted to change all "/" to "_" in my .opml file, but only in qualified identifiers.


The global-find-and-replace strategy doesn't work, because the .opml contains legitimate "/"s that shouldn't change.
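To make the failure concrete, here is a minimal sketch in Python. The OPML fragment is hypothetical (the real file is not shown here). It assumes a self-closing `outline` element with a URL attribute, both of which contain "/" characters that must survive.

```python
# Hypothetical OPML fragment, invented for illustration only.
sample = '<outline text="shapes/rect/width" url="https://example.com/docs"/>'

# Flat, global find-and-replace: every "/" changes, wanted or not.
flat = sample.replace("/", "_")

# The qualified identifier is fixed, but the URL and the "/>" are damaged too.
print(flat)
```

The result is no longer well-formed XML, because the closing "/>" has become "_>".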


Using a parser, the change is a one-line edit. The rewrite rule for a qualified identifier:


  qident_recursive [id slash qid] = [[${id}${slash}${qid}]]


becomes


  qident_recursive [id slash qid] = [[${id}_${qid}]]
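
For readers who do not use Ohm-JS, the same recursive rule can be approximated by hand. The following is a sketch in Python, not the grammar used in the project. It assumes identifiers consist of letters, digits, and underscores, and the function names `parse_id` and `rewrite_qid` are made up for this example.

```python
def parse_id(text, pos):
    """Match one identifier at pos; return (identifier, new_pos) or None."""
    end = pos
    while end < len(text) and (text[end].isalnum() or text[end] == "_"):
        end += 1
    return (text[pos:end], end) if end > pos else None


def rewrite_qid(text, pos=0):
    """Match a qualified identifier at pos; return (rewritten, new_pos) or None."""
    head = parse_id(text, pos)
    if head is None:
        return None
    ident, pos = head
    # Recursive case, mirroring qident_recursive: id "/" qid
    if pos < len(text) and text[pos] == "/":
        rest = rewrite_qid(text, pos + 1)
        if rest is not None:
            tail, end = rest
            # This is the one-line change: emit "_" instead of the slash.
            return ident + "_" + tail, end
    # Base case: a lone identifier, returned unchanged.
    return ident, pos


print(rewrite_qid("shapes/rect/width"))
```

The slash is replaced only when it sits between two identifiers, because that is the only place the rule matches it.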



REGEX is Flat, Parsing is Hierarchical

The main difference between the two approaches is that REGEX and find-and-replace are "flat": they match sequences of characters without knowing where in the document's structure a match sits.


REGEX cannot easily follow structure.


Parsing can follow structure.
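
As a rough illustration of following structure, the sketch below parses a hypothetical OPML fragment with Python's standard XML parser (not Ohm-JS) and rewrites only one kind of node. It assumes, purely for the example, that qualified identifiers are the whole value of the `text` attribute on `outline` elements. The real file may place them elsewhere.

```python
import xml.etree.ElementTree as ET

# Hypothetical OPML fragment, invented for illustration only.
sample = (
    '<opml version="2.0"><body>'
    '<outline text="shapes/rect/width" url="https://example.com/docs"/>'
    '</body></opml>'
)

root = ET.fromstring(sample)

# Walk the tree and touch only the attribute assumed to hold identifiers.
for node in root.iter("outline"):
    text = node.get("text")
    if text is not None:
        node.set("text", text.replace("/", "_"))

outline = root.find("body/outline")
print(outline.get("text"))  # the identifier, with "/" replaced
print(outline.get("url"))   # the URL, left untouched
```

The replacement itself is still a plain string operation. What the parser adds is knowledge of where in the hierarchy that operation is allowed to run.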



Parsing Is Now Accessible

Parsing used to be the domain of compiler writers.


PEG, and especially Ohm-JS, bring this technology down from the mountain and make it as easily accessible as REGEX (which was itself once available only to compiler writers).



Appendix - OPML Source Code

See the code in the project: https://guitarvydas.github.io/2021/05/10/Software-Components-101-Engine-Part-3-Factbase.html


Appendix - PEG and Ohm-JS

https://guitarvydas.github.io/2021/04/02/PEG-Cheat-Sheet.html

https://guitarvydas.github.io/2021/03/24/REGEX-vs-PEG.html

https://guitarvydas.github.io/2021/03/19/Racket-PEG.html

https://guitarvydas.github.io/2021/03/17/PEG-vs.-Other-Pattern-Matchers.html

https://guitarvydas.github.io/2020/12/27/PEG.html


https://guitarvydas.github.io/2021/05/09/Ohm-Editor.html

https://guitarvydas.github.io/2020/12/09/OhmInSmallSteps.html


https://github.com/harc/ohm