PFR and PF (Parsing Find and Replace)
Overview
PFR is a command-line find-and-replace utility that uses PEG parsing instead of REGEXP. PFR outputs the resulting (modified input) file to stdout.
PF is a command-line find utility that uses PEG parsing instead of REGEXP (like GREP, except with parsing). PF merely prints the names of the matching files on stdout.
Usage
pfr source-filename grammar-filename action-filename [support.js] [t] [v]
pf source-filename grammar-filename
The 3 files contain text.
Source-filename refers to a file which contains arbitrary text which is to-be-parsed.
Grammar-filename refers to a file which contains a grammar specification in Ohm-JS format.
Action-filename refers to a file which contains an action specification in Glue format.
support.js [optional] contains functions used by the action rules. It is only required if the action rules call out to support functions, and is not needed in simple use cases. Default: support.js is not used and not read.
t [optional]: a lower-case ‘t’ as the 5th argument turns on rule tracing. Default: tracing disabled.
v [optional]: a lower-case ‘v’ as the 6th argument turns on display of the generated action code. Default: display disabled.
Code
PFR and PF are shell scripts (Bash) contained in that directory. Put them on your PATH.
Workflow
[This section doesn’t belong in the final PFR/PF man page, but the tools are young…]
A workflow that works best for me is:
- Use the Ohm-Editor to build and debug the grammar.
- Write a “vanilla” action spec that simply outputs what is input (“identity”). Note that if you use syntactic rules instead of lexical rules in the grammar, the output will omit spacing, because syntactic rules skip whitespace. That’s OK, you just need to be aware of it. The only way to get a pure identity grammar is to use lexical rules only, which is not necessary.
- Modify the identity action spec to suit your needs.
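The spacing caveat in the second step can be seen in a small sketch. This is plain JavaScript, not Ohm-JS or Glue, and the names identitySyntactic and identityLexical are hypothetical. They only imitate the two behaviours: a syntactic rule skips whitespace between tokens, while a lexical rule matches every character.

```javascript
// Imitates a syntactic (capitalized) rule: whitespace between tokens is
// skipped before matching, so it never reaches the action rules.
function identitySyntactic(text) {
  const tokens = text.split(/\s+/).filter((t) => t.length > 0);
  return tokens.join("");
}

// Imitates a lexical (lower-case) rule: every character, spaces included,
// is matched explicitly and passed through unchanged.
function identityLexical(text) {
  return Array.from(text).join("");
}

const input = "a = b + c";
console.log(identitySyntactic(input)); // spacing is lost
console.log(identityLexical(input)); // spacing is preserved
```

If the output only needs to be valid for a downstream tool, the lost spacing usually does not matter, and the action spec can re-insert whatever spacing it wants.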
Appendix - Ohm-JS Format
See Ohm-JS Syntax.
Appendix - Glue Format
See Glue Manual.
See Glue Overview.
Appendix - Sample Grammars and Actions
- ABC Glue (uses the ftranpile() function, a precursor to PFR, with a simple .ohm file and a simple .glue file (action spec))
- first class comments (this project was a quickie, one-weekend entry in the language jam. The current versions of sequence.js and details.js contain the precursor of PFR; each of them refers to multiple .ohm files and multiple .glue files (action specs))
A sample (very simple) grammar specification from ABC Glue:
ABCgrok {
  TopLevel = Assignment+
  Assignment = Variable "=" Expression -- complex
             | Variable "=" number -- simple
  Expression = Variable "+" Variable
  Variable = "a" .. "z"
  number = dig+
  dig = "0" .. "9"
}
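In this example, there are 7 grammar rules (each labelled alternative of Assignment counts as its own rule):
- TopLevel is one-or-more Assignments
- Assignment_complex is a Variable, "=" and an Expression
- Assignment_simple is a Variable, "=" and a number
- An Expression is a Variable, "+" and a Variable
- A Variable is a single lower case letter, from "a" through "z"
- A number is one-or-more digs
- A dig is a single character "0" through "9"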
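To make the rule descriptions concrete, here is a hand-written recognizer for the same language. It is a sketch only: PFR uses Ohm-JS for parsing, and the function parseABC and the shape of the objects it returns are assumptions made for this illustration. Whitespace is skipped before each token, which mimics what the capitalized (syntactic) rules do.

```javascript
// Hand-written sketch of the ABCgrok grammar (not how PFR parses).
// Returns an array of assignments, or null if the text does not match.
function parseABC(text) {
  let pos = 0;
  const skipSpace = () => {
    while (pos < text.length && /\s/.test(text[pos])) pos++;
  };
  // Match a literal such as "=" or "+", skipping whitespace first.
  const literal = (s) => {
    skipSpace();
    if (text.startsWith(s, pos)) { pos += s.length; return s; }
    return null;
  };
  // Variable = "a".."z"
  const variable = () => {
    skipSpace();
    const c = text[pos];
    if (c !== undefined && c >= "a" && c <= "z") { pos++; return c; }
    return null;
  };
  // number = dig+  (no whitespace is skipped between the digits)
  const number = () => {
    skipSpace();
    const start = pos;
    while (pos < text.length && text[pos] >= "0" && text[pos] <= "9") pos++;
    return pos > start ? text.slice(start, pos) : null;
  };
  // Expression = Variable "+" Variable
  const expression = () => {
    const save = pos;
    const v1 = variable();
    if (v1 !== null && literal("+") !== null) {
      const v2 = variable();
      if (v2 !== null) return { v1, v2 };
    }
    pos = save; // backtrack on failure
    return null;
  };
  // Assignment: try the "complex" alternative first, then "simple".
  // Both alternatives share the prefix Variable "=", so it is matched once.
  const assignment = () => {
    const save = pos;
    const v = variable();
    if (v !== null && literal("=") !== null) {
      const e = expression();
      if (e !== null) return { rule: "Assignment_complex", v, e };
      const n = number();
      if (n !== null) return { rule: "Assignment_simple", v, n };
    }
    pos = save;
    return null;
  };
  // TopLevel = Assignment+ and the whole input must be consumed.
  const assignments = [];
  for (let a = assignment(); a !== null; a = assignment()) assignments.push(a);
  skipSpace();
  if (assignments.length === 0 || pos !== text.length) return null;
  return assignments;
}

console.log(parseABC("a = b + c x=42"));
console.log(parseABC("a=b")); // null: a lone Variable is not an Expression
```

The ordered choice matters: as in a PEG, the complex alternative is tried before the simple one, and the first alternative that matches wins.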
A sample action specification (aka Glue):
TopLevel [@assignments] = [[${console.log (assignments)}]]
Assignment_complex [v keq e] = [[var ${v} = ${e};\n]]
Assignment_simple [v keq n] = [[var ${v} = ${n};\n]]
Expression [v1 kplus v2] = [[${v1} + ${v2}]]
Variable [c] = [[${c}]]
number [@digits] = [[${digits}]]
dig [c] = [[${c}]]
One Action rule for each grammar rule.
The Action rules must have the same name as the grammar rules.
Each rule creates and returns one string (the string might be empty).
- TopLevel accepts an iteration node called assignments, then calls console.log() to print out the assignments.
- Assignment_complex takes three parameters v, keq and e, which represent the sub-matches in the Assignment_complex grammar rule, then returns the string var ${v} = ${e}; followed by a newline. ${v} is expanded to the text of the v parameter, ${e} is expanded to the text of the e parameter, and the parameter called keq is ignored.
- Assignment_simple takes three parameters v, keq and n and returns the string var ${v} = ${n}; followed by a newline. The parameter keq is ignored.
- Expression takes three parameters v1, kplus and v2 and returns a plus expression (ignoring the parameter kplus).
- Variable takes one parameter c and returns a string containing that single letter.
- number takes an iteration node containing all of the digs and returns a string containing the digits (the iteration node is collapsed automatically to make a single string).
- dig takes one parameter c and returns a string of length 1 containing the character.
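One way to read the action spec is as a set of string-building functions, one per grammar rule. The sketch below restates the seven rules as plain JavaScript functions. It is not the code that Glue generates (use the v option to see that), and it deviates from the spec in one place: here TopLevel returns the joined text and the caller prints it, whereas the spec above calls console.log() inside TopLevel.

```javascript
// Each function mirrors one action rule and returns a single string.
const dig = (c) => `${c}`;
const number = (digits) => digits.join(""); // iteration node collapsed to one string
const Variable = (c) => `${c}`;
const Expression = (v1, kplus, v2) => `${v1} + ${v2}`; // kplus is ignored
const Assignment_simple = (v, keq, n) => `var ${v} = ${n};\n`; // keq is ignored
const Assignment_complex = (v, keq, e) => `var ${v} = ${e};\n`; // keq is ignored
const TopLevel = (assignments) => assignments.join("");

// Hand-built walk for the input "x=42 a=b+c". In a real run the
// sub-matches come from the parser, not from literal arguments.
const output = TopLevel([
  Assignment_simple(Variable("x"), "=", number([dig("4"), dig("2")])),
  Assignment_complex(Variable("a"), "=", Expression(Variable("b"), "+", Variable("c"))),
]);
console.log(output);
```

The results are built bottom-up: the innermost rules (dig, Variable) produce their strings first, and each enclosing rule splices those strings into its own template.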
Future
These commands should deal with stdin, stdout and stderr in the usual UNIX® style. Instead, these commands currently require filenames on the command line.