PFR and PF (Parsing Find and Replace)
Overview
PFR is a command-line find-and-replace utility that uses PEG parsing instead of regular expressions (REGEXPs). PFR writes the resulting (modified) input to stdout.
PF is a command-line find utility that uses PEG parsing instead of regular expressions (like grep, except with parsing). PF merely prints the names of the matching files on stdout.
Usage
pfr source-filename grammar-filename action-filename [support.js] [t] [v]
pf source-filename grammar-filename
The 3 files contain text.
Source-filename refers to a file which contains arbitrary text which is to-be-parsed.
Grammar-filename refers to a file which contains a grammar specification in Ohm-JS format.
Action-filename refers to a file which contains an action specification in Glue format.
Support.js [optional] contains functions that are used by the action rules (only required if action rules call out to support functions; not needed in simple use-cases). Default: support.js is not used and not read.
t [optional] lower-case ‘t’ as the 5th argument turns on rule tracing. Default: tracing disabled.
v [optional] lower-case ‘v’ as the 6th argument turns on display of generated action code. Default: displaying disabled.
Code
PFR and PF are shell scripts (Bash) contained in that directory. Put them on your PATH.
Workflow
[This section doesn’t belong in the final PFR/PF man page, but the tools are young…]
A workflow that works best for me is:
- Use the Ohm-Editor to build and debug the grammar.
- Write a “vanilla” action spec that simply outputs what is input (“identity”). Note that if you use syntactic rules instead of lexical rules in the grammar, the output will omit spacing. That’s OK, you just need to be aware of it; the only way to get a pure identity grammar is to use lexical rules only (not necessary).
- Modify the identity action spec to suit your needs.
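The spacing caveat in the second step can be illustrated without Ohm at all. The sketch below is plain JavaScript, not Ohm-JS or Glue, and the helper names `identitySyntactic` and `identityLexical` are hypothetical. It assumes that syntactic (capitalized) rules skip whitespace between tokens, so the skipped spaces never reach the action rules, while lexical rules match every character.

```javascript
// Hypothetical stand-in for an identity action spec over syntactic rules:
// whitespace between tokens is skipped, so only the tokens are re-emitted.
function identitySyntactic(input) {
  const tokens = input.match(/\S+/g) || [];
  return tokens.join("");
}

// Hypothetical stand-in for an identity action spec over lexical rules:
// every character, including spaces, is matched and re-emitted.
function identityLexical(input) {
  return Array.from(input).join("");
}

console.log(JSON.stringify(identitySyntactic("a = b + c"))); // "a=b+c"
console.log(JSON.stringify(identityLexical("a = b + c")));   // "a = b + c"
```

If the lost spacing matters for your output, the action spec has to put it back explicitly, as the sample action spec in the appendix does with `var ${v} = ${e};`.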
Appendix - Ohm-JS Format
See Ohm-JS Syntax.
Appendix - Glue Format
See Glue Manual.
See Glue Overview.
Appendix - Sample Grammars and Actions
- ABC Glue (uses the ftranpile() function, a precursor to PFR, with a simple .ohm file and a simple .glue file (action spec))
- first class comments (this project was a quickie, one weekend, entry in the language jam; the current versions of sequence.js and details.js contain the precursor of PFR; each of these contains references to multiple .ohm files and to multiple .glue files (action specs))
A sample (very simple) grammar specification from ABC Glue:
ABCgrok {
  TopLevel = Assignment+
  Assignment = Variable "=" Expression -- complex
             | Variable "=" number -- simple
  Expression = Variable "+" Variable
  Variable = "a" .. "z"
  number = dig+
  dig = "0" .. "9"
}
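In this example, there are 7 grammar rules (the two cases of Assignment count separately, as Assignment_complex and Assignment_simple):
- TopLevel is one-or-more Assignments
- Assignment_complex is a Variable, =, and an Expression
- Assignment_simple is a Variable, =, and a number
- An Expression is a Variable, +, and a Variable
- A Variable is a single lower case letter, from a through z
- A number is one-or-more digs
- A dig is a single character 0 through 9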
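To make the rule list concrete, here is a minimal hand-written recognizer for the same grammar in plain JavaScript. It is only a sketch of PEG-style matching (ordered choice with backtracking) and is not how Ohm-JS or PFR is implemented. The function name `parseABC` and the shape of the returned objects are assumptions made for illustration.

```javascript
// Hand-written PEG-style recognizer for the ABCgrok grammar (illustration only).
// Returns an array of matched assignments, or null if the input does not match.
function parseABC(src) {
  let pos = 0;

  // Assumption: whitespace is skipped before each token, as in syntactic rules.
  function skipSpace() {
    while (pos < src.length && " \t\r\n".includes(src[pos])) pos++;
  }

  function literal(ch) {
    skipSpace();
    if (src[pos] === ch) { pos++; return ch; }
    return null;
  }

  // Variable = "a" .. "z"
  function Variable() {
    skipSpace();
    const c = src[pos];
    if (c !== undefined && c >= "a" && c <= "z") { pos++; return c; }
    return null;
  }

  // number = dig+   where dig = "0" .. "9"
  function number() {
    skipSpace();
    const start = pos;
    while (pos < src.length && src[pos] >= "0" && src[pos] <= "9") pos++;
    return pos > start ? src.slice(start, pos) : null;
  }

  // Expression = Variable "+" Variable
  function Expression() {
    const start = pos;
    const v1 = Variable();
    if (v1 !== null && literal("+") !== null) {
      const v2 = Variable();
      if (v2 !== null) return { v1, v2 };
    }
    pos = start; // backtrack on failure
    return null;
  }

  // Assignment: ordered choice, the "complex" case is tried before "simple".
  function Assignment() {
    const start = pos;
    let v = Variable();
    if (v !== null && literal("=") !== null) {
      const e = Expression();
      if (e !== null) return { rule: "Assignment_complex", v, e };
    }
    pos = start; // backtrack, then try the second alternative from the start
    v = Variable();
    if (v !== null && literal("=") !== null) {
      const n = number();
      if (n !== null) return { rule: "Assignment_simple", v, n };
    }
    pos = start;
    return null;
  }

  // TopLevel = Assignment+
  const assignments = [];
  for (let a = Assignment(); a !== null; a = Assignment()) assignments.push(a);
  skipSpace();
  if (assignments.length === 0 || pos !== src.length) return null;
  return assignments;
}

const rules = parseABC("a = b + c\nd = 4").map((a) => a.rule);
console.log(JSON.stringify(rules)); // ["Assignment_complex","Assignment_simple"]
```

Note that `d = 4` first fails the complex case (4 is not a Variable) and only then matches the simple case; that ordering is what the `|` in the Assignment rule expresses.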
A sample action specification (aka Glue):
TopLevel [@assignments] = [[${console.log (assignments)}]]
Assignment_complex [v keq e] = [[var ${v} = ${e};\n]]
Assignment_simple [v keq n] = [[var ${v} = ${n};\n]]
Expression [v1 kplus v2] = [[${v1} + ${v2}]]
Variable [c] = [[${c}]]
number [@digits] = [[${digits}]]
dig [c] = [[${c}]]
One Action rule for each grammar rule.
The Action rules must have the same name as the grammar rules.
Each rule creates and returns one string (the string might be empty).
- TopLevel accepts an iteration node called assignments, then calls console.log() to print out the assignments.
- Assignment_complex takes three parameters v, keq and e, which represent the sub-matches in the Assignment_complex grammar rule, then returns the string var ${v} = ${e}; followed by a newline (${v} is expanded to be the text of the v parameter, ${e} is expanded to be the text of the e parameter, and the parameter called keq is ignored).
- Assignment_simple takes three parameters v, keq and n and returns the string var ${v} = ${n}; followed by a newline; the parameter keq is ignored.
- Expression takes three parameters v1, kplus and v2 and returns a plus expression (ignoring the parameter kplus).
- Variable takes one parameter c and returns a string of length 1 containing the character.
- number takes an iteration node containing all of the digs and returns a string containing the digits (the iteration node is collapsed automatically to make a single string).
- dig takes one parameter c and returns a string of length 1 containing the character.
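The `[[...${x}...]]` bodies read much like JavaScript template literals, so the rules can be approximated by hand. The sketch below is a hypothetical plain-JavaScript analogue, not the code that Glue generates (use the v argument to see that). It assumes each parameter already holds the text returned by its sub-rule, and it leaves out TopLevel, whose body only calls console.log().

```javascript
// Hypothetical plain-JavaScript analogues of the Glue rules above.
// Each parameter is assumed to already hold the text returned by the sub-rule.
const actions = {
  Assignment_complex: (v, keq, e) => `var ${v} = ${e};\n`, // keq is ignored
  Assignment_simple: (v, keq, n) => `var ${v} = ${n};\n`,  // keq is ignored
  Expression: (v1, kplus, v2) => `${v1} + ${v2}`,          // kplus is ignored
  Variable: (c) => `${c}`,
  number: (digits) => digits.join(""), // iteration collapsed into one string
  dig: (c) => `${c}`,
};

// Worked by hand, bottom-up, for the input "a=b+c":
const e = actions.Expression(actions.Variable("b"), "+", actions.Variable("c"));
const out = actions.Assignment_complex(actions.Variable("a"), "=", e);
console.log(JSON.stringify(out)); // "var a = b + c;\n"
```

The bottom-up order matters: each rule only sees the strings its sub-rules have already produced, which is why every rule returns exactly one string.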
Future
These commands should deal with stdin, stdout and stderr in the usual UNIX® style. Instead, these commands currently require filenames on the command line.