State and GOTO

In this essay, I examine some of the uses of State.

State has been maligned in literature, in the same manner that GOTO has been maligned.

GOTO is used in every programming language — under the hood. Structured Programming showed us how to hide GOTO and to isolate1 its use.

State is used in threads, among other things.

The problem is not State, but isolation of State.

Another problem with State (and GOTO and Variables and Message Passing, etc.) is lack of code localization. Spaghetti state can be helped by strict nesting2, in some instances.

Throwing away all state might not be necessary.

Nuanced Use of Case (Switch)

Case vs. Types - OOP

One form of state is querying of object types.

This problem was lassoed by Object-Oriented Programming3.

Case vs. Continuations - StateCharts

CPS - Continuation Passing Style — is another form of GOTO. CPS is more powerful that GOTO.4

CPS can become unruly when continuations are passed to regions of code that are not local to the origin of the continuation. JavaScript callbacks are an attempt to syntactically nest such uses, but callbacks failed to deliver enough benefits.

Exceptions are another attempt at encapsulating CPS. Exceptions are flawed in that they are declared statically but operate dynamically. The dynamic call-chain determines the control-flow changes caused by exceptions.

GOTO changes control-flow.

OO bundles up data. (Closures also bundle up data).

CPS bundles up and delivers, both, control-flow and data.

StateCharts provide a way to describe non-linear changes to control-flow that retain code locality.

I discuss the original StateCharts paper in https://guitarvydas.github.io/2020/12/09/StateCharts.html and https://guitarvydas.github.io/2021/02/25/statecharts-(again).html

Pattern Matching

Pattern matching consists of two parts:

One can obviate the need for flags (and associated if…then…else…endif code) by writing patterns instead.

Pattern matching was explored in compiler technologies and is called parsing in that literature.

Pattern matching is used in function languages, as a way to remove the need for variables and for writing declarative specifications of code using overloaded functions. PatternParsing is a more general form of this kind of pattern matching.

A common use of pattern matching is REGEX. Patterns are written in a DSL called a REGEX and the matching engine is hidden from view. For example, to match and “a” followed by a “b”, one would write a REGEX DSL snippet “/ab/”.

An early form of a pattern-matching-and-engine language is PROLOG. One would write patterns and the PROLOG engine would use backtracking to return an exhaustive list of all matches.

Early forms of pattern matching in compilers was embodied in tools such as LEX and YACC.

The most recent form of pattern matching is PEG. PEG subsumes the capabilities of REGEX, LEX and YACC.

PEG uses a backtracking engine and, thus, subsumes some of the uses of PROLOG (and other relational technologies).

PEG might enable a new breed of programming, for example DSLs that write DSLs (I call these SCLs5)

PEG rules can be built in layers, hence, PEG could be used to match text and to match type information. This should allow PEG to be used instead of REGEX and Haskell.

Conditional Compilation - #IFDEF

Conditional compilation has been used as a way of creating “portable” programs by brute force.

Conditional compilation is known as “#ifdef” in C-class languages and “#+” in Lisp-class languages.

In “#ifdef” languages, a DSL is used to express conditional compilation. The DSL is usually distinct from the underlying language, e.g. the kind of expressions used in C #ifdef is a language unto itself and does not provide access to the full C language.

In “#+” Lisp conditional compilation, the expression language is full Lisp itself — no extra DSL is created.

Such conditional compilation is a small step towards the full generality of pattern matching. Little languages — SCLs and DSLs — built on top of these languages could provide the desired conditional compilation effects.

Portability was explored in compiler technologies, for example the techniques developed for dealing with disparate target CPU architectures.

Many of the successful techniques at portability used a two phase approach. The compiler would emit code for a generalized architecture (often called a VM, today) and a pattern-matcher would transform the generalized code into code specialized for a particular architecture.

Some of the interesting technologies included:6

[Note that all of the above techniques are based on Divide and Conquer. The transformations become easier as more layers are added.]

Macros

Macros are DSLs for changing the behaviour of compilers (and interpreters).

Macros can program the compiler.

In the extreme case — e.g. Lisp macros — macros employ the full underlying language. For example one can write a Lisp program that is run at compile-time. This lisp program — called a macro — creates new source code that is fed to the compiler.

In the other extreme, e.g. C macros, a separate macro-processor phase is run and its output is fed to the C compiler. Such macros work in two steps — preprocess, then compile. Obviously, the Lisp method is more powerful, but carries more potential risk.

Inlining of functions is a formalization of macros. Inlining is an attempt to encapsulate one use-case of macros, e.g. optimization of programs defined in the function-only domain.

Lisp macro processing work recursively. C-style macro preprocessing works in only two steps.

The ideal extension would be to have an infinite number of preprocessing steps — a pipeline of preprocessors. [PEG might be one way to achieve this effect. PEG grammars that emit programs that are further processed by other PEG grammars, and so on.]

Case vs. Flags

PL constructs like if … then … else … end if allow code to query state flags with little constraint on how the code is nested.

It is possible to write clear code using such constructs, but such structuring tends to be rare.

Likewise, note that it is possible to write clear code in assembly language, including all forms of 1st class functions, etc.

Code in general, needs to be constrained with syntactic sugar.

Functions and Functional Programming

FP (functional programming) is but one way to structure code.

FP limits the available paradigms to, mostly, a one-in and one-out style. Various extensions to FP include exceptions and full-blown use of CPS, e.g. in Denotational Semantics, but, the results are increasingly less readable.

FP works by prohibiting changes of state — this creates isolation between functions and allows manipulation of the notation.

It should be noted that UNIX® processes also create isolation between blocks of code — no data nor control-flow can escape the boundaries of a process.


  1. Isolation is like encapsulation, but includes control–flow.  ↩︎

  2. As seen with the solutions for Global Variables, Structured Programming, etc.  ↩︎

  3. Bertrand Meyer pointed this out.  ↩︎

  4. And, therefore, CPS is more dangerous that GOTO.  ↩︎

  5. SCL mean Solution Centric Language.  ↩︎

  6. See https://guitarvydas.github.io/2021/01/14/References.html  ↩︎