DRY means “Don’t Repeat Yourself”.

RY mean “Repeat Yourself”.

Many programs are first created by cut & paste of working code.

Cut/Paste is RY.

Programmers are taught to organize code so that there is little repetition - DRY.

DRY organization is also called scoping (and I call it nesting).

RY introduces multiple degrees of freedom into a design.

In general RY results in dimension explosion. DRY tries to constrain the explosion by grouping similarities.

An example of RY is shown below.

[This is incorrect code, but shows the problem].

In this (incorrect) code, a port is

  • a rectangle that contains nothing else
  • a circle that contains nothing else
  • a rectangle that contains only one text object
  • a circle that contains only one text object.

We can write a rule for port by making 4 clones of the code, with the appropriate conditions: [where \+ contains(ID,_) means “doesn’t contain anything”].

% layer high
%     port is a rect or circle that contains nothing
port(ID) :-
    rect(ID,nil),
    \+ contains(ID,_).
port(ID) :-
    circle(ID,nil),
    \+ contains(ID,_).
% or, port is a rect or circle that contains only [text]
port(ID) :-
    rect(ID,nil),
    contains(ID,Other),
    text(Other,nil).
port(ID) :-
    circle(ID,nil),
    contains(ID,Other),
    text(Other,nil).

The above is RY. When we start adding more lines to the code, the code explodes in a number of directions and becomes unmaintainable.

For example, at this point in the design, we want to add one more condition to the rule. E.G. a port must be a square rectangle. A rounded rectangle is not a port.

We could just hack on the code:

% layer high
%     port is a rect or circle that contains nothing
port(ID) :-
    rect(ID,nil),
    \+ rounded(ID,nil).
    \+ contains(ID,_).
port(ID) :-
    circle(ID,nil),
    \+ contains(ID,_).
% or, port is a rect or circle that contains only [text]
port(ID) :-
    rect(ID,nil),
    \+ rounded(ID,nil).
    contains(ID,Other),
    text(Other,nil).
port(ID) :-
    circle(ID,nil),
    contains(ID,Other),
    text(Other,nil).

We see lots of commonality in the code above. If we keep hacking, the code will grow in a number of dimensions.

Programmers are taught to apply DRY principles to the above.

Programmers look for commonality and try to constrain it. The ideal is to lasso every bit of commonality and to create a single rule for each grouping.

We could lasso the shape of ports first - rounded rectangles and squares - or, we could lasso the notions of containing-nothing vs. containing-only-text.

Can we wring out all of the commonalities?

Trying to do this in an all-in-one manner brings complexity into the picture and obscures the final result. The result might be good from a DRY perspective, but becomes harder to read.

Some call this abstraction.

Let’s say that we rewrite the code in the following way:

% a port is, maybe, a square rect or a circle
maybePort(ID) :-
    squareRect(ID,nil).
maybePort(ID) :-
    circle(ID,nil).
% a port contains nothing, or, 
port(ID) :-
    maybePort(ID),
    containsNothing(ID).
port(ID) :-
    maybePort(ID),
    containsExactlyOneTextItem(ID).

which starts out with rect-ness and circle-ness of ports.

We could write the same code in another way:

maybePort(ID) :-
    containsNothing(ID).
maybePort(ID) :-
    containsExactlyOneTextItem(ID).
port(ID) :-
    maybePort(ID),
    squareRect(ID).
port(ID) :-
    maybePort(ID),
    circle(ID).

Which choice is better?

It depends on the Architect.

Ideally, the code should capture the Architect’s thought process and make the thought process clear to the reader.

What happens in a more complex situation? We get flat code that represents a bunch of non-flat decisions that were made by the Architect.

The ideal code would capture that Architect’s non-flat decisions, e.g. it would be

port(ID) :-
    maybePort(ID),
    containsNothing(ID).
port(ID) :-
    maybePort(ID),
    containsExactlyOneTextItem(ID).

eliding maybePort and deferring its definition to elsewhere

or

port(ID) :-
    maybePort(ID),
    squareRect(ID).
port(ID) :-
    maybePort(ID),
    circle(ID).

or, better,

(squarePort OR circle)
AND
(containsNothing OR containsExactlyOneTextItem)

or, if we wanted to capture the Architect’s flow

(squarePort OR circle) -> (containsNothing OR containsExactlyOneTextItem)
                  else -> fail
(containsNothing OR containsExactlyOneTextItem) -> (squarePort OR circle)
                                           else -> fail

[In this simple example it is “obvious” that this flow-code is the same as the code that precedes it. The difference is that the former code uses text only, while the latter uses the “->” operator to show left-to-right design.]

The State Explosion Problem

State-based code has a similar problem, often call the “state explosion problem”.

StateCharts constrain the State Explosion problem.

Database Theory

Databse theory is the notion of wringing out all commonality in data structures.

User Friendly

What notation is more user-friendly?

Math? Math wrings out commonality, but most people dislike the notation. Mathematics maps a 4D design (X/Y/Z/time1)) to 2D (X/Y, aka text). [We see this mapping in the above, where the code was abstracted to “maybePort”]

Drawings? The idea of running diagrams has not been explored sufficiently (IMO). Code->drawings is not sufficient. We need drawings->code2.

IDEs could deal with DRY and help programmers to specify DRY. Github uses diff, we just need to use diff for higher level concepts.

OO was an attempt to capture DRY. OO was unsatisfactory, because of how it warps control-flow.

IDE elision to help programmers create code in layers. Org-mode for code.

What we really want is to capture the design process. Did the Architect address this problem by starting with rect/circle commonalities, or, did the Architect start by addressing the containment commonalities.

It helps to know what the Architect was thinking, when one wants to upgrade or otherwise maintain the code. The above example is simple. Things get hoary in real-life examples.

To capture this kind of thought process, we need to capture the design over time. Mathematics wrings time out of its notation. That is probably one of the reasons that non-mathematicians don’t like math - it doesn’t always suit the way they think about certain problems.

We want to capture the thought process, for example

  • one train of thought starts out with rectangles and circles
  • another train of thought starts out with containment (contain nothing, contain only 1 text item).

If we could build black boxes and just plug them together, we might be able to capture the thought process, for example:

  1. rect -> not rounded -> contains-nothing -> it’s a port.
  2. rect -> not rounded -> contains-1-text-item -> it’s a port.
  3. circle -> contains-nothing -> it’s a port.
  4. circle -> contains-1-text-item -> it’s a port.
  5. contains-nothing -> rect -> not rounded -> it’s a port
  6. contains-1-text-item -> rect -> not rounded -> it’s a port
  7. contains-nothing -> circle -> it’s a port
  8. contains-1-text-item -> circle -> it’s a port.

Programmers are taught to abstract the above choices down to fewer choices.

Math notation can capture the above choices, but doesn’t show any kind of ordering (by the Architect).

Box-and-arrow notation, using black boxes, can capture the above and show the ordering of design.

Isolation could be used to create pluggable components.

See Also

References
Table of Contents

  1. A Design evolves over time and can punch-in and punch-out to different layers. 

  2. Drawings->code can be accomplished by expressing the diagrams in normalized triple form (based on shapes, instead of pixels), using exhaustive search (e.g Relational Programming) to parse the information, then using well-known compiler technologies to transpile the result into working code.