Monday, 2 November 2020

Even More Aphorisms

Alan Perlis was the first winner of the Turing Award in 1966. In 1982 he published [1] a set of 130 epigrams on programming. His aim, he explained, was to capture—in metaphors—something of the relationship between classical human endeavours and software development work. "Epigrams," he wrote, "are interfaces across which appreciation and insight flow." This post offers a few aphorisms. My dictionary tells me that an epigram is 'a pointed or antithetical saying', while an aphorism is 'a short pithy maxim'. Whatever they may be called, I hope that these remarks will evoke some appreciation and insight.

41. Causality is not a topic in logic or science: it is an informal notion indispensable in modelling and in failure diagnosis.
42. A wise development project prefigures the system's evolution by enacting its earliest phases in vitro.
43. The 'fault' concept is minimally useful in modelling—if it's in the model it's not a fault. The vital idea is deviation between model and subject.
44. Modelling the physical world without domains is destructive fragmentation: it deprives causal links of their homes.
45. Any formalism restricts what you can say: as evidence of its own applicability it is merely a self-fulfilling prophecy.
46. Separation of concerns is the mother of simplicity; but giving birth is not always easy.
47. Natural language is the indispensable intermediary between unfettered thought and formalisation.
48. The machine-world interface of sensors and actuators gives the machine a wafer-thin view of the governed world.
49. The Eight Queens problem is not about chess; but the Lift problem really is about lifts.
50. An unsuitable method, like a chisel used as a screwdriver, works poorly, breaks easily, and is hard to use.

[1] A J Perlis; Epigrams on Programming; ACM SIGPLAN Notices 17,9 September 1982.

Links to other posts:
 ↑  Causality: Causality explains how a system works
 ↑  Physical Bipartite System: The nature of a bipartite system
 ↑  Simplicity: Simplicity is hard, but vital
 ↑  Triplet Models: What models are needed of the machine and governed world?
 ←  Ten Aphorisms: Ten short remarks
 ←  Ten More Aphorisms: Ten more short remarks
 ←  Yet More Aphorisms: Ten more short remarks
 ←  Again More Aphorisms: Ten more short remarks

Sunday, 25 October 2020

Program Specification

A program is correct, Dijkstra insisted [1], only if it provably satisfies a given formal specification. The specification has been chosen for its pleasantness (to use Dijkstra's word): that is, a program satisfying the specification will satisfy the informal needs of the programmer's customers. The specification is located at the interface between need and satisfaction—the program's input-output interface. It acts as a firewall, rigorously isolating the programmer's correctness concern from the customers' need for an informal pleasantness that could corrupt the formality of the programming task. For a program to compute the GCD (Greatest Common Divisor) of two positive integers, the interface is two argument integers A and B, and a result integer C: the specification is "C is the GCD of A and B." On the other side of the interface, the customer judges GCD more pleasant than LCM (Lowest Common Multiple): but that is another world. The programmer need not—indeed, must not—be concerned with the customers' reasons: everything needed for correctness is in the specification.

For cyber-physical systems, too, software engineers may aspire to write provably correct programs. Here, a pleasant specification is one whose satisfying program ensures the governed world behaviour desired by stakeholders. Where is this specification firewall located? Necessarily, at the shared phenomena where sensors and actuators connect the formal to the non-formal—the machine's input-output interface to the governed world. The level of abstraction is low. If only state phenomena are shared at the interface, then in one run of the system, nothing is visible at this interface except an evolving trace of these state values. The trace is constrained by the physical properties of the governed world and by execution of a correct program: acceptable traces are exactly those corresponding to desired governed world behaviour. The specification stipulates only that the program, monitoring the sensors and controlling the actuators, ensures that the evolving trace is always an acceptable trace. The customer gets the desired governed world behaviour. The programmer need know nothing of this behaviour above the level of the machine side of the sensors and actuators.

For almost any real system, this reductionist specification is absurd, offering the programmer only incomprehensible stipulations about sensor and actuator states. Common sense—for a lift system, for example—demands a specification expressed in such abstractions as "Go to next floor up", "Close the doors", "Illuminate request button", and so on. After all, the specifiers must have considered these and similar abstractions in their process of arriving at the reductionist specification. But these abstractions have no direct physical realisation: there is no actuator available to the trace specification that can respond to the command "Go to next floor up". Deeper in the governed world, the abstraction level can be raised only by major physical changes far outside the remit and scope of software engineering. Is our goal—of formal program specification located strictly at the machine-world interface—therefore unachievable? The answer must be yes.

How can this be? If 'GCD' is a permissible abstraction in specifying the arithmetical program, why is 'Go to next floor up' not allowed in the lift system? Is the difference perhaps a consequence of inadequate development method? No, it is not. The difference is in the problem worlds of the two examples. The GCD program is about arithmetic, which is a formal and abstract system. 'GCD' is a permissible abstraction because the axioms and theorems of arithmetic are certain, and are known to the specifier, the programmer, and even to the customer who finds 'GCD' more pleasant than 'LCM'. In unpacking 'GCD' we may wish to refer to 'divides', 'remainder', 'zero', 'modulo' and even 'prime': these are potentially related notions, but their definitions and relationships present no challenge to mathematical certainty and shared knowledge. In contrast, the governed world of a cyber-physical system is neither abstract nor formal. In Einstein's words [2]: "As far as the propositions of mathematics refer to reality, they are not certain; and as far as they are certain, they do not refer to reality.” Knowledge of the governed world's modelled properties is neither universal nor eternal, but specific to the time, the place, and the purposes of the particular system in hand—and always contingent. Its scope includes specific physical domains and the causal links they effectuate. There can be no presumption that any of this is known to the software programmer.

This challenge is fundamental for every cyber-physical system. It provides the ultimate justification of many of the basic structuring ideas presented in this blog, and of the principle that development must rest on an iterative cooperation between those whose primary concerns lie in the governed world and those for whom they lie in the machine.

[1] E W Dijkstra; On the Cruelty of Really Teaching Computing Science; CACM Volume 32 Number 12, page 1414, December 1989.

[2] Albert Einstein; Sidelights on Relativity: Ether and Relativity: II Geometry and Experience; Methuen, 1922.

Links to other posts:
 ↑  Physical Bipartite System: The nature of a bipartite system
 → The ABC of Modelling: Axiomatic, Behavioural, and Consequential aspects

Tuesday, 13 October 2020

Simplicity

Tony Hoare [1] managed the development of the promised software for the Elliott 503 Mk II computer. In 1964 the project ended in disastrous failure. He identified the fundamental cause only when his most senior manager shouted at him ‘You let your programmers do things which you yourself do not understand!’—he had strayed from the narrow path of simplicity.

Simplicity is the antithesis of complexity. Not of computational complexity, which measures the time or space consumed by an algorithm, but of the complexity which resists human understanding. Structured programming, in Dijkstra’s sense, embraces simplicity. Progress in program execution is directly mapped to the program text; the execution context of every segment of code is clearly visible in the nested enclosing segments. A structured program is comprehensible: its readers understand its behaviour well enough to enact it in their minds as they read.

The machine program of a triplet should be a structured program. In the same way, a triplet’s model of the governed world, and its behaviour operational principle, must be simple. An interesting departure from simplicity of operational principle is apparent in the classic pendulum clock. Most causal links in the clock's operational principle are clear: the weight or spring turns the gear train; the gear train turns the hands; the force of gravity acts on the pendulum; and so on. But the escapement presents a complexity. Here the pendulum acts on the escape wheel to ensure timely controlled rotation of the gear train; but also the escape wheel acts on the pendulum to maintain its swing against the resistance of air pressure and friction. The operational principle reveals this problem explicitly. Horologists worked on this probem for centuries, proposing and improving mechanism designs, eventually trying to dissolve the mechanical conflict by electromagnetic coupling. (For a software taste of this kind of complexity, try designing a driver assistance behaviour that combines speed limiting with cruise control.)

Simplicity must also be sought more generally in the governed world model. Domain properties must hold, as described, over the whole behaviour enactment. Each domain must play a constant role over the enactment (setting the clock and tuning the radio can't appear in the same behaviour for the Honda Jazz, because the tuning buttons domain is also used to set the clock time). The enactment time span must be unbroken: the machine may sometimes wait for the governed world, and vice versa, but there must be no time at which machine execution has been suspended for later resumption. The problem must be closed in the sense that the triplet behaviour can be deduced from its machine program and governed world model alone, without reference to properties of anything external.

There was a happy ending to the Elliott 803 story. Tony Hoare was given the opportunity to repair the catastrophe, and he succeeded. Twenty five years later [1], he summed up the lesson in a famous maxim. "There are two ways of constructing a software design: One way is to make it so simple that there are obviously no deficiencies and the other way is to make it so complicated that there are no obvious deficiencies. The first method is far more difficult."

[1] Tony Hoare; The Emperor’s old clothes; CACM February 1981. Reprinted in C A R Hoare and C B Jones; Essays in Computing Science; pages 1-18, Prentice Hall 1989.

Wednesday, 7 October 2020

Jigsaw Puzzles

In 1766 the mapmaker John Spilsbury made an educational aid of a new kind: a map of Europe, mounted on wood and dissected by cutting around the borders of the kingdoms. Children would learn the countries' shapes and borders by reassembling the original map. Dissected pictures soon became—and still are—very popular as puzzles. For childrens' puzzles the dissected pieces usually correspond to identifiable components of the whole picture. For adults the dissection is arbitrary: the pieces are much smaller, usually linked by the familiar blob-and-socket device.

Solving jigsaw puzzles invites informal analogies to many intellectual tasks. The informality is itself an advantage, freeing an active mind to find fresh perspectives and understandings. Michael Polanyi [1] likened the distributed work of scientific research groups to solving a gigantic jigsaw puzzle. Each group works to assemble what they believe is one identifiable coherent part of the whole picture. As the work progresses, mismatches are recognised between the groupings and the emerging structure of the whole problem. Groups A and B are working on distinct but adjacent parts of the same house; group C proves to be working on two cars, not one; groups D and E discover that they are working on scattered subsets of the same windmill. Group F discovers that the recalcitrant pieces of foliage that won't fit into their tree are in fact fragments of a large garden hedge that merits a group to itself. The shape of the task evolves as it progresses.

Closer to home, jigsaw dissections can be compared to software engineering structures. Requirements intended to describe system behaviour are often defined as stimulus-response pairs: "If Set Ratio mode is not selected, Set Ratio mode shall be selected if the Control Panel SR switch is pressed." If this is a piece of the puzzle, how many pieces would make the whole picture? Is the whole picture provided, or will it appear only when all the pieces are assembled? Where are the blobs and sockets connecting each piece to its neighbours? How might this piece's picture fragment continue on a neighbouring piece? Are these the only pieces, or are there pieces of other kinds that provide a matrix and a frame? Can the stimulus-response pairs be grouped into larger more meaningful structures—like Spilsbury's kingdoms? What would such structures correspond to in reality?

One of Spilsbury's jigsaws, with its ornamented title "Europe divided into its Kingdoms", can be seen at the British Library [2] in London. Inevitably—in a non-formal reality—the structuring in kingdoms is not perfectly simple. The physical boundaries of most kingdoms are too irregular to be precisely represented; parts of the kingdom of Spain are tiny islands, embedded in the Mediterranean sea and too small to form separate pieces; Sardinia (a disconnected part of Italy) and Corsica (a disconnected part of France) are combined into a single piece. As a cartographer, Spilsbury could identify these anomalies and deal firmly with them: he had the whole map in front of him as he made his dissection. At a requirements engineering workshop a senior consultant from a car manufacturer proudly announced that his company's leading product line had 200,000 stimulus-response requirements. Were they dissected from the complete picture? Or were the developers hoping to infer the whole picture from these 200,000 pieces? The consultant didn't tell us.

[1] Michael Polanyi; The Republic of Science; in M. Grene ed, Knowing and Being, pp50-1, U Chicago 1969.

[2] [https://www.bl.uk/learning/timeline/large104695.html] accessed 7th October 2020.

Thursday, 17 September 2020

Two Use Cases

System requirements are often documented by use cases [1]. We can identify an important aspect of the use case concept and its application by comparing two simplified examples: using an ATM and using a lift. The traditional depictions are a stick figure labelled ATM_User with an arrow pointing to an ellipse labelled Use_ATM, and the same graphic with labels Lift_User and Use_Lift. Each use-case describes a class of interaction episodes in which the user can obtain some desired result: the ATM_User can withdraw cash and perform other banking actions, and the Lift_User can travel from one floor to another. At first sight the two are closely parallel. Each episode takes place between the system—essentially what we call the machine—and the user, regarded as an external agent. Each use case is typically considered to be a requirement on the system, which must provide the desired results.

But Use_ATM and Use_Lift differ radically. Use_ATM describes a system constituent behaviour. The user is a participant, playing a role not unlike a pilot or a car driver. The user and machine engage in a close dialogue, the machine responding to each user request, sometimes by asking for further information—such as a PIN or the amount of cash desired. In Use_Lift there is no such dialogue. The user can press buttons, and enter or leave the lift at a floor, but the machine makes no response except when a button is newly pressed, when it illuminates the button and adds the corresponding floor to its set of outstanding requests. Individual users, and their comings and goings, are invisible to the system: nothing in the system behaviour corresponds to an instance of Use_Lift.

We do not consider Use_ATM to be a requirement. Rather, it is itself a behaviour design detailing the interaction between a governed world domain—the ATM_User, singled out as uniquely significant—and the rest of the system. By contrast, Use_Lift can be understood as a requirement on a constituent behaviour NLS (Normal_Lift_Service) of the lift system. It describes action sequences available to a user at a floor lobby wanting to travel to the lobby of another floor. The requirement is that users performing certain of these action sequences are enabled to achieve their desired result. There will be further requirements of the NLS behaviour—for example, concerning dwelling time at a visited floor, lift car rest position when no requests are outstanding, maximum wait times, and many more. The Use-Lift requirement is simply that the system's NLS behaviour must afford travel opportunities between floors much as the requirement on a train service is to afford travel opportunities between stations.

[1] Ivar Jacobson, Ian Spence, Kurt Bittner; USE-CASE 2.0 The Definitive Guide, 2011.

Links to other posts:
 ↑ System Behaviour Complexity: The component structure of system behaviour

Tuesday, 21 July 2020

The Value of Formalism

The value of formalism in software engineering—not least for cyber-physical systems—is palpable. As Robert Boyle wrote in 1666 [1]: “In pure mathematicks, he that can demonstrate well may be sure of the truth of a conclusion without consulting experience about it." A formal model allows reliable reasoning: mathematical proof of invariants and other formally stated requirements; confident evaluation and simplification of expressions; exhaustive checking of preconditions; simulation of a modelled executable process. These are not small advantages, and become hugely more powerful where good software tools are available. While avoiding the risk of logical fallacies, formal modelling also avoids also the pitfalls and obstacles—fuzzy phenomena, contingent physical properties, unexpected interactions, surprising physical effects, and hidden interferences—that characterise the governed world of a cyber-physical system.

Unsurprisingly, there is a price to be paid. The pitfalls and obstacles of the governed world cannot be exorcised by a formal utterance: perversely, they persist, ensuring that a formal model of the physical governed world can never be correct. The software engineering goal must be adequacy, not correctness. A model's fidelity to its subject does not embody timeless global truth, but adequacy to the conditions and purposes of its use. Achieving this adequacy is the work of a necessary practical discipline, not the QED of a mathematical proof. In the most demanding context—in a critical function of a critical system—three conceptual phases of model development can be identified: pre-formal, formal, and post-formal.

The pre-formal phase is an exploration of the governed world, captured in natural language. Natural language allows unfettered discovery because the evolving description can be continually modified and extended during the exploration. The set of primitive types can be enlarged; concepts such as causality and temporal sequencing can be conveniently introduced; higher-order statements can be added to a first order text. Only when substantial understanding has been achieved through such exploration can a language for formalisation be confidently chosen.

In the formal phase the axiomatic aspect of the governed world model is developed in the chosen formalism, describing the domain properties on which the machine will rely to ensure the desired behaviour in the world. Then the behavioural aspect is addressed, combining the axiomatic model with the software machine behaviour at the machine-world interface. Separating these two aspects avoids a potential confirmation bias. The axiomatic properties are first modelled without accompanying references to the specific intended behaviour. Then, they are instantiated in the specific setting of interaction with the machine.

In the post-formal phase the axiomatic and behavioural model aspects are critically audited for previously overlooked vulnerabilities in the physical governed world. For example, in some possible traces a necessary causal link may be disabled by a state value set in an earlier event. In another example, it may be belatedly recognised that a human participant—such as the pilot of an aircraft—may repeat an action often enough for the eventual result to cause a lethal failure.

Echoing Boyle, Herman Weyl [2] characterised the "decisive step of mathematical abstraction: we forget about what the symbols stand for." This amnesia is indeed the essence of formalism. But software engineers must remember.

[1] Robert Boyle; Preface to Hydrostatical Paradoxes made out by New Experiments, For the most part Physical and Easy, 1666.
[2] Herman Weyl; The Mathematical Way of Thinking; Bicentenial Conference, U Penn, 1940.

Links to other posts:
 ← Triplets: Triplets (Machine+World=Behaviour) are system behaviour elements
 ← Triplet Models: What models are needed of the machine and governed world?
 ← The ABC of Modelling: Axiomatic, behavioural, and consequential aspects