Monday, 23 March 2020

Triplet Behaviour Concerns

Triplet enactment concerns address failures in enacting a triplet; triplet behaviour concerns address failures in the governed behaviour of a correctly enacted triplet. This post lists only a few behaviour concerns, with brief comments on each. There are many more: every project should establish and maintain its own checklist of particular concerns.

BREAKAGE: Domain interaction protocols must be correctly observed. For example, in a lift system a minimum settling time is stipulated between stopping and restarting the hoist motor, and a longer time if direction is reversed. Ignoring this rule may damage or even break the motor.
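A guard enforcing such a protocol might be sketched as follows. This is a minimal illustration only: the timing constants, class name and method names are invented for the sketch, not taken from any real lift controller.

```python
import time

class HoistGuard:
    """Enforces a minimum settling time between stopping and restarting the hoist motor."""
    SETTLE_SAME = 2.0      # hypothetical: seconds before restarting in the same direction
    SETTLE_REVERSE = 5.0   # hypothetical: longer settling time if direction is reversed

    def __init__(self):
        self.stopped_at = None
        self.last_direction = None

    def record_stop(self, direction, now=None):
        self.stopped_at = time.monotonic() if now is None else now
        self.last_direction = direction

    def may_start(self, direction, now=None):
        if self.stopped_at is None:
            return True  # motor has never run: no settling constraint applies
        now = time.monotonic() if now is None else now
        required = (self.SETTLE_REVERSE if direction != self.last_direction
                    else self.SETTLE_SAME)
        return now - self.stopped_at >= required
```

The machine program would consult `may_start` before every restart command; ignoring the guard is exactly the breakage concern.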

CREEP: Suppose the program variable p records the linear position of a robot arm P: p:=posn(P) sets p to the sensed position of P, and moveTo(P,p) moves P to the recorded position p. Conversion between the real physical position of P and the floating-point value of p is unavoidably approximate. The sequence {p:=posn(P);moveTo(P,p)} may change the position of P, perhaps with a bias in one direction. Over time, repeated execution of the sequence, with no other assignments to p and no other moves of P, may cause the arm position to creep. A detailed analysis of creep—in temporal rather than spatial position—is given in a report [1] on a failed critical system.
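The mechanism can be illustrated by a toy simulation. The grid steps, and the assumption that both conversions truncate downward on misaligned grids, are invented for the sketch and are not taken from any real robot; the point is only that two approximate conversions with a shared bias make the round trip drift.

```python
import math

SENSOR_STEP = 0.003    # hypothetical sensor resolution
ACTUATOR_STEP = 0.007  # hypothetical actuator resolution

def posn(actual):
    # Sensing truncates to the sensor grid: always rounds downward.
    return math.floor(actual / SENSOR_STEP) * SENSOR_STEP

def move_to(target):
    # Actuation truncates to the actuator grid: also rounds downward.
    return math.floor(target / ACTUATOR_STEP) * ACTUATOR_STEP

actual = 10.0
for _ in range(50):
    p = posn(actual)        # p := posn(P)
    actual = move_to(p)     # moveTo(P, p)

# With both conversions biased the same way, the arm has crept downward:
print(actual)  # strictly less than the initial 10.0
```

Each cycle can lose at most one sensor step plus one actuator step, but with no other moves of P the loss accumulates and never reverses.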

CRITICAL PRIORITY: In identifying the requirements relevant to a triplet, global requirements should be treated with scepticism. For example, the apparently global requirement "The lift doors are never open while the lift is moving" may not apply to firefighter mode operation: a firefighter's need to escape from a burning floor takes precedence over the more general safety requirement.

DORMANT RESPONSE: When a triplet behaviour has entered a waiting state, it may later execute a delayed action that is no longer appropriate. For example, when the driver presses the brake pedal on the highway, the Cruise Control behaviour goes into standby mode, waiting for the driver to press the Resume button. If, after joining slow urban traffic, the driver presses the Resume button by accident, the resulting acceleration will be very dangerous and possibly fatal.
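A common mitigation is to guard the delayed action with a freshness check. The sketch below is illustrative only; the state names and the speed band are hypothetical, not drawn from any real cruise control design.

```python
class CruiseControl:
    RESUME_BAND = 15  # hypothetical: max km/h gap between current and set speed

    def __init__(self, set_speed):
        self.set_speed = set_speed
        self.state = "engaged"

    def brake_pressed(self):
        self.state = "standby"  # wait for the driver to press Resume

    def resume_pressed(self, current_speed):
        # Guard against a dormant response: refuse to accelerate back to the
        # set speed if the current speed is no longer close to it.
        if (self.state == "standby"
                and abs(current_speed - self.set_speed) <= self.RESUME_BAND):
            self.state = "engaged"
        else:
            self.state = "cancelled"  # the driver must set a speed afresh
```

The guard makes the accidental Resume in slow urban traffic a no-op rather than a dangerous acceleration.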

RACE CONDITION: A race condition exists between two temporal phenomena—each an occurrence of an event or of a state value—whose ordering is indeterminate. A race may be harmless: for example, an empty waiting lift may be summoned almost simultaneously by requests from a floor above and a floor below, and will then depart in the direction of the 'winning' request. In general, races are unavoidable because the governed world has independent agents and processes that cannot be prevented from racing in competition. Addressing a race condition concern means ensuring that neither outcome can cause serious failure. A lethal race condition is described in [2].
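The lift example can be sketched as follows. The threading detail is incidental; the point is the final assertion: whichever summons wins the race, neither ordering causes failure, because the losing request remains queued to be served later.

```python
import threading
from queue import Queue

requests = Queue()  # thread-safe FIFO of pending summonses

def summon(direction, floor):
    requests.put((direction, floor))

# Two summonses race: their arrival order is indeterminate.
t1 = threading.Thread(target=summon, args=("up", 5))
t2 = threading.Thread(target=summon, args=("down", 1))
t1.start(); t2.start()
t1.join(); t2.join()

winner = requests.get()   # the lift departs in the winning direction...
loser = requests.get()    # ...and the losing request is not lost: it stays pending

# The concern is addressed if neither ordering can cause serious failure:
assert {winner, loser} == {("up", 5), ("down", 1)}
```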

[1] US GAO Report; GAO/IMTEC-92-26 Patriot Missile Software Problem; February 1992.
[2] Nancy G Leveson and Clark S Turner; An Investigation of the Therac-25 Accidents; IEEE Computer 26,7 July 1993.

Links to other posts:
 ↑ Avoiding Failure: Checklists of failures and how to avoid them
 ↑ Triplets: Triplets (Machine+World=Behaviour) are system behaviour elements
 ← Triplet Enactment Concerns: Failures to avoid in triplet enactment design

Thursday, 19 March 2020

Triplet Enactment Concerns

Triplet enactment concerns address failures in enacting a triplet; triplet behaviour concerns address failures in the governed behaviour of a correctly enacted triplet. The distinction is convenient but not rigorous. This post lists enactment concerns, with brief comments on each.

INITIALISATION: A triplet is enacted when an execution of its machine program is begun by its parent machine in the enactment tree. Enactment is considered to end when program execution ends. The triplet's machine program defines local variables; the global variables are governed world domains. The governed world model specifies a precondition on the global variables for enactment to begin. For example: firefighter lift service may specify that the lift car is at the ground floor, the doors are open, and the hoist motor is switched off. The external agent activating the enactment must ensure that the precondition is satisfied. For different behaviours different treatments of the initialisation concern will be appropriate. The weakest precondition is true, which is always satisfied. In general, a stronger precondition will restrict the freedom to combine the triplet with other concurrent behaviours. An initial phase of the triplet behaviour that allows a weaker precondition by establishing desired conditions may sometimes overcome this disadvantage.
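The firefighter example, with an initial phase that weakens the effective precondition, might be sketched like this. The world representation and condition names are invented for illustration.

```python
def firefighter_precondition(world):
    # Hypothetical precondition from the governed world model.
    return (world["car_floor"] == 0
            and world["doors"] == "open"
            and not world["motor_on"])

def establish(world):
    # Initial phase: establish the desired conditions, so that the
    # enactment as a whole needs only a weaker precondition.
    world["motor_on"] = False
    world["car_floor"] = 0   # (in reality: drive the car to the ground floor)
    world["doors"] = "open"

def enact_firefighter_service(world):
    if not firefighter_precondition(world):
        establish(world)      # initial phase in place of a stronger precondition
    assert firefighter_precondition(world)
    # ... the main behaviour proceeds from a known governed world state ...
```

Without the `establish` phase, the external agent activating the enactment would itself have to guarantee the stronger precondition.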

TERMINATION: Enactment of a triplet behaviour may end in three ways. First, the machine program may terminate at a programmed halt. For example, a CloseDoors behaviour in a lift system halts when the doors reach their closed position. Second, the parent machine may issue an OrderlyHalt control command: the machine program halts on reaching the next orderly state of the governed world—where "orderly" is defined in the behaviour design. For example, in a Stop-Start feature of a car, "orderly" may mean "the engine is running, possibly successfully restarted after a stop." Third, the enactment may be forcibly ended by an Abort command. This is appropriate if all governed world states are considered orderly, or if a critical situation unconditionally demands immediate pre-emptive termination.
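The three endings can be sketched in a toy CloseDoors behaviour. The sketch is illustrative only: here, hypothetically, an 'orderly' state is one in which the doors are not in mid-motion (fully open or fully closed).

```python
RUN, ORDERLY_HALT, ABORT = "run", "orderly_halt", "abort"

def close_doors(initial_gap, control_commands):
    """A toy CloseDoors behaviour: each step narrows the door gap by one.

    control_commands yields the parent machine's command before each step.
    """
    gap = initial_gap
    for command in control_commands:
        if command == ABORT:
            return ("aborted", gap)          # forcible termination, in any state
        if command == ORDERLY_HALT and gap in (0, initial_gap):
            return ("orderly-halt", gap)     # halt only at the next orderly state
        if gap == 0:
            return ("programmed-halt", gap)  # doors closed: programmed halt
        gap -= 1                             # one step of closing
    return ("still-running", gap)
```

Note that an OrderlyHalt issued in mid-motion does not take effect immediately: the behaviour continues until the next orderly state is reached.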

In every case the termination concern demands careful attention to the governed world state that will hold when the behaviour enactment ends. Potential failures include continuing physical processes that the machine would have stopped had it not terminated: for example, spatial movement of a vehicle or heating of a solid or gas in a closed vessel. Failures also include leaving a sequentially compound action at an incomplete stage that leaves some resource permanently unavailable: for example, by acquiring a resource and never releasing it.

INTEGRITY: The enactment of a behaviour must have temporal integrity: the machine execution cannot be suspended for later resumption. Sometimes the machine may wait for an event or state change in the governed world, but waiting is itself an execution state. In the period between suspend and resume, by contrast, execution is absent. On resumption the machine would need to reset any software local variables representing governed world phenomena which may have changed—demanding, in effect, another initialisation.

Links to other posts:
 ↑ Avoiding Failure: Checklists of failures and how to avoid them
 ↑ Physical Bipartite System: The nature of a bipartite system
 ↑ Enactment: A behaviour is enacted by executing its machine program
 ↑ Triplets: Triplets (Machine+World=Behaviour) are system behaviour elements
 → Triplet Behaviour Concerns: Failures to avoid in triplet behaviour design

Thursday, 12 March 2020

The Right-Hand Side

The title alludes to a diagram in Brian Cantwell Smith's paper The Limits of Correctness [1]: the diagram shows a pair of relationships computer ↔ model ↔ world. The model, Smith writes, is the "glasses through which the computer sees the world." Model theory, he adds, studies the relationship on the left between computer and model, but the right-hand side relationship remains problematic: for the right-hand side "we have no theory." A historic example shows how much this matters. In 1960, in the Ballistic Missile Early Warning System (BMEWS), a defective model misinterpreted radar reflections from the moon as a launch of Soviet missiles against the USA: fortunately, the threatened outbreak of war was averted. Might a good theory have prevented this defect?

The model is the "glasses through which the computer sees the world." Yes: but the computer also sees the world directly, through the interface of sensors and actuators. These are physical things—domains—that effectuate causal links by which the world and the machine constrain each other. Further causal links are effectuated within and between domains contained in the world—and similarly in the machine (although we rarely think of the machine like this). The governed world model maps the domains of the world and interface, and their causal links. This map shows what effects the machine can evoke—directly and indirectly—at each actuator, and what information about the governed world can be inferred—directly and indirectly—at each sensor during a behaviour enactment.

We can liken the map of causal links in the world to a map of one-way roads over the domain infrastructure, in which road junctions correspond to state and event phenomena. To design a behaviour is to specify a set of possible complex journeys between and within the machine and the governed world. The designer relies on the map to show which routes, passing through which physical phenomena, are possible in the world. The model—that is, the map—can be perfectly formalised, allowing perfectly reliable inferences. Why, then, is the right-hand side relationship problematic?
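The metaphor can be made concrete: with the causal links represented as a directed graph, the designer can compute which phenomena each actuator can—directly and indirectly—affect, and dually for sensors. The domains and links below are invented for illustration.

```python
# Hypothetical causal links in a lift system: edges are one-way 'roads'.
causal_links = {
    "motor_actuator": ["hoist_motor"],
    "hoist_motor": ["car_position"],
    "car_position": ["floor_sensor"],
    "door_actuator": ["door_gear"],
    "door_gear": ["door_position"],
    "door_position": ["door_sensor"],
}

def reachable(start, links):
    """All phenomena transitively reachable along the one-way causal links."""
    seen, stack = set(), [start]
    while stack:
        node = stack.pop()
        for succ in links.get(node, []):
            if succ not in seen:
                seen.add(succ)
                stack.append(succ)
    return seen

# What effects can the machine evoke through the motor actuator?
print(sorted(reachable("motor_actuator", causal_links)))
# ['car_position', 'floor_sensor', 'hoist_motor']
```

The formal map answers reachability questions perfectly reliably—which is exactly why the remaining unreliability must lie elsewhere, in the map's relation to the world.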

Einstein [2] stated the problem succinctly: "... as far as the propositions of mathematics refer to reality, they are not certain; and as far as they are certain, they do not refer to reality." The map cannot be perfectly reliable: the map is fixed, but the roads change; minor roads are omitted; access may be subject to unstated conditions; a road may be blocked; a path may be cut off by an accident. Contingencies like these have their counterparts in physical failures that may vitiate the designed behaviour.

Should we, then, aim to solve this problem by a theory of the right-hand side relationship? No. To achieve its purpose, the theory itself must be formal; but relating a formal theory to the non-formal physical world would embody the original problem again—in an infinite regress. Instead we should seek a sound practical discipline of developing models that—although imperfect—are fit for purpose. This is not a disappointment. Fit for purpose is exactly what an engineering product must be.

[1] Brian Cantwell Smith; The Limits of Correctness; ACM SIGCAS Computers and Society, Volume 14-15, Issue 1-4, pages 18-26, January 1985.
[2] Albert Einstein; Geometry and Experience; Methuen, 1922.

Links to other posts:
 ← Models: Types and purposes of models of the physical world
 ← Physical Bipartite System: The nature of a bipartite system

Saturday, 29 February 2020

Radical and Normal Design

Vincenti [1] characterises radical design: “How the device should be arranged or even how it works is largely unknown. [There is] no presumption of success. The problem is to design something that will function well enough to warrant further development.” In normal design, by contrast, he writes: “The engineer knows at the outset how the device in question works, what are its customary features, and that, if properly designed along such lines, it has a good likelihood of accomplishing the desired task.”

For complex engineering products, normal design is the outcome of long evolution. In the evolutionary process many factors may be at play at every level. Society as a whole affects the demand and the rewards for particular engineering products, and constrains or supports development of relevant techniques and disciplines. Communities of specialists form, sharing knowledge in conferences, published media, and educational courses. Critics compare competing products, and engineers examine competitors' products seeking to improve their own. Supporting specialisms arise in components, in development methods and tools, and in scientific foundations.

The shape and structure of specialisation fuel the evolution of normal design: only the social existence of a specialist community enables the growth of the relevant knowledge and skills. The NATO Science Committee conference saw software engineering as a new specialism that should take its place alongside the traditional engineering branches. They were recommending—though not in these words—that software design and development should evolve from a radical to a normal engineering discipline.

Emergence of specialism and the evolution of normal design can happen when a class of product, its goals, its uses, and its substance are well-defined and bounded: examples include early Fortran compilers, smartphones, cars, TVs and airplanes. When the product is complex, specialism at the level of the complete product is indispensable: the properties of the complete product are more than a simple combination of the properties of its components. In cyber-physical systems, this imperative need for product specialism is hard to satisfy in the absence of powerful social factors.

[1] Walter G Vincenti; What Engineers Know and How They Know It: Analytical Studies from Aeronautical History; The Johns Hopkins University Press, Baltimore, paperback edition, 1993.

Links to other posts:
 ← NATO and Vincenti: The NATO conferences and a wonderful book
 ← Software Engineering: Engineering BY software and OF software

Ten More Aphorisms

Alan Perlis was the first winner of the Turing Award in 1966. In 1982 he published [1] a set of 130 epigrams on programming. His aim, he explained, was to capture—in metaphors—something of the relationship between classical human endeavours and software development work. "Epigrams," he wrote, "are interfaces across which appreciation and insight flow." This post offers a few aphorisms. My dictionary tells me that an epigram is 'a pointed or antithetical saying', while an aphorism is 'a short pithy maxim'. Whatever they may be called, I hope that these remarks will offer some appreciation and insight.

11. In a cyber-physical system, logic and physics can show the presence of errors, but not their absence.
12. To master complexity you need a clear idea of the simplicity you are aiming for.
13. No system can check its own assumptions: if it checks, they aren't assumptions.
14. In a cyber-physical system, the die is cast for success or failure in the pre-formal development work.
15. Software engineering for a cyber-physical system is programming the physical world.
16. Traceability should trace the graph of detailed development steps, not just their products.
17. A deficient development method cannot be redeemed by skilful execution.
18. A declarative specification is like the Sphinx's riddle: "Here are some properties of something—but of what?"
19. Cyber-physical systems can exhibit no referential transparency: everything depends on context, including—recursively—the context.
20. Natural science aims to be universal, but engineering is always specific to the current project.

[1] A J Perlis; Epigrams on Programming; ACM SIGPLAN Notices 17,9 September 1982.

Links to other posts:
 ← Ten Aphorisms: Ten short remarks

Wednesday, 12 February 2020

The Text Pointer

Dijkstra in his anathema [1] on the GO TO statement explained that, for human intelligibility, progress through a program text must map easily to progress through the executed process. GO TO statements frustrate this desire. Abolish the GO TO statement, and we can substitute the clarity of structured programs for the obscurity of chaotic flowcharts. The benefit is undeniable: structured programming defines a coordinate system, elaborated to handle procedure calls and loops, for the program and process alike. Values of the text pointer (my term, not Dijkstra's) are points in this coordinate system.

But wait. Surely the program variable values already define the state of the computation, and hence a sufficient coordinate system, don't they? No, they don't. Dijkstra gives a tiny example. The meaning of a variable n, used in a program that counts how many people have entered a room, depends on whether or not the program execution has yet updated n to reflect the most recent entry. A better coordinate system for program text and program execution is a necessity. One way of thinking about this stipulation is to recognise that the text pointer itself is an undeclared program variable, implicit in the text semantics.
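Dijkstra's example can be made concrete in a toy trace, illustrative only. The value of n alone is ambiguous; the pair of explicit text pointer and n is not.

```python
def trace_entries(entries):
    # A toy trace of Dijkstra's people-counting program. Each recorded
    # state pairs the explicit text pointer pc with the counter n.
    n = 0
    states = []
    for _ in range(entries):
        pc = "entry seen, n not yet updated"
        states.append((pc, n))
        n += 1
        pc = "n updated"
        states.append((pc, n))
    return states

states = trace_entries(2)
# The value n == 1 occurs twice in the trace, with different meanings;
# only the text pointer distinguishes them:
assert ("n updated", 1) in states
assert ("entry seen, n not yet updated", 1) in states
```

The same value of n means "one person counted" at one text point and "two people have entered, one not yet counted" at another.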

The famous GO TO letter combines two related—but quite distinct—themes. First, the human need to map a program's text easily to its execution: arbitrary GO TO statements make it impossible to satisfy this need. Second, the inadequacy of variable values for characterising execution progress. The letter might well have been entitled "GO TO Statement Considered Harmful and Variable Values Considered Inadequate."

Both of these themes are important in software engineering for cyber-physical systems. First, a triplet satisfying its simplicity criteria must execute a regular process—that is, its machine must be a structured program. Second, the governed behaviour in the physical world cannot be adequately described without reference to the program executed in the machine—specifically to its local variables, including—of course—the program's text pointer.

The second theme was compellingly evidenced in a project to specify an access control system in an event-based formalism. After much soul-searching, the specifiers decided that the formalism's lack of an explicit sequencing mechanism was intolerable; so they introduced an ad hoc feature for sequencing constraints on specified actions. Of course, the text pointer for a sequence was in truth the text pointer of the machine needed to govern the specified behaviour. A text pointer always denotes the progress state of a purposeful behaviour—either of the machine or of a purposeful participant in the governed world.

[1] E W Dijkstra; Go To Statement Considered Harmful; a letter to CACM, Volume 11, Number 3, March 1968.

Links to other posts:
 ↑ Triplets: Triplets (Machine+World=Behaviour) are system behaviour elements