A Sense of Rebellion

Come Closer and We Won’t Even Have to Talk

This essay recapitulates a theory that Johnson presented at a 1970 conference: the degree to which our communication (and the interfaces supporting it) ought to be formal and non-ambiguous depends on what is being communicated. Lovers can skip words and communicate through touch and glances; bureaucracies have no choice but to use language.

It’s always curious how and where ideas arise because later the instance of their coming may bear a relation to the context of their use. I’d like to tell the story of one such moment because it might make the idea that came out of it clearer.

A few years ago, I was visiting an old friend of mine in Munich; he was just on the verge of leaving the book publishing world to become a TV producer-director. That called for some sort of celebration and I was privileged to be included: an evening-long gourmet dinner at the “Walterspiele” restaurant, one of the finest in the world.

There were about seven of us altogether: three couples from the publishing business and me. The meal was excellent, eaten over hours of lively conversation. That’s where both my troubles and my idea began. You must understand that when people who publish books, especially very literary or artistic books, get together, they derive extreme delight from telling each other witty and elegant yarns, all in the most brilliant language possible. But poor me, my German is far from fluent and I haven’t much idea what anyone said that evening except when we were examining the menu and deciding what to have. Aha! How about that!? What I started to notice goes as follows.

So long as we were discussing something immediately at hand (on the table, on the menu, in the room), my German was more than adequate for participation. If our consideration shifted to something a little more distant in space or time (something that happened on the way to the restaurant or some item of the day’s news), I was able to comprehend fairly well but I couldn’t generate much of an intelligent comment.

But when the elegant tales were told and the objects and events of them were either fictitious or at least very far removed from our time and space right there in the restaurant, well, I just couldn’t follow any of it. To amuse myself I started listening to the individual words that were tossed back and forth, looking at them as if they were separate objects for study. I made a curious discovery: many of the words were those I could remember having had great trouble with on vocabulary lists in school.

For example, the verbs were those that had complicated prepositional prefixes which greatly (but precisely) modify the meanings of the root words. OK, that follows because it is the prepositions in our language, too, that point to the material, time, and place relationships. Try some: of (of wood), over (over there), at (at noon, at my place), and so on. These weren’t words that anyone had found it necessary to use while discussing the menu that night in the Munich restaurant, but that’s because the stuff of things and their where-and-when are obvious if they are present here, now. If you fabricate a story set in some other time and place, then you have to carry along with you all of the prepositional references so that no one will be confused about what you are referring to.

I returned to this country a day or two later, back to a crazy laboratory on Boston’s waterfront that a bunch of us had been working in for almost two years and where we had been experimenting with a variety of non-verbal communication methods. TELEGRASP (see July 1977 ROM) was one of the gadgets, but there were lots more. Our main interest was in communicating with people by manipulating the environment, or some part of it, that surrounded them: acoustical changes, “live” furniture, video displays, and so on, but always in such a way that the people were participants rather than passive audience.

Among other efforts, we were looking for a conceptual structure to lead us deeper into the how and why of successful environmental communication. How was it that we could sometimes get so much across with only a grunt or a nudge or a smile combined with a look of the eyes in a particular direction? How was it that at other times or with someone else, a long verbal explanation would be necessary?

I felt I had come back with a piece of the answer. It had to do with some kind of “distance” between the people (or other entities) engaged in a dialogue and the “referent” of that dialogue. The greater the distance, the greater the necessity that the dialogue carry with it the cues as to the where and when and what of the referent. If the dialogue is about the people who are actually having the dialogue and about their relationship to each other (as, for instance, when two people make love), then words aren’t necessary at all! As matters move off a bit we can still get along without words pretty well but the areas of ambiguity begin to widen and we must take more care to make our meaning clear.

What we were after in our lab was a way to formalize a theory that would allow us to make adequate designs for things like automobile seats that would help the driver, auditoriums that would help conferences, theatres that would let the audience enliven a performance, and even manufacturing tools that would let an unskilled operator (the customer?) fashion a shoe or a piece of clothing. In short, we wanted not only a handle on the person-to-person dialogue situation, but also a way to bring to technical fruition a means to make possible “courteous” dialogues between people and their machines.

The usefulness of our research has been greatest for the consideration of man-machine or man-man pairings and it is presented in these contexts. It must therefore be made clear at the outset that no claim is made for the application of these notions to deductive processes: computers programmed in advance to perform ritual acts. The applications that are intended are those in which an ongoing exploration of one system by another is entailed and where no final result nor homeostatic state is anticipated.

Dialogue is a communicative behavior which we have as yet been unable to describe sufficiently well in scientific terms to apply a measure to it. The root of our problem as observers is that if we attempt to identify or separate inputs and outputs to or from the participants, we effectively obliterate the message and render the meaningful exchange empty. We cannot observe the messages because the medium for each participant includes the responsiveness of the other in an infinite recursion.

The isolation and identification of transitory segments of a dialogue is possible, but their meanings when embedded in the broader contexts of the dialogue itself can never fully be known unless one is oneself a participant, whereupon the meanings necessarily take on a self-referent quality knowable only to that knower. The meaning of an object or event to an organism is embodied in his consequent response (real or latent) to it; such is his way of exploring the context of its occurrence. We cannot wholly share that required to sustain it.

Any communication is about something (let us call it the “referent”), even if that something is only the message itself. In general we may say that the referent exists within the information space that may be explored by the participants in the dialogue. Consider the idea that there may be a measure of “distance” in time and/or space between the referent and the moment and place of the dialogue in process. Let us examine the consequences of moving along a scale of this “distance” from zero outward.

Figure 1 provides a graphic and verbal representation of five positions distributed along the scale. The illustrations are more or less familiar instances of interpersonal or human-machine interfacing situations. The positions are numbered 1 through 5 for convenience of discussion, but no implication is intended either that the possible extremes are depicted or that equal spacings of “distance” have been achieved.

Position 1: The referent is the participants themselves and their relation to each other. The communication may be characterized as immediate because in fact no media are employed apart from the responsive environment that each of the participants is for the other.

Position 2: We move out to where the referent is external to either participant but where their interactions with it and with each other are indistinguishable. The communication is communal and its medium is the referent itself. Examples within the realm of interpersonal communication are common but generally pass unrecognized as dialogue. Blind children, for instance, will not learn to run or jump until they have shared the experience in direct physical contact with another person. Learning to manipulate substances, common shapes, and culturally-established relationships of whole and part is acquired largely through the sharing of toys and other media with adults. More familiar is the way in which heavy objects teach us about gravity, or the way a well-tuned sports car can give the driver a “feel” of the road which no passenger can ever share.

Position 3: Here the participants begin to lose direct contact with each other’s reactions and they communicate more by effects which both produce upon a common environment, although the effects themselves may be somewhat remote in space or delayed in their responsiveness. The referent is distinctly separate from the participants and their sharing of it might be termed adjacent.

Note, as we move from one position to the next, a number of other changes that are taking place. Direct reaction back from the medium becomes less important: compare the squeeze back from a hand that has been grasped with a light switch, whose resistance to being moved is not relevant to whether the lights are on or off. Instead, the feedback loops upon which a communicant depends in order to monitor the effects produced become increasingly reliant upon that part of the sensorium which we learn to use passively or at a distance: sight and hearing.

For the human participant there is a decreasing involvement of large-muscle movement: his efferent activity moves toward the fingertips and thence to lips and sign language where direct contact is unnecessary.

As for the communicant’s problem of exploring the context of a respondent’s messages, the need for more time of exploration and more storage of past experience for reference increases. Ambiguities become more difficult to resolve as they find expression simultaneously in more sensorimotor modalities. Stated another way: expressions of intention are more equivocal because the recipient is less and less a part of the intender’s self-referent loops in his environment.

Position 4: At this distance the communication about the referent is metaphorical in that its representation is only mapped onto the environment of each participant by implication. The model of the referent is not itself of real substance, as it was in the preceding positions, in the sense that it could be manipulated directly. Rather, the interaction must be brought about by an intervening agent. Chalk on a blackboard, a knob, a pushbutton, or a light-pen on a computer do not react back upon the user except to his eyes or ears. His participation, his involvement in real events, has been reduced to a level where he can no longer have confidence in the reality of the source. An example would be the mayor who no longer really knows how the statistics, gathered by others as a base for his decisions, were obtained; he does not participate in the intentions of the fact-finders.

Position 5: At the furthest distance shown we encounter the familiar world of symbolic language and its referents which need not even exist in fact. This is the realm of the telegraph, high-speed printer, speechmaking, books, contract-writing, etc., where the burden of the entailment of the intended meaning of a message rests almost entirely upon the sender’s mastery of the medium. I say “almost” because in many cases the transaction between participants can allow for some clarifying questions and answers.

Yet further, a sixth position on the scale might show a referent at the level of metalanguage, of which this article itself may be an example. But Figure 1 does not continue the scale that far.

This proposed scale is intended to allow us to place in rank order the varieties of dialogue which we shall consider. The scale parameter has been referred to simply as distance in time and/or space, and although it may in fact be precisely quantifiable in many cases, for our purposes here it is quite sufficient to be able to specify a comparative ordering of a few examples. We will focus here on broader speculation about what other inferences may be made about a system and the dialogue in which it is engaged, once its predominant position on the scale is known. We have been able to extract a number of valuable design criteria from them.
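The comparative ordering described here can be sketched in a few lines of Python. This is only an illustration: the names `Position`, `examples`, and `ranked` are hypothetical, the one-word labels are borrowed from the descriptions of the five positions, and the four examples are ones already placed on the scale in the text. Only the ordering is meant to carry meaning; as noted for Figure 1, equal spacing between positions is not implied, so differences between the numbers should not be read as distances.

```python
from enum import IntEnum


class Position(IntEnum):
    """Ordinal positions on the proposed scale (order matters, spacing does not)."""
    IMMEDIATE = 1      # referent is the participants themselves
    COMMUNAL = 2       # referent is the shared medium
    ADJACENT = 3       # referent is separate, shared through a common environment
    METAPHORICAL = 4   # referent is mapped by implication, via an intervening agent
    SYMBOLIC = 5       # referent need not exist in fact


# Examples taken from the text, each tagged with its predominant position.
examples = {
    "contract-writing": Position.SYMBOLIC,
    "sports car giving the driver a feel of the road": Position.COMMUNAL,
    "chalk on a blackboard": Position.METAPHORICAL,
    "two people making love": Position.IMMEDIATE,
}

# Rank the examples from the nearest referent to the most remote.
ranked = sorted(examples, key=examples.get)

for name in ranked:
    print(int(examples[name]), name)
```

Sorting by position is all the scale is asked to support at this stage; nothing in the sketch depends on the distance being quantified.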

It should be emphasized again that the communications of interest here are those characteristic of dialogue and not those in which information flows in one direction only. If a system is intended to be employed simply as a filter and does not allow the recipient an opportunity to be environment to the source, then a dialogue is not in process. The processes of dialogue are really fundamentally different from the more familiar “scientific” methods of telling or being told.

Consider presenting to an engineer, trained in today’s state-of-the-art, the problem either of a communication device to employ a participant’s tactile sense, or of a prosthetic hand for an amputee. Our best guess is that he will proceed immediately to think in terms of tailoring an input or an output characteristic as if either could valuably exist in isolation. He will present tactile stimuli seriatim as patterns to a presumed passive recipient, making quite inappropriate use of a touching-grasping system which depends heavily upon exploration to achieve its highest-resolution “sensing.” Dialogue design is needed. As for prosthetic hands, they have not yet been granted self-referent behaviors of their own and are directed as simple inert tools, presenting feedback only to the user’s eyes. Neither engineered solution will be suited to the proposed scale because the referents are not describable in terms of the behavior of the systems selected to mediate them: these behaviors have in fact been made irrelevant.

Suppose that one wished to create a purely mechanical device to serve as one of the participants in a dialogue. Until now we have concerned ourselves primarily with relatively immediate or mediated dialogue between persons as the ultimate originators of the information flows. Now we ask whether it might be possible, say, to provide a computer or some other complex system with an interface adapted to the kinds of informal physiological transactions at which human beings are so skilled. Would such a device require a tremendous amount of centralized computation in order to orchestrate its behavior, or relatively little if one organizes it properly with decentralized foci of simple, reflexive units?

Figure 2 is an attempt at a simple schematic view of the increasing levels of system complexity that are required as one moves up the scale: that is, as the perceiver’s task of modeling the referent becomes correspondingly more complex. Each diagram (numbered to correspond to the same respective scale positions as in Figure 1) is intended to depict only the mechanical participant in the dialogue and to summarize graphically the relationships of the parts of the interfacing mechanism to each other and to the referent.

In each diagram the referent of the dialogue is depicted as a dashed circle and is labeled R. It will be seen in Diagram 1 to surround the rest of the figure, indicating that the referent is the dialogue itself and the participants in it; in Diagram 5 the referent is suitably separate and remote. The small ellipses represent separate active components of the physical interface. These might be air bags in a piece of furniture or water jets in a hydrotherapy bath. In each diagram only three are shown, but in a real device there would more likely be dozens or hundreds of them. Each has a means of “local computation” enabling it to respond energetically to changes in its immediate environment.

The diagrams imply that as one moves up the scale the significance of the local computations diminishes and a means of central organization acquires control. By the time we reach Diagram 5 the incoming information is being brought directly to the central controller: input and output no longer share a common perceptual space.

Let us return to a more philosophical level of argument inasmuch as the diagrams are intended only to suggest a general aspect of connectedness of parts. Each participant in the dialogue is attempting to explore a referent to determine the intention conveyed upon it by the other. Let me rephrase the foregoing sentence so that its meaning will not be obscure, for again the exceptional properties of dialogue are such that its triadic nature is not revealed by our more familiar modes of description. If the referent were itself inanimate and were to be explored by a single observer, then there would arise no difficulty on the part of a second, passive observer (wire tapper, eavesdropper) in bearing witness to the messages. However, if a dialogue is in process, then the respondent is imposing frames of reference upon the referent which it did not possess before and which, for their description, partake of the participation itself. One of the problems upon which this article is intended to shed light is that of placing some kind of upper bound on the complexity actually required for a realizable system engaged in a dialogue at various levels of the proposed scale.

Most of the work in which the author and his colleagues have been engaged is focused on systems to be found at the lower end of the scale: Position 1. Suppose, for example, one were to fashion a hospital bed which would participate with the patient in manipulating him for orthopedic therapy; or suppose that an automobile seat were to be made that would concern itself with the comfort and alertness of the driver (i.e., the referent is the relationship of, between, and within the participants). It will be found that one needs no more complexity of computation within the mechanism than a simple reflexive responsiveness of each component to its own local environment, and that this may in fact be provided within the structure of the components themselves. That is, as the physical relationships of the aggregate change, they thereby alter their own properties of responsiveness. Interactions between elements of the interface will be effective and need not be explicitly controlled as separate computations by a “remote” computer.
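A toy simulation may make the claim about local computation more concrete. What follows is a minimal sketch in Python and not a description of the laboratory's actual mechanisms: the class `ReflexUnit`, the update rule, and the `gain` and `coupling` values are all assumptions chosen for illustration. Each unit (think of one air bag) sees only the load upon itself and the state of its two immediate neighbors, and no component holds a model of the whole.

```python
from dataclasses import dataclass


@dataclass
class ReflexUnit:
    """One active component with purely local computation (hypothetical model)."""
    pressure: float = 1.0   # internal state, arbitrary units
    gain: float = 0.5       # assumed: how strongly the unit yields to load
    coupling: float = 0.25  # assumed: how strongly it follows its neighbors

    def step(self, load: float, left: float, right: float) -> float:
        """Compute the next pressure from local information only."""
        neighbor_mean = (left + right) / 2.0
        drift = self.coupling * (neighbor_mean - self.pressure)
        return self.pressure + drift - self.gain * load


def step_all(units, loads):
    """Advance every unit one tick.

    The loop only hands each unit its own load and its neighbors' pressures
    (the units are arranged in a ring); it makes no decisions of its own.
    """
    n = len(units)
    next_pressures = [
        unit.step(loads[i], units[(i - 1) % n].pressure, units[(i + 1) % n].pressure)
        for i, unit in enumerate(units)
    ]
    for unit, pressure in zip(units, next_pressures):
        unit.pressure = pressure


units = [ReflexUnit() for _ in range(6)]

# Tick 1: the respondent leans on unit 2 only.
step_all(units, [0.0, 0.0, 0.4, 0.0, 0.0, 0.0])
after_one = [round(u.pressure, 3) for u in units]

# Tick 2: the load is removed; neighbors respond through local coupling alone.
step_all(units, [0.0] * 6)
after_two = [round(u.pressure, 3) for u in units]

print(after_one)
print(after_two)
```

After the first tick only the loaded unit has changed; after the second, its two neighbors have shifted as well, although nothing outside the units computed that interaction. The sketch leaves out the further point made above, that a change in the physical arrangement alters the responsiveness of the components themselves.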

We have found that the liveliness of interaction of such a system with a human respondent is greatly enhanced if the interfacing mechanism is sufficiently sensitive to its own changes that it will sustain a slow, quasi-periodic movement of its own: not so slow as to be undiscernible (below respiration rates) but not so fast as to be alarming (above cardiac rhythms). While the device should not be preprogrammed to move, it should be mildly active even when the respondent is not. Of course, variations upon this self-generated activity produce a system which is even more useful. It is just such a self-adjusting variability that the next position offers.

Moving along to Position 2 of the scale, one might want to provide a means of “holding hands at a distance” (we have proposed a device called “TELEGRASP”) or of communicating to a driver some of the conditions of his near surroundings (see Table). The necessity arises for computations among the components of the interfacing system which give them a behavior that is purposive in the aggregate. That is, the longer time-grain and space-grain of its communications require that the organizing elements of the system be sensitive to the correlation of responses that are further separated in space or time than were those of Position 1. In our own research work it appears that multi-parameter, self-organizing controllers will suffice, and that their computations of performance assessment may be carried out in terms of the behavior of the component modules themselves; no additional sensorium is needed.

Further progression up the scale enriches the complexity of system realization. The immediate interactions between the components of the interface and their environment become less and less useful, but this is not to say that those interactions may be obviated for system function. Rather, the description of a receding referent becomes less relevant to the detailed interactions at the interface. The performance-assessment criteria of the system as a whole must be established by way of a more external, indirect means of observation. Figure 1 and Figure 2 show a similar progression. Each higher position should be viewed as an elaboration upon the one below it.

Positions 4 and 5 correspond to ways in which computers are familiarly employed today. Computer graphics in architecture is a recent example of human-machine “metaphorical” dialogue. Computer programming via compiler illustrates a symbolic exchange at a teletype.

Let me state unequivocally that I do not know how to realize systems of hardware representative of Positions 4 and 5 if I go at it by my proposed approach: up the scale. Neither would I presume to name the evolutionary process so embodied “phylogenetic.” I do predict, however, that the current interest in the development of small, locally-dedicated computers can benefit from these considerations. They will achieve their individual performance levels through a process of “ontogenetic” learning rather than by preprogramming.

There is without doubt much value in the current state-of-the-art of computers; tremendous amounts of thought and capital have gone into their elaboration and refinement. They fall far short, however, of fulfilling a need for complex systems which can provide their skills to users who themselves have only informal means of communication: children in particular, but in addition a vast population of potential users who have tasks they want performed for which no algorithms exist simply because in any instance the user has not yet “decided” what it is he wants. At present, the kinds of skills that human beings develop easily are being made irrelevant by machines which cannot recognize our messages nor elicit our intentions.

The major population for which we would like to see such efforts directed is that of the world’s children from age zero (literally) onward. At no other time in history has it been so difficult for a child to raise the information level of his own environment to a point where it responds to him in his style. The computer is looked upon askance by most young students as just one more addition to an educational process designed to make them look stupid. My own contention is that this opinion can be reversed but that it will take too long to achieve systems of necessary dialogue design if one works down the scale from the top. There is the further, real danger that the embodiments arrived at would be far more complicated than necessary.

All along the scale I have suggested there are simplicities to be exploited; it is my hope that the scale itself will help to identify them.