Concrete computing machines, either sequential or concurrent, rely on an intimate relation between computation and time. We recall the general characteristic properties of physical time and of present realizations of computing systems. We emphasize the role of computing interferences, i.e. the necessity to avoid them in order to give a causal implementation to logical operations. We compare synchronous and asynchronous systems, and make a brief survey of some methods used to deal with computing interferences. Using a graphic representation, we show that synchronous and asynchronous circuits reflect the same opposition as the Newtonian and relativistic causal structures for physical space-time.
I. INTRODUCTION
Are concurrent computing machines equivalent to Turing machines? This question, which amounts to confronting two fundamental notions, time and computation, may be treated in a purely mathematical framework. Practical consequences, however, cannot be independent of concrete realizations, that is, of concrete machines performing actual computations in physical time.
This remark may seem curious if one aims at proving theorems, which cannot depend on the physical properties of time or machines. But even in a mathematical treatment of concurrent computation, one needs a representation of time. Usually, time is modelled as a real parameter, shared by all parts of the computation. Unfortunately, such a representation corresponds neither to the observable time that can be obtained from physical systems like clocks, nor to the reference time that is defined by metrology, nor to the operational time that occurs in practical realizations of logical circuits. Without questioning the validity of demonstrated theorems, difficulties may emerge when trying to find practical applications.
The distinction just made between abstract and concrete machines raises related questions. When a machine M can be simulated by a program P running on another machine, how can one identify the concrete machine M with the program P? And when a machine cannot be simulated on another one, which indicates some greater expressive power, is that power due to computation or to some fundamental physical property? Before addressing such questions, one must first consider the sequential machines that are presently realized. These machines, which we obtain from a manufacturer, which are made of matter, which transform electric energy into heat, and with which we communicate through a keyboard and a screen: why do we need them? Let us note that all the features that make them concrete are inconveniences: we would prefer them lighter, smaller, less power-consuming and less dissipative. Ideally, all these parameters would be equal to zero. In fact, we need these machines for their logical function, their ability to compute. But then, since this function is mathematically known, modelled and even simulated, where is the need for a concrete machine whose features are mainly inconveniences? The natural answer is that these machines compute faster than humans can with just pencil and paper. And yet, pencil and paper are already rudimentary elements of a concrete machine, using physical objects to memorize the different steps of a computation. The interest in concrete machines comes from their intimate relation with physical time.
If computing machines go faster than humans, then one must be confident in their action, as in most cases one is unable to check their output. Admittedly, in very specific cases, one can verify the correctness of a result in much less time than is needed to obtain it. This is the case, for instance, for the prime factorization of integers. But few concrete applications have this property. In most cases, one cannot check the result in much less time than the computation itself takes. If the result is important, and no other way is available to obtain it, then one must be confident in the machine.
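The asymmetry between obtaining and checking a result can be illustrated with a minimal sketch. The function names `factorize` and `check_product` are hypothetical, and trial division is only one assumed way of searching for factors. The check below verifies only that the claimed factors multiply back to the integer; verifying that each factor is prime would be a separate step.

```python
from math import prod


def factorize(n: int) -> list[int]:
    """Prime factors of n by trial division (up to about sqrt(n) divisions)."""
    factors = []
    d = 2
    while d * d <= n:
        while n % d == 0:
            factors.append(d)
            n //= d
        d += 1
    if n > 1:
        factors.append(n)
    return factors


def check_product(n: int, factors: list[int]) -> bool:
    """Check a claimed factorization with a handful of multiplications."""
    return prod(factors) == n


n = 101 * 103
factors = factorize(n)           # the search: many trial divisions
ok = check_product(n, factors)   # the check: one multiplication per factor
```

The search loop grows with the size of the smallest factors, whereas the check needs only as many multiplications as there are factors.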
What can support such a confidence? Necessarily reasoning, founded on correct functioning of the machine at a given time on some particular computations, generalized to other times and other computations. A computing machine cannot be tested for all computations it can do, at any time. Even for a finite machine, the number of possible computations increases exponentially with the memory size, and a memory of one hundred bits already allows a number of configurations that cannot be tested in less time than the age of the universe. To establish a reasoning leading to confidence, one must:
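The order of magnitude quoted for a memory of one hundred bits can be checked with a short computation. The test rate of one configuration per nanosecond and the value used for the age of the universe (about 13.8 billion years) are assumptions chosen for illustration, not figures taken from the text.

```python
AGE_OF_UNIVERSE_S = 4.35e17   # assumed: about 13.8 billion years, in seconds
TESTS_PER_SECOND = 1e9        # hypothetical rate: one configuration per nanosecond


def exhaustive_test_time(bits: int, rate: float = TESTS_PER_SECOND) -> float:
    """Seconds needed to visit every configuration of a memory of `bits` bits."""
    return 2 ** bits / rate


configurations = 2 ** 100
ratio = exhaustive_test_time(100) / AGE_OF_UNIVERSE_S
print(f"{configurations:.3e} configurations, {ratio:.0f} ages of the universe")
```

Under these assumptions, exhaustive testing would take a few thousand times the age of the universe, which is why confidence must rest on reasoning rather than on enumeration.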
- check that each elementary component effectively realizes the function it has been designed for (physical validation).
- prove in a deductive way that the particular composition of these elementary components building the machine effectively leads to the global function used (logical validation).
The first condition is ensured by choices in implementation design and by tests made by the manufacturer. The second condition is obtained from a mathematical representation of the machine and from the logic of computation. These two steps of validation require good representations of all components at the physical level, and of the global machine at the logical level. If the confidence one can put in a machine relies on good models of both its physical and logical functioning, how could such a machine perform more than it has been designed for, more than our present theories can model? Even if the existence of a new type of calculus, still unknown today, can be envisaged, with machines performing this new type of calculus, how could one build such machines without having good models for them? In such a case, one could not ensure the two validation steps, and one could neither say that these machines operate correctly nor be confident in their output.
One consequence is that realistic models are necessary, both of the computational structure and of the physical implementation of logical operators. Modelling is made easier when logical and physical constraints can be separated. This is the reason for developing sequential machines or synchronous concurrent machines. In that case, the logical validation of the machine can be made while ignoring the implementation characteristics of its components. The latter will finally and mainly limit the performance of the machine through the value of the clock frequency. The machine can equivalently be simulated on another concrete machine with identical clock frequency, at the expense of lower performance. However, in the case of asynchronous concurrent machines, logical and physical constraints are more closely intertwined. Although machines built with asynchronous circuits are less widely used, much effort has been devoted to understanding and modelling them [28, 31, 16, 4, 2, 14]. In fact, they may even appear as an unavoidable evolution of computing machines. On the one hand, clocked circuits are reaching limits where clock signal distribution consumes too many resources and progress in performance approaches saturation. On the other hand, asynchronous circuits constitute the most general class of circuits, and thus allow one to express in the most general way the questions raised by the implementation of computation on physical systems, and the solutions that may be brought.
In this article, we shall be concerned with the relation of concrete computing machines with physical time. After recalling the general characteristic properties of physical time and of computing machines which are presently realized, we shall compare the solutions provided by synchronous and asynchronous systems to the implementation of logical operations. We shall show that they give different implementations of causal relations, reflecting in that way different causal structures for space-time.
II. PHYSICAL TIME
The notion of time may be seen to follow from two necessities. From a logical point of view, time can be considered as the concept which allows one to make a distinction between two different types of propositions: general and universal propositions (like mathematical ones) which are eternal, and particular propositions which are related to changing reality (like those describing physical systems). Moreover, time is also rendered necessary by the formulation of physics: time is the concept which allows one to give a formal expression to movement, and hence to the laws of physics.
Properties of time are in fact imposed by the functions that this notion must fulfill. From the logical side, time allows one to conceive of one and the same object by characterizing it by its different states, these states being associated with the object at different times. A time parameter can then be used not only to index the different states characterizing the same object, but also to organize the states of different objects into classes of simultaneity.
The relation of order that can be introduced on the time parameter allows one to define a relation of logical causality between the state transitions affecting different objects. However, in order to be realized physically, for instance on real machines, the causal relation between states cannot be independent of the real motions affecting physical systems. In particular, the simultaneity classes defined with the help of the time parameter must coincide with those that are associated with real events occurring in physical space, hence with physical time.
The notion of physical time is intimately related to the laws of physics. After having remarked that pendulum oscillations are isochronous, Galileo Galilei could give a mathematical representation of motion induced by free fall, by relating the undergone distance to the elapsed time, the latter being understood as a universal reference for all motions. The existence of such a reference is made possible by the existence of physical laws governing all movements, and in particular by the existence of regular movements like inertial motions.
This introduction of time leaves an important conventional part in the definition of a time reference, even if a natural choice is provided by the motions which appear most regular, like the motion of the Earth around the Sun. This leads in fact to distinguishing two types of time. Thus, Leibniz [25], relying on logical arguments, could consider that space and time are mere relations between objects or events, which are fixed by an observer in a conventional way, thus building subjective space and time. Still, one is also bound to admit the existence of objective space and time, as the only way to understand how physical laws governing displacements of objects and time ordering of events can be formulated in a universal way, independently of the observer.
The formulation of the universal law of gravitation led Newton [22] to fix the role played by time in physical laws, and to endow it with the mathematical representation that we still use nowadays: that of a real parameter which all physical quantities depend on. In fact, Newton introduced two different notions of time, which he distinguished both in their conception and in their usage. The first one, which he called "absolute and mathematical", allowed him to write mathematical equations for the laws of mechanics and gravitation. The second notion, which he called "common and sensible", allowed him to relate the motions of different physical systems, including clocks. Even if Newton privileged the first notion, which he considered as representing absolute space and time, seeing clocks as systems to be improved in order to make them as close as possible to ideal space and time, he nevertheless made two distinct uses of these notions. The first one, which identifies with the curvilinear coordinate on the planet's trajectory, he used as a mathematical tool to deal with infinitesimals of different orders. The second one, which is the physical time as can be defined by Kepler's area law, he used as a measure of inertial motions, which he compared planetary motions with.
The theory of relativity [5] has led to questioning the a priori and absolute character of physical space and time. According to relativity, the notion of time relies on clocks, the date of an event being defined by the coincidence of this event with a tick delivered by a clock located at the same place. But in order to be defined in the whole of space, the notion of time also relies on the exchange of light signals, which are necessary to compare and synchronize the indications of remote clocks. The universal and finite velocity of light propagation then leads to a definition of time simultaneity which depends on the observer's motion. In other words, time simultaneity is not given a priori but results from a construction, namely clock synchronization. By exchanging light signals, on which time references provided by clocks are encoded, one can compare these references and synchronize clocks. Then, time allows one to construct space. By comparing the light signals received from several remote clocks, one can, by quadrangulation, determine positions both in time and space. This relativistic definition of time and space becomes necessary as soon as high precision must be attained. This is the case, for instance, when corrections linked to the finite velocity of light, or relativistic effects, must be taken into account [11, 36]. Hence, this relativistic definition is the one used in physics for high precision space-time measurements [33], and in metrology to define time and space standards [24] and to construct the space-time reference systems required by physics [23, 35]. It is also the one used in modern practical positioning systems at the surface of the Earth, like GPS [13, 12]. Finally, as will appear in the following, it is also the notion of time which is implicitly used by asynchronous communicating and computing systems [10].
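The construction of simultaneity by exchange of light signals can be made concrete with a minimal sketch of the standard Einstein synchronization convention: a signal leaves clock A at local time t_send, is reflected at the remote clock B when B reads t_remote, and returns to A at t_back; B is synchronized with A when t_remote equals the midpoint of t_send and t_back. The function names and the numerical scenario are hypothetical.

```python
C = 299_792_458.0  # speed of light in m/s (exact by definition of the metre)


def einstein_offset(t_send: float, t_remote: float, t_back: float) -> float:
    """Offset of the remote clock with respect to the Einstein convention.

    Zero means the remote clock read exactly the midpoint of the round trip.
    """
    return t_remote - (t_send + t_back) / 2.0


def distance(t_send: float, t_back: float) -> float:
    """Distance to the remote clock deduced from the round-trip time."""
    return C * (t_back - t_send) / 2.0


# Hypothetical scenario: remote clock 300 m away, running 5 microseconds ahead.
one_way = 300.0 / C
offset = einstein_offset(0.0, one_way + 5e-6, 2 * one_way)
separation = distance(0.0, 2 * one_way)
```

Both the time offset and the spatial separation are obtained from readings of local clocks only, which is the sense in which time allows one to construct space.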
The consequences of the theory of relativity on our conception of space and time have been remarkably discussed at the logical level by Russell [26, 27]. Our representation in terms of permanent material structures located in space and evolving according to a unique external time must be replaced by one in terms of events which are located both in space and time. This conception of space-time not only affects the formulation of modern theories in a fundamental way [9], but also underlies present applications in physics and metrology [7].
When referring to physical time, simultaneity classes can no longer be defined a priori, and rely on a physical implementation by means of propagating light signals. This constructive character of time has important consequences for the functioning of devices which rely on the physical exchange of information. Causal relations between events cannot be derived by simple comparison with an external, a priori given, parameter. For systems which are not localized in space, like communicating processors, this means that the time order relation of occurring events, even if it can be defined unambiguously at the local level of each processor, nevertheless requires a more complete representation to be defined over the whole system in a consistent way [28, 31]. In the following, we shall analyse how the functioning of actual devices depends on the causal structure of physical space-time.
III. LOGICAL DEVICES
To discuss the intimate relation between time and computation, one must first recall some general principles which underlie the physical implementation of computing systems, and which are applied in concrete machines realized with present technologies.
Implementation of logical operations
In CMOS (Complementary MOS) technology, logical gates are implemented using two electrical networks N_u and N_d, as represented in Figure 1. The x_i denote input channels, and z the output channel. N_u and N_d are built with electrical switches which are combined in series/parallel networks, thus allowing one to implement the logic of propositions [29].
2- N_u and N_d can be simultaneously false, i.e. (¬N_u ∧ ¬N_d) can be true. In such configurations, the output z is not connected to any voltage source. Then, because of electrical capacitances, z memorizes its previous value. This allows one to realize memories, like the latch represented in Figure 3.
In practice, in order to have memorization last long enough, and quite generally for all memories, one must ensure memory stability by using some feed-back, by means of a looped amplifier. This feed-back can be permanent (static) or recurrent (dynamic logic). One possibility of electrical feed-back is shown in Figure 4, where two looped amplifiers have been added on output z, one of them being weak, in the sense that it cannot create any serious short circuit when it conflicts with either of the two networks N_u and N_d.
Another way to realize a stable memory is to implement a static feed-back on a logical gate corresponding to the first case. Then, the quoted latch can be realized with a looped multiplexer (mux), as shown in Figure 5.
This exhibits a very general difficulty which characterizes looped systems: variable z appears on both sides of its defining equation (3). This equation does not mean that an equality must be realized, for instance with electric voltages, but that the following assignment must be realized:
In other words, two values of the variable z must be distinguished, which correspond to successive times: z_a (after) and z_b (before) are linked by equation (5). It is required that the two values z_a and z_b do not interfere, and that variable z change from z_b to z_a. The assignment represented by equation (4) expresses a causality requirement that must be implemented in order to realize computations.
In the particular case of the latch just described, operation can only cause problems when e is falling. Indeed, in the other cases:
- when e is low, z is memorized
- when e is high, z copies x
- when e rises, z begins to copy x
so that the circuit operates correctly in these three cases. However, if e falls while x changes, z will hesitate between two values of x. The whole circuit may enter a metastable state which is invalid (the electric voltage stays in metastable balance at an intermediate level) and which may last for an unbounded time. It may leave this state for either of the two possible values of z, and this in a nondeterministic way, which may ultimately be unacceptable for the type of computation envisaged. Let us note that the circuit of Figure 4 shows the same defects, for it involves a feed-back, although this may be less apparent when treated at the electrical level.
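The distinction between the value of z before and after the assignment can be sketched as follows. The latch function used here (copy x when e is high, hold otherwise) is an assumption inferred from the three cases listed above, and the names `latch_step` and `run_latch` are hypothetical. A discrete simulation cannot reproduce metastability; the sketch only flags the steps where e falls while x changes, which is the situation identified above as hazardous.

```python
def latch_step(e: bool, x: bool, z_before: bool) -> bool:
    """One assignment z_after := mux(e, x, z_before)."""
    return x if e else z_before


def run_latch(samples, z0=False):
    """Apply the latch to a sequence of (e, x) samples.

    Returns the list of outputs and the indices of the samples where e falls
    while x changes (flagged only; the real outcome would be undetermined).
    """
    z = z0
    outputs, hazards = [], []
    prev_e, prev_x = None, None
    for i, (e, x) in enumerate(samples):
        if prev_e is not None and prev_e and not e and x != prev_x:
            hazards.append(i)
        z = latch_step(e, x, z)   # z_before is read, then z_after is written
        outputs.append(z)
        prev_e, prev_x = e, x
    return outputs, hazards


samples = [(True, True), (False, False), (False, True), (True, False), (False, False)]
outputs, hazards = run_latch(samples)
```

In this run the second sample is flagged, since e falls at the same step as x changes, while the last sample is safe because x is stable when e falls.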
Although chosen here as an example, the latch shows properties which are encountered quite generally in looped devices. This brief discussion shows that the correct operation of a circuit cannot be analysed without taking its environment into account, in particular the time ordering relations of input and output signals. This is entailed by the existence of loops and must be dealt with quite generally, for computing machines are naturally looped systems. In all cases, time constraints must be implemented in order to ensure the causality relations which are necessary for computation. These constraints will take very different forms, according to the type of implementation chosen, whether by means of synchronous or asynchronous systems. Before discussing separately these two classes of systems, we shall first recall one important property they share, as it is also imposed by implementation of complex computations, the property of modularity.
Modularity
There exist many ways to organize electronic components into logical circuits in order to realize machines performing computations. Usually and quite generally, one defines complex circuits as hierarchies built with elementary circuits called primitives. This method, imposed by practical considerations, indeed hints at a logical necessity: one must be able to design and realize circuits of increasing complexity with the same rigour. More precisely, one must ensure that circuits implementing logical functions of a high complexity level behave as they should, and one must obtain this confidence in a rather short time. Because of the exponential increase of the number of configurations to check, a direct physical test of the circuit's behaviour soon becomes impossible when the complexity of the logical function increases. This aim can then only be attained with the help of modular implementations, by taking advantage both of their composite logical structure and of the logical simplicity of the chosen primitives [19]. Proofs relying on known properties of the composition of primitives may be developed, which allow one to deduce the correct functioning of a whole modular complex from that of its constituent primitives. Then, a test of the whole complex reduces to that of some of its constituents, which are logically simple. Although efficient, such a strategy may not prove so straightforward. According to the type of physical implementation chosen for the primitives, problems may appear which prevent the systematic development of complex circuits operating correctly, and which do not occur when the choice of primitives is modified, or when particular constraints are put on their composition. It follows that the logical and physical aspects of modular implementations must be analysed concurrently.
IV. SYNCHRONOUS AND ASYNCHRONOUS CIRCUITS
Time appears in computing systems very early, already in the definition of the electronic circuits which implement logical functions. Most circuits which are known and used are synchronous. Synchronous circuits may be defined as automata whose transitions between successive states are triggered by pulses delivered by a global clock. An alternative class of circuits is provided by asynchronous circuits. In this section, we introduce the strategies followed by these two main classes of circuits for making the implementation of computation effective, and, in particular, for dealing with problems of computation interferences.
Synchronous circuits
VLSI circuits which are produced nowadays are highly concurrent devices (a microprocessor can contain up to 10^7 transistors, i.e. 10^6 logical gates), and yet most of them can be modelled as non-concurrent devices and can be considered as single finite automata. This property comes from their synchronous character, which means that all operations on internal memories are simultaneously activated by a single pulse of a global clock shared by the whole circuit.
Synchronous implementations use a global clock to avoid the stability problem which has been discussed in the previous section. More precisely, with the same notations, all latches are systematically operated in such a way as to ensure that logical variables x are stable when variables e are falling. There exist many different types of memories, but all present the same problem, reflecting the time character of logical assignment. For the sake of simplicity, we shall only discuss the case of the previous latch, and consider it as a generic example. A synchronous device using two instances of this latch for each register bit (master-slave flip-flop) is sketched in Figure 6. The enabling signals e_1 and e_2 are mutually exclusive in time, and are derived systematically from a common clock. The output is fed back in the form of input variables Q_i into a combinatorial operator. If the clock period is larger than the feedback time, then variables Q'_i are always stable when e_1 is falling and the latches act as required, i.e., they make the iteration of the combinatorial function effective.
The global state Q is encoded by the state of all memory bits, and can only change at the arrival of a clock pulse. Regarding specification and design, synchronous circuits may be considered as modular composites, where primitives, and other modules as well, are finite automata of the type described by Figure 6 . The register encodes the state of the circuit, while the combinatorial operator represents the implementation of its transition function. All registers are activated by a single clock pulse. By connecting several automata of this type, one obtains another automaton of the same type, only with a larger memory and a more complex combinatorial function. In such a representation, and from a logical point of view, neither time nor space are involved. One only needs to consider the successive logical steps associated with successive clock periods. The logical time of a computation reduces to a mere integer, which one only relates to physical time by multiplying it by the mean clock period. In other words, time is discretized. 
Fig. 6. Synchronous circuit.
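The automaton view of a synchronous circuit, and the fact that connecting two such automata under one clock yields another automaton of the same type, can be sketched as follows. The class `SyncAutomaton`, the function `chain` and the two-bit counter are hypothetical illustrations; in particular, feeding the registered state of the first automaton into the second is only one assumed way of connecting them.

```python
class SyncAutomaton:
    """Register plus combinatorial operator, advanced by a global clock."""

    def __init__(self, transition, state):
        self.transition = transition  # (state, inputs) -> next state
        self.state = state            # register content Q

    def tick(self, inputs):
        """One clock pulse: compute Q' from the stable Q, then store it."""
        self.state = self.transition(self.state, inputs)
        return self.state


def chain(a: SyncAutomaton, b: SyncAutomaton) -> SyncAutomaton:
    """Connect a's registered state to b's input, under a single clock."""
    def transition(state, inputs):
        qa, qb = state
        # Both next states are computed from the old global state (qa, qb).
        return (a.transition(qa, inputs), b.transition(qb, qa))
    return SyncAutomaton(transition, (a.state, b.state))


low = SyncAutomaton(lambda q, inc: q ^ inc, 0)       # toggles when inc is 1
high = SyncAutomaton(lambda q, carry: q ^ carry, 0)  # toggles when low bit was 1
counter = chain(low, high)

CLOCK_PERIOD_S = 1e-9                                # hypothetical clock period
states = [counter.tick(1) for _ in range(4)]         # four logical steps
elapsed = len(states) * CLOCK_PERIOD_S               # logical time to physical time
```

The composite is again a single automaton with a larger register, and its logical time is the integer number of ticks, related to physical time only through the clock period.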
The only physical constraint one must impose is that the clock frequency be smaller than the limit value necessary for all internal propagations to be performed in less time than the clock period. Nor does space play a role in the logical function. The whole circuit may be considered as local, i.e. propagation times need only be taken into account when circuits are connected over large distances, that is when propagation times are large compared to the clock period, as is the case when computers are connected. Implementation on a silicon chip must take into account and control all propagation times within a circuit, so as to ensure that all inputs become stable before the end of each clock period. The clock signal must be implemented so that it arrives simultaneously at all latches, that is with negligible delays compared to the clock period. Clearly, such properties can only be checked once the whole circuit has been specified. This type of circuit cannot be implemented incrementally, i.e. by implementing a part without any knowledge of its connections with other parts and of the clock frequency of the whole circuit. In other words, all parts must be local at every scale, from primitives to the whole circuit.
The synchronous approach is however questioned in present VLSI designs. This arises mainly because of the difficulties which are encountered when distributing the same clock signal simultaneously to millions of latches, over several cm^2, and at a frequency of the order of a gigahertz. Clock distribution uses a significant part of the chip surface and produces a significant part of the overall dissipation.
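The frequency constraint can be stated as a small computation: the clock period must exceed the longest internal propagation time. The function name `max_clock_frequency`, the delay values, and the way clock skew is simply added to the longest path are assumptions made for illustration.

```python
def max_clock_frequency(path_delays_s, clock_skew_s=0.0):
    """Upper bound on the clock frequency, in Hz.

    path_delays_s: propagation times of all internal paths, in seconds
    clock_skew_s:  assumed worst-case difference in clock arrival times
    """
    longest = max(path_delays_s)
    return 1.0 / (longest + clock_skew_s)


# Hypothetical delays for three register-to-register paths.
delays = [0.4e-9, 0.9e-9, 0.7e-9]
f_max = max_clock_frequency(delays, clock_skew_s=0.1e-9)
```

The bound depends on the slowest path of the whole circuit, which is why it can only be evaluated once the complete circuit is specified.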
Asynchronous circuits
Asynchronous circuits may be defined in opposition to synchronous circuits, by the requirement of not using a global clock. But, rather than being complementary, the class they build includes synchronous circuits.
Usual models represent asynchronous circuits as general devices which are distributed and communicate along connecting channels, as shown in Figure 7. The activity of such circuits is not ruled by the pulses of a global clock, but proceeds through communications distributed between many concurrent parts. These devices can be simple logical gates (a few transistors) or, at the opposite extreme, complex processors. Communications can be realized through a single wire or through a complex network. Clearly, concurrency can no longer be ignored. Indeed, one can no longer define a logical state which would be associated with the global circuit at a definite time, for each device can change its state following a communication, without being synchronized with most other devices. The notion of computing step itself must be revised, as it relies on a total ordering of all logical events.
Fig. 7. Asynchronous circuit.
Problems of computation interference may then arise at two different levels. At the lowest level, the functioning of a single component may be endangered by computing interferences within the component itself, due to internal loops and instabilities of internal variables. At the highest level, the composition of asynchronous circuits may induce computation interferences due to exchanges between one module and its environment, the latter sending signals which conflict with the module operation.
We shall discuss the second case only and shall assume that primitives may be defined which are free from internal computing interferences (see for instance [21, 6]). We first briefly describe those that are most frequently used in asynchronous circuits. In some examples, logical functions are defined in a way which does not distinguish between rising and falling edges. These undistinguished transitions are called events, and the logical function operates on these events. But the events are still transitions between different levels (or Boolean variables), so that each primitive can be considered in both ways, either as an operation on Boolean variables or as (another) operation on events.
One of the most frequently used primitives is the join element, or Muller's C-element. It has two inputs x_1 and x_2 and one output z, and its logical function can be described in the following way:
- if x_1 = x_2, then z = x_1 = x_2
- if inputs become different from one another, then z keeps its previous value.
Then, output z only changes after both inputs x_1 and x_2 have changed. This allows a rendez-vous to be realized between levels (wait until two inputs acquire the same value) or between events (wait until two inputs have received the same number of rising or falling edges, after proper initialization).
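The two rules defining the C-element translate directly into a small state-holding sketch. The class name `CElement` and the choice of an initial output of 0 are assumptions; the behaviour follows the description above.

```python
class CElement:
    """Muller C-element on levels: z follows the inputs only when they agree."""

    def __init__(self, z: int = 0):
        self.z = z  # assumed initial output level

    def update(self, x1: int, x2: int) -> int:
        if x1 == x2:
            self.z = x1      # inputs agree: output takes their common value
        return self.z        # inputs differ: output keeps its previous value


c = CElement()
trace = [c.update(x1, x2) for x1, x2 in [(1, 0), (1, 1), (0, 1), (0, 0)]]
```

In this trace the output rises only once both inputs have risen, and falls only once both have fallen, which is the rendez-vous behaviour described above.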
The primitive C-element can be realized using an electrical feed-back analogous to that of
Another primitive is the toggle, represented in Figure 9, which possesses one input x and two outputs z_1 and z_2. Successive events on the input are alternately sent to outputs z_1 and z_2. The first event after initialization is sent to the marked output z_1.
The or operation between events, also called merge, can be realized with a classical exclusive or gate (xor between levels). The sequencer, represented in Figure 9, possesses three inputs x_1, x_2 and x_3 and two outputs z_1 and z_2. Its role is to grant a given resource to one of two different processes which can make requests on inputs x_1 and x_2. When an event is received on x_3, a granting event is produced either on z_1 or on z_2, according to an existing request on x_1 or x_2 respectively. When two requests are present, the sequencer arbitrates between the two, and thus introduces a degree of nondeterminism. The sequencer may take an unbounded time to arbitrate, but it is required to realize the mutual exclusion of the two grant signals.
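The toggle, merge and sequencer primitives can be sketched at the level of events. All names below are hypothetical. The sequencer sketch makes two assumptions not stated in the text: it returns None when no request is pending, and it models arbitration by a random choice, ignoring the unbounded arbitration time.

```python
import random


class Toggle:
    """Sends successive input events alternately to z1 and z2, starting with z1."""

    def __init__(self):
        self.next_output = "z1"

    def event(self) -> str:
        out = self.next_output
        self.next_output = "z2" if out == "z1" else "z1"
        return out


def merge(x1_level: int, x2_level: int) -> int:
    """xor between levels: a change on exactly one input changes the output."""
    return x1_level ^ x2_level


def sequencer_grant(req1: bool, req2: bool, choose=random.choice):
    """On an event on x3, grant 'z1' or 'z2' according to pending requests."""
    pending = [z for z, requested in (("z1", req1), ("z2", req2)) if requested]
    if not pending:
        return None          # assumed behaviour: nothing to grant
    return choose(pending)   # arbitration when both requests are present


toggle = Toggle()
routed = [toggle.event() for _ in range(4)]
```

Because a single output is returned per call, the two grants are mutually exclusive by construction, whichever request wins the arbitration.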
V. COMPOSITION OF CIRCUITS AND TIME ORDERING
In this section, we discuss the composition of asynchronous circuits, and some solutions which have been brought to the problem of computing interferences.
Compositions of asynchronous circuits correspond to distributed systems, where different parts communicate in a way which is not regulated by a global clock. Then various and arbitrary time delays affect successive transitions at the input of a module. Inputs may then conflict with the correct operation of the module itself. The approaches followed in circuit design to deal with computing interferences fall into two main classes. One practical approach to timing problems consists in working directly on the physical implementation, by keeping track of all the delays occurring in the logical circuit, together with all the constraints which must be satisfied by these delays in order to make the whole circuit operate correctly. Then, programs are developed to find and optimize solutions in a systematic way [1, 30]. Although very efficient in practice, this strategy rapidly attains such a complexity that it becomes very difficult to distinguish fundamental issues from practical choices. In the other class of approaches, one attempts to separate as much as possible the logical issues related to timing from their physical manifestations, that is mainly from the values of time delays. This has led to different studies, focussing either on the determination of a best choice of logical primitives, satisfying criteria like speed-insensitivity or delay-insensitivity [16], or on a more restrictive definition of modular composition, like delay-insensitive compositions [32, 4]. In the following, we shall only briefly discuss approaches related to delay-insensitivity, and focus on the fundamental relation they tend to exhibit between the occurrence of computing interferences and the causal structure of physical space-time.
In order to make the analysis easier to follow, we shall introduce a graphic representation of the communications occuring between modules of a composition (see Figure 11 ). These graphs are analogous to those that can be used in relativistic physics to represent the space-time evolution of localized physical systems, together with the light signals they exchange. As discussed in a previous section, an essential feature is the absence of an a priori given global and common time. Only a local time ordering can be made between the successive events occuring on each module, reflecting the causal relations which can be made locally. Although time is represented as the vertical axis, this only indicates the direction for increasing time on each module. Different modules are displayed on the horizontal axis, which roughly corresponds to space. Each module is then represented by a vertical line, indicating the causal succession of the local events occuring at its inputs or outputs. Communications are then represented by inclined arrows leaving a module (output) to reach another module (input). Although they may vary, the slopes of theses arrows must always be greater than a strictly positive lower bound, which corresponds to light velocity. Varying slopes indicate that varying speed and delays affect communications between modules.
In the following, we shall denote by "event" each arrow corresponding to a communication, and shall call "point" the intersection of this event with the time evolution of a module (following in that way the notation introduced by Russell in his discussion of the causal structure of relativistic space-time [27] ). The logical specification of each module is translated into causal relations between the points which represent the occurence of events on the module. These local constraints may be given a precise expression using a formal language well suited to represent time ordered event structures [16, 4, 34] . As propagation delays play an essential part, ordering constraints will be most conveniently visualized on graphic representations, which allow the analysis of global causal relations within distributed systems.
Delay-insensitivity
In order to discuss the role of delays in computation interferences, let us first analyse the illustrative example of the Q-element [16, 17], which is represented in Figure 10. The formal expression describing the logical function of the Q-element can be written in a language which is derived from CSP (Communicating Sequential Processes) [8]
Each variable between brackets, which precedes a transition, represents a logical variable which must be true before the circuit can execute the transition which follows (; denotes time succession, and * arbitrary repetition of the expression in brackets). Thus, the circuit waits for x_i to be true, then emits a rising edge on output y_o, etc. This logical function can be implemented as a composition of a C-element with two and gates, as represented in Figure 10. Output u of the C-element is followed by a fork, which connects u to one input of each of the and gates. Two other forks also dispatch the event produced by the environment on x_i (resp. y_i) to two inputs denoted by x_1 and x_2 (resp. y_1 and y_2).
The logical operation of the circuit may be represented using a space-time graph, as in Figure 11. Left and right parts of the circuit environment are respectively represented as X and Y. The series of points corresponding to the definition of the logical function of each module can be followed on each vertical line. Situations which correspond to a rendezvous, i.e. intervals where a primitive is waiting for the arrival of two events in any order, have been represented by a thick line. This is systematically the case for the C-element, but also for the and gates, when they are waiting for their two inputs to be true. Pairs of points which cannot occur in reverse order without ruining the computation have been marked by dashed lines. The two cases involve the internal variable u and one event, y_1↑ or x_1↓, produced by the environment (Y or X). Event y_1↑ must reach the and gate B before event u↑, recalling that the latter has been produced by the arrival of event y_2↑ on the C-element. The fork which dispatches both events y_1↑ and y_2↑ thus plays a crucial role in determining the order of points on and gate B.
A few remarks are in order. Concurrent computing is well illustrated by Figure 11. Different computations proceed along paths involving vertical and propagation lines, each representing a causally ordered series of operations. Causal order makes sense only within each vertical line, where it is associated with the logical function of the module, or within propagation lines, where it connects the output of one module to the input of another module. But no a priori total order exists between all points of the graph. This is illustrated by the independence of the computation from the order of some pairs of points. For instance, two events belonging to different branches of the fork on variable u at the output of the C-element may have arbitrary relative order. Imposing a total ordering would amount to implementing a global time, by means of clock distribution for instance, which would allow one to draw horizontal lines on the graph of Figure 11. But such a condition is too restrictive, as computation only relies on causal relations imposed by vertical and propagation lines. The remaining freedom in the ordering of events, such as that related to the fork at the output of the C-element, is necessary for optimizing the circuit performance. For a definite implementation, event ordering will depend on the relative spatial localization of modules, so that the remaining freedom may be used to find an optimal arrangement of modules on the chip.
The property of delay-insensitivity [21] is easily seen on the graph. It corresponds to the independence of the causal ordering of computation from delays occurring in responses of modules or in propagations of signals, i.e. from vertical or horizontal displacements of the modules. Such a property is made possible by using primitives which wait for the arrival of events at their inputs before producing other events at their outputs. But this condition turns out to be insufficient. In that respect, it is instructive to compare the two kinds of forks used by the previous composition implementing the Q-element. No constraint affects the events produced by the fork at the output of the C-element (thick lines in Figure 11). However, forks dispatching the events produced by the environment must be implemented in such a way as to respect the causal order of the events which they generate and which finally arrive at the same and gate (dashed lines in Figure 11). Such forks, which are called isochronic forks [16, 17], must be isolated and given a special treatment at the implementation level, in order to satisfy the delay constraints which are necessary for preserving causal ordering.
The property of delay-insensitivity (DI) has been introduced and much developed as a simple condition one can impose on primitives and logical circuits, with the aim of designing asynchronous circuits of arbitrary complexity in a systematic way, without having to take time scales into account. One approach consists in defining DI circuits as compositions of stable primitives devoid of internal loops (only electrical loops being used for memorization) [15]. A primitive is defined to be stable by imposing that an input which changes the output cannot change before the output has been established. It can then be shown that only compositions of C-elements can be DI according to this definition.
However, it can also be shown that compositions using exclusively C-elements (and generalized C-elements with more inputs) strongly limit the type of allowed computations, excluding most circuits of interest [17]. The isochronic fork may then be advocated as the weakest compromise on delay-insensitivity. By adding the isochronic fork and using this extended class of elements, called quasi delay-insensitive (QDI), complex and efficient asynchronous circuits have been realized [18]. However, as illustrated by the example of the Q-element, isochronic forks need to be identified at the logical level and their implementation must be given a special treatment, which may prove intricate for very complex circuits.
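The absence of a total order can be made concrete by treating the space-time graph as a directed graph whose edges are the vertical segments (successive points on one module) and the propagation lines (output point to input point). The following Python sketch is only an illustration under that reading; the class `SpaceTimeGraph`, the point labels and the tiny example graph (a C-element whose output u is forked towards two gates A and B) are hypothetical and do not reproduce Figure 11.

```python
from collections import defaultdict


class SpaceTimeGraph:
    """Points are hashable labels; edges go from an earlier point to a later one."""

    def __init__(self):
        self.succ = defaultdict(set)

    def add_edge(self, earlier, later):
        """Add a vertical segment or a propagation line."""
        self.succ[earlier].add(later)

    def precedes(self, p, q):
        """True if some path of vertical/propagation lines leads from p to q."""
        stack, seen = [p], set()
        while stack:
            node = stack.pop()
            for nxt in self.succ[node]:
                if nxt == q:
                    return True
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
        return False

    def causally_related(self, p, q):
        return self.precedes(p, q) or self.precedes(q, p)


g = SpaceTimeGraph()
g.add_edge(("C", "u+"), ("A", "u+"))   # first branch of the fork on u
g.add_edge(("C", "u+"), ("B", "u+"))   # second branch of the fork on u
g.add_edge(("A", "u+"), ("A", "out"))  # vertical line on module A

ordered = g.causally_related(("C", "u+"), ("A", "out"))
unordered = g.causally_related(("A", "u+"), ("B", "u+"))
```

In this toy graph the two arrival points of the forked event are not comparable: reachability defines a partial order only, and any total order would be an additional choice (for instance, a global clock).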
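The delay constraint attached to an isochronic fork can be written as a simple inequality on arrival times. The sketch below assumes, for illustration only, that the other input of the C-element is already present, so that u↑ is triggered by y_2↑ alone; the four delay parameters and the function name are hypothetical and are not taken from the circuit of Figure 10.

```python
def y1_arrives_before_u(d_y1: float, d_y2: float, d_c: float, d_u: float) -> bool:
    """Check the ordering required on and gate B.

    d_y1: delay of the fork branch carrying y1 up to gate B
    d_y2: delay of the fork branch carrying y2 up to the C-element
    d_c:  response time of the C-element (y2 up -> u up)
    d_u:  propagation delay of u from the C-element to gate B
    """
    arrival_y1 = d_y1
    arrival_u = d_y2 + d_c + d_u
    return arrival_y1 < arrival_u


# Balanced branches: the required order holds.
balanced = y1_arrives_before_u(d_y1=1.0, d_y2=1.0, d_c=0.5, d_u=0.5)
# A slow y1 branch reverses the order: the fork is not delay-insensitive.
skewed = y1_arrives_before_u(d_y1=3.0, d_y2=1.0, d_c=0.5, d_u=0.5)
```

The outcome depends on the numerical values of the delays, which is exactly why such a fork has to be handled at the implementation level rather than by logical rules alone.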
Delay-insensitive composition
Another approach for avoiding computation interferences [21, 32, 4] consists in defining a less restrictive set of DI primitives, together with a notion of DI composition of these primitives. Circuits are represented in a formal language, called trace theory, similar to the one used in equation (6), with further syntax rules on logical operations. Computing interferences are avoided by imposing structural constraints in the form of simple rules. Let us first recall definitions and some properties of trace structures [4, 20].
Definition 1 Trace structures are defined as triples R = ⟨iR, oR, tR⟩, where iR and oR are finite sets of symbols, respectively the input alphabet and the output alphabet, and tR is the set of traces, which is a subset of (iR ∪ oR)*, the set of all finite-length sequences of symbols taken in the union set iR ∪ oR.
Trace structures are traditionally denoted by capital letters, while lower case letters a, b, c denote symbols and s, t traces. The following short notations are also frequently used:
Definition 2 Operations of concatenation, union, repetition, prefix-closure, projection and weaving are defined on trace structures:
where, for convenience, the notation aR ≡ iR ∪ oR has been introduced for the total alphabet of R, t ↓ A denotes the projection of trace t on alphabet A, and (tR)* is the set of all finite-length concatenations of traces in tR (symbols ∃, ∀, ∈, ∩ and ∪ denote, as usual, existence, universality, set membership, set intersection and set union). The pref operator constructs prefix-closed structures, while the projection operator hides internal symbols; finally, the weave operator expresses instantaneous synchronization. A circuit is specified by a prefix-closed, non-empty trace structure R with iR ∩ oR = ∅. The trace structure representing the environment of a circuit with trace structure R is the reflection of the latter, and may also be given a compact notation:
A trace structure R may be physically implemented by letting each symbol a in the alphabets iR and oR correspond to a channel, and each occurrence of this symbol in a trace of tR correspond to an event, i.e. a high or low transition, on the corresponding channel. Symbols in iR or oR describe communication actions that are respectively produced by the environment or (exclusive or) by the circuit. In order to be able to ignore transmission delays while avoiding transmission and computing interferences, the following rules may be imposed [32].
R_0: ∀ s ∈ tR, a ∈ aR: saa ∉ tR
Rule R_0 excludes two consecutive transitions on the same wire, and hence the transmission interferences that may result. Rule R_1 expresses the independence of computation from the order of signals travelling in the same direction, as this order may depend on the delays they have suffered. The C-element is easily seen to satisfy this rule. However, the and gate only complies with the rule when it is waiting for a rising edge on its two inputs, and does not in all other cases. Thus the and gate, and also the or gate, are excluded by this rule, whereas the toggle, the merge and the sequencer are compatible with it. Rule R_2 expresses the same property for signals travelling in opposite directions, in case their order does not change the result locally. Note that, due to the necessarily symmetric treatment of a circuit and its environment, all rules are symmetric in the exchange of input and output symbols. One must exclude the possibility for a symbol of one type to disable a symbol of another type (symbol a is said to disable symbol b in trace structure R if there is a trace s with sa ∈ tR ∧ sb ∈ tR ∧ sab ∉ tR). Such exclusion is necessary to prevent an admissible input symbol from being disabled by an output signal, depending on the delay the former has suffered on its way to the circuit (by symmetry, the same property must also hold for the environment and output signals). Depending on the level of exclusion, this property leads to the definition of three classes, with rules R
These rules successively allow for more decision possibility. Finally, rule R_2 appears on specific examples to be too restrictive [32]. An alternative and more generally efficient rule is provided by:
This rule, which is conveniently expressed on a space-time graph, as shown in Figure 12, concerns three events a, b, c connecting one module M and its environment E. It stipulates that, if two time orders are allowed for the occurrences of two events of different types (i.e. one input and one output) a and b, then if the event c, of the same type as a, is a consequence of the order "a then b", it should also be a consequence of the other order "b then a". This rule imposes that if an order on events is seen differently by a module and its environment, due to propagation delays, then this order should have no consequence on the logical behavior of the module. As illustrated by Figure 12, this rule only affects the case on the left part of the figure, that is, only the case when propagation can change the order of events. The set of DI components is given by trace structures, defined according to Definitions 1 and 2, which satisfy the weakest form of the rules, i.e. R_0, R_1, R'_2 and R'''_3 [32].
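Two of the notions used in these rules, the exclusion of repeated symbols (rule R_0) and the disabling of one symbol by another, can be checked mechanically on a finite, prefix-closed set of traces. The Python sketch below is a bounded approximation written for illustration: traces are tuples of symbols, the function names are hypothetical, the bound `max_len` is an assumption needed because real trace sets are usually infinite, and the filtering of pairs by input/output type is left to the caller.

```python
def violates_r0(traces):
    """Traces containing two consecutive occurrences of the same symbol.
    For a prefix-closed set this is equivalent to finding some saa in tR."""
    return {t for t in traces if any(x == y for x, y in zip(t, t[1:]))}


def disabling_pairs(traces, alphabet, max_len):
    """Pairs (a, b) such that sa and sb are traces but sab is not.
    Only prefixes s with len(s) + 2 <= max_len are inspected, so that a trace
    missing merely because of the length bound is not reported."""
    pairs = set()
    for s in traces:
        if len(s) + 2 > max_len:
            continue
        for a in alphabet:
            for b in alphabet:
                if a == b:
                    continue
                if s + (a,) in traces and s + (b,) in traces and s + (a, b) not in traces:
                    pairs.add((a, b))
    return pairs


# A wire (input a, output b), truncated at length 3: no choice, no disabling.
wire = {(), ("a",), ("a", "b"), ("a", "b", "a")}
# A structure offering a choice between a and b, truncated at length 2.
choice = {(), ("a",), ("b",), ("a", "c"), ("b", "d")}

wire_pairs = disabling_pairs(wire, {"a", "b"}, 3)
choice_pairs = disabling_pairs(choice, {"a", "b", "c", "d"}, 2)
```

In the second structure a and b disable each other; whether this is acceptable depends on their types, which is the distinction made by the successive classes of rules.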
A set of DI primitive components for asynchronous circuits can thus be obtained with the following list of specifications in terms of trace structures (see Table 1 ). The wire corresponds to a component which waits for an event to occur on its input, then sends an event on its output, and repeats this sequence indefinitely. The inverted wire (iwire) behaves similarly, but begins by sending an event on its output. The fork duplicates one input. As can be seen from definitions (8), weaving not only consists in putting in parallel, but also in synchronizing common output symbols. In the particular case of two wires with a common output, weaving leads to the C-element. The other components correspond to the primitive circuits which have been previously introduced (see Figure 9) .
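The remark that weaving two wires with a common output yields the C-element can be checked by brute force on short traces. The sketch below assumes the usual trace-theoretic reading of the weave (a trace over the joint alphabet belongs to the weave when each of its projections belongs to the corresponding component) and models a wire as strict alternation of its input and its output, starting with the input; function names, symbol names and the length bound are choices made for this illustration.

```python
from itertools import product


def project(trace, alphabet):
    """Projection of a trace on an alphabet: drop all other symbols."""
    return tuple(sym for sym in trace if sym in alphabet)


def wire_accepts(trace, inp, out):
    """Membership in the trace set of a wire: inp, out, inp, out, ..."""
    expected = (inp, out)
    return all(sym == expected[k % 2] for k, sym in enumerate(trace))


def weave_traces(alpha_r, accepts_r, alpha_s, accepts_s, max_len):
    """All traces of length <= max_len whose projections are accepted by both
    components; common symbols are thereby synchronized."""
    alphabet = sorted(alpha_r | alpha_s)
    result = set()
    for n in range(max_len + 1):
        for t in product(alphabet, repeat=n):
            if accepts_r(project(t, alpha_r)) and accepts_s(project(t, alpha_s)):
                result.add(t)
    return result


# Two wires a -> c and b -> c sharing the output c.
c_element = weave_traces(
    {"a", "c"}, lambda t: wire_accepts(t, "a", "c"),
    {"b", "c"}, lambda t: wire_accepts(t, "b", "c"),
    max_len=3,
)
```

Up to length 3, the resulting traces are exactly those in which a and b occur once each, in either order, before c: the output waits for both inputs, which is the rendezvous behavior of the C-element.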
The objective is to realize circuits corresponding to given complex specifications by combining simple DI primitive circuits. This aim may be attained by making use of operations such as decomposition and substitution, together with two theorems setting the conditions for performing these operations.
Conditions in (15) respectively describe a closed network (each input is connected to an output and conversely (i)), absence of output interferences (two outputs cannot be connected (ii)), absence of computing interferences (any event produced by a component is compatible with the behavior of the component which receives it (iii)) and correct behavior at circuit boundary (network behaves as prescribed (iv)). Decomposition will be denoted by
Let us now state two useful theorems (proofs may be found in [3]).
Substitution Theorem 1 For components
holds if
The latter condition stipulates that internal symbols of S, i.e. symbols in (aR_2 ∪ aR_3)\S, where \ denotes set difference, should not appear in (aR_0 ∪ aR_1). This can be achieved by an appropriate renaming of the internal symbols of S.
Separation Theorem 2 For components
Condition (17) stipulates that the internal symbols of the decompositions of R_0 and S_0 are disjoint (this condition may be satisfied by renaming some of these symbols), and conditions (18) stipulate that the outputs of any two components R_i‖S_i and R_j‖S_j are also disjoint when the components are different (these conditions may also be satisfied by reordering the components).
With the help of these two theorems, the previously defined DI primitives may be combined to give modular compositions which are delay-insensitive, hence circuits where computing interferences cannot be introduced by delay modifications alone. We briefly describe an example of a circuit which can be obtained with such a composition of DI primitives, the token-ring interface [4]. The token-ring interface is a device for connecting several machines which must share a common resource (like a memory or a bus). One item Alloc_i of this device is associated with each machine M_i, all items being identical and realizing the same function, as shown by Figure 13. Item Alloc_i is connected to two environments, the machine M_i at the top of the figure and, at the bottom, the ring R where a token circulates. The arrival of the token at Alloc_i corresponds to an event on b, its departure to an event on q. The machine M_i can make a request in the form of an event on a_1. Alloc_i grants the resource to the machine M_i by an event on p_1. The machine M_i signals the end of its use of the resource by an event on a_0, which is acknowledged by Alloc_i in the form of an event on p_0.
Initially, the token-ring interface is specified by the following trace structure: 
This specification results from weaving two trace structures which respectively describe the communications of the token-ring interface with the machine M_i and with the ring R. The two trace structures interact through their common dependence on two events p_1 and a_0. Each trace structure may be decomposed into primitive elements. Substitution and separation theorems may be applied, finally leading to a possible decomposition, as shown graphically in Figure 13:
The first component is a sequencer (see Table 1), necessary for synchronizing the output p_1 shared by the two trace structures defining the token-ring interface. The sequencer also arbitrates between corresponding inputs. Other components describe an iwire, two wires and a merge. Although they do not appear explicitly in decomposition (20), two forks appear in Figure 13, as a consequence of double occurrences of a0? and q1? in (20). The DI property of this implementation can be visualized on a space-time graph, as in Figure 14, where two cases have been represented. In the first case, the request a_1 made by the machine M_i is not granted, the token being sent back to the ring. When the token arrives a second time, the resource is granted to the machine M_i, which was waiting. This illustrates the nondeterministic behavior of the module Alloc, which depends on the arbitration performed by sequencer B. The figure also shows that the two forks, that on q_1 (output of B) and that defined by D, cannot create computation interferences, so that no particular constraints are necessary. This results from the function of sequencer B, which is not perturbed whatever the order of the events on its inputs. Sequencer B waits for an event on b to make a decision, and then arbitrates between the different requests it has received.
As shown by the example of the token-ring interface, DI primitives and DI decomposition may be used to generate modular compositions which are delay-insensitive and which, as shown with the help of space-time graphs, remain free of computing interferences. Delay-insensitivity appears as a simple criterion for escaping the problems raised by computing interferences in a purely logical way, i.e. without recourse to a detailed analysis of the physical implementation of a circuit. The DI criterion allows one to treat asynchronous circuits efficiently, as in the case of synchronous circuits, by allowing them to be represented formally (in terms of trace structures). Although revealing a genuinely different underlying structure, the causal constraints on asynchronous circuits, as exhibited by space-time graphs, can nonetheless be embedded in a simple set of formal rules which limit the definition and composition of DI circuits. In general, these rules allow DI circuits to be decomposed into a number of DI primitive components which increases linearly with the length of the circuit specification [3].
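The external behavior of one item Alloc_i, as described in words above, can be mimicked by a small state machine. The following Python sketch is a loose behavioral model, not the trace-structure specification or its decomposition: the class name `Alloc`, the `grant` flag standing for the outcome of the arbitration, and the order in which p0 and q are emitted after a release are all assumptions made for the illustration.

```python
class Alloc:
    """Toy model of one token-ring interface item.
    Each method takes an input event and returns the list of output events."""

    def __init__(self):
        self.requested = False  # a1 received and not yet granted
        self.holding = False    # resource granted, token retained

    def a1(self):
        """Request from the machine."""
        self.requested = True
        return []

    def b(self, grant=True):
        """Token arrival. `grant` stands for the arbitration outcome when a
        request is pending; without a pending request the token is passed on."""
        if self.requested and grant:
            self.requested = False
            self.holding = True
            return ["p1"]
        return ["q"]

    def a0(self):
        """Release by the machine: acknowledge and let the token leave.
        The relative order of p0 and q is arbitrary in this sketch."""
        if not self.holding:
            raise RuntimeError("release without a prior grant")
        self.holding = False
        return ["p0", "q"]


alloc = Alloc()
history = []
history += alloc.a1()            # the machine requests the resource
history += alloc.b(grant=False)  # first token arrival: request not granted
history += alloc.b(grant=True)   # second arrival: resource granted
history += alloc.a0()            # the machine releases the resource
```

The scenario replays the first case discussed for Figure 14: the token goes round once before the waiting machine is served.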
VI. CONCLUSION
Without giving definite answers to the problems raised in the introduction, we have nevertheless tried to provide some hints on the essential role played by physical time in computation. The necessary reference to physical time in physical implementations of logical circuits forces one to give an explicit treatment of computation interferences. These arise as obstructions when trying to make the causality underlying logical circuits coincide with the physical causality of their implementations. For synchronous circuits, these may be avoided by ruling the whole circuit with a single clock, which thus provides a global reference to a Newtonian time. In general however, circuits must be considered as asynchronous and physical space-time as relativistic. In the latter, not all points are causally related, but only those such that one point falls within the light cone issued from the other. In that respect, asynchronous circuits and relativistic space-time share the same founding point of view. Points derive from events and not the converse, propagating events being treated as primary entities and not as successions of points. The distinction between two classes of points can also be seen in a simple way: two points are causally related if and only if there exists a path between them using vertical or propagation lines (in different spatial directions, but in the same time direction); on the other hand, points defined on two different events originating from the same point are not causally related [27]. Similarly, for a concurrent computation, each computing path connects points which are causally related. Avoiding computing interferences amounts to imposing that different computing paths respect the same time ordering, but only for pairs of causally related points.
Remedies to computing interferences in asynchronous circuits consist in recognizing paths which may conflict with a module specification, and in possibly delaying these paths, so as to respect a prescribed time ordering. This can be done either physically, at the implementation level, by introducing explicit time delays, or at the logical level, by imposing specification rules which prevent the occurrence of such conflicts. The latter solution, by imposing delay-insensitivity both on circuit specification and on decomposition in a consistent way, has the advantage of providing a purely logical characterization of the causal constraints. DI circuits then form a class which may be seen as intermediate between synchronous circuits and general asynchronous circuits. They share with the former the possibility of being completely characterized by formal expressions and rules. But they rely on the same causal structure as the latter. Synchronous circuits rely on time simultaneity classes, and thus on a causal structure which is typical of Newtonian space-time. Asynchronous circuits, on the other hand, rely on a consistent treatment of propagation delays and time ordering, hence on a causal structure which characterizes relativistic space-time.
Delay-insensitivity provides an interesting transition between local properties, like those defining sequential processors, and global ones, like those exhibited by distributed systems. But DI circuits hardly exhaust the computation potentialities brought by the introduction of asynchronous circuits. The critical consequences of delay sensitivity rather suggest considering a further alternative when attempting to classify the different types of computations, i.e. those performed by synchronous, by DI asynchronous and by DS (delay-sensitive) asynchronous circuits. In the same way as asynchronous computing machines may not always allow simulation by synchronous computing machines, one may infer that physical processes and physical laws, which intrinsically obey relativistic causality, may be simulated by synchronous machines in particular cases only. This hints at another advantage of computations based on asynchronous circuits, i.e. the ability to simulate real physical processes in a universal way.
