Abstract. The cryptographic concept of simulatability has become a salient technique for faithfully analyzing and proving security properties of arbitrary cryptographic protocols. We investigate the relationship between simulatability in synchronous and asynchronous frameworks by means of the formal models of Pfitzmann et al., which are seminal in using this concept to bridge the gap between the formal-methods and the cryptographic communities. We show that the synchronous model can be seen as a special case of the asynchronous one with respect to simulatability, i.e., we present an embedding between both models that we show to preserve simulatability. We show that this result allows for carrying over lemmas and theorems that rely on simulatability from the asynchronous model to its synchronous counterpart without any additional work. Hence future work can concentrate on the more general asynchronous case, without having to neglect the analysis of synchronous protocols.
Introduction
In recent times, the analysis of cryptographic protocols has been getting more and more attention, and the demand for general frameworks for representing cryptographic protocols and the security requirements of cryptographic tasks has been rising. Existing frameworks are either motivated by the complexity-theoretic view on cryptography, which aims at proving cryptographic protocols with respect to the cryptographic semantics, or they are motivated by the view of the formal-methods community, which aims at capturing abstractions of cryptography in order to make such protocols accessible for formal verification. Frameworks built on abstractions will be further dealt with in the related literature along with a discussion on the cryptographic justification of these abstractions.
For living up to the probabilistic nature of cryptography, a framework for dealing with actual cryptography necessarily has to be able to deal with probabilistic behaviors. The standard understanding in well-known, non security-specific probabilistic frameworks like [38, 41] is that the order of events is fixed by means of a probabilistic scheduler that has full information about the system. In contrast to that, the standard understanding in cryptology (closest to a rigorous definition in [10] ) is that the adversary schedules everything, but only with realistic information. This corresponds to making a certain subclass of schedulers explicit for the model from [38] . However, if one splits a machine into local submachines, or defines intermediate systems for the purposes of proof only, this may introduce many schedules that do not correspond to a schedule of the original system and therefore just complicate the proofs. The typical solution is a distributed definition of scheduling which allows machines that have been scheduled to schedule certain (statically fixed) other machines themselves.
Based on these requirements, several general definitions of secure protocols were developed over the years, e.g. [15, 28, 7, 23, 35, 18, 11, 37, 12], which are all potential candidates for such a framework. To allow for a faithful analysis of cryptographic protocols, it is well-known that such models not only have to capture probabilistic behaviors, but also complexity-theoretically bounded adversaries as well as a reactive environment of the protocol, i.e., continuous interaction with the users and the adversary. Unfortunately, most of the above work does not live up to these requirements in spite of its generality, mainly since it
Related Literature. If cryptographic protocols should be verified using formal methods, some kind of abstraction is needed as the underlying reduction proofs of cryptography are still out of scope of current verification techniques. This abstraction is usually based on the so-called Dolev-Yao abstraction [13], which considers cryptographic primitives, e.g., E for encryption and D for decryption, as operators in a free algebra where only predefined cancellation rules hold. For instance, twofold encryption of a message m does not yield another message from the basic message space but the term E(E(m)). A typical cancellation rule is D(E(m)) = m. This abstraction simplifies proofs of larger protocols considerably, and it gave rise to a large body of literature on analyzing the security of protocols using techniques for formal verification of computer programs (a very partial list of work includes [29, 26, 20, 9, 27, 21, 24, 33, 39, 1]).
Since this line of work turned out to be very successful, the interesting question arose whether these abstractions are indeed justified from the view of cryptography, i.e., whether properties proved for the abstractions are still valid for the cryptographic implementation. Abadi et al. showed in [3, 2] that the Dolev-Yao model is cryptographically faithful at least for symmetric encryption and synchronous protocols. There, however, the adversary is restricted to passive eavesdropping. Consequently, it was not necessary to choose a reactive model of a system and its honest users, and the notion of simulatability could be replaced by the weaker notion of indistinguishability [43]. Another interesting approach has been presented by Guttman et al. [17], who show that the probabilities of two executions of the same protocol, either executed in a Dolev-Yao-like framework or using real cryptographic primitives, may deviate from each other by at most a certain bound. However, their results are specific to the Wegman-Carter system so far. Moreover, as this system is information-theoretically secure, its security proof is much easier to handle than that of primitives with security guarantees only against computationally bounded adversaries, since no reduction proofs against underlying number-theoretic assumptions have to be made. Some further approaches for special security goals or primitives are [40, 22]. However, there is evidence that the original Dolev-Yao model is not justified in the presence of active attacks, even if provably secure cryptographic primitives are used, cf. [34] for an (admittedly constructed) counterexample. This exemplifies the demand for "better" abstractions which the models of Canetti and of Pfitzmann et al. want to establish using the concept of simulatability.
Simulatability bridges this gap by serving as a cryptographically sufficient relationship between abstract specifications and cryptographic implementations, i.e., abstractions which can be shown to simulate a given implementation in a particular sense are known to be sound with respect to the security definitions of cryptography. Simulatability was first invented for multi-party function evaluation [42, 15, 7, 28, 11], i.e., systems with only one initial input set and only one output set. An extension to a reactive scenario, where participants can make new inputs many times, e.g., start new sessions like key exchanges, was first fully defined in [34], with extensions to asynchronous systems in [37, 12]. Each of the three considered models has already been used successfully to build sound abstractions of various cryptographic primitives, and all of them enjoy a composition theorem, i.e., large protocols can be refined step-wise without destroying the already proven properties.
Comparing the models of Canetti and Pfitzmann et al., we can state that Canetti's work enjoys a more general composition theorem and has moreover addressed more cryptographic primitives so far. On the other hand, the models of Pfitzmann et al. are more rigorously defined and early examples of tool-supported proofs in their models exist [5, 4], using PVS [32]. Moreover, the recently published universally composable cryptographic library [6] may pave the way to formal verification of large security protocols within their models.
Outline. In Section 2 we review the reactive models for synchronous and asynchronous time. In Section 3, we explain how the embedding works and give a rigorous definition. Starting with a proof sketch of the first embedding theorem in Section 4 (there will be two of them) and some lemmas capturing essential steps in the theorem's proof, we fade to the embedding theorems in Section 5. In conjunction, both theorems allow for carrying over theorems from the asynchronous to the synchronous case, which is shown in Section 6 by means of an example. For the sake of readability, most occurring proofs are postponed to the Appendix.
Review of the Reactive Models in Synchronous and Asynchronous Networks
In this section we briefly review the synchronous and the asynchronous model for probabilistic reactive systems as introduced in [35] and [37] , respectively. Several definitions are only sketched, whereas those that are essential for understanding our upcoming results are given in full detail.
General System Model
In the following we consider a finite alphabet Σ and some special symbols !, ?, ↔ , ⊳ ∈ Σ that will be used to express different ports of machines. For s ∈ Σ * and l ∈ N 0 , we define s⌈ l to be the l-letter prefix of s.
Our machine model is probabilistic state-transition machines, similar to probabilistic I/O automata as sketched by Lynch [25] . Communication between different machines is done via ports which are divided into input and output ports. Inspired by the CSP-Notation [19] we write input and output ports as q? and q!.
Ports will later be connected by naming convention, i.e., a port q! always sends messages to q?. In the asynchronous model, a special machine called a buffer will further be inserted in each connection to ensure asynchronous behavior. A buffer stores all of its inputs in an internal list. If a machine wants to schedule the i-th message of buffer q (this machine must have the unique clock-out port q ⊳ !), it simply sends i at q ⊳ !, cf. Figure 1 . The buffer then schedules the i-th message and removes it from its internal list. Neither buffers nor clock ports occur in synchronous machines; they are just included to establish a distributed scheduling in the asynchronous case.
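The buffer behavior described above can be illustrated by a minimal Python sketch. The class and method names (`Buffer`, `store`, `schedule`) are hypothetical and not part of the model; the sketch also assumes that message positions are counted from 1 and that scheduling a nonexistent position delivers nothing.

```python
class Buffer:
    """Sketch of a buffer on the connection q! -> q? (names are illustrative)."""

    def __init__(self):
        self.queue = []  # internal list of stored inputs

    def store(self, message):
        """An input arriving from the sending machine is appended to the list."""
        self.queue.append(message)

    def schedule(self, i):
        """Deliver and remove the i-th message (1-indexed, an assumption);
        return None if no such message exists."""
        if isinstance(i, int) and 1 <= i <= len(self.queue):
            return self.queue.pop(i - 1)
        return None


buf = Buffer()
for m in ["a", "b", "c"]:
    buf.store(m)
delivered = buf.schedule(2)  # the machine owning the clock-out port sends i = 2
```

In this sketch, the machine owning the clock-out port would call `schedule(i)`; all other machines only ever call `store`.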
As the low-level complement q^c of a port q (either input or output port) we denote the port with which it connects according to Figure 1, and vice versa. The high-level complement q^C of a port q denotes the connecting port without the buffer, i.e., q!^C = q? and vice versa. For a set or a sequence P of ports, let in(P) and out(P) denote the subset or subsequence of P consisting of the input ports or the output ports of P, respectively.
After introducing ports, we now focus on the definition of machines. A machine has a sequence of ports, containing both input ports and output ports, and a set of states, comprising sets of initial and final states. If a machine is switched, it receives an input tuple at its input ports and performs its transition function yielding a new state and an output tuple in the deterministic case, or a finite distribution over the set of states and possible outputs in the probabilistic case. Furthermore, each machine has a bound on the length of the considered inputs which allows time bounds on the computation time independent of the environment. The parts of an input that are beyond the length bound are ignored, i.e., incoming strings are only processed up to a predefined length. In particular, this is used to ensure polynomial runtime of individual machines. 
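A single switching step with a length bound might be sketched as follows. This is only an illustration under assumptions: the representation of the transition function as a list of weighted outcomes and the names `truncate`, `switch`, and `echo_delta` are hypothetical and do not appear in the model.

```python
import random


def truncate(inputs, bound):
    """Keep only the first `bound` letters of every incoming string."""
    return {port: value[:bound] for port, value in inputs.items()}


def switch(state, inputs, delta, bound, rng):
    """One switching step: truncate the inputs, then sample a
    (new_state, outputs) pair from the finite distribution given by delta."""
    dist = delta(state, truncate(inputs, bound))  # list of ((state, outputs), prob)
    outcomes = [outcome for outcome, _ in dist]
    weights = [prob for _, prob in dist]
    return rng.choices(outcomes, weights=weights, k=1)[0]


def echo_delta(state, inputs):
    """Hypothetical deterministic transition: echo the input at in? to out!."""
    return [((state + 1, {"out!": inputs.get("in?", "")}), 1.0)]


new_state, outputs = switch(0, {"in?": "abcdef"}, echo_delta,
                            bound=3, rng=random.Random(0))
```

Because the input is cut to the bound before the transition function sees it, the cost of one step in this sketch does not depend on how much the environment sends.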
Definition 1. (Machines) A machine is a tuple
In the text, we often write "M" also for name M . For a set M̂ of machines, let ports(M̂) denote the set of ports of all machines M ∈ M̂. Machines usually start with one initial input, i.e., the starting state is parameterized. Complexity is measured in terms of the length of this initial input, usually a security parameter k given in unary representation; in particular, polynomial-time is meant in this sense. We only briefly state here that these machines have a natural realization as probabilistic interactive Turing machines as introduced in [16]. We call a machine M a black-box submachine of a machine M ′ if the machine M ′ has access to the state-transition function δ M of M, i.e., it can execute δ M for the current state of the machine and arbitrary inputs. In order to cope with specific inputs and outputs of a machine M, we introduce some additional notation which is not contained in the original model. Let P := (p 1 ?, . . . , p n ?) ⊆ in(Ports M ) be a subsequence of the input ports of M and (v i ) i∈{1,...,n} ∈ (Σ * ) n . Then I p 1 ?=v 1 ,...,p n ?=v n denotes the input with p i ? = v i for all i and p ′ ? = ε for all p ′ ? ∈ in(Ports M ) \ P . In the special case p i ? = ε for all i, i.e., in case of an all-empty input, we write I ε . Outputs are defined similarly.
A collection C of machines is a finite set of machines with pairwise different machine names and disjoint sets of ports. In the asynchronous model, the completion [C] of a collection C is the union of all machines of C and the buffers needed for every channel. A port of a collection is called free if its connecting port is not in the collection. These ports will be connected to the users and the adversary. The free ports of a collection C are denoted as free(C). In the asynchronous model, a collection C is called closed if its completion [C] has no free ports except a special master clock-in port clk ⊳ ?, i.e., free([C]) = {clk ⊳ ?}. When we define the interaction of several machines, this port will be used to resolve situations where the interaction cannot proceed. In the synchronous case, we demand free(C) = ∅.
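As an illustration of free ports and closedness, the following sketch computes free(C) for the synchronous convention, where q! connects directly to q? and no buffers or clock ports exist. Representing a collection as a dictionary from machine names to port sets, and the function names, are assumptions made for this example only.

```python
def complement(port):
    """High-level complement of a port: q! <-> q?."""
    name, kind = port[:-1], port[-1]
    return name + ("?" if kind == "!" else "!")


def free_ports(collection):
    """collection: dict mapping machine name -> set of port names.
    A port is free if its connecting port is not in the collection."""
    all_ports = set().union(*collection.values())
    return {p for p in all_ports if complement(p) not in all_ports}


def is_closed_sync(collection):
    """Synchronous closedness: free(C) must be empty."""
    return not free_ports(collection)


demo = {"M1": {"a!", "b?"}, "M2": {"a?", "c!"}}
demo_free = free_ports(demo)
```

Here the ports `b?` and `c!` remain free, so they are the ones that users and the adversary would connect to.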
For security purposes, special collections are needed, because an adversary may have taken over parts of the initially intended system, e.g., different situations have to be captured depending on which and how many users are considered as being malicious. Therefore, a system consists of several possible remaining structures. The separation of the free ports into specified ports and others is an important feature of the upcoming security definitions. The specified ports are those where a certain service is guaranteed. Note that this definition is valid for both the synchronous and the asynchronous case. In particular, buffers do not have to be explicitly included in the specification of a system, e.g., in the specification of a cryptographic protocol that one wants to analyze. The different timing assumptions stem from the different definitions of runs which we will introduce in the following.
Definition 2. (Structures and Systems
A structure can be completed to a configuration by adding machines H and A, modeling the joint honest users and the adversary, respectively. The machine H is restricted to the specified ports S , A connects to the remaining free ports of the structure and both machines can interact, e.g., in order to model active attacks. In the asynchronous case, buffers are additionally added to close the collection. 
Capturing Asynchronous Runs
For a configuration, both models define a probability space of runs (sometimes called traces or executions). In the asynchronous model, scheduling of machines is done sequentially, so we have exactly one active machine M at any time. If this machine has clock-out ports, it is allowed to select the next message to be scheduled as explained at the beginning of Section 2.1. If this message exists, it is delivered by the buffer and the unique receiving machine is the next active machine. If M tries to schedule multiple messages, only one is taken, and if it schedules none or the message does not exist, the special master scheduler is scheduled. This is formally captured as follows. 
For a singleton M = {H} we write view conf (H) instead of view conf ({H}).
This rather informal definition of runs can naturally be formalized using transition probabilities, which induce probability spaces over the finite sequences of steps similar to Markov Chains. The extension to infinite sequences can then be achieved using well-established results of measure theory and probability theory, cf. Section 5 of [31] . It is further easy to show that views of polynomial-time machines are of polynomial size.
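The sequential scheduling rule for asynchronous runs can be sketched in a few lines of Python. This is a strongly simplified illustration, not the formal run algorithm: machines are modeled as hypothetical functions returning their outputs together with at most one clocking request, probabilism and states are omitted, and any machine may clock any buffer, whereas the model ties this right to the owner of the clock-out port.

```python
def run_async(machines, receiver, master, steps):
    """
    machines: name -> function(input) -> (outputs, clock)
        outputs: dict buffer name -> message
        clock:   (buffer name, i) or None
    receiver: buffer name -> name of the unique receiving machine
    master:   name of the master scheduler
    """
    buffers = {name: [] for name in receiver}  # one message list per buffer
    active, current_input = master, ""         # the master scheduler starts
    trace = []
    for _ in range(steps):
        outputs, clock = machines[active](current_input)
        trace.append((active, current_input))
        for buf, msg in outputs.items():
            buffers[buf].append(msg)
        delivered = None
        if clock is not None:
            buf, i = clock
            if 1 <= i <= len(buffers[buf]):
                delivered = (receiver[buf], buffers[buf].pop(i - 1))
        # No existing message was scheduled: control returns to the master.
        active, current_input = delivered if delivered else (master, "")
    return trace


def master_machine(_inp):
    return {"to_m": "ping"}, ("to_m", 1)  # send and immediately schedule


def m_machine(_inp):
    return {}, None                       # schedules nothing


trace = run_async({"X": master_machine, "M": m_machine},
                  {"to_m": "M"}, "X", 4)
```

In the example, control alternates between the master scheduler `X` and the machine `M`, since `M` never schedules a message itself.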
Capturing Synchronous Runs
In the synchronous model, ports, machines, collections, structures, and systems are defined similarly to the asynchronous model. The only exception is that there are no clock ports and no buffers, which have only been included to model asynchronous timing, i.e., corresponding ports p? and p! are directly connected.
The main difference is the definition of runs. Instead of our asynchronous run algorithm (cf. Definition 4), runs are defined using rounds which is the usual concept in synchronous scenarios. Every global round is again divided into n so-called subrounds, and there is a mapping κ, called clocking scheme, from the set {1, . . . , n} into the powerset of considered machines, i.e., the machines of the structure, the user, and the adversary. κ(i) denotes which machines switch in subround i. After finishing the n-th subround, the run starts the first subround of the next global round. At the beginning of each subround, all messages from the previous subround are transported from the output ports to the connected input ports. After that, each machine of κ(i) switches with its current inputs yielding a finite distribution over the set of states and the set of possible outputs.
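The round structure can be made concrete with a small deterministic sketch. It assumes that machines are plain functions from an input dictionary to an output dictionary (states and probabilism are omitted) and that a clocking scheme is given as a list whose j-th entry is the set κ(j); the names `run_sync`, `connections`, and `pending` are hypothetical.

```python
def run_sync(machines, kappa, connections, rounds):
    """
    machines:    name -> function(inputs: dict port -> str) -> dict port -> str
    kappa:       list of sets; kappa[j-1] holds the machines of subround j
    connections: output port -> (receiving machine, input port)
    """
    pending = {name: {} for name in machines}  # inputs waiting at each machine
    steps = []
    for i in range(1, rounds + 1):
        for j, active in enumerate(kappa, start=1):
            produced = []
            for name in sorted(active):
                inputs, pending[name] = pending[name], {}
                outputs = machines[name](inputs)
                steps.append((name, i, j, inputs, outputs))
                produced.append(outputs)
            # Transport after all machines of the subround have switched, so
            # they cannot influence each other; several arrivals at the same
            # port are concatenated until the receiver switches.
            for outputs in produced:
                for port, msg in outputs.items():
                    target, in_port = connections[port]
                    pending[target][in_port] = pending[target].get(in_port, "") + msg
    return steps


demo_machines = {"A": lambda inputs: {"a!": "x"}, "B": lambda inputs: {}}
demo_steps = run_sync(demo_machines, [{"A"}, {"A"}, {"B"}],
                      {"a!": ("B", "a?")}, rounds=1)
```

In the example, machine `A` switches twice before `B` does, so `B` sees the concatenated input `xx` in subround 3.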
Definition 5. (Clocking Scheme)
A clocking scheme κ for a configuration (M̂ , S , H, A) and n ∈ N is a mapping from the set {1, . . . , n} to the powerset of M̂ ∪ {H, A}, i.e., it assigns each number a subset of the machines.
Definition 6. (Synchronous Runs and Views) Given a configuration conf = (M , S , H, A)
along with a clocking scheme κ for n ∈ N, runs are defined as follows: Each global round i has n subrounds. In subround 
each state-transition function δ M is applied to M's current input yielding a new state and output (probabilistically). The output at a port p! is available as input at p? until the machine with port p? is switched. If several inputs arrive until that time, they are concatenated. This gives a family of random variables
Again, the view of a polynomial-time machine can easily be shown to be of polynomial size. 
The order of these tuples can be chosen arbitrarily since they switch simultaneously and do not influence each other. After that, we have the steps (M, 1, 2, s, I ′ , s ′ , O) for all M ∈ κ(2) and so on, until we finally have steps of the form
Obviously, this characterization of runs is equivalent to the original one (we just expanded the function), but it is better suited for our upcoming proofs.
Instead of arbitrary clocking schemes as in the above definition of runs, the authors of [35] focus on only one special clocking scheme κ, given by (M ∪ {H}, {A}, {H}, {A}). Clocking the adversary between the correct machines is the well-known model of "rushing adversaries". In [35] , it has been shown that this clocking scheme does not restrict the possibilities of the adversary, hence we can use it without loss of generality. Moreover, we restrict ourselves to those configurations where the honest user and the adversary are only connected via one duplex channel. This is indeed no restriction to generality in the synchronous model, because outputs at several ports to the same machine can simply be concatenated using a separation symbol and decomposed again, respectively. In the following, we give these two channels fixed names p A H and p H A , i.e., p A H ! sends messages from A to H and vice versa.
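The reduction to a single duplex channel can be sketched as follows, under the assumption that a separation symbol is available that never occurs inside a message. The symbol `#` and the function names are chosen for illustration only.

```python
SEP = "#"  # assumed separation symbol that does not occur inside messages


def combine(outputs, ports):
    """Concatenate the outputs at several ports into one message."""
    return SEP.join(outputs.get(p, "") for p in ports)


def split(message, ports):
    """Decompose a combined message into per-port values again."""
    return dict(zip(ports, message.split(SEP)))


demo_ports = ["p1!", "p2!"]
combined = combine({"p1!": "ab"}, demo_ports)
recovered = split(combined, demo_ports)
```

Ports without an output simply contribute the empty string, so the receiver can always recover one value per port.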
Simulatability
The definition of one system securely implementing another one is based on the common concept of simulatability. Simulatability essentially means that whatever might happen to an honest user in a real system Sys real can also happen in an ideal (abstract) system Sys id : For every structure struc 1 ∈ Sys real , every user H, and every adversary A 1 , there exists an adversary A 2 on a corresponding ideal structure struc 2 such that the view of H is indistinguishable in the two configurations. Indistinguishability ("≈") is a well-defined cryptographic notion from [43] . We only give the definition of computational indistinguishability; a more comprehensive definition is given in the Appendix. 
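For orientation, computational indistinguishability is commonly stated as follows. The formulation is the standard one and is given here as a sketch; the symbols $\mathsf{Dis}$, $Q$, and $k_0$ are notation assumed for this illustration and may differ from the notation of the formal definition.

```latex
Two families $(\mathit{var}_k)_{k \in \mathbb{N}}$ and
$(\mathit{var}'_k)_{k \in \mathbb{N}}$ of random variables are called
computationally indistinguishable, written
$(\mathit{var}_k)_{k \in \mathbb{N}} \approx (\mathit{var}'_k)_{k \in \mathbb{N}}$,
if for every probabilistic polynomial-time algorithm $\mathsf{Dis}$
(the distinguisher) the advantage
\[
  \bigl|\, P(\mathsf{Dis}(1^k, \mathit{var}_k) = 1)
        - P(\mathsf{Dis}(1^k, \mathit{var}'_k) = 1) \,\bigr|
\]
is negligible in $k$, i.e., for every positive polynomial $Q$ there exists
$k_0$ such that the advantage is at most $1/Q(k)$ for all $k \geq k_0$.
```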
This is shown in Figure 2 . In the following, we Fig. 2 . Overview of the simulatability definition.
augment ≥ with a subscript sync or async to distinguish the definition of the synchronous and asynchronous case. In a typical ideal system, each structure contains only one machine TH called trusted host, which serves as an ideal functionality of the real system. The machine TH is usually deterministic and maintains a very simple transition function, hence validation based on this ideal functionality is in scope of current verification techniques.
Idea and Definition of the Embedding
The informal idea of the embedding ϕ Sys is to add an explicit master scheduler that should simulate the synchronous run induced by the given clocking scheme. However, due to the general distributed scheduling (cf. Definition 4), leaving the actual machines unmodified leads to non-simulatable situations, as these machines can clock themselves without ever giving control to this explicit master scheduler.
Hence, we first define a mapping ϕ M that surrounds single synchronous machines (i.e., machines that are designed for a synchronous environment) with an "asynchronous coat". More precisely, if a synchronous machine makes a transition, it obtains all inputs at once that arrived since its last scheduling, whereas in asynchronous scenarios, these inputs come one by one and have to be processed in several transitions. Thus, the surrounding asynchronous machine stores all inputs internally, until it is asked to perform the transition of its synchronous submachine. It then schedules this submachine with the collected inputs and forwards its outputs. As these asynchronous machines do not produce any clock outputs, the master scheduler can try to simulate the synchronous time by a suitable scheduling strategy. 
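The idea of the "asynchronous coat" can be illustrated by the following sketch. It is not the formal definition of ϕ M : the class name, the port name `p_M?` for the trigger, and the representation of the synchronous submachine as a function `sync_step` are assumptions made for this example.

```python
class AsyncCoat:
    """Wraps a synchronous transition so that inputs arriving one by one
    are stored until the scheduler asks for the actual switch."""

    def __init__(self, sync_step, clock_port="p_M?"):
        self.sync_step = sync_step    # transition of the synchronous submachine
        self.clock_port = clock_port  # hypothetical port used by the scheduler
        self.input_store = {}         # inputs collected since the last switch

    def receive(self, port, message):
        if port == self.clock_port:
            # Perform the synchronous transition on all collected inputs.
            inputs, self.input_store = self.input_store, {}
            return self.sync_step(inputs)
        # Ordinary input: store it and produce no output (and no clock output).
        self.input_store[port] = self.input_store.get(port, "") + message
        return {}


coat = AsyncCoat(lambda inputs: {"out!": inputs.get("in?", "").upper()})
first = coat.receive("in?", "a")
second = coat.receive("in?", "b")
result = coat.receive("p_M?", (1, 1))
```

The wrapper never schedules anything itself, which is what leaves control of the timing to the master scheduler.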
Definition 9 (Mapping ϕ M
Based on this definition, we now formalize the desired mapping ϕ Sys on synchronous systems.
Definition 10 (Mapping ϕ Sys ).
Let an arbitrary synchronous system Sys sync = {(M sync , S sync ) | sync ∈ I } for a finite index set I and a clocking scheme κ be given. We then define
The machine X sync,κ is an explicit master scheduler that has to be added to the considered structure to model the synchronous clocking scheme κ in the asynchronous system. Its ports are given by
The master clock-in port.
Ports for clocking all output ports of the given structure.
-{p ⊳ ! | p? ∈ free(M sync )}: Ports for clocking inputs of the systems (either made by H or A).
Ports for clocking the connection between A and H.
Ports for clocking, i.e., giving control to, each machine.
Internally, it maintains a variable local rnd over {1, . . . , n} and a variable global rnd over N both initialized with 1. For the sake of readability, we describe the behavior of X sync,κ using "for"-loops. This is just a notational convention that should be understood as follows: every time X sync,κ is scheduled, it performs the next step of the loop.
Schedule Current Machines: For all machines
Here, the order of the switched machines can be chosen arbitrarily, with the restriction that output ports of the adversary are scheduled first if A ∈ κ(local rnd ).
3. Switch to next Round: Set local rnd := local rnd + 1. If local rnd > n, set global rnd := global rnd + 1 and local rnd := 1. Go to Phase (1).
In a nutshell, the specific master scheduler simulates the clocking scheme κ by first scheduling the machines that ought to switch in the particular subround (Step 1) and afterwards scheduling all buffers that could be influenced by outputs of these machines (Step 2). Finally, it switches to the next subround (Step 3) and continues with the first step again. Moreover, we define a mapping ϕ conf on synchronous configurations of a system Sys, i.e., configurations which consist of synchronous machines only, by
with X sync,κ given as in ϕ Sys for the particular structure. We will in the following simply write ϕ instead of ϕ Sys , ϕ M , and ϕ conf if its meaning is clear from the context.
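The three phases of the master scheduler can be sketched as a Python generator that yields one clocking action per activation, mirroring the convention that the "for"-loops advance by one step each time the scheduler is scheduled. The action tuples, the dictionary `out_ports`, the adversary name `"A"`, and the choice to always clock the first message of a buffer are assumptions of this sketch, not part of the definition.

```python
from itertools import islice


def master_scheduler(kappa, out_ports):
    """
    kappa:     list of sets of machine names; kappa[j-1] is subround j
    out_ports: machine name -> list of its output ports
    """
    global_rnd = 1
    while True:
        for local_rnd, active in enumerate(kappa, start=1):
            # Arbitrary but fixed order, with the adversary "A" first.
            order = sorted(active, key=lambda m: (m != "A", m))
            for m in order:                  # Phase 1: switch current machines
                yield ("switch", m, (global_rnd, local_rnd))
            for m in order:                  # Phase 2: clock their output buffers
                for p in out_ports.get(m, []):
                    yield ("clock", p, 1)
        global_rnd += 1                      # Phase 3: next global round


actions = list(islice(
    master_scheduler([{"M", "H"}, {"A"}],
                     {"M": ["m!"], "H": ["h!"], "A": ["a!"]}),
    8))
```

Each `switch` action carries the pair of global and local round, which is the information a compression of the asynchronous run back to a synchronous one would need.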
Preliminary Work for the Embedding Theorems
We now have to prove that the function ϕ has the desired properties with respect to simulatability, i.e., ϕ Sys (Sys sync,1 ) ≥ async ϕ Sys (Sys sync,2 ) ⇒ Sys sync,1 ≥ sync Sys sync,2 .
This captures the content of our first embedding theorem. Unfortunately, the converse direction does not hold, but our second embedding theorem will state a weaker version that is still sufficient for our purpose.
Proof Overview
Before we turn our attention to the auxiliary lemmas for the embedding theorems, we present an informal description of the proof of the first embedding theorem as an example. The proof consists of four steps. A graphical illustration is given in Figure 3.
1. Starting with a synchronous configuration conf sync,1 ∈ Conf(Sys sync,1 ), we apply our embedding function ϕ conf which yields an asynchronous configuration conf async,1 ∈ Conf(ϕ Sys (Sys sync,1 )). We now define a mapping φ on the runs of the asynchronous system yielding runs of the synchronous system. Intuitively, φ "compresses" an asynchronous run to its synchronous counterpart, which consists of far fewer steps. We then show in Theorem 1 that view conf sync,1 (H sync ) = φ(view conf async,1 (ϕ(H sync ))).
2. Applying the assumed asynchronous simulatability yields an indistinguishable configuration conf async,2 ∈ Conf(ϕ Sys (Sys sync,2 )), written view conf async,1 (ϕ(H sync )) ≈ view conf async,2 (ϕ(H sync )). We then show that φ(view conf async,1 (ϕ(H sync ))) ≈ φ(view conf async,2 (ϕ(H sync ))).
3. We finally reverse the function ϕ by removing the coating of the user and the machines of the structure. Since we do not know anything about the newly derived adversary A async,2 , i.e., it is not forced to fit the structure imposed by the mapping ϕ, we define a new adversary A sync,2 using A async,2 as a black-box submachine, and we will show in Theorem 2 that φ(view conf async,2 (ϕ(H sync ))) = view conf sync,2 (H sync ).
4. Altogether, transitivity of the relation ≈ implies view conf sync,1 (H sync ) ≈ view conf sync,2 (H sync ).
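The four steps can be summarized as one chain of relations. The following display is only a restatement of the proof overview; the typesetting of the symbols is an assumption of this sketch.

```latex
\begin{align*}
\mathit{view}_{\mathit{conf}_{\mathsf{sync},1}}(\mathsf{H}_{\mathsf{sync}})
  &= \phi\bigl(\mathit{view}_{\mathit{conf}_{\mathsf{async},1}}
       (\varphi(\mathsf{H}_{\mathsf{sync}}))\bigr)
   && \text{(Step 1, Theorem 1)} \\
  &\approx \phi\bigl(\mathit{view}_{\mathit{conf}_{\mathsf{async},2}}
       (\varphi(\mathsf{H}_{\mathsf{sync}}))\bigr)
   && \text{(Step 2, asynchronous simulatability)} \\
  &= \mathit{view}_{\mathit{conf}_{\mathsf{sync},2}}(\mathsf{H}_{\mathsf{sync}})
   && \text{(Step 3, Theorem 2)}
\end{align*}
```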
We first take a look at the runs in a synchronous system Sys sync and in its asynchronous counterpart Sys async := ϕ(Sys sync ). In the following, we will simply write S instead of S sync , because the set of specified ports is not influenced by the mapping ϕ.
Compressing asynchronous runs to synchronous counterparts
In the following, let an arbitrary synchronous system Sys sync with a clocking scheme κ and an arbitrary configuration conf sync = (M sync , S , H sync , A sync ) ∈ Conf(Sys sync ) be given. Moreover, let an asynchronous configuration conf async be given which fits the form conf async = (ϕ(M sync ) ∪ {X sync,κ }, S , ϕ(H sync ), A ′ ) (i.e., ϕ(conf sync ) but with an arbitrary adversary). 4 First of all, note that runs of conf async always have a prescribed structure induced by the behavior of the master scheduler X sync,κ : they are built by "blocks". The steps (M sync , i, j, s, I, s ′ , O) of the machines M sync ∈M sync ∪ {H sync } switched in round [i.j] in the synchronous run are represented by the following two blocks in the asynchronous run.
1. The first block consists of the steps induced by clocking the machines ϕ(M sync ) with M sync ∈ κ(j) and A ′ if A sync ∈ κ(j), i.e., Step (1) in the definition of X sync,κ . More precisely, the block is built by |κ(j)| sub-blocks, one for every switched machine. Every sub-block is built by the following steps.
-The first step of the sub-block is always given by (X sync,κ , s
for two arbitrary states s 1 , s ′ 1 of X sync,κ , i.e., the master scheduler schedules the machine ϕ(M sync ) respectively A ′ .
-After that, we have the transition of the scheduled buffer.
-We now have to distinguish the following two cases:
• If M sync ≠ A sync , there is a step (ϕ(M sync ), s, I p Msync ?=(i,j) , s ′ , δ Msync (input store Msync )) and steps for the receiving buffers.
• If M sync = A sync , we have a step (A ′ , s, I p Async ?=(i,j) , s ′ , O). If O ≠ O ε we have steps for the receiving buffers. If there are nonempty outputs at ports p! and p ⊳ ! (which has to be a self-loop because there are no free clock-in ports in the system), there is furthermore a clocking step for this particular buffer. In this case, the adversary is scheduled again, so this sub-point of the block is repeated until the self-loop of the adversary either ends or it is repeated forever in case of divergence, i.e., we obtain a step (A ′ , s ′ , I ′ , s ′′ , O) where I ′ is now given by I ′ := I p?=O p! and so on.
2. The second block consists of the steps induced by clocking the outgoing messages of the switched machines, i.e., Step (2) in the definition of X sync,κ . Now the buffers of the output ports are switched by the master scheduler. This is done similarly to the first block, with the restriction that output ports of A ′ are clocked first if A sync ∈ κ(j). The block again has |κ(j)| sub-blocks built by the following steps.
-The first step of the sub-block is given by (X sync,κ , s
) for the first output port p! ∈ ports(M sync ) and two arbitrary states s 1 , s ′ 1 of X sync,κ .
-The step of the clocked buffer.
-In case of a nonempty output let M ′ denote the unique machine with p? ∈ ports(M ′ ). We now have to distinguish two cases:
, where I ′ consists of the output of ϕ(M sync ) at p!.
• If M ′ = A ′ , we obtain a step (A ′ , s, I ′ , s ′ , O), where I ′ consists of the output of ϕ(M sync ) respectively A ′ at p!. If O ≠ O ε we have steps for the receiving buffers. If O has a clocked self-loop, we proceed identically to the first block.
-The three previous steps are repeated for every output port of every machine M sync ∈ κ(j).
After this detailed description of the run (i.e., its blocks), the mapping φ can be defined. Informally, it combines the blocks of all machines M sync ∈ κ(j) yielding the synchronous steps of every machine M sync that switches in the j-th subround of the particular global round.
Definition 11. (Mapping φ)
Let an arbitrary synchronous system Sys sync with a clocking scheme κ and an arbitrary configuration conf sync = (M sync , S , H sync , A sync ) ∈ Conf(Sys sync ) be given. For a given asynchronous configuration conf async which fits the form conf async = (ϕ(M sync ) ∪ {X sync,κ }, S , ϕ(H sync ), A ′ ), we define the mapping φ on the runs of conf async by the following algorithm. The algorithm has internal arrays (inputs M,p? ) for M ∈ ϕ(M sync ) ∪ {ϕ(H sync ), A ′ } and p? ∈ in(Ports M ). It goes from block to block modifying them as follows. 
1. Every step of a buffer is deleted from the run.
2. The two remaining steps of the first block are modified as follows. If the scheduled machine is
ϕ(M sync ) = A ′ ,
Note that all necessary information (e.g., M sync , i, j, s, s ′ etc.) is already given by the block except for the inputs of each machine in the synchronous case. At this point, it also becomes clear why we defined the master scheduler to schedule each machine specifically with a tuple (i, j) indicating the current global and local round, since this information would otherwise not be contained in the asynchronous run.
To overcome the absence of the gathered inputs in the run, the algorithm has to collect all "partial" inputs itself in its third step, and it can use this information to calculate the outputs of each machine (although for this, it could as well use the information contained in the run). Moreover, the new blocks built by the mapping φ in one particular subround do not depend on the second block of this subround. The mapping φ is obviously also defined on the view of arbitrary subsets of machines, because the step in the first block, carrying the information of the step, and the message-receiving steps in the second block will also be part of the view of the considered machine. Note further that the mapping φ is explicitly defined for arbitrary adversaries A ′ (not only for ϕ(A sync )), which we will need in Theorem 2. Finally, the following lemma establishes a computational bound on the mapping φ in polynomial-time configurations:
Lemma 1. If conf async is a polynomial-time configuration that fits the form required by Definition 11, then φ applied to the view of the honest user and the adversary is computable in polynomial time. □
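The bookkeeping that φ performs, as described before Lemma 1, can be pictured with a small toy sketch. It is not the paper's algorithm: the representation of a run as a list of dictionaries and the field names `kind`, `machine`, `port`, `msg`, and `round` are assumptions made purely for illustration, and the rewriting of the two remaining steps of the first block is left out. The sketch only shows the deletion of buffer steps and the collection of "partial" inputs per machine and port until that machine is switched with its round tuple (i, j).

```python
from collections import defaultdict

def collapse_run(steps):
    """Toy bookkeeping: drop buffer steps and gather partial inputs.

    `steps` is a hypothetical flat encoding of an asynchronous run.
    Each step is a dict whose "kind" is "buffer", "receive" or "switch".
    """
    inputs = defaultdict(list)   # (machine, port) -> messages collected so far
    blocks = []
    for step in steps:
        kind = step["kind"]
        if kind == "buffer":
            continue             # buffer steps are deleted from the run
        if kind == "receive":
            # collect a "partial" input for the receiving machine
            inputs[(step["machine"], step["port"])].append(step["msg"])
            continue
        # "switch": the machine is scheduled with its round tuple (i, j)
        machine = step["machine"]
        gathered = {}
        for key in [k for k in inputs if k[0] == machine]:
            gathered[key[1]] = inputs.pop(key)
        blocks.append({"machine": machine,
                       "round": step["round"],
                       "inputs": gathered})
    return blocks
```

Each emitted block then contains the information a synchronous step would carry: the machine, the round tuple supplied by the master scheduler, and the inputs gathered since the machine last switched.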

Auxiliary Theorems
The following theorem captures the first step of our proof sketch of Section 4.1.
Theorem 1. Let a synchronous system Sys sync , a clocking scheme κ, and a configuration conf sync = (M sync , S , H sync , A sync ) ∈ Conf(Sys sync ) be given, and set conf async := ϕ(conf sync ). Then
After performing this first step of the proof, asynchronous simulatability can now be applied. In order to convert the derived asynchronous configuration into a synchronous configuration again (cf. Step 3 of our proof sketch), we present the following theorem (again postponing its proof to the Appendix). 
Theorem 2. Let an arbitrary synchronous system
Note that the standard clocking scheme (M ∪ {H}, {A}, {H}, {A}) fulfills the postulated requirement.
The Embedding Theorems
This section contains our two main theorems. We start with a lemma capturing some simple properties of indistinguishable random variables. The lemma is well-known and easily proved. 
Lemma 2 (Indistinguishability). Indistinguishability of two families of random variables implies indistin
where
Using the result of the previous theorems, the proof will be rather simple, cf. the illustration in Figure 3 .
Proof. Let an arbitrary configuration conf sync,1 = (M sync,1 , S , H sync , A sync,1 ) ∈ Conf(Sys sync,1 ) be given.
1. We apply ϕ conf on conf sync,1 yielding a configuration conf async,1 = (ϕ(M sync,1 ) ∪ {X sync,1,κ 1 }, S , ϕ(H sync ), ϕ(A sync,1 )) ∈ Conf(Sys async,1 ). According to Theorem 1, applying the mapping φ to the runs of conf async,1 yields
Moreover, if conf sync,1 is polynomial-time then conf async,1 is also polynomial-time, and the mapping φ is polynomial-time computable.
2. Thus, the precondition ϕ(Sys sync,1 ) ≥ f async ϕ(Sys sync,2 ) can be applied yielding a configuration conf async,2 = (ϕ(M sync,2 ) ∪ {X sync,2,κ 2 }, S , ϕ(H sync ), A async,2 ) ∈ Conf(Sys async,2 ) with
and ϕ(M sync,2 , S ) ∈ f (ϕ(M sync,1 , S )). Moreover, in the computational case, conf async,2 is polynomialtime, so the mapping φ is polynomial-time computable. Using Lemma 2, this yields
3. We now apply Theorem 2 to the configuration conf async,2 , which yields a configuration conf sync,2 = (M sync,2 , S , H sync , A sync,2 ) ∈ Conf(Sys sync,2 ) with φ(view conf async,2 (ϕ(H sync ))) = view conf sync,2 (H sync ).
According to Theorem 2, conf sync,2 is a polynomial-time configuration iff conf async,2 is polynomial-time.
4. Putting it all together, we have
Using Lemma 2, we obtain view conf sync,1 (H sync ) ≈ view conf sync,2 (H sync ). Hence, conf sync,2 is an indistinguishable configuration for conf sync,1 . Moreover, we have
Note that the theorem is applicable to the standard clocking scheme. So far, we have shown that asynchronous simulatability among these asynchronous representations implies synchronous simulatability, i.e., ϕ Sys (Sys sync,1 ) ≥ async ϕ Sys (Sys sync,2 ) ⇒ Sys sync,1 ≥ sync Sys sync,2 .
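The chain of views behind the four proof steps can be written out as follows. This is a sketch assembled from Steps 1 to 3 above, not a reproduction of the paper's own display; here \phi stands for the embedding ϕ and \varphi for the mapping φ on runs.

```latex
\begin{align*}
\mathit{view}_{\mathit{conf}_{\mathrm{sync},1}}(\mathsf{H}_{\mathrm{sync}})
  &= \varphi\bigl(\mathit{view}_{\mathit{conf}_{\mathrm{async},1}}(\phi(\mathsf{H}_{\mathrm{sync}}))\bigr)
     && \text{(Step 1, Theorem 1)} \\
  &\approx \varphi\bigl(\mathit{view}_{\mathit{conf}_{\mathrm{async},2}}(\phi(\mathsf{H}_{\mathrm{sync}}))\bigr)
     && \text{(Step 2, asynchronous simulatability and Lemma 2)} \\
  &= \mathit{view}_{\mathit{conf}_{\mathrm{sync},2}}(\mathsf{H}_{\mathrm{sync}})
     && \text{(Step 3, Theorem 2)}
\end{align*}
```

The middle step is the only place where indistinguishability rather than equality enters, which is why Lemma 2 is needed exactly there.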
We already briefly stated in the previous section that the converse implication does not hold in general. We would have to show that for each configuration conf async,1 ∈ Conf(ϕ Sys (Sys sync,1 )) there exists an indistinguishable configuration conf async,2 ∈ Conf(ϕ Sys (Sys sync,2 )) provided that Sys sync,1 ≥ sync Sys sync,2 . However, both the honest user and the adversary may have clock-out ports, and they can alternately schedule each other (and also schedule the system erratically), which cannot be captured by a fixed synchronous clocking scheme. Hence we cannot exploit our assumption Sys sync,1 ≥ sync Sys sync,2 .
However, it is sufficient for our purpose to show that the claim holds for at least those configurations where the honest user H async fits the form ϕ M (H sync ) for a synchronous machine H sync . We denote this version of simulatability for the restricted class of users by ≥ async,H in the sequel. Looking at the proof of the first embedding theorem, it is immediately clear that the theorem also holds for the weaker precondition ϕ Sys (Sys sync,1 ) ≥ async,H ϕ Sys (Sys sync,2 ), since we only need to derive an indistinguishable configuration for users of the special form ϕ(H sync ), and the user remains unchanged under simulatability. We can now capture the content of the second embedding theorem as 
Deriving Synchronous Theorems from Asynchronous Ones
Recall that our long-term goal is to avoid proving each and every theorem and lemma for both models. We now briefly show how our two embedding theorems can be used for circumventing this problem. One of the most important theorems of both models is transitivity of the relation ≥. 
Lemma 3 (Transitivity
This has been proven in [35] for the synchronous and in [37] for the asynchronous model. We now exemplarily show how to derive the synchronous version from the asynchronous one using our previous results.
Lemma 4 (Asynchronous Version of Transitivity implies Synchronous Version). If the asynchronous version of the transitivity lemma (Lemma 3) has already been proven, then the synchronous version holds as well. □
Proof. We omit the superscripts f i for the sake of readability. Let arbitrary synchronous systems Sys 1 , Sys 2 , and Sys 3 be given such that Sys 1 ≥ sync Sys 2 and Sys 2 ≥ sync Sys 3 . We have to show that Sys 1 ≥ sync Sys 3 holds, provided that asynchronous transitivity has already been proven. According to our second embedding theorem, we know that ϕ(Sys 1 ) ≥ async,H ϕ(Sys 2 ) and ϕ(Sys 2 ) ≥ async,H ϕ(Sys 3 ).
The asynchronous version of transitivity is applicable to the relation ≥ async,H instead of ≥ async as well, since it is a special case only, and the honest user remains unchanged under simulatability. Thus, we can apply our (already proven) asynchronous version of the transitivity lemma, which yields ϕ(Sys 1 ) ≥ async,H ϕ(Sys 3 ).
Now, we use our first embedding theorem in conjunction with its subsequent remarks (stating that the theorem holds as well for the restricted version ≥ async,H of simulatability) yielding Sys 1 ≥ sync Sys 3 .
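The three steps of this proof can be summarized as a chain of implications. The display is a sketch assembled from the text of the proof; as there, the superscripts f_i are omitted, and \phi denotes the embedding ϕ.

```latex
\begin{align*}
\mathit{Sys}_1 \geq_{\mathrm{sync}} \mathit{Sys}_2 \;\wedge\; \mathit{Sys}_2 \geq_{\mathrm{sync}} \mathit{Sys}_3
  &\Longrightarrow \phi(\mathit{Sys}_1) \geq_{\mathrm{async},\mathsf{H}} \phi(\mathit{Sys}_2)
     \;\wedge\; \phi(\mathit{Sys}_2) \geq_{\mathrm{async},\mathsf{H}} \phi(\mathit{Sys}_3)
     && \text{(second embedding theorem)} \\
  &\Longrightarrow \phi(\mathit{Sys}_1) \geq_{\mathrm{async},\mathsf{H}} \phi(\mathit{Sys}_3)
     && \text{(asynchronous transitivity)} \\
  &\Longrightarrow \mathit{Sys}_1 \geq_{\mathrm{sync}} \mathit{Sys}_3
     && \text{(first embedding theorem)}
\end{align*}
```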
This proof technique is applicable to almost all theorems that rely on simulatability. As the most important example, we name the preservation theorem [36, 4] , which states that integrity properties expressed in linear-time logic are preserved under simulatability. The proof of this theorem is difficult and comprises several pages for both models. Using our work, the synchronous proof could as well be omitted. However, this proof technique is unfortunately not immediately applicable to carry over lemmas dealing with composition of systems, since it is not immediately clear what the result of composing two systems with different master schedulers is. This problem can probably be circumvented as follows. First, both master schedulers are combined into an overall scheduler X for the whole system. Secondly, an intermediate system can be defined, where this combined master scheduler is split into two separate machines X 1 and X 2 such that X 1 stays the true master scheduler with the unique master clock-in port clk ⊳ ?, and X 2 is considered as a "slave" master scheduler, i.e., a usual machine that is explicitly given control by X 1 to handle the scheduling demands of "its" system. Finally, our embedding theorems are applicable in this intermediate system, and the resulting schedulers can be composed again into an overall master scheduler. However, formally establishing this result requires additional research. 
A Postponed Definitions
The following definition for indistinguishability of random variables is essentially from [43] . 
(as a function of k). SMALL should be closed under affine addition, and with a function g also contain every function g ′ ≤ g. c) computationally indistinguishable ("≈ poly ") if for every algorithm Dis (the distinguisher) that is probabilistic polynomial-time in its first input,
Intuitively, given the security parameter and an element chosen according to either var k or var ′ k , Dis tries to guess which distribution the element came from. The class NEGL denotes the set of all negligible functions, i.e., g : N → R ≥0 ∈ NEGL if for all positive polynomials Q, ∃k 0 ∀k ≥ k 0 : g(k) ≤ 1/Q(k).
We write ≈ if we want to treat all three cases simultaneously.
◇
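The quantifier structure in the definition of NEGL (for every positive polynomial Q there is a k_0 beyond which g(k) ≤ 1/Q(k)) can be explored numerically. The following sketch is only an illustration: the helper name `threshold` and the cut-off `k_max` are assumptions, and a finite search of this kind can suggest a threshold k_0 but can never prove that a function is negligible.

```python
from fractions import Fraction

def threshold(g, Q, k_max):
    """Smallest k0 such that g(k) <= 1/Q(k) for all k0 <= k <= k_max.

    Returns None if the bound already fails at k_max. Q must return
    positive integers; exact rationals avoid floating-point underflow.
    """
    k0 = None
    for k in range(k_max, 0, -1):
        if g(k) <= Fraction(1, Q(k)):
            k0 = k
        else:
            break
    return k0

# g(k) = 2^-k against Q(k) = k^3: the bound holds from k = 10 onwards.
k0_exp = threshold(lambda k: Fraction(1, 2 ** k), lambda k: k ** 3, 200)

# g(k) = 1/k against Q(k) = k^2: the bound fails at k_max, so no k0 is found.
k0_inv = threshold(lambda k: Fraction(1, k), lambda k: k ** 2, 200)
```

The first function is the standard example of a negligible function; the second is not negligible, because 1/k eventually exceeds 1/Q(k) for every polynomial Q of degree at least two.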
For reasons of completeness, we now present the extended definition of simulatability, based on the three different kinds of indistinguishability. Definition 8 was simplified in the sense that only computational indistinguishability of views was covered, which represents the most common case when applying simulatability to cryptographic protocols. 
B Postponed Proofs
Proof. (Lemma 1) In a polynomial-time configuration, the adversary in particular has to be polynomial-time. This implies that there cannot be any infinite sequence of successive clocked self-loops, so the number of steps of every sub-block is bounded by a polynomial in the security parameter k. Moreover, both the adversary and the honest user reach a final state after a polynomial number of blocks, so the algorithm for φ applied to the view of either the honest user or the adversary only makes a polynomial number of transitions, each one with a polynomial number of steps. This implies that φ is computable in polynomial time when applied to the view of the honest user and the adversary if it is used in a polynomial-time configuration.
Proof. (Theorem 1) Note that the view of ϕ(M sync ) contains only the steps of its internal blackbox function call after being modified by the mapping φ. Thus, it is sufficient to show that the inputs of the blackbox call in conf async and the original inputs of H sync in conf sync are equal. The arrays input store Msync and inputs Msync are always equal whenever the machine M sync is switched, which can be proven by induction over the number of (sub-)rounds. In the first round, both arrays are empty, yielding a correct start of the induction. Starting with the second round, the contents of these arrays are totally determined by the inputs at the ports of M sync . However, these inputs only depend on prior outputs of other machines M . Moreover, these outputs have to be equal because these machines used the same input tuple in both configurations, since we have input store M = inputs M by induction hypothesis. Therefore, the arrays inputs Msync and input store Msync must be equal when the block is replaced, by construction of the algorithm, so δ Msync (s, inputs Msync ) = δ Msync (s, input store Msync ) also holds.
We do not have to worry about the arrangement of the blocks for the following reasons. First of all, note that we first switch all machines in a subround and schedule the outgoing messages afterwards. Moreover, messages sent by the adversary are always scheduled first if the adversary is scheduled in the considered subround. This prevents machines that should switch simultaneously in the synchronous system from influencing each other in the asynchronous system in the same subround. Without this restriction, the adversary would be able to create a message that is scheduled in this particular subround, but nevertheless depends on inputs arriving in this subround.
Putting it all together, the runs induced by the mapping φ in conf async and the original runs are equal by definition of φ, so we finally obtain
view conf sync (M sync ) = φ(view conf async (ϕ(M sync )))
for an arbitrary configuration conf sync ∈ Conf(Sys sync ), conf async := ϕ(conf sync ), and an arbitrary M sync ∈ (M sync ∪ {H sync , A sync }). As a special case, this implies
which finishes our proof.
Proof. (Theorem 2) We first reverse our function ϕ on the structure (ϕ(M sync ) ∪ {X sync,κ }, S ) and on the user ϕ(H sync ) yielding the structure (M sync , S ) of Sys sync,2 and the original honest user H sync . Note that we cannot reverse the function ϕ on the new adversary A async in the same way, because we did not require it to have a similar internal structure. We therefore construct a new adversary A sync for the synchronous configuration as follows. The ports of A sync are given by
i.e., it connects to all remaining free ports of M sync and H sync . Internally, A sync maintains an array (output store p! ) p!∈out(ports(Aasync)) of lists over Σ * , all initially empty. A sync has the adversary A async as a blackbox submachine and its behavior is defined as follows. If A sync is clocked in the synchronous system, it gets an input tuple I = (I p? ) p?∈in(ports(Async)) . It now tries to restore the order in which these messages would have arrived in the asynchronous system. More precisely, it knows the clocking scheme κ, so it knows which machines have been clocked after the last clocking of A sync . Moreover, it knows the order in which machines are switched by X sync,κ in one particular subround. Using the order on the ports of the asynchronous machines, it can finally decide in which order messages sent by one machine on different ports would have arrived in the asynchronous system. The only problem which might arise is that a machine has been clocked more than once since the last clocking of the adversary. This might result in two inputs at the same port of A sync which would be concatenated without any separation symbol. Such an input could not be restored to its original form, so we had to restrict the considered clocking schemes such that every machine and the user are clocked at most once between two successive clockings of the adversary. Note that our usual clocking scheme (M ∪ {H}, {A}, {H}, {A}) fulfills this requirement.
After restoring both the usual messages and their order, A sync uses the blackbox function δ Aasync on the first input yielding an output tuple O. This tuple O is appended to the array output store, i.e., each component O p! is appended to output store p! . If there is a nonempty output c at a clock-out port p ⊳ !, we would have a clocked self-loop in conf async if output store p! [c] ≠ ε. In this case, this component is removed from the array and δ Aasync is called again with the new state and I := I p?=output store p! [c] and so on.
The above steps are repeated with the second input and the new state of A async and so on until all inputs have been considered. Finally, the blackbox function is used with I p Async ?=(i,j) where i denotes the global round and j denotes the subround the adversary is clocked in. This corresponds to the clocking signal of X sync,κ in the asynchronous system. The output tuple is again appended to the same array and possible clocked self-loops are considered again. Finally, A sync outputs the first elements of each list of output store p! with p! C ∈ ports(M sync ∪ {H sync }) as its output tuple O and removes these elements from the lists. Note that this newly defined adversary A sync is polynomial-time iff A async is polynomial-time by construction. Thus, if the original configuration conf async is polynomial-time (i.e., the user ϕ(H sync ) and the adversary A async must be polynomial-time), then the configuration conf sync = (M sync , S , H sync , A sync ) is also polynomial-time, since the runtime of H sync is always bounded by that of ϕ(H sync ).
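The core of this construction, a synchronous wrapper that feeds restored inputs one by one to the asynchronous adversary and releases one stored output per port at each clocking, can be sketched as follows. This is a simplified illustration under several assumptions: the class name `SyncAdversary`, the signature of `delta_async`, and the port name `"clk"` are hypothetical, the inputs are assumed to be already restored to their asynchronous arrival order, and the handling of clocked self-loops is omitted.

```python
from collections import deque

class SyncAdversary:
    """Toy wrapper that runs an asynchronous adversary as a black box."""

    def __init__(self, delta_async, init_state, out_ports, clock_port="clk"):
        # delta_async: (state, port, msg) -> (new_state, {out_port: msg})
        self.delta = delta_async
        self.state = init_state
        self.output_store = {p: deque() for p in out_ports}
        self.clock_port = clock_port

    def _call(self, port, msg):
        # One blackbox transition; nonempty outputs are appended to the store.
        self.state, outputs = self.delta(self.state, port, msg)
        for p, m in outputs.items():
            if m != "":
                self.output_store[p].append(m)

    def clock(self, ordered_inputs, round_id):
        # ordered_inputs: list of (port, msg) in restored arrival order.
        for port, msg in ordered_inputs:
            self._call(port, msg)
        # Final call models the clocking signal carrying the tuple (i, j).
        self._call(self.clock_port, round_id)
        # Output the first element of each list and remove it from the list.
        return {p: (q.popleft() if q else "")
                for p, q in self.output_store.items()}


def echo_delta(state, port, msg):
    # Hypothetical adversary: counts its calls and echoes every input.
    return state + 1, {"out": f"{port}:{msg}"}

adv = SyncAdversary(echo_delta, 0, ["out"])
first = adv.clock([("a", "x"), ("b", "y")], (1, 1))
second = adv.clock([], (1, 2))
```

In this toy run the stored outputs lag behind the inputs, mirroring how messages wait in the outgoing buffers of the asynchronous system until they are clocked.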
A sync "reverses" the function ϕ by construction. The asynchronous adversary would receive many single inputs, and it would produce outputs every time, which would be stored in the outgoing buffers. Possible clocked self-loops are handled by repeated calls of the transition function with correct inputs. If A async is scheduled by X sync,κ , it again performs an arbitrary transition and the first element of its outgoing buffer is clocked. The synchronous adversary first splits its input messages into their original order and applies the blackbox function to them one by one, storing the outputs in output store. The split inputs correspond to the original inputs of the asynchronous system, so the output tuples are also equal after every step. Therefore, the contents of output store always correspond to the outgoing buffers in the asynchronous system after a clocking step of A async . If the synchronous adversary is clocked, it again calls its blackbox function with the correct input and stores the output in the array. After that, it outputs the first element of each list of the array and removes these elements from the lists. In the asynchronous system, messages stored in the outgoing buffers are treated in the same way. More formally, we can show the following lemma.
Lemma 5. We denote this "reversion" of ϕ M by φ M and the reversion of the whole configuration by φ conf for the moment. Then for an arbitrary configuration conf async = (ϕ(M sync ) ∪ {X sync,κ }, S , ϕ(H sync ), A async ) we have
view ϕ conf (φ conf (confasync)) (ϕ(M)) = view confasync (ϕ(M))
for every M ∈ (M sync ∪ {H sync }) and
view ϕ conf (φ conf (confasync)) (A async ) = view confasync (A async ),
where the view of A async in the first configuration is given as a submachine of ϕ M (φ M (A async )). □
Proof. The proof is illustrated in Figure 4 . We first show that A ′ async := ϕ M (φ M (A async )) behaves exactly as A async , i.e., both machines are perfectly indistinguishable for their environment. This is already sufficient to show that the views of ϕ(M) for every M ∈ (M sync ∪ {H sync }) are equal in both configurations because they remain unchanged. We will also show that the view of A async is equal in both configurations, which finishes our proof.
We show that both adversaries A ′ async and A async behave identically between two successive clockings. Moreover, we show that the content of the array output store p! of A ′ async always equals the outgoing buffer p in the corresponding asynchronous configuration at every clocking of A async as a submachine of A ′ async if we identify clockings of A async in both configurations in the natural way (see footnote 7). Furthermore, we show that outputs made by the adversary are always equal in both configurations.
At the start of the run both buffers and arrays are empty, which fulfills our claim. Now assume that A ′ async receives an arbitrary input at p? ≠ p Async ?. It stores the message in its array input store p? and gives the control to the master scheduler. If A ′ async receives a non-empty input at p A ?, it applies the state transition function δ φ M (Aasync) on the arrays input store. Now, the arrays input store are decomposed into single inputs again, preserving their original order, and the function δ Aasync is applied to every such input. Since the inputs are equal in both configurations, we obtain identical outputs, and moreover identical views for A async . By precondition, the arrays output store are mapped to the outgoing buffers. After one call of δ Aasync , every output at p! is stored either in output store p! or in p at the same position, so they remain validly mapped. Now, either the first component of output store p! or the first entry of p for p! C ∈ (ports(M sync ) ∪ {H sync }) is output, yielding identical outputs and therefore identical views for the environment in both configurations, i.e.,
view ϕ conf (φ conf (confasync)) (ϕ(M)) = view confasync (ϕ(M))
for M ∈ (M sync ∪ {H sync }). We already showed that the views of A async are equal in both configurations, which finishes our proof.
According to Lemma 5, the function ϕ conf •φ conf yields identical views for ϕ(M) for every M ∈ (M sync ∪ {H sync }) and the asynchronous adversary, i.e.,
- view ϕ conf (φ conf (confasync)) (ϕ(M)) = view confasync (ϕ(M)) and
- view ϕ conf (φ conf (confasync)) (A async ) = view confasync (A async ).
We already showed in Theorem 1 that view confsync (M) = φ(view ϕ(confsync) (ϕ(M))) holds for every synchronous configuration conf sync = (M sync , S , H sync , A sync ) and for every machine M ∈ (M sync ∪ {H sync , A sync }). If we now set conf sync := φ conf (conf async ), we obtain
- view confsync (M) = φ(view ϕ conf (φ conf (confasync)) (ϕ(M)))
Moreover, this implies
- view confsync (A sync ) = φ(view ϕ conf (φ conf (confasync)) (A async ))
since the views of A async and ϕ(φ(A async )) are identical. We apply the mapping φ to the first two equations and, using Lemma 2, we obtain
- φ(view ϕ conf (φ conf (confasync)) (ϕ(M))) = φ(view confasync (ϕ(M))) and
- φ(view ϕ conf (φ conf (confasync)) (A async )) = φ(view confasync (A async )).
Note that φ is in fact defined on runs of these configurations because both the machines of the structure and the honest user have the prescribed form. Using transitivity, we immediately obtain the desired result view confsync (M) = φ(view confasync (ϕ(M))) and view confsync (A sync ) = φ(view confasync (A async )).
As a special case we set M := H sync which yields view confsync (H sync ) = φ(view confasync (ϕ(H sync ))).

7 More precisely, this means that we identify the i-th clocking of Aasync in confasync with the i-th call of δ Aasync by A ′ async in ϕ conf (φ conf (confasync)).
