Editors' note: Metastability can arise whenever a signal is sampled close to a transition, leading to indecision as to its correct value. Synchronizer circuits, which guard against metastability, are becoming ubiquitous with the proliferation of timing domains on a chip. Despite the critical importance of reliable synchronization, this topic remains inadequately understood. This tutorial provides a glimpse into the theory and practice of this fascinating subject.
Montek Singh (UNC Chapel Hill) and Luciano Lavagno (Politecnico di Torino)
Late Friday afternoon, just before the engineers locked the lab to leave J, the new interplanetary spacecraft, churning cycles all weekend, the sirens went off and the warning lights started flashing red. J had been undergoing fully active system tests for a year, without a single glitch. But now, J's project managers realized, all they had were a smoking power supply, a dead spacecraft, and no chance of meeting the scheduled launch.
All the lab engineers and all J's designers could not put J back together again. They tried every test in the book but couldn't figure out what had happened. Finally, they phoned K, an engineering colleague on the other side of the continent. It took him a bit of time, but eventually K uncovered the elusive culprit: metastability failure in a supposedly good synchronizer. The failure led the logic into an inconsistent state, which turned on too many units simultaneously. That event overloaded the power supply, which eventually blew up. Luckily, it happened in prelaunch tests and not a zillion miles away from Earth.
Into and out of metastability
What is metastability? Consider the crosscut through a vicious miniature-golf trap in Figure 1. Hit the ball too lightly, and it remains where ball 1 is. Hit it too hard, and it reaches position 2. Can you make it stop and stay at the middle position? It is metastable, because even if your ball has landed and stopped there, the slightest disturbance (such as the wind) will make it fall to either side. And we cannot really tell to which side it will eventually fall.
In flip-flops, metastability means indecision as to whether the output should be 0 or 1. Let's consider a simplified circuit analysis model. A typical flip-flop comprises master and slave latches and decoupling inverters. In metastability, the voltage levels of nodes A and B of the master latch are roughly midway between logic 1 ($V_{DD}$) and 0 (GND). Exact voltage levels depend on transistor sizing (by design, as well as due to arbitrary process variations) and are not necessarily the same for the two nodes. However, for the sake of simplicity, assume that they are ($V_A = V_B = V_{DD}/2$).
Entering metastability
How does the master latch enter metastability? Consider the flip-flop in Figure 2a. Assume that the clock is low, node A is at 1, and input D changes from 0 to 1. As a result, node A is falling and node B is rising. When the clock rises, it disconnects the input from node A and closes the A-B loop. If A and B happen to be around their metastable levels, it would take them a long time to diverge toward legal digital values, as Figure 3 shows. In fact, one popular definition says that if the output of a flip-flop changes later than the nominal clock-to-Q propagation delay ($t_{pCQ}$), then the flip-flop must have been metastable. We can simulate the effect by playing with the relative timing of clock and data until we obtain the desired result, as Figure 3 demonstrates. Incidentally, other badly timed inputs to the flip-flop (asynchronous reset, clear, and even too short a pulse of the clock due to bad clock gating) could also result in metastability.
When the coincidence of clock and data is unknown, we use probability to assess how likely the latch is to enter metastability (we focus on the master latch for now and discuss the entire flip-flop later). The simplest model for asynchronous input assumes that data is likely to change at any time with uniform distribution. We can define a short window $T_W$ around the clock's sampling edge (sort of "setup-and-hold time") such that if data changes during that window, the latch could become metastable (namely, the flip-flop output might change later than $t_{pCQ}$). If it is known that data has indeed changed sometime during a certain clock cycle, and since the occurrence of that change is uniformly distributed over clock cycle $T_C$, the probability of entering metastability, which is the probability of D's having changed within the $T_W$ window, is
$$P(\text{enter metastability}) = \frac{T_W}{T_C} = T_W F_C.$$
But D may not change every cycle; if it changes at a rate $F_D$, then the rate of entering metastability becomes $\text{Rate} = F_D F_C T_W$. For instance, if $F_C = 1$ GHz, $F_D = 100$ MHz, and $T_W = 20$ ps, then Rate = 2,000,000 times/s. Indeed, the poor latch enters metastability often, twice per microsecond or once every 500 clock cycles. (Note that we traded probability for rate; we need that in the following discussion.)
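The arithmetic of this example can be checked with a few lines of Python. This is only a minimal sketch: the function name `metastability_entry_rate` is introduced here for illustration, and the numbers are those of the example above.

```python
def metastability_entry_rate(f_d: float, f_c: float, t_w: float) -> float:
    """Rate (events per second) at which the latch enters metastability.

    Implements Rate = F_D * F_C * T_W from the text.
    """
    return f_d * f_c * t_w


F_C = 1e9      # clock frequency, Hz
F_D = 100e6    # rate of data transitions, Hz
T_W = 20e-12   # metastability window around the sampling edge, s

rate = metastability_entry_rate(F_D, F_C, T_W)
cycles_between_events = F_C / rate  # clock cycles per metastability event

print(f"{rate:.3g} events/s, one every {cycles_between_events:.0f} clock cycles")
```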
Exiting metastability
Now that we know how often a latch has entered metastability, how fast does the latch exit from it? In metastability, the two inverters operate at their linear-transfer-function region and can be modeled, using small-signal analysis, as negative amplifiers with gain $-A$ (see Figure 4). Each inverter drives, through its output resistance $R$, a capacitive load $C$ comprising the other inverter's input capacitance as well as any other external load connected to the node. Typically, the master latch becomes metastable and resolves before the second phase of the clock cycle. In rare cases, when the master latch resolves precisely half a cycle after the onset of metastability, the slave latch could enter metastability as a result (its input is changing exactly when its clock disconnects its input, and so on, thereby repeating the aforementioned master-latch scenario). The simple model results in two first-order differential equations that can be combined into one, as follows:
$$\frac{-A V_B - V_A}{R} = C \frac{dV_A}{dt}, \qquad \frac{-A V_A - V_B}{R} = C \frac{dV_B}{dt}.$$
Subtracting the second equation from the first and defining $V = V_A - V_B$ gives
$$\frac{(A - 1)\,V}{R} = C \frac{dV}{dt}, \qquad \text{so that} \qquad V(t) = K e^{t/\tau}, \quad \tau = \frac{RC}{A - 1}.$$
Because $A/R \approx g_m$, we often estimate $\tau = C/g_m$; higher capacitive load on the master nodes and lower inverter gain impede the resolution of metastability. The master latch is exponentially sensitive to capacitance, and different latch circuits often differ mainly on the capacitive load they have to drive. In the past $\tau$ was shown to scale nicely with technology, but new evidence has recently emerged indicating that in future technologies $\tau$ may deteriorate rather than improve.
The voltage difference $V$ thus demonstrates an "explosion" of sorts (like any other physical measure that grows exponentially fast, as in a chemical explosion). This behavior is best demonstrated by circuit simulation of a latch starting from a minute voltage difference $V_0 = 1$ µV (see Figure 5). The voltage curves of the two nodes do not appear to change very fast at all (let alone explode).
However, observing the logarithm of the voltage difference $V$ in Figure 6 reveals a totally different picture. The straight line from the initial voltage $V_0$ up to about $V_1 = 0.1$ V, or $\log(V_1) = -1$ ($V_1$ is approximately the transistor threshold voltage, $V_{TH}$), traverses five orders of magnitude at an exponential growth rate, indicating that the "explosion" actually happens at the microscopic level. As the voltage difference approaches the transistor threshold voltage, the latch changes its mode of operation from two interconnected small-signal linear amplifiers (as in Figure 4) to a typical, slower digital circuit. We say that metastability has resolved as soon as the growth rate of $V$ is no longer exponential (the log curve in Figure 6 flattens off).
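The log chart facilitates a simple estimate of $\tau$. Take the ratio of two arbitrary voltage values $V_x(t_x)$, $V_y(t_y)$ along the straight line and solve for $\tau$. Since $V$ grows as $e^{t/\tau}$ along that line, $V_y / V_x = e^{(t_y - t_x)/\tau}$, hence
$$\tau = \frac{t_y - t_x}{\ln\left(V_y / V_x\right)}.$$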
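As a minimal sketch of this estimate, assume the voltage difference grows as $e^{t/\tau}$ on the straight segment, so that $\tau = (t_y - t_x)/\ln(V_y/V_x)$. The function `estimate_tau` and the synthetic data (`TAU_TRUE`, `V0`) are assumptions made here for illustration; real data would come from a circuit simulation such as the one behind Figure 6.

```python
import math


def estimate_tau(t_x: float, v_x: float, t_y: float, v_y: float) -> float:
    """Estimate tau from two points on the exponential (straight-line) segment."""
    return (t_y - t_x) / math.log(v_y / v_x)


# Synthetic stand-in for simulation data (assumed values, not from the article).
TAU_TRUE = 10e-12   # s
V0 = 1e-6           # V, starting voltage difference


def v_diff(t: float) -> float:
    """Voltage difference at time t under pure exponential growth."""
    return V0 * math.exp(t / TAU_TRUE)


t_x, t_y = 20e-12, 80e-12
tau_est = estimate_tau(t_x, v_diff(t_x), t_y, v_diff(t_y))
print(f"estimated tau = {tau_est * 1e12:.2f} ps")
```

Both sample points must lie on the exponential segment; points taken after the log curve flattens would overestimate $\tau$.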
Actually, several factors affect $\tau$. First, in some circuits it may change during resolution, and the line of Figure 6 is not exactly straight. Process variations might result in $\tau$ several times larger than predicted by simulations. Low supply voltage, especially when the metastability voltage is close to the threshold voltage (and $g_m$ decreases substantially), as well as very high or extremely low temperatures, could increase $\tau$ by several orders of magnitude. These issues make the research of synchronization an interesting challenge with practical implications.
Clearly, if metastability starts with $V = V_0$ and ends when $V = V_1$, then the time to exit metastability is $t_m$:
$$V_1 = V_0\, e^{t_m/\tau} \quad\Rightarrow\quad t_m = \tau \ln\frac{V_1}{V_0}.$$
Thus, the time to exit metastability depends logarithmically on the starting voltage $V_0$ (and not on the fixed exit voltage $V_1$), as Figure 7 clearly demonstrates.
Had we started with $V_0 = V_1$, then the time in metastability would be zero (the leftmost curve in Figure 7). On the other hand, if $V_0 = 0$, we would have waited forever, but this possibility is unlikely. In fact, the claim that "in metastability, the two nodes of the latch get stuck in the middle and would eventually get out of there by some random process," which some researchers and designers often make, should be taken lightly. Two factors affect the actual initial voltage: when exactly the clock's sampling edge blocked the input and closed the latch (specifically, what the actual value of $V$ was at that moment), and how noise might have changed that value. Observe that thermal noise is anywhere from 1 µV to 1 mV, much higher than $V$ in the extreme cases on the right-hand side of Figure 7. Since we don't know $V_0$ deterministically, we don't know how long the latch will stay metastable, but we can provide a statistical estimate. Probabilistic analysis shows that if a latch is metastable at time zero, the probability it will remain metastable at time $t > 0$ is $e^{-t/\tau}$, which diminishes exponentially fast. In other words, even if a latch became metastable, it would resolve pretty fast.
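The two results of this section can be sketched numerically, assuming the exit time is $t_m = \tau \ln(V_1/V_0)$ and the probability of remaining metastable at time $t$ is $e^{-t/\tau}$. The function names and the values of `TAU` and the `v0` sweep are illustrative assumptions, not figures taken from the article's simulations.

```python
import math


def time_to_resolve(tau: float, v0: float, v1: float) -> float:
    """Time for the voltage difference to grow from v0 to v1: tau * ln(v1 / v0)."""
    return tau * math.log(v1 / v0)


def prob_still_metastable(t: float, tau: float) -> float:
    """Probability that a latch metastable at time 0 is still metastable at time t."""
    return math.exp(-t / tau)


TAU = 10e-12   # assumed resolution time constant, s
V1 = 0.1       # exit voltage, roughly the transistor threshold, V

# The smaller the starting difference, the longer the resolution takes,
# but only logarithmically so.
for v0 in (1e-1, 1e-3, 1e-6, 1e-9):
    t_m = time_to_resolve(TAU, v0, V1)
    print(f"V0 = {v0:.0e} V -> t_m = {t_m * 1e12:6.1f} ps")

for multiple in (1, 5, 10, 20):
    p = prob_still_metastable(multiple * TAU, TAU)
    print(f"t = {multiple:2d} tau -> P(still metastable) = {p:.2e}")
```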
Synchronization reliability
All the foregoing leads us to computing reliability. If a latch receives asynchronous inputs, we can't guarantee that it will never become metastable; in fact, we already know that it will definitely become metastable at the high rate of $F_D F_C T_W$ (2 million times/s in the preceding example). Instead, we can compute the probability that it will fail as a result of entering metastability. The synchronizer's whole purpose is to minimize that failure probability. Now we can finally define and estimate synchronization failures: we want the metastability to resolve within a synchronization period $S$ so that we can safely sample the output of the latch (or flip-flop). Failure means that a flip-flop became metastable after the clock's sampling edge, and that it is still metastable $S$ time later. The two events are independent, so we can multiply their probabilities:
$$P(\text{failure}) = P(\text{enter metastability}) \times P(\text{still metastable after } S) = T_W F_C \, e^{-S/\tau}.$$
[Figure caption fragment: the bottom chart shows the log of the voltage difference; the line starting at $-6$ is the same as in Figure 6.]
Now we can take advantage of the expression for the rate of entering metastability computed previously to derive the rate of expected failures:
$$\text{Rate(failures)} = T_W F_C F_D \, e^{-S/\tau}.$$
The inverse of the failure rate is the mean time between failures (MTBF):
$$\text{MTBF} = \frac{e^{S/\tau}}{T_W F_C F_D}.$$
Let's design synchronizers with MTBF that is many orders of magnitude longer than the expected lifetime of a given product. For instance, consider an ASIC designed for a 28-nm high-performance CMOS process. We estimate $\tau = 10$ ps, $T_W = 20$ ps (experimentally, we know that both parameters are close to the typical gate delay of the process technology), and $F_C = 1$ GHz. Let's assume that data changes every 10 clock cycles at the input of our flip-flop, and we allocate one clock cycle for resolution: $S = T_C$. Plug all these into the formula and we obtain $4 \times 10^{29}$ years. (This figure should be quite safe: the universe is believed to be only $10^{10}$ years old.)
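The 28-nm example can be reproduced with a short calculation. This sketch assumes the MTBF expression $e^{S/\tau}/(T_W F_C F_D)$, which yields the figure quoted above; the function name `mtbf_seconds` and the year length constant are choices made here for illustration.

```python
import math

SECONDS_PER_YEAR = 365.25 * 24 * 3600


def mtbf_seconds(s: float, tau: float, t_w: float, f_c: float, f_d: float) -> float:
    """Mean time between synchronization failures, in seconds."""
    return math.exp(s / tau) / (t_w * f_c * f_d)


F_C = 1e9            # clock frequency, Hz
F_D = F_C / 10       # data changes every 10 clock cycles
TAU = 10e-12         # resolution time constant, s
T_W = 20e-12         # metastability window, s
S = 1 / F_C          # one clock cycle allocated for resolution

mtbf_years = mtbf_seconds(S, TAU, T_W, F_C, F_D) / SECONDS_PER_YEAR
print(f"MTBF = {mtbf_years:.1e} years")
```

Because $S/\tau$ sits in the exponent, halving the resolution time or doubling $\tau$ reduces the MTBF by dozens of orders of magnitude, which is why the numbers later in the article change so dramatically under low voltage and high temperature.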
What happens at the flip-flop during metastability, and what can we see at its output? It's been said that we can see a wobbling signal that hovers around half $V_{DD}$, or that it can oscillate. Well, this is not exactly the case. If node A in Figure 2 is around $V_{DD}/2$, the chance that we can still see the same value at Q, three inverters later (or even one inverter, if the slave latch is metastable) is practically zero. Instead, the output will most likely be either 0 or 1, and as $V_A$ resolves, the output may (or may not) toggle at some later time. If indeed that toggle happens later than the nominal $t_{pCQ}$, then we know that the flip-flop was metastable. And this is exactly what we want to mask with the synchronizer.
Two-flip-flop synchronizer
Figure 8 shows a simple two-flip-flop synchronization circuit (we don't call it a synchronizer yet; that comes later). The first flip-flop (FF1) could become metastable. The second flip-flop (FF2) samples Q1 a cycle later; hence $S = T_C$. Actually, any logic and wire delays along the path from FF1 to FF2 are subtracted from the resolution time: $S = T_C - t_{pCQ}(\text{FF1}) - t_{SETUP}(\text{FF2}) - t_{PD}(\text{wire})$, and so on. A failure means that Q2 is unstable (it might change later than $t_{pCQ}$), and we know how to compute MTBF for that event. But what really happens inside the circuit? Consider Figure 9, in which D1 switches dangerously close to the rising clock. Any one of six outcomes could result:
(a) Q1 could switch at the beginning of clock cycle 1 and Q2 will copy that on clock cycle 2.
(b) Q1 could completely miss D1. It will surely rise on cycle 2, and Q2 will rise one cycle later.
(c) FF1 could become metastable, but its output stays low. It later resolves so that Q1 rises (the bold rising edge). This will happen before the end of the cycle (except, maybe, once every MTBF years). Then Q2 rises in cycle 2.
(d) FF1 could become metastable, its output stays low, and when it resolves, the output still stays low. This appears the same as case (b). Q1 is forced to rise in cycle 2, and Q2 rises in cycle 3.
(e) FF1 goes metastable, and its output goes high. Later, it resolves to low (we see a glitch on Q1). By the end of cycle 1, Q1 is low. It rises in cycle 2, and Q2 rises in cycle 3.
(f) FF1 goes metastable, its output goes high, and it later resolves to high. Q1 appears the same as case (a). Q2 rises in cycle 2.
The bottom line is that Q2 is never metastable (except, maybe, once every MTBF years). Q2 goes high either one or two cycles later than the input. The synchronization circuit exchanges the "analog" uncertainty of metastability (continuous voltage levels changing over continuous time) for a simpler "digital" uncertainty (discrete voltage levels switching only at uncertain discrete time points) of whether the output switches one or two cycles later. Other than this uncertainty, the output signal is a solid, legal digital signal.
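The six outcomes reduce to a single digital question: is Q1 high or low at the end of cycle 1? The sketch below encodes that reading of the cases. The table and the function `q2_rise_cycle` are illustrative constructs introduced here, and the model assumes, as the text does, that FF1 resolves within the cycle and that D1 is still high at the cycle-2 clock edge.

```python
# Value of Q1 at the end of cycle 1 for each outcome (a)-(f) described above.
Q1_HIGH_AT_END_OF_CYCLE_1 = {
    "a": True,    # Q1 switches at the beginning of cycle 1
    "b": False,   # FF1 misses D1 entirely
    "c": True,    # metastable, output low, resolves high
    "d": False,   # metastable, output low, resolves low
    "e": False,   # metastable, output glitches high, resolves low
    "f": True,    # metastable, output high, resolves high
}


def q2_rise_cycle(case: str) -> int:
    """Clock cycle in which Q2 rises for the given outcome."""
    # If Q1 is low at the end of cycle 1, FF1 samples the (still high) D1
    # on the next edge, so Q1 rises in cycle 2 instead of cycle 1.
    q1_rise = 1 if Q1_HIGH_AT_END_OF_CYCLE_1[case] else 2
    # FF2 copies Q1 one clock cycle later.
    return q1_rise + 1


latencies = {case: q2_rise_cycle(case) for case in "abcdef"}
print(latencies)
```

Whatever happens inside FF1, the downstream logic only ever observes Q2 rising in cycle 2 or in cycle 3.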
What does happen when it really fails? Well, once every MTBF years, FF1 becomes metastable and resolves exactly one clock cycle later. Q1 might then switch exactly when FF2 samples it, possibly making FF2 metastable. Is this as unrealistic as it sounds? No. Run your clocks sufficiently fast, and watch for meltdown! Or continue reading to find out how to fight the odds.
A word of caution: the two flip-flops should be placed near each other, or else the wire delay between them would detract from the resolution time S. Missing this seemingly minor detail has made quite a few synchronizers fail unexpectedly.
This, however, is only half the story. To assure correct operation, we assume in Figure 9 that D1 stays high for at least two cycles (in cases b, d, e) so that FF1 is guaranteed to sample 1 at its input on the rising clock of cycle 2. How would the sender know how long D1 must be kept high? We have no idea how fast the sender clock is ticking, so we can't simply count cycles. To solve that, the receiver must send back an acknowledgment signal. Figure 10a shows a complete synchronizer. The sender sends req (also known as request, strobe, ready, or valid), req is synchronized by the top synchronization circuits, the receiver sends ack (or acknowledgment, or stall), ack is synchronized by the sender, and only then is the sender allowed to change req again. This roundtrip handshake is the key to correct synchronization. Now we can add data that needs to cross over, as in Figure 10b . The sender places data on the bus going to the right, and raises req. Once the receiver gets wind of req (synchronized to its clock), it stores the data and sends back ack. It could also send back data on the bus going to the left. When the sender receives ack, it can store the received data and also start a new cycle.
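The roundtrip handshake can be sketched as a cycle-level simulation of two clock domains. This is a simplified behavioral model under stated assumptions, not the circuit of Figure 10: the function `simulate_handshake`, the clock periods, and the phase offset are hypothetical, each control signal passes through a two-flip-flop chain, req and ack return to zero after every transfer, and metastability itself is not modeled.

```python
def simulate_handshake(words, t_send=7, t_recv=10, recv_phase=3, max_edges=10_000):
    """Move `words` from the sender clock domain to the receiver clock domain.

    Returns (received_words, time_of_last_processed_edge).
    """
    req = ack = 0          # req is driven by the sender, ack by the receiver
    req_ff = [0, 0]        # receiver-side two-flip-flop chain for req
    ack_ff = [0, 0]        # sender-side two-flip-flop chain for ack
    bus = None             # data bus; changes only while req is low
    pending = list(words)
    received = []
    t_s, t_r = 0, recv_phase   # times of the next sender / receiver clock edges
    now = 0

    for _ in range(max_edges):
        if len(received) == len(words):
            return received, now
        if t_s <= t_r:                      # sender clock edge
            now = t_s
            ack_seen = ack_ff[1]            # synchronizer output before this edge
            ack_ff = [ack, ack_ff[0]]       # shift the chain
            if req == 0 and ack_seen == 0 and pending:
                bus = pending.pop(0)        # place data, then raise req
                req = 1
            elif req == 1 and ack_seen == 1:
                req = 0                     # ack has arrived: lower req
            t_s += t_send
        else:                               # receiver clock edge
            now = t_r
            req_seen = req_ff[1]
            req_ff = [req, req_ff[0]]
            if req_seen == 1 and ack == 0:
                received.append(bus)        # data has been stable: store it
                ack = 1
            elif req_seen == 0 and ack == 1:
                ack = 0                     # req was lowered: lower ack
            t_r += t_recv
    raise RuntimeError("handshake did not complete")


received, finish_time = simulate_handshake(["w0", "w1", "w2"])
print(received, finish_time)
```

Note that the sender never counts cycles: it changes the bus and req only in response to the synchronized ack, so the transfer works for any ratio of the two clock periods.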
Note that the synchronizer doesn't synchronize the data; rather, it synchronizes the control signals. Attempts to synchronize the data bit by bit usually lead to catastrophic results; even if all data lines toggle simultaneously, some bits might pass through after one cycle, while others might take two cycles because of metastability. Beware: that's a complete loss of data. Another forbidden practice is to synchronize the same asynchronous input by two different parallel synchronizers; one might resolve to 1 while the other resolves to 0, leading to an inconsistent state. In fact, that was the problem that grounded the J spacecraft...
The two-flip-flop synchronizer comes in many flavors. As Figure 11a shows, when using slow clocks, resolution of less than half a cycle could suffice in the receiver side. In other cases, two flip-flops might not be enough. The clock may be fast (e.g., on processors that execute faster than 1 GHz), the supply voltage may go very low (especially in near-threshold designs), and the temperature may rise above 100 °C or drop far below freezing (for example, in a phone that will be used outdoors on a Scandinavian winter night or in a chip on the outside of an aircraft). For instance, if $S = T_C$, $F_C = 1$ GHz, $F_D = 1$ kHz (now we're being conservative), and due to low voltage and high temperature $\tau = 100$ ps and $T_W = 200$ ps, the MTBF is about one minute. Three flip-flops (the sender side in Figure 11b) would increase the MTBF a bit, to about one month. But if we use four flip-flops, $S = 3T_C$ and the MTBF jumps to 1,000 years. Caution and careful design is the name of the game here.
Unique flip-flops designed especially for synchronization are more robust to variations in process, voltage, and temperature. Some use current sources to enhance the inverter gain; others sample multiple times and actively detect when synchronization is successful. The avid designer with the freedom to use nonstandard circuits can take advantage of such inventions, but typical ASIC and FPGA designers are usually constrained to using only standard flip-flops and will have to follow the usual, well-beaten path.
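The effect of adding flip-flops can be sketched with the same MTBF expression, $e^{S/\tau}/(T_W F_C F_D)$, under the assumption that a chain of $n$ flip-flops allows $S = (n - 1)\,T_C$ for resolution and that wire and logic delays are ignored. The function name and that simplification are choices made here; the results land in the same range as the rounded figures quoted above.

```python
import math

SECONDS_PER_DAY = 24 * 3600
SECONDS_PER_YEAR = 365.25 * SECONDS_PER_DAY


def mtbf_for_chain(n_flops: int, f_c: float, f_d: float, tau: float, t_w: float) -> float:
    """MTBF in seconds for an n-flip-flop chain, assuming S = (n - 1) clock cycles."""
    s = (n_flops - 1) / f_c
    return math.exp(s / tau) / (t_w * f_c * f_d)


# Harsh-conditions example from the text.
F_C, F_D = 1e9, 1e3
TAU, T_W = 100e-12, 200e-12

mtbf_2 = mtbf_for_chain(2, F_C, F_D, TAU, T_W)
mtbf_3 = mtbf_for_chain(3, F_C, F_D, TAU, T_W)
mtbf_4 = mtbf_for_chain(4, F_C, F_D, TAU, T_W)

print(f"2 flip-flops: {mtbf_2 / 60:.1f} minutes")
print(f"3 flip-flops: {mtbf_3 / SECONDS_PER_DAY:.0f} days")
print(f"4 flip-flops: {mtbf_4 / SECONDS_PER_YEAR:.0f} years")
```

Each additional flip-flop multiplies the MTBF by $e^{T_C/\tau}$, here $e^{10}$, or about 22,000.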
Another cause for concern is the total number of synchronizers in the design, be it a single chip or a system comprising multiple ASICs. MTBF decreases roughly linearly with the number of synchronizers. Thus, if your system uses 1,000 synchronizers, you should be sure to design each one for MTBF at least three orders of magnitude higher than your reliability target for the entire system.
Similar concepts of synchronization are used for signals other than data that cross clock domains. Input signals might arrive at an unknown timing. The trailing edge of the reset signal and of any asynchronous inputs to flip-flops are typically synchronized to each clock domain in a chip. Clock-gating signals are synchronized to eliminate clock glitches when the clocks are gated or when a domain switches from one clock to another. Scan test chains are synchronized when crossing clock domains. These applications are usually well understood and are well supported by special EDA tools for physical design.
The key issues, as usual, are latency and throughput. It may take two cycles of the receiver clock to receive req, two more cycles of the sender clock to receive ack, and possibly one more on each side to digest its input and change state. If req and ack must be lowered before new data can be transferred, consider another penalty of 3 + 3 cycles. (No wonder, then, that we used $F_D$ much lower than $F_C$ in the previous examples.) This slow pace is fine for many cases, but occasionally we want to work faster. Luckily, there are suitable solutions.
Two-clock FIFO synchronizer
The most common fast synchronizer uses a two-clock FIFO buffer as shown in Figure 12. Its advantages are hard to beat: you don't have to design it (it's typically available as a predesigned library element or IP core), and it's (usually) fast. The writer places data on the input bus and asserts wen (write enable); if full is not asserted, the data was accepted and stored. The reader asserts ren (read enable), and if empty is not asserted then data was produced at the output.
The RAM is organized as a cyclic buffer. Each data word is written into the cell pointed to by the write pointer and is read out when the read pointer reaches that word. On write and on read, the write pointer and the read pointer are respectively incremented. When the read pointer points to the same word as the write pointer, the FIFO buffer is empty. To determine that, the two pointers must be compared. However, they belong to two different clock domains; thus, the write pointer has to be synchronized with rclk (read clock) when compared (on the right in Figure 12). That's where the synchronization is; it's applied to the pointers, rather than to the data.
That's also where latency is incurred. When a new data word is written into an empty FIFO buffer, it might take one or two additional rclk cycles before the new write pointer passes through the synchronizer and deasserts empty. But when the two pointers are far from each other, no synchronization latency is incurred; data latency is still there. When the RAM holds k words, a newly inserted word will stay there for at least k rclk cycles before it is read out. Incidentally, the pointers are usually maintained in Gray code so that only a single bit at a time changes in the synchronizer.
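The Gray-code property mentioned above is easy to demonstrate. The following sketch uses the standard reflected binary Gray code; the helper names `to_gray` and `from_gray` and the depth of 8 are assumptions for illustration. It checks that consecutive pointer values, including the wraparound of a power-of-two-deep buffer, differ in exactly one bit, so a synchronizer sampling the pointer mid-change sees either the old or the new value and never an unrelated one.

```python
def to_gray(n: int) -> int:
    """Convert a binary count to reflected binary Gray code."""
    return n ^ (n >> 1)


def from_gray(g: int) -> int:
    """Convert a Gray-code value back to its binary count."""
    n = 0
    while g:
        n ^= g
        g >>= 1
    return n


DEPTH = 8  # FIFO depth; a power of two keeps the wraparound single-bit as well
codes = [to_gray(i) for i in range(DEPTH)]

# Number of bits that change between each pointer value and its successor.
bit_changes = [
    bin(codes[i] ^ codes[(i + 1) % DEPTH]).count("1") for i in range(DEPTH)
]

print([format(c, "03b") for c in codes])
print(bit_changes)
```

A plain binary pointer would not have this property: incrementing from 3 (011) to 4 (100) flips three bits, and a sample taken during the change could capture any of eight values.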
Asynchronous Design
The FIFO solution usually works. It is nontrivial to design, but it's often available in libraries and elsewhere. The key question for the user of a library FIFO buffer is how large the RAM should be (how deep the buffer should be). The common approach says "when in doubt, double it." You might think that the life of the FPGA designer is less complex here: simply use trial and error. However, an FPGA has occasionally failed in mission because of a too-short FIFO buffer. Care is needed here.
The two-clock FIFO synchronizer, as well as a mixed-timed FIFO synchronizer interfacing a clock domain to an asynchronous (clockless) circuit, are used in a network on chip (NoC) that offers connectivity among many modules on a chip and also assumes full responsibility for synchronization. This is a refreshing idea: let someone else (the NoC vendor or designer) integrate your chip and take care of all clock domain crossings.
There are other fast synchronizers, which require a higher design effort than simply using the common FIFO synchronizer. The faster they are, the more complex the circuit. Most cases can be solved effectively with a good FIFO synchronizer. Two special cases are discussed next: one involves mesochronous clock domains; the other, synchronizing over long distance.
Mesochronous, multisynchronous, periodic, and rational synchronizers
Two mesochronous clock domains tick to the same frequency, but their relative phase is unknown in advance. They are typically driven by the same clock, but no attempt is made to balance the two clock trees relative to each other (such balancing might incur a heavy penalty in area and power). Once started, their relative phase remains stable. Figure 13 shows a common example: input X is sampled by each of the three registers in turn, and the oldest available sample is channeled to the output. The key question is how to set up the two counters, depending on the relative phase. The previously discussed two-clock FIFO synchronizer (with at least four stages) can also do the job. It should incur a one- or two-cycle synchronization latency at start-up, but thereafter the data latency is the same as in Figure 13. As an added advantage, the two-clock FIFO synchronizer enables back pressure; when the receiver stops pulling data out, the sender is signaled full and can stall the data flow.
It turns out that mesochronous clock domains are not always mesochronous. The paths taken by a global clock to the various domains may suffer delay changes during operation, typically due to temperature and voltage changes. These drifts are typically slow, spanning many clock cycles. This could lead to domains operating at the same frequency but at slowly changing relative phases. Such a relationship is termed multisynchronous, to distinguish this case from mesochronous operation. Synchronizers for multisynchronous domains need to continuously watch out for phase drifts and adapt to them. Figure 14 shows a conflict detector, which identifies when the sender and receiver clocks, xclk and rclk, are dangerously within one $\delta$ of each other (see the waveform on the right-hand side of Figure 14). A useful value of $\delta$ is at least a few gate delays, providing a safe margin.
The synchronizer (see Figure 15) delays the clock of the first receiver register by $t_{KO}$ (keep-out delay) if and only if xclk is within $\delta$ of rclk, as demonstrated by the waveform. This adjustment is made insensitive to any metastability in the conflict detector because the phase drift is known to be slow. Typically, the delay is changed only if the conflict detector has detected a change for a certain number of consecutive cycles, to filter out back-and-forth changes when xclk hovers around rclk $\pm\,\delta$. As before, the designer should also consider whether a simpler two-clock FIFO synchronizer could achieve the same purpose. Incidentally, in addition to on-chip clock domain crossings, multisynchronous domains exist in phase-adaptive SDRAM access circuits and in clock and data recovery circuits in high-speed serial link serializer/deserializer (SerDes) systems.
A similar keep-out mechanism could be applied when synchronizing periodic clock domains. Periodic clocks are unrelated to each other: they are neither mesochronous nor are their frequencies an integral multiple of each other. Hence, we can expect that every few cycles the two clocks might get dangerously close to each other. But the conflict detector of Figure 14 is too slow to detect this on time (it could take $k + 2$ cycles to resolve and produce the unsafe signal). Luckily, since the clock frequencies are stable, we can predict such conflicts in advance. A number of predictive synchronizers have been proposed, but they tend to be complex, especially in light of the fact that the two-clock FIFO synchronizer might be suitable.
Another similar situation is that of rational clocks, wherein the two frequencies are related by a ratio known at design time (e.g., 1:3 or 5:6). In that case, determining danger cycles is simpler than for periodic clocks with unknown frequencies, and a simple logic circuit could be used to control the clock delay selector of Figure 15 .
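To illustrate why rational clocks make the danger cycles predictable, the sketch below enumerates, for one repeating pattern of the two clocks, which receiver edges fall within a margin of a sender edge. Everything here is a hypothetical model: the function `danger_cycles`, the normalization of the common repetition period to 1, the interpretation of the ratio as edges per repetition period, and the fixed `phase` offset are assumptions made for the example, not part of the circuits in Figures 14 and 15.

```python
from fractions import Fraction


def danger_cycles(n_send: int, n_recv: int, delta: Fraction,
                  phase: Fraction = Fraction(0)) -> list:
    """Indices of receiver edges lying within `delta` of any sender edge.

    The common repetition period is normalized to 1. The sender has n_send
    evenly spaced edges in it (offset by `phase`), the receiver has n_recv.
    """
    send_edges = [(phase + Fraction(i, n_send)) % 1 for i in range(n_send)]
    flagged = []
    for j in range(n_recv):
        r = Fraction(j, n_recv)
        # Cyclic distance from this receiver edge to the nearest sender edge.
        dist = min(min((r - s) % 1, (s - r) % 1) for s in send_edges)
        if dist <= delta:
            flagged.append(j)
    return flagged


# Frequencies in a 5:6 ratio, with a keep-out margin of 1/30 of the pattern.
unsafe = danger_cycles(5, 6, Fraction(1, 30))
print(unsafe)
```

Because the pattern repeats exactly, the list of unsafe cycle indices is fixed once the phase is known, and a small counter-based circuit could in principle select the delayed clock on just those cycles.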
Different situations call for specific synchronizers that might excel given certain design parameters, but the conservative designer might well opt for the simpler, safer, commonly available two-clock FIFO synchronizer described earlier.
Long-distance synchronization
What is a long distance, and what does it have to do with synchronizers? When we need to bridge the frequency gap between two clock domains placed so far apart that the signals take close to a clock cycle or even longer to travel between them, we face a new risk. The designer can't rely on counting cycles when waiting for the signal to arrive: process, voltage, and temperature variations as well as design variations (such as actual floor plan or placement and routing) might result in an unknown number of cycles for traversing the interconnecting wires.
The simplest (and slowest) approach is to stretch the simple synchronizers of Figure 10 over the distance. It's slow because, when using return-to-zero signaling on req, four flight times over the distance are required before the next data word can be sent. We should guarantee (for example, by means of timing constraints) that when req has been synchronized (and the receiver is ready to sample its data input), the data word has already arrived. Such a procedure is not trivial when req and the data wires are routed through completely different areas of the chip. This safety margin requirement usually results in even slower operation.
Using fast asynchronous channels somewhat mitigates the performance issue. Data bits are sent under the control of proper handshake protocols, and they're synchronized when reaching their destination. The downside is the need for special IP cores, because asynchronous design is rarely practiced and isn't supported by common EDA tools.
Although the physical distance bounds the latency, throughput over long channels can be increased if we turn them into pipelines. But multistage pipelines require clocks at each stage, and it's not clear which clocks should be used in a multiclock-domain chip. When the data sent from Clock1 to Clock10 is routed near the areas of Clock2, Clock3, ..., Clock9 (all unrelated to either Clock1 or Clock10), which clocks do we use along the road? Some designs have solved it simply by clocking the pipes at the fastest frequency available on chip, but that solution is power hungry.
The ultimate solution might lie in employing a NoC. The network infrastructure is intended to facilitate multiple transfers among multiple modules, over varying distances, to support varying clock frequencies. Asynchronous and synchronizing NoCs have been devised to address these issues and especially to provide complete synchronization while interfacing each source and destination module.
Verification
Since the design of proper synchronization is such an elusive goal, verification is essential. But, for the same reason, verification is difficult and unfortunately doesn't always guarantee a correct solution.
Circuit simulations, such as that shown in Figure 3, are useful for analyzing a single synchronizer but ineffective in proving that many synchronizations in a large SoC would all work correctly. An interesting simulation-based verification method has been developed at the logic level. Recall from the two-flip-flop synchronizer discussion that a good synchronizer should contain all level and timing uncertainties, replacing them with the simpler logic uncertainty of when a signal crosses over: it could happen in one cycle or in the next one. Assuming that we have selected a good synchronizer, we can facilitate logic verification by replacing the synchronizer with a special synchronous delay block that inserts a delay of either $k$ or $k + 1$ cycles at random. Although this approach leads to a design space of at least $2^n$ cases if there are $n$ synchronization circuits, it is still widely used and is effective in detecting many logic errors (but not all of them; it wouldn't have helped the J spacecraft designers, for instance).
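The random-delay substitution can be sketched at the cycle level as follows. This is an illustrative model rather than any particular tool's implementation: the function `random_delay_sync` is a name introduced here, it works on a list of per-cycle input values, and it assumes the input holds each value for at least two cycles (as a handshake would ensure) so that delayed transitions never reorder.

```python
import random


def random_delay_sync(samples, k, rng):
    """Cycle-level stand-in for a synchronizer in logic simulation.

    Each input transition appears at the output k or k + 1 cycles later,
    chosen at random, mimicking the one-cycle uncertainty of a real synchronizer.
    """
    out = []
    current = samples[0]    # output starts at the initial input level
    prev = samples[0]
    scheduled = {}          # cycle -> value that becomes visible in that cycle
    for cycle, value in enumerate(samples):
        if value != prev:
            scheduled[cycle + rng.choice((k, k + 1))] = value
            prev = value
        if cycle in scheduled:
            current = scheduled.pop(cycle)
        out.append(current)
    return out


# Input rises at cycle 3; with k = 2 the output rises at cycle 5 or 6.
stimulus = [0] * 3 + [1] * 10
response = random_delay_sync(stimulus, k=2, rng=random.Random(0))
print(stimulus)
print(response)
```

Running the surrounding logic simulation many times with different seeds exercises both arrival times at every crossing, which is how this method exposes logic that silently depends on a fixed synchronizer latency.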
There are several EDA verification software tools, commonly called clock-domain-crossing (CDC) checkers, which identify and check all signals that cross from one domain to another. Using a variety of structural design rules, they are helpful in assuring that no such crossing remains unchecked. Some assess overall SoC reliability in terms of MTBF. At least one tool also suggests solutions for problematic crossings in terms of synchronizer IP cores.
THIS SHORT TUTORIAL has been an attempt to present both the beauty and the criticality of the subject of metastability and synchronizers. For more than 65 years, many researchers and practitioners have shared the excitement of trying to crack this tough nut. The elusive physical phenomena, the mysterious mathematical treatment, and the challenging engineering solutions have all contributed to making this an attractive, intriguing field. The folklore, the myths, the rumors, and the horror stories have added a fun aspect to a problem that has been blamed for the demise of several products and the large financial losses that resulted. Fortunately, with a clear understanding of the risks, respect for the dangers, and a strict engineering discipline, we can avoid the pitfalls and create safe, reliable, and profitable digital systems and products.
pointed at the science of metastability, and Peter Alfke stressed the engineering of it. The comments of Charles Dike, Cliff Cummings, Shalom Bresticker, Shlomi Beer, Reuven Dobkin, and Richard Katz helped to significantly improve this writing. The cumulative experience and know-how of hundreds of engineers, students, and colleagues taught me the good and the bad of metastability and convinced me that this is a delightful subject.
Ran Ginosar is an associate professor of electrical engineering and computer science at the Technion-Israel Institute of Technology. His research interests include synchronization, asynchronous VLSI, and VLSI architecture for parallel computing. He has a PhD in electrical engineering and computer science from Princeton University. He is a senior member of IEEE.
