Abstract-Stochastic switching circuits are relay circuits that consist of stochastic switches called pswitches. The study of stochastic switching circuits has widespread applications in many fields of computer science, neuroscience, and biochemistry. In this paper, we discuss several properties of stochastic switching circuits, including robustness, expressibility, and probability approximation.
I. INTRODUCTION

In his master's thesis of 1938, Claude Shannon demonstrated how Boolean algebra can be used to synthesize and simplify relay circuits, establishing the foundation of modern digital circuit design [12]. Later, deterministic switches were replaced with probabilistic switches to make stochastic switching circuits, which were studied in [15]. A few features of stochastic switching circuits make them very similar to neural systems. First, randomness is inherent in neural systems, and it may play a crucial role in thinking and reasoning. Switching (and relaying) techniques provide a natural way of manipulating this randomness. Second, in a switching system, each switch can be treated as either a memory element or a control element for computing. This might enable the creation of an intelligent system where storage and computing are highly integrated. In this paper, we study stochastic switching circuits from a basic starting point, focusing on probability synthesis. We consider two-terminal stochastic switching circuits, where each probabilistic switch, or pswitch, is closed with some probability chosen from a finite set of rational numbers, called a pswitch set. By selecting pswitches with different probabilities and composing them in appropriate ways, we can realize a variety of different closure probabilities.
Formally, for a two-terminal stochastic switching circuit C, the probabilities of pswitches are taken from a fixed pswitch set S, and all these pswitches are open or closed independently. We use P (C) to denote the probability that the two terminals of C are connected, and call P (C) the closure probability of C. Given a pswitch set S, a probability x can be realized if and only if there exists a circuit C such that x = P (C). Based on the ways of composing pswitches, we have series-parallel (sp) circuits and non-series-parallel (non-sp) circuits. An sp circuit consists of either a single pswitch or two sp circuits connected in series or parallel, see the circuit in Fig. 1 (a) and 1(b) as examples. The circuit in Fig. 1(c) is a non-sp circuit. A special type of sp circuits is called simple-series-parallel (ssp) circuits. An ssp circuit is either a single pswitch, or is built by taking an ssp circuit and adding another pswitch in either series or parallel. For example, the circuit in Fig. 1(a) is an ssp circuit but the one in Fig. 1(b) is not.
In this paper, we first study the robustness of different stochastic switching circuits in the presence of small error perturbations. We assume that the probabilities of individual pswitches are taken from a fixed pswitch set with a given error allowance of ǫ; that is, the error probabilities of the pswitches are bounded by ǫ. We show that ssp circuits are robust to small error perturbations, but the error probability of a general sp circuit may be amplified by adding additional pswitches. These results might help us understand why local errors do not accumulate in a natural system, and how to enhance the robustness of a system when designing a circuit.
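Next, we study the problem of synthesizing desired probabilities with stochastic switching circuits. We mainly focus on ssp circuits due to their robustness against small error perturbations. Two main questions are addressed: (1) Expressibility: Given the pswitch set $S = \{\frac{1}{q}, \frac{2}{q}, \ldots, \frac{q-1}{q}\}$ for some integer $q$, what kinds of probabilities can be realized using stochastic switching circuits, and how many pswitches are sufficient? (2) Probability approximation: Given an arbitrary pswitch set, how can we construct a stochastic switching circuit, using as few pswitches as possible, to get a good approximation of the desired probabilities?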
The study of probability synthesis based on stochastic switching circuits has widespread applications. Recently, it was found that DNA molecules can be constructed that closely approximate the dynamic behavior of arbitrary systems of coupled chemical reactions [13], which leads to the field of molecular computing [2]. In such systems, the quantities of molecules involved in a reaction are often surprisingly small, and the exact sequence of reactions is determined by chance [4]. Stochastic switching circuits provide a simple and powerful tool to manipulate stochasticity in molecular systems. Compared with combinational logic circuits, stochastic switching circuits are easier to implement using molecular reactions. Another type of application is probabilistic electrical systems without sophisticated computing components. In such systems, stochastic switching circuits have many advantages in generating desired probabilities, including their constructive simplicity, robustness, and low cost.
The remainder of this paper is organized as follows: Section II describes related work and introduces some existing results on stochastic switching circuits. In Section III, we analyze the robustness of different kinds of stochastic switching circuits. Then we discuss the expressibility of stochastic switching circuits in Section IV and probability approximation in Section V, followed by the conclusion in Section VI.
II. RELATED WORKS AND PRELIMINARIES
There are a number of studies related to the problem of generating desired distributions from the algorithmic perspective. This problem dates back to von Neumann [14], who considered simulating an unbiased coin using a biased coin with unknown probability. Later, Elias [3] improved this algorithm such that the expected number of unbiased random bits generated per coin toss is asymptotically equal to the entropy of the biased coin. Other work has considered the case where the probability distribution of the tossed coin is known. Knuth and Yao [9] gave a procedure to generate an arbitrary probability using an unbiased coin. Han and Hoshi [7] demonstrated how to generate an arbitrary probability using a general $M$-sided biased coin. All these works aim to efficiently convert one distribution to another. However, they require computing models and may not be applicable to some simple or distributed electrical/molecular systems.
There are a number of studies focusing on synthesizing a simple physical device to generate desired probabilities. Gill [5] [6] discussed the problem of generating rational probabilities using a sequential state machine. Motivated by neural computation, Jeavons et al. provided an algorithm to generate binary sequences with probability $\frac{a}{q^n}$ from a set of stochastic binary sequences with probabilities in $\{\frac{1}{q}, \frac{2}{q}, \ldots, \frac{q-1}{q}\}$.
Their method can be implemented using the concept of linear feedback shift registers. Recently, inspired by PCMOS technology [1], Qian et al. considered the synthesis of decimal probabilities using combinational logic [11]. They considered three different scenarios, depending on whether the given probabilities can be duplicated, and whether there is freedom to choose the probabilities. In contrast to the foregoing contributions, we consider the properties and probability synthesis of stochastic switching circuits. Our approach is orthogonal and complementary to that of Qian and Riedel, which is based on combinational logic. Generally, each switching circuit can be equivalently expressed by a combinational logic circuit. All the constructive methods of stochastic switching circuits in this paper can be directly applied to probabilistic combinational logic circuits.
In the rest of this section, we introduce the original work that started the study of stochastic switching circuits (Wilhelm and Bruck [15]). Similar to resistor circuits [10], connecting one terminal of a switching circuit $C_1$ (where $P(C_1) = p_1$) to one terminal of a circuit $C_2$ (where $P(C_2) = p_2$) places them in series. The resulting circuit is closed if and only if both $C_1$ and $C_2$ are closed, so the probability of the resulting circuit is
$$p_{\text{series}} = p_1 p_2.$$
Connecting both terminals of $C_1$ and $C_2$ together places the circuits in parallel. The resulting circuit is closed if and only if either $C_1$ or $C_2$ is closed, so the probability of the resulting circuit is
$$p_{\text{parallel}} = 1 - (1 - p_1)(1 - p_2) = p_1 + p_2 - p_1 p_2.$$
Based on these rules, we can calculate the probability of any given ssp or sp circuit. For example, the probability of the circuit in Fig. 1(a) is
and the probability of the circuit in Fig. 1(b) is
Let us consider the non-sp circuit in Fig. 1(c). In this circuit, we call the pswitch in the middle a 'bridge'. Conditioning on the state of the bridge reduces the circuit to an sp circuit in either case. If the bridge is open, the circuit has a closure probability of $\frac{7}{16}$. Since the bridge is closed with probability $\frac{1}{2}$, the overall probability of the circuit is
$$P(C) = \frac{1}{2}\, P(C \mid \text{bridge closed}) + \frac{1}{2} \cdot \frac{7}{16}.$$
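An important and interesting question is the following: if $S$ is uniform, i.e., $S = \{\frac{1}{q}, \frac{2}{q}, \ldots, \frac{q-1}{q}\}$ for some integer $q$, what kinds of probabilities can be realized using stochastic switching circuits? Wilhelm and Bruck [15] gave an algorithm to realize all rational probabilities of the form $\frac{a}{2^n}$, with $0 < a < 2^n$, using an ssp circuit when $S = \{\frac{1}{2}\}$. In their algorithm, at most $n$ pswitches are used, which is optimal. They also proved that given the pswitch set $S = \{\frac{1}{3}, \frac{2}{3}\}$, all rational $\frac{a}{3^n}$, with $0 < a < 3^n$, can be realized by an ssp circuit with at most $n$ pswitches, and that given $S = \{\frac{1}{4}, \frac{2}{4}, \frac{3}{4}\}$, all rational $\frac{a}{4^n}$, with $0 < a < 4^n$, can be realized using at most $2n - 1$ pswitches.

Wilhelm and Bruck also demonstrated the concept of duality in sp circuits. The dual of a single pswitch of probability $p$ appearing in series is the corresponding pswitch of probability $1 - p$ appearing in parallel. Similarly, the dual of a pswitch of probability $p$ appearing in parallel is a pswitch of probability $1 - p$ appearing in series. For example, in Fig. 2, the circuit in (b) is the dual of the circuit in (a), and vice versa. It can be proved that dual circuits satisfy the following relation:

Theorem 1 (Duality Theorem [15]). For a stochastic series-parallel circuit $C$ and its dual $\bar{C}$, we have
$$P(C) + P(\bar{C}) = 1,$$
where $P(C)$ is the probability of circuit $C$ and $P(\bar{C})$ is the probability of circuit $\bar{C}$.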
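The series and parallel rules and the duality relation can be checked on a small example. The following is a minimal sketch, assuming a hypothetical nested-tuple encoding of sp circuits (the tags `"sw"`, `"ser"`, `"par"` and the function names are illustrative and do not come from [15]):

```python
from fractions import Fraction

# Hypothetical encoding of an sp circuit as nested tuples:
#   ("sw", p)        a single pswitch closed with probability p
#   ("ser", c1, c2)  circuits c1 and c2 connected in series
#   ("par", c1, c2)  circuits c1 and c2 connected in parallel

def closure(c):
    """Closure probability of an sp circuit, using the series/parallel rules."""
    kind = c[0]
    if kind == "sw":
        return c[1]
    p1, p2 = closure(c[1]), closure(c[2])
    if kind == "ser":
        return p1 * p2            # closed iff both parts are closed
    return p1 + p2 - p1 * p2      # closed iff at least one part is closed

def dual(c):
    """Dual circuit: replace p by 1 - p and swap series with parallel."""
    kind = c[0]
    if kind == "sw":
        return ("sw", 1 - c[1])
    swapped = "par" if kind == "ser" else "ser"
    return (swapped, dual(c[1]), dual(c[2]))

half = Fraction(1, 2)
# (1/2 in series with 1/2), then 1/2 added in parallel: an ssp circuit.
circuit = ("par", ("ser", ("sw", half), ("sw", half)), ("sw", half))

print(closure(circuit))                            # 5/8
print(closure(circuit) + closure(dual(circuit)))   # 1
```

Exact rational arithmetic (`Fraction`) is used so that the duality identity holds exactly rather than up to rounding.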
III. ROBUSTNESS
In this section, we analyze the robustness of different kinds of stochastic switching circuits, where the probabilities of individual pswitches are taken from a fixed pswitch set, but given an error allowance of ǫ; i.e., the error probabilities of the pswitches are bounded by ǫ. For a stochastic circuit with multiple pswitches, the error probability of the circuit is the absolute difference between the probability that the circuit is closed when error probabilities of pswitches are included, and the probability that the circuit is closed when error probabilities are omitted. We show that ssp circuits are robust to small error perturbations, but the error probability of a general sp circuit may be amplified with additional pswitches.
A. Robustness of ssp Circuits
Here, we analyze the susceptibility of ssp circuits to small error perturbations in individual pswitches. Based on our assumption, instead of assigning a pswitch a probability of p, the pswitch may be assigned a probability between p − ǫ and p + ǫ, where ǫ is a fixed error allowance.
Theorem 2 (Robustness of ssp circuits). Given a pswitch set $S$, if the error probability of each pswitch is bounded by $\epsilon$, then the total error probability of an ssp circuit is bounded by
$$\frac{\epsilon}{\min(\min(S),\, 1 - \max(S))}.$$
Proof:
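We induct on the number of pswitches. If we have just one pswitch, the result is trivial. Suppose the result holds for $n$ pswitches, and note that for an ssp circuit with $n + 1$ pswitches, the last pswitch will either be added in series or in parallel with the first $n$ pswitches. By the induction hypothesis, the circuit constructed from the first $n$ pswitches has probability $p + \epsilon_1$ of being closed, where $\epsilon_1$ is the error probability introduced by the first $n$ pswitches and $|\epsilon_1| \le \frac{\epsilon}{\min(\min(S),\, 1 - \max(S))}$. The $(n + 1)$st pswitch has probability $t + \epsilon_2$ of being closed, where $t \in S$ and $|\epsilon_2| \le \epsilon$. If the $(n + 1)$st pswitch is added in series, see Fig. 3(a), then the new circuit (with errors) has probability
$$(t + \epsilon_2)(p + \epsilon_1)$$
of being closed. Without considering the error probability of each pswitch, the probability of the new circuit is $tp$. Hence, the overall error probability of the circuit is $e_1 = \epsilon_2 (p + \epsilon_1) + t \epsilon_1$. By the triangle inequality and the induction hypothesis,
$$|e_1| \le |\epsilon_2|\,(p + \epsilon_1) + t\,|\epsilon_1| \le \epsilon + \max(S)\,\frac{\epsilon}{\min(\min(S),\, 1 - \max(S))} \le \frac{\epsilon}{\min(\min(S),\, 1 - \max(S))},$$
where the last step uses $\min(\min(S), 1 - \max(S)) \le 1 - \max(S)$, completing the induction. Similarly, if the $(n + 1)$st pswitch is added in parallel, see Fig. 3(b), then the new circuit (with errors) has probability
$$(p + \epsilon_1) + (t + \epsilon_2) - (p + \epsilon_1)(t + \epsilon_2)$$
of being closed. Without considering the error probability of each pswitch, the probability that the circuit is closed is $p + t - tp$. Hence, the overall error probability of the circuit with $n + 1$ pswitches is $e_2 = \epsilon_1 (1 - t) + \epsilon_2 (1 - p - \epsilon_1)$. Again using the induction hypothesis and the triangle inequality, we have
$$|e_2| \le |\epsilon_1|\,(1 - t) + |\epsilon_2|\,(1 - p - \epsilon_1) \le (1 - \min(S))\,\frac{\epsilon}{\min(\min(S),\, 1 - \max(S))} + \epsilon \le \frac{\epsilon}{\min(\min(S),\, 1 - \max(S))},$$
where the last step uses $\min(\min(S), 1 - \max(S)) \le \min(S)$.
This completes the proof.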
The theorem above implies that ssp circuits are robust to small error perturbations: no matter how big the circuit is, the error probability of an ssp circuit is bounded by a constant times $\epsilon$. Consider the case $S = \{\frac{1}{2}\}$. In this case, the overall error probability of any ssp circuit is bounded by $2\epsilon$ if each pswitch is given an error allowance of $\epsilon$.
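The bound of Theorem 2 can be probed numerically. The sketch below is an illustration only, under assumptions that are not in the paper: the pswitch set `S = [0.25, 0.5, 0.75]`, the error allowance, the circuit size, and the uniform noise model are all arbitrary choices, and the function names are hypothetical. It builds random ssp circuits in the forward direction, perturbs every pswitch by at most `eps`, and compares the largest observed error with the bound.

```python
import random

def ssp_closure(probs, ops):
    """Closure probability of an ssp circuit built forward: start from
    probs[0], then add probs[i] in series ('s') or in parallel ('p')."""
    p = probs[0]
    for t, op in zip(probs[1:], ops):
        p = p * t if op == "s" else p + t - p * t
    return p

def worst_error(S, eps, n, trials, rng):
    """Largest observed |P(noisy circuit) - P(ideal circuit)| over random
    ssp circuits with n pswitches, each perturbed by at most eps."""
    worst = 0.0
    for _ in range(trials):
        probs = [rng.choice(S) for _ in range(n)]
        ops = [rng.choice("sp") for _ in range(n - 1)]
        noisy = [t + rng.uniform(-eps, eps) for t in probs]
        err = abs(ssp_closure(noisy, ops) - ssp_closure(probs, ops))
        worst = max(worst, err)
    return worst

S = [0.25, 0.5, 0.75]      # assumed pswitch set for this experiment
eps = 1e-3                 # assumed error allowance
bound = eps / min(min(S), 1 - max(S))   # Theorem 2: here 4 * eps
observed = worst_error(S, eps, n=50, trials=2000, rng=random.Random(0))
print(observed <= bound)
```

A random search like this cannot prove the bound; it only shows that circuits with 50 pswitches do not accumulate error beyond the constant multiple of `eps` that the theorem predicts.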
B. Robustness of sp Circuits
We have proved that for a given pswitch set S, the overall error probability of an ssp circuit is well bounded. We want to know whether this property holds for all sp circuits. Unfortunately, we show that as the number of pswitches increases, the overall error probability of an sp circuit may also increase. In this subsection, we will give the upper bound and lower bound for the error probabilities of sp circuits.
Theorem 3 (Lower bound for sp circuits). Given a pswitch set $S$, if the error probability of each pswitch is $\epsilon$ (where $\epsilon \to 0$), then there exists an sp circuit of size $n$ with overall error probability $O(\log n)\epsilon$.
Proof: Suppose p ∈ S, and without loss of generality, assume ǫ > 0. We construct an sp circuit as shown in Fig. 4 , by connecting a + 1 strings of pswitches in parallel. Among these strings, we have a strings of b pswitches and one string of n−ab pswitches, and all pswitches have probability p. Now, we let a and b satisfy the following relation:
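Without considering pswitch errors, the probability of the circuit is
$$p_1 = 1 - (1 - p^b)^a\,(1 - p^{n - ab}).$$
Suppose we introduce an error of $\epsilon$ to each pswitch, such that the probability of each pswitch is $p + \epsilon$ (assume $\epsilon > 0$). Then the probability of the circuit is
$$p_2(\epsilon) = 1 - (1 - (p + \epsilon)^b)^a\,(1 - (p + \epsilon)^{n - ab}),$$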
where p 2 (0) = p 1 . Assuming n is large enough, we have the following error probability for the circuit:
So when n is large enough, we have
Since b⌊(
Finally, we have |e 1 | ∼ O(log n)ǫ, completing the proof.
In the following theorem, we will give the upper bound for the error probabilities of sp circuits.
Theorem 4 (Upper bound for sp circuits).
Given an sp circuit with n pswitches taken from a finite pswitch set S, if each pswitch has error probability bounded by ǫ, then the total error probability of the circuit is bounded by c √ nǫ, where c = max t∈S
is a constant.
Proof: Assume x is a pswitch in a stochastic circuit C, and the actual probability of x is t x + ǫ x , where ǫ x is the error part such that |ǫ x | ≤ ǫ. Let P (C|x = 1) denote the probability of circuit C when x is closed, and let P (C|x = 0) denote the probability of C when x is open.
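Without considering the error probability of $x$, the probability of circuit $C$ can be written as
$$P(C) = t_x\, P(C \mid x = 1) + (1 - t_x)\, P(C \mid x = 0).$$
Considering the error part of $x$, we have
$$(t_x + \epsilon_x)\, P(C \mid x = 1) + (1 - t_x - \epsilon_x)\, P(C \mid x = 0) = P(C) + \epsilon_x \left( P(C \mid x = 1) - P(C \mid x = 0) \right).$$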
In order to prove the theorem, we define a term called the error contribution. In a circuit C, the error contribution of pswitch x is defined as
In the rest of the proof, we have two steps.
(1) In the first step, we show that given an sp circuit with size n, there exists at least one pswitch such that its error contribution is bounded by
ǫ, where P is the probability of the sp circuit and c = max t∈S
We induct on the number of pswitches. If the circuit has only one pswitch, the result is trivial. Suppose the result holds for k pswitches for all k < n. We need to prove that the result holds for any sp circuit C with n pswitches.
Suppose circuit C is constructed by connecting two sp circuits C 1 and C 2 in series, where C 1 has n 1 pswitches and probability P 1 , and C 2 has n 2 pswitches and probability P 2 . Note that n 1 + n 2 = n and n 1 < n, n 2 < n.
By the induction hypothesis, circuit C 1 contains a pswitch x 1 with error contribution
In circuit $C$, the error contribution of pswitch $x_1$ is $e_{x_1}(C) = P_2\, e_{x_1}(C_1)$.
Similarly, in the circuit C 2 , there exists a pswitch x 2 such that the error contribution of x 2 is
and the error contribution of x 2 to circuit C is e x2 (C) = P 1 e x2 (C 2 ).
Since the circuit C is constructed by connecting circuits C 1 and C 2 in series, the probability of circuit C is P = P 1 P 2 . Thus, we only need to prove that either e x1 (C) or e x2 (C) is bounded by
This can be proved by contradiction as follows.
Assume both e x1 (C) and e x2 (C) are larger than
ǫ. Then we have
which can be simplified as
Adding the two inequalities yields
which is a contradiction. So we conclude that at least one of e x1 (C) and e x2 (C) is bounded by
ǫ when C is constructed by connecting two sp circuits in series. If the circuit C is constructed by connecting two sp circuits in parallel, using a similar argument, we can get the same conclusion.
Finally, we get that given an sp circuit with size n, there exists at least one pswitch such that its error contribution is bounded by
(2) In the second step, we prove the theorem based on the result above.
We again induct on the number of pswitches. If we have fewer than three pswitches, the result is trivial. Suppose the result holds for any sp circuit with $n \ge 2$ pswitches; we want to prove that the result also holds for any circuit with $n + 1$ pswitches.
Based on the result in the first step, we know that given an sp circuit C with n + 1 pswitches, there exists a pswitch x with error contribution bounded by
ǫ. By keeping pswitch $x$ closed, we obtain an sp circuit $D_1$ with at most $n$ pswitches; see Fig. 5(a) and (b) for an example. Without considering pswitch errors, $D_1$ is closed with probability $p_1$; considering all pswitch errors, $D_1$ is closed with probability $q_1$. According to our assumption, we have
$$|q_1 - p_1| \le c\sqrt{n}\,\epsilon.$$
By keeping pswitch $x$ open, we obtain an sp circuit $D_2$ with at most $n$ pswitches; see Fig. 5(a) and (c) for an example. Without considering pswitch errors, $D_2$ is closed with probability $p_2$; considering all pswitch errors, $D_2$ is closed with probability $q_2$. According to our assumption, we have
$$|q_2 - p_2| \le c\sqrt{n}\,\epsilon.$$
For the initial sp circuit $C$ with $n + 1$ pswitches, without considering pswitch errors, the overall probability of the circuit is given by
$$P(C) = t_x\, p_1 + (1 - t_x)\, p_2,$$
where $t_x$ is the probability of pswitch $x$. Considering all pswitch errors, the overall probability of the circuit is
$$(t_x + \epsilon_x)\, q_1 + (1 - t_x - \epsilon_x)\, q_2.$$
We know that the error contribution of pswitch x to the circuit C is
Then by the triangle inequality, we can get the error probability of the circuit C:
This finishes the induction.
C. Robustness of Non-sp Circuits
Here we extend our discussion to the case of general stochastic switching circuits. We have the following theorem, which clearly holds for sp and ssp circuits:

Theorem 5 (Upper bound for general circuits). Given a general stochastic switching circuit with $n$ pswitches taken from a finite pswitch set $S$, if each pswitch has error probability bounded by $\epsilon$, then the total error probability of the circuit is bounded by $n\epsilon$.
Proof:
We first index all the pswitches in the circuit C as x 1 , x 2 , . . . , x n , see Fig. 6 as an example.
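Let $t_i + \epsilon_i$ be the probability that $x_i$ is closed, where $\epsilon_i$ is the error part such that $|\epsilon_i| \le \epsilon$. Let $P^{(k)}$ denote the probability that $C$ is closed when we only take into account the errors of $x_1, x_2, \ldots, x_k$, i.e.,
$$P^{(k)} = P(t_1 + \epsilon_1, \ldots, t_k + \epsilon_k, t_{k+1}, \ldots, t_n),$$
where $P(a_1, a_2, \ldots, a_n)$ indicates the probability of $C$ if $x_i$ is closed with probability $a_i$ for all $1 \le i \le n$. The overall error probability of the circuit $C$ can then be written as
$$|P^{(n)} - P^{(0)}| \le \sum_{k=1}^{n} |P^{(k)} - P^{(k-1)}|.$$
Now, we prove that
$$|P^{(k)} - P^{(k-1)}| \le \epsilon \quad \text{for all } 1 \le k \le n.$$
Indeed, $P^{(k)}$ and $P^{(k-1)}$ differ only in the probability assigned to $x_k$. Conditioning on the state of $x_k$ gives $P^{(k)} - P^{(k-1)} = \epsilon_k \left( P(C \mid x_k = 1) - P(C \mid x_k = 0) \right)$, where both conditional probabilities lie in $[0, 1]$ and $|\epsilon_k| \le \epsilon$.
Therefore, we have
$$|P^{(n)} - P^{(0)}| \le n\epsilon,$$
as we wanted.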
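Because Theorem 5 makes no assumption on the circuit topology, it can be checked by brute force on a small non-sp circuit. The following is a minimal sketch, assuming a five-pswitch bridge with hypothetical node names `s`, `a`, `b`, `t` and all nominal probabilities equal to 1/2; these are illustrative choices, not necessarily the exact circuit of Fig. 1(c) or Fig. 6.

```python
from itertools import product

def connected(edges, states, s, t):
    """True if terminals s and t are joined by closed pswitches."""
    adj = {}
    for (u, v), closed in zip(edges, states):
        if closed:
            adj.setdefault(u, []).append(v)
            adj.setdefault(v, []).append(u)
    seen, stack = {s}, [s]
    while stack:
        u = stack.pop()
        for v in adj.get(u, []):
            if v not in seen:
                seen.add(v)
                stack.append(v)
    return t in seen

def closure_probability(edges, probs, s, t):
    """Sum the probability of every open/closed assignment that connects
    s and t. Exponential in the number of pswitches; small circuits only."""
    total = 0.0
    for states in product([0, 1], repeat=len(edges)):
        weight = 1.0
        for closed, p in zip(states, probs):
            weight *= p if closed else 1 - p
        if connected(edges, states, s, t):
            total += weight
    return total

# Bridge circuit: the pswitch ("a", "b") is the bridge.
bridge = [("s", "a"), ("s", "b"), ("a", "b"), ("a", "t"), ("b", "t")]
eps = 0.01
exact = closure_probability(bridge, [0.5] * 5, "s", "t")
noisy = closure_probability(bridge, [0.5 + eps] * 5, "s", "t")
print(exact, abs(noisy - exact), len(bridge) * eps)
```

In this instance the observed error stays well below $n\epsilon$, which is consistent with the remark that the bound is rarely tight.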
Note that in most cases, the actual error probability of a circuit is much smaller than $n\epsilon$ when $n$ is large. However, $n\epsilon$ is still achievable in the following case: by placing $n$ pswitches with probability $p - \epsilon$ in series, where $\epsilon \to 0$, we get a circuit whose probability is
$$(p - \epsilon)^n.$$
Without considering the errors, the probability of the circuit is $p^n$, so the overall error is $p^n - (p - \epsilon)^n \approx n p^{n-1} \epsilon$ to first order in $\epsilon$. Choosing $p$ sufficiently close to 1, we can make the error probability of the circuit arbitrarily close to $n\epsilon$.
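The series-chain example is easy to evaluate directly. The short sketch below (the function name and the sample values of `p`, `eps`, and `n` are assumptions chosen for illustration) prints the ratio between the actual error and $n\epsilon$, which approaches 1 as `p` approaches 1.

```python
def series_chain_error(p, eps, n):
    """Error of n pswitches in series when each closes with probability
    p - eps instead of the nominal p."""
    return p ** n - (p - eps) ** n

n, eps = 20, 1e-6
for p in (0.5, 0.9, 0.999):
    ratio = series_chain_error(p, eps, n) / (n * eps)
    # To first order in eps, the ratio is p ** (n - 1).
    print(p, ratio)
```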
IV. EXPRESSIBILITY
In the previous section, we showed that ssp circuits are robust against noise. This property is important in natural systems and useful in engineering system design, because the local error of a system should not be amplified. In this section, we consider another property of stochastic switching circuits, called expressibility. Namely, given a pswitch set $S = \{\frac{1}{q}, \frac{2}{q}, \ldots, \frac{q-1}{q}\}$ for some integer $q$, the questions we ask are: What kinds of probabilities can be realized using stochastic switching circuits (or only ssp circuits)? How many pswitches are sufficient? Wilhelm and Bruck [15] proved that if $q = 2$ or $q = 3$, all rational $\frac{a}{q^n}$, with $0 < a < q^n$, can be realized by an ssp circuit with at most $n$ pswitches, which is optimal. They also showed that if $q = 4$, all rational $\frac{a}{q^n}$, with $0 < a < q^n$, can be realized using at most $2n - 1$ pswitches. In this section we generalize these results:
1) If $q$ is an even number, all rational $\frac{a}{q^n}$, with $0 < a < q^n$, can be realized by an ssp circuit with at most $\lceil \log_2 q \rceil (n - 1) + 1$ pswitches (Theorem 7).
2) If $q$ is odd and a multiple of 3, all rational $\frac{a}{q^n}$, with $0 < a < q^n$, can be realized by an ssp circuit with at most $\lceil \log_3 q \rceil (n - 1) + 1$ pswitches (Theorem 8).
3) However, if $q$ is a prime number greater than 3, there exists at least one rational $\frac{a}{q^n}$, with $0 < a < q^n$, that cannot be realized using an sp circuit (Theorem 11).
Table I summarizes these results. We see that when $q = 2$, 3, or 4, our results agree with the results in [15].
A. Backward Algorithms
As mentioned in [15], switching circuits may be synthesized using forward algorithms, where circuits are built by adding pswitches sequentially, or backward algorithms, where circuits are built starting from the "outermost" pswitch. Fig. 7 gives a simple demonstration of a backward algorithm. Assume that the desired probability is $p_1$ and we plan to insert three pswitches, namely $x_1, x_2, x_3$, in the backward direction. Here, for simplicity, we use $x_1, x_2, x_3$ to denote the closure probabilities of the pswitches, rather than their states (1 or 0). If $x_1 \le p_1$, then $x_1$ has to be inserted in parallel. If $x_1 > p_1$, then $x_1$ has to be inserted in series. After the insertion, we try to realize the inner box with probability $p_2$ such that $p_2 + x_1 - p_2 x_1 = p_1$ if $x_1$ was inserted in parallel, or $p_2 x_1 = p_1$ if $x_1$ was inserted in series. This process is continued recursively until, for some $m$, $p_m$ can be realized with a single pswitch. Generally, in backward algorithms, we use $x_k$ to denote the $k$th pswitch inserted in the backward direction, and use $p_k$ to denote the probability that we want to realize with pswitches $x_k, x_{k+1}, x_{k+2}, \ldots$
Fig. 7. A backward algorithm. (a) Step 1: We assume that the desired probability is $p_1$. (b) Step 2: Insert $x_1$ in parallel as the last pswitch. Now we try to realize $p_2$ such that $p_2 + x_1 - p_2 x_1 = p_1$. (c) Step 3: Insert $x_2$ in series as the last pswitch. Now we try to realize $p_3$ such that $p_3 x_2 = p_2$.

Backward algorithms have significant advantages over forward algorithms for probability synthesis. In a forward algorithm, if we want to add one pswitch, we have $2|S|$ choices, since each pswitch may be added in either series or parallel. But in a backward algorithm, if we want to insert one pswitch, we have only $|S|$ choices. That is because the insertion (series or parallel) of a pswitch $x_k$ simply depends on the comparison of $x_k$ and $p_k$. Therefore, backward algorithms can significantly reduce the search space and hence are more efficient than forward algorithms. In this paper, most of the circuit constructions are based on backward algorithms.
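A single backward insertion can be written in a few lines. The sketch below is illustrative (the function name `backward_step` and the example target 5/8 are assumptions): given the target probability of the current box and the chosen pswitch, it returns how the pswitch must be inserted and the probability left for the inner box.

```python
from fractions import Fraction

def backward_step(p, x):
    """Insert pswitch x as the outermost pswitch of a circuit that should
    close with probability p. Returns (mode, inner), where inner is the
    probability the remaining inner circuit must realize."""
    if x > p:
        return "series", p / x              # inner * x = p
    return "parallel", (p - x) / (1 - x)    # inner + x - inner * x = p

half = Fraction(1, 2)
p1 = Fraction(5, 8)
mode1, p2 = backward_step(p1, half)   # x1 = 1/2 <= 5/8, so parallel
mode2, p3 = backward_step(p2, half)   # x2 = 1/2 >  1/4, so series
print(mode1, p2, mode2, p3)           # parallel 1/4 series 1/2
```

Since $p_3 = \frac{1}{2}$ is itself a pswitch in $S = \{\frac{1}{2}\}$, the construction stops after three pswitches, and only one candidate per step had to be examined because the mode is forced by the comparison of $x_k$ and $p_k$.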
B. Multiples of 2 or 3
We consider the case that $S = \{\frac{1}{q}, \frac{2}{q}, \ldots, \frac{q-1}{q}\}$ and $q$ is a multiple of 2 or 3. We show that based on a backward algorithm, all rational $\frac{a}{q^n}$, with $0 < a < q^n$, can be realized using a bounded number of pswitches. Before describing the details, we introduce a characteristic function called $d$ for a given probability $\frac{b}{q^w}$:
$$d\left(\frac{b}{q^w}\right) = \frac{q^{w-1}}{\gcd(b, q^{w-1})}.$$
Note that the function $d$ is well defined, i.e., the value of $d$ is unchanged when both $b$ and $q^w$ are multiplied by the same power of $q$. From the definition of the characteristic function $d$, we see that for any rational $\frac{a}{q^n}$ with $0 < a < q^n$, $d$ is a positive integer. In each iteration of the algorithm, we hope to reduce $d(p_k)$ such that it can reach 1 after a certain number of iterations. If $d = 1$, this means the desired probability can be realized using a single pswitch and the construction is done. During this process, we keep each successive probability $p_k$ in the form of $\frac{b}{q^w}$, since only this kind of probability can be realized with the pswitch set $S$. Now, we describe the algorithm as follows.

Algorithm 1.
1) Assume that the desired probability is $p_1 = \frac{a}{q^n}$. Set $k = 1$ and start with an empty circuit.
2) Let
$$h(x_k, p_k) = \begin{cases} \dfrac{p_k}{x_k} & \text{if } x_k > p_k, \\[2mm] \dfrac{p_k - x_k}{1 - x_k} & \text{otherwise.} \end{cases}$$
We find the optimal $x_k \in S$ that minimizes $d(p_{k+1})$ with $p_{k+1} = h(x_k, p_k)$.
3) Insert pswitch $x_k$ into the circuit. If $x_k > p_k$, the pswitch is inserted in series; otherwise, it is inserted in parallel. Then we set $p_{k+1} = h(x_k, p_k)$.
4) Let $k = k + 1$.
5) Repeat steps 2-4 until $p_k$ can be realized using a single pswitch. Then insert a pswitch with probability $p_k$ into the circuit.
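The steps above can be sketched in code. This is a minimal illustration rather than the authors' implementation, and it rests on two assumptions that are labeled in the comments: the characteristic function is taken to be $d(b/q^w) = q^{w-1}/\gcd(b, q^{w-1})$, and ties between equally good pswitches are broken by taking the smallest one. All function names are hypothetical.

```python
from fractions import Fraction
from math import gcd

def q_form(p, q):
    """Return (b, w) with p == b / q**w and w >= 1 minimal, or None if p
    cannot be written with a power of q as denominator."""
    rest = p.denominator
    while (g := gcd(rest, q)) > 1:
        rest //= g
    if rest != 1:
        return None
    w, den = 1, q
    while den % p.denominator:
        w, den = w + 1, den * q
    return p.numerator * (den // p.denominator), w

def d(p, q):
    # Assumed form of the characteristic function (see lead-in).
    b, w = q_form(p, q)
    return q ** (w - 1) // gcd(b, q ** (w - 1))

def h(x, p):
    """Inner probability after inserting x backward (series if x > p)."""
    return p / x if x > p else (p - x) / (1 - x)

def algorithm1(p, q, max_steps=64):
    """Backward construction; returns [(mode, x), ..., ("single", p_m)]."""
    S = [Fraction(i, q) for i in range(1, q)]
    steps = []
    while d(p, q) > 1:
        if len(steps) >= max_steps:          # safety cap for this sketch
            raise RuntimeError("no circuit found within max_steps")
        best = None
        for x in S:                          # ties: smallest x wins (assumption)
            nxt = h(x, p)
            if not 0 < nxt < 1 or q_form(nxt, q) is None:
                continue                     # keep p_{k+1} in the form b/q^w
            if best is None or d(nxt, q) < best[0]:
                best = (d(nxt, q), x, nxt)
        if best is None:
            raise ValueError("no admissible pswitch")
        _, x, nxt = best
        steps.append(("series" if x > p else "parallel", x))
        p = nxt
    steps.append(("single", p))
    return steps

def evaluate(steps):
    """Closure probability of the circuit described by steps."""
    prob = steps[-1][1]
    for mode, x in reversed(steps[:-1]):
        prob = prob * x if mode == "series" else prob + x - prob * x
    return prob

circuit = algorithm1(Fraction(71, 100), 10)
print(circuit, evaluate(circuit))
```

Under these assumptions the sketch realizes $\frac{71}{100}$ exactly for $q = 10$; the particular pswitches it picks depend on the tie-breaking rule and need not match the circuit in Fig. 8.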
In Algorithm 1, the characteristic function $d(p_k)$ strictly decreases as $k$ increases, until it reaches 1. Finally, $p_k$ can be replaced by a single pswitch and the construction is done. Fig. 8 gives an example of a circuit realized by this algorithm. At the beginning, we have $p_1 = \frac{71}{10^2}$, with $d(p_1) = 10$. Then we add the "best" pswitch, i.e., the one that minimizes $d(p_2)$. In the following theorem, we show that if $q$ is a multiple of 2 or 3, then Algorithm 1 realizes any rational $\frac{a}{q^n}$ with $0 < a < q^n$.

Fig. 9. When $q$ is even, the way to add a pswitch $x \in S$.

Proof: The characteristic function $d(p_1)$ of the initial probability $p_1$ is bounded by $q^{n-1}$. We only need to prove that there exists an integer $m$ such that $d(p_m) = 1$, i.e., $p_m$ can be realized by a single pswitch. Hence the desired probability $p_1$ can be realized by an ssp circuit with $m$ pswitches. It is enough to show that the characteristic function $d(p_k)$ decreases as $k$ increases.
First, we consider the case where q is even. We will show that for any See Fig. 9 , depending on the values of p k and d(p k ), we have four different cases of inserting a pswitch By checking all the cases to insert a pswitch, it is straightforward to see that when
Since x k is optimal in each step of Algorithm 1, we have
Finally, we can conclude that when q is even, there exists an integer m such that d(p m ) = 1. Consequently, p 1 can be realized with at most m pswitches. Finally, we can conclude that p 1 can be realized with a finite number of pswitches when q is odd and a multiple of 3.
For each value $q \in \{2, 3, 4, 6, 8, 9, 10\}$, we enumerate all rational numbers with optimal size $n \in \{3, 4, 5\}$. Here, we say that a desired probability is realized with optimal size if it cannot be realized with fewer pswitches. As a comparison, we use Algorithm 1 to realize these rational numbers again. Fig. 10 presents the average number of pswitches required using Algorithm 1 when the optimal size is $n$. The results show that when $q$ is a multiple of 2 or 3, Algorithm 1 constructs circuits of almost optimal size.
The next theorem gives an upper bound for the size of the circuits when q is even.
Theorem 7 (Upper bound of circuit size when q is even). Suppose $q$ is even. Given a pswitch set $S = \{\frac{1}{q}, \frac{2}{q}, \ldots, \frac{q-1}{q}\}$, any rational $\frac{a}{q^n}$ with $0 < a < q^n$ can be realized by an ssp circuit, using at most $\lceil \log_2 q \rceil (n - 1) + 1$ pswitches.
Proof: In order to achieve this upper bound, we use a modified version of Algorithm 1. Instead of inserting the optimal pswitch x k , we insert the pswitch x described in Fig. 9 as the kth pswitch. The resulting characteristic function has the following properties:
(1) d(p k ) decreases as k increases, and when d(p m ) = 1 for some m, the procedure stops.
then N is the number of required pswitches. We only need to prove that N ≤ ⌈log 2 q⌉(n − 1) + 1. Since q is even, we can write q = 2 c or q = 2 c t, where t > 1 is odd. Let us first consider the case of q = 2 c . At the beginning, d(p 1 ) is a factor of q n−1 , so according to property (2), we can get N ≤ c(n − 1) + 1 = ⌈log 2 q⌉(n − 1) + 1.
In the case of q = 2 c t, let us define a set M as
and let $M_i$ be the $i$th smallest element in $M$. According to properties (2) and (3) and the fact that $d(p_1)$ is a factor of $q^{n-1}$, we see that $d(p_{M_i})$ is a factor of $q^{n-i}$. Therefore, there exists a minimal $k$,
Based on properties (2) and (3), we also see that
Therefore,
Using similar methods, we can also prove the following theorems for the cases where $q$ is a multiple of 3 or 6. Note that Theorem 7 also applies to the case that $q$ is a multiple of 6, but Theorem 9 provides a tighter upper bound.
C. Prime Number Larger Than 3
We proved that if $q$ is a multiple of 2 or 3, all rational $\frac{a}{q^n}$ can be realized with a finite number of pswitches. We want to know whether this result also holds if $q$ is an arbitrary number greater than 2. Unfortunately, the answer is negative.

Proof: Assume there exists a rational $\frac{a}{q^n}$ which cannot be realized by an sp circuit with $n$ pswitches, but can be realized with at least $l > n$ pswitches. Further, suppose that this $l$ is minimal for all rationals with denominator $q^k$. Under these assumptions, we will prove that there exists a rational $\frac{a'}{q^{n'}}$
which cannot be realized with $n'$ pswitches but can be realized with $l'$ pswitches such that $l' < l$. This conclusion contradicts the assumption that $l$ is minimal.
According to the definition of sp circuits, we know that $\frac{a}{q^n}$ can be realized by connecting two sp circuits $C_1$ and $C_2$ in series or in parallel. Assume $C_1$ consists of $l_1$ pswitches and is closed with probability $\frac{b_1}{q^{l_1}}$, and $C_2$ consists of $l_2$ pswitches and is closed with probability $\frac{b_2}{q^{l_2}}$, where $l_1 + l_2 = l$. If $C_1$ and $C_2$ are connected in series, we get
$$\frac{a}{q^n} = \frac{b_1}{q^{l_1}} \cdot \frac{b_2}{q^{l_2}} = \frac{b_1 b_2}{q^{l}}.$$
Therefore, $b_1 b_2 = a q^{l-n}$, where $b_1 b_2$ is a multiple of $q$. Since $q$ is a prime number, either $b_1$ or $b_2$ is a multiple of $q$. Without loss of generality, assume $b_1$ is a multiple of $q$, and write $b_1 = cq$. Consider the probability $\frac{c}{q^{l_1 - 1}}$, which can be realized with $C_1$, using $l_1$ pswitches. Assume that the same probability can also be realized with another sp circuit $C_3$, using $l_1 - 1$ pswitches. By connecting $C_3$ and $C_2$ in series, we can realize $\frac{a}{q^n}$ with $l_1 - 1 + l_2 = l - 1$ pswitches, contradicting the assumption that $\frac{a}{q^n}$ cannot be realized with fewer than $l$ pswitches. Therefore, we see that $\frac{c}{q^{l_1 - 1}}$ cannot be realized with $l_1 - 1$ pswitches, but it can be realized with $l_1$ pswitches. Since $l_1 < l$, this also contradicts our assumption that $l$ is minimal.
If $C_1$ and $C_2$ are connected in parallel, we have
$$\frac{a}{q^n} = \frac{b_1}{q^{l_1}} + \frac{b_2}{q^{l_2}} - \frac{b_1 b_2}{q^{l}}.$$
Using a similar argument as above, we can conclude that either $b_1$ or $b_2$ is a multiple of $q$. Then either (1) $\frac{a}{q^n}$ can be realized with fewer than $l$ pswitches or (2) $l$ is not minimal, yielding a contradiction. This proves the lemma.
Based on the lemma above, it is easy to get the following theorem.
Theorem 11 (When q is a prime number larger than 3). For a prime number $q > 3$, there exists an integer $a$, with $0 < a < q^n$, such that $\frac{a}{q^n}$ cannot be realized using an sp circuit whenever $n \ge 2$.
Proof: The conclusion follows from Lemma 10 and the following result in [15]: For any $q > 3$, no pswitch set containing all $\frac{a}{q}$, with $0 < a < q$, can realize all $P(C) = \frac{b}{q^2}$, with $0 < b < q^2$, using at most 2 pswitches.
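The result quoted from [15] can be verified by direct enumeration for small $q$. The sketch below (function names are hypothetical) lists every numerator $b$ for which $\frac{b}{q^2}$ is the closure probability of a circuit with at most two pswitches from $S = \{\frac{1}{q}, \ldots, \frac{q-1}{q}\}$, and reports the numerators that are missed.

```python
def realizable_numerators(q):
    """Numerators b such that b / q**2 is realized by at most 2 pswitches."""
    S = range(1, q)
    found = {a * q for a in S}                      # single pswitch a/q
    for a1 in S:
        for a2 in S:
            found.add(a1 * a2)                      # series: (a1/q)(a2/q)
            found.add(q * q - (q - a1) * (q - a2))  # parallel: 1-(1-a1/q)(1-a2/q)
    return found

def missing_numerators(q):
    found = realizable_numerators(q)
    return [b for b in range(1, q * q) if b not in found]

print(missing_numerators(3))   # []
print(missing_numerators(5))   # [7, 11, 14, 18]
```

The enumeration alone only rules out circuits with at most two pswitches; it is Lemma 10 that extends the conclusion for prime $q$ to sp circuits of any size.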
V. PROBABILITY APPROXIMATION
In this section, we consider a general case where given an arbitrary pswitch set, we want to realize a desired probability. Clearly, not every desired probability p d can be realized without any error using a finite number of pswitches for a fixed pswitch set S. So the question is whether we can construct a circuit with at most n pswitches such that it can approximate the desired probability very well. Namely, the difference between the probability of the constructed circuit and the desired probability should be as small as possible.
A. Greedy Algorithm
Given an arbitrary pswitch set $S$ with $|S| \ge 2$, it is not easy to find the optimal circuit (ssp circuit) with $n$ pswitches that approximates the desired probability $p_d$. As we discussed in the last section, a backward algorithm provides $|S|$ choices for each successive insertion. To find the optimal circuit, we may have to search through $|S|^n$ different combinations. As $|S|$ or $n$ increases, the number of combinations increases dramatically. In order to reduce the search space, we propose a greedy algorithm: in each step, we insert $m$ pswitches that are locally the "best". Normally, $m$ is a very small constant. Since each step has complexity $|S|^m$, the total number of possible combinations is reduced to $|S|^m \frac{n}{m}$, which is much smaller than $|S|^n$ when $|S| \ge 2$ and $n$ is large. Now, we describe this greedy algorithm briefly. We use the same notation $x_1, x_2, \ldots$ and $p_1, p_2, \ldots$ as for the backward algorithms: $x_k$ denotes the $k$th pswitch inserted, and $p_k$ denotes the desired probability of the subcircuit constructed by $x_k, x_{k+1}, \ldots$
Algorithm 2 (Greedy algorithm with step-length m).
1) Assume that the desired probability is $p_1$. Set $k = 1$ and start with an empty circuit.
2) Select the optimal $\mathbf{x}^m = (x_1, x_2, \ldots, x_m) \in S^m$ to minimize $f(\mathbf{x}^m, S, p_k)$, which will be specified later; this $\mathbf{x}^m$ is denoted by $\mathbf{x}^* = (x_1^*, x_2^*, \ldots, x_m^*)$.
3) Insert $x_1^*, x_2^*, \ldots, x_m^*$ one by one into the circuit in a backward direction, updating the desired probability after each insertion, and set $k$ to $k + m$.
4) Repeat steps 2 and 3 until $n$ pswitches have been inserted.

So far, according to the backward algorithm described in Section IV-A, we know how to carry out step 3, including how to insert $m$ pswitches one by one into a circuit in a backward direction, and how to update $p_k$. The only thing unclear in the procedure above is the expression of $f(\mathbf{x}^m, S, p_k)$.

In order to get a good expression for $f(\mathbf{x}^m, S, p_k)$, we study how errors propagate in a backward algorithm. Note that in a backward algorithm, we insert pswitches $x_1, x_2, \ldots, x_n$ one by one: if $x_k > p_k$, then $x_k$ is inserted in series; if $x_k < p_k$, then $x_k$ is inserted in parallel. Now, given a circuit $C$ of size $n$ constructed using a backward algorithm, we let $C^{(k)}$ denote the subcircuit constructed by $x_k, x_{k+1}, \ldots, x_n$ and call $|P(C^{(k)}) - p_k|$ the approximation error of $p_k$, denoted by $e_k$. In the following lemma, we show how $e_{k_1}$ affects $e_{k_2}$ for $k_2 < k_1$ after inserting pswitches $x_{k_2}, \ldots, x_{k_1-1}$.
Lemma 12.
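In a backward algorithm, let $p_k$ denote the desired probability of the subcircuit $C^{(k)}$ constructed by $x_k, x_{k+1}, \ldots, x_n$, and let $e_k$ denote the approximation error of $p_k$. Then for any $k_2 < k_1 \le n$, we have
$$e_{k_2} = \left(\prod_{i=k_2}^{k_1-1} r(x_i)\right) e_{k_1},$$
where $r(x_i) = x_i$ if $x_i > p_i$ (series insertion), $r(x_i) = 1 - x_i$ if $x_i < p_i$ (parallel insertion), and $r(x_i) = 0$ if $x_i = p_i$.
Proof: We only need to prove that for any $k$ less than the circuit size, the following result holds:
$$e_k = r(x_k)\, e_{k+1}.$$
When $x_k = p_k$, we have $e_k = e_{k+1} = 0$, so the result is trivial.
When $x_k > p_k$, $x_k$ is inserted in series. In this case, we have $p_{k+1} x_k = p_k$, and
$$P(C^{(k)}) = x_k\, P(C^{(k+1)}).$$
As a result, the approximation error of $p_k$ is
$$e_k = |P(C^{(k)}) - p_k| = x_k\, |P(C^{(k+1)}) - p_{k+1}| = x_k\, e_{k+1}.$$
When $x_k < p_k$, $x_k$ is inserted in parallel. In this case, we have
$$(1 - p_{k+1})(1 - x_k) = 1 - p_k$$
and
$$e_k = |P(C^{(k)}) - p_k| = (1 - x_k)\, |P(C^{(k+1)}) - p_{k+1}| = (1 - x_k)\, e_{k+1}.$$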
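The error propagation in Lemma 12 can be checked numerically. The sketch below is an illustration only: the helper names and the chosen pswitch sequence are assumptions, and it assumes that no inserted pswitch exactly equals its target. It runs the backward updates, evaluates each subcircuit $C^{(k)}$ exactly with rational arithmetic, and asserts that each error is the next error scaled by $x_k$ (series) or $1 - x_k$ (parallel).

```python
from fractions import Fraction

def backward_targets(p1, xs):
    """Targets p_1, ..., p_n and insertion modes for the sequence xs."""
    targets, modes, p = [], [], Fraction(p1)
    for x in xs:
        targets.append(p)
        if x > p:
            modes.append("series")
            p = p / x                   # p_{k+1} * x_k = p_k
        else:
            modes.append("parallel")
            p = (p - x) / (1 - x)       # (1 - p_{k+1})(1 - x_k) = 1 - p_k
    return targets, modes

def closure_probabilities(xs, modes):
    """P(C^(k)) for every k, where C^(k) is built from x_k, ..., x_n."""
    probs = [None] * len(xs)
    probs[-1] = xs[-1]                  # innermost pswitch alone
    for k in range(len(xs) - 2, -1, -1):
        if modes[k] == "series":
            probs[k] = xs[k] * probs[k + 1]
        else:
            probs[k] = 1 - (1 - xs[k]) * (1 - probs[k + 1])
    return probs

# Illustrative sequence of pswitches for the target 3/7.
xs = [Fraction(2, 5)] + [Fraction(1, 5)] * 4
targets, modes = backward_targets(Fraction(3, 7), xs)
probs = closure_probabilities(xs, modes)
errors = [abs(P - p) for P, p in zip(probs, targets)]

for k in range(len(xs) - 1):
    r = xs[k] if modes[k] == "series" else 1 - xs[k]
    assert errors[k] == r * errors[k + 1]
print(probs[0], errors[0])
```

With this sequence the full circuit has closure probability $\frac{1337}{3125}$ and error $\frac{16}{21875}$, which is the innermost error $\frac{4}{105}$ scaled by $\frac{3}{5} \cdot \frac{1}{5} \cdot \frac{4}{5} \cdot \frac{1}{5}$.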
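In each step of the greedy algorithm, our goal is to minimize $e_k$, the approximation error of $p_k$. According to the lemma above, we know that
$$e_k = \left(\prod_{i=k}^{k+m-1} r(x_i)\right) e_{k+m},$$
where the term $e_{k+m}$ is unknown. But we can minimize $\prod_{i=k}^{k+m-1} r(x_i)$ so that $e_k$ is as small as possible. Based on the above discussion, we express $f(\mathbf{x}, S, p_k)$, with $\mathbf{x} = (x_k, \ldots, x_{k+m-1})$, as
$$f(\mathbf{x}, S, p_k) = \prod_{i=k}^{k+m-1} r(x_i), \qquad (1)$$
with
$$r(x_i) = \begin{cases} x_i & \text{if } x_i > p_i, \\ 1 - x_i & \text{if } x_i < p_i, \\ 0 & \text{if } x_i = p_i. \end{cases}$$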
In the rest of this section, based on this expression for $f(\mathbf{x}, S, p_k)$, we show that the greedy algorithm performs well in reducing the approximation error of $p_d$.
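The procedure described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the names `insert`, `f`, `greedy`, and `closure_probability` are hypothetical, the objective is taken to be the product of the factors $r(x_i)$ (with $r = x_i$ for a series insertion and $1 - x_i$ for a parallel one), and for the block containing the very last pswitch the sketch assumes that the exact resulting error is minimized, since no unknown later error remains. Probabilities are `Fraction` values in $(0, 1)$.

```python
from fractions import Fraction
from itertools import product

def insert(p, x):
    """One backward insertion of pswitch x for target p.
    Returns (mode, next target, r(x))."""
    if x > p:                             # series: p_next * x = p
        return "series", p / x, x
    if x < p:                             # parallel: (1 - p_next)(1 - x) = 1 - p
        return "parallel", (p - x) / (1 - x), 1 - x
    return "exact", None, Fraction(0)     # x realizes p with no error

def f(xs, p, final=False):
    """Product of r(x_i) over the candidate block xs, starting at target p.
    Assumption: if the block ends the circuit, the last factor is |x - p|."""
    value = Fraction(1)
    for i, x in enumerate(xs):
        if final and i == len(xs) - 1:
            return value * abs(x - p)
        _, p, r = insert(p, x)
        value *= r
        if p is None:                     # exact match, error is zero
            break
    return value

def greedy(p_d, S, n, m=1):
    """Pick at most n pswitches from S, m at a time, for target p_d."""
    p, circuit = Fraction(p_d), []
    while len(circuit) < n and p is not None:
        width = min(m, n - len(circuit))
        final = len(circuit) + width == n
        best = min(product(S, repeat=width), key=lambda xs: f(xs, p, final))
        for x in best:
            mode, p, _ = insert(p, x)
            circuit.append((mode, x))
            if p is None:
                break
    return circuit

def closure_probability(circuit):
    """Evaluate the ssp circuit from the innermost pswitch outward."""
    prob = circuit[-1][1]
    for mode, x in reversed(circuit[:-1]):
        prob = x * prob if mode == "series" else 1 - (1 - x) * (1 - prob)
    return prob

S = [Fraction(a, 5) for a in range(1, 5)]
circuit = greedy(Fraction(3, 7), S, n=5, m=2)
print(circuit, closure_probability(circuit))
```

With $S = \{\frac{1}{5}, \ldots, \frac{4}{5}\}$, $p_d = \frac{3}{7}$, $n = 5$, and $m = 2$, this sketch returns a circuit with closure probability $\frac{1337}{3125} \approx 0.4278$. Each step evaluates $|S|^m$ candidate blocks, so the cost grows with $|S|^m \frac{n}{m}$ rather than $|S|^n$.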
B. Approximation Error when |S| = 1
When $S$ has only one element, say $S = \{p\}$, the greedy algorithm above becomes very simple. If $p_k > p$, then we insert one pswitch in parallel; otherwise, we insert it in series. Fig. 11 demonstrates how to approximate $\frac{1}{2}$ using four pswitches of probability $\frac{1}{3}$.
Note that in the greedy algorithm, when $p$ is close to $\frac{1}{2}$, the probability of the resulting circuit converges quickly to the desired probability. But when $p$ is close to 0 or 1, convergence is slower. In the following theorem, we provide an upper bound for the approximation error of the desired probability when $|S| = 1$.
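The single-pswitch rule is simple enough to trace by hand or in a few lines of code. The following Python sketch (the function name `single_pswitch_circuit` is hypothetical) applies the rule "parallel if the current target exceeds $p$, series otherwise" and then evaluates the resulting ssp circuit exactly. It assumes $p$ is passed as a `Fraction` in $(0, 1)$.

```python
from fractions import Fraction

def single_pswitch_circuit(p_d, p, n):
    """Closure probability of the ssp circuit obtained from at most n
    pswitches of probability p when approximating p_d."""
    target, modes = Fraction(p_d), []
    for _ in range(n):
        if target == p:                   # exact match: stop early
            modes.append("exact")
            break
        if target > p:                    # parallel insertion
            modes.append("parallel")
            target = (target - p) / (1 - p)
        else:                             # series insertion
            modes.append("series")
            target = target / p
    prob = p                              # innermost pswitch alone
    for mode in reversed(modes[:-1]):
        prob = p * prob if mode == "series" else 1 - (1 - p) * (1 - prob)
    return prob

approx = single_pswitch_circuit(Fraction(1, 2), Fraction(1, 3), 4)
print(approx, abs(approx - Fraction(1, 2)))
```

For the target $\frac{1}{2}$ with four pswitches of probability $\frac{1}{3}$, the sketch inserts the pswitches in the order parallel, series, parallel, parallel and obtains $\frac{37}{81} \approx 0.457$, an error of $\frac{7}{162}$.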
Theorem 13 (Approximation error when |S| = 1). Given $n$ pswitches, each with probability $p$, and a desired probability $p_d$, the greedy algorithm (Algorithm 2) with $m = 1$ generates an ssp circuit $C$ with approximation error
$$e = |P(C) - p_d| \le \frac{(\max\{p, 1-p\})^n}{2},$$
where equality is achieved when
In the following proof, we only consider the case $p < \frac{1}{2}$. By duality, the result also holds for $p > \frac{1}{2}$. We induct on the number of pswitches. For one pswitch, the result is trivial: the worst-case desired probability is $p + \frac{1-p}{2}$, with approximation error $\frac{1-p}{2}$. Now assume the result of the theorem holds for $n$ pswitches; we want to prove that it also holds for $n + 1$ pswitches.
Let $p_1 = p_d$ be approximated with $n + 1$ pswitches using Algorithm 2. At the beginning, one pswitch is inserted in series if $p_d < p$, or in parallel if $p_d > p$. According to Lemma 12, the approximation error of $p_1$ is
$$e_1 = r(p)\, e_2,$$
where $r(p) \le \max\{p, 1-p\}$, and $e_2$ is the approximation error of $p_2$. According to our assumption, we know that
$$e_2 \le \frac{(\max\{p, 1-p\})^n}{2}.$$
So we have
$$e_1 \le \frac{(\max\{p, 1-p\})^{n+1}}{2}.$$
Note that equality is achieved if $r(p) = \max\{p, 1-p\}$ and $e_2 = \frac{(\max\{p, 1-p\})^n}{2}$. In this case, $p_2 = f_n(p) \ge \frac{1}{2} > p$ and the last pswitch is inserted in parallel. As a result, we have
$$p_1 = 1 - (1 - p)(1 - f_n(p)).$$

If we let $p = \frac{1}{2}$, the theorem shows that for any desired probability $p_d$ and any integer $n$, we can find an ssp circuit with $n$ pswitches to approximate $p_d$ such that the approximation error is at most $\frac{1}{2^{n+1}}$. This agrees with the result in [15]: Given a pswitch set $S = \{\frac{1}{2}\}$, all rational $\frac{a}{2^n}$, with $0 < a < 2^n$, can be realized using at most $n$ pswitches.
C. Approximation Error when |S| > 1
In this subsection, we show that using the greedy algorithm (Algorithm 2) with small $m$, such as 1 or 2, we can construct a circuit that approximates any desired probability well. Here, given a pswitch set $S = \{s_1, s_2, \ldots, s_{|S|}\}$ with its elements listed in increasing order, we define its maximal interval $\Delta$ as
$$\Delta = \max_{0 \le i \le |S|} (s_{i+1} - s_i),$$
where we let $s_0 = 0$ and $s_{|S|+1} = 1$. In the following theorems, we will see that the approximation error of the greedy algorithm depends on $\Delta$ and can decrease rapidly as $n$ increases.
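As a small illustration, the maximal interval can be computed as the largest gap between consecutive points once 0 and 1 are added to the pswitch set. The function name `maximal_interval` is hypothetical, and the sketch assumes the elements of $S$ are `Fraction` values in $(0, 1)$.

```python
from fractions import Fraction

def maximal_interval(S):
    """Largest gap between consecutive points of S together with 0 and 1."""
    points = [Fraction(0)] + sorted(S) + [Fraction(1)]
    return max(b - a for a, b in zip(points, points[1:]))

print(maximal_interval([Fraction(a, 5) for a in range(1, 5)]))
```

For $S = \{\frac{1}{5}, \frac{2}{5}, \frac{3}{5}, \frac{4}{5}\}$ every gap equals $\frac{1}{5}$, so $\Delta = \frac{1}{5}$. For a single pswitch $S = \{\frac{1}{3}\}$ the largest gap is $\frac{2}{3}$.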
Let us first consider the case $m = 1$.
Proof:
In the following proof, we only consider the case that $n$ is odd. If the result holds for odd $n$, then it also holds for even $n$. In order to simplify the proof, we assume that $s_0 = 0$ and $s_{|S|+1} = 1$ also belong to $S$; i.e., there are pswitches with probability 0 or 1. This assumption does not affect our conclusion.
We write $n = 2k + 1$ and induct on $k$. When $k = 0$, the result is trivial, since the approximation error of one pswitch satisfies $e \le \frac{\Delta}{2}$. Assume the result holds for $2k + 1$ pswitches. We want to show that it also holds for $2(k + 1) + 1$ pswitches.
When $m = 1$ in the greedy algorithm, if we want to approximate $p_1 = p_d$ with $2(k + 1) + 1$ pswitches, we should insert a pswitch with probability $\arg\min_x f(x, S, p_1)$ in the first step, where $f(x, S, p_1)$ is defined in (1).
Let $x_{\mathrm{upper}} = \min\{x \in S \mid x > p_1\}$ and $x_{\mathrm{lower}} = \max\{x \in S \mid x < p_1\}$. Since $0 \in S$ and $1 \in S$, we know that $x_{\mathrm{upper}}$ and $x_{\mathrm{lower}}$ exist.
(1) We first consider the case that $1 - x_{\mathrm{lower}} \le x_{\mathrm{upper}}$. In this case, we insert $x_{\mathrm{lower}}$ in parallel as the first pswitch. Therefore, we can get
According to the definition of $\Delta$, there exists a pswitch $x \in S$ such that $p_2 \le x < p_2 + \Delta$. Assume that the algorithm inserts pswitch $x^*$ as the second one. Since $x^*$ is locally optimal, we have
Assume the approximation error of $p_3$ is $e_3$. According to Lemma 12,  Then we have the same result as the first case.
According to the two theorems above, when we let ∆ → 0, the approximation error for m = 1 is upper bounded by k . It shows that the greedy algorithm has good performance in terms of approximation error, even when m is very small. Comparing with the case of m = 1, if we choose m = 2, the probability of the constructed circuit can converge to the desired probability faster as the circuit size n increases.
In the following theorem, we consider the special case $S = \{\frac{1}{q}, \frac{2}{q}, \ldots, \frac{q-1}{q}\}$ for some integer $q$. In this case, we obtain a new upper bound for the approximation error when using the greedy algorithm with $m = 2$. This bound is slightly tighter than the one obtained in Theorem 15.
The proof is similar to the proof of Theorem 15, so we simply provide a sketch. Assume that in each step, we insert two pswitches in the following way (see Fig. 12 ):
(1) If p k ∈ [0, in series or in parallel. In this case, in series, and then insert a pswitch x 2 = q−1 q in parallel. In this case,
Based on the above analysis, we know that for any $p_k \in (0, 1)$, we can always find $\mathbf{x} = (x_1, x_2)$ such that $f((x_1, x_2), S, p_k) \le \Delta(1 - \Delta)$.
Hence, the result of the theorem can be proved by induction.
This completes the proof.

Let $S = \{\frac{1}{5}, \frac{2}{5}, \ldots, \frac{4}{5}\}$, and suppose we want to realize $\frac{3}{7}$ using five pswitches. Using the greedy algorithm with $m = 2$, we get the circuit in Fig. 13, whose probability is approximately 0.4278, with approximation error $e \approx 7.3 \times 10^{-4}$, which is very small.
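For comparison with the greedy result, one can also search all ssp circuits of this size exhaustively. The Python sketch below is an illustration only (the function name `best_ssp_circuit` is hypothetical); it swaps in brute-force enumeration for the greedy algorithm, which is feasible here because $|S| = 4$ and $n = 5$ are small. It builds circuits one pswitch at a time and keeps one construction per reachable probability.

```python
from fractions import Fraction

def best_ssp_circuit(target, S, n):
    """Exhaustively search ssp circuits with at most n pswitches from S.
    Returns (error, probability, construction) for the closest circuit,
    where the construction lists (mode, pswitch) pairs from the first
    pswitch placed to the last one added."""
    level = {x: [("start", x)] for x in S}
    best = min((abs(p - target), p, c) for p, c in level.items())
    for _ in range(n - 1):
        nxt = {}
        for c_prob, c in level.items():
            for x in S:
                # Add pswitch x to an existing ssp circuit.
                nxt.setdefault(x * c_prob, c + [("series", x)])
                nxt.setdefault(1 - (1 - x) * (1 - c_prob),
                               c + [("parallel", x)])
        level = nxt
        best = min(best, min((abs(p - target), p, c)
                             for p, c in level.items()))
    return best

S = [Fraction(a, 5) for a in range(1, 5)]
error, prob, construction = best_ssp_circuit(Fraction(3, 7), S, 5)
print(float(prob), float(error), construction)
```

Since $\frac{1337}{3125} \approx 0.4278$ is reachable with five pswitches, the error returned by the exhaustive search is no larger than $|\frac{3}{7} - \frac{1337}{3125}| = \frac{16}{21875} \approx 7.3 \times 10^{-4}$. The number of candidates grows exponentially with $n$, which is the cost the greedy algorithm is designed to avoid.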
VI. CONCLUSION
In this paper, we have studied the robustness and synthesis of stochastic switching circuits. We have shown that ssp circuits are robust against small error perturbations, while general sp circuits are not. As a result, we focused on constructing ssp circuits to synthesize or approximate probabilities. We generalized the results in [15] and proved that when $q$ is a multiple of 2 or 3, all rational fractions $\frac{a}{q^n}$ can be realized using ssp circuits when the pswitch set is $S = \{\frac{1}{q}, \frac{2}{q}, \ldots, \frac{q-1}{q}\}$. However, this property does not hold when $q$ is a prime number greater than 3. For the more general case of an arbitrary pswitch set, we proposed a greedy algorithm to construct ssp circuits. This method can approximate any desired probability with low circuit complexity and small errors.
