Modern microelectronic technology gives opportunities to build digital circuits of huge complexity and provides a wide diversity of logic building blocks. Although logic designers have been building circuits for many years, they have realized that advances in microelectronic technology are outstripping their abilities to make use of the created opportunities. In this paper, we present the fundamentals of a logic design methodology which meets the requirements of today's complex circuits and modern building blocks. The methodology is based on the theory of general full-decompositions which constitutes the theory of digital circuit structures at the highest abstraction level. The paper explains the theory and shows how it can be used for digital circuit synthesis. The decomposition methodology that is presented ensures "correctness by construction" and enables very effective and efficient post-factum validation. It makes possible extensive examination of the structural features of the required information processing in relation to a given set of objectives and constraints.
On the other hand, traditional logic design methods are not suitable for very complex circuits or implementations with constrained building blocks for the following main reasons: they are only devoted to some very special cases of possible implementation structures, they often leave unconsidered some important parameters that significantly influence the actual design objectives, they often fail to find global optima for large designs, they do not consider hard constraints, and they often do not consider correctness aspects in an appropriate manner.
Logic synthesis is typically performed without any relation to the target implementation structure and therefore, a technology mapping must be applied in order to map the synthesized logic network into a network of building blocks that can be implemented. Sometimes, the technology mapping algorithm is unable to construct any implementable network from a given initial network and it cannot guarantee an optimal solution, if the initial network is constructed without any regard to future implementation.
The bad practice of target-independent logic synthesis follows from the lack of appropriate modelling tools and synthesis methods for digital circuit structures. Traditional logic modelling tools model circuits in terms of functionally complete systems composed of a minimal number of some special structural elements (e.g. AND+OR+NOT, NAND, NOR, MUX or AND+EXOR) instead of modelling them in terms of all structural elements at the designer's disposal or, just generally, in terms of all possible subcircuits. For example, the commonly used Boolean algebra enables us to express all the possible Boolean functions but fails to model their implementation structures. Boolean algebra makes it possible to decompose functions exclusively into networks consisting of AND, OR, and NOT subfunctions, or into the equivalent NAND or NOR networks, while in general they can be decomposed into subfunctions of any kind.
From the above we can conclude that the opportunities created by microelectronic technology cannot effectively be exploited. It has become extremely important to develop a new generation of methods which will effectively and efficiently deal with design complexity and the characteristic features of modern building blocks, enabling modelling and synthesis of all reasonable circuit structures and providing "correctness by construction", easy correctness verification and intelligent search algorithms for the effective and efficient exploration of the huge space of correct circuit structures. In order to solve the problem, a structural decomposition approach may be used. It consists of transforming a system into the structure of two or more cooperating sub-systems in such a way that the original system's behaviour is retained and certain constraints and objectives are satisfied.
The theoretical work in this field was started by Ashenhurst [4] and Curtis [10] for combinational circuits and by Hartmanis [16] [17] [18] for sequential circuits in the early 1960s. However, they over-simplified the actual problems and left unconsidered some important parameters that significantly influence the actual design objectives. For example, Hartmanis and others only partly considered decomposing the internal states of sequential machines. It was an incomplete solution, because the most important design parameters of a circuit for implementing a sequential machine (complexity, speed, testability etc.), or the possibility of implementing a machine with limited building blocks, depend on the whole implementation structure, i.e. on the distribution of the machine's inputs, outputs, state memory, functionality, and interconnections between the building blocks. So, from the practical viewpoint, decomposing the whole sequential or combinational process into an appropriate structure, i.e. full-decomposition, is necessary. The theoretical works devoted to decomposition from the 1960s and 1970s should be considered as first steps on the way towards a complete decomposition theory. The first practical solutions were obtained in the 1980s (e.g. [1] [5] [21] [27] [35] [45] [46]).
The strongest stimulus for developing decomposition methods and tools came recently from the newest generation of multi-block programmable devices. In the case of fine-grained multi-block FPGAs, the hard constraints are active for virtually all non-trivial circuits. Implementation is impossible without decomposition.
In this paper, we will present the fundamentals of a decompositional design methodology which meets the requirements of today's complex circuits and modern microelectronic technology. The methodology is based on the theory of general full-decompositions which has been developed by us during the last few years and applied in a number of prototype decomposition tools [22] [23] [24] [25] [26] [27] [28] [29] [30] . General decomposition consists of transforming a sequential machine or a Boolean function into the structure of two or more cooperating partial machines in such a way that the original machine's behaviour is retained, and all the important structural attributes relating to inputs, outputs, state memory elements, functional units, and interconnections between the units, are appropriately considered at the same time.
Our previous publications focused on heuristic search algorithms for decomposition and presented some benchmark results [26] [27] [28] [29] [30]. The main aim of this paper is to present a general full-decomposition model and to show that this model, together with its theorem, constitutes the theory of digital circuit structures at the highest abstraction level and forms a sound base for the construction of decomposition algorithms. Since general full-decomposition includes various special decomposition types for sequential and combinational circuits, the general decomposition theorem will also be interpreted for some important special cases. Another aim of this paper is to explain how the model can be used for digital circuit synthesis when focusing on correctness and optimization aspects.
[Figure: An original sequential machine.]
Sometimes, the design requirements do not completely specify a machine. For example, certain input/state combinations may never occur due to external constraints or due to realizing the machine in such a way that some of the input/state combinations of the realization are not used for implementing the original machine. Therefore, from the functional viewpoint, the designer does not care what the next-states or outputs for such combinations will be. Sometimes, outputs are sampled only at specified times: when they are not being sampled, they may be unspecified. If a certain input/state combination is followed by a general reset signal, the output for this combination should be specified, but the next-state need not. In all such situations one talks about so-called "don't care" conditions. "Don't cares" are commonly denoted by "-". To account for "don't care" conditions, the sequential machine definition should be extended by slightly changing the definitions of the functions $\delta$ and $\lambda$: $\delta: S \times I \to S \cup \{-\}$ and $\lambda: S \times I \to O \cup \{-\}$ (for a single output machine) or $\lambda = [\lambda_j]$, $\lambda_j: S \times I \to O_j \cup \{-\}$, $O = [O_j]$ (for a multiple output machine). A sequential machine without "don't cares" will be referred to as completely specified and with "don't cares" as incompletely specified.
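The extended definitions can be made concrete with a small table-based model. The following Python sketch is illustrative only: the two-state machine, the use of `None` for the "-" symbol, and the helper names `is_completely_specified` and `agrees_where_specified` are assumptions made for this example and are not part of the paper's formalism.

```python
from typing import Dict, Optional, Tuple

DONT_CARE = None  # stands for the "-" symbol

# Hypothetical machine; table keys are (state, input) pairs.
delta: Dict[Tuple[str, int], Optional[str]] = {
    ("s0", 0): "s0", ("s0", 1): "s1",
    ("s1", 0): "s0", ("s1", 1): DONT_CARE,  # next state left unspecified
}
lam: Dict[Tuple[str, int], Optional[int]] = {
    ("s0", 0): 0, ("s0", 1): 1,
    ("s1", 0): DONT_CARE, ("s1", 1): 1,     # output left unspecified
}


def is_completely_specified(next_state, output) -> bool:
    """True when neither table contains a don't-care entry."""
    entries = list(next_state.values()) + list(output.values())
    return all(v is not DONT_CARE for v in entries)


def agrees_where_specified(spec, impl) -> bool:
    """True when impl matches spec on every entry that spec specifies."""
    return all(v is DONT_CARE or impl[k] == v for k, v in spec.items())
```

Any table that fills the unspecified entries arbitrarily passes `agrees_where_specified`, which is the freedom that "don't cares" give to the designer.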
The product ($\cdot$) and sum ($+$) operations, as well as the ordering relation ($\le$), for bit partitions are defined in the same way as for "normal" partitions, but a block of the bit-partition product that is the product of a block (important or don't care) with an important block is an important block, and a block of the bit-partition sum that is the sum of some blocks (important or don't care) with a don't care block is a don't care block. The zero bit-partition is defined as a bit partition with an empty don't care block.
… is a symbol partition induced by a bit partition. … i.e. (partial) output information from a certain component machine j, $1 \le j \le n$, is transmitted separately to the input of a certain machine i, $1 \le i \le n$, i.e. without combining it with (partial) output information from other partial machines k, $1 \le k \le n$, $k \ne j$.
A general composition is said to be in maximally preprocessed form if the connection rules Con compute the scalar values, i.e. information transmitted from various partial machines to a certain machine is combined prior to connecting it to the input of this machine. Of course, the compositions in partially preprocessed form, lying between the two above extremes, are also possible.
Allowing for external local connections between the outputs and inputs of a certain component machine Mi gives more freedom in describing the circuit structure.
The machine M can influence its own behaviour partly through its internal state and partly by affecting the inputs. The precise form of this influence is defined by a specific choice of the connections Con and machine functions $\delta_i$ and $\lambda_i$.
Formal definitions for compositions TC of various special types T can be introduced in a very similar way, as special cases of the above definition [22] [23] [24] [25].
In a general composition, there is a danger of information loops occurring in the exchanged information. Such loops at the level of elementary (binary) signal lines will result in sequential behaviour of the two interconnected combinational circuits which compute $\lambda$, instead of the required combinational behaviour. We say that a general composition is legal if and only if the composition $\lambda^*$ of $\lambda$ is guaranteed to be a function. This is satisfied if and only if the signal values of each elementary (binary) signal used for information exchange between the partial machines are computed independently of the values of this signal. Of course, cyclic signal flows can occur exclusively due to the interconnection circuitry. Checking for acyclic signal flow is equivalent to tracing the primary information sources, i.e. checking whether the signal values of each elementary signal line, used for information exchange between partial machines or for transmitting information between the outputs and inputs of a certain machine, are originally computed from the primary input and state information of the composition machine only. Therefore, the partial machines must together possess enough primary input and state information to compute all the information transmitted by interconnections. The composition's legality guarantees that the information that has to be transmitted will be computed by the partial machines.
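Tracing the primary information sources can be sketched as a cycle check on a signal dependency graph. The Python sketch below is an assumption-laden illustration: the signal names and the `depends_on` mapping are hypothetical, and a purely structural check of this kind is conservative, since it rejects every structural loop even when the loop would carry no information.

```python
from graphlib import TopologicalSorter, CycleError


def is_structurally_legal(depends_on) -> bool:
    """depends_on maps each signal line to the signals it is computed from.

    Signals that appear only as sources (primary inputs, state variables)
    need no entry of their own. Returns False when some exchanged signal
    depends, directly or indirectly, on itself.
    """
    try:
        TopologicalSorter(depends_on).prepare()  # raises on any cycle
        return True
    except CycleError:
        return False


# Serial flow: M1 sends c12 to M2, nothing flows back.
serial = {"c12": {"x1", "s1"}, "y": {"c12", "x2", "s2"}}

# Mutual exchange in which each transmitted signal needs the other one.
looped = {"c12": {"x1", "c21"}, "c21": {"x2", "c12"}}
```

With these inputs, `is_structurally_legal(serial)` holds while `is_structurally_legal(looped)` does not, mirroring the distinction between unidirectional flow and an information loop.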
Fortunately, the legality is structurally guaranteed for most of the special cases of the general composition because the information loops are, in most cases, impossible at the level of total information flows between the outputs and inputs of the partial machines. One has to check for non-closed loops at the level of partial information flows (signal lines) only in the most general cases of the general composition, i.e., for machines other than Moore machines in cases where the exchanged information is computed in more than one partial machine when using some information transmitted from the other machines or in the presence of local connections. In particular, the composition legality is structurally guaranteed for Moore machines (where the partial state information of the component machines is transmitted between the partial machines) and for the following compositions of Mealy machines without local connections: parallel compositions (without information exchange), serial compositions (with unidirectional information flow) and compositions where the exchanged information is only computed from the (primary) input and state information of each partial machine itself.
The proof of the theorem can be found in the Appendix.
GENERAL FULL-DECOMPOSITION
Theorem 1 can be interpreted in terms of the equivalence relations induced by the appropriate partitions, and it is interpreted graphically in Fig. 6. … (2)), the composition of the partial machines is legal (condition (3)). For incompletely specified machines or multi-state realizations, partitions have to be replaced with set systems and a theorem similar to Theorem 1 can be proved. Its interpretation will be slightly different. The classifications of the elements from $S \times I$ computed by the original and component machines will no longer define equivalence relations but compatibility relations denoted by the appropriate set systems. In these relations each element can be a member of many compatibility classes, because the compatibility relations are not required to be transitive.
The next section of the paper is devoted to the discussion of some important special cases of the general full-decomposition model and Theorem 1. Usage of the model for decompositional logic synthesis is discussed in Section 6 and an example to illustrate the usage is presented in Section 7.
SPECIAL CASES
The general full-decomposition model covers all other known structural models for sequential and combinational circuits, including the following: parallel full-decompositions [19] [20] [21] [22] [23] [24] [25] [26] [27], in which each of the component machines can compute its own next-state and output independently (Fig. 7); serial full-decompositions [22] [23] [24] [25] [26] [27], in which only one of the component machines (M2) uses information from the second machine (M1) in order to compute its own next-state and output (Fig. 8); decompositions with the separate realization of the next-state and output functions [24] (Fig. 9); bit full-decompositions [25] [26] [27], where the decoders and O are reduced to the appropriate distribution of the input and output bit lines (Fig. 10); input-bit parallel full-decompositions (referred to in the literature as cascade decompositions [30], serial decompositions [35], Boolean decompositions [14] [44], decompositions with generalized decoders [9] [44], three-level decompositions [37], or decompositions into submachines [46]) (Fig. 11); bit parallel full-decompositions (parallel decompositions [21], output decompositions [30]) (Fig. 12).
Below, the general decomposition theorem will be interpreted for a number of important special cases. For reasons of simplicity in presentation, we consider the decompositions with only two machines and without local connections later in the section; however, the presented results can be very easily extended to n machines.
… (0) is satisfied, then the state behaviour of M will also be realized.
Theorem 2 is interpreted graphically in Fig. 13. … only if two bit partition trinities $(\pi_{IB}, \pi_S, \pi_{OB})$ and $(\tau_{IB}, \tau_S, \tau_{OB})$ exist that satisfy the following conditions: (1) $(\pi_I, \pi_S)$ and $(\tau_I, \tau_S)$ are IB-S partition pairs, where $\pi_I = \mathrm{ind}(\pi_{IB})$, $\tau_I = \mathrm{ind}(\tau_{IB})$,
(2) *rs and "rs are SP-partitions, (1) *ri" "ri' <-*ri and "ri" *ri' -< "ri, where: "ri'->'ri, *rlr *ri * (2) *ri "ri -< *ri' and *ri ''ri -< "ri', (3) (*ri ""ri *ro(0)) is an I-O partition pair.
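The conditions of these theorems are stated in terms of the product and ordering of partitions. For readers less familiar with partition algebra, the following Python sketch shows the standard operations on "normal" partitions, represented here as sets of `frozenset` blocks. The function names and the representation are choices made for this illustration; the additional bookkeeping for important and don't care blocks of bit partitions is not modelled.

```python
from itertools import product as cartesian


def part(*blocks):
    """Build a partition from iterables of elements."""
    return {frozenset(b) for b in blocks}


def p_product(p1, p2):
    """Product: all non-empty pairwise intersections of blocks."""
    return {a & b for a, b in cartesian(p1, p2) if a & b}


def p_sum(p1, p2):
    """Sum: smallest partition in which overlapping blocks are merged."""
    blocks = [set(b) for b in p1]
    for b in p2:
        merged, rest = set(b), []
        for c in blocks:
            if c & merged:
                merged |= c      # c shares an element with b
            else:
                rest.append(c)
        blocks = rest + [merged]
    return {frozenset(b) for b in blocks}


def p_leq(p1, p2):
    """p1 <= p2 when every block of p1 lies inside one block of p2."""
    return all(any(a <= b for b in p2) for a in p1)
```

For example, with `p1 = part({1, 2}, {3, 4})` and `p2 = part({1, 3}, {2}, {4})`, the product consists of four singleton blocks and the sum is the single block `{1, 2, 3, 4}`.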
For parallel decomposition, Theorem 9 is reduced to the following theorem. Theorem 10. A combinational machine M has a parallel full-decomposition if two partition doubles (*ri, *ri and ("ri, "ri exist that satisfy the following conditions:
(1) *rI -< *rI, and "rI -< "ri, (2) (*ri ""ri, *ro(0)) is an I-O partition pair.
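The I-O partition pair condition can be illustrated on a tiny combinational function. The sketch below assumes that $\pi_O(0)$ denotes the zero output partition, in which every output value forms its own block; under that assumption the condition says that all inputs within one block of the input partition must produce the same output. The XOR function, the two input partitions, and the helper names are hypothetical and chosen only for this example.

```python
from itertools import product as cartesian


def p_product(p1, p2):
    """Non-empty pairwise intersections of the blocks of p1 and p2."""
    return [a & b for a, b in cartesian(p1, p2) if a & b]


def is_io_pair_with_zero(p_in, f) -> bool:
    """True when each block of p_in maps to exactly one output value."""
    return all(len({f[x] for x in block}) == 1 for block in p_in)


# Inputs 0..3 encode two bits; f is their exclusive OR.
f = {0: 0, 1: 1, 2: 1, 3: 0}

# Information seen by the first partial machine: the high bit only.
pi_I = [frozenset({0, 1}), frozenset({2, 3})]
# Information seen by the second partial machine: the low bit only.
tau_I = [frozenset({0, 2}), frozenset({1, 3})]
```

Neither partition alone determines the output, but their product does, so the two partial information streams together suffice to compute `f`.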
For serial decomposition, the following theorem will result from Theorem 9.
Theorem 11. A combinational machine M has a serial full-decomposition if partition doubles (*ri, *rI and ("rI, *) "ri exist that satisfy the following conditions:
(1) *ri <-*ri and "r *ri' -< "ri, where: *ri' -> *ri, …
6. USING THE MODEL IN DIGITAL CIRCUIT SYNTHESIS
Decompositional Logic Synthesis
The aim of synthesis is to provide a circuit structure that realizes the specified behaviour, satisfies certain constraints and optimizes specific objectives.
In general, the constraints and objectives refer to the circuit's performance and how the various resources are used during the whole life-cycle of the circuit. They can be formulated along various dimensions such as time, area, inputs, outputs, power consumption, testability, reliability, maintainability, design time or cost etc.
In our methodology, the behaviour is specified in the form of an original sequential machine or Boolean function and the physical constraints and objectives are modelled as a constrained multi-objective optimization problem.
Decompositional synthesis consists of applying the general decomposition model and theorem, or their special cases, a number of times and in this way, adding the structural information to the design specifications until a directly implementable design description is obtained.
By repetitive use of the general full-decomposition model or its special cases, all possible implementation structures for sequential and combinatorial machines (all meaningful partial machine networks) can be obtained.
The appropriate decomposition theorems guarantee correctness by construction and limit the search for solutions to the decompositional structures that realize the specified behaviour. The model, together with its theorems, forms a basis for decompositional synthesis. Information on how the model was used during synthesis allows us to check the correctness of the synthesis in a relatively easy way, by mapping the synthesis result backward into the specification. In this manner, it is possible to check the correctness of the work of the human designer or the automatic design tool (see Section 6.2 for further information and Section 7 for an example).
For small sequential or combinational machines, the optimal decompositions can be found by implicit enumeration, limited only by the properties of the building blocks and the algebraic properties described by the appropriate decomposition theorems and partition pair theory. For large systems, the number of possible decompositions is so great that an implicit exhaustive search, performed using only the algebraic and building block properties, is impossible. It becomes necessary to construct the most promising decompositions using the theorems presented in this paper together with the appropriate heuristics. The heuristic evaluation functions and selection mechanisms must limit the search space to a manageable size and keep high-quality solutions in this limited space.
Our methodology guarantees "correctness by construction", easy post-factum correctness verification and satisfaction of all the originally specified constraints. The objectives can only be satisfied near-optimally, because the problems at hand are computationally complex and heuristic algorithms must be used.
Correctness Aspects
Currently, simulation and prototype testing are commonly used to validate designs, but this approach is not sufficient for complex circuits. The techniques of formal validation are more promising and therefore have been applied in our methodology. We will show that it is possible to use them very effectively and efficiently. Proving correctness consists of providing evidence of the fact that the realization relation holds between an original specification and its implementation. Since synthesis involves adding detailed information to specification, proving correctness must involve the opposite i.e. abstracting from the information and in this way relating the detailed implementation description to its more abstract specification. The correctness-proving process associated with the decompositional logic synthesis is performed as a series of abstraction steps which are used to gradually relate the lower level design descriptions to the immediate higher level specifications, starting from the bottom level implementation and continuing until the original top level specification is reached.
Four types of abstraction are used in this process: structural abstraction (hiding the information about a circuit's internal structure by computing the behavioural description for the composition of partial machines); data abstraction (hiding the information about the implementation of data by replacing the binary data values with their symbolic abstract representations); behavioural abstraction (e.g. leaving unspecified behaviour for certain state/input combinations which will never occur in the operating environment ("don't cares")); and temporal abstraction (relating several units of lower level time to one unit of higher level time, e.g. relating the delays of structural implementation elements to the clock period of a sequential machine).
The structural, behavioural and data abstractions have actually been used to prove the general full-decomposition theorem. This theorem gives the necessary and sufficient conditions which must be fulfilled by each general composition of partial machines, in order to realize the functional behaviour as specified by the original machine. Once proved, the general full-decomposition theorem provides synthesis rules that are problem independent and guarantee functional correctness. The parametric correctness, in the sense of satisfying the hard physical constraints, is achieved with the class specific rules, which are distinct for different classes of target architectures (building blocks). These rules are obtained by modelling the synthesis problem as a multiobjective constrained optimization problem (see Section 6.3). The parametric optimization is guided by the rules that are problem-instance specific. These rules are constructed and selected automatically by the search algorithms, based on information about the characteristic features of a sequential machine related to the characteristics of building blocks and optimization aims [29] [30] [31] [32].
Many researchers and designers are convinced that "correctness by construction" makes post-factum verification unnecessary. This is not true. Even if the construction rules are proved to be correct, their application can be faulty due to mistakes made by designers or errors in the synthesis tools.
The physical constraints are verified by estimating the parameters involved by using the abstract modelling, the lower level synthesis tools, or simulation, and checking the estimates against the constraints. Verification of the near-optimal satisfaction of the objectives consists of checking the performance of the synthesis tools by using benchmarking and statistical analysis of the synthesised designs [29] [30] [31] [32] . The functional correctness is checked by repeatedly applying two elementary verification processes:
(1) checking the correctness of each particular decomposition (transformation) computed by the synthesis tools, i.e. verifying whether the proposed system of partitions and associated state, input, and output mappings satisfies the conditions of the general full-decomposition theorem, and
(2) checking whether each particular planned decomposition has been applied successfully; this is performed by reverse mapping of the decompositional implementation structure into its specification.
In general, design verification is a complex process because one does not know the sequence of transformations which have to be performed, in order to show that an implementation satisfies its specification. In our methodology verification is simple, because the sequence of transformations results from the information produced during synthesis. If information about synthesis transformations to be performed is memorized, then finding the reverse transformations and performing the reverse mapping is very easy (see an example in Section 7). In place of verifying that a certain implementation is a realization of a given specification, as done by traditional verification methods, we check if the specific planned decompositional structure is correct and then we prove that the synthesised implementation is the planned realization of the specification. This results in a very efficient verification process. Of course, it requires prior knowledge of the class of correct structures and knowledge of decisions made for selecting a certain structure from the class of correct structures. The first part of the required knowledge is general and it is given in the form of a general full-decomposition model and its theorem proven in this paper. The second part of the knowledge is problem instance dependent; however, it must be found before constructing the required decomposition. Therefore, the only extra activity to be performed to enable the reverse mapping is keeping a record of the decisions taken during the construction of a certain decompositional structure, i.e. recording what instance of the model is intended to be used. This record can be kept in terms of (partial) machines and appropriate mapping functions. Since the highest level record represents the original machine and lowest level the resulting realization structure, the extra information recorded is limited to the tables of partial machines from the intermediate levels and appropriate mapping functions.
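The role of the recorded mapping functions can be sketched on a toy case. The example below is a hypothetical illustration, not the paper's Section 7 example: a modulo-6 counter is assumed to have been decomposed in parallel into a modulo-2 and a modulo-3 counter, and the state mapping `encode` stands for the record kept during synthesis. Verification steps the partial machines, maps the composite state back, and compares the result with the specification.

```python
SPEC_STATES = range(6)


def spec_next(s):
    """Specification: a modulo-6 counter."""
    return (s + 1) % 6


def m1_next(a):
    """First partial machine: a modulo-2 counter."""
    return (a + 1) % 2


def m2_next(b):
    """Second partial machine: a modulo-3 counter."""
    return (b + 1) % 3


# Mapping recorded during synthesis, and its reverse.
encode = {s: (s % 2, s % 3) for s in SPEC_STATES}
decode = {pair: s for s, pair in encode.items()}


def verify_by_reverse_mapping(step1, step2) -> bool:
    """Check every transition of the realization against the specification."""
    for s in SPEC_STATES:
        a, b = encode[s]
        if decode.get((step1(a), step2(b))) != spec_next(s):
            return False
    return True
```

The cost of the check is one pass over the specification table, and no search for a verification strategy is needed because the mapping is already known.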
Since verification processes are performed by using reverse operators to those used during synthesis, the probability of masking the synthesis faults, by faults during verification, is negligible. Therefore, the verification performed in this way is very reliable. Synthesis faults can be rapidly detected and localized because the elementary verification processes can be immediately performed after finding or applying an elementary transformation.
The post factum verification by reverse mapping is much more efficient than a verification by traditional verification methods which do not use information from the synthesis. The time savings result from the fact that it is no longer required to find the sequence of the verification transformations, because this sequence is unambiguously defined by the sequence of the synthesis transformations. Therefore, the verification time is composed exclusively of time for performing the reverse transformations (which is comparable to time for performing synthesis transformations) and time for comparing the original specification with the result of the reverse mapping (which is proportional to the dimensions of the specification). Verification by reverse mapping is also much easier than proving correctness for the complex software of the synthesis tool and ensuring the correct functioning of its hardware. It is equivalent to showing that the synthesis tool has performed correctly for a particular case. An example of verification by reverse mapping can be found in Section 7.
The principle of reverse mapping is very general. It can be applied for off-line and on-line correctness verification of various systems in all cases where forward transformations are known. In particular, it can be applied for design verification of any kind of transformational design.
Search for Optimal Solutions
For large sequential or combinatorial systems, an exhaustive search for optimal decompositions is impossible. It is necessary to construct only the most promising decompositions when using heuristic search algorithms, and to choose the best of these. In our previous publications [29] [30] [31] [32] , some specific heuristic decomposition algorithms have been described and benchmark results from their software implementation have been presented. This section aims to describe the underlying principles of those algorithms. These principles can be used for construction of the heuristic search algorithms for various decomposition problems.
A specific decomposition problem can be modelled as a special multi-objective constrained optimization problem (MOCOP) [29] [30]. Models without hard constraints can be solved by special multi-objective clustering algorithms [32]. The partitioning process which produces partitions on the set of elementary component machines is preceded by analysis of characteristic features of an original machine related to the characteristics of building blocks. In particular, the input, output, and state information and their interrelations are analysed. This analysis enables us to distinguish elementary component machines and to characterize them and their correlations. Its results are used to guide the heuristic packing or clustering processes.
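One generic way to read a multi-objective constrained formulation is sketched below in Python. It is an assumption of this sketch, not a description of the cited algorithms, that hard constraints are upper limits on named parameters and that the objectives are compared by Pareto dominance (all minimized). The candidate data and the field names `max_inputs`, `blocks` and `wires` are invented for the example.

```python
def feasible(cand, limits) -> bool:
    """Hard constraints: every limited parameter stays within its limit."""
    return all(cand[k] <= limits[k] for k in limits)


def dominates(a, b, objectives) -> bool:
    """a is no worse than b everywhere and strictly better somewhere."""
    return (all(a[o] <= b[o] for o in objectives)
            and any(a[o] < b[o] for o in objectives))


def pareto_front(cands, limits, objectives):
    """Feasible candidates that no other feasible candidate dominates."""
    ok = [c for c in cands if feasible(c, limits)]
    return [c for c in ok
            if not any(dominates(d, c, objectives) for d in ok)]


candidates = [
    {"name": "A", "max_inputs": 5, "blocks": 3, "wires": 4},
    {"name": "B", "max_inputs": 4, "blocks": 4, "wires": 2},
    {"name": "C", "max_inputs": 7, "blocks": 2, "wires": 1},  # too many inputs
    {"name": "D", "max_inputs": 5, "blocks": 4, "wires": 5},  # worse than A
]
front = pareto_front(candidates, {"max_inputs": 5}, ("blocks", "wires"))
```

Candidate C is rejected by the hard constraint regardless of how good its objective values are, which reflects the different roles of constraints and objectives.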
The partitioning problem is represented in terms of a space of states, where each state corresponds to a particular (partial) solution. A (partial) solution consists of a (partially constructed) partition. The tree of (partial) solutions has the form of an implicit tree, i.e. it is defined only by means of an initial state, the rules for generating the tree and the termination criteria. The rules describe how to generate successors to each partial solution (i.e. they define move operators that are (partial) mappings from states to states). Any state that meets a termination criterion is called a goal state. Partitions in packing algorithms are constructed by putting unallocated elements successively into the partition blocks [29] [30].
Clustering algorithms construct the successor partitions by merging some of the partition blocks [32] . Heuristic algorithms are used to select the most promising partial solutions and to develop them further when applying only the best move operators.
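The two kinds of move operators can be sketched as successor generators of an implicit tree. In the Python sketch below, a partition is a tuple of `frozenset` blocks, and the hard constraint is assumed, purely for illustration, to be a maximum block size `capacity`; the function names are hypothetical.

```python
def packing_successors(blocks, unallocated, capacity):
    """Put the next unallocated element into each block with room, or a new block.

    Returns a list of (blocks, remaining_unallocated) states.
    """
    if not unallocated:
        return []
    x, rest = unallocated[0], unallocated[1:]
    successors = []
    for i, b in enumerate(blocks):
        if len(b) < capacity:                      # respect the hard constraint
            new_blocks = blocks[:i] + (b | {x},) + blocks[i + 1:]
            successors.append((new_blocks, rest))
    successors.append((blocks + (frozenset({x}),), rest))
    return successors


def clustering_successors(blocks, capacity):
    """Merge every pair of blocks whose union respects the capacity."""
    successors = []
    for i in range(len(blocks)):
        for j in range(i + 1, len(blocks)):
            if len(blocks[i]) + len(blocks[j]) <= capacity:
                others = tuple(b for k, b in enumerate(blocks) if k not in (i, j))
                successors.append(others + (blocks[i] | blocks[j],))
    return successors
```

Packing starts from an empty partition and a list of unallocated elements, while clustering starts from singleton blocks; in both cases only the rules are stored and the tree is never built explicitly.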
A heuristic search algorithm can be effective and efficient if it is able to appropriately combine a broad search of a solution space in many promising directions with a fast convergence to the (near-)optimal solutions. The fast convergence can result from using the knowledge accumulated in the previous search steps for selecting the most promising (partial) solutions and move operators. …
The selection mechanisms Select Moves and Select States must ensure that a solution that violates the hard constraints will not be constructed, and they will try to satisfy the objectives optimally with a limited expenditure of computation time and memory space. In order to fulfil the first task, Select Moves will select only those move operators which, applied to a given state, do not lead to the violation of hard constraints. In order to fulfil the second task, Select Moves and Select States will select a number of the most promising operators or states, respectively, by using the estimations provided by some heuristic evaluation functions. The selection mechanisms and evaluation functions determine together the extent of the search and quality of the results.
The selection mechanisms use heuristic elaborations of one coherent decision rule: "in each state of the search, take a decision which has the greatest chance of leading to the optimal solution, i.e. a decision which is most certain according to the estimations given by the heuristic evaluation functions". If there are more decisions of the same or comparable quality, a number of them will be tried in parallel (beam-search).
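A beam search with two selection stages can be sketched as follows. This Python sketch is a simplified stand-in and not the authors' algorithm: the packing problem, the block-size limit `capacity`, the `cut_cost` evaluation function (weight of related elements placed in different blocks) and the parameters `max_moves` and `max_states` are all assumptions made for illustration.

```python
def cut_cost(blocks, weights):
    """Total weight of related pairs placed in different blocks."""
    where = {x: i for i, b in enumerate(blocks) for x in b}
    return sum(w for (a, b), w in weights.items()
               if a in where and b in where and where[a] != where[b])


def moves(state, capacity):
    """Successors that never violate the hard block-size constraint."""
    blocks, todo = state
    x, rest = todo[0], todo[1:]
    out = [(blocks[:i] + (b | {x},) + blocks[i + 1:], rest)
           for i, b in enumerate(blocks) if len(b) < capacity]
    out.append((blocks + (frozenset({x}),), rest))
    return out


def beam_search(elements, weights, capacity, max_moves=2, max_states=3):
    """Keep the best max_moves operators per state and max_states states per level."""
    def quality(s):
        return cut_cost(s[0], weights)

    beam = [((), tuple(elements))]
    while beam[0][1]:                                   # elements left to place
        candidates = []
        for state in beam:
            ranked = sorted(moves(state, capacity), key=quality)
            candidates.extend(ranked[:max_moves])       # "Select Moves"
        beam = sorted(candidates, key=quality)[:max_states]  # "Select States"
    return beam[0][0]


weights = {("a", "b"): 5, ("c", "d"): 5, ("b", "c"): 1}
best = beam_search("abcd", weights, capacity=2)
```

Here the evaluation function scores only the part of the solution built so far; a fuller version would add a prediction of the quality of future choices, as discussed in the text that follows.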
According to the above rule, Select Moves applies those move operators which maximize the choice certainty in a given current state and leaves the operators which are open to doubt for future consideration. Since the information contained in the partial solutions and used by the evaluation functions grows with the progress of computations, the uncertainty related to operators decreases. In each computation, Select Moves maximizes the conditional probability that the application of a certain move operator to a certain solution state leads to the optimal complete solution. Under this condition, it maximizes the growth of the information in the partial solution, which is then used in the successive computation steps in order to estimate the quality of choices. Qmax denotes the quality of the best alternative. Poor quality alternatives are not taken into account. Q(PS) can be computed by cumulating the qualities of the choices of operators that took place during the construction of PS and a prediction of the quality of the best possible future choices on the way to the complete solution. Another possibility consists of predicting the quality of the best complete solution that can be achieved from a certain present state PS.
Generally, operators and partial solutions are estimated with some uncertainty. This uncertainty decreases with the progress of computations, because both the "sure" information contained in partial solutions and the quality of prediction grow with this progress. In the first phase of the search, the choices of operators can be made with much more certainty than the choices of partial solutions. In this phase, partial solutions hardly exist or, in other words, they are far from being complete solutions and almost anything can happen to them on the way to achievable complete solutions.
Therefore, in the first phase, the search should be performed almost exclusively based on the choices of operators and, with the progress of computations, more and more on the choices of partial solutions. In our algorithm, this is achieved by giving a relatively low value to MAXMOVES compared to MAXSTATES and a relatively high value to OQFACTOR compared to SQFACTOR.
Since the uncertainty of estimations decreases with the progress of computations, MAXMOVES and MAXSTATES can decrease and OQFACTOR and SQFACTOR can increase with the progress of computations, increasing the search efficiency in this way.
In the method described above, the double beam-search allows for effective and efficient decision-making under changing uncertainty.
In the first search phase the algorithm is divergent to a high degree, i.e. a large number of the most promising directions in the search space are tried. In the second phase, when it is already possible to estimate the search directions and operators with a relatively high degree of certainty, the search becomes more and more convergent. The highly divergent character of the search in the first phase, combined with the continuous interplay between the partial solutions in the second phase, results in the global character of the double-beam algorithm.
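A minimal sketch of such a double beam-search is given below. It rests on several assumptions that are not stated in the text: qualities are positive numbers; MAXMOVES and MAXSTATES are read as upper bounds on the number of operators kept per state and of states kept per level; and OQFACTOR and SQFACTOR are read as fractions of Qmax below which an alternative is discarded. The parameters are kept constant here for brevity, and the toy packing problem (four items, two blocks of capacity 6) and all function names are hypothetical.

```python
def select_best(candidates, quality, max_count, factor):
    """Keep at most max_count candidates whose quality is at least
    factor * Qmax, where Qmax is the quality of the best candidate."""
    if not candidates:
        return []
    scored = sorted(((quality(c), c) for c in candidates),
                    key=lambda qc: -qc[0])
    q_max = scored[0][0]
    return [c for q, c in scored if q >= factor * q_max][:max_count]


def double_beam_search(initial, moves, apply_move, is_goal, feasible,
                       move_quality, state_quality,
                       max_moves, max_states, oq_factor, sq_factor):
    beam, goals = [initial], []
    while beam:
        successors = []
        for state in beam:
            # Select Moves: discard operators that violate hard constraints,
            # then keep only the most promising of the remaining ones.
            legal = [m for m in moves(state) if feasible(state, m)]
            chosen = select_best(legal, lambda m: move_quality(state, m),
                                 max_moves, oq_factor)
            successors.extend(apply_move(state, m) for m in chosen)
        goals.extend(s for s in successors if is_goal(s))
        # Select States: keep only the most promising partial solutions.
        open_states = [s for s in successors if not is_goal(s)]
        beam = select_best(open_states, state_quality, max_states, sq_factor)
    return max(goals, key=state_quality) if goals else None


# Hypothetical toy problem: pack four items into two blocks of capacity 6.
ITEMS, CAPACITY, N_BLOCKS = [4, 3, 3, 2], 6, 2
initial = ((0,) * N_BLOCKS, 0)  # (block loads, index of the next item)


def apply_move(state, block):
    loads, idx = state
    new_loads = tuple(l + ITEMS[idx] if b == block else l
                      for b, l in enumerate(loads))
    return (new_loads, idx + 1)


result = double_beam_search(
    initial,
    moves=lambda s: range(N_BLOCKS),
    apply_move=apply_move,
    is_goal=lambda s: s[1] == len(ITEMS),
    feasible=lambda s, b: s[0][b] + ITEMS[s[1]] <= CAPACITY,
    move_quality=lambda s, b: (s[0][b] + ITEMS[s[1]]) / CAPACITY,
    state_quality=lambda s: sum(l * l for l in s[0]) / (N_BLOCKS * CAPACITY ** 2),
    max_moves=2, max_states=4, oq_factor=0.5, sq_factor=0.5)
```

In a fuller version, the four parameters would be updated from level to level, as described above, instead of being passed as constants.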
The search method presented was implemented in a number of decomposition and state assignment programs and, when tested on benchmarks, it efficiently produced very good results [29] [30] [31].
Of course, it is possible to use the solutions found by our constructive double-beam algorithm as good initial solutions for the search algorithms that perform search in the space of complete solutions (e.g. for local searches, simulated annealing, tabu search, or genetic algorithms).
However, this was not necessary in the tested cases, because the double-beam algorithm constructed strictly optimal solutions [29] [30].
Complex multiple general decomposition problems can be solved by decomposing them into systems of more specific subproblems, which are easier to solve than the original problem, and then solving the systems of subproblems by using systems of cooperating subproblem-specific algorithms. In this section, we have discussed only some very general principles of searching for the optimal decompositions. For each particular decomposition problem, the problem-specific features should be used in order to distinguish the elementary component machines (atomic computations) and to perform the partitioning processes effectively and efficiently. In this way the generic packing or clustering algorithms will be transformed into problem-specific algorithms. For example, in the case of a traditional two-level AND-OR decomposition of Boolean functions, the atomic computations can be defined as computations of minterms, the partial machines will be limited to AND circuits which are able to compute product terms, and the output decoder will be limited to an OR circuit. In this case, the decomposition problem can be viewed as clustering of minterms into larger terms. The aim will be to find the minimal number of clusters (terms, partial machines) that realize all the atomic computations (minterms).
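The view of two-level AND-OR synthesis as clustering of minterms can be illustrated with the following sketch. It does not reproduce the double-beam algorithm; as a stand-in it uses the well-known pairwise merging of adjacent terms followed by a greedy cover, which is not guaranteed to find the minimal number of clusters. Terms are written as strings over '0', '1' and '-', and all function names are hypothetical.

```python
from itertools import combinations


def merge(t1, t2):
    """Merge two terms that differ in exactly one specified position;
    return None if they cannot be merged."""
    diff = [i for i, (a, b) in enumerate(zip(t1, t2)) if a != b]
    if len(diff) == 1 and "-" not in (t1[diff[0]], t2[diff[0]]):
        i = diff[0]
        return t1[:i] + "-" + t1[i + 1:]
    return None


def maximal_clusters(minterms):
    """Repeatedly merge terms until no further merge is possible."""
    terms, maximal = set(minterms), set()
    while terms:
        merged, used = set(), set()
        for a, b in combinations(sorted(terms), 2):
            m = merge(a, b)
            if m is not None:
                merged.add(m)
                used.update((a, b))
        maximal |= terms - used  # terms that could not be enlarged
        terms = merged
    return maximal


def covers(term, minterm):
    return all(t in ("-", m) for t, m in zip(term, minterm))


def greedy_cover(minterms):
    """Choose clusters (product terms) until every minterm is realized."""
    clusters = sorted(maximal_clusters(minterms))
    remaining, chosen = set(minterms), []
    while remaining:
        best = max(clusters,
                   key=lambda c: sum(covers(c, m) for m in remaining))
        chosen.append(best)
        remaining -= {m for m in remaining if covers(best, m)}
    return chosen


# Function of two variables that is 1 for the minterms 00, 01 and 11.
terms = greedy_cover(["00", "01", "11"])
```

Each chosen term corresponds to one AND circuit (partial machine), and the OR of the chosen terms plays the role of the output decoder.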
EXAMPLE
The aim of the example is to illustrate the use of the proposed decompositional synthesis methodology for logic synthesis and correctness verification.
Consider a Moore machine M defined by Table I and the following partitions on M: state partitions: For the above partitions, the following statements are true:
(1) $\pi_1$ and $\pi_2$ are SP-partitions, Figure 14.
Since conditions (7)-(8)
Designing the decompositional implementation of M can proceed further with combinational logic synthesis and layout synthesis in order to optimize the specified excitation functions and to find an optimal layout. The combinational logic synthesis can also apply the decompositional paradigm. Since the decompositional realization of M, as given in Tables II-X, has been obtained by using ways of construction that were previously proven correct (Theorems 7 and 8), this realization is correct. However, it is correct only if those ways of construction have actually been applied and not merely planned, i.e. if the human designer or automatic tool which applied these theorems performed fault-free. Since this cannot be guaranteed, the actual application of the correct construction must be checked for correctness.
Applying the concept of reverse mapping, it is possible to make a straightforward check whether the designed composition of partial machines realizes the specified behaviour of M.
In the first step, the decompositional structure of M2 is mapped into its specification. From the tables of M2,1 and M2,2 (Tables V and VI), the table of (Table XIV) with the table for M2 being the specification of M2 obtained during the synthesis process (Table III) uncovers the fault. In one table, state "C" appears in the first row and the third column, whereas in the other table "-" appears in the same place.
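The comparison step of such a reverse-mapping check amounts to an entry-by-entry comparison of two tables. The sketch below uses small hypothetical tables, not the contents of Tables III and XIV, and the function name table_mismatches is an assumption made for illustration.

```python
def table_mismatches(specified, reconstructed):
    """Return (row, column, specified entry, reconstructed entry) for every
    position at which the two tables differ; rows and columns count from 1."""
    mismatches = []
    for r, (row_a, row_b) in enumerate(zip(specified, reconstructed), start=1):
        for c, (a, b) in enumerate(zip(row_a, row_b), start=1):
            if a != b:
                mismatches.append((r, c, a, b))
    return mismatches


# Hypothetical next-state tables; "-" marks an unspecified entry.
specified = [["A", "B", "C"],
             ["B", "-", "A"]]
reconstructed = [["A", "B", "-"],
                 ["B", "-", "A"]]

faults = table_mismatches(specified, reconstructed)
```

An empty list would mean that the composition of partial machines reproduces the specified table; any entry in the list localizes a fault made while applying the construction.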
CONCLUSIONS
Implementation of a sequential machine or a Boolean function requires finding a composition of structural elements that realizes the input-output behaviour specified by a certain machine or function and satisfies a certain set of constraints and objectives. In general, the problem of finding an optimal implementation remains unsolved. Only the special case of two-level logic and unconstrained minimization (in the sense of the minimal term cover) can be handled by exact techniques for designs of up to about 20 inputs [11] and by nearly optimal heuristic techniques for larger designs [6]. Constrained optimization or optimization for objectives other than minimum term cover remained unsolved even for the two-level
82.1 Def. 1, Def.2,(7), (8) QSi(B i, ([x] 'rrli,[(s,x)]'rr's ii)) (14) .f3 [{8(t,z) (t,z)indS
Thus, if condition (5) of Theorem 1 is also satisfied, then from the above calculations, from the definition of the state and output behaviour realization and from Definitions 2 and 3, it follows that a general composition of n component machines Mi will realize the state behaviour of M, i.e. machine M has a general full-decomposition with state and output behaviour realization.
This ends the proof in one direction. This part of the proof shows how to construct the partial machines and their general compositions so that they form decompositional structures which realize the behaviour of a specified sequential machine. In the second part of the proof, it will be shown that the proposed structures are the only possible decomposition structures which solve the general decomposition problem.
Let is a surjective partial function, each element from S must be unambiguously defined by n-tuples of elements from $\pi_{S_i}$, i.e. $\prod_i \pi_{S_i} = \pi_S(0)$. So, condition (5) must be satisfied.
Summarizing, if a sequential machine M has a general full-decomposition then n trinities of partitions $(\pi_{I_i}, \pi_{S_i}, \pi'_{S_i})$ exist, and they satisfy conditions (1)-(5) of Theorem 1. This ends the proof.
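The requirement used in the proof, that each state be unambiguously defined by an n-tuple of blocks, can be checked mechanically: the product of the n state partitions must be the zero partition, in which every block contains exactly one state. The following sketch assumes that a partition is represented as a set of frozensets; the function names and the example partitions are hypothetical.

```python
from functools import reduce


def product(p, q):
    """Product of two partitions: all non-empty pairwise block intersections."""
    return {b & c for b in p for c in q if b & c}


def is_zero_partition(p):
    """True if every block contains exactly one element."""
    return all(len(block) == 1 for block in p)


def identifies_all_states(partitions):
    """True if the product of all the given partitions is the zero partition."""
    return is_zero_partition(reduce(product, partitions))


# Hypothetical state partitions on S = {1, 2, 3, 4}.
p1 = {frozenset({1, 2}), frozenset({3, 4})}
p2 = {frozenset({1, 3}), frozenset({2, 4})}

ok = identifies_all_states([p1, p2])       # blocks intersect in single states
not_ok = identifies_all_states([p1, p1])   # states 1 and 2 stay indistinguishable
```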
