Abstract-In this paper, a new methodology is presented for topology optimization of networked embedded systems as they occur in automotive and avionic systems and, in part, in wireless sensor networks. By introducing a model that (1) is suitable for heterogeneous networks with different communication bandwidths, (2) supports the modeling of routing restrictions, and (3) allows a flexible binding of tasks onto processors, current design issues of networked embedded systems can be investigated. On the basis of this model, the presented methodology first allocates the required resources, which can be communication links as well as computational nodes, and second binds the functionality onto the nodes and the data dependencies onto the links such that no routing restrictions are violated and no capacities of communication links are exceeded. By applying Evolutionary Algorithms, we are able to consider multiple objectives simultaneously during the optimization process and allow for subsequent unbiased decision making. An experimental evaluation as well as a case study from the field of automotive electronics show the applicability of the presented approach.
I. INTRODUCTION
Embedded networks that can be found, e.g., in automotive systems nowadays consist of up to 100 Electronic Control Units (ECUs) which are connected via different types of shared buses. Several communication standards like LIN [10], CAN [4], FlexRay [5], or TTP [15], combined with many design alternatives concerning the computational nodes, increase the design complexity of the entire networked embedded system. Moreover, the networked embedded system executes functionality which is typically distributed and consists of communicating processes statically bound onto computational nodes in the network. A system-level designer hence has to decide which computational and communication resources are required and where to execute tasks in the network such that no overload occurs on nodes and the capacity of communication links is not exceeded. Additionally, all these decisions have to respect different constraints and objectives, like minimizing monetary costs or power consumption, or maximizing fault tolerance.
In this paper, we consider networks that consist of computational nodes which are able to execute a certain amount of software load, and links with a certain capacity for the communication demand between the functions. Our methodology requires a so-called architecture graph [3] containing all available resources. Out of these available resources, the resources for the final system are selected, and the functionality, represented by a problem graph introduced later on, is bound onto the selected network nodes. Moreover, the messages between the functions are bound to the communication links. By respecting multiple objectives, our methodology determines a set of so-called Pareto-optimal solutions that allows for unbiased decision making.
Supported in part by the German Science Foundation (DFG), SFB 694 (Integration elektronischer Komponenten in mobile Systeme) and SPP 1148 (Rekonfigurierbare Rechensysteme).
Several approaches exist that target a similar problem, commonly referred to as design space exploration of embedded systems. Unfortunately, as these approaches are targeted at SoC designs, no straightforward extensions exist for exploring the implementation alternatives of networked embedded systems as they occur in automotive, avionic, or wireless sensor networks.
For signal processing architectures, SPADE (System-level Performance Analysis and Design space Exploration) [9] is a tool for performance analysis. This tool is incorporated into Artemis (Architectures and Methods for Embedded Media Systems), which explores the design space [13].
Another framework, called MILAN (Model-based Integrated simuLAtioN), is a design space exploration tool that works at different levels of abstraction [12]. Hierarchical data flow graphs, including alternatives for the application specification, as well as an architecture template are defined and explored at different levels of detail before simulative evaluation.
Thiele et al. [14] propose a design space exploration methodology based on Evolutionary Algorithms for packet processing applications, called EXPO.
Kianzad and Bhattacharyya propose a framework called CHARMED (Co-synthesis of HARdware-software Multimode EmbeddeD systems) [7] for the automatic design space exploration for periodic multi-mode embedded systems.
Balarin et al. [1] propose Metropolis, a design space exploration framework which integrates tools for simulation, verification, and synthesis.
All these tools have in common that they either do not consider communication at all or assume restricted binding conditions by requiring explicit communication modeling which is prohibitive in networked embedded systems. Here, we will present a strategy for solving a multi-commodity or multi-concurrent flow [6] problem together with the binding of functionality onto computational nodes in the network.
The paper is structured as follows: While the next section introduces the network system model, gives an example and reasons for the chosen model, Section III explains our methodology for topology optimization of networked embedded systems. An evaluation of the proposed strategy will be given in Section IV before concluding in Section V.
II. NETWORK SYSTEM MODEL
The input to the topology optimization framework is a so-called specification graph. In this framework, we strictly separate behavior and structure: the specification graph consists of a problem graph $g_p$, an architecture graph $g_a$, mapping edges $E_m$, and a set of message types $M$. The problem graph and the architecture graph are described in the following.
The problem graph $g_p$ represents the set of applications to be realized by the implementation. Vertices represent processes and edges represent data dependencies between the processes. The architecture graph $g_a(V_a, E_a)$ models the template for the architecture of the system. As mentioned before, the architecture graph contains all available resources. During the topology optimization phase, a subset of these resources is selected for implementation (see Section III). Vertices $v_a \in V_a$ represent resources, and the connections between resources are modeled by edges $e_a \in E_a$. Finally, the mapping edges $e_m \in E_m$ relate vertices of the problem graph $g_p$ to vertices of the architecture graph $g_a$. A mapping edge $e_m \in E_m$ indicates the possible implementation of a process on the corresponding resource. An example of a specification graph is shown in Fig. 1. Gray nodes connected via directed edges represent functions with their data dependencies, and white nodes connected via directed edges represent resources. The dashed edges in Fig. 1 represent the mapping edges.
Fig. 1. Model for the exploration of network topologies: Edges in the problem graph are annotated with demands. Edges in the architecture graph are annotated with capacities. Note that capacities and demands which are equal to zero or infinity, resp., are not shown.
In networks with links supporting different protocols and bandwidths, it is crucial to distinguish different demands. Assume a certain amount of data has to be transferred between two nodes in a network. Between these nodes are two types of network: one which is dedicated to data transfer and supports multi-cell packages, and one which is dedicated to, e.g., sensor values and therefore has a good payload/protocol ratio for one-word messages. In such a case, the same data would cause different traffic in each of the two networks.
Hence, we associate with each edge $e \in E_p$ so-called demand values which represent the required bandwidth when using a given message type or kind of network, respectively. An example of a heterogeneous network with multiple protocols can be found in automotive systems, where CAN buses of different speed grades are connected to, e.g., a LIN or MOST bus. If messages have to be transferred between nodes connected to these different bus systems, a gateway has to be passed which adapts the messages to the corresponding network type.
Definition 5 (Demand): With each edge $e_i \in E_p$ and each message type $m_j \in M$, we associate a value $d_{i,j}$ (possibly $\infty$, if $e_i$ cannot be routed using message type $m_j$) indicating the bandwidth required by $e_i$ when message type $m_j$ is used.
As an example, Fig. 1 shows a problem graph consisting of three nodes with three demands. While the demand between $P_1$ and $P_2$ as well as the demand between $P_1$ and $P_3$ can be routed over both network types ($|M| = 2$), the demand between $P_2$ and $P_3$ can be routed only over a network that can transfer message type $m_2$. This is expressed by setting $d_{2,1}$ for edge $e_2 = (P_2, P_3)$ between $P_2$ and $P_3$ to $\infty$. On the other hand, the supported bandwidth is modeled by so-called capacities, which are associated with the edges $e \in E_a$ in the architecture graph for each message type $m \in M$.
Definition 6 (Capacity): With each edge $e_i \in E_a$ and each message type $m_j \in M$, we associate a real value $c_{i,j} \in \mathbb{R}^+_0$ (possibly 0, if the message type cannot be routed over $e_i$) indicating the capacity of link $e_i$ for message type $m_j$. For each edge $e_i \in E_a$, exactly one capacity $c_{i,j}$ is greater than 0.
Fig. 1 shows a network consisting of four computational nodes (Ctrl.1, ..., Ctrl.4), one gateway (GW), and two buses. While BUS1 can transfer message type $m_1$, BUS2 can handle message type $m_2$. The gateway can convert a message of type $m_1$ to a message of type $m_2$ and vice versa. Note that only capacities $c > 0$ and demands $d < \infty$ are shown in this figure. In our model, we assign exactly one capacity with $c > 0$ to each edge $e \in E_a$ in the architecture graph and at least one demand with $d < \infty$ to each edge $e \in E_p$ in the problem graph. A demand can be routed over an architecture graph edge only if its type matches the type of the edge's capacity. With this extension, it is possible to limit the routing possibilities and, moreover, to assign different demands to one problem graph edge.
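The demand and capacity annotations can be illustrated with a small data structure. The following is a minimal sketch, not the framework's actual implementation: the class `SpecificationGraph`, its method names, and all numeric values are hypothetical, and the edges only loosely follow the Fig. 1 example. Missing demand entries are treated as $\infty$ and non-matching capacity types as 0, as described above.

```python
import math
from dataclasses import dataclass, field

INF = math.inf


@dataclass
class SpecificationGraph:
    """Hypothetical container for demands, capacities, and mapping edges."""
    message_types: list
    # Problem graph edge -> {message type: demand}; a missing type means demand = inf.
    demands: dict = field(default_factory=dict)
    # Architecture graph edge -> (message type, capacity); exactly one capacity > 0 per edge.
    capacities: dict = field(default_factory=dict)
    # Mapping edges: process -> set of resources it may be bound to.
    mapping: dict = field(default_factory=dict)

    def demand(self, p_edge, m):
        return self.demands[p_edge].get(m, INF)

    def capacity(self, a_edge, m):
        edge_type, cap = self.capacities[a_edge]
        return cap if edge_type == m else 0.0

    def routable(self, p_edge, a_edge):
        """A demand fits on an edge if its value for the edge's type does not exceed the capacity."""
        edge_type, cap = self.capacities[a_edge]
        return self.demand(p_edge, edge_type) <= cap


# Illustrative values only (assumed, not taken from Fig. 1).
spec = SpecificationGraph(
    message_types=["m1", "m2"],
    demands={
        ("P1", "P2"): {"m1": 10.0, "m2": 5.0},
        ("P2", "P3"): {"m2": 20.0},          # d for m1 is infinite
    },
    capacities={
        ("Ctrl1", "BUS1"): ("m1", 100.0),
        ("BUS2", "Ctrl3"): ("m2", 50.0),
    },
    mapping={"P1": {"Ctrl1"}, "P2": {"Ctrl1", "Ctrl3"}, "P3": {"Ctrl3"}},
)
```

In this sketch, the edge `("P2", "P3")` cannot be routed over the `m1` link because its demand for `m1` is infinite, which mirrors the routing restriction expressed by setting a demand to $\infty$.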
III. TOPOLOGY OPTIMIZATION
From the previously described specification graph, the topology optimization framework (a) selects a subset of resources, (b) binds processes to these resources, and (c) assigns demands to a path $p = (e_1, e_2, \ldots, e_n)$ where $e_1, \ldots, e_n \in E_a$ and
In summary, the topology optimization framework generates solutions to the given specification by using Multi-Objective Evolutionary Algorithms. Basically, this is done by encoding solutions in so-called chromosomes (see Fig. 2). Each solution can be decoded to an implementation (the so-called phenotype). Our topology optimization framework makes use of the formal definition of an implementation as given in [3]. In our case, an implementation consists of three parts: (i) the allocation, which indicates which elements of the problem and architecture graph are used in the implementation, (ii) the process binding, i.e., the set of mapping edges which defines the binding of vertices in the problem graph to components of the architecture graph, and (iii) the demand binding, which assigns each problem graph edge with its demands to a path in the architecture graph while satisfying the capacity constraints.
Before defining the term implementation formally, we will explain the concept of activation as described in [3] .
The activation of a specification graph is a function $a$ that maps its vertices and edges to $\{0, 1\}$, i.e., that assigns to each edge and to each vertex the value 1 (activated) or 0 (not activated). The task of topology optimization is to determine an implementation, i.e., an assignment of activity values to the vertices and edges of the specification graph. An allocation $\alpha$ of a given specification graph $g_s$ is the subset of all activated vertices and edges of the problem graph $g_p$ and the architecture graph $g_a$, i.e., $\alpha = \alpha_v \cup \alpha_e$, where $\alpha_v = \{v \in V_p \cup V_a \mid a(v) = 1\}$ and $\alpha_e = \{e \in E_p \cup E_a \mid a(e) = 1\}$.
exists. This definition differs from the concept of a feasible binding presented in [3] in that communicating processes require a path in the architecture graph, and not a direct link, for establishing their communication. This way, we are able to consider networked embedded systems. However, when considering multi-hop communication, we have to take the capacities of the connections and the data demands of the communication into account. This step is called demand binding in the following.
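Definition 9 (Feasible Demand Binding):
The process of demand binding can be expressed by the following ILP formulation. We define a binary variable
$$x_{i,j} = \begin{cases} 1 & \text{if the demand of problem graph edge } e_{p,i} \text{ is bound onto architecture graph edge } e_{a,j} \\ 0 & \text{else} \end{cases}$$
Then, the following two kinds of constraints exist:
• The first constraint is a flow conservation constraint, formulated for each problem graph edge $e_{p,i}$ over the vector $x_i = (x_{i,1}, \ldots, x_{i,|E_a|})^T$. This constraint literally means that all incoming and outgoing flows of an architecture graph node have to be equal. Only if a demand-producing or demand-consuming process is mapped onto an architecture graph node does the sum of incoming flows differ from the sum of outgoing flows.
• The second constraint restricts the sum of demands $d_{i,j}$ bound onto an architecture graph edge $e_{a,j}$ to be less than or equal to the edge's capacity $c_j$, where $d_{i,j}$ is the demand of the problem graph edge $e_{p,i}$ for the message type supported by $e_{a,j}$: $\forall j = 1, \ldots, |E_a|$:
$$\sum_{i=1}^{|E_p|} d_{i,j} \, x_{i,j} \le c_j$$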
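To make the two kinds of constraints described above concrete, the following sketch checks a given 0/1 assignment for feasibility instead of solving the ILP. It is an illustration under stated assumptions: the function name and argument layout are hypothetical, and the sign convention of the flow conservation (outgoing minus incoming flow equals +1 at the resource of the producing process and -1 at the resource of the consuming process) is assumed, since it is not spelled out in the text.

```python
def is_feasible_demand_binding(arch_edges, capacity, demand, x, endpoints):
    """Check capacity and flow conservation constraints for a binding x.

    arch_edges: list of directed architecture graph edges (u, v), indexed by j
    capacity:   capacity[j] is the capacity c_j of edge j
    demand:     demand[i][j] is the demand d_{i,j} of problem edge i on edge j
    x:          x[i][j] in {0, 1}, 1 if problem edge i is bound onto edge j
    endpoints:  endpoints[i] = (resource of producer, resource of consumer)
    """
    # Capacity constraint: sum_i d_{i,j} * x_{i,j} <= c_j for every edge j.
    for j in range(len(arch_edges)):
        load = sum(demand[i][j] * x[i][j] for i in range(len(x)))
        if load > capacity[j]:
            return False

    # Flow conservation (assumed sign convention): for every problem edge i
    # and every node n, outgoing minus incoming flow is +1 at the source,
    # -1 at the sink, and 0 everywhere else.
    nodes = {n for edge in arch_edges for n in edge}
    for i, (src, dst) in enumerate(endpoints):
        for n in nodes:
            out_flow = sum(x[i][j] for j, (u, _) in enumerate(arch_edges) if u == n)
            in_flow = sum(x[i][j] for j, (_, v) in enumerate(arch_edges) if v == n)
            expected = (1 if n == src else 0) - (1 if n == dst else 0)
            if out_flow - in_flow != expected:
                return False
    return True


# Hypothetical three-node example: one demand from A to C.
edges = [("A", "B"), ("B", "C"), ("A", "C")]
caps = [10, 10, 3]
dem = [[5, 5, 5]]
ends = [("A", "C")]
```

With these values, routing over `A -> B -> C` (`x = [[1, 1, 0]]`) satisfies both constraints, whereas the direct edge `A -> C` (`x = [[0, 0, 1]]`) violates the capacity constraint because the demand 5 exceeds the capacity 3.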
A feasible allocation is an allocation $\alpha$ allowing at least one feasible binding $\beta_p$ with a corresponding feasible demand binding $\beta_d$.
A. Chromosome Decoding
As mentioned above, the decoding can be subdivided into three parts: the allocation, the process binding, and the demand binding. While the allocation of resources and the binding of processes are part of the decoding process introduced in [3], the demand binding, which requires solving a multi-commodity flow problem, is new and is explained in the following. Fig. 3 presents the flow of the decoding step. First, the allocation of resources is determined by the Allocation List (cf. Fig. 2). The allocation is then repaired using the Repair Allocation List such that all processes can be bound onto resources. This is done by inserting resources into the allocation in the order of their occurrence in the Repair Allocation List. After all processes have been bound onto the architecture graph nodes in the order given by the Process Binding Priority List, a path feasibility check determines whether each pair of adjacent problem graph nodes can communicate over a path between the allocated resources. This check is implemented as a depth-first search suitable for cyclic graphs. During this check, the demands and capacities on the edges of the problem graph and the architecture graph are not taken into account.
In the next phase, we have to perform the task of demand binding. Generally, there are two possible approaches to this problem: (1) using an ILP solver for exact solutions or (2) using a heuristic by encoding the demand binding in the chromosome. In this paper, we compare both approaches. When using an ILP solver, we use the allocation and the process binding to formulate the ILP as presented in Def. 9. The objective of this ILP formulation is to minimize the total flow in the network.
When using a chromosome encoding, a Problem Graph Edge Priority List is decoded. Each element in this list refers to a certain edge in the problem graph that has to be mapped onto the edges of the architecture graph. Beginning with the first element, the demand of the problem graph edge is bound onto the shortest path with sufficient capacities. All capacities along this path are reduced by the demand. Here, only the demand type which corresponds to the capacity type of the respective architecture graph edge is considered. The objective to be minimized corresponds to that of the ILP formulation. If no path with sufficient capacities can be found in the architecture graph, an error counter is increased by one. This error counter is another objective to be minimized and helps to guide the EA's search towards feasible solutions. If this error counter equals zero, a valid implementation has been found.
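The priority-list heuristic described above can be sketched as follows. This is a simplified illustration, not the authors' implementation: the function names and data layout are hypothetical, "shortest path" is assumed to mean the fewest hops (found by breadth-first search), and the total flow is assumed to be the sum of all demands bound onto architecture graph edges.

```python
import math
from collections import deque


def shortest_path(src, dst, residual, types, dem):
    """Breadth-first search over edges whose residual capacity covers the
    demand for the edge's message type. Returns a list of edges or None."""
    parent = {src: None}
    queue = deque([src])
    while queue:
        node = queue.popleft()
        if node == dst:
            break
        for (u, v), cap in residual.items():
            if u == node and v not in parent and dem.get(types[(u, v)], math.inf) <= cap:
                parent[v] = (u, v)
                queue.append(v)
    if dst not in parent:
        return None
    path, node = [], dst
    while parent[node] is not None:
        path.append(parent[node])
        node = parent[node][0]
    return path[::-1]


def bind_demands(priority_list, endpoints, demands, arch):
    """Bind problem graph edges in priority order.

    arch:      architecture edge (u, v) -> (message type, capacity)
    demands:   problem edge -> {message type: demand}
    endpoints: problem edge -> (source resource, sink resource)
    Returns (binding, total_flow, error_counter).
    """
    types = {e: t for e, (t, _) in arch.items()}
    residual = {e: c for e, (_, c) in arch.items()}
    binding, total_flow, errors = {}, 0.0, 0
    for p_edge in priority_list:
        src, dst = endpoints[p_edge]
        path = shortest_path(src, dst, residual, types, demands[p_edge])
        if path is None:
            errors += 1          # no path with sufficient capacities
            continue
        for e in path:
            d = demands[p_edge][types[e]]
            residual[e] -= d     # reduce the capacity along the path
            total_flow += d
        binding[p_edge] = path
    return binding, total_flow, errors


# Hypothetical example: two demands compete for one bus of type m1.
arch = {("C1", "BUS1"): ("m1", 100.0), ("BUS1", "C2"): ("m1", 100.0)}
demands = {"e1": {"m1": 60.0}, "e2": {"m1": 60.0}}
endpoints = {"e1": ("C1", "C2"), "e2": ("C1", "C2")}
binding, total_flow, errors = bind_demands(["e1", "e2"], endpoints, demands, arch)
```

In this example, the first demand occupies 60 of the 100 capacity units on both edges, so the second demand finds no path with sufficient capacity and increases the error counter. A different priority order may lead to a different binding, which is what the Evolutionary Algorithm explores.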
B. Performance Evaluation and Constraint Checking
After decoding an implementation and checking its feasibility, the performance evaluation takes place. A typical evaluator sums the monetary costs associated with the allocated resources, resulting in an overall cost objective value. Additionally, the computational load on the network nodes can be considered as an objective orthogonal to the flow in the network. Competing objectives lead to a set of different solutions, all of which are Pareto-optimal.
As the specification graph might be annotated with arbitrary attributes, a system designer can also define additional objectives and load customized evaluators into our topology optimization framework. In order to support different objective evaluators, all objectives are assumed to be minimized.
After performance evaluation, the constraints imposed on an implementation can be tested. For this purpose, our topology optimization framework permits dynamic loading of constraint checking algorithms, too. A constraint is specified by a lower and an upper bound. Usually, the already computed objective values can be tested by a constraint checker. If a constraint is violated, the checker is required to return one, otherwise the checker returns zero. Using this scheme, the return values of the constraint checkers can be added and only a result without constraint violation indicates a valid implementation. This sum is minimized as well, i.e., the constraint violation is treated as an additional objective value. In case of an invalid implementation, all objective values are set to infinity. Thus, the only remaining objective is to minimize the number of constraint violations. That way, the task of system synthesis is a multi-objective optimization problem (see e.g., [16] , [8] ).
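The constraint handling scheme described above can be sketched in a few lines. The helper names `make_bound_checker` and `evaluate` as well as the objective names are hypothetical; the sketch only mirrors the stated rules: each checker returns one on violation and zero otherwise, the return values are summed, and all objective values of an invalid implementation are set to infinity.

```python
import math


def make_bound_checker(objective, lower, upper):
    """Return a checker that yields 1 if the objective value lies outside
    [lower, upper] and 0 otherwise."""
    def check(values):
        return 0 if lower <= values[objective] <= upper else 1
    return check


def evaluate(values, checkers):
    """Sum the constraint violations; invalid implementations get all
    objective values set to infinity."""
    violations = sum(check(values) for check in checkers)
    if violations > 0:
        values = {name: math.inf for name in values}
    return values, violations


# Hypothetical objective values and bounds.
objectives = {"cost": 120.0, "flow": 30.0}
checkers = [make_bound_checker("cost", 0.0, 100.0),
            make_bound_checker("flow", 0.0, 50.0)]
result, violations = evaluate(objectives, checkers)
```

Here the cost bound is violated, so `violations` is 1 and both objective values become infinite, leaving the number of violations as the only objective that distinguishes invalid implementations.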
IV. EXPERIMENTAL EVALUATION
In the following experiments, we compare two implementations: one using the Evolutionary Algorithm SPEA2 [2] for binding the demands onto the resources in the architecture graph, and one solving the ILP formulation from Def. 9 using LpSolve [11]. For the evaluation of these algorithms, we used applications with 10 and 20 individual demands and all mapping possibilities on a 3x3 mesh. We set the capacity of each architecture graph edge to 100% and produced, for each number of demands, three different scenarios by varying the demand sizes between 1% and 100%. For each of these scenarios, we executed three iteration runs with the EA-based and the ILP-based methodology and obtained the sets $S^i_{ILP}$ and $S^i_{EA}$ with $i = 1, \ldots, 3$, each containing the Pareto-optimal solutions found in the respective run. Extracting all Pareto-optimal solutions out of the sets $S^i_{ILP}$ and $S^i_{EA}$ provides a set of solutions which we assumed to contain the Pareto-optimal solutions. Therefore, these solutions were taken as the reference set $R$. In order to evaluate the iteration runs of our proposed heuristic and of the combined approach incorporating an ILP, the shortest normalized distance $d(s)$ between the Pareto front of the reference set $R$ and each solution $s \in S^i_{ILP}$ or $s \in S^i_{EA}$, resp., is determined for all iteration runs and scenarios:
The indices o1 and o2 denote the two objectives of a considered point, whereas the indices max and min denote the maximal and the minimal value among the points belonging to the reference set $R$. The average distances and standard deviations in each iteration for the cases with 10 demands and 20 demands are presented in Fig. 4(a) and Fig. 4(b). Fig. 5(a) and Fig. 5(b) show the distance over the exploration time. The topology optimization has been executed on an Intel Pentium IV (2.7 GHz, 512 MB RAM) running Linux. We can clearly see that our proposed heuristic converges faster and at the same time runs faster than the hybrid approach of an EA with an exact ILP formulation (EA/ILP).
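The evaluation procedure can be illustrated with a short sketch. It is based on assumptions rather than on the exact formula used in the experiments: both objectives are minimized, the distance is taken to be the Euclidean distance after normalizing each objective by the range (max minus min) of the reference set, and the function names are hypothetical.

```python
import math


def pareto_front(points):
    """Keep the points that are not dominated by any other point
    (both objectives are minimized)."""
    def dominates(a, b):
        return all(x <= y for x, y in zip(a, b)) and a != b
    return [p for p in points if not any(dominates(q, p) for q in points)]


def normalized_distance(s, reference):
    """Shortest distance from point s to the reference set, with each
    objective scaled by its range within the reference set (assumed
    Euclidean norm)."""
    spans = []
    for k in range(2):
        lo = min(r[k] for r in reference)
        hi = max(r[k] for r in reference)
        spans.append((hi - lo) or 1.0)   # avoid division by zero
    return min(
        math.hypot((s[0] - r[0]) / spans[0], (s[1] - r[1]) / spans[1])
        for r in reference
    )


# Hypothetical objective vectors (o1, o2) collected from several runs.
all_solutions = [(1, 4), (2, 2), (4, 1), (3, 3)]
reference = pareto_front(all_solutions)
distance = normalized_distance((3, 3), reference)
```

A solution that is itself part of the reference set has distance zero, so smaller average distances indicate runs whose solutions lie closer to the reference front.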
A case study from the field of automotive applications aimed at optimizing the network topology and the binding of an adaptive light controller. In this case study, more than 100 different processes produce, process, and consume data. These processes need to be bound onto a network consisting of 36 sensors, 30 controllers/gateways, and 35 actuators. While the controllers could be connected with different types of buses, we connected each sensor and actuator via point-to-point (P2P) communication links to each controller in the specification graph. One objective of the topology optimization was to minimize the total wire length of the P2P connections and the bus systems; the other objective was to minimize the monetary cost of the entire system, which has a direct effect on the allocation of resources. Additionally, certain demands have been annotated to the edges between the functional units, which have to be bound to the edges and nodes between the sensors, controllers/gateways, and actuators. By combining all demand routing alternatives, binding possibilities, and resource allocation options, the search space comprised about $2^{300}$ possibilities.
Using our heuristic, we were able to find a set of approximated Pareto-optimal solutions to this topology optimization problem. In contrast, the hybrid ILP/EA-based approach failed completely because of the huge ILP models. Thus, we are not able to present quantitative results for this case study. However, this example shows an application area where our approach to topology optimization supports a system-level designer in unbiased decision making.
V. CONCLUSION AND FUTURE WORK
In this paper, we presented a framework for topology optimization of networked embedded systems. The input specification is given by a specification graph permitting the modeling of demands for heterogeneous networks. The novelties may be summarized as follows: a) Our specification graph enables the modeling of routing restrictions, for example if a certain type of demand cannot be routed over some parts of a network. b) In addition to the model, we proposed a new chromosome encoding and a novel heuristic for demand binding. c) The performance of the proposed strategy has been compared with an ILP-based approach, and it has been shown that the heuristic converges faster and requires less run time. Moreover, the applicability to recent design issues in the field of automotive networks has been presented. All in all, the presented framework provides a first methodology for multi-objective exploration of heterogeneous networked embedded systems.
