140 research outputs found

    Temporal analysis and scheduling of hard real-time radios running on a multi-processor

    Get PDF
    On a multi-radio baseband system, multiple independent transceivers must share the resources of a multi-processor, while meeting each its own hard real-time requirements. Not all possible combinations of transceivers are known at compile time, so a solution must be found that either allows for independent timing analysis or relies on runtime timing analysis. This thesis proposes a design flow and software architecture that meets these challenges, while enabling features such as independent transceiver compilation and dynamic loading, and taking into account other challenges such as ease of programming, efficiency, and ease of validation. We take data flow as the basic model of computation, as it fits the application domain, and several static variants (such as Single-Rate, Multi-Rate and Cyclo-Static) have been shown to possess strong analytical properties. Traditional temporal analysis of data flow can provide minimum throughput guarantees for a self-timed implementation of data flow. Since transceivers may need to guarantee strictly periodic execution and meet latency requirements, we extend the analysis techniques to show that we can enforce strict periodicity for an actor in the graph; we also provide maximum latency analysis techniques for periodic, sporadic and bursty sources. We propose a scheduling strategy and an automatic scheduling flow that enable the simultaneous execution of multiple transceivers with hard-realtime requirements, described as Single-Rate Data Flow (SRDF) graphs. Each transceiver has its own execution rate and starts and stops independently from other transceivers, at times unknown at compile time, on a multiprocessor. We show how to combine scheduling and mapping decisions with the input application data flow graph to generate a worst-case temporal analysis graph. We propose algorithms to find a mapping per transceiver in the form of clusters of statically-ordered actors, and a budget for either a Time Division Multiplex (TDM) or Non-Preemptive Non-Blocking Round Robin (NPNBRR) scheduler per cluster per transceiver. The budget is computed such that if the platform can provide it, then the desired minimum throughput and maximum latency of the transceiver are guaranteed, while minimizing the required processing resources. We illustrate the use of these techniques to map a combination of WLAN and TDS-CDMA receivers onto a prototype Software-Defined Radio platform. The functionality of transceivers for standards with very dynamic behavior – such as WLAN – cannot be conveniently modeled as an SRDF graph, since SRDF is not capable of expressing variations of actor firing rules depending on the values of input data. Because of this, we propose a restricted, customized data flow model of computation, Mode-Controlled Data Flow (MCDF), that can capture the data-value dependent behavior of a transceiver, while allowing rigorous temporal analysis, and tight resource budgeting. We develop a number of analysis techniques to characterize the temporal behavior of MCDF graphs, in terms of maximum latencies and throughput. We also provide an extension to MCDF of our scheduling strategy for SRDF. The capabilities of MCDF are then illustrated with a WLAN 802.11a receiver model. Having computed budgets for each transceiver, we propose a way to use these budgets for run-time resource mapping and admissibility analysis. During run-time, at transceiver start time, the budget for each cluster of statically-ordered actors is allocated by a resource manager to platform resources. The resource manager enforces strict admission control, to restrict transceivers from interfering with each other’s worst-case temporal behaviors. We propose algorithms adapted from Vector Bin-Packing to enable the mapping at start time of transceivers to the multi-processor architecture, considering also the case where the processors are connected by a network on chip with resource reservation guarantees, in which case we also find routing and resource allocation on the network-on-chip. In our experiments, our resource allocation algorithms can keep 95% of the system resources occupied, while suffering from an allocation failure rate of less than 5%. An implementation of the framework was carried out on a prototype board. We present performance and memory utilization figures for this implementation, as they provide insights into the costs of adopting our approach. It turns out that the scheduling and synchronization overhead for an unoptimized implementation with no hardware support for synchronization of the framework is 16.3% of the cycle budget for a WLAN receiver on an EVP processor at 320 MHz. However, this overhead is less than 1% for mobile standards such as TDS-CDMA or LTE, which have lower rates, and thus larger cycle budgets. Considering that clock speeds will increase and that the synchronization primitives can be optimized to exploit the addressing modes available in the EVP, these results are very promising

    Composition and synchronization of real-time components upon one processor

    Get PDF
    Many industrial systems have various hardware and software functions for controlling mechanics. If these functions act independently, as they do in legacy situations, their overall performance is not optimal. There is a trend towards optimizing the overall system performance and creating a synergy between the different functions in a system, which is achieved by replacing more and more dedicated, single-function hardware by software components running on programmable platforms. This increases the re-usability of the functions, but their synergy requires also that (parts of) the multiple software functions share the same embedded platform. In this work, we look at the composition of inter-dependent software functions on a shared platform from a timing perspective. We consider platforms comprised of one preemptive processor resource and, optionally, multiple non-preemptive resources. Each function is implemented by a set of tasks; the group of tasks of a function that executes on the same processor, along with its scheduler, is called a component. The tasks of a component typically have hard timing constraints. Fulfilling these timing constraints of a component requires analysis. Looking at a single function, co-operative scheduling of the tasks within a component has already proven to be a powerful tool to make the implementation of a function more predictable. For example, co-operative scheduling can accelerate the execution of a task (making it easier to satisfy timing constraints), it can reduce the cost of arbitrary preemptions (leading to more realistic execution-time estimates) and it can guarantee access to other resources without the need for arbitration by other protocols. Since timeliness is an important functional requirement, (re-)use of a component for composition and integration on a platform must deal with timing. To enable us to analyze and specify the timing requirements of a particular component in isolation from other components, we reserve and enforce the availability of all its specified resources during run-time. The real-time systems community has proposed hierarchical scheduling frameworks (HSFs) to implement this isolation between components. After admitting a component on a shared platform, a component in an HSF keeps meeting its timing constraints as long as it behaves as specified. If it violates its specification, it may be penalized, but other components are temporally isolated from the malignant effects. A component in an HSF is said to execute on a virtual platform with a dedicated processor at a speed proportional to its reserved processor supply. Three effects disturb this point of view. Firstly, processor time is supplied discontinuously. Secondly, the actual processor is faster. Thirdly, the HSF no longer guarantees the isolation of an individual component when two arbitrary components violate their specification during access to non-preemptive resources, even when access is arbitrated via well-defined real-time protocols. The scientific contributions of this work focus on these three issues. Our solutions to these issues cover the system design from component requirements to run-time allocation. Firstly, we present a novel scheduling method that enables us to integrate the component into an HSF. It guarantees that each integrated component executes its tasks exactly in the same order regardless of a continuous or a discontinuous supply of processor time. Using our method, the component executes on a virtual platform and it only experiences that the processor speed is different from the actual processor speed. As a result, we can focus on the traditional scheduling problem of meeting deadline constraints of tasks on a uni-processor platform. For such platforms, we show how scheduling tasks co-operatively within a component helps to meet the deadlines of this component. We compare the strength of these cooperative scheduling techniques to theoretically optimal schedulers. Secondly, we standardize the way of computing the resource requirements of a component, even in the presence of non-preemptive resources. We can therefore apply the same timing analysis to the components in an HSF as to the tasks inside, regardless of their scheduling or their protocol being used for non-preemptive resources. This increases the re-usability of the timing analysis of components. We also make non-preemptive resources transparent during the development cycle of a component, i.e., the developer of a component can be unaware of the actual protocol being used in an HSF. Components can therefore be unaware that access to non-preemptive resources requires arbitration. Finally, we complement the existing real-time protocols for arbitrating access to non-preemptive resources with mechanisms to confine temporal faults to those components in the HSF that share the same non-preemptive resources. We compare the overheads of sharing non-preemptive resources between components with and without mechanisms for confinement of temporal faults. We do this by means of experiments within an HSF-enabled real-time operating system

    Traffic engineering in dynamic optical networks

    Get PDF
    Traffic Engineering (TE) refers to all the techniques a Service Provider employs to improve the efficiency and reliability of network operations. In IP over Optical (IPO) networks, traffic coming from upper layers is carried over the logical topology defined by the set of established lightpaths. Within this framework then, TE techniques allow to optimize the configuration of optical resources with respect to an highly dynamic traffic demand. TE can be performed with two main methods: if the demand is known only in terms of an aggregated traffic matrix, the problem of automatically updating the configuration of an optical network to accommodate traffic changes is called Virtual Topology Reconfiguration (VTR). If instead the traffic demand is known in terms of data-level connection requests with sub-wavelength granularity, arriving dynamically from some source node to any destination node, the problem is called Dynamic Traffic Grooming (DTG). In this dissertation new VTR algorithms for load balancing in optical networks based on Local Search (LS) techniques are presented. The main advantage of using LS is the minimization of network disruption, since the reconfiguration involves only a small part of the network. A comparison between the proposed schemes and the optimal solutions found via an ILP solver shows calculation time savings for comparable results of network congestion. A similar load balancing technique has been applied to alleviate congestion in an MPLS network, based on the efficient rerouting of Label-Switched Paths (LSP) from the most congested links to allow a better usage of network resources. Many algorithms have been developed to deal with DTG in IPO networks, where most of the attention is focused on optimizing the physical resources utilization by considering specific constraints on the optical node architecture, while very few attention has been put so far on the Quality of Service (QoS) guarantees for the carried traffic. In this thesis a novel Traffic Engineering scheme is proposed to guarantee QoS from both the viewpoint of service differentiation and transmission quality. Another contribution in this thesis is a formal framework for the definition of dynamic grooming policies in IPO networks. The framework is then specialized for an overlay architecture, where the control plane of the IP and optical level are separated, and no information is shared between the two. A family of grooming policies based on constraints on the number of hops and on the bandwidth sharing degree at the IP level is defined, and its performance analyzed in both regular and irregular topologies. While most of the literature on DTG problem implicitly considers the grooming of low-speed connections onto optical channels using a TDM approach, the proposed grooming policies are evaluated here by considering a realistic traffic model which consider a Dynamic Statistical Multiplexing (DSM) approach, i.e. a single wavelength channel is shared between multiple IP elastic traffic flows

    Bandwith allocation and scheduling in photonic networks

    Get PDF
    This thesis describes a framework for bandwidth allocation and scheduling in the Agile All-Photonic Network (AAPN). This framework is also applicable to any single-hop communication network with significant signalling delay (such as satellite-TDMA systems). Slot-by-slot scheduling approaches do not provide adequate performance for wide-area networks, so we focus on frame-based scheduling. We propose three novel fixed-length frame scheduling algorithms (Minimum Cost Search, Fair Matching and Minimum Rejection) and a feedback control system for stabilization.MCS is a greedy algorithm, which allocates time-slots sequentially using a cost function. This function is defined such that the time-slots with higher blocking probability are assigned first. MCS does not guarantee 100% throughput, thought it has a low blocking percentage. Our optimum scheduling approach is based on modifying the demand matrix such that the network resources are fully utilized, while the requests are optimally served. The Fair Matching Algorithm (FMA) uses the weighted max-min fairness criterion to achieve a fair share of resources amongst the connections in the network. When rejection is inevitable, FMA selects rejections such that the maximum percentage rejection experienced in the network is minimized. In another approach we formulate the rejection task as an optimization problem and propose the Minimum Rejection Algorithm (MRA), which minimizes total rejection. The minimum rejection problem is a special case of maximum flow problem. Due to the complexity of the algorithms that solve the max-flow problem we propose a heuristic algorithm with lower complexity.Scheduling in wide-area networks must be based on predictions of traffic demand and the resultant errors can lead to instability and unfairness. We design a feedback control system based on Smith's principle, which removes the destabilizing delays from the feedback loop by using a "loop cancelation" technique. The feedback control system we propose reduces the effect of prediction errors, increasing the speed of the response to sudden changes in traffic arrival rates and improving the fairness in the network through equalization of queue-lengths

    Particle swarm optimization for routing and wavelength assignment in next generation WDM networks.

    Get PDF
    PhDAll-optical Wave Division Multiplexed (WDM) networking is a promising technology for long-haul backbone and large metropolitan optical networks in order to meet the non-diminishing bandwidth demands of future applications and services. Examples could include archival and recovery of data to/from Storage Area Networks (i.e. for banks), High bandwidth medical imaging (for remote operations), High Definition (HD) digital broadcast and streaming over the Internet, distributed orchestrated computing, and peak-demand short-term connectivity for Access Network providers and wireless network operators for backhaul surges. One desirable feature is fast and automatic provisioning. Connection (lightpath) provisioning in optically switched networks requires both route computation and a single wavelength to be assigned for the lightpath. This is called Routing and Wavelength Assignment (RWA). RWA can be classified as static RWA and dynamic RWA. Static RWA is an NP-hard (non-polynomial time hard) optimisation task. Dynamic RWA is even more challenging as connection requests arrive dynamically, on-the-fly and have random connection holding times. Traditionally, global-optimum mathematical search schemes like integer linear programming and graph colouring are used to find an optimal solution for NP-hard problems. However such schemes become unusable for connection provisioning in a dynamic environment, due to the computational complexity and time required to undertake the search. To perform dynamic provisioning, different heuristic and stochastic techniques are used. Particle Swarm Optimisation (PSO) is a population-based global optimisation scheme that belongs to the class of evolutionary search algorithms and has successfully been used to solve many NP-hard optimisation problems in both static and dynamic environments. In this thesis, a novel PSO based scheme is proposed to solve the static RWA case, which can achieve optimal/near-optimal solution. In order to reduce the risk of premature convergence of the swarm and to avoid selecting local optima, a search scheme is proposed to solve the static RWA, based on the position of swarm‘s global best particle and personal best position of each particle. To solve dynamic RWA problem, a PSO based scheme is proposed which can provision a connection within a fraction of a second. This feature is crucial to provisioning services like bandwidth on demand connectivity. To improve the convergence speed of the swarm towards an optimal/near-optimal solution, a novel chaotic factor is introduced into the PSO algorithm, i.e. CPSO, which helps the swarm reach a relatively good solution in fewer iterations. Experimental results for PSO/CPSO based dynamic RWA algorithms show that the proposed schemes perform better compared to other evolutionary techniques like genetic algorithms, ant colony optimization. This is both in terms of quality of solution and computation time. The proposed schemes also show significant improvements in blocking probability performance compared to traditional dynamic RWA schemes like SP-FF and SP-MU algorithms

    Deployment and Debugging of Real-Time Applications on Multicore Architectures

    Get PDF
    It is essential to enable information extraction from software. Program tracing techniques are an example of information extraction. Program tracing extracts information from the program during execution. Tracing helps with the testing and validation of software to ensure that the software under test is correct. Information extraction is done by instrumenting the program. Logged information can be stored in dedicated logging memories or can be buffered and streamed off-chip to an external monitor. The designer inspects the trace after execution to identify potentially erroneous state information. In addition, the trace can provide the state information that serves as input to generate the erroneous output for reproducibility. Information extraction can be difficult and expensive due to the increase in size and complexity of modern software systems. For the sub-class of software systems known as real-time systems, these issues are further aggravated. This is because real-time systems demand timing guarantees in addition to functional correctness. Consequently, any instrumentation to the original program code for the purpose of information extraction may affect the temporal behaviors of the program. This perturbation of temporal behaviors can lead to the violation of timing constraints, which may bias the program execution and/or cause the program to miss its deadline. As a result, there is considerable interest in devising techniques to allow for information extraction without missing a program’s deadline that is known as time-aware instrumentation. This thesis investigates time-aware instrumentation mechanisms to instrument programs while respecting their timing constraints and functional behavior. Knowledge of the underlying hardware on which the software runs, enables the extraction of more information via the instrumentation process. Chip-multiprocessors offer a solution to the performance bottleneck on uni-processors. Providing timing guarantees for hard real-time systems, however, on chip-multiprocessors is difficult. This is because conventional communication interconnects are designed to optimize the average-case performance. Therefore, researchers propose interconnects such as the priority-aware networks to satisfy the requirements of hard real-time systems. The priority-aware interconnects, however, lack the proper analysis techniques to facilitate the deployment of real-time systems. This thesis also investigates latency and buffer space analysis techniques for pipelined communication resource models, as well as algorithms for the proper deployment of real-time applications to these platforms. The analysis techniques proposed in this thesis provide guarantees on the schedulability of real-time systems on chip-multiprocessors. These guarantees are based on reducing contention in the interconnect while simultaneously accurately computing the worst-case communication latencies. While these worst-case latencies provide bounds for computing the overall worst-case execution time of applications on chip-multiprocessors, they also provide means to assigning instrumentation budgets required by time-aware instrumentation. Leveraging these platform-specific analysis techniques for the assignment of instrumentation budgets, allows for extracting more information from the instrumentation process
    • …
    corecore