8,824 research outputs found
Modeling, Analysis, and Hard Real-time Scheduling of Adaptive Streaming Applications
In real-time systems, the application's behavior has to be predictable at
compile-time to guarantee timing constraints. However, modern streaming
applications which exhibit adaptive behavior due to mode switching at run-time,
may degrade system predictability due to unknown behavior of the application
during mode transitions. Therefore, proper temporal analysis during mode
transitions is imperative to preserve system predictability. To this end, in
this paper, we initially introduce Mode Aware Data Flow (MADF) which is our new
predictable Model of Computation (MoC) to efficiently capture the behavior of
adaptive streaming applications. Then, as an important part of the operational
semantics of MADF, we propose the Maximum-Overlap Offset (MOO) which is our
novel protocol for mode transitions. The main advantage of this transition
protocol is that, in contrast to self-timed transition protocols, it avoids
timing interference between modes upon mode transitions. As a result, any mode
transition can be analyzed independently from the mode transitions that
occurred in the past. Based on this transition protocol, we propose a hard
real-time analysis as well to guarantee timing constraints by avoiding
processor overloading during mode transitions. Therefore, using this protocol,
we can derive a lower bound and an upper bound on the earliest starting time of
the tasks in the new mode during mode transitions in such a way that hard
real-time constraints are respected.Comment: Accepted for presentation at EMSOFT 2018 and for publication in IEEE
Transactions on Computer-Aided Design of Integrated Circuits and Systems
(TCAD) as part of the ESWEEK-TCAD special issu
The Chameleon Architecture for Streaming DSP Applications
We focus on architectures for streaming DSP applications such as wireless baseband processing and image processing. We aim at a single generic architecture that is capable of dealing with different DSP applications. This architecture has to be energy efficient and fault tolerant. We introduce a heterogeneous tiled architecture and present the details of a domain-specific reconfigurable tile processor called Montium. This reconfigurable processor has a small footprint (1.8 mm in a 130 nm process), is power efficient and exploits the locality of reference principle. Reconfiguring the device is very fast, for example, loading the coefficients for a 200 tap FIR filter is done within 80 clock cycles. The tiles on the tiled architecture are connected to a Network-on-Chip (NoC) via a network interface (NI). Two NoCs have been developed: a packet-switched and a circuit-switched version. Both provide two types of services: guaranteed throughput (GT) and best effort (BE). For both NoCs estimates of power consumption are presented. The NI synchronizes data transfers, configures and starts/stops the tile processor. For dynamically mapping applications onto the tiled architecture, we introduce a run-time mapping tool
Formal and Informal Methods for Multi-Core Design Space Exploration
We propose a tool-supported methodology for design-space exploration for
embedded systems. It provides means to define high-level models of applications
and multi-processor architectures and evaluate the performance of different
deployment (mapping, scheduling) strategies while taking uncertainty into
account. We argue that this extension of the scope of formal verification is
important for the viability of the domain.Comment: In Proceedings QAPL 2014, arXiv:1406.156
MorphoSys: efficient colocation of QoS-constrained workloads in the cloud
In hosting environments such as IaaS clouds, desirable application performance is usually guaranteed through the use of Service Level Agreements (SLAs), which specify minimal fractions of resource capacities that must be allocated for unencumbered use for proper operation. Arbitrary colocation of applications with different SLAs on a single host may result in inefficient utilization of the hostâs resources. In this paper, we propose that periodic resource allocation and consumption models -- often used to characterize real-time workloads -- be used for a more granular expression of SLAs. Our proposed SLA model has the salient feature that it exposes flexibilities that enable the infrastructure provider to safely transform SLAs from one form to another for the purpose of achieving more efficient colocation. Towards that goal, we present MORPHOSYS: a framework for a service that allows the manipulation of SLAs to enable efficient colocation of arbitrary workloads in a dynamic setting. We present results from extensive trace-driven simulations of colocated Video-on-Demand servers in a cloud setting. These results show that potentially-significant reduction in wasted resources (by as much as 60%) are possible using MORPHOSYS.National Science Foundation (0720604, 0735974, 0820138, 0952145, 1012798
Influences on Throughput and Latency in Stream Programs
Vu Thien Nga Nguyen and Raimund Kirner, 'Influences on Throughput and Latency in Stream Programs' paper presented at the 2nd Workshop on Feedback-Directed Compiler Optimization for Multi-Core Architectures. Berlin, Germany. 22 January 2013Stream programming is a promising approach to execute programs on parallel hardware such as multi-core systems. It allows to reuse sequential code at component level and to extend such code with concurrency-handling at the communication level. In this paper we investigate in the performance of stream programs in terms of throughput and latency. We identify factors that affect these performance metrics and propose an efficient scheduling approach to obtain the maximal performance
Towards an HLA Run-time Infrastructure with Hard Real-time Capabilities
Our work takes place in the context of the HLA standard and its application in real-time systems context. The HLA standard is inadequate for taking into consideration the different constraints involved in real-time computer systems. Many works have been invested in order to providing real-time capabilities to Run Time Infrastructures (RTI) to run real time simulation. Most of these initiatives focus on major issues including QoS guarantee, Worst Case Transit Time (WCTT) knowledge and scheduling services provided by the underlying operating systems. Even if our ultimate objective is to achieve real-time capabilities for distributed HLA federations executions, this paper describes a preliminary work focusing on achieving hard real-time properties for HLA federations running on a single computer under Linux operating systems. Our paper proposes a novel global bottom up approach for designing real-time Run time Infrastructures and a formal model for validation of uni processor to (then) distributed real-time simulation with CERTI
Architecture Design Space Exploration for Streaming Applications Through Timing Analysis
In this paper we compare the maximum achievable throughput of different memory organisations of the processing elements that constitute a multiprocessor system on chip. This is done by modelling the mapping of a task with input and output channels on a processing element as a homogeneous synchronous dataflow graph, and use maximum cycle mean analysis to derive the throughput. In a HiperLAN2 case study we show how these techniques can be used to derive the required clock frequency and communication latencies in order to meet the application's throughput requirement on a multiprocessor system on chip that has one of the investigated memory organisations
- âŠ