265 research outputs found
Polymorphic computing abstraction for heterogeneous architectures
Integration of multiple computing paradigms onto system on chip (SoC) has pushed the boundaries of design space exploration for hardware architectures and computing system software stack. The heterogeneity of computing styles in SoC has created a new class of architectures referred to as Heterogeneous Architectures. Novel applications developed to exploit the different computing styles are user centric for embedded SoC. Software and hardware designers are faced with several challenges to harness the full potential of heterogeneous architectures. Applications have to execute on more than one compute style to increase overall SoC resource utilization. The implication of such an abstraction is that application threads need to be polymorphic. Operating system layer is thus faced with the problem of scheduling polymorphic threads. Resource allocation is also an important problem to be dealt by the OS. Morphism evolution of application threads is constrained by the availability of heterogeneous computing resources. Traditional design optimization goals such as computational power and lower energy per computation are inadequate to satisfy user centric application resource needs. Resource allocation decisions at application layer need to permeate to the architectural layer to avoid conflicting demands which may affect energy-delay characteristics of application threads. We propose Polymorphic computing abstraction as a unified computing model for heterogeneous architectures to address the above issues. Simulation environment for polymorphic applications is developed and evaluated under various scheduling strategies to determine the effectiveness of polymorphism abstraction on resource allocation. User satisfaction model is also developed to complement polymorphism and used for optimization of resource utilization at application and network layer of embedded systems
Emerging research directions in computer science : contributions from the young informatics faculty in Karlsruhe
In order to build better human-friendly human-computer interfaces,
such interfaces need to be enabled with capabilities to perceive
the user, his location, identity, activities and in particular his interaction
with others and the machine. Only with these perception capabilities
can smart systems ( for example human-friendly robots or smart environments) become posssible. In my research I\u27m thus focusing on the
development of novel techniques for the visual perception of humans and
their activities, in order to facilitate perceptive multimodal interfaces,
humanoid robots and smart environments. My work includes research
on person tracking, person identication, recognition of pointing gestures,
estimation of head orientation and focus of attention, as well as
audio-visual scene and activity analysis. Application areas are humanfriendly
humanoid robots, smart environments, content-based image and
video analysis, as well as safety- and security-related applications. This
article gives a brief overview of my ongoing research activities in these
areas
Recommended from our members
QoS-aware mechanisms for improving cost-efficiency of datacenters
Warehouse Scale Computers (WSCs) promise high cost-efficiency by amortizing power, cooling, and management overheads. WSCs today host a large variety of jobs with two broad performance requirements categories: latency-critical (LC) and best-effort (BE). Ideally, to fully utilize all hardware resources, WSC operators can simply fill all the nodes with computing jobs. Unfortunately, because colocated jobs contend for shared resources, systems with high loads often experience performance degradation, which negatively impacts the Quality of Service (QoS) for LC jobs. In fact, service providers usually over-provision resources to avoid any interference with LC jobs, leading to significant resource inefficiencies. In this dissertation, I explore opportunities across different system-abstraction layers to improve the cost-efficiency of dataceters by increasing resource utilization of WSCs with little or no impact on the performance of LC jobs. The dissertation has three main components. First, I explore opportunities to improve the throughput of multicore systems by reducing the performance variation of LC jobs. The main insight is that by reshaping the latency distribution curve, performance headroom of LC jobs can be effectively converted to improved BE throughput. I develop, implement, and evaluate a runtime system that achieves this goal with existing hardware. I leverage the cache partitioning, per-core frequency scaling, and thread masking of server processors. Evaluation results show the proposed solution enables 30% higher system throughput compared to solutions proposed in prior works while maintaining at least as good QoS for LC jobs. Second, I study resource contention in near-future heterogeneous memory architectures (HMA). This study is motivated by recent developments in non-volatile memory (NVM) technologies, which enable higher storage density at the cost of same performance. To understand the performance and QoS impact of HMAs, I design and implement a performance emulator in the Linux kernel that runs unmodified workloads with high accuracy, low overhead, and complete transparency. I further propose and evaluate multiple data and resource management QoS mechanisms, such as locality-aware page admission, occupancy management, and write buffer jailing. Third, I focus on accelerated machine learning (ML) systems. By profiling the performance of production workloads and accelerators, I show that accelerated ML tasks are highly sensitive to main memory interference due to fine-grained interaction between CPU and accelerator tasks. As a result, memory resource contention can significantly decreases the performance and efficiency gains of accelerators. I propose a runtime system that leverages existing hardware capabilities and show 17% higher system efficiency compared to previous approaches. This study further exposes opportunities for future processor architecturesElectrical and Computer Engineerin
Securing Real-Time Internet-of-Things
Modern embedded and cyber-physical systems are ubiquitous. A large number of
critical cyber-physical systems have real-time requirements (e.g., avionics,
automobiles, power grids, manufacturing systems, industrial control systems,
etc.). Recent developments and new functionality requires real-time embedded
devices to be connected to the Internet. This gives rise to the real-time
Internet-of-things (RT-IoT) that promises a better user experience through
stronger connectivity and efficient use of next-generation embedded devices.
However RT- IoT are also increasingly becoming targets for cyber-attacks which
is exacerbated by this increased connectivity. This paper gives an introduction
to RT-IoT systems, an outlook of current approaches and possible research
challenges towards secure RT- IoT frameworks
Dynamic Resource Allocation in Embedded, High-Performance and Cloud Computing
The availability of many-core computing platforms enables a wide variety of technical solutions for systems across the embedded, high-performance and cloud computing domains. However, large scale manycore systems are notoriously hard to optimise. Choices regarding resource allocation alone can account for wide variability in timeliness and energy dissipation (up to several orders of magnitude). Dynamic Resource Allocation in Embedded, High-Performance and Cloud Computing covers dynamic resource allocation heuristics for manycore systems, aiming to provide appropriate guarantees on performance and energy efficiency. It addresses different types of systems, aiming to harmonise the approaches to dynamic allocation across the complete spectrum between systems with little flexibility and strict real-time guarantees all the way to highly dynamic systems with soft performance requirements. Technical topics presented in the book include:
Load and Resource Models
Admission Control
Feedback-based Allocation and Optimisation
Search-based Allocation Heuristics
Distributed Allocation based on Swarm Intelligence
Value-Based Allocation
Each of the topics is illustrated with examples based on realistic computational platforms such as Network-on-Chip manycore processors, grids and private cloud environments.Note.-- EUR 6,000 BPC fee funded by the EC FP7 Post-Grant Open Access Pilo
- …