1,324 research outputs found
Towards Increased Power Efficiency in Low End Embedded Processors: Can Cache Help?
Embedded processors are often characterized by limited resources and are optimized for specific applications. A rising number of battery powered applications has driven a trend towards increased energy efficiency sometimes even traded with performance. Particularly, lower power and low specification embedded processors lack on-chip cache memories. This is mainly in order to avoid the higher energy overhead a cache structure would pose in an embedded processor. This paper proposes energy and throughput models which can be used to analyze energy and time overhead for a particular application due to introduction of a data cache architecture in a previously non-cached system or alternatively can be used in reconfigurable systems for cache overhead analysis.
Connecting the World of Embedded Mobiles: The RIOT Approach to Ubiquitous Networking for the Internet of Things
The Internet of Things (IoT) is rapidly evolving based on low-power compliant
protocol standards that extend the Internet into the embedded world. Pioneering
implementations have proven it is feasible to inter-network very constrained
devices, but had to rely on peculiar cross-layered designs and offer a
minimalistic set of features. In the long run, however, professional use and
massive deployment of IoT devices require full-featured, cleanly composed, and
flexible network stacks.
This paper introduces the networking architecture that turns RIOT into a
powerful IoT system, to enable low-power wireless scenarios. RIOT networking
offers (i) a modular architecture with generic interfaces for plugging in
drivers, protocols, or entire stacks, (ii) support for multiple heterogeneous
interfaces and stacks that can concurrently operate, and (iii) GNRC, its
cleanly layered, recursively composed default network stack. We contribute an
in-depth analysis of the communication performance and resource efficiency of
RIOT, both on a micro-benchmarking level as well as by comparing IoT
communication across different platforms. Our findings show that, though it is
based on significantly different design trade-offs, the networking subsystem of
RIOT achieves a performance equivalent to that of Contiki and TinyOS, the two
operating systems which pioneered IoT software platforms
Investigating the viability of adaptive caches as a defense mechanism against cache side-channel attacks
The ongoing miniaturization of semiconductor manufacturing technologies has enabled the integration of tens to hundreds of processing cores on a single chip. Unlike frequency-scaling where performance is increased equally across the board, core-scaling and hardware thread-scaling harness the additional processing power through the concurrent execution of multiple processes or programs. This approach of mingling or interleaving process executions has engendered a new set of security challenges that risks to undermine nearly three decades’ worth of computer architecture design efforts.
The complexity of the runtime interactions and aggressive resource sharing among processes, e.g., caches or interconnect network paths, have created a fertile ground to mount attacks of ever-increasing acuteness against these computer systems. One such class of attacks is cache side-channel attacks.
While caches are vital to the performance of current processors, they have also been the target of numerous side-channel attacks. As a result, a few cache architectures have been proposed to defend against these attacks. However, these designs tend to provide security at the expense of performance, area and power. Therefore, the design of secure, high-performance cache architectures is still a pressing research challenge.
In this thesis, we examine the viability of self-aware adaptive caches as a defense mechanism against cache side-channel attacks. We define an adaptive cache as a caching structure with (i) run-time reconfiguration capability, and (ii) intelligent built-in logic to monitor itself and determine its parameter settings. Since the success of most cache side-channel attacks depend on the attacker’s knowledge of the key cache parameters such as associativity, set count, replacement policy, among others, an adaptive cache can provide a moving target defense approach against many of these cache side-channel attacks.
Therefore, we hypothesize that the runtime changes in certain cache parameters should render some of the side-channel attacks less effective due to their dependence on knowing the exact configuration of the caches.2020-06-03T00:00:00
Combining Target-independent Analysis with Dynamic Profiling to Build the Performance Model of a DSP
Fast and accurate performance estimation is a key aspect of heterogeneous embedded systems design flow, since cycle-accurate simulators, when exist, are usually too slow to be used during design space exploration. Performance estimation techniques are usually based on combination of estimation of the single processing elements which compose the system. Architectural characteristics of Digital Signal Processors (DSP), such as the presence of Single Instruction Multiple Data operations or of special hardware units to control loop executions, introduce peculiar aspects in the performance estimation problem. In this paper we present a methodology to estimate the performance of a function on a given dataset on a DSP. Estimation is performed combining the host profiling data with the function GNU GCC GIMPLE representation. Starting from the results of this analysis, we build a performance model of a DSP by exploiting the Linear Regression Technique. Use of GIMPLE representation allows to take directly into account the target-independent optimizations performed by the DSP compiler. We validate our approach by building a performance model of the MagicV DSP and by testing the model on a set of significative benchmarks
Parallel Architectures for Planetary Exploration Requirements (PAPER)
The Parallel Architectures for Planetary Exploration Requirements (PAPER) project is essentially research oriented towards technology insertion issues for NASA's unmanned planetary probes. It was initiated to complement and augment the long-term efforts for space exploration with particular reference to NASA/LaRC's (NASA Langley Research Center) research needs for planetary exploration missions of the mid and late 1990s. The requirements for space missions as given in the somewhat dated Advanced Information Processing Systems (AIPS) requirements document are contrasted with the new requirements from JPL/Caltech involving sensor data capture and scene analysis. It is shown that more stringent requirements have arisen as a result of technological advancements. Two possible architectures, the AIPS Proof of Concept (POC) configuration and the MAX Fault-tolerant dataflow multiprocessor, were evaluated. The main observation was that the AIPS design is biased towards fault tolerance and may not be an ideal architecture for planetary and deep space probes due to high cost and complexity. The MAX concepts appears to be a promising candidate, except that more detailed information is required. The feasibility for adding neural computation capability to this architecture needs to be studied. Key impact issues for architectural design of computing systems meant for planetary missions were also identified
Recommended from our members
Energy-efficient mobile Web computing
Next-generation Web services will be primarily accessed through mobile devices. However, mobile devices are low-performance and stringently energy-constrained. In my dissertation, I propose the design of a high-performance and energy-efficient mobile Web computing substrate. It is a hardware/software co-designed system that delivers satisfactory user quality-of-service (QoS) experience on a mobile energy budget. The key insight is that the traditional interfaces between different Web stacks need to be enhanced with new abstractions that express user QoS experience and that expose architectural-level complexities. On the basis of the enhanced interfaces, I propose synergistic cross-layer optimizations across the processor architecture, Web runtime, programming language, and application layers to maximize the whole system efficiency. The contributions made in this dissertation will likely have a long-term impact because the target application domain, the Web, is becoming a universal mobile development platform, and because our solutions target the fundamental computation layers of the Web domain.Electrical and Computer Engineerin
Dynamic adaptation and distribution of binaries to heterogeneous architectures
Real time multimedia workloads require progressingly more processing power. Modern many-core architectures provide enough processing power to satisfy the requirements of many real time multimedia workloads. When even they are un- able to satisfy processing power requirements, network-distribution can provide many workloads with even more computing power.
In this thesis, we present solutions that can be used to make it practical to use the processing power that networks of many-core architectures can provide. The research focus on solutions that can be included in our Parallel Processing Graphs (P2G) project.
We have developed the foundation for network distribution in P2G, and we have suggested a viable solution for execution of workloads on heterogeneous multi- core architectures
Performance estimation of embedded software with confidence levels
Since time constraints are a very critical aspect of an embedded system, performance evaluation can not be postponed to the end of the design flow, but it has to be introduced since its early stages. Estimation techniques based on mathematical models are usually preferred during this phase since they provide quite accurate estimation of the application performance in a fast way. However, the estimation error has to be considered during design space exploration to evaluate if a solution can be accepted (e.g., by discarding solutions whose estimated time is too close to constraint). Evaluate if the possible error can be significant analyzing a punctual estimation is not a trivial task. In this paper we propose a methodology, based on statistical analysis, that provides a prediction interval on the estimation and a confidence level on meeting a time constraint. This information can drive design space exploration reducing the number of solutions to be validated. The results show how the produced intervals effectively capture the estimation error introduced by a linear model
A HyperNet Architecture
Network virtualization is becoming a fundamental building block of future Internet architectures. By adding networking resources into the “cloud”, it is possible for users to rent virtual routers from the underlying network infrastructure, connect them with virtual channels to form a virtual network, and tailor the virtual network (e.g., load application-specific networking protocols, libraries and software stacks on to the virtual routers) to carry out a specific task. In addition, network virtualization technology allows such special-purpose virtual networks to co-exist on the same set of network infrastructure without interfering with each other.
Although the underlying network resources needed to support virtualized networks are rapidly becoming available, constructing a virtual network from the ground up and using the network is a challenging and labor-intensive task, one best left to experts.
To tackle this problem, we introduce the concept of a HyperNet, a pre-built, pre-configured network package that a user can easily deploy or access a virtual network to carry out a specific task (e.g., multicast video conferencing). HyperNets package together the network topology configuration, software, and network services needed to create and deploy a custom virtual network. Users download HyperNets from HyperNet repositories and then “run” them on virtualized network infrastructure much like users download and run virtual appliances on a virtual machine. To support the HyperNet abstraction, we created a Network Hypervisor service that provides a set of APIs that can be called to create a virtual network with certain characteristics.
To evaluate the HyperNet architecture, we implemented several example Hyper-Nets and ran them on our prototype implementation of the Network Hypervisor. Our experiments show that the Hypervisor API can be used to compose almost any special-purpose network – networks capable of carrying out functions that the current Internet does not provide. Moreover, the design of our HyperNet architecture is highly extensible, enabling developers to write high-level libraries (using the Network Hypervisor APIs) to achieve complicated tasks
Fine grain analysis of simulators accuracy for calibrating performance models
In embedded system design, the tuning and validation of a cycle accurate simulator is a difficult task. The designer has to assure that the estimation error of the simulator meets the design constraints on every application. If an application is not correctly estimated, the designer has to identify on which parts of the application the simulator introduces an estimation error and consequently fix the simulator. However, detecting which are the mispredicted parts of a very large application can be a difficult process which requires a lot of time. In this paper we propose a methodology which helps the designer to fast and automatically isolate the portions of the application mispredicted by a simulator. This is accomplished by recursively analyzing the application source code trace highlighting the mispredicted sections of source code. The results obtained applying the methodology to the TSIM simulator show how our methodology is able to fast analyze large applications isolating small portions of mispredicted code
- …