41 research outputs found
Modeling and visualizing networked multi-core embedded software energy consumption
In this report we present a network-level multi-core energy model and a
software development process workflow that allows software developers to
estimate the energy consumption of multi-core embedded programs. This work
focuses on a high performance, cache-less and timing predictable embedded
processor architecture, XS1. Prior modelling work is improved to increase
accuracy, then extended to be parametric with respect to voltage and frequency
scaling (VFS) and then integrated into a larger scale model of a network of
interconnected cores. The modelling is supported by enhancements to an open
source instruction set simulator to provide the first network timing aware
simulations of the target architecture. Simulation based modelling techniques
are combined with methods of results presentation to demonstrate how such work
can be integrated into a software developer's workflow, enabling the developer
to make informed, energy aware coding decisions. A set of single-,
multi-threaded and multi-core benchmarks are used to exercise and evaluate the
models and provide use case examples for how results can be presented and
interpreted. The models all yield accuracy within an average +/-5 % error
margin
A software controlled voltage tuning system using multi-purpose ring oscillators
This paper presents a novel software driven voltage tuning method that
utilises multi-purpose Ring Oscillators (ROs) to provide process variation and
environment sensitive energy reductions. The proposed technique enables voltage
tuning based on the observed frequency of the ROs, taken as a representation of
the device speed and used to estimate a safe minimum operating voltage at a
given core frequency. A conservative linear relationship between RO frequency
and silicon speed is used to approximate the critical path of the processor.
Using a multi-purpose RO not specifically implemented for critical path
characterisation is a unique approach to voltage tuning. The parameters
governing the relationship between RO and silicon speed are obtained through
the testing of a sample of processors from different wafer regions. These
parameters can then be used on all devices of that model. The tuning method and
software control framework is demonstrated on a sample of XMOS XS1-U8A-64
embedded microprocessors, yielding a dynamic power saving of up to 25% with no
performance reduction and no negative impact on the real-time constraints of
the embedded software running on the processor
Overview of Swallow --- A Scalable 480-core System for Investigating the Performance and Energy Efficiency of Many-core Applications and Operating Systems
We present Swallow, a scalable many-core architecture, with a current
configuration of 480 x 32-bit processors.
Swallow is an open-source architecture, designed from the ground up to
deliver scalable increases in usable computational power to allow
experimentation with many-core applications and the operating systems that
support them.
Scalability is enabled by the creation of a tile-able system with a
low-latency interconnect, featuring an attractive communication-to-computation
ratio and the use of a distributed memory configuration.
We analyse the energy and computational and communication performances of
Swallow. The system provides 240GIPS with each core consuming 71--193mW,
dependent on workload. Power consumption per instruction is lower than almost
all systems of comparable scale.
We also show how the use of a distributed operating system (nOS) allows the
easy creation of scalable software to exploit Swallow's potential. Finally, we
show two use case studies: modelling neurons and the overlay of shared memory
on a distributed memory system.Comment: An open source release of the Swallow system design and code will
follow and references to these will be added at a later dat
IoT Droplocks: Wireless Fingerprint Theft Using Hacked Smart Locks
Electronic locks can provide security- and convenience-enhancing features,
with fingerprint readers an increasingly common feature in these products. When
equipped with a wireless radio, they become a smart lock and join the billions
of IoT devices proliferating our world. However, such capabilities can also be
used to transform smart locks into fingerprint harvesters that compromise an
individual's security without their knowledge. We have named this the droplock
attack. This paper demonstrates how the harvesting technique works, shows that
off-the-shelf smart locks can be invisibly modified to perform such attacks,
discusses the implications for smart device design and usage, and calls for
better manufacturer and public treatment of this issue.Comment: Submitted and accepted into 2022 IEEE International Conferences on
Internet of Things (iThings) and IEEE Green Computing & Communications
(GreenCom) and IEEE Cyber, Physical & Social Computing (CPSCom) and IEEE
Smart Data (SmartData) and IEEE Congress. Submitted version: 10 pages, 8
figure
A Benes Based NoC Switching Architecture for Mixed Criticality Embedded Systems
Multi-core, Mixed Criticality Embedded (MCE) real-time systems require high
timing precision and predictability to guarantee there will be no interference
between tasks. These guarantees are necessary in application areas such as
avionics and automotive, where task interference or missed deadlines could be
catastrophic, and safety requirements are strict. In modern multi-core systems,
the interconnect becomes a potential point of uncertainty, introducing major
challenges in proving behaviour is always within specified constraints,
limiting the means of growing system performance to add more tasks, or provide
more computational resources to existing tasks.
We present MCENoC, a Network-on-Chip (NoC) switching architecture that
provides innovations to overcome this with predictable, formally verifiable
timing behaviour that is consistent across the whole NoC. We show how the
fundamental properties of Benes networks benefit MCE applications and meet our
architecture requirements. Using SystemVerilog Assertions (SVA), formal
properties are defined that aid the refinement of the specification of the
design as well as enabling the implementation to be exhaustively formally
verified. We demonstrate the performance of the design in terms of size,
throughput and predictability, and discuss the application level considerations
needed to exploit this architecture
Swallow:Building an Energy-Transparent Many-Core Embedded Real-Time System
Swallow is a many-core platform of interconnected embedded real time processors with time-deterministic execution and a cache-less memory subsystem. Its largest current configuration is 480 Ć 32-bit processors. It is open-source, designed from the ground up to allow the exploration of flexibility, scalability and energy efficiency in large systems of embedded processors. Further, it enables the behavior of various structures of parallel programs to be explored. It is a proof of concept and design example for other potential systems of this kind. We present the energy transparency features and proportional energy scaling of the system that allows it to be expanded beyond hundreds of cores. We discuss the design choices, construction and novel network implementation of Swallow. Currently, the system provides up to 240 GIPS, with each core consuming 71ā193 mW, dependent on workload. Its power per instruction is lower than almost all systems of comparable scale. We discuss the challenges associated with efficiently utilizing this system, particularly communication/computation ratios, and give recommendations for future systems and their software
Static analysis of energy consumption for LLVM IR programs
Energy models can be constructed by characterizing the energy consumed by
executing each instruction in a processor's instruction set. This can be used
to determine how much energy is required to execute a sequence of assembly
instructions, without the need to instrument or measure hardware.
However, statically analyzing low-level program structures is hard, and the
gap between the high-level program structure and the low-level energy models
needs to be bridged. We have developed techniques for performing a static
analysis on the intermediate compiler representations of a program.
Specifically, we target LLVM IR, a representation used by modern compilers,
including Clang. Using these techniques we can automatically infer an estimate
of the energy consumed when running a function under different platforms, using
different compilers.
One of the challenges in doing so is that of determining an energy cost of
executing LLVM IR program segments, for which we have developed two different
approaches. When this information is used in conjunction with our analysis, we
are able to infer energy formulae that characterize the energy consumption for
a particular program. This approach can be applied to any languages targeting
the LLVM toolchain, including C and XC or architectures such as ARM Cortex-M or
XMOS xCORE, with a focus towards embedded platforms. Our techniques are
validated on these platforms by comparing the static analysis results to the
physical measurements taken from the hardware. Static energy consumption
estimation enables energy-aware software development, without requiring
hardware knowledge
Chapter Measuring Energy
Data centres are part of today's critical information and communication infrastructure, and the majority of business transactions as well as much of our digital life now depend on them. At the same time, data centres are large primary energy consumers, with energy consumed by IT and server room air conditioning equipment and also by general building facilities. In many data centres, IT equipment energy and cooling energy requirements are not always coordinated, so energy consumption is not optimised. Most data centres lack an integrated energy management system that jointly optimises and controls all its energy consuming equipments in order to reduce energy consumption and increase the usage of local renewable energy sources. In this chapter, the authors discuss the challenges of coordinated energy management in data centres and present a novel scalable, integrated energy management system architecture for data centre wide optimisation. A prototype of the system has been implemented, including joint workload and thermal management algorithms. The control algorithms are evaluated in an accurate simulationābased model of a real data centre. Results show significant energy savings potential, in some cases up to 40%, by integrating workload and thermal management
Measuring Energy
This chapter provides an introduction to quantifying the energy consumed by software. It is written for computer scientists, software engineers, embedded system developers and programmers who want to understand how to measure the energy consumed by the code they write in order to optimize for energy efficiency. We start with an overview of the electrical foundations of energy measurement and show how these are applied by reviewing the most commonly found energy sensing techniques. This is followed by a brief discussion of the signal processing required to obtain energy consumption data from sensing. We then present two energy measurement systems that are based on sensing techniques. Both can be used to directly measure the energy consumed by software running on embedded systems without the need to modify the hardware. As an alternative, regression-based techniques can be used to infer energy consumption based on monitoring events during program execution using counters monitors offered by the hardware. We introduce the foundations of regression analysis and illustrate how an energy model for an ARM processor can be built using linear regression. In the conclusion, we offer a wider discussion on what should be considered when selecting an energy measurement technique