On Reliability-Aware Server Consolidation in Cloud Datacenters
In recent years, datacenter (DC) energy consumption has become an important issue in the technology world. Server consolidation using virtualization and virtual machine (VM) live migration allows cloud DCs to improve resource utilization and hence energy efficiency. To save energy, consolidation techniques turn off idle servers; because of workload fluctuations, however, these servers must later be turned back on to support increased resource demands. These repeated on-off cycles accelerate hardware wear-and-tear, degrade server reliability, and consequently increase maintenance and replacement costs. In this paper we propose a holistic mathematical model for reliability-aware server consolidation with the objective of minimizing total DC costs, including both energy and reliability costs. In effect, we minimize the number of active PMs and racks in a reliability-aware manner. We formulate the problem as a Mixed Integer Linear Programming (MILP) model, which is NP-complete. Finally, we evaluate the performance of our approach in different scenarios using extensive numerical MATLAB simulations.
Comment: International Symposium on Parallel and Distributed Computing (ISPDC), Innsbruck, Austria, 201
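The energy-versus-wear trade-off described above can be illustrated with a small sketch. This is not the paper's MILP model; it is a toy first-fit-decreasing packing of VM demands onto servers, with made-up energy and power-cycling (wear) cost constants:

```python
# Toy sketch of the consolidation trade-off (illustrative only; the paper
# formulates this as a MILP over PMs and racks). Demands and capacity are in
# integer resource units; cost constants are assumptions, not from the paper.

def consolidate(vm_demands, capacity):
    """Greedy first-fit-decreasing packing of VM demands onto servers.

    Returns the number of servers that must stay powered on.
    """
    servers = []  # each entry is the remaining capacity of one active server
    for demand in sorted(vm_demands, reverse=True):
        for i, free in enumerate(servers):
            if demand <= free:
                servers[i] = free - demand
                break
        else:
            servers.append(capacity - demand)  # power on a new server
    return len(servers)

def total_cost(active, powered_off, energy_per_server=1.0, wear_per_cycle=0.3):
    # Energy cost for active servers plus a hypothetical reliability (wear)
    # cost charged for every server that was power-cycled off.
    return active * energy_per_server + powered_off * wear_per_cycle

vms = [5, 4, 3, 3, 2, 2, 1]          # VM demands, out of 10 units per server
active = consolidate(vms, capacity=10)
print(active)                        # servers kept on
print(total_cost(active, powered_off=10 - active))  # assumes 10 servers total
```

A real reliability-aware formulation would weigh the wear term against energy savings when deciding whether turning a server off is worthwhile at all.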
Multi-Antenna Assisted Virtual Full-Duplex Relaying with Reliability-Aware Iterative Decoding
In this paper, multi-antenna assisted virtual full-duplex (FD) relaying with reliability-aware iterative decoding at the destination node is proposed to improve system spectral efficiency and reliability. This scheme enables two half-duplex relay nodes, which together mimic FD relaying, to alternately serve as transmitter and receiver, relaying their decoded data signals regardless of decoding errors while cancelling the inter-relay interference via QR decomposition. By then deploying the reliability-aware iterative detection/decoding process, the destination node can efficiently mitigate inter-frame interference and the error-propagation effect at the same time. Simulation results show that, without the extra cost of time delay and signalling overhead, our proposed scheme outperforms conventional selective decode-and-forward (S-DF) relaying schemes, such as cyclic-redundancy-check-based S-DF relaying and threshold-based S-DF relaying, by up to 8 dB in terms of bit-error rate.
Comment: 6 pages, 4 figures, conference paper has been submitted
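The error-propagation problem that motivates reliability-aware decoding can be seen in a back-of-envelope model of the S-DF baselines named above. The link-success probabilities here are assumed values for illustration, not figures from the paper:

```python
# Back-of-envelope comparison (illustrative): CRC-based selective
# decode-and-forward vs. naive always-forward relaying. All link-success
# probabilities (source->relay, relay->dest, source->dest) are assumptions.

def sdf_success(p_sr, p_rd, p_sd):
    """CRC-based S-DF: the relay forwards only when its CRC check passes
    (i.e., it decoded correctly); otherwise the destination falls back to
    the direct source->destination link."""
    return p_sr * p_rd + (1 - p_sr) * p_sd

def always_forward_success(p_sr, p_rd, p_sd):
    """Relay forwards regardless of decoding errors and the destination
    trusts the relayed signal, so relay decoding errors propagate."""
    return p_sr * p_rd

print(sdf_success(0.9, 0.95, 0.6))            # approx. 0.915
print(always_forward_success(0.9, 0.95, 0.6))  # approx. 0.855
```

The proposed scheme instead keeps the relays forwarding continuously (for the FD-like rate benefit) and relies on reliability-aware iterative decoding at the destination to suppress the propagated errors.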
PowerPlanningDL: Reliability-Aware Framework for On-Chip Power Grid Design using Deep Learning
With the increase in the complexity of chip designs, VLSI physical design has become a time-consuming, iterative design process. Power planning is the part of floorplanning in VLSI physical design in which power grid networks are designed to provide adequate power to all the underlying functional blocks. Power planning also requires multiple iterative steps to create the power grid network while satisfying the allowed worst-case IR drop and electromigration (EM) margins. For the first time, this paper introduces a deep learning (DL)-based framework to approximately predict the initial design of the power grid network under different reliability constraints. The proposed framework eliminates many iterative design steps and speeds up the total design cycle. A neural-network-based multi-target regression technique is used to create the DL model. Features are extracted, and the training dataset generated, from the floorplans of power grid designs taken from an IBM processor. The DL model is trained on the generated dataset, and the proposed framework is validated using a new set of power grid specifications (obtained by perturbing the designs used in the training phase). The results show that the predicted power grid design is close to the original design, with minimal prediction error (~2%). The proposed DL-based approach also improves design-cycle time, with a speedup of ~6X on standard power grid benchmarks.
Comment: Published in proceedings of IEEE/ACM Design, Automation and Test in Europe Conference (DATE) 2020, 6 pages
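The multi-target regression idea (one model predicting several power-grid parameters at once from floorplan features) can be sketched with a linear least-squares stand-in. The features, targets, and linear model here are placeholders; the paper uses a neural network trained on real IBM-derived designs:

```python
# Minimal multi-target regression sketch (illustrative stand-in for the
# paper's neural network). Synthetic "floorplan features" map to two
# hypothetical power-grid parameters, e.g. strap width and pitch.
import numpy as np

rng = np.random.default_rng(0)
X = rng.uniform(size=(40, 3))                    # 3 features per design
W_true = np.array([[1.0, 0.5],
                   [0.2, 1.5],
                   [0.7, 0.1]])
Y = X @ W_true                                   # 2 target parameters per design

# One least-squares fit handles both targets simultaneously.
W_hat, *_ = np.linalg.lstsq(X, Y, rcond=None)
err = np.abs(X @ W_hat - Y).max()
print(err)  # near zero on this noiseless toy data
```

A neural network replaces the linear map when the feature-to-parameter relationship is nonlinear, but the multi-target structure (one model, several outputs) is the same.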
Optimizing soft error reliability through scheduling on heterogeneous multicore processors
Reliability against soft errors is an increasingly important issue as technology continues to shrink. In this paper, we show that applications exhibit different reliability characteristics on big, high-performance cores versus small, power-efficient cores, and that there is significant opportunity to improve system reliability through reliability-aware scheduling on heterogeneous multicore processors. We monitor the reliability characteristics of all running applications and dynamically schedule applications to the different core types in a heterogeneous multicore to maximize system reliability. Reliability-aware scheduling improves reliability by 25.4 percent on average (and up to 60.2 percent) compared to performance-optimized scheduling on a heterogeneous multicore processor with two big cores and two small cores, while degrading performance by only 6.3 percent. We also introduce a novel system-level reliability metric for multiprogram workloads on (heterogeneous) multicores. We provide a trade-off analysis among reliability-, power- and performance-optimized scheduling, and evaluate reliability-aware scheduling under performance constraints and for unprotected L1 caches. In addition, we extend our scheduling mechanisms to multithreaded programs. The hardware cost in support of our reliability-aware scheduler is limited to 296 bytes per core.
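The core scheduling decision can be sketched as a small search: given each application's probability of an error-free run on each core type, pick the assignment that maximizes system reliability. The per-application numbers below are invented for illustration, and the exhaustive search stands in for the paper's dynamic scheduler:

```python
# Toy reliability-aware scheduler (illustrative): four applications, two big
# and two small cores. Each app has an assumed probability of an error-free
# run per core type; system reliability is the product across apps.
from itertools import permutations

rel = {            # app: (reliability on big core, reliability on small core)
    "A": (0.90, 0.99),
    "B": (0.97, 0.98),
    "C": (0.85, 0.96),
    "D": (0.99, 0.92),
}
core_types = ["big", "big", "small", "small"]

def system_reliability(order):
    r = 1.0
    for app, core in zip(order, core_types):
        r *= rel[app][0] if core == "big" else rel[app][1]
    return r

# Exhaustive search over all app-to-core assignments (fine for 4 apps).
best = max(permutations(rel), key=system_reliability)
print(best, system_reliability(best))
```

With these numbers, the apps least hurt by (or helped by) the big cores land there, which mirrors the paper's observation that matching applications to core types by reliability characteristics pays off.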
Experimental Validation of the Reliability-Aware Multi-UAV Coverage Path Planning Problem
Unmanned aerial vehicles (UAVs) have become crucial for various applications, necessitating reliable and time-constrained performance. Multi-UAV solutions offer advantages but require effective coordination. Traditional coverage path planning methods overlook uncertainties and individual UAV failures. To address this, reliability-aware multi-UAV coverage path planning methods optimise task allocation to maximise mission-completion probability given a failure model. This paper presents an experimental validation of the reliability-aware approach, specifically one using a Greedy Genetic Algorithm (GGA). We evaluate the GGA's performance in real-world environments, comparing observed mission reliability to computed reliability and comparing it against a traditional multi-UAV method. The experimental validation demonstrates the practical viability and effectiveness of the reliability-aware approach, showing significant improvement in mission reliability despite the inevitable mismatch between real and assumed failure models.
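The allocation objective can be sketched with a much simpler greedy pass than the paper's GGA: each coverage cell assigned to a UAV multiplies the mission-success probability by that UAV's per-cell survival probability, subject to a battery cap. Survival probabilities and caps below are made up:

```python
# Simplified reliability-aware task allocation (illustrative; the paper uses
# a Greedy Genetic Algorithm). Mission succeeds only if every assigned cell
# is covered, so success probability is the product of per-cell survivals.

def allocate(n_cells, survival, cap):
    """Greedily give each cell to the most reliable UAV with budget left."""
    counts = [0] * len(survival)
    for _ in range(n_cells):
        best = max(
            (i for i in range(len(survival)) if counts[i] < cap[i]),
            key=lambda i: survival[i],
        )
        counts[best] += 1
    return counts

def mission_reliability(counts, survival):
    p = 1.0
    for n, s in zip(counts, survival):
        p *= s ** n
    return p

survival = [0.995, 0.98, 0.97]       # assumed per-cell survival per UAV
counts = allocate(10, survival, cap=[4, 4, 4])
print(counts)                        # cells per UAV
print(mission_reliability(counts, survival))
```

The GGA improves on such a greedy pass by also searching over route orderings and redundant coverage, but the objective (maximise completion probability under a failure model) is the same.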
Reliability-Aware Power Management Of Multi-Core Systems (MPSoCs)
Long-term reliability of processors in embedded systems is receiving growing attention, since decreasing feature sizes and increasing power consumption negatively influence processor lifespan. Among other measures, reliability can be influenced significantly by Dynamic Power Management (DPM), since it affects the processor's temperature. Compared to single-core systems, reconfigurable multi-core SoCs offer many more possibilities to optimize power and reliability.
The impact of different DPM strategies on the lifespan of multi-core processors is the focus of this presentation. It is shown that the long-term reliability of a multi-core system can be influenced deliberately through the choice of DPM strategy, and that temperature cycling greatly influences the estimated lifespan. In this presentation, a new reliability-aware dynamic power management (RADPM) policy is explained.
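Why temperature cycling dominates the lifespan estimate can be illustrated with a Coffin-Manson-style model, in which cycles-to-failure falls off as a power of the thermal swing. The constants and the two DPM scenarios below are illustrative assumptions, not values from the presentation:

```python
# Coffin-Manson-style illustration of DPM's effect on lifespan:
# N_f = C * (dT)**(-q), so larger thermal swings wear the part out much
# faster. C, q, and both scenarios are assumed for illustration.

def cycles_to_failure(delta_t, C=1e9, q=3.0):
    """Estimated thermal cycles to failure for a swing of delta_t kelvin."""
    return C * delta_t ** (-q)

def years_of_life(delta_t, cycles_per_day):
    return cycles_to_failure(delta_t) / (cycles_per_day * 365.0)

# Aggressive power gating: frequent deep sleep -> large, frequent swings.
aggressive = years_of_life(delta_t=40.0, cycles_per_day=2000)
# Milder DPM (e.g. DVFS only): smaller, less frequent temperature swings.
mild = years_of_life(delta_t=15.0, cycles_per_day=500)
print(aggressive, mild)
```

A reliability-aware policy like RADPM can use exactly this kind of model to decide when the energy saved by a deep sleep state is worth the thermal cycle it costs.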
Reliability-Aware Optimization of Approximate Computational Kernels with Rely
Emerging high-performance architectures are anticipated to contain unreliable components (e.g., ALUs) that offer low power consumption at the expense of soft errors. Some applications (such as multimedia processing, machine learning, and big data analytics) can often naturally tolerate soft errors and can therefore trade accuracy of their results for reduced energy consumption by utilizing these unreliable hardware components. We present and evaluate a technique for reliability-aware optimization of approximate computational kernel implementations. Our technique takes a standard implementation of a computation and automatically replaces some of its arithmetic operations with unreliable versions that consume less power, but may produce incorrect results with some probability. Our technique works with a developer-provided specification of the required reliability of a computation -- the probability that it returns the correct result -- and produces an unreliable implementation that satisfies that specification. We evaluate our approach on five applications from the image processing, numerical analysis, and financial analysis domains and demonstrate how our technique enables automatic exploration of the trade-off between the reliability of a computation and its performance.
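The reliability specification described above composes in a simple way: if each unreliable operation is correct with some probability, a straight-line kernel of such operations is correct at least when all of them are. The model below is a rough illustration of that calculus, not Rely's actual analysis:

```python
# Rough reliability calculus (illustrative, not Rely's analysis): a kernel
# performing n_ops unreliable operations, each correct with probability
# r_op, returns the correct result with probability at least r_op**n_ops.
# Compare that bound against a developer-provided specification.

def kernel_reliability(r_op, n_ops):
    """Lower bound on the probability the kernel's result is correct."""
    return r_op ** n_ops

def meets_spec(r_op, n_ops, spec):
    """True if the unreliable implementation satisfies the required
    reliability specification."""
    return kernel_reliability(r_op, n_ops) >= spec

# e.g. 100 unreliable multiplies, each correct with assumed probability 0.99999
print(kernel_reliability(0.99999, 100))   # approx. 0.999
print(meets_spec(0.99999, 100, spec=0.99))
```

An optimizer in this spirit would swap reliable operations for unreliable ones only while the composed bound stays above the developer's specification.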