On Reliability-Aware Server Consolidation in Cloud Datacenters
In recent years, datacenter (DC) energy consumption has become an important issue in the technology world. Server consolidation using virtualization and virtual machine (VM) live migration allows cloud DCs to improve resource utilization and hence energy efficiency. To save energy, consolidation techniques turn off idle servers; because of workload fluctuations, however, these servers must later be turned back on to support increased resource demands. These repeated on-off cycles accelerate hardware wear-and-tear, degrade server reliability, and consequently increase maintenance and replacement costs. In this paper we propose a holistic mathematical model for reliability-aware server consolidation with the objective of minimizing total DC costs, including both energy and reliability costs. In effect, we minimize the number of active PMs and racks in a reliability-aware manner. We formulate the problem as a Mixed Integer Linear Programming (MILP) model, which is NP-complete. Finally, we evaluate the performance of our approach in different scenarios using extensive numerical MATLAB simulations.
Comment: International Symposium on Parallel and Distributed Computing (ISPDC), Innsbruck, Austria, 201
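The energy-versus-wear trade-off described above can be illustrated with a small sketch. This is not the paper's MILP model; it is a toy first-fit-decreasing packing of VM demands onto servers, with made-up energy and power-cycling (wear) cost constants:

```python
# Toy sketch of the consolidation trade-off (illustrative only; the paper
# formulates this as a MILP over PMs and racks). Demands and capacity are in
# integer resource units; cost constants are assumptions, not from the paper.

def consolidate(vm_demands, capacity):
    """Greedy first-fit-decreasing packing of VM demands onto servers.

    Returns the number of servers that must stay powered on.
    """
    servers = []  # each entry is the remaining capacity of one active server
    for demand in sorted(vm_demands, reverse=True):
        for i, free in enumerate(servers):
            if demand <= free:
                servers[i] = free - demand
                break
        else:
            servers.append(capacity - demand)  # power on a new server
    return len(servers)

def total_cost(active, powered_off, energy_per_server=1.0, wear_per_cycle=0.3):
    # Energy cost for active servers plus a hypothetical reliability (wear)
    # cost charged for every server that was power-cycled off.
    return active * energy_per_server + powered_off * wear_per_cycle

vms = [5, 4, 3, 3, 2, 2, 1]          # VM demands, out of 10 units per server
active = consolidate(vms, capacity=10)
print(active)                        # servers kept on
print(total_cost(active, powered_off=10 - active))  # assumes 10 servers total
```

A real reliability-aware formulation would weigh the wear term against energy savings when deciding whether turning a server off is worthwhile at all.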
Multi-Antenna Assisted Virtual Full-Duplex Relaying with Reliability-Aware Iterative Decoding
In this paper, multi-antenna assisted virtual full-duplex (FD) relaying with reliability-aware iterative decoding at the destination node is proposed to improve system spectral efficiency and reliability. This scheme enables two half-duplex relay nodes, which together mimic FD relaying, to alternately serve as transmitter and receiver, relaying their decoded data signals regardless of decoding errors while cancelling the inter-relay interference via QR decomposition. By then deploying the reliability-aware iterative detection/decoding process, the destination node can efficiently mitigate inter-frame interference and the error-propagation effect at the same time. Simulation results show that, without the extra cost of time delay and signalling overhead, our proposed scheme outperforms conventional selective decode-and-forward (S-DF) relaying schemes, such as cyclic-redundancy-check-based S-DF relaying and threshold-based S-DF relaying, by up to 8 dB in terms of bit-error rate.
Comment: 6 pages, 4 figures, conference paper has been submitted
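The error-propagation problem that motivates reliability-aware decoding can be seen in a back-of-envelope model of the S-DF baselines named above. The link-success probabilities here are assumed values for illustration, not figures from the paper:

```python
# Back-of-envelope comparison (illustrative): CRC-based selective
# decode-and-forward vs. naive always-forward relaying. All link-success
# probabilities (source->relay, relay->dest, source->dest) are assumptions.

def sdf_success(p_sr, p_rd, p_sd):
    """CRC-based S-DF: the relay forwards only when its CRC check passes
    (i.e., it decoded correctly); otherwise the destination falls back to
    the direct source->destination link."""
    return p_sr * p_rd + (1 - p_sr) * p_sd

def always_forward_success(p_sr, p_rd, p_sd):
    """Relay forwards regardless of decoding errors and the destination
    trusts the relayed signal, so relay decoding errors propagate."""
    return p_sr * p_rd

print(sdf_success(0.9, 0.95, 0.6))            # approx. 0.915
print(always_forward_success(0.9, 0.95, 0.6))  # approx. 0.855
```

The proposed scheme instead keeps the relays forwarding continuously (for the FD-like rate benefit) and relies on reliability-aware iterative decoding at the destination to suppress the propagated errors.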
PowerPlanningDL: Reliability-Aware Framework for On-Chip Power Grid Design using Deep Learning
With the increase in the complexity of chip designs, VLSI physical design has become a time-consuming, iterative design process. Power planning is the part of floorplanning in VLSI physical design in which power grid networks are designed to provide adequate power to all the underlying functional blocks. Power planning also requires multiple iterative steps to create the power grid network while satisfying the allowed worst-case IR drop and electromigration (EM) margins. For the first time, this paper introduces a deep learning (DL)-based framework to approximately predict the initial design of the power grid network under different reliability constraints. The proposed framework eliminates many iterative design steps and speeds up the total design cycle. A neural-network-based multi-target regression technique is used to create the DL model. Features are extracted, and the training dataset generated, from the floorplans of power grid designs taken from an IBM processor. The DL model is trained on the generated dataset, and the proposed framework is validated using a new set of power grid specifications (obtained by perturbing the designs used in the training phase). The results show that the predicted power grid design is close to the original design, with minimal prediction error (~2%). The proposed DL-based approach also improves design-cycle time, with a speedup of ~6X on standard power grid benchmarks.
Comment: Published in proceedings of IEEE/ACM Design, Automation and Test in Europe Conference (DATE) 2020, 6 pages
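The multi-target regression idea (one model predicting several power-grid parameters at once from floorplan features) can be sketched with a linear least-squares stand-in. The features, targets, and linear model here are placeholders; the paper uses a neural network trained on real IBM-derived designs:

```python
# Minimal multi-target regression sketch (illustrative stand-in for the
# paper's neural network). Synthetic "floorplan features" map to two
# hypothetical power-grid parameters, e.g. strap width and pitch.
import numpy as np

rng = np.random.default_rng(0)
X = rng.uniform(size=(40, 3))                    # 3 features per design
W_true = np.array([[1.0, 0.5],
                   [0.2, 1.5],
                   [0.7, 0.1]])
Y = X @ W_true                                   # 2 target parameters per design

# One least-squares fit handles both targets simultaneously.
W_hat, *_ = np.linalg.lstsq(X, Y, rcond=None)
err = np.abs(X @ W_hat - Y).max()
print(err)  # near zero on this noiseless toy data
```

A neural network replaces the linear map when the feature-to-parameter relationship is nonlinear, but the multi-target structure (one model, several outputs) is the same.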
Optimizing soft error reliability through scheduling on heterogeneous multicore processors
Reliability against soft errors is an increasingly important issue as technology continues to shrink. In this paper, we show that applications exhibit different reliability characteristics on big, high-performance cores versus small, power-efficient cores, and that there is significant opportunity to improve system reliability through reliability-aware scheduling on heterogeneous multicore processors. We monitor the reliability characteristics of all running applications and dynamically schedule applications to the different core types in a heterogeneous multicore to maximize system reliability. Reliability-aware scheduling improves reliability by 25.4 percent on average (and up to 60.2 percent) compared to performance-optimized scheduling on a heterogeneous multicore processor with two big cores and two small cores, while degrading performance by only 6.3 percent. We also introduce a novel system-level reliability metric for multiprogram workloads on (heterogeneous) multicores. We provide a trade-off analysis among reliability-, power- and performance-optimized scheduling, and evaluate reliability-aware scheduling under performance constraints and for unprotected L1 caches. In addition, we extend our scheduling mechanisms to multithreaded programs. The hardware cost in support of our reliability-aware scheduler is limited to 296 bytes per core.
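The core scheduling decision can be sketched as a small search: given each application's probability of an error-free run on each core type, pick the assignment that maximizes system reliability. The per-application numbers below are invented for illustration, and the exhaustive search stands in for the paper's dynamic scheduler:

```python
# Toy reliability-aware scheduler (illustrative): four applications, two big
# and two small cores. Each app has an assumed probability of an error-free
# run per core type; system reliability is the product across apps.
from itertools import permutations

rel = {            # app: (reliability on big core, reliability on small core)
    "A": (0.90, 0.99),
    "B": (0.97, 0.98),
    "C": (0.85, 0.96),
    "D": (0.99, 0.92),
}
core_types = ["big", "big", "small", "small"]

def system_reliability(order):
    r = 1.0
    for app, core in zip(order, core_types):
        r *= rel[app][0] if core == "big" else rel[app][1]
    return r

# Exhaustive search over all app-to-core assignments (fine for 4 apps).
best = max(permutations(rel), key=system_reliability)
print(best, system_reliability(best))
```

With these numbers, the apps least hurt by (or helped by) the big cores land there, which mirrors the paper's observation that matching applications to core types by reliability characteristics pays off.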
Experimental Validation of the Reliability-Aware Multi-UAV Coverage Path Planning Problem
Unmanned aerial vehicles (UAVs) have become crucial for various applications, necessitating reliable and time-constrained performance. Multi-UAV solutions offer advantages but require effective coordination. Traditional coverage path planning methods overlook uncertainties and individual UAV failures. To address this, reliability-aware multi-UAV coverage path planning methods optimise task allocation to maximise mission-completion probability given a failure model. This paper presents an experimental validation of the reliability-aware approach, specifically one using a Greedy Genetic Algorithm (GGA). We evaluate the GGA's performance in real-world environments, comparing observed mission reliability to computed reliability and comparing it against a traditional multi-UAV method. The experimental validation demonstrates the practical viability and effectiveness of the reliability-aware approach, showing significant improvement in mission reliability despite the inevitable mismatch between real and assumed failure models.
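The allocation objective can be sketched with a much simpler greedy pass than the paper's GGA: each coverage cell assigned to a UAV multiplies the mission-success probability by that UAV's per-cell survival probability, subject to a battery cap. Survival probabilities and caps below are made up:

```python
# Simplified reliability-aware task allocation (illustrative; the paper uses
# a Greedy Genetic Algorithm). Mission succeeds only if every assigned cell
# is covered, so success probability is the product of per-cell survivals.

def allocate(n_cells, survival, cap):
    """Greedily give each cell to the most reliable UAV with budget left."""
    counts = [0] * len(survival)
    for _ in range(n_cells):
        best = max(
            (i for i in range(len(survival)) if counts[i] < cap[i]),
            key=lambda i: survival[i],
        )
        counts[best] += 1
    return counts

def mission_reliability(counts, survival):
    p = 1.0
    for n, s in zip(counts, survival):
        p *= s ** n
    return p

survival = [0.995, 0.98, 0.97]       # assumed per-cell survival per UAV
counts = allocate(10, survival, cap=[4, 4, 4])
print(counts)                        # cells per UAV
print(mission_reliability(counts, survival))
```

The GGA improves on such a greedy pass by also searching over route orderings and redundant coverage, but the objective (maximise completion probability under a failure model) is the same.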
Reliability-Aware Power Management Of Multi-Core Systems (MPSoCs)
Long-term reliability of processors in embedded systems is receiving growing attention, since decreasing feature sizes and increasing power consumption negatively influence processor lifespan. Among other measures, reliability can be influenced significantly by Dynamic Power Management (DPM), since it affects the processor's temperature. Compared to single-core systems, reconfigurable multi-core SoCs offer many more possibilities to optimize power and reliability.
The impact of different DPM strategies on the lifespan of multi-core processors is the focus of this presentation. It is shown that the long-term reliability of a multi-core system can be influenced deliberately through the choice of DPM strategy, and that temperature cycling greatly influences the estimated lifespan. In this presentation, a new reliability-aware dynamic power management (RADPM) policy is explained.
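Why temperature cycling dominates the lifespan estimate can be illustrated with a Coffin-Manson-style model, in which cycles-to-failure falls off as a power of the thermal swing. The constants and the two DPM scenarios below are illustrative assumptions, not values from the presentation:

```python
# Coffin-Manson-style illustration of DPM's effect on lifespan:
# N_f = C * (dT)**(-q), so larger thermal swings wear the part out much
# faster. C, q, and both scenarios are assumed for illustration.

def cycles_to_failure(delta_t, C=1e9, q=3.0):
    """Estimated thermal cycles to failure for a swing of delta_t kelvin."""
    return C * delta_t ** (-q)

def years_of_life(delta_t, cycles_per_day):
    return cycles_to_failure(delta_t) / (cycles_per_day * 365.0)

# Aggressive power gating: frequent deep sleep -> large, frequent swings.
aggressive = years_of_life(delta_t=40.0, cycles_per_day=2000)
# Milder DPM (e.g. DVFS only): smaller, less frequent temperature swings.
mild = years_of_life(delta_t=15.0, cycles_per_day=500)
print(aggressive, mild)
```

A reliability-aware policy like RADPM can use exactly this kind of model to decide when the energy saved by a deep sleep state is worth the thermal cycle it costs.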
Reliability-Aware Optimization of Approximate Computational Kernels with Rely
Emerging high-performance architectures are anticipated to contain unreliable components (e.g., ALUs) that offer low power consumption at the expense of soft errors. Some applications (such as multimedia processing, machine learning, and big data analytics) can often naturally tolerate soft errors and can therefore trade accuracy of their results for reduced energy consumption by utilizing these unreliable hardware components. We present and evaluate a technique for reliability-aware optimization of approximate computational kernel implementations. Our technique takes a standard implementation of a computation and automatically replaces some of its arithmetic operations with unreliable versions that consume less power, but may produce incorrect results with some probability. Our technique works with a developer-provided specification of the required reliability of a computation -- the probability that it returns the correct result -- and produces an unreliable implementation that satisfies that specification. We evaluate our approach on five applications from the image processing, numerical analysis, and financial analysis domains and demonstrate how our technique enables automatic exploration of the trade-off between the reliability of a computation and its performance.
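The reliability specification described above composes in a simple way: if each unreliable operation is correct with some probability, a straight-line kernel of such operations is correct at least when all of them are. The model below is a rough illustration of that calculus, not Rely's actual analysis:

```python
# Rough reliability calculus (illustrative, not Rely's analysis): a kernel
# performing n_ops unreliable operations, each correct with probability
# r_op, returns the correct result with probability at least r_op**n_ops.
# Compare that bound against a developer-provided specification.

def kernel_reliability(r_op, n_ops):
    """Lower bound on the probability the kernel's result is correct."""
    return r_op ** n_ops

def meets_spec(r_op, n_ops, spec):
    """True if the unreliable implementation satisfies the required
    reliability specification."""
    return kernel_reliability(r_op, n_ops) >= spec

# e.g. 100 unreliable multiplies, each correct with assumed probability 0.99999
print(kernel_reliability(0.99999, 100))   # approx. 0.999
print(meets_spec(0.99999, 100, spec=0.99))
```

An optimizer in this spirit would swap reliable operations for unreliable ones only while the composed bound stays above the developer's specification.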