64 research outputs found
Transient and Permanent Error Control for High-End Multiprocessor Systems-on-Chip
High-end MPSoC systems with built-in high-radix topologies achieve good performance because of the improved connectivity and the reduced network diameter. In high-end MPSoC systems, fault tolerance support is becoming a compulsory feature. In this work, we propose a combined method to address permanent and transient link and router failures in those systems. The LBDRhr mechanism is proposed to tolerate permanent link failures in some popular high-radix topologies. The increased router complexity may lead to more transient router errors than in routers using a simple XY routing algorithm. We exploit the inherent information redundancy (IIR) in LBDRhr logic to manage transient errors in the network routers. Thorough analyses are provided to discover the appropriate internal nodes and the forbidden signal patterns for transient error detection. Simulation results show that LBDRhr logic can tolerate all of the permanent failure combinations of long-range links and 80% of link failures at short-range links. Case studies show that, compared to triple modular redundancy, the error detection method based on the new IIR extraction method reduces the power consumption by 33% and the residual error rate by up to two orders of magnitude. The impact of network topologies on the efficiency of the detection mechanism is examined in this work as well.
A Multi-Function Provable Data Possession Scheme in Cloud Computing
To satisfy the different requirements of provable data possession in cloud computing, a multi-function provable data possession (MF-PDP) scheme is proposed, which supports public verification, dynamic data, an unlimited number of verifications, and sampling verification. In addition, it is secure in the random oracle (RO) model, preserves verification privacy under the semi-trusted model, and resists replacement and replay attacks. The detailed design is provided, and theoretical analyses of its correctness, security, and performance are also described. Experimental simulation and comparative analysis suggest its feasibility and advantages.
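As a rough illustration of why sampling verification can be effective, the sketch below computes the chance that a random challenge hits at least one corrupted block. It assumes the verifier samples blocks uniformly at random without replacement, which is the usual model in the PDP literature and not a detail taken from this abstract; the function and parameter names are hypothetical.

```python
from math import comb


def detection_probability(n_blocks, n_corrupted, n_sampled):
    """Probability that a uniform random sample of n_sampled blocks
    (drawn without replacement) contains at least one corrupted block.

    Assumption: the challenge picks blocks uniformly at random; this is
    an illustrative model, not the MF-PDP scheme itself.
    """
    # Probability that every sampled block is intact (hypergeometric miss).
    miss = comb(n_blocks - n_corrupted, n_sampled) / comb(n_blocks, n_sampled)
    return 1.0 - miss


if __name__ == "__main__":
    # Hypothetical numbers: 10,000 blocks, 1% corrupted, 460 blocks challenged.
    print(detection_probability(10_000, 100, 460))
```

Under this model the number of blocks that must be challenged depends on the fraction of corrupted blocks rather than on the total file size, which is what makes sampling attractive for large files.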
Quantum Algorithm for Unsupervised Anomaly Detection
Anomaly detection, an important branch of machine learning, plays a critical
role in fraud detection, health care, intrusion detection, military
surveillance, etc. As one of the most commonly used unsupervised anomaly
detection algorithms, the Local Outlier Factor algorithm (LOF algorithm) has
been extensively studied. This algorithm contains three steps, i.e.,
determining the k-distance neighborhood for each data point x, computing the
local reachability density of x, and calculating the local outlier factor of x
to judge whether x is abnormal. The LOF algorithm is computationally expensive
when processing big data sets. Here we present a quantum LOF algorithm
consisting of three parts corresponding to the classical algorithm.
Specifically, the k-distance neighborhood of x is determined by amplitude
estimation and minimum search; the local reachability density of each data
point is calculated in parallel based on the quantum multiply-adder; the local
outlier factor of each data point is obtained in parallel using amplitude
estimation. It is shown that our quantum algorithm achieves exponential speedup
on the dimension of the data points and polynomial speedup on the number of
data points compared to its classical counterpart. This work demonstrates the
advantage of quantum computing in unsupervised anomaly detection.
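For reference, the three classical LOF steps named in the abstract can be sketched as follows. This is a minimal brute-force version of the classical algorithm only, not the quantum algorithm; it assumes Euclidean distance and that no point has more than k exact duplicates (which would make a reachability sum zero). The function name `lof_scores` is a hypothetical choice.

```python
import math


def lof_scores(points, k):
    """Classical Local Outlier Factor, brute force, Euclidean distance."""
    n = len(points)
    dist = [[math.dist(p, q) for q in points] for p in points]

    # Step 1: k-distance and k-distance neighborhood of each point.
    k_dist, neighbors = [], []
    for i in range(n):
        others = sorted((dist[i][j], j) for j in range(n) if j != i)
        kd = others[k - 1][0]
        k_dist.append(kd)
        # Ties at the k-distance are included in the neighborhood.
        neighbors.append([j for d, j in others if d <= kd])

    # Step 2: local reachability density (inverse mean reachability distance).
    lrd = []
    for i in range(n):
        reach = [max(k_dist[j], dist[i][j]) for j in neighbors[i]]
        lrd.append(len(reach) / sum(reach))

    # Step 3: LOF = mean ratio of the neighbors' density to the point's own.
    return [
        sum(lrd[j] for j in neighbors[i]) / (len(neighbors[i]) * lrd[i])
        for i in range(n)
    ]


if __name__ == "__main__":
    # Four clustered points and one distant point.
    data = [(0, 0), (0, 1), (1, 0), (1, 1), (10, 10)]
    print(lof_scores(data, k=2))
```

Scores near 1 indicate a point whose density matches its neighbors; scores well above 1 flag candidates for anomalies. The pairwise distance table costs time quadratic in the number of points and linear in their dimension, which is the cost the abstract describes as expensive for big data sets.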
Towards Energy-Efficient and Secure Computing Systems
Countermeasures against diverse security threats typically incur noticeable hardware cost and power overhead, which may prevent those countermeasures from being applied in energy-efficient computing systems. This work summarizes energy-efficiency techniques that have been applied in security primitives or mechanisms to ensure computing systems’ resilience against various security threats on hardware. This work also uses examples to discuss practical methods for securing the hardware of computing systems while achieving energy efficiency.
Transient and permanent error management for networks-on-chip
Thesis (Ph. D.)--University of Rochester. Dept. of Electrical and Computer Engineering, 2011.
Reliability has become one of the most important metrics for on-chip
communications infrastructures in nanoscale technologies. Reduced supply voltages
and high clock frequency exacerbate the impact of noise sources such as particle
strikes and crosstalk, which can cause transient errors in transmitted data.
Manufacturing defects and aging issues can cause permanent errors in the
communication links. The modularity of the Networks-on-Chip (NoCs) approach
facilitates the exploration of error control techniques for on-chip interconnects and
many-core systems. Unfortunately, error control is not free. Worst-case error
management methods are simple but waste energy and bandwidth in favorable noise
conditions. Consequently, cost-effective techniques for improving link error
resilience are needed. In this work, we propose configurable error control methods to
tackle variable transient errors and exploit existing transient error control redundancy
for permanent error management, achieving high reliability and low average energy
consumption with minor area overhead.
To adapt to the variable transient error rates, a configurable error control
coding (ECC) scheme is proposed for datalink-layer transient error management. The
proposed method can adjust both error detection and error correction capability at
runtime by varying the number of redundant wires for parity check bits. The obtained
error resilience makes the proposed method suitable for a range of link error rates.
Configuring the number of redundant wires to match the noise conditions reduces the
average energy consumption in the ECC codec and interconnect link. A hardware-efficient
implementation of the configurable ECC is also presented.
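The idea of trading redundant wires for error resilience can be illustrated with a toy two-mode codec. This is only a sketch under stated assumptions: the abstract does not name the codes used, so the choice of a single parity bit (detection only) versus a Hamming(7,4)-style code (single-error correction), the 4-bit data width, and all function names are hypothetical.

```python
def encode(data, mode):
    """Append check bits to 4 data bits.

    mode "detect":  1 redundant wire  (even parity, detects 1 error).
    mode "correct": 3 redundant wires (Hamming(7,4) checks, corrects 1 error).
    """
    d1, d2, d3, d4 = data
    if mode == "detect":
        return list(data) + [d1 ^ d2 ^ d3 ^ d4]
    if mode == "correct":
        return list(data) + [d1 ^ d2 ^ d4, d1 ^ d3 ^ d4, d2 ^ d3 ^ d4]
    raise ValueError(f"unknown mode: {mode}")


def decode(word, mode):
    """Return (data_bits, status); status is "ok", "corrected", or "error"."""
    d = list(word[:4])
    if mode == "detect":
        parity_ok = (d[0] ^ d[1] ^ d[2] ^ d[3]) == word[4]
        return d, ("ok" if parity_ok else "error")
    if mode == "correct":
        p1, p2, p3 = word[4:7]
        syndrome = (
            p1 ^ d[0] ^ d[1] ^ d[3],
            p2 ^ d[0] ^ d[2] ^ d[3],
            p3 ^ d[1] ^ d[2] ^ d[3],
        )
        if syndrome == (0, 0, 0):
            return d, "ok"
        # Syndromes with two or three ones point at a data bit; a syndrome
        # with a single one means a check bit itself was hit.
        data_bit = {(1, 1, 0): 0, (1, 0, 1): 1, (0, 1, 1): 2, (1, 1, 1): 3}
        if syndrome in data_bit:
            d[data_bit[syndrome]] ^= 1
        return d, "corrected"
    raise ValueError(f"unknown mode: {mode}")


if __name__ == "__main__":
    word = encode([1, 0, 1, 1], "correct")
    word[2] ^= 1  # inject a single transient error on one wire
    print(decode(word, "correct"))
```

In low-noise conditions the "detect" mode drives one extra wire instead of three, which is the kind of saving the abstract attributes to matching the number of redundant wires to the noise conditions.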
We integrate the error control techniques in the datalink and physical layers to
co-manage transient and permanent errors. Infrequently used redundant wires for the
configurable ECC are utilized as spare wires to replace permanently unusable links.
To maintain the transient and permanent error co-management capability as noise
conditions change, we propose a packet re-organization algorithm combined with
a shortened error control coding method. This method reduces the need for energy-consuming
fault-tolerant routing, minimizing latency and energy overhead induced by
error control. This co-management method is suitable for NoCs operating in variable
noise conditions with a small number of permanently unusable wires.
To further improve energy efficiency, the adaptation on ECC is extended to
the network layer. We employ end-to-end error control in the network layer in low
noise conditions and enhance the error control capability in high noise conditions by
adding hop-to-hop error control in the datalink layer. A protocol that boosts or
reduces error control strength is presented to support runtime seamless ECC mode
switching. Simply combining end-to-end error control with hop-to-hop error control
significantly increases energy consumption. To address this issue, we apply the concept of product codes to the dual-layer error control; the hop-to-hop error control
is designed to be compatible with one dimension of the product code. Consequently,
the dual-layer cooperative error control can switch error control modes without
interrupting normal NoC operation, achieving high reliability and energy efficiency in
a wide range of link error rates.
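The product-code idea can be illustrated with the simplest possible component codes. This is a hedged sketch, not the thesis design: it assumes single-parity checks in both dimensions, treats each row as one flit checked hop-to-hop, and leaves the column check to the destination; all names are hypothetical.

```python
def encode_product(rows):
    """Build a 2D parity product code from equal-length rows of data bits."""
    # Row dimension: one parity bit per row (the hop-to-hop check).
    with_row_parity = [list(r) + [sum(r) % 2] for r in rows]
    # Column dimension: one extra row of parity bits (the end-to-end check).
    col_parity = [sum(col) % 2 for col in zip(*with_row_parity)]
    return with_row_parity + [col_parity]


def hop_check(flit):
    """Per-hop check: a row (flit) is accepted if its parity is even."""
    return sum(flit) % 2 == 0


def end_to_end_correct(block):
    """Locate and flip a single bad bit using both dimensions."""
    bad_rows = [i for i, r in enumerate(block) if sum(r) % 2]
    bad_cols = [j for j, c in enumerate(zip(*block)) if sum(c) % 2]
    fixed = [list(r) for r in block]
    if len(bad_rows) == 1 and len(bad_cols) == 1:
        fixed[bad_rows[0]][bad_cols[0]] ^= 1
    return fixed


def extract_data(block):
    """Strip the parity row and the parity column."""
    return [r[:-1] for r in block[:-1]]


if __name__ == "__main__":
    data = [[1, 0, 1], [0, 1, 1]]
    block = encode_product(data)
    block[1][2] ^= 1  # single transient error in the second flit
    print(hop_check(block[1]))                      # the row check flags the flit
    print(extract_data(end_to_end_correct(block)))  # the destination repairs it
```

Because the hop-to-hop check reuses the row parity that is already part of the end-to-end codeword, enabling it adds no new check bits, which is one way to read the abstract's point about compatibility with one dimension of the product code.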
To evaluate performance and energy consumption of different error control
methods on a large NoC, we propose a flexible parallel NoC simulator. Plug-and-play
error control coding (ECC) insertion and some typical error control codecs have
been implemented in the simulator. The flexible fault injection environment provided
by our simulator assists error control exploration for specific purposes. In addition,
we use C with the Message Passing Interface (MPI) to schedule parallel
simulation on a multiprocessor server, addressing the prohibitive simulation time and
system resource challenges caused by the large number of communicating nodes and
extensive number of simulation variables.
- …