27 research outputs found

    GAIA: Delving into Gradient-based Attribution Abnormality for Out-of-distribution Detection

    Detecting out-of-distribution (OOD) examples is crucial to guarantee the reliability and safety of deep neural networks in real-world settings. In this paper, we offer an innovative perspective on quantifying the disparities between in-distribution (ID) and OOD data: analyzing the uncertainty that arises when models attempt to explain their predictive decisions. This perspective is motivated by our observation that gradient-based attribution methods encounter challenges in assigning feature importance to OOD data, thereby yielding divergent explanation patterns. Consequently, we investigate how attribution gradients lead to uncertain explanation outcomes and introduce two forms of abnormalities for OOD detection: the zero-deflation abnormality and the channel-wise average abnormality. We then propose GAIA, a simple and effective approach that incorporates Gradient Abnormality Inspection and Aggregation. The effectiveness of GAIA is validated on both commonly utilized (CIFAR) and large-scale (ImageNet-1k) benchmarks. Specifically, GAIA reduces the average FPR95 by 23.10% on CIFAR10 and by 45.41% on CIFAR100 compared to advanced post-hoc methods. Comment: Accepted by NeurIPS 2023
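
    The two abnormalities are only named in this abstract. As a rough illustration of the zero-deflation idea, a score can count the fraction of zero-valued attribution gradients per sample; the sketch below uses plain input gradients and an arbitrary threshold, which are our assumptions rather than the authors' implementation.

```python
import torch

def zero_deflation_score(model, x):
    """Fraction of (near-)zero attribution gradients per sample.

    Illustrative only: GAIA's actual abnormality is defined in the
    paper; raw input gradients and this threshold are assumptions
    made for the sketch.
    """
    x = x.clone().detach().requires_grad_(True)
    logits = model(x)                        # (N, num_classes)
    top = logits.max(dim=1).values.sum()     # predicted-class scores
    (grads,) = torch.autograd.grad(top, x)
    # Zero-deflation: OOD inputs tend to yield many zero gradients.
    return (grads.abs() < 1e-12).float().flatten(1).mean(dim=1)
```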

    Shoggoth: Towards Efficient Edge-Cloud Collaborative Real-Time Video Inference via Adaptive Online Learning

    This paper proposes Shoggoth, an efficient edge-cloud collaborative architecture for boosting inference performance on real-time video of changing scenes. Shoggoth uses online knowledge distillation to improve the accuracy of models suffering from data drift and offloads the labeling process to the cloud, alleviating the constrained resources of edge devices. At the edge, we design adaptive training using small batches to adapt models under limited computing power, and adaptive sampling of training frames for robustness and reduced bandwidth. Evaluations on a realistic dataset show a 15%-20% model accuracy improvement compared to the edge-only strategy and lower network costs than the cloud-only strategy. Comment: Accepted by the 60th ACM/IEEE Design Automation Conference (DAC 2023)
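
    The edge-cloud split can be pictured with a small sketch: the cloud supplies soft labels from a large teacher model, and the edge fine-tunes its student in small batches. Everything here, including the `cloud_soft_labels` stub, is an illustrative assumption, not the Shoggoth code.

```python
import torch
import torch.nn.functional as F

def adapt_on_edge(student, frames, cloud_soft_labels, lr=1e-4, batch=8):
    """Edge-side adaptation loop in the spirit of Shoggoth (our
    sketch, not the paper's code): the cloud labels sampled frames
    with a large teacher, and the edge student is fine-tuned with
    small batches to track data drift under limited compute.

    cloud_soft_labels: hypothetical RPC stub returning the teacher's
    softmax outputs for a batch of frames.
    """
    opt = torch.optim.SGD(student.parameters(), lr=lr)
    for i in range(0, len(frames), batch):
        x = torch.stack(frames[i:i + batch])
        with torch.no_grad():
            teacher_probs = cloud_soft_labels(x)   # offloaded labeling
        # Distillation loss: match the teacher's soft labels.
        loss = F.kl_div(F.log_softmax(student(x), dim=1),
                        teacher_probs, reduction="batchmean")
        opt.zero_grad()
        loss.backward()
        opt.step()
```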

    EdgeMA: Model Adaptation System for Real-Time Video Analytics on Edge Devices

    Real-time video analytics on edge devices for changing scenes remains a difficult task. As edge devices are usually resource-constrained, edge deep neural networks (DNNs) have fewer weights and shallower architectures than general DNNs. As a result, they perform well only in limited scenarios and are sensitive to data drift. In this paper, we introduce EdgeMA, a practical and efficient video analytics system designed to adapt models to shifts in real-world video streams over time, addressing the data drift problem. EdgeMA extracts statistical texture features based on the gray-level co-occurrence matrix and uses a Random Forest classifier to detect domain shift. Moreover, we incorporate a model adaptation method based on importance weighting, specifically designed to update models to cope with label distribution shift. Through rigorous evaluation of EdgeMA on a real-world dataset, our results illustrate that EdgeMA significantly improves inference accuracy. Comment: Accepted by the 30th International Conference on Neural Information Processing (ICONIP 2023)
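
    The shift detector pairs GLCM texture features with a Random Forest; a minimal sketch with scikit-image and scikit-learn might look as follows, with the distances, angles, and feature set chosen by us rather than taken from the paper.

```python
import numpy as np
from skimage.feature import graycomatrix, graycoprops
from sklearn.ensemble import RandomForestClassifier

def glcm_features(gray_frame):
    """Texture features from a gray-level co-occurrence matrix.

    gray_frame: a uint8 grayscale frame. The distances, angles, and
    property set are our choices, not the paper's configuration.
    """
    glcm = graycomatrix(gray_frame, distances=[1],
                        angles=[0, np.pi / 2], levels=256,
                        symmetric=True, normed=True)
    props = ("contrast", "homogeneity", "energy", "correlation")
    return np.hstack([graycoprops(glcm, p).ravel() for p in props])

# A Random Forest trained on such features classifies the current
# domain; a change in its prediction over recent frames flags drift.
shift_detector = RandomForestClassifier(n_estimators=100)
# shift_detector.fit(np.stack([glcm_features(f) for f in frames]), labels)
```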

    BMCloud: Minimizing Repair Bandwidth and Maintenance Cost in Cloud Storage

    To protect data in cloud storage, fault tolerance and efficient recovery become very important. Recent studies have developed numerous solutions based on erasure-code techniques to solve this problem using functional repair. However, there are two limitations to address. The first is consistency, since the Encoding Matrix (EM) differs among clouds. The other is repair bandwidth. We address these two problems from both theoretical and practical perspectives. We developed BMCloud, a new cloud storage system that aims to reduce both repair bandwidth and maintenance cost. The system employs both functional repair and exact repair and inherits the advantages of both. We propose the JUDGE_STYLE algorithm, which determines whether the system should adopt exact repair or functional repair. We implemented a networked storage system prototype and demonstrated our findings. Compared with existing solutions, BMCloud can be used in engineering practice to save repair bandwidth and significantly reduce maintenance cost.
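
    The abstract does not spell out JUDGE_STYLE's criteria, so the following is only a toy stand-in that shows the shape of the decision between exact and functional repair, tied to the two limitations named above.

```python
def judge_style(exact_repair_bytes, functional_repair_bytes,
                em_consistent):
    """Toy stand-in for BMCloud's JUDGE_STYLE decision (the real
    criteria are not given in this abstract). Exact repair reproduces
    the lost block byte-for-byte, keeping encoding matrices identical
    across clouds; functional repair only restores redundancy.
    """
    # If encoding matrices already differ among clouds, a functional
    # repair would deepen the inconsistency, so repair exactly.
    if not em_consistent:
        return "exact"
    # Otherwise pick whichever mode moves fewer bytes over the network.
    return ("exact" if exact_repair_bytes <= functional_repair_bytes
            else "functional")
```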

    Metadata Management For Distributed Multimedia Storage System

    As a result of the rapid growth of multimedia, there has been a huge increase in the amount of information generated and shared by people all over the world, and demand for large-scale multimedia storage systems is growing rapidly. This paper describes the design and implementation of a two-level metadata server for the Distributed Multimedia Storage System (DMSS). The DMSS separates the logical view of the stored data from the physical view. The logical view is managed by a metadata server called the Global Metadata Server (GMS), and the physical view is managed by a component of each storage server called the Local Metadata Server (LMS). With an LMS, each storage server can maintain its own storage resources, metadata, and data independently, and can offer storage service independently. The DMSS allows application servers to access the storage servers directly and in parallel, providing very high performance. © 2008 IEEE
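
    The two-level lookup can be sketched in a few lines: the GMS answers the logical question of where data lives, and each server's LMS answers the physical question of how it is stored there. Names and fields below are illustrative, not from the paper.

```python
class GlobalMetadataServer:
    """Toy model of the DMSS metadata path: the GMS owns the logical
    view, each storage server's LMS owns its physical view."""
    def __init__(self):
        self.placement = {}          # logical path -> storage servers

    def locate(self, path):
        return self.placement[path]

class LocalMetadataServer:
    def __init__(self):
        self.extents = {}            # logical path -> local extents

    def resolve(self, path):
        return self.extents[path]

def read_file(path, gms, lms_of):
    # 1) One logical lookup at the GMS: which servers hold the data.
    servers = gms.locate(path)
    # 2) Each server's LMS resolves its own physical layout, so an
    #    application server can fetch from all of them in parallel.
    return [lms_of(server).resolve(path) for server in servers]
```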

    GreenCHT: A Power-Proportional Replication Scheme for Consistent Hashing Based Key Value Storage Systems

    Distributed key value storage systems are widely used by many popular networking corporations. Nevertheless, server power consumption has become a growing concern for key value storage system designers, since the power consumption of servers contributes substantially to a data center's power bills. In this paper, we propose GreenCHT, a power-proportional replication scheme for consistent hashing based key value storage systems. GreenCHT consists of a power-aware replication strategy (a multi-tier replication strategy) and a centralized power control service (a predictive power-mode scheduler). The multi-tier replication provides power proportionality and ensures data availability, reliability, consistency, and fault tolerance for the whole system. The predictive power-mode scheduler predicts workloads and exploits load fluctuation to schedule which nodes are powered up and powered down. GreenCHT is implemented on top of Sheepdog, a distributed key value system that uses consistent hashing as its underlying distributed hash table. By replaying twelve real workload traces collected from Microsoft, the evaluation shows that GreenCHT provides significant power savings while maintaining acceptable performance; we observed that GreenCHT can reduce power consumption by 35%-61%.
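
    One way to picture multi-tier replication on a consistent-hash ring, under our reading of the abstract: replica i of each key lands in tier i, so whole lower tiers can be powered down during slack periods while tier 0 keeps every object reachable. The layout below is a sketch of that idea, not GreenCHT's exact scheme.

```python
import hashlib
from bisect import bisect_right

def ring_position(key):
    """Position of a key on the consistent-hash ring."""
    return int(hashlib.md5(key.encode()).hexdigest(), 16)

def place_replicas(key, tiers):
    """Multi-tier replica placement (illustrative): replica i goes to
    tier i, so tiers beyond tier 0 are candidates for power-down.

    tiers: one ring per replica, each a sorted list of
    (position, node_id) pairs.
    """
    pos = ring_position(key)
    placement = []
    for tier, ring in enumerate(tiers):
        # First node clockwise from the key's position on this ring.
        idx = bisect_right([p for p, _ in ring], pos) % len(ring)
        placement.append((tier, ring[idx][1]))
    return placement
```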

    The Research And Design For High Availability Object Storage System

    With the growing scale of computer storage systems, the likelihood of multi-disk failures has increased dramatically. Based on a thorough analysis of the fault-tolerance capability of various existing storage systems, we propose a new hierarchical, highly reliable, multi-disk fault-tolerant storage system architecture: the High Availability Object Storage System (HAOSS). In the HAOSS, each object has an attribute field for its reliability level, which can be set by the user according to the importance of the data. A higher reliability level corresponds to better data survivability in case of multi-device failure. The HAOSS is composed of two layers. The upper layer achieves high availability by storing multiple replicas of each storage object on a set of storage devices; the individual replicas can service I/O requests in parallel so as to obtain high performance. The lower layer deploys RAID5, RAID6, or RAID-Blaum coding schemes to tolerate multi-disk failures. In addition, the disk utilization of RAID-Blaum is higher than that of multiple replicas, and it can be further improved by growing the RAID group size. These advantages come at the price of more complicated fault-tolerant coding schemes, which involve a large amount of encoding computation and adversely impact I/O performance, especially write performance. Results from both our internal experiments and third-party independent tests show that HAOSS servers tolerate multi-disk failures better than existing similar products. In a 1000 Mb/s Ethernet interconnection environment, with a request block size of 1024 KB, the sequential read performance of a HAOSS server reaches 104 MB/s, which is very close to the theoretical maximum effective bandwidth of Ethernet networks. The HAOSS offers a complete storage solution for high-availability applications without the compromises that today's storage systems require in either performance or fault tolerance. © 2009 SPIE
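
    As a toy illustration of the per-object reliability attribute, a lookup like the one below could map each level to an upper-layer replica count and a lower-layer coding scheme; the actual mapping is not given in the abstract, so these rows are placeholders.

```python
def haoss_layout(reliability_level):
    """Toy mapping from HAOSS's per-object reliability attribute to
    an (upper-layer replica count, lower-layer code) pair. These rows
    are placeholders that only mirror the described two-layer design.
    """
    table = {
        1: (2, "RAID5"),        # lower layer tolerates 1 disk failure
        2: (2, "RAID6"),        # lower layer tolerates 2 disk failures
        3: (3, "RAID-Blaum"),   # multi-disk fault tolerance
    }
    return table[reliability_level]
```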

    S2-RAID: A New RAID Architecture for Fast Data Recovery

    As disk volume grows rapidly, with terabyte disks becoming the norm, RAID reconstruction in case of a failure takes prohibitively long. This paper presents a new RAID architecture, S2-RAID, allowing the disk array to reconstruct very quickly in case of a disk failure. The idea is to form skewed sub-RAIDs (S2-RAID) in the RAID structure so that reconstruction can be done in parallel, dramatically speeding up data reconstruction and hence minimizing the chance of data loss. To make such parallel reconstruction conflict-free, each sub-RAID is formed by selecting one logical partition from each disk group, with the group size being a prime number. We have implemented a prototype S2-RAID system in the Linux operating system for the purpose of evaluating its performance potential. SPC I/O traces and standard benchmarks have been used to measure the performance of S2-RAID compared to the existing baseline software RAID, MD. Experimental results show that our new S2-RAID speeds up data reconstruction by a factor of 3 to 6 compared to traditional RAID. At the same time, S2-RAID shows similar or better production performance than the baseline RAID while online RAID reconstruction is in progress.
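
    The skewed layout can be sketched as a mapping from each sub-RAID to one partition per disk group; the indexing below is our reading of the scheme, not the paper's exact equations, but it shows why a prime partition count keeps parallel reconstruction conflict-free.

```python
def skewed_subraid_layout(groups, partitions):
    """Sketch of the skewed sub-RAID idea (indexing is our guess):
    each disk is split into `partitions` logical partitions, disks
    form `groups` groups, and sub-RAID i takes one partition from
    each group at a per-group skew. With `partitions` prime, the
    skews of different sub-RAIDs never collide within a group, so
    several sub-RAIDs can rebuild in parallel after a disk failure
    without contending for the same surviving disk.
    """
    layout = {}
    for i in range(partitions):              # one sub-RAID per row
        layout[i] = [(g, (i * (g + 1)) % partitions)
                     for g in range(groups)]
    return layout

# Example: 3 disk groups, 5 (prime) partitions per disk.
# skewed_subraid_layout(3, 5)[1] -> [(0, 1), (1, 2), (2, 3)]
```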

    A New High-Performance, Energy-Efficient Replication Storage System With Reliability Guarantee

    In modern replication storage systems, where data carries two or more copies, a primary group of disks is always up to service incoming requests while other disks are often spun down to sleep states to save energy during slack periods. However, since new writes cannot be immediately synchronized onto all disks, system reliability is degraded. This paper develops PERAID, a new high-performance, energy-efficient replication storage system, which aims to improve both performance and energy efficiency without compromising reliability. It employs a parity software RAID as a virtual write-buffer disk at the front end to absorb new writes. Since the extra parity redundancy supplies two or more copies, PERAID guarantees reliability comparable to that of a replication storage system. In addition, PERAID offers better write performance than the replication system by avoiding the classical small-write problem of traditional parity RAID: it buffers many small random writes into a few large writes and writes them to storage in parallel. By evaluating our PERAID prototype using two benchmarks and two real-life traces, we found that PERAID significantly improves write performance and saves more energy than existing solutions such as GRAID and eRAID. © 2012 IEEE
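
    The front-end buffering idea can be sketched as follows: small random writes accumulate until they fill a stripe and are then flushed as one full-stripe write, sidestepping the read-modify-write penalty of small parity updates. The structure is our assumption, not the paper's code.

```python
class ParityWriteBuffer:
    """Toy model of PERAID's front-end buffering: small random writes
    accumulate and are flushed as full stripes, so the parity buffer
    RAID never pays the small-write read-modify-write penalty."""
    def __init__(self, stripe_blocks, flush_stripe):
        self.stripe_blocks = stripe_blocks   # data blocks per stripe
        self.flush_stripe = flush_stripe     # callback: write a stripe
        self.pending = []

    def write(self, block):
        self.pending.append(block)
        if len(self.pending) == self.stripe_blocks:
            # Full-stripe write: parity is computed once over the
            # whole stripe and all member disks are written in parallel.
            self.flush_stripe(self.pending)
            self.pending = []
```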

    S2-RAID: Parallel RAID architecture for fast data recovery

    As disk volume grows rapidly, with terabyte disks becoming the norm, the RAID reconstruction process in case of a failure takes prohibitively long. This paper presents a new RAID architecture, S2-RAID, allowing the disk array to reconstruct very quickly in case of a disk failure. The idea is to form skewed sub-arrays in the RAID structure so that reconstruction can be done in parallel, dramatically speeding up the data reconstruction process and hence minimizing the chance of data loss. We analyze the data recovery capability of this architecture and show its good scalability. A prototype S2-RAID system has been built and implemented in the Linux operating system for the purpose of evaluating its performance potential. Real-world I/O traces, including SPC and Microsoft traces and a collection from a production environment, have been used to measure the performance of S2-RAID against the existing baselines software RAID5, Parity Declustering, and RAID50. Experimental results show that our new S2-RAID speeds up data reconstruction by a factor of 2 to 4 compared to traditional RAID. Meanwhile, S2-RAID keeps production performance comparable to that of the baseline RAID layouts while online RAID reconstruction is in progress. © 1990-2012 IEEE