1,643 research outputs found

    HEC: Collaborative Research: SAM^2 Toolkit: Scalable and Adaptive Metadata Management for High-End Computing

    Get PDF
    The increasing demand for exabyte-scale storage capacity by high-end computing applications requires a higher level of scalability and dependability than current file and storage systems provide. The proposal addresses file systems research for metadata management in scalable cluster-based parallel and distributed file storage systems in the HEC environment. It aims to develop a scalable and adaptive metadata management (SAM2) toolkit that extends the features of, and fully leverages the peak performance promised by, state-of-the-art cluster-based parallel and distributed file storage systems used by the high-performance computing community. There is a large body of research on scaling data movement and management; however, the need to scale the attributes of cluster-based file systems and I/O, that is, the metadata, has been underestimated. Understanding the characteristics of metadata traffic and applying appropriate load-balancing, caching, prefetching, and grouping mechanisms to metadata management will lead to high scalability. It is anticipated that by plugging the scalable and adaptive metadata management components into state-of-the-art cluster-based parallel and distributed file storage systems, one could increase the performance of applications and file systems and help translate the high peak performance promised by such systems into real application performance improvements. The project involves the following components:
    1. Develop multi-variable forecasting models to analyze and predict file metadata access patterns.
    2. Develop scalable and adaptive file name mapping schemes using the duplicative Bloom filter array technique to enforce load balance and increase scalability (see the sketch after this list).
    3. Develop decentralized, locality-aware metadata grouping schemes to facilitate bulk metadata operations such as prefetching.
    4. Develop an adaptive cache coherence protocol using a distributed shared object model for client-side and server-side metadata caching.
    5. Prototype the SAM2 components in the state-of-the-art parallel virtual file system PVFS2 and a distributed storage data caching system, set up an experimental framework for a DOE CMS Tier 2 site at the University of Nebraska-Lincoln, and conduct benchmark, evaluation, and validation studies.
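
    The file-name mapping in item 2 is the most self-contained of these components. Below is a minimal sketch of the general idea of probing an array of per-server Bloom filters to find the server that holds a file's metadata; the class names, hash choices, and parameters are illustrative assumptions, not the SAM2 design.

        import hashlib

        class BloomFilter:
            """Simple Bloom filter backed by a Python integer used as a bit set."""

            def __init__(self, num_bits=1 << 20, num_hashes=4):
                self.num_bits = num_bits
                self.num_hashes = num_hashes
                self.bits = 0

            def _positions(self, key):
                for i in range(self.num_hashes):
                    digest = hashlib.sha256(f"{i}:{key}".encode()).digest()
                    yield int.from_bytes(digest[:8], "big") % self.num_bits

            def add(self, key):
                for pos in self._positions(key):
                    self.bits |= 1 << pos

            def __contains__(self, key):
                return all((self.bits >> pos) & 1 for pos in self._positions(key))

        class MetadataMapper:
            """One Bloom filter per metadata server; clients probe every filter
            to locate the server responsible for a file's metadata."""

            def __init__(self, num_servers):
                self.filters = [BloomFilter() for _ in range(num_servers)]

            def place(self, path):
                # Deterministic placement: hash of the path modulo the server count.
                server = int(hashlib.md5(path.encode()).hexdigest(), 16) % len(self.filters)
                self.filters[server].add(path)
                return server

            def lookup(self, path):
                # Bloom filters can return false positives, so report all candidates.
                return [i for i, f in enumerate(self.filters) if path in f]

        mapper = MetadataMapper(num_servers=4)
        home = mapper.place("/home/alice/results.h5")
        assert home in mapper.lookup("/home/alice/results.h5")

    A negative probe rules a server out with certainty, while a positive probe may be a false positive, which is why the lookup returns every candidate server rather than a single one.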

    Energy efficient and latency aware adaptive compression in wireless sensor networks

    Get PDF
    Wireless sensor networks are composed of a few to several thousand sensors deployed over an area or on specific objects to sense data and report it back to a sink, either directly or through a series of hops across other sensor nodes. Applications of wireless sensor networks include environment monitoring, wildlife tracking, security, structural health monitoring, troop tracking, and many others. The sensors communicate wirelessly, are typically very small, and are powered by batteries, so wireless sensor networks are often constrained in bandwidth, processor speed, and power. Many wireless sensor network applications also have a very low tolerance for latency and need to transmit data in real time. Data compression is a useful tool for minimizing the bandwidth and power required to transmit data from the sensor nodes to the sink; however, compression algorithms often add significant latency or require a great deal of additional processing. The following papers define and analyze multiple approaches for achieving effective compression while keeping latency and power consumption far below what would be required to process and transmit the data uncompressed. The algorithms target many different types of sensor applications, from lossless compression on a single sensor, to error-tolerant, collaborative compression across an entire network of sensors, to compression of XML data on sensors. Extensive analysis over many different real-life data sets and comparison with several existing compression methods show a significant contribution to efficient wireless sensor communication --Abstract, page iv
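
    As a rough illustration of the kind of cheap, low-latency lossless scheme a battery-powered node can afford, the sketch below delta-encodes successive integer readings and packs each delta with a small variable-length code; it is a generic example under assumed names, not one of the algorithms analyzed in the papers.

        def zigzag(n):
            """Map signed deltas to unsigned ints so small magnitudes stay small."""
            return (n << 1) ^ (n >> 31)

        def unzigzag(z):
            return (z >> 1) ^ -(z & 1)

        def compress(samples):
            """Delta-encode integer readings, then pack each delta as a
            variable-length quantity (7 bits per byte, high bit = continuation)."""
            out, prev = bytearray(), 0
            for s in samples:
                z = zigzag(s - prev)
                prev = s
                while True:
                    byte = z & 0x7F
                    z >>= 7
                    out.append(byte | (0x80 if z else 0))
                    if not z:
                        break
            return bytes(out)

        def decompress(data):
            samples, prev, z, shift = [], 0, 0, 0
            for byte in data:
                z |= (byte & 0x7F) << shift
                shift += 7
                if not byte & 0x80:
                    prev += unzigzag(z)
                    samples.append(prev)
                    z, shift = 0, 0
            return samples

        readings = [512, 514, 513, 513, 520, 519]
        assert decompress(compress(readings)) == readings

    Because consecutive sensor readings usually differ only slightly, most deltas fit in a single byte, and both encoding and decoding touch each sample only once, which keeps latency and processing cost low.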

    FedHAP: Federated Hashing with Global Prototypes for Cross-silo Retrieval

    Full text link
    Deep hashing has been widely applied in large-scale data retrieval due to its superior retrieval efficiency and low storage cost. However, data are often scattered across data silos with privacy concerns, so centralized data storage and retrieval is not always possible. Leveraging the concept of federated learning (FL) to perform deep hashing is a recent research trend. However, existing frameworks mostly rely on aggregating local deep hashing models that are trained by performing similarity learning on local, skewed data only, so they cannot work well for non-IID clients in a real federated environment. To overcome these challenges, we propose a novel federated hashing framework that enables participating clients to jointly train a shared deep hashing model by leveraging prototypical hash codes for each class. Globally, transmitting global prototypes with only one prototypical hash code per class minimizes communication cost and privacy risk. Locally, the use of the global prototypes is maximized by jointly training a discriminator network and the local hashing network. Extensive experiments on benchmark datasets demonstrate that our method significantly improves the performance of the deep hashing model in federated environments with non-IID data distributions.
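
    The abstract's central object, a single global prototypical hash code per class aggregated from the clients, can be sketched roughly as follows; the mean-and-sign aggregation and every name below are illustrative assumptions rather than the FedHAP training procedure.

        import numpy as np

        def local_prototypes(codes, labels, num_classes):
            """Average a client's continuous hash outputs (values in [-1, 1]) per class."""
            protos = np.zeros((num_classes, codes.shape[1]))
            for c in range(num_classes):
                mask = labels == c
                if mask.any():
                    protos[c] = codes[mask].mean(axis=0)
            return protos

        def aggregate_global_prototypes(client_protos):
            """Server side: average client prototypes per class and binarize to +/-1,
            so only one prototypical hash code per class is ever transmitted."""
            stacked = np.stack(client_protos)        # (num_clients, num_classes, bits)
            return np.sign(stacked.mean(axis=0) + 1e-12)

        # Two clients with skewed (non-IID) label distributions, 16-bit codes, 3 classes.
        rng = np.random.default_rng(0)
        c1 = local_prototypes(np.tanh(rng.normal(size=(40, 16))), rng.integers(0, 2, 40), 3)
        c2 = local_prototypes(np.tanh(rng.normal(size=(40, 16))), rng.integers(1, 3, 40), 3)
        global_protos = aggregate_global_prototypes([c1, c2])   # shape (3, 16), entries in {-1, +1}

    Only the small (num_classes, bits) prototype matrices ever leave a client, which is what keeps the communication cost and the exposure of raw local data low.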

    A Survey on Design and Implementation of Protected Searchable Data in the Cloud

    Get PDF
    While cloud computing has exploded in popularity in recent years thanks to the potential efficiency and cost savings of outsourcing the storage and management of data and applications, a number of vulnerabilities that led to multiple attacks have deterred many potential users. As a result, experts in the field have argued that new mechanisms are needed to create trusted and secure cloud services; such mechanisms would dispel users' suspicion of cloud computing by providing the necessary security guarantees. Searchable encryption is among the most promising solutions, one that has the potential to help offer truly secure and privacy-preserving cloud services. We start this paper by surveying the most important searchable encryption schemes and their relevance to cloud computing. In light of this analysis, we demonstrate the inefficiencies of the existing schemes and extend our analysis by discussing certain confidentiality and privacy issues. Further, we examine how to integrate such a scheme with a popular cloud platform. Finally, based on the findings of our analysis, we have chosen an existing scheme and implemented it to review its practical maturity for deployment in real systems. The survey of the field, together with the analysis and the extensive experimental results, provides a comprehensive review of the theoretical and practical aspects of searchable encryption.
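
    For readers unfamiliar with the primitive, a toy symmetric searchable-encryption index might look like the sketch below: keywords are mapped to deterministic HMAC tokens so the server can answer queries without ever seeing the keywords, and the posting lists are stored encrypted. This is an illustration only, not the scheme the authors implemented, and the SHA-256 keystream merely stands in for a real cipher.

        import hashlib, hmac, os

        def _keystream(key, nonce, length):
            """Toy keystream from SHA-256; a placeholder for a real cipher such as AES-GCM."""
            out, counter = b"", 0
            while len(out) < length:
                out += hashlib.sha256(key + nonce + counter.to_bytes(4, "big")).digest()
                counter += 1
            return out[:length]

        def build_index(docs, token_key, enc_key):
            """Client-built index: deterministic keyword token -> encrypted list of doc ids."""
            postings = {}
            for doc_id, words in docs.items():
                for w in set(words):
                    token = hmac.new(token_key, w.encode(), hashlib.sha256).hexdigest()
                    postings.setdefault(token, []).append(doc_id)
            index = {}
            for token, ids in postings.items():
                nonce = os.urandom(12)
                plain = ",".join(ids).encode()
                stream = _keystream(enc_key, nonce, len(plain))
                index[token] = (nonce, bytes(a ^ b for a, b in zip(plain, stream)))
            return index

        def search(index, word, token_key, enc_key):
            """The server only ever receives the token; decryption happens at the client."""
            token = hmac.new(token_key, word.encode(), hashlib.sha256).hexdigest()
            if token not in index:
                return []
            nonce, cipher = index[token]
            stream = _keystream(enc_key, nonce, len(cipher))
            return bytes(a ^ b for a, b in zip(cipher, stream)).decode().split(",")

        tk, ek = os.urandom(32), os.urandom(32)
        idx = build_index({"doc1": ["cloud", "storage"], "doc2": ["cloud", "privacy"]}, tk, ek)
        assert sorted(search(idx, "cloud", tk, ek)) == ["doc1", "doc2"]

    Even this toy version exposes the trade-off such surveys discuss: deterministic tokens make search fast, but they also leak the search and access patterns to the server.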

    Temporal Lossy In-Situ Compression for Computational Fluid Dynamics Simulations

    Get PDF
    The CFD simulations of molten metal carried out within the collaborative research centre SFB920 produce very large volumes of data on the Taurus HPC cluster in Dresden, and handling these data severely slows down the scientific workflow. On the one hand, transferring the data to visualization systems is only possible at great cost in time; on the other hand, interactive analysis of time-dependent processes is nearly impossible because of the storage bottleneck. For these reasons, this dissertation develops so-called temporal in-situ compression for scientific data directly within CFD simulations. Using new quantization methods, the data are compressed to roughly 10% of their original size, while the decompressed data exhibit an error of at most 1%. In contrast to non-temporal compression, temporal compression encodes the difference between time steps in order to increase the compression ratio. Because the data volume is many times smaller, the costs of storage and transfer are reduced. Since compression, transfer, and decompression together run up to 4 times faster than the transfer of uncompressed data, the scientific workflow is accelerated.
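
    The core idea, quantizing the difference between consecutive time steps so that every reconstructed value stays within a fixed error bound, can be illustrated with the toy sketch below; the names and the absolute error bound are assumptions made for illustration and do not reproduce the dissertation's scheme.

        import numpy as np

        def compress_step(field, previous_reconstruction, max_error=0.01):
            """Quantize the temporal delta so each value is reproducible within max_error."""
            step = 2.0 * max_error                              # uniform quantizer bin width
            quantized = np.rint((field - previous_reconstruction) / step).astype(np.int32)
            reconstruction = previous_reconstruction + quantized * step
            return quantized, reconstruction                    # small ints entropy-code far better than floats

        def decompress_step(quantized, previous_reconstruction, max_error=0.01):
            return previous_reconstruction + quantized * (2.0 * max_error)

        # Two synthetic "time steps" of a scalar CFD field.
        rng = np.random.default_rng(1)
        t0 = rng.random((64, 64))
        t1 = t0 + 0.005 * rng.standard_normal((64, 64))          # small change between time steps

        recon = t0.copy()                                        # first step kept losslessly
        q, recon = compress_step(t1, recon, max_error=0.01)
        assert np.max(np.abs(decompress_step(q, t0, max_error=0.01) - t1)) <= 0.01 + 1e-12

    Because the differences between time steps are small, most quantized values cluster around zero and compress well, and both encoder and decoder only need the previous reconstructed step, which is what makes the approach suitable for in-situ use.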