28,601 research outputs found

    Towards batch-processing on cold storage devices

    Large amounts of data in storage systems are cold, i.e., Written Once and Read Occasionally (WORO). The rapid growth of massive-scale archival and historical data increases the demand for petabyte-scale cheap storage for such cold data. A Cold Storage Device (CSD) is a disk-based storage system that is designed to trade off performance for cost and power efficiency. Inevitably, the design restrictions used in CSDs result in performance limitations. These limitations are not a concern for WORO workloads; however, the very low price/performance characteristics of CSDs make them interesting for other applications, e.g., batch processing, too. Applications, however, can be very slow on CSDs if they do not take these characteristics into account. In this paper we design two strategies for data partitioning in CSDs -- a crucial operation in many batch analytics tasks like hash-join, near-duplicate detection, and data localization. We show that our strategies can efficiently use CSDs for batch processing of terabyte-scale data by accelerating data partitioning by 3.5x in our experiments.
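    As a rough illustration of the partitioning step such batch workloads rely on (not the paper's specific CSD-aware strategies), the sketch below hash-partitions a large record file into bucket files so that a later hash-join only has to pair up matching partitions; the record format and file names are hypothetical, and each bucket is written sequentially, which suits disk-based cold storage better than random access.

```python
import hashlib
from pathlib import Path

def hash_partition(input_path: str, out_dir: str, num_partitions: int = 64) -> None:
    """Split a large record file into hash buckets on the join key.

    Records are assumed to be one per line with the join key in the first
    tab-separated field (an illustrative format, not the paper's).
    """
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    writers = [open(out / f"part-{i:04d}.tsv", "w") for i in range(num_partitions)]
    try:
        with open(input_path) as f:
            for line in f:
                key = line.split("\t", 1)[0]
                # Stable hash so both join inputs land in matching buckets.
                bucket = int(hashlib.md5(key.encode()).hexdigest(), 16) % num_partitions
                writers[bucket].write(line)
    finally:
        for w in writers:
            w.close()

# A hash-join then only needs to co-process part-i from each input,
# reading every bucket once and sequentially.
```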

    TrIMS: Transparent and Isolated Model Sharing for Low Latency Deep Learning Inference in Function as a Service Environments

    Deep neural networks (DNNs) have become core computation components within low latency Function as a Service (FaaS) prediction pipelines, including image recognition, object detection, natural language processing, speech synthesis, and personalized recommendation pipelines. Cloud computing, as the de-facto backbone of modern computing infrastructure for both enterprise and consumer applications, has to be able to handle user-defined pipelines of diverse DNN inference workloads while maintaining isolation and latency guarantees, and minimizing resource waste. The current solution for guaranteeing isolation within FaaS is suboptimal -- suffering from "cold start" latency. A major cause of such inefficiency is the need to move large amounts of model data within and across servers. We propose TrIMS as a novel solution to address these issues. Our proposed solution consists of a persistent model store across the GPU, CPU, local storage, and cloud storage hierarchy, an efficient resource management layer that provides isolation, and a succinct set of application APIs and container technologies for easy and transparent integration with FaaS, Deep Learning (DL) frameworks, and user code. We demonstrate our solution by interfacing TrIMS with the Apache MXNet framework and show up to 24x speedup in latency for image classification models and up to 210x speedup for large models. We achieve up to 8x system throughput improvement. Comment: In Proceedings CLOUD 201
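    The tiered model store can be pictured roughly as below. This is a hedged sketch of the general idea (check GPU memory, then CPU memory, then local storage, then cloud storage before loading), not TrIMS's actual API; every class and method name here is hypothetical, and the real system additionally manages GPU memory sharing, isolation, and reference counting.

```python
from collections import OrderedDict

class TieredModelStore:
    """Illustrative lookup order for a GPU/CPU/disk/cloud model cache."""

    def __init__(self, load_from_disk, load_from_cloud, gpu_capacity=4):
        self.gpu_cache = OrderedDict()          # model_id -> weights resident on GPU
        self.cpu_cache = {}                     # model_id -> weights in host memory
        self.load_from_disk = load_from_disk    # callable(model_id) -> weights or None
        self.load_from_cloud = load_from_cloud  # callable(model_id) -> weights
        self.gpu_capacity = gpu_capacity

    def get(self, model_id):
        # 1. GPU hit: cheapest path, no data movement needed.
        if model_id in self.gpu_cache:
            self.gpu_cache.move_to_end(model_id)
            return self.gpu_cache[model_id]
        # 2. CPU hit: only a host-to-device copy is needed.
        weights = self.cpu_cache.get(model_id)
        if weights is None:
            # 3. Local storage, then 4. cloud storage (the "cold start" path).
            weights = self.load_from_disk(model_id) or self.load_from_cloud(model_id)
            self.cpu_cache[model_id] = weights
        # Evict the least-recently-used model if the GPU tier is full.
        if len(self.gpu_cache) >= self.gpu_capacity:
            self.gpu_cache.popitem(last=False)
        self.gpu_cache[model_id] = weights
        return weights
```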

    SusOrganic - Development of quality standards and optimised processing methods for organic produce - Final report

    The SusOrganic project aimed to develop improved drying and cooling/freezing processes for organic products in terms of sustainability and objective product quality criteria. Initially, the consortium focused on a predefined set of products to investigate (fish, meat, fruits and vegetables). Contacting participants in the fruit and vegetable sector showed that there was little perceived need to change or improve existing processes. At the same time, it became clear that hops and herb producers (drying) face several challenges in terms of product quality and the cost of drying processes. Therefore, the range of products was extended to include them. The results of a consumer survey conducted as part of the project showed clearly that consumers trust the organic label, but also tend to confuse the term organic with regional or fair trade. Further, consumers explicitly include on-farm primary production, rather than processing, in their evaluation of sustainability. The appearance of organic products was found to be one of the least important quality attributes in buying decisions. However, there are indications that an imperfect appearance could be a quality attribute for consumers, as the product is then perceived to be processed without artificial additives. Regarding drying operations, small-scale producers in the organic sector often work with old and/or modified techniques and technologies, which often leads to inefficient drying processes with high energy consumption and decreased product quality. Inappropriate air volume flow and distribution often cause inefficient removal of moisture from the product and heterogeneous drying throughout the bulk. Guidelines for improving the physical setup of existing driers, as well as designs for new drying operations including novel drying strategies, were developed. Besides chilling and freezing, the innovative idea of superchilling was included in the project. The superchilled cold chain is only a few degrees colder than the refrigeration chain but has a significant impact on preservation characteristics due to shock frosting of the outer layer of the product and the further distribution of very small ice crystals throughout the product during storage. Superchilling of organically grown salmon eliminated the need for ice during transport, resulting in both reduced energy costs and better value chain performance in terms of carbon footprint. This is mainly due to the significantly reduced transport volume and weight without the presence of ice. Product quality is unchanged, but shelf life is extended compared to chilled fish. This means that the high quality of organic salmon can be maintained over a longer time period, which can be helpful, e.g., to reach distant markets. The same trend was found for superchilled organic meat products such as pork and chicken. The consortium also developed innovative noninvasive measurement and control systems and improved drying strategies and systems for fruits, vegetables, herbs, hops and meat. These systems are based on changes occurring inside the product and therefore require strategies for observing the product during the drying process. Through auditing campaigns and pilot-scale drying tests, optimisation strategies were developed for both herb and hop commodities, which can help reduce microbial spoilage and retain higher levels of volatile product components whilst reducing energy demand. These results can be applied, with modifications, to the other commodities under investigation. The environmental and cost performance of superchilling salmon and drying meat, fruit and vegetables was also investigated, and the findings indicated that both superchilling and drying could improve the sustainability of organic food value chains, especially for distant markets. An additional outcome of the project, beyond its original scope, was the development of a noninvasive, visual-sensor-based detection system for authenticity checks of meat products, distinguishing fresh from pre-frozen meat.

    funcX: A Federated Function Serving Fabric for Science

    Exploding data volumes and velocities, new computational methods and platforms, and ubiquitous connectivity demand new approaches to computation in the sciences. These new approaches must enable computation to be mobile, so that, for example, it can occur near data, be triggered by events (e.g., arrival of new data), be offloaded to specialized accelerators, or run remotely where resources are available. They also require new design approaches in which monolithic applications can be decomposed into smaller components that may in turn be executed separately and on the most suitable resources. To address these needs we present funcX, a distributed function-as-a-service (FaaS) platform that enables flexible, scalable, and high performance remote function execution. funcX's endpoint software can transform existing clouds, clusters, and supercomputers into function serving systems, while funcX's cloud-hosted service provides transparent, secure, and reliable function execution across a federated ecosystem of endpoints. We motivate the need for funcX with several scientific case studies, present our prototype design and implementation, show optimizations that deliver throughput in excess of 1 million functions per second, and demonstrate, via experiments on two supercomputers, that funcX can scale to more than 130,000 concurrent workers. Comment: Accepted to ACM Symposium on High-Performance Parallel and Distributed Computing (HPDC 2020). arXiv admin note: substantial text overlap with arXiv:1908.0490
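    A minimal sketch of how a client uses such a platform is shown below. It follows the register/run/poll pattern of the funcX Python SDK described in the paper, but the exact module path and method names may differ between SDK versions, and the endpoint UUID is a placeholder for an endpoint you actually administer.

```python
# Register a function with the funcX service, run it on a remote
# endpoint, and fetch the result. SDK names follow the funcX paper;
# treat them as approximate.
from funcx.sdk.client import FuncXClient

ENDPOINT_ID = "00000000-0000-0000-0000-000000000000"  # placeholder endpoint UUID

def double(x):
    return 2 * x

fxc = FuncXClient()
func_id = fxc.register_function(double)
task_id = fxc.run(21, endpoint_id=ENDPOINT_ID, function_id=func_id)
# get_result raises while the task is still pending; poll until it returns.
print(fxc.get_result(task_id))  # -> 42 once the task completes
```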