
    What does fault tolerant Deep Learning need from MPI?

    Deep Learning (DL) algorithms have become the de facto Machine Learning (ML) algorithms for large-scale data analysis. DL algorithms are computationally expensive: even distributed DL implementations that use MPI require days of training (model learning) time on commonly studied datasets. Long-running DL applications become susceptible to faults, requiring the development of a fault tolerant system infrastructure in addition to fault tolerant DL algorithms. This raises an important question: What is needed from MPI for designing fault tolerant DL implementations? In this paper, we address this problem for permanent faults. We motivate the need for a fault tolerant MPI specification by an in-depth consideration of recent innovations in DL algorithms and their properties, which drive the need for specific fault tolerance features. We present an in-depth discussion on the suitability of different parallelism types (model, data and hybrid); a need (or lack thereof) for checkpointing of any critical data structures; and, most importantly, consideration of several fault tolerance proposals (user-level fault mitigation (ULFM), Reinit) in MPI and their applicability to fault tolerant DL implementations. We leverage a distributed memory implementation of Caffe, currently available under the Machine Learning Toolkit for Extreme Scale (MaTEx). We implement our approaches by extending MaTEx-Caffe to use a ULFM-based implementation. Our evaluation using the ImageNet dataset and the AlexNet and GoogLeNet neural network topologies demonstrates the effectiveness of the proposed fault tolerant DL implementation using OpenMPI-based ULFM.
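To make the data-parallel case concrete, here is a minimal pure-Python simulation, not the MaTEx-Caffe or ULFM code. It assumes every worker holds a full model replica, so a permanent failure can be handled by shrinking the set of participants and averaging gradients over the survivors, with no checkpoint restore. The names `local_gradient` and `train`, the scalar model, and the choice to simply drop the failed worker's data shard are all illustrative assumptions; the abstract does not say how lost shards are treated.

```python
def local_gradient(w, shard):
    """Gradient of the mean squared error (w - x)^2 over one worker's shard."""
    return sum(2.0 * (w - x) for x in shard) / len(shard)


def train(shards, steps, failures=None, lr=0.1, w=0.0):
    """Simulate data-parallel SGD with permanent worker failures.

    shards:   dict mapping rank -> list of samples (hypothetical toy data)
    failures: dict mapping step -> rank that fails permanently at that step
    """
    failures = failures or {}
    alive = sorted(shards)  # stands in for the current communicator
    for step in range(steps):
        failed = failures.get(step)
        if failed in alive:
            # Analogous to shrinking the communicator: continue with survivors.
            alive.remove(failed)
        if not alive:
            raise RuntimeError("all workers failed")
        # Stand-in for an all-reduce: average the surviving workers' gradients.
        grads = [local_gradient(w, shards[rank]) for rank in alive]
        w -= lr * sum(grads) / len(grads)
    return w, alive
```

Calling `train({0: [1.0], 1: [3.0], 2: [100.0]}, steps=200, failures={0: 2})` removes rank 2 before the first step, and training proceeds on ranks 0 and 1 only. In a real MPI program the `alive` list would correspond to a communicator rebuilt after a failure is detected.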

    The Case for Graph-Based Recommendations

    Recommender systems have been intensively used to create personalised profiles, which enhance the user experience. In certain areas, such as e-learning, this approach is short-sighted, since each student masters each concept through different means. The progress from one concept to the next, or from one lesson to another, does not necessarily follow a fixed pattern. Given these settings, we can no longer use simple structures (vectors, strings, etc.) to represent each user's interactions with the system, because the sequence of events and their mapping to users' intentions build up into more complex synergies. As a consequence, we propose a graph-based interpretation of the problem and identify the challenges behind (a) using graphs to model the users' journeys and hence as the input to the recommender system, and (b) producing recommendations in the form of graphs of actions to be taken.
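One possible reading of points (a) and (b) is sketched below, as an illustration rather than the authors' method. It assumes each user journey is an ordered list of concept names, builds a directed graph whose edge weights count observed transitions, and returns a recommendation as a path (the simplest graph of actions). The function names `build_journey_graph` and `recommend_path` and the greedy most-frequent-successor rule are hypothetical.

```python
from collections import defaultdict


def build_journey_graph(sessions):
    """Count transitions between consecutive steps across all user sessions."""
    graph = defaultdict(lambda: defaultdict(int))
    for session in sessions:
        for src, dst in zip(session, session[1:]):
            graph[src][dst] += 1
    return graph


def recommend_path(graph, start, length):
    """Greedily follow the most frequent transition for up to `length` steps."""
    path = [start]
    current = start
    for _ in range(length):
        successors = graph.get(current)
        if not successors:
            break  # no observed continuation from this step
        # Sort first so ties are broken alphabetically and results are stable.
        current = max(sorted(successors), key=lambda node: successors[node])
        if current in path:
            break  # stop rather than loop around a cycle
        path.append(current)
    return path
```

Given the sessions `[["intro", "loops", "functions"], ["intro", "loops", "recursion"], ["intro", "loops", "functions"]]`, the edge from `loops` to `functions` has weight 2 and the edge from `loops` to `recursion` has weight 1, so a two-step recommendation from `intro` follows `loops` and then `functions`.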

    HIL: designing an exokernel for the data center

    We propose a new Exokernel-like layer to allow mutually untrusting physically deployed services to efficiently share the resources of a data center. We believe that such a layer offers not only efficiency gains, but may also enable new economic models, new applications, and new security-sensitive uses. A prototype (currently in active use) demonstrates that the proposed layer is viable, and can support a variety of existing provisioning tools and use cases. Partial support for this work was provided by the MassTech Collaborative Research Matching Grant Program, National Science Foundation awards 1347525 and 1149232, as well as the several commercial partners of the Massachusetts Open Cloud, who may be found at http://www.massopencloud.or

    Energy-efficient through-life smart design, manufacturing and operation of ships in an industry 4.0 environment

    Energy efficiency is an important factor in the marine industry to help reduce manufacturing and operational costs as well as the impact on the environment. In the face of global competition and cost-effectiveness, ship builders and operators today require a major overhaul of the entire ship design, manufacturing and operation process to achieve these goals. This paper highlights smart design, manufacturing and operation as the way forward in an industry 4.0 (i4) era, from designing for better energy efficiency to more intelligent ships and smart operation through-life. The paper (i) draws parallels between ship design, manufacturing and operation processes, (ii) identifies key challenges facing such temporal (lifecycle), as opposed to spatial (mass), products, (iii) proposes a closed-loop ship lifecycle framework and (iv) outlines potential future directions in smart design, manufacturing and operation of ships in an industry 4.0 value chain so as to achieve more energy-efficient vessels. Through computational intelligence and cyber-physical integration, we envision that industry 4.0 can revolutionise ship design, manufacturing and operations in a smart product through-life process in the near future.

    Streaming 1.9 Billion Hypersparse Network Updates per Second with D4M

    The Dynamic Distributed Dimensional Data Model (D4M) library implements associative arrays in a variety of languages (Python, Julia, and Matlab/Octave) and provides a lightweight in-memory database implementation of hypersparse arrays that are ideal for analyzing many types of network data. D4M relies on associative arrays, which combine properties of spreadsheets, databases, matrices, graphs, and networks, while providing rigorous mathematical guarantees, such as linearity. Streaming updates of D4M associative arrays put enormous pressure on the memory hierarchy. This work describes the design and performance optimization of an implementation of hierarchical associative arrays that reduces memory pressure and dramatically increases the update rate into an associative array. The parameters of hierarchical associative arrays rely on controlling the number of entries in each level in the hierarchy before an update is cascaded. The parameters are easily tunable to achieve optimal performance for a variety of applications. Hierarchical arrays achieve over 40,000 updates per second in a single instance. Scaling to 34,000 instances of hierarchical D4M associative arrays on 1,100 server nodes on the MIT SuperCloud achieved a sustained update rate of 1,900,000,000 updates per second. This capability allows the MIT SuperCloud to analyze extremely large streaming network data sets. Comment: 6 pages; 6 figures; accepted to IEEE High Performance Extreme Computing (HPEC) Conference 2019. arXiv admin note: text overlap with arXiv:1807.05308, arXiv:1902.0084
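The cascading idea can be sketched in a few lines of plain Python. This is a toy model under stated assumptions, not the D4M API: the class name `HierarchicalAssoc`, the `cuts` parameter, and the use of dictionaries keyed by `(row, col)` are all hypothetical. Each level has an entry limit; when a level exceeds it, its contents are added into the next level and the level is cleared, so most updates touch only a small structure.

```python
from collections import defaultdict


class HierarchicalAssoc:
    """Toy hierarchical associative array with cascading levels."""

    def __init__(self, cuts):
        # cuts[i] is the largest number of entries level i may hold before
        # it is merged into level i + 1; the final level is unbounded.
        self.cuts = list(cuts)
        self.levels = [defaultdict(int) for _ in range(len(self.cuts) + 1)]

    def update(self, row, col, value=1):
        """Add `value` at (row, col) in the smallest level, then cascade."""
        self.levels[0][(row, col)] += value
        self._cascade()

    def _cascade(self):
        for i, cut in enumerate(self.cuts):
            if len(self.levels[i]) <= cut:
                break  # this level still fits, so deeper levels are untouched
            for key, value in self.levels[i].items():
                self.levels[i + 1][key] += value
            self.levels[i].clear()

    def total(self):
        """Sum all levels into a single dictionary view of the array."""
        out = defaultdict(int)
        for level in self.levels:
            for key, value in level.items():
                out[key] += value
        return dict(out)
```

Because the levels are combined by addition, the sum over all levels always equals the array that a single flat structure would hold; the `cuts` values only change how often the larger levels are touched, which is the tunable trade-off the abstract describes.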

    Your Smart Home Can't Keep a Secret: Towards Automated Fingerprinting of IoT Traffic with Neural Networks

    IoT (Internet of Things) technology has been widely adopted in recent years and has profoundly changed people's daily lives. However, in the meantime, such a fast-growing technology has also introduced new privacy issues, which need to be better understood and measured. In this work, we look into how private information can be leaked from network traffic generated in the smart home network. Although researchers have proposed techniques to infer IoT device types or user behaviors under clean experimental setups, the effectiveness of such approaches becomes questionable in complex but realistic network environments, where common techniques like Network Address and Port Translation (NAPT) and Virtual Private Network (VPN) are enabled. Traffic analysis using traditional methods (e.g., through classical machine-learning models) is much less effective under those settings, as the manually picked features are no longer distinctive. In this work, we propose a traffic analysis framework based on sequence-learning techniques like LSTM that leverages the temporal relations between packets for the device identification attack. We evaluated it under different environment settings (e.g., pure-IoT and noisy environments with multiple non-IoT devices). The results showed that our framework was able to differentiate device types with high accuracy. This result suggests that IoT network communications pose prominent challenges to users' privacy, even when they are protected by encryption and morphed by the network gateway. As such, new privacy protection methods for IoT traffic need to be developed to mitigate this new issue.

    Method to Generate Disaster-Damage Map using 3D photometry and Crowd Sourcing

    Thanks to the rapid progress of the Internet and mobile devices, information related to disaster areas can be collected through the Internet. To grasp the degree of damage in a disaster situation, the use of crowdsourcing for coordinating the individual efforts (micro tasks) of an enormous number of users (workers) on the Internet has been drawing attention as a means of quickly solving problems. However, the information gathered from the Internet is huge and diverse, so it is difficult to formulate as a crowdsourcing task. This paper proposes a conversion platform that takes images of a disaster site photographed by various users as information about the site, integrates the images into a single map using 3D image processing, and provides the map to crowdsourcing as a micro task. Published in: 2017 IEEE International Conference on Big Data (Big Data). Date of Conference: 11-14 Dec. 2017. Conference Location: Boston, MA, US