652,357 research outputs found

    What does fault tolerant Deep Learning need from MPI?

    Deep Learning (DL) algorithms have become the de facto Machine Learning (ML) algorithm for large scale data analysis. DL algorithms are computationally expensive - even distributed DL implementations which use MPI require days of training (model learning) time on commonly studied datasets. Long running DL applications become susceptible to faults - requiring development of a fault tolerant system infrastructure, in addition to fault tolerant DL algorithms. This raises an important question: What is needed from MPI for designing fault tolerant DL implementations? In this paper, we address this problem for permanent faults. We motivate the need for a fault tolerant MPI specification by an in-depth consideration of recent innovations in DL algorithms and their properties, which drive the need for specific fault tolerance features. We present an in-depth discussion on the suitability of different parallelism types (model, data and hybrid); a need (or lack thereof) for check-pointing of any critical data structures; and most importantly, consideration for several fault tolerance proposals (user-level fault mitigation (ULFM), Reinit) in MPI and their applicability to fault tolerant DL implementations. We leverage a distributed memory implementation of Caffe, currently available under the Machine Learning Toolkit for Extreme Scale (MaTEx). We implement our approaches by extending MaTEx-Caffe to use a ULFM-based implementation. Our evaluation using the ImageNet dataset and the AlexNet and GoogLeNet neural network topologies demonstrates the effectiveness of the proposed fault tolerant DL implementation using OpenMPI based ULFM.

    Time-efficient fault detection and diagnosis system for analog circuits

    Time-efficient fault analysis and diagnosis of analog circuits are the most important prerequisites for achieving online health monitoring of electronic equipment, which faces the continuing challenges of ultra-large-scale integration, component tolerance, and limited test points but multiple faults. This work reports an FPGA (field programmable gate array)-based analog fault diagnostic system that applies two-dimensional information fusion, two-port network analysis and interval math theory. The proposed system has three advantages over traditional ones. First, it possesses high processing speed and smart circuit size as the embedded algorithms execute in parallel on the FPGA. Second, the hardware structure has good compatibility with other diagnostic algorithms. Third, the equipped Ethernet interface enhances its flexibility for remote monitoring and control. The experimental results obtained from two realistic example circuits indicate that the proposed methodology yields competitive performance in both diagnosis accuracy and time-effectiveness, with about 96% accuracy within 60 ms of computational time. Peer reviewed. Final published version.

    Symbolic tolerance and sensitivity analysis of large scale electronic circuits

    Available from British Library Document Supply Centre-DSC:DXN029693 / BLDSC - British Library Document Supply Centre. SIGLE. GB. United Kingdom.

    Equational Reasonings in Wireless Network Gossip Protocols

    Gossip protocols have been proposed as a robust and efficient method for disseminating information throughout large-scale networks. In this paper, we propose a compositional analysis technique to study formal probabilistic models of gossip protocols expressed in a simple probabilistic timed process calculus for wireless sensor networks. We equip the calculus with a simulation theory to compare probabilistic protocols that have similar behaviour up to a certain tolerance. The theory is used to prove a number of algebraic laws which prove very effective for estimating the performance of gossip networks, with and without communication collisions, and of randomised gossip networks. Our simulation theory is an asymmetric variant of the weak bisimulation metric that maintains most of the properties of the original definition. However, our asymmetric version is particularly suitable for reasoning about protocols in which the systems under consideration are not approximately equivalent, as in the case of gossip protocols.

    Domestication and divergence of Saccharomyces cerevisiae beer yeasts

    Whereas domestication of livestock, pets, and crops is well documented, it is still unclear to what extent microbes associated with the production of food have also undergone human selection and where the plethora of industrial strains originates from. Here, we present the genomes and phenomes of 157 industrial Saccharomyces cerevisiae yeasts. Our analyses reveal that today's industrial yeasts can be divided into five sublineages that are genetically and phenotypically separated from wild strains and originate from only a few ancestors through complex patterns of domestication and local divergence. Large-scale phenotyping and genome analysis further show strong industry-specific selection for stress tolerance, sugar utilization, and flavor production, while the sexual cycle and other phenotypes related to survival in nature show decay, particularly in beer yeasts. Together, these results shed light on the origins, evolutionary history, and phenotypic diversity of industrial yeasts and provide a resource for further selection of superior strains

    ART 2-A: An Adaptive Resonance Algorithm for Rapid Category Learning and Recognition

    This article introduces ART 2-A, an efficient algorithm that emulates the self-organizing pattern recognition and hypothesis testing properties of the ART 2 neural network architecture, but at a speed two to three orders of magnitude faster. Analysis and simulations show how the ART 2-A systems correspond to ART 2 dynamics at both the fast-learn limit and at intermediate learning rates. Intermediate learning rates permit fast commitment of category nodes but slow recoding, analogous to properties of word frequency effects, encoding specificity effects, and episodic memory. Better noise tolerance is hereby achieved without a loss of learning stability. The ART 2 and ART 2-A systems are contrasted with the leader algorithm. The speed of ART 2-A makes practical the use of ART 2 modules in large-scale neural computation. Funding: BP (89-A-1204); Defense Advanced Research Projects Agency (90-0083); National Science Foundation (IRI-90-00530); Air Force Office of Scientific Research (90-0175, 90-0128); Army Research Office (DAAL-03-88-K0088)

    3D Scanning and computer-aided tolerance software analysis for product inspection

    Tolerances are vital for every physical product, with a tight connection and competing needs between engineering design and manufacturing. 1D, 2D and 3D tolerance analysis can be applied to any product to determine these tolerances. As the number of dimensions increases, so does the difficulty of the tolerance analysis. This research explores tolerance analysis in the 3D case. 3D scanning is a recently developed technology. In industry, it is popular for inspecting product quality and for reverse engineering. For inspection, the dimensions of the 3D scanning model are compared with those of the CAD model. For reverse engineering, a CAD model can be generated from the 3D scanning model. The devices mainly used in 3D scanning are the 3D optical scanner and the 3D laser scanner. These two types of 3D scanner use the same triangulation principle, but one uses optical light and the other laser light. This research includes a 3D tolerance analysis and a 3D scan. Before the tolerance analysis, a tolerance stack-up analysis was completed. The tolerance analysis was done using Crystal Ball software, which uses Monte Carlo simulation to produce results based on an HTM calculator in Excel. The HTM calculator contains the nominal position and tolerance value of every transformation. The nominal position distance calculated by HTM should be the same as the distance measured in the CAD software Creo. The nominal positions of the transformations were based on a loop diagram. The tolerance values were based on the tolerances defined in the drawing and on the 3D scanning values. 3D scanning in this research is used to inspect product quality. Both the parts and the assembled device were scanned. Parts were selected based on the loop diagram. The device was assembled using the 3D scanned parts. The results of the tolerance analysis were shown through distribution charts and sensitivity charts. Comparing the simulation results for the 3D scanning data with those for the tolerances defined in the drawing, the distribution chart results were not reliable, but the sensitivity chart results were similar. The results of the 3D scanning measurement data show that the current device tolerance value is too tight. The 3D scanning devices used in this research are not suited for large scale implementation, e.g. in product inspection.

    Comparison and Design Optimization of a Five-Phase Flux-Switching PM Machine for In-Wheel Traction Applications

    A comparative study of five-phase outer-rotor flux-switching permanent magnet (FSPM) machines with different topologies for in-wheel traction applications is presented in this paper. Those topologies include double-layer winding, single-layer winding, C-core, and E-core configurations. The electromagnetic performance in the low-speed region, the flux-weakening capability in the high-speed region, and the fault-tolerance capability are all investigated in detail. The results indicate that the E-core FSPM machine has performance advantages. Furthermore, two kinds of E-core FSPM machines with different stator and rotor pole combinations are each optimized. In order to reduce the computational burden during the large-scale optimization process, a mathematical technique is developed based on the concept of computationally efficient finite-element analysis, while a differential evolution algorithm serves as a global search engine to target optimized designs. Subsequently, multiobjective tradeoffs are presented based on a Pareto-set for 20 000 candidate designs. Finally, an optimal design is prototyped, and some experimental results are given to confirm the validity of the simulation results in this paper.