
    Theoretical and Empirical Analysis of a Parallel Boosting Algorithm

    Many real-world problems involve massive amounts of data. Under these circumstances learning algorithms often become prohibitively expensive, making scalability a pressing issue. A common approach is to sample the data to reduce the size of the dataset and enable efficient learning. Alternatively, one customizes learning algorithms to achieve scalability. In either case, the key challenge is to obtain algorithmic efficiency without compromising the quality of the results. In this paper we discuss a meta-learning algorithm (PSBML) which combines features of parallel algorithms with concepts from ensemble and boosting methodologies to achieve the desired scalability property. We present both theoretical and empirical analyses which show that PSBML preserves a critical property of boosting, specifically, convergence to a distribution centered around the margin. We then present additional empirical analyses showing that this meta-level algorithm provides a general and effective framework that can be used in combination with a variety of learning classifiers. We perform extensive experiments to investigate the tradeoff achieved between scalability and accuracy, and robustness to noise, on both synthetic and real-world data. These empirical results corroborate our theoretical analysis, and demonstrate the potential of PSBML in achieving scalability without sacrificing accuracy.

    Taming Tail Latency for Erasure-coded, Distributed Storage Systems

    Distributed storage systems are known to be susceptible to long tails in response time. In modern online storage systems such as Bing, Facebook, and Amazon, the long tails of the service latency are of particular concern, with 99.9th percentile response times being orders of magnitude worse than the mean. As erasure codes emerge as a popular technique to achieve high data reliability in distributed storage while attaining space efficiency, taming tail latency remains an open problem due to the lack of mathematical models for analyzing such systems. To this end, we propose a framework for quantifying and optimizing tail latency in erasure-coded storage systems. In particular, we derive upper bounds on tail latency in closed form for arbitrary service time distribution and heterogeneous files. Based on the model, we formulate an optimization problem to jointly minimize the weighted latency tail probability of all files over the placement of files on the servers, and the choice of servers to access the requested files. The non-convex problem is solved using an efficient, alternating optimization algorithm. Numerical results show significant reduction of tail latency for erasure-coded storage systems with a realistic workload. Comment: 11 pages, 8 figures

    Highly Scalable Deep Learning Training System with Mixed-Precision: Training ImageNet in Four Minutes

    Synchronized stochastic gradient descent (SGD) optimizers with data parallelism are widely used in training large-scale deep neural networks. Although using larger mini-batch sizes can improve the system scalability by reducing the communication-to-computation ratio, it may hurt the generalization ability of the models. To this end, we build a highly scalable deep learning training system for dense GPU clusters with three main contributions: (1) We propose a mixed-precision training method that significantly improves the training throughput of a single GPU without losing accuracy. (2) We propose an optimization approach for extremely large mini-batch size (up to 64k) that can train CNN models on the ImageNet dataset without losing accuracy. (3) We propose highly optimized all-reduce algorithms that achieve up to 3x and 11x speedup on AlexNet and ResNet-50, respectively, over NCCL-based training on a cluster with 1024 Tesla P40 GPUs. On training ResNet-50 with 90 epochs, the state-of-the-art GPU-based system with 1024 Tesla P100 GPUs spent 15 minutes and achieved 74.9% top-1 test accuracy, and another KNL-based system with 2048 Intel KNLs spent 20 minutes and achieved 75.4% accuracy. Our training system can achieve 75.8% top-1 test accuracy in only 6.6 minutes using 2048 Tesla P40 GPUs. When training AlexNet with 95 epochs, our system can achieve 58.7% top-1 test accuracy within 4 minutes, which also outperforms all other existing systems. Comment: arXiv admin note: text overlap with arXiv:1803.03383 by other authors

    Dynamics Based Features For Graph Classification

    Numerous social, medical, engineering and biological challenges can be framed as graph-based learning tasks. Here, we propose a new feature-based approach to network classification. We show how dynamics on a network can reveal patterns in the organization of the components of the underlying graph where the process takes place. We define generalized assortativities on networks and use them as generalized features across multiple time scales. These features turn out to be suitable signatures for discriminating between different classes of networks. Our method is evaluated empirically on established network benchmarks. We also introduce a new dataset of human brain networks (connectomes) and use it to evaluate our method. Results reveal that our dynamics-based features are competitive with, and often outperform, state-of-the-art accuracies. Comment: This paper is under review as a conference paper at ECML-PKDD 201

    Network Topology Inference Using Information Cascades with Limited Statistical Knowledge

    We study the problem of inferring network topology from information cascades, in which the amount of time taken for information to diffuse across an edge in the network follows an unknown distribution. Unlike previous studies, which assume knowledge of these distributions, we only require that diffusion along different edges in the network be independent, together with limited moment information (e.g., the means). We introduce the concept of a separating vertex set for a graph, which is a set of vertices such that, for any two distinct vertices of the graph, there exists a vertex in the set whose distances to them are different. We show that a necessary condition for reconstructing a tree perfectly using distance information between pairs of vertices is given by the size of an observed separating vertex set. We then propose an algorithm to recover the tree structure using infection times, whose differences have means corresponding to the distance between two vertices. To improve the accuracy of our algorithm, we propose the concept of redundant vertices, which allows us to perform averaging to better estimate the distance between two vertices. Though the theory is developed mainly for tree networks, we demonstrate how the algorithm can be extended heuristically to general graphs. Simulations using synthetic and real networks, and experiments using real-world data, suggest that our proposed algorithm performs better than some current state-of-the-art network reconstruction methods.
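
    The definition of a separating vertex set given in this abstract can be checked mechanically on a small graph. The following is a minimal sketch, not the authors' algorithm: it assumes an unweighted, connected graph stored as an adjacency dictionary, uses hop counts as the distance, and the names `bfs_distances` and `is_separating` are hypothetical.

```python
from collections import deque
from itertools import combinations


def bfs_distances(adj, source):
    """Hop-count distances from `source` to every reachable vertex."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def is_separating(adj, candidate):
    """True if every pair of distinct vertices is told apart by some
    vertex in `candidate`, i.e. its distances to the two differ.
    Assumes the graph is connected (an assumption of this sketch)."""
    dist_from = {s: bfs_distances(adj, s) for s in candidate}
    for u, v in combinations(adj, 2):
        if all(dist_from[s][u] == dist_from[s][v] for s in candidate):
            return False  # no vertex in the candidate set separates u and v
    return True


# Toy star tree: center 0 with leaves 1, 2, 3 (illustrative only).
star = {0: [1, 2, 3], 1: [0], 2: [0], 3: [0]}
print(is_separating(star, [1]))     # leaves 2 and 3 look identical from 1
print(is_separating(star, [1, 2]))  # adding leaf 2 tells them apart
```

    On the toy star, a single leaf cannot distinguish the other two leaves, while two leaves together can, which illustrates why the size of the observed set matters.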

    Tuning for Tissue Image Segmentation Workflows for Accuracy and Performance

    We propose a software platform that integrates methods and tools for multi-objective parameter auto-tuning in tissue image segmentation workflows. The goal of our work is to provide an approach for improving the accuracy of nucleus/cell segmentation pipelines by tuning their input parameters. The shape, size and texture features of nuclei in tissue are important biomarkers for disease prognosis, and accurate computation of these features depends on accurate delineation of boundaries of nuclei. Input parameters in many nucleus segmentation workflows affect segmentation accuracy and have to be tuned for optimal performance. This is a time-consuming and computationally expensive process; automating this step facilitates more robust image segmentation workflows and enables more efficient application of image analysis in large image datasets. Our software platform adjusts the parameters of a nuclear segmentation algorithm to maximize the quality of image segmentation results while minimizing the execution time. It implements several optimization methods to search the parameter space efficiently. In addition, the methodology is developed to execute on high performance computing systems to reduce the execution time of the parameter tuning phase. Our results using three real-world image segmentation workflows demonstrate that the proposed solution is able to (1) search a small fraction (about 100 points) of the parameter space, which contains billions to trillions of points, and improve the quality of segmentation output by 1.20x, 1.29x, and 1.29x, on average; (2) decrease the execution time of a segmentation workflow by up to 11.79x while improving output quality; and (3) effectively use parallel systems to accelerate parameter tuning and segmentation phases. Comment: 29 pages, 5 figures

    Memristor-based Deep Convolution Neural Network: A Case Study

    In this paper, we first introduce a method to efficiently implement large-scale high-dimensional convolution with realistic memristor-based circuit components. An experimentally verified simulator is adapted for accurate prediction of analog crossbar behavior. An improved conversion algorithm is developed to convert convolution kernels to memristor-based circuits, which minimizes the error with consideration of the data and kernel patterns in CNNs. With circuit simulation for all convolution layers in ResNet-20, we found that 8-bit ADC/DAC is necessary to preserve software-level classification accuracy.

    Multiparametric Deep Learning Tissue Signatures for a Radiological Biomarker of Breast Cancer: Preliminary Results

    A new paradigm is beginning to emerge in Radiology with the advent of increased computational capabilities and algorithms. This has enabled computer systems to learn different lesion types in real time to help the radiologist in defining disease. For example, we developed and tested a multiparametric deep learning (MPDL) network for segmentation and classification using multiparametric magnetic resonance imaging (mpMRI) radiological images. The MPDL network was constructed from stacked sparse autoencoders with inputs from mpMRI. Evaluation of MPDL consisted of cross-validation, sensitivity, and specificity. Dice similarity between MPDL and post-DCE lesions was evaluated. We demonstrate high sensitivity and specificity for differentiation of malignant from benign lesions of 90% and 85% respectively with an AUC of 0.93. The integrated MPDL method accurately segmented and classified different breast tissue from multiparametric breast MRI using deep learning tissue signatures. Comment: Deep Learning, Machine learning, Magnetic resonance imaging, multiparametric MRI, Breast, Cancer, Diffusion, tissue biomarker
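
    The evaluation measures named in this abstract (sensitivity, specificity, and Dice similarity) have standard definitions that can be sketched briefly. The code below is illustrative only: the function names are hypothetical, the counts and pixel sets are made-up toy values rather than data from the study, and segmentations are assumed to be represented as sets of pixel indices.

```python
def sensitivity(true_pos, false_neg):
    """Fraction of malignant cases correctly flagged: TP / (TP + FN)."""
    return true_pos / (true_pos + false_neg)


def specificity(true_neg, false_pos):
    """Fraction of benign cases correctly cleared: TN / (TN + FP)."""
    return true_neg / (true_neg + false_pos)


def dice_similarity(mask_a, mask_b):
    """Dice coefficient 2|A ∩ B| / (|A| + |B|) for two pixel-index sets."""
    if not mask_a and not mask_b:
        return 1.0  # convention chosen here: two empty masks agree fully
    return 2 * len(mask_a & mask_b) / (len(mask_a) + len(mask_b))


# Toy values, chosen only to exercise the formulas.
print(sensitivity(true_pos=8, false_neg=2))
print(specificity(true_neg=6, false_pos=4))
print(dice_similarity({1, 2, 3, 4}, {3, 4, 5, 6}))
```

    Treating two empty masks as a perfect match is a design choice of this sketch; other conventions return 0 or leave the value undefined.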

    Regularized Bidimensional Estimation of the Hazard Rate

    In epidemiological or demographic studies, with variable age at onset, a typical quantity of interest is the incidence of a disease (for example the cancer incidence). In these studies, the individuals are usually highly heterogeneous in terms of dates of birth (the cohort) and with respect to the calendar time (the period), and appropriate estimation methods are needed. In this article a new estimation method is presented which extends classical age-period-cohort analysis by allowing interactions between age, period and cohort effects. This paper introduces a bidimensional regularized estimate of the hazard rate where a penalty is introduced on the likelihood of the model. This penalty can be designed either to smooth the hazard rate or to enforce consecutive values of the hazard to be equal, leading to a parsimonious representation of the hazard rate. In the latter case, we make use of an iterative penalized likelihood scheme to approximate the L0 norm, which makes the computation tractable. The method is evaluated on simulated data and applied to breast cancer survival data from the SEER program.

    Unravelling the forces underlying urban industrial agglomeration

    As early as the 1920s, Marshall suggested that firms co-locate in cities to reduce the costs of moving goods, people, and ideas. These 'forces of agglomeration' have given rise, for example, to the high-tech clusters of San Francisco and Boston, and the automobile cluster in Detroit. Yet, despite its importance for city planners and industrial policy-makers, until recently there has been little success in estimating the relative importance of each Marshallian channel to the location decisions of firms. Here we explore a burgeoning literature that aims to exploit the co-location patterns of industries in cities in order to disentangle the relationship between industry co-agglomeration and customer/supplier, labour and idea sharing. Building on previous approaches that focus on across- and between-industry estimates, we propose a network-based method to estimate the relative importance of each Marshallian channel at a meso scale. Specifically, we use a community detection technique to construct a hierarchical decomposition of the full set of industries into clusters based on co-agglomeration patterns, and show that these industry clusters exhibit distinct patterns in terms of their relative reliance on individual Marshallian channels.