Theoretical and Empirical Analysis of a Parallel Boosting Algorithm
Many real-world problems involve massive amounts of data. Under these
circumstances learning algorithms often become prohibitively expensive, making
scalability a pressing issue to be addressed. A common approach is to perform
sampling to reduce the size of the dataset and enable efficient learning.
Alternatively, one customizes learning algorithms to achieve scalability. In
either case, the key challenge is to obtain algorithmic efficiency without
compromising the quality of the results. In this paper we discuss a
meta-learning algorithm (PSBML) which combines features of parallel algorithms
with concepts from ensemble and boosting methodologies to achieve the desired
scalability property. We present both theoretical and empirical analyses which
show that PSBML preserves a critical property of boosting, specifically,
convergence to a distribution centered around the margin. We then present
additional empirical analyses showing that this meta-level algorithm provides a
general and effective framework that can be used in combination with a variety
of learning classifiers. We perform extensive experiments to investigate the
tradeoff achieved between scalability and accuracy, and robustness to noise, on
both synthetic and real-world data. These empirical results corroborate our
theoretical analysis, and demonstrate the potential of PSBML in achieving
scalability without sacrificing accuracy.
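The margin-biased resampling at the heart of this result can be illustrated in a few lines. The sketch below is not the authors' PSBML implementation (the grid topology, neighbour exchange, and exact confidence rule are simplified away); it only shows the core loop, where each node trains a local learner and instances are resampled with probability increasing in their uncertainty, so the retained sample concentrates near the decision boundary:

```python
# Minimal sketch of margin-biased parallel resampling in the spirit of PSBML.
# All sizes and the confidence rule are illustrative assumptions.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)
X, y = make_classification(n_samples=2000, random_state=0)

n_nodes, n_generations = 4, 5
parts = np.array_split(rng.permutation(len(X)), n_nodes)
local = [(X[idx], y[idx]) for idx in parts]

for gen in range(n_generations):
    # Each node trains a weak learner on its local sample in parallel.
    models = [DecisionTreeClassifier(max_depth=3).fit(Xi, yi) for Xi, yi in local]
    new_local = []
    for Xi, yi in local:
        # Confidence = mean predicted probability of the true class over all nodes.
        proba = np.mean([m.predict_proba(Xi)[np.arange(len(yi)), yi]
                         for m in models], axis=0)
        # Low-confidence (low-margin) instances are kept with higher probability,
        # concentrating the sample distribution around the margin.
        w = 1.0 - proba + 1e-3
        keep = rng.choice(len(Xi), size=len(Xi), p=w / w.sum())
        new_local.append((Xi[keep], yi[keep]))
    local = new_local
```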
Taming Tail Latency for Erasure-coded, Distributed Storage Systems
Distributed storage systems are known to be susceptible to long tails in
response time. In modern online storage systems such as Bing, Facebook, and
Amazon, the long tails of the service latency are of particular concern, with
99.9th percentile response times being orders of magnitude worse than the mean.
As erasure codes emerge as a popular technique to achieve high data reliability
in distributed storage while attaining space efficiency, taming tail latency
still remains an open problem due to the lack of mathematical models for
analyzing such systems. To this end, we propose a framework for quantifying and
optimizing tail latency in erasure-coded storage systems. In particular, we
derive upper bounds on tail latency in closed form for arbitrary service time
distribution and heterogeneous files. Based on the model, we formulate an
optimization problem to jointly minimize the weighted latency tail probability
of all files over the placement of files on the servers, and the choice of
servers to access the requested files. The non-convex problem is solved using
an efficient, alternating optimization algorithm. Numerical results show
significant reduction of tail latency for erasure-coded storage systems with a
realistic workload.
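The mechanism behind erasure-coded tail latency is easy to see numerically: with an (n, k) code, a read completes once any k of the n chunk servers respond, so the request latency is the k-th order statistic of the per-server service times. The toy Monte Carlo below (a stand-in for the paper's closed-form bounds; the heavy-tailed Pareto service model and its parameters are illustrative assumptions) shows how adding parity servers shortens the 99.9th-percentile latency:

```python
# Monte Carlo estimate of the 99.9th-percentile latency of an (n, k)
# erasure-coded read under an illustrative heavy-tailed service model.
import numpy as np

def tail_latency(n, k, n_trials=200_000, seed=0):
    rng = np.random.default_rng(seed)
    # Pareto-distributed service times produce the long tail described above.
    times = 1.0 + rng.pareto(a=2.5, size=(n_trials, n))
    kth = np.sort(times, axis=1)[:, k - 1]    # time until the k-th response
    return np.percentile(kth, 99.9)

for n, k in [(4, 4), (6, 4), (10, 4)]:        # more parity -> shorter tail
    print(n, k, round(tail_latency(n, k), 2))
```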
Highly Scalable Deep Learning Training System with Mixed-Precision: Training ImageNet in Four Minutes
Synchronized stochastic gradient descent (SGD) optimizers with data
parallelism are widely used in training large-scale deep neural networks.
Although using larger mini-batch sizes can improve the system scalability by
reducing the communication-to-computation ratio, it may hurt the generalization
ability of the models. To this end, we build a highly scalable deep learning
training system for dense GPU clusters with three main contributions: (1) We
propose a mixed-precision training method that significantly improves the
training throughput of a single GPU without losing accuracy. (2) We propose an
optimization approach for extremely large mini-batch size (up to 64k) that can
train CNN models on the ImageNet dataset without losing accuracy. (3) We
propose highly optimized all-reduce algorithms that achieve up to 3x and 11x
speedup on AlexNet and ResNet-50, respectively, over NCCL-based training on a
cluster with 1024 Tesla P40 GPUs. On training ResNet-50 with 90 epochs, the
state-of-the-art GPU-based system with 1024 Tesla P100 GPUs spent 15 minutes
and achieved 74.9% top-1 test accuracy, and another KNL-based system with 2048
Intel KNLs spent 20 minutes and achieved 75.4% accuracy. Our training system
can achieve 75.8% top-1 test accuracy in only 6.6 minutes using 2048 Tesla P40
GPUs. When training AlexNet with 95 epochs, our system can achieve 58.7% top-1
test accuracy within 4 minutes, which also outperforms all other existing
systems.
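The core mixed-precision recipe referenced in contribution (1) is widely documented: keep an FP32 master copy of the weights, run the forward/backward pass in FP16, and scale the loss so that small gradients survive FP16's narrow exponent range. The sketch below uses a toy linear model as a stand-in for the paper's CNNs; the scale factor and learning rate are illustrative choices, not the authors' settings:

```python
# Sketch of mixed-precision training with loss scaling and FP32 master weights.
import numpy as np

rng = np.random.default_rng(0)
X = rng.standard_normal((256, 32)).astype(np.float16)
y = rng.standard_normal(256).astype(np.float16)

w_master = np.zeros(32, dtype=np.float32)   # FP32 master copy of the weights
lr, loss_scale = 1e-2, 1024.0

for step in range(100):
    w16 = w_master.astype(np.float16)       # FP16 copy used for compute
    err = X @ w16 - y                       # FP16 forward pass
    # Backward on the *scaled* loss keeps tiny gradients above FP16 underflow.
    g16 = (X.T @ (loss_scale * 2 * err / len(y))).astype(np.float16)
    g32 = g16.astype(np.float32) / loss_scale   # unscale in full precision
    if np.isfinite(g32).all():              # skip the step on inf/nan overflow
        w_master -= lr * g32
```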
Dynamics Based Features For Graph Classification
Numerous social, medical, engineering and biological challenges can be framed
as graph-based learning tasks. Here, we propose a new feature based approach to
network classification. We show how dynamics on a network can be useful to
reveal patterns about the organization of the components of the underlying
graph where the process takes place. We define generalized assortativities on
networks and use them as generalized features across multiple time scales.
These features turn out to be suitable signatures for discriminating between
different classes of networks. Our method is evaluated empirically on
established network benchmarks. We also introduce a new dataset of human brain
networks (connectomes) and use it to evaluate our method. Results reveal that
our dynamics-based features are competitive with, and often outperform,
state-of-the-art accuracies.
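The flavour of such features can be conveyed with a small sketch: run a diffusion process on the graph and, at several time scales, measure how correlated the dynamical state is across edges. This is only a stand-in for the paper's generalized assortativities; the random-walk diffusion, the degree-based initial signal, and the time scales are illustrative assumptions:

```python
# Sketch of dynamics-based graph features: edge-wise correlation of a
# diffusing signal at several time scales, stacked into a feature vector.
import numpy as np
import networkx as nx

def dynamics_features(G, scales=(1, 2, 4, 8)):
    A = nx.to_numpy_array(G)
    d = A.sum(axis=1)
    P = A / np.maximum(d, 1)[:, None]          # random-walk transition matrix
    x = d.copy()                               # initial signal: node degrees
    feats, edges = [], np.array(G.edges())
    for t in range(max(scales) + 1):
        if t in scales:
            u, v = edges[:, 0], edges[:, 1]
            # A "generalized assortativity": correlation of the signal
            # across edge endpoints after t diffusion steps.
            feats.append(np.corrcoef(x[u], x[v])[0, 1])
        x = P @ x                              # one diffusion step
    return np.array(feats)

print(dynamics_features(nx.karate_club_graph()))
```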
Network Topology Inference Using Information Cascades with Limited Statistical Knowledge
We study the problem of inferring network topology from information cascades,
in which the amount of time taken for information to diffuse across an edge in
the network follows an unknown distribution. Unlike previous studies, which
assume knowledge of these distributions, we only require that diffusion along
different edges in the network be independent together with limited moment
information (e.g., the means). We introduce the concept of a separating vertex
set for a graph: a set of vertices containing, for any two distinct vertices
of the graph, a vertex whose distances to those two are different. We show
that a necessary condition for reconstructing a tree
perfectly using distance information between pairs of vertices is given by the
size of an observed separating vertex set. We then propose an algorithm to
recover the tree structure using infection times, whose differences have means
corresponding to the distance between two vertices. To improve the accuracy of
our algorithm, we propose the concept of redundant vertices, which allows us to
perform averaging to better estimate the distance between two vertices. Though
the theory is developed mainly for tree networks, we demonstrate how the
algorithm can be extended heuristically to general graphs. Simulations using
synthetic and real networks, and experiments using real-world data suggest that
our proposed algorithm performs better than some current state-of-the-art
network reconstruction methods.
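A toy version of the distance-from-infection-times idea is sketched below: simulate cascades with i.i.d. exponential edge delays on a hidden tree, average infection times over repetitions to estimate pairwise distances, and rebuild the tree as a minimum spanning tree of the estimated distance matrix (for a true tree metric, the MST recovers the tree). This is a simplified stand-in for the paper's algorithm; the separating- and redundant-vertex machinery is omitted and all parameters are illustrative:

```python
# Toy tree recovery from cascade infection times via estimated distances.
import numpy as np
import networkx as nx

rng = np.random.default_rng(0)
T = nx.balanced_tree(2, 3)                    # hidden ground-truth tree

def cascade(tree, source):
    # BFS from the source, accumulating a random delay on each edge.
    t = {source: 0.0}
    for u, v in nx.bfs_edges(tree, source):
        t[v] = t[u] + rng.exponential(1.0)
    return t

n, reps = T.number_of_nodes(), 200
D = np.zeros((n, n))
for s in range(n):
    for _ in range(reps):                     # averaging sharpens the estimate
        for v, tv in cascade(T, s).items():
            D[s, v] += tv / reps

G_hat = nx.minimum_spanning_tree(nx.from_numpy_array((D + D.T) / 2))
print({frozenset(e) for e in T.edges()} == {frozenset(e) for e in G_hat.edges()})
```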
Tuning for Tissue Image Segmentation Workflows for Accuracy and Performance
We propose a software platform that integrates methods and tools for
multi-objective parameter auto-tuning in tissue image segmentation workflows.
The goal of our work is to provide an approach for improving the accuracy of
nucleus/cell segmentation pipelines by tuning their input parameters. The
shape, size and texture features of nuclei in tissue are important biomarkers
for disease prognosis, and accurate computation of these features depends on
accurate delineation of boundaries of nuclei. Input parameters in many nucleus
segmentation workflows affect segmentation accuracy and have to be tuned for
optimal performance. This is a time-consuming and computationally expensive
process; automating this step facilitates more robust image segmentation
workflows and enables more efficient application of image analysis in large
image datasets. Our software platform adjusts the parameters of a nuclear
segmentation algorithm to maximize the quality of image segmentation results
while minimizing the execution time. It implements several optimization methods
to search the parameter space efficiently. In addition, the methodology is
developed to execute on high performance computing systems to reduce the
execution time of the parameter tuning phase. Our results using three
real-world image segmentation workflows demonstrate that the proposed solution
is able to (1) search a small fraction (about 100 points) of the parameter
space, which contains billions to trillions of points, and improve the quality
of segmentation output by 1.20x, 1.29x, and 1.29x, on average; (2) decrease the
execution time of a segmentation workflow by up to 11.79x while improving
output quality; and (3) effectively use parallel systems to accelerate
parameter tuning and segmentation phases.
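The search loop behind result (1) can be sketched in miniature: sample on the order of 100 points from the parameter space and trade segmentation quality against run time with a weighted score. The workflow here is mocked by a toy function (`run_workflow` is a hypothetical stand-in, not the paper's pipeline), and the scalarised objective is one of several ways to handle the multi-objective trade-off:

```python
# Sketch of sampled multi-objective parameter tuning with a scalarised score.
import random

def run_workflow(threshold, min_size, smoothing):
    # Hypothetical stand-in: quality peaks at particular settings and
    # runtime grows with the smoothing effort.
    quality = 1.0 - abs(threshold - 0.62) - abs(min_size - 40) / 200
    runtime = 1.0 + 0.5 * smoothing
    return quality, runtime

random.seed(0)
best, best_score = None, float("-inf")
for _ in range(100):                      # ~100 sampled points, as in the paper
    p = dict(threshold=random.uniform(0.3, 0.9),
             min_size=random.randint(10, 200),
             smoothing=random.randint(0, 5))
    quality, runtime = run_workflow(**p)
    score = quality - 0.1 * runtime       # quality/run-time trade-off weight
    if score > best_score:
        best, best_score = p, score
print(best)
```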
Memristor-based Deep Convolution Neural Network: A Case Study
In this paper, we first introduce a method to efficiently implement
large-scale high-dimensional convolution with realistic memristor-based circuit
components. An experimentally verified simulator is adapted for accurate prediction
of analog crossbar behavior. An improved conversion algorithm is developed to
convert convolution kernels to memristor-based circuits, which minimizes the
error with consideration of the data and kernel patterns in CNNs. With circuit
simulation for all convolution layers in ResNet-20, we found that 8-bit ADC/DAC
is necessary to preserve software-level classification accuracy.
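The basic kernel-to-crossbar mapping can be illustrated as follows: flatten each convolution kernel to a row of a weight matrix, split it into a positive/negative conductance pair (memristors cannot hold negative weights), and quantise the DAC inputs and ADC outputs to b bits. Device non-idealities from the paper's simulator are omitted, and the value ranges are illustrative assumptions:

```python
# Sketch of a differential memristor-crossbar matmul with 8-bit DAC/ADC.
import numpy as np

def quantize(x, bits, lo, hi):
    levels = 2 ** bits - 1
    x = np.clip(x, lo, hi)
    return np.round((x - lo) / (hi - lo) * levels) / levels * (hi - lo) + lo

rng = np.random.default_rng(0)
W = rng.standard_normal((16, 27)) * 0.1      # 16 flattened 3x3x3 kernels
x = rng.standard_normal(27)

G_pos, G_neg = np.maximum(W, 0), np.maximum(-W, 0)   # differential pair
x_q = quantize(x, bits=8, lo=-3, hi=3)               # 8-bit DAC on inputs
y_analog = G_pos @ x_q - G_neg @ x_q                 # crossbar column currents
y_q = quantize(y_analog, bits=8, lo=-5, hi=5)        # 8-bit ADC on outputs
print(np.max(np.abs(y_q - W @ x)))                   # quantisation error
```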
Multiparametric Deep Learning Tissue Signatures for a Radiological Biomarker of Breast Cancer: Preliminary Results
A new paradigm is beginning to emerge in Radiology with the advent of
increased computational capabilities and algorithms. This has enabled computer
systems to learn different lesion types in real time, helping the radiologist
define disease. For example, using a deep learning
network, we developed and tested a multiparametric deep learning (MPDL) network
for segmentation and classification using multiparametric magnetic resonance
imaging (mpMRI) radiological images. The MPDL network was constructed from
stacked sparse autoencoders with inputs from mpMRI. Evaluation of MPDL
consisted of cross-validation, sensitivity, and specificity. Dice similarity
between MPDL and post-DCE lesions was evaluated. We demonstrate high
sensitivity and specificity of 90% and 85%, respectively, for differentiation
of malignant from benign lesions, with an AUC of 0.93. The Integrated MPDL
method accurately segmented and classified different breast tissue from
multiparametric breast MRI using deep learning tissue signatures.
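One layer of the stacked-sparse-autoencoder building block can be sketched as a tied-weight encoder/decoder trained to reconstruct its input with an L1 sparsity penalty on the hidden code. This is a generic illustration, not the authors' network: the data, layer sizes, and penalty weight are illustrative assumptions standing in for the mpMRI inputs:

```python
# Minimal sketch of one sparse-autoencoder layer with tied weights.
import numpy as np

rng = np.random.default_rng(0)
X = rng.standard_normal((500, 64))          # stand-in for mpMRI feature vectors

n_hidden, lr, lam = 32, 1e-2, 1e-3
W = rng.standard_normal((64, n_hidden)) * 0.1

for step in range(200):
    H = np.tanh(X @ W)                      # encoder
    X_hat = H @ W.T                         # tied-weight decoder
    err = X_hat - X
    # Backprop of reconstruction loss plus L1 sparsity penalty on H.
    dZ = (err @ W + lam * np.sign(H)) * (1 - H ** 2)
    W -= lr * ((X.T @ dZ + err.T @ H) / len(X))

H = np.tanh(X @ W)   # this code would feed the next autoencoder in the stack
```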
Regularized Bidimensional Estimation of the Hazard Rate
In epidemiological or demographic studies, with variable age at onset, a
typical quantity of interest is the incidence of a disease (for example the
cancer incidence). In these studies, the individuals are usually highly
heterogeneous in terms of dates of birth (the cohort) and with respect to the
calendar time (the period) and appropriate estimation methods are needed. In
this article a new estimation method is presented which extends classical
age-period-cohort analysis by allowing interactions between age, period and
cohort effects. This paper introduces a bidimensional regularized estimate of
the hazard rate where a penalty is introduced on the likelihood of the model.
This penalty can be designed either to smooth the hazard rate or to enforce
consecutive values of the hazard to be equal, leading to a parsimonious
representation of the hazard rate. In the latter case, we make use of an
iterative penalized likelihood scheme to approximate the L0 norm, which makes
the computation tractable. The method is evaluated on simulated data and
applied to breast cancer survival data from the SEER program.
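The adaptive reweighting that approximates the L0 penalty can be sketched on a small age-period grid: a piecewise-constant hazard exp(eta), a Poisson log-likelihood with exposures, and a squared-difference penalty between neighbouring cells whose weights shrink as differences grow, so that near-equal neighbours are pushed to fuse. The simulated data, penalty weight, and step sizes below are illustrative assumptions, not the paper's settings:

```python
# Sketch of a regularised age-period hazard with an adaptive-ridge
# (approximate L0) penalty on differences between neighbouring cells.
import numpy as np

rng = np.random.default_rng(0)
na, nper = 20, 20                             # age x period grid
true = np.log(0.01) + 0.1 * (np.arange(na)[:, None] > 10)   # step in age
E = rng.uniform(500, 1500, size=(na, nper))   # person-years at risk
O = rng.poisson(E * np.exp(true))             # observed event counts

eta, lam, eps = np.log((O + 0.5) / E), 5.0, 1e-4
for it in range(300):
    grad = O - E * np.exp(eta)                # Poisson score
    # Weights ~ 1/diff^2 make the ridge penalty mimic an L0 penalty:
    # small differences are driven to exactly equal (fused) values.
    da, dp = np.diff(eta, axis=0), np.diff(eta, axis=1)
    wa, wp = 1 / (da**2 + eps), 1 / (dp**2 + eps)
    pen = np.zeros_like(eta)
    pen[:-1] += wa * da; pen[1:] -= wa * da
    pen[:, :-1] += wp * dp; pen[:, 1:] -= wp * dp
    eta += 1e-3 * (grad / E + lam * pen / E)  # penalised ascent step
```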
Unravelling the forces underlying urban industrial agglomeration
As early as the 1920s, Marshall suggested that firms co-locate in cities to
reduce the costs of moving goods, people, and ideas. These 'forces of
agglomeration' have given rise, for example, to the high tech clusters of San
Francisco and Boston, and the automobile cluster in Detroit. Yet, despite its
importance for city planners and industrial policy-makers, until recently there
has been little success in estimating the relative importance of each
Marshallian channel to the location decisions of firms.
Here we explore a burgeoning literature that aims to exploit the co-location
patterns of industries in cities in order to disentangle the relationship
between industry co-agglomeration and customer/supplier, labour and idea
sharing. Building on previous approaches that focus on across- and
between-industry estimates, we propose a network-based method to estimate the
relative importance of each Marshallian channel at a meso scale. Specifically,
we use a community detection technique to construct a hierarchical
decomposition of the full set of industries into clusters based on
co-agglomeration patterns, and show that these industry clusters exhibit
distinct patterns in terms of their relative reliance on individual Marshallian
channels.
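The pipeline can be sketched end to end on simulated data: build an industry co-agglomeration network from city-by-industry employment, cut it into clusters with a community-detection routine, then check how strongly each cluster's internal co-agglomeration tracks a proxy for one Marshallian channel. Everything below (the data, the correlation threshold, the occupation-overlap proxy) is an illustrative stand-in for the paper's actual inputs and estimates:

```python
# Sketch of the network-based decomposition of Marshallian channels.
import numpy as np
import networkx as nx
from networkx.algorithms.community import greedy_modularity_communities

rng = np.random.default_rng(0)
n_cities, n_ind = 100, 30
emp = rng.poisson(5, size=(n_cities, n_ind)).astype(float)

# Co-agglomeration: correlation of employment patterns across cities.
C = np.corrcoef(emp.T)
G = nx.from_numpy_array(np.where(C > 0.1, C, 0) - np.eye(n_ind))
clusters = greedy_modularity_communities(G, weight="weight")

labour_proxy = rng.random((n_ind, n_ind))     # e.g. occupation-overlap matrix
for c in clusters:
    idx = sorted(c)
    pairs = [(i, j) for i in idx for j in idx if i < j]
    if len(pairs) > 1:
        x = np.array([labour_proxy[i, j] for i, j in pairs])
        y = np.array([C[i, j] for i, j in pairs])
        # How strongly this channel's proxy tracks co-agglomeration here.
        print(len(c), np.corrcoef(x, y)[0, 1])
```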