Edge-labeling Graph Neural Network for Few-shot Learning
In this paper, we propose a novel edge-labeling graph neural network (EGNN),
which adapts a deep neural network on the edge-labeling graph, for few-shot
learning. The previous graph neural network (GNN) approaches in few-shot
learning have been based on the node-labeling framework, which implicitly
models the intra-cluster similarity and the inter-cluster dissimilarity. In
contrast, the proposed EGNN learns to predict the edge-labels rather than the
node-labels on the graph, which enables an explicit clustering to evolve by
iteratively updating the edge-labels with direct exploitation of both
intra-cluster similarity and inter-cluster dissimilarity. It is also well
suited to tasks with varying numbers of classes without retraining, and can
be easily extended to perform a transductive inference. The parameters of the
EGNN are learned by episodic training with an edge-labeling loss to obtain a
well-generalizable model for unseen low-data problems. On both supervised
and semi-supervised few-shot image classification tasks with two benchmark
datasets, the proposed EGNN significantly improves performance over
existing GNNs.
Comment: accepted to CVPR 201
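As a rough illustration of the edge-labeling idea (a hypothetical sketch, not the paper's actual architecture), the snippet below alternates one node update, where each node aggregates its neighbours weighted by similarity and dissimilarity edges, with one edge update that re-scores every node pair from the refreshed features; all names and the exp(-distance) scoring are assumptions.

```python
import numpy as np

def egnn_step(node_feats, edge_sim, edge_dis):
    """One hypothetical edge-labeling update: nodes aggregate neighbours
    weighted by (dis)similarity edges, then every edge is re-scored from
    the refreshed node features."""
    # Node update: similarity-weighted minus dissimilarity-weighted aggregation.
    sim_agg = edge_sim @ node_feats / edge_sim.sum(1, keepdims=True)
    dis_agg = edge_dis @ node_feats / edge_dis.sum(1, keepdims=True)
    nodes = node_feats + sim_agg - dis_agg
    # Edge update: pairwise feature distance mapped to a [0, 1] similarity.
    d = np.linalg.norm(nodes[:, None] - nodes[None, :], axis=-1)
    sim = np.exp(-d)                     # 1 when features coincide
    return nodes, sim, 1.0 - sim
```

Iterating this step lets the similarity edges sharpen into an explicit clustering of support and query nodes.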
Data Efficient and Weakly Supervised Computational Pathology on Whole Slide Images
The rapidly emerging field of computational pathology has the potential to
enable objective diagnosis, therapeutic response prediction and identification
of new morphological features of clinical relevance. However, deep
learning-based computational pathology approaches either require manual
annotation of gigapixel whole slide images (WSIs) in fully-supervised settings
or thousands of WSIs with slide-level labels in a weakly-supervised setting.
Moreover, whole slide level computational pathology methods also suffer from
domain adaptation and interpretability issues. These challenges have prevented
the broad adoption of computational pathology for clinical and research
purposes. Here we present CLAM (clustering-constrained attention multiple
instance learning), an easy-to-use, high-throughput, and interpretable WSI-level
processing and learning method that only requires slide-level labels while
being data efficient, adaptable and capable of handling multi-class subtyping
problems. CLAM is a deep-learning-based weakly-supervised method that uses
attention-based learning to automatically identify sub-regions of high
diagnostic value in order to accurately classify the whole slide, while also
utilizing instance-level clustering over the representative regions identified
to constrain and refine the feature space. In three separate analyses, we
demonstrate the data efficiency and adaptability of CLAM and its superior
performance over standard weakly-supervised classification. We demonstrate that
CLAM models are interpretable and can be used to identify well-known and new
morphological features. We further show that models trained using CLAM are
adaptable to independent test cohorts, cell phone microscopy images, and
biopsies. CLAM is a general-purpose and adaptable method that can be used for a
variety of computational pathology tasks in both clinical and
research settings.
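The attention-pooling core of such weakly-supervised MIL classifiers can be sketched in a few lines. This is a generic attention-based MIL head, not CLAM's full gated-attention and clustering design; the weight matrices and the tanh scoring are illustrative assumptions.

```python
import numpy as np

def softmax(z):
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

def attention_mil(patch_feats, w_att, w_cls):
    """Hypothetical attention-MIL head: score each patch, normalize the
    scores into attention weights, pool patches into one slide-level
    vector, and classify that vector."""
    scores = np.tanh(patch_feats @ w_att)   # one scalar per patch
    alpha = softmax(scores)                 # attention over patches
    slide = alpha @ patch_feats             # weighted slide embedding
    return alpha, slide @ w_cls             # weights and slide logits
```

The attention weights `alpha` are also what makes the model interpretable: high-weight patches mark the sub-regions the classifier found diagnostic.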
ECKO: Ensemble of Clustered Knockoffs for multivariate inference on fMRI data
Continuous improvement in medical imaging techniques allows the acquisition
of higher-resolution images. When these are used in a predictive setting, a
greater number of explanatory variables are potentially related to the
dependent variable (the response). Meanwhile, the number of acquisitions per
experiment remains limited. In such high dimension/small sample size setting,
it is desirable to find the explanatory variables that are truly related to the
response while controlling the rate of false discoveries. To achieve this goal,
novel multivariate inference procedures, such as knockoff inference, have been
proposed recently. However, they require the feature covariance to be
well-defined, which is impossible in high-dimensional settings. In this paper,
we propose a new algorithm, called Ensemble of Clustered Knockoffs, that
selects explanatory variables while controlling the false discovery rate
(FDR), up to a prescribed spatial tolerance. The core idea is that
knockoff-based inference can be applied on groups (clusters) of voxels, which
drastically reduces the problem's dimension; an ensembling step then removes
the dependence on a fixed clustering and stabilizes the results. We benchmark
this algorithm and other FDR-controlling methods on brain imaging datasets and
observe empirical gains in sensitivity, while the false discovery rate is
controlled at the nominal level.
Comment: Accepted to the 26th International Conference on Information Processing in Medical Imaging (IPMI)
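The ensembling step can be caricatured as follows. In this simplified sketch (not the paper's actual aggregation, which operates on knockoff statistics), each run applies group-level inference under a different random clustering and returns a boolean mask of selected voxels; voxels selected in a majority of runs are kept, which removes the dependence on any one fixed clustering.

```python
import numpy as np

def ensemble_clustered_selections(selection_masks, threshold=0.5):
    """Hypothetical ensembling over clusterings: keep a voxel if it was
    selected in more than `threshold` of the per-clustering runs."""
    freq = np.mean(np.stack(selection_masks), axis=0)  # selection frequency
    return freq > threshold
```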
Crowd-Powered Data Mining
Many data mining tasks, such as sentiment analysis and image classification,
cannot be completely addressed by automated processes. Crowdsourcing
is an effective way to harness the human cognitive ability to process these
machine-hard tasks. Thanks to public crowdsourcing platforms, e.g., Amazon
Mechanical Turk and CrowdFlower, we can easily involve hundreds of thousands
of ordinary workers (i.e., the crowd) to address these machine-hard tasks. In
this tutorial, we will survey and synthesize a wide spectrum of existing
studies on crowd-powered data mining. We first give an overview of
crowdsourcing, and then summarize the fundamental techniques, including quality
control, cost control, and latency control, which must be considered in
crowdsourced data mining. Next we review crowd-powered data mining operations,
including classification, clustering, pattern mining, machine learning using
the crowd (including deep learning, transfer learning and semi-supervised
learning) and knowledge discovery. Finally, we discuss emerging challenges
in crowdsourced data mining.
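The simplest quality-control technique the tutorial's "quality control" bucket covers is redundant labeling with majority voting: assign each task to several workers and keep the most frequent answer. A minimal sketch (the dictionary shape is an assumption):

```python
from collections import Counter

def majority_vote(answers):
    """Aggregate redundant worker labels per task by majority vote,
    a basic quality-control step in crowd-powered pipelines.

    `answers` maps a task id to the list of labels workers gave it.
    """
    return {task: Counter(labels).most_common(1)[0][0]
            for task, labels in answers.items()}
```

More refined schemes weight each worker's vote by an estimated accuracy, but the aggregation skeleton stays the same.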
LaserNet: An Efficient Probabilistic 3D Object Detector for Autonomous Driving
In this paper, we present LaserNet, a computationally efficient method for 3D
object detection from LiDAR data for autonomous driving. The efficiency results
from processing LiDAR data in the native range view of the sensor, where the
input data is naturally compact. Operating in the range view involves
well-known challenges for learning, including occlusion and scale variation, but it
also provides contextual information based on how the sensor data was captured.
Our approach uses a fully convolutional network to predict a multimodal
distribution over 3D boxes for each point and then it efficiently fuses these
distributions to generate a prediction for each object. Experiments show that
modeling each detection as a distribution rather than a single deterministic
box leads to better overall detection performance. Benchmark results show that
this approach has significantly lower runtime than other recent detectors and
that it achieves state-of-the-art performance when compared on a large dataset
that has enough data to overcome the challenges of training on the range view.
Comment: Accepted for publication at CVPR 201
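Fusing per-point box distributions into one object-level prediction can be illustrated with a standard product-of-Gaussians rule: if each point predicts a Gaussian over box parameters, the fused estimate is the inverse-variance-weighted mean. This is a generic fusion sketch under that Gaussian assumption, not the paper's exact mean-shift-based procedure.

```python
import numpy as np

def fuse_box_distributions(means, variances):
    """Fuse per-point Gaussian box predictions for one object by
    inverse-variance weighting: confident (low-variance) points
    contribute more, and the fused variance shrinks accordingly."""
    w = 1.0 / variances                       # precision of each prediction
    fused_mean = (w * means).sum(0) / w.sum(0)
    fused_var = 1.0 / w.sum(0)
    return fused_mean, fused_var
```

Two equally confident points predicting centers 1 and 3 fuse to center 2 with half the variance of either, which is why distribution-aware fusion beats averaging deterministic boxes.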
Learning to Identify High Betweenness Centrality Nodes from Scratch: A Novel Graph Neural Network Approach
Betweenness centrality (BC) is one of the most used centrality measures for
network analysis, which seeks to describe the importance of nodes in a network
in terms of the fraction of shortest paths that pass through them. It is key to
many valuable applications, including community detection and network
dismantling. Computing BC scores on large networks is computationally
challenging due to high time complexity. Many approximation algorithms have
been proposed to speed up the estimation of BC, which are mainly
sampling-based. However, these methods still incur considerable
execution time on large-scale networks, and their estimates often degrade
when small changes happen to the network structure. In this paper, we focus on
identifying nodes with high BC in a graph, since many application scenarios are
built upon retrieving nodes with top-k BC. Different from previous heuristic
methods, we turn this task into a learning problem and design an
encoder-decoder based framework to resolve the problem. More specifically, the
encoder leverages the network structure to encode each node into an embedding
vector, which captures the important structural information of the node. The
decoder transforms the embedding vector for each node into a scalar, which
captures the relative rank of this node in terms of BC. We use the pairwise
ranking loss to train the model to identify the orders of nodes regarding their
BC. By training on small-scale networks, the learned model is capable of
assigning relative BC scores to nodes for any unseen networks, and thus
identifying the highly-ranked nodes. Comprehensive experiments on both
synthetic and real-world networks demonstrate that, compared to representative
baselines, our model drastically speeds up the prediction without noticeable
sacrifice in accuracy, and outperforms the state-of-the-art in accuracy on
several large real-world networks.
Comment: 10 pages, 4 figures, 8 tables
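The pairwise ranking loss described above only needs the model to order nodes correctly, not to regress exact BC values. A minimal logistic formulation (the exact loss used in the paper may differ; names here are hypothetical):

```python
import numpy as np

def pairwise_ranking_loss(scores, bc_true, pairs):
    """Hypothetical pairwise ranking loss: for each node pair (i, j),
    the predicted score difference should agree in sign with the true
    betweenness-centrality difference (logistic penalty on the margin)."""
    loss = 0.0
    for i, j in pairs:
        sign = 1.0 if bc_true[i] > bc_true[j] else -1.0
        margin = sign * (scores[i] - scores[j])
        loss += np.log1p(np.exp(-margin))   # small when ordering is correct
    return loss / len(pairs)
```

Because only relative order matters, a model trained this way on small graphs can still rank the top-k nodes of much larger unseen graphs.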
Unsupervised Submodular Rank Aggregation on Score-based Permutations
Unsupervised rank aggregation on score-based permutations, which is widely
used in many applications, has not been deeply explored yet. This work studies
the use of submodular optimization for rank aggregation on score-based
permutations in an unsupervised way. Specifically, we propose an unsupervised
approach based on the Lovász-Bregman divergence for setting up linear
structured convex and nested structured concave objective functions. In
addition, stochastic optimization methods are applied during training, and
efficient algorithms for inference are guaranteed. The experimental
results from Information Retrieval, Combining Distributed Neural Networks,
Influencers in Social Networks, and Distributed Automatic Speech Recognition
tasks demonstrate the effectiveness of the proposed methods.
Semantic Instance Segmentation with a Discriminative Loss Function
Semantic instance segmentation remains a challenging task. In this work we
propose to tackle the problem with a discriminative loss function, operating at
the pixel level, that encourages a convolutional network to produce a
representation of the image that can easily be clustered into instances with a
simple post-processing step. The loss function encourages the network to map
each pixel to a point in feature space so that pixels belonging to the same
instance lie close together while different instances are separated by a wide
margin. Our approach of combining an off-the-shelf network with a principled
loss function inspired by a metric learning objective is conceptually simple
and distinct from recent efforts in instance segmentation. In contrast to
previous works, our method does not rely on object proposals or recurrent
mechanisms. A key contribution of our work is to demonstrate that such a simple
setup without bells and whistles is effective and can perform on par with more
complex methods. Moreover, we show that it does not suffer from some of the
limitations of the popular detect-and-segment approaches. We achieve
competitive performance on the Cityscapes and CVPPP leaf segmentation
benchmarks.
Comment: Published at the "Deep Learning for Robotic Vision" workshop at CVPR 201
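The pull/push structure of the discriminative loss is concrete enough to sketch directly: a hinged variance term pulls each pixel embedding toward its instance mean, and a hinged distance term pushes instance means apart. The margins `delta_v` and `delta_d` below are illustrative values, and the paper's regularization term is omitted.

```python
import numpy as np

def discriminative_loss(embeddings, labels, delta_v=0.5, delta_d=1.5):
    """Pull (variance) and push (distance) terms of a discriminative
    loss: pixels closer than delta_v to their instance mean incur no
    pull cost, and means further than 2*delta_d apart incur no push cost."""
    ids = np.unique(labels)
    means = np.stack([embeddings[labels == c].mean(0) for c in ids])
    # Pull: hinged distance of each pixel to its own instance mean.
    pull = 0.0
    for k, c in enumerate(ids):
        d = np.linalg.norm(embeddings[labels == c] - means[k], axis=1)
        pull += (np.maximum(d - delta_v, 0.0) ** 2).mean()
    pull /= len(ids)
    # Push: hinged distance between every pair of instance means.
    push, n_pairs = 0.0, 0
    for a in range(len(ids)):
        for b in range(a + 1, len(ids)):
            d = np.linalg.norm(means[a] - means[b])
            push += np.maximum(2 * delta_d - d, 0.0) ** 2
            n_pairs += 1
    return pull + push / max(n_pairs, 1)
```

Once trained, instances can be recovered from the embedding by any simple clustering step, which is exactly the cheap post-processing the abstract mentions.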
Multi-view X-ray R-CNN
Motivated by the detection of prohibited objects in carry-on luggage as a
part of avionic security screening, we develop a CNN-based object detection
approach for multi-view X-ray image data. Our contributions are two-fold.
First, we introduce a novel multi-view pooling layer to perform a 3D
aggregation of 2D CNN-features extracted from each view. To that end, our
pooling layer exploits the known geometry of the imaging system to ensure
geometric consistency of the feature aggregation. Second, we introduce an
end-to-end trainable multi-view detection pipeline based on Faster R-CNN, which
derives the region proposals and performs the final classification in 3D using
these aggregated multi-view features. Our approach yields significant accuracy
gains over single-view detection while being more efficient than
performing single-view detection in each view.
Comment: To appear at the 40th German Conference on Pattern Recognition (GCPR) 201
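The multi-view pooling layer can be caricatured as a scatter-and-reduce over a shared 3D grid. In this simplified sketch the known imaging geometry is abstracted into precomputed index maps from each view's cells to grid voxels, and overlapping contributions are max-pooled; the real layer's projection and reduction choices are assumptions here.

```python
import numpy as np

def multiview_max_pool(view_feats, view_to_voxel, n_voxels):
    """Hypothetical multi-view pooling: scatter each view's 2D features
    into a shared 3D grid via a per-view cell-to-voxel index map, then
    max-pool where views overlap."""
    n_channels = view_feats[0].shape[1]
    grid = np.full((n_voxels, n_channels), -np.inf)
    for feats, idx in zip(view_feats, view_to_voxel):
        np.maximum.at(grid, idx, feats)   # in-place scatter-max per voxel
    grid[np.isinf(grid)] = 0.0            # voxels seen by no view
    return grid
```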
RelNN: A Deep Neural Model for Relational Learning
Statistical relational AI (StarAI) aims at reasoning and learning in noisy
domains described in terms of objects and relationships by combining
probability with first-order logic. With the huge advances in deep learning in
recent years, combining deep networks with first-order logic has been the
focus of several recent studies. Many of the existing attempts, however, only
focus on relations and ignore object properties. The attempts that do consider
object properties are limited in terms of modelling power or scalability. In
this paper, we develop relational neural networks (RelNNs) by adding hidden
layers to relational logistic regression (the relational counterpart of
logistic regression). We learn latent properties for objects both directly and
through general rules. Back-propagation is used for training these models. A
modular, layer-wise architecture facilitates applying techniques developed in
the deep learning community to our models. Initial experiments on
eight tasks over three real-world datasets show that RelNNs are promising
models for relational learning.
Comment: 9 pages, 8 figures, accepted at AAAI-201