Deep and Wide Multiscale Recursive Networks for Robust Image Labeling
Feedforward multilayer networks trained by supervised learning have recently
demonstrated state of the art performance on image labeling problems such as
boundary prediction and scene parsing. As even very low error rates can limit
practical usage of such systems, methods that perform closer to human accuracy
remain desirable. In this work, we propose a new type of network with the
following properties that address what we hypothesize to be limiting aspects of
existing methods: (1) a "wide" structure with thousands of features, (2) a
large field of view, (3) recursive iterations that exploit statistical
dependencies in label space, and (4) a parallelizable architecture that can be
trained in a fraction of the time compared to benchmark multilayer
convolutional networks. For the specific image labeling problem of boundary
prediction, we also introduce a novel example weighting algorithm that improves
segmentation accuracy. Experiments in the challenging domain of connectomic
reconstruction of neural circuitry from 3D electron microscopy data show that
these "Deep And Wide Multiscale Recursive" (DAWMR) networks lead to new levels
of image labeling performance. The highest performing architecture has twelve
layers, interwoven supervised and unsupervised stages, and uses an input field
of view of 157,464 voxels to make a prediction at each image location.
We present an associated open-source software package that enables the simple
and flexible creation of DAWMR networks.
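The recursive iterations of property (3) can be illustrated with a toy sketch, not the authors' implementation: predictions from one pass are fed back as label-context features for the next pass, so later passes exploit statistical dependencies in label space. The classifier, mixing weights, and 1-D "image" below are all illustrative assumptions.

```python
def base_predict(pixel_features):
    # Stand-in "wide" classifier: threshold a single scalar feature.
    return [1 if f > 0.5 else 0 for f in pixel_features]

def recursive_labeling(pixel_features, neighbors, n_iters=3):
    """Each iteration re-predicts a pixel from its raw feature plus the
    mean of its neighbors' current labels (a crude label-space prior)."""
    labels = base_predict(pixel_features)
    for _ in range(n_iters):
        new_labels = []
        for i, f in enumerate(pixel_features):
            nb = neighbors[i]
            context = sum(labels[j] for j in nb) / len(nb) if nb else 0.0
            # Combine raw evidence with label context from the last pass.
            new_labels.append(1 if 0.7 * f + 0.3 * context > 0.5 else 0)
        labels = new_labels
    return labels

# A noisy 1-D "image": the pixel at index 2 disagrees with its neighborhood.
feats = [0.9, 0.9, 0.4, 0.9, 0.1, 0.1]
nbrs = [[1], [0, 2], [1, 3], [2, 4], [3, 5], [4]]
print(recursive_labeling(feats, nbrs))  # the isolated disagreement is smoothed away
```

The recursion leaves the confident pixels untouched while the label context flips the ambiguous one, which is the kind of label-space dependency the abstract refers to.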
See the Tree Through the Lines: The Shazoo Algorithm -- Full Version --
Predicting the nodes of a given graph is a fascinating theoretical problem
with applications in several domains. Since graph sparsification via spanning
trees retains enough information while making the task much easier, trees are
an important special case of this problem. Although it is known how to predict
the nodes of an unweighted tree in a nearly optimal way, in the weighted case a
fully satisfactory algorithm is not available yet. We fill this gap and
introduce an efficient node predictor, Shazoo, which is nearly optimal on any
weighted tree. Moreover, we show that Shazoo can be viewed as a common
nontrivial generalization of both previous approaches for unweighted trees and
weighted lines. Experiments on real-world datasets confirm that Shazoo performs
well in that it fully exploits the structure of the input tree, and gets very
close to (and sometimes better than) less scalable energy minimization methods.
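The abstract does not spell out the algorithm, so the following is only a sketch of the setting, not Shazoo itself: online node prediction on a weighted tree, using the simple nearest-revealed-label rule that Shazoo refines. All names and the example tree are illustrative.

```python
import heapq

def tree_distances(adj, src):
    """Dijkstra over a weighted tree given as {node: [(nbr, weight), ...]}."""
    dist = {src: 0.0}
    heap = [(0.0, src)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue
        for v, w in adj[u]:
            if d + w < dist.get(v, float("inf")):
                dist[v] = d + w
                heapq.heappush(heap, (d + w, v))
    return dist

def predict_node(adj, revealed, query):
    """Predict the query node's label as that of the nearest labeled node."""
    dist = tree_distances(adj, query)
    nearest = min(revealed, key=lambda node: dist[node])
    return revealed[nearest]

# A weighted path 0 -1.0- 1 -0.5- 2 -0.5- 3 with labels revealed at the ends.
adj = {0: [(1, 1.0)], 1: [(0, 1.0), (2, 0.5)],
       2: [(1, 0.5), (3, 0.5)], 3: [(2, 0.5)]}
print(predict_node(adj, {0: 1, 3: -1}, 2))  # node 3 is closer, so predicts -1
```

On a weighted tree the edge weights matter, which is exactly where this naive rule stops being optimal and Shazoo's analysis comes in.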
Low congestion online routing and an improved mistake bound for online prediction of graph labeling
In this paper, we show a connection between a certain online low-congestion
routing problem and an online prediction of graph labeling. More specifically,
we prove that if there exists a routing scheme guaranteeing low congestion on
any edge, then there exists an online prediction algorithm whose mistake bound
is that congestion times the cut size, where the cut size is the size of the
cut induced by the label partitioning of the graph vertices. Combined with the
previously known congestion bound for the routing problem on trees, we obtain
an improved prediction algorithm for graphs with high effective resistance.
In contrast to previous approaches, which move the graph problem into vector
space using the graph Laplacian and rely on the analysis of the perceptron
algorithm, our proofs are purely combinatorial. Furthermore, our approach
directly generalizes to the case where labels are not binary.
Comment: 5 pages
An Improved Phrase-based Approach to Annotating and Summarizing Student Course Responses
Teaching large classes remains a great challenge, primarily because it is
difficult to attend to all the student needs in a timely manner. Automatic text
summarization systems can be leveraged to summarize the student feedback,
submitted immediately after each lecture, but it remains unclear what
makes a good summary of student responses. In this work we explore a new
methodology that effectively extracts summary phrases from the student
responses. Each phrase is tagged with the number of students who raise the
issue. The phrases are evaluated along two dimensions: with respect to text
content, they should be informative and well-formed, measured by the ROUGE
metric; additionally, they should address the most pressing student needs,
measured by a newly proposed metric. This work is enabled by a phrase-based
annotation and highlighting scheme, which is new to the summarization task. The
phrase-based framework allows us to summarize the student responses into a set
of bullet points and present them to the instructor promptly.
Comment: 11 pages
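The core bookkeeping, tagging each phrase with the number of students who raise it, can be sketched in a few lines. This is illustrative only: the paper uses syntactic phrase extraction and ROUGE-based evaluation, whereas here simple bigrams stand in for candidate phrases.

```python
from collections import Counter

def summarize(responses, top_k=1):
    """Tag each candidate phrase (here: a bigram) with the number of
    distinct students who raise it, and return the best-supported ones."""
    support = Counter()
    for resp in responses:
        words = resp.lower().split()
        bigrams = {" ".join(pair) for pair in zip(words, words[1:])}
        for phrase in bigrams:          # each student counts at most once
            support[phrase] += 1
    return support.most_common(top_k)

responses = [
    "more examples please",
    "the pace was too fast",
    "too fast and not enough examples",
]
print(summarize(responses))  # [('too fast', 2)]: raised by two students
```

The per-phrase support count is what turns a flat list of extracted phrases into prioritized bullet points for the instructor.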
Textual Analogy Parsing: What's Shared and What's Compared among Analogous Facts
To understand a sentence like "whereas only 10% of White Americans live at or
below the poverty line, 28% of African Americans do" it is important not only
to identify individual facts, e.g., poverty rates of distinct demographic
groups, but also the higher-order relations between them, e.g., the disparity
between them. In this paper, we propose the task of Textual Analogy Parsing
(TAP) to model this higher-order meaning. The output of TAP is a frame-style
meaning representation which explicitly specifies what is shared (e.g., poverty
rates) and what is compared (e.g., White Americans vs. African Americans, 10%
vs. 28%) between its component facts. Such a meaning representation can enable
new applications that rely on discourse understanding such as automated chart
generation from quantitative text. We present a new dataset for TAP, baselines,
and a model that successfully uses an ILP to enforce the structural constraints
of the problem.
Comment: 12 pages including appendix and references. To be presented at EMNLP 2018
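A TAP frame can be pictured as a small structured object: one slot for what the analogous facts share and aligned slots for what they compare. The field names below are illustrative, not the paper's exact schema.

```python
from dataclasses import dataclass, field

@dataclass
class AnalogyFrame:
    """Frame-style meaning representation for a textual analogy:
    'shared' holds the common content, 'compared' aligns the values
    that differ across the component facts."""
    shared: dict
    compared: list = field(default_factory=list)

# Encoding the poverty-rate example from the abstract.
frame = AnalogyFrame(
    shared={"quantity": "poverty rate"},
    compared=[
        {"group": "White Americans", "value": "10%"},
        {"group": "African Americans", "value": "28%"},
    ],
)
print(frame.shared["quantity"], len(frame.compared))
```

A downstream application such as chart generation could read the shared slot as the axis label and each compared entry as a bar, which is the discourse-level structure flat fact extraction misses.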
Leveraging Domain Knowledge to Improve Microscopy Image Segmentation with Lifted Multicuts
The throughput of electron microscopes has increased significantly in recent
years, enabling detailed analysis of cell morphology and ultrastructure.
Analysis of neural circuits at single-synapse resolution remains the flagship
target of this technique, but applications to cell and developmental biology
are also starting to emerge at scale. The amount of data acquired in such
studies makes manual instance segmentation, a fundamental step in many analysis
pipelines, impossible. While automatic segmentation approaches have improved
significantly thanks to the adoption of convolutional neural networks, their
accuracy still lags behind human annotations and requires additional manual
proof-reading. A major hindrance to further improvements is the limited field
of view of the segmentation networks preventing them from exploiting the
expected cell morphology or other prior biological knowledge which humans use
to inform their segmentation decisions. In this contribution, we show how such
domain-specific information can be leveraged by expressing it as long-range
interactions in a graph partitioning problem known as the lifted multicut
problem. Using this formulation, we demonstrate significant improvement in
segmentation accuracy for three challenging EM segmentation problems from
neuroscience and cell biology.
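The lifted multicut objective the abstract refers to can be sketched as follows; this is a hedged illustration of the cost function only, not the authors' solver. A partition pays the weight of every edge whose endpoints are separated, and lifted edges extend that penalty or reward to long-range node pairs that share no direct edge.

```python
def multicut_cost(partition, local_edges, lifted_edges):
    """partition: {node: cluster_id}; edges: [(u, v, weight), ...].
    Sums the weight of every edge whose endpoints are separated:
    attractive pairs carry positive weight (cutting them costs),
    repulsive pairs negative weight (cutting them is rewarded)."""
    return sum(w for u, v, w in local_edges + lifted_edges
               if partition[u] != partition[v])

# Four supervoxels in a chain; the lifted edge (0, 3) encodes an
# (illustrative) biological prior that the two ends are different cells.
local = [(0, 1, 2.0), (1, 2, -0.5), (2, 3, 2.0)]
lifted = [(0, 3, -3.0)]
merged = {0: 0, 1: 0, 2: 0, 3: 0}
split = {0: 0, 1: 0, 2: 1, 3: 1}
print(multicut_cost(merged, local, lifted))  # 0: nothing cut
print(multicut_cost(split, local, lifted))   # -3.5: the prior favors splitting
```

Without the lifted edge the split would cost only -0.5 against strong local attraction; the long-range prior is what tips the minimizer toward separating the two cells.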
Situation Recognition with Graph Neural Networks
We address the problem of recognizing situations in images. Given an image,
the task is to predict the most salient verb (action), and fill its semantic
roles such as who is performing the action, what is the source and target of
the action, etc. Different verbs have different roles (e.g. attacking has
weapon), and each role can take on many possible values (nouns). We propose a
model based on Graph Neural Networks that allows us to efficiently capture
joint dependencies between roles using neural networks defined on a graph.
Experiments with different graph connectivities show that our approach that
propagates information between roles significantly outperforms existing work,
as well as multiple baselines. We obtain roughly 3-5% improvement over previous
work in predicting the full situation. We also provide a thorough qualitative
analysis of our model and influence of different roles in the verbs.Comment: ICCV201
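The propagation of information between roles can be illustrated with a toy message-passing round; this stands in for the paper's learned GNN updates and uses made-up role names and vectors.

```python
def propagate(states, edges, alpha=0.5, rounds=2):
    """states: {role: feature list}; edges: set of (role, role) pairs.
    Each round mixes every role's vector with the mean of its
    neighbors' vectors, so roles absorb each other's context."""
    for _ in range(rounds):
        new = {}
        for role, vec in states.items():
            nbrs = [states[v] for u, v in edges if u == role]
            nbrs += [states[u] for u, v in edges if v == role]
            if nbrs:
                msg = [sum(col) / len(nbrs) for col in zip(*nbrs)]
                vec = [(1 - alpha) * x + alpha * m for x, m in zip(vec, msg)]
            new[role] = vec
        states = new
    return states

# Two roles of a verb exchanging information until they agree on context.
roles = {"agent": [1.0, 0.0], "tool": [0.0, 1.0]}
out = propagate(roles, {("agent", "tool")})
print(out["agent"])  # both roles converge to [0.5, 0.5]
```

This mutual smoothing is the mechanism by which, e.g., a predicted weapon role can sharpen the prediction for the agent role.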
On Pairwise Clustering with Side Information
Pairwise clustering, in general, partitions a set of items via a known
similarity function. In our treatment, clustering is modeled as a transductive
prediction problem. Thus rather than beginning with a known similarity
function, the function instead is hidden and the learner only receives a random
sample consisting of a subset of the pairwise similarities. An additional set
of pairwise side-information may be given to the learner, which then determines
the inductive bias of our algorithms. We measure performance not based on the
recovery of the hidden similarity function, but instead on how well we classify
each item. We give tight bounds on the number of misclassifications. We provide
two algorithms. The first, SACA, is a simple agglomerative clustering
algorithm that runs in near-linear time and serves as a baseline for
our analyses. The second, RGCA, enables the incorporation of
side information, which may lead to improved bounds at the cost of a longer
running time.
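The setting, clustering from a random subset of pairwise similarities, can be sketched with a simple threshold-and-merge baseline; this is illustrative only and is not the paper's SACA or RGCA algorithm.

```python
def cluster(n_items, observed, threshold=0.5):
    """observed: {(i, j): similarity} for the sampled pairs only.
    Merge every observed pair whose similarity clears the threshold,
    tracked with a union-find structure."""
    parent = list(range(n_items))
    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]   # path halving
            x = parent[x]
        return x
    for (i, j), sim in observed.items():
        if sim >= threshold:
            parent[find(i)] = find(j)       # merge the two clusters
    return [find(i) for i in range(n_items)]

# Only three of the six pairwise similarities were sampled.
observed = {(0, 1): 0.9, (2, 3): 0.8, (1, 2): 0.1}
labels = cluster(4, observed)
print(labels)  # items 0-1 share a cluster id, as do 2-3
```

Because unobserved pairs contribute nothing, performance hinges on how the sample and any side information constrain the hidden similarity function, which is what the paper's bounds quantify.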
MaxiMin Active Learning in Overparameterized Model Classes
Generating labeled training datasets has become a major bottleneck in Machine
Learning (ML) pipelines. Active ML aims to address this issue by designing
learning algorithms that automatically and adaptively select the most
informative examples for labeling so that human time is not wasted labeling
irrelevant, redundant, or trivial examples. This paper proposes a new approach
to active ML with nonparametric or overparameterized models such as kernel
methods and neural networks. In the context of binary classification, the new
approach is shown to possess a variety of desirable properties that allow
active learning algorithms to automatically and efficiently identify decision
boundaries and data clusters.
Comment: 43 pages, 12 figures
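The active-learning loop the abstract describes can be sketched with the simplest uncertainty rule: query the unlabeled example closest to the decision boundary. This smallest-|score| rule is a stand-in for the paper's maximin criterion, and all names below are illustrative.

```python
def most_uncertain(scores, labeled):
    """scores: model outputs in [-1, 1] for each example; pick the
    unlabeled index whose score is nearest the decision boundary (0),
    so human labeling effort goes to the most informative example."""
    candidates = [i for i in range(len(scores)) if i not in labeled]
    return min(candidates, key=lambda i: abs(scores[i]))

scores = [0.9, -0.05, 0.4, -0.8]
print(most_uncertain(scores, labeled={0}))  # index 1 sits nearest the boundary
```

After each query the model would be refit and the scores recomputed, so the selection rule keeps steering labels toward decision boundaries rather than redundant, confidently classified examples.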
Learning Deep Representations for Scene Labeling with Semantic Context Guided Supervision
Scene labeling is a challenging classification problem where each input image
requires a pixel-level prediction map. Recently, deep-learning-based methods
have shown their effectiveness on solving this problem. However, we argue that
the large intra-class variation provides ambiguous training information and
hinders the deep models' ability to learn more discriminative deep feature
representations. Unlike existing methods that mainly utilize semantic context
for regularizing or smoothing the prediction map, we design novel supervisions
from semantic context for learning better deep feature representations. Two
types of semantic context, scene names of images and label map statistics of
image patches, are exploited to create label hierarchies between the original
classes and newly created subclasses as the learning supervisions. Such
subclasses show lower intra-class variation, and help CNN detect more
meaningful visual patterns and learn more effective deep features. Novel
training strategies and a network structure that take advantage of such label
hierarchies are introduced. Our proposed method is evaluated extensively on
four popular datasets, Stanford Background (8 classes), SIFTFlow (33 classes),
Barcelona (170 classes), and LM+Sun (232 classes) datasets, with 3 different
network structures, and shows state-of-the-art performance. The experiments
show that our proposed method makes deep models learn more discriminative
feature representations without increasing model size or complexity.
Comment: 13 pages
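The subclass construction from label-map statistics can be pictured with a toy rule: refine a patch's class by the dominant label in its surroundings, so subclasses like "sky over water" have lower intra-class variation. The bucketing rule and class names are illustrative, not the paper's exact procedure.

```python
from collections import Counter

def assign_subclass(patch_class, context_labels):
    """Refine the original class by the dominant label in the patch's
    surrounding label map, creating a context-specific subclass that
    serves as a finer-grained training target."""
    dominant = Counter(context_labels).most_common(1)[0][0]
    return f"{patch_class}/{dominant}"

print(assign_subclass("sky", ["water", "water", "tree"]))   # sky/water
print(assign_subclass("sky", ["building", "building"]))     # sky/building
```

Training the network to distinguish these subclasses, then collapsing them back to the original classes at test time, is what pushes the features to be more discriminative without enlarging the model.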