GOGGLES: Automatic Image Labeling with Affinity Coding
Generating large labeled training data is becoming the biggest bottleneck in
building and deploying supervised machine learning models. Recently, the data
programming paradigm has been proposed to reduce the human cost in labeling
training data. However, data programming relies on designing labeling functions
which still requires significant domain expertise. Also, it is prohibitively
difficult to write labeling functions for image datasets as it is hard to
express domain knowledge using raw features for images (pixels).
We propose affinity coding, a new domain-agnostic paradigm for automated
training data labeling. The core premise of affinity coding is that the
affinity scores of instance pairs belonging to the same class on average should
be higher than those of pairs belonging to different classes, according to some
affinity functions. We build the GOGGLES system that implements affinity coding
for labeling image datasets by designing a novel set of reusable affinity
functions for images, and propose a novel hierarchical generative model for
class inference using a small development set.
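The core premise of affinity coding can be checked on toy data. The sketch below is only an illustration, not the GOGGLES affinity functions or its generative model: it assumes cosine similarity as a stand-in affinity function, and the names `cosine_affinity`, `mean_affinities`, `features`, and `labels` are hypothetical.

```python
import math
from itertools import combinations


def cosine_affinity(u, v):
    """Stand-in affinity function (an assumption): cosine similarity of two vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm


def mean_affinities(features, labels):
    """Return (mean same-class affinity, mean different-class affinity) over all pairs."""
    same, diff = [], []
    for i, j in combinations(range(len(features)), 2):
        score = cosine_affinity(features[i], features[j])
        (same if labels[i] == labels[j] else diff).append(score)
    return sum(same) / len(same), sum(diff) / len(diff)


# Hypothetical toy feature vectors: two instances per class.
features = [[1.0, 0.1], [0.9, 0.2], [0.1, 1.0], [0.2, 0.9]]
labels = [0, 0, 1, 1]

same_mean, diff_mean = mean_affinities(features, labels)
print(same_mean > diff_mean)
```

When the premise holds for an affinity function, same-class pairs score higher on average, which is the signal a class-inference step can exploit.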
We compare GOGGLES with existing data programming systems on 5 image labeling
tasks from diverse domains. GOGGLES achieves labeling accuracies ranging from a
minimum of 71% to a maximum of 98% without requiring any extensive human
annotation. In terms of end-to-end performance, GOGGLES outperforms the
state-of-the-art data programming system Snuba by 21% and a state-of-the-art
few-shot learning technique by 5%, and is only 7% away from the fully
supervised upper bound.
Comment: Published at 2020 ACM SIGMOD International Conference on Management
of Dat
Machine learning methods for histopathological image analysis
Abundant accumulation of digital histopathological images has led to the
increased demand for their analysis, such as computer-aided diagnosis using
machine learning techniques. However, digital pathological images and related
tasks have some issues to be considered. In this mini-review, we introduce the
application of digital pathological image analysis using machine learning
algorithms, address some problems specific to such analysis, and propose
possible solutions.
Comment: 23 pages, 4 figure
Guided Proofreading of Automatic Segmentations for Connectomics
Automatic cell image segmentation methods in connectomics produce merge and
split errors, which require correction through proofreading. Previous research
has identified the visual search for these errors as the bottleneck in
interactive proofreading. To aid error correction, we develop two classifiers
that automatically recommend candidate merges and splits to the user. These
classifiers use a convolutional neural network (CNN) that has been trained with
errors in automatic segmentations against expert-labeled ground truth. Our
classifiers detect potentially-erroneous regions by considering a large context
region around a segmentation boundary. Corrections can then be performed by a
user with yes/no decisions, which reduces variation of information 7.5x faster
than previous proofreading methods. We also present a fully-automatic mode that
uses a probability threshold to make merge/split decisions. Extensive
experiments using the automatic approach and comparing performance of novice
and expert users demonstrate that our method performs favorably against
state-of-the-art proofreading methods on different connectomics datasets.
Comment: Supplemental material available at
http://rhoana.org/guidedproofreading/supplemental.pd
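The abstract above reports proofreading progress in terms of variation of information (VI) between a segmentation and the ground truth. As a rough sketch, VI can be computed from two per-pixel label lists using its standard definition, VI = H(A|B) + H(B|A). The function name, the flat label lists, and the use of the natural logarithm are assumptions for illustration; the paper's own evaluation code is not shown in this listing.

```python
import math
from collections import Counter


def variation_of_information(seg_a, seg_b):
    """VI = H(A|B) + H(B|A) for two labelings of the same elements (natural log)."""
    n = len(seg_a)
    count_a = Counter(seg_a)
    count_b = Counter(seg_b)
    count_ab = Counter(zip(seg_a, seg_b))

    vi = 0.0
    for (a, b), n_ab in count_ab.items():
        p_ab = n_ab / n
        p_a = count_a[a] / n
        p_b = count_b[b] / n
        # Each joint cell contributes to both conditional entropies.
        vi -= p_ab * (math.log(p_ab / p_a) + math.log(p_ab / p_b))
    return vi


# Hypothetical example: ground truth has two cells, the segmentation merged them.
ground_truth = [0, 0, 1, 1]
merged = [7, 7, 7, 7]

vi_identical = variation_of_information(ground_truth, ground_truth)
vi_merged = variation_of_information(ground_truth, merged)
print(vi_identical, vi_merged)
```

Lower VI means the segmentation agrees more closely with the ground truth, so a correction that fixes a merge or split error should reduce it.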