468 research outputs found
Kernel functions based on triplet comparisons
Given only information in the form of similarity triplets "Object A is more
similar to object B than to object C" about a data set, we propose two ways of
defining a kernel function on the data set. While previous approaches construct
a low-dimensional Euclidean embedding of the data set that reflects the given
similarity triplets, we aim at defining kernel functions that correspond to
high-dimensional embeddings. These kernel functions can subsequently be used to
apply any kernel method to the data set.
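As a rough illustration of how triplets can induce a high-dimensional feature map, the sketch below gives each object one coordinate per unordered pair of objects and fills it with +1 or -1 according to the observed triplets, then takes normalized inner products. This construction, the function name `triplet_kernel`, and the integer object indices are assumptions made for illustration; the abstract does not state the two kernels the authors actually define.

```python
import math


def triplet_kernel(n_objects, triplets):
    """Build a kernel matrix from triplets (a, b, c), each read as
    'object a is more similar to object b than to object c'.

    Illustrative construction only (not taken from the abstract).
    """
    # One feature coordinate per unordered pair {b, c}, stored with b < c.
    pairs = [(b, c) for b in range(n_objects) for c in range(b + 1, n_objects)]
    index = {pair: k for k, pair in enumerate(pairs)}
    features = [[0.0] * len(pairs) for _ in range(n_objects)]

    for a, b, c in triplets:
        if b == c:
            continue  # a comparison of an object with itself carries no information
        if b < c:
            features[a][index[(b, c)]] = 1.0   # a prefers the smaller index
        else:
            features[a][index[(c, b)]] = -1.0  # a prefers the larger index

    # Normalize each feature vector so that k(x, x) = 1 whenever x has triplets.
    for row in features:
        norm = math.sqrt(sum(v * v for v in row))
        if norm > 0:
            for k in range(len(row)):
                row[k] /= norm

    # Kernel matrix of pairwise inner products (positive semidefinite by construction).
    return [[sum(x * y for x, y in zip(u, v)) for v in features] for u in features]


# Objects 0 and 1 agree about the pair (2, 3); objects 2 and 3 disagree about (0, 1).
K = triplet_kernel(4, [(0, 2, 3), (1, 2, 3), (2, 0, 1), (3, 1, 0)])
```

Because the matrix is a Gram matrix of explicit feature vectors, it can be passed to any kernel method that accepts a precomputed kernel.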
Towards solving computer vision problems: datasets, labels, algorithms, and applications
The solution to a supervised computer vision problem consists of an application, algorithm, input data, and a set of human generated labels. Solving these kinds of tasks involves collecting large quantities of data, collecting appropriate labels, and developing machine vision algorithms tailored to the application. Progress on these problems has often benefited from large scale datasets with high fidelity labels. Successful algorithms display a synergy between application goals and the size and quality of the dataset. This thesis presents work highlighting the importance of each component of a supervised vision task. First, the problem of automatically classifying groups of people into social categories is introduced. This problem is called Urban Tribe Classification. To tackle this problem, each individual and the entire group of individuals are modeled. Since this was a newly introduced computer vision problem, a dataset for this task was created. On this dataset, the combined representation of group and individuals outperforms using only the person representations. This model showed promising results for automatic subculture classification. Second, the problem of creating perceptual embeddings based on human similarity judgements is tackled. This work focuses on triplet similarity comparisons of the form ``Is object more similar to or ?'', which have been useful for computer vision and machine learning applications. Unfortunately, triplet similarity comparisons, like many human labeling efforts, can be prohibitively expensive. This work proposes two techniques for dealing with this obstacle. First, an alternative display for collecting triplets is designed. This display shows a probe image and a grid of query images, allowing the user to collect multiple triplets simultaneously. The display is shown to reduce the cost and time of triplet collection. In addition, higher quality embeddings are created with the improved triplet collection UI.
A 10,000-food item dataset of human taste similarity was created using this UI. Second, ``SNaCK,'' a low-dimensional perceptual embedding algorithm that combines human expertise with automatic machine kernels, is introduced. Both parts are complementary: human insight can capture relationships that are not apparent from the object's visual similarity and the machine can help relieve the human from having to exhaustively specify many constraints. Finally, the precise localization of key frames of an action is explored. This work focuses on detecting the exact starting frame of a behavior, an important task for neuroscience research. To address this problem, a loss is designed to penalize extra and missed action start detections over small misalignments. Recurrent neural networks (RNN) are trained to optimize this loss. The model is shown to reduce the number of false positives, an important criterion defined by the neuroscientist. The performance of the model is evaluated on a new dataset, the Mouse Reach Dataset, a large, annotated video dataset of mice performing a sequence of actions. The dataset was created for neuroscience research. On this dataset, the proposed model outperforms related approaches and baseline methods using an unstructured loss.
Crowdsourcing subjective annotations using pairwise comparisons reduces bias and error compared to the majority-vote method
How to better reduce measurement variability and bias introduced by
subjectivity in crowdsourced labelling remains an open question. We introduce a
theoretical framework for understanding how random error and measurement bias
enter into crowdsourced annotations of subjective constructs. We then propose a
pipeline that combines pairwise comparison labelling with Elo scoring, and
demonstrate that it outperforms the ubiquitous majority-voting method in
reducing both types of measurement error. To assess the performance of the
labelling approaches, we constructed an agent-based model of crowdsourced
labelling that lets us introduce different types of subjectivity into the
tasks. We find that under most conditions with task subjectivity, the
comparison approach produced higher scores. Further, the comparison
approach is less susceptible to inflating bias, which majority voting tends to
do. To facilitate applications, we show with simulated and real-world data that
the number of required random comparisons for the same classification accuracy
scales log-linearly with the number of labelled items. We also
implemented the Elo system as an open-source Python package.
Comment: Accepted for publication at ACM CSCW 202
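To make the idea of scoring items from pairwise comparisons concrete, here is a minimal sketch of the standard Elo update applied to a list of (winner, loser) judgements. It is not the authors' package: the function names, the starting rating of 1500, the K-factor of 32, and the scale of 400 are conventional chess defaults assumed here, and the abstract does not say which values the paper uses.

```python
def expected_score(rating_a, rating_b, scale=400.0):
    """Probability that A is preferred over B under the Elo logistic model."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / scale))


def elo_ratings(items, comparisons, k=32.0, initial=1500.0):
    """Score items from pairwise judgements.

    comparisons: iterable of (winner, loser) pairs, e.g. one per annotator
    decision. All parameter defaults are assumptions, not values from the paper.
    """
    ratings = {item: initial for item in items}
    for winner, loser in comparisons:
        # The update is proportional to how surprising the outcome was.
        surprise = 1.0 - expected_score(ratings[winner], ratings[loser])
        ratings[winner] += k * surprise
        ratings[loser] -= k * surprise
    return ratings


# Three items and three annotator judgements: a > b, b > c, a > c.
ratings = elo_ratings(["a", "b", "c"], [("a", "b"), ("b", "c"), ("a", "c")])
ranking = sorted(ratings, key=ratings.get, reverse=True)
```

Each update moves the same number of points from the loser to the winner, so the total rating stays constant and the result can be read as a relative ordering of the items.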