KCRC-LCD: Discriminative Kernel Collaborative Representation with Locality Constrained Dictionary for Visual Categorization
We consider the image classification problem via kernel collaborative
representation classification with locality constrained dictionary (KCRC-LCD).
Specifically, we propose a kernel collaborative representation classification
(KCRC) approach in which the kernel method is used to improve the discrimination
ability of collaborative representation classification (CRC). We then measure
the similarities between the query and atoms in the global dictionary in order
to construct a locality constrained dictionary (LCD) for KCRC. In addition, we
discuss several similarity measure approaches in LCD and further present a
simple yet effective unified similarity measure whose superiority is validated
in experiments. There are several appealing aspects associated with LCD. First,
LCD can be nicely incorporated under the framework of KCRC. The LCD similarity
measure can be kernelized under KCRC, which theoretically links CRC and LCD
under the kernel method. Second, KCRC-LCD becomes more scalable to both the
training set size and the feature dimension. An example shows that KCRC is able to
perfectly classify data with a certain distribution, while conventional CRC fails
completely. Comprehensive experiments on many public datasets also show that
KCRC-LCD is a robust discriminative classifier with both excellent performance
and good scalability, performing comparably to or better than many other
state-of-the-art approaches.
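The abstract gives no formulas, so the following is only a minimal sketch of the general idea, under several assumptions that are not taken from the paper: a Gaussian (RBF) kernel, the standard ridge-regularised form of collaborative representation, and the kernel-induced distance as the locality measure (the paper's unified similarity measure is not reproduced here). The names `rbf`, `kcrc_lcd_predict`, and the parameters `k`, `lam`, and `gamma` are hypothetical.

```python
import numpy as np

def rbf(A, B, gamma=1.0):
    """Gaussian kernel matrix between the rows of A and the rows of B."""
    sq = (A ** 2).sum(1)[:, None] + (B ** 2).sum(1)[None, :] - 2.0 * A @ B.T
    return np.exp(-gamma * np.maximum(sq, 0.0))

def kcrc_lcd_predict(D, labels, y, k=10, lam=1e-3, gamma=1.0):
    """Classify query y with the k dictionary atoms closest to it in kernel space."""
    k_dy = rbf(D, y[None, :], gamma)[:, 0]
    # Kernel-induced squared distance k(d,d) - 2k(d,y) + k(y,y); k(x,x) = 1 for RBF.
    dist = 2.0 - 2.0 * k_dy
    idx = np.argsort(dist)[:k]          # locality constrained dictionary (assumed rule)
    D_loc, l_loc, k_loc = D[idx], labels[idx], k_dy[idx]
    K = rbf(D_loc, D_loc, gamma)
    # Ridge-regularised coding in feature space: (K + lam*I) alpha = k(D_loc, y).
    alpha = np.linalg.solve(K + lam * np.eye(len(idx)), k_loc)
    best, best_res = None, np.inf
    for c in np.unique(l_loc):
        m = l_loc == c
        a = alpha[m]
        # Squared feature-space residual using only the atoms of class c.
        res = 1.0 - 2.0 * a @ k_loc[m] + a @ K[np.ix_(m, m)] @ a
        if res < best_res:
            best, best_res = c, res
    return best

# Tiny illustrative dictionary: two well-separated classes in 2D.
D = np.array([[0, 0], [0.2, 0], [0, 0.2], [5, 5], [5.2, 5], [5, 5.2]], dtype=float)
labels = np.array([0, 0, 0, 1, 1, 1])
print(kcrc_lcd_predict(D, labels, np.array([0.1, 0.1]), k=4))
```

Restricting the linear solve to the `k` selected atoms is what keeps the cost independent of the full dictionary size in this sketch; how the paper achieves its scalability is not stated in the abstract.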
Grounding semantics in robots for Visual Question Answering
In this thesis I describe an operational implementation of an object detection and description system that is incorporated into an end-to-end Visual Question Answering system, and I evaluate it on two visual question answering datasets for compositional language and elementary visual reasoning.
Learning Discriminative Stein Kernel for SPD Matrices and Its Applications
The Stein kernel has recently shown promising performance on classifying images
represented by symmetric positive definite (SPD) matrices. It evaluates the
similarity between two SPD matrices through their eigenvalues. In this paper,
we argue that directly using the original eigenvalues may be problematic
because: i) Eigenvalue estimation becomes biased when the number of samples is
inadequate, which may lead to unreliable kernel evaluation; ii) More
importantly, eigenvalues only reflect the property of an individual SPD matrix.
They are not necessarily optimal for computing the Stein kernel when the goal is to
discriminate different sets of SPD matrices. To address the two issues in one
shot, we propose a discriminative Stein kernel, in which an extra parameter
vector is defined to adjust the eigenvalues of the input SPD matrices. The
optimal parameter values are sought by optimizing a proxy of classification
performance. To show the generality of the proposed method, three different
kernel learning criteria that are commonly used in the literature are employed
respectively as a proxy. A comprehensive experimental study is conducted on a
variety of image classification tasks to compare our proposed discriminative
Stein kernel with the original Stein kernel and other commonly used methods for
evaluating the similarity between SPD matrices. The experimental results
demonstrate that the discriminative Stein kernel can attain greater
discrimination and better align with classification tasks by altering the
eigenvalues. As a result, it achieves higher classification performance than the
original Stein kernel and other commonly used methods.
Comment: 13 pages
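As a rough illustration, the sketch below assumes the standard S-divergence form of the Stein kernel, k(X, Y) = exp(-theta * S(X, Y)) with S(X, Y) = log det((X + Y) / 2) - (1/2) log det(XY). The abstract does not say how the parameter vector adjusts the eigenvalues, so the power form in `adjust_eigenvalues` is an assumption made for illustration, as are all function names. Learning the parameter vector is not shown.

```python
import numpy as np

def s_divergence(X, Y):
    """S-divergence between two SPD matrices, computed with log-determinants."""
    _, ld_mid = np.linalg.slogdet((X + Y) / 2.0)
    _, ld_x = np.linalg.slogdet(X)
    _, ld_y = np.linalg.slogdet(Y)
    return ld_mid - 0.5 * (ld_x + ld_y)

def stein_kernel(X, Y, theta=1.0):
    """Stein kernel value exp(-theta * S(X, Y))."""
    return np.exp(-theta * s_divergence(X, Y))

def adjust_eigenvalues(X, alpha):
    """Assumed adjustment: raise the i-th eigenvalue (ascending order) to the power alpha[i]."""
    lam, U = np.linalg.eigh(X)
    return (U * lam ** alpha) @ U.T     # U diag(lam**alpha) U^T

X = np.array([[2.0, 0.5], [0.5, 1.0]])
Y = np.eye(2)
alpha = np.array([0.8, 1.2])            # hypothetical parameter vector
print(stein_kernel(X, Y))
print(stein_kernel(adjust_eigenvalues(X, alpha), adjust_eigenvalues(Y, alpha)))
```

With `alpha` set to all ones the adjustment leaves each matrix unchanged, so the sketch reduces to the original Stein kernel.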
Investigating human-perceptual properties of "shapes" using 3D shapes and 2D fonts
Shapes are generally used to convey meaning. They are used in video games, films and other multimedia in diverse ways. 3D shapes may be destined for virtual scenes or represent objects to be constructed in the real world. Fonts add character to an otherwise plain block of text, allowing the writer to make important points more visually prominent or distinct from other text. They can indicate the structure of a document at a glance. Rather than studying shapes through traditional geometric shape descriptors, we provide alternative methods to describe and analyse shapes through the lens of human perception. This is done via the concepts of Schelling Points and Image Specificity. Schelling Points are choices people make when they aim to match what they expect others to choose but cannot communicate with others to determine an answer. We study whole mesh selections in this setting, where Schelling Meshes are the most frequently selected shapes. The key idea behind Image Specificity is that different images evoke different descriptions, but ‘Specific’ images yield more consistent descriptions than others. We apply Specificity to 2D fonts. We show that each concept can be learned, and we predict them for fonts and 3D shapes, respectively, using a depth image-based convolutional neural network. Results are shown for a range of fonts and 3D shapes, and we demonstrate that font Specificity and the Schelling Meshes concept are useful for visualisation, clustering, and search applications. Overall, we find that each concept captures similarities within its respective type of shape, even when there are discontinuities between the shape geometries themselves. The ‘context’ of these similarities lies in some kind of abstract or subjective meaning that is consistent among different people.