20,772 research outputs found

    VisTaNet: Attention Guided Deep Fusion for Surface Roughness Classification

    Full text link
    Human texture perception is a weighted average of multi-sensory inputs: visual and tactile. While the visual sensing mechanism extracts global features, the tactile mechanism complements it by extracting local features. The lack of coupled visuotactile datasets in the literature is a challenge for studying multimodal fusion strategies analogous to human texture perception. This paper presents a visual dataset that augments an existing tactile dataset. We propose a novel deep fusion architecture that fuses visual and tactile data using four types of fusion strategies: summation, concatenation, max-pooling, and attention. Our model shows significant performance improvements (97.22%) in surface roughness classification accuracy over tactile only (SVM - 92.60%) and visual only (FENet-50 - 85.01%) architectures. Among the several fusion techniques, attention-guided architecture results in better classification accuracy. Our study shows that analogous to human texture perception, the proposed model chooses a weighted combination of the two modalities (visual and tactile), thus resulting in higher surface roughness classification accuracy; and it chooses to maximize the weightage of the tactile modality where the visual modality fails and vice-versa

    Factorized Topic Models

    Full text link
    In this paper we present a modification to a latent topic model, which makes the model exploit supervision to produce a factorized representation of the observed data. The structured parameterization separately encodes variance that is shared between classes from variance that is private to each class by the introduction of a new prior over the topic space. The approach allows for a more eff{}icient inference and provides an intuitive interpretation of the data in terms of an informative signal together with structured noise. The factorized representation is shown to enhance inference performance for image, text, and video classification.Comment: ICLR 201
    • …
    corecore