80,014 research outputs found

    Low-rank SIFT: An Affine Invariant Feature for Place Recognition

    Full text link
    In this paper, we present a novel affine-invariant feature based on SIFT, leveraging the regular appearance of man-made objects. The feature achieves full affine invariance without needing to simulate over affine parameter space. Low-rank SIFT, as we name the feature, is based on our observation that local tilt, which are caused by changes of camera axis orientation, could be normalized by converting local patches to standard low-rank forms. Rotation, translation and scaling invariance could be achieved in ways similar to SIFT. As an extension of SIFT, our method seeks to add prior to solve the ill-posed affine parameter estimation problem and normalizes them directly, and is applicable to objects with regular structures. Furthermore, owing to recent breakthrough in convex optimization, such parameter could be computed efficiently. We will demonstrate its effectiveness in place recognition as our major application. As extra contributions, we also describe our pipeline of constructing geotagged building database from the ground up, as well as an efficient scheme for automatic feature selection

    Sympathy Begins with a Smile, Intelligence Begins with a Word: Use of Multimodal Features in Spoken Human-Robot Interaction

    Full text link
    Recognition of social signals, from human facial expressions or prosody of speech, is a popular research topic in human-robot interaction studies. There is also a long line of research in the spoken dialogue community that investigates user satisfaction in relation to dialogue characteristics. However, very little research relates a combination of multimodal social signals and language features detected during spoken face-to-face human-robot interaction to the resulting user perception of a robot. In this paper we show how different emotional facial expressions of human users, in combination with prosodic characteristics of human speech and features of human-robot dialogue, correlate with users' impressions of the robot after a conversation. We find that happiness in the user's recognised facial expression strongly correlates with likeability of a robot, while dialogue-related features (such as number of human turns or number of sentences per robot utterance) correlate with perceiving a robot as intelligent. In addition, we show that facial expression, emotional features, and prosody are better predictors of human ratings related to perceived robot likeability and anthropomorphism, while linguistic and non-linguistic features more often predict perceived robot intelligence and interpretability. As such, these characteristics may in future be used as an online reward signal for in-situ Reinforcement Learning based adaptive human-robot dialogue systems.Comment: Robo-NLP workshop at ACL 2017. 9 pages, 5 figures, 6 table

    Inclusion and online learning opportunities: Designing for accessibility

    Get PDF
    Higher education institutions worldwide are adopting flexible learning methods and online technologies which increase the potential for widening the learning community to include people for whom participation may previously have been difficult or impossible. The development of courseware that is accessible, flexible and informative can benefit not only people with special needs, but such courseware provides a better educational experience for all students

    Making science count in government

    Get PDF
    Science is an essential component of policy-making in most areas of government, but the scientific community does not always understand its role in this process.Publisher PDFPeer reviewe

    Attention-based Pyramid Aggregation Network for Visual Place Recognition

    Full text link
    Visual place recognition is challenging in the urban environment and is usually viewed as a large scale image retrieval task. The intrinsic challenges in place recognition exist that the confusing objects such as cars and trees frequently occur in the complex urban scene, and buildings with repetitive structures may cause over-counting and the burstiness problem degrading the image representations. To address these problems, we present an Attention-based Pyramid Aggregation Network (APANet), which is trained in an end-to-end manner for place recognition. One main component of APANet, the spatial pyramid pooling, can effectively encode the multi-size buildings containing geo-information. The other one, the attention block, is adopted as a region evaluator for suppressing the confusing regional features while highlighting the discriminative ones. When testing, we further propose a simple yet effective PCA power whitening strategy, which significantly improves the widely used PCA whitening by reasonably limiting the impact of over-counting. Experimental evaluations demonstrate that the proposed APANet outperforms the state-of-the-art methods on two place recognition benchmarks, and generalizes well on standard image retrieval datasets.Comment: Accepted to ACM Multimedia 201
    • …
    corecore