Low-rank SIFT: An Affine Invariant Feature for Place Recognition
In this paper, we present a novel affine-invariant feature based on SIFT,
leveraging the regular appearance of man-made objects. The feature achieves
full affine invariance without needing to simulate over affine parameter space.
Low-rank SIFT, as we name the feature, is based on our observation that local
tilts, which are caused by changes in camera axis orientation, can be
normalized by converting local patches to standard low-rank forms. Rotation,
translation and scaling invariance can be achieved in ways similar to SIFT.
As an extension of SIFT, our method adds a prior to solve the ill-posed
affine parameter estimation problem and normalizes the parameters directly; it is
applicable to objects with regular structures. Furthermore, owing to recent
breakthroughs in convex optimization, such parameters can be computed
efficiently. We demonstrate its effectiveness in place recognition, our
major application. As extra contributions, we also describe our pipeline for
constructing a geotagged building database from the ground up, as well as an
efficient scheme for automatic feature selection.
Sympathy Begins with a Smile, Intelligence Begins with a Word: Use of Multimodal Features in Spoken Human-Robot Interaction
Recognition of social signals, from human facial expressions or prosody of
speech, is a popular research topic in human-robot interaction studies. There
is also a long line of research in the spoken dialogue community that
investigates user satisfaction in relation to dialogue characteristics.
However, very little research relates a combination of multimodal social
signals and language features detected during spoken face-to-face human-robot
interaction to the resulting user perception of a robot. In this paper we show
how different emotional facial expressions of human users, in combination with
prosodic characteristics of human speech and features of human-robot dialogue,
correlate with users' impressions of the robot after a conversation. We find
that happiness in the user's recognised facial expression strongly correlates
with likeability of a robot, while dialogue-related features (such as number of
human turns or number of sentences per robot utterance) correlate with
perceiving a robot as intelligent. In addition, we show that facial expression,
emotional features, and prosody are better predictors of human ratings related
to perceived robot likeability and anthropomorphism, while linguistic and
non-linguistic features more often predict perceived robot intelligence and
interpretability. As such, these characteristics may in future be used as an
online reward signal for in-situ Reinforcement Learning based adaptive
human-robot dialogue systems. Comment: Robo-NLP workshop at ACL 2017. 9 pages, 5 figures, 6 tables
Inclusion and online learning opportunities: Designing for accessibility
Higher education institutions worldwide are adopting flexible learning methods and online technologies, which increase the potential for widening the learning community to include people for whom participation may previously have been difficult or impossible. Courseware that is accessible, flexible and informative benefits not only people with special needs but also provides a better educational experience for all students.
Making science count in government
Science is an essential component of policy-making in most areas of government, but the scientific community does not always understand its role in this process. Publisher PDF. Peer reviewed.
Attention-based Pyramid Aggregation Network for Visual Place Recognition
Visual place recognition is challenging in the urban environment and is
usually viewed as a large-scale image retrieval task. The intrinsic challenges
are that confusing objects such as cars and trees frequently occur in complex
urban scenes, and that buildings with repetitive structures may cause
over-counting and burstiness, which degrade the
image representations. To address these problems, we present an Attention-based
Pyramid Aggregation Network (APANet), which is trained in an end-to-end manner
for place recognition. One main component of APANet, the spatial pyramid
pooling, can effectively encode the multi-size buildings containing
geo-information. The other one, the attention block, is adopted as a region
evaluator for suppressing the confusing regional features while highlighting
the discriminative ones. When testing, we further propose a simple yet
effective PCA power whitening strategy, which significantly improves the widely
used PCA whitening by reasonably limiting the impact of over-counting.
Experimental evaluations demonstrate that the proposed APANet outperforms the
state-of-the-art methods on two place recognition benchmarks, and generalizes
well on standard image retrieval datasets. Comment: Accepted to ACM Multimedia 201
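
The abstract above mentions a PCA power whitening step but does not give its formula. The following is a minimal sketch under one assumed reading: standard PCA whitening scales each principal component by the eigenvalue raised to -1/2, and the "power" variant softens that exponent to -alpha/2 with alpha between 0 and 1. The function names, the `alpha` parameter and its default are hypothetical and are not taken from the paper.

```python
import numpy as np


def fit_pca_power_whitening(X, n_components, alpha=0.5, eps=1e-12):
    """Fit a PCA projection with a softened whitening exponent.

    X is an (n_samples, dim) array of image descriptors.
    alpha = 1.0 gives standard PCA whitening, alpha = 0.0 gives plain PCA.
    The alpha parameterisation is an assumption, not the paper's definition.
    """
    mean = X.mean(axis=0)
    Xc = X - mean
    cov = Xc.T @ Xc / (X.shape[0] - 1)

    # eigh returns eigenvalues in ascending order; keep the largest ones.
    eigvals, eigvecs = np.linalg.eigh(cov)
    order = np.argsort(eigvals)[::-1][:n_components]
    eigvals = np.maximum(eigvals[order], eps)
    eigvecs = eigvecs[:, order]

    # Scale each retained eigenvector (column) by lambda ** (-alpha / 2).
    projection = eigvecs * eigvals ** (-alpha / 2.0)
    return mean, projection


def apply_whitening(X, mean, projection):
    """Project descriptors and L2-normalise them for cosine-similarity retrieval."""
    Y = (X - mean) @ projection
    norms = np.linalg.norm(Y, axis=1, keepdims=True)
    return Y / np.maximum(norms, 1e-12)
```

With `alpha` below 1, directions with small eigenvalues are amplified less aggressively than under full whitening, which is one way such a step could limit the influence of over-counted features; whether this matches the authors' formulation should be checked against the paper itself.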