29 research outputs found

Temporal Attention-Gated Model for Robust Sequence Classification

Author: Baltrušaitis Tadas
Morency Louis-Philippe
Pei Wenjie
Tax David M. J.
Publication date: 15/04/2017

Typical techniques for sequence classification are designed for well-segmented sequences which have been edited to remove noisy or irrelevant parts. Therefore, such methods cannot be easily applied to the noisy sequences expected in real-world applications. In this paper, we present the Temporal Attention-Gated Model (TAGM) which integrates ideas from attention models and gated recurrent networks to better deal with noisy or unsegmented sequences. Specifically, we extend the concept of attention model to measure the relevance of each observation (time step) of a sequence. We then use a novel gated recurrent network to learn the hidden representation for the final prediction. An important advantage of our approach is interpretability since the temporal attention weights provide a meaningful value for the salience of each time step in the sequence. We demonstrate the merits of our TAGM approach, both for prediction accuracy and interpretability, on three different tasks: spoken digit recognition, text-based sentiment analysis and visual event recognition. Comment: Accepted by CVPR 201
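The abstract describes a recurrent network whose hidden state is gated by a per-time-step attention weight, but it does not spell out the update rule. The following is a minimal sketch of one plausible form, assuming the hidden state is a convex combination of the previous state and a tanh candidate, weighted by the attention value. The function name, the parameter names `W`, `U`, `b`, and the choice of tanh are illustrative assumptions, and the attention weights are passed in as given rather than learned.

```python
import numpy as np


def attention_gated_rnn(xs, attention, W, U, b):
    """Run a recurrent layer whose update is scaled by a per-step attention weight.

    xs:        array of shape (T, d_in), one observation per time step.
    attention: array of shape (T,), salience weights in [0, 1]. They are
               supplied by the caller here (assumption); a full model would
               predict them with a separate attention module.
    W, U, b:   hypothetical recurrent weights, input weights, and bias.
    """
    h = np.zeros(W.shape[0])
    for x_t, a_t in zip(xs, attention):
        candidate = np.tanh(W @ h + U @ x_t + b)
        # A weight of 0 leaves the state untouched, so an irrelevant
        # time step is skipped; a weight of 1 is a plain RNN update.
        h = (1.0 - a_t) * h + a_t * candidate
    return h
```

Under this assumed rule, a time step with attention 0 has no effect on the final representation, which is what makes the weights readable as a salience score for each step.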

arXiv.org e-Print Archive

Crossref

The Cambridge Face Tracker: Accurate, Low Cost Measurement of Head Posture Using Computer Vision and Face Recognition Software.

Author: Baltrušaitis Tadas
Robinson Peter
Thomas Peter BM
Vivian Anthony J
Publication venue: Transl Vis Sci Technol
Publication date: 30/09/2016

PURPOSE: We validate a video-based method of head posture measurement. METHODS: The Cambridge Face Tracker uses neural networks (constrained local neural fields) to recognize facial features in video. The relative position of these facial features is used to calculate head posture. First, we assess the accuracy of this approach against videos in three research databases where each frame is tagged with a precisely measured head posture. Second, we compare our method to a commercially available mechanical device, the Cervical Range of Motion device: four subjects each adopted 43 distinct head postures that were measured using both methods. RESULTS: The Cambridge Face Tracker achieved confident facial recognition in 92% of the approximately 38,000 frames of video from the three databases. The respective mean error in absolute head posture was 3.34°, 3.86°, and 2.81°, with a median error of 1.97°, 2.16°, and 1.96°. The accuracy decreased with more extreme head posture. Comparing The Cambridge Face Tracker to the Cervical Range of Motion Device gave correlation coefficients of 0.99 (P < 0.0001), 0.96 (P < 0.0001), and 0.99 (P < 0.0001) for yaw, pitch, and roll, respectively. CONCLUSIONS: The Cambridge Face Tracker performs well under real-world conditions and within the range of normally-encountered head posture. It allows useful quantification of head posture in real time or from precaptured video. Its performance is similar to that of a clinically validated mechanical device. It has significant advantages over other approaches in that subjects do not need to wear any apparatus, and it requires only low cost, easy-to-setup consumer electronics. TRANSLATIONAL RELEVANCE: Noncontact assessment of head posture allows more complete clinical assessment of patients, and could benefit surgical planning in future

Crossref

PubMed Central

Apollo (Cambridge)

Geometries of Light and Shadows, from Piero della Francesca to James Turrell

Author: A Andersen
A Bortot
A Rosa De
C Candito
EB Gilman
F Siguret
G Ciucci
G Rodis-Lewis
J Baltrušaitis
J D’Auzoles de Lapeyre
JF Niceron
L Nix
L Vagnetti
M Cojannot-Le Blanc
M Kemp
O Krakovitch
P Portoghesi
PJS Withmore
R Ceñal
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2019

This chapter addresses the problem of representing light and shadow in the artistic culture, from its uncertain beginnings, related to the studies on conical linear perspective in the Fifteenth Century, to the applications of light projection in the installations of contemporary art. Here are examined in particular two works by two artists, representing two different conceptual approaches to the perception and symbolism of light and shadow. The first is the so-called Brera Madonna by Piero della Francesca, where the image projected from a luminous radiation is employed with a narrative purpose, supporting the apparently hidden script of the painting and according to the artist’s own speculations about perspective as a means to clarify the phenomenal world. The second is one of James Turrell’s Dark Spaces installations, where quantum electrodynamics interpretation of light is taken into account: for Turrell, light is physical and thus can shape spaces where the visitors, or viewers, can “see themselves seeing.” In his body of work, perceptual deceptions are carefully produced by the interaction of the senses with his phenomenal staging of light and darkness, but a strong symbolic component is always present, often related to his own speculative interests. In both cases, light and shadow, through their geometries, emphasize both phenomenal and spiritual contents of the work of art, intended as a device to expand the perception and the knowledge of the viewer

Archivio istituzionale della ricerca - Università di Trieste

Archivio istituzionale della ricerca - Università IUAV di Venezia

Crossref

Learning Tversky Similarity

Author: A Tversky
AW Smeulders
B Bouchon-Meunier
G Chechik
G Coletti
G Patterson
J Duchi
KQ Weinberger
M Baioletti
M Lesot
MM Richter
N Garcia
R Datta
S Lang
S Santini
SSM Salehi
T Baltrušaitis
Y Chen
Y Liu
YA Tolias
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 27/05/2020

In this paper, we advocate Tversky's ratio model as an appropriate basis for computational approaches to semantic similarity, that is, the comparison of objects such as images in a semantically meaningful way. We consider the problem of learning Tversky similarity measures from suitable training data indicating whether two objects tend to be similar or dissimilar. Experimentally, we evaluate our approach to similarity learning on two image datasets, showing that it performs very well compared to existing methods
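For readers unfamiliar with Tversky's ratio model, the sketch below shows its standard set-based form, S(A, B) = |A ∩ B| / (|A ∩ B| + α|A − B| + β|B − A|). It uses plain set cardinality as the salience measure and fixed α and β, which is a simplifying assumption: the paper concerns learning the measure from data, and that learning step is not shown here. The function name and the handling of two empty sets are illustrative choices.

```python
def tversky_similarity(a, b, alpha=0.5, beta=0.5):
    """Tversky ratio model on two feature sets, using set size as salience."""
    a, b = set(a), set(b)
    common = len(a & b)   # features shared by both objects
    only_a = len(a - b)   # features distinctive to a
    only_b = len(b - a)   # features distinctive to b
    denom = common + alpha * only_a + beta * only_b
    # Two empty feature sets are treated as identical (assumption).
    return common / denom if denom else 1.0
```

With α = β = 1 the measure reduces to the Jaccard index, and with α = β = 0.5 to the Dice coefficient. Unequal α and β make the similarity asymmetric, which is the property that distinguishes the ratio model from metric distances.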

arXiv.org e-Print Archive

Crossref

Detecting human Activities Based on a multimodal sensor data set using a bidirectional long short-term memory model: a case study

Author: A Jaimes
AC Scheffer
Alexander Mathis
C Chen
E Kanjo
F Ordóñez
HF Nweke
J Wang
JC Núñez
JR Kwapisz
JY Hesterman
K Greff
L Ding
L Martínez-Villaseñor
M Ermes
M Schuster
MF Akay
N Neverova
R Igual
R Zhao
S Brownsell
S Chung
S Hochreiter
SW Sun
T Baltrušaitis
Y Bengio
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 29/01/2020

Human falls are one of the leading causes of fatal unintentional injuries worldwide. Falls result in a direct financial cost to health systems, and indirectly, to society’s productivity. Unsurprisingly, human fall detection and prevention is a major focus of health research. In this chapter, we present and evaluate several bidirectional long short-term memory (Bi-LSTM) models using a data set provided by the Challenge UP competition. The main goal of this study is to detect 12 human daily activities (six daily human activities, five falls, and one post-fall activity) derived from multi-modal data sources - wearable sensors, ambient sensors, and vision devices. Our proposed Bi-LSTM model leverages data from accelerometer and gyroscope sensors located at the ankle, right pocket, belt, and neck of the subject. We utilize a grid search technique to evaluate variations of the Bi-LSTM model and identify a configuration that presents the best results. The best Bi-LSTM model achieved good results for precision and f1-score, 43.30% and 38.50%, respectively

Crossref

DCU Online Research Access Service

Flagging Early Examples of Ambiguity I

Author: Alais D
Angiolini F
Baltrušaitis J
Baracchini C
Bernardini R
Burckhardt T
Ciardi R P
Guarnieri G
Necker L A
Rubin E
Smith A M
Spini G
Publication venue: 'Pion Ltd'

Crossref

Learning-Based confidence estimation for Multi-modal classifier fusion

Author: A El-Sayed
C Busso
C Seiffert
J Wagner
JH Friedman
L White
MR Alam
N Poh
RE Schapire
T Baltrušaitis
Publication venue: Springer Verlag
Publication date: 01/01/2019

We propose a novel confidence estimation method for predictions from a multi-class classifier. Unlike existing methods, we learn a confidence-estimator on the basis of a held-out set from the training data. The predicted confidence values by the proposed system are used to improve the accuracy of multi-modal emotion and sentiment classification. The scores of different classes from the individual modalities are superposed on the basis of confidence values. Experimental results demonstrate that the accuracy of the proposed confidence based fusion method is significantly superior to that of the classifier trained on any modality separately, and achieves superior performance compared to other fusion methods
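The abstract says per-class scores from each modality are superposed according to confidence values, without giving the exact rule. The sketch below assumes the simplest reading, a confidence-weighted average of the score vectors. The function name, the dictionary layout, and the equal-weight fallback are hypothetical, and the confidences are taken as inputs rather than produced by the learned estimator the paper proposes.

```python
def fuse_scores(scores_by_modality, confidence_by_modality):
    """Combine per-class scores from several modalities, weighted by confidence.

    scores_by_modality:     dict mapping modality name -> list of class scores.
    confidence_by_modality: dict mapping modality name -> non-negative confidence.
    """
    names = list(scores_by_modality)
    total = sum(confidence_by_modality[m] for m in names)
    n_classes = len(scores_by_modality[names[0]])
    fused = [0.0] * n_classes
    for m in names:
        # Fall back to equal weights if every confidence is zero (assumption).
        w = confidence_by_modality[m] / total if total else 1.0 / len(names)
        for k, score in enumerate(scores_by_modality[m]):
            fused[k] += w * score
    return fused
```

For example, with audio scores `[0.6, 0.4]` at confidence 0.9 and video scores `[0.2, 0.8]` at confidence 0.1, the fused scores favour the first class, following the more confident modality.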

Crossref

Research Repository

Displaced dynamic expression regression for real-time facial tracking and animation

Author: Baltrušaitis T.
Cao X.
Chai J.-X.
Cootes T. F.
Dollar P.
Huang G. B.
Pighin F.
Saragih J. M.
Xiao J.
Publication venue: 'Association for Computing Machinery (ACM)'

Crossref