Discriminatively Trained Latent Ordinal Model for Video Classification
We study the problem of video classification for facial analysis and human
action recognition. We propose a novel weakly supervised learning method that
models the video as a sequence of automatically mined, discriminative
sub-events (e.g., onset and offset phases for "smile", running and jumping for
"highjump"). The proposed model is inspired by recent work on Multiple
Instance Learning and latent SVM/HCRF; it extends such frameworks to
approximately model the ordinal aspect of the videos. We obtain consistent
improvements over relevant competitive baselines on four challenging and
publicly available video-based facial analysis datasets for prediction of
expression, clinical pain and intent in dyadic conversations and on three
challenging human action datasets. We also validate the method with qualitative
results and show that they largely support the intuitions behind the method.
Comment: Paper accepted in IEEE TPAMI. arXiv admin note: substantial text
overlap with arXiv:1604.0150
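The idea of scoring a video by a sequence of ordered sub-events can be illustrated with a small sketch. It is not the paper's formulation: the abstract says the ordinal aspect is modeled only approximately, whereas this example enforces a strict temporal order as a simplifying assumption. The function name `best_ordered_score` and the score matrix layout are hypothetical, and the per-frame template scores are assumed to be given.

```python
def best_ordered_score(scores):
    """Place K sub-event templates on T frames in strict temporal order.

    scores[k][t] is the (assumed, precomputed) score of template k at frame t.
    Returns (best_total, frames), where frames[k] is the frame chosen for
    template k and frames is strictly increasing.
    """
    K, T = len(scores), len(scores[0])
    if K > T:
        raise ValueError("need at least as many frames as sub-events")
    neg_inf = float("-inf")
    # best[k][t]: best total for templates 0..k with template k placed at frame t
    best = [[neg_inf] * T for _ in range(K)]
    back = [[-1] * T for _ in range(K)]
    for t in range(T):
        best[0][t] = scores[0][t]
    for k in range(1, K):
        run_best, run_arg = neg_inf, -1  # max of best[k-1][0..t-1]
        for t in range(1, T):
            if best[k - 1][t - 1] > run_best:
                run_best, run_arg = best[k - 1][t - 1], t - 1
            if run_best > neg_inf:
                best[k][t] = run_best + scores[k][t]
                back[k][t] = run_arg
    # Recover the best placement by following the back-pointers.
    t_end = max(range(T), key=lambda t: best[K - 1][t])
    frames = [t_end]
    for k in range(K - 1, 0, -1):
        frames.append(back[k][frames[-1]])
    frames.reverse()
    return best[K - 1][t_end], frames


# Two templates (e.g. "onset", "offset") scored on four frames.
toy_scores = [[1, 9, 2, 0],
              [8, 1, 3, 7]]
print(best_ordered_score(toy_scores))  # (16, [1, 3])
```

In the toy example, ignoring order would pair frame 1 with frame 0 for a total of 17; the ordering constraint instead selects frames 1 and 3 with a total of 16, which shows how temporal order changes the chosen sub-events.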
An improved classification approach for echocardiograms embedding temporal information
Cardiovascular disease is an umbrella term for all diseases of the heart. Computer-aided echocardiogram diagnosis is becoming increasingly beneficial. In echocardiography, different cardiac views can be acquired depending on the location and angulation of the ultrasound transducer. Automatic echocardiogram view classification is therefore the first step in echocardiogram diagnosis, especially for computer-aided systems and, in the future, for fully automatic diagnosis. In addition, view classification makes it possible to label images, especially in large-scale echo video collections, and facilitates database management and collection.
This thesis presents a framework for automatic cardiac viewpoint classification of echocardiogram video data. We aim to overcome the challenges of analyzing, recognizing and classifying echocardiogram videos in 3D (2D spatial plus 1D temporal) space. Specifically, we extend the 2D KAZE approach into 3D space for feature detection and propose a histogram of acceleration as the feature descriptor. The features are then encoded, and an SVM is applied to classify the echo videos.
We also compare against state-of-the-art methods, including 2D SIFT, 3D SIFT, and an optical flow technique for extracting the temporal information contained in the video frames.
2D KAZE, 2D KAZE with Optical Flow, 3D KAZE, Optical Flow, 2D SIFT and 3D SIFT achieve accuracy rates of 89.4%, 84.3%, 87.9%, 79.4%, 83.8% and 73.8%, respectively, for the eight view classes of echo videos.
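The abstract above does not define the histogram of acceleration, so the following is only a minimal sketch under stated assumptions: acceleration is taken to be the change between consecutive flow (displacement) vectors of one tracked point, and the histogram bins the acceleration direction weighted by its magnitude. The function name `histogram_of_acceleration`, the bin count, and the input layout are hypothetical and may differ from the thesis.

```python
import math


def histogram_of_acceleration(flows, bins=8):
    """Orientation histogram of frame-to-frame changes in motion.

    flows: list of (dx, dy) displacement vectors of one tracked point,
    one per consecutive frame pair (assumed input layout).
    Returns an L1-normalised list of length `bins`.
    """
    hist = [0.0] * bins
    for (dx0, dy0), (dx1, dy1) in zip(flows, flows[1:]):
        ax, ay = dx1 - dx0, dy1 - dy0          # assumed definition of acceleration
        magnitude = math.hypot(ax, ay)
        if magnitude == 0.0:
            continue                            # constant motion adds no vote
        angle = math.atan2(ay, ax) % (2 * math.pi)
        index = min(int(angle / (2 * math.pi) * bins), bins - 1)
        hist[index] += magnitude                # magnitude-weighted vote
    total = sum(hist)
    return [h / total for h in hist] if total > 0 else hist


# Flow vectors for four frame pairs of a single hypothetical keypoint.
print(histogram_of_acceleration([(0, 0), (3, 4), (3, 4), (0, 8)], bins=4))
# [0.5, 0.5, 0.0, 0.0]
```

Descriptors of this kind would then be encoded per video and passed to a classifier such as an SVM, as the abstract outlines; those later stages are left out here because the abstract gives no details about them.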
Recognition of Human Activities by Capturing and Analyzing Motion Trajectories (Erkennung menschlicher Aktivitäten durch Erfassung und Analyse von Bewegungstrajektorien)
Understanding human behavior is essential for intelligent technical systems operating in human environments. This work addresses video-based activity analysis. Two feature extraction methods are investigated: markerless three-dimensional body tracking with an evolutionary algorithm, and model-free tracking of dynamic video features. Activities are then modeled and classified on the basis of the extracted features.