Search CORE

28,124 research outputs found

Automated Video Analysis of Animal Movements Using Gabor Orientation Filters

Author: Kristan Wiliam B., Jr.
Wagenaar Daniel A.
Publication venue: Humana Press Inc.
Publication date: 01/01/2010
Field of study

To quantify locomotory behavior, tools for determining the location and shape of an animal’s body are a first requirement. Video recording is a convenient technology to store raw movement data, but extracting body coordinates from video recordings is a nontrivial task. The algorithm described in this paper solves this task for videos of leeches or other quasi-linear animals in a manner inspired by the mammalian visual processing system: the video frames are fed through a bank of Gabor filters, which locally detect segments of the animal at a particular orientation. The algorithm assumes that the image location with maximal filter output lies on the animal’s body and traces its shape out in both directions from there. The algorithm successfully extracted location and shape information from video clips of swimming leeches, as well as from still photographs of swimming and crawling snakes. A Matlab implementation with a graphical user interface is available online, and should make this algorithm conveniently usable in many other contexts

CiteSeerX

Springer - Publisher Connector

PubMed Central

eScholarship - University of California

Caltech Authors

Multimodal Observation and Interpretation of Subjects Engaged in Problem Solving

Author: Balzarini Raffaella
Crowley James L.
Guntz Thomas
Vaufreydaz Dominique
Publication venue
Publication date: 12/10/2017
Field of study

In this paper we present the first results of a pilot experiment in the capture and interpretation of multimodal signals of human experts engaged in solving challenging chess problems. Our goal is to investigate the extent to which observations of eye-gaze, posture, emotion and other physiological signals can be used to model the cognitive state of subjects, and to explore the integration of multiple sensor modalities to improve the reliability of detection of human displays of awareness and emotion. We observed chess players engaged in problems of increasing difficulty while recording their behavior. Such recordings can be used to estimate a participant's awareness of the current situation and to predict ability to respond effectively to challenging situations. Results show that a multimodal approach is more accurate than a unimodal one. By combining body posture, visual attention and emotion, the multimodal approach can reach up to 93% of accuracy when determining player's chess expertise while unimodal approach reaches 86%. Finally this experiment validates the use of our equipment as a general and reproducible tool for the study of participants engaged in screen-based interaction and/or problem solving

arXiv.org e-Print Archive

Hal - Université Grenoble Alpes

INRIA a CCSD electronic archive server

Image processing for plastic surgery planning

Author: Favaedi Leila
Favaedi Leila
Publication venue: Electrical and Electronic Engineering, Imperial College London
Publication date: 01/10/2010
Field of study

This thesis presents some image processing tools for plastic surgery planning. In particular, it presents a novel method that combines local and global context in a probabilistic relaxation framework to identify cephalometric landmarks used in Maxillofacial plastic surgery. It also uses a method that utilises global and local symmetry to identify abnormalities in CT frontal images of the human body. The proposed methodologies are evaluated with the help of several clinical data supplied by collaborating plastic surgeons

Spiral - Imperial College Digital Repository

An Adaptive Sampling Scheme to Efficiently Train Fully Convolutional Networks for Semantic Segmentation

Author: C Wang
K Kamnitsas
O Jimenez-del Toro
O Ronneberger
OA Jiménez del Toro
PF Christ
Q Dou
R Kéchichian
T Gass
X Qi
Y Nesterov
Publication venue
Publication date: 22/12/2017
Field of study

Deep convolutional neural networks (CNNs) have shown excellent performance in object recognition tasks and dense classification problems such as semantic segmentation. However, training deep neural networks on large and sparse datasets is still challenging and can require large amounts of computation and memory. In this work, we address the task of performing semantic segmentation on large data sets, such as three-dimensional medical images. We propose an adaptive sampling scheme that uses a-posterior error maps, generated throughout training, to focus sampling on difficult regions, resulting in improved learning. Our contribution is threefold: 1) We give a detailed description of the proposed sampling algorithm to speed up and improve learning performance on large images. We propose a deep dual path CNN that captures information at fine and coarse scales, resulting in a network with a large field of view and high resolution outputs. We show that our method is able to attain new state-of-the-art results on the VISCERAL Anatomy benchmark

arXiv.org e-Print Archive

Crossref

Digital Image Access & Retrieval

Author: Heidorn P. Bryan
Sandore Beth
Publication venue: Graduate School of Library and Information Science, University of Illinois at Urbana-Champaign
Publication date: 01/01/1997
Field of study

The 33th Annual Clinic on Library Applications of Data Processing, held at the University of Illinois at Urbana-Champaign in March of 1996, addressed the theme of "Digital Image Access & Retrieval." The papers from this conference cover a wide range of topics concerning digital imaging technology for visual resource collections. Papers covered three general areas: (1) systems, planning, and implementation; (2) automatic and semi-automatic indexing; and (3) preservation with the bulk of the conference focusing on indexing and retrieval.published or submitted for publicatio

Illinois Digital Environment for Access to Learning and Scholarship Repository

Seeing with sound? Exploring different characteristics of a visual-to-auditory sensory substitution device

Author: Bach-Y-Rita P
David Brown
Durette B
Evans K K
Jamie Ward
Marks L E
McKelvie S J
Tom Macpherson
Publication venue: 'Pion Ltd'
Publication date: 01/01/2011
Field of study

Sensory substitution devices convert live visual images into auditory signals, for example with a web camera (to record the images), a computer (to perform the conversion) and headphones (to listen to the sounds). In a series of three experiments, the performance of one such device (‘The vOICe’) was assessed under various conditions on blindfolded sighted participants. The main task that we used involved identifying and locating objects placed on a table by holding a webcam (like a flashlight) or wearing it on the head (like a miner’s light). Identifying objects on a table was easier with a hand-held device, but locating the objects was easier with a head-mounted device. Brightness converted into loudness was less effective than the reverse contrast (dark being loud), suggesting that performance under these conditions (natural indoor lighting, novice users) is related more to the properties of the auditory signal (ie the amount of noise in it) than the cross-modal association between loudness and brightness. Individual differences in musical memory (detecting pitch changes in two sequences of notes) was related to the time taken to identify or recognise objects, but individual differences in self-reported vividness of visual imagery did not reliably predict performance across the experiments. In general, the results suggest that the auditory characteristics of the device may be more important for initial learning than visual associations

Crossref

Sussex Research Online