
19,438 research outputs found

End-to-End Tracking and Semantic Segmentation Using Recurrent Neural Networks

Authors: Dequaire Julie; Ondruska Peter; Posner Ingmar; Wang Dominic Zeng
Publication date: 01/01/2016

In this work we present a novel end-to-end framework for tracking and classifying a robot's surroundings in complex, dynamic and only partially observable real-world environments. The approach deploys a recurrent neural network to filter an input stream of raw laser measurements in order to directly infer object locations, along with their identity in both visible and occluded areas. To achieve this, we first train the network using unsupervised Deep Tracking, a recently proposed theoretical framework for end-to-end space occupancy prediction. We show that by learning to track on a large amount of unsupervised data, the network creates a rich internal representation of its environment, which we in turn exploit through the principle of inductive transfer of knowledge to perform the task of its semantic classification. As a result, we show that only a small amount of labelled data suffices to steer the network towards mastering this additional task. Furthermore, we propose a novel recurrent neural network architecture specifically tailored to tracking and semantic classification in real-world robotics applications. We demonstrate the tracking and classification performance of the method on real-world data collected at a busy road junction. Our evaluation shows that the proposed end-to-end framework compares favourably to a state-of-the-art, model-free tracking solution and that it outperforms a conventional one-shot training scheme for semantic classification.

arXiv.org e-Print Archive

Oxford University Research Archive

Machine Analysis of Facial Expressions

Authors: Bartlett M.S.; Pantic M.
Publication venue: I-Tech Education and Publishing
Publication date: 01/01/2007

No abstract

IntechOpen

CiteSeerX

Crossref

University of Twente Research Information

Multi-sensor data fusion techniques for RPAS detect, track and avoid

Authors: Cappello F; Ramasamy S; Sabatini R
Publication venue: SAE International (Warrendale, PA, United States)
Publication date: 01/01/2015

Accurate and robust tracking of objects is of growing interest amongst the computer vision scientific community. The ability of a multi-sensor system to detect and track objects, and accurately predict their future trajectory, is critical in the context of mission- and safety-critical applications. Remotely Piloted Aircraft Systems (RPAS) are currently not equipped to routinely access all classes of airspace since certified Detect-and-Avoid (DAA) systems are yet to be developed. Such capabilities can be achieved by incorporating both cooperative and non-cooperative DAA functions, as well as providing enhanced communications, navigation and surveillance (CNS) services. DAA is highly dependent on the performance of CNS systems for Detection, Tracking and Avoiding (DTA) tasks and maneuvers. In order to perform an effective detection of objects, a number of high performance, reliable and accurate avionics sensors and systems are adopted, including non-cooperative sensors (visual and thermal cameras, Laser radar (LIDAR) and acoustic sensors) and cooperative systems (Automatic Dependent Surveillance-Broadcast (ADS-B) and Traffic Collision Avoidance System (TCAS)). In this paper the sensors and system information candidates are fully exploited in a Multi-Sensor Data Fusion (MSDF) architecture. An Unscented Kalman Filter (UKF) and a more advanced Particle Filter (PF) are adopted to estimate the state vector of the objects for maneuvering and non-maneuvering DTA tasks. Furthermore, an artificial neural network is conceptualised/adopted to exploit the use of statistical learning methods, which acts to combine information obtained from the UKF and PF. After describing the MSDF architecture, the key mathematical models for data fusion are presented. Conceptual studies are carried out on visual and thermal image fusion architectures.

RMIT Research Repository

Digital Oculomotor Biomarkers in Dementia

Author: Mengoudi Kyriaki
Publication venue: UCL (University College London)
Publication date: 28/06/2021

Dementia is an umbrella term that covers a number of neurodegenerative syndromes featuring gradual disturbance of various cognitive functions that are severe enough to interfere with tasks of daily life. The diagnosis of dementia frequently occurs when pathological changes have been developing for years, symptoms of cognitive impairment are evident and the quality of life of the patients has already deteriorated significantly. Although brain imaging and fluid biomarkers allow the monitoring of disease progression in vivo, they are expensive, invasive and not necessarily diagnostic in isolation. Recent studies suggest that eye-tracking technology is an innovative tool that holds promise for accelerating early detection of the disease, as well as supporting the development of strategies that minimise impairment during everyday activities. However, the optimal methods for quantitative evaluation of oculomotor behaviour during complex and naturalistic tasks in dementia have yet to be determined. This thesis investigates the development of computational tools and techniques to analyse eye movements of dementia patients and healthy controls under naturalistic and less constrained scenarios to identify novel digital oculomotor biomarkers. Three key contributions are made. First, the evaluation of the role of environment during navigation in patients with typical Alzheimer disease and Posterior Cortical Atrophy compared to a control group using a combination of eye movement and egocentric video analysis. Second, the development of a novel method of extracting salient features directly from the raw eye-tracking data of a mixed sample of dementia patients during a novel instruction-less cognitive test to detect oculomotor biomarkers of dementia-related cognitive dysfunction. Third, the application of unsupervised anomaly detection techniques for visualisation of oculomotor anomalies during various cognitive tasks.
The work presented in this thesis furthers our understanding of dementia-related oculomotor dysfunction and gives future research direction for the development of computerised cognitive tests and ecological interventions.

UCL Discovery

Marshall Space Flight Center Research and Technology Report 2019

Authors: Dankanich John W.; Morris Heather C.

Today, our calling to explore is greater than ever before, and here at Marshall Space Flight Center we make human deep space exploration possible. A key goal for Artemis is demonstrating and perfecting capabilities on the Moon for technologies needed for humans to get to Mars. This year's report features 10 of the Agency's 16 Technology Areas, and I am proud of Marshall's role in creating solutions for so many of these daunting technical challenges. Many of these projects will lead to sustainable in-space architecture for human space exploration that will allow us to travel to the Moon, on to Mars, and beyond. Others are developing new scientific instruments capable of providing an unprecedented glimpse into our universe. NASA has led the charge in space exploration for more than six decades, and through the Artemis program we will help build on our work in low Earth orbit and pave the way to the Moon and Mars. At Marshall, we leverage the skills and interest of the international community to conduct scientific research, develop and demonstrate technology, and train international crews to operate further from Earth for longer periods of time than ever before: first at the lunar surface, then on to our next giant leap, human exploration of Mars. While each project in this report seeks to advance new technology and challenge conventions, it is important to recognize the diversity of activities and people supporting our mission. This report not only showcases the Center's capabilities and our partnerships, it also highlights the progress our people have achieved in the past year. These scientists, researchers and innovators are why Marshall and NASA will continue to be a leader in innovation, exploration, and discovery for years to come.

NASA Technical Reports Server

Content-prioritised video coding for British Sign Language communication.

Author: Muir Laura Joy
Publication date: 31/10/2007

Video communication of British Sign Language (BSL) is important for remote interpersonal communication and for the equal provision of services for deaf people. However, the use of video telephony and video conferencing applications for BSL communication is limited by inadequate video quality. BSL is a highly structured, linguistically complete, natural language system that expresses vocabulary and grammar visually and spatially using a complex combination of facial expressions (such as eyebrow movements, eye blinks and mouth/lip shapes), hand gestures, body movements and finger-spelling that change in space and time. Accurate natural BSL communication places specific demands on visual media applications which must compress video image data for efficient transmission. Current video compression schemes apply methods to reduce statistical redundancy and perceptual irrelevance in video image data based on a general model of Human Visual System (HVS) sensitivities. This thesis presents novel video image coding methods developed to achieve the conflicting requirements for high image quality and efficient coding. Novel methods of prioritising visually important video image content for optimised video coding are developed to exploit the HVS spatial and temporal response mechanisms of BSL users (determined by Eye Movement Tracking) and the characteristics of BSL video image content. The methods implement an accurate model of HVS foveation, applied in the spatial and temporal domains, at the pre-processing stage of a current standard-based system (H.264). Comparison of the performance of the developed and standard coding systems, using methods of video quality evaluation developed for this thesis, demonstrates improved perceived quality at low bit rates. BSL users, broadcasters and service providers benefit from the perception of high quality video over a range of available transmission bandwidths. 
The research community benefits from a new approach to video coding optimisation and better understanding of the communication needs of deaf people.

Open Access Institutional Repository at Robert Gordon University

Affective Computing for Emotion Detection using Vision and Wearable Sensors

Author: Keary Alphonsus
Publication date: 01/03/2018

The research explores the opportunities, challenges and limitations of, and presents advancements in, computing that relates to, arises from, or deliberately influences emotions (Picard, 1997). The field is referred to as Affective Computing (AC) and is expected to play a major role in the engineering and development of computationally and cognitively intelligent systems, processors and applications in the future. Today the field of AC is bolstered by the emergence of multiple sources of affective data and is fuelled by developments under various Internet of Things (IoT) projects and the fusion potential of multiple sensory affective data streams. The core focus of this thesis is to investigate whether the sensitivity and specificity (predictive performance) of AC, based on the fusion of multi-sensor data streams, are fit for purpose. Can such AC-powered technologies and techniques truly deliver increasingly accurate emotion predictions of subjects in the real world? The thesis begins by presenting a number of research justifications and AC research questions that are used to formulate the original thesis hypothesis and thesis objectives. As part of the research conducted, detailed state-of-the-art investigations explored many aspects of AC from both a scientific and technological perspective. The complexity of AC as a multi-sensor, multi-modality, data fusion problem unfolded during the state-of-the-art research, and this ultimately led to novel thinking and origination in the form of an AC conceptualised architecture that will act as a practical and theoretical foundation for the engineering of future AC platforms and solutions.
The AC conceptual architecture developed as a result of this research was applied to the engineering of a series of software artifacts that were combined to create a prototypical AC multi-sensor platform known as the Emotion Fusion Server (EFS), to be used in the thesis hypothesis AC experimentation phases of the research. The thesis research used the EFS platform to conduct a detailed series of AC experiments to investigate whether the fusion of multiple sensory sources of affective data from sensory devices can significantly increase the accuracy of emotion prediction by computationally intelligent means. The research involved conducting numerous controlled experiments along with the statistical analysis of the performance of sensors for the purposes of AC, the findings of which serve to assess the feasibility of AC in various domains and point to future directions for the AC field. The AC experimental data investigations conducted in relation to the thesis hypothesis used applied statistical methods and techniques, and the results, analytics and evaluations are presented throughout the two thesis research volumes. The thesis concludes by providing a detailed set of formal findings, conclusions and decisions in relation to the overarching research hypothesis on the sensitivity and specificity of the fusion of vision and wearable sensor modalities, and offers foresights and guidance into the many problems, challenges and projections for the AC field into the future.
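
The abstract above frames predictive performance in terms of sensitivity and specificity. As a minimal illustration of those two standard metrics (not the thesis's own evaluation code; the function name and the toy labels are assumptions made for this sketch), they could be computed for a binary emotion-detection task as follows:

```python
from dataclasses import dataclass


@dataclass
class BinaryScores:
    sensitivity: float  # true positive rate: TP / (TP + FN)
    specificity: float  # true negative rate: TN / (TN + FP)


def sensitivity_specificity(y_true, y_pred):
    """Compute sensitivity and specificity from binary labels (1 = emotion present).

    Hypothetical helper for illustration only.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one positive and one negative example")
    return BinaryScores(tp / (tp + fn), tn / (tn + fp))


# Toy labels (made up for the example, not data from the thesis).
truth = [1, 1, 1, 0, 0, 0, 0, 1]
predicted = [1, 0, 1, 0, 1, 0, 0, 1]
scores = sensitivity_specificity(truth, predicted)
print(scores)
```

Reporting both values matters because a detector can score highly on one simply by always predicting the same class.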

SWORD (Cork Inst. of Technology)

Statistical Methods to Measure Reading Progression Using Eye-Gaze Fixation Points

Author: Bottos Stephen
Publication venue: University of Windsor Leddy Library
Publication date: 25/07/2019

In this thesis, we investigate methods to accurately track reading progression by analyzing eye-gaze fixation points, using commercially available eye tracking devices and without the imposition of unnatural movement constraints. In order to obtain the most accurate eye-gaze fixation point data possible, the current state of the art relies on expensive, cumbersome apparatuses. Eye-gaze tracking using less expensive hardware, and without constraints imposed on the individual whose gaze is being tracked, results in less reliable, noise-corrupted data which proves difficult to interpret. Extending the accessibility of accurate reading progression tracking beyond its current limits and enabling its feasibility in a real-world, constraint-free environment will enable a multitude of futuristic functionalities for educational, enterprise, and consumer technologies. We first discuss the "Line Detection System" (LDS), a Kalman filter and hidden Markov model based algorithm designed to infer from noisy data the line of text associated with each eye-gaze fixation point reported every few milliseconds during reading. This system is shown to yield an average line detection accuracy of 88.1%. Next, we discuss a "Horizontal Saccade Tracking System" (HSTS) which aims to track horizontal progression within each line, using a least squares approach to filter out noise. Finally, we discuss a novel "Slip-Kalman" filter which is custom designed to track the progression of reading. This method improves upon the original LDS, performing at an average line detection accuracy of 97.8%, and offers advanced capability in horizontal tracking compared to the HSTS. The performance of each method is demonstrated using 25 pages' worth of data collected during readin
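
To give a sense of the kind of filtering the abstract refers to, the sketch below applies a textbook one-dimensional Kalman filter to noisy vertical gaze coordinates. It is an illustrative assumption, not the thesis's LDS or Slip-Kalman filter: the random-walk state model, the function name, and all numeric values are chosen for the example only.

```python
def kalman_1d(measurements, process_var, meas_var, x0, p0):
    """Scalar Kalman filter with a random-walk state model.

    State model:       x_k = x_{k-1} + w,  w ~ N(0, process_var)
    Measurement model: z_k = x_k + v,      v ~ N(0, meas_var)
    Returns the filtered state estimate after each measurement.
    """
    x, p = x0, p0
    estimates = []
    for z in measurements:
        # Predict: the state is assumed to persist, so only uncertainty grows.
        p = p + process_var
        # Update: blend prediction and measurement using the Kalman gain.
        gain = p / (p + meas_var)
        x = x + gain * (z - x)
        p = (1.0 - gain) * p
        estimates.append(x)
    return estimates


# Hypothetical noisy y-coordinates (pixels) of fixations on one line of text.
noisy_y = [102.0, 97.5, 104.1, 99.2, 101.3, 98.8]
smoothed = kalman_1d(noisy_y, process_var=0.5, meas_var=9.0, x0=100.0, p0=25.0)
print([round(v, 2) for v in smoothed])
```

A larger `meas_var` relative to `process_var` makes the filter trust its running estimate more and the individual noisy fixations less, which is the basic trade-off any such tracker has to tune.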

Scholarship at UWindsor