18 research outputs found

Base-package recommendation framework based on consumer behaviours in IPTV platform

Author: GUNAWARDHAHA Chanka
NAVARATHNA Rajitha
RANGANAYANKE Ruwinda
SHANMUGALINGAM Kuruparan
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/11/2020

Institutional Knowledge at Singapore Management University

Factorized Variational Autoencoders for Modeling Audience Reactions to Movies

Author: Carr Peter
Deng Zhiwei
Mandt Stephan
Matthews Iain
Mori Greg
Navarathna Rajitha
Yue Yisong
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/07/2017

Matrix and tensor factorization methods are often used for finding underlying low-dimensional patterns in noisy data. In this paper, we study non-linear tensor factorization methods based on deep variational autoencoders. Our approach is well-suited for settings where the relationship between the latent representation to be learned and the raw data representation is highly complex. We apply our approach to a large dataset of facial expressions of movie-watching audiences (over 16 million faces). Our experiments show that, compared to conventional linear factorization methods, our method achieves better reconstruction of the data and further discovers interpretable latent factors.

Crossref

Caltech Authors

Facial feature detection for in-car environment

Author: Lucey Patrick
Navarathna Rajitha
Publication venue: Queensland University of Technology, Faculty of Built Environment and Engineering
Publication date: 01/01/2009

Acoustically, vehicles are extremely noisy environments and, as a consequence, audio-only in-car voice recognition systems perform very poorly. Since the visual modality is immune to acoustic noise, using the visual lip information from the driver is seen as a viable strategy for circumventing this problem. However, implementing such an approach requires a system able to accurately locate and track the driver’s face and facial features in real time. In this paper we present such an approach using the Viola-Jones algorithm. Our results show that the Viola-Jones approach is a suitable method of locating and tracking the driver’s lips despite the visual variability of illumination and head pose.

Queensland University of Technology ePrints Archive

Robust recognition of human behaviour in challenging environments

Author: Navarathna Rajitha Dharshana Bandara
Publication venue: 'Queensland University of Technology'
Publication date: 01/01/2014

Novel techniques have been developed for the automatic recognition of human behaviour in challenging environments using information from visual and infra-red camera feeds. The techniques have been applied to two scenarios: recognising drivers' speech using lip movements, and recognising audience behaviour while watching a movie using facial features and body movements. The outcomes of the research in these two areas will be useful in improving the performance of voice recognition in automobiles for voice-based control and in obtaining accurate movie interest ratings based on live audience response analysis.

Queensland University of Technology ePrints Archive

Visual front-end wars: Viola-Jones face detector vs Fourier Lucas-Kanade

Author: Dean David
Kalantari Shahram
Navarathna Rajitha
Sridharan Sridha
Publication venue: Inria
Publication date: 01/01/2013

The performance of visual speech recognition (VSR) systems is significantly influenced by the accuracy of the visual front-end. The current state-of-the-art VSR systems use off-the-shelf face detectors such as Viola-Jones (VJ), which has limited reliability under changes in illumination and head pose. For a VSR system to perform well under these conditions, an accurate visual front-end is required. This is an important problem to be solved in many practical implementations of audio-visual speech recognition systems, for example in automotive environments for an efficient human-vehicle computer interface. In this paper, we re-examine the current state-of-the-art VSR by comparing off-the-shelf face detectors with the recently developed Fourier Lucas-Kanade (FLK) image alignment technique. A variety of image alignment and visual speech recognition experiments are performed on a clean dataset as well as on a challenging automotive audio-visual speech dataset. Our results indicate that the FLK image alignment technique can significantly outperform off-the-shelf face detectors, but requires frequent fine-tuning.

Queensland University of Technology ePrints Archive

Audio visual automatic speech recognition in vehicles

Author: Dean David B.
Lucey Patrick J.
Navarathna Rajitha
Sridharan Sridha
Publication date: 01/01/2010

Acoustically, car cabins are extremely noisy and, as a consequence, existing audio-only speech recognition systems for voice-based control of vehicle functions, such as the GPS-based navigator, perform poorly. Audio-only speech recognition systems fail to make use of the visual modality of speech (e.g. lip movements). As the visual modality is immune to acoustic noise, utilising this visual information in conjunction with an audio-only speech recognition system has the potential to improve the accuracy of the system. The field of recognising speech using both auditory and visual inputs is known as Audio Visual Automatic Speech Recognition (AVASR). Continuous research in the AVASR field has been ongoing for the past twenty-five years, with notable progress being made. However, the practical deployment of AVASR systems in a variety of real-world applications has not yet emerged. The main reason is that most research to date has neglected to address variabilities in the visual domain, such as illumination and viewpoint, in the design of the visual front-end of the AVASR system. In this paper we present an AVASR system in a real-world car environment using the AVICAR database [1], a publicly available in-car database, and we show that using visual speech in conjunction with the audio modality is a better approach to improving the robustness and effectiveness of voice-only recognition systems in car cabin environments.

CiteSeerX

Queensland University of Technology ePrints Archive

Recognising audio-visual speech in vehicles using the AVICAR database

Author: Dean David
Fookes Clinton
Lucey Patrick
Navarathna Rajitha
Sridharan Sridha
Publication venue: The Australasian Speech Science and Technology Association Inc
Publication date: 01/01/2010

Interacting with technology within a vehicle environment using a voice interface can greatly reduce the effects of driver distraction. Most current approaches to this problem utilise only the audio signal, making them susceptible to acoustic noise. An obvious way to circumvent this is to use the visual modality in addition. However, capturing, storing and distributing audio-visual data in a vehicle environment is very costly and difficult. One current dataset available for such research is the AVICAR [1] database. Unfortunately, this database is largely unusable due to a timing mismatch between the two streams and, in addition, no protocol is available. We have overcome this problem by re-synchronising the streams on the phone-number portion of the dataset and establishing a protocol for further research. This paper presents the first audio-visual results on this dataset for speaker-independent speech recognition. We hope this will serve as a catalyst for future research in this area.

Queensland University of Technology ePrints Archive

Lip detection for audio-visual speech recognition in-car environment

Author: Dean David
Fookes Clinton
Lucey Patrick
Navarathna Rajitha
Sridharan Sridha
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2010

Acoustically, car cabins are extremely noisy and, as a consequence, audio-only in-car voice recognition systems perform poorly. As the visual modality is immune to acoustic noise, using the visual lip information from the driver is seen as a viable strategy for circumventing this problem by means of audio-visual automatic speech recognition (AVASR). However, implementing AVASR requires a system able to accurately locate and track the driver’s face and lip area in real time. In this paper we present such an approach using the Viola-Jones algorithm. Using the AVICAR [1] in-car database, we show that the Viola-Jones approach is a suitable method of locating and tracking the driver’s lips for an audio-visual speech recognition system despite the visual variability of illumination and head pose.

Crossref

Queensland University of Technology ePrints Archive