
    Object Tracking from Audio and Video data using Linear Prediction method

    Microphone arrays and video surveillance cameras are widely used for detecting and tracking a moving speaker. In this project, object tracking was performed using multimodal fusion, i.e., audio-visual perception. Source localisation can be done with GCC-PHAT or GCC-ML for time delay estimation. These methods are based on the spectral content of the speech signals, which can be affected by noise and reverberation. Video tracking can be done using a Kalman filter or a particle filter. Therefore, a linear prediction method is used for both audio and video tracking. Linear prediction in source localisation uses features related to the excitation source information of speech, which are less affected by noise. Using this excitation source information, time delays are estimated and the results are compared with the GCC-PHAT method. The dataset obtained from [20] is used for video tracking of a single moving object captured by a stationary camera. For object detection, a projection histogram is computed, followed by linear prediction for tracking, and the corresponding results are compared with the Kalman filter method.
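    The abstract uses GCC-PHAT as the baseline for time delay estimation. As a point of reference only, the sketch below shows a minimal GCC-PHAT delay estimator in NumPy; the function name, interpolation factor, and test signal are illustrative assumptions and are not taken from the project itself.

```python
# Minimal GCC-PHAT time-delay estimation sketch (illustrative, NumPy only).
import numpy as np

def gcc_phat(sig, ref, fs, max_tau=None, interp=16):
    """Estimate the delay of `sig` relative to `ref` (in seconds) via GCC-PHAT."""
    n = sig.shape[0] + ref.shape[0]              # zero-pad to avoid circular wrap
    SIG = np.fft.rfft(sig, n=n)
    REF = np.fft.rfft(ref, n=n)
    R = SIG * np.conj(REF)                       # cross-power spectrum
    R /= np.abs(R) + 1e-15                       # PHAT weighting: keep phase only
    cc = np.fft.irfft(R, n=interp * n)           # interpolated cross-correlation
    max_shift = interp * n // 2
    if max_tau is not None:
        max_shift = min(int(interp * fs * max_tau), max_shift)
    cc = np.concatenate((cc[-max_shift:], cc[:max_shift + 1]))
    shift = np.argmax(np.abs(cc)) - max_shift    # peak index gives the lag
    return shift / float(interp * fs)

# Example: a 5-sample delay at 16 kHz should come back as roughly 0.0003125 s.
fs = 16000
x = np.random.randn(fs)
y = np.roll(x, 5)                                # y lags x by 5 samples
print(gcc_phat(y, x, fs))
```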

    Acoustic Speaker Localization with Strong Reverberation and Adaptive Feature Filtering with a Bayes RFS Framework

    The thesis investigates the challenges of speaker localization in the presence of strong reverberation, multi-speaker tracking, and multi-feature multi-speaker state filtering, using sound recordings from microphones. Novel reverberation-robust speaker localization algorithms are derived from the signal and room acoustics models. A multi-speaker tracking filter and a multi-feature multi-speaker state filter are developed based upon the generalized labeled multi-Bernoulli random finite set framework. Experiments and comparative studies have verified and demonstrated the benefits of the proposed methods.

    Speech Modeling and Robust Estimation for Diagnosis of Parkinson’s Disease


    Exploring Motion Signatures for Vision-Based Tracking, Recognition and Navigation

    As cameras become more and more popular in intelligent systems, algorithms and systems for understanding video data become more and more important. There is a broad range of applications, including object detection, tracking, scene understanding, and robot navigation. Besides stationary information, video data contains rich motion information about the environment. Biological visual systems, like human and animal eyes, are very sensitive to motion information. This has inspired active research on vision-based motion analysis in recent years. The main focus of motion analysis has been on low-level motion representations of pixels and image regions. However, motion signatures can benefit a broader range of applications if further in-depth analysis techniques are developed. In this dissertation, we mainly discuss how to exploit motion signatures to solve problems in two applications: object recognition and robot navigation. First, we use bird species recognition as the application to explore motion signatures for object recognition. We begin with a study of the periodic wingbeat motion of flying birds. To analyze the wing motion of a flying bird, we establish kinematic models for bird wings and obtain wingbeat periodicity in image frames after the perspective projection. Time series of salient extremities on bird images are extracted, and the wingbeat frequency is acquired for species classification. Physical experiments show that the frequency-based recognition method is robust to segmentation errors and measurement loss of up to 30%. In addition to the wing motion, the body motion of the bird is also analyzed to extract the flying velocity in 3D space. An interacting multiple-model approach is then designed to capture the combined object motion patterns and different environment conditions. The proposed systems and algorithms are tested in physical experiments, and the results show a false positive rate of around 20% with a false negative rate close to zero. Second, we explore motion signatures for vision-based vehicle navigation. We discover that motion vectors (MVs) encoded in Moving Picture Experts Group (MPEG) videos provide rich information about the motion in the environment, which can be used to reconstruct the vehicle ego-motion and the structure of the scene. However, MVs suffer from a high noise level. To handle this challenge, an error propagation model for MVs is first proposed. Several steps, including MV merging, plane-at-infinity elimination, and planar region extraction, are designed to further reduce noise. The extracted planes are used as landmarks in an extended Kalman filter (EKF) for simultaneous localization and mapping. Results show that the algorithm performs localization and plane mapping with a relative trajectory error below 5.1%. Exploiting the fact that MVs encode both environment information and moving obstacles, we further propose to track moving objects at the same time as localization and mapping. This enables two critical navigation functionalities, localization and obstacle avoidance, to be performed in a single framework. MVs are labeled as stationary or moving according to their consistency with geometric constraints. The extracted planes are thereby separated into moving objects and the stationary scene. Multiple EKFs are used to track the static scene and the moving objects simultaneously. In physical experiments, we show a detection rate of moving objects of 96.6% and a mean absolute localization error below 3.5 meters.
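    The wingbeat-frequency step described above (extracting a time series of a salient extremity and recovering its dominant period) can be illustrated with a small spectral sketch. Everything below, including the synthetic wing-tip signal, the frame rate, and the 30% frame-loss imputation, is an illustrative assumption rather than the dissertation's actual pipeline.

```python
# Hedged sketch: dominant-frequency estimation from a per-frame wing-tip track.
import numpy as np

def dominant_frequency(series, frame_rate):
    """Return the strongest non-DC frequency (Hz) in a 1-D time series."""
    x = np.asarray(series, dtype=float)
    x = x - x.mean()                               # remove DC so it cannot win the peak
    spectrum = np.abs(np.fft.rfft(x))
    freqs = np.fft.rfftfreq(x.size, d=1.0 / frame_rate)
    return freqs[np.argmax(spectrum[1:]) + 1]      # skip bin 0 (DC)

# Example: a synthetic 8 Hz wingbeat seen by a 60 fps camera, with noise and
# 30% of the measurements lost (imputed with the signal mean, here ~0).
fps, seconds, f_true = 60, 4, 8.0
t = np.arange(fps * seconds) / fps
wingtip_y = np.sin(2 * np.pi * f_true * t) + 0.3 * np.random.randn(t.size)
lost = np.random.rand(t.size) < 0.3
wingtip_y[lost] = 0.0
print(dominant_frequency(wingtip_y, fps))          # typically close to 8 Hz
```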

    A Brief Exposition on Brain-Computer Interface

    Brain-Computer Interface (BCI) is a technology that records brain signals and translates them into useful commands to operate a drone or a wheelchair. Drones are used in various applications, such as aerial operations where a pilot's presence is impossible. BCI can also be used by patients suffering from brain diseases who have lost control of their bodies and are unable to move to meet their basic needs. By taking advantage of BCI and drone technology, algorithms for a Mind-Controlled Unmanned Aerial System can be developed. This paper deals with the classification of BCI & UAV, methodologies of BCI, the framework of BCI, neuro-imaging methods, BCI headset options, BCI platforms, electrode types & their placement, and the result of the FFT feature extraction technique, which achieved 72.5% accuracy.
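    The paper reports an FFT-based feature extraction result; a common way to realize this for EEG is band-power features computed from the FFT spectrum. The sketch below follows that generic recipe; the band limits, window length, and array shapes are assumptions, not details taken from the paper.

```python
# Hedged sketch of FFT band-power feature extraction for an EEG window.
import numpy as np

BANDS = {"theta": (4, 8), "alpha": (8, 13), "beta": (13, 30)}  # Hz (assumed)

def band_power_features(eeg, fs):
    """eeg: (channels, samples) array -> flat feature vector of band powers."""
    spectrum = np.abs(np.fft.rfft(eeg, axis=1)) ** 2           # power spectrum
    freqs = np.fft.rfftfreq(eeg.shape[1], d=1.0 / fs)
    feats = []
    for lo, hi in BANDS.values():
        mask = (freqs >= lo) & (freqs < hi)
        feats.append(spectrum[:, mask].mean(axis=1))           # mean power per band
    return np.concatenate(feats)

# Example: one 2-second window of 8-channel EEG sampled at 256 Hz.
fs = 256
window = np.random.randn(8, 2 * fs)
features = band_power_features(window, fs)
print(features.shape)   # (24,) = 8 channels x 3 bands, fed to a classifier
```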

    ERP source tracking and localization from single trial EEG/MEG signals

    Electroencephalography (EEG) and magnetoencephalography (MEG), two of a number of neuroimaging techniques, are scalp recordings of the electrical activity of the brain. EEG and MEG (E/MEG) have excellent temporal resolution, are easy to acquire, and have a wide range of applications in science, medicine and engineering. These valuable signals, however, suffer from poor spatial resolution and, in many cases, from very low signal-to-noise ratios. In this study, new computational methods for analyzing and improving the quality of E/MEG signals are presented. We mainly focus on single trial event-related potential (ERP) estimation and E/MEG dipole source localization. Several methods based on particle filtering (PF) are proposed. First, a method using PF for single trial estimation of ERP signals is considered. In this method, the wavelet coefficients of each ERP are assumed to form a Markovian process and not to change extensively across trials. The wavelet coefficients are then estimated recursively using PF. The results for both simulated and real data are compared with those of the well-known Kalman filtering (KF) approach. In the next method we move from single trial estimation to source localization of E/MEG signals. The beamforming (BF) approach for dipole source localization is generalized based on prior information about the noise. BF is in fact a spatial filter that minimizes the power of all signals at the output of the filter except those that come from the locations of interest. In the proposed method, using two more constraints than in the classical BF formulation, the output noise powers are minimized and the interference activities are suppressed.
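    The classical beamformer that the thesis generalizes is the linearly constrained minimum variance (LCMV) spatial filter: minimize output power subject to unit gain at the location of interest. The sketch below implements only that classical baseline, not the proposed additional constraints; the covariance, lead field, and names are illustrative assumptions.

```python
# Hedged sketch of a classical LCMV beamformer for one source location.
import numpy as np

def lcmv_weights(cov, leadfield):
    """cov: (channels, channels) data covariance; leadfield: (channels, k)
    forward field of the location of interest. Returns (channels, k) weights."""
    cov_inv = np.linalg.pinv(cov)
    gram = leadfield.T @ cov_inv @ leadfield            # k x k
    return cov_inv @ leadfield @ np.linalg.pinv(gram)   # enforces unit gain

# Example: 64-channel E/MEG covariance and a 3-orientation lead field.
rng = np.random.default_rng(0)
data = rng.standard_normal((64, 1000))
cov = data @ data.T / data.shape[1]
L = rng.standard_normal((64, 3))
W = lcmv_weights(cov, L)
print(W.T @ L)          # approximately the 3x3 identity (the gain constraint)
```

A usage note: applying W.T to the sensor time series gives the estimated source activity at that location, and scanning locations over a grid yields a source power map.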