
    A new recursive algorithm for time-varying autoregressive (TVAR) model estimation and its application to speech analysis

    This paper proposes a new state-regularized (SR), QR-decomposition-based recursive least squares (QRRLS) algorithm with a variable forgetting factor (VFF) for recursive coefficient estimation of time-varying autoregressive (TVAR) models. It employs the estimated coefficients as prior information to minimize the exponentially weighted observation error, which yields reduced variance and bias compared with the traditional regularized RLS algorithm. It also increases the tracking speed by introducing a new measure of convergence status to control the forgetting factor. Simulations using synthetic and real speech signals show that the proposed method has better tracking performance and lower estimation error variance than conventional TVAR modeling methods during rapid changes of the AR coefficients. © 2012 IEEE. The 2012 IEEE International Symposium on Circuits and Systems (ISCAS), Seoul, Korea, 20-23 May 2012. In IEEE International Symposium on Circuits and Systems Proceedings, 2012, p. 1026-102
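    To make the recursion concrete, the sketch below implements a plain RLS update for tracking AR coefficients with a simple error-driven variable forgetting factor. It is a minimal illustration of the general mechanism only: the function name, the VFF heuristic, and all parameter values are assumptions, and the paper's state-regularized, QR-decomposition-based variant with its convergence-status measure is more involved.

        import numpy as np

        def rls_vff(y, p=2, lam_min=0.90, lam_max=0.999, delta=100.0):
            # Track time-varying AR(p) coefficients with RLS and a simple
            # error-driven variable forgetting factor (VFF). Illustrative
            # sketch only; not the paper's SR-QRRLS algorithm.
            n = len(y)
            w = np.zeros(p)            # current AR coefficient estimate
            P = delta * np.eye(p)      # inverse correlation matrix
            coeffs = np.zeros((n, p))
            for t in range(p, n):
                phi = y[t - p:t][::-1]     # regressor: p most recent samples
                e = y[t] - phi @ w         # a priori prediction error
                # heuristic VFF: larger errors shrink lambda (forget faster)
                lam = np.clip(lam_max - 0.1 * min(e * e, 1.0), lam_min, lam_max)
                k = P @ phi / (lam + phi @ P @ phi)    # gain vector
                w = w + k * e                          # coefficient update
                P = (P - np.outer(k, phi @ P)) / lam   # covariance update
                coeffs[t] = w
            return coeffs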

    Accurate Range-based Indoor Localization Using PSO-Kalman Filter Fusion

    Accurate indoor localization often depends on infrastructure support for distance estimation in range-based techniques. One can also trade off accuracy to reduce infrastructure investment by using the relative positions of other nodes, as in range-free localization. Even for range-based methods that use accurate ultra-wideband (UWB) signals, non-line-of-sight (NLOS) conditions pose significant difficulty for accurate indoor localization. Existing solutions rely on additional measurements from sensors and typically correct the noise using a Kalman filter (KF). Solutions can also be customized to specific environments through extensive profiling. In this work, a range-based indoor localization algorithm called PSO-Kalman Filter Fusion (PKFF) is proposed that minimizes the effect of NLOS conditions on localization error without using additional sensors or profiling. Location estimates from a windowed Particle Swarm Optimization (PSO) and a dynamically adjusted KF are fused based on a weighted variance factor. PKFF achieved a 40% lower 90th-percentile root-mean-square localization error (RMSE) than the standard least-squares trilateration algorithm: 61 cm compared with 102 cm.
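    The fusion step can be illustrated with inverse-variance weighting of the two position estimates. The snippet below is a hedged sketch: the paper's exact weighted variance factor and the windowed-PSO and KF internals are not reproduced here, and all names and values are assumptions.

        import numpy as np

        def fuse_estimates(x_pso, var_pso, x_kf, var_kf):
            # Inverse-variance weighting of two 2-D position estimates.
            # Illustrative only; PKFF's weighted variance factor is
            # defined in the paper.
            w_pso = 1.0 / var_pso
            w_kf = 1.0 / var_kf
            x_fused = (w_pso * x_pso + w_kf * x_kf) / (w_pso + w_kf)
            var_fused = 1.0 / (w_pso + w_kf)
            return x_fused, var_fused

        # Example: a noisy PSO fix fused with a tighter KF fix
        x, v = fuse_estimates(np.array([2.1, 3.4]), 0.50,
                              np.array([2.0, 3.0]), 0.10)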

    Data mining based learning algorithms for semi-supervised object identification and tracking

    Sensor exploitation (SE) is a crucial step in surveillance applications such as airport security and search-and-rescue operations. It allows localization and identification of movement in urban settings and can significantly boost knowledge gathering, interpretation, and action. Data mining techniques offer the promise of precise and accurate knowledge acquisition in high-dimensional data domains (diminishing the "curse of dimensionality" prevalent in such datasets), coupled with algorithmic design in feature extraction, discriminative ranking, feature fusion, and supervised learning (classification). Consequently, data mining techniques and algorithms can be used to refine and process captured data and to detect, recognize, classify, and track objects with predictably high degrees of specificity and sensitivity.

    Automatic object detection and tracking algorithms face several obstacles, such as large and incomplete datasets, ill-defined regions of interest (ROIs), variable scalability, lack of compactness, angular regions, partial occlusions, environmental variables, and unknown potential object classes, which work against their ability to achieve accurate real-time results. Methods must produce fast and accurate results by streamlining image processing, data compression and reduction, feature extraction, classification, and tracking algorithms. Data mining techniques can address these challenges by implementing efficient and accurate dimensionality reduction with feature extraction to refine an incomplete (ill-partitioned) data space, and by addressing challenges related to object classification, intra-class variability, and inter-class dependencies.

    A series of methods has been developed to combat many of these challenges for the purpose of creating a sensor exploitation and tracking framework for real-time image sensor inputs. The framework is broken down into a series of sub-routines, which work both in series and in parallel to accomplish tasks such as image pre-processing, data reduction, segmentation, object detection, tracking, and classification. These methods can be implemented either independently or together to form a synergistic solution to object detection and tracking. The main contributions to the SE field include novel feature extraction methods for highly discriminative object detection, classification, and tracking. Also, a new supervised classification scheme is presented for detecting objects in urban environments; this scheme incorporates both novel features and non-maximal suppression to reduce false alarms, which can be abundant in cluttered environments such as cities. Lastly, a performance evaluation of Graphics Processing Unit (GPU) implementations of the sub-task algorithms is presented, which provides insight into speed-up gains throughout the SE framework and improves the design for real-time applications. The overall framework provides a comprehensive SE system, which can be tailored for integration into a layered sensing scheme to provide the warfighter with automated assistance and support. As sensor technology and integration continue to advance, this SE framework can provide faster and more accurate decision support for both intelligence and civilian applications.
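    Since the abstract highlights non-maximal suppression for reducing false alarms, a minimal sketch of greedy non-maximal suppression over scored detection boxes follows. The function name and threshold are assumptions for illustration, not the thesis's implementation.

        import numpy as np

        def nms(boxes, scores, iou_thresh=0.5):
            # Greedy non-maximal suppression: keep the highest-scoring box,
            # drop boxes overlapping it above the IoU threshold, repeat.
            # boxes: (N, 4) array of [x1, y1, x2, y2].
            order = np.argsort(scores)[::-1]
            keep = []
            while order.size > 0:
                i = order[0]
                keep.append(i)
                # intersection of the top box with the remaining boxes
                x1 = np.maximum(boxes[i, 0], boxes[order[1:], 0])
                y1 = np.maximum(boxes[i, 1], boxes[order[1:], 1])
                x2 = np.minimum(boxes[i, 2], boxes[order[1:], 2])
                y2 = np.minimum(boxes[i, 3], boxes[order[1:], 3])
                inter = np.maximum(0, x2 - x1) * np.maximum(0, y2 - y1)
                area_i = ((boxes[i, 2] - boxes[i, 0]) *
                          (boxes[i, 3] - boxes[i, 1]))
                area_r = ((boxes[order[1:], 2] - boxes[order[1:], 0]) *
                          (boxes[order[1:], 3] - boxes[order[1:], 1]))
                iou = inter / (area_i + area_r - inter)
                order = order[1:][iou <= iou_thresh]
            return keep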

    Self-correcting Bayesian target tracking

    The copyright of this thesis rests with the author, and no quotation from it or information derived from it may be published without the prior written consent of the author.

    Visual tracking, a building block for many applications, faces challenges such as occlusions, illumination changes, background clutter, and variable motion dynamics that may degrade tracking performance and are likely to cause failures. In this thesis, we propose a Track-Evaluate-Correct framework (self-correction) for existing trackers in order to achieve robust tracking. For a tracker in the framework, we embed an evaluation block to check the status of the tracking quality and a correction block to avoid upcoming failures or to recover from failures. We present a generic representation and formulation of self-correcting tracking for Bayesian trackers using a Dynamic Bayesian Network (DBN). The self-correcting tracking operates similarly to a self-aware system, where parameters are tuned in the model, or different models are fused or selected in a piecewise way, in order to deal with tracking challenges and failures. In the DBN model representation, the parameter tuning, fusion, and model selection are driven by evaluation and correction variables that correspond to the evaluation and correction blocks, respectively. The inferences of the variables in the DBN model are used to explain the operation of self-correcting tracking. The specific contributions under the generic self-correcting framework are correlation-based self-correcting tracking for an extended object with model points, and tracker-level fusion, as described below.

    For improving the probabilistic tracking of an extended object with a set of model points, we use the Track-Evaluate-Correct framework to achieve self-correcting tracking. The framework combines the tracker with an online performance measure and a correction technique. We correlate model point trajectories to improve, online, the accuracy of a failed or uncertain tracker. A model point tracker gets assistance from neighbouring trackers whenever degradation in its performance is detected using the online performance measure. The correction of the model point state is based on the correlation information from the states of the other trackers. Partial Least Squares regression is used to model the correlation of point tracker states from short windowed trajectories adaptively. Experimental results on data obtained from optical motion capture systems show the improvement in tracking performance of the proposed framework compared to the baseline tracker and other state-of-the-art trackers. The proposed framework allows appropriate re-initialisation of local trackers to recover from failures caused by clutter and missed detections in the motion capture data.

    Finally, we propose a tracker-level fusion framework to obtain self-correcting tracking. The fusion framework combines trackers addressing different tracking challenges to improve the overall performance. As a novelty of the proposed framework, we include an online performance measure to identify the track quality level of each tracker to guide the fusion. The trackers in the framework assist each other based on appropriate mixing of the prior states. Moreover, the track quality level is used to update the target appearance model. We demonstrate the framework with two Bayesian trackers on video sequences with various challenges and show its robustness compared to the independent use of the trackers in the framework, and also compared to other state-of-the-art trackers. The appearance model update and prior mixing, both guided by the online performance measure, allow the proposed framework to deal with tracking challenges.
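    As a rough illustration of the prior-state mixing described above, the sketch below performs quality-weighted, moment-matched mixing of Gaussian prior states from multiple trackers, in the style of IMM mixing. The online performance measure that produces the quality scores is defined in the thesis and is simply taken as an input here; all names are assumptions.

        import numpy as np

        def mix_priors(states, covs, quality):
            # Quality-weighted mixing of prior states from several trackers.
            # Illustrative sketch of tracker-level fusion; `quality` stands
            # in for the thesis's online performance measure.
            w = np.asarray(quality, dtype=float)
            w = w / w.sum()                      # normalize quality scores
            mixed = sum(wi * s for wi, s in zip(w, states))
            # moment-matched covariance (as in IMM-style mixing)
            cov = sum(wi * (c + np.outer(s - mixed, s - mixed))
                      for wi, s, c in zip(w, states, covs))
            return mixed, cov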

    Real-time neural signal processing and low-power hardware co-design for wireless implantable brain machine interfaces

    Intracortical Brain-Machine Interfaces (iBMIs) have advanced significantly over the past two decades, demonstrating their utility in various aspects, including neuroprosthetic control and communication. To increase the information transfer rate and improve the devices' robustness and longevity, iBMI technology aims to increase channel counts to access more neural data while reducing invasiveness through miniaturisation and avoiding percutaneous connectors (wired implants). However, as the number of channels increases, the raw data bandwidth required for wireless transmission also increases and becomes prohibitive, requiring efficient on-implant processing to reduce the amount of data through data compression or feature extraction. The fundamental aim of this research is to develop methods for high-performance neural spike processing co-designed with low-power hardware that is scalable for real-time wireless BMI applications.

    The specific original contributions include the following. Firstly, a new method has been developed for hardware-efficient spike detection, which achieves state-of-the-art spike detection performance and significantly reduces the hardware complexity. Secondly, a novel thresholding mechanism for spike detection has been introduced. By incorporating firing-rate information as a key determinant in establishing the spike detection threshold, we have improved the adaptiveness of spike detection. This allows the spike detection to overcome the signal degradation that arises from scar tissue growth around the recording site, thereby ensuring enduringly stable spike detection results; as a consequence, the long-term decoding performance has also improved notably. Thirdly, the relationship between spike detection performance and neural decoding accuracy has been shown to be nonlinear, offering new opportunities for further reducing the transmission bandwidth by at least 30% with only minor decoding performance degradation.

    In summary, this thesis presents a journey toward designing ultra-hardware-efficient spike detection algorithms and applying them to reduce the data bandwidth and improve neural decoding performance. The software-hardware co-design approach is essential for the next generation of wireless brain-machine interfaces with increased channel counts and a highly constrained hardware budget.
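    A minimal sketch of firing-rate-aware spike detection is given below, using the nonlinear energy operator (NEO) and a threshold that is nudged toward a target firing rate once per second. The detector choice, update rule, and constants are assumptions for illustration; the thesis's hardware-efficient detector differs in detail.

        import numpy as np

        def detect_spikes(x, fs, target_rate=20.0, k=8.0, alpha=0.01):
            # NEO-based spike detection with a threshold steered by the
            # observed firing rate. Illustrative sketch; no refractory
            # period is enforced here.
            neo = x[1:-1]**2 - x[:-2] * x[2:]      # NEO emphasizes spikes
            thr = k * np.mean(np.abs(neo))         # initial threshold
            spikes, count = [], 0
            for i, v in enumerate(neo):
                if v > thr:
                    spikes.append(i + 1)           # spike sample index
                    count += 1
                # once per second, steer threshold toward the target rate
                if (i + 1) % int(fs) == 0:
                    thr *= (1 + alpha) if count > target_rate else (1 - alpha)
                    count = 0
            return spikes, thr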

    Seeing sound: a new way to illustrate auditory objects and their neural correlates

    This thesis develops a new method for time-frequency signal processing and examines the relevance of the new representation in studies of neural coding in songbirds. The method groups associated regions of the time-frequency plane into objects defined by time-frequency contours. By combining information about structurally stable contour shapes over multiple time scales and angles, a signal decomposition is produced that distributes resolution adaptively. As a result, distinct signal components are represented in their own most parsimonious forms. Next, through neural recordings in singing birds, it was found that activity in song premotor cortex is significantly correlated with the objects defined by this new representation of sound. In this process, an automated way of finding sub-syllable acoustic transitions in birdsongs was first developed, and increased spiking probability was then found at the boundaries of these acoustic transitions. Finally, a new approach to studying auditory cortical sequence processing more generally is proposed. In this approach, songbirds were trained to discriminate Morse-code-like sequences of clicks, and the neural correlates of this behavior were examined in primary and secondary auditory cortex. It was found that a distinct transformation of the auditory responses to the sequences of clicks occurs as information is transferred from primary to secondary auditory areas. Neurons in secondary auditory areas respond asynchronously and selectively, in a manner that depends on the temporal context of the click. This transformation from a temporal to a spatial representation of sound provides a possible basis for the songbird's natural ability to discriminate complex temporal sequences.
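    As a crude stand-in for the contour-based decomposition described above, the sketch below thresholds an STFT magnitude and groups connected time-frequency regions into labelled "objects". The real method combines contour shapes across multiple time scales and angles; the function name and parameters here are assumptions.

        import numpy as np
        from scipy.signal import stft
        from scipy.ndimage import label

        def tf_objects(x, fs, thresh_db=-30.0):
            # Group the time-frequency plane into connected components
            # above a magnitude threshold. Illustrative only; not the
            # thesis's multi-scale contour decomposition.
            f, t, Z = stft(x, fs=fs, nperseg=512)
            mag = 20 * np.log10(np.abs(Z) + 1e-12)
            mask = mag > (mag.max() + thresh_db)   # keep strong TF bins
            labels, n = label(mask)                # connected components
            return labels, n, f, t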

    Machine learning and inferencing for the decomposition of speech mixtures

    In this dissertation, we present and evaluate a novel approach for incorporating machine learning and inferencing into the time-frequency decomposition of speech signals in the context of speaker-independent multi-pitch tracking (MPT). The pitch tracking performance of the resulting algorithm is comparable to that of a state-of-the-art machine-learning algorithm for multi-pitch tracking while being significantly more computationally efficient and requiring much less training data. Multi-pitch tracking is a time-frequency signal processing problem in which mutual interference of the harmonics from different speakers makes it challenging to design an algorithm that reliably estimates the fundamental frequency trajectories of the individual speakers. The current state of the art in speaker-independent multi-pitch tracking utilizes 1) a deep neural network for producing spectrograms of individual speakers and 2) another deep neural network that acts upon the individual spectrograms and the original audio's spectrogram to produce estimates of the pitch tracks of the individual speakers. However, the implementation of this Multi-Spectrogram Machine-Learning (MS-ML) algorithm can be computationally intensive, which makes it impractical for hardware platforms such as embedded devices where computational power is limited. Instead of utilizing deep neural networks to estimate the pitch values directly, we have derived and evaluated a fault recognition and diagnosis (FRD) framework that utilizes machine learning and inferencing techniques to recognize potential faults in the pitch tracks produced by a traditional multi-pitch tracking algorithm. The result of this fault-recognition phase is then used to trigger a fault-diagnosis phase aimed at resolving the recognized fault(s) through adaptive adjustment of the time-frequency analysis of the input signal. The pitch estimates produced by the resulting FRD-ML algorithm are found to be comparable in accuracy to those produced by the MS-ML algorithm. However, our evaluation shows the FRD-ML algorithm to have significant advantages over the MS-ML algorithm. Specifically, the number of multiplications per second in FRD-ML is two orders of magnitude smaller, while the number of additions per second is about the same as in the MS-ML algorithm. Furthermore, the amount of training data required to achieve optimal performance is two orders of magnitude smaller for the FRD-ML algorithm than for the MS-ML algorithm. The reduction in the number of multiplications per second makes it more feasible to implement the MPT solution on hardware platforms with limited computational power, such as embedded devices, rather than relying on Graphics Processing Units (GPUs) or cloud computing. The reduction in training data size makes the algorithm more flexible to configure for different application scenarios, such as training for different languages where a large amount of training data may not be available.
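    To give a flavour of the fault-recognition phase, the sketch below flags frames where a pitch track jumps by more than a relative threshold, as happens with octave errors or track swaps between speakers. It is a simplified, assumed stand-in; the dissertation's FRD framework uses learned fault recognition and a diagnosis phase that re-tunes the time-frequency analysis.

        import numpy as np

        def find_pitch_faults(f0, max_jump=0.2):
            # Flag frames where the pitch track changes by more than
            # max_jump (relative change) between consecutive voiced
            # frames. Zero marks unvoiced frames.
            f0 = np.asarray(f0, dtype=float)
            valid = f0 > 0
            faults = []
            for i in range(1, len(f0)):
                if valid[i] and valid[i - 1]:
                    rel = abs(f0[i] - f0[i - 1]) / f0[i - 1]
                    if rel > max_jump:
                        faults.append(i)       # candidate fault frame
            return faults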