1,434 research outputs found

    Online estimation of discrete densities using classifier chains

    We propose an approach to estimate a discrete joint density online, that is, the algorithm is only provided the current example, its current estimate, and a limited amount of memory. To design an online estimator for discrete densities, we use classifier chains to model dependencies among features. Each classifier in the chain estimates the probability of one particular feature. Because a single chain may not provide a reliable estimate, we also consider ensembles of classifier chains. Our experiments on synthetic data show that the approach is feasible and that the estimated densities get closer to the true, known distribution as the amount of data increases.
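The chain factorization described in this abstract, p(x) = p(x_1) p(x_2 | x_1) ... p(x_n | x_1, ..., x_{n-1}), can be sketched as follows. This is a minimal illustration, not the authors' method: it assumes binary features, and it stands in a smoothed count table for each per-feature classifier. The class name `CountingChainDensity` and the smoothing parameter `alpha` are hypothetical. A count table keyed on the full prefix does not respect a memory limit; a real online classifier would take its place.

```python
from collections import defaultdict
from itertools import product


class CountingChainDensity:
    """Online estimate of p(x) as a chain of conditionals over binary features.

    Assumption: each "classifier" is a Laplace-smoothed count table,
    used here only as a stand-in for a real online classifier.
    """

    def __init__(self, n_features, alpha=1.0):
        self.n_features = n_features
        self.alpha = alpha  # smoothing strength (hypothetical parameter)
        # One table per chain position: prefix -> [count of 0, count of 1]
        self.counts = [defaultdict(lambda: [0, 0]) for _ in range(n_features)]

    def update(self, x):
        """Consume one example; only the counts are kept, not the example."""
        for i in range(self.n_features):
            self.counts[i][tuple(x[:i])][x[i]] += 1

    def prob(self, x):
        """Multiply the per-feature conditional estimates along the chain."""
        p = 1.0
        for i in range(self.n_features):
            c = self.counts[i].get(tuple(x[:i]), (0, 0))
            p *= (c[x[i]] + self.alpha) / (c[0] + c[1] + 2 * self.alpha)
        return p

    def table(self):
        """Estimated probability of every binary vector (small n only)."""
        return {x: self.prob(x) for x in product((0, 1), repeat=self.n_features)}
```

Because every conditional is normalized, the values in `table()` sum to one at any point in the stream. An ensemble in the sense of the abstract could average several such chains built over different feature orderings.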

    Hidden Markov Models

    Hidden Markov Models (HMMs), although known for decades, are widely used today and are still under active development. This book presents theoretical issues and a variety of HMM applications in speech recognition and synthesis, medicine, neuroscience, computational biology, bioinformatics, seismology, environmental protection, and engineering. I hope that readers will find this book useful and helpful for their own research.

    On adaptive decision rules and decision parameter adaptation for automatic speech recognition

    Recent advances in automatic speech recognition are accomplished by designing a plug-in maximum a posteriori decision rule such that the forms of the acoustic and language model distributions are specified and the parameters of the assumed distributions are estimated from a collection of speech and language training corpora. Maximum-likelihood point estimation is by far the most prevalent training method. However, due to the problems of unknown speech distributions, sparse training data, high spectral and temporal variabilities in speech, and possible mismatch between training and testing conditions, a dynamic training strategy is needed. To cope with the changing speakers and speaking conditions in real operational conditions for high-performance speech recognition, such paradigms incorporate a small amount of speaker- and environment-specific adaptation data into the training process. Bayesian adaptive learning is an optimal way to combine prior knowledge in an existing collection of general models with a new set of condition-specific adaptation data. In this paper, the mathematical framework for Bayesian adaptation of acoustic and language model parameters is first described. Maximum a posteriori point estimation is then developed for hidden Markov models and a number of useful parameter densities commonly used in automatic speech recognition and natural language processing.
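For orientation, maximum a posteriori point estimation can be written in its generic textbook form as below. The abstract gives no formulas, so the notation is an assumption chosen for illustration rather than the paper's own: $\theta$ for model parameters, $X$ for the adaptation data, $g$ for the prior density, and, in the example, a Gaussian mean with known variance $\sigma^2$, prior mean $\mu_0$, and prior weight $\tau$.

```latex
\hat{\theta}_{\mathrm{MAP}} = \arg\max_{\theta} \; p(X \mid \theta)\, g(\theta)

% Example: Gaussian mean with known variance \sigma^2,
% prior \mu \sim \mathcal{N}(\mu_0, \sigma^2 / \tau), data x_1, \dots, x_T:
\hat{\mu}_{\mathrm{MAP}} = \frac{\tau \mu_0 + \sum_{t=1}^{T} x_t}{\tau + T}
```

With little adaptation data (small $T$) the estimate stays close to the prior mean $\mu_0$; as $T$ grows it moves toward the sample mean, which is the maximum-likelihood estimate.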

    Adaptive sequential feature selection in visual perception and pattern recognition

    In the human visual system, one of the most prominent functions of the extensive feedback from the higher brain areas within and outside of the visual cortex is attentional modulation. The feedback helps the brain to concentrate its resources on visual features that are relevant for recognition, i.e. it iteratively selects certain aspects of the visual scene for refined processing by the lower areas until the inference process in the higher areas converges to a single hypothesis about this scene. In order to minimize the number of required selection-refinement iterations, one has to find a short sequence of maximally informative portions of the visual input. Since the feedback is not static, the selection process is adapted to the scene that should be recognized. To find a scene-specific subset of informative features, the adaptive selection process on every iteration utilizes results of previous processing in order to reduce the remaining uncertainty about the visual scene. This phenomenon inspired us to develop a computational algorithm for a visual classification task that incorporates this principle of adaptive feature selection. It is especially interesting because feature selection methods are usually not adaptive: they define a single set of informative features for a task and use it for classifying all objects. An adaptive algorithm, in contrast, selects the features that are most informative for the particular input. Thus, the selection process should be driven by statistics of the environment concerning the current task and the object to be classified. Applied to a classification task, our adaptive feature selection algorithm favors features that maximally reduce the current class uncertainty, which is iteratively updated with values of the previously selected features that are observed on the testing sample.
In information-theoretical terms, the selection criterion is the mutual information of a class variable and a feature-candidate conditioned on the already selected features, which take values observed on the current testing sample. The main question investigated in this thesis is whether, and in which situations, the proposed adaptive way of selecting features is advantageous over conventional feature selection. Further, we studied whether the proposed adaptive information-theoretical selection scheme, which is a computationally complex algorithm, is utilized by humans while they perform a visual classification task. For this, we constructed a psychophysical experiment where people had to select the image parts that they think are relevant for classification of these images. We present the analysis of behavioral data where we investigate whether human strategies of task-dependent selective attention can be explained by a simple ranker based on the mutual information, a more complex feature selection algorithm based on the conventional static mutual information, or the adaptive feature selector proposed here, which mimics a mechanism of iterative hypothesis refinement. The main contribution of this work is the adaptive feature selection criterion based on the conditional mutual information. It is also shown that such an adaptive selection strategy is indeed used by people while performing visual classification. Contents: 1. Introduction; 2. Conventional feature selection; 3. Adaptive feature selection; 4. Experimental investigations of ACMIFS; 5. Information-theoretical strategies of selective attention; 6. Discussion; Appendix; Bibliography.
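The selection criterion named in this abstract, the mutual information between the class and a candidate feature conditioned on the values already observed on the test sample, can be illustrated with a small empirical estimator. This is a sketch under stated assumptions, not the thesis's ACMIFS implementation: features are discrete, probabilities are plain frequency estimates, and conditioning is done by keeping only the training rows that match the observed values. The function names `mutual_information` and `next_feature` are hypothetical.

```python
from collections import Counter
from math import log2


def mutual_information(pairs):
    """Empirical I(C; F) in bits from a list of (class, feature_value) pairs."""
    n = len(pairs)
    if n == 0:
        return 0.0
    joint = Counter(pairs)
    pc = Counter(c for c, _ in pairs)
    pf = Counter(f for _, f in pairs)
    # p(c,f) * log2( p(c,f) / (p(c) p(f)) ), written with raw counts
    return sum(
        (k / n) * log2(k * n / (pc[c] * pf[f]))
        for (c, f), k in joint.items()
    )


def next_feature(rows, labels, observed):
    """Pick the unobserved feature j that maximizes I(C; F_j | observed values).

    rows: training feature tuples; labels: their class labels;
    observed: dict {feature index: value seen on the current test sample}.
    Returns (best index or None, dict of scores per candidate).
    """
    # Condition on the observed values by filtering the training rows.
    matching = [
        (r, c) for r, c in zip(rows, labels)
        if all(r[i] == v for i, v in observed.items())
    ]
    candidates = [j for j in range(len(rows[0])) if j not in observed]
    if not candidates:
        return None, {}
    scores = {
        j: mutual_information([(c, r[j]) for r, c in matching])
        for j in candidates
    }
    return max(scores, key=scores.get), scores
```

The choice is adaptive in the sense of the abstract: the same training data can yield a different next feature depending on which values have been observed so far, whereas a static selector would rank features once for all inputs.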

    Decision making with reciprocal chains and binary neural network models

    Automated decision making systems are relied on in increasingly diverse and critical settings. Human users expect such systems to improve or augment their own decision making in complex scenarios, in real time, often across distributed networks of devices. This thesis studies binary decision making systems of two forms. The first system is built from a reciprocal chain, a statistical model able to capture the intentional behaviour of targets moving through a state space, such as moving towards a destination state. The first part of the thesis questions the utility of this higher level information in a tracking problem where the system must decide whether a target exists or not. The contributions of this study characterise the benefits to be expected from reciprocal chains for tracking, using statistical tools and a novel simulation environment that provides relevant numerical experiments. Real world decision making systems often combine statistical models, such as the reciprocal chain, with the second type of system studied in this thesis, a neural network. In the tracking context, a neural network typically forms the object detection system. However, the power consumption and memory usage of state-of-the-art neural networks make their use on small devices infeasible. This motivates the study of binary neural networks in the second part of the thesis. Such networks use less memory and are efficient to run, compared to standard full precision networks. However, their optimisation is difficult, due to the non-differentiable functions involved. Several algorithms elect to optimise surrogate networks that are differentiable and correspond in some way to the original binary network. Unfortunately, the many choices involved in the algorithm design are poorly understood. The second part of the thesis questions the role of parameter initialisation in the optimisation of binary neural networks.
Borrowing analytic tools from statistical physics, it is possible to characterise the typical behaviour of a range of algorithms at initialisation precisely, by studying how input signals propagate through these networks on average. This theoretical development also yields practical outcomes, providing scales that limit network depth and suggesting new initialisation methods for binary neural networks. Thesis (Ph.D.) -- University of Adelaide, School of Electrical & Electronic Engineering, 202