87 research outputs found

    Hidden Markov Models

    Hidden Markov Models (HMMs), although known for decades, have become widely used in recent years and are still under active development. This book presents theoretical issues and a variety of HMM applications in speech recognition and synthesis, medicine, neurosciences, computational biology, bioinformatics, seismology, environmental protection, and engineering. I hope that readers will find this book useful and helpful for their own research.

    The Revisiting Problem in Simultaneous Localization and Mapping: A Survey on Visual Loop Closure Detection

    Where am I? This is one of the most critical questions that any intelligent system should answer to decide whether it navigates to a previously visited area. This problem has long been acknowledged for its challenging nature in simultaneous localization and mapping (SLAM), wherein the robot needs to correctly associate the incoming sensory data to the database allowing consistent map generation. The significant advances in computer vision achieved over the last 20 years, the increased computational power, and the growing demand for long-term exploration contributed to efficiently performing such a complex task with inexpensive perception sensors. In this article, visual loop closure detection, which formulates a solution based solely on appearance input data, is surveyed. We start by briefly introducing place recognition and SLAM concepts in robotics. Then, we describe a loop closure detection system's structure, covering an extensive collection of topics, including the feature extraction, the environment representation, the decision-making step, and the evaluation process. We conclude by discussing open and new research challenges, particularly concerning the robustness in dynamic environments, the computational complexity, and scalability in long-term operations. The article aims to serve as a tutorial and a position paper for newcomers to visual loop closure detection. Comment: 25 pages, 15 figures

    Video Understanding: A Predictive Analytics Perspective

    This dissertation includes a detailed study of video predictive understanding, an emerging perspective on video-based computer vision research. This direction explores machine vision techniques to fill in missing spatiotemporal information in videos (e.g., predict the future), which is of great importance for understanding real-world dynamics and benefits many applications. We investigate this direction in both depth and breadth. Four emerging areas are considered and improved by our efforts: early action recognition, future activity prediction, trajectory prediction, and procedure planning. For each, our research presents innovative solutions based on machine learning techniques (deep learning in particular) while paying special attention to their interpretability, multi-modality, and efficiency, which we consider critical for next-generation Artificial Intelligence (AI). Finally, we conclude this dissertation by discussing current shortcomings as well as future directions.

    Analyzing Handwritten and Transcribed Symbols in Disparate Corpora

    Cuneiform tablets are among the oldest textual artifacts, were used for more than three millennia, and are comparable in amount and relevance to texts written in Latin or ancient Greek. These tablets are typically found in the Middle East and were written by imprinting wedge-shaped impressions into wet clay. Motivated by the increased demand for computerized analysis of documents within the Digital Humanities, we develop the foundation for quantitative processing of cuneiform script. Acquiring a cuneiform tablet with a 3D scanner and manually creating line tracings yield two completely different representations of the same type of text source. Each representation is typically processed with its own tool-set, and the textual analysis is therefore limited to a certain type of digital representation. To homogenize these data sources, a unifying minimal wedge feature description is introduced. It is extracted by pattern matching and subsequent conflict resolution, as cuneiform is written densely with highly overlapping wedges. Similarity metrics for cuneiform signs based on distinct assumptions are presented. (i) An implicit model represents cuneiform signs using undirected mathematical graphs and measures the similarity of signs with graph kernels. (ii) An explicit model approaches the problem of recognition by an optimal assignment between the wedge configurations of two signs. Further, methods for spotting cuneiform script are developed, combining the feature descriptors for cuneiform wedges with prior work on segmentation-free word spotting using part-structured models. The ink-ball model is adapted by treating wedge feature descriptors as individual parts. The similarity metrics and the adapted spotting model are both evaluated on a real-world dataset, outperforming the state of the art in cuneiform sign similarity and spotting. To prove the applicability of these methods for computational cuneiform analysis, a novel approach is presented for mining frequent constellations of wedges, resulting in spatial n-grams. Furthermore, a method for automated transliteration of tablets is evaluated by employing structured and sequential learning on a dataset of parallel sentences. Finally, the conclusion outlines how the presented methods enable the development of new tools and computational analyses, which are objective and reproducible, for quantitative processing of cuneiform script.

    Doppler compensation algorithms for DSP-based implementation of OFDM underwater acoustic communication systems

    In recent years, orthogonal frequency division multiplexing (OFDM) has gained considerable attention in the development of underwater communication (UWC) systems for civilian and military applications. However, the wideband nature of the communication links necessitates robust algorithms to combat the consequences of severe channel conditions such as frequency selectivity, ambient noise, severe multipath, and the Doppler effect due to velocity change between the transmitter and receiver. This velocity perturbation comprises two scenarios: the first induces constant time scale expansion/compression, or zero acceleration, during the transmitted packet time, and the second is a time-varying Doppler shift. The latter is an increasingly important area in autonomous underwater vehicle (AUV) applications. The aim of this thesis is to design a low-complexity OFDM-based receiver structure for underwater communication that tackles the inherent Doppler effect and is applicable for developing real-time systems on a digital signal processor (DSP). The proposed structure presents a paradigm in modem design from previous generations of single carrier receivers employing computationally expensive equalizers. The thesis demonstrates the issues related to designing a practical OFDM system, such as channel coding and peak-to-average power ratio (PAPR). In channel coding, the proposed algorithms employ convolutional bit-interleaved coded modulation with iterative decoding (BICM-ID) to obtain a higher degree of protection against power fading caused by the channel. A novel receiver structure that combines an adaptive Doppler-shift correction and BICM-ID for multi-carrier systems is presented. In addition, the selective mapping (SLM) technique has been utilized for PAPR reduction. Because the channel is time varying and frequency selective, the proposed systems are investigated via both laboratory simulations and experiments conducted in the North Sea off the UK's North East coast. The results of the study show that the proposed systems outperform block-based Doppler-shift compensation and are capable of tracking the Doppler shift at accelerations up to 1 m/s². EThOS - Electronic Theses Online Service; Iraqi Government's Ministry of Higher Education and Scientific Research; GB, United Kingdom

    Factor Graphs for Computer Vision and Image Processing

    Factor graphs have been used extensively in the decoding of error correcting codes such as turbo codes, and in signal processing. However, while computer vision and pattern recognition are awash with graphical model usage, it is somewhat surprising that factor graphs are still under-researched in these communities. This is surprising because factor graphs naturally generalise both Markov random fields and Bayesian networks. Moreover, they are useful in modelling relationships between variables that are not necessarily probabilistic and allow for efficient marginalisation via a sum-product of probabilities. In this thesis, we present and illustrate the utility of factor graphs in the vision community through some of the field's popular problems. The thesis does so with a particular focus on maximum a posteriori (MAP) inference in graphical structures with layers. To this end, we are able to break down complex problems into factored representations and more computationally realisable constructions. Firstly, we present a sum-product framework that uses the explicit factorisation in local subgraphs from the partitioned factor graph of a layered structure to perform inference. This provides an efficient method to perform inference since exact inference is attainable in the resulting local subtrees. Secondly, we extend this framework to the entire graphical structure without partitioning, and discuss preliminary ways to combine outputs from a multilevel construction. Lastly, we further our endeavour to combine evidence from different methods through a simplicial spanning tree reparameterisation of the factor graph in a way that ensures consistency, to produce an ensembled and improved result. Throughout the thesis, the underlying feature we make use of is to enforce adjacency constraints using Delaunay triangulations computed by adding points dynamically, or using a convex hull algorithm. The adjacency relationships from Delaunay triangulations help the factor graph approaches in this thesis to be both efficient and competitive for computer vision tasks. This is because of the low treewidth they provide in local subgraphs, as well as the reparameterised interpretation of the graph they form through the spanning tree of simplexes. While exact inference is known to be intractable for junction trees obtained from the loopy graphs in computer vision, in this thesis we are able to effect exact inference on our spanning tree of simplexes. More importantly, the approaches presented here are not restricted to the computer vision and image processing fields, but are extendable to more general applications that involve distributed computations.