87 research outputs found
Hidden Markov Models
Hidden Markov Models (HMMs), although known for decades, have seen a resurgence in recent years and are still under active development. This book presents theoretical issues and a variety of HMM applications in speech recognition and synthesis, medicine, neuroscience, computational biology, bioinformatics, seismology, environmental protection, and engineering. I hope that readers will find this book useful and helpful for their own research.
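Since the book surveys HMM applications, a minimal sketch of the core computation may help orient readers: the forward algorithm, which evaluates the likelihood of an observation sequence. The two-state model and its numbers below are purely illustrative assumptions, not taken from the book.

```python
import numpy as np

def forward(obs, pi, A, B):
    """Forward algorithm: likelihood of an observation sequence under an HMM.

    pi:  initial state distribution, shape (S,)
    A:   state transition matrix, shape (S, S)
    B:   emission probabilities, shape (S, O)
    obs: sequence of observation indices
    """
    alpha = pi * B[:, obs[0]]          # alpha_1(j) = pi_j * b_j(o_1)
    for o in obs[1:]:
        alpha = (alpha @ A) * B[:, o]  # alpha_t(j) = sum_i alpha_{t-1}(i) a_ij * b_j(o_t)
    return float(alpha.sum())

# Toy two-state model with illustrative numbers.
pi = np.array([0.6, 0.4])
A = np.array([[0.7, 0.3],
              [0.4, 0.6]])
B = np.array([[0.1, 0.4, 0.5],
              [0.6, 0.3, 0.1]])
likelihood = forward([0, 1, 2], pi, A, B)
```

The same recursion, run with max instead of sum and with back-pointers, yields the Viterbi decoder.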
The Revisiting Problem in Simultaneous Localization and Mapping: A Survey on Visual Loop Closure Detection
Where am I? This is one of the most critical questions that any intelligent
system must answer to decide whether it is navigating a previously visited
area. This problem has long been acknowledged for its challenging nature in
simultaneous localization and mapping (SLAM), wherein the robot needs to
correctly associate incoming sensory data with its database to allow
consistent map generation. The significant advances in computer vision achieved
over the last 20 years, the increased computational power, and the growing
demand for long-term exploration have made it possible to perform this
complex task efficiently with inexpensive perception sensors. In this article, visual loop
closure detection, which formulates a solution based solely on appearance input
data, is surveyed. We start by briefly introducing place recognition and SLAM
concepts in robotics. Then, we describe a loop closure detection system's
structure, covering an extensive collection of topics, including the feature
extraction, the environment representation, the decision-making step, and the
evaluation process. We conclude by discussing open and new research challenges,
particularly concerning the robustness in dynamic environments, the
computational complexity, and scalability in long-term operations. The article
aims to serve as a tutorial and a position paper for newcomers to visual loop
closure detection.
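As an illustration of the appearance-only formulation described above, a common baseline represents each image as a visual-word histogram and declares a loop closure when a sufficiently similar past frame is found. The tiny vocabulary, threshold, and `exclude_recent` window below are illustrative assumptions, not a specific system from the survey.

```python
import numpy as np

def bow_similarity(h1, h2):
    """Cosine similarity between two visual-word histograms."""
    n1, n2 = np.linalg.norm(h1), np.linalg.norm(h2)
    if n1 == 0 or n2 == 0:
        return 0.0
    return float(h1 @ h2 / (n1 * n2))

def detect_loop(query, database, threshold=0.8, exclude_recent=10):
    """Return the index of the best-matching past frame, or None.

    Recent frames are excluded so that temporally adjacent (and thus
    trivially similar) images are not reported as loop closures.
    """
    candidates = database[:-exclude_recent] if exclude_recent else database
    best, best_idx = threshold, None
    for i, h in enumerate(candidates):
        s = bow_similarity(query, h)
        if s > best:
            best, best_idx = s, i
    return best_idx

# Illustrative 4-word vocabulary; real systems use thousands of words.
database = [np.array([1., 0., 2., 0.]), np.array([0., 3., 0., 1.])]
query = np.array([2., 0., 4., 0.])
match = detect_loop(query, database, threshold=0.8, exclude_recent=0)
```

A full system would add the temporal-consistency and geometric-verification steps that the decision-making stage of the survey covers.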
Video Understanding: A Predictive Analytics Perspective
This dissertation presents a detailed study of video predictive understanding, an emerging perspective in video-based computer vision research. This direction explores machine vision techniques that fill in missing spatiotemporal information in videos (e.g., predict the future), which is of great importance for understanding real-world dynamics and benefits many applications. We investigate this direction in both depth and breadth. Four emerging areas are considered and improved by our efforts: early action recognition, future activity prediction, trajectory prediction, and procedure planning. For each, our research presents innovative solutions based on machine learning techniques (deep learning in particular) while paying special attention to their interpretability, multi-modality, and efficiency, which we consider critical for next-generation Artificial Intelligence (AI). Finally, we conclude the dissertation by discussing current shortcomings as well as future directions.
Analyzing Handwritten and Transcribed Symbols in Disparate Corpora
Cuneiform tablets are among the oldest textual artifacts, in use for more than
three millennia, and are comparable in amount and relevance
to texts written in Latin or ancient Greek.
These tablets are typically found in the Middle East and were
written by pressing wedge-shaped impressions into wet clay.
Motivated by the increased demand for computerized analysis of documents within
the Digital Humanities, we develop the foundation for quantitative processing
of cuneiform script.
A cuneiform tablet acquired with a 3D scanner and a manually created line
tracing are two completely different representations of the same type of text
source. Each representation is typically processed with its own tool set, and
textual analysis is therefore limited to a certain type of digital
representation. To homogenize these data sources, a unifying minimal wedge
feature description is introduced. It is extracted by
pattern matching and subsequent conflict resolution
as cuneiform is written densely with highly overlapping wedges.
Similarity metrics for cuneiform signs based on distinct
assumptions are presented. (i) An implicit model represents cuneiform signs
using undirected mathematical graphs and measures the similarity of
signs with graph kernels.
(ii) An explicit model approaches the problem of recognition by an optimal
assignment between the wedge configurations of two signs.
Further, methods for spotting cuneiform script are developed, combining
the feature descriptors for cuneiform wedges with prior work on
segmentation-free word spotting using part-structured models.
The ink-ball model is adapted by treating wedge feature descriptors as
individual parts.
The similarity metrics and the adapted spotting model are both evaluated
on a real-world dataset outperforming the state-of-the-art in
cuneiform sign similarity and spotting.
To prove the applicability of these methods for computational cuneiform
analysis, a novel approach is presented for mining frequent
constellations of wedges resulting in spatial n-grams. Furthermore,
a method for automated transliteration of tablets is evaluated by
employing structured and sequential learning on a dataset of
parallel sentences. Finally, the conclusion
outlines how the presented methods enable the development of new tools
and computational analyses, which are objective and reproducible,
for quantitative processing of cuneiform script.
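The explicit model's idea of scoring two signs by an optimal assignment between their wedge configurations can be sketched as follows. The wedge feature vectors and the brute-force matcher are illustrative simplifications, not the thesis's method: a real system would handle unequal wedge counts and use an efficient solver such as the Hungarian algorithm instead of enumerating permutations.

```python
import numpy as np
from itertools import permutations

def wedge_distance(w1, w2):
    """Euclidean distance between two wedge feature vectors
    (e.g., position, orientation, scale)."""
    return float(np.linalg.norm(np.asarray(w1) - np.asarray(w2)))

def sign_similarity(sign_a, sign_b):
    """Similarity of two cuneiform signs via an optimal one-to-one
    assignment of their wedges (brute force; fine for small signs)."""
    if len(sign_a) != len(sign_b):
        return 0.0  # simplification: real models score unmatched wedges
    best_cost = min(
        sum(wedge_distance(a, sign_b[j]) for a, j in zip(sign_a, perm))
        for perm in permutations(range(len(sign_b)))
    )
    return 1.0 / (1.0 + best_cost)  # identical configurations score 1.0
```

Because the cost minimizes over all assignments, the score is invariant to the order in which wedges were extracted from the two signs.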
Doppler compensation algorithms for DSP-based implementation of OFDM underwater acoustic communication systems
In recent years, orthogonal frequency division multiplexing (OFDM) has gained considerable attention in the development of underwater communication (UWC) systems for civilian and military applications. However, the wideband nature of the communication links necessitates robust algorithms to combat the consequences of severe channel conditions such as frequency selectivity, ambient noise, severe multipath, and the Doppler effect caused by relative velocity changes between the transmitter and receiver. This velocity perturbation comprises two scenarios: the first induces a constant time-scale expansion/compression, i.e., zero acceleration during the transmitted packet time, and the second is a time-varying Doppler shift. The latter is an increasingly important area in autonomous underwater vehicle (AUV) applications. The aim of this thesis is to design a low-complexity OFDM-based receiver structure for underwater communication that tackles the inherent Doppler effect and is applicable to developing real-time systems on a digital signal processor (DSP). The proposed structure marks a shift in modem design away from previous generations of single-carrier receivers employing computationally expensive equalizers. The thesis examines the issues involved in designing a practical OFDM system, such as channel coding and peak-to-average power ratio (PAPR). In channel coding, the proposed algorithms employ convolutional bit-interleaved coded modulation with iterative decoding (BICM-ID) to obtain a higher degree of protection against power fading caused by the channel. A novel receiver structure that combines adaptive Doppler-shift correction and BICM-ID for multi-carrier systems is presented. In addition, the selective mapping (SLM) technique has been utilized for PAPR reduction. Owing to the time-varying and frequency-selective nature of the channel, the proposed systems are investigated via both laboratory simulations and experiments conducted in the North Sea off the UK's North East coast.
The results of the study show that the proposed systems outperform block-based Doppler-shift compensation and are capable of tracking the Doppler shift at accelerations up to 1 m/s². (EThOS - Electronic Theses Online Service; funded by the Iraqi Government's Ministry of Higher Education and Scientific Research; United Kingdom.)
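For the constant time-scale (zero-acceleration) scenario described above, Doppler compensation amounts to resampling the received signal by the estimated scale factor 1 + v/c. The sketch below, with illustrative velocity and sampling parameters, is a minimal demonstration of that principle, not the thesis's receiver.

```python
import numpy as np

def resample(signal, scale):
    """Resample `signal` by time-scale factor `scale` via linear
    interpolation; undoes a Doppler expansion/compression of 1/scale."""
    n = int(round(len(signal) / scale))
    new_t = np.linspace(0, len(signal) - 1, n)
    return np.interp(new_t, np.arange(len(signal)), signal)

# Illustrative parameters: 3 m/s relative velocity, c ~ 1500 m/s in water.
c, v = 1500.0, 3.0
scale = 1.0 + v / c                 # time-scale factor ~ 1.002
t = np.arange(0, 1, 1e-3)           # 1 kHz sampling, 1 s packet
tx = np.sin(2 * np.pi * 50 * t)     # transmitted 50 Hz tone
rx = resample(tx, 1.0 / scale)      # channel stretches the packet
corrected = resample(rx, scale)     # receiver compensates
```

In practice the scale factor is estimated, e.g., from the measured duration of a known preamble, rather than assumed known as here.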
Factor Graphs for Computer Vision and Image Processing
Factor graphs have been used extensively in the decoding of error
correcting codes such as turbo codes, and in signal processing.
However, while computer vision and pattern recognition are awash
with graphical models, factor graphs remain somewhat
under-researched in these
communities. This is surprising because factor graphs naturally
generalise both Markov random fields and Bayesian networks.
Moreover, they are useful in modelling relationships between
variables that are not necessarily probabilistic and allow for
efficient marginalisation via the sum-product algorithm.
In this thesis, we present and illustrate the utility of factor
graphs in the vision community through some of the field's
popular problems. The thesis does so with a particular focus on
maximum a posteriori (MAP) inference in graphical
structures with layers. To this end, we are able to break down
complex problems into factored representations and more
computationally realisable constructions. Firstly, we present a
sum-product framework that uses the explicit factorisation
in local subgraphs from the partitioned factor graph of a layered
structure to perform inference. This provides an efficient method
to perform inference since exact inference is attainable in the
resulting local subtrees. Secondly, we extend this framework to
the entire graphical structure without partitioning, and discuss
preliminary ways to combine outputs from a multilevel
construction. Lastly, we further our endeavour to combine
evidence from different methods through
a simplicial spanning tree reparameterisation of the factor graph
in a way that ensures consistency, to produce an ensembled and
improved result. Throughout the thesis, the underlying feature we
make use of is to enforce adjacency constraints using Delaunay
triangulations computed by adding points dynamically, or using a
convex hull algorithm. The adjacency relationships from Delaunay
triangulations aid the factor graph approaches in this thesis to
be both efficient and
competitive for computer vision tasks. This is because of the low
treewidth they provide in local subgraphs, as well as the
reparameterised interpretation of the graph they form through the
spanning tree of simplexes. While exact inference is known to be
intractable for junction trees obtained from the loopy graphs in
computer vision, in this thesis we are able to effect exact
inference on our spanning tree of simplexes. More importantly,
the approaches presented here are not restricted to the computer
vision and image processing fields, but are extendable to more
general applications that involve distributed computations.
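For a chain-structured factor graph, the simplest case of the tree-structured subgraphs in which the thesis performs exact inference, sum-product reduces to forward/backward message passing. The sketch below computes an exact marginal this way; the binary variables and factor values are illustrative assumptions, not an example from the thesis.

```python
import numpy as np

def chain_marginal(unaries, pairwise, i):
    """Exact marginal of variable i in a chain factor graph via
    sum-product (forward/backward) message passing.

    unaries:  list of n unary factor vectors
    pairwise: list of n-1 pairwise factor matrices (k -> k+1)
    """
    n = len(unaries)
    fwd = [None] * n
    bwd = [None] * n
    fwd[0] = np.ones(len(unaries[0]))
    for k in range(1, n):                       # messages flowing right
        fwd[k] = (fwd[k - 1] * unaries[k - 1]) @ pairwise[k - 1]
    bwd[n - 1] = np.ones(len(unaries[-1]))
    for k in range(n - 2, -1, -1):              # messages flowing left
        bwd[k] = pairwise[k] @ (unaries[k + 1] * bwd[k + 1])
    m = fwd[i] * unaries[i] * bwd[i]            # combine incoming messages
    return m / m.sum()

# Two binary variables with illustrative unary and pairwise factors.
unaries = [np.array([1., 2.]), np.array([3., 1.])]
pairwise = [np.array([[1., 0.5], [0.5, 1.]])]
p0 = chain_marginal(unaries, pairwise, 0)       # exact marginal of x0
```

Replacing the sums with maxima (and keeping back-pointers) turns the same recursion into exact MAP inference on the chain.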
- …