Stereo-Based Tracking-by-Multiple Hypotheses Framework for Multiple Vehicle Detection and Tracking
In this paper, we present a tracking-by-multiple hypotheses framework to detect and track multiple vehicles accurately and precisely. The tracking-by-multiple hypotheses framework consists of obstacle detection, vehicle recognition, visual tracking, global position tracking, data association and particle filtering. The multiple hypotheses are from obstacle detection, vehicle recognition and visual tracking. The obstacle detection detects all the obstacles on the road. The vehicle recognition classifies the detected obstacles as vehicles or non-vehicles. 3D feature-based visual tracking estimates the current target state using the previous target state. The multiple hypotheses should be linked to corresponding tracks to update the target state. The hierarchical data association method assigns multiple tracks to the correct hypotheses with multiple similarity functions. In the particle filter framework, the target state is updated using the Gaussian motion model and the observation model with associated multiple hypotheses. The experimental results demonstrate that the proposed method enhances the accuracy and precision of the region of interest. © 2013 Lim et al.
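The particle filter update described in this abstract can be illustrated with a minimal one-dimensional sketch. It is not the authors' implementation: the scalar state, the constant `velocity`, the noise levels `motion_std` and `obs_std`, and the single noise-free observation per step are all assumptions made for illustration, and the data association and multiple-hypothesis stages are omitted.

```python
import math
import random

def particle_filter_step(particles, observation, rng,
                         velocity=1.0, motion_std=0.5, obs_std=1.0):
    """One predict/update/resample cycle of a bootstrap particle filter.

    All parameter values are hypothetical defaults chosen for this toy example.
    """
    # Predict: propagate each particle with a Gaussian motion model.
    predicted = [x + velocity + rng.gauss(0.0, motion_std) for x in particles]
    # Update: weight each particle by a Gaussian observation likelihood.
    weights = [math.exp(-0.5 * ((observation - x) / obs_std) ** 2)
               for x in predicted]
    total = sum(weights)
    weights = [w / total for w in weights]
    # Estimate: weighted mean of the predicted particles.
    estimate = sum(w * x for w, x in zip(weights, predicted))
    # Resample: draw particles in proportion to their weights.
    resampled = rng.choices(predicted, weights=weights, k=len(predicted))
    return resampled, estimate

rng = random.Random(0)  # fixed seed so the run is repeatable
particles = [rng.gauss(0.0, 1.0) for _ in range(500)]
true_position = 0.0
for _ in range(10):
    true_position += 1.0  # the simulated target moves one unit per step
    particles, estimate = particle_filter_step(particles, true_position, rng)
```

After ten steps `estimate` should lie close to `true_position`. In a tracker of the kind the abstract describes, the observation would presumably come from the hypothesis associated with the track rather than from a simulated position.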
Communications in Mobile Wireless Networks: A Finite Time-Horizon Viewpoint
In mobile wireless networks (MWNs), short-term communications
carry two key features: 1) Different from communications over a
large time window where the performance is governed by the
long-term average effect, the short-term communications in MWNs
are sensitive to the instantaneous location and channel condition
caused by node mobility. 2) The short-term communications in MWNs
are subject to the finite blocklength coding effect, which means they are
not amenable to the well-known Shannon capacity formulation.
To deal with the short-term communications in MWNs, this thesis
focuses on three main issues: how the node mobility affects the
instantaneous interference, how to reduce the uncertainty in the
locations of mobile users, and what is the maximal throughput of
a multi-user network over a short time-horizon.
First, we study interference prediction in MWNs by proposing and
using a general-order linear model for node mobility. The
proposed mobility model can well approximate node dynamics of
practical MWNs. Unlike previous studies on interference
statistics, we are able through this model to give a best
estimate of the time-varying interference at any time rather than
long-term average effects. In particular, we propose a compound
Gaussian point process functional (CGPPF) in a general framework
to obtain analytical results on the mean value and
moment-generating function of the interference prediction.
Second, to reduce the uncertainty in nodal locations, the
cooperative localization problem for mobile nodes is studied. In
contrast to previous works, which rely heavily on synchronized
time-slotted systems, the cooperative localization framework we
establish does not need any synchronization for the communication
links and measurement processes in the entire wireless network.
To solve the cooperative localization problem in a distributed
manner, we first propose a centralized localization algorithm
based on the global information, and use it as the benchmark.
Then, we rigorously prove when a localization estimation with
partial information has a small performance gap from the one with
global information. Finally, by applying this result at each
node, the distributed prior-cut algorithm is designed to solve
this asynchronous localization problem.
Finally, we study the throughput region of any MWN consisting of
multiple transmitter-receiver pairs where interference is treated
as noise. Unlike the infinite-horizon throughput region, which is
simply the convex hull of the throughput region of one time slot,
the finite-horizon throughput region is generally non-convex.
Instead of directly characterizing all achievable rate-tuples in
the finite-horizon throughput region, we propose a metric termed
the rate margin, which not only determines whether any given
rate-tuple is within the throughput region (i.e., achievable or
unachievable), but also tells the amount of scaling that can be
done to the given achievable (unachievable) rate-tuple such that
the resulting rate-tuple is still within (brought back into) the
throughput region.
This thesis advances our understanding of communications in MWNs
from a finite-time horizon viewpoint. It establishes new
frameworks for tracking the instantaneous behaviors, such as
interference and nodal location, of MWNs. It also reveals the
fundamental limits on short-term communications of a multi-user
mobile network, which sheds light on communications with low
latency.
On unifying sparsity and geometry for image-based 3D scene representation
Demand has emerged for next generation visual technologies that go beyond conventional 2D imaging. Such technologies should capture and communicate all perceptually relevant three-dimensional information about an environment to a distant observer, providing a satisfying, immersive experience. Camera networks offer a low cost solution to the acquisition of 3D visual information, by capturing multi-view images from different viewpoints. However, the camera's representation of the data is not ideal for common tasks such as data compression or 3D scene analysis, as it does not make the 3D scene geometry explicit. Image-based scene representations fundamentally require a multi-view image model that facilitates extraction of underlying geometrical relationships between the cameras and scene components. Developing new, efficient multi-view image models is thus one of the major challenges in image-based 3D scene representation methods. This dissertation focuses on defining and exploiting a new method for multi-view image representation, from which the 3D geometry information is easily extractable, and which is additionally highly compressible. The method is based on sparse image representation using an overcomplete dictionary of geometric features, where a single image is represented as a linear combination of few fundamental image structure features (edges for example). We construct the dictionary by applying a unitary operator to an analytic function, which introduces a composition of geometric transforms (translations, rotation and anisotropic scaling) to that function. The advantage of this approach is that the features across multiple views can be related with a single composition of transforms. We then establish a connection between image components and scene geometry by defining the transforms that satisfy the multi-view geometry constraint, and obtain a new geometric multi-view correlation model. 
We first address the construction of dictionaries for images acquired by omnidirectional cameras, which are particularly convenient for scene representation due to their wide field of view. Since most omnidirectional images can be uniquely mapped to spherical images, we form a dictionary by applying motions on the sphere, rotations, and anisotropic scaling to a function that lives on the sphere. We have used this dictionary and a sparse approximation algorithm, Matching Pursuit, for compression of omnidirectional images, and additionally for coding 3D objects represented as spherical signals. Both methods offer better rate-distortion performance than state of the art schemes at low bit rates. The novel multi-view representation method and the dictionary on the sphere are then exploited for the design of a distributed coding method for multi-view omnidirectional images. In a distributed scenario, cameras compress acquired images without communicating with each other. Using a reliable model of correlation between views, distributed coding can achieve higher compression ratios than independent compression of each image. However, the lack of a proper model has been an obstacle for distributed coding in camera networks for many years. We propose to use our geometric correlation model for distributed multi-view image coding with side information. The encoder employs a coset coding strategy, developed by dictionary partitioning based on atom shape similarity and multi-view geometry constraints. Our method results in significant rate savings compared to independent coding. An additional contribution of the proposed correlation model is that it gives information about the scene geometry, leading to a new camera pose estimation method using an extremely small amount of data from each camera. Finally, we develop a method for learning stereo visual dictionaries based on the new multi-view image model. 
Although dictionary learning for still images has received a lot of attention recently, dictionary learning for stereo images has been investigated only sparingly. Our method maximizes the likelihood that a set of natural stereo images is efficiently represented with selected stereo dictionaries, where the multi-view geometry constraint is included in the probabilistic modeling. Experimental results demonstrate that including the geometric constraints in learning leads to stereo dictionaries that give both better distributed stereo matching and approximation properties than randomly selected dictionaries. We show that learning dictionaries for optimal scene representation based on the novel correlation model improves the camera pose estimation and that it can be beneficial for distributed coding.
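The sparse approximation step named in this abstract, Matching Pursuit, can be sketched on a toy problem. The four-atom dictionary in three dimensions below is purely hypothetical and stands in for the geometric dictionary on the sphere, which the abstract does not specify in enough detail to reproduce; the function name and its arguments are likewise illustrative.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def matching_pursuit(signal, atoms, n_iter):
    """Greedy sparse approximation of `signal`; atoms must have unit norm."""
    residual = list(signal)
    coeffs = [0.0] * len(atoms)
    for _ in range(n_iter):
        # Pick the atom most correlated with the current residual.
        projections = [dot(residual, atom) for atom in atoms]
        best = max(range(len(atoms)), key=lambda k: abs(projections[k]))
        c = projections[best]
        coeffs[best] += c
        # Remove the selected atom's contribution from the residual.
        residual = [r - c * a for r, a in zip(residual, atoms[best])]
    return coeffs, residual

s = 1 / math.sqrt(2)
# Overcomplete toy dictionary: three axis atoms plus one diagonal atom.
atoms = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0], [s, s, 0.0]]
signal = [3.0, 3.0, 1.0]
coeffs, residual = matching_pursuit(signal, atoms, n_iter=2)
```

Here two iterations suffice: the diagonal atom captures the first two components and the third axis atom captures the rest, so the signal is represented with two coefficients instead of three. This is the sense in which an overcomplete dictionary can yield a sparser representation than a basis.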
Discrete Wavelet Transforms
The discrete wavelet transform (DWT) algorithms have a firm position in the processing of signals in several areas of research and industry. As the DWT provides both octave-scale frequency and spatial timing of the analyzed signal, it is constantly used to solve and treat more and more advanced problems. The present book, Discrete Wavelet Transforms: Algorithms and Applications, reviews the recent progress in discrete wavelet transform algorithms and applications. The book covers a wide range of methods (e.g. lifting, shift invariance, multi-scale analysis) for constructing DWTs. The book chapters are organized into four major parts. Part I describes the progress in hardware implementations of the DWT algorithms. Applications include multitone modulation for ADSL and equalization techniques, a scalable architecture for FPGA implementation, a lifting-based algorithm for VLSI implementation, a comparison between DWT- and FFT-based OFDM, and a modified SPIHT codec. Part II addresses image processing algorithms such as a multiresolution approach for edge detection, low bit rate image compression, low-complexity implementation of CQF wavelets and compression of multi-component images. Part III focuses on watermarking DWT algorithms. Finally, Part IV describes shift-invariant DWTs, the DC lossless property, DWT-based analysis and estimation of colored noise, and an application of the wavelet Galerkin method. The chapters of the present book consist of both tutorial and highly advanced material. Therefore, the book is intended to be a reference text for graduate students and researchers to obtain state-of-the-art knowledge on specific applications.
Towards autonomous diagnostic systems with medical imaging
Democratizing access to high quality healthcare has highlighted the need for autonomous diagnostic systems that a non-expert can use. Remote communities, first responders and even deep space explorers will come to rely on medical imaging systems that will provide them with Point of Care diagnostic capabilities.
This thesis introduces the building blocks that would enable the creation of such a system. Firstly, we present a case study in order to further motivate the need and requirements of autonomous diagnostic systems. This case study primarily concerns deep space exploration where astronauts cannot rely on communication with earth-bound doctors to help them through diagnosis, nor can they make the trip back to earth for treatment. Requirements and possible solutions about the major challenges faced with such an application are discussed.
Moreover, this work describes how a system can explore its perceived environment by developing a Multi Agent Reinforcement Learning method that allows for implicit communication between the agents. Under this regime agents can share the knowledge that benefits them all in achieving their individual tasks. Furthermore, we explore how systems can understand the 3D properties of 2D depicted objects in a probabilistic way.
In Part II, this work explores how to reason about the extracted information in a causally enabled manner. A critical view on the applications of causality in medical imaging, and its potential uses is provided. It is then narrowed down to estimating possible future outcomes and reasoning about counterfactual outcomes by embedding data on a pseudo-Riemannian manifold and constraining the latent space by using the relativistic concept of light cones.
By formalizing an approach to estimating counterfactuals, a computationally lighter alternative to the abduction-action-prediction paradigm is presented through the introduction of Deep Twin Networks. Appropriate partial identifiability constraints for categorical variables are derived and the method is applied in a series of medical tasks involving structured data, images and videos.
All methods are evaluated in a wide array of synthetic and real life tasks that showcase their abilities, often achieving state-of-the-art performance or matching the existing best performance while requiring a fraction of the computational cost.
A Comparative Study of the Evolution of Mammalian High-Frequency Hearing and Echolocation
PhD
The lineage that gave rise to mammals split from other basal amniotes approximately
300 million years ago. Since then, mammals have evolved many sensory novelties,
including high-frequency hearing and echolocation. Sensitivity to high frequencies is
particularly well developed in many echolocating mammals; for example, the upper
hearing limits of several laryngeal echolocating bat species are estimated to be
approximately ten times that of humans. In order to process the high frequency sounds
produced during echolocation, the inner ears of laryngeal echolocating bats have
undergone substantial modifications. Despite the evolutionary significance of laryngeal
echolocation, it is unknown how many times it evolved within bats. Its occurrence on
most, but not all, bat lineages suggests it either evolved once with secondary loss, or
independently on multiple lineages. Distinguishing between these possibilities is
complicated by morphological diversity and convergence. Furthermore, the genetic
basis underpinning echolocation remains largely unknown.
To elucidate the evolutionary history of this key trait in bats, a combined molecular and
morphological approach was taken. Firstly, for two mammalian ‘hearing genes’,
sequence convergence, phylogenetic signal and selection pressures were examined
across echolocating and non-echolocating mammal species. Secondly, substitution rates
of Conserved Non-coding Elements associated with genes regulating ear development
were compared across mammals. Finally, as mammalian inner ear development is
controlled by many genes, the gross structure of the bony labyrinth was studied in order
to examine the combined genetic effect. Structural variation of bat cochleae and
vestibular systems was examined using micro-computed tomography reconstructions,
and related to ecological data.
Subsequent analyses found evidence of convergence at the molecular level, in terms of
amino acid substitutions, and also the morphological level, in terms of inner ear
morphology. No evidence of degeneration supporting loss of function in Old World
fruit bats was found. Conversely, evidence of differential evolutionary pressures acting on
the two echolocating bat lineages was found, which supports multiple origins of
laryngeal echolocation in bats.
NERC; CRF; CEE; SRF; Teeling lab
Online Synthesis Of Speculative Building Information Models For Robot Motion Planning
Autonomous mobile robots today still lack the necessary understanding of indoor environments for making informed decisions about the state of the world beyond their immediate field of view. As a result, they are forced to make conservative and often inaccurate assumptions about unexplored space, limiting the performance increasingly expected of them in the areas of high-speed navigation and mission planning. In order to address this limitation, this thesis explores the use of Building Information Models (BIMs) for providing the existing ecosystem of local and global planning algorithms with informative, compact, higher-level representations of indoor environments. Although BIMs have long been used in architecture, engineering, and construction for a number of different purposes, to our knowledge, this is the first instance of them being used in robotics. Given the technical constraints accompanying this domain, including a limited and incomplete set of observations which grows over time, the systems we present are designed such that together they produce BIMs capable of providing explanations of both the explored and unexplored space in an online fashion. The first is a SLAM system that uses the structural regularity of buildings in order to mitigate drift and provide the simplest explanation of architectural features such as floors, walls, and ceilings. The planar model generated is then passed to a second system that reasons about the mutual relationships among the planes in order to provide a water-tight model of the observed and inferred freespace. Our experimental results demonstrate this to be an accurate and efficient approach towards this end.
Interactive Imaging via Hand Gesture Recognition.
With the growth of computing power, Digital Image Processing plays an increasingly important role in the modern world, in fields including industry, medicine, communications and spaceflight technology. As a sub-field, Interactive Image Processing places particular emphasis on communication between machine and human. The basic workflow is object definition, analysis and training, recognition, and feedback. Generally speaking, the core issue is how to define the object of interest and track it more accurately in order to complete the interaction process successfully.
This thesis proposes a novel dynamic simulation scheme for interactive image processing. The work consists of two main parts: hand motion detection and hand gesture recognition. In hand motion detection, the movement of the hand is identified and extracted. In each detection period, the current image is compared with the previous image to generate the difference between them. If the difference exceeds a predefined threshold, hand motion is detected. Furthermore, in some situations, changes of hand gesture also need to be detected and classified. This task requires feature extraction and feature comparison across the gesture types. The essential features of a hand gesture include low-level features such as color and shape. Another important feature is the orientation histogram: each type of hand gesture has its own representation in the orientation histogram domain. Because the Gaussian Mixture Model is well suited to representing an object by its essential feature elements, and Expectation-Maximization is an efficient procedure for computing the maximum likelihood between testing images and the predefined standard sample of each gesture, the similarity between a testing image and the samples of each gesture type is estimated by the Expectation-Maximization algorithm in a Gaussian Mixture Model. Experiments show that the proposed method works well and accurately.
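The frame-comparison step in this abstract can be sketched as simple frame differencing. The abstract does not say how the difference is measured, so the mean absolute pixel difference, the grayscale list-of-rows frame format, the threshold value and the function names below are all assumptions for illustration.

```python
def mean_abs_difference(prev_frame, curr_frame):
    """Average absolute per-pixel difference between two grayscale frames."""
    total = 0
    count = 0
    for prev_row, curr_row in zip(prev_frame, curr_frame):
        for p, c in zip(prev_row, curr_row):
            total += abs(c - p)
            count += 1
    return total / count

def motion_detected(prev_frame, curr_frame, threshold):
    # Motion is flagged when the frame difference exceeds the threshold.
    return mean_abs_difference(prev_frame, curr_frame) > threshold

# Hypothetical 2x3 grayscale frames: a static background and a frame
# in which three pixels have brightened (e.g. a hand entering the view).
background = [[10, 10, 10], [10, 10, 10]]
moved = [[10, 200, 200], [10, 200, 10]]

static_result = motion_detected(background, background, threshold=20)
moving_result = motion_detected(background, moved, threshold=20)
```

With these toy frames the difference is 0 for the static pair and 95 for the changed pair, so only the second comparison exceeds the threshold of 20. A real system would likely also need noise filtering and a region-level rather than whole-frame measure.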