111 research outputs found

    Stereo-Based Tracking-by-Multiple Hypotheses Framework for Multiple Vehicle Detection and Tracking

    In this paper, we present a tracking-by-multiple hypotheses framework to detect and track multiple vehicles accurately and precisely. The tracking-by-multiple hypotheses framework consists of obstacle detection, vehicle recognition, visual tracking, global position tracking, data association and particle filtering. The multiple hypotheses are from obstacle detection, vehicle recognition and visual tracking. The obstacle detection detects all the obstacles on the road. The vehicle recognition classifies the detected obstacles as vehicles or non-vehicles. 3D feature-based visual tracking estimates the current target state using the previous target state. The multiple hypotheses should be linked to corresponding tracks to update the target state. The hierarchical data association method assigns multiple tracks to the correct hypotheses with multiple similarity functions. In the particle filter framework, the target state is updated using the Gaussian motion model and the observation model with associated multiple hypotheses. The experimental results demonstrate that the proposed method enhances the accuracy and precision of the region of interest. © 2013 Lim et al.
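
The particle filter update described in this abstract can be illustrated with a minimal one-dimensional sketch. It is not the authors' implementation: the scalar state, the noise levels (`motion_std`, `obs_std`), the product-of-Gaussians observation model and all function names are assumptions chosen for illustration.

```python
import math
import random


def gaussian_likelihood(z, mean, std):
    """Unnormalised Gaussian likelihood of observation z given a predicted state."""
    return math.exp(-0.5 * ((z - mean) / std) ** 2)


def particle_filter_step(particles, hypotheses, rng, motion_std=0.2, obs_std=0.5):
    """One predict / weight / resample cycle for a scalar target state.

    `hypotheses` is the list of measurements already associated with this
    track in the current frame (assumed to come from several detectors).
    """
    # Predict: propagate each particle with a Gaussian motion model.
    predicted = [p + rng.gauss(0.0, motion_std) for p in particles]

    # Weight: multiply the likelihoods of all associated hypotheses.
    weights = []
    for p in predicted:
        w = 1.0
        for z in hypotheses:
            w *= gaussian_likelihood(z, p, obs_std)
        weights.append(w)

    total = sum(weights)
    if total == 0.0:
        # Degenerate case: fall back to uniform weights.
        weights = [1.0 / len(predicted)] * len(predicted)
    else:
        weights = [w / total for w in weights]

    # Estimate: weighted mean of the predicted particles.
    estimate = sum(w * p for w, p in zip(weights, predicted))

    # Resample: draw a new particle set in proportion to the weights.
    resampled = rng.choices(predicted, weights=weights, k=len(predicted))
    return resampled, estimate


def track(hypotheses_per_frame, n_particles=500, seed=0):
    """Run the filter over a sequence of frames and return the last estimate."""
    rng = random.Random(seed)
    particles = [rng.uniform(0.0, 10.0) for _ in range(n_particles)]
    estimate = None
    for hypotheses in hypotheses_per_frame:
        particles, estimate = particle_filter_step(particles, hypotheses, rng)
    return estimate
```

Calling `track([[4.9, 5.1]] * 10)` feeds the filter two hypotheses per frame for ten frames; the returned estimate should settle near the value the hypotheses agree on.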

    Communications in Mobile Wireless Networks: A Finite Time-Horizon Viewpoint

    In mobile wireless networks (MWNs), short-term communications have two key features: 1) Unlike communications over a long time window, where performance is governed by the long-term average effect, short-term communications in MWNs are sensitive to the instantaneous location and channel condition caused by node mobility. 2) Short-term communications in MWNs are subject to the finite-blocklength coding effect, which means they are not amenable to the well-known Shannon capacity formulation. To deal with short-term communications in MWNs, this thesis focuses on three main issues: how node mobility affects the instantaneous interference, how to reduce the uncertainty in the locations of mobile users, and what the maximal throughput of a multi-user network is over a short time horizon. First, we study interference prediction in MWNs by proposing and using a general-order linear model for node mobility. The proposed mobility model approximates well the node dynamics of practical MWNs. Unlike previous studies on interference statistics, this model lets us give a best estimate of the time-varying interference at any time rather than only long-term average effects. In particular, we propose a compound Gaussian point process functional (CGPPF) in a general framework to obtain analytical results on the mean value and moment-generating function of the interference prediction. Second, to reduce the uncertainty in nodal locations, the cooperative localization problem for mobile nodes is studied. In contrast to previous works, which rely heavily on synchronized time-slotted systems, the cooperative localization framework we establish does not need any synchronization for the communication links and measurement processes in the entire wireless network. To solve the cooperative localization problem in a distributed manner, we first propose a centralized localization algorithm based on global information and use it as the benchmark.
Then, we rigorously prove when a localization estimation with partial information has a small performance gap from the one with global information. By applying this result at each node, we design the distributed prior-cut algorithm to solve this asynchronous localization problem. Third, we study the throughput region of any MWN consisting of multiple transmitter-receiver pairs where interference is treated as noise. Unlike the infinite-horizon throughput region, which is simply the convex hull of the throughput region of one time slot, the finite-horizon throughput region is generally non-convex. Instead of directly characterizing all achievable rate-tuples in the finite-horizon throughput region, we propose a metric termed the rate margin, which not only determines whether any given rate-tuple is within the throughput region (i.e., achievable or unachievable), but also tells the amount of scaling that can be done to the given achievable (unachievable) rate-tuple such that the resulting rate-tuple is still within (brought back into) the throughput region. This thesis advances our understanding of communications in MWNs from a finite time-horizon viewpoint. It establishes new frameworks for tracking the instantaneous behaviors of MWNs, such as interference and nodal location. It also reveals the fundamental limits on short-term communications of a multi-user mobile network, which sheds light on communications with low latency.

    On unifying sparsity and geometry for image-based 3D scene representation

    Demand has emerged for next-generation visual technologies that go beyond conventional 2D imaging. Such technologies should capture and communicate all perceptually relevant three-dimensional information about an environment to a distant observer, providing a satisfying, immersive experience. Camera networks offer a low-cost solution to the acquisition of 3D visual information, by capturing multi-view images from different viewpoints. However, the camera's representation of the data is not ideal for common tasks such as data compression or 3D scene analysis, as it does not make the 3D scene geometry explicit. Image-based scene representations fundamentally require a multi-view image model that facilitates extraction of underlying geometrical relationships between the cameras and scene components. Developing new, efficient multi-view image models is thus one of the major challenges in image-based 3D scene representation methods. This dissertation focuses on defining and exploiting a new method for multi-view image representation, from which the 3D geometry information is easily extractable, and which is additionally highly compressible. The method is based on sparse image representation using an overcomplete dictionary of geometric features, where a single image is represented as a linear combination of a few fundamental image structure features (edges, for example). We construct the dictionary by applying a unitary operator to an analytic function, which introduces a composition of geometric transforms (translations, rotations and anisotropic scaling) to that function. The advantage of this approach is that the features across multiple views can be related with a single composition of transforms. We then establish a connection between image components and scene geometry by defining the transforms that satisfy the multi-view geometry constraint, and obtain a new geometric multi-view correlation model.
We first address the construction of dictionaries for images acquired by omnidirectional cameras, which are particularly convenient for scene representation due to their wide field of view. Since most omnidirectional images can be uniquely mapped to spherical images, we form a dictionary by applying motions on the sphere, rotations, and anisotropic scaling to a function that lives on the sphere. We have used this dictionary and a sparse approximation algorithm, Matching Pursuit, for compression of omnidirectional images, and additionally for coding 3D objects represented as spherical signals. Both methods offer better rate-distortion performance than state of the art schemes at low bit rates. The novel multi-view representation method and the dictionary on the sphere are then exploited for the design of a distributed coding method for multi-view omnidirectional images. In a distributed scenario, cameras compress acquired images without communicating with each other. Using a reliable model of correlation between views, distributed coding can achieve higher compression ratios than independent compression of each image. However, the lack of a proper model has been an obstacle for distributed coding in camera networks for many years. We propose to use our geometric correlation model for distributed multi-view image coding with side information. The encoder employs a coset coding strategy, developed by dictionary partitioning based on atom shape similarity and multi-view geometry constraints. Our method results in significant rate savings compared to independent coding. An additional contribution of the proposed correlation model is that it gives information about the scene geometry, leading to a new camera pose estimation method using an extremely small amount of data from each camera. Finally, we develop a method for learning stereo visual dictionaries based on the new multi-view image model. 
Although dictionary learning for still images has received a lot of attention recently, dictionary learning for stereo images has been investigated only sparingly. Our method maximizes the likelihood that a set of natural stereo images is efficiently represented with selected stereo dictionaries, where the multi-view geometry constraint is included in the probabilistic modeling. Experimental results demonstrate that including the geometric constraints in learning leads to stereo dictionaries that give both better distributed stereo matching and approximation properties than randomly selected dictionaries. We show that learning dictionaries for optimal scene representation based on the novel correlation model improves the camera pose estimation and that it can be beneficial for distributed coding.
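
The abstract names Matching Pursuit as its sparse approximation algorithm. A minimal sketch of the generic greedy procedure follows, assuming a small dictionary of unit-norm atoms stored as plain Python lists; the function names and the toy dictionary are illustrative and are not the geometric dictionary on the sphere described in the dissertation.

```python
import math


def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def matching_pursuit(signal, dictionary, n_iterations):
    """Greedy sparse approximation of `signal` over unit-norm `dictionary` atoms.

    Returns a list of (atom_index, coefficient) pairs and the final residual.
    """
    residual = list(signal)
    selected = []
    for _ in range(n_iterations):
        # Pick the atom most correlated with the current residual.
        correlations = [dot(residual, atom) for atom in dictionary]
        best = max(range(len(dictionary)), key=lambda i: abs(correlations[i]))
        coefficient = correlations[best]
        if coefficient == 0.0:
            break  # Residual is orthogonal to every atom; nothing left to explain.
        selected.append((best, coefficient))
        # Remove the contribution of the chosen atom from the residual.
        residual = [r - coefficient * a for r, a in zip(residual, dictionary[best])]
    return selected, residual


# Toy overcomplete dictionary in R^2: two axis-aligned atoms plus one diagonal atom.
s = 1.0 / math.sqrt(2.0)
toy_dictionary = [[1.0, 0.0], [0.0, 1.0], [s, s]]
terms, leftover = matching_pursuit([3.0, -2.0], toy_dictionary, n_iterations=2)
```

Each iteration keeps one (index, coefficient) pair, so the number of iterations directly sets how sparse the representation is; a coder would then quantise and transmit those pairs.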

    Discrete Wavelet Transforms

    Discrete wavelet transform (DWT) algorithms have a firm position in signal processing across several areas of research and industry. As the DWT provides both octave-scale frequency and spatial timing information about the analyzed signal, it is constantly used to solve and treat more and more advanced problems. The present book, Discrete Wavelet Transforms: Algorithms and Applications, reviews the recent progress in discrete wavelet transform algorithms and applications. The book covers a wide range of methods (e.g. lifting, shift invariance, multi-scale analysis) for constructing DWTs. The book chapters are organized into four major parts. Part I describes the progress in hardware implementations of the DWT algorithms. Applications include multitone modulation for ADSL and equalization techniques, a scalable architecture for FPGA implementation, a lifting-based algorithm for VLSI implementation, a comparison between DWT- and FFT-based OFDM, and a modified SPIHT codec. Part II addresses image processing algorithms such as a multiresolution approach for edge detection, low bit rate image compression, low-complexity implementation of CQF wavelets and compression of multi-component images. Part III focuses on watermarking DWT algorithms. Finally, Part IV describes shift-invariant DWTs, the DC lossless property, DWT-based analysis and estimation of colored noise and an application of the wavelet Galerkin method. The chapters of the present book consist of both tutorial and highly advanced material. Therefore, the book is intended to be a reference text for graduate students and researchers to obtain state-of-the-art knowledge on specific applications.
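
Lifting is one of the construction methods this abstract lists. As a hedged illustration, the sketch below implements one level of an unnormalised Haar transform with a predict step and an update step; it is a textbook example with assumed function names, not code taken from the book.

```python
def haar_lifting_forward(signal):
    """One level of an unnormalised Haar DWT via lifting.

    Expects an even-length sequence. Returns (approximation, detail), where
    approximation holds pairwise averages and detail holds pairwise differences.
    """
    if len(signal) % 2 != 0:
        raise ValueError("signal length must be even")
    even = list(signal[0::2])  # Split: even-indexed samples.
    odd = list(signal[1::2])   # Split: odd-indexed samples.
    # Predict: each odd sample is predicted by its even neighbour.
    detail = [o - e for o, e in zip(odd, even)]
    # Update: correct the even samples so they carry the pairwise mean.
    approximation = [e + d / 2.0 for e, d in zip(even, detail)]
    return approximation, detail


def haar_lifting_inverse(approximation, detail):
    """Invert haar_lifting_forward by undoing the lifting steps in reverse order."""
    even = [a - d / 2.0 for a, d in zip(approximation, detail)]
    odd = [d + e for d, e in zip(detail, even)]
    signal = []
    for e, o in zip(even, odd):
        signal.extend([e, o])  # Merge: interleave even and odd samples.
    return signal


approx, det = haar_lifting_forward([4.0, 6.0, 10.0, 12.0])
restored = haar_lifting_inverse(approx, det)
```

Because every lifting step is undone by simply reversing its sign, the inverse needs no separate filter design, which is one reason lifting suits hardware implementations.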

    Towards autonomous diagnostic systems with medical imaging

    Democratizing access to high quality healthcare has highlighted the need for autonomous diagnostic systems that a non-expert can use. Remote communities, first responders and even deep space explorers will come to rely on medical imaging systems that will provide them with Point of Care diagnostic capabilities. This thesis introduces the building blocks that would enable the creation of such a system. Firstly, we present a case study in order to further motivate the need and requirements of autonomous diagnostic systems. This case study primarily concerns deep space exploration where astronauts cannot rely on communication with earth-bound doctors to help them through diagnosis, nor can they make the trip back to earth for treatment. Requirements and possible solutions about the major challenges faced with such an application are discussed. Moreover, this work describes how a system can explore its perceived environment by developing a Multi Agent Reinforcement Learning method that allows for implicit communication between the agents. Under this regime agents can share the knowledge that benefits them all in achieving their individual tasks. Furthermore, we explore how systems can understand the 3D properties of 2D depicted objects in a probabilistic way. In Part II, this work explores how to reason about the extracted information in a causally enabled manner. A critical view on the applications of causality in medical imaging, and its potential uses is provided. It is then narrowed down to estimating possible future outcomes and reasoning about counterfactual outcomes by embedding data on a pseudo-Riemannian manifold and constraining the latent space by using the relativistic concept of light cones. By formalizing an approach to estimating counterfactuals, a computationally lighter alternative to the abduction-action-prediction paradigm is presented through the introduction of Deep Twin Networks. 
Appropriate partial identifiability constraints for categorical variables are derived and the method is applied in a series of medical tasks involving structured data, images and videos. All methods are evaluated in a wide array of synthetic and real life tasks that showcase their abilities, often achieving state-of-the-art performance or matching the existing best performance while requiring a fraction of the computational cost.

    Image Forensics in the Wild

    A Comparative Study of the Evolution of Mammalian High-Frequency Hearing and Echolocation

    PhD. The lineage that gave rise to mammals split from other basal amniotes approximately 300 million years ago. Since then, mammals have evolved many sensory novelties, including high-frequency hearing and echolocation. Sensitivity to high frequencies is particularly well developed in many echolocating mammals; for example, the upper hearing limit of several laryngeal echolocating bat species is estimated to be approximately ten times that of humans. In order to process the high-frequency sounds produced during echolocation, the inner ears of laryngeal echolocating bats have undergone substantial modifications. Despite the evolutionary significance of laryngeal echolocation, it is unknown how many times it evolved within bats. Its occurrence on most, but not all, bat lineages suggests it either evolved once with secondary loss, or independently on multiple lineages. Distinguishing between these possibilities is complicated by morphological diversity and convergence. Furthermore, the genetic basis underpinning echolocation remains largely unknown. To elucidate the evolutionary history of this key trait in bats, a combined molecular and morphological approach was taken. Firstly, for two mammalian ‘hearing genes’, sequence convergence, phylogenetic signal and selection pressures were examined across echolocating and non-echolocating mammal species. Secondly, substitution rates of Conserved Non-coding Elements associated with genes regulating ear development were compared across mammals. Finally, as mammalian inner ear development is controlled by many genes, the gross structure of the bony labyrinth was studied in order to examine the combined genetic effect. Structural variation of bat cochleae and vestibular systems was examined using micro-computed tomography reconstructions, and related to ecological data.
Subsequent analyses found evidence of convergence at the molecular level, in terms of amino acid substitutions, and also at the morphological level, in terms of inner ear morphology. No evidence of degeneration supporting loss of function in Old World fruit bats was found. Conversely, evidence of differential evolutionary pressures acting on the two echolocating bat lineages was found, which supports multiple origins of laryngeal echolocation in bats. NERC; CRF; CEE; SRF; Teeling lab

    Online Synthesis Of Speculative Building Information Models For Robot Motion Planning

    Autonomous mobile robots today still lack the necessary understanding of indoor environments for making informed decisions about the state of the world beyond their immediate field of view. As a result, they are forced to make conservative and often inaccurate assumptions about unexplored space, limiting the performance increasingly expected of them in the areas of high-speed navigation and mission planning. In order to address this limitation, this thesis explores the use of Building Information Models (BIMs) for providing the existing ecosystem of local and global planning algorithms with informative, compact, higher-level representations of indoor environments. Although BIMs have long been used in architecture, engineering, and construction for a number of different purposes, to our knowledge, this is the first instance of them being used in robotics. Given the technical constraints accompanying this domain, including a limited and incomplete set of observations which grows over time, the systems we present are designed such that together they produce BIMs capable of providing explanations of both the explored and unexplored space in an online fashion. The first is a SLAM system that uses the structural regularity of buildings in order to mitigate drift and provide the simplest explanation of architectural features such as floors, walls, and ceilings. The resulting planar model is then passed to a second system that reasons about the mutual relationships of these features in order to provide a watertight model of the observed and inferred free space. Our experimental results demonstrate this to be an accurate and efficient approach towards this end.