25 research outputs found

    A Survey on Multimedia-Based Cross-Layer Optimization in Visual Sensor Networks

    Visual sensor networks (VSNs) comprised of battery-operated electronic devices endowed with low-resolution cameras have expanded the applicability of a series of monitoring applications. These sensors are interconnected by ad hoc, error-prone wireless links, which impose stringent restrictions on available bandwidth, end-to-end delay and packet error rates. In such a context, multimedia coding is required for data compression and error resilience, while also preserving energy over the path(s) toward the sink and improving the end-to-end perceptual quality of the received media. Cross-layer optimization may enhance the expected efficiency of VSN applications by disrupting the conventional information flow of the protocol layers. When the inner characteristics of the multimedia coding techniques are exploited by cross-layer protocols and architectures, higher efficiency may be obtained in visual sensor networks. This paper surveys recent research on multimedia-based cross-layer optimization, presenting the proposed strategies and mechanisms for transmission rate adjustment, congestion control, multipath selection, energy preservation and error recovery. We note that many multimedia-based cross-layer optimization solutions have been proposed in recent years, each one bringing a wealth of contributions to visual sensor networks.

    Compressed Sensing in Multi-Signal Environments.

    Technological advances and the ability to build cheap high-performance sensors make it possible to deploy tens or even hundreds of sensors to acquire information about a common phenomenon of interest. The increasing number of sensors allows us to acquire more detailed information about the underlying scene than was possible before. This, however, directly translates to increasing amounts of data that need to be acquired, transmitted, and processed. The amount of data can be overwhelming, especially in applications that involve high-resolution signals such as images or videos. Compressed sensing (CS) is a novel acquisition and reconstruction scheme that is particularly useful in scenarios where high-resolution signals are difficult or expensive to encode. When applying CS in a multi-signal scenario, several aspects need to be considered, such as the sensing matrix, the joint signal model, and the reconstruction algorithm. The purpose of this dissertation is to provide a complete treatment of these aspects in various multi-signal environments. Specific applications include video, multi-view imaging, and structural health monitoring systems. For each application, we propose a novel joint signal model that accurately captures the joint signal structure, and we tailor the reconstruction algorithm to each signal model to successfully recover the signals of interest.
    PhD, Electrical Engineering: Systems, University of Michigan, Horace H. Rackham School of Graduate Studies. http://deepblue.lib.umich.edu/bitstream/2027.42/98007/1/jaeypark_1.pd
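As a minimal illustration of the reconstruction step in compressed sensing, the sketch below recovers a sparse vector from a small number of linear measurements with orthogonal matching pursuit (OMP). This is a generic, single-signal textbook algorithm and is not the joint signal models or reconstruction algorithms proposed in the dissertation. The function names `omp` and `solve`, the column-list layout of the sensing matrix, and the sparsity level `k` are assumptions made for this example, which uses only the Python standard library.

```python
def dot(u, v):
    """Inner product of two equal-length lists."""
    return sum(a * b for a, b in zip(u, v))


def solve(G, b):
    """Solve G z = b by Gaussian elimination with partial pivoting.

    Assumes G is square and nonsingular (true here when the selected
    columns of the sensing matrix are linearly independent).
    """
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(G)]  # augmented matrix
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for j in range(c, n + 1):
                M[r][j] -= f * M[c][j]
    z = [0.0] * n
    for r in range(n - 1, -1, -1):
        z[r] = (M[r][n] - sum(M[r][j] * z[j] for j in range(r + 1, n))) / M[r][r]
    return z


def omp(columns, y, k):
    """Greedy recovery of a k-sparse x from y = A x.

    columns: the sensing matrix A given as a list of its columns
             (hypothetical layout; columns are assumed to have unit norm).
    y:       measurement vector.
    k:       assumed sparsity level, i.e. number of greedy iterations.
    """
    residual = y[:]
    support = []
    coef = []
    for _ in range(k):
        # Pick the unused column most correlated with the current residual.
        best = max(
            (j for j in range(len(columns)) if j not in support),
            key=lambda j: abs(dot(columns[j], residual)),
        )
        support.append(best)
        # Least-squares fit of y on the selected columns (normal equations).
        G = [[dot(columns[a], columns[b]) for b in support] for a in support]
        rhs = [dot(columns[a], y) for a in support]
        coef = solve(G, rhs)
        # Update the residual with the new fit.
        residual = [
            yi - sum(c * columns[j][i] for c, j in zip(coef, support))
            for i, yi in enumerate(y)
        ]
    x_hat = [0.0] * len(columns)
    for c, j in zip(coef, support):
        x_hat[j] = c
    return x_hat
```

Calling `omp(columns, y, k)` returns an estimate with at most `k` nonzero entries. Exact recovery is only guaranteed under conditions on the sensing matrix, such as sufficiently low mutual coherence between its columns.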

    Network streaming and compression for mixed reality tele-immersion

    Bulterman, D.C.A. [Promotor]; Cesar, P.S. [Copromotor]

    Personal imaging

    Thesis (Ph.D.)--Massachusetts Institute of Technology, School of Architecture and Planning, Program in Media Arts & Sciences, 1997. Includes bibliographical references (p. 217-223). In this thesis, I propose a new synergy between humans and computers, called "Humanistic Intelligence" (HI), and provide a precise definition of this new form of human-computer interaction. I then present a means and apparatus for reducing this principle to practice. The bulk of this thesis concentrates on a specific embodiment of this invention, called Personal Imaging, most notably, a system which I show attains new levels of creativity in photography, defines a new genre of documentary video, and goes beyond digital photography/video to define a new renaissance in imaging, based on simple principles of projective geometry combined with linearity and superposition properties of light. I first present a mathematical theory of imaging which allows the apparatus to measure, to within a single unknown constant, the quantity of light arriving from each direction, to a fixed point in space, using a collection of images taken from a sensor array having a possibly unknown nonlinearity. Within the context of personal imaging, this theory is a contribution in and of itself (in the sense that it was an unsolved problem previously), but when also combined with the proposed apparatus, it allows one to construct environment maps by simply looking around. I then present a new form of connected humanistic intelligence in which individuals can communicate, across boundaries of time and space, using shared environment maps, and the resulting computer-mediated reality that arises out of long-term adaptation in a personal imaging environment. Finally, I present a new philosophical framework for cultural criticism which arises out of a new concept called 'humanistic property'. This new philosophical framework has two axes, a 'reflectionist' axis and a 'diffusionist' axis. 
In particular, I apply the new framework to personal imaging, thus completing a body of work that lies at the intersection of art, science, and technology. By Steve Mann. Ph.D.

    The plenacoustic function and its applications

    This thesis is a study of the spatial evolution of the sound field. We first present an analysis of the sound field along different geometries. In the case of the sound field studied along a line in a room, we describe a two-dimensional function characterizing the sound field along space and time. Calculating the Fourier transform of this function leads to a spectrum having a butterfly shape. The spectrum is shown to be almost bandlimited along the spatial frequency dimension, which allows the interpolation of the sound field at any position along the line when a sufficient number of microphones is present. Using this Fourier representation of the sound field, we develop a spatial sampling theorem trading off quality of reconstruction with spatial sampling frequency. The study is generalized to planes of microphones and to microphones located in three dimensions. The presented theory is compared to simulations and real measurements of room impulse responses. We describe a similar theory for circular arrays of microphones or loudspeakers. Application of this theory is presented for the study of the angular sampling of head-related transfer functions (HRTFs). As a result, we show that to reconstruct HRTFs at any possible angle in the horizontal plane, an angular spacing of 5 degrees is necessary for HRTFs sampled at 44.1 kHz. Because recording that many HRTFs is not easy, we develop interpolation techniques to achieve acceptable results for databases containing two or four times fewer HRTFs. The technique is based on the decomposition of the HRTFs into their carrier and complex envelopes. With the Fourier representation of the sound field, it is then shown how one can correctly obtain all room impulse responses measured along a trajectory when using a moving loudspeaker or microphone. The presented method permits the reconstruction of the room impulse responses at any position along the trajectory, provided that the speed satisfies a given relation. 
The maximal speed is shown to be dependent on the maximal frequency emitted and the radius of the circle. This method takes into account the Doppler effect present when one element is moving in the scenario. It is then shown that the measurement of HRTFs in the horizontal plane can be achieved in less than one second. In the last part, we model spatio-temporal channel impulse responses between a fixed source and a moving receiver. The trajectory followed by the moving element is modeled as a continuous autoregressive process. The presented model is simple and versatile. It allows the generation of random trajectories with a controlled smoothness. Applications of this study can be found in the modeling of acoustic channels for acoustic echo cancellation or of time-varying multipath electromagnetic channels used in mobile wireless communications.
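To make the spatial sampling idea concrete, the sketch below computes the largest microphone spacing along a line under the standard plane-wave bound: a wave of temporal frequency f travelling at speed c has spatial frequency along the line of at most f/c cycles per metre, so sampling at twice that rate gives a spacing of c / (2 f_max). This half-wavelength rule is an assumption used for illustration. The thesis describes the spectrum as only almost bandlimited and derives its own sampling theorem, which is not reproduced here. The function names and the default speed of sound (343 m/s) are hypothetical choices for the example.

```python
import math


def max_microphone_spacing(f_max_hz, speed_of_sound=343.0):
    """Largest spacing (metres) between microphones on a line that avoids
    spatial aliasing up to f_max_hz, under the plane-wave bound.

    Assumption: spatial frequency along the line is at most f / c,
    so the Nyquist spacing is c / (2 * f_max).
    """
    if f_max_hz <= 0 or speed_of_sound <= 0:
        raise ValueError("frequency and speed of sound must be positive")
    return speed_of_sound / (2.0 * f_max_hz)


def microphones_needed(line_length_m, f_max_hz, speed_of_sound=343.0):
    """Number of equally spaced microphones covering a line of the given
    length at (or below) the maximum alias-free spacing."""
    spacing = max_microphone_spacing(f_max_hz, speed_of_sound)
    return math.ceil(line_length_m / spacing) + 1
```

For a signal sampled at 44.1 kHz (maximum frequency 22.05 kHz), `max_microphone_spacing(22050.0)` gives a spacing of about 7.8 mm, which shows why dense arrays or interpolation are needed at high audio frequencies.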

    Reconstruction from Spatio-Spectrally Coded Multispectral Light Fields

    In this work, spatio-spectrally coded multispectral light fields, as taken by a light field camera with a spectrally coded microlens array, are investigated. For the reconstruction of the coded light fields, two methods are developed: one based on the principles of compressed sensing and one deep learning approach. Using novel synthetic as well as real-world datasets, the proposed reconstruction approaches are evaluated in detail.

    Reconstruction from Spatio-Spectrally Coded Multispectral Light Fields

    This work investigates spectrally coded multispectral light fields, as captured by a light field camera with a spectrally coded microlens array. Two methods are developed for reconstructing the coded light fields: one based on the principles of compressed sensing and one deep learning method. The proposed reconstruction approaches are evaluated in detail using novel synthetic and real datasets.

    Distributed Compressed Representation of Correlated Image Sets

    Vision sensor networks and video cameras are widely used in applications that rely on effective representation of scenes or analysis of 3D information. These systems usually acquire multiple images of the same 3D scene from different viewpoints or at different time instants. Therefore, these images are generally correlated through displacement of scene objects. Efficient compression techniques have to exploit this correlation in order to communicate the 3D scene information efficiently. Instead of joint encoding, which requires communication between the cameras, in this thesis we concentrate on distributed representation, where the captured images are encoded independently but decoded jointly to exploit the correlation between images. One of the most important and challenging tasks lies in estimating the underlying correlation from the compressed correlated images for effective reconstruction or analysis in the joint decoder. This thesis focuses on developing efficient correlation estimation algorithms and joint representation of multiple correlated images captured by various sensing methodologies, e.g., planar, omnidirectional and compressive sensing (CS) sensors. The geometry of the 2D visual representation and the acquisition complexity vary for each sensor type. Therefore, we need to carefully consider the specific geometric nature of the captured images while developing distributed representation algorithms. In this thesis we propose robust algorithms in different scene analysis and reconstruction scenarios. We first concentrate on the distributed representation of omnidirectional images captured by catadioptric sensors. The omnidirectional images are captured from different viewpoints and encoded independently with a balanced rate distribution among the different cameras. They are mapped on the sphere, which captures the plenoptic function in its radial form without Euclidean discrepancies. 
We propose a transform-based distributed coding algorithm, where the spherical images initially undergo a multi-resolution decomposition. The visual information is then split into two correlated partitions. The encoder transmits one partition after entropy coding, as well as the syndrome bits resulting from the Slepian-Wolf encoding of the other partition. The joint decoder estimates a disparity image to exploit the correlation between views and uses the syndrome bits to decode the missing information. Such a strategy proves to be beneficial with respect to the independent processing of images and shows only a small performance loss compared to the joint encoding of different views. The encoding complexity in the previous approach is non-negligible due to the visual information processing based on Slepian-Wolf coding and its associated rate parameter estimation. We therefore discard the Slepian-Wolf encoding and propose a distributed coding solution, where the correlated images are encoded independently using transform-based coding solutions (e.g., SPIHT). The central decoder now builds a correlation model from the compressed images, which is used to jointly decode a pair of images. Experimental results demonstrate that the proposed distributed coding solution improves the rate-distortion performance of the separate coding results for both planar and omnidirectional images. However, this improvement is significant only at medium to high bit rates. We therefore propose a rate allocation scheme that identifies and transmits the necessary visual information from each image to improve the correlation estimation accuracy at low bit rate. Experimental results show that, for a given bit budget, the proposed encoding scheme yields a more accurate correlation estimate than the one obtained with SPIHT, JPEG 2000 or JPEG coding schemes. 
We show, however, that the improvement in the correlation estimation comes at the price of penalizing the image reconstruction quality; there is therefore an interesting trade-off between accurate correlation estimation and image reconstruction, as the encoding optimization objectives differ in the two cases. Next, we further simplify the encoding complexity by replacing the classical imaging sensors with simple CS sensors that directly acquire the compressed images in the form of quantized linear measurements. We now concentrate on the particular problem where one image is selected as the reference and used as side information for the correlation estimation. We propose a geometry-based model to describe the correlation between the visual information in a pair of images. The joint decoder first captures the most prominent visual features in the reconstructed reference image using geometric functions. Since the images are correlated, these features are likely to be present in the other images too, possibly with geometric transformations. Hence, we propose to estimate the correlation model with a regularized optimization problem that locates these features in the compressed images. The regularization terms enforce smoothness of the transformation field, and consistency between the estimated images and the quantized measurements. Experimental results show that the proposed scheme is able to efficiently estimate the correlation between images for several multi-view and video datasets. The proposed scheme is finally shown to outperform DSC schemes based on unsupervised disparity (or motion) learning, as well as independent coding solutions based on JPEG 2000. We then extend the previous scenario to a symmetric decoding problem, where we aim to estimate the correlation model directly from the quantized linear measurements without explicitly reconstructing the reference images. 
We first show that the motion field that represents the main source of correlation between images can be described as a linear operator. We further derive a linear relationship between the correlated measurements in the compressed domain. We then derive a regularized cost function to estimate the correlation model directly in the compressed domain using graph-based optimization algorithms. Experimental results show that the proposed scheme estimates an accurate correlation model among images in both multi-view and video imaging scenarios. We then propose a robust data fidelity term that improves the quality of the correlation estimation when the measurements are quantized. We also show experimentally that the proposed compressed correlation estimation scheme is competitive with a scheme that estimates the correlation model from the reconstructed images, while avoiding the complexity of image reconstruction. Finally, we study the benefit of using the correlation information while jointly reconstructing the images from the compressed linear measurements. We consider both the asymmetric and symmetric scenarios described previously. We propose joint reconstruction methodologies based on a constrained optimization problem which is solved using effective proximal splitting methods. The constraints included in our framework force the reconstructed images to satisfy both the correlation and the quantized measurement consistency objectives. Experimental results demonstrate that the proposed joint reconstruction scheme improves the quality of the decoded images, when compared to a scheme where the images are handled independently. In this thesis we build efficient distributed scene representation algorithms for multiple correlated images captured by planar, omnidirectional and CS cameras. The coding rate in our symmetric distributed coding solution stays balanced between the encoders and stays close to that of joint encoding solutions. 
Our novel algorithms lead to effective correlation estimation in different sensing and coding scenarios. In addition, we provide innovative solutions for robust correlation estimation from highly compressed images in simple sensing frameworks. Our CS-based joint reconstruction frameworks effectively exploit the inter-view correlation, which yields high compression gains compared to state-of-the-art independent and distributed coding solutions.
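The syndrome-based step mentioned in this abstract (the encoder sends syndrome bits, and the joint decoder combines them with a correlated view) can be illustrated with a toy Slepian-Wolf scheme built on the (7,4) Hamming code. This is only a sketch of the general principle, not the thesis's codec. The correlation model (the side information differs from the source block in at most one of seven bits), the choice of Hamming code, and the function names are all assumptions made for the example.

```python
# Parity-check matrix of the (7,4) Hamming code.
# Column j (1-based) is the binary representation of j, most significant bit on top.
H = [
    [0, 0, 0, 1, 1, 1, 1],
    [0, 1, 1, 0, 0, 1, 1],
    [1, 0, 1, 0, 1, 0, 1],
]


def syndrome(bits):
    """Return the 3-bit syndrome H * bits (mod 2) of a 7-bit block."""
    return [sum(h * b for h, b in zip(row, bits)) % 2 for row in H]


def encode(x):
    """Encoder: transmit only the 3-bit syndrome of the 7-bit source block x.

    The encoder never sees the side information y.
    """
    return syndrome(x)


def decode(s_x, y):
    """Joint decoder: recover x from its syndrome and the side information y.

    Assumption: y differs from x in at most one bit position.
    """
    s_y = syndrome(y)
    # Syndrome of the difference pattern x XOR y.
    diff = [a ^ b for a, b in zip(s_x, s_y)]
    # Read the syndrome as a binary number: 1-based position of the
    # differing bit, or 0 if x and y are identical.
    pos = diff[0] * 4 + diff[1] * 2 + diff[2]
    x_hat = y[:]
    if pos:
        x_hat[pos - 1] ^= 1
    return x_hat
```

Here 3 bits are transmitted instead of 7, and the decoder still recovers the block exactly whenever the assumed correlation model holds. Practical distributed coders use much longer codes and must estimate the correlation, which is the problem the thesis addresses.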