320 research outputs found

    Incremental refinement of image salient-point detection

    Low-level image analysis systems typically detect "points of interest", i.e., areas of natural images that contain corners or edges. Most of the robust and computationally efficient detectors proposed for this task use the autocorrelation matrix of the localized image derivatives. Although the performance of such detectors and their suitability for particular applications have been studied in the relevant literature, their behavior under limited input source (image) precision or limited computational or energy resources is largely unknown. All existing frameworks assume that the input image is readily available for processing and that sufficient computational and energy resources exist for the completion of the result. Nevertheless, recent advances in incremental image sensors and compressed sensing, as well as the demand for low-complexity scene analysis in sensor networks, now challenge these assumptions. In this paper, we investigate an approach to compute salient points of images incrementally, i.e., the salient point detector can operate with a coarsely quantized input image representation and successively refine the result (the derived salient points) as the image precision is successively refined by the sensor. This has the advantage that the image sensing and the salient point detection can be terminated at any input image precision (e.g., a bound set by the sensory equipment, by computation, or by the salient point accuracy required by the application) and the salient points obtained at this precision are readily available. We focus on the popular detector proposed by Harris and Stephens and demonstrate how such an approach can operate when the image samples are refined in a bitwise manner, i.e., the image bitplanes are received one-by-one from the image sensor. We estimate the required energy for image sensing as well as the computation required for the salient point detection based on stochastic source modeling.
The computation and energy required by the proposed incremental refinement approach are compared against the conventional salient-point detector realization that operates directly on each source precision and cannot refine the result. Our experiments demonstrate the feasibility of incremental approaches for salient point detection in various classes of natural images. In addition, we present a first comparison between the results obtained by the intermediate detectors and a novel application for adaptive low-energy image sensing based on points of saliency.
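
To make the bitplane setting concrete, the following is a minimal sketch of the conventional baseline the abstract mentions, not the paper's incremental method: it recomputes a Harris-Stephens response from scratch on an 8-bit image truncated to its k most significant bitplanes. The function names, the 3x3 box window, and the value kappa = 0.04 are assumptions chosen for illustration.

```python
import numpy as np

def keep_top_bitplanes(img, k):
    """Zero all but the k most significant bitplanes of an 8-bit image."""
    shift = 8 - k
    return (img >> shift) << shift

def box_sum(a, r=1):
    """Sum each (2r+1)x(2r+1) neighbourhood, with zero padding at the border."""
    padded = np.pad(a, r)
    h, w = a.shape
    out = np.zeros((h, w), dtype=float)
    for dy in range(2 * r + 1):
        for dx in range(2 * r + 1):
            out += padded[dy:dy + h, dx:dx + w]
    return out

def harris_response(img, kappa=0.04):
    """Harris-Stephens corner response det(M) - kappa * trace(M)^2.

    M is the locally summed autocorrelation matrix of the image derivatives.
    The box window and kappa are illustrative assumptions.
    """
    f = img.astype(float)
    iy, ix = np.gradient(f)        # derivatives along rows, then columns
    sxx = box_sum(ix * ix)
    syy = box_sum(iy * iy)
    sxy = box_sum(ix * iy)
    det = sxx * syy - sxy ** 2
    trace = sxx + syy
    return det - kappa * trace ** 2
```

Calling `harris_response(keep_top_bitplanes(img, k))` for increasing k shows how the response map changes as precision grows; an incremental scheme of the kind described above would instead update the earlier result as each new bitplane arrives.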

    Colour image coding with wavelets and matching pursuit

    This thesis considers sparse approximation of still images as the basis of a lossy compression system. The Matching Pursuit (MP) algorithm is presented as a method particularly suited for application in lossy scalable image coding. Its multichannel extension, capable of exploiting inter-channel correlations, is found to be an efficient way to represent colour data in RGB colour space. Two known problems with MP, the high computational complexity of encoding and dictionary design, are tackled by finding an appropriate partitioning of an image. The idea of performing MP in the spatio-frequency domain after a transform such as the Discrete Wavelet Transform (DWT) is explored. The main challenge, though, is to encode the image representation obtained after MP into a bit-stream. Novel approaches for encoding the atomic decomposition of a signal and for quantising colour amplitudes are proposed and evaluated. The image codec that has been built is capable of competing with scalable coders such as JPEG 2000 and SPIHT in terms of compression ratio.
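
For readers unfamiliar with Matching Pursuit, here is a minimal single-channel sketch of the generic greedy algorithm, assuming a dictionary whose columns are unit-norm atoms. It is not the thesis's multichannel or wavelet-domain codec; the function and parameter names are hypothetical.

```python
import numpy as np

def matching_pursuit(signal, dictionary, n_atoms):
    """Greedy MP over a dictionary with unit-norm columns (assumed).

    Returns the coefficient vector and the final residual.
    """
    residual = np.array(signal, dtype=float)
    coeffs = np.zeros(dictionary.shape[1])
    for _ in range(n_atoms):
        # Correlate every atom with the current residual.
        corr = dictionary.T @ residual
        # Pick the atom that explains the most residual energy.
        k = int(np.argmax(np.abs(corr)))
        coeffs[k] += corr[k]
        # Remove that atom's contribution from the residual.
        residual -= corr[k] * dictionary[:, k]
    return coeffs, residual
```

Each iteration lowers the residual energy by the square of the selected correlation, which is why truncating the atom list at any point yields a coarser but still valid approximation, the property that makes MP attractive for scalable coding.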

    Smooth real-time motion planning based on a cascade dual-quaternion screw-geometry MPC

    This paper investigates the tracking problem of a smooth coordinate-invariant trajectory using dual quaternion algebra. The proposed architecture consists of a cascade structure in which the outer-loop MPC performs real-time smoothing of the manipulator's end-effector twist while an inner-loop kinematic controller ensures tracking of the instantaneous desired end-effector pose. Experiments on a 7-DoF Franka Emika Panda robotic manipulator validate the proposed method, demonstrating its application to constrain the robot twists, accelerations and jerks within prescribed bounds.

    Representation Learning with Adversarial Latent Autoencoders

    A large number of deep learning methods applied to computer vision problems require encoder-decoder maps. These methods include, but are not limited to, self-representation learning, generalization, few-shot learning, and novelty detection. Encoder-decoder maps are also useful for photo manipulation, photo editing, superresolution, etc. Encoder-decoder maps are typically learned using autoencoder networks. Traditionally, autoencoder reciprocity is achieved in the image space using a pixel-wise similarity loss, which has a widely known flaw of producing non-realistic reconstructions. This flaw is typical for the Variational Autoencoder (VAE) family and is not limited to pixel-wise similarity losses, but is common to all methods relying upon the explicit maximum likelihood training paradigm, as opposed to an implicit one. Likelihood maximization, coupled with a poor decoder distribution, leads to poor or blurry reconstructions at best. Generative Adversarial Networks (GANs), on the other hand, perform an implicit maximization of the likelihood by solving a minimax game, thus bypassing the issues derived from the explicit maximization. This provides GAN architectures with remarkable generative power, enabling the generation of high-resolution images of humans, which are indistinguishable from real photos to the naked eye. However, GAN architectures lack inference capabilities, which makes them unsuitable for training encoder-decoder maps, effectively limiting their application space. We introduce an autoencoder architecture that (a) is free from the consequences of maximizing the likelihood directly, (b) produces reconstructions competitive in quality with state-of-the-art GAN architectures, and (c) allows learning disentangled representations, which makes it useful in a variety of problems.
We show that the proposed architecture and training paradigm significantly improve the state-of-the-art in novelty and anomaly detection methods, enable novel kinds of image manipulations, and have significant potential for other applications.

    Robust Algorithms for Low-Rank and Sparse Matrix Models

    Data in statistical signal processing problems is often inherently matrix-valued, and a natural first step in working with such data is to impose a model with structure that captures the distinctive features of the underlying data. Under the right model, one can design algorithms that can reliably tease weak signals out of highly corrupted data. In this thesis, we study two important classes of matrix structure: low-rankness and sparsity. In particular, we focus on robust principal component analysis (PCA) models that decompose data into the sum of low-rank and sparse (in an appropriate sense) components. Robust PCA models are popular because they are useful models for data in practice and because efficient algorithms exist for solving them. This thesis focuses on developing new robust PCA algorithms that advance the state-of-the-art in several key respects. First, we develop a theoretical understanding of the effect of outliers on PCA and the extent to which one can reliably reject outliers from corrupted data using thresholding schemes. We apply these insights and other recent results from low-rank matrix estimation to design robust PCA algorithms with improved low-rank models that are well-suited for processing highly corrupted data. On the sparse modeling front, we use sparse signal models like spatial continuity and dictionary learning to develop new methods with important adaptive representational capabilities. We also propose efficient algorithms for implementing our methods, including an extension of our dictionary learning algorithms to the online or sequential data setting. The underlying theme of our work is to combine ideas from low-rank and sparse modeling in novel ways to design robust algorithms that produce accurate reconstructions from highly undersampled or corrupted data. 
We consider a variety of application domains for our methods, including foreground-background separation, photometric stereo, and inverse problems such as video inpainting and dynamic magnetic resonance imaging.
PhD, Electrical Engineering: Systems, University of Michigan, Horace H. Rackham School of Graduate Studies
https://deepblue.lib.umich.edu/bitstream/2027.42/143925/1/brimoor_1.pd

    Visual perception: an information-based approach to understanding biological and artificial vision

    The central issues of this dissertation are (a) what we should be doing, i.e., what problems we should be trying to solve, in order to build computer vision systems, and (b) what relevance biological vision has to the solution of these problems. The approach taken to tackle these issues centres mostly on the clarification and use of information-based ideas, and an investigation into the nature of the processes underlying perception. The primary objective is to demonstrate that information theory, extensions of it, and measurement theory are powerful tools in helping to find solutions to these problems. The quantitative meaning of information is examined, from its origins in physical theories, through Shannon information theory, Gabor representations and codes, towards semantic interpretations of the term. The application of information theory to the understanding of the developmental and functional properties of biological visual systems is also discussed. This includes a review of the current state of knowledge of the architecture and function of the early visual pathways, particularly the retina, and a discussion of the possible coding functions of cortical neurons. The nature of perception is discussed from a number of points of view: the types and function of explanation of perceptual systems and how these relate to the operation of the system; the role of the observer in describing perceptual functions in other systems or organisms; the status and role of objectivist and representational viewpoints in understanding vision; the philosophical basis of perception; the relationship between pattern recognition and perception; and the interpretation of perception in terms of a theory of measurement. These two threads of research, information theory and measurement theory, are brought together in an overview and reinterpretation of the cortical role in mammalian vision.
Finally, the application of some of the coding and recognition concepts to industrial inspection problems is described. The coding process used is unusual in that coded images, rather than a heuristic feature set, are used as the input for a simple neural network classifier. The relationship between the Karhunen-Loève transform and the singular value decomposition is clarified as background to the coding technique used to code the images. This coding technique has also been used to code long sequences of moving images to investigate the possibilities of recognition of people on the basis of their gait or posture, and this application is briefly described.
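
The standard relationship between the Karhunen-Loève transform and the singular value decomposition can be checked numerically. The sketch below is a generic illustration, not the dissertation's coding technique: for mean-centred data, the eigenvectors of the sample covariance (the KLT basis) match the right singular vectors up to sign, and the eigenvalues equal the squared singular values divided by n - 1. The function name and the random test data are assumptions.

```python
import numpy as np

def klt_and_svd(X):
    """Compare the KLT basis of X (rows = samples) with its SVD."""
    Xc = X - X.mean(axis=0)                  # centre each feature
    n = Xc.shape[0]
    cov = Xc.T @ Xc / (n - 1)                # sample covariance matrix
    eigvals, eigvecs = np.linalg.eigh(cov)   # returned in ascending order
    eigvals, eigvecs = eigvals[::-1], eigvecs[:, ::-1]
    _, s, vt = np.linalg.svd(Xc, full_matrices=False)
    svd_vals = s ** 2 / (n - 1)              # variances implied by the SVD
    # Absolute inner product of matching basis vectors (1.0 means same axis).
    alignment = np.abs(np.sum(eigvecs * vt.T, axis=0))
    return eigvals, svd_vals, alignment

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5)) @ rng.normal(size=(5, 5))
eigvals, svd_vals, alignment = klt_and_svd(X)
print(np.allclose(eigvals, svd_vals), np.allclose(alignment, 1.0, atol=1e-6))
```

In practice the SVD route avoids forming the covariance matrix explicitly, which matters when each sample is a whole image and the feature dimension is large.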