    Role of homeostasis in learning sparse representations

    Neurons in the input layer of primary visual cortex in primates develop edge-like receptive fields. One approach to understanding the emergence of this response is to state that neural activity has to efficiently represent sensory data with respect to the statistics of natural scenes. Furthermore, it is believed that such efficient coding is achieved using a competition across neurons so as to generate a sparse representation, that is, one where a relatively small number of neurons are simultaneously active. Indeed, different models of sparse coding, coupled with Hebbian learning and homeostasis, have been proposed that successfully match the observed emergent response. However, the specific role of homeostasis in learning such sparse representations is still largely unknown. By quantitatively assessing the efficiency of the neural representation during learning, we derive a cooperative homeostasis mechanism that optimally tunes the competition between neurons within the sparse coding algorithm. We apply this homeostasis while learning small patches taken from natural images and compare its efficiency with state-of-the-art algorithms. Results show that while different sparse coding algorithms give similar coding results, the homeostasis provides an optimal balance for the representation of natural images within the population of neurons. Competition in sparse coding is optimized when it is fair. By contributing to optimizing statistical competition across neurons, homeostasis is crucial in providing a more efficient solution to the emergence of independent components.
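
    The competition between neurons described in this abstract can be illustrated with greedy matching pursuit, where at each step the single best-matching dictionary element wins and its contribution is subtracted from the signal. The following is a minimal sketch and not the paper's algorithm: the `gain` vector is a hypothetical stand-in for homeostasis that rescales each element's score during selection only, and the dictionary and signal are toy values chosen for illustration.

```python
def dot(u, v):
    """Inner product of two equal-length lists."""
    return sum(a * b for a, b in zip(u, v))


def matching_pursuit(signal, dictionary, n_steps, gain=None):
    """Greedy sparse coding: one atom wins the competition at each step.

    dictionary: list of unit-norm atoms (lists of floats).
    gain: hypothetical per-atom gain (an assumption, not from the paper),
          used only to choose the winner, never to scale the coefficient.
    """
    if gain is None:
        gain = [1.0] * len(dictionary)
    residual = list(signal)
    coeffs = [0.0] * len(dictionary)
    for _ in range(n_steps):
        # Correlate the current residual with every atom.
        corr = [dot(residual, atom) for atom in dictionary]
        # Competition: the atom with the largest gain-weighted score wins.
        winner = max(range(len(dictionary)),
                     key=lambda k: gain[k] * abs(corr[k]))
        coeffs[winner] += corr[winner]
        # Remove the winner's contribution from the residual.
        residual = [r - corr[winner] * a
                    for r, a in zip(residual, dictionary[winner])]
    return coeffs, residual


# Toy orthonormal dictionary and signal (illustrative values only).
atoms = [[1.0, 0.0], [0.0, 1.0]]
signal = [2.0, 0.5]

plain, _ = matching_pursuit(signal, atoms, n_steps=1)
biased, _ = matching_pursuit(signal, atoms, n_steps=1, gain=[0.1, 1.0])
full, leftover = matching_pursuit(signal, atoms, n_steps=2)
```

    With equal gains the first element wins the first step; lowering its gain lets the second element win instead, which shows how a gain term can rebalance how often each element is selected.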

    Subspace methods for portfolio design

    Financial signal processing (FSP) is an emerging area within signal processing that combines mathematical finance and signal processing. Signal processing engineers treat speech, image, video, and stock prices as signals of interest for a given application; the information inferred from the raw data differs for each application. Financial engineers develop new solutions to financial problems using their knowledge base in signal processing. Their goal is to process the harvested financial signal to extract information that is meaningful for the purpose at hand. Designing investment portfolios has always been at the center of finance. An investment portfolio comprises financial instruments such as stocks, bonds, futures, options, and others. It is designed based on the risk limits and return expectations of investors and is managed by portfolio managers. Modern Portfolio Theory (MPT) offers a mathematical method for portfolio optimization. It defines risk as the standard deviation of the portfolio return and provides a closed-form solution to the risk optimization problem, from which the asset allocations are derived. The risk and the return of an investment are two inseparable performance metrics. Therefore, the risk-normalized return, called the Sharpe ratio, is the most widely used performance metric for financial investments. Subspace methods have been one of the pillars of functional analysis and signal processing. They are used for portfolio design, regression analysis, and noise filtering in finance applications. Each subspace has unique characteristics that may serve the requirements of a specific application. For still image and video compression applications, the Discrete Cosine Transform (DCT) has been successfully employed in transform coding, where the Karhunen-Loeve Transform (KLT) is the optimum block transform. In this dissertation, a signal processing framework to design investment portfolios is proposed.
    Portfolio theory and subspace methods are investigated and jointly treated. First, the KLT, also known as eigenanalysis or principal component analysis (PCA), of the empirical correlation matrix for a random vector process that statistically represents asset returns in a basket of instruments is investigated. An auto-regressive discrete process of order one, AR(1), is employed to approximate such an empirical correlation matrix. The eigenvector and eigenvalue kernels of the AR(1) process are utilized to obtain closed-form expressions for the Sharpe ratios and market exposures of the resulting eigenportfolios. Their performance is evaluated and compared for various statistical scenarios. Then, a novel methodology is proposed to design subband/filterbank portfolios for a given empirical correlation matrix by using the theory of optimal filter banks. It is a natural extension of the celebrated eigenportfolios. Closed-form expressions for the Sharpe ratios and market exposures of subband/filterbank portfolios are derived and compared with those of eigenportfolios. A simple and powerful new method that uses rate-distortion theory to sparsify eigen-subspaces, called Sparse KLT (SKLT), is developed. The method utilizes mid-tread (zero-zone) pdf-optimized (Lloyd-Max) quantizers of varying size, created for each eigenvector (or for the entire eigenmatrix) of a given eigen-subspace, to achieve the desired cardinality reduction. The sparsity performance comparisons demonstrate the superiority of the proposed SKLT method over the popular sparse representation algorithms reported in the literature.
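
    To make the eigenportfolio idea in this abstract concrete, the sketch below builds an AR(1) correlation matrix, whose entries are `rho ** |i - j|`, and finds its dominant eigenvector by power iteration; the components of that eigenvector can be read as the weights of the first eigenportfolio. This is an illustrative sketch only: the basket size `5`, the value `rho = 0.9`, and the use of power iteration are assumptions made here, and the dissertation's closed-form expressions are not reproduced.

```python
import math


def ar1_correlation(n, rho):
    """Correlation matrix of an AR(1) process: R[i][j] = rho ** |i - j|."""
    return [[rho ** abs(i - j) for j in range(n)] for i in range(n)]


def matvec(matrix, vector):
    """Matrix-vector product for a list-of-lists matrix."""
    return [sum(m * x for m, x in zip(row, vector)) for row in matrix]


def dominant_eigenpair(matrix, iters=500):
    """Power iteration for the largest eigenvalue and its unit eigenvector."""
    n = len(matrix)
    v = [1.0 / math.sqrt(n)] * n
    for _ in range(iters):
        w = matvec(matrix, v)
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    # Rayleigh quotient of the converged unit vector.
    eigenvalue = sum(a * b for a, b in zip(v, matvec(matrix, v)))
    return eigenvalue, v


# Assumed toy setting: 5 assets with AR(1) correlation coefficient 0.9.
R = ar1_correlation(5, 0.9)
eigenvalue, weights = dominant_eigenpair(R)

# Variance captured by the first eigenportfolio, as a fraction of the total
# (the trace of a correlation matrix equals the number of assets).
explained = eigenvalue / len(R)
```

    For positive `rho` every entry of the matrix is positive, so all weights of the first eigenportfolio share the same sign; it behaves like a market-wide portfolio, while the remaining eigenvectors (not computed here) mix long and short positions.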

    Rate-Accuracy Trade-Off In Video Classification With Deep Convolutional Neural Networks

    Advanced video classification systems decode video frames to derive the necessary texture and motion representations for ingestion and analysis by spatio-temporal deep convolutional neural networks (CNNs). However, when considering visual Internet-of-Things applications, surveillance systems and semantic crawlers of large video repositories, the video capture and the CNN-based semantic analysis parts do not tend to be co-located. This necessitates the transport of compressed video over networks and incurs significant overhead in bandwidth and energy consumption, thereby significantly undermining the deployment potential of such systems. In this paper, we investigate the trade-off between the encoding bitrate and the achievable accuracy of CNN-based video classification models that directly ingest AVC/H.264 and HEVC encoded videos. Instead of retaining entire compressed video bitstreams and applying complex optical flow calculations prior to CNN processing, we only retain motion vector and select texture information at significantly reduced bitrates and apply no additional processing prior to CNN ingestion. Based on three CNN architectures and two action recognition datasets, we achieve 11%-94% saving in bitrate with marginal effect on classification accuracy. A model-based selection between multiple CNNs increases these savings further, to the point where, if up to 7% loss of accuracy can be tolerated, video classification can take place with as little as 3 kbps for the transport of the required compressed video information to the system implementing the CNN models.

    Selected Topics in Bayesian Image/Video Processing

    In this dissertation, three problems in image deblurring, inpainting and virtual content insertion are solved in a Bayesian framework. Camera shake, motion or defocus during exposure leads to image blur. Single image deblurring has achieved remarkable results by solving a MAP problem, but there is no perfect solution due to inaccurate image priors and estimators. In the first part, a new non-blind deconvolution algorithm is proposed. The image prior is represented by a Gaussian Scale Mixture (GSM) model, which is estimated from non-blurry images as training data. Our experimental results on a total of twelve natural images show that more details are restored than with previous deblurring algorithms. In augmented reality, it is a challenging problem to insert virtual content in video streams by blending it with spatial and temporal information. A generic virtual content insertion (VCI) system is introduced in the second part. To the best of my knowledge, it is the first successful system to insert content on building facades from street view video streams. Without knowing camera positions, the geometry model of a building facade is established by using a combined detection and tracking strategy. Moreover, motion stabilization, dynamic registration and color harmonization contribute to the excellent augmented performance of this automatic VCI system. Coding efficiency is an important objective in video coding. In recent years, video coding standards have evolved by adding new tools. However, this requires numerous modifications to complex coding systems. Therefore, it is desirable to consider alternative standard-compliant approaches that do not modify the codec structures. In the third part, an exemplar-based data pruning video compression scheme for intra frames is introduced. Data pruning is used as a pre-processing tool to remove part of the video data before it is encoded. At the decoder, missing data is reconstructed by a sparse linear combination of similar patches. The novelty lies in creating a patch library to exploit the similarity of patches. The scheme achieves an average 4% bit rate reduction on some high-definition videos.

    Deliverable D3.3 of the PERSEE project: 2D coding tools

    Deliverable D3.3 of the ANR PERSEE project. This report was produced as part of the ANR PERSEE project (no. ANR-09-BLAN-0170). Specifically, it corresponds to deliverable D3.3 of the project. Its title: 2D coding tools.

    A Novel Rate Control Algorithm for Onboard Predictive Coding of Multispectral and Hyperspectral Images

    Predictive coding is attractive for compression onboard spacecraft thanks to its low computational complexity, modest memory requirements and the ability to accurately control quality on a pixel-by-pixel basis. Traditionally, predictive compression has focused on the lossless and near-lossless modes of operation, where the maximum error can be bounded but the rate of the compressed image is variable. Rate control is considered a challenging problem for predictive encoders due to the dependencies between quantization and prediction in the feedback loop, and the lack of a signal representation that packs the signal's energy into few coefficients. In this paper, we show that it is possible to design a rate control scheme intended for onboard implementation. In particular, we propose a general framework to select quantizers in each spatial and spectral region of an image so as to achieve the desired target rate while minimizing distortion. The rate control algorithm makes it possible to achieve lossy compression, near-lossless compression, and any in-between type of compression, e.g., lossy compression with a near-lossless constraint. While this framework is independent of the specific predictor used, in order to show its performance, in this paper we tailor it to the predictor adopted by the CCSDS-123 lossless compression standard, obtaining an extension that performs lossless, near-lossless and lossy compression in a single package. We show that the rate controller has excellent performance in terms of accuracy in the output rate and rate-distortion characteristics, and is extremely competitive with respect to state-of-the-art transform coding.
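
    The bounded-error property of near-lossless predictive coding mentioned in this abstract can be shown with a small sketch. It assumes integer samples, a uniform residual quantizer with step `2 * delta + 1`, and a deliberately simple previous-sample predictor; it is not the CCSDS-123 predictor and it does not implement the paper's rate control. The function names and the sample values are illustrative.

```python
def quantize_residual(e, delta):
    """Quantizer index for an integer residual with max absolute error delta."""
    step = 2 * delta + 1
    sign = 1 if e >= 0 else -1
    return sign * ((abs(e) + delta) // step)


def dequantize(q, delta):
    """Reconstructed residual for a quantizer index."""
    return q * (2 * delta + 1)


def encode_decode(samples, delta):
    """Closed-loop predictive coding with a previous-sample predictor.

    The prediction uses the reconstructed value, not the original one, so
    encoder and decoder stay in step and the error does not accumulate.
    """
    indices, recon = [], []
    prev = 0  # assumed initial prediction
    for s in samples:
        q = quantize_residual(s - prev, delta)
        r = prev + dequantize(q, delta)
        indices.append(q)
        recon.append(r)
        prev = r
    return indices, recon


# Illustrative integer samples (not taken from any dataset).
samples = [10, 12, 15, 15, 9, 30, 31, -4]

_, lossless = encode_decode(samples, delta=0)
indices, near_lossless = encode_decode(samples, delta=2)
max_error = max(abs(s - r) for s, r in zip(samples, near_lossless))
```

    Setting `delta = 0` reproduces the input exactly, while a larger `delta` bounds every reconstruction error by `delta` at the cost of fewer distinct indices to code. Because each quantized value feeds the next prediction, changing the quantizer changes later residuals, which is the dependency that makes rate control hard in this setting.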