1,349 research outputs found
Sparse Modeling for Image and Vision Processing
In recent years, a large amount of multi-disciplinary research has been
conducted on sparse models and their applications. In statistics and machine
learning, the sparsity principle is used to perform model selection---that is,
automatically selecting a simple model among a large collection of them. In
signal processing, sparse coding consists of representing data with linear
combinations of a few dictionary elements. Subsequently, the corresponding
tools have been widely adopted by several scientific communities such as
neuroscience, bioinformatics, or computer vision. The goal of this monograph is
to offer a self-contained view of sparse modeling for visual recognition and
image processing. More specifically, we focus on applications where the
dictionary is learned and adapted to data, yielding a compact representation
that has been successful in various contexts.Comment: 205 pages, to appear in Foundations and Trends in Computer Graphics
and Visio
Near-Lossless Bitonal Image Compression System
The main purpose of this thesis is to develop an efficient near-lossless bitonal compression algorithm and to implement that algorithm on a hardware platform. The current methods for compression of bitonal images include the JBIG and JBIG2 algorithms, however both JBIG and JBIG2 have their disadvantages. Both of these algorithms are covered by patents filed by IBM, making them costly to implement commercially. Also, JBIG only provides means for lossless compression while JBIG2 provides lossy methods only for document-type images. For these reasons a new method for introducing loss and controlling this loss to sustain quality is developed. The lossless bitonal image compression algorithm used for this thesis is called Block Arithmetic Coder for Image Compression (BACIC), which can efficiently compress bitonal images. In this thesis, loss is introduced for cases where better compression efficiency is needed. However, introducing loss in bitonal images is especially difficult, because pixels undergo such a drastic change, either from white to black or black to white. Such pixel flipping introduces salt and pepper noise, which can be very distracting when viewing an image. Two methods are used in combination to control the visual distortion introduced into the image. The first is to keep track of the error created by the flipping of pixels, and using this error to decide whether flipping another pixel will cause the visual distortion to exceed a predefined threshold. The second method is region of interest consideration. In this method, lower loss or no loss is introduced into the important parts of an image, and higher loss is introduced into the less important parts. This allows for a good quality image while increasing the compression efficiency. Also, the ability of BACIC to compress grayscale images is studied and BACICm, a multiplanar BACIC algorithm, is created. A hardware implementation of the BACIC lossless bitonal image compression algorithm is also designed. The hardware implementation is done using VHDL targeting a Xilinx FPGA, which is very useful, because of its flexibility. The programmed FPGA could be included in a product of the facsimile or printing industry to handle the compression or decompression internal to the unit, giving it an advantage in the marketplace
Role of homeostasis in learning sparse representations
Neurons in the input layer of primary visual cortex in primates develop
edge-like receptive fields. One approach to understanding the emergence of this
response is to state that neural activity has to efficiently represent sensory
data with respect to the statistics of natural scenes. Furthermore, it is
believed that such an efficient coding is achieved using a competition across
neurons so as to generate a sparse representation, that is, where a relatively
small number of neurons are simultaneously active. Indeed, different models of
sparse coding, coupled with Hebbian learning and homeostasis, have been
proposed that successfully match the observed emergent response. However, the
specific role of homeostasis in learning such sparse representations is still
largely unknown. By quantitatively assessing the efficiency of the neural
representation during learning, we derive a cooperative homeostasis mechanism
that optimally tunes the competition between neurons within the sparse coding
algorithm. We apply this homeostasis while learning small patches taken from
natural images and compare its efficiency with state-of-the-art algorithms.
Results show that while different sparse coding algorithms give similar coding
results, the homeostasis provides an optimal balance for the representation of
natural images within the population of neurons. Competition in sparse coding
is optimized when it is fair. By contributing to optimizing statistical
competition across neurons, homeostasis is crucial in providing a more
efficient solution to the emergence of independent components
Moments-Based Fast Wedgelet Transform
In the paper the moments-based fast wedgelet
transform has been presented. In order to perform the classical
wedgelet transform one searches the whole wedgelets’
dictionary to find the best matching. Whereas in the proposed
method the parameters of wedgelet are computed directly
from an image basing on moments computation. Such
parameters describe wedgelet reflecting the edge present in
the image. However, such wedgelet is not necessarily the
best one in the meaning of Mean Square Error. So, to overcome
that drawback, the method which improves the matching
result has also been proposed. It works in the way that
the better matching one needs to obtain the longer time it
takes. The proposed transform works in linear time with respect
to the number of pixels of the full quadtree decomposition
of an image. More precisely, for an image of size
N Ă—N pixels the time complexity of the proposed wedgelet
transform is O(N2 log2 N)
Colour image coding with wavelets and matching pursuit
This thesis considers sparse approximation of still images as the basis of a lossy compression system. The Matching Pursuit (MP) algorithm is presented as a method particularly suited for application in lossy scalable image coding. Its multichannel extension, capable of exploiting inter-channel correlations, is found to be an efficient way to represent colour data in RGB colour space. Known problems with MP, high computational complexity of encoding and dictionary design, are tackled by finding an appropriate partitioning of an image. The idea of performing MP in the spatio-frequency domain after transform such as Discrete Wavelet Transform (DWT) is explored. The main challenge, though, is to encode the image representation obtained after MP into a bit-stream. Novel approaches for encoding the atomic decomposition of a signal and colour amplitudes quantisation are proposed and evaluated. The image codec that has been built is capable of competing with scalable coders such as JPEG 2000 and SPIHT in terms of compression ratio
- …