Fast search algorithms for ECVQ using projection pyramids and variance of codewords
Graduate School of Natural Science and Technology, Kanazawa University; Faculty of Engineering, Kanazawa University. Vector quantization for image compression is expensive because finding the closest codeword requires searching through the entire codebook. Codebook design based on empirical data for entropy-constrained vector quantization (ECVQ) involves a time-consuming training phase in which a Lagrangian cost measure has to be minimized over the set of codebook vectors. In this paper, we propose two fast codebook generation methods for ECVQ. In the first, we use an appropriate topological structure of input vectors and codewords to reject many codewords that cannot be candidates for the best codeword. In the second, we use a variance test to increase the ability of the first algorithm to reject even more codewords. These algorithms significantly accelerate the codebook design process. Experimental results are presented on image block data and show that our new algorithms perform better than previously known methods.
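The variance-based rejection idea in the abstract above can be illustrated with a small sketch. For vectors of dimension n with means m and standard deviations s, the bound ||x − c||² ≥ n(m_x − m_c)² + n(s_x − s_c)² lets a nearest-codeword search skip codewords that cannot beat the current best. This is a minimal illustration of that elimination principle, not the paper's exact projection-pyramid algorithm; the function name is hypothetical.

```python
import numpy as np

def nearest_codeword(x, codebook):
    """Nearest-codeword search with a mean/variance rejection test.

    A codeword c is skipped whenever the lower bound
    n*(mx - mc)^2 + n*(sx - sc)^2 <= ||x - c||^2
    already exceeds the best distance found so far.
    """
    n = x.size
    mx, sx = x.mean(), x.std()
    best_i, best_d = -1, np.inf
    for i, c in enumerate(codebook):
        mc, sc = c.mean(), c.std()  # in practice, precomputed offline
        lb = n * (mx - mc) ** 2 + n * (sx - sc) ** 2
        if lb >= best_d:
            continue  # cannot beat the current best: reject without full distance
        d = float(np.sum((x - c) ** 2))
        if d < best_d:
            best_d, best_i = d, i
    return best_i, best_d
```

Because the bound is a true lower bound on the squared distance, the search returns the same codeword as an exhaustive search while computing far fewer full distances.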
Feature Encoding Strategies for Multi-View Image Classification
Machine vision systems can vary greatly in size and complexity depending on the task at hand, but the purpose of inspection, ensuring quality and reliability, remains the same. This work sets out to bridge the gap between traditional machine vision and computer vision: by applying powerful computer vision techniques, we are able to achieve more robust solutions in manufacturing settings. This thesis presents a framework for applying powerful new image classification techniques used for image retrieval in the Bag of Words (BoW) framework. In addition, an exhaustive evaluation of commonly used feature pooling approaches is conducted, with results showing that spatial augmentation can outperform mean and max descriptor pooling on an in-house dataset and the Caltech 3D dataset. The experiments detail a framework that performs classification using multiple viewpoints. The results show that the feature encoding method known as Triangulation Embedding outperforms the Vector of Locally Aggregated Descriptors (VLAD) and the standard BoW framework, with an accuracy of 99.28%. This improvement is also seen on the public Caltech 3D dataset, where the improvement over VLAD and BoW was 5.64% and 12.23% respectively. The proposed multiple-view classification system is also robust enough to handle the real-world problem of camera failure and still classify with high reliability. A missing camera input was simulated, showing that with the Triangulation Embedding method the system could still perform classification with only a very minor reduction in accuracy, at 98.89%, compared to the BoW baseline at 96.60% using the same techniques. The presented solution tackles the traditional machine vision problem of object identification and also allows a machine vision system to be trained without any expert-level knowledge.
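For reference, the VLAD encoding that the thesis compares against can be sketched in a few lines: each local descriptor is hard-assigned to its nearest visual word (centroid), residuals are accumulated per word, and the result is flattened and L2-normalized. This is a minimal sketch, omitting refinements such as power normalization; the function name is illustrative.

```python
import numpy as np

def vlad_encode(descriptors, centroids):
    """VLAD: per-word accumulation of descriptor residuals.

    descriptors: (N, d) array of local descriptors.
    centroids:   (k, d) visual vocabulary.
    Returns an L2-normalized vector of length k*d.
    """
    k, d = centroids.shape
    # hard-assign each descriptor to its nearest visual word
    dists = ((descriptors[:, None, :] - centroids[None, :, :]) ** 2).sum(-1)
    assign = dists.argmin(axis=1)
    v = np.zeros((k, d))
    for i, desc in zip(assign, descriptors):
        v[i] += desc - centroids[i]  # accumulate the residual
    v = v.ravel()
    norm = np.linalg.norm(v)
    return v / norm if norm > 0 else v
```

The resulting k*d-dimensional vector is what a linear classifier would consume in the multi-view pipeline described above.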
Detecting Nature in Pictures
With the advent of large-scale geo-tagged image sharing on the internet, on websites such as Flickr and Panoramio, there are now large sources of data ready to be mined for useful information. Using this data to automatically create a map of man-made and natural areas of our planet can provide additional knowledge to decision-makers responsible for world conservation. The problem of determining the degree of naturalness of an image, a precondition for creating such a map, can be generalized as a scene classification task. Experiments were performed to better understand how well each of the identified scene classification techniques distinguishes between man-made and natural images. Their advantages and limitations, such as their computational costs, are detailed. With careful selection of techniques and their parameters, it was possible to build a classifier capable of distinguishing between natural and man-made scenery with high accuracy, and one that can also process a large number of pictures within a reasonable time frame.
Sparse Modeling for Image and Vision Processing
In recent years, a large amount of multi-disciplinary research has been conducted on sparse models and their applications. In statistics and machine learning, the sparsity principle is used to perform model selection, that is, automatically selecting a simple model among a large collection of them. In signal processing, sparse coding consists of representing data with linear combinations of a few dictionary elements. Subsequently, the corresponding tools have been widely adopted by several scientific communities such as neuroscience, bioinformatics, and computer vision. The goal of this monograph is to offer a self-contained view of sparse modeling for visual recognition and image processing. More specifically, we focus on applications where the dictionary is learned and adapted to data, yielding a compact representation that has been successful in various contexts. Comment: 205 pages, to appear in Foundations and Trends in Computer Graphics and Vision
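Sparse coding, representing a signal as a linear combination of a few dictionary atoms, can be illustrated with a minimal Orthogonal Matching Pursuit sketch. OMP is one standard algorithm for the sparse decomposition step (the monograph covers several); the names and dimensions below are illustrative, and the dictionary columns are assumed unit-norm.

```python
import numpy as np

def omp(D, x, n_nonzero):
    """Orthogonal Matching Pursuit (a common sparse-coding solver).

    Greedily picks the dictionary atom most correlated with the current
    residual, then refits all selected atoms by least squares.
    D: (m, p) dictionary with (roughly) unit-norm columns.
    Returns a length-p coefficient vector with at most n_nonzero nonzeros.
    """
    residual = x.copy()
    support = []
    coef = np.zeros(D.shape[1])
    for _ in range(n_nonzero):
        j = int(np.argmax(np.abs(D.T @ residual)))  # best-matching atom
        if j not in support:
            support.append(j)
        sol, *_ = np.linalg.lstsq(D[:, support], x, rcond=None)
        residual = x - D[:, support] @ sol  # refit and update residual
    coef[support] = sol
    return coef
```

With an orthonormal dictionary the recovery is exact; with a learned, overcomplete dictionary the same loop yields the compact representations the monograph discusses.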
New methods in image compression using multi-level transforms and adaptive statistical encoding
The need to meet the demand for high-quality digital images with comparatively modest storage requirements is driving the development of new image compression techniques. This demand has spurred new techniques based on spatial-domain to frequency-domain transformation methods. At the core of these methods is a family of transformations built on basis sets called wavelets. The wavelet transform permits an image to be represented in a substantially reduced space by transferring the energy of the image to a smaller set of coefficients. Although these techniques become increasingly lossy as the compression ratio rises, very adequate reconstructions can be made from surprisingly small sets of coefficients. This work explores the transformation process, storage of the representation, and the application of these techniques to 24-bit color images. A working color image compression model is illustrated.
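The idea of transferring signal energy to a few coefficients can be sketched with a single-level 1-D Haar transform, the simplest member of the wavelet family. Real codecs work in 2-D and recurse on the low-pass band over multiple levels, so this is only a hedged, minimal illustration; all function names are hypothetical.

```python
import numpy as np

def haar_1d(signal):
    """One level of the orthonormal 1-D Haar transform (even-length input)."""
    s = signal.reshape(-1, 2)
    avg = (s[:, 0] + s[:, 1]) / np.sqrt(2)  # low-pass (local averages)
    det = (s[:, 0] - s[:, 1]) / np.sqrt(2)  # high-pass (local details)
    return np.concatenate([avg, det])

def ihaar_1d(coeffs):
    """Inverse of one Haar level: perfect reconstruction from avg/det bands."""
    h = coeffs.size // 2
    avg, det = coeffs[:h], coeffs[h:]
    out = np.empty(coeffs.size)
    out[0::2] = (avg + det) / np.sqrt(2)
    out[1::2] = (avg - det) / np.sqrt(2)
    return out

def compress(signal, keep):
    """Lossy sketch: keep only the `keep` largest-magnitude coefficients."""
    c = haar_1d(signal)
    idx = np.argsort(np.abs(c))[:-keep]  # indices of the smallest coefficients
    c[idx] = 0.0                          # discard low-energy coefficients
    return ihaar_1d(c)
```

Because the transform is orthonormal, zeroing the smallest coefficients discards the least signal energy, which is the mechanism behind the "surprisingly small sets of coefficients" noted above.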