186 research outputs found

S^2-Transformer for Mask-Aware Hyperspectral Image Reconstruction

Authors: Li Kunpeng; Tao Zhiqiang; Wang Jiamian; Yuan Xin; Zhang Yulun
Publication date: 14/12/2022

Hyperspectral imaging (HSI) records visual information across a wide range of spectral wavelengths. A representative hyperspectral image acquisition procedure performs a 3D-to-2D encoding with the coded aperture snapshot spectral imager (CASSI) and requires a software decoder to reconstruct the 3D signal. This physical encoding procedure poses two major challenges for high-fidelity reconstruction. (i) To obtain 2D measurements, CASSI displaces multiple channels by disperser tilting and squeezes them onto the same spatial region, yielding an entangled data loss. (ii) The physical coded aperture leads to a masked data loss by selectively blocking the pixel-wise light exposure. To tackle these challenges, we propose a spatial-spectral (S^2-) Transformer network with a mask-aware learning strategy. First, we leverage spatial and spectral attention modeling simultaneously to disentangle the blended information in the 2D measurement along both dimensions. A series of Transformer structures is systematically designed to fully investigate the spatial and spectral informative properties of the hyperspectral data. Second, masked pixels induce higher prediction difficulty and should be treated differently from unmasked ones. We therefore adaptively prioritize the loss penalty attributed to the mask structure by inferring the pixel-wise reconstruction difficulty from the mask-encoded prediction. We theoretically discuss the distinct convergence tendencies of masked and unmasked regions under the proposed learning strategy. Extensive experiments demonstrate that the proposed method achieves superior reconstruction performance. Additionally, we empirically examine the behaviour of spatial and spectral attention under the proposed architecture and comprehensively examine the impact of mask-aware learning.

Comment: 11 pages, 16 figures, 6 tables, Code: https://github.com/Jiamian-Wang/S2-transformer-HS
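The two sources of data loss named in the abstract above (masking by the coded aperture, then per-channel displacement and summation onto one 2D plane) can be illustrated with a toy forward model. This is a minimal sketch and not the authors' code: the function name `cassi_measure`, the one-column-per-channel shift `step`, and the tiny example data are all assumptions made for illustration.

```python
def cassi_measure(cube, mask, step=1):
    """Simulate a simplified CASSI-style 2D measurement (illustrative only).

    cube: list of K spectral channels, each an H x W list of lists
    mask: H x W list of lists with 0/1 entries (the coded aperture)
    step: assumed column shift between neighbouring channels
    """
    num_channels = len(cube)
    height, width = len(mask), len(mask[0])
    out_width = width + step * (num_channels - 1)
    measurement = [[0.0] * out_width for _ in range(height)]
    for k, channel in enumerate(cube):
        shift = k * step  # channel k lands `shift` columns to the right
        for r in range(height):
            for c in range(width):
                # Masked pixels (mask == 0) contribute nothing: masked data loss.
                # Shifted channels add into the same cell: entangled data loss.
                measurement[r][c + shift] += mask[r][c] * channel[r][c]
    return measurement


cube = [
    [[1.0, 2.0], [3.0, 4.0]],      # channel 0
    [[10.0, 20.0], [30.0, 40.0]],  # channel 1
]
mask = [[1, 0], [1, 1]]
measurement = cassi_measure(cube, mask)
```

In this toy case the middle column of the measurement mixes both channels, which is the kind of blended information a reconstruction network would have to disentangle.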

arXiv.org e-Print Archive

I'm sorry to say, but your understanding of image processing fundamentals is absolutely wrong

Author: Diamant Emanuel
Publication date: 01/08/2008

The ongoing discussion of whether modern vision systems should be viewed as visually-enabled cognitive systems or cognitively-enabled vision systems is groundless, because the perceptual and cognitive faculties of vision are separate components in the modeling of the human (and consequently, artificial) information processing system.

Comment: To be published as chapter 5 in "Frontiers in Brain, Vision and AI", I-TECH Publisher, Vienna, 200

arXiv.org e-Print Archive

IntechOpen

Crossref

A bag of words description scheme for image quality assessment

Author: Fernandes Miguel Francisco Fidalgo
Publication date: 31/10/2016

Every day, millions of images are obtained, processed, compressed, saved, transmitted and reproduced. All these operations can cause distortions that affect their quality. The quality of these images can be measured subjectively, but this has the disadvantage of requiring a considerable number of tests with individuals in order to produce a statistical analysis of an image’s perceptual quality. Several objective metrics have been developed that try to model the human perception of quality. However, in most applications the representation of human quality perception given by these metrics is far from the desired one. Therefore, this work proposes the use of machine learning models that allow for a better approximation. In this work, definitions of image and quality are given and some of the difficulties of studying image quality are mentioned. Moreover, three metrics are initially explained. One uses the image’s original quality as a reference (SSIM), while the other two are no-reference metrics (BRISQUE and QAC). A comparison is made, showing a large discrepancy in values between the two kinds of metrics. The database used for the tests is TID2013, chosen because of its size and because it covers a large number of distortions. Each type of distortion in this database is studied. Furthermore, some concepts of machine learning are introduced along with algorithms relevant in the context of this dissertation, notably K-means, KNN and SVM. Descriptor aggregation algorithms such as “bag of words” and “fisher-vectors” are also mentioned. This dissertation studies a new model that combines machine learning and a quality metric for quality estimation. The model is based on dividing images into cells, in each of which a specific metric is computed. This division yields local quality descriptors that are aggregated using “bag of words”. An SVM with an RBF kernel is trained and tested on the same database, and the results of the model are evaluated using cross-validation. The results are analysed using the Pearson, Spearman and Kendall correlations and the RMSE to evaluate how well the model represents the subjective results. The model improves the results of the metric that was used and shows a new path for applying machine learning to quality evaluation.

In our daily lives, images are obtained, processed, compressed, stored, transmitted and reproduced. Any of these operations can introduce distortions that impair their quality. The quality of these images can be measured subjectively, which has the disadvantage of requiring several tests with a considerable number of individuals before a statistical analysis of an image's perceptual quality can be made. Several objective metrics have been developed that try in some way to model the human perception of quality. However, in many applications the representation of human quality perception given by these metrics falls short of what is desirable, which is why this work proposes using pattern recognition models that allow a closer approximation. In this work, definitions of image and quality are given and some of the difficulties of studying image quality are mentioned. The importance of image quality as a field of study is noted, and several quality metrics are studied. Three metrics are explained: one that uses the original quality as a reference (SSIM) and two no-reference metrics (BRISQUE and QAC). A comparison between them is made, showing a large discrepancy in values between the two kinds of metrics. The tests use the TID2013 database, which is often used for studies of quality metrics because of its size and because it covers a large number of distortions. This work also studies the types of distortion included in this database and how they are simulated. Some theoretical concepts of pattern recognition are also introduced, and some algorithms relevant in the context of the dissertation are described, such as K-means, KNN and SVMs. Descriptor aggregation algorithms such as “bag of words” and “fisher-vectors” are also mentioned. This dissertation adds pattern recognition methods to objective image quality metrics. A new technique is proposed, based on dividing images into cells, in each of which a metric is computed. This division yields local quality descriptors that are aggregated using “bag of words”. An SVM with an RBF kernel is trained and tested on the same database, and the model's results are reported using cross-validation. The results are analysed using the Pearson, Spearman and Kendall correlations and the RMSE, which make it possible to assess how close the developed metric is to the subjective results. This model improves the results obtained with the metric used and demonstrates a new way of applying pattern recognition models to the study of quality assessment.
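The evaluation step described in the abstract above compares predicted quality scores with subjective scores using correlation coefficients and the RMSE. The following is a minimal pure-Python sketch of three of those measures. The score values are made up for illustration and do not come from the dissertation, and the rank helper assumes there are no tied scores.

```python
import math


def pearson(x, y):
    """Pearson linear correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (norm_x * norm_y)


def ranks(values):
    """Rank values from 1..n (assumption: no ties in the data)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, index in enumerate(order, start=1):
        result[index] = rank
    return result


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


def rmse(x, y):
    """Root-mean-square error between predictions and reference scores."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)) / len(x))


# Hypothetical scores, for illustration only.
predicted = [2.0, 3.5, 4.0, 5.5]
subjective = [2.5, 3.0, 4.5, 5.0]

pearson_value = pearson(predicted, subjective)
spearman_value = spearman(predicted, subjective)
rmse_value = rmse(predicted, subjective)
```

A rank correlation close to 1 with a nonzero RMSE, as in this toy data, indicates that the predictions order the images correctly even though the absolute scores differ.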

UBibliorum repositorio digital da ubi

MantissaCam: Learning Snapshot High-dynamic-range Imaging with Perceptually-based In-pixel Irradiance Encoding

Authors: Dudek Piotr; Martel Julien N. P.; So Haley M.; Wetzstein Gordon
Publication date: 09/12/2021

The ability to image high-dynamic-range (HDR) scenes is crucial in many computer vision applications. The dynamic range of conventional sensors, however, is fundamentally limited by their well capacity, resulting in saturation of bright scene parts. To overcome this limitation, emerging sensors offer in-pixel processing capabilities to encode the incident irradiance. Among the most promising encoding schemes is modulo wrapping, which results in a computational photography problem where the HDR scene is computed by an irradiance unwrapping algorithm from the wrapped low-dynamic-range (LDR) sensor image. Here, we design a neural network-based algorithm that outperforms previous irradiance unwrapping methods and, more importantly, we design a perceptually inspired "mantissa" encoding scheme that more efficiently wraps an HDR scene into an LDR sensor. Combined with our reconstruction framework, MantissaCam achieves state-of-the-art results among modulo-type snapshot HDR imaging approaches. We demonstrate the efficacy of our method in simulation and show preliminary results of a prototype MantissaCam implemented with a programmable sensor.
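The modulo wrapping mentioned in the abstract above can be sketched in a few lines. This is only an illustration of plain modulo wrapping, not of the paper's "mantissa" encoding or its neural unwrapping network. The `capacity` value and the sample irradiances are assumptions, and the wrap counts are taken as known here, whereas estimating them is the actual reconstruction problem.

```python
def wrap(irradiance, capacity):
    """Modulo-wrap each value so it fits the sensor's range [0, capacity)."""
    return [value % capacity for value in irradiance]


def unwrap(wrapped, wrap_counts, capacity):
    """Invert the wrapping, given how many times each pixel wrapped."""
    return [w + k * capacity for w, k in zip(wrapped, wrap_counts)]


capacity = 256                    # assumed sensor range
scene = [10, 300, 520]            # hypothetical HDR irradiance values
wrapped = wrap(scene, capacity)   # what the LDR sensor would record
wrap_counts = [value // capacity for value in scene]  # unknown in practice
recovered = unwrap(wrapped, wrap_counts, capacity)
```

Bright pixels alias onto small sensor values instead of saturating, so the scene is recoverable only if the per-pixel wrap count can be inferred.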

arXiv.org e-Print Archive

The University of Manchester - Institutional Repository

Compressive sampling using a pushframe camera

Authors: Bennett Stuart; Griffin Paul F; Jeffers John; Marshall Stephen; Murray Paul; Noblet Yoann; Oi Daniel
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 19/07/2021

The recently described pushframe imager, a parallelized single pixel camera capturing with a pushbroom-like motion, is intrinsically suited to both remote-sensing and compressive sampling. It optically applies a 2D mask to the imaged scene, before performing light integration along a single spatial axis, but previous work has not made use of the architecture's potential for taking measurements sparsely. In this paper we develop a strongly performing static binarized noiselet compressive sampling mask design, tailored to pushframe hardware, allowing both a single exposure per motion time-step, and retention of 2D correlations in the scene. Results from simulated and real-world captures are presented, with performance shown to be similar to that of immobile — and hence inappropriate for satellite use — whole-scene imagers. A particular feature of our sampling approach is that the degree of compression can be varied without altering the pattern, and we demonstrate the utility of this for efficiently storing and transmitting multi-spectral images

University of Strathclyde Institutional Repository

A Cognitive Radio Compressive Sensing Framework

Author: Karampoulas Dimitrios
Publication date: 08/03/2018

With the proliferation of wireless devices and services, allied with further significant predicted growth, there is an ever-increasing demand for higher transmission rates. This is especially challenging given the limited availability of radio spectrum, and is further exacerbated by a rigid licensing regulatory regime. Spectrum, however, is largely underutilized, and this has prompted regulators to promote the concept of opportunistic spectrum access. This allows unlicensed secondary users to use bands which are licensed to primary users but are currently unoccupied, leading to more efficient spectrum utilization. A potentially attractive solution to this spectrum underutilization problem is cognitive radio (CR) technology, which enables the identification and usage of vacant bands by continuously sensing the radio environment, though CR enforces stringent timing requirements and high sampling rates. Compressive sensing (CS) has emerged as a novel sampling paradigm, which provides the theoretical basis to resolve some of these issues, especially for signals exhibiting sparsity in some domain. For CR-related signals, however, existing CS architectures such as the random demodulator and compressive multiplexer have limitations with regard to the signal types used, the spectrum estimation methods applied, spectral band classification and a dependence on Fourier-domain-based sparsity. This thesis presents a new generic CS framework which addresses these issues through three original scientific contributions: i) seamless embedding of the concept of precolouring into existing CS architectures to enhance signal sparsity for CR-related digital modulation schemes; ii) integration of the multitaper spectral estimator to improve sparsity in CR narrowband modulation schemes; and iii) exploiting sparsity in an alternative, non-Fourier (Walsh-Hadamard) domain to expand the applicable CR-related modulation schemes. 
Critical analysis reveals that the new CS framework provides a consistently superior and robust solution for the recovery of an extensive set of currently employed CR-type signals encountered in wireless communication standards. Significantly, the generic and portable nature of the framework affords the opportunity for further extensions into other CS architectures and sparsity domains.

Open Research Online (The Open University)

Feature extraction for image quality prediction

Author: Kayargadde V.
Publication venue: Technische Universiteit Eindhoven
Publication date: 01/01/1995

Repository TU/e

Pure OAI Repository

Design of large polyphase filters in the Quadratic Residue Number System

Authors: Cardarilli G; Nannarelli A; Oster Y; Petricca M; Re M
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 01/01/2010

Crossref

ART

Online Research Database In Technology