5 research outputs found

    Learnable Reconstruction Methods from RGB Images to Hyperspectral Imaging: A Survey

    Hyperspectral imaging enables versatile applications because it captures abundant spatial and spectral information, which is crucial for identifying substances. However, the devices for acquiring hyperspectral images are expensive and complicated. Therefore, many alternative spectral imaging methods have been proposed that directly reconstruct the hyperspectral information from lower-cost, more readily available RGB images. We present a thorough investigation of these state-of-the-art methods for spectral reconstruction from widespread RGB images. A systematic study and comparison of more than 25 methods has revealed that most of the data-driven deep learning methods are superior to prior-based methods in terms of reconstruction accuracy and quality, despite lower speeds. This comprehensive review can serve as a reference source for peer researchers and may inspire future development directions in related domains.

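    Diseño e implementación de un sistema multiespectral en el rango ultravioleta, visible e infrarrojo : aplicación al estudio y conservación de obras de arte [Design and implementation of a multispectral system in the ultraviolet, visible and infrared ranges: application to the study and conservation of artworks]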

    Multispectral systems have several applications and can be implemented in different configurations. Various characteristics make them useful, but the basic one is access to the spectral information of a scene or sample with high spatial resolution. In this thesis, the main objective has been the design and implementation of a multispectral system that covers part of the UV, the visible, and part of the IR ranges of the electromagnetic spectrum, to be applied to artwork studies. The main part of the thesis is dedicated to the design, characterization and application of a multispectral system based on multiplexed illumination using light-emitting diodes (LEDs). This system comprises two modules: Module 1 UV-Vis, with a CCD camera sensitive between 370 nm and 930 nm, coupled to an LED source with 16 different channels, i.e. 16 emission wavelengths; and Module 2 IR, with an InGaAs camera sensitive between 930 nm and 1650 nm, coupled to an LED source with 7 emission wavelengths. Thus, the complete system covers from 370 nm to 1650 nm with a total of 23 channels obtained with LED illumination. The system elements were characterized, and simulations were performed to assess their performance in the reconstruction of spectral reflectance under ideal conditions and also under conditions of quantization noise and additive noise. Performance was evaluated using the CIEDE2000 color difference formula, the root mean square error (RMSE) and the goodness-of-fit coefficient (GFC). The simulation results showed good overall system performance, with better results for Module 1 UV-Vis due to the larger number of LED channels in its spectral range. Computer programs with their respective graphical interfaces were implemented to control the hardware and process the information provided by the system. 
For the spectral reconstruction we employed a method based on direct interpolation using splines, and two methods based on a training set of samples with known digital system responses and spectral reflectances: the undetermined pseudo-inverse (PSE-1) and the simple pseudo-inverse (PSE). The equipment was evaluated on real samples of the Color Checker CCCR chart and on a set of fresco patches painted with pigments commonly used in artworks. The CIEDE2000, RMSE and GFC results showed that the PSE-1 and PSE methods have similar performance, with slightly better results for the latter. The interpolation method performed slightly worse, but it has the practical value of not needing training. The results for the PSE method were similar to those obtained through simulation, and confirmed that Module 2 IR has lower performance. It was concluded that overall system performance was good, with average CIEDE2000 and RMSE values for the PSE-based methods on the order of 1 unit. The developed system was applied to artworks in the museum of the Pedralbes Monastery in Barcelona and in the churches of Sant Pere in Terrassa. Several images of murals in the chapel of San Miguel in the Pedralbes Monastery were captured. The evaluation of the system for this museum application showed performance similar to that reported in the laboratory. We also captured a large-format oil painting on wood, La Virgen de la Leche. For this artwork, the modular design and easy mobility of the system were used to generate a complete picture by composing several smaller images. At the churches of Sant Pere, we explored wall paintings dating from the Visigothic (6th-7th centuries) and Romanesque (12th-13th centuries) periods to assess whether there were features in the paintings that were not evident in the visible range but were evident in other spectral ranges. Enhancement algorithms were implemented for this task. 
The results obtained in this thesis demonstrate the potential of the developed multispectral system for obtaining spectral information in the ultraviolet, visible and infrared regions.
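The training-based reconstruction and two of the metrics named in the abstract above (RMSE and GFC) can be sketched as follows. This is a minimal illustration under stated assumptions, not the thesis implementation: the data are synthetic, the 10 nm sampling, the function names and the linear camera model are hypothetical, and the sketch does not try to distinguish the PSE-1 and PSE variants, whose exact definitions the abstract does not give.

```python
import numpy as np

def train_pseudo_inverse(responses, reflectances):
    """Least-squares map W such that reflectances ≈ responses @ W.

    responses:    (n_samples, n_channels) digital responses of training samples
    reflectances: (n_samples, n_wavelengths) known spectral reflectances
    """
    return np.linalg.pinv(responses) @ reflectances

def rmse(r_true, r_est):
    """Root mean square error over all samples and wavelengths."""
    return float(np.sqrt(np.mean((r_true - r_est) ** 2)))

def gfc(r_true, r_est):
    """Mean goodness-of-fit coefficient over rows (1.0 = identical spectral shape)."""
    num = np.abs(np.sum(r_true * r_est, axis=1))
    den = np.linalg.norm(r_true, axis=1) * np.linalg.norm(r_est, axis=1)
    return float(np.mean(num / den))

# Synthetic stand-in data (hypothetical, NOT the thesis measurements).
rng = np.random.default_rng(0)
n_channels = 16        # LED channels of Module 1 UV-Vis
n_wavelengths = 57     # assumed sampling: 370-930 nm in 10 nm steps
basis = rng.random((n_channels, n_wavelengths))          # spectra span a 16-dim subspace
sensitivities = rng.random((n_wavelengths, n_channels))  # assumed linear camera model
coeffs = rng.random((40, n_channels))
reflectances = coeffs @ basis
responses = reflectances @ sensitivities

# Train on 30 samples, evaluate on the remaining 10.
W = train_pseudo_inverse(responses[:30], reflectances[:30])
estimate = responses[30:] @ W
test_rmse = rmse(reflectances[30:], estimate)
test_gfc = gfc(reflectances[30:], estimate)
```

Because the synthetic spectra are noise-free and lie in a subspace with as many dimensions as there are channels, the linear map recovers them almost exactly; with real measurements, noise and a richer spectral space would give nonzero error.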

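    잡음에 강인한 음성 구간 검출과 음성 향상을 위한 딥 러닝 기반 기법 연구 [A study on deep learning-based techniques for noise-robust voice activity detection and speech enhancement]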

    Doctoral thesis, Seoul National University Graduate School, Department of Electrical and Computer Engineering, February 2017. 김남수. Over the past decades, a number of approaches have been proposed to improve the performance of voice activity detection (VAD) and speech enhancement algorithms, which are crucial for speech communication and speech signal processing systems. In particular, the increasing use of machine learning-based techniques has led to more robust algorithms in low-SNR conditions. Among them, the deep neural network (DNN) has been one of the most popular techniques. While DNN-based techniques have been successfully applied to these tasks, the characteristics of the VAD and speech enhancement tasks are not fully incorporated into the DNN structures and objective functions. In this thesis, we propose novel training schemes and a post-filter for DNN-based VAD and speech enhancement. Unlike algorithms built on the basic DNN framework, the proposed algorithms combine knowledge from the signal processing and machine learning communities to improve DNN-based VAD and speech enhancement. In the following chapters, the environmental mismatch problem in VAD is addressed by applying multi-task learning to the DNN-based VAD. In addition, a DNN-based framework is proposed for the speech enhancement scenario, and a novel objective function and post-filter derived from the characteristics of human auditory perception improve the DNN-based speech enhancement algorithm. In the VAD task, the DNN-based algorithm was recently proposed and outperformed traditional and other machine learning-based VAD algorithms. However, the performance of the DNN-based algorithm sometimes deteriorates when the training and test environments do not match. To improve the performance of the DNN-based VAD in unseen environments, we adopt a multi-task learning (MTL) framework which consists of the primary VAD task and a subsidiary feature enhancement task. 
By employing the MTL framework, the DNN learns a denoising function in the shared hidden layers that helps maintain VAD performance in mismatched noise conditions. Second, the DNN-based framework is applied to speech enhancement by treating it as a regression task. The encoding vector of the conventional nonnegative matrix factorization (NMF)-based algorithm is estimated by the proposed DNN, and the performance of the DNN-based algorithm is compared to that of the conventional NMF-based algorithm. Third, a perceptually motivated objective function is proposed for DNN-based speech enhancement. In the proposed technique, a new objective function, which consists of the Mel-scale weighted mean square error and the temporal and spectral variation similarities between the enhanced and clean speech, is employed in the DNN training stage. The proposed objective function computes the gradients on a perceptually motivated non-linear frequency scale and alleviates the over-smoothing of the estimated speech. Furthermore, a post-filter which adjusts the variance over frequency bins further compensates for the lack of contrast between spectral peaks and valleys in the enhanced speech. Conventional GV equalization post-filters do not consider the spectral dynamics over frequency bins. To account for the contrast between spectral peaks and valleys in each enhanced speech frame, the proposed algorithm matches the variance over coefficients in the log-power spectral domain. Finally, for the speech enhancement task, an integrated technique using the proposed perceptually motivated objective function and the post-filter is described. The performance of the conventional and proposed algorithms is discussed for matched and mismatched noise conditions. 
The subjective preference test results of these algorithms are also provided.
Contents:
1 Introduction
2 Conventional Approaches for Speech Enhancement
  2.1 NMF-Based Speech Enhancement
3 Deep Neural Networks
  3.1 Introduction
  3.2 Objective Function
  3.3 Stochastic Gradient Descent
4 DNN-Based Voice Activity Detection with Multi-Task Learning Framework
  4.1 Introduction
  4.2 DNN-Based VAD Algorithm
  4.3 DNN-Based VAD with MTL Framework
  4.4 Experimental Results
    4.4.1 Experiments in Matched Noise Conditions
    4.4.2 Experiments in Mismatched Noise Conditions
  4.5 Summary
5 NMF-Based Speech Enhancement Using Deep Neural Network
  5.1 Introduction
  5.2 Encoding Vector Estimation Using DNN
  5.3 Experiments
  5.4 Summary
6 DNN-Based Monaural Speech Enhancement with Temporal and Spectral Variations Equalization
  6.1 Introduction
  6.2 Conventional DNN-Based Speech Enhancement
    6.2.1 Training Stage
    6.2.2 Test Stage
  6.3 Perceptually-Motivated Criteria
    6.3.1 Perceptually Motivated Objective Function
    6.3.2 Mel-Scale Weighted Mean Square Error
    6.3.3 Temporal Variation Similarity
    6.3.4 Spectral Variation Similarity
    6.3.5 DNN Training with the Proposed Objective Function
  6.4 Experiments
    6.4.1 Performance Evaluation with Varying Weight Parameters
    6.4.2 Performance Evaluation in Matched Noise Conditions
    6.4.3 Performance Evaluation in Mismatched Noise Conditions
    6.4.4 Comparison Between Variation Analysis Method
    6.4.5 Subjective Test Results
  6.5 Summary
7 Spectral Variance Equalization Post-Filter for DNN-Based Speech Enhancement
  7.1 Introduction
  7.2 GV Equalization Post-Filter
  7.3 Spectral Variance (SV) Equalization Post-Filter
  7.4 Experiments
    7.4.1 Objective Test Results
    7.4.2 Subjective Test Results
  7.5 Summary
8 Conclusions
Bibliography
Appendix
Abstract (in Korean)
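The idea of a post-filter that matches the variance over frequency bins in the log-power spectral domain, as described in the abstract above, might look roughly like the sketch below. The abstract does not give the formula, so everything here is an assumption for illustration: the function name, the choice to preserve each frame's mean, and the use of a single scalar reference variance taken from clean speech are all hypothetical.

```python
import numpy as np

def equalize_spectral_variance(log_power, target_var, eps=1e-12):
    """Rescale each frame so its variance over frequency bins equals target_var.

    log_power:  (n_frames, n_bins) enhanced log-power spectra
    target_var: scalar reference variance (assumed to come from clean speech)
    """
    mean = log_power.mean(axis=1, keepdims=True)   # per-frame mean over bins
    var = log_power.var(axis=1, keepdims=True)     # per-frame variance over bins
    scale = np.sqrt(target_var / (var + eps))
    # Stretch deviations from the frame mean; the mean itself is left unchanged.
    return mean + scale * (log_power - mean)

# Synthetic demonstration (hypothetical data, not speech).
rng = np.random.default_rng(0)
clean = rng.normal(0.0, 2.0, size=(5, 129))
frame_mean = clean.mean(axis=1, keepdims=True)
over_smoothed = frame_mean + 0.5 * (clean - frame_mean)  # flattened peaks and valleys
target_var = float(clean.var(axis=1).mean())
restored = equalize_spectral_variance(over_smoothed, target_var)
```

In this toy case the over-smoothed frames have a quarter of the clean variance, and the filter restores the contrast between peaks and valleys by scaling deviations from the frame mean.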

    Eigenviews for object recognition in multispectral imaging systems

    We address the problem of representing multispectral images of objects using eigenviews for recognition purposes. Eigenviews have long been used for object recognition and pose estimation in the grayscale and color image settings. The purpose of this paper is two-fold: first, to extend the ideas of eigenviews to multispectral images, and second, to propose the use of dimensionality reduction techniques other than those popularly used. Principal Component Analysis (PCA) and its various kernel-based flavors are popularly used to extract eigenviews. We propose the use of Independent Component Analysis (ICA) and Non-negative Matrix Factorization (NMF) as possible candidates for eigenview extraction. Multispectral images of a collection of 3D objects captured from different viewpoint locations are used to obtain representative views (eigenviews) that encode the information in these images. The idea is illustrated with a collection of eight synthetic objects imaged in both reflection and emission bands. A Nearest Neighbor classifier is used to classify an arbitrary view of an object. Classifier performance under additive white Gaussian noise is also tested. The results demonstrate that this system holds promise for object recognition in the multispectral imaging setting and also for novel dimensionality reduction techniques. The number of eigenviews needed by each technique to obtain a given classifier accuracy is also calculated as a measure of the performance of the dimensionality reduction technique.
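The pipeline the abstract outlines (extract eigenviews, project views, classify with a nearest neighbor rule) can be sketched with the PCA baseline it mentions. This is an illustrative sketch under assumptions, not the paper's code: the data are random stand-ins, viewpoint changes are crudely modeled as small perturbations of a per-object prototype, the image size and all function names are hypothetical, and the ICA and NMF variants the paper proposes are not shown.

```python
import numpy as np

def fit_eigenviews(views, k):
    """PCA via SVD. views: (n_views, n_features) flattened multispectral images."""
    mean = views.mean(axis=0)
    _, _, vt = np.linalg.svd(views - mean, full_matrices=False)
    return mean, vt[:k]                      # top-k principal directions

def project(views, mean, eigenviews):
    """Coordinates of each view in the eigenview subspace."""
    return (views - mean) @ eigenviews.T

def nearest_neighbor(train_coords, train_labels, query_coords):
    """1-NN classification with Euclidean distance."""
    diff = query_coords[:, None, :] - train_coords[None, :, :]
    dist = np.linalg.norm(diff, axis=2)
    return train_labels[np.argmin(dist, axis=1)]

# Synthetic stand-in data (hypothetical): 8 objects, 10 training views each.
rng = np.random.default_rng(0)
n_objects, views_per_object = 8, 10
n_features = 4 * 8 * 8                       # assumed: 4 bands of 8x8 pixels
prototypes = rng.standard_normal((n_objects, n_features))
train_labels = np.repeat(np.arange(n_objects), views_per_object)
train_views = prototypes[train_labels] + 0.05 * rng.standard_normal(
    (train_labels.size, n_features))

mean_view, eigenviews = fit_eigenviews(train_views, k=7)
train_coords = project(train_views, mean_view, eigenviews)

# Query views corrupted by additive white Gaussian noise.
query_labels = np.arange(n_objects)
query_views = prototypes + 0.1 * rng.standard_normal(prototypes.shape)
query_coords = project(query_views, mean_view, eigenviews)
predicted = nearest_neighbor(train_coords, train_labels, query_coords)
accuracy = float(np.mean(predicted == query_labels))
```

Varying `k` and recording the smallest value that reaches a chosen accuracy gives the kind of comparison measure the abstract describes for different dimensionality reduction techniques.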