7 research outputs found

    Conversion of RGB images into Munsell soil-color charts

    [Objective] The transformation from RGB to the Munsell color space is a relevant issue for different tasks, such as the identification of soil taxonomy, organic materials, rock materials, and skin types, among others. This research aims to develop alternatives based on feedforward networks and convolutional neural networks (CNNs) to predict the hue, value, and chroma in the Munsell soil-color charts (MSCCs) from RGB images. [Methodology] We used images of the 2000 and 2009 versions of the Munsell soil-color charts, taken from Millota et al. (2018), to train and test the models. A set of 2856 images was split into 10% for testing, 20% for validation, and 70% for training to build the models. [Results] The best approach was convolutional neural networks for classification, with 93% total accuracy for the hue, value, and chroma combination; it comprises three CNNs, one for hue prediction, another for value prediction, and the last for chroma prediction. However, all three best models show closeness between the predicted and real values according to the CIEDE2000 distance: the cases classified incorrectly with this approach had an average CIEDE2000 of 0.27 and a standard deviation of 1.06. [Conclusions] The models demonstrated better color recognition in uncontrolled environments than the Centore transformation, the classical method for converting RGB to HVC. The results were promising, but the models should be tested on real images from different applications, such as real soil images, to classify soil color.
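
    A minimal sketch of the CIEDE2000 comparison used above to assess misclassified chips, assuming chip colors are available as sRGB 0-255 triplets (the values below are illustrative placeholders, not chart data); it relies on scikit-image's rgb2lab and deltaE_ciede2000:

        # Compare a predicted Munsell chip color with the reference chip color
        # using the CIEDE2000 distance (the metric reported in the abstract).
        import numpy as np
        from skimage.color import rgb2lab, deltaE_ciede2000

        def ciede2000_rgb(rgb_pred, rgb_ref):
            """CIEDE2000 distance between two sRGB colors given as 0-255 triplets."""
            lab_pred = rgb2lab(np.asarray(rgb_pred, dtype=float).reshape(1, 1, 3) / 255.0)
            lab_ref = rgb2lab(np.asarray(rgb_ref, dtype=float).reshape(1, 1, 3) / 255.0)
            return deltaE_ciede2000(lab_pred, lab_ref).item()

        # Example: a misclassified chip whose predicted color is still visually close.
        print(ciede2000_rgb((123, 94, 70), (128, 96, 68)))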

    Object-based attention mechanism for color calibration of UAV remote sensing images in precision agriculture

    Color calibration is a critical step in unmanned aerial vehicle (UAV) remote sensing, especially in precision agriculture, which relies mainly on correlating color changes with specific quality attributes, e.g. plant health, disease, and pest stresses. In UAV remote sensing, exemplar-based color transfer is popularly used for color calibration, where the automatic search for semantic correspondences is key to ensuring color transfer accuracy. However, existing attention mechanisms encounter difficulties in building precise semantic correspondences between the reference image and the target one, in which the normalized cross correlation is often computed for feature reassembling. As a result, color transfer accuracy is inevitably decreased by disturbance from semantically unrelated pixels, leading to semantic mismatch due to the absence of semantic correspondences. In this article, we propose an unsupervised object-based attention mechanism (OBAM) to suppress the disturbance of semantically unrelated pixels, together with a weight-adjusted Adaptive Instance Normalization (AdaIN) (WAA) method to tackle the challenges caused by the absence of semantic correspondences. By embedding the proposed modules into a photorealistic style transfer method with progressive stylization, color transfer accuracy can be improved while better preserving structural details. We evaluated our approach on UAV data of different crop types, including rice, beans, and cotton. Extensive experiments demonstrate that our proposed method outperforms several state-of-the-art methods. As our approach requires no annotated labels, it can easily be embedded into off-the-shelf color transfer approaches. Relevant code and configurations will be available at https://github.com/huanghsheng/object-based-attention-mechanis
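
    For context on the AdaIN operation the abstract refers to, the following is a minimal PyTorch sketch of standard adaptive instance normalization; the article's object-based attention and weight adjustment (WAA) are specific to that work and are not reproduced here:

        # Standard AdaIN: align the per-channel mean/std of the target features
        # with those of the reference (exemplar) features.
        import torch

        def adain(content: torch.Tensor, style: torch.Tensor, eps: float = 1e-5) -> torch.Tensor:
            """content, style: feature maps of shape (N, C, H, W), e.g. from a VGG encoder."""
            c_mean = content.mean(dim=(2, 3), keepdim=True)
            c_std = content.std(dim=(2, 3), keepdim=True) + eps
            s_mean = style.mean(dim=(2, 3), keepdim=True)
            s_std = style.std(dim=(2, 3), keepdim=True) + eps
            return s_std * (content - c_mean) / c_std + s_mean

        # Example: transfer reference-image color statistics onto target features.
        target_feat = torch.randn(1, 64, 32, 32)     # features of the UAV image to calibrate
        reference_feat = torch.randn(1, 64, 32, 32)  # features of the color reference image
        calibrated = adain(target_feat, reference_feat)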

    Translational Functional Imaging in Surgery Enabled by Deep Learning

    Many clinical applications currently rely on several imaging modalities such as Positron Emission Tomography (PET), Magnetic Resonance Imaging (MRI), Computed Tomography (CT), etc. All such modalities provide valuable patient data to the clinical staff to aid clinical decision-making and patient care. Despite the undeniable success of such modalities, most of them are limited to preoperative scans and focus on morphology analysis, e.g. tumor segmentation, radiation treatment planning, anomaly detection, etc. Even though the assessment of different functional properties such as perfusion is crucial in many surgical procedures, it remains highly challenging via simple visual inspection. Functional imaging techniques such as Spectral Imaging (SI) link the unique optical properties of different tissue types with metabolism changes, blood flow, chemical composition, etc. As such, SI is capable of providing much richer information that can improve patient treatment and care. In particular, perfusion assessment with functional imaging has become more relevant due to its role in the development and treatment of several diseases, such as cardiovascular diseases. Current clinical practice relies on Indocyanine Green (ICG) injection to assess perfusion. Unfortunately, this method can only be used once per surgery and has been shown to trigger deadly complications in some patients (e.g. anaphylactic shock). This thesis addressed common roadblocks on the path to translating optical functional imaging modalities to clinical practice. The main challenges tackled are related to a) the slow recording and processing speed that SI devices suffer from, b) the errors introduced in functional parameter estimation under changing illumination conditions, c) the lack of medical data, and d) the high inter-patient tissue heterogeneity that is commonly overlooked. This framework follows a natural path to translation that starts with hardware optimization. To overcome the limitations imposed by the lack of labeled clinical data and by currently slow SI devices, a domain- and task-specific band selection component was introduced. The implementation of this component reduced the amount of data needed to monitor perfusion. Moreover, this method leverages large amounts of synthetic data, which, paired with unlabeled in vivo data, can generate highly accurate simulations of a wide range of domains. This approach was validated in vivo in a head and neck rat model and showed higher oxygenation contrast between normal and cancerous tissue in comparison to a baseline using all available bands. The need for translation to open surgical procedures was met by the implementation of an automatic light source estimation component. This method extracts specular reflections from low-exposure spectral images and processes them to obtain an estimate of the light source spectrum that generated those reflections. The benefits of light source estimation were demonstrated in silico, in ex vivo pig liver, and in in vivo human lips, where the oxygenation estimation error was reduced when the correct light source estimated with this method was used. These experiments also showed that the performance of the approach proposed in this thesis surpasses that of other baseline approaches. Video-rate functional property estimation was achieved by two main components: a regression component and an Out-of-Distribution (OoD) detection component.
At the core of both components is a compact SI camera paired with state-of-the-art deep learning models to achieve real-time functional estimation. The first of these components features a deep learning model based on a Convolutional Neural Network (CNN) architecture trained on highly accurate physics-based simulations of light-tissue interactions. By doing this, the challenge posed by the lack of labeled in vivo data was overcome. This approach was validated in the task of perfusion monitoring in pig brain and in a clinical study involving human skin. It was shown that this approach is capable of monitoring subtle perfusion changes in human skin in an arm-clamping experiment. Moreover, this approach was capable of monitoring Spreading Depolarizations (SDs) (deoxygenation waves) on the surface of a pig brain. Even though this method is well suited for perfusion monitoring in domains that are well represented by the physics-based simulations on which it was trained, its performance cannot be guaranteed for outlier domains. To handle outlier domains, the task of ischemia monitoring was rephrased as an OoD detection task. This new functional estimation component comprises an ensemble of Invertible Neural Networks (INNs) that requires only perfused-tissue data from individual patients to detect ischemic tissue as outliers. The first-ever clinical study involving a video-rate-capable SI camera in laparoscopic partial nephrectomy was designed to validate this approach. This study revealed particularly high inter-patient tissue heterogeneity in the presence of pathologies (cancer). Moreover, it demonstrated that this personalized approach is now capable of monitoring ischemia at video rate with SI during laparoscopic surgery. In conclusion, this thesis addressed challenges related to slow image recording and processing during surgery. It also proposed a method for light source estimation to facilitate translation to open surgical procedures. Moreover, the methodology proposed in this thesis was validated in a wide range of domains: in silico, rat head and neck, pig liver and brain, and human skin and kidney. In particular, the first clinical trial with spectral imaging in minimally invasive surgery demonstrated that video-rate ischemia monitoring is now possible with deep learning.
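
    As a rough illustration of the out-of-distribution idea described above, the sketch below fits a density model to a single patient's perfused-tissue spectra and flags low-likelihood pixels as outliers; a multivariate Gaussian stands in for the thesis's ensemble of invertible neural networks, and all data are random placeholders rather than spectral measurements:

        # Fit an in-distribution density on "perfused" spectra, then flag new
        # spectra with low log-likelihood as potential ischemia (outliers).
        import numpy as np
        from scipy.stats import multivariate_normal

        rng = np.random.default_rng(0)
        n_bands = 16

        # Perfused-tissue reference spectra from one patient (placeholder data).
        perfused = rng.normal(loc=1.0, scale=0.1, size=(5000, n_bands))

        mean = perfused.mean(axis=0)
        cov = np.cov(perfused, rowvar=False) + 1e-6 * np.eye(n_bands)
        density = multivariate_normal(mean=mean, cov=cov)

        # Threshold taken from the in-distribution scores (e.g. the 1st percentile).
        threshold = np.percentile(density.logpdf(perfused), 1)

        # Spectra recorded during surgery: anything below the threshold is flagged.
        new_spectra = rng.normal(loc=0.7, scale=0.1, size=(100, n_bands))  # placeholder
        is_outlier = density.logpdf(new_spectra) < threshold
        print(f"{is_outlier.mean():.0%} of pixels flagged as out-of-distribution")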