3 research outputs found

    Facial expression recognition in neonates

    Advisor: Profa. Dra. Olga R. P. Bellon. Dissertation (Master's) - Universidade Federal do Paraná, Setor de Ciências Exatas, Programa de Pós-Graduação em Informática. Defense: Curitiba, 30/10/2019. Includes references: p. 46-51. Area of concentration: Computer Science.
    Abstract: Pain assessment is a difficult and complex task that is particularly important for newborns, who cannot verbalize pain properly and are vulnerable to cerebral damage caused by untreated pain. The pain assessment tools currently used in clinical settings require training of the health professionals who apply them, and their use is affected by each individual's bias in recognizing pain. For this reason, efforts have been made to automate this task, and one way to do so is to analyze the newborn's facial expression, which has been shown to correlate with pain. In this dissertation, the differences among the most prominent works on automatic neonatal facial expression recognition are outlined, examining the methods used, the databases, and the reported performance. With this in mind, we tested the main methods in order to compare their performance in more depth. This study also advances the understanding of the COPE database, the only publicly available database of neonatal facial expressions. We ran off-the-shelf face detection methods and found that no face was detected in 54% of the images, reinforcing the need to develop systems that are either tailored to newborns or more robust to changes in the target population. Since the COPE database was published in 2005, significant advances have been made in image processing; for this reason, we compared classical feature extraction methods with features from Convolutional Neural Networks (CNNs), which are considered the state of the art for most computer vision applications. We observed a difference of 19% in recall between Gabor filters (the best of the classical methods) and ResNet50 features (the best of the CNNs). We also tested the robustness of the methods to image noise, an important factor when real-world scenarios are considered: the classical methods showed a smaller performance drop from clean to noisy scenarios, but their overall performance was worse than that of the CNNs. In addition, stressing the CNNs' performance, we studied which layers yielded the best features, to verify whether shallow layers could perform as well as or better than deeper ones, which would mean a lower computational cost; the results favored the deeper layers. Overall, while studying the literature we noticed a tendency to rely on biased metrics, such as accuracy, in a field where a more complete view of model performance should be adopted, given how vulnerable the population is. We also found the databases used in the literature difficult to access. Our findings reinforce the potential of computer vision methods, but are limited to the dataset that was used.
    Keywords: Facial expressions, pain assessment, computer vision
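    As a rough illustration of the comparison described above (not the author's code; dataset loading, the label convention, and the exact Gabor filter bank are assumptions), the sketch below extracts classical Gabor texture features and ResNet50 features from a shallow and a deep layer, and scores a simple classifier with recall rather than accuracy:

```python
# Illustrative sketch (not the dissertation's implementation): hand-crafted Gabor
# features vs. ResNet50 features from a shallow and a deep layer, evaluated with
# recall on the "pain" class. Image loading and labels are assumed to exist elsewhere.
import numpy as np
import torch
from torchvision.models import resnet50, ResNet50_Weights
from torchvision.models.feature_extraction import create_feature_extractor
from skimage.filters import gabor
from sklearn.svm import SVC
from sklearn.metrics import recall_score

def gabor_features(gray_img, frequencies=(0.1, 0.2, 0.3), thetas=(0, np.pi / 4, np.pi / 2)):
    """Mean/std of Gabor response magnitudes - a classical texture descriptor (bank is an assumption)."""
    feats = []
    for f in frequencies:
        for t in thetas:
            real, imag = gabor(gray_img, frequency=f, theta=t)
            mag = np.hypot(real, imag)
            feats += [mag.mean(), mag.std()]
    return np.array(feats)

# ResNet50 pretrained on ImageNet; 'layer2' is a shallower block, 'avgpool' is the
# deep, globally pooled representation typically used as an off-the-shelf feature.
weights = ResNet50_Weights.IMAGENET1K_V2
backbone = resnet50(weights=weights).eval()
extractor = create_feature_extractor(backbone, return_nodes={"layer2": "shallow", "avgpool": "deep"})
preprocess = weights.transforms()

@torch.no_grad()
def cnn_features(pil_img):
    out = extractor(preprocess(pil_img).unsqueeze(0))
    shallow = out["shallow"].mean(dim=(2, 3)).squeeze(0)  # global-average-pool the shallow feature map
    deep = out["deep"].flatten()
    return shallow.numpy(), deep.numpy()

def evaluate(train_X, train_y, test_X, test_y):
    """Fit a simple linear SVM and report recall on the pain class (label 1 assumed)."""
    clf = SVC(kernel="linear", class_weight="balanced").fit(train_X, train_y)
    return recall_score(test_y, clf.predict(test_X), pos_label=1)
```

    Calling evaluate once per feature set (Gabor, shallow CNN, deep CNN) on the same train/test split reproduces the kind of comparison reported in the abstract.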

    Subjective and objective quality assessment of ancient degraded documents

    Archiving, restoration and analysis of damaged manuscripts have increased considerably in recent decades. These documents are usually physically degraded because of aging and improper handling. They also cannot be processed manually, because a massive volume of such documents exists in libraries and archives around the world. Therefore, automatic methodologies are needed to preserve and process their content. These documents are usually processed through their images. Degraded document image processing is a difficult task, mainly because of the existing physical degradations. While it can be very difficult to accurately locate and remove such distortions, analyzing their severity and type(s) is feasible. This analysis provides useful information on the type and severity of degradations, with a number of applications. The main contributions of this thesis are models for objectively assessing the physical condition of document images and for classifying their degradations. In this thesis, three datasets of degraded document images, along with subjective ratings for each image, are developed. In addition, three no-reference document image quality assessment (NR-DIQA) metrics are proposed for historical and medieval document images. It should be mentioned that degraded medieval document images are a subset of historical document images and may contain both graphical and textual content. Finally, we propose a degradation classification model to identify common distortion types in old document images. Essentially, existing no-reference image quality assessment (NR-IQA) metrics are not designed to assess physical document distortions.
    In the first contribution, we propose the first dataset of degraded document images along with human opinion scores for each document image. This dataset is introduced to evaluate the quality of historical document images. We also propose an objective NR-DIQA metric based on the statistics of the mean subtracted contrast normalized (MSCN) coefficients computed from segmented layers of each document image. The segmentation into four layers of foreground and background is based on an analysis of log-Gabor filters, under the assumption that the sensitivity of the human visual system (HVS) differs between text and non-text locations. Experimental results show that the proposed metric performs comparably to or better than state-of-the-art metrics, with moderate complexity.
    Degradation identification and quality assessment can complement each other to provide information on both the type and the severity of degradations in document images. Therefore, in the second contribution, we introduce a multi-distortion historical document image database that can be used for research on quality assessment of degraded documents as well as on degradation classification. The developed dataset contains historical document images classified into four categories based on their distortion types, namely paper translucency, stain, readers' annotations, and worn holes. An efficient NR-DIQA metric is then proposed based on three sets of spatial and frequency image features extracted from two layers of text and non-text. In addition, these features are used to estimate the probability of the four aforementioned physical distortions for the first time in the literature. Both the proposed quality assessment and degradation classification models deliver very promising performance.
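    As a minimal sketch of the statistic the first NR-DIQA metric builds on (an illustration under stated assumptions, not the thesis implementation), the MSCN coefficients and their per-layer statistics can be computed as follows, assuming the log-Gabor layer segmentation is available as a binary mask:

```python
# Minimal sketch of mean subtracted contrast normalized (MSCN) coefficients.
# The four-layer log-Gabor segmentation is assumed to be computed separately;
# here we only restrict the MSCN statistics to the pixels a layer mask keeps.
import numpy as np
from scipy.ndimage import gaussian_filter

def mscn_coefficients(gray, sigma=7 / 6, c=1e-3):
    """MSCN(i,j) = (I - mu) / (sigma_local + c), with Gaussian-weighted local mean and std."""
    gray = gray.astype(np.float64)
    mu = gaussian_filter(gray, sigma)
    var = gaussian_filter(gray * gray, sigma) - mu * mu
    sigma_local = np.sqrt(np.clip(var, 0, None))
    return (gray - mu) / (sigma_local + c)

def layer_mscn_stats(gray, layer_mask):
    """Simple descriptive statistics of the MSCN values inside one segmented layer."""
    vals = mscn_coefficients(gray)[layer_mask]
    return np.array([vals.mean(), vals.var(), np.abs(vals).mean()])
```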
    Finally, in the third contribution, we develop a dataset and a quality assessment metric for degraded medieval document (DMD) images. This type of degraded image contains both textual and pictorial information. The introduced DMD dataset is the first in its category that also provides human ratings. We also propose a new no-reference metric to evaluate the quality of the DMD images in the developed dataset. The proposed metric is based on the extraction of several statistical features from three layers of text, non-text, and graphics. The segmentation is based on color saliency, under the assumption that pictorial parts are colorful, and it follows the HVS by giving a different weight to each layer. The experimental results validate the effectiveness of the proposed NR-DIQA strategy for DMD images.
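    The color-saliency assumption behind the graphics layer can be illustrated with a crude chroma threshold in CIELAB; this is only a stand-in for the actual saliency model, which is not reproduced here, and the threshold value is an assumption:

```python
# Hedged sketch of the "pictorial parts are colorful" assumption: pixels with
# high CIELAB chroma are marked as the graphics layer. Not the thesis's saliency model.
import numpy as np
from skimage.color import rgb2lab

def graphics_mask(rgb_img, chroma_thresh=15.0):
    """Mark pixels whose CIELAB chroma exceeds a (hypothetical) threshold as graphics."""
    lab = rgb2lab(rgb_img)
    chroma = np.hypot(lab[..., 1], lab[..., 2])  # sqrt(a^2 + b^2)
    return chroma > chroma_thresh
```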

    High-speed surface profilometry based on an adaptive microscope with axial chromatic encoding

    An adaptive microscope with axial chromatic encoding, named the AdaScope, is designed and developed. With the ability to confocally address any location within the measurement volume, the AdaScope provides the hardware foundation for a cascade measurement strategy to be developed, dramatically accelerating 3D confocal microscopy.