An Efficient Block-Based Algorithm for Hair Removal in Dermoscopic Images
Hair occlusion in dermoscopy images affects the diagnosis of skin lesions. Segmentation and classification of skin lesions are the two major steps of the diagnostic process performed by dermatologists. We propose a new algorithm for hair removal in dermoscopy images that comprises two main stages: hair detection and inpainting. In hair detection, a morphological bottom-hat operation is applied to the Y-channel image of the YIQ color space, followed by a binarization operation. In inpainting, the repaired Y-channel is partitioned into 256 non-overlapping blocks and, for each block, white pixels are replaced by locating the highest histogram peak, followed by a morphological close operation. Our proposed algorithm reports a true positive rate (sensitivity) of 97.36%, a false positive rate (fall-out) of 4.25%, and a true negative rate (specificity) of 95.75%. The diagnostic accuracy achieved is 95.78%.
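The block-based inpainting stage described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes the binary hair mask has already been produced by the bottom-hat and binarization steps, uses a 16 x 16 grid (256 blocks, as in the abstract), and omits the final morphological close operation.

```python
import numpy as np

def block_inpaint(y, hair_mask, grid=16):
    """Partition the Y channel into grid x grid blocks and replace
    hair pixels in each block with the intensity at the highest
    histogram peak of that block's non-hair pixels."""
    y = y.copy()
    h, w = y.shape
    bh, bw = h // grid, w // grid
    for i in range(grid):
        for j in range(grid):
            rows = slice(i * bh, (i + 1) * bh)
            cols = slice(j * bw, (j + 1) * bw)
            block = y[rows, cols]
            mask = hair_mask[rows, cols]
            clean = block[~mask]
            if clean.size == 0:          # block is entirely hair: leave as-is
                continue
            hist = np.bincount(clean.ravel(), minlength=256)
            block[mask] = int(np.argmax(hist))  # most frequent clean intensity
    return y
```

On a synthetic 64 x 64 image of constant intensity with a one-pixel-wide "hair" stripe, the masked pixels are restored to the surrounding intensity.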
A Passive Learning Sensor Architecture for Multimodal Image Labeling: An Application for Social Robots
Object detection and classification have countless applications in human-robot interaction systems and are a necessary skill for autonomous robots that perform tasks in household scenarios. Despite the great advances in deep learning and computer vision, social robots performing non-trivial tasks usually spend most of their time finding and modeling objects. Working in real scenarios means dealing with constant environmental changes and relatively low-quality sensor data, due to the distance at which objects are often found. Ambient intelligence systems equipped with different sensors can also benefit from the ability to find objects, enabling them to inform humans about their location. For these applications to succeed, systems need to detect the objects that may potentially contain other objects while working with relatively low-resolution sensor data. A passive learning architecture for sensors has been designed to take advantage of multimodal information, obtained using an RGB-D camera and trained semantic language models. The main contribution of the architecture lies in improving the performance of the sensor under conditions of low resolution and high light variation using a combination of image labeling and word semantics. The tests performed on each stage of the architecture compare this solution with current labeling techniques in the context of an autonomous social robot working in an apartment. The results obtained demonstrate that the proposed sensor architecture outperforms state-of-the-art approaches.
Copyright Policies of Scientific Publications in Institutional Repositories: The Case of INESC TEC
The progressive transformation of scientific practices, driven by the development of new Information and Communication Technologies (ICT), has made it possible to increase access to information, gradually moving towards an opening of the research cycle. In the long term, this opening can resolve an adversity that has long faced researchers: the existence of barriers, whether geographical or financial, that limit the conditions of access. Although scientific production is largely dominated by large commercial publishers, and therefore subject to the rules they impose, the Open Access movement, whose first public declaration, the Budapest Open Access Initiative (BOAI), dates from 2002, proposes significant changes that benefit both authors and readers. This movement has gained importance in Portugal since 2003, with the establishment of the first institutional repository at the national level. Institutional repositories emerged as a tool for disseminating the scientific production of an institution, with the aim of opening up research results both before publication and peer review (preprint) and after (postprint), and, consequently, increasing the visibility of the work carried out by a researcher and the respective institution. The study presented here, based on an analysis of the copyright policies of the most relevant scientific publications of INESC TEC, showed not only that publishers increasingly adopt policies that allow the self-archiving of publications in institutional repositories, but also that considerable awareness-raising work remains to be done, not only among researchers but also within the institution and society as a whole.
The production of a set of recommendations, including the implementation of an institutional policy that encourages the self-archiving of institutionally produced publications in the repository, serves as a starting point for a greater appreciation of the scientific production of INESC TEC.
A Software System for Recognizing Spoken Text from Video Based on Human Facial Movements
In this bachelor's thesis, the problem of recognizing spoken text from video of human facial movements using deep learning methods is studied. Different approaches to recognizing spoken text were investigated, and a recognition method based on a convolutional neural network combined with a recurrent network was developed.
A program for recognizing spoken text from video of a person's face was developed, and the model's performance was investigated on different datasets.
Total length of the thesis: 62 pages, 23 figures, 8 tables, 23 references.
Compression and Analysis of Genomic Data
Doctorate in Informatics. Genomic sequences are large codified messages describing most of the structure of all known living organisms. Since the presentation of the first genomic sequence, a huge amount of genomic data has been generated, with diversified characteristics, rendering the data-deluge phenomenon a serious problem in most genomics centers. As such, most of the data are discarded (when possible), while the rest are compressed using general-purpose algorithms, often attaining modest data-reduction results.
Several specific algorithms have been proposed for the compression of genomic data, but unfortunately only a few of them have been made available as usable and reliable compression tools. Of those, most have been developed for some specific purpose. In this thesis, we propose a compressor for genomic sequences of multiple natures, able to function in a reference-based or reference-free mode. Besides, it is very flexible and can cope with diverse hardware specifications. It uses a mixture of finite-context models (FCMs) and eXtended FCMs. The results show improvements over state-of-the-art compressors.
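A finite-context model of the kind this work builds on can be sketched as a simple adaptive order-k Markov model that accumulates the ideal code length of a sequence. This is a generic illustration, not the thesis's compressor: the mixture of FCMs and eXtended FCMs, the arithmetic coder, and the reference mode are omitted, and the smoothing parameter `alpha` and the fixed initial context are assumptions.

```python
from collections import defaultdict
from math import log2

def fcm_code_length(seq, k=2, alpha=1.0, alphabet="ACGT"):
    """Estimate the number of bits an adaptive order-k finite-context
    model (Laplace-style smoothing with parameter alpha) would need
    to encode `seq`. Counts are updated as the sequence is processed,
    so no side information is required."""
    counts = defaultdict(lambda: defaultdict(int))  # context -> symbol -> count
    bits = 0.0
    s = alphabet[0] * k + seq        # pad with a fixed initial context
    for i in range(k, len(s)):
        ctx, sym = s[i - k:i], s[i]
        total = sum(counts[ctx].values())
        p = (counts[ctx][sym] + alpha) / (total + alpha * len(alphabet))
        bits += -log2(p)
        counts[ctx][sym] += 1
    return bits
```

Highly repetitive sequences compress well below the 2 bits per symbol that a raw 4-letter encoding requires, which is the behavior a genomic compressor exploits.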
Since the compressor can be seen as an unsupervised, alignment-free method to estimate the algorithmic complexity of genomic sequences, it is the ideal candidate to perform analysis of and between sequences. Accordingly, we define a way to approximate directly the Normalized Information Distance, aiming to identify evolutionary similarities within and between species. Moreover, we introduce a new concept, the Normalized Relative Compression, that is able to quantify and infer new characteristics of the data, previously undetected by other methods. We also investigate local measures, being able to locate specific events using complexity profiles. Furthermore, we present and explore a method based on complexity profiles to detect and visualize genomic rearrangements between sequences, yielding several insights into the genomic evolution of humans.
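The idea of using a compressor to approximate the Normalized Information Distance can be sketched with the standard Normalized Compression Distance. Here zlib serves as a stand-in general-purpose compressor; the thesis uses its own FCM-based compressor, and its Normalized Relative Compression additionally requires a reference-based mode that zlib does not provide.

```python
import zlib

def c(data: bytes) -> int:
    """Compressed size in bytes (zlib as a stand-in compressor)."""
    return len(zlib.compress(data, 9))

def ncd(x: bytes, y: bytes) -> float:
    """Normalized Compression Distance, a computable approximation of
    the Normalized Information Distance: values near 0 indicate
    similar sequences, values near 1 (or slightly above) indicate
    unrelated ones."""
    cx, cy, cxy = c(x), c(y), c(x + y)
    return (cxy - min(cx, cy)) / max(cx, cy)
```

Comparing a periodic sequence with itself yields a small distance, while comparing it with incompressible noise yields a distance close to 1.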
Finally, we introduce the concept of relative uniqueness and apply it to the Ebolavirus, identifying three regions that appear in all the outbreak virus sequences but nowhere in the human genome. In fact, we show that these sequences are sufficient to classify different sub-species. Also, we identify regions in human chromosomes that are absent from the DNA of close primates, specifying novel traits in human uniqueness.
An investigation of the use of gradients in imaging, including best approximation and the Structural Similarity image quality measure
The L^2-based mean squared error (MSE) and its variations continue to be the most widely employed metrics in image processing. This is most probably due to the fact that (1) the MSE is simple to compute and (2) it possesses a number of convenient mathematical properties, including differentiability and convexity. It is well known, however, that these L^2-based measures perform poorly in terms of measuring the visual quality of images. Their failure is partially due to the fact that the L^2 metric does not capture spatial relationships between pixels. This was a motivation for the introduction of the so-called Structural Similarity (SSIM) image quality measure [1] which, along with its variations, continues to be one of the most effective measures of visual quality. The SSIM index measures the similarity between two images by combining three components of the human visual system--luminance, contrast, and structure. It is our belief that the structure term, which measures the correlation between images, is the most important component of the SSIM.
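The three SSIM components can be sketched in a single-window form. Note the assumptions: the published SSIM [1] is computed over sliding windows and averaged, whereas this computes one global value, and the constants k1, k2 and dynamic range L follow common defaults rather than anything stated in this text.

```python
import numpy as np

def ssim_components(x, y, L=255, k1=0.01, k2=0.03):
    """Single-window SSIM sketch: returns the (luminance, contrast,
    structure) terms and their product, computed over the whole image
    rather than over sliding windows."""
    x = x.astype(float)
    y = y.astype(float)
    c1, c2 = (k1 * L) ** 2, (k2 * L) ** 2
    c3 = c2 / 2                                  # usual choice for the structure term
    mx, my = x.mean(), y.mean()
    sx, sy = x.std(), y.std()
    sxy = ((x - mx) * (y - my)).mean()           # covariance
    lum = (2 * mx * my + c1) / (mx ** 2 + my ** 2 + c1)
    con = (2 * sx * sy + c2) / (sx ** 2 + sy ** 2 + c2)
    struct = (sxy + c3) / (sx * sy + c3)         # correlation-based structure term
    return lum, con, struct, lum * con * struct
```

Identical images score exactly 1; a uniform brightness shift lowers only the luminance term, leaving the structure term (the correlation) at 1, which is why the structure term is singled out above.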
A considerable portion of this thesis focusses on adapting the L^2 distance for image processing applications. Our first approach involves inserting an intensity-dependent weight function into the integral such that it conforms to generalized Weber's model of perception. We solve the associated best approximation problem and discuss examples in both one- and two-dimensions. Motivated by the success of the SSIM, we also solve the Weberized best approximation problem with an added regularization term involving the correlation.
Another approach we take towards adapting the MSE for image processing involves introducing gradient information into the metric. Specifically, we study the traditional L^2 best approximation problem with an added regularization term involving the L^2 distance between gradients. We use orthonormal functions to solve the best approximation problem in both the continuous and discrete setting. In both cases, we prove that the Fourier coefficients remain optimal provided certain assumptions on the orthonormal basis hold.
Our final best approximation problem to be considered involves maximizing the correlation between gradients. We obtain the relevant stationarity conditions and show that an infinity of solutions exists. A unique solution can be obtained using two assumptions adapted from [2]. We demonstrate that this problem is equivalent to maximizing the entire SSIM function between gradients. During this work, we prove that the discrete derivatives of the DCT and DFT basis functions form an orthogonal set, a result which has not appeared in the literature to the best of our knowledge.
Our study of gradients is not limited to best approximation problems. A second major focus of this thesis concerns the development of gradient-based image quality measures. This was based on the idea that the human visual system may also be sensitive to distortions in the magnitudes and/or directions of variations in greyscale or colour intensities--in other words, their gradients. Indeed, as we show with a simple but persuasive example, the use of the L^2 distance between image gradients already yields a significant improvement over the MSE. One naturally wonders whether a measure of the correlation between image gradients could yield even better results--in fact, possibly "better" than the SSIM itself! (We will define what we mean by "better" in this thesis.) For this reason, we pursue many possible forms of a "gradient-based SSIM".
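The kind of simple example alluded to above can be sketched as follows. The specific example is our assumption, since the thesis's exact example is not reproduced here: a uniform brightness shift leaves an image visually similar and its gradients unchanged, yet produces a large MSE, while the gradient-based L^2 distance correctly reports zero.

```python
import numpy as np

def mse(x, y):
    """Plain mean squared error between two images."""
    return float(((x - y) ** 2).mean())

def grad_mse(x, y):
    """MSE between the forward-difference gradients of two images."""
    dx = np.diff(x, axis=1) - np.diff(y, axis=1)  # horizontal gradient difference
    dy = np.diff(x, axis=0) - np.diff(y, axis=0)  # vertical gradient difference
    return float((dx ** 2).mean() + (dy ** 2).mean())
```

For y = x + 25 (a pure brightness shift), the MSE is 625 regardless of image content, while the gradient distance vanishes.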
First, however, we must address the question of how to define the correlation between the gradient vectors of two images. We formulate and compare many novel gradient similarity measures. Among those, we justify our selection of a preferred measure which, although simple-minded, we show to be highly correlated with the "rigorous" canonical correlation method. We then present many attempts at incorporating our gradient similarity measure into the SSIM. We finally arrive at a novel gradient-based SSIM, our so-called "gradSSIM1", which we argue does, in fact, improve the SSIM. The novelty of our approach lies in its use of SSIM-dependent exponents, which allow us to seamlessly blend our measure of gradient correlation and the traditional SSIM.
To compare two image quality measures, e.g., the SSIM and our "gradSSIM1", we make use of the LIVE image database [3]. This database contains numerous distorted images, each of which is associated with a single score indicating visual quality. We suggest that these scores be considered as the independent variable, an understanding that does not appear to have been adopted elsewhere in the literature. This work also necessitated a detailed analysis of the SSIM, including the roles of its individual components and the effect of varying its stability constants. It appears that such analysis has not been performed elsewhere in the literature.
References:
[1] Z. Wang, A.C. Bovik, H.R. Sheikh, E.P. Simoncelli. Image Quality Assessment: From Error Visibility to Structural Similarity. IEEE Transactions on Image Processing, 13(4):600-612, 2004.
[2] P. Bendevis and E.R. Vrscay. Structural Similarity-Based Approximation over Orthogonal Bases: Investigating the Use of Individual Component Functions S_k(x,y). In Aurelio Campilho and Mohamed Kamel, editors, Image Analysis and Recognition - 11th International Conference, ICIAR 2014, Vilamoura, Portugal, October 22-24, 2014, Proceedings, Part I, volume 8814 of Lecture Notes in Computer Science, pages 55-64, 2014.
[3] H.R. Sheikh, M.F. Sabir, and A.C. Bovik. A Statistical Evaluation of Recent Full Reference Image Quality Assessment Algorithms. IEEE Transactions on Image Processing, 15(11):3440-3451, November 2006.