156 research outputs found

Clustering of images using Generative Adversarial Networks

Author: Marín Ciudad Eloy
Publication venue: Universitat Politècnica de Catalunya
Publication date: 01/07/2020

Recently, a variation of GANs called ClusterGAN has shown the ability to perform unsupervised classification of images according to content, thanks to the use of a discrete-continuous latent space and a clustering-specific loss. This thesis demonstrates that it is possible to cluster images and annotate query images using GANs, with some limitations depending on image content. These limitations are related to the level of the features that describe an image, since clustering is performed based on low-level features. In this thesis, we also propose a new approach to cluster images by their dominant color, obtaining promising results. The proposed approach consists of using several small patches per image and the CIELAB color space, because it approximates the way human vision works.
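
The abstract above mentions describing images by small patches in the CIELAB color space. As a rough illustration only, the sketch below converts an 8-bit sRGB pixel to CIELAB (assuming the D65 white point) and averages the result over one patch. The function names are hypothetical, and this is not the ClusterGAN pipeline used in the thesis; it only shows what a per-patch CIELAB descriptor could look like.

```python
# Sketch only: sRGB -> CIELAB (D65 white point assumed) and a per-patch mean colour.
# Function names are hypothetical; the thesis's ClusterGAN model is not reproduced here.

def srgb_to_lab(r, g, b):
    """Convert one 8-bit sRGB pixel to CIELAB (L*, a*, b*)."""
    def linearize(c):
        # Undo the sRGB transfer curve
        c = c / 255.0
        return c / 12.92 if c <= 0.04045 else ((c + 0.055) / 1.055) ** 2.4

    rl, gl, bl = linearize(r), linearize(g), linearize(b)
    # Linear sRGB -> CIE XYZ (D65)
    x = 0.4124564 * rl + 0.3575761 * gl + 0.1804375 * bl
    y = 0.2126729 * rl + 0.7151522 * gl + 0.0721750 * bl
    z = 0.0193339 * rl + 0.1191920 * gl + 0.9503041 * bl
    # D65 reference white used for normalisation
    xn, yn, zn = 0.95047, 1.0, 1.08883

    def f(t):
        delta = 6 / 29
        return t ** (1 / 3) if t > delta ** 3 else t / (3 * delta ** 2) + 4 / 29

    fx, fy, fz = f(x / xn), f(y / yn), f(z / zn)
    return 116 * fy - 16, 500 * (fx - fy), 200 * (fy - fz)


def patch_mean_lab(pixels):
    """Mean CIELAB colour of a patch given as a list of (r, g, b) tuples."""
    labs = [srgb_to_lab(*p) for p in pixels]
    n = len(labs)
    return tuple(sum(c[i] for c in labs) / n for i in range(3))
```

Per-patch descriptors of this kind could then be fed to any clustering method; which method the thesis actually uses for the dominant-color experiments is not stated in the abstract.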

UPCommons. Portal del coneixement obert de la UPC

Advances in Image Processing, Analysis and Recognition Technology

Publication venue: 'MDPI AG'
Publication date: 21/06/2022

For many decades, researchers have been trying to make computers’ analysis of images as effective as the system of human vision is. For this purpose, many algorithms and systems have previously been created. The whole process covers various stages, including image processing, representation and recognition. The results of this work can be applied to many computer-assisted areas of everyday life. They improve particular activities and provide handy tools, which are sometimes only for entertainment, but quite often, they significantly increase our safety. In fact, the practical implementation of image processing algorithms is particularly wide. Moreover, the rapid growth of computational complexity and computer efficiency has allowed for the development of more sophisticated and effective algorithms and tools. Although significant progress has been made so far, many issues still remain, resulting in the need for the development of novel approaches

Directory of Open Access Books (DOAB)

Fruit ripeness classification: A survey

Author: Alessandro Zangari
Andrea Albarelli
Andrea Gasparetto
Matteo Marcuzzo
Matteo Rizzo
Publication date: 01/01/2023

Fruit is a key crop in worldwide agriculture, feeding millions of people. The standard supply chain of fruit products involves quality checks to guarantee freshness, taste, and, most of all, safety. An important factor that determines fruit quality is its stage of ripening. This is usually classified manually by field experts, making it a labor-intensive and error-prone process. Thus, there is a growing need for automation in fruit ripeness classification. Many automatic methods have been proposed that employ a variety of feature descriptors for the food item to be graded. Machine learning and deep learning techniques dominate the top-performing methods. Furthermore, deep learning can operate on raw data and thus relieve users from having to compute complex engineered features, which are often crop-specific. In this survey, we review the latest methods proposed in the literature to automate fruit ripeness classification, highlighting the most common feature descriptors they operate on

Archivio istituzionale della ricerca - Università degli Studi di Venezia Ca' Foscari

Using Domain-Specific Information in Image Processing

Author: Cash Brianna Rose
Publication date: 01/01/2014

With the increasing availability of high resolution imaging tools, even in our pockets (i.e. smartphones), everyday users can do far more than simply digitally capturing a family moment. The ease of new applications available in these portable forms, linked with users who have expert knowledge about the images and tasks, opens the door to new possibilities. With this in mind we propose two new approaches that utilize the user's knowledge for improved results. We apply these approaches to real life problems in medical and scientific image applications. In the first approach, we introduce a class of linear and nonlinear methods which we call Domain-Specific Grayscale (DSGS) methods. A DSGS method transforms a color image into an image analogous to a grayscale image, where user-specified information is used to optimize a specified image processing task and reduce the computational complexity. We introduce new methods based on projection into the space of single-coordinate images, and we adapt support vector machines by using their scores to create a DSGS image. We apply these methods to applications in dermatology, analyzing images of skin tests and skin lesions, and demonstrate their usefulness. In the second approach, we introduce a tool for improved image deblurring that safeguards against bias that can easily be introduced by a user favoring a particular result. This is particularly important in scientific and medical applications used for discovery or diagnosis. We provide real-time results of choices of regularization methods and parameter selection, and we check the statistical plausibility of the results, using three statistical diagnostics, allowing a user to see the results of the choices. Our work demonstrates the utility of domain-specific information, supplied by the user, in improving the results of image processing algorithms
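
The abstract describes Domain-Specific Grayscale (DSGS) methods as mapping a color image to a single-channel image using user-specified information. As a loose illustration of the linear case only, the sketch below projects each RGB pixel onto a direction vector `w`. Both the function names and the choice of `w` are hypothetical; the abstract does not say how the thesis derives the projection, and the nonlinear and SVM-based variants it mentions are not covered here.

```python
# Sketch only: a linear colour-to-single-channel projection.
# The direction `w` stands in for user-supplied domain knowledge; it is a
# hypothetical choice, not a value taken from the thesis.

def project_to_single_channel(image, w):
    """Map each (r, g, b) pixel of a 2-D image to the scalar w . (r, g, b)."""
    return [[sum(wi * ci for wi, ci in zip(w, px)) for px in row] for row in image]


def rescale_to_unit(channel):
    """Linearly rescale a 2-D list of scalars to [0, 1] for display."""
    flat = [v for row in channel for v in row]
    lo, hi = min(flat), max(flat)
    if hi == lo:
        # Constant image: no contrast to stretch
        return [[0.0 for _ in row] for row in channel]
    return [[(v - lo) / (hi - lo) for v in row] for row in channel]
```

For example, `w = (1, -1, 0)` emphasises red-versus-green contrast, which a plain luminance grayscale would largely discard.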

Digital Repository at the University of Maryland

Detection and Classification of Diabetic Retinopathy Pathologies in Fundus Images

Author: Agurto Rios Carla Paola
Publication venue: UNM Digital Repository
Publication date: 30/01/2013

Diabetic Retinopathy (DR) is a disease that affects up to 80% of diabetics around the world. It is the second greatest cause of blindness in the Western world, and one of the leading causes of blindness in the U.S. Many studies have demonstrated that early treatment can reduce the number of sight-threatening DR cases, mitigating the medical and economic impact of the disease. Accurate, early detection of eye disease is important because of its potential to reduce rates of blindness worldwide. Retinal photography for DR has been promoted for decades for its utility in both disease screening and clinical research studies. In recent years, several research centers have presented systems to detect pathology in retinal images. However, these approaches apply specialized algorithms to detect specific types of lesion in the retina. In order to detect multiple lesions, these systems generally implement multiple algorithms. Furthermore, some of these studies evaluate their algorithms on a single dataset, thus avoiding potential problems associated with the differences in fundus imaging devices, such as camera resolution. These methodologies primarily employ bottom-up approaches, in which the accurate segmentation of all the lesions in the retina is the basis for correct determination. A disadvantage of bottom-up approaches is that they rely on the accurate segmentation of all lesions in order to measure performance. On the other hand, top-down approaches do not depend on the segmentation of specific lesions. Thus, top-down methods can potentially detect abnormalities not explicitly used in their training phase. A disadvantage of these methods is that they cannot identify specific pathologies and require large datasets to build their training models. In this dissertation, I merged the advantages of the top-down and bottom-up approaches to detect DR with high accuracy. 
First, I developed an algorithm based on a top-down approach to detect abnormalities in the retina due to DR. By doing so, I was able to evaluate DR pathologies other than microaneurysms and exudates, which are the main focus of most current approaches. In addition, I demonstrated good generalization capacity of this algorithm by applying it to other eye diseases, such as age-related macular degeneration. Because high accuracy is required for sight-threatening conditions, I developed two bottom-up approaches, since it has been proven that bottom-up approaches produce more accurate results than top-down approaches for particular structures. Consequently, I developed an algorithm to detect exudates in the macula. The presence of this pathology is considered to be a surrogate for clinically significant macular edema (CSME), a sight-threatening condition of DR. The analysis of the optic disc is usually not taken into account in DR screening systems. However, there is a pathology called neovascularization that is present in advanced stages of DR, making its detection of crucial clinical importance. To address this problem, I developed an algorithm to detect neovascularization in the optic disc. These algorithms are based on amplitude-modulation and frequency-modulation (AM-FM) representations, morphological image processing methods, and classification algorithms. The methods were tested on a diverse set of large databases and are considered to be the state of the art in this field

Computer mediated colour fidelity and communication

Author: Peter A. Rhodes (7152605)
Publication date: 01/01/1995

Developments in technology have meant that computer-controlled imaging devices are becoming more powerful and more affordable. Despite their increasing prevalence, computer-aided design and desktop publishing software has failed to keep pace, leading to disappointing colour reproduction across different devices. Although there has been a recent drive to incorporate colour management functionality into modern computer systems, in general this is limited in scope and fails to properly consider the way in which colours are perceived. Furthermore, differences in viewing conditions or representation severely impede the communication of colour between groups of users. The approach proposed here is to provide WYSIWYG colour across a range of imaging devices through a combination of existing device characterisation and colour appearance modelling techniques. In addition, to further facilitate colour communication, various common colour notation systems are defined by a series of mathematical mappings. This enables both the implementation of computer-based colour atlases (which have a number of practical advantages over physical specifiers) and the interrelation of colour represented in hitherto incompatible notations. Together with the proposed solution, details are given of a computer system which has been implemented. The system was used by textile designers for a real task. Prior to undertaking this work, designers were interviewed in order to ascertain where colour played an important role in their work and where it was found to be a problem. A summary of the findings of these interviews, together with a survey of existing approaches to the problems of colour fidelity and communication in colour computer systems, is also given. As background to this work, the topics of colour science and colour imaging are introduced

Loughborough University Institutional Repository

Colour measurement and colour reproduction systems.

Author: Chalmers Andrew Neil.
Publication date: 01/01/1987

Thesis (M.Sc.Eng.)-University of Natal, Durban, 1987. Techniques of colour measurement and colour reproduction are important in a wide range of commercial and social activities in most modern economies. Their study thus constitutes one of the major areas of interest to the CIE. The project described in this thesis began as an outgrowth of studies of new types of light sources and of the colorimetry of colour-TV systems; plus a conviction that modern TV cameras can operate effectively with a wide range of different illuminating spectra. It was soon evident that two important prerequisites for this research were: an understanding of the processes of human colour vision; and a knowledge of the standard, international, colorimetric terminology of the CIE. These topics are discussed fully in the text. Also included is a review of modern gas-discharge lamps, their properties, and their applications. Both high-pressure (HID) types and low-pressure (fluorescent-tube) types are considered. Because of the need to measure the colours of surfaces and their TV reproductions as accurately as possible, various forms of colorimeter were examined, leading to the choice of a spectrophotometer system for this work. The design, construction, and evaluation of an original spectrophotometer system (the UND Spectrophotometer) are described fully in the text. Finally, attention is given to the operation of a television system under nonstandard lighting. Twelve different light sources were evaluated as TV "taking" illuminants, using both subjective and colorimetric methods of assessment. The experimental results tend to confirm that colorimetric methods are unsuited to colour reproduction evaluation, and that subjective methods are more meaningful. A subjective scale of colour reproduction performance was established, and it was found to correlate closely with the CIE general colour rendering index (Ra) for the various test lamps. 
The work reported herein predates similar experiments with TV lighting by other workers, and it includes a wider range of light sources. In spite of differences in experimental technique, however, there is broad agreement with their general results

ResearchSpace@UKZN

Visibility metrics and their applications in visually lossless image compression

Author: Ye Nanyang
Publication venue: University of Cambridge
Publication date: 09/01/2020

Visibility metrics are image metrics that predict the probability that a human observer can detect differences between a pair of images. These metrics can provide localized information in the form of visibility maps, in which each value represents a probability of detection. An important application of the visibility metric is visually lossless image compression that aims at compressing a given image to the lowest fraction of bit per pixel while keeping the compression artifacts invisible at the same time. In previous works, most visibility metrics were modeled based on largely simplified assumptions and mathematical models of human visual systems. This approach generally fits well into experimental data measured with simple stimuli, such as Gabor patches. However, it cannot predict complex non-linear effects, such as contrast masking in natural images, particularly well. To predict visibility of image differences accurately, we collected the largest visibility dataset under fixed viewing conditions for calibrating existing visibility metrics and proposed a deep neural network-based visibility metric. We demonstrated in our experiments that the deep neural network-based visibility metric significantly outperformed existing visibility metrics. However, the deep neural network-based visibility metric cannot predict visibility under varying viewing conditions, such as display brightness and viewing distances that have great impacts on the visibility of distortions. To extend the deep neural network-based visibility metric to varying viewing conditions, we collected the largest visibility dataset under varying display brightness and viewing distances. We proposed incorporating white-box modules, in other words, luminance masking and viewing distance adaptation, into the black-box deep neural network, and we found that the combination of white-box modules and black-box deep neural networks could generalize our proposed visibility metric to varying viewing conditions. 
To demonstrate the application of our proposed deep neural network-based visibility metric to visually lossless image compression, we collected the visually lossless image compression dataset under fixed viewing conditions and significantly improved the deep neural network-based visibility metric's accuracy in predicting the visually lossless image compression threshold by pre-training the visibility metric with a synthetic dataset generated by the state-of-the-art white-box visibility metric, HDR-VDP [Mantiuk2011]. In a large-scale study of 1000 images, we found that with our improved visibility metric, we can save around 60% to 70% of bits for visually lossless image compression encoding as compared to the default visually lossless quality level of 90. Because predicting image visibility and predicting image quality are closely related research topics, we also proposed a trained perceptually uniform transform for high dynamic range image and video quality assessment by training a perceptual encoding function on a set of subjective quality assessment datasets. We have shown that when combining the trained perceptual encoding function with standard dynamic range image quality metrics, such as peak signal-to-noise ratio (PSNR), better performance was achieved compared to the untrained version
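
The abstract names PSNR as one of the standard dynamic range quality metrics. For reference, a minimal sketch of the plain PSNR definition, PSNR = 10 log10(MAX^2 / MSE), is given below, assuming 8-bit pixel values stored in flat lists. The trained perceptual encoding function from the thesis is not reproduced; in that setting it would be applied to the pixel values before this computation.

```python
import math

# Sketch only: plain PSNR between two equally sized images given as flat
# lists of pixel values. max_value=255 assumes 8-bit data.

def psnr(reference, distorted, max_value=255.0):
    """Peak signal-to-noise ratio in decibels."""
    if len(reference) != len(distorted) or not reference:
        raise ValueError("images must be non-empty and of equal size")
    mse = sum((a - b) ** 2 for a, b in zip(reference, distorted)) / len(reference)
    if mse == 0:
        # Identical images: PSNR is unbounded
        return math.inf
    return 10 * math.log10(max_value ** 2 / mse)
```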

Apollo (Cambridge)

Urban environment perception and navigation using robotic vision: design and implementation applied to an autonomous vehicle

Author: Vitor Giovani Bernardes, 1985-
Publication venue: [s.n.]
Publication date: 26/08/2018

Advisors: Janito Vaqueiro Ferreira, Alessandro Corrêa Victorino. Thesis (doctorate), Universidade Estadual de Campinas, Faculdade de Engenharia Mecânica. Abstract: The development of autonomous vehicles capable of getting around on urban roads can provide important benefits in reducing accidents, increasing quality of life, and reducing costs. Intelligent vehicles, for example, often base their decisions on observations obtained from various sensors such as LIDAR, GPS, and cameras. Camera sensors are currently receiving considerable attention because they are cheap, easy to use, and provide information-rich data. Inner-city environments represent an interesting but also very challenging scenario in this context: the road layout may be very complex, objects such as trees, bicycles, and cars may cause partial observations, and these observations are often noisy or even missing due to heavy occlusions. Thus, the perception process by nature needs to be able to deal with uncertainty in the knowledge of the world around the car. While highway navigation and autonomous driving using prior knowledge of the environment have been demonstrated successfully, understanding and navigating general inner-city scenarios with little prior knowledge remains an unsolved problem. In this thesis, this perception problem is analyzed for driving in inner-city environments, together with the capacity to move safely based on the decision-making process in autonomous navigation. A perception system is designed that allows robotic cars to drive autonomously on roads without the need to adapt the infrastructure, without requiring prior knowledge of the environment, and considering the presence of dynamic objects such as cars. A novel method based on machine learning is proposed to extract the semantic context from a pair of stereo images; this context is merged into an evidential occupancy grid that models the uncertainties of an unknown urban environment by applying Dempster-Shafer theory. For decision-making in path planning, the virtual tentacle approach is applied to generate possible paths starting from the car's ego-referenced center, and on this basis two new strategies are proposed: first, a new strategy to select the correct path to better avoid obstacles and follow the local task in the context of hybrid navigation; second, a new closed-loop control based on visual odometry and the virtual tentacle, modeled for path-following execution. Finally, a complete automotive system integrating the perception, path-planning, and control modules is implemented and experimentally validated in real conditions using an experimental autonomous car; the results show that the developed approach successfully performs safe local navigation based on camera sensors.
Doctorate. Area: Solid Mechanics and Mechanical Design. Degree: Doctor of Mechanical Engineering
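
The abstract refers to an evidential grid built with Dempster-Shafer theory. As a generic illustration, the sketch below applies Dempster's rule of combination to two mass functions for a single grid cell over the frame {free, occupied}. The cell representation, the names, and the example masses are assumptions made for this sketch and are not taken from the thesis.

```python
# Sketch only: Dempster's rule of combination for one evidential grid cell.
# Hypotheses are frozensets over the frame {"free", "occupied"}; the full
# frame represents "unknown". Names and example masses are hypothetical.

FREE = frozenset({"free"})
OCCUPIED = frozenset({"occupied"})
UNKNOWN = FREE | OCCUPIED


def combine(m1, m2):
    """Combine two mass functions (dicts mapping frozenset -> mass)."""
    combined = {}
    conflict = 0.0
    for a, mass_a in m1.items():
        for b, mass_b in m2.items():
            intersection = a & b
            if intersection:
                combined[intersection] = combined.get(intersection, 0.0) + mass_a * mass_b
            else:
                # Mass assigned to contradictory hypotheses
                conflict += mass_a * mass_b
    if conflict >= 1.0:
        raise ValueError("sources are in total conflict")
    # Renormalise by the non-conflicting mass
    return {h: v / (1.0 - conflict) for h, v in combined.items()}


# Example: two sensor readings for the same cell
reading_1 = {OCCUPIED: 0.6, UNKNOWN: 0.4}
reading_2 = {OCCUPIED: 0.5, FREE: 0.2, UNKNOWN: 0.3}
fused = combine(reading_1, reading_2)
```

Keeping explicit mass on `UNKNOWN` is what lets such a grid distinguish "no evidence yet" from "conflicting evidence", which a single occupancy probability cannot do.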

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

Repositorio da Producao Cientifica e Intelectual da Unicamp

Gesture tracking and neural activity segmentation in head-fixed behaving mice by deep learning methods

Author: Abbas Waseem
Publication venue: 'Fundacio per la Universitat Oberta de Catalunya'
Publication date: 30/10/2020

The typical approach used by neuroscientists is to study the response of laboratory animals to a stimulus while recording their neural activity at the same time. With the advent of calcium imaging technology, researchers can now study neural activity at sub-cellular resolutions in vivo. Similarly, recording the behaviour of laboratory animals is also becoming more affordable. Although it is now easier to record behavioural and neural data, these data come with their own set of challenges. The biggest challenge, given the sheer volume of the data, is annotation. A traditional approach is to annotate the data manually, frame by frame. With behavioural data, manual annotation is done by looking at each frame and tracing the animals; with neural data, this is carried out by a trained neuroscientist. In this research, we propose automated tools based on deep learning that can aid in the processing of behavioural and neural data. These tools will help neuroscientists annotate and analyse the data they acquire in an automated and reliable way

Tesis Doctorals en Xarxa