14 research outputs found

    Transfer Learning for Binary Classification of Thermal Images

    The classification of thermal images is a key task in the industrial sector, since it is usually the starting point for detecting faults in electrical equipment. In some cases this task is automated with traditional artificial intelligence techniques, while in others it is performed manually, which can lead to high rates of human error. This paper presents a comparative analysis of eleven transfer learning architectures (AlexNet, VGG16, VGG19, ResNet, DenseNet, MobileNet v2, GoogLeNet, ResNeXt, Wide ResNet, MNASNet and ShuffleNet) fine-tuned to perform binary classification of thermal images from an electrical distribution network. A database of 815 images was used, split with a 60-20-20 hold-out scheme and 5-fold cross-validation, and the performance of the architectures was compared with the Friedman test. The experiments yielded satisfactory results, with accuracies above 85% for ten of the pretrained architectures. In contrast, the architecture that was not pretrained showed low accuracy. It is therefore concluded that transfer learning with pretrained architectures is a suitable mechanism for classifying this type of image and represents a reliable alternative to traditional artificial intelligence techniques.
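    As a rough illustration of the fine-tuning strategy described above, the sketch below adapts a pretrained torchvision backbone (MobileNet v2, one of the eleven architectures) for binary classification. The dataset path, hyperparameters and training loop are illustrative assumptions, not the authors' exact setup.

```python
# Minimal fine-tuning sketch for binary thermal-image classification.
# Assumes an ImageFolder-style dataset with two classes; paths and
# hyperparameters are illustrative, not taken from the paper.
import torch
import torch.nn as nn
from torch.utils.data import DataLoader
from torchvision import datasets, models, transforms

transform = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
    transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225]),
])

train_set = datasets.ImageFolder("thermal/train", transform=transform)  # hypothetical path
train_loader = DataLoader(train_set, batch_size=32, shuffle=True)

# Load ImageNet weights and replace the classification head with a 2-class layer.
model = models.mobilenet_v2(weights=models.MobileNet_V2_Weights.IMAGENET1K_V1)
model.classifier[1] = nn.Linear(model.last_channel, 2)

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model.to(device)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
criterion = nn.CrossEntropyLoss()

model.train()
for epoch in range(5):
    for images, labels in train_loader:
        images, labels = images.to(device), labels.to(device)
        optimizer.zero_grad()
        loss = criterion(model(images), labels)
        loss.backward()
        optimizer.step()
```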

    Computational assessment of the retinal vascular tortuosity integrating domain-related information

    Retinal vascular tortuosity has valuable potential as a clinical biomarker of many relevant vascular and systemic diseases. Existing approaches commonly quantify tortuosity by means of purely mathematical representations of the vessel segments. However, specialists, drawing on their diagnostic experience, also analyze additional domain-related information that is not captured by these reference mathematical metrics. In this work, we propose a novel computational tortuosity metric that outperforms the reference mathematical metrics by also incorporating anatomical properties of the fundus image, such as the distinction between arteries and veins, the distance to the optic disc, the distance to the fovea, and the vessel caliber. The evaluation of its prognostic performance shows that integrating the anatomical factors provides an accurate tortuosity assessment that is better aligned with the specialists' perception. Funding: Instituto de Salud Carlos III, DTS18/00136; Ministerio de Ciencia, Innovación y Universidades, DPI2015-69948-R and RTI2018-095894-B-I00; Xunta de Galicia, ED431G/01 and ED431C 2016-04
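    The abstract does not give the exact formulation, but a common geometric baseline is the arc-length over chord-length ratio per vessel segment. The sketch below combines that baseline with illustrative anatomical weights (artery/vein, distance to optic disc and fovea, caliber) purely as a hypothetical example of how such factors could be integrated; it is not the metric proposed in the paper.

```python
# Hypothetical illustration of an anatomy-weighted tortuosity score.
# The baseline is the classic arc-length / chord-length ratio; the
# weighting scheme below is an assumption, not the metric from the paper.
import numpy as np

def arc_chord_tortuosity(points: np.ndarray) -> float:
    """Tortuosity of a vessel segment given as an (N, 2) array of centerline points."""
    arc = np.sum(np.linalg.norm(np.diff(points, axis=0), axis=1))
    chord = np.linalg.norm(points[-1] - points[0])
    return arc / chord if chord > 0 else 1.0

def weighted_tortuosity(points, is_artery, dist_optic_disc, dist_fovea, caliber):
    """Combine the geometric ratio with illustrative anatomical weights."""
    base = arc_chord_tortuosity(points)
    w_vessel = 1.1 if is_artery else 1.0                # arteries weighted slightly higher (assumption)
    w_location = 1.0 / (1.0 + 0.01 * dist_optic_disc + 0.01 * dist_fovea)
    w_caliber = 1.0 + 0.05 * caliber                    # thicker vessels contribute more (assumption)
    return base * w_vessel * w_location * w_caliber

segment = np.array([[0, 0], [1, 0.4], [2, -0.2], [3, 0.1]], dtype=float)
print(weighted_tortuosity(segment, is_artery=True, dist_optic_disc=40.0, dist_fovea=120.0, caliber=3.0))
```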

    An approach based on Open Research Knowledge Graph for Knowledge Acquisition from scientific papers

    A scientific paper can be divided into two major constructs: metadata and full-body text. Metadata provides a brief overview of the paper, while the full-body text contains key insights that can be valuable to fellow researchers. To retrieve metadata and key insights from scientific papers, knowledge acquisition is a central activity. It consists of gathering, analyzing and organizing knowledge embedded in scientific papers in such a way that it can be used and reused whenever needed. Given the wealth of scientific literature, manual knowledge acquisition is a cumbersome task; thus, computer-assisted and (semi-)automatic strategies are generally adopted. Our purpose in this research was twofold: to curate the Open Research Knowledge Graph (ORKG) with papers related to ontology learning, and to define an approach using ORKG as a computer-assisted tool to organize key insights extracted from research papers. This approach was used to document the "epidemiological surveillance systems design and implementation" research problem and to prepare the related work of this paper. It is currently used to document the "food information engineering", "Tabular data to Knowledge Graph Matching" and "Question Answering" research problems, and the "Neuro-symbolic AI" domain.

    Color Fundus Image Registration Using a Learning-Based Domain-Specific Landmark Detection Methodology

    Medical imaging, and particularly retinal imaging, allows accurate diagnosis of many eye pathologies as well as some systemic diseases such as hypertension or diabetes. Registering these images is crucial to correctly compare key structures, not only within patients, but also to contrast data with a model or among a population. Currently, this field is dominated by complex classical methods, because the novel deep learning methods cannot yet compete in terms of results and the commonly used methods are difficult to adapt to the retinal domain. In this work, we propose a novel method to register color fundus images based on previous works that employed classical approaches to detect domain-specific landmarks. Instead, we propose to use deep learning methods for the detection of these highly specific domain-related landmarks. Our method uses a neural network to detect the bifurcations and crossovers of the retinal blood vessels, whose arrangement and location are unique to each eye and person. This proposal is the first deep learning feature-based registration method in fundus imaging. These keypoints are matched using a method based on RANSAC (Random Sample Consensus) without the requirement to calculate complex descriptors. Our method was tested using the public FIRE dataset, although the landmark detection network was trained using the DRIVE dataset. Our method provides accurate results, with a registration score of 0.657 for the whole FIRE dataset (0.908 for category S, 0.293 for category P and 0.660 for category A). Therefore, our proposal can compete with complex classical methods and outperforms the deep learning methods in the state of the art. This research was funded by Instituto de Salud Carlos III, Government of Spain, DTS18/00136 research project; Ministerio de Ciencia e Innovación y Universidades, Government of Spain, RTI2018-095894-B-I00 research project; Consellería de Cultura, Educación e Universidade, Xunta de Galicia, through the predoctoral grant contract ref. ED481A 2021/147 and Grupos de Referencia Competitiva, grant ref. ED431C 2020/24; and CITIC, Centro de Investigación de Galicia, ref. ED431G 2019/01, which receives financial support from Consellería de Educación, Universidade e Formación Profesional, Xunta de Galicia, through the ERDF (80%) and Secretaría Xeral de Universidades (20%). The funding institutions had no involvement in the study design; in the collection, analysis and interpretation of data; in the writing of the manuscript; or in the decision to submit the manuscript for publication. Funding for open access charge: Universidade da Coruña/CISUG.
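    As a hedged illustration of the descriptor-free RANSAC matching step described above, the sketch below robustly estimates an affine transform from putatively matched keypoints with scikit-image. The keypoint arrays are synthetic placeholders, not output from the paper's landmark detection network.

```python
# Sketch of RANSAC-based registration from matched keypoints.
# The keypoint coordinates below are synthetic placeholders; in the paper
# they would come from a vessel bifurcation/crossover detection network.
import numpy as np
from skimage.measure import ransac
from skimage.transform import AffineTransform

rng = np.random.default_rng(0)
src = rng.uniform(0, 500, size=(60, 2))                    # keypoints in the moving image
true_tf = AffineTransform(rotation=0.05, translation=(12, -7))
dst = true_tf(src) + rng.normal(0, 0.5, size=src.shape)    # same keypoints in the fixed image
dst[:10] += rng.uniform(-80, 80, size=(10, 2))             # a few wrong matches (outliers)

# Robustly fit the transform; RANSAC discards the outlier correspondences.
model, inliers = ransac(
    (src, dst), AffineTransform,
    min_samples=3, residual_threshold=2.0, max_trials=2000,
)
print(f"{inliers.sum()} inliers out of {len(src)} correspondences")

# The estimated model could then be used to warp the moving image, e.g.:
# registered = skimage.transform.warp(moving_image, model.inverse, output_shape=fixed_image.shape)
```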

    OntoSIDES: Ontology-based student progress monitoring on the national evaluation system of French Medical Schools

    We introduce OntoSIDES, the core of an ontology-based learning management system in Medicine, in which the educational content, the traces of students' activities and the correction of exams are linked and related to items of an official reference program in a unified RDF data model. OntoSIDES is an RDF knowledge base comprising a lightweight domain ontology, which serves as a pivot high-level vocabulary for the query interface with users, and a dataset made of factual statements relating individual entities to classes and properties of the ontology. Thanks to an automatic mapping-based data materialization and rule-based data saturation, OntoSIDES contains around 8 million triples to date, and provides integrated access to useful information for student progress monitoring, using a powerful query language (namely SPARQL) that allows users to express their specific needs for data exploration and analysis. Since we do not expect end-users to master the raw syntax of SPARQL or to express complex queries in SPARQL directly, we have designed a set of parametrized queries that users can instantiate through a user-friendly interface.
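    To make the idea of parametrized SPARQL queries concrete, the sketch below builds a tiny RDF graph with rdflib and instantiates a query template with a binding supplied at run time. The vocabulary (namespace, class and property names) is invented for illustration and is not the actual OntoSIDES ontology.

```python
# Illustrative parametrized SPARQL query over a toy RDF graph (rdflib).
# The "ex:" vocabulary below is hypothetical, not the OntoSIDES ontology.
from rdflib import Graph, Namespace, URIRef
from rdflib.namespace import RDF

EX = Namespace("http://example.org/sides/")
g = Graph()
g.bind("ex", EX)

student = URIRef(EX["student42"])
g.add((student, RDF.type, EX.Student))
g.add((student, EX.answered, EX.question7))
g.add((EX.question7, EX.aboutItem, EX.item_cardiology))

# A query template with a ?student parameter, instantiated via initBindings,
# mirroring the idea of parametrized queries behind a user-friendly interface.
template = """
PREFIX ex: <http://example.org/sides/>
SELECT ?item WHERE {
    ?student ex:answered ?question .
    ?question ex:aboutItem ?item .
}
"""
for row in g.query(template, initBindings={"student": student}):
    print(row.item)
```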

    Meaning-sensitive noisy text analytics in the low data regime

    Digital connectivity is revolutionising people's quality of life. As broadband and mobile services become faster and more prevalent globally than ever before, people have started to frequently express their wants and desires on social media platforms. Thus, deriving insights from text data has become a popular approach, both in industry and academia, to provide social media analytics solutions across a range of disciplines, including consumer behaviour, sales, sports and sociology. Businesses can harness the data shared on social networks to improve their strategic business decisions by leveraging advanced Natural Language Processing (NLP) techniques, such as context-aware representations. Specifically, SportsHosts, our industry partner, will be able to launch digital marketing solutions that optimise audience targeting and personalisation using NLP-powered solutions. However, social media data are often noisy and diverse, making the task very challenging. Further, real-world NLP tasks often suffer from insufficient labelled data due to the costly and time-consuming nature of manual annotation. Nevertheless, businesses are keen on maximising the return on investment by boosting the performance of these NLP models in the real world, particularly with social media data. In this thesis, we make several contributions to address these challenges. Firstly, we propose to improve the NLP model's ability to comprehend noisy text in a low data regime by leveraging prior knowledge from pre-trained language models. Secondly, we analyse the impact of text augmentation and the quality of synthetic sentences in a context-aware NLP setting, and propose a meaning-sensitive text augmentation technique using a Masked Language Model. Thirdly, we offer a cost-efficient text data annotation methodology and an end-to-end framework to deploy efficient and effective social media analytics solutions in the real world.
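    The meaning-sensitive augmentation idea mentioned above can be sketched with an off-the-shelf masked language model: mask a token in a sentence and let the model propose in-context replacements. The model name and the simple word-masking policy below are illustrative assumptions, not the exact technique from the thesis.

```python
# Rough sketch of masked-language-model text augmentation.
# Masks one word and keeps the model's in-context suggestions;
# model choice and masking policy are assumptions for illustration only.
from transformers import pipeline

fill_mask = pipeline("fill-mask", model="bert-base-uncased")

def augment(sentence: str, target_word: str, top_k: int = 3):
    """Replace target_word with contextually plausible alternatives."""
    masked = sentence.replace(target_word, fill_mask.tokenizer.mask_token, 1)
    candidates = fill_mask(masked, top_k=top_k)
    # Keep suggestions that differ from the original word.
    return [c["sequence"] for c in candidates if c["token_str"].strip() != target_word]

print(augment("The fans were excited about the upcoming match.", "excited"))
```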

    Applications of Silicon Retinas: from Neuroscience to Computer Vision

    Traditional visual sensor technology is firmly rooted in the concept of sequences of image frames. The sequence of stroboscopic images in these "frame cameras" is very different from the information running from the retina to the visual cortex. While conventional cameras have improved in the direction of smaller pixels and higher frame rates, the basics of image acquisition have remained the same. Event-based vision sensors were originally known as "silicon retinas" but are now widely called "event cameras." They are a new type of vision sensor that takes inspiration from the mechanisms developed by nature for the mammalian retina and suggests a different way of perceiving the world. As in the neural system, the sensed information is encoded in a train of spikes, or so-called events, comparable to the action potentials generated in the nerve. Event-based sensors produce sparse and asynchronous output that represents informative changes in the scene. These sensors have advantages in terms of fast response, low latency, high dynamic range, and sparse output. All these characteristics are appealing for computer vision and robotic applications, increasing the interest in this kind of sensor. However, since the sensor's output is very different, algorithms designed for frames need to be rethought and re-adapted. This thesis focuses on several applications of event cameras in scientific scenarios. It aims to identify where they can make a difference compared to frame cameras. The presented applications use the Dynamic Vision Sensor (the event camera developed by the Sensors Group of the Institute of Neuroinformatics, University of Zurich and ETH). To explore some applications in more extreme situations, the first chapters of the thesis focus on the characterization of several advanced versions of the standard DVS. Low light represents a challenging condition for every vision sensor. Taking inspiration from standard Complementary Metal Oxide Semiconductor (CMOS) technology, the DVS pixel performance in a low light scenario can be improved, increasing sensitivity and quantum efficiency, by using back-side illumination. This thesis characterizes the so-called Back Side Illumination DAVIS (BSI DAVIS) camera and shows results from its application in calcium imaging of neural activity. The BSI DAVIS has shown better performance in low light scenes due to its high Quantum Efficiency (QE) of 93% and proved to be the best type of technology for microscopy applications. The BSI DAVIS allows detecting fast dynamic changes in neural fluorescence imaging using the green fluorescent calcium indicator GCaMP6f. Event camera advances have pushed the exploration of event-based cameras in computer vision tasks. Chapters of this thesis focus on two of the most active research areas in computer vision: human pose estimation and hand gesture classification. Both chapters report the datasets collected to achieve the task, fulfilling the continuous need for data for this kind of new technology. The Dynamic Vision Sensor Human Pose dataset (DHP19) is an extensive collection of 33 whole-body human actions from 17 subjects. The chapter presents the first benchmark neural network model for 3D pose estimation using DHP19. The network achieves a mean error of less than 8 mm in 3D space, which is comparable with frame-based Human Pose Estimation (HPE) methods. The gesture classification chapter reports an application running on a mobile device and explores future developments in the direction of embedded portable low-power devices for online processing. The sparse output from the sensor suggests using a small model with a reduced number of parameters and low power consumption. The thesis also describes pilot results from two other scientific imaging applications, for raindrop size measurement and laser speckle analysis, presented in the appendices.
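    Because event cameras emit sparse asynchronous (x, y, timestamp, polarity) tuples rather than frames, a common first step before applying frame-oriented computer vision is to accumulate events over a time window into a 2D histogram. The sketch below shows this standard conversion on synthetic events; the sensor resolution and window length are arbitrary assumptions, not DVS/DAVIS specifics from the thesis.

```python
# Sketch: accumulate asynchronous events (x, y, t, polarity) into frames.
# Synthetic events and an arbitrary 346x260 resolution are used purely for illustration.
import numpy as np

def events_to_frame(x, y, polarity, width, height):
    """Build a signed event-count image: ON events add +1, OFF events -1."""
    frame = np.zeros((height, width), dtype=np.int32)
    np.add.at(frame, (y, x), np.where(polarity > 0, 1, -1))
    return frame

rng = np.random.default_rng(0)
n = 10_000
width, height = 346, 260
x = rng.integers(0, width, n)
y = rng.integers(0, height, n)
t = np.sort(rng.uniform(0.0, 1.0, n))          # timestamps in seconds
p = rng.integers(0, 2, n)                       # polarity: 1 = ON, 0 = OFF

# Slice the stream into 10 ms windows and accumulate each slice into a frame.
window = 0.01
for start in np.arange(0.0, 0.05, window):
    mask = (t >= start) & (t < start + window)
    frame = events_to_frame(x[mask], y[mask], p[mask], width, height)
    print(f"window {start:.2f}s: {mask.sum()} events, max count {np.abs(frame).max()}")
```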