86 research outputs found

    Training deep retrieval models with noisy datasets

    In this thesis we study loss functions for training Convolutional Neural Networks (CNNs) on noisy datasets for the task of Content-Based Image Retrieval (CBIR). In particular, we propose two novel losses for fitting models that generate global image representations. First, a Soft-Matching (SM) loss, exploiting both image content and metadata, is used to specialize general CNNs to particular cities or regions using weakly annotated datasets. Second, a Bag Exponential (BE) loss, inspired by the Multiple Instance Learning (MIL) framework, is employed to train CNNs for CBIR on noisy datasets. The first part of the thesis introduces a novel training framework that, relying on image content and metadata, learns location-adapted deep models that provide fine-tuned image descriptors for specific visual content. Our networks, which start from a baseline model originally learned for a different task, are specialized using a custom pairwise loss function, our proposed SM loss, that uses weak labels based on image content and metadata. The experimental results show that the proposed location-adapted CNNs achieve an improvement of up to 55% over the baseline networks on a landmark discovery task. This implies that the models successfully learn the visual cues and peculiarities of the region for which they are trained, and generate image descriptors that are better adapted to the location. In addition, for landmarks that are not present in the training set, or that even belong to other cities, our proposed models perform at least as well as the baseline network, which indicates good resilience against overfitting. The second part of the thesis introduces the BE loss function for training CNNs for image retrieval, borrowing inspiration from the MIL framework. The loss combines an exponential function acting as a soft margin with a MIL-based mechanism working with bags of positive and negative pairs of images.
The method allows deep retrieval networks to be trained on noisy datasets by weighting the influence of the different samples at the loss level, which increases the performance of the generated global descriptors. The rationale behind the improvement is that we handle noise in an end-to-end manner and therefore avoid both its negative influence and the unintentional biases introduced by fixed pre-processing cleaning procedures. In addition, our method is general enough to suit other scenarios requiring different weights for the training instances (e.g. boosting the influence of hard positives during training). The proposed bag exponential function can be seen as a back door for guiding the learning process towards a certain objective in an end-to-end manner, allowing the model to approach that objective smoothly and progressively. Our results show that our loss allows CNN-based retrieval systems to be trained with noisy training sets and achieve state-of-the-art performance. Furthermore, we have found that it is better to use training sets that are highly correlated with the final task, even if they are noisy, than to train with a clean set that is only weakly related to the topic at hand. From our point of view, this result represents a big leap in the applicability of retrieval systems and helps to reduce the effort needed to set up new CBIR applications, e.g. by allowing fast automatic generation of noisy training datasets and then using our bag exponential loss to deal with the noise. Moreover, we consider that this result opens a new line of research for CNN-based image retrieval: letting the models decide not only on the best features to solve the task but also on the most relevant samples for doing so. Doctoral Programme in Multimedia and Communications of the Universidad Carlos III de Madrid and the Universidad Rey Juan Carlos. Chair: Luis Salgado Álvarez de Sotomayor. Secretary: Pablos Martínez Olmos. Member: Ernest Valveny Llobe

    Towards increased efficiency and automation in fluorescence micrograph analysis based on hand-labeled data

    Held CH. Towards increased efficiency and automation in fluorescence micrograph analysis based on hand-labeled data. Bielefeld: Universität Bielefeld; 2013. In the past decade, automation in fluorescence microscopy has increased strongly, particularly with regard to image acquisition and sample preparation, which results in a huge volume of data. The time required for manual assessment of an experiment is hence mainly determined by the time required for data analysis. In addition, manual data analysis often suffers from poor reproducibility and a lack of objectivity. Using automated image analysis software, the time required for data analysis can be reduced while the quality and reproducibility of the evaluation are improved. Most image analysis approaches are based on a segmentation of the image. By arranging several image processing methods in a so-called segmentation pipeline, and by adjusting all parameters, a broad range of fluorescence image data can be segmented. The drawback of available software tools is the long time required to calibrate the segmentation pipeline for an experiment, particularly for researchers with little knowledge of image processing. As a result, many experiments that could benefit from automated image analysis are still evaluated manually. In order to reduce the time users have to spend adapting automated image analysis software to their data, research was carried out on a novel image analysis concept based on hand-labeled data. With this concept, the user provides hand-labeled cells, from which an efficient combination of image processing methods and their parameterization is calibrated automatically, without further user input. Developing a segmentation pipeline that allows high-quality segmentation of a broad range of fluorescence micrographs in a short time poses a challenge.
In this work, a three-stage segmentation pipeline consisting of exchangeable preprocessing, figure-ground separation and cell-splitting methods was developed. These methods are mainly based on the state of the art, while some of them are contributions that extend it. Parameters must be discretized carefully, as a broad range of fluorescence image data is to be supported. In order to allow calibration of the segmentation pipeline in a short time, discretization with equidistant as well as nonlinear step sizes was implemented. Apart from parameter discretization, the quality of the calibration strongly depends on the choice of parameter optimization technique. In order to reduce calibration runtime, an exploratory parameter space analysis was performed for different segmentation methods. This experiment showed that parameter spaces are mostly monotonic, but also contain several local performance maxima. The comparison of different parameter optimization techniques indicated that the coordinate descent method yields a good parameterization of the segmentation pipeline in a small amount of time. In order to minimize the time spent by the user on calibrating the system, the relationship between the number of hand-labeled reference samples and the resulting segmentation performance was investigated. This experiment demonstrated that as few as ten reference samples often result in a good parameterization of the segmentation pipeline. Given the low number of cells required for automatic calibration of the segmentation pipeline, as well as its short runtime, it can be concluded that the investigated method improves automation and efficiency in fluorescence micrograph analysis.
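The calibration idea in this abstract (coordinate descent over discretized parameters) can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration: the parameter names `threshold` and `sigma`, the grid values, and the toy scoring function stand in for the thesis's actual pipeline parameters and its segmentation-quality measure, which the abstract does not specify.

```python
from typing import Callable, Dict, Sequence


def coordinate_descent(
    grids: Dict[str, Sequence[float]],
    score: Callable[[Dict[str, float]], float],
    max_sweeps: int = 10,
) -> Dict[str, float]:
    """Maximise `score` by tuning one discretised parameter at a time."""
    # Start from the middle value of every grid.
    current = {name: values[len(values) // 2] for name, values in grids.items()}
    best = score(current)
    for _ in range(max_sweeps):
        improved = False
        for name, values in grids.items():
            # Scan this parameter's grid while all other parameters stay fixed.
            for value in values:
                candidate = dict(current, **{name: value})
                candidate_score = score(candidate)
                if candidate_score > best:
                    best, current, improved = candidate_score, candidate, True
        if not improved:
            break  # a full sweep changed nothing: a (possibly local) maximum
    return current


# Hypothetical grids: equidistant steps for one parameter, nonlinear steps for the other.
grids = {
    "threshold": [i / 10 for i in range(11)],
    "sigma": [0.5, 1.0, 2.0, 4.0, 8.0],
}


def toy_score(params: Dict[str, float]) -> float:
    """Stand-in for segmentation quality measured against hand-labeled cells."""
    return -((params["threshold"] - 0.4) ** 2) - (params["sigma"] - 2.0) ** 2


result = coordinate_descent(grids, toy_score)
```

Because each sweep only moves along one axis at a time, the search can stop at a local maximum, which is consistent with the abstract's remark that the parameter spaces contain several of them.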

    MFT: Long-Term Tracking of Every Pixel

    We propose MFT (Multi-Flow dense Tracker), a novel method for dense, pixel-level, long-term tracking. The approach exploits optical flows estimated not only between consecutive frames, but also for pairs of frames at logarithmically spaced intervals. It selects the most reliable sequence of flows on the basis of estimates of its geometric accuracy and the probability of occlusion, both provided by a pre-trained CNN. We show that MFT achieves competitive performance on the TAP-Vid benchmark, outperforming baselines by a significant margin, and tracks densely orders of magnitude faster than state-of-the-art point-tracking methods. The method is insensitive to medium-length occlusions and is made more robust by estimating flow with respect to the reference frame, which reduces drift. Comment: accepted to WACV 2024. Code at https://github.com/serycjon/MF
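As a rough illustration of the two ideas named in this abstract, the sketch below lists logarithmically spaced frame gaps and then picks one candidate flow by its estimated reliability. This is not the MFT implementation: the function names, the `max_delta` cap, the tuple layout, and the 0.5 occlusion cut-off are all assumptions, and the abstract does not state the actual selection rule.

```python
from typing import List, Tuple


def log_spaced_deltas(t: int, max_delta: int = 32) -> List[int]:
    """Frame gaps 1, 2, 4, ... that stay within the frames seen so far."""
    deltas, d = [], 1
    while d <= min(t, max_delta):
        deltas.append(d)
        d *= 2
    return deltas


# Hypothetical candidate layout: (delta, estimated uncertainty, occlusion probability).
Candidate = Tuple[int, float, float]


def pick_most_reliable(candidates: List[Candidate]) -> int:
    """Return the delta of the least uncertain candidate that looks visible."""
    visible = [c for c in candidates if c[2] < 0.5]
    pool = visible or candidates  # fall back if every candidate looks occluded
    return min(pool, key=lambda c: c[1])[0]


deltas = log_spaced_deltas(10)
chosen = pick_most_reliable([(1, 0.9, 0.1), (2, 0.3, 0.8), (4, 0.5, 0.2)])
```

In this toy input the candidate with gap 2 has the lowest uncertainty but is judged likely occluded, so the gap-4 candidate is chosen instead.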

    Generation of a Land Cover Atlas of environmental critic zones using unconventional tools

    The abstract is in the attachment.

    Text Similarity Between Concepts Extracted from Source Code and Documentation

    Context: Constant evolution in software systems often results in their documentation losing sync with the content of the source code. The traceability research field has often helped in the past by recovering links between code and documentation when the two fell out of sync. Objective: The aim of this paper is to compare the concepts contained within the source code of a system with those extracted from its documentation, in order to detect how similar these two sets are. If they are vastly different, the difference between the two sets might indicate considerable ageing of the documentation and a need to update it. Methods: In this paper we reduce the source code of 50 software systems to sets of key terms, each containing the concepts of one of the systems sampled. At the same time, we reduce the documentation of each system to another set of key terms. We then use four different approaches for set comparison to detect how similar the sets are. Results: Using the well-known Jaccard index as the benchmark for the comparisons, we have discovered that the cosine distance has excellent comparative power, depending on the pre-training of the machine learning model. In particular, the SpaCy and the FastText embeddings offer up to 80% and 90% similarity scores, respectively. Conclusion: For most of the sampled systems, the source code and the documentation tend to contain very similar concepts. Given the accuracy of one pre-trained model (e.g., FastText), it also becomes evident that a few systems show a measurable drift between the concepts contained in the documentation and in the source code.
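The two kinds of comparison mentioned in this abstract can be sketched on toy term lists. The Jaccard index is the standard set-overlap measure. The cosine part is a simplification: the paper computes it over pre-trained embeddings (SpaCy, FastText), whereas this sketch swaps in plain term-count vectors so that it stays self-contained. The term lists are invented for illustration.

```python
import math
from collections import Counter
from typing import Iterable


def jaccard(a: Iterable[str], b: Iterable[str]) -> float:
    """Size of the intersection divided by size of the union of two term sets."""
    sa, sb = set(a), set(b)
    if not sa and not sb:
        return 1.0  # two empty sets are treated as identical
    return len(sa & sb) / len(sa | sb)


def cosine_similarity(a: Iterable[str], b: Iterable[str]) -> float:
    """Cosine of the angle between term-count vectors (not embeddings)."""
    ca, cb = Counter(a), Counter(b)
    dot = sum(ca[t] * cb[t] for t in ca.keys() & cb.keys())
    norm_a = math.sqrt(sum(v * v for v in ca.values()))
    norm_b = math.sqrt(sum(v * v for v in cb.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


# Hypothetical key terms extracted from code and from documentation.
code_terms = ["parser", "token", "cache", "index"]
doc_terms = ["parser", "token", "index", "tutorial"]

j = jaccard(code_terms, doc_terms)            # 3 shared terms out of 5 distinct
c = cosine_similarity(code_terms, doc_terms)  # 3 / (2 * 2)
```

A low value on either measure would be the kind of signal the abstract associates with documentation drifting away from the code.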

    Smoke plume segmentation of wildfire images

    This work falls within the field of neural networks in deep learning. The aim of the project is to analyse and apply currently available neural networks to solve a specific problem: the segmentation of smoke plumes in forest fires. A study was carried out of the neural networks used to solve image segmentation problems, together with a subsequent 3D reconstruction of these smoke plumes. The algorithm finally chosen is the UNet model, a convolutional neural network based on the autoencoder structure with skip connections, which learns from data to predict the class to be segmented, in this case smoke plumes. A comparison between traditional algorithms and the UNet model was then carried out, showing that the UNet model achieves the best results both quantitatively and qualitatively, although it requires more computing time. All these models were developed in the Python programming language using the TensorFlow and Keras machine learning libraries. Within the UNet model, multiple experiments were carried out to find the hyperparameter values best suited to the project, obtaining an accuracy of 93.45% in the final model for smoke segmentation in wildfire images.
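The abstract reports a single accuracy figure without saying how it was computed. As a hedged illustration only, the sketch below assumes it is a pixel-wise accuracy on binary smoke masks and also shows intersection-over-union, a common companion metric for segmentation. The tiny masks are invented and the functions are not taken from the project.

```python
from typing import Sequence

Mask = Sequence[Sequence[int]]  # rows of 0 (background) and 1 (smoke)


def pixel_accuracy(pred: Mask, truth: Mask) -> float:
    """Fraction of pixels whose predicted label matches the ground truth."""
    total = correct = 0
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            total += 1
            correct += int(p == t)
    return correct / total


def iou(pred: Mask, truth: Mask) -> float:
    """Intersection over union of the smoke class."""
    inter = union = 0
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            inter += int(p == 1 and t == 1)
            union += int(p == 1 or t == 1)
    return inter / union if union else 1.0


# Hypothetical 2x3 masks.
pred = [[0, 1, 1], [0, 0, 1]]
truth = [[0, 1, 0], [0, 1, 1]]

acc = pixel_accuracy(pred, truth)  # 4 of 6 pixels agree
overlap = iou(pred, truth)         # 2 shared smoke pixels out of 4 in the union
```

Pixel accuracy can look high when smoke covers only a small part of the image, which is why an overlap measure is often reported alongside it.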

    Remote sensing tools for the objective quantification of tree structural condition from individual trees to landscape scale assessment

    Tree management is the practice of protecting and caring for trees for sustainable, defined objectives. However, there are often conflicts between maintaining trees and the obligation to protect targets, such as people or infrastructure, from the risks associated with the failure of trees and major limbs. Where there are targets worthy of protection, tree structural condition is typically monitored relative to the prescribed management objectives. Traditionally, field methods for capturing data on tree structural condition are manual, with a tree surveyor taking very limited direct measurements, and only from parts of the tree that are within reach from the ground. Consequently, large sections of the tree remain unmeasured due to the logistical complications of accessing the aerial structure. The surveyor therefore estimates tree part sizes, approximates counts of relevant tree features and uses personal interpretation to infer the significance of the observations. These techniques are time-consuming, logistically demanding and largely subjective. This thesis addresses the limitations of traditional methods by developing remote sensing (RS) tools for assessing tree structural condition, in order to inform tree management interventions. For individual trees, a proximal photogrammetry technique is developed for objectively quantifying tree structural condition by measuring the self-affinity of tree crowns in fractal dimensions. This can place the complexity of an individual tree crown along a structural condition continuum, which is more effective than the traditional categorical approach for monitoring tree condition. Moving out in scale, a framework is developed which optimises the match-pairing agreement between ground reference tree data and RS-derived individual tree crown (ITC) delineations in order to quantify the accuracy of different ITC delineation algorithms.
The framework is then used to identify an optimal ITC delineation algorithm, which is applied to aerial laser scanning data to map individual trees and extract a point cloud for each tree. Metrics are then derived from the point cloud to classify a tree according to its structural condition, a process which is then applied to the tree population across an entire landscape. This provides information with which to spatially optimise tree survey and management resources, improve the decision-making process and move towards proactive tree management. The research presented in this thesis develops RS tools for assessing tree structural condition at a range of investigative scales. These objective, data-rich tools will enable resource-limited tree managers to direct remedial interventions in an optimised and precise way.