
    Curriculum Dropout

    Dropout is a very effective way of regularizing neural networks. Stochastically "dropping out" units with a certain probability discourages over-specific co-adaptations of feature detectors, preventing overfitting and improving network generalization. Moreover, Dropout can be interpreted as an approximate model-aggregation technique, in which an exponential number of smaller networks are averaged to obtain a more powerful ensemble. In this paper, we show that using a fixed dropout probability during training is a suboptimal choice. We instead propose a time schedule for the probability of retaining neurons in the network. This induces an adaptive regularization scheme that smoothly increases the difficulty of the optimization problem. This idea of "starting easy" and adaptively increasing the difficulty of the learning problem has its roots in curriculum learning and allows one to train better models. Indeed, we prove that our optimization strategy implements a very general curriculum scheme by gradually adding noise to both the input and the intermediate feature representations within the network architecture. Experiments on seven image classification datasets and different network architectures show that our method, named Curriculum Dropout, frequently yields better generalization and, at worst, performs just as well as standard Dropout.
    Comment: Accepted at ICCV (International Conference on Computer Vision) 2017
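
    A minimal sketch of the kind of retain-probability schedule the abstract describes, assuming an exponential decay from 1.0 (no dropout, the "easy" problem) towards a target value; the function name, p_final, and gamma are illustrative choices, not the paper's exact parameterization:

```python
import numpy as np

def curriculum_retain_prob(t, p_final=0.5, gamma=1e-3):
    """Scheduled probability of *retaining* a unit at training step t.

    Starts at 1.0 (no dropout) and decays smoothly towards p_final,
    gradually increasing the injected noise and hence the difficulty
    of the optimization problem. gamma (an assumed knob) controls how
    quickly full dropout strength is reached.
    """
    return (1.0 - p_final) * np.exp(-gamma * t) + p_final

# Example: retain probability after 0, 1k, and 10k steps.
for step in (0, 1_000, 10_000):
    print(step, round(curriculum_retain_prob(step), 3))  # 1.0, 0.684, 0.5
```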

    Tempering the Adversary: An Exploration into the Applications of Game Theoretic Feature Selection and Regression

    Most modern machine learning algorithms take an average-case approach, in which every data point contributes the same amount of influence towards the fit of a model. The per-data-point errors (or losses) are averaged into an overall loss that is typically minimized as the objective function. However, this can be insensitive to valuable outliers. Inspired by game theory, the goal of this work is to explore the utility of incorporating an optimally-playing adversary into feature selection and regression frameworks. The adversary assigns weights to the data elements so as to degrade the modeler's performance in an optimal manner, thereby forcing the modeler to construct a more robust solution. A tuning parameter tempers the power wielded by the adversary, allowing us to explore the spectrum between the average case and the worst case. Because our method is formulated as a linear program, it can be solved efficiently and can accommodate sub-population constraints, a feature that other related methods cannot easily implement. The need to build models while understanding the influence of sub-population constraints is particularly prominent in the biomedical literature, and although our method was developed in response to the sub-population data and outliers that are ubiquitous in this realm, it is generic and can be applied to data sets that are not exclusively biomedical in nature. We additionally explore the implementation of our method as an adversarial regression problem: instead of providing the user with a single fitted parameter vector, we provide an ensemble of parameters that can be tuned based on sensitivity to outliers and various sub-population constraints. Finally, to help foster a better understanding of various data sets, we discuss potential automated applications of our method that will enable data scientists to explore underlying relationships and sensitivities that may be a consequence of sub-populations and meaningful outliers.
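
    One standard LP-solvable instance of such a tempered adversary is a "worst alpha-fraction" (CVaR-style) average of per-point losses, where the adversary concentrates weight on the hardest examples and alpha tempers its power; this sketch is illustrative, not necessarily the paper's exact linear program:

```python
import numpy as np

def tempered_adversarial_loss(losses, alpha=0.3):
    """Average of the worst alpha-fraction of per-point losses.

    alpha = 1.0 recovers the ordinary mean (average case); alpha -> 0
    approaches the single worst loss (worst case). The name and the
    parameterization are assumptions for illustration.
    """
    losses = np.sort(np.asarray(losses))[::-1]     # hardest examples first
    k = max(1, int(np.ceil(alpha * losses.size)))  # adversary's weight budget
    return losses[:k].mean()

losses = [0.1, 0.2, 0.4, 3.0]                        # one large outlier
print(tempered_adversarial_loss(losses, alpha=1.0))  # 0.925 (plain mean)
print(tempered_adversarial_loss(losses, alpha=0.25)) # 3.0   (worst case)
```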

    Application of machine learning algorithms to the classification and recognition of particles in multiphase fluids

    Digital holography has emerged as a potentially useful tool for biomedical applications, although certain difficulties prevent it from reaching its full potential. In particular, locating and classifying particles of different sizes in a fluid, from the three-dimensional information obtained by recording them in a hologram, is especially difficult when the particle density is high and/or the analysed volumes are large. This task is the first step towards the development of new tools against cancer and towards three-dimensional analyses of blood flow. This work proposes a machine-learning-based solution to this problem. The input data, generated with a hologram-synthesis algorithm, and suitable output planes are chosen to meet this goal. A U-Net-based architecture is adapted, contributing several novelties. An appropriate training method is selected for the model, which is trained repeatedly until the best hyperparameters are found. The model's results are very positive, and the possibility of extrapolating them to real cases is discussed. Finally, some avenues for further improving the algorithm are suggested, together with the challenges the technique must still overcome to be directly useful in biomedicine.
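
    A minimal PyTorch sketch of a small U-Net-style encoder-decoder of the kind the work adapts, mapping a hologram to a per-pixel particle map; the depth, channel sizes, and single-channel output head are assumptions, not the thesis's architecture:

```python
import torch
import torch.nn as nn

def conv_block(c_in, c_out):
    # Two 3x3 convolutions with ReLU: the basic U-Net building block.
    return nn.Sequential(
        nn.Conv2d(c_in, c_out, 3, padding=1), nn.ReLU(inplace=True),
        nn.Conv2d(c_out, c_out, 3, padding=1), nn.ReLU(inplace=True),
    )

class TinyUNet(nn.Module):
    """Two-level U-Net sketch: hologram in, per-pixel particle logits out."""
    def __init__(self, in_ch=1, out_ch=1):
        super().__init__()
        self.enc1 = conv_block(in_ch, 16)
        self.enc2 = conv_block(16, 32)
        self.pool = nn.MaxPool2d(2)
        self.up = nn.ConvTranspose2d(32, 16, 2, stride=2)
        self.dec1 = conv_block(32, 16)   # 16 skip channels + 16 upsampled
        self.head = nn.Conv2d(16, out_ch, 1)

    def forward(self, x):
        e1 = self.enc1(x)                             # full resolution
        e2 = self.enc2(self.pool(e1))                 # half resolution
        d1 = self.dec1(torch.cat([self.up(e2), e1], dim=1))
        return self.head(d1)                          # per-pixel logits

net = TinyUNet()
print(net(torch.randn(1, 1, 64, 64)).shape)           # torch.Size([1, 1, 64, 64])
```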

    Using artificial intelligence to monitor the state of the machine

    This thesis focuses on monitoring the most heavily stressed parts of a machine tool. The artificial-intelligence method used is a recurrent neural network (RNN) and its modifications, chosen because the sensor data are sequential in character. The thesis addresses three problems. In the first, the algorithm estimates the wear of a milling tool using an indirect method based on a recurrent neural network. The second problem focuses on detecting a bearing fault from accelerometer data and classifying it into a specific category. In the third, an RNN is used to predict the remaining useful life (RUL) of the monitored bearing.
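
    A minimal PyTorch sketch of the kind of recurrent model described, mapping a window of sequential sensor readings to an RUL estimate; the layer sizes, window length, and feature count are assumptions, not the thesis's setup:

```python
import torch
import torch.nn as nn

class RULRegressor(nn.Module):
    """LSTM that maps a window of sensor readings to remaining useful life."""
    def __init__(self, n_features=3, hidden=64):
        super().__init__()
        self.lstm = nn.LSTM(n_features, hidden, batch_first=True)
        self.head = nn.Linear(hidden, 1)

    def forward(self, x):                      # x: (batch, time, features)
        _, (h, _) = self.lstm(x)               # h: final hidden state
        return self.head(h[-1]).squeeze(-1)    # one RUL value per sequence

model = RULRegressor()
window = torch.randn(8, 200, 3)   # 8 windows of 200 accelerometer samples
print(model(window).shape)        # torch.Size([8])
```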

    Urban air pollution modelling with machine learning using fixed and mobile sensors

    Detailed air quality (AQ) information is crucial for sustainable urban management, and many regions of the world have built static AQ monitoring networks to provide it. However, such networks can only monitor region-level AQ conditions or sparse point-based pollutant measurements; they cannot capture urban dynamics with high-resolution spatio-temporal variation across a region. Without pollution details, citizens cannot make fully informed decisions when choosing their everyday outdoor routes or activities, and policy-makers can only make macroscopic regulating decisions on controlling pollution-triggering factors and emission sources. Increasing research effort has been devoted to mobile and ubiquitous sampling campaigns, as they are deemed the more economically and operationally feasible way to collect urban AQ data with high spatio-temporal resolution. This research proposes a machine-learning-based AQ inference (Deep AQ) framework from a data-driven perspective, consisting of data pre-processing, feature extraction and transformation, and pixelwise (grid-level) AQ inference. The Deep AQ framework can integrate AQ measurements from fixed monitoring sites (temporally dense but spatially sparse) and from mobile low-cost sensors (temporally sparse but spatially dense). While instantaneous pollutant concentration varies in the micro-environment, this research samples representative values in each grid-cell unit and achieves AQ inference at a 1 km × 1 km pixelwise scale. The predictive power of the Deep AQ framework is explored with samples from only 40 fixed monitoring sites in Chengdu, China (4,900 km², 26 April - 12 June 2019) and with collaborative sampling from 28 fixed monitoring sites and 15 low-cost sensors mounted on taxis in Beijing, China (3,025 km², 19 June - 16 July 2018). The proposed framework produces high-resolution (1 km × 1 km, hourly) pixelwise AQ inference from multi-source AQ samples (fixed or mobile) and urban features (land use, population, traffic, meteorological information, etc.), and achieves reasonable and satisfactory accuracy in both urban cases despite very low spatio-temporal coverage (Chengdu: less than 1% coverage, SMAPE < 20%; Beijing: less than 5% coverage, SMAPE < 15%). Detailed outcomes and main conclusions are provided in this thesis on fixed and mobile sensing, spatio-temporal coverage and density, and the relative importance of urban features. Outcomes from this research help provide a scientific and detailed health-impact assessment framework for exposure analysis and inform policy-makers with data-driven evidence for sustainable urban management.
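
    A toy sketch of the pixelwise framing under stated assumptions: samples are snapped to a 1 km × 1 km grid, a representative value is taken per (cell, hour), and a generic regressor infers unobserved cells. The column names, the tiny data set, and the gradient-boosting model are illustrative stand-ins, not the Deep AQ framework itself:

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import GradientBoostingRegressor

# Hypothetical samples from fixed sites and taxi-mounted sensors.
obs = pd.DataFrame({
    "x_km": [0.2, 0.7, 1.4, 1.6],   # sample position
    "y_km": [0.1, 0.9, 0.3, 1.8],
    "hour": [8, 8, 8, 9],
    "pm25": [35.0, 40.0, 22.0, 18.0],
})

# Step 1: snap each sample to a 1 km x 1 km cell and take a
# representative value per (cell, hour), as in the pixelwise framing.
obs["gx"] = np.floor(obs["x_km"]).astype(int)
obs["gy"] = np.floor(obs["y_km"]).astype(int)
cells = obs.groupby(["gx", "gy", "hour"], as_index=False)["pm25"].mean()

# Step 2: fit a regressor on observed cells, then predict unobserved
# cells (the inference step; real features would include land use,
# population, traffic, and meteorology rather than raw coordinates).
X, y = cells[["gx", "gy", "hour"]], cells["pm25"]
model = GradientBoostingRegressor().fit(X, y)
query = pd.DataFrame([[1, 1, 9]], columns=["gx", "gy", "hour"])
print(model.predict(query))   # inferred PM2.5 for an unsampled cell-hour
```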