4 research outputs found

    Aplicación de técnicas de clasificación a la detección de cáncer

    This Bachelor's thesis (Trabajo Fin de Grado) presents a comparative study of several statistical classification methods, from both a theoretical and an applied point of view. The work is divided into three chapters. Chapter 1 gives a brief introduction to machine learning, focusing on classification techniques and distinguishing between parametric and non-parametric methods. Chapter 2 is a methodological review of some of the most important classifiers. It begins with the parametric ones: logistic regression and discriminant analysis. For logistic regression, the model, the estimation of its coefficients, and prediction are covered in both the simple and the multiple setting. Linear Discriminant Analysis (LDA) is introduced as a classifier based on Bayes' theorem, for one and for several predictors. Next, classification based on Support Vector Machines (SVM) is studied. This is a non-parametric approach in which the classification problem is reduced to a potentially small subset of the observations in the training set. Compared with the parametric classifiers, support vector machines are fairly robust. Chapter 2 concludes with measures for evaluating the quality of a classifier: error rate and training error, the bias-variance trade-off, resampling methods based on cross-validation, methods for model evaluation and selection, and classification-specific measures such as sensitivity, specificity, the ROC curve, and AUC. In Chapter 3, these methods and measures are applied to the Wisconsin breast cancer diagnosis dataset, available on Kaggle. A descriptive study of the data is carried out, outliers are detected, and variable selection methods are applied to keep the variables with the greatest discriminatory power. The data are split into training and test sets, and the classifiers (logistic regression, LDA, and SVM) are applied to them. The resulting accuracy measures are obtained and compared. The statistical analysis was carried out using R and its libraries. Universidad de Sevilla. Grado en Estadístic
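The classification-specific measures named in this abstract (sensitivity, specificity, AUC) can be illustrated with a minimal sketch. It is written in plain Python rather than the R used in the thesis, and the function names, labels, and scores are hypothetical values chosen for illustration, not results from the work.

```python
from typing import Sequence, Tuple


def confusion_counts(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[int, int, int, int]:
    """Return (tp, fp, tn, fn) for binary labels, where 1 is the positive class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, fp, tn, fn


def sensitivity(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """True positive rate: fraction of actual positives predicted positive."""
    tp, _, _, fn = confusion_counts(y_true, y_pred)
    return tp / (tp + fn)


def specificity(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """True negative rate: fraction of actual negatives predicted negative."""
    _, fp, tn, _ = confusion_counts(y_true, y_pred)
    return tn / (tn + fp)


def auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """Area under the ROC curve, computed as the probability that a random
    positive scores higher than a random negative (ties count one half)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical labels (1 = malignant) and classifier scores, for illustration only.
labels = [1, 1, 1, 0, 0, 0, 0, 1]
scores = [0.9, 0.8, 0.4, 0.3, 0.2, 0.6, 0.1, 0.7]
predicted = [1 if s >= 0.5 else 0 for s in scores]  # 0.5 is an assumed cut-off

print(sensitivity(labels, predicted), specificity(labels, predicted), auc(labels, scores))
```

Sensitivity and specificity depend on the chosen cut-off, whereas AUC summarises the ranking quality of the scores across all cut-offs, which is why the two kinds of measure are usually reported together.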

    Real-time detection of uncalibrated sensors using Neural Networks

    Nowadays, sensors play a major role in several contexts, such as science, industry and daily life, all of which benefit from their use. However, the retrieved information must be reliable. Anomalies in the behavior of sensors can have critical consequences, such as ruining a scientific project or jeopardizing production quality on industrial lines. One of the more subtle kinds of anomaly is uncalibration. An uncalibration is said to take place when the sensor is not adjusted or standardized by calibration according to a ground truth value. In this work, an online machine-learning-based uncalibration detector for temperature, humidity and pressure sensors was developed. This solution integrates an Artificial Neural Network as its main component, which learns from the behavior of the sensors under calibrated conditions. Then, after being trained and deployed, it detects uncalibrations once they take place. The results show that the proposed solution is able to detect uncalibrations for deviation values of 0.25 degrees, 1% RH and 1.5 Pa, respectively. This solution can be adapted to different contexts by means of transfer learning, which allows for the addition of new sensors, deployment in new environments and retraining of the model with minimal amounts of data.
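The abstract does not say how the detector decides that an uncalibration has occurred, so the following is only a rough sketch of one common residual-based approach, not the authors' method. It assumes that some model (here a plain list of expected values standing in for the neural network's output) predicts what a calibrated sensor would read, and it flags the first sample at which the rolling mean residual exceeds a threshold. The function name `first_uncalibration_index`, the window size and the sample data are all hypothetical.

```python
from collections import deque
from statistics import mean
from typing import Deque, Optional, Sequence


def first_uncalibration_index(
    readings: Sequence[float],
    expected: Sequence[float],
    threshold: float,
    window: int = 3,
) -> Optional[int]:
    """Return the index of the first sample at which the rolling mean of
    (reading - expected) exceeds `threshold` in absolute value, or None.

    `expected` stands in for the prediction of a model trained on calibrated
    data; producing it is outside the scope of this sketch.
    """
    recent: Deque[float] = deque(maxlen=window)
    for i, (reading, prediction) in enumerate(zip(readings, expected)):
        recent.append(reading - prediction)
        # Only decide once a full window of residuals is available.
        if len(recent) == window and abs(mean(recent)) > threshold:
            return i
    return None


# Hypothetical temperature trace: a 0.5-degree offset appears at sample 5.
expected_temps = [20.0] * 10
measured_temps = [20.0] * 5 + [20.5] * 5

print(first_uncalibration_index(measured_temps, expected_temps, threshold=0.25))
```

Averaging residuals over a window trades detection delay for robustness to single noisy samples; a longer window reacts later but raises fewer false alarms.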

    Real-time detection of uncalibrated sensors using neural networks

    Nowadays, sensors play a major role in several fields, such as science, industry and everyday technology. Therefore, the information received from the sensors must be reliable. If the sensors present any anomalies, serious problems can arise, such as publishing wrong theories in scientific papers or causing production delays in industry. One of the most common anomalies is uncalibration. An uncalibration occurs when the sensor is not adjusted or standardized by calibration according to a ground truth value. In this work, an online machine-learning-based uncalibration detector for temperature, humidity and pressure sensors is presented. This development integrates an artificial neural network as its main component, which learns from the behavior of the sensors under calibrated conditions. Then, after being trained and deployed, it detects uncalibrations once they take place. The results show that the proposed system is able to detect 100% of the presented uncalibration events, although the response time of the detection depends on the resolution of the model for the specific location, i.e., the minimum statistically significant variation in the sensor behavior that the system is able to detect. This architecture can be adapted to different contexts, such as new sensors or different environments, by applying transfer learning to retrain the model with a minimal amount of data. European Union (UE). H2020 VIMS Grant ID: 878757. Ministerio de Ciencia, Innovación y Universidades PID2019-105556GB-C33 (MIND-ROB). European Union H2020 CHIST-ERA SMALL (PCI2019-111841-2

    Real-time detection of uncalibrated sensors using Neural Networks

    Nowadays, sensors play a major role in several contexts, such as science, industry and daily life, all of which benefit from their use. However, the retrieved information must be reliable. Anomalies in the behavior of sensors can have critical consequences, such as ruining a scientific project or jeopardizing production quality on industrial lines. One of the more subtle kinds of anomaly is uncalibration. An uncalibration is said to take place when the sensor is not adjusted or standardized by calibration according to a ground truth value. In this work, an online machine-learning-based uncalibration detector for temperature, humidity and pressure sensors was developed. This solution integrates an Artificial Neural Network as its main component, which learns from the behavior of the sensors under calibrated conditions. Then, after being trained and deployed, it detects uncalibrations once they take place. The results show that the proposed solution is able to detect uncalibrations for deviation values of 0.25º, 1% RH and 1.5 Pa, respectively. This solution can be adapted to different contexts by means of transfer learning, which allows for the addition of new sensors, deployment in new environments and retraining of the model with minimal amounts of data. Ministerio de Ciencia, Innovación y Universidades PID2019-105556GB-C3