4 research outputs found

    Classifier selection with permutation tests

    This work presents a content-based recommender system for machine learning classifier algorithms. Given a new data set, it recommends the classifier likely to perform best, based on classifier performance over similar known data sets. Similarity is measured using a data set characterization that includes several state-of-the-art metrics covering physical structure, statistics, and information theory. A novelty with respect to prior work is the use of a robust approach based on permutation tests to directly assess whether a given learning algorithm is able to exploit the attributes in a data set to predict class labels; this approach is compared with the more commonly used F-score metric for evaluating classifier performance. To evaluate the approach, we conducted extensive experiments with 8 of the main machine learning classification methods under varying configurations and 65 binary data sets, leading to over 2331 experiments. Our results show that using the information from the permutation test clearly improves the quality of the recommendations. Peer reviewed. Postprint (author's final draft).
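The permutation-test idea in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the nearest-centroid classifier, the single 50/50 hold-out split, the toy Gaussian data, and the names `nearest_centroid_accuracy`, `permutation_test`, and `make_toy_data` are all assumptions made for illustration. The sketch scores a classifier on the real labels, rescores it many times on shuffled labels, and reports how often the shuffled score matches or beats the real one.

```python
import random


def nearest_centroid_accuracy(X_train, y_train, X_test, y_test):
    """Fit a nearest-centroid classifier (an illustrative stand-in) and return test accuracy."""
    centroids = {}
    for label in set(y_train):
        rows = [x for x, lab in zip(X_train, y_train) if lab == label]
        centroids[label] = [sum(col) / len(rows) for col in zip(*rows)]

    def predict(x):
        # Pick the class whose centroid is closest in squared Euclidean distance.
        return min(
            centroids,
            key=lambda c: sum((a - b) ** 2 for a, b in zip(x, centroids[c])),
        )

    hits = sum(predict(x) == lab for x, lab in zip(X_test, y_test))
    return hits / len(y_test)


def permutation_test(X, y, n_permutations=200, seed=0):
    """Return (observed accuracy, p-value) under the null that labels are unrelated to X."""
    rng = random.Random(seed)
    idx = list(range(len(y)))
    rng.shuffle(idx)
    cut = len(idx) // 2  # assumption: one fixed 50/50 hold-out split
    train, test = idx[:cut], idx[cut:]

    def score(labels):
        return nearest_centroid_accuracy(
            [X[i] for i in train], [labels[i] for i in train],
            [X[i] for i in test], [labels[i] for i in test],
        )

    observed = score(y)
    at_least_as_good = 0
    for _ in range(n_permutations):
        shuffled = y[:]
        rng.shuffle(shuffled)  # break any link between attributes and labels
        if score(shuffled) >= observed:
            at_least_as_good += 1
    # The +1 terms count the observed labelling itself, so the p-value is never 0.
    p_value = (at_least_as_good + 1) / (n_permutations + 1)
    return observed, p_value


def make_toy_data(seed=1):
    """Two well-separated 2-D Gaussian classes (synthetic, for illustration only)."""
    rng = random.Random(seed)
    X = [[rng.gauss(0, 1), rng.gauss(0, 1)] for _ in range(40)]
    X += [[rng.gauss(3, 1), rng.gauss(3, 1)] for _ in range(40)]
    y = [0] * 40 + [1] * 40
    return X, y
```

Calling `permutation_test(*make_toy_data())` returns the observed accuracy together with a p-value. A small p-value suggests the classifier is exploiting real structure in the attributes rather than chance agreement with the labels.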

    Gearbox fault detection using the Support Vector Machine (SVM) machine learning method

    The objective of this research was to create a predictive model using a machine learning approach and to verify its effectiveness in automatically classifying and detecting faults in gearboxes. A data set of vibration signals obtained from the Open Energy Data Initiative (OEDI) repository of the US Department of Energy was used. The model was built with the supervised machine learning method Support Vector Machine (SVM), and the preprocessing and analysis of the data set were performed in Python. Features in the time domain and frequency domain were extracted from the data set. To select the best features, Recursive Feature Elimination with Cross-Validation (RFECV) was applied. For the SVM classifier, the data were split into 70% for training and 30% for testing. Three fault detection models were obtained: a first model using data collected by four accelerometers under a load of 50%; a second model combining data collected by four accelerometers with loads ranging from 0 to 90%; and a third model using the data from a single accelerometer of model two. Each model was trained and tested with excellent results, reaching an accuracy of 99.84% and a precision of 99.82% for the best model. The results show that the method classifies and predicts faults with high accuracy and precision, making it a promising method and a valuable contribution to industrial maintenance. It is recommended to reduce and standardize the feature set, which reduces the computational load and in turn improves the performance of the model.
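The abstract does not list which time-domain and frequency-domain features were extracted, so the following is only a sketch of what such a step might look like. The chosen features (RMS, peak, crest factor, kurtosis, dominant frequency) and the function names `time_domain_features` and `dominant_frequency` are assumptions, and a naive DFT is used so that the example needs nothing beyond the standard library.

```python
import cmath
import math


def time_domain_features(signal):
    """Compute a few common vibration statistics (illustrative choice of features)."""
    n = len(signal)
    mean = sum(signal) / n
    rms = math.sqrt(sum(v * v for v in signal) / n)
    peak = max(abs(v) for v in signal)
    var = sum((v - mean) ** 2 for v in signal) / n
    # Non-excess kurtosis: fourth central moment divided by squared variance.
    kurtosis = (
        sum((v - mean) ** 4 for v in signal) / (n * var ** 2) if var > 0 else 0.0
    )
    return {
        "rms": rms,
        "peak": peak,
        "crest_factor": peak / rms if rms > 0 else 0.0,
        "kurtosis": kurtosis,
    }


def dominant_frequency(signal, sample_rate):
    """Return the frequency (Hz) of the largest DFT bin, ignoring the DC component."""
    n = len(signal)
    mean = sum(signal) / n
    best_k, best_mag = 0, 0.0
    for k in range(1, n // 2 + 1):
        # Naive O(n^2) DFT; fine for short windows, an FFT is preferable in practice.
        coeff = sum(
            (v - mean) * cmath.exp(-2j * math.pi * k * t / n)
            for t, v in enumerate(signal)
        )
        if abs(coeff) > best_mag:
            best_k, best_mag = k, abs(coeff)
    return best_k * sample_rate / n
```

Each signal window would yield one feature vector (for example, the four time-domain values plus the dominant frequency), and those vectors would then be passed to feature selection and the classifier.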
