Decision support system for fault management in industrial equipment, based on ensemble methods
Faults in industrial equipment are critical events for any organization. Classifying and characterizing them is an important factor supporting decision-making in maintenance activities. Data mining has played a significant role in evaluating and classifying the faults that occur. Algorithms based on Bayesian networks and decision trees have been used, individually and in combination, to build hybrid classification models for evaluating and characterizing faults. This work proposes the development of hybrid models using the ensemble methods Grading and Vote, combining Bayesian network techniques (BayesNet and NaiveBayesUpdateable) and decision trees (RandomTree). The accuracy of the ensemble methods with the different algorithms is determined through experiments on the same partitioned data set. Sociedad Argentina de Informática e Investigación Operativ
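As a rough illustration of how a voting ensemble can combine Bayesian and tree-based classifiers, the sketch below implements two common combination rules: majority voting and averaging of class probabilities. It is a minimal sketch, not the authors' implementation. The fault labels (`mechanical`, `electrical`) and the three probability outputs are made-up stand-ins for whatever the base classifiers would produce, and the Grading method is not shown.

```python
from collections import Counter


def majority_vote(distributions):
    """Each base classifier votes for its most probable label; the most common vote wins."""
    if not distributions:
        raise ValueError("need at least one base classifier output")
    # sorted() makes tie-breaking inside a single distribution deterministic.
    votes = [max(sorted(d), key=d.get) for d in distributions]
    return Counter(votes).most_common(1)[0][0]


def average_of_probabilities(distributions):
    """Average the class probabilities over all base classifiers and pick the largest."""
    if not distributions:
        raise ValueError("need at least one base classifier output")
    labels = sorted(set().union(*distributions))
    n = len(distributions)
    averaged = {
        label: sum(d.get(label, 0.0) for d in distributions) / n
        for label in labels
    }
    return max(labels, key=averaged.get), averaged


# Hypothetical outputs of three base classifiers for one fault record.
outputs = [
    {"mechanical": 0.60, "electrical": 0.40},  # stand-in for a Bayesian network
    {"mechanical": 0.30, "electrical": 0.70},  # stand-in for naive Bayes
    {"mechanical": 0.55, "electrical": 0.45},  # stand-in for a random tree
]

print(majority_vote(outputs))                # mechanical (two of three votes)
print(average_of_probabilities(outputs)[0])  # electrical (higher mean probability)
```

The two rules disagree on this toy record, which is one reason the choice of combination rule is usually evaluated experimentally on the same data partitions.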
Machine Learning Classification of Females Susceptibility to Visceral Fat Associated Diseases
Classifying subjects into risk categories is a common challenge in medical research. Machine Learning (ML) methods are widely used in the areas of risk prediction and classification. The primary objective of these algorithms is to predict dichotomous responses (e.g. healthy/at risk) based on several features. Like statistical inference models, ML models are subject to the common problem of class imbalance: they are affected by the majority class, which increases the false-negative rate.
In this paper, we built and evaluated eighteen ML models classifying approximately 4300 female participants from the UK Biobank into three categorical risk statuses based on responses for the discretised visceral adipose tissue values from magnetic resonance imaging. We also examined the effect of sampling techniques on classification modelling when dealing with class imbalance.
Results showed that the use of sampling techniques had a significant impact. They not only drove an improvement in predicting patients' risk status but also facilitated an increase in the information contained within each variable. Based on domain experts' criteria, the three best models for classification were finally identified.
These encouraging results will guide further developments of classification models for predicting visceral adipose tissue without the need for a costly scan.
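The abstract does not say which sampling techniques were used. As one simple possibility, the sketch below shows random oversampling, which duplicates minority-class rows until every class is as large as the biggest one. The function name, the risk labels and the toy data are assumptions made for illustration only.

```python
import random
from collections import Counter


def random_oversample(rows, labels, seed=0):
    """Duplicate randomly chosen minority-class rows until all classes have equal size."""
    rng = random.Random(seed)
    by_class = {}
    for row, label in zip(rows, labels):
        by_class.setdefault(label, []).append(row)
    target = max(len(members) for members in by_class.values())

    out_rows, out_labels = [], []
    for label, members in by_class.items():
        # Sample with replacement to fill the gap up to the largest class.
        extra = [rng.choice(members) for _ in range(target - len(members))]
        for row in members + extra:
            out_rows.append(row)
            out_labels.append(label)
    return out_rows, out_labels


# Toy, made-up data: one feature per participant, three imbalanced risk classes.
labels = ["low"] * 6 + ["medium"] * 3 + ["high"] * 1
rows = [[float(i)] for i in range(len(labels))]

balanced_rows, balanced_labels = random_oversample(rows, labels)
print(Counter(labels))           # imbalanced: 6 / 3 / 1
print(Counter(balanced_labels))  # balanced: 6 / 6 / 6
```

Resampling of this kind is normally applied to the training partition only, so that the test data keep the original class distribution.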
Ensemble approach combining multiple methods improves human transcription start site prediction
Dineen DG, Schroeder M, Higgins DG, Cunningham P. Ensemble approach combining multiple methods improves human transcription start site prediction. BMC Genomics. 2010;11(1):677.
Background: The computational prediction of transcription start sites is an important unsolved problem. Some recent progress has been made, but many promoters, particularly those not associated with CpG islands, are still difficult to locate using current methods. These methods use different features and training sets, along with a variety of machine learning techniques, and result in different prediction sets. Results: We demonstrate the heterogeneity of current prediction sets, and take advantage of this heterogeneity to construct a two-level classifier ('Profisi Ensemble') using predictions from 7 programs, along with 2 other data sources. Support vector machines using 'full' and 'reduced' data sets are combined in an either/or approach. We achieve a 14% increase in performance over the current state-of-the-art, as benchmarked by a third-party tool. Conclusions: Supervised learning methods are a useful way to combine predictions from diverse sources.
Combining a Baiting and a User Search Profiling Techniques for Masquerade Detection
Masquerade attacks are characterized by an adversary stealing a legitimate user's credentials and using them to impersonate the victim and perform malicious activities, such as stealing information. Prior work on masquerade attack detection has focused on profiling legitimate user behavior and detecting abnormal behavior indicative of a masquerade attack. Like any anomaly-detection-based technique, detecting masquerade attacks by profiling user behavior suffers from a significant number of false positives. We extend prior work and provide a novel integrated detection approach in this paper. We combine a user behavior profiling technique with a baiting technique in order to more accurately detect masquerade activity. We show that using this integrated approach reduces the false positives by 36% when compared to user behavior profiling alone, while achieving almost perfect detection results. We also show how this combined detection approach serves as a mechanism for hardening the masquerade attack detector against mimicry attacks.
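The abstract does not describe how the two signals are combined. The sketch below shows one hypothetical combination rule, purely to illustrate why a second, independent signal can cut false positives: a session is flagged if its anomaly score is very high, or if it is moderately high and a decoy (bait) file was touched. The thresholds, the rule itself and the session data are all assumptions, not the authors' detector or results.

```python
def profile_alert(score, threshold=0.5):
    """Behavior profiling alone: flag any session whose anomaly score reaches the threshold."""
    return score >= threshold


def combined_alert(score, decoy_touched, threshold=0.5, strict_threshold=0.9):
    """Hypothetical rule: a very high score alone, or a moderate score plus a decoy touch."""
    return score >= strict_threshold or (score >= threshold and decoy_touched)


def evaluate(alerts, is_attack):
    """Return (detected attacks, false positives) for a list of alert decisions."""
    detected = sum(1 for a, truth in zip(alerts, is_attack) if a and truth)
    false_pos = sum(1 for a, truth in zip(alerts, is_attack) if a and not truth)
    return detected, false_pos


# Made-up sessions: (anomaly score, decoy touched?, really an attack?)
sessions = [
    (0.90, True, True),
    (0.60, True, True),
    (0.95, False, True),
    (0.55, False, False),
    (0.70, False, False),
    (0.20, False, False),
    (0.65, True, False),   # a legitimate user who stumbles on a decoy
    (0.10, False, False),
]
truth = [attack for _, _, attack in sessions]

profile_result = evaluate([profile_alert(s) for s, _, _ in sessions], truth)
combined_result = evaluate([combined_alert(s, d) for s, d, _ in sessions], truth)
print(profile_result)   # (3, 3): all attacks caught, three false positives
print(combined_result)  # (3, 1): all attacks caught, one false positive
```

On this toy data the combined rule keeps every detection while dropping two of the three false alarms; real figures depend entirely on the data and on how the detectors are tuned.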
A Meta-Learning Method for Concept Drift
The knowledge hidden in evolving data may change with time; this issue is known as concept drift. It often causes a learning system to decrease its prediction accuracy. Most existing techniques apply ensemble methods to improve learning performance on concept drift. In this paper, we propose a novel meta-learning approach for this issue and develop a method: Multi-Step Learning (MSL). In our method, an MSL learner is structured in a recursive manner and contains all the base learners maintained in a hierarchy, ensuring the learned concepts are traceable. We evaluated MSL and two ensemble techniques on three synthetic datasets, which contain a number of drastic concept drifts. The experimental results show that the proposed method generally performs better than the ensemble techniques in terms of prediction accuracy.
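To make the effect of concept drift concrete, the sketch below builds a small synthetic stream whose labelling rule flips halfway through and measures the accuracy of a model that never adapts. It only illustrates the accuracy drop mentioned in the abstract; it does not implement MSL or any ensemble method, and the stream, the drift point and the model are invented for the example.

```python
def make_stream(n=200, drift_at=100):
    """Synthetic stream: the label is (x >= 0.5) before the drift and (x < 0.5) after it."""
    stream = []
    for t in range(n):
        x = (t % 10) / 10
        label = (x >= 0.5) if t < drift_at else (x < 0.5)
        stream.append((x, label))
    return stream


def static_model(x):
    """A model fitted to the first concept that is never updated."""
    return x >= 0.5


def windowed_accuracy(stream, model, window=50):
    """Accuracy of the model on consecutive, non-overlapping windows of the stream."""
    accuracies = []
    for start in range(0, len(stream), window):
        chunk = stream[start:start + window]
        correct = sum(1 for x, y in chunk if model(x) == y)
        accuracies.append(correct / len(chunk))
    return accuracies


accuracy_per_window = windowed_accuracy(make_stream(), static_model)
print(accuracy_per_window)  # [1.0, 1.0, 0.0, 0.0]
```

The abrupt fall after the second window is the kind of drastic drift that adaptive methods are meant to detect and recover from.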
Machine Learning Prediction of Susceptibility to Visceral Fat Associated Diseases
Classifying subjects into risk categories is a common challenge in medical research. Machine Learning (ML) methods are widely used in the areas of risk prediction and classification. The primary objective of such algorithms is to use several features to predict dichotomous responses (e.g., healthy/at risk). Like statistical inference modelling, ML modelling is subject to the problem of class imbalance and is affected by the majority class, which increases the false-negative rate. In this study, we built and evaluated thirty-six ML models to classify approximately 4300 female and 4100 male participants from the UK Biobank into three categorical risk statuses based on discretised visceral adipose tissue (VAT) measurements from magnetic resonance imaging. We also examined the effect of sampling techniques on the models when dealing with class imbalance. The sampling techniques used had a significant impact on the classification and improved risk status prediction by facilitating an increase in the information contained within each variable. Based on domain expert criteria, the three best classification models for visceral fat prediction in the female and male cohorts were identified. The area under the receiver operating characteristic curve (AUC) of the models, tested with external data, ranged from 0.78 to 0.89 for females and from 0.75 to 0.86 for males. These encouraging results will be used to guide further development of models to enable prediction of VAT value. This will be useful to identify individuals with excess VAT volume who are at risk of developing metabolic disease, ensuring relevant lifestyle interventions can be appropriately targeted.
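For readers unfamiliar with the metric, the area under the ROC curve for a binary problem equals the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative one, counting ties as one half. The sketch below computes it directly from that definition. It is an illustration with made-up scores; the study has three risk classes, so its figures were presumably obtained with a multi-class scheme that the abstract does not specify.

```python
def roc_auc(labels, scores):
    """Binary AUC as the fraction of (positive, negative) pairs ranked correctly."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative example")

    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a correct ranking
    return wins / (len(positives) * len(negatives))


# Made-up example: 1 = at risk, 0 = healthy.
example_labels = [0, 0, 1, 1]
example_scores = [0.1, 0.4, 0.35, 0.8]
print(roc_auc(example_labels, example_scores))  # 0.75
```

This pairwise form is quadratic in the number of cases, which is fine for a sketch; library implementations typically use a sort-based computation instead.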