Learning cost-sensitive Bayesian networks via direct and indirect methods

Author: Nashnush EB
Vadera S
Publication venue: IOS Press
Publication date: 01/01/2017

Cost-sensitive learning has become an increasingly important area that recognizes that real-world classification problems need to take both misclassification costs and accuracy into account. Much work has been done on cost-sensitive decision tree learning, and there has been significant research on Bayesian networks, but relatively little on learning cost-sensitive Bayesian networks. Hence, this paper explores whether it is possible to develop algorithms that learn cost-sensitive Bayesian networks by taking (i) an indirect approach that changes the data distribution to reflect the costs of misclassification; and (ii) a direct approach that amends an existing accuracy-based algorithm for learning Bayesian networks. The new approaches are compared empirically with cost-sensitive decision tree learning algorithms on 33 data sets, and the results show that the new algorithms perform better in terms of misclassification cost while maintaining accuracy.
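The abstract does not say how the indirect approach changes the data distribution. As a rough illustration only, the sketch below uses a common cost-proportional instance weighting scheme, which is an assumption and not necessarily the authors' method. The function name `cost_proportional_weights` and the `misclassification_cost` mapping are hypothetical.

```python
from collections import Counter


def cost_proportional_weights(labels, misclassification_cost):
    """Return one weight per training example.

    Assumption (not taken from the paper): an example's weight is
    proportional to the cost of misclassifying its class, and weights are
    rescaled so that they sum to the number of examples.
    """
    counts = Counter(labels)
    n = len(labels)
    # Total cost mass of the data set: sum over classes of cost * class size.
    total_cost = sum(misclassification_cost[c] * k for c, k in counts.items())
    return [misclassification_cost[y] * n / total_cost for y in labels]


# Toy data: one costly "pos" example and three cheap "neg" examples.
labels = ["pos", "neg", "neg", "neg"]
weights = cost_proportional_weights(labels, {"pos": 3.0, "neg": 1.0})
```

A learner that accepts instance weights (or a resampler that draws examples in proportion to them) would then see a distribution skewed toward the costly class.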

University of Salford Institutional Repository

Backward Sequential Feature Elimination And Joining Algorithms In Machine Learning

Author: Valsan Sanya
Publication venue: SJSU ScholarWorks
Publication date: 01/04/2014

The Naïve Bayes model is a special case of Bayesian networks with strong independence assumptions. It is typically used for classification problems. The model is trained on the given data to estimate the parameters necessary for classification. It is very popular because it is simple yet efficient and accurate. While the Naïve Bayes model is considered accurate on most problem instances, there is a set of problems for which it gives less accurate results than other classifiers such as decision tree algorithms. One reason could be the model's strong independence assumption. This project aims at searching for dependencies between the features and studying the consequences of applying these dependencies in classifying instances. We propose two different algorithms, Backward Sequential Joining and Backward Sequential Elimination, that can be applied to improve the accuracy of the Naïve Bayes model. We then compare the accuracies of the different algorithms and draw conclusions based on the results.
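The abstract names Backward Sequential Elimination without giving its details. The sketch below shows only the generic greedy wrapper idea behind backward elimination, under the assumption that a caller supplies a `score` function (hypothetical; for example, cross-validated Naïve Bayes accuracy on a feature subset). It should not be read as the project's exact algorithm.

```python
def backward_sequential_elimination(features, score):
    """Greedily remove features while removal strictly improves the score.

    `score` is an assumed callable that maps a list of feature names to a
    number (higher is better). Returns the kept features and their score.
    """
    selected = list(features)
    best = score(selected)
    improved = True
    while improved and len(selected) > 1:
        improved = False
        best_candidate = selected
        # Try dropping each remaining feature and remember the best drop.
        for f in selected:
            candidate = [g for g in selected if g != f]
            s = score(candidate)
            if s > best:
                best, best_candidate = s, candidate
                improved = True
        selected = best_candidate
    return selected, best


# Toy score: features "a" and "b" are useful, every feature has a small cost.
def toy_score(fs):
    return len(set(fs) & {"a", "b"}) - 0.1 * len(fs)


kept, kept_score = backward_sequential_elimination(["a", "b", "c", "d"], toy_score)
```

In practice the score would come from held-out data, so that removing a feature is judged by its effect on classification accuracy rather than by a fixed formula.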

SJSU ScholarWorks

Applied Computational Techniques on Schizophrenia Using Genetic Mutations

Author: Aguiar-Pulido Vanessa
Fernández-Lozano Carlos
Gestal M.
Munteanu Cristian-Robert
Rivero Daniel
Publication venue: Bentham Science Publishers Ltd.
Publication date: 01/01/2013

Schizophrenia is a complex disease, with both genetic and environmental influences. Machine learning techniques can be used to associate different genetic variations at different genes with a (schizophrenic or non-schizophrenic) phenotype. Several machine learning techniques were applied to schizophrenia data to obtain the results presented in this study. Considering these data, Quantitative Genotype – Disease Relationships (QDGRs) can be used for disease prediction. One of the best machine learning-based models obtained in this exhaustive comparative study, an artificial neural network (ANN), was implemented online. The tool thus makes it possible to enter Single Nucleotide Polymorphism (SNP) sequences in order to classify a patient with schizophrenia. Besides this comparative study, a method for variable selection, based on ANNs and evolutionary computation (EC), is also presented. This method uses half as many variables as the original ANN, and the variables obtained are among those found in other publications. In the future, QDGR models based on nucleic acid information could be expanded to other diseases.

Funding: Programa Iberoamericano de Ciencia y Tecnología para el Desarrollo; 209RT-0366. Xunta de Galicia; 10SIN105004PR. Instituto de Salud Carlos III; RD07/0067/0005. Xunta de Galicia; Ref. 2009/5

Repositorio da Universidade da Coruña

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

University of Miami: Scholarship Miami

Virus-Host Coevolution: Common Patterns of Nucleotide Motif Usage in Flaviviridae and Their Hosts

Author: Andreas Tauch
Andréa M. Macedo
Bruno E. F. Mota
Carlos R. Machado
Francisco P. Lobo
Glória R. Franco
Lark L. Coffey
Sérgio D. J. Pena
Vasco Azevedo
Publication venue: Public Library of Science
Publication date: 01/01/2009

Virus-host biological interaction is a continuous coevolutionary process involving both the host immune system and viral escape mechanisms. The Flaviviridae family is composed of fast-evolving RNA viruses that infect vertebrate (mammals and birds) and/or invertebrate (ticks and mosquitoes) organisms. These host groups are very distinct life forms separated by a long evolutionary time, so lineage-specific anti-viral mechanisms are likely to have evolved. Flaviviridae viruses which infect a single host lineage would be subjected to specific host-induced pressures and, therefore, selected by them. In this work we compare the genomic evolutionary patterns of Flaviviridae viruses and their hosts in an attempt to uncover coevolutionary processes inducing common features in such disparate groups. In particular, we analyzed dinucleotide and codon usage patterns in the coding regions of vertebrate and invertebrate organisms as well as in Flaviviridae viruses which specifically infect one or both host types. The two host groups possess very distinctive dinucleotide and codon usage patterns. A pronounced CpG under-representation was found in the vertebrate group, possibly induced by the methylation-deamination process, as well as a prominent TpA decrease. The invertebrate group displayed only a TpA frequency reduction bias. Flaviviridae viruses mimicked host nucleotide motif usage in a host-specific manner. Vertebrate-infecting viruses showed under-representation of CpG and TpA, and insect-only viruses displayed only a TpA under-representation bias. Single-host Flaviviridae members which persistently infect mammals or insect hosts (Hepacivirus and insect-only Flavivirus, respectively) were found to possess a codon usage profile more similar to that of their hosts than to that of related Flaviviridae.
We demonstrated that vertebrate and mosquito genomes are under very distinct lineage-specific constraints, and that Flaviviridae viruses which specifically infect these lineages appear to be subject to the same evolutionary pressures that shaped their hosts' coding regions, evidencing the lineage-specific coevolutionary processes between the viral and host groups.
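Under- or over-representation of a dinucleotide such as CpG is commonly quantified as an observed/expected odds ratio, the dinucleotide frequency divided by the product of its two base frequencies. The abstract does not state which statistic the authors used, so the sketch below is an assumption that illustrates that standard ratio on a single strand; the function name `dinucleotide_odds_ratio` is hypothetical.

```python
from collections import Counter


def dinucleotide_odds_ratio(seq, dinuc):
    """Observed/expected ratio f_xy / (f_x * f_y) for one dinucleotide.

    Values well below 1 suggest under-representation. Overlapping
    dinucleotides are counted; only the given strand is considered.
    """
    seq = seq.upper()
    dinuc = dinuc.upper()
    n = len(seq)
    if n < 2 or len(dinuc) != 2:
        raise ValueError("need a sequence of length >= 2 and a 2-letter motif")
    base_counts = Counter(seq)
    pair_counts = Counter(seq[i:i + 2] for i in range(n - 1))
    f_xy = pair_counts[dinuc] / (n - 1)
    f_x = base_counts[dinuc[0]] / n
    f_y = base_counts[dinuc[1]] / n
    if f_x == 0 or f_y == 0:
        return float("nan")  # ratio is undefined when a base is absent
    return f_xy / (f_x * f_y)


# Toy sequence, not real genomic data.
ratio = dinucleotide_odds_ratio("ATGCGCGTTAACG", "CG")
```

Comparing such ratios between host coding regions and viral genomes is one simple way to look for the kind of shared motif bias the abstract describes.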

Public Library of Science (PLOS)

Crossref

Directory of Open Access Journals

PubMed Central

Publications at Bielefeld University

Mining Predictive Patterns and Extension to Multivariate Temporal Data

Author: Batal Iyad
Publication date: 01/01/2012

An important goal of knowledge discovery is the search for patterns in the data that can help explain its underlying structure. To be practically useful, the discovered patterns should be novel (unexpected) and easy for humans to understand. In this thesis, we study the problem of mining patterns (defining subpopulations of data instances) that are important for predicting and explaining a specific outcome variable. An example is the task of identifying groups of patients that respond better to a certain treatment than the rest of the patients. We propose and present efficient methods for mining predictive patterns for both atemporal and temporal (time series) data. Our first method relies on frequent pattern mining to explore the search space. It applies a novel evaluation technique for extracting a small set of frequent patterns that are highly predictive and have low redundancy. We show the benefits of this method on several synthetic and public datasets. Our temporal pattern mining method works on complex multivariate temporal data, such as electronic health records, for the event detection task. It first converts time series into time-interval sequences of temporal abstractions and then mines temporal patterns backwards in time, starting from patterns related to the most recent observations. We show the benefits of our temporal pattern mining method on two real-world clinical tasks.
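The conversion of a time series into time-interval sequences can be pictured with a small sketch. It assumes a simple value abstraction with three states (low, normal, high) and fixed thresholds; the state names, thresholds, and the function `to_state_intervals` are hypothetical, and the thesis may define its temporal abstractions differently.

```python
def to_state_intervals(times, values, low, high):
    """Turn a numeric time series into (state, start, end) intervals.

    Each value is mapped to "L" (below `low`), "H" (above `high`), or
    "N" (otherwise), and consecutive equal states are merged into one
    interval. Thresholds are assumed to be supplied by the caller.
    """
    def state(v):
        if v < low:
            return "L"
        if v > high:
            return "H"
        return "N"

    intervals = []
    for t, v in zip(times, values):
        s = state(v)
        if intervals and intervals[-1][0] == s:
            # Same state as the previous observation: extend the interval.
            intervals[-1] = (s, intervals[-1][1], t)
        else:
            intervals.append((s, t, t))
    return intervals


# Toy measurements at times 1..5 with a normal range of 4 to 10.
intervals = to_state_intervals([1, 2, 3, 4, 5], [5, 6, 12, 13, 7], low=4, high=10)
```

Mining backwards in time would then start from the last intervals in such a sequence, which correspond to the most recent observations.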

CiteSeerX

D-Scholarship@Pitt

Improving the Interpretability of Classification Rules Discovered by an Ant Colony Algorithm: Extended Results

Author: Freitas Alex A.
Otero Fernando E.B.
Publication venue: MIT Press - Journals
Publication date: 01/09/2016

The vast majority of Ant Colony Optimization (ACO) algorithms for inducing classification rules use an ACO-based procedure to create rules one at a time. An improved search strategy has been proposed in the cAnt-MinerPB algorithm, where an ACO-based procedure is used to create a complete list of rules (ordered rules), i.e., the ACO search is guided by the quality of a list of rules instead of an individual rule. In this paper we propose an extension of the cAnt-MinerPB algorithm to discover a set of rules (unordered rules). The main motivations for this work are to improve the interpretation of individual rules by discovering a set of rules and to evaluate the impact on the predictive accuracy of the algorithm. We also propose a new measure to evaluate the interpretability of the discovered rules, to mitigate the fact that the commonly used model size measure ignores how the rules are used to make a class prediction. Comparisons with state-of-the-art rule induction algorithms, support vector machines, and the cAnt-MinerPB algorithm producing ordered rules are also presented.

Kent Academic Repository