
    NeuroSVM: A Graphical User Interface for Identification of Liver Patients

    Diagnosing liver infection at a preliminary stage is important for better treatment. In today's scenario, devices such as sensors are used to detect infection, and accurate classification techniques are required for automatic identification of disease samples. In this context, this study applies data mining approaches to distinguish liver patients from healthy individuals. Four algorithms (Naive Bayes, Bagging, Random Forest, and SVM) were implemented for classification on the R platform. To further improve classification accuracy, a hybrid NeuroSVM model was developed from an SVM and a feed-forward artificial neural network (ANN). The hybrid model's performance was assessed using statistical parameters such as root mean square error (RMSE) and mean absolute percentage error (MAPE), and it achieved a prediction accuracy of 98.83%. The results suggest that the hybrid model improves prediction accuracy. To serve the medical community in predicting liver disease among patients, a graphical user interface (GUI) was developed in R and deployed as a package in a local R repository for users to perform predictions.
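    The abstract describes the hybrid only at a high level, and the paper's actual implementation is in R. The Python sketch below, on synthetic placeholder data, illustrates one plausible reading of the idea: an SVM produces a decision score that is fed, together with the original features, into a feed-forward ANN. The two-stage wiring, the network size, and the data are assumptions for illustration, not the paper's design.

```python
# Minimal sketch of a hybrid SVM -> feed-forward ANN pipeline (illustration
# only; the paper works in R and its exact architecture is not given here).
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.svm import SVC

# Synthetic placeholder standing in for the liver-patient records.
X, y = make_classification(n_samples=500, n_features=10, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# Stage 1: the SVM produces a decision score for each sample.
svm = SVC(kernel="rbf").fit(X_tr, y_tr)
score_tr = svm.decision_function(X_tr).reshape(-1, 1)
score_te = svm.decision_function(X_te).reshape(-1, 1)

# Stage 2: a feed-forward ANN learns from the original features
# augmented with the SVM score.
ann = MLPClassifier(hidden_layer_sizes=(16,), max_iter=2000, random_state=0)
ann.fit(np.hstack([X_tr, score_tr]), y_tr)
pred = ann.predict(np.hstack([X_te, score_te]))

# One of the error metrics named in the abstract (RMSE on 0/1 labels).
rmse = np.sqrt(np.mean((pred - y_te) ** 2))
print(f"accuracy={np.mean(pred == y_te):.3f}  RMSE={rmse:.3f}")
```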

    Machine Learning and Integrative Analysis of Biomedical Big Data

    Recent developments in high-throughput technologies have accelerated the accumulation of massive amounts of omics data from multiple sources: genome, epigenome, transcriptome, proteome, metabolome, etc. Traditionally, data from each source (e.g., the genome) are analyzed in isolation using statistical and machine learning (ML) methods. Integrative analysis of multi-omics and clinical data is key to new biomedical discoveries and advancements in precision medicine, but data integration poses new computational challenges and exacerbates those associated with single-omics studies. Specialized computational approaches are required to perform integrative analysis of biomedical data from diverse modalities effectively and efficiently. In this review, we discuss state-of-the-art ML-based approaches for tackling five specific computational challenges associated with integrative analysis: the curse of dimensionality, data heterogeneity, missing data, class imbalance, and scalability.
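    As a hedged illustration of two of the five challenges named above (missing data and class imbalance), the sketch below concatenates two synthetic omics blocks, a strategy commonly called early integration, and handles them with off-the-shelf scikit-learn components. It is not taken from the review itself; the data and component choices are assumptions.

```python
# Illustration only: "early integration" of two omics blocks by feature
# concatenation, with simple answers to two of the challenges above
# (missing data -> imputation, class imbalance -> class weighting).
import numpy as np
from sklearn.impute import SimpleImputer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

rng = np.random.default_rng(0)
n = 200
genome = rng.normal(size=(n, 50))        # stand-in for genomic features
proteome = rng.normal(size=(n, 30))      # stand-in for proteomic features
proteome[rng.random(proteome.shape) < 0.1] = np.nan  # simulate missingness
y = (rng.random(n) < 0.2).astype(int)    # imbalanced labels (~20% positive)

X = np.hstack([genome, proteome])        # early integration: concatenate
clf = make_pipeline(
    SimpleImputer(strategy="median"),                  # missing data
    LogisticRegression(class_weight="balanced",        # class imbalance
                       max_iter=1000),
)
clf.fit(X, y)
print(clf.predict(X[:5]))
```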

    A New Methodology for Generalizing Unweighted Network Measures

    Several important complex-network measures that have helped uncover common patterns across real-world networks ignore edge weights, an important source of information in real-world networks. We propose a new methodology for generalizing measures of unweighted networks through a generalization of the cardinality concept for a set of weights. The key observation is that many measures of unweighted networks use the cardinality (the size) of some subset of edges in their computation; for example, the node degree is the number of edges incident to a node. We define the effective cardinality, a new metric that quantifies how many edges are effectively being used, assuming that an edge's weight reflects the amount of interaction across that edge. We prove that a measure generalized with our method reduces to the original unweighted measure when there is no disparity between weights, which ensures that the laws governing the original unweighted measure also govern the generalized measure when the weights are equal. We also prove that our generalization ensures a partial ordering (among sets of weighted edges) that is consistent with the original unweighted measure, unlike previously developed generalizations. We illustrate the applicability of our method by generalizing four unweighted network measures. As a case study, we analyze four real-world weighted networks using our generalized degree and clustering coefficient. The analysis shows that the generalized degree distribution is consistent with the power-law hypothesis, but with a steeper decline, and that a common pattern governs the ratio between the generalized degree and the traditional degree. It also shows that nodes with more uniform weights tend to cluster with nodes whose weights are likewise more uniform.
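    The abstract does not state the paper's actual formula, so the sketch below substitutes one well-known measure with the same reduction property, the inverse participation ratio (sum w)^2 / sum w^2, which equals |W| exactly when all weights are equal. It illustrates the effective-cardinality idea and the resulting generalized degree, but it is not the authors' definition.

```python
# Illustrative "effective cardinality" with the reduction property stated
# in the abstract: it equals the plain cardinality when all weights are
# equal, and shrinks when a few edges dominate the interaction.
def effective_cardinality(weights):
    """Effective size of a weight multiset: (sum w)^2 / sum w^2."""
    s = sum(weights)
    s2 = sum(w * w for w in weights)
    return s * s / s2 if s2 else 0.0

# Generalized degree of a node = effective cardinality of the weights on
# its incident edges.
print(effective_cardinality([1, 1, 1, 1]))   # 4.0  -> matches plain degree 4
print(effective_cardinality([10, 1, 1, 1]))  # ~1.64 -> one edge dominates
```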

    Evolving Ensemble Fuzzy Classifier

    Ensemble learning offers a promising avenue for learning from data streams in complex environments because it addresses the bias-variance dilemma better than a single-model counterpart and features a reconfigurable structure well suited to the given context. While various extensions of ensemble learning for mining non-stationary data streams can be found in the literature, most are built on a static base classifier and revisit preceding samples in a sliding window for retraining. This makes them computationally prohibitive and not flexible enough to cope with rapidly changing environments; their complexity is often demanding because they maintain a large collection of offline classifiers, lacking both a structural-complexity reduction mechanism and an online feature selection mechanism. This paper proposes a novel evolving ensemble classifier, the Parsimonious Ensemble (pENsemble). pENsemble differs from existing architectures in that it is built on an evolving classifier for data streams, the Parsimonious Classifier (pClass). pENsemble is equipped with an ensemble pruning mechanism that estimates a localized generalization error of each base classifier, and it integrates a dynamic online feature selection scenario that allows input features to be selected and deselected on the fly. pENsemble adopts a dynamic ensemble structure to output a final classification decision and features a novel drift detection scenario to grow the ensemble structure. The efficacy of pENsemble is demonstrated through rigorous numerical studies with dynamic and evolving data streams, where it delivers the most encouraging performance in trading off accuracy against complexity. (Published in IEEE Transactions on Fuzzy Systems.)
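    The internals of pClass and of pENsemble's drift detector and pruning rule are not given in the abstract, so the sketch below is a deliberately crude Python caricature of the evolving loop: grow the ensemble when an error-based drift signal fires, and prune members whose error on the current chunk (standing in for the localized generalization error) is too high. Every threshold and base learner here is an assumption.

```python
# Rough sketch of a grow-on-drift, prune-on-error evolving ensemble.
# Not pENsemble itself: the drift signal and pruning rule are stand-ins.
import numpy as np
from sklearn.tree import DecisionTreeClassifier

class EvolvingEnsemble:
    def __init__(self, drift_threshold=0.4, prune_threshold=0.6):
        self.members = []
        self.drift_threshold = drift_threshold
        self.prune_threshold = prune_threshold

    def predict(self, X):
        if not self.members:
            return np.zeros(len(X), dtype=int)
        votes = np.mean([m.predict(X) for m in self.members], axis=0)
        return (votes >= 0.5).astype(int)

    def partial_fit(self, X, y):
        # Crude drift signal: ensemble error on the incoming chunk.
        err = np.mean(self.predict(X) != y)
        if err > self.drift_threshold:
            # "Drift" detected: grow the ensemble with a new base learner.
            self.members.append(DecisionTreeClassifier(max_depth=3).fit(X, y))
        # Prune members whose chunk error (a stand-in for the localized
        # generalization error) is too high.
        self.members = [m for m in self.members
                        if np.mean(m.predict(X) != y) <= self.prune_threshold]

# Usage on a drifting synthetic stream: the relevant feature changes.
rng = np.random.default_rng(0)
ens = EvolvingEnsemble()
for chunk in range(5):
    X = rng.normal(size=(100, 5))
    y = (X[:, chunk % 5] > 0).astype(int)   # concept drifts each chunk
    ens.partial_fit(X, y)
print(f"ensemble size after stream: {len(ens.members)}")
```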