
    Machine Learning and Integrative Analysis of Biomedical Big Data.

    Recent developments in high-throughput technologies have accelerated the accumulation of massive amounts of omics data from multiple sources: genome, epigenome, transcriptome, proteome, metabolome, etc. Traditionally, data from each source (e.g., genome) are analyzed in isolation using statistical and machine learning (ML) methods. Integrative analysis of multi-omics and clinical data is key to new biomedical discoveries and advancements in precision medicine. However, data integration poses new computational challenges and exacerbates those associated with single-omics studies. Specialized computational approaches are required to perform integrative analysis of biomedical data acquired from diverse modalities effectively and efficiently. In this review, we discuss state-of-the-art ML-based approaches for tackling five specific computational challenges associated with integrative analysis: the curse of dimensionality, data heterogeneity, missing data, class imbalance, and scalability issues.

    An Overview of Classifier Fusion Methods

    A number of classifier fusion methods have recently been developed, opening an alternative approach that may improve classification performance. As there is little theory of information fusion itself, we currently face different methods designed for different problems and producing different results. This paper gives an overview of classifier fusion methods and attempts to identify new trends that may dominate this area of research in the future. A taxonomy of fusion methods, which tries to bring some order to the existing “pudding of diversities”, is also provided.

    An Overview of the Use of Neural Networks for Data Mining Tasks

    In recent years, the area of data mining has experienced considerable demand for technologies that extract knowledge from large and complex data sources. There is substantial commercial interest, as well as research activity, aimed at developing new and improved approaches for extracting information, relationships, and patterns from datasets. Artificial Neural Networks (NN) are popular, biologically inspired intelligent methodologies whose classification, prediction, and pattern recognition capabilities have been used successfully in many areas, including science, engineering, medicine, business, banking, and telecommunication. This paper highlights, from a data mining perspective, the implementation of NN using supervised and unsupervised learning for pattern recognition, classification, prediction, and cluster analysis, and focuses the discussion on their use in bioinformatics and financial data analysis tasks.

    Feature selection for modular GA-based classification

    Genetic algorithms (GAs) have been used as conventional methods for classifiers to adaptively evolve solutions to classification problems. Feature selection plays an important role in finding relevant features in classification. In this paper, feature selection is explored with modular GA-based classification. A new feature selection technique, Relative Importance Factor (RIF), is proposed to find less relevant features in the input domain of each class module. Removing these features is intended to reduce both the classification error and the dimensionality of classification problems. Benchmark classification data sets are used to evaluate the proposed approach. The experimental results show that RIF can be used to find less relevant features and helps achieve lower classification error with a reduced feature space dimension.