    ISBDD model for classification of hyperspectral remote sensing imagery

    The diverse density (DD) algorithm was proposed to handle the problem of low classification accuracy when training samples contain interference such as mixed pixels. The DD algorithm can learn a feature vector from training bags, which comprise instances (pixels). However, the feature vector learned by the DD algorithm cannot always effectively represent one type of ground cover. To handle this problem, an instance space-based diverse density (ISBDD) model that employs a novel training strategy is proposed in this paper. In the ISBDD model, DD values are computed for each pixel instead of learning a feature vector, so that each pixel can be classified according to its DD values. Airborne hyperspectral data collected by the Airborne Visible/Infrared Imaging Spectrometer (AVIRIS) sensor and the Push-broom Hyperspectral Imager (PHI) are used to evaluate the performance of the proposed model. Results show that the overall classification accuracy of the ISBDD model on the AVIRIS and PHI images reaches up to 97.65% and 89.02%, respectively, while the kappa coefficient reaches up to 0.97 and 0.88, respectively.
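The idea of classifying a pixel by its DD values can be sketched as follows. This is only an illustration: the abstract does not spell out the ISBDD training strategy, so the sketch assumes the commonly used noisy-or formulation of diverse density, and the function names, the `scale` parameter, and the choice to treat bags of all other classes as negative bags are hypothetical.

```python
import math


def instance_prob(instance, point, scale=1.0):
    """Gaussian-like similarity between a bag instance and a candidate point."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(instance, point))
    return math.exp(-scale * sq_dist)


def diverse_density(point, positive_bags, negative_bags, scale=1.0):
    """Noisy-or diverse density of `point` (assumed formulation).

    A point scores high when every positive bag has at least one
    instance close to it and no negative-bag instance is close to it.
    """
    dd = 1.0
    for bag in positive_bags:
        miss = 1.0  # probability that no instance in the bag matches
        for inst in bag:
            miss *= 1.0 - instance_prob(inst, point, scale)
        dd *= 1.0 - miss
    for bag in negative_bags:
        for inst in bag:
            dd *= 1.0 - instance_prob(inst, point, scale)
    return dd


def classify_pixel(pixel, bags_by_class, scale=1.0):
    """Compute one DD value per class and pick the class with the largest."""
    scores = {}
    for label, bags in bags_by_class.items():
        # Assumption: bags of every other class act as negative bags.
        negatives = [
            bag
            for other, other_bags in bags_by_class.items()
            if other != label
            for bag in other_bags
        ]
        scores[label] = diverse_density(pixel, bags, negatives, scale)
    best = max(scores, key=scores.get)
    return best, scores
```

For example, `classify_pixel((0.15, 0.1), {"water": [[(0.1, 0.1), (0.2, 0.1)]], "soil": [[(0.9, 0.8), (0.8, 0.9)]]})` returns the label whose bags lie closest to the pixel, together with the per-class DD values.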

    MTAP: The Motif Tool Assessment Platform

    Background: In recent years, substantial effort has been applied to de novo regulatory motif discovery. At this time, more than 150 software tools exist to detect regulatory binding sites given a set of genomic sequences. As the number of software packages increases, it becomes more important to identify the tools with the best performance characteristics for specific problem domains. Identifying the correct tool is difficult because of the great variability in motif detection software. Consequently, many labs spend considerable effort testing methods to find one that works well for their problem of interest. Results: In this work, we propose a method (MTAP) that substantially reduces the effort required to assess de novo regulatory motif discovery software. MTAP differs from previous attempts at regulatory motif assessment in that it automates motif discovery tool pipelines (something that traditionally required many manual steps), automatically constructs orthologous upstream sequences, and provides automated benchmarks for many popular tools. As a proof of concept, we have run benchmarks over human, mouse, fly, yeast, E. coli and B. subtilis. Conclusion: MTAP presents a new approach to the challenging problem of assessing regulatory motif discovery methods. The most current version of MTAP can be downloaded from http://biobase.ist.unomaha.edu

    Learning from class-imbalanced data: overlap-driven resampling for imbalanced data classification.

    Classification of imbalanced datasets has attracted substantial research interest in recent years. This is because imbalanced datasets are common in several domains such as health, finance and security, but learning algorithms are generally not designed to handle them. Many existing solutions focus mainly on the class distribution problem. However, a number of reports have shown that class overlap has a higher negative impact on the learning process than class imbalance. This thesis thoroughly explores the impact of class overlap on the learning algorithm and demonstrates how elimination of class overlap can effectively improve the classification of imbalanced datasets. Novel undersampling approaches were developed with the main objective of enhancing the presence of minority class instances in the overlapping region. This is achieved by identifying and removing majority class instances potentially residing in such a region. Seven methods under two different approaches were designed for the task. Extensive experiments were carried out to evaluate the methods on simulated and well-known real-world datasets. Results showed that substantial improvement in the classification accuracy of the minority class was obtained with favourable trade-offs with the majority class accuracy. Moreover, successful application of the methods in predictive diagnostics of diseases with imbalanced records is presented. These novel overlap-based approaches have several advantages over other common resampling methods. First, the undersampling amount is independent of class imbalance and proportional to the degree of overlap. This could effectively address the problem of class overlap while reducing the effect of class imbalance. Second, information loss is minimised as instance elimination is contained within the problematic region. Third, adaptive parameters enable the methods to be generalised across different problems. It is also worth pointing out that these methods provide different trade-offs, which offer more alternatives to real-world users in selecting the best-fit solution to the problem.
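A rough sketch of what overlap-driven undersampling could look like is given here. The abstract does not describe the seven methods, so this is a generic neighbourhood-based illustration and not the thesis's algorithm: it assumes that a majority instance lies in the overlapping region when it is among the `k` nearest neighbours of some minority instance. The function names and the parameter `k` are hypothetical.

```python
import math


def k_nearest(index, points, k):
    """Indices of the k points closest to points[index] (excluding itself)."""
    dists = sorted(
        (math.dist(points[index], p), j)
        for j, p in enumerate(points)
        if j != index
    )
    return [j for _, j in dists[:k]]


def overlap_undersample(points, labels, minority, k=3):
    """Drop majority instances found in the neighbourhood of minority ones.

    Only instances near the minority class are removed, so the amount of
    undersampling follows the degree of overlap, not the imbalance ratio.
    """
    to_remove = set()
    for i, label in enumerate(labels):
        if label != minority:
            continue
        for j in k_nearest(i, points, k):
            if labels[j] != minority:
                to_remove.add(j)  # majority instance inside the overlap
    kept = [i for i in range(len(points)) if i not in to_remove]
    return [points[i] for i in kept], [labels[i] for i in kept]
```

Majority instances far from every minority instance are never touched, which matches the stated goal of confining information loss to the problematic region.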