    Rough ACO: A Hybridized Model for Feature Selection in Gene Expression Data

    Dimensionality reduction of a feature set is a common preprocessing step in pattern recognition, classification applications, and compression schemes. Rough Set Theory is one of the popular methods used, and can be shown to be optimal under different optimality criteria. This paper proposes a novel method for dimensionality reduction that chooses a subset of the original features containing most of the essential information, using the same criteria as Ant Colony Optimization (ACO) hybridized with Rough Set Theory. We call this method Rough ACO. The proposed method is successfully applied to choose the best feature combinations and then to apply the Upper and Lower Approximations to find the reduced set of features from gene expression data.
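
The Upper and Lower Approximations mentioned in this abstract are standard Rough Set Theory constructs, and a small sketch may make them concrete. The code below is not the paper's Rough ACO method: it only computes the approximations and the usual dependency degree for a candidate feature subset, which is the kind of score a search procedure such as ACO could use. The gene names, the discretized expression levels, and the class labels are made-up toy data.

```python
from collections import defaultdict

def equivalence_classes(rows, features):
    """Group object indices that are indistinguishable on the chosen features."""
    classes = defaultdict(set)
    for idx, row in enumerate(rows):
        classes[tuple(row[f] for f in features)].add(idx)
    return list(classes.values())

def approximations(rows, features, target):
    """Return (lower, upper) approximations of the index set `target`."""
    lower, upper = set(), set()
    for block in equivalence_classes(rows, features):
        if block <= target:   # block lies entirely inside the target set
            lower |= block
        if block & target:    # block overlaps the target set
            upper |= block
    return lower, upper

def dependency_degree(rows, features, labels):
    """Fraction of objects whose class is determined by the chosen features."""
    positive = set()
    for label in set(labels):
        target = {i for i, y in enumerate(labels) if y == label}
        lower, _ = approximations(rows, features, target)
        positive |= lower
    return len(positive) / len(rows)

# Hypothetical toy data: discretized expression levels for two genes.
rows = [
    {"g1": "high", "g2": "low"},
    {"g1": "high", "g2": "low"},
    {"g1": "low", "g2": "low"},
    {"g1": "low", "g2": "high"},
    {"g1": "high", "g2": "high"},
]
labels = ["tumor", "normal", "normal", "tumor", "tumor"]

tumor = {i for i, y in enumerate(labels) if y == "tumor"}
lower, upper = approximations(rows, ["g1", "g2"], tumor)
print(sorted(lower), sorted(upper))
print(dependency_degree(rows, ["g1", "g2"], labels))
print(dependency_degree(rows, ["g2"], labels))
```

In this toy table, samples 0 and 1 share the same feature values but carry different labels, so they fall in the upper but not the lower approximation of the tumor class. Dropping `g1` lowers the dependency degree, which suggests `g1` carries information that `g2` alone does not.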

    An Improved Stock Price Prediction using Hybrid Market Indicators

    This paper examines the effect of hybrid market indicators on stock price prediction. The hybrid market indicators consist of technical, fundamental, and expert opinion variables used as inputs to an artificial neural network model. Empirical results on published New York Stock Exchange data for Dell and Nokia show that the proposed model can improve the accuracy of stock price prediction.

    Setting up a large set of protein-ligand PDB complexes for the development and validation of knowledge-based docking algorithms

    Background: The number of algorithms available to predict ligand-protein interactions is large and ever-increasing. The number of test cases used to validate these methods is usually small and problem dependent. Recently, several databases have been released for further understanding of protein-ligand interactions, having the Protein Data Bank as backend support. Nevertheless, it appears to be difficult to test docking methods on a large variety of complexes. In this paper we report the development of a new database of protein-ligand complexes tailored for testing of docking algorithms. Methods: Using a new definition of molecular contact, small ligands contained in the 2005 PDB edition were identified and processed. The database was enriched in molecular properties. In particular, an automated typing of ligand atoms was performed. A filtering procedure was applied to select a non-redundant dataset of complexes. Data mining was performed to obtain information on the frequencies of different types of atomic contacts. Docking simulations were run with the program DOCK. Results: We compiled a large database of small ligand-protein complexes, enriched with different calculated properties, that currently contains more than 6000 non-redundant structures. As an example to demonstrate the value of the new database, we derived a new set of chemical matching rules to be used in the context of the program DOCK, based on contact frequencies between ligand atoms and points representing the protein surface, and proved their enhanced efficiency with respect to the default set of rules included in that program. Conclusion: The new database constitutes a valuable resource for the development of knowledge-based docking algorithms and for testing docking programs on large sets of protein-ligand complexes. The new chemical matching rules proposed in this work significantly increase the success rate in DOCKing simulations. The database developed in this work is available at http://cimlcsext.cim.sld.cu:8080/screeningbrowser/.

    Data mining of gene arrays for biomarkers of survival in ovarian cancer

    The expected five-year survival rate from a stage III ovarian cancer diagnosis is a mere 22%; this applies to the 7000 new cases diagnosed yearly in the UK. Stratification of patients with this heterogeneous disease, based on active molecular pathways, would aid targeted treatment and improve the prognosis in many cases. While hundreds of genes have been associated with ovarian cancer, few have yet been verified by peer research for clinical significance. Here, a meta-analysis approach was applied to two carefully selected gene expression microarray datasets. Artificial neural networks, Cox univariate survival analyses, and t-tests identified genes whose expression was consistently and significantly associated with patient survival. The rigor of this experimental design increases confidence in the genes found to be of interest. A list of 56 genes was distilled from a potential 37,000 as significantly related to survival in both datasets, with an FDR of 1.39859 × 10^−11. Their identities both verify genes already implicated in this disease and provide novel genes and pathways to pursue. Further investigation and validation of these may lead to clinical insights, and they have the potential to predict a patient's response to treatment or to serve as novel targets for therapy.
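
The abstract reports a false discovery rate (FDR) but does not say how it was computed. As an illustration only, the sketch below shows the Benjamini-Hochberg adjustment, one common way to control the FDR when many genes are tested at once. The choice of this procedure and the p-values used are assumptions, not details taken from the paper.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

# Hypothetical per-gene p-values, e.g. from univariate survival tests.
p_values = [0.01, 0.04, 0.03, 0.005]
q_values = benjamini_hochberg(p_values)

# Genes whose adjusted p-value falls below the chosen FDR threshold.
significant = [i for i, q in enumerate(q_values) if q <= 0.025]
print(q_values, significant)
```

Each adjusted value is the smallest FDR threshold at which that gene would still be called significant, so filtering on it keeps the expected share of false positives among the selected genes below the threshold.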