36 research outputs found

    Validating Gene Clusterings by Selecting Informative Gene Ontology Terms with Mutual Information

    We propose a method for global validation of gene clusterings. The method selects a set of informative and non-redundant GO terms through an exploration of the Gene Ontology structure guided by mutual information. Our approach yields a global assessment of clustering quality and a higher-level interpretation of the clusters, as it relates GO terms to specific clusters. We show that on two gene expression data sets our method improves on previous approaches.
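
To make the role of mutual information concrete, here is a minimal sketch of how one GO term could be scored against a clustering. It is an illustration only, not the authors' procedure: the function name `mutual_information`, the binary "gene is annotated with the term" encoding, and the toy data are all assumptions.

```python
import math
from collections import Counter


def mutual_information(cluster_labels, has_term):
    """Mutual information (in bits) between a cluster assignment and a
    binary GO-term annotation, estimated from co-occurrence counts.

    cluster_labels: one cluster id per gene (hypothetical input format).
    has_term: one bool per gene, True if the gene carries the GO term.
    """
    if len(cluster_labels) != len(has_term):
        raise ValueError("inputs must have one entry per gene")
    n = len(cluster_labels)
    joint = Counter(zip(cluster_labels, has_term))
    cluster_counts = Counter(cluster_labels)
    term_counts = Counter(has_term)

    mi = 0.0
    for (cluster, flag), count in joint.items():
        p_joint = count / n
        # p(c, t) / (p(c) * p(t)) rewritten in terms of raw counts.
        ratio = count * n / (cluster_counts[cluster] * term_counts[flag])
        mi += p_joint * math.log2(ratio)
    return mi


# Toy data: four genes in two clusters.
clusters = [0, 0, 1, 1]
informative_term = [True, True, False, False]   # tracks the clustering exactly
uninformative_term = [True, False, True, False]  # independent of the clustering

score_informative = mutual_information(clusters, informative_term)
score_uninformative = mutual_information(clusters, uninformative_term)
```

A term whose annotations line up with the clusters scores high, while a term spread evenly across clusters scores zero. Selecting terms that are also non-redundant, as the abstract describes, would need an additional step that this sketch does not attempt.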

    Partitioning Biological Networks into Highly Connected Clusters with Maximum Edge Coverage

    We introduce the combinatorial optimization problem Highly Connected Deletion, which asks for removing as few edges as possible from a graph such that the resulting graph consists of highly connected components. We show that Highly Connected Deletion is NP-hard and provide a fixed-parameter algorithm and a kernelization. We propose exact and heuristic solution strategies, based on polynomial-time data reduction rules and integer linear programming with column generation. The data reduction typically identifies 85% of the edges that need to be deleted for an optimal solution; the column generation method can then optimally solve protein interaction networks with up to 5,000 vertices and 12,000 edges.
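
The objective in this abstract can be illustrated with a small sketch that evaluates a candidate partition. The abstract does not define "highly connected"; the code assumes a commonly used criterion, namely that every vertex is adjacent to more than half of the vertices in its cluster. The helper names `is_highly_connected` and `deletion_cost` are hypothetical, and the sketch only checks a given solution; it does not implement the paper's data reduction or column generation.

```python
def is_highly_connected(vertices, edges):
    """Check the assumed criterion: inside the induced subgraph, every
    vertex has degree strictly greater than half the number of vertices.

    Under this strict inequality a lone vertex does not qualify; how
    singletons should be treated is a convention this sketch leaves open.
    """
    vertex_set = set(vertices)
    degree = {v: 0 for v in vertex_set}
    for u, w in edges:
        # Only edges with both endpoints in the cluster count.
        if u in vertex_set and w in vertex_set:
            degree[u] += 1
            degree[w] += 1
    return all(2 * d > len(vertex_set) for d in degree.values())


def deletion_cost(edges, clusters):
    """Number of edges that run between different clusters, i.e. the
    edges that must be deleted to realize the given partition."""
    cluster_of = {v: i for i, cluster in enumerate(clusters) for v in cluster}
    return sum(1 for u, w in edges if cluster_of[u] != cluster_of[w])


# Toy graph: a triangle a-b-c with a pendant vertex d attached to c.
toy_edges = [("a", "b"), ("b", "c"), ("a", "c"), ("c", "d")]

triangle_ok = is_highly_connected({"a", "b", "c"}, toy_edges)
whole_graph_ok = is_highly_connected({"a", "b", "c", "d"}, toy_edges)
cost = deletion_cost(toy_edges, [{"a", "b", "c"}, {"d"}])
```

In the toy graph the whole vertex set fails the test because of the pendant vertex, while cutting the single edge to `d` leaves the triangle as a cluster that passes it.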

    Estimating the Quality of Ontology-Based Annotations by Considering Evolutionary Changes

    Ontology-based annotations associate objects, such as genes and proteins, with well-defined ontology concepts to describe object properties semantically and uniformly. Such annotation mappings are used in different applications and analysis studies whose results strongly depend on the quality of the annotations. To study the quality of annotations, we propose a generic evaluation approach that considers the annotation generation methods (provenance) as well as the evolution of ontologies, object sources, and annotations. It thus facilitates the identification of reliable annotations, e.g., for use in analysis applications. We evaluate our approach for functional protein annotations in Ensembl and Swiss-Prot using the Gene Ontology.