9 research outputs found

    BISON: bio-interface for the semi-global analysis of network patterns

    BACKGROUND: The large amount of genomics data accumulated over the past decade requires extensive data mining. However, the global nature of data mining, which includes pattern mining, poses difficulties for users who want to study specific questions in a more local environment. This creates a need for techniques that allow a localized analysis of globally determined patterns. RESULTS: We developed a tool that determines and evaluates global patterns based on protein property and network information, while providing all the benefits of a perspective that is targeted at biologist users with specific goals and interests. Our tool uses our own data mining techniques, integrated into current visualization and navigation techniques. The functionality of the tool is discussed in the context of the transcriptional regulatory network of the enteric bacterium Escherichia coli. Two biological questions were asked: (i) Which functional categories of proteins (identified by hidden Markov models) are regulated by a regulator with a specific domain? (ii) Which regulators are involved in the regulation of proteins that contain a common hidden Markov model? Using these examples, we explain the gene-centered and pattern-centered analysis that the tool permits. CONCLUSION: In summary, we have a tool that can be used for a wide variety of applications in biology, medicine, or agriculture. The pattern mining engine is global in that patterns are determined across the entire network. The tool still permits a localized analysis for users who want to analyze a subportion of the total network. We have named the tool BISON (Bio-Interface for the Semi-global analysis Of Network patterns).

    Differential association rule mining for the study of protein-protein interaction networks

    Protein-protein interactions are of great interest to biologists. A variety of high-throughput techniques have been devised, each of which leads to a separate definition of an interaction network. The concept of differential association rule mining is introduced to study the annotations of proteins in the context of one or more interaction networks. Differences among items across edges of a network are explicitly targeted. As a second step we identify differences between networks that are separately defined on the same set of nodes. The technique of differential association rule mining is applied to the comparison of protein annotations within an interaction network and between different interaction networks. In both cases we were able to find rules that explain known properties of protein interaction networks as well as rules that show promise for advanced study. General Terms: association rule mining, protein interactions, relational data mining, graph-based data mining, redundant rules.
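The idea of targeting differences among items across edges can be illustrated with a minimal sketch. The protein names, annotations, and the pair-counting scheme below are hypothetical and chosen only for illustration; the abstract does not specify the actual rule-mining procedure, so this should not be read as the authors' algorithm.

```python
from collections import Counter

# Hypothetical undirected interaction network and protein annotations.
interactions = [("pA", "pB"), ("pA", "pC"), ("pB", "pC")]
annotations = {
    "pA": {"nucleus", "kinase"},
    "pB": {"nucleus"},
    "pC": {"cytoplasm", "kinase"},
}

def differential_items(u, v):
    """Items carried by only one endpoint of the edge (u, v)."""
    only_u = annotations[u] - annotations[v]
    only_v = annotations[v] - annotations[u]
    return only_u, only_v

def difference_counts(edges):
    """Count how often a pair of differing items faces each other across an edge."""
    counts = Counter()
    for u, v in edges:
        only_u, only_v = differential_items(u, v)
        for a in only_u:
            for b in only_v:
                # Unordered pair, because the interactions are undirected.
                counts[frozenset([a, b])] += 1
    return counts

counts = difference_counts(interactions)
```

Items shared by both endpoints (such as "nucleus" on the pA-pB edge) never enter a pair here, which is the sense in which only the differences are counted.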

    Integration of Profile Hidden Markov Model Output into Association Rule Mining

    Scientific models typically depend on parameters. Preserving the parameter dependence of models in the pattern mining context opens up several applications. Within association rule mining (ARM), the choice of parameters can be studied more flexibly than in traditional model building. Studying support, confidence, and other rule metrics as a function of model parameters allows conclusions to be drawn about the assumptions underlying the models. We present efficient techniques to handle multiple model output data sets at little more than the cost of one. We integrate output from hidden Markov models into the association rule mining framework, demonstrating the potential for frequent pattern mining in the field of scientific modeling and experimentation.
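A small sketch may help show what studying support and confidence as a function of a model parameter can look like. It assumes, purely for illustration, that each protein has a hypothetical HMM score per domain and that a domain counts as an item when its score meets a threshold; the data, names, and thresholding rule are assumptions, not taken from the paper. The sketch also recomputes everything for each threshold, so it does not reproduce the efficient multi-data-set techniques the abstract mentions.

```python
from typing import Dict, Set, Tuple

# Hypothetical per-protein HMM scores and fixed annotations.
hmm_scores: Dict[str, Dict[str, float]] = {
    "p1": {"domA": 30.0},
    "p2": {"domA": 12.0},
    "p3": {"domA": 25.0},
    "p4": {},
}
fixed_items: Dict[str, Set[str]] = {
    "p1": {"regulator"},
    "p2": {"regulator"},
    "p3": set(),
    "p4": {"regulator"},
}

def transactions(threshold: float) -> Dict[str, Set[str]]:
    """Build one item set per protein; a domain is an item if its score >= threshold."""
    result = {}
    for protein, scores in hmm_scores.items():
        items = set(fixed_items[protein])
        items |= {dom for dom, score in scores.items() if score >= threshold}
        result[protein] = items
    return result

def support(tx: Dict[str, Set[str]], itemset: Set[str]) -> float:
    """Fraction of transactions that contain every item of the item set."""
    hits = sum(1 for items in tx.values() if itemset <= items)
    return hits / len(tx)

def confidence(tx: Dict[str, Set[str]], antecedent: Set[str], consequent: Set[str]) -> float:
    """support(antecedent and consequent) / support(antecedent), or 0.0 if undefined."""
    s_ante = support(tx, antecedent)
    if s_ante == 0:
        return 0.0
    return support(tx, antecedent | consequent) / s_ante

def rule_metrics(threshold: float) -> Tuple[float, float]:
    """Support and confidence of the rule domA -> regulator at a given threshold."""
    tx = transactions(threshold)
    return (
        support(tx, {"domA", "regulator"}),
        confidence(tx, {"domA"}, {"regulator"}),
    )

for t in (10.0, 20.0, 28.0):
    print(t, rule_metrics(t))
```

In this toy data the confidence of the rule rises as the threshold becomes stricter, which is the kind of parameter dependence the abstract proposes to examine.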

    Mining Edge-disjoint Patterns in Graph-relational Data

    Diverse types of data are associated with proteins, including network and categorical data. While graph mining techniques have long focused on data with no more than one label per node, generalizations have recently been developed. We show that existing generalizations are not well suited to typical biological networks and are likely to return few or no results on protein regulatory networks. They are, furthermore, ill-suited to graphs that are dense or show the small-world property, which are typical features of biological networks. A graph-relational edge-disjoint instance mining algorithm (GR-EDI) is presented that resolves these problems. Our algorithm treats bipartite edges separately and only constrains unipartite edges to be disjoint. We introduce a new pattern constraint that recovers the downward closure property. The algorithm uses a search lattice traversal strategy that allows more effective mining of graphs that cannot be considered sparse because of hubs. Effectiveness is demonstrated for a real biological example. While existing techniques return few or no patterns, GR-EDI is able to extract many patterns.
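The notion of edge-disjoint instances can be sketched as follows. This is not GR-EDI itself: the network, the node items, the fixed two-edge path pattern, and the greedy selection are all assumptions made for illustration. Only node-to-node (unipartite) edges are required to be disjoint, while node-to-item associations may be reused, in line with the description above. Greedy selection yields a valid edge-disjoint set but not necessarily the largest one.

```python
from itertools import product

# Hypothetical directed regulatory network with several items per node.
edges = [("r1", "g1"), ("r1", "g2"), ("g1", "t1"), ("g2", "t1"), ("g1", "t2")]
items = {
    "r1": {"HTH"},
    "g1": {"kinase", "membrane"},
    "g2": {"kinase"},
    "t1": {"transport"},
    "t2": {"transport"},
}

def path_instances(edges, items, a, b, c):
    """All two-edge paths u->v->w whose nodes carry items a, b, and c respectively."""
    found = []
    for (u, v), (x, w) in product(edges, repeat=2):
        if v == x and u != w and a in items[u] and b in items[v] and c in items[w]:
            found.append(frozenset([(u, v), (x, w)]))
    return found

def greedy_edge_disjoint(instances):
    """Keep instances in order, skipping any that reuse an already used edge."""
    used = set()
    chosen = []
    for inst in instances:
        if used.isdisjoint(inst):
            chosen.append(inst)
            used |= inst
    return chosen

instances = path_instances(edges, items, "HTH", "kinase", "transport")
disjoint = greedy_edge_disjoint(instances)
print(len(instances), len(disjoint))
```

Here the hub r1 produces three overlapping instances of the pattern, of which two share no edge, showing how the disjointness constraint limits the count around hubs.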