A noise reducing sampling approach for uncovering critical properties in large scale biological networks

Abstract

A correlation network is a graph-based representation of relationships among genes or gene products, such as proteins. The advent of high-throughput bioinformatics has produced volumes of data that require sophisticated in silico models, such as the correlation network, for in-depth analysis. Each node in our network represents the expression levels of one gene across multiple samples, and an edge connecting two nodes reflects the level of correlation between the two corresponding genes, measured by the Pearson correlation coefficient. Biological networks built in this manner are generally found to have a scale-free structure; that is, they are modular and adhere to a power-law degree distribution. Filtering these structures to remove noise and coincidental edges is a necessity for network theorists because, when entire genomes are examined at once, network size and complexity can become a bottleneck for network manageability. Our previous work demonstrated that chordal graph based sampling of networks results in viable models. In this paper, we extend our research to investigate how different orderings affect the results of our sampling and whether they maintain the viability of the resulting network structures. Our results show that chordal graph based sampling not only conserves clusters that are present within the original networks but, by reducing noise, can also help uncover additional functional clusters that were previously not obtainable from the original network.
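To make the network construction described above concrete, the following is a minimal sketch of building a correlation network from an expression matrix. It is an illustration under stated assumptions, not the authors' implementation: the function names `pearson` and `correlation_network`, the absolute-value cutoff, and the `threshold` value of 0.9 are all hypothetical choices made for this example, and the chordal graph based sampling step itself is not shown.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences.

    Assumes both sequences have non-zero variance.
    """
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (norm_x * norm_y)

def correlation_network(expr, threshold=0.9):
    """Return the edge set of a correlation network.

    expr: list of rows, one row of expression levels per gene (node).
    threshold: hypothetical cutoff; gene pairs whose absolute Pearson
               correlation reaches it are connected by an edge.
    """
    edges = set()
    for i in range(len(expr)):
        for j in range(i + 1, len(expr)):
            if abs(pearson(expr[i], expr[j])) >= threshold:
                edges.add((i, j))
    return edges

# Toy data (made up for illustration): three genes, four samples each.
expr = [
    [1.0, 2.0, 3.0, 4.0],
    [2.0, 4.0, 6.0, 8.1],
    [4.0, 1.0, 3.0, 2.0],
]
edges = correlation_network(expr, threshold=0.9)
```

In this toy input, genes 0 and 1 rise together across the samples and are linked, while gene 2 is only weakly correlated with either and stays isolated. The all-pairs loop is quadratic in the number of genes, which hints at why network size becomes a bottleneck at genome scale.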
