2 research outputs found

    Comparative Analysis of Thresholding Algorithms for Microarray-derived Gene Correlation Matrices

    The thresholding problem is important in today’s data-rich research scenario. A threshold is a well-defined point in the data distribution beyond which the data is highly likely to have scientific meaning. The selection of a threshold is crucial, since it heavily influences any downstream analysis and the inferences drawn from it. A legitimate threshold is one that is not arbitrary but scientifically well grounded and data-dependent, and that best segregates the information-rich and noisy sections of the data. Although the thresholding problem is not restricted to any particular field of study, little research has been done on it. This study investigates the problem in the context of network-based analysis of transcriptomic data. Six conceptually diverse algorithms (based on the number of maximal cliques, correlations of control spots with genes, the top 1% of correlations, spectral graph clustering, Bonferroni correction of p-values, and statistical power) are used to threshold the gene correlation matrices of three time-series microarray datasets and are tested for stability and validity. The stability, or reliability, of the first four algorithms is tested by block bootstrapping the arrays in each dataset and comparing the estimated thresholds against the bootstrap threshold distributions. The validity of the thresholding algorithms is tested by comparing the estimated thresholds against a threshold based on biological information. Thresholds based on the modular basis of gene networks are concluded to perform better in terms of both stability and validity. Future challenges in researching the problem are identified. Although the study uses transcriptomic data for analysis, we assert its applicability to thresholding across various fields.
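As a rough illustration of the simplest of the six approaches named in the abstract, the top 1% of correlations, the sketch below picks the threshold as the smallest correlation among the strongest 1% of gene pairs. It is not the study's implementation: the function names, the use of Pearson correlation, the use of absolute values, and the toy expression profiles are all assumptions made for this example.

```python
import math
from itertools import combinations


def pearson(x, y):
    """Pearson correlation of two equal-length profiles.

    Assumes neither profile is constant (otherwise the denominator is zero).
    """
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def top_fraction_threshold(profiles, fraction=0.01):
    """Return (threshold, edges) keeping the strongest `fraction` of gene pairs.

    `profiles` maps a gene name to its expression values across arrays.
    Ranking by |r| is an assumption; the abstract does not say whether
    signed or absolute correlations are ranked.
    """
    pairs = []
    for (gi, xi), (gj, xj) in combinations(profiles.items(), 2):
        pairs.append((abs(pearson(xi, xj)), gi, gj))
    pairs.sort(reverse=True)

    # Keep at least one pair so the threshold is always defined.
    keep = max(1, math.ceil(fraction * len(pairs)))
    threshold = pairs[keep - 1][0]

    # Pairs tied with the threshold value are kept as network edges.
    edges = [(gi, gj) for r, gi, gj in pairs if r >= threshold]
    return threshold, edges


# Hypothetical toy data: three genes measured on four arrays.
toy_profiles = {
    "a": [1, 2, 3, 4],
    "b": [2, 4, 6, 8],
    "c": [1, -1, 1, -1],
}
toy_threshold, toy_edges = top_fraction_threshold(toy_profiles, fraction=0.01)
```

A fixed-percentile rule like this depends on the data only through the ranking of correlations, which is one reason a study would compare it with approaches tied to network structure or statistical significance.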

    Accepted for the Council:

    Dr. Mike Langston and Dr. Arnold Saxton for their encouragement, ideas and constant support. Dr. Elissa Chesler and Dr. Brynn Voy for their insight and ideas when things blurred out for me. John Eblen, Andy Perkins, Gary Rogers, Yun Zhang and all the wonderful students under Dr. Langston, for assisting me and making me feel at home. Especially John, a wonderful friend and colleague, for helping me out with Perl and UNIX programming and for giving me sufficient insight into graph theory to be able to write about it. Dr. Bing Zhang and Dr. Roumyana Yordanova for their help on certain topics of the study. The GST program and its current and former Directors, Dr. Peterson and Dr. Becker, for giving me an opportunity to study at the University of Tennessee, Knoxville. The thresholding problem is important in today’s data-rich research scenario.