29 research outputs found

    Comparative Analysis of Thresholding Algorithms for Microarray-derived Gene Correlation Matrices

    The thresholding problem is important in today’s data-rich research scenario. A threshold is a well-defined point in the data distribution beyond which the data is highly likely to have scientific meaning. The selection of a threshold is crucial since it heavily influences any downstream analysis and the inferences made therefrom. A legitimate threshold is one that is not arbitrary but scientifically well grounded, data-dependent and best segregates the information-rich and noisy sections of data. Although the thresholding problem is not restricted to any particular field of study, little research has been done on it. This study investigates the problem in the context of network-based analysis of transcriptomic data. Six conceptually diverse algorithms – based on number of maximal cliques, correlations of control spots with genes, top 1% of correlations, spectral graph clustering, Bonferroni correction of p-values and statistical power – are used to threshold the gene correlation matrices of three time-series microarray datasets and tested for stability and validity. Stability or reliability of the first four algorithms towards thresholding is tested by block bootstrapping of arrays in the datasets and comparing the estimated thresholds against the bootstrap threshold distributions. Validity of thresholding algorithms is tested by comparing the estimated thresholds against a threshold based on biological information. Thresholds based on the modular basis of gene networks are concluded to perform better in terms of both stability and validity. Future challenges for research on the problem have been identified. Although the study utilizes transcriptomic data for analysis, we assert its applicability to thresholding across various fields.
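Of the six algorithms listed, the "top 1% of correlations" rule is the simplest to illustrate. The following is a minimal sketch, not the study's implementation: it assumes the correlation matrix is a symmetric list of lists and that the ranking uses absolute correlation values, and the function names `top_fraction_threshold` and `threshold_edges` are hypothetical.

```python
import math


def top_fraction_threshold(corr, fraction=0.01):
    """Return the smallest |r| among the top `fraction` of gene pairs.

    Assumption: pairs are ranked by absolute correlation; the abstract
    does not say whether signed or absolute values were used.
    """
    n = len(corr)
    # Each unordered gene pair appears once in the upper triangle.
    values = sorted(
        (abs(corr[i][j]) for i in range(n) for j in range(i + 1, n)),
        reverse=True,
    )
    if not values:
        raise ValueError("need at least two genes")
    keep = max(1, math.ceil(fraction * len(values)))
    return values[keep - 1]


def threshold_edges(corr, threshold):
    """List the gene pairs (i, j), i < j, whose |r| meets the threshold."""
    n = len(corr)
    return [
        (i, j)
        for i in range(n)
        for j in range(i + 1, n)
        if abs(corr[i][j]) >= threshold
    ]
```

The pairs returned by `threshold_edges` can serve as the edge list of a co-expression graph for downstream network analysis.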

    Comparison of threshold selection methods for microarray gene co-expression matrices

    BACKGROUND: Network and clustering analyses of microarray co-expression correlation data often require application of a threshold to discard small correlations, thus reducing computational demands and decreasing the number of uninformative correlations. This study investigated threshold selection in the context of combinatorial network analysis of transcriptome data. FINDINGS: Six conceptually diverse methods - based on number of maximal cliques, correlation of control spots with expressed genes, top 1% of correlations, spectral graph clustering, Bonferroni correction of p-values, and statistical power - were used to estimate a correlation threshold for three time-series microarray datasets. The validity of thresholds was tested by comparison to thresholds derived from Gene Ontology information. Stability and reliability of the best methods were evaluated with block bootstrapping. Two threshold methods, number of maximal cliques and spectral graph, used information in the correlation matrix structure and performed well in terms of stability. Comparison to Gene Ontology found thresholds from number of maximal cliques extracted from a co-expression matrix were the most biologically valid. Approaches to improve both methods were suggested. CONCLUSION: Threshold selection approaches based on network structure of gene relationships gave thresholds with greater relevance to curated biological relationships than approaches based on statistical pair-wise relationships.
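As an illustration of the Bonferroni-based method named above, the sketch below converts a family-wise significance level into a minimum absolute correlation. It is a rough approximation under stated assumptions, not the paper's procedure: it uses the Fisher z-transformation with a normal approximation (the study may have used an exact t-test), treats every gene pair as one two-sided test, and the function name `bonferroni_correlation_threshold` is hypothetical.

```python
import math
from statistics import NormalDist


def bonferroni_correlation_threshold(n_genes, n_samples, alpha=0.05):
    """Approximate the smallest |r| significant after Bonferroni correction.

    Assumptions: one two-sided test per gene pair, and the Fisher
    z-transform atanh(r) is roughly normal with SD 1/sqrt(n_samples - 3).
    """
    if n_genes < 2:
        raise ValueError("need at least two genes")
    if n_samples <= 3:
        raise ValueError("Fisher approximation needs more than 3 samples")
    n_tests = n_genes * (n_genes - 1) // 2
    per_test_alpha = alpha / n_tests  # Bonferroni: split alpha across tests
    z_crit = NormalDist().inv_cdf(1.0 - per_test_alpha / 2.0)
    return math.tanh(z_crit / math.sqrt(n_samples - 3))
```

Under this approximation the threshold rises as more genes are compared and falls as more arrays are available, which is why a pair-wise statistical cutoff can become very strict for genome-scale matrices built from short time series.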

    Extracting Gene Networks for Low-Dose Radiation Using Graph Theoretical Algorithms

    Genes with common functions often exhibit correlated expression levels, which can be used to identify sets of interacting genes from microarray data. Microarrays typically measure expression across genomic space, creating a massive matrix of co-expression that must be mined to extract only the most relevant gene interactions. We describe a graph theoretical approach to extracting co-expressed sets of genes, based on the computation of cliques. Unlike the results of traditional clustering algorithms, cliques are not disjoint and allow genes to be assigned to multiple sets of interacting partners, consistent with biological reality. A graph is created by thresholding the correlation matrix to include only the correlations most likely to signify functional relationships. Cliques computed from the graph correspond to sets of genes for which significant edges are present between all members of the set, representing potential members of common or interacting pathways. Clique membership can be used to infer function about poorly annotated genes, based on the known functions of better-annotated genes with which they share clique membership (i.e., “guilt-by-association”). We illustrate our method by applying it to microarray data collected from the spleens of mice exposed to low-dose ionizing radiation. Differential analysis is used to identify sets of genes whose interactions are impacted by radiation exposure. The correlation graph is also queried independently of cliques to extract edges that are impacted by radiation. We present several examples of multiple gene interactions that are altered by radiation exposure and thus represent potential molecular pathways that mediate the radiation response.
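To make the clique idea concrete, here is a small sketch that enumerates maximal cliques in a thresholded correlation graph using the Bron-Kerbosch algorithm with pivoting. The abstract does not say which clique algorithm the authors used, so this is a stand-in; the helper names `graph_from_edges` and `maximal_cliques` and the gene labels in the usage note are hypothetical.

```python
def graph_from_edges(edges):
    """Build an undirected adjacency map {node: set(neighbours)}."""
    adjacency = {}
    for a, b in edges:
        adjacency.setdefault(a, set()).add(b)
        adjacency.setdefault(b, set()).add(a)
    return adjacency


def maximal_cliques(adjacency):
    """Yield every maximal clique as a sorted list (Bron-Kerbosch, pivoting)."""

    def expand(clique, candidates, excluded):
        if not candidates and not excluded:
            yield sorted(clique)  # nothing can extend it: maximal
            return
        # Pivot on the vertex with the most neighbours among the candidates.
        pivot = max(candidates | excluded,
                    key=lambda v: len(adjacency[v] & candidates))
        for v in list(candidates - adjacency[pivot]):
            yield from expand(clique | {v},
                              candidates & adjacency[v],
                              excluded & adjacency[v])
            candidates = candidates - {v}
            excluded = excluded | {v}

    yield from expand(set(), set(adjacency), set())
```

For example, with edges A-B, A-C, B-C, C-D, C-E and D-E, the generator yields the cliques [A, B, C] and [C, D, E]. Gene C belongs to both, which is the non-disjoint membership the abstract contrasts with traditional clustering.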

    Validation of a host blood transcriptomic biomarker for pulmonary tuberculosis in people living with HIV: a prospective diagnostic and prognostic accuracy study.

    BACKGROUND: A rapid, blood-based triage test that allows targeted investigation for tuberculosis at the point of care could shorten the time to tuberculosis treatment and reduce mortality. We aimed to test the performance of a host blood transcriptomic signature (RISK11) in diagnosing tuberculosis and predicting progression to active pulmonary disease (prognosis) in people with HIV in a community setting. METHODS: In this prospective diagnostic and prognostic accuracy study, adults (aged 18-59 years) with HIV were recruited from five communities in South Africa. Individuals with a history of tuberculosis or household exposure to multidrug-resistant tuberculosis within the past 3 years, comorbid risk factors for tuberculosis, or any condition that would interfere with the study were excluded. RISK11 status was assessed at baseline by real-time PCR; participants and study staff were masked to the result. Participants underwent active surveillance for microbiologically confirmed tuberculosis by providing spontaneously expectorated sputum samples at baseline, if symptomatic during 15 months of follow-up, and at 15 months (the end of the study). The coprimary outcomes were the prevalence and cumulative incidence of tuberculosis disease confirmed by a positive Xpert MTB/RIF, Xpert Ultra, or Mycobacteria Growth Indicator Tube culture, or a combination of such, on at least two separate sputum samples collected within any 30-day period. FINDINGS: Between March 22, 2017, and May 15, 2018, 963 participants were assessed for eligibility and 861 were enrolled. Among 820 participants with valid RISK11 results, eight (1%) had prevalent tuberculosis at baseline: seven (2·5%; 95% CI 1·2-5·0) of 285 RISK11-positive participants and one (0·2%; 0·0-1·1) of 535 RISK11-negative participants. The relative risk (RR) of prevalent tuberculosis was 13·1 times (95% CI 2·1-81·6) greater in RISK11-positive participants than in RISK11-negative participants. 
RISK11 had a diagnostic area under the receiver operating characteristic curve (AUC) of 88·2% (95% CI 77·6-96·7), and a sensitivity of 87·5% (58·3-100·0) and specificity of 65·8% (62·5-69·0) at a predefined score threshold (60%). Of those with RISK11 results, eight had primary endpoint incident tuberculosis during 15 months of follow-up. Tuberculosis incidence was 2·5 per 100 person-years (95% CI 0·7-4·4) in the RISK11-positive group and 0·2 per 100 person-years (0·0-0·5) in the RISK11-negative group. The probability of primary endpoint incident tuberculosis was greater in the RISK11-positive group than in the RISK11-negative group (cumulative incidence ratio 16·0 [95% CI 2·0-129·5]). RISK11 had a prognostic AUC of 80·0% (95% CI 70·6-86·9), and a sensitivity of 88·6% (43·5-98·7) and a specificity of 68·9% (65·3-72·3) for incident tuberculosis at the 60% threshold. INTERPRETATION: RISK11 identified prevalent tuberculosis and predicted risk of progression to incident tuberculosis within 15 months in ambulant people living with HIV. RISK11's performance approached, but did not meet, WHO's target product profile benchmarks for screening and prognostic tests for tuberculosis. FUNDING: Bill & Melinda Gates Foundation and the South African Medical Research Council
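The diagnostic point estimates for prevalent tuberculosis follow from the counts given in the abstract (seven cases among 285 RISK11-positive and one among 535 RISK11-negative participants). The sketch below reproduces that arithmetic only; it does not attempt the confidence intervals, whose method the abstract does not state, and the function name `diagnostic_summary` is hypothetical.

```python
def diagnostic_summary(pos_with, pos_total, neg_with, neg_total):
    """Sensitivity, specificity and relative risk from a 2x2 table.

    pos_* are test-positive participants, neg_* test-negative;
    *_with counts those who have the disease.
    """
    true_pos = pos_with
    false_pos = pos_total - pos_with
    false_neg = neg_with
    true_neg = neg_total - neg_with
    return {
        "sensitivity": true_pos / (true_pos + false_neg),
        "specificity": true_neg / (true_neg + false_pos),
        # Risk in test-positives divided by risk in test-negatives.
        "relative_risk": (pos_with / pos_total) / (neg_with / neg_total),
    }


# Counts taken from the abstract above.
summary = diagnostic_summary(pos_with=7, pos_total=285, neg_with=1, neg_total=535)
```

With these counts the sensitivity is 7/8, the specificity is 534/812, and the relative risk is about 13.1, in line with the reported 87·5%, 65·8% and 13·1.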

    Biomarker-guided tuberculosis preventive therapy (CORTIS): a randomised controlled trial.

    BACKGROUND: Targeted preventive therapy for individuals at highest risk of incident tuberculosis might impact the epidemic by interrupting transmission. We tested performance of a transcriptomic signature of tuberculosis (RISK11) and efficacy of signature-guided preventive therapy in parallel, using a hybrid three-group study design. METHODS: Adult volunteers aged 18-59 years were recruited at five geographically distinct communities in South Africa. Whole blood was sampled for RISK11 by quantitative RT-PCR assay from eligible volunteers without HIV, recent previous tuberculosis (ie, <3 years before screening), or comorbidities at screening. RISK11-positive participants were block randomised (1:2; block size 15) to once-weekly, directly-observed, open-label isoniazid and rifapentine for 12 weeks (ie, RISK11 positive and 3HP positive), or no treatment (ie, RISK11 positive and 3HP negative). A subset of eligible RISK11-negative volunteers were randomly assigned to no treatment (ie, RISK11 negative and 3HP negative). Diagnostic discrimination of prevalent tuberculosis was tested in all participants at baseline. Thereafter, prognostic discrimination of incident tuberculosis was tested in the untreated RISK11-positive versus RISK11-negative groups, and treatment efficacy in the 3HP-treated versus untreated RISK11-positive groups, during active surveillance through 15 months. The primary endpoint was microbiologically confirmed pulmonary tuberculosis. The primary outcome measures were risk ratio [RR] for tuberculosis of RISK11-positive to RISK11-negative participants, and treatment efficacy. This trial is registered with ClinicalTrials.gov, NCT02735590. FINDINGS: 20 207 volunteers were screened, and 2923 participants were enrolled, including RISK11-positive participants randomly assigned to 3HP (n=375) or no 3HP (n=764), and 1784 RISK11-negative participants. 
Cumulative probability of prevalent or incident tuberculosis disease was 0·066 (95% CI 0·049 to 0·084) in RISK11-positive (3HP negative) participants and 0·018 (0·011 to 0·025) in RISK11-negative participants (RR 3·69, 95% CI 2·25-6·05) over 15 months. Tuberculosis prevalence was 47 (4·1%) of 1139 versus 14 (0·78%) of 1784 in RISK11-positive compared with RISK11-negative participants, respectively (diagnostic RR 5·13, 95% CI 2·93 to 9·43). Tuberculosis incidence over 15 months was 2·09 (95% CI 0·97 to 3·19) vs 0·80 (0·30 to 1·30) per 100 person years in RISK11-positive (3HP-negative) participants compared with RISK11-negative participants (cumulative incidence ratio 2·6, 95% CI 1·2 to 5·9). Serious adverse events related to 3HP included one hospitalisation for seizures (unintentional isoniazid overdose) and one death of unknown cause (possibly temporally related). Tuberculosis incidence over 15 months was 1·94 (95% CI 0·35 to 3·50) versus 2·09 (95% CI 0·97 to 3·19) per 100 person-years in 3HP-treated RISK11-positive participants compared with untreated RISK11-positive participants (efficacy 7·0%, 95% CI -145 to 65). INTERPRETATION: The RISK11 signature discriminated between individuals with prevalent tuberculosis, or progression to incident tuberculosis, and individuals who remained healthy, but provision of 3HP to signature-positive individuals after exclusion of baseline disease did not reduce progression to tuberculosis over 15 months. FUNDING: Bill and Melinda Gates Foundation, South African Medical Research Council.
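The incidence comparisons in this abstract are ratios of rates per 100 person-years, and efficacy is conventionally one minus the treated-to-untreated rate ratio. The sketch below applies that arithmetic to the rounded rates quoted above. It is an illustration under that assumption, not the trial's analysis: the published estimates presumably come from unrounded data and a fitted model, so small differences are expected. The function names are hypothetical.

```python
def rate_ratio(rate_a, rate_b):
    """Ratio of two incidence rates expressed in the same units."""
    if rate_b <= 0:
        raise ValueError("reference rate must be positive")
    return rate_a / rate_b


def efficacy(treated_rate, untreated_rate):
    """Efficacy as 1 minus the treated/untreated rate ratio (assumed definition)."""
    return 1.0 - rate_ratio(treated_rate, untreated_rate)


# Rounded rates per 100 person-years, as quoted in the abstract.
prognostic_ratio = rate_ratio(2.09, 0.80)  # untreated RISK11+ vs RISK11-
treatment_efficacy = efficacy(1.94, 2.09)  # 3HP-treated vs untreated RISK11+
```

From the rounded rates this gives a ratio of about 2.6 and an efficacy of about 7.2%, close to the reported 2·6 and 7·0%.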

    Predisposition to Cancer Caused by Genetic and Functional Defects of Mammalian Atad5

    ATAD5, the human ortholog of yeast Elg1, plays a role in PCNA deubiquitination. Since PCNA modification is important to regulate DNA damage bypass, ATAD5 may be important for suppression of genomic instability in mammals in vivo. To test this hypothesis, we generated heterozygous (Atad5+/m) mice that were haploinsufficient for Atad5. Atad5+/m mice displayed high levels of genomic instability in vivo, and Atad5+/m mouse embryonic fibroblasts (MEFs) exhibited molecular defects in PCNA deubiquitination in response to DNA damage, as well as DNA damage hypersensitivity and high levels of genomic instability, apoptosis, and aneuploidy. Importantly, 90% of haploinsufficient Atad5+/m mice developed tumors, including sarcomas, carcinomas, and adenocarcinomas, between 11 and 20 months of age. High levels of genomic alterations were evident in tumors that arose in the Atad5+/m mice. Consistent with a role for Atad5 in suppressing tumorigenesis, we also identified somatic mutations of ATAD5 in 4.6% of sporadic human endometrial tumors, including two nonsense mutations that resulted in loss of proper ATAD5 function. Taken together, our findings indicate that loss-of-function mutations in mammalian Atad5 are sufficient to cause genomic instability and tumorigenesis.

    Accepted for the Council:

    Dr. Mike Langston and Dr. Arnold Saxton, for their encouragement, ideas and constant support. Dr. Elissa Chesler and Dr. Brynn Voy, for their insight and ideas when things blurred out for me. John Eblen, Andy Perkins, Gary Rogers, Yun Zhang and all the wonderful students under Dr. Langston, for helping me find my way around and making me feel at home; especially John, a wonderful friend and colleague, for helping me out with Perl and UNIX programming and for giving me sufficient insight into graph theory to be able to write about it. Dr. Bing Zhang and Dr. Roumyana Yordanova, for their help on certain topics of the study. The GST program and its current and former Directors, Dr. Peterson and Dr. Becker, for giving me an opportunity to study at the University of Tennessee, Knoxville. The thresholding problem is important in today’s data-rich research scenario.

    CORTIS-HR: Statistical Analysis Plan

    Statistical Analysis Plan: 11-gene Correlates of Risk (COR) Diagnostic and Predictive Performance Analysis in HIV-Infected Adults in the "Validation of Correlates of Risk of TB Disease in High Risk Populations (CORTIS-HR): A companion study of the CORTIS-01 Trial". Version: 1.0. Date: 11 December 2019.