77 research outputs found

    A unified framework for finding differentially expressed genes from microarray experiments

    Background: This paper presents a unified framework for finding differentially expressed genes (DEGs) from microarray data. The proposed framework has three interrelated modules: (i) gene ranking, (ii) significance analysis of genes and (iii) validation. The first module uses two gene selection algorithms, namely (a) two-way clustering and (b) combined adaptive ranking, to rank the genes. The second module converts the gene ranks into p-values using an R-test and fuses the two sets of p-values using Fisher's omnibus criterion. The DEGs are selected using FDR analysis. The third module performs a three-fold validation of the obtained DEGs. The robustness of the proposed unified framework in gene selection is first illustrated using false discovery rate analysis. In addition, clustering-based validation of the DEGs is performed by employing an adaptive subspace-based clustering algorithm on the training and test datasets. Finally, a projection-based visualization is performed to validate the DEGs obtained using the unified framework. Results: The performance of the unified framework is compared with well-known ranking algorithms such as t-statistics, Significance Analysis of Microarrays (SAM), Adaptive Ranking, Combined Adaptive Ranking and Two-way Clustering. The performance curves obtained using 50 simulated microarray datasets, each following two different distributions, indicate the superiority of the unified framework over the other reported algorithms. Further analyses on 3 real cancer datasets and 3 Parkinson's datasets show a similar improvement in performance. First, a three-fold validation process is provided for the two-sample cancer datasets. In addition, the analysis on 3 sets of Parkinson's data is performed to demonstrate the scalability of the proposed method to multi-sample microarray datasets. Conclusion: This paper presents a unified framework for the robust selection of genes from two-sample as well as multi-sample microarray experiments. The two different ranking methods used in module 1 bring diversity to the selection of genes. The conversion of ranks to p-values, the fusion of p-values and FDR analysis aid in the identification of significant genes, which cannot be judged based on gene ranking alone. The three-fold validation, namely robustness in selection of genes using FDR analysis, clustering, and visualization, demonstrates the relevance of the DEGs. Empirical analyses on 50 artificial datasets and 6 real microarray datasets illustrate the efficacy of the proposed approach. The analyses on 3 cancer datasets demonstrate the utility of the proposed approach on microarray datasets with two classes of samples. The scalability of the proposed unified approach to multi-sample (more than two sample classes) microarray datasets is addressed using three sets of Parkinson's data. Empirical analyses show that the unified framework outperformed other gene selection methods in selecting differentially expressed genes from microarray data.
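The p-value fusion and FDR selection steps described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names and the example p-values are hypothetical, the two p-values per gene are assumed to be independent, and the Benjamini-Hochberg step-up procedure is assumed as the FDR method because the abstract only says "FDR analysis".

```python
import math


def fisher_combine(p1, p2):
    """Fuse two p-values with Fisher's omnibus criterion.

    X = -2 * (ln p1 + ln p2) follows a chi-square distribution with
    4 degrees of freedom under the null, assuming independent p-values.
    For 4 degrees of freedom the survival function has the closed form
    exp(-x/2) * (1 + x/2), so no statistics library is needed.
    """
    if not (0.0 < p1 <= 1.0 and 0.0 < p2 <= 1.0):
        raise ValueError("p-values must lie in (0, 1]")
    half_x = -(math.log(p1) + math.log(p2))  # this is X / 2
    return math.exp(-half_x) * (1.0 + half_x)


def benjamini_hochberg(pvalues, alpha=0.05):
    """Return indices rejected at FDR level alpha (BH step-up; an assumption)."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    cutoff = 0  # largest rank k with p_(k) <= k * alpha / m
    for rank, idx in enumerate(order, start=1):
        if pvalues[idx] <= rank * alpha / m:
            cutoff = rank
    return sorted(order[:cutoff])


# Hypothetical per-gene p-values from the two ranking methods.
p_clustering = [0.01, 0.60, 0.03, 0.90]
p_adaptive = [0.02, 0.40, 0.20, 0.70]

fused = [fisher_combine(a, b) for a, b in zip(p_clustering, p_adaptive)]
selected = benjamini_hochberg(fused, alpha=0.05)
print(selected)  # indices of genes kept as candidate DEGs
```

Fusing before thresholding means a gene that is only moderately significant under each ranking can still be selected when both rankings agree.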

    Supervised group Lasso with applications to microarray data analysis

    BACKGROUND: Tremendous effort has been devoted to identifying genes for diagnosis and prognosis of diseases using microarray gene expression data. It has been demonstrated that gene expression data have cluster structure, where the clusters consist of co-regulated genes which tend to have coordinated functions. However, most available statistical methods for gene selection do not take the cluster structure into consideration. RESULTS: We propose a supervised group Lasso approach that takes into account the cluster structure in gene expression data for gene selection and predictive model building. For gene expression data without biological cluster information, we first divide genes into clusters using the K-means approach and determine the optimal number of clusters using the Gap method. The supervised group Lasso consists of two steps. In the first step, we identify important genes within each cluster using the Lasso method. In the second step, we select important clusters using the group Lasso. Tuning parameters are determined using V-fold cross validation at both steps to allow for further flexibility. Prediction performance is evaluated using leave-one-out cross validation. We apply the proposed method to disease classification and survival analysis with microarray data. CONCLUSION: We analyze four microarray data sets using the proposed approach: two cancer data sets with binary cancer occurrence as outcomes and two lymphoma data sets with survival outcomes. The results show that the proposed approach is capable of identifying a small number of influential gene clusters and important genes within those clusters, and has better prediction performance than existing methods.

    Query Large Scale Microarray Compendium Datasets Using a Model-Based Bayesian Approach with Variable Selection

    In microarray gene expression data analysis, it is often of interest to identify genes that share similar expression profiles with a particular gene such as a key regulatory protein. Multiple studies have been conducted using various correlation measures to identify co-expressed genes. While working well for small datasets, the heterogeneity introduced from increased sample size inevitably reduces the sensitivity and specificity of these approaches. This is because most co-expression relationships do not extend to all experimental conditions. With the rapid increase in the size of microarray datasets, identifying functionally related genes from large and diverse microarray gene expression datasets is a key challenge. We develop a model-based gene expression query algorithm built under the Bayesian model selection framework. It is capable of detecting co-expression profiles under a subset of samples/experimental conditions. In addition, it allows linearly transformed expression patterns to be recognized and is robust against sporadic outliers in the data. Both features are critically important for increasing the power of identifying co-expressed genes in large scale gene expression datasets. Our simulation studies suggest that this method outperforms existing correlation coefficients or mutual information-based query tools. When we apply this new method to the Escherichia coli microarray compendium data, it identifies a majority of known regulons as well as novel potential target genes of numerous key transcription factors

    Enhanced upper genital tract pathologies by blocking Tim-3 and PD-L1 signaling pathways in mice intravaginally infected with Chlamydia muridarum

    Background: Although the Tim-3 and PD-L1 signaling pathways play important roles in negatively regulating immune responses, their roles in chlamydial infection have not been evaluated. Methods: Neutralization antibodies targeting Tim-3 and PD-L1 were used to treat mice. Following an intravaginal infection with C. muridarum organisms, mice with or without the dual antibody treatment were compared for live chlamydial organism shedding from the lower genital tract and inflammatory pathology in the upper genital tract. Results: Mice treated with anti-Tim-3 and anti-PD-L1 antibodies displayed a time course of live organism shedding similar to that of mice treated with equivalent amounts of isotype-matched IgG molecules. The combined antibody blocking failed to alter either the lower genital tract cytokine or systemic humoral and cellular adaptive responses to C. muridarum infection. However, the antibody blocking significantly enhanced C. muridarum-induced pathologies in the upper genital tract, including more significant hydrosalpinx and inflammatory infiltration in uterine horn and oviduct tissues. Conclusions: The Tim-3 and PD-L1-mediated signaling can significantly reduce pathologies in the upper genital tract without suppressing immunity against chlamydial infection, suggesting that Tim-3 and PD-L1-mediated negative regulation may be manipulated to attenuate tubal pathologies in women persistently infected with C. trachomatis organisms.

    Identification of the CRE-1 Cellulolytic Regulon in Neurospora crassa

    Background: In filamentous ascomycete fungi, the utilization of alternate carbon sources is influenced by the zinc finger transcription factor CreA/CRE-1, which encodes a carbon catabolite repressor protein homologous to Mig1 from Saccharomyces cerevisiae. In Neurospora crassa, deletion of cre-1 results in increased secretion of amylase and β-galactosidase. Methodology/Principal Findings: Here we show that a strain carrying a deletion of cre-1 has increased cellulolytic activity and increased expression of cellulolytic genes during growth on crystalline cellulose (Avicel). Constitutive expression of cre-1 complements the phenotype of a N. crassa Δcre-1 strain grown on Avicel, and also results in stronger repression of cellulolytic protein secretion and enzyme activity. We determined the CRE-1 regulon by investigating the secretome and transcriptome of a Δcre-1 strain as compared to wild type when grown on Avicel versus minimal medium. Chromatin immunoprecipitation-PCR of putative target genes showed that CRE-1 binds to only some adjacent 5′-SYGGRG-3′ motifs, consistent with previous findings in other fungi, and suggests that unidentified additional regulatory factors affect CRE-1 binding to promoter regions. Characterization of 30 mutants containing deletions in genes whose expression level increased in a Δcre-1 strain under cellulolytic conditions identified novel genes that affect cellulase activity and protein secretion.

    Efficacy and cost-effectiveness of an outcall program to reduce carer burden and depression among carers of cancer patients (PROTECT) : rationale and design of a randomized controlled trial

    Published: 6 January 2014. BACKGROUND: Carers provide extended and often unrecognized support to people with cancer. The aim of this study is to test the hypothesis that excessive carer burden is modifiable through a telephone outcall intervention that includes supportive care, information and referral to appropriate psycho-social services. Secondary aims include estimation of changes in psychological health and quality of life. The study will determine whether the intervention reduces unmet needs among patient dyads. A formal economic program will also be conducted. METHODS/DESIGN: This study is a single-blind, multi-centre, randomized controlled trial to determine the efficacy and cost-efficacy of a telephone outcall program among carers of newly diagnosed cancer patients. A total of 230 carer/patient dyads will be recruited into the study; following written consent, carers will be randomly allocated to either the outcall intervention program (n = 115) or to a minimal outcall / attention control service (n = 115). Carer assessments will occur at baseline and at one and six months post-intervention. The primary outcome is change in carer burden; the secondary outcomes are change in carer depression, quality of life, health literacy and unmet needs. The trial patients will be assessed at baseline and one month post-intervention to determine depression levels and unmet needs. The economic analysis will include perspectives of both the health care sector and broader society and comprise a cost-consequences analysis where all outcomes will be compared to costs. DISCUSSION: This study will contribute to our understanding of the potential impact of a telephone outcall program on carer burden and provide new evidence on an approach for improving the wellbeing of carers. Patricia M Livingston, Richard H Osborne, Mari Botti, Cathy Mihalopoulos, Sean McGuigan, Leila Heckel, Kate Gunn, Jacquie Chirgwin, David M Ashley and Melinda William

    Gene expression profiling of primary cultures of ovarian epithelial cells identifies novel molecular classifiers of ovarian cancer

    In order to elucidate the biological variance between normal ovarian surface epithelial (NOSE) and epithelial ovarian cancer (EOC) cells, and to build a molecular classifier to discover new markers distinguishing these cells, we analysed gene expression patterns of 65 primary cultures of these tissues by oligonucleotide microarray. Unsupervised clustering highlights three subgroups of tumours: low malignant potential tumours, invasive solid tumours and tumour cells derived from ascites. We selected 18 genes with expression profiles that enable the distinction of NOSE from these three groups of EOC with 92% accuracy. Validation using an independent published data set derived from tissues or primary cultures confirmed a high accuracy (87–96%). The distinctive expression pattern of a subset of genes was validated by quantitative reverse transcription–PCR. An ovarian-specific tissue array representing tissues from NOSE and EOC samples of various subtypes and grades was used to further assess the protein expression patterns of two differentially expressed genes (Msln and BMP-2) by immunohistochemistry. This study highlights the relevance of using primary cultures of epithelial ovarian cells as a model system for gene profiling studies and demonstrates that the statistical analysis of gene expression profiling is a useful approach for selecting novel molecular tumour markers

    Familial hypercholesterolaemia in children and adolescents from 48 countries: a cross-sectional study

    Background Approximately 450 000 children are born with familial hypercholesterolaemia worldwide every year, yet only 2·1% of adults with familial hypercholesterolaemia were diagnosed before age 18 years via current diagnostic approaches, which are derived from observations in adults. We aimed to characterise children and adolescents with heterozygous familial hypercholesterolaemia (HeFH) and understand current approaches to the identification and management of familial hypercholesterolaemia to inform future public health strategies. Methods For this cross-sectional study, we assessed children and adolescents younger than 18 years with a clinical or genetic diagnosis of HeFH at the time of entry into the Familial Hypercholesterolaemia Studies Collaboration (FHSC) registry between Oct 1, 2015, and Jan 31, 2021. Data in the registry were collected from 55 regional or national registries in 48 countries. Diagnoses relying on self-reported history of familial hypercholesterolaemia and suspected secondary hypercholesterolaemia were excluded from the registry; people with untreated LDL cholesterol (LDL-C) of at least 13·0 mmol/L were excluded from this study. Data were assessed overall and by WHO region, World Bank country income status, age, diagnostic criteria, and index-case status. The main outcome of this study was to assess current identification and management of children and adolescents with familial hypercholesterolaemia. Findings Of 63 093 individuals in the FHSC registry, 11 848 (18·8%) were children or adolescents younger than 18 years with HeFH and were included in this study; 5756 (50·2%) of 11 476 included individuals were female and 5720 (49·8%) were male. Sex data were missing for 372 (3·1%) of 11 848 individuals. Median age at registry entry was 9·6 years (IQR 5·8–13·2). 10 099 (89·9%) of 11 235 included individuals had a final genetically confirmed diagnosis of familial hypercholesterolaemia and 1136 (10·1%) had a clinical diagnosis. 
Genetically confirmed diagnosis data or clinical diagnosis data were missing for 613 (5·2%) of 11 848 individuals. Genetic diagnosis was more common in children and adolescents from high-income countries (9427 [92·4%] of 10 202) than in children and adolescents from non-high-income countries (199 [48·0%] of 415). 3414 (31·6%) of 10 804 children or adolescents were index cases. Familial-hypercholesterolaemia-related physical signs, cardiovascular risk factors, and cardiovascular disease were uncommon, but were more common in non-high-income countries. 7557 (72·4%) of 10 428 included children or adolescents were not taking lipid-lowering medication (LLM) and had a median LDL-C of 5·00 mmol/L (IQR 4·05–6·08). Compared with genetic diagnosis, the use of unadapted clinical criteria intended for use in adults and reliant on more extreme phenotypes could result in 50–75% of children and adolescents with familial hypercholesterolaemia not being identified. Interpretation Clinical characteristics observed in adults with familial hypercholesterolaemia are uncommon in children and adolescents with familial hypercholesterolaemia, hence detection in this age group relies on measurement of LDL-C and genetic confirmation. Where genetic testing is unavailable, increased availability and use of LDL-C measurements in the first few years of life could help reduce the current gap between prevalence and detection, enabling increased use of combination LLM to reach recommended LDL-C targets early in life. Funding Pfizer, Amgen, Merck Sharp & Dohme, Sanofi–Aventis, Daiichi Sankyo, and Regeneron

    Pan-cancer analysis of whole genomes

    Cancer is driven by genetic change, and the advent of massively parallel sequencing has enabled systematic documentation of this variation at the whole-genome scale (1-3). Here we report the integrative analysis of 2,658 whole-cancer genomes and their matching normal tissues across 38 tumour types from the Pan-Cancer Analysis of Whole Genomes (PCAWG) Consortium of the International Cancer Genome Consortium (ICGC) and The Cancer Genome Atlas (TCGA). We describe the generation of the PCAWG resource, facilitated by international data sharing using compute clouds. On average, cancer genomes contained 4-5 driver mutations when combining coding and non-coding genomic elements; however, in around 5% of cases no drivers were identified, suggesting that cancer driver discovery is not yet complete. Chromothripsis, in which many clustered structural variants arise in a single catastrophic event, is frequently an early event in tumour evolution; in acral melanoma, for example, these events precede most somatic point mutations and affect several cancer-associated genes simultaneously. Cancers with abnormal telomere maintenance often originate from tissues with low replicative activity and show several mechanisms of preventing telomere attrition to critical levels. Common and rare germline variants affect patterns of somatic mutation, including point mutations, structural variants and somatic retrotransposition. A collection of papers from the PCAWG Consortium describes non-coding mutations that drive cancer beyond those in the TERT promoter (4); identifies new signatures of mutational processes that cause base substitutions, small insertions and deletions and structural variation (5,6); analyses timings and patterns of tumour evolution (7); describes the diverse transcriptional consequences of somatic mutation on splicing, expression levels, fusion genes and promoter activity (8,9); and evaluates a range of more-specialized features of cancer genomes (8,10-18). Peer reviewed