2,386 research outputs found

    Gene-expression of metastasized versus non-metastasized primary head and neck squamous cell carcinomas: A pathway-based analysis

    Background: Regional lymph node metastasis is an important prognostic factor in head and neck squamous cell carcinoma (HNSCC) and plays a decisive role in the choice of treatment. Here, we present an independent gene expression validation study of metastasized versus non-metastasized HNSCC. Methods: We used a dataset recently published by Roepman et al. as reference dataset and an independent gene expression dataset of 11 metastasized and 11 non-metastasized HNSCC tumors as validation dataset. Reference and validation studies were performed on different microarray platforms with different probe sets and probe content. In addition to a supervised gene-based analysis, a supervised pathway-based analysis was performed, evaluating differences in gene expression for predefined tumorigenesis- and metastasis related gene sets. Results: The gene-based analysis showed 26 significant differentially expressed genes in the reference dataset, 21 of which were present on the microarray platform used in the validation study. 7 of these genes appeared to be significantly expressed in the validation dataset, but failed to pass the correction for multiple testing. The pathway-based analysis revealed 23 significant differentially expressed gene sets, 7 of which were statistically validated. These gene sets are involved in extracellular matrix remodeling (MMPs, MMP regulating pathways and the uPA system), hypoxia and angiogenesis (HIF1α regulated angiogenic factors and HIF1α regulated invasion). Conclusion: Pathways that are differentially expressed between metastasized and non-metastasized HNSCC are involved in the processes of extracellular matrix remodeling, hypoxia and angiogenesis. 
    A supervised pathway-based analysis enhances the understanding of the biological context of the results and the comparability of results across different microarray studies, and it reduces multiple testing problems by focusing on a limited number of pathways of interest instead of analyzing the large number of probes available on the microarray.
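    The multiple-testing argument in this abstract can be illustrated with a minimal sketch. It assumes a Bonferroni correction, which the abstract does not name, and the p-value and test counts are hypothetical numbers chosen for illustration, not values from the study.

```python
def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test threshold that keeps the family-wise error rate at alpha."""
    if n_tests < 1:
        raise ValueError("n_tests must be at least 1")
    return alpha / n_tests


def passes_bonferroni(p_value: float, alpha: float, n_tests: int) -> bool:
    """True if p_value remains significant after Bonferroni correction."""
    return p_value <= bonferroni_threshold(alpha, n_tests)


# Hypothetical example: the same raw p-value tested in two settings.
raw_p = 0.001

# Probe-level analysis: assume 20,000 probes are tested (illustrative count).
probe_level = passes_bonferroni(raw_p, alpha=0.05, n_tests=20_000)

# Pathway-level analysis: assume 25 predefined gene sets (illustrative count).
pathway_level = passes_bonferroni(raw_p, alpha=0.05, n_tests=25)

print(probe_level, pathway_level)
```

    Under these assumed numbers the raw p-value survives correction only in the pathway-level setting, because the per-test threshold shrinks in proportion to the number of tests.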

    Fuzzy logistic regression for detecting differential DNA methylation regions

    “Epigenetics is the study of changes in gene activity or function that are not related to a change in the DNA sequence. DNA methylation is one of the main types of epigenetic modifications, that occur when a methyl chemical group attaches to a cytosine on the DNA sequence. Although the sequence does not change, the addition of a methyl group can change the way genes are expressed and produce different phenotypes. DNA methylation is involved in many biological processes and has important implications in the fields of biomedicine and agriculture. Statistical methods have been developed to compare DNA methylation at cytosine nucleotides between populations of interest (e.g., healthy and diseased) across the entire genome from next generation sequence (NGS) data. Testing for the differences between populations in DNA methylation at specific sites is often followed by an assessment of regional difference using post hoc aggregation procedures to group neighboring sites that are differentially methylated. Although site-level analysis can yield some useful information, there are advantages to testing for differential methylation across entire genomic regions. Examining genomic regions produces less noise, reduces the numbers of statistical tests, and has the potential to provide more informative results to biologists. In this research, several different types of logistic regression models are investigated to test for differentially methylated regions (DMRs). The focus of this work is on developing a fuzzy logistic regression model for DMR detection. Two other logistic regression methods (weighted average logistic regression and ordinal logistic regression) are also introduced as alternative approaches. The performance of these novel approaches are then compared with an existing logistic regression method (MAGIg) for region-level testing, using data simulated based on two (one plant, one human) real NGS methylation data sets”--Abstract, page iii

    Gene Expression Analysis Methods on Microarray Data: A Review

    In recent years, a new type of experiment has been changing the way that biologists and other specialists analyze many problems. These are called high-throughput experiments, and the main difference from those performed some years ago lies in the quantity of data obtained from them. Thanks to the technology known generically as microarrays, it is now possible to study in a single experiment the behavior of all the genes of an organism under different conditions. The data generated by these experiments may consist of thousands to millions of variables, and they pose many challenges to the scientists who have to analyze them. Many of these challenges are statistical in nature and are the focus of this review. There are many types of microarrays, which have been developed to answer different biological questions, and some of them will be explained later. For the sake of simplicity, we start with the best known ones: expression microarrays.

    Human-Associated Microbial Signatures: Examining Their Predictive Value

    Summary: Host-associated microbial communities are unique to individuals, affect host health, and correlate with disease states. Although advanced technologies capture detailed snapshots of microbial communities, high within- and between-subject variation hampers discovery of microbial signatures in diagnostic or forensic settings. We suggest turning to machine learning and discuss key directions toward harnessing human-associated microbial signatures.

    Human aging is characterized by focused changes in gene expression and deregulation of alternative splicing

    This is the final version. Available on open access from Wiley via the DOI in this record. Summary: Aging is a major risk factor for chronic disease in the human population, but there are little human data on gene expression alterations that accompany the process. We examined human peripheral blood leukocyte in-vivo RNA in a large-scale transcriptomic microarray study (subjects aged 30-104 years). We tested associations between probe expression intensity and advancing age (adjusting for confounding factors), initially in a discovery set (n=458), following-up findings in a replication set (n=240). We confirmed expression of key results by real-time PCR. Of 16571 expressed probes, only 295 (2%) were robustly associated with age. Just six probes were required for a highly efficient model for distinguishing between young and old (area under the curve in replication set; 95%). The focused nature of age-related gene expression may therefore provide potential biomarkers of aging. Similarly, only 7 of 1065 biological or metabolic pathways were age-associated, in gene set enrichment analysis, notably including the processing of messenger RNAs (mRNAs); [P<0.002, false discovery rate (FDR) q<0.05]. This is supported by our observation of age-associated disruption to the balance of alternatively expressed isoforms for selected genes, suggesting that modification of mRNA processing may be a feature of human aging. © 2011 The Authors. Aging Cell © 2011 Blackwell Publishing Ltd/Anatomical Society of Great Britain and Ireland. National Institute for Health Research (NIHR

    Prediction of gene expression in embryonic structures of Drosophila melanogaster.

    Understanding how sets of genes are coordinately regulated in space and time to generate the diversity of cell types that characterise complex metazoans is a major challenge in modern biology. The use of high-throughput approaches, such as large-scale in situ hybridisation and genome-wide expression profiling via DNA microarrays, is beginning to provide insights into the complexities of development. However, in many organisms the collection and annotation of comprehensive in situ localisation data is a difficult and time-consuming task. Here, we present a widely applicable computational approach, integrating developmental time-course microarray data with annotated in situ hybridisation studies, that facilitates the de novo prediction of tissue-specific expression for genes that have no in vivo gene expression localisation data available. Using a classification approach, trained with data from microarray and in situ hybridisation studies of gene expression during Drosophila embryonic development, we made a set of predictions on the tissue-specific expression of Drosophila genes that have not been systematically characterised by in situ hybridisation experiments. The reliability of our predictions is confirmed by literature-derived annotations in FlyBase, by overrepresentation of Gene Ontology biological process annotations, and, in a selected set, by detailed gene-specific studies from the literature. Our novel organism-independent method will be of considerable utility in enriching the annotation of gene function and expression in complex multicellular organisms.

    Gene expression profiling in primary breast cancer distinguishes patients developing local recurrence after breast-conservation surgery, with or without postoperative radiotherapy

    Introduction: Some patients with breast cancer develop local recurrence after breast-conservation surgery despite postoperative radiotherapy, whereas others remain free of local recurrence even in the absence of radiotherapy. As clinical parameters are insufficient for identifying these two groups of patients, we investigated whether gene expression profiling would add further information. Methods: We performed gene expression analysis (oligonucleotide arrays, 26,824 reporters) on 143 patients with lymph node-negative disease and tumor-free margins. A support vector machine was employed to build classifiers using leave-one-out cross-validation. Results: Within the estrogen receptor-positive (ER+) subgroup, the gene expression profile clearly distinguished patients with local recurrence after radiotherapy (n = 20) from those without local recurrence (n = 80 with or without radiotherapy). The receiver operating characteristic (ROC) area was 0.91, and 5,237 of 26,824 reporters had a P value of less than 0.001 (false discovery rate = 0.005). This gene expression profile provides substantially added value to conventional clinical markers (for example, age, histological grade, and tumor size) in predicting local recurrence despite radiotherapy. Within the ER- subgroup, a weaker, but still significant, signal was found (ROC area = 0.74). The ROC area for distinguishing patients who develop local recurrence from those who remain local recurrence-free in the absence of radiotherapy was 0.66 (combined ER+/ER-). Conclusion: A highly distinct gene expression profile for patients developing local recurrence after breast-conservation surgery despite radiotherapy has been identified. If verified in further studies, this profile might be a most important tool in the decision making for surgery and adjuvant therapy.
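    Leave-one-out cross-validation, as mentioned in the Methods of this abstract, can be sketched as follows. This is only an illustration: a simple nearest-centroid classifier stands in for the support vector machine used in the study, and the two-feature toy data are invented for the example.

```python
from statistics import fmean


def nearest_centroid_predict(train_x, train_y, sample):
    """Assign sample to the class whose mean feature vector is closest."""
    best_label, best_dist = None, float("inf")
    for label in sorted(set(train_y)):
        rows = [x for x, y in zip(train_x, train_y) if y == label]
        centroid = [fmean(col) for col in zip(*rows)]
        dist = sum((a - b) ** 2 for a, b in zip(sample, centroid))
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label


def leave_one_out_accuracy(xs, ys):
    """Hold out each sample once, train on the rest, and score the prediction."""
    correct = 0
    for i in range(len(xs)):
        train_x = xs[:i] + xs[i + 1:]
        train_y = ys[:i] + ys[i + 1:]
        if nearest_centroid_predict(train_x, train_y, xs[i]) == ys[i]:
            correct += 1
    return correct / len(xs)


# Invented toy data: two well-separated groups of three samples each.
xs = [[1.0, 1.2], [0.9, 1.0], [1.1, 0.8], [3.0, 3.1], [2.9, 3.3], [3.2, 2.8]]
ys = [0, 0, 0, 1, 1, 1]

accuracy = leave_one_out_accuracy(xs, ys)
print(accuracy)
```

    Each sample is classified by a model that never saw it during training, which is what makes the resulting accuracy an estimate of performance on unseen cases.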

    Evaluation of gene importance in microarray data based upon probability of selection

    BACKGROUND: Microarray devices permit a genome-scale evaluation of gene function. This technology has catalyzed biomedical research and development in recent years. As many important diseases can be traced down to the gene level, a long-standing research problem is to identify specific gene expression patterns linked to metabolic characteristics that contribute to disease development and progression. The microarray approach offers an expedited solution to this problem. However, recognizing disease-related gene expression patterns embedded in the microarray data remains challenging. In selecting a small set of biologically significant genes for classifier design, the high data dimensionality inherent in this problem creates a substantial amount of uncertainty. RESULTS: Here we present a model for probability analysis of selected genes in order to determine their importance. Our contribution is that we show how to derive the P value of each selected gene in multiple gene selection trials based on different combinations of data samples and how to conduct a reliability analysis accordingly. The importance of a gene is indicated by its associated P value in that a smaller value implies higher information content from information theory. On the microarray data concerning the subtype classification of small round blue cell tumors, we demonstrate that the method is capable of finding the smallest set of genes (19 genes) with optimal classification performance, compared with results reported in the literature. CONCLUSION: In classifier design based on microarray data, the probability value derived from gene selection based on multiple combinations of data samples enables an effective mechanism for reducing the tendency of fitting local data particularities.