2,915 research outputs found

    Gene-Gene Interaction Analysis for the Accelerated Failure Time Model Using a Unified Model-Based Multifactor Dimensionality Reduction Method

    Although a large number of genetic variants have been identified as associated with common diseases through genome-wide association studies, there still exist limitations in explaining the missing heritability. One approach to solving this missing heritability problem is to investigate gene-gene interactions rather than relying on a single-locus approach. For gene-gene interaction analysis, the multifactor dimensionality reduction (MDR) method has been widely applied, since the constructive induction algorithm of MDR efficiently reduces high-order dimensions into one dimension by classifying multi-level genotypes into high- and low-risk groups. The MDR method has been extended to various phenotypes and has been improved to provide a significance test for gene-gene interactions. In this paper, we propose a simple method, called accelerated failure time (AFT) UM-MDR, in which the idea of a unified model-based MDR is extended to the survival phenotype by incorporating AFT-MDR into the classification step. The proposed AFT UM-MDR method is compared with AFT-MDR through simulation studies, and a short discussion is given.
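    The constructive induction step mentioned in this abstract can be illustrated with a minimal sketch. It shows the classic case-control form of MDR, not the AFT UM-MDR method the paper proposes (which, per the abstract, uses AFT-MDR in the classification step). The function names `mdr_classify` and `reduce_to_one_dimension`, the 0/1/2 genotype coding, and the choice of the overall case:control ratio as the default cutoff are all assumptions made for illustration.

```python
from collections import defaultdict


def mdr_classify(genotypes, status, threshold=None):
    """Label each observed multi-locus genotype cell as high-risk (1) or low-risk (0).

    genotypes: list of tuples, one tuple of genotype codes per subject
    status:    list of 0/1 labels (1 = case, 0 = control)
    threshold: case:control ratio cutoff; assumed here to default to the
               overall case:control ratio (requires at least one control)
    """
    n_cases = sum(status)
    n_controls = len(status) - n_cases
    if threshold is None:
        threshold = n_cases / n_controls

    # cell -> [number of controls, number of cases]
    counts = defaultdict(lambda: [0, 0])
    for g, y in zip(genotypes, status):
        counts[tuple(g)][y] += 1

    risk = {}
    for cell, (controls, cases) in counts.items():
        # A cell with cases but no controls is treated as high-risk.
        ratio = float("inf") if controls == 0 else cases / controls
        risk[cell] = 1 if ratio > threshold else 0
    return risk


def reduce_to_one_dimension(genotypes, risk):
    """Replace each subject's multi-locus genotype with its binary risk label."""
    return [risk[tuple(g)] for g in genotypes]


# Toy two-locus example (fabricated purely for illustration).
genotypes = [(0, 0), (0, 0), (0, 1), (0, 1), (1, 1), (1, 1)]
status = [1, 1, 1, 0, 0, 0]
risk = mdr_classify(genotypes, status)
labels = reduce_to_one_dimension(genotypes, risk)
```

    The single binary label per subject is what makes the high-order genotype table tractable: any downstream model or test then works with one predictor instead of one per genotype combination.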

    Toxicogenomic Fingerprint Identification in Springtails to Assess Pesticide-Contaminated Soils


    INTEGRATIVE ANALYSIS OF OMICS DATA IN ADULT GLIOMA AND OTHER TCGA CANCERS TO GUIDE PRECISION MEDICINE

    Transcriptomic profiling and gene expression signatures have been widely applied as effective approaches for enhancing the molecular classification, diagnosis, prognosis or prediction of therapeutic response towards personalized therapy for cancer patients. Thanks to modern genome-wide profiling technology, scientists are able to build engines that leverage massive genomic variations and integrate them with clinical data to identify “at risk” individuals for the sake of prevention, diagnosis and therapeutic interventions. In my graduate work for my Ph.D. thesis, I have investigated genomic sequencing data mining to comprehensively characterise molecular classifications and aberrant genomic events associated with clinical prognosis and treatment response, applying high-dimensional omics data to promote the understanding of gene signatures and somatic molecular alterations contributing to cancer progression and clinical outcomes. Following this motivation, my dissertation has focused on the following three topics in translational genomics. 1) Characterization of transcriptomic plasticity and its association with the tumor microenvironment in glioblastoma (GBM). I have integrated transcriptomic, genomic, protein and clinical data to increase the accuracy of GBM classification, and identified the association between the GBM mesenchymal subtype and reduced tumor purity, accompanied by an increased presence of tumor-associated microglia.
Then I have identified the tumor bulk, but not the corresponding neurosphere cells, as the sole source of microglia through both transcriptional and protein-level analysis using a panel of sphere-forming glioma cultures and their parent GBM samples. Furthermore, I have demonstrated through longitudinal analysis of paired primary and recurrent GBM samples that the phenotypic alterations of GBM subtypes are not due to an intrinsic proneural-to-mesenchymal transition in tumor cells; rather, they are intertwined with an increased level of microglia upon disease recurrence. Collectively, I have elucidated the critical role of the tumor microenvironment (microglia and macrophages from the central nervous system) in contributing to intra-tumor heterogeneity and to the accurate classification of GBM patients based on transcriptomic profiling, which will not only significantly impact the clinical perspective but also pave the way for preclinical cancer research. 2) Identification of prognostic gene signatures that stratify adult diffuse glioma patients harboring 1p/19q co-deletions. I have compared multiple statistical methods and derived a gene signature significantly associated with survival by applying a machine learning algorithm. Then I have identified inflammatory response and acetylation activity that are associated with malignant progression of 1p/19q co-deleted glioma. In addition, I showed that this signature translates to other types of adult diffuse glioma, suggesting its universality in the pathobiology of other glioma subsets. My efforts on integrative data analysis of this highly curated data set using optimized statistical models will reflect the pending update to the WHO classification system of tumors in the central nervous system (CNS). 3) Comprehensive characterization of somatic fusion transcripts in Pan-Cancers. I have identified a panel of novel fusion transcripts across all of the TCGA cancer types through transcriptomic profiling.
Then I have predicted fusion proteins with kinase activity and hub function in pathway networks based on the annotation of genetically mobile domains and functional domain architectures. I have evaluated a panel of in-frame gene fusions as potential driver mutations based on the network fusion centrality hypothesis. I have also characterised the emerging complexity of genetic architecture in fusion transcripts by integrating genomic structure and somatic variants and delineating the distinct genomic patterns of fusion events across different cancer types. Overall, my exploration of the pathogenetic impact and clinical relevance of candidate gene fusions has provided fundamental insights into the management of a subset of cancer patients by predicting the oncogenic signalling and specific drug targets encoded by these fusion genes. Taken together, the translational genomic research I have conducted during my Ph.D. study will shed new light on precision medicine and contribute to the cancer research community. The novel classification concept, gene signature and fusion transcripts I have identified will address several hotly debated issues in translational genomics, such as complex interactions between tumor bulks and their adjacent microenvironments, prognostic markers for clinical diagnostics and personalized therapy, and distinct patterns of genomic structure alterations and oncogenic events in different cancer types, thereby facilitating our understanding of genomic alterations and moving us towards the development of precision medicine.

    Genome-wide association study of primary tooth eruption identifies pleiotropic loci associated with height and craniofacial distances

    Twin and family studies indicate that the timing of primary tooth eruption is highly heritable, with estimates typically exceeding 80%. To identify variants involved in primary tooth eruption we performed a population-based genome-wide association study of ‘age at first tooth’ and ‘number of teeth’ using 5998 and 6609 individuals respectively from the Avon Longitudinal Study of Parents and Children (ALSPAC) and 5403 individuals from the 1966 Northern Finland Birth Cohort (NFBC1966). We tested 2,446,724 SNPs imputed in both studies. Analyses were controlled for the effect of gestational age, sex and age of measurement. Results from the two studies were combined using fixed effects inverse variance meta-analysis. We identified a total of fifteen independent loci, with ten loci reaching genome-wide significance (p < 5 × 10^−8) for ‘age at first tooth’ and eleven loci for ‘number of teeth’. Together these associations explain 6.06% of the variation in ‘age at first tooth’ and 4.76% of the variation in ‘number of teeth’. The identified loci included eight previously unidentified loci, some containing genes known to play a role in tooth and other developmental pathways, including a SNP in the protein-coding region of BMP4 (rs17563, P = 9.080 × 10^−17). Three of these loci, containing the genes HMGA2, AJUBA and ADK, also showed evidence of association with craniofacial distances, particularly those indexing facial width. Our results suggest that the genome-wide association approach is a powerful strategy for detecting variants involved in tooth eruption, and potentially craniofacial growth and more generally organ development.
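    The fixed effects inverse variance meta-analysis named in this abstract can be sketched as follows. This is a generic textbook form under a normal approximation, not the authors' pipeline; the function name `fixed_effects_meta` and the example effect sizes are hypothetical and are not taken from the study.

```python
import math


def fixed_effects_meta(betas, ses):
    """Combine per-study effect estimates with inverse-variance weights.

    betas: per-study effect estimates for one SNP
    ses:   matching standard errors
    Returns (pooled_beta, pooled_se, z, two_sided_p).
    """
    # Each study is weighted by the inverse of its sampling variance.
    weights = [1.0 / se ** 2 for se in ses]
    total = sum(weights)
    pooled_beta = sum(w * b for w, b in zip(weights, betas)) / total
    pooled_se = math.sqrt(1.0 / total)
    z = pooled_beta / pooled_se
    # Two-sided p-value from the standard normal distribution.
    p = math.erfc(abs(z) / math.sqrt(2))
    return pooled_beta, pooled_se, z, p


# Hypothetical numbers for two cohorts, purely illustrative.
beta, se, z, p = fixed_effects_meta([0.2, 0.4], [0.1, 0.1])
```

    With equal standard errors the pooled estimate is the simple mean of the two study estimates, and the pooled standard error shrinks by a factor of the square root of two.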

    INTEGRATION OF MULTI-PLATFORM HIGH-DIMENSIONAL OMIC DATA

    The development of high-throughput biotechnologies has made data accessible from different platforms, including RNA sequencing, copy number variation, DNA methylation, protein lysate arrays, etc. The high-dimensional omic data derived from different technological platforms have been extensively used to facilitate comprehensive understanding of disease mechanisms and to determine personalized health treatments. Although vital to the progress of clinical research, the high-dimensional multi-platform data impose new challenges for data analysis. Numerous methods have been proposed to integrate multi-platform omic data; however, few have efficiently and simultaneously addressed the problems that arise from high dimensionality and complex correlations. In my dissertation, I propose a statistical framework, the shared informative factor model (SIFORM), that can jointly analyze multi-platform omic data and explore their associations with a disease phenotype. The common disease-associated sample characteristics across different data types can be captured through the shared structure space, while the corresponding weights of genetic variables directly index the strengths of their association with the phenotype. I compare the performance of the proposed method with several popular regularized regression methods and canonical correlation analysis (CCA)-based methods through extensive simulation studies and two lung adenocarcinoma applications. The two lung adenocarcinoma applications jointly explore the associations of mRNA expression and protein expression with smoking status and survival using The Cancer Genome Atlas (TCGA) datasets. The simulation studies demonstrate the superior performance of SIFORM in terms of biomarker detection accuracy. In lung cancer applications, SIFORM identifies many biomarkers that belong to key pathways for lung tumorigenesis.
It also discovers potential prognostic biomarkers for lung cancer patient survival and some biomarkers that reveal different tumorigenesis mechanisms between light smokers and heavy smokers. To improve the prediction accuracy and interpretability of the proposed model, I extend it to PSIFORM by incorporating existing biological pathway information into the current statistical framework. I adopt a network-based regularization to ensure that neighboring genes in the same pathway tend to be selected (or eliminated) simultaneously. Through simulation studies and a TCGA kidney cancer application, I show that PSIFORM outperforms its competitors in both variable selection and prediction. The statistical framework of PSIFORM also has great potential for incorporating the hierarchical order across the multi-platform omic measurements.

    Integrating Omics Data into Genomic Prediction


    From clustered data to causal inference : new methodology motivated by the analysis of subfertility treatments


    Polymorphisms in Plasmodium falciparum chloroquine resistance transporter and multidrug resistance 1 genes: parasite risk factors that affect treatment outcomes for P. falciparum malaria after artemether-lumefantrine and artesunate-amodiaquine.

    Adequate clinical and parasitologic cure by artemisinin combination therapies relies on the artemisinin component and the partner drug. Polymorphisms in the Plasmodium falciparum chloroquine resistance transporter (pfcrt) and P. falciparum multidrug resistance 1 (pfmdr1) genes are associated with decreased sensitivity to amodiaquine and lumefantrine, but effects of these polymorphisms on therapeutic responses to artesunate-amodiaquine (ASAQ) and artemether-lumefantrine (AL) have not been clearly defined. Individual patient data from 31 clinical trials were harmonized and pooled by using standardized methods from the WorldWide Antimalarial Resistance Network. Data for more than 7,000 patients were analyzed to assess relationships between parasite polymorphisms in pfcrt and pfmdr1 and clinically relevant outcomes after treatment with AL or ASAQ. Presence of the pfmdr1 gene N86 (adjusted hazards ratio = 4.74, 95% confidence interval = 2.29-9.78, P < 0.001) and increased pfmdr1 copy number (adjusted hazards ratio = 6.52, 95% confidence interval = 2.36-17.97, P < 0.001) were significant independent risk factors for recrudescence in patients treated with AL. AL and ASAQ exerted opposing selective effects on single-nucleotide polymorphisms in pfcrt and pfmdr1. Monitoring selection and responding to emerging signs of drug resistance are critical tools for preserving efficacy of artemisinin combination therapies; determination of the prevalence of at least pfcrt K76T and pfmdr1 N86Y should now be routine.

    Statistical Methods in Integrative Genomics

    Statistical methods in integrative genomics aim to answer important biological questions by jointly analyzing multiple types of genomic data (vertical integration) or aggregating the same type of data across multiple studies (horizontal integration). In this article, we introduce different types of genomic data and data resources, and then review statistical methods of integrative genomics, with emphasis on the motivation and rationale of these methods. We conclude with some summary points and future research directions.