248 research outputs found

    Prediction of G Protein-Coupled Receptors with SVM-Prot Features and Random Forest

    Evolutionary Machine Learning of Higher-Order Relationships in Genome Sequence Analysis

    Thesis (Ph.D.) -- Seoul National University Graduate School: Interdisciplinary Program in Bioinformatics, February 2014. Byoung-Tak Zhang (장병탁).

    One of the basic research goals in life science is to understand the complex relationships between biological factors and phenotypes, and to identify the various factors affecting the phenotype. In particular, genomic sequences play a significant role in determining phenotypes such as gene expression and susceptibility to disease, so studying the fundamental information stored in the genome is essential to understanding biological processes. Previous genomic sequence analyses mainly focused on identifying a single associated factor or pairwise relationships with significant effects. The recent development of high-throughput technologies has made it possible to identify causal factors through genome-wide analysis. However, discovering higher-order interactions among multiple factors remains a challenge because it involves huge search spaces and computational costs. In this dissertation, we develop effective methods for identifying the higher-order relationships of sequence elements affecting the phenotype by combining statistical learning with evolutionary computation. The methods are applied to finding associated combinatorial factors and dysfunctional modules in various genome-wide sequence analysis problems. First, we present statistical learning-based methods to detect co-regulatory sequence motifs and to investigate the combinatorial effects of DNA methylation on downstream gene expression. Next, to examine sequence datasets with a huge number of attributes on the human genome, we apply evolutionary computation approaches. Our methods search the problem feature space using machine learning techniques trained on datasets within the evolutionary computation process, and are able to find candidate solutions effectively in computationally expensive optimization problems. The experimental results show that the approaches are useful for finding higher-order relationships associated with disease using genomic and epigenomic datasets. In conclusion, our studies provide practical methods for analyzing complex interactions among sequence elements in genomic and epigenomic studies.

    Contents:
    Abstract
    1 Introduction
      1.1 Motivation
      1.2 Approaches
      1.3 Organization of the dissertation
    2 Genome biology and computational analysis
      2.1 Fundamentals of genome biology
        2.1.1 DNA, gene, chromosomes and cell biology
        2.1.2 Gene expression and regulation
        2.1.3 Genomics
        2.1.4 Epigenomics
      2.2 Evolutionary machine learning
        2.2.1 Machine learning and evolutionary computation
        2.2.2 Evolutionary computation in biology
    3 Identifying co-regulatory sequence motifs
      3.1 Background
      3.2 Methods
        3.2.1 Investigation of the relationship between regulatory sequence motifs and expression profiles
        3.2.2 Preparation of the gene expression datasets
        3.2.3 Preparation of the gene sequence datasets
        3.2.4 Measurement of the effect of motif combinations
      3.3 Results
        3.3.1 Identification of the relationship between gene expression and known motifs
        3.3.2 Identification of cell cycle-related motifs
        3.3.3 Combinational effects of regulatory motifs
      3.4 Discussion
    4 Investigation of combinatorial effects of DNA methylation
      4.1 Background
      4.2 Materials and methods
        4.2.1 Data
        4.2.2 Profiling of DNA methylation patterns
        4.2.3 Identifying differentially methylated/expressed genes by information theoretic analysis
        4.2.4 Identifying downregulated genes in each subtype for integrative analysis
        4.2.5 Correlation between DNA methylation and gene expression
        4.2.6 Combinatorial effects of DNA methylation in various genomic regions
        4.2.7 Analysis of transcription factor binding regions possibly blocked by DNA methylation
      4.3 Results
        4.3.1 DNA methylation in 30 ICBP cell lines
        4.3.2 Information theoretic analysis of phenotype-differentially methylated and expressed genes
        4.3.3 Integrated analysis of DNA methylation and gene expression
        4.3.4 Investigation of the combinatorial effects of DNA methylation in various regions on downstream gene expression levels
        4.3.5 Integrative analysis of transcription factors, DNA methylation and gene expression
      4.4 Discussion
    5 Detecting multiple SNP interaction via evolutionary learning
      5.1 Background
      5.2 Materials and methods
        5.2.1 Identifying higher-order interaction of SNPs
        5.2.2 Algorithm Description
        5.2.3 Dataset
      5.3 Results
        5.3.1 Identifying interaction between features in simulation data
        5.3.2 Identifying higher-order SNP interactions in Korean population
      5.4 Discussion
    6 Identifying DNA methylation modules by probabilistic evolutionary learning
      6.1 Background
      6.2 Methods
        6.2.1 Evolutionary learning procedure to identify a set of DNA methylation sites associated to disease
        6.2.2 Learning dependency graph
        6.2.3 Fitness evaluation in population
        6.2.4 Dataset
      6.3 Results
        6.3.1 DNA methylation modules associated to breast cancer
        6.3.2 Modules associated to colorectal cancer using high-throughput sequencing data
      6.4 Discussion
    7 Conclusion
    Bibliography
    Abstract (in Korean)

    Deciphering the Immune Evolution Landscape of Multiple Myeloma Long-Term Survivors Using Single Cell Genomics

    Multiple myeloma (MM) is a malignant bone marrow (BM) disease characterized by somatic hypermutation and DNA damage in plasma cells, leading to the overproduction of dysfunctional malignant myeloma cells. Accumulation of myeloma cells has direct and indirect effects on the BM and other organs. Despite the development of new therapeutic options, MM remains incurable and only a small fraction of patients experience long-term survival (LTS). Experience has shown that ultimately all patients still relapse, leading to the hypothesis that a state of active immune surveillance is required to control the residual disease. To understand the long-term survival phenomenon and its link to the immune phenotypes in MM, we collected paired bone marrow samples from 24 patients who survived for about 7 to 17 years after autologous stem cell transplant (ASCT) and who had high plasma cell infiltration in the BM (median 49.5%) at the time of diagnosis. Response assessment according to the International Myeloma Working Group (IMWG) revealed that 15 patients were in complete remission (CR), whereas 9 patients were in non-complete remission (non-CR) with tumor cells that had remained stable over recent years. We performed single-cell RNA sequencing on more than 290,000 bone marrow cells from 11 patients before treatment (BT) and in LTS, as well as from three healthy controls, using 10x Genomics technology. I developed a computational approach using state-of-the-art single-cell methods, statistical inference and machine learning models to decipher the bone marrow immune cell types and states across all clinical groups. I performed in-depth analyses of the bone marrow immune microenvironment across all captured cell types and provided the global landscape of cellular states across all clinical groups. In this work, I defined new cellular states, marker genes, and gene signatures associated with the patients' clinical and survival states. Additionally, I defined a new myeloid population termed Myeloma-associated Neutrophils (MAN) and a T cell exhaustion population termed Aberrant Memory Cytotoxic (AMC) CD8+ T cells in newly diagnosed MM patients. Moreover, I propose CXCR3 and NR4A2 as new therapeutic targets in AMC CD8+ T cells, which could be further investigated to reverse the T cell exhaustion state in newly diagnosed MM patients. Furthermore, I defined new prognostic markers in the CD8+ T cell compartment which could be predictive of the global disease state. Finally, I propose that MM long-term survivors go through a complex and evolving immune landscape and acquire cellular states in a stepwise manner. I also propose the Continuum Immune Landscape (CIL) model, which explains the immune landscape of MM patients before and after long-term survival, and introduce the Disease-State Trajectories (DST) hypothesis regarding disease-associated dysregulated cellular states in the MM context, which could be generalized to other tumor entities and diseases.

    Abstracts of Papers, 85th Annual Meeting of the Virginia Academy of Science, May 24-26, 2007, James Madison University, Harrisonburg VA

    Full abstracts of the 85th Annual Meeting of the Virginia Academy of Science, May 24-26, 2007, James Madison University, Harrisonburg VA.

    92nd Annual Meeting of the Virginia Academy of Science: Proceedings

    Full proceedings of the 92nd Annual Meeting of the Virginia Academy of Science, May 13-15, 2014, Virginia Commonwealth University, Richmond, Virginia.