70 research outputs found

    Construction of phylogenetic trees by kernel-based comparative analysis of metabolic networks

    BACKGROUND: Inferring the tree of life requires two things: a set of characteristics shared by species descended from a common ancestor, to serve as measuring criteria, and a method for calculating the distance between the resulting values of each measure. Conventional phylogenetic analysis based on genomic sequences provides information about the genetic relationships between different organisms. In contrast, comparative analysis of metabolic pathways in different organisms can yield insights into their functional relationships under different physiological conditions. However, evaluating the similarities or differences between metabolic networks is a computationally challenging problem, and systematic methods of doing this are desirable. Here we introduce a graph-kernel method for computing the similarity between metabolic networks in polynomial time, and use it to profile metabolic pathways and to construct phylogenetic trees. RESULTS: To compare the structures of metabolic networks in organisms, we adopted the exponential graph kernel, which is a kernel-based approach with a labeled graph that includes a label matrix and an adjacency matrix. To construct the phylogenetic trees, we used the unweighted pair-group method with arithmetic mean (UPGMA), a hierarchical clustering algorithm. We applied the kernel-based network profiling method in a comparative analysis of nine carbohydrate metabolic networks from 81 biological species encompassing Archaea, Eukaryota, and Eubacteria. The resulting phylogenetic hierarchies generally support the tripartite scheme of three domains rather than the two domains of prokaryotes and eukaryotes. CONCLUSION: By combining kernel machines with metabolic information, the method infers the context of biosphere development that covers physiological events required for adaptation by genetic reconstruction. The results show that one may obtain a global view of the tree of life by comparing metabolic pathway structures using meta-level information rather than sequence information. This method may yield further information about biological evolution, such as the history of horizontal transfer of each gene, by studying the detailed structure of the phylogenetic tree constructed by the kernel-based method.
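The tree-building step described above can be illustrated with a minimal sketch. It assumes that kernel values have already been computed for every pair of organisms and converts them to distances with the standard kernel-induced metric d(x, y) = sqrt(k(x, x) + k(y, y) - 2 k(x, y)); the abstract does not state how similarities were turned into distances, so that conversion, the function names, and the toy kernel values are all assumptions made for illustration.

```python
import math
from itertools import combinations


def kernel_to_distance(names, kernel):
    """Convert kernel similarities to distances (assumed conversion)."""
    dist = {}
    for a, b in combinations(names, 2):
        squared = kernel[(a, a)] + kernel[(b, b)] - 2.0 * kernel[(a, b)]
        dist[(a, b)] = math.sqrt(max(squared, 0.0))  # guard against rounding
    return dist


def upgma(names, dist):
    """Return a Newick-like string built by UPGMA (average linkage)."""
    sizes = {name: 1 for name in names}          # cluster label -> leaf count
    d = {frozenset(pair): value for pair, value in dist.items()}
    while len(sizes) > 1:
        # Merge the two closest active clusters.
        a, b = min(combinations(sorted(sizes), 2),
                   key=lambda pair: d[frozenset(pair)])
        merged = f"({a},{b})"
        na, nb = sizes.pop(a), sizes.pop(b)
        # New distance is the size-weighted mean of the old distances.
        for c in sizes:
            d[frozenset((merged, c))] = (
                na * d[frozenset((a, c))] + nb * d[frozenset((b, c))]
            ) / (na + nb)
        sizes[merged] = na + nb
    return next(iter(sizes))


# Hypothetical kernel values for three organisms.
names = ["A", "B", "C"]
kernel = {("A", "A"): 1.0, ("B", "B"): 1.0, ("C", "C"): 1.0,
          ("A", "B"): 0.9, ("A", "C"): 0.2, ("B", "C"): 0.3}
print(upgma(names, kernel_to_distance(names, kernel)))  # prints ((A,B),C)
```

The sketch omits branch lengths and the graph kernel itself; it only shows how a pairwise similarity table leads to a hierarchy.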

    Extracting regulatory modules from gene expression data by sequential pattern mining

    Abstract Background Identifying a regulatory module (RM), a bi-set of co-regulated genes and co-regulating conditions (or samples), has been an important challenge in functional genomics and bioinformatics. Given a microarray gene-expression matrix, biclustering has been the most common method for extracting RMs. Among biclustering methods, order-preserving biclustering by a sequential pattern mining technique has a native advantage over conventional biclustering approaches, since it preserves the order of genes (or conditions) according to the magnitude of the expression value. However, previous sequential pattern mining-based biclustering methods have two weak points: they easily become computationally intractable on microarray data of realistic size, and they are sensitive to inherent noise in the expression values. Results In this paper, we propose a novel sequential pattern mining algorithm that is scalable in the size of microarray data and robust with respect to noise. When applied to the microarray data of yeast, the proposed algorithm successfully found long order-preserving patterns, which are biologically significant but cannot be found in randomly shuffled data. The resulting patterns are well enriched for known annotations and are consistent with known biological knowledge. Furthermore, RMs as well as inter-module relations were inferred from the biologically significant patterns. Conclusions Our approach for identifying RMs could be valuable for systematically revealing the mechanism of gene regulation at a genome-wide level.
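To make the notion of an order-preserving pattern concrete, here is a minimal sketch that checks which genes support one candidate ordering of conditions. It is not the mining algorithm proposed in the paper, which also has to search the space of orderings and tolerate noise; the function names and the toy expression values are hypothetical.

```python
def supports_order(expression, conditions):
    """True if expression rises strictly along the given condition order."""
    values = [expression[c] for c in conditions]
    return all(x < y for x, y in zip(values, values[1:]))


def pattern_support(matrix, conditions):
    """Return the genes whose expression follows the condition order."""
    return [gene for gene, expression in matrix.items()
            if supports_order(expression, conditions)]


# Hypothetical expression matrix: gene -> {condition: expression value}.
matrix = {
    "g1": {"c1": 1.0, "c2": 2.0, "c3": 3.0},
    "g2": {"c1": 0.5, "c2": 4.0, "c3": 9.0},
    "g3": {"c1": 3.0, "c2": 1.0, "c3": 2.0},
}
print(pattern_support(matrix, ["c1", "c2", "c3"]))  # prints ['g1', 'g2']
```

Genes g1 and g2 have different magnitudes but the same ordering, so together with the three conditions they form an order-preserving bicluster in this toy example.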

    Molecular Evolution Patterns in Metastatic Lymph Nodes Reflect the Differential Treatment Response of Advanced Primary Lung Cancer

    Tumor heterogeneity influences the clinical outcome of patients with cancer, and diagnostic methods for measuring tumor heterogeneity need to be developed. We analyzed genomic features in paired primary tumors and multiple metastatic lymph nodes from six patients with lung cancer using whole-exome sequencing and RNA sequencing. Although somatic single-nucleotide variants were shared between primary lung cancer and metastases, the tumor evolution predicted by the pattern of genomic alterations matched the anatomic location of the tumors. Four of six cases exhibited a branched clonal evolution pattern. Lymph nodes with acquired somatic variants demonstrated resistance to the cancer treatment. In this study, we demonstrated that multiple biopsies and sequencing strategies for different tumor regions are required for a comprehensive understanding of the landscape of genetic alteration and for guiding targeted therapy in advanced primary lung cancer. Cancer Res; 76(22); 6568-76. ©2016 AACR

    The diagnostic application of targeted re-sequencing in Korean patients with retinitis pigmentosa

    This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly credited. Abstract Background Identification of the causative genes of retinitis pigmentosa (RP) is important for the clinical care of patients with RP. However, a comprehensive genetic study has not been performed in Korean RP patients. Moreover, the genetic heterogeneity found in sensorineural genetic disorders makes identification of pathogenic mutations challenging. Therefore, high-throughput genetic testing using massively parallel sequencing is needed. Results Sixty-two Korean patients with nonsyndromic RP (46 patients from 18 families and 16 simplex cases) who consented to molecular genetic testing were recruited in this study, and targeted exome sequencing was applied to 53 RP-related genes. Causal variants were characterised by selecting exonic and splicing variants, selecting variants with low allele frequency (below 1 %), and discarding the remaining variants with quality below 20. The variants were additionally confirmed by an inheritance pattern and cosegregation test of the families, and the rest of the variants were prioritised using in-silico prediction tools. Finally, causal variants were detected in 10 of 18 familial cases (55.5 %) and 7 of 16 simplex cases (43.7 %). Novel variants were detected in 13 of 20 (65 %) candidate variants. Compound heterozygous variants were found in four of 7 simplex cases. Conclusion Panel-based targeted re-sequencing can be used as an effective molecular diagnostic tool for RP.
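The three filtering criteria named in the Results (exonic or splicing variants, allele frequency below 1 %, quality of at least 20) can be sketched as a simple filter. The record layout and the field names `effect`, `allele_freq`, and `quality` are hypothetical, as are the example variants; the abstract does not say which tools or file formats were used.

```python
def filter_variants(variants, max_allele_freq=0.01, min_quality=20.0):
    """Keep exonic/splicing variants that are rare and of sufficient quality."""
    kept = []
    for variant in variants:
        if variant["effect"] not in {"exonic", "splicing"}:
            continue  # outside coding sequence and splice sites
        if variant["allele_freq"] >= max_allele_freq:
            continue  # too common (frequency must be below 1 %)
        if variant["quality"] < min_quality:
            continue  # discard calls with quality below 20
        kept.append(variant)
    return kept


# Hypothetical variant records.
variants = [
    {"id": "v1", "effect": "exonic", "allele_freq": 0.001, "quality": 50.0},
    {"id": "v2", "effect": "intronic", "allele_freq": 0.001, "quality": 50.0},
    {"id": "v3", "effect": "splicing", "allele_freq": 0.05, "quality": 50.0},
    {"id": "v4", "effect": "exonic", "allele_freq": 0.002, "quality": 10.0},
]
print([v["id"] for v in filter_variants(variants)])  # prints ['v1']
```

The later steps in the abstract (cosegregation testing and in-silico prioritisation) are not covered by this sketch.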

    Transcriptional regulatory framework for vascular cambium development in Arabidopsis roots

    Vascular cambium, a lateral plant meristem, is a central producer of woody biomass. Although a few transcription factors (TFs) have been shown to regulate cambial activity, the phenotypes of the corresponding loss-of-function mutants are relatively modest, highlighting our limited understanding of the underlying transcriptional regulation. Here, we utilize cambium cell-specific transcript profiling followed by a combination of TF network and genetic analyses to identify 62 novel TF genotypes displaying an array of cambial phenotypes. This approach culminated in virtual loss of cambial activity when both WUSCHEL-RELATED HOMEOBOX 4 (WOX4) and KNOTTED-like from Arabidopsis thaliana 1 (KNAT1; also known as BREVIPEDICELLUS (BP)) were mutated, thereby unlocking the genetic redundancy in the regulation of cambium development. We also identified TFs with dual functions in cambial cell proliferation and xylem differentiation, including WOX4, SHORT VEGETATIVE PHASE (SVP) and PETAL LOSS (PTL). Using the TF network information, we combined overexpression of the cambial activator WOX4 and removal of the putative inhibitor PTL to engineer Arabidopsis for enhanced radial growth. This line also showed ectopic cambial activity, thus further highlighting the central roles of WOX4 and PTL in cambium development. This work was supported by Finnish Centre of Excellence in Molecular Biology of Primary Producers (Academy of Finland CoE program 2014-2019) decision #271832, the Gatsby Foundation (GAT3395/PR3), the University of Helsinki (award 799992091) and the European Research Council Advanced Investigator Grant SYMDEV (No. 323052) to Y.H.; Academy of Finland (grants #132376, #266431, #271832), University of Helsinki HiLIFE fellowship to A.P.M.; National Research Foundation of Korea (2018R1A5A1023599 and 2016R1A2B2015258) to J-Y. L

    Computational identification of condition-specific miRNA targets based on gene expression profiles and sequence information

    Abstract Background MicroRNAs (miRNAs) are small noncoding RNAs that play important roles in various biological processes. They regulate target mRNAs post-transcriptionally through complementary base pairing. Since changes in miRNAs affect the expression of target genes, the expression levels of target genes in specific biological processes could be different from those of non-target genes. Here we demonstrate that gene expression profiles contain useful information for separating miRNA targets from non-targets. Results The gene expression profiles related to various developmental processes and stresses, as well as the sequences of miRNAs and mRNAs in Arabidopsis, were used to determine whether a given gene is a miRNA target. The determination is based on a model combining a support vector machine (SVM) classifier with a scoring method based on complementary base pairing between miRNAs and mRNAs. The proposed model yielded a low false-positive rate and retrieved condition-specific candidate targets through a genome-wide screening. Conclusion Our approach provides a novel framework for screening target genes by considering the gene regulation of miRNAs. It can be broadly applied to identify condition-specific targets computationally by embedding information from gene expression profiles.
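As a rough illustration of scoring by complementary base pairing, the sketch below counts Watson-Crick pairs between a miRNA and a candidate target site of equal length, aligned antiparallel. The abstract does not specify the paper's actual scoring scheme, so this counting rule is an assumption: it ignores G:U wobble pairs, gaps, and position weights. The function name and the example sequences are also hypothetical.

```python
# Watson-Crick pairs for RNA; G:U wobble is deliberately not counted here.
WATSON_CRICK = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G")}


def complementarity_score(mirna, site):
    """Count paired positions between a miRNA and a target site.

    Both sequences are written 5' to 3'. Pairing is antiparallel, so the
    miRNA is compared against the site read from its 3' end.
    """
    if len(mirna) != len(site):
        raise ValueError("miRNA and site must have the same length")
    return sum((m, s) in WATSON_CRICK for m, s in zip(mirna, reversed(site)))


print(complementarity_score("UGAGG", "CCUCA"))  # prints 5 (perfect match)
print(complementarity_score("UGAGG", "CCUCG"))  # prints 4 (one mismatch)
```

A score of this kind could serve as one input feature alongside expression-derived features in a classifier, which is the general combination the abstract describes.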

    Comprehensive evaluation of matrix factorization methods for the analysis of DNA microarray gene expression data

    Abstract Background Clustering-based methods for gene-expression analysis have been shown to be useful in biomedical applications such as cancer subtype discovery. Among them, matrix factorization (MF) is advantageous for clustering gene expression patterns from DNA microarray experiments, as it efficiently reduces the dimension of gene expression data. Although several MF methods have been proposed for clustering gene expression patterns, a systematic evaluation has not been reported yet. Results Here we evaluated the clustering performance of orthogonal and non-orthogonal MFs using a total of nine performance measurements on four gene expression datasets and one well-known dataset for clustering. Specifically, we employed a non-orthogonal MF algorithm, BSNMF (Bi-directional Sparse Non-negative Matrix Factorization), that applies bi-directional sparseness constraints superimposed on non-negative constraints, comprising a few dominantly co-expressed genes and samples together. Non-orthogonal MFs tended to show better clustering-quality and prediction-accuracy indices than orthogonal MFs as well as a traditional method, K-means. Moreover, BSNMF showed improved performance in these measurements. Non-orthogonal MFs including BSNMF also showed good performance in the functional enrichment test using Gene Ontology terms and biological pathways. Conclusions In conclusion, the clustering performance of orthogonal and non-orthogonal MFs was appropriately evaluated for clustering microarray data by comprehensive measurements. This study showed that non-orthogonal MFs have better performance than orthogonal MFs and K-means for clustering microarray data.
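For readers unfamiliar with non-orthogonal MF, the following is a minimal sketch of plain non-negative matrix factorization using the Lee-Seung multiplicative updates for squared error, followed by assigning each sample to its dominant factor. It is not BSNMF: the bi-directional sparseness constraints described in the abstract are omitted, and the helper names, the random initialisation, and the iteration count are illustrative assumptions.

```python
import random


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]


def transpose(a):
    return [list(col) for col in zip(*a)]


def frobenius_error(v, w, h):
    """Frobenius norm of V - WH."""
    wh = matmul(w, h)
    return sum((x - y) ** 2
               for row_v, row_wh in zip(v, wh)
               for x, y in zip(row_v, row_wh)) ** 0.5


def nmf(v, rank, iterations=200, seed=0, eps=1e-9):
    """Factor a non-negative genes x samples matrix V into W (genes x rank)
    and H (rank x samples) with multiplicative updates."""
    rng = random.Random(seed)
    n, m = len(v), len(v[0])
    w = [[rng.random() + 0.1 for _ in range(rank)] for _ in range(n)]
    h = [[rng.random() + 0.1 for _ in range(m)] for _ in range(rank)]
    for _ in range(iterations):
        wt = transpose(w)
        num, den = matmul(wt, v), matmul(matmul(wt, w), h)
        h = [[h[i][j] * num[i][j] / (den[i][j] + eps) for j in range(m)]
             for i in range(rank)]
        ht = transpose(h)
        num, den = matmul(v, ht), matmul(w, matmul(h, ht))
        w = [[w[i][j] * num[i][j] / (den[i][j] + eps) for j in range(rank)]
             for i in range(n)]
    return w, h


def assign_clusters(h):
    """Label each sample (column of H) by its largest factor coefficient."""
    return [max(range(len(h)), key=lambda i: h[i][j])
            for j in range(len(h[0]))]


# Hypothetical 4 genes x 4 samples matrix with two blocks of co-expression.
v = [[5.0, 5.0, 0.0, 0.0],
     [4.0, 4.0, 0.0, 0.0],
     [0.0, 0.0, 3.0, 3.0],
     [0.0, 0.0, 6.0, 6.0]]
w, h = nmf(v, rank=2)
print(assign_clusters(h), round(frobenius_error(v, w, h), 4))
```

Because the factors are constrained to be non-negative rather than orthogonal, each column of W can be read as a group of co-expressed genes, which is the property the abstract credits for the better clustering indices.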