119 research outputs found

    Genomic Selection for Crop Improvement: New Molecular Breeding Strategies for Crop Improvement

    Genomic Selection for Crop Improvement serves as a handbook for users by providing both basic and advanced understanding of genomic selection (GS). This review explains germplasm use, phenotyping evaluation, marker genotyping methods, and the statistical models involved in genomic selection. It also includes examples of ongoing genomic selection activities for crop improvement and of efforts to deploy genomic selection in some important crops. To convey the potential of GS breeding, it is high time to bring complete information together in the form of a book that can serve as a ready reference for geneticists and plant breeders.

    Information Theory in Computational Biology: Where We Stand Today

    "A Mathematical Theory of Communication" was published in 1948 by Claude Shannon to address problems in data compression and communication over (noisy) communication channels. Since then, the concepts and ideas developed in Shannon's work have formed the basis of information theory, a cornerstone of statistical learning and inference, and have played a key role in disciplines such as physics and thermodynamics, probability and statistics, computational sciences and biological sciences. In this article we review the basic information-theoretic concepts and describe their key applications in several major areas of research in computational biology: gene expression and transcriptomics, alignment-free sequence comparison, sequencing and error correction, genome-wide disease-gene association mapping, metabolic networks and metabolomics, and protein sequence, structure and interaction analysis.

    The role of visual adaptation in cichlid fish speciation

    D. Shane Wright (1), Ole Seehausen (2), Ton G.G. Groothuis (1), Martine E. Maan (1). (1) University of Groningen; GELIFES; EGDB. (2) Department of Fish Ecology & Evolution, EAWAG Centre for Ecology, Evolution and Biogeochemistry, Kastanienbaum, and Institute of Ecology and Evolution, Aquatic Ecology, University of Bern. In less than 15,000 years, Lake Victoria cichlid fishes have radiated into as many as 500 different species. Ecological and sexual selection are thought to contribute to this ongoing speciation process, but genetic differentiation remains low. However, recent work on visual pigment genes, the opsins, has shown more diversity. Unlike neighboring Lakes Malawi and Tanganyika, Lake Victoria is highly turbid, resulting in a long-wavelength shift in the light spectrum with increasing depth and providing an environmental gradient for exploring divergent coevolution in sensory systems and colour signals via sensory drive. Pundamilia pundamilia and Pundamilia nyererei are two sympatric species found at rocky islands across southern portions of Lake Victoria, differing in male colouration and in the depth at which they reside. Previous work has shown species differentiation in colour discrimination, corresponding to divergent female preferences for conspecific male colouration. A mechanistic link between colour vision and preference would provide a rapid route to reproductive isolation between divergently adapting populations. This link is tested by experimental manipulation of colour vision: raising both species and their hybrids under light conditions mimicking shallow and deep habitats. We quantify the expression of retinal opsins and test behaviours important for speciation: mate choice, habitat preference, and foraging performance.

    Assessment of network module identification across complex diseases

    Many bioinformatics methods have been proposed for reducing the complexity of large gene or protein networks into relevant subnetworks or modules. Yet, how such methods compare to each other in terms of their ability to identify disease-relevant modules in different types of network remains poorly understood. We launched the 'Disease Module Identification DREAM Challenge', an open competition to comprehensively assess module identification methods across diverse protein-protein interaction, signaling, gene co-expression, homology and cancer-gene networks. Predicted network modules were tested for association with complex traits and diseases using a unique collection of 180 genome-wide association studies. Our robust assessment of 75 module identification methods reveals top-performing algorithms, which recover complementary trait-associated modules. We find that most of these modules correspond to core disease-relevant pathways, which often comprise therapeutic targets. This community challenge establishes biologically interpretable benchmarks, tools and guidelines for molecular network analysis to study human disease biology

    Identification of disease related significant SNPs

    Single nucleotide polymorphisms (SNPs) are DNA sequence variations that occur when a single nucleotide in the genome sequence is altered. Because variations in DNA sequence can have a major impact on complex human diseases such as obesity, epilepsy, type 2 diabetes and rheumatoid arthritis, SNPs have become increasingly significant in the identification of such complex diseases. Recent biological studies point out that a single altered gene may have a small effect on a complex disease, whereas interactions between multiple genes may have a significant role. Therefore, identifying multiple genes associated with complex disorders is essential. In this spirit, combinations of multiple SNPs rather than individual SNPs should be analyzed. However, assessing a very large number of SNP combinations is computationally challenging, and because of this challenge the literature contains only a limited number of studies on extracting statistically significant SNP combinations. In this thesis work, we focus on this challenging problem and develop a five-step "disease-associated multi-SNP combinations search procedure" to identify statistically significant SNP combinations and the significant rules defining the associations between SNPs and a specified disease. The proposed five-step multi-SNP combinations procedure is applied to the simulated rheumatoid arthritis data set provided by Genetic Analysis Workshop 15. In each step, statistically significant SNPs are extracted from the available set of SNPs that are not yet classified as significant or insignificant. In the first step, a genome-wide association (GWA) analysis is performed on the original complete multi-family data set. Then, in the second step, we use a tag SNP selection algorithm to find a smaller subset of informative SNP markers. In the literature, most tag SNP selection methods are based on pairwise (two-marker) linkage disequilibrium (LD) measures. In this thesis, however, both pairwise and multiple-marker LD measures have been incorporated to improve the genetic coverage. Up to the third step the procedure aims to identify individual significant SNPs. In the third step a genetic algorithm (GA) based feature selection method is performed. It provides a significant combination of SNPs, which the GA constructs by maximizing the explanatory power of the selected SNPs while trying to decrease the number of selected SNPs dynamically. Since the GA is a probabilistic search approach, it may provide a different SNP combination at each execution. We apply the GA several times to obtain multiple significant SNP combinations, and for each combination we calculate the associated pseudo r-square values and apply statistical tests to check its significance. We also consider the union and intersection of the SNP combinations identified by the GA as potentially significant SNP combinations. After identifying multiple statistically significant SNP combinations, in the fourth and fifth steps we focus on extracting rules to explain the association between the SNPs and the disease. In the fourth step we apply a classification method, called Decision Tree Forest, to calculate the importance values of individual SNPs that belong to at least one of the SNP combinations found by the GA. Since each marker in a SNP combination is in bi-allelic form, genotypes of a SNP can affect the disease status. Different genotypes of SNPs are considered to define candidate rules. Then, using the calculated importance values and the occurrence percentage of each candidate rule in the data set, in the fifth step we perform our proposed rule extraction method to select rules from among the candidates. The literature offers many classification approaches, such as the decision tree, decision forest and random forest. Each of these methods considers SNP interactions that are explanatory for a large subset of patients. However, in real life some SNP interactions that are observed only in a small subset of patients might cause the disease. The existing classification methods do not identify such interactions as significant, whereas the proposed five-step multi-SNP combinations procedure extracts these interactions as well as the others. This is a significant contribution to the research on identifying significant interactions that may cause a human to have the disease.

    A stew of mixed ingredients: Observational omics in the post-GWAS era

    The past 20 years have seen extensive profiling of DNA. Collectively, scientists across the world have identified many places in the DNA, known as loci, that affect human traits such as disease state or immune function. However, interpreting the results of these studies, known as genome-wide association studies (GWAS), has been challenging. This thesis studies several approaches for interpreting GWAS results, with a specific focus on the immune system given its important role in preventing and causing disease. This is done through the use of so-called ‘omics’ technologies, which can study the role of thousands of genes, proteins and genetic variants at the same time. In this way, maps can be constructed of which genes and proteins interact to affect human traits. The ultimate goal of this research is to provide a better understanding of the cascade between DNA and human traits. The hope is that building a specific understanding of how variation in DNA leads to the development of human traits, such as disease, will ultimately aid the development of drugs for these diseases.

    Network and multi-scale signal analysis for the integration of large omic datasets: applications in Populus trichocarpa

    Poplar species are promising sources of cellulosic biomass for biofuels because of their fast growth rate, high cellulose content and moderate lignin content. There is an increasing movement toward integrating multiple layers of ’omics data in a systems biology approach to understand gene-phenotype relationships and assist in plant breeding programs. This dissertation involves the use of network and signal processing techniques for the combined analysis of these various data types, with the goals of (1) increasing fundamental knowledge of P. trichocarpa and (2) facilitating the generation of hypotheses about target genes and phenotypes of interest. A data integration “Lines of Evidence” method is presented for the identification and prioritization of target genes involved in functions of interest. A new post-GWAS method, Pleiotropy Decomposition, is presented, which extracts pleiotropic relationships between genes and phenotypes from GWAS results, allowing for the identification of genes with signatures favorable to genome editing. Continuous wavelet transform signal processing analysis is applied to characterize the genome distributions of various features (including variant density, gene density, and methylation profiles) in order to identify chromosome structures such as the centromere. This yielded approximate centromere locations on all P. trichocarpa chromosomes, which had not previously been adequately reported in the scientific literature. Discrete wavelet transform signal processing was applied to genomic features from various data types, including transposable element density, methylation density, SNP density, gene density, centromere position and putative ancestral centromere position. Subsequent correlation analysis of the resulting wavelet coefficients identified scale-specific relationships between these genomic features and provided insights into the evolution of the genome structure of P. trichocarpa. These methods have provided strategies both to increase fundamental knowledge about the P. trichocarpa system and to identify new target genes related to biofuel targets. We intend that these approaches will ultimately be used in designing better plants for more efficient and sustainable production of bioenergy.

    Genomics and New Approaches to Study Complex Traits in Pigs and Other Livestock Species. A Focus on the Investigation of Gene Networks Related to Fat Quality and Deposition in Pigs and Preliminary Research to Study Factors Related to Performances in Piglets and Poultry

    The recent development of new technologies and the progress of genetics and genomics have opened new horizons in breeding programs. Genomic selection has been successfully applied to dairy cattle breeding schemes, but has had limited success in other livestock species, such as pigs and chickens. Nevertheless, the technological advances of recent years may represent important tools for deciphering complex traits, for which traditional selection has progressed slowly. The studies reported in the present thesis address the application of different OMICs technologies to the analysis of productive traits in livestock species. The work focused mainly on the investigation of candidate genes and gene networks associated with porcine fatness traits using genomics, transcriptomics and single-gene studies. The results confirmed the important role that the ELOVL elongase 6 gene region plays in backfat fatty acid composition, and reported associations between markers in Perilipin genes and fatness traits. Furthermore, a transcriptome analysis performed on backfat samples of pigs divergent for backfat thickness highlighted several expression patterns related to adipose tissue deposition and suggested that the Perilipin 2 gene may play an important role in adipose tissue deposition. OMICs technologies were also applied in two preliminary studies aimed at identifying factors and genetic causes involved in variations in the pig colostrum metabolome and in the occurrence of myopathy-like defects in the breast muscle of broiler chickens. On the whole, the application of genomics, transcriptomics and metabolomics proved to be an effective tool for the study of complex traits in different livestock species and for the detection of genes involved in phenotypic variation. Further studies are needed, but the evidence found contributes to increasing knowledge about markers useful for the genetic improvement of complex polygenic traits.