14 research outputs found

    Building predictive unbound brain-to-plasma concentration ratio (Kp,uu,brain) models

    Abstract: The blood-brain barrier (BBB) is a dynamic membrane that evolved primarily to protect the brain from exposure to harmful xenobiotics. The distribution of synthesized drugs across the BBB is a vital parameter in drug discovery projects involving a central nervous system (CNS) target, since the molecules must be capable of crossing this major hurdle. In contrast, peripherally acting drugs have to be designed to minimize brain exposure, which could otherwise result in undue side effects. It is thus important to establish the BBB permeability of molecules early in the drug discovery pipeline. Previously, most in silico attempts to predict brain exposure have relied on the total drug distribution between the blood plasma and the brain. However, it is now understood that the unbound brain-to-plasma concentration ratio (Kp,uu,brain) is the parameter that precisely indicates the BBB availability of compounds. Kp,uu,brain describes the free concentration of the drug molecule in the brain, which, according to the free drug hypothesis, is the parameter that causes the relevant pharmacological response at the target site. The current work revisits a model built in 2011 and deployed on an in-house server, and checks its performance on the data collected since then. This gave a satisfying result, showing the stability of the model. The old dataset was then extended with the temporal dataset in order to update the model. This is important for maintaining a substantial chemical space and thereby ensuring good predictivity on unknown data. Using methods and descriptors not used in the previous study, a further improvement in model performance was achieved.
Attempts were also made to interpret the model by identifying its most influential descriptors. Popular science summary: Predictive model for unbound brain-to-plasma concentration ratio. The blood-brain barrier (BBB) is a dynamic interface that evolved to protect the brain from exposure to toxic xenobiotics and to maintain homeostasis. The distribution of drugs across the BBB is critical for any drug discovery project. A drug designed for a target in the brain has to pass through the BBB in sufficient concentrations to elicit the desired therapeutic effect. On the other hand, a drug designed for a non-CNS target should be kept away from the brain to avoid fatal side effects. The unbound brain-to-plasma concentration ratio, Kp,uu,brain, is a parameter that describes the distribution of a molecule across the BBB. It represents the free drug concentration in the brain, which is the fraction that elicits the pharmacological effect on the CNS. The experimental measurement of this parameter is time-consuming and laborious. Computational prediction of such properties thus proves to be of great utility in reducing the time and resources spent, by aiding the early elimination of compounds with undesirable qualities. This helps reduce late-stage compound attrition (failure rate), which has always been a major problem for the pharmaceutical industry. Quantitative Structure-Activity Relationship (QSAR) modeling is an approach that attempts to establish a meaningful relationship between the chemical structure of a molecule and its chemical or biological activity. Once established, this relationship can be used to predict the activity of a new compound based on its chemical structure. In a typical QSAR experiment, the chemical structures are represented by numerical values called molecular descriptors. The thesis work used machine learning algorithms (Support Vector Machine and Random Forest) to define the structure-activity relationship.
A predictive model for estimating the unbound brain-to-plasma concentration ratio (Kp,uu,brain) was developed based on a training set of in-house compounds and was deployed in an in-house program (C-lab) in 2011 for routine use. The thesis project involved validating the existing model and updating it by extending the dataset with the data collected since 2011. Different combinations of machine learning algorithms, modeling approaches, and molecular descriptors (calculated numerical values representing chemical structures) were used to build the models. Further, consensus models were built and validated by combining the predictions from these models. Two-class classification models were also evaluated, categorizing compounds as BBB positive (crosses the BBB) or BBB negative (does not cross the BBB). The validation of the old model using a temporal test set (Kp,uu,brain data collected since 2011) gave a promising result, showing stability and good predictive power. However, it is very important to keep the chemical space updated, which is the purpose of updating the model. The new model (a consensus model with five components) shows a significant improvement in predictive power along with an improvement in classification performance. This model will be uploaded to C-lab and will be accessible for use within AstraZeneca. Advisors: Hongming Chen, Ola Engkvist (Computational Chemistry, AstraZeneca R&D Mölndal). Master's Degree Project, 60 credits in Bioinformatics (2014), Department of Biology, Lund University

    A High-Quality Assembly of the Nine-Spined Stickleback (Pungitius pungitius) Genome

    The Gasterosteidae fish family hosts several species that are important models for eco-evolutionary, genetic, and genomic research. In particular, a wealth of genetic and genomic data has been generated for the three-spined stickleback (Gasterosteus aculeatus), "ecology's supermodel," whereas the genomic resources for the nine-spined stickleback (Pungitius pungitius) have remained relatively scarce. Here, we report a high-quality chromosome-level genome assembly of P. pungitius consisting of 5,303 contigs (N50 = 1.2 Mbp) with a total size of 521 Mbp. These contigs were mapped to 21 linkage groups using a high-density linkage map, yielding a final assembly with 98.5% BUSCO completeness. A total of 25,062 protein-coding genes were annotated, and about 23% of the assembly was found to consist of repetitive elements. A comprehensive analysis of repetitive elements uncovered centromere-specific tandem repeats and provided insights into the evolution of retrotransposons. A multigene phylogenetic analysis inferred a divergence time of about 26 million years ago (Ma) between nine- and three-spined sticklebacks, which is far older than the commonly assumed estimate of 13 Ma. Compared with the three-spined stickleback, we identified an additional duplication of several genes in the hemoglobin cluster. Sequencing data from populations adapted to different environments indicated potential copy number variations in hemoglobin genes. Furthermore, genome-wide synteny comparisons between three- and nine-spined sticklebacks identified chromosomal rearrangements underlying the karyotypic differences between the two species. The high-quality chromosome-scale assembly of the nine-spined stickleback genome obtained with long-read sequencing technology provides a crucial resource for comparative and population genomic investigations of stickleback fishes and teleosts. Peer reviewed

    The Chromosome-Level Genome Assembly of European Grayling Reveals Aspects of a Unique Genome Evolution Process Within Salmonids

    Salmonids represent an intriguing taxonomic group for investigating genome evolution in vertebrates due to their relatively recent last common whole genome duplication event, which occurred between 80 and 100 million years ago. Here, we report on the chromosome-level genome assembly of European grayling (Thymallus thymallus), which represents one of the earliest diverged salmonid subfamilies. To achieve this, we first generated relatively long genomic scaffolds by using a previously published draft genome assembly along with long-read sequencing data and a linkage map. We then merged those scaffolds by applying synteny evidence from the Atlantic salmon (Salmo salar) genome. Comparisons of the European grayling genome assembly to the genomes of Atlantic salmon and Northern pike (Esox lucius), the latter used as a nonduplicated outgroup, detailed aspects of the characteristic chromosome evolution process that has taken place in European grayling. While Atlantic salmon and other salmonid genomes are marked by the typical occurrence of numerous chromosomal fusions, European grayling chromosomes were confirmed to be fusion-free and were characterized by a relatively large proportion of paracentric and pericentric inversions. We further reported on transposable elements specific to either the European grayling or Atlantic salmon genome, on the male-specific sdY gene in the European grayling chromosome 11A, and on regions under residual tetrasomy in the homeologous European grayling chromosome pairs 9A-9B and 25A-25B. The same chromosome pairs have been observed under residual tetrasomy in Atlantic salmon and in other salmonids, suggesting that this feature has been conserved since the subfamily split. Peer reviewed

    Comparative genomics in teleost fish: Insights into forces driving genome evolution

    Every genome encodes a story of evolution, the remarkable complexity of life. Today, about five decades after Ohno’s momentous proposition that genetic redundancy is an important driver for the evolution of genetic novelty, we are still working to solve the enigma that is a genome, the genetic code that ‘defines’ us and every other organism. The genomic revolution, fueled by developments in sequencing technology, has provided an unparalleled opportunity to unravel the intricacies of this story at fine resolution. These rapid advances, coupled with the increased feasibility of whole genome sequencing, have especially bolstered the field of evolutionary genomics by providing an opportunity to study the molecular basis of evolution in species across the tree of life. Teleost fish are an outstanding model system for studying a multitude of questions regarding the evolution of vertebrate genomes. The accruing genomic resources for these species continue to enable comparative genomic studies that shed light on vital aspects of genome evolution and the impact of whole genome duplication. This thesis explores some aspects of genome and chromosome evolution in two important teleost fish families, namely salmonids and sticklebacks. Gene and genome duplications are the primary source of new genetic material from which novelty can evolve. The relatively young whole genome duplication (WGD) in the salmonid lineage (referred to as Ss4R WGD) offers a great opportunity to gain insights into the evolution of gene duplicates following polyploidy. To this end, we sequenced and assembled the draft genome of a representative of the earliest diverging non-anadromous salmonid lineage, Thymallus thymallus. We used this novel genomic resource in a comparative phylogenomic framework to gain insights into the consequences of lineage-specific rediploidization and genome-wide selective constraints on gene expression regulation.
The genetic redundancy introduced after polyploidy is associated with rewiring of the regulatory network, causing shifts in gene expression patterns. Extensive divergence of ohnologs is often observed after WGD and is considered vital for the retention of duplicates. Our analyses demonstrate that selection is important in the evolution of tissue expression following the Ss4R WGD. To address large-scale genome structure evolution in grayling, we further generated a chromosome-level assembly for grayling by using long-read PacBio data and a linkage map. Using this resource, we could investigate the chromosomal rearrangements responsible for the extreme differences in karyotypes between Atlantic salmon (Salmo salar) and European grayling. While the Atlantic salmon karyotype has evolved through a series of Robertsonian translocations and fusions, we confirm that the more primitive-looking karyotype of grayling has evolved primarily through inversions. Sticklebacks, particularly the three-spined stickleback (Gasterosteus aculeatus), have been a well-studied system in many realms of evolutionary biology. Another notable member of the same family, and an emerging model system in ecology and evolutionary biology, is the nine-spined stickleback (Pungitius pungitius). We generated a high-quality chromosome-scale genome assembly for the nine-spined stickleback using high-coverage long-read PacBio data and a high-density linkage map. Using this high-quality genome assembly, we provide a comprehensive analysis of repetitive elements, including centromeric repeats, in the nine-spined stickleback genome. We also describe a recent duplication in the hemoglobin cluster and show that this region could potentially involve frequent copy number variations in closely related populations. Finally, we also identify structural variations potentially explaining the karyotypic variation between the three- and nine-spined sticklebacks.
Taken together, this thesis, while providing genome assemblies and annotations valuable for further studies, also demonstrates the utility of comparative genomic analyses among closely related species to elucidate various facets of genome evolution.

    Karyotype evolution in salmonids


    Grayling draft genome dataset

    Description:
    Tthymallus_scaffolds.fasta - Assembled scaffold sequences
    Tthymallus_RepeatLibrary_deNovo.fasta - De novo repeat library sequences
    Tthymallus_transcriptome_DeNovo.fasta - De novo transcriptome assembly using Trinity (followed by RSEM-based filtering)
    Tthymallus_transcriptome_ReferenceBased.fasta - Reference-based transcriptome using the STAR-Cufflinks-TransDecoder pipeline (followed by filtering based on homology to known zebrafish and stickleback proteins)
    Tthymallus_ScaffoldAnnotation.gff3 - MAKER pipeline-based annotation for the scaffolds
    Tthymallus_proteins.fasta - Grayling protein sequences used for inferring orthologous groups (based on MAKER annotations)
    Tthymallus_maker_fullOutput.gff - Full output from MAKER
    Tthymallus_CPMcounts.txt - Expression counts for grayling
    OrthologousGroups.txt - Inferred orthologous groups using OrthoFinder

    Switching on the light: using metagenomic shotgun sequencing to characterize the intestinal microbiome of Atlantic cod

    Atlantic cod (Gadus morhua) is an ecologically important species with a widespread distribution in the North Atlantic Ocean, yet little is known about the diversity of its intestinal microbiome in its natural habitat. No geographical differentiation in this microbiome was observed based on 16S rRNA amplicon analyses, yet such a finding may result from an inherent lack of power of this method to resolve fine-scale biological complexity. Here, we use metagenomic shotgun sequencing to investigate the intestinal microbiome of 19 adult Atlantic cod individuals from two coastal populations in Norway, located 470 km apart. Resolving the species community to unprecedented resolution, we identify two abundant species, Photobacterium iliopiscarium and Photobacterium kishitanii, which comprise over 50% of the classified reads. Interestingly, the intestinal P. kishitanii strains have functionally intact lux genes, and the high abundance of this species suggests that fish intestines form an important part of its ecological niche. These observations support a hypothesis that bioluminescence plays an ecological role in the marine food web. Despite our improved taxonomic resolution, we identify no geographical differences in bacterial community structure, indicating that the intestinal microbiome of these coastal cod is colonized by a limited number of closely related bacterial species with a broad geographical distribution.

    The grayling genome reveals selection on gene expression regulation after whole-genome duplication

    Whole-genome duplication (WGD) has been a major evolutionary driver of increased genomic complexity in vertebrates. One such event occurred in the salmonid family ∼80 Ma (Ss4R), giving rise to a plethora of structural and regulatory duplicate-driven divergence and making salmonids an exemplary system to investigate the evolutionary consequences of WGD. Here, we present a draft genome assembly of European grayling (Thymallus thymallus) and use this in a comparative framework to study the evolution of gene regulation following WGD. Among the Ss4R duplicates identified in European grayling and Atlantic salmon (Salmo salar), one-third reflect nonneutral tissue expression evolution, with strong purifying selection maintained over ∼50 Myr. Of these, the majority reflect conserved tissue regulation under strong selective constraints related to brain and neural-related functions, as well as higher-order protein–protein interactions. A small subset of the duplicates have evolved tissue regulatory expression divergence in a common ancestor, which has subsequently been conserved in both lineages, suggestive of adaptive divergence following WGD. These candidates for adaptive tissue expression divergence have elevated rates of protein coding- and promoter-sequence evolution and are enriched for immune- and lipid metabolism ontology terms. Lastly, lineage-specific duplicate divergence points toward underlying differences in adaptive pressures on expression regulation in the nonanadromous grayling versus the anadromous Atlantic salmon. Our findings enhance our understanding of the role of WGD in genome evolution and highlight cases of regulatory divergence of Ss4R duplicates, possibly related to a niche shift in early salmonid evolution.