50 research outputs found

    Assembly of female and male hihi genomes (stitchbird; Notiomystis cincta) enables characterization of the W chromosome and resources for conservation genomics

    A high-quality reference genome can be a valuable resource for threatened species by providing a foundation to assess their evolutionary potential to adapt to future pressures such as environmental change. We assembled the genome of a female hihi (Notiomystis cincta), a threatened passerine bird endemic to Aotearoa New Zealand. The assembled genome is 1.06 Gb, and is of high quality and highly contiguous, with a contig N50 of 7.0 Mb, estimated QV of 44 and a BUSCO completeness of 96.8%. A male assembly of comparable quality was generated in parallel. A population linkage map was used to scaffold the autosomal contigs into chromosomes. Female and male sequence coverage and comparative genomics analyses were used to identify Z- and W-linked contigs. In total, 94.6% of the assembly length was assigned to putative nuclear chromosome scaffolds. Native DNA methylation was highly correlated between sexes, with the W chromosome contigs more highly methylated than autosomal chromosomes and Z contigs. 43 differentially methylated regions were identified, and these may represent interesting candidates for the establishment or maintenance of sex differences. By generating a high-quality reference assembly of the heterogametic sex, we have created a resource that enables characterization of genome-wide diversity and facilitates the investigation of female-specific evolutionary processes. The reference genomes will form the basis for fine-scale assessment of the impacts of low genetic diversity and inbreeding on the adaptive potential of the species and will therefore enable tailored and informed conservation management of this threatened taonga (treasured) species.

    Exploring structural variation and gene family architecture with De Novo assemblies of 15 Medicago genomes

    Background: Previous studies exploring sequence variation in the model legume, Medicago truncatula, relied on mapping short reads to a single reference. However, read-mapping approaches are inadequate to examine large, diverse gene families or to probe variation in repeat-rich or highly divergent genome regions. De novo sequencing and assembly of M. truncatula genomes enables near-comprehensive discovery of structural variants (SVs), analysis of rapidly evolving gene families, and ultimately, construction of a pan-genome. Results: Genome-wide synteny based on 15 de novo M. truncatula assemblies effectively detected different types of SVs indicating that as much as 22% of the genome is involved in large structural changes, altogether affecting 28% of gene models. A total of 63 million base pairs (Mbp) of novel sequence was discovered, expanding the reference genome space for Medicago by 16%. Pan-genome analysis revealed that 42% (180 Mbp) of genomic sequences is missing in one or more accessions, while examination of de novo annotated genes identified 67% (50,700) of all ortholog groups as dispensable – estimates comparable to recent studies in rice, maize and soybean. Rapidly evolving gene families typically associated with biotic interactions and stress response were found to be enriched in the accession-specific gene pool. The nucleotide-binding site leucine-rich repeat (NBS-LRR) family, in particular, harbors the highest level of nucleotide diversity, large effect single nucleotide change, protein diversity, and presence/absence variation. However, the leucine-rich repeat (LRR) and heat shock gene families are disproportionately affected by large effect single nucleotide changes and even higher levels of copy number variation. Conclusions: Analysis of multiple M. truncatula genomes illustrates the value of de novo assemblies to discover and describe structural variation, something that is often under-estimated when using read-mapping approaches. Comparisons among the de novo assemblies also indicate that different large gene families differ in the architecture of their structural variation.

    Population genomics of the critically endangered kākāpō

    Summary: The kākāpō is a flightless parrot endemic to New Zealand. Once common in the archipelago, only 201 individuals remain today, most of them descending from an isolated island population. We report the first genome-wide analyses of the species, including a high-quality genome assembly for kākāpō, one of the first chromosome-level reference genomes sequenced by the Vertebrate Genomes Project (VGP). We also sequenced and analyzed 35 modern genomes from the sole surviving island population and 14 genomes from the extinct mainland population. While theory suggests that such a small population is likely to have accumulated deleterious mutations through genetic drift, our analyses on the impact of the long-term small population size in kākāpō indicate that present-day island kākāpō have a reduced number of harmful mutations compared to mainland individuals. We hypothesize that this reduced mutational load is due to the island population having been subjected to a combination of genetic drift and purging of deleterious mutations, through increased inbreeding and purifying selection, since its isolation from the mainland ∼10,000 years ago. Our results provide evidence that small populations can survive even when isolated for hundreds of generations. This work provides key insights into kākāpō breeding and recovery and more generally into the application of genetic tools in conservation efforts for endangered species.

    Genomic Complexities in the Legume-Rhizobial Symbiosis

    University of Minnesota Ph.D. dissertation. August 2018. Major: Plant and Microbial Biology. Advisor: Peter Tiffin. 1 computer file (PDF); viii, 118 pages + 2 supplemental tables. As two different organisms interact symbiotically and evolve over time, underlying genetic changes begin to reflect this. These changes appear as genomic complexities within the genome of each species, and further segregate among individuals of those species. In my work, I have studied three facets of the genomic complexities of symbiotic organisms. i) I developed the omics database generator (ODG), which allows characterization of multiple genomes simultaneously while connecting new data to existing, published research. As more genomes are sequenced, the ability to compare these genomes to existing datasets will require more advanced genomic tools. ii) I have studied copy-number variation and presence-absence variation in M. truncatula to identify genes involved in the legume-rhizobial symbiosis, resulting in the identification of a likely causative locus, and explored copy-number variation within the association mapping population using a high-throughput method based on second-generation sequencing data. iii) I have further investigated the dynamics of repeats in a population of Ensifer meliloti to understand how the genomes are being shaped and evolving, the characteristics of repeats unique to the E. meliloti lineage and those shared with other species, and the relationships between these repeats and the underlying genomes. My work has expanded on methods of examining genomic complexities underlying complex, quantitative host-symbiont interactions at a genomic scale.

    ODG: Omics database generator - a tool for generating, querying, and analyzing multi-omics comparative databases to facilitate biological understanding

    Background: Rapid generation of omics data in recent years has resulted in vast amounts of disconnected datasets without systemic integration and knowledge building, while individual groups have made customized, annotated datasets available on the web with few ways to link them to in-lab datasets. With so many research groups generating their own data, the ability to relate it to the larger genomic and comparative genomic context is becoming increasingly crucial to make full use of the data. Results: The Omics Database Generator (ODG) allows users to create customized databases that utilize published genomics data integrated with experimental data, which can be queried using a flexible graph database. When provided with omics and experimental data, ODG will create a comparative, multi-dimensional graph database. ODG can import definitions and annotations from other sources such as InterProScan, the Gene Ontology, ENZYME, UniPathway, and others. This annotation data can be especially useful for studying new or understudied species for which transcripts have only been predicted, and can rapidly give additional layers of annotation to predicted genes. In better studied species, ODG can perform syntenic annotation translations or rapidly identify characteristics of a set of genes or nucleotide locations, such as hits from an association study. ODG provides a web-based user interface for configuring the data import and for querying the database. Queries can also be run from the command line, and the database can be queried directly through programming language hooks available for most languages. ODG supports most common genomic formats as well as a generic, easy-to-use tab-separated value format for user-provided annotations. Conclusions: ODG is a user-friendly database generation and query tool that adapts to the supplied data to produce a comparative genomic database or multi-layered annotation database. ODG provides rapid comparative genomic annotation and is therefore particularly useful for non-model or understudied species. For species for which more data are available, ODG can be used to conduct complex multi-omics, pattern-matching queries.

    Supplemental Material for Cullen et al., 2023

    Supplemental material for the publication Germline stem cells and oocyte production in the Honeybee Queen Ovary.

    Additional file 3: Table S3. of ODG: Omics database generator - a tool for generating, querying, and analyzing multi-omics comparative databases to facilitate biological understanding

    Gene annotations of top scoring BLAST+ hits for the predicted genes in the four rhizobia strains, as inferred from E. coli MG1655 and E. meliloti 1021. (XLSX 383 kb)

    Additional file 1: Figure S1. of ODG: Omics database generator - a tool for generating, querying, and analyzing multi-omics comparative databases to facilitate biological understanding

    Advanced users can query ODG using Neo4j’s query language CYPHER. Presented is an example identifying HMM Matches to nearby genes and aggregating GO term counts, requiring GO terms to be labelled as a biological process. (JPEG 274 kb)