727 research outputs found

    The giant diploid faba genome unlocks variation in a global protein crop

    Publisher Copyright: © 2023, The Author(s). Increasing the proportion of locally produced plant protein in currently meat-rich diets could substantially reduce greenhouse gas emissions and loss of biodiversity [1]. However, plant protein production is hampered by the lack of a cool-season legume equivalent to soybean in agronomic value [2]. Faba bean (Vicia faba L.) has a high yield potential and is well suited for cultivation in temperate regions, but genomic resources are scarce. Here, we report a high-quality chromosome-scale assembly of the faba bean genome and show that it has expanded to a massive 13 Gb in size through an imbalance between the rates of amplification and elimination of retrotransposons and satellite repeats. Genes and recombination events are evenly dispersed across chromosomes and the gene space is remarkably compact considering the genome size, although with substantial copy number variation driven by tandem duplication. Demonstrating practical application of the genome sequence, we develop a targeted genotyping assay and use high-resolution genome-wide association analysis to dissect the genetic basis of seed size and hilum colour. The resources presented constitute a genomics-based breeding platform for faba bean, enabling breeders and geneticists to accelerate the improvement of sustainable protein production across the Mediterranean, subtropical and northern temperate agroecological zones. Peer reviewed.

    Dynamics of Genome Rearrangement in Bacterial Populations

    Genome structure variation has profound impacts on phenotype in organisms ranging from microbes to humans, yet little is known about how natural selection acts on genome arrangement. Pathogenic bacteria such as Yersinia pestis, which causes bubonic and pneumonic plague, often exhibit a high degree of genomic rearrangement. The recent availability of several Yersinia genomes offers an unprecedented opportunity to study the evolution of genome structure and arrangement. We introduce a set of statistical methods to study patterns of rearrangement in circular chromosomes and apply them to the Yersinia. We constructed a multiple alignment of eight Yersinia genomes using Mauve software to identify 78 conserved segments that are internally free from genome rearrangement. Based on the alignment, we applied Bayesian statistical methods to infer the phylogenetic inversion history of Yersinia. The sampling of genome arrangement reconstructions contains seven parsimonious tree topologies, each having different histories of 79 inversions. Topologies with a greater number of inversions also exist, but were sampled less frequently. The inversion phylogenies agree with results suggested by SNP patterns. We then analyzed reconstructed inversion histories to identify patterns of rearrangement. We confirm an over-representation of “symmetric inversions”—inversions with endpoints that are equally distant from the origin of chromosomal replication. Ancestral genome arrangements demonstrate moderate preference for replichore balance in Yersinia. We found that all inversions are shorter than expected under a neutral model, whereas inversions acting within a single replichore are much shorter than expected. We also found evidence for a canonical configuration of the origin and terminus of replication. Finally, breakpoint reuse analysis reveals that inversions with endpoints proximal to the origin of DNA replication are nearly three times more frequent. 
    Our findings represent the first characterization of genome arrangement evolution in a bacterial population evolving outside laboratory conditions. Insight into the process of genomic rearrangement may further the understanding of pathogen population dynamics and selection on the architecture of circular bacterial chromosomes.
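
The notion of a "symmetric inversion" in the abstract above can be illustrated with a minimal sketch. It is not the authors' Bayesian method: the function names, the coordinate convention (positions in base pairs on a circular chromosome) and the signed-block representation are assumptions made for illustration only.

```python
def circular_distance(pos, origin, genome_length):
    """Shortest distance between pos and the origin around a circular chromosome."""
    d = abs(pos - origin) % genome_length
    return min(d, genome_length - d)


def inversion_asymmetry(start, end, origin, genome_length):
    """Difference between the two endpoint-to-origin distances,
    as a fraction of genome length (0.0 means perfectly symmetric)."""
    d_start = circular_distance(start, origin, genome_length)
    d_end = circular_distance(end, origin, genome_length)
    return abs(d_start - d_end) / genome_length


def apply_inversion(blocks, i, j):
    """Invert signed conserved blocks i..j (inclusive):
    reverse their order and flip each orientation."""
    return blocks[:i] + [-b for b in reversed(blocks[i:j + 1])] + blocks[j + 1:]


if __name__ == "__main__":
    # Hypothetical 1000 bp chromosome with the origin at position 0.
    print(inversion_asymmetry(100, 900, origin=0, genome_length=1000))
    print(inversion_asymmetry(100, 700, origin=0, genome_length=1000))
    print(apply_inversion([1, 2, 3, 4, 5], 1, 3))
```

An asymmetry of zero means both endpoints lie equally far from the origin, so the inversion leaves the lengths of the two replichores unchanged.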

    Chromatin interaction analysis reveals changes in small chromosome and telomere clustering between epithelial and breast cancer cells

    BACKGROUND: Higher-order chromatin structure is often perturbed in cancer and other pathological states. Although several genetic and epigenetic differences have been charted between normal and breast cancer tissues, changes in higher-order chromatin organization during tumorigenesis have not been fully explored. To probe the differences in higher-order chromatin structure between mammary epithelial and breast cancer cells, we performed Hi-C analysis on MCF-10A mammary epithelial and MCF-7 breast cancer cell lines. RESULTS: Our studies reveal that the small, gene-rich chromosomes chr16 through chr22 in the MCF-7 breast cancer genome display decreased interaction frequency with each other compared to the inter-chromosomal interaction frequency in the MCF-10A epithelial cells. Interestingly, this finding is associated with a higher occurrence of open compartments on chr16-22 in MCF-7 cells. Pathway analysis of the MCF-7 up-regulated genes located in altered compartment regions on chr16-22 reveals pathways related to repression of WNT signaling. There are also differences in intra-chromosomal interactions between the cell lines; telomeric and sub-telomeric regions in the MCF-10A cells display more frequent interactions than are observed in the MCF-7 cells. CONCLUSIONS: We show evidence of an intricate relationship between chromosomal organization and gene expression between epithelial and breast cancer cells. Importantly, this work provides a genome-wide view of higher-order chromatin dynamics and a resource for studying higher-order chromatin interactions in two cell lines commonly used to study the progression of breast cancer

    The identification and classification of endogenous retroviruses in the horse genome

    Endogenous retroviruses (ERVs) are sequences derived from ancient retroviral infections of germ cells that integrated into the genomes of humans, other mammals and other vertebrates millions of years ago. These ERVs are inherited according to Mendelian expectations in the same way as all other genes in the genome. A complete endogenous retrovirus is on average 8-12 kb long and contains gag, pro, pol and env genes that always occur in the same order. The coding sequences are flanked by two LTRs (long terminal repeat sequences). Most ERVs are defective, carrying a multitude of inactivating mutations. However, some ERVs still have open reading frames in their genome. These ERVs settle close to or within functional genes and can influence or control the functions of host genes through their LTRs. Most integrations have deleterious effects. However, some integrations may be examples of positive co-adaptation, such as syncytin, which is involved in forming the syncytial layer of the placenta. The first equine endogenous beta retrovirus, EcERV-Beta1, was found in 2011 by Antoinette C. van der Kuyl [1]. This beta retrovirus and a few pol gene sequences similar to foamy retroviruses were the only known endogenous retroviruses fixed in the domestic horse (Equus caballus) genome. The aim of our study was to identify other endogenous retrovirus sequences in an equine genome and classify them into groups. Based on the high number of SINEs (equine repetitive elements) in the horse genome, we hypothesized that certain ERVs would be located sufficiently close to SINEs to be amplified using an unbiased SINE-PCR approach with degenerate primers. The nearest SINE element was located 5.5 kb upstream of the 5′ end of EcERV-Beta1. Pan-pol PCR was also used to find novel ERVs, based on a 640 bp region of the pol gene, which is the most conserved region of ERVs.
    Twenty-seven complete, novel ERVs (13 beta, 13 gamma and 1 spuma) and 249 candidate endogenous retroviruses were identified using the LTR_STRUC tool and double-checked with the Retrotector© online tool and NCBI BLAST. EcERV-Beta1, whose two LTRs show 1% divergence, was shown to be polymorphic among 13 different breeds.

    Hematopoietic gene promoters subjected to a group-combinatorial study of DNA samples: identification of a megakaryocytic selective DNA signature

    Identification of common sub-sequences for a group of functionally related DNA sequences can shed light on the role of such elements in cell-specific gene expression. In the megakaryocytic lineage, no single transcription factor has been described as lineage-specific, raising the possibility that a cluster of gene promoter sequences presents a unique signature. Here, the megakaryocytic gene promoter group, which consists of both human and mouse 5′ non-coding regions, served as a case study. A methodology for group-combinatorial search has been implemented as a customized software platform. It extracts the longest common sequences for a group of related DNA sequences and allows for single gaps of varying length, as well as double- and multiple-gap sequences. The results point to common DNA sequences in a group of genes that is selectively expressed in megakaryocytes, and which do not appear in a large group of control, random and specific sequences. This suggests a role for a combination of these sequences in cell-specific gene expression in the megakaryocytic lineage. The data also point to an intrinsic cross-species difference in the organization of 5′ non-coding sequences within the mammalian genomes. This methodology may be used for the identification of regulatory sequences in other lineages.
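
The core idea of extracting the longest sequences shared by a whole group, then filtering out those that also occur in controls, can be sketched as follows. This is a simplified illustration, not the software platform described in the abstract: it handles only ungapped matches, uses a brute-force search suited to short promoter fragments, and all function names and toy sequences are hypothetical.

```python
def longest_common_substrings(sequences):
    """Return the set of longest ungapped substrings present in every sequence."""
    if not sequences:
        return set()
    shortest = min(sequences, key=len)
    # Try candidate lengths from longest to shortest; stop at the first hit.
    for length in range(len(shortest), 0, -1):
        candidates = {
            shortest[i:i + length]
            for i in range(len(shortest) - length + 1)
        }
        shared = {c for c in candidates if all(c in s for s in sequences)}
        if shared:
            return shared
    return set()


def absent_from_controls(motifs, controls):
    """Keep only motifs that occur in none of the control sequences."""
    return {m for m in motifs if not any(m in c for c in controls)}


if __name__ == "__main__":
    # Toy sequences, not real promoter data.
    group = ["TTGATAAGC", "CCGATAAGT", "AGATAAGG"]
    controls = ["ACGTACGTAC", "TTTTCCCCGG"]
    shared = longest_common_substrings(group)
    print(shared)
    print(absent_from_controls(shared, controls))
```

Supporting single or multiple gaps, as the abstract describes, would require a different search strategy than this exhaustive substring comparison.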

    Comparative genomics of repetitive elements between maize inbred lines B73 and Mo17

    Master of Science. Genetics Interdepartmental Program. Sanzhen Liu. The major component of complex genomes is repetitive elements, which remain recalcitrant to characterization. Using maize as a model system, we analyzed whole genome shotgun (WGS) sequences for the two maize inbred lines B73 and Mo17 using k-mer analysis to quantify the differences between the two genomes. Significant differences were identified in highly repetitive sequences, including centromere, 45S ribosomal DNA (rDNA), knob, and telomere repeats. Genotype-specific 45S rDNA sequences were discovered. The B73 and Mo17 polymorphic k-mers were used to examine allele-specific expression of 45S rDNA in the hybrids. Although Mo17 contains a higher copy number than B73, equivalent levels of overall 45S rDNA expression indicate that transcriptional or post-transcriptional regulation mechanisms operate for the 45S rDNA in the hybrids. Using WGS sequences of B73xMo17 doubled haploids, genomic locations showing differential repetitive contents were genetically mapped, revealing differences in organization of highly repetitive sequences between the two genomes. In an analysis of WGS sequences of HapMap2 lines, including the wild progenitor of maize, landraces, and improved lines, decreases and increases in abundance of additional sets of k-mers associated with centromere, 45S rDNA, knob, and retrotransposons were found among groups, revealing global evolutionary trends of genomic repeats during maize domestication and improvement.
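
A k-mer comparison of two read sets, as the abstract above describes in outline, might look like the following minimal sketch. It is not the thesis pipeline: the function names, the pseudocount and the normalization by total k-mer count are assumptions, and practical details such as canonical (reverse-complement) k-mers and sequencing-depth correction are omitted.

```python
from collections import Counter


def count_kmers(reads, k):
    """Count every overlapping k-mer across a collection of reads,
    skipping k-mers that contain an ambiguous base (N)."""
    counts = Counter()
    for read in reads:
        read = read.upper()
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            if "N" not in kmer:
                counts[kmer] += 1
    return counts


def abundance_ratios(counts_a, counts_b, pseudocount=1):
    """Ratio of normalized k-mer abundance in sample A relative to sample B.
    The pseudocount (an assumption) avoids division by zero for k-mers
    seen in only one sample."""
    total_a = sum(counts_a.values()) or 1
    total_b = sum(counts_b.values()) or 1
    ratios = {}
    for kmer in set(counts_a) | set(counts_b):
        freq_a = (counts_a[kmer] + pseudocount) / total_a
        freq_b = (counts_b[kmer] + pseudocount) / total_b
        ratios[kmer] = freq_a / freq_b
    return ratios


if __name__ == "__main__":
    # Toy reads standing in for WGS data from two inbred lines.
    line_a = count_kmers(["ACGTACGT"], k=4)
    line_b = count_kmers(["ACGTTTTT"], k=4)
    for kmer, ratio in sorted(abundance_ratios(line_a, line_b).items()):
        print(kmer, round(ratio, 3))
```

K-mers with ratios far from 1 are candidates for sequences that differ in copy number between the two samples.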