54 research outputs found

    Comparative Analysis of Tandem Repeats from Hundreds of Species Reveals Unique Insights into Centromere Evolution

    Centromeres are essential for chromosome segregation, yet their DNA sequences evolve rapidly. In most animals and plants that have been studied, centromeres contain megabase-scale arrays of tandem repeats. Despite their importance, very little is known about the degree to which centromere tandem repeats share common properties between different species across different phyla. We used bioinformatic methods to identify high-copy tandem repeats from 282 species using publicly available genomic sequence and our own data. The assumption that the most abundant tandem repeat is the centromere DNA was true for most species whose centromeres have been previously characterized, suggesting this is a general property of genomes. Our methods are compatible with all current sequencing technologies. Long Pacific Biosciences sequence reads allowed us to find tandem repeat monomers up to 1,419 bp. High-copy centromere tandem repeats were found in almost all animal and plant genomes, but repeat monomers were highly variable in sequence composition and in length. Furthermore, phylogenetic analysis of sequence homology showed little evidence of sequence conservation beyond ~50 million years of divergence. We find that despite an overall lack of sequence conservation, centromere tandem repeats from diverse species showed similar modes of evolution, including the appearance of higher order repeat structures in which several polymorphic monomers make up a larger repeating unit. While centromere position in most eukaryotes is epigenetically determined, our results indicate that tandem repeats are highly prevalent at centromeres of both animals and plants. This suggests a functional role for such repeats, perhaps in promoting concerted evolution of centromere DNA across chromosomes

    Intratumoral heterogeneity and clonal evolution in liver cancer

    Clonal evolution of a tumor ecosystem depends on different selection pressures that are principally immune and treatment mediated. We integrate RNA-seq, DNA sequencing, TCR-seq and SNP array data across multiple regions of liver cancer specimens to map spatio-temporal interactions between cancer and immune cells. We investigate how these interactions reflect intra-tumor heterogeneity (ITH) by correlating regional neo-epitope and viral antigen burden with the regional adaptive immune response. Regional expression of passenger mutations dominantly recruits adaptive responses as opposed to hepatitis B virus and cancer-testis antigens. We detect different clonal expansion of the adaptive immune system in distant regions of the same tumor. An ITH-based gene signature improves single-biopsy patient survival predictions and an expression survey of 38,553 single cells across 7 regions of 2 patients further reveals heterogeneity in liver cancer. These data quantify transcriptomic ITH and how the different components of the HCC ecosystem interact during cancer evolution

    Bat pluripotent stem cells reveal unique entanglement between host and viruses

    Bats have evolved features unique amongst mammals, including flight and laryngeal echolocation, and certain species have been shown to have a unique immune response that may enable them to tolerate viruses such as SARS-CoVs, MERS-CoVs, Nipah, and Marburg viruses. Robust cellular models have yet to be developed for bats, hindering our ability to further understand their special biology and handling of viral pathogens. To establish bats as new model study species, we generated induced pluripotent stem cells (iPSCs) from a wild greater horseshoe bat (Rhinolophus ferrumequinum) using a modified Yamanaka protocol. Rhinolophids are amongst the longest-living bat species and are asymptomatic carriers of coronaviruses, including one of the viruses most closely related to SARS-CoV-2. Bat induced pluripotent stem (BiPS) cells were stable in culture, readily differentiated into all three germ layers, and formed complex embryoid bodies, including organoids. The BiPS cells were found to have a core pluripotency gene expression program similar to that of other species, but it also resembled that of cells attacked by viruses. The BiPS cells produced a rich set of diverse endogenized viral sequences, in particular retroviruses. We further validated our protocol by developing iPS cells from an evolutionarily distant bat species, Myotis myotis (greater mouse-eared bat), non-lethally sampled in the wild, which exhibited similar attributes to the greater horseshoe bat iPS cells, suggesting that this unique pluripotent state evolved in the ancestral bat lineage. Although previous studies have suggested that bats have developed powerful strategies to tame their inflammatory response, our results argue that they have also evolved mechanisms to accommodate a substantial load of endogenous viral sequences and suggest that the natural history of bats and viruses is more profoundly intertwined than previously thought. Further study of bat iPS cells and their differentiated progeny should advance our understanding of the role bats play as virus hosts, provide a novel method of disease surveillance, and enable the functional studies required to ascertain the molecular basis of bats’ unique traits.

    SARS-CoV-2 susceptibility and COVID-19 disease severity are associated with genetic variants affecting gene expression in a variety of tissues

    Variability in SARS-CoV-2 susceptibility and COVID-19 disease severity between individuals is partly due to genetic factors. Here, we identify 4 genomic loci with suggestive associations for SARS-CoV-2 susceptibility and 19 for COVID-19 disease severity. Four of these 23 loci likely have an ethnicity-specific component. Genome-wide association study (GWAS) signals in 11 loci colocalize with expression quantitative trait loci (eQTLs) associated with the expression of 20 genes in 62 tissues/cell types (range: 1 to 43 tissues/gene), including lung, brain, heart, muscle, and skin as well as the digestive system and immune system. We perform genetic fine mapping to compute 99% credible SNP sets, which identify 10 GWAS loci that have eight or fewer SNPs in the credible set, including three loci with a single likely causal SNP. Our study suggests that the diverse symptoms and disease severity of COVID-19 observed between individuals are associated with variants across the genome, affecting gene expression levels in a wide variety of tissue types.