56 research outputs found
Comparative Analysis of Tandem Repeats from Hundreds of Species Reveals Unique Insights into Centromere Evolution
Centromeres are essential for chromosome segregation, yet their DNA sequences
evolve rapidly. In most animals and plants that have been studied, centromeres
contain megabase-scale arrays of tandem repeats. Despite their importance, very
little is known about the degree to which centromere tandem repeats share
common properties between different species across different phyla. We used
bioinformatic methods to identify high-copy tandem repeats from 282 species
using publicly available genomic sequence and our own data. The assumption that
the most abundant tandem repeat is the centromere DNA was true for most species
whose centromeres have been previously characterized, suggesting this is a
general property of genomes. Our methods are compatible with all current
sequencing technologies. Long Pacific Biosciences sequence reads allowed us to
find tandem repeat monomers up to 1,419 bp. High-copy centromere tandem repeats
were found in almost all animal and plant genomes, but repeat monomers were
highly variable in sequence composition and in length. Furthermore,
phylogenetic analysis of sequence homology showed little evidence of sequence
conservation beyond ~50 million years of divergence. We find that despite an
overall lack of sequence conservation, centromere tandem repeats from diverse
species showed similar modes of evolution, including the appearance of higher
order repeat structures in which several polymorphic monomers make up a larger
repeating unit. While centromere position in most eukaryotes is epigenetically
determined, our results indicate that tandem repeats are highly prevalent at
centromeres of both animals and plants. This suggests a functional role for
such repeats, perhaps in promoting concerted evolution of centromere DNA across
chromosomes.
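The idea of a tandem repeat monomer can be illustrated with a minimal sketch. This is not the authors' pipeline, which the abstract does not describe in detail; it is a naive check, under the assumption that a read consists mostly of one repeat, that looks for the smallest shift at which a sequence matches itself. The function name `best_period` and its parameters are hypothetical.

```python
def best_period(seq, min_period=2, max_period=None, min_identity=0.9):
    """Return the smallest shift p at which seq matches itself, or None.

    Naive illustration only: compares seq[i] with seq[i + p] for every i
    and accepts the first p whose match fraction reaches min_identity.
    Runs in O(n * max_period) time, so it suits short sequences.
    """
    n = len(seq)
    if max_period is None:
        max_period = n // 2  # require at least two copies of the monomer
    for p in range(min_period, max_period + 1):
        matches = sum(1 for i in range(n - p) if seq[i] == seq[i + p])
        if matches / (n - p) >= min_identity:
            return p
    return None


if __name__ == "__main__":
    read = "ACGTT" * 6          # toy read: six copies of a 5 bp monomer
    period = best_period(read)
    print(period, read[:period] if period else None)
```

Real repeat finders tolerate insertions and deletions and handle higher-order repeats, which this sketch does not.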
The impact of sequencing depth on the inferred taxonomic composition and AMR gene content of metagenomic samples
Shotgun metagenomics is increasingly used to characterise microbial communities, particularly for the investigation of antimicrobial resistance (AMR) in different animal and environmental contexts. There are many different approaches for inferring the taxonomic composition and AMR gene content of complex community samples from shotgun metagenomic data, but there has been little work establishing the optimum sequencing depth, data processing and analysis methods for these samples. In this study we used shotgun metagenomics and sequencing of cultured isolates from the same samples to address these issues. We sampled three potential environmental AMR gene reservoirs (pig caeca, river sediment, effluent) and sequenced samples with shotgun metagenomics at high depth (~200 million reads per sample). Alongside this, we cultured single-colony isolates of Enterobacteriaceae from the same samples and used hybrid sequencing (short and long reads) to create high-quality assemblies for comparison to the metagenomic data. To automate data processing, we developed an open-source software pipeline, ‘ResPipe’.
Capture of complete ciliate chromosomes in single sequencing reads reveals widespread chromosome isoforms
Background
Whole-genome shotgun sequencing, which stitches together millions of short sequencing reads into a single genome, ushered in the era of modern genomics and led to a rapid expansion of the number of genome sequences available. Nevertheless, assembly of short reads remains difficult, resulting in fragmented genome sequences. Ultimately, only a sequencing technology capable of capturing complete chromosomes in a single run could resolve all ambiguities. Even “third generation” sequencing technologies produce reads far shorter than most eukaryotic chromosomes. However, the ciliate Oxytricha trifallax has a somatic genome with thousands of chromosomes averaging only 3.2 kbp, making it an ideal candidate for exploring the benefits of sequencing whole chromosomes without assembly.
Results
We used single-molecule real-time sequencing to capture thousands of complete chromosomes in single reads and to update the published Oxytricha trifallax JRB310 genome assembly. In this version, over 50% of the completed chromosomes with two telomeres derive from single reads. The improved assembly includes over 12,000 new chromosome isoforms, and demonstrates that somatic chromosomes derive from variable rearrangements between somatic segments encoded up to 191,000 base pairs away. However, while long reads reduce the need for assembly, a hybrid approach that supplements long-read sequencing with short reads for error correction produced the most complete and accurate assembly, overall.
Conclusions
This assembly provides the first example of complete eukaryotic chromosomes captured by single sequencing reads and demonstrates that traditional approaches to genome assembly can mask considerable structural variation.
Intratumoral heterogeneity and clonal evolution in liver cancer
Clonal evolution of a tumor ecosystem depends on different selection pressures that are principally immune and treatment mediated. We integrate RNA-seq, DNA sequencing, TCR-seq and SNP array data across multiple regions of liver cancer specimens to map spatio-temporal interactions between cancer and immune cells. We investigate how these interactions reflect intra-tumor heterogeneity (ITH) by correlating regional neo-epitope and viral antigen burden with the regional adaptive immune response. Regional expression of passenger mutations dominantly recruits adaptive responses as opposed to hepatitis B virus and cancer-testis antigens. We detect different clonal expansion of the adaptive immune system in distant regions of the same tumor. An ITH-based gene signature improves single-biopsy patient survival predictions and an expression survey of 38,553 single cells across 7 regions of 2 patients further reveals heterogeneity in liver cancer. These data quantify transcriptomic ITH and how the different components of the HCC ecosystem interact during cancer evolution.
Genomic network analysis of environmental and livestock F-type plasmid populations
F-type plasmids are diverse and of great clinical significance, often carrying genes conferring antimicrobial resistance (AMR) such as extended-spectrum β-lactamases, particularly in Enterobacterales. Organising this plasmid diversity is challenging, and current knowledge is largely based on plasmids from clinical settings. Here, we present a network community analysis of a large survey of F-type plasmids from environmental (influent, effluent, and upstream/downstream waterways surrounding wastewater treatment works) and livestock settings. We use a tractable and scalable methodology to examine the relationship between plasmid metadata and network communities. This reveals how niche (sampling compartment and host genera) partition and shape plasmid diversity. We also perform pangenome-style analyses on network communities. We show that such communities define unique combinations of core genes, with limited overlap. Building plasmid phylogenies based on alignments of these core genes, we demonstrate that plasmid accessory function is closely linked to core gene content. Taken together, our results suggest that stable F-type plasmid backbone structures can persist in environmental settings while allowing dramatic variation in accessory gene content that may be linked to niche adaptation. The association of F-type plasmids with AMR likely reflects their suitability for rapid niche adaptation
Bat pluripotent stem cells reveal unique entanglement between host and viruses
Bats have evolved features unique amongst mammals, including flight and laryngeal echolocation, and certain species have been shown to have a unique immune response that may enable them to tolerate viruses such as SARS-CoVs, MERS-CoVs, Nipah, and Marburg viruses. Robust cellular models have yet to be developed for bats, hindering our ability to further understand their special biology and handling of viral pathogens. To establish bats as new model study species, we generated induced pluripotent stem cells (iPSCs) from a wild greater horseshoe bat (Rhinolophus ferrumequinum) using a modified Yamanaka protocol. Rhinolophids are amongst the longest living bat species and are asymptomatic carriers of coronaviruses, including one of the viruses most closely related to SARS-CoV-2. Bat induced pluripotent stem (BiPS) cells were stable in culture, readily differentiated into all three germ layers, and formed complex embryoid bodies, including organoids. The BiPS cells were found to have a core pluripotency gene expression program similar to that of other species, but it also resembled that of cells attacked by viruses. The BiPS cells produced a rich set of diverse endogenized viral sequences, in particular retroviruses. We further validated our protocol by developing iPS cells from an evolutionarily distant bat species, Myotis myotis (greater mouse-eared bat), non-lethally sampled in the wild, which exhibited similar attributes to the greater horseshoe bat iPS cells, suggesting that this unique pluripotent state evolved in the ancestral bat lineage. Although previous studies have suggested that bats have developed powerful strategies to tame their inflammatory response, our results argue that they have also evolved mechanisms to accommodate a substantial load of endogenous viral sequences and suggest that the natural history of bats and viruses is more profoundly intertwined than previously thought.
Further study of bat iPS cells and their differentiated progeny should advance our understanding of the role bats play as virus hosts, provide a novel method of disease surveillance, and enable the functional studies required to ascertain the molecular basis of bats’ unique traits.
Niche and local geography shape the pangenome of wastewater- and livestock-associated Enterobacteriaceae
Escherichia coli and other Enterobacteriaceae are diverse species with “open” pangenomes, where genes move intra- and interspecies via horizontal gene transfer. However, most analyses focus on clinical isolates. The pangenome dynamics of natural populations remain understudied, despite their suggested role as reservoirs for antimicrobial resistance (AMR) genes. Here, we analyze near-complete genomes for 827 Enterobacteriaceae (553 Escherichia and 274 non-Escherichia spp.) with 2292 circularized plasmids in total, collected from 19 locations (livestock farms and wastewater treatment works in the United Kingdom) within a 30-km radius at three time points over a year. We find different dynamics for chromosomal and plasmid-borne genes. Plasmids have a higher burden of AMR genes and insertion sequences, and AMR-gene-carrying plasmids show evidence of being under stronger selective pressure. Environmental niche and local geography both play a role in shaping plasmid dynamics. Our results highlight the importance of local strategies for controlling the spread of AMR
Klebsiella pneumoniae induces host metabolic stress that promotes tolerance to pulmonary infection
K. pneumoniae sequence type 258 (Kp ST258) is a major cause of healthcare-associated pneumonia. However, it remains unclear how it causes protracted courses of infection in spite of its expression of immunostimulatory lipopolysaccharide, which should activate a brisk inflammatory response and bacterial clearance. We predicted that the metabolic stress induced by the bacteria in the host cells shapes an immune response that tolerates infection. We combined in situ metabolic imaging and transcriptional analyses to demonstrate that Kp ST258 activates host glutaminolysis and fatty acid oxidation. This response creates an oxidant-rich microenvironment conducive to the accumulation of anti-inflammatory myeloid cells. In this setting, metabolically active Kp ST258 elicits a disease-tolerant immune response. The bacteria, in turn, adapt to airway oxidants by upregulating the type VI secretion system, which is highly conserved across ST258 strains worldwide. Thus, much of the global success of Kp ST258 in hospital settings can be explained by the metabolic activity provoked in the host that promotes disease tolerance.
Keywords: immunometabolism, Klebsiella pneumoniae, immunosuppressive, anti-inflammatory, itaconate, Type Six Secretion System
SARS-CoV-2 susceptibility and COVID-19 disease severity are associated with genetic variants affecting gene expression in a variety of tissues
Variability in SARS-CoV-2 susceptibility and COVID-19 disease severity between individuals is partly due to
genetic factors. Here, we identify 4 genomic loci with suggestive associations for SARS-CoV-2 susceptibility
and 19 for COVID-19 disease severity. Four of these 23 loci likely have an ethnicity-specific component.
Genome-wide association study (GWAS) signals in 11 loci colocalize with expression quantitative trait loci
(eQTLs) associated with the expression of 20 genes in 62 tissues/cell types (range: 1 to 43 tissues/gene),
including lung, brain, heart, muscle, and skin as well as the digestive system and immune system. We perform
genetic fine mapping to compute 99% credible SNP sets, which identify 10 GWAS loci that have eight or fewer
SNPs in the credible set, including three loci with one single likely causal SNP. Our study suggests that the
diverse symptoms and disease severity of COVID-19 observed between individuals are associated with variants across the genome, affecting gene expression levels in a wide variety of tissue types.
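As a rough illustration of what a 99% credible SNP set is, the sketch below ranks SNPs at one locus by posterior probability of being causal and keeps the smallest leading group whose cumulative probability reaches 0.99. It assumes the per-SNP posteriors have already been computed by some fine-mapping method, which the abstract does not name; the function name `credible_set` and the SNP identifiers are hypothetical.

```python
def credible_set(posteriors, coverage=0.99):
    """Return the smallest list of SNP ids whose posteriors sum to >= coverage.

    posteriors: dict mapping SNP id -> posterior probability of causality
    at a single locus. Values are renormalised to sum to 1 (an assumption
    made here for robustness, not something the source specifies).
    """
    total = sum(posteriors.values())
    if total <= 0:
        raise ValueError("posterior probabilities must sum to a positive value")
    # Rank SNPs from most to least likely causal.
    ranked = sorted(posteriors.items(), key=lambda kv: kv[1], reverse=True)
    chosen, cumulative = [], 0.0
    for snp, prob in ranked:
        chosen.append(snp)
        cumulative += prob / total
        if cumulative >= coverage:
            break
    return chosen


if __name__ == "__main__":
    # Hypothetical locus where one SNP carries almost all of the posterior mass.
    locus = {"snp_a": 0.995, "snp_b": 0.003, "snp_c": 0.002}
    print(credible_set(locus))
```

A locus whose credible set contains a single SNP, as in this toy input, corresponds to the "one single likely causal SNP" case mentioned in the abstract.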