A new Plasmodium vivax reference sequence with improved assembly of the subtelomeres reveals an abundance of pir genes
Plasmodium vivax is now the predominant cause of malaria in the Asia-Pacific, South America and Horn of Africa. Laboratory studies of this species are constrained by the inability to maintain the parasite in continuous ex vivo culture, but genomic approaches provide an alternative and complementary avenue to investigate the parasite's biology and epidemiology. To date, molecular studies of P. vivax have relied on the Salvador-I reference genome sequence, derived from a monkey-adapted strain from South America. However, the Salvador-I reference remains highly fragmented with over 2500 unassembled scaffolds. Using high-depth Illumina sequence data, we assembled and annotated a new reference sequence, PvP01, sourced directly from a patient from Papua Indonesia. Draft assemblies of isolates from China (PvC01) and Thailand (PvT01) were also prepared for comparative purposes. The quality of the PvP01 assembly is improved greatly over Salvador-I, with fragmentation reduced to 226 scaffolds. Detailed manual curation has ensured highly comprehensive annotation, with functions attributed to 58% of core genes in PvP01 versus 38% in Salvador-I. The assemblies of PvP01, PvC01 and PvT01 are larger than that of Salvador-I (28-30 versus 27 Mb), owing to improved assembly of the subtelomeres. An extensive repertoire of over 1200 Plasmodium interspersed repeat (pir) genes was identified in PvP01 compared to 346 in Salvador-I, suggesting a vital role in parasite survival or development. The manually curated PvP01 reference and PvC01 and PvT01 draft assemblies are important new resources to study vivax malaria. PvP01 is maintained at GeneDB and ongoing curation will ensure continual improvements in assembly and annotation quality
Frequent expansion of Plasmodium vivax Duffy Binding Protein in Ethiopia and its epidemiological significance.
Plasmodium vivax invasion of human erythrocytes depends on the Duffy Binding Protein (PvDBP) which interacts with the Duffy antigen. PvDBP copy number has been recently shown to vary between P. vivax isolates in Sub-Saharan Africa. However, the extent of PvDBP copy number variation, the type of PvDBP multiplications, as well as its significance across broad samples are still unclear. We determined the prevalence and type of PvDBP duplications, as well as PvDBP copy number variation among 178 Ethiopian P. vivax isolates using a PCR-based diagnostic method, a novel quantitative real-time PCR assay and whole genome sequencing. For the 145 symptomatic samples, PvDBP duplications were detected in 95 isolates, of which 81 had the Cambodian and 14 Malagasy-type PvDBP duplications. PvDBP varied from 1 to >4 copies. Isolates with multiple PvDBP copies were found to be higher in symptomatic than asymptomatic infections. For the 33 asymptomatic samples, PvDBP was detected with two copies in two of the isolates, and both were the Cambodian-type PvDBP duplication. PvDBP copy number in Duffy-negative heterozygotes was not significantly different from that in Duffy-positives, providing no support for the hypothesis that increased copy number is a specific association with Duffy-negativity, although the number of Duffy-negatives was small and further sampling is required to test this association thoroughly
Genome Sequencing and Analysis of Yersinia pestis KIM D27, an Avirulent Strain Exempt from Select Agent Regulation
Yersinia pestis is the causative agent of the plague. Y. pestis KIM 10+ strain was passaged and selected for loss of the 102 kb pgm locus, resulting in an attenuated strain, KIM D27. In this study, whole genome sequencing was performed on KIM D27 in order to identify any additional differences. Initial assemblies of 454 data were highly fragmented, and various bioinformatic tools detected between 15 and 465 SNPs and INDELs when comparing both strains, the vast majority associated with A or T homopolymer sequences. Consequently, Illumina sequencing was performed to improve the quality of the assembly. Hybrid sequence assemblies were performed and a total of 56 validated SNP/INDELs and 5 repeat differences were identified in the D27 strain relative to published KIM 10+ sequence. However, further analysis showed that 55 of these SNP/INDELs and 3 repeats were errors in the KIM 10+ reference sequence. We conclude that both 454 and Illumina sequencing were required to obtain the most accurate and rapid sequence results for Y. pestis KIMD27. SNP and INDELS calls were most accurate when both Newbler and CLC Genomics Workbench were employed. For purposes of obtaining high quality genome sequence differences between strains, any identified differences should be verified in both the new and reference genomes
The complete chloroplast genome sequence of Gossypium hirsutum: organization and phylogenetic relationships to other angiosperms
BACKGROUND: Cotton (Gossypium hirsutum) is the most important fiber crop grown in 90 countries. In 2004–2005, US farmers planted 79% of the 5.7-million hectares of nuclear transgenic cotton. Unfortunately, genetically modified cotton has the potential to hybridize with other cultivated and wild relatives, resulting in geographical restrictions to cultivation. However, chloroplast genetic engineering offers the possibility of containment because of maternal inheritance of transgenes. The complete chloroplast genome of cotton provides essential information required for genetic engineering. In addition, the sequence data were used to assess phylogenetic relationships among the major clades of rosids using cotton and 25 other completely sequenced angiosperm chloroplast genomes. RESULTS: The complete cotton chloroplast genome is 160,301 bp in length, with 112 unique genes and 19 duplicated genes within the IR, containing a total of 131 genes. There are four ribosomal RNAs, 30 distinct tRNA genes and 17 intron-containing genes. The gene order in cotton is identical to that of tobacco but lacks rpl22 and infA. There are 30 direct and 24 inverted repeats 30 bp or longer with a sequence identity ≥ 90%. Most of the direct repeats are within intergenic spacer regions and introns, and a 72 bp-long direct repeat is within the psaA and psaB genes. Comparison of protein coding sequences with expressed sequence tags (ESTs) revealed nucleotide substitutions resulting in amino acid changes in ndhC, rpl23, rpl20, rps3 and clpP. Phylogenetic analyses of a data set including 61 protein-coding genes using both maximum likelihood and maximum parsimony were performed for 28 taxa, including cotton and five other angiosperm chloroplast genomes that were not included in any previous phylogenies. CONCLUSION: The cotton chloroplast genome lacks rpl22 and infA and contains a number of dispersed direct and inverted repeats. RNA editing resulted in amino acid changes with significant impact on their hydropathy. Phylogenetic analysis provides strong support for the position of cotton in the Malvales in the eurosids II clade sister to Arabidopsis in the Brassicales. Furthermore, there is strong support for the placement of the Myrtales sister to the eurosid I clade, although expanded taxon sampling is needed to further test this relationship
Complete Plastid Genome Sequence of Daucus Carota: Implications for Biotechnology and Phylogeny of Angiosperms
Background Carrot (Daucus carota) is a major food crop in the US and worldwide. Its capacity for storage and its lifecycle as a biennial make it an attractive species for the introduction of foreign genes, especially for oral delivery of vaccines and other therapeutic proteins. Until recently, efforts to express recombinant proteins in carrot have had limited success in terms of protein accumulation in the edible tap roots. Plastid genetic engineering offers the potential to overcome this limitation, as demonstrated by the accumulation of BADH in chromoplasts of carrot taproots to confer exceedingly high levels of salt resistance. The complete plastid genome of carrot provides essential information required for genetic engineering. Additionally, the sequence data add to the rapidly growing database of plastid genomes for assessing phylogenetic relationships among angiosperms. Results The complete carrot plastid genome is 155,911 bp in length, with 115 unique genes and 21 duplicated genes within the IR. There are four ribosomal RNAs, 30 distinct tRNA genes and 18 intron-containing genes. Repeat analysis reveals 12 direct and 2 inverted repeats ≥ 30 bp with a sequence identity ≥ 90%. Phylogenetic analyses of nucleotide sequences for 61 protein-coding genes using both maximum parsimony (MP) and maximum likelihood (ML) were performed for 29 angiosperms. Phylogenies from both methods provide strong support for the monophyly of several major angiosperm clades, including monocots, eudicots, rosids, asterids, eurosids II, euasterids I, and euasterids II. Conclusion The carrot plastid genome contains a number of dispersed direct and inverted repeats scattered throughout coding and non-coding regions. This is the first sequenced plastid genome of the family Apiaceae and only the second published genome sequence of the species-rich euasterid II clade. Both MP and ML trees provide very strong support (100% bootstrap) for the sister relationship of Daucus with Panax in the euasterid II clade. These results provide the best taxon sampling of complete chloroplast genomes and the strongest support yet for the sister relationship of Caryophyllales to the asterids. The availability of the complete plastid genome sequence should facilitate improved transformation efficiency and foreign gene expression in carrot through utilization of endogenous flanking sequences and regulatory elements
Using Plasmodium knowlesi as a model for screening Plasmodium vivax blood-stage malaria vaccine targets reveals new candidates.
Plasmodium vivax is responsible for the majority of malaria cases outside Africa. Unlike P. falciparum, the P. vivax life-cycle includes a dormant liver stage, the hypnozoite, which can cause infection in the absence of mosquito transmission. An effective vaccine against P. vivax blood stages would limit symptoms and pathology from such recurrent infections, and therefore could play a critical role in the control of this species. Vaccine development in P. vivax, however, lags considerably behind P. falciparum, which has many identified targets with several having transitioned to Phase II testing. By contrast only one P. vivax blood-stage vaccine candidate based on the Duffy Binding Protein (PvDBP), has reached Phase Ia, in large part because the lack of a continuous in vitro culture system for P. vivax limits systematic screening of new candidates. We used the close phylogenetic relationship between P. vivax and P. knowlesi, for which an in vitro culture system in human erythrocytes exists, to test the scalability of systematic reverse vaccinology to identify and prioritise P. vivax blood-stage targets. A panel of P. vivax proteins predicted to function in erythrocyte invasion were expressed as full-length recombinant ectodomains in a mammalian expression system. Eight of these antigens were used to generate polyclonal antibodies, which were screened for their ability to recognize orthologous proteins in P. knowlesi. These antibodies were then tested for inhibition of growth and invasion of both wild type P. knowlesi and chimeric P. knowlesi lines modified using CRISPR/Cas9 to exchange P. knowlesi genes with their P. vivax orthologues. Candidates that induced antibodies that inhibited invasion to a similar level as PvDBP were identified, confirming the utility of P. knowlesi as a model for P. vivax vaccine development and prioritizing antigens for further follow up.
Comparative genomics of mutualistic viruses of Glyptapanteles parasitic wasps
Comparative genome analysis of two endosymbiotic polydnaviruses from Glyptapanteles parasitic wasps reveals new insights into the evolutionary arms race between host and parasite
Asymptomatic Plasmodium vivax infections induce robust IgG responses to multiple blood-stage proteins in a low-transmission region of western Thailand
BACKGROUND: Thailand is aiming to eliminate malaria by the year 2024. Plasmodium vivax has now become the dominant species causing malaria within the country, and a high proportion of infections are asymptomatic. A better understanding of antibody dynamics to P. vivax antigens in a low-transmission setting, where acquired immune responses are poorly characterized, will be pivotal for developing new strategies for elimination, such as improved surveillance methods and vaccines. The objective of this study was to characterize total IgG antibody levels to 11 key P. vivax proteins in a village of western Thailand. METHODS: Plasma samples from 546 volunteers enrolled in a cross-sectional survey conducted in 2012 in Kanchanaburi Province were utilized. Total IgG levels to 11 different proteins known or predicted to be involved in reticulocyte binding or invasion (ARP, GAMA, P41, P12, PVX_081550, and five members of the PvRBP family), as well as the leading pre-erythrocytic vaccine candidate (CSP) were measured using a multiplexed bead-based assay. Associations between IgG levels and infection status, age, and spatial location were explored. RESULTS: Individuals from a low-transmission region of western Thailand reacted to all 11 P. vivax recombinant proteins. Significantly greater IgG levels were observed in the presence of a current P. vivax infection, despite all infected individuals being asymptomatic. IgG levels were also higher in adults (18 years and older) than in children. For most of the proteins, higher IgG levels were observed in individuals living closer to the Myanmar border and further away from local health services. CONCLUSIONS: Robust IgG responses were observed to most proteins and IgG levels correlated with surrogates of exposure, suggesting these antigens may serve as potential biomarkers of exposure, immunity, or both
Evaluation of transboundary environmental issues in Central Europe
Central Europe has experienced environmental degradation for hundreds of years. The proximity of countries, their shared resources, and transboundary movement of environmental pollution, create the potential for regional environmental strife. The goal of this project was to identify the sources and sinks of environmental pollution in Central Europe and evaluate the possible impact of transboundary movement of pollution on the countries of Central Europe. In meeting the objectives of identifying sources of contaminants, determining transboundary movement of contaminants, and assessing socio-economic implications, large quantities of disparate data were examined. To facilitate use of the data, the authors refined mapping procedures that enable processing information from virtually any map or spreadsheet data that can be geo-referenced. Because the procedure is freed from a priori constraints of scale that confound most Geographical Information Systems, they have the capacity to generate new projections and apply sophisticated statistical analyses to the data. The analysis indicates substantial environmental problems. While transboundary pollution issues may spawn conflict among the Central European countries and their neighbors, it appears that common environmental problems facing the entire region have had the effect of bringing the countries together, even though opportunities for deteriorating relationships may still arise