An improved Plasmodium cynomolgi genome assembly reveals an unexpected methyltransferase gene expansion.

Author: Berriman Matt
Böhme Ulrike
Kocken Clemens H.M.
Otto Thomas Dan
Pasini Erica M.
Rutledge Gavin G.
Sanders Mandy
Voorberg-Van der Wel Annemarie
Publication venue: 'F1000 Research Ltd'
Publication date: 01/06/2017

Background: Plasmodium cynomolgi, a non-human primate malaria parasite species, has been an important model parasite since its discovery in 1907. Similarities in the biology of P. cynomolgi to the closely related, but less tractable, human malaria parasite P. vivax make it the model parasite of choice for liver biology and vaccine studies pertinent to P. vivax malaria. Molecular and genome-scale studies of P. cynomolgi have relied on the current reference genome sequence, which remains highly fragmented with 1,649 unassigned scaffolds and little representation of the subtelomeres. Methods: Using long-read sequence data (Pacific Biosciences SMRT technology), we assembled and annotated a new reference genome sequence, PcyM, sourced from an Indian rhesus monkey. We compare the newly assembled genome sequence with those of several other Plasmodium species, including a re-annotated P. coatneyi assembly. Results: The new PcyM genome assembly is of significantly higher quality than the existing reference, comprising only 56 pieces, no gaps and an improved average gene length. Detailed manual curation has ensured a comprehensive annotation of the genome with 6,632 genes, nearly 1,000 more than previously attributed to P. cynomolgi. The new assembly also has an improved representation of the subtelomeric regions, which account for nearly 40% of the sequence. Within the subtelomeres, we identified more than 1,300 Plasmodium interspersed repeat (pir) genes, as well as a striking expansion of 36 methyltransferase pseudogenes that originated from a single copy on chromosome 9. Conclusions: The manually curated PcyM reference genome sequence is an important new resource for the malaria research community. The high quality and contiguity of the data have enabled the discovery of a novel expansion of methyltransferase in the subtelomeres, and illustrate the new comparative genomics capabilities that are being unlocked by complete reference genomes.

Directory of Open Access Journals

Enlighten

A new Plasmodium vivax reference sequence with improved assembly of the subtelomeres reveals an abundance of pir genes

Author: Auburn Sarah
Berriman Matthew
Böhme Ulrike
Gao Qi
Hostetler Jessica
Newbold Chris I
Nosten Francois
Otto Thomas D.
Price Ric N
Sanders Mandy
Steinbiss Sascha
Trimarsanto Hidayat
Publication venue: 'F1000 Research Ltd'
Publication date: 01/01/2016

Plasmodium vivax is now the predominant cause of malaria in the Asia-Pacific, South America and the Horn of Africa. Laboratory studies of this species are constrained by the inability to maintain the parasite in continuous ex vivo culture, but genomic approaches provide an alternative and complementary avenue to investigate the parasite's biology and epidemiology. To date, molecular studies of P. vivax have relied on the Salvador-I reference genome sequence, derived from a monkey-adapted strain from South America. However, the Salvador-I reference remains highly fragmented with over 2500 unassembled scaffolds. Using high-depth Illumina sequence data, we assembled and annotated a new reference sequence, PvP01, sourced directly from a patient from Papua Indonesia. Draft assemblies of isolates from China (PvC01) and Thailand (PvT01) were also prepared for comparative purposes. The quality of the PvP01 assembly is improved greatly over Salvador-I, with fragmentation reduced to 226 scaffolds. Detailed manual curation has ensured highly comprehensive annotation, with functions attributed to 58% of core genes in PvP01 versus 38% in Salvador-I. The assemblies of PvP01, PvC01 and PvT01 are larger than that of Salvador-I (28-30 versus 27 Mb), owing to improved assembly of the subtelomeres. An extensive repertoire of over 1200 Plasmodium interspersed repeat (pir) genes was identified in PvP01 compared to 346 in Salvador-I, suggesting a vital role in parasite survival or development. The manually curated PvP01 reference and PvC01 and PvT01 draft assemblies are important new resources to study vivax malaria. PvP01 is maintained at GeneDB and ongoing curation will ensure continual improvements in assembly and annotation quality.

Crossref

Directory of Open Access Journals

PubMed Central

Oxford University Research Archive

Enlighten

A Molecular Biology Database Digest

Author: Bry François
Kröger Peer
Publication date: 01/01/2000

Computational Biology or Bioinformatics has been defined as the application of mathematical and Computer Science methods to solving problems in Molecular Biology that require large scale data, computation, and analysis [18]. As expected, Molecular Biology databases play an essential role in Computational Biology research and development. This paper introduces current Molecular Biology databases, stressing data modeling, data acquisition, data retrieval, and the integration of Molecular Biology data from different sources. This paper is primarily intended for an audience of computer scientists with a limited background in Biology.

CiteSeerX

Open Access LMU

Identification and characterization of dysregulated P-element induced wimpy testis-interacting RNAs in head and neck squamous cell carcinoma.

Author: Ku Jonjei
Kuo Selena Z
Li Pin Xue
Ongkeko Weg M
Saad Maarouf A
Wang-Rodriguez Jessica
Yu Michael Andrew
Zheng Hao
Publication venue: eScholarship, University of California
Publication date: 01/03/2019

It is clear that alcohol consumption is a major risk factor in the pathogenesis of head and neck squamous cell carcinoma (HNSCC); however, the molecular mechanism underlying the pathogenesis of alcohol-associated HNSCC remains poorly understood. The aim of the present study was to identify and characterize P-element-induced wimpy testis (PIWI)-interacting RNAs (piRNAs) and PIWI proteins dysregulated in alcohol-associated HNSCC to elucidate their function in the development of this cancer. Using next generation RNA-sequencing (RNA-seq) data obtained from 40 HNSCC patients, the piRNA and PIWI protein expression of HNSCC samples was compared between alcohol drinkers and non-drinkers. A separate piRNA expression RNA-seq analysis of 18 non-smoker HNSCC patients was also conducted. To verify piRNA expression, reverse transcription-quantitative polymerase chain reaction (RT-qPCR) was performed on the most differentially expressed alcohol-associated piRNAs in ethanol and acetaldehyde-treated normal oral keratinocytes. The correlation between piRNA expression and patient survival was analyzed using Kaplan-Meier estimators and multivariate Cox proportional hazard models. A comparison between alcohol drinking and non-drinking HNSCC patients demonstrated that a panel of 3,223 piRNA transcripts were consistently detected and differentially expressed. RNA-seq analysis and in vitro RT-qPCR verification revealed that 4 of these piRNAs, piR-35373, piR-266308, piR-58510 and piR-38034, were significantly dysregulated between drinking and non-drinking cohorts. Of these four piRNAs, low expression of piR-58510 and piR-35373 significantly correlated with improved patient survival. Furthermore, human PIWI-like protein 4 was consistently upregulated in ethanol and acetaldehyde-treated normal oral keratinocytes. 
These results demonstrate that alcohol consumption may cause dysregulation of piRNA expression in HNSCC, and in vitro verifications identified 4 piRNAs that may be involved in the pathogenesis of alcohol-associated HNSCC.

eScholarship - University of California

Transcriptional and Proteomic Analysis of a Ferric Uptake Regulator (Fur) Mutant of Shewanella oneidensis: Possible Involvement of Fur in Energy Metabolism, Transcriptional Regulation, and Oxidative Stress

Author: Beliaev Alexander S.
Brandt Craig C.
Giometti Carol S.
Khare Tripti
Lies Douglas P.
Lim Hanjo
Nealson Kenneth H.
Thompson Dorothea K.
Tiedje James M.
Tollaksen Sandra L.
Yates John, III
Zhou Jizhong
Publication venue: 'American Society for Microbiology'
Publication date: 01/01/2002

The iron-directed, coordinate regulation of genes depends on the fur (ferric uptake regulator) gene product, which acts as an iron-responsive, transcriptional repressor protein. To investigate the biological function of a fur homolog in the dissimilatory metal-reducing bacterium Shewanella oneidensis MR-1, a fur knockout strain (FUR1) was generated by suicide plasmid integration into this gene and characterized using phenotype assays, DNA microarrays containing 691 arrayed genes, and two-dimensional polyacrylamide gel electrophoresis. Physiological studies indicated that FUR1 was similar to the wild-type strain when they were compared for anaerobic growth and reduction of various electron acceptors. Transcription profiling, however, revealed that genes with predicted functions in electron transport, energy metabolism, transcriptional regulation, and oxidative stress protection were either repressed (ccoNQ, etrA, cytochrome b and c maturation-encoding genes, qor, yiaY, sodB, rpoH, phoB, and chvI) or induced (yggW, pdhC, prpC, aceE, fdhD, and ppc) in the fur mutant. Disruption of fur also resulted in derepression of genes (hxuC, alcC, fhuA, hemR, irgA, and ompW) putatively involved in iron uptake. This agreed with the finding that the fur mutant produced threefold-higher levels of siderophore than the wild-type strain under conditions of sufficient iron. Analysis of a subset of the FUR1 proteome (i.e., primarily soluble cytoplasmic and periplasmic proteins) indicated that 11 major protein species reproducibly showed significant (P < 0.05) differences in abundance relative to the wild type. Protein identification using mass spectrometry indicated that the expression of two of these proteins (SodB and AlcC) correlated with the microarray data. These results suggest a possible regulatory role of S. oneidensis MR-1 Fur in energy metabolism that extends the traditional model of Fur as a negative regulator of iron acquisition systems.

CU FIND (Campbell University, Catherine W. Wood School of Nursing)

PubMed Central

Caltech Authors

Integration of Biological Sources: Exploring the Case of Protein Homology

Author: Boerman Tjeerd W.
Keulen Maurice van
Severing Edouard I.
Vet Paul van der
Publication venue: University of Twente, Centre for Telematics and Information Technology
Publication date: 01/01/2011

Data integration is a key issue in the domain of bioinformatics, which deals with huge amounts of heterogeneous biological data that grows and changes rapidly. This paper serves as an introduction to the field of bioinformatics and the biological concepts it deals with, and an exploration of the integration problems a bioinformatics scientist faces. We examine ProGMap, an integrated protein homology system used by bioinformatics scientists at Wageningen University, and several use cases related to protein homology. A key issue we identify is the huge manual effort required to unify source databases into a single resource. Uncertain databases are able to contain several possible worlds, and it has been proposed that they can be used to significantly reduce initial integration efforts. We propose several directions for future work where uncertain databases can be applied to bioinformatics, with the goal of furthering the cause of bioinformatics integration.

University of Twente Research Information

The Universal Protein Resource (UniProt)

Author: Apweiler Rolf
Bairoch Amos
Barker Winona C.
Boeckmann Brigitte
Ferro Serenella
Gasteiger Elisabeth
Huang Hongzhan
Lopez Rodrigo
Magrane Michele
Martin Maria J.
Natale Darren A.
O'Donovan Claire
Redaschi Nicole
Wu Cathy H.
Yeh Lai-Su L.
Publication date: 02/08/2017

The Universal Protein Resource (UniProt) provides the scientific community with a single, centralized, authoritative resource for protein sequences and functional information. Formed by uniting the Swiss-Prot, TrEMBL and PIR protein database activities, the UniProt consortium produces three layers of protein sequence databases: the UniProt Archive (UniParc), the UniProt Knowledgebase (UniProt) and the UniProt Reference (UniRef) databases. The UniProt Knowledgebase is a comprehensive, fully classified, richly and accurately annotated protein sequence knowledgebase with extensive cross-references. This centrepiece consists of two sections: UniProt/Swiss-Prot, with fully, manually curated entries; and UniProt/TrEMBL, enriched with automated classification and annotation. During 2004, tens of thousands of Knowledgebase records were manually annotated or updated; we introduced a new comment line topic, TOXIC DOSE, to store information on the acute toxicity of a toxin; the UniProt keyword list was augmented with additional keywords; and we improved the documentation of the keywords and are continuously overhauling and standardizing the annotation of post-translational modifications. Furthermore, we introduced a new documentation file of the strains and their synonyms. Many new database cross-references were introduced and we started to make use of Digital Object Identifiers. In collaboration with the Macromolecular Structure Database group at EBI, we also achieved an improved integration with structural databases by residue-level mapping of sequences from the Protein Data Bank entries onto corresponding UniProt entries. For convenient sequence searches we provide the UniRef non-redundant sequence databases. The comprehensive UniParc database stores the complete body of publicly available protein sequence data. The UniProt databases can be accessed online (http://www.uniprot.org) or downloaded in several formats (ftp://ftp.uniprot.org/pub). New releases are published every two weeks.

RERO DOC Digital Library

UniProt: the Universal Protein knowledgebase

Author: Apweiler Rolf
Bairoch Amos
Barker Winona C.
Boeckmann Brigitte
Ferro Serenella
Gasteiger Elisabeth
Huang Hongzhan
Lopez Rodrigo
Magrane Michele
Martin Maria J.
Natale Darren A.
O'Donovan Claire
Redaschi Nicole
Wu Cathy H.
Yeh Lai‐Su L.
Publication date: 02/08/2017

To provide the scientific community with a single, centralized, authoritative resource for protein sequences and functional information, the Swiss‐Prot, TrEMBL and PIR protein database activities have united to form the Universal Protein Knowledgebase (UniProt) consortium. Our mission is to provide a comprehensive, fully classified, richly and accurately annotated protein sequence knowledgebase, with extensive cross‐references and query interfaces. The central database will have two sections, corresponding to the familiar Swiss‐Prot (fully manually curated entries) and TrEMBL (enriched with automated classification, annotation and extensive cross‐references). For convenient sequence searches, UniProt also provides several non‐redundant sequence databases. The UniProt NREF (UniRef) databases provide representative subsets of the knowledgebase suitable for efficient searching. The comprehensive UniProt Archive (UniParc) is updated daily from many public source databases. The UniProt databases can be accessed online (http://www.uniprot.org) or downloaded in several formats (ftp://ftp.uniprot.org/pub). The scientific community is encouraged to submit data for inclusion in UniProt.

RERO DOC Digital Library