57 research outputs found

    ProtRepeatsDB: a database of amino acid repeats in genomes

    BACKGROUND: Genome-wide and cross-species comparison of amino acid repeats is an intriguing problem in biology, mainly due to the highly polymorphic nature and diverse functions of amino acid repeats. Innate protein repeats constitute vital functional and structural regions in proteins. Repeats are of great consequence in the evolution of proteins, as is evident from analysis of repeats in different organisms. In the post-genomic era, the availability of protein sequences encoded in different genomes provides a unique opportunity to perform large-scale comparative studies of amino acid repeats. ProtRepeatsDB is a relational database of perfect and mismatch repeats, designed as a resource and collection of tools for detection and cross-species comparison of different types of amino acid repeats. DESCRIPTION: ProtRepeatsDB (v1.2) consists of perfect as well as mismatch amino acid repeats in the protein sequences of 141 organisms, the genomes of which are now available. The web interface of ProtRepeatsDB provides tools to perform repeat searches based on protein IDs, organism name, repeat sequences, keywords as in FASTA headers, size, frequency, gene ontology (GO) annotation IDs and regular expressions (REGEXP) describing repeats. These tools also allow formulation of a variety of simple, complex and logical queries to facilitate mining and large-scale cross-species comparisons of amino acid repeats. In addition, the database contains sequence analysis tools to determine repeats in user input sequences. CONCLUSION: ProtRepeatsDB is a multi-organism database of different types of amino acid repeats present in proteins. It integrates useful tools to perform genome-wide queries for rapid screening and identification of amino acid repeats, and facilitates comparative and evolutionary studies of the repeats. The database is useful for identification of species- or organism-specific repeat markers, interspecies variations and polymorphism.
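The abstract above mentions regular expressions as one way of describing repeats. As a rough illustration only, the following sketch finds perfect tandem repeats in a protein sequence with Python's `re` module. The function name, its parameters and the example sequence are hypothetical and are not taken from ProtRepeatsDB; mismatch repeats are not handled.

```python
import re


def find_perfect_repeats(sequence, min_unit=1, max_unit=3, min_copies=3):
    """Return (start, unit, copies) for perfect tandem repeats.

    Hypothetical helper for illustration; not the ProtRepeatsDB method.
    A repeat is a unit of min_unit..max_unit residues that occurs at
    least min_copies times back to back.
    """
    # Group 1 captures the shortest candidate unit (lazy quantifier);
    # the backreference \1 then requires further identical copies.
    pattern = re.compile(
        r"([A-Z]{%d,%d}?)\1{%d,}" % (min_unit, max_unit, min_copies - 1)
    )
    hits = []
    for match in pattern.finditer(sequence):
        unit = match.group(1)
        copies = len(match.group(0)) // len(unit)
        hits.append((match.start(), unit, copies))
    return hits


# Made-up example sequence: a run of five Q and three copies of "AS".
print(find_perfect_repeats("MKQQQQQPASASASL"))
```

Because the unit is matched lazily, a run such as QQQQQ is reported as five copies of Q, not as copies of a longer unit. Positions are 0-based.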

    Association analysis of nine candidate gene polymorphisms in Indian patients with type 2 diabetic retinopathy

    Background: Diabetic retinopathy (DR) is classically defined as a microvasculopathy that primarily affects the small blood vessels of the inner retina as a complication of diabetes mellitus (DM). It is a multifactorial disease with a strong genetic component. The aim of this study is to investigate the association of a set of nine candidate genes with the development of diabetic retinopathy in a South Indian cohort who have type 2 diabetes mellitus (T2DM). Methods: Seven candidate genes (RAGE, PEDF, AKR1B1, EPO, HTRA1, ICAM and HFE) were chosen based on reported association with DR in the literature. Two more, CFH and ARMS2, were chosen based on their roles in biological pathways previously implicated in DR. Fourteen single nucleotide polymorphisms (SNPs) and one dinucleotide repeat polymorphism, previously reported to show association with DR or other related diseases, were genotyped in 345 DR and 356 diabetic patients without retinopathy (DNR). The genes which showed positive association in this screening set were tested further in additional sets of 100 DR and 90 DNR patients from the Aravind Eye Hospital. Those which showed association in the secondary screen were subjected to a combined analysis with the 100 DR and 100 DNR subjects previously recruited and genotyped through the Sankara Nethralaya Hospital, India. Genotypes were evaluated using a combination of direct sequencing, TaqMan SNP genotyping, RFLP analysis, and SNaPshot PCR assays. Chi-square and Fisher exact tests were used to analyze the genotype and allele frequencies. Results: Among the nine loci (15 polymorphisms) screened, SNP rs2070600 (G82S) in the RAGE gene showed significant association with DR (allelic P = 0.016, dominant model P = 0.012) compared to DNR. SNP rs2070600 further showed significant association with DR in the confirmation cohort (P = 0.035, dominant model P = 0.032). Combining the two cohorts gave an allelic P < 0.003 and dominant P = 0.0013. Combined analysis with the Sankara Nethralaya cohort gave an allelic P = 0.0003 and dominant P = 0.00011 with an OR = 0.49 (0.34 - 0.70) for the minor allele. In HTRA1, rs11200638 (G>A) showed marginal significance with DR (P = 0.055), while rs10490924 in LOC387715 gave a P = 0.07. No statistical significance was observed for SNPs in the other 7 genes studied. Conclusions: This study confirms significant association of one polymorphism only (rs2070600 in RAGE) with DR in an Indian population which had T2DM.

    Integrative analysis of the Trypanosoma brucei gene expression cascade predicts differential regulation of mRNA processing and unusual control of ribosomal protein expression

    Background: Trypanosoma brucei is a unicellular parasite which multiplies in mammals (bloodstream form) and tsetse flies (procyclic form). Trypanosome RNA polymerase II transcription is polycistronic, individual mRNAs being excised by trans splicing and polyadenylation. We previously made detailed measurements of mRNA half-lives in bloodstream and procyclic forms, and developed a mathematical model of gene expression for bloodstream forms. At the whole transcriptome level, many bloodstream-form mRNAs were less abundant than was predicted by the model. Results: We refined the published mathematical model and extended it to the procyclic form. We used the model, together with known mRNA half-lives, to predict the abundances of individual mRNAs, assuming rapid, unregulated mRNA processing; then we compared the results with measured mRNA abundances. Remarkably, the abundances of most mRNAs in procyclic forms are predicted quite well by the model, being largely explained by variations in mRNA decay rates and length. In bloodstream forms substantially more mRNAs are less abundant than predicted. We list mRNAs that are likely to show particularly slow or inefficient processing, either in both forms or with developmental regulation. We also measured ribosome occupancies of all mRNAs in trypanosomes grown in the same conditions as were used to measure mRNA turnover. In procyclic forms there was a weak positive correlation between ribosome density and mRNA half-life, suggesting cross-talk between translation and mRNA decay; ribosome density was related to the proportion of the mRNA on polysomes, indicating control of translation initiation. Ribosomal protein mRNAs in procyclics appeared to be exceptionally rapidly processed but poorly translated. Conclusions: Levels of mRNAs in procyclic form trypanosomes are determined mainly by length and mRNA decay, with some control of precursor processing. In bloodstream forms variations in nuclear events play a larger role in transcriptome regulation, suggesting acquisition of new control mechanisms during adaptation to mammalian parasitism.

    Assessment and improvement of the Plasmodium yoelii yoelii genome annotation through comparative analysis

    Motivation: The sequencing of the Plasmodium yoelii genome, a model rodent malaria parasite, has greatly facilitated research for the development of new drug and vaccine candidates against malaria. Unfortunately, only preliminary gene models were annotated on the partially sequenced genome, mostly by in silico gene prediction, and there has been no major improvement of the annotation since 2002.

    GeneDB--an annotation database for pathogens.

    GeneDB (http://www.genedb.org) is a genome database for prokaryotic and eukaryotic pathogens and closely related organisms. The resource provides a portal to genome sequence and annotation data, which is primarily generated by the Pathogen Genomics group at the Wellcome Trust Sanger Institute. It combines data from completed and ongoing genome projects with curated annotation, which is readily accessible from a web-based resource. The development of the database in recent years has focused on providing database-driven annotation tools and pipelines, as well as catering for increasingly frequent assembly updates. The website has been significantly redesigned to take advantage of current web technologies and to improve usability. The current release stores 41 data sets, of which 17 are manually curated and maintained by biologists, who review and incorporate data from the scientific literature, as well as other sources. GeneDB is primarily a production and annotation database for the genomes of predominantly pathogenic organisms.

    CyclinPred: A SVM-Based Method for Predicting Cyclin Protein Sequences

    Functional annotation of protein sequences with low similarity to well characterized protein sequences is a major challenge of computational biology in the post-genomic era. The cyclin protein family is one such important family of proteins, consisting of sequences with low sequence similarity, which makes discovery of novel cyclins and establishing orthologous relationships amongst the cyclins a difficult task. The currently identified cyclin motifs and cyclin associated domains do not represent all of the identified and characterized cyclin sequences. We describe a Support Vector Machine (SVM) based classifier, CyclinPred, which can predict cyclin sequences with high efficiency. The SVM classifier was trained with features of selected cyclin and non-cyclin protein sequences. The training features of the protein sequences include amino acid composition, dipeptide composition, secondary structure composition and PSI-BLAST generated Position Specific Scoring Matrix (PSSM) profiles. Results obtained from Leave-One-Out cross validation (jackknife test), self consistency and holdout tests show that the SVM classifier trained with features of the PSSM profile was more accurate than the classifiers based on either of the other features alone or hybrids of these features. A cyclin prediction server, CyclinPred, has been set up based on the SVM model trained with PSSM profiles. CyclinPred prediction results show that the method may be used as a cyclin prediction tool, complementing conventional cyclin prediction methods.
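Two of the feature types named in the abstract above, amino acid composition and dipeptide composition, are simple to compute from a sequence. The sketch below shows one common way to define them as fractions. It is an assumption-laden illustration, not the CyclinPred implementation: the function names are hypothetical, and the paper may normalize differently (for example, as percentages).

```python
from collections import Counter
from itertools import product

# The 20 standard amino acids in a fixed order, so that feature
# vectors from different sequences are comparable.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def amino_acid_composition(sequence):
    """Fraction of each standard residue (20 values).

    Assumes a non-empty sequence. Non-standard letters count toward
    the length but have no feature of their own.
    """
    counts = Counter(sequence)
    return [counts[aa] / len(sequence) for aa in AMINO_ACIDS]


def dipeptide_composition(sequence):
    """Fraction of each ordered residue pair (400 values).

    Assumes the sequence has at least two residues. Pairs are taken
    from overlapping windows of width two.
    """
    pairs = Counter(sequence[i:i + 2] for i in range(len(sequence) - 1))
    total = len(sequence) - 1
    return [pairs[a + b] / total for a, b in product(AMINO_ACIDS, repeat=2)]


# Made-up toy sequence, used only to show the vector shapes.
features = amino_acid_composition("ACAC") + dipeptide_composition("ACAC")
print(len(features))
```

The two vectors can be concatenated into a single fixed-length input for a classifier, which is one way a "hybrid" feature set could be formed.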

    TriTrypDB: a functional genomic resource for the Trypanosomatidae

    TriTrypDB (http://tritrypdb.org) is an integrated database providing access to genome-scale datasets for kinetoplastid parasites, and supporting a variety of complex queries driven by research and development needs. TriTrypDB is a collaborative project, utilizing the GUS/WDK computational infrastructure developed by the Eukaryotic Pathogen Bioinformatics Resource Center (EuPathDB.org) to integrate genome annotation and analyses from GeneDB and elsewhere with a wide variety of functional genomics datasets made available by members of the global research community, often pre-publication. Currently, TriTrypDB integrates datasets from Leishmania braziliensis, L. infantum, L. major, L. tarentolae, Trypanosoma brucei and T. cruzi. Users may examine individual genes or chromosomal spans in their genomic context, including syntenic alignments with other kinetoplastid organisms. Data within TriTrypDB can be interrogated utilizing a sophisticated search strategy system that enables a user to construct complex queries combining multiple data types. All search strategies are stored, allowing future access and integrated searches. ‘User Comments’ may be added to any gene page, enhancing available annotation; such comments become immediately searchable via the text search, and are forwarded to curators for incorporation into the reference annotation when appropriate.

    Characterization and localization of Plasmodium falciparum homolog of prokaryotic ClpQ/HslV protease

    The β subunits (β1, β2, and β5) of 20S proteasome and HslV/ClpQ are ATP-dependent threonine proteases present in eukaryotes and prokaryotes, respectively, that control levels of key regulatory proteins in the cell. The orthologue of prokaryotic HslV protease in Plasmodium falciparum (PfHslV) is a novel drug target candidate that has no homolog in the human host. In the present study, PfHslV was expressed, localized and biochemically characterized. The recombinant PfHslV harbored threonine protease specific activity as well as chymotrypsin-like and peptidyl glutamyl peptide hydrolase activities. All three activities could be inhibited by their respective specific inhibitors. The protein was localized in the cytosol of the parasite as a soluble protein by Western immunoblotting of parasite fractions and by immunofluorescence microscopy. Activity of the protease in the parasite was ascertained by following the degradation of GFP in a transgenic parasite line expressing a fusion protein of GFP and the Arc repressor, a known target of HslV protease in prokaryotes. A model structure of PfHslV was constructed based on the crystal structure of Escherichia coli HslV to assess the structural homology. Availability of the structure model of PfHslV may facilitate identification or design of novel and specific drugs against PfHslV. The in vitro protease assays with recombinant PfHslV and the transgenic parasite line generated in the present study may be exploited in the screening of novel inhibitors to evaluate their anti-malarial activity.