108 research outputs found
KinasePhos: a web tool for identifying protein kinase-specific phosphorylation sites
KinasePhos is a novel web server for computationally identifying catalytic kinase-specific phosphorylation sites. The known phosphorylation sites from public domain data sources are categorized by their annotated protein kinases. Based on the profile hidden Markov model, computational models are learned from the kinase-specific groups of the phosphorylation sites. After evaluating the learned models, the model with highest accuracy was selected from each kinase-specific group, for use in a web-based prediction tool for identifying protein phosphorylation sites. Therefore, this work developed a kinase-specific phosphorylation site prediction tool with both high sensitivity and specificity. The prediction tool is freely available at
i-Genome: A database to summarize oligonucleotide data in genomes
BACKGROUND: Information on the occurrence of sequence features in genomes is crucial to comparative genomics, evolutionary analysis, the analysis of regulatory sequences and the quantitative evaluation of sequences. Computing the frequencies and the occurrences of a pattern in complete genomes is time-consuming. RESULTS: The proposed database provides information about sequence features generated by exhaustively computing the sequences of the complete genome. The repetitive elements in the eukaryotic genomes, such as LINEs, SINEs, Alu and LTR, are obtained from Repbase. The database supports various complete genomes including human, yeast, worm, and 128 microbial genomes. CONCLUSIONS: This investigation presents and implements an efficient computational approach to accumulate the occurrences of oligonucleotides or patterns in complete genomes. A database is established to maintain the information on the sequence features, including the distributions of oligonucleotides, the gene distribution, the distribution of repetitive elements in genomes and the occurrences of the oligonucleotides. The database provides a more effective and efficient way to access the repetitive features in genomes
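As a rough illustration of what "computing the frequencies and the occurrences of a pattern" involves, the sketch below counts overlapping oligonucleotides and lists match positions in a toy sequence. It is not the i-Genome implementation; the function names `count_oligos` and `occurrences` and the toy sequence are assumptions made for this example, and only one strand is scanned.

```python
from collections import Counter


def count_oligos(sequence: str, k: int) -> Counter:
    """Count every overlapping k-mer made only of A/C/G/T (one strand)."""
    seq = sequence.upper()
    counts = Counter()
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        # Skip windows containing ambiguous bases such as N.
        if set(kmer) <= set("ACGT"):
            counts[kmer] += 1
    return counts


def occurrences(sequence: str, pattern: str) -> list:
    """Return 0-based start positions of all overlapping matches."""
    seq, pat = sequence.upper(), pattern.upper()
    positions = []
    start = seq.find(pat)
    while start != -1:
        positions.append(start)
        start = seq.find(pat, start + 1)
    return positions


# Hypothetical toy sequence, not real genomic data.
toy_genome = "ATGCGATATGCNATGC"
counts = count_oligos(toy_genome, 3)
hits = occurrences(toy_genome, "ATGC")
print(counts.most_common(2), hits)
```

A scan like this is linear in genome length for each query, which is why precomputing and storing the counts in a database, as the abstract describes, pays off for repeated lookups.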
Rapid Detection of Heterogeneous Vancomycin-Intermediate Staphylococcus aureus Based on Matrix-Assisted Laser Desorption Ionization Time-of-Flight: Using a Machine Learning Approach and Unbiased Validation
Heterogeneous vancomycin-intermediate Staphylococcus aureus (hVISA) is an emerging superbug with implicit drug resistance to vancomycin. Detecting hVISA can guide the correct administration of antibiotics. However, hVISA cannot be detected in most clinical microbiology laboratories because the required diagnostic tools are expensive, time consuming, or labor intensive. By contrast, matrix-assisted laser desorption ionization time-of-flight (MALDI-TOF) is a cost-effective and rapid tool that has potential for providing antibiotic resistance information. To analyze complex MALDI-TOF mass spectra, machine learning (ML) algorithms can be used to generate robust hVISA detection models. In this study, MALDI-TOF mass spectra were obtained from 35 hVISA/vancomycin-intermediate S. aureus (VISA) and 90 vancomycin-susceptible S. aureus isolates. The vancomycin susceptibility of the isolates was determined using an Etest and modified population analysis profile–area under the curve. ML algorithms, namely a decision tree, k-nearest neighbors, random forest, and a support vector machine (SVM), were trained and validated using nested cross-validation to provide unbiased validation results. The area under the curve of the models ranged from 0.67 to 0.79, and the SVM-derived model outperformed those of the other algorithms. The peaks at m/z 1132, 2895, 3176, and 6591 were noted as informative peaks for detecting hVISA/VISA. We demonstrated that hVISA/VISA could be detected by analyzing MALDI-TOF mass spectra using ML. Moreover, the results are particularly robust due to a strict validation method. The ML models in this study can provide rapid and accurate reports regarding hVISA/VISA and thus guide the correct administration of antibiotics in the treatment of S. aureus infection
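The nested cross-validation mentioned above can be sketched in a few lines: an inner loop tunes a hyperparameter using only the development folds, and an outer loop scores the tuned model on data it never saw. The example below is a minimal sketch under stated assumptions, not the study's pipeline. It uses a plain k-nearest-neighbors classifier, accuracy instead of the area under the curve, and synthetic two-feature data. All function names and the data are hypothetical.

```python
import random


def k_folds(indices, n_folds):
    """Split a list of indices into n_folds roughly equal folds."""
    return [indices[i::n_folds] for i in range(n_folds)]


def knn_predict(train, query, k):
    """Majority vote (labels 0/1) among the k nearest training points."""
    nearest = sorted(
        train,
        key=lambda item: sum((a - b) ** 2 for a, b in zip(item[0], query)),
    )[:k]
    votes = sum(label for _, label in nearest)
    return 1 if 2 * votes > len(nearest) else 0


def accuracy(train, test, k):
    hits = sum(knn_predict(train, x, k) == y for x, y in test)
    return hits / len(test)


def nested_cv(data, k_grid=(1, 3, 5), outer=5, inner=3, seed=0):
    """Return one held-out accuracy per outer fold."""
    idx = list(range(len(data)))
    random.Random(seed).shuffle(idx)
    outer_scores = []
    for held_out in k_folds(idx, outer):
        held = set(held_out)
        dev_idx = [i for i in idx if i not in held]

        def inner_score(k):
            # Inner loop: evaluate k on validation folds cut from dev only.
            scores = []
            for val in k_folds(dev_idx, inner):
                val_set = set(val)
                tr = [data[i] for i in dev_idx if i not in val_set]
                va = [data[i] for i in val]
                scores.append(accuracy(tr, va, k))
            return sum(scores) / len(scores)

        best_k = max(k_grid, key=inner_score)
        # Outer loop: the held-out fold played no part in choosing best_k.
        dev = [data[i] for i in dev_idx]
        test = [data[i] for i in held_out]
        outer_scores.append(accuracy(dev, test, best_k))
    return outer_scores


# Synthetic stand-in for spectra: two Gaussian clusters, 30 points each.
rng = random.Random(42)
data = [((rng.gauss(0, 1), rng.gauss(0, 1)), 0) for _ in range(30)]
data += [((rng.gauss(3, 1), rng.gauss(3, 1)), 1) for _ in range(30)]
scores = nested_cv(data)
print(scores)
```

Because hyperparameter selection never touches the outer test fold, the outer scores are not inflated by tuning, which is the sense in which the abstract calls the validation unbiased.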
ProSplicer: a database of putative alternative splicing information derived from protein, mRNA and expressed sequence tag sequence data
ProSplicer is a database of putative alternative splicing information derived from the alignment of proteins, mRNA sequences and expressed sequence tags (ESTs) against human genomic DNA sequences. Proteins, mRNA and ESTs provide valuable evidence that can reveal splice variants of genes. The alternative splicing information in the database can help users investigate the alternative splicing and tissue-specific expression of genes
SpliceInfo: an information repository for mRNA alternative splicing in human genome
We have developed an information repository named SpliceInfo to collect the occurrences of the four major alternative-splicing (AS) modes in the human genome; these include exon skipping, 5′-alternative splicing, 3′-alternative splicing and intron retention. The dataset is derived by comparing the nucleotide and protein sequences available for a given gene for evidence of AS. Additional features such as the tissue specificity of the mRNA, the protein domain contained by exons, the GC-ratio of exons, the repeats contained within the exons, and the Gene Ontology are annotated computationally for each exonic region that is alternatively spliced. Motivated by a previous investigation of AS-related motifs such as exonic splicing enhancers and exonic splicing silencers, this resource also provides a means of identifying motif candidates, which should help to identify potential regulatory mechanisms within a particular exonic sequence set and its two flanking intronic sequence sets. Motif discovery tools are used to identify motif candidates related to alternative splicing regulation; together with a secondary structure prediction tool, they help identify the structural properties of such regulatory motifs. The integrated resource is now available on http://SpliceInfo.mbc.NCTU.edu.tw/
Mean Daily Dosage of Aspirin and the Risk of Incident Alzheimer’s Dementia in Patients with Type 2 Diabetes Mellitus: A Nationwide Retrospective Cohort Study in Taiwan
Background. Patients with type 2 diabetes mellitus are known to have a higher risk of developing dementia, while aspirin use has been shown to prevent incident dementia. This study was conducted to evaluate the potential benefits of aspirin use on dementia in patients with type 2 diabetes mellitus and to identify the dosage of aspirin that provides the most benefit. Method. A Taiwan nationwide, population-based retrospective 8-year study was employed to analyze the association between the use of aspirin and the incidence of dementia, including Alzheimer’s disease and non-Alzheimer’s dementia, using a multivariate Cox proportional hazards regression model adjusted for several potential confounders. Results. Regular aspirin use at a mean daily dosage within 40 mg was associated with a decreased risk of developing incident Alzheimer’s dementia in patients with type 2 diabetes mellitus (adjusted HR of 0.51 with 95% CI of 0.27–0.97, p value 0.041). Conclusion. Aspirin use at a mean daily dosage within 40 mg might decrease the risk of developing Alzheimer’s disease in patients with type 2 diabetes mellitus
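The reported hazard ratio, confidence interval and p value can be checked against each other with a short calculation. The sketch below assumes the interval is a Wald interval on the log hazard ratio scale, which is the usual output of a Cox model but is not stated in the abstract. The function name `wald_from_ci` is hypothetical.

```python
import math


def wald_from_ci(hr, lower, upper, z_crit=1.959964):
    """Recover the log-scale standard error, z statistic and two-sided p value.

    Assumes a symmetric 95% Wald interval on the log(HR) scale.
    """
    se = (math.log(upper) - math.log(lower)) / (2 * z_crit)
    z = math.log(hr) / se
    # Two-sided normal tail probability: 2 * (1 - Phi(|z|)).
    p = math.erfc(abs(z) / math.sqrt(2))
    return se, z, p


# Values quoted in the abstract: HR 0.51, 95% CI 0.27-0.97.
se, z, p = wald_from_ci(0.51, 0.27, 0.97)
print(f"se={se:.3f}, z={z:.2f}, p={p:.3f}")
```

Under that assumption the recovered p value comes out at roughly 0.04, in line with the reported 0.041; a small difference is expected because the published figures are rounded to two decimals.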
RNAMST: efficient and flexible approach for identifying RNA structural homologs
RNA molecules fold into characteristic secondary structures for their diverse functional activities such as post-translational regulation of gene expression. Searching for homologs of a pre-defined RNA structural motif, which may be a known functional element or a putative RNA structural motif, can provide useful information for deciphering RNA regulatory mechanisms. Since searching for RNA structural homologs among numerous RNA sequences is extremely time-consuming, this work develops a data preprocessing strategy to enhance the search efficiency and presents RNAMST, an efficient and flexible web server for rapidly identifying homologs of a pre-defined RNA structural motif among numerous RNA sequences. An intuitive user interface is provided on the web server to facilitate the predictive analysis. Compared with previously developed tools, RNAMST performs markedly more efficiently and provides more effective and flexible functions. RNAMST is now available on the web at
RINGdb: An integrated database for G protein-coupled receptors and regulators of G protein signaling
BACKGROUND: Many marketed therapeutic agents have been developed to modulate the function of G protein-coupled receptors (GPCRs). The regulators of G-protein signaling (RGS proteins) are also being examined as potential drug targets. To facilitate clinical and pharmacological research, we have developed a novel integrated biological database called RINGdb to provide comprehensive and organized RGS protein and GPCR information. RESULTS: RINGdb contains information on mutations, tissue distributions, protein-protein interactions, diseases/disorders and other features, which has been automatically collected from the Internet and manually extracted from the literature. In addition, RINGdb offers various user-friendly query functions to answer different questions about RGS proteins and GPCRs such as their possible contribution to disease processes, the putative direct or indirect relationship between RGS proteins and GPCRs. RINGdb also integrates organized database cross-references to allow users direct access to detailed information. The database is now available at . CONCLUSION: RINGdb is the only integrated database on the Internet to provide comprehensive RGS protein and GPCR information. This knowledgebase will be useful for clinical research, drug discovery and GPCR signaling pathway research
RNALogo: a new approach to display structural RNA alignment
Regulatory RNAs play essential roles in many biological processes, ranging from gene regulation to protein synthesis. This work presents a web-based tool, RNALogo, to create a new graphical representation of the patterns in a multiple RNA sequence alignment with a consensus structure. The RNALogo graph can indicate significant features within an RNA sequence alignment and its consensus RNA secondary structure. RNALogo extends sequence logos, and specifically incorporates RNA secondary structures and mutual information of base-paired regions into the graphical representation. Each RNALogo graph is composed of stacks of letters, with one stack for each position in the consensus RNA secondary structure. RNALogo provides a convenient and highly configurable logo generator. An RNALogo graph is generated for each RNA family in Rfam, and these generated logos are collected into an RNALogo gallery. Users can search or browse RNALogo graphs in this gallery to gain additional perspectives on known RNA families. RNALogo is now available at: http://rnalogo.mbc.nctu.edu.tw/
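The two quantities a structure-aware logo combines can be illustrated on a toy alignment: the information content of a single column, which sets the stack height in a standard sequence logo, and the mutual information between two columns that are base-paired in the consensus structure. The sketch below uses the textbook definitions without small-sample correction; the abstract does not give RNALogo's exact formulas, so this should not be read as the tool's implementation. The function names and the alignment are hypothetical.

```python
import math
from collections import Counter


def column_information(column, alphabet_size=4):
    """Information content in bits: log2(alphabet size) minus Shannon entropy."""
    n = len(column)
    entropy = -sum(
        (c / n) * math.log2(c / n) for c in Counter(column).values()
    )
    return math.log2(alphabet_size) - entropy


def mutual_information(col_i, col_j):
    """Mutual information in bits between two alignment columns."""
    n = len(col_i)
    p_i, p_j = Counter(col_i), Counter(col_j)
    p_ij = Counter(zip(col_i, col_j))
    return sum(
        (c / n) * math.log2((c / n) / ((p_i[a] / n) * (p_j[b] / n)))
        for (a, b), c in p_ij.items()
    )


# Toy alignment: columns 0 and 2 covary as Watson-Crick pairs,
# column 1 is fully conserved.
alignment = ["GAC", "CAG", "AAU", "UAA"]
columns = ["".join(row[i] for row in alignment) for i in range(3)]

conserved_bits = column_information(columns[1])
paired_bits = column_information(columns[0])
pair_mi = mutual_information(columns[0], columns[2])
print(conserved_bits, paired_bits, pair_mi)
```

In this toy case the covarying columns carry no information individually but share 2 bits of mutual information, which is the kind of signal a plain sequence logo misses and a logo that includes base-pair mutual information can display.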