
AtRTPrimer: database for Arabidopsis genome-wide homogeneous and specific RT-PCR primer-pairs

Author: Han Sangjo
Kim Dongsup
Publication venue: BioMed Central
Publication date: 30/03/2006

BACKGROUND: Primer design is a critical step in all types of RT-PCR methods to ensure specificity and efficiency of a target amplicon. However, most traditional primer design programs suggest primers on a single template of limited genetic complexity. To provide researchers with a sufficient number of pre-designed specific RT-PCR primer pairs for whole genes in Arabidopsis, we aimed to construct a genome-wide primer-pair database. DESCRIPTION: We considered the homogeneous physical and chemical properties of each primer (homogeneity) of a gene, non-specific binding against all other known genes (specificity), and other possible amplicons from its corresponding genomic DNA or similar cDNAs (additional information). Then, we evaluated the reliability of our database with selected primer pairs from 15 genes using conventional and real-time RT-PCR. CONCLUSION: Approximately 97% of 28,952 genes investigated were finally registered in AtRTPrimer. Unlike other freely available primer databases for Arabidopsis thaliana, AtRTPrimer provides a large number of reliable primer pairs for each gene so that researchers can perform various types of RT-PCR experiments for their specific needs. Furthermore, by experimentally evaluating our database, we made sure that it provides good starting primer pairs for Arabidopsis researchers to perform various types of RT-PCR experiments.

Drug-drug relationship based on target information: application to drug target identification

Author: Kim Dongsup
Park Keunwan
Publication venue: BioMed Central
Publication date: 01/01/2011

Interaction network among functional drug groups

Author: Dongsup Kim
Keunwan Park
Minho Lee
Publication venue: Springer Science and Business Media LLC
Publication date: 01/01/2013

PostMod: sequence based prediction of kinase-specific phosphorylation sites with indirect relationship

Author: Jung Inkyung
Kim Dongsup
Matsuyama Akihisa
Yoshida Minoru
Publication venue: BioMed Central
Publication date: 01/01/2010

Abstract Background Post-translational modifications (PTMs) have a key role in regulating cell functions. Consequently, identification of PTM sites has a significant impact on understanding protein function and revealing cellular signal transduction. In particular, phosphorylation is a ubiquitous process, with a large proportion of proteins undergoing this modification. Experimental methods to identify phosphorylation sites are labor-intensive and costly. With exponentially growing protein sequence data, the development of computational approaches to predict phosphorylation sites is highly desirable. Results Here, we present a simple and effective method to recognize phosphorylation sites by combining sequence patterns and evolutionary information and by applying a novel noise-reducing algorithm. We suggest that considering the long-range region surrounding a phosphorylation site is important for recognizing phosphorylation peptides. In a comparison with AutoMotif across 36 different kinase families, the new method outperforms AutoMotif. The mean accuracy, precision, and recall of our method are 0.93, 0.67, and 0.40, respectively, whereas those of AutoMotif with a polynomial kernel are 0.91, 0.47, and 0.17, respectively. Our method also shows better or comparable performance in four main kinase groups (CDK, CK2, PKA, and PKC) compared to six existing predictors. Conclusion Our method is remarkable in that it is a powerful and intuitive approach that does not need a sophisticated training algorithm. Moreover, our method is generally applicable to other types of PTMs.
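
The accuracy, precision, and recall figures quoted in this abstract follow the standard confusion-matrix definitions. The sketch below is only an illustration of those definitions; the function name and the counts in the usage note are hypothetical and are not taken from the paper.

```python
def classification_metrics(tp, fp, tn, fn):
    """Return (accuracy, precision, recall) from confusion-matrix counts.

    tp/fp/tn/fn are true positives, false positives, true negatives and
    false negatives. The counts used with this helper are hypothetical.
    """
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total
    # Guard against empty denominators when no site is predicted or present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return accuracy, precision, recall
```

For example, with hypothetical counts `classification_metrics(tp=40, fp=20, tn=880, fn=60)`, accuracy is high while recall stays at 0.40, because true phosphorylation sites are rare relative to non-sites. This is why accuracy alone says little about such predictors.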

Predicting and improving the protein sequence alignment quality by support vector regression

Author: Jeong Chan-seok
Kim Dongsup
Lee Minho
Publication venue: BioMed Central
Publication date: 01/01/2007

Abstract Background For successful protein structure prediction by comparative modeling, in addition to identifying a good template protein with known structure, obtaining an accurate sequence alignment between a query protein and a template protein is critical. It is known that the alignment accuracy can vary significantly depending on the choice of alignment parameters such as gap opening penalty and gap extension penalty. Because the accuracy of a sequence alignment is typically measured by comparing it with its corresponding structure alignment, there is no good way of evaluating alignment accuracy without knowing the structure of the query protein, which is obviously not available at the time of structure prediction. Moreover, there is no universal alignment parameter option that would always yield the optimal alignment. Results In this work, we develop a method to predict the quality of the alignment between a query and a template. We train support vector regression (SVR) models to predict MaxSub scores as a measure of alignment quality. The alignment between a query protein and a template of length n is transformed into an (n + 1)-dimensional feature vector, which is then used as input to the trained SVR model to predict the alignment quality. The performance of our method is evaluated by various measures, including the Pearson correlation coefficient between the observed and predicted MaxSub scores. The results show a high correlation coefficient of 0.945. For each pair of query and template, 48 alignments are generated by changing alignment options. The trained SVR models are then applied to predict the MaxSub scores of these alignments and to select the best alignment option specifically for that query-template pair. This adaptive selection procedure results in a 7.4% improvement of MaxSub scores compared to using the single best parameter option for all query-template pairs.
Conclusion The present work demonstrates that the alignment quality can be predicted with reasonable accuracy. Our method is useful not only for selecting the optimal alignment parameters for a chosen template based on predicted alignment quality, but also for filtering out problematic templates that are not suitable for structure prediction due to poor alignment accuracy. The method is implemented as part of FORECAST, a server for fold recognition, and is freely available on the web at http://pbil.kaist.ac.kr/forecast
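
The two evaluation ideas in this abstract, the Pearson correlation between observed and predicted MaxSub scores and the per-pair selection of the alignment option with the highest predicted score, can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names are hypothetical, the predicted scores are assumed to come from an already trained regression model, and the option keys in the usage note are made up.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    norm_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    norm_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (norm_x * norm_y)


def select_best_option(predicted_scores):
    """Pick the alignment option with the highest predicted quality score.

    predicted_scores maps an alignment option (any hashable key, e.g. a
    hypothetical (gap_open, gap_extend) tuple) to its predicted score.
    """
    return max(predicted_scores, key=predicted_scores.get)
```

With hypothetical predictions such as `{(10, 1): 0.40, (12, 2): 0.70, (8, 1): 0.55}`, `select_best_option` returns `(12, 2)`. In the setting described above the dictionary would hold one predicted score for each of the 48 alignment options of a query-template pair.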

Application of nonnegative matrix factorization to improve profile-profile alignment features for fold recognition and remote homolog detection

Author: Jung Inkyung
Kim Dongsup
Lee Jaehyung
Lee Soo-Young
Publication venue: BioMed Central
Publication date: 01/01/2008

Abstract Background Nonnegative matrix factorization (NMF) is a feature extraction method that has the property of intuitive part-based representation of the original features. This unique ability makes NMF a potentially promising method for biological sequence analysis. Here, we apply NMF to fold recognition and remote homolog detection problems. Recent studies have shown that combining support vector machines (SVM) with profile-profile alignments improves the performance of fold recognition and remote homolog detection remarkably. However, it is not clear which parts of sequences are essential for the performance improvement. Results The performance of fold recognition and remote homolog detection using NMF features is compared to that of the unmodified profile-profile alignment (PPA) features by estimating Receiver Operating Characteristic (ROC) scores. The overall performance is noticeably improved. For fold recognition at the fold level, SVM with NMF features recognizes 30% of homologous proteins at > 0.99 ROC scores, whereas the original PPA features, HHsearch, and PSI-BLAST recognize almost none. For detecting remote homologs that are related at the superfamily level, NMF features also achieve higher performance than the original PPA features. At > 0.90 ROC50 scores, 25% of proteins correctly detect remotely related proteins with NMF features, whereas only 1% of proteins do so with the original PPA features. In addition, we investigate the effect of the number of positive training examples and the number of basis vectors on performance improvement. We also analyze the ability of NMF to extract essential features by comparing NMF basis vectors with functionally important sites and structurally conserved regions of proteins. The results show that NMF basis vectors have significant overlap with functional sites from PROSITE and with structurally conserved regions from the multiple structural alignments generated by MUSTANG.
The correlation between NMF basis vectors and biologically essential parts of proteins supports our conjecture that NMF basis vectors can explicitly represent important sites of proteins. Conclusion The present work demonstrates that applying NMF to profile-profile alignments can reveal essential features of proteins and that these features significantly improve the performance of fold recognition and remote homolog detection.
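
For readers unfamiliar with NMF, the factorization of a nonnegative matrix V into nonnegative factors W (basis vectors) and H (coefficients) can be sketched with the widely used multiplicative update rules of Lee and Seung. The abstract does not state which NMF algorithm the authors used, so the update rule, the function names, and the toy matrix in the test are all assumptions made for illustration.

```python
import random


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def transpose(m):
    """Transpose a matrix given as a list of rows."""
    return [list(row) for row in zip(*m)]


def frobenius_error(v, w, h):
    """Squared Frobenius norm of V - W H."""
    wh = matmul(w, h)
    return sum((v[i][j] - wh[i][j]) ** 2
               for i in range(len(v)) for j in range(len(v[0])))


def nmf(v, rank, iterations=200, seed=0, eps=1e-9):
    """Factor nonnegative V (n x m) into W (n x rank) and H (rank x m).

    Uses multiplicative updates, which keep every entry nonnegative.
    """
    rng = random.Random(seed)
    n, m = len(v), len(v[0])
    w = [[rng.random() + eps for _ in range(rank)] for _ in range(n)]
    h = [[rng.random() + eps for _ in range(m)] for _ in range(rank)]
    for _ in range(iterations):
        # H <- H * (W^T V) / (W^T W H), element-wise
        wt = transpose(w)
        num = matmul(wt, v)
        den = matmul(matmul(wt, w), h)
        h = [[h[a][j] * num[a][j] / (den[a][j] + eps) for j in range(m)]
             for a in range(rank)]
        # W <- W * (V H^T) / (W H H^T), element-wise
        ht = transpose(h)
        num = matmul(v, ht)
        den = matmul(w, matmul(h, ht))
        w = [[w[i][a] * num[i][a] / (den[i][a] + eps) for a in range(rank)]
             for i in range(n)]
    return w, h
```

Because the updates only multiply by nonnegative ratios, the rows of H and columns of W stay nonnegative, which is what gives NMF its additive, part-based interpretation. In the setting of the abstract, V would hold profile-profile alignment features rather than a toy matrix.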

SCUD: Saccharomyces Cerevisiae Ubiquitination Database

Author: Jung Jin Woo
Kim Dongsup
Kim Kwang Pyo
Lee Minho
Lee Won-Chul
Publication venue: BioMed Central
Publication date: 01/01/2008

Dynamic Path Integral Methods: A Maximum Entropy Approach Based on the Combined use of Real and Imaginary Time Quantum Monte Carlo Data

Author: David L. Freeman
Dongsup Kim
J. D. Doll
Publication venue: DigitalCommons@URI
Publication date: 08/03/1998

A new numerical procedure for the study of finite temperature quantum dynamics is developed. The method is based on the observation that the real and imaginary time dynamical data contain complementary types of information. Maximum entropy methods, based on a combination of real and imaginary time input data, are used to calculate the spectral densities associated with real time correlation functions. Model studies demonstrate that the inclusion of even modest amounts of short-time real time data significantly improves the quality of the resulting spectral densities over that achievable using either real time data or imaginary time data separately.

Elsevier - Publisher Connector

DigitalCommons@URI

Prevalence of type-specific oncogenic human papillomavirus infection assessed by HPV E6/E7 mRNA among women with high-grade cervical lesions

Author: Kim Geehyuk
Kim Sunghyun
Lee Dongsup
Lee Hyeyoung
Park Kwang Hwa
Park Sunyoung
Wang Hye-young
Publication venue: The Authors. Published by Elsevier Ltd.
Publication date: 01/08/2015

Summary Objectives Human papillomavirus (HPV) infection is a major cause of premalignant dysplasia and cervical cancer. There are no data on the prevalence of genotype-specific HPV infection assessed by HPV E6/E7 mRNA in women representative of the Korean population across a broad age range. Methods A total of 630 women aged 17–90 years were enrolled in this study. ThinPrep liquid-based cytology samples were evaluated using the CervicGen HPV RT-qDx assay, which detects 16 high-risk (HR) HPV genotypes (set 1: HPV 16, 31, 33, 35, 52, and 58; set 2: HPV 18, 39, 45, 51, 59, and 68; and set 3: HPV 53, 56, 66, and 69). Results The overall prevalence of HPV infection was 33.2% (n=209), and oncogenic high-risk HPV was detected in 75.9% (n=107) of 141 women with high-grade cervical lesions. HPV 16 was the most common HPV genotype among women with high-grade cervical lesions and histologically confirmed cervical intraepithelial neoplasia grade 2 and above (CIN2+) in the Republic of Korea (41.6%). Among women aged over 30 years, 182/329 (55%) had invasive cervical cancer and 135 (74%) of these were infected with oncogenic HR-HPV types (in particular 25% with HPV 16). Among patients diagnosed with CIN2+, the positivity rate of HR-HPV was the highest in women aged 40–49 years. Conclusions These results suggest that the determination of specific HPV genotypes is very important for evaluating the potential impact of preventive measures, including the use of prophylactic vaccines, on reducing the burden of cervical cancer.