
    SeqNLS: Nuclear Localization Signal Prediction Based on Frequent Pattern Mining and Linear Motif Scoring

    Nuclear localization signals (NLSs) are stretches of residues in proteins that mediate their import into the nucleus. NLSs are known to have diverse patterns, of which only a limited number are covered by currently known NLS motifs. Here we propose a sequential pattern mining algorithm, SeqNLS, to effectively identify potential NLS patterns without being constrained by the limits of current knowledge of NLSs. The extracted frequent sequential patterns are used to predict NLS candidates, which are then filtered by a linear motif-scoring scheme based on predicted sequence disorder and by the relatively local conservation (IRLC) based masking. Experimental results on the newly curated Yeast and Hybrid datasets show that SeqNLS is effective in detecting potential NLSs. A performance comparison of SeqNLS with and without the linear motif scoring shows that linear motif features are highly complementary to sequence features in discerning NLSs. For the two independent datasets, SeqNLS not only consistently finds over 50% of NLSs with prediction precision of at least 0.7, but also outperforms other state-of-the-art NLS prediction methods in terms of F1 score or prediction precision with similar or higher recall rates. The web server of the SeqNLS algorithm is available at http://mleg.cse.sc.edu/seqNLS
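
    For readers unfamiliar with the metrics quoted in this abstract, the sketch below shows how an F1 score follows from precision and recall. The function name and the example operating point (recall 0.5, precision 0.7, taken from the thresholds quoted above) are illustrative assumptions, not results or code from the SeqNLS authors.

```python
def f1_score(precision: float, recall: float) -> float:
    """Return the harmonic mean of precision and recall (0.0 if both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical operating point built from the thresholds quoted in the abstract
# (recall 0.5, precision 0.7); this is not a reported SeqNLS measurement.
example_f1 = f1_score(precision=0.7, recall=0.5)
print(round(example_f1, 3))
```

    Because F1 is a harmonic mean, it is pulled toward the lower of the two values, so a method cannot reach a high F1 by trading away recall for precision alone.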

    Computational Analysis and Prediction of Genome-Wide Protein Targeting Signals and Localization

    Computational prediction of protein subcellular localization can greatly help to elucidate protein function. Despite the existence of dozens of protein localization prediction algorithms, prediction accuracy and coverage are still low. Several ensemble algorithms have been proposed to improve prediction performance; these usually include as many as 10 or more individual localization algorithms. However, their performance is still limited by the running complexity and the redundancy among individual prediction algorithms. In the first part of the dissertation, we propose a novel method for the rational design of minimalist ensemble algorithms for practical genome-wide protein subcellular localization prediction. The algorithm combines a feature-selection-based filter with a logistic regression classifier. Using a novel concept of contribution scores, we analyzed issues of algorithm redundancy, consensus mistakes, and algorithm complementarity in designing ensemble algorithms. We applied the proposed minimalist logistic regression (LR) ensemble algorithm to two genome-wide datasets, Yeast and Human, and compared its performance with current ensemble algorithms. Experimental results showed that the minimalist ensemble algorithm can achieve high prediction accuracy with only 1/3 to 1/2 of the individual predictors used by current ensemble algorithms, which greatly reduces computational complexity and running time. Compared to the best individual predictor, our ensemble algorithm improved the prediction accuracy from an AUC score of 0.558 to 0.707 for the Yeast dataset and from 0.628 to 0.646 for the Human dataset. In the second part of the dissertation, we propose a computational method, SeqNLS, to predict nuclear localization signals (NLSs). The major difficulty of NLS prediction is that NLSs are known to have diverse patterns, but knowledge of NLS patterns is limited and only a portion of NLSs are covered by the known NLS motifs. In SeqNLS, on the one hand, we propose a sequential-pattern approach to effectively detect potential NLS segments without being constrained by the limited knowledge of NLS patterns. On the other hand, we introduce a model for NLS prediction that exploits the fact that an NLS is one type of linear motif. Our experimental results show that the sequential-pattern approach is effective in searching extensively for potential NLSs. Our method can consistently find over 50% of NLSs with prediction precision of at least 0.7 in the two independent datasets, and it outperforms state-of-the-art NLS prediction methods in terms of F1 score. The binding affinity between a nuclear localization signal (NLS) and its import receptor is closely related to the corresponding nuclear import activity. PTM-based modulation of the NLS binding affinity to the import receptor is one of the best understood mechanisms for regulating nuclear import of proteins. However, identifying such regulation mechanisms is challenging because it is difficult to assess the impact of a PTM on the corresponding nuclear import activity. In the third part of the dissertation, we propose NIpredict, an effective algorithm to predict the nuclear import activity of a protein given its NLS, in which molecular interaction energy components (MIECs) are used to characterize the NLS-import receptor interaction, and support vector regression (SVR) is used to learn the relationship between the characterized NLS-import receptor interaction and the corresponding nuclear import activity. Our experiments showed that the change in nuclear import activity due to a change in the NLS could be accurately predicted by the NIpredict algorithm. Based on NIpredict, we developed a systematic framework to identify potential PTM-based nuclear import regulations for human and yeast nuclear proteins. Application of this approach has uncovered potential nuclear import regulation mechanisms by phosphorylation and/or acetylation of three nuclear proteins: SF1, histone H1, and ORC6.

    Predicting Drug-Target Interaction Networks Based on Functional Groups and Biological Features

    Background: The study of drug-target interaction networks is an important topic for drug development. It is both time-consuming and costly to determine compound-protein interactions or potential drug-target interactions by experiments alone. As a complement, in silico prediction methods can provide very useful information in a timely manner. Methods/Principal Findings: To realize this, drug compounds are encoded with functional groups, and proteins are encoded by biological features including biochemical and physicochemical properties. Optimal feature selection is carried out by means of the mRMR (Maximum Relevance Minimum Redundancy) method. Instead of classifying the proteins as a whole family, target proteins are divided into four groups: enzymes, ion channels, G-protein-coupled receptors, and nuclear receptors. Thus, four independent predictors are established using the Nearest Neighbor algorithm as their operation engine, each predicting the interactions between drugs and one of the four protein groups. As a result, the overall success rates achieved by the four predictors in jackknife cross-validation tests are 85.48%, 80.78%, 78.49%, and 85.66%, respectively. Conclusion/Significance: Our results indicate that the network prediction system thus established is quite promising an

    Spatio-temporal Video Parsing for Abnormality Detection

    Abnormality detection in video poses particular challenges due to the infinite size of the class of all irregular objects and behaviors. Thus no (or far too few) abnormal training samples are available, and we need to find abnormalities in test data without actually knowing what they are. Nevertheless, the prevailing concept in the field is to search directly for individual abnormal local patches or image regions independently of one another. To address this problem, we propose a method for joint detection of abnormalities in videos by spatio-temporal video parsing. The goal of video parsing is to find a set of indispensable normal spatio-temporal object hypotheses that jointly explain all the foreground of a video while, at the same time, being supported by normal training samples. Consequently, we avoid a direct detection of abnormalities and discover them indirectly as those hypotheses which are needed for covering the foreground without finding an explanation for themselves by normal samples. Abnormalities are localized by MAP inference in a graphical model, which we solve efficiently by formulating it as a convex optimization problem. We experimentally evaluate our approach on several challenging benchmark sets, improving over the state-of-the-art on all standard benchmarks both in terms of abnormality classification and localization. Comment: 15 pages, 12 figures, 3 tables

    Towards a systems biology approach to mammalian cell cycle: modeling the entrance into S phase of quiescent fibroblasts after serum stimulation

    Background: The cell cycle is a complex process that allows eukaryotic cells to replicate chromosomal DNA and partition it into two daughter cells. A relevant regulatory step is in the G0/G1 phase, a point called the restriction (R) point where intracellular and extracellular signals are monitored and integrated. Results: Subcellular localization of cell cycle proteins is increasingly recognized as a major factor that regulates cell cycle transitions. Nevertheless, current mathematical models of the G1/S networks of mammalian cells do not consider this aspect. Hence, there is a need for a computational model that incorporates this regulatory aspect, which has a relevant role in cancer, since altered localization of key cell cycle players, notably of inhibitors of cyclin-dependent kinases, has been reported to occur in neoplastic cells and to be linked to cancer aggressiveness. Conclusion: The network of the model components involved in the G1 to S transition process was identified through literature and web-based data mining, and the corresponding wiring diagram of the G1 to S transition was drawn with Cell Designer notation. The model has been implemented in Mathematica using Ordinary Differential Equations. Time-courses of the level and sub-cellular localization of key cell cycle players in mouse fibroblasts re-entering the cell cycle after serum starvation/re-feeding have been used to constrain network design and parameter determination. The model recapitulates events from growth factor stimulation to the onset of S phase. The R point estimated by simulation is consistent with the R point determined experimentally. The major element of novelty of our model of the G1 to S transition is the explicit modeling of cytoplasmic/nuclear shuttling of cyclins, cyclin-dependent kinases, their inhibitor and complexes. Sensitivity analysis of the network performance newly reveals that the biological effect brought about by Cki overexpression is strictly dependent on whether the Cki is promoting nuclear translocation of cyclin/Cdk containing complexes. © 2009 Alfieri et al; licensee BioMed Central Ltd

    Verticillium longisporum phospholipase VlsPLA(2) is a virulence factor that targets host nuclei and modulates plant immunity

    Phospholipase A(2) (PLA(2)) is a lipolytic enzyme that hydrolyses phospholipids in the cell membrane. In the present study, we investigated the role of secreted PLA(2) (VlsPLA(2)) in Verticillium longisporum, a fungal phytopathogen that mostly infects plants belonging to the Brassicaceae family, causing severe annual yield loss worldwide. Expression of the VlsPLA(2) gene, which encodes active PLA(2), is highly induced during the interaction of the fungus with the host plant Brassica napus. Heterologous expression of VlsPLA(2) in Nicotiana benthamiana resulted in increased synthesis of certain phospholipids compared to plants in which enzymatically inactive PLA(2) was expressed (VlsPLA(2)(Delta CD)). Moreover, VlsPLA(2) suppresses the hypersensitive response triggered by the Cf4/Avr4 complex, thereby suppressing the chitin-induced reactive oxygen species burst. VlsPLA(2)-overexpressing V. longisporum strains showed increased virulence in Arabidopsis plants, and transcriptomic analysis of this fungal strain revealed that the induction of the gene contributed to increased virulence. VlsPLA(2) was initially localized to the host nucleus and then translocated to the chloroplasts at later time points. In addition, VlsPLA(2) bound to the vesicle-associated membrane protein A (VAMPA) and was transported to the nuclear membrane. In the nucleus, VlsPLA(2) caused major alterations in the expression levels of genes encoding transcription factors and subtilisin-like proteases, which play a role in plant immunity. In conclusion, our study showed that VlsPLA(2) acts as a virulence factor, possibly by hydrolysing host nuclear envelope phospholipids, which, through a signal transduction cascade, may suppress basal plant immune responses.