Predicting protein disorder by analyzing amino acid sequence
Background: Many protein regions and some entire proteins have no definite tertiary structure, presenting instead as dynamic, disordered ensembles under different physicochemical circumstances. These proteins and regions are known as Intrinsically Unstructured Proteins (IUP). IUP have been associated with a wide range of protein functions, along with roles in diseases characterized by protein misfolding and aggregation. Results: Identifying IUP is an important task in structural and functional genomics. We extract useful features from sequences and develop machine learning algorithms for this task. We compare our IUP predictor with PONDRs (mainly neural-network-based predictors), disEMBL (also based on neural networks) and Globplot (based on disorder propensity). Conclusion: We find that adding features derived from physicochemical properties of amino acids (such as hydrophobicity and complexity) and using an ensemble method are both beneficial. The IUP predictor is a viable alternative software tool for identifying IUP regions and proteins.
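As an illustration of the kind of sequence-derived feature the abstract mentions, the sketch below computes a sliding-window complexity score as the Shannon entropy of residue composition. This is only one plausible reading of "complexity"; the function name `window_entropy` and the default window size are assumptions, not taken from the paper.

```python
import math
from collections import Counter


def window_entropy(seq, window=5):
    """Shannon entropy (in bits) of the residue composition of each window.

    Hypothetical helper: low values indicate low-complexity stretches,
    which are often discussed in connection with disordered regions.
    """
    seq = seq.upper()
    if window < 1 or window > len(seq):
        raise ValueError("window must be between 1 and len(seq)")
    profile = []
    for start in range(len(seq) - window + 1):
        counts = Counter(seq[start:start + window])
        # Sum of p * log2(1/p) over the residues present in the window.
        entropy = sum((c / window) * math.log2(window / c)
                      for c in counts.values())
        profile.append(entropy)
    return profile


if __name__ == "__main__":
    # A repetitive stretch followed by a more varied one.
    print(window_entropy("SSSSSSACDEFGHIK", window=5))
```

A profile like this could be concatenated with other per-window properties before being passed to a classifier; how the authors combine their features is not stated in the abstract.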
Predicting siRNA potency with random forests and support vector machines
Abstract Background Short interfering RNAs (siRNAs) can be used to knock down gene expression in functional genomics. For a target gene of interest, many siRNA molecules may be designed, but their efficiency in inhibiting expression often varies. Results To facilitate gene functional studies, we have developed a new machine learning method to predict siRNA potency based on random forests and support vector machines. Since there were many potential sequence features, random forests were used to select the most relevant features affecting gene expression inhibition. Support vector machine classifiers were then constructed using the selected sequence features for predicting siRNA potency. Interestingly, gene expression inhibition is significantly affected by the nucleotide dimer and trimer compositions of the siRNA sequence. Conclusions The findings in this study should help design potent siRNAs for functional genomics, and might also provide further insights into the molecular mechanism of RNA interference.
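The dimer and trimer compositions mentioned above can be sketched as fixed-length frequency vectors. The code below is a minimal illustration, not the authors' pipeline: the names `kmer_composition` and `sirna_features` are hypothetical, and the random forest selection and SVM training steps are deliberately left out.

```python
from itertools import product


def kmer_composition(seq, k):
    """Relative frequency of every possible k-mer over the RNA alphabet.

    Hypothetical helper: DNA-style input is converted by mapping T to U,
    and k-mers containing other symbols are ignored.
    """
    seq = seq.upper().replace("T", "U")
    total = len(seq) - k + 1
    if total < 1:
        raise ValueError("sequence is shorter than k")
    kmers = ["".join(p) for p in product("ACGU", repeat=k)]
    counts = dict.fromkeys(kmers, 0)
    for start in range(total):
        kmer = seq[start:start + k]
        if kmer in counts:
            counts[kmer] += 1
    return [counts[kmer] / total for kmer in kmers]


def sirna_features(seq):
    """Concatenate dimer (16) and trimer (64) compositions into one vector."""
    return kmer_composition(seq, 2) + kmer_composition(seq, 3)


if __name__ == "__main__":
    features = sirna_features("AUGCAUGCAUGCAUGCAUGCA")
    print(len(features))
```

A vector of this form would be a natural input to a feature-ranking step followed by a classifier, which is the two-stage design the abstract describes.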
Analyzing adjuvant radiotherapy suggests a non monotonic radio-sensitivity over tumor volumes
Background: Adjuvant radiotherapy (RT) after surgical removal of tumors has proved beneficial in long-term tumor control and treatment planning. For many years, it has been widely accepted that the radio-sensitivity of tumors decreases with tumor size, and RT models based on Poisson statistics have been used extensively to validate clinical data. Results: We found that the Poisson statistics used in RT were actually derived from bacterial cells, despite many validations against clinical data. However, cancerous cells have abnormal cellular communications and use chemical messengers to signal both surrounding normal and cancerous cells to develop new blood vessels, to invade, to metastasize and, in general, to overcome intercellular spatial confinement. We therefore investigated the cell-killing effects of adjuvant RT and found that radio-sensitivity is not a monotonic function of volume, as was previously believed. We present a detailed analysis and explanation to justify this statement. Based on the equivalent uniform dose (EUD), we present an equivalent radio-sensitivity model. Conclusion: We conclude that radio-sensitivity is a sophisticated function of tumor volume, since tumor responses to radiotherapy also depend on cellular communications.
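For context, the conventional Poisson picture that the abstract questions can be sketched in a few lines: tumor control probability is the Poisson probability that no clonogenic cell survives. The sketch below is a textbook-style simplification, not the authors' model; the function name, the purely exponential survival term, and the default values for `alpha` and `clonogen_density` are illustrative assumptions.

```python
import math


def poisson_tcp(volume_cm3, dose_gy, alpha=0.3, clonogen_density=1e7):
    """Tumor control probability under a simple Poisson model.

    Assumptions (illustrative only): clonogen number scales linearly with
    volume, and the surviving fraction is exp(-alpha * dose).
    """
    initial_clonogens = clonogen_density * volume_cm3
    expected_survivors = initial_clonogens * math.exp(-alpha * dose_gy)
    # Probability that a Poisson variable with this mean equals zero.
    return math.exp(-expected_survivors)


if __name__ == "__main__":
    for volume in (1, 2, 5, 10):
        print(volume, round(poisson_tcp(volume, dose_gy=60), 3))
```

In this simplified model the control probability at a fixed dose can only fall as volume grows, which is the monotonic behaviour the abstract argues does not hold once cellular communication is taken into account.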
Fast Predictive Simple Geodesic Regression
Deformable image registration and regression are important tasks in medical
image analysis. However, they are computationally expensive, especially when
analyzing large-scale datasets that contain thousands of images. Hence, cluster
computing is typically used, making the approaches dependent on such
computational infrastructure. Even larger computational resources are required
as study sizes increase. This limits the use of deformable image registration
and regression for clinical applications and as component algorithms for other
image analysis approaches. We therefore propose using a fast predictive
approach to perform image registrations. In particular, we employ these fast
registration predictions to approximate a simplified geodesic regression model
to capture longitudinal brain changes. The resulting method is orders of
magnitude faster than the standard optimization-based regression model and
hence facilitates large-scale analysis on a single graphics processing unit
(GPU). We evaluate our results on 3D brain magnetic resonance images (MRI) from
the ADNI datasets. Comment: 19 pages, 10 figures, 13 tables
Installation of internal electric fields by non-redox active cations in transition metal complexes.
Local electric fields contribute to the high selectivity and catalytic activity in enzyme active sites and confined reaction centers in zeolites by modifying the relative energy of transition states, intermediates and/or products. Proximal charged functionalities can generate equivalent internal electric fields in molecular systems but the magnitude of their effect and impact on electronic structure has been minimally explored. To generate quantitative insight into installing internal fields in synthetic systems, we report an experimental and computational study using transition metal (M1) Schiff base complexes functionalized with a crown ether unit containing a mono- or dicationic alkali or alkaline earth metal ion (M2). The synthesis and characterization of the complexes M1 = Ni(ii) and M2 = Na+ or Ba2+ are reported. The electronic absorption spectra and density functional theory (DFT) calculations establish that the cations generate a robust electric field at the metal, which stabilizes the Ni-based molecular orbitals without significantly changing their relative energies. The stabilization is also reflected in the experimental Ni(ii/i) reduction potentials, which are shifted 0.12 V and 0.34 V positive for M2 = Na+ and Ba2+, respectively, compared to a complex lacking a proximal cation. To compare with the cationic Ni complexes, we also synthesized a series of Ni(salen) complexes modified in the 5' position with electron-donating and -withdrawing functionalities (-CF3, -Cl, -H, -tBu, and -OCH3). Data from this series of compounds provides further evidence that the reduction potential shifts observed in the cationic complexes are not due to inductive ligand effects. DFT studies were also performed on the previously reported monocationic and dicationic Fe(ii)(CH3CN) and Fe(iii)Cl analogues of this system to analyze the impact of an anionic chloride on the electrostatic potential and electronic structure of the Fe site.
Genomics, Molecular Imaging, Bioinformatics, and Bio-Nano-Info Integration are Synergistic Components of Translational Medicine and Personalized Healthcare Research
Supported by the National Science Foundation (NSF), the International Society of Intelligent Biological Medicine (ISIBM), the International Journal of Computational Biology and Drug Design and the International Journal of Functional Informatics and Personalized Medicine, the IEEE 7th Bioinformatics and Bioengineering conference attracted more than 600 papers and 500 researchers and medical doctors. It was the only synergistic inter/multidisciplinary IEEE conference with 24 Keynote Lectures, 7 Tutorials, 5 Cutting-Edge Research Workshops and 32 Scientific Sessions, including 11 Special Research Interest Sessions that were designed dynamically at Harvard in response to current research trends and advances. The committee was very grateful for the IEEE Plenary Keynote Lectures given by: Dr. A. Keith Dunker (Indiana), Dr. Jun Liu (Harvard), Dr. Brian Athey (Michigan), Dr. Mark Borodovsky (Georgia Tech and President of ISIBM), Dr. Hamid Arabnia (Georgia and Vice-President of ISIBM), Dr. Ruzena Bajcsy (Berkeley and Member of United States National Academy of Engineering and Member of United States Institute of Medicine of the National Academies), Dr. Mary Yang (United States National Institutes of Health and Oak Ridge, DOE), Dr. Chih-Ming Ho (UCLA and Member of United States National Academy of Engineering and Academician of Academia Sinica), Dr. Andy Baxevanis (United States National Institutes of Health), Dr. Arif Ghafoor (Purdue), Dr. John Quackenbush (Harvard), Dr. Eric Jakobsson (UIUC), Dr. Vladimir Uversky (Indiana), Dr. Laura Elnitski (United States National Institutes of Health) and other world-class scientific leaders. The Harvard meeting was a large academic event fully sponsored by IEEE, both financially and academically. After a rigorous peer-review process, the committee selected 27 high-quality research papers from 600 submissions. The committee is grateful for contributions from keynote speakers Dr. Russ Altman (IEEE BIBM conference keynote lecturer on combining simulation and machine learning to recognize function in 4D), Dr. Mary Qu Yang (IEEE BIBM workshop keynote lecturer on new initiatives of detecting microscopic disease using machine learning and molecular biology, http://ieeexplore.ieee.org/servlet/opac?punumber=4425386) and Dr. Jack Y. Yang (IEEE BIBM workshop keynote lecturer on data mining and knowledge discovery in translational medicine) from the first IEEE Computer Society BioInformatics and BioMedicine (IEEE BIBM) international conference and workshops, November 2-4, 2007, Silicon Valley, California, USA
Supervised Learning Method for the Prediction of Subcellular Localization of Proteins Using Amino Acid and Amino Acid Pair Composition
Background
Knowing where a protein occurs in the cell is an important step in understanding its function. It is highly desirable to predict a protein's subcellular locations automatically from its sequence. The most studied methods for predicting subcellular localization rely on signal peptides, on sequence homology, and on the correlation between location and total amino acid composition. Taking both amino acid composition and amino acid pair composition into consideration helps improve the prediction accuracy.
Results
We constructed a dataset of protein sequences from the SWISS-PROT database and divided them into 12 classes based on their subcellular locations. SVM modules were trained to predict the subcellular location based on amino acid composition and amino acid pair composition. Results were calculated after 10-fold cross validation. The radial basis function (RBF) kernel outperformed the polynomial and linear kernels. Total prediction accuracy reached 71.8% for amino acid composition and 77.0% for amino acid pair composition. To observe the impact of the number of subcellular locations, we constructed two more datasets with nine and five subcellular locations. Total accuracy improved further to 79.9% and 85.66%, respectively.
Conclusions
A new SVM-based approach is presented that uses amino acid and amino acid pair composition. The results show that data simulation and taking more protein features into consideration improve the accuracy considerably. It was also noticed that the dataset needs to be crafted to take account of the distribution of data across all the classes
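The two feature sets named in the abstract can be sketched as follows. This is an illustrative reading, not the authors' code: the function names are hypothetical, "pair composition" is assumed to mean the frequency of adjacent residue pairs, and residues outside the 20 standard amino acids are simply skipped.

```python
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
PAIRS = ["".join(p) for p in product(AMINO_ACIDS, repeat=2)]


def aa_composition(seq):
    """20-dimensional vector of single-residue frequencies."""
    residues = [r for r in seq.upper() if r in AMINO_ACIDS]
    total = len(residues)
    if total == 0:
        return [0.0] * len(AMINO_ACIDS)
    return [residues.count(a) / total for a in AMINO_ACIDS]


def pair_composition(seq):
    """400-dimensional vector of adjacent-pair frequencies (assumed meaning)."""
    seq = seq.upper()
    counts = dict.fromkeys(PAIRS, 0)
    total = 0
    for first, second in zip(seq, seq[1:]):
        pair = first + second
        if pair in counts:
            counts[pair] += 1
            total += 1
    if total == 0:
        return [0.0] * len(PAIRS)
    return [counts[p] / total for p in PAIRS]


if __name__ == "__main__":
    seq = "MKTAYIAKQR"
    print(len(aa_composition(seq)), len(pair_composition(seq)))
```

Either vector, or their concatenation, has a fixed length regardless of protein size, which is what makes it usable as input to a kernel classifier such as the SVMs described above.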
Investigation of transmembrane proteins using a computational approach
Background: An important subfamily of membrane proteins are the transmembrane α-helical proteins, in which the membrane-spanning regions are made up of α-helices. Given the obvious biological and medical significance of these proteins, it is of tremendous practical importance to identify the location of transmembrane segments. The difficulty of inferring the secondary or tertiary structure of transmembrane proteins using experimental techniques has led to a surge of interest in applying techniques from machine learning and bioinformatics to infer secondary structure from primary structure in these proteins. We are therefore interested in determining which physicochemical properties are most useful for discriminating transmembrane segments from non-transmembrane segments in transmembrane proteins, and for discriminating intrinsically unstructured segments from intrinsically structured segments in transmembrane proteins, and in using the results of these investigations to develop classifiers to identify transmembrane segments in transmembrane proteins. Results: We determined that the most useful properties for discriminating transmembrane segments from non-transmembrane segments and for discriminating intrinsically unstructured segments from intrinsically structured segments in transmembrane proteins were hydropathy, polarity, and flexibility, and used the results of this analysis to construct classifiers to discriminate transmembrane segments from non-transmembrane segments using four classification techniques: two variants of the Self-Organizing Global Ranking algorithm, a decision tree algorithm, and a support vector machine algorithm. All four techniques exhibited good performance, with out-of-sample accuracies of approximately 75%. 
Conclusions: Several interesting observations emerged from our study: intrinsically unstructured segments and transmembrane segments tend to have opposite properties; transmembrane proteins appear to be much richer in intrinsically unstructured segments than other proteins; and, in approximately 70% of transmembrane proteins that contain intrinsically unstructured segments, the intrinsically unstructured segments are close to transmembrane segments
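Hydropathy is one of the properties the abstract identifies as most useful. As a point of reference, the sketch below implements a classical sliding-window hydropathy scan using the Kyte-Doolittle scale. It is a simple baseline, not one of the four classifiers evaluated in the paper; the function names, the 19-residue window and the 1.6 threshold are illustrative choices.

```python
# Kyte-Doolittle hydropathy values for the 20 standard amino acids.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(seq, window=19):
    """Mean hydropathy of each window; empty if seq is shorter than window."""
    scores = [KYTE_DOOLITTLE[r] for r in seq.upper()]
    return [sum(scores[i:i + window]) / window
            for i in range(len(scores) - window + 1)]


def candidate_tm_windows(seq, window=19, threshold=1.6):
    """Start indices of windows whose mean hydropathy exceeds the threshold."""
    return [i for i, h in enumerate(hydropathy_profile(seq, window))
            if h > threshold]


if __name__ == "__main__":
    # Hypothetical sequence: a polar stretch, a hydrophobic core, a polar tail.
    seq = "DDKKEESS" + "LIVALLAVIGLLIVAFLLA" + "KRDEESNQ"
    print(candidate_tm_windows(seq))
```

Windows that score high here are hydrophobic enough to be membrane-spanning candidates; the abstract's observation that unstructured and transmembrane segments have opposite properties suggests the same profile would score low over disordered regions.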