22 research outputs found

    Doctor of Philosophy

    The unstable expansion of the polyglutamine (polyQ) tract is a critical factor in the pathogenic pathway of at least ten neurodegenerative diseases, including Huntington's disease, spinal and bulbar muscular atrophy (SBMA), dentatorubral-pallidoluysian atrophy (DRPLA), and seven spinocerebellar ataxias, all of which are termed polyglutamine diseases. One common but less well understood feature of polyQ diseases is polyQ protein aggregation. This dissertation explores the changes in protein folding, hydrogen bonding, and water accessibility induced by enlargement of the polyQ tract, using advanced informatics and computational methods, including protein 3D structure modeling and molecular dynamics simulations. It also demonstrates that these computational and informatics methods are powerful tools for gaining insight into protein aggregation in polyQ diseases. Enlargement of polyQ segments affects both the local and global structures of polyQ proteins, as well as their water accessibility, hydrogen bond patterns, and other structural characteristics. Results from both isolated polyQ segments and polyQ segments in the context of ataxin-2 and ataxin-3 show that polyQ tracts increasingly prefer self-interaction as their length increases, indicating an increased tendency toward aggregation among larger polyQ tracts. These results provide new insights into possible pathogenic mechanisms of polyQ diseases based solely on the increased propensity toward polyQ aggregation, and suggest that modulating solvent-polyQ interaction may be a possible therapeutic strategy for treating polyQ diseases. The analysis pipeline designed and used in this study is an effective way to study the molecular mechanism of polyQ diseases, and can be generalized to other diseases associated with protein conformation changes, such as Parkinson's disease, Alzheimer's disease, and cancer.

    Metadata Discovery and Integration to Support Reproducible Research using the Open Further Platform

    Modern biomedical research often requires reusing and combining (federating and/or integrating) data from multiple disparate sources, such as clinical and electronic health records (phenotypes), public and private genomic annotations (genotypes), proteomics, metabolomics, biospecimen collections, and environmental data. Each data source embeds, explicitly or implicitly, its own semantic (meaning) and syntactic (structural) descriptions of the data. Metadata as described by the FAIR (Findable, Accessible, Interoperable, and Reusable) principles (Wilkinson et al., 2016) is a requirement for reproducible research, which requires discovering and understanding these metadata to facilitate proper use of the data. The current state of the art requires a great deal of manual curation, which renders these procedures non-scalable and consequently of limited practical value in the emerging big data biomedical science paradigm. To overcome these limitations, we are prototyping a computational infrastructure that supports automated and semi-automated mapping of metadata artifacts and terminologies. First, we advanced OpenFurther's metadata repository to adopt metadata specifications developed by the bioCADDIE consortium, so that it can store metadata for scalable interoperability between systems for creating, managing, and using data. Second, we applied machine learning methods to discover metadata automatically. Our preliminary results show that machine learning models were able to classify protein structure, genetic variant, and general English corpus data with an average accuracy of 99%. Finally, we will use the findings from this work to develop a metadata and semantics discovery and mapping framework. The framework will be agnostic to specific mapping algorithms or tools, since many of these are domain-specific and data-dependent, and it will choose the best available solution based on mapping performance, making it scalable and suitable for emerging big data applications. This will allow proper reuse, federation, and integration of the metadata-enriched data as needed to support reproducible research.

    References:
    Wilkinson MD, et al. The FAIR Guiding Principles for scientific data management and stewardship. Sci Data. 2016;3:160018.
    Gouripeddi R, Facelli JC, et al. FURTHeR: An Infrastructure for Clinical, Translational and Comparative Effectiveness Research. AMIA Annual Fall Symposium. 2013; Washington, DC.
    WG3 Members. (2015). WG3-MetadataSpecifications: NIH BD2K bioCADDIE Data Discovery Index WG3 Metadata Specification v1. Zenodo. 10.5281/zenodo.2801

    Molecular dynamics analysis of the aggregation propensity of polyglutamine segments.

    Protein misfolding and aggregation is a pathogenic feature shared among at least ten polyglutamine (polyQ) neurodegenerative diseases. While solvent-solution interaction is a key factor driving protein folding and aggregation, the solvation properties of expanded polyQ tracts are not well understood. Using GPU-enabled all-atom molecular dynamics simulations of polyQ monomers in an explicit solvent environment, this study shows that the propensity for solvent-polyQ interaction decreases as the length of the polyQ tract increases. In longer polyQ segments, long-distance interactions between residues far apart in the sequence predominate, which leads to significant conformational differences. The study also indicates that large loops, composed of parallel β-structures, appear in long polyQ tracts and present new aggregation building blocks, with aggregation driven by long-distance intra-polyQ interactions. Finally, consistent with previous observations from coarse-grained simulations, this study demonstrates that aggregation propensity increases with polyQ length, and that this gain is correlated with a decreasing propensity for solvent-polyQ interaction. These results suggest the modulation of solvent-polyQ interactions as a possible therapeutic strategy for treating polyQ diseases.

    Distance distribution of observed hydrogen bonds with more than 50% frequency.

    The normalized distance is calculated as (|acceptor residue index - donor residue index| + 1) / (the number of repeats in polyQ). Red: Q18; Green: Q32; Blue: Q46.
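The normalized distance defined in this caption can be computed directly from the donor and acceptor residue indices. The following is a minimal Python sketch of that formula; the function name and the example indices are hypothetical and are not taken from the study.

```python
def normalized_hbond_distance(donor_index: int, acceptor_index: int, n_repeats: int) -> float:
    """Return (|acceptor - donor| + 1) / number of glutamine repeats."""
    if n_repeats <= 0:
        raise ValueError("n_repeats must be positive")
    return (abs(acceptor_index - donor_index) + 1) / n_repeats


# Hypothetical example: a hydrogen bond between residues 3 and 30 of a Q32 tract.
print(normalized_hbond_distance(3, 30, 32))  # 0.875
```

Dividing by the number of repeats puts Q18, Q32, and Q46 on a common scale, so a value near 1 means the bond spans almost the whole tract regardless of its length.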

    Solvent accessible surface area.

    A. Total SASA. B. Normalized SASA. Red: backbone SASA; Green: sidechain SASA; Blue: total SASA. The error bars represent the standard deviation of the average values calculated over the six independent MD runs performed in this study.

    Comparison of AMBER CPU and GPU performance for simulations of polyQ monomers in explicit solvent with different numbers of repeats.


    Number of intra-polyQ hydrogen bonds vs. the number of solvent-polyQ hydrogen bonds.

    A. Total count. B. Normalized count per 100 Qs. Red: Q18; Green: Q32; Blue: Q46. The error bars represent the standard deviation of the average values calculated over the six independent MD runs performed in this study for each polyQ length.
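As an illustration of the normalization and error bars described in this caption, the sketch below scales hydrogen bond counts to "per 100 Qs" and summarizes six runs by their mean and standard deviation. The function names and the input counts are placeholders, not data from the study, and the use of the sample standard deviation is an assumption, since the caption does not say which estimator was used.

```python
from statistics import mean, stdev


def per_100_q(count: float, n_repeats: int) -> float:
    """Scale a hydrogen bond count to a tract of 100 glutamines."""
    return 100.0 * count / n_repeats


def summarize_runs(counts, n_repeats):
    """Return (mean, sample standard deviation) of the normalized counts."""
    normalized = [per_100_q(c, n_repeats) for c in counts]
    return mean(normalized), stdev(normalized)


# Placeholder counts for six hypothetical runs of a Q32 tract (not real data).
placeholder_counts = [10, 12, 11, 9, 13, 11]
avg, sd = summarize_runs(placeholder_counts, 32)
print(f"{avg:.2f} +/- {sd:.2f} hydrogen bonds per 100 Qs")
```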

    Secondary structure of polyQ fragments of different lengths.

    A. Q18. B. Q32. C. Q46. Colors indicate secondary structures of different types. Blue: parallel β structure; Sky blue: anti-parallel β structure; Dark green: 3-helix; Green: α-helix; Olive: pi-helix; Orange: turn; Red: bend; Black: loop. X-axis: residue index; Y-axis: percentage of frames in the 80 ns simulations. Results are averaged over the six runs performed here.
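The per-residue percentages plotted in this figure can be obtained by counting, for each residue, how often each secondary structure label occurs across trajectory frames. The sketch below assumes that each frame has already been reduced to a string with one label character per residue; that input format, the single-letter labels, and the function name are all hypothetical and do not describe the tools used in the study.

```python
from collections import Counter


def structure_percentages(frames):
    """For each residue, return {label: percentage of frames with that label}.

    frames: list of equal-length strings, one label character per residue.
    """
    if not frames:
        return []
    n_residues = len(frames[0])
    if any(len(frame) != n_residues for frame in frames):
        raise ValueError("all frames must cover the same number of residues")
    result = []
    for i in range(n_residues):
        counts = Counter(frame[i] for frame in frames)
        result.append({label: 100.0 * n / len(frames) for label, n in counts.items()})
    return result


# Toy input: four frames of a three-residue fragment (labels are made up).
toy_frames = ["LLE", "LEE", "TEE", "LEE"]
for residue, percentages in enumerate(structure_percentages(toy_frames), start=1):
    print(residue, percentages)
```

Averaging over several independent runs, as the caption describes, could then be done by pooling the frames of all runs or by averaging the per-run percentages; the caption does not specify which.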

    Secondary structure of Q32 monomers at different time frames for each of the six independent MD runs performed.

    X-axis: frame index, with each frame representing 100 ps of simulation; Y-axis: residue index, with the secondary structure indicated as depicted in the right panel.

    Solvent-polyQ hydrogen bond count.

    A. Total count. B. Count normalized by polyQ length. Shapes indicate hydrogen bonds of different types. The error bars represent the standard deviation of the average values calculated over the six independent MD runs performed in this study.