121 research outputs found

    Protemot: prediction of protein binding sites with automatically extracted geometrical templates

    Geometrical analysis of protein tertiary substructures has been an effective approach to predicting protein binding sites. This article presents the Protemot web server, which predicts protein binding sites based on structural templates automatically extracted from the crystal structures of protein–ligand complexes in the PDB (Protein Data Bank). The automatic extraction mechanism is essential for creating and maintaining a comprehensive template library that keeps pace with new releases of the PDB as the number of entries continues to grow rapidly. The design of Protemot is also distinctive in the mechanism employed to expedite the analysis process that matches the tertiary substructures on the contour of the query protein with the templates in the library. This expediting mechanism is essential for providing a reasonable response time to the user, because the template library grows along with the PDB. This article also reports the experiments conducted to evaluate the prediction power delivered by the Protemot web server. Experimental results show that Protemot delivers superior prediction power compared with a web server based on a manually curated template library with an insufficient number of entries. Availability:

    Machine Learning Methods for Medical and Biological Image Computing

    Medical and biological imaging technologies provide valuable visual information about the structure and function of an organ, from the level of individual molecules to the whole object. The brain is the most complex organ in the body, and it attracts increasingly intense research attention with the rapid development of medical and biological imaging technologies. The massive amount of high-dimensional brain imaging data being generated creates a strong demand for computational methods that analyze those images efficiently. Current computational methods based on hand-crafted features do not scale with the increasing number of brain images, hindering the pace of scientific discovery in neuroscience. In this thesis, I propose computational methods using high-level features for automated analysis of brain images at different levels. At the brain function level, I develop a deep learning based framework for completing and integrating multi-modality neuroimaging data, which increases the diagnosis accuracy for Alzheimer's disease. At the cellular level, I propose to use three-dimensional convolutional neural networks (CNNs) for segmenting volumetric neuronal images, which improves the performance of digital reconstruction of neuron structures. I design a novel CNN architecture such that model training and test image prediction can be implemented in an end-to-end manner. At the molecular level, I build a voxel CNN classifier to capture discriminative features of the input along three spatial dimensions, which facilitates the identification of secondary structures of proteins from electron microscopy images. In order to classify genes specifically expressed in different brain cell types, I propose to use invariant image feature descriptors to capture local gene expression information from cellular-resolution in situ hybridization images. I build image-level representations by applying regularized learning and vector quantization to the generated image descriptors. The computational methods developed in this dissertation are evaluated using images from medical and biological experiments in comparison with baseline methods. Experimental results demonstrate that the developed representations, formulations, and algorithms are effective and efficient in learning from brain imaging data.

    Machine Learning in Discrete Molecular Spaces

    The past decade has seen an explosion of machine learning in chemistry. Whether it is in property prediction, synthesis, molecular design, or any other subdivision, machine learning seems poised to become an integral, if not a dominant, component of future research efforts. This extraordinary capacity rests on the interaction between machine learning models and the underlying chemical data landscape commonly referred to as chemical space. Chemical space has multiple incarnations, but is generally considered the space of all possible molecules. In this sense, it is one example of a molecular set: an arbitrary collection of molecules. This thesis is devoted to precisely these objects, and particularly how they interact with machine learning models. This work is predicated on the idea that by better understanding the relationship between molecular sets and the models trained on them we can improve models, achieve greater interpretability, and further break down the walls between data-driven and human-centric chemistry. The hope is that this enables the full predictive power of machine learning to be leveraged while continuing to build our understanding of chemistry. The first three chapters of this thesis introduce and review the necessary machine learning theory, particularly the tools that have been specially designed for chemical problems. This is followed by an extensive literature review in which the contributions of machine learning to multiple facets of chemistry over the last two decades are explored. Chapters 4-7 explore the research conducted throughout this PhD. Here we explore how we can meaningfully describe the properties of an arbitrary set of molecules through information theory; how we can determine the most informative data points in a set of molecules; how graph signal processing can be used to understand the relationship between the chosen molecular representation, the property, and the machine learning model; and finally how this approach can be brought to bear on protein space. Each of these sub-projects briefly explores the necessary mathematical theory before leveraging it to provide approaches that resolve the posed problems. We conclude with a summary of the contributions of this work and outline fruitful avenues for further exploration.

    Data Science Methods for Analyzing Nanomaterial Images and Videos

    A large amount of nanomaterial characterization data has been routinely collected using electron microscopes and stored in image or video formats. A bottleneck in making effective use of the image/video data is the lack of sophisticated data science methods capable of unlocking valuable material-pertinent information buried in the raw data. To address this problem, the research of this dissertation begins with understanding the physical mechanisms behind the process concerned to determine why generic methods fall short. It then designs and improves image processing and statistical modeling tools to address the practical challenges. Specifically, this dissertation consists of two main tasks: extracting useful information from images or videos of nanomaterials captured by electron microscopes, and designing analytical methods for modeling/monitoring the dynamic growth of nanoparticles. In the first task, a two-pipeline framework is proposed to fuse two kinds of image information for nanoscale object detection that can accurately identify and measure nanoparticles in transmission electron microscope (TEM) images of high noise and low contrast. To handle the second task of analyzing nanoparticle growth, this dissertation develops dynamic nonparametric models for estimating time-varying probability density functions (PDFs). Unlike simple statistics, a PDF contains fuller information about the nanoscale objects of interest. By characterizing the dynamic changes of the PDF as the nanoparticles grow into different sizes and morph into different shapes, the proposed nonparametric methods are capable of analyzing an in situ TEM video to delineate growth stages in a retrospective analysis, or tracking the nanoparticle growth process in a prospective analysis. The resulting analytic methods have applications in areas beyond the nanoparticle growth process, such as image-based process control tasks in additive manufacturing.

    Deep Learning Techniques for Multi-Dimensional Medical Image Analysis


    Development of magnetic resonance imaging techniques for mouse models of Alzheimer's Disease

    Due to increasing life expectancy in western societies, a rise in the prevalence of Alzheimer's Disease (AD) is expected to have adverse social and economic consequences. The success of emerging treatments for AD relies heavily on the ability to test their efficacy. Sensitive biomarkers are required that provide information specific to the therapeutic targets. Through manipulation of the genome, transgenic mice have been bred to exhibit particular pathological features of AD in isolation. Magnetic Resonance Imaging (MRI) of these mouse models can be used to observe phenotypic abnormalities in vivo in a controlled environment. As summarised in the introductory chapter, the aim of this work was to develop MRI techniques for inclusion in multi-parametric protocols to characterise AD models in vivo. Structural MRI has become an increasingly popular tool in the measurement of atrophy of brain tissue over time and requires both accuracy and stability of the imaging system. In chapter 3, a protocol for the calibration of system gradients for high resolution, pre-clinical MRI is described. A structural phantom has been designed and 3D printed for use in a 9.4T small bore MRI and micro CT system. Post processing software is used to monitor gradient stability and provide corrections for scaling errors and non-linearity. Diffusion Tensor Imaging (DTI) and Quantitative Susceptibility Mapping (QSM) are MRI techniques that have shown sensitivity to changes in white matter regions of the brain. QSM may also provide a non-invasive method for measurement of the increased iron concentration in grey matter tissue observed in AD. Chapters 4 and 5 evaluate the utility of these measurements as imaging biomarkers in a mouse model that exhibits tau pathology associated with AD. Differences between transgenic and wild-type groups were identified for both MRI techniques, indicating the potential benefit of their inclusion in a multi-parametric in vivo protocol.