Search CORE

653 research outputs found

Multiscale Modeling of RNA Structures Using NMR Chemical Shifts

Author: Zhang Kexin
Publication venue
Publication date: 01/01/2020
Field of study

Structure determination is an important step in understanding the mechanisms of functional non-coding ribonucleic acids (ncRNAs). Experimental observables in solution-state nuclear magnetic resonance (NMR) spectroscopy provide valuable information about the structural and dynamic properties of RNAs. In particular, NMR-derived chemical shifts are considered structural "fingerprints" of RNA conformational state(s). In my thesis, I have developed computational tools to model RNA structures (mainly secondary structures) using structural information extracted from NMR chemical shifts. Inspired by methods that incorporate chemical-mapping data into RNA secondary structure prediction, I have developed a framework, CS-Fold, for using assigned chemical shift data to conditionally guide secondary structure folding algorithms. First, I developed neural network classifiers, CS2BPS (Chemical Shift to Base Pairing Status), that take assigned chemical shifts as input and output the predicted base pairing status of individual residues in an RNA. Then I used the base pairing status predictions as folding restraints to guide RNA secondary structure prediction. Extensive testing indicates that from assigned NMR chemical shifts, we could accurately predict the secondary structures of RNAs and map distinct conformational states of a single RNA. Another way to utilize experimental data like NMR chemical shifts in structure modeling is probabilistic modeling, that is, using experimental data to recover native-like structure from a structural ensemble that contains a set of low energy structure models. I first developed a model, SS2CS (Secondary Structure to Chemical Shift), that takes secondary structure as input and predicts chemical shifts with high accuracies. Using Bayesian/maximum entropy (BME), I was able to reweight secondary structure models based on the agreement between the measured and reweighted ensemble-averaged chemical shifts. Results indicate that BME could identify the native or near-native structure from a set of low energy structure models as well as recover some of the non-canonical interactions in tertiary structures. We could also probe the conformational landscape by studying the weight pattern assigned by BME. Finally, I explored RNA structural annotation using assigned NMR chemical shifts. Using multitask learning, eleven structural properties were annotated by classifying individual residues in terms of each structural property. The results indicate that our method, CS-Annotate, could predict the structural properties with reasonable accuracy. We believe that CS-Annotate could be used for assessing the quality of a structure model by comparing the structure derived structural properties with the CS-Annotate derived structural properties. One major limitation of the tools developed is that they require assigned chemical shifts. And to assign chemical shifts, a secondary structure model is typically assumed. However, with the recent advances in singly labeled RNA synthesis, chemical shifts could be assigned without the assumption about the secondary structure. We envision that using the chemical shifts derived from singly labeled NMR experiments, CS-Fold could be used for modeling the secondary structure of RNA. We also believe that unassigned chemical shifts could be used for selecting structure models. Native-like structures could be recovered by comparing optimally assigned chemical shifts with computed chemical shifts (generated by SS2CS). Overall, the results presented in this thesis indicate we could extract crucial structural information of the residues in an RNA based on its NMR chemical shifts. Moreover, with the tools like CS-Fold, SS2CS, and CS-Annotate, we could accurately predict the secondary structure, model conformational landscape, and study structural properties of an RNA.PHDChemistryUniversity of Michigan, Horace H. Rackham School of Graduate Studieshttp://deepblue.lib.umich.edu/bitstream/2027.42/163247/1/kexin_1.pd

Deep Blue Documents at the University of Michigan

Machine Learning and Integrative Analysis of Biomedical Big Data.

Author: Choi Howard
Chung Neo Christopher
Mirza Bilal
Ping Peipei
Wang Jie
Wang Wei
Publication venue: eScholarship, University of California
Publication date: 01/01/2019
Field of study

Recent developments in high-throughput technologies have accelerated the accumulation of massive amounts of omics data from multiple sources: genome, epigenome, transcriptome, proteome, metabolome, etc. Traditionally, data from each source (e.g., genome) is analyzed in isolation using statistical and machine learning (ML) methods. Integrative analysis of multi-omics and clinical data is key to new biomedical discoveries and advancements in precision medicine. However, data integration poses new computational challenges as well as exacerbates the ones associated with single-omics studies. Specialized computational approaches are required to effectively and efficiently perform integrative analysis of biomedical data acquired from diverse modalities. In this review, we discuss state-of-the-art ML-based approaches for tackling five specific computational challenges associated with integrative analysis: curse of dimensionality, data heterogeneity, missing data, class imbalance and scalability issues

Multidisciplinary Digital Publishing Institute

Ezid

Directory of Open Access Journals

eScholarship - University of California

Multi-label Classification for Tree and Directed Acyclic Graphs Hierarchies

Author: Eduardo F Morales
L Enrique Sucar
Mallinali Ramírez-Corona
Publication venue
Publication date: 11/04/2020
Field of study

Abstract. Hierarchical Multi-label Classification (HMC) is the task of assigning a set of classes to a single instance with the peculiarity that the classes are ordered in a predefined structure. We propose a novel HMC method for tree and Directed Acyclic Graphs (DAG) hierarchies. Using the combined predictions of locals classifiers and a weighting scheme according to the level in the hierarchy, we select the "best" single path for tree hierarchies, and multiple paths for DAG hierarchies. We developed a method that returns paths from the root down to a leaf node (Mandatory Leaf Node Prediction or MLNP) and an extension for Non Mandatory Leaf Node Prediction (NMLNP). For NMLNP we compared several pruning approaches varying the pruning direction, pruning time and pruning condition. Additionally, we propose a new evaluation metric for hierarchical classifiers, that avoids the bias of current measures which favor conservative approaches when using NMLNP. The proposed approach was experimentally evaluated with 10 tree and 8 DAG hierarchical datasets in the domain of protein function prediction. We concluded that our method works better for deep, DAG hierarchies and in general NMLNP improves MLNP

CiteSeerX

Special Issue on the Seventh Probabilistic Graphical Models Conference (PGM 2014)

Author: Renooij S.
Publication venue
Publication date: 01/01/2016
Field of study

Utrecht University Repository

Dynamic Multiscale Tree Learning Using Ensemble Strong Classifiers for Multi-label Segmentation of Medical Images with Lesions

Author: Ali Mahjoub Mohamed
Amiri Samya
Rekik Islem
Publication venue: 'Scitepress'
Publication date: 05/09/2017
Field of study

We introduce a dynamic multiscale tree (DMT) architecture that learns how to leverage the strengths of different existing classifiers for supervised multi-label image segmentation. Unlike previous works that simply aggregate or cascade classifiers for addressing image segmentation and labeling tasks, we propose to embed strong classifiers into a tree structure that allows bi-directional flow of information between its classifier nodes to gradually improve their performances. Our DMT is a generic classification model that inherently embeds different cascades of classifiers while enhancing learning transfer between them to boost up their classification accuracies. Specifically, each node in our DMT can nest a Structured Random Forest (SRF) classifier or a Bayesian Network (BN) classifier. The proposed SRF-BN DMT architecture has several appealing properties. First, while SRF operates at a patch-level (regular image region), BN operates at the super-pixel level (irregular image region), thereby enabling the DMT to integrate multi-level image knowledge in the learning process. Second, although BN is powerful in modeling dependencies between image elements (superpixels, edges) and their features, the learning of its structure and parameters is challenging. On the other hand, SRF may fail to accurately detect very irregular object boundaries. The proposed DMT robustly overcomes these limitations for both classifiers through the ascending and descending flow of contextual information between each parent node and its children nodes. Third, we train DMT using different scales, where we progressively decrease the patch and superpixel sizes as we go deeper along the tree edges nearing its leaf nodes. Last, DMT demonstrates its outperformance in comparison to several state-of-the-art segmentation methods for multi-labeling of brain images with gliomas

arXiv.org e-Print Archive

University of Dundee Online Publications

Tree-based Ensemble Classifier Learning for Automatic Brain Glioma Segmentation

Author: Ali Mahjoub Mohamed
Amiri Samya
Rekik Islem
Publication venue: 'Elsevier BV'
Publication date: 01/11/2018
Field of study

University of Dundee Online Publications

A supervised learning framework in the context of multiple annotators

Author: Gil González Julián
Publication venue: Doctorado en Ingeniería
Publication date: 01/01/2021
Field of study

The increasing popularity of crowdsourcing platforms, i.e., Amazon Mechanical Turk, is changing how datasets for supervised learning are built. In these cases, instead of having datasets labeled by one source (which is supposed to be an expert who provided the absolute gold standard), we have datasets labeled by multiple annotators with different and unknown expertise. Hence, we face a multi-labeler scenario, which typical supervised learning models cannot tackle. For such a reason, much attention has recently been given to the approaches that capture multiple annotators’ wisdom. However, such methods residing on two key assumptions: the labeler’s performance does not depend on the input space and independence among the annotators, which are hardly feasible in real-world settings..

Repositorio academico de la Universidad Tecnológica de Pereira

The Evolution of First Person Vision Methods: A Survey

Author: Betancourt Alejandro
Morerio Pietro
Rauterberg Matthias
Regazzoni Carlo S.
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2015
Field of study

The emergence of new wearable technologies such as action cameras and smart-glasses has increased the interest of computer vision scientists in the First Person perspective. Nowadays, this field is attracting attention and investments of companies aiming to develop commercial devices with First Person Vision recording capabilities. Due to this interest, an increasing demand of methods to process these videos, possibly in real-time, is expected. Current approaches present a particular combinations of different image features and quantitative methods to accomplish specific objectives like object detection, activity recognition, user machine interaction and so on. This paper summarizes the evolution of the state of the art in First Person Vision video analysis between 1997 and 2014, highlighting, among others, most commonly used features, methods, challenges and opportunities within the field.Comment: First Person Vision, Egocentric Vision, Wearable Devices, Smart Glasses, Computer Vision, Video Analytics, Human-machine Interactio

arXiv.org e-Print Archive

CiteSeerX

Pure OAI Repository

Archivio istituzionale della ricerca - Università di Genova