78 research outputs found

    Ontology of core data mining entities

    In this article, we present OntoDM-core, an ontology of core data mining entities. OntoDM-core defines the most essential data mining entities in a three-layered ontological structure comprising a specification, an implementation and an application layer. It provides a representational framework for the description of mining structured data, and in addition provides taxonomies of datasets, data mining tasks, generalizations, data mining algorithms and constraints, based on the type of data. OntoDM-core is designed to support a wide range of applications/use cases, such as semantic annotation of data mining algorithms, datasets and results; annotation of QSAR studies in the context of drug discovery investigations; and disambiguation of terms in text mining. The ontology has been thoroughly assessed following the practices in ontology engineering, is fully interoperable with many domain resources and is easy to extend.

    Glycolysis Upregulation Is Neuroprotective As A Compensatory Mechanism In ALS

    Amyotrophic Lateral Sclerosis (ALS) is a fatal neurodegenerative disorder, with TDP-43 inclusions as a major pathological hallmark. Using a Drosophila model of TDP-43 proteinopathy we found significant alterations in glucose metabolism including increased pyruvate, suggesting that modulating glycolysis may be neuroprotective. Indeed, a high sugar diet improves locomotor and lifespan defects caused by TDP-43 proteinopathy in motor neurons or glia, but not muscle, suggesting that metabolic dysregulation occurs in the nervous system. Overexpressing human glucose transporter GLUT-3 in motor neurons mitigates TDP-43-dependent defects in synaptic vesicle recycling and improves locomotion. Furthermore, PFK mRNA, a key indicator of glycolysis, is upregulated in flies and patient-derived iPSC motor neurons with TDP-43 pathology. Surprisingly, PFK overexpression rescues TDP-43-induced locomotor deficits. These findings from multiple ALS models show that mechanistically, glycolysis is upregulated in degenerating motor neurons as a compensatory mechanism and suggest that increased glucose availability is protective.

    S.cerevisiae Complex Function Prediction with Modular Multi-Relational Framework

    Proceedings of: 23rd International Conference on Industrial Engineering and Other Applications of Applied Intelligent Systems, IEA/AIE 2010, Córdoba, Spain, June 1-4, 2010. Determining the functions of genes is essential for understanding how metabolisms work, and for trying to solve their malfunctions. Genes usually work in groups rather than in isolation, so functions should be assigned to gene groups and not to individual genes. Moreover, genetic knowledge involves many relations and changes frequently. Thus, a propositional ad-hoc approach is not appropriate for the gene group function prediction domain. We propose the Modular Multi-Relational Framework (MMRF), which approaches the problem from a relational and flexible point of view. The MMRF consists of several modules covering all involved domain tasks (grouping, representing and learning using computational prediction techniques). A specific application is described, including a relational representation language, where each module of MMRF is individually instantiated and refined to obtain a prediction under specific given conditions. This research work has been supported by CICYT, TRA 2007-67374-C02-02 project and by the expert biological knowledge of the Structural Computational Biology Group in Spanish National Cancer Research Centre (CNIO). The authors would like to thank members of Tilde tool developer group in K.U.Leuven for providing their help and many useful suggestions.

    Multi-Target Prediction: A Unifying View on Problems and Methods

    Multi-target prediction (MTP) is concerned with the simultaneous prediction of multiple target variables of diverse types. Due to its enormous application potential, it has developed into an active and rapidly expanding research field that combines several subfields of machine learning, including multivariate regression, multi-label classification, multi-task learning, dyadic prediction, zero-shot learning, network inference, and matrix completion. In this paper, we present a unifying view on MTP problems and methods. First, we formally discuss commonalities and differences between existing MTP problems. To this end, we introduce a general framework that covers the above subfields as special cases. As a second contribution, we provide a structured overview of MTP methods. This is accomplished by identifying a number of key properties, which distinguish such methods and determine their suitability for different types of problems. Finally, we also discuss a few challenges for future research.

    Predicting gene function using hierarchical multi-label decision tree ensembles

    Background: S. cerevisiae, A. thaliana and M. musculus are well-studied organisms in biology and the sequencing of their genomes was completed many years ago. It is still a challenge, however, to develop methods that assign biological functions to the ORFs in these genomes automatically. Different machine learning methods have been proposed to this end, but it remains unclear which method is to be preferred in terms of predictive performance, efficiency and usability. Results: We study the use of decision tree based models for predicting the multiple functions of ORFs. First, we describe an algorithm for learning hierarchical multi-label decision trees. These can simultaneously predict all the functions of an ORF, while respecting a given hierarchy of gene functions (such as FunCat or GO). We present new results obtained with this algorithm, showing that the trees found by it exhibit clearly better predictive performance than the trees found by previously described methods. Nevertheless, the predictive performance of individual trees is lower than that of some recently proposed statistical learning methods. We show that ensembles of such trees are more accurate than single trees and are competitive with state-of-the-art statistical learning and functional linkage methods. Moreover, the ensemble method is computationally efficient and easy to use. Conclusions: Our results suggest that decision tree based methods are a state-of-the-art, efficient and easy-to-use approach to ORF function prediction.

    A joint physics and radiobiology DREAM team vision - Towards better response prediction models to advance radiotherapy.

    Radiotherapy developed empirically through experience balancing tumour control and normal tissue toxicities. Early simple mathematical models formalized this practical knowledge and enabled effective cancer treatment to date. Remarkable advances in technology, computing, and experimental biology now create opportunities to incorporate this knowledge into enhanced computational models. The ESTRO DREAM (Dose Response, Experiment, Analysis, Modelling) workshop brought together experts across disciplines to pursue the vision of personalized radiotherapy for optimal outcomes through advanced modelling. The ultimate vision is to use quantitative models dynamically during therapy to achieve truly adaptive and biologically guided radiotherapy at the population as well as individual patient levels. This requires the generation of models that inform response-based adaptations, support individually optimized delivery and enable biological monitoring to provide decision support to clinicians. The goal is expanding to models that can drive the realization of personalized therapy for optimal outcomes. This position paper provides the workshop experts' propositions, which describe how innovations in biology, physics, mathematics, and data science including AI could inform models and improve predictions. It consolidates the DREAM team's consensus on scientific priorities and organizational requirements. Scientifically, it stresses the need for rigorous, multifaceted model development, comprehensive validation, and clinical applicability and significance. Organizationally, it reinforces the prerequisites of interdisciplinary research and collaboration between physicians, medical physicists, radiobiologists, and computational scientists throughout model development. Only through a shared understanding of clinical needs, biological mechanisms, and computational methods can more informed models be created. The future research environment and support must facilitate this integrative method of operation across multiple disciplines.

    Tight regulation of ubiquitin-mediated DNA damage response by USP3 preserves the functional integrity of hematopoietic stem cells

    Histone ubiquitination at DNA breaks is required for activation of the DNA damage response (DDR) and DNA repair. How the dynamic removal of this modification by deubiquitinating enzymes (DUBs) impacts genome maintenance in vivo is largely unknown. To address this question, we generated mice deficient for Ub-specific protease 3 (USP3; Usp3Δ/Δ), a histone H2A DUB which negatively regulates ubiquitin-dependent DDR signaling. Notably, USP3 deletion increased the levels of histone ubiquitination in adult tissues, reduced the hematopoietic stem cell (HSC) reserves over time, and shortened animal life span. Mechanistically, our data show that USP3 is important in HSC homeostasis, preserving HSC self-renewal, and repopulation potential in vivo and proliferation in vitro. A defective DDR and unresolved spontaneous DNA damage contribute to cell cycle restriction of Usp3Δ/Δ HSCs. Beyond the hematopoietic system, Usp3Δ/Δ animals spontaneously developed tumors, and primary Usp3Δ/Δ cells failed to preserve chromosomal integrity. These findings broadly support the regulation of chromatin ubiquitination as a key pathway in preserving tissue function through modulation of the response to genotoxic stress.

    Non-homologous DNA end joining in normal and cancer cells and its dependence on break structures

    DNA double-strand breaks (DSBs) are a serious threat to the cell, for if left unrepaired or misrepaired, they can lead to chromosomal aberration, mutation and cancer. DSBs in human cells are repaired via non-homologous DNA end joining (NHEJ) and homologous recombination repair pathways. In the former process, the structure of DNA termini plays an important role, as does the genetic constitution of the cells, which differs between normal and pathological cells. In order to investigate the dependence of NHEJ on DSB structure in normal and cancer cells, we used linearized plasmids with various, complementary or non-complementary, single-stranded or blunt DNA termini, as well as whole-cell extract isolated from normal human lymphocytes, chronic myeloid leukemia K562 cells and lung cancer A549 cells. We observed a pronounced variability in the efficacy of the NHEJ reaction depending on the type of ends. Plasmids with complementary and blunt termini were more efficiently repaired than the substrate with 3' protruding single-strand ends. The hierarchy of the effectiveness of NHEJ was on average, from the most effective to the least: A549, normal lymphocytes, K562. Our results suggest that the genetic constitution of the cells together with the substrate terminal structure may contribute to the efficacy of the NHEJ reaction. This should be taken into account when considering its applicability in cancer chemo- or radiotherapy by pharmacologically modulating NHEJ cellular responses.

    From learning taxonomies to phylogenetic learning: Integration of 16S rRNA gene data into FAME-based bacterial classification

    Background: Machine learning techniques have been shown to improve bacterial species classification based on fatty acid methyl ester (FAME) data. Nonetheless, FAME analysis has a limited resolution for discrimination of bacteria at the species level. In this paper, we approach the species classification problem from a taxonomic point of view. Such a taxonomy or tree is typically obtained by applying clustering algorithms on FAME data or on 16S rRNA gene data. The knowledge gained from the tree can then be used to evaluate FAME-based classifiers, resulting in a novel framework for bacterial species classification. Results: In view of learning in a taxonomic framework, we consider two types of trees. First, a FAME tree is constructed with a supervised divisive clustering algorithm. Subsequently, based on 16S rRNA gene sequence analysis, phylogenetic trees are inferred by the NJ and UPGMA methods. In this second approach, the species classification problem is based on the combination of two different types of data. Herein, 16S rRNA gene sequence data is used for phylogenetic tree inference and the corresponding binary tree splits are learned based on FAME data. We call this learning approach 'phylogenetic learning'. Supervised Random Forest models are developed to train the classification tasks in a stratified cross-validation setting. In this way, better classification results are obtained for species that are typically hard to distinguish by a single or flat multi-class classification model. Conclusions: FAME-based bacterial species classification is successfully evaluated in a taxonomic framework. Although the proposed approach does not improve the overall accuracy compared to flat multi-class classification, it has some distinct advantages. First, it has better capabilities for distinguishing species on which flat multi-class classification fails. Secondly, the hierarchical classification structure makes it easy to evaluate and visualize the resolution of FAME data for the discrimination of bacterial species. In summary, by phylogenetic learning we are able to situate and evaluate FAME-based bacterial species classification in a more informative context.