12 research outputs found

    HENA, heterogeneous network-based data set for Alzheimer's disease.

    Alzheimer's disease and other types of dementia are the leading cause of disability in later life, and various types of experiments have been performed to understand the underlying mechanisms of the disease, with the aim of identifying potential drug targets. These experiments have been carried out by scientists working in different domains, such as proteomics, molecular biology, clinical diagnostics and genomics. The results of such experiments are stored in databases designed for collecting data of similar types. However, to get a systematic view of the disease from these independent but complementary data sets, it is necessary to combine them. In this study we describe a heterogeneous network-based data set for Alzheimer's disease (HENA). Additionally, we demonstrate the application of state-of-the-art graph convolutional networks, i.e. deep learning methods, to the analysis of such large heterogeneous biological data sets. We expect HENA to allow scientists to explore and analyze their own results in the broader context of Alzheimer's disease research.

    Metric Labeling and Semi-metric Embedding for Protein Annotation Prediction

    Computational techniques have been successful at predicting protein function from relational data (functional or physical interactions). These prediction techniques have been used to generate hypotheses and to direct experimental validation. With few exceptions, these predictive tasks are modeled as multi-label classification problems where the labels (functions) are treated independently or semi-independently. However, databases such as the Gene Ontology provide more information about the similarities between functions. It is a largely open question how much the use of relationships between functions can improve the quality of function prediction techniques. In this paper, we explore the use of the Metric Labeling combinatorial optimization problem, together with heuristically computed distances between functions, to make more accurate predictions of protein function in networks derived from both physical interactions and a combination of other data types. To do this, we give a new technique (based on convex optimization) for converting heuristic semimetric distances (from, e.g., the Gene Ontology) into a metric by finding an embedding of the semimetric into a metric with minimum least-squares distortion (LSD). The Metric Labeling approach is shown to outperform five existing techniques for inferring function from networks. These results suggest that Metric Labeling is useful for protein function prediction, and that our LSD minimization approach can help solve the problem of converting heuristic distances to a metric.
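
    To make the semimetric-to-metric step in the abstract above concrete, here is a minimal sketch. It is not the paper's convex-optimization method: it substitutes a simpler, well-known repair, the shortest-path closure (Floyd-Warshall), which always yields distances that satisfy the triangle inequality but is generally not the minimum-LSD metric. The function names and the toy distance matrix are hypothetical, chosen only for illustration.

```python
import itertools


def triangle_violations(d):
    """Return index triples (i, j, k) where d[i][j] > d[i][k] + d[k][j]."""
    n = len(d)
    return [
        (i, j, k)
        for i, j, k in itertools.permutations(range(n), 3)
        if d[i][j] > d[i][k] + d[k][j] + 1e-12
    ]


def shortest_path_closure(d):
    """Repair a symmetric semimetric by replacing each distance with the
    shortest-path distance (Floyd-Warshall). This is a stand-in technique,
    not the least-squares-distortion method described in the abstract."""
    n = len(d)
    m = [row[:] for row in d]  # copy so the input is left unchanged
    for k in range(n):
        for i in range(n):
            for j in range(n):
                if m[i][k] + m[k][j] < m[i][j]:
                    m[i][j] = m[i][k] + m[k][j]
    return m


def squared_distortion(d, m):
    """Sum of squared differences over unordered pairs (the LSD objective)."""
    n = len(d)
    return sum((d[i][j] - m[i][j]) ** 2 for i in range(n) for j in range(i + 1, n))


# Toy semimetric over three hypothetical function labels: the distance
# between labels 0 and 2 (5.0) exceeds the detour through label 1 (1.0 + 1.0).
semimetric = [
    [0.0, 1.0, 5.0],
    [1.0, 0.0, 1.0],
    [5.0, 1.0, 0.0],
]

metric = shortest_path_closure(semimetric)
print(triangle_violations(semimetric))
print(triangle_violations(metric))
print(squared_distortion(semimetric, metric))
```

    The closure only ever shrinks distances, whereas a least-squares formulation may both shrink and stretch entries to spread the change across several pairs, which is why the two generally disagree.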