51 research outputs found

Graph Convolutional Networks for Road Networks

Author: Andersen Ove
Clevert Djork-Arné
Diederik
Glorot Xavier
Hamilton William
Thomas
Veličković Petar
Yu Bing
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 05/11/2019

Machine learning techniques for road networks hold the potential to facilitate many important transportation applications. Graph Convolutional Networks (GCNs) are neural networks that can leverage the structure of a road network by using information from, e.g., adjacent road segments. While state-of-the-art GCNs target node classification tasks in social, citation, and biological networks, machine learning tasks in road networks differ substantially from such tasks. In road networks, prediction tasks concern edges representing road segments, and many tasks involve regression. In addition, road networks differ substantially from the networks assumed in the GCN literature in terms of the attribute information available and the network characteristics. Many implicit assumptions of GCNs therefore do not apply. We introduce the notion of Relational Fusion Network (RFN), a novel type of GCN designed specifically for machine learning on road networks. In particular, we propose methods that outperform state-of-the-art GCNs on both a road segment regression task and a road segment classification task by 32-40% and 21-24%, respectively. In addition, we provide experimental evidence of the shortcomings of state-of-the-art GCNs in the context of road networks: unlike our method, they cannot effectively leverage the road network structure for road segment classification and fail to outperform a regular multi-layer perceptron.

Comment: Ten-page pre-print version of a four-page ACM SIGSPATIAL 2019 poster paper.
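The abstract above notes that road-network tasks concern edges (road segments) and that a GCN draws on adjacent segments. As a minimal sketch of that general idea only, and not of the RFN method itself, the following treats two segments as adjacent when they share an intersection and averages each segment's features with its neighbours'. The segment ids, intersection ids, and speed-like feature values are made up for illustration.

```python
from collections import defaultdict


def segment_adjacency(segments):
    """Map each segment id to the ids of segments sharing an intersection with it."""
    by_intersection = defaultdict(set)
    for seg_id, (start, end) in segments.items():
        by_intersection[start].add(seg_id)
        by_intersection[end].add(seg_id)
    adjacency = {seg_id: set() for seg_id in segments}
    for seg_ids in by_intersection.values():
        for seg_id in seg_ids:
            adjacency[seg_id] |= seg_ids - {seg_id}
    return adjacency


def mean_aggregate(features, adjacency):
    """One generic GCN-style step: average a segment's features with its neighbours'."""
    aggregated = {}
    for seg_id, neighbours in adjacency.items():
        group = [features[seg_id]] + [features[n] for n in sorted(neighbours)]
        dim = len(features[seg_id])
        aggregated[seg_id] = [
            sum(vec[i] for vec in group) / len(group) for i in range(dim)
        ]
    return aggregated


# Hypothetical toy network: three segments in a row, joined at intersections 2 and 3.
segments = {"a": (1, 2), "b": (2, 3), "c": (3, 4)}
features = {"a": [30.0], "b": [50.0], "c": [70.0]}  # illustrative one-dimensional feature

adjacency = segment_adjacency(segments)
smoothed = mean_aggregate(features, adjacency)
print(smoothed)
```

A real GCN layer would follow the aggregation with a learned linear map and a nonlinearity; the sketch keeps only the neighbourhood averaging so the edge-centric adjacency is easy to see.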

arXiv.org e-Print Archive

Crossref

VBN

Hybrid Spatio-Temporal Graph Convolutional Network: Improving Traffic Prediction with Navigation Data

Author: Bast Hannah
Ben-Akiva Moshe
Bruna Joan
Clevert Djork-Arné
He Jingrui
Kingma Diederik P
Li Yaguang
Lv Yisheng
Tsymbal Alexey
Vlahogianni Eleni I
Publication date: 22/06/2020

Traffic forecasting has recently attracted increasing interest due to the popularity of online navigation services, ridesharing and smart city projects. Owing to the non-stationary nature of road traffic, forecasting accuracy is fundamentally limited by the lack of contextual information. To address this issue, we propose the Hybrid Spatio-Temporal Graph Convolutional Network (H-STGCN), which is able to "deduce" future travel time by exploiting the data of upcoming traffic volume. Specifically, we propose an algorithm to acquire the upcoming traffic volume from an online navigation engine. Taking advantage of the piecewise-linear flow-density relationship, a novel transformer structure converts the upcoming volume into its equivalent in travel time. We combine this signal with the commonly utilized travel-time signal, and then apply graph convolution to capture the spatial dependency. In particular, we construct a compound adjacency matrix which reflects the innate traffic proximity. We conduct extensive experiments on real-world datasets. The results show that H-STGCN remarkably outperforms state-of-the-art methods in various metrics, especially for the prediction of non-recurring congestion.
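The abstract above relies on a piecewise-linear flow-density relationship to relate traffic volume to travel time. As a rough illustration of that relationship only, the sketch below uses the textbook triangular fundamental diagram with fixed parameters. It is not the learned structure described in the abstract, and all parameter values (free-flow speed, critical density, jam density, segment length) are assumptions chosen for the example.

```python
def flow(density, free_speed, critical_density, jam_density):
    """Triangular fundamental diagram: flow (veh/h) for a given density (veh/km)."""
    if density <= critical_density:
        # Free-flow branch: vehicles travel at the free-flow speed.
        return free_speed * density
    # Congested branch: flow falls linearly to zero at jam density.
    capacity = free_speed * critical_density
    wave_speed = capacity / (jam_density - critical_density)
    return wave_speed * (jam_density - density)


def travel_time_hours(length_km, density, free_speed, critical_density, jam_density):
    """Travel time over a segment, using speed = flow / density."""
    if density <= 0:
        return length_km / free_speed
    if density >= jam_density:
        return float("inf")
    speed = flow(density, free_speed, critical_density, jam_density) / density
    return length_km / speed


# Hypothetical parameters: 60 km/h free flow, 30 veh/km critical, 150 veh/km jam.
params = dict(free_speed=60.0, critical_density=30.0, jam_density=150.0)

free_flow_time = travel_time_hours(1.0, 20.0, **params)   # uncongested segment
congested_time = travel_time_hours(1.0, 90.0, **params)   # congested segment
print(free_flow_time * 60, congested_time * 60)  # minutes
```

With these assumed values, a density below the critical point yields the free-flow travel time, while a density on the congested branch yields a much longer one, which is the qualitative behaviour a volume-to-travel-time conversion has to capture.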

arXiv.org e-Print Archive

Crossref

cn.MOPS: mixture of Poissons for discovering copy number variations in next-generation sequencing data with a low false discovery rate

Author: Alkan
Andreas Mayr
Andreas Mitterecker
Bailey
Bentley
Boeva
Brown
Bullard
Campbell
Chiang
Clevert
DePristo
Djork-Arné Clevert
Dohm
Günter Klambauer
Harchaoui
Hochreiter
Ivakhno
Karin Schwarzbauer
Kim
Lander
Langmead
Le
Magi
Medvedev
Sathirapongsasuti
Sepp Hochreiter
Stratton
Sultan
Talloen
Ulrich Bodenhofer
Venkatraman
Wang
Wheeler
Xie
Yoon
Łabaj
Publication venue: Oxford University Press

Quantitative analyses of next-generation sequencing (NGS) data, such as the detection of copy number variations (CNVs), remain challenging. Current methods detect CNVs as changes in the depth of coverage along chromosomes. Technological or genomic variations in the depth of coverage thus lead to a high false discovery rate (FDR), even upon correction for GC content. In the context of association studies between CNVs and disease, a high FDR means many false CNVs, thereby decreasing the discovery power of the study after correction for multiple testing. We propose ‘Copy Number estimation by a Mixture Of PoissonS’ (cn.MOPS), a data processing pipeline for CNV detection in NGS data. In contrast to previous approaches, cn.MOPS incorporates modeling of depths of coverage across samples at each genomic position. Therefore, cn.MOPS is not affected by read count variations along chromosomes. Using a Bayesian approach, cn.MOPS decomposes variations in the depth of coverage across samples into integer copy numbers and noise by means of its mixture components and Poisson distributions, respectively. The noise estimate allows for reducing the FDR by filtering out detections having high noise that are likely to be false detections. We compared cn.MOPS with the five most popular methods for CNV detection in NGS data using four benchmark datasets: (i) simulated data, (ii) NGS data from a male HapMap individual with implanted CNVs from the X chromosome, (iii) data from HapMap individuals with known CNVs, and (iv) high coverage data from the 1000 Genomes Project. cn.MOPS outperformed its five competitors in terms of precision (1–FDR) and recall for both gains and losses in all benchmark datasets. The software cn.MOPS is publicly available as an R package at http://www.bioinf.jku.at/software/cnmops/ and at Bioconductor.
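The abstract above describes assigning integer copy numbers through the components of a Poisson mixture. The following is a heavily simplified sketch of that idea for a single read count: each copy number gets a Poisson component whose mean scales with the copy number, and Bayes' rule gives a posterior over copy numbers. It is not the cn.MOPS implementation (which fits its model across samples); the fixed priors, the expected count `lam` for copy number 2, and the small `eps` used for copy number 0 are all assumptions made for the example.

```python
import math


def poisson_log_pmf(count, mean):
    """Log of the Poisson probability mass function."""
    return count * math.log(mean) - mean - math.lgamma(count + 1)


def copy_number_posterior(read_count, lam, priors, eps=0.05):
    """Posterior over copy numbers 0..len(priors)-1 for one observed read count.

    lam is the expected read count at copy number 2 (assumed known here);
    eps keeps the copy-number-0 component's mean strictly positive.
    """
    log_weights = []
    for copy_number, prior in enumerate(priors):
        scale = copy_number / 2 if copy_number > 0 else eps / 2
        log_weights.append(math.log(prior) + poisson_log_pmf(read_count, lam * scale))
    top = max(log_weights)  # subtract the maximum for numerical stability
    weights = [math.exp(value - top) for value in log_weights]
    total = sum(weights)
    return [weight / total for weight in weights]


# Hypothetical inputs: prior mass concentrated on the normal copy number 2.
priors = [0.05, 0.10, 0.70, 0.10, 0.05]
posterior = copy_number_posterior(read_count=52, lam=100.0, priors=priors)
best = max(range(len(posterior)), key=posterior.__getitem__)
print(best, [round(p, 4) for p in posterior])
```

In this toy case a read count of about half the expected value points to copy number 1 despite the prior favouring copy number 2, which is the kind of evidence a loss call would rest on.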

Crossref

PubMed Central

A data science roadmap for open science organizations engaged in early-stage drug discovery

Author: Clevert Djork-Arné
Edfeldt Kristina
Edwards Aled M.
Engkvist Ola
Günther Judith
Haibe-Kains Benjamin
Hartley Matthew
Hulcoop David G.
Leach Andrew R.
Marsden Brian D.
Menge Amelie
Misquitta Leonie
Müller Susanne
Owen Dafydd R.
Schapira Matthieu
Schiavone Lovisa Holmberg
Schütt Kristof T.
Skelton Nicholas
Steffen Andreas
Tropsha Alexander
Vernet Erik
Wang Yanli
Wellnitz James
Willson Timothy M.
Publication venue: Nature Research
Publication date: 05/07/2024

The Structural Genomics Consortium is an international open science research organization with a focus on accelerating early-stage drug discovery, namely hit discovery and optimization. We, like many others, believe that artificial intelligence (AI) is poised to be a main accelerator in the field. The question is then how to best benefit from recent advances in AI and how to generate, format and disseminate data to enable future breakthroughs in AI-guided drug discovery. We present here the recommendations of a working group composed of experts from both the public and private sectors. Robust data management requires precise ontologies and standardized vocabulary, while a centralized database architecture across laboratories facilitates data integration into high-value datasets. Lab automation and opening electronic lab notebooks to data mining push the boundaries of data sharing and data modeling. Important considerations for building robust machine-learning models include transparent and reproducible data processing, choosing the most relevant data representation, defining the right training and test sets, and estimating prediction uncertainty. Beyond data sharing, cloud-based computing can be harnessed to build and disseminate machine-learning models. Important vectors of acceleration for hit and chemical probe discovery will be (1) the real-time integration of experimental data generation and modeling workflows within design-make-test-analyze (DMTA) cycles, openly and at scale, and (2) the adoption of a mindset where data scientists and experimentalists work as a unified team, and where data science is incorporated into the experimental design.

Oxford University Research Archive