Predicting Sequences of Traversed Nodes in Graphs using Network Models with Multiple Higher Orders

Author: Casiraghi Giona
Gote Christoph
Scholtes Ingo
Schweitzer Frank
Publication date: 13/07/2020

We propose a novel sequence prediction method for sequential data capturing node traversals in graphs. Our method builds on a statistical modelling framework that combines multiple higher-order network models into a single multi-order model. We develop a technique to fit such multi-order models in empirical sequential data and to select the optimal maximum order. Our framework facilitates both next-element and full sequence prediction given a sequence-prefix of any length. We evaluate our model based on six empirical data sets containing sequences from website navigation as well as public transport systems. The results show that our method out-performs state-of-the-art algorithms for next-element prediction. We further demonstrate the accuracy of our method during out-of-sample sequence prediction and validate that our method can scale to data sets with millions of sequences.

Comment: 18 pages, 5 figures, 2 tables

arXiv.org e-Print Archive

Understanding Complex Systems: From Networks to Optimal Higher-Order Models

Author: Lambiotte Renaud
Rosvall Martin
Scholtes Ingo
Publication date: 01/01/2018

To better understand the structure and function of complex systems, researchers often represent direct interactions between components in complex systems with networks, assuming that indirect influence between distant components can be modelled by paths. Such network models assume that actual paths are memoryless. That is, the way a path continues as it passes through a node does not depend on where it came from. Recent studies of data on actual paths in complex systems question this assumption and instead indicate that memory in paths does have considerable impact on central methods in network science. A growing research community working with so-called higher-order network models addresses this issue, seeking to take advantage of information that conventional network representations disregard. Here we summarise the progress in this area and outline remaining challenges calling for more research.

Comment: 8 pages, 4 figures

arXiv.org e-Print Archive

Oxford University Research Archive

Predicting Off-target Effects in CRISPR-Cas9 System using Graph Convolutional Network

Author: Vinodkumar Prasoon Kumar
Publication venue: Tartu Ülikool
Publication date: 01/01/2021

CRISPR-Cas9 is a powerful genome editing technology that has been widely applied in target gene repair and gene expression regulation. One of the main challenges for the CRISPR-Cas9 system is the occurrence of unexpected cleavage at some sites (off-targets) and predicting them is necessary due to its relevance in gene editing research. Very few deep learning models have been developed so far that predict the off-target propensity of single guide RNA (sgRNA) at specific DNA fragments by using artificial feature extract operations and machine learning techniques. Unfortunately, they implement a convoluted process that is difficult to understand and implement by researchers. This thesis focuses on developing a novel graph-based approach to predict off-target efficacy of sgRNA in CRISPR-Cas9 system that is easy to understand and replicate by researchers. This is achieved by creating a graph with sequences as nodes and by performing link prediction using Graph Convolutional Network (GCN) to predict the presence of links between sgRNA and off-target inducing target DNA sequences. Features for the sequences are extracted from within the sequences

DSpace at Tartu University Library

Locating Community Smells in Software Development Processes Using Higher-Order Network Centralities

Author: Arzig Carsten
Casiraghi Giona
Gote Christoph
Perri Vincenzo
Scholtes Ingo
Schweitzer Frank
von Gernler Alexander
Zingg Christian
Publication date: 14/09/2023

Community smells are negative patterns in software development teams' interactions that impede their ability to successfully create software. Examples are team members working in isolation, lack of communication and collaboration across departments or sub-teams, or areas of the codebase where only a few team members can work on. Current approaches aim to detect community smells by analysing static network representations of software teams' interaction structures. In doing so, they are insufficient to locate community smells within development processes. Extending beyond the capabilities of traditional social network analysis, we show that higher-order network models provide a robust means of revealing such hidden patterns and complex relationships. To this end, we develop a set of centrality measures based on the MOGen higher-order network model and show their effectiveness in predicting influential nodes using five empirical datasets. We then employ these measures for a comprehensive analysis of a product team at the German IT security company genua GmbH, showcasing our method's success in identifying and locating community smells. Specifically, we uncover critical community smells in two areas of the team's development process. Semi-structured interviews with five team members validate our findings: while the team was aware of one community smell and employed measures to address it, it was not aware of the second. This highlights the potential of our approach as a robust tool for identifying and addressing community smells in software development teams. More generally, our work contributes to the social network analysis field with a powerful set of higher-order network centralities that effectively capture community dynamics and indirect relationships.

Comment: 48 pages, 19 figures, 4 tables; accepted at Social Network Analysis and Mining (SNAM)

arXiv.org e-Print Archive

Repository for Publications and Research Data

Disease spread through animal movements: a static and temporal network analysis of pig trade in Germany

Author: Conraths Franz J.
Gethmann Jörn
Hövel Philipp
Koher Andreas
Lentz Hartmut H. K.
Sauter-Louis Carola
Selhorst Thomas
Publication venue: Public Library of Science (PLoS)
Publication date: 02/03/2016

Background: Animal trade plays an important role for the spread of infectious diseases in livestock populations. As a case study, we consider pig trade in Germany, where trade actors (agricultural premises) form a complex network. The central question is how infectious diseases can potentially spread within the system of trade contacts. We address this question by analyzing the underlying network of animal movements.

Methodology/Findings: The considered pig trade dataset spans several years and is analyzed with respect to its potential to spread infectious diseases. Focusing on measurements of network-topological properties, we avoid the usage of external parameters, since these properties are independent of specific pathogens. They are on the contrary of great importance for understanding any general spreading process on this particular network. We analyze the system using different network models, which include varying amounts of information: (i) static network, (ii) network as a time series of uncorrelated snapshots, (iii) temporal network, where causality is explicitly taken into account.

Findings: Our approach provides a general framework for a topological-temporal characterization of livestock trade networks. We find that a static network view captures many relevant aspects of the trade system, and premises can be classified into two clearly defined risk classes. Moreover, our results allow for an efficient allocation strategy for intervention measures using centrality measures. Data on trade volume does barely alter the results and is therefore of secondary importance. Although a static network description yields useful results, the temporal resolution of data plays an outstanding role for an in-depth understanding of spreading processes. This applies in particular for an accurate calculation of the maximum outbreak size.

Comment: main text 33 pages, 17 figures, supporting information 7 pages, 7 figures

arXiv.org e-Print Archive

Directory of Open Access Journals

FigShare

Terrain and Behavior Modeling for Projecting Multistage Cyber Attacks

Author: Argauer Brian
Fava Daniel
Holsopple Jared
Yang Shanchieh Jay
Publication venue: RIT Scholar Works
Publication date: 01/01/2007

Contributions from the information fusion community have enabled comprehensible traces of intrusion alerts occurring on computer networks. Traced or tracked cyber attacks are the bases for threat projection in this work. Due to its complexity, we separate threat projection into two subtasks: predicting likely next targets and predicting attacker behavior. A virtual cyber terrain is proposed for identifying likely targets. Overlaying traced alerts onto the cyber terrain reveals exposed vulnerabilities, services, and hosts. Meanwhile, a novel attempt to extract cyber attack behavior is discussed. Leveraging traditional work on prediction and compression, this work identifies behavior patterns from traced cyber attack data. The extracted behavior patterns are expected to further refine projections deduced from the cyber terrain

CiteSeerX

RIT Scholar Works

Route Planning in Transportation Networks

Author: Bast Hannah
Delling Daniel
Goldberg Andrew
Müller-Hannemann Matthias
Pajor Thomas
Sanders Peter
Wagner Dorothea
Werneck Renato F.
Publication date: 20/04/2015

We survey recent advances in algorithms for route planning in transportation networks. For road networks, we show that one can compute driving directions in milliseconds or less even at continental scale. A variety of techniques provide different trade-offs between preprocessing effort, space requirements, and query time. Some algorithms can answer queries in a fraction of a microsecond, while others can deal efficiently with real-time traffic. Journey planning on public transportation systems, although conceptually similar, is a significantly harder problem due to its inherent time-dependent and multicriteria nature. Although exact algorithms are fast enough for interactive queries on metropolitan transit systems, dealing with continent-sized instances requires simplifications or heavy preprocessing. The multimodal route planning problem, which seeks journeys combining schedule-based transportation (buses, trains) with unrestricted modes (walking, driving), is even harder, relying on approximate solutions even for metropolitan inputs.

Comment: This is an updated version of the technical report MSR-TR-2014-4, previously published by Microsoft Research. This work was mostly done while the authors Daniel Delling, Andrew Goldberg, and Renato F. Werneck were at Microsoft Research Silicon Valley

arXiv.org e-Print Archive

CiteSeerX