
    Discovering Patterns of Interest in IP Traffic Using Cliques in Bipartite Link Streams

    Studying IP traffic is crucial for many applications. We focus here on the detection of (structurally and temporally) dense sequences of interactions that may indicate botnets or coordinated network scans. More precisely, we model a MAWI capture of IP traffic as a link stream, i.e. a sequence of interactions (t_1, t_2, u, v) meaning that devices u and v exchanged packets from time t_1 to time t_2. This traffic is captured on a single router and so has a bipartite structure: links occur only between nodes in two disjoint sets. We design a method for finding interesting bipartite cliques in such link streams, i.e. two sets of nodes and a time interval such that all nodes in the first set are linked to all nodes in the second set throughout the time interval. We then explore the bipartite cliques present in the considered trace. Comparison with the MAWILab classification of anomalous IP addresses shows that the found cliques succeed in detecting anomalous network activity.
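
The bipartite clique definition in this abstract can be illustrated with a minimal sketch. This is not the authors' method for finding cliques; it only checks whether a given candidate satisfies the definition. The function names `covers` and `is_bipartite_clique` are hypothetical, links are assumed to be undirected, and several links between the same pair are assumed to count as continuous when their intervals touch or overlap.

```python
from collections import defaultdict

def covers(intervals, start, end):
    """Return True if the union of closed intervals covers [start, end]."""
    reach = start
    for b, e in sorted(intervals):
        if e < start:
            continue          # interval ends before the window begins
        if b > reach:
            return False      # gap between covered part and next interval
        reach = max(reach, e)
        if reach >= end:
            return True
    return False              # no interval reached the end of the window

def is_bipartite_clique(links, left, right, start, end):
    """Check the definition: every node in `left` is linked to every node
    in `right` throughout [start, end].

    `links` is an iterable of (t1, t2, u, v) tuples, as in the abstract.
    Assumption: links are undirected, so (u, v) and (v, u) are the same pair.
    """
    if not left or not right:
        return False
    by_pair = defaultdict(list)
    for t1, t2, u, v in links:
        by_pair[frozenset((u, v))].append((t1, t2))
    return all(
        covers(by_pair.get(frozenset((u, v)), []), start, end)
        for u in left
        for v in right
    )

# Toy link stream (made-up data, not taken from the MAWI trace).
links = [
    (0, 10, "a", "x"),
    (0, 10, "a", "y"),
    (2, 8, "b", "x"),
    (2, 5, "b", "y"),
    (5, 8, "b", "y"),
]
print(is_bipartite_clique(links, {"a", "b"}, {"x", "y"}, 2, 8))
print(is_bipartite_clique(links, {"a", "b"}, {"x", "y"}, 1, 8))
```

In the toy data, the candidate holds on the interval from 2 to 8 but fails once the interval is widened to start at 1, because node b has no link before time 2.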

    Optimizing surveillance for livestock disease spreading through animal movements

    The spatial propagation of many livestock infectious diseases critically depends on the animal movements among premises; so the knowledge of movement data may help us to detect, manage and control an outbreak. The identification of robust spreading features of the system is however hampered by the temporal dimension characterizing population interactions through movements. Traditional centrality measures do not provide relevant information as results strongly fluctuate in time and outbreak properties heavily depend on geotemporal initial conditions. By focusing on the case study of cattle displacements in Italy, we aim at characterizing livestock epidemics in terms of robust features useful for planning and control, to deal with temporal fluctuations, sensitivity to initial conditions and missing information during an outbreak. Through spatial disease simulations, we detect spreading paths that are stable across different initial conditions, allowing the clustering of the seeds and reducing the epidemic variability. Paths also allow us to identify premises, called sentinels, having a large probability of being infected and providing critical information on the outbreak origin, as encoded in the clusters. This novel procedure provides a general framework that can be applied to specific diseases, for aiding risk assessment analysis and informing the design of optimal surveillance systems. Comment: Supplementary Information at https://sites.google.com/site/paolobajardi/Home/archive/optimizing_surveillance_ESM_l.pdf?attredirects=

    Advances in Learning and Understanding with Graphs through Machine Learning

    Graphs have increasingly become a crucial way of representing large, complex and disparate datasets from a range of domains, including many scientific disciplines. Graphs are particularly useful at capturing complex relationships or interdependencies within or even between datasets, and enable unique insights which are not possible with other data formats. Over recent years, significant improvements in the ability of machine learning approaches to automatically learn from and identify patterns in datasets have been made. However, due to the unique nature of graphs, and the data they are used to represent, employing machine learning with graphs has thus far proved challenging. A review of relevant literature has revealed that key challenges include issues arising with macro-scale graph learning, interpretability of machine learned representations and a failure to incorporate the temporal dimension present in many datasets. Thus, the work and contributions presented in this thesis primarily investigate how modern machine learning techniques can be adapted to tackle key graph mining tasks, with a particular focus on optimal macro-level representation, interpretability and incorporating temporal dynamics into the learning process. The majority of methods employed are novel approaches centered around attempting to use artificial neural networks in order to learn from graph datasets. Firstly, by devising a novel graph fingerprint technique, it is demonstrated that this can successfully be applied to two different tasks whilst out-performing established baselines, namely graph comparison and classification. Secondly, it is shown that a mapping can be found between certain topological features and graph embeddings. This, for perhaps the first time, suggests that it is possible that machines are learning something analogous to human knowledge acquisition, thus bringing interpretability to the graph embedding process. Thirdly, in exploring two new models for incorporating temporal information into the graph learning process, it is found that including such information is crucial to predictive performance in certain key tasks, such as link prediction, where state-of-the-art baselines are out-performed. The overall contribution of this work is to provide greater insight into and explanation of the ways in which machine learning with respect to graphs is emerging as a crucial set of techniques for understanding complex datasets. This is important as these techniques can potentially be applied to a broad range of scientific disciplines. The thesis concludes with an assessment of limitations and recommendations for future research.

    A Descriptive Framework for Temporal Data Visualizations Based on Generalized Space-Time Cubes

    We present the generalized space-time cube, a descriptive model for visualizations of temporal data. Visualizations are described as operations on the cube, which transform the cube's 3D shape into readable 2D visualizations. Operations include extracting subparts of the cube, flattening it across space or time, or transforming the cube's geometry and content. We introduce a taxonomy of elementary space-time cube operations and explain how these operations can be combined and parameterized. The generalized space-time cube has two properties: (1) it is purely conceptual without the need to be implemented, and (2) it applies to all datasets that can be represented in two dimensions plus time (e.g. geo-spatial, videos, networks, multivariate data). The proper choice of space-time cube operations depends on many factors, for example, density or sparsity of a cube. Hence, we propose a characterization of structures within space-time cubes, which allows us to discuss strengths and limitations of operations. We finally review interactive systems that support multiple operations, allowing a user to customize his view on the data. With this framework, we hope to facilitate the description, criticism and comparison of temporal data visualizations, as well as encourage the exploration of new techniques and systems. This paper is an extension of Bach et al.'s (2014) work.

    Graph Deep Learning: State of the Art and Challenges

    The last half-decade has seen a surge in deep learning research on irregular domains and efforts to extend convolutional neural networks (CNNs) to work on irregularly structured data. The graph has emerged as a particularly useful geometrical object in deep learning, able to represent a variety of irregular domains well. Graphs can represent various complex systems, from molecular structure to computer, social and traffic networks. Following the extension of CNNs to graphs, a great amount of research has been published that improves the inferential power and computational efficiency of graph-based convolutional neural networks (GCNNs). The research is incipient, however, and our understanding is relatively rudimentary. The majority of GCNNs are designed to operate with certain properties. In this survey we review the state of graph representation learning from the perspective of deep learning. We consider challenges in graph deep learning that have been neglected in the majority of work, largely because of the numerous theoretical difficulties they present. We identify four major challenges in graph deep learning: dynamic and evolving graphs, learning with edge signals and information, graph estimation, and the generalization of graph models. For each problem we discuss the theoretical and practical issues and survey the relevant research, while highlighting the limitations of the state of the art. Advances on these challenges would permit GCNNs to be applied to a wider range of domains, in situations where graph models have previously been limited by obstructions arising from the nature of those domains.

    Networks in Archaeology: Phenomena, Abstraction, Representation

    The application of method and theory from network science to archaeology has dramatically increased over the last decade. In this article, we document this growth over time, discuss several of the important concepts that are used in the application of network approaches to archaeology, and introduce the other articles in this special issue on networks in archaeology. We argue that the suitability and contribution of network science techniques within particular archaeological research contexts can be usefully explored by scrutinizing the past phenomena under study, how these are abstracted into concepts, and how these in turn are represented as network data. For this reason, each of the articles in this special issue is discussed in terms of the phenomena that they seek to address, the abstraction in terms of concepts that they use to study connectivity, and the representations of network data that they employ in their analyses. The approaches currently being used are diverse and interdisciplinary, which we think is evidence of a healthy exploratory stage in the application of network science in archaeology. To facilitate further innovation, application, and collaboration, we also provide a glossary of terms that are currently being used in network science and especially those in the applications to archaeological case studies.