1,289 research outputs found

    Towards a Theory of Scale-Free Graphs: Definition, Properties, and Implications (Extended Version)

    Although the "scale-free" literature is large and growing, it gives neither a precise definition of scale-free graphs nor rigorous proofs of many of their claimed properties. In fact, it is easily shown that the existing theory has many inherent contradictions and verifiably false claims. In this paper, we propose a new, mathematically precise, and structural definition of the extent to which a graph is scale-free, and prove a series of results that recover many of the claimed properties while suggesting the potential for a rich and interesting theory. With this definition, scale-free (or its opposite, scale-rich) is closely related to other structural graph properties such as various notions of self-similarity (or respectively, self-dissimilarity). Scale-free graphs are also shown to be the likely outcome of random construction processes, consistent with the heuristic definitions implicit in existing random graph approaches. Our approach clarifies much of the confusion surrounding the sensational qualitative claims in the scale-free literature, and offers rigorous and quantitative alternatives.
Comment: 44 pages, 16 figures. The primary version is to appear in Internet Mathematics (2005).

    Optimizing voxel scale graph theoretical analysis of fMRI-derived resting state functional connectivity

    The analysis of neural functional connectivity from resting-state MRI data using techniques derived from graph theoretical foundations has recently attracted a significant amount of research interest. The bulk of such work done to date focuses on relatively small graphs, derived by partitioning the brain into regions of interest. In this thesis we develop tools leveraging high-performance computing and methods for analyzing "whole brain" graphs in which we consider each grey-matter voxel in the brain to be an individual graph vertex. Based on 26 resting-state fMRI datasets, we then empirically determine optimal sets of graph metrics for large graphs under varying assumptions, followed by an investigation of the robustness of these metrics as assumptions are varied. We then demonstrate the application of our methods to the question of hierarchical organization in prefrontal cortex. We conclude by describing a technique for significantly reducing the size of our graphs while losing as little useful information as possible.

    Network Infrastructures in the Dark Web

    With the appearance of the Internet, open to everyone in 1991, criminals saw a big opportunity in moving their organisations to the World Wide Web, taking advantage of these infrastructures as they allowed higher mobility and scalability. Later on, in the year 2000, the first system appeared, creating what is known today as the Dark Web. This layer of the World Wide Web quickly became the go-to option when criminals wanted to sell and deliver content such as match-fixing, child pornography, drug markets, gun markets, etc. This obscure side of the Dark Web makes it a relevant topic to study in order to tackle this huge network and help to identify these malicious activities and actors. In this master thesis, it is shown through the study of two datasets from the Dark Web that we are surrounded by capable technologies that can be applied to these types of problems in order to increase our knowledge about the data and reveal interesting characteristics in an interactive and useful way. One dataset has 10 000 relations from domains living in the Dark Web, and the other dataset has thousands of data items from just 11 specific domains from the Dark Web. We reveal detailed information about each dataset by applying different analyses and data mining algorithms. For the first dataset, we studied domain availability patterns with temporal analysis, categorised domains with machine learning neural networks, and revealed the network topology and node relevance with social network analysis and a core-periphery model. Regarding the second dataset, we created a cross-matching information web graph and applied a named entity recognition algorithm, which resulted in a tool for identifying entities within the studied Dark Web domains.
All of these approaches culminated in an interactive web application where we publicly display not only the entire research but also the tools developed along with the project (https://darkor.org).

    Mobile app recommendations using deep learning and big data

    Dissertation presented as the partial requirement for obtaining a Master's degree in Statistics and Information Management, specialization in Marketing Research and CRM.
Recommender systems were first introduced to solve information overload problems in enterprises. Over the last decades, recommender systems have found applications in several major websites related to e-commerce, music and video streaming, travel and movie sites, social media and mobile app stores. Several methods have been proposed over the years to build recommender systems. The most popular approaches are based on collaborative filtering techniques, which leverage the similarities between consumer tastes. But the current state of the art in recommender systems is deep-learning methods, which can leverage not only item consumption data but also content, context, and user attributes. Mobile app stores generate data with Big Data properties from app consumption data, behavioral, geographic, demographic, social network and user-generated content data, which includes reviews, comments and search queries. In this dissertation, we propose a deep-learning architecture for recommender systems in mobile app stores that leverages most of these data sources. We analyze three issues: the impact of the data sources, the impact of embedding layer pretraining, and the efficiency of using Kernel methods to improve app scoring at a Big Data scale. An experiment is conducted on a Portuguese Android app store. Results suggest that models can be improved by combining structured and unstructured data. The results also suggest that embedding layer pretraining is essential to obtain good results. Some evidence is provided showing that Kernel-based methods might not be efficient when deployed in Big Data contexts.

    Scalable Algorithms for the Analysis of Massive Networks

    Network analysis aims to unveil non-trivial insights from networked data by studying relationship patterns between the entities of a network. Among these insights, a popular one is to quantify the importance of an entity with respect to the others according to some criteria. Another one is to find the most suitable matching partner for each participant of a network, knowing the pairwise preferences of the participants to be matched with each other; this is known as Maximum Weighted Matching (MWM). Since the notion of importance is tied to the application under consideration, numerous centrality measures have been introduced. Many of these measures, however, were conceived in a time when computing power was very limited and networks were much smaller than today's, and thus scalability to large datasets was not considered. Today, massive networks with millions of edges are ubiquitous, and a complete, exact computation of traditional centrality measures is often too time-consuming. This issue is amplified if our objective is to find the group of k vertices that is the most central as a group. Scalable algorithms to identify highly central (groups of) vertices on massive graphs are thus of pivotal importance for large-scale network analysis. In addition to their size, today's networks often evolve over time, which poses the challenge of efficiently updating results after a change occurs. Hence, efficient dynamic algorithms are essential for modern network analysis pipelines. In this work, we propose scalable algorithms for identifying important vertices in a network, and for efficiently updating them in evolving networks. In real-world graphs with hundreds of millions of edges, most of our algorithms require seconds to a few minutes to perform these tasks, which is a clear improvement over the state of the art. Further, we extend a state-of-the-art algorithm for MWM to dynamic graphs.
Experiments show that our dynamic MWM algorithm handles updates in graphs with billions of edges in milliseconds.

    Graph Annotations in Modeling Complex Network Topologies

    The coarsest approximation of the structure of a complex network, such as the Internet, is a simple undirected unweighted graph. This approximation, however, loses too much detail. In reality, objects represented by vertices and edges in such a graph possess some non-trivial internal structure that varies across and differentiates among distinct types of links or nodes. In this work, we abstract such additional information as network annotations. We introduce a network topology modeling framework that treats annotations as an extended correlation profile of a network. Assuming we have this profile measured for a given network, we present an algorithm to rescale it in order to construct networks of varying size that still reproduce the original measured annotation profile. Using this methodology, we accurately capture the network properties essential for realistic simulations of network applications and protocols, or any other simulations involving complex network topologies, including modeling and simulation of network evolution. We apply our approach to the Autonomous System (AS) topology of the Internet annotated with business relationships between ASs. This topology captures the large-scale structure of the Internet. In-depth understanding of this structure and tools to model it are cornerstones of research on future Internet architectures and designs. We find that our techniques are able to accurately capture the structure of annotation correlations within this topology, thus reproducing a number of its important properties in synthetically-generated random graphs.

    Small-World Brain Networks Revisited.

    It is nearly 20 years since the concept of a small-world network was first quantitatively defined, by a combination of high clustering and short path length; and about 10 years since this metric of complex network topology began to be widely applied to analysis of neuroimaging and other neuroscience data as part of the rapid growth of the new field of connectomics. Here, we review briefly the foundational concepts of graph theoretical estimation and generation of small-world networks. We take stock of some of the key developments in the field in the past decade and we consider in some detail the implications of recent studies using high-resolution tract-tracing methods to map the anatomical networks of the macaque and the mouse. In doing so, we draw attention to the important methodological distinction between topological analysis of binary or unweighted graphs, which have provided a popular but simple approach to brain network analysis in the past, and the topology of weighted graphs, which retain more biologically relevant information and are more appropriate to the increasingly sophisticated data on brain connectivity emerging from contemporary tract-tracing and other imaging studies. We conclude by highlighting some possible future trends in the further development of weighted small-worldness as part of a deeper and broader understanding of the topology and the functional value of the strong and weak links between areas of mammalian cortex.
DSB acknowledges support from the John D. and Catherine T. MacArthur Foundation, the Alfred P. Sloan Foundation, the Army Research Laboratory and the Army Research Office through contract numbers W911NF-10-2-0022 and W911NF-14-1-0679, the National Institute of Health (2-R01-DC-009209-11, 1R01HD086888-01, R01-MH107235, R01-MH107703, and R21-M MH-106799), the Office of Naval Research, and the National Science Foundation (BCS-1441502, CAREER PHY-1554488, and BCS-1631550).
This is the final version of the article. It first appeared from Sage at http://dx.doi.org/10.1177/1073858416667720
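
The two quantities that the abstract above uses to define a small-world network, clustering and path length, can be illustrated with a minimal sketch on a binary (unweighted) graph. Everything below is an assumption made for illustration and is not taken from the review: the function names are hypothetical, and the ring lattice with one added shortcut is simply a convenient toy graph. The weighted measures that the review argues for are not covered here.

```python
from collections import deque


def ring_lattice(n, k):
    """Undirected ring in which each node links to its k nearest neighbours on each side."""
    adj = {i: set() for i in range(n)}
    for i in range(n):
        for step in range(1, k + 1):
            j = (i + step) % n
            adj[i].add(j)
            adj[j].add(i)
    return adj


def average_clustering(adj):
    """Mean over nodes of the fraction of neighbour pairs that are themselves linked."""
    total = 0.0
    for nbrs in adj.values():
        d = len(nbrs)
        if d < 2:
            continue  # nodes with fewer than two neighbours contribute 0
        links = sum(1 for u in nbrs for v in nbrs if u < v and v in adj[u])
        total += 2.0 * links / (d * (d - 1))
    return total / len(adj)


def average_path_length(adj):
    """Mean shortest-path length (in hops) over all reachable ordered node pairs, via BFS."""
    total, pairs = 0, 0
    for source in adj:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        total += sum(dist.values())
        pairs += len(dist) - 1
    return total / pairs


lattice = ring_lattice(20, 2)
print(average_clustering(lattice))    # 0.5 for this lattice
print(average_path_length(lattice))   # 55/19, about 2.89

shortcut = ring_lattice(20, 2)
shortcut[0].add(10)                   # one long-range link across the ring
shortcut[10].add(0)
print(average_path_length(shortcut))  # smaller than the lattice value
```

In this toy case a single long-range link shortens the average path while most of the local clustering remains, which is the combination the small-world definition refers to.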