4,243 research outputs found

    Considerations about multistep community detection

    Full text link
    The problem and implications of community detection in networks have raised a huge attention, for its important applications in both natural and social sciences. A number of algorithms has been developed to solve this problem, addressing either speed optimization or the quality of the partitions calculated. In this paper we propose a multi-step procedure bridging the fastest, but less accurate algorithms (coarse clustering), with the slowest, most effective ones (refinement). By adopting heuristic ranking of the nodes, and classifying a fraction of them as `critical', a refinement step can be restricted to this subset of the network, thus saving computational time. Preliminary numerical results are discussed, showing improvement of the final partition.Comment: 12 page

    Embedding Graphs under Centrality Constraints for Network Visualization

    Full text link
    Visual rendering of graphs is a key task in the mapping of complex network data. Although most graph drawing algorithms emphasize aesthetic appeal, certain applications such as travel-time maps place more importance on visualization of structural network properties. The present paper advocates two graph embedding approaches with centrality considerations to comply with node hierarchy. The problem is formulated first as one of constrained multi-dimensional scaling (MDS), and it is solved via block coordinate descent iterations with successive approximations and guaranteed convergence to a KKT point. In addition, a regularization term enforcing graph smoothness is incorporated with the goal of reducing edge crossings. A second approach leverages the locally-linear embedding (LLE) algorithm which assumes that the graph encodes data sampled from a low-dimensional manifold. Closed-form solutions to the resulting centrality-constrained optimization problems are determined yielding meaningful embeddings. Experimental results demonstrate the efficacy of both approaches, especially for visualizing large networks on the order of thousands of nodes.Comment: Submitted to IEEE Transactions on Visualization and Computer Graphic

    Complex Networks and Symmetry I: A Review

    Get PDF
    In this review we establish various connections between complex networks and symmetry. While special types of symmetries (e.g., automorphisms) are studied in detail within discrete mathematics for particular classes of deterministic graphs, the analysis of more general symmetries in real complex networks is far less developed. We argue that real networks, as any entity characterized by imperfections or errors, necessarily require a stochastic notion of invariance. We therefore propose a definition of stochastic symmetry based on graph ensembles and use it to review the main results of network theory from an unusual perspective. The results discussed here and in a companion paper show that stochastic symmetry highlights the most informative topological properties of real networks, even in noisy situations unaccessible to exact techniques.Comment: Final accepted versio

    Clustering and Community Detection in Directed Networks: A Survey

    Full text link
    Networks (or graphs) appear as dominant structures in diverse domains, including sociology, biology, neuroscience and computer science. In most of the aforementioned cases graphs are directed - in the sense that there is directionality on the edges, making the semantics of the edges non symmetric. An interesting feature that real networks present is the clustering or community structure property, under which the graph topology is organized into modules commonly called communities or clusters. The essence here is that nodes of the same community are highly similar while on the contrary, nodes across communities present low similarity. Revealing the underlying community structure of directed complex networks has become a crucial and interdisciplinary topic with a plethora of applications. Therefore, naturally there is a recent wealth of research production in the area of mining directed graphs - with clustering being the primary method and tool for community detection and evaluation. The goal of this paper is to offer an in-depth review of the methods presented so far for clustering directed networks along with the relevant necessary methodological background and also related applications. The survey commences by offering a concise review of the fundamental concepts and methodological base on which graph clustering algorithms capitalize on. Then we present the relevant work along two orthogonal classifications. The first one is mostly concerned with the methodological principles of the clustering algorithms, while the second one approaches the methods from the viewpoint regarding the properties of a good cluster in a directed network. Further, we present methods and metrics for evaluating graph clustering results, demonstrate interesting application domains and provide promising future research directions.Comment: 86 pages, 17 figures. Physics Reports Journal (To Appear

    Finding Streams in Knowledge Graphs to Support Fact Checking

    Full text link
    The volume and velocity of information that gets generated online limits current journalistic practices to fact-check claims at the same rate. Computational approaches for fact checking may be the key to help mitigate the risks of massive misinformation spread. Such approaches can be designed to not only be scalable and effective at assessing veracity of dubious claims, but also to boost a human fact checker's productivity by surfacing relevant facts and patterns to aid their analysis. To this end, we present a novel, unsupervised network-flow based approach to determine the truthfulness of a statement of fact expressed in the form of a (subject, predicate, object) triple. We view a knowledge graph of background information about real-world entities as a flow network, and knowledge as a fluid, abstract commodity. We show that computational fact checking of such a triple then amounts to finding a "knowledge stream" that emanates from the subject node and flows toward the object node through paths connecting them. Evaluation on a range of real-world and hand-crafted datasets of facts related to entertainment, business, sports, geography and more reveals that this network-flow model can be very effective in discerning true statements from false ones, outperforming existing algorithms on many test cases. Moreover, the model is expressive in its ability to automatically discover several useful path patterns and surface relevant facts that may help a human fact checker corroborate or refute a claim.Comment: Extended version of the paper in proceedings of ICDM 201

    LINE: Large-scale Information Network Embedding

    Full text link
    This paper studies the problem of embedding very large information networks into low-dimensional vector spaces, which is useful in many tasks such as visualization, node classification, and link prediction. Most existing graph embedding methods do not scale for real world information networks which usually contain millions of nodes. In this paper, we propose a novel network embedding method called the "LINE," which is suitable for arbitrary types of information networks: undirected, directed, and/or weighted. The method optimizes a carefully designed objective function that preserves both the local and global network structures. An edge-sampling algorithm is proposed that addresses the limitation of the classical stochastic gradient descent and improves both the effectiveness and the efficiency of the inference. Empirical experiments prove the effectiveness of the LINE on a variety of real-world information networks, including language networks, social networks, and citation networks. The algorithm is very efficient, which is able to learn the embedding of a network with millions of vertices and billions of edges in a few hours on a typical single machine. The source code of the LINE is available online.Comment: WWW 201

    Are there any good digraph width measures?

    Full text link
    Several different measures for digraph width have appeared in the last few years. However, none of them shares all the "nice" properties of treewidth: First, being \emph{algorithmically useful} i.e. admitting polynomial-time algorithms for all \MS1-definable problems on digraphs of bounded width. And, second, having nice \emph{structural properties} i.e. being monotone under taking subdigraphs and some form of arc contractions. As for the former, (undirected) \MS1 seems to be the least common denominator of all reasonably expressive logical languages on digraphs that can speak about the edge/arc relation on the vertex set.The latter property is a necessary condition for a width measure to be characterizable by some version of the cops-and-robber game characterizing the ordinary treewidth. Our main result is that \emph{any reasonable} algorithmically useful and structurally nice digraph measure cannot be substantially different from the treewidth of the underlying undirected graph. Moreover, we introduce \emph{directed topological minors} and argue that they are the weakest useful notion of minors for digraphs

    Large Graph Analysis in the GMine System

    Full text link
    Current applications have produced graphs on the order of hundreds of thousands of nodes and millions of edges. To take advantage of such graphs, one must be able to find patterns, outliers and communities. These tasks are better performed in an interactive environment, where human expertise can guide the process. For large graphs, though, there are some challenges: the excessive processing requirements are prohibitive, and drawing hundred-thousand nodes results in cluttered images hard to comprehend. To cope with these problems, we propose an innovative framework suited for any kind of tree-like graph visual design. GMine integrates (a) a representation for graphs organized as hierarchies of partitions - the concepts of SuperGraph and Graph-Tree; and (b) a graph summarization methodology - CEPS. Our graph representation deals with the problem of tracing the connection aspects of a graph hierarchy with sub linear complexity, allowing one to grasp the neighborhood of a single node or of a group of nodes in a single click. As a proof of concept, the visual environment of GMine is instantiated as a system in which large graphs can be investigated globally and locally
    • …
    corecore