1,621 research outputs found

    Distribution of PageRank Mass Among Principle Components of the Web

    Get PDF
    We study the PageRank mass of principal components in a bow-tie Web Graph, as a function of the damping factor c. Using a singular perturbation approach, we show that the PageRank share of IN and SCC components remains high even for very large values of the damping factor, in spite of the fact that it drops to zero when c goes to one. However, a detailed study of the OUT component reveals the presence ``dead-ends'' (small groups of pages linking only to each other) that receive an unfairly high ranking when c is close to one. We argue that this problem can be mitigated by choosing c as small as 1/2

    Random Surfing Without Teleportation

    Full text link
    In the standard Random Surfer Model, the teleportation matrix is necessary to ensure that the final PageRank vector is well-defined. The introduction of this matrix, however, results in serious problems and imposes fundamental limitations to the quality of the ranking vectors. In this work, building on the recently proposed NCDawareRank framework, we exploit the decomposition of the underlying space into blocks, and we derive easy to check necessary and sufficient conditions for random surfing without teleportation.Comment: 13 pages. Published in the Volume: "Algorithms, Probability, Networks and Games, Springer-Verlag, 2015". (The updated version corrects small typos/errors

    LexRank: Graph-based Lexical Centrality as Salience in Text Summarization

    Full text link
    We introduce a stochastic graph-based method for computing relative importance of textual units for Natural Language Processing. We test the technique on the problem of Text Summarization (TS). Extractive TS relies on the concept of sentence salience to identify the most important sentences in a document or set of documents. Salience is typically defined in terms of the presence of particular important words or in terms of similarity to a centroid pseudo-sentence. We consider a new approach, LexRank, for computing sentence importance based on the concept of eigenvector centrality in a graph representation of sentences. In this model, a connectivity matrix based on intra-sentence cosine similarity is used as the adjacency matrix of the graph representation of sentences. Our system, based on LexRank ranked in first place in more than one task in the recent DUC 2004 evaluation. In this paper we present a detailed analysis of our approach and apply it to a larger data set including data from earlier DUC evaluations. We discuss several methods to compute centrality using the similarity graph. The results show that degree-based methods (including LexRank) outperform both centroid-based methods and other systems participating in DUC in most of the cases. Furthermore, the LexRank with threshold method outperforms the other degree-based techniques including continuous LexRank. We also show that our approach is quite insensitive to the noise in the data that may result from an imperfect topical clustering of documents

    Clustering and Community Detection in Directed Networks: A Survey

    Full text link
    Networks (or graphs) appear as dominant structures in diverse domains, including sociology, biology, neuroscience and computer science. In most of the aforementioned cases graphs are directed - in the sense that there is directionality on the edges, making the semantics of the edges non symmetric. An interesting feature that real networks present is the clustering or community structure property, under which the graph topology is organized into modules commonly called communities or clusters. The essence here is that nodes of the same community are highly similar while on the contrary, nodes across communities present low similarity. Revealing the underlying community structure of directed complex networks has become a crucial and interdisciplinary topic with a plethora of applications. Therefore, naturally there is a recent wealth of research production in the area of mining directed graphs - with clustering being the primary method and tool for community detection and evaluation. The goal of this paper is to offer an in-depth review of the methods presented so far for clustering directed networks along with the relevant necessary methodological background and also related applications. The survey commences by offering a concise review of the fundamental concepts and methodological base on which graph clustering algorithms capitalize on. Then we present the relevant work along two orthogonal classifications. The first one is mostly concerned with the methodological principles of the clustering algorithms, while the second one approaches the methods from the viewpoint regarding the properties of a good cluster in a directed network. Further, we present methods and metrics for evaluating graph clustering results, demonstrate interesting application domains and provide promising future research directions.Comment: 86 pages, 17 figures. Physics Reports Journal (To Appear

    Complex Beauty

    Get PDF
    Complex systems and their underlying convoluted networks are ubiquitous, all we need is an eye for them. They pose problems of organized complexity which cannot be approached with a reductionist method. Complexity science and its emergent sister network science both come to grips with the inherent complexity of complex systems with an holistic strategy. The relevance of complexity, however, transcends the sciences. Complex systems and networks are the focal point of a philosophical, cultural and artistic turn of our tightly interrelated and interdependent postmodern society. Here I take a different, aesthetic perspective on complexity. I argue that complex systems can be beautiful and can the object of artification - the neologism refers to processes in which something that is not regarded as art in the traditional sense of the word is changed into art. Complex systems and networks are powerful sources of inspiration for the generative designer, for the artful data visualizer, as well as for the traditional artist. I finally discuss the benefits of a cross-fertilization between science and art
    • …
    corecore