
    Scalable Text and Link Analysis with Mixed-Topic Link Models

    Many data sets contain rich information about objects, as well as pairwise relations between them. For instance, in networks of websites, scientific papers, and other documents, each node has content consisting of a collection of words, as well as hyperlinks or citations to other nodes. To perform inference on such data sets, and to make predictions and recommendations, it is useful to have models that capture the processes which generate the text at each node and the links between nodes. In this paper, we combine classic ideas in topic modeling with a variant of the mixed-membership block model recently developed in the statistical physics community. The resulting model has the advantage that its parameters, including the mixture of topics of each document and the resulting overlapping communities, can be inferred with a simple and scalable expectation-maximization algorithm. We test our model on three data sets, performing unsupervised topic classification and link prediction. For both tasks, our model outperforms several existing state-of-the-art methods, achieving higher accuracy with significantly less computation: it analyzes a data set with 1.3 million words and 44 thousand links in a few minutes. Comment: 11 pages, 4 figures

    An Analysis of Optimal Link Bombs

    We analyze the phenomenon of collusion for the purpose of boosting the pagerank of a node in an interlinked environment. We investigate the optimal attack pattern for a group of nodes (attackers) attempting to improve the ranking of a specific node (the victim). We consider attacks where the attackers can only manipulate their own outgoing links. We show that the optimal attacks in this scenario are uncoordinated, i.e. the attackers link directly to the victim and to no one else; in particular, the attackers do not link to each other. We also discuss optimal attack patterns for a group that wants to hide itself by not pointing directly to the victim. In these disguised attacks, the attackers link to nodes l hops away from the victim. We show that an optimal disguised attack exists and how it can be computed. The optimal disguised attack also allows us to find optimal link farm configurations. A link farm can be considered a special case of our approach: the target page of the link farm is the victim, and the other nodes in the link farm are the attackers working to improve the rank of the victim. The target page can, however, control its own outgoing links to improve its own rank, which can be modeled as an optimal disguised attack of 1 hop on itself. Our results are unique in the literature, as we show optimality not only in the pagerank score but also in the rank based on the pagerank score. We further validate our results with experiments on a variety of random graph models. Comment: Full Version of a version which appeared in AIRweb 200
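
    The claim that uncoordinated attacks are optimal can be illustrated on a toy graph. The following is a minimal sketch, not the authors' method: it uses a plain power-iteration pagerank with an assumed damping factor of 0.85, and the four-node graph (victim "v", bystander "b", attackers "a1" and "a2") is hypothetical.

```python
def pagerank(out_links, damping=0.85, iterations=100):
    """Power-iteration pagerank for a dict mapping node -> list of out-links."""
    nodes = sorted(out_links)
    n = len(nodes)
    rank = {v: 1.0 / n for v in nodes}
    for _ in range(iterations):
        # Every node receives the uniform teleportation share first.
        new_rank = {v: (1.0 - damping) / n for v in nodes}
        for v in nodes:
            targets = out_links[v]
            if targets:
                share = damping * rank[v] / len(targets)
                for t in targets:
                    new_rank[t] += share
            else:
                # Dangling node: spread its rank uniformly over all nodes.
                share = damping * rank[v] / n
                for t in nodes:
                    new_rank[t] += share
        rank = new_rank
    return rank


# Hypothetical toy graph: attackers control only their own outgoing links.
direct = {"v": ["b"], "b": ["v"], "a1": ["v"], "a2": ["v"]}
coordinated = {"v": ["b"], "b": ["v"], "a1": ["v", "a2"], "a2": ["v", "a1"]}

direct_rank = pagerank(direct)
coordinated_rank = pagerank(coordinated)
print("victim score, direct attack:     ", direct_rank["v"])
print("victim score, coordinated attack:", coordinated_rank["v"])
```

    In this toy setting the victim's score is higher when the attackers link only to the victim, because links between attackers divert part of each attacker's score away from the victim.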

    Link-space formalism for network analysis

    We introduce the link-space formalism for analyzing network models with degree-degree correlations. The formalism is based on a statistical description of the fraction of links l_{i,j} connecting nodes of degrees i and j. To demonstrate its use, we apply the framework to some pedagogical network models, namely random attachment, Barabási-Albert preferential attachment, and the classical Erdős-Rényi random graph. For these three models the link-space matrix can be solved analytically. We apply the formalism to a simple one-parameter growing network model whose numerical solution exemplifies the effect of degree-degree correlations on the resulting degree distribution. We also employ the formalism to derive the degree distributions of two very simple network decay models, specifically random link deletion and random node deletion. The formalism allows detailed analysis of the correlations within networks, and we also employ it to derive the form of a perfectly non-assortative network for an arbitrary degree distribution. Comment: This updated version has been expanded to include a number of new results. 19 pages, 11 figures. Minor typos corrected
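
    As a rough illustration of the quantity l_{i,j}, the sketch below measures it empirically from an undirected edge list. The function name `link_space_matrix` and the convention of storing each pair with i <= j are assumptions made for this example; the paper itself works with analytic solutions rather than this counting procedure.

```python
from collections import Counter


def link_space_matrix(edges):
    """Return {(i, j): fraction of links joining a degree-i and a degree-j node}.

    Assumes an undirected simple graph given as (u, v) pairs; each pair of
    degrees is stored once with i <= j.
    """
    degree = Counter()
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1

    counts = Counter()
    for u, v in edges:
        i, j = sorted((degree[u], degree[v]))
        counts[(i, j)] += 1

    total = len(edges)
    return {pair: c / total for pair, c in counts.items()}


# Path graph 0-1-2-3: end nodes have degree 1, inner nodes have degree 2.
path_edges = [(0, 1), (1, 2), (2, 3)]
print(link_space_matrix(path_edges))
```

    For the path graph, two of the three links join a degree-1 node to a degree-2 node and one link joins two degree-2 nodes, which is the kind of degree-degree correlation the formalism tracks.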

    Link Graph Analysis for Adult Images Classification

    To protect the users of an image search engine from undesirable results, a classifier for adult images should be built. Information about links from websites to images is used to create such a classifier. These links are represented as a bipartite website-image graph. Each vertex is assigned scores of adultness and decentness. The scores for image vertices are initialized to zero; those for website vertices are initialized according to a text-based website classifier. An iterative algorithm that propagates scores within the website-image graph is described. The resulting scores are used to classify images by choosing an appropriate threshold. Experiments on Internet-scale data show that, at the same precision level, the algorithm increases classification recall by 17% compared with a simple algorithm that classifies an image as adult if it is connected to at least one adult site. Comment: 7 pages. Young Scientists Conference, 4th Russian Summer School in Information Retrieval
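
    The abstract does not state the update rule, so the following is only a sketch of one plausible propagation scheme under stated assumptions: an image takes the mean score of the sites linking to it, and a site blends its text-classifier prior with the mean score of its images. The names `propagate_scores` and `classify`, the `mix` weight, and the decision rule (adultness minus decentness above a threshold) are all hypothetical.

```python
def propagate_scores(links, site_prior, rounds=10, mix=0.5):
    """Propagate (adultness, decentness) scores over a site-image bipartite graph.

    links: list of (site, image) pairs.
    site_prior: dict site -> (adultness, decentness) from a text classifier.
    The averaging update below is an assumed rule, not the paper's.
    """
    images_of, sites_of = {}, {}
    for site, image in links:
        images_of.setdefault(site, []).append(image)
        sites_of.setdefault(image, []).append(site)

    def mean(pairs):
        n = len(pairs)
        return (sum(p[0] for p in pairs) / n, sum(p[1] for p in pairs) / n)

    site_score = {s: site_prior.get(s, (0.0, 0.0)) for s in images_of}
    image_score = {img: (0.0, 0.0) for img in sites_of}  # images start at zero

    for _ in range(rounds):
        # Images inherit the average score of the sites that link to them.
        for img, sites in sites_of.items():
            image_score[img] = mean([site_score[s] for s in sites])
        # Sites blend their prior with the average score of their images.
        for s, imgs in images_of.items():
            neighbour = mean([image_score[i] for i in imgs])
            prior = site_prior.get(s, (0.0, 0.0))
            site_score[s] = tuple(
                mix * p + (1.0 - mix) * q for p, q in zip(prior, neighbour)
            )
    return image_score


def classify(image_score, threshold=0.0):
    """Flag an image when adultness exceeds decentness by more than threshold."""
    return {img: (a - d) > threshold for img, (a, d) in image_score.items()}


# Hypothetical data: one site flagged by the text classifier, one clean site.
links = [("site_a", "img1"), ("site_a", "img2"), ("site_d", "img2"), ("site_d", "img3")]
priors = {"site_a": (1.0, 0.0), "site_d": (0.0, 1.0)}
labels = classify(propagate_scores(links, priors))
print(labels["img1"], labels["img3"])
```

    Unlike the baseline described in the abstract, which flags any image connected to at least one adult site, a propagated score lets an image linked from both kinds of sites fall near the threshold rather than being flagged outright.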

    Signed Link Analysis in Social Media Networks

    Numerous real-world relations can be represented by signed networks with positive links (e.g., trust) and negative links (e.g., distrust). Link analysis plays a crucial role in understanding link formation and can advance various tasks in social network analysis, such as link prediction. The majority of existing work on link analysis has focused on unsigned social networks. The existence of negative links means that the properties and principles of signed networks are substantially distinct from those of unsigned networks, so link analysis in signed social networks requires dedicated effort. In this paper, following the use of social theories in link analysis for unsigned networks, we adopt three social science theories, namely Emotional Information, Diffusion of Innovations, and Individual Personality, to guide the task of link analysis in signed networks. Comment: In the 10th International AAAI Conference on Web and Social Media (ICWSM-16)

    Link Analysis Using Sequential Pattern Discovery for Weather Prediction

    The aim of this study is to analyze the associations between attributes (itemsets) of the parameters used in weather prediction, in order to produce rules that can establish whether a given condition indicates rain or no rain. The method used for this inter-attribute analysis is Sequential Pattern Mining, which works by analyzing how one attribute influences another and how the attributes depend on one another. The end result of the study is the discovery of knowledge patterns hidden in the data. These patterns take the form of rules that can help determine whether or not it will rain today.

    The Communications link analysis and simulation system (CLASS)

    The Communications Link Analysis and Simulation System (CLASS) is a comprehensive, computerized communications and tracking system analysis tool under development by the Networks Directorate of the NASA/GSFC. The primary use of this system is to predict the performance of the Tracking and Data Relay Satellite System (TDRSS) User Communications and Tracking links through the TDRSS. The general capabilities and operational philosophy of the current and final versions of the CLASS are described, along with some examples of analyses that have been performed using this system.