Scalable Text and Link Analysis with Mixed-Topic Link Models
Many data sets contain rich information about objects, as well as pairwise
relations between them. For instance, in networks of websites, scientific
papers, and other documents, each node has content consisting of a collection
of words, as well as hyperlinks or citations to other nodes. In order to
perform inference on such data sets, and make predictions and recommendations,
it is useful to have models that are able to capture the processes which
generate the text at each node and the links between them. In this paper, we
combine classic ideas in topic modeling with a variant of the mixed-membership
block model recently developed in the statistical physics community. The
resulting model has the advantage that its parameters, including the mixture of
topics of each document and the resulting overlapping communities, can be
inferred with a simple and scalable expectation-maximization algorithm. We test
our model on three data sets, performing unsupervised topic classification and
link prediction. For both tasks, our model outperforms several existing
state-of-the-art methods, achieving higher accuracy with significantly less
computation, analyzing a data set with 1.3 million words and 44 thousand links
in a few minutes. Comment: 11 pages, 4 figures
An Analysis of Optimal Link Bombs
We analyze the phenomenon of collusion for the purpose of boosting the
pagerank of a node in an interlinked environment. We investigate the optimal
attack pattern for a group of nodes (attackers) attempting to improve the
ranking of a specific node (the victim). We consider attacks where the
attackers can only manipulate their own outgoing links. We show that the
optimal attacks in this scenario are uncoordinated, i.e. the attackers link
directly to the victim and to no one else; in particular, the attackers do not link to each other. We
also discuss optimal attack patterns for a group that wants to hide itself by
not pointing directly to the victim. In these disguised attacks, the attackers
link to nodes a fixed number of hops away from the victim. We show that an optimal disguised
attack exists and how it can be computed. The optimal disguised attack also
allows us to find optimal link farm configurations. A link farm can be
considered a special case of our approach: the target page of the link farm is
the victim and the other nodes in the link farm are the attackers for the
purpose of improving the rank of the victim. The target page can however
control its own outgoing links for the purpose of improving its own rank, which
can be modeled as an optimal disguised attack of 1-hop on itself. Our results
are unique in the literature as we show optimality not only in the pagerank
score, but also in the rank based on the pagerank score. We further validate
our results with experiments on a variety of random graph models. Comment: Full Version of a version which appeared in AIRweb 200
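
The claim that uncoordinated attacks are optimal can be checked on a toy graph. The following is a minimal sketch, not the authors' construction: the five-node graph, the damping factor of 0.85, and the uniform handling of dangling nodes are all assumptions made for illustration. It compares the victim's pagerank when two attackers link only to the victim against the case where they also link to each other.

```python
def pagerank(out_links, damping=0.85, iterations=200):
    """Power-iteration PageRank over a dict {node: [targets]}."""
    nodes = sorted(out_links)
    n = len(nodes)
    rank = {v: 1.0 / n for v in nodes}
    for _ in range(iterations):
        # Assumption: rank held by nodes without out-links is spread uniformly.
        dangling = sum(rank[v] for v in nodes if not out_links[v])
        new_rank = {v: (1.0 - damping) / n + damping * dangling / n for v in nodes}
        for v in nodes:
            for target in out_links[v]:
                new_rank[target] += damping * rank[v] / len(out_links[v])
        rank = new_rank
    return rank


# Hypothetical toy graph: node 0 is the victim, nodes 1 and 2 are attackers,
# nodes 3 and 4 are background pages outside the attackers' control.
background = {0: [3], 3: [4], 4: [3]}

# Uncoordinated attack: each attacker links to the victim and to no one else.
direct = {**background, 1: [0], 2: [0]}
# Coordinated variant: the attackers additionally link to each other.
colluding = {**background, 1: [0, 2], 2: [0, 1]}

victim_direct = pagerank(direct)[0]
victim_colluding = pagerank(colluding)[0]
print(victim_direct, victim_colluding)
```

In this toy setting, every link an attacker spends on a fellow attacker splits the rank it would otherwise pass to the victim, which is why the direct pattern gives the victim the higher score.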
Link-space formalism for network analysis
We introduce the link-space formalism for analyzing network models with
degree-degree correlations. The formalism is based on a statistical description
of the fraction of links l_{i,j} connecting nodes of degrees i and j. To
demonstrate its use, we apply the framework to some pedagogical network models,
namely, random-attachment, Barabasi-Albert preferential attachment and the
classical Erdos and Renyi random graph. For these three models the link-space
matrix can be solved analytically. We apply the formalism to a simple
one-parameter growing network model whose numerical solution exemplifies the
effect of degree-degree correlations for the resulting degree distribution. We
also employ the formalism to derive the degree distributions of two very simple
network decay models, more specifically, that of random link deletion and
random node deletion. The formalism allows detailed analysis of the
correlations within networks and we also employ it to derive the form of a
perfectly non-assortative network for arbitrary degree distribution. Comment: This updated version has been expanded to include a number of new
results. 19 pages, 11 figures. Minor typos corrected
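
To make the quantity l_{i,j} concrete, here is a minimal sketch that measures it empirically from an edge list. The normalization is an assumption: each undirected link is counted once in each orientation and the entries are scaled to sum to 1, which may differ from the convention used in the paper. The function name `link_space_matrix` and the star graph are hypothetical.

```python
from collections import Counter


def link_space_matrix(edges):
    """Return {(i, j): fraction of link ends joining a degree-i node to a degree-j node}."""
    degree = Counter()
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1

    counts = Counter()
    for u, v in edges:
        # Assumption: count both orientations so the matrix is symmetric.
        counts[(degree[u], degree[v])] += 1
        counts[(degree[v], degree[u])] += 1

    total = sum(counts.values())
    return {pair: c / total for pair, c in counts.items()}


# Hypothetical example: a star with hub 0 (degree 3) and three leaves (degree 1).
star = [(0, 1), (0, 2), (0, 3)]
l = link_space_matrix(star)
print(l)
```

Every link in the star joins a degree-1 node to a degree-3 node, so all of the weight sits on the two off-diagonal entries (1, 3) and (3, 1), a simple case of strong degree-degree correlation.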
Link Graph Analysis for Adult Images Classification
To protect an image search engine's users from undesirable results,
a classifier for adult images should be built. The information about links from
websites to images is employed to create such a classifier. These links are
represented as a bipartite website-image graph. Each vertex is equipped with
scores of adultness and decentness. The scores for image vertices are
initialized to zero; those for website vertices are initialized according to
a text-based website classifier. An iterative algorithm that propagates scores
within a website-image graph is described. The scores obtained are used to
classify images by choosing an appropriate threshold. The experiments on
Internet-scale data have shown that the algorithm under consideration increases
classification recall by 17% in comparison with a simple algorithm which
classifies an image as adult if it is connected with at least one adult site
(at the same precision level). Comment: 7 pages. Young Scientists Conference, 4th Russian Summer School in
Information Retrieval
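
The abstract does not give the propagation rule, so the following is only a sketch under stated assumptions: it tracks a single adultness score rather than the paper's adultness and decentness pair, it sets an image's score to the mean of its linking sites' scores, and it blends each site's text-classifier prior with the mean score of its images. The names `propagate_adultness`, `baseline_is_adult`, `prior_weight`, and the toy data are all hypothetical. The baseline function implements the simple rule described above: an image is adult if at least one adult site links to it.

```python
def propagate_adultness(site_to_images, site_prior, rounds=10, prior_weight=0.5):
    """Propagate adultness scores over a bipartite website-image graph."""
    images = sorted({img for imgs in site_to_images.values() for img in imgs})
    image_to_sites = {
        img: [s for s, imgs in site_to_images.items() if img in imgs]
        for img in images
    }
    site_score = dict(site_prior)               # sites start from the text classifier
    image_score = {img: 0.0 for img in images}  # images start at zero, as in the source

    for _ in range(rounds):
        # Assumed update: an image takes the mean score of the sites linking to it.
        for img in images:
            sites = image_to_sites[img]
            image_score[img] = sum(site_score[s] for s in sites) / len(sites)
        # Assumed update: a site blends its prior with the mean score of its images.
        for s, imgs in site_to_images.items():
            if imgs:
                mean = sum(image_score[i] for i in imgs) / len(imgs)
                site_score[s] = prior_weight * site_prior[s] + (1 - prior_weight) * mean
    return image_score


def baseline_is_adult(img, site_to_images, site_prior, site_threshold=0.5):
    """Simple baseline: adult if at least one adult site links to the image."""
    return any(
        img in imgs and site_prior[s] >= site_threshold
        for s, imgs in site_to_images.items()
    )


# Hypothetical data: site "a" is classified as adult, sites "b" and "c" as decent.
site_to_images = {"a": ["x", "y"], "b": ["y", "z"], "c": ["y"]}
site_prior = {"a": 0.9, "b": 0.1, "c": 0.1}

scores = propagate_adultness(site_to_images, site_prior)
adult_images = [img for img, s in scores.items() if s >= 0.5]  # assumed threshold
print(scores, adult_images)
```

Image "y" is linked from one adult site and two decent ones. The baseline flags it because of the single adult link, while the propagated score weighs all three sites, which illustrates how propagation can trade off the evidence differently from the one-link rule.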
Signed Link Analysis in Social Media Networks
Numerous real-world relations can be represented by signed networks with
positive links (e.g., trust) and negative links (e.g., distrust). Link analysis
plays a crucial role in understanding the link formation and can advance
various tasks in social network analysis such as link prediction. The majority
of existing works on link analysis have focused on unsigned social networks.
Because negative links exist, the properties and principles of
signed networks differ substantially from those of unsigned networks,
so link analysis in signed social networks requires dedicated effort. In
this paper, following the use of social theories in link analysis of unsigned networks, we
adopt three social science theories, namely Emotional Information, Diffusion of
Innovations and Individual Personality, to guide the task of link analysis in
signed networks. Comment: In the 10th International AAAI Conference on Web and Social Media
(ICWSM-16)
Link Analysis Using Sequential Pattern Discovery for Weather Prediction
The aim of this study is to analyze the associations between attributes/itemsets among the parameters used in weather prediction, in order to produce rules that can establish whether a given condition means rain or no rain. The method used for this inter-attribute analysis is Sequential Pattern Mining, which works by analyzing how one attribute influences another and how the attributes depend on one another. The end result of this study is the discovery of knowledge patterns hidden in the data. These patterns take the form of rules that can help determine whether or not it will rain today.
The Communications link analysis and simulation system (CLASS)
The Communications Link Analysis and Simulation System (CLASS) is a comprehensive, computerized communications and tracking system analysis tool under development by the Networks Directorate of the NASA/GSFC. The primary use of this system is to provide the capability to predict the performance of the Tracking and Data Relay Satellite System (TDRSS) User Communications and Tracking links through the TDRSS. The general capabilities and operational philosophy of the current and final versions of the CLASS are described along with some examples of analyses which have been performed utilizing the capabilities of this system.