835 research outputs found
Extraction and Analysis of Facebook Friendship Relations
Online Social Networks (OSNs) are a unique Web and social phenomenon, affecting the tastes and behaviors of their users and helping them to maintain and create friendships. It is interesting to analyze the growth and evolution of Online Social Networks both from the point of view of marketing and the offer of new services and from a scientific viewpoint, since their structure and evolution may share similarities with real-life social networks. In the social sciences, several techniques for analyzing (online) social networks have been developed, to evaluate quantitative properties (e.g., defining metrics and measures of structural characteristics of the networks) or qualitative aspects (e.g., studying the attachment model for the network evolution, the binary trust relationships, and the link prediction problem).
However, OSN analysis poses novel challenges to both computer and social scientists. We present our long-term research effort in analyzing Facebook, the largest and arguably most successful OSN today: it gathers more than 500 million users. Access to data about Facebook users and their friendship relations is restricted; thus, we acquired the necessary information directly from the front-end of the Web site, in order to reconstruct a sub-graph representing anonymous interconnections among a significant subset of users. We describe our ad-hoc, privacy-compliant crawler for Facebook data extraction. To minimize bias, we adopt two different graph mining techniques: breadth-first search (BFS) and rejection sampling. To analyze the structural properties of samples consisting of millions of nodes, we developed a specific tool for analyzing quantitative and qualitative properties of social networks, adopting and improving existing Social Network Analysis (SNA) techniques and algorithms.
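The BFS sampling mentioned in the abstract above can be illustrated with a minimal sketch. This is not the authors' crawler: the `bfs_sample` function, the `toy_graph` adjacency dictionary, and the `max_nodes` budget are hypothetical names chosen for illustration, and the example works on an in-memory graph rather than a live Web site.

```python
from collections import deque


def bfs_sample(adjacency, seed, max_nodes):
    """Return nodes in BFS discovery order from `seed`, capped at `max_nodes`.

    `adjacency` maps each node to an iterable of its neighbors (assumed input
    format; a real crawler would fetch friend lists on demand instead).
    """
    visited = {seed}
    order = [seed]
    queue = deque([seed])
    while queue and len(order) < max_nodes:
        node = queue.popleft()
        for neighbor in adjacency.get(node, ()):
            if neighbor not in visited:
                visited.add(neighbor)
                order.append(neighbor)
                queue.append(neighbor)
                if len(order) >= max_nodes:
                    break  # sampling budget exhausted
    return order


# Hypothetical five-user friendship graph (undirected, stored both ways).
toy_graph = {
    "a": ["b", "c"],
    "b": ["a", "d"],
    "c": ["a", "d"],
    "d": ["b", "c", "e"],
    "e": ["d"],
}
sample = bfs_sample(toy_graph, "a", 4)
```

A budget-capped BFS visits the neighborhood of the seed first, which is one reason BFS samples tend to over-represent well-connected regions and why a second, independent sampling technique is useful for comparison.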
Finding Meaning in Context Using Graph Algorithms in Mono- and Cross-lingual Settings
Making computers automatically find the appropriate meaning of words in context is an interesting problem that has proven to be one of the most challenging tasks in natural language processing (NLP). Widespread potential applications of a possible solution to the problem could be envisaged in several NLP tasks such as text simplification, language learning, machine translation, query expansion, information retrieval and text summarization. Ambiguity of words has always been a challenge in these applications, and the traditional endeavor to solve the problem of this ambiguity, namely doing word sense disambiguation using resources like WordNet, has been fraught with debate about the feasibility of the granularity that exists in WordNet senses. The recent trend has therefore been to move away from enforcing any given lexical resource upon automated systems from which to pick potential candidate senses, and to instead encourage them to pick and choose their own resources. Given a sentence with a target ambiguous word, an alternative solution consists of picking potential candidate substitutes for the target, filtering the list of the candidates to a much shorter list using various heuristics, and trying to match these system predictions against a human-generated gold standard, with a view to ensuring that the meaning of the sentence does not change after the substitutions. This solution has manifested itself in the SemEval 2007 task of lexical substitution and the more recent SemEval 2010 task of cross-lingual lexical substitution (which I helped organize), where given an English context and a target word within that context, the systems are required to provide between one and ten appropriate substitutes (in English) or translations (in Spanish) for the target word. In this dissertation, I present a comprehensive overview of state-of-the-art research and describe new experiments to tackle the tasks of lexical substitution and cross-lingual lexical substitution.
In particular, I attempt to answer some research questions pertinent to the tasks, mostly focusing on completely unsupervised approaches. I present a new framework for unsupervised lexical substitution using graphs and centrality algorithms. An additional novelty in this approach is the use of directional similarity rather than the traditional, symmetric word similarity. The thesis also explores the extension of the monolingual framework into a cross-lingual one, and examines how well this cross-lingual framework can work for the monolingual lexical substitution and cross-lingual lexical substitution tasks. A comprehensive set of comparative investigations is presented amongst supervised and unsupervised methods, several graph-based methods, and the use of monolingual and multilingual information.
Similarities on Graphs: Kernels versus Proximity Measures
We analytically study proximity and distance properties of various kernels
and similarity measures on graphs. This helps to understand the mathematical
nature of such measures and can potentially be useful for recommending the
adoption of specific similarity measures in data analysis.
Graph-based approaches to word sense induction
This thesis is a study of Word Sense Induction (WSI), the Natural Language Processing (NLP) task of automatically discovering word meanings from text. WSI is an open problem in NLP whose solution would be of considerable benefit to many other NLP tasks. It has, however, been studied by relatively few NLP researchers and often in set ways. Scope therefore exists to apply novel methods to the problem, methods that may improve upon those previously applied. This thesis applies a graph-theoretic approach to WSI. In this approach, word senses are identified by finding particular types of subgraphs in word co-occurrence graphs. A number of original methods for constructing, analysing, and partitioning graphs are introduced, with these methods then incorporated into graph-based WSI systems. These systems are then shown, in a variety of evaluation scenarios, to return results that are comparable to those of the current best performing WSI systems. The main contributions of the thesis are a novel parameter-free soft clustering algorithm that runs in time linear in the number of edges in the input graph, and novel generalisations of the clustering coefficient (a measure of vertex cohesion in graphs) to the weighted case. Further contributions of the thesis include: a review of graph-based WSI systems that have been proposed in the literature; analysis of the methodologies applied in these systems; analysis of the metrics used to evaluate WSI systems; and empirical evidence to verify the usefulness of each novel method introduced in the thesis for inducing word senses.
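For readers unfamiliar with the clustering coefficient named in the abstract above, the following sketch computes the standard unweighted local clustering coefficient of a vertex: the fraction of pairs of its neighbors that are themselves connected. It does not reproduce the thesis's weighted generalisations; the function name and the toy co-occurrence graph around the word "bank" are hypothetical.

```python
from itertools import combinations


def clustering_coefficient(adjacency, vertex):
    """Unweighted local clustering coefficient of `vertex`.

    `adjacency` maps each vertex to the set of its neighbors in an
    undirected graph (assumed input format).
    """
    neighbors = adjacency[vertex]
    k = len(neighbors)
    if k < 2:
        return 0.0  # fewer than two neighbors: no pairs to connect
    # Count neighbor pairs that are joined by an edge.
    links = sum(1 for u, v in combinations(neighbors, 2) if v in adjacency[u])
    return 2.0 * links / (k * (k - 1))


# Hypothetical co-occurrence graph: "money" and "loan" co-occur with each
# other, "river" co-occurs only with "bank".
cooccurrence = {
    "bank": {"money", "loan", "river"},
    "money": {"bank", "loan"},
    "loan": {"bank", "money"},
    "river": {"bank"},
}
bank_cc = clustering_coefficient(cooccurrence, "bank")
```

In this toy graph only one of the three neighbor pairs of "bank" is connected, so its coefficient is 1/3. A low value of this kind is the sort of signal a graph-based WSI system might read as a word whose neighbors fall into separate sense clusters.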
Centrality measures with a new index called E-User (Effective User) Index for determining the most effective user in Twitter Online Social Network
In this study, we considered the problem of determining the most effective user in the Twitter online social network. We worked on a social network graph whose relationships (edges) connect users who posted a tweet with other users who re-posted it. In other words, we assume that there is a relationship between User-X and User-Y when User-X posted a tweet and User-Y re-posted it. In Social Network Analysis (SNA), there are four fundamental centrality measures: Degree Centrality, Closeness Centrality, Betweenness Centrality, and Eigenvector Centrality. We developed a new approach for determining the most effective user in the Twitter online social network by using an index named the E-User (Effective User) Index. Through this index, we think that we are able to obtain more realistic results in SNA for Twitter. We designed a small weighted and directed social network graph using simulated data and used it for determining the most effective user in this study. In our graph, weights indicate the number of retweets between one user and another, and directions indicate which user retweeted the other user's tweet. Edges in the graph can be bidirectional, meaning that both users retweeted each other's tweets.
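The weighted, directed retweet graph described in the abstract above can be sketched as follows. The abstract does not give the formula of the E-User Index, so this example only computes weighted in-degree and out-degree, the simplest of the centrality measures it lists. The edge orientation (retweeter to original author), the user names, and the retweet counts are all assumptions made for illustration.

```python
from collections import defaultdict

# Hypothetical simulated data: (retweeter, original_author, number_of_retweets).
# A pair of users may appear in both directions (bidirectional relationship).
retweets = [
    ("y", "x", 3),
    ("z", "x", 1),
    ("x", "y", 2),
    ("z", "y", 1),
]


def weighted_degrees(edges):
    """Return (retweets received per author, retweets made per retweeter)."""
    in_strength = defaultdict(int)
    out_strength = defaultdict(int)
    for retweeter, author, count in edges:
        out_strength[retweeter] += count  # retweets this user made
        in_strength[author] += count      # retweets this user received
    return dict(in_strength), dict(out_strength)


received, made = weighted_degrees(retweets)
# User whose tweets were re-posted most often in the toy data.
most_retweeted = max(received, key=received.get)
```

Weighted in-degree rewards users whose tweets are re-posted often, but it ignores who is doing the retweeting, which is one motivation for combining several measures into a single index.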
Link ecosystem of the Portuguese blogosphere
Integrated master's thesis. Informatics and Computing Engineering. Faculdade de Engenharia. Universidade do Porto. 201
Resilience of the Critical Communication Networks Against Spreading Failures: Case of the European National and Research Networks
A backbone network is the central part of a communication network, providing connectivity among various systems across large distances. Disruptions in a backbone network can have severe consequences, such as large-scale service outages. Depending on the size and the importance of the network, its failure could have a substantial impact on the area it serves. Failures of network services could lead to a significant disturbance of human activities. Therefore, making backbone communication networks more resilient directly affects the resilience of the area. Contemporary urban and regional development overwhelmingly converges with the expansion of communication infrastructure, and their mutual interconnections are becoming more reciprocal.
Spreading failures are of particular interest. They usually originate in a single network segment and then spread to the rest of the network, often causing a global collapse. Two types of spreading failures are the focus: epidemics and cascading failures. How can backbone networks be made more resilient against spreading failures? How can the topology be tuned, or nodes or links be additionally protected, in order to mitigate the effect of a potential failure? These are the main questions addressed in this thesis.
First, epidemic phenomena are discussed. The subjects of epidemic modeling and identification of the most influential spreaders are addressed using a proposed Linear Time-Invariant (LTI) system approach. Over the years, LTI system theory has been used mostly to describe electrical circuits and networks. It is well suited to characterizing the behavior of a system consisting of numerous interconnected components. The results presented in this thesis show that the same mathematical toolbox could be used for complex network analysis.
Then, cascading failures are discussed. Like any system that can be modeled using an interdependence graph with limited capacity of either nodes or edges, backbone networks are prone to cascades. Numerical simulations are used to model such failures. The resilience of European National Research and Education Networks (NREN) is assessed, weak points and critical areas of the network are identified, and modifications are suggested.
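A cascade on a capacity-limited graph, as described in the abstract above, can be sketched with a toy load-redistribution model. This is not the thesis's simulation: the rule that a failed node's load is split equally among its surviving neighbors, the function name, and the four-node ring are all assumptions chosen to keep the example small.

```python
from collections import deque


def simulate_cascade(adjacency, load, capacity, initial_failure):
    """Return the set of failed nodes after the cascade settles.

    Assumed toy rule: when a node fails, its current load is divided
    equally among its surviving neighbors; any neighbor whose load then
    exceeds its capacity fails in turn.
    """
    load = dict(load)  # work on a copy so the caller's data is untouched
    failed = {initial_failure}
    queue = deque([initial_failure])
    while queue:
        node = queue.popleft()
        alive = [n for n in adjacency[node] if n not in failed]
        if not alive:
            continue  # nowhere to shed load; it is simply lost
        share = load[node] / len(alive)
        for neighbor in alive:
            load[neighbor] += share
        for neighbor in alive:
            if load[neighbor] > capacity[neighbor]:
                failed.add(neighbor)
                queue.append(neighbor)
    return failed


# Hypothetical four-node ring with unit load on every node.
ring = {"a": ["b", "d"], "b": ["a", "c"], "c": ["b", "d"], "d": ["c", "a"]}
unit_load = {n: 1.0 for n in ring}
tight = {n: 1.4 for n in ring}   # little spare capacity
roomy = {n: 2.0 for n in ring}   # ample spare capacity

collapse = simulate_cascade(ring, unit_load, tight, "a")
contained = simulate_cascade(ring, unit_load, roomy, "a")
```

With tight capacities the single failure of node "a" takes down the whole ring, while with more headroom the failure stays local. Comparing such runs across candidate topologies or capacity assignments is one simple way to locate weak points.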