Search CORE

4,614 research outputs found

Crawling Facebook for Social Network Analysis Purposes

Author: Catanese Salvatore
De Meo Pasquale
Ferrara Emilio
Fiumara Giacomo
Provetti Alessandro
Publication venue: ACM
Publication date: 01/01/2011
Field of study

We describe our work in the collection and analysis of massive data describing the connections between participants to online social networks. Alternative approaches to social network data collection are defined and evaluated in practice, against the popular Facebook Web site. Thanks to our ad-hoc, privacy-compliant crawlers, two large samples, comprising millions of connections, have been collected; the data is anonymous and organized as an undirected graph. We describe a set of tools that we developed to analyze specific properties of such social-network graphs, i.e., among others, degree distribution, centrality measures, scaling laws and distribution of friendship.\u

arXiv.org e-Print Archive

CiteSeerX

Crossref

CogPrints Cognitive Sciences Eprint Archive

Do we really need to catch them all? A new User-guided Social Media Crawling method

Author: Boldt Martin
Bródka Piotr
Erlandsson Fredrik
Johnson Henric
Publication venue: 'MDPI AG'
Publication date: 01/01/2017
Field of study

With the growing use of popular social media services like Facebook and Twitter it is challenging to collect all content from the networks without access to the core infrastructure or paying for it. Thus, if all content cannot be collected one must consider which data are of most importance. In this work we present a novel User-guided Social Media Crawling method (USMC) that is able to collect data from social media, utilizing the wisdom of the crowd to decide the order in which user generated content should be collected to cover as many user interactions as possible. USMC is validated by crawling 160 public Facebook pages, containing content from 368 million users including 1.3 billion interactions, and it is compared with two other crawling methods. The results show that it is possible to cover approximately 75% of the interactions on a Facebook page by sampling just 20% of its posts, and at the same time reduce the crawling time by 53%. In addition, the social network constructed from the 20% sample contains more than 75% of the users and edges compared to the social network created from all posts, and it has similar degree distribution

arXiv.org e-Print Archive

Blekinge Institute of Technology

Multidisciplinary Digital Publishing Institute

Directory of Open Access Journals

Digitala Vetenskapliga Arkivet - Academic Archive On-line

Extraction and Analysis of Facebook Friendship Relations

Author: Catanese Salvatore
De Meo Pasquale
Ferrara Emilio
Fiumara Giacomo
Provetti Alessandro
Publication venue: Springer Verlag, London
Publication date: 01/01/2011
Field of study

Online Social Networks (OSNs) are a unique Web and social phenomenon, affecting tastes and behaviors of their users and helping them to maintain/create friendships. It is interesting to analyze the growth and evolution of Online Social Networks both from the point of view of marketing and other of new services and from a scientific viewpoint, since their structure and evolution may share similarities with real-life social networks. In social sciences, several techniques for analyzing (online) social networks have been developed, to evaluate quantitative properties (e.g., defining metrics and measures of structural characteristics of the networks) or qualitative aspects (e.g., studying the attachment model for the network evolution, the binary trust relationships, and the link prediction problem).\ud However, OSN analysis poses novel challenges both to Computer and Social scientists. We present our long-term research effort in analyzing Facebook, the largest and arguably most successful OSN today: it gathers more than 500 million users. Access to data about Facebook users and their friendship relations, is restricted; thus, we acquired the necessary information directly from the front-end of the Web site, in order to reconstruct a sub-graph representing anonymous interconnections among a significant subset of users. We describe our ad-hoc, privacy-compliant crawler for Facebook data extraction. To minimize bias, we adopt two different graph mining techniques: breadth-first search (BFS) and rejection sampling. To analyze the structural properties of samples consisting of millions of nodes, we developed a specific tool for analyzing quantitative and qualitative properties of social networks, adopting and improving existing Social Network Analysis (SNA) techniques and algorithms

CogPrints Cognitive Sciences Eprint Archive

Exploratory Analysis of Pairwise Interactions in Online Social Networks

Author: Humski Luka
Pintar Damir
Vranić Mihaela
Publication venue: 'Informa UK Limited'
Publication date: 01/01/2017
Field of study

In the last few decades sociologists were trying to explain human behaviour by analysing social networks, which requires access to data about interpersonal relationships. This represented a big obstacle in this research field until the emergence of online social networks (OSNs), which vastly facilitated the process of collecting such data. Nowadays, by crawling public profiles on OSNs, it is possible to build a social graph where "friends" on OSN become represented as connected nodes. OSN connection does not necessarily indicate a close real-life relationship, but using OSN interaction records may reveal real-life relationship intensities, a topic which inspired a number of recent researches. Still, published research currently lacks an extensive exploratory analysis of OSN interaction records, i.e. a comprehensive overview of users' interaction via different ways of OSN interaction. In this paper we provide such an overview by leveraging results of conducted extensive social experiment which managed to collect records for over 3,200 Facebook users interacting with over 1,400,000 of their friends. Our exploratory analysis focuses on extracting population distributions and correlation parameters for 13 interaction parameters, providing valuable insight in online social network interaction for future researches aimed at this field of study.Comment: Journal Article published 2 Oct 2017 in Automatika volume 58 issue 4 on pages 422 to 42

arXiv.org e-Print Archive

Crossref

HRČAK - Portal of Croatian Scientific and Professional Journals

Hrčak - Portal of scientific journals of Croatia

Towards the cloudification of the social networks analytics

Author: Ayguadé Parra Eduard
Cea Daniel
Nin Guerrero Jordi
Torres Viñals Jordi
Tous Liesa Rubén
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2014
Field of study

In the last years, with the increase of the available data from social networks and the rise of big data technologies, social data has emerged as one of the most profitable market for companies to increase their benefits. Besides, social computation scientists see such data as a vast ocean of information to study modern human societies. Nowadays, enterprises and researchers are developing their own mining tools in house, or they are outsourcing their social media mining needs to specialised companies with its consequent economical cost. In this paper, we present the first cloud computing service to facilitate the deployment of social media analytics applications to allow data practitioners to use social mining tools as a service. The main advantage of this service is the possibility to run different queries at the same time and combine their results in real time. Additionally, we also introduce twearch, a prototype to develop twitter mining algorithms as services in the cloud.Peer ReviewedPostprint (author’s final draft

Crossref

UPCommons. Portal del coneixement obert de la UPC

Web Data Extraction, Applications and Techniques: A Survey

Author: Abel
Amalfitano
Balduzzi
Baumgartner
Baumgartner
Baumgartner
Baumgartner
Baumgartner
Baumgartner
Berger
Berthold
Bettencourt
Califf
Catanese
Chang
Chen
Chen
Chen
Collins
Conover
Crandall
Crescenzi
Crescenzi
Dalvi
Dalvi
De Meo
De Meo
Doan
Emilio Ferrara
Ferrara
Ferrara
Ferrara
Ferrara
Ferrara
Flesca
Freitag
Furche
Gatterbauer
Gatterbauer
Giacomo Fiumara
Gjoka
Gkotsis
Gottlob
Gottlob
Hammersley
Han
Hecht
Hsu
Irmak
Khare
Kim
Kinsella
Kleinberg
Kleinberg
Kohlschütter
Kokkoras
Kokkoras
Kokkoras
Krüpl
Kushmerick
Kwak
Laender
Liu
Manning
Masanès
Mathes
Meng
Mislove
Monge
Muslea
Oro
Pan
Pasquale De Meo
Perito
Phan
Plake
Rahm
Rahm
Reis
Robert Baumgartner
Sahuguet
Sarawagi
Schifanella
Selkow
Shi
Soderland
Szomszor
Turmo
Vosecky
Wang
Wang
Weikum
Wilson
Winograd
Yang
Ye
Zafarani
Zanasi
Zhai
Zhang
Zhang
Publication venue: 'Elsevier BV'
Publication date: 09/06/2014
Field of study

Web Data Extraction is an important problem that has been studied by means of different scientific tools and in a broad range of applications. Many approaches to extracting data from the Web have been designed to solve specific problems and operate in ad-hoc domains. Other approaches, instead, heavily reuse techniques and algorithms developed in the field of Information Extraction. This survey aims at providing a structured and comprehensive overview of the literature in the field of Web Data Extraction. We provided a simple classification framework in which existing Web Data Extraction applications are grouped into two main classes, namely applications at the Enterprise level and at the Social Web level. At the Enterprise level, Web Data Extraction techniques emerge as a key tool to perform data analysis in Business and Competitive Intelligence systems as well as for business process re-engineering. At the Social Web level, Web Data Extraction techniques allow to gather a large amount of structured data continuously generated and disseminated by Web 2.0, Social Media and Online Social Network users and this offers unprecedented opportunities to analyze human behavior at a very large scale. We discuss also the potential of cross-fertilization, i.e., on the possibility of re-using Web Data Extraction techniques originally designed to work in a given domain, in other domains.Comment: Knowledge-based System

arXiv.org e-Print Archive

Crossref

On Facebook, most ties are weak

Author: De Meo Pasquale
Ferrara Emilio
Fiumara Giacomo
Provetti Alessandro
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 27/10/2014
Field of study

Pervasive socio-technical networks bring new conceptual and technological challenges to developers and users alike. A central research theme is evaluation of the intensity of relations linking users and how they facilitate communication and the spread of information. These aspects of human relationships have been studied extensively in the social sciences under the framework of the "strength of weak ties" theory proposed by Mark Granovetter.13 Some research has considered whether that theory can be extended to online social networks like Facebook, suggesting interaction data can be used to predict the strength of ties. The approaches being used require handling user-generated data that is often not publicly available due to privacy concerns. Here, we propose an alternative definition of weak and strong ties that requires knowledge of only the topology of the social network (such as who is a friend of whom on Facebook), relying on the fact that online social networks, or OSNs, tend to fragment into communities. We thus suggest classifying as weak ties those edges linking individuals belonging to different communities and strong ties as those connecting users in the same community. We tested this definition on a large network representing part of the Facebook social graph and studied how weak and strong ties affect the information-diffusion process. Our findings suggest individuals in OSNs self-organize to create well-connected communities, while weak ties yield cohesion and optimize the coverage of information spread.Comment: Accepted version of the manuscript before ACM editorial work. Check http://cacm.acm.org/magazines/2014/11/179820-on-facebook-most-ties-are-weak/ for the final versio

arXiv.org e-Print Archive

Crossref

Birkbeck Institutional Research Online