
    Clustering in Geo-Social Networks

    The rapid growth of Geo-Social Networks (GeoSNs) provides a new and rich form of data. Users of GeoSNs can capture their geographic locations and share them with other users via an operation called a check-in. GeoSNs can therefore track which users are connected to which geographic locations, and when those connections were made. In addition, the users are organized in a social network, which can be extended to a heterogeneous network if the connections to places via check-ins are also considered. The goal of this paper is to analyze the opportunities in clustering this rich form of data. We first present a model for clustering geographic locations, based on GeoSN data. Then, we discuss how this model can be extended to consider temporal information from check-ins. Finally, we study how the accuracy of community detection approaches can be improved by taking into account the check-ins of users in a GeoSN.

    Geo-Social Group Queries with Minimum Acquaintance Constraint

    The prosperity of location-based social networking services enables geo-social group queries for group-based activity planning and marketing. This paper proposes a new family of geo-social group queries with minimum acquaintance constraint (GSGQs), which are more appealing than existing geo-social group queries because they produce a cohesive group with a guaranteed worst-case acquaintance level. GSGQs, which can also be specified with various spatial constraints, are more complex than conventional spatial queries; in particular, those with a strict kNN spatial constraint are proved to be NP-hard. For efficient processing of general GSGQs on large location-based social networks, we devise two social-aware index structures, namely SaR-tree and SaR*-tree. The latter features a novel clustering technique that considers both spatial and social factors. Based on SaR-tree and SaR*-tree, efficient algorithms are developed to process various GSGQs. Extensive experiments on real-world Gowalla and Dianping datasets show that our proposed methods substantially outperform the baseline algorithms based on R-tree. Comment: This is the preprint version that is accepted by the Very Large Data Bases Journal.
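A minimal sketch of what a minimum acquaintance constraint could look like in code. It assumes the constraint means every group member knows at least `c` other members of the group; the abstract does not spell out the definition, so this reading, the function names, and the toy `FRIENDS` graph are all illustrative assumptions rather than the paper's method. It only verifies a candidate group and does not perform the spatial search or use the SaR-tree indexes.

```python
from typing import Dict, Iterable, Set


def min_acquaintance(group: Iterable[str], friends: Dict[str, Set[str]]) -> int:
    """Return the smallest number of in-group acquaintances held by any member."""
    members = set(group)
    if not members:
        return 0
    # For each member, count friends that are also in the group (excluding self).
    return min(len(friends.get(u, set()) & (members - {u})) for u in members)


def satisfies_constraint(group: Iterable[str],
                         friends: Dict[str, Set[str]],
                         c: int) -> bool:
    """True if every member knows at least c other members (assumed definition)."""
    return min_acquaintance(group, friends) >= c


# Hypothetical undirected friendship graph, stored symmetrically.
FRIENDS = {
    "a": {"b", "c"},
    "b": {"a", "c"},
    "c": {"a", "b", "d"},
    "d": {"c"},
}

if __name__ == "__main__":
    print(satisfies_constraint(["a", "b", "c"], FRIENDS, 2))
    print(satisfies_constraint(["a", "b", "c", "d"], FRIENDS, 2))
```

Under this reading, adding a loosely connected user to a tight group lowers the worst-case acquaintance level of the whole group, which is why the check takes the minimum rather than the average.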

    Diffusion of Competing Innovations: The Effects of Network Structure on the Provision of Healthcare

    Medical innovations, in the form of new medication or other clinical practices, evolve and spread through health care systems, affecting the quality and standards of health care provision, which is demonstrably heterogeneous by geography. Our aim is to investigate the potential for the diffusion of innovation to influence health inequality and overall levels of recommended care. We extend existing diffusion of innovation models to produce agent-based simulations that mimic population-wide adoption of new practices by doctors within a network of influence. Using a computational model of network construction in lieu of empirical data about a network, we simulate the diffusion of competing innovations as they enter and proliferate through a state system comprising 24 geo-political regions, 216 facilities and over 77,000 individuals. Results show that stronger clustering within hospitals or geo-political regions is associated with slower adoption amongst smaller and rural facilities. Results of repeated simulation show how the nature of uptake and competition can contribute to low average levels of recommended care within a system that relies on diffusive adoption. We conclude that an increased disparity in adoption rates is associated with high levels of clustering in the network, and that the social phenomenon of competitive diffusion of innovation potentially contributes to low levels of recommended care. Keywords: Innovation Diffusion, Scale-Free Networks, Health Policy, Agent-Based Modelling

    Scaling DBSCAN-like algorithms for event detection systems in Twitter

    The increasing use of mobile social networks has lately transformed news media. Real-world events are nowadays reported in social networks much faster than in traditional channels. As a result, the autonomous detection of events from networks like Twitter has gained a lot of interest in both research and media groups. DBSCAN-like algorithms constitute a well-known clustering approach to retrospective event detection. However, scaling such algorithms to geographically large regions and temporally long periods presents two major challenges. First, detecting real-world events from the vast amount of tweets can no longer be performed on a single machine. Second, tweeting activity varies widely within these broad space-time regions, limiting the use of global parameters. Against this background, we propose to scale DBSCAN-like event detection techniques by parallelizing and distributing them through a novel density-aware MapReduce scheme. The proposed scheme partitions tweet data according to its spatial and temporal features and tailors local DBSCAN parameters to local tweet densities. We implement the scheme in Apache Spark and evaluate its performance on a dataset composed of geo-located tweets in the Iberian peninsula during the course of several football matches. The results point to the benefits of our proposal over other state-of-the-art techniques in terms of speed-up and detection accuracy. Peer Reviewed. Postprint (author's final draft)
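To illustrate why local parameters matter, here is a small sketch of tailoring a DBSCAN radius to local density. It is not the scheme from the paper: the grid partitioning, the `local_eps` heuristic (a multiple of the median nearest-neighbour distance inside each cell), and all names and default values are assumptions made for illustration, and the sketch is spatial only, with no temporal dimension and no Spark.

```python
import math
from collections import defaultdict
from statistics import median


def partition(points, cell_size):
    """Group 2-D points into square grid cells of side cell_size."""
    cells = defaultdict(list)
    for x, y in points:
        key = (math.floor(x / cell_size), math.floor(y / cell_size))
        cells[key].append((x, y))
    return dict(cells)


def local_eps(cell_points, scale=2.0, fallback=1.0):
    """Assumed heuristic: eps = scale * median nearest-neighbour distance.

    Dense cells get a small radius and sparse cells a large one, so a single
    global eps is no longer needed.
    """
    if len(cell_points) < 2:
        return fallback
    nearest = [
        min(math.dist(p, q) for j, q in enumerate(cell_points) if j != i)
        for i, p in enumerate(cell_points)
    ]
    return scale * median(nearest)


if __name__ == "__main__":
    # Hypothetical coordinates: one dense burst and one sparse area.
    pts = [(0.0, 0.0), (0.1, 0.0), (0.2, 0.0),
           (10.0, 10.0), (12.0, 10.0), (14.0, 10.0)]
    for cell, members in partition(pts, cell_size=5.0).items():
        print(cell, round(local_eps(members), 3))
```

Each cell could then be clustered independently with its own radius, which is what makes the work easy to distribute; handling clusters that straddle cell borders is left out of this sketch.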

    Analyzing large scale trajectory data to identify users with similar behavior

    In today's society, social networks are a popular way to connect with friends and family and share what's going on in your life. With the Internet connecting us all closer than ever before, it is increasingly common to use social networks to meet new friends online that share similar interests instead of only connecting with those you already know. For the problem of attempting to connect people with similar interests, this paper proposes the foundation for a Geo-social network that aims to extract the semantic meaning from users' location history and use this information to find the similarity between users. Once the similarity scores are obtained, the results are examined to extract the groups of similar users for the Geo-social network. Computing similarity for a large number of users and then grouping based on the results is a computationally intensive task, but fortunately Apache Spark can be leveraged to execute the comparison and clustering of users in parallel across multiple computers, increasing the computation speed when compared to a centralized version and working quickly enough to suggest friends in real time for a given user. --Abstract, page iii
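As a rough illustration of scoring users by the semantic content of their location histories, the sketch below uses Jaccard similarity over sets of place categories. The abstract does not name the similarity measure, so Jaccard, the category labels, the threshold, and the function names are all assumptions; the sketch is also single-machine and does not use Apache Spark.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard similarity of two collections of place categories."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def similar_pairs(histories, threshold):
    """Return (user, user, score) for every pair scoring at least threshold."""
    pairs = []
    for u, v in combinations(sorted(histories), 2):
        score = jaccard(histories[u], histories[v])
        if score >= threshold:
            pairs.append((u, v, score))
    return pairs


if __name__ == "__main__":
    # Hypothetical semantic location histories (categories of visited places).
    histories = {
        "u1": ["cafe", "gym", "park"],
        "u2": ["cafe", "gym", "library"],
        "u3": ["museum"],
    }
    print(similar_pairs(histories, threshold=0.5))
```

The all-pairs loop is quadratic in the number of users, which is the cost that motivates running the comparison in parallel.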

    Privacy preservation in peer-to-peer gossiping networks in presence of a passive adversary

    In Web 2.0, more and more personal data are released by users (queries, social networks, geo-located data, ...), creating a huge pool of useful information to leverage, for instance in search or recommendation. In fully decentralized systems, tapping into the power of this information usually involves a clustering process that relies on an exchange of personal data (such as user profiles) to compute the similarity between users. In this internship, we address the problem of computing similarity between users while preserving their privacy and without relying on a central entity, with regard to a passive adversary.