2,872 research outputs found

    Exploring Social Media for Event Attendance

    Get PDF
    Large popular events are nowadays well reflected in social media fora (e.g. Twitter), where people discuss their interest in participating in the events. In this paper we propose to exploit the content of non-geotagged posts in social media to build machine-learned classifiers able to infer users' attendance of large events in three temporal periods: before, during and after an event. The categories of features used to train the classifier reflect four different dimensions of social media: textual, temporal, social, and multimedia content. We detail the approach followed to design the feature space and report on experiments conducted on two large music festivals in the UK, namely the VFestival and Creamfields events. Our attendance classifier attains very high accuracy with the highest result observed for the Creamfields dataset ~87% accuracy to classify users that will participate in the event

    Geo-located Twitter as the proxy for global mobility patterns

    Full text link
    In the advent of a pervasive presence of location sharing services researchers gained an unprecedented access to the direct records of human activity in space and time. This paper analyses geo-located Twitter messages in order to uncover global patterns of human mobility. Based on a dataset of almost a billion tweets recorded in 2012 we estimate volumes of international travelers in respect to their country of residence. We examine mobility profiles of different nations looking at the characteristics such as mobility rate, radius of gyration, diversity of destinations and a balance of the inflows and outflows. The temporal patterns disclose the universal seasons of increased international mobility and the peculiar national nature of overseen travels. Our analysis of the community structure of the Twitter mobility network, obtained with the iterative network partitioning, reveals spatially cohesive regions that follow the regional division of the world. Finally, we validate our result with the global tourism statistics and mobility models provided by other authors, and argue that Twitter is a viable source to understand and quantify global mobility patterns.Comment: 17 pages, 13 figure

    Racial categories in machine learning

    Full text link
    Controversies around race and machine learning have sparked debate among computer scientists over how to design machine learning systems that guarantee fairness. These debates rarely engage with how racial identity is embedded in our social experience, making for sociological and psychological complexity. This complexity challenges the paradigm of considering fairness to be a formal property of supervised learning with respect to protected personal attributes. Racial identity is not simply a personal subjective quality. For people labeled "Black" it is an ascribed political category that has consequences for social differentiation embedded in systemic patterns of social inequality achieved through both social and spatial segregation. In the United States, racial classification can best be understood as a system of inherently unequal status categories that places whites as the most privileged category while signifying the Negro/black category as stigmatized. Social stigma is reinforced through the unequal distribution of societal rewards and goods along racial lines that is reinforced by state, corporate, and civic institutions and practices. This creates a dilemma for society and designers: be blind to racial group disparities and thereby reify racialized social inequality by no longer measuring systemic inequality, or be conscious of racial categories in a way that itself reifies race. We propose a third option. By preceding group fairness interventions with unsupervised learning to dynamically detect patterns of segregation, machine learning systems can mitigate the root cause of social disparities, social segregation and stratification, without further anchoring status categories of disadvantage

    Lifecycle-Aware Online Video Caching

    Get PDF
    The current explosion of video traffic compels service providers to deploy caches at edge networks. Nowadays, most caching systems store data with a high programming voltage corresponding to the largest possible ‘expiry date’, typically on the order of years, which maximizes the cache damage. However, popular videos rarely exhibit lifecycles longer than a couple of months. Consequently, the programming voltage can instead be adapted to fit the lifecycle and mitigate the cache damage accordingly. In this paper, we propose LiA-cache, a Lifecycle-Aware caching policy for online videos. LiA-cache finds both near-optimal caching retention times and cache eviction policies by optimizing traffic delivery cost and cache damage cost conjointly. We first investigate temporal patterns of video access from a real-world dataset covering 10 million online videos collected by one of the largest mobile network operators in the world. We next cluster the videos based on their access lifecycles and integrate the clustering into a general model of the caching system. Specifically, LiA-cache analyzes videos and caches them depending on their cluster label. Compared to other popular policies in real-world scenarios, LiA-cache can reduce cache damage up to 90%, while keeping a cache hit ratio close to a policy purely relying on video popularity.Peer reviewe

    Advances in crowd analysis for urban applications through urban event detection

    Get PDF
    The recent expansion of pervasive computing technology has contributed with novel means to pursue human activities in urban space. The urban dynamics unveiled by these means generate an enormous amount of data. These data are mainly endowed by portable and radio-frequency devices, transportation systems, video surveillance, satellites, unmanned aerial vehicles, and social networking services. This has opened a new avenue of opportunities, to understand and predict urban dynamics in detail, and plan various real-time services and applications in response to that. Over the last decade, certain aspects of the crowd, e.g., mobility, sentimental, size estimation and behavioral, have been analyzed in detail and the outcomes have been reported. This paper mainly conducted an extensive survey on various data sources used for different urban applications, the state-of-the-art on urban data generation techniques and associated processing methods in order to demonstrate their merits and capabilities. Then, available open-access crowd data sets for urban event detection are provided along with relevant application programming interfaces. In addition, an outlook on a support system for urban application is provided which fuses data from all the available pervasive technology sources and finally, some open challenges and promising research directions are outlined

    A survey on privacy in human mobility

    Get PDF
    In the last years we have witnessed a pervasive use of location-aware technologies such as vehicular GPS-enabled devices, RFID based tools, mobile phones, etc which generate collection and storing of a large amount of human mobility data. The powerful of this data has been recognized by both the scientific community and the industrial worlds. Human mobility data can be used for different scopes such as urban traffic management, urban planning, urban pollution estimation, etc. Unfortunately, data describing human mobility is sensitive, because people's whereabouts may allow re-identification of individuals in a de-identified database and the access to the places visited by indi-viduals may enable the inference of sensitive information such as religious belief, sexual preferences, health conditions, and so on. The literature reports many approaches aimed at overcoming privacy issues in mobility data, thus in this survey we discuss the advancements on privacy-preserving mo-bility data publishing. We first describe the adversarial attack and privacy models typically taken into consideration for mobility data, then we present frameworks for the privacy risk assessment and finally, we discuss three main categories of privacy-preserving strategies: methods based on anonymization of mobility data, methods based on the differential privacy models and methods which protect privacy by exploiting generative models for synthetic trajectory generation
    • …
    corecore