
    Language in Our Time: An Empirical Analysis of Hashtags

    Hashtags in online social networks have gained tremendous popularity during the past five years. The resulting large quantity of data has provided a new lens into modern society. Previous research has mainly relied on data collected from Twitter to study either a certain type of hashtag or a certain property of hashtags. In this paper, we perform the first large-scale empirical analysis of hashtags shared on Instagram, the major platform for hashtag-sharing. We study hashtags along three dimensions: the temporal-spatial dimension, the semantic dimension, and the social dimension. Extensive experiments performed on three large-scale datasets with more than 7 million hashtags in total provide a series of interesting observations. First, we show that the temporal patterns of hashtags can be categorized into four different clusters, and that people tend to share fewer hashtags at certain places and more hashtags at others. Second, we observe that a non-negligible proportion of hashtags exhibit large semantic displacement. We demonstrate that hashtags that are more uniformly shared among users, as quantified by the proposed hashtag entropy, are less prone to semantic displacement. Finally, we propose a bipartite graph embedding model to summarize users' hashtag profiles, and rely on these profiles to perform friendship prediction. Evaluation results show that our approach achieves effective prediction, with an AUC (area under the ROC curve) above 0.8, which demonstrates the strong social signals carried by hashtags. Comment: WWW 201
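The abstract does not give the definition of hashtag entropy, so the following is only a minimal sketch under an assumption: that it is the Shannon entropy of how one hashtag's shares are distributed across users. The function name `hashtag_entropy` and the input format (one user ID per share) are hypothetical and may differ from the paper's formulation.

```python
import math
from collections import Counter


def hashtag_entropy(sharers):
    """Shannon entropy (in bits) of one hashtag's shares across users.

    `sharers` holds one user ID per share of the hashtag. This is an
    assumed definition for illustration, not the paper's exact formula.
    """
    counts = Counter(sharers)
    total = sum(counts.values())
    if total == 0:
        return 0.0  # a hashtag that was never shared carries no information
    # Sum p * log2(1/p) over each user's fraction p of the shares.
    return sum((c / total) * math.log2(total / c) for c in counts.values())


# Shares spread evenly over four users versus concentrated in one user.
uniform = hashtag_entropy(["u1", "u2", "u3", "u4"])
concentrated = hashtag_entropy(["u1", "u1", "u1", "u1"])
```

Under this assumed definition, a hashtag shared evenly by many users scores high, while one dominated by a single user scores zero, which matches the abstract's description of "more uniformly shared" hashtags.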

    Living Knowledge

    Diversity, especially as manifested in language and knowledge, is a function of local goals, needs, competences, beliefs, culture, opinions and personal experience. The Living Knowledge project considers diversity as an asset rather than a problem. Within the project, foundational ideas that emerged from the synergistic contribution of different disciplines, methodologies (with which many partners were previously unfamiliar) and technologies flowed into concrete diversity-aware applications such as the Future Predictor and the Media Content Analyser, which provide users with better structured information while coping with Web-scale complexities. The key notions of diversity, fact, opinion and bias have been defined in relation to three methodologies: Media Content Analysis (MCA), which operates from a social sciences perspective; Multimodal Genre Analysis (MGA), which operates from a semiotic perspective; and Facet Analysis (FA), which operates from a knowledge representation and organization perspective. A conceptual architecture that pulls all of them together has become the core of the tools for automatic extraction and the way they interact. In particular, the conceptual architecture has been implemented with the Media Content Analyser application. The scientific and technological results obtained are described in the following

    MORMED: towards a multilingual social networking platform facilitating medicine 2.0

    The broad adoption of Web 2.0 tools has signalled a new era of "Medicine 2.0" in the field of medical informatics. The support for collaboration within online communities and the sharing of information in social networks offer the opportunity for new communication channels among patients, medical experts, and researchers. This paper introduces MORMED, a novel multilingual social networking and content management platform that exemplifies the Medicine 2.0 paradigm and aims to achieve knowledge commonality by promoting sociality, while also transcending language barriers through automated translation. The MORMED platform will be piloted in a community interested in the treatment of rare diseases (Lupus or Antiphospholipid Syndrome)

    BlogForever: D3.1 Preservation Strategy Report

    This report describes preservation planning approaches and strategies recommended by the BlogForever project as a core component of a weblog repository design. More specifically, we start by discussing why we would want to preserve weblogs in the first place and what it is exactly that we are trying to preserve. We further present a review of past and present work and highlight why current practices in web archiving do not address the needs of weblog preservation adequately. We make three distinctive contributions in this volume: a) we propose transferable practical workflows for applying a combination of established metadata and repository standards in developing a weblog repository, b) we provide an automated approach to identifying significant properties of weblog content that uses the notion of communities and how this affects previous strategies, c) we propose a sustainability plan that draws upon community knowledge through innovative repository design

    SportsAnno: what do you think?

    The automatic summarisation of sports video is of growing importance with the increased availability of on-demand content. Consumers who are unable to view events live often want to watch a summary that allows them to quickly come to terms with all that has happened during a sporting event. Sports forums show that it is not only summaries that are desirable but also the opportunity to share one’s own point of view and discuss opinions with a community of similar users. In this paper we give an overview of the ways in which annotations have been used to augment existing visual media. We present SportsAnno, a system developed to summarise World Cup 2006 matches and provide a means for open discussion of events within these matches

    Finding information again using an individual’s web history

    In a lifetime, an “average” person will visit approximately a million webpages. Sometimes a person wants to return to a given page at some future date but, having no recollection of where it was (URL, host, etc.), has to look for it again from scratch. This paper assesses how a person’s memory could be assisted by the presentation of a “map” of their web browsing activity. Three map organisation approaches were investigated: (i) time-based, (ii) place-based, and (iii) topic-based. Time-based organisation is the least suitable, because the temporal specificity of human memory is generally poor. Place-based approaches lack scalability, and are not helped by the fact that there is little repetition in the paths a person follows between places. Topic-based organisation is more promising, with topics derived from both the web content that is accessed and the search queries that are executed, which provide snapshots of a person’s cognitive processes by explicitly capturing the terminology of “what” they were looking for at that moment in time. In terms of presentation, a map that combines aspects of network connectivity with a space-filling approach is likely to be most effective

    Towards Health Informatics 2.0: Blogs, Podcasts and Web 2.0 Applications in Nursing and Health Informatics Education and Professional Collaboration

    Health professionals and students are expected to be proficient in basic information technology so as to mitigate error, communicate effectively, manage information and collaborate with peers. Web 2.0 applications such as blogs, podcasts and wikis are social networking tools that may enhance health professionals' development of such skills. As Web 2.0 application use by health professionals is in its infancy, the purpose of this paper is to present examples of the use of such tools that hold potential for online and mobile information dissemination, knowledge building in education, and professional collaboration. Examples based in a collaborative model of virtual conference interaction, and in the use of blogs and podcasts within nursing education, are discussed. The paper concludes by seeking to promote debate on the possible development of Web 2.0 tools specific to health informatics, and so developing the next generation of health informatics, 'Health Informatics 2.0'