6 research outputs found

    ON RELEVANCE FILTERING FOR REAL-TIME TWEET SUMMARIZATION

    Real-time tweet summarization systems (RTS) require mechanisms for capturing relevant tweets, identifying novel tweets, and capturing timely tweets. In this thesis, we tackle the RTS problem with a main focus on relevance filtering. We experimented with different traditional retrieval models. Additionally, we propose two extensions to alleviate the sparsity and topic drift challenges that affect relevance filtering. For sparsity, we propose leveraging word embeddings in Vector Space Model (VSM) term weighting so that the system can use semantic similarity alongside lexical matching. To mitigate the effect of topic drift, we exploit explicit relevance feedback to enhance the profile representation so that it can follow the topic's development in the stream over time. We conducted extensive experiments over three standard English TREC test collections that were built specifically for RTS. Although the extensions do not generally exhibit better performance, they are comparable to the baselines used. Moreover, we extended an event detection Arabic tweets test collection, called EveTAR, to support tasks that require novelty in the system's output. We collected novelty judgments using in-house annotators and used the collection to test our RTS system. We report preliminary results on EveTAR using different models of the RTS system. This work was made possible by NPRP grants # NPRP 7-1313-1-245 and # NPRP 7-1330-2-483 from the Qatar National Research Fund (a member of Qatar Foundation).

    The Influence of an Individual’s Disposition to Value Privacy in a Non-Contrived Study

    Unexpected usage of user data has made headlines as both governments and commercial entities have encountered privacy-related issues. Like other social networking sites, LinkedIn lets users restrict access to their information or allow public viewing; information available in the public view was used unexpectedly (i.e., for profiling). A non-profit entity called ICWATCH used tools to gather information on government mass surveillance programs by scraping publicly accessible user data from LinkedIn. Previous research has shown that privacy concerns influence behavior intention in contrived scenarios. What remains unclear is whether LinkedIn users whose data was scraped by ICWATCH (an actual situation) would have similar privacy concerns and subsequently express the intent to take privacy-preserving action. This study proposed to answer three research questions in the context of an actual privacy-centric situation, using an explanatory sequential mixed methods design. First, what is the user's disposition towards privacy? Second, to what extent does this influence users' privacy concerns regarding the inclusion of their LinkedIn profile information within ICWATCH? Third, to what extent do these concerns influence their stated intention to modify their LinkedIn profile/settings to minimize/eliminate this inclusion? The two-phase approach performed quantitative analysis on collected survey data, followed by analysis of follow-up interview data to provide context. The resulting analyses found significant support for each hypothesis and divergence of underlying factors between degrees of the hypotheses and variable representations. Those participants who were not inclined to privacy and were not concerned with the situation, as expected, did not intend to modify their LinkedIn profile. However, they did express underlying factors such as control and privacy risk belief, unlike their counterparts. Those participants who were more inclined and more concerned about the situation did express an intent to modify their profile and revealed underlying factors such as regulations and usage. The findings support the extension of the existing literature onto actual privacy-centric situations. The results also highlight challenges with population demographics in actual situations and offer suggestions for construct prioritization when investigating future situations.

    Presentation of self on a decentralised web

    Self-presentation is evolving: with digital technologies, with the Web and personal publishing, and then with mainstream adoption of online social media. Where are we going next? One possibility is towards a world where we log and own vast amounts of data about ourselves. We choose to share - or not - the data as part of our identity, and in interactions with others; it contributes to our day-to-day personhood or sense of self. I imagine a world where the individual is empowered by their digital traces (not imprisoned), but this is a complex world. This thesis examines the many factors at play when we present ourselves through Web technologies. I optimistically look to a future where control over our digital identities is not in the hands of centralised actors, but our own, and both survey and contribute to the ongoing technical work which strives to make this a reality. Decentralisation changes things in unexpected ways. In the context of the bigger picture of our online selves, building on what we already know about self-presentation from decades of Social Science research, I examine what might change as we move towards decentralisation; how people could be affected, and what the possibilities are for a positive change. Finally I explore one possible way of self-presentation on a decentralised social Web through lightweight controls which allow an audience to set their expectations in order for the subject to meet them appropriately. I seek to acknowledge the multifaceted, complicated, messy, socially-shaped nature of the self in a way that makes sense to software developers. Technology may always fall short when dealing with humanness, but the framework outlined in this thesis can provide a foundation for more easily considering all of the factors surrounding individual self-presentation in order to build future systems which empower participants.

    Sticks and Stones May Break My Bones but Words Will Never Hurt Me...Until I See Them: A Qualitative Content Analysis of Trolls in Relation to the Gricean Maxims and (IM)Polite Virtual Speech Acts

    The troll is one of the most obtrusive and disruptive bad actors on the internet. Unlike other bad actors, the troll interacts on a more personal and intimate level with other internet users. Social media platforms, online communities, comment boards, and chatroom forums provide them with this opportunity. What distinguishes these social provocateurs from other bad actors are their virtual speech acts and online behaviors. These acts aim to incite anger, shame, or frustration in others through the weaponization of words, phrases, and other rhetoric. Online trolls come in all forms and use various speech tactics to insult and demean their target audiences. The goal of this research is to investigate trolls' virtual speech acts and the impact of troll-like behaviors on online communities. Using Gricean maxims and politeness theory, this study seeks to identify common vernacular, word usage, and other language behaviors that trolls use to divert the conversation, insult others, and possibly affect fellow internet users’ mental health and well-being.