8 research outputs found

    Evaluation of Automatic Video Captioning Using Direct Assessment

    We present Direct Assessment, a method for manually assessing the quality of automatically generated captions for video. Evaluating the accuracy of video captions is particularly difficult because for any given video clip there is no definitive ground truth or correct answer against which to measure. Automatic metrics such as BLEU and METEOR, which compare automatic video captions against a manual caption and are drawn from techniques used in evaluating machine translation, were used in the TRECVid video captioning task in 2016, but these are shown to have weaknesses. The work presented here brings human assessment into the evaluation by crowdsourcing how well a caption describes a video. We automatically degrade the quality of some sample captions, which are assessed manually, and from this we are able to rate the quality of the human assessors, a factor we take into account in the evaluation. Using data from the TRECVid video-to-text task in 2016, we show how our direct assessment method is replicable and robust and should scale to where there are many caption-generation techniques to be evaluated. Comment: 26 pages, 8 figures
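The assessor quality check described in this abstract can be illustrated with a minimal sketch. It assumes each assessor scores pairs of original and degraded captions on a 0-100 scale, and it uses a simple mean-gap rule as a stand-in for whatever statistical test the authors apply. The function name `reliable_assessors` and the `min_gap` threshold are hypothetical and do not come from the paper.

```python
from statistics import mean


def reliable_assessors(ratings, min_gap=10.0):
    """Return assessors who score degraded captions clearly lower.

    ratings maps an assessor id to a list of
    (original_score, degraded_score) pairs on a 0-100 scale.
    min_gap is an assumed threshold, not a value from the paper.
    """
    kept = set()
    for assessor, pairs in ratings.items():
        # Average amount by which the original caption beats its degraded copy.
        gap = mean(original - degraded for original, degraded in pairs)
        if gap >= min_gap:
            kept.add(assessor)
    return kept


# Illustrative, made-up ratings for two assessors.
ratings = {
    "a1": [(80, 40), (70, 35), (90, 60)],  # consistently penalises degradation
    "a2": [(50, 55), (60, 58), (45, 50)],  # cannot tell the two apart
}
print(reliable_assessors(ratings))
```

Only scores from assessors who pass such a check would then feed into the caption-level evaluation.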

    On network backbone extraction for modeling online collective behavior

    Collective user behavior in social media applications often drives several important online and offline phenomena linked to the spread of opinions and information. Several studies have focused on the analysis of such phenomena using networks to model user interactions, represented by edges. However, only a fraction of edges contribute to the actual investigation. Even worse, the often large number of non-relevant edges may obfuscate the salient interactions, blurring the underlying structures and user communities that capture the collective behavior patterns driving the target phenomenon. To solve this issue, researchers have proposed several network backbone extraction techniques to obtain a reduced and representative version of the network that better explains the phenomenon of interest. Each technique has its specific assumptions and procedure to extract the backbone. However, the literature lacks a clear methodology to highlight such assumptions, discuss how they affect the choice of a method, and offer validation strategies in scenarios where no ground truth exists. In this work, we fill this gap by proposing a principled methodology for comparing and selecting the most appropriate backbone extraction method given a phenomenon of interest. We characterize ten state-of-the-art techniques in terms of their assumptions, requirements, and other aspects that one must consider to apply them in practice. We present four steps to apply, evaluate, and select the best method(s) for a given target phenomenon. We validate our approach using two case studies with different requirements: online discussions on Instagram and coordinated behavior in WhatsApp groups. We show that each method can produce very different backbones, underscoring that the choice of an adequate method is of utmost importance to reveal valuable knowledge about the particular phenomenon under investigation.
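To make the idea of backbone extraction concrete, here is a minimal sketch of one widely used technique, the disparity filter. The abstract does not say which ten techniques the authors characterize, so treating this one as representative is an assumption. The function name `disparity_backbone` and the default `alpha` are illustrative choices.

```python
from collections import defaultdict


def disparity_backbone(edges, alpha=0.05):
    """Keep edges whose weight is significant for at least one endpoint.

    edges maps an undirected pair (u, v) to a positive weight.
    alpha is the significance level (an assumed default).
    """
    strength = defaultdict(float)  # sum of edge weights per node
    degree = defaultdict(int)      # number of edges per node
    for (u, v), w in edges.items():
        for node in (u, v):
            strength[node] += w
            degree[node] += 1

    def p_value(node, w):
        k = degree[node]
        if k <= 1:
            return 1.0  # a node with a single edge cannot be tested
        # Probability of a share this large if the node's strength
        # were split uniformly at random among its k edges.
        return (1.0 - w / strength[node]) ** (k - 1)

    return {
        (u, v): w
        for (u, v), w in edges.items()
        if min(p_value(u, w), p_value(v, w)) < alpha
    }


# Illustrative star graph: one heavy edge and three light ones.
edges = {("a", "b"): 100.0, ("a", "c"): 1.0, ("a", "d"): 1.0, ("a", "e"): 1.0}
print(disparity_backbone(edges))
```

Other techniques encode different assumptions about which edges are salient, which is why the resulting backbones can differ substantially.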

    An automated pipeline for the discovery of conspiracy and conspiracy theory narrative frameworks: Bridgegate, Pizzagate and storytelling on the web

    Although a great deal of attention has been paid to how conspiracy theories circulate on social media and their factual counterpart conspiracies, there has been little computational work done on describing their narrative structures. We present an automated pipeline for the discovery and description of the generative narrative frameworks of conspiracy theories on social media, and actual conspiracies reported in the news media. We base this work on two separate repositories of posts and news articles describing the well-known conspiracy theory Pizzagate from 2016, and the New Jersey conspiracy Bridgegate from 2013. We formulate a graphical generative machine learning model where nodes represent actors/actants, and multi-edges and self-loops among nodes capture context-specific relationships. Posts and news items are viewed as samples of subgraphs of the hidden narrative network. The problem of reconstructing the underlying structure is posed as a latent model estimation problem. We automatically extract and aggregate the actants and their relationships from the posts and articles. We capture context-specific actants and interactant relationships by developing a system of supernodes and subnodes. We use these to construct a network, which constitutes the underlying narrative framework. We show how the Pizzagate framework relies on the conspiracy theorists' interpretation of "hidden knowledge" to link otherwise unlinked domains of human interaction, and hypothesize that this multi-domain focus is an important feature of conspiracy theories. While Pizzagate relies on the alignment of multiple domains, Bridgegate remains firmly rooted in the single domain of New Jersey politics. We hypothesize that the narrative framework of a conspiracy theory might stabilize quickly in contrast to the narrative framework of an actual one, which may develop more slowly as revelations come to light. Comment: conspiracy theory, narrative structure

    Social Technology: An Integrated Strategy and Risk Management Framework

    Accounting firms, corporations, and nonprofits use social technology to attract and develop employees, manage business intelligence, innovate business processes, engage clients, customers, and members, and disseminate information to investors and regulators. Despite its benefits, social technology's unique reach and speed create new risks for managers, accountants, and auditors. Based upon prior research and modifications to Kaplan and Norton's (2004) balanced scorecard and the COSO (2017) Enterprise Risk Management framework, we develop an Integrated Social Technology Strategy and Risk Management Framework to model risk management during strategy selection and implementation. A field investigation involving three large accounting organizations supports the framework's representativeness for the accounting profession. This research identifies significant benefits, risks, and effective risk management controls for social technology strategies, from governance to monitoring activities. These results suggest this framework's potential usefulness to managers, auditors, consultants, and researchers examining how social technology can provide value to organizations.

    Making Thin Data Thick: User Behavior Analysis with Minimum Information

    With the rise of social media, user-generated content has become available at an unprecedented scale. On Twitter, 1 billion tweets are posted every 5 days and on Facebook, 20 million links are shared every 20 minutes. These massive collections of user-generated content have introduced human behavior's big data. This big data has brought about countless opportunities for analyzing human behavior at scale. However, is this data enough? Unfortunately, the data available at the individual level is limited for most users. This limited individual-level data is often referred to as thin data. Hence, researchers face a big-data paradox, where this big data is a large collection of mostly limited individual-level information. Researchers are often constrained to derive meaningful insights regarding online user behavior with this limited information. Simply put, they have to make thin data thick. In this dissertation, how human behavior's thin data can be made thick is investigated. The chief objective of this dissertation is to demonstrate how traces of human behavior can be efficiently gleaned from the often limited individual-level information, hence introducing an all-inclusive user behavior analysis methodology that considers social media users with different levels of information availability. To that end, the absolute minimum information, in terms of either link or content data, that is available for any social media user is determined. Utilizing only minimum information in different applications on social media, such as prediction or recommendation tasks, allows for solutions that are (1) generalizable to all social media users and (2) easy to implement. However, are applications that employ only minimum information as effective as, or comparable to, applications that use more information?
In this dissertation, it is shown that common research challenges such as detecting malicious users or friend recommendation (i.e., link prediction) can be effectively performed using only minimum information. More importantly, it is demonstrated that unique user identification can be achieved using minimum information. Theoretical boundaries of unique user identification are obtained by introducing social signatures. Social signatures allow for user identification in any large-scale network on social media. The results on single-site user identification are generalized to multiple sites and it is shown how the same user can be uniquely identified across multiple sites using only minimum link or content information. The findings in this dissertation allow finding the same user across multiple sites, which in turn has multiple implications. In particular, by identifying the same users across sites, (1) patterns that users exhibit across sites are identified, (2) how user behavior varies across sites is determined, and (3) activities that are observed only across sites are identified and studied. Dissertation/Thesis: Doctoral Dissertation, Computer Science 201

    CLINICAL AND SOCIAL PATHWAYS TO CARE: A COMPUTATIONAL EXAMINATION OF SOCIAL MEDIA FOR MENTAL HEALTH CARE

    In the last decade, powered by connectivity to large social networks and advances in collecting and analyzing digital traces of individuals from social media platforms, researchers have gleaned rich insights into individuals’ and populations’ mental health states and experiences, including their moods, emotions, social interactions, language, and communication patterns. Using these inferences, researchers have been able to study support-seeking behaviors, distinguishing patterns, risk markers, and diagnosis states for mental illnesses from social media data, promising a fundamental change in mental health care. What we need next in this line of work is for data and algorithms based on social media to be contextualized in people’s pathways to mental health care. However, there are several challenges and unanswered questions that present hurdles. First, gaps exist in the psychometric validity of social media based measurements of behaviors and the utility of these inferences in predicting clinical outcomes in patient populations. Second, if social media can act as an intervention platform, outside of discrete events, a holistic understanding of its role in people’s lives along the course of a mental illness is crucial. Lastly, several questions remain around the ethical implications of research practices in engaging with a vulnerable population subject to this research. This thesis charts out empirical and critical understandings and develops novel computational techniques to ethically and holistically examine how social media can be employed to support mental health care. 
Focusing on schizophrenia, one of the most debilitating and stigmatizing of mental illnesses, this thesis contributes a deeper understanding of pathways to care via social media along three themes: 1) prediction of clinical mental health states from social media data to support clinical interventions, 2) understanding online self-disclosure and social support as pathways to social care, and 3) the intersection of social and clinical pathways to care along the course of mental illness. In doing so, this work combines theories from social psychology, computer-mediated communication, and clinical literature with machine learning, statistical modeling, and natural language analysis methods applied to large-scale behavioral data from social media platforms. Together, this work contributes novel methodologies and human-centered algorithmic design frameworks to understand the efficacy of social media as a mental health intervention platform, informing clinicians, researchers, and designers who engage in developing and deploying interventions for mental health and well-being. Ph.D

    Evaluation without ground truth in social media research
