47,632 research outputs found

    Social Media Data in Research : Provenance Challenges

    The work described here was funded by a grant from the United Kingdom’s Economic and Social Research Council, Social Media - Developing Understanding, Infrastructure & Engagement (ES/M001628/1). Postprin

    Web-scale provenance reconstruction of implicit information diffusion on social media

    Fast, massive, and viral data diffused on social media affects a large share of the online population, and thus the (prospective) information diffusion mechanisms behind it are of great interest to researchers. The (retrospective) provenance of such data is equally important because it contributes to the understanding of the relevance and trustworthiness of the information. Furthermore, computing provenance in a timely way is crucial for particular use cases and practitioners, such as online journalists who need to assess specific pieces of information promptly. Social media currently provide insufficient mechanisms for provenance tracking, publication and generation, while state-of-the-art social media research focuses mainly on explicit diffusion mechanisms (like retweets in Twitter or reshares in Facebook). The implicit diffusion mechanisms remain understudied because they are difficult to capture and properly understand. On the technical side, the state of the art for provenance reconstruction evaluates small datasets after the fact, sidestepping the scale and speed requirements of current social media data. In this paper, we investigate the mechanisms of implicit information diffusion by computing its fine-grained provenance. We prove that explicit mechanisms are insufficient to capture influence, and our analysis unravels a significant part of the implicit interactions and influence in social media. Our approach works incrementally and can be scaled up to cover a truly Web-scale scenario like major events. We can process datasets consisting of up to several million messages on a single machine at rates that cover bursty behaviour, without compromising result quality. In doing so, we provide online journalists, and social media users in general, with fine-grained provenance reconstruction that sheds light on implicit interactions not captured by social media providers. These results are provided in an online fashion, which also allows for fast relevance and trustworthiness assessment.

    Five sepharose-bound ligands for the chromatographic purification of Clostridium collagenase and clostripain

    Social media data have provoked a mixed response from researchers. While there is great enthusiasm for this new source of social data – Twitter data in particular – concerns are also expressed about their biases and unknown provenance and, consequently, their credibility for social research. This article seeks a middle path, arguing that we must develop a better understanding of the construction and circulation of social media data to evaluate their appropriate uses and the claims that might be made from them. Building on sociotechnical approaches, we propose a high-level abstraction of the ‘pipeline’ through which social media data are constructed and circulated. In turn, we explore how this shapes the populations and samples that are present in social media data and the methods that generate data about them. We conclude with some broad principles for supporting methodologically informed social media research in the future.

    Consumer Data Research

    Big Data collected by customer-facing organisations – such as smartphone logs, store loyalty card transactions, smart travel tickets, social media posts, or smart energy meter readings – account for most of the data collected about citizens today. As a result, they are transforming the practice of social science. Consumer Big Data are distinct from conventional social science data not only in their volume, variety and velocity, but also in terms of their provenance and fitness for ever more research purposes. The contributors to this book, all from the Consumer Data Research Centre, provide a first consolidated statement of the enormous potential of consumer data research in the academic, commercial and government sectors – and a timely appraisal of the ways in which consumer data challenge scientific orthodoxies.

    Media forensics on social media platforms: a survey

    The dependability of visual information on the web and the authenticity of digital media spreading virally on social media platforms have been raising unprecedented concerns. As a result, in recent years the multimedia forensics research community has pursued the ambition of scaling forensic analysis to real-world, web-based open systems. This survey describes the work done so far on the analysis of shared data, covering three main aspects: forensic techniques that perform source identification and integrity verification on media uploaded to social networks, platform provenance analysis that identifies the sharing platforms, and multimedia verification algorithms that assess the credibility of media objects in relation to their associated textual information. The achieved results are highlighted together with current open issues and research challenges to be addressed in order to advance the field in the near future.

    Transparency, Reproducibility, and Replicability

    Presentation at the American Statistical Association's Joint Statistical Meetings in Vancouver, Canada, July 29, 2018. Session on Transparency, Reproducibility, and Replicability. Frontier social science and evidence-based policy analyses increasingly rely on large-scale, naturally occurring data, such as administrative, transaction, and social media data. These data capture phenomena at higher frequency, lower cost, and greater timeliness than traditional methods. Using naturally occurring data for analytic purposes is not free: it requires management of governance and custody, processing, and linking to other data. Without methods for preservation and access, with appropriate provenance, naturally occurring data may be re-produced again and again, at high cost. The cost is not simply in dollars and time. There is significant cost to science, as replication is impossible. Naturally occurring data naturally change. Analyses repeated on data without proper documentation, versioning, or provenance vary from one another for reasons having nothing to do with the underlying science. The Inter-university Consortium for Political and Social Research (ICPSR) has for over 55 years curated and disseminated social science data for re-use and replication. This paper presents steps ICPSR is taking to develop tools and protocols, including a new repository of data linkage algorithms.
    https://deepblue.lib.umich.edu/bitstream/2027.42/145176/3/JSM Transparency & Reproducibility 2018.pdf
    https://deepblue.lib.umich.edu/bitstream/2027.42/145176/6/JSM Transparency & Reproducibility 2018.pdf
    Description of JSM Transparency & Reproducibility 2018.pdf : Presentatio

    Provenance analysis for Instagram photos

    As a feasible device fingerprint, sensor pattern noise (SPN) has been proven to be effective in the provenance analysis of digital images. However, with the rise of social media, millions of images are being uploaded to and shared through social media sites every day. An image downloaded from a social network may have gone through a series of unknown image manipulations. Consequently, the trustworthiness of SPN has been challenged in the provenance analysis of images downloaded from social media platforms. In this paper, we investigate the effects of the pre-defined Instagram image filters on SPN-based image provenance analysis. We identify two groups of filters that affect the SPN in quite different ways, with Group I consisting of the filters that severely attenuate the SPN and Group II consisting of the filters that preserve the SPN well in the images. We further propose a CNN-based classifier to perform filter-oriented image categorization, aiming to exclude the images manipulated by the filters in Group I and thus improve the reliability of the SPN-based provenance analysis. The results on about 20,000 images and 18 filters are very promising, with an accuracy higher than 96% in differentiating between the filters in Group I and Group II.