
    The future of social is personal: the potential of the personal data store

    This chapter argues that technical architectures that facilitate the longitudinal, decentralised and individual-centric personal collection and curation of data will be an important, but partial, response to the pressing problem of the autonomy of the data subject, and the asymmetry of power between the subject and large-scale service providers/data consumers. To frame the scope and role of such Personal Data Stores (PDSes), the legalistic notion of personal data is examined, and it is argued that a more inclusive, intuitive notion expresses more accurately what individuals require in order to preserve their autonomy in a data-driven world of large aggregators. Six challenges to realising the PDS vision are set out: the requirement to store data for long periods; the difficulties of managing data for individuals; the need to reconsider the regulatory basis for third-party access to data; the need to comply with international data handling standards; the need to integrate privacy-enhancing technologies; and the need to future-proof data gathering against the evolution of social norms. The open experimental PDS platform INDX is introduced and described as a means of beginning to address at least some of these six challenges.

    A Framework for Aggregating Private and Public Web Archives

    Personal and private Web archives are proliferating due to the increase in the tools to create them and the realization that Internet Archive and other public Web archives are unable to capture personalized (e.g., Facebook) and private (e.g., banking) Web pages. We introduce a framework to mitigate issues of aggregation in private, personal, and public Web archives without compromising potentially sensitive information contained in private captures. We amend Memento syntax and semantics to allow TimeMap enrichment, so that additional attributes can be expressed, including the requirements for dereferencing private Web archive captures. We provide a method to involve the user further in the negotiation of archival captures in dimensions beyond time. We introduce a model for archival querying precedence and short-circuiting, as needed when aggregating private and personal Web archive captures with those from public Web archives through Memento. Negotiation of this sort is novel to Web archiving and allows for the more seamless aggregation of various types of Web archives to convey a more accurate picture of the past Web. Comment: Preprint version of the ACM/IEEE Joint Conference on Digital Libraries (JCDL 2018) full paper, accessible at the DO

    Web Data Extraction, Applications and Techniques: A Survey

    Web Data Extraction is an important problem that has been studied by means of different scientific tools and in a broad range of applications. Many approaches to extracting data from the Web have been designed to solve specific problems and operate in ad-hoc domains. Other approaches, instead, heavily reuse techniques and algorithms developed in the field of Information Extraction. This survey aims to provide a structured and comprehensive overview of the literature in the field of Web Data Extraction. We provide a simple classification framework in which existing Web Data Extraction applications are grouped into two main classes, namely applications at the Enterprise level and at the Social Web level. At the Enterprise level, Web Data Extraction techniques emerge as a key tool for data analysis in Business and Competitive Intelligence systems as well as for business process re-engineering. At the Social Web level, Web Data Extraction techniques make it possible to gather the large amount of structured data continuously generated and disseminated by Web 2.0, Social Media and Online Social Network users, which offers unprecedented opportunities to analyze human behavior at a very large scale. We also discuss the potential for cross-fertilization, i.e., the possibility of reusing Web Data Extraction techniques originally designed to work in a given domain in other domains. Comment: Knowledge-based System

    Archiving Software Surrogates on the Web for Future Reference

    Software has long been established as an essential aspect of the scientific process in mathematics and other disciplines. However, reliably referencing software in scientific publications is still challenging for various reasons. A crucial factor is that software is dynamic, and its temporal versions or states are difficult to capture over time. We propose to archive and reference surrogates instead, which can be found on the Web and reflect the actual software to a remarkable extent. Our study shows that about half of the webpages of software are already archived, with almost all of them including some kind of documentation. Comment: TPDL 2016, Hannover, German

    Master of Arts

    Thesis. The rise of digital media technologies has changed how we remember the past. This study examines the memorial functions of Web 2.0 and digital memories. I suggest that memory practices that use Web 2.0 technologies are not just extensions of older forms of human memory practice based on a dichotomy between technological and human memory practices in which one is seen as determining or changing the other; memory practice with/in materiality, specifically Web 2.0 memory practice, is a collective where heterogeneous realities are mingled in the same domain, and the intersection entails new meanings, capacities, and potentials of memories. Borrowing methodological insights from actor-network theory (ANT), I examine the human actors (users and administrator), Web 2.0 technologies (interface and database/server), and political factors (terms and policy) on the same ontological level to show how the mixture of social factors and technological elements becomes memories and/or a memorial website. To illustrate this human-technical network of social media memory practice, I examine Citizen Network Remembering The Sewol (www.sa416.org), an extensive online public documentation site that commemorates the sinking of the Korean ferry Sewol. Through this study, I reveal the ways in which the various actors, both human and nonhuman, function, and I show how each node of the network intersects in the practices of memory production and the politics

    Toward open computational communication science: A practical road map for reusable data and code

    Computational communication science (CCS) offers an opportunity to accelerate the scope and pace of discovery in communication research. This article argues that CCS will profit from adopting open science practices by fostering the reusability of data and code. We discuss the goals and challenges related to creating reusable data and code and offer practical guidance to help individual researchers achieve this. More specifically, we argue for integration of the research process into reusable workflows and recognition of tools and data as academic work. The challenges and road map are also critically discussed in terms of the additional burden they place on individual scholars, which culminates in a call to action for the field to support and incentivize the reusability of tools and data.

    Third-Party SDKs and Mobile App Performance

    To create attractive mobile apps in the competitive mobile market, developers are increasingly leveraging third-party software development kits (SDKs) in app development. However, little is known about how using third-party toolkits affects app performance. Drawing on the platform literature and boundary object theory, we conceptualize third-party SDK utilization as a boundary-spanning activity. Based on this, we theorize its impact on app performance, considering the mobile platform and app developers as contextual factors. We examine the causal influence of third-party SDKs on app performance by conducting difference-in-difference-style analyses on a longitudinal dataset of mobile apps released on the Apple App Store and Google Play. We find empirical evidence supporting our theoretical conjecture that utilizing more third-party SDKs increases active users. More interestingly, platform updates and developers' platform-specific experience attenuate this positive impact. This study contributes to the platform-based innovation and governance literature and provides managerial implications in mobile domains.