13,426 research outputs found

    Credibility analysis of textual claims with explainable evidence

    Get PDF
    Despite being a vast resource of valuable information, the Web has been polluted by the spread of false claims. Increasing hoaxes, fake news, and misleading information on the Web have given rise to many fact-checking websites that manually assess these doubtful claims. However, the rapid speed and large scale of misinformation spread have become the bottleneck for manual verification. This calls for credibility assessment tools that can automate this verification process. Prior works in this domain make strong assumptions about the structure of the claims and the communities where they are made. Most importantly, black-box techniques proposed in prior works lack the ability to explain why a certain statement is deemed credible or not. To address these limitations, this dissertation proposes a general framework for automated credibility assessment that does not make any assumption about the structure or origin of the claims. Specifically, we propose a feature-based model, which automatically retrieves relevant articles about the given claim and assesses its credibility by capturing the mutual interaction between the language style of the relevant articles, their stance towards the claim, and the trustworthiness of the underlying web sources. We further enhance our credibility assessment approach and propose a neural-network-based model. Unlike the feature-based model, this model does not rely on feature engineering and external lexicons. Both our models make their assessments interpretable by extracting explainable evidence from judiciously selected web sources. We utilize our models and develop a Web interface, CredEye, which enables users to automatically assess the credibility of a textual claim and dissect into the assessment by browsing through judiciously and automatically selected evidence snippets. In addition, we study the problem of stance classification and propose a neural-network-based model for predicting the stance of diverse user perspectives regarding the controversial claims. Given a controversial claim and a user comment, our stance classification model predicts whether the user comment is supporting or opposing the claim.Das Web ist eine riesige Quelle wertvoller Informationen, allerdings wurde es durch die Verbreitung von Falschmeldungen verschmutzt. Eine zunehmende Anzahl an Hoaxes, Falschmeldungen und irreführenden Informationen im Internet haben viele Websites hervorgebracht, auf denen die Fakten überprüft und zweifelhafte Behauptungen manuell bewertet werden. Die rasante Verbreitung großer Mengen von Fehlinformationen sind jedoch zum Engpass für die manuelle Überprüfung geworden. Dies erfordert Tools zur Bewertung der Glaubwürdigkeit, mit denen dieser Überprüfungsprozess automatisiert werden kann. In früheren Arbeiten in diesem Bereich werden starke Annahmen gemacht über die Struktur der Behauptungen und die Portale, in denen sie gepostet werden. Vor allem aber können die Black-Box-Techniken, die in früheren Arbeiten vorgeschlagen wurden, nicht erklären, warum eine bestimmte Aussage als glaubwürdig erachtet wird oder nicht. Um diesen Einschränkungen zu begegnen, wird in dieser Dissertation ein allgemeines Framework für die automatisierte Bewertung der Glaubwürdigkeit vorgeschlagen, bei dem keine Annahmen über die Struktur oder den Ursprung der Behauptungen gemacht werden. Insbesondere schlagen wir ein featurebasiertes Modell vor, das automatisch relevante Artikel zu einer bestimmten Behauptung abruft und deren Glaubwürdigkeit bewertet, indem die gegenseitige Interaktion zwischen dem Sprachstil der relevanten Artikel, ihre Haltung zur Behauptung und der Vertrauenswürdigkeit der zugrunde liegenden Quellen erfasst wird. Wir verbessern unseren Ansatz zur Bewertung der Glaubwürdigkeit weiter und schlagen ein auf neuronalen Netzen basierendes Modell vor. Im Gegensatz zum featurebasierten Modell ist dieses Modell nicht auf Feature-Engineering und externe Lexika angewiesen. Unsere beiden Modelle machen ihre Einschätzungen interpretierbar, indem sie erklärbare Beweise aus sorgfältig ausgewählten Webquellen extrahieren. Wir verwenden unsere Modelle zur Entwicklung eines Webinterfaces, CredEye, mit dem Benutzer die Glaubwürdigkeit einer Behauptung in Textform automatisch bewerten und verstehen können, indem sie automatisch ausgewählte Beweisstücke einsehen. Darüber hinaus untersuchen wir das Problem der Positionsklassifizierung und schlagen ein auf neuronalen Netzen basierendes Modell vor, um die Position verschiedener Benutzerperspektiven in Bezug auf die umstrittenen Behauptungen vorherzusagen. Bei einer kontroversen Behauptung und einem Benutzerkommentar sagt unser Einstufungsmodell voraus, ob der Benutzerkommentar die Behauptung unterstützt oder ablehnt

    Data analytics 2016: proceedings of the fifth international conference on data analytics

    Get PDF

    Citizen Science: The Law and Ethics of Public Access to Medical Big Data

    Get PDF
    Patient-related medical information is becoming increasingly available on the Internet, spurred by government open data policies and private sector data sharing initiatives. Websites such as HealthData.gov, GenBank, and PatientsLikeMe allow members of the public to access a wealth of health information. As the medical information terrain quickly changes, the legal system must not lag behind. This Article provides a base on which to build a coherent data policy. It canvasses emergent data troves and wrestles with their legal and ethical ramifications. Publicly accessible medical data have the potential to yield numerous benefits, including scientific discoveries, cost savings, the development of patient support tools, healthcare quality improvement, greater government transparency, public education, and positive changes in healthcare policy. At the same time, the availability of electronic personal health information that can be mined by any Internet user raises concerns related to privacy, discrimination, erroneous research findings, and litigation. This Article analyzes the benefits and risks of health data sharing and proposes balanced legislative, regulatory, and policy modifications to guide data disclosure and use

    The Privacy Matrix

    Get PDF

    Understanding and Improving Continuous Experimentation : From A/B Testing to Continuous Software Optimization

    Get PDF
    Controlled experiments (i.e. A/B tests) are used by many companies with user-intensive products to improve their software with user data. Some companies adopt an experiment-driven approach to software development with continuous experimentation (CE). With CE, every user-affecting software change is evaluated in an experiment and specialized roles seek out opportunities to experiment with functionality. The goal of the thesis is to describe current practice and support CE in industry. The main contributions are threefold. First, a review of the CE literature on: infrastructure and processes, the problem-solution pairs applied in industry practice, and the benefits and challenges of the practice. Second, a multi-case study with 12 companies to analyze how experimentation is used and why some companies fail to fully realize the benefits of CE. A theory for Factors Affecting Continuous Experimentation (FACE) is constructed to realize this goal. Finally, a toolkit called Constraint Oriented Multi-variate Bandit Optimization (COMBO) is developed for supporting automated experimentation with many variables simultaneously, live in a production environment.The research in the thesis is conducted under the design science paradigm using empirical research methods, with simulation experiments of tool proposals and a multi-case study on company usage of CE. Other research methods include systematic literature review and theory building.From FACE we derive three factors that explain CE utility: (1) investments in data infrastructure, (2) user problem complexity, and (3) incentive structures for experimentation. Guidelines are provided on how to strive towards state-of-the-art CE based on company factors. All three factors are relevant for companies wanting to use CE, in particular, for those companies wanting to apply algorithms such as those in COMBO to support personalization of software to users' context in a process of continuous optimization
    corecore