6,470 research outputs found

    Fully Automated Fact Checking Using External Sources

    Given the constantly growing proliferation of false claims online in recent years, there has also been growing research interest in automatically distinguishing false rumors from factually true claims. Here, we propose a general-purpose framework for fully automatic fact checking using external sources, tapping the potential of the entire Web as a knowledge source to confirm or reject a claim. Our framework uses a deep neural network with LSTM text encoding to combine semantic kernels with task-specific embeddings that encode a claim together with pieces of potentially relevant text fragments from the Web, taking source reliability into account. The evaluation results show good performance on two different tasks and datasets: (i) rumor detection and (ii) fact checking of the answers to a question in community question answering forums. Comment: RANLP-201

    Evaluation Measures for Relevance and Credibility in Ranked Lists

    Recent discussions on alternative facts, fake news, and post-truth politics have motivated research on creating technologies that allow people not only to access information, but also to assess the credibility of the information presented to them by information retrieval systems. Whereas technology is in place for filtering information according to relevance and/or credibility, no single measure currently exists for evaluating the accuracy or precision (and, more generally, effectiveness) of both the relevance and the credibility of retrieved results. One obvious way of doing so is to measure relevance and credibility effectiveness separately, and then consolidate the two measures into one. There are at least two problems with such an approach: (I) it is not certain that the same criteria are applied to the evaluation of both relevance and credibility (and applying different criteria introduces bias to the evaluation); (II) many more and richer measures exist for assessing relevance effectiveness than for assessing credibility effectiveness (hence risking further bias). Motivated by the above, we present two novel types of evaluation measures that are designed to measure the effectiveness of both relevance and credibility in ranked lists of retrieval results. Experimental evaluation on a small human-annotated dataset (which we make freely available to the research community) shows that our measures are expressive and intuitive in their interpretation.
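The "obvious" consolidation approach that the abstract argues against can be made concrete with a small sketch. The function names, the choice of precision@k for both dimensions, and the equal weighting are illustrative assumptions, not the measures proposed in the paper:

```python
# Sketch of the naive "evaluate separately, then consolidate" approach
# discussed in the abstract. Precision@k is used for both dimensions purely
# for illustration; the paper's own proposed measures are defined differently.

def precision_at_k(labels, k):
    """Fraction of the top-k ranked results whose binary label is 1."""
    return sum(labels[:k]) / k

def consolidated_score(relevance_labels, credibility_labels, k=5, weight=0.5):
    """Weighted combination of relevance and credibility precision.

    Using the same measure for both dimensions sidesteps problem (I)
    above: different evaluation criteria for relevance vs. credibility.
    """
    rel = precision_at_k(relevance_labels, k)
    cred = precision_at_k(credibility_labels, k)
    return weight * rel + (1 - weight) * cred

# Ranked list of five results: 1 = relevant / credible, 0 = not.
relevance = [1, 1, 0, 1, 0]      # precision@5 = 0.6
credibility = [1, 0, 1, 1, 1]    # precision@5 = 0.8
print(round(consolidated_score(relevance, credibility), 3))  # 0.7
```

Note that the single combined number hides exactly the trade-off the abstract highlights: a list of relevant-but-dubious results and a list of credible-but-off-topic results can receive the same consolidated score.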

    Credibility analysis of textual claims with explainable evidence

    Despite being a vast resource of valuable information, the Web has been polluted by the spread of false claims. Increasing hoaxes, fake news, and misleading information on the Web have given rise to many fact-checking websites that manually assess these doubtful claims. However, the rapid speed and large scale of misinformation spread have become the bottleneck for manual verification. This calls for credibility assessment tools that can automate this verification process. Prior works in this domain make strong assumptions about the structure of the claims and the communities where they are made. Most importantly, black-box techniques proposed in prior works lack the ability to explain why a certain statement is deemed credible or not. To address these limitations, this dissertation proposes a general framework for automated credibility assessment that does not make any assumption about the structure or origin of the claims. Specifically, we propose a feature-based model, which automatically retrieves relevant articles about the given claim and assesses its credibility by capturing the mutual interaction between the language style of the relevant articles, their stance towards the claim, and the trustworthiness of the underlying web sources. We further enhance our credibility assessment approach and propose a neural-network-based model. Unlike the feature-based model, this model does not rely on feature engineering and external lexicons. Both our models make their assessments interpretable by extracting explainable evidence from judiciously selected web sources. We utilize our models to develop a Web interface, CredEye, which enables users to automatically assess the credibility of a textual claim and inspect the assessment by browsing through judiciously and automatically selected evidence snippets.
    In addition, we study the problem of stance classification and propose a neural-network-based model for predicting the stance of diverse user perspectives on controversial claims. Given a controversial claim and a user comment, our stance classification model predicts whether the user comment supports or opposes the claim.

    Usability aspects of the inside-in approach for ancillary search tasks on the web

    Given the huge amount of data available over the Web nowadays, search engines have become essential tools that help users find the information they are looking for. Nonetheless, search engines often return large sets of results, which must be filtered by the users to find suitable information items. However, in many cases filtering is not enough, as the results returned by the engine require users to perform a secondary search to complement the current information, thus featuring ancillary search tasks. Such ancillary search tasks create a nested context for user tasks that increases the articulatory distance between the users and their ultimate goal. In this paper, we analyze the interplay between such ancillary searches and other primary search tasks on the Web. Moreover, we describe the inside-in approach, which aims at reducing the articulatory distance between interleaved tasks by allowing users to perform ancillary search tasks without losing the context. The inside-in approach is illustrated by means of a case study based on ancillary searches of coauthors in a digital library, using an information visualization technique.

    Credibility perceptions of content contributors and consumers in social media

    This panel addresses information credibility issues in the context of social media. During this panel, participants will discuss people's credibility perceptions of online content in social media from the perspectives of both content contributors and consumers. Each panelist will bring her own perspective on credibility issues in various social media, including Twitter (Morris), Wikipedia (Metzger; Francke), blogs (Rieh), and social Q&A (Jeon). This panel aims to flesh out multi-disciplinary approaches to the investigation of credibility and discuss integrated conceptual frameworks and future research directions focusing on assessing and establishing credibility in social media. Peer Reviewed. http://deepblue.lib.umich.edu/bitstream/2027.42/111174/1/meet14505101022.pd

    The Extended Mind and Network-Enabled Cognition

    In thinking about the transformative potential of network technologies with respect to human cognition, it is common to see network resources as playing a largely assistive or augmentative role. In this paper we propose a somewhat more radical vision. We suggest that the informational and technological elements of a network system can, at times, constitute part of the material supervenience base for a human agent’s mental states and processes. This thesis (called the thesis of network-enabled cognition) draws its inspiration from the notion of the extended mind that has been propounded in the philosophical and cognitive science literature. Our basic claim is that network systems can do more than just augment cognition; they can also constitute part of the physical machinery that makes mind and cognition mechanistically possible. In evaluating this hypothesis, we identify a number of issues that seem to undermine the extent to which contemporary network systems, most notably the World Wide Web, can legitimately feature as part of an environmentally-extended cognitive system. Specific problems include the reliability and resilience of network-enabled devices, the accessibility of online information content, and the extent to which network-derived information is treated in the same way as information retrieved from biological memory. We argue that these apparent shortfalls do not necessarily merit the wholesale rejection of the network-enabled cognition thesis; rather, they point to the limits of the current state-of-the-art and identify the targets of many ongoing research initiatives in the network and information sciences. In addition to highlighting the importance of current research and technology development efforts, the thesis of network-enabled cognition also suggests a number of areas for future research. 
    These include the formation and maintenance of online trust relationships, the subjective assessment of information credibility, and the long-term impact of network access on human psychological and cognitive functioning. The nascent discipline of web science is, we suggest, suitably placed to begin an exploration of these issues.

    University of Copenhagen Participation in TREC Health Misinformation Track 2020

    In this paper, we describe our participation in the TREC Health Misinformation Track 2020. We submitted 1111 runs to the Total Recall Task and 13 runs to the Ad Hoc task. Our approach consists of three steps: (1) we create an initial run with BM25 and RM3; (2) we estimate credibility and misinformation scores for the documents in the initial run; (3) we merge the relevance, credibility, and misinformation scores to re-rank the documents in the initial run. To estimate credibility scores, we implement a classifier that exploits features based on the content and the popularity of a document. To compute the misinformation score, we apply a stance detection approach with a pretrained Transformer language model. Finally, we use different approaches to merge scores: a weighted average, the distance among score vectors, and rank fusion.
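Step (3) of this pipeline can be illustrated with a minimal sketch of the weighted-average merging strategy. The weight values, the assumption that all scores are normalized to [0, 1], and the convention that a higher misinformation score lowers the final rank are illustrative choices, not the authors' actual configuration; the run also explores score-vector distance and rank fusion, which are not shown here:

```python
# Minimal sketch of merging relevance, credibility and misinformation
# scores to re-rank an initial run (step 3 above). Weights are assumed;
# a higher misinformation score pushes a document down the ranking.

def merge_scores(run, w_rel=0.6, w_cred=0.3, w_mis=0.1):
    """Weighted-average merge; returns (doc_id, score) sorted best-first."""
    merged = []
    for doc_id, relevance, credibility, misinfo in run:
        score = w_rel * relevance + w_cred * credibility - w_mis * misinfo
        merged.append((doc_id, score))
    # Higher combined score ranks first.
    return sorted(merged, key=lambda pair: pair[1], reverse=True)

# (doc_id, normalized BM25+RM3 relevance, credibility, misinformation).
initial_run = [
    ("d1", 0.9, 0.2, 0.8),  # highly relevant, but low-credibility
    ("d2", 0.7, 0.9, 0.1),  # less relevant, but trustworthy
]
for doc_id, score in merge_scores(initial_run):
    print(doc_id, round(score, 2))
```

With these weights the trustworthy document d2 overtakes the more relevant d1, which is precisely the corrective effect a credibility-aware re-ranking step is meant to produce.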

    Evaluation of Spam Impact on Arabic Websites Popularity

    The expansion of the Web and its information into all aspects of life raises the concern of how to trust information published on the Web, especially when the publisher may not be known. Websites strive to be more popular and to make themselves visible to search engines and, eventually, to users. Website popularity can be measured using several metrics, such as Web traffic (e.g., the number of visitors and the number of visited pages). Link or page popularity refers to the total number of hyperlinks referring to a certain Web page. In this study, several top-ranked Arabic Websites are selected for evaluating possible Web spam behavior. Websites use spam techniques to boost their ranks within the Search Engine Results Page (SERP). Results of this study showed that some of these popular Websites use techniques that are considered spam according to Search Engine Optimization guidelines.

    Reach and rich: the new economics of information and the provision of on-line legal services in the UK

    The paper considers a number of issues, including the use of the Web as an opportunity for smaller firms to break free from traditional indicators of reputation and expertise, such as the size and opulence of offices. It also reflects on the use of client-specific Extranets in addition to publicly available Internet sites. The paper concludes that although the Web provides reach, offering richness and the sense of community required for creating and sustaining relationships with potential clients can be difficult. Some suggestions are made for enhancing 'Richness' in Web sites.