
    Credibility analysis of textual claims with explainable evidence

    Despite being a vast resource of valuable information, the Web has been polluted by the spread of false claims. The growing volume of hoaxes, fake news, and misleading information on the Web has given rise to many fact-checking websites that manually assess these doubtful claims. However, the rapid speed and large scale of misinformation spread have become a bottleneck for manual verification. This calls for credibility assessment tools that can automate the verification process. Prior works in this domain make strong assumptions about the structure of the claims and the communities where they are made. Most importantly, the black-box techniques proposed in prior works lack the ability to explain why a certain statement is deemed credible or not. To address these limitations, this dissertation proposes a general framework for automated credibility assessment that makes no assumptions about the structure or origin of the claims. Specifically, we propose a feature-based model, which automatically retrieves relevant articles about a given claim and assesses its credibility by capturing the mutual interaction between the language style of the relevant articles, their stance towards the claim, and the trustworthiness of the underlying web sources. We further enhance our credibility assessment approach and propose a neural-network-based model. Unlike the feature-based model, this model does not rely on feature engineering or external lexicons. Both models make their assessments interpretable by extracting explainable evidence from judiciously selected web sources. We use our models to develop a Web interface, CredEye, which enables users to automatically assess the credibility of a textual claim and to examine the assessment by browsing through judiciously and automatically selected evidence snippets. In addition, we study the problem of stance classification and propose a neural-network-based model for predicting the stance of diverse user perspectives regarding controversial claims. Given a controversial claim and a user comment, our stance classification model predicts whether the user comment supports or opposes the claim.
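    To make the feature-based assessment concrete, here is a minimal sketch of how per-article stance, language-style, and source-trust signals could be combined into a claim-level credibility score and a ranked set of evidence snippets. The dataclass fields, weighting scheme, and snippet ranking are illustrative assumptions, not the dissertation's actual model.

```python
# Hypothetical sketch of a feature-based credibility scorer in the spirit of
# the dissertation: combine per-article stance, language-style, and
# source-trust signals into one claim-level score. All names and weights
# are illustrative, not the authors' actual model.
from dataclasses import dataclass

@dataclass
class Article:
    text: str
    stance: float             # -1 (refutes claim) .. +1 (supports claim)
    style_objectivity: float  # 0 (sensational) .. 1 (objective reporting)
    source_trust: float       # 0 (unreliable source) .. 1 (reputable source)

def claim_credibility(articles: list[Article]) -> float:
    """Weight each article's stance by how objective its language is and
    how trustworthy its source is, then average; result lies in [-1, 1]."""
    if not articles:
        return 0.0
    weighted = [a.stance * a.style_objectivity * a.source_trust for a in articles]
    return sum(weighted) / len(articles)

def evidence_snippets(articles: list[Article], k: int = 3) -> list[str]:
    """Pick the k articles whose combined signal is strongest as the
    explainable evidence shown to the user."""
    ranked = sorted(
        articles,
        key=lambda a: abs(a.stance) * a.style_objectivity * a.source_trust,
        reverse=True,
    )
    return [a.text[:200] for a in ranked[:k]]
```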

    Unchaining Collective Intelligence for Science, Research and Technology Development by Blockchain-Boosted Community Participation

    Since its launch just over a decade ago by the cryptocurrency Bitcoin, the distributed ledger technology (DLT) blockchain has followed a breathtaking trajectory into manifold application spaces. This paper analyses how key factors underpinning the success of this ground-breaking “internet of value” technology, such as staking of collateral (“skin in the game”), competitive crowdsourcing, crowdfunding, and prediction markets, can be applied to substantially innovate the legacy organization of science, research and technology development (RTD). Here, we elaborate a highly integrative, community-based strategy in which a token-based crypto-economy supports finding the best possible consensus, trust and truth by adding unconventional elements known from reputation systems, betting, secondary markets and social networking. These tokens underpin the holder’s formalized reputation, and are used in liquid-democracy-style governance and arbitration within projects or community-driven initiatives. This participatory research model serves as a solid basis for comprehensively leveraging collective intelligence by effectively incentivizing contributions from the crowd, such as intellectual property (IP), work, validation, assessment, infrastructure, education, governance, publication, and promotion of projects. By analogy with its current blockbusters, such as peer-to-peer structured decentralized finance (“DeFi”), blockchain technology can seminally enhance the efficiency of science and RTD initiatives, even permitting operations to be run entirely as a chiefless Decentralised Autonomous Organization (DAO).
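    As a rough illustration of the liquid-democracy-style governance the paper describes, the following sketch tallies token-weighted votes in which holders may delegate their voting power transitively. The token balances, delegation rules, and tally logic are hypothetical assumptions, not part of any real DAO framework.

```python
# Minimal sketch of token-weighted liquid-democracy voting as described in
# the paper's governance model. Delegation graphs, token balances, and the
# tally rule here are illustrative assumptions, not a real DAO implementation.

def resolve_delegate(voter: str, delegations: dict[str, str]) -> str:
    """Follow the delegation chain to the voter who actually casts the vote,
    stopping if a cycle is detected."""
    seen = set()
    while voter in delegations and voter not in seen:
        seen.add(voter)
        voter = delegations[voter]
    return voter

def tally(tokens: dict[str, int],
          delegations: dict[str, str],
          votes: dict[str, bool]) -> tuple[int, int]:
    """Each token-holder's weight flows to their (transitive) delegate;
    returns (tokens in favor, tokens against)."""
    yes = no = 0
    for holder, weight in tokens.items():
        caster = resolve_delegate(holder, delegations)
        if caster in votes:
            if votes[caster]:
                yes += weight
            else:
                no += weight
    return yes, no

# Example: carol delegates to bob, so bob votes with their combined weight.
tokens = {"alice": 10, "bob": 5, "carol": 8}
delegations = {"carol": "bob"}
votes = {"alice": False, "bob": True}
print(tally(tokens, delegations, votes))  # (13, 10)
```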

    Controversy Analysis and Detection

    Seeking information on a controversial topic is often a complex task. Alerting users about controversial search results can encourage critical literacy, promote healthy civic discourse, and counteract the filter bubble effect, and would therefore be a useful feature in a search engine or browser extension. Additionally, presenting information to the user about the different stances or sides of the debate can help her navigate the landscape of search results beyond a simple list of 10 links. This thesis has made strides in the emerging niche of controversy detection and analysis. The body of work in this thesis revolves around two themes: computational models of controversy, and controversies occurring in neighborhoods of topics. Our broad contributions are: (1) Presenting a theoretical framework for modeling controversy as contention among populations; (2) Constructing the first automated approach to detecting controversy on the web, using a KNN classifier that maps from the web to similar Wikipedia articles; and (3) Proposing a novel approach to controversy detection in Wikipedia, employing a stacked model that combines link structure and similarity. We conclude this work by discussing the challenging technical, societal and ethical implications of this emerging research area and proposing avenues for future work.
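    The web-to-Wikipedia KNN idea in contribution (2) can be sketched as follows: represent documents as TF-IDF vectors, find the Wikipedia articles nearest to a given web page, and aggregate their controversy labels. The corpus, labels, and choice of k below are placeholder assumptions, not the thesis's actual data or pipeline.

```python
# Sketch of the web-to-Wikipedia KNN mapping: score a web page's controversy
# by averaging the controversy labels of its most similar Wikipedia articles.
# The corpus, labels, and k below are illustrative placeholders.
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.neighbors import NearestNeighbors

wiki_texts = [
    "debate over climate policy and carbon taxation ...",
    "history of the printing press and movable type ...",
    "arguments about vaccination schedules and safety ...",
]
wiki_controversy = np.array([0.9, 0.1, 0.8])  # per-article controversy scores

vectorizer = TfidfVectorizer(stop_words="english")
wiki_vectors = vectorizer.fit_transform(wiki_texts)
knn = NearestNeighbors(n_neighbors=2, metric="cosine").fit(wiki_vectors)

def controversy_score(web_page_text: str) -> float:
    """Average the controversy of the k nearest Wikipedia articles."""
    query = vectorizer.transform([web_page_text])
    _, idx = knn.kneighbors(query)
    return float(wiki_controversy[idx[0]].mean())

print(controversy_score("new climate policy sparks heated debate"))
```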

    Measuring COVID-19 Related Media Consumption on Twitter

    The COVID-19 pandemic has dramatically affected the world since 2020. The limited availability of physical interaction during lockdowns has led more and more people to turn to online activities on social media platforms. These platforms have provided essential updates regarding the pandemic, serving as bridges for communication. Research studying these communications on different platforms has emerged in the meantime. Prior studies focus on areas such as topic modeling, sentiment analysis, and prediction tasks such as forecasting COVID-19 positive cases and misinformation spread. However, online communication with media outlets remains unexplored on an international scale. We know little about the geographic patterns of media consumption and their association with offline political preferences. We believe addressing these questions could help governments and researchers better understand human behavior during the pandemic. In this thesis, we specifically investigate the online consumption of media outlets on Twitter through a set of quantitative analyses. We make use of several public media outlet datasets to extract media consumption from tweets collected via COVID-19 keyword matching. We define a metric, "interaction", to quantify media consumption through weighted Twitter activities, and we construct a matrix from this metric that can directly measure user-media consumption at different granularities. We then conduct analyses at the United States level and at the global level. To the best of our knowledge, this thesis presents the first-of-its-kind study on media consumption around COVID-19 across countries; it sheds light on how people consume media outlets during the pandemic and provides potential insights for peer researchers.
    Comment: Thesis submitted for the Bachelor of Information Technology (Honours) 2021 at AN
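    A minimal sketch of the user-media interaction matrix described above: each Twitter activity type contributes a weight, accumulated per (user, outlet) pair. The activity types and weights are illustrative assumptions, not the thesis's actual definition of "interaction".

```python
# Sketch of a user-media "interaction" matrix: each Twitter activity type
# contributes a weight, accumulated per (user, media outlet). The activity
# weights here are illustrative assumptions, not the thesis's definition.
import numpy as np

ACTIVITY_WEIGHTS = {"retweet": 1.0, "quote": 2.0, "reply": 2.0, "mention": 1.0}

def build_interaction_matrix(tweets, users, outlets):
    """tweets: iterable of (user, outlet, activity) triples.
    Returns a |users| x |outlets| matrix of summed interaction weights."""
    u_idx = {u: i for i, u in enumerate(users)}
    o_idx = {o: j for j, o in enumerate(outlets)}
    matrix = np.zeros((len(users), len(outlets)))
    for user, outlet, activity in tweets:
        matrix[u_idx[user], o_idx[outlet]] += ACTIVITY_WEIGHTS.get(activity, 0.0)
    return matrix

# Rows can then be aggregated (e.g., by country) to compare media
# consumption at coarser granularities.
```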

    Doctor of Philosophy

    Due to the popularity of Web 2.0 and social media in the last decade, the proliferation of user-generated content (UGC) has rapidly increased. In the financial realm, this has resulted in the emergence of virtual investing communities (VICs) for the investing public. There is an ongoing debate among scholars and practitioners on whether such UGC contains valuable investing information or mainly noise. I present two major studies in my dissertation. First, I examine the relationship between peer influence and information quality in the context of individual characteristics in stock microblogging. Surprisingly, I discover that the set of individual characteristics that relates to peer influence is not synonymous with the set that relates to high information quality. With respect to information quality, influential users who are frequently mentioned by peers because of their name value tend to possess higher information quality, while those who are better at diffusing information via retweets tend to be associated with lower information quality. Second, I explore how well stock microblog dimensions and features predict directional stock price movements using data mining classification techniques. I find that the author-ticker-day dimension produces the highest predictive accuracy, suggesting that it captures both relevant author and ticker information, compared to author-day and ticker-day. In addition to these two studies, I also explore two further topics, the network structure of co-tweeted tickers and sentiment annotation via crowdsourcing, in order to uncover new features and new outcome indicators that could improve the predictive accuracy of the classification models or the salience of the explanatory models. My dissertation extends the frontier in understanding the relationship between financial UGC, specifically stock microblogging, and relevant phenomena as well as predictive outcomes.
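    A hedged sketch of the second study's setup: features aggregated at the author-ticker-day level are used to classify next-day price direction. The synthetic features, labels, and the choice of a random forest are illustrative assumptions, not the dissertation's actual data or classifier.

```python
# Sketch of a directional prediction setup: features aggregated at the
# author-ticker-day level, labeled with next-day price direction. The
# synthetic data and classifier choice are illustrative assumptions.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
# Hypothetical columns: tweet volume, mean sentiment, retweet count,
# author follower count -- one row per (author, ticker, day).
X = rng.normal(size=(500, 4))
y = rng.integers(0, 2, size=500)  # 1 = price up next day, 0 = down (synthetic)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=0)
clf = RandomForestClassifier(n_estimators=100, random_state=0).fit(X_tr, y_tr)
print(f"directional accuracy: {clf.score(X_te, y_te):.2f}")
```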

    Scientific and regulatory evaluation of mechanistic in silico drug and disease models in drug development: building model credibility

    The value of in silico methods in drug development and evaluation has been demonstrated repeatedly and convincingly. While their benefits are now unanimously recognized, international standards for their evaluation, accepted by all stakeholders involved, have yet to be established. In this white paper, we propose a risk-informed framework for evaluating the credibility of mechanistic models. To properly frame the proposed verification and validation activities, concepts such as context of use, regulatory impact and risk-based analysis are discussed. To ensure common understanding among all stakeholders, an overview is provided of the relevant in silico terminology used throughout this paper. To illustrate the feasibility of the proposed approach, we have applied it to three real case examples in the context of drug development, using a credibility matrix currently being tested as a quick-start tool by regulators. Altogether, this white paper provides a practical approach to model evaluation, applicable in both scientific and regulatory evaluation contexts.
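    As a loose illustration of how a risk-informed credibility matrix can work, the sketch below derives a model-risk tier from the model's influence on a decision and the consequence of a wrong decision, and maps that tier to the rigor of verification and validation (V&V) required. The axis labels and tiers are assumptions for illustration, not the white paper's actual matrix.

```python
# Illustrative sketch of a risk-informed credibility matrix: model risk is
# derived from the model's influence on the decision and the consequence of
# a wrong decision, and maps to the rigor of V&V activities required.
# Axis labels and tiers are assumptions, not the white paper's exact matrix.
INFLUENCE = ["low", "medium", "high"]          # model influence on the decision
CONSEQUENCE = ["minor", "moderate", "severe"]  # impact of a wrong decision

def required_rigor(influence: str, consequence: str) -> str:
    """Higher combined risk demands more extensive verification and
    validation evidence for the stated context of use."""
    score = INFLUENCE.index(influence) + CONSEQUENCE.index(consequence)
    return ["basic checks", "standard V&V", "extensive V&V"][min(score // 2, 2)]

print(required_rigor("high", "severe"))  # extensive V&V
```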

    ProHealth eCoach: user-centered design and development of an eCoach app to promote healthy lifestyle with personalized activity recommendations

    Background: Regular physical activity (PA), healthy habits, and an appropriate diet are recommended guidelines for maintaining a healthy lifestyle. A healthy lifestyle can help to avoid chronic diseases and long-term illnesses. A monitoring and automatic personalized lifestyle recommendation system (i.e., an automatic electronic coach, or eCoach) that considers clinical and ethical guidelines, individual health status, conditions, and preferences may successfully help participants follow recommendations to maintain a healthy lifestyle. As a prerequisite for the prototype design of such an eCoach system, it is essential to involve end-users and subject-matter experts throughout the iterative design process.
    Methods: We used an iterative user-centered design (UCD) approach to understand the context of use and to collect qualitative data to develop a roadmap for self-management with eCoaching. We involved technical and non-technical researchers, health professionals, subject-matter experts, and potential end-users in the design process. We designed and developed the eCoach prototype in two stages, adopting different phases of the iterative design process. In design workshop 1, we focused on identifying end-users, understanding the user's context, specifying user requirements, and designing and developing an initial low-fidelity eCoach prototype. In design workshop 2, we focused on maturing the low-fidelity solution design and development for the visualization of continuous and discrete data, artificial intelligence (AI)-based interval forecasting, personalized recommendations, and activity goals.
    Results: The iterative design process helped to develop a working prototype of the eCoach system that meets end-users' requirements and expectations for effective recommendation visualization, considering diversity in culture, quality of life, and human values. The design provides an early version of the solution, consisting of wearable technology, a mobile app following the "Google Material Design" guidelines, and web content for self-monitoring, goal setting, and lifestyle recommendations, delivered in an engaging manner between the eCoach app and end-users.
    Conclusions: The adopted iterative design process brings a focus on the user and their needs at each phase. Throughout the design process, users have been involved at the heart of the design to create a working prototype.
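    For the AI-based interval forecasting mentioned in workshop 2, a common approach is to predict lower and upper quantiles so the eCoach can display a range rather than a point estimate. The sketch below uses quantile gradient boosting on synthetic step-count data; the features, data, and model choice are illustrative assumptions, not the paper's actual method.

```python
# Sketch of "AI-based interval forecasting": predict a lower and upper
# quantile of tomorrow's step count so the eCoach can display a range.
# The synthetic data and model choice are illustrative assumptions.
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

rng = np.random.default_rng(1)
X = rng.normal(size=(300, 3))  # e.g., recent activity, day of week, sleep
y = 6000 + 1500 * X[:, 0] + rng.normal(scale=800, size=300)  # daily steps

# Fit one model per quantile to obtain an 80% prediction interval.
lower = GradientBoostingRegressor(loss="quantile", alpha=0.1).fit(X, y)
upper = GradientBoostingRegressor(loss="quantile", alpha=0.9).fit(X, y)

x_new = rng.normal(size=(1, 3))
print(f"forecast interval: {lower.predict(x_new)[0]:.0f}-"
      f"{upper.predict(x_new)[0]:.0f} steps")
```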