Probabilistic Graphical Models for Credibility Analysis in Evolving Online Communities
One of the major hurdles preventing the full exploitation of information from
online communities is the widespread concern regarding the quality and
credibility of user-contributed content. Prior works in this domain operate on
a static snapshot of the community, making strong assumptions about the
structure of the data (e.g., relational tables), or consider only shallow
features for text classification.
To address the above limitations, we propose probabilistic graphical models
that can leverage the joint interplay between multiple factors in online
communities --- like user interactions, community dynamics, and textual content
--- to automatically assess the credibility of user-contributed online content
and the expertise of users, along with their evolution over time, with
user-interpretable explanations. To this end, we devise new models based on Conditional Random
Fields for different settings like incorporating partial expert knowledge for
semi-supervised learning, and handling discrete labels as well as numeric
ratings for fine-grained analysis. This enables applications such as extracting
reliable side-effects of drugs from user-contributed posts in health forums, and
identifying credible content in news communities.
Online communities are dynamic, as users join and leave, adapt to evolving
trends, and mature over time. To capture these dynamics, we propose generative
models based on Hidden Markov Models, Latent Dirichlet Allocation, and Brownian
Motion to trace the continuous evolution of user expertise and their language
model over time. This allows us to identify expert users and credible content
jointly over time, improving state-of-the-art recommender systems by explicitly
considering the maturity of users. This also enables applications such as
identifying helpful product reviews, and detecting fake and anomalous reviews
with limited information.
Comment: PhD thesis, Mar 201
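The temporal models above combine several latent-variable building blocks. As a minimal, self-contained sketch (not the thesis code), the following discrete HMM traces a single user's latent expertise level from the observed quality of their posts via the forward algorithm; the states, transition probabilities, and emission probabilities are all illustrative assumptions.

```python
# Sketch: discrete HMM tracing a user's latent expertise over time.
# All probabilities below are illustrative assumptions, not learned values.

STATES = ["novice", "expert"]
INIT = {"novice": 0.8, "expert": 0.2}           # most users start as novices
TRANS = {                                       # users tend to mature, rarely regress
    "novice": {"novice": 0.7, "expert": 0.3},
    "expert": {"novice": 0.1, "expert": 0.9},
}
EMIT = {                                        # P(observed post quality | expertise)
    "novice": {"low": 0.6, "high": 0.4},
    "expert": {"low": 0.2, "high": 0.8},
}

def forward(observations):
    """Forward algorithm: filtered posteriors P(state_t | obs_1..t)."""
    posteriors = []
    alpha = {s: INIT[s] * EMIT[s][observations[0]] for s in STATES}
    for t, obs in enumerate(observations):
        if t > 0:
            alpha = {
                s: EMIT[s][obs] * sum(alpha[p] * TRANS[p][s] for p in STATES)
                for s in STATES
            }
        z = sum(alpha.values())
        posteriors.append({s: alpha[s] / z for s in STATES})
    return posteriors

# belief in expertise should grow as high-quality posts accumulate
post = forward(["low", "high", "high", "high"])
```

In a full model, the transition and emission tables would be learned (e.g., by Baum-Welch) rather than fixed by hand, and the thesis couples such temporal dynamics with topic and language models.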
Bayesian Learning and Predictability in a Stochastic Nonlinear Dynamical Model
Bayesian inference methods are applied within a Bayesian hierarchical
modelling framework to the problems of joint state and parameter estimation,
and of state forecasting. We explore and demonstrate the ideas in the context
of a simple nonlinear marine biogeochemical model. A novel approach is proposed
to the formulation of the stochastic process model, in which ecophysiological
properties of plankton communities are represented by autoregressive stochastic
processes. This approach captures the effects of changes in plankton
communities over time, and it allows the incorporation of literature metadata
on individual species into prior distributions for process model parameters.
The approach is applied to a case study at Ocean Station Papa, using Particle
Markov chain Monte Carlo computational techniques. The results suggest that, by
drawing on objective prior information, it is possible to extract useful
information about model state and a subset of parameters, and even to make
useful long-term forecasts, based on sparse and noisy observations.
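Particle MCMC builds on sequential Monte Carlo, whose core ingredient is a particle filter. The sketch below implements a bootstrap particle filter for a toy one-dimensional nonlinear state-space model standing in for the (much richer) biogeochemical model; the dynamics, noise levels, and observations are invented for illustration.

```python
import math
import random

random.seed(0)  # deterministic illustration

def bootstrap_particle_filter(observations, n_particles=500, proc_sd=0.3, obs_sd=0.5):
    """Bootstrap particle filter for a toy nonlinear state-space model:
        x_t = x_{t-1} + 0.5 * sin(x_{t-1}) + N(0, proc_sd)   (process)
        y_t = x_t + N(0, obs_sd)                              (observation)
    Returns the filtered posterior mean of x_t at each step."""
    particles = [random.gauss(0.0, 1.0) for _ in range(n_particles)]
    means = []
    for y in observations:
        # propagate particles through the assumed nonlinear dynamics
        particles = [x + 0.5 * math.sin(x) + random.gauss(0.0, proc_sd)
                     for x in particles]
        # weight by the Gaussian observation likelihood
        weights = [math.exp(-0.5 * ((y - x) / obs_sd) ** 2) for x in particles]
        total = sum(weights)
        weights = [w / total for w in weights]
        means.append(sum(w * x for w, x in zip(weights, particles)))
        # multinomial resampling to avoid weight degeneracy
        particles = random.choices(particles, weights=weights, k=n_particles)
    return means

# sparse, noisy observations of a slowly increasing state
estimates = bootstrap_particle_filter([0.2, 0.5, 0.9, 1.1, 1.4])
```

Particle MCMC wraps a filter like this inside a Metropolis-Hastings loop over the static parameters (here, `proc_sd` and `obs_sd` would be sampled rather than fixed), using the filter's marginal likelihood estimate in the acceptance ratio.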
Navigating information and uncertainty: A fuzzy logic model to approach transparency, democracy and social wellbeing
In the digital age of information overload and uncertainty, the authors
propose the tDTSW model based on fuzzy logic to navigate governance
complexities. This model transcends binary thinking and analyzes democracy,
transparency, and social well-being, highlighting their roles in just societies
through case studies. It addresses challenges like capitalism, sustainability,
gender equality, and education in modern democracies, emphasizing their
interplay for positive change. "Navigating Information and Uncertainty"
introduces fuzzy logic, offering a structured approach. It calls for collective
efforts to create equitable, sustainable, and just societies, inviting readers
to shape a brighter future.
Comment: 60 pages, 14 figures
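At the core of any fuzzy-logic model of this kind are membership functions and rule aggregation. The sketch below shows the generic machinery, not the authors' tDTSW rule base: a linear "high" membership function on [0, 1] scores and a Mamdani-style fuzzy AND (minimum) combining transparency and democracy into a well-being degree. The thresholds and the rule itself are illustrative assumptions.

```python
def high(score):
    """Membership degree of 'high' for a score in [0, 1]: a linear shoulder
    that is 0 below 0.4 and rises to 1 at 0.9 and above. The thresholds are
    illustrative assumptions, not calibrated values."""
    return max(0.0, min(1.0, (score - 0.4) / 0.5))

def wellbeing(transparency, democracy):
    """Firing strength of the illustrative rule:
        IF transparency is high AND democracy is high
        THEN well-being is high.
    Mamdani-style fuzzy AND is the minimum of the membership degrees."""
    return min(high(transparency), high(democracy))
```

The point of the fuzzy formulation is that `wellbeing(0.65, 0.9)` yields a partial degree rather than a binary verdict, which is exactly the move beyond binary thinking that the model advocates.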
Argumentation Mining in User-Generated Web Discourse
The goal of argumentation mining, an evolving research field in computational
linguistics, is to design methods capable of analyzing people's argumentation.
In this article, we go beyond the state of the art in several ways. (i) We deal
with actual Web data and take up the challenges given by the variety of
registers, multiple domains, and unrestricted noisy user-generated Web
discourse. (ii) We bridge the gap between normative argumentation theories and
argumentation phenomena encountered in actual data by adapting an argumentation
model tested in an extensive annotation study. (iii) We create a new gold
standard corpus (90k tokens in 340 documents) and experiment with several
machine learning methods to identify argument components. We offer the data,
source codes, and annotation guidelines to the community under free licenses.
Our findings show that argumentation mining in user-generated Web discourse is
a feasible but challenging task.
Comment: Cite as: Habernal, I. & Gurevych, I. (2017). Argumentation Mining in
User-Generated Web Discourse. Computational Linguistics 43(1), pp. 125-17
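A common baseline for identifying argument components, far simpler than the machine-learning methods evaluated in the article, is to look for explicit discourse markers. The marker lexicons and label set below are illustrative assumptions; precisely because user-generated Web discourse is noisy and often lacks such markers, baselines like this underperform trained models.

```python
# Illustrative marker lexicons; a trained system would learn such features.
PREMISE_MARKERS = {"because", "since", "therefore", "thus", "hence"}
CLAIM_MARKERS = {"should", "must", "believe", "think", "opinion"}

def label_sentence(sentence):
    """Label a sentence as 'premise', 'claim', or 'none' from surface cues."""
    tokens = {tok.strip(".,!?;:").lower() for tok in sentence.split()}
    if tokens & PREMISE_MARKERS:
        return "premise"
    if tokens & CLAIM_MARKERS:
        return "claim"
    return "none"
```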
Exploring the landscape of seasonal forecast provision by Global Producing Centres
Despite the growing demand for seasonal climate forecasts, there is limited understanding of the landscape of organisations providing this critically important climate information. This study attempts to fill this gap by presenting results from an in-depth dialogue with the organisations entrusted with the provision of seasonal forecasts by the World Meteorological Organisation, known as the Global Producing Centres for Long-Range Forecasts (GPCs-LRF). The results provide an overview and detailed description of the organisational setup, mandate, and target audience of the GPCs-LRF, and of their interactions with other centres. Looking beyond the GPCs-LRF to other centres providing seasonal forecasts, some of which have been rapidly taking prominent places in this landscape, revealed a heterogeneous and still maturing community of practice, with an increasing number of players and emerging efforts to produce multi-model ensemble forecasts. The dialogues pointed to the need not only to improve climate models and produce more skilful climate forecasts, but also to improve the transformation of the forecasts into useful and usable products. Finally, using the lenses of credibility, salience and legitimacy, we explore ways to bridge the fragmentation of the information offered across the organisations considered and the people involved in the delivery and use of seasonal forecasts. The paper concludes by suggesting ways to address the boundary crossing between science, policy and society in the context of seasonal climate prediction.
We would like to thank all the study participants for their valuable contributions and feedback to the paper, and Diana Urquiza for designing the figure. An earlier version of this paper was presented at the workshop "Quality of Climate Information for Adaptation" in October 2020. This research has been supported by the EU H2020 project FOCUS-Africa (GA 869575).
Credibility analysis of textual claims with explainable evidence
Despite being a vast resource of valuable information, the Web has been polluted by the spread of false claims. Increasing hoaxes, fake news, and misleading information on the Web have given rise to many fact-checking websites that manually assess these doubtful claims. However, the rapid speed and large scale of misinformation spread have become the bottleneck for manual verification. This calls for credibility assessment tools that can automate this verification process. Prior works in this domain make strong assumptions about the structure of the claims and the communities where they are made. Most importantly, black-box techniques proposed in prior works lack the ability to explain why a certain statement is deemed credible or not. To address these limitations, this dissertation proposes a general framework for automated credibility assessment that does not make any assumption about the structure or origin of the claims. Specifically, we propose a feature-based model, which automatically retrieves relevant articles about the given claim and assesses its credibility by capturing the mutual interaction between the language style of the relevant articles, their stance towards the claim, and the trustworthiness of the underlying web sources. We further enhance our credibility assessment approach and propose a neural-network-based model. Unlike the feature-based model, this model does not rely on feature engineering and external lexicons. Both our models make their assessments interpretable by extracting explainable evidence from judiciously selected web sources.
We utilize our models to develop a Web interface, CredEye, which enables users to automatically assess the credibility of a textual claim and to inspect the assessment by browsing through judiciously and automatically selected evidence snippets. In addition, we study the problem of stance classification and propose a neural-network-based model for predicting the stance of diverse user perspectives regarding controversial claims. Given a controversial claim and a user comment, our stance classification model predicts whether the comment supports or opposes the claim.
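Stance classification takes a claim and a user comment and predicts support or opposition. As a hedged illustration of the task's input/output contract, not the dissertation's neural model, here is a trivial cue-word baseline; the cue lexicons are invented for the example.

```python
# Invented cue lexicons for illustration only; a neural model would instead
# learn stance-bearing features from (claim, comment) training pairs.
SUPPORT_CUES = {"agree", "true", "correct", "exactly", "right", "support"}
OPPOSE_CUES = {"disagree", "false", "wrong", "nonsense", "oppose", "doubt"}

def stance(comment):
    """Predict 'support', 'oppose', or 'neutral' for a comment on a claim."""
    tokens = [tok.strip(".,!?").lower() for tok in comment.split()]
    s = sum(tok in SUPPORT_CUES for tok in tokens)
    o = sum(tok in OPPOSE_CUES for tok in tokens)
    if s > o:
        return "support"
    if o > s:
        return "oppose"
    return "neutral"
```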
Social network analytics and visualization: Dynamic topic-based influence analysis in evolving micro-blogs
Influence analysis is one of the well-known areas of social network analysis. Discovering influencers in micro-blog networks on a per-topic basis, however, has gained recent popularity because of its specificity. Moreover, these data networks are massive, continuous, and evolving. To address these challenges, we propose a dynamic framework that performs topic modelling and identifies influencers in the same process. It incorporates dynamic sampling, community detection, and network statistics over a graph data stream from a social media activity management application. We further compare the graph measures against each other empirically and observe no evidence of correlation between the set of users with a large number of friends and the set of users whose posts achieve high acceptance (i.e., highly liked, commented, and shared posts). We therefore propose a novel approach that incorporates both a user's reachability and their acceptability by other users. Consequently, we improve on graph metrics by including a dynamic acceptance score (integrating content quality with network structure) for ranking influencers in micro-blogs. Additionally, we analyse the structure and quality of the topic clusters with empirical experiments and visualization.
Fundação para a Ciência e a Tecnologia, Grant/Award Number: UIDB/50014/202
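One way to realise a dynamic acceptance score of the kind described, integrating content quality (likes, comments, shares) with network structure (reachability via follower count) under time decay, is sketched below. The engagement weights, the exponential half-life decay, the log-scaled reach, and the function names are assumptions for illustration, not the paper's exact formula.

```python
import math

def acceptance_score(posts, now, half_life_days=7.0):
    """Time-decayed acceptability of a user's content: recent engagement
    counts more. The weights (1/2/3) and the exponential half-life decay
    are illustrative assumptions."""
    lam = math.log(2.0) / half_life_days
    return sum(
        (p["likes"] + 2 * p["comments"] + 3 * p["shares"])
        * math.exp(-lam * (now - p["day"]))
        for p in posts
    )

def influence(user, now):
    """Combine reachability (log-scaled follower count) with acceptability."""
    return math.log1p(user["followers"]) * acceptance_score(user["posts"], now)

# a user with fewer followers but recent, well-received content can outrank
# a user with many followers whose posts get little engagement
popular = {"followers": 5000,
           "posts": [{"day": 9, "likes": 120, "comments": 15, "shares": 8}]}
quiet = {"followers": 20000,
         "posts": [{"day": 1, "likes": 3, "comments": 0, "shares": 0}]}
```

Multiplying reach by acceptance (rather than adding) reflects the paper's observation that follower count alone does not predict acceptance: a large audience amplifies well-received content but contributes little when engagement is absent.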