32,092 research outputs found
Credible Review Detection with Limited Information using Consistency Analysis
Online reviews provide viewpoints on the strengths and shortcomings of products/services, influencing potential customers' purchasing decisions. However, the proliferation of non-credible reviews -- either fake (promoting/demoting an item), incompetent (involving irrelevant aspects), or biased -- entails the problem of identifying credible reviews. Prior works involve classifiers harnessing rich information about items/users -- which might not be readily available in several domains -- and provide only limited interpretability as to why a review is deemed non-credible. This paper presents a novel approach to address these issues. We utilize latent topic models leveraging review texts, item ratings, and timestamps to derive consistency features without relying on item/user histories, which are unavailable for "long-tail" items/users. We develop models for computing review credibility scores that provide interpretable evidence for non-credible reviews and are transferable to other domains, addressing the scarcity of labeled data. Experiments on real-world datasets demonstrate improvements over state-of-the-art baselines.
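The consistency idea can be illustrated with a minimal sketch. The paper derives consistency features from latent topic models over texts, ratings, and timestamps; the toy version below substitutes a tiny sentiment lexicon and a timestamp-burstiness measure, so all names, word lists, and thresholds are illustrative assumptions rather than the paper's actual features:

```python
# Hedged sketch: two simple consistency features for a review that need no
# user/item history. The real work uses latent topic models; this toy uses
# a tiny sentiment lexicon and timestamp burstiness instead (illustrative).

POSITIVE = {"great", "excellent", "love", "good", "amazing"}
NEGATIVE = {"bad", "terrible", "awful", "poor", "broken"}

def rating_text_consistency(text, rating, max_rating=5):
    """Agreement between the numeric rating and the text's polarity, in [0, 1]."""
    words = text.lower().split()
    pos = sum(w in POSITIVE for w in words)
    neg = sum(w in NEGATIVE for w in words)
    if pos + neg == 0:
        return 0.5  # no lexical signal: neutral consistency
    text_score = pos / (pos + neg)                # 0 = all negative, 1 = all positive
    rating_score = (rating - 1) / (max_rating - 1)
    return 1.0 - abs(text_score - rating_score)

def burstiness(timestamps, window=3600):
    """Fraction of consecutive reviews arriving within `window` seconds of
    each other; a sudden burst for one item is a common fake-review signal."""
    ts = sorted(timestamps)
    close = sum(1 for a, b in zip(ts, ts[1:]) if b - a <= window)
    return close / max(len(ts) - 1, 1)
```

A review whose text polarity contradicts its star rating, posted inside a burst, would score low on both features and attract scrutiny.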
Assessing the Credibility of Cyber Adversaries
Online communications are ever increasing, and we are constantly faced with the challenge of judging whether online information is credible. Assessing the credibility of others was once solely the work of intelligence agencies. In the current times of disinformation and misinformation, understanding what we are reading and to whom we are paying attention is essential for making considered, informed, and accurate decisions, and it has become everyone’s business. This paper employs a literature review to examine the empirical evidence across online credibility, trust, deception, and fraud detection, consolidating this information to understand adversary online credibility: how do we know that the person with whom we are conversing is who they say they are? Based on this review, we propose a model that examines information characteristics as well as user and interaction characteristics to best inform an assessment of online credibility. Limitations and future opportunities are highlighted.
The impact of corporate philanthropy on reputation for corporate social performance
This study examines the mechanisms by which a corporation’s use of philanthropy affects its reputation for corporate social performance (CSP), which the authors conceive of as consisting of two dimensions: CSP awareness and CSP perception. Using signal detection theory (SDT), the authors model the effects of signal amplitude (the amount contributed), dispersion (the number of areas supported), and consistency (the presence of a corporate foundation) on CSP awareness and perception. Overall, the study finds that the characteristics of a firm's portfolio of philanthropic activities are a greater predictor of CSP awareness than of CSP perception. Awareness increases with signal amplitude, dispersion, and consistency; CSP perception is driven by awareness and corporate reputation. The authors’ contention that corporate philanthropy is a complex variable is upheld, as CSP signal characteristics are found to influence CSP awareness and perception independently and asymmetrically. The authors conclude by proposing avenues for future research.
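The study's two-stage structure (signal characteristics predict awareness; awareness plus reputation predict perception) can be sketched as two linear models. Everything below -- the synthetic data, the variable names, and the linear functional form -- is an illustrative assumption, not the authors' actual estimation procedure:

```python
# Hedged sketch of a two-stage model: signal amplitude, dispersion, and
# consistency drive awareness; awareness and reputation drive perception.
# Fitted by ordinary least squares on synthetic data (all values illustrative).
import numpy as np

rng = np.random.default_rng(0)
n = 200
amplitude = rng.normal(size=n)    # amount contributed
dispersion = rng.normal(size=n)   # number of areas supported
consistency = rng.normal(size=n)  # presence of a corporate foundation
reputation = rng.normal(size=n)

# Stage 1: awareness ~ amplitude + dispersion + consistency (+ noise)
awareness = 0.5*amplitude + 0.3*dispersion + 0.2*consistency + 0.1*rng.normal(size=n)
X1 = np.column_stack([np.ones(n), amplitude, dispersion, consistency])
beta1, *_ = np.linalg.lstsq(X1, awareness, rcond=None)

# Stage 2: perception ~ awareness + reputation (+ noise)
perception = 0.6*awareness + 0.4*reputation + 0.1*rng.normal(size=n)
X2 = np.column_stack([np.ones(n), awareness, reputation])
beta2, *_ = np.linalg.lstsq(X2, perception, rcond=None)

print(beta1.round(2), beta2.round(2))
```

The two fits recover the generating coefficients, mirroring the claim that signal characteristics act on awareness directly but on perception only through awareness.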
Inferring the photometric and size evolution of galaxies from image simulations
Current constraints on models of galaxy evolution rely on morphometric
catalogs extracted from multi-band photometric surveys. However, these catalogs
are altered by selection effects that are difficult to model, that correlate in
non-trivial ways, and that can lead to contradictory predictions if not taken
into account carefully. To address this issue, we have developed a new approach
combining parametric Bayesian indirect likelihood (pBIL) techniques and
empirical modeling with realistic image simulations that reproduce a large
fraction of these selection effects. This allows us to perform a direct
comparison between observed and simulated images and to infer robust
constraints on model parameters. We use a semi-empirical forward model to
generate a distribution of mock galaxies from a set of physical parameters.
These galaxies are passed through an image simulator reproducing the
instrumental characteristics of any survey and are then extracted in the same
way as the observed data. The discrepancy between the simulated and observed
data is quantified, and minimized with a custom sampling process based on
adaptive Markov Chain Monte Carlo (MCMC) methods. Using synthetic data matching most
of the properties of a CFHTLS Deep field, we demonstrate the robustness and
internal consistency of our approach by inferring the parameters governing the
size and luminosity functions and their evolutions for different realistic
populations of galaxies. We also compare the results of our approach with those
obtained from the classical spectral energy distribution fitting and
photometric redshift approach. Our pipeline efficiently infers the luminosity
and size distribution and evolution parameters with a very limited number of
observables (3 photometric bands). When compared to SED fitting based on the
same set of observables, our method yields results that are more accurate and
free from systematic biases. Comment: 24 pages, 12 figures, accepted for publication in A&A
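The pBIL loop described above can be sketched in miniature: at each proposed parameter, simulate mock data through the same selection effects as the observations, score the match on summary statistics with a pseudo-likelihood, and accept or reject with a Metropolis step. The one-parameter toy model, the flux cut, and all tuning constants below are illustrative assumptions (the paper uses adaptive MCMC and realistic image simulations):

```python
# Hedged sketch of parametric Bayesian indirect likelihood (pBIL): the
# likelihood of the observed catalog is intractable, so each MCMC step
# simulates mock data through the same "instrument" (here a crude flux cut
# standing in for survey selection effects) and scores the parameter with a
# Gaussian pseudo-likelihood on binned counts. All settings are illustrative.
import math, random

random.seed(42)

def forward_model(mu, n=2000, flux_limit=0.0):
    """Draw mock 'galaxy magnitudes' and apply the selection cut."""
    sample = [random.gauss(mu, 1.0) for _ in range(n)]
    return [x for x in sample if x > flux_limit]  # selection effect

def summary(data, edges=(-1, 0, 1, 2, 3)):
    """Normalised binned counts -- the summary statistic being matched."""
    total = len(data)
    return [sum(lo < x <= hi for x in data) / total
            for lo, hi in zip(edges, edges[1:])]

def log_pseudo_likelihood(obs_summary, sim_summary, sigma=0.02):
    return -sum((o - s) ** 2 for o, s in zip(obs_summary, sim_summary)) / (2 * sigma**2)

# "Observed" data generated at the true parameter mu = 1.0
obs = summary(forward_model(1.0))

# Plain Metropolis sampler over mu (the paper uses adaptive MCMC)
mu, chain = 0.0, []
current = log_pseudo_likelihood(obs, summary(forward_model(mu)))
for _ in range(400):
    prop = mu + random.gauss(0, 0.2)
    lp = log_pseudo_likelihood(obs, summary(forward_model(prop)))
    if math.log(random.random()) < lp - current:
        mu, current = prop, lp
    chain.append(mu)

estimate = sum(chain[200:]) / len(chain[200:])
```

Because the mock and observed summaries pass through the same selection cut, the bias the cut would otherwise induce cancels, which is the key point of forward-modelling the instrument.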
A Bayesian - Deep Learning model for estimating Covid-19 evolution in Spain
This work proposes a semi-parametric approach to estimate Covid-19
(SARS-CoV-2) evolution in Spain. Considering the sequences of 14-day
cumulative incidence of all Spanish regions, it combines modern Deep Learning
(DL) techniques for analyzing sequences with the usual Bayesian Poisson-Gamma
model for counts. The DL model provides a suitable description of the observed
sequences, but no reliable uncertainty quantification around it can be obtained.
To overcome this, we use the prediction from DL as an expert elicitation of the
expected number of counts along with their uncertainty, and thus obtain the
posterior predictive distribution of counts in an orthodox Bayesian analysis
using the well-known Poisson-Gamma model. The overall resulting model allows us
both to predict the future evolution of the sequences in all regions and to
estimate the consequences of possible scenarios. Comment: Related to: https://github.com/scabras/covid19-bayes-d
Credibility analysis of textual claims with explainable evidence
Despite being a vast resource of valuable information, the Web has been polluted by the spread of false claims. Increasing hoaxes, fake news, and misleading information on the Web have given rise to many fact-checking websites that manually assess these doubtful claims. However, the rapid speed and large scale of misinformation spread have become the bottleneck for manual verification. This calls for credibility assessment tools that can automate this verification process. Prior works in this domain make strong assumptions about the structure of the claims and the communities where they are made. Most importantly, black-box techniques proposed in prior works lack the ability to explain why a certain statement is deemed credible or not. To address these limitations, this dissertation proposes a general framework for automated credibility assessment that does not make any assumption about the structure or origin of the claims. Specifically, we propose a feature-based model, which automatically retrieves relevant articles about the given claim and assesses its credibility by capturing the mutual interaction between the language style of the relevant articles, their stance towards the claim, and the trustworthiness of the underlying web sources. We further enhance our credibility assessment approach and propose a neural-network-based model. Unlike the feature-based model, this model does not rely on feature engineering and external lexicons. Both our models make their assessments interpretable by extracting explainable evidence from judiciously selected web sources.
We utilize our models to develop a Web interface, CredEye, which enables users to automatically assess the credibility of a textual claim and dissect the assessment by browsing through judiciously and automatically selected evidence snippets. In addition, we study the problem of stance classification and propose a neural-network-based model for predicting the stance of diverse user perspectives regarding controversial claims. Given a controversial claim and a user comment, our stance classification model predicts whether the user comment supports or opposes the claim.
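The interaction the feature-based model captures -- article stance weighted by source trustworthiness -- can be caricatured as a trust-weighted vote. The dissertation learns these signals jointly; the fixed scores below are illustrative assumptions:

```python
# Hedged sketch: aggregate retrieved articles into a claim credibility score
# by weighting each article's stance with its source's trustworthiness.
# (The real model learns stance, style, and trust jointly; these numbers
# and the simple weighted average are illustrative only.)

def claim_credibility(evidence):
    """evidence: list of (stance, source_trust), with stance in [-1, 1]
    (-1 = refutes, +1 = supports) and source_trust in [0, 1].
    Returns a trust-weighted average stance in [-1, 1]."""
    total_trust = sum(trust for _, trust in evidence)
    if total_trust == 0:
        return 0.0  # no trustworthy evidence retrieved
    return sum(stance * trust for stance, trust in evidence) / total_trust

articles = [(+1.0, 0.9),   # reputable source supports the claim
            (+0.5, 0.6),   # mildly supportive, medium trust
            (-1.0, 0.2)]   # low-trust source refutes it
score = claim_credibility(articles)
```

Keeping the per-article terms visible is what makes the assessment interpretable: each (stance, trust) pair is an explainable piece of evidence a user can inspect.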
Probabilistic Graphical Models for Credibility Analysis in Evolving Online Communities
One of the major hurdles preventing the full exploitation of information from
online communities is the widespread concern regarding the quality and
credibility of user-contributed content. Prior works in this domain operate on
a static snapshot of the community, making strong assumptions about the
structure of the data (e.g., relational tables), or consider only shallow
features for text classification.
To address the above limitations, we propose probabilistic graphical models
that can leverage the joint interplay between multiple factors in online
communities --- like user interactions, community dynamics, and textual content
--- to automatically assess the credibility of user-contributed online content,
and the expertise of users and their evolution with user-interpretable
explanations. To this end, we devise new models based on Conditional Random
Fields for different settings like incorporating partial expert knowledge for
semi-supervised learning, and handling discrete labels as well as numeric
ratings for fine-grained analysis. This enables applications such as extracting
reliable side effects of drugs from user-contributed posts in health forums, and
identifying credible content in news communities.
Online communities are dynamic, as users join and leave, adapt to evolving
trends, and mature over time. To capture these dynamics, we propose generative
models based on Hidden Markov Models, Latent Dirichlet Allocation, and Brownian
Motion to trace the continuous evolution of user expertise and their language
model over time. This allows us to identify expert users and credible content
jointly over time, improving state-of-the-art recommender systems by explicitly
considering the maturity of users. This also enables applications such as
identifying helpful product reviews, and detecting fake and anomalous reviews
with limited information. Comment: PhD thesis, Mar 201
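The expertise-evolution idea -- tracing a user's maturity over time with a Hidden Markov Model -- can be sketched with a two-state forward pass. The states, the transition and emission probabilities, and the "simple"/"technical" observation alphabet are illustrative assumptions, not the thesis's actual parameterisation:

```python
# Hedged sketch: a two-state HMM (novice -> expert) filtered over a user's
# observed posts with the forward algorithm. All probabilities and the
# observation alphabet are illustrative assumptions.

STATES = ("novice", "expert")
START = {"novice": 0.9, "expert": 0.1}
TRANS = {"novice": {"novice": 0.8, "expert": 0.2},    # users mature over time
         "expert": {"novice": 0.05, "expert": 0.95}}  # expertise rarely regresses
EMIT = {"novice": {"simple": 0.8, "technical": 0.2},
        "expert": {"simple": 0.3, "technical": 0.7}}

def forward(observations):
    """Filtered P(state | observations so far) after each post."""
    history = []
    alpha = dict(START)
    for obs in observations:
        # predict via the transition model, then weight by the emission
        alpha = {s: EMIT[s][obs] * sum(alpha[p] * TRANS[p][s] for p in STATES)
                 for s in STATES}
        z = sum(alpha.values())
        alpha = {s: v / z for s, v in alpha.items()}
        history.append(dict(alpha))
    return history

# A user whose posts shift from simple to technical language
trace = forward(["simple", "simple", "technical", "technical", "technical"])
```

The filtered expert probability rises as the language model of the posts shifts, which is the signal the thesis exploits to weight recommendations by user maturity.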