5,522 research outputs found
On Human Predictions with Explanations and Predictions of Machine Learning Models: A Case Study on Deception Detection
Humans are the final decision makers in critical tasks that involve ethical
and legal concerns, ranging from recidivism prediction, to medical diagnosis,
to fighting against fake news. Although machine learning models can sometimes
achieve impressive performance in these tasks, these tasks are not amenable to
full automation. To realize the potential of machine learning for improving
human decisions, it is important to understand how assistance from machine
learning models affects human performance and human agency.
In this paper, we use deception detection as a testbed and investigate how we
can harness explanations and predictions of machine learning models to improve
human performance while retaining human agency. We propose a spectrum between
full human agency and full automation, and develop varying levels of machine
assistance along the spectrum that gradually increase the influence of machine
predictions. We find that without showing predicted labels, explanations alone
slightly improve human performance in the end task. In comparison, human
performance is greatly improved by showing predicted labels (>20% relative
improvement) and can be further improved by explicitly suggesting strong
machine performance. Interestingly, when predicted labels are shown,
explanations of machine predictions induce a similar level of accuracy as an
explicit statement of strong machine performance. Our results demonstrate a
tradeoff between human performance and human agency and show that explanations
of machine predictions can moderate this tradeoff.
Comment: 17 pages, 19 figures, in Proceedings of ACM FAT* 2019, dataset & demo
available at https://deception.machineintheloop.co
CIMTDetect: A Community Infused Matrix-Tensor Coupled Factorization Based Method for Fake News Detection
Detecting whether a news article is fake or genuine is a crucial task in
today's digital world, where it is easy to create and spread a misleading news
article. This is especially true of news stories shared on social media, since
they do not undergo the stringent journalistic checking associated with
mainstream media. Given the inherent human tendency to share information with
social connections at a mouse click, fake news articles masquerading as real
ones tend to spread widely and virally. The presence of echo chambers (people
sharing the same beliefs) in social networks only adds to the problem of
widespread fake news on social media. In this paper, we tackle
the problem of fake news detection from social media by exploiting the very
presence of echo chambers that exist within the social network of users to
obtain an efficient and informative latent representation of the news article.
By modeling the echo-chambers as closely-connected communities within the
social network, we represent a news article as a 3-mode tensor of the structure
- and propose a tensor factorization based method to
encode the news article in a latent embedding space preserving the community
structure. We also propose an extension of the above method, which jointly
models the community and content information of the news article through a
coupled matrix-tensor factorization framework. We empirically demonstrate the
efficacy of our method for the task of Fake News Detection over two real-world
datasets. Further, we validate the generalization of the resulting embeddings
over two other auxiliary tasks, namely: 1) News Cohort Analysis and
2) Collaborative News Recommendation. Our proposed method outperforms
appropriate baselines for both tasks, establishing its generalization.
Comment: Presented at ASONAM'1
LoRA-like Calibration for Multimodal Deception Detection using ATSFace Data
Recently, deception detection in videos of people has become an eye-catching
technique that can serve many applications. AI models in this domain demonstrate
high accuracy, but they tend to be non-interpretable black boxes. We introduce an
attention-aware neural network addressing challenges inherent in video data and
deception dynamics. This model, through its continuous assessment of visual,
audio, and text features, pinpoints deceptive cues. We employ a multimodal
fusion strategy that enhances accuracy; our approach yields a 92% accuracy
rate on a real-life trial dataset. Most important of all, the model indicates
the attention focus in the videos, providing valuable insights on deception
cues. Hence, our method adeptly detects deceit and elucidates the underlying
process. We further enriched our study with an experiment involving students
answering questions either truthfully or deceitfully, resulting in a new
dataset of 309 video clips, named ATSFace. Using this, we also introduced a
calibration method, which is inspired by Low-Rank Adaptation (LoRA), to refine
individual-based deception detection accuracy.
Comment: 10 pages, 9 figure
Veracity Roadmap: Is Big Data Objective, Truthful and Credible?
This paper argues that big data can possess different characteristics, which affect its quality. Depending on its origin, data processing technologies, and methodologies used for data collection and scientific discoveries, big data can have biases, ambiguities, and inaccuracies which need to be identified and accounted for to reduce inference errors and improve the accuracy of generated insights. Big data veracity is now being recognized as a necessary property for its utilization, complementing the three previously established quality dimensions (volume, variety, and velocity). But there has been little discussion of the concept of veracity thus far. This paper provides a roadmap for theoretical and empirical definitions of veracity along with its practical implications. We explore veracity across three main dimensions: 1) objectivity/subjectivity, 2) truthfulness/deception, 3) credibility/implausibility – and propose to operationalize each of these dimensions with either existing computational tools or potential ones, relevant particularly to textual data analytics. We combine the measures of veracity dimensions into one composite index – the big data veracity index. This newly developed veracity index provides a useful way of assessing systematic variations in big data quality across datasets with textual information. The paper contributes to big data research by categorizing the range of existing tools to measure the suggested dimensions, and to Library and Information Science (LIS) by proposing to account for heterogeneity of diverse big data, and to identify information quality dimensions important for each big data type.
TI-CNN: Convolutional Neural Networks for Fake News Detection
With the development of social networks, fake news for various commercial and
political purposes has been appearing in large numbers and has become widespread
in the online world. With deceptive words, people can be taken in by fake
news very easily and will share it without any fact-checking. For instance,
during the 2016 US presidential election, various kinds of fake news about the
candidates spread widely through both official news media and online social
networks. Such fake news is usually released either to smear opponents or to
support the candidate on their side. The erroneous information in fake news
is usually written to stir up voters' irrational emotion and enthusiasm.
Fake news of this kind can sometimes have devastating effects, and an
important goal in improving the credibility of online social networks is to
identify fake news in a timely manner. In this paper, we propose to study the
fake news detection problem. Automatic fake news identification is extremely
hard, since purely model-based fact-checking for news is still an open problem,
and few existing models can be applied to solve it. Through a thorough
investigation of fake news data, many useful explicit features are
identified from both the text and the images used in fake news. Besides
the explicit features, there also exist hidden patterns in the words and
images used in fake news, which can be captured with a set of latent features
extracted via the multiple convolutional layers in our model. A model named
TI-CNN (Text and Image information based Convolutional Neural Network) is
proposed in this paper. By projecting the explicit and latent features into a
unified feature space, TI-CNN is trained with both the text and image
information simultaneously. Extensive experiments carried out on real-world
fake news datasets have demonstrated the effectiveness of TI-CNN.
Towards Structured Analysis of Broadcast Badminton Videos
Sports video data is recorded for nearly every major tournament but remains
archived and inaccessible to large-scale data mining and analytics. It can only
be viewed sequentially or manually tagged with higher-level labels, which is
time-consuming and prone to errors. In this work, we propose an end-to-end
framework for automatic attribute tagging and analysis of sports videos. We use
commonly available broadcast videos of matches and, unlike previous approaches,
do not rely on special camera setups or additional sensors.
Our focus is on Badminton as the sport of interest. We propose a method to
analyze a large corpus of badminton broadcast videos by segmenting the points
played, tracking and recognizing the players in each point and annotating their
respective badminton strokes. We evaluate the performance on 10 Olympic matches
with 20 players and achieve 95.44% point segmentation accuracy, 97.38% player
detection score (mAP@0.5), 97.98% player identification accuracy, and stroke
segmentation edit scores of 80.48%. We further show that the automatically
annotated videos alone can enable gameplay analysis and inference by
computing understandable metrics such as players' reaction time, speed, and
footwork around the court.
Comment: 9 page
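The abstract above reports a stroke segmentation edit score without defining it. As a rough illustration only, the sketch below uses the common segmental edit score: per-frame labels are collapsed into an ordered list of segments, and the Levenshtein distance between the predicted and reference segment lists is normalized by the longer list. This definition, the function names, and the stroke labels are assumptions for illustration, not details taken from the paper.

```python
from itertools import groupby


def segments(frame_labels):
    """Collapse per-frame labels into the ordered list of segment labels."""
    return [label for label, _ in groupby(frame_labels)]


def levenshtein(a, b):
    """Edit distance between two sequences (insert, delete, substitute)."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        curr = [i]
        for j, y in enumerate(b, start=1):
            cost = 0 if x == y else 1
            curr.append(min(prev[j] + 1,          # delete x
                            curr[j - 1] + 1,      # insert y
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def edit_score(predicted, reference):
    """Segmental edit score in [0, 100]; 100 means identical segment order."""
    p, r = segments(predicted), segments(reference)
    longest = max(len(p), len(r))
    if longest == 0:
        return 100.0
    return 100.0 * (1 - levenshtein(p, r) / longest)


# Hypothetical stroke labels, one per video frame.
reference = ["serve"] * 3 + ["lob"] * 2 + ["smash"] * 3
shifted = ["serve"] * 2 + ["lob"] * 4 + ["smash"] * 2   # same order, shifted boundaries
missed = ["serve"] * 4 + ["smash"] * 4                  # the "lob" segment is missed

print(edit_score(shifted, reference))
print(edit_score(missed, reference))
```

Under this definition the score depends only on the order of segments, so shifted boundaries are not penalized, while a missed or spurious stroke is.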
Predicting the Law Area and Decisions of French Supreme Court Cases
In this paper, we investigate the application of text classification methods
to predict the law area and the decision of cases judged by the French Supreme
Court. We also investigate the influence of the time period in which a ruling
was made over the textual form of the case description and the extent to which
it is necessary to mask the judge's motivation for a ruling to emulate a
real-world test scenario. We report results of 96% F1 score in predicting a
case ruling, 90% F1 score in predicting the law area of a case, and 75.9% F1
score in estimating the time span when a ruling has been issued using a linear
Support Vector Machine (SVM) classifier trained on lexical features.
Comment: RANLP 201
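For readers unfamiliar with the F1 scores quoted above, the sketch below shows how a per-class F1 and its unweighted (macro) average can be computed from true and predicted labels. The abstract does not say which averaging the authors used, so the macro average is an assumption, and the ruling labels are hypothetical examples rather than data from the paper.

```python
def f1_per_class(y_true, y_pred):
    """Return {label: F1}, where F1 = 2*TP / (2*TP + FP + FN) for each label."""
    scores = {}
    for label in sorted(set(y_true) | set(y_pred)):
        pairs = list(zip(y_true, y_pred))
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        denom = 2 * tp + fp + fn
        scores[label] = 2 * tp / denom if denom else 0.0
    return scores


def macro_f1(y_true, y_pred):
    """Unweighted mean of the per-class F1 scores (assumed averaging)."""
    scores = f1_per_class(y_true, y_pred)
    return sum(scores.values()) / len(scores)


# Hypothetical ruling labels for four cases.
y_true = ["rejection", "rejection", "cassation", "cassation"]
y_pred = ["rejection", "cassation", "cassation", "cassation"]

print(f1_per_class(y_true, y_pred))
print(macro_f1(y_true, y_pred))
```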