6 research outputs found
An Inter-observer consistent deep adversarial training for visual scanpath prediction
The visual scanpath is the sequence of points through which the human gaze
moves while exploring a scene. It embodies the fundamental concepts upon
which visual attention research is based, and predicting scanpaths has
consequently emerged as an important task in recent years. In this paper, we
propose an inter-observer consistent adversarial training approach for scanpath
prediction through a lightweight deep neural network. The adversarial method
employs a discriminative neural network as a dynamic loss that is better suited
to model the natural stochastic phenomenon while maintaining consistency
between the distributions related to the subjective nature of scanpaths
traversed by different observers. Through extensive testing, we show the
competitiveness of our approach with respect to state-of-the-art methods.
Comment: ICIP202
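The idea of a discriminator acting as a dynamic loss can be sketched in heavily reduced form: a logistic discriminator scores generated scanpaths, and the generator is trained with the non-saturating loss -log D(G(z)). The scanpath format, weights, and discriminator below are illustrative assumptions, not the paper's network:

```python
import numpy as np

def discriminator_score(scanpath, w, b):
    """Tiny logistic 'discriminator': maps a flattened scanpath to a
    realism probability in (0, 1). w and b stand in for learned weights."""
    logit = float(scanpath.flatten() @ w + b)
    return 1.0 / (1.0 + np.exp(-logit))

def adversarial_generator_loss(fake_scanpaths, w, b):
    """Non-saturating GAN generator loss, -log D(G(z)), averaged over a
    batch of generated scanpaths. Because the discriminator is trained
    alongside the generator, this loss changes dynamically over training."""
    scores = np.array([discriminator_score(s, w, b) for s in fake_scanpaths])
    return float(-np.log(scores + 1e-12).mean())

rng = np.random.default_rng(0)
# A scanpath as a sequence of 8 fixation points (x, y), normalised to [0, 1].
fake = [rng.random((8, 2)) for _ in range(4)]
w, b = rng.normal(size=16) * 0.1, 0.0
loss = adversarial_generator_loss(fake, w, b)
```

In the full method the discriminator would also be trained to separate human scanpaths from generated ones; only the generator side of that loop is shown here.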
A domain adaptive deep learning solution for scanpath prediction of paintings
Cultural heritage understanding and preservation is an important issue for
society as it represents a fundamental aspect of its identity. Paintings
represent a significant part of cultural heritage, and are the subject of study
continuously. However, the way viewers perceive paintings is closely related
to the behaviour of the HVS (Human Visual System). This paper focuses on the
eye-movement analysis of viewers during the visual experience of a certain
number of paintings. More specifically, we introduce a new approach to
predicting human visual attention, which affects several cognitive functions
in humans, including the fundamental understanding of a scene, and then extend
it to painting images. The proposed new architecture ingests images and returns
scanpaths, a sequence of points featuring a high likelihood of catching
viewers' attention. We use an FCNN (Fully Convolutional Neural Network), in
which we exploit a differentiable channel-wise selection and Soft-Argmax
modules. We also incorporate learnable Gaussian distributions at the network
bottleneck to simulate the bias of the visual attention process in natural
scene images. Furthermore, to reduce the effect of shifts between different
domains (i.e., natural images vs. paintings), we encourage the model to learn
general features from other domains in an unsupervised manner using a
gradient reversal classifier. Our model outperforms existing state-of-the-art
methods in terms of accuracy and efficiency.
Comment: Accepted at CBMI2022, Graz, Austria
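A Soft-Argmax module of the kind named above can be sketched in NumPy: a softmax turns a heatmap into a probability map, and the expected coordinate under that map is a differentiable surrogate for the argmax. The temperature value and heatmap size are illustrative assumptions:

```python
import numpy as np

def soft_argmax(heatmap, temperature=10.0):
    """Differentiable argmax: softmax over the heatmap, then the expected
    (row, col) coordinate under the resulting distribution. A higher
    temperature sharpens the softmax towards the hard argmax."""
    h, w = heatmap.shape
    p = np.exp(temperature * (heatmap - heatmap.max()))
    p /= p.sum()
    rows, cols = np.meshgrid(np.arange(h), np.arange(w), indexing="ij")
    return float((p * rows).sum()), float((p * cols).sum())

heatmap = np.zeros((16, 16))
heatmap[3, 5] = 1.0          # single salient peak
y, x = soft_argmax(heatmap)  # close to (3, 5)
```

Because the output is a smooth function of the heatmap values, gradients can flow through the predicted fixation coordinates during training, which a hard argmax would not allow.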
Insights into Classifying and Mitigating LLMs' Hallucinations
The widespread adoption of large language models (LLMs) across diverse AI
applications is proof of the outstanding achievements obtained in several
tasks, such as text mining, text generation, and question answering. However,
LLMs are not exempt from drawbacks. One of the most concerning is the
emerging problematic phenomenon known as "hallucination". Hallucinations
manifest in text generation systems, particularly in question-answering
systems reliant on LLMs, potentially resulting in the propagation of false
or misleading information.
This paper delves into the underlying causes of AI hallucination and elucidates
its significance in artificial intelligence. In particular, we tackle
hallucination classification across several tasks (Machine Translation,
Question Answering, Dialogue Systems, Summarisation, Knowledge Graphs with
LLMs, and Visual Question Answering). Additionally, we explore potential
strategies to
mitigate hallucinations, aiming to enhance the overall reliability of LLMs. Our
research addresses this critical issue within the HeReFaNMi (Health-Related
Fake News Mitigation) project, generously supported by NGI Search, dedicated to
combating Health-Related Fake News dissemination on the Internet. This
endeavour represents a concerted effort to safeguard the integrity of
information dissemination in an age of evolving AI technologies.Comment: Accepted at AIxIA 202
Automatic diagnosis of knee osteoarthritis severity using Swin transformer
Knee osteoarthritis (KOA) is a widespread condition that can cause chronic
pain and stiffness in the knee joint. Early detection and diagnosis are crucial
for successful clinical intervention and management to prevent severe
complications, such as loss of mobility. In this paper, we propose an automated
approach that employs the Swin Transformer to predict the severity of KOA. Our
model uses publicly available radiographic datasets with Kellgren and Lawrence
scores to enable early detection and severity assessment. To improve the
accuracy of our model, we employ a multi-prediction head architecture that
utilizes multi-layer perceptron classifiers. Additionally, we introduce a novel
training approach that reduces the data drift between multiple datasets to
ensure the generalization ability of the model. The results of our experiments
demonstrate the effectiveness and feasibility of our approach in predicting KOA
severity accurately.
Comment: CBMI 202
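One hedged reading of the multi-prediction head design is an ensemble of small MLP classifiers over shared backbone features, with the per-head class probabilities averaged. The sketch below assumes 64-dimensional features, three heads, and the five Kellgren and Lawrence grades (0-4); all sizes are illustrative, not the paper's configuration:

```python
import numpy as np

def mlp_head(features, w1, w2):
    """One prediction head: hidden ReLU layer followed by a linear layer
    producing logits over the 5 Kellgren-Lawrence grades."""
    return np.maximum(features @ w1, 0.0) @ w2

def softmax(z):
    e = np.exp(z - z.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def multi_head_predict(features, heads):
    """Average the per-head class probabilities (ensemble-style fusion)."""
    return np.mean([softmax(mlp_head(features, w1, w2))
                    for w1, w2 in heads], axis=0)

rng = np.random.default_rng(1)
feats = rng.normal(size=(2, 64))   # backbone features for 2 radiographs
heads = [(rng.normal(size=(64, 32)), rng.normal(size=(32, 5)))
         for _ in range(3)]
probs = multi_head_predict(feats, heads)  # shape (2, 5), rows sum to 1
```

In practice the shared features would come from the Swin Transformer backbone and each head would be trained jointly; only the fusion arithmetic is shown here.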
Detecting colour vision deficiencies via Webcam-based Eye-tracking: A case study
Webcam-based eye-tracking platforms have recently reemerged due to improvements in machine learning-supported calibration processes and offer a scalable option for conducting eye movement studies. Although not yet comparable to infrared-based systems in terms of accuracy and sampling frequency, some compelling performances have been observed, especially in scenarios with medium-sized
AOI (Areas of Interest) in images. In this study, we test the
reliability of webcam-based eye-tracking on a specific task: eye-movement
distribution analysis for CVD (Colour Vision Deficiency) detection. We introduce a new publicly available eye movement dataset based on a pilot study (n=12) on images with a dominant red colour (previously shown to be difficult for dichromats), with AOIs designed to investigate CVD by comparing attention patterns obtained in webcam eye-tracking sessions. We hypothesized that webcam eye-tracking without infrared support could detect differing attention patterns between CVD and non-CVD participants, and we observed statistically significant differences, supporting our hypothesis.
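A group comparison of attention patterns of this kind could, for instance, use a simple permutation test on per-participant dwell proportions inside the red-dominant AOI. The sketch below uses hypothetical illustrative values, not the study's data:

```python
import numpy as np

def permutation_test(group_a, group_b, n_perm=10000, seed=0):
    """Two-sided permutation test on the difference in mean AOI dwell
    proportion between two participant groups: reshuffle group labels and
    count how often the shuffled difference matches or exceeds the
    observed one."""
    rng = np.random.default_rng(seed)
    pooled = np.concatenate([group_a, group_b])
    observed = abs(group_a.mean() - group_b.mean())
    n_a = len(group_a)
    count = 0
    for _ in range(n_perm):
        perm = rng.permutation(pooled)
        if abs(perm[:n_a].mean() - perm[n_a:].mean()) >= observed:
            count += 1
    return (count + 1) / (n_perm + 1)  # add-one smoothing avoids p = 0

# Hypothetical dwell proportions on the red-dominant AOI (illustrative only).
non_cvd = np.array([0.62, 0.58, 0.65, 0.60, 0.70, 0.63])
cvd     = np.array([0.35, 0.41, 0.30, 0.38, 0.33, 0.44])
p_value = permutation_test(non_cvd, cvd)  # small p = differing patterns
```

A permutation test makes no normality assumption, which suits the small sample size (n=12) of a pilot study.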