Clustering Human Trust Dynamics for Customized Real-time Prediction
Trust calibration is necessary to ensure appropriate user acceptance of
advanced automation technologies. A significant challenge in achieving trust
calibration is quantitatively estimating human trust in real time. Although
multiple trust models exist, these models have limited predictive performance
partly due to individual differences in trust dynamics. A personalized model
for each person can address this issue, but it requires a significant amount of
data for each user. We present a methodology for developing customized models
by clustering humans based on their trust dynamics. The clustering-based method
addresses individual differences in trust dynamics while requiring
significantly less data than a fully personalized model. We show that our
clustering-based customized models not only outperform a general model based
on the entire population, but also outperform simple demographic-factor-based
customized models. Specifically, we propose that two models based on
"confident" and "skeptical" groups of participants, respectively, can
represent the trust behavior of the population. The "confident" participants,
compared to the "skeptical" participants, have higher initial trust levels,
lose trust more slowly when they encounter low-reliability operations, and
maintain higher trust levels during trust repair after those operations. In
summary, clustering-based customized models improve trust
prediction performance for further trust-calibration considerations.
Comment: To be published in the 2021 IEEE 24th International Conference on
Intelligent Transportation Systems (ITSC).
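The clustering step described above can be sketched as follows — a minimal illustration using k-means over synthetic per-participant trust trajectories. The synthetic data, the choice of k=2 (matching the two reported groups), and all variable names are assumptions for illustration, not the paper's implementation:

```python
import numpy as np
from sklearn.cluster import KMeans

# Hypothetical trust trajectories: one row per participant, each column a
# trust rating sampled over the course of an interaction (values fabricated).
rng = np.random.default_rng(0)
confident = 0.8 - 0.1 * rng.random((10, 20)).cumsum(axis=1) / 20
skeptical = 0.5 - 0.2 * rng.random((10, 20)).cumsum(axis=1) / 20
trajectories = np.vstack([confident, skeptical])

# Cluster participants by their trust dynamics (k=2: "confident" vs "skeptical").
kmeans = KMeans(n_clusters=2, n_init=10, random_state=0).fit(trajectories)
labels = kmeans.labels_

# A separate trust-prediction model would then be trained per cluster,
# using far less data than one model per individual.
print(labels)
```

A new user would be assigned to the nearest cluster centroid and served by that cluster's model.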
ViCor: Bridging Visual Understanding and Commonsense Reasoning with Large Language Models
In our work, we explore the synergistic capabilities of pre-trained
vision-and-language models (VLMs) and large language models (LLMs) for visual
commonsense reasoning (VCR). We categorize the problem of VCR into visual
commonsense understanding (VCU) and visual commonsense inference (VCI). For
VCU, which involves perceiving the literal visual content, pre-trained VLMs
exhibit strong cross-dataset generalization. On the other hand, in VCI, where
the goal is to infer conclusions beyond image content, VLMs face difficulties.
We find that a baseline where VLMs provide perception results (image captions)
to LLMs leads to improved performance on VCI. However, we identify a challenge
with VLMs' passive perception, which often misses crucial context information,
leading to incorrect or uncertain reasoning by LLMs. To mitigate this issue, we
suggest a collaborative approach where LLMs, when uncertain about their
reasoning, actively direct VLMs to concentrate on and gather relevant visual
elements to support potential commonsense inferences. In our method, named
ViCor, pre-trained LLMs serve as problem classifiers to analyze the problem
category, VLM commanders to leverage VLMs differently based on the problem
classification, and visual commonsense reasoners to answer the question. VLMs
will perform visual recognition and understanding. We evaluate our framework on
two VCR benchmark datasets and outperform all other methods that do not require
in-domain supervised fine-tuning.
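The control flow the abstract describes — classify the problem, try passive perception first, then have the LLM actively command the VLM when it is unsure — can be sketched with stub functions. All function names, the caption text, and the classification heuristic below are placeholders, not the paper's actual components:

```python
def vlm_caption(image):
    # Stand-in for a pre-trained VLM's passive perception (a literal caption).
    return "a man holding an umbrella on a sunny street"

def vlm_query(image, instruction):
    # Stand-in for actively directing the VLM toward a specific visual detail.
    return f"detail for '{instruction}': the umbrella is closed"

def llm_classify(question):
    # LLM as problem classifier: literal perception (VCU) vs inference (VCI).
    literal_cues = ("what color", "how many", "what is in")
    return "VCU" if question.lower().startswith(literal_cues) else "VCI"

def llm_reason(question, evidence):
    # LLM as commonsense reasoner over accumulated visual evidence; returns
    # an (answer, confident?) pair. Stubbed out here.
    confident = len(evidence) > 1
    return ("he carries it in case of rain", confident)

def vicor_answer(image, question):
    category = llm_classify(question)
    evidence = [vlm_caption(image)]           # passive perception first
    if category == "VCU":
        return evidence[0]                    # VLM alone handles literal questions
    answer, confident = llm_reason(question, evidence)
    if not confident:
        # LLM as VLM commander: gather the missing visual context, then retry.
        evidence.append(vlm_query(image, "why is the umbrella held?"))
        answer, _ = llm_reason(question, evidence)
    return answer

print(vicor_answer(None, "Why is the man holding an umbrella?"))
```

The key design point mirrored here is that the expensive active-querying path only triggers for inferential (VCI) questions where the first reasoning pass is uncertain.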
Boosting Standard Classification Architectures Through a Ranking Regularizer
We employ triplet loss as a feature embedding regularizer to boost
classification performance. Standard architectures, like ResNet and Inception,
are extended to support both losses with minimal hyper-parameter tuning. This
promotes generality while fine-tuning pretrained networks. Triplet loss is a
powerful surrogate for recently proposed embedding regularizers. Yet, it is
often avoided due to its large batch-size requirement and high computational
cost. Through our experiments, we re-assess these assumptions.
During inference, our network supports both classification and embedding
tasks without any computational overhead. Quantitative evaluation highlights a
steady improvement on five fine-grained recognition datasets. Further
evaluation on an imbalanced video dataset achieves significant improvement.
Triplet loss brings feature-embedding characteristics, such as
nearest-neighbor retrieval, to
classification models. Code available at \url{http://bit.ly/2LNYEqL}.
Comment: WACV 2020 camera-ready + supplementary material.
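The joint objective — standard cross-entropy plus a triplet loss on the penultimate embedding — can be sketched in PyTorch. The tiny backbone, the 0.5 loss weight, and the 0.2 margin are illustrative assumptions, not the paper's settings:

```python
import torch
import torch.nn as nn

# A classifier whose penultimate embedding is also regularized by triplet loss.
class EmbedClassifier(nn.Module):
    def __init__(self, in_dim=32, emb_dim=16, n_classes=5):
        super().__init__()
        self.backbone = nn.Sequential(nn.Linear(in_dim, emb_dim), nn.ReLU())
        self.classifier = nn.Linear(emb_dim, n_classes)

    def forward(self, x):
        emb = self.backbone(x)          # shared embedding for both tasks
        return emb, self.classifier(emb)

model = EmbedClassifier()
ce, triplet = nn.CrossEntropyLoss(), nn.TripletMarginLoss(margin=0.2)

# Toy batch: anchors, positives (same class), negatives (different class).
anchor, pos, neg = torch.randn(8, 32), torch.randn(8, 32), torch.randn(8, 32)
labels = torch.randint(0, 5, (8,))

emb_a, logits = model(anchor)
emb_p, _ = model(pos)
emb_n, _ = model(neg)

# Joint loss: classification term plus the embedding regularizer.
loss = ce(logits, labels) + 0.5 * triplet(emb_a, emb_p, emb_n)
loss.backward()  # one regularized training step (optimizer omitted)
print(float(loss))
```

At inference time only a single forward pass is needed, which is why the abstract notes that both the classification and embedding tasks come with no extra computational overhead.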
The Interaction Gap: A Step Toward Understanding Trust in Autonomous Vehicles Between Encounters
Shared autonomous vehicles (SAVs) will be introduced in greater numbers over
the coming decade. Due to rapid advances in shared mobility and the slower
development of fully autonomous vehicles (AVs), SAVs will likely be deployed
before privately-owned AVs. Moreover, existing shared mobility services are
transitioning their vehicle fleets toward those with increasingly higher levels
of driving automation. Consequently, people who use shared vehicles on an "as
needed" basis will have infrequent interactions with automated driving, thereby
experiencing interaction gaps. Using human trust data from 25 participants, we
show that interaction gaps can affect human trust in automated driving.
Participants engaged in a simulator study consisting of two interactions
separated by a one-week interaction gap. A moderate inverse correlation was
found between the change in trust during the initial interaction and the
change in trust across the gap, suggesting that people "forget" some of their
gained trust or distrust in automation during an interaction gap.
Comment: 5 pages, 3 figures.
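The reported inverse correlation can be illustrated numerically: participants who gained (or lost) more trust in the first session drift back toward baseline over the gap. The data below are fabricated for the sketch, and the variable names are assumptions, not the study's measures:

```python
import numpy as np

# Synthetic illustration of the "forgetting" effect for 25 participants.
rng = np.random.default_rng(1)
gain_session1 = rng.normal(0.0, 1.0, 25)             # trust change within session 1
drift_over_gap = -0.5 * gain_session1 + rng.normal(0.0, 0.5, 25)  # change over gap

# Pearson correlation between the two trust changes (negative by construction).
r = np.corrcoef(gain_session1, drift_over_gap)[0, 1]
print(round(r, 2))
```

A negative `r` here corresponds to the study's finding: the larger the trust change built up during an interaction, the more of it is lost during the gap.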
A Speech-Based Information Guidance System with Optimal Dialogue Strategies Based on Information Retrieval and Question Answering
Doctoral thesis, Doctor of Informatics, Kyoto University (Graduate School of
Informatics, Department of Intelligence Science and Technology). Degree No.
Kou 13976; Johaku No. 291. Examination committee: Prof. Tatsuya Kawahara
(chief examiner), Prof. Hiroshi Okuno, Prof. Sadao Kurohashi.