Deep Reinforcement-Learning-based Driving Policy for Autonomous Road Vehicles
In this work the problem of path planning for an autonomous vehicle that
moves on a freeway is considered. The most common approaches to this problem
are based on optimal control methods, which make assumptions about the model of
the environment and the system dynamics. In contrast, this work proposes a
driving policy based on reinforcement learning. As a result, the proposed
driving policy makes minimal or no assumptions about the environment, since a
priori knowledge of the system dynamics is not required. Driving scenarios
where the road is occupied
both by autonomous and manual driving vehicles are considered. To the best of
our knowledge, this is one of the first approaches that propose a reinforcement
learning driving policy for mixed driving environments. The derived
reinforcement learning policy, firstly, is compared against an optimal policy
derived via dynamic programming, and, secondly, its efficiency is evaluated
under realistic scenarios generated by the established SUMO microscopic traffic
flow simulator. Finally, some initial results regarding the effect of
autonomous vehicles' behavior on the overall traffic flow are presented.Comment: 19 pages. arXiv admin note: substantial text overlap with
arXiv:1905.0904
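The abstract above contrasts model-based optimal control with a model-free reinforcement learning policy. As a minimal sketch of the model-free idea, the toy below runs tabular Q-learning on a hypothetical, drastically simplified freeway MDP; the state space (lane plus a discretised headway), reward shape, and all hyperparameters are illustrative assumptions, not the paper's actual formulation.

```python
import random

random.seed(0)  # reproducible toy run

# Hypothetical toy freeway MDP: 3 lanes and a coarse "gap ahead" feature.
LANES = 3
GAPS = 4  # discretised headway to the vehicle ahead: 0 (close) .. 3 (free)
ACTIONS = ["keep", "left", "right"]

def step(state, action):
    """Toy transition: lane changes shift the lane; the gap evolves randomly."""
    lane, gap = state
    if action == "left" and lane > 0:
        lane -= 1
    elif action == "right" and lane < LANES - 1:
        lane += 1
    gap = max(0, min(GAPS - 1, gap + random.choice([-1, 0, 1])))
    # Illustrative reward: favour free-flowing driving, mildly penalise lane changes.
    reward = gap - (0.5 if action != "keep" else 0.0)
    return (lane, gap), reward

def q_learning(episodes=2000, alpha=0.1, gamma=0.9, eps=0.1, horizon=20):
    """Model-free tabular Q-learning: no transition model is ever consulted."""
    q = {((l, g), a): 0.0 for l in range(LANES) for g in range(GAPS) for a in ACTIONS}
    for _ in range(episodes):
        state = (random.randrange(LANES), random.randrange(GAPS))
        for _ in range(horizon):
            if random.random() < eps:
                action = random.choice(ACTIONS)  # explore
            else:
                action = max(ACTIONS, key=lambda a: q[(state, a)])  # exploit
            nxt, reward = step(state, action)
            best_next = max(q[(nxt, a)] for a in ACTIONS)
            q[(state, action)] += alpha * (reward + gamma * best_next - q[(state, action)])
            state = nxt
    return q

q = q_learning()
# With a large gap ahead, the learned policy should tend to stay in lane.
policy = max(ACTIONS, key=lambda a: q[((1, GAPS - 1), a)])
print(policy)
```

The point of the sketch is the absence of a dynamics model: the agent only samples transitions, which is what lets the paper's approach avoid the a priori system-dynamics assumptions of optimal control.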
Towards General Game Representations: Decomposing Games Pixels into Content and Style
On-screen game footage contains rich contextual information that players
process when playing and experiencing a game. Learning pixel representations of
games can benefit artificial intelligence across several downstream tasks
including game-playing agents, procedural content generation, and player
modelling. The generalizability of these methods, however, remains a challenge,
as learned representations should ideally be shared across games with similar
game mechanics. This could allow, for instance, game-playing agents trained on
one game to perform well in similar games with no re-training. This paper
explores how generalizable pre-trained computer vision encoders can be for such
tasks, by decomposing the latent space into content embeddings and style
embeddings. The goal is to minimize the domain gap between games of the same
genre when it comes to game content critical for downstream tasks, and ignore
differences in graphical style. We employ a pre-trained Vision Transformer
encoder and a decomposition technique based on game genres to obtain separate
content and style embeddings. Our findings show that the decomposed embeddings
achieve style invariance across multiple games while still maintaining strong
content extraction capabilities. We argue that the proposed decomposition of
content and style offers better generalization capacities across game
environments, independently of the downstream task.
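The core idea above is separating what a frame shows (content) from how it is drawn (style) so that same-genre games align in content space. The paper's decomposition is learned and genre-based; the sketch below assumes the simplest possible stand-in, where a game's "style" is the mean of its frame embeddings and "content" is the residual, using fictional embedding data.

```python
import numpy as np

rng = np.random.default_rng(0)

def decompose(frame_embeddings):
    """Split encoder embeddings (frames x dims) into a style vector and
    style-invariant content residuals. Illustrative assumption, not the
    paper's learned decomposition."""
    style = frame_embeddings.mean(axis=0)   # per-game "style" vector
    content = frame_embeddings - style      # residuals around the style
    return style, content

# Two fictional games of the same genre: a shared content signal plus a
# different constant offset per game, standing in for graphical style.
shared_content = rng.normal(size=(100, 16))
game_a = shared_content + 5.0
game_b = shared_content - 3.0

_, content_a = decompose(game_a)
_, content_b = decompose(game_b)

# After removing style, the two games' content embeddings align.
gap = np.abs(content_a - content_b).max()
print(gap)  # ≈ 0: the domain gap between the games' content vanishes
```

Under this toy assumption the style offset is removed exactly; in the paper the same goal (style invariance with preserved content) is pursued with a pre-trained Vision Transformer encoder and a genre-based decomposition.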
Tensor-based Nonlinear Classifier for High-Order Data Analysis
In this paper we propose a tensor-based nonlinear model for high-order data
classification. The advantages of the proposed scheme are that (i) it
significantly reduces the number of weight parameters, and hence of required
training samples, and (ii) it retains the spatial structure of the input
samples. The proposed model, called \textit{Rank}-1 FNN, is based on a
modification of a feedforward neural network (FNN), such that its weights
satisfy the {\it rank}-1 canonical decomposition. We also introduce a new
learning algorithm to train the model, and we evaluate the \textit{Rank}-1 FNN
on third-order hyperspectral data. Experimental results and comparisons
indicate that the proposed model outperforms state-of-the-art classification
methods, including deep learning based ones, especially in cases with small
numbers of available training samples.

Comment: To appear in IEEE ICASSP 2018. arXiv admin note: text overlap with arXiv:1709.0816
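The parameter saving claimed above comes from constraining a weight tensor to a rank-1 canonical (CP) form, W = a ⊗ b ⊗ c, so an I × J × K tensor of weights collapses to I + J + K free parameters. The sketch below illustrates a single such hidden-unit response on a third-order input; the shapes are illustrative and the paper's full Rank-1 FNN architecture and training algorithm are not reproduced.

```python
import numpy as np

rng = np.random.default_rng(0)
I, J, K = 8, 8, 16   # e.g. a spatial patch with K spectral bands (assumed sizes)

# Rank-1 factors: only I + J + K parameters instead of I * J * K.
a, b, c = rng.normal(size=I), rng.normal(size=J), rng.normal(size=K)

def rank1_response(x):
    """Hidden-unit pre-activation <W, X> with W = a ⊗ b ⊗ c.

    Contracting mode by mode never materialises the full weight tensor.
    """
    return np.einsum("i,j,k,ijk->", a, b, c, x)

x = rng.normal(size=(I, J, K))

# Check against the explicit (parameter-hungry) dense weight tensor.
w_dense = np.einsum("i,j,k->ijk", a, b, c)
dense_response = np.sum(w_dense * x)
print(np.isclose(rank1_response(x), dense_response))  # True

# Parameter count: rank-1 factors vs dense tensor.
print(I + J + K, I * J * K)  # 32 vs 1024
```

The rank-1 contraction also acts mode by mode on the input tensor, which is how the spatial structure of the samples is retained rather than flattened away.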
From pixels to affect : a study on games and player experience
Is it possible to predict the affect of a user just
by observing her behavioral interaction through a video? How
can we, for instance, predict a user’s arousal in games by
merely looking at the screen during play? In this paper we
address these questions by employing three dissimilar deep
convolutional neural network architectures in our attempt to
learn the underlying mapping between video streams of gameplay
and the player’s arousal. We test the algorithms in an annotated
dataset of 50 gameplay videos of a survival shooter game and
evaluate the deep learned models’ capacity to classify high vs low
arousal levels. Our key findings with the demanding leave-one-video-out
validation method reveal accuracies of over 78% on
average and 98% at best. While this study focuses on games and
player experience as a test domain, the findings and methodology
are directly relevant to any affective computing area, introducing
a general and user-agnostic approach for modeling affect.

This paper is funded, in part, by the H2020 project Com N Play Science (project no: 787476). Peer reviewed.
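The leave-one-video-out protocol mentioned above is the evaluation backbone: each video in turn is held out for testing while models train on the rest, so no frames from the test video leak into training. A minimal sketch of the split logic follows; the classifier is a trivial majority-class placeholder, since the paper's deep convolutional architectures are not reproduced here.

```python
from collections import Counter

def leave_one_video_out(samples):
    """samples: list of (video_id, features, label) tuples.
    Yields (held-out video id, fold accuracy) per fold."""
    video_ids = sorted({vid for vid, _, _ in samples})
    for held_out in video_ids:
        train = [(x, y) for vid, x, y in samples if vid != held_out]
        test = [(x, y) for vid, x, y in samples if vid == held_out]
        # Placeholder "model": predict the majority training label.
        majority = Counter(y for _, y in train).most_common(1)[0][0]
        correct = sum(1 for _, y in test if y == majority)
        yield held_out, correct / len(test)

# Toy data: 3 hypothetical videos with binary high/low arousal labels.
data = ([("v1", None, "high")] * 4
        + [("v2", None, "low")] * 4
        + [("v3", None, "high")] * 4)
for vid, acc in leave_one_video_out(data):
    print(vid, acc)
```

Grouping the split by video rather than by frame is what makes the protocol "demanding": frames from one video are highly correlated, and a frame-level random split would inflate accuracy.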
Knowing Your Annotator: Rapidly Testing the Reliability of Affect Annotation
The laborious and costly nature of affect annotation is a key detrimental factor for obtaining large-scale corpora with valid and reliable affect labels. Motivated by the lack of tools that can effectively determine an annotator's reliability, this paper proposes general quality assurance (QA) tests for real-time continuous annotation tasks. Assuming that the annotation tasks rely on stimuli with audiovisual components, such as videos, we propose and evaluate two QA tests: a visual and an auditory QA test. We validate the QA tool across 20 annotators who are asked to go through the test followed by a lengthy task of annotating the engagement of gameplay videos. Our findings suggest that the proposed QA tool reveals, unsurprisingly, that trained annotators are more reliable than the best of the untrained crowdworkers we could employ. Importantly, the QA tool introduced can effectively predict the reliability of an affect annotator with 80% accuracy, thereby saving resources, effort and cost, and maximizing the reliability of labels solicited in affective corpora. The introduced QA tool is available and accessible through the PAGAN annotation platform.
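One plausible way to score a QA test of this kind is to show the annotator a known stimulus signal and check how closely their continuous trace tracks it. The sketch below assumes a Pearson-correlation criterion with an arbitrary 0.7 threshold; both the scoring rule and the threshold are illustrative assumptions, not the paper's actual criteria.

```python
import math

def pearson(xs, ys):
    """Pearson correlation between two equal-length traces."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

def passes_qa(stimulus, trace, threshold=0.7):
    """Annotator passes if their trace tracks the known stimulus closely.
    The threshold is an assumption for illustration."""
    return pearson(stimulus, trace) >= threshold

stimulus = [math.sin(t / 10) for t in range(100)]    # known test signal
good_trace = [s + 0.1 for s in stimulus]             # attentive annotator (offset is fine)
bad_trace = [0.0] * 50 + [1.0] * 50                  # inattentive annotator

print(passes_qa(stimulus, good_trace))  # True
print(passes_qa(stimulus, bad_trace))
```

A correlation-based score has the convenient property of ignoring each annotator's personal offset and scale, which matters for unbounded or relative annotation interfaces.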
Predicting Player Engagement in Tom Clancy's The Division 2: A Multimodal Approach via Pixels and Gamepad Actions
This paper introduces a large scale multimodal corpus collected for the
purpose of analysing and predicting player engagement in commercial-standard
games. The corpus is solicited from 25 players of the action role-playing game
Tom Clancy's The Division 2, who annotated their level of engagement using a
time-continuous annotation tool. The cleaned and processed corpus presented in
this paper consists of nearly 20 hours of annotated gameplay videos accompanied
by logged gamepad actions. We report preliminary results on predicting
long-term player engagement based on in-game footage and game controller
actions using Convolutional Neural Network architectures. The results obtained
suggest that we can predict player engagement with up to 72% accuracy on
average (88% at best) when we fuse information from the game footage and the
player's controller input. Our findings validate the hypothesis that long-term
(i.e. 1 hour of play) engagement can be predicted efficiently solely from
pixels and gamepad actions.

Comment: 8 pages, accepted for publication and presentation at the 2023 25th ACM International Conference on Multimodal Interaction (ICMI)
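The accuracy gain reported above comes from fusing the two modalities, footage and controller input. As a minimal sketch of one common fusion scheme, the toy below concatenates a pixel feature vector with a gamepad feature vector before a linear decision; all shapes, weights, and the early-fusion choice itself are illustrative assumptions, not the paper's CNN pipeline.

```python
import numpy as np

rng = np.random.default_rng(0)

def fuse(pixel_feat, pad_feat):
    """Early fusion by concatenating the two modality vectors."""
    return np.concatenate([pixel_feat, pad_feat])

pixel_feat = rng.normal(size=32)   # e.g. pooled features of a footage window (assumed)
pad_feat = rng.normal(size=8)      # e.g. per-button press frequencies (assumed)

fused = fuse(pixel_feat, pad_feat)
w = rng.normal(size=fused.shape)   # illustrative linear classifier weights

# Binary high/low engagement decision from the fused representation.
engaged = float(w @ fused) > 0.0
print(fused.shape, engaged)
```

The alternative is late fusion, where each modality gets its own classifier and only the predictions are combined; which scheme works better is an empirical question per dataset.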