One Self-Configurable Model to Solve Many Abstract Visual Reasoning Problems
Abstract Visual Reasoning (AVR) comprises a wide range of problems similar to those used in human IQ tests. Recent years have brought dynamic progress in solving particular AVR tasks; however, in the contemporary literature, AVR problems are largely dealt with in isolation, leading to highly specialized, task-specific methods. With the aim of developing universal learning systems in the AVR domain, we propose a unified model for solving
Single-Choice Abstract visual Reasoning tasks (SCAR), capable of solving
various single-choice AVR tasks, without making any a priori assumptions about
the task structure, in particular the number and location of panels. The
proposed model relies on a novel Structure-Aware dynamic Layer (SAL), which
adapts its weights to the structure of the considered AVR problem. Experiments
conducted on Raven's Progressive Matrices, Visual Analogy Problems, and Odd One
Out problems show that SCAR (SAL-based models, in general) effectively solves
diverse AVR tasks, and its performance is on par with the state-of-the-art
task-specific baselines. What is more, SCAR demonstrates effective knowledge
reuse in multi-task and transfer learning settings. To our knowledge, this work
is the first successful attempt to construct a general single-choice AVR solver
relying on a self-configurable architecture and a unified solving method. With this work, we aim to stimulate and foster progress on task-independent research paths in the AVR domain, with the long-term goal of developing a general AVR solver.
Comment: Accepted to the 38th Annual AAAI Conference on Artificial Intelligence (AAAI 2024).
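For illustration, below is a minimal PyTorch-style sketch of the general idea behind a structure-aware dynamic layer: a small hypernetwork maps a descriptor of the task structure (e.g., panel count and layout) to the layer's weights. All names and design details here are assumptions for exposition, not the paper's actual SAL implementation.

import torch
import torch.nn as nn

class StructureAwareLayer(nn.Module):
    # Hypothetical sketch: the layer's weights are generated from a
    # descriptor of the AVR task structure (e.g., number and layout
    # of panels), so one module can serve tasks of varying shape.
    def __init__(self, struct_dim, in_dim, out_dim):
        super().__init__()
        self.in_dim, self.out_dim = in_dim, out_dim
        # Hypernetwork mapping the structure descriptor to weights.
        self.weight_gen = nn.Linear(struct_dim, in_dim * out_dim)
        self.bias_gen = nn.Linear(struct_dim, out_dim)

    def forward(self, x, struct):
        # x: (batch, in_dim) panel features; struct: (batch, struct_dim),
        # e.g., encoding the numbers of context and answer panels.
        w = self.weight_gen(struct).view(-1, self.out_dim, self.in_dim)
        b = self.bias_gen(struct)
        return torch.bmm(w, x.unsqueeze(-1)).squeeze(-1) + b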
Double-oracle sampling method for Stackelberg Equilibrium approximation in general-sum extensive-form games
The paper presents a new method for approximating Strong Stackelberg
Equilibrium in general-sum sequential games with imperfect information and
perfect recall. The proposed approach is generic, as it does not rely on any specific properties of a particular game model. The method is based on iterative interleaving of the following two phases: (1) guided Monte Carlo Tree
Search sampling of the Follower's strategy space and (2) building the Leader's
behavior strategy tree for which the sampled Follower's strategy is an optimal
response. The above solution scheme is evaluated with respect to expected
Leader's utility and time requirements on three sets of interception games with
variable characteristics, played on graphs. A comparison with three
state-of-the-art MILP/LP-based methods shows that, in the vast majority of test cases, the proposed simulation-based approach leads to optimal Leader's strategies, while surpassing the competing methods in terms of time scalability and memory requirements.
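As background for the solution concept only (not the paper's simulation-based method), the sketch below computes a Strong Stackelberg Equilibrium of a small normal-form game with the classic multiple-LPs approach, representative of the LP-based baseline family mentioned above; the sequential, imperfect-information setting targeted by the paper is substantially harder.

import numpy as np
from scipy.optimize import linprog

def strong_stackelberg(A, B):
    # A[i, j]: Leader payoff, B[i, j]: Follower payoff.
    # For each Follower pure strategy j, solve an LP for the Leader
    # mixed strategy x maximizing Leader utility subject to j being
    # a Follower best response; keep the best j overall.
    m, n = A.shape
    best_val, best_x, best_j = -np.inf, None, None
    for j in range(n):
        rows = [B[:, k] - B[:, j] for k in range(n) if k != j]
        res = linprog(-A[:, j],
                      A_ub=np.array(rows) if rows else None,
                      b_ub=np.zeros(len(rows)) if rows else None,
                      A_eq=np.ones((1, m)), b_eq=[1.0], bounds=(0, 1))
        if res.success and -res.fun > best_val:
            best_val, best_x, best_j = -res.fun, res.x, j
    return best_x, best_j, best_val

# Tiny example: the Leader commits to a mixed strategy first,
# and the Follower best-responds.
A = np.array([[2.0, 1.0], [4.0, 3.0]])
B = np.array([[1.0, 0.0], [0.0, 2.0]])
print(strong_stackelberg(A, B))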
An overview of mixing augmentation methods and augmentation strategies
Deep Convolutional Neural Networks have made incredible progress in many Computer Vision tasks. This progress, however, often relies on the availability of large amounts of training data, required to prevent over-fitting, which in many domains entails a significant cost of manual data labeling. An alternative approach is the application of data augmentation (DA) techniques that
aim at model regularization by creating additional observations from the
available ones. This survey focuses on two DA research streams: image mixing
and automated selection of augmentation strategies. First, the presented
methods are briefly described, and then qualitatively compared with respect to
their key characteristics. Various quantitative comparisons are also included
based on the results reported in recent DA literature. This review mainly
covers methods published in the proceedings of top-tier conferences and in leading journals in the years 2017-2021.
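To ground the first of these streams, the sketch below shows canonical Mixup (Zhang et al., 2018), the archetype that the surveyed image-mixing methods refine: two training examples and their one-hot labels are combined by a convex weight drawn from a Beta distribution.

import numpy as np

def mixup(x1, y1, x2, y2, alpha=0.2):
    # Canonical Mixup: x_mix = lam*x1 + (1-lam)*x2, and likewise for
    # the labels, with lam ~ Beta(alpha, alpha). The guided variants
    # surveyed above replace this uniform blend with content-aware mixing.
    lam = np.random.beta(alpha, alpha)
    x = lam * x1 + (1.0 - lam) * x2
    y = lam * y1 + (1.0 - lam) * y2
    return x, y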
AttentionMix: Data augmentation method that relies on BERT attention mechanism
The Mixup method has proven to be a powerful data augmentation technique in
Computer Vision, with many successors that perform image mixing in a guided
manner. One of the interesting research directions is transferring the
underlying Mixup idea to other domains, e.g. Natural Language Processing (NLP).
Even though there already exist several methods that apply Mixup to textual
data, there is still room for new, improved approaches. In this work, we
introduce AttentionMix, a novel mixing method that relies on attention-based
information. While the paper focuses on the BERT attention mechanism, the
proposed approach can, in principle, be applied to any attention-based model. AttentionMix is evaluated on three standard sentiment classification datasets and in all three cases outperforms two benchmark approaches that utilize the Mixup mechanism, as well as the vanilla BERT method. The results confirm that attention-based information can be effectively used for data augmentation in the NLP domain.
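The abstract does not spell out the mixing rule, so the sketch below is only a plausible illustration of attention-guided text mixing; every detail (per-token gating by relative [CLS] attention, label ratio from the mean gate) is an assumption rather than the actual AttentionMix algorithm.

import torch

def attention_guided_mix(emb_a, emb_b, attn_a, attn_b, y_a, y_b):
    # Hypothetical sketch, NOT the published AttentionMix algorithm.
    # emb_*: (seq, dim) token embeddings of two padded examples,
    # attn_*: (seq,) attention received from [CLS], averaged over heads,
    # y_*: one-hot label vectors.
    gate = attn_a / (attn_a + attn_b + 1e-8)  # (seq,) relative salience of A
    mixed_emb = gate.unsqueeze(-1) * emb_a + (1 - gate).unsqueeze(-1) * emb_b
    lam = gate.mean()                         # label mixing ratio
    mixed_y = lam * y_a + (1 - lam) * y_b
    return mixed_emb, mixed_y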