An Action Selection Architecture for an Emotional Agent
An architecture for action selection is presented that links emotion, cognition and behavior. It defines the information and emotion processes of an agent. The architecture has been implemented and used in a prototype environment.
Independent Prototype Propagation for Zero-Shot Compositionality
Humans are good at compositional zero-shot reasoning; someone who has never
seen a zebra before could nevertheless recognize one when we tell them it looks
like a horse with black and white stripes. Machine learning systems, on the
other hand, usually leverage spurious correlations in the training data, and
while such correlations can help recognize objects in context, they hurt
generalization. To be able to deal with underspecified datasets while still
leveraging contextual clues during classification, we propose ProtoProp, a
novel prototype propagation graph method. First we learn prototypical
representations of objects (e.g., zebra) that are conditionally independent
w.r.t. their attribute labels (e.g., stripes) and vice versa. Next we propagate
the independent prototypes through a compositional graph, to learn
compositional prototypes of novel attribute-object combinations that reflect
the dependencies of the target distribution. The method does not rely on any
external data, such as class hierarchy graphs or pretrained word embeddings. We
evaluate our approach on AO-Clever, a synthetic and strongly visual dataset
with clean labels, and UT-Zappos, a noisy real-world dataset of fine-grained
shoe types. We show that in the generalized compositional zero-shot setting we
outperform state-of-the-art results, and through ablations we show the
importance of each part of the method and their contribution to the final
results.
Language-Based Augmentation to Address Shortcut Learning in Object Goal Navigation
Deep Reinforcement Learning (DRL) has shown great potential in enabling robots to find certain objects (e.g., "find a fridge") in environments like homes or schools. This task is known as Object-Goal Navigation (ObjectNav). DRL methods are predominantly trained and evaluated using environment simulators. Although DRL has shown impressive results, the simulators may be biased or limited. This creates a risk of shortcut learning, i.e., learning a policy tailored to specific visual details of training environments. We aim to deepen our understanding of shortcut learning in ObjectNav and its implications, and to propose a solution. We design an experiment for inserting a shortcut bias in the appearance of training environments. As a proof-of-concept, we associate room types with specific wall colors (e.g., bedrooms with green walls), and observe poor generalization of a state-of-the-art (SOTA) ObjectNav method to environments where this is not the case (e.g., bedrooms with blue walls). We find that shortcut learning is the root cause: the agent learns to navigate to target objects by simply searching for the wall color associated with the target object's room. To solve this, we propose Language-Based (L-B) augmentation. Our key insight is that we can leverage the multimodal feature space of a Vision-Language Model (VLM) to augment visual representations directly at the feature level, requiring no changes to the simulator and only the addition of one layer to the model. Where the SOTA ObjectNav method's success rate drops by 69%, our proposal drops by only 23%.
Spatial heterogeneity of element and litter turnover in a Bornean rain forest.
The spatial heterogeneity of element fluxes was quantified by measuring litterfall, throughfall and litter decomposition for 1 y in 30 randomly located sampling areas in a lowland dipterocarp rain forest. The idea tested was that turnover of elements is more variable than turnover of dry matter in a forest with extremely high tree species diversity. In spite of the low fertility of the soil (an ultisol), total litter production (leaves, trash, and wood <2 cm in diameter) was high (1105 g
Recurrently Predicting Hypergraphs
This work considers predicting the relational structure of a hypergraph for a
given set of vertices, as common for applications in particle physics,
biological systems and other complex combinatorial problems. A problem arises
from the number of possible multi-way relationships, or hyperedges, scaling in
$\mathcal{O}(2^n)$ for a set of $n$ elements. Simply storing an indicator
tensor for all relationships is already intractable for moderately sized $n$,
prompting previous approaches to restrict the number of vertices a hyperedge
connects. Instead, we propose a recurrent hypergraph neural network that
predicts the incidence matrix by iteratively refining an initial guess of the
solution. We leverage the property that most hypergraphs of interest are
sparsely connected and reduce the memory requirement to $\mathcal{O}(nk)$,
where $k$ is the maximum number of positive edges, i.e., edges that actually
exist. In order to counteract the linearly growing memory cost from training a
lengthening sequence of refinement steps, we further propose an algorithm that
applies backpropagation through time on randomly sampled subsequences. We
empirically show that our method can match an increase in the intrinsic
complexity without a performance decrease and demonstrate superior performance
compared to state-of-the-art models.
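As a rough illustration of the memory argument in this abstract, the sketch below compares the number of entries needed to flag every possible subset of n vertices as a hyperedge (2^n) with the number of entries in an incidence matrix of n vertices by k hyperedges. The function names and the toy hypergraph are hypothetical and are not taken from the paper; this is not the authors' model, only a way to see the two storage costs side by side.

```python
def dense_indicator_entries(n: int) -> int:
    """Entries needed to flag every subset of n vertices as present or absent."""
    return 2 ** n


def incidence_entries(n: int, k: int) -> int:
    """Entries in an n-by-k incidence matrix (n vertices, at most k hyperedges)."""
    return n * k


def incidence_matrix(n: int, hyperedges: list) -> list:
    """Build a 0/1 incidence matrix as nested lists.

    Row v, column j is 1 when vertex v belongs to hyperedge j.
    """
    matrix = [[0] * len(hyperedges) for _ in range(n)]
    for j, edge in enumerate(hyperedges):
        for v in edge:
            matrix[v][j] = 1
    return matrix


if __name__ == "__main__":
    # Hypothetical toy hypergraph: 4 vertices, 2 hyperedges.
    toy = incidence_matrix(4, [{0, 1, 2}, {2, 3}])
    for row in toy:
        print(row)
    # Compare storage for 30 vertices and at most 10 existing hyperedges.
    print(dense_indicator_entries(30), incidence_entries(30, 10))
```

The gap widens quickly: the subset count doubles with every added vertex, whereas the incidence matrix grows by only one row of k entries.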