Spatially-Aware Transformer for Embodied Agents
Episodic memory plays a crucial role in various cognitive processes, such as
the ability to mentally recall past events. While cognitive science emphasizes
the significance of spatial context in the formation and retrieval of episodic
memory, the current primary approach to implementing episodic memory in AI
systems is through transformers that store temporally ordered experiences,
which overlooks the spatial dimension. As a result, it is unclear how the
underlying structure could be extended beyond temporal order alone to
incorporate the spatial axis, and what benefits doing so would bring. To
address this, this paper explores the use of Spatially-Aware Transformer
models that incorporate spatial information. These models enable the creation of
place-centric episodic memory that considers both temporal and spatial
dimensions. Adopting this approach, we demonstrate that memory utilization
efficiency can be improved, leading to enhanced accuracy in various
place-centric downstream tasks. Additionally, we propose the Adaptive Memory
Allocator, a reinforcement-learning-based memory management method that aims
to optimize the efficiency of memory utilization. Our experiments demonstrate the
advantages of our proposed model in various environments and across multiple
downstream tasks, including prediction, generation, reasoning, and
reinforcement learning. The source code for our models and experiments will be
available at https://github.com/junmokane/spatially-aware-transformer.
Comment: ICLR 2024 Spotlight. First two authors contributed equally.
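The abstract gives no implementation details, but the core idea of tagging episodic memory with place as well as time can be illustrated with a small sketch. The snippet below is a hedged, assumption-laden illustration (the class name PlaceAwareMemory, the place_ids input, and all hyperparameters are hypothetical, not taken from the authors' code): each memory token receives a learned place embedding in addition to the usual temporal position embedding before a standard transformer encoder.

```python
# Minimal sketch, assuming place-tagged observations. All names here
# (PlaceAwareMemory, place_ids, hyperparameters) are hypothetical and are
# not taken from the paper's implementation.
import torch
import torch.nn as nn

class PlaceAwareMemory(nn.Module):
    def __init__(self, d_model=64, num_places=10, max_steps=128):
        super().__init__()
        self.place_emb = nn.Embedding(num_places, d_model)  # spatial axis
        self.time_emb = nn.Embedding(max_steps, d_model)    # temporal axis
        layer = nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=2)

    def forward(self, obs_tokens, place_ids, step_ids):
        # obs_tokens: (B, T, d_model); place_ids, step_ids: (B, T) integer tensors
        x = obs_tokens + self.place_emb(place_ids) + self.time_emb(step_ids)
        return self.encoder(x)

# Toy usage: 8 observation tokens, each tagged with the place it was seen in.
mem = PlaceAwareMemory()
obs = torch.randn(1, 8, 64)
places = torch.randint(0, 10, (1, 8))
steps = torch.arange(8).unsqueeze(0)
out = mem(obs, places, steps)  # (1, 8, 64), place- and time-aware readout
```

Grouping or retrieving tokens by place_ids would then give a place-centric view of the same memory, which is the kind of structure the abstract refers to.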
An Investigation into Pre-Training Object-Centric Representations for Reinforcement Learning
Unsupervised object-centric representation (OCR) learning has recently drawn
attention as a new paradigm of visual representation, owing to its potential
as an effective pre-training technique for various downstream tasks in terms
of sample efficiency, systematic generalization, and reasoning. Although
image-based reinforcement learning (RL) is one of the most important and
frequently cited of these downstream tasks, its benefit from OCR pre-training
has, surprisingly, not yet been investigated systematically. Instead, most of
the evaluations have focused on rather indirect metrics such as segmentation
quality and object property prediction accuracy. In this paper, we investigate
the effectiveness of OCR pre-training for image-based reinforcement learning
via empirical experiments. For systematic evaluation, we introduce a simple
object-centric visual RL benchmark and conduct experiments to answer questions
such as "Does OCR pre-training improve performance on object-centric tasks?"
and "Can OCR pre-training help with out-of-distribution generalization?". Our
results provide empirical evidence and valuable insights into the effectiveness
of OCR pre-training for RL and its potential limitations in certain scenarios.
This study also examines critical aspects of incorporating OCR pre-training
into RL, including performance in a visually complex environment and the
choice of pooling layer for aggregating the object representations.
Comment: We study unsupervised object-centric representations in reinforcement learning through systematic investigation.
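As a rough illustration of the pooling question mentioned above, the sketch below (hypothetical names such as SlotPoolingPolicy; not the paper's code or benchmark) aggregates a set of frozen, pre-trained object slots into a single vector with learned attention pooling before a linear policy head. Mean pooling over the slot axis would be the simpler alternative being compared against.

```python
# Hedged sketch with hypothetical names; the paper's architecture and pooling
# choices may differ. A frozen OCR encoder is assumed to produce a set of
# per-object slot vectors for each observation.
import torch
import torch.nn as nn

class SlotPoolingPolicy(nn.Module):
    def __init__(self, slot_dim, num_actions):
        super().__init__()
        self.pool_query = nn.Parameter(torch.randn(1, 1, slot_dim))
        self.attn = nn.MultiheadAttention(slot_dim, num_heads=1, batch_first=True)
        self.policy = nn.Linear(slot_dim, num_actions)

    def forward(self, slots):
        # slots: (B, num_slots, slot_dim) from a frozen, pre-trained OCR model
        q = self.pool_query.expand(slots.size(0), -1, -1)
        pooled, _ = self.attn(q, slots, slots)   # attention pooling over slots
        return self.policy(pooled.squeeze(1))    # action logits

slots = torch.randn(4, 6, 32)             # e.g. 6 slots of dim 32 per frame
logits = SlotPoolingPolicy(32, 5)(slots)  # (4, 5) action logits
```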
Memory Heat Map: Anomaly Detection in Real-Time Embedded Systems Using Memory Behavior
In this paper, we introduce a novel mechanism that identifies abnormal system-wide behaviors using the predictable nature of real-time embedded applications. We introduce the Memory Heat Map (MHM) to characterize the memory behavior of the operating system. Our machine learning algorithms automatically (a) summarize the information contained in the MHMs and then (b) detect deviations from the normal memory behavior patterns. These methods are implemented on top of a multicore processor architecture to aid monitoring and detection. The techniques are evaluated using multiple attack scenarios, including kernel rootkits and shellcode. To the best of our knowledge, this is the first work that uses aggregated memory behavior, in particular the concept of memory heat maps, for detecting system anomalies.
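A minimal, purely illustrative sketch of the underlying idea is given below, under the assumption that a memory heat map is a normalized histogram of accessed addresses per monitoring window and that deviation is measured with a simple distance to a baseline profile (the paper's actual MHMs, learning algorithms, and threshold are more involved and run on a multicore monitoring architecture).

```python
# Purely illustrative sketch, not the paper's implementation: a "heat map"
# here is just a normalized histogram of accessed addresses in one window,
# and anomaly detection is an L1 distance to a baseline profile.
import numpy as np

def memory_heat_map(addresses, num_bins=64, addr_space=2**20):
    """Normalized histogram of accessed addresses for one monitoring window."""
    hist, _ = np.histogram(addresses, bins=num_bins, range=(0, addr_space))
    total = hist.sum()
    return hist / total if total else hist.astype(float)

def is_anomalous(window_map, baseline_map, threshold=0.5):
    """Flag a window whose profile drifts far from the baseline (hypothetical threshold)."""
    return np.abs(window_map - baseline_map).sum() > threshold

baseline = memory_heat_map(np.random.randint(0, 2**20, size=10_000))
window = memory_heat_map(np.random.randint(0, 2**20, size=10_000))
print(is_anomalous(window, baseline))  # False for similar access patterns
```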
HUMBI: A Large Multiview Dataset of Human Body Expressions and Benchmark Challenge
This paper presents a new large multiview dataset called HUMBI for human body expressions with natural clothing. The goal of HUMBI is to facilitate modeling view-specific appearance and geometry of five primary body signals including gaze, face, hand, body, and garment from assorted people. 107 synchronized HD cameras are used to capture 772 distinctive subjects across gender, ethnicity, age, and style. With the multiview image streams, we reconstruct the geometry of body expressions using 3D mesh models, which allows representing view-specific appearance. We demonstrate that HUMBI is highly effective in learning and reconstructing a complete human model and is complementary to the existing datasets of human body expressions with limited views and subjects such as MPII-Gaze, Multi-PIE, Human3.6M, and Panoptic Studio datasets. Based on HUMBI, we formulate a new benchmark challenge of a pose-guided appearance rendering task that aims to substantially extend photorealism in modeling diverse human expressions in 3D, which is the key enabling factor of authentic social tele-presence. HUMBI is publicly available at http://humbi-data.net.