Hypergraph-Based Recognition Memory Model for Lifelong Experience
Cognitive agents are expected to interact with and adapt to nonstationary, dynamic environments. As an initial step of decision making in real-world agent interaction, familiarity judgment guides the subsequent processes of intelligent behavior. Familiarity judgment includes both recognizing previously encoded data and completing original patterns from partial information, which are fundamental functions of recognition memory. Although previous computational memory models have attempted to reflect human behavioral properties of recognition memory, they have focused on static conditions without considering temporal change in the sense of lifelong learning. To provide temporal adaptability to an agent, in this paper we propose a computational model of recognition memory that enables lifelong learning. The proposed model is based on a hypergraph structure, which allows high-order relationships between contextual nodes and enables incremental learning. Through a simulated experiment, we investigate the optimal conditions of the memory model and validate the consistency of memory performance for lifelong learning.
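The abstract does not specify the model's implementation; a minimal sketch of the core idea, with hyperedges stored as sets of context nodes and all names and scoring choices being illustrative assumptions, might look like:

```python
# Hypothetical sketch of a hypergraph-style recognition memory.
# Each stored experience is a hyperedge: a set of co-occurring context nodes.
class HypergraphMemory:
    def __init__(self, threshold=0.5):
        self.hyperedges = []          # list of frozensets of nodes
        self.threshold = threshold    # overlap needed to judge "familiar"

    def encode(self, nodes):
        """Incrementally store a new experience as one hyperedge."""
        self.hyperedges.append(frozenset(nodes))

    def familiarity(self, cue):
        """Best Jaccard overlap between the cue and any stored hyperedge."""
        cue = frozenset(cue)
        if not self.hyperedges:
            return 0.0
        return max(len(cue & e) / len(cue | e) for e in self.hyperedges)

    def complete(self, cue):
        """Pattern completion: return the best-matching stored pattern."""
        cue = frozenset(cue)
        return max(self.hyperedges, key=lambda e: len(cue & e), default=None)

mem = HypergraphMemory()
mem.encode({"kitchen", "morning", "coffee"})
mem.encode({"office", "meeting", "laptop"})
print(mem.familiarity({"kitchen", "coffee"}))   # high overlap with the first edge
print(sorted(mem.complete({"coffee"})))
```

Incremental learning here is simply appending hyperedges, so old memories are never overwritten; the paper's actual model presumably uses a richer learning rule.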
2014 Summer Research Symposium Abstract Book
2014 Summer volume of abstracts for science research projects conducted by students at Trinity College.
A Review of Data Mining in Personalized Education: Current Trends and Future Prospects
Personalized education, tailored to individual student needs, leverages
educational technology and artificial intelligence (AI) in the digital age to
enhance learning effectiveness. The integration of AI in educational platforms
provides insights into academic performance, learning preferences, and
behaviors, optimizing the personal learning process. Driven by data mining
techniques, it not only benefits students but also provides educators and
institutions with tools to craft customized learning experiences. To offer a
comprehensive review of recent advancements in personalized educational data
mining, this paper focuses on four primary scenarios: educational
recommendation, cognitive diagnosis, knowledge tracing, and learning analysis.
This paper presents a structured taxonomy for each area, compiles commonly used
datasets, and identifies future research directions, emphasizing the role of
data mining in enhancing personalized education and paving the way for future
exploration and innovation.
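Of the four scenarios the review covers, knowledge tracing has a classic baseline, Bayesian Knowledge Tracing, which updates the probability that a student has mastered a skill after each answer. A minimal sketch (the parameter values are illustrative, not from the paper):

```python
def bkt_update(p_know, correct, p_learn=0.1, p_slip=0.1, p_guess=0.2):
    """One Bayesian Knowledge Tracing step: Bayesian posterior on mastery
    given the observed answer, followed by the learning transition."""
    if correct:
        num = p_know * (1 - p_slip)
        den = num + (1 - p_know) * p_guess
    else:
        num = p_know * p_slip
        den = num + (1 - p_know) * (1 - p_guess)
    posterior = num / den
    return posterior + (1 - posterior) * p_learn

p = 0.3                      # prior probability the skill is known
for answer in [True, True, False, True]:
    p = bkt_update(p, answer)
print(round(p, 3))           # estimated mastery after four responses
```

Correct answers push the mastery estimate up and incorrect ones pull it down, with `p_slip` and `p_guess` keeping single observations from dominating.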
Visual Perception For Robotic Spatial Understanding
Humans understand the world through vision without much effort. We perceive the structure, objects, and people in the environment and pay little direct attention to most of it, until it becomes useful. Intelligent systems, especially mobile robots, have no such biologically engineered vision mechanism to take for granted. Instead, we must devise algorithmic methods of taking raw sensor data and converting it to something useful very quickly. Vision is such a necessary part of building a robot or any intelligent system that is meant to interact with the world that it is somewhat surprising we don't have off-the-shelf libraries for this capability.
Why is this? The simple answer is that the problem is extremely difficult. There has been progress, but the current state of the art is impressive and depressing at the same time. We now have neural networks that can recognize many objects in 2D images, in some cases performing better than a human. Some algorithms can also provide bounding boxes or pixel-level masks to localize the object. We have visual odometry and mapping algorithms that can build reasonably detailed maps over long distances with the right hardware and conditions. On the other hand, we have robots with many sensors and no efficient way to compute their relative extrinsic poses for integrating the data in a single frame. The same networks that produce good object segmentations and labels in a controlled benchmark still miss obvious objects in the real world and have no mechanism for learning on the fly while the robot is exploring. Finally, while we can detect pose for very specific objects, we don't yet have a mechanism that detects pose that generalizes well over categories or that can describe new objects efficiently.
We contribute algorithms in four of the areas mentioned above. First, we describe a practical and effective system for calibrating many sensors on a robot with up to 3 different modalities. Second, we present our approach to visual odometry and mapping that exploits the unique capabilities of RGB-D sensors to efficiently build detailed representations of an environment. Third, we describe a 3-D over-segmentation technique that utilizes the models and ego-motion output in the previous step to generate temporally consistent segmentations with camera motion. Finally, we develop a synthesized dataset of chair objects with part labels and investigate the influence of parts on RGB-D based object pose recognition using a novel network architecture we call PartNet.
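The core operation that sensor calibration enables, mapping each sensor's measurements into a common robot frame via an estimated extrinsic pose, can be sketched as follows; the planar (yaw-only) rotation and the example pose are simplifying assumptions, not the thesis's method:

```python
import math

def extrinsic_transform(points, yaw, t):
    """Map sensor-frame 3D points into the robot frame using a planar
    extrinsic pose: rotation about z by `yaw` (radians), then translation t."""
    c, s = math.cos(yaw), math.sin(yaw)
    out = []
    for x, y, z in points:
        out.append((c * x - s * y + t[0], s * x + c * y + t[1], z + t[2]))
    return out

# A depth point 1 m ahead of a camera mounted 0.2 m forward, 0.3 m up,
# and rotated 90 degrees to the left of the robot's forward axis:
pts = extrinsic_transform([(1.0, 0.0, 0.5)], math.pi / 2, (0.2, 0.0, 0.3))
print(pts)
```

A full calibration system estimates the rotation and translation for every sensor pair (a full 3D rotation, not just yaw); once those are known, fusing data into a single frame reduces to transforms like this one.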
DAC-h3: A Proactive Robot Cognitive Architecture to Acquire and Express Knowledge About the World and the Self
This paper introduces a cognitive architecture for a humanoid robot to engage in a proactive, mixed-initiative exploration and manipulation of its environment, where the initiative can originate from both the human and the robot. The framework, based on a biologically-grounded theory of the brain and mind, integrates a reactive interaction engine, a number of state-of-the-art perceptual and motor learning algorithms, as well as planning abilities and an autobiographical memory. The architecture as a whole drives the robot behavior to solve the symbol grounding problem, acquire language capabilities, execute goal-oriented behavior, and express a verbal narrative of its own experience in the world. We validate our approach in human-robot interaction experiments with the iCub humanoid robot, showing that the proposed cognitive architecture can be applied in real time within a realistic scenario and that it can be used with naive users.
Attribute Learning for Image/Video Understanding
For the past decade, computer vision research has achieved increasing success in visual recognition,
including object detection and video classification. Nevertheless, these achievements still
cannot meet the urgent needs of image and video understanding. The recent, rapid growth
of social media sharing has created a huge demand for automatic media classification and annotation
techniques. In particular, such media data usually contain very complex social
activities of a group of people (e.g. a YouTube video of a wedding reception) and are captured
by consumer devices with poor visual quality. It is therefore extremely challenging to automatically
understand such a large number of complex image and video categories, especially when these
categories have never been seen before.
One way to understand categories with no or few examples is transfer learning, which
transfers knowledge across related domains, tasks, or distributions. In particular, lifelong
learning, which aims at transferring information to tasks without any observed data, has
recently become popular. In computer vision, transfer learning often takes the form of attribute learning.
The key idea underpinning attribute learning is to exploit transfer learning via an
intermediate-level semantic representation: attributes. Semantic attributes are most commonly used as a
semantically meaningful bridge between low-level features and higher-level class concepts, since
they can be used both descriptively (e.g., 'has legs') and discriminatively (e.g., 'cats have it but
dogs do not'). Previous works propose many different attribute learning models for image and
video understanding. However, several intrinsic limitations exist in
previous attribute learning work. The limitations discussed in this thesis include limitations of
user-defined attributes, projection domain-shift problems, prototype sparsity problems, the inability
to combine multiple semantic representations, and noisy annotations of relative attributes. To
tackle these limitations, this thesis explores attribute learning for image and video understanding
from the following three aspects.
Firstly, to break the limitations of user-defined attributes, a framework for learning latent
attributes is presented for automatic classification and annotation of unstructured group social activity
in videos, which enables the tasks of attribute learning for understanding complex multimedia
data with sparse and incomplete labels. We investigate the learning of latent attributes
for content-based understanding, which aims to model and predict classes and tags relevant to
objects, sounds and events – anything likely to be used by humans to describe or search for
media. Secondly, we propose the framework of transductive multi-view embedding hypergraph
label propagation and solve three inherent limitations of most previous attribute learning work,
i.e., the projection domain shift problems, the prototype sparsity problems and the inability to
combine multiple semantic representations. We explore the manifold structure of the data distributions
of different views projected onto the same embedding space via label propagation on
a graph. Thirdly, a novel framework for robust learning is presented to effectively learn relative
attributes from extremely noisy and sparse annotations. Relative attributes are increasingly
learned from pairwise comparisons collected via crowdsourcing tools, which are more economical
and scalable than conventional laboratory-based data annotation. However, a major challenge
for taking a crowdsourcing strategy is the detection and pruning of outliers. We thus propose
a principled way to identify annotation outliers by formulating the relative attribute prediction
task as a unified robust learning to rank problem, tackling both the outlier detection and relative
attribute prediction tasks jointly.
In summary, this thesis studies and solves key challenges and limitations of attribute
learning in image/video understanding. We show the benefits of addressing these challenges and
limitations; our approach thus achieves better performance than previous methods.
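The attributes-as-bridge idea can be sketched in the spirit of direct attribute prediction: an upstream model predicts attribute values from image features, and an unseen class is chosen by matching against per-class attribute signatures. The signatures and attribute names below are invented for illustration:

```python
# Hypothetical class-attribute signatures (binary) for zero-shot recognition.
CLASS_ATTRIBUTES = {
    "cat": {"has_legs": 1, "has_wheels": 0, "is_furry": 1},
    "dog": {"has_legs": 1, "has_wheels": 0, "is_furry": 1},
    "car": {"has_legs": 0, "has_wheels": 1, "is_furry": 0},
}

def classify(predicted_attrs):
    """Pick the class whose attribute signature best agrees with the
    attributes predicted for an image (simple agreement count)."""
    def score(cls):
        sig = CLASS_ATTRIBUTES[cls]
        return sum(1 for a in sig if sig[a] == predicted_attrs.get(a, 0))
    return max(CLASS_ATTRIBUTES, key=score)

print(classify({"has_legs": 0, "has_wheels": 1, "is_furry": 0}))  # "car"
```

No car images are needed at training time, only the signature; this is also where the projection domain-shift and prototype sparsity problems the thesis tackles arise, since the predicted attributes and the signatures come from different distributions.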
HPCR: Holistic Proxy-based Contrastive Replay for Online Continual Learning
Online continual learning (OCL) aims to continuously learn new data from a
single pass over the online data stream. It generally suffers from the
catastrophic forgetting issue. Existing replay-based methods effectively
alleviate this issue by replaying part of old data in a proxy-based or
contrastive-based replay manner. In this paper, we conduct a comprehensive
analysis of these two replay manners and find they can be complementary.
Inspired by this finding, we propose a novel replay-based method called
proxy-based contrastive replay (PCR), which replaces anchor-to-sample pairs
with anchor-to-proxy pairs in the contrastive-based loss to alleviate the
phenomenon of forgetting. Based on PCR, we further develop a more advanced
method named holistic proxy-based contrastive replay (HPCR), which consists of
three components. The first is a contrastive component that conditionally incorporates
anchor-to-sample pairs into PCR, learning more fine-grained semantic information
with a large training batch. The second is a temperature component that
decouples the temperature coefficient into two parts based on their impacts on
the gradient and sets different values for them to learn more novel knowledge.
The third is a distillation component that constrains the learning process to
keep more historical knowledge. Experiments on four datasets consistently
demonstrate the superiority of HPCR over various state-of-the-art methods.
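The anchor-to-proxy idea behind PCR can be sketched as a softmax contrastive loss computed against one learnable proxy per class instead of against other samples; the function below is a minimal illustration of that structure, not the paper's exact loss:

```python
import math

def pcr_style_loss(anchor, proxies, target, temperature=0.1):
    """Contrastive loss over anchor-to-proxy pairs (sketch of the PCR idea):
    softmax over anchor-proxy dot products, cross-entropy on the true class."""
    sims = [sum(a * p for a, p in zip(anchor, proxy)) / temperature
            for proxy in proxies]
    m = max(sims)                                     # stabilize log-sum-exp
    log_z = m + math.log(sum(math.exp(s - m) for s in sims))
    return log_z - sims[target]

proxies = [[1.0, 0.0], [0.0, 1.0]]        # one proxy embedding per class
loss_good = pcr_style_loss([0.9, 0.1], proxies, target=0)
loss_bad = pcr_style_loss([0.9, 0.1], proxies, target=1)
print(loss_good < loss_bad)               # anchor near its own proxy: lower loss
```

Because proxies summarize whole classes, old classes stay represented even when few replayed samples survive in the buffer, which is the intuition for why the proxy form alleviates forgetting; HPCR's temperature and distillation components are refinements on top of this.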
Automated Reinforcement Learning: An Overview
Reinforcement Learning and recently Deep Reinforcement Learning are popular
methods for solving sequential decision making problems modeled as Markov
Decision Processes. Modeling a problem in RL and selecting algorithms and
hyper-parameters require careful consideration, as different configurations may
yield completely different performance. These considerations are mainly the
task of RL experts; however, RL is progressively becoming popular in other
fields where the researchers and system designers are not RL experts. Besides,
many modeling decisions, such as defining state and action space, size of
batches and frequency of batch updating, and number of timesteps are typically
made manually. For these reasons, automating different components of RL
framework is of great importance and it has attracted much attention in recent
years. Automated RL provides a framework in which different components of RL
including MDP modeling, algorithm selection and hyper-parameter optimization
are modeled and defined automatically. In this article, we explore the
literature and present recent work that can be used in automated RL. Moreover,
we discuss the challenges, open questions, and research directions in AutoRL.
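One of the simplest AutoRL components the survey covers is automated hyper-parameter optimization; a minimal random-search sketch over an RL configuration space, where the search space and the toy evaluation function are assumptions for illustration, might look like:

```python
import random

def auto_rl_search(evaluate, n_trials=20, seed=0):
    """Random search over RL hyper-parameters (one simple AutoRL component).
    `evaluate` maps a config to a scalar score, e.g. mean episode return."""
    rng = random.Random(seed)
    best_cfg, best_score = None, float("-inf")
    for _ in range(n_trials):
        cfg = {
            "learning_rate": 10 ** rng.uniform(-5, -2),   # log-uniform lr
            "gamma": rng.uniform(0.9, 0.999),             # discount factor
            "batch_size": rng.choice([32, 64, 128, 256]),
        }
        score = evaluate(cfg)
        if score > best_score:
            best_cfg, best_score = cfg, score
    return best_cfg, best_score

# Toy stand-in for "train an agent and measure return":
# prefers a learning rate near 1e-3 and a high discount factor.
toy = lambda c: -abs(c["learning_rate"] - 1e-3) + c["gamma"]
cfg, score = auto_rl_search(toy)
print(cfg)
```

In a real AutoRL pipeline, `evaluate` would train an agent for a budget of timesteps, and the search itself would typically use Bayesian optimization or population-based training rather than plain random search.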