Advancing Data-Efficiency in Reinforcement Learning
In many real-world applications, including traffic control, robotics and web system
configurations, we are confronted with real-time decision-making problems where
data is limited. Reinforcement Learning (RL) allows us to construct a mathematical
framework to solve sequential decision-making problems under uncertainty. Under
low-data constraints, RL agents must be able to quickly identify relevant information in the observations, and to quickly learn how to act in order to attain their long-term objective. While recent advancements in RL have demonstrated impressive
achievements, the end-to-end approach they take favours autonomy and flexibility
at the expense of fast learning. To be of practical use, there is an undeniable need
to improve the data-efficiency of existing systems.
Ideal RL agents would possess an optimal way of representing their environment, combined with an efficient mechanism for propagating reward signals across
the state space. This thesis investigates the problem of data-efficiency in RL from
these two aforementioned perspectives. A deep overview of the different representation learning methods in use in RL is provided. The aim of this overview is to
categorise the different representation learning approaches and highlight the impact
of the representation on data-efficiency. Then, this framing is used to develop two
main research directions. The first problem focuses on learning a representation that
captures the geometry of the problem. An RL mechanism that uses a scalable method for feature learning on graphs to learn such rich representations is introduced, ultimately leading to more efficient value function approximation. Secondly, ET(λ),
an algorithm that improves credit assignment in stochastic environments by propagating reward information counterfactually, is presented. ET(λ) results in faster learning compared to traditional methods that rely solely on temporal credit assignment. Overall, this thesis shows how a structural representation encoding the geometry of the state space and counterfactual credit assignment are key characteristics
for data-efficient RL.
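For orientation, the "temporal credit assignment" that the abstract contrasts ET(λ) against can be sketched as conventional tabular TD(λ) with accumulating eligibility traces. This is only the standard baseline, not ET(λ) itself, whose update rule the abstract does not give. The function name, the transition format and the hyperparameter values are assumptions made for illustration.

```python
def td_lambda_episode(values, episode, alpha=0.1, gamma=0.9, lam=0.8):
    """Run one episode of tabular TD(lambda) with accumulating traces.

    values:  dict mapping state -> current value estimate (updated in place)
    episode: list of (state, reward, next_state); next_state is None at the end
    """
    traces = {state: 0.0 for state in values}
    for state, reward, next_state in episode:
        next_value = 0.0 if next_state is None else values[next_state]
        td_error = reward + gamma * next_value - values[state]

        # Decay every trace, then mark the state that was just visited.
        for s in traces:
            traces[s] *= gamma * lam
        traces[state] += 1.0

        # The TD error is credited to all recently visited states at once.
        for s in values:
            values[s] += alpha * td_error * traces[s]
    return values


# Toy two-state chain (hypothetical): A -> B -> terminal, reward 1 at the end.
chain_values = td_lambda_episode(
    {"A": 0.0, "B": 0.0},
    [("A", 0.0, "B"), ("B", 1.0, None)],
)
```

In this toy chain the final reward reaches state A within the same episode only because A's trace is still non-zero when the reward arrives. That is purely temporal propagation, which is the mechanism the thesis aims to improve on.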
A random forest approach to segmenting and classifying gestures
This thesis investigates a gesture segmentation and recognition scheme that employs a random forest classification model. A complete gesture recognition system should localize and classify each gesture from a given gesture vocabulary, within a continuous video stream. Thus, the system must determine the start and end points of each gesture in time, as well as accurately recognize the class label of each gesture. We propose a unified approach that performs the tasks of temporal segmentation and classification simultaneously. Our method trains a random forest classification model to recognize gestures from a given vocabulary, as presented in a training dataset of video plus 3D body joint locations, as well as out-of-vocabulary (non-gesture) instances. Given an input video stream, our trained model is applied to candidate gestures using sliding windows at multiple temporal scales. The class label with the highest classifier confidence is selected, and its corresponding scale is used to determine the segmentation boundaries in time. We evaluated our formulation in segmenting and recognizing gestures from two different benchmark datasets: the NATOPS dataset of 9,600 gesture instances from a vocabulary of 24 aircraft handling signals, and the CHALEARN dataset of 7,754 gesture instances from a vocabulary of 20 Italian communication gestures. The performance of our method compares favorably with state-of-the-art methods that employ Hidden Markov Models or Hidden Conditional Random Fields on the NATOPS dataset. We conclude with a discussion of the advantages of using our model
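The multi-scale sliding-window step described above might look roughly like the following sketch. The trained random forest is replaced by a hypothetical `classify` callable returning a `(label, confidence)` pair. The toy classifier, the frame encoding and the rule for advancing through the stream are assumptions for illustration, not the thesis's implementation.

```python
def best_window_at(frames, start, window_sizes, classify):
    """Try every temporal scale at `start`; keep the most confident result."""
    best = None
    for size in window_sizes:
        end = start + size
        if end > len(frames):
            continue  # window would run past the end of the stream
        label, confidence = classify(frames[start:end])
        if best is None or confidence > best[1]:
            best = (label, confidence, start, end)
    return best


def segment_stream(frames, window_sizes, classify, non_gesture="non-gesture"):
    """Return (label, confidence, start, end) for each detected gesture."""
    segments = []
    start = 0
    while start < len(frames):
        best = best_window_at(frames, start, window_sizes, classify)
        if best is None:
            break  # no window fits any more
        if best[0] == non_gesture:
            start += 1  # slide forward by one frame
        else:
            segments.append(best)
            start = best[3]  # the winning scale fixes the segment boundary
    return segments


def toy_classify(window):
    """Hypothetical stand-in for the trained model: frames are 0 or 1, and a
    'wave' is four consecutive frames equal to 1."""
    score = sum(window) / max(len(window), 4)
    if score >= 0.9:
        return "wave", score
    return "non-gesture", 1.0 - score


frames = [0, 0, 1, 1, 1, 1, 0, 0]
detected = segment_stream(frames, [2, 4], toy_classify)
```

With a real model, `classify` would wrap the forest's class-probability output computed on features extracted from the window.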
Introspective knowledge acquisition for case retrieval networks in textual case base reasoning.
Textual Case Based Reasoning (TCBR) aims at effective reuse of information contained in unstructured documents. The key advantage of TCBR over traditional Information Retrieval systems is its ability to incorporate domain-specific knowledge to facilitate case comparison beyond simple keyword matching. However, substantial human intervention is needed to acquire and transform this knowledge into a form suitable for a TCBR system. In this research, we present automated approaches that exploit statistical properties of document collections to alleviate this knowledge acquisition bottleneck. We focus on two important knowledge containers: relevance knowledge, which captures the relatedness of features to cases, and similarity knowledge, which captures the relatedness of features to each other. The terminology is derived from the Case Retrieval Network (CRN) retrieval architecture in TCBR, which is used as the underlying formalism in this thesis applied to text classification. Concepts generated by Latent Semantic Indexing (LSI) are a useful resource for relevance knowledge acquisition for CRNs. This thesis introduces a supervised LSI technique called sprinkling that exploits class knowledge to bias LSI's concept generation. An extension of this idea, called Adaptive Sprinkling (AS), has been proposed to handle inter-class relationships in complex domains like hierarchical (e.g. Yahoo directory) and ordinal (e.g. product ranking) classification tasks. Experimental evaluation results show the superiority of CRNs created with sprinkling and AS, not only over LSI on its own, but also over state-of-the-art classifiers like Support Vector Machines (SVM). Current statistical approaches based on feature co-occurrences can be utilized to mine similarity knowledge for CRNs. However, related words often do not co-occur in the same document, though they co-occur with similar words. We introduce an algorithm to efficiently mine such indirect associations, called higher-order associations.
Empirical results show that CRNs created with the acquired similarity knowledge outperform both LSI and SVM. Incorporating acquired knowledge into the CRN transforms it into a densely connected network. While improving retrieval effectiveness, this has the unintended effect of slowing down retrieval. We propose a novel retrieval formalism called the Fast Case Retrieval Network (FCRN) which eliminates redundant run-time computations to improve retrieval speed. Experimental results show FCRN's ability to scale up over high dimensional textual casebases. Finally, we investigate novel ways of visualizing and estimating complexity of textual casebases that can help explain performance differences across casebases. Visualization provides a qualitative insight into the casebase, while complexity is a quantitative measure that characterizes classification or retrieval hardness intrinsic to a dataset. We study correlations of experimental results from the proposed approaches against complexity measures over diverse casebases
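The idea of indirect (higher-order) associations mentioned above can be illustrated with a brute-force second-order sketch: two words are linked if they never share a document but do share a co-occurring word. This is only an illustration of the concept. The thesis's efficient mining algorithm is not described in the abstract, and the function names and toy documents here are hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def first_order(docs):
    """Map each word to the set of words it shares a document with."""
    neighbours = defaultdict(set)
    for doc in docs:
        for a, b in combinations(sorted(set(doc)), 2):
            neighbours[a].add(b)
            neighbours[b].add(a)
    return neighbours


def second_order(docs):
    """Find word pairs that never co-occur directly but share a neighbour.

    Returns a dict mapping (word_a, word_c) -> set of linking words.
    """
    neighbours = first_order(docs)
    pairs = {}
    for a, c in combinations(sorted(neighbours), 2):
        if c in neighbours[a]:
            continue  # already a direct (first-order) association
        shared = neighbours[a] & neighbours[c]
        if shared:
            pairs[(a, c)] = shared
    return pairs


# Toy casebase (hypothetical): "car" and "automobile" never share a document,
# yet both co-occur with "engine".
docs = [["car", "engine"], ["automobile", "engine"], ["banana", "fruit"]]
associations = second_order(docs)
```

Comparing every pair of words is quadratic in the vocabulary size, which is one reason an efficient mining algorithm is needed for realistic casebases.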
Improving Bags-of-Words model for object categorization
In the past decade, Bags-of-Words (BOW) models have become popular for the task of object recognition, owing to their good performance and simplicity. Some of the most effective recent methods for computer-based object recognition work by detecting and extracting local image features, before quantizing them according to a codebook rule such as k-means clustering, and classifying these with conventional classifiers such as Support Vector Machines and Naive Bayes.
In this thesis, a Spatial Object Recognition Framework is presented that consists of the four main contributions of the research.
The first contribution, frequent keypoint pattern discovery, works by combining pairs and triplets of frequent keypoints in order to discover intermediate representations for object classes. Based on the same frequent keypoints principle, algorithms for locating the region-of-interest in training images are then discussed.
Extensions to the successful Spatial Pyramid Matching scheme, in order to better capture spatial relationships, are then proposed. The pairs frequency histogram and shapes frequency histogram work by capturing more refined spatial information between local image features.
Finally, alternative techniques to Spatial Pyramid Matching for capturing spatial information are presented. The proposed techniques, variations of binned log-polar histograms, divide the image into grids of different scales and orientations, and thus explicitly capture the distribution of image features in both distance and orientation.
Evaluations of the framework focus on several recent and popular datasets covering image retrieval, object recognition, and object categorization. Overall, while the effectiveness of the framework is limited on some of the datasets, the proposed contributions are nevertheless powerful improvements to the BOW model.
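As a rough illustration of the binned log-polar histograms mentioned in this abstract, the sketch below counts feature locations by log-scaled distance and by angle around a reference point. The bin counts, the `log1p` radial scaling and the choice of reference point are assumptions. The abstract does not specify the thesis's actual variants.

```python
import math


def log_polar_histogram(points, center, n_radial=3, n_angular=4,
                        max_radius=100.0):
    """Count (x, y) feature locations in log-polar bins around `center`.

    Returns a list of n_radial rows, each holding n_angular counts.
    """
    hist = [[0] * n_angular for _ in range(n_radial)]
    cx, cy = center
    for x, y in points:
        dx, dy = x - cx, y - cy
        radius = math.hypot(dx, dy)
        if radius > max_radius:
            continue  # outside the histogram's support
        # Logarithmic radial bins: fine near the centre, coarse further out.
        radial = min(
            int(n_radial * math.log1p(radius) / math.log1p(max_radius)),
            n_radial - 1,
        )
        # Uniform angular bins over [0, 2*pi).
        theta = math.atan2(dy, dx) % (2 * math.pi)
        angular = min(int(n_angular * theta / (2 * math.pi)), n_angular - 1)
        hist[radial][angular] += 1
    return hist


# Hypothetical keypoint locations relative to an image centre at (0, 0).
keypoints = [(1, 1), (-30, 30), (-1, -1), (200, 0)]
histogram = log_polar_histogram(keypoints, center=(0, 0))
```

Unlike a rectangular spatial pyramid, each bin here encodes both how far a feature lies from the reference point and in which direction.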
Generalization and Transferability in Reinforcement Learning
Reinforcement learning has proven capable of extending the applicability of machine learning to domains in which
knowledge cannot be acquired from labeled examples but only via trial-and-error. Being able to solve problems with such
characteristics is a crucial requirement for autonomous agents that can accomplish tasks without human intervention.
However, most reinforcement learning algorithms are designed to solve exactly one task, not offering means to systematically
reuse previous knowledge acquired in other problems. Motivated by insights from homotopic continuation methods,
in this work we investigate approaches based on optimization- and concurrent systems theory to gain an understanding
of conceptual and technical challenges of knowledge transfer in reinforcement learning domains. Building upon these
findings, we present an algorithm based on contextual relative entropy policy search that allows an agent to generate
a structured sequence of learning tasks that guide its learning towards a target distribution of tasks by giving it control
over an otherwise hidden context distribution. The presented algorithm is evaluated on a number of robotic tasks, in
which a desired system state needs to be reached, demonstrating that the proposed learning scheme helps to increase
and stabilize learning performance.
Bourdieu and Brand-Me: Agri-food Higher Education students’ experiences of securing industrial placements and employment, and through personal branding strategies
This thesis relates to the field of student placement and explores the strategies that higher education (HE) students use to secure industrial placement employment in the agri-food sector. Brand-Me is the collective of a person’s online and offline presence. Digital technologies and how students choose their self-presentations online have blurred the lines between personal and professional. This presents a challenge for HE students in managing their digital footprint. A Bourdieusian lens (Bourdieu, 1977) and the conceptual tools of habitus, capital and field were used to shed light on the practice of placement seeking and the ownership of students in shaping their Brand-Me. The role the Placement Manager plays as ‘referee’ is a vital link in assisting the students seeking placements and the transition of student to placement employment. Qualitative in-depth interviews with students, university staff, and employers (n = 15, 7, and 2, respectively) were undertaken using Rich Pictures to support the interviews. This visual qualitative approach provided a close generation of insights and semiotic resources for data analysis and a framework for self-reflection. Students showed differing approaches to consideration of Brand-Me based on their concern for managing their digital footprint and showed a desire to convey a ‘hardworking’ and neoliberal self for the social logic of the field. The research uncovered how family habitus influenced placement seeking, career envisioning and geographical mobility with the misleading conceptualisations of a student as being freely single. Female students showed how the masculine-dominated habitus and hegemony in the agricultural rural sector had shaped their consideration of Brand-Me. This provides a contribution to the conceptual construction of rural gender identities and how students adopt chameleon-like strategies to fit in with their surroundings.
The use of Rich Pictures with individuals also provided a novel structure of reflection for employability in career support services and in consideration of Brand-Me.
Biological Control of Weeds: Theory and Practical Application
Crop Production/Industries,
- …