CASSL: Curriculum Accelerated Self-Supervised Learning
Recent self-supervised learning approaches focus on using a few thousand data
points to learn policies for high-level, low-dimensional action spaces.
However, scaling this framework to high-dimensional control requires either
scaling up the data collection effort or using a clever sampling strategy for
training. We present a novel approach, Curriculum Accelerated Self-Supervised
Learning (CASSL), to train policies that map visual information to high-level,
higher-dimensional action spaces. CASSL orders the sampling of training data
based on control dimensions: learning and sampling focus on a few control
parameters before the others. The right curriculum for learning is suggested
by variance-based global sensitivity analysis of the control space. We apply
the CASSL framework to learning how to grasp with an adaptive, underactuated
multi-fingered gripper, a challenging system to control. Our experimental
results indicate that CASSL provides significant improvement and
generalization over baseline methods such as staged curriculum learning
(8% increase) and complete end-to-end learning with random exploration
(14% improvement) when tested on a set of novel objects.
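The curriculum-ordering step can be illustrated with a short sketch. Below is a minimal Monte Carlo estimate of first-order Sobol indices, the standard variance-based global sensitivity measure, used to rank control dimensions by how much output variance each explains on its own. The grasp_outcome function and the five-dimensional control space are hypothetical stand-ins, not the paper's actual gripper model.

```python
# Minimal sketch: rank control dimensions by first-order Sobol index
# (Saltelli's pick-freeze estimator), assuming a black-box outcome signal.
import numpy as np

def first_order_sobol(f, bounds, n=10_000, seed=0):
    """Estimate first-order Sobol indices S1 of f over box `bounds`.

    bounds: sequence of (low, high) pairs, one per control dimension.
    Returns an array of d indices; larger means that dimension explains
    more of the output variance on its own.
    """
    rng = np.random.default_rng(seed)
    d = len(bounds)
    lo, hi = np.asarray(bounds, dtype=float).T
    A = rng.uniform(lo, hi, size=(n, d))      # base sample
    B = rng.uniform(lo, hi, size=(n, d))      # independent resample
    fA, fB = f(A), f(B)
    var = np.var(np.concatenate([fA, fB]))
    s1 = np.empty(d)
    for i in range(d):
        ABi = A.copy()
        ABi[:, i] = B[:, i]                   # resample only dimension i
        s1[i] = np.mean(fB * (f(ABi) - fA)) / var
    return s1

# Hypothetical stand-in for the grasp-outcome signal over 5 control
# parameters (e.g. wrist angle, approach height, finger spread, ...).
def grasp_outcome(X):
    return np.sin(X[:, 0]) + 0.3 * X[:, 1] ** 2 + 0.05 * X[:, 2:].sum(axis=1)

s1 = first_order_sobol(grasp_outcome, [(-1, 1)] * 5)
curriculum = np.argsort(-s1)                  # most sensitive dimension first
print("suggested sampling order of control dimensions:", curriculum)
```

The descending sort gives the curriculum: sampling is concentrated on the most sensitive control dimensions first, with the rest held near defaults until later stages.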
Validating generic metrics of fairness in game-based resource allocation scenarios with crowdsourced annotations
Being able to effectively measure the notion of fairness is of vital importance, as it can provide insight into the formation and evolution of complex patterns and phenomena such as social preferences, collaboration, group structures and social conflicts. This paper presents a comparative study for quantitatively modelling the notion of fairness in one-to-many resource allocation scenarios, i.e. one provider agent has to allocate resources to multiple receiver agents. For this purpose, we investigate the efficacy of six metrics and cross-validate them on crowdsourced human ranks of fairness annotated through a computer game implementation of the one-to-many resource allocation scenario. Four of the fairness metrics examined are well-established metrics of data dispersion, namely standard deviation, normalised entropy, the Gini coefficient and the fairness index. The fifth metric, proposed by the authors, is an ad-hoc context-based measure based on key aspects of distribution strategies. The sixth metric, finally, is machine learned via ranking support vector machines (SVMs) on the crowdsourced human perceptions of fairness. Results suggest that all ad-hoc designed metrics correlate well with the human notion of fairness, and the context-based metric we propose appears to have a predictability advantage over the other ad-hoc metrics. On the other hand, the normalised entropy and fairness index metrics appear to be the most expressive and generic for measuring fairness in the scenario adopted in this study and beyond. The SVM model can automatically model fairness more accurately than any ad-hoc metric examined (with an accuracy of 81.86%), but it is limited in its expressivity and generalisability.
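For reference, the four generic dispersion metrics can be computed in a few lines. The sketch below uses the standard formulas: sample standard deviation, Shannon entropy normalised by log n, the Gini coefficient via mean absolute difference, and a fairness index assumed here to be Jain's, the usual choice in resource-allocation work. The example allocation vectors are illustrative only.

```python
# Minimal sketch of the four generic dispersion metrics over one
# allocation vector x (resources per receiver agent).
import numpy as np

def fairness_metrics(x):
    x = np.asarray(x, dtype=float)
    n = len(x)
    p = x / x.sum()                                   # share per receiver
    entropy = -(p * np.log(p, where=p > 0, out=np.zeros_like(p))).sum()
    return {
        "std": x.std(),
        # 1.0 means a perfectly even split
        "norm_entropy": entropy / np.log(n),
        # mean absolute pairwise difference, scaled; 0 = perfectly equal
        "gini": np.abs(x[:, None] - x[None, :]).mean() / (2 * x.mean()),
        # Jain's index: (sum x)^2 / (n * sum x^2); 1 = perfectly fair
        "jain_index": x.sum() ** 2 / (n * (x ** 2).sum()),
    }

# One provider allocating 10 units across 4 receivers:
print(fairness_metrics([4, 3, 2, 1]))
print(fairness_metrics([2.5, 2.5, 2.5, 2.5]))         # perfectly fair split
```

Note that standard deviation and Gini fall to 0 for the even split while normalised entropy and Jain's index rise to 1, so the metrics must be sign-aligned before being compared against human fairness ranks.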
Simulation and optimization techniques applied in semiconductor assembly and test operations
The importance of back-end operations in semiconductor manufacturing has been growing steadily in the face of higher customer expectations and stronger competition in the industry. In order to achieve low cycle times, high throughput, and high utilization while improving due-date performance, more effective tools are needed to support machine setup and lot dispatching decisions. In previous work, the problem of maximizing the weighted throughput of lots undergoing assembly and test (AT), while ensuring that critical lots are given priority, was investigated, and a greedy randomized adaptive search procedure (GRASP) was developed to find solutions. Optimization techniques have long been used for scheduling manufacturing operations on a daily basis; solutions provide a prescription for machine setups and job processing over a finite planning horizon. In contrast, simulation provides more detail, but in a descriptive rather than prescriptive sense: it tells you how the system will evolve in real time for a given demand, a given set of resources, and rules for using them. A simulation model can also accommodate changeovers, initial setups and multi-pass requirements easily.

The first part of the research shows how the results of an optimization model can be integrated with the decisions made within a simulation model. The problem addressed is defined in terms of four hierarchical objectives: minimize the weighted sum of key device shortages, maximize weighted throughput, minimize the number of machines used, and minimize makespan, for a given set of lots in queue and a set of resources that includes machines and tooling. The facility can be viewed as a reentrant flow shop. The basic simulation was written in AutoSched AP (ASAP) and then enhanced with the customization features available in the software. Several new dispatch rules were developed. Rule_First_setup initializes the simulation with the setups obtained with the GRASP. Rule_All_setups enables a machine to select the setup provided by the optimization solution whenever a decision is about to be made on which setup to choose subsequent to the initial setup. Rule_Hotlot was also proposed to prioritize the processing of the hot lots that contain key devices.

The objective of the second part of the research is to design and implement heuristics within the simulation model to schedule back-end operations in a semiconductor AT facility. Rule_Setupnum lets the machines determine which key device to process according to a machine setup frequency table constructed from the GRASP solution. GRASP_asap embeds the more robust selection features of GRASP in the ASAP model through customization; this allows ASAP to explore a larger portion of the feasible region at each decision point by randomizing machine setups using adaptive probability distributions that are a function of solution quality. Rule_Greedy, a simplification of GRASP_asap, always picks the setup for a particular machine that gives the greatest marginal improvement in the objective function among all candidates.

The purpose of the third part of the research is to statistically validate the relative effectiveness of our top six dispatch rules by comparing their performance on 30 real and randomly generated data sets. Using both GRASP and our ASAP discrete event simulation model, we have (1) identified the general order of dispatch rule performance, (2) investigated the impact of having setups installed on machines at time zero on rule performance, (3) determined the conditions under which restricting the maximum number of changeovers affects rule performance, and (4) studied the factors that might simultaneously affect rule performance with the help of a common random numbers experimental design. In the analysis, the first two objectives, weighted key device shortages and weighted throughput, are used to measure outcomes.
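The randomized greedy construction at the core of GRASP can be sketched briefly. The snippet below builds a solution element by element from a restricted candidate list (RCL): candidates within a fraction alpha of the best marginal gain are eligible and one is drawn at random, which is the kind of quality-biased randomized choice GRASP_asap exposes at each setup decision. The candidate setups, the marginal_gain scoring function, and alpha are illustrative placeholders, not the actual AT scheduling model.

```python
# Minimal sketch of the GRASP construction phase with a restricted
# candidate list, assuming a user-supplied marginal-gain function.
import random

def grasp_construct(candidates, marginal_gain, alpha=0.3, seed=0):
    """Randomized greedy construction phase of GRASP.

    At each step, score the remaining candidate setups, form a restricted
    candidate list (RCL) of those within a fraction alpha of the best,
    and pick one at random: alpha=0 is purely greedy, alpha=1 is purely
    random.
    """
    rng = random.Random(seed)
    solution, remaining = [], list(candidates)
    while remaining:
        gains = {c: marginal_gain(c, solution) for c in remaining}
        best, worst = max(gains.values()), min(gains.values())
        threshold = best - alpha * (best - worst)
        rcl = [c for c in remaining if gains[c] >= threshold]
        choice = rng.choice(rcl)
        solution.append(choice)
        remaining.remove(choice)
    return solution

# Toy example: order five machine setups whose marginal value decays as
# more setups are already installed.
values = {"setup%d" % i: v for i, v in enumerate([9, 7, 6, 4, 2])}
gain = lambda c, sol: values[c] / (1 + len(sol))
print(grasp_construct(values, gain))
```

Running the construction many times with different seeds and keeping the best solution is what lets GRASP, and by extension GRASP_asap, explore a larger portion of the feasible region than a single greedy pass such as Rule_Greedy.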