13,210 research outputs found
Rolling Horizon Workforce Schedule to Assign Workers to Tasks in a Learning/Forgetting Environment
The variability in demand across the planning horizon, together with heterogeneous workforces in which workers learn and forget at different rates, makes building and managing a workforce challenging. When learning and forgetting functions are integrated into workforce scheduling, a worker's previous experience on a task can have a significant impact on productivity. While making assignments over an infinite planning horizon would be ideal, the learning/forgetting function significantly increases problem complexity and solution difficulty as the planning horizon lengthens. In this thesis, a multi-period rolling horizon worker-task assignment framework is developed to overcome the computational challenges associated with longer planning horizons. The non-linear learning/forgetting function is converted into an equivalent linear form (using an existing technique) to further reduce problem complexity. We design experiments to analyze the optimal planning horizon and the factors that affect it, questions that remain unanswered in the literature. After testing the model under different scenarios (varying staffing level, variation in demand, learning rate, forgetting rate, and workforce heterogeneity), we conclude that variation in demand and staffing level are the most significant factors in determining the optimal planning horizon. We also see a significant improvement in performance when comparing our proposed multi-period framework against a myopic model, especially in scenarios with higher workforce heterogeneity, higher variation in demand, and a faster forgetting rate.
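The rolling mechanism this abstract describes can be sketched in outline. Everything below, the log-linear productivity curve, the greedy stand-in for the per-window optimization, and the exponential forgetting factor, is an illustrative assumption, not the thesis's actual model:

```python
import math

def productivity(exp, learn_rate=0.3):
    """Illustrative log-linear learning curve: output grows with experience."""
    return 1.0 + learn_rate * math.log(1 + exp)

def greedy_assign(workers, tasks, experience):
    """Stand-in for the per-window optimization: each task gets the most
    experienced still-unassigned worker (a real model would solve a MIP
    over the whole lookahead window)."""
    free = list(workers)
    assignment = {}
    for t in tasks:
        w = max(free, key=lambda cand: experience[cand].get(t, 0))
        assignment[t] = w
        free.remove(w)
    return assignment

def rolling_horizon(workers, demand, periods, forget=0.1):
    """Commit one period at a time: workers gain experience on the task
    they were assigned and forget (decay) experience on every task."""
    experience = {w: {} for w in workers}
    schedule = []
    for p in range(periods):
        assignment = greedy_assign(workers, demand[p], experience)
        schedule.append(assignment)
        for w in workers:                        # forgetting: exponential decay
            for t in experience[w]:
                experience[w][t] *= (1 - forget)
        for t, w in assignment.items():          # learning: one unit of practice
            experience[w][t] = experience[w].get(t, 0) + 1
    return schedule
```

Here `greedy_assign` is deliberately myopic; the thesis instead solves a linearized assignment model over a multi-period window and commits only its first period before rolling forward.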
When Does a Newcomer Contribute to a Better Performance? A Multi-Agent Study on Self-Organising Processes of Task Allocation
This paper describes how a work group and a newcomer mutually adapt. We study two types of simulated groups that need an extra worker: one because a former employee has left the group, and one because of its workload. For both groups we test three conditions: newcomers who are specialists, newcomers who are generalists, and a control condition with no newcomer. We hypothesise that the group that needs an extra worker because of its workload will perform best with a generalist newcomer, while the group that needs an extra worker because a former employee has left will perform better with a specialist newcomer. We study the development of task allocation and performance, with expertise and motivation as process variables. We use two performance indicators: the performance time of the slowest agent, which indicates the speed of the group, and the sum of the performance of all agents, which indicates labour costs. Both are indicative of the potential benefit of the newcomer. Strictly speaking, the results support our hypotheses, although the differences between the groups with generalists and specialists are negligible. What really mattered was the possibility for a newcomer to fit in.
Keywords: Task Allocation, Group Processes, Psychological Theory, Small Groups, Self-Organisation
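The two group-level indicators can be computed directly from per-agent performance times; the agent names and numbers below are hypothetical, chosen only to show the two aggregations:

```python
def group_indicators(performance_times):
    """The group's speed is set by its slowest member; labour cost is the
    total performance time across all agents."""
    speed = max(performance_times.values())        # slowest agent's time
    labour_cost = sum(performance_times.values())  # total time worked
    return speed, labour_cost

# Hypothetical per-agent performance times for one simulated work cycle
times = {"agent_a": 4.0, "agent_b": 6.5, "agent_c": 5.0}
speed, cost = group_indicators(times)
```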
Improving Exploration in Evolution Strategies for Deep Reinforcement Learning via a Population of Novelty-Seeking Agents
Evolution strategies (ES) are a family of black-box optimization algorithms able to train deep neural networks roughly as well as Q-learning and policy gradient methods on challenging deep reinforcement learning (RL) problems, but much faster (e.g. hours vs. days) because they parallelize better. However, many RL problems require directed exploration because they have reward functions that are sparse or deceptive (i.e. contain local optima), and it is unknown how to encourage such exploration with ES. Here we show that algorithms invented to promote directed exploration in small-scale evolved neural networks via populations of exploring agents, specifically novelty search (NS) and quality diversity (QD) algorithms, can be hybridized with ES to improve its performance on sparse or deceptive deep RL tasks while retaining scalability. Our experiments confirm that the resultant new algorithms, NS-ES and two QD algorithms, NSR-ES and NSRA-ES, avoid local optima encountered by ES and achieve higher performance on Atari games and on simulated robots learning to walk around a deceptive trap. This paper thus introduces a family of fast, scalable algorithms for reinforcement learning that are capable of directed exploration. It also adds this new family of exploration algorithms to the RL toolbox and raises the interesting possibility that analogous algorithms with multiple simultaneous paths of exploration might also combine well with existing RL algorithms outside ES.
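The core idea behind NS-ES can be sketched as follows: ES perturbations are scored by novelty, the mean distance to the k nearest behaviors in an archive, rather than by reward. The behavior characterization, population size, and the absence of rank-shaping and parallelism are simplifications of the published algorithms:

```python
import math
import random

def novelty(behavior, archive, k=3):
    """Mean Euclidean distance to the k nearest archived behaviors.
    The archive is assumed non-empty."""
    dists = sorted(math.dist(behavior, b) for b in archive)
    return sum(dists[:k]) / min(k, len(dists))

def ns_es_step(theta, archive, behavior_of, npop=50, sigma=0.1, alpha=0.01):
    """One ES update in which sampled perturbations are scored by the
    novelty of the behavior they produce, rather than by reward."""
    grad = [0.0] * len(theta)
    for _ in range(npop):
        eps = [random.gauss(0, 1) for _ in theta]              # perturbation
        cand = [t + sigma * e for t, e in zip(theta, eps)]     # offspring
        score = novelty(behavior_of(cand), archive)            # novelty, not reward
        for i, e in enumerate(eps):
            grad[i] += score * e
    return [t + alpha * g / (npop * sigma) for t, g in zip(theta, grad)]
```

NSR-ES and NSRA-ES (the QD variants in the paper) additionally blend this novelty score with reward; here only the pure novelty-seeking update is shown.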
Overcoming Data Noise and Interference via Parameter Learning
Thesis (Ph.D.) -- Seoul National University Graduate School: College of Engineering, Dept. of Electrical and Computer Engineering, Feb. 2021.
Data-driven approaches based on neural networks have emerged as a new paradigm for solving problems in computer vision and natural language processing. They achieve better performance than hand-designed (heuristic) approaches, but those gains rely heavily on large amounts of high-quality labeled data. Accordingly, to train a model well, it is important both to collect a large amount of data and to improve its quality by analyzing the factors that degrade it. In this dissertation, I propose iterative algorithms that mitigate the noise in data labeled through crowdsourcing, and a meta architecture that alleviates interference among data in continual learning scenarios.
Researchers generally collect data using web-based crowdsourcing platforms, which aggregate human evaluations to build large labeled datasets. However, annotators' responses may vary significantly due to misconceptions about task instructions, lack of responsibility, and inherent noise, and with low-paid workers the quality of the acquired data is easily degraded. To mitigate this noise, I propose novel inference algorithms for discrete multiple-choice tasks and for real-valued vector regression tasks. The algorithms estimate the true answer of each task and the reliability of each worker by iteratively exchanging two types of messages. Their average performance is proved theoretically under a probabilistic crowd model; interestingly, the performance bounds depend on the number of queries assigned per task and the average quality of the workers. When the workers' average reliability exceeds a certain level, the average performance of each algorithm approaches that of an oracle estimator that knows the reliability of every worker (the theoretical upper bound). Extensive experiments on both real-world and synthetic datasets verify the practical performance of the algorithms, which is superior to that of previous state-of-the-art algorithms.
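The iterative message-passing idea can be sketched for the simplest binary-label case. This is a generic, unnormalized version in the spirit of the algorithms described, not the dissertation's exact multiple-choice or vector-regression procedure:

```python
def iterative_inference(responses, n_iter=10):
    """responses: {(task, worker): label in {+1, -1}}.  Alternately update
    task messages x (reliability-weighted votes from the other workers)
    and worker messages y (agreement with the other tasks' messages),
    then decide each task by a reliability-weighted majority vote.
    O(|responses|^2) per iteration -- fine for a sketch, not for scale."""
    y = {e: 1.0 for e in responses}   # worker-reliability messages
    x = {e: 0.0 for e in responses}   # task messages
    for _ in range(n_iter):
        for (t, w) in responses:      # task update: votes of other workers
            x[(t, w)] = sum(responses[e] * y[e]
                            for e in responses if e[0] == t and e[1] != w)
        for (t, w) in responses:      # worker update: agreement on other tasks
            y[(t, w)] = sum(responses[e] * x[e]
                            for e in responses if e[1] == w and e[0] != t)
    estimates = {}
    for (t, w) in responses:          # final weighted majority vote per task
        s = sum(responses[e] * y[e] for e in responses if e[0] == t)
        estimates[t] = 1 if s >= 0 else -1
    return estimates
```

The messages deliberately exclude the recipient's own contribution (`e[1] != w`, `e[0] != t`), which is what distinguishes this style of algorithm from plain majority voting.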
Second, when a model learns a sequence of tasks one by one (continual learning), previously learned knowledge may conflict with new knowledge, a well-known phenomenon called "Catastrophic Forgetting" or "Semantic Drift". In this dissertation, we call this phenomenon "Interference", since it occurs between two bodies of knowledge learned from labeled data separated in time. Controlling the amounts of noise and interference is essential for a neural network to be well trained.
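A common family of remedies for such interference anchors the parameters that were important for earlier tasks with a quadratic penalty whose strength is tuned. The sketch below illustrates that generic mechanism with a toy, hand-written intensity controller; it is not the dissertation's HM implementation, whose controller is itself learned:

```python
def reg_loss(params, anchor, importance, intensity):
    """Quadratic (EWC-style) penalty pulling important parameters toward
    their values after the previous task; `intensity` plays the role of
    the intensity of regularization (IoR)."""
    return intensity * sum(f * (p - a) ** 2
                           for p, a, f in zip(params, anchor, importance))

def toy_intensity(drift, lo=0.1, hi=10.0, gain=5.0):
    """Toy stand-in for an adaptive controller: raise the IoR when the
    current parameters drift far from the anchor, and clamp it to a fixed
    range (a bounded, negative-feedback behaviour)."""
    return max(lo, min(hi, gain * drift))
```

In the dissertation, the intensity is produced by a learned meta-model rather than a fixed rule like `toy_intensity`; the bounded behaviour shown here is what the abstract likens to synaptic homeostasis.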
In the second part of the dissertation, to resolve the interference among labeled data from consecutive tasks in continual learning scenarios, a homeostasis-inspired meta learning architecture (HM) is proposed. HM automatically controls the intensity of regularization (IoR) by capturing the parameters that were important for previous tasks and the current learning direction. By adjusting the IoR, a learner can balance the amount of interference against the degrees of freedom available for its current learning. Experimental results on various types of continual learning tasks show that the proposed method notably outperforms conventional methods in terms of average accuracy and amount of interference, and that HM is more stable and robust than existing synaptic-plasticity-based methods. Interestingly, the IoR generated by HM is proactively kept within a certain range, resembling the negative-feedback mechanism of homeostasis in synapses.

Contents
Abstract
Contents
List of Tables
List of Figures
1 Introduction
2 Reliable multiple-choice iterative algorithm for crowdsourcing systems
2.1 Setup
2.2 Algorithm
2.2.1 Task Allocation
2.2.2 Multiple Iterative Algorithm
2.2.3 Task Allocation for General Setting
2.3 Applications
2.4 Analysis of algorithms
2.4.1 Quality of workers
2.4.2 Bound on the Average Error Probability
2.4.3 Proof of Theorem 1
2.4.4 Proof of Sub-Gaussianity
2.5 Experiments
2.6 Related Literature
3 Reliable Aggregation Method for Vector Regression in Crowdsourcing
3.1 Preliminaries
3.2 Inference Algorithm
3.2.1 Task Message
3.2.2 Worker Message
3.3 Experimental Results
3.3.1 Real crowdsourcing data
3.4 Performance Analysis
3.4.1 Dirichlet crowd model
3.4.2 Error Bound
3.4.3 Optimality of Oracle Estimator
3.4.4 Performance Proofs
3.5 Related Literature
4 Homeostasis-Inspired Meta Continual Learning
4.1 Preliminaries
4.1.1 Continual Learning
4.1.2 Meta Learning
4.2 Homeostatic Meta-Model
4.3 Preliminary Experiments and Findings
4.3.1 Block-wise Permutation
4.3.2 Performance Metrics
4.4 Experiment
4.4.1 Datasets
4.4.2 Methods
4.4.3 Overall Performance
4.5 Related Literature
4.6 Discussion
5 Conclusion
Abstract (In Korean)
Worker scheduling with induced learning in a semi-on-line setting
Scheduling is a widely researched area with many interesting subfields. The presented research deals with a maintenance area in which preventative maintenance and emergency jobs enter the system. Each job has a varying processing time and must be scheduled. Through learning, the operators are able to expand their knowledge, which enables them to accomplish more tasks in a limited time. Two MINLP models are presented: one for preventative maintenance jobs alone, and another that also includes emergency jobs. The emergency model is semi-on-line because the arrival times of emergency jobs are unknown. A corresponding heuristic method has also been developed to decrease the computational time of the MINLP models. The models and the heuristic were tested in several settings to determine their flexibility. It is demonstrated that the inclusion of learning greatly improves the efficiency of the workers and of the system.
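The abstract does not state which learning function the MINLP models use; a standard choice in such scheduling work is Wright's log-linear curve, sketched here with an illustrative 80% learning rate:

```python
import math

def unit_time(t_first, n, learning_rate=0.8):
    """Wright's log-linear learning curve: every doubling of cumulative
    repetitions multiplies the unit processing time by `learning_rate`."""
    b = math.log(learning_rate, 2)          # b < 0 when learning_rate < 1
    return t_first * n ** b

def jobs_completed(t_first, shift_length, learning_rate=0.8):
    """Number of jobs an operator finishes in a shift while learning."""
    elapsed, n = 0.0, 0
    while elapsed + unit_time(t_first, n + 1, learning_rate) <= shift_length:
        elapsed += unit_time(t_first, n + 1, learning_rate)
        n += 1
    return n
```

With a first-job time of 10 and an 80% curve, successive jobs take 10, 8.0, about 7.02, and 6.4 time units, so a 30-unit shift fits three jobs; this is the sense in which learning lets operators "accomplish more tasks in a limited time."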
Allocation of Workers Utilizing Models with Learning, Forgetting, and Various Work Structures
Much of the literature on cross-training and worker assignment problems focuses on simulating production systems under cross-training methods. Many studies have found that, for specific systems, some methods of allocating workers perform better than others in terms of overall productivity and the ability to deal with change. This has led researchers to create mathematical programming models aimed at finding optimal levels of cross-training by changing worker allocations. Learning and forgetting curves have been a key means of improving the solutions produced by these optimization models, but learning curves are often nonlinear, which increases solving times. Because of this, most works have been restricted to modeling small, simple production systems.
This thesis studies the expansion of worker allocation models with human learning and forgetting to include variable work structures, allowing the models to address a larger set of problems than previously possible. A worker assignment model with flexible inventory constraints capable of representing different production structures is constructed to demonstrate the expansion. By utilizing a reformulation technique to counteract the increased solve times caused by incorporating learning curves, the production systems modeled in this work are larger than in similar works and closer in scale to systems seen in industry. Production systems with multiple products and corresponding due dates are modeled to better represent industrial production environments. Investigative tests, including a 2^4 factorial experiment, are included to understand the performance of the model. The output of the optimization model is a schedule of worker assignments over all of the tasks in the modeled system for the planning horizon. Production managers could apply the schedule to their existing lines or run what-if scenarios on line structure to better understand how alternative structures may affect worker training and line productivity over the planning horizon.
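A 2^4 factorial experiment enumerates all sixteen combinations of four two-level factors. The factor names and levels below are hypothetical, chosen only to show the mechanics of generating the design:

```python
from itertools import product

# Hypothetical low/high levels for four factors; the thesis does not list
# its factors here, so these names are illustrative only.
factors = {
    "demand_variation": (0.1, 0.4),
    "staffing_level":   (8, 12),
    "learning_rate":    (0.85, 0.95),
    "forgetting_rate":  (0.01, 0.05),
}

def full_factorial(factors):
    """All 2^k runs of a two-level full factorial design."""
    names = list(factors)
    return [dict(zip(names, levels))
            for levels in product(*(factors[n] for n in names))]

runs = full_factorial(factors)   # 2^4 = 16 experimental runs
```

Each run would then be passed to the optimization model; comparing results across the sixteen runs exposes main effects and factor interactions.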
- …