54,729 research outputs found
Human-Machine Collaborative Optimization via Apprenticeship Scheduling
Coordinating agents to complete a set of tasks with intercoupled temporal and
resource constraints is computationally challenging, yet human domain experts
can solve these difficult scheduling problems using paradigms learned through
years of apprenticeship. A process for manually codifying this domain knowledge
within a computational framework is necessary to scale beyond the
``single-expert, single-trainee" apprenticeship model. However, human domain
experts often have difficulty describing their decision-making processes,
causing the codification of this knowledge to become laborious. We propose a
new approach for capturing domain-expert heuristics through a pairwise ranking
formulation. Our approach is model-free and does not require enumerating or
iterating through a large state space. We empirically demonstrate that this
approach accurately learns multifaceted heuristics on a synthetic data set
incorporating job-shop scheduling and vehicle routing problems, as well as on
two real-world data sets consisting of demonstrations of experts solving a
weapon-to-target assignment problem and a hospital resource allocation problem.
We also demonstrate that policies learned from human scheduling demonstration
via apprenticeship learning can substantially improve the efficiency of a
branch-and-bound search for an optimal schedule. We employ this human-machine
collaborative optimization technique on a variant of the weapon-to-target
assignment problem. We demonstrate that this technique generates solutions
substantially superior to those produced by human domain experts at a rate up
to 9.5 times faster than an optimization approach and can be applied to
optimally solve problems twice as complex as those solved by a human
demonstrator.Comment: Portions of this paper were published in the Proceedings of the
International Joint Conference on Artificial Intelligence (IJCAI) in 2016 and
in the Proceedings of Robotics: Science and Systems (RSS) in 2016. The paper
consists of 50 pages with 11 figures and 4 table
Active Inverse Reward Design
Designers of AI agents often iterate on the reward function in a
trial-and-error process until they get the desired behavior, but this only
guarantees good behavior in the training environment. We propose structuring
this process as a series of queries asking the user to compare between
different reward functions. Thus we can actively select queries for maximum
informativeness about the true reward. In contrast to approaches asking the
designer for optimal behavior, this allows us to gather additional information
by eliciting preferences between suboptimal behaviors. After each query, we
need to update the posterior over the true reward function from observing the
proxy reward function chosen by the designer. The recently proposed Inverse
Reward Design (IRD) enables this. Our approach substantially outperforms IRD in
test environments. In particular, it can query the designer about
interpretable, linear reward functions and still infer non-linear ones
NEXT LEVEL: A COURSE RECOMMENDER SYSTEM BASED ON CAREER INTERESTS
Skills-based hiring is a talent management approach that empowers employers to align recruitment around business results, rather than around credentials and title. It starts with employers identifying the particular skills required for a role, and then screening and evaluating candidates’ competencies against those requirements. With the recent rise in employers adopting skills-based hiring practices, it has become integral for students to take courses that improve their marketability and support their long-term career success. A 2017 survey of over 32,000 students at 43 randomly selected institutions found that only 34% of students believe they will graduate with the skills and knowledge required to be successful in the job market. Furthermore, the study found that while 96% of chief academic officers believe that their institutions are very or somewhat effective at preparing students for the workforce, only 11% of business leaders strongly agree [11]. An implication of the misalignment is that college graduates lack the skills that companies need and value. Fortunately, the rise of skills-based hiring provides an opportunity for universities and students to establish and follow clearer classroom-to-career pathways. To this end, this paper presents a course recommender system that aims to improve students’ career readiness by suggesting relevant skills and courses based on their unique career interests
Decision by sampling
We present a theory of decision by sampling (DbS) in which, in contrast with traditional models, there are no underlying psychoeconomic scales. Instead, we assume that an attribute’s subjective value is constructed from a series of binary, ordinal comparisons to a sample of attribute values drawn from memory and is its rank within the sample. We assume that the sample reflects both the immediate distribution of attribute values from the current decision’s context and also the background, real-world distribution of attribute values. DbS accounts for concave utility functions; losses looming larger than gains; hyperbolic temporal discounting; and the overestimation of small probabilities and the underestimation of large probabilities
- …