Real-time Planning as Decision-making Under Uncertainty
In real-time planning, an agent must select the next action to take within a fixed time bound.
Many popular real-time heuristic search methods approach this by expanding nodes using time-limited A* and selecting the action leading toward the frontier node with the lowest f value. In this thesis, we reconsider real-time planning as a problem of decision-making under uncertainty. We treat heuristic values as uncertain evidence and we explore several backup methods for aggregating this evidence. We then propose a novel lookahead strategy that expands nodes to minimize risk, the expected regret in case a non-optimal action is chosen. We evaluate these methods in a simple synthetic benchmark and the sliding tile puzzle and find that they outperform previous methods. This work illustrates how uncertainty can arise even when solving deterministic planning problems, due to the inherent ignorance of time-limited search algorithms about those portions of the state space that they have not computed, and how an agent can benefit from explicitly meta-reasoning about this uncertainty.
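The baseline the abstract describes can be made concrete. Below is a minimal sketch, not the thesis's own code: it expands up to a fixed budget of nodes best-first by f = g + h, then returns the first action on the path toward the open node with the lowest f value. The function names and the `successors(s) -> (action, next_state, cost)` interface are assumptions for illustration.

```python
import heapq

def rt_lookahead(start, successors, h, budget):
    """One real-time planning step: time-limited A*-style lookahead.

    Expands at most `budget` nodes best-first by f = g + h, then returns
    the first action on the path toward the lowest-f frontier node.
    `successors(s)` yields (action, next_state, cost) triples; states
    must be hashable.
    """
    # Heap entries: (f, g, tie-breaker, state, first action from the root).
    frontier = [(h(start), 0.0, 0, start, None)]
    closed = set()
    tie = 1
    while frontier and budget > 0:
        f, g, _, s, first = heapq.heappop(frontier)
        if s in closed:
            continue  # stale duplicate entry
        closed.add(s)
        budget -= 1
        for a, s2, c in successors(s):
            if s2 not in closed:
                heapq.heappush(
                    frontier,
                    (g + c + h(s2), g + c, tie, s2, first if first is not None else a),
                )
                tie += 1
    # Commit to the action leading toward the lowest-f unexpanded node.
    for f, g, _, s, first in sorted(frontier):
        if s not in closed:
            return first
    return None
```

On a path graph with a goal at state 10 and h(s) = |10 - s|, a lookahead from state 5 with any small budget commits to the "+1" action, since every frontier node on that side has a lower f value.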
Learning Classical Planning Strategies with Policy Gradient
A common paradigm in classical planning is heuristic forward search. Forward
search planners often rely on simple best-first search which remains fixed
throughout the search process. In this paper, we introduce a novel search
framework capable of alternating between several forward search approaches
while solving a particular planning problem. Selection of the approach is
performed using a trainable stochastic policy, mapping the state of the search
to a probability distribution over the approaches. This enables using policy
gradient to learn search strategies tailored to specific distributions of
planning problems and a selected performance metric, e.g. the IPC score. We
instantiate the framework by constructing a policy space consisting of five
search approaches and a two-dimensional representation of the planner's state.
Then, we train the system on randomly generated problems from five IPC domains
using three different performance metrics. Our experimental results show that
the learner is able to discover domain-specific search strategies, improving
the planner's performance relative to the baselines of plain best-first search
and a uniform policy.
Comment: Accepted for ICAPS 201
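The trainable stochastic policy described above can be sketched in a few lines. This is a generic linear-softmax policy trained with REINFORCE, not the paper's actual system: the feature dimensionality, approach count, and reward signal are placeholders, and the paper's two-dimensional planner-state representation and five search approaches are only mirrored schematically.

```python
import math
import random

def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(l - m) for l in logits]
    z = sum(exps)
    return [e / z for e in exps]

class SearchPolicy:
    """Linear policy mapping planner-state features to a distribution
    over search approaches, trained with the REINFORCE policy gradient."""

    def __init__(self, n_features, n_approaches, lr=0.1):
        self.w = [[0.0] * n_features for _ in range(n_approaches)]
        self.lr = lr

    def probs(self, x):
        return softmax([sum(wi * xi for wi, xi in zip(row, x)) for row in self.w])

    def sample(self, x):
        """Draw a search approach index from the current distribution."""
        p = self.probs(x)
        r, acc = random.random(), 0.0
        for i, pi in enumerate(p):
            acc += pi
            if r <= acc:
                return i
        return len(p) - 1

    def update(self, episode, ret):
        """REINFORCE: w += lr * return * grad log pi(a | x),
        summed over the (features, chosen approach) pairs of one episode."""
        for x, a in episode:
            p = self.probs(x)
            for i, row in enumerate(self.w):
                coeff = (1.0 if i == a else 0.0) - p[i]
                for j in range(len(row)):
                    row[j] += self.lr * ret * coeff * x[j]
```

In the paper's setting the return would be a performance metric such as the IPC score of the solved problem; here any scalar reward drives the same update.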
Word Limited: An Empirical Analysis of the Relationship Between the Length, Resiliency, and Impact of Federal Regulations
Since the rise of the modern administrative state we have seen a demonstrable trend towards lengthier regulations. However, popular critiques of the administrative state that focus on the overall size of the Federal Register are misguided. They rest on the premise that more, and longer, regulations unduly burden industry and the economy in general. Yet movement towards lengthier and more detailed regulations could be rational and largely unproblematic. This study tests two potential rational explanations for the trend towards longer regulations: dubbed (1) “the insulation hypothesis” and (2) “the socially beneficial hypothesis.” Each of these explanations embodies a theoretically rational decision. First, the insulation hypothesis rests on the idea that it would make sense for policy-makers to include more detailed legal and scientific support in new regulations, and thereby increase their length relative to previous regulations, if the additional detail provided more insulation from judicial review. Second, the socially beneficial hypothesis rests on the idea that devoting relatively more time and resources to each new rule would be appropriate if longer, newer regulations produced more net social benefits than older, shorter ones. The empirical analysis set forth in this article combines data from a number of publicly available sources to test these hypotheses. The results, confirming “the socially beneficial hypothesis,” add to the canon of empirical analysis of administrative law, building on the work of Cass Sunstein, Cary Coglianese, and others. Recognizing an overly burdensome regulatory state, an undoubtedly worthwhile and vital check in a democratic society, requires more than simply counting the pages of regulations. The results of this study should put some minds at ease, at least with respect to EPA regulations; they should also help better direct our scrutiny in the future.
Campaign Management under Approval-Driven Voting Rules
Approval-like voting rules, such as Sincere-Strategy Preference-Based
Approval voting (SP-AV), the Bucklin rule (an adaptive variant of k-Approval
voting), and the Fallback rule (an adaptive variant of SP-AV) have many
desirable properties: for example, they are easy to understand and encourage
the candidates to choose electoral platforms that have a broad appeal. In this
paper, we investigate both classic and parameterized computational complexity
of electoral campaign management under such rules. We focus on two methods that
can be used to promote a given candidate: asking voters to move this candidate
upwards in their preference order or asking them to change the number of
candidates they approve of. We show that finding an optimal campaign management
strategy of the first type is easy for both Bucklin and Fallback. In contrast,
the second method is computationally hard even if the degree to which we need
to affect the votes is small. Nevertheless, we identify a large class of
scenarios that admit fixed-parameter tractable algorithms.
Comment: 34 pages, 1 figure
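For readers unfamiliar with the Bucklin rule mentioned above, a minimal sketch of winner determination follows. This is a textbook-style implementation for illustration, not code from the paper, and it breaks ties arbitrarily; the exact tie-breaking and the simplified variants studied in the campaign-management literature may differ.

```python
def bucklin_winner(profile):
    """Bucklin rule: find the smallest round k at which some candidate
    appears in the top k positions of a strict majority of votes; among
    such candidates, the one with the most top-k appearances wins.

    `profile` is a list of strict preference orders, most-preferred first.
    """
    n = len(profile)          # number of voters
    m = len(profile[0])       # number of candidates
    for k in range(1, m + 1):
        scores = {}
        for order in profile:
            for c in order[:k]:
                scores[c] = scores.get(c, 0) + 1
        majority = [c for c, s in scores.items() if s > n / 2]
        if majority:
            # Ties broken arbitrarily by max(); real rules may differ.
            return max(majority, key=lambda c: scores[c])
    return None
```

With three voters ranking ("a","b","c"), ("b","a","c"), and ("b","c","a"), candidate "b" already has a strict majority of first-place votes, so the rule stops at round k = 1.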
On Backtracking in Real-time Heuristic Search
Real-time heuristic search algorithms are suitable for situated agents that
need to make their decisions in constant time. Since the original work by Korf
nearly two decades ago, numerous extensions have been suggested. One of the
most intriguing extensions is the idea of backtracking wherein the agent
decides to return to a previously visited state as opposed to moving forward
greedily. This idea has been empirically shown to have a significant impact on
various performance measures. The studies have been carried out in particular
empirical testbeds with specific real-time search algorithms that use
backtracking. Consequently, the extent to which the trends observed are
characteristic of backtracking in general is unclear. In this paper, we present
the first entirely theoretical study of backtracking in real-time heuristic
search. In particular, we present upper bounds on the solution cost exponential
and linear in a parameter regulating the amount of backtracking. The results
hold for a wide class of real-time heuristic search algorithms that includes
many existing algorithms as a small subclass.
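To make the backtracking idea concrete, here is a minimal LRTA*-style single-step sketch. It is an illustration under stated assumptions, not an algorithm from the paper: the agent updates the heuristic of its current state from the best successor, and if that update exceeds a threshold (the kind of parameter the bounds above regulate), it returns to the previous state instead of moving forward greedily. Interface names are hypothetical.

```python
def lrta_step(state, successors, h, trail, backtrack_threshold):
    """One move of an LRTA*-style agent with optional backtracking.

    `successors(s)` yields (action, next_state, cost) triples; `h` is a
    mutable dict of heuristic estimates; `trail` is the stack of states
    visited so far. Returns the next state the agent occupies.
    """
    # Learning step: raise h(state) to the best one-step lookahead value.
    best_f = min(c + h[s2] for _, s2, c in successors(state))
    raised = best_f - h[state]
    if raised > 0:
        h[state] = best_f
    # Backtrack if the heuristic correction was large enough.
    if raised > backtrack_threshold and trail:
        return trail.pop()
    trail.append(state)
    # Otherwise move greedily toward the successor minimizing c + h.
    return min(successors(state), key=lambda t: t[2] + h[t[1]])[1]
```

With a perfect heuristic no update occurs and the agent moves forward; with a badly misleading heuristic the correction triggers a backtrack, which is exactly the trade-off the cost bounds in the paper quantify.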
Code Generation = A* + BURS
A system called BURS, which is based on term rewrite systems, is combined with the A* search algorithm to produce a code generator that generates optimal code. The theory underlying BURS is re-developed, formalised and explained in this work. The search algorithm uses a cost heuristic that is derived from the term rewrite system to direct the search. The advantage of using a search algorithm is that we need to compute only those costs that may be part of an optimal rewrite sequence.
Searching and Stopping: An Analysis of Stopping Rules and Strategies
Searching naturally involves stopping points, both at a query level (how far down the ranked list should I go?) and at a session level (how many queries should I issue?). Understanding when searchers stop has been of much interest to the community because it is fundamental to how we evaluate search behaviour and performance. Research has shown that searchers find it difficult to formalise stopping criteria, and typically resort to their intuition of what is "good enough". While various heuristics and stopping criteria have been proposed, little work has investigated how well they perform, and whether searchers actually conform to any of these rules. In this paper, we undertake the first large-scale study of stopping rules, investigating how they influence overall session performance, and which rules best match actual stopping behaviour. Our work is focused on stopping at the query level in the context of ad-hoc topic retrieval, where searchers undertake search tasks within a fixed time period. We show that stopping strategies based upon the disgust or frustration point rules - both of which capture a searcher's tolerance to non-relevance - typically result in (i) the best overall performance, and (ii) provide the closest approximation to actual searcher behaviour, although a fixed depth approach also performs remarkably well. Findings from this study have implications regarding how we build measures, and how we conduct simulations of search behaviours.
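The two tolerance-based rules named above can be sketched as simple scans down a ranked list. This is a plain-reading sketch, assuming the frustration point counts non-relevant documents in total and the disgust point counts consecutive non-relevant documents; the paper's precise operationalisation may differ.

```python
def frustration_stop(relevance, tolerance):
    """Stop after seeing `tolerance` non-relevant documents in total.

    `relevance` is a list of 0/1 judgments down the ranking; returns the
    depth (1-based) at which the searcher stops.
    """
    seen = 0
    for depth, rel in enumerate(relevance, start=1):
        seen += 1 - rel
        if seen >= tolerance:
            return depth
    return len(relevance)

def disgust_stop(relevance, tolerance):
    """Stop after `tolerance` consecutive non-relevant documents."""
    run = 0
    for depth, rel in enumerate(relevance, start=1):
        run = 0 if rel else run + 1
        if run >= tolerance:
            return depth
    return len(relevance)
```

On the ranking [1, 0, 1, 0, 0, 1] with a tolerance of 2, the frustration rule stops at depth 4 (second non-relevant document overall) while the disgust rule continues to depth 5 (first run of two non-relevant documents), illustrating how the consecutive variant is more forgiving of scattered non-relevance.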