
    Real-time Planning as Decision-making Under Uncertainty

    In real-time planning, an agent must select the next action to take within a fixed time bound. Many popular real-time heuristic search methods approach this by expanding nodes using time-limited A* and selecting the action leading toward the frontier node with the lowest f value. In this thesis, we reconsider real-time planning as a problem of decision-making under uncertainty. We treat heuristic values as uncertain evidence and explore several backup methods for aggregating this evidence. We then propose a novel lookahead strategy that expands nodes so as to minimize risk, the expected regret incurred if a non-optimal action is chosen. We evaluate these methods on a simple synthetic benchmark and the sliding-tile puzzle and find that they outperform previous methods. This work illustrates how uncertainty can arise even when solving deterministic planning problems, due to the inherent ignorance of time-limited search algorithms about those portions of the state space that they have not computed, and how an agent can benefit from explicitly meta-reasoning about this uncertainty.
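    As an illustration of the kind of meta-reasoning described above, the following is a minimal sketch (not the thesis's actual algorithm) that treats each top-level action's backed-up value as the mean of a Gaussian belief and estimates the risk, i.e. the expected regret, of committing to the apparently best action. The standard deviations are assumed inputs standing in for whatever uncertainty model a planner would use.

```python
import random

def expected_regret(beliefs, samples=10000, seed=0):
    """Estimate the risk (expected regret) of committing to the action
    whose belief has the lowest mean cost, given independent Gaussian
    beliefs (mean, std) over each action's true cost-to-go."""
    rng = random.Random(seed)
    best = min(range(len(beliefs)), key=lambda i: beliefs[i][0])
    total = 0.0
    for _ in range(samples):
        draws = [rng.gauss(mu, sigma) for mu, sigma in beliefs]
        total += draws[best] - min(draws)  # regret if another action was truly better
    return total / samples

# Two actions: one looks slightly better but is far more uncertain.
print(expected_regret([(10.0, 4.0), (10.5, 0.5)]))
```

    A risk-minimizing lookahead, in this spirit, would spend its limited expansions where they most reduce this quantity rather than simply chasing the lowest f value.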

    Learning Classical Planning Strategies with Policy Gradient

    A common paradigm in classical planning is heuristic forward search. Forward search planners often rely on a simple best-first search that remains fixed throughout the search process. In this paper, we introduce a novel search framework capable of alternating between several forward search approaches while solving a particular planning problem. Selection of the approach is performed using a trainable stochastic policy that maps the state of the search to a probability distribution over the approaches. This enables using policy gradient to learn search strategies tailored to a specific distribution of planning problems and a selected performance metric, e.g. the IPC score. We instantiate the framework by constructing a policy space consisting of five search approaches and a two-dimensional representation of the planner's state. We then train the system on randomly generated problems from five IPC domains using three different performance metrics. Our experimental results show that the learner is able to discover domain-specific search strategies, improving the planner's performance relative to the baselines of plain best-first search and a uniform policy. Comment: Accepted for ICAPS 201
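    For readers unfamiliar with the underlying machinery, here is a generic REINFORCE-style sketch of a softmax policy over a small set of search approaches. The feature dimension, learning rate, and update rule are illustrative assumptions, not the paper's actual architecture or training setup.

```python
import numpy as np

class SearchPolicy:
    """Softmax policy over a discrete set of search approaches, conditioned
    on a small feature vector describing the planner's current state."""

    def __init__(self, n_features, n_approaches, lr=0.01, seed=0):
        self.rng = np.random.default_rng(seed)
        self.W = 0.01 * self.rng.standard_normal((n_approaches, n_features))
        self.lr = lr

    def probs(self, features):
        logits = self.W @ features
        e = np.exp(logits - logits.max())          # numerically stable softmax
        return e / e.sum()

    def sample(self, features):
        p = self.probs(features)
        return int(self.rng.choice(len(p), p=p)), p

    def reinforce_update(self, trajectory, reward):
        """REINFORCE: W += lr * reward * grad log pi(a | s), summed over steps."""
        for features, action, p in trajectory:
            grad = -np.outer(p, features)          # -p_k * features for every approach k
            grad[action] += features               # +features for the approach actually taken
            self.W += self.lr * reward * grad

# Tiny usage: two features, five approaches, a single one-step episode.
policy = SearchPolicy(n_features=2, n_approaches=5)
x = np.array([0.3, 0.7])
a, p = policy.sample(x)
policy.reinforce_update([(x, a, p)], reward=1.0)
```

    In the paper's setting, the reward would be derived from the selected performance metric, e.g. the IPC score obtained on the training problems.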

    Word Limited: An Empirical Analysis of the Relationship Between the Length, Resiliency, and Impact of Federal Regulations

    Since the rise of the modern administrative state we have seen a demonstrable trend towards lengthier regulations. However, popular critiques of the administrative state that focus on the overall size of the Federal Register are misguided. They rest on the premise that more, and longer, regulations unduly burden industry and the economy in general. However, movement towards lengthier and more detailed regulations could be rational and largely unproblematic. This study tests two potential rational explanations for the trend towards longer regulations: dubbed (1) “the insulation hypothesis” and (2) “the socially beneficial hypothesis.” Each of these explanations embodies a theoretically rational decision. First, the insulation hypothesis rests on the idea that it would make sense for policy-makers to include more detailed legal and scientific support in new regulations, and thereby increase their length relative to previous regulations, if the additional detail provided more insulation from judicial review. Second, the socially beneficial hypothesis rests on the idea that devoting relatively more time and resources to each new rule would be appropriate if longer, newer regulations produced more net social benefits than older, shorter ones. The empirical analysis set forth in this article combines data from a number of publicly available sources to test these hypotheses. The results, confirming “the socially beneficial hypothesis,” add to the canon of empirical analysis of administrative law, building on the work of Cass Sunstein, Cary Coglianese, and others. Recognizing an overly burdensome regulatory state, an undoubtedly worthwhile and vital check in a democratic society, requires more than simply counting the pages of regulations. The results of this study should put some minds at ease, at least with respect to EPA regulations; they should also help better direct our scrutiny in the future.

    Campaign Management under Approval-Driven Voting Rules

    Approval-like voting rules, such as Sincere-Strategy Preference-Based Approval voting (SP-AV), the Bucklin rule (an adaptive variant of k-Approval voting), and the Fallback rule (an adaptive variant of SP-AV), have many desirable properties: for example, they are easy to understand and encourage the candidates to choose electoral platforms that have a broad appeal. In this paper, we investigate both classic and parameterized computational complexity of electoral campaign management under such rules. We focus on two methods that can be used to promote a given candidate: asking voters to move this candidate upwards in their preference order or asking them to change the number of candidates they approve of. We show that finding an optimal campaign management strategy of the first type is easy for both Bucklin and Fallback. In contrast, the second method is computationally hard even if the degree to which we need to affect the votes is small. Nevertheless, we identify a large class of scenarios that admit fixed-parameter tractable algorithms. Comment: 34 pages, 1 figure
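    For reference, winner determination under the Bucklin rule itself is straightforward. The sketch below is a generic routine, not the campaign-management algorithms studied in the paper; in particular, the tie-break among candidates that reach a majority in the same round is an assumption.

```python
def bucklin_winner(preferences):
    """Return a Bucklin winner given voters' strict preference orders
    (lists of candidates, most preferred first). The winner is the
    candidate who first obtains a strict majority of approvals when
    voters successively approve their top 1, 2, ... candidates."""
    n_voters = len(preferences)
    candidates = set(preferences[0])
    for k in range(1, len(candidates) + 1):
        scores = {c: 0 for c in candidates}
        for order in preferences:
            for c in order[:k]:
                scores[c] += 1
        majority = {c: s for c, s in scores.items() if s > n_voters / 2}
        if majority:
            # Among candidates reaching a majority at this round, a common
            # tie-break (assumed here) is the largest k-approval score.
            return max(majority, key=majority.get)
    return None

votes = [["a", "b", "c"], ["b", "a", "c"], ["c", "b", "a"]]
print(bucklin_winner(votes))  # "b" reaches a strict majority at k = 2
```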

    On Backtracking in Real-time Heuristic Search

    Real-time heuristic search algorithms are suitable for situated agents that need to make their decisions in constant time. Since the original work by Korf nearly two decades ago, numerous extensions have been suggested. One of the most intriguing extensions is the idea of backtracking, wherein the agent decides to return to a previously visited state as opposed to moving forward greedily. This idea has been shown empirically to have a significant impact on various performance measures. The studies have been carried out in particular empirical testbeds with specific real-time search algorithms that use backtracking. Consequently, the extent to which the observed trends are characteristic of backtracking in general is unclear. In this paper, we present the first entirely theoretical study of backtracking in real-time heuristic search. In particular, we present upper bounds on the solution cost that are exponential and linear in a parameter regulating the amount of backtracking. The results hold for a wide class of real-time heuristic search algorithms that includes many existing algorithms as a small subclass.
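    The kind of threshold-regulated backtracking discussed here can be sketched as follows. This is a simplified, SLA*T-flavoured illustration in which a parameter T caps the cumulative heuristic learning before the agent retreats; it is an assumption-laden sketch, not any specific algorithm analysed in the paper.

```python
def backtracking_rta(start, goal, neighbors, cost, h, T=0.0, max_steps=10000):
    """LRTA*-style agent with threshold-controlled backtracking: whenever the
    heuristic of the current state is raised and the cumulative amount of
    learning exceeds T, the agent backtracks to its previous state instead
    of moving forward greedily toward the neighbor minimizing cost + h."""
    current, trail, learned, steps = start, [], 0.0, 0
    while current != goal and steps < max_steps:
        steps += 1
        best = min(neighbors(current), key=lambda s: cost(current, s) + h[s])
        new_h = cost(current, best) + h[best]
        raised = new_h > h[current]
        learned += max(0.0, new_h - h[current])
        h[current] = max(h[current], new_h)
        if raised and learned > T and trail:
            current = trail.pop()      # backtrack to the previously visited state
        else:
            trail.append(current)
            current = best             # move forward greedily
    return steps

# Tiny example: a path 0-1-2-3 with an initially zero heuristic.
h = {s: 0.0 for s in range(4)}
nbrs = lambda s: [x for x in (s - 1, s + 1) if 0 <= x <= 3]
print(backtracking_rta(0, 3, nbrs, lambda a, b: 1.0, h, T=2.0))  # moves taken to reach the goal
```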

    Code Generation = A* + BURS

    A system called BURS, which is based on term rewrite systems, is combined with the A* search algorithm to produce a code generator that generates optimal code. The theory underlying BURS is re-developed, formalised and explained in this work. The search algorithm uses a cost heuristic that is derived from the term rewrite system to direct the search. The advantage of using a search algorithm is that we need to compute only those costs that may be part of an optimal rewrite sequence.
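    To make the search side concrete, the following is a generic A* skeleton over rewrite sequences of the sort such a code generator could build on. States (assumed hashable), rewrite rules, and the cost heuristic, which must lower-bound the remaining rewrite cost for optimality, are left abstract; this is not the paper's BURS formalisation.

```python
import heapq
from itertools import count

def a_star(start, is_goal, successors, heuristic):
    """Generic A* over rewrite sequences: `successors(state)` yields
    (rule, next_state, cost) triples; `heuristic` must lower-bound the
    remaining rewrite cost for the returned sequence to be optimal."""
    tie = count()  # tie-breaker so f-value ties never compare states directly
    frontier = [(heuristic(start), next(tie), 0, start, [])]
    best_g = {start: 0}
    while frontier:
        f, _, g, state, seq = heapq.heappop(frontier)
        if is_goal(state):
            return seq, g              # optimal rewrite sequence and its cost
        for rule, nxt, cost in successors(state):
            ng = g + cost
            if ng < best_g.get(nxt, float("inf")):
                best_g[nxt] = ng
                heapq.heappush(frontier,
                               (ng + heuristic(nxt), next(tie), ng, nxt, seq + [rule]))
    return None, float("inf")
```

    With an admissible heuristic derived from the rewrite-rule costs, only rewrite sequences that could still be optimal are ever expanded, which is the advantage the abstract points out.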

    Searching and Stopping: An Analysis of Stopping Rules and Strategies

    Searching naturally involves stopping points, both at a query level (how far down the ranked list should I go?) and at a session level (how many queries should I issue?). Understanding when searchers stop has been of much interest to the community because it is fundamental to how we evaluate search behaviour and performance. Research has shown that searchers find it difficult to formalise stopping criteria, and typically resort to their intuition of what is "good enough". While various heuristics and stopping criteria have been proposed, little work has investigated how well they perform, and whether searchers actually conform to any of these rules. In this paper, we undertake the first large scale study of stopping rules, investigating how they influence overall session performance, and which rules best match actual stopping behaviour. Our work is focused on stopping at the query level in the context of ad-hoc topic retrieval, where searchers undertake search tasks within a fixed time period. We show that stopping strategies based upon the disgust or frustration point rules - both of which capture a searcher's tolerance to non-relevance - typically result in (i) the best overall performance, and (ii) the closest approximation to actual searcher behaviour, although a fixed-depth approach also performs remarkably well. Findings from this study have implications regarding how we build measures, and how we conduct simulations of search behaviours.
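    To illustrate how such stopping rules can be operationalised in a simulation, here is a small sketch with three interchangeable rules. The exact definitions of the disgust and frustration point rules in the paper may differ, so the total-tolerance and contiguous-tolerance variants below are labelled generically.

```python
def stopping_depth(relevance, total_tolerance=None, contiguous_tolerance=None, fixed_depth=None):
    """Return the rank at which a simulated searcher stops scanning a ranked
    list, under a tolerance to the total number of non-relevant items seen,
    a tolerance to contiguous non-relevant items, or a fixed-depth rule."""
    total_nonrel, run_nonrel = 0, 0
    for rank, rel in enumerate(relevance, start=1):
        if fixed_depth is not None and rank >= fixed_depth:
            return rank
        if rel:
            run_nonrel = 0
        else:
            total_nonrel += 1
            run_nonrel += 1
        if total_tolerance is not None and total_nonrel >= total_tolerance:
            return rank
        if contiguous_tolerance is not None and run_nonrel >= contiguous_tolerance:
            return rank
    return len(relevance)

ranked = [1, 0, 0, 1, 0, 0, 0, 1]      # 1 = relevant, 0 = non-relevant
print(stopping_depth(ranked, total_tolerance=3))       # stops at rank 5
print(stopping_depth(ranked, contiguous_tolerance=3))  # stops at rank 7
```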