    New prioritized value iteration for Markov decision processes

    The problem of solving large Markov decision processes accurately and quickly is challenging. Since the computational effort incurred is considerable, current research focuses on finding superior acceleration techniques. For instance, the convergence properties of current solution methods depend, to a great extent, on the order of backup operations. On one hand, algorithms such as topological sorting are able to find good orderings but their overhead is usually high. On the other hand, shortest path methods, such as Dijkstra's algorithm which is based on priority queues, have been applied successfully to the solution of deterministic shortest-path Markov decision processes. Here, we propose an improved value iteration algorithm based on Dijkstra's algorithm for solving shortest path Markov decision processes. The experimental results on a stochastic shortest-path problem show the feasibility of our approach. © Springer Science+Business Media B.V. 2011.
    García Hernández, MDG.; Ruiz Pinales, J.; Onaindia De La Rivaherrera, E.; Aviña Cervantes, JG.; Ledesma Orozco, S.; Alvarado Mendez, E.; Reyes Ballesteros, A. (2012). New prioritized value iteration for Markov decision processes. Artificial Intelligence Review. 37(2):157-167. doi:10.1007/s10462-011-9224-z
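
The abstract above turns on the idea that the order of backups matters and that a priority queue can choose that order. The following is a minimal sketch of that general idea, written in the style of prioritized sweeping; it is not the algorithm proposed in the paper, whose details are not given here. The function name `prioritized_value_iteration`, the dictionary layout of `transitions` and `cost`, and the three-state chain are all assumptions made for illustration.

```python
import heapq


def prioritized_value_iteration(states, actions, transitions, cost, goal,
                                eps=1e-6, max_backups=100_000):
    """Illustrative priority-queue value iteration for a stochastic
    shortest-path MDP (hypothetical helper, not the paper's method).

    transitions[(s, a)] is a list of (probability, next_state) pairs.
    cost[(s, a)] is the immediate cost of taking action a in state s.
    Assumes every non-goal state has at least one applicable action.
    """
    V = {s: 0.0 for s in states}

    # Predecessor sets: which states can reach s in one step.
    preds = {s: set() for s in states}
    for (s, _a), outcomes in transitions.items():
        for p, s_next in outcomes:
            if p > 0:
                preds[s_next].add(s)

    def backup(s):
        # Bellman backup: minimum expected cost-to-go over applicable actions.
        return min(
            cost[(s, a)] + sum(p * V[s2] for p, s2 in transitions[(s, a)])
            for a in actions if (s, a) in transitions
        )

    # Seed the queue with every non-goal state, largest residual first.
    # heapq is a min-heap, so priorities are stored negated.
    heap = []
    for s in states:
        if s != goal:
            residual = abs(backup(s) - V[s])
            if residual > eps:
                heapq.heappush(heap, (-residual, s))

    backups = 0
    while heap and backups < max_backups:
        _, s = heapq.heappop(heap)
        new_value = backup(s)
        if abs(new_value - V[s]) <= eps:
            continue  # stale queue entry, nothing left to fix here
        V[s] = new_value
        backups += 1
        # Only predecessors of s can have a changed residual.
        for sp in preds[s]:
            if sp == goal:
                continue
            residual = abs(backup(sp) - V[sp])
            if residual > eps:
                heapq.heappush(heap, (-residual, sp))
    return V


# Toy chain 0 -> 1 -> 2 (goal): "go" advances with probability 0.8,
# stays put with probability 0.2, and always costs 1.
states = [0, 1, 2]
actions = ["go"]
transitions = {
    (0, "go"): [(0.8, 1), (0.2, 0)],
    (1, "go"): [(0.8, 2), (0.2, 1)],
}
cost = {(0, "go"): 1.0, (1, "go"): 1.0}
V = prioritized_value_iteration(states, actions, transitions, cost, goal=2)
print({s: round(v, 3) for s, v in V.items()})
```

For this chain the expected costs can be worked out by hand: state 1 satisfies V = 1 + 0.2 V, so V = 1.25, and state 0 satisfies V = 1 + 0.8 (1.25) + 0.2 V, so V = 2.5. The sketch should approach those values from below. Stale queue entries are tolerated and filtered on pop, which keeps the code short at the price of some redundant heap operations.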

    Bridging the lesson distribution gap

    Paper presented at the 17th International Joint Conference on Artificial Intelligence, IJCAI 2001, Seattle, WA: pp. 987-992. Many organizations employ lessons learned (LL) processes to collect, analyze, store, and distribute validated experiential knowledge (lessons) of their members that, when reused, can substantially improve organizational decision processes. Unfortunately, deployed LL systems do not facilitate lesson reuse and fail to bring lessons to the attention of the users when and where they are needed and applicable (i.e., they fail to bridge the lesson distribution gap). Our approach for solving this problem, named monitored distribution, tightly integrates lesson distribution with these decision processes. We describe a case-based implementation of monitored distribution (ALDS) in a plan authoring tool suite (HICAP). We evaluate its utility in a simulated military planning domain. Our results show that monitored distribution can significantly improve plan evaluation measures for this domain.

    A flexible coupling approach to multi-agent planning under incomplete information

    The final publication is available at Springer via http://dx.doi.org/10.1007/s10115-012-0569-7
    Multi-agent planning (MAP) approaches are typically oriented at solving loosely coupled problems and are ineffective at dealing with more complex, strongly related problems. In most cases, agents work under complete information, building complete knowledge bases. The present article introduces a general-purpose MAP framework designed to tackle problems of any coupling level under incomplete information. Agents in our MAP model are partially unaware of the information managed by the rest of the agents and share only the critical information that affects other agents, thus maintaining a distributed vision of the task. Agents solve MAP tasks through the adoption of an iterative refinement planning procedure that uses single-agent planning technology. In particular, agents devise refinements through the partial-order planning paradigm, a flexible framework for building refinement plans that leaves unsolved details to be gradually completed by means of new refinements. Our proposal is supported by the implementation of a fully operative MAP system, and we report various experiments running our system on different types of MAP problems, from the most strongly related to the most loosely coupled.
    This work has been partly supported by the Spanish MICINN under projects Consolider Ingenio 2010 CSD2007-00022 and TIN2011-27652-C03-01, and the Valencian Prometeo project 2008/051.
    Torreño Lerma, A.; Onaindia De La Rivaherrera, E.; Sapena Vercher, O. (2014). A flexible coupling approach to multi-agent planning under incomplete information. Knowledge and Information Systems. 38:141-178. https://doi.org/10.1007/s10115-012-0569-7

    Allocation in Practice

    How do we allocate scarce resources? How do we fairly allocate costs? These are two pressing challenges facing society today. I discuss two recent projects at NICTA concerning resource and cost allocation. In the first, we have been working with FoodBank Local, a social startup working in collaboration with food bank charities around the world to optimise the logistics of collecting and distributing donated food. Before we can distribute this food, we must decide how to allocate it to different charities and food kitchens. This gives rise to a fair division problem with several new dimensions, rarely considered in the literature. In the second, we have been looking at cost allocation within the distribution network of a large multinational company. This also has several new dimensions rarely considered in the literature. Comment: To appear in Proc. of 37th edition of the German Conference on Artificial Intelligence (KI 2014), Springer LNC

    Variations on a Theme: A Bibliography on Approaches to Theorem Proving Inspired From Satchmo

    This article is a structured bibliography on theorem provers, approaches to theorem proving, and theorem proving applications inspired by Satchmo, the model generation theorem prover developed in the mid-1980s at ECRC, the European Computer-Industry Research Centre. Note that the bibliography given in this article is not exhaustive.