Towards Informed Exploration for Deep Reinforcement Learning
In this thesis, we discuss various techniques for improving exploration for deep reinforcement learning. We begin with a brief review of reinforcement learning (RL) and the fundamental exploration vs. exploitation trade-off. Then we review how deep RL has improved upon classical RL and summarize six categories of the latest exploration methods for deep RL, in order of increasing usage of prior information. We then explore representative works in three categories and discuss their strengths and weaknesses. The first category, represented by Soft Q-learning, uses regularization to encourage exploration. The second category, represented by count-based exploration via hashing, maps states to hash codes for counting and assigns higher exploration bonuses to less-encountered states. The third category utilizes hierarchy and is represented by a modular architecture for RL agents that play StarCraft II. Finally, we conclude that exploration by prior knowledge is a promising research direction and suggest topics of potentially high impact
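The count-based idea in the abstract above (hash each state, count visits per hash code, reward rarely seen codes) can be illustrated with a minimal sketch. The abstract does not specify the hash function or the bonus formula; the random-projection sign hash and the bonus beta / sqrt(count) used here are assumptions chosen for illustration, and the class and parameter names are hypothetical.

```python
import math
import random
from collections import defaultdict


class HashCountBonus:
    """Illustrative count-based exploration bonus (assumed form, not the thesis's code)."""

    def __init__(self, state_dim, n_bits=16, beta=0.1, seed=0):
        rng = random.Random(seed)
        # Assumption: a fixed random projection; the sign pattern is the hash code.
        self.projection = [
            [rng.gauss(0.0, 1.0) for _ in range(state_dim)] for _ in range(n_bits)
        ]
        self.beta = beta
        self.counts = defaultdict(int)

    def hash_code(self, state):
        # One bit per projection row: 1 if the dot product is non-negative.
        return tuple(
            1 if sum(w * x for w, x in zip(row, state)) >= 0 else 0
            for row in self.projection
        )

    def bonus(self, state):
        # Record the visit, then return a bonus that shrinks as the count grows.
        code = self.hash_code(state)
        self.counts[code] += 1
        return self.beta / math.sqrt(self.counts[code])


counter = HashCountBonus(state_dim=3)
first = counter.bonus([1.0, 2.0, 3.0])   # first visit to this hash code
second = counter.bonus([1.0, 2.0, 3.0])  # second visit, smaller bonus
```

In a training loop, the bonus would typically be added to the environment reward, so states whose hash codes have been seen less often look more attractive to the agent.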
Planning through Automatic Portfolio Configuration: The PbP Approach
In the field of domain-independent planning, several powerful planners implementing different techniques have been developed. However, none of these systems outperforms all others in every known benchmark domain. In this work, we propose a multi-planner approach that automatically configures a portfolio of planning techniques for each given domain. The configuration process for a given domain uses a set of training instances to: (i) compute and analyze some alternative sets of macro-actions for each planner in the portfolio, identifying a (possibly empty) useful set, (ii) select a cluster of planners, each one with the identified useful set of macro-actions, that is expected to perform best, and (iii) derive some additional information for configuring the execution scheduling of the selected planners at planning time. The resulting planning system, called PbP (Portfolio-based Planner), has two variants focusing on speed and plan quality. Different versions of PbP entered and won the learning track of the sixth and seventh International Planning Competitions. In this paper, we experimentally analyze PbP in depth, considering planning speed and plan quality. We provide a collection of results that help to understand PbP's behavior, and demonstrate the effectiveness of our approach to configuring a portfolio of planners with macro-actions
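Step (ii) above, choosing a cluster of planners from training results, can be sketched in a few lines. This is not PbP's actual selection procedure, which the abstract does not detail. It is a toy model under stated assumptions: planners in a cluster share CPU time equally, an instance is solved when the fastest member finishes, and unsolved instances cost a fixed timeout. The function name, data layout, and timeout value are hypothetical.

```python
from itertools import combinations


def select_cluster(runtimes, max_size=2, timeout=100.0):
    """Pick the planner subset with the lowest total simulated time.

    runtimes maps a planner name to a list of per-instance runtimes in
    seconds, with None where the planner failed on that training instance.
    """
    planners = sorted(runtimes)
    n_instances = len(next(iter(runtimes.values())))
    best_cluster, best_cost = None, float("inf")
    for size in range(1, max_size + 1):
        for cluster in combinations(planners, size):
            cost = 0.0
            for i in range(n_instances):
                solved = [
                    runtimes[p][i] for p in cluster if runtimes[p][i] is not None
                ]
                # Assumption: equal time-sharing, so the fastest planner
                # finishes after size * its stand-alone runtime.
                elapsed = size * min(solved) if solved else timeout
                cost += min(elapsed, timeout)
            if cost < best_cost:
                best_cluster, best_cost = cluster, cost
    return best_cluster, best_cost


training = {
    "A": [1.0, None, 2.0],
    "B": [None, 3.0, 50.0],
    "C": [10.0, 10.0, 10.0],
}
print(select_cluster(training))  # (('A', 'B'), 12.0)
```

In this toy data no single planner solves everything quickly, but planners A and B complement each other, which is the situation a portfolio is meant to exploit.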
On the Online Generation of Effective Macro-operators
Macro-operator (“macro”, for short) generation is a well-known technique that is used to speed up the planning process. Most published work on using macros in automated planning relies on an offline learning phase where training plans, that is, solutions of simple problems, are used to generate the macros. However, there might not always be a place to accommodate training. In this paper we propose OMA, an efficient method for generating useful macros without an offline learning phase, by utilising lessons learnt from existing macro learning techniques. Empirical evaluation with IPC benchmarks demonstrates performance improvement in a range of state-of-the-art planning engines, and provides insights into what macros can be generated without training
Water Window Ptychographic Imaging with Characterized Coherent X-rays
We report on a ptychographical coherent diffractive imaging experiment in the
water window with focused soft X-rays at . An X-ray beam with a
high degree of coherence was selected for ptychography at the P04 beamline of
the PETRA III synchrotron radiation source. We measured the beam coherence with
the newly developed non-redundant array method. A pinhole
in size selected the coherent part of the beam and was used for ptychographic
measurements of a lithographically manufactured test sample and a fossil diatom.
The achieved resolution was for the test sample and only
limited by the size of the detector. The diatom was imaged at a resolution
better than . Comment: 22 pages, 7 figures
Learning Useful Macro-actions for Planning with N-Grams
Automated planning has achieved significant breakthroughs in recent years. Nonetheless, attempts to improve search algorithm efficiency remain the primary focus of most research. However, it is also possible to build on previous searches and learn from previously found solutions. Our approach consists in learning macro-actions and adding them to the planner's domain. A macro-action is an action sequence selected for application at search time and applied as a single indivisible action. Carefully chosen macros can drastically improve planning performance by reducing the search space depth. However, macros also increase the branching factor. Therefore, the use of macros entails a utility problem: a trade-off has to be addressed between the benefit of adding macros to speed up the goal search and the overhead caused by increasing the branching factor in the search space. In this paper, we propose an online, domain- and planner-independent approach to learn 'useful' macros, i.e. macros that address the utility problem. These useful macros are obtained by statistical and heuristic filtering of a domain-specific macro library. The library is created from the most frequent action sequences derived from an n-gram analysis of successful plans previously computed by the planner. The relevance of this approach is demonstrated by experiments on International Planning Competition domains
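The library-building step described above (collecting the most frequent action sequences from earlier plans) can be sketched as a plain n-gram count. This covers only the counting; the statistical and heuristic filtering mentioned in the abstract is not shown, and the function name, plan encoding, and action names are hypothetical.

```python
from collections import Counter


def frequent_action_ngrams(plans, n=2, top_k=3):
    """Return the top_k most frequent length-n action sequences across plans."""
    counts = Counter()
    for plan in plans:
        # Slide a window of n consecutive actions over each plan.
        for i in range(len(plan) - n + 1):
            counts[tuple(plan[i:i + n])] += 1
    return counts.most_common(top_k)


plans = [
    ["pick", "move", "drop"],
    ["pick", "move", "drop", "move"],
    ["move", "pick", "move"],
]
print(frequent_action_ngrams(plans, n=2, top_k=2))
# [(('pick', 'move'), 3), (('move', 'drop'), 2)]
```

Each returned sequence is a candidate macro. Whether adding it pays off depends on the utility trade-off the abstract describes, since every added macro also widens the branching factor.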
Remote information management of an automated manufacturing system
Thesis (M. Tech.) -- Central University of Technology, Free State, 2007
With technology advancing, more and more people turn to the World Wide Web to conduct business. This may include buying and selling on the Web, advertising and monitoring of business activities.
There is a big need for software and systems that enable remote monitoring and controlling of business activities. The Mechatronics Research Group of the Faculty of Engineering, Information and Communication Technology at the Central University of Technology, Free State, has identified a similar need. This research group has created an Automated Manufacturing System around which research topics revolve. They want to monitor this Automated Manufacturing System from remote locations like their offices or, if possible, from home.
The Remote Information Management (RIM) System was developed using the Rapid Application Development (RAD) Methodology. This methodology was chosen because it is well suited to a changing environment, to systems that need to be developed very quickly, and to cases where most of the data is already available. This is a good description of the Automated Manufacturing System’s environment.
The RAD methodology consists of four stages: Requirements Planning, User Design, Rapid Construction and Transition. Project Management is used throughout these stages to ensure that the project goes according to plan. Development of the RIM system went through all four stages and project management was applied.
The final system consisted of a Web Page with Web Camera views of the Automated Manufacturing System. The application, which was developed using National Instruments LabVIEW, Microsoft Visual C++, and Microsoft Excel, is embedded in this Web Page. This application is called a Virtual Instrument (VI). The VI shows real-time data from the Automated Manufacturing System. Control over the VI can be granted and will allow the remote user to create reports on how many different products were produced and on system downtimes.
A system like the RIM System has advantages in the business world. It can enable telecommuting and will allow employees and managers to monitor (and even control) manufacturing systems, or any system connected to a PLC, from remote locations