EMBEDDED LEARNING ROBOT WITH FUZZY Q-LEARNING FOR OBSTACLE AVOIDANCE BEHAVIOR

Author: Anam Khairul
Publication date: 01/01/2009

Fuzzy Q-learning is an extension of the Q-learning algorithm that uses a fuzzy inference system to let Q-learning handle continuous states and actions. It has been applied to various robot learning tasks, such as obstacle avoidance and target searching. However, most of these applications have not been realized on an embedded robot. This paper presents an implementation of fuzzy Q-learning for obstacle-avoidance navigation on an embedded mobile robot. The experimental results demonstrate that fuzzy Q-learning enables the robot to learn the right policy, i.e., to avoid obstacles.
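
The idea in the abstract above can be illustrated with a minimal sketch of one common fuzzy Q-learning formulation. It is not taken from the paper: the three fuzzy sets over a normalized distance reading, the three steering commands, and the values of `alpha`, `gamma`, and `epsilon` are all assumptions chosen for illustration. Each fuzzy rule keeps its own q-values over the candidate actions; the continuous action is the firing-strength-weighted blend of the actions the rules pick, and the temporal-difference error is shared among the rules in proportion to how strongly they fired.

```python
import random

CENTERS = [0.0, 0.5, 1.0]   # assumed fuzzy sets over distance: near, medium, far
ACTIONS = [-1.0, 0.0, 1.0]  # assumed candidate steering commands per rule


def firing_strengths(distance):
    """Normalized triangular memberships for a distance reading clamped to [0, 1]."""
    d = min(1.0, max(0.0, distance))
    raw = [max(0.0, 1.0 - abs(d - c) / 0.5) for c in CENTERS]
    total = sum(raw)
    return [w / total for w in raw]


def select_actions(q, phi, epsilon, rng):
    """Pick one action index per rule (epsilon-greedy), then blend into one command."""
    chosen = []
    for row in q:
        if rng.random() < epsilon:
            chosen.append(rng.randrange(len(ACTIONS)))
        else:
            chosen.append(max(range(len(ACTIONS)), key=row.__getitem__))
    action = sum(p * ACTIONS[a] for p, a in zip(phi, chosen))
    q_value = sum(p * q[i][a] for i, (p, a) in enumerate(zip(phi, chosen)))
    return chosen, action, q_value


def update(q, phi, chosen, q_value, reward, next_phi, alpha=0.1, gamma=0.9):
    """Distribute the TD error over the rules in proportion to their firing strength."""
    next_value = sum(p * max(row) for p, row in zip(next_phi, q))
    td_error = reward + gamma * next_value - q_value
    for i, (p, a) in enumerate(zip(phi, chosen)):
        q[i][a] += alpha * td_error * p
    return td_error


# One learning step with a hypothetical reward of 1.0 for moving away from an obstacle.
q_table = [[0.0] * len(ACTIONS) for _ in CENTERS]
rng = random.Random(0)
phi = firing_strengths(0.25)
chosen, action, q_value = select_actions(q_table, phi, epsilon=0.0, rng=rng)
update(q_table, phi, chosen, q_value, reward=1.0, next_phi=firing_strengths(1.0))
```

On a real embedded robot the distance would come from a range sensor and the reward from a collision or proximity signal; both are left out of this sketch.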

EEPIS Repository

Active inference, evidence accumulation, and the urn task

Author: Attias H., Beal M. J., Karl Friston, Michael Moutoussis, Philipp Schwartenbeck, Raymond J. Dolan, Thomas H. B. FitzGerald
Publication venue: 'MIT Press - Journals'
Publication date: 01/01/2015

Deciding how much evidence to accumulate before making a decision is a problem we and other animals often face, but one that is not completely understood. This issue is particularly important because a tendency to sample less information (often known as reflection impulsivity) is a feature in several psychopathologies, such as psychosis. A formal understanding of information sampling may therefore clarify the computational anatomy of psychopathology. In this theoretical letter, we consider evidence accumulation in terms of active (Bayesian) inference using a generic model of Markov decision processes. Here, agents are equipped with beliefs about their own behavior--in this case, that they will make informed decisions. Normative decision making is then modeled using variational Bayes to minimize surprise about choice outcomes. Under this scheme, different facets of belief updating map naturally onto the functional anatomy of the brain (at least at a heuristic level). Of particular interest is the key role played by the expected precision of beliefs about control, which we have previously suggested may be encoded by dopaminergic neurons in the midbrain. We show that manipulating expected precision strongly affects how much information an agent characteristically samples, and thus provides a possible link between impulsivity and dopaminergic dysfunction. Our study therefore represents a step toward understanding evidence accumulation in terms of neurobiologically plausible Bayesian inference and may cast light on why this process is disordered in psychopathology
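
The evidence-accumulation problem described above can be made concrete with a small sketch. This is not the paper's active inference scheme: it uses plain Bayesian updating over two urns and a fixed confidence threshold as a stand-in for the decision rule. The bead proportion `p_majority`, the flat prior, and the threshold values are assumptions for illustration only. Lowering the threshold makes the agent decide after fewer draws, which loosely mirrors the tendency to sample less information.

```python
def posterior_urn_a(draws, p_majority=0.85, prior_a=0.5):
    """Posterior probability that the beads come from urn A (mostly red).

    `draws` is a sequence of "r" (red) or "g" (green) beads; urn B is assumed
    to be the mirror image of urn A (mostly green).
    """
    weight_a, weight_b = prior_a, 1.0 - prior_a
    for bead in draws:
        weight_a *= p_majority if bead == "r" else 1.0 - p_majority
        weight_b *= 1.0 - p_majority if bead == "r" else p_majority
    return weight_a / (weight_a + weight_b)


def draws_until_decision(sequence, threshold, p_majority=0.85):
    """Number of draws needed before either urn's posterior reaches `threshold`.

    Returns None if the threshold is never reached within `sequence`.
    """
    for n in range(1, len(sequence) + 1):
        p = posterior_urn_a(sequence[:n], p_majority)
        if max(p, 1.0 - p) >= threshold:
            return n
    return None


# The same bead sequence under three hypothetical decision thresholds.
beads = "rrgrr"
for threshold in (0.80, 0.95, 0.99):
    print(threshold, draws_until_decision(beads, threshold))
```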

Crossref

PubMed Central

University of East Anglia digital repository

MPG.PuRe

Operational risk management and new computational needs in banks

Author: Duc PHAM-HI

Basel II banking regulation introduces new needs for computational schemes. They involve both optimal stochastic control and large-scale simulations of decision processes for preventing low-frequency, high-loss-impact events. This paper first states the problem and presents its parameters. It then spells out the equations that represent rational risk management behavior and link together the variables: Levy processes are used to model operational risk losses where calibration by historical loss databases is possible; where it is not, qualitative variables such as the quality of the business environment and internal controls can provide both cost-side and profit-side impacts. Other control variables include the business growth rate and the efficiency of risk mitigation. The economic value of a policy is maximized by solving the resulting Hamilton-Jacobi-Bellman (HJB) type equation. Computational complexity arises from embedded interactions between 3 levels:
* Programming a globally optimal dynamic expenditure budget in the Basel II context,
* Arbitraging between the cost of risk-reduction policies (as measured by organizational qualitative scorecards and insurance buying) and the impact of the incurred losses themselves. This implies modeling the efficiency of the process through which forward-looking threat-minimization measures can actually reduce stochastic losses,
* And optimal allocation according to profitability across subsidiaries and business lines.
The paper next reviews the different types of approaches that can be envisaged in deriving a sound budgetary policy solution for operational risk management, based on this HJB equation. It is argued that while this complex, high-dimensional problem can be solved with some usual simplifications (Galerkin approach, imposing Merton-form solutions, viscosity approach, ad hoc utility functions that provide closed-form solutions, etc.), the main interest of this model lies in exploring the scenarios in an adaptive learning framework (MDP, partially observed MDP, Q-learning, neuro-dynamic programming, greedy algorithm, etc.). This makes more sense from a management point of view, and solutions are more easily communicated to, and accepted by, operational-level staff in banks through the explicit scenarios that can be derived. This kind of approach combines different computational techniques such as POMDP, stochastic control theory, and learning algorithms under uncertainty and incomplete information. The paper concludes by presenting the benefits of such a consistent computational approach to managing budgets, as opposed to an operational risk management policy made up of disconnected expenditures. Such consistency satisfies the qualifying criteria for banks to apply for the AMA (Advanced Measurement Approach), which will allow large savings in the regulatory capital charge under the Basel II Accord.
REGULAR - Operational risk management, HJB equation, Levy processes, budget optimization, capital allocation

Research Papers in Economics