Model-free reinforcement learning for stochastic parity games
This paper investigates the use of model-free reinforcement learning to compute the optimal value in two-player stochastic games with parity objectives. In this setting, two decision makers, player Min and player Max, compete on a finite game arena - a stochastic game graph with unknown but fixed probability distributions - to minimize and maximize, respectively, the probability of satisfying a parity objective. We give a reduction from stochastic parity games to a family of stochastic reachability games with a parameter Δ, such that the value of a stochastic parity game equals the limit of the values of the corresponding simple stochastic games as the parameter Δ tends to 0. Since this reduction does not require knowledge of the probabilistic transition structure of the underlying game arena, model-free reinforcement learning algorithms, such as minimax Q-learning, can be used to approximate the value and mutual best-response strategies for both players in the underlying stochastic parity game. We also present a streamlined reduction from 1½-player parity games to reachability games that avoids recourse to nondeterminism. Finally, we report on experimental evaluations of both reductions.
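The learning step described in this abstract can be illustrated on a toy example. The following is a minimal sketch, not the paper's algorithm: it assumes a hypothetical four-state turn-based stochastic reachability game (the names `ARENA`, `s0`, `s1`, `goal`, `sink` and all probabilities are invented for illustration) and runs tabular Q-learning with max/min backups. The reduction from parity to reachability with parameter Δ is not reproduced here. The learner only sees sampled successors, never the transition probabilities.

```python
import random

# Hypothetical toy arena (an assumption, not from the paper):
# state -> (owner, {action: [(probability, next_state), ...]})
ARENA = {
    "s0": ("max", {"a": [(0.5, "goal"), (0.5, "sink")],
                   "b": [(1.0, "s1")]}),
    "s1": ("min", {"c": [(0.8, "goal"), (0.2, "sink")],
                   "d": [(0.3, "goal"), (0.7, "sink")]}),
}
# Reaching "goal" is worth 1, being trapped in "sink" is worth 0.
TERMINAL_VALUE = {"goal": 1.0, "sink": 0.0}


def sample_next(rng, outcomes):
    """Draw one successor; this is the only access the learner has to the arena."""
    r = rng.random()
    acc = 0.0
    for prob, nxt in outcomes:
        acc += prob
        if r < acc:
            return nxt
    return outcomes[-1][1]


def state_value(q, state):
    """Max player maximizes and Min player minimizes over the learned Q-values."""
    if state in TERMINAL_VALUE:
        return TERMINAL_VALUE[state]
    owner, actions = ARENA[state]
    values = [q[(state, a)] for a in actions]
    return max(values) if owner == "max" else min(values)


def minimax_q_learning(episodes=20000, seed=0):
    """Tabular Q-learning with uniform exploration and 1/n step sizes."""
    rng = random.Random(seed)
    q = {(s, a): 0.0 for s, (_, acts) in ARENA.items() for a in acts}
    visits = {key: 0 for key in q}
    for _ in range(episodes):
        state = "s0"
        while state not in TERMINAL_VALUE:
            _, actions = ARENA[state]
            action = rng.choice(sorted(actions))
            nxt = sample_next(rng, actions[action])
            visits[(state, action)] += 1
            alpha = 1.0 / visits[(state, action)]
            target = state_value(q, nxt)
            q[(state, action)] += alpha * (target - q[(state, action)])
            state = nxt
    return q


if __name__ == "__main__":
    q = minimax_q_learning()
    print("estimated value of s0:", round(state_value(q, "s0"), 3))
    print("estimated value of s1:", round(state_value(q, "s1"), 3))
```

In this toy arena Min's best choice in `s1` is `d` (reachability probability 0.3), so Max prefers `a` in `s0` (probability 0.5); the learned estimates should approach those values as the number of episodes grows.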
Calculations of giant magnetoresistance in Fe/Cr trilayers using layer potentials determined from ab-initio methods
The ab initio full-potential linearized augmented plane-wave method
explicitly designed for the slab geometry was employed to elucidate the
physical origin of the layer potentials for the trilayers nFe/3Cr/nFe(001),
where n is the number of Fe monolayers. The thickness of the transition-metal
ferromagnet was varied up to n=8, while the spacer thickness was
fixed to 3 monolayers. The calculated potentials were inserted in the
Fuchs-Sondheimer formalism in order to calculate the giant magnetoresistance
(GMR) ratio. The predicted GMR ratio was compared with the experiment and the
oscillatory behavior of the GMR as a function of the ferromagnetic layer
thickness was discussed in the context of the layer potentials. The reported
results confirm that the interface monolayers play a dominant role in the
intrinsic GMR.
Comment: 17 pages, 7 figures, 3 tables. Accepted in J. Phys.: Cond. Matter
A phenomenological approach to the simulation of metabolism and proliferation dynamics of large tumour cell populations
A major goal of modern computational biology is to simulate the collective
behaviour of large cell populations starting from the intricate web of
molecular interactions occurring at the microscopic level. In this paper we
describe a simplified model of cell metabolism, growth and proliferation,
suitable for inclusion in a multicell simulator, now under development
(Chignola R and Milotti E 2004 Physica A 338 261-6). Nutrients regulate the
proliferation dynamics of tumor cells which adapt their behaviour to respond to
changes in the biochemical composition of the environment. This modeling of
nutrient metabolism and cell cycle at a mesoscopic scale level leads to a
continuous flow of information between the two disparate spatiotemporal scales
of molecular and cellular dynamics that can be simulated with modern computers
and tested experimentally.
Comment: 58 pages, 7 figures, 3 tables, pdf only
Reward Shaping for Reinforcement Learning with Omega-Regular Objectives
Recently, successful approaches have exploited good-for-MDPs automata (Büchi automata with a restricted form of nondeterminism) for model-free reinforcement learning. This class of automata subsumes good-for-games automata and the most widespread class of limit-deterministic automata. These Büchi automata are useful because, for good-for-MDPs automata, the Büchi condition can be translated to reachability. The drawback of this translation is that the rewards are, on average, reaped very late, which requires long episodes during the learning process. We devise a new reward shaping approach that overcomes this issue. We show that the resulting model is equivalent to a discounted payoff objective with a biased discount that simplifies and improves on prior work in this direction.
Environmental contaminants as etiologic factors for diabetes.
For both type 1 and type 2 diabetes mellitus, the rates have been increasing in the United States and elsewhere; rates vary widely by country, and genetic factors account for less than half of new cases. These observations suggest environmental factors cause both type 1 and type 2 diabetes. Occupational exposures have been associated with increased risk of diabetes. In addition, recent data suggest that toxic substances in the environment, other than infectious agents or exposures that stimulate an immune response, are associated with the occurrence of these diseases. We reviewed the epidemiologic data that addressed whether environmental contaminants might cause type 1 or type 2 diabetes. For type 1 diabetes, higher intake of nitrates, nitrites, and N-nitroso compounds, as well as higher serum levels of polychlorinated biphenyls, have been associated with increased risk. Overall, however, the data were limited or inconsistent. With respect to type 2 diabetes, data on arsenic and 2,3,7,8-tetrachlorodibenzo-p-dioxin relative to risk were suggestive of a direct association but were inconclusive. The occupational data suggested that more data on exposure to N-nitroso compounds, arsenic, dioxins, talc, and straight oil machining fluids in relation to diabetes would be useful. Although environmental factors other than contaminants may account for the majority of type 1 and type 2 diabetes, the etiologic role of several contaminants and occupational exposures deserves further study.
Synthesis and Characterization of Catalytically Active Ni(II) Complexes with Bis(phenol)diamine Ligands
A novel N,N′-dimethylethylenediamine derivative of substituted bis(phenol)diamine ligands, namely 2-(tert-butyl)-4-methylphenol in H2L1, was synthesized by a convenient green procedure. The nickel(II) complex [NiL1] 1 was synthesized and characterized by various methods, and its crystal structure was determined. The Ni(II) coordination center in the mononuclear complex is surrounded by two phenolate oxygen atoms and two amine nitrogen atoms of the ligand in a square-planar arrangement. The magnetic susceptibility of the title complex indicates paramagnetic behavior above 150 K and strong ferromagnetism below 100 K. Furthermore, the cyclic voltammetry studies show two ligand-centered oxidations of the phenolate groups to phenoxyl radicals and the metal-centered reduction of Ni(II) to Ni(0). The Glaser coupling reaction of phenylacetylene was also studied. Strong catalytic activity at room temperature in THF is observed for 1 in the presence of zinc powder as a reducing agent. Full conversion was achieved after 7 h at 25 °C. The DFT analysis corroborates that the square-planar NiO2N2 chromophore of 1 is reduced to catalytically active Ni(0) by the applied Zn. The calculated Gibbs free energy of the reaction leading to the formation of the substrate Ni-complex is favorable endothermic. Most of the data for 1 were also obtained for the very similar, previously reported [NiL2] 2, with 2,4-di-tert-butylphenol in H2L2, and the two were then compared.
Optimal Control for Multi-mode Systems with Discrete Costs
This paper studies optimal time-bounded control in multi-mode systems with
discrete costs. Multi-mode systems are an important subclass of linear hybrid
systems, in which there are no guards on transitions and all invariants are
global. Each state has a continuous cost attached to it, which is linear in the
sojourn time, while a discrete cost is attached to each transition taken. We
show that an optimal control for this model can be computed in NEXPTIME and
approximated in PSPACE. We also show that the one-dimensional case is simpler:
although the problem is NP-complete (and in LOGSPACE for an infinite time
horizon), we develop an FPTAS for finding an approximate solution.
Comment: extended version of a FORMATS 2017 paper
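The cost structure described in this abstract (a continuous cost linear in the sojourn time plus a discrete cost per transition) can be made concrete with a small calculation. This is only a sketch under stated assumptions: the mode names, rates, and switching costs are hypothetical, a schedule is represented as a list of (mode, duration) pairs, and the continuous dynamics and time bound of the actual model are left out.

```python
def schedule_cost(schedule, rate, switch_cost):
    """Total cost of a schedule given as [(mode, sojourn_time), ...].

    rate[mode] is the continuous cost per time unit spent in that mode;
    switch_cost[(src, dst)] is the discrete cost of the transition src -> dst.
    """
    total = 0.0
    for i, (mode, duration) in enumerate(schedule):
        # Continuous part: linear in the time spent in the mode.
        total += rate[mode] * duration
        # Discrete part: charged once for each transition taken.
        if i > 0:
            previous_mode = schedule[i - 1][0]
            total += switch_cost[(previous_mode, mode)]
    return total


# Hypothetical two-mode example (all numbers are illustrative assumptions).
RATE = {"heat": 2.0, "idle": 0.5}
SWITCH = {("heat", "idle"): 3.0, ("idle", "heat"): 3.0}
EXAMPLE = [("heat", 1.5), ("idle", 2.0)]

if __name__ == "__main__":
    # 2.0 * 1.5 + 0.5 * 2.0 + 3.0 = 7.0
    print(schedule_cost(EXAMPLE, RATE, SWITCH))
```

Because the discrete cost does not shrink with the sojourn time, frequent switching is penalized even when each individual sojourn is short, which is what separates this setting from one with continuous costs alone.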
The Complexity of Nash Equilibria in Stochastic Multiplayer Games
We analyse the computational complexity of finding Nash equilibria in
turn-based stochastic multiplayer games with omega-regular objectives. We show
that restricting the search space to equilibria whose payoffs fall into a
certain interval may lead to undecidability. In particular, we prove that the
following problem is undecidable: Given a game G, does there exist a Nash
equilibrium of G where Player 0 wins with probability 1? Moreover, this problem
remains undecidable when restricted to pure strategies or (pure) strategies
with finite memory. One way to obtain a decidable variant of the problem is to
restrict the strategies to be positional or stationary. For the complexity of
these two problems, we obtain a common lower bound of NP and upper bounds of NP
and PSPACE respectively. Finally, we single out a special case of the general
problem that, in many cases, admits an efficient solution. In particular, we
prove that deciding the existence of an equilibrium in which each player either
wins or loses with probability 1 can be done in polynomial time for games where
the objective of each player is given by a parity condition with a bounded
number of priorities.