
    Model-free reinforcement learning for stochastic parity games

    This paper investigates the use of model-free reinforcement learning to compute the optimal value in two-player stochastic games with parity objectives. In this setting, two decision makers, player Min and player Max, compete on a finite game arena (a stochastic game graph with unknown but fixed probability distributions) to minimize and maximize, respectively, the probability of satisfying a parity objective. We give a reduction from stochastic parity games to a family of stochastic reachability games with a parameter Δ, such that the value of a stochastic parity game equals the limit of the values of the corresponding simple stochastic games as the parameter Δ tends to 0. Since this reduction does not require knowledge of the probabilistic transition structure of the underlying game arena, model-free reinforcement learning algorithms, such as minimax Q-learning, can be used to approximate the value and mutual best-response strategies for both players in the underlying stochastic parity game. We also present a streamlined reduction from 1½-player parity games to reachability games that avoids recourse to nondeterminism. Finally, we report on the experimental evaluations of both reductions.

    Calculations of giant magnetoresistance in Fe/Cr trilayers using layer potentials determined from ab initio methods

    The ab initio full-potential linearized augmented plane-wave method explicitly designed for the slab geometry was employed to elucidate the physical origin of the layer potentials for the trilayers nFe/3Cr/nFe(001), where n is the number of Fe monolayers. The thickness of the transition-metal ferromagnet was varied from n=1 up to n=8, while the spacer thickness was fixed at 3 monolayers. The calculated potentials were inserted into the Fuchs-Sondheimer formalism in order to calculate the giant magnetoresistance (GMR) ratio. The predicted GMR ratio was compared with experiment, and the oscillatory behavior of the GMR as a function of the ferromagnetic layer thickness was discussed in the context of the layer potentials. The reported results confirm that the interface monolayers play a dominant role in the intrinsic GMR. Comment: 17 pages, 7 figures, 3 tables. Accepted in J. Phys.: Cond. Matter

    A phenomenological approach to the simulation of metabolism and proliferation dynamics of large tumour cell populations

    A major goal of modern computational biology is to simulate the collective behaviour of large cell populations starting from the intricate web of molecular interactions occurring at the microscopic level. In this paper we describe a simplified model of cell metabolism, growth and proliferation, suitable for inclusion in a multicell simulator, now under development (Chignola R and Milotti E 2004 Physica A 338 261-6). Nutrients regulate the proliferation dynamics of tumour cells, which adapt their behaviour to respond to changes in the biochemical composition of the environment. This modeling of nutrient metabolism and cell cycle at a mesoscopic scale leads to a continuous flow of information between the two disparate spatiotemporal scales of molecular and cellular dynamics that can be simulated with modern computers and tested experimentally. Comment: 58 pages, 7 figures, 3 tables, pdf only

    Reward Shaping for Reinforcement Learning with Omega-Regular Objectives

    Recently, successful approaches have been made to exploit good-for-MDPs automata for model-free reinforcement learning. These are Büchi automata with a restricted form of nondeterminism, a class that subsumes good-for-games automata and the most widespread class of limit-deterministic automata. The foundation of using these Büchi automata is that the Büchi condition can, for good-for-MDPs automata, be translated to reachability. The drawback of this translation is that the rewards are, on average, reaped very late, which requires long episodes during the learning process. We devise a new reward shaping approach that overcomes this issue. We show that the resulting model is equivalent to a discounted payoff objective with a biased discount that simplifies and improves on prior work in this direction.

    Environmental contaminants as etiologic factors for diabetes.

    For both type 1 and type 2 diabetes mellitus, the rates have been increasing in the United States and elsewhere; rates vary widely by country, and genetic factors account for less than half of new cases. These observations suggest environmental factors cause both type 1 and type 2 diabetes. Occupational exposures have been associated with increased risk of diabetes. In addition, recent data suggest that toxic substances in the environment, other than infectious agents or exposures that stimulate an immune response, are associated with the occurrence of these diseases. We reviewed the epidemiologic data that addressed whether environmental contaminants might cause type 1 or type 2 diabetes. For type 1 diabetes, higher intake of nitrates, nitrites, and N-nitroso compounds, as well as higher serum levels of polychlorinated biphenyls, have been associated with increased risk. Overall, however, the data were limited or inconsistent. With respect to type 2 diabetes, data on arsenic and 2,3,7,8-tetrachlorodibenzo-p-dioxin relative to risk were suggestive of a direct association but were inconclusive. The occupational data suggested that more data on exposure to N-nitroso compounds, arsenic, dioxins, talc, and straight oil machining fluids in relation to diabetes would be useful. Although environmental factors other than contaminants may account for the majority of type 1 and type 2 diabetes, the etiologic role of several contaminants and occupational exposures deserves further study.

    Synthesis and Characterization of Catalytically Active Ni(II) Complexes with Bis(phenol)diamine Ligands

    A novel N,N’-dimethylethylenediamine derivative of substituted bis(phenol)diamine ligands, namely 2-(tert-butyl)-4-methylphenol in H2L1, was synthesized by a convenient green procedure. The nickel(II) complex [NiL1] 1 has been synthesized and characterized by various methods, and its crystal structure has been determined. The Ni(II) coordination center in the mononuclear complex is surrounded by two phenolate oxygen atoms and two amine nitrogen atoms of the ligand in a square-planar arrangement. The magnetic susceptibility of the title complex indicates paramagnetic behavior above 150 K and strong ferromagnetism below 100 K. Furthermore, the cyclic voltammetry studies show two ligand-centered oxidations of the phenolate groups to phenoxyl radicals and the metal-centered reduction of Ni(II) to Ni(0). The Glaser coupling reaction of phenylacetylene was also studied. Strong catalytic activity at room temperature in THF solvent is observed for 1 in the presence of zinc powder as a reducing agent. Full conversion was achieved after 7 h at 25 °C. The DFT analysis corroborates that the square-planar NiO2N2 chromophore of 1 is reduced to catalytically active Ni(0) by the applied Zn. The calculated Gibbs free energy of the reaction leading to the formation of the substrate Ni-complex is favorable endothermic. Most of the data for 1 were also obtained for the very similar, previously reported [NiL2] 2, with 2,4-di-tert-butylphenol in H2L2, and the two sets of results were then compared.

    Optimal Control for Multi-mode Systems with Discrete Costs

    This paper studies optimal time-bounded control in multi-mode systems with discrete costs. Multi-mode systems are an important subclass of linear hybrid systems, in which there are no guards on transitions and all invariants are global. Each state has a continuous cost attached to it, which is linear in the sojourn time, while a discrete cost is attached to each transition taken. We show that an optimal control for this model can be computed in NEXPTIME and approximated in PSPACE. We also show that the one-dimensional case is simpler: although the problem is NP-complete (and in LOGSPACE for an infinite time horizon), we develop an FPTAS for finding an approximate solution. Comment: extended version of a FORMATS 2017 paper

    The Complexity of Nash Equilibria in Stochastic Multiplayer Games

    We analyse the computational complexity of finding Nash equilibria in turn-based stochastic multiplayer games with omega-regular objectives. We show that restricting the search space to equilibria whose payoffs fall into a certain interval may lead to undecidability. In particular, we prove that the following problem is undecidable: Given a game G, does there exist a Nash equilibrium of G where Player 0 wins with probability 1? Moreover, this problem remains undecidable when restricted to pure strategies or (pure) strategies with finite memory. One way to obtain a decidable variant of the problem is to restrict the strategies to be positional or stationary. For the complexity of these two problems, we obtain a common lower bound of NP and upper bounds of NP and PSPACE, respectively. Finally, we single out a special case of the general problem that, in many cases, admits an efficient solution. In particular, we prove that deciding the existence of an equilibrium in which each player either wins or loses with probability 1 can be done in polynomial time for games where the objective of each player is given by a parity condition with a bounded number of priorities.