174 research outputs found

    Minimax Weight Learning for Absorbing MDPs

    Reinforcement learning policy evaluation problems are often modeled as finite-horizon MDPs or as discounted/averaged infinite-horizon MDPs. In this paper, we study undiscounted off-policy policy evaluation for absorbing MDPs. Given a dataset consisting of i.i.d. episodes with a given truncation level, we propose an algorithm, called MWLA, that directly estimates the expected return via the importance ratio of the state-action occupancy measure. A Mean Square Error (MSE) bound for the MWLA method is established, and the dependence of the statistical error on the data size and the truncation level is analyzed. Computational experiments in an episodic taxi environment illustrate the performance of the MWLA algorithm. Comment: 36 pages, 9 figures
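    The estimation step described in this abstract can be illustrated with a minimal sketch. It assumes the importance ratio w(s, a) of the state-action occupancy measures is already available, and estimates the undiscounted return as the per-episode average of the ratio-weighted rewards. The function name `weighted_return`, the toy one-step MDP, and the hand-computed ratio are illustrative assumptions and are not taken from the paper; in particular, the minimax step that learns the ratio from data is not shown.

```python
def weighted_return(episodes, weight):
    """Average over episodes of sum_t weight(s_t, a_t) * r_t.

    episodes: list of episodes, each a list of (state, action, reward) tuples
              collected under the behavior policy.
    weight:   callable (state, action) -> occupancy importance ratio
              (assumed given here, not learned).
    """
    total = 0.0
    for episode in episodes:
        total += sum(weight(s, a) * r for (s, a, r) in episode)
    return total / len(episodes)


# Toy absorbing MDP (hypothetical): one state, two actions, absorbed after one step.
behavior = {"a0": 0.5, "a1": 0.5}   # behavior policy that generated the data
target = {"a0": 0.9, "a1": 0.1}     # target policy to be evaluated


def ratio(state, action):
    # In a one-step problem the occupancy ratio reduces to the policy ratio.
    return target[action] / behavior[action]


# One episode per action; reward 1 for a0, reward 0 for a1.
episodes = [[("s0", "a0", 1.0)], [("s0", "a1", 0.0)]]
estimate = weighted_return(episodes, ratio)
print(estimate)
```

    In this toy case the target policy's true expected return is 0.9 (the probability of choosing a0), which the weighted average recovers from data collected under the uniform behavior policy.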

    Global asymptotic behavior and boundedness of positive solutions to an odd-order rational difference equation

    Abstract: In this note we consider the following high-order rational difference equation $x_n = 1 + \frac{\prod_{i=1}^{k}(1 - x_{n-i})}{\sum_{i=1}^{k} x_{n-i}}$, $n = 0, 1, \ldots$, where $k \ge 3$ is an odd number and $x_{-k}, x_{-k+1}, x_{-k+2}, \ldots, x_{-1}$ are positive numbers. We obtain the boundedness of positive solutions of the above equation and, using perturbation of the initial values together with the transformation method, prove that the positive equilibrium point of this equation is globally asymptotically stable
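    The stability claim can be checked numerically with a short sketch. It assumes the equation reads x_n = 1 + prod_{i=1..k}(1 - x_{n-i}) / sum_{i=1..k} x_{n-i}, for which x = 1 is an equilibrium. The function name `iterate` and the initial values are illustrative choices, not taken from the paper.

```python
from math import prod


def iterate(initial, steps):
    """Iterate x_n = 1 + prod(1 - x_{n-i}) / sum(x_{n-i}), i = 1..k.

    initial: the k positive starting values x_{-k}, ..., x_{-1}.
    steps:   number of new terms to generate.
    """
    k = len(initial)
    xs = list(initial)
    for _ in range(steps):
        window = xs[-k:]  # the k most recent terms
        xs.append(1 + prod(1 - x for x in window) / sum(window))
    return xs


# k = 3 (odd), with arbitrary positive initial values.
trajectory = iterate([0.5, 2.0, 1.5], steps=50)
print(trajectory[-1])
```

    For these starting values the iterates settle near 1 within a few steps, consistent with the stated global asymptotic stability of the positive equilibrium under the assumed form of the equation.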

    Singular Orbits and Dynamics at Infinity of a Conjugate Lorenz-Like System

    A conjugate Lorenz-like system which includes only two quadratic nonlinearities is proposed in this paper. Some basic properties of this system, such as the distribution and stability of its equilibria, its Lyapunov exponents, and its bifurcations, are investigated through numerical and theoretical analysis. The forming mechanisms of the compound structures of its new chaotic attractors, obtained by merging two simple attractors after performing one mirror operation, are also presented. Furthermore, some of its other complex dynamical behaviours, including the existence of singularly degenerate heteroclinic cycles, the existence of homoclinic and heteroclinic orbits, and the dynamics at infinity, are formulated in detail. Finally, some problems deserving further investigation are presented

    Bifurcation analysis in a discrete predator–prey model with herd behaviour and group defense

    In this paper, we utilize the semi-discretization method to construct a discrete model from a continuous predator-prey model with herd behaviour and group defense. Specifically, some new results for the transcritical bifurcation, the period-doubling bifurcation, and the Neimark-Sacker bifurcation are derived by using the center manifold theorem and bifurcation theory. Novelty includes a smooth transition from individual behaviour (low number of prey) to herd behaviour (large number of prey). Our results not only formulate simpler forms for the existence conditions of these bifurcations, but also clearly present the conditions for the direction and stability of the bifurcated closed orbits. Numerical simulations are also given to illustrate the existence of the derived Neimark-Sacker bifurcation

    Period-doubling bifurcation and Neimark-Sacker bifurcation of a discrete predator-prey model with Allee effect and cannibalism

    In this paper, a discrete predator-prey model incorporating the Allee effect and cannibalism is derived from its continuous version by the semidiscretization method. Not only are the existence and local stability of the fixed points of the discrete system investigated but, more importantly, sufficient conditions for the occurrence of its period-doubling bifurcation and Neimark-Sacker bifurcation are obtained using the center manifold theorem and local bifurcation theory. Finally, some numerical simulations are given to illustrate the existence of the Neimark-Sacker bifurcation. The outcome of the study reveals that this discrete system undergoes various bifurcations, including period-doubling bifurcation and Neimark-Sacker bifurcation

    Triolein-based polycation lipid nanocarrier for efficient gene delivery: characteristics and mechanism

    We proposed to develop a polycation lipid nanocarrier (PLN) with higher transfection efficiency than our previously described polycation nanostructured lipid nanocarrier (PNLC). PLN was composed of triolein, cetylated low-molecular-weight polyethylenimine, and dioleoyl phosphatidylethanolamine. The physicochemical properties of PLN and the PLN/DNA complexes (PDC) were characterized. In vitro transfection was performed in human lung adenocarcinoma (SPC-A1) cells, and the intracellular mechanism was investigated as well. The measurements indicated that PLN and PDC are homogeneous nanometer-sized particles with a positive charge. The transfection efficiency of PDC increased significantly with the content of triolein and was higher than that of PNLC and commercial Lipofectamine™ 2000. In particular, transfection with PLN in the presence of 10% serum was more effective than in its absence. Using the specific inhibitors chlorpromazine and filipin, the clathrin-dependent endocytosis pathway was determined to be the main contributor to the successful transfection mediated by PLN in SPC-A1 cells. The captured images verified that the fluorescent PDC was localized in the lysosomes and nuclei after endocytosis. Thus, PLN represents a novel, efficient nonviral gene delivery vector