
    An Energy Sharing Game with Generalized Demand Bidding: Model and Properties

    This paper proposes a novel energy sharing mechanism for prosumers, who can both produce and consume energy. Unlike most existing works, the role of each individual prosumer as a seller or buyer is endogenously determined in our model. Several desirable properties of the proposed mechanism are proved based on a generalized game-theoretic model. We show that the Nash equilibrium exists and is the unique solution of an equivalent convex optimization problem. The sharing price at the Nash equilibrium equals the average marginal disutility of all prosumers. We also prove that every prosumer has an incentive to participate in the sharing market, and that the prosumers' total cost decreases as the absolute value of the price sensitivity increases. Furthermore, the Nash equilibrium approaches the social optimum as the number of prosumers grows, so competition improves social welfare. Comment: 16 pages, 7 figures
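    To make the "equivalent convex optimization" characterization concrete, here is a minimal numerical sketch assuming quadratic disutilities and a simple energy-balance constraint (illustrative assumptions; the paper's generalized model is broader). The dual variable of the balance constraint plays the role of the sharing price, and each prosumer's buyer/seller role emerges from the sign of its optimal net demand.

```python
# Toy sketch of the "equivalent convex optimization" view of the sharing
# equilibrium. Everything here (quadratic disutilities, the balance
# constraint, the parameters a_i and d_i) is an illustrative assumption,
# not the paper's exact model.
import numpy as np

a = np.array([1.0, 2.0, 0.5])    # disutility curvature / price sensitivity (assumed)
d = np.array([3.0, -1.0, -1.5])  # desired net demand; sign decides buyer vs. seller

# Social problem: min sum_i a_i * (q_i - d_i)^2  s.t.  sum_i q_i = 0.
# Stationarity gives 2*a_i*(q_i - d_i) = lam for all i, so the dual
# variable lam of the balance constraint is the common marginal
# disutility -- the sharing price.
lam = -d.sum() / (0.5 / a).sum()   # closed form for the dual variable
q = d + lam / (2.0 * a)            # each prosumer's equilibrium net demand

print("sharing price:", lam)
print("net demands  :", q, "balance:", q.sum())  # balance ~ 0
```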

    On Reward Structures of Markov Decision Processes

    A Markov decision process can be parameterized by a transition kernel and a reward function. Both play essential roles in the study of reinforcement learning, as evidenced by their presence in the Bellman equations. In our inquiry into the various kinds of "costs" associated with reinforcement learning, inspired by the demands of robotic applications, rewards are central to understanding the structure of a Markov decision process, and reward-centric notions can elucidate important concepts in reinforcement learning. Specifically, we study the sample complexity of policy evaluation and develop a novel estimator with an instance-specific error bound of $\tilde{O}(\sqrt{\frac{\tau_s}{n}})$ for estimating a single state value. In the online regret minimization setting, we refine the transition-based MDP constant, the diameter, into a reward-based constant, the maximum expected hitting cost, and with it provide a theoretical explanation for how a well-known technique, potential-based reward shaping, can accelerate learning with expert knowledge. To study safe reinforcement learning, we model hazardous environments with irrecoverability and propose a quantitative notion of safe learning via reset efficiency; in this setting, we modify a classic algorithm to account for resets, achieving promising preliminary numerical results. Lastly, for MDPs with multiple reward functions, we develop a planning algorithm that efficiently computes Pareto-optimal stochastic policies. Comment: This PhD thesis draws heavily from arXiv:1907.02114 and arXiv:2002.06299; minor edits
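    As a concrete reference for the shaping technique analyzed above, the sketch below shows the standard potential-based transformation of Ng et al. (1999); the chain MDP and the potential function are illustrative assumptions, not the thesis's experiments.

```python
# Minimal sketch of potential-based reward shaping: replace the reward with
# r'(s, s') = r(s, s') + gamma * Phi(s') - Phi(s). This transformation is
# known to preserve the set of optimal policies while letting a good
# potential Phi encode expert knowledge.
n_states, gamma = 5, 0.9

def base_reward(s, s_next):
    return 1.0 if s_next == n_states - 1 else 0.0  # goal at the right end

def potential(s):
    return float(s)  # expert guess (assumed): "further right is better"

def shaped_reward(s, s_next):
    return base_reward(s, s_next) + gamma * potential(s_next) - potential(s)

# Along a trajectory the shaping terms telescope: the shaped discounted
# return equals the base return plus gamma^T * Phi(s_T) - Phi(s_0).
traj = [0, 1, 2, 3, 4]
print(sum(gamma**t * shaped_reward(s, s2)
          for t, (s, s2) in enumerate(zip(traj, traj[1:]))))
```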

    Solvent dependence of the rheological properties in hydrogel magnetorheological plastomer

    Chemically crosslinked hydrogel magnetorheological (MR) plastomer (MRP) embedded with carbonyl iron particles (CIPs) exhibits excellent magnetic performance (MR effect) in the presence of external stimuli, particularly a magnetic field. However, oxidation and desiccation in hydrogel MRP, caused by the large amount of water used as the dispersing phase, limit its use in long-term applications, especially in industrial engineering. In this study, alternative solvents such as dimethyl sulfoxide (DMSO) are therefore used to prepare polyvinyl alcohol (PVA) hydrogel MRP. To understand the dynamic viscoelastic properties of hydrogel MRP, three samples with different solvents (water, DMSO, and their binary mixture (DMSO/water)) were prepared and systematically characterized under oscillatory shear. The outcomes demonstrate that the PVA hydrogel MRP prepared from precursor gel with water shows the highest MR effect of 15,544% among the PVA hydrogel MRPs. However, these samples are less stable and tend to oxidise within a month. Meanwhile, the samples with the binary mixture (DMSO/water) show an acceptable MR effect of 11,024% with good stability and no CIP oxidation. In contrast, the sample with DMSO has the lowest MR effect of 7049% and is less stable than the binary-solvent samples. This confirms that the use of DMSO as a new solvent affects the rheological properties and stability of the samples
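    For readers unfamiliar with the quoted percentages, the sketch below computes a relative MR effect under the commonly used definition (field-on minus field-off storage modulus, relative to field-off); both the definition and the moduli values here are assumptions for illustration, not the paper's measured data.

```python
# Relative MR effect from off-/on-state storage moduli, assuming the
# common definition (G'_on - G'_off) / G'_off * 100%. Values are made up.
def mr_effect_percent(g_off, g_on):
    """Relative MR effect in percent, given storage moduli in Pa."""
    return (g_on - g_off) / g_off * 100.0

# e.g. an off-state modulus of 1.0 kPa rising to ~156.4 kPa under field
print(f"{mr_effect_percent(1.0e3, 156.44e3):.0f}%")  # ~15544%
```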

    Supply Chain

    Traditionally, supply chain management has meant factories, assembly lines, warehouses, transportation vehicles, and time sheets. Modern supply chain management is a highly complex, multidimensional problem set with a virtually endless number of variables for optimization. An Internet-enabled supply chain may have just-in-time delivery, precise inventory visibility, and up-to-the-minute distribution-tracking capabilities. Technology advances have enabled supply chains to become strategic weapons that can help avoid disasters, lower costs, and make money. The challenges researchers have to handle range from internal enterprise processes to external business transactions with suppliers, transporters, channels, and end users. The aim of this book is to reveal and illustrate this diversity in terms of scientific and theoretical fundamentals, prevailing concepts, and current practical applications

    The impact of MRP on a traditional manufacturing company

    Thesis (M.S.)--Massachusetts Institute of Technology, Sloan School of Management, 1990. Includes bibliographical references (leaves 66-67). By Maria Carolina Briza Junqueira. M.S.

    Business Process Automation and Managerial Accounting: An SAP Plug and Play Module (FINAL REPORT)

    The primary aim of our project is to develop an Enterprise Resource Planning (ERP) platform that enables students at Pace to understand how different interdisciplinary areas in cross-unit and/or cross-enterprise decision making are related. ERP can help us do this since it allows a firm to automate and integrate its business processes, share common data and practices across the entire enterprise, and provide and access information in a real-time environment

    When is Agnostic Reinforcement Learning Statistically Tractable?

    We study the problem of agnostic PAC reinforcement learning (RL): given a policy class $\Pi$, how many rounds of interaction with an unknown MDP (with a potentially large state and action space) are required to learn an $\epsilon$-suboptimal policy with respect to $\Pi$? Towards that end, we introduce a new complexity measure, called the \emph{spanning capacity}, that depends solely on the set $\Pi$ and is independent of the MDP dynamics. With a generative model, we show that for any policy class $\Pi$, bounded spanning capacity characterizes PAC learnability. However, for online RL, the situation is more subtle. We show there exists a policy class $\Pi$ with bounded spanning capacity that requires a superpolynomial number of samples to learn. This reveals a surprising separation for agnostic learnability between generative access and online access models (as well as between deterministic/stochastic MDPs under online access). On the positive side, we identify an additional \emph{sunflower} structure, which in conjunction with bounded spanning capacity enables statistically efficient online RL via a new algorithm called POPLER, which takes inspiration from classical importance sampling methods as well as techniques for reachable-state identification and policy evaluation in reward-free exploration. Comment: Accepted to NeurIPS 202
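    As a pointer to the classical ingredient POPLER builds on, here is a minimal importance-sampling sketch for off-policy evaluation; the two-action example and all numbers are illustrative assumptions, not the paper's construction.

```python
# Classical importance sampling: re-weight returns collected under a
# behavior policy to estimate the value of a target policy in Pi.
import numpy as np

rng = np.random.default_rng(0)
behavior = np.array([0.5, 0.5])   # data-collecting policy pi_b(a)
target   = np.array([0.9, 0.1])   # policy pi we want to evaluate
true_r   = np.array([1.0, 0.0])   # expected reward per action (assumed)

n = 10_000
actions = rng.choice(2, size=n, p=behavior)
rewards = rng.binomial(1, true_r[actions]).astype(float)

# Importance weight rho = pi(a) / pi_b(a) corrects the action distribution.
rho = target[actions] / behavior[actions]
print("IS estimate :", np.mean(rho * rewards))
print("ground truth:", target @ true_r)  # 0.9
```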