    Look-ahead strategies for dynamic pickup and delivery problems

    In this paper we consider a dynamic full truckload pickup and delivery problem with time windows. Jobs arrive over time and are offered in a second-price auction. Individual vehicles bid on these jobs and maintain a schedule of the jobs they have won. We propose a pricing and scheduling strategy based on dynamic programming in which not only the direct costs of a job insertion are taken into account, but also the impact on future opportunities. Simulation is used to evaluate the benefits of pricing opportunities compared to simple pricing strategies in various market settings. Numerical results show that the proposed approach provides high-quality solutions in terms of profits, capacity utilization, and delivery reliability.
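
The bidding mechanism described above can be illustrated with a minimal sketch. It assumes a reverse (procurement) second-price auction in which the lowest ask wins and is paid the second-lowest ask, and it assumes that a look-ahead bid is the direct insertion cost plus the expected loss in future profit. The names `Bid`, `look_ahead_bid`, and `run_reverse_second_price_auction` are hypothetical and do not come from the paper, and the two "value" arguments stand in for whatever the dynamic program would estimate.

```python
from dataclasses import dataclass


@dataclass
class Bid:
    vehicle: str   # identifier of the bidding vehicle (hypothetical)
    price: float   # price the vehicle asks for serving the job


def look_ahead_bid(direct_cost, value_without_job, value_with_job):
    """Assumed pricing rule: direct insertion cost plus opportunity cost.

    The opportunity cost is the drop in expected future profit caused by
    inserting the job into the vehicle's schedule.
    """
    return direct_cost + (value_without_job - value_with_job)


def run_reverse_second_price_auction(bids):
    """Lowest ask wins; the winner is paid the second-lowest ask."""
    if len(bids) < 2:
        raise ValueError("a second-price auction needs at least two bids")
    ranked = sorted(bids, key=lambda b: b.price)
    return ranked[0].vehicle, ranked[1].price


bids = [
    Bid("v1", look_ahead_bid(100.0, 50.0, 30.0)),  # 100 + (50 - 30) = 120
    Bid("v2", 110.0),
    Bid("v3", 150.0),
]
winner, payment = run_reverse_second_price_auction(bids)
```

In this toy run, vehicle `v1` has the lowest direct cost but loses the job, because its look-ahead bid also prices in the future opportunities the job would displace.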

    Reconciling Information from Alternative Climate-economic Models: A Posterior Integration Approach

    Studies of complex systems are inseparable from the analysis of partial and imprecise information received from alternative sources. Due to the high complexity of the underlying processes, researchers tend to create an ensemble of multiple models, which describe the studied phenomenon using different modeling approaches and primary assumptions. A systems analyst then deals with a set of ensemble outcomes (usually represented by a family of probability distributions), which needs to be integrated into one estimate in order to install the ensemble into the modeling chain or to provide support for informed decision making. This research focuses on the application of the posterior integration method (originally developed at IIASA [1] to reconcile stochastic estimates from independent sources) to an ensemble of climate-economic models. Our case study uses two versions of the stylized model SDEM (Structural Dynamic Economic Model) [2], which generate different outputs (including emissions, CO2 concentration, temperature, and size of the economy) under two scenarios: the business-as-usual scenario and the mitigation scenario (under a carbon tax). We compare the original results with the results of posterior integration and with the results of the traditional approach of averaging model outcomes. [1] Kryazhimskiy, A. (2013) Posterior integration of independent stochastic estimates, IIASA Interim Report IR-13-006. [2] Kovalevsky, D.V., Hasselmann, K. (2014): Assessing the transition to a low-carbon economy using actor-based system-dynamic models. Proceedings of the 7th International Congress on Environmental Modelling and Software (iEMSs), 15-19 June 2014, San Diego, California, Vol. 4, 1865-1872, URL: http://www.iemss.org/sites/iemss2014/papers/Volume_4_iEMSs2014_pp_1817-2386.pdf
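
To make the comparison between posterior integration and averaging concrete, here is a small sketch. It assumes that, for two independent estimates given as discrete distributions on a common finite support, posterior integration reduces to the normalized pointwise product of the distributions; this is a simplification made for illustration, and the abstract itself does not state the formula. The function names and the numbers are hypothetical.

```python
def posterior_integration(p, q):
    """Assumed rule: normalized pointwise product of two discrete distributions."""
    product = [pi * qi for pi, qi in zip(p, q)]
    total = sum(product)
    if total == 0:
        raise ValueError("the estimates are incompatible (no common support)")
    return [x / total for x in product]


def simple_average(p, q):
    """Traditional approach: average the two distributions pointwise."""
    return [(pi + qi) / 2 for pi, qi in zip(p, q)]


# Two hypothetical model outputs over three outcome bins (illustrative only).
model_a = [0.2, 0.5, 0.3]
model_b = [0.1, 0.3, 0.6]

integrated = posterior_integration(model_a, model_b)
averaged = simple_average(model_a, model_b)
```

Under this assumption the integrated estimate concentrates mass on outcomes that both models consider likely, whereas the average keeps the full spread of both.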

    Dynamic Non-Bayesian Decision Making

    The model of a non-Bayesian agent who faces a repeated game with incomplete information against Nature is an appropriate tool for modeling general agent-environment interactions. In such a model the environment state (controlled by Nature) may change arbitrarily, and the feedback/reward function is initially unknown. The agent is not Bayesian, that is, he forms a prior probability neither on the state selection strategy of Nature nor on his reward function. A policy for the agent is a function which assigns an action to every history of observations and actions. Two basic feedback structures are considered. In one of them -- the perfect monitoring case -- the agent is able to observe the previous environment state as part of his feedback, while in the other -- the imperfect monitoring case -- all that is available to the agent is the reward obtained. Both of these settings refer to partially observable processes, where the current environment state is unknown. Our main result refers to the competitive ratio criterion in the perfect monitoring case. We prove the existence of an efficient stochastic policy that ensures that the competitive ratio is obtained at almost all stages with an arbitrarily high probability, where efficiency is measured in terms of rate of convergence. It is further shown that such an optimal policy does not exist in the imperfect monitoring case. Moreover, it is proved that in the perfect monitoring case there does not exist a deterministic policy that satisfies our long-run optimality criterion. In addition, we discuss the maxmin criterion and prove that a deterministic efficient optimal strategy does exist in the imperfect monitoring case under this criterion. Finally, we show that our approach to long-run optimality can be viewed as qualitative, which distinguishes it from previous work in this area. Comment: See http://www.jair.org/ for any accompanying file

    Markov Decision Processes with Applications in Wireless Sensor Networks: A Survey

    Wireless sensor networks (WSNs) consist of autonomous and resource-limited devices. The devices cooperate to monitor one or more physical phenomena within an area of interest. WSNs operate as stochastic systems because of randomness in the monitored environments. For long service time and low maintenance cost, WSNs require adaptive and robust methods to address data exchange, topology formulation, resource and power optimization, sensing coverage and object detection, and security challenges. In these problems, sensor nodes must make optimized decisions from a set of accessible strategies to achieve design goals. This survey reviews numerous applications of the Markov decision process (MDP) framework, a powerful decision-making tool for developing adaptive algorithms and protocols for WSNs. Furthermore, various solution methods are discussed and compared to serve as a guide for using MDPs in WSNs.
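
As an illustration of the kind of solution method such a survey covers, the following sketch applies value iteration, one standard way to solve a finite discounted MDP, to a toy sensor node. The two battery states, the two actions, and all probabilities and rewards are invented for this example and are not taken from the survey.

```python
def value_iteration(P, R, gamma=0.9, tol=1e-8):
    """Solve a finite discounted MDP by value iteration.

    P[s][a] is a list of (probability, next_state) pairs.
    R[s][a] is the immediate reward for taking action a in state s.
    Returns the value function and a greedy policy.
    """
    def q_value(s, a, V):
        return R[s][a] + gamma * sum(p * V[s2] for p, s2 in P[s][a])

    V = {s: 0.0 for s in P}
    while True:
        delta = 0.0
        for s in P:
            best = max(q_value(s, a, V) for a in P[s])
            delta = max(delta, abs(best - V[s]))
            V[s] = best
        if delta < tol:
            break
    policy = {s: max(P[s], key=lambda a: q_value(s, a, V)) for s in P}
    return V, policy


# Hypothetical sensor node: battery is "low" or "high"; it can sleep or transmit.
P = {
    "high": {"transmit": [(0.7, "low"), (0.3, "high")], "sleep": [(1.0, "high")]},
    "low": {"transmit": [(1.0, "low")], "sleep": [(0.5, "high"), (0.5, "low")]},
}
R = {
    "high": {"transmit": 1.0, "sleep": 0.0},
    "low": {"transmit": -1.0, "sleep": 0.0},
}

V, policy = value_iteration(P, R, gamma=0.9)
```

With these made-up numbers the greedy policy transmits when the battery is high and sleeps when it is low, which is the adaptive behaviour an MDP formulation is meant to capture.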

    Identifying the consequences of dynamic treatment strategies: A decision-theoretic overview

    We consider the problem of learning about and comparing the consequences of dynamic treatment strategies on the basis of observational data. We formulate this within a probabilistic decision-theoretic framework. Our approach is compared with related work by Robins and others: in particular, we show how Robins's 'G-computation' algorithm arises naturally from this decision-theoretic perspective. Careful attention is paid to the mathematical and substantive conditions required to justify the use of this formula. These conditions revolve around a property we term stability, which relates the probabilistic behaviours of observational and interventional regimes. We show how an assumption of 'sequential randomization' (or 'no unmeasured confounders'), or an alternative assumption of 'sequential irrelevance', can be used to infer stability. Probabilistic influence diagrams are used to simplify manipulations, and their power and limitations are discussed. We compare our approach with alternative formulations based on causal DAGs or potential response models. We aim to show that formulating the problem of assessing dynamic treatment strategies as a problem of decision analysis brings clarity, simplicity and generality. Comment: 49 pages, 15 figures
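
For orientation, the G-computation formula can be written out for a two-stage problem. The notation here is an assumption made for illustration and is not taken from the paper: $L_1, L_2$ are covariates observed before each decision, $A_1, A_2$ are the actions, $Y$ is the outcome, and $g = (g_1, g_2)$ is a dynamic strategy that sets $a_1 = g_1(l_1)$ and $a_2 = g_2(l_1, l_2)$. All conditional distributions on the right-hand side are taken from the observational regime, which is the step that a stability-type condition is needed to justify.

```latex
\[
  \mathbb{E}_g[Y]
  \;=\; \sum_{l_1} \sum_{l_2}
        \mathbb{E}\bigl[\,Y \mid l_1,\ a_1 = g_1(l_1),\ l_2,\ a_2 = g_2(l_1, l_2)\bigr]\;
        p\bigl(l_2 \mid l_1,\ a_1 = g_1(l_1)\bigr)\;
        p(l_1)
\]
```

With more stages the same pattern repeats: each covariate is averaged over its observational conditional distribution given the history so far, with every action fixed to the value the strategy prescribes.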