    Markov Decision Processes with Applications in Wireless Sensor Networks: A Survey

    Wireless sensor networks (WSNs) consist of autonomous, resource-limited devices that cooperate to monitor one or more physical phenomena within an area of interest. WSNs operate as stochastic systems because of randomness in the monitored environments. To achieve long service times and low maintenance costs, WSNs require adaptive and robust methods to address data exchange, topology formation, resource and power optimization, sensing coverage and object detection, and security challenges. In these problems, sensor nodes must make optimized decisions from a set of accessible strategies to achieve design goals. This survey reviews numerous applications of the Markov decision process (MDP) framework, a powerful decision-making tool for developing adaptive algorithms and protocols for WSNs. Furthermore, various solution methods are discussed and compared to serve as a guide for using MDPs in WSNs.
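
    The MDP machinery the survey builds on can be illustrated in a few lines of code. The sketch below is a hypothetical example, not taken from the survey: value iteration for a toy MDP in which a sensor node with a discretized battery level chooses between sleeping (to harvest energy) and transmitting (to deliver data). All states, transition probabilities, and rewards are assumptions made for illustration.

```python
import numpy as np

n_states = 5                       # battery levels 0 (empty) .. 4 (full)
actions = ["sleep", "transmit"]
gamma = 0.95                       # discount factor

# P[a][s, s'] = transition probability, R[a][s] = expected immediate reward
P = {a: np.zeros((n_states, n_states)) for a in actions}
R = {a: np.zeros(n_states) for a in actions}
for s in range(n_states):
    # Sleeping: the battery recharges one level with probability 0.3; small idle reward.
    P["sleep"][s, min(s + 1, n_states - 1)] += 0.3
    P["sleep"][s, s] += 0.7
    R["sleep"][s] = 0.1
    # Transmitting: drains one battery level (if any) and earns a data-delivery reward.
    if s > 0:
        P["transmit"][s, s - 1] = 1.0
        R["transmit"][s] = 1.0
    else:
        P["transmit"][s, s] = 1.0  # empty battery: nothing happens
        R["transmit"][s] = 0.0

def q_values(V):
    return np.array([R[a] + gamma * P[a] @ V for a in actions])

V = np.zeros(n_states)
for _ in range(1000):              # value iteration until (near) convergence
    V_new = q_values(V).max(axis=0)
    if np.max(np.abs(V_new - V)) < 1e-8:
        break
    V = V_new

policy = [actions[i] for i in q_values(V).argmax(axis=0)]
print("V* ~", np.round(V, 3))
print("policy:", policy)
```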

    Review of Markov models for maintenance optimization in the context of offshore wind

    The offshore environment poses a number of challenges to wind farm operators. Harsher climatic conditions typically result in lower reliability, while limited accessibility makes maintenance difficult. One way to improve availability is to optimize Operation and Maintenance (O&M) actions such as scheduled, corrective and proactive maintenance. Many authors have attempted to model or optimize O&M through the use of Markov models. Two examples of Markov models, Hidden Markov Models (HMMs) and Partially Observable Markov Decision Processes (POMDPs), are investigated in this paper. Markov models are a powerful statistical tool that has been successfully applied to component diagnostics, prognostics and maintenance optimization across a range of industries. This paper discusses the suitability of these models for the offshore wind industry. Existing models created for the wind industry are critically reviewed and discussed. As there is little evidence of widespread application of these models, this paper aims to highlight the key factors required for successful application of Markov models to practical problems. From this, the paper identifies the theoretical and practical gaps that must be resolved for Markov models to gain broad acceptance in supporting O&M decision making in the offshore wind industry.
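
    As a concrete illustration of the diagnostic use of HMMs mentioned above, the sketch below runs a forward (filtering) recursion over a hypothetical three-state deterioration model observed through noisy inspections. The transition and observation matrices are assumptions for illustration, not values from the review.

```python
import numpy as np

A = np.array([[0.90, 0.08, 0.02],    # transition matrix (rows: good, degraded, failed)
              [0.00, 0.85, 0.15],
              [0.00, 0.00, 1.00]])
B = np.array([[0.80, 0.15, 0.05],    # observation matrix: P(observation | state)
              [0.20, 0.60, 0.20],
              [0.05, 0.15, 0.80]])
pi = np.array([1.0, 0.0, 0.0])       # component starts in the "good" state

def forward_filter(observations):
    """Return the filtered belief over hidden states after each observation."""
    belief = pi.copy()
    beliefs = []
    for obs in observations:
        belief = (A.T @ belief) * B[:, obs]   # predict, then condition on the observation
        belief /= belief.sum()                # renormalize to a probability vector
        beliefs.append(belief)
    return np.array(beliefs)

# Observations encoded as 0 = "looks good", 1 = "minor alarm", 2 = "severe alarm"
print(np.round(forward_filter([0, 0, 1, 1, 2]), 3))
```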

    Multi-agent deep reinforcement learning with centralized training and decentralized execution for transportation infrastructure management

    We present a multi-agent Deep Reinforcement Learning (DRL) framework for managing large transportation infrastructure systems over their life-cycle. Life-cycle management of such engineering systems is a computationally intensive task, requiring appropriate sequential inspection and maintenance decisions able to reduce long-term risks and costs while dealing with different uncertainties and constraints that lie in high-dimensional spaces. To date, static age- or condition-based maintenance methods and risk-based or periodic inspection plans have mostly addressed this class of optimization problems. However, optimality, scalability, and uncertainty limitations are often manifested under such approaches. The optimization problem in this work is cast in the framework of constrained Partially Observable Markov Decision Processes (POMDPs), which provides a comprehensive mathematical basis for stochastic sequential decision settings with observation uncertainties, risk considerations, and limited resources. To address significantly large state and action spaces, a Deep Decentralized Multi-agent Actor-Critic (DDMAC) DRL method with Centralized Training and Decentralized Execution (CTDE), termed DDMAC-CTDE, is developed. The performance strengths of the DDMAC-CTDE method are demonstrated in a representative and realistic example application of an existing transportation network in Virginia, USA. The network includes several bridge and pavement components with nonstationary degradation, agency-imposed constraints, and traffic delay and risk considerations. The proposed DDMAC-CTDE method vastly outperforms traditional management policies for transportation networks. Overall, the proposed algorithmic framework provides near-optimal solutions for transportation infrastructure management under real-world constraints and complexities.
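
    The centralized-training / decentralized-execution idea can be sketched structurally as follows. This is a minimal illustration in the spirit of the described DDMAC-CTDE setup, not the authors' implementation: each component has its own small actor that acts on local (belief) input at execution time, while a single critic sees the concatenated joint input during training. Network sizes and the two-component setting are assumptions.

```python
import torch
import torch.nn as nn

class Actor(nn.Module):
    """Decentralized actor: maps one component's local belief to action probabilities."""
    def __init__(self, belief_dim, n_actions):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(belief_dim, 32), nn.ReLU(),
            nn.Linear(32, n_actions), nn.Softmax(dim=-1),
        )
    def forward(self, belief):
        return self.net(belief)

class CentralCritic(nn.Module):
    """Centralized critic: values the joint belief of all components during training."""
    def __init__(self, joint_dim):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(joint_dim, 64), nn.ReLU(),
            nn.Linear(64, 1),
        )
    def forward(self, joint_belief):
        return self.net(joint_belief)

belief_dim, n_actions, n_components = 4, 3, 2
actors = [Actor(belief_dim, n_actions) for _ in range(n_components)]
critic = CentralCritic(belief_dim * n_components)

# Execution is decentralized: each actor only needs its own component's belief.
local_beliefs = [torch.rand(belief_dim) for _ in range(n_components)]
component_actions = [torch.multinomial(actor(b), 1) for actor, b in zip(actors, local_beliefs)]

# Training is centralized: the critic evaluates the concatenated (joint) belief.
joint_value = critic(torch.cat(local_beliefs))
```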

    Continuous-observation partially observable semi-Markov decision processes for machine maintenance

    Partially observable semi-Markov decision processes (POSMDPs) provide a rich framework for planning under both state transition uncertainty and observation uncertainty. In this paper, we widen the POSMDP literature by studying discrete-state, discrete-action, yet continuous-observation POSMDPs. We prove that the resultant α-vector set is continuous and therefore propose a point-based value iteration algorithm. This paper also bridges the gap between POSMDPs and machine maintenance by incorporating various types of maintenance actions, such as actions changing the machine state, actions changing the degradation rate, and the temporally extended action "do nothing". Both finite and infinite planning horizons are considered, and the solution methodology for each type of planning horizon is given. We illustrate the maintenance decision process via a real industrial problem and demonstrate that the developed framework can be readily applied to solve relevant maintenance problems.
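
    A minimal sketch of the point-based value iteration backup underlying such algorithms is given below, written for a discrete-observation POMDP for brevity (the paper's setting has continuous observations). The two-state maintenance model in the usage example is an illustrative placeholder, not the paper's model.

```python
import numpy as np

def pbvi_backup(beliefs, alphas, P, Z, R, gamma):
    """One point-based backup of the alpha-vector set at the given belief points.

    P[a]   : |S| x |S| transition matrix for action a
    Z[a]   : |S| x |O| observation matrix, P(o | s', a)
    R[a]   : length-|S| immediate reward vector for action a
    alphas : list of current alpha-vectors (each of length |S|)
    """
    n_actions, n_obs = len(P), Z[0].shape[1]
    new_alphas = []
    for b in beliefs:
        best_value, best_alpha = -np.inf, None
        for a in range(n_actions):
            alpha_a = R[a].astype(float).copy()
            for o in range(n_obs):
                # Back-project each alpha-vector through action a and observation o,
                # then keep the one that is best at belief b.
                g = [P[a] @ (Z[a][:, o] * alpha) for alpha in alphas]
                alpha_a += gamma * g[int(np.argmax([b @ gi for gi in g]))]
            value = b @ alpha_a
            if value > best_value:
                best_value, best_alpha = value, alpha_a
        new_alphas.append(best_alpha)
    return new_alphas

# Tiny 2-state (good/failed), 2-action, 2-observation usage example (numbers illustrative)
P = [np.array([[0.9, 0.1], [0.0, 1.0]]),      # action 0: "do nothing" (may degrade)
     np.array([[1.0, 0.0], [1.0, 0.0]])]      # action 1: "repair" (back to good)
Z = [np.array([[0.8, 0.2], [0.3, 0.7]])] * 2  # noisy inspection outcomes
R = [np.array([0.0, -5.0]), np.array([-2.0, -2.0])]
beliefs = [np.array([1.0, 0.0]), np.array([0.5, 0.5]), np.array([0.0, 1.0])]
alphas = [np.zeros(2)]
for _ in range(50):
    alphas = pbvi_backup(beliefs, alphas, P, Z, R, gamma=0.95)
print(np.round(alphas, 2))
```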

    Life-cycle policies for large engineering systems under complete and partial observability

    Management of structures and infrastructure systems has gained significant attention in the pursuit of optimal inspection and maintenance life-cycle policies that are able to handle diverse deteriorating effects of stochastic nature and satisfy long-term objectives. Such sequential decision problems can be efficiently formulated along the premises of Markov Decision Processes (MDPs) and Partially Observable Markov Decision Processes (POMDPs), which describe agent-based acting in environments with Markovian dynamics, equipped with rewards, actions, and complete or partial observations. In systems with relatively low-dimensional state and action spaces, MDPs and POMDPs can be satisfactorily solved using different dynamic programming algorithms, such as value iteration with or without synchronous updates and point-based approaches for partial observability cases. However, optimal planning for large systems with multiple components is computationally hard and severely suffers from the curse of dimensionality. Namely, the system states and actions can, in the most general and adverse case, grow exponentially with the number of components, making the problem intractable for conventional dynamic programming schemes. In this work, Deep Reinforcement Learning (DRL) is implemented, with emphasis on the development and application of deep architectures suitable for large engineering systems. The developed approach leverages component-wise information to prescribe component-wise actions, while maintaining global optimality at the system level. Thereby, the system life-cycle cost functions are efficiently parametrized for large state and action spaces through nonlinear approximations, enabling adept planning in complex decision problems. Results are presented for a multi-component system, evaluated against various condition-based policies. This material is based upon work supported by the National Science Foundation under CAREER Grant No. 1751941.
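
    One way to realize the component-wise parametrization described above is a policy network with a shared trunk and one action head per component, which avoids enumerating the exponentially large joint action space. The sketch below is an assumed illustration of this idea, not the architecture used in the work; layer sizes and the three-component setting are placeholders.

```python
import torch
import torch.nn as nn

class ComponentWisePolicy(nn.Module):
    """Shared trunk over the system state, one action head per component."""
    def __init__(self, state_dim, n_components, n_actions_per_component):
        super().__init__()
        self.trunk = nn.Sequential(nn.Linear(state_dim, 64), nn.ReLU())
        # Per-component heads keep the output size linear in the number of components.
        self.heads = nn.ModuleList(
            nn.Linear(64, n_actions_per_component) for _ in range(n_components)
        )
    def forward(self, system_state):
        h = self.trunk(system_state)
        # One action distribution per component.
        return [torch.softmax(head(h), dim=-1) for head in self.heads]

policy = ComponentWisePolicy(state_dim=12, n_components=3, n_actions_per_component=4)
action_probs = policy(torch.rand(12))
component_actions = [torch.multinomial(p, 1).item() for p in action_probs]
print(component_actions)
```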

    Understanding Behavior via inverse reinforcement learning

    Integrated Master's thesis. Informatics and Computing Engineering. Faculdade de Engenharia, Universidade do Porto. 201