    Deep reinforcement learning for attacking wireless sensor networks

    Recent advances in Deep Reinforcement Learning allow solving increasingly complex problems. In this work, we show how current defense mechanisms in Wireless Sensor Networks are vulnerable to attacks that exploit these advances. We use a Deep Reinforcement Learning attacker architecture that supports one or more attacking agents, each of which learns to attack using only partial observations. We then evaluate this architecture on a test bench consisting of two defense mechanisms, against a distributed spectrum sensing attack and a backoff attack. Our simulations show that the attacker learns to exploit these systems without a priori information about the defense mechanism used or its concrete parameters. Since the attacker requires minimal hyper-parameter tuning, scales with the number of attackers, and learns only by interacting with the defense mechanism, it poses a significant threat to current defense procedures.
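    The abstract gives no code, so the following is a minimal, hypothetical sketch of the kind of partially observable deep Q-learning attacker it describes: a single agent that picks a backoff window each round and observes only its own recent collision outcomes. The toy environment, network size, and reward shaping are invented for illustration and are not the paper's actual architecture or parameters.

```python
# Hypothetical sketch, not the paper's code: a DQN-style attacker that picks a
# backoff window each round and observes only its own recent collision history.
import random
from collections import deque

import torch
import torch.nn as nn

N_ACTIONS = 4        # candidate backoff windows the attacker may choose from
OBS_WINDOW = 8       # the agent only sees its last 8 collision outcomes

class ToyBackoffChannel:
    """Toy stand-in for a defended channel; the real dynamics are unknown here."""
    def __init__(self):
        self.history = deque([0.0] * OBS_WINDOW, maxlen=OBS_WINDOW)

    def step(self, action):
        # Invented dynamics: a shorter backoff grabs more airtime but collides more.
        p_collision = 0.15 + 0.2 * (N_ACTIONS - 1 - action) / (N_ACTIONS - 1)
        collided = float(random.random() < p_collision)
        reward = (1.0 - collided) * (N_ACTIONS - action)   # throughput proxy
        self.history.append(collided)
        return torch.tensor(list(self.history)), reward

q_net = nn.Sequential(nn.Linear(OBS_WINDOW, 32), nn.ReLU(), nn.Linear(32, N_ACTIONS))
optimizer = torch.optim.Adam(q_net.parameters(), lr=1e-3)
env = ToyBackoffChannel()
obs, gamma, eps = torch.zeros(OBS_WINDOW), 0.95, 0.1

for step in range(5000):
    # Epsilon-greedy action selection from the partial observation only.
    with torch.no_grad():
        greedy = q_net(obs).argmax().item()
    action = random.randrange(N_ACTIONS) if random.random() < eps else greedy
    next_obs, reward = env.step(action)
    with torch.no_grad():
        target = reward + gamma * q_net(next_obs).max()
    loss = (q_net(obs)[action] - target) ** 2              # one-step TD error
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    obs = next_obs
```

    A recurrent network or several cooperating agents could replace the fixed observation window, but the interaction loop is the same: act from partial observations and learn only from the rewards the defense mechanism induces.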

    Markov Decision Processes with Applications in Wireless Sensor Networks: A Survey

    Wireless sensor networks (WSNs) consist of autonomous and resource-limited devices that cooperate to monitor one or more physical phenomena within an area of interest. WSNs operate as stochastic systems because of randomness in the monitored environments. To achieve long service time and low maintenance cost, WSNs require adaptive and robust methods that address data exchange, topology formulation, resource and power optimization, sensing coverage and object detection, and security challenges. In these problems, sensor nodes must make optimized decisions from a set of accessible strategies to achieve design goals. This survey reviews numerous applications of the Markov decision process (MDP) framework, a powerful decision-making tool for developing adaptive algorithms and protocols in WSNs. Furthermore, various solution methods are discussed and compared to serve as a guide for using MDPs in WSNs.
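    As a toy illustration of the MDP framework the survey covers (not an example taken from the survey itself), the sketch below models a single sensor node deciding between sleeping and transmitting as a small MDP over battery levels and solves it with value iteration; all transition probabilities and rewards are made-up numbers.

```python
# Illustrative only: a tiny MDP for a sensor node choosing between sleeping
# and transmitting, solved by value iteration. States are battery levels;
# the transition probabilities and rewards below are invented.
import numpy as np

states = [0, 1, 2, 3]            # battery level: 0 = empty ... 3 = full
actions = ["sleep", "transmit"]
gamma = 0.9

def transition(s, a):
    """Return a list of (probability, next_state, reward) triples."""
    if s == 0:                                   # dead node: nothing happens
        return [(1.0, 0, 0.0)]
    if a == "sleep":                             # small chance of energy harvest
        return [(0.3, min(s + 1, 3), 0.0), (0.7, s, 0.0)]
    # Transmit: earns reward for delivered data but drains the battery.
    return [(0.9, s - 1, 5.0), (0.1, s - 1, 0.0)]

V = np.zeros(len(states))
for _ in range(200):                             # value iteration sweeps
    V_new = np.zeros_like(V)
    for s in states:
        V_new[s] = max(
            sum(p * (r + gamma * V[s2]) for p, s2, r in transition(s, a))
            for a in actions
        )
    if np.max(np.abs(V_new - V)) < 1e-6:
        break
    V = V_new

# Greedy policy with respect to the converged value function.
policy = {
    s: max(actions, key=lambda a: sum(p * (r + gamma * V[s2])
                                      for p, s2, r in transition(s, a)))
    for s in states
}
print(V, policy)
```

    Real WSN applications replace these toy states and rewards with quantities such as residual energy, link quality, or coverage, but the solution machinery (value iteration, policy iteration, or their approximate variants) is the same.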

    Approximate information state based convergence analysis of recurrent Q-learning

    In spite of the large literature on reinforcement learning (RL) algorithms for partially observable Markov decision processes (POMDPs), a complete theoretical understanding is still lacking. In a partially observable setting, the history of data available to the agent increases over time, so most practical algorithms either truncate the history to a finite window or compress it using a recurrent neural network, leading to an agent state that is non-Markovian. In this paper, it is shown that, in spite of the lack of the Markov property, recurrent Q-learning (RQL) converges in the tabular setting. Moreover, it is shown that the quality of the converged limit depends on the quality of the representation, which is quantified in terms of what is known as an approximate information state (AIS). Based on this characterization of the approximation error, a variant of RQL with AIS losses is presented. This variant performs better than a strong baseline for RQL that does not use AIS losses. It is demonstrated that there is a strong correlation between the performance of RQL over time and the loss associated with the AIS representation.
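    As a rough illustration of tabular recurrent-style Q-learning with a non-Markovian agent state (here simply the last two observations, a crude stand-in for the truncated-history or AIS-based representations discussed above), the following sketch runs standard Q-learning updates over that agent state on an invented two-state POMDP. It is not the paper's algorithm or its AIS loss, only the general setting it analyzes.

```python
# Minimal tabular sketch: Q-learning on a non-Markovian agent state built from
# the last two observations of a toy POMDP. Environment and parameters invented.
import random
from collections import defaultdict

ACTIONS = [0, 1]
GAMMA, ALPHA, EPS = 0.9, 0.1, 0.1

def step(hidden, action):
    """Two hidden states; matching the hidden state pays off, observations are noisy."""
    reward = 1.0 if action == hidden else 0.0
    hidden = 1 - hidden if random.random() < 0.1 else hidden   # slow drift
    obs = hidden if random.random() < 0.8 else 1 - hidden      # 80% accurate
    return hidden, obs, reward

Q = defaultdict(float)               # Q[(agent_state, action)]
hidden, obs_prev, obs = 0, 0, 0
z = (obs_prev, obs)                  # agent state: last two observations

for t in range(50_000):
    # Epsilon-greedy action from the (non-Markovian) agent state.
    if random.random() < EPS:
        a = random.choice(ACTIONS)
    else:
        a = max(ACTIONS, key=lambda act: Q[(z, act)])
    hidden, obs_next, r = step(hidden, a)
    z_next = (obs, obs_next)
    # Ordinary Q-learning update applied to the agent state.
    target = r + GAMMA * max(Q[(z_next, act)] for act in ACTIONS)
    Q[(z, a)] += ALPHA * (target - Q[(z, a)])
    obs, z = obs_next, z_next

print({k: round(v, 2) for k, v in Q.items()})
```

    The quality of whatever the fixed-point values converge to depends on how well the chosen agent state summarizes the history, which is the gap the approximate information state formalism quantifies.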