154 research outputs found

    A Coverage Monitoring algorithm based on Learning Automata for Wireless Sensor Networks

    To cover a set of targets with known locations within an area with limited or prohibited ground access using a wireless sensor network, one approach is to deploy the sensors remotely, from an aircraft. In this approach, the lack of precise sensor placement is compensated for by redundant deployment of sensor nodes. This redundancy can also be used to extend the lifetime of the network, if a proper scheduling mechanism is available for scheduling the active and sleep times of sensor nodes so that each node is in active mode only when it is required to be. In this paper, we propose an efficient scheduling method based on learning automata, called LAML, in which each node is equipped with a learning automaton that helps the node select its proper state (active or sleep) at any given time. To study the performance of the proposed method, computer simulations are conducted. The results of these simulations show that the proposed scheduling method prolongs the lifetime of the network better than similar existing methods.
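
    As a rough illustration of the idea, the sketch below shows a two-action learning automaton choosing between ACTIVE and SLEEP under a linear reward-inaction update. It is not the authors' LAML itself: the paper's coverage feedback is reduced here to a single hypothetical `redundancy` probability standing in for whether a sleeping node's targets remain covered by neighbours.

```python
import random

random.seed(0)

class NodeAutomaton:
    """Two-action learning automaton with a linear reward-inaction (L_RI)
    update: on a reward, probability mass moves toward the chosen action;
    on a penalty, the probabilities are left unchanged."""

    def __init__(self, learning_rate=0.1):
        self.a = learning_rate
        self.p_active = 0.5               # P(ACTIVE); P(SLEEP) = 1 - p_active

    def choose(self):
        return "ACTIVE" if random.random() < self.p_active else "SLEEP"

    def reward(self, action):
        if action == "ACTIVE":
            self.p_active += self.a * (1.0 - self.p_active)
        else:
            self.p_active -= self.a * self.p_active

# Hypothetical stand-in for the paper's coverage feedback: with probability
# `redundancy`, a sleeping node's targets are still covered by active
# neighbours, so SLEEP is the energy-saving choice that earns a reward.
redundancy = 0.9
node = NodeAutomaton()
for _ in range(1000):
    action = node.choose()
    covered = random.random() < redundancy
    if (action == "SLEEP" and covered) or (action == "ACTIVE" and not covered):
        node.reward(action)

print(f"P(SLEEP) after learning: {1.0 - node.p_active:.2f}")
```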

    On incorporating the paradigms of discretization and Bayesian estimation to create a new family of pursuit learning automata

    There are currently two fundamental paradigms that have been used to enhance the convergence speed of Learning Automata (LA). The first involves utilizing estimates of the reward probabilities, while the second involves discretizing the probability space in which the LA operates. This paper demonstrates how both can be utilized simultaneously, in particular by using the family of Bayesian estimates that have been proven to have distinct advantages over their maximum likelihood counterparts. The success of LA-based estimator algorithms over the classical Linear Reward-Inaction (LRI)-like schemes can be explained by their ability to pursue the actions with the highest reward probability estimates. Without access to reward probability estimates, it makes sense for schemes like the LRI to first make large exploring steps, and then to gradually turn exploration into exploitation by making progressively smaller learning steps. However, this behavior becomes counter-intuitive when pursuing actions based on their estimated reward probabilities. Learning should then ideally proceed in progressively larger steps, as the reward probability estimates become more accurate. This paper introduces a new estimator algorithm, the Discretized Bayesian Pursuit Algorithm (DBPA), that achieves this by incorporating both of the above paradigms. The DBPA is implemented by linearly discretizing the action probability space of the Bayesian Pursuit Algorithm (BPA) (Zhang et al. in IEA-AIE 2011, Springer, New York, pp. 608-620, 2011). The key innovation of this paper is that the linear discrete updating rules mitigate the counter-intuitive behavior of the corresponding linear continuous updating rules by augmenting them with the reward probability estimates. Extensive experimental results show the superiority of the DBPA over previous estimator algorithms; indeed, the DBPA is probably the fastest reported LA to date. Apart from the rigorous experimental demonstration of the strength of the DBPA, the paper also briefly records the proofs of why the BPA and the DBPA are ε-optimal in stationary environments.
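
    A simplified sketch of the scheme's two ingredients follows, assuming Bernoulli reward feedback: Beta posteriors supply the Bayesian reward estimates (the posterior mean is used here, where the BPA/DBPA use a posterior quantile), and a discretized pursuit rule moves the action probabilities in fixed steps toward the action with the current best estimate. The environment's reward probabilities are hypothetical; this is not the published DBPA verbatim.

```python
import random

random.seed(1)

N_ACTIONS, RESOLUTION = 4, 100
DELTA = 1.0 / (N_ACTIONS * RESOLUTION)     # smallest probability step
TRUE_REWARD = [0.4, 0.6, 0.8, 0.5]         # hypothetical environment

p = [1.0 / N_ACTIONS] * N_ACTIONS          # action-selection probabilities
alpha = [1] * N_ACTIONS                    # Beta(1, 1) priors: alpha = 1 + rewards
beta = [1] * N_ACTIONS                     #                    beta  = 1 + penalties

def sample_action(probs):
    r, acc = random.random(), 0.0
    for i, q in enumerate(probs):
        acc += q
        if r < acc:
            return i
    return len(probs) - 1

for _ in range(20000):
    i = sample_action(p)
    rewarded = random.random() < TRUE_REWARD[i]
    if rewarded:
        alpha[i] += 1                      # update the conjugate posterior
    else:
        beta[i] += 1
    if rewarded:                           # reward-inaction flavour of pursuit
        est = [a / (a + b) for a, b in zip(alpha, beta)]
        m = max(range(N_ACTIONS), key=est.__getitem__)
        for j in range(N_ACTIONS):         # step every other action down by DELTA...
            if j != m:
                p[j] = max(p[j] - DELTA, 0.0)
        p[m] = 1.0 - sum(p[j] for j in range(N_ACTIONS) if j != m)  # ...and pursue m

print("action probabilities:", [round(q, 3) for q in p])
```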

    Generalized pursuit learning schemes: new families of continuous and discretized learning automata

    The design of absorbing Bayesian pursuit algorithms and the formal analyses of their ε-optimality

    The fundamental phenomenon that has been used to enhance the convergence speed of learning automata (LA) is that of incorporating the running maximum likelihood (ML) estimates of the action reward probabilities into the probability updating rules for selecting the actions. The frontiers of this field have recently been expanded by replacing the ML estimates with their Bayesian counterparts, which incorporate the properties of the conjugate priors. These constitute the Bayesian pursuit algorithm (BPA) and the discretized Bayesian pursuit algorithm. Although these algorithms have been designed and efficiently implemented, and are, arguably, the fastest and most accurate LA reported in the literature, the proofs of their ε-optimal convergence have remained open. Providing these proofs is precisely the intent of this paper. We present a single unifying analysis by which the proofs of both the continuous and discretized schemes are established. We emphasize that unlike the ML-based pursuit schemes, the Bayesian schemes must consider not only the estimates themselves but also the distributional forms of their conjugate posteriors and their higher-order moments, all of which render the proofs particularly challenging. As far as we know, apart from the results themselves, the methodologies of this proof have been unreported in the literature; they are both pioneering and novel.
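
    To make the conjugate-prior machinery concrete: with Bernoulli (reward/penalty) feedback, the Beta distribution is the conjugate prior, so starting from Beta(1, 1) and observing r rewards in n trials yields the posterior Beta(1 + r, 1 + n - r). The sketch below (using SciPy; the 95th-percentile read-out is an assumption for illustration, not necessarily the papers' exact quantile) contrasts this posterior-based estimate with the ML ratio r/n.

```python
from scipy.stats import beta

def ml_estimate(r, n):
    # Classical maximum likelihood estimate of a reward probability.
    return r / n if n else 0.0

def bayes_estimate(r, n, quantile=0.95):
    # Quantile of the Beta(1 + r, 1 + n - r) conjugate posterior.
    return beta.ppf(quantile, 1 + r, 1 + n - r)

# The Bayesian estimate starts wide and collapses onto the ML estimate as
# evidence accumulates, which is what makes progressively larger pursuit
# steps sensible once the estimates become accurate.
for r, n in [(2, 3), (20, 30), (200, 300)]:
    print(f"r={r:3d}, n={n:3d}:  ML={ml_estimate(r, n):.3f}  "
          f"Bayes(95%)={bayes_estimate(r, n):.3f}")
```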

    A UNIQUE MATHEMATICAL QUEUING MODEL FOR WIRED AND WIRELESS NETWORKS

    The de facto protocol for transmitting data in wired and wireless networks is the Transmission Control Protocol/Internet Protocol (TCP/IP). While many modifications have been made to adapt TCP/IP to wireless networks, much remains to be done about the bandwidth underutilization caused by the traffic-control actions of the active queue management controllers implemented on modern routers. The main cause of this underutilization is uncertainty in network parameters, which is especially pronounced in wireless networks. In this study, two unique mathematical models for queue management in wired and wireless networks are proposed. The models were derived using a recursive, third-order, discrete-time structure. They are the Model Predictive Controller (MPC) and the Self-Tuning Regulator (STR). The MPC was modeled to tolerate uncertainties in gain, poles, and delay time; the STR, with an assigned closed-loop pole, was modeled to be highly robust to varying network parameters. Theoretically, the proposed models deliver network traffic control that optimizes the use of available bandwidth and minimizes queue length and packet loss in wired and wireless networks.
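
    The sketch below illustrates only the recursive third-order discrete-time structure the abstract refers to: queue length q(k) driven by its previous three samples and a delayed control input u(k - d). All coefficients are hypothetical, and a plain integral regulator stands in for the paper's MPC and STR designs, purely to exercise the model.

```python
import collections

# Assumed model: q(k) = a1*q(k-1) + a2*q(k-2) + a3*q(k-3) + b*u(k-d)
A = (1.2, -0.5, 0.1)       # hypothetical autoregressive coefficients (stable poles)
B, DELAY = 0.8, 2          # hypothetical input gain and input delay, in samples
Q_REF = 50.0               # target queue length (packets)

q = collections.deque([0.0, 0.0, 0.0], maxlen=3)    # q(k-1), q(k-2), q(k-3)
u = collections.deque([0.0] * DELAY, maxlen=DELAY)  # u(k-1), ..., u(k-DELAY)

u_k, gain = 0.0, 0.02
for k in range(300):
    q_next = A[0] * q[0] + A[1] * q[1] + A[2] * q[2] + B * u[-1]
    u_k += gain * (Q_REF - q_next)   # integral action drives the tracking error to zero
    u.appendleft(u_k)
    q.appendleft(q_next)

print(f"queue length after {k + 1} samples: {q[0]:.1f} (target {Q_REF})")
```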

    Reinforcement Learning

    Brains rule the world, and brain-like computation is increasingly used in computers and electronic devices. Brain-like computation is about processing and interpreting data, or directly putting forward and performing actions, and learning is a very important aspect of it. This book is on reinforcement learning, which involves performing actions to achieve a goal. The first eleven chapters describe and extend the scope of reinforcement learning; the remaining eleven show that it is already widely used in numerous fields. Reinforcement learning can tackle control tasks that are too complex for traditional, hand-designed, non-learning controllers. As learning computers can deal with technical complexities, the task of human operators remains to specify goals at increasingly higher levels. This book shows that reinforcement learning is a very dynamic area in terms of theory and applications, and it should stimulate and encourage new research in this field.
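
    As a generic illustration of that action-to-goal loop (not an example taken from the book), here is tabular Q-learning on a hypothetical five-state corridor where the agent is rewarded only for reaching the rightmost state.

```python
import random

random.seed(2)

# The agent starts at state 0 and must learn to keep moving right to
# reach the goal at state 4, which ends the episode with reward 1.
N_STATES, ACTIONS = 5, (-1, +1)           # actions: move left / move right
Q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}
alpha, gamma, eps = 0.5, 0.9, 0.3         # step size, discount, exploration rate

for episode in range(500):
    s = 0
    for _ in range(100):                  # cap the episode length
        if s == N_STATES - 1:             # goal reached: episode over
            break
        a = (random.choice(ACTIONS) if random.random() < eps
             else max(ACTIONS, key=lambda b: Q[(s, b)]))
        s2 = min(max(s + a, 0), N_STATES - 1)
        r = 1.0 if s2 == N_STATES - 1 else 0.0
        best_next = 0.0 if s2 == N_STATES - 1 else max(Q[(s2, b)] for b in ACTIONS)
        Q[(s, a)] += alpha * (r + gamma * best_next - Q[(s, a)])  # TD update
        s = s2

print("greedy policy:", [max(ACTIONS, key=lambda b: Q[(s, b)]) for s in range(N_STATES - 1)])
```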