Search CORE

12 research outputs found

On Optimality of Myopic Policy for Restless Multi-armed Bandit Problem with Non i.i.d. Arms and Imperfect Detection

Authors: Agha Khaldoun Al; Chen Lin; Liu Quan; Wang Kehao
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 31/05/2012

We consider the channel access problem in a multi-channel opportunistic communication system with imperfect channel sensing, where the state of each channel evolves as a non independent and identically distributed Markov process. This problem can be cast into a restless multi-armed bandit (RMAB) problem that is intractable because of its exponential computational complexity. A natural alternative is to consider the easily implementable myopic policy that maximizes the immediate reward but ignores the impact of the current strategy on the future reward. In particular, we develop three axioms characterizing a family of generic and practically important functions termed g-regular functions, which includes a wide spectrum of utility functions in engineering. By pursuing a mathematical analysis based on the axioms, we establish a set of closed-form structural conditions for the optimality of the myopic policy. Comment: Second version, 16 page

arXiv.org e-Print Archive

Crossref

Monitoring and control of stochastic systems

Author: Kuhn J.
Publication date: 01/01/2017

International Migration, Integration and Social Cohesion online publications

Optimal and Suboptimal Policies for Opportunistic Spectrum Access: A Resource Allocation Approach.

Author: Haji Ali Ahmad Sahand

In recent years there has been significant research on increasing the efficiency of spectrum use. This concept, known as smart radio or cognitive radio, has received widespread attention from companies such as Google and Motorola, which are seeking contracts with the FCC and designing smart radios that can use unused bandwidth and spectrum to transmit their signals without interfering with the signals of primary users. In this thesis, we study several problems related to resource allocation in wireless networks by modeling them as game theory and stochastic control problems. In the first problem, we looked at methods for designing optimal cognitive radios that use optimal and suboptimal sensing policies to maximize their long-term expected reward over a finite or infinite horizon. We proved that, when channels are bursty and the user can select and probe only one channel, the optimal policy for the radio is a greedy one: at each moment, probe the channel with the highest probability of being available for transmission. In the second problem, we modeled resource allocation as a congestion game and studied the existence of a Nash equilibrium for such a game. In the last problem, we studied a more general case of the first problem, in which the primary user can select multiple channels at a time in order to sense them. Again, the goal of the cognitive radio in this case is to select for sensing those channels that provide it with the highest expected reward over the respective horizon, where reward comes from successfully probing a channel and transmitting through it. We summarized all results in the conclusion chapter. Ph.D. Electrical Engineering: Systems. University of Michigan, Horace H. Rackham School of Graduate Studies. http://deepblue.lib.umich.edu/bitstream/2027.42/78778/1/shajiali_1.pd

Deep Blue Documents at the University of Michigan

Anti-jamming communication in cognitive radio networks with unknown channel Statistics

Authors: Kui Ren; Peng Ning; Qian Wang
Publication date: 01/01/2011

Recently, many opportunistic spectrum sensing and access protocols have been proposed for cognitive radio networks (CRNs). For achieving optimized spectrum usage, existing solutions model the spectrum sensing and access problem as a partially observed Markov decision process (POMDP) and assume that the information states and/or the primary users' (PUs) traffic statistics are known a priori to the secondary users (SUs). While theoretically sound, these existing approaches may not be effective in practice due to two main concerns. First, the assumptions they made are not practical, as before the communication starts, PUs' traffic statistics may not be readily available to the SUs. Secondly and more seriously, existing approaches are extremely vulnerable to malicious jamming attacks. A cognitive attacker can always jam the channels to be accessed by leveraging the same statistic information and stochastic dynamic decision making process that the SUs would follow. To address the above concerns, we formulate the problem of anti-jamming multichannel access in CRNs and solve it as a non-stochastic multiarmed bandit (NS-MAB) problem, where the secondary sender and receiver adaptively choose their arms (i.e., sending and receiving channels) to operate. The proposed protocol enables them to hop to the same set of channels with high probability in the presence of jamming. We analytically show the convergence of the learning algorithms, i.e., the performance difference between the secondary sender and receiver's optimal strategies is no more than O( T n ln n). Extensive simulations are conducted to validate the theoretical analysis and show that the proposed protocol is highly resilient to various jamming attacks

CiteSeerX

Thirty Years of Machine Learning: The Road to Pareto-Optimal Wireless Networks

Authors: Chen Kwang-Cheng; Hanzo Lajos; Jiang Chunxiao; Ren Yong; Wang Jingjing; Zhang Haijun
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 13/01/2019

Future wireless networks have substantial potential to support a broad range of complex, compelling applications in both military and civilian fields, where users are able to enjoy high-rate, low-latency, low-cost and reliable information services. Achieving this ambitious goal requires new radio techniques for adaptive learning and intelligent decision making because of the complex heterogeneous nature of the network structures and wireless services. Machine learning (ML) algorithms have had great success in supporting big data analytics, efficient parameter estimation and interactive decision making. Hence, in this article, we review the thirty-year history of ML by elaborating on supervised learning, unsupervised learning, reinforcement learning and deep learning. Furthermore, we investigate their employment in the compelling applications of wireless networks, including heterogeneous networks (HetNets), cognitive radios (CR), Internet of things (IoT), machine to machine networks (M2M), and so on. This article aims to assist readers in clarifying the motivation and methodology of the various ML algorithms, so as to invoke them for hitherto unexplored services as well as scenarios of future wireless networks. Comment: 46 pages, 22 fig

arXiv.org e-Print Archive

Southampton (e-Prints Soton)

Evaluation of the Intelligence Collection and Analysis Process

Author: Stark Livia
Publication venue: Lancaster University
Publication date: 01/11/2022

Intelligence is a critical tool in modern security operations that provides insight into current and future operational conditions. It is a concept that transfers to other applications where monitoring activities or situations is imperative, such as ecological research. As technological advances in the past decades have led to increased availability of potential intelligence, we concentrate on source selection to ensure the resulting intelligence is of high quality and fit for purpose. We wish to bring focus to the more varied nature of intelligence than what is currently reflected in models of its collection and evaluation. Therefore, we examine the intelligence collection and analysis process in two separate scenarios: one treats it as an ongoing strategic activity; in the other, intelligence collection is carried out with an investigative intent. The first problem we formulate concerns source selection with a random time delay in feedback, corresponding to the collection and evaluation time of the intelligence. Both the distribution of this time delay and the outcome of the intelligence evaluation are unknown, giving rise to the classic exploration-exploitation dilemma in a long-run setting. We develop promising approaches to accommodate the novel features of the model based on Gittins indices and the knowledge gradient, and examine the issues presented when incorporating structures of dependence between the time delay and the outcome of the evaluation. Next, we develop a novel intelligence collection problem rooted in tactical level source selection, aiming to piece together an intelligence picture comprised of multiple types of information, for example, where and when an attack is planned. We demonstrate that when all elements of the model are known, dynamic programming provides the optimal policy. When some elements are unknown, which introduces an exploration-exploitation aspect to the model, we find that in certain cases the ability to learn is severely limited

Lancaster E-Prints

Uncertainty in Artificial Intelligence: Proceedings of the Thirty-Fourth Conference

Publication venue: AUAI Press
Publication date: 01/09/2018

UCL Discovery