Online learning and decision-making from implicit feedback

Author: Krishnasamy Subhashini
Publication date: 20/06/2017

This thesis focuses on designing learning and control algorithms for emerging resource allocation platforms such as recommender systems, 5G wireless networks, and online marketplaces. These systems operate in an environment that is only partially known. The controllers therefore need to make resource allocation decisions using implicit feedback that the environment returns in response to past actions. The goal is to select actions sequentially using incremental feedback so as to optimize performance while simultaneously learning about the environment. We study three problems that exemplify this setting. The first is an inference problem that requires identifying sponsored content in recommender systems. Specifically, we ask whether it is possible to detect the existence of sponsored content disguised as genuine recommendations using implicit feedback from a subset of users of the recommender system. The second problem is the design of scheduling algorithms for switch networks when the user-server link statistics are unknown (e.g., in wireless networks and online marketplaces). The scheduling algorithm has to trade off scheduling the optimal links against obtaining sufficient feedback about all the links for accurate estimates. We observe the close connection of this problem to the stochastic multi-armed bandit problem and analyze bandit-style explore-exploit algorithms for learning the statistical parameters while simultaneously assigning servers to users. The third is the joint problem of base station (BS) activation and rate allocation in an energy-efficient wireless network when the channel statistics are unknown. The controller observes instantaneous channel rates of activated BSs, and thereby sequentially obtains implicit feedback about the channel. Here again, there is a tradeoff between learning the channel and optimizing the operation cost based on estimated parameters. For each of these systems, we propose algorithms with provable asymptotic guarantees. These learning algorithms highlight the use of implicit feedback in online decision making and control.

Electrical and Computer Engineering
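The explore-exploit tradeoff described in the second problem can be illustrated with a small sketch. The abstract does not say which bandit algorithm the thesis analyzes, so the code below uses the standard UCB1 rule as a stand-in, and it simplifies the setting to a single server choosing one link per slot. The function name `ucb1_schedule`, the Bernoulli link model, and the example success probabilities are all assumptions made for illustration.

```python
import math
import random


def ucb1_schedule(success_probs, horizon, seed=0):
    """Schedule one link per slot with UCB1 (illustrative stand-in).

    success_probs: hypothetical, unknown-to-the-scheduler Bernoulli
    success rates, used here only to simulate feedback.
    Returns (times each link was scheduled, total successes).
    """
    rng = random.Random(seed)
    n_links = len(success_probs)
    counts = [0] * n_links   # how often each link has been scheduled
    means = [0.0] * n_links  # empirical success rate of each link
    total = 0
    for t in range(1, horizon + 1):
        if t <= n_links:
            # Schedule every link once so each estimate is initialised.
            link = t - 1
        else:
            # Exploit the empirical mean, explore via the confidence bonus.
            link = max(
                range(n_links),
                key=lambda i: means[i] + math.sqrt(2 * math.log(t) / counts[i]),
            )
        # Implicit feedback: only the scheduled link reveals its outcome.
        reward = 1 if rng.random() < success_probs[link] else 0
        counts[link] += 1
        means[link] += (reward - means[link]) / counts[link]
        total += reward
    return counts, total


if __name__ == "__main__":
    counts, total = ucb1_schedule([0.2, 0.5, 0.9], horizon=2000)
    print("schedule counts per link:", counts)
    print("total successful transmissions:", total)
```

The confidence bonus shrinks as a link is scheduled more often, so rarely used links are still sampled occasionally while the scheduler concentrates on the link with the best estimated rate.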

Texas ScholarWorks

Thirty Years of Machine Learning: The Road to Pareto-Optimal Wireless Networks

Author: Chen Kwang-Cheng
Hanzo Lajos
Jiang Chunxiao
Ren Yong
Wang Jingjing
Zhang Haijun
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 13/01/2019

Future wireless networks have substantial potential to support a broad range of complex, compelling applications in both military and civilian fields, where users are able to enjoy high-rate, low-latency, low-cost and reliable information services. Achieving this ambitious goal requires new radio techniques for adaptive learning and intelligent decision making because of the complex, heterogeneous nature of the network structures and wireless services. Machine learning (ML) algorithms have had great success in supporting big data analytics, efficient parameter estimation and interactive decision making. Hence, in this article, we review the thirty-year history of ML by elaborating on supervised learning, unsupervised learning, reinforcement learning and deep learning. Furthermore, we investigate their employment in compelling applications of wireless networks, including heterogeneous networks (HetNets), cognitive radios (CR), the Internet of Things (IoT), machine-to-machine networks (M2M), and so on. This article aims to assist readers in clarifying the motivation and methodology of the various ML algorithms, so as to invoke them for hitherto unexplored services and scenarios of future wireless networks.

Comment: 46 pages, 22 figures

arXiv.org e-Print Archive

Southampton (e-Prints Soton)

An Online Approach to Dynamic Channel Access and Transmission Scheduling

Author: Borst S.
Dams J.
Liu Y.
Yang X.
Publication date: 04/04/2015

Making judicious channel access and transmission scheduling decisions is essential for improving performance as well as energy and spectral efficiency in multichannel wireless systems. This problem has been a subject of extensive study in the past decade, and the resulting dynamic and opportunistic channel access schemes can bring potentially significant improvement over traditional schemes. However, a common and severe limitation of these dynamic schemes is that they almost always require some form of a priori knowledge of the channel statistics. A natural remedy is a learning framework, which has also been extensively studied in the same context, but a typical learning algorithm in this literature seeks only the best static policy, with performance measured by weak regret, rather than learning a good dynamic channel access policy. There is thus a clear disconnect between what an optimal channel access policy can achieve with known channel statistics by actively exploiting temporal, spatial and spectral diversity, and what a typical existing learning algorithm aims for, which is the static use of a single channel devoid of diversity gain. In this paper we bridge this gap by designing learning algorithms that track known optimal or sub-optimal dynamic channel access and transmission scheduling policies, thereby yielding performance measured by a form of strong regret: the accumulated difference between the reward returned by an optimal solution when a priori information is available and that returned by our online algorithm. We do so in the context of two specific algorithms that appeared in [1] and [2], respectively, the former for a multiuser single-channel setting and the latter for a single-user multichannel setting. In both cases we show that our algorithms achieve sub-linear regret uniform in time and outperform the standard weak-regret learning algorithms.

Comment: 10 pages, to appear in MobiHoc 201
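The difference between the two regret notions in this abstract can be written out as follows. This is a sketch in assumed notation, not the paper's own: $K$ is the number of channels, $r_t^{(k)}$ is the reward at slot $t$ from always using channel $k$, $r_t^{\pi^*}$ is the reward of the optimal dynamic policy $\pi^*$ computed with known channel statistics, and $r_t^{\mathrm{alg}}$ is the reward of the online learning algorithm.

```latex
% Weak regret: the benchmark is the best single (static) channel.
R_{\mathrm{weak}}(T)
  = \max_{k \in \{1,\dots,K\}} \mathbb{E}\Big[\sum_{t=1}^{T} r_t^{(k)}\Big]
  - \mathbb{E}\Big[\sum_{t=1}^{T} r_t^{\mathrm{alg}}\Big]

% Strong regret: the benchmark is the optimal dynamic policy \pi^*
% that is available when a priori channel statistics are known.
R_{\mathrm{strong}}(T)
  = \mathbb{E}\Big[\sum_{t=1}^{T} r_t^{\pi^*}\Big]
  - \mathbb{E}\Big[\sum_{t=1}^{T} r_t^{\mathrm{alg}}\Big]
```

Under either definition, sub-linear regret means $R(T)/T \to 0$ as $T$ grows, so the time-averaged reward of the learner approaches that of the benchmark.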

arXiv.org e-Print Archive

Crossref

Throughput Optimal Scheduling with Dynamic Channel Feedback

Author: Alpcan Tansu
Boche Holger
Erçetin Özgür
Karaca Mehmet
Sarıkaya Yunus
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 26/03/2012

It is well known that opportunistic scheduling algorithms are throughput optimal under full knowledge of channel and network conditions. However, these algorithms achieve a hypothetical rate region that does not take into account the overhead associated with the channel probing and feedback required to obtain the full channel state information at every slot. We adopt a channel probing model where a $\beta$ fraction of a time slot is consumed for acquiring the channel state information (CSI) of a single channel. In this work, we design a joint scheduling and channel probing algorithm named SDF by considering the overhead of obtaining the channel state information. We first prove analytically that the SDF algorithm can support a $1+\epsilon$ fraction of the full rate region achieved when all users are probed, where $\epsilon$ depends on the expected number of users that are not probed. Then, for homogeneous channels, we show that when the number of users in the network is greater than 3, $\epsilon > 0$, i.e., the rate region is guaranteed to expand. In addition, for heterogeneous channels, we prove the conditions under which SDF is guaranteed to increase the rate region. We also demonstrate numerically in a realistic simulation setting that this rate region can be achieved by probing fewer than 50% of all channels in a CDMA-based cellular network utilizing a high data rate protocol under normal channel conditions.

Comment: submitte

arXiv.org e-Print Archive

Crossref

Sabanci University Research Database