17,369 research outputs found

    Sensor Scheduling for Energy-Efficient Target Tracking in Sensor Networks

    Full text link
    In this paper we study the problem of tracking an object moving randomly through a network of wireless sensors. Our objective is to devise strategies for scheduling the sensors to optimize the tradeoff between tracking performance and energy consumption. We cast the scheduling problem as a Partially Observable Markov Decision Process (POMDP), where the control actions correspond to the set of sensors to activate at each time step. Using a bottom-up approach, we consider different sensing, motion and cost models with increasing levels of difficulty. At the first level, the sensing regions of the different sensors do not overlap and the target is only observed within the sensing range of an active sensor. Then, we consider sensors with overlapping sensing range such that the tracking error, and hence the actions of the different sensors, are tightly coupled. Finally, we consider scenarios wherein the target locations and sensors' observations assume values on continuous spaces. Exact solutions are generally intractable even for the simplest models due to the dimensionality of the information and action spaces. Hence, we devise approximate solution techniques, and in some cases derive lower bounds on the optimal tradeoff curves. The generated scheduling policies, albeit suboptimal, often provide close-to-optimal energy-tracking tradeoffs

    Two Timescale Convergent Q-learning for Sleep--Scheduling in Wireless Sensor Networks

    Full text link
    In this paper, we consider an intrusion detection application for Wireless Sensor Networks (WSNs). We study the problem of scheduling the sleep times of the individual sensors to maximize the network lifetime while keeping the tracking error to a minimum. We formulate this problem as a partially-observable Markov decision process (POMDP) with continuous state-action spaces, in a manner similar to (Fuemmeler and Veeravalli [2008]). However, unlike their formulation, we consider infinite horizon discounted and average cost objectives as performance criteria. For each criterion, we propose a convergent on-policy Q-learning algorithm that operates on two timescales, while employing function approximation to handle the curse of dimensionality associated with the underlying POMDP. Our proposed algorithm incorporates a policy gradient update using a one-simulation simultaneous perturbation stochastic approximation (SPSA) estimate on the faster timescale, while the Q-value parameter (arising from a linear function approximation for the Q-values) is updated in an on-policy temporal difference (TD) algorithm-like fashion on the slower timescale. The feature selection scheme employed in each of our algorithms manages the energy and tracking components in a manner that assists the search for the optimal sleep-scheduling policy. For the sake of comparison, in both discounted and average settings, we also develop a function approximation analogue of the Q-learning algorithm. This algorithm, unlike the two-timescale variant, does not possess theoretical convergence guarantees. Finally, we also adapt our algorithms to include a stochastic iterative estimation scheme for the intruder's mobility model. Our simulation results on a 2-dimensional network setting suggest that our algorithms result in better tracking accuracy at the cost of only a few additional sensors, in comparison to a recent prior work

    Fast-Convergent Learning-aided Control in Energy Harvesting Networks

    Full text link
    In this paper, we present a novel learning-aided energy management scheme (LEM\mathtt{LEM}) for multihop energy harvesting networks. Different from prior works on this problem, our algorithm explicitly incorporates information learning into system control via a step called \emph{perturbed dual learning}. LEM\mathtt{LEM} does not require any statistical information of the system dynamics for implementation, and efficiently resolves the challenging energy outage problem. We show that LEM\mathtt{LEM} achieves the near-optimal [O(ϵ),O(log(1/ϵ)2)][O(\epsilon), O(\log(1/\epsilon)^2)] utility-delay tradeoff with an O(1/ϵ1c/2)O(1/\epsilon^{1-c/2}) energy buffers (c(0,1)c\in(0,1)). More interestingly, LEM\mathtt{LEM} possesses a \emph{convergence time} of O(1/ϵ1c/2+1/ϵc)O(1/\epsilon^{1-c/2} +1/\epsilon^c), which is much faster than the Θ(1/ϵ)\Theta(1/\epsilon) time of pure queue-based techniques or the Θ(1/ϵ2)\Theta(1/\epsilon^2) time of approaches that rely purely on learning the system statistics. This fast convergence property makes LEM\mathtt{LEM} more adaptive and efficient in resource allocation in dynamic environments. The design and analysis of LEM\mathtt{LEM} demonstrate how system control algorithms can be augmented by learning and what the benefits are. The methodology and algorithm can also be applied to similar problems, e.g., processing networks, where nodes require nonzero amount of contents to support their actions

    Thirty Years of Machine Learning: The Road to Pareto-Optimal Wireless Networks

    Full text link
    Future wireless networks have a substantial potential in terms of supporting a broad range of complex compelling applications both in military and civilian fields, where the users are able to enjoy high-rate, low-latency, low-cost and reliable information services. Achieving this ambitious goal requires new radio techniques for adaptive learning and intelligent decision making because of the complex heterogeneous nature of the network structures and wireless services. Machine learning (ML) algorithms have great success in supporting big data analytics, efficient parameter estimation and interactive decision making. Hence, in this article, we review the thirty-year history of ML by elaborating on supervised learning, unsupervised learning, reinforcement learning and deep learning. Furthermore, we investigate their employment in the compelling applications of wireless networks, including heterogeneous networks (HetNets), cognitive radios (CR), Internet of things (IoT), machine to machine networks (M2M), and so on. This article aims for assisting the readers in clarifying the motivation and methodology of the various ML algorithms, so as to invoke them for hitherto unexplored services as well as scenarios of future wireless networks.Comment: 46 pages, 22 fig

    Learning Scheduling Algorithms for Data Processing Clusters

    Full text link
    Efficiently scheduling data processing jobs on distributed compute clusters requires complex algorithms. Current systems, however, use simple generalized heuristics and ignore workload characteristics, since developing and tuning a scheduling policy for each workload is infeasible. In this paper, we show that modern machine learning techniques can generate highly-efficient policies automatically. Decima uses reinforcement learning (RL) and neural networks to learn workload-specific scheduling algorithms without any human instruction beyond a high-level objective such as minimizing average job completion time. Off-the-shelf RL techniques, however, cannot handle the complexity and scale of the scheduling problem. To build Decima, we had to develop new representations for jobs' dependency graphs, design scalable RL models, and invent RL training methods for dealing with continuous stochastic job arrivals. Our prototype integration with Spark on a 25-node cluster shows that Decima improves the average job completion time over hand-tuned scheduling heuristics by at least 21%, achieving up to 2x improvement during periods of high cluster load
    corecore