
    Learning Scheduling Algorithms for Data Processing Clusters

    Efficiently scheduling data processing jobs on distributed compute clusters requires complex algorithms. Current systems, however, use simple generalized heuristics and ignore workload characteristics, since developing and tuning a scheduling policy for each workload is infeasible. In this paper, we show that modern machine learning techniques can generate highly efficient policies automatically. Decima uses reinforcement learning (RL) and neural networks to learn workload-specific scheduling algorithms without any human instruction beyond a high-level objective such as minimizing average job completion time. Off-the-shelf RL techniques, however, cannot handle the complexity and scale of the scheduling problem. To build Decima, we had to develop new representations for jobs' dependency graphs, design scalable RL models, and invent RL training methods for dealing with continuous stochastic job arrivals. Our prototype integration with Spark on a 25-node cluster shows that Decima improves the average job completion time over hand-tuned scheduling heuristics by at least 21%, achieving up to 2x improvement during periods of high cluster load.

    Spectral and Energy Efficiency in Cognitive Radio Systems with Unslotted Primary Users and Sensing Uncertainty

    This paper studies energy efficiency (EE) and average throughput maximization for cognitive radio systems in the presence of unslotted primary users. It is assumed that primary user activity follows an ON-OFF alternating renewal process. Secondary users first sense the channel, possibly with errors in the form of missed detections and false alarms, and then start data transmission only if no primary user activity is detected. The secondary user transmission is subject to a constraint on the collision duration ratio, which is defined as the ratio of average collision duration to transmission duration. In this setting, the optimal power control policies that maximize the EE of the secondary users, or maximize the average throughput while satisfying a minimum required EE, under average/peak transmit power and average interference power constraints are derived. Subsequently, low-complexity algorithms for jointly determining the optimal power level and frame duration are proposed. The impact of the probabilities of detection and false alarm and of the transmit and interference power constraints on the EE, the average throughput of the secondary users, the optimal transmission power, and the collisions with primary user transmissions is evaluated. In addition, some important properties of the collision duration ratio are investigated. The tradeoff between the EE and average throughput under imperfect sensing decisions and different primary user traffic is further analyzed. Comment: This paper is accepted for publication in IEEE Transactions on Communication
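The sense-then-transmit rule described in this abstract can be illustrated with a short probability sketch. This is not the paper's model: it ignores primary user state changes during a frame (the unslotted aspect), and the function names and example numbers are hypothetical. It only uses the standard fact that an ON-OFF alternating renewal process is OFF for a long-run fraction mean_off / (mean_on + mean_off) of the time.

```python
def idle_probability(mean_on, mean_off):
    """Long-run fraction of time the primary user is OFF (channel idle)."""
    return mean_off / (mean_on + mean_off)


def transmit_probability(p_idle, p_detect, p_false_alarm):
    """Probability that sensing reports 'idle', so the secondary user transmits.

    Idle channel: transmit unless a false alarm occurs.
    Busy channel: transmit only on a missed detection.
    """
    return p_idle * (1.0 - p_false_alarm) + (1.0 - p_idle) * (1.0 - p_detect)


def busy_given_transmit(p_idle, p_detect, p_false_alarm):
    """Probability the primary user is actually ON when a transmission starts."""
    p_tx = transmit_probability(p_idle, p_detect, p_false_alarm)
    return (1.0 - p_idle) * (1.0 - p_detect) / p_tx


# Hypothetical example values, chosen only for illustration.
p_idle = idle_probability(mean_on=1.0, mean_off=3.0)
p_tx = transmit_probability(p_idle, p_detect=0.9, p_false_alarm=0.1)
p_busy = busy_given_transmit(p_idle, p_detect=0.9, p_false_alarm=0.1)
```

Raising the detection probability lowers the chance of starting a transmission on a busy channel, while raising the false alarm probability wastes idle transmission opportunities, which is the tension behind the tradeoffs the abstract mentions.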

    In-Network Distributed Solar Current Prediction

    Long-term sensor network deployments demand careful power management. While managing power requires understanding the amount of energy harvestable from the local environment, current solar prediction methods rely only on recent local history, which makes them susceptible to high variability. In this paper, we present a model and algorithms for distributed solar current prediction that use multiple linear regression to predict future solar current from local, in-situ climatic and solar measurements. These algorithms leverage spatial information from neighbors and adapt to the changing local conditions not captured by global climatic information. We implement these algorithms on our Fleck platform and run a 7-week-long experiment validating our work. In analyzing our results from this experiment, we determined that computing our model requires an increased energy expenditure of 4.5 mJ over simpler models (on the order of 10^{-7}% of the harvested energy) to gain a prediction improvement of 39.7%. Comment: 28 pages, accepted at TOSN and awaiting publicatio
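The multiple linear regression mentioned in this abstract can be sketched as an ordinary least-squares fit. This is a minimal illustration, not the authors' in-network algorithm: the three feature columns (a local current reading, a neighbor's reading, a climatic measurement), the synthetic data, and the function names are all assumptions made for the example.

```python
import numpy as np


def fit_linear_predictor(features, targets):
    """Least-squares fit of targets ~ features @ w + b; returns [w..., b]."""
    design = np.column_stack([features, np.ones(len(features))])
    coef, *_ = np.linalg.lstsq(design, targets, rcond=None)
    return coef


def predict(coef, features):
    """Apply fitted coefficients (weights followed by intercept) to new rows."""
    design = np.column_stack([features, np.ones(len(features))])
    return design @ coef


# Synthetic, noiseless data standing in for real sensor measurements.
# Hypothetical columns: local current, neighbor current, a climatic reading.
rng = np.random.default_rng(0)
features = rng.uniform(0.0, 1.0, size=(200, 3))
true_coef = np.array([0.6, 0.3, -0.2, 0.05])  # three weights plus intercept
targets = np.column_stack([features, np.ones(200)]) @ true_coef

coef = fit_linear_predictor(features, targets)
predicted = predict(coef, features[:5])
```

On a real node, the fit would have to respect the energy budget the abstract quantifies; this sketch only shows the regression step itself.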