5,735 research outputs found

    SHADHO: Massively Scalable Hardware-Aware Distributed Hyperparameter Optimization

    Full text link
    Computer vision is experiencing an AI renaissance, in which machine learning models are expediting important breakthroughs in academic research and commercial applications. Effectively training these models, however, is not trivial due in part to hyperparameters: user-configured values that control a model's ability to learn from data. Existing hyperparameter optimization methods are highly parallel but make no effort to balance the search across heterogeneous hardware or to prioritize searching high-impact spaces. In this paper, we introduce a framework for massively Scalable Hardware-Aware Distributed Hyperparameter Optimization (SHADHO). Our framework calculates the relative complexity of each search space and monitors performance on the learning task over all trials. These metrics are then used as heuristics to assign hyperparameters to distributed workers based on their hardware. We first demonstrate that our framework achieves double the throughput of a standard distributed hyperparameter optimization framework by optimizing SVM for MNIST using 150 distributed workers. We then conduct model search with SHADHO over the course of one week using 74 GPUs across two compute clusters to optimize U-Net for a cell segmentation task, discovering 515 models that achieve a lower validation loss than standard U-Net.Comment: 10 pages, 6 figure

    A Novel Cross Entropy Approach for Offloading Learning in Mobile Edge Computing

    Get PDF
    In this letter, we propose a novel offloading learning approach to compromise energy consumption and latency in a multi-tier network with mobile edge computing. In order to solve this integer programming problem, instead of using conventional optimization tools, we apply a cross entropy approach with iterative learning of the probability of elite solution samples. Compared to existing methods, the proposed one in this network permits a parallel computing architecture and is verified to be computationally very efficient. Specifically, it achieves performance close to the optimal and performs well with different choices of the values of hyperparameters in the proposed learning approach

    Scheduling Storms and Streams in the Cloud

    Full text link
    Motivated by emerging big streaming data processing paradigms (e.g., Twitter Storm, Streaming MapReduce), we investigate the problem of scheduling graphs over a large cluster of servers. Each graph is a job, where nodes represent compute tasks and edges indicate data-flows between these compute tasks. Jobs (graphs) arrive randomly over time, and upon completion, leave the system. When a job arrives, the scheduler needs to partition the graph and distribute it over the servers to satisfy load balancing and cost considerations. Specifically, neighboring compute tasks in the graph that are mapped to different servers incur load on the network; thus a mapping of the jobs among the servers incurs a cost that is proportional to the number of "broken edges". We propose a low complexity randomized scheduling algorithm that, without service preemptions, stabilizes the system with graph arrivals/departures; more importantly, it allows a smooth trade-off between minimizing average partitioning cost and average queue lengths. Interestingly, to avoid service preemptions, our approach does not rely on a Gibbs sampler; instead, we show that the corresponding limiting invariant measure has an interpretation stemming from a loss system.Comment: 14 page
    • …
    corecore