
    Split-and-Merge Method for Accelerating Convergence of Stochastic Linear Programs

    Abstract: Stochastic program optimizations are computationally very expensive, especially when the number of scenarios is large. The complexity of the focal application and the slow convergence rate add to the computational burden. We propose a split-and-merge (SAM) method for accelerating the convergence of stochastic linear programs. SAM splits the original problem into subproblems and uses the dual constraints from the subproblems to accelerate the convergence of the original problem. Our initial results are very encouraging, giving up to a 58% reduction in optimization time. In this paper we discuss the initial results, the ongoing work, and the future work.
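
    The abstract does not give the formulation; for context, a generic two-stage stochastic LP and the dual-based (Benders-style) optimality cut that split-and-merge methods of this kind aim to generate faster can be written roughly as follows (the notation is assumed here, not taken from the paper):

```latex
% Two-stage stochastic LP over scenarios s with probabilities p_s:
\min_{x \ge 0} \; c^\top x + \sum_s p_s\, Q_s(x),
\qquad
Q_s(x) = \min_{y_s \ge 0} \{\, q_s^\top y_s \;:\; W y_s = h_s - T_s x \,\}.

% Benders (L-shaped) optimality cut built from subproblem dual solutions \pi_s,
% added to the master problem  \min_{x \ge 0,\, \theta} \; c^\top x + \theta :
\theta \;\ge\; \sum_s p_s\, \pi_s^\top (h_s - T_s x).
```

    In this picture, splitting the scenario set lets groups of subproblems produce dual information separately before it is merged back into the original problem, which appears to be the intuition behind SAM.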

    Gunrock: GPU Graph Analytics

    For large-scale graph analytics on the GPU, the irregularity of data access and control flow, and the complexity of programming GPUs, have presented two significant challenges to developing a programmable high-performance graph library. "Gunrock", our graph-processing system designed specifically for the GPU, uses a high-level, bulk-synchronous, data-centric abstraction focused on operations on a vertex or edge frontier. Gunrock achieves a balance between performance and expressiveness by coupling high-performance GPU computing primitives and optimization strategies with a high-level programming model that allows programmers to quickly develop new graph primitives with small code size and minimal GPU programming knowledge. We characterize the performance of various optimization strategies and evaluate Gunrock's overall performance on different GPU architectures on a wide range of graph primitives that span from traversal-based and ranking algorithms to triangle counting and bipartite-graph-based algorithms. The results show that on a single GPU, Gunrock has on average at least an order of magnitude speedup over Boost and PowerGraph, comparable performance to the fastest GPU hardwired primitives and CPU shared-memory graph libraries such as Ligra and Galois, and better performance than any other GPU high-level graph library.
    Comment: 52 pages; invited paper to ACM Transactions on Parallel Computing (TOPC); an extended version of the PPoPP'16 paper "Gunrock: A High-Performance Graph Processing Library on the GPU".
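
    As a rough illustration of the frontier-centric, bulk-synchronous abstraction described above (plain Python for readability only; Gunrock runs these steps as data-parallel CUDA kernels, and this is not its API), a level-synchronous BFS expressed as repeated advance/filter operations on a frontier might look like:

```python
def bfs_frontier(adj, source):
    """Level-synchronous BFS as repeated 'advance'/'filter' steps on a frontier.

    adj: dict mapping each vertex to a list of neighbors (an assumed input format).
    Returns a dict of BFS depths.
    """
    depth = {source: 0}
    frontier = [source]
    while frontier:
        next_frontier = []
        for u in frontier:              # "advance": expand every vertex in the frontier
            for v in adj.get(u, []):
                if v not in depth:      # "filter": keep only newly discovered vertices
                    depth[v] = depth[u] + 1
                    next_frontier.append(v)
        frontier = next_frontier        # bulk-synchronous step boundary
    return depth

if __name__ == "__main__":
    g = {0: [1, 2], 1: [3], 2: [3], 3: []}
    print(bfs_frontier(g, 0))           # {0: 0, 1: 1, 2: 1, 3: 2}
```

    The point of the abstraction is that different graph primitives (ranking, triangle counting, and so on) reuse the same frontier machinery with different per-vertex or per-edge operators.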

    Parallel algorithms for two-stage stochastic optimization

    We develop scalable algorithms for two-stage stochastic program optimization. We propose performance optimizations such as a cut-window mechanism in Stage 1 and scenario clustering in Stage 2 of the Benders method for solving two-stage stochastic programs. A naive implementation of the Benders method has a slow convergence rate and does not scale well to a large number of processors, especially when the problem size is large and/or there are integer variables in Stage 1. Parallelization of stochastic integer programs poses unique challenges that make them very difficult to parallelize. We develop a Parallel Stochastic Integer Program Solver (PSIPS) that exploits nested parallelism by exploring the branch-and-bound tree vertices in parallel along with scenario parallelization. PSIPS has been shown to have a parallel efficiency greater than 40% at 120 cores, which is significantly higher than that of state-of-the-art mixed-integer program solvers. A significant portion of the time in this branch-and-bound solver is spent optimizing the stochastic linear program at the root vertex. Stochastic linear programs at other vertices of the branch-and-bound tree require far fewer iterations to converge because they can inherit Benders cuts from their parent vertices and/or the root. It is therefore important to reduce the optimization time of the stochastic linear program at the root vertex. We propose two decomposition schemes, the Split-and-Merge (SAM) method and the Lagrangian Decomposition and Merge (LDAM) method, that significantly increase the convergence rate of Benders decomposition. The SAM method gives up to a 64% reduction in solution time while also giving significantly higher parallel speedups than the naive Benders method. The LDAM method, in turn, has made it possible to solve otherwise intractable stochastic programs. We further provide a computational engine for many real-time and dynamic problems faced by the US Air Mobility Command. We first propose a stochastic programming solution to the military aircraft allocation problem with consideration for disaster management. We then study US AMC's dynamic mission re-planning problem and propose a mathematical formulation that is computationally feasible and leads to significant cost savings compared to myopic and deterministic optimization. It is expected that this work will provide the springboard for more robust problem solving with HPC in many logistics and planning problems.
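
    To make the Benders (L-shaped) loop that this abstract repeatedly refers to concrete, here is a minimal serial sketch on a toy single-variable instance, assuming SciPy is installed; the data, bounds, and closed-form subproblem are illustrative inventions, and the sketch omits the cut-window mechanism, scenario clustering, nested branch-and-bound parallelism, and the SAM/LDAM schemes that are the thesis's actual contributions:

```python
import numpy as np
from scipy.optimize import linprog

# Toy two-stage instance (NOT from the thesis):
#   min_x  x + E_s[Q_s(x)],   0 <= x <= 10,
#   Q_s(x) = min { 2*y : y >= d_s - x, y >= 0 }   (pay 2 per unit of unmet demand d_s)
demands = np.array([1.0, 2.0, 3.0])
probs = np.array([1 / 3, 1 / 3, 1 / 3])

def subproblem(x, d):
    """Value and dual multiplier of one scenario subproblem, in closed form.
    (A real implementation would call an LP solver and read off the duals.)"""
    value = 2.0 * max(d - x, 0.0)
    dual = 2.0 if d > x else 0.0        # dual of the constraint  y >= d - x
    return value, dual

cuts_A, cuts_b = [], []                 # accumulated Benders optimality cuts
upper = np.inf
x = 0.0
for it in range(20):
    vals, duals = zip(*(subproblem(x, d) for d in demands))
    upper = min(upper, x + float(np.dot(probs, vals)))    # feasible objective value
    grad = float(np.dot(probs, duals))                    # cut: theta >= g - grad * x
    g = float(np.dot(probs, np.array(duals) * demands))
    cuts_A.append([-grad, -1.0])                          # -grad*x - theta <= -g
    cuts_b.append(-g)

    # Master problem over (x, theta): min x + theta subject to all cuts so far.
    res = linprog(c=[1.0, 1.0], A_ub=cuts_A, b_ub=cuts_b,
                  bounds=[(0.0, 10.0), (0.0, None)], method="highs")
    x, theta = res.x
    if upper - res.fun < 1e-8:          # master value is a lower bound; stop when gap closes
        break

print(f"x* ~ {x:.3f}, objective ~ {upper:.4f}, iterations: {it + 1}")
```

    Per the abstract, the expensive part in the branch-and-bound setting is exactly this master/subproblem iteration at the root vertex; child vertices can warm-start from the cuts already collected, which is why accelerating the root solve matters.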

    Structure-Aware Dynamic Scheduler for Parallel Machine Learning

    Training large machine learning (ML) models with many variables or parameters can take a long time if one employs sequential procedures, even with stochastic updates. A natural solution is to turn to distributed computing on a cluster; however, naive, unstructured parallelization of ML algorithms does not usually lead to a proportional speedup and can even result in divergence, because dependencies between model elements can attenuate the computational gains from parallelization and compromise the correctness of inference. Recent efforts to address this issue have benefited from exploiting the static, a priori block structures residing in ML algorithms. In this paper, we take this path further by exploring the dynamic block structures and workloads that arise during ML program execution, which offer new opportunities for improving convergence, correctness, and load balancing in distributed ML. We propose and showcase a general-purpose scheduler, STRADS, for coordinating distributed updates in ML algorithms, which harnesses these opportunities in a systematic way. We provide theoretical guarantees for our scheduler and demonstrate its efficacy versus static block structures on Lasso and Matrix Factorization.
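
    As a toy illustration of the general idea only (dependency-aware selection of a block of parameters to update in parallel, prioritized by a dynamic signal such as how much each parameter moved recently), and not of STRADS's actual algorithm or API, one scheduling step might be sketched as follows; the correlation-based dependency proxy and the threshold are assumptions:

```python
import numpy as np

def pick_update_block(corr, priority, block_size, max_corr=0.1):
    """Greedy structure-aware selection of parameters to update in parallel.

    corr:     (d, d) matrix of absolute feature correlations (proxy for dependency).
    priority: length-d scores, e.g. magnitude of each parameter's last change
              (a dynamic signal: parameters that are still moving get scheduled first).
    Returns indices whose pairwise dependencies are all below max_corr, so their
    updates can be applied concurrently with little interference.
    """
    order = np.argsort(-priority)       # most "active" parameters first
    block = []
    for j in order:
        if all(corr[j, k] <= max_corr for k in block):
            block.append(int(j))
            if len(block) == block_size:
                break
    return block

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.standard_normal((200, 8))
    X[:, 1] = X[:, 0] + 0.01 * rng.standard_normal(200)    # features 0 and 1 are dependent
    corr = np.abs(np.corrcoef(X, rowvar=False))
    priority = rng.random(8)
    print(pick_update_block(corr, priority, block_size=4))  # never contains both 0 and 1
```

    Strongly coupled parameters (0 and 1 in the example) never land in the same block, so their updates are serialized rather than allowed to clash.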

    Skin lesion detection and classification using convolutional neural network for deep feature extraction and support vector machine

    Pigmented skin lesion identification is essential for detecting harmful pathologies of this large organ, especially cancer. A review of the methods and projects developed over the years to diagnose these illnesses shows that computer-aided technologies have become very useful tools for identifying melanoma, dermatofibroma, and basal cell carcinoma, among other types of cancer. The most common diagnosis is based on dermoscopy and the dermatologist's expertise, whose accuracy can be improved with computer-based image detection and classification techniques. This study therefore aims to develop software models able to detect and classify skin cancer. The work is based on dermoscopy images from the HAM10000 dataset, a database of 10,000 images previously tested and validated for research use. The main process is divided into three parts: image segmentation, feature extraction (FE) using ten different pre-trained Convolutional Neural Networks (CNNs), and a Support Vector Machine (SVM) to establish a classification model. According to the results, the classification models performed very well when the image segmentation step was used, with average accuracies between 80.67% (Xception) and 90% (AlexNet), whereas without image segmentation no method reached 60%. The AlexNet-plus-SVM model had the shortest running time and the highest accuracy rate (90.34%) for the correct identification and classification of the seven categories of cutaneous lesions considered.
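
    A minimal sketch of the deep-feature-plus-SVM pipeline described above, assuming PyTorch/torchvision (recent enough for the weights API) and scikit-learn are available; dataset loading, lesion segmentation, and the evaluation protocol are omitted, and random tensors stand in for preprocessed 224x224 lesion crops:

```python
import torch
import torchvision
from sklearn.svm import SVC

# Pretrained AlexNet with the final classifier removed -> 9216-d deep features.
weights = torchvision.models.AlexNet_Weights.DEFAULT     # downloads ImageNet weights on first run
cnn = torchvision.models.alexnet(weights=weights)
cnn.classifier = torch.nn.Identity()                     # keep conv features + flatten only
cnn.eval()

@torch.no_grad()
def deep_features(images):
    """images: float tensor of shape (N, 3, 224, 224), already normalized."""
    return cnn(images).numpy()

# Placeholder data: 40 "images" across 7 classes (HAM10000 has 7 lesion categories).
images = torch.randn(40, 3, 224, 224)
labels = torch.randint(0, 7, (40,)).numpy()

X = deep_features(images)                # CNN acts as a fixed feature extractor
svm = SVC(kernel="rbf", C=1.0)           # the SVM performs the final classification
svm.fit(X, labels)
print("training accuracy:", svm.score(X, labels))
```

    The study's segmentation step would run before feature extraction, and the comparison across ten pre-trained CNNs amounts to swapping the backbone used in place of AlexNet here.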