3,308 research outputs found

    Fully polynomial-time approximation schemes for time–cost tradeoff problems in series–parallel project networks

    Get PDF
    2009-2010 > Academic research: refereed > Publication in refereed journalAccepted ManuscriptPublishe

    Multi-Resource List Scheduling of Moldable Parallel Jobs under Precedence Constraints

    Full text link
    The scheduling literature has traditionally focused on a single type of resource (e.g., computing nodes). However, scientific applications in modern High-Performance Computing (HPC) systems process large amounts of data, hence have diverse requirements on different types of resources (e.g., cores, cache, memory, I/O). All of these resources could potentially be exploited by the runtime scheduler to improve the application performance. In this paper, we study multi-resource scheduling to minimize the makespan of computational workflows comprised of parallel jobs subject to precedence constraints. The jobs are assumed to be moldable, allowing the scheduler to flexibly select a variable set of resources before execution. We propose a multi-resource, list-based scheduling algorithm, and prove that, on a system with dd types of schedulable resources, our algorithm achieves an approximation ratio of 1.619d+2.545d+11.619d+2.545\sqrt{d}+1 for any dd, and a ratio of d+O(d23)d+O(\sqrt[3]{d^2}) for large dd. We also present improved results for independent jobs and for jobs with special precedence constraints (e.g., series-parallel graphs and trees). Finally, we prove a lower bound of dd on the approximation ratio of any list scheduling scheme with local priority considerations. To the best of our knowledge, these are the first approximation results for moldable workflows with multiple resource requirements

    Approximating the Nonlinear Newsvendor and Single-Item Stochastic Lot-Sizing Problems When Data Is Given by an Oracle

    Get PDF
    The single-item stochastic lot-sizing problem is to find an inventory replenishment policy in the presence of discrete stochastic demands under periodic review and finite time horizon. A closely related problem is the single-period newsvendor model. It is well known that the newsvendor problem admits a closed formula for the optimal order quantity whenever the revenue and salvage values are linear increasing functions and the procurement (ordering) cost is fixed plus linear. The optimal policy for the single-item lot-sizing model is also well known under similar assumptions. In this paper we show that the classical (single-period) newsvendor model with fixed plus linear ordering cost cannot be approximated to any degree of accuracy when either the demand distribution or the cost functions are given by an oracle. We provide a fully polynomial time approximation scheme for the nonlinear single-item stochastic lot-sizing problem, when demand distribution is given by an oracle, procurement costs are provided as nondecreasing oracles, holding/backlogging/disposal costs are linear, and lead time is positive. Similar results exist for the nonlinear newsvendor problem. These approximation schemes are designed by extending the technique of K-approximation sets and functions.National Science Foundation (U.S.) (Contract CMMI-0758069)United States. Office of Naval Research (Grant N000141110056

    Joint Linear and Nonlinear Computation with Data Encryption for Efficient Privacy-Preserving Deep Learning

    Get PDF
    Deep Learning (DL) has shown unrivalled performance in many applications such as image classification, speech recognition, anomalous detection, and business analytics. While end users and enterprises own enormous data, DL talents and computing power are mostly gathered in technology giants having cloud servers. Thus, data owners, i.e., the clients, are motivated to outsource their data, along with computationally-intensive tasks, to the server in order to leverage the server’s abundant computation resources and DL talents for developing cost-effective DL solutions. However, trust is required between the server and the client to finish the computation tasks (e.g., conducting inference for the newly-input data from the client, based on a well-trained model at the server) otherwise there could be the data breach (e.g., leaking data from the client or the proprietary model parameters from the server). Privacy-preserving DL takes data privacy into account where various data-encryption based techniques are adopted. However, the efficiency of linear and nonlinear computation for each DL layer remains a fundamental challenge in practice due to the intrinsic intractability and complexity of privacy-preserving primitives (e.g., Homomorphic Encryption (HE) and Garbled Circuits (GC)). As such, this dissertation targets deeply optimizing state-of-the-art frameworks as well as newly designing efficient modules by joint linear and nonlinear computation, with data encryption, to further boost the overall performance of privacy-preserving DL. Four contributions are made

    Data Mining and Machine Learning in Astronomy

    Full text link
    We review the current state of data mining and machine learning in astronomy. 'Data Mining' can have a somewhat mixed connotation from the point of view of a researcher in this field. If used correctly, it can be a powerful approach, holding the potential to fully exploit the exponentially increasing amount of available data, promising great scientific advance. However, if misused, it can be little more than the black-box application of complex computing algorithms that may give little physical insight, and provide questionable results. Here, we give an overview of the entire data mining process, from data collection through to the interpretation of results. We cover common machine learning algorithms, such as artificial neural networks and support vector machines, applications from a broad range of astronomy, emphasizing those where data mining techniques directly resulted in improved science, and important current and future directions, including probability density functions, parallel algorithms, petascale computing, and the time domain. We conclude that, so long as one carefully selects an appropriate algorithm, and is guided by the astronomical problem at hand, data mining can be very much the powerful tool, and not the questionable black box.Comment: Published in IJMPD. 61 pages, uses ws-ijmpd.cls. Several extra figures, some minor additions to the tex

    A Framework for Approximate Optimization of BoT Application Deployment in Hybrid Cloud Environment

    Get PDF
    We adopt a systematic approach to investigate the efficiency of near-optimal deployment of large-scale CPU-intensive Bag-of-Task applications running on cloud resources with the non-proportional cost to performance ratios. Our analytical solutions perform in both known and unknown running time of the given application. It tries to optimize users' utility by choosing the most desirable tradeoff between the make-span and the total incurred expense. We propose a schema to provide a near-optimal deployment of BoT application regarding users' preferences. Our approach is to provide user with a set of Pareto-optimal solutions, and then she may select one of the possible scheduling points based on her internal utility function. Our framework can cope with uncertainty in the tasks' execution time using two methods, too. First, an estimation method based on a Monte Carlo sampling called AA algorithm is presented. It uses the minimum possible number of sampling to predict the average task running time. Second, assuming that we have access to some code analyzer, code profiling or estimation tools, a hybrid method to evaluate the accuracy of each estimation tool in certain interval times for improving resource allocation decision has been presented. We propose approximate deployment strategies that run on hybrid cloud. In essence, proposed strategies first determine either an estimated or an exact optimal schema based on the information provided from users' side and environmental parameters. Then, we exploit dynamic methods to assign tasks to resources to reach an optimal schema as close as possible by using two methods. A fast yet simple method based on First Fit Decreasing algorithm, and a more complex approach based on the approximation solution of the transformed problem into a subset sum problem. Extensive experiment results conducted on a hybrid cloud platform confirm that our framework can deliver a near optimal solution respecting user's utility function
    corecore