Fully polynomial-time approximation schemes for time–cost tradeoff problems in series–parallel project networks
2009-2010 > Academic research: refereed > Publication in refereed journal
Multi-Resource List Scheduling of Moldable Parallel Jobs under Precedence Constraints
The scheduling literature has traditionally focused on a single type of
resource (e.g., computing nodes). However, scientific applications in modern
High-Performance Computing (HPC) systems process large amounts of data, hence
have diverse requirements on different types of resources (e.g., cores, cache,
memory, I/O). All of these resources could potentially be exploited by the
runtime scheduler to improve the application performance. In this paper, we
study multi-resource scheduling to minimize the makespan of computational
workflows comprised of parallel jobs subject to precedence constraints. The
jobs are assumed to be moldable, allowing the scheduler to flexibly select a
variable set of resources before execution. We propose a multi-resource,
list-based scheduling algorithm, and prove that, on a system with […] types of
schedulable resources, our algorithm achieves an approximation ratio of […]
for any […], and a ratio of […] for
large […]. We also present improved results for independent jobs and for jobs
with special precedence constraints (e.g., series-parallel graphs and trees).
Finally, we prove a lower bound of […] on the approximation ratio of any list
scheduling scheme with local priority considerations. To the best of our
knowledge, these are the first approximation results for moldable workflows
with multiple resource requirements.
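As a rough illustration of the list-scheduling idea described in this abstract, the sketch below runs a generic event-driven list scheduler over jobs with precedence constraints and several resource types. It is not the paper's algorithm: it assumes each moldable job has already been given a fixed allocation, and the names `list_schedule`, `jobs`, `preds`, `capacity`, and `priority` are hypothetical.

```python
import heapq


def list_schedule(jobs, preds, capacity, priority):
    """Generic multi-resource list scheduling (illustrative sketch only).

    jobs:     name -> (duration, demand), demand being one amount per resource type
    preds:    name -> set of jobs that must finish before this one starts
    capacity: total amount available of each resource type
    priority: key function; jobs with smaller keys are considered first
    Assumption: the moldable allocation of every job was chosen beforehand.
    """
    free = list(capacity)
    done, start = set(), {}
    running = []  # min-heap of (finish_time, job_name)
    now = 0.0
    pending = sorted(jobs, key=priority)
    while pending or running:
        # Scan the list in priority order and start every ready job that fits.
        for name in list(pending):
            duration, demand = jobs[name]
            ready = preds.get(name, set()) <= done
            fits = all(d <= f for d, f in zip(demand, free))
            if ready and fits:
                free = [f - d for f, d in zip(free, demand)]
                start[name] = now
                heapq.heappush(running, (now + duration, name))
                pending.remove(name)
        if not running:
            raise ValueError("no job can start: demand too large or cyclic precedence")
        # Advance time to the next completion and release its resources.
        now, finished = heapq.heappop(running)
        done.add(finished)
        free = [f + d for f, d in zip(free, jobs[finished][1])]
    return start, now


# Hypothetical workflow: two resource types (say cores and memory units).
jobs = {"A": (2, (2, 1)), "B": (3, (2, 1)), "C": (1, (4, 2))}
preds = {"C": {"A", "B"}}
start_times, makespan = list_schedule(jobs, preds, (4, 2), priority=lambda n: n)
print(start_times, makespan)
```

In this toy instance A and B run side by side, and C, which needs the whole machine, starts only once both have finished.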
Approximating the Nonlinear Newsvendor and Single-Item Stochastic Lot-Sizing Problems When Data Is Given by an Oracle
The single-item stochastic lot-sizing problem is to find an inventory replenishment policy in the presence of discrete stochastic demands under periodic review and finite time horizon. A closely related problem is the single-period newsvendor model. It is well known that the newsvendor problem admits a closed formula for the optimal order quantity whenever the revenue and salvage values are linear increasing functions and the procurement (ordering) cost is fixed plus linear. The optimal policy for the single-item lot-sizing model is also well known under similar assumptions.
In this paper we show that the classical (single-period) newsvendor model with fixed plus linear ordering cost cannot be approximated to any degree of accuracy when either the demand distribution or the cost functions are given by an oracle. We provide a fully polynomial time approximation scheme for the nonlinear single-item stochastic lot-sizing problem, when demand distribution is given by an oracle, procurement costs are provided as nondecreasing oracles, holding/backlogging/disposal costs are linear, and lead time is positive. Similar results exist for the nonlinear newsvendor problem. These approximation schemes are designed by extending the technique of K-approximation sets and functions.
National Science Foundation (U.S.) (Contract CMMI-0758069); United States. Office of Naval Research (Grant N000141110056)
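For context on the closed formula mentioned in this abstract, the sketch below applies the textbook critical-fractile rule for a newsvendor with linear revenue and salvage values and an explicitly given discrete demand distribution. It is not the oracle-based approximation scheme of the paper; the function names, the parameter names, and the assumption price > cost > salvage with zero starting inventory are illustrative.

```python
def expected_profit(q, demand_pmf, price, cost, salvage, fixed_cost=0.0):
    """Expected profit of ordering q units; ordering nothing earns 0."""
    if q == 0:
        return 0.0
    revenue = sum(
        prob * (price * min(q, d) + salvage * max(q - d, 0))
        for d, prob in demand_pmf.items()
    )
    return revenue - cost * q - fixed_cost


def newsvendor_order(demand_pmf, price, cost, salvage, fixed_cost=0.0):
    """Textbook critical-fractile rule (assumes price > cost > salvage)."""
    ratio = (price - cost) / (price - salvage)
    cumulative = 0.0
    q_star = max(demand_pmf)
    # Smallest demand value whose cumulative probability reaches the ratio.
    for d in sorted(demand_pmf):
        cumulative += demand_pmf[d]
        if cumulative >= ratio:
            q_star = d
            break
    # With a fixed ordering cost, order only if that beats ordering nothing.
    if expected_profit(q_star, demand_pmf, price, cost, salvage, fixed_cost) <= 0:
        return 0
    return q_star


# Hypothetical demand distribution and prices.
pmf = {0: 0.1, 1: 0.2, 2: 0.4, 3: 0.3}
print(newsvendor_order(pmf, price=10, cost=4, salvage=1))
print(newsvendor_order(pmf, price=10, cost=4, salvage=1, fixed_cost=10))
```

The second call shows the effect of the fixed cost: the fractile quantity is unchanged, but ordering it no longer pays off, so the rule orders nothing.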
Joint Linear and Nonlinear Computation with Data Encryption for Efficient Privacy-Preserving Deep Learning
Deep Learning (DL) has shown unrivalled performance in many applications such as image classification, speech recognition, anomaly detection, and business analytics. While end users and enterprises own enormous amounts of data, DL talent and computing power are mostly concentrated in technology giants with cloud servers. Thus, data owners, i.e., the clients, are motivated to outsource their data, along with computationally intensive tasks, to the server in order to leverage the server's abundant computation resources and DL talent for developing cost-effective DL solutions. However, trust is required between the server and the client to finish the computation tasks (e.g., conducting inference on newly input data from the client, based on a well-trained model at the server); otherwise there could be a data breach (e.g., leaking data from the client or the proprietary model parameters from the server). Privacy-preserving DL takes data privacy into account by adopting various data-encryption-based techniques. However, the efficiency of linear and nonlinear computation for each DL layer remains a fundamental challenge in practice due to the intrinsic intractability and complexity of privacy-preserving primitives (e.g., Homomorphic Encryption (HE) and Garbled Circuits (GC)). As such, this dissertation targets deeply optimizing state-of-the-art frameworks as well as newly designing efficient modules by joint linear and nonlinear computation, with data encryption, to further boost the overall performance of privacy-preserving DL. Four contributions are made
Data Mining and Machine Learning in Astronomy
We review the current state of data mining and machine learning in astronomy.
'Data Mining' can have a somewhat mixed connotation from the point of view of a
researcher in this field. If used correctly, it can be a powerful approach,
holding the potential to fully exploit the exponentially increasing amount of
available data, promising great scientific advance. However, if misused, it can
be little more than the black-box application of complex computing algorithms
that may give little physical insight, and provide questionable results. Here,
we give an overview of the entire data mining process, from data collection
through to the interpretation of results. We cover common machine learning
algorithms, such as artificial neural networks and support vector machines,
applications from a broad range of astronomy, emphasizing those where data
mining techniques directly resulted in improved science, and important current
and future directions, including probability density functions, parallel
algorithms, petascale computing, and the time domain. We conclude that, so long
as one carefully selects an appropriate algorithm, and is guided by the
astronomical problem at hand, data mining can be very much the powerful tool,
and not the questionable black box.
Comment: Published in IJMPD. 61 pages, uses ws-ijmpd.cls. Several extra
figures, some minor additions to the tex
A Framework for Approximate Optimization of BoT Application Deployment in Hybrid Cloud Environment
We adopt a systematic approach to investigate the efficiency of near-optimal deployment of large-scale CPU-intensive Bag-of-Task (BoT) applications running on cloud resources with non-proportional cost-to-performance ratios. Our analytical solutions apply whether the running time of the given application is known or unknown. The framework tries to optimize the user's utility by choosing the most desirable tradeoff between the makespan and the total incurred expense. We propose a schema that provides a near-optimal deployment of a BoT application with respect to the user's preferences. Our approach is to provide the user with a set of Pareto-optimal solutions, from which she may select one of the possible scheduling points based on her internal utility function. Our framework can also cope with uncertainty in the tasks' execution times using two methods. First, an estimation method based on Monte Carlo sampling, called the AA algorithm, is presented. It uses the minimum possible number of samples to predict the average task running time. Second, assuming that we have access to some code analyzer, code profiling, or estimation tools, we present a hybrid method that evaluates the accuracy of each estimation tool over certain time intervals to improve resource allocation decisions. We propose approximate deployment strategies that run on a hybrid cloud. In essence, the proposed strategies first determine either an estimated or an exact optimal schema based on the information provided by the user and on environmental parameters. Then, we exploit dynamic methods to assign tasks to resources so as to get as close as possible to the optimal schema, using two methods: a fast yet simple method based on the First Fit Decreasing algorithm, and a more complex approach based on an approximate solution of the problem after transforming it into a subset sum problem. Extensive experimental results obtained on a hybrid cloud platform confirm that our framework can deliver a near-optimal solution respecting the user's utility function.
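The First Fit Decreasing heuristic named in this abstract can be sketched as follows. The abstract does not say how the framework maps tasks to resources, so this is only the generic bin-packing heuristic, under the assumption that task running times are the items and a fixed time slot on one machine is the bin; `first_fit_decreasing` and `slot_length` are hypothetical names.

```python
def first_fit_decreasing(task_times, slot_length):
    """Pack tasks into as few fixed-length slots as the heuristic finds.

    Assumption for illustration: each slot is one machine's available time.
    """
    slots = []  # each entry: [remaining_time, [task times placed here]]
    # Consider the longest tasks first.
    for t in sorted(task_times, reverse=True):
        if t > slot_length:
            raise ValueError(f"task of length {t} does not fit in any slot")
        # Place the task in the first slot with enough remaining time.
        for slot in slots:
            if slot[0] >= t:
                slot[0] -= t
                slot[1].append(t)
                break
        else:
            # No existing slot fits, so open a new one.
            slots.append([slot_length - t, [t]])
    return [tasks for _, tasks in slots]


# Hypothetical task running times and slot length (same time unit).
assignment = first_fit_decreasing([4, 8, 1, 4, 2, 1], slot_length=10)
print(assignment)
```

Sorting first is what separates this from plain First Fit: long tasks claim slots early, and short tasks fill the leftover gaps.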
- …