Adaptive Performance and Power Management in Distributed Computing Systems
The complexity of distributed computing systems has raised two unprecedented challenges for system management. First, customers must be assured that their required service-level agreements, such as response time and throughput, are met. Second, system power consumption must be controlled to avoid system failures caused by power capacity overload, or overheating due to increasingly high server density. However, most existing work either relies on open-loop estimation based on off-line profiled system models, or evolves in an ad hoc fashion that requires exhaustive iterations of tuning and testing, or oversimplifies the problem by ignoring the coupling between different system characteristics (i.e., response time and throughput, or the power consumption of different servers). As a result, most previous work lacks rigorous guarantees on the performance and power consumption of computing systems, and may degrade overall system performance. In this thesis, we extensively study adaptive performance/power management and power-efficient performance management for distributed computing systems such as information dissemination systems, power grid management systems, and data centers, proposing Multiple-Input-Multiple-Output (MIMO) control and hierarchical designs based on feedback control theory. For adaptive performance management, we design an integrated solution that controls both the average response time and CPU utilization in an example information dissemination system, achieving bounded response time for high-priority information and maximized system throughput. In addition, we design a hierarchical control solution that guarantees the deadlines of real-time tasks in power grid computing by grouping the tasks according to their characteristics. For adaptive power management, we design MIMO optimal control solutions for power control at the cluster and server levels, and a hierarchical solution for large-scale data centers. Our MIMO control design captures the coupling among different system characteristics, while our hierarchical design coordinates controllers at different levels. For power-efficient performance management, we present a two-layer coordinated management solution for virtualized data centers. Experimental results on both physical testbeds and in simulations demonstrate that all the solutions outperform state-of-the-art management schemes by significantly improving overall system performance.
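To make the MIMO idea concrete, the following is a minimal sketch of one multiple-input-multiple-output feedback step, assuming two hypothetical actuators (CPU frequency and admitted request rate) and two measured outputs (response time and power); the gain matrix, setpoints, and actuator limits are illustrative stand-ins, not the thesis's identified models or optimal-control designs.

```python
import numpy as np

# Sketch of one MIMO feedback step: two coupled measurements (response
# time, power) drive two coupled actuators (CPU frequency, admission
# rate). K, the setpoints, and the limits below are hypothetical.
K = np.array([[ 0.50, -0.004],   # d(freq): speed up if slow, back off if hot
              [-40.0, -0.200]])  # d(rate): admit less if slow or hot

class MIMOController:
    def __init__(self, setpoint, u0, u_min, u_max):
        self.setpoint = np.asarray(setpoint, float)  # [target rt (s), target power (W)]
        self.u = np.asarray(u0, float)               # [CPU GHz, admitted req/s]
        self.u_min, self.u_max = u_min, u_max

    def step(self, measured):
        """Integral control: both errors move both knobs (the coupling)."""
        error = np.asarray(measured, float) - self.setpoint
        self.u = np.clip(self.u + K @ error, self.u_min, self.u_max)
        return self.u

ctl = MIMOController(setpoint=[0.5, 200.0], u0=[2.0, 100.0],
                     u_min=[0.8, 10.0], u_max=[3.5, 500.0])
print(ctl.step([0.9, 230.0]))  # too slow and too hot: freq up a bit, rate down
```

The off-diagonal gains are what independent single-loop controllers would miss: a power error also moves the admission rate, which is the coupling the thesis emphasizes.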
BUILDING EFFICIENT AND COST-EFFECTIVE CLOUD-BASED BIG DATA MANAGEMENT SYSTEMS
In today’s big data world, data is being produced in massive volumes, at great velocity
and from a variety of different sources such as mobile devices, sensors, a plethora
of small devices hooked to the internet (Internet of Things), social networks, communication
networks and many others. Interactive querying and large-scale analytics are being
increasingly used to derive value out of this big data. A large portion of this data is being
stored and processed in the Cloud due to the several advantages provided by the Cloud, such
as scalability, elasticity, availability, low cost of ownership and the overall economies
of scale. There is thus a growing need for large-scale cloud-based data management
systems that can support real-time ingest, storage and processing of large volumes of heterogeneous data. However, in the pay-as-you-go Cloud environment, the cost of analytics
can grow linearly with the time and resources required. Reducing the cost of data analytics
in the Cloud thus remains a primary challenge. In my dissertation research, I have
focused on building efficient and cost-effective cloud-based data management systems for
different application domains that are predominant in cloud computing environments.
In the first part of my dissertation, I address the problem of reducing the cost of
transactional workloads on relational databases to support database-as-a-service in the
Cloud. The primary challenges in supporting such workloads include choosing how to
partition the data across a large number of machines, minimizing the number of distributed
transactions, providing high data availability, and tolerating failures gracefully.
I have designed, built, and evaluated SWORD, an end-to-end scalable online transaction
processing system that utilizes workload-aware data placement and replication to minimize
the number of distributed transactions, and that incorporates a suite of novel techniques
to significantly reduce the overheads incurred both during the initial placement of data
and during query execution at runtime.
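As a rough illustration of the placement objective (not SWORD's actual hypergraph-compression-based algorithm), the sketch below greedily co-locates keys that transactions access together, so fewer transactions span partitions; the transactions and the capacity bound are hypothetical.

```python
from collections import defaultdict

# Greedy workload-aware placement sketch: keys accessed by the same
# transaction should land on the same partition, reducing distributed
# transactions. Keys never co-accessed with others are ignored here.
def place(transactions, n_parts, capacity):
    affinity = defaultdict(lambda: defaultdict(int))
    for txn in transactions:               # txn = set of keys it touches
        for a in txn:
            for b in txn:
                if a != b:
                    affinity[a][b] += 1
    placement, load = {}, [0] * n_parts
    for key in sorted(affinity, key=lambda k: -sum(affinity[k].values())):
        score = [0] * n_parts
        for nbr, w in affinity[key].items():
            if nbr in placement:
                score[placement[nbr]] += w
        # prefer the partition holding the key's co-accessed neighbors;
        # full partitions always lose to non-full ones
        best = max(range(n_parts),
                   key=lambda p: (score[p], -load[p]) if load[p] < capacity
                                 else (-1, 0))
        placement[key] = best
        load[best] += 1
    return placement

txns = [{"a", "b"}, {"a", "b", "c"}, {"c", "d"}, {"d", "e"}]
print(place(txns, n_parts=2, capacity=3))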
In the second part of my dissertation, I focus on sampling-based progressive analytics
as a means to reduce the cost of data analytics in the relational domain. Sampling has
been traditionally used by data scientists to get progressive answers to complex analytical
tasks over large volumes of data. Typically, this involves manually extracting samples
of increasing data size (progressive samples) for exploratory querying. This provides the
data scientists with user control, repeatable semantics, and result provenance. However,
such solutions result in tedious workflows that preclude the reuse of work across samples.
On the other hand, existing approximate query processing systems report early results,
but do not offer the above benefits for complex ad-hoc queries. I propose a new progressive
data-parallel computation framework, NOW!, that provides support for progressive
analytics over big data. In particular, NOW! enables progressive relational (SQL) query
support in the Cloud using unique progress semantics that allow efficient and deterministic
query processing over samples, providing meaningful early results and provenance
to data scientists. NOW! delivers early results using significantly fewer
resources, substantially reducing the cost incurred during such analytics.
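The core idea of deterministic, repeatable progressive samples can be sketched in a few lines (single-machine, and far simpler than NOW!'s progress semantics for cloud SQL): a fixed seed makes every early answer reproducible, and nested sample prefixes let work carry over across sample sizes.

```python
import random

# Progressive aggregation over deterministic, nested samples: each early
# answer is repeatable and attributable to a known sample (provenance).
def progressive_mean(rows, fractions=(0.01, 0.1, 0.5, 1.0), seed=42):
    rng = random.Random(seed)        # fixed seed -> repeatable samples
    shuffled = rows[:]
    rng.shuffle(shuffled)            # one shuffle; prefixes are nested,
    for f in fractions:              # so work on a small sample is reused
        sample = shuffled[: max(1, int(len(shuffled) * f))]
        yield f, sum(sample) / len(sample)   # early, refining estimate

data = list(range(1_000_000))
for frac, est in progressive_mean(data):
    print(f"sample {frac:>5.0%}: mean ~ {est:,.1f}")
```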
Finally, I propose NSCALE, a system for efficient and cost-effective complex analytics
on large-scale graph-structured data in the Cloud. The system is based on the
key observation that a wide range of complex analysis tasks over graph data require processing and reasoning about a large number of multi-hop neighborhoods or subgraphs in
the graph; examples include ego network analysis, motif counting in biological networks,
finding social circles in social networks, personalized recommendations, link prediction,
etc. These tasks are not well served by existing vertex-centric graph processing frameworks
whose computation and execution models restrict the user program to directly accessing
the state of a single vertex, resulting in high execution overheads. Further, the lack of
support for extracting the relevant portions of the graph that are of interest to an analysis
task and loading them into distributed memory leads to poor scalability. NSCALE allows
users to write programs at the level of neighborhoods or subgraphs rather than at the level
of vertices, and to declaratively specify the subgraphs of interest. It enables the efficient
distributed execution of these neighborhood-centric complex analysis tasks over large-scale
graphs, while minimizing resource consumption and communication cost, thereby
substantially reducing the overall cost of graph data analytics in the Cloud.
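The programming-model shift is easy to sketch: the user function receives a k-hop neighborhood subgraph instead of a single vertex's state. The extraction-and-packing machinery that makes this scale in NSCALE is elided; the toy graph and the clustering-coefficient task are illustrative.

```python
from collections import deque

# Neighborhood-centric sketch: run a user function once per k-hop
# subgraph rather than per vertex. `graph` is an adjacency dict,
# assumed undirected.
def k_hop_subgraph(graph, root, k):
    seen, frontier = {root}, deque([(root, 0)])
    while frontier:
        v, d = frontier.popleft()
        if d == k:
            continue
        for u in graph[v]:
            if u not in seen:
                seen.add(u)
                frontier.append((u, d + 1))
    # restrict adjacency lists to the extracted subgraph
    return {v: [u for u in graph[v] if u in seen] for v in seen}

def local_clustering(sub, root):     # example neighborhood-level task
    nbrs = set(sub[root])
    links = sum(1 for v in nbrs for u in sub[v] if u in nbrs) // 2
    k = len(nbrs)
    return 2 * links / (k * (k - 1)) if k > 1 else 0.0

graph = {0: [1, 2], 1: [0, 2, 3], 2: [0, 1], 3: [1]}
for v in graph:                      # one invocation per neighborhood
    print(v, local_clustering(k_hop_subgraph(graph, v, 1), v))
```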
The results of our extensive experimental evaluation of these prototypes with several
real-world data sets and applications validate the effectiveness of our techniques,
which provide orders-of-magnitude reductions in the overheads of distributed data querying
and analysis in the Cloud.
Hybrid genetic algorithm based on bin packing strategy for the unrelated parallel workgroup scheduling problem
In this paper, we focus on an unrelated parallel workgroup scheduling problem
where each workgroup is composed of a number of personnel with similar work
skills and is subject to eligibility and human resource constraints. The main difference
from general unrelated parallel machine scheduling with resource constraints
is that one workgroup can process multiple jobs at a time as long as resources
are available, which means a feasible schedule cannot be obtained by
considering the processing sequence of jobs in the time dimension alone. We
formulate the problem as an integer programming model with the objective of
minimizing makespan. Since exact algorithms cannot obtain an optimal solution
for this model in acceptable time, we design meta-heuristics instead. A pure
genetic algorithm based on a special encoding is
proposed first. A hybrid genetic algorithm based on a bin packing strategy is
then further developed by transforming single-workgroup
scheduling into a strip-packing problem. Finally, the proposed algorithms, together
with the exact approach, are tested on instances of different sizes. The results demonstrate
that the proposed hybrid genetic algorithm performs effectively.
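A compressed sketch of the hybrid idea follows, under simplifying assumptions (eligibility constraints omitted, a shelf-based first-fit heuristic standing in for the paper's packing routine, hypothetical job data and GA parameters): the chromosome assigns jobs to workgroups, and each workgroup's schedule is decoded as a strip-packing of (resource demand x processing time) rectangles into a strip whose width is the workgroup's capacity.

```python
import random

def strip_height(jobs, capacity):
    """First-fit shelf packing; returns the makespan of one workgroup."""
    shelves = []                           # (remaining_width, height)
    for demand, p_time in sorted(jobs, key=lambda j: -j[1]):
        for i, (rem, h) in enumerate(shelves):
            if demand <= rem:
                shelves[i] = (rem - demand, max(h, p_time))
                break
        else:
            shelves.append((capacity - demand, p_time))
    return sum(h for _, h in shelves)

def makespan(assign, jobs, caps):
    groups = [[] for _ in caps]
    for j, g in enumerate(assign):
        groups[g].append(jobs[j])
    return max(strip_height(g, c) for g, c in zip(groups, caps))

def ga(jobs, caps, pop=30, gens=200, seed=1):
    rng = random.Random(seed)
    P = [[rng.randrange(len(caps)) for _ in jobs] for _ in range(pop)]
    for _ in range(gens):
        P.sort(key=lambda a: makespan(a, jobs, caps))
        elite, kids = P[: pop // 2], []
        while len(elite) + len(kids) < pop:
            a, b = rng.sample(elite, 2)          # one-point crossover
            cut = rng.randrange(1, len(jobs))
            child = a[:cut] + b[cut:]
            if rng.random() < 0.2:               # reassignment mutation
                child[rng.randrange(len(jobs))] = rng.randrange(len(caps))
            kids.append(child)
        P = elite + kids
    best = min(P, key=lambda a: makespan(a, jobs, caps))
    return best, makespan(best, jobs, caps)

jobs = [(2, 5), (3, 4), (1, 7), (2, 3), (4, 6), (1, 2)]  # (demand, time)
print(ga(jobs, caps=[4, 5]))
```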
Effective Resource and Workload Management in Data Centers
The increasing demand for storage, computation, and business continuity has driven the growth of data centers. Managing data centers efficiently is a difficult task because of the wide variety of data center applications, their ever-changing intensities, and the fact that application performance targets may differ widely. Server virtualization has been a game-changing technology for IT, providing the ability to support multiple virtual machines (VMs) simultaneously. This dissertation focuses on how virtualization technologies can be utilized to develop new tools for maintaining high resource utilization, achieving high application performance, and reducing the cost of data center management.

For multi-tiered applications, bursty workload traffic can significantly deteriorate performance. This dissertation proposes an admission control algorithm, AWAIT, for handling overload conditions in multi-tier web services. AWAIT places requests of accepted sessions on hold and refuses to admit new sessions when the system is in a sudden workload surge. To meet the service-level objective, AWAIT serves the requests in the blocking queue with high priority; the size of the queue is dynamically determined according to workload burstiness.

Many admission control policies are triggered by instantaneous measurements of system resource usage, e.g., CPU utilization. This dissertation first demonstrates that directly measuring virtual machine resource utilizations with standard tools cannot always yield accurate estimates. A directed factor graph (DFG) model is defined to capture the dependencies among multiple types of resources across physical and virtual layers.

Virtualized data centers enable sharing of resources among hosted applications to achieve high resource utilization. However, it is difficult to satisfy application SLOs on a shared infrastructure, as application workload patterns change over time. AppRM, an automated management system, not only allocates the right amount of resources to applications to meet their performance targets but also adapts to dynamic workloads using an adaptive model.

Server consolidation is one of the key applications of server virtualization. This dissertation proposes a VM consolidation mechanism, first by extending the fair load balancing scheme to multi-dimensional vector scheduling, and then by using a queueing network model to capture the service contentions for a particular virtual machine placement.
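A minimal sketch of the AWAIT-style admission logic, assuming a hypothetical burstiness index and thresholds (AWAIT derives the blocking-queue size from measured workload burstiness): during a surge, new sessions are rejected while requests of already-admitted sessions are held rather than dropped.

```python
from collections import deque

class AdmissionController:
    """Toy AWAIT-style controller: admitted sessions are never dropped,
    only parked in a blocking queue during overload; new sessions are
    rejected while the surge lasts."""
    def __init__(self, base_queue=100):
        self.base_queue = base_queue
        self.capacity = base_queue
        self.admitted = set()
        self.blocked = deque()

    def resize(self, burstiness):
        # burstier workload -> larger holding queue for admitted sessions
        self.capacity = int(self.base_queue * (1 + burstiness))

    def on_request(self, session, overloaded):
        if session in self.admitted:
            if overloaded:
                if len(self.blocked) < self.capacity:
                    self.blocked.append(session)   # hold, don't drop
                    return "queued"
                return "shed"
            return "served"
        if overloaded:
            return "session rejected"              # no new sessions in surge
        self.admitted.add(session)
        return "served"

ac = AdmissionController()
ac.resize(burstiness=0.8)
print(ac.on_request("s1", overloaded=False))  # served, session admitted
print(ac.on_request("s2", overloaded=True))   # new session rejected
print(ac.on_request("s1", overloaded=True))   # held in blocking queue
```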
Scalable Real-Time Rendering for Extremely Complex 3D Environments Using Multiple GPUs
In 3D visualization, real-time rendering of high-quality meshes in complex 3D environments is still one of the major challenges in computer graphics. New data acquisition techniques like 3D modeling and scanning have drastically increased the requirement for more complex models and the demand for higher display resolutions in recent years. Most of the existing acceleration techniques using a single GPU for rendering suffer from the limited GPU memory budget, time-consuming sequential execution, and the finite display resolution. Recently, people have started building commodity workstations with multiple GPUs and multiple displays. As a result, more GPU memory is available across a distributed cluster of GPUs, more computational power is provided through the combination of multiple GPUs, and a higher display resolution can be achieved by connecting each GPU to a display monitor (resulting in a tiled large display configuration). However, using a multi-GPU workstation may not always give the desired rendering performance due to imbalanced rendering workloads among GPUs and overheads caused by inter-GPU communication.
In this dissertation, I contribute a multi-GPU multi-display parallel rendering approach for complex 3D environments. The approach supports high-performance, high-quality rendering of static and dynamic 3D environments. A novel parallel load balancing algorithm is developed based on a screen partitioning strategy to dynamically balance the number of vertices and triangles rendered by each GPU. The overhead of inter-GPU communication is minimized by transferring only a small number of image pixels rather than chunks of 3D primitives, using a novel frame exchanging algorithm. State-of-the-art parallel mesh simplification and GPU out-of-core techniques are integrated into the multi-GPU multi-display system to accelerate the rendering process.
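A rough sketch of screen-partition rebalancing, assuming measured per-GPU primitive counts and a hypothetical damping factor (the dissertation's algorithm balances vertices and triangles with its own strategy): split positions move each frame toward equalizing per-GPU load.

```python
# Each GPU renders one horizontal span of the frame; the interior split
# positions are nudged each frame so per-GPU primitive counts converge
# toward the mean. Damping and the measured counts are illustrative.
def rebalance(splits, primitives_per_gpu, width, damping=0.3):
    """splits: interior x-coordinates dividing the frame among GPUs."""
    n = len(primitives_per_gpu)
    mean = sum(primitives_per_gpu) / n
    new = list(splits)
    for i in range(n - 1):
        # if the region left of boundary i is overloaded, shrink it
        imbalance = (primitives_per_gpu[i] - mean) / max(mean, 1)
        region = splits[i] if i == 0 else splits[i] - splits[i - 1]
        new[i] = splits[i] - damping * imbalance * region
    for i in range(1, n - 1):             # keep boundaries ordered
        new[i] = max(new[i], new[i - 1] + 1)
    return [min(max(x, 1), width - 1) for x in new]

# 4 GPUs, 3 interior splits over a 3840-px frame; GPU 0 is overloaded
print(rebalance([960, 1920, 2880], [9e5, 3e5, 4e5, 4e5], 3840))
```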
Planning and Scheduling Optimization
Although planning and scheduling optimization have been explored in the literature for many years, they remain a hot topic in current scientific research. The changing market trends, globalization, technical and technological progress, and sustainability considerations make it necessary to deal with new optimization challenges in modern manufacturing, engineering, and healthcare systems. This book provides an overview of recent advances in different areas connected with operations research models and other applications of intelligent computing techniques used for planning and scheduling optimization. The wide range of theoretical and practical research findings reported in this book confirms that the planning and scheduling problem is a complex issue present in different industrial sectors and organizations, and opens promising and dynamic perspectives of research and development.
Advances and Novel Approaches in Discrete Optimization
Discrete optimization is an important area of Applied Mathematics with a broad spectrum of applications in many fields. This book results from a Special Issue in the journal Mathematics entitled ‘Advances and Novel Approaches in Discrete Optimization’. It contains 17 articles, selected from 43 submitted papers after a thorough refereeing process, covering a broad spectrum of subjects. Among other topics, it includes seven articles dealing with scheduling problems, e.g., online scheduling, batching, dual and inverse scheduling problems, and uncertain scheduling problems. Other subjects are graphs and applications, evacuation planning, the max-cut problem, capacitated lot-sizing, and packing algorithms.
Transiency-driven Resource Management for Cloud Computing Platforms
Modern distributed server applications are hosted on enterprise or cloud data centers that provide computing, storage, and networking capabilities to these applications. These applications are built on the implicit assumption that the underlying servers will be stable and normally available, barring occasional faults. In many emerging scenarios, however, data centers and clouds only provide transient, rather than continuous, availability of their servers. Transiency in modern distributed systems arises in many contexts, such as green data centers powered using renewable intermittent sources, and cloud platforms that provide lower-cost transient servers which can be unilaterally revoked by the cloud operator.
Transient computing resources are increasingly important, and existing fault-tolerance and resource management techniques are inadequate for transient servers because applications typically assume continuous resource availability. This thesis presents research in distributed systems design that treats transiency as a first-class design principle. I show that combining transiency-specific fault-tolerance mechanisms with resource management policies suited to application characteristics and requirements can yield significant cost and performance benefits. These mechanisms and policies have been implemented and prototyped as part of software systems, which allow a wide range of applications, such as interactive services and distributed data processing, to be deployed on transient servers, and can reduce cloud computing costs by up to 90%.
This thesis makes contributions to four areas of computer systems research: transiency-specific fault-tolerance, resource allocation, abstractions, and resource reclamation. For reducing the impact of transient server revocations, I develop two fault-tolerance techniques that are tailored to transient server characteristics and application requirements. For interactive applications, I build a derivative cloud platform that masks revocations by transparently moving application-state between servers of different types. Similarly, for distributed data processing applications, I investigate the use of application level periodic checkpointing to reduce the performance impact of server revocations. For managing and reducing the risk of server revocations, I investigate the use of server portfolios that allow transient resource allocation to be tailored to application requirements.
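For periodic checkpointing, a standard first-order rule of thumb is Young's approximation, interval ≈ sqrt(2C/f) for checkpoint cost C and failure (here, revocation) rate f; the thesis tailors checkpointing to transient-server characteristics, so this formula is background context, not necessarily the mechanism used there.

```python
import math

# Young's approximation: balance checkpoint cost against the expected
# recomputation lost when a transient server is revoked.
def checkpoint_interval(checkpoint_cost_s, revocations_per_hour):
    failure_rate = revocations_per_hour / 3600.0   # revocations per second
    return math.sqrt(2 * checkpoint_cost_s / failure_rate)

# e.g., 30 s to checkpoint, ~2 revocations/hour on a spot market
print(f"{checkpoint_interval(30, 2):.0f} s between checkpoints")  # ~329 s
```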
Finally, I investigate how resource providers (such as cloud platforms) can provide transient resource availability without revocation by looking into alternative resource reclamation techniques. I develop resource deflation, wherein a server's resources are fractionally reclaimed, allowing the application to continue execution, albeit with fewer resources. Resource deflation generalizes revocation, and the deflation mechanisms and cluster-wide policies can yield both high cluster utilization and low application performance degradation.
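A toy sketch of the deflation idea, with a hypothetical slack-first policy (reclaim allocated-but-unused capacity before cutting into what VMs actually use); the thesis's cluster-wide deflation policies are more sophisticated than this.

```python
# Fractional reclamation instead of revocation: every VM keeps running,
# but with fewer resources. Slack (allocated but unused) is taken first.
def deflate(vms, needed_cores):
    """vms: {name: {"alloc": cores, "used": cores}}; returns new allocs."""
    allocs = {v: s["alloc"] for v, s in vms.items()}
    # visit the most-overprovisioned VMs first
    for v, s in sorted(vms.items(),
                       key=lambda kv: kv[1]["used"] - kv[1]["alloc"]):
        if needed_cores <= 0:
            break
        slack = max(allocs[v] - s["used"], 0)
        take = min(slack, needed_cores)
        allocs[v] -= take
        needed_cores -= take
    if needed_cores > 0:               # still short: deflate proportionally
        total = sum(allocs.values())
        for v in allocs:
            allocs[v] -= needed_cores * allocs[v] / total
    return allocs

vms = {"a": {"alloc": 8, "used": 3}, "b": {"alloc": 4, "used": 4}}
print(deflate(vms, needed_cores=4))    # slack in "a" absorbs the reclaim
```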