Simulation model of load balancing in distributed computing systems
The availability of high-performance computing, high-speed data transfer over networks, and widespread software for design and pre-production in mechanical engineering has led both large industrial enterprises and small engineering companies to implement complex computer systems for solving production and management tasks efficiently. Such systems are generally built on distributed heterogeneous computing platforms. The analytical problems solved by such systems are the primary focus of research, but the system-wide problems of efficiently distributing (balancing) the computational load and placing the input, intermediate, and output databases are no less important. The main tasks of a balancing system are monitoring the load and condition of the compute nodes and selecting the node to which a user's request is forwarded, in accordance with a predetermined algorithm. Load balancing is one of the most widely used methods of increasing the productivity of distributed computing systems through the optimal allocation of tasks among the nodes. Therefore, the development of methods and algorithms for computing optimal schedules in a distributed system whose infrastructure changes dynamically is an important task.
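The two tasks named in the abstract above, monitoring node state and selecting a node by a predetermined algorithm, can be illustrated with a minimal sketch. The abstract does not specify the selection rule; the least-loaded-by-capacity rule used here, along with the `Node` fields and the `select_node` and `dispatch` helpers, are assumptions made purely for illustration.

```python
from dataclasses import dataclass


@dataclass
class Node:
    name: str
    capacity: float      # relative processing capacity (hypothetical units)
    load: float = 0.0    # work currently assigned, as reported by monitoring
    alive: bool = True   # condition flag, as reported by monitoring


def select_node(nodes):
    """Return the live node with the lowest load-to-capacity ratio."""
    candidates = [n for n in nodes if n.alive]
    if not candidates:
        raise RuntimeError("no live nodes available")
    return min(candidates, key=lambda n: n.load / n.capacity)


def dispatch(nodes, cost):
    """Forward one request of the given cost to the selected node."""
    node = select_node(nodes)
    node.load += cost
    return node


# Heterogeneous nodes: "a" is twice as fast as "b"; "c" is down.
nodes = [Node("a", 4.0), Node("b", 2.0), Node("c", 1.0, alive=False)]
assigned = [dispatch(nodes, 1.0).name for _ in range(6)]
```

With these example values the faster node receives twice as many requests as the slower one, and the failed node receives none. A real balancer would also need to decrease `load` when requests complete, which this sketch omits.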
A novel approach for energy- and memory-efficient data loss prevention to support Internet of Things networks
Internet of Things integrates various technologies, including wireless sensor networks, edge computing, and cloud computing, to support a wide range of applications such as environmental monitoring and disaster surveillance. In these types of applications, IoT devices operate using limited resources in terms of battery, communication bandwidth, processing, and memory capacities. In this context, load balancing, fault tolerance, and energy and memory efficiency are among the most important issues related to data dissemination in IoT networks. In order to successfully cope with the abovementioned issues, two main approaches—data-centric storage and distributed data storage—have been proposed in the literature. Both approaches suffer from data loss due to memory and/or energy depletion in the storage nodes. Even though several techniques have been proposed so far to overcome the abovementioned problems, the proposed solutions typically focus on one issue at a time. In this article, we propose a cross-layer optimization approach to increase memory and energy efficiency as well as support load balancing. The optimization problem is a mixed-integer nonlinear programming problem, and we solve it using a genetic algorithm. Moreover, we integrate the data-centric storage features into distributed data storage mechanisms and present a novel heuristic approach, denoted as Collaborative Memory and Energy Management, to solve the underlying optimization problem. We also propose analytical and simulation frameworks for performance evaluation. Our results show that the proposed method outperforms the existing approaches in various IoT scenarios
Comparative study for load management of HBase and Cassandra distributed databases in big data
Advances in cloud computing, the increasing size of databases, and the emergence of big data have made traditional data management systems insufficient for storing and managing such large-scale data. As a result, new data storage mechanisms that can handle large-scale data have emerged. NoSQL databases are used to store and manage large amounts of data. They are intended to be open source, distributed, and horizontally scalable in order to provide high performance. Scalability is one of the important features of such systems: as the number of nodes increases, more requests can be served per unit of time. Distribution and scalability are always accompanied by load management, which balances the work among multiple nodes. The efficiency of load management varies from one system to another according to the load balancing technique used. This study evaluates the load management and scalability of HBase and Cassandra, as they are the most popular NoSQL databases modeled on Bigtable. In particular, this paper compares and analyzes load management for the distributed performance of HBase and Cassandra using a standard benchmark tool, the Yahoo! Cloud Serving Benchmark (YCSB). The experiments measure the performance of database operations with varying numbers of connections, operations, and processing nodes and varying database sizes. The experimental results show that HBase can provide better performance as the number of connections increases in the presence of horizontal scalability.
Maintaining Replica Consistency Over Large-Scale Data Grid Using Update Propagation Technique
A Data Grid is an organized collection of nodes in a wide area network that contribute computation, data storage, and applications. In a Data Grid, large numbers of users are distributed across a wide area environment that is dynamic and heterogeneous. Data management is one of the current issues, with data transparency, consistency, fault tolerance, automatic management, and performance being the user parameters in a grid environment. Data management techniques must scale up while addressing the autonomy, dynamicity, and heterogeneity of the data resources. Data replication is a well-known technique used to reduce access latency and improve availability and performance in a distributed computing environment. Replication introduces the problem of maintaining consistency among the replicas when files are allowed to be updated. The update information should be propagated to all replicas to guarantee correct reads of the remote replicas. Asynchronous replication is a commonly agreed solution to the replica consistency problem. A few studies have addressed maintaining replica consistency in Data Grids. However, the techniques introduced are neither efficient nor scalable. They cannot be used in a real Data Grid because they do not address the issues of a large number of replica sites, large-scale distribution, load balancing, and site autonomy, that is, the ability of a grid site to join and leave the grid community at any time.
This thesis proposes a new asynchronous replication protocol called Update Propagation Grid (UPG) to maintain replica consistency over a large-scale data grid. In UPG, updates reach all online secondary replicas through a propagation technique based on nodes organized into a logical network structured as a two-dimensional grid. The proposed update propagation technique is a dynamic, hybrid push-pull technique that addresses the issues of site autonomy, efficiency, scalability, load balancing, and fairness.
Two performance analysis studies have been conducted to compare the proposed technique with other techniques. The first study involves mathematical and simulation analysis. The second study is based on a Queuing Network Model. The results of the performance analysis show that the proposed technique scales well with a high number of replica sites and with high request loads. The results also show a reduction in the average update reach time of 5% to 97%. Moreover, the results show that the proposed technique is capable of achieving load balancing while providing update propagation fairness.
Flow Assignment and Processing on a Distributed Edge Computing Platform
The evolution of telecommunication networks toward the fifth generation of mobile services (5G), along with the increasing presence of cloud-native applications and the development of Cloud and Mobile Edge Computing (MEC) paradigms, has opened up new opportunities for the monitoring and management of logistics and transportation. We address the case of distributed streaming platforms with multiple message brokers and develop an optimization model for the real-time assignment and load balancing of data traffic generated by event streaming among Edge Computing facilities. The performance indicator function to be optimised is derived by adopting queuing models with different granularity (packet- and flow-level) that are suitably combined. A specific use case concerning a logistics application is considered, and numerical results are provided to show the effectiveness of the optimisation procedure, also in comparison to a “static” assignment proportional to the processing speed of the brokers.
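The "static" baseline mentioned in the abstract above, an assignment proportional to the processing speed of the brokers, can be sketched as follows. This illustrates only the baseline, not the paper's optimisation model; the function name `proportional_split`, the broker names, and the rate and speed values are hypothetical.

```python
def proportional_split(total_rate, speeds):
    """Split a total traffic rate among brokers in proportion to their speeds.

    total_rate: aggregate arrival rate to distribute (e.g. messages/s)
    speeds:     mapping of broker name -> processing speed (same units)
    """
    total_speed = sum(speeds.values())
    if total_speed <= 0:
        raise ValueError("total processing speed must be positive")
    return {name: total_rate * s / total_speed for name, s in speeds.items()}


# Hypothetical example: 150 msg/s shared by a slow and a fast broker.
speeds = {"broker1": 100.0, "broker2": 200.0}
shares = proportional_split(150.0, speeds)

# Under this rule every broker ends up with the same utilisation (rate / speed).
utilisation = {name: shares[name] / speeds[name] for name in speeds}
```

Because the split ignores queue state and traffic variations, every broker is held at the same utilisation regardless of current conditions, which is what makes it a natural reference point for a real-time assignment.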
DHTJoin: Processing Continuous Join Queries Using DHT Networks
Continuous query processing in data stream management systems (DSMS) has received considerable attention recently. Many applications share the same need for processing data streams in a continuous fashion. For most distributed streaming applications, the centralized processing of continuous queries over distributed data is simply not viable. This paper addresses the problem of computing approximate answers to continuous join queries over distributed data streams. We present a new method, called DHTJoin, which combines hash-based placement of tuples in a Distributed Hash Table (DHT) with dissemination of queries by exploiting the embedded trees in the underlying DHT, thereby incurring little overhead. DHTJoin also deals with join attribute value skew, which may hurt load balancing and result completeness. We provide a performance evaluation of DHTJoin which shows that it can achieve significant performance gains in terms of network traffic.
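The general idea of hash-based tuple placement can be sketched with a simple hash ring, in which tuples from different streams that share a join attribute value are routed to the same node and can therefore be joined locally. This is a generic illustration and not DHTJoin itself; the `Ring` class, the `ring_hash` helper, and the node names are assumptions introduced for the example.

```python
import bisect
import hashlib


def ring_hash(key: str) -> int:
    """Map a string to a position on the ring using SHA-1."""
    return int.from_bytes(hashlib.sha1(key.encode("utf-8")).digest(), "big")


class Ring:
    """A minimal hash ring: each key belongs to the next node clockwise."""

    def __init__(self, node_names):
        if not node_names:
            raise ValueError("at least one node is required")
        self._points = sorted((ring_hash(n), n) for n in node_names)
        self._keys = [point for point, _ in self._points]

    def node_for(self, join_value) -> str:
        """Return the node responsible for a given join attribute value."""
        i = bisect.bisect_left(self._keys, ring_hash(str(join_value)))
        return self._points[i % len(self._points)][1]


names = ["n1", "n2", "n3", "n4"]
ring = Ring(names)

# A tuple from stream R and a tuple from stream S with the same join value
# are placed on the same node.
r_home = ring.node_for(42)
s_home = ring.node_for(42)
```

Placement by join value also shows why skew matters: every tuple carrying a frequent value lands on a single node, which concentrates load there.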
A Survey on Load Balancing Algorithms for VM Placement in Cloud Computing
The emergence of cloud computing based on virtualization technologies brings huge opportunities to host virtual resources at low cost without the need to own any infrastructure. Virtualization technologies enable users to acquire and configure resources and to be charged on a pay-per-use basis. However, Cloud data centers mostly comprise heterogeneous commodity servers hosting multiple virtual machines (VMs) with potentially varied specifications and fluctuating resource usage, which may cause imbalanced resource utilization within servers and, in turn, performance degradation and violations of service level agreements (SLAs). To achieve efficient scheduling, these challenges should be addressed by using load balancing strategies, a problem that has been proved to be NP-hard. From multiple perspectives, this work identifies the challenges and analyzes existing algorithms for allocating VMs to PMs in infrastructure Clouds, with a particular focus on load balancing. A detailed classification of load balancing algorithms for VM placement in cloud data centers is presented, and the surveyed algorithms are categorized accordingly. The goal of this paper is to provide a comprehensive and comparative understanding of the existing literature and to aid researchers by providing insight into potential future enhancements. Comment: 22 pages, 4 figures, 4 tables, in press
HEC: Collaborative Research: SAM^2 Toolkit: Scalable and Adaptive Metadata Management for High-End Computing
The increasing demand for exabyte-scale storage capacity by high-end computing applications requires a higher level of scalability and dependability than that provided by current file and storage systems. The proposal deals with file systems research for metadata management of scalable cluster-based parallel and distributed file storage systems in the HEC environment. It aims to develop a scalable and adaptive metadata management (SAM2) toolkit to extend the features of, and fully leverage the peak performance promised by, state-of-the-art cluster-based parallel and distributed file storage systems used by the high-performance computing community. There is a large body of research on scaling data movement and management; however, the need to scale up the attributes of cluster-based file systems and I/O, that is, metadata, has been underestimated. An understanding of the characteristics of metadata traffic, and the application of proper load-balancing, caching, prefetching, and grouping mechanisms to manage metadata accordingly, will lead to high scalability. It is anticipated that by appropriately plugging the scalable and adaptive metadata management components into state-of-the-art cluster-based parallel and distributed file storage systems, one could potentially increase the performance of applications and file systems and help translate the promise and potential of the high peak performance of such systems into real application performance improvements.
The project involves the following components:
1. Develop multi-variable forecasting models to analyze and predict file metadata access patterns.
2. Develop scalable and adaptive file name mapping schemes using the duplicative Bloom filter array technique to enforce load balance and increase scalability.
3. Develop decentralized, locality-aware metadata grouping schemes to facilitate bulk metadata operations such as prefetching.
4. Develop an adaptive cache coherence protocol using a distributed shared object model for client-side and server-side metadata caching.
5. Prototype the SAM2 components into the state-of-the-art parallel virtual file system PVFS2 and a distributed storage data caching system, set up an experimental framework for a DOE CMS Tier 2 site at University of Nebraska-Lincoln, and conduct benchmark, evaluation, and validation studies.
Load Balancing and Virtual Machine Allocation in Cloud-based Data Centers
As cloud services see an exponential increase in consumers, the demand for faster data processing and reliable service delivery becomes a pressing concern. This puts considerable pressure on cloud-based data centers, where the consumers’ data is stored, processed, and serviced. The rising demand for high-quality services and the constrained environment make load balancing within cloud data centers a vital concern. This project aims to achieve load balancing within the data centers by implementing a Virtual Machine allocation policy based on a consensus algorithm. The cloud-based data center system, consisting of Virtual Machines, has been simulated on CloudSim, a Java-based cloud simulator.