Building a generalized distributed system model
Several topics related to building a generalized distributed system model are discussed: the effects of distributed database modeling on the evaluation of transaction rollbacks, the measurement of the effects of distributed database models on transaction availability measures, and a performance analysis of static locking in replicated distributed database systems.
Magpie: Automatically Tuning Static Parameters for Distributed File Systems using Deep Reinforcement Learning
Distributed file systems are widely used nowadays, yet using their default
configurations is often not optimal. At the same time, tuning configuration
parameters is typically challenging and time-consuming. It demands expertise,
and tuning operations can also be expensive. This is especially the case for
static parameters, where changes take effect only after a restart of the system
or workloads. We propose a novel approach, Magpie, which utilizes deep
reinforcement learning to tune static parameters by strategically exploring and
exploiting configuration parameter spaces. To boost the tuning of the static
parameters, our method employs both server and client metrics of distributed
file systems to understand the relationship between static parameters and
performance. Our empirical evaluation results show that Magpie can noticeably
improve the performance of the distributed file system Lustre, where our
approach on average achieves 91.8% throughput gains against the default
configuration after tuning towards single-performance-indicator optimization,
while it reaches 39.7% more throughput gains against the baseline.
Comment: Accepted at The IEEE International Conference on Cloud Engineering
(IC2E) conference 202
Process-aware web programming with Jolie
We extend the Jolie programming language to capture the native modelling of
process-aware web information systems, i.e., web information systems based upon
the execution of business processes. Our main contribution is to offer a
unifying approach for the programming of distributed architectures on the web,
which can capture web servers, stateful process execution, and the composition
of services via mediation. We discuss applications of this approach through a
series of examples that cover, e.g., static content serving, multiparty
sessions, and the evolution of web systems. Finally, we present a performance
evaluation that includes a comparison of Jolie-based web systems to other
frameworks and a measurement of its scalability.
Comment: IMADA-preprint-c
Methodology for modeling high performance distributed and parallel systems
Performance modeling of distributed and parallel systems is of considerable importance to the high performance computing community. To achieve high performance, proper task or process assignment and data or file allocation among processing sites is essential. This dissertation describes an elegant approach to model distributed and parallel systems, which combines the optimal static solutions for data allocation with dynamic policies for task assignment. A performance-efficient system model is developed using analytical tools and techniques.
The system model is developed in three steps. First, the basic client-server model, which allows only data transfer, is evaluated. A prediction and evaluation method is developed to examine the system behavior and estimate performance measures. The method is based on known product-form queueing networks. The next step extends the model so that each site of the system behaves as both client and server. A data-allocation strategy is designed at this stage which optimally assigns the data to the processing sites. The strategy is based on the flow deviation technique in queueing models. The third stage considers process-migration policies. A novel on-line adaptive load-balancing algorithm is proposed which dynamically migrates processes and transfers data among different sites to minimize the job execution cost. The gradient-descent rule is used to optimize the cost function, which expresses the cost of process execution at different processing sites.
The accuracy of the prediction method and the effectiveness of the analytical techniques are established by simulation. The modeling procedure described here is general and applicable to any message-passing distributed and parallel system. The proposed techniques and tools can be easily utilized in other related areas such as networking and operating systems. This work contributes significantly towards the design of distributed and parallel systems where performance is critical.
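The gradient-descent idea in the third stage can be illustrated with a minimal sketch. It is not the dissertation's algorithm: the abstract gives neither the cost function nor the update rule. The sketch assumes each site behaves as an M/M/1 queue with service rate mu_i, takes the cost to be the mean number of jobs in the system, sum_i x_i / (mu_i - x_i), and moves load between sites with a projected gradient step that keeps the total load fixed. The names `balance_load`, `mu`, and `step` are hypothetical.

```python
def balance_load(loads, mu, step=0.5, iterations=500):
    """Projected gradient descent on an ASSUMED M/M/1 cost (not the thesis's cost).

    loads[i] is the job arrival rate sent to site i; mu[i] is its service rate.
    """
    x = list(loads)
    n = len(x)
    for _ in range(iterations):
        # Every site must stay stable, otherwise the cost is undefined.
        if any(xi < 0 or xi >= m for xi, m in zip(x, mu)):
            raise ValueError("a site left the stable range 0 <= x < mu")
        # d/dx [x / (mu - x)] = mu / (mu - x)^2
        grad = [m / (m - xi) ** 2 for xi, m in zip(x, mu)]
        mean_grad = sum(grad) / n
        # Subtracting the mean gradient keeps sum(x) constant.
        x = [xi - step * (g - mean_grad) for xi, g in zip(x, grad)]
    return x


# Two identical sites: the loads should move toward an even split.
print(balance_load([6.0, 2.0], mu=[10.0, 10.0]))
```

With identical sites the marginal costs are equal only at an even split, so the loads drift toward it; with unequal service rates the faster site ends up with the larger share.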
Probabilistic field mapping for product search
Master's thesis in Computer science
Online shopping has grown rapidly in the last few years. Robust
search systems are arguably fundamental to e-commerce sites. Most importantly,
sites should have smart retrieval systems that present optimized results
that best satisfy customers' purchase intent. To address the demand for
such systems, we adapted retrieval approaches based on a generative language
modeling framework, representing products as semi-structured documents.
We present and experimentally compare three alternative ranking functions
that make use of different prior estimates. The first method is a static field
weighting approach that relies on each field’s individual performance, taking nDCG as
the effectiveness measure. The two other methods dynamically assign term-field
weights according to the distribution of terms in each field’s collection. These
retrieval functions probabilistically infer, from the user's search keywords,
the most likely matching product property. The two methods differ in that one
considers a uniform field prior whereas the other uses a performance-based
prior. The methods were evaluated with Living Labs, a relatively new evaluation
methodology that evaluates ranking systems while real customers are shopping
online, here at the toy webshop ‘regiojatek.hu’. In the experiment, the lab
presents customers with an interleaved result list, based on team-draft
interleaving, drawn from the production site and our experimental rankings.
The lab employs an evaluation metric called “outcome”, which we used to compare
our methods and to interpret our results. Our results show that both
term-specific mapping methods outperformed the static weight assignment
approach. In addition, the results suggest that estimating field mapping priors
based on historical clicks does not outperform the setting where the priors
are uniformly distributed. Furthermore, we discovered that a TREC-style
evaluation that treated historical clicks as relevance indicators
ordered the methods inversely relative to Living Labs. A possible
implication is that Living Labs evaluation platforms are essential in IR tasks.
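The term-to-field mapping described above can be sketched as follows. The abstract does not give the formula, so this assumes a Bayes-rule mapping in which the probability of a field given a query term is proportional to the term's probability in that field's collection language model times a field prior. The function name `field_mapping` and the toy probabilities are hypothetical.

```python
def field_mapping(term, field_term_probs, priors=None):
    """Return P(field | term), assumed proportional to P(term | field) * P(field).

    field_term_probs maps each field name to a dict of term probabilities
    estimated from that field's collection. priors defaults to uniform.
    """
    fields = list(field_term_probs)
    if priors is None:
        priors = {f: 1.0 / len(fields) for f in fields}
    scores = {f: field_term_probs[f].get(term, 0.0) * priors[f] for f in fields}
    norm = sum(scores.values())
    if norm == 0.0:
        # Unseen term: fall back to the normalized prior.
        total = sum(priors.values())
        return {f: priors[f] / total for f in fields}
    return {f: s / norm for f, s in scores.items()}


# Toy collection statistics (made up for illustration only).
stats = {"title": {"lego": 0.02}, "brand": {"lego": 0.06}}
print(field_mapping("lego", stats))                                # uniform prior
print(field_mapping("lego", stats, {"title": 0.75, "brand": 0.25}))  # non-uniform prior
```

Swapping the `priors` argument is the only difference between the uniform-prior and performance-based-prior variants in this sketch; how the thesis estimates the performance-based prior is not stated in the abstract.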
Development of Cluster Computing – A Review
This paper reviews work on “Cluster Computing” in depth and detail. The works reviewed are: Cluster Computing: A Mobile Code Approach by R. B. Patel and Manpreet Singh (2006); Performance Evaluation of Parallel Applications Using Message Passing Interface In Network of Workstations Of Different Computing Powers by Rajkumar Sharma, Priyesh Kanungo and Manohar Chandwani (2011); On the Performance of MPI-OpenMP on a 12 nodes Multi-core Cluster by Abdelgadir Tageldin, Al-Sakib Khan Pathan, Mohiuddin Ahmed (2011); Dynamic Load Balancing in Parallel Processing on Non-Homogeneous Clusters by Armando E. De Giusti, Marcelo R. Naiouf, Laura C. De Giusti, Franco Chichizola (2005); Performance Evaluation of Computation Intensive Tasks in Grid by P. Raghu, K. Sriram (2011); Automatic Distribution of Vision-Tasks on Computing Clusters by Thomas Muller, Binh An Tran and Alois Knoll (2011); Terminology And Taxonomy Parallel Computing Architecture by Amardeep Singh, Satinder Pal Singh, Vandana, Sukhnandan Kaur (2011); Research of Distributed Algorithm based on Parallel Computer Cluster System by Xu He-li, Liu Yan (2010); Cluster Computing Using Orders Based Transparent Parallelizing by Vitaliy D. Pavlenko, Victor V. Burdejnyj (2007); and VCE: A New Personated Virtual Cluster Engine for Cluster Computing by Mohsen Sharifi, Masoud Hassani, Ehsan Mousavi Khaneghah, Seyedeh Leili Mirtaheri (2008).
Keywords: Cluster computing, Cluster Architectures, Dynamic and Static Load Balancing, Distributed Systems, Homogeneous and Non-Homogeneous Processors, Multicore clusters, Parallel computing, Parallel Computer Vision, Task parallelism, Terminology and taxonomy, Virtualization, Virtual Cluster
NASA Proof-of-Concept 1-We Stirling Convertor Development for Small Radioisotope Power Systems
Low power Stirling convertors are being developed at NASA Glenn Research Center to provide future small spacecraft with electrical power by converting heat from one or more Light Weight Radioisotope Heater Units (LWRHU). An initial design converts multiple watts of heat to one watt of electrical power output using a Stirling convertor. A variety of mission concepts have been studied by NASA and the U.S. Department of Energy that would utilize low power Radioisotope Power Systems (RPS) for probes, landers, rovers, and repeaters. These missions would contain science instruments distributed across planetary surfaces or near objects of interest where solar flux is insufficient for using solar cells. Landers could be used to provide data such as radiation, temperature, pressure, seismic activity, and other surface measurements for planetary science and to inform future mission planners. The studies propose using fractional versions of the General Purpose Heat Source or multiple LWRHUs to heat power conversion technologies for science instruments and communication. Dynamic power systems are capable of higher conversion efficiencies, which could enable equal power using less fuel or more power using equal fuel, when compared to less efficient static power conversion technologies. Providing spacecraft with more power would decrease duty cycling of basic functions and, therefore, increase the quality and abundance of science data. Efforts to develop the concept have focused on maturation of a 1-We convertor and controller design and performance evaluation of an evacuated metal foil insulation. A proof-of-concept 1-We convertor, controller, and evacuated metal foil insulation package have been fabricated and are undergoing characterization testing. The current status, findings, and path forward for the effort are explained in this paper.
Designing energy-efficient computing systems using equalization and machine learning
As technology scaling slows down in the nanometer CMOS regime and mobile computing becomes more ubiquitous, designing energy-efficient hardware for mobile systems is becoming increasingly critical and challenging. Although various approaches, such as near-threshold computing (NTC) and aggressive voltage scaling with shadow latches, have been proposed to get the most out of limited battery life, there is still no “silver bullet” for the increasing power-performance demands of mobile systems. Moreover, given that a mobile system may operate in a variety of environmental conditions (for example, at different temperatures) and may have varying performance requirements, there is a growing need for tunable/reconfigurable systems in order to achieve energy-efficient operation. In this work we propose to address the energy-efficiency problem of mobile systems using two different approaches: circuit tunability and distributed adaptive algorithms.
Inspired by communication systems, we developed feedback-equalization-based digital logic that changes the threshold of its gates based on the input pattern. We showed that feedback equalization in static complementary CMOS logic enabled up to 20% reduction in energy dissipation while maintaining the performance metrics. We also achieved 30% reduction in energy dissipation for pass-transistor digital logic (PTL) with equalization while maintaining performance. In addition, we proposed a mechanism that leverages feedback equalization techniques to achieve near-optimal operation of static complementary CMOS logic blocks over the entire voltage range from near-threshold supply voltage to nominal supply voltage. Using energy-delay product (EDP) as a metric, we analyzed the use of the feedback equalizer as part of various sequential computational blocks. Our analysis shows that for near-threshold voltage operation, when equalization is used, the operating frequency can be improved by up to 30% while the energy increase is less than 15%, giving an overall EDP reduction of ≈10%. We also observe an EDP reduction of close to 5% across the entire above-threshold voltage range.
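The near-threshold figures can be checked with a short calculation. This is only a back-of-the-envelope sketch: it assumes EDP is energy per operation multiplied by delay, that delay is inversely proportional to operating frequency, and it plugs in the bounds quoted above (30% frequency gain, 15% energy increase). The function name `edp_change` is hypothetical.

```python
def edp_change(freq_gain, energy_increase):
    """Relative change in energy-delay product, assuming delay = 1 / frequency."""
    delay_ratio = 1.0 / (1.0 + freq_gain)   # faster clock, shorter delay
    energy_ratio = 1.0 + energy_increase    # extra energy spent on equalization
    return energy_ratio * delay_ratio - 1.0


change = edp_change(freq_gain=0.30, energy_increase=0.15)
print(f"EDP change: {change:+.1%}")
```

Under these assumptions the ratio is 1.15 / 1.30, a reduction of roughly 11.5%, which is in line with the ≈10% figure reported in the abstract.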
On the distributed adaptive algorithm front, we explored energy-efficient hardware implementation of machine learning algorithms. We proposed an adaptive classifier that leverages the wide variability in data complexity to enable energy-efficient data classification operations for mobile systems. Our approach takes advantage of varying classification hardness across data to dynamically allocate resources and improve energy efficiency. On average, our adaptive classifier is ≈100× more energy efficient but has ≈1% higher error rate than a complex radial basis function classifier and is ≈10× less energy efficient but has ≈40% lower error rate than a simple linear classifier across a wide range of classification data sets. We also developed a field of groves (FoG) implementation of random forests (RF) that achieves an accuracy comparable to Convolutional Neural Networks (CNN) and Support Vector Machines (SVM) under tight energy budgets. The FoG architecture takes advantage of the fact that in random forests a small portion of the weak classifiers (decision trees) might be sufficient to achieve high statistical performance. By dividing the random forest into smaller forests (Groves), and conditionally executing the rest of the forest, FoG is able to achieve much higher energy efficiency levels for comparable error rates. We also take advantage of the distributed nature of the FoG to achieve a high level of parallelism. Our evaluation shows that at maximum achievable accuracies FoG consumes ≈1.48×, ≈24×, ≈2.5×, and ≈34.7× lower energy per classification compared to conventional RF, SVM-RBF, Multi-Layer Perceptron Network (MLP), and CNN, respectively. FoG is 6.5× less energy efficient than SVM-LR, but achieves 18% higher accuracy on average across all considered datasets.
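The conditional execution of groves can be sketched in software as an early-exit loop. This is an illustration only, not the hardware design: the abstract does not state the stopping rule, so the sketch assumes groves are evaluated one at a time and evaluation stops once the averaged class probability of the leading class reaches a confidence threshold. The name `classify_with_groves`, the `threshold` parameter, and the toy groves are hypothetical.

```python
def classify_with_groves(groves, x, threshold=0.8):
    """Evaluate groves in order; stop early once the prediction is confident.

    groves is a list of callables, each returning a list of class
    probabilities for input x. Returns (predicted_class, groves_used).
    """
    if not groves:
        raise ValueError("at least one grove is required")
    total = None
    for used, grove in enumerate(groves, start=1):
        probs = grove(x)
        total = probs if total is None else [t + p for t, p in zip(total, probs)]
        avg = [t / used for t in total]
        best = max(range(len(avg)), key=avg.__getitem__)
        if avg[best] >= threshold:
            break  # remaining groves are skipped, saving their energy
    return best, used


# Toy groves with fixed outputs (made up for illustration only).
groves = [lambda x: [0.6, 0.4], lambda x: [0.9, 0.1], lambda x: [0.5, 0.5]]
print(classify_with_groves(groves, x=None, threshold=0.7))
```

Easy inputs exit after the first grove while hard inputs consume more of the forest, which is the source of the energy saving in this sketch.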
Hierarchical Dynamic Loop Self-Scheduling on Distributed-Memory Systems Using an MPI+MPI Approach
Computationally-intensive loops are the primary source of parallelism in
scientific applications. Such loops are often irregular and a balanced
execution of their loop iterations is critical for achieving high performance.
However, several factors may lead to an imbalanced load execution, such as
problem characteristics and algorithmic and systemic variations. Dynamic loop
self-scheduling (DLS) techniques are devised to mitigate these factors, and
consequently, improve application performance. On distributed-memory systems,
DLS techniques can be implemented using a hierarchical master-worker execution
model and are, therefore, called hierarchical DLS techniques. These techniques
self-schedule loop iterations at two levels of hardware parallelism: across and
within compute nodes. Hybrid programming approaches that combine the message
passing interface (MPI) with open multi-processing (OpenMP) dominate the
implementation of hierarchical DLS techniques. The MPI-3 standard includes the
feature of sharing memory regions among MPI processes. This feature introduced
the MPI+MPI approach that simplifies the implementation of parallel scientific
applications. The present work designs and implements hierarchical DLS
techniques by exploiting the MPI+MPI approach. Four well-known DLS techniques
are considered in the evaluation proposed herein. The results indicate certain
performance advantages of the proposed approach compared to the hybrid
MPI+OpenMP approach.
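The two-level chunk calculation behind hierarchical self-scheduling can be sketched as follows. The abstract does not name the four DLS techniques, so guided self-scheduling (GSS), in which each request receives ceil(remaining / workers) iterations, is used here purely as an assumed example. The sketch is sequential and only computes chunk sizes; it uses no MPI, and the function names are hypothetical.

```python
import math


def gss_chunks(iterations, workers):
    """Chunk sizes under guided self-scheduling: ceil(remaining / workers) per request."""
    chunks = []
    remaining = iterations
    while remaining > 0:
        chunk = math.ceil(remaining / workers)
        chunks.append(chunk)
        remaining -= chunk
    return chunks


def hierarchical_gss(iterations, nodes, workers_per_node):
    """Two-level schedule: GSS across nodes, then GSS again inside each node-level chunk."""
    return [gss_chunks(node_chunk, workers_per_node)
            for node_chunk in gss_chunks(iterations, nodes)]


schedule = hierarchical_gss(100, nodes=2, workers_per_node=4)
for node_chunk in schedule:
    print(sum(node_chunk), node_chunk)
```

In an actual implementation the remaining-iteration counters would be shared, across nodes at the first level and within a node at the second, so that idle workers can claim the next chunk on their own; the sketch leaves that out.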