
    Regrouping metric-space search index for search engine size adaptation

    This work contributes to the development of search engines that self-adapt their size in response to fluctuations in workload. Deploying a search engine in an Infrastructure as a Service (IaaS) cloud facilitates allocating computational resources to the engine or deallocating them from it. In this paper, we focus on the problem of regrouping the metric-space search index when the number of virtual machines used to run the search engine is modified to reflect changes in workload. We propose an algorithm for incrementally adjusting the index to fit the varying number of virtual machines. We tested its performance using a custom-built prototype search engine deployed in the Amazon EC2 cloud, while calibrating the results to compensate for the performance fluctuations of the platform. Our experiments show that, compared with computing the index from scratch, the incremental algorithm speeds up the index computation 2–10 times while maintaining similar search performance.
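    To make the idea of incremental regrouping concrete, the sketch below shows one minimal way an index split into clusters could be reassigned when the VM count changes, moving only the clusters whose owner changes instead of rebuilding from scratch. The modulo placement rule, the regroup() helper, and the cluster/VM counts are illustrative assumptions, not the paper's actual algorithm.

```python
# Minimal sketch (assumed placement rule, not the paper's algorithm):
# reassign metric-space index clusters when the number of VMs changes,
# moving only the clusters whose owner actually changes.

def assign(cluster_id: int, num_vms: int) -> int:
    """Hypothetical placement rule: cluster -> VM by modulo."""
    return cluster_id % num_vms

def regroup(placement: dict, num_vms: int) -> list:
    """Return (cluster, old_vm, new_vm) moves needed after resizing to num_vms."""
    moves = []
    for cluster_id, old_vm in placement.items():
        new_vm = assign(cluster_id, num_vms)
        if new_vm != old_vm:
            moves.append((cluster_id, old_vm, new_vm))
            placement[cluster_id] = new_vm
    return moves

# Example: shrink from 4 VMs to 3; only clusters that change owner are moved.
placement = {c: assign(c, 4) for c in range(12)}
print(regroup(placement, 3))
```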

    Multi-dimensional multiple query scheduling with distributed semantic caching framework

    It is becoming more important to leverage large amounts of distributed cache memory seamlessly in modern large-scale systems. Several previous studies showed that traditional scheduling policies often fail to achieve a high cache hit ratio and good system load balance with large-scale distributed caching facilities. To maximize system throughput, distributed caching facilities should balance the workload and leverage cached data at the same time. In this work, we present a distributed job processing framework that yields a high cache hit ratio while keeping the system load balanced. Our framework employs a scheduling policy, DEMA, that considers both cache hit ratio and system load, and it supports multiple geographically distributed job schedulers. We show that collaborative task scheduling and data migration can further improve performance by increasing the cache hit ratio while achieving good load balance. Our experiments show that the proposed job scheduling policies outperform a legacy load-based job scheduling policy in terms of job response time, load balancing, and cache hit ratio.
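    As a rough illustration of the trade-off described above (not the DEMA policy itself), the sketch below dispatches each job to the server with the best combination of expected cache reuse and current load. The Server class, the affinity() heuristic, and the ALPHA weight are assumptions made for this example.

```python
# Minimal sketch, not the DEMA policy: a scheduler that weighs expected
# cache reuse against current load when dispatching a job. Server names,
# the affinity() heuristic, and the ALPHA weight are illustrative assumptions.

from dataclasses import dataclass, field

ALPHA = 0.5  # assumed trade-off between cache affinity and load balance

@dataclass
class Server:
    name: str
    load: int = 0
    cache: set = field(default_factory=set)

def affinity(server: Server, job_inputs: set) -> float:
    """Fraction of the job's inputs already resident in the server's cache."""
    return len(server.cache & job_inputs) / max(len(job_inputs), 1)

def schedule(servers: list, job_inputs: set) -> Server:
    """Pick the server with the best cache-affinity / load trade-off."""
    best = max(servers,
               key=lambda s: ALPHA * affinity(s, job_inputs) - (1 - ALPHA) * s.load)
    best.load += 1
    best.cache |= job_inputs  # inputs are cached on the server after the job runs
    return best

servers = [Server("s1"), Server("s2", cache={"a", "b"})]
print(schedule(servers, {"a", "c"}).name)  # favors s2, which already caches "a"
```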

    A multi-level hypergraph partitioning algorithm using rough set clustering

    The hypergraph partitioning problem has many applications in scientific computing and provides a more accurate inter-processor communication model for distributed systems than the equivalent graph problem. In this paper, we propose a sequential multi-level hypergraph partitioning algorithm. The algorithm makes novel use of rough set clustering to categorise the vertices of the hypergraph: it treats hyperedges as features of the hypergraph and tries to discard unimportant hyperedges to make better clustering decisions. It also focuses on the trade-off between local vertex-matching decisions (which have low cost in terms of space and time) and global decisions (which can be of better quality but cost more). The algorithm is evaluated and compared to state-of-the-art algorithms on a range of benchmarks. The results show that it produces partitions of better quality.
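    The sketch below outlines only the generic multi-level skeleton that such an algorithm builds on (coarsen, partition the coarsest hypergraph, then project back). The rough set clustering step is replaced by a trivial pair-matching heuristic and the refinement pass is omitted, so this is an assumed illustration rather than the proposed algorithm.

```python
# Skeleton sketch of a generic multi-level partitioning scheme. The
# rough-set clustering step is stood in for by a naive pair-matching
# heuristic, and no refinement pass is performed.

def coarsen(vertices, hyperedges):
    """Merge vertex pairs that share a hyperedge (stand-in for rough set clustering)."""
    merged, mapping, used = [], {}, set()
    for e in hyperedges:
        pins = [v for v in e if v not in used]
        if len(pins) >= 2:
            a, b = pins[0], pins[1]
            used.update((a, b))
            mapping[a] = mapping[b] = len(merged)
            merged.append((a, b))
    for v in vertices:
        if v not in used:
            mapping[v] = len(merged)
            merged.append((v,))
    coarse_edges = [{mapping[v] for v in e} for e in hyperedges]
    return merged, coarse_edges

def initial_partition(num_coarse_vertices, k=2):
    """Round-robin seed partition on the coarsest hypergraph."""
    return {v: v % k for v in range(num_coarse_vertices)}

def project(partition, merged):
    """Uncoarsen: each original vertex inherits its coarse vertex's part."""
    return {v: partition[c] for c, group in enumerate(merged) for v in group}

vertices = [0, 1, 2, 3, 4, 5]
hyperedges = [{0, 1, 2}, {2, 3}, {3, 4, 5}]
merged, coarse_edges = coarsen(vertices, hyperedges)
print(project(initial_partition(len(merged)), merged))
```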

    The Effect of Various Sparsity Structures on Parallelism and Algorithms to Reveal Those Structures

    Structured sparse matrices can greatly benefit parallel numerical methods in terms of parallel performance and convergence. In this chapter, we present combinatorial models for obtaining several different sparse matrix forms. We focus on four basic forms: singly-bordered block-diagonal form, doubly-bordered block-diagonal form, nonempty off-diagonal block minimization, and block-diagonal-with-overlap form. For each of these forms, we first describe the form in detail and the goals sought within it, then examine the combinatorial models that attain the form while targeting those goals, and finally explain in which respects the form benefits certain parallel numerical methods and how it relates to the models. Our work focuses especially on graph and hypergraph partitioning models for obtaining the mentioned forms. Despite their relatively high preprocessing overhead compared to other heuristics, these models have proven to capture the given problem more accurately, and the overhead can often be amortized because the matrix structure does not change much during a typical numerical simulation. This chapter presents a number of models and their relationship with parallel numerical methods.
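    As a small worked illustration of one of these forms, the sketch below permutes a sparse matrix into singly-bordered block-diagonal form given a column partition of the kind a hypergraph partitioner would produce. The 2-way partition and the 4x4 matrix are hard-coded assumptions, not output of the chapter's models.

```python
# Illustrative sketch only: given an assumed column partition, permute a
# sparse matrix into singly-bordered block-diagonal form by sending rows
# that couple several column parts to the border.

import numpy as np

def sb_permutation(A, col_part, k):
    """Return row/column orders putting A into singly-bordered block-diagonal form."""
    rows, cols = A.shape
    row_block = []
    for i in range(rows):
        parts = {col_part[j] for j in np.nonzero(A[i])[0]}
        # Rows touching exactly one column part join that block; others go to the border (k).
        row_block.append(parts.pop() if len(parts) == 1 else k)
    row_order = sorted(range(rows), key=lambda i: row_block[i])
    col_order = sorted(range(cols), key=lambda j: col_part[j])
    return row_order, col_order

A = np.array([[1, 1, 0, 0],
              [0, 0, 1, 1],
              [1, 0, 0, 1],   # couples both parts -> border row
              [0, 1, 0, 0]])
col_part = [0, 0, 1, 1]       # assumed 2-way column partition
r, c = sb_permutation(A, col_part, k=2)
print(A[np.ix_(r, c)])        # diagonal blocks on top, coupling row at the bottom
```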