5 research outputs found

    Adaptive Merging on Phase Change Memory

    Full text link
    Indexing is a well-known database technique used to facilitate data access and speed up query processing. Nevertheless, the construction and modification of indexes are very expensive. In traditional approaches, all records in the database table are equally covered by the index. It is not effective, since some records may be queried very often and some never. To avoid this problem, adaptive merging has been introduced. The key idea is to create index adaptively and incrementally as a side-product of query processing. As a result, the database table is indexed partially depending on the query workload. This paper faces a problem of adaptive merging for phase change memory (PCM). The most important features of this memory type are: limited write endurance and high write latency. As a consequence, adaptive merging should be investigated from the scratch. We solve this problem in two steps. First, we apply several PCM optimization techniques to the traditional adaptive merging approach. We prove that the proposed method (eAM) outperforms a traditional approach by 60%. After that, we invent the framework for adaptive merging (PAM) and a new PCM-optimized index. It further improves the system performance by 20% for databases where search queries interleave with data modifications

    Discovery of Potential Parallelism in Sequential Programs

    Get PDF
    In the era of multicore processors, the responsibility for performance gains has been shifted onto software developers. Once improvements of the sequential algorithm have been exhausted, software-managed parallelism is the only option left. However, writing parallel code is still difficult, especially when parallelizing sequential code written by someone else. A key task in this process is the identification of suitable parallelization targets in the source code. Parallelism discovery tools help developers to find such targets automatically. Unfortunately, tools that identify parallelism during compilation are usually conservative due to the lack of runtime information, and tools relying on runtime information primarily suffer from high overhead in terms of both time and memory. This dissertation presents a generic framework for parallelism discovery based on dynamic program analysis, supporting various types of parallelism while incurring practically affordable overhead. The framework contains two main components: an efficient data-dependence profiler and a set of parallelism discovery algorithms based on a language-independent concept called Computational Unit. The data-dependence profiler serves as the foundation of the parallelism discovery framework. Traditional dependence profiling approaches introduce a tremendous amount of time and memory overhead. To lower the overhead, current methods limit their scope to the subset of the dependence information needed for the analysis they have been created for, sacrificing generality and discouraging reuse. In contrast, the profiler shown in this thesis addresses the problem via signature-based memory management and a lock-free parallel design. It produces detailed dependences not only for sequential but also for multi-threaded code without causing prohibitive overhead, allowing it to serve as a generic base for various program analysis techniques. Computational Units (CUs) provide a language-independent foundation for parallelism discovery. CUs are computations that follow the read-compute-write pattern. Unlike other concepts, they are not restricted to predefined language constructs. A program is represented as a CU graph, in which vertexes are CUs and edges are data dependences. This allows parallelism to be detected that spreads across multiple language constructs, taking code refactoring into consideration. The parallelism discovery algorithms cover both loop and task parallelism. Results of our experiments show that 1) the efficient data-dependence profiler has a very competitive average slowdown of around 80× with accuracy higher than 99.6%; 2) the framework discovers parallelism with high accuracy, identifying 92.5% of the parallel loops in NAS benchmarks; 3) when parallelizing well-known open-source software following the outputs of the framework, reasonable speedups are obtained. Finally, use cases beyond parallelism discovery are briefly demonstrated to show the generality of the framework
    corecore