81 research outputs found

    Depth-first-search and dynamic programming algorithms for efficient CMOS cell generation

    Full text link

    Reducing Cache Contention On GPUs

    Get PDF
    The usage of Graphics Processing Units (GPUs) as an application accelerator has become increasingly popular because, compared to traditional CPUs, they are more cost-effective, their highly parallel nature complements a CPU, and they are more energy efficient. With the popularity of GPUs, many GPU-based compute-intensive applications (a.k.a., GPGPUs) present significant performance improvement over traditional CPU-based implementations. Caches, which significantly improve CPU performance, are introduced to GPUs to further enhance application performance. However, the effect of caches is not significant for many cases in GPUs and even detrimental for some cases. The massive parallelism of the GPU execution model and the resulting memory accesses cause the GPU memory hierarchy to suffer from significant memory resource contention among threads. One cause of cache contention arises from column-strided memory access patterns that GPU applications commonly generate in many data-intensive applications. When such access patterns are mapped to hardware thread groups, they become memory-divergent instructions whose memory requests are not GPU hardware friendly, resulting in serialized access and performance degradation. Cache contention also arises from cache pollution caused by lines with low reuse. For the cache to be effective, a cached line must be reused before its eviction. Unfortunately, the streaming characteristic of GPGPU workloads and the massively parallel GPU execution model increase the reuse distance, or equivalently reduce reuse frequency of data. In a GPU, the pollution caused by a large reuse distance data is significant. Memory request stall is another contention factor. A stalled Load/Store (LDST) unit does not execute memory requests from any ready warps in the issue stage. This stall prevents the potential hit chances for the ready warps. This dissertation proposes three novel architectural modifications to reduce the contention: 1) contention-aware selective caching detects the memory-divergent instructions caused by the column-strided access patterns, calculates the contending cache sets and locality information and then selectively caches; 2) locality-aware selective caching dynamically calculates the reuse frequency with efficient hardware and caches based on the reuse frequency; and 3) memory request scheduling queues the memory requests from a warp issuing stage, frees the LDST unit stall and schedules items from the queue to the LDST unit by multiple probing of the cache. Through systematic experiments and comprehensive comparisons with existing state-of-the-art techniques, this dissertation demonstrates the effectiveness of our aforementioned techniques and the viability of reducing cache contention through architectural support. Finally, this dissertation suggests other promising opportunities for future research on GPU architecture

    Area-power-delay trade-off in logic synthesis

    Get PDF
    This thesis introduces new concepts to perform area-power-delay trade-offs in a logic synthesis system. To achieve this, a new delay model is presented, which gives accurate delay estimations for arbitrary sets of Boolean expressions. This allows use of this delay model already during the very first steps of logic synthesis. Furthermore, new algorithms are presented for a number of different optimization tasks within logic synthesis. There are new algorithms to create prime irredundant Boo lean expressions, to perform technology mapping for use with standard cell generators, and to perform gate sizing. To prove the validity of the presented ideas, benchmark results are given throughout the thesis

    Digital watermark technology in security applications

    Get PDF
    With the rising emphasis on security and the number of fraud related crimes around the world, authorities are looking for new technologies to tighten security of identity. Among many modern electronic technologies, digital watermarking has unique advantages to enhance the document authenticity. At the current status of the development, digital watermarking technologies are not as matured as other competing technologies to support identity authentication systems. This work presents improvements in performance of two classes of digital watermarking techniques and investigates the issue of watermark synchronisation. Optimal performance can be obtained if the spreading sequences are designed to be orthogonal to the cover vector. In this thesis, two classes of orthogonalisation methods that generate binary sequences quasi-orthogonal to the cover vector are presented. One method, namely "Sorting and Cancelling" generates sequences that have a high level of orthogonality to the cover vector. The Hadamard Matrix based orthogonalisation method, namely "Hadamard Matrix Search" is able to realise overlapped embedding, thus the watermarking capacity and image fidelity can be improved compared to using short watermark sequences. The results are compared with traditional pseudo-randomly generated binary sequences. The advantages of both classes of orthogonalisation inethods are significant. Another watermarking method that is introduced in the thesis is based on writing-on-dirty-paper theory. The method is presented with biorthogonal codes that have the best robustness. The advantage and trade-offs of using biorthogonal codes with this watermark coding methods are analysed comprehensively. The comparisons between orthogonal and non-orthogonal codes that are used in this watermarking method are also made. It is found that fidelity and robustness are contradictory and it is not possible to optimise them simultaneously. Comparisons are also made between all proposed methods. The comparisons are focused on three major performance criteria, fidelity, capacity and robustness. aom two different viewpoints, conclusions are not the same. For fidelity-centric viewpoint, the dirty-paper coding methods using biorthogonal codes has very strong advantage to preserve image fidelity and the advantage of capacity performance is also significant. However, from the power ratio point of view, the orthogonalisation methods demonstrate significant advantage on capacity and robustness. The conclusions are contradictory but together, they summarise the performance generated by different design considerations. The synchronisation of watermark is firstly provided by high contrast frames around the watermarked image. The edge detection filters are used to detect the high contrast borders of the captured image. By scanning the pixels from the border to the centre, the locations of detected edges are stored. The optimal linear regression algorithm is used to estimate the watermarked image frames. Estimation of the regression function provides rotation angle as the slope of the rotated frames. The scaling is corrected by re-sampling the upright image to the original size. A theoretically studied method that is able to synchronise captured image to sub-pixel level accuracy is also presented. By using invariant transforms and the "symmetric phase only matched filter" the captured image can be corrected accurately to original geometric size. The method uses repeating watermarks to form an array in the spatial domain of the watermarked image and the the array that the locations of its elements can reveal information of rotation, translation and scaling with two filtering processes

    A combinatorial approach to orthogonal placement problems

    Get PDF
    liegt nicht vor!Wir betrachten zwei Familien von NP-schwierigen orthogonalen Platzierungsproblemen aus dem Bereich der Informationsvisualisierung von einem theoretischen und praktischen Standpunkt aus. Diese Arbeit enthält ein gemeinsames kombinatorisches Gerüst für Kompaktierungsprobleme aus dem Bereich des orthogonalen Graphenzeichnens und Beschriftungsprobleme von Punktmengen aus dem Gebiet der Computer-Kartografie. Bei den Kompaktierungsproblemen geht es darum, eine gegebene dimensionslose Beschreibung der orthogonalen Form eines Graphen in eine orthogonale Gitterzeichnung mit kurzen Kanten und geringem Flächenverbrauch zu transformieren. Die Beschriftungsprobleme haben zur Aufgabe, eine gegebene Menge von rechteckigen Labels so zu platzieren, dass eine lesbare Karte entsteht. In einer klassischen Anwendung repräsentieren die Punkte beispielsweise Städte einer Landkarte, und die Labels enthalten die Namen der Städte. Wir präsentieren neue kombinatorische Formulierungen für diese Probleme und verwenden dabei eine pfad- und kreisbasierte graphentheoretische Eigenschaft in einem zugehörigen problemspezifschen Paar von Constraint-Graphen. Die Umformulierung ermöglicht es uns, exakte Algorithmen für die Originalprobleme zu entwickeln. Umfassende experimentelle Studien mit Benchmark-Instanzen aus der Praxis zeigen, dass unsere Algorithmen, die auf linearer Programmierung beruhen, in der Lage sind, große Instanzen der Platzierungsprobleme beweisbar optimal und in kurzer Rechenzeit zu lösen. Ferner kombinieren wir die Formulierungen für Kompaktierungs- und Beschriftungsprobleme und präsentieren einen exakten algorithmischen Ansatz für ein Graphbeschriftungsproblem. Oftmals sind unsere neuen Algorithmen die ersten exakten Algorithmen für die jeweilige Problemvariante

    A combinatorial approach to orthogonal placement problems

    Get PDF
    liegt nicht vor!Wir betrachten zwei Familien von NP-schwierigen orthogonalen Platzierungsproblemen aus dem Bereich der Informationsvisualisierung von einem theoretischen und praktischen Standpunkt aus. Diese Arbeit enthält ein gemeinsames kombinatorisches Gerüst für Kompaktierungsprobleme aus dem Bereich des orthogonalen Graphenzeichnens und Beschriftungsprobleme von Punktmengen aus dem Gebiet der Computer-Kartografie. Bei den Kompaktierungsproblemen geht es darum, eine gegebene dimensionslose Beschreibung der orthogonalen Form eines Graphen in eine orthogonale Gitterzeichnung mit kurzen Kanten und geringem Flächenverbrauch zu transformieren. Die Beschriftungsprobleme haben zur Aufgabe, eine gegebene Menge von rechteckigen Labels so zu platzieren, dass eine lesbare Karte entsteht. In einer klassischen Anwendung repräsentieren die Punkte beispielsweise Städte einer Landkarte, und die Labels enthalten die Namen der Städte. Wir präsentieren neue kombinatorische Formulierungen für diese Probleme und verwenden dabei eine pfad- und kreisbasierte graphentheoretische Eigenschaft in einem zugehörigen problemspezifschen Paar von Constraint-Graphen. Die Umformulierung ermöglicht es uns, exakte Algorithmen für die Originalprobleme zu entwickeln. Umfassende experimentelle Studien mit Benchmark-Instanzen aus der Praxis zeigen, dass unsere Algorithmen, die auf linearer Programmierung beruhen, in der Lage sind, große Instanzen der Platzierungsprobleme beweisbar optimal und in kurzer Rechenzeit zu lösen. Ferner kombinieren wir die Formulierungen für Kompaktierungs- und Beschriftungsprobleme und präsentieren einen exakten algorithmischen Ansatz für ein Graphbeschriftungsproblem. Oftmals sind unsere neuen Algorithmen die ersten exakten Algorithmen für die jeweilige Problemvariante
    • …
    corecore