Cache-oblivious algorithms
Thesis (S.M.), Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 1999. Includes bibliographical references (p. 67-70). By Harald Prokop.
High-Quality Hypergraph Partitioning
This dissertation focuses on computing high-quality solutions for the NP-hard balanced hypergraph partitioning problem: Given a hypergraph and an integer k, partition its vertex set into k disjoint blocks of bounded size, while minimizing an objective function over the hyperedges. Here, we consider the two most commonly used objectives: the cut-net metric and the connectivity metric.
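For concreteness, both objectives can be stated in a few lines. The sketch below assumes unit net weights and represents a hypergraph simply as a list of hyperedges (tuples of vertices) plus a dict mapping each vertex to its block; all names are illustrative, not taken from any particular partitioning tool.

```python
def blocks_touched(edge, part):
    """The set of blocks that a hyperedge's pins fall into (its connectivity set)."""
    return {part[v] for v in edge}

def cut_net(hyperedges, part):
    """Cut-net metric: number of hyperedges spanning more than one block."""
    return sum(1 for e in hyperedges if len(blocks_touched(e, part)) > 1)

def connectivity(hyperedges, part):
    """Connectivity metric: sum over hyperedges of (lambda(e) - 1),
    where lambda(e) is the number of blocks the hyperedge connects."""
    return sum(len(blocks_touched(e, part)) - 1 for e in hyperedges)
```

A hyperedge touching three blocks is charged 1 by the cut-net metric but 2 by the connectivity metric, which is why the two objectives can rank the same pair of partitions differently.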
Since the problem is computationally intractable, heuristics are used in practice - the most prominent being the three-phase multi-level paradigm: During coarsening, the hypergraph is successively contracted to obtain a hierarchy of smaller instances. After applying an initial partitioning algorithm to the smallest hypergraph, contraction is undone and, at each level, refinement algorithms try to improve the current solution.
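The three phases can be sketched as a recursive V-cycle. The toy below works on a plain graph (hyperedges of size two) and uses deliberately naive stand-ins for each phase: coarsening contracts consecutive vertex pairs, initial partitioning splits the coarsest vertex set in half, and refinement is a single greedy FM-style pass; the function names and the balance tolerance are illustrative only.

```python
def coarsen(vertices, edges):
    """Contract a trivial matching: vertex v maps to coarse vertex v // 2."""
    mapping = {v: v // 2 for v in vertices}
    coarse_v = sorted({mapping[v] for v in vertices})
    coarse_e = [(mapping[u], mapping[v]) for u, v in edges
                if mapping[u] != mapping[v]]
    return coarse_v, coarse_e, mapping

def initial_bipartition(vertices):
    """Split the coarsest vertex set into two equal-sized blocks."""
    half = len(vertices) // 2
    return {v: (0 if i < half else 1) for i, v in enumerate(vertices)}

def refine(vertices, edges, part):
    """One greedy FM-style pass: move a vertex if it strictly reduces the
    cut and keeps the block sizes within a tolerance of two vertices."""
    def cut(p):
        return sum(1 for u, v in edges if p[u] != p[v])
    for v in vertices:
        trial = dict(part)
        trial[v] = 1 - trial[v]
        sizes = [list(trial.values()).count(b) for b in (0, 1)]
        if abs(sizes[0] - sizes[1]) <= 2 and cut(trial) < cut(part):
            part = trial
    return part

def multilevel_bipartition(vertices, edges, min_n=2):
    """Recursive V-cycle: coarsen until small, partition the coarsest
    level, then project the solution back up and refine at every level."""
    if len(vertices) <= min_n:
        return initial_bipartition(vertices)
    cv, ce, mapping = coarsen(vertices, edges)
    coarse_part = multilevel_bipartition(cv, ce, min_n)
    part = {v: coarse_part[mapping[v]] for v in vertices}  # project up
    return refine(vertices, edges, part)
```

On a path graph with eight vertices, this toy cycle recovers the optimal balanced bipartition with a cut of one edge.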
With this work, we give a brief overview of the field and present several algorithmic improvements to the multi-level paradigm. Instead of using a logarithmic number of levels like traditional algorithms, we present two coarsening algorithms that create a hierarchy of (nearly) n levels, where n is the number of vertices. This makes consecutive levels as similar as possible and provides many opportunities for refinement algorithms to improve the partition. This approach is made feasible in practice by tailoring all algorithms and data structures to the n-level paradigm, and developing lazy-evaluation techniques, caching mechanisms and early stopping criteria to speed up the partitioning process. Furthermore, we propose a sparsification algorithm based on locality-sensitive hashing that improves the running time for hypergraphs with large hyperedges, and show that incorporating global information about the community structure into the coarsening process improves quality. Moreover, we present a portfolio-based initial partitioning approach, and propose three refinement algorithms. Two are based on the Fiduccia-Mattheyses (FM) heuristic, but perform a highly localized search at each level. While one is designed for two-way partitioning, the other is the first FM-style algorithm that can be efficiently employed in the multi-level setting to directly improve k-way partitions. The third algorithm uses max-flow computations on pairs of blocks to refine k-way partitions. Finally, we present the first memetic multi-level hypergraph partitioning algorithm for an extensive exploration of the global solution space.
All contributions are made available through our open-source framework KaHyPar. In a comprehensive experimental study, we compare KaHyPar with hMETIS, PaToH, Mondriaan, Zoltan-AlgD, and HYPE on a wide range of hypergraphs from several application areas. Our results indicate that KaHyPar, even without the memetic component, computes better solutions than all competing algorithms for both the cut-net and the connectivity metric, while being faster than Zoltan-AlgD and as fast as hMETIS. Moreover, KaHyPar compares favorably with the current best graph partitioning system KaFFPa, both in terms of solution quality and running time.
Models for energy consumption of data structures and algorithms
EXCESS deliverable D2.1. More information at http://www.excess-project.eu/ This deliverable reports our early energy models for data structures and algorithms based on both micro-benchmarks and concurrent algorithms. It reports the early results of Task 2.1 on investigating and modeling the trade-off between energy and performance in concurrent data structures and algorithms, which forms the basis for the whole Work Package 2 (WP2). The work has been conducted on the two main EXCESS platforms: (1) an Intel platform with recent Intel multi-core CPUs and (2) the Movidius embedded platform.
CHORUS Deliverable 2.1: State of the Art on Multimedia Search Engines
Based on the information provided by European projects and national initiatives related to multimedia search, as well as by domain experts who participated in the CHORUS think-tanks and workshops, this document reports on the state of the art in multimedia content search from a technical and socio-economic perspective.
The technical perspective includes an up-to-date view of content-based indexing and retrieval technologies, multimedia search in the context of mobile devices and peer-to-peer networks, and an overview of current evaluation and benchmark initiatives that measure the performance of multimedia search engines.
From a socio-economic perspective, we inventory the impact and legal consequences of these technical advances and point out future directions of research.
White-box methodologies, programming abstractions and libraries
EXCESS deliverable D2.2. More information at http://www.excess-project.eu/ This deliverable reports the results of white-box methodologies and early results of the first prototype of libraries and programming abstractions as available by project month 18 by Work Package 2 (WP2). It reports i) the latest results of Task 2.2 on white-box methodologies, programming abstractions and libraries for developing energy-efficient data structures and algorithms and ii) the improved results of Task 2.1 on investigating and modeling the trade-off between energy and performance of concurrent data structures and algorithms. The work has been conducted on two main EXCESS platforms: Intel platforms with recent Intel multicore CPUs and the Movidius Myriad1 platform.
Coupling, Conservation, and Performance in Numerical Simulations
This thesis considers three aspects of numerical simulation: coupling, conservation, and performance. We conduct one project addressing a challenge from each of these aspects.

We propose a novel penalty force to enforce contacts with accurate Coulomb friction. The force is compatible with fully-implicit time integration and with optimization-based integration. In addition to processing collisions between deformable objects, the force can be used to couple rigid bodies to deformable objects or to the material point method. The force naturally leads to stable stacking without drift over time, even when solvers are not run to convergence. The force leads to an asymmetrical system, and we provide a practical solution for handling it.

Next, we present a new technique for transferring momentum and velocity between particles and MAC grids based on the Affine Particle-In-Cell (APIC) framework previously developed for co-located grids. We extend the original APIC paper and show that the proposed transfers preserve linear and angular momentum and also satisfy all of the original APIC properties. Early indications in the original APIC paper suggested that APIC might be suitable for simulating high-Reynolds-number fluids due to favorable retention of vortices, but these properties were not studied further. We use two-dimensional Fourier analysis to investigate dissipation in the limit Δt = 0. We investigate dissipation and vortex retention numerically to quantify the effectiveness of APIC compared with other transfer algorithms.

Finally, we present an efficient solver for problems typically seen in microfluidic applications. Microfluidic "lab on a chip" devices are small devices that operate on small length scales and on small volumes of fluid. Designs for microfluidic chips are generally composed of standardized and often repeated components connected by long, thin, straight fluid channels.
We propose a novel discretization algorithm for simulating the Stokes equations on geometry with these features, which produces sparse linear systems with many repeated matrix blocks. The discretization is formally third-order accurate for velocity and second-order accurate for pressure in the norm. We also propose a novel linear system solver based on cyclic reduction, reordered sparse Gaussian elimination, and operation caching that is designed to efficiently solve systems with repeated matrix blocks.
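The momentum-conservation property of the APIC transfers described in the abstract above can be illustrated with a minimal sketch. This is a toy 1D particle-to-grid transfer with linear hat weights on a co-located grid, not the MAC-grid transfers developed in the thesis; the function name and setup are illustrative assumptions. Each particle carries a velocity v_p and an affine coefficient c_p, and scatters momentum m_p w_ip (v_p + c_p (x_i - x_p)) to node i. Because linear weights form a partition of unity and reproduce linear functions, the total grid momentum equals the total particle momentum for any choice of c_p.

```python
import numpy as np

def apic_p2g_1d(xp, mp, vp, cp, grid_x, dx):
    """Toy 1D APIC particle-to-grid transfer with linear hat weights.

    Each particle p scatters mass m_p * w_ip and momentum
    m_p * w_ip * (v_p + c_p * (x_i - x_p)) to its two neighboring nodes.
    """
    mass = np.zeros_like(grid_x)
    momentum = np.zeros_like(grid_x)
    for x, m, v, c in zip(xp, mp, vp, cp):
        i = int(np.floor((x - grid_x[0]) / dx))   # left neighbor node index
        for j in (i, i + 1):
            w = 1.0 - abs(x - grid_x[j]) / dx     # linear hat weight
            mass[j] += w * m
            momentum[j] += w * m * (v + c * (grid_x[j] - x))
    # Grid velocity: momentum / mass where mass is nonzero, else zero.
    vel = np.divide(momentum, mass, out=np.zeros_like(mass), where=mass > 0)
    return mass, momentum, vel
```

Summing the grid momentum after the transfer reproduces the total particle momentum exactly (up to rounding), regardless of the affine coefficients, which is the linear-momentum property the thesis extends to MAC grids.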
- …