
1,284 research outputs found

Indexed dependence metadata and its applications in software performance optimisation

Author: Howes Lee William
Publication venue: Computing, Imperial College London
Publication date: 01/04/2010

To achieve continued performance improvements, modern microprocessor design is tending to concentrate an increasing proportion of hardware on computation units with less automatic management of data movement and extraction of parallelism. As a result, architectures increasingly include multiple computation cores and complicated, software-managed memory hierarchies. Compilers have difficulty characterizing the behaviour of a kernel in a general enough manner to enable automatic generation of efficient code in any but the most straightforward of cases. We propose the concept of indexed dependence metadata to improve application development and mapping onto such architectures. The metadata represent both the iteration space of a kernel and the mapping of that iteration space from a given index to the set of data elements that iteration might use: thus the dependence metadata is indexed by the kernel’s iteration space. This explicit mapping allows the compiler or runtime to optimise the program more efficiently, and improves the program structure for the developer. We argue that this form of explicit interface specification reduces the need for premature, architecture-specific optimisation. It improves program portability, supports intercomponent optimisation and enables generation of efficient data movement code. We offer the following contributions: an introduction to the concept of indexed dependence metadata as a generalisation of stream programming, a demonstration of its advantages in a component programming system, the decoupled access/execute model for C++ programs, and how indexed dependence metadata might be used to improve the programming model for GPU-based designs. 
Our experimental results with prototype implementations show that indexed dependence metadata supports automatic synthesis of double-buffered data movement for the Cell processor and enables aggressive loop fusion optimisations in image processing, linear algebra and multigrid application case studies.
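The idea of metadata indexed by the iteration space can be sketched in a few lines. This is a minimal illustration only, not the thesis's implementation: the names `stencil_reads`, `chunk_footprint` and `run_chunked` are hypothetical, and a 3-point averaging stencil is assumed as the kernel. The access function maps an iteration index to the data elements that iteration may read, so a runtime can compute the footprint of a whole chunk before executing it.

```python
def stencil_reads(i, n):
    """Hypothetical access metadata: the data indices iteration i may read
    (3-point stencil, clamped at the array boundaries)."""
    return {max(i - 1, 0), i, min(i + 1, n - 1)}


def chunk_footprint(chunk, reads, n):
    """Union of the data indices that any iteration in the chunk may read."""
    needed = set()
    for i in chunk:
        needed |= reads(i, n)
    return sorted(needed)


def run_chunked(data, chunk_size):
    """Execute the stencil chunk by chunk, copying in only each chunk's footprint."""
    n = len(data)
    out = [0.0] * n
    for start in range(0, n, chunk_size):
        chunk = range(start, min(start + chunk_size, n))
        # Because the footprint is known from the metadata alone, a runtime
        # could issue this copy early (for example into the second of two
        # buffers) while the previous chunk is still computing.
        local = {j: data[j] for j in chunk_footprint(chunk, stencil_reads, n)}
        for i in chunk:
            idx = stencil_reads(i, n)
            out[i] = sum(local[j] for j in idx) / len(idx)
    return out
```

Here the kernel body touches only `local`, which mirrors the separation of data access from execution that the abstract describes.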

Spiral - Imperial College Digital Repository

Restricted Strip Covering and the Sensor Cover Problem

Author: Buchsbaum Adam L.
Efrat Alon
Jain Shaili
Venkatasubramanian Suresh
Yi Ke
Publication date: 23/05/2006

Given a set of objects with durations (jobs) that cover a base region, can we schedule the jobs to maximize the duration the original region remains covered? We call this problem the sensor cover problem. This problem arises in the context of covering a region with sensors. For example, suppose you wish to monitor activity along a fence by sensors placed at various fixed locations. Each sensor has a range and limited battery life. The problem is to schedule when to turn on the sensors so that the fence is fully monitored for as long as possible. This one-dimensional problem involves intervals on the real line. Associating a duration to each yields a set of rectangles in space and time, each specified by a pair of fixed horizontal endpoints and a height. The objective is to assign a position to each rectangle to maximize the height at which the spanning interval is fully covered. We call this one-dimensional problem restricted strip covering. If we replace the covering constraint by a packing constraint, the problem is identical to dynamic storage allocation, a scheduling problem that is a restricted case of the strip packing problem. We show that the restricted strip covering problem is NP-hard and present an O(log log n)-approximation algorithm. We present better approximations or exact algorithms for some special cases. For the uniform-duration case of restricted strip covering we give a polynomial-time, exact algorithm but prove that the uniform-duration case for higher-dimensional regions is NP-hard. Finally, we consider regions that are arbitrary sets, and we present an O(log n)-approximation algorithm. Comment: 14 pages, 6 figures

arXiv.org e-Print Archive

CiteSeerX

Parallel local search

Author: Verhoeven M.G.A.
Publication venue: Technische Universiteit Eindhoven
Publication date: 01/01/1996

Repository TU/e

Pure OAI Repository

Nested-Loops Tiling for Parallelization and Locality Optimization

Author: Hamzei Mohammad
Parsa Saeed
Publication venue: Institute of Informatics, Slovak Academy of Sciences
Publication date: 06/07/2017

Data locality improvement and nested-loop parallelization are two complementary yet competing approaches for optimizing loop nests, which account for a large portion of computation time in scientific and engineering programs. While there are effective methods for each of these, prior studies have paid less attention to addressing the two simultaneously. This paper proposes a unified approach that integrates these two techniques to obtain an appropriate locality-conscious loop transformation that partitions the loop iteration space into outer parallel tiled loops. The approach is based on the polyhedral model and computes a multidimensional affine schedule as a transformation that yields the largest groups of tilable loops with maximum coarse-grain parallelism, as far as possible. Furthermore, tiles are scheduled on processor cores to maximize data reuse by running tiles that share a high volume of data either consecutively on the same core or at around the same time on different cores with a shared cache.
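For readers unfamiliar with tiling, the basic transformation can be sketched on matrix multiplication. This is a generic hand-written example, not the paper's polyhedral scheduling method; the function names and the fixed `tile` parameter are assumptions made for illustration. The two outermost tile loops write disjoint blocks of the result, so they are the ones a parallelizing scheme could distribute across cores.

```python
def matmul_naive(a, b):
    """Reference triple loop: c = a * b for lists of lists."""
    n, m, p = len(a), len(b), len(b[0])
    c = [[0] * p for _ in range(n)]
    for i in range(n):
        for j in range(p):
            for k in range(m):
                c[i][j] += a[i][k] * b[k][j]
    return c


def matmul_tiled(a, b, tile=2):
    """Same computation with the iteration space partitioned into tiles."""
    n, m, p = len(a), len(b), len(b[0])
    c = [[0] * p for _ in range(n)]
    # Outer loops enumerate tiles. Each (ii, jj) pair owns a distinct block
    # of c, so these two loops carry no dependences between tiles.
    for ii in range(0, n, tile):
        for jj in range(0, p, tile):
            for kk in range(0, m, tile):
                # Inner loops stay within one tile, reusing a small working
                # set of a, b and c (the locality benefit).
                for i in range(ii, min(ii + tile, n)):
                    for j in range(jj, min(jj + tile, p)):
                        acc = c[i][j]
                        for k in range(kk, min(kk + tile, m)):
                            acc += a[i][k] * b[k][j]
                        c[i][j] = acc
    return c
```

The `min(...)` bounds handle matrix sizes that are not multiples of the tile size.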

Computing and Informatics (E-Journal - Institute of Informatics, SAS, Bratislava)

Mapping constrained optimization problems to quantum annealing with application to fault diagnosis

Author: Bian Zhengbing
Chudak Fabian
Israel Robert
Lackey Brad
Macready William G.
Roy Aidan
Publication date: 01/01/2016

Current quantum annealing (QA) hardware suffers from practical limitations such as finite temperature, sparse connectivity, small qubit numbers, and control error. We propose new algorithms for mapping Boolean constraint satisfaction problems (CSPs) onto QA hardware mitigating these limitations. In particular we develop a new embedding algorithm for mapping a CSP onto a hardware Ising model with a fixed sparse set of interactions, and propose two new decomposition algorithms for solving problems too large to map directly into hardware. The mapping technique is locally structured, as hardware-compatible Ising models are generated for each problem constraint, and variables appearing in different constraints are chained together using ferromagnetic couplings. In contrast, global embedding techniques generate a hardware-independent Ising model for all the constraints, and then use a minor-embedding algorithm to generate a hardware-compatible Ising model. We give an example of a class of CSPs for which the scaling performance of D-Wave's QA hardware using the local mapping technique is significantly better than global embedding. We validate the approach by applying D-Wave's hardware to circuit-based fault-diagnosis. For circuits that embed directly, we find that the hardware is typically able to find all solutions from a min-fault diagnosis set of size N using 1000N samples, using an annealing rate that is 25 times faster than a leading SAT-based sampling method. Further, we apply decomposition algorithms to find min-cardinality faults for circuits that are up to 5 times larger than can be solved directly on current hardware. Comment: 22 pages, 4 figures

arXiv.org e-Print Archive

Directory of Open Access Journals

Frontiers - Publisher Connector

09061 Abstracts Collection -- Combinatorial Scientific Computing

Author: Naumann Uwe
Schenk Olaf
Simon Horst D
Toledo Sivan
Publication venue: Dagstuhl Seminar Proceedings. 09061 - Combinatorial Scientific Computing
Publication date: 01/01/2009

From 01.02.2009 to 06.02.2009, the Dagstuhl Seminar 09061 "Combinatorial Scientific Computing" was held in Schloss Dagstuhl -- Leibniz Center for Informatics. During the seminar, several participants presented their current research, and ongoing work and open problems were discussed. Abstracts of the presentations given during the seminar as well as abstracts of seminar results and ideas are put together in this paper. The first section describes the seminar topics and goals in general. Links to extended abstracts or full papers are provided, if available.

Dagstuhl Research Online Publication Server

Guarding and Searching Polyhedra

Author: Viglietta Giovanni
Publication venue: Pisa University Press
Publication date: 11/11/2012

Guarding and searching problems have been of fundamental interest since the early years of Computational Geometry. Both are well-developed areas of research and have been thoroughly studied in planar polygonal settings. In this thesis we tackle the Art Gallery Problem and the Searchlight Scheduling Problem in 3-dimensional polyhedral environments, putting special emphasis on edge guards and orthogonal polyhedra. We solve the Art Gallery Problem with reflex edge guards in orthogonal polyhedra having reflex edges in just two directions: generalizing a classic theorem by O'Rourke, we prove that r/2 + 1 reflex edge guards are sufficient and occasionally necessary, where r is the number of reflex edges. We also show how to compute guard locations in O(n log n) time. Then we investigate the Art Gallery Problem with mutually parallel edge guards in orthogonal polyhedra with e edges, showing that 11e/72 edge guards are always sufficient and can be found in linear time, improving upon the previous state of the art, which was e/6. We also give tight inequalities relating e with the number of reflex edges r, obtaining an upper bound on the guard number of 7r/12 + 1. We further study the Art Gallery Problem with edge guards in polyhedra having faces oriented in just four directions, obtaining a lower bound of e/6 - 1 edge guards and an upper bound of (e+r)/6 edge guards. All the previously mentioned results hold for polyhedra of any genus. Additionally, several guard types and guarding modes are discussed, namely open and closed edge guards, and orthogonal and non-orthogonal guarding. Next, we model the Searchlight Scheduling Problem, the problem of searching a given polyhedron by suitably turning some half-planes around their axes, in order to catch an evasive intruder. After discussing several generalizations of classic theorems, we study the problem of efficiently placing guards in a given polyhedron, in order to make it searchable. 
For general polyhedra, we give an upper bound of r^2 on the number of guards, which reduces to r for orthogonal polyhedra. Then we prove that it is strongly NP-hard to decide if a given polyhedron is entirely searchable by a given set of guards. We further prove that, even under the assumption that an orthogonal polyhedron is searchable, approximating the minimum search time within a small-enough constant factor to the optimum is still strongly NP-hard. Finally, we show that deciding if a specific region of an orthogonal polyhedron is searchable is strongly PSPACE-hard. By further improving our construction, we show that the same problem is strongly PSPACE-complete even for planar orthogonal polygons. Our last results are especially meaningful because no similar hardness theorems for 2-dimensional scenarios were previously known.

arXiv.org e-Print Archive

Electronic Thesis and Dissertation Archive - Università di Pisa

Stability of Service under Time-of-Use Pricing

Author: Chawla Shuchi
Devanur Nikhil R.
Holroyd Alexander E.
Karlin Anna
Martin James
Sivan Balasubramanian
Publication date: 01/01/2017

We consider "time-of-use" pricing as a technique for matching supply and demand of temporal resources with the goal of maximizing social welfare. Relevant examples include energy, computing resources on a cloud computing platform, and charging stations for electric vehicles, among many others. A client/job in this setting has a window of time during which he needs service, and a particular value for obtaining it. We assume a stochastic model for demand, where each job materializes with some probability via an independent Bernoulli trial. Given a per-time-unit pricing of resources, any realized job will first try to get served by the cheapest available resource in its window and, failing that, will try to find service at the next cheapest available resource, and so on. Thus, the natural stochastic fluctuations in demand have the potential to lead to cascading overload events. Our main result shows that setting prices so as to optimally handle the expected demand works well: with high probability, when the actual demand is instantiated, the system is stable and the expected value of the jobs served is very close to that of the optimal offline algorithm. Comment: To appear in STOC'1
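The job behaviour described in the abstract (try the cheapest available slot in the window, then the next cheapest) can be sketched as a small simulation. This is an illustrative reading, not the paper's model in full: the function name `assign_jobs`, unit-size jobs, processing jobs in list order, and the rule that a job declines any slot priced above its value are all assumptions.

```python
def assign_jobs(prices, capacity, jobs):
    """Assign each realized job to the cheapest slot with spare capacity.

    prices[t]   -- price of time slot t
    capacity[t] -- number of unit jobs slot t can serve
    jobs        -- list of (start, end, value), window inclusive, in arrival order
    Returns a list with the chosen slot per job, or None if the job goes unserved.
    """
    remaining = list(capacity)
    assignment = []
    for start, end, value in jobs:
        # Slots in the job's window, cheapest first (ties broken by earlier slot).
        slots = sorted(range(start, end + 1), key=lambda t: (prices[t], t))
        chosen = None
        for t in slots:
            # Assumption: a job only accepts a slot whose price does not exceed its value.
            if remaining[t] > 0 and prices[t] <= value:
                chosen = t
                remaining[t] -= 1
                break
        assignment.append(chosen)
    return assignment
```

With `prices = [3, 1, 2]`, one unit of capacity per slot, and jobs `[(0, 2, 5), (0, 2, 5), (1, 2, 1.5)]`, the second job is pushed from slot 1 to slot 2 and the third finds both slots in its window taken, a tiny instance of the cascading effect the abstract mentions.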

arXiv.org e-Print Archive

Crossref

Oxford University Research Archive

Improvements in Simulated Quenching Method for Vehicle Routing Problem with Time Windows by Using Search History and Devising Means for Reducing the Number of Vehicles

Author: Kawashima Hironao
Kokubugata Hisafumi
Matsumoto Shuichi
Shimazaki Yuji
Tatsuru Daimon
Publication venue: IntechOpen
Publication date: 29/08/2012

IntechOpen