2,665 research outputs found
Task scheduling techniques for asymmetric multi-core systems
As performance and energy efficiency have become the main challenges for next-generation high-performance computing, asymmetric multi-core architectures can provide solutions to tackle these issues. Parallel programming models need to be able to suit the needs of such systems and keep on increasing the application’s portability and efficiency. This paper proposes two task scheduling approaches that target asymmetric systems. These dynamic scheduling policies reduce total execution time either by detecting the longest or the critical path of the dynamic task dependency graph of the application, or by finding the earliest executor of a task. They use dynamic scheduling and information discoverable during execution, fact that makes them implementable and functional without the need of off-line profiling. In our evaluation we compare these scheduling approaches with two existing state-of the art heterogeneous schedulers and we track their improvement over a FIFO baseline scheduler. We show that the heterogeneous schedulers improve the baseline by up to 1.45 in a real 8-core asymmetric system and up to 2.1 in a simulated 32-core asymmetric chip.This work has been supported by the Spanish Government (SEV2015-0493), by the Spanish Ministry of Science and Innovation (contract TIN2015-65316-P), by Generalitat de
Catalunya (contracts 2014-SGR-1051 and 2014-SGR-1272), by the RoMoL ERC Advanced Grant (GA 321253) and the
European HiPEAC Network of Excellence. The Mont-Blanc project receives funding from the EU’s Seventh Framework Programme (FP7/2007-2013) under grant agreement
no 610402 and from the EU’s H2020 Framework Programme (H2020/2014-2020) under grant agreement no 671697. M.
Moretó has been partially supported by the Ministry of Economy and Competitiveness under Juan de la Cierva postdoctoral fellowship number JCI-2012-15047. M. Casas
is supported by the Secretary for Universities and Research of the Ministry of Economy and Knowledge of the Government of Catalonia and the Cofund programme of the Marie
Curie Actions of the 7th R&D Framework Programme of the European Union (Contract 2013 BP B 00243).Peer ReviewedPostprint (author's final draft
HyperANF: Approximating the Neighbourhood Function of Very Large Graphs on a Budget
The neighbourhood function N(t) of a graph G gives, for each t, the number of
pairs of nodes such that y is reachable from x in less that t hops. The
neighbourhood function provides a wealth of information about the graph (e.g.,
it easily allows one to compute its diameter), but it is very expensive to
compute it exactly. Recently, the ANF algorithm (approximate neighbourhood
function) has been proposed with the purpose of approximating NG(t) on large
graphs. We describe a breakthrough improvement over ANF in terms of speed and
scalability. Our algorithm, called HyperANF, uses the new HyperLogLog counters
and combines them efficiently through broadword programming; our implementation
uses overdecomposition to exploit multi-core parallelism. With HyperANF, for
the first time we can compute in a few hours the neighbourhood function of
graphs with billions of nodes with a small error and good confidence using a
standard workstation. Then, we turn to the study of the distribution of the
shortest paths between reachable nodes (that can be efficiently approximated by
means of HyperANF), and discover the surprising fact that its index of
dispersion provides a clear-cut characterisation of proper social networks vs.
web graphs. We thus propose the spid (Shortest-Paths Index of Dispersion) of a
graph as a new, informative statistics that is able to discriminate between the
above two types of graphs. We believe this is the first proposal of a
significant new non-local structural index for complex networks whose
computation is highly scalable
Efficient Race Detection with Futures
This paper addresses the problem of provably efficient and practically good
on-the-fly determinacy race detection in task parallel programs that use
futures. Prior works determinacy race detection have mostly focused on either
task parallel programs that follow a series-parallel dependence structure or
ones with unrestricted use of futures that generate arbitrary dependences. In
this work, we consider a restricted use of futures and show that it can be race
detected more efficiently than general use of futures.
Specifically, we present two algorithms: MultiBags and MultiBags+. MultiBags
targets programs that use futures in a restricted fashion and runs in time
, where is the sequential running time of the
program, is the inverse Ackermann's function, is the total number
of memory accesses, is the dynamic count of places at which parallelism is
created. Since is a very slowly growing function (upper bounded by
for all practical purposes), it can be treated as a close-to-constant overhead.
MultiBags+ an extension of MultiBags that target programs with general use of
futures. It runs in time where , ,
and are defined as before, and is the number of future operations in
the computation. We implemented both algorithms and empirically demonstrate
their efficiency
Well Structured Transition Systems with History
We propose a formal model of concurrent systems in which the history of a
computation is explicitly represented as a collection of events that provide a
view of a sequence of configurations. In our model events generated by
transitions become part of the system configurations leading to operational
semantics with historical data. This model allows us to formalize what is
usually done in symbolic verification algorithms. Indeed, search algorithms
often use meta-information, e.g., names of fired transitions, selected
processes, etc., to reconstruct (error) traces from symbolic state exploration.
The other interesting point of the proposed model is related to a possible new
application of the theory of well-structured transition systems (wsts). In our
setting wsts theory can be applied to formally extend the class of properties
that can be verified using coverability to take into consideration (ordered and
unordered) historical data. This can be done by using different types of
representation of collections of events and by combining them with wsts by
using closure properties of well-quasi orderings.Comment: In Proceedings GandALF 2015, arXiv:1509.0685
Activity recognition from videos with parallel hypergraph matching on GPUs
In this paper, we propose a method for activity recognition from videos based
on sparse local features and hypergraph matching. We benefit from special
properties of the temporal domain in the data to derive a sequential and fast
graph matching algorithm for GPUs.
Traditionally, graphs and hypergraphs are frequently used to recognize
complex and often non-rigid patterns in computer vision, either through graph
matching or point-set matching with graphs. Most formulations resort to the
minimization of a difficult discrete energy function mixing geometric or
structural terms with data attached terms involving appearance features.
Traditional methods solve this minimization problem approximately, for instance
with spectral techniques.
In this work, instead of solving the problem approximatively, the exact
solution for the optimal assignment is calculated in parallel on GPUs. The
graphical structure is simplified and regularized, which allows to derive an
efficient recursive minimization algorithm. The algorithm distributes
subproblems over the calculation units of a GPU, which solves them in parallel,
allowing the system to run faster than real-time on medium-end GPUs
AccTEE: A WebAssembly-based Two-way Sandbox for Trusted Resource Accounting
Remote computation has numerous use cases such as cloud computing, client-side web applications or volunteer computing. Typically, these computations are executed inside a sandboxed environment for two reasons: first, to isolate the execution in order to protect the host environment from unauthorised access, and second to control and restrict resource usage. Often, there is mutual distrust between entities providing the code and the ones executing it, owing to concerns over three potential problems: (i) loss of control over code and data by the providing entity, (ii) uncertainty of the integrity of the execution environment for customers, and (iii) a missing mutually trusted accounting of resource usage.
In this paper we present AccTEE, a two-way sandbox that offers remote computation with resource accounting trusted by consumers and providers. AccTEE leverages two recent technologies: hardware-protected trusted execution environments, and Web-Assembly, a novel platform independent byte-code format. We show how AccTEE uses automated code instrumentation for fine-grained resource accounting while maintaining confidentiality and integrity of code and data. Our evaluation of AccTEE in three scenarios – volunteer computing, serverless computing, and pay-by-computation for the web – shows a maximum accounting overhead of 10%
- …