Taking advantage of hybrid systems for sparse direct solvers via task-based runtimes
The ongoing hardware evolution exhibits an escalation in the number, as well
as in the heterogeneity, of computing resources. The pressure to maintain
reasonable levels of performance and portability forces application developers
to leave the traditional programming paradigms and explore alternative
solutions. PaStiX is a parallel sparse direct solver, based on a dynamic
scheduler for modern hierarchical manycore architectures. In this paper, we
study the benefits and limits of replacing the highly specialized internal
scheduler of the PaStiX solver with two generic runtime systems: PaRSEC and
StarPU. The task graph of the factorization step is made available to the two
runtimes, giving them the opportunity to process and optimize its traversal
in order to maximize the algorithm's efficiency on the targeted hardware
platform. A comparative study of the performance of the PaStiX solver on top of
its native internal scheduler, PaRSEC, and StarPU frameworks, on different
execution environments, is performed. The analysis highlights that these
generic task-based runtimes achieve comparable results to the
application-optimized embedded scheduler on homogeneous platforms. Furthermore,
they are able to significantly speed up the solver on heterogeneous
environments by taking advantage of the accelerators while hiding the
complexity of their efficient manipulation from the programmer. Comment: Heterogeneity in Computing Workshop (2014)
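The idea of handing a task graph to a generic runtime can be illustrated with a minimal sketch. This is not the PaStiX, PaRSEC, or StarPU API: the function `schedule` and the task names are hypothetical, and the sketch only shows the basic rule such runtimes rely on, namely that a task becomes ready once all of its dependencies have completed. Real runtimes add priorities, data locality, and CPU/GPU placement on top of this.

```python
from collections import deque

def schedule(tasks, deps):
    """Return one valid execution order for a task graph.

    tasks: list of task names (hypothetical, for illustration only).
    deps:  list of (before, after) pairs meaning `after` needs `before`.
    """
    remaining = {t: 0 for t in tasks}    # unmet dependencies per task
    successors = {t: [] for t in tasks}  # tasks unlocked by each task
    for before, after in deps:
        successors[before].append(after)
        remaining[after] += 1

    # Tasks with no unmet dependencies can start immediately.
    ready = deque(t for t in tasks if remaining[t] == 0)
    order = []
    while ready:
        task = ready.popleft()
        order.append(task)
        # Completing a task may release its successors.
        for nxt in successors[task]:
            remaining[nxt] -= 1
            if remaining[nxt] == 0:
                ready.append(nxt)

    if len(order) != len(tasks):
        raise ValueError("dependency cycle in task graph")
    return order

tasks = ["factor_A", "update_B", "update_C", "factor_B"]
deps = [("factor_A", "update_B"), ("factor_A", "update_C"),
        ("update_B", "factor_B")]
order = schedule(tasks, deps)
```

In this toy graph, `update_B` and `update_C` are both ready after `factor_A`, which is the kind of freedom a runtime can exploit to keep several processing units busy.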
Pipelining the Fast Multipole Method over a Runtime System
Fast Multipole Methods (FMM) are a fundamental tool for the simulation
of many physical problems. The high-performance design of such methods usually
requires carefully tuning the algorithm for both the targeted physics and the
hardware. In this paper, we propose a new approach that achieves high
performance across architectures. Our method consists of expressing the FMM
algorithm as a task flow and employing a state-of-the-art runtime system,
StarPU, in order to process the tasks on the different processing units. We
carefully design the task flow, the mathematical operators, their Central
Processing Unit (CPU) and Graphics Processing Unit (GPU) implementations, as
well as scheduling schemes. We compute potentials and forces of 200 million
particles in 48.7 seconds on a homogeneous 160-core SGI Altix UV 100 and of 38
million particles in 13.34 seconds on a heterogeneous 12-core Intel Nehalem
processor enhanced with 3 Nvidia M2090 Fermi GPUs. Comment: No. RR-7981 (2012)
A Methodology for Engineering Collaborative and ad-hoc Mobile Applications using SyD Middleware
Today’s web applications are more collaborative and utilize standard and ubiquitous Internet protocols. We previously developed the System on Mobile Devices (SyD) middleware to rapidly develop and deploy collaborative applications over heterogeneous and possibly mobile devices hosting web objects. In this paper, we present the software engineering methodology for developing SyD-enabled web applications and illustrate it through a case study on two representative applications: (i) a calendar-of-meetings application, which is a collaborative application, and (ii) a travel application, which is an ad-hoc collaborative application. SyD-enabled web objects allow us to create a collaborative application rapidly with limited coding effort. In this case study, the modular software architecture allowed us to hide the inherent heterogeneity among devices, data stores, and networks by presenting a uniform and persistent object view of mobile objects interacting through XML/SOAP requests and responses. The performance results we obtained show that the application scales well as we increase the group size and adapts well within the constraints of mobile devices.
Explainable Reasoning over Knowledge Graphs for Recommendation
Incorporating knowledge graphs into recommender systems has attracted
increasing attention in recent years. By exploring the interlinks within a
knowledge graph, the connectivity between users and items can be discovered as
paths, which provide rich and complementary information to user-item
interactions. Such connectivity not only reveals the semantics of entities and
relations, but also helps to comprehend a user's interest. However, existing
efforts have not fully explored this connectivity to infer user preferences,
especially in terms of modeling the sequential dependencies within and holistic
semantics of a path. In this paper, we contribute a new model named
Knowledge-aware Path Recurrent Network (KPRN) to exploit knowledge graphs for
recommendation. KPRN can generate path representations by composing the
semantics of both entities and relations. By leveraging the sequential
dependencies within a path, we allow effective reasoning on paths to infer the
underlying rationale of a user-item interaction. Furthermore, we design a new
weighted pooling operation to discriminate the strengths of different paths in
connecting a user with an item, endowing our model with a certain level of
explainability. We conduct extensive experiments on two datasets about movies
and music, demonstrating significant improvements over the state-of-the-art
solutions Collaborative Knowledge Base Embedding and Neural Factorization
Machine. Comment: 8 pages, 5 figures, AAAI-201
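The abstract does not spell out the weighted pooling operation, so the following is only a sketch of one plausible form: a temperature-scaled log-sum-exp over per-path scores. Treating this as the pooling rule is an assumption, and the names `weighted_pool` and `gamma` are hypothetical. It shows how a single parameter can move the aggregate between "every path counts" and "only the strongest path counts".

```python
import math

def weighted_pool(scores, gamma=1.0):
    """Aggregate per-path scores with a temperature-scaled log-sum-exp.

    Assumed form (not confirmed by the abstract):
        log(sum_k exp(s_k / gamma))
    A small gamma lets the strongest path dominate; a large gamma
    treats the paths more evenly.
    """
    scaled = [s / gamma for s in scores]
    m = max(scaled)  # subtract the max for numerical stability
    return m + math.log(sum(math.exp(x - m) for x in scaled))

# Two equally strong paths contribute equally.
even = weighted_pool([0.0, 0.0])
# With a small gamma, the result is driven by the best path (score 5.0).
sharp = weighted_pool([1.0, 5.0], gamma=0.01)
```

Because each path's contribution to the pooled value can be read off its exponentiated score, a rule of this shape also gives a simple way to rank which paths mattered most for a given user-item pair.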