Search CORE

15,783 research outputs found

The pharmacophore kernel for virtual screening with support vector machines

Author: Mahé Pierre
Ralaivola Liva
Stoven Véronique
Vert Jean-Philippe
Publication venue
Publication date: 03/03/2006
Field of study

We introduce a family of positive definite kernels specifically optimized for the manipulation of 3D structures of molecules with kernel methods. The kernels are based on the comparison of the three-points pharmacophores present in the 3D structures of molecul es, a set of molecular features known to be particularly relevant for virtual screening applications. We present a computationally demanding exact implementation of these kernels, as well as fast approximations related to the classical fingerprint-based approa ches. Experimental results suggest that this new approach outperforms state-of-the-art algorithms based on the 2D structure of mol ecules for the detection of inhibitors of several drug targets

arXiv.org e-Print Archive

HAL AMU

HAL-MINES ParisTech

Sharp analysis of low-rank kernel matrix approximations

Author: Bach Francis
Publication venue
Publication date: 01/01/2013
Field of study

We consider supervised learning problems within the positive-definite kernel framework, such as kernel ridge regression, kernel logistic regression or the support vector machine. With kernels leading to infinite-dimensional feature spaces, a common practical limiting difficulty is the necessity of computing the kernel matrix, which most frequently leads to algorithms with running time at least quadratic in the number of observations n, i.e., O(n^2). Low-rank approximations of the kernel matrix are often considered as they allow the reduction of running time complexities to O(p^2 n), where p is the rank of the approximation. The practicality of such methods thus depends on the required rank p. In this paper, we show that in the context of kernel ridge regression, for approximations based on a random subset of columns of the original kernel matrix, the rank p may be chosen to be linear in the degrees of freedom associated with the problem, a quantity which is classically used in the statistical analysis of such methods, and is often seen as the implicit number of parameters of non-parametric estimators. This result enables simple algorithms that have sub-quadratic running time complexity, but provably exhibit the same predictive performance than existing algorithms, for any given problem instance, and not only for worst-case situations

arXiv.org e-Print Archive

CiteSeerX

INRIA a CCSD electronic archive server

Scheduling data flow program in xkaapi: A new affinity based Algorithm for Heterogeneous Architectures

Author: A. Buttari
C. Augonnet
D.S. Hochbaum
E. Agullo
E. Hermann
F. Song
G. Bosilca
H. Topcuoglu
J.V.F. Lima
S. Kedad-Sidhoum
S. Tomov
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2014
Field of study

Efficient implementations of parallel applications on heterogeneous hybrid architectures require a careful balance between computations and communications with accelerator devices. Even if most of the communication time can be overlapped by computations, it is essential to reduce the total volume of communicated data. The literature therefore abounds with ad-hoc methods to reach that balance, but that are architecture and application dependent. We propose here a generic mechanism to automatically optimize the scheduling between CPUs and GPUs, and compare two strategies within this mechanism: the classical Heterogeneous Earliest Finish Time (HEFT) algorithm and our new, parametrized, Distributed Affinity Dual Approximation algorithm (DADA), which consists in grouping the tasks by affinity before running a fast dual approximation. We ran experiments on a heterogeneous parallel machine with six CPU cores and eight NVIDIA Fermi GPUs. Three standard dense linear algebra kernels from the PLASMA library have been ported on top of the Xkaapi runtime. We report their performances. It results that HEFT and DADA perform well for various experimental conditions, but that DADA performs better for larger systems and number of GPUs, and, in most cases, generates much lower data transfers than HEFT to achieve the same performance

arXiv.org e-Print Archive

Crossref

Hal - Université Grenoble Alpes

INRIA a CCSD electronic archive server