15,783 research outputs found
The pharmacophore kernel for virtual screening with support vector machines
We introduce a family of positive definite kernels specifically optimized for
the manipulation of 3D structures of molecules with kernel methods. The kernels
are based on the comparison of the three-point pharmacophores present in the
3D structures of molecules, a set of molecular features known to be
particularly relevant for virtual screening applications. We present a
computationally demanding exact implementation of these kernels, as well as
fast approximations related to the classical fingerprint-based approaches.
Experimental results suggest that this new approach outperforms
state-of-the-art algorithms based on the 2D structure of molecules for the
detection of inhibitors of several drug targets.
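The fingerprint-style approximation mentioned above can be illustrated with a minimal sketch. It assumes a molecule is given as a list of (feature label, 3D coordinates) pairs and that inter-atomic distances are discretized into bins of fixed width; the input format, the binning scheme and the names `triplet_key`, `fingerprint` and `fingerprint_kernel` are hypothetical and are not taken from the paper. The kernel is the dot product of triplet count vectors, which is positive definite by construction.

```python
import math
from collections import Counter
from itertools import combinations


def triplet_key(atoms, bin_width):
    """Canonical key for three labelled 3D points (hypothetical encoding).

    Each edge is described by its sorted pair of labels and its binned
    length; sorting the three edges makes the key independent of the
    order in which the atoms are listed.
    """
    edges = []
    for (label_a, pos_a), (label_b, pos_b) in combinations(atoms, 2):
        distance = math.dist(pos_a, pos_b)
        edges.append((tuple(sorted((label_a, label_b))), int(distance // bin_width)))
    return tuple(sorted(edges))


def fingerprint(mol, bin_width=1.0):
    """Count every binned three-point pharmacophore found in a molecule."""
    return Counter(triplet_key(t, bin_width) for t in combinations(mol, 3))


def fingerprint_kernel(mol_1, mol_2, bin_width=1.0):
    """Dot product of the two triplet count vectors."""
    f1, f2 = fingerprint(mol_1, bin_width), fingerprint(mol_2, bin_width)
    return sum(count * f2[key] for key, count in f1.items())


# Toy molecules: mol_b is mol_a translated, mol_c has different features.
mol_a = [("donor", (0.0, 0.0, 0.0)), ("acceptor", (1.5, 0.0, 0.0)), ("aromatic", (0.0, 2.5, 0.0))]
mol_b = [(label, (x + 10.0, y - 3.0, z + 2.0)) for label, (x, y, z) in mol_a]
mol_c = [("donor", p) for _, p in mol_a]
```

Because the key depends only on labels and pairwise distances, the kernel value is unchanged by rigid motions of either molecule, up to effects of the distance binning at bin boundaries.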
Sharp analysis of low-rank kernel matrix approximations
We consider supervised learning problems within the positive-definite kernel
framework, such as kernel ridge regression, kernel logistic regression or the
support vector machine. With kernels leading to infinite-dimensional feature
spaces, a common practical limiting difficulty is the necessity of computing
the kernel matrix, which most frequently leads to algorithms with running time
at least quadratic in the number of observations n, i.e., O(n^2). Low-rank
approximations of the kernel matrix are often considered as they allow the
reduction of running time complexities to O(p^2 n), where p is the rank of the
approximation. The practicality of such methods thus depends on the required
rank p. In this paper, we show that in the context of kernel ridge regression,
for approximations based on a random subset of columns of the original kernel
matrix, the rank p may be chosen to be linear in the degrees of freedom
associated with the problem, a quantity which is classically used in the
statistical analysis of such methods, and is often seen as the implicit number
of parameters of non-parametric estimators. This result enables simple
algorithms that have sub-quadratic running time complexity, but provably
exhibit the same predictive performance as existing algorithms, for any given
problem instance, and not only for worst-case situations.
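The quantities named in this abstract can be written out as follows. This is a sketch in standard notation rather than the paper's own statement: the symbols $K$, $I$, $L$, $\lambda$ and $d$ are assumptions, as is the normalization of the regularization parameter $\lambda$ by $n$.

```latex
% Kernel matrix on n observations and a random subset I of p columns
K \in \mathbb{R}^{n \times n}, \qquad I \subset \{1, \dots, n\}, \quad |I| = p

% Low-rank approximation built from the sampled columns
% (K_{I,I}^{\dagger} denotes the pseudo-inverse of the p x p block)
L = K_{:,I} \, K_{I,I}^{\dagger} \, K_{I,:}, \qquad \operatorname{rank}(L) \le p

% Degrees of freedom of kernel ridge regression with regularization \lambda
d = \operatorname{tr}\!\left( K \, (K + n \lambda I_n)^{-1} \right)
```

With the approximation $L$ in place of $K$, the running time falls from $O(n^2)$ to $O(p^2 n)$. This is sub-quadratic in $n$ whenever a rank $p$ proportional to $d$ is small relative to $\sqrt{n}$.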
Scheduling data flow program in xkaapi: A new affinity based Algorithm for Heterogeneous Architectures
Efficient implementations of parallel applications on heterogeneous hybrid
architectures require a careful balance between computations and communications
with accelerator devices. Even if most of the communication time can be
overlapped by computations, it is essential to reduce the total volume of
communicated data. The literature therefore abounds with ad hoc methods to
reach that balance, but these are architecture- and application-dependent. We
propose here a generic mechanism to automatically optimize the scheduling
between CPUs and GPUs, and compare two strategies within this mechanism: the
classical Heterogeneous Earliest Finish Time (HEFT) algorithm and our new,
parametrized, Distributed Affinity Dual Approximation algorithm (DADA), which
groups the tasks by affinity before running a fast dual
approximation. We ran experiments on a heterogeneous parallel machine with six
CPU cores and eight NVIDIA Fermi GPUs. Three standard dense linear algebra
kernels from the PLASMA library have been ported on top of the Xkaapi runtime,
and we report their performance. The results show that HEFT and DADA both
perform well under various experimental conditions, but that DADA performs
better for larger systems and larger numbers of GPUs and, in most cases,
transfers much less data than HEFT to achieve the same performance.
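For readers unfamiliar with HEFT, the following is a simplified sketch of the classical algorithm: tasks are ranked by their upward rank (average cost plus the longest path to an exit task), then each is assigned to the processor that gives the earliest finish time. It omits HEFT's insertion-based slot search, assumes strictly positive task costs, and is not the Xkaapi implementation. The function name `heft` and the input dictionaries `succ`, `cost` and `comm` are hypothetical.

```python
from functools import lru_cache


def heft(succ, cost, comm):
    """Simplified HEFT list scheduling (no insertion policy).

    succ: task -> list of successor tasks (a DAG).
    cost: task -> list of run times, one entry per processor (all > 0).
    comm: (u, v) -> transfer time, paid only if u and v run on
          different processors.
    Returns task -> (processor, start, finish).
    """
    n_proc = len(next(iter(cost.values())))
    pred = {t: [] for t in cost}
    for u, successors in succ.items():
        for v in successors:
            pred[v].append(u)

    @lru_cache(maxsize=None)
    def upward_rank(t):
        # Average cost of t plus the most expensive path to an exit task.
        average = sum(cost[t]) / n_proc
        tail = max((comm.get((t, s), 0) + upward_rank(s) for s in succ.get(t, [])), default=0)
        return average + tail

    ready_at = [0.0] * n_proc  # time at which each processor becomes free
    schedule = {}
    # Decreasing upward rank is a valid topological order for positive costs.
    for t in sorted(cost, key=upward_rank, reverse=True):
        best = None
        for p in range(n_proc):
            data_ready = max(
                (schedule[u][2] + (0 if schedule[u][0] == p else comm.get((u, t), 0))
                 for u in pred[t]),
                default=0.0,
            )
            start = max(ready_at[p], data_ready)
            finish = start + cost[t][p]
            if best is None or finish < best[2]:
                best = (p, start, finish)
        schedule[t] = best
        ready_at[best[0]] = best[2]
    return schedule


# Toy diamond-shaped task graph; processor 0 plays a CPU, processor 1 a GPU.
succ = {"A": ["B", "C"], "B": ["D"], "C": ["D"], "D": []}
cost = {"A": [2, 1], "B": [3, 1], "C": [3, 1], "D": [2, 1]}
comm = {("A", "B"): 1, ("A", "C"): 1, ("B", "D"): 1, ("C", "D"): 1}
plan = heft(succ, cost, comm)
```

In this toy instance every task ends up on the faster processor, because moving any task to the other one would add both a longer run time and a transfer. The sketch models transfer time only, not the total volume of communicated data that the abstract compares.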