An ontology enhanced parallel SVM for scalable spam filter training
This is the post-print version of the final paper published in Neurocomputing. The published article is available from the link below. Changes resulting from the publishing process, such as peer review, editing, corrections, structural formatting and other quality control mechanisms, may not be reflected in this document. Changes may have been made to this work since it was submitted for publication. Copyright © 2013 Elsevier B.V.

Spam, under a variety of shapes and forms, continues to inflict increasing damage. Various approaches, including Support Vector Machine (SVM) techniques, have been proposed for spam filter training and classification. However, SVM training is a computationally intensive process. This paper presents a MapReduce-based parallel SVM algorithm for scalable spam filter training. By distributing, processing and optimizing subsets of the training data across multiple participating computer nodes, the parallel SVM reduces training time significantly. Ontology semantics are employed to minimize the accuracy degradation incurred when distributing the training data among a number of SVM classifiers. Experimental results show that ontology-based augmentation improves the accuracy of the parallel SVM beyond that of the original sequential counterpart.
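The distribute-then-combine idea behind the parallel SVM can be sketched in a few lines. The following is a minimal illustration only, not the paper's method: it trains a Pegasos-style linear SVM on each data partition and averages the resulting weight vectors, whereas the paper's MapReduce SVM optimizes and merges subsets across nodes and adds ontology semantics to limit accuracy loss. The names `pegasos_train` and `parallel_svm` are hypothetical.

```python
import random

def pegasos_train(X, y, lam=0.01, epochs=50, seed=0):
    """Train a linear SVM with the Pegasos sub-gradient method (pure Python)."""
    rng = random.Random(seed)
    d = len(X[0])
    w = [0.0] * d
    t = 0
    for _ in range(epochs):
        idx = list(range(len(X)))
        rng.shuffle(idx)
        for i in idx:
            t += 1
            eta = 1.0 / (lam * t)  # standard Pegasos step size
            margin = y[i] * sum(wj * xj for wj, xj in zip(w, X[i]))
            if margin < 1:  # hinge loss is active: shrink and step toward x_i
                w = [(1 - eta * lam) * wj + eta * y[i] * xj
                     for wj, xj in zip(w, X[i])]
            else:           # only the regularizer contributes
                w = [(1 - eta * lam) * wj for wj in w]
    return w

def parallel_svm(X, y, n_parts=4):
    """Map: train one SVM per data partition. Reduce: average the weights."""
    parts = [(X[i::n_parts], y[i::n_parts]) for i in range(n_parts)]
    models = [pegasos_train(Xp, yp, seed=k) for k, (Xp, yp) in enumerate(parts)]
    return [sum(ws) / n_parts for ws in zip(*models)]

# Toy linearly separable data: the label is the sign of the first feature.
X = [[1.0, 0.2], [2.0, -0.5], [1.5, 1.0],
     [-1.0, 0.3], [-2.0, -0.8], [-1.2, 0.9]] * 8
y = [1, 1, 1, -1, -1, -1] * 8
w = parallel_svm(X, y)
preds = [1 if sum(wj * xj for wj, xj in zip(w, x)) >= 0 else -1 for x in X]
acc = sum(p == t for p, t in zip(preds, y)) / len(y)
print(round(acc, 2))
```

In a real MapReduce deployment, each partition's training would run as a map task and the combination step as the reduce task; the paper combines the per-node results differently and uses ontology semantics to decide how training data is distributed.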
Algorithmic patterns for ℋ-matrices on many-core processors
In this work, we consider the reformulation of hierarchical (ℋ-) matrix
algorithms for many-core processors, with a model implementation on
graphics processing units (GPUs). ℋ-matrices approximate specific
dense matrices, e.g., from discretized integral equations or kernel ridge
regression, leading to log-linear time complexity in dense matrix-vector
products. The parallelization of ℋ-matrix operations on many-core
processors is difficult due to the complex nature of the underlying algorithms.
While previous algorithmic advances for many-core hardware focused on
accelerating existing ℋ-matrix CPU implementations with many-core
processors, we here aim at relying entirely on that processor type. As our main
contribution, we introduce the parallel algorithmic patterns necessary
to map the full ℋ-matrix construction and the fast matrix-vector
product to many-core hardware. Crucial ingredients are space filling
curves, parallel tree traversal and batching of linear algebra operations. The
resulting model GPU implementation hmglib is, to the best of the authors'
knowledge, the first entirely GPU-based open-source ℋ-matrix library of
this kind. We conclude this work with an in-depth performance analysis and a
comparative performance study against a standard ℋ-matrix library,
highlighting profound speedups of our many-core parallel approach.
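Space filling curves, one of the ingredients named above, linearize a spatial point set so that points that are close in space also land close together in the ordering, which is what makes parallel tree construction and batching practical. A minimal sketch using a Morton (Z-order) encoding, a common choice for this purpose; the abstract does not specify which curve hmglib uses, so treat this as illustrative:

```python
def interleave_bits(x, y, bits=16):
    """Interleave the bits of x and y to form a 2-D Morton (Z-order) code."""
    code = 0
    for i in range(bits):
        code |= ((x >> i) & 1) << (2 * i)      # bits of x at even positions
        code |= ((y >> i) & 1) << (2 * i + 1)  # bits of y at odd positions
    return code

def morton_order(points, bits=10):
    """Sort 2-D non-negative integer grid points along the Z-order curve."""
    return sorted(points, key=lambda p: interleave_bits(p[0], p[1], bits))

pts = [(0, 0), (1, 1), (3, 0), (0, 3), (2, 2), (1, 0)]
print(morton_order(pts))
```

Once points are sorted by their Morton codes, contiguous ranges of the sorted array correspond to nodes of a spatial tree, so the tree can be built and traversed with data-parallel primitives that map well to GPUs.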
Alternating Maximization: Unifying Framework for 8 Sparse PCA Formulations and Efficient Parallel Codes
Given a multivariate data set, sparse principal component analysis (SPCA)
aims to extract several linear combinations of the variables that together
explain the variance in the data as much as possible, while controlling the
number of nonzero loadings in these combinations. In this paper we consider 8
different optimization formulations for computing a single sparse loading
vector; these are obtained by combining the following factors: we employ two
norms for measuring variance (L2, L1) and two sparsity-inducing norms (L0, L1),
which are used in two different ways (constraint, penalty). Three of our
formulations, notably the one with L0 constraint and L1 variance, have not been
considered in the literature. We give a unifying reformulation which we propose
to solve via a natural alternating maximization (AM) method. We show that the AM
method is nontrivially equivalent to GPower (Journ\'{e}e et al; JMLR
11:517--553, 2010) for all our formulations. Besides this, we provide 24
efficient parallel SPCA implementations: 3 codes (multi-core, GPU and cluster)
for each of the 8 problems. Parallelism in the methods is aimed at i) speeding
up computations (our GPU code can be 100 times faster than an efficient serial
code written in C++), ii) obtaining solutions explaining more variance and iii)
dealing with big data problems (our cluster code is able to solve a 357 GB
problem in about a minute).

Comment: 29 pages, 9 tables, 7 figures (the paper is accompanied by a release
of the open-source code '24am').
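The alternating maximization scheme can be illustrated on one penalized formulation of the single-vector problem: maximize x·(Az) − γ‖z‖₁ over unit-norm x and z, fixing one variable and solving for the other in closed form (a normalization step for x, a soft-threshold followed by normalization for z). This is a minimal pure-Python sketch under those assumptions; the function names and toy data are illustrative and not taken from the authors' 24am code, which covers all 8 formulations with parallel implementations.

```python
def matvec(A, z):
    """Compute A z for a row-major matrix A."""
    return [sum(aij * zj for aij, zj in zip(row, z)) for row in A]

def rmatvec(A, x):
    """Compute A^T x."""
    p = len(A[0])
    return [sum(A[i][j] * x[i] for i in range(len(A))) for j in range(p)]

def norm(v):
    return sum(vj * vj for vj in v) ** 0.5

def soft_threshold(v, g):
    """Entrywise shrinkage: the closed-form maximizer under the L1 penalty."""
    return [max(abs(vj) - g, 0.0) * (1.0 if vj > 0 else -1.0) for vj in v]

def sparse_pca_am(A, gamma, iters=100):
    """Alternate: x <- Az/||Az||, then z <- normalized soft-threshold(A^T x)."""
    p = len(A[0])
    z = [1.0 / p ** 0.5] * p  # uniform unit-norm start
    for _ in range(iters):
        Az = matvec(A, z)
        x = [v / norm(Az) for v in Az]
        w = soft_threshold(rmatvec(A, x), gamma)
        nw = norm(w)
        if nw == 0.0:  # penalty wiped out every loading
            break
        z = [wj / nw for wj in w]
    return z

# Toy data: nearly all variance sits in the first coordinate.
A = [[3.0, 0.0, 0.0, 0.1],
     [-3.0, 0.0, 0.0, -0.1],
     [2.5, 0.1, 0.0, 0.0],
     [-2.5, -0.1, 0.0, 0.0]]
z = sparse_pca_am(A, gamma=0.5)
nnz = sum(abs(zj) > 1e-8 for zj in z)
print(nnz)
```

With the penalty set high enough, the small loadings are thresholded away and the returned vector concentrates on the dominant coordinate, which is exactly the variance-versus-sparsity trade-off the abstract describes.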