An ontology enhanced parallel SVM for scalable spam filter training
This is the post-print version of the final paper published in Neurocomputing. Changes resulting from the publishing process, such as peer review, editing, corrections, structural formatting, and other quality control mechanisms, may not be reflected in this document. Changes may have been made to this work since it was submitted for publication. Copyright © 2013 Elsevier B.V.
Spam, in a variety of shapes and forms, continues to inflict increasing damage. Various approaches, including Support Vector Machine (SVM) techniques, have been proposed for spam filter training and classification. However, SVM training is a computationally intensive process. This paper presents a MapReduce-based parallel SVM algorithm for scalable spam filter training. By distributing, processing and optimizing subsets of the training data across multiple participating computer nodes, the parallel SVM reduces the training time significantly. Ontology semantics are employed to minimize the accuracy degradation that arises when the training data are distributed among a number of SVM classifiers. Experimental results show that ontology-based augmentation improves the accuracy of the parallel SVM beyond that of its sequential counterpart.
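The map-and-combine pattern described in this abstract can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the toy two-feature data set `TOY`, the simple subgradient trainer, and above all the reduce step, which just averages the per-partition models. The paper's own combination scheme and its ontology-based augmentation are not reproduced, and the built-in `map` stands in for work that a MapReduce framework would place on separate nodes.

```python
# Hypothetical toy data: (features, label), label +1 = spam, -1 = ham.
TOY = [((3, 0), 1), ((4, 1), 1), ((0, 3), -1), ((1, 4), -1),
       ((3, 1), 1), ((5, 0), 1), ((0, 4), -1), ((1, 3), -1)]

def train_linear_svm(samples, epochs=200, lam=0.01, lr=0.1):
    """Subgradient descent on the L2-regularised hinge loss; returns (w, b)."""
    dim = len(samples[0][0])
    w, b = [0.0] * dim, 0.0
    for _ in range(epochs):
        for x, y in samples:
            margin = y * (sum(wi * xi for wi, xi in zip(w, x)) + b)
            w = [wi * (1 - lr * lam) for wi in w]      # regularisation shrink
            if margin < 1:                             # hinge loss is active
                w = [wi + lr * y * xi for wi, xi in zip(w, x)]
                b += lr * y
    return w, b

def parallel_train(samples, n_parts):
    """Map: train one SVM per partition. Reduce: average the models (assumption)."""
    parts = [samples[i::n_parts] for i in range(n_parts)]  # round-robin split
    models = list(map(train_linear_svm, parts))            # independent tasks
    dim = len(models[0][0])
    w = [sum(m[0][k] for m in models) / n_parts for k in range(dim)]
    b = sum(m[1] for m in models) / n_parts
    return w, b

def predict(model, x):
    """Classify x as +1 or -1 using the sign of the linear score."""
    w, b = model
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) + b >= 0 else -1

model = parallel_train(TOY, n_parts=2)
```

Each partition is trained without reference to the others, which is what makes the map step parallel. The accuracy loss that such a split can introduce is the problem the abstract says the ontology semantics are meant to address.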
How proofs are prepared at Camelot
We study a design framework for robust, independently verifiable, and
workload-balanced distributed algorithms working on a common input. An
algorithm based on the framework is essentially a distributed encoding
procedure for a Reed-Solomon code, which enables (a) robustness against
Byzantine failures with intrinsic error-correction and identification of failed
nodes, and (b) independent randomized verification to check the entire
computation for correctness, which takes essentially no more resources than
each node individually contributes to the computation. The framework builds on
recent Merlin-Arthur proofs of batch evaluation of Williams [Electron.
Colloq. Comput. Complexity, Report TR16-002, January 2016] with the
observation that Merlin's magic is not needed for batch evaluation: mere
Knights can prepare the proof, in parallel, and with intrinsic
error-correction.
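The verification idea in point (b) can be sketched on a toy scale. In the sketch below, which is an illustration under stated assumptions and not the paper's framework, the "proof" is the coefficient list of a polynomial over a prime field, the nodes each contribute one evaluation, and the verifier checks the proof at a single point. The modulus `P`, the function `work` and the point set are hypothetical, and the Reed-Solomon error-correction step is omitted.

```python
P = 2_147_483_647  # prime modulus 2^31 - 1, an arbitrary choice for this sketch

def work(x):
    """Hypothetical per-point computation: one node's share of the job."""
    return (x * x * x + 5 * x + 7) % P

def mul_linear(poly, root):
    """Multiply poly (coefficients, lowest degree first) by (x - root) mod P."""
    out = [0] * (len(poly) + 1)
    for i, c in enumerate(poly):
        out[i] = (out[i] - root * c) % P
        out[i + 1] = (out[i + 1] + c) % P
    return out

def interpolate(xs, ys):
    """Lagrange interpolation over GF(P); returns coefficients, lowest first."""
    coeffs = [0] * len(xs)
    for i, (xi, yi) in enumerate(zip(xs, ys)):
        basis, denom = [1], 1
        for j, xj in enumerate(xs):
            if j != i:
                basis = mul_linear(basis, xj)
                denom = denom * (xi - xj) % P
        scale = yi * pow(denom, P - 2, P) % P   # modular inverse via Fermat
        for k, c in enumerate(basis):
            coeffs[k] = (coeffs[k] + scale * c) % P
    return coeffs

def evaluate(coeffs, x):
    """Horner evaluation of the proof polynomial at x."""
    acc = 0
    for c in reversed(coeffs):
        acc = (acc * x + c) % P
    return acc

def verify(proof, x0):
    """Accept if the proof agrees with one direct evaluation at x0."""
    return evaluate(proof, x0) == work(x0)

xs = [0, 1, 2, 3]                                # one point per node
proof = interpolate(xs, [work(x) for x in xs])   # nodes prepare the proof
bad = proof[:]
bad[1] = (bad[1] + 1) % P                        # a corrupted proof
```

Here an honest proof passes at every point, while the corrupted one differs from it by a nonzero polynomial of degree at most 3 and so can agree at no more than 3 points. A verifier that draws `x0` at random from the field therefore rejects it with high probability, at the cost of one evaluation of `work`.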
The contribution of this paper is to show that in many cases the verifiable
batch evaluation framework admits algorithms that match in total resource
consumption the best known sequential algorithm for solving the problem. As our
main result, we show that the -cliques in an -vertex graph can be counted
and verified in per-node time and space on
compute nodes, for any constant and
positive integer divisible by , where is the
exponent of matrix multiplication. This matches in total running time the best
known sequential algorithm, due to Nešetřil and Poljak [Comment. Math. Univ.
Carolin. 26 (1985) 415-419], and considerably improves
its space usage and parallelizability. Further results include novel algorithms
for counting triangles in sparse graphs, computing the chromatic polynomial of
a graph, and computing the Tutte polynomial of a graph.
Comment: 42 p
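The link between clique counting and matrix multiplication that underlies the sequential baseline can be seen in its smallest case. For triangles (3-cliques), the trace of the cube of the adjacency matrix counts every triangle six times. The sketch below uses a plain cubic-time matrix product for clarity, so it shows the identity only and not the fast matrix multiplication or the distributed, verifiable scheme of the paper. The function names are illustrative.

```python
def mat_mul(a, b):
    """Naive product of two square integer matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def count_triangles(adj):
    """Number of triangles in a simple undirected graph: trace(A^3) / 6."""
    cube = mat_mul(mat_mul(adj, adj), adj)
    # Each triangle yields 6 closed walks of length 3 (3 starts x 2 directions).
    return sum(cube[i][i] for i in range(len(adj))) // 6

# Complete graph on 4 vertices: every choice of 3 vertices is a triangle.
K4 = [[0 if i == j else 1 for j in range(4)] for i in range(4)]

# Path on 4 vertices: no triangles.
PATH4 = [[0, 1, 0, 0],
         [1, 0, 1, 0],
         [0, 1, 0, 1],
         [0, 0, 1, 0]]
```

Replacing `mat_mul` with a faster multiplication routine changes the running time but not the identity.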
Domain decomposition methods for compressed sensing
We present several domain decomposition algorithms for sequential and
parallel minimization of functionals formed by a discrepancy term with respect
to data and total variation constraints. The convergence properties of the
algorithms are analyzed. We provide several numerical experiments, showing the
successful application of the algorithms for the restoration of 1D and 2D signals
in interpolation/inpainting problems respectively, and in a compressed sensing
problem, for recovering piecewise constant medical-type images from partial
Fourier ensembles.
Comment: 4 page
An efficient steady-state analysis of the eddy current problem using a parallel-in-time algorithm
This paper introduces a parallel-in-time algorithm for efficient steady-state
solution of the eddy current problem. Its main idea is based on the application
of the well-known multi-harmonic (or harmonic balance) approach as the coarse
solver within the periodic parallel-in-time framework. A frequency domain
representation allows for the separate calculation of each harmonic component
in parallel and therefore accelerates the solution of the time-periodic system.
The presented approach is verified for a nonlinear coaxial cable model.
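The frequency-domain step mentioned in this abstract, where each harmonic is computed separately, can be illustrated on a scalar linear stand-in. The sketch below assumes a hypothetical equation L di/dt + R i = v(t) with a periodic excitation. It is not the eddy current model or the parallel-in-time method of the paper. The function name and parameters are illustrative.

```python
import cmath
import math

def harmonic_balance_rl(v_samples, R, L, omega):
    """Periodic steady state of L*di/dt + R*i = v(t).

    v_samples holds v(t) at n equally spaced times over one period 2*pi/omega.
    Returns i(t) at the same n times.
    """
    n = len(v_samples)
    # Discrete Fourier coefficients of the excitation.
    V = [sum(v_samples[m] * cmath.exp(-2j * math.pi * k * m / n)
             for m in range(n)) / n
         for k in range(n)]
    I = []
    for k in range(n):
        h = k if k <= n // 2 else k - n        # signed harmonic index
        # Each harmonic decouples: (R + j*h*omega*L) * I_h = V_h.
        I.append(V[k] / (R + 1j * h * omega * L))
    # Synthesise the time-domain solution from the harmonics.
    return [sum(I[k] * cmath.exp(2j * math.pi * k * m / n)
                for k in range(n)).real
            for m in range(n)]

n, R, L, omega = 16, 2.0, 0.5, 3.0
thetas = [2 * math.pi * m / n for m in range(n)]   # omega * t at each sample
current = harmonic_balance_rl([math.cos(th) for th in thetas], R, L, omega)
```

Because the loop over `k` has no dependence between iterations, the harmonics can be solved in parallel. For a nonlinear problem the harmonics couple, and how the paper handles its nonlinear model is not shown here.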
Objective multiscale analysis of random heterogeneous materials
The multiscale framework presented in [1, 2] is assessed in this contribution for a study of random heterogeneous materials. Results are compared to direct numerical simulations (DNS), and the sensitivity to user-defined parameters such as the domain decomposition type and the initial coarse-scale resolution is reported. The parallel performance of the implementation is studied for different domain decompositions.