Hierarchical linear support vector machine
This is the author's version of a work that was accepted for publication in Pattern Recognition. Changes resulting from the publishing process, such as peer review, editing, corrections, structural formatting, and other quality control mechanisms may not be reflected in this document. Changes may have been made to this work since it was submitted for publication. A definitive version was subsequently published in Pattern Recognition, Vol. 45, Iss. 12, (2012), DOI: 10.1016/j.patcog.2012.06.002.

The increasing size and dimensionality of real-world datasets make it necessary to design efficient algorithms not only in the training process but also in the prediction phase. In applications such as credit card fraud detection, the classifier needs to predict an event in at most 10 ms. In these environments the prediction-speed constraint heavily outweighs the training cost. We propose a new classification method, called a Hierarchical Linear Support Vector Machine (H-LSVM), based on the construction of an oblique decision tree in which each node split is obtained as a Linear Support Vector Machine. Although other methods have been proposed to break the data space down into subregions to speed up Support Vector Machines, the H-LSVM algorithm represents a very simple and efficient model in training and, above all, in prediction for large-scale datasets. Only a few hyperplanes need to be evaluated in the prediction step, no kernel computation is required, and the tree structure makes parallelization possible. In experiments with medium and large datasets, the H-LSVM reduces the prediction cost considerably while achieving classification results closer to those of the non-linear SVM than to those of the linear case.

The authors would like to thank the anonymous reviewers for their comments, which helped improve the manuscript. I.R.-L. is supported by an FPU Grant from Universidad Autónoma de Madrid, and partially supported by the Universidad Autónoma de Madrid-IIC Chair and TIN2010-21575-C02-01. R.H. acknowledges partial support by ONR N00014-07-1-0741, USARIEM-W81XWH-10-C-0040 (ELINTRIX) and JPL-2012-1455933.
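The prediction step described above — routing an example through a tree whose internal nodes are linear SVM hyperplanes — can be sketched as follows. This is an illustrative Python sketch, not the authors' implementation; the class and function names are hypothetical.

```python
# Hypothetical sketch of H-LSVM-style prediction: an oblique decision tree
# whose internal nodes hold a linear SVM hyperplane (w, b). Prediction costs
# only a few dot products and needs no kernel evaluation.
import numpy as np

class Node:
    def __init__(self, w=None, b=0.0, left=None, right=None, label=None):
        self.w, self.b = w, b            # hyperplane of this node's linear SVM
        self.left, self.right = left, right
        self.label = label               # class label, set only at leaves

def predict(node, x):
    """Route x down the tree: one inner product per level until a leaf."""
    while node.label is None:
        side = np.dot(node.w, x) + node.b
        node = node.left if side <= 0 else node.right
    return node.label

# Toy tree: the root hyperplane splits on the first feature.
tree = Node(w=np.array([1.0, 0.0]), b=0.0,
            left=Node(label=0), right=Node(label=1))
print(predict(tree, np.array([-2.0, 3.0])))  # prints 0
print(predict(tree, np.array([5.0, 1.0])))   # prints 1
```

Because each subtree is independent after its parent's split, the structure also lends itself to the parallel evaluation mentioned in the abstract.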
HVSTO: Efficient Privacy Preserving Hybrid Storage in Cloud Data Center
In cloud data centers, well-managed shared storage is the main structure used for storing virtual machines (VMs). In this paper, we propose Hybrid VM Storage (HVSTO), a privacy-preserving shared storage system designed for virtual machine storage in large-scale cloud data centers. Unlike traditional shared storage, HVSTO adopts a distributed structure to preserve the privacy of virtual machines, which is threatened in the traditional centralized structure. To improve I/O latency in this distributed structure, we use a hybrid system that combines solid-state disks with distributed storage. The evaluation of our demonstration system shows that HVSTO provides scalable and sufficient throughput for the platform-as-a-service infrastructure.

Comment: 7 pages, 8 figures, in proceedings of The Second International Workshop on Security and Privacy in Big Data (BigSecurity 2014)
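The hybrid design described above — a fast solid-state tier in front of a slower distributed backing store — can be sketched as a simple two-tier read path. This is an illustrative sketch under assumed semantics, not HVSTO's actual architecture or API; all names are hypothetical.

```python
# Illustrative two-tier read path: serve hot VM blocks from a local SSD-like
# cache, fall back to the distributed backing store on a miss and promote the
# block into the cache. Plain dicts stand in for the two storage tiers.
class HybridStore:
    def __init__(self):
        self.ssd_cache = {}      # fast local tier (stands in for the SSD)
        self.distributed = {}    # slow backing tier (stands in for the cluster)

    def write(self, block_id, data):
        self.distributed[block_id] = data     # durable copy in the backing store

    def read(self, block_id):
        if block_id in self.ssd_cache:        # cache hit: low-latency path
            return self.ssd_cache[block_id]
        data = self.distributed[block_id]     # cache miss: fetch and promote
        self.ssd_cache[block_id] = data
        return data

store = HybridStore()
store.write("vm1-blk0", b"boot sector")
assert store.read("vm1-blk0") == b"boot sector"  # first read misses, then caches
assert "vm1-blk0" in store.ssd_cache
```

The design choice illustrated is the one the abstract motivates: latency-sensitive reads are absorbed by the local solid-state tier, while the distributed tier provides capacity and isolation.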
A distributed file service based on optimistic concurrency control
The design of a layered file service for the Amoeba Distributed System is discussed, on top of which various applications can easily be implemented. The bottom layer is formed by the Amoeba Block Services, responsible for implementing stable storage and replicated, highly available disk blocks. The next layer is formed by the Amoeba File Service, which provides version management and concurrency control for tree-structured files. On top of this layer, the applications, ranging from databases to source code control systems, determine the structure of the file trees and provide an interface to the users.
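The optimistic concurrency control named in the title can be sketched with versioned files: a writer reads a version, prepares an update, and the commit succeeds only if the version is unchanged. This is a minimal sketch in the spirit of, but not identical to, the Amoeba File Service; the names are illustrative.

```python
# Minimal optimistic concurrency control over a versioned file: commits are
# validated against the version observed at read time, so conflicting writers
# are detected at commit rather than blocked by locks.
class VersionedFile:
    def __init__(self, data=b""):
        self.version = 0
        self.data = data

    def read(self):
        return self.version, self.data       # snapshot: version + contents

    def commit(self, base_version, new_data):
        if base_version != self.version:     # another commit landed first
            return False                     # caller retries from a fresh read
        self.data = new_data
        self.version += 1
        return True

f = VersionedFile(b"v0")
v, _ = f.read()
assert f.commit(v, b"v1") is True    # first writer wins
assert f.commit(v, b"v2") is False   # stale version: conflict detected
```

On a failed commit the application simply re-reads and retries, which is cheap when conflicts are rare — the usual argument for the optimistic approach.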
Learning loopy graphical models with latent variables: Efficient methods and guarantees
The problem of structure estimation in graphical models with latent variables
is considered. We characterize conditions for tractable graph estimation and
develop efficient methods with provable guarantees. We consider models where
the underlying Markov graph is locally tree-like, and the model is in the
regime of correlation decay. For the special case of the Ising model, the
number of samples $n$ required for structural consistency of our method scales as $n = \Omega\bigl(\theta_{\min}^{-\delta\eta(\eta+1)-2}\log p\bigr)$, where $p$ is the number of variables, $\theta_{\min}$ is the minimum edge potential, $\delta$ is the depth (i.e., distance from a hidden node to the nearest observed nodes), and $\eta$ is a parameter which depends on the bounds on node and edge potentials in the Ising model. Necessary conditions for structural consistency under any algorithm are derived and our method nearly matches the lower bound on sample requirements. Further, the proposed method is practical to implement and provides flexibility to control the number of latent variables and the cycle lengths in the output graph.

Comment: Published at http://dx.doi.org/10.1214/12-AOS1070 in the Annals of Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical Statistics (http://www.imstat.org)
A multiresolution space-time adaptive scheme for the bidomain model in electrocardiology
This work deals with the numerical solution of the monodomain and bidomain
models of electrical activity of myocardial tissue. The bidomain model is a
system consisting of a possibly degenerate parabolic PDE coupled with an
elliptic PDE for the transmembrane and extracellular potentials, respectively.
This system of two scalar PDEs is supplemented by a time-dependent ODE modeling
the evolution of the so-called gating variable. In the simpler sub-case of the
monodomain model, the elliptic PDE reduces to an algebraic equation. Two simple
models for the membrane and ionic currents are considered, the
Mitchell-Schaeffer model and the simpler FitzHugh-Nagumo model. Since typical
solutions of the bidomain and monodomain models exhibit wavefronts with steep
gradients, we propose a finite volume scheme enriched by a fully adaptive
multiresolution method, whose basic purpose is to concentrate computational
effort on zones of strong variation of the solution. Time adaptivity is
achieved by two alternative devices, namely locally varying time stepping and a
Runge-Kutta-Fehlberg-type adaptive time integration. A series of numerical
examples demonstrates that these methods are efficient and sufficiently accurate to simulate the electrical activity in myocardial tissue with affordable effort. In addition, an optimal threshold for discarding non-significant information in the multiresolution representation of the solution is derived, and the numerical efficiency and accuracy of the method are measured in terms of CPU time speed-up, memory compression, and errors in different norms.

Comment: 25 pages, 41 figures
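The parabolic-elliptic-ODE structure described above can be written out in a commonly used form of the bidomain model; the notation below is the conventional one and may differ from the paper's.

```latex
% v: transmembrane potential, u_e: extracellular potential, w: gating variable,
% sigma_i, sigma_e: intra-/extracellular conductivity tensors.
\begin{align*}
  \beta c_m\,\partial_t v + I_{\mathrm{ion}}(v,w)
    &= \nabla\cdot\bigl(\sigma_i \nabla (v + u_e)\bigr)
      && \text{(possibly degenerate parabolic PDE for } v\text{)}\\
  0 &= \nabla\cdot\bigl(\sigma_i \nabla v + (\sigma_i + \sigma_e)\nabla u_e\bigr)
      && \text{(elliptic PDE for } u_e\text{)}\\
  \partial_t w &= g(v,w)
      && \text{(ODE for the gating variable)}
\end{align*}
```

In the monodomain sub-case, where the conductivity tensors are proportional, the elliptic equation reduces to an algebraic relation and the system collapses to a single reaction-diffusion PDE, consistent with the simplification mentioned in the abstract.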