5,362 research outputs found
A distributed-memory package for dense Hierarchically Semi-Separable matrix computations using randomization
We present a distributed-memory library for computations with dense
structured matrices. A matrix is considered structured if its off-diagonal
blocks can be approximated by a rank-deficient matrix with low numerical rank.
Here, we use Hierarchically Semi-Separable representations (HSS). Such matrices
appear in many applications, e.g., finite element methods, boundary element
methods, etc. Exploiting this structure allows for fast solution of linear
systems and/or fast computation of matrix-vector products, which are the two
main building blocks of matrix computations. The compression algorithm that we
use, that computes the HSS form of an input dense matrix, relies on randomized
sampling with a novel adaptive sampling mechanism. We discuss the
parallelization of this algorithm and also present the parallelization of
structured matrix-vector product, structured factorization and solution
routines. The efficiency of the approach is demonstrated on large problems from
different academic and industrial applications, on up to 8,000 cores.
This work is part of a more global effort, the STRUMPACK (STRUctured Matrices
PACKage) software package for computations with sparse and dense structured
matrices. Hence, although useful on their own right, the routines also
represent a step in the direction of a distributed-memory sparse solver
One-class classifiers based on entropic spanning graphs
One-class classifiers offer valuable tools to assess the presence of outliers
in data. In this paper, we propose a design methodology for one-class
classifiers based on entropic spanning graphs. Our approach takes into account
the possibility to process also non-numeric data by means of an embedding
procedure. The spanning graph is learned on the embedded input data and the
outcoming partition of vertices defines the classifier. The final partition is
derived by exploiting a criterion based on mutual information minimization.
Here, we compute the mutual information by using a convenient formulation
provided in terms of the -Jensen difference. Once training is
completed, in order to associate a confidence level with the classifier
decision, a graph-based fuzzy model is constructed. The fuzzification process
is based only on topological information of the vertices of the entropic
spanning graph. As such, the proposed one-class classifier is suitable also for
data characterized by complex geometric structures. We provide experiments on
well-known benchmarks containing both feature vectors and labeled graphs. In
addition, we apply the method to the protein solubility recognition problem by
considering several representations for the input samples. Experimental results
demonstrate the effectiveness and versatility of the proposed method with
respect to other state-of-the-art approaches.Comment: Extended and revised version of the paper "One-Class Classification
Through Mutual Information Minimization" presented at the 2016 IEEE IJCNN,
Vancouver, Canad
Recommended from our members
Using topological sweep to extract the boundaries of regions in maps represented by region quadtrees
A variant of the plane sweep paradigm known as topological sweep is adapted to solve geometric problems involving two-dimensional regions when the underlying representation is a region quadtree. The utility of this technique is illustrated by showing how it can be used to extract the boundaries of a map in O(M) space and O(Ma(M)) time, where M is the number of quad tree blocks in the map, and a(·) is the (extremely slowly growing) inverse of Ackerman's function. The algorithm works for maps that contain multiple regions as well as holes. The algorithm makes use of active objects (in the form of regions) and an active border. It keeps track of the current position in the active border so that at each step no search is necessary. The algorithm represents a considerable improvement over a previous approach whose worst-case execution time is proportional to the product of the number of blocks in the map and the resolution of the quad tree (i.e., the maximum level of decomposition). The algorithm works for many different quadtree representations including those where the quadtree is stored in external storage
Learning hard quantum distributions with variational autoencoders
Studying general quantum many-body systems is one of the major challenges in
modern physics because it requires an amount of computational resources that
scales exponentially with the size of the system.Simulating the evolution of a
state, or even storing its description, rapidly becomes intractable for exact
classical algorithms. Recently, machine learning techniques, in the form of
restricted Boltzmann machines, have been proposed as a way to efficiently
represent certain quantum states with applications in state tomography and
ground state estimation. Here, we introduce a new representation of states
based on variational autoencoders. Variational autoencoders are a type of
generative model in the form of a neural network. We probe the power of this
representation by encoding probability distributions associated with states
from different classes. Our simulations show that deep networks give a better
representation for states that are hard to sample from, while providing no
benefit for random states. This suggests that the probability distributions
associated to hard quantum states might have a compositional structure that can
be exploited by layered neural networks. Specifically, we consider the
learnability of a class of quantum states introduced by Fefferman and Umans.
Such states are provably hard to sample for classical computers, but not for
quantum ones, under plausible computational complexity assumptions. The good
level of compression achieved for hard states suggests these methods can be
suitable for characterising states of the size expected in first generation
quantum hardware.Comment: v2: 9 pages, 3 figures, journal version with major edits with respect
to v1 (rewriting of section "hard and easy quantum states", extended
discussion on comparison with tensor networks
Bounding Embeddings of VC Classes into Maximum Classes
One of the earliest conjectures in computational learning theory-the Sample
Compression conjecture-asserts that concept classes (equivalently set systems)
admit compression schemes of size linear in their VC dimension. To-date this
statement is known to be true for maximum classes---those that possess maximum
cardinality for their VC dimension. The most promising approach to positively
resolving the conjecture is by embedding general VC classes into maximum
classes without super-linear increase to their VC dimensions, as such
embeddings would extend the known compression schemes to all VC classes. We
show that maximum classes can be characterised by a local-connectivity property
of the graph obtained by viewing the class as a cubical complex. This geometric
characterisation of maximum VC classes is applied to prove a negative embedding
result which demonstrates VC-d classes that cannot be embedded in any maximum
class of VC dimension lower than 2d. On the other hand, we show that every VC-d
class C embeds in a VC-(d+D) maximum class where D is the deficiency of C,
i.e., the difference between the cardinalities of a maximum VC-d class and of
C. For VC-2 classes in binary n-cubes for 4 <= n <= 6, we give best possible
results on embedding into maximum classes. For some special classes of Boolean
functions, relationships with maximum classes are investigated. Finally we give
a general recursive procedure for embedding VC-d classes into VC-(d+k) maximum
classes for smallest k.Comment: 22 pages, 2 figure
- …