3,229 research outputs found
SCAN: Learning Hierarchical Compositional Visual Concepts
The seemingly infinite diversity of the natural world arises from a
relatively small set of coherent rules, such as the laws of physics or
chemistry. We conjecture that these rules give rise to regularities that can be
discovered through primarily unsupervised experiences and represented as
abstract concepts. If such representations are compositional and hierarchical,
they can be recombined into an exponentially large set of new concepts. This
paper describes SCAN (Symbol-Concept Association Network), a new framework for
learning such abstractions in the visual domain. SCAN learns concepts through
fast symbol association, grounding them in disentangled visual primitives that
are discovered in an unsupervised manner. Unlike state of the art multimodal
generative model baselines, our approach requires very few pairings between
symbols and images and makes no assumptions about the form of symbol
representations. Once trained, SCAN is capable of multimodal bi-directional
inference, generating a diverse set of image samples from symbolic descriptions
and vice versa. It also allows for traversal and manipulation of the implicit
hierarchy of visual concepts through symbolic instructions and learnt logical
recombination operations. Such manipulations enable SCAN to break away from its
training data distribution and imagine novel visual concepts through
symbolically instructed recombination of previously learnt concepts
Theano: new features and speed improvements
Theano is a linear algebra compiler that optimizes a user's
symbolically-specified mathematical computations to produce efficient low-level
implementations. In this paper, we present new features and efficiency
improvements to Theano, and benchmarks demonstrating Theano's performance
relative to Torch7, a recently introduced machine learning library, and to
RNNLM, a C++ library targeted at recurrent neural networks.Comment: Presented at the Deep Learning Workshop, NIPS 201
An efficient multi-core implementation of a novel HSS-structured multifrontal solver using randomized sampling
We present a sparse linear system solver that is based on a multifrontal
variant of Gaussian elimination, and exploits low-rank approximation of the
resulting dense frontal matrices. We use hierarchically semiseparable (HSS)
matrices, which have low-rank off-diagonal blocks, to approximate the frontal
matrices. For HSS matrix construction, a randomized sampling algorithm is used
together with interpolative decompositions. The combination of the randomized
compression with a fast ULV HSS factorization leads to a solver with lower
computational complexity than the standard multifrontal method for many
applications, resulting in speedups up to 7 fold for problems in our test
suite. The implementation targets many-core systems by using task parallelism
with dynamic runtime scheduling. Numerical experiments show performance
improvements over state-of-the-art sparse direct solvers. The implementation
achieves high performance and good scalability on a range of modern shared
memory parallel systems, including the Intel Xeon Phi (MIC). The code is part
of a software package called STRUMPACK -- STRUctured Matrices PACKage, which
also has a distributed memory component for dense rank-structured matrices
Knowledge Graph semantic enhancement of input data for improving AI
Intelligent systems designed using machine learning algorithms require a
large number of labeled data. Background knowledge provides complementary, real
world factual information that can augment the limited labeled data to train a
machine learning algorithm. The term Knowledge Graph (KG) is in vogue as for
many practical applications, it is convenient and useful to organize this
background knowledge in the form of a graph. Recent academic research and
implemented industrial intelligent systems have shown promising performance for
machine learning algorithms that combine training data with a knowledge graph.
In this article, we discuss the use of relevant KGs to enhance input data for
two applications that use machine learning -- recommendation and community
detection. The KG improves both accuracy and explainability
2HOT: An Improved Parallel Hashed Oct-Tree N-Body Algorithm for Cosmological Simulation
We report on improvements made over the past two decades to our adaptive
treecode N-body method (HOT). A mathematical and computational approach to the
cosmological N-body problem is described, with performance and scalability
measured up to 256k () processors. We present error analysis and
scientific application results from a series of more than ten 69 billion
() particle cosmological simulations, accounting for
floating point operations. These results include the first simulations using
the new constraints on the standard model of cosmology from the Planck
satellite. Our simulations set a new standard for accuracy and scientific
throughput, while meeting or exceeding the computational efficiency of the
latest generation of hybrid TreePM N-body methods.Comment: 12 pages, 8 figures, 77 references; To appear in Proceedings of SC
'1
Doctor of Philosophy
dissertationFormal verification of hardware designs has become an essential component of the overall system design flow. The designs are generally modeled as finite state machines, on which property and equivalence checking problems are solved for verification. Reachability analysis forms the core of these techniques. However, increasing size and complexity of the circuits causes the state explosion problem. Abstraction is the key to tackling the scalability challenges. This dissertation presents new techniques for word-level abstraction with applications in sequential design verification. By bundling together k bit-level state-variables into one word-level constraint expression, the state-space is construed as solutions (variety) to a set of polynomial constraints (ideal), modeled over the finite (Galois) field of 2^k elements. Subsequently, techniques from algebraic geometry -- notably, Groebner basis theory and technology -- are researched to perform reachability analysis and verification of sequential circuits. This approach adds a "word-level dimension" to state-space abstraction and verification to make the process more efficient. While algebraic geometry provides powerful abstraction and reasoning capabilities, the algorithms exhibit high computational complexity. In the dissertation, we show that by analyzing the constraints, it is possible to obtain more insights about the polynomial ideals, which can be exploited to overcome the complexity. Using our algorithm design and implementations, we demonstrate how to perform reachability analysis of finite-state machines purely at the word level. Using this concept, we perform scalable verification of sequential arithmetic circuits. As contemporary approaches make use of resolution proofs and unsatisfiable cores for state-space abstraction, we introduce the algebraic geometry analog of unsatisfiable cores, and present algorithms to extract and refine unsatisfiable cores of polynomial ideals. Experiments are performed to demonstrate the efficacy of our approaches
Many Roads to Synchrony: Natural Time Scales and Their Algorithms
We consider two important time scales---the Markov and cryptic orders---that
monitor how an observer synchronizes to a finitary stochastic process. We show
how to compute these orders exactly and that they are most efficiently
calculated from the epsilon-machine, a process's minimal unifilar model.
Surprisingly, though the Markov order is a basic concept from stochastic
process theory, it is not a probabilistic property of a process. Rather, it is
a topological property and, moreover, it is not computable from any
finite-state model other than the epsilon-machine. Via an exhaustive survey, we
close by demonstrating that infinite Markov and infinite cryptic orders are a
dominant feature in the space of finite-memory processes. We draw out the roles
played in statistical mechanical spin systems by these two complementary length
scales.Comment: 17 pages, 16 figures:
http://cse.ucdavis.edu/~cmg/compmech/pubs/kro.htm. Santa Fe Institute Working
Paper 10-11-02
Computational and human-based methods for knowledge discovery over knowledge graphs
The modern world has evolved, accompanied by the huge exploitation of data and information. Daily, increasing volumes of data from various sources and formats are stored, resulting in a challenging strategy to manage and integrate them to discover new knowledge. The appropriate use of data in various sectors of society, such as education, healthcare, e-commerce, and industry, provides advantages for decision support in these areas. However, knowledge discovery becomes challenging since data may come from heterogeneous sources with important information hidden. Thus, new approaches that adapt to the new challenges of knowledge discovery in such heterogeneous data environments are required. The semantic web and knowledge graphs (KGs) are becoming increasingly relevant on the road to knowledge discovery. This thesis tackles the problem of knowledge discovery over KGs built from heterogeneous data sources. We provide a neuro-symbolic artificial intelligence system that integrates symbolic and sub-symbolic frameworks to exploit the semantics encoded in a KG and its structure. The symbolic system relies on existing approaches of deductive databases to make explicit, implicit knowledge encoded in a KG. The proposed deductive database can derive new statements to ego networks given an abstract target prediction. Thus, minimizes data sparsity in KGs. In addition, a sub-symbolic system relies on knowledge graph embedding (KGE) models. KGE models are commonly applied in the KG completion task to represent entities in a KG in a low-dimensional vector space. However, KGE models are known to suffer from data sparsity, and a symbolic system assists in overcoming this fact. The proposed approach discovers knowledge given a target prediction in a KG and extracts unknown implicit information related to the target prediction. As a proof of concept, we have implemented the neuro-symbolic system on top of a KG for lung cancer to predict polypharmacy treatment effectiveness. The symbolic system implements a deductive system to deduce pharmacokinetic drug-drug interactions encoded in a set of rules through the Datalog program. Additionally, the sub-symbolic system predicts treatment effectiveness using a KGE model, which preserves the KG structure. An ablation study on the components of our approach is conducted, considering state-of-the-art KGE methods. The observed results provide evidence for the benefits of the neuro-symbolic integration of our approach, where the neuro-symbolic system for an abstract target prediction exhibits improved results. The enhancement of the results occurs because the symbolic system increases the prediction capacity of the sub-symbolic system. Moreover, the proposed neuro-symbolic artificial intelligence system in Industry 4.0 (I4.0) is evaluated, demonstrating its effectiveness in determining relatedness among standards and analyzing their properties to detect unknown relations in the I4.0KG. The results achieved allow us to conclude that the proposed neuro-symbolic approach for an abstract target prediction improves the prediction capability of KGE models by minimizing data sparsity in KGs
- …