Dilations and information flow axioms in categorical probability
We study the positivity and causality axioms for Markov categories as
properties of dilations and information flow in Markov categories, and in
variations thereof for arbitrary semicartesian monoidal categories. These help
us show that being a positive Markov category is merely an additional property
of a symmetric monoidal category (rather than extra structure). We also
characterize the positivity of representable Markov categories and prove that
causality implies positivity, but not conversely. Finally, we note that
positivity fails for quasi-Borel spaces and interpret this failure as a privacy
property of probabilistic name generation.
Comment: 42 pages
From patterned response dependency to structured covariate dependency: categorical-pattern-matching
Data generated from a system of interest typically consists of measurements
from an ensemble of subjects across multiple response and covariate features,
and is naturally represented as one response matrix paired with one covariate
matrix. Each of these two matrices typically contains heterogeneous data
types: continuous, discrete, and categorical. Here the matrix serves as a
practical platform for keeping hidden dependency among subjects and features
intact on its lattice. Response and covariate dependency are computed
separately and expressed through multiscale blocks via a newly
developed computing paradigm named Data Mechanics. We propose a categorical
pattern-matching approach to establish causal linkages in the form of information
flows from patterned response dependency to structured covariate dependency.
The strength of an information flow is evaluated by applying combinatorial
information theory. This unified platform for system knowledge discovery is
illustrated through five data sets. In each illustrative case, an information
flow is demonstrated as an organization of discovered knowledge loci via
emergent, visible, and readable heterogeneity. This unified approach
fundamentally resolves many long-standing issues in data analysis, including
statistical modeling, multiple responses, renormalization, and feature
selection, without imposing man-made structures or distributional assumptions.
The results reported here reinforce the idea that linking patterns of response
dependency to structures of covariate dependency is the true philosophical
foundation underlying data-driven computing and learning in the sciences.
Comment: 32 pages, 10 figures, 3 box pictures
Complexity of Grammar Induction for Quantum Types
Most categorical models of meaning use a functor from the syntactic category
to the semantic category. When semantic information is available, the problem
of grammar induction can therefore be defined as finding preimages of the
semantic types under this forgetful functor, lifting the information flow from
the semantic level to a valid reduction at the syntactic level. We study the
complexity of grammar induction, and show that for a variety of type systems,
including pivotal and compact closed categories, the grammar induction problem
is NP-complete. Our approach could be extended to linguistic type systems such
as autonomous or bi-closed categories.
Comment: In Proceedings QPL 2014, arXiv:1412.810
Argmax Flows and Multinomial Diffusion: Learning Categorical Distributions
Generative flows and diffusion models have been predominantly trained on
ordinal data, for example natural images. This paper introduces two extensions
of flows and diffusion for categorical data such as language or image
segmentation: Argmax Flows and Multinomial Diffusion. Argmax Flows are defined
by the composition of a continuous distribution (such as a normalizing flow)
and an argmax function. To optimize this model, we learn a probabilistic inverse
for the argmax that lifts the categorical data to a continuous space.
Multinomial Diffusion gradually adds categorical noise in a diffusion process,
for which the generative denoising process is learned. We demonstrate that our
method outperforms existing dequantization approaches on text modelling and on
modelling image segmentation maps in log-likelihood.
Comment: Accepted at Neural Information Processing Systems (NeurIPS 2021)
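The two constructions described in the abstract admit a compact numerical sketch. This is a minimal illustration, not the paper's method: a standard Gaussian stands in for the learned normalizing flow, the probabilistic inverse is a crude stand-in for the paper's learned one, and the noise value `beta_t` and category count `K` are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
K = 5  # number of categories (hypothetical size)

# Argmax Flow, sampling direction: draw a continuous vector z
# (a learned flow in the paper; a Gaussian here) and map it to
# a category by taking the argmax.
z = rng.normal(size=K)
category = int(np.argmax(z))

# Training needs a probabilistic inverse: lift a category x back to a
# continuous point whose argmax is x. As a crude stand-in for the
# learned inverse, sample freely and force coordinate x above the rest.
x = 2
u = rng.normal(size=K)
u[x] = np.max(u) + abs(rng.normal())  # now argmax(u) == x

# Multinomial Diffusion, one forward noising step: mix the one-hot
# data with the uniform distribution over K categories, then resample.
beta_t = 0.1  # noise-schedule value (hypothetical)
x_onehot = np.eye(K)[x]
probs = (1.0 - beta_t) * x_onehot + beta_t / K
x_t = int(rng.choice(K, p=probs))
```

The argmax direction is deterministic and many-to-one, which is exactly why a probabilistic inverse is needed to define a tractable likelihood; the diffusion step keeps `probs` a valid distribution because the convex mixture of two distributions sums to one.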