A non-ambiguous decomposition of regular languages and factorizing codes
Given languages Z, L ⊆ Σ∗, Z is L-decomposable (resp. finitely L-decomposable) if there exists a non-trivial pair of languages (resp. finite languages) (A, B) such that Z = AL + B and the operations are non-ambiguous. We show that it is decidable whether Z is L-decomposable and whether Z is finitely L-decomposable when Z and L are regular languages. The result in the case Z = L allows one to decide whether, given a finite language S ⊆ Σ∗, there exist finite languages C, P such that SC∗P = Σ∗ with non-ambiguous operations. This problem is related to Schützenberger's Factorization Conjecture on codes. We also construct an infinite family of factorizing codes.
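For finite languages, the non-ambiguity condition in the definition above can be checked by brute force. The following is a minimal sketch under that assumption only: it is not the decision procedure of the paper, which handles regular languages, and it does not test the non-triviality of (A, B), whose precise definition the abstract does not give. The function names are hypothetical.

```python
from itertools import product


def unambiguous_product(A, L):
    """Return the concatenation A.L and whether each of its words
    arises from exactly one pair (a, l) with a in A and l in L."""
    counts = {}
    for a, l in product(A, L):
        w = a + l
        counts[w] = counts.get(w, 0) + 1
    return set(counts), all(c == 1 for c in counts.values())


def is_unambiguous_decomposition(Z, A, L, B):
    """Check Z = AL + B for finite languages, with an unambiguous
    product and a disjoint union (illustrative helper, not from the paper)."""
    AL, product_ok = unambiguous_product(A, L)
    union_ok = AL.isdisjoint(B)
    return product_ok and union_ok and (AL | set(B)) == set(Z)


# Unambiguous case: every word of AL has a single factorization.
ok = is_unambiguous_decomposition(
    Z={"a", "ab", "ba", "bab", "bb"}, A={"", "b"}, L={"a", "ab"}, B={"bb"}
)

# Ambiguous case: "abb" = a.bb = ab.b, so the product is ambiguous.
bad = is_unambiguous_decomposition(
    Z={"ab", "abb", "abbb"}, A={"a", "ab"}, L={"b", "bb"}, B=set()
)
print(ok, bad)  # True False
```

The brute-force count is quadratic in the sizes of A and L, which is acceptable only for small finite examples.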
A note on the factorization conjecture
We give partial results on the factorization conjecture on codes proposed by
Schützenberger. We consider finite maximal codes C over the alphabet A = {a, b}
with C \cap a^* = a^p, for a prime number p. Let P, S in Z⟨A⟩, with S = S_0 +
S_1, supp(S_0) \subset a^* and supp(S_1) \subset a^*b supp(S_0). We prove that
if (P,S) is a factorization for C then (P,S) is positive, that is P,S have
coefficients 0,1, and we characterize the structure of these codes. As a
consequence, we prove that if C is a finite maximal code such that each word in
C has at most 4 occurrences of b's and a^p is in C, then each factorization for
C is a positive factorization. We also discuss the structure of these codes.
The obtained results show once again relations between (positive)
factorizations and factorizations of cyclic groups
Spherical and Hyperbolic Toric Topology-Based Codes On Graph Embedding for Ising MRF Models: Classical and Quantum Topology Machine Learning
The paper introduces the application of information geometry to describe the
ground states of Ising models by utilizing parity-check matrices of cyclic and
quasi-cyclic codes on toric and spherical topologies. The approach establishes
a connection between machine learning and error-correcting coding. This
proposed approach has implications for the development of new embedding methods
based on trapping sets. Statistical physics and number geometry are applied to
optimize error-correcting codes, leading to these embedding and sparse
factorization methods. The paper establishes a direct connection between DNN
architecture and error-correcting coding by demonstrating how state-of-the-art
architectures (ChordMixer, Mega, Mega-chunk, CDIL, ...) from the long-range
arena can be equivalent to block and convolutional LDPC codes (Cage-graph,
Repeat Accumulate). QC codes correspond to certain types of chemical elements,
with the carbon element being represented by the mixed automorphism
Shu-Lin-Fossorier QC-LDPC code. The connections between Belief Propagation and
the Permanent, Bethe-Permanent, Nishimori Temperature, and Bethe-Hessian Matrix
are elaborated upon in detail. The Quantum Approximate Optimization Algorithm
(QAOA) used in the Sherrington-Kirkpatrick Ising model can be seen as analogous
to the back-propagation loss function landscape in training DNNs. This
similarity creates a comparable problem with trapping-set (TS) pseudo-codewords, resembling the
belief propagation method. Additionally, the layer depth in QAOA correlates to
the number of decoding belief propagation iterations in the Wiberg decoding
tree. Overall, this work has the potential to advance multiple fields, from
Information Theory, DNN architecture design (sparse and structured prior graph
topology), efficient hardware design for Quantum and Classical DPU/TPU (graph,
quantize and shift register architect.) to Materials Science and beyond.
Comment: 71 pages, 42 Figures, 1 Table, 1 Appendix. arXiv admin note: text
overlap with arXiv:2109.08184 by other author
Tropicalization and irreducibility of Generalized Vandermonde Determinants
We find geometric and arithmetic conditions in order to characterize the
irreducibility of the determinant of the generic Vandermonde matrix over the
algebraic closure of any field k. We also characterize those determinants whose
tropicalization with respect to the variables of a row is irreducible.
Comment: 10 pages, AMSart. Revised version to appear in Proceedings of the AM
Challenging Common Assumptions in the Unsupervised Learning of Disentangled Representations
The key idea behind the unsupervised learning of disentangled representations
is that real-world data is generated by a few explanatory factors of variation
which can be recovered by unsupervised learning algorithms. In this paper, we
provide a sober look at recent progress in the field and challenge some common
assumptions. We first theoretically show that the unsupervised learning of
disentangled representations is fundamentally impossible without inductive
biases on both the models and the data. Then, we train more than 12000 models
covering most prominent methods and evaluation metrics in a reproducible
large-scale experimental study on seven different data sets. We observe that
while the different methods successfully enforce properties "encouraged" by
the corresponding losses, well-disentangled models seemingly cannot be
identified without supervision. Furthermore, increased disentanglement does not
seem to lead to a decreased sample complexity of learning for downstream tasks.
Our results suggest that future work on disentanglement learning should be
explicit about the role of inductive biases and (implicit) supervision,
investigate concrete benefits of enforcing disentanglement of the learned
representations, and consider a reproducible experimental setup covering
several data sets.