
    A Proof of Entropy Minimization for Outputs in Deletion Channels via Hidden Word Statistics

    From the output produced by a memoryless deletion channel from a uniformly random input of known length $n$, one obtains a posterior distribution on the channel input. The difference between the Shannon entropy of this distribution and that of the uniform prior measures the amount of information about the channel input which is conveyed by the output of length $m$, and it is natural to ask for which outputs this is extremized. This question was posed in a previous work, where it was conjectured on the basis of experimental data that the entropy of the posterior is minimized and maximized by the constant strings $\texttt{000}\ldots$ and $\texttt{111}\ldots$ and the alternating strings $\texttt{0101}\ldots$ and $\texttt{1010}\ldots$, respectively. In the present work we confirm the minimization conjecture in the asymptotic limit using results from hidden word statistics. We show how the analytic-combinatorial methods of Flajolet, Szpankowski and Vallée for dealing with the hidden pattern matching problem can be applied to resolve the case of fixed output length and $n \rightarrow \infty$, by obtaining estimates for the entropy in terms of the moments of the posterior distribution and establishing its minimization via a measure of autocorrelation.
    Comment: 11 pages, 2 figures
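
    As a concrete, brute-force illustration of the quantity being extremized (a sketch only, not the paper's analytic-combinatorial argument): because the channel is memoryless, the posterior probability of an input x given an output y is proportional to the number of occurrences of y as a subsequence of x, so for small n one can enumerate all inputs and rank outputs by posterior entropy directly. Function names and parameters below are illustrative.

```python
from itertools import product
from math import log2

def embedding_count(y, x):
    """Number of ways y occurs as a subsequence of x (standard DP)."""
    m = len(y)
    dp = [0] * (m + 1)
    dp[0] = 1
    for c in x:
        for j in range(m, 0, -1):
            if y[j - 1] == c:
                dp[j] += dp[j - 1]
    return dp[m]

def posterior_entropy(y, n):
    """Entropy (bits) of the posterior over uniform inputs of length n given output y.
    P(x | y) is proportional to the embedding count; the (1-d)^m d^(n-m) factor cancels."""
    weights = [embedding_count(y, "".join(bits)) for bits in product("01", repeat=n)]
    weights = [w for w in weights if w]
    total = sum(weights)
    return -sum((w / total) * log2(w / total) for w in weights)

# Small sanity check for n = 10 and all outputs of length 4.
n, m = 10, 4
ranked = sorted((posterior_entropy("".join(y), n), "".join(y))
                for y in product("01", repeat=m))
# Per the conjecture, the constant strings sit at the low end and the
# alternating strings at the high end of the ranking.
print(ranked[0], ranked[-1])
```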

    Clustering, Hamming Embedding, Generalized LSH and the Max Norm

    We study the convex relaxation of clustering and Hamming embedding, focusing on the asymmetric case (co-clustering and asymmetric Hamming embedding), understanding their relationship to LSH as studied by Charikar (2002) and to the max-norm ball, and the differences between their symmetric and asymmetric versions.
    Comment: 17 pages
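
    For reference, the LSH family from Charikar (2002) that the abstract relates to is sign random projection (SimHash): each hash bit is the sign of a random projection, and the expected fraction of differing bits between two sketches is proportional to the angle between the original vectors. A minimal sketch, with illustrative names and made-up dimensions (this shows the hashing scheme itself, not the paper's convex relaxation):

```python
import numpy as np

def simhash(x, hyperplanes):
    """Sign-random-projection LSH (Charikar 2002): one bit per hyperplane."""
    return (hyperplanes @ x >= 0).astype(np.uint8)

rng = np.random.default_rng(0)
d, k = 64, 256                        # data dimension, number of hash bits
H = rng.standard_normal((k, d))       # random hyperplane normals

u = rng.standard_normal(d)
v = u + 0.3 * rng.standard_normal(d)  # a nearby vector

hu, hv = simhash(u, H), simhash(v, H)
hamming = np.mean(hu != hv)
# Pr[bit differs] = angle(u, v) / pi, so the Hamming distance between the
# sketches estimates the angle between the original vectors.
angle_est = hamming * np.pi
angle_true = np.arccos(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))
print(f"estimated angle {angle_est:.3f} vs true angle {angle_true:.3f}")
```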

    On empirical methodology, constraints, and hierarchy in artificial grammar learning

    This paper considers the AGL literature from a psycholinguistic perspective. It first presents a taxonomy of the experimental familiarization test procedures used, which is followed by a consideration of shortcomings and potential improvements of the empirical methodology. It then turns to reconsidering the issue of grammar learning from the point of view of acquiring constraints, rather than the traditional AGL approach in terms of acquiring sets of rewrite rules; this is, in particular, a natural way of handling long-distance dependencies. The final section addresses an underdeveloped issue in the AGL literature, namely how to detect latent hierarchical structure in AGL response patterns.
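
    As a toy illustration of the constraints-versus-rewrite-rules contrast (not an example from the paper): a long-distance dependency such as "a string beginning with a must end with b" is a single declarative constraint, whereas a rewrite-rule grammar has to thread the dependency through its derivations with an extra non-terminal. The grammar and alphabet below are invented for the sketch.

```python
import random

# 1. Constraint-based view: one declarative condition on whole strings.
def obeys_constraint(s):
    return not s.startswith("a") or s.endswith("b")

# 2. Rewrite-rule view: the non-terminal M "remembers" the initial 'a'
#    so that the matching final 'b' is produced later in the derivation.
RULES = {
    "S": [["a", "M", "b"], ["x", "S"], []],
    "M": [["x", "M"], []],
}

def generate(symbol="S"):
    """Expand rewrite rules top-down; lower-case symbols are terminals."""
    if symbol not in RULES:
        return symbol
    return "".join(generate(s) for s in random.choice(RULES[symbol]))

random.seed(1)
samples = {generate() for _ in range(50)}
print(sorted(samples))
# Every rule-generated string satisfies the long-distance constraint.
print(all(obeys_constraint(s) for s in samples))
```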

    In search of isoglosses: continuous and discrete language embeddings in Slavic historical phonology

    This paper investigates the ability of neural network architectures to effectively learn diachronic phonological generalizations in a multilingual setting. We employ models using three different types of language embedding (dense, sigmoid, and straight-through). We find that the Straight-Through model outperforms the other two in terms of accuracy, but the Sigmoid model's language embeddings show the strongest agreement with the traditional subgrouping of the Slavic languages. We find that the Straight-Through model has learned coherent, semi-interpretable information about sound change, and we outline directions for future research.
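
    For readers unfamiliar with the straight-through trick referred to above, here is a minimal PyTorch sketch of a binary language embedding trained with a straight-through estimator: hard 0/1 features in the forward pass, gradients passed through the thresholding as if it were the identity. Class names, dimensions, and the number of languages are illustrative, not taken from the paper.

```python
import torch
import torch.nn as nn

class StraightThroughLanguageEmbedding(nn.Module):
    """Binary language embedding with a straight-through estimator:
    the forward pass uses hard 0/1 features, the backward pass treats the
    thresholding as the identity so gradients reach the underlying logits."""
    def __init__(self, num_languages, dim):
        super().__init__()
        self.logits = nn.Parameter(0.5 * torch.randn(num_languages, dim))

    def forward(self, lang_ids):
        probs = torch.sigmoid(self.logits[lang_ids])
        hard = (probs > 0.5).float()
        # Straight-through trick: hard values forward, soft gradients backward.
        return hard + probs - probs.detach()

# Hypothetical usage: 13 languages, 16 binary features per language.
emb = StraightThroughLanguageEmbedding(num_languages=13, dim=16)
vec = emb(torch.tensor([0, 5]))
print(vec.shape, vec.unique())   # values are exactly 0.0 or 1.0
```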

    Blind Reconciliation

    Information reconciliation is a crucial procedure in the classical post-processing of quantum key distribution (QKD). Poor reconciliation efficiency, revealing more information than strictly needed, may compromise the maximum attainable distance, while poor performance of the algorithm limits the practical throughput in a QKD device. Historically, reconciliation has mainly been done either with procedures that disclose close to the minimal amount of information but are heavily interactive, like Cascade, or with less efficient but also less interactive procedures (just one message is exchanged), like the ones based on low-density parity-check (LDPC) codes. The price to pay in the LDPC case is that good efficiency is only attained for very long codes and in a very narrow range centered around the quantum bit error rate (QBER) that the code was designed to reconcile, thus forcing one to keep several codes if a broad range of QBER values needs to be catered for. Real-world implementations of these methods are thus very demanding, either on computational or communication resources or both, to the extent that the latest generation of GHz-clocked QKD systems is finding a bottleneck in the classical part. In order to produce compact, high-performance and reliable QKD systems it would be highly desirable to remove these problems. Here we analyse the use of short-length LDPC codes in the information reconciliation context using a low-interactivity, blind protocol that avoids an a priori error rate estimation. We demonstrate that LDPC codes of length 2x10^3 bits are suitable for blind reconciliation. Such codes are of high interest in practice, since they can be used for hardware implementations with very high throughput.
    Comment: 22 pages, 8 figures
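
    For orientation, the sketch below shows the basic one-message, syndrome-based LDPC reconciliation step that such schemes build on: Alice reveals only the syndrome of her key and Bob decodes his key against it. It is a toy with made-up sizes and a greedy bit-flipping decoder, not the blind, low-interactivity protocol analysed in the paper; practical LDPC reconciliation uses belief-propagation decoding of much longer codes.

```python
import numpy as np

rng = np.random.default_rng(42)

# Toy sizes for illustration only (nothing like the 2x10^3-bit codes studied in the paper).
n, n_checks, col_weight, qber = 60, 30, 3, 0.05

# Random sparse parity-check matrix H over GF(2), 'col_weight' ones per column.
H = np.zeros((n_checks, n), dtype=np.uint8)
for col in range(n):
    H[rng.choice(n_checks, size=col_weight, replace=False), col] = 1

alice = rng.integers(0, 2, n, dtype=np.uint8)
errors = (rng.random(n) < qber).astype(np.uint8)
bob = alice ^ errors                       # Bob's sifted key differs from Alice's by the QBER

# One-message (non-interactive) reconciliation: Alice reveals only the syndrome of her key.
syndrome_alice = (H @ alice) % 2

# Bob flips bits of his key until his syndrome matches Alice's.
# A greedy bit-flipping decoder is used here for brevity.
est = bob.copy()
for _ in range(200):
    mismatch = ((H @ est) % 2) ^ syndrome_alice
    if not mismatch.any():
        break
    votes = H.T @ mismatch                 # number of violated checks touching each bit
    est[int(np.argmax(votes))] ^= 1        # flip the most suspicious bit

print("keys reconciled:", np.array_equal(est, alice))
```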