A note on the factorization conjecture
We give partial results on the factorization conjecture on codes proposed by
Schützenberger. We consider finite maximal codes C over the alphabet A = {a, b}
with C \cap a^* = a^p, for a prime number p. Let P, S \in Z\langle A \rangle, with S = S_0 +
S_1, supp(S_0) \subset a^* and supp(S_1) \subset a^* b supp(S_0). We prove that
if (P, S) is a factorization for C, then (P, S) is positive, that is, P and S have
coefficients 0 and 1, and we characterize the structure of these codes. As a
consequence, we prove that if C is a finite maximal code such that each word in
C has at most four occurrences of the letter b and a^p \in C, then each factorization for
C is a positive factorization. We also discuss the structure of these codes.
The results obtained show once again the relations between (positive)
factorizations and factorizations of cyclic groups.
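For context, the standard definition behind the statement above (following Schützenberger; the notation is ours, as the abstract does not restate it): identify a finite set X \subset A^* with its characteristic polynomial \underline{X} = \sum_{w \in X} w in Z\langle A \rangle. A pair (P, S) is then a factorization for the code C when

```latex
\underline{C} - 1 \;=\; \underline{P}\,(\underline{A} - 1)\,\underline{S},
```

and the factorization is positive when P and S have coefficients 0 and 1, i.e. are themselves characteristic polynomials of sets of words.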
Subfactors and Applications
The theory of subfactors connects diverse topics in mathematics
and mathematical physics such as tensor categories, vertex operator
algebras, quantum groups, quantum topology, free probability,
quantum field theory, conformal field theory,
statistical mechanics, condensed matter
physics and, of course, operator algebras.
We invited an international group of researchers from these areas,
and many fruitful interactions took place during the workshop.
The subconvexity problem for GL(2)
Generalizing and unifying prior results, we solve the subconvexity problem
for the L-functions of GL(1) and GL(2) automorphic representations
over a fixed number field, uniformly in all aspects. A novel feature of the
present method is the softness of our arguments; this is largely due to a
consistent use of canonically normalized period relations, such as those
supplied by the work of Waldspurger and Ichino--Ikeda.
Comment: Almost final version, to appear in Publ. Math. IHES. References updated.
On representation learning for generative models of text
This thesis takes baby steps in building and understanding neural representation learning systems and generative models for natural language processing. It is presented as a thesis by articles that contains four pieces of work.
In the first article, we show that multi-task learning can be used to combine the inductive biases of several self-supervised and supervised learning tasks to learn general-purpose fixed-length distributed sentence representations that achieve strong results on downstream transfer learning tasks without any model fine-tuning.
The second article builds on the first and presents a two-step generative model for text that models the distribution of sentence representations to produce novel sentence embeddings, which serve as a high-level ``neural outline'' that is reconstructed into words with a conditional autoregressive RNN decoder.
The third article studies the necessity of disentangled representations for controllable text generation. A large fraction of controllable text generation systems rely on the idea that control over a particular attribute (or style) requires building disentangled representations that separate content and style. We demonstrate that representations produced in previous work that uses domain adversarial training are not disentangled in practice. We then present an approach that does not aim to learn disentangled representations and show that it achieves significantly better results than prior work.
In the fourth article, we design transformer language models that learn representations at multiple time scales and show that these can help address the large memory footprint these models typically have. The article presents three different multi-scale architectures that exhibit favorable perplexity vs. memory-footprint trade-offs.
Efficient Data-Driven Robust Policies for Reinforcement Learning
Applying the reinforcement learning methodology to domains that involve risky decisions like medicine or robotics requires high confidence in the performance of a policy before its deployment. Markov Decision Processes (MDPs) have served as a well-established model in reinforcement learning (RL). An MDP model assumes that the exact transition probabilities and rewards are available. However, in most cases, these parameters are unknown and are typically estimated from data, which are inherently prone to errors. Consequently, due to such statistical errors, the resulting computed policy's actual performance is often different from the designer's expectation. In this context, practitioners can either be negligent and ignore parameter uncertainty during decision-making or be pessimistic by planning to be protected against the worst-case scenario. This dissertation focuses on a moderate mindset that strikes a balance between the two contradicting points of view. This objective is also known as the percentile criterion and can be modeled as risk-aversion to epistemic uncertainty. We propose several RL algorithms that efficiently compute reliable policies with limited data that notably improve the policies' performance and alleviate the computational complexity compared to standard risk-averse RL algorithms. Furthermore, we present a fast and robust feature selection method for linear value function approximation, a standard approach to solving reinforcement learning problems with large state spaces. Our experiments show that our technique is faster and more stable than alternative methods.
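The pessimistic end of the spectrum described above can be illustrated with a minimal robust value-iteration sketch. Everything here is a toy construction for illustration, not the dissertation's algorithm: the ambiguity set is an L1 ball of radius eps around the empirical transition probabilities, and the inner worst case is solved by greedily shifting probability mass toward the lowest-value successor.

```python
import numpy as np

def worst_case_value(p_hat, v, eps):
    """Worst-case expected value of v over {p in simplex : ||p - p_hat||_1 <= eps}.

    Greedy solution of the inner linear program: move up to eps/2 of
    probability mass from the highest-value successors to the single
    lowest-value successor.
    """
    p = p_hat.astype(float).copy()
    lo = int(np.argmin(v))
    add = min(eps / 2.0, 1.0 - p[lo])   # mass we can move to the worst state
    p[lo] += add
    rem = add
    for i in np.argsort(v)[::-1]:       # take the same mass from the best states
        if i == lo:
            continue
        take = min(p[i], rem)
        p[i] -= take
        rem -= take
        if rem <= 1e-12:
            break
    return float(p @ v)

def robust_value_iteration(P_hat, R, eps, gamma=0.9, iters=200):
    """Value iteration with a per-(s, a) L1 ambiguity set of radius eps.

    P_hat: (nS, nA, nS) empirical transitions; R: (nS, nA) rewards.
    eps = 0 recovers nominal (non-robust) value iteration.
    """
    nS, nA = R.shape
    v = np.zeros(nS)
    for _ in range(iters):
        q = np.array([[R[s, a] + gamma * worst_case_value(P_hat[s, a], v, eps)
                       for a in range(nA)] for s in range(nS)])
        v = q.max(axis=1)
    return v
```

With eps = 0 this is ordinary value iteration; increasing eps can only lower the computed values, which is the pessimistic extreme that the percentile criterion moderates.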
Charting Dark Matter interactions
The nature of Dark Matter (DM) is one of the most compelling problems in Fundamental Physics. It is a well-established fact that the Standard Model (SM) of particle physics and General Relativity (GR) by themselves cannot explain astrophysical and cosmological data such as galactic rotation curves, the Cosmic Microwave Background (CMB) and the distribution of structures at large scales. These data indicate the existence of a new fluid, the DM, that is: 1) collisionless, 2) cold, and 3) dominated by GR at large distances. Very few properties are known about the particles making up the DM. The two main ones are: i) the DM must interact weakly with SM particles, and ii) the DM must be stable on cosmological time scales. These two properties by themselves are too general to draw a clear picture of the Dark Sector (DS). In this Thesis we try to assess some of its properties in light of current and future experiments. The most natural possibility is for the DM to interact with the weakest of the SM forces, the electroweak (EW) force. We completely characterize this kind of DM particle, called WIMPs. After computing their masses, set by EW annihilations, we study their phenomenology at future lepton colliders and at Direct Detection (DD) experiments. The lightest WIMPs are a perfect target for realistic future lepton colliders, while to probe the heaviest ones future Xenon DD experiments are needed. The second scenario we analyze is the case in which the DM does not interact with any of the SM force mediators. In this case, an Effective Field Theory (EFT) approach is needed. We introduce a set of portal operators that have received little attention in the past. After describing a model-independent approach, we discuss bounds on the portals coming from high-intensity experiments, such as neutrino experiments at Fermilab (e.g. DUNE). These are competitive with current constraints. The last possibility is the case in which even portals are absent.
In this scenario, the clustering of both species during the evolution of the Universe can provide a window on the nature of the DM. We focus on models in which the DM has a long-range self-interaction mediated by a light scalar. We study the evolution of inhomogeneities, and compare the predicted CMB anisotropies and galaxy power spectra with current and future data (such as from Euclid), setting strong bounds on the strength of the self-interaction. Finally, we comment on how theoretical insights on DM stability can constrain DM model building.
Conditionals and modularity in general logics
In this work in progress, we discuss independence, interpolation, and
related topics for classical, modal, and non-monotonic logics.
The subconvexity problem for GL(2)
Generalizing and unifying prior results, we solve the subconvexity problem for the L-functions of GL(1) and GL(2) automorphic representations over a fixed number field, uniformly in all aspects. A novel feature of the present method is the softness of our arguments; this is largely due to a consistent use of canonically normalized period relations, such as those supplied by the work of Waldspurger and Ichino--Ikeda.