25 research outputs found

    Amortizing intractable inference in large language models

    Full text link
    Autoregressive large language models (LLMs) compress knowledge from their training data through next-token conditional distributions. This limits tractable querying of this knowledge to start-to-end autoregressive sampling. However, many tasks of interest -- including sequence continuation, infilling, and other forms of constrained generation -- involve sampling from intractable posterior distributions. We address this limitation by using amortized Bayesian inference to sample from these intractable posteriors. Such amortization is algorithmically achieved by fine-tuning LLMs via diversity-seeking reinforcement learning algorithms: generative flow networks (GFlowNets). We empirically demonstrate that this distribution-matching paradigm of LLM fine-tuning can serve as an effective alternative to maximum-likelihood training and reward-maximizing policy optimization. As an important application, we interpret chain-of-thought reasoning as a latent variable modeling problem and demonstrate that our approach enables data-efficient adaptation of LLMs to tasks that require multi-step rationalization and tool use. Comment: 23 pages; code: https://github.com/GFNOrg/gfn-lm-tunin
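
    The core algorithmic idea above is to fine-tune the LLM so that it samples sequences in proportion to a reward (an unnormalized posterior) rather than to maximize that reward. A common GFlowNet objective for this is trajectory balance; the minimal PyTorch-style sketch below illustrates it, with the model interface and log_reward left as placeholder assumptions rather than code from the linked repository.

        # Sketch of the trajectory-balance (TB) objective used for GFlowNet-style
        # fine-tuning. `log_reward` and the way logits/tokens are obtained from the
        # LLM are assumptions for illustration, not the authors' implementation.
        import torch

        def trajectory_balance_loss(step_logits, sampled_ids, log_reward, log_Z):
            """TB loss for one sampled sequence (trajectory).

            step_logits: [T, vocab] logits produced at each generation step
            sampled_ids: [T] token ids that were actually sampled
            log_reward:  scalar log R(x), e.g. an unnormalized log-posterior
            log_Z:       learned scalar estimate of the log partition function
            """
            log_probs = torch.log_softmax(step_logits, dim=-1)
            # log P_F(trajectory): sum of per-token log-probabilities under the policy
            log_pf = log_probs.gather(1, sampled_ids.unsqueeze(1)).sum()
            # (log Z + log P_F - log R)^2 drives the policy toward sampling x with
            # probability proportional to R(x)
            return (log_Z + log_pf - log_reward) ** 2

    Minimizing this loss over sampled trajectories, backpropagating into both the LLM parameters and log_Z, yields a sampler whose output distribution matches the reward; this is what allows chain-of-thought rationales to be treated as latent variables with a posterior to sample from.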

    Consciousness in Artificial Intelligence: Insights from the Science of Consciousness

    Full text link
    Whether current or near-term AI systems could be conscious is a topic of scientific interest and increasing public concern. This report argues for, and exemplifies, a rigorous and empirically grounded approach to AI consciousness: assessing existing AI systems in detail, in light of our best-supported neuroscientific theories of consciousness. We survey several prominent scientific theories of consciousness, including recurrent processing theory, global workspace theory, higher-order theories, predictive processing, and attention schema theory. From these theories we derive "indicator properties" of consciousness, elucidated in computational terms that allow us to assess AI systems for these properties. We use these indicator properties to assess several recent AI systems, and we discuss how future systems might implement them. Our analysis suggests that no current AI systems are conscious, but also suggests that there are no obvious technical barriers to building AI systems which satisfy these indicators.

    Kinematical analysis of the nutation speed reducer

    Get PDF
    This paper discusses the development of a Nutating Speed Reducer (NSR), which is characterized by a high reduction ratio, high tooth contact ratio, very high torque-to-weight/volume ratio, quiet and smooth operation under load, and very high efficiency. All of these advantages are due to the presence of conjugate face-gear pairs that mesh with each other in what is called a nutating/rotating gear mechanism. Details of the NSR, its kinematics, gear tooth load capacity, and mesh efficiency are explained. The component speeds and speed reduction ratios of the NSR are calculated. The effect of varying nutation angles on the geometry of the NSR is discussed and compared.
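
    To make the notion of reduction ratio concrete, the sketch below computes the input-to-output ratio for a generic compound nutating face-gear train with a stationary gear (Z1), two gears on the nutating member (Z2, Z3), and an output gear (Z4). Both the layout and the formula are illustrative assumptions about a typical nutating drive, not equations or tooth counts taken from this paper.

        # Illustrative reduction ratio for an assumed compound nutating face-gear train:
        # fixed gear Z1 meshes with nutating gear Z2, and nutating gear Z3 meshes with
        # output gear Z4. The formula below is the usual compound-train form and is an
        # assumption here, not the paper's derivation.

        def nutating_reduction_ratio(z1: int, z2: int, z3: int, z4: int) -> float:
            """Overall input:output reduction ratio of the assumed gear train."""
            denominator = z2 * z4 - z1 * z3
            if denominator == 0:
                raise ValueError("Z2*Z4 must differ from Z1*Z3 for a finite ratio")
            return (z2 * z4) / denominator

        # Nearly equal tooth-count products give a very large single-stage reduction:
        print(nutating_reduction_ratio(z1=35, z2=36, z3=41, z4=40))  # 1440 / 5 = 288.0

    The example shows why such drives can achieve high reduction ratios in a single stage: the ratio is set by the small mismatch between the two meshes rather than by the overall size of the gears.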

    Mythe de la femme chez Vigny [The myth of woman in Vigny]

    No full text

    Relationship between model eigenspectra and encoding performance.

    No full text
    Each curve shows the eigenspectrum of one layer from one DNN. The x-axis is the index of principal components, sorted in decreasing order of variance, and the y-axis is the variance along each principal component (scaled by a constant in order to align all curves for comparison). The black line is a reference for a power law function that decays as 1/i, where i is the principal component index. This 1/i power law was hypothesized in [40] to be a theoretical upper limit on the latent dimensionality of smooth representations. a: Eigenspectra are color-coded as a function of the corresponding encoding performance each model achieved. Models with more slowly decaying eigenspectra (i.e., higher latent dimensionality) are better predictors of cortical activity, with top-performing models approaching the theoretical upper bound on dimensionality proposed in [40]. Encoding performance is for the IT brain region, which had the strongest relationship with ED among regions we considered. b: Eigenspectra are color-coded as a function of their corresponding ED. Since ED is a summary statistic of an eigenspectrum meant to quantify its rate of decay, models with more slowly decaying eigenspectra tend to have higher ED.
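
    Two quantities underlie this figure: the eigenspectrum of a layer's feature matrix and its effective dimensionality (ED). The sketch below computes both, taking ED to be the participation ratio of the eigenvalues, a standard summary of how quickly the spectrum decays; whether that matches the paper's exact definition is an assumption.

        # Eigenspectrum of a (stimuli x units) feature matrix and an effective
        # dimensionality (ED) summary. ED is computed as the participation ratio,
        # a common definition assumed here for illustration.
        import numpy as np

        def eigenspectrum(features: np.ndarray) -> np.ndarray:
            """Variance along each principal component, sorted in decreasing order."""
            centered = features - features.mean(axis=0, keepdims=True)
            cov = centered.T @ centered / (centered.shape[0] - 1)
            eigvals = np.linalg.eigvalsh(cov)[::-1]
            return np.clip(eigvals, 0.0, None)

        def effective_dimensionality(eigvals: np.ndarray) -> float:
            """Participation ratio: (sum of eigenvalues)^2 / sum of squared eigenvalues."""
            return float(eigvals.sum() ** 2 / np.square(eigvals).sum())

        # A spectrum decaying like the 1/i reference line in the figure:
        reference = 1.0 / np.arange(1, 1001)
        print(effective_dimensionality(reference))

    Slowly decaying spectra spread variance over many components and therefore yield a large participation ratio, which is the sense in which higher ED corresponds to higher latent dimensionality in panel b.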

    ED and representational similarity analysis.

    No full text
    Comparing ED to neural data using representational similarity analysis instead of encoding performance. (PDF)
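
    For context, representational similarity analysis compares a model to neural data by correlating their representational dissimilarity matrices (RDMs) rather than by fitting an encoding model. The sketch below is the generic recipe (correlation-distance RDMs compared with a Spearman correlation) and is an assumption about the analysis, not the supplementary pipeline itself.

        # Generic RSA comparison: build condensed RDMs for model features and neural
        # responses over the same stimuli, then correlate them. The distance metric
        # and rank correlation are common defaults, assumed for illustration.
        import numpy as np
        from scipy.spatial.distance import pdist
        from scipy.stats import spearmanr

        def rdm(responses: np.ndarray) -> np.ndarray:
            """Condensed pairwise correlation-distance matrix for (stimuli x features)."""
            return pdist(responses, metric="correlation")

        def rsa_score(model_features: np.ndarray, neural_responses: np.ndarray) -> float:
            """Spearman correlation between the model RDM and the neural RDM."""
            rho, _ = spearmanr(rdm(model_features), rdm(neural_responses))
            return float(rho)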

    Other correlates of DNN encoding model performance.

    No full text
    Comparison of encoding performance to DNN features other than ED, such as classification performance and other geometric statistics of the representations. (PDF)

    Theory of latent dimensionality and encoding performance.

    No full text
    Detailed description of the theory of ED and encoding performance sketched out in Section Dimensionality and alignment in computational brain models. (PDF)

    Additional simulation results.

    No full text
    Results for additional analyses that we conducted by varying parameters of the simulations described in Section Dimensionality and alignment in computational brain models. (PDF)

    Additional analyses of ED and encoding performance.

    No full text
    Analyses that examine the relationship between ED and encoding performance under different conditions from those described in the main paper. (PDF)