Cognitive network science reveals bias in GPT-3, ChatGPT, and GPT-4 mirroring math anxiety in high-school students
Large language models are becoming increasingly integrated into our lives.
Hence, it is important to understand the biases present in their outputs in
order to avoid perpetuating harmful stereotypes, which originate in our own
flawed ways of thinking. This challenge requires developing new benchmarks and
methods for quantifying affective and semantic bias, keeping in mind that LLMs
act as psycho-social mirrors that reflect the views and tendencies that are
prevalent in society. One such harmful tendency is the global phenomenon of
anxiety toward math and STEM subjects. Here, we
investigate perceptions of math and STEM fields provided by cutting-edge
language models, namely GPT-3, ChatGPT, and GPT-4, by applying an approach
from network science and cognitive psychology. Specifically, we use behavioral
forma mentis networks (BFMNs) to understand how these LLMs frame math and STEM
disciplines in relation to other concepts. We use data obtained by probing the
three LLMs in a language generation task that has previously been applied to
humans. Our findings indicate that LLMs have an overall negative perception of
math and STEM fields, with math being perceived most negatively. We observe
significant differences across the three LLMs: newer versions (i.e. GPT-4)
produce richer, more complex, and less negative perceptions than older
versions and than a sample of N=159 high-school students. These
findings suggest that advances in the architecture of LLMs may lead to
increasingly less biased models that could even perhaps someday aid in reducing
harmful stereotypes in society rather than perpetuating them.

Comment: 23 pages, 8 figures
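The network probing described above can be sketched in a few lines. The sketch below is illustrative, not the paper's pipeline: the association lists and valence scores are made up, standing in for LLM free-association responses and human valence ratings.

```python
from collections import defaultdict

# Hypothetical free-association data: cue -> associated concepts,
# plus a valence score per concept in [-1, 1] (e.g. from human ratings).
associations = {
    "math": ["anxiety", "numbers", "logic", "hard"],
    "physics": ["math", "energy", "experiment"],
    "art": ["beauty", "creativity"],
}
valence = {
    "anxiety": -0.8, "numbers": 0.1, "logic": 0.4, "hard": -0.5,
    "math": -0.2, "energy": 0.5, "experiment": 0.3,
    "beauty": 0.9, "creativity": 0.8, "physics": 0.2, "art": 0.6,
}

# Build an undirected association network as an adjacency map:
# nodes are concepts, edges link cues to their responses.
graph = defaultdict(set)
for cue, responses in associations.items():
    for r in responses:
        graph[cue].add(r)
        graph[r].add(cue)

def neighborhood_valence(concept):
    """Mean valence of the concepts directly associated with `concept`."""
    nbrs = graph[concept]
    return sum(valence[n] for n in nbrs) / len(nbrs)

print(round(neighborhood_valence("math"), 3))  # -0.12: "math" sits in a
                                               # negative semantic frame
```

In a BFMN analysis, a concept's perceived affect is read off from the valence of its network neighborhood, which is how a framing like "math is associated with anxiety" becomes quantifiable.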
Information Geometry
This Special Issue of the journal Entropy, titled “Information Geometry I”, contains a collection of 17 papers concerning the foundations and applications of information geometry. Based on a geometrical interpretation of probability, information geometry has become a rich mathematical field employing the methods of differential geometry. It has numerous applications to data science, physics, and neuroscience. Presenting original research, yet written in an accessible, tutorial style, this collection of papers will be useful for scientists who are new to the field, while providing an excellent reference for the more experienced researcher. Several papers are written by authorities in the field, and topics cover the foundations of information geometry, as well as applications to statistics, Bayesian inference, machine learning, complex systems, physics, and neuroscience.
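As a concrete anchor for readers new to the field: the central object of information geometry is the Fisher information metric, which turns a parametric family of distributions into a Riemannian manifold:

```latex
g_{ij}(\theta) \;=\; \mathbb{E}_{x \sim p(\cdot\,;\theta)}\!\left[
  \frac{\partial \log p(x;\theta)}{\partial \theta_i}\,
  \frac{\partial \log p(x;\theta)}{\partial \theta_j}
\right].
```

For the univariate Gaussian family parameterized by $(\mu, \sigma)$, this gives $g = \mathrm{diag}(1/\sigma^2,\; 2/\sigma^2)$, a metric of constant negative curvature; this is a standard worked example in the field.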
Observability and Controllability of Nonlinear Networks: The Role of Symmetry
Observability and controllability are essential concepts in the design of
predictive observer models and feedback controllers for networked systems. For
example, noncontrollable mathematical models of real systems have subspaces
that influence model behavior, but cannot be controlled by an input. Such
subspaces can be difficult to determine in complex nonlinear networks. Since
almost all of the present theory was developed for linear networks without
symmetries, here we present a numerical and group-representational framework
to quantify the observability and controllability of nonlinear networks with
explicit symmetries, one that makes explicit the connection between symmetries
and nonlinear measures of observability and controllability. We numerically observe and
theoretically predict that not all symmetries have the same effect on network
observation and control. Our analysis shows that the presence of symmetry in a
network may decrease observability and controllability, although networks
containing only rotational symmetries remain controllable and observable. These
results alter our view of the nature of observability and controllability in
complex networks, change our understanding of structural controllability, and
affect the design of mathematical models to observe and control such networks.

Comment: 19 pages, 9 figures
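The classical linear baseline this work extends is Kalman's rank criterion. A minimal numerical sketch (the matrices are illustrative, not from the paper) already shows the abstract's headline effect: a symmetry that makes two nodes exchangeable destroys controllability.

```python
import numpy as np

def controllability_matrix(A, B):
    """Kalman controllability matrix [B, AB, ..., A^{n-1}B]."""
    n = A.shape[0]
    blocks = [B]
    for _ in range(n - 1):
        blocks.append(A @ blocks[-1])
    return np.hstack(blocks)

def is_controllable(A, B):
    """Kalman rank criterion: controllable iff the matrix has full rank."""
    return bool(np.linalg.matrix_rank(controllability_matrix(A, B)) == A.shape[0])

# A path network 0-1-2 driven only at node 0: controllable.
A = np.array([[0., 1., 0.],
              [1., 0., 1.],
              [0., 1., 0.]])
B = np.array([[1.], [0.], [0.]])
print(is_controllable(A, B))   # True

# A star driven at the hub: the two leaves are exchangeable under a
# reflection symmetry, so their antisymmetric mode cannot be steered.
A2 = np.array([[0., 1., 1.],
               [1., 0., 0.],
               [1., 0., 0.]])
B2 = np.array([[1.], [0.], [0.]])
print(is_controllable(A2, B2))  # False
```

Observability is the dual test on (Aᵀ, Cᵀ); the paper's contribution is extending this picture to nonlinear networks with explicit symmetry groups.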
Modelling and predicting distribution-valued fields with applications to inversion under uncertainty
Capturing the dependence between a random response and predictors is a fundamental task in statistics and stochastic modelling. The focus of this work is on density regression, which entails estimating response distributions given predictor values. It enables the derivation of various statistical quantities, including the conditional mean, threshold exceedance probabilities, and quantiles.
This thesis presents a flexible approach, based upon the class of so-called Spatial Logistic Gaussian Processes (SLGPs). The SLGP framework utilizes a well-behaved latent Gaussian Process that undergoes a non-linear transformation, resulting in a class of models suitable for capturing spatially-dependent probability measures. SLGP models overcome limitations associated with strong distributional assumptions (e.g. shape constraints, log-concavity, Gaussianity), varying sample sizes, and changes in target density shapes and modalities.
The first part of this work is dedicated to the development of SLGP models and gaining a deep understanding of the associated mathematical concepts. We introduce SLGPs from the perspective of random measures and their densities, and investigate links between properties of SLGPs and underlying processes. We show that SLGP models can be characterized by their log-increments and leverage this characterization to establish theoretical results with a main focus on spatial regularity.
We then focus on applicability of our approach, and propose an implementation relying on finite rank Gaussian Processes. We demonstrate it on synthetic examples and on temperature distributions at meteorological stations.
Finally, we address the potential of SLGPs for statistical inference, focusing on stochastic optimization and stochastic inverse problems. Notably, for inverse problems, an Approximate Bayesian Computation (ABC) framework is introduced, leveraging SLGP-surrogated likelihoods to accommodate situations with limited to moderate data. This methodology, inspired by GP-ABC methods, harnesses the probabilistic nature of SLGPs to guide data acquisition, thereby facilitating accelerated inference. We illustrate these approaches on synthetic examples as well as on a hydrogeological inverse problem in which a contaminant source is sought under an uncertain geological scenario.
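The spatial logistic construction can be sketched numerically: exponentiate a latent field and normalize over the response variable, so that each predictor value x indexes its own probability density. The finite-rank latent field below (a trigonometric basis with random weights) is a hand-rolled stand-in for the thesis's Gaussian Process, not its actual implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

t = np.linspace(0.0, 1.0, 201)  # grid over the response variable

def latent(x, weights):
    """Finite-rank latent field Z(x, t), a stand-in for a latent GP:
    trigonometric basis in t, modulated by the predictor x."""
    basis = np.stack([np.sin((k + 1) * np.pi * t) * x**k
                      for k in range(len(weights))])
    return weights @ basis

w = rng.normal(size=4)
dt = t[1] - t[0]
for x in (0.2, 0.8):
    z = latent(x, w)
    # Spatial logistic transform: p(t | x) = exp(Z(x,t)) / \int exp(Z(x,s)) ds
    p = np.exp(z - z.max())
    p /= p.sum() * dt  # normalize to a density over t
    print(f"x={x}: mode at t={t[p.argmax()]:.2f}")
```

The key property this illustrates is that the output is a full conditional density p(t | x), from which conditional means, quantiles, and exceedance probabilities all follow by integration.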
Accelerating Materials Discovery with Machine Learning
As we enter the data age, ever-increasing amounts of human knowledge are being recorded in machine-readable formats.
This has opened up new opportunities to leverage data to accelerate scientific discovery.
This thesis focuses on how we can use historical and computational data to aid the discovery and development of new materials.
We begin by looking at a traditional materials informatics task -- elucidating the structure-function relationships of high-temperature cuprate superconductors.
One of the most significant challenges for materials informatics is the limited availability of relevant data.
We propose a simple calibration-based approach to estimate the apical and in-plane copper-oxygen distances from more readily available lattice parameter data to address this challenge for cuprate superconductors.
Our investigation uncovers a large, unexplored region of materials space that may yield cuprates with higher critical temperatures.
We propose two experimental avenues that may enable this region to be accessed.
Computational materials exploration is bottlenecked by our ability to provide input structures to feed our workflows.
Whilst ab initio structure identification is possible, it is computationally burdensome, and we lack design rules for deciding where to target searches in high-throughput setups.
To address this, there is a need to develop tools that suggest promising candidates, enabling automated deployment and increased efficiency.
Machine learning models are well suited to this task; however, current approaches typically use hand-engineered inputs.
This means that their performance is circumscribed by the intuitions reflected in the chosen inputs.
We propose a novel way to formulate the machine learning task as a set regression problem over the elements in a material.
We show that our approach leads to higher sample efficiency than other well-established composition-based approaches.
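The set-regression idea can be sketched as follows; the element embeddings and readout here are random placeholders (in the thesis they would be learned end-to-end), but the sketch shows the key structural property: the representation is invariant to the order of elements in a composition.

```python
import numpy as np

rng = np.random.default_rng(0)

# Placeholder learned element embeddings (4-dimensional, random here).
emb = {el: rng.normal(size=4) for el in ["Y", "Ba", "Cu", "O"]}

def encode(composition):
    """Permutation-invariant set encoding of a composition:
    stoichiometry-weighted mean pooling of element embeddings."""
    total = sum(composition.values())
    return sum((n / total) * emb[el] for el, n in composition.items())

# YBa2Cu3O7 in any element order maps to the same representation.
ybco = {"Y": 1, "Ba": 2, "Cu": 3, "O": 7}
ybco_shuffled = {"O": 7, "Cu": 3, "Y": 1, "Ba": 2}
print(np.allclose(encode(ybco), encode(ybco_shuffled)))  # True

# A readout on the pooled representation yields a property estimate.
readout = rng.normal(size=4)
print(float(readout @ encode(ybco)))
```

Treating a material as a set of (element, fraction) pairs, rather than a fixed-length hand-engineered feature vector, is what frees the model from the intuitions baked into chosen descriptors.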
Having demonstrated the ability of machine learning to aid in the selection of promising compound compositions, we next explore how useful machine learning might be for identifying fabrication routes.
Using a recently released data-mined data set of solid-state synthesis reactions, we design a two-stage model to predict the products of inorganic reactions.
We critically explore the performance of this model, showing that whilst the predictions fall short of the accuracy required to be chemically discriminative, the model provides valuable insights into understanding inorganic reactions.
Through careful investigation of the model's failure modes, we explore the challenges that remain in the construction of forward inorganic reaction prediction models and suggest some pathways to tackle the identified issues.
One of the principal ways that material scientists understand and categorise materials is in terms of their symmetries.
Crystal structure prototypes are assigned based on the presence of symmetrically equivalent sites known as Wyckoff positions.
We show that a powerful coarse-grained representation of materials structures can be constructed from the Wyckoff positions by discarding information about their coordinates within crystal structures.
One of the strengths of this representation is that it maintains the ability of structure-based methods to distinguish polymorphs whilst also allowing combinatorial enumeration akin to composition-based approaches.
We construct an end-to-end differentiable model that takes our proposed Wyckoff representation as input.
The performance of this approach is examined on a suite of materials discovery experiments showing that it leads to strong levels of enrichment in materials discovery tasks.
The research presented in this thesis highlights the promise of applying data-driven workflows and machine learning in materials discovery and development.
This thesis concludes by speculating about promising research directions for applying machine learning within materials discovery.
A Rainbow in Deep Network Black Boxes
We introduce rainbow networks as a probabilistic model of trained deep neural
networks. The model cascades random feature maps whose weight distributions are
learned. It assumes that dependencies between weights at different layers are
reduced to rotations which align the input activations. Neuron weights within a
layer are independent after this alignment. Their activations define kernels
which become deterministic in the infinite-width limit. This is verified
numerically for ResNets trained on the ImageNet dataset. We also show that the
learned weight distributions have low-rank covariances. Rainbow networks thus
alternate between linear dimension reductions and non-linear high-dimensional
embeddings with white random features. Gaussian rainbow networks are defined
with Gaussian weight distributions. These models are validated numerically on
image classification on the CIFAR-10 dataset, with wavelet scattering networks.
We further show that during training, SGD updates the weight covariances while
mostly preserving the Gaussian initialization.

Comment: 56 pages, 10 figures
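A Gaussian rainbow layer, as described above, can be sketched in a few lines; the dimensions and covariance below are illustrative, not taken from the paper. Sampling neuron weights i.i.d. from N(0, C) with a learned low-rank covariance C is equivalent to a linear dimension reduction followed by white random features.

```python
import numpy as np

rng = np.random.default_rng(0)
d_in, width, rank = 16, 64, 4

# "Learned" low-rank weight covariance C = U U^T + eps*I (random here).
U = rng.normal(size=(d_in, rank)) / np.sqrt(rank)
C = U @ U.T + 1e-3 * np.eye(d_in)

# Sample neuron weights i.i.d. from N(0, C). Writing W = L @ W_white with
# C = L L^T makes the structure explicit: a linear reduction L applied to
# white random features W_white.
L = np.linalg.cholesky(C)
W_white = rng.normal(size=(d_in, width))
W = L @ W_white  # columns are the neuron weight vectors

def layer(x):
    """Non-linear random-feature map with learned weight covariance."""
    return np.maximum(W.T @ x, 0.0)  # ReLU activations

x = rng.normal(size=d_in)
print(layer(x).shape)  # (64,)
```

In the infinite-width limit the abstract describes, the empirical kernel induced by such a layer concentrates around a deterministic kernel determined by C.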