Cognitive network science reveals bias in GPT-3, ChatGPT, and GPT-4 mirroring math anxiety in high-school students
Large language models are becoming increasingly integrated into our lives.
Hence, it is important to understand the biases present in their outputs in
order to avoid perpetuating harmful stereotypes, which originate in our own
flawed ways of thinking. This challenge requires developing new benchmarks and
methods for quantifying affective and semantic bias, keeping in mind that LLMs
act as psycho-social mirrors that reflect the views and tendencies that are
prevalent in society. One such harmful tendency is the global phenomenon of
anxiety toward math and STEM subjects. Here, we
investigate perceptions of math and STEM fields provided by cutting-edge
language models, namely GPT-3, ChatGPT, and GPT-4, by applying an approach
from network science and cognitive psychology. Specifically, we use behavioral
forma mentis networks (BFMNs) to understand how these LLMs frame math and STEM
disciplines in relation to other concepts. We use data obtained by probing the
three LLMs in a language generation task that has previously been applied to
humans. Our findings indicate that LLMs have an overall negative perception of
math and STEM fields, with math being perceived most negatively. We observe
significant differences across the three LLMs: newer versions (i.e. GPT-4)
produce richer, more complex, and less negative perceptions than older
versions and than a sample of N=159 high-school students. These
findings suggest that advances in the architecture of LLMs may lead to
increasingly less biased models that could even perhaps someday aid in reducing
harmful stereotypes in society rather than perpetuating them.

Comment: 23 pages, 8 figures
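The network probing described above can be sketched in a few lines. The sketch below is illustrative, not the paper's pipeline: the association lists and valence scores are made up, standing in for LLM free-association responses and human valence ratings.

```python
from collections import defaultdict

# Hypothetical free-association data: cue -> associated concepts,
# plus a valence score per concept in [-1, 1] (e.g. from human ratings).
associations = {
    "math": ["anxiety", "numbers", "logic", "hard"],
    "physics": ["math", "energy", "experiment"],
    "art": ["beauty", "creativity"],
}
valence = {
    "anxiety": -0.8, "numbers": 0.1, "logic": 0.4, "hard": -0.5,
    "math": -0.2, "energy": 0.5, "experiment": 0.3,
    "beauty": 0.9, "creativity": 0.8, "physics": 0.2, "art": 0.6,
}

# Build an undirected association network as an adjacency map:
# nodes are concepts, edges link cues to their responses.
graph = defaultdict(set)
for cue, responses in associations.items():
    for r in responses:
        graph[cue].add(r)
        graph[r].add(cue)

def neighborhood_valence(concept):
    """Mean valence of the concepts directly associated with `concept`."""
    nbrs = graph[concept]
    return sum(valence[n] for n in nbrs) / len(nbrs)

print(round(neighborhood_valence("math"), 3))  # -0.12: "math" sits in a
                                               # negative semantic frame
```

In a BFMN analysis, a concept's perceived affect is read off from the valence of its network neighborhood, which is how a framing like "math is associated with anxiety" becomes quantifiable.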
Information Geometry
This Special Issue of the journal Entropy, titled “Information Geometry I”, contains a collection of 17 papers concerning the foundations and applications of information geometry. Based on a geometrical interpretation of probability, information geometry has become a rich mathematical field employing the methods of differential geometry. It has numerous applications to data science, physics, and neuroscience. Presenting original research, yet written in an accessible, tutorial style, this collection of papers will be useful for scientists who are new to the field, while providing an excellent reference for the more experienced researcher. Several papers are written by authorities in the field, and topics cover the foundations of information geometry, as well as applications to statistics, Bayesian inference, machine learning, complex systems, physics, and neuroscience.
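As a concrete anchor for readers new to the field: the central object of information geometry is the Fisher information metric, which turns a parametric family of distributions into a Riemannian manifold:

```latex
g_{ij}(\theta) \;=\; \mathbb{E}_{x \sim p(\cdot\,;\theta)}\!\left[
  \frac{\partial \log p(x;\theta)}{\partial \theta_i}\,
  \frac{\partial \log p(x;\theta)}{\partial \theta_j}
\right].
```

For the univariate Gaussian family parameterized by $(\mu, \sigma)$, this gives $g = \mathrm{diag}(1/\sigma^2,\; 2/\sigma^2)$, a metric of constant negative curvature; this is a standard worked example in the field.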
Observability and Controllability of Nonlinear Networks: The Role of Symmetry
Observability and controllability are essential concepts in the design of
predictive observer models and feedback controllers for networked systems. For
example, noncontrollable mathematical models of real systems have subspaces
that influence model behavior, but cannot be controlled by an input. Such
subspaces can be difficult to determine in complex nonlinear networks. Since
almost all of the present theory was developed for linear networks without
symmetries, here we present a numerical and group-representational framework
to quantify the observability and controllability of nonlinear networks with
explicit symmetries, one that makes explicit the connection between symmetries
and nonlinear measures of observability and controllability. We numerically observe and
theoretically predict that not all symmetries have the same effect on network
observation and control. Our analysis shows that the presence of symmetry in a
network may decrease observability and controllability, although networks
containing only rotational symmetries remain controllable and observable. These
results alter our view of the nature of observability and controllability in
complex networks, change our understanding of structural controllability, and
affect the design of mathematical models to observe and control such networks.

Comment: 19 pages, 9 figures
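The classical linear baseline this work extends is Kalman's rank criterion. A minimal numerical sketch (the matrices are illustrative, not from the paper) already shows the abstract's headline effect: a symmetry that makes two nodes exchangeable destroys controllability.

```python
import numpy as np

def controllability_matrix(A, B):
    """Kalman controllability matrix [B, AB, ..., A^{n-1}B]."""
    n = A.shape[0]
    blocks = [B]
    for _ in range(n - 1):
        blocks.append(A @ blocks[-1])
    return np.hstack(blocks)

def is_controllable(A, B):
    """Kalman rank criterion: controllable iff the matrix has full rank."""
    return bool(np.linalg.matrix_rank(controllability_matrix(A, B)) == A.shape[0])

# A path network 0-1-2 driven only at node 0: controllable.
A = np.array([[0., 1., 0.],
              [1., 0., 1.],
              [0., 1., 0.]])
B = np.array([[1.], [0.], [0.]])
print(is_controllable(A, B))   # True

# A star driven at the hub: the two leaves are exchangeable under a
# reflection symmetry, so their antisymmetric mode cannot be steered.
A2 = np.array([[0., 1., 1.],
               [1., 0., 0.],
               [1., 0., 0.]])
B2 = np.array([[1.], [0.], [0.]])
print(is_controllable(A2, B2))  # False
```

Observability is the dual test on (Aᵀ, Cᵀ); the paper's contribution is extending this picture to nonlinear networks with explicit symmetry groups.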
Modelling and predicting distribution-valued fields with applications to inversion under uncertainty
Capturing the dependence between a random response and predictors is a fundamental task in statistics and stochastic modelling. The focus of this work is on density regression, which entails estimating response distributions given predictor values. It enables the derivation of various statistical quantities, including the conditional mean, threshold exceedance probabilities, and quantiles.
This thesis presents a flexible approach, based upon the class of so-called Spatial Logistic Gaussian Processes (SLGPs). The SLGP framework utilizes a well-behaved latent Gaussian Process that undergoes a non-linear transformation, resulting in a class of models suitable for capturing spatially-dependent probability measures. SLGP models overcome limitations associated with strong distributional assumptions (e.g. shape constraints, log-concavity, Gaussianity), varying sample sizes, and changes in target density shapes and modalities.
The first part of this work is dedicated to the development of SLGP models and gaining a deep understanding of the associated mathematical concepts. We introduce SLGPs from the perspective of random measures and their densities, and investigate links between properties of SLGPs and underlying processes. We show that SLGP models can be characterized by their log-increments and leverage this characterization to establish theoretical results with a main focus on spatial regularity.
We then focus on applicability of our approach, and propose an implementation relying on finite rank Gaussian Processes. We demonstrate it on synthetic examples and on temperature distributions at meteorological stations.
Finally, we address the potential of SLGPs for statistical inference, focusing on stochastic optimization and stochastic inverse problems. Notably, for inverse problems, an Approximate Bayesian Computation (ABC) framework is introduced, leveraging SLGP-surrogated likelihoods to accommodate situations with limited to moderate data. This methodology, inspired by GP-ABC methods, harnesses the probabilistic nature of SLGPs to guide data acquisition, thereby facilitating accelerated inference. We illustrate these approaches on synthetic examples as well as on a hydrogeological inverse problem in which a contaminant source is sought under an uncertain geological scenario.
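The spatial logistic construction can be sketched numerically: exponentiate a latent field and normalize over the response variable, so that each predictor value x indexes its own probability density. The finite-rank latent field below (a trigonometric basis with random weights) is a hand-rolled stand-in for the thesis's Gaussian Process, not its actual implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

t = np.linspace(0.0, 1.0, 201)  # grid over the response variable

def latent(x, weights):
    """Finite-rank latent field Z(x, t), a stand-in for a latent GP:
    trigonometric basis in t, modulated by the predictor x."""
    basis = np.stack([np.sin((k + 1) * np.pi * t) * x**k
                      for k in range(len(weights))])
    return weights @ basis

w = rng.normal(size=4)
dt = t[1] - t[0]
for x in (0.2, 0.8):
    z = latent(x, w)
    # Spatial logistic transform: p(t | x) = exp(Z(x,t)) / \int exp(Z(x,s)) ds
    p = np.exp(z - z.max())
    p /= p.sum() * dt  # normalize to a density over t
    print(f"x={x}: mode at t={t[p.argmax()]:.2f}")
```

The key property this illustrates is that the output is a full conditional density p(t | x), from which conditional means, quantiles, and exceedance probabilities all follow by integration.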
Accelerating Materials Discovery with Machine Learning
As we enter the data age, ever-increasing amounts of human knowledge are being recorded in machine-readable formats.
This has opened up new opportunities to leverage data to accelerate scientific discovery.
This thesis focuses on how we can use historical and computational data to aid the discovery and development of new materials.
We begin by looking at a traditional materials informatics task -- elucidating the structure-function relationships of high-temperature cuprate superconductors.
One of the most significant challenges for materials informatics is the limited availability of relevant data.
We propose a simple calibration-based approach to estimate the apical and in-plane copper-oxygen distances from more readily available lattice parameter data to address this challenge for cuprate superconductors.
Our investigation uncovers a large, unexplored region of materials space that may yield cuprates with higher critical temperatures.
We propose two experimental avenues that may enable this region to be accessed.
Computational materials exploration is bottlenecked by our ability to provide input structures to feed our workflows.
Whilst ab initio structure identification is possible, it is computationally burdensome, and we lack design rules for deciding where to target searches in high-throughput setups.
To address this, there is a need to develop tools that suggest promising candidates, enabling automated deployment and increased efficiency.
Machine learning models are well suited to this task; however, current approaches typically use hand-engineered inputs.
This means that their performance is circumscribed by the intuitions reflected in the chosen inputs.
We propose a novel way to formulate the machine learning task as a set regression problem over the elements in a material.
We show that our approach leads to higher sample efficiency than other well-established composition-based approaches.
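The set-regression idea can be sketched as follows; the element embeddings and readout here are random placeholders (in the thesis they would be learned end-to-end), but the sketch shows the key structural property: the representation is invariant to the order of elements in a composition.

```python
import numpy as np

rng = np.random.default_rng(0)

# Placeholder learned element embeddings (4-dimensional, random here).
emb = {el: rng.normal(size=4) for el in ["Y", "Ba", "Cu", "O"]}

def encode(composition):
    """Permutation-invariant set encoding of a composition:
    stoichiometry-weighted mean pooling of element embeddings."""
    total = sum(composition.values())
    return sum((n / total) * emb[el] for el, n in composition.items())

# YBa2Cu3O7 in any element order maps to the same representation.
ybco = {"Y": 1, "Ba": 2, "Cu": 3, "O": 7}
ybco_shuffled = {"O": 7, "Cu": 3, "Y": 1, "Ba": 2}
print(np.allclose(encode(ybco), encode(ybco_shuffled)))  # True

# A readout on the pooled representation yields a property estimate.
readout = rng.normal(size=4)
print(float(readout @ encode(ybco)))
```

Treating a material as a set of (element, fraction) pairs, rather than a fixed-length hand-engineered feature vector, is what frees the model from the intuitions baked into chosen descriptors.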
Having demonstrated the ability of machine learning to aid in the selection of promising compound compositions, we next explore how useful machine learning might be for identifying fabrication routes.
Using a recently released data-mined data set of solid-state synthesis reactions, we design a two-stage model to predict the products of inorganic reactions.
We critically explore the performance of this model, showing that whilst the predictions fall short of the accuracy required to be chemically discriminative, the model provides valuable insights into understanding inorganic reactions.
Through careful investigation of the model's failure modes, we explore the challenges that remain in the construction of forward inorganic reaction prediction models and suggest some pathways to tackle the identified issues.
One of the principal ways that material scientists understand and categorise materials is in terms of their symmetries.
Crystal structure prototypes are assigned based on the presence of symmetrically equivalent sites known as Wyckoff positions.
We show that a powerful coarse-grained representation of materials structures can be constructed from the Wyckoff positions by discarding information about their coordinates within crystal structures.
One of the strengths of this representation is that it maintains the ability of structure-based methods to distinguish polymorphs whilst also allowing combinatorial enumeration akin to composition-based approaches.
We construct an end-to-end differentiable model that takes our proposed Wyckoff representation as input.
The performance of this approach is examined on a suite of materials discovery experiments showing that it leads to strong levels of enrichment in materials discovery tasks.
The research presented in this thesis highlights the promise of applying data-driven workflows and machine learning in materials discovery and development.
This thesis concludes by speculating about promising research directions for applying machine learning within materials discovery.
A Rainbow in Deep Network Black Boxes
We introduce rainbow networks as a probabilistic model of trained deep neural
networks. The model cascades random feature maps whose weight distributions are
learned. It assumes that dependencies between weights at different layers are
reduced to rotations which align the input activations. Neuron weights within a
layer are independent after this alignment. Their activations define kernels
which become deterministic in the infinite-width limit. This is verified
numerically for ResNets trained on the ImageNet dataset. We also show that the
learned weight distributions have low-rank covariances. Rainbow networks thus
alternate between linear dimension reductions and non-linear high-dimensional
embeddings with white random features. Gaussian rainbow networks are defined
with Gaussian weight distributions. These models are validated numerically on
image classification on the CIFAR-10 dataset, with wavelet scattering networks.
We further show that during training, SGD updates the weight covariances while
mostly preserving the Gaussian initialization.

Comment: 56 pages, 10 figures
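A Gaussian rainbow layer, as described above, can be sketched in a few lines; the dimensions and covariance below are illustrative, not taken from the paper. Sampling neuron weights i.i.d. from N(0, C) with a learned low-rank covariance C is equivalent to a linear dimension reduction followed by white random features.

```python
import numpy as np

rng = np.random.default_rng(0)
d_in, width, rank = 16, 64, 4

# "Learned" low-rank weight covariance C = U U^T + eps*I (random here).
U = rng.normal(size=(d_in, rank)) / np.sqrt(rank)
C = U @ U.T + 1e-3 * np.eye(d_in)

# Sample neuron weights i.i.d. from N(0, C). Writing W = L @ W_white with
# C = L L^T makes the structure explicit: a linear reduction L applied to
# white random features W_white.
L = np.linalg.cholesky(C)
W_white = rng.normal(size=(d_in, width))
W = L @ W_white  # columns are the neuron weight vectors

def layer(x):
    """Non-linear random-feature map with learned weight covariance."""
    return np.maximum(W.T @ x, 0.0)  # ReLU activations

x = rng.normal(size=d_in)
print(layer(x).shape)  # (64,)
```

In the infinite-width limit the abstract describes, the empirical kernel induced by such a layer concentrates around a deterministic kernel determined by C.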