45 research outputs found

    Cognitive network science reveals bias in GPT-3, ChatGPT, and GPT-4 mirroring math anxiety in high-school students

    Full text link
    Large language models are becoming increasingly integrated into our lives. Hence, it is important to understand the biases present in their outputs in order to avoid perpetuating harmful stereotypes, which originate in our own flawed ways of thinking. This challenge requires developing new benchmarks and methods for quantifying affective and semantic bias, keeping in mind that LLMs act as psycho-social mirrors that reflect the views and tendencies prevalent in society. One such tendency with harmful effects is the global phenomenon of anxiety toward math and STEM subjects. Here, we investigate perceptions of math and STEM fields provided by cutting-edge language models, namely GPT-3, ChatGPT, and GPT-4, by applying an approach from network science and cognitive psychology. Specifically, we use behavioral forma mentis networks (BFMNs) to understand how these LLMs frame math and STEM disciplines in relation to other concepts. We use data obtained by probing the three LLMs in a language generation task that has previously been applied to humans. Our findings indicate that LLMs have an overall negative perception of math and STEM fields, with math being perceived most negatively. We observe significant differences across the three LLMs: newer versions (i.e., GPT-4) produce richer, more complex, and less negative perceptions than older versions and than a sample of N=159 high-school students. These findings suggest that advances in the architecture of LLMs may lead to increasingly less biased models that could perhaps even someday aid in reducing harmful stereotypes in society rather than perpetuating them. Comment: 23 pages, 8 figures
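    The core BFMN idea above can be sketched in a few lines: build a network from free associations elicited for cue words, then score each cue by the mean valence of its neighbors. All cues, associations, and valence ratings below are illustrative toy data, not the paper's stimuli or results.

    ```python
    # Minimal sketch of a behavioral forma mentis network (BFMN) analysis.
    # The association lists and valence ratings are hypothetical toy data.
    from collections import defaultdict

    # Hypothetical free associations elicited for each cue word.
    associations = {
        "math": ["anxiety", "numbers", "hard", "logic"],
        "science": ["discovery", "lab", "curiosity"],
        "art": ["beauty", "creativity", "freedom"],
    }

    # Hypothetical valence ratings in [-1, 1] (negative = unpleasant).
    valence = {
        "anxiety": -0.9, "numbers": 0.1, "hard": -0.5, "logic": 0.4,
        "discovery": 0.8, "lab": 0.2, "curiosity": 0.7,
        "beauty": 0.9, "creativity": 0.8, "freedom": 0.9,
    }

    # Build an undirected network with cue--association edges.
    network = defaultdict(set)
    for cue, words in associations.items():
        for w in words:
            network[cue].add(w)
            network[w].add(cue)

    def neighborhood_valence(cue):
        """Mean valence of a cue's neighbors: its emotional 'aura'."""
        neighbors = [w for w in network[cue] if w in valence]
        return sum(valence[w] for w in neighbors) / len(neighbors)

    for cue in associations:
        print(f"{cue}: mean neighbor valence = {neighborhood_valence(cue):+.2f}")
    ```

    With these toy numbers, "math" ends up with a negative aura while "art" is strongly positive, mirroring the kind of contrast the study quantifies at scale.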

    Information Geometry

    Get PDF
    This Special Issue of the journal Entropy, titled “Information Geometry I”, contains a collection of 17 papers concerning the foundations and applications of information geometry. Based on a geometrical interpretation of probability, information geometry has become a rich mathematical field employing the methods of differential geometry. It has numerous applications to data science, physics, and neuroscience. Presenting original research, yet written in an accessible, tutorial style, this collection of papers will be useful for scientists who are new to the field, while providing an excellent reference for the more experienced researcher. Several papers are written by authorities in the field, and topics cover the foundations of information geometry, as well as applications to statistics, Bayesian inference, machine learning, complex systems, physics, and neuroscience.
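    The field's central object, the Fisher information metric, can be illustrated on the simplest statistical manifold, the Bernoulli family, where the closed form I(θ) = 1/(θ(1−θ)) can be checked against the expected squared score computed by finite differences. This one-parameter example is my own illustration, not drawn from the Special Issue.

    ```python
    # Fisher information of the Bernoulli family p(x; theta) = theta^x (1-theta)^(1-x):
    # closed form vs. a finite-difference estimate of E[(d/dtheta log p)^2].
    import math

    def fisher_bernoulli(theta):
        """Closed-form Fisher information of Bernoulli(theta)."""
        return 1.0 / (theta * (1.0 - theta))

    def fisher_numeric(theta, h=1e-5):
        """Expected squared score over x in {0, 1}, via central differences."""
        total = 0.0
        for x in (0, 1):
            p = theta**x * (1 - theta)**(1 - x)
            log_plus = x * math.log(theta + h) + (1 - x) * math.log(1 - theta - h)
            log_minus = x * math.log(theta - h) + (1 - x) * math.log(1 - theta + h)
            score = (log_plus - log_minus) / (2 * h)
            total += p * score**2
        return total

    theta = 0.3
    print(fisher_bernoulli(theta), fisher_numeric(theta))
    ```

    The two values agree to several decimal places; in information geometry this quantity is the Riemannian metric on the parameter space.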

    Observability and Controllability of Nonlinear Networks: The Role of Symmetry

    Full text link
    Observability and controllability are essential concepts in the design of predictive observer models and feedback controllers of networked systems. For example, noncontrollable mathematical models of real systems have subspaces that influence model behavior but cannot be controlled by an input. Such subspaces can be difficult to determine in complex nonlinear networks. Since almost all of the present theory was developed for linear networks without symmetries, here we present a numerical and group-representational framework to quantify the observability and controllability of nonlinear networks with explicit symmetries, showing the connection between symmetries and nonlinear measures of observability and controllability. We numerically observe and theoretically predict that not all symmetries have the same effect on network observation and control. Our analysis shows that the presence of symmetry in a network may decrease observability and controllability, although networks containing only rotational symmetries remain controllable and observable. These results alter our view of the nature of observability and controllability in complex networks, change our understanding of structural controllability, and affect the design of mathematical models to observe and control such networks. Comment: 19 pages, 9 figures
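    The linear baseline this work extends is Kalman's rank condition: x' = Ax + Bu is controllable iff [B, AB, …, A^(n−1)B] has full rank, and observability follows by duality. A minimal sketch on a toy three-node chain (an assumed example, not a system from the paper):

    ```python
    # Kalman rank tests for a linear network, the classical theory the
    # abstract contrasts with. The 3-node chain below is an illustrative toy.
    import numpy as np

    def controllability_matrix(A, B):
        """Stack [B, AB, ..., A^(n-1) B] column-wise."""
        n = A.shape[0]
        blocks = [B]
        for _ in range(n - 1):
            blocks.append(A @ blocks[-1])
        return np.hstack(blocks)

    # Directed chain 1 -> 2 -> 3, actuated at node 1, measured at node 3.
    A = np.array([[0.0, 0.0, 0.0],
                  [1.0, 0.0, 0.0],
                  [0.0, 1.0, 0.0]])
    B = np.array([[1.0], [0.0], [0.0]])
    C = np.array([[0.0, 0.0, 1.0]])

    ctrb_rank = np.linalg.matrix_rank(controllability_matrix(A, B))
    # Observability by duality: rank of the controllability matrix of (A^T, C^T).
    obsv_rank = np.linalg.matrix_rank(controllability_matrix(A.T, C.T))
    print(ctrb_rank, obsv_rank)
    ```

    Both ranks equal 3 here, so the chain is controllable from its head and observable from its tail; the paper's contribution is the nonlinear, symmetry-aware analogue of such tests.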

    Modelling and predicting distribution-valued fields with applications to inversion under uncertainty

    Get PDF
    Capturing the dependence between a random response and predictors is a fundamental task in statistics and stochastic modelling. The focus of this work is on density regression, which entails estimating response distributions given predictor values. It enables the derivation of various statistical quantities, including the conditional mean, threshold exceedance probabilities, and quantiles. This thesis presents a flexible approach based upon the class of so-called Spatial Logistic Gaussian Processes (SLGPs). The SLGP framework utilizes a well-behaved latent Gaussian Process that undergoes a non-linear transformation, resulting in a class of models suitable for capturing spatially-dependent probability measures. SLGP models overcome limitations associated with strong distributional assumptions (e.g. shape constraints, log-concavity, Gaussianity, etc.), varying sample sizes, and changes in target density shapes and modalities. The first part of this work is dedicated to the development of SLGP models and gaining a deep understanding of the associated mathematical concepts. We introduce SLGPs from the perspective of random measures and their densities, and investigate links between properties of SLGPs and underlying processes. We show that SLGP models can be characterized by their log-increments and leverage this characterization to establish theoretical results with a main focus on spatial regularity. We then focus on the applicability of our approach, and propose an implementation relying on finite-rank Gaussian Processes. We demonstrate it on synthetic examples and on temperature distributions at meteorological stations. Finally, we address the potential of SLGPs for statistical inference, focusing on their potential in stochastic optimization and stochastic inverse problems.
Notably, for inverse problems, an Approximate Bayesian Computation (ABC) framework is introduced, leveraging SLGP-surrogated likelihoods to accommodate situations with limited to moderate data. This methodology, inspired by GP-ABC methods, harnesses the probabilistic nature of SLGPs to guide data acquisition, thereby facilitating accelerated inference. We illustrate these approaches on synthetic examples as well as on a hydrogeological inverse problem in which a contaminant source is sought under an uncertain geological scenario.
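    The spatial logistic transformation at the heart of SLGPs can be sketched directly: a latent field f(x, t) is mapped to ρ(x, t) = exp(f(x, t)) / ∫ exp(f(x, s)) ds, so every predictor value x indexes a valid probability density over the response t. In the sketch below the latent field is a fixed toy function rather than a Gaussian Process draw, and the grid discretization is an assumption for illustration.

    ```python
    # Spatial logistic transform: latent field -> predictor-indexed densities.
    # The latent field here is a toy function standing in for a GP sample.
    import math

    T = [i / 100.0 for i in range(101)]      # response grid on [0, 1]
    dt = T[1] - T[0]

    def latent(x, t):
        """Hypothetical latent field; in an SLGP this would be a GP draw."""
        return -((t - x) ** 2) / 0.02        # density peak tracks the predictor x

    def density(x):
        """Normalized density over the grid T at predictor value x."""
        w = [math.exp(latent(x, t)) for t in T]
        z = sum(w) * dt                      # discrete normalizing integral
        return [wi / z for wi in w]

    rho = density(0.5)
    print(f"total mass at x=0.5: {sum(rho) * dt:.3f}")  # 1 by construction
    ```

    By construction every slice integrates to one, which is exactly the property that lets SLGPs model fields of probability measures without shape or Gaussianity constraints.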

    A Rainbow in Deep Network Black Boxes

    Full text link
    We introduce rainbow networks as a probabilistic model of trained deep neural networks. The model cascades random feature maps whose weight distributions are learned. It assumes that dependencies between weights at different layers reduce to rotations which align the input activations; neuron weights within a layer are independent after this alignment. Their activations define kernels which become deterministic in the infinite-width limit. This is verified numerically for ResNets trained on the ImageNet dataset. We also show that the learned weight distributions have low-rank covariances. Rainbow networks thus alternate between linear dimension reductions and non-linear high-dimensional embeddings with white random features. Gaussian rainbow networks are defined with Gaussian weight distributions. These models are validated numerically on image classification on the CIFAR-10 dataset, with wavelet scattering networks. We further show that during training, SGD updates the weight covariances while mostly preserving the Gaussian initialization. Comment: 56 pages, 10 figures
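    One Gaussian rainbow layer, as described above, amounts to random features whose weights are drawn from a Gaussian with a learned low-rank covariance. A minimal sketch under assumed toy dimensions (the covariance factor is arbitrary, not a learned one):

    ```python
    # Sketch of one Gaussian rainbow layer: ReLU random features with
    # weights drawn from N(0, Sigma), Sigma low-rank. Toy dimensions.
    import numpy as np

    rng = np.random.default_rng(0)
    d, width, rank = 8, 4096, 2

    # Low-rank covariance Sigma = U U^T plus a small isotropic term.
    U = rng.standard_normal((d, rank))
    Sigma = U @ U.T + 1e-3 * np.eye(d)

    # Sample neuron weights W ~ N(0, Sigma) via a Cholesky factor.
    L = np.linalg.cholesky(Sigma)
    W = rng.standard_normal((width, d)) @ L.T

    def rainbow_layer(x):
        """Nonlinear random-feature embedding phi(x) = relu(W x) / sqrt(width)."""
        return np.maximum(W @ x, 0.0) / np.sqrt(width)

    phi = rainbow_layer(rng.standard_normal(d))
    # Empirical weight covariance approaches Sigma as width grows.
    rel_err = np.linalg.norm(W.T @ W / width - Sigma) / np.linalg.norm(Sigma)
    print(phi.shape, rel_err)
    ```

    As the width grows, the empirical weight covariance concentrates around Σ and the feature kernel becomes deterministic, which is the infinite-width behavior the paper verifies for trained networks.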