Deep neural networks architectures from the perspective of manifold learning
Despite significant advances in the field of deep learning in applications to various areas, an explanation of the learning process of neural network models remains an important open question. The purpose of this paper is a comprehensive comparison and description of neural network architectures in terms of geometry and topology. We focus on the internal representations of neural networks and on the dynamics of changes in the topology and geometry of the data manifold across layers. In this paper, we use the concepts of topological data analysis (TDA) and the persistent homological fractal dimension. We present a wide range of experiments with various datasets and configurations of convolutional neural network (CNN) and Transformer architectures on CV and NLP tasks. Our work is a contribution to the development of the important field of explainable and interpretable AI within the framework of geometric deep learning.
Comment: 11 pages, 12 figures, PRAI2023. arXiv admin note: substantial text overlap with arXiv:2204.0862
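The abstract's use of topological data analysis can be illustrated with the simplest TDA invariant: 0-dimensional persistent homology (connected components) of a point cloud, which one might compute on layer activations. Below is a minimal, self-contained sketch; the synthetic "activations" array is a stand-in for real network outputs, not the paper's actual pipeline.

```python
import numpy as np

def persistence_0d(points):
    """Death times of 0-dim features in a Vietoris-Rips filtration:
    the edge lengths at which components merge (single linkage)."""
    n = len(points)
    # All pairwise edges, sorted by length.
    edges = sorted(
        (np.linalg.norm(points[i] - points[j]), i, j)
        for i in range(n) for j in range(i + 1, n)
    )
    parent = list(range(n))
    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x
    deaths = []
    for d, i, j in edges:
        ri, rj = find(i), find(j)
        if ri != rj:
            parent[ri] = rj
            deaths.append(d)  # one component dies at scale d
    return deaths  # n - 1 merge events for n points

rng = np.random.default_rng(0)
# Two well-separated clusters: expect one large death value (the gap).
activations = np.vstack([rng.normal(0, 0.1, (20, 5)),
                         rng.normal(3, 0.1, (20, 5))])
deaths = persistence_0d(activations)
print(max(deaths))  # roughly the inter-cluster distance
```

A long-lived component (large death value) signals well-separated structure in the representation; libraries such as GUDHI or Ripser extend this to higher-dimensional features.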
Do Large GPT Models Discover Moral Dimensions in Language Representations? A Topological Study Of Sentence Embeddings
As Large Language Models are deployed within Artificial Intelligence systems that are increasingly integrated with human society, it becomes more important than ever to study their internal structures. Higher-level abilities of LLMs such as GPT-3.5 emerge in large part from the informative language representations they induce from raw text data during pre-training on trillions of words. These embeddings live in vector spaces of several thousand dimensions, and their processing involves mappings between multiple vector spaces, with a total number of parameters on the order of trillions. Furthermore, these language representations are induced by gradient optimization, resulting in a black-box system that is hard to interpret. In this paper, we examine the topological structure of neuronal activity in the "brain" of ChatGPT's foundation language model and analyze it with respect to a metric representing the notion of fairness. We develop a novel approach to visualizing GPT's moral dimensions. We first compute a fairness metric, inspired by the social psychology literature, to identify factors that typically influence fairness assessments in humans, such as legitimacy, need, and responsibility. Subsequently, we summarize the manifold's shape using a lower-dimensional simplicial complex whose topology is derived from this metric. We color it with a heat map associated with the fairness metric, producing human-readable visualizations of the high-dimensional sentence manifold. Our results show that sentence embeddings based on GPT-3.5 can be decomposed into two submanifolds corresponding to fair and unfair moral judgments. This indicates that GPT-based language models develop a moral dimension within their representation spaces and induce an understanding of fairness during their training process.
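The construction described here (a low-dimensional simplicial complex whose topology is derived from a scalar metric) resembles the Mapper algorithm from TDA. The following is a minimal Mapper-style sketch, with a toy score function (the first coordinate) standing in for the paper's fairness metric; all parameters are illustrative.

```python
import numpy as np

def clusters_in_slice(points, idx, eps):
    """Connected components of the points in idx at distance scale eps."""
    idx, seen, comps = list(idx), set(), []
    for s in idx:
        if s in seen:
            continue
        comp, stack = set(), [s]
        while stack:
            u = stack.pop()
            if u in comp:
                continue
            comp.add(u)
            stack.extend(v for v in idx if v not in comp
                         and np.linalg.norm(points[u] - points[v]) < eps)
        seen |= comp
        comps.append(comp)
    return comps

def mapper_graph(points, score, n_intervals=5, overlap=0.3, eps=1.0):
    """Nodes: clusters within overlapping score slices; edges: shared points."""
    lo, hi = score.min(), score.max()
    width = (hi - lo) / n_intervals
    nodes, edges = [], set()
    for k in range(n_intervals):
        a = lo + (k - overlap) * width
        b = lo + (k + 1 + overlap) * width
        idx = np.where((score >= a) & (score <= b))[0]
        for comp in clusters_in_slice(points, idx, eps):
            for m, other in enumerate(nodes):
                if comp & other:          # clusters sharing points are linked
                    edges.add((m, len(nodes)))
            nodes.append(comp)
    return nodes, edges

rng = np.random.default_rng(0)
# Two blobs separated along the score axis -> the graph splits in two pieces,
# analogous to the two submanifolds of fair vs. unfair judgments.
pts = np.vstack([rng.normal(0, 0.3, (30, 2)),
                 rng.normal((5, 0), 0.3, (30, 2))])
nodes, edges = mapper_graph(pts, score=pts[:, 0])
print(len(nodes), len(edges))
```

Coloring each node by the mean score of its points would yield the heat-map visualization described in the abstract.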
Topos and Stacks of Deep Neural Networks
Every known artificial deep neural network (DNN) corresponds to an object in a canonical Grothendieck topos; its learning dynamic corresponds to a flow of morphisms in this topos. Invariance structures in the layers (as in CNNs or LSTMs) correspond to Giraud's stacks. This invariance is supposed to be responsible for the generalization property, that is, extrapolation from learning data under constraints. The fibers represent pre-semantic categories (Culioli, Thom), over which artificial languages are defined, with internal logics: intuitionistic, classical, or linear (Girard). The semantic functioning of a network is its ability to express theories in such a language for answering questions in output about input data. Quantities and spaces of semantic information are defined by analogy with the homological interpretation of Shannon's entropy (P. Baudot and D.B. 2015). They generalize the measures found by Carnap and Bar-Hillel (1952). Remarkably, the above semantic structures are classified by geometric fibrant objects in a closed model category of Quillen; they then give rise to homotopical invariants of DNNs and of their semantic functioning. Intensional type theories (Martin-Löf) organize these objects and the fibrations between them. Information contents and exchanges are analyzed by Grothendieck's derivators.
Logic programming with pseudo-Boolean constraints
Boolean constraints play an important role in various constraint logic programming languages. In this paper we consider pseudo-Boolean constraints, that is, equations and inequalities between pseudo-Boolean functions. A pseudo-Boolean function is an integer-valued function of Boolean variables and thus a generalization of a Boolean function. Pseudo-Boolean functions occur in many application areas, in particular in problems from operations research. An interesting connection to logic is that inference problems in propositional logic can be translated into linear pseudo-Boolean optimization problems. More generally, pseudo-Boolean constraints can be seen as a particular way of combining two of the most important domains in constraint logic programming: arithmetic and Boolean algebra. In this paper we define a new constraint logic programming language CLP(PB) for logic programming with pseudo-Boolean constraints. The language is an instance of the general constraint logic programming language scheme CLP(X) and inherits all the typical semantic properties. We show that any pseudo-Boolean constraint has a most general solution and give variable elimination algorithms for pseudo-Boolean unification and unconstrained pseudo-Boolean optimization. Both algorithms subsume the well-known Boolean unification algorithm of Büttner and Simonis.
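The translation from propositional logic to linear pseudo-Boolean constraints mentioned above can be shown concretely: a clause such as (x or not y or z) becomes the 0/1 inequality x + (1 - y) + z >= 1. The brute-force checker below is purely illustrative; real CLP(PB) solvers do not enumerate assignments.

```python
from itertools import product

def clause_holds(clause, assign):
    """clause: list of (var, sign) with sign=+1 for x, -1 for not-x.
    Evaluates the linear pseudo-Boolean inequality  sum of literals >= 1."""
    total = sum(assign[v] if s > 0 else 1 - assign[v] for v, s in clause)
    return total >= 1

def satisfiable(clauses, n_vars):
    """Brute-force check that some 0/1 assignment meets every inequality."""
    for bits in product((0, 1), repeat=n_vars):
        assign = dict(enumerate(bits))
        if all(clause_holds(c, assign) for c in clauses):
            return True
    return False

# (x0 or x1) and (not x0 or x1) and (not x1)  -> unsatisfiable
cnf = [[(0, 1), (1, 1)], [(0, -1), (1, 1)], [(1, -1)]]
print(satisfiable(cnf, 2))  # False
```

Because the left-hand sides are integer linear forms, an optimizer minimizing a literal subject to these inequalities decides entailment, which is the connection the abstract points to.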
Unsupervised Geometric and Topological Approaches for Cross-Lingual Sentence Representation and Comparison
We propose novel structural-based approaches for the generation and comparison of cross-lingual sentence representations. We do so by applying geometric and topological methods to analyze the structure of sentences, as captured by their word embeddings. The key properties of our methods are: (a) they are designed to be isometry invariant, in order to provide language-agnostic representations; (b) they are fully unsupervised and use no cross-lingual signal. The quality of our representations, and their preservation across languages, are evaluated in similarity comparison tasks, achieving competitive results. Furthermore, we show that our structural-based representations can be combined with existing methods for improved results.
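The isometry-invariance property can be illustrated concretely: the pairwise-distance matrix of a sentence's word embeddings is unchanged under rotations and translations of the embedding space, making it a candidate language-agnostic signature. A toy check with random data follows (this is an illustration of the invariance idea, not the paper's actual construction).

```python
import numpy as np

def distance_signature(word_vectors):
    """Pairwise Euclidean distance matrix of a sentence's word vectors."""
    diff = word_vectors[:, None, :] - word_vectors[None, :, :]
    return np.linalg.norm(diff, axis=-1)

rng = np.random.default_rng(1)
sentence = rng.normal(size=(6, 50))             # toy "word embeddings"
q, _ = np.linalg.qr(rng.normal(size=(50, 50)))  # random orthogonal map
rotated = sentence @ q + 0.5                    # isometry: rotation + shift
same = np.allclose(distance_signature(sentence),
                   distance_signature(rotated))
print(same)  # True
```

Since embedding spaces of different languages are at best aligned up to such transformations, comparing distance matrices (rather than raw coordinates) sidesteps the alignment problem entirely.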
Semantic Communications Based on Adaptive Generative Models and Information Bottleneck
Semantic communications represent a significant breakthrough with respect to
the current communication paradigm, as they focus on recovering the meaning
behind the transmitted sequence of symbols, rather than the symbols themselves.
In semantic communications, the goal of the destination is not to recover a
list of symbols identical to the transmitted ones, but rather a message that
is semantically equivalent to the one emitted by the source. This paradigm
shift introduces many degrees of freedom
to the encoding and decoding rules that can be exploited to make the design of
communication systems much more efficient. In this paper, we present an
approach to semantic communication building on three fundamental ideas: 1)
represent data over a topological space as a formal way to capture semantics,
as expressed through relations; 2) use the information bottleneck principle as
a way to identify relevant information and adapt the information bottleneck
online, as a function of the wireless channel state, in order to strike an
optimal trade-off between transmit power, reconstruction accuracy and delay; 3)
exploit probabilistic generative models as a general tool to adapt the
transmission rate to the wireless channel state and make possible the
regeneration of the transmitted images or run classification tasks at the
receiver side.
Comment: To appear in IEEE Communications Magazine, special issue on Semantic Communications: Transmission beyond Shannon, 202
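The rate-adaptation idea in point 2 can be caricatured as choosing, for each channel state, the representation whose rate the channel can carry while minimizing reconstruction error. The toy sketch below uses a uniform quantizer and illustrative numbers only; it is not the paper's actual information-bottleneck scheme.

```python
import numpy as np

def quantize(x, n_bits):
    """Uniform midpoint quantizer on [0, 1) with 2**n_bits levels."""
    levels = 2 ** n_bits
    return np.floor(x * levels) / levels + 0.5 / levels

def pick_rate(x, channel_capacity_bits, candidates=(1, 2, 4, 6, 8)):
    """Among bit depths the channel can carry (rate per sample <= capacity
    per sample), pick the one with the lowest reconstruction error."""
    feasible = [b for b in candidates if b <= channel_capacity_bits]
    return min(feasible,
               key=lambda b: np.mean((x - quantize(x, b)) ** 2))

rng = np.random.default_rng(2)
signal = rng.random(1000)
good_channel = pick_rate(signal, 8)  # ample capacity -> fine quantizer
bad_channel = pick_rate(signal, 2)   # poor channel  -> coarse quantizer
print(good_channel, bad_channel)
```

Re-running the selection as the channel state changes gives the online adaptation the abstract describes: the bottleneck tightens when the channel degrades and relaxes when it improves.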