1,718 research outputs found
Characterizing the impact of geometric properties of word embeddings on task performance
Analysis of word embedding properties to inform their use in downstream NLP
tasks has largely been studied by assessing nearest neighbors. However,
geometric properties of the continuous feature space contribute directly to the
use of embedding features in downstream models, and are largely unexplored. We
consider four properties of word embedding geometry, namely: position relative
to the origin, distribution of features in the vector space, global pairwise
distances, and local pairwise distances. We define a sequence of
transformations to generate new embeddings that expose subsets of these
properties to downstream models and evaluate change in task performance to
understand the contribution of each property to NLP models. We transform
publicly available pretrained embeddings from three popular toolkits (word2vec,
GloVe, and FastText) and evaluate on a variety of intrinsic tasks, which model
linguistic information in the vector space, and extrinsic tasks, which use
vectors as input to machine learning models. We find that intrinsic evaluations
are highly sensitive to absolute position, while extrinsic tasks rely primarily
on local similarity. Our findings suggest that future embedding models and
post-processing techniques should focus primarily on similarity to nearby
points in vector space. Comment: Appearing in the Third Workshop on Evaluating Vector Space
Representations for NLP (RepEval 2019). 7 pages + references
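As a rough illustration of how position relative to the origin can be separated from pairwise distances, the sketch below translates a set of toy vectors and compares measurements before and after. It is not the authors' transformation sequence. The vectors, the offset, and the helper names (`translate`, `nearest_neighbor`) are hypothetical and chosen only for this example.

```python
import math

def euclidean(u, v):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def cosine(u, v):
    """Cosine similarity; depends on where the origin is."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def translate(vectors, offset):
    """Shift every vector by the same offset (changes position relative to the origin)."""
    return [[a + o for a, o in zip(v, offset)] for v in vectors]

def nearest_neighbor(vectors, i, dist):
    """Index of the vector closest to vectors[i] under the given distance."""
    others = [j for j in range(len(vectors)) if j != i]
    return min(others, key=lambda j: dist(vectors[i], vectors[j]))

# Toy 2-D "embeddings" (hypothetical values, not taken from any real model).
emb = [[1.0, 0.0], [0.9, 0.2], [0.0, 1.0], [-1.0, 0.1]]
shifted = translate(emb, [5.0, 5.0])

# Pairwise Euclidean distances and nearest neighbors survive the shift...
dist_before = euclidean(emb[0], emb[2])
dist_after = euclidean(shifted[0], shifted[2])
nn_before = [nearest_neighbor(emb, i, euclidean) for i in range(len(emb))]
nn_after = [nearest_neighbor(shifted, i, euclidean) for i in range(len(shifted))]

# ...while an origin-dependent measure such as cosine similarity does not.
cos_before = cosine(emb[0], emb[2])
cos_after = cosine(shifted[0], shifted[2])
```

The design point is only that a translation changes absolute position while leaving Euclidean neighborhoods intact, which is one way to probe the two properties separately.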
Platonic model of mind as an approximation to neurodynamics
A hierarchy of approximations involved in the simplification of microscopic theories, from the sub-cellular to the whole-brain level, is presented. A new approximation to neural dynamics is described, leading to a Platonic-like model of mind based on psychological spaces. Objects and events in these spaces correspond to quasi-stable states of brain dynamics and may be interpreted from a psychological point of view. The Platonic model bridges the gap between neurosciences and psychological sciences. Static and dynamic versions of this model are outlined, and Feature Space Mapping, a neurofuzzy realization of the static version of the Platonic model, is described. Categorization experiments with human subjects are analyzed from the neurodynamical and Platonic model points of view
Categorical Dimensions of Human Odor Descriptor Space Revealed by Non-Negative Matrix Factorization
In contrast to most other sensory modalities, the basic perceptual dimensions of olfaction remain unclear. Here, we use non-negative matrix factorization (NMF) – a dimensionality reduction technique – to uncover structure in a panel of odor profiles, with each odor defined as a point in multi-dimensional descriptor space. The properties of NMF are favorable for the analysis of such lexical and perceptual data, and lead to a high-dimensional account of odor space. We further provide evidence that odor dimensions apply categorically. That is, odor space is not occupied homogeneously, but rather in a discrete and intrinsically clustered manner. We discuss the potential implications of these results for the neural coding of odors, as well as for developing classifiers on larger datasets that may be useful for predicting perceptual qualities from chemical structures
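For readers unfamiliar with NMF, the following is a minimal sketch of the standard multiplicative-update scheme for factorizing a non-negative matrix V into non-negative factors W and H. It assumes a squared-error objective and uses a small made-up "odor by descriptor" matrix; the abstract does not say which NMF variant or data layout the authors used, so all names and values here are illustrative.

```python
import random

def matmul(A, B):
    """Plain-Python matrix product of two lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def transpose(A):
    return [list(col) for col in zip(*A)]

def frobenius_error(V, W, H):
    """Squared Frobenius norm of V - WH."""
    WH = matmul(W, H)
    return sum((v - wh) ** 2 for rv, rwh in zip(V, WH) for v, wh in zip(rv, rwh))

def nmf(V, rank, iterations=200, seed=0, eps=1e-9):
    """Factorize V ~ W H with non-negative W (n x rank) and H (rank x m)."""
    rng = random.Random(seed)
    n, m = len(V), len(V[0])
    W = [[rng.random() + 0.1 for _ in range(rank)] for _ in range(n)]
    H = [[rng.random() + 0.1 for _ in range(m)] for _ in range(rank)]
    for _ in range(iterations):
        # H <- H * (W^T V) / (W^T W H), element-wise
        Wt = transpose(W)
        num, den = matmul(Wt, V), matmul(matmul(Wt, W), H)
        H = [[h * a / (b + eps) for h, a, b in zip(rh, ra, rb)]
             for rh, ra, rb in zip(H, num, den)]
        # W <- W * (V H^T) / (W H H^T), element-wise
        Ht = transpose(H)
        num, den = matmul(V, Ht), matmul(W, matmul(H, Ht))
        W = [[w * a / (b + eps) for w, a, b in zip(rw, ra, rb)]
             for rw, ra, rb in zip(W, num, den)]
    return W, H

# Hypothetical panel: rows are odors, columns are descriptor ratings.
V = [[5.0, 4.0, 0.0, 0.0],
     [4.0, 5.0, 0.0, 0.0],
     [0.0, 0.0, 3.0, 4.0],
     [0.0, 0.0, 4.0, 3.0]]

W_early, H_early = nmf(V, rank=2, iterations=1)
W, H = nmf(V, rank=2, iterations=200)
error_early = frobenius_error(V, W_early, H_early)
error_final = frobenius_error(V, W, H)
```

Because the updates only multiply by non-negative ratios, W and H stay non-negative throughout, which is the property that makes the factors readable as additive parts.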
Fuzzy context adaptation through conceptual situation spaces
Context-adaptive information systems (IS) are highly desired across several application domains and usually rely on matching a particular real-world situation to a finite set of predefined situation parameters. To represent context parameters, semantic and non-semantic representation standards are widely used. However, describing the complex and diverse notion of specific situations is costly and may never reach semantic completeness. Whereas no situation parameter completely equals another, the number of (predefined) representations of situation parameters is finite. Moreover, following symbolic representation approaches leads to ambiguity issues and does not entail semantic meaningfulness. Consequently, the challenge is to enable fuzzy matchmaking methodologies to match real-world situation characteristics to a finite set of predefined situation descriptions. In this paper, we propose conceptual situation spaces (CSS), which enable the description of situation characteristics as members in geometrical vector spaces following the idea of conceptual spaces. Consequently, fuzzy matchmaking is supported by calculating the semantic similarity between the current situation and prototypical situation descriptions in terms of their Euclidean distance within a CSS. Aligning CSS to existing symbolic representation standards enables the automatic matchmaking between real-world situation characteristics and symbolic parameter representations. To demonstrate feasibility, we apply our approach to the domain of e-learning
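The distance-based matchmaking described above can be sketched as follows. This is a minimal illustration, not the authors' CSS implementation: the three quality dimensions, the prototype names, and the mapping from distance to similarity (`1 / (1 + d)`) are all assumptions made for the example.

```python
import math

def euclidean(u, v):
    """Euclidean distance between two points in the same space."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def similarity(u, v):
    """Assumed monotone mapping from distance to a similarity in (0, 1]."""
    return 1.0 / (1.0 + euclidean(u, v))

def best_match(situation, prototypes):
    """Name of the prototypical situation most similar to the observed one."""
    return max(prototypes, key=lambda name: similarity(situation, prototypes[name]))

# Hypothetical e-learning dimensions, each normalized to [0, 1]:
# (prior knowledge, available time, network bandwidth)
prototypes = {
    "novice_mobile": [0.2, 0.3, 0.2],
    "expert_desktop": [0.9, 0.8, 0.9],
}

observed = [0.3, 0.4, 0.1]
match = best_match(observed, prototypes)
scores = {name: similarity(observed, point) for name, point in prototypes.items()}
```

The observed situation never has to equal a prototype exactly; it is assigned to whichever predefined description lies closest, and the similarity score can serve as a graded degree of match.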
Understanding the Latent Space of Diffusion Models through the Lens of Riemannian Geometry
Despite the success of diffusion models (DMs), we still lack a thorough
understanding of their latent space. To understand the latent space, we analyze
it from a geometrical perspective.
Specifically, we utilize the pullback metric to find the local latent basis in
the latent space and the corresponding local tangent basis in the
intermediate feature maps of DMs. The discovered latent basis enables
unsupervised image editing capability through latent space traversal. We
investigate the discovered structure from two perspectives. First, we examine
how geometric structure evolves over diffusion timesteps. Through analysis, we
show that 1) the model focuses on low-frequency components early in the
generative process and attunes to high-frequency details later; 2) at early
timesteps, different samples share similar tangent spaces; and 3) the simpler
the dataset a DM is trained on, the more consistent the tangent space at each
timestep. Second, we investigate how the geometric structure changes based on
text conditioning in Stable Diffusion. The results show that 1) similar prompts
yield comparable tangent spaces; and 2) the model depends less on text
conditions in later timesteps. To the best of our knowledge, this paper is the
first to present image editing through latent-space traversal and provide
thorough analyses of the latent structure of DMs
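For orientation, the general definition of a pullback metric can be written as below. The notation is assumed for this note, since the abstract's own symbols did not survive extraction: f denotes the map from a latent point x to its intermediate feature map, and J_x is the Jacobian of f at x.

```latex
% Assumed notation: f maps a latent point x to a feature map h = f(x),
% and J_x is the Jacobian of f at x. The feature space carries the Euclidean metric.
\|v\|_{\mathrm{pb}}^{2}
  = \langle J_x v,\; J_x v \rangle
  = v^{\top} J_x^{\top} J_x \, v,
\qquad
J_x = U \Sigma V^{\top}
```

Under this definition, the right singular vectors (columns of V) are the latent directions that change the feature map most, and the left singular vectors (columns of U) are the corresponding directions in feature space. This matches the abstract's description of a local latent basis paired with a local tangent basis, though the exact construction used in the paper may differ.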
Semantics-based selection of everyday concepts in visual lifelogging
Concept-based indexing, based on identifying various semantic concepts appearing in multimedia, is an attractive option for multimedia retrieval and much research tries to bridge the semantic gap between the media’s low-level features and high-level semantics. Research into concept-based multimedia retrieval has generally focused on detecting concepts from high quality media such as broadcast TV or movies, but it is not well addressed in other domains like lifelogging where the original data is captured with poorer quality. We argue that in noisy domains such as lifelogging, the management of data needs to include semantic reasoning in order to deduce a set of concepts to represent lifelog content for applications like searching, browsing or summarisation. Using semantic concepts to manage lifelog data relies on the fusion of automatically-detected concepts to provide a better understanding of the lifelog data. In this paper, we investigate the selection of semantic concepts for lifelogging which includes reasoning on semantic networks using a density-based approach. In a series of experiments we compare different semantic reasoning approaches and the experimental evaluations we report on lifelog data show the efficacy of our approach