    Design space reduction in optimization using generative topographic mapping

    Dimension reduction in design optimization is an extensively researched area. The need arises in design problems with very many dimensions, which increase the computational burden of the design process because the sample space required for the design search grows exponentially with the number of dimensions. This work describes the application of a latent variable method called Generative Topographic Mapping (GTM) to the dimension reduction of a data set by transformation into a low-dimensional latent space. Its attraction is that the variables are not removed but only transformed, so there is no risk of losing information relating to any of the variables. The method has been tested initially on the Branin test function and then on an aircraft wing weight problem. Ongoing work involves finding a suitable update strategy for adding infill points to the trained GTM in order to converge to the global optimum effectively. Three update methods tested on GTM so far are discussed.
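
    For readers unfamiliar with the Branin test function mentioned above, a minimal sketch of its commonly used form follows. The abstract does not say which parameterization the authors used, so the constants below (the standard a, b, c, r, s, t values) are an assumption.

```python
import math

def branin(x1, x2):
    """Standard Branin function, usually evaluated on x1 in [-5, 10], x2 in [0, 15].

    The constants are the commonly quoted defaults; the abstract itself
    does not specify them, so treat them as an assumption.
    """
    a = 1.0
    b = 5.1 / (4.0 * math.pi ** 2)
    c = 5.0 / math.pi
    r = 6.0
    s = 10.0
    t = 1.0 / (8.0 * math.pi)
    return a * (x2 - b * x1 ** 2 + c * x1 - r) ** 2 + s * (1.0 - t) * math.cos(x1) + s

# The function has three global minima with the same value, 10 / (8 * pi).
for point in [(-math.pi, 12.275), (math.pi, 2.275), (3.0 * math.pi, 2.475)]:
    print(point, round(branin(*point), 6))  # each prints 0.397887
```

    Having several equal global minima is what makes this function a convenient first check for a design-search method before moving to a higher-dimensional problem.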

    GTM: the generative topographic mapping

    Latent variable models represent the probability density of data in a space of several dimensions in terms of a smaller number of latent, or hidden, variables. A familiar example is factor analysis, which is based on a linear transformation between the latent space and the data space. In this paper we introduce a form of non-linear latent variable model called the Generative Topographic Mapping, for which the parameters of the model can be determined using the EM algorithm. GTM provides a principled alternative to the widely used Self-Organizing Map (SOM) of Kohonen (1982), and overcomes most of the significant limitations of the SOM. We demonstrate the performance of the GTM algorithm on a toy problem and on simulated data from flow diagnostics for a multi-phase oil pipeline.
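
    The EM fit described above can be sketched roughly as follows. This is an illustrative simplification and not the authors' implementation: it assumes a one-dimensional latent grid, Gaussian radial basis functions plus a bias term, and a small ridge term `reg` for numerical stability. The function name `fit_gtm` and all parameter names are hypothetical.

```python
import numpy as np

def _sq_dists(Y, T):
    # Squared distances between K mixture centres (K x D) and N data points (N x D).
    return ((Y[:, None, :] - T[None, :, :]) ** 2).sum(axis=-1)

def _responsibilities(d2, beta):
    # Posterior probability of each latent grid point for each data point (K x N).
    log_r = -0.5 * beta * d2
    log_r -= log_r.max(axis=0, keepdims=True)  # stabilise the exponentials
    r = np.exp(log_r)
    return r / r.sum(axis=0, keepdims=True)

def fit_gtm(T, n_latent=20, n_basis=5, n_iter=30, reg=1e-3, seed=0):
    """Sketch of a GTM with a 1-D latent space, fitted by EM (assumed setup)."""
    rng = np.random.default_rng(seed)
    N, D = T.shape
    X = np.linspace(-1.0, 1.0, n_latent)            # latent grid points
    centres = np.linspace(-1.0, 1.0, n_basis)       # RBF centres in latent space
    width = centres[1] - centres[0]
    Phi = np.exp(-(X[:, None] - centres[None, :]) ** 2 / (2.0 * width ** 2))
    Phi = np.hstack([Phi, np.ones((n_latent, 1))])  # append a bias column
    M = Phi.shape[1]
    W = rng.normal(scale=0.1, size=(M, D))          # basis-to-data weights
    W[-1] = T.mean(axis=0)                          # start centred on the data
    beta = 1.0 / T.var()                            # inverse noise variance
    for _ in range(n_iter):
        # E-step: responsibilities under the current mapping Y = Phi @ W.
        R = _responsibilities(_sq_dists(Phi @ W, T), beta)
        # M-step for W: solve (Phi^T G Phi + reg I) W = Phi^T R T.
        G = np.diag(R.sum(axis=1))
        W = np.linalg.solve(Phi.T @ G @ Phi + reg * np.eye(M), Phi.T @ (R @ T))
        # M-step for beta: inverse of the responsibility-weighted mean squared error.
        beta = N * D / (R * _sq_dists(Phi @ W, T)).sum()
    R = _responsibilities(_sq_dists(Phi @ W, T), beta)
    return X, W, beta, R

# Usage: noisy points along a parabola in 2-D, projected to the 1-D latent space.
rng = np.random.default_rng(1)
s = rng.uniform(-1.0, 1.0, 200)
data = np.column_stack([s, s ** 2]) + 0.05 * rng.normal(size=(200, 2))
X, W, beta, R = fit_gtm(data)
latent_means = R.T @ X  # posterior-mean latent coordinate of each data point
```

    The posterior-mean projection in the last line is one common way to visualize data with a fitted model of this kind; the posterior mode is another.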

    Mapping the conformations of biological assemblies

    Mapping conformational heterogeneity of macromolecules presents a formidable challenge to X-ray crystallography and cryo-electron microscopy, which often presume its absence. This has severely limited our knowledge of the conformations assumed by biological systems and their role in biological function, even though they are known to be important. We propose a new approach to determining to high resolution the three-dimensional conformations of biological entities such as molecules, macromolecular assemblies, and ultimately cells, with existing and emerging experimental techniques. This approach may also enable one to circumvent current limits due to radiation damage and solution purification. Comment: 14 pages, 6 figures.

    GTM: the generative topographic mapping

    This thesis describes the Generative Topographic Mapping (GTM), a non-linear latent variable model intended for modelling continuous, intrinsically low-dimensional probability distributions embedded in high-dimensional spaces. It can be seen as a non-linear form of principal component analysis or factor analysis. It also provides a principled alternative to the self-organizing map, a widely established neural network model for unsupervised learning, resolving many of its associated theoretical problems. An important potential application of the GTM is visualization of high-dimensional data. Since the GTM is non-linear, the relationship between data and its visual representation may be far from trivial, but a better understanding of this relationship can be gained by computing the so-called magnification factor. In essence, the magnification factor relates the distances between data points, as they appear when visualized, to the actual distances between those data points. There are two principal limitations of the basic GTM model. The first is that the computational effort required grows exponentially with the intrinsic dimensionality of the density model; if the intended application is visualization, however, this will typically not be a problem. The other limitation is the inherent structure of the GTM, which makes it most suitable for modelling moderately curved probability distributions of approximately rectangular shape. When the target distribution is very different from that, the aim of maintaining an 'interpretable' structure, suitable for visualizing data, may come into conflict with the aim of providing a good density model. The fact that the GTM is a probabilistic model means that results from probability theory and statistics can be used to address problems such as model complexity. Furthermore, this framework provides solid ground for extending the GTM to wider contexts than that of this thesis.

    High-Dimensional Regression with Gaussian Mixtures and Partially-Latent Response Variables

    In this work we address the problem of approximating high-dimensional data with a low-dimensional representation. We make the following contributions. We propose an inverse regression method which exchanges the roles of input and response, such that the low-dimensional variable becomes the regressor, and which is tractable. We introduce a mixture of locally-linear probabilistic mapping model that starts with estimating the parameters of inverse regression, and follows with inferring closed-form solutions for the forward parameters of the high-dimensional regression problem of interest. Moreover, we introduce a partially-latent paradigm, such that the vector-valued response variable is composed of both observed and latent entries, thus being able to deal with data contaminated by experimental artifacts that cannot be explained with noise models. The proposed probabilistic formulation could be viewed as a latent-variable augmentation of regression. We devise expectation-maximization (EM) procedures based on a data augmentation strategy which facilitates the maximum-likelihood search over the model parameters. We propose two augmentation schemes and we describe in detail the associated EM inference procedures, which may well be viewed as generalizations of a number of EM regression, dimension reduction, and factor analysis algorithms. The proposed framework is validated with both synthetic and real data. We provide experimental evidence that our method outperforms several existing regression techniques.

    Magnification factors for the GTM algorithm

    The Generative Topographic Mapping (GTM) algorithm of Bishop et al. (1997) has been introduced as a principled alternative to the Self-Organizing Map (SOM). As well as avoiding a number of deficiencies in the SOM, the GTM algorithm has the key property that the smoothness properties of the model are decoupled from the reference vectors, and are described by a continuous mapping from a lower-dimensional latent space into the data space. Magnification factors, which are approximated by the difference between code-book vectors in SOMs, can therefore be evaluated for the GTM model as continuous functions of the latent variables using the techniques of differential geometry. They play an important role in data visualization by highlighting the boundaries between data clusters, and are illustrated here for both a toy data set and a problem involving the identification of crab species from morphological data.

    Magnification factors for the SOM and GTM algorithms

    Magnification factors specify the extent to which the area of a small patch of the latent (or 'feature') space of a topographic mapping is magnified on projection to the data space, and are of considerable interest in both neuro-biological and data analysis contexts. Previous attempts to consider magnification factors for the self-organizing map (SOM) algorithm have been hindered because the mapping is only defined at discrete points (given by the reference vectors). In this paper we consider the batch version of SOM, for which a continuous mapping can be defined, as well as the Generative Topographic Mapping (GTM) algorithm of Bishop et al. (1997), which has been introduced as a probabilistic formulation of the SOM. We show how the techniques of differential geometry can be used to determine magnification factors as continuous functions of the latent space coordinates. The results are illustrated here using a problem involving the identification of crab species from morphological data.
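
    The area magnification defined above can be illustrated numerically. For a smooth mapping from latent space to data space with Jacobian J, the local area (or volume) scaling is sqrt(det(JᵀJ)). The sketch below estimates J by central finite differences for an arbitrary mapping; the paper works with the analytic derivatives of the model, so this generic routine, and the name `magnification_factor`, are assumptions made for illustration only.

```python
import numpy as np

def magnification_factor(mapping, x, eps=1e-6):
    """Local area scaling sqrt(det(J^T J)) of `mapping` at latent point `x`.

    `mapping` takes a latent vector of length L and returns a data vector
    of length D >= L. The Jacobian is estimated by central differences.
    """
    x = np.asarray(x, dtype=float)
    columns = []
    for i in range(x.size):
        step = np.zeros_like(x)
        step[i] = eps
        forward = np.asarray(mapping(x + step), dtype=float)
        backward = np.asarray(mapping(x - step), dtype=float)
        columns.append((forward - backward) / (2.0 * eps))
    J = np.stack(columns, axis=1)  # shape (D, L)
    return float(np.sqrt(np.linalg.det(J.T @ J)))

# Usage: a linear map that stretches a 2-D latent plane by 2 and 3 inside 3-D space
# magnifies every small patch by a factor of about 6.
A = np.array([[2.0, 0.0], [0.0, 3.0], [0.0, 0.0]])
print(magnification_factor(lambda z: A @ z, [0.3, -0.1]))
```

    Evaluating this quantity on a grid of latent points gives the kind of continuous magnification map that the abstract describes as useful for visualization.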