460,986 research outputs found
Non-linear dimensionality reduction techniques for classification
This thesis project concerns dimensionality reduction through
manifold learning, with a focus on non-linear techniques.
Dimension Reduction (DR) is the process of reducing a high-dimensional
dataset with d features (dimensions) to one with a lower number of features p (p ≪ d) that preserves the information contained in the original
higher-dimensional space. More generally, the concept of manifold
learning is introduced, a generalized approach that encompasses algorithms
for dimensionality reduction.
Manifold learning can be divided into two main categories: linear and
non-linear methods. Although linear methods, such as Principal
Component Analysis (PCA) and Multidimensional Scaling (MDS), are
widely used and well known, there are plenty of non-linear techniques,
e.g. Isometric Feature Mapping (Isomap), Locally Linear Embedding
(LLE) and Local Tangent Space Alignment (LTSA), which in recent years
have been the subject of many studies.
This project is inspired by the work of [Bahadur et al., 2017],
with the aim of estimating the dimensionality of the US market using the Russell
3000 as a proxy for the financial market.
Since financial markets are high-dimensional and complex environments,
an approach using non-linear techniques alongside linear ones is proposed.
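The linear and non-linear techniques the abstract names all have off-the-shelf implementations; a minimal comparison sketch on synthetic data (scikit-learn assumed; the thesis applies these methods to Russell 3000 return data, which is not reproduced here):

```python
# Compare linear (PCA) and non-linear (Isomap, LLE, LTSA) dimension
# reduction on a synthetic 3-d manifold; illustrative only -- a stand-in
# for the financial data used in the thesis.
from sklearn.datasets import make_s_curve
from sklearn.decomposition import PCA
from sklearn.manifold import Isomap, LocallyLinearEmbedding

X, _ = make_s_curve(n_samples=500, random_state=0)  # d = 3

reducers = {
    "PCA": PCA(n_components=2),
    "Isomap": Isomap(n_neighbors=10, n_components=2),
    "LLE": LocallyLinearEmbedding(n_neighbors=10, n_components=2),
    "LTSA": LocallyLinearEmbedding(n_neighbors=10, n_components=2,
                                   method="ltsa"),
}
for name, reducer in reducers.items():
    Y = reducer.fit_transform(X)  # p = 2 embedding, p << d
    print(name, Y.shape)
```

Each reducer maps the same 500 x 3 cloud to a 500 x 2 embedding; only the non-linear methods recover the unrolled S-curve geometry.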
Non-linear dimension reduction in factor-augmented vector autoregressions
This paper introduces non-linear dimension reduction in factor-augmented
vector autoregressions to analyze the effects of different economic shocks. I
argue that controlling for non-linearities between a large-dimensional dataset
and the latent factors is particularly useful during turbulent times of the
business cycle. In simulations, I show that non-linear dimension reduction
techniques yield good forecasting performance, especially when data is highly
volatile. In an empirical application, I identify a monetary policy shock as well as
an uncertainty shock, both excluding and including observations from the COVID-19
pandemic. These two applications suggest that the non-linear FAVAR approaches
are capable of dealing with the large outliers caused by the COVID-19 pandemic
and yield reliable results in both scenarios.
Comment: JEL: C11, C32, C40, C55, E37. Keywords: dimension reduction, machine
learning, non-linear factor-augmented vector autoregression, monetary policy
shock, uncertainty shock, impulse response analysis, COVID-19
Training Process Reduction Based On Potential Weights Linear Analysis To Accelerate Back Propagation Network
Learning is an important property of the Back Propagation Network (BPN):
finding suitable weights and thresholds during training improves
training time and achieves high accuracy. Currently, data pre-processing
techniques such as dimension reduction of the input values and pre-training are the
main contributing factors in developing efficient methods for reducing training
time while maintaining high accuracy. Initialization of the weights is an important
issue: random initialization creates paradoxes and leads to low accuracy with high
training time. One good data pre-processing technique for accelerating BPN
classification is dimension reduction, but it suffers from the problem of missing
data. In this paper, we study current pre-training techniques and a new
pre-processing technique called Potential Weight Linear Analysis (PWLA), which
combines normalization, dimension reduction of the input values and pre-training.
In PWLA, data pre-processing is first performed to generate normalized input
values, which are then passed to a pre-training technique in order to obtain the
potential weights. After these phases, the dimension of the input value matrix is
reduced using the real potential weights. For the experiments, the XOR problem and
three datasets, SPECT Heart, SPECTF Heart and Liver Disorders (BUPA),
are evaluated. Our results show that the new PWLA technique
turns the BPN into a new Supervised Multi Layer Feed Forward Neural Network
(SMFFNN) model with high accuracy in one epoch, without a training cycle. PWLA
also acts as a non-linear supervised and unsupervised
dimension reduction technique that other supervised multi-layer feed-forward
neural network models could apply in future work.
Comment: 11 pages, IEEE format, International Journal of Computer Science and
Information Security, IJCSIS 2009, ISSN 1947-5500, impact factor 0.42
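PWLA itself has no public implementation; the following only mirrors its three-step shape (normalize, reduce input dimension, then feed a feed-forward network) using standard components on a synthetic 44-feature stand-in for SPECTF-style data (scikit-learn assumed):

```python
# Normalize -> reduce dimension -> feed-forward network, the pipeline
# shape PWLA proposes. StandardScaler/PCA/MLPClassifier are stand-ins,
# not the paper's method, and the dataset is synthetic.
from sklearn.datasets import make_classification
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.decomposition import PCA
from sklearn.neural_network import MLPClassifier

X, y = make_classification(n_samples=300, n_features=44, n_informative=8,
                           random_state=0)   # 44 features, like SPECTF Heart
clf = make_pipeline(
    StandardScaler(),                        # normalization step
    PCA(n_components=10),                    # dimension reduction of inputs
    MLPClassifier(hidden_layer_sizes=(16,), max_iter=1000, random_state=0),
)
clf.fit(X, y)
print(round(clf.score(X, y), 2))             # training accuracy
```

Unlike PWLA, this pipeline still needs iterative training; the paper's claim is precisely that its potential weights remove that training cycle.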
Non-linear, Sparse Dimensionality Reduction via Path Lasso Penalized Autoencoders
High-dimensional data sets are often analyzed and explored via the
construction of a latent low-dimensional space which enables convenient
visualization and efficient predictive modeling or clustering. For complex data
structures, linear dimensionality reduction techniques like PCA may not be
sufficiently flexible to enable low-dimensional representation. Non-linear
dimension reduction techniques, like kernel PCA and autoencoders, suffer from
loss of interpretability, since each latent variable depends on all input
dimensions. To address this limitation, we here present path lasso penalized
autoencoders. This structured regularization enhances interpretability by
penalizing each path through the encoder from an input to a latent variable,
thus restricting how many input variables are represented in each latent
dimension. Our algorithm uses a group lasso penalty and non-negative matrix
factorization to construct a sparse, non-linear latent representation. We
compare the path lasso regularized autoencoder to PCA, sparse PCA, autoencoders
and sparse autoencoders on real and simulated data sets. We show that the
algorithm exhibits much lower reconstruction errors than sparse PCA and
parameter-wise lasso regularized autoencoders for low-dimensional
representations. Moreover, path lasso representations provide a more accurate
reconstruction match, i.e. better-preserved relative distances between objects in the
original and reconstructed spaces.
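The path-level penalty can be illustrated for a one-hidden-layer encoder. This is only a sketch of the penalty term as I read the abstract, not the paper's algorithm, which also involves group-lasso optimization and a non-negative matrix factorization step:

```python
# Sketch of the "path lasso" penalty idea for an encoder x -> h -> z:
# the paths from input i to latent j pass through every hidden unit, and
# a group (L2) norm is taken over each such bundle of paths, so entire
# input-to-latent connections can be switched off at once.
import numpy as np

def path_lasso_penalty(W1, W2):
    """W1: (n_in, n_hidden), W2: (n_hidden, n_latent).
    Returns sum over (i, j) of the L2 norm of the path weights
    W1[i, h] * W2[h, j] for h = 0..n_hidden-1."""
    # paths[i, h, j] = weight of the path input i -> hidden h -> latent j
    paths = W1[:, :, None] * W2[None, :, :]
    return np.sqrt((paths ** 2).sum(axis=1)).sum()

rng = np.random.default_rng(0)
W1 = rng.standard_normal((10, 8))   # encoder layer 1
W2 = rng.standard_normal((8, 2))    # encoder layer 2
print(round(path_lasso_penalty(W1, W2), 3))
```

Driving a whole (i, j) group to zero removes input i from latent dimension j, which is the sparsity pattern the abstract contrasts with parameter-wise lasso.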
Effectiveness of landmark analysis for establishing locality in p2p networks
Locality to other nodes on a peer-to-peer overlay network can be established by means of a set of landmarks shared among the participating nodes. Each node independently collects a set of latency measures to the landmark nodes, which are used as a multi-dimensional feature vector. Each peer node uses the feature vector to generate a unique scalar index which is correlated to its topological locality. A popular dimensionality reduction technique is the space-filling Hilbert's curve, as it possesses good locality-preserving properties. However, there exists little comparison between Hilbert's curve and other techniques for dimensionality reduction. This work carries out a quantitative analysis of their properties. Linear and non-linear techniques for scaling the landmark vectors to a single dimension are investigated. Hilbert's curve, Sammon's mapping and Principal Component Analysis
have been used to generate a 1d space with locality-preserving properties. This work provides empirical evidence to support the use of Hilbert's curve in the context of locality preservation when generating peer identifiers by means of landmark vector analysis. A comparative analysis is carried out with an artificial 2d network model and with a realistic network topology model
with a typical power-law distribution of node connectivity in the Internet. Nearest-neighbour analysis confirms Hilbert's curve to be very effective in both artificial and realistic network topologies. Nevertheless, the results in the realistic network model show that there is scope for improvement, and better techniques to preserve locality information are required.
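The Hilbert-curve scalarization step is easy to sketch: quantize a landmark latency vector onto a 2^k x 2^k grid and map the cell to its position along the curve. This is the classic iterative xy-to-distance conversion, shown here in 2-d, whereas real landmark vectors have one dimension per landmark:

```python
# Map a 2-d grid point to its index along a Hilbert curve over an n x n
# grid (n a power of two). Nearby points tend to receive nearby indices,
# which is the locality-preserving property the paper evaluates for
# generating peer identifiers.
def hilbert_index(n, x, y):
    d = 0
    s = n // 2
    while s > 0:
        rx = 1 if x & s else 0
        ry = 1 if y & s else 0
        d += s * s * ((3 * rx) ^ ry)
        if ry == 0:                      # rotate the quadrant so the
            if rx == 1:                  # curve pattern recurses correctly
                x, y = s - 1 - x, s - 1 - y
            x, y = y, x
        s //= 2
    return d

# two peers with similar landmark latencies land on nearby curve indices
print(hilbert_index(256, 10, 12), hilbert_index(256, 11, 12))
```

The mapping is a bijection between grid cells and curve positions, so the resulting scalar can serve directly as a peer identifier.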
Methods for Estimation of Intrinsic Dimensionality
Dimension reduction is an important tool used to describe the structure of complex data (explicitly or implicitly) through a small but sufficient number of variables, and
thereby make data analysis more efficient. It is also useful for visualization purposes. Dimension reduction helps statisticians to overcome the ‘curse of dimensionality’. However, most dimension reduction techniques require the intrinsic dimension of the low-dimensional subspace to be fixed in advance.
The availability of reliable intrinsic dimension (ID) estimation techniques is of major importance. The main goal of this thesis is to develop algorithms for determining the intrinsic dimensions of recorded data sets in a nonlinear context. Whilst this is a well-researched topic for linear subspaces, based mainly on principal component analysis, relatively little attention has been paid to ways of estimating this number for non-linear variable interrelationships. The proposed algorithms here are based on existing concepts that can be categorized into local methods, relying on randomly selected subsets of a recorded variable set, and global methods, utilizing the entire data set.
This thesis provides an overview of ID estimation techniques, with special consideration given to recent developments in non-linear techniques, such as manifold
charting and fractal-based methods. Although these techniques nominally exist, their practical implementation is far from straightforward.
The intrinsic dimension is estimated via Brand’s algorithm by examining the growth point process, which counts the number of points in hyper-spheres. The estimation needs to determine the starting point for each hyper-sphere. In this thesis we provide settings for selecting starting points which work well for most data sets. Additionally we propose approaches for estimating dimensionality via Brand’s algorithm, the Dip method and the Regression method.
Other approaches are proposed for estimating the intrinsic dimension by fractal dimension estimation methods, which exploit the intrinsic geometry of a data set. The most popular concept from this family of methods is the correlation dimension, which requires the estimation of the correlation integral for a ball of radius tending to
0. In this thesis we propose new approaches to approximating the correlation integral in this limit: the Intercept method, the Slope method and the Polynomial method.
In addition we propose a new approach, a localized global method, which can be defined as a local version of the global ID methods. The objective of the localized global approach is to improve on algorithms based on local ID methods, which could significantly reduce their negative bias.
Experimental results on real-world and simulated data are used to demonstrate the algorithms and compare them to other methodology. A simulation study which verifies the effectiveness of the proposed methods is also provided. Finally, these algorithms are contrasted using a recorded data set from an industrial melter process.
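The correlation-dimension idea is easy to demonstrate: the correlation integral C(r), the fraction of point pairs within distance r, scales as r^D for small r, so the slope of log C against log r estimates the intrinsic dimension D. Below is a Grassberger–Procaccia-style sketch, not the thesis's Intercept/Slope/Polynomial refinements:

```python
# Estimate the intrinsic dimension of points lying on a 2-d plane
# embedded in R^3, via the slope of log C(r) vs log r at small radii.
import numpy as np

rng = np.random.default_rng(1)
Z = rng.uniform(size=(800, 2))          # intrinsic coordinates, D = 2
X = Z @ rng.standard_normal((2, 3))     # linear embedding into R^3

# all pairwise distances (upper triangle only)
diff = X[:, None, :] - X[None, :, :]
dists = np.sqrt((diff ** 2).sum(-1))[np.triu_indices(len(X), k=1)]

radii = np.quantile(dists, np.linspace(0.02, 0.10, 8))  # small-r regime
C = np.array([(dists < r).mean() for r in radii])       # correlation integral
slope, _ = np.polyfit(np.log(radii), np.log(C), 1)
print(round(slope, 2))                  # close to the intrinsic dimension 2
```

Boundary effects bias the slope slightly downward, which is exactly the kind of small-radius behaviour the thesis's refined estimators target.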
Real-time Inflation Forecasting Using Non-linear Dimension Reduction Techniques
In this paper, we assess whether using non-linear dimension reduction techniques pays off for forecasting inflation in real time. Several recent methods from the machine learning literature are adopted to map a large-dimensional dataset into a lower-dimensional set of latent factors. We model the relationship between inflation and these latent factors using state-of-the-art time-varying parameter (TVP) regressions with shrinkage priors. Using monthly real-time data for the US, our results suggest that adding such non-linearities yields forecasts that are on average highly competitive with those obtained from methods using linear dimension reduction techniques. Zooming into model performance over time moreover reveals that controlling for non-linear relations in the data is of particular importance during recessionary episodes of the business cycle.
Real-time inflation forecasting using non-linear dimension reduction techniques
In this paper, we assess whether using non-linear dimension reduction techniques pays off for forecasting inflation in real time. Several recent methods from the machine learning literature are adopted to map a large-dimensional dataset into a lower-dimensional set of latent factors. We model the relationship between inflation and the latent factors using constant and time-varying parameter (TVP) regressions with shrinkage priors. Our models are then used to forecast monthly US inflation in real time. The results suggest that sophisticated dimension reduction methods yield inflation forecasts that are highly competitive with linear approaches based on principal components. Among the techniques considered, the autoencoder and squared principal components yield factors with high predictive power for one-month- and one-quarter-ahead inflation. Zooming into model performance over time reveals that controlling for non-linear relations in the data is of particular importance during recessionary episodes of the business cycle or the current COVID-19 pandemic.
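Of the techniques listed, squared principal components is the simplest to sketch: augment the extracted factors with their squares before the forecasting regression. Everything below is simulated and uses plain OLS rather than the paper's TVP regressions with shrinkage priors (scikit-learn assumed):

```python
# Squared-principal-components sketch: PCs plus their squares as
# predictors for next-month "inflation". Panel and target are simulated.
import numpy as np
from sklearn.decomposition import PCA
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)
T, N = 240, 60
panel = rng.standard_normal((T, N))            # stand-in macro panel
f = PCA(n_components=4).fit_transform(panel)   # latent factors

# toy target with a genuinely non-linear (squared) dependence on factor 1
y_next = (0.5 * f[:-1, 0] + 0.3 * f[:-1, 0] ** 2
          + 0.1 * rng.standard_normal(T - 1))

Z = np.column_stack([f, f ** 2])[:-1]          # linear + squared factors
fit = LinearRegression().fit(Z, y_next)
print(round(fit.score(Z, y_next), 2))          # in-sample R^2, near 1 here
```

Because the squared factor enters the predictor set, a linear regression captures the non-linear dependence; with factors alone, the quadratic component would be missed.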