10,724 research outputs found
Pruning Error Minimization in Least Squares Support Vector Machines
The support vector machine (SVM) is a method for classification and for function approximation. This method commonly makes use of an /spl epsi/-insensitive cost function, meaning that errors smaller than /spl epsi/ remain unpunished. As an alternative, a least squares support vector machine (LSSVM) uses a quadratic cost function. When the LSSVM method is used for function approximation, a nonsparse solution is obtained. The sparseness is imposed by pruning, i.e., recursively solving the approximation problem and subsequently omitting data that has a small error in the previous pass. However, omitting data with a small approximation error in the previous pass does not reliably predict what the error will be after the sample has been omitted. In this paper, a procedure is introduced that selects from a data set the training sample that will introduce the smallest approximation error when it will be omitted. It is shown that this pruning scheme outperforms the standard one
Modeling Financial Time Series with Artificial Neural Networks
Financial time series convey the decisions and actions of a population of human actors over time. Econometric and regressive models have been developed in the past decades for analyzing these time series. More recently, biologically inspired artificial neural network models have been shown to overcome some of the main challenges of traditional techniques by better exploiting the non-linear, non-stationary, and oscillatory nature of noisy, chaotic human interactions. This review paper explores the options, benefits, and weaknesses of the various forms of artificial neural networks as compared with regression techniques in the field of financial time series analysis.CELEST, a National Science Foundation Science of Learning Center (SBE-0354378); SyNAPSE program of the Defense Advanced Research Project Agency (HR001109-03-0001
Survey of data mining approaches to user modeling for adaptive hypermedia
The ability of an adaptive hypermedia system to create tailored environments depends mainly on the amount and accuracy of information stored in each user model. Some of the difficulties that user modeling faces are the amount of data available to create user models, the adequacy of the data, the noise within that data, and the necessity of capturing the imprecise nature of human behavior. Data mining and machine learning techniques have the ability to handle large amounts of data and to process uncertainty. These characteristics make these techniques suitable for automatic generation of user models that simulate human decision making. This paper surveys different data mining techniques that can be used to efficiently and accurately capture user behavior. The paper also presents guidelines that show which techniques may be used more efficiently according to the task implemented by the applicatio
Analyzing sparse dictionaries for online learning with kernels
Many signal processing and machine learning methods share essentially the
same linear-in-the-parameter model, with as many parameters as available
samples as in kernel-based machines. Sparse approximation is essential in many
disciplines, with new challenges emerging in online learning with kernels. To
this end, several sparsity measures have been proposed in the literature to
quantify sparse dictionaries and constructing relevant ones, the most prolific
ones being the distance, the approximation, the coherence and the Babel
measures. In this paper, we analyze sparse dictionaries based on these
measures. By conducting an eigenvalue analysis, we show that these sparsity
measures share many properties, including the linear independence condition and
inducing a well-posed optimization problem. Furthermore, we prove that there
exists a quasi-isometry between the parameter (i.e., dual) space and the
dictionary's induced feature space.Comment: 10 page
Performance and optimization of support vector machines in high-energy physics classification problems
In this paper we promote the use of Support Vector Machines (SVM) as a
machine learning tool for searches in high-energy physics. As an example for a
new- physics search we discuss the popular case of Supersymmetry at the Large
Hadron Collider. We demonstrate that the SVM is a valuable tool and show that
an automated discovery- significance based optimization of the SVM
hyper-parameters is a highly efficient way to prepare an SVM for such
applications. A new C++ LIBSVM interface called SVM-HINT is developed and
available on Github.Comment: 20 pages, 6 figure
Entropy of Overcomplete Kernel Dictionaries
In signal analysis and synthesis, linear approximation theory considers a
linear decomposition of any given signal in a set of atoms, collected into a
so-called dictionary. Relevant sparse representations are obtained by relaxing
the orthogonality condition of the atoms, yielding overcomplete dictionaries
with an extended number of atoms. More generally than the linear decomposition,
overcomplete kernel dictionaries provide an elegant nonlinear extension by
defining the atoms through a mapping kernel function (e.g., the gaussian
kernel). Models based on such kernel dictionaries are used in neural networks,
gaussian processes and online learning with kernels.
The quality of an overcomplete dictionary is evaluated with a diversity
measure the distance, the approximation, the coherence and the Babel measures.
In this paper, we develop a framework to examine overcomplete kernel
dictionaries with the entropy from information theory. Indeed, a higher value
of the entropy is associated to a further uniform spread of the atoms over the
space. For each of the aforementioned diversity measures, we derive lower
bounds on the entropy. Several definitions of the entropy are examined, with an
extensive analysis in both the input space and the mapped feature space.Comment: 10 page
- …