Parallel multiclass stochastic gradient descent algorithms for classifying million images with very-high-dimensional signatures into thousands classes

Author: Thanh-Nghi Do
Publication venue: Springer Nature
Publication date: 01/01/2014

Automatic Modulation Recognition for MFSK Using Modified Covariance Method

Author: M.Hamee Hanan
Wadi Jafer
Publication venue: 'Institute of Advanced Engineering and Science'
Publication date: 01/06/2015

This paper presents a modulation classification method capable of classifying MFSK digital signals without a priori information, using a modified covariance method. The features this method computes for FSK modulation should be sensitive to the FSK modulation index and insensitive to variation in the signal-to-noise ratio (SNR). Numerical simulations investigate the performance using a one-against-all support vector machine (SVM-OAA) as the classifier for 6 digitally modulated signals, which gives a probability of correct classification of up to 85.85 at SNR = -15 dB.
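As a rough illustration of the one-against-all (OAA) decision rule named in the abstract, the sketch below assumes each class already has a trained linear scorer and simply picks the class whose scorer responds most strongly. The function name `predict_oaa` and the example weights are hypothetical; the paper's features and trained SVMs are not reproduced here.

```python
from typing import Sequence


def predict_oaa(x: Sequence[float],
                weights: Sequence[Sequence[float]],
                biases: Sequence[float]) -> int:
    """Return the index of the class whose one-vs-all linear score is largest."""
    scores = [
        sum(w_k * x_k for w_k, x_k in zip(w, x)) + b
        for w, b in zip(weights, biases)
    ]
    # Ties are broken in favour of the lowest class index.
    return max(range(len(scores)), key=scores.__getitem__)


# Hypothetical 3-class problem with 2 features (illustrative values only).
WEIGHTS = [[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0]]
BIASES = [0.0, 0.0, 0.0]
```

With K classes, this scheme needs K binary classifiers, one per class against all the others.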

IAES journal

Crossref

Institute of Advanced Engineering and Science

Generalization Bounds for Compressed Learning with Hard Support Vector Machines, and Multiclass Learning with Error Correcting Output Codes

Author: McVay Paul Robert
Publication date: 27/04/2021

This dissertation covers two topics. The first is compressed learning with hard SVM. We characterize the conditions under which the separability assumption holds after compression. Using these results, we give an upper bound on the compression ratio that maintains separability in the compressed domain. Furthermore, we provide theoretical results to show how the generalization bound changes with respect to the compression ratio used. These results allow for theoretical justifications in choosing the best compression matrix given the particular design parameters at hand. Additionally, as required for the analysis presented, we extend the existing hard-SVM bounds to the case when a bias term is allowed. The second topic addresses the design of error correcting output codes for multiclass classification. Our algorithm optimizes the encoder for a channel-code-based coding matrix to ensure the maximum minimum distance of the coding matrix. The optimization procedure uses the properties of the code to run extremely fast, in $\mathcal{O}(k \log k)$. We demonstrate the need for the optimal minimum distance for the coding matrix by proving a generalization bound for both hard and soft decoding. These bounds beat the previously published tight asymptotic growth rate with respect to $k$. Finally, we present empirical results to validate our approach.
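To make the role of minimum distance concrete, here is a minimal sketch of hard (Hamming) decoding with an error correcting output code. It is not the dissertation's encoder optimization. The coding matrix `CODE` is an illustrative choice (four codewords of a length-7 simplex code), and the function names are hypothetical.

```python
from itertools import combinations
from typing import Sequence

# One row per class, one column per binary classifier (illustrative matrix).
CODE = [
    (0, 0, 0, 0, 0, 0, 0),
    (1, 0, 1, 0, 1, 0, 1),
    (0, 1, 1, 0, 0, 1, 1),
    (1, 1, 0, 0, 1, 1, 0),
]


def hamming(a: Sequence[int], b: Sequence[int]) -> int:
    """Number of positions in which two codewords differ."""
    return sum(x != y for x, y in zip(a, b))


def min_distance(code: Sequence[Sequence[int]]) -> int:
    """Smallest pairwise Hamming distance between rows of the coding matrix."""
    return min(hamming(r1, r2) for r1, r2 in combinations(code, 2))


def decode_hard(bits: Sequence[int], code: Sequence[Sequence[int]]) -> int:
    """Hard decoding: return the class whose codeword is nearest to the bits."""
    return min(range(len(code)), key=lambda i: hamming(bits, code[i]))
```

A code with minimum distance d lets hard decoding recover the right class when fewer than d/2 binary classifiers err, which is why a larger minimum distance is desirable.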

Texas A&M Repository

Restricting Supervised Learning: Feature Selection and Feature Space Partition

Author: Nan Xiaofei
Publication venue: eGrove
Publication date: 01/01/2012

Many supervised learning problems are considered difficult to solve either because of redundant features or because of the structural complexity of the generative function. Redundant features increase the learning noise and therefore decrease the prediction performance. Additionally, a number of problems in applications such as bioinformatics or image processing, whose data are sampled in a high-dimensional space, suffer from the curse of dimensionality, and there are not enough observations to obtain good estimates. Therefore, it is necessary to reduce the number of features under consideration. Another issue in supervised learning is caused by the complexity of an unknown generative model. To obtain a low-variance predictor, linear or other simple functions are normally suggested, but they usually result in high bias. Hence, a possible solution is to partition the feature space into multiple non-overlapping regions such that each region is simple enough to be classified easily. In this dissertation, we propose several novel techniques for restricting supervised learning problems with respect to either feature selection or feature space partition. Among different feature selection methods, 1-norm regularization is advocated by many researchers because it incorporates feature selection as part of the learning process. We give special focus here to ranking problems because very little work has been done on ranking using an L1 penalty. We present a 1-norm support vector machine method that simultaneously finds a linear ranking function and performs feature subset selection in ranking problems. Additionally, because ranking is formulated as a classification task when pair-wise data are considered, the computational complexity increases from linear to quadratic in the sample size. We also propose a convex hull reduction method to reduce this impact.
The method was tested on one artificial data set and two benchmark real data sets, the concrete compressive strength set and the Abalone data set. Theoretically, by tuning the trade-off parameter between the 1-norm penalty and the empirical error, any desired size of feature subset could be achieved, but computing the whole solution path in terms of the trade-off parameter is extremely difficult. Therefore, using 1-norm regularization alone may not end up with a feature subset of small size. We propose a recursive feature selection method based on 1-norm regularization which can handle the multi-class setting effectively and efficiently. The selection is performed iteratively. In each iteration, a linear multi-class classifier is trained using 1-norm regularization, which leads to sparse weight vectors, i.e., many feature weights are exactly zero. Those zero-weight features are eliminated in the next iteration. The selection process has a fast rate of convergence. We tested our method on an earthworm microarray data set, and the empirical results demonstrate that the selected features (genes) have very competitive discriminative power. Feature space partition separates a complex learning problem into multiple non-overlapping simple sub-problems. It is normally implemented in a hierarchical fashion. Unlike in a decision tree, a leaf node of this hierarchical structure does not represent a single decision, but a region (sub-problem) that is solvable with linear or other simple functions. In our work, we incorporate domain knowledge in the feature space partition process. We consider domain information encoded by discrete or categorical attributes. A discrete or categorical attribute provides a natural partition of the problem domain, and hence divides the original problem into several non-overlapping sub-problems. In this sense, the domain information is useful if the partition simplifies the learning task.
However, it is not trivial to select the discrete or categorical attribute that maximally simplifies the learning task. A naive approach exhaustively searches all the possible restructured problems, which is computationally prohibitive when the number of discrete or categorical attributes is large. We describe a metric to rank attributes according to their potential to reduce the uncertainty of a classification task. It is quantified as a conditional entropy achieved using a set of optimal classifiers, each of which is built for a sub-problem defined by the attribute under consideration. To avoid high computational cost, we approximate the solution by the expected minimum conditional entropy with respect to random projections. This approach was tested on three artificial data sets, three cheminformatics data sets, and two leukemia gene expression data sets. Empirical results demonstrate that our method is capable of selecting a proper discrete or categorical attribute to simplify the problem, i.e., the performance of the classifier built for the restructured problem always beats that of the original problem. Restricting supervised learning is always about building simple learning functions using a limited number of features. The Top Selected Pair (TSP) method builds simple classifiers based on very few (for example, two) features with simple arithmetic calculation. However, the traditional TSP method only deals with static data. In this dissertation, we propose classification methods for time series data that depend on only a few pairs of features. Based on different comparison strategies, we developed the following approaches: TSP based on average, TSP based on trend, and TSP based on trend and absolute difference amount. In addition, inspired by the idea of using two features, we propose a time series classification method based on a few feature pairs using dynamic time warping and nearest neighbor.
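The pair-based rule mentioned at the end of the abstract can be sketched in a few lines. This is only an assumed reading: the static rule compares two chosen features directly, and the "average" variant is taken here to compare the time-averages of the two feature series. The function names and the way the pair (i, j) is chosen are hypothetical, and the dissertation's actual definitions may differ.

```python
from statistics import mean
from typing import Sequence


def tsp_predict(sample: Sequence[float], i: int, j: int) -> int:
    """Static pair rule: predict class 1 if feature i < feature j, else class 0."""
    return 1 if sample[i] < sample[j] else 0


def tsp_average_predict(series: Sequence[Sequence[float]], i: int, j: int) -> int:
    """Assumed time-series variant: compare the time-averages of two features.

    series[f] holds the values of feature f over time.
    """
    return 1 if mean(series[i]) < mean(series[j]) else 0
```

In practice the pair (i, j) would be selected on training data, for example by scoring how well the comparison separates the two classes.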

eGrove (Univ. of Mississippi)

Learning from data with uncertainty: Robust multiclass kernel-based classifiers and regressors.

Author: Santosa Budi.
Publication date: 01/01/2005

Motivated by the presence of uncertainty in real data, in this research we investigate a robust optimization approach applied to multiclass support vector machines (SVMs) and support vector regression. Two new kernel-based methods are developed to address data with uncertainty, where each data point lies inside a sphere of uncertainty. For classification problems, the models are called robust SVM (R-SVM) and robust feasibility approach (R-FA), respectively, as extensions of the SVM approach. The two models are compared in terms of robustness and generalization error. For comparison purposes, the robust minimax probability machine (MPM) is applied and compared with the above methods. From the empirical results, we conclude that the R-SVM performs better than the robust MPM. For regression problems, the models are called robust support vector regression (R-SVR) and robust feasibility approach for regression (R-FAR). The proposed robust methods can improve the mean square error (MSE) in regression problems.
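For intuition about the "sphere of uncertainty", consider the linear binary case as a simplified illustration; the notation (w, b, r_i, δ) is assumed here, and the kernel-based multiclass models in the work may be formulated differently. If the true point may lie anywhere within radius r_i of the observed x_i, the worst-case margin follows from the Cauchy-Schwarz inequality:

```latex
\min_{\|\delta\|_2 \le r_i} \; y_i\bigl(w^\top (x_i + \delta) + b\bigr)
  \;=\; y_i\bigl(w^\top x_i + b\bigr) - r_i \,\|w\|_2 ,
\qquad y_i \in \{-1, +1\}.
```

Requiring this worst-case margin to be at least 1 therefore tightens the usual SVM constraint by the term r_i‖w‖₂.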

SHAREOK repository

MaxMin-L2-SVC-NCH: A New Method to Train Support Vector Classifier with the Selection of Model's Parameters

Author: Chen Ziyang
Luo Linkai
Peng Hong
Wang Yiding
Yang Qiaoling
Publication date: 14/07/2023

The selection of model parameters plays an important role in the application of support vector classification (SVC). The commonly used method of selecting model parameters is k-fold cross validation with grid search (CV). It is extremely time-consuming because it needs to train a large number of SVC models. In this paper, a new method is proposed to train SVC together with the selection of model parameters. First, training SVC with the selection of model parameters is modeled as a minimax optimization problem (MaxMin-L2-SVC-NCH), in which the minimization problem is an optimization problem of finding the closest points between two normal convex hulls (L2-SVC-NCH), while the maximization problem is an optimization problem of finding the optimal model parameters. A lower time complexity can be expected in MaxMin-L2-SVC-NCH because CV is abandoned. A gradient-based algorithm is then proposed to solve MaxMin-L2-SVC-NCH, in which L2-SVC-NCH is solved by a projected gradient algorithm (PGA), while the maximization problem is solved by a gradient ascent algorithm with a dynamic learning rate. To demonstrate the advantages of the PGA in solving L2-SVC-NCH, we compare the PGA with the well-known sequential minimal optimization (SMO) algorithm, after providing an SMO algorithm and some KKT conditions for L2-SVC-NCH. It is revealed that the SMO algorithm is a special case of the PGA. Thus, the PGA can provide more flexibility. The comparative experiments between MaxMin-L2-SVC-NCH and the classical parameter selection models on public datasets show that MaxMin-L2-SVC-NCH greatly reduces the number of models to be trained without losing test accuracy relative to the classical models. This indicates that MaxMin-L2-SVC-NCH performs better than the other models. We strongly recommend MaxMin-L2-SVC-NCH as a preferred model for SVC tasks.

arXiv.org e-Print Archive

The Numerical Stability of Hyperbolic Representation Learning

Author: Mishne Gal
Wan Zhengchao
Wang Yusu
Yang Sheng
Publication date: 27/06/2023

Given the exponential growth of the volume of the ball w.r.t. its radius, the hyperbolic space is capable of embedding trees with arbitrarily small distortion and hence has received wide attention for representing hierarchical datasets. However, this exponential growth property comes at the price of numerical instability, such that training hyperbolic learning models will sometimes lead to catastrophic NaN problems, encountering unrepresentable values in floating point arithmetic. In this work, we carefully analyze the limitations of two popular models for the hyperbolic space, namely, the Poincaré ball and the Lorentz model. We first show that, under the 64-bit arithmetic system, the Poincaré ball has a relatively larger capacity than the Lorentz model for correctly representing points. Then, we theoretically validate the superiority of the Lorentz model over the Poincaré ball from the perspective of optimization. Given the numerical limitations of both models, we identify one Euclidean parametrization of the hyperbolic space which can alleviate these limitations. We further extend this Euclidean parametrization to hyperbolic hyperplanes and exhibit its ability to improve the performance of hyperbolic SVM.
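The kind of floating point failure described above can be reproduced with a small sketch. It assumes the unit Poincaré ball with curvature -1, where a point at hyperbolic distance r from the origin has Euclidean norm tanh(r/2). The function names are hypothetical and this is not the paper's analysis; it only shows that a far-away point rounds onto the boundary in 64-bit arithmetic.

```python
import math


def poincare_norm(r: float) -> float:
    """Euclidean norm of a Poincaré-ball point at hyperbolic distance r from the origin."""
    return math.tanh(r / 2.0)


def distance_from_origin(norm: float) -> float:
    """Inverse map: hyperbolic distance of a point with the given Euclidean norm."""
    return 2.0 * math.atanh(norm)


def is_representable(r: float) -> bool:
    """True if the point still lies strictly inside the ball in float64."""
    return poincare_norm(r) < 1.0


# A nearby point round-trips; a distant one collapses onto the boundary,
# where math.atanh raises ValueError (other libraries may return inf or NaN).
near = distance_from_origin(poincare_norm(10.0))
far_ok = is_representable(100.0)
```

Here `near` recovers a value close to 10, while `far_ok` is false because tanh(50) rounds to exactly 1.0 in double precision.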

arXiv.org e-Print Archive