15 research outputs found

Robust linear discriminant rules with coordinatewise and distance based approaches

Author: Lim Yai Fung
Publication date: 01/01/2020

Linear discriminant analysis (LDA) is a supervised classification technique that deals with the relationship between a categorical variable and a set of continuous variables. The main objective of LDA is to create a function that distinguishes between groups and allocates future observations to previously defined groups. Under the assumptions of normality and homoscedasticity, LDA yields the optimal linear discriminant rule (LDR) between two or more groups. However, the optimality of LDA relies heavily on the sample mean and sample covariance matrix, which are known to be sensitive to outliers. To mitigate these problems, robust location and scale estimators based on coordinatewise and distance-based approaches were applied to construct new robust LDA. These robust estimators replace the classical sample mean and sample covariance to form robust linear discriminant rules (RLDR). A total of six RLDR, namely four coordinatewise (RLDRM, RLDRMw, RLDRW, RLDRWw) and two distance-based (RLDRV, RLDRT) approaches, were proposed and implemented in this study. Simulation and real-data studies were conducted to investigate the performance of the proposed RLDR, measured in terms of misclassification error rates and computational time. Several data conditions, such as non-normality, heteroscedasticity, and balanced and unbalanced data sets, were manipulated in the simulation study to evaluate the performance of the proposed RLDR. In the real-data study, a set of diabetes data was used. This data set violated the assumptions of normality as well as homoscedasticity. The results showed that the novel RLDRV is the best of the proposed RLDR for solving classification problems, since it achieved as much as 91.03% classification accuracy in the real-data study. The proposed RLDR are good alternatives to the classical LDR as well as to existing RLDR, since they perform well in classification problems even under contaminated data.

Universiti Utara Malaysia: UUM eTheses

Robust Linear Discriminant Analysis with Highest Breakdown Point Estimator

Author: Ali Hazlina
Syed Yahaya Sharipah Soaad
Yai Fung Lim
Publication venue: Penerbit Universiti Teknikal Malaysia Melaka Press
Publication date: 01/01/2018

Linear Discriminant Analysis (LDA) is a supervised classification technique concerned with the relationship between a categorical variable and a set of interrelated variables. The main objective of LDA is to create a rule that distinguishes between populations and allocates future observations to previously defined populations. LDA yields the optimal discriminant rule between two or more groups under the assumptions of normality and homoscedasticity. Nevertheless, the classical estimates, the sample mean and sample covariance matrix, are highly affected when these ideal conditions are violated. To mitigate these problems, a new robust LDA rule using high breakdown point estimators is proposed in this article. A winsorized approach was used to estimate the location measure, while the product of Spearman’s rho and the rescaled median absolute deviation was used to estimate the scatter measure; these replace the sample mean and sample covariance matrix, respectively. Simulation and real-data studies were conducted to evaluate the performance of the proposed model, measured in terms of misclassification error rates. The computational results showed that the proposed LDA was always better than the classical LDA and was comparable with the existing robust LDAs.
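
The estimators named in the abstract above can be illustrated with a minimal sketch. It is not the authors' implementation: the winsorizing proportion, the 1.4826 rescaling constant for the median absolute deviation, and the reading of the scatter estimate as `rho * MADn(x) * MADn(y)` for a pair of variables are assumptions made here for illustration, and all function names are hypothetical.

```python
import math
import statistics


def winsorized_mean(x, proportion=0.2):
    """Mean after pulling the g smallest and g largest values in to their nearest kept neighbour."""
    s = sorted(x)
    g = math.floor(proportion * len(s))  # assumed winsorizing amount per tail
    if g > 0:
        s = [s[g]] * g + s[g:len(s) - g] + [s[len(s) - g - 1]] * g
    return sum(s) / len(s)


def madn(x):
    """Median absolute deviation rescaled by 1.4826 (consistency factor under normality)."""
    med = statistics.median(x)
    return 1.4826 * statistics.median(abs(v - med) for v in x)


def average_ranks(x):
    """1-based ascending ranks; tied values share their average rank."""
    order = sorted(range(len(x)), key=lambda i: x[i])
    ranks = [0.0] * len(x)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and x[order[j + 1]] == x[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rho computed as the Pearson correlation of the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    num = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    den = math.sqrt(sum((a - mx) ** 2 for a in rx) * sum((b - my) ** 2 for b in ry))
    return num / den


def robust_covariance(x, y):
    """Assumed pairwise scatter estimate: Spearman's rho times the two rescaled MADs."""
    return spearman_rho(x, y) * madn(x) * madn(y)


x = [1, 2, 3, 4, 100]  # one gross outlier
y = [2, 4, 5, 9, 7]
print(winsorized_mean(x), madn(x), spearman_rho(x, y), robust_covariance(x, y))
```

In this toy sample the outlier value 100 barely moves the winsorized mean or the rescaled MAD, which is the behaviour that motivates replacing the classical mean and covariance.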

UUM Repository

A comparative study of heuristic methods to solve Traveling Salesman Problem (TPS)

Author: Hong Pei Yee
Khalid Ruzelan
Lim Yai Fung
Ramli Razamin
Publication venue: 'UUM Press, Universiti Utara Malaysia'
Publication date: 01/01/2011

The Traveling Salesman Problem (TSP) is a famous problem in combinatorial optimization. The objective of the TSP is to find the shortest path that reaches all the cities, which are interconnected with each other by straight lines. The symmetric TSP is used, and the distance between two cities is calculated using the Euclidean distance. In this study, three heuristic methods, namely simulated annealing (SA), tabu search (TS) and reactive tabu search (RTS), are used to solve the TSP. SA is a generic probabilistic meta-algorithm for the global optimization problem, and TS is a meta-heuristic search technique that guides a local search procedure to explore the solution space beyond local optimality. RTS is an improved form of TS that dynamically adjusts the tabu list size based on how the search is performing. The performance of the SA, TS and RTS algorithms in solving TSP instances of different sizes is evaluated using empirical testing, benchmark solutions and simple probabilistic analysis. The implementations of the three methods show that the RTS algorithm provides a better solution in terms of minimizing the objective function, while the SA algorithm is less time-consuming for problems with a large number of cities. In conclusion, RTS is more effective in producing good-quality solutions, whereas SA may be used to obtain quick results.
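
As a rough illustration of the simulated annealing approach described above, the following sketch solves a small symmetric Euclidean TSP. It is not the implementation evaluated in the study: the segment-reversal (2-opt style) neighbourhood, the geometric cooling schedule, the parameter defaults, and the treatment of the route as a closed tour are all assumptions chosen for brevity.

```python
import math
import random


def tour_length(tour, cities):
    """Length of the closed tour, using Euclidean distances between consecutive cities."""
    return sum(
        math.dist(cities[tour[i]], cities[tour[(i + 1) % len(tour)]])
        for i in range(len(tour))
    )


def simulated_annealing(cities, temp=10.0, cooling=0.995, steps=5000, seed=0):
    """Return (best_tour, best_length) found by a basic simulated annealing run."""
    rng = random.Random(seed)
    current = list(range(len(cities)))
    current_len = tour_length(current, cities)
    best, best_len = current[:], current_len
    for _ in range(steps):
        # Neighbour move: reverse the segment between two random positions.
        i, j = sorted(rng.sample(range(len(cities)), 2))
        candidate = current[:i] + current[i:j + 1][::-1] + current[j + 1:]
        delta = tour_length(candidate, cities) - current_len
        # Always accept improvements; accept worse tours with probability exp(-delta / temp).
        if delta < 0 or rng.random() < math.exp(-delta / temp):
            current, current_len = candidate, current_len + delta
            if current_len < best_len:
                best, best_len = current[:], current_len
        temp *= cooling  # geometric cooling
    return best, tour_length(best, cities)


demo_rng = random.Random(42)
demo_cities = [(demo_rng.uniform(0, 100), demo_rng.uniform(0, 100)) for _ in range(20)]
demo_tour, demo_len = simulated_annealing(demo_cities)
print(demo_tour, demo_len)
```

Accepting some worsening moves while the temperature is high is what lets the search leave local optima; as the temperature falls the procedure behaves more and more like plain local search.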

UUM Repository

Robust linear discriminant analysis with automatic trimmed mean

Author: Ali Hazlina
Lim Yai-Fung
Omar Zurni
Syed Yahaya Sharipah Soaad
Publication venue: Penerbit Universiti Teknikal Malaysia Melaka Press
Publication date: 01/01/2016

Linear discriminant analysis (LDA) is a multivariate statistical technique used to determine which continuous variables discriminate between two or more naturally occurring groups. This technique creates a linear discriminant function that yields the optimal classification rule between two or more groups under the assumptions of normality and homoscedasticity. Nonetheless, the computations in parametric LDA, which are based on the sample mean vectors and the pooled sample covariance matrix, are known to be sensitive to non-normality. To overcome the sensitivity of this method to violations of normality and homoscedasticity, this study proposed a new robust LDA method. In this approach, an automatic trimmed mean vector was used as a substitute for the usual mean vector in the parametric LDA. For the covariance matrix, this study introduced a robust approach that multiplies Spearman’s rho by the corresponding robust scale estimator used in the trimming process. Simulated and real financial data were used to test the performance of the proposed method in terms of misclassification rate. The results showed that the new method performed better than the parametric LDA and the existing robust LDA with S-estimator.

UUM Repository

Robust Linear Discriminant Analysis with Automatic Trimmed Mean

Author: Ali Hazlina
Lim Yai-Fung
Omar Zurni
Syed Yahaya Sharipah Soaad
Publication venue: Journal of Telecommunication, Electronic and Computer Engineering (JTEC)
Publication date: 01/12/2016

Linear discriminant analysis (LDA) is a multivariate statistical technique used to determine which continuous variables discriminate between two or more naturally occurring groups. This technique creates a linear discriminant function that yields the optimal classification rule between two or more groups under the assumptions of normality and homoscedasticity. Nonetheless, the computations in parametric LDA, which are based on the sample mean vectors and the pooled sample covariance matrix, are known to be sensitive to non-normality. To overcome the sensitivity of this method to violations of normality and homoscedasticity, this study proposed a new robust LDA method. In this approach, an automatic trimmed mean vector was used as a substitute for the usual mean vector in the parametric LDA. For the covariance matrix, this study introduced a robust approach that multiplies Spearman’s rho by the corresponding robust scale estimator used in the trimming process. Simulated and real financial data were used to test the performance of the proposed method in terms of misclassification rate. The results showed that the new method performed better than the parametric LDA and the existing robust LDA with S-estimator.

Universiti Teknikal Malaysia Melaka: UTeM Open Journal System

Modified Wilcoxon procedure for dependent group

Author: Abdullah Suhaida
Ahad Nor Aishah
Lim Yai Fung
Md Yusof Zahayu
Syed Yahaya Sharipah Soaad
Publication venue: 'UUM Press, Universiti Utara Malaysia'
Publication date: 01/01/2014

Nonparametric methods require no or very limited assumptions about the format of the data, and they may therefore be preferable when the assumptions required for parametric methods are not valid. The Wilcoxon signed rank test applies to matched-pairs studies. For a two-tailed test, it tests the null hypothesis that there is no systematic difference within pairs against alternatives that assert a systematic difference. The test is based on the Wilcoxon signed rank statistic W, which is the smaller of the two rank sums. The computation of W considers the positive and negative differences and omits all zero differences. In this study, we modify the Wilcoxon signed rank test by using an indicator function of positive, zero and negative differences to compute the Wilcoxon statistic W. The empirical Type I error rates of the modified test were measured via Monte Carlo simulation. These rates were obtained under different distributional shapes, sample sizes, and numbers of replications. The modified Wilcoxon signed rank test was found to be robust under symmetric distributions. The results show that the test produced liberal Type I error rates under skewed distributions. The use of the indicator of positive, zero and negative differences influences the result of the Wilcoxon statistic. This finding was demonstrated using an example data set.
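
For reference, the classical statistic described above (not the modified version proposed in the study, whose details are not given in this abstract) can be sketched as follows. The function names are hypothetical, and giving tied absolute differences their average rank is the usual convention, assumed here.

```python
def average_ranks(values):
    """1-based ascending ranks; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return ranks


def wilcoxon_signed_rank(first, second):
    """Return (W, W_plus, W_minus) for paired samples, discarding zero differences."""
    diffs = [b - a for a, b in zip(first, second) if b != a]
    ranks = average_ranks([abs(d) for d in diffs])
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    w_minus = sum(r for r, d in zip(ranks, diffs) if d < 0)
    return min(w_plus, w_minus), w_plus, w_minus


before = [10, 12, 9, 14, 11]
after = [12, 12, 8, 17, 15]
# Differences are 2, 0, -1, 3, 4; the zero is dropped, leaving ranks 2, 1, 3, 4.
print(wilcoxon_signed_rank(before, after))  # (1.0, 9.0, 1.0)
```

Because the zero difference is discarded, the effective sample size here is 4 rather than 5, which is the step the study modifies.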

UUM Repository

Type I error of the modified Wilcoxon signed rank test under leptokurtic distribution

Author: Abdullah Suhaida
Ahad Nor Aishah
Lim Yai Fung
Md Yusof Zahayu
Syed Yahaya Sharipah Soaad
Publication venue: Faculty of Computer & Mathematical Sciences, UiTM Kedah
Publication date: 01/01/2015

Group comparisons are at the heart of many research questions. Making inferences and drawing conclusions through statistical hypothesis testing on the differences between groups is actively adopted by researchers in many disciplines. When the groups are dependent and the normality assumption is violated, the most commonly used methods, such as the paired t-test, usually produce doubtful results that lead to misleading conclusions. As an alternative, researchers tend to choose the nonparametric Wilcoxon signed rank test. The computation of this statistic involves ranking the absolute difference of each pair of observations, and any pair with a zero difference is discarded. In this study, the statistic was modified by including the zero differences in the ranking. The empirical Type I error rates of the modified test were measured via Monte Carlo simulation. These rates were obtained under combinations of leptokurtic distributional shapes with various sample sizes and numbers of replications. The modified Wilcoxon signed rank test was found to be more robust under symmetric leptokurtic distributions, with conservative values, than under skewed leptokurtic distributions. The findings also indicated that the number of replications had no effect on the Type I error.

UUM Repository

Winsorization on linear discriminant analysis

Author: Ali Hazlina
Lim Yai-Fung
Syed Yahaya Sharipah Soaad
Publication venue: 'AIP Publishing'
Publication date: 01/08/2016

Linear discriminant analysis (LDA) is a widely used multivariate technique for pattern classification. LDA creates an equation that minimizes the possibility of misclassifying observations into their corresponding populations. The main objective of LDA is to classify multivariate data into different populations on the basis of a training sample with known group memberships. Under ideal conditions, that is, when the distribution is normal and the variances are equal (homoscedasticity), LDA performs optimally. Nevertheless, the classical estimates, the sample mean and sample covariance, are highly affected when the ideal conditions are violated. To alleviate these problems, this study introduced a new robust LDA model that uses a winsorized approach to estimate the location measure in place of the sample mean. For the robust covariance, the product of Spearman’s rho and the rescaled median absolute deviation was used as the substitute for the classical covariance. The optimality of the proposed model in terms of misclassification error rate was evaluated through simulation and a real-data application. The results revealed that the misclassification error rate of the proposed model was always better than that of the classical LDA and was comparable with that of the existing robust LDA under contamination. In terms of computational time, by contrast, the classical LDA was fastest, followed by the proposed model and then the existing robust LDA.

UUM Repository

Modified reactive tabu search for the symmetric traveling salesman problems

Author: Hong Pei Yee
Khalid Ruzelan
Lim Yai Fung
Ramli Razamin
Publication venue: 'AIP Publishing'
Publication date: 01/01/2013

Reactive tabu search (RTS) is an improved form of tabu search (TS) that dynamically adjusts the tabu list size based on how the search is performing. RTS avoids a disadvantage of TS, namely the need to tune the tabu list size parameter. In this paper, we propose a modified RTS approach for solving symmetric traveling salesman problems (TSP). The tabu list size of the proposed algorithm depends on the number of iterations in which the solutions do not override the aspiration level, in order to achieve a good balance between diversification and intensification. The proposed algorithm was tested on seven chosen benchmark problems of symmetric TSP. The performance of the proposed algorithm is compared with that of TS using empirical testing, benchmark solutions and simple probabilistic analysis in order to validate the quality of the solutions. The computational results and comparisons show that the proposed algorithm provides better-quality solutions than TS.

UUM Repository