Worst-Case Linear Discriminant Analysis as Scalable Semidefinite Feasibility Problems
In this paper, we propose an efficient semidefinite programming (SDP)
approach to worst-case linear discriminant analysis (WLDA). Compared with the
traditional LDA, WLDA considers the dimensionality reduction problem from the
worst-case viewpoint, which is in general more robust for classification.
However, the original problem of WLDA is non-convex and difficult to optimize.
In this paper, we reformulate the optimization problem of WLDA into a sequence
of semidefinite feasibility problems. To efficiently solve the semidefinite
feasibility problems, we design a new scalable optimization method with
quasi-Newton methods and eigen-decomposition being the core components. The
proposed method is orders of magnitude faster than standard interior-point
based SDP solvers.
Experiments on a variety of classification problems demonstrate that our
approach achieves better performance than standard LDA. Our method is also much
faster and more scalable than WLDA based on standard interior-point SDP solvers.
The computational complexity for an SDP with $m$ constraints and matrices of
size $d$ by $d$ is roughly reduced from $\mathcal{O}(m^3 + md^3 + m^2d^2)$ to
$\mathcal{O}(d^3)$ ($m > d$ in our case).
Comment: 14 pages
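To make the worst-case viewpoint concrete, here is a minimal sketch of the criterion that WLDA is usually stated to maximize: the smallest pairwise between-class distance divided by the largest within-class scatter, both measured after projection. The function name `wlda_criterion` and the exact normalisation are assumptions for illustration. This only evaluates the objective for a given projection `W`; it is not the authors' feasibility-based solver.

```python
import numpy as np
from itertools import combinations


def wlda_criterion(X, y, W):
    """Worst-case discriminant ratio of a projection W (illustrative only).

    X: (n, d) data, y: (n,) integer labels, W: (d, r) projection matrix.
    Returns min over class pairs of ||W^T (m_i - m_j)||^2 divided by
    max over classes of tr(W^T S_k W), where S_k is the class covariance.
    """
    classes = np.unique(y)
    Z = X @ W  # projected data, shape (n, r)
    means = {c: Z[y == c].mean(axis=0) for c in classes}

    # Closest pair of projected class means (worst-case between-class term).
    min_between = min(
        np.sum((means[a] - means[b]) ** 2) for a, b in combinations(classes, 2)
    )
    # Most scattered class after projection (worst-case within-class term).
    max_within = max(
        np.mean(np.sum((Z[y == c] - means[c]) ** 2, axis=1)) for c in classes
    )
    return min_between / max_within
```

Standard LDA would average these quantities over classes instead of taking the minimum and maximum, which is why a single poorly separated class pair can be hidden in the LDA objective but not in this one.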
A generic optimising feature extraction method using multiobjective genetic programming
In this paper, we present a generic, optimising feature extraction method using multiobjective genetic programming. We re-examine the feature extraction problem and show that effective feature extraction can significantly enhance the performance of pattern recognition systems with simple classifiers. A framework is presented to evolve optimised feature extractors that transform an input pattern space into a decision space in which maximal class separability is obtained. We have applied this method to real world datasets from the UCI Machine Learning and StatLog databases to verify our approach and compare our proposed method with other reported results. We conclude that our algorithm is able to produce classifiers of superior (or equivalent) performance to the conventional classifiers examined, suggesting removal of the need to exhaustively evaluate a large family of conventional classifiers on any new problem. (C) 2010 Elsevier B.V. All rights reserved
Visual and semantic interpretability of projections of high dimensional data for classification tasks
A number of visual quality measures have been introduced in visual analytics
literature in order to automatically select the best views of high dimensional
data from a large number of candidate data projections. These methods generally
concentrate on the interpretability of the visualization and pay little
attention to the interpretability of the projection axes. In this paper, we
argue that interpretability of the visualizations and the feature
transformation functions are both crucial for visual exploration of high
dimensional labeled data. We present a two-part user study to examine these two
related but orthogonal aspects of interpretability. We first study how humans
judge the quality of 2D scatterplots of various datasets with varying numbers of
classes and provide comparisons with ten automated measures, including a number
of visual quality measures and related measures from various machine learning
fields. We then investigate how users' perception of the interpretability of
mathematical expressions relates to various automated measures of complexity
that can be used to characterize data projection functions. We conclude with a
discussion of how automated measures of visual and semantic interpretability of
data projections can be used together for exploratory analysis in
classification tasks.
Comment: Longer version of the VAST 2011 poster.
http://dx.doi.org/10.1109/VAST.2011.610247
A Comparative Study: Globality versus Locality for Graph Construction in Discriminant Analysis
Local graph-based discriminant analysis (DA) algorithms have recently attracted increasing attention as a way to mitigate the limitations of global (graph) DA algorithms. However, little attention has been paid to the following important issues: whether local construction is better than global construction for the intraclass and interclass graphs; which graph (intraclass or interclass) should be constructed locally and which globally; and how the two graphs should be combined effectively for good discriminant performance. In this paper, building on our previous studies of graph construction and DA, we first address these issues. Then, by jointly exploiting globality and locality, we develop two algorithms: Globally marginal and Locally compact Discriminant Analysis (GmLcDA), based on global interclass and local intraclass graphs, and Locally marginal and Globally compact Discriminant Analysis (LmGcDA), based on local interclass and global intraclass graphs. Their purpose is not to demonstrate algorithmic novelty but to illustrate the theoretical analyses. Finally, by comprehensively comparing GmLcDA and LmGcDA with Locally marginal and Locally compact DA (LmLcDA), which is based on locality alone, and Globally marginal and Globally compact Discriminant Analysis (GmGcDA), which is based on globality alone, we suggest that combining locally constructed intraclass graphs with globally constructed interclass graphs is more discriminant.
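The two graph types contrasted above can be sketched as adjacency matrices. This is a rough illustration under stated assumptions: the "global" interclass graph links every pair of samples from different classes, and the "local" intraclass graph links each sample to its k nearest same-class neighbours, both with binary weights. The function names and these definitions are hypothetical and may differ from the paper's own constructions.

```python
import numpy as np


def pairwise_sq_dists(X):
    """Squared Euclidean distances between all rows of X."""
    sq = np.sum(X ** 2, axis=1)
    D = sq[:, None] + sq[None, :] - 2.0 * (X @ X.T)
    return np.maximum(D, 0.0)  # clip tiny negatives from round-off


def global_interclass_graph(y):
    """Connect every pair of samples with different labels."""
    return (y[:, None] != y[None, :]).astype(float)


def local_intraclass_graph(X, y, k):
    """Connect each sample to its k nearest neighbours of the same class."""
    n = len(y)
    D = pairwise_sq_dists(X)
    A = np.zeros((n, n))
    for i in range(n):
        # Candidate neighbours: same label, excluding the sample itself.
        same = np.flatnonzero((y == y[i]) & (np.arange(n) != i))
        nearest = same[np.argsort(D[i, same])[:k]]
        A[i, nearest] = 1.0
    return np.maximum(A, A.T)  # symmetrise: keep an edge if either end chose it
```

Swapping the roles (k-nearest neighbours across classes, all pairs within a class) would give the local interclass and global intraclass graphs of the opposite combination.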
Customization of Discriminant Function Analysis for Prediction of Solar Flares
This research extends work by K. Leka and G. Barnes of the Colorado Research Associates Division, Northwest Research Associates, Inc. in Boulder, Colorado (CORA), who found that no single photospheric solar parameter they considered could sufficiently identify a flare-producing active region (AR). They then explored whether a linear combination of parameters, used in a multivariable discriminant function (DF), could adequately predict solar activity. The purpose of this research is to extend their DF work by refining the method of statistical discriminant analysis (DA), with the goal of selecting the photospheric magnetic parameters most capable of identifying flare-producing active regions, in the hope of increasing the reliability of short-term flare warnings and the understanding of flare production. The data were photospheric vector magnetograms captured by the Imaging Vector Magnetograph (IVM) at the University of Hawai`i Mees Solar Observatory at Haleakala and provided by CORA. Increasing the size of the data set was an essential task, in order to obtain a more statistically significant training sample for DA. This research also modified current DF procedures to allow the costs of flare false alarms and flare misses to be customized, and expanded the binary DF results to produce flare probability forecasts. The selection of the optimum combination of photospheric magnetic parameters to be used as predictors in a linear DF began with the elimination of redundant parameters and of those parameters least likely to contribute to flare production. The remaining selection was governed by maximizing the Mahalanobis distance in a step-up method.
The DF results show that a pre-flaring active region may be characterized by larger magnetic flux, a larger area of magnetic shear angle greater than 80°, larger current of heterogeneity, a larger spatial vertical magnetic field gradient, and a larger kurtosis of the shear angle. With the optimum combination of parameters, DF flare probability forecasts were compared to the daily forecasts produced by the National Oceanic and Atmospheric Administration, Space Environment Center (NOAA SEC). The chi-squared values of each forecast show that the objective DF-based flare probability forecasting method performs as well as the subjective forecasting method employed by the SEC.
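The step-up selection mentioned in this abstract can be sketched as a greedy forward search: at each step, add the parameter that most increases the squared Mahalanobis distance between the two groups. The sketch below assumes a pooled within-group covariance and two sample matrices, `X0` (for example, non-flaring regions) and `X1` (flaring regions); the function names and the fixed-size stopping rule are hypothetical, not the procedure used in the study.

```python
import numpy as np


def mahalanobis_sq(X0, X1, features):
    """Squared Mahalanobis distance between group means on chosen columns."""
    A, B = X0[:, features], X1[:, features]
    diff = A.mean(axis=0) - B.mean(axis=0)
    n0, n1 = len(A), len(B)
    # Pooled within-group covariance (np.cov returns a scalar for one column).
    S = ((n0 - 1) * np.cov(A, rowvar=False) + (n1 - 1) * np.cov(B, rowvar=False))
    S = np.atleast_2d(S / (n0 + n1 - 2))
    return float(diff @ np.linalg.solve(S, diff))


def step_up_select(X0, X1, n_select):
    """Greedily pick n_select columns, maximizing the distance at each step."""
    selected = []
    remaining = list(range(X0.shape[1]))
    while remaining and len(selected) < n_select:
        best = max(
            remaining, key=lambda j: mahalanobis_sq(X0, X1, selected + [j])
        )
        selected.append(best)
        remaining.remove(best)
    return selected
```

A practical variant would stop when the gain in distance falls below a threshold rather than after a fixed number of parameters; which rule the original work used is not stated in the abstract.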