15,928 research outputs found
A Survey on Soft Subspace Clustering
Subspace clustering (SC) is a promising clustering technology to identify
clusters based on their associations with subspaces in high dimensional spaces.
SC can be classified into hard subspace clustering (HSC) and soft subspace
clustering (SSC). While HSC algorithms have been extensively studied and well
accepted by the scientific community, SSC algorithms are relatively new but
gaining more attention in recent years due to better adaptability. In the
paper, a comprehensive survey on existing SSC algorithms and the recent
development are presented. The SSC algorithms are classified systematically
into three main categories, namely, conventional SSC (CSSC), independent SSC
(ISSC) and extended SSC (XSSC). The characteristics of these algorithms are
highlighted and the potential future development of SSC is also discussed.Comment: This paper has been published in Information Sciences Journal in 201
On the role of pre and post-processing in environmental data mining
The quality of discovered knowledge is highly depending on data quality. Unfortunately real data use to contain noise, uncertainty, errors, redundancies or even irrelevant information. The more complex is the reality to be analyzed, the higher the risk of getting low quality data. Knowledge Discovery from Databases (KDD) offers a global framework to prepare data in the right form to perform correct analyses. On the other hand, the quality of decisions taken upon KDD results, depend not only on the quality of the results themselves, but on the capacity of the system to communicate those results in an understandable form. Environmental systems are particularly complex and environmental users particularly require clarity in their results. In this paper some details about how this can be achieved are provided. The role of the pre and post processing in the whole process of Knowledge Discovery in environmental systems is discussed
New methods for the estimation of Takagi-Sugeno model based extended Kalman filter and its applications to optimal control for nonlinear systems
This paper describes new approaches to improve the local and global approximation (matching) and modeling capability of Takagi–Sugeno (T-S) fuzzy model. The main aim is obtaining high function approximation accuracy and fast convergence. The main problem encountered is that T-S identification method cannot be applied when the membership functions are overlapped by pairs. This restricts the application of the T-S method because this type of membership function has been widely used during the last 2 decades in the stability, controller design of fuzzy systems and is popular in industrial control applications. The approach developed here can be considered as a generalized version of T-S identification method with optimized performance in approximating nonlinear functions. We propose a noniterative method through weighting of parameters approach and an iterative algorithm by applying the extended Kalman filter, based on the same idea of parameters’ weighting. We show that the Kalman filter is an effective tool in the identification of T-S fuzzy model. A fuzzy controller based linear quadratic regulator is proposed in order to show the effectiveness of the estimation method developed here in control applications. An illustrative example of an inverted pendulum is chosen to evaluate the robustness and remarkable performance of the proposed method locally and globally in comparison with the original T-S model. Simulation results indicate the potential, simplicity, and generality of the algorithm. An illustrative example is chosen to evaluate the robustness. In this paper, we prove that these algorithms converge very fast, thereby making them very practical to use
Taming Wild High Dimensional Text Data with a Fuzzy Lash
The bag of words (BOW) represents a corpus in a matrix whose elements are the
frequency of words. However, each row in the matrix is a very high-dimensional
sparse vector. Dimension reduction (DR) is a popular method to address sparsity
and high-dimensionality issues. Among different strategies to develop DR
method, Unsupervised Feature Transformation (UFT) is a popular strategy to map
all words on a new basis to represent BOW. The recent increase of text data and
its challenges imply that DR area still needs new perspectives. Although a wide
range of methods based on the UFT strategy has been developed, the fuzzy
approach has not been considered for DR based on this strategy. This research
investigates the application of fuzzy clustering as a DR method based on the
UFT strategy to collapse BOW matrix to provide a lower-dimensional
representation of documents instead of the words in a corpus. The quantitative
evaluation shows that fuzzy clustering produces superior performance and
features to Principal Components Analysis (PCA) and Singular Value
Decomposition (SVD), two popular DR methods based on the UFT strategy
Latent class analysis for segmenting preferences of investment bonds
Market segmentation is a key component of conjoint analysis which addresses consumer
preference heterogeneity. Members in a segment are assumed to be homogenous in their
views and preferences when worthing an item but distinctly heterogenous to members of other
segments. Latent class methodology is one of the several conjoint segmentation procedures
that overcome the limitations of aggregate analysis and a-priori segmentation. The main
benefit of Latent class models is that market segment membership and regression parameters
of each derived segment are estimated simultaneously. The Latent class model presented in
this paper uses mixtures of multivariate conditional normal distributions to analyze rating
data, where the likelihood is maximized using the EM algorithm. The application focuses on
customer preferences for investment bonds described by four attributes; currency, coupon
rate, redemption term and price. A number of demographic variables are used to generate
segments that are accessible and actionable.peer-reviewe
- …