    Unsupervised and semi-supervised clustering with learnable cluster dependent kernels.

    Despite the large number of existing clustering methods, clustering remains a challenging task, especially when the structure of the data does not correspond to easily separable categories and when clusters vary in size, density, and shape. Existing kernel-based approaches make the problem easier by adapting a specific similarity measure. Although good results have been obtained using the Gaussian kernel function, its performance depends on the selection of the scaling parameter. Moreover, since one global parameter is used for the entire data set, it may not be possible to find one optimal scaling parameter when there are large variations between the distributions of the different clusters in the feature space. One way to learn optimal scaling parameters is through an exhaustive search for one optimal scaling parameter per cluster. However, this approach is not practical, since it is computationally expensive, especially when the data include a large number of clusters and when the dynamic range of possible values of the scaling parameters is large. Moreover, it is not trivial to evaluate the resulting partition in order to select the optimal parameters. To overcome these limitations, we introduce two new fuzzy relational clustering techniques that learn cluster-dependent Gaussian kernels. The first algorithm, called the clustering and Local Scale Learning algorithm (LSL), minimizes one objective function for both the optimal partition and the cluster-dependent scaling parameters that reflect the intra-cluster characteristics of the data. The second algorithm, called Fuzzy clustering with Learnable Cluster dependent Kernels (FLeCK), learns the scaling parameters by optimizing both the intra-cluster and the inter-cluster dissimilarities. Consequently, the learned scale parameters reflect the relative density, size, and position of each cluster with respect to the other clusters. We also introduce semi-supervised versions of LSL and FLeCK.
    These algorithms generate a fuzzy partition of the data and simultaneously learn the optimal kernel resolution of each cluster. We show that incorporating a small set of constraints can guide the clustering process to better learn the scaling parameters and the fuzzy memberships, and thus to obtain a better partition of the data. In particular, we show that partial supervision is even more useful on real high-dimensional data sets, where the algorithms are more susceptible to local minima. All of the proposed algorithms are optimized iteratively by dynamically updating the partition and the scaling parameters in each iteration. This makes these algorithms simple and fast. Moreover, our algorithms are formulated to work on relational data. This makes them applicable to data where objects cannot be represented by vectors or where clusters of similar objects cannot be represented efficiently by a single prototype. Our extensive experiments show that FLeCK and SS-FLeCK outperform existing algorithms. In particular, we show that when data include clusters with various inter-cluster and intra-cluster distances, learning cluster-dependent kernels is crucial to obtaining a good partition.
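    To make the idea of cluster-dependent kernels concrete, the following is a minimal sketch, not the LSL or FLeCK algorithm itself. It assumes a Gaussian kernel of the form exp(-d/sigma_i) on squared pairwise distances, with one hand-picked scale sigma_i per cluster; the abstract does not give the exact parameterization, and the function names `pairwise_sq_dists` and `cluster_kernels`, the data, and the scale values are all hypothetical. In the proposed algorithms the scales are learned together with the fuzzy partition rather than fixed in advance.

```python
import numpy as np

def pairwise_sq_dists(X):
    """Relational representation: squared Euclidean distance between every pair of rows."""
    diff = X[:, None, :] - X[None, :, :]
    return np.sum(diff ** 2, axis=-1)

def cluster_kernels(D, sigmas):
    """One Gaussian kernel matrix per cluster: K[i, j, k] = exp(-D[j, k] / sigmas[i])."""
    sigmas = np.asarray(sigmas, dtype=float)
    return np.exp(-D[None, :, :] / sigmas[:, None, None])

# Two synthetic clusters with very different spreads (illustrative data only).
rng = np.random.default_rng(0)
tight = rng.normal(0.0, 0.1, size=(20, 2))
loose = rng.normal(5.0, 2.0, size=(20, 2))
X = np.vstack([tight, loose])

D = pairwise_sq_dists(X)
# Hand-picked scales: a small one suited to the tight cluster, a large one for the loose cluster.
K = cluster_kernels(D, sigmas=[0.05, 10.0])
```

    Because the kernels are computed from the distance matrix `D` alone, the same sketch applies to relational data for which no vector representation is available. A single global scale would have to compromise between the two clusters, which is the limitation the abstract describes.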

    Context dependent spectral unmixing.

    A hyperspectral unmixing algorithm that finds multiple sets of endmembers is proposed. The algorithm, called Context Dependent Spectral Unmixing (CDSU), is a local approach that adapts the unmixing to different regions of the spectral space. It is based on a novel objective function that combines context identification and unmixing. This joint objective function models contexts as compact clusters and uses the linear mixing model as the basis for unmixing. Several variations of CDSU that provide additional desirable features are also proposed. First, Context Dependent Spectral Unmixing using the Mahalanobis Distance (CDSUM) offers the advantage of identifying non-spherical clusters in the high-dimensional spectral space. Second, the Cluster and Proportion Constrained Multi-Model Unmixing (CC-MMU and PC-MMU) algorithms use partial supervision information, in the form of cluster or proportion constraints, to guide the search process and narrow the space of possible solutions. The supervision information could be provided by an expert, generated by analyzing the consensus of multiple unmixing algorithms, or extracted from co-located data from a different sensor. Third, Robust Context Dependent Spectral Unmixing (RCDSU) introduces possibilistic memberships into the objective function to reduce the effect of noise and outliers in the data. Finally, the Unsupervised Robust Context Dependent Spectral Unmixing (U-RCDSU) algorithm learns the optimal number of contexts in an unsupervised way. The performance of each algorithm is evaluated using synthetic and real data. We show that the proposed methods can identify meaningful and coherent contexts, and appropriate endmembers within each context. The second main contribution of this thesis is consensus unmixing. This approach exploits the diversity and similarity of the large number of existing unmixing algorithms to identify an accurate and consistent set of endmembers in the data.
    We run multiple unmixing algorithms using different parameters, and combine the resulting unmixing ensemble using consensus analysis. The extracted endmembers are those on which the multiple runs agree. The third main contribution consists of developing subpixel target detectors that rely on the proposed CDSU algorithms to adapt target detection algorithms to different contexts. A local detection statistic is computed for each context, and all scores are then combined to yield a final detection score. The context-dependent unmixing provides a better background description and limits target leakage, which are two essential properties for target detection algorithms.
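    As an illustration of unmixing with multiple sets of endmembers under the linear mixing model, the following is a minimal sketch and not the CDSU algorithm. It makes two simplifying assumptions that should be read as substitutions: the sum-to-one constraint on the abundances is imposed softly through a heavily weighted extra equation, with no nonnegativity constraint, and each pixel is assigned to the context whose endmembers reconstruct it best, whereas CDSU models contexts as compact clusters inside a joint objective function. The names `unmix` and `unmix_with_contexts`, the `weight` parameter, and the synthetic data are hypothetical.

```python
import numpy as np

def unmix(x, E, weight=1e3):
    """Least-squares abundances a for x ~= E @ a, with a soft sum-to-one constraint."""
    n_end = E.shape[1]
    # Append one heavily weighted equation, weight * sum(a) = weight.
    A = np.vstack([E, weight * np.ones((1, n_end))])
    b = np.append(x, weight)
    a, *_ = np.linalg.lstsq(A, b, rcond=None)
    return a

def unmix_with_contexts(x, endmember_sets):
    """Unmix x against every context and keep the one with the smallest residual."""
    best = None
    for c, E in enumerate(endmember_sets):
        a = unmix(x, E)
        err = np.linalg.norm(x - E @ a)
        if best is None or err < best[2]:
            best = (c, a, err)
    return best

# Two contexts, each with its own 3 endmembers over 50 spectral bands (synthetic).
rng = np.random.default_rng(1)
E0 = rng.uniform(0.0, 1.0, size=(50, 3))
E1 = rng.uniform(0.0, 1.0, size=(50, 3))

# A noise-free pixel mixed from the endmembers of the second context.
true_a = np.array([0.2, 0.5, 0.3])
x = E1 @ true_a

context, a_hat, err = unmix_with_contexts(x, [E0, E1])
```

    On this noise-free pixel the sketch selects the second context and recovers the mixing proportions. With real, noisy data a nonnegativity constraint on the abundances would normally be added as well.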