8 research outputs found

    Multidimensional Balance-Based Cluster Boundary Detection for High-Dimensional Data

    © 2018 IEEE. The balance of the neighborhood space around a central point is an important concept in cluster analysis and can be used to effectively detect cluster boundary objects. Existing neighborhood analysis methods focus on the distribution of the data, i.e., they analyze the characteristics of the neighborhood space from a single perspective and therefore cannot capture rich data characteristics. In this paper, we analyze the high-dimensional neighborhood space from multiple perspectives. By simulating each dimension of a data point's k-nearest-neighbor space (kNNs) as a lever, we apply the lever principle to compute the balance fulcrum of each dimension after proving its existence and uniqueness. We then model the distance between the projected coordinate of the data point and the balance fulcrum on each dimension and construct the DHBlan coefficient to measure the balance of the neighborhood space. Based on this theoretical model, we propose a simple yet effective cluster boundary detection algorithm called Lever. Experiments on both low- and high-dimensional data sets validate the effectiveness and efficiency of the proposed algorithm.
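
    As a rough illustration of the lever idea (not the paper's exact DHBlan formulation), the sketch below treats each dimension of a point's kNN space as a lever with equal unit weights, takes the per-dimension mean of the neighbors as the balance fulcrum, and scores the point by how far its own coordinates sit from those fulcrums; the function name, the choice of k, and the use of the plain mean as the fulcrum are illustrative assumptions.

```python
import numpy as np
from sklearn.neighbors import NearestNeighbors

def lever_imbalance_scores(X, k=15):
    """Hypothetical simplification of the lever idea: on each dimension, the
    balance fulcrum of a point's k nearest neighbors (equal unit weights) is
    their mean, and the score is the mean absolute gap between the point's
    coordinates and those fulcrums."""
    nbrs = NearestNeighbors(n_neighbors=k + 1).fit(X)
    _, idx = nbrs.kneighbors(X)              # column 0 is the point itself
    scores = np.empty(len(X))
    for i, neighbors in enumerate(idx):
        knn = X[neighbors[1:]]               # drop the query point
        fulcrum = knn.mean(axis=0)           # per-dimension torque balance
        scores[i] = np.abs(X[i] - fulcrum).mean()
    return scores

# Points with the largest imbalance are candidate cluster-boundary objects:
# interior points see neighbors spread symmetrically around them, while
# boundary points see one-sided neighborhoods and large fulcrum gaps.
X = np.random.rand(500, 8)
boundary_candidates = np.argsort(lever_imbalance_scores(X))[-50:]
```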

    Unknown Clutter Estimation by FMM Approach in Multitarget Tracking Algorithm

    The finite mixture model (FMM) approach is a research focus in the multitarget tracking field. Previously, clutter was treated as uniformly distributed. To address the severe bias caused by unknown and complex clutter, this paper puts forward a multitarget tracking algorithm based on clutter model estimation. The multitarget likelihood function is established with an FMM, and within this framework both the expectation maximization (EM) and Markov chain Monte Carlo (MCMC) algorithms are employed to estimate the FMM parameters. Once the clutter model is fitted, the target number and multitarget states can be estimated precisely, and explicit association between targets and measurements is avoided. Simulations show that the proposed algorithm performs well in the presence of unknown and complex clutter.
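
    A minimal sketch of the clutter-modeling step, assuming scikit-learn's EM-based GaussianMixture stands in for the paper's FMM estimation; the synthetic scan, the component count, and the variable names are illustrative only.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

# One scan of measurements: a few target-originated points plus dense,
# non-uniform clutter (all numbers here are synthetic and illustrative).
rng = np.random.default_rng(0)
clutter = rng.normal(loc=[40.0, 60.0], scale=[15.0, 5.0], size=(300, 2))
targets = rng.normal(loc=[10.0, 10.0], scale=0.5, size=(3, 2))
scan = np.vstack([clutter, targets])

# EM fit of a finite mixture model to the scan; the fitted density can then
# replace the usual uniform-clutter term in the tracker's likelihood.
fmm = GaussianMixture(n_components=4, covariance_type="full", random_state=0)
fmm.fit(scan)
log_clutter_density = fmm.score_samples(scan)    # log p(z) under the FMM
```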

    An Illumination Invariant Accurate Face Recognition with Down Scaling of DCT Coefficients

    In this paper, a novel approach to illumination normalization under varying lighting conditions is presented. Our approach exploits the fact that low-frequency discrete cosine transform (DCT) coefficients correspond to illumination variations in a digital image. Images captured under varying illumination may have low contrast, so we first apply histogram equalization for contrast stretching. The low-frequency DCT coefficients are then scaled down to compensate for the illumination variations; the scaling factor and the number of low-frequency coefficients to be rescaled are determined experimentally. Classification is performed with k-nearest-neighbor and nearest-mean classifiers on the images obtained by applying the inverse DCT to the processed coefficients, using the correlation coefficient and the Euclidean distance obtained via principal component analysis as distance metrics. We have tested our face recognition method on the Yale Face Database B. The results show that our method performs without any error (100% face recognition performance) even under the most extreme illumination variations. Several schemes for illumination normalization under varying lighting conditions exist in the literature, but none is claimed to achieve a 100% recognition rate under all illumination variations for this database. The proposed technique is computationally efficient and can easily be implemented in a real-time face recognition system.
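
    The core normalization step might look roughly like the sketch below, assuming SciPy's 2-D DCT; the helper name normalize_illumination, the block size n_low, and the factor scale are placeholders for the values the paper determines experimentally.

```python
import numpy as np
from scipy.fft import dctn, idctn

def normalize_illumination(img, n_low=20, scale=0.1):
    """Hypothetical sketch: histogram-equalize an 8-bit grayscale image,
    scale down its low-frequency 2-D DCT coefficients (keeping the DC term,
    i.e., overall brightness), and apply the inverse DCT."""
    # simple histogram equalization for contrast stretching
    hist = np.bincount(img.ravel(), minlength=256)
    cdf = hist.cumsum() / img.size
    equalized = cdf[img] * 255.0

    coeffs = dctn(equalized, norm="ortho")
    dc = coeffs[0, 0]                     # preserve overall brightness
    coeffs[:n_low, :n_low] *= scale       # damp illumination components
    coeffs[0, 0] = dc
    return idctn(coeffs, norm="ortho")
```

    Classification would then proceed on the reconstructed images exactly as described above.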

    Exquisitor: Interactive Learning for Multimedia


    Scalable Query Processing on Spatial Networks

    Spatial networks (e.g., road networks) are general graphs with spatial information (e.g., latitude/longitude) associated with the vertices and/or the edges of the graph. Techniques are presented for query processing on spatial networks that are based on the observed coherence between the spatial positions of the vertices and the shortest paths between them. This facilitates aggregation of the vertices into coherent regions that share vertices on the shortest paths between them. Using this observation, a framework, termed SILC, is introduced that precomputes and compactly encodes the N^2 shortest paths and network distances between every pair of vertices on a spatial network containing N vertices. The compactness of the shortest paths from a source vertex V is achieved by partitioning the destination vertices into subsets based on the identity of the first edge to them from V. The spatial coherence of these subsets is captured by a quadtree representation whose dimension-reducing property reduces the storage requirements of each subset to be proportional to the perimeter of the spatially coherent regions, instead of to the number of vertices in the spatial network. In particular, experiments on a number of large road networks as well as a theoretical analysis have shown that the total storage for the shortest paths is reduced from O(N^3) to O(N^1.5). In addition to SILC, another framework, termed PCP, is proposed that also takes advantage of the spatial coherence of the source vertices and makes use of the Well Separated Pair decomposition to further reduce the storage, under suitably defined conditions, to O(N). Using these frameworks, scalable algorithms are presented to implement a wide variety of operations, such as nearest neighbor finding and distance joins, on large datasets of locations residing on a spatial network. These frameworks essentially decouple the process of computing shortest paths from that of spatial query processing, and also decouple the domain of the participating objects from the domain of the vertices of the spatial network. This means that as long as the spatial network is unchanged, the algorithm and underlying representation of the shortest paths in the spatial network can be used with different sets of objects.
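
    The first-edge idea behind SILC can be illustrated with the small sketch below, assuming a networkx graph whose edges carry a 'weight' attribute; the quadtree compression of the spatially coherent regions, which is where the storage savings actually come from, is omitted, and the function names are hypothetical.

```python
import networkx as nx

def first_edge_labels(G, source):
    """For every reachable destination, record the neighbor of `source`
    with which the shortest path to that destination begins (the label
    SILC groups into spatially coherent regions)."""
    paths = nx.single_source_dijkstra_path(G, source, weight="weight")
    return {dest: path[1] for dest, path in paths.items() if len(path) > 1}

def path_by_lookup(labels, source, dest):
    """Reconstruct a shortest path one hop at a time using only the
    precomputed first-edge labels, without storing full paths."""
    hops = [source]
    while source != dest:
        source = labels[source][dest]
        hops.append(source)
    return hops

# labels = {v: first_edge_labels(G, v) for v in G}   # the O(N^2) table SILC encodes
```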

    K-Nearest Neighbor Finding Using MaxNearestDist (to appear in PAMI)

    Similarity searching often reduces to finding the k nearest neighbors of a query object. Finding the k nearest neighbors is achieved by applying either a depth-first or a best-first algorithm to the search hierarchy containing the data. These algorithms are generally applicable to any index based on hierarchical clustering: the data is partitioned into clusters that are aggregated to form other clusters, with the total aggregation represented as a tree. These algorithms have traditionally used a lower bound corresponding to the minimum distance at which a nearest neighbor can be found (termed MINDIST) to prune the search process, avoiding the processing of clusters and individual objects that can be shown to be farther from the query object q than all of the current k nearest neighbors of q. An alternative pruning technique is described that uses an upper bound corresponding to the maximum possible distance at which a nearest neighbor is guaranteed to be found (termed MAXNEARESTDIST). The MAXNEARESTDIST upper bound is adapted to enable its use for finding the k nearest neighbors instead of just the nearest neighbor (i.e., k = 1) as in its previous uses. Both the depth-first and best-first k-nearest neighbor algorithms are modified to use MAXNEARESTDIST, which is shown to enhance both algorithms by overcoming their shortcomings. In particular, for the depth-first algorithm, the number of clusters in the search hierarchy that must be examined is not increased, thereby potentially lowering its execution time, while for the best-first algorithm, the number of clusters that must be retained in the priority queue used to control the ordering of processing is also not increased, thereby potentially lowering its storage requirements. Index Terms — k-nearest neighbors; similarity searching; metric spaces; depth-first nearest neighbor finding; best-first nearest neighbor finding
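
    A toy sketch of the two bounds is given below, assuming a flat list of non-empty (centre, radius, points) clusters rather than a full search hierarchy; it handles only k = 1 and illustrates MINDIST ordering/pruning plus the MAXNEARESTDIST upper bound, not the paper's depth-first and best-first algorithms.

```python
import numpy as np

def nn_with_maxnearestdist(query, clusters):
    """1-NN over a flat list of (centre, radius, points) clusters.
    MINDIST = max(0, d(q, centre) - radius) orders and prunes clusters;
    MAXNEARESTDIST = d(q, centre) + radius bounds the nearest neighbor's
    distance before any individual point is examined."""
    d_centre = np.array([np.linalg.norm(query - c) for c, r, pts in clusters])
    radii = np.array([r for c, r, pts in clusters])
    mindist = np.maximum(0.0, d_centre - radii)
    best_d = float(np.min(d_centre + radii))    # MAXNEARESTDIST bound
    best = None

    for i in np.argsort(mindist):               # visit clusters in MINDIST order
        if mindist[i] > best_d:
            break                               # all remaining clusters are farther
        for p in clusters[i][2]:
            d = np.linalg.norm(query - p)
            if d <= best_d:
                best, best_d = p, d
    return best, best_d

# Example: clusters built as (mean, max distance to mean, member points).
pts = np.random.rand(200, 3)
groups = np.array_split(pts, 10)
clusters = [(g.mean(0), np.linalg.norm(g - g.mean(0), axis=1).max(), g) for g in groups]
nearest, dist = nn_with_maxnearestdist(np.random.rand(3), clusters)
```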