    Pattern classification using a linear associative memory

    Pattern classification is an important image processing task. A typical pattern classification algorithm can be broken into two parts: first, the pattern features are extracted and, second, these features are compared with a stored set of reference features until a match is found. In the second part, one of several clustering algorithms or similarity measures is usually applied. In this paper, a new application of linear associative memory (LAM) to pattern classification problems is introduced. Here, the clustering algorithms or similarity measures are replaced by a LAM matrix multiplication. With a LAM, the reference features need not be separately stored. Since the second part of most classification algorithms is similar, a LAM standardizes the many clustering algorithms and also allows for a standard digital hardware implementation. Computer simulations on regular textures using a feature extraction algorithm achieved a high percentage of successful classification. In addition, this classification is independent of topological transformations.
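
The idea of replacing a similarity search with a single matrix multiplication can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's implementation: it uses the classical outer-product (correlation) rule with one-hot class codes, and the feature vectors, labels, and function names (`train_lam`, `classify`) are hypothetical.

```python
def train_lam(features, labels, n_classes):
    """Build M = sum_k y_k x_k^T, where y_k is the one-hot code of labels[k].

    With one-hot codes, row c of M is simply the sum of the reference
    feature vectors of class c, so the references are folded into M
    rather than stored separately.
    """
    dim = len(features[0])
    memory = [[0.0] * dim for _ in range(n_classes)]
    for x, label in zip(features, labels):
        for j, value in enumerate(x):
            memory[label][j] += value
    return memory


def classify(memory, x):
    """Recall y = M x and return the index of the largest output component."""
    scores = [sum(m * v for m, v in zip(row, x)) for row in memory]
    return max(range(len(scores)), key=scores.__getitem__)


# Hypothetical, already-extracted texture features for three classes.
reference_features = [
    [1.0, 0.0, 0.0],
    [0.9, 0.1, 0.0],
    [0.0, 1.0, 0.0],
    [0.0, 0.0, 1.0],
]
reference_labels = [0, 0, 1, 2]

M = train_lam(reference_features, reference_labels, n_classes=3)
print(classify(M, [0.8, 0.2, 0.1]))
print(classify(M, [0.1, 0.1, 0.9]))
```

The outer-product rule recalls cleanly only when the stored feature vectors are close to orthonormal and the classes are roughly balanced; a pseudoinverse-based memory is the usual alternative when they are not.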

    Clustering Software Components for Program Restructuring and Component Reuse Using Hybrid XNOR Similarity Function

    Component-based software development has gained considerable practical importance in software engineering, among both academic researchers and industry practitioners. Finding components for efficient software reuse is one of the important problems addressed by researchers. Clustering reduces the search space of components by grouping similar entities together, which shortens the search time for component retrieval. In this research, we propose a generalized approach for clustering a given set of documents or software components by defining a similarity function, called the hybrid XNOR function, that measures the degree of similarity between two document sets or software components. A similarity matrix is obtained for a given set of documents or components by applying the hybrid XNOR function. We define and design an algorithm for component or document clustering that takes the similarity matrix as input and produces a set of clusters as output. The output is a set of highly cohesive pattern groups or components.

    Spatial Point Pattern Analysis and Industry Concentration

    Traditional measures of spatial industry concentration are restricted to given areal units. They do not make allowance for the fact that concentration may be differently pronounced at various geographical levels. Methods of spatial point pattern analysis allow industry concentration to be measured at a continuum of spatial scales. While common distance-based methods are well suited to sub-national study areas, they become inefficient in measuring concentration at various levels within industrial countries. This particularly applies in testing for conditional concentration where overall manufacturing is used as a reference population. Using Ripley’s K function approach to second-order analysis, we propose a subsample similarity test as a feasible testing approach for establishing conditional clustering or dispersion at different spatial scales. For measuring the extent of clustering and dispersion, we introduce a concentration index in the style of Besag’s (1977) L function. In contrast to Besag’s L function, the new index can be employed to measure deviations of observed from general spatial point patterns. The K function approach is illustratively applied to measuring and testing industry concentration in Germany.
    Keywords: spatial concentration, clustering, dispersion, spatial point pattern analysis, K function
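
For orientation, the standard Ripley K function and Besag L function that this abstract builds on can be sketched as below. This is a naive estimator without edge correction, normalised by n(n - 1); it does not reproduce the paper's new concentration index or its subsample similarity test. The plant coordinates, study area, and function names are hypothetical.

```python
import math


def ripley_k(points, r, area):
    """Naive K estimate: area / (n (n - 1)) times the number of ordered
    pairs (i, j), i != j, whose distance is at most r. No edge correction."""
    n = len(points)
    pairs = sum(
        1
        for i, (xi, yi) in enumerate(points)
        for j, (xj, yj) in enumerate(points)
        if i != j and math.hypot(xi - xj, yi - yj) <= r
    )
    return area * pairs / (n * (n - 1))


def besag_l(points, r, area):
    """Besag's transformation L(r) = sqrt(K(r) / pi).

    Under complete spatial randomness K(r) = pi r^2, so L(r) = r;
    L(r) - r > 0 suggests clustering and L(r) - r < 0 dispersion.
    """
    return math.sqrt(ripley_k(points, r, area) / math.pi)


# Hypothetical plant locations: two tight pairs far apart in a 10 x 10 region.
plants = [(0.0, 0.0), (0.1, 0.0), (5.0, 5.0), (5.1, 5.0)]
k_value = ripley_k(plants, r=0.5, area=100.0)
l_value = besag_l(plants, r=0.5, area=100.0)
print(k_value, l_value - 0.5)
```

Evaluating `besag_l` over a range of `r` values gives the continuum of spatial scales that fixed areal units cannot provide.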

    Automatic pattern segmentation of jacquard warp-knitted fabric based on hybrid image processing methods

    This paper reports an automatic pattern separation approach for jacquard warp-knitted fabric, which combines a bilateral filter, pyramidal wavelet decomposition and improved fuzzy c-means (FCM) clustering. First, jacquard warp-knitted fabric images are captured and digitized by a scanner in gray mode, and then the bilateral filter is adopted to smooth the fabric textures formed by various lapping movements of jacquard fabric and to reduce the noise introduced during image capture. Next, multi-scale wavelet decomposition is applied to reduce the computational burden and shorten computation time. Finally, the modified FCM clustering is proposed, in which the Mercer kernel function is used to make some features prominent for clustering, and a weight function is proposed to measure the similarity between the data and the clustering center. The experimental results reveal that this hybrid method can achieve fast and accurate pattern segmentation. The method is shown to be suitable for the pattern separation of jacquard warp-knitted fabric.
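
As a point of reference for the clustering stage, the standard (unmodified) fuzzy c-means iteration can be sketched as below. The sketch assumes 1-D gray-level features and plain absolute distance; the Mercer kernel and the weight function of the paper's modified FCM are not specified in the abstract and are not reproduced. The data values and the `fuzzy_c_means` name are hypothetical.

```python
def fuzzy_c_means(data, centers, m=2.0, iterations=50, eps=1e-12):
    """Standard FCM on 1-D data. Returns (centers, memberships), where
    memberships[k][i] is the degree to which data[k] belongs to cluster i."""
    centers = list(centers)
    memberships = []
    for _ in range(iterations):
        # Membership update: u_ki = 1 / sum_j (d_ki / d_kj) ** (2 / (m - 1)).
        memberships = []
        for x in data:
            # eps guards against division by zero when x sits on a center.
            dists = [max(abs(x - c), eps) for c in centers]
            row = [
                1.0 / sum((d_i / d_j) ** (2.0 / (m - 1.0)) for d_j in dists)
                for d_i in dists
            ]
            memberships.append(row)
        # Center update: mean of the data weighted by u_ki ** m.
        for i in range(len(centers)):
            weights = [row[i] ** m for row in memberships]
            centers[i] = sum(w * x for w, x in zip(weights, data)) / sum(weights)
    return centers, memberships


# Hypothetical gray-level features forming two groups.
gray_levels = [1.0, 1.2, 0.8, 5.0, 5.2, 4.8]
centers, memberships = fuzzy_c_means(gray_levels, centers=[0.0, 6.0])
print(centers)
```

Each membership row sums to 1, so a pixel near a pattern boundary can belong partly to both regions instead of being forced into one.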

    Malware Classification based on Call Graph Clustering

    Each day, anti-virus companies receive tens of thousands of samples of potentially harmful executables. Many of the malicious samples are variations of previously encountered malware, created by their authors to evade pattern-based detection. Dealing with these large amounts of data requires robust, automatic detection approaches. This paper studies malware classification based on call graph clustering. By representing malware samples as call graphs, it is possible to abstract certain variations away, and enable the detection of structural similarities between samples. The ability to cluster similar samples together will make more generic detection techniques possible, thereby targeting the commonalities of the samples within a cluster. To compare call graphs mutually, we compute pairwise graph similarity scores via graph matchings which approximately minimize the graph edit distance. Next, to facilitate the discovery of similar malware samples, we employ several clustering algorithms, including k-medoids and DBSCAN. Clustering experiments are conducted on a collection of real malware samples, and the results are evaluated against manual classifications provided by human malware analysts. Experiments show that it is indeed possible to accurately detect malware families via call graph clustering. We anticipate that in the future, call graphs can be used to analyse the emergence of new malware families, and ultimately to automate implementation of generic detection schemes.
    Comment: This research has been supported by TEKES - the Finnish Funding Agency for Technology and Innovation as part of its ICT SHOK Future Internet research programme, grant 40212/0
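
Because call graphs are compared only through pairwise scores, the clustering step has to work from a precomputed distance matrix. A minimal k-medoids sketch of that step is given below, using the simple alternate-and-update variant. The distance values are hypothetical stand-ins for approximate graph edit distances, and the function names are illustrative, not taken from the paper.

```python
def assign(dist, medoids):
    """Map each medoid index to the list of sample indices closest to it."""
    clusters = {m: [] for m in medoids}
    for i in range(len(dist)):
        nearest = min(medoids, key=lambda m: dist[i][m])
        clusters[nearest].append(i)
    return clusters


def k_medoids(dist, medoids, max_iter=100):
    """Alternate assignment and medoid update until the medoids stop changing."""
    medoids = sorted(medoids)
    for _ in range(max_iter):
        clusters = assign(dist, medoids)
        # New medoid: the member with the smallest total distance to its cluster.
        updated = sorted(
            min(members, key=lambda c: sum(dist[c][j] for j in members))
            if members else m
            for m, members in clusters.items()
        )
        if updated == medoids:
            break
        medoids = updated
    return assign(dist, medoids)


# Hypothetical pairwise distances between five samples (symmetric, zero diagonal).
distances = [
    [0, 1, 2, 9, 8],
    [1, 0, 1, 8, 9],
    [2, 1, 0, 9, 9],
    [9, 8, 9, 0, 1],
    [8, 9, 9, 1, 0],
]
families = k_medoids(distances, medoids=[0, 3])
print(families)
```

Medoids are always actual samples, which is why k-medoids suits graph data where no meaningful "average call graph" exists.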

    Sphere-sphere intersection for investment portfolio diversification - A new data-driven cluster analysis.

    To support investment portfolio diversification with a data-driven approach, this methodological paper proposes a new cluster analysis that compares publicly traded companies, mainly in times of high volatility (e.g. crisis times). The main goal of the proposed method is to provide a less arbitrary analysis that helps financial investors precisely measure the degree of similarity between equity stocks, unveiling equity market clustering patterns by applying analytic geometry solutions and calculating an overall clustering pattern indicator. Empirical results on synthetic data demonstrate both the conceptual superiority of the proposed method over traditional cluster analyses and its potential practical usefulness for asset allocation, portfolio strategy and asset pricing, among other related purposes. Finally, the outputs of the proposed cluster analysis are presented through an intuitive and easily understandable mathematical visualization.
    • A new method is proposed to calculate risk-similarity and clustering patterns.
    • The method unveils clustering patterns through a data-driven process.
    • Portfolio diversification can benefit from sphere-sphere intersection calculations.
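
The geometric building block named in the title, the volume shared by two intersecting spheres, can be computed as the sum of two spherical caps. The sketch below shows only that textbook calculation. How the paper maps stocks to sphere centres and radii, and how it turns intersections into a clustering indicator, is not stated in the abstract, so nothing of that mapping is assumed here; the function names are hypothetical.

```python
import math


def cap_volume(radius, height):
    """Volume of a spherical cap of the given height cut from a sphere."""
    return math.pi * height ** 2 * (3.0 * radius - height) / 3.0


def sphere_intersection_volume(r1, r2, d):
    """Volume common to two spheres of radii r1 and r2 whose centres are d apart."""
    if d >= r1 + r2:
        return 0.0  # disjoint or touching at a single point
    if d <= abs(r1 - r2):
        # One sphere lies entirely inside the other.
        return 4.0 / 3.0 * math.pi * min(r1, r2) ** 3
    # Distance from the first centre to the plane of the intersection circle.
    x = (d * d + r1 * r1 - r2 * r2) / (2.0 * d)
    return cap_volume(r1, r1 - x) + cap_volume(r2, r2 - (d - x))


print(sphere_intersection_volume(1.0, 1.0, 1.0))
```

Dividing the shared volume by the volume of the smaller sphere gives a value between 0 and 1, which is one plausible way to read an intersection as a degree of overlap.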

    Multilevel Kohonen network learning for clustering problems

    Clustering is the procedure of recognising classes of patterns that occur in the environment and assigning each pattern to its relevant class. Unlike classical statistical methods, the self-organising map (SOM) does not require any prior knowledge about the statistical distribution of the patterns in the environment. In this study, an alternative class of self-organising neural networks, known as multilevel learning, was proposed to solve the task of pattern separation. The performance of the standard SOM and the multilevel SOM was evaluated with different distance or dissimilarity measures in retrieving similarity between patterns. The purpose of this analysis was to evaluate the quality of the map produced by SOM learning using different distance measures in representing a given dataset. Based on the results obtained from both SOM methods, predictions can be made for unknown samples. The results showed that multilevel SOM learning gives a better classification rate for small and medium-scale datasets, but not for large-scale datasets.
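
The role of the distance measure in a SOM can be illustrated with the best-matching-unit search and one online update step on a 1-D map. This is a sketch of the standard SOM only; the multilevel learning scheme is not described in enough detail in the abstract to reproduce. The weights, the input pattern, and the function names are hypothetical.

```python
def euclidean(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


def manhattan(a, b):
    return sum(abs(x - y) for x, y in zip(a, b))


def best_matching_unit(weights, x, distance):
    """Index of the map unit whose weight vector is closest to x."""
    return min(range(len(weights)), key=lambda u: distance(x, weights[u]))


def som_step(weights, x, distance, lr=0.5, radius=1):
    """One online update: pull the BMU and its map neighbours toward x (in place)."""
    bmu = best_matching_unit(weights, x, distance)
    for u, w in enumerate(weights):
        if abs(u - bmu) <= radius:  # neighbourhood measured on the 1-D map grid
            for j in range(len(w)):
                w[j] += lr * (x[j] - w[j])
    return bmu


# Hypothetical three-unit map and one input pattern.
weights = [[3.0, 3.0], [0.0, 5.0], [9.0, 9.0]]
pattern = [0.0, 0.0]

bmu_euclidean = best_matching_unit(weights, pattern, euclidean)
bmu_manhattan = best_matching_unit(weights, pattern, manhattan)
print(bmu_euclidean, bmu_manhattan)

som_step(weights, pattern, euclidean)
print(weights)
```

The two measures select different winning units for the same pattern here, which is the kind of effect a comparison of distance measures on map quality has to account for.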