17 research outputs found

    Musical recommendations and personalization in a social network

    This paper presents a set of algorithms used for music recommendations and personalization in a general-purpose social network www.ok.ru, the second largest social network in the CIS, visited by more than 40 million users per day. In addition to classical recommendation features like "recommend a sequence" and "find similar items", the paper describes novel algorithms for constructing context-aware recommendations, personalizing the service, handling the cold-start problem, and more. All algorithms described in the paper work online and are able to detect and address changes in the user's behavior and needs in real time. The core component of the algorithms is a taste graph containing information about different entities (users, tracks, artists, etc.) and relations between them (for example, user A likes song B with certainty X, track B was created by artist C, artist C is similar to artist D with certainty Y, and so on). Using the graph, it is possible to select tracks a user would most probably like, to arrange them so that they match each other well, to estimate which items from a fixed list are most relevant for the user, and more. In addition, the paper describes the approach used to estimate the algorithms' efficiency and to analyze the impact of different recommendation-related features on the users' behavior and overall activity at the service. Comment: This is a full version of a 4-page article published at ACM RecSys 201
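
The taste graph described in this abstract can be pictured with a small sketch. The abstract does not specify how the graph is stored or how paths are scored, so everything below is an assumption made for illustration: the `TasteGraph` class, the `user:`/`track:`/`artist:` node prefixes, and the rule of multiplying certainties along user → track → artist → similar-artist paths are all hypothetical.

```python
from collections import defaultdict


class TasteGraph:
    """Toy weighted graph of users, tracks and artists (hypothetical structure)."""

    def __init__(self):
        # edges[source][target] = certainty of the relation, assumed to lie in [0, 1]
        self.edges = defaultdict(dict)

    def add_edge(self, source, target, certainty=1.0):
        self.edges[source][target] = certainty

    def score_tracks(self, user, tracks_by_artist):
        """Rank tracks the user has not liked yet.

        Assumed scoring rule: follow user -> liked track -> its artist ->
        similar artist, multiply the certainties along the path, and sum
        the products over all paths that reach the same candidate track.
        """
        liked = {t: w for t, w in self.edges[user].items() if t.startswith("track:")}
        scores = defaultdict(float)
        for track, like_w in liked.items():
            for artist, made_w in self.edges[track].items():
                for similar, sim_w in self.edges[artist].items():
                    for candidate in tracks_by_artist.get(similar, []):
                        if candidate not in liked:
                            scores[candidate] += like_w * made_w * sim_w
        return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


graph = TasteGraph()
graph.add_edge("user:A", "track:B", 0.9)    # user A likes song B with certainty 0.9
graph.add_edge("track:B", "artist:C", 1.0)  # track B was created by artist C
graph.add_edge("artist:C", "artist:D", 0.5) # artist C is similar to artist D
ranked = graph.score_tracks("user:A", {"artist:D": ["track:E"]})
```

Here `ranked` holds candidate tracks ordered from highest to lowest score. A production system would also need online updates and decay of old signals, which this sketch leaves out.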

    Adaptive semi-supervised affinity propagation clustering algorithm based on structural similarity

    In view of the unsatisfactory clustering performance of the affinity propagation (AP) clustering algorithm on data sets with complex structures, this paper proposes an adaptive semi-supervised affinity propagation clustering algorithm based on structural similarity (SAAP-SS). First, a novel structural similarity is proposed by solving a non-linear, low-rank representation problem. Then affinity propagation is performed after adjusting the similarity matrix using the known pairwise constraints. Finally, the idea of fireworks explosion is introduced into the algorithm. By adaptively searching the preference space bi-directionally, the algorithm's global and local search abilities are balanced in order to find the optimal clustering structure. The results of experiments with both synthetic and real data sets show performance improvements of the proposed algorithm compared with the AP, FEO-SAP and K-means methods
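
The step of adjusting the similarity matrix with known pairwise constraints can be sketched as follows. The abstract does not say which values the constrained entries receive, so this sketch assumes a common convention: must-link pairs get the largest off-diagonal similarity and cannot-link pairs get the smallest. The function name `apply_pairwise_constraints` and its parameters are hypothetical.

```python
def apply_pairwise_constraints(similarity, must_link, cannot_link):
    """Return a copy of a square similarity matrix adjusted by pairwise constraints.

    Assumption (not taken from the paper): must-link pairs are raised to the
    largest off-diagonal similarity, cannot-link pairs are lowered to the
    smallest one. The diagonal (the AP "preference") is left untouched.
    """
    n = len(similarity)
    adjusted = [row[:] for row in similarity]  # copy so the input is not mutated
    off_diagonal = [similarity[i][j] for i in range(n) for j in range(n) if i != j]
    highest, lowest = max(off_diagonal), min(off_diagonal)
    for i, j in must_link:
        adjusted[i][j] = adjusted[j][i] = highest
    for i, j in cannot_link:
        adjusted[i][j] = adjusted[j][i] = lowest
    return adjusted


# Similarities given as negative squared distances between three 1-D points.
similarity = [[0, -1, -9],
              [-1, 0, -4],
              [-9, -4, 0]]
adjusted = apply_pairwise_constraints(similarity, must_link=[(0, 2)], cannot_link=[(0, 1)])
```

The adjusted matrix would then be passed to an affinity propagation implementation that accepts precomputed similarities.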

    Latent Geometry Inspired Graph Dissimilarities Enhance Affinity Propagation Community Detection in Complex Networks

    Affinity propagation is one of the most effective unsupervised pattern recognition algorithms for data clustering in high-dimensional feature space. However, the numerous attempts to test its performance for community detection in complex networks have attained results very far from those of state-of-the-art methods such as Infomap and Louvain. Yet all these studies agreed that the crucial problem is to convert the unweighted network topology into a 'smart-enough' node dissimilarity matrix that is able to properly address the message-passing procedure behind affinity propagation clustering. Here we introduce a conceptual innovation and discuss how to leverage notions of network latent geometry in order to design dissimilarity matrices for affinity propagation community detection. Our results demonstrate that the latent-geometry-inspired dissimilarity measures we design allow affinity propagation to equal or outperform current state-of-the-art methods for community detection. These findings are solidly proven on both synthetic 'realistic' networks (with known ground-truth communities) and real networks (with community metadata), even when the data structure is corrupted by noise artificially induced by missing or spurious connectivity

    Clustering large-scale data based on modified affinity propagation algorithm

    Traditional clustering algorithms are no longer suitable for data mining applications that use large-scale data. Many large-scale data clustering algorithms have been proposed in recent years, but most of them do not achieve high-quality clustering. Although Affinity Propagation (AP) is effective and accurate in normal data clustering, it is not effective for large-scale data. This paper proposes two methods for large-scale data clustering that depend on a modified version of the AP algorithm. The proposed methods are designed to ensure both low time complexity and good accuracy of the clustering. Firstly, a data set is divided into several subsets using one of two methods: random fragmentation or K-means. Secondly, the subsets are clustered into K clusters using the K-Affinity Propagation (KAP) algorithm to select local cluster exemplars in each subset. Thirdly, the inverse weighted clustering

    Semi-supervised affinity propagation based on density peaks

    In view of the unsatisfactory clustering performance of the affinity propagation (AP) clustering algorithm on data sets with complex structures, this paper proposes a semi-supervised affinity propagation clustering algorithm based on density peaks (SAP-DP). The algorithm uses a new density peaks (DP) algorithm, which has the advantage of manifold clustering combined with the idea of semi-supervision, builds pairwise constraints to adjust the similarity matrix, and then executes AP clustering. The results of the simulation experiments confirmed that the proposed algorithm has better clustering performance than conventional AP

    Improved clustering approach for junction detection of multiple edges with modified freeman chain code

    The image processing framework for two-dimensional line drawings involves three phases: detecting the junctions and corners that exist in the drawing, representing the lines, and extracting features used to recognize the line drawing based on the chosen representation scheme. As an alternative to existing frameworks, this thesis proposes a framework that consists of an improved clustering approach for junction detection of multiple edges, a modified Freeman chain code scheme, new features and their extraction, and a recognition algorithm. In the first phase, the thesis is concerned with the problem of clustering line drawings for junction detection of multiple edges. Major problems in cluster analysis, such as the time taken and particularly the number of accurate clusters contained in the line drawing when performing junction detection, are crucial to address. Two clustering approaches are compared with the proposed algorithm: self-organising map (SOM) and affinity propagation (AP). These approaches were chosen because both are unsupervised learning methods and do not require an initial cluster count to execute. In the second phase, a new chain code scheme is proposed for representing the direction of lines; it consists of a series of directional codes and corner labels found in the drawing. In the third phase, the feature extraction algorithm, three features are proposed: length of lines, angle of corners, and number of branches at each corner. These features are then used in the proposed recognition algorithm to match the line drawing, involving only the mean and variance in the calculation. Comparison with the SOM and AP clustering approaches shows up to a 31% reduction in cluster count and a 57-fold speedup. The results of the corner detection algorithm show that it is capable of detecting the junctions and corners of a given thinned binary image by producing a new thinned binary image containing markers at their locations

    Gravity Theory-Based Affinity Propagation Clustering Algorithm and Its Applications

    The original Affinity Propagation (AP) clustering algorithm used only the Euclidean distance between data samples as the standard for similarity calculation. This calculation method has great limitations for high-dimensional, sparse data. Because similarity is calculated in only one way, the convergence and clustering accuracy of the algorithm are greatly affected. On the other hand, the formation of galaxies in the universe can be regarded as a clustering process, and the interactions between different celestial bodies are achieved through universal gravitation. This paper introduces the Density Peak (DP) clustering algorithm and the idea of gravitation into the AP algorithm, constructs a density property to calculate the similarity, and puts forward the Gravity-based Affinity Propagation clustering algorithm (GAP). The proposed algorithm calculates the similarity of simple points more accurately through the local density of the corresponding points, and then uses the gravity formula to update the similarity matrix. The data clustering process can be seen as the sample points spontaneously attracting each other based on ‘gravitation’. Experimental results show that the convergence performance of the GAP algorithm is clearly improved over the AP algorithm, and that the clustering effect is better
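
The abstract does not give GAP's formulas, so the following is only an illustrative sketch of what a density-weighted, gravity-like similarity might look like, not the paper's method. It assumes 1-D points, a cutoff-based local density in the style of Density Peaks (the number of other points closer than `cutoff`), and a Newton-style attraction in which density plays the role of mass. The function names, the `cutoff` parameter and the `+ 1` offset are all hypothetical.

```python
def local_density(points, cutoff):
    """Cutoff-based local density: count of other points closer than `cutoff`."""
    return [
        sum(1 for j, q in enumerate(points) if j != i and abs(p - q) < cutoff)
        for i, p in enumerate(points)
    ]


def gravity_similarity(points, cutoff):
    """Assumed gravity-like similarity: mass_i * mass_j / distance**2.

    Mass is taken as (local density + 1) so that isolated points still
    attract others. Duplicate points would cause a division by zero and
    must be removed beforehand. The diagonal is left at 0.0.
    """
    density = local_density(points, cutoff)
    n = len(points)
    similarity = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j:
                distance_sq = (points[i] - points[j]) ** 2
                similarity[i][j] = (density[i] + 1) * (density[j] + 1) / distance_sq
    return similarity


similarity = gravity_similarity([0.0, 1.0, 2.0, 10.0], cutoff=1.5)
```

Under these assumptions, nearby points in dense regions attract each other strongly, while a distant, isolated point attracts the others only weakly.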