2,156 research outputs found

    From the User to the Medium: Neural Profiling Across Web Communities

    Full text link
    Online communities provide a unique way for individuals to access information from those in similar circumstances, which can be critical for health conditions that require daily and personalized management. As these groups and topics often arise organically, identifying the types of topics discussed is necessary to understand their needs. As well, these communities and people in them can be quite diverse, and existing community detection methods have not been extended towards evaluating these heterogeneities. This has been limited as community detection methodologies have not focused on community detection based on semantic relations between textual features of the user-generated content. Thus here we develop an approach, NeuroCom, that optimally finds dense groups of users as communities in a latent space inferred by neural representation of published contents of users. By embedding of words and messages, we show that NeuroCom demonstrates improved clustering and identifies more nuanced discussion topics in contrast to other common unsupervised learning approaches

    Tetkik: Akan veri kümeleme algoritmalarını çalıştırma ve karşılaştırma

    Get PDF
    12th Turkish National Software Engineering Symposium, UYMS 2018; Istanbul; Turkey; 10 September 2018 through 12 September 2018Recently, clustering data streams have become an incredibly important research area for knowledge discovery as applications produce more and more unstoppable streaming data. In this paper we introduce clustering, streams and data streaming clustering algorithms, as well as discussions of the most important stream clustering algorithms, considering their structure. As an additional contribution of our work and differently from review and survey papers in stream clustering, we offer the practical part of the most known stream clustering algorithms, namely: (i) CluStream; (ii) DenStream; (iii) D-Stream; and (iv) ClusTree, showing their experimental results along with some performance metrics computation of for each, depending on MOA framework.Son zamanlarda, veri akışlarını kümelemek uygulamalar daha fazla durdurulamaz veri akışı üretirken bilgi keşfi için inanılmaz derecede önemli bir araştırma alanı haline gelmiştir.Bu makalede, kümeleme, akışlar ve veri akışlarını kümeleme algoritmalarını en önemli akım kümeleme algoritmalarının irdelenmesini yapılarını da göz önünde bulundurarak tanıtıyoruz. Çalışmamızın ek bir katkısı ve akış kümeleme alanında yapılmış tetkit ve gözden geçirme makalelerinden farklı olarak en bilinen akış kümeleme algoritmalarının Pratik kısmını, yani: (i) CluStream; (ii) DenStream; (iii) D-Stream; and (iv) ClusTree, MOA Java çerçevesine bağlı olarak, her biri için bazı performans metriklerinin hesaplanmasıyla birlikte deney sonuçlarını göstererek sunuyoruz

    A taxonomy framework for unsupervised outlier detection techniques for multi-type data sets

    Get PDF
    The term "outlier" can generally be defined as an observation that is significantly different from the other values in a data set. The outliers may be instances of error or indicate events. The task of outlier detection aims at identifying such outliers in order to improve the analysis of data and further discover interesting and useful knowledge about unusual events within numerous applications domains. In this paper, we report on contemporary unsupervised outlier detection techniques for multiple types of data sets and provide a comprehensive taxonomy framework and two decision trees to select the most suitable technique based on data set. Furthermore, we highlight the advantages, disadvantages and performance issues of each class of outlier detection techniques under this taxonomy framework

    PFU: Profiling Forum users in online social networks, a knowledge driven data mining approach

    Get PDF
    Online Social Networks (OSNs) provide platform to raise opinions on various issues, create and spread news rapidly in Online Social Network Forums (OSNFs). This work proposes a novel method for Profiling Forum Users (PFU) by exploring their behavioral characteristics based on their involvement in various topics of discussion and number of posts in respective topics posted by them in OSNFs dynamically. Modeling the proposed method mathematically, the PFU algorithm is illustrated for its adequacy and accuracy

    Methods of Hierarchical Clustering

    Get PDF
    We survey agglomerative hierarchical clustering algorithms and discuss efficient implementations that are available in R and other software environments. We look at hierarchical self-organizing maps, and mixture models. We review grid-based clustering, focusing on hierarchical density-based approaches. Finally we describe a recently developed very efficient (linear time) hierarchical clustering algorithm, which can also be viewed as a hierarchical grid-based algorithm.Comment: 21 pages, 2 figures, 1 table, 69 reference
    corecore