From the User to the Medium: Neural Profiling Across Web Communities
Online communities provide a unique way for individuals to access information
from those in similar circumstances, which can be critical for health
conditions that require daily and personalized management. As these groups and
topics often arise organically, identifying the types of topics discussed is
necessary to understand their needs. Moreover, these communities and the people
in them can be quite diverse, and existing community detection methods have not
been extended to evaluate this heterogeneity, largely because they have not
focused on semantic relations between textual features of the user-generated
content. We therefore develop an approach, NeuroCom, that optimally finds dense
groups of users as communities in a latent space inferred from neural
representations of the content that users publish. By embedding words and
messages, we show that NeuroCom achieves improved clustering and identifies
more nuanced discussion topics than other common unsupervised learning
approaches.
Survey: Running and comparing data stream clustering algorithms
12th Turkish National Software Engineering Symposium, UYMS 2018; Istanbul; Turkey; 10 September 2018 through 12 September 2018
Recently, clustering data streams has become an important research area for knowledge discovery, as applications produce an ever-growing volume of unstoppable streaming data. In this paper we introduce clustering, streams, and data stream clustering algorithms, and discuss the most important stream clustering algorithms with regard to their structure. As an additional contribution, and unlike existing review and survey papers on stream clustering, we provide a practical evaluation of the best-known stream clustering algorithms, namely (i) CluStream, (ii) DenStream, (iii) D-Stream, and (iv) ClusTree, presenting their experimental results together with several performance metrics computed for each using the MOA Java framework.
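As a rough illustration of the kind of summary structure that micro-cluster-based stream algorithms such as CluStream maintain, the sketch below keeps additive statistics (count, linear sum, squared sum) so that each arriving point is absorbed in constant time. It is a simplified assumption-laden sketch: the class and method names are hypothetical, the temporal statistics and snapshot storage used by CluStream are omitted, and it is not the MOA API.

```python
import math


class MicroCluster:
    """Additive summary of absorbed points (hypothetical, illustrative names)."""

    def __init__(self, dim):
        self.n = 0                       # number of points absorbed
        self.linear_sum = [0.0] * dim    # per-dimension sum of values
        self.squared_sum = [0.0] * dim   # per-dimension sum of squared values

    def insert(self, point):
        # Constant-time update: the raw point is not stored.
        self.n += 1
        for i, x in enumerate(point):
            self.linear_sum[i] += x
            self.squared_sum[i] += x * x

    def centroid(self):
        return [s / self.n for s in self.linear_sum]

    def radius(self):
        # Root-mean-square deviation from the centroid,
        # derived only from the stored sums.
        variance = sum(
            ss / self.n - (ls / self.n) ** 2
            for ls, ss in zip(self.linear_sum, self.squared_sum)
        )
        return math.sqrt(max(variance, 0.0))


mc = MicroCluster(dim=2)
for p in [(0.0, 0.0), (2.0, 0.0), (0.0, 2.0), (2.0, 2.0)]:
    mc.insert(p)
```

Because the statistics are sums, two such summaries could be merged by adding their fields, which is what makes this representation convenient for unbounded streams.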
A taxonomy framework for unsupervised outlier detection techniques for multi-type data sets
The term "outlier" can generally be defined as an observation that is significantly different from
the other values in a data set. The outliers may be instances of error or indicate events. The
task of outlier detection aims at identifying such outliers in order to improve the analysis of
data and further discover interesting and useful knowledge about unusual events within numerous
application domains. In this paper, we report on contemporary unsupervised outlier detection
techniques for multiple types of data sets and provide a comprehensive taxonomy framework and
two decision trees for selecting the most suitable technique based on the data set. Furthermore, we
highlight the advantages, disadvantages and performance issues of each class of outlier detection
techniques under this taxonomy framework.
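To make the definition of an outlier concrete, the following minimal sketch flags values in a one-dimensional sample using the modified z-score based on the median absolute deviation. This is one common unsupervised heuristic chosen here for illustration, not a technique taken from the paper's taxonomy; the function name, the sample data, and the 3.5 threshold are assumptions.

```python
import statistics


def mad_outliers(values, threshold=3.5):
    """Return the values whose modified z-score exceeds the threshold."""
    median = statistics.median(values)
    deviations = [abs(v - median) for v in values]
    mad = statistics.median(deviations)
    if mad == 0:
        # More than half of the values equal the median; the score is undefined.
        return []
    # 0.6745 rescales the MAD so that the score is comparable to a z-score
    # under a normal distribution.
    return [v for v in values if 0.6745 * abs(v - median) / mad > threshold]


readings = [10.1, 9.8, 10.3, 10.0, 9.9, 25.0, 10.2]
flagged = mad_outliers(readings)
```

With the sample above, only the reading of 25.0 lies far enough from the median to be flagged.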
PFU: Profiling Forum users in online social networks, a knowledge driven data mining approach
Online Social Networks (OSNs) provide a platform for voicing opinions on various issues and for creating and rapidly spreading news in Online Social Network Forums (OSNFs). This work proposes a novel method for Profiling Forum Users (PFU) that dynamically explores their behavioral characteristics based on their involvement in various discussion topics and the number of posts they make on each topic in OSNFs. The proposed method is modeled mathematically, and the PFU algorithm is illustrated to demonstrate its adequacy and accuracy.
Methods of Hierarchical Clustering
We survey agglomerative hierarchical clustering algorithms and discuss
efficient implementations that are available in R and other software
environments. We look at hierarchical self-organizing maps, and mixture models.
We review grid-based clustering, focusing on hierarchical density-based
approaches. Finally, we describe a recently developed, very efficient (linear
time) hierarchical clustering algorithm, which can also be viewed as a
hierarchical grid-based algorithm.
Comment: 21 pages, 2 figures, 1 table, 69 references
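The agglomerative idea mentioned in this abstract can be sketched as follows: start with every point in its own cluster and repeatedly merge the two closest clusters. This is a deliberately naive single-linkage version for illustration only, not one of the efficient implementations the survey discusses; the function name, the stopping rule (a target cluster count `k`), and the sample points are assumptions.

```python
import math


def single_linkage(points, k):
    """Merge the two closest clusters until only k clusters remain."""
    clusters = [[p] for p in points]

    def linkage(a, b):
        # Single linkage: distance between the closest pair of members.
        return min(math.dist(p, q) for p in a for q in b)

    while len(clusters) > k:
        # Find the pair of clusters (i < j) with the smallest linkage distance.
        i, j = min(
            (
                (i, j)
                for i in range(len(clusters))
                for j in range(i + 1, len(clusters))
            ),
            key=lambda ij: linkage(clusters[ij[0]], clusters[ij[1]]),
        )
        # Since j > i, removing cluster j does not shift index i.
        clusters[i].extend(clusters.pop(j))
    return clusters


points = [(0.0, 0.0), (0.0, 1.0), (5.0, 5.0), (5.0, 6.0), (10.0, 0.0)]
groups = single_linkage(points, 3)
```

Recording the order and distance of each merge, instead of stopping at `k`, would give the full hierarchy that a dendrogram displays.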