Development of a R package to facilitate the learning of clustering techniques
This project explores the development of a tool, in the form of an R package, to ease the process of
learning clustering techniques: how they work and what their pros and cons are. The tool should provide
implementations of several clustering techniques, with explanations, so that students can become
familiar with the characteristics of each algorithm by testing it against several different datasets
while deepening their understanding through the explanations. Additionally, these explanations
should adapt to the input data, making the tool suited not only to self-regulated learning but also to
teaching.
Degree in Computer Engineering (Grado en Ingeniería Informática)
A Survey of Adaptive Resonance Theory Neural Network Models for Engineering Applications
This survey samples from the ever-growing family of adaptive resonance theory
(ART) neural network models used to perform the three primary machine learning
modalities, namely, unsupervised, supervised and reinforcement learning. It
comprises a representative list from classic to modern ART models, thereby
painting a general picture of the architectures developed by researchers over
the past 30 years. The learning dynamics of these ART models are briefly
described, and their distinctive characteristics such as code representation,
long-term memory and corresponding geometric interpretation are discussed.
Useful engineering properties of ART (speed, configurability, explainability,
parallelization and hardware implementation) are examined along with current
challenges. Finally, a compilation of online software libraries is provided. It
is expected that this overview will be helpful to new and seasoned ART
researchers.
Some Clustering Methods, Algorithms and their Applications
Clustering is a type of unsupervised learning [15]. In an unsupervised learning task no target values ("supervisors") are known, so the training signal has to come from the inputs themselves. Clustering is a core technique in data mining and machine learning: grouping the records of a dataset by similarity can, for example, help predict user behavior more accurately. The purpose of this research is to compare and contrast three widely used data-clustering methods. Clustering techniques include partitioning, hierarchical, density-based, grid-based, and fuzzy clustering. Machine learning, data mining, pattern recognition, image analysis, and bioinformatics are just a few of the many fields where clustering is used as an analytical technique. In addition to defining the various algorithms, specialized forms of cluster analysis, and linkage methods, the paper reviews the clustering techniques used in the big data setting.
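As an illustration of the partitioning family named above, the following is a minimal sketch of k-means (Lloyd's algorithm) in plain Python. It is not taken from the paper: the function names, the fixed random seed, and the toy data are assumptions made for the example.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, iterations=100, seed=0):
    """Partition `points` into k groups with Lloyd's algorithm.

    Returns (labels, centroids); labels[i] is the cluster index of points[i].
    """
    rng = random.Random(seed)          # fixed seed: an assumption for repeatability
    centroids = rng.sample(points, k)  # start from k distinct data points
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        labels = [
            min(range(k), key=lambda c: sq_dist(p, centroids[c]))
            for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        new_centroids = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                mean = tuple(sum(col) / len(members) for col in zip(*members))
                new_centroids.append(mean)
            else:
                new_centroids.append(centroids[c])  # keep an empty cluster in place
        if new_centroids == centroids:  # no centroid moved: converged
            break
        centroids = new_centroids
    return labels, centroids


# Toy data (hypothetical): two well-separated groups in the plane.
data = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
labels, centroids = kmeans(data, k=2)
print(labels, centroids)
```

No labels are supplied anywhere: the grouping comes only from distances between the inputs, which is what makes the task unsupervised. Hierarchical, density-based, grid-based, and fuzzy methods replace the assignment and update steps with different rules.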
Relational clustering models for knowledge discovery and recommender systems
Cluster analysis is a fundamental research field in Knowledge Discovery and Data Mining
(KDD). It aims at partitioning a given dataset into some homogeneous clusters so as
to reflect the natural hidden data structure. Various heuristic or statistical approaches
have been developed for analyzing propositional datasets. Nevertheless, in relational
clustering the existence of multi-type relationships will greatly degrade the performance
of traditional clustering algorithms. This issue motivates us to find more effective algorithms
to conduct the cluster analysis upon relational datasets. In this thesis we
comprehensively study the idea of Representative Objects for approximating data distribution
and then design a multi-phase clustering framework for analyzing relational
datasets with high effectiveness and efficiency.
The second task considered in this thesis is to provide some better data models for
people as well as machines to browse and navigate a dataset. The hierarchical taxonomy
is widely used for this purpose. Compared with manually created taxonomies, automatically
derived ones are more appealing because of their low creation/maintenance cost
and high scalability. Up to now, taxonomy generation techniques have mainly been used
to organize document corpora. We investigate the possibility of applying them to relational
datasets and then propose some algorithmic improvements. Another non-trivial
problem is how to assign suitable labels for the taxonomic nodes so as to credibly summarize
the content of each node. Unfortunately, this field has not been investigated
sufficiently to the best of our knowledge, and so we attempt to fill the gap by proposing
some novel approaches.
The final goal of our cluster analysis and taxonomy generation techniques is
to improve the scalability of recommender systems that are developed to tackle the
problem of information overload. Recent research in recommender systems integrates
the exploitation of domain knowledge to improve the recommendation quality, which
however reduces the scalability of the whole system at the same time. We address this
issue by applying the automatically derived taxonomy to preserve the pair-wise similarities
between items, and then modeling the user visits by another hierarchical structure.
Experimental results show that the computational complexity of the recommendation
procedure can be greatly reduced and thus the system scalability improved.
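The abstract does not say how the taxonomy preserves pair-wise similarities. One possible reading, sketched below under stated assumptions, is that each taxonomy node stores a single similarity value and the similarity of two items is read from their lowest common ancestor, so storage grows with the number of nodes rather than with the number of item pairs. The class, function, item names, and similarity values are hypothetical and are not the thesis's method.

```python
class TaxonomyNode:
    """A taxonomy node holding one similarity value for everything beneath it."""

    def __init__(self, similarity, parent=None):
        self.similarity = similarity  # assumed: similarity shared by items under this node
        self.parent = parent
        self.depth = 0 if parent is None else parent.depth + 1


def taxonomy_similarity(leaf_of, a, b):
    """Approximate sim(a, b) by the value stored at the lowest common ancestor."""
    if a == b:
        return 1.0
    x, y = leaf_of[a], leaf_of[b]
    # Bring both nodes to the same depth, then climb together until they meet.
    while x.depth > y.depth:
        x = x.parent
    while y.depth > x.depth:
        y = y.parent
    while x is not y:
        x, y = x.parent, y.parent
    return x.similarity


# Hypothetical catalogue: values are illustrative, not experimental results.
root = TaxonomyNode(0.1)
fiction = TaxonomyNode(0.5, root)
scifi = TaxonomyNode(0.8, fiction)
romance = TaxonomyNode(0.7, fiction)
nonfiction = TaxonomyNode(0.6, root)

leaf_of = {
    "item_a": scifi,
    "item_b": scifi,
    "item_c": romance,
    "item_d": nonfiction,
}

print(taxonomy_similarity(leaf_of, "item_a", "item_b"))
print(taxonomy_similarity(leaf_of, "item_a", "item_c"))
print(taxonomy_similarity(leaf_of, "item_a", "item_d"))
```

Each lookup costs at most the height of the taxonomy, which is the kind of saving over a full item-by-item similarity matrix that the abstract's scalability claim suggests.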