Dynamic load balancing in parallel KD-tree k-means
One of the most influential and popular data mining methods is the k-Means algorithm for cluster analysis. Techniques for improving the efficiency of k-Means have been explored largely in two main directions. First, the amount of computation can be significantly reduced by adopting geometrical constraints and an efficient data structure, notably a multidimensional binary search tree (KD-Tree). These techniques reduce the number of distance computations the algorithm performs at each iteration. A second direction is parallel processing, where data and computation loads are distributed over many processing nodes. However, little work has been done to provide a parallel formulation of the efficient sequential techniques based on KD-Trees. Such approaches are expected to have an irregular distribution of computation load and can suffer from load imbalance. This issue has so far limited the adoption of these efficient k-Means variants in parallel computing environments. In this work, we provide a parallel formulation of the KD-Tree based k-Means algorithm for distributed memory systems and address its load balancing issue. Three solutions have been developed and tested. Two approaches are based on a static partitioning of the data set, and a third solution incorporates a dynamic load balancing policy.
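As a rough illustration of the cost the abstract refers to, the sketch below runs one plain (brute-force) k-Means iteration and counts the distance computations it performs. This is not the KD-Tree variant or the parallel formulation from the paper; the function name `kmeans_step` and the toy data are assumptions made only for this example.

```python
import math

def kmeans_step(points, centroids):
    """One plain Lloyd iteration (hypothetical helper, not the paper's method).

    Returns the updated centroids and the number of point-to-centroid
    distance computations, the quantity KD-Tree based variants aim to reduce.
    """
    k = len(centroids)
    dim = len(centroids[0])
    sums = [[0.0] * dim for _ in range(k)]
    counts = [0] * k
    distance_count = 0

    for p in points:
        best, best_d = 0, math.inf
        for j, c in enumerate(centroids):
            # Squared Euclidean distance is enough to find the nearest centroid.
            d = sum((a - b) ** 2 for a, b in zip(p, c))
            distance_count += 1
            if d < best_d:
                best, best_d = j, d
        counts[best] += 1
        for i, a in enumerate(p):
            sums[best][i] += a

    # Keep the old centroid if a cluster received no points.
    new_centroids = [
        [s / counts[j] for s in sums[j]] if counts[j] else list(centroids[j])
        for j in range(k)
    ]
    return new_centroids, distance_count

# Toy data: two well-separated groups of two points each.
points = [(0.0, 0.0), (0.0, 1.0), (10.0, 10.0), (10.0, 11.0)]
centroids, n_dist = kmeans_step(points, [(0.0, 0.0), (10.0, 10.0)])
```

The brute-force step always performs (number of points) x (number of centroids) distance computations. A tree-based variant prunes a data-dependent share of these, which is consistent with the abstract's remark that the resulting load is irregular across partitions.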
A Multi-Tier Knowledge Discovery Info-Structure Using Ensemble Techniques
Our main focus is to learn rule instances discovered from unannotated data and to generate more accurate and conclusive results.
Underdetermined source separation using a sparse STFT framework and weighted laplacian directional modelling
The instantaneous underdetermined audio source separation problem, in which
L sources are mixed onto K sensors (K < L), has been addressed by many
different approaches, provided the sources remain quite distinct in the virtual
positioning space spanned by the sensors. This problem can be tackled as a
directional clustering problem along the source position angles in the mixture.
The use of Generalised Directional Laplacian Densities (DLD) in the MDCT domain
for underdetermined source separation has been proposed before. Here, we derive
weighted mixtures of DLDs in a sparser representation of the data in the STFT
domain to perform separation. The proposed approach yields improved results
compared to our previous offering and compares favourably with the
state-of-the-art.
Comment: EUSIPCO 2016, Budapest, Hungary
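For orientation, the instantaneous mixing model and the directional view mentioned in the abstract can be written as follows. The symbols (x, A, s, θ) and the restriction to K = 2 sensors are assumptions chosen for illustration; the abstract itself does not fix a notation.

```latex
% Instantaneous mixing of L sources onto K sensors, with K < L (underdetermined)
\mathbf{x}(n) = \mathbf{A}\,\mathbf{s}(n), \qquad
\mathbf{A} \in \mathbb{R}^{K \times L}, \quad K < L .

% Assumed two-sensor case (K = 2): direction angle of each mixture sample
\theta(n) = \arctan\!\left(\frac{x_2(n)}{x_1(n)}\right).

% If only source l is active and its mixing column is
% a_l = [\cos\theta_l, \sin\theta_l]^T, then x(n) = a_l s_l(n), so
\theta(n) = \theta_l \pmod{\pi}.
```

Under this sketch, samples in a sparse transform domain where one source dominates concentrate around that source's angle, which is why the separation can be approached as a clustering problem over θ.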
A Multi-Tier Knowledge Discovery Info-Structure Using Ensemble Techniques [QA76.9.D35 S158 2007 f rb].
Our main focus is to learn rule instances discovered from unannotated data and to generate more accurate and conclusive results. This is done via a hybridized methodology that features both supervised and unsupervised techniques. Unannotated data without prior classification information can now be put to use, as our research has brought new insight to knowledge discovery and learning as a whole.
BlogForever D2.6: Data Extraction Methodology
This report outlines an inquiry into the area of web data extraction, conducted within the context of blog preservation. The report reviews theoretical advances and practical developments for implementing data extraction. The inquiry is extended through an experiment that demonstrates the effectiveness and feasibility of implementing some of the suggested approaches. More specifically, the report discusses an approach based on unsupervised machine learning that employs the RSS feeds and HTML representations of blogs. It outlines the possibilities of extracting semantics available in blogs and demonstrates the benefits of exploiting available standards such as microformats and microdata. The report then proposes a methodology for extracting and processing blog data to further inform the design and development of the BlogForever platform.