4 research outputs found

Statistical estimators as an alternative to standard deviation in weighted Euclidean distance cluster analysis

Authors: Dalatu Paul Inuwa; Midi Habshah
Publication venue: Universiti Putra Malaysia Press
Publication date: 01/01/2018

Clustering is one of the primary data mining tools. It helps researchers understand the natural grouping of attributes in datasets. Clustering is an unsupervised classification method whose major aim is partitioning: objects in the same cluster are similar, while objects that belong to different clusters vary significantly with respect to their attributes. However, the classical Standardized Euclidean distance, which uses the standard deviation to down-weight the maximum points of the ith feature in distance-based clustering, has been criticized by many scholars for producing outliers, lacking robustness, and having a 0% breakdown point. It also has low efficiency under the normal distribution. Therefore, to remedy the problem, we suggest two statistical estimators that have 50% breakdown points, namely the Sn and Qn estimators, with 58% and 82% efficiency, respectively. The proposed methods evidently outperformed the existing methods in down-weighting the maximum points of the ith feature in distance-based clustering analysis.
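The abstract gives no formulas, so the following is only a minimal sketch of the idea it describes: replacing the per-feature standard deviation in a standardized Euclidean distance with a robust scale estimate. It assumes the common robust-statistics definitions of Sn (a median of medians of absolute pairwise differences, scaled by 1.1926) and Qn (a low-order statistic of the absolute pairwise differences, scaled by 2.2219). Finite-sample correction factors are omitted, and Sn uses plain medians rather than the high/low medians of the exact definition. The function names `sn_scale`, `qn_scale`, and `weighted_euclidean` are hypothetical and do not come from the paper.

```python
import numpy as np


def sn_scale(x):
    """Simplified Sn scale estimate: 1.1926 * med_i( med_j |x_i - x_j| ).

    Assumption: plain medians are used in place of the high/low medians
    of the exact definition, and no finite-sample correction is applied.
    """
    x = np.asarray(x, dtype=float)
    diffs = np.abs(x[:, None] - x[None, :])      # all |x_i - x_j|, including i == j
    return 1.1926 * np.median(np.median(diffs, axis=1))


def qn_scale(x):
    """Simplified Qn scale estimate: 2.2219 * k-th smallest |x_i - x_j|, i < j.

    Here k = h * (h - 1) / 2 with h = n // 2 + 1. No finite-sample
    correction is applied.
    """
    x = np.asarray(x, dtype=float)
    n = len(x)
    h = n // 2 + 1
    k = h * (h - 1) // 2
    i, j = np.triu_indices(n, k=1)               # index pairs with i < j
    pairwise = np.sort(np.abs(x[i] - x[j]))
    return 2.2219 * pairwise[k - 1]              # k is 1-based


def weighted_euclidean(a, b, scale):
    """Euclidean distance with each feature difference divided by its scale."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    scale = np.asarray(scale, dtype=float)
    return float(np.sqrt(np.sum(((a - b) / scale) ** 2)))


# Illustrative data (not from the paper): one feature with a gross outlier.
feature = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 100.0])
sd, sn, qn = np.std(feature, ddof=1), sn_scale(feature), qn_scale(feature)
```

On this toy feature the single outlier inflates the standard deviation, while both robust estimates stay on the order of the spread of the five regular points. That is the behaviour a 50% breakdown point is meant to provide. A per-feature vector of such estimates can be passed as `scale` to `weighted_euclidean`.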

Universiti Putra Malaysia Institutional Repository

The best of both worlds: highlighting the synergies of combining manual and automatic knowledge organization methods to improve information search and discovery.

Authors: Burnett Simon; Cleverley Paul H.
Publication venue: Nomos Verlag
Publication date: 01/01/2015

Research suggests organizations across all sectors waste a significant amount of time looking for information and often fail to leverage the information they have. In response, many organizations have deployed some form of enterprise search to improve the 'findability' of information. Debates persist as to whether thesauri and manual indexing or automated machine learning techniques should be used to enhance discovery of information. In addition, the extent to which a knowledge organization system (KOS) enhances discoveries or indeed blinds us to new ones remains a moot point. The oil and gas industry was used as a case study using a representative organization. Drawing on prior research, a theoretical model is presented which aims to overcome the shortcomings of each approach. This synergistic model could help to re-conceptualize the 'manual' versus 'automatic' debate in many enterprises, accommodating a broader range of information needs. This may enable enterprises to develop more effective information and knowledge management strategies and ease the tension between what are often perceived as mutually exclusive competing approaches. Certain aspects of the theoretical model may be transferable to other industries, which is an area for further research.

Crossref

Open Access Institutional Repository at Robert Gordon University

A New Method for Evaluating Automatically Learned Terminological Taxonomies

Authors: J. María Ruiz-Martínez; P. Velardi; R. Navigli; S. Faralli
Publication date: 01/01/2012

Archivio della ricerca- Università di Roma La Sapienza

Pertanika Journal of Science & Technology

Author: Universiti Putra Malaysia Press
Publication venue: Universiti Putra Malaysia Press
Publication date: 01/01/2018

Universiti Putra Malaysia Institutional Repository