Clustering technique for conceptual clusters

Author: Anquetil Nicolas
Ducasse Stéphane
Govin Brice
Monegier Du Sorbier Arnaud
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 22/08/2016

Clustering aims to classify elements into groups called classes or clusters. Clustering is used in reverse engineering to help understand legacy software. It is also a technique used in re-engineering to propose groupings of software entities to engineers, who can then accept or reject them. This paper presents a Pharo implementation of an iterative, semi-automatic clustering method. Our method proposes clusters to an end user based on domain information and structural information. The method has been applied in an industrial architecture-migration project. We show that it helps engineers cluster software elements into domain concepts. After the automated part, a high-level clustering achieves 56% precision and 79% recall; a deeper clustering achieves 51% precision and 52% recall.
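The abstract reports precision and recall but does not say how they are computed. One common convention for scoring a proposed clustering against a reference one is pair counting, sketched below in Python. The function names and the toy entity names are hypothetical, and this is not necessarily the measure the authors used.

```python
from itertools import combinations


def same_cluster_pairs(assignment):
    """Return the set of unordered element pairs that share a cluster label."""
    return {
        frozenset((a, b))
        for a, b in combinations(sorted(assignment), 2)
        if assignment[a] == assignment[b]
    }


def pairwise_precision_recall(proposed, reference):
    """Pair-counting precision and recall of `proposed` against `reference`.

    Both arguments map an element name to a cluster label.
    """
    proposed_pairs = same_cluster_pairs(proposed)
    reference_pairs = same_cluster_pairs(reference)
    agreed = proposed_pairs & reference_pairs
    precision = len(agreed) / len(proposed_pairs) if proposed_pairs else 0.0
    recall = len(agreed) / len(reference_pairs) if reference_pairs else 0.0
    return precision, recall


# Hypothetical software entities grouped by the tool and by an engineer.
proposed = {"Invoice": "billing", "Payment": "billing",
            "Customer": "billing", "Order": "sales"}
reference = {"Invoice": "billing", "Payment": "billing",
             "Customer": "crm", "Order": "crm"}
precision, recall = pairwise_precision_recall(proposed, reference)
```

Under this convention, precision drops when the tool merges entities that the reference keeps apart, and recall drops when it splits entities that the reference groups together.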

Crossref

INRIA a CCSD electronic archive server

HAL Descartes

Hal-Diderot

Color Image Clustering using Block Truncation Algorithm

Author: Maheshwari Manish
Motwani Mahesh
Silakari Sanjay
Publication venue: International Journal of Computer Science Issues, IJCSI
Publication date: 01/09/2009

With advances in image-capturing devices, image data is being generated in high volume. If images are analyzed properly, they can reveal useful information to human users. Content-based image retrieval addresses the problem of retrieving images relevant to the user's needs from image databases on the basis of low-level visual features that can be derived from the images. Grouping images into meaningful categories to reveal useful information is a challenging and important problem. Clustering is a data mining technique that groups a set of unlabeled data based on the conceptual clustering principle: maximizing the intraclass similarity and minimizing the interclass similarity. The proposed framework focuses on color as the feature. Color Moment and Block Truncation Coding (BTC) are used to extract features from the image dataset. An experimental study using the K-Means clustering algorithm is conducted to group the image dataset into clusters.
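As a rough illustration of the pipeline described above, the following Python sketch computes color-moment features and groups them with K-Means. It covers only the color-moment and clustering steps, not BTC. The 9-value feature layout, the first-k centroid initialization, and the toy pixel data are assumptions made for the example, not details taken from the paper.

```python
import math


def color_moments(pixels):
    """Return mean, standard deviation, and skewness for each RGB channel.

    `pixels` is a list of (r, g, b) tuples; the result has 9 values.
    """
    n = len(pixels)
    features = []
    for channel in range(3):
        values = [p[channel] for p in pixels]
        mean = sum(values) / n
        variance = sum((v - mean) ** 2 for v in values) / n
        third = sum((v - mean) ** 3 for v in values) / n
        # Signed cube root keeps the skewness term in the units of the mean.
        skew = math.copysign(abs(third) ** (1 / 3), third)
        features.extend([mean, math.sqrt(variance), skew])
    return features


def kmeans(points, k, iterations=20):
    """Lloyd's algorithm; returns one cluster label per point.

    Centroids start at the first k points (an assumption, for determinism).
    """
    centroids = [list(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: nearest centroid by squared Euclidean distance.
        labels = [
            min(range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


# Four tiny hypothetical "images": reddish, bluish, reddish, bluish.
images = [
    [(250, 10, 10), (240, 20, 10)],
    [(10, 10, 250), (20, 20, 240)],
    [(230, 30, 20), (250, 10, 20)],
    [(20, 10, 230), (10, 30, 250)],
]
labels = kmeans([color_moments(img) for img in images], k=2)
```

In practice the centroids would usually be initialized at random or with a seeding scheme such as k-means++; starting from the first k points keeps the sketch reproducible.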

CogPrints Cognitive Sciences Eprint Archive

Methods of conceptual clustering and their relation to numerical taxonomy

Author: Fisher Douglas
Langley Pat
Publication venue: eScholarship, University of California
Publication date: 22/07/1985

Artificial Intelligence (AI) methods for machine learning can be viewed as forms of exploratory data analysis, even though they differ markedly from the statistical methods generally connoted by the term. The distinction between methods of machine learning and statistical data analysis is primarily due to differences in the way techniques of each type represent data and structure within data. That is, methods of machine learning are strongly biased toward symbolic (as opposed to numeric) data representations. We explore this difference within a limited context, devoting the bulk of our paper to the explication of conceptual clustering, an extension to the statistically based methods of numerical taxonomy. In conceptual clustering, the formation of object clusters depends on the quality of 'higher-level' characterizations, termed concepts, of the clusters. The form of concepts used by existing conceptual clustering systems (sets of necessary and sufficient conditions) is described in some detail. This is followed by descriptions of several conceptual clustering techniques, along with sample output. We conclude with a discussion of how alternative concept representations might enhance the effectiveness of future conceptual clustering systems.
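To make "sets of necessary and sufficient conditions" concrete, the Python sketch below derives a conjunctive description for a cluster and checks whether it excludes objects outside the cluster. The attribute-value representation, the function names, and the toy objects are illustrative assumptions; this is not a reconstruction of any specific system surveyed in the paper.

```python
def shared_conditions(members):
    """Attribute-value pairs held by every cluster member (necessary conditions)."""
    common = set(members[0].items())
    for obj in members[1:]:
        common &= set(obj.items())
    return dict(common)


def satisfies(obj, concept):
    """True if the object meets every condition in the concept."""
    return all(obj.get(attr) == value for attr, value in concept.items())


def is_sufficient(concept, outsiders):
    """True if no object outside the cluster meets the conditions."""
    return not any(satisfies(obj, concept) for obj in outsiders)


# Hypothetical objects described by symbolic attributes.
cluster = [
    {"cover": "hair", "legs": 4, "flies": False},
    {"cover": "hair", "legs": 2, "flies": True},
]
others = [{"cover": "feathers", "legs": 2, "flies": True}]

concept = shared_conditions(cluster)
sufficient = is_sufficient(concept, others)
```

A cluster whose shared conditions are empty, or that fail the sufficiency check, has no clean conjunctive description. That is one way the quality of a characterization can feed back into which clusters are formed.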

eScholarship - University of California

Approaches to conceptual clustering

Author: Fisher Douglas
Langley Pat
Publication venue: eScholarship, University of California
Publication date: 12/07/1985

Methods for conceptual clustering may be explicated in two lights. Conceptual clustering methods may be viewed as extensions to techniques of numerical taxonomy, a collection of methods developed by social and natural scientists for creating classification schemes over object sets. Alternatively, conceptual clustering may be viewed as a form of learning by observation or concept formation, as opposed to methods of learning from examples or concept identification. In this paper we survey and compare a number of conceptual clustering methods along dimensions suggested by each of these views. The point we most wish to clarify is that conceptual clustering processes can be explicated as being composed of three distinct but interdependent subprocesses: the process of deriving a hierarchical classification scheme; the process of aggregating objects into individual classes; and the process of assigning conceptual descriptions to object classes. Each subprocess may be characterized along a number of dimensions related to search, thus facilitating a better understanding of the conceptual clustering process as a whole.

eScholarship - University of California

Generating predictions to aid the scientific discovery process

Author: Jones Randy
Publication venue: eScholarship, University of California
Publication date: 15/07/1986

NGLAUBER is a system that models the scientific discovery of qualitative empirical laws. As such, it falls into the category of scientific discovery systems. However, the program can also be viewed as a conceptual clustering system, since it forms classes of objects and characterizes these classes. NGLAUBER differs from existing scientific discovery and conceptual clustering systems in a number of ways: it uses an incremental method to group objects into classes; these classes are formed based on the relationships between objects rather than just the attributes of objects; the system describes the relationships between classes rather than simply describing the classes; and, most importantly, NGLAUBER proposes experiments by predicting future data. The experiments help the system guide itself through the search for regularities in the data.

eScholarship - University of California

On the Effect of Semantically Enriched Context Models on Software Modularization

Author: Hage Jurriaan
Jansen Slinger
Khadka Ravi
Saeidi Amir
Publication venue: 'Aspect-Oriented Software Association (AOSA)'
Publication date: 04/08/2017

Many existing approaches to program comprehension rely on the linguistic information found in source code, such as identifier names and comments. Semantic clustering is one such technique for modularizing a system; it relies on the informal semantics of the program, encoded in the vocabulary used in the source code. Treating the source code as a collection of tokens loses the semantic information embedded within the identifiers. We try to overcome this problem by introducing context models for source code identifiers to obtain a semantic kernel, which can be used both for deriving the topics that run through the system and for their clustering. In the first model, we abstract an identifier to its type representation and build on this notion of context to construct a contextual vector representation of the source code. The second notion of context is defined based on the flow of data between identifiers, representing a module as a dependency graph in which the nodes correspond to identifiers and the edges represent the data dependencies between pairs of identifiers. We have applied our approach to 10 medium-sized open source Java projects and show that introducing contexts for identifiers improves the quality of the modularization of the software systems. Both context models give results that are superior to the plain vector representation of documents. In some cases, the authoritativeness of decompositions is improved by 67%. Furthermore, a more detailed evaluation of our approach on JEdit, an open source editor, demonstrates that topics inferred by performing topic analysis on the contextual representations are more meaningful than those inferred from the plain representation of the documents.
The proposed context model for source code identifiers paves the way for building tools that support developers in program comprehension tasks such as application and domain concept location, software modularization, and topic analysis.

arXiv.org e-Print Archive

Heriot Watt Pure

Crossref

ZENODO

Utrecht University Repository

FigShare

Matching Image Sets via Adaptive Multi Convex Hull

Author: Chen Shaokang
Lovell Brian C.
Sanderson Conrad
Wiliem Arnold
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2014

Traditional nearest points methods use all the samples in an image set to construct a single convex or affine hull model for classification. However, strong artificial features and noisy data may be generated from combinations of training samples when significant intra-class variations and/or noise occur in the image set. Existing multi-model approaches extract local models by clustering each image set individually only once, with fixed clusters used for matching with various image sets. This may not be optimal for discrimination, as undesirable environmental conditions (e.g., illumination and pose variations) may result in the two closest clusters representing different characteristics of an object (e.g., a frontal face being compared to a non-frontal face). To address this problem, we propose a novel approach that enhances nearest-points-based methods by integrating affine/convex hull classification with an adapted multi-model approach. We first extract multiple local convex hulls from a query image set via maximum margin clustering to diminish the artificial variations and constrain the noise in local convex hulls. We then propose adaptive reference clustering (ARC) to constrain the clustering of each gallery image set by forcing the clusters to resemble the clusters in the query image set. By applying ARC, noisy clusters in the query set can be discarded. Experiments on the Honda, MoBo and ETH-80 datasets show that the proposed method outperforms single-model approaches and other recent techniques, such as Sparse Approximated Nearest Points, Mutual Subspace Method and Manifold Discriminant Analysis. Comment: IEEE Winter Conference on Applications of Computer Vision (WACV), 201

arXiv.org e-Print Archive

CiteSeerX

Crossref

Queensland University of Technology ePrints Archive

University of Queensland eSpace

A Role-Based Taxonomy of Human Resource Organizations

Author: Dyer Lee
Labelle Christiane M.
Publication venue: DigitalCommons@ILR
Publication date: 10/07/1992

[Excerpt] An empirically derived classification (taxonomy) of human resource departments, based on a few fundamental roles played in organizations, was developed as an alternative to the mostly speculative existing typologies. Four types emerged: the strategic partner, the strategic advisor, the operational partner, and the operational administrator. The stability of the solution and the relationships with variables not used to generate it were found satisfactory. The types show some similarities with those identified in the literature.

DigitalCommons@ILR

eCommons@Cornell