Search CORE

127,288 research outputs found


Approaches to conceptual clustering

Author: Fisher Douglas
Langley Pat
Publication venue: eScholarship, University of California
Publication date: 12/07/1985

Methods for conceptual clustering may be explicated in two lights. Conceptual clustering methods may be viewed as extensions to techniques of numerical taxonomy, a collection of methods developed by social and natural scientists for creating classification schemes over object sets. Alternatively, conceptual clustering may be viewed as a form of learning by observation or concept formation, as opposed to methods of learning from examples or concept identification. In this paper we survey and compare a number of conceptual clustering methods along dimensions suggested by each of these views. The point we most wish to clarify is that conceptual clustering processes can be explicated as being composed of three distinct but interdependent subprocesses: the process of deriving a hierarchical classification scheme; the process of aggregating objects into individual classes; and the process of assigning conceptual descriptions to object classes. Each subprocess may be characterized along a number of dimensions related to search, thus facilitating a better understanding of the conceptual clustering process as a whole.

eScholarship - University of California

Concept Tree Based Clustering Visualization with Shaded Similarity Matrices

Author: Gasser Les
Wang Jun
Yu Bei
Publication venue: SURFACE at Syracuse University
Publication date: 01/12/2002

One of the problems with existing clustering methods is that the interpretation of clusters may be difficult. Two different approaches have been used to solve this problem: conceptual clustering in machine learning and clustering visualization in statistics and graphics. The purpose of this paper is to investigate the benefits of combining clustering visualization and conceptual clustering to obtain better cluster interpretations. In our research we have combined concept trees for conceptual clustering with shaded similarity matrices for visualization. Experimentation shows that the two interpretation approaches can complement each other to help us understand data better.

Syracuse University Research Facility and Collaborative Environment

A Multiobjective Evolutionary Conceptual Clustering Methodology for Gene Annotation Within Structural Databases: A Case of Study on the Gene Ontology Database

Author: Cordón Óscar
Herrera Francisco
Perren Cobb J.
Romero Zaliz Rocío C.
Rubio Escudero Cristina
Zwir Igor
Publication venue: IEEE Computer Society
Publication date: 01/01/2008

Current tools and techniques devoted to examining the content of large databases are often hampered by their inability to support searches based on criteria that are meaningful to their users. These shortcomings are particularly evident in data banks storing representations of structural data such as biological networks. Conceptual clustering techniques have proven appropriate for uncovering relationships between features that characterize objects in structural data. However, typical conceptual clustering approaches normally recover the most obvious relations but fail to discover the less frequent but more informative underlying data associations. The combination of evolutionary algorithms with multiobjective and multimodal optimization techniques constitutes a suitable tool for solving this problem. We propose a novel conceptual clustering methodology termed evolutionary multiobjective conceptual clustering (EMO-CC), relying on the NSGA-II multiobjective (MO) genetic algorithm. We apply this methodology to identify conceptual models in structural databases generated from gene ontologies. These models can explain and predict phenotypes in the immunoinflammatory response problem, similar to those provided by gene expression or other genetic markers. The analysis of these results reveals that our approach uncovers cohesive clusters, even those comprising a small number of observations explained by several features, which allows describing objects and their interactions from different perspectives and at different levels of detail.

Funding: Ministerio de Ciencia y Tecnología TIC-2003-00877; Ministerio de Ciencia y Tecnología BIO2004-0270E; Ministerio de Ciencia y Tecnología TIN2006-1287

idUS. Depósito de Investigación Universidad de Sevilla

A model-based conceptual clustering of moving objects in video surveillance

Author: Lee Jeongkyu
Rajauria Pragya
Shah Subodh K.
Publication venue: 'SPIE-Intl Soc Optical Eng'
Publication date: 29/01/2007

Copyright 2007 Society of Photo-Optical Instrumentation Engineers. One print or electronic copy may be made for personal use only. Systematic reproduction and distribution, duplication of any material in this paper for a fee or for commercial purposes, or modification of the content of the paper are prohibited.

Data mining techniques have been applied in video databases to identify various patterns or groups. Clustering analysis is used to find the patterns and groups of moving objects in video surveillance systems. Most existing methods for the clustering focus on finding the optimum of overall partitioning. However, these approaches cannot provide meaningful descriptions of the clusters. Also, they are not very suitable for moving object databases since video data have spatial and temporal characteristics, and high-dimensional attributes. In this paper, we propose a model-based conceptual clustering (MCC) of moving objects in video surveillance based on a formal concept analysis. Our proposed MCC consists of three steps: 'model formation', 'model-based concept analysis', and 'concept graph generation'. The generated concept graph provides conceptual descriptions of moving objects. In order to assess the proposed approach, we conduct comprehensive experiments with artificial and real video surveillance data sets. The experimental results indicate that our MCC dominates two other methods, i.e., generality-based and error-based conceptual clustering algorithms, in terms of quality of concepts.

http://dx.doi.org/10.1117/12.70822

UB ScholarWorks

Crossref

Bayesian Hierarchical Modelling for Tailoring Metric Thresholds

Author: Bettenburg N.
Bob Carpenter
McIlreath Richard
Oliveira P.
Panichella A.
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 06/04/2018

Software is highly contextual. While there are cross-cutting 'global' lessons, individual software projects exhibit many 'local' properties. This data heterogeneity makes drawing local conclusions from global data dangerous. A key research challenge is to construct locally accurate prediction models that are informed by global characteristics and data volumes. Previous work has tackled this problem using clustering and transfer learning approaches, which identify locally similar characteristics. This paper applies a simpler approach known as Bayesian hierarchical modeling. We show that hierarchical modeling supports cross-project comparisons, while preserving local context. To demonstrate the approach, we conduct a conceptual replication of an existing study on setting software metrics thresholds. Our emerging results show our hierarchical model reduces model prediction error compared to a global approach by up to 50%.

Comment: Short paper, published at MSR '18: 15th International Conference on Mining Software Repositories, May 28-29, 2018, Gothenburg, Sweden
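The idea of local estimates informed by global data can be illustrated with a minimal partial-pooling sketch. This is not the model from the paper: it assumes a textbook normal-normal setup with known variances, and the names `partial_pooling_mean`, `sigma2`, `tau2`, and the project data are hypothetical values chosen for illustration.

```python
def partial_pooling_mean(values, global_mean, sigma2, tau2):
    """Posterior mean of one project's metric under a normal-normal model.

    Assumptions (illustrative, not taken from the paper):
      - observations in a project are Normal(theta, sigma2), sigma2 known
      - theta itself is Normal(global_mean, tau2), tau2 known
    The result is a precision-weighted average of the local mean and the
    global mean, so projects with little data shrink toward the global value.
    """
    n = len(values)
    local_mean = sum(values) / n
    precision_data = n / sigma2      # weight carried by the project's own data
    precision_prior = 1.0 / tau2     # weight carried by the global level
    return (precision_data * local_mean + precision_prior * global_mean) / (
        precision_data + precision_prior
    )


# Hypothetical metric values for two projects with the same local mean (32)
# but very different amounts of data.
projects = {
    "small": [30.0, 34.0],
    "large": [30.0, 34.0] * 10,
}

estimates = {
    name: partial_pooling_mean(vals, global_mean=20.0, sigma2=16.0, tau2=4.0)
    for name, vals in projects.items()
}

for name, value in estimates.items():
    print(name, value)
```

Both projects have the same raw mean, yet the small project's estimate is pulled further toward the global mean than the large project's. That is the sense in which a hierarchical model can preserve local context while borrowing strength from global data.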

arXiv.org e-Print Archive

Crossref

Constraint Programming for Multi-criteria Conceptual Clustering

Author: B Ganter
J Motwani
L Hossain
M Khiari
MM Ahmad
N Lazaar
N Pasquier
P Schaus
T Guns
T Uno
TBH Dao
W Ugarte
YC Law
Publication venue: 'Springer Fachmedien Wiesbaden GmbH'
Publication date: 28/08/2017

A conceptual clustering is a set of formal concepts (i.e., closed itemsets) that defines a partition of a set of transactions. Finding a conceptual clustering is an NP-complete problem for which Constraint Programming (CP) and Integer Linear Programming (ILP) approaches have been recently proposed. We introduce new CP models to solve this problem: a pure CP model that uses set constraints, and a hybrid model that uses a data mining tool to extract formal concepts in a preprocessing step and then uses CP to select a subset of formal concepts that defines a partition. We compare our new models with recent CP and ILP approaches on classical machine learning instances. We also introduce a new set of instances coming from a real application case, which aims at extracting setting concepts from an Enterprise Resource Planning (ERP) software. We consider two classic criteria to optimize, i.e., the frequency and the size. We show that these criteria lead to extreme solutions with either very few small formal concepts or many large formal concepts, and that compromise clusterings may be obtained by computing the Pareto front of non-dominated clusterings.
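The two-step idea described above (extract formal concepts, then select a subset that partitions the transactions) can be sketched by brute force on a toy dataset. This is an illustrative sketch only: it uses exhaustive enumeration rather than CP or ILP, the transactions are invented, and the scoring (minimum cluster frequency and minimum description size, both maximized) is one assumed reading of the two criteria. All function names are hypothetical.

```python
from itertools import combinations

# Toy transaction database (invented for illustration).
transactions = {
    "t1": frozenset("ab"),
    "t2": frozenset("abc"),
    "t3": frozenset("cd"),
    "t4": frozenset("cde"),
}


def formal_concepts(db):
    """Enumerate (extent, intent) pairs with non-empty extent by brute force."""
    all_items = sorted(set().union(*db.values()))
    found = set()
    for r in range(len(all_items) + 1):
        for combo in combinations(all_items, r):
            items = frozenset(combo)
            extent = frozenset(t for t, row in db.items() if items <= row)
            if extent:
                # The intent is the closed itemset shared by the whole extent.
                intent = frozenset.intersection(*(db[t] for t in extent))
                found.add((extent, intent))
    return found


def conceptual_clusterings(concept_set, universe):
    """Yield every set of concepts whose extents partition the universe."""
    ordered = sorted(concept_set, key=lambda c: sorted(c[0]))

    def extend(chosen, covered, start):
        if covered == universe:
            yield tuple(chosen)
            return
        for i in range(start, len(ordered)):
            extent = ordered[i][0]
            if not (extent & covered):  # extents must be pairwise disjoint
                yield from extend(chosen + [ordered[i]], covered | extent, i + 1)

    yield from extend([], frozenset(), 0)


def score(clustering):
    """Assumed criteria: (min frequency, min size), both to be maximized."""
    return (
        min(len(extent) for extent, _ in clustering),
        min(len(intent) for _, intent in clustering),
    )


def pareto_front(scored):
    """Keep the entries whose score is not dominated by another score."""
    front = []
    for cand, s in scored:
        dominated = any(
            o != s and all(a >= b for a, b in zip(o, s)) for _, o in scored
        )
        if not dominated:
            front.append((cand, s))
    return front


concepts = formal_concepts(transactions)
clusterings = list(conceptual_clusterings(concepts, frozenset(transactions)))
front = pareto_front([(c, score(c)) for c in clusterings])

for clustering, s in front:
    print(s, [sorted(extent) for extent, _ in clustering])
```

On this toy data the trade-off is already visible: a single cluster covering every transaction has high frequency but an empty description, while splitting the data yields smaller clusters with richer descriptions. Exhaustive enumeration is exponential in the number of items, which is why dedicated CP or ILP models are needed on real instances.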

Crossref