50,796 research outputs found

Recommended from our members

Methods of conceptual clustering and their relation to numerical taxonomy

Authors: Fisher Douglas; Langley Pat
Publication venue: eScholarship, University of California
Publication date: 22/07/1985

Artificial Intelligence (AI) methods for machine learning can be viewed as forms of exploratory data analysis, even though they differ markedly from the statistical methods generally connoted by the term. The distinction between methods of machine learning and statistical data analysis is primarily due to differences in the way techniques of each type represent data and structure within data. That is, methods of machine learning are strongly biased toward symbolic (as opposed to numeric) data representations. We explore this difference within a limited context, devoting the bulk of our paper to the explication of conceptual clustering, an extension to the statistically based methods of numerical taxonomy. In conceptual clustering the formation of object clusters is dependent on the quality of 'higher-level' characterizations, termed concepts, of the clusters. The form of concepts used by existing conceptual clustering systems (sets of necessary and sufficient conditions) is described in some detail. This is followed by descriptions of several conceptual clustering techniques, along with sample output. We conclude with a discussion of how alternative concept representations might enhance the effectiveness of future conceptual clustering systems
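The abstract's notion of a concept as a set of necessary and sufficient conditions can be illustrated with a minimal sketch. The attribute names, the toy objects, and the helper functions below are assumptions made for illustration; they are not taken from the paper or from any of the systems it surveys.

```python
from typing import Dict, List

# An object is a mapping from symbolic attribute names to symbolic values.
Obj = Dict[str, str]


def conjunctive_concept(cluster: List[Obj]) -> Dict[str, str]:
    """Return the attribute values shared by every cluster member.

    These shared values are the necessary conditions of the cluster.
    """
    first = cluster[0]
    return {
        attr: value
        for attr, value in first.items()
        if all(obj.get(attr) == value for obj in cluster)
    }


def covers(concept: Dict[str, str], obj: Obj) -> bool:
    """True if the object satisfies every condition in the concept."""
    return all(obj.get(attr) == value for attr, value in concept.items())


def is_sufficient(concept: Dict[str, str], outsiders: List[Obj]) -> bool:
    """True if no object outside the cluster satisfies the concept."""
    return not any(covers(concept, obj) for obj in outsiders)


# Hypothetical toy data: two candidate clusters over three attributes.
birds = [
    {"cover": "feathers", "legs": "2", "flies": "yes"},
    {"cover": "feathers", "legs": "2", "flies": "no"},
]
mammals = [
    {"cover": "fur", "legs": "4", "flies": "no"},
    {"cover": "fur", "legs": "2", "flies": "yes"},
]

bird_concept = conjunctive_concept(birds)
mammal_concept = conjunctive_concept(mammals)
```

In this sketch a clustering would be judged good when each cluster's shared conditions also exclude every object outside it, which is one simple reading of cluster formation depending on the quality of the concept.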

eScholarship - University of California

On the Effect of Semantically Enriched Context Models on Software Modularization

Authors: Hage Jurriaan; Jansen Slinger; Khadka Ravi; Saeidi Amir
Publication venue: 'Aspect-Oriented Software Association (AOSA)'
Publication date: 04/08/2017

Many of the existing approaches for program comprehension rely on the linguistic information found in source code, such as identifier names and comments. Semantic clustering is one such technique for modularization of the system that relies on the informal semantics of the program, encoded in the vocabulary used in the source code. Treating the source code as a collection of tokens loses the semantic information embedded within the identifiers. We try to overcome this problem by introducing context models for source code identifiers to obtain a semantic kernel, which can be used for both deriving the topics that run through the system as well as their clustering. In the first model, we abstract an identifier to its type representation and build on this notion of context to construct contextual vector representation of the source code. The second notion of context is defined based on the flow of data between identifiers to represent a module as a dependency graph where the nodes correspond to identifiers and the edges represent the data dependencies between pairs of identifiers. We have applied our approach to 10 medium-sized open source Java projects, and show that by introducing contexts for identifiers, the quality of the modularization of the software systems is improved. Both of the context models give results that are superior to the plain vector representation of documents. In some cases, the authoritativeness of decompositions is improved by 67%. Furthermore, a more detailed evaluation of our approach on JEdit, an open source editor, demonstrates that inferred topics through performing topic analysis on the contextual representations are more meaningful compared to the plain representation of the documents. 
The proposed approach of introducing a context model for source code identifiers paves the way for building tools that support developers in program comprehension tasks such as application and domain concept location, software modularization, and topic analysis.

arXiv.org e-Print Archive

Heriot Watt Pure

Crossref

ZENODO

Utrecht University Repository

FigShare

Shape matching and clustering in design

Authors: Duffy Alex H. B.; Lee Byungsuk; Lim Sungwoo
Publication date: 21/08/2001

Generalising knowledge and matching patterns is a basic human trait in re-using past experiences. We often cluster (group) knowledge of similar attributes as a process of learning and/or as an aid to managing the complexity and re-use of experiential knowledge [1, 2]. In conceptual design, an ill-defined shape may be recognised as more than one type, so shapes may be classified differently when different criteria are applied. This paper outlines the work being carried out to develop a new technique for shape clustering. It highlights the current methods for analysing shapes found in computer aided sketching systems, before a method is proposed that addresses shape clustering and pattern matching. Clustering for vague geometric models and multiple viewpoint support are explored.

Open Research Online (The Open University)

Approaches to conceptual clustering

Authors: Fisher Douglas; Langley Pat
Publication venue: eScholarship, University of California
Publication date: 12/07/1985

Methods for Conceptual Clustering may be explicated in two lights. Conceptual Clustering methods may be viewed as extensions to techniques of numerical taxonomy, a collection of methods developed by social and natural scientists for creating classification schemes over object sets. Alternatively, conceptual clustering may be viewed as a form of learning by observation or concept formation, as opposed to methods of learning from examples or concept identification. In this paper we survey and compare a number of conceptual clustering methods along dimensions suggested by each of these views. The point we most wish to clarify is that conceptual clustering processes can be explicated as being composed of three distinct but inter-dependent subprocesses: the process of deriving a hierarchical classification scheme; the process of aggregating objects into individual classes; and the process of assigning conceptual descriptions to object classes. Each subprocess may be characterized along a number of dimensions related to search, thus facilitating a better understanding of the conceptual clustering process as a whole

eScholarship - University of California

Shape matching and clustering

Authors: Duffy Alex H.B.; Lee Byungsuk; Lim Sungwoo
Publication date: 01/01/2001

University of Strathclyde Institutional Repository

Model granularity and related concepts

Authors: Clarkson P. J.; Eckert C. M.; Maier J. F.
Publication date: 12/05/2016

Models are integral to engineering design and are the basis for many decisions. Therefore, it is necessary to comprehend how a model’s properties might influence its behaviour. Model granularity is an important property but has so far received only limited attention. The terminology used to describe granularity and related phenomena varies, and pertinent concepts are distributed across communities. This article positions granularity in the theoretical background of models, collects formal definitions for relevant terms from a range of communities, and discusses the implications for engineering design.

Open Research Online (The Open University)

Pacifier overuse and conceptual relations of abstract and emotional concepts

Authors: Barca Laura; Borghi Anna M.; Mazzuca Claudia
Publication venue: 'Frontiers Media SA'
Publication date: 01/01/2017

This study explores the impact of the extensive use of an oral device since infancy (pacifier) on the acquisition of concrete, abstract, and emotional concepts. While recent evidence showed a negative relation between pacifier use and children’s emotional competence (Niedenthal et al., 2012), the possible interaction between use of pacifier and processing of emotional and abstract language has not been investigated. According to recent theories, while all concepts are grounded in sensorimotor experience, abstract concepts activate linguistic and social information more than concrete ones. Specifically, the Words As Social Tools (WAT) proposal predicts that the simulation of their meaning leads to an activation of the mouth (Borghi and Binkofski, 2014; Borghi and Zarcone, 2016). Since the pacifier affects facial mimicry by forcing mouth muscles into a static position, we hypothesize that it may interfere more with the acquisition/consolidation of abstract emotional and abstract not-emotional concepts, which are mainly conveyed during social and linguistic interactions, than with that of concrete concepts. Fifty-nine first grade children, with a history of different frequency of pacifier use, provided oral definitions of the meaning of abstract not-emotional, abstract emotional, and concrete words. A main effect of concept type emerged, with higher accuracy in defining concrete and abstract emotional concepts with respect to abstract not-emotional concepts, independently of pacifier use. Accuracy in definitions was not influenced by the use of pacifier, but correspondence and hierarchical clustering analyses suggest that the use of pacifier differently modulates the conceptual relations elicited by abstract emotional and abstract not-emotional concepts.
While the majority of the children produced a similar pattern of conceptual relations, analyses on the few (6) children who overused the pacifier (for more than 3 years) showed that they tend to distinguish less clearly between concrete and abstract emotional concepts and between concrete and abstract not-emotional concepts than children who did not use it (5) or used it only for a short time (17). As to the conceptual relations they produced, children who overused the pacifier tended to refer less to their experience and to social and emotional situations, and to use more exemplifications and functional relations and fewer free associations.

Frontiers - Publisher Connector

Archivio della ricerca- Università di Roma La Sapienza

PUblication MAnagement

Recognition by directed attention to recursively partitioned images

Author: McNulty Dale M.
Publication venue: eScholarship, University of California
Publication date: 01/01/1988

A learning/recognition model (and instantiating program) is described which recursively combines the learning paradigms of conceptual clustering (Michalski, 1980) and learning-from-examples to resolve the ambiguities of real-world recognition. The model is based on neuropsychological and psychological evidence that the visual system is analytic, hierarchical, and composed of a parallel/serial dichotomy (many, see conclusions by Crick, 1984). Emulating the experimental evidence, parallel processes in the model decompose the image into components and cluster the constituents in much the same way as the image processing technique known as moment analysis (Alt, 1962). Serial, attentive mechanisms then reassemble the decompositions by investigating spatial relationships between components. The use of attentive mechanisms extends the moment analysis technique to handle alterations in structure and solves the contention problem created by combining the two learning paradigms. The contention results from a disagreement between the teacher and the model on what constitutes the salient features at the highest level of the symbol. There are four cases ZBT must handle, two of which result from the disagreement with the teacher. The parallel/serial dichotomy represents a vertical/horizontal tradeoff between the invariant and variant features of a domain. The resultant learned hierarchy allows ZBT to recognize structural differences while avoiding problems of exponential growth

eScholarship - University of California

Substructure Discovery Using Minimum Description Length and Background Knowledge

Authors: Cook D. J.; Holder L. B.
Publication date: 01/01/1994

The ability to identify interesting and repetitive substructures is an essential component of discovering knowledge in structural data. We describe a new version of our SUBDUE substructure discovery system based on the minimum description length principle. The SUBDUE system discovers substructures that compress the original data and represent structural concepts in the data. By replacing previously-discovered substructures in the data, multiple passes of SUBDUE produce a hierarchical description of the structural regularities in the data. SUBDUE uses a computationally-bounded inexact graph match that identifies similar, but not identical, instances of a substructure and finds an approximate measure of closeness of two substructures when under computational constraints. In addition to the minimum description length principle, other background knowledge can be used by SUBDUE to guide the search towards more appropriate substructures. Experiments in a variety of domains demonstrate SUBDUE's ability to find substructures capable of compressing the original data and to discover structural concepts important to the domain. Description of Online Appendix: This is a compressed tar file containing the SUBDUE discovery system, written in C. The program accepts as input databases represented in graph form, and will output discovered substructures with their corresponding value. Comment: See http://www.jair.org/ for an online appendix and other files accompanying this article.
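The idea of a substructure that compresses the original data can be sketched with a rough calculation. The sketch below is not SUBDUE's encoding: it assumes a crude size measure (vertex count plus edge count) as a stand-in for description length, assumes non-overlapping exact instances, and uses hypothetical function names and numbers.

```python
def graph_size(num_vertices: int, num_edges: int) -> int:
    """Crude stand-in for description length: vertices plus edges."""
    return num_vertices + num_edges


def compression_value(
    graph_vertices: int,
    graph_edges: int,
    sub_vertices: int,
    sub_edges: int,
    num_instances: int,
) -> float:
    """Ratio of the original graph size to the compressed representation.

    The compressed representation is the substructure definition plus the
    graph in which each non-overlapping instance is collapsed to a single
    vertex. Values above 1 mean the substructure compresses the data.
    """
    # Each instance loses its internal vertices and gains one placeholder.
    compressed_vertices = (
        graph_vertices - num_instances * sub_vertices + num_instances
    )
    # Edges internal to an instance disappear; all other edges remain.
    compressed_edges = graph_edges - num_instances * sub_edges
    original = graph_size(graph_vertices, graph_edges)
    compressed = graph_size(sub_vertices, sub_edges) + graph_size(
        compressed_vertices, compressed_edges
    )
    return original / compressed


# Hypothetical example: a triangle (3 vertices, 3 edges) occurring 4 times
# in a graph with 20 vertices and 24 edges.
value_with_instances = compression_value(20, 24, 3, 3, 4)
value_without_instances = compression_value(20, 24, 3, 3, 0)
```

Under these assumptions, a substructure with no instances scores below 1 because its definition adds cost without removing anything, while a frequently repeated substructure scores above 1. Repeating the calculation on the collapsed graph corresponds loosely to the multiple passes the abstract describes.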

arXiv.org e-Print Archive

CiteSeerX