19,456 research outputs found

Methods of conceptual clustering and their relation to numerical taxonomy

Authors: Fisher Douglas; Langley Pat
Publication venue: eScholarship, University of California
Publication date: 22/07/1985

Artificial Intelligence (AI) methods for machine learning can be viewed as forms of exploratory data analysis, even though they differ markedly from the statistical methods generally connoted by the term. The distinction between methods of machine learning and statistical data analysis is primarily due to differences in the way techniques of each type represent data and structure within data. That is, methods of machine learning are strongly biased toward symbolic (as opposed to numeric) data representations. We explore this difference within a limited context, devoting the bulk of our paper to the explication of conceptual clustering, an extension to the statistically based methods of numerical taxonomy. In conceptual clustering the formation of object clusters is dependent on the quality of 'higher-level' characterizations, termed concepts, of the clusters. The form of concepts used by existing conceptual clustering systems (sets of necessary and sufficient conditions) is described in some detail. This is followed by descriptions of several conceptual clustering techniques, along with sample output. We conclude with a discussion of how alternative concept representations might enhance the effectiveness of future conceptual clustering systems

eScholarship - University of California

Integrating explanation-based and empirical learning methods in OCCAM

Author: Pazzani Michael J.
Publication venue: eScholarship, University of California
Publication date: 18/10/1988

This paper discusses an approach to integrating empirical and explanation-based learning techniques. The paper focuses on OCCAM, a program that can acquire, via empirical means, the knowledge needed for analytical learning. Two examples of this capability are discussed: the ability to use empirical techniques to acquire a domain theory for explanation-based learning, and the ability to use empirical learning techniques to find common patterns for causal relationships. These patterns encode a theory of causality (i.e., a set of general principles for recognizing causal relationships). Once acquired, a theory of causality can facilitate later learning by focusing on hypotheses that are consistent with the theory.

eScholarship - University of California

Self-adaptive GA, quantitative semantic similarity measures and ontology-based text clustering

Authors: Li Chenghua; Song Wei; Yu Wei; Zhang Chengzhi
Publication venue: IEEE Press
Publication date: 01/01/2008

Because common clustering algorithms use the vector space model (VSM) to represent documents, the conceptual relationships between related terms that do not co-occur literally are ignored. A genetic algorithm-based clustering technique, named GA clustering, is proposed in this article in conjunction with an ontology to overcome this problem. In general, the ontology measures can be partitioned into two categories: thesaurus-based methods and corpus-based methods. We take advantage of the hierarchical structure and the broad-coverage taxonomy of WordNet as the thesaurus-based ontology. The corpus-based method, however, is rather complicated to handle in practical applications. We propose a transformed latent semantic analysis (LSA) model as the corpus-based method in this paper. Moreover, two hybrid strategies, which combine the various similarity measures, are implemented in the clustering experiments. The results show that our GA clustering algorithm, in conjunction with the thesaurus-based and the LSA-based method, apparently outperforms that with other similarity measures. Moreover, the superiority of the proposed GA clustering algorithm over the commonly used k-means algorithm and the standard GA is demonstrated by the improvements in clustering performance.
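
The VSM limitation described in this abstract can be illustrated with a minimal sketch. It assumes raw term counts and cosine similarity as the vector space model; the toy documents and the function names `bag_of_words` and `cosine` are hypothetical and are not taken from the paper, which does not specify its weighting scheme here.

```python
import math
from collections import Counter


def bag_of_words(text):
    """Map a document to raw term counts (a minimal vector space model)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


# Toy documents invented for illustration only.
doc_car = bag_of_words("car engine repair")
doc_auto = bag_of_words("automobile motor maintenance")
doc_shop = bag_of_words("car repair shop")

# Conceptually related documents that share no literal terms.
sim_related = cosine(doc_car, doc_auto)
# Documents that share two of their three terms.
sim_overlap = cosine(doc_car, doc_shop)

print(sim_related, sim_overlap)
```

In this sketch the two conceptually related documents get a similarity of zero because they share no literal term, which is the gap that a thesaurus-based or corpus-based semantic measure is meant to close.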

E-LIS

Crossref

A foundation for machine learning in design

Authors: Duffy Alex H.B.; Sim Siang Kok
Publication venue: 'Cambridge University Press (CUP)'
Publication date: 01/01/1998

This paper presents a formalism for considering the issues of learning in design. A foundation for machine learning in design (MLinD) is defined so as to provide answers to basic questions on learning in design, such as, "What types of knowledge can be learnt?", "How does learning occur?", and "When does learning occur?". Five main elements of MLinD are presented as the input knowledge, knowledge transformers, output knowledge, goals/reasons for learning, and learning triggers. Using this foundation, published systems in MLinD were reviewed. The systematic review presents a basis for validating the presented foundation. The paper concludes that there is considerable work to be carried out in order to fully formalize the foundation of MLinD

Crossref

University of Strathclyde Institutional Repository

Improved streamflow forecasting using self-organizing radial basis function artificial neural networks

Authors: Gupta HV; Hsu KL; Moradkhani H; Sorooshian S
Publication venue: eScholarship, University of California
Publication date: 10/08/2004

Streamflow forecasting has always been a challenging task for water resources engineers and managers and a major component of water resources system control. In this study, we explore the applicability of a Self Organizing Radial Basis (SORB) function to one-step-ahead forecasting of daily streamflow. SORB uses a Gaussian Radial Basis Function architecture in conjunction with the Self-Organizing Feature Map (SOFM) used in data classification. SORB outperforms the two other ANN algorithms, the well-known Multi-layer Feedforward Network (MFN) and the Self-Organizing Linear Output map (SOLO) neural network, for simulation of daily streamflow in the semi-arid Salt River basin. The applicability of a linear regression model was also investigated, and the regression model was found not to be reliable for this study. To generalize the model and derive a robust parameter set, cross-validation is applied and its outcome is compared with the split sample test. Cross-validation supports the validity of the nonlinear relationship set up between input and output data. © 2004 Elsevier B.V. All rights reserved.
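
As a rough illustration of the Gaussian Radial Basis Function architecture mentioned in this abstract, the following minimal sketch computes the output of a generic Gaussian RBF network. It is not the SORB model itself: the SOFM-based step is not shown, and the centers, widths, weights, and the function names `gaussian_rbf` and `rbf_forward` are assumptions made for this example.

```python
import math


def gaussian_rbf(x, center, width):
    """Gaussian basis function: exp(-||x - c||^2 / (2 * width^2))."""
    sq_dist = sum((xi - ci) ** 2 for xi, ci in zip(x, center))
    return math.exp(-sq_dist / (2.0 * width ** 2))


def rbf_forward(x, centers, widths, weights, bias=0.0):
    """Network output: a weighted sum of the Gaussian hidden-unit activations."""
    activations = [gaussian_rbf(x, c, w) for c, w in zip(centers, widths)]
    return bias + sum(wt * a for wt, a in zip(weights, activations))


# Hypothetical one-dimensional network with two hidden units.
centers = [[0.0], [1.0]]
widths = [1.0, 1.0]
weights = [2.0, -1.0]

# Evaluate at the first center: unit 1 is fully active, unit 2 partially.
y = rbf_forward([0.0], centers, widths, weights)
print(y)
```

In a self-organizing variant, the centers would typically come from a clustering or feature-map step rather than being fixed by hand as they are here.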

eScholarship - University of California

A Semantic Similarity Measure for Expressive Description Logics

Authors: d'Amato Claudia; Esposito Floriana; Fanizzi Nicola
Publication venue
Publication date: 01/01/2009

A totally semantic measure is presented that can calculate a similarity value between concept descriptions, between a concept description and an individual, or between individuals expressed in an expressive description logic. It is applicable to symbolic descriptions although it uses a numeric approach for the calculation. Considering that Description Logics stand as the theoretical framework for ontological knowledge representation and reasoning, the proposed measure can be effectively used for agglomerative and divisional clustering tasks applied to the Semantic Web domain. Comment: 13 pages. Appeared at CILC 2005, Convegno Italiano di Logica Computazionale; also available at http://www.disp.uniroma2.it/CILC2005/downloads/papers/15.dAmato_CILC05.pd

arXiv.org e-Print Archive

Archivio istituzionale della ricerca - Università di Bari

XML Schema Clustering with Semantic and Hierarchical Similarity Measures

Authors: Iryadi Wina; Nayak Richi
Publication venue: 'Elsevier BV'
Publication date: 01/01/2007

With the growing popularity of XML as a data representation language, collections of XML data have exploded in number. Methods are required to manage these collections and to discover useful information in them for improved document handling. We present a schema clustering process that organises heterogeneous XML schemas into groups. The methodology considers not only the linguistic and contextual aspects of the elements but also their hierarchical structural similarity. We support our findings with experiments and analysis.

Crossref

Queensland University of Technology ePrints Archive

SPIDA: Abstracting and generalizing layout design cases

Authors: Duffy A.H.D.; Lee B.S.; Manfaat D.
Publication venue: 'Cambridge University Press (CUP)'
Publication date: 01/01/1998

Abstraction and generalization of layout design cases generate new knowledge that is more widely applicable than specific design cases. The abstraction and generalization of design cases into hierarchical levels of abstraction give the designer the flexibility to apply any level of abstract and generalized knowledge to a new layout design problem. Existing case-based layout learning (CBLL) systems abstract and generalize cases into a single level of abstraction, but not into a hierarchy. In this paper, we propose a new approach, termed customized viewpoint - spatial (CV-S), which supports the generalization and abstraction of spatial layouts into hierarchies, along with a supporting system, SPIDA (SPatial Intelligent Design Assistant).

Crossref

University of Strathclyde Institutional Repository