19,456 research outputs found
Methods of conceptual clustering and their relation to numerical taxonomy
Artificial Intelligence (AI) methods for machine learning can be viewed as forms of exploratory data analysis, even though they differ markedly from the statistical methods generally connoted by the term. The distinction between methods of machine learning and statistical data analysis is primarily due to differences in the way techniques of each type represent data and structure within data. That is, methods of machine learning are strongly biased toward symbolic (as opposed to numeric) data representations. We explore this difference within a limited context, devoting the bulk of our paper to the explication of conceptual clustering, an extension to the statistically based methods of numerical taxonomy. In conceptual clustering the formation of object clusters is dependent on the quality of 'higher-level' characterizations, termed concepts, of the clusters. The form of concepts used by existing conceptual clustering systems (sets of necessary and sufficient conditions) is described in some detail. This is followed by descriptions of several conceptual clustering techniques, along with sample output. We conclude with a discussion of how alternative concept representations might enhance the effectiveness of future conceptual clustering systems
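The idea of a concept as a set of necessary and sufficient conditions can be illustrated with a small sketch. It is not taken from any of the systems the paper describes: the attribute=value object encoding and the function names `covers`, `characterize`, and `is_necessary_and_sufficient` are assumptions made for illustration only.

```python
from typing import Dict, List

# Assumption: an object is a mapping from attribute name to symbolic value.
Obj = Dict[str, str]


def covers(concept: Dict[str, str], obj: Obj) -> bool:
    """True when obj satisfies every attribute=value condition in concept."""
    return all(obj.get(attr) == val for attr, val in concept.items())


def characterize(cluster: List[Obj]) -> Dict[str, str]:
    """Conjunction of the attribute values shared by every object in a cluster."""
    first = cluster[0]
    return {a: v for a, v in first.items()
            if all(o.get(a) == v for o in cluster)}


def is_necessary_and_sufficient(concept: Dict[str, str],
                                cluster: List[Obj],
                                others: List[Obj]) -> bool:
    """Necessary: every cluster member is covered.
    Sufficient: no object outside the cluster is covered."""
    necessary = all(covers(concept, o) for o in cluster)
    sufficient = not any(covers(concept, o) for o in others)
    return necessary and sufficient


# Hypothetical toy data, invented for this sketch.
birds = [{"cover": "feathers", "flies": "yes"},
         {"cover": "feathers", "flies": "no"}]
mammals = [{"cover": "fur", "flies": "no"}]

concept = characterize(birds)
print(concept, is_necessary_and_sufficient(concept, birds, mammals))
```

In this toy setting a clustering would be judged partly by whether each of its clusters admits such a concept, which is the sense in which cluster formation depends on the quality of the characterization.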
Integrating explanation-based and empirical learning methods in OCCAM
This paper discusses an approach to integrating empirical and explanation-based learning techniques. The paper focuses on OCCAM, a program that can acquire, by empirical means, the knowledge needed for analytical learning. Two examples of this capability are discussed: the ability to use empirical techniques to acquire a domain theory for explanation-based learning, and the ability to use empirical learning techniques to find common patterns for causal relationships. These patterns encode a theory of causality (i.e., a set of general principles for recognizing causal relationships). Once acquired, a theory of causality can facilitate later learning by focusing on hypotheses that are consistent with the theory
Self-adaptive GA, quantitative semantic similarity measures and ontology-based text clustering
As common clustering algorithms use the vector space model (VSM) to represent documents, the conceptual relationships between related terms that do not literally co-occur are ignored. A genetic algorithm-based clustering technique, named GA clustering, is proposed in this article, in conjunction with an ontology, to overcome this problem. In general, ontology-based measures can be partitioned into two categories: thesaurus-based methods and corpus-based methods. We take advantage of the hierarchical structure and broad-coverage taxonomy of WordNet as the thesaurus-based ontology. However, the corpus-based method is rather complicated to handle in practical applications. We propose a transformed latent semantic analysis (LSA) model as the corpus-based method in this paper. Moreover, two hybrid strategies, which combine the various similarity measures, are implemented in the clustering experiments. The results show that our GA clustering algorithm, in conjunction with the thesaurus-based and the LSA-based methods, evidently outperforms the same algorithm with other similarity measures. Moreover, the superiority of the proposed GA clustering algorithm over the commonly used k-means algorithm and the standard GA is demonstrated by improvements in clustering performance
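The limitation of the vector space model noted in this abstract can be seen in a few lines. The sketch below computes plain bag-of-words cosine similarity. The `hybrid` function is only one possible way to combine two similarity scores (a weighted average). The abstract does not say how the paper's hybrid strategies are defined, so the function, its `weight` parameter, and the example sentences are assumptions.

```python
import math
from collections import Counter


def cosine(doc_a: str, doc_b: str) -> float:
    """Cosine similarity between raw term-count vectors of two documents."""
    a = Counter(doc_a.lower().split())
    b = Counter(doc_b.lower().split())
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


def hybrid(sim_vsm: float, sim_semantic: float, weight: float = 0.5) -> float:
    """Assumed combination rule: weighted average of two similarity scores."""
    return weight * sim_vsm + (1.0 - weight) * sim_semantic


# Two documents on the same topic that share no literal term:
# the bag-of-words cosine is zero even though the meanings are close.
print(cosine("car engine repair", "automobile motor maintenance"))
```

A thesaurus-based or LSA-based score supplied as `sim_semantic` would be what lifts the similarity of such a pair above zero.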
A foundation for machine learning in design
This paper presents a formalism for considering the issues of learning in design. A foundation for machine learning in design (MLinD) is defined so as to provide answers to basic questions on learning in design, such as, "What types of knowledge can be learnt?", "How does learning occur?", and "When does learning occur?". Five main elements of MLinD are presented as the input knowledge, knowledge transformers, output knowledge, goals/reasons for learning, and learning triggers. Using this foundation, published systems in MLinD were reviewed. The systematic review presents a basis for validating the presented foundation. The paper concludes that there is considerable work to be carried out in order to fully formalize the foundation of MLinD
Improved streamflow forecasting using self-organizing radial basis function artificial neural networks
Streamflow forecasting has always been a challenging task for water resources engineers and managers and a major component of water resources system control. In this study, we explore the applicability of a Self Organizing Radial Basis (SORB) function to one-step ahead forecasting of daily streamflow. SORB uses a Gaussian Radial Basis Function architecture in conjunction with the Self-Organizing Feature Map (SOFM) used in data classification. SORB outperforms two other ANN algorithms, the well-known Multi-layer Feedforward Network (MFN) and the Self-Organizing Linear Output map (SOLO) neural network, in simulating daily streamflow in the semi-arid Salt River basin. The applicability of a linear regression model was also investigated, and we conclude that the regression model is not reliable for this study. To generalize the model and derive a robust parameter set, cross-validation is applied and its outcome is compared with the split sample test. Cross-validation supports the validity of the nonlinear relationship set up between input and output data. © 2004 Elsevier B.V. All rights reserved
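For readers unfamiliar with Gaussian radial basis function networks, the forward pass of a generic one can be sketched as below. This is not the SORB algorithm: the abstract does not describe how the SOFM determines the network's parameters, so the centres, widths, and output weights are simply passed in. All names and numeric values are assumptions for illustration.

```python
import math
from typing import Sequence


def gaussian_rbf(x: Sequence[float], center: Sequence[float], width: float) -> float:
    """Gaussian basis function: exp(-||x - c||^2 / (2 * width^2))."""
    sq_dist = sum((xi - ci) ** 2 for xi, ci in zip(x, center))
    return math.exp(-sq_dist / (2.0 * width ** 2))


def rbf_forecast(x: Sequence[float],
                 centers: Sequence[Sequence[float]],
                 widths: Sequence[float],
                 weights: Sequence[float],
                 bias: float = 0.0) -> float:
    """Network output: bias plus a weighted sum of the basis activations."""
    return bias + sum(w * gaussian_rbf(x, c, s)
                      for c, s, w in zip(centers, widths, weights))


# Hypothetical one-step-ahead use: the input holds the two previous daily
# flows, and the output is the forecast for the next day.
centers = [[10.0, 12.0], [40.0, 38.0]]
widths = [5.0, 8.0]
weights = [11.0, 37.0]
print(rbf_forecast([11.0, 12.5], centers, widths, weights))
```

In a trained network the output weights would typically be fitted to observed flows once the centres and widths are fixed.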
A Semantic Similarity Measure for Expressive Description Logics
A fully semantic measure is presented that can calculate a similarity value between concept descriptions, between a concept description and an individual, or between individuals expressed in an expressive description logic. It is applicable to symbolic descriptions, although it uses a numeric approach for the calculation. Considering that Description Logics stand as the theoretical framework for ontological knowledge representation and reasoning, the proposed measure can be effectively used for agglomerative and divisive clustering tasks applied to the semantic web domain.
Comment: 13 pages. Appeared at CILC 2005, Convegno Italiano di Logica Computazionale; also available at http://www.disp.uniroma2.it/CILC2005/downloads/papers/15.dAmato_CILC05.pd
XML Schema Clustering with Semantic and Hierarchical Similarity Measures
With the growing popularity of XML as a data representation language, collections of XML data have exploded in number. Methods are required to manage these collections and to discover useful information in them for improved document handling. We present a schema clustering process that organises heterogeneous XML schemas into groups. The methodology considers not only the linguistic properties and context of the elements but also their hierarchical structural similarity. We support our findings with experiments and analysis
SPIDA: Abstracting and generalizing layout design cases
Abstraction and generalization of layout design cases generate new knowledge that is more widely applicable than specific design cases. Abstracting and generalizing design cases into hierarchical levels of abstraction gives the designer the flexibility to apply any level of abstract and generalized knowledge to a new layout design problem. Existing case-based layout learning (CBLL) systems abstract and generalize cases into single levels of abstraction, but not into a hierarchy. In this paper, we propose a new approach, termed customized viewpoint - spatial (CV-S), which supports the generalization and abstraction of spatial layouts into hierarchies, along with a supporting system, SPIDA (SPatial Intelligent Design Assistant)