Methods of conceptual clustering and their relation to numerical taxonomy

Authors: Douglas Fisher, Pat Langley
Publication venue: eScholarship, University of California
Publication date: 22/07/1985

Artificial Intelligence (AI) methods for machine learning can be viewed as forms of exploratory data analysis, even though they differ markedly from the statistical methods generally connoted by the term. The distinction between methods of machine learning and statistical data analysis is primarily due to differences in the way techniques of each type represent data and structure within data. That is, methods of machine learning are strongly biased toward symbolic (as opposed to numeric) data representations. We explore this difference within a limited context, devoting the bulk of our paper to the explication of conceptual clustering, an extension to the statistically based methods of numerical taxonomy. In conceptual clustering, the formation of object clusters depends on the quality of 'higher-level' characterizations, termed concepts, of the clusters. The form of concepts used by existing conceptual clustering systems (sets of necessary and sufficient conditions) is described in some detail. This is followed by descriptions of several conceptual clustering techniques, along with sample output. We conclude with a discussion of how alternative concept representations might enhance the effectiveness of future conceptual clustering systems.
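As a rough illustration of what "a set of necessary and sufficient conditions" can mean for a cluster of symbolically described objects, here is a minimal sketch. It is not taken from the paper: the function names, the attribute names and the toy objects are all assumptions made for the example. A concept is modelled as the attribute values shared by every cluster member (necessary), and it is then checked against objects outside the cluster (sufficient).

```python
def shared_conditions(cluster):
    """Attribute values common to every object in the cluster (necessary conditions)."""
    first, *rest = cluster
    return {attr: value for attr, value in first.items()
            if all(obj.get(attr) == value for obj in rest)}

def matches(concept, obj):
    """True when the object satisfies every condition in the concept."""
    return all(obj.get(attr) == value for attr, value in concept.items())

def is_necessary_and_sufficient(concept, members, others):
    """Every member matches the concept and no outside object does."""
    return (all(matches(concept, m) for m in members)
            and not any(matches(concept, o) for o in others))

# Hypothetical toy objects with symbolic attributes.
birds = [
    {"cover": "feathers", "legs": 2, "flies": True},
    {"cover": "feathers", "legs": 2, "flies": False},
]
mammals = [
    {"cover": "fur", "legs": 4, "flies": False},
    {"cover": "fur", "legs": 2, "flies": True},
]

bird_concept = shared_conditions(birds)
bird_concept_ok = is_necessary_and_sufficient(bird_concept, birds, mammals)
```

A clustering that yields no such description for one of its clusters (for example, an empty set of shared conditions) would score poorly under this view, which is the sense in which cluster formation depends on concept quality.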

eScholarship - University of California

Selection of Input Parameters for Multivariate Classifiers in Proactive Machine Health Monitoring by Clustering Envelope Spectrum Harmonics

Authors: Andrew Ball, Fengshou Gu, Ann Smith
Publication venue: Trans Tech Publications, Ltd.
Publication date: 01/10/2015

In condition monitoring (CM) signal analysis, the inherent problem of key characteristics being masked by noise can be addressed by analysis of the signal envelope. Envelope analysis of vibration signals is effective in extracting useful information for diagnosing different faults. However, the number of envelope features is generally too large to be effectively incorporated in system models. In this paper, a novel method of extracting the pertinent information from such signals based on multivariate statistical techniques is developed, which substantially reduces the number of input parameters required for data classification models. This was achieved by clustering possible model variables into a number of homogeneous groups to ascertain levels of interdependency. Representatives from each of the groups were selected for their power to discriminate between the categorical classes. The techniques established were applied to a reciprocating compressor rig, where the target was identifying machine states with respect to operational health through comparison of signal outputs for healthy and faulty systems. The technique allowed near-perfect fault classification. In addition, methods for identifying separable classes are investigated through profiling techniques, illustrated using Andrew’s Fourier curves.
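The two steps described in this abstract (grouping interdependent variables, then keeping one discriminating representative per group) can be sketched as follows. This is only an illustration under stated assumptions, not the authors' procedure: the greedy correlation threshold, the two-class separation score, the function names and the feature values are all hypothetical stand-ins.

```python
from statistics import mean, pstdev

def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy) if sx and sy else 0.0

def group_features(columns, threshold=0.9):
    """Greedy grouping: a feature joins the first group whose first member it correlates with."""
    groups = []
    for name, values in columns.items():
        for group in groups:
            if abs(pearson(columns[group[0]], values)) >= threshold:
                group.append(name)
                break
        else:
            groups.append([name])
    return groups

def separation(values, labels):
    """Two-class score: gap between class means divided by the summed class spreads."""
    a = [v for v, lab in zip(values, labels) if lab == 0]
    b = [v for v, lab in zip(values, labels) if lab == 1]
    spread = pstdev(a) + pstdev(b)
    return abs(mean(a) - mean(b)) / spread if spread else float("inf")

def pick_representatives(columns, labels, threshold=0.9):
    """Keep, from each group, the feature that best separates the two classes."""
    return [max(group, key=lambda name: separation(columns[name], labels))
            for group in group_features(columns, threshold)]

# Hypothetical envelope-harmonic amplitudes; labels: 0 = healthy, 1 = faulty.
columns = {
    "h1": [1.0, 1.1, 0.9, 3.0, 3.2, 2.9],
    "h2": [2.4, 2.0, 1.8, 6.4, 5.8, 6.1],  # strongly correlated with h1
    "h3": [0.5, 0.1, 0.4, 0.2, 0.5, 0.3],  # unrelated to h1 and h2
}
labels = [0, 0, 0, 1, 1, 1]
representatives = pick_representatives(columns, labels)
```

The classifier would then be trained on the representatives only, which is how the number of input parameters shrinks without discarding an entire group of related harmonics.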

Crossref

University of Huddersfield Repository

Huddersfield Research Portal

Point process-based modeling of multiple debris flow landslides using INLA: an application to the 2009 Messina disaster

Authors: Raphael Huser, Luigi Lombardo, Thomas Opitz
Publication date: 10/08/2017

We develop a stochastic modeling approach based on spatial point processes of log-Gaussian Cox type for a collection of around 5000 landslide events provoked by a precipitation trigger in Sicily, Italy. Through the embedding into a hierarchical Bayesian estimation framework, we can use the Integrated Nested Laplace Approximation methodology to make inference and obtain the posterior estimates. Several mapping units are useful to partition a given study area in landslide prediction studies. These units hierarchically subdivide the geographic space from the highest grid-based resolution to the stronger morphodynamic-oriented slope units. Here we integrate both mapping units into a single hierarchical model, by treating the landslide triggering locations as a random point pattern. This approach diverges fundamentally from the unanimously used presence-absence structure for areal units since we focus on modeling the expected landslide count jointly within the two mapping units. Predicting this landslide intensity provides more detailed and complete information as compared to the classically used susceptibility mapping approach based on relative probabilities. To illustrate the model's versatility, we compute absolute probability maps of landslide occurrences and check its predictive power over space. While the landslide community typically produces spatial predictive models for landslides only in the sense that covariates are spatially distributed, no actual spatial dependence has been explicitly integrated so far for landslide susceptibility. Our novel approach features a spatial latent effect defined at the slope unit level, allowing us to assess the spatial influence that remains unexplained by the covariates in the model.
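To make the link between an expected landslide count and an absolute occurrence probability concrete, here is a minimal sketch. It relies only on a standard property of Cox processes: conditional on the intensity, the count in a region is Poisson distributed, so the probability of at least one event is 1 - exp(-expected count). The coefficient values, the cell area and the function names are assumptions for illustration, and the abstract does not state that the authors compute their maps this way.

```python
import math

def log_linear_intensity(beta0, betas, covariates, latent_effect=0.0):
    """Intensity per unit area: exp(beta0 + sum_j beta_j * z_j + w)."""
    linear = beta0 + sum(b * z for b, z in zip(betas, covariates)) + latent_effect
    return math.exp(linear)

def expected_count(cell_intensities, cell_area):
    """Approximate the integral of the intensity over a unit by summing grid cells."""
    return sum(lam * cell_area for lam in cell_intensities)

def occurrence_probability(count):
    """P(at least one event) for a Poisson count with the given mean."""
    return -math.expm1(-count)  # numerically stable form of 1 - exp(-count)

# Hypothetical slope unit made of three grid cells sharing one latent effect.
cells = [log_linear_intensity(-6.0, [0.8], [z], latent_effect=0.3)
         for z in (0.5, 1.0, 1.5)]
mu = expected_count(cells, cell_area=225.0)
p = occurrence_probability(mu)
```

Because the count is modelled directly, the same fitted intensity yields both the expected number of landslides in a unit and the probability that the unit is hit at all, which a presence-absence model cannot provide.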

arXiv.org e-Print Archive

HAL Descartes

Finding Groups in Large Data Sets

Author: Adrian Müller

This paper aims to give an overview of methods to find groups in large data sets, such as household expenditure survey data. These methods are grouped in three: cluster analysis, dimension reduction and basic explorative methods. The emphasis is put on a critical analysis and potential drawbacks, especially of inputs that have to be provided by the researcher. These may impose some structure not present in the data, thus defeating the purpose of revealing intrinsic patterns. In general, the more elaborate methods, such as cluster analysis, are delicate to apply, especially in the context of social sciences. Often, it may be best to limit oneself to more transparent approaches such as comparisons of basic statistics.
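The warning that researcher-supplied inputs may impose structure that is not in the data can be illustrated with a small sketch. The example below is not from the paper: it uses a simple one-dimensional k-means (Lloyd's algorithm) with a deterministic start, and the data are hypothetical evenly spread values that contain no groups at all. Whatever number of clusters is requested, that many groups come back.

```python
def kmeans_1d(values, k, iterations=50):
    """Plain Lloyd's algorithm on one-dimensional data with a deterministic start."""
    data = sorted(values)
    # Start from k evenly spaced order statistics so the run is reproducible.
    centers = [data[(2 * i + 1) * len(data) // (2 * k)] for i in range(k)]
    clusters = [[] for _ in range(k)]
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for v in data:
            nearest = min(range(k), key=lambda j: abs(v - centers[j]))
            clusters[nearest].append(v)
        # An empty cluster keeps its previous center.
        centers = [sum(c) / len(c) if c else centers[j]
                   for j, c in enumerate(clusters)]
    return clusters

# Hypothetical, evenly spread values with no intrinsic grouping.
evenly_spread = [i / 10 for i in range(60)]
sizes_by_k = {k: [len(c) for c in kmeans_1d(evenly_spread, k)] for k in (2, 3, 4)}
```

Each entry of `sizes_by_k` lists k non-empty clusters, even though the data were constructed without any. The choice of k alone produced the "groups", which is the kind of imposed structure the abstract cautions against.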

Research Papers in Economics