
    Classifying Candidate Axioms via Dimensionality Reduction Techniques

    We assess the role of similarity measures and learning methods in classifying candidate axioms for automated schema induction through kernel-based learning algorithms. The evaluation is based on (i) three different similarity measures between axioms and (ii) two alternative dimensionality reduction techniques, used to check the extent to which the considered similarities allow true axioms to be separated from false ones. The result of the dimensionality reduction process is then fed to several learning algorithms, and the accuracy of every combination of similarity measure, dimensionality reduction technique, and classification method is compared. We observe that sophisticated semantics-based similarity measures are not necessary to obtain accurate predictions, and that classification performance depends only marginally on the choice of learning method. Our results open the way to efficient surrogate models for axiom scoring that speed up ontology learning and schema induction methods.
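
    The pipeline described here can be sketched as a pairwise similarity (kernel) matrix between axioms, a dimensionality reduction step, and an off-the-shelf classifier. In the minimal sketch below, an RBF similarity over random stand-in features replaces the paper's axiom similarity measures, and kernel PCA stands in for the reduction step; data sizes and hyperparameters are illustrative assumptions.

        # Sketch: classify candidate axioms from a pairwise similarity matrix.
        # Assumptions (not from the paper): RBF similarity over random stand-in
        # features, kernel PCA as the reduction step, logistic regression as
        # the classifier.
        import numpy as np
        from sklearn.decomposition import KernelPCA
        from sklearn.linear_model import LogisticRegression
        from sklearn.model_selection import cross_val_score

        rng = np.random.default_rng(0)
        X = rng.normal(size=(200, 30))       # stand-in feature vectors for axioms
        y = rng.integers(0, 2, size=200)     # stand-in true/false labels

        # Precomputed similarity (kernel) matrix between axioms.
        sq_dists = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
        K = np.exp(-sq_dists / X.shape[1])

        # Reduce dimensionality of the similarity space, then classify.
        Z = KernelPCA(n_components=10, kernel="precomputed").fit_transform(K)
        scores = cross_val_score(LogisticRegression(max_iter=1000), Z, y, cv=5)
        print("accuracy: %.2f +/- %.2f" % (scores.mean(), scores.std()))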

    Improving Monitoring and Diagnosis for Process Control using Independent Component Analysis

    Statistical Process Control (SPC) is the general field concerned with monitoring the operation and performance of systems. SPC consists of a collection of techniques for characterizing the operation of a system using a probability distribution consistent with the system's inputs and outputs. Classical SPC monitors a single variable to characterize the operation of a single machine tool or process step using tools such as Shewhart charts. The traditional approach works well for simple, small- to medium-size processes. For more complex processes, a number of multivariate SPC techniques have been developed in recent decades. These advanced methods suffer from several disadvantages compared to univariate techniques: they tend to be statistically less powerful, and they tend to complicate process diagnosis when a disturbance is detected. This research introduces a general method for simplifying multivariate process monitoring so as to allow the use of traditional SPC tools while facilitating process diagnosis. Latent variable representations of complex processes are developed that directly relate disturbances to process steps or segments. The method models disturbances in the process rather than the process itself. The basic tool used is Independent Component Analysis (ICA). The methodology is illustrated on the problem of monitoring Electrical Test (E-Test) data from a semiconductor manufacturing process. Development and production data from a working semiconductor plant are used to estimate a factor model that is then used to develop univariate control charts for particular types of process disturbances. Detection and false alarm rates for data with known disturbances are given. The charts correctly detect and classify all the disturbance cases with a very low false alarm rate. A secondary contribution is the introduction of a method for performing an ICA-like analysis using possibilistic data instead of probabilistic data. This technique extends the general ICA framework to a broader range of uncertainty types. Further development of this technique could lead to the capability to use extremely sparse data to estimate ICA process models.
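
    The core idea, ICA followed by univariate control charts on the recovered components, can be illustrated with a short sketch. The synthetic mixing model, FastICA as the estimator, and classical 3-sigma Shewhart limits below are illustrative assumptions, not the dissertation's fitted factor model.

        # Sketch: monitor a multivariate process with ICA plus univariate charts.
        # Assumptions: FastICA stands in for the paper's factor model; limits
        # are classical 3-sigma Shewhart limits from an in-control period.
        import numpy as np
        from sklearn.decomposition import FastICA

        rng = np.random.default_rng(1)
        S = rng.laplace(size=(500, 3))                 # latent disturbance sources
        A = rng.normal(size=(3, 8))                    # mixing into 8 measured variables
        X = S @ A + 0.1 * rng.normal(size=(500, 8))    # observed E-Test-like data

        ica = FastICA(n_components=3, random_state=0)
        scores_train = ica.fit_transform(X[:400])      # in-control period
        mu, sigma = scores_train.mean(0), scores_train.std(0)

        # Univariate Shewhart-style check on each independent component.
        scores_new = ica.transform(X[400:])
        alarms = np.abs(scores_new - mu) > 3 * sigma
        print("alarm rate per component:", alarms.mean(0))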

    An alternative approach for choice models in transportation: Use of possibility theory for comparison of utilities

    Modeling of the human choice mechanism has been a topic of intense discussion in the transportation community for many years. The modeling framework has been rooted in probability theory, in which the analyst's uncertainty about the integrity of the model is expressed in probability. In most choice situations, the decision-maker (traveler) also experiences uncertainty because of the lack of complete information about the choices. In the traditional modeling framework, the uncertainty of the analyst and that of the decision-maker are both embedded in the same random term and are not clearly separated. While the analyst's uncertainty may be represented by probability due to the statistical nature of events, that of the decision-maker is not always subject to randomness; rather, it is perceptive uncertainty. This paper proposes a modeling framework that accounts for the decision-maker's uncertainty by possibility theory and for the analyst's uncertainty by probability theory. The possibility-to-probability transformation is performed using the principle of uncertainty invariance. The proposed approach accounts for the quality of information on the changes in choice probability. The paper discusses the thought process, the mathematics of possibility theory and the probability transformation, and examples.
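
    The key mechanical step is converting a possibility distribution over choices into a probability distribution. The sketch below implements one standard transformation (the pignistic-style transform of a consonant possibility distribution), not the uncertainty-invariance transform the paper actually uses; the travel-mode possibilities are made-up numbers for illustration.

        # Sketch: a standard possibility-to-probability transformation (the
        # pignistic transform of a consonant possibility distribution). The
        # paper uses Klir's uncertainty-invariance principle, which is a
        # different mapping; this only illustrates the overall mechanics.
        import numpy as np

        def possibility_to_probability(pi):
            """Map a normalized possibility distribution (max = 1) to probabilities."""
            order = np.argsort(pi)[::-1]              # sort states by possibility
            p_sorted = np.zeros(len(pi))
            pi_sorted = np.append(pi[order], 0.0)
            for i in range(len(pi)):
                # Mass (pi_i - pi_{i+1}) is shared evenly among the i+1 most
                # possible states, then accumulated.
                p_sorted[: i + 1] += (pi_sorted[i] - pi_sorted[i + 1]) / (i + 1)
            p = np.empty_like(p_sorted)
            p[order] = p_sorted
            return p

        # Perceived utilities of three travel modes, expressed as possibilities.
        pi = np.array([1.0, 0.7, 0.3])
        print(possibility_to_probability(pi))   # [0.6, 0.3, 0.1], sums to 1.0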

    Systems Statistical Engineering – Systems Hierarchical Constraint Propagation

    Cotter (ASEM-IAC 2012, 2015, 2016, 2017): (1) identified the gaps in knowledge that statistical engineering needed to address and set forth a working definition of and body of knowledge for statistical engineering; (2) proposed a systemic causal Bayesian hierarchical model that addressed the knowledge gap needed to integrate deterministic mathematical engineering causal models within a stochastic framework; (3) specified the modeling methodology through which statistical engineering models could be developed, diagnosed, and applied to predict systemic mission performance; and (4) proposed revisions to and integration of IDEF0 as the framework for developing hierarchical qualitative systems models. In the last work, Cotter (2017) noted that a necessary dimension of the systems statistical engineering body of knowledge is hierarchical constraint propagation, which assures that imposed environmental, economic, legal, political, social, and technical constraints are consistently decomposed to subsystems, modules, and components, and that module- and subsystem-level socio-technical constraints are mapped to systemic mission performance. This paper presents systems theory, constraint propagation theory, and Bayesian constrained regression theory relevant to the problem of systemic hierarchical constraint propagation and sets forth the theoretical basis for their integration into the systems statistical engineering body of knowledge.
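
    One small building block of Bayesian constrained regression can be sketched directly: a regression parameter restricted to a feasible interval by a truncated prior, with the posterior evaluated on a grid. The one-parameter linear model, the [0, 2] slope constraint, and the synthetic data below are illustrative assumptions, not Cotter's hierarchical framework.

        # Sketch: Bayesian regression with a hard parameter constraint, the
        # kind of building block hierarchical constraint propagation needs.
        # Assumptions: one-parameter linear model; slope constrained to [0, 2]
        # by a truncated flat prior; grid posterior instead of sampling.
        import numpy as np

        rng = np.random.default_rng(2)
        x = np.linspace(0, 10, 40)
        y = 1.3 * x + rng.normal(0, 1.0, size=x.size)   # synthetic subsystem response

        beta = np.linspace(0, 2, 500)                   # grid over the constrained slope
        log_lik = -0.5 * ((y[None, :] - beta[:, None] * x[None, :]) ** 2).sum(1)
        post = np.exp(log_lik - log_lik.max())          # flat prior on [0, 2]
        post /= post.sum()

        mean = (beta * post).sum()
        print("posterior mean slope (constrained to [0, 2]): %.3f" % mean)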

    A Rapid Soft Computing Approach to Dimensionality Reduction in Model Construction

    A rapid soft computing method for dimensionality reduction of data sets is presented. Traditional approaches are usually based on factor or principal component analysis. Our method applies fuzzy cluster analysis and approximate reasoning instead, and it is therefore also viable for nonparametric and nonlinear models. Comparisons are drawn between the methods using two empirical data sets.
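
    A minimal version of the fuzzy-clustering route to dimensionality reduction is sketched below: each sample is re-expressed by its c cluster memberships instead of its original features. The plain fuzzy c-means implementation stands in for the paper's fuzzy cluster analysis and approximate-reasoning step, and the data are synthetic.

        # Sketch: dimensionality reduction via fuzzy c-means memberships.
        # Assumption: plain FCM stands in for the paper's method.
        import numpy as np

        def fuzzy_cmeans(X, c, m=2.0, iters=100, seed=0):
            rng = np.random.default_rng(seed)
            U = rng.random((X.shape[0], c))
            U /= U.sum(1, keepdims=True)                # fuzzy memberships
            for _ in range(iters):
                W = U ** m
                centers = (W.T @ X) / W.sum(0)[:, None]
                d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2) + 1e-12
                U = 1.0 / (d ** (2 / (m - 1)))          # standard FCM update
                U /= U.sum(1, keepdims=True)
            return U, centers

        X = np.random.default_rng(3).normal(size=(150, 12))   # 12-dimensional data
        U, _ = fuzzy_cmeans(X, c=3)
        print("reduced representation:", U.shape)             # (150, 3) memberships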

    Approximate Reasoning in Hydrogeological Modeling

    The accurate determination of hydraulic conductivity is an important element of successful groundwater flow and transport modeling. However, exhaustive measurement of this hydrogeological parameter is quite costly and, as a result, unrealistic. Alternatively, relationships between hydraulic conductivity and other, less costly to measure hydrogeological variables have been used to estimate this crucial variable whenever needed. Until this point, however, the majority of these relationships have been assumed to be crisp and precise, contrary to what intuition dictates. The research presented herein addresses the imprecision inherent in hydraulic conductivity estimation, framing this process in a fuzzy logic framework. Because traditional hydrogeological practices are not suited to handle fuzzy data, various approaches to incorporating fuzzy data at different steps in the groundwater modeling process have been developed previously. Such approaches have been both redundant and contradictory at times, with multiple approaches proposed for both fuzzy kriging and groundwater modeling. This research proposes a consistent rubric for handling fuzzy data throughout the entire groundwater modeling process. This entails the estimation of fuzzy data from alternative hydrogeological parameters; the sampling of realizations from fuzzy hydraulic conductivity data, including, most importantly, the appropriate aggregation of expert-provided fuzzy hydraulic conductivity estimates with traditionally derived hydraulic conductivity measurements; and the utilization of this information in the numerical simulation of groundwater flow and transport.
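
    The sampling step, drawing crisp realizations from a fuzzy hydraulic conductivity estimate, can be sketched with random alpha-cuts of a triangular fuzzy number. The triangular membership shape and the log10(K) values below are illustrative assumptions, not values from the dissertation.

        # Sketch: sampling hydraulic conductivity realizations from a
        # triangular fuzzy estimate via random alpha-cuts. Assumptions:
        # triangular membership and illustrative log10(K) values.
        import numpy as np

        def sample_triangular_fuzzy(low, mode, high, n, seed=0):
            """Draw realizations from a triangular fuzzy number by alpha-cuts."""
            rng = np.random.default_rng(seed)
            alpha = rng.random(n)                      # random membership level
            left = low + alpha * (mode - low)          # alpha-cut lower bound
            right = high - alpha * (high - mode)       # alpha-cut upper bound
            return rng.uniform(left, right)            # uniform within the cut

        # Expert-provided fuzzy estimate of log10(K) for one model zone.
        logK = sample_triangular_fuzzy(-6.0, -5.0, -3.5, n=1000)
        print("realization range: %.2f to %.2f" % (logK.min(), logK.max()))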

    Duroc and Iberian Pork Neural Network Classification by Visible and Near Infrared Reflectance Spectroscopy

    Visible and near infrared reflectance spectroscopy (VIS/NIRS) was used to differentiate between Duroc and Iberian pork in the M. masseter. Samples of Duroc (n = 15) and Iberian (n = 15) pig muscles were scanned in the VIS/NIR region (350-2500 nm) using a portable spectral radiometer. Both mutual information and VIS/NIRS spectral characterization were used to generate a ranking of variables, and the data were then processed by artificial neural networks, establishing 1, 3, or 10 wavelengths as input variables for classifying between the pig breeds. The models correctly classified >70% of samples under all input configurations, with correct classification of >95% for the three-variable configuration using either mutual information ranking or VIS/NIRS spectral characterization. These results demonstrate the potential value of the VIS/NIRS technique as an objective and rapid method for the authentication and identification of Duroc and Iberian pork.
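
    The pipeline of mutual-information ranking followed by a small neural network can be sketched as below. The synthetic spectra, injected breed difference, and network size are illustrative assumptions standing in for the 350-2500 nm scans; only the choice of k = 3 wavelengths mirrors the paper's best-performing configuration.

        # Sketch: mutual-information wavelength ranking plus a small neural
        # network. Assumptions: synthetic spectra, illustrative network size.
        import numpy as np
        from sklearn.feature_selection import mutual_info_classif
        from sklearn.neural_network import MLPClassifier
        from sklearn.model_selection import cross_val_score

        rng = np.random.default_rng(4)
        X = rng.normal(size=(30, 215))                 # 30 samples x 215 wavelengths
        y = np.repeat([0, 1], 15)                      # Duroc vs. Iberian labels
        X[y == 1, 50:53] += 1.0                        # injected breed difference

        mi = mutual_info_classif(X, y, random_state=0)
        top = np.argsort(mi)[::-1][:3]                 # three best wavelengths
        clf = MLPClassifier(hidden_layer_sizes=(8,), max_iter=2000, random_state=0)
        print("selected bands:", top)
        print("CV accuracy:", cross_val_score(clf, X[:, top], y, cv=5).mean())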

    Hyperspectral Unmixing Overview: Geometrical, Statistical, and Sparse Regression-Based Approaches

    Imaging spectrometers measure electromagnetic energy scattered in their instantaneous field of view in hundreds or thousands of spectral channels, with higher spectral resolution than multispectral cameras; they are therefore often referred to as hyperspectral cameras (HSCs). Higher spectral resolution enables material identification via spectroscopic analysis, which facilitates countless applications that require identifying materials in scenarios unsuitable for classical spectroscopic analysis. Due to the low spatial resolution of HSCs, microscopic material mixing, and multiple scattering, spectra measured by HSCs are mixtures of the spectra of the materials in a scene; accurate estimation therefore requires unmixing. Pixels are assumed to be mixtures of a few materials, called endmembers, and unmixing involves estimating all or some of: the number of endmembers, their spectral signatures, and their abundances at each pixel. Unmixing is a challenging, ill-posed inverse problem because of model inaccuracies, observation noise, environmental conditions, endmember variability, and data set size. Researchers have devised and investigated many models in search of robust, stable, tractable, and accurate unmixing algorithms. This paper presents an overview of unmixing methods from the time of Keshava and Mustard's unmixing tutorial [1] to the present. Mixing models are discussed first; signal-subspace, geometrical, statistical, sparsity-based, and spatial-contextual unmixing algorithms are then described, along with mathematical problems and potential solutions. Algorithm characteristics are illustrated experimentally. (This work has been accepted for publication in the IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing.)
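
    The linear mixing model at the heart of most of these methods, and the classic fully constrained least-squares abundance estimate, can be sketched in a few lines. The toy 50-band scene and known endmember signatures below are illustrative assumptions; real pipelines would first estimate the endmembers themselves.

        # Sketch: linear-mixing-model unmixing with a fully constrained
        # least-squares abundance estimate (nonnegativity via NNLS,
        # sum-to-one via a heavily weighted augmented row).
        import numpy as np
        from scipy.optimize import nnls

        rng = np.random.default_rng(5)
        M = np.abs(rng.normal(size=(50, 3)))           # 50 bands x 3 endmembers
        a_true = np.array([0.6, 0.3, 0.1])             # abundances, sum to one
        y = M @ a_true + 0.01 * rng.normal(size=50)    # observed mixed pixel

        delta = 100.0                                  # weight on the sum-to-one row
        M_aug = np.vstack([M, delta * np.ones((1, 3))])
        y_aug = np.append(y, delta)
        a_hat, _ = nnls(M_aug, y_aug)
        print("estimated abundances:", np.round(a_hat, 3))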