Designing fuzzy rule based classifier using self-organizing feature map for analysis of multispectral satellite images
We propose a novel scheme for designing a fuzzy rule-based classifier. An SOFM
based method is used for generating a set of prototypes, which is then used to
generate a set of fuzzy rules. Each rule represents a region in the feature
space that we call the context of the rule. The rules are tuned with respect to
their context. We argue that the reasoning scheme may differ between
contexts, leading to context-sensitive inferencing. To realize context-sensitive
inferencing we use a softmin operator with a tunable parameter. The
proposed scheme is tested on several multispectral satellite image data sets,
and its performance is found to be much better than the results reported in the
literature. Comment: 23 pages, 7 figures
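The abstract does not give the form of its softmin operator. As a rough sketch, one common choice is the generalized (power) mean, which approaches the minimum as the parameter q becomes strongly negative and softens toward an average as q rises. The function name `softmin`, its signature, and the sample membership values below are illustrative assumptions, not the authors' definition.

```python
import math


def softmin(values, q):
    """Power mean of non-negative membership values (assumed softmin form).

    As q -> -infinity the result approaches min(values); q = 1 gives the
    arithmetic mean, and the q -> 0 limit is the geometric mean.
    """
    values = list(values)
    if not values:
        raise ValueError("values must be non-empty")
    if any(v < 0 for v in values):
        raise ValueError("membership values must be non-negative")
    if q <= 0 and any(v == 0 for v in values):
        return 0.0  # limit of the power mean when some value is zero
    if q == 0:
        return math.exp(sum(math.log(v) for v in values) / len(values))
    # Factor out a reference value so every power stays within [0, 1],
    # which avoids overflow for large |q|.
    ref = min(values) if q < 0 else max(values)
    if ref == 0:
        return 0.0  # only reachable when q > 0 and all values are zero
    mean_pow = sum((v / ref) ** q for v in values) / len(values)
    return ref * mean_pow ** (1.0 / q)


# Hypothetical antecedent memberships of one rule for one pixel.
memberships = [0.2, 0.9, 0.7]
strict = softmin(memberships, q=-50)  # behaves almost like min
lenient = softmin(memberships, q=1)   # arithmetic mean
```

Under this assumed form, tuning q per rule is what lets the inference behave more like a strict conjunction in one context and more like an average in another.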
Land cover classification using fuzzy rules and aggregation of contextual information through evidence theory
Land cover classification using multispectral satellite images is a very
challenging task with numerous practical applications. We propose a multi-stage
classifier that involves fuzzy rule extraction from the training data and then
generation of a possibilistic label vector for each pixel using the fuzzy rule
base. To exploit the spatial correlation of land cover types we propose four
different information aggregation methods which use the possibilistic class
label of a pixel and those of its eight spatial neighbors for making the final
classification decision. Three of the aggregation methods use Dempster-Shafer
theory of evidence while the remaining one is modeled after the fuzzy k-NN
rule. The proposed methods are tested with two benchmark seven-channel
satellite images and the results are found to be quite satisfactory. They are
also compared with a Markov random field (MRF) model-based contextual
classification method and are found to perform consistently better. Comment: 14 pages, 2 figures
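To make the evidence-theoretic aggregation concrete, here is a minimal sketch of Dempster's rule of combination for two mass functions. The abstract does not say how masses are derived from the possibilistic label vectors or how the eight neighbors are weighted, so the function name `combine_masses`, the class names, and the mass values are all hypothetical.

```python
def combine_masses(m1, m2):
    """Combine two mass functions with Dempster's rule.

    Each mass function maps a frozenset of class labels (a focal element)
    to its mass; the masses of one function are assumed to sum to 1.
    """
    combined = {}
    conflict = 0.0
    for a, mass_a in m1.items():
        for b, mass_b in m2.items():
            common = a & b
            if common:
                combined[common] = combined.get(common, 0.0) + mass_a * mass_b
            else:
                conflict += mass_a * mass_b  # mass assigned to the empty set
    if conflict >= 1.0:
        raise ValueError("sources are in total conflict")
    # Renormalize by the non-conflicting mass.
    return {focal: mass / (1.0 - conflict) for focal, mass in combined.items()}


# Hypothetical evidence from a pixel and from one of its neighbors.
WATER = frozenset({"water"})
FOREST = frozenset({"forest"})
ANY = WATER | FOREST  # the whole frame of discernment (ignorance)

pixel = {WATER: 0.6, FOREST: 0.1, ANY: 0.3}
neighbor = {WATER: 0.5, FOREST: 0.2, ANY: 0.3}
fused = combine_masses(pixel, neighbor)
best_class = max((f for f in fused if len(f) == 1), key=fused.get)
```

Because the rule is associative, the evidence from all eight neighbors could be folded in one at a time with repeated calls.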
Novel Application of Neutrosophic Logic in Classifiers Evaluated under Region-Based Image Categorization System
Neutrosophic logic is a relatively new logic that generalizes fuzzy logic. In this dissertation, neutrosophic logic is applied to classifiers for the first time, with a support vector machine (SVM) adopted as the example to validate its feasibility and effectiveness. The proposed neutrosophic set is integrated into a reformulated SVM, and the performance of the resulting classifier, N-SVM, is evaluated under an image categorization system. Image categorization is an important yet challenging research topic in computer vision. In this dissertation, images are first segmented by a hierarchical two-stage self-organizing map (HSOM), using color and texture features. A novel approach is proposed to select the training samples of HSOM based on homogeneity properties. A diverse density support vector machine (DD-SVM) framework that extends the multiple-instance learning (MIL) technique is then applied to the image categorization problem by viewing an image as a bag of instances corresponding to the regions obtained from the image segmentation. Using the instance prototype, every bag is mapped to a point in the new bag space, and the categorization is transformed into a classification problem. Then, the proposed N-SVM based on the neutrosophic set is used as the classifier in the new bag space. N-SVM treats samples differently according to the weighting function, which helps reduce the effects of outliers. Experimental results on a COREL dataset of 1000 general purpose images and a Caltech 101 dataset of 9000 images demonstrate the validity and effectiveness of the proposed method.
A Self-Organizing Neural System for Learning to Recognize Textured Scenes
A self-organizing ARTEX model is developed to categorize and classify textured image regions. ARTEX specializes the FACADE model of how the visual cortex sees, and the ART model of how temporal and prefrontal cortices interact with the hippocampal system to learn visual recognition categories and their names. FACADE processing generates a vector of boundary and surface properties, notably texture and brightness properties, by utilizing multi-scale filtering, competition, and diffusive filling-in. Its context-sensitive local measures of textured scenes can be used to recognize scenic properties that gradually change across space, as well as abrupt texture boundaries. ART incrementally learns recognition categories that classify FACADE output vectors, class names of these categories, and their probabilities. Top-down expectations within ART encode learned prototypes that pay attention to expected visual features. When novel visual information creates a poor match with the best existing category prototype, a memory search selects a new category with which to classify the novel data. ARTEX is compared with psychophysical data, and is benchmarked on classification of natural textures and synthetic aperture radar images. It outperforms state-of-the-art systems that use rule-based, backpropagation, and K-nearest neighbor classifiers. Funding: Defense Advanced Research Projects Agency; Office of Naval Research (N00014-95-1-0409, N00014-95-1-0657)
A Hybrid Templated-Based Composite Classification System
An automatic target classification system contains a classifier that reads a feature as input and outputs a class label. Typically, the feature is a vector of real numbers. Other features can be non-numeric, such as a string of symbols or letters. One method of improving the performance of an automatic classification system is to combine two or more independent classifiers that are complementary in nature. Complementary classifiers are obtained by finding an optimal method for partitioning the problem space. For example, the individual classifiers may operate to identify specific objects. Another method may be to use classifiers that operate on different features. We propose a design for a hybrid composite classification system, which exploits both real-numbered and non-numeric features with a template matching classification scheme. This composite classification system is made up of two independent classification systems. These two independent classification systems, which receive input from two separate sensors, are then combined using various fusion methods for the purpose of target identification. By using these two separate classifiers, we explore conditions that allow the two techniques to be complementary in nature, thus improving the overall performance of the classification system. We examine various fusion techniques in search of the technique that generates the best results. We investigate different parameter spaces and fusion rules on example problems to demonstrate our classification system. Our examples consider various application areas to help further demonstrate the utility of our classifier. Optimal classifier performance is obtained using a mathematical framework that takes into account decision variables based on decision-maker preferences and/or engineering specifications, depending upon the classification problem at hand.
Multi-tier framework for the inferential measurement and data-driven modeling
A framework for inferential measurement and data-driven modeling has been proposed and assessed in several real-world application domains. The architecture of the framework has been structured in multiple tiers to facilitate extensibility and the integration of new components. Each of the four proposed tiers has been assessed independently to verify its suitability. The first tier, dealing with exploratory data analysis, has been assessed with the characterization of the chemical space related to the biodegradation of organic chemicals. This analysis has established relationships between physicochemical variables and biodegradation rates that have been used for model development. At the preprocessing level, a novel method for feature selection based on dissimilarity measures between self-organizing maps (SOM) has been developed and assessed. The proposed method selects more features than others published in the literature but leads to models with improved predictive power. Single and multiple data imputation techniques based on the SOM have also been used to recover missing data in a Waste Water Treatment Plant benchmark. A new dynamic method to adjust the centers and widths in radial basis function networks has been proposed to predict water quality. The proposed method outperformed other neural networks. The proposed modeling components have also been assessed in the development of prediction and classification models for biodegradation rates in different media. The results obtained proved the suitability of this approach to develop data-driven models when the complex dynamics of the process prevents the formulation of mechanistic models. The use of rule generation algorithms and Bayesian dependency models has been preliminarily screened to provide the framework with interpretation capabilities. 
Preliminary results obtained from the classification of Modes of Toxic Action (MOA) indicate that using MOAs as proxy indicators of the human health effects of chemicals could be a promising approach. Finally, the complete framework has been applied to three different modeling scenarios. A virtual sensor system, capable of inferring product quality indices from primary process variables, has been developed and assessed. The system was integrated with the control system in a real chemical plant, outperforming the multi-linear correlation models usually adopted by chemical manufacturers. A model to predict carcinogenicity from molecular structure for a set of aromatic compounds has been developed and tested. Applying the SOM-dissimilarity feature selection method yielded better results than models published in the literature. Finally, the framework has been used to facilitate a new approach for environmental modeling and risk management within geographical information systems (GIS). The SOM has been successfully used to characterize exposure scenarios and to provide estimations of missing data through geographic interpolation. The combination of SOM and Gaussian mixture models facilitated the formulation of a new probabilistic risk assessment approach.
Adaptive Algorithms For Classification On High-Frequency Data Streams: Application To Finance
International Mention in the doctoral degree. In recent years, the problem of concept drift has gained importance in the financial
domain. The succession of manias, panics and crashes has stressed the nonstationary
nature and the likelihood of drastic structural changes in financial markets.
The most recent literature suggests the use of conventional machine learning and statistical
approaches for this. However, these techniques are unable or slow to adapt
to non-stationarities and may require re-training over time, which is computationally
expensive and brings financial risks.
This thesis proposes a set of adaptive algorithms to deal with high-frequency data
streams and applies these to the financial domain. We present approaches to handle
different types of concept drifts and perform predictions using up-to-date models.
These mechanisms are designed to provide fast reaction times and are thus applicable
to high-frequency data. The core experiments of this thesis are based on the prediction
of the price movement direction at different intraday resolutions in the SPDR S&P 500
exchange-traded fund. The proposed algorithms are benchmarked against other popular
methods from the data stream mining literature and achieve competitive results.
We believe that this thesis opens good research prospects for financial forecasting
during market instability and structural breaks. Results have shown that our proposed
methods can improve prediction accuracy in many of these scenarios. Indeed, the
results obtained are compatible with ideas against the efficient market hypothesis.
However, we cannot claim that we can consistently beat buy and hold; therefore, we
cannot reject the hypothesis. Doctoral Program in Computer Science and Technology, Universidad Carlos III de Madrid. President: Gustavo Recio Isasi. Secretary: Pedro Isasi Viñuela. Member: Sandra García Rodrígue