148 research outputs found

    Fuzzy GMDH and its application to forecasting financial processes

    This paper investigates the fuzzy inductive modeling method known as the Group Method of Data Handling (GMDH) and its application to data mining problems, in particular to forecasting tasks in the financial sphere. The advantage of the inductive modeling method GMDH is that an adequate model is constructed directly while the algorithm runs. A generalization of GMDH to the case of uncertainty, a new method called fuzzy GMDH (FGMDH), is described; it makes it possible to construct fuzzy models almost automatically. The algorithm of fuzzy GMDH is considered. Fuzzy GMDH with Gaussian and bell-shaped membership functions (MF) is considered, and its similarity to the triangular-MF case is shown. Fuzzy GMDH with different partial descriptions, orthogonal polynomials of Chebyshev and Fourier, is considered. The problem of adapting fuzzy models obtained by FGMDH is considered and the corresponding adaptation algorithm is described. The extension and generalization of fuzzy GMDH to the case of fuzzy inputs is considered and its properties are analyzed. The experimental investigations of the suggested FGMDH were carried out

    Data Mining in Smart Grids

    Effective smart grid operation requires rapid decisions in a data-rich, but information-limited, environment. In this context, grid sensor data-streaming cannot provide the system operators with the necessary information to act on in the time frames necessary to minimize the impact of the disturbances. Even if there are fast models that can convert the data into information, the smart grid operator must deal with the challenge of not having a full understanding of the context of the information, and, therefore, the information content cannot be used with any high degree of confidence. To address this issue, data mining has been recognized as the most promising enabling technology for improving decision-making processes, providing the right information at the right moment to the right decision-maker. This Special Issue is focused on emerging methodologies for data mining in smart grids. In this area, it addresses many relevant topics, ranging from methods for uncertainty management to advanced dispatching. This Special Issue not only focuses on methodological breakthroughs and roadmaps in implementing the methodology, but also presents the much-needed sharing of best practices. Topics include, but are not limited to, the following: fuzziness in smart grids computing; emerging techniques for renewable energy forecasting; robust and proactive solution of optimal smart grids operation; fuzzy-based smart grids monitoring and control frameworks; granular computing for uncertainty management in smart grids; self-organizing and decentralized paradigms for information processin

    Development of breast cancer diagnosis system based on fuzzy logic and probabilistic neural network

    Breast cancer is one of the most common cancers affecting women worldwide. It occurs when the cells in breast tissue start to grow uncontrollably. Because it can lead to death, early detection and diagnosis are very important for saving the patient's life. Due to the limitations of human observers, computers play a significant role in detecting early signs of cancer. The proposed system uses multi-resolution analysis and a top-hat operation to detect suspicious regions in a mammogram image. Discrete wavelet transform feature analysis is used to extract features from the region of interest. Fuzzy Logic (FL) and a Probabilistic Neural Network (PNN) are used to classify the tumor as normal or abnormal. The proposed system differs from other research in its use of an adaptive threshold value that depends on each image, and in its use of the Discrete Wavelet Transform (DWT) in both the segmentation and feature extraction phases, which decreases complexity and time. Additionally, the detection of more than one tumor in the mammogram image and the use of FL and PNN increase the system's efficiency, which raises its accuracy rate and reduces processing time. The obtained accuracy, sensitivity, and specificity were 99%, 98%, and 47%, respectively, and these results showed that the proposed system is more accurate than previous related work
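
The three figures quoted in the abstract above are standard confusion-matrix ratios. As a minimal sketch of how they are usually computed, the function below takes hypothetical counts; the function name and the numbers are illustrative assumptions and are not taken from the paper.

```python
def classification_metrics(tp, tn, fp, fn):
    """Return (accuracy, sensitivity, specificity) from confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + tn + fp + fn)  # share of all cases classified correctly
    sensitivity = tp / (tp + fn)                # true-positive rate
    specificity = tn / (tn + fp)                # true-negative rate
    return accuracy, sensitivity, specificity


# Hypothetical counts for illustration only (not results from the paper)
acc, sens, spec = classification_metrics(tp=45, tn=40, fp=10, fn=5)
```

Because accuracy pools both classes, it can remain high while specificity is low whenever positive cases dominate the test set.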

    A neuro-genetic hybrid approach to automatic identification of plant leaves

    Plants are essential for the existence of most living things on this planet. Plants are used for providing food, shelter, and medicine. The ability to identify plants is very important for several applications, including conservation of endangered plant species, rehabilitation of land after mining activities, and differentiating crop plants from weeds. In recent times, many researchers have attempted to develop automated plant species recognition systems. However, current computer-based plant recognition systems have limitations, as some plants are naturally complex and it is therefore difficult to extract and represent their features. Further, natural differences in features within the same plant and similarities between plants of different species cause problems in classification. This thesis developed a novel hybrid intelligent system based on a neuro-genetic model for automatic recognition of plants using leaf image analysis. The system combines several image descriptors with Cellular Neural Networks (CNN), a Genetic Algorithm (GA), and Probabilistic Neural Networks (PNN) to address classification challenges in computer-based plant species identification from images of plant leaves. A GA-based feature selection module was developed to select the best of these leaf features. Particle Swarm Optimization (PSO) and Principal Component Analysis (PCA) were also used alongside it for comparison and to provide rigorous feature selection and analysis. Statistical analysis using ANOVA and correlation techniques confirmed the effectiveness of the GA-based and PSO-based techniques, as there were no redundant features and the subsets of features selected by both techniques correlated well. In past work, the number of principal components (PCs) was selected by the conventional method associated with PCA. In this study, however, the GA was used to select a minimum number of PCs from the original PC space. 
This reduced computational cost with respect to time and increased the accuracy of the classifier used. The algebraic nature of the GA's fitness function ensures good performance of the GA. Furthermore, the GA was also used to optimize the parameters of a CNN (used for image segmentation), which was then uniquely combined with the PNN to improve and stabilize the performance of the classification system. The CNN, being described by an ordinary differential equation (ODE), was solved using the fourth-order Runge-Kutta algorithm in order to minimize discretisation errors associated with edge detection. This study involved the extraction of 112 features from images of plant species found in the publicly available Flavia dataset, using the MATLAB programming environment. These features include Zernike Moments (20 ZMs), Fourier Descriptors (21 FDs), Legendre Moments (20 LMs), Hu 7 Moments (7 Hu7Ms), Texture Properties (22 TP), Geometrical Properties (10 GP), and Colour features (12 CF). With the use of the GA, only 14 features were finally selected for optimal accuracy. The PNN was genetically optimized to ensure optimal accuracy, since it is not best practice to fix the tuning parameters of the PNN arbitrarily. Two separate GAs were implemented to optimize the PNN: the GA provided by the MATLAB Optimization Toolbox (GA1) and a separately implemented GA (GA2). The best chromosome (PNN spread) for GA1 was 0.035, with an associated classification accuracy of 91.3740%, while a spread value of 0.06 was obtained from GA2, giving an improved classification accuracy of 92.62%. The PNN-based classifier used in this study was benchmarked against other classifiers such as Multi-layer Perceptron (MLP), k-Nearest Neighbour (kNN), Naive Bayes Classifier (NBC), Radial Basis Function (RBF), and ensemble classifiers (AdaBoost). The best candidate among these classifiers was the genetically optimized PNN. Some computational theoretic properties on PNN are also presented
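
To show why the spread is the parameter worth tuning, here is a minimal sketch of a basic PNN in Python with NumPy. It assumes the spread acts as the standard deviation of a Gaussian kernel, which may not match the convention of the MATLAB toolbox used in the thesis. The function name `pnn_predict` and the toy data are hypothetical.

```python
import numpy as np


def pnn_predict(X_train, y_train, X_test, spread):
    """Classify X_test with a basic PNN (Gaussian Parzen-window estimate per class)."""
    classes = np.unique(y_train)
    # Squared Euclidean distances between test and training points, shape (n_test, n_train)
    d2 = ((X_test[:, None, :] - X_train[None, :, :]) ** 2).sum(axis=2)
    # Pattern layer: Gaussian activation of every training sample (spread = assumed std. dev.)
    k = np.exp(-d2 / (2.0 * spread ** 2))
    # Summation layer: average activation over the training samples of each class
    scores = np.stack([k[:, y_train == c].mean(axis=1) for c in classes], axis=1)
    # Decision layer: choose the class with the largest average activation
    return classes[np.argmax(scores, axis=1)]


# Toy data for illustration only (not from the Flavia dataset)
X_train = np.array([[0.0, 0.0], [0.1, 0.2], [1.0, 1.0], [0.9, 1.1]])
y_train = np.array([0, 0, 1, 1])
X_test = np.array([[0.05, 0.1], [0.95, 1.0]])
pred = pnn_predict(X_train, y_train, X_test, spread=0.3)
```

A search procedure such as a GA would call a function like this repeatedly with different `spread` values and keep the one with the best validation accuracy.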

    Design of neuro-fuzzy models by evolutionary and gradient-based algorithms

    All systems found in nature exhibit, to varying degrees, nonlinear behavior. To emulate this behavior, classical systems identification techniques typically use linear models, for mathematical simplicity. Models inspired by biological principles (artificial neural networks) and linguistically motivated models (fuzzy systems) are, owing to their universal approximation property, becoming alternatives to classical mathematical models. In systems identification, the design of this type of model is an iterative process that requires, among other steps, identifying the model structure and estimating the model parameters. This thesis addresses the applicability of gradient-based algorithms for the parameter estimation phase, and the use of evolutionary algorithms for model structure selection, in the design of neuro-fuzzy systems, i.e., models that offer the transparency property found in fuzzy systems but use, for their design, algorithms introduced in the context of neural networks. A new methodology, based on the minimization of the integral of the error and exploiting the parameter separability property typically found in neuro-fuzzy systems, is proposed for parameter estimation. A recent evolutionary technique (bacterial algorithms), based on the natural phenomenon of microbial evolution, is combined with genetic programming, and the resulting algorithm, bacterial programming, is advocated for structure determination. Different versions of this evolutionary technique are combined with gradient-based algorithms, solving problems found in fuzzy and neuro-fuzzy design, namely incorporation of a priori knowledge, initialization of gradient algorithms, and model complexity reduction. All systems found in nature exhibit, to a greater or lesser degree, nonlinear behavior. To emulate this behavior, classical identification techniques typically use linear models, for mathematical simplicity. 
Owing to their universal approximation property, models inspired by biological principles (artificial neural networks) and linguistically motivated models (fuzzy systems) have been increasingly used as alternatives to classical mathematical models. In a systems identification context, the design of such models is an iterative process made up of several steps. Among these are the identification of the model structure to be used and the estimation of its parameters. This thesis discusses the application of derivative-based algorithms to the parameter estimation phase, and the use of algorithms based on the theory of the evolution of species, evolutionary algorithms, for model structure selection. This is done in the context of the design of neuro-fuzzy models, that is, models that exhibit the transparency property normally associated with fuzzy systems while using, for their design, algorithms introduced in the context of neural networks. The models used in this work are B-spline networks, radial basis function networks, and fuzzy systems of the Mamdani and Takagi-Sugeno types. The work begins by exploring, for the design of B-spline networks, the incorporation of existing a priori knowledge about a process. To this end, a new approach is applied in which the parameter estimation technique is modified to enforce equality constraints on the function and its derivatives. It is also shown that model structure determination strategies, based on evolutionary computation or on deterministic heuristics, can easily be adapted to this type of constrained model. A new evolutionary technique is proposed, resulting from the combination of recently introduced algorithms (bacterial algorithms, based on the natural phenomenon of microbial evolution) and genetic programming. 
In this new approach, called bacterial programming, the genetic operators are replaced by the bacterial operators. Thus, while bacterial mutation works on a single individual and tries to optimize the bacterium that encodes it, gene transfer is applied to the whole population of bacteria, avoiding local-minimum solutions. This heuristic was applied to the design of B-spline networks. The performance of this approach is illustrated and compared with existing alternatives. Local, derivative-based optimization techniques are normally used to determine the parameters of a model. Since the model in question is nonlinear, the performance of such techniques is influenced by the starting points. To address this problem, a new method is proposed in which the evolutionary algorithm mentioned above is used to determine more suitable starting points for the derivative-based algorithm. This increases the chance of finding a global minimum. The complexity of neuro-fuzzy (and fuzzy) models grows exponentially with the dimension of the problem. To mitigate this problem, a new input-space partitioning approach is proposed, which extends the input decomposition strategies normally used for this type of model. Simulations show that, with this approach, generalization capability can be maintained with models of lower complexity. B-spline models are functionally equivalent to fuzzy models, provided that certain conditions are satisfied. For the cases where this does not hold (generic Mamdani fuzzy models), the techniques previously employed for B-spline networks were adapted. On the one hand, the Levenberg-Marquardt algorithm is adapted so that it can be applied to the input-space partitioning of a fuzzy system. 
On the other hand, the bacterial evolutionary algorithms are adapted to fuzzy systems and combined with the Levenberg-Marquardt algorithm, exploiting the fusion of the characteristics of each methodology. This hybridization of the two algorithms, called the bacterial memetic algorithm, was shown on several test problems to give better results than known alternatives. The parameters of the neural models used and of the fuzzy models described above (provided they satisfy certain criteria) can be separated, according to their influence on the output, into linear and nonlinear parameters. Using the consequences of this property in the parameter estimation algorithms, this thesis also proposes a new methodology for parameter estimation, based on the minimization of the integral of the error, as an alternative to the commonly used minimization of the sum of squared errors. Besides enabling (in certain cases) a fully analytical design, this technique achieves better generalization results, since it uses a performance surface more similar to the one that would be obtained if the data-generating function were used
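
The separation of parameters into linear and nonlinear ones can be illustrated with a small radial basis function model. This is only a sketch: it uses the conventional sum-of-squared-errors criterion, not the integral-of-error criterion proposed in the thesis. The function names and toy data are hypothetical.

```python
import numpy as np


def rbf_design_matrix(x, centers, width):
    """Gaussian basis outputs, one column per centre (depends on the nonlinear parameters)."""
    return np.exp(-((x[:, None] - centers[None, :]) ** 2) / (2.0 * width ** 2))


def fit_linear_weights(x, y, centers, width):
    """With centres and width held fixed, the output weights solve a linear least-squares problem."""
    phi = rbf_design_matrix(x, centers, width)
    weights, *_ = np.linalg.lstsq(phi, y, rcond=None)
    return weights


# Toy data generated from known weights (illustration only)
x = np.linspace(0.0, 1.0, 50)
centers = np.array([0.2, 0.5, 0.8])
true_w = np.array([1.0, -2.0, 0.5])
y = rbf_design_matrix(x, centers, 0.15) @ true_w
w = fit_linear_weights(x, y, centers, 0.15)
```

In a separable scheme, an iterative optimizer such as Levenberg-Marquardt only has to search over the centres and widths, since the optimal linear weights follow in closed form at each step.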