6 research outputs found

    A generalised feedforward neural network architecture and its applications to classification and regression

    Shunting inhibition is a powerful computational mechanism that plays an important role in sensory neural information processing systems. It has been extensively used to model some important visual and cognitive functions. It equips neurons with a gain control mechanism that allows them to operate as adaptive non-linear filters. Shunting Inhibitory Artificial Neural Networks (SIANNs) are biologically inspired networks in which the basic synaptic computations are based on shunting inhibition. SIANNs were designed to solve difficult machine learning problems by exploiting the inherent non-linearity mediated by shunting inhibition. The aim was to develop powerful, trainable networks with non-linear decision surfaces for classification and non-linear regression tasks. This work enhances and extends the original SIANN architecture to a more general form called the Generalised Feedforward Neural Network (GFNN) architecture, which contains both the SIANN and the conventional Multilayer Perceptron (MLP) architectures as subsets. The original SIANN structure has the number of shunting neurons in the hidden layers equal to the number of inputs, because the neuron model used has a single direct excitatory input. This was found to be too restrictive, often resulting in inadequately small or inordinately large network structures.
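    The divisive gain control described above can be sketched as a single neuron whose excitatory input is divided by a shunting term driven by the other inputs. This is a minimal illustration, not the exact SIANN neuron model of the work; the softplus inhibitory activation and the parameter names are assumptions chosen to keep the denominator positive.

```python
import numpy as np

def shunting_neuron(e, x, w, a=1.0):
    """One shunting inhibitory neuron (illustrative formulation).

    e : direct excitatory input (a single scalar, matching the 'single
        direct excitatory input' of the original SIANN neuron model)
    x : inputs feeding the inhibitory (shunting) synapses
    w : inhibitory synaptic weights (assumed names, for illustration)
    a : passive decay constant, a > 0, which bounds the maximum gain
    """
    inhibition = np.logaddexp(0.0, w @ x)  # softplus keeps inhibition >= 0
    return e / (a + inhibition)            # divisive (shunting) gain control
```

    Because the inhibition enters the denominator rather than being added, the same excitatory input is attenuated more strongly as the inhibitory drive grows; it is this divisive non-linearity that gives shunting networks non-linear decision surfaces.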

    On the training of feedforward neural networks.

    by Hau-san Wong. Thesis (M.Phil.)--Chinese University of Hong Kong, 1993. Includes bibliographical references (leaves [178-183]). Contents:
    Chapter 1 INTRODUCTION: 1.1 Learning versus Explicit Programming (p.1-1); 1.2 Artificial Neural Networks (p.1-2); 1.3 Learning in ANN (p.1-3); 1.4 Problems of Learning in BP Networks (p.1-5); 1.5 Dynamic Node Architecture for BP Networks (p.1-7); 1.6 Incremental Learning (p.1-10); 1.7 Research Objective and Thesis Organization (p.1-11)
    Chapter 2 THE FEEDFORWARD MULTILAYER NEURAL NETWORK: 2.1 The Perceptron (p.2-1); 2.2 The Generalization of the Perceptron (p.2-4); 2.3 The Multilayer Feedforward Network (p.2-5)
    Chapter 3 SOLUTIONS TO THE BP LEARNING PROBLEM: 3.1 Introduction (p.3-1); 3.2 Attempts in the Establishment of a Viable Hidden Representation Model (p.3-5); 3.3 Dynamic Node Creation Algorithms (p.3-9); 3.4 Concluding Remarks (p.3-15)
    Chapter 4 THE GROWTH ALGORITHM FOR NEURAL NETWORKS: 4.1 Introduction (p.4-2); 4.2 The Radial Basis Function (p.4-6); 4.3 The Additional Input Node and the Modified Nonlinearity (p.4-9); 4.4 The Initialization of the New Hidden Node (p.4-11); 4.5 Initialization of the First Node (p.4-15); 4.6 Practical Considerations for the Growth Algorithm (p.4-18); 4.7 The Convergence Proof for the Growth Algorithm (p.4-20); 4.8 The Flow of the Growth Algorithm (p.4-21); 4.9 Experimental Results and Performance Analysis (p.4-21); 4.10 Concluding Remarks (p.4-33)
    Chapter 5 KNOWLEDGE REPRESENTATION IN NEURAL NETWORKS: 5.1 An Alternative Perspective to Knowledge Representation in Neural Networks: The Temporal Vector (T-Vector) Approach (p.5-1); 5.2 Prior Research Works in the T-Vector Approach (p.5-2); 5.3 Formulation of the T-Vector Approach (p.5-3); 5.4 Relation of the Hidden T-Vectors to the Output T-Vectors (p.5-6); 5.5 Relation of the Hidden T-Vectors to the Input T-Vectors (p.5-10); 5.6 An Inspiration for a New Training Algorithm from the Current Model (p.5-12)
    Chapter 6 THE DETERMINISTIC TRAINING ALGORITHM FOR NEURAL NETWORKS: 6.1 Introduction (p.6-1); 6.2 The Linear Independency Requirement for the Hidden T-Vectors (p.6-3); 6.3 Inspiration of the Current Work from the Barmann T-Vector Model (p.6-5); 6.4 General Framework of Dynamic Node Creation Algorithm (p.6-10); 6.5 The Deterministic Initialization Scheme for the New Hidden Nodes: 6.5.1 Introduction (p.6-12); 6.5.2 Determination of the Target T-Vector: 6.5.2.1 Introduction (p.6-15); 6.5.2.2 Modelling of the Target Vector β_Q h_Q (p.6-16); 6.5.2.3 Near-Linearity Condition for the Sigmoid Function (p.6-18); 6.5.3 Preparation for the BP Fine-Tuning Process (p.6-24); 6.5.4 Determination of the Target Hidden T-Vector (p.6-28); 6.5.5 Determination of the Hidden Weights (p.6-29); 6.5.6 Determination of the Output Weights (p.6-30); 6.6 Linear Independency Assurance for the New Hidden T-Vector (p.6-30); 6.7 Extension to the Multi-Output Case (p.6-32); 6.8 Convergence Proof for the Deterministic Algorithm (p.6-35); 6.9 The Flow of the Deterministic Dynamic Node Creation Algorithm (p.6-36); 6.10 Experimental Results and Performance Analysis (p.6-36); 6.11 Concluding Remarks (p.6-50)
    Chapter 7 THE GENERALIZATION MEASURE MONITORING SCHEME: 7.1 The Problem of Generalization for Neural Networks (p.7-1); 7.2 Prior Attempts in Solving the Generalization Problem (p.7-2); 7.3 The Generalization Measure (p.7-4); 7.4 The Adoption of the Generalization Measure to the Deterministic Algorithm (p.7-5); 7.5 Monitoring of the Generalization Measure (p.7-6); 7.6 Correspondence between the Generalization Measure and the Generalization Capability of the Network (p.7-8); 7.7 Experimental Results and Performance Analysis (p.7-12); 7.8 Concluding Remarks (p.7-16)
    Chapter 8 THE ESTIMATION OF THE INITIAL HIDDEN LAYER SIZE: 8.1 The Need for an Initial Hidden Layer Size Estimation (p.8-1); 8.2 The Initial Hidden Layer Estimation Scheme (p.8-2); 8.3 The Extension of the Estimation Procedure to the Multi-Output Network (p.8-6); 8.4 Experimental Results and Performance Analysis (p.8-6); 8.5 Concluding Remarks (p.8-16)
    Chapter 9 CONCLUSION: 9.1 Contributions (p.9-1); 9.2 Suggestions for Further Research (p.9-3)
    REFERENCES (p.R-1); APPENDIX (p.A-)
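    Chapters 3, 4 and 6 revolve around dynamic node creation: hidden nodes are added one at a time until the training error stops improving. A minimal sketch of such a growth loop, assuming randomly initialised tanh candidate nodes and a least-squares output layer (the thesis instead derives a deterministic initialisation for each new node), is:

```python
import numpy as np

def grow_network(X, y, max_nodes=20, tol=1e-3, rng=None):
    """Add tanh hidden nodes one at a time; refit the output weights by
    least squares after each addition; stop growing when the error
    improvement falls below tol. Illustrative, not the thesis algorithm."""
    rng = np.random.default_rng(rng)
    n, d = X.shape
    H = np.ones((n, 1))                # start with a bias column only
    best_err = np.inf
    for _ in range(max_nodes):
        w, b = rng.normal(size=d), rng.normal()        # random candidate node
        H_try = np.hstack([H, np.tanh(X @ w + b)[:, None]])
        beta, *_ = np.linalg.lstsq(H_try, y, rcond=None)
        err = np.mean((H_try @ beta - y) ** 2)
        if best_err - err < tol:       # candidate did not help enough: stop
            break
        H, best_err = H_try, err       # accept the node and keep growing
    return H.shape[1] - 1, best_err    # (number of hidden nodes, final MSE)
```

    Here each accepted node must reduce the mean squared error by at least `tol`; the deterministic algorithm of Chapter 6 replaces the random candidate with a node whose target T-vector is computed directly.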

    Connectionist model to evaluate medical equipment purchasing proposals

    Advisor: Saide Jorge Calil. Thesis (doctorate) - Universidade Estadual de Campinas, Faculdade de Engenharia Elétrica e de Computação.
    Resumo (translated): In Brazil, a significant share of medical equipment sits inoperative because the purchasing process was conducted inadequately by unprepared people. Aiming at a future solution to this problem, this thesis developed a study to show the possibility of representing, through artificial neural networks, the cognitive process used by experienced clinical engineers during the criteria-weighting phase of judging proposals for the supply of medical equipment. To that end, the responses to a survey of clinical engineers from several regions of the country were used to build training examples for various neural network architectures. The best results (highest correlation with the original responses and lowest test squared error) were obtained for an ensemble of 100 two-hidden-layer neural networks trained with the back-propagation algorithm. This demonstrated the feasibility of representing the experts' knowledge as a non-linear connectionist model whose outputs give the importance of several factors (clinical, financial, quality, safety and technical) involved in judging proposals for the purchase of medical equipment.
    Abstract: In Brazil there is evidence of a great amount of unusable medical equipment, owing to the absence of experienced professionals to conduct an effective purchasing plan at healthcare institutions. In order to seek a future solution to this problem, a study was developed to verify the feasibility of representing, through artificial neural networks, the cognitive process used by clinical engineering experts during the evaluation phase of purchasing proposals for medical equipment. An inquiry (by electronic mail) of clinical engineers from several Brazilian regions was conducted, using an electronic chart that contained a list of parameters commonly used in this evaluation phase. Data from the completed charts were used to train and test diverse types of artificial neural networks. The best results (highest correlation and lowest quadratic error with respect to the original entries) were found for an ensemble of 100 two-hidden-layer perceptrons trained with the backpropagation algorithm. It was thus shown that the knowledge of clinical engineers in the evaluation of purchasing proposals can be represented by a non-linear connectionist model, whose inputs are the physical risk, cost and strategic importance of the medical equipment, and whose outputs are the importance clinical engineers assign to five factors (clinical, financial, quality, safety and technical) in the evaluation of a medical equipment purchase. Doutorado. Engenharia Biomédica. Doutor em Engenharia Elétrica.
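    The model selected above, an ensemble of two-hidden-layer networks trained with backpropagation whose outputs are averaged, can be sketched in a few lines. This is an illustrative reconstruction, not the thesis's code: the widths, learning rate, epoch count and ensemble size here are assumptions chosen to keep the example small.

```python
import numpy as np

def init_mlp(d, h, rng):
    # two tanh hidden layers of width h, one linear output
    return [rng.normal(scale=0.5, size=(d, h)), np.zeros(h),
            rng.normal(scale=0.5, size=(h, h)), np.zeros(h),
            rng.normal(scale=0.5, size=(h, 1)), np.zeros(1)]

def forward(p, X):
    W1, b1, W2, b2, W3, b3 = p
    a1 = np.tanh(X @ W1 + b1)
    a2 = np.tanh(a1 @ W2 + b2)
    return a1, a2, a2 @ W3 + b3

def train(p, X, y, lr=0.05, epochs=500):
    """Plain batch backpropagation on mean squared error."""
    for _ in range(epochs):
        a1, a2, out = forward(p, X)
        W1, b1, W2, b2, W3, b3 = p
        g = 2 * (out - y) / len(X)               # d(MSE)/d(output)
        gW3, gb3 = a2.T @ g, g.sum(0)
        g2 = (g @ W3.T) * (1 - a2 ** 2)          # backprop through layer 2
        gW2, gb2 = a1.T @ g2, g2.sum(0)
        g1 = (g2 @ W2.T) * (1 - a1 ** 2)         # backprop through layer 1
        gW1, gb1 = X.T @ g1, g1.sum(0)
        for arr, grad in zip(p, [gW1, gb1, gW2, gb2, gW3, gb3]):
            arr -= lr * grad                     # in-place gradient step
    return p

def ensemble_predict(X_train, y_train, X_test, n_nets=10, h=8, seed=0):
    """Train n_nets networks from different initialisations; average them."""
    rng = np.random.default_rng(seed)
    preds = []
    for _ in range(n_nets):
        p = train(init_mlp(X_train.shape[1], h, rng), X_train, y_train)
        preds.append(forward(p, X_test)[2])
    return np.mean(preds, axis=0)   # ensemble output = mean of member outputs
```

    Averaging many independently initialised networks reduces the variance component of the error, which is why the ensemble beat the individual architectures compared in the study.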

    Neural networks with extended Bayesian methods for real-world data collections

    Solutions are presented for numerous problems that can arise when neural networks process real-world training data and that have so far been discussed insufficiently, or not at all, in the literature. All of these methods are implemented in a complete system for processing corrosion data and are validated empirically. The starting point of all concepts and algorithms are neural networks with extended Bayesian methods: they process training data carrying individual measurement-error specifications. Accordingly, prediction errors in the form of confidences can be computed alongside the predictions. Generalized linear networks were used for the implementation. They permit a very efficient training algorithm that, besides the weights, also determines the prior distribution of the weights fully automatically. Furthermore, a series of theoretical results is presented that are important for understanding the extended Bayesian methods and that describe the relationship between training and prediction errors, the basis functions, and weight regularization. The cooperation of networks is introduced to solve two structural problems of the corrosion data collection at hand effectively. Since the measurement sites lie in a very high-dimensional space, yet are arranged in comparatively few clusters, training data that belong together in content are grouped into individual experts. In addition, training data with missing (i.e. distributed) values in a parameter are trained in different experts than training data with concrete values. Moreover, the cooperation accelerates both training and prediction and reduces the required memory.
    The relationship between a single network trained on all the data and two cooperating networks trained jointly on the same data is examined analytically and by example. The cooperation generalizes approximately as well as a single, universal network. Corrosion is predominantly, but not everywhere, a deterministic function of the input variables. The proposed model of regional noise is able, when suitable training data are available, to identify those regions of the input space in which training data contradict one another relative to their stated measurement errors. The standard deviation of the inherent noise is thereby identified and, together with the Bayesian prediction error, forms an extended error bar for the prediction. The classification model commonly used in the literature, which treats the input variables as random variables conditioned on the class to be trained, is not applicable to corrosion. An alternative model is therefore developed that inverts this dependency. It additionally allows a separation of the trained and the predicted classes, so that the information contained in the training data can be exploited better. Processing data that were not originally compiled for training neural networks requires extensive preprocessing. For this, the methods of a two-stage procedure are described, whose central element is a complex, user- and problem-oriented conceptual data schema. In mapping the original training data into this schema, peculiarities of the data description are removed, yielding a phenomenon-oriented description. Analytical problem knowledge enters the further mapping onto the network inputs and outputs, which then leads to considerably improved generalization properties.
    An overview of the capabilities of the resulting software, together with empirical evaluations demonstrating the performance and correctness of all the models and concepts described, concludes the work.
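    The core of the extended Bayesian machinery described above, a generalized linear network whose training data carry individual measurement errors and whose predictions carry error bars, can be sketched as weighted Bayesian least squares. This is a minimal illustration under a fixed prior precision `alpha`; the thesis additionally determines the prior automatically and models regional noise, both of which are omitted here, and all function names are assumptions.

```python
import numpy as np

def bayes_fit(Phi, y, sigma, alpha=1.0):
    """Bayesian linear-in-the-weights fit with per-example noise levels.

    Prior: w ~ N(0, alpha^-1 I).  Likelihood: y_i ~ N(Phi_i . w, sigma_i^2),
    so each training point contributes with its own measurement error."""
    prec = 1.0 / sigma ** 2                         # per-point precisions
    A = alpha * np.eye(Phi.shape[1]) + Phi.T @ (prec[:, None] * Phi)
    cov = np.linalg.inv(A)                          # posterior covariance
    mean = cov @ Phi.T @ (prec * y)                 # posterior mean of w
    return mean, cov

def bayes_predict(Phi_new, mean, cov):
    mu = Phi_new @ mean
    var = np.einsum('ij,jk,ik->i', Phi_new, cov, Phi_new)  # model uncertainty
    return mu, np.sqrt(var)                         # prediction and error bar
```

    A point with a large stated measurement error gets a small precision and therefore barely influences the fit, while the predictive error bar grows automatically in regions the training data constrain poorly.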

    Constructive Neural Network in Model-Based Control of a Biotechnological Process

    In the present work, a constructive learning algorithm is employed to design an optimal one-hidden-layer neural network structure that best approximates a given mapping. The method determines not only the optimal number of hidden neurons but also the best activation function for each node. Here, the projection pursuit technique is applied in association with the optimization of the solvability condition, giving rise to a more efficient and accurate computational learning algorithm. As the activation function of each hidden neuron is optimally defined for every approximation problem, better rates of convergence are achieved. The proposed constructive learning algorithm was successfully applied to identify a large-scale multivariate process, providing a multivariable model able to describe the complex process dynamics even in long-range horizon predictions. The resulting identification model is then used as part of a model-based predictive control strategy, with high-quality performance in closed-loop experiments.
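    The constructive step described above, adding one hidden unit at a time by projection pursuit with a node-specific activation function, can be sketched as follows. This is an illustrative reconstruction, not the authors' algorithm: the direction search here is a random-candidate search and the "optimized activation" is a cubic polynomial fit, whereas the paper optimizes the solvability condition.

```python
import numpy as np

def fit_ridge_unit(X, r, n_dirs=50, degree=3, rng=None):
    """One projection-pursuit unit: choose the projection direction (here,
    the best of n_dirs random candidates) whose 1-D polynomial fit explains
    the most of the current residual r."""
    rng = np.random.default_rng(rng)
    best = None
    for _ in range(n_dirs):
        w = rng.normal(size=X.shape[1])
        w /= np.linalg.norm(w)                 # unit projection direction
        z = X @ w
        coef = np.polyfit(z, r, degree)        # node-specific activation g(z)
        err = np.mean((np.polyval(coef, z) - r) ** 2)
        if best is None or err < best[0]:
            best = (err, w, coef)
    return best[1], best[2]

def ppr(X, y, n_units=3, rng=0):
    """Constructive loop: each new unit is fitted to the residual left by
    the units added before it."""
    r, units = y.astype(float), []
    for k in range(n_units):
        w, coef = fit_ridge_unit(X, r, rng=rng + k)
        r = r - np.polyval(coef, X @ w)        # next unit sees this residual
        units.append((w, coef))
    return units, r
```

    Fitting a free-form one-dimensional function per unit, rather than a fixed sigmoid, is what lets constructive projection-pursuit networks reach a given accuracy with fewer hidden neurons.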