40 research outputs found

    Cómo elegir una función de activación para el aprendizaje profundo

    Activation functions are important in each layer of a neural network because they allow the network to learn complex relationships between the input data and the output data. They also introduce nonlinearity into the network, which is essential for learning patterns in data. Activation functions play a critical role in the training and optimization of deep learning models, and choosing the right activation function can significantly impact the model's performance. This article presents a summary of the features of these functions.
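    As a minimal illustration of the nonlinearity these functions provide (a sketch not taken from the article; the sampled inputs and the particular functions shown are arbitrary), a few common activations can be written directly in NumPy:

```python
# Minimal sketch (not from the article): three common activation functions and
# how they transform the same pre-activations nonlinearly.
import numpy as np

def sigmoid(x):
    # Squashes inputs to (0, 1); historically popular but saturates for large |x|.
    return 1.0 / (1.0 + np.exp(-x))

def tanh(x):
    # Zero-centered counterpart of the sigmoid, output in (-1, 1).
    return np.tanh(x)

def relu(x):
    # Piecewise-linear; cheap to compute and does not saturate for x > 0.
    return np.maximum(0.0, x)

z = np.linspace(-4, 4, 9)   # example pre-activations of a layer
print(relu(z))
print(sigmoid(z))
print(tanh(z))
```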

    KOMPARASI FUNGSI AKTIVASI NEURAL NETWORK PADA DATA TIME SERIES

    Abstract: The sophistication and success of machine learning in solving problems across various fields of artificial intelligence cannot be separated from the neural networks that form the basis of its algorithms. Meanwhile, the essence of a neural network lies in its activation function. However, because so many activation functions have emerged recently, it is necessary to find the proper activation function for the model and dataset used. In this study, activation functions commonly used in machine learning models, namely ReLU, GELU, and SELU, are tested on time series data in the form of stock prices. These activation functions are implemented in Python using the TensorFlow library, with a model based on a Convolutional Neural Network (CNN). The results of this implementation show that, with the CNN model, the GELU activation function yields the smallest loss value for time series data.
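    The study's exact code is not reproduced here; the sketch below assumes a small Conv1D model on sliding windows of a synthetic price series and simply loops over the three activations (ReLU, GELU, SELU) using the TensorFlow/Keras API mentioned in the abstract. The window length, layer sizes, and synthetic data are assumptions for illustration.

```python
# Hedged sketch, not the paper's exact setup: compare ReLU, GELU, and SELU in a
# small Conv1D model that predicts the next value of a univariate price series.
import numpy as np
import tensorflow as tf

def make_windows(series, window=30):
    # Build sliding windows of length `window`; the target is the next value.
    X = np.stack([series[i:i + window] for i in range(len(series) - window)])
    y = series[window:]
    return X[..., None].astype("float32"), y.astype("float32")

def build_cnn(activation, window=30):
    return tf.keras.Sequential([
        tf.keras.layers.Input(shape=(window, 1)),
        tf.keras.layers.Conv1D(32, 3, activation=activation),
        tf.keras.layers.Conv1D(32, 3, activation=activation),
        tf.keras.layers.GlobalAveragePooling1D(),
        tf.keras.layers.Dense(1),          # regression output (next price)
    ])

prices = np.cumsum(np.random.randn(1000)) + 100.0   # stand-in for real stock data
X, y = make_windows(prices)

for act in ["relu", "gelu", "selu"]:
    model = build_cnn(act)
    model.compile(optimizer="adam", loss="mse")
    hist = model.fit(X, y, epochs=5, batch_size=32, validation_split=0.2, verbose=0)
    print(act, "final validation loss:", hist.history["val_loss"][-1])
```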

    Augmented Air Traffic Control System—Artificial Intelligence as Digital Assistance System to Predict Air Traffic Conflicts

    Today's air traffic management (ATM) system revolves around air traffic controllers and pilots. This human-centered design made air traffic remarkably safe in the past. However, with the increase in flights and the variety of aircraft using European airspace, it is reaching its limits. This poses significant problems such as congestion, deterioration of flight safety, greater costs, more delays, and higher emissions. Transforming ATM into the "next generation" requires complex human-integrated systems that provide better abstraction of airspace and create situational awareness, as described in the literature for this problem. This paper makes the following contributions: (a) It outlines the complexity of the problem. (b) It introduces a digital assistance system to detect conflicts in air traffic by systematically analyzing aircraft surveillance data to provide air traffic controllers with better situational awareness. For this purpose, long short-term memory (LSTM) networks, which are a popular variant of recurrent neural networks (RNNs), are used to determine whether their temporal dynamic behavior is capable of reliably monitoring air traffic and classifying error patterns. (c) Large-scale, realistic air traffic models with several thousand flights containing air traffic conflicts are used to create a parameterized airspace abstraction to train several variations of LSTM networks. The applied networks are based on a 20-10-1 architecture and use leaky ReLU and sigmoid activation functions. For the learning process, the binary cross-entropy loss function and the adaptive moment estimation (ADAM) optimizer are applied with different learning rates and batch sizes over ten epochs. (d) Numerical results and achievements obtained by using LSTM networks to predict various weather events, cyberattacks, emergency situations, and human factors are presented.
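    A hedged sketch of the described setup follows, interpreting "20-10-1" as two stacked LSTM layers of 20 and 10 units followed by a single sigmoid output; the input shape, the leaky ReLU placement, and the particular learning rate and batch size are assumptions, since the paper's exact layer layout and hyperparameters are not given here.

```python
# Hedged sketch of a 20-10-1 LSTM classifier for conflict / no-conflict labels.
import tensorflow as tf

TIMESTEPS, FEATURES = 60, 8        # assumed shape of the per-flight surveillance sequence

model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(TIMESTEPS, FEATURES)),
    tf.keras.layers.LSTM(20, return_sequences=True),
    tf.keras.layers.LeakyReLU(),                      # leaky ReLU on the hidden sequence (placement assumed)
    tf.keras.layers.LSTM(10),
    tf.keras.layers.LeakyReLU(),
    tf.keras.layers.Dense(1, activation="sigmoid"),   # conflict probability
])

model.compile(
    optimizer=tf.keras.optimizers.Adam(learning_rate=1e-3),  # the paper varies rates and batch sizes
    loss="binary_crossentropy",
    metrics=["accuracy"],
)
model.summary()
# model.fit(X_train, y_train, epochs=10, batch_size=64)      # ten epochs, as described above
```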

    Enhancing Interpretability of Neural Networks in Food Recommendation Systems

    ABSTRACT: Over the years, the risk of developing diseases related to poor nutrition has been increasing. Many of these diseases are caused by obesity. Obesity is a silent disease related to being overweight, which, due to its rapid growth, has become a public health problem. Worldwide obesity has nearly tripled since 1975. Obesity can lead to health problems like type 2 diabetes, cardiovascular disease, and even cancer. The main factors that result in obesity are a sedentary lifestyle and a poor diet. Although obesity cannot be cured, it can be avoided or treated through a healthier lifestyle and diet. Amid so much information about diets and healthier recipes, it can be difficult to find a diet that meets the needs of each person. Recommendation systems can filter, from a large dataset, the information that best suits the profile of each user. Due to the constant increase in information and computational power, recommendation systems have evolved from a traditional approach to a deep-learning one. Recommendation systems are a hot topic in deep learning. Research in the food recommendation systems area has seen little development compared to recommendation systems in other areas, such as leisure and entertainment. A powerful tool to use in food recommendation systems is neural networks. Neural networks play an important role in our society because of their capacity to learn from complex and high-dimensional data. One downside of neural networks is the difficulty, if not impossibility, of understanding how their predictions are being made. The behind-the-scenes workings often remain opaque, leading neural networks to be characterized as "black boxes". With this research, we aim to contribute to understanding how neural networks operate underneath and to make them more transparent and thus more trustworthy. With this goal in mind, we propose the use of a secondary model to predict the errors of a primary neural network. By analyzing the error predictions of the second model, we aim to gain insights into the primary network's decision-making process. With this approach, we hope not only to help understand the functioning of neural networks but also to provide an idea of how to improve their performance. Improving the understanding of neural networks can make them simpler and more accessible. With the work developed through this research, we aim to stride towards making neural networks more transparent and explainable, thereby enhancing trust in these powerful models.
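    The thesis's actual pipeline is not shown here; the sketch below only illustrates the core idea of a secondary model that predicts the errors of a primary network, using synthetic features, a small Keras regressor as the primary model, and a shallow decision tree as the interpretable error model. All data shapes and feature semantics are assumptions.

```python
# Minimal sketch of the "error model" idea: a primary network predicts food-item
# ratings, and an interpretable secondary model is fit on the primary model's
# errors so its feature importances hint at where the primary model fails.
import numpy as np
import tensorflow as tf
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(0)
X = rng.normal(size=(2000, 12)).astype("float32")   # stand-in user/recipe features
y = (X[:, 0] * X[:, 1] + 0.1 * rng.normal(size=2000)).astype("float32")  # stand-in ratings

# Primary model: a small feed-forward network.
primary = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(12,)),
    tf.keras.layers.Dense(32, activation="relu"),
    tf.keras.layers.Dense(1),
])
primary.compile(optimizer="adam", loss="mse")
primary.fit(X, y, epochs=10, batch_size=64, verbose=0)

# Secondary model: predict the primary model's absolute error from the same inputs.
errors = np.abs(primary.predict(X, verbose=0).ravel() - y)
error_model = DecisionTreeRegressor(max_depth=3).fit(X, errors)

# The tree's splits and importances indicate which features drive the primary
# network's mistakes, offering one window into the "black box".
print(error_model.feature_importances_)
```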

    Advanced Stochastic Optimization Algorithm for Deep Learning Artificial Neural Networks in Banking and Finance Industries

    One of the objectives of this paper is to incorporate fat-tail effects into, for instance, the Sigmoid function in order to introduce transparency and stability into the existing stochastic activation functions. Secondly, according to the literature reviewed, the existing set of activation functions was introduced into deep learning artificial neural networks through the "window" rather than the "legitimate door", since they rest on "trial and error" and "arbitrary assumptions"; thus, the author proposes "scientific facts", "definite rules: Jameel's Stochastic ANNAF Criterion", and a "lemma" to substitute, though not necessarily replace, the existing set of stochastic activation functions, for instance the Sigmoid, among others. This research is expected to open the "black box" of deep learning artificial neural networks. The author proposes a new set of advanced, optimized, fat-tailed stochastic activation functions derived from AI-ML-purified stock data, namely the Log-Logistic (3P) probability distribution (1st), Cauchy probability distribution (2nd), Pearson 5 (3P) probability distribution (3rd), Burr (4P) probability distribution (4th), Fatigue Life (3P) probability distribution (5th), Inverse Gaussian (3P) probability distribution (6th), Dagum (4P) probability distribution (7th), and Lognormal (3P) probability distribution (8th), for conducting both forward and backward propagation in deep learning artificial neural networks. However, this paper does not check the monotone differentiability of the proposed distributions. Appendices A, B, and C present and test the performance of the stressed Sigmoid and the optimized activation functions using stock data (1991-2014) of Microsoft Corporation (MSFT), Exxon Mobil (XOM), Chevron Corporation (CVX), Honda Motor Corporation (HMC), General Electric (GE), and U.S. fundamental macroeconomic parameters; the results were found fascinating. The first three distributions are therefore excellent activation functions for conducting any stock deep learning artificial neural network, and distributions 4 to 8 are also good advanced optimized activation functions. Generally, this research revealed that whether the advanced optimized activation functions satisfy Jameel's ANNAF Stochastic Criterion depends on the referenced purified AI data set, the time of change, and the area of application, in contrast to the existing "trial and error" and "arbitrary assumptions" behind Sigmoid, Tanh, Softmax, ReLU, and Leaky ReLU.
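    The paper's exact construction is not given here; the sketch below only illustrates the general idea of using a heavy-tailed distribution's CDF (the Cauchy distribution from the list above) as a sigmoid-like activation. The location and scale parameters, and the comparison against the standard sigmoid, are assumptions for illustration.

```python
# Hedged sketch: a distribution CDF as a fat-tailed, sigmoid-like activation.
import numpy as np

def sigmoid(x):
    # Standard logistic sigmoid for comparison.
    return 1.0 / (1.0 + np.exp(-x))

def cauchy_cdf_activation(x, loc=0.0, scale=1.0):
    # CDF of the Cauchy distribution: squashes inputs to (0, 1) like a sigmoid,
    # but its fat tails make it saturate much more slowly for large |x|.
    return 0.5 + np.arctan((x - loc) / scale) / np.pi

z = np.array([-10.0, -2.0, 0.0, 2.0, 10.0])
print(sigmoid(z))                 # essentially 0 or 1 already at |z| = 10
print(cauchy_cdf_activation(z))   # still noticeably away from 0 and 1 at |z| = 10
```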

    Detecting Alzheimer's Disease using Artificial Neural Networks

    This project aims to use artificial neural networks (ANNs) to detect Alzheimer's disease. More specifically, convolutional neural networks (CNNs) will be utilized, as this is the most common type of ANN and has been used in many different image processing applications. The purpose of using artificial neural networks as a detection method is to provide an intelligent way of performing image and signal analysis. Software that implements a CNN will be developed so that users in medical settings can utilize it to detect Alzheimer's in patients. The input for this software will be the patient's MRI scans. In addition, this project is relevant to the current trend of increasing development in artificial intelligence. As technology has become more advanced, there has been an increase in medical developments as well. In the simulation, the hyperbolic tangent activation function provided the best results. Averaged over the two classifications, the hyperbolic tangent function achieved a best validation accuracy of 81.10%, a validation accuracy at stopped tuning of 81.10%, a best training accuracy of 100.00%, a best testing accuracy of 68.94%, an F1 score of 70.06%, a precision of 71.00%, and a recall of 70.06%. This project will open doors to more applications of this detection method. Diseases other than Alzheimer's can also be detected early with artificial neural networks (ANNs), so that lives can be restored and saved.
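    The project's actual network is not reproduced here; the following sketch assumes a small Keras CNN with hyperbolic tangent activations on grayscale MRI slices and a binary Alzheimer's/control output. The image size, layer depth, and training settings are assumptions for illustration.

```python
# Hedged sketch of a tanh-activated CNN for binary MRI classification.
import tensorflow as tf

model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(128, 128, 1)),           # grayscale MRI slice (size assumed)
    tf.keras.layers.Conv2D(16, 3, activation="tanh"),
    tf.keras.layers.MaxPooling2D(),
    tf.keras.layers.Conv2D(32, 3, activation="tanh"),
    tf.keras.layers.MaxPooling2D(),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(64, activation="tanh"),
    tf.keras.layers.Dense(1, activation="sigmoid"),        # probability of Alzheimer's
])

model.compile(
    optimizer="adam",
    loss="binary_crossentropy",
    metrics=["accuracy", tf.keras.metrics.Precision(), tf.keras.metrics.Recall()],
)
model.summary()
# model.fit(train_images, train_labels, validation_split=0.2, epochs=20)
```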