Deep Learning: Our Miraculous Year 1990-1991

Author: Schmidhuber Juergen
Publication date: 12/05/2020

In 2020, we will celebrate that many of the basic ideas behind the deep learning revolution were published three decades ago within fewer than 12 months in our "Annus Mirabilis" or "Miraculous Year" 1990-1991 at TU Munich. Back then, few people were interested, but a quarter century later, neural networks based on these ideas were on over 3 billion devices such as smartphones, and used many billions of times per day, consuming a significant fraction of the world's compute.
Comment: 37 pages, 188 references, based on work of 4 Oct 201

arXiv.org e-Print Archive

Deep neural network representation and Generative Adversarial Learning

Author: Cheong Took Clive
Mandic Danilo
Palade Vasile
Ruiz-Garcia Ariel
Schmidhuber Jürgen
Publication venue: 'Elsevier BV'
Publication date: 09/03/2021

Royal Holloway - Pure

Quaternion generative adversarial networks

Author: Cicero Edoardo
Comminiello Danilo
Grassucci Eleonora
Publication venue: place:Cham
Publication date: 27/07/2021

Recent Generative Adversarial Networks (GANs) achieve outstanding results through large-scale training, employing models with millions of parameters that require extensive computational resources. Building such huge models undermines their replicability and increases training instability. Moreover, multi-channel data, such as images or audio, are usually processed by real-valued convolutional networks that flatten and concatenate the input, often losing intra-channel spatial relations. To address these issues of complexity and information loss, we propose a family of quaternion-valued generative adversarial networks (QGANs). QGANs exploit the properties of quaternion algebra, e.g., the Hamilton product, which allows channels to be processed as a single entity and captures internal latent relations while reducing the overall number of parameters by a factor of 4. We show how to design QGANs and how to extend the proposed approach even to advanced models. We compare the proposed QGANs with real-valued counterparts on several image generation benchmarks. Results show that QGANs obtain better FID scores than real-valued GANs and generate visually pleasing images. Furthermore, QGANs save up to 75% of the training parameters. We believe these results may pave the way to novel, more accessible GANs capable of improving performance and saving computational resources.
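The abstract does not spell out the Hamilton product or the parameter arithmetic, so the following is only a minimal sketch of both, not the authors' implementation. The function names are hypothetical, quaternions are represented as plain 4-tuples (a, b, c, d) standing for a + bi + cj + dk, and the parameter count assumes a bias-free fully connected layer whose real channel counts are multiples of 4.

```python
def hamilton_product(p, q):
    """Hamilton product of two quaternions given as (a, b, c, d) tuples."""
    a1, b1, c1, d1 = p
    a2, b2, c2, d2 = q
    return (
        a1 * a2 - b1 * b2 - c1 * c2 - d1 * d2,  # real part
        a1 * b2 + b1 * a2 + c1 * d2 - d1 * c2,  # i component
        a1 * c2 - b1 * d2 + c1 * a2 + d1 * b2,  # j component
        a1 * d2 + b1 * c2 - c1 * b2 + d1 * a2,  # k component
    )


def real_dense_params(n_in, n_out):
    """Weights in a bias-free real-valued dense layer."""
    return n_in * n_out


def quaternion_dense_params(n_in, n_out):
    """Weights in a bias-free quaternion dense layer (assumed layout).

    Each group of 4 real channels is treated as one quaternion, and each
    quaternion weight stores 4 real numbers.
    """
    if n_in % 4 or n_out % 4:
        raise ValueError("channel counts must be multiples of 4")
    return 4 * (n_in // 4) * (n_out // 4)


i = (0, 1, 0, 0)
j = (0, 0, 1, 0)
print(hamilton_product(i, j))  # i * j = k  -> (0, 0, 0, 1)
print(hamilton_product(j, i))  # j * i = -k -> (0, 0, 0, -1)
print(real_dense_params(8, 8), quaternion_dense_params(8, 8))  # 64 16
```

Under these assumptions the quaternion layer uses one quarter of the real-valued layer's weights, which is consistent with the factor-of-4 reduction and the "up to 75%" saving quoted in the abstract. The non-commutativity shown by the two products is what lets one quaternion weight mix all four channels of an input jointly.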

arXiv.org e-Print Archive

Archivio della ricerca- Università di Roma La Sapienza

Lower Dimensional Kernels for Video Discriminators: Lower-Dimensional Video Discriminators for Generative Adversarial Networks

Author: Kahembwe Emmanuel
Ramamoorthy Subramanian
Publication venue: 'Elsevier BV'
Publication date: 18/12/2019

This work presents an analysis of the discriminators used in Generative Adversarial Networks (GANs) for video. We show that unconstrained video discriminator architectures induce a loss surface with high curvature, which makes optimisation difficult. We also show that this curvature becomes more extreme as the maximal kernel dimension of video discriminators increases. With these observations in hand, we propose a family of efficient Lower-Dimensional Video Discriminators for GANs (LDVD GANs). The proposed family of discriminators improves the performance of the video GAN models it is applied to and demonstrates good performance on complex and diverse datasets such as UCF-101. In particular, we show that they can double the performance of Temporal-GANs and provide state-of-the-art performance on a single GPU.

arXiv.org e-Print Archive

Heriot Watt Pure

Edinburgh Research Explorer

Creativity as an information-based process

Author: De Pisapia Nicola
Rastelli Clara
Publication venue: Mimesis Edizioni
Publication date: 30/04/2022

Abstract: Creativity, mostly ignored in Western philosophy due to its supposed mysteriousness, has recently become a respected research topic in psychology, neuroscience, and artificial intelligence. We discuss how the scientific approach has mainly been to describe creativity as an information-based process, consistent with the computational view of the human mind that began with the cognitive revolution. This view has produced progressively more convincing models of creativity, up to current artificial neural network systems, which are only vaguely inspired by biological neural processing but already compete with human creativity in several fields. These successes suggest that creativity might not be an exclusively human function, but rather a way of functioning of any natural or artificial system that implements the creative process. We conclude by acknowledging that the information-based view of creativity has tremendous explanatory and generative power, but we propose a thought experiment to start discussing how it leaves out the experiential side of being creative.
Keywords: Creative Cognition; Cognitive Neuroscience; Computational Creativity; Generative Algorithms; Cognitive Science

Rivista Internazionale di Filosofia e Psicologia (Università degli Studi di Bari)

Characterizing Piecewise Linear Neural Networks

Author: Johansson Anton
Publication date: 01/01/2022

Neural networks utilizing piecewise linear transformations between layers have in many regards become the default network type across a wide range of applications. Their superior training dynamics and generalization performance, irrespective of the nature of the problem, have resulted in these networks achieving state-of-the-art results on a diverse set of tasks. Even though the efficacy of these networks has been established, there is a poor understanding of their intrinsic behaviour and properties. Little is known about how these functions evolve during training, how they behave at initialization, and how all of this relates to the architecture of the network. Exploring and detailing these properties is not only of theoretical interest; it can also aid in developing new schemes and algorithms to further improve the performance of the networks. In this thesis we thus seek to further explore and characterize these properties. We theoretically prove how the local properties of piecewise linear networks vary at initialization and explore empirically how more complex properties behave during training. We use these results to reason about which intrinsic properties are associated with generalization performance and to develop new regularization schemes. We further substantiate the empirical success of piecewise linear networks by showcasing how their application can solve two tasks relevant to the safety and effectiveness of processes in the automotive industry.
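The abstract does not give a concrete network, so the following is only a toy illustration of what "piecewise linear" means for a ReLU network, not a construction from the thesis. The two-unit network, its weights, and all function names are assumptions chosen so the arithmetic is easy to check by hand.

```python
def relu(z):
    return z if z > 0.0 else 0.0


# Hypothetical toy network: f(x) = relu(x) - 2 * relu(x - 1)
HIDDEN = [(1.0, 0.0), (1.0, -1.0)]  # (weight, bias) of each hidden unit
OUTPUT = [1.0, -2.0]                # output weight of each hidden unit


def forward(x):
    """Evaluate the one-hidden-layer ReLU network at a scalar input."""
    return sum(v * relu(w * x + b) for (w, b), v in zip(HIDDEN, OUTPUT))


def activation_pattern(x):
    """Which hidden units are active at x; constant within a linear region."""
    return tuple(w * x + b > 0.0 for w, b in HIDDEN)


def local_slope(x):
    """Slope of the network at x: sum of v * w over the active units."""
    return sum(v * w for (w, b), v in zip(HIDDEN, OUTPUT) if w * x + b > 0.0)


for x in (-2.0, 0.5, 2.0):
    print(x, activation_pattern(x), local_slope(x), forward(x))
```

The three sample inputs fall in three different linear regions (no unit active, one active, both active), with slopes 0, 1, and -1 respectively. The activation pattern identifies the region, and within a region the network is an ordinary affine function.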

Chalmers Research

Cooperative Coevolution for Non-Separable Large-Scale Black-Box Optimization: Convergence Analyses and Distributed Accelerations

Author: Duan Qiqi
Shao Chang
Shi Yuhui
Yang Haobin
Zhao Qi
Zhou Guochen
Publication date: 11/04/2023

Given the ubiquity of non-separable optimization problems in the real world, in this paper we analyze and extend the large-scale version of the well-known cooperative coevolution (CC), a divide-and-conquer optimization framework, on non-separable functions. First, we reveal empirical reasons why decomposition-based methods are or are not preferred in practice on some non-separable large-scale problems, reasons that have not been clearly pointed out in many previous CC papers. Then, we formalize CC as a continuous game model via simplification, without losing its essential property. Unlike previous evolutionary game theory for CC, our new model provides a much simpler but useful viewpoint for analyzing its convergence, since only the pure Nash equilibrium concept is needed and more general fitness landscapes can be explicitly considered. Based on the convergence analyses, we propose a hierarchical decomposition strategy for better generalization, since any decomposition carries a risk of getting trapped in a suboptimal Nash equilibrium. Finally, we use powerful distributed computing to accelerate it under the multi-level learning framework, which combines the fine-tuning ability of decomposition with the invariance property of CMA-ES. Experiments on a set of high-dimensional functions validate both its search performance and its scalability (w.r.t. CPU cores) on a cluster computing platform with 400 CPU cores.
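The abstract describes cooperative coevolution only at a high level, so the following is a minimal sketch of the generic divide-and-conquer loop, not the paper's algorithm. It assumes a fixed contiguous decomposition, uses greedy Gaussian perturbation as the per-group optimizer in place of CMA-ES, and picks the Rosenbrock function as an example of a non-separable objective. All names and parameter values are hypothetical.

```python
import random


def rosenbrock(x):
    """Non-separable test function: neighbouring variables interact."""
    return sum(
        100.0 * (x[i + 1] - x[i] ** 2) ** 2 + (1.0 - x[i]) ** 2
        for i in range(len(x) - 1)
    )


def cooperative_coevolution(f, dim, group_size, cycles, seed=0):
    """Optimize f by improving one group of variables at a time.

    The other variables are held fixed in a shared context vector, so each
    group's sub-search only sees the full objective through that context.
    """
    rng = random.Random(seed)
    context = [rng.uniform(-2.0, 2.0) for _ in range(dim)]
    start = f(context)
    best = start
    groups = [
        list(range(i, min(i + group_size, dim)))
        for i in range(0, dim, group_size)
    ]
    for _ in range(cycles):
        for group in groups:
            for _ in range(20):  # small per-group search budget (assumed)
                candidate = list(context)
                for i in group:
                    candidate[i] += rng.gauss(0.0, 0.1)
                value = f(candidate)
                if value < best:  # greedy acceptance: never gets worse
                    context, best = candidate, value
    return context, start, best


solution, start, best = cooperative_coevolution(rosenbrock, 8, 2, 50)
print(start, best)
```

A point where no single group can lower the objective by changing only its own variables is a stopping point for this loop even if it is not a global optimum. That is the kind of situation the abstract describes as getting trapped in a suboptimal Nash equilibrium, and it depends on how the variables are grouped.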

arXiv.org e-Print Archive