Deep learning for video game playing
In this article, we review recent Deep Learning advances in the context of
how they have been applied to play different types of video games such as
first-person shooters, arcade games, and real-time strategy games. We analyze
the unique requirements that different game genres pose to a deep learning
system and highlight important open challenges in the context of applying these
machine learning methods to video games, such as general game playing, dealing
with extremely large decision spaces and sparse rewards.
Speech Processing in Computer Vision Applications
Deep learning has recently been proven to be a viable asset in determining features in the field of speech analysis. Deep learning methods like convolutional neural networks facilitate the expansion of specific feature information in waveforms, allowing networks to create more feature-dense representations of data. Our work attempts to address the problems of re-creating a face given a speaker's voice and of speaker identification using deep learning methods. In this work, we first review the fundamental background in speech processing and its related applications. Then we introduce novel deep learning-based methods for speech feature analysis. Finally, we present our deep learning approaches to speaker identification and speech-to-face synthesis. The presented method can convert a speaker audio sample to an image of their predicted face. This framework is composed of several networks chained together, each performing an essential step in the conversion process: audio embedding, encoding, and face generation networks, respectively. Our experiments show that certain features can map to the face, that given a speaker's voice DNNs can create their face, and that a GUI can be used in conjunction to display a speaker recognition network's data.
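The chained pipeline described in the abstract (audio embedding, then encoding, then face generation) can be sketched as three composed stages. All dimensions, weight matrices and nonlinearities below are hypothetical stand-ins for illustration, not the paper's actual networks:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical dimensions; the paper's actual sizes are not specified here.
AUDIO_DIM, EMBED_DIM, LATENT_DIM, IMG_PIXELS = 128, 64, 32, 16 * 16

# Each stage is stood in for by a random linear map with a nonlinearity;
# in the real framework these are trained deep networks.
W_embed = rng.standard_normal((AUDIO_DIM, EMBED_DIM)) * 0.1
W_encode = rng.standard_normal((EMBED_DIM, LATENT_DIM)) * 0.1
W_generate = rng.standard_normal((LATENT_DIM, IMG_PIXELS)) * 0.1

def audio_embedding(waveform_features):
    """Map raw audio features to a speaker embedding."""
    return np.tanh(waveform_features @ W_embed)

def encoder(embedding):
    """Compress the speaker embedding into a face latent code."""
    return np.tanh(embedding @ W_encode)

def face_generator(latent):
    """Decode the latent code into a (flattened) face image in (0, 1)."""
    return 1.0 / (1.0 + np.exp(-(latent @ W_generate)))

# Chain the three stages, mirroring the audio embedding -> encoding ->
# face generation structure described in the abstract.
audio = rng.standard_normal(AUDIO_DIM)
face = face_generator(encoder(audio_embedding(audio)))
print(face.shape)  # (256,)
```

The point of the sketch is only the composition: each stage consumes the previous stage's output, so the whole speech-to-face conversion is one function chain.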
WESPE: Weakly Supervised Photo Enhancer for Digital Cameras
Low-end and compact mobile cameras demonstrate limited photo quality mainly
due to space, hardware and budget constraints. In this work, we propose a deep
learning solution that translates photos taken by cameras with limited
capabilities into DSLR-quality photos automatically. We tackle this problem by
introducing a weakly supervised photo enhancer (WESPE) - a novel image-to-image
Generative Adversarial Network-based architecture. The proposed model is
trained under weak supervision: unlike previous works, there is no need for
strong supervision in the form of a large annotated dataset of aligned
original/enhanced photo pairs. The sole requirement is two distinct datasets:
one from the source camera, and one composed of arbitrary high-quality images
that can be generally crawled from the Internet - the visual content they
exhibit may be unrelated. Hence, our solution is repeatable for any camera:
collecting the data and training can be achieved in a couple of hours. In this
work, we emphasize extensive evaluation of the obtained results. Besides
standard objective metrics and subjective user study, we train a virtual rater
in the form of a separate CNN that mimics human raters on Flickr data and use
this network to get reference scores for both original and enhanced photos. Our
experiments on the DPED, KITTI and Cityscapes datasets as well as pictures from
several generations of smartphones demonstrate that WESPE produces
qualitative results comparable to or better than those of state-of-the-art
strongly supervised methods.
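The weak-supervision setup above (two unpaired datasets and an adversarial loss, with no aligned photo pairs) can be illustrated with a toy sketch. The linear "discriminator", the feature vectors and the identity "enhancer" below are placeholders, not WESPE's actual networks:

```python
import numpy as np

rng = np.random.default_rng(1)

# Two unpaired datasets, as in the weak-supervision setup: source-camera
# photos and arbitrary high-quality photos with unrelated content.
source_photos = rng.random((100, 8))        # stand-in features, not real images
high_quality_photos = rng.random((120, 8))  # different size: no pairing assumed

def sample_unpaired_batch(batch_size=4):
    """Draw independent batches; the indices are unrelated, so no aligned
    original/enhanced pairs are ever needed."""
    src = source_photos[rng.integers(0, len(source_photos), batch_size)]
    hq = high_quality_photos[rng.integers(0, len(high_quality_photos), batch_size)]
    return src, hq

def discriminator(x, w):
    """Toy linear discriminator returning 'realness' probabilities."""
    return 1.0 / (1.0 + np.exp(-x @ w))

w = rng.standard_normal(8) * 0.1
src, hq = sample_unpaired_batch()
enhanced = src  # the (untrained) enhancer is the identity map here

# Standard GAN discriminator loss: high-quality photos labelled real,
# enhancer outputs labelled fake -- no aligned photo pairs required.
eps = 1e-8
d_loss = -np.mean(np.log(discriminator(hq, w) + eps)) \
         - np.mean(np.log(1.0 - discriminator(enhanced, w) + eps))
print(d_loss > 0.0)
```

The key property the sketch shows is that the loss only ever compares distributions of batches, so the two datasets can be collected independently.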
A theory of relation learning and cross-domain generalization
People readily generalize knowledge to novel domains and stimuli. We present
a theory, instantiated in a computational model, based on the idea that
cross-domain generalization in humans is a case of analogical inference over
structured (i.e., symbolic) relational representations. The model is an
extension of the LISA and DORA models of relational inference and learning. The
resulting model learns both the content and format (i.e., structure) of
relational representations from non-relational inputs without supervision; when
augmented with the capacity for reinforcement learning, it leverages these
representations to learn individual domains, and then generalizes to new
domains on the first exposure (i.e., zero-shot learning) via analogical
inference. We demonstrate the capacity of the model to learn structured
relational representations from a variety of simple visual stimuli, and to
perform cross-domain generalization between video games (Breakout and Pong) and
between several psychological tasks. We demonstrate that the model's trajectory
closely mirrors the trajectory of children as they learn about relations,
accounting for phenomena from the literature on the development of children's
reasoning and analogy making. The model's ability to generalize between domains
demonstrates the flexibility afforded by representing domains in terms of their
underlying relational structure, rather than simply in terms of the statistical
relations between their inputs and outputs.
Comment: Includes supplemental material
Multi-Agent Actor-Critic with Hierarchical Graph Attention Network
Most previous studies on multi-agent reinforcement learning focus on deriving
decentralized and cooperative policies to maximize a common reward and rarely
consider the transferability of trained policies to new tasks. This prevents
such policies from being applied to more complex multi-agent tasks. To resolve
these limitations, we propose a model that conducts both representation
learning for multiple agents using hierarchical graph attention network and
policy learning using multi-agent actor-critic. The hierarchical graph
attention network is specially designed to model the hierarchical relationships
among multiple agents that either cooperate or compete with each other to
derive more advanced strategic policies. Two attention networks, the
inter-agent and inter-group attention layers, are used to effectively model
individual and group level interactions, respectively. The two attention
networks have been proven to facilitate the transfer of learned policies to new
tasks with different agent compositions and allow one to interpret the learned
strategies. Empirically, we demonstrate that the proposed model outperforms
existing methods in several mixed cooperative and competitive tasks.
Comment: Accepted as a conference paper at the Thirty-Fourth AAAI Conference
on Artificial Intelligence (AAAI-20), New York, USA
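The two-level attention described above (inter-agent within a group, then inter-group over group summaries) can be sketched with plain scaled dot-product attention. The embeddings, the group split and the single-head pooling are illustrative assumptions, not the paper's exact architecture:

```python
import numpy as np

rng = np.random.default_rng(2)

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attention_pool(query, keys):
    """Scaled dot-product attention of one query over a set of vectors
    used as both keys and values; returns the pooled vector and weights."""
    scores = keys @ query / np.sqrt(len(query))
    weights = softmax(scores)
    return weights @ keys, weights

DIM = 4
# Two groups of agents (e.g. cooperating teams); each agent is an embedding.
groups = [rng.standard_normal((3, DIM)), rng.standard_normal((2, DIM))]
query = rng.standard_normal(DIM)  # the focal agent's embedding

# Inter-agent attention: summarize each group from the focal agent's view.
group_summaries = np.stack([attention_pool(query, g)[0] for g in groups])

# Inter-group attention: attend over the group summaries.
state, group_weights = attention_pool(query, group_summaries)
print(state.shape, round(group_weights.sum(), 6))  # (4,) 1.0
```

Because the pooled state has a fixed dimension regardless of how many agents or groups exist, the same policy can in principle be reused when the agent composition changes, which is the transfer property the abstract highlights.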
Attention is more than prediction precision [Commentary on target article]
A cornerstone of the target article is that, in a predictive coding framework, attention can be modelled by weighting prediction error with a measure of precision. We argue that this is not a complete explanation, especially in light of ERP (event-related potential) data showing large evoked responses for frequently presented target stimuli, which are thus predicted.
Computational models of the human visual cortex: on individual differences and ecologically valid input statistics
Perception relies on cortical processes in response to sensory stimuli. Visual input entering the
eyes ascends a cascade of processing steps from the retina to high-level regions of the cortex.
Vision science investigates these transformations that give rise to high-level processing of
visual objects, such as object recognition. In this thesis I investigate computational models
of the human visual cortex with regard to their ability to predict cortical responses to visual
objects. In particular, I describe two factors playing an important role in using deep neural
networks (DNNs) to better understand cortical functioning: the initial weight state and
ecologically more valid input statistics.
In Chapter 1 of this thesis I will introduce relevant literature pertaining to deep neural
networks as a modeling framework for the visual cortex. Next, I will lay out the motivation
for the research questions investigated in this thesis and described in detail in Chapters 2, 3,
and 4.
Chapter 2 focuses on the impact of the initial weight state of a model on its ability
to predict cortical representations. I describe work in which we demonstrate that two
DNN instances identical in every aspect but their initial weights, yield very dissimilar
representations. Relying on single network instances to predict cortical activation patterns
in response to sensory stimuli poses a problem for computational neuroscience: depending
on the initial set of weights the ability to mirror the cortical representations of these stimuli
might vary. Thus, results based on single ("off-the-shelf") model instances - as commonly
used in computational neuroscience - may not generalize. In contrast, using multiple DNN
instances might alleviate this problem as they allow insights in the variability of a given
model architecture to predict cortical representations. These individual differences between
model instances suggest that, to allow results to generalize more easily, the model instances
should be treated similarly to human experimental participants.
In Chapter 3 I focus on ecologically more valid input statistics (in the form of training
images) aiming to improve a model's ability to predict cortical representations. The most
successful models of the human visual cortex to date are DNNs trained on object recognition
tasks designed with machine learning goals in mind. However, the image sets used for training
these DNNs are often not ecologically realistic. For example, training on the most-widely used image set in computational neuroscience (ImageNet Large Scale Visual Recognition
Challenge (ILSVRC) 2012) requires the fine-grained distinction of 120 dog breeds, but does
not contain visual object categories encountered frequently in everyday human life (e.g.
woman, man, or child). This suggests that taking into account the human visual experience
when training models of the human visual cortex on a categorization task might help to
predict cortical representations. In this Chapter I describe the creation of a set of images
aimed at mimicking the human visual diet: ecoset. Ecoset contains more than 1.5 million
images from 565 basic level categories and is the largest image set specifically designed for
computational neuroscience to date. Ecoset is freely available to allow the community to test
their own hypotheses of models trained with input statistics matched to the human visual
environment.
In Chapter 4 we build on the results from the previous two Chapters. Using multiple
DNN instances I investigate whether a brain-inspired model architecture (vNet) trained on
ecologically more valid input statistics (ecoset) might improve its ability to predict cortical
representations. I first demonstrate that ecoset might improve an architecture's ability to
mirror cortical representations. Furthermore, ecoset-trained vNet also outperforms
state-of-the-art computer vision and computational neuroscience models in terms of mirroring cortical
representations in the human brain. Thus, incorporating biological and ecological aspects,
such as brain-inspired architectural features and ecologically more valid input statistics, into
computational models may yield better predictions of response patterns in the human visual
cortex.
Treating DNN instances similar to human experimental participants and considering
ecological and biological factors for building these DNNs may be an important step towards
better models of the human visual cortex. Such models might allow a better understanding of
the cortical processes underlying high-level vision in the human brain.
Cambridge Trust - Vice Chancellor's Award 2015
Cambridge Philosophical Society
MRC Cognition and Brain Sciences Unit
Object detection and recognition with event driven cameras
This thesis presents the study, analysis and implementation of algorithms
to perform object detection and recognition using an event-based camera.
This sensor represents a novel paradigm which opens a wide range
of possibilities for future developments of computer vision. In particular,
it produces a fast, compressed, illumination-invariant output, which can be
exploited for robotic tasks, where fast dynamics and significant illumination
changes are frequent. The experiments are carried out on the neuromorphic
version of the iCub humanoid platform. The robot is equipped with a novel
dual camera setup mounted directly in the robot's eyes, used to generate
data with a moving camera. The motion causes the presence of background
clutter in the event stream.
In such a scenario, the detection problem has been addressed with an
attention mechanism, specifically designed to respond to the presence of
objects while discarding clutter. The proposed implementation takes
advantage of the nature of the data to simplify the original proto-object
saliency model which inspired this work.
Subsequently, the recognition task was first tackled with a feasibility
study to demonstrate that the event stream carries sufficient information
to classify objects, and then with the implementation of a spiking
neural network. The feasibility study provides the proof of concept
that events are informative enough in the context of object classification,
whereas the spiking implementation improves the results by
employing an architecture specifically designed to process event data.
The spiking network was trained with a three-factor local learning rule
which overcomes the weight transport, update locking and non-locality
problems.
The presented results prove that both detection and classification can
be carried out in the target application using the event data.
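Three-factor local rules of the kind mentioned above are commonly written as delta_w = lr * m * pre * post, where m is a global modulatory signal; because each weight update uses only locally available activity plus that broadcast factor, no backpropagated error (and hence no weight transport) is needed. The following is a generic sketch of that form under assumed trace values, not the thesis's specific rule:

```python
import numpy as np

rng = np.random.default_rng(4)

N_PRE, N_POST = 6, 3
w = rng.standard_normal((N_PRE, N_POST)) * 0.1

def three_factor_update(w, pre, post, modulator, lr=0.01):
    """Local three-factor rule: each weight changes based on its own pre-
    and postsynaptic activity, gated by a global third factor, so no
    backpropagated per-weight error signal is required."""
    return w + lr * modulator * np.outer(pre, post)

pre = rng.random(N_PRE)    # presynaptic spike traces (hypothetical)
post = rng.random(N_POST)  # postsynaptic spike traces (hypothetical)
modulator = 1.0            # e.g. a scalar reward / teaching signal

w_new = three_factor_update(w, pre, post, modulator)
print(w_new.shape)  # (6, 3)
```

The update can also be applied immediately and independently per synapse, which is why such rules avoid the update-locking problem of end-to-end backpropagation.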
Top-Down Selection in Convolutional Neural Networks
Feedforward information processing fills the role of hierarchical feature encoding, transformation, reduction, and abstraction in a bottom-up manner. This paradigm of information processing is sufficient for task requirements that are satisfied in the one-shot rapid traversal of sensory information through the visual hierarchy. However, some tasks demand higher-order information processing using short-term recurrent, long-range feedback, or other processes. The predictive, corrective, and modulatory information processing in top-down fashion complement the feedforward pass to fulfill many complex task requirements. Convolutional neural networks have recently been successful in addressing some aspects of the feedforward processing. However, the role of top-down processing in such models has not yet been fully understood. We propose a top-down selection framework for convolutional neural networks to address the selective and modulatory nature of top-down processing in vision systems. We examine various aspects of the proposed model in different experimental settings such as object localization, object segmentation, task priming, compact neural representation, and contextual interference reduction. We test the hypothesis that the proposed approach is capable of accomplishing hierarchical feature localization according to task cuing. Additionally, feature modulation using the proposed approach is tested for demanding tasks such as segmentation and iterative parameter fine-tuning. Moreover, the top-down attentional traces are harnessed to enable a more compact neural representation. The experimental achievements support the practical complementary role of the top-down selection mechanisms to the bottom-up feature encoding routines
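The selective, modulatory character of top-down processing described above can be caricatured as keeping only the task-relevant channels of a feedforward feature map. The relevance score, the task cue and the hard top-k gating below are illustrative assumptions, not the paper's proposed selection mechanism:

```python
import numpy as np

rng = np.random.default_rng(3)

# A tiny stand-in "feature hierarchy": 8 channels of 5x5 activations.
features = rng.random((8, 5, 5))

def topdown_select(feature_maps, task_weights, k=2):
    """Keep only the k channels most relevant to the task cue and zero out
    the rest -- a crude stand-in for top-down selection / task priming."""
    relevance = feature_maps.sum(axis=(1, 2)) * task_weights
    keep = np.argsort(relevance)[-k:]
    gated = np.zeros_like(feature_maps)
    gated[keep] = feature_maps[keep]
    return gated, keep

task_cue = rng.random(8)  # hypothetical task-priming weight per channel
gated, kept = topdown_select(features, task_cue)

# Only the selected channels survive the top-down pass; everything else
# is suppressed, which also yields a more compact representation.
print(int((gated.sum(axis=(1, 2)) > 0).sum()))  # 2
```

Applied layer by layer from the output back to the input, such a gating trace is one way to localize which features drove a decision, which is the spirit of the localization experiments the abstract mentions.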
Machine Learning Applications for Load Predictions in Electrical Energy Network
In this work, collected operational data of typical urban and rural energy networks are analysed for predictions of energy consumption, as well as for a selected region of the Nordpool electricity market. Regression techniques are systematically investigated for electrical energy prediction and for correlating other impacting parameters. k-Nearest Neighbour (kNN), Random Forest (RF) and Linear Regression (LR) are analysed and evaluated using both a continuous and a vertical time approach. It is observed that for 30-minute predictions RF regression gives the best results, shown by a mean absolute percentage error (MAPE) in the range of 1-2 %. kNN shows the best results for day-ahead forecasting, with a MAPE of 2.61 %. The presented vertical time approach outperforms the continuous time approach. To enhance the pre-processing stage, refined techniques from the domains of statistics and time series analysis are adopted in the modelling. Reducing the dimensionality through principal component analysis (PCA) improves the predictive performance of Recurrent Neural Networks (RNN). In the case of Gated Recurrent Unit (GRU) networks, the results for all seasons are improved through PCA. This work also considers abnormal operation due to various causes (e.g. random effects, intrusion, abnormal operation of smart devices, cyber-threats, etc.). From the results of kNN, iForest and Local Outlier Factor (LOF) on urban-area and rural-region data, it is observed that the anomaly detection differs between the scenarios. For the rural region, most of the anomalies occur late in the timeline, concentrated in the last year of the collected data; for the urban area, the anomalies are spread out over the entire timeline. The frequency of detected anomalies was considerably higher for the rural-area load demand than for the urban-area load demand.
From the considered case scenarios, it is observed that the incidents of detected anomalies are driven more by the data than by exceptions in the algorithms. With domain knowledge of smart energy systems, LOF is able to detect observations that could not have been detected by visual inspection alone, in contrast to kNN and iForest. Whereas kNN and iForest exclude an upper and a lower bound, LOF is density based and separates out anomalies amidst the data. LOF's capability to identify anomalies amidst the data, together with deep domain knowledge, is an advantage when detecting anomalies in smart meter data. This work has shown that instance-based models can compete with models of higher complexity, yet some pre-processing methods (such as circular coding) do not function for an instance-based learner such as k-Nearest Neighbour, and hence kNN cannot benefit from this kind of complexity even in the feature engineering of the model. It will be interesting for future work on electrical load forecasting to develop solutions that combine high complexity in the feature engineering with the explainability of instance-based models.
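The MAPE values used above to rank the forecasters (1-2 % for RF at 30 minutes, 2.61 % for kNN day-ahead) come from a metric that is simple to state; below is a minimal sketch with hypothetical half-hourly load values, not data from the study:

```python
import numpy as np

def mape(actual, predicted):
    """Mean absolute percentage error, the metric used to rank the models."""
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    return 100.0 * np.mean(np.abs((actual - predicted) / actual))

# Hypothetical half-hourly load values (MW) and two competing forecasts.
actual = np.array([52.0, 48.0, 50.0, 55.0])
forecast_a = np.array([51.5, 48.5, 49.5, 54.0])  # tight forecast
forecast_b = np.array([50.0, 50.0, 48.0, 58.0])  # looser forecast

print(round(mape(actual, forecast_a), 2))  # 1.21
print(mape(actual, forecast_a) < mape(actual, forecast_b))  # True
```

Because the error is expressed as a percentage of the actual load, MAPE lets forecasts at different load levels (urban vs rural, winter vs summer) be compared on one scale, which is what makes it a natural ranking metric here.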
- …