
455 research outputs found

Learning visual representations with neural networks for video captioning and image generation

Author: Yao Li
Publication date: 01/12/2017

The past decade has been a golden era of neural network research. Not only have neural networks been successfully applied to increasingly challenging real-world problems, but they have also become the dominant approach in many of the areas where they have been tested. These areas include language understanding, game playing, and computer vision, thanks to neural networks’ computational efficiency and statistical capacity. This thesis applies neural networks to problems in computer vision where high-level, semantically meaningful representations play a fundamental role. It demonstrates, both in theory and in experiment, the ability to learn such representations from data with and without supervision. The main content of the thesis is divided into two parts. The first part studies neural networks in the context of learning visual representations for the task of video captioning. Models are developed to dynamically focus on different frames while generating a natural language description of a short video. Such a model is further improved by recurrent convolutional operations. The end of this part identifies fundamental challenges in video captioning and proposes a new type of evaluation metric that may be used experimentally as an oracle to benchmark performance. The second part studies a family of models that generate images. While the first part is supervised, this part is unsupervised. Its focus is the popular family of Neural Autoregressive Density Estimators (NADEs), a tractable probabilistic model for natural images. This work first makes a connection between NADEs and Generative Stochastic Networks (GSNs). The standard NADE is then improved by introducing multiple iterations in its inference without increasing the number of parameters; the result is dubbed iterative NADE. Beginning with a historical overview, this work ends with a summary of recent developments related to the work discussed in the two main parts, around the central topic of learning visual representations for images and videos. A bright future is envisioned at the end.
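The idea of dynamically focusing on different frames, mentioned in the abstract above, is commonly realised as soft attention: each frame receives a relevance score, the scores are normalised with a softmax, and the frame features are averaged with those weights. The following is a minimal sketch under that assumption, not the thesis's actual model; the function name `soft_attention`, the toy features, and the scores are all hypothetical.

```python
import math


def soft_attention(frame_features, scores):
    """Combine per-frame feature vectors using softmax-normalised scores.

    frame_features: list of equal-length feature vectors, one per frame.
    scores: one relevance score per frame (hypothetical; in a real model
            these would be produced by a learned scoring network).
    Returns (weights, context), where context is the weighted average.
    """
    # Subtract the maximum score for numerical stability before exponentiating.
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]

    # Weighted sum of the frame features, one dimension at a time.
    dim = len(frame_features[0])
    context = [
        sum(w * f[d] for w, f in zip(weights, frame_features))
        for d in range(dim)
    ]
    return weights, context


# Toy example: three frames with 2-dimensional features and equal scores,
# so every frame receives the same weight.
frames = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
scores = [0.0, 0.0, 0.0]
weights, context = soft_attention(frames, scores)
```

In a captioning model the scores would typically be recomputed at every generated word, so the weighting over frames changes as the description is produced.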

Dépôt Institutionnel Numérique

Energy efficient enabling technologies for semantic video processing on mobile devices

Author: Larkin Daniel
Publication venue: Dublin City University. Centre for Digital Video Processing (CDVP)
Publication date: 01/11/2008

Semantic object-based processing will play an increasingly important role in future multimedia systems due to the ubiquity of digital multimedia capture/playback technologies and increasing storage capacity. Although the object-based paradigm has many undeniable benefits, numerous technical challenges remain before such applications become pervasive, particularly on computationally constrained mobile devices. A fundamental issue is the ill-posed problem of semantic object segmentation. Furthermore, on battery-powered mobile computing devices, the additional algorithmic complexity of semantic object-based processing compared to conventional video processing is highly undesirable from both a real-time operation and a battery life perspective. This thesis attempts to tackle these issues by first constraining the solution space and focusing on the human face as a primary semantic concept of use to users of mobile devices. A novel face detection algorithm is proposed, which was designed from the outset to be amenable to offloading from the host microprocessor to dedicated hardware, thereby providing real-time performance and reducing power consumption. The algorithm uses an Artificial Neural Network (ANN), whose topology and weights are evolved via a genetic algorithm (GA). The computational burden of the ANN evaluation is offloaded to a dedicated hardware accelerator, which is capable of processing any evolved network topology. Efficient arithmetic circuitry, which leverages modified Booth recoding, column compressors and carry-save adders, is adopted throughout the design. To tackle the increased computational costs associated with object tracking or object-based shape encoding, a novel energy-efficient binary motion estimation architecture is proposed. Energy is reduced in the proposed motion estimation architecture by minimising the redundant operations inherent in the binary data. Both architectures are shown to compare favourably with the relevant prior art.
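As background for the modified Booth recoding mentioned in the abstract above, the sketch below shows the textbook radix-4 form of the technique in software. It rewrites an n-bit two's-complement multiplier as n/2 digits drawn from {-2, -1, 0, 1, 2}, which halves the number of partial products a multiplier has to sum. This is an illustration only and makes no claim about the thesis's circuit; the function name `booth_radix4_digits` is hypothetical.

```python
def booth_radix4_digits(y, n_bits):
    """Return the radix-4 Booth digits of y, least significant digit first.

    y is interpreted as an n_bits-wide two's-complement integer, and
    n_bits must be even. Each digit is
        d_i = -2*b[2i+1] + b[2i] + b[2i-1],  with b[-1] = 0,
    so that y == sum(d_i * 4**i).
    """
    if n_bits % 2 != 0:
        raise ValueError("n_bits must be even")
    if not -(1 << (n_bits - 1)) <= y < (1 << (n_bits - 1)):
        raise ValueError("y does not fit in n_bits two's-complement bits")

    # Python's >> on negative ints is arithmetic, so this yields the
    # two's-complement bit pattern of y.
    bits = [(y >> i) & 1 for i in range(n_bits)]

    digits = []
    prev = 0  # the implicit bit b[-1]
    for i in range(0, n_bits, 2):
        digits.append(-2 * bits[i + 1] + bits[i] + prev)
        prev = bits[i + 1]
    return digits


# 7 as a 4-bit value recodes to -1 + 2*4.
digits = booth_radix4_digits(7, 4)  # → [-1, 2]
```

Each digit selects 0, ±x, or ±2x as a partial product, all of which are cheap to form in hardware with shifts and negation.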

DCU Online Research Access Service

Energy load forecast in smart buildings with deep learning techniques

Author: Barón García Alejandro
Publication date: 01/01/2020

Predicting energy load is a problem of growing importance. Knowing in advance how electricity consumption will behave is key to resource management. Especially interesting is the case of so-called Smart Buildings, which arise from the trend towards sustainable development and consumption, a trend that is increasingly in vogue and is becoming mandatory by law in many countries. Models based on Deep Learning constitute an important part of the state of the art. These models have recently brought great advances in Artificial Intelligence: although they originated in the 20th century, only in the last 10 years or so have they re-emerged, thanks to computational advances that allow them to be trained by the general public. This Final Degree Project presents advanced Deep Learning techniques applied to the problem of load prediction in Smart Buildings, based mainly on data from the Alice Perry building of the National University of Ireland Galway, in collaboration with the Informatics Research Unit for Sustainable Engineering of the same university. The datasets were obtained from the time series of aggregated electricity consumption of the air handling units (AHUs) in the Alice Perry building. Historical weather data were also collected from the weather station in the same building in order to study whether these climatic variables improve the models' predictions. Time series prediction on this energy load data is performed in two different ways, both with hourly granularity: one-step prediction, in which the previous observations are used to estimate the load in the next hour, and sequence prediction, which attempts to predict the behaviour of the series over the next several hours from the previous values.

Grado en Ingeniería Informátic
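The two prediction modes described in the abstract above differ only in how many future values each training example targets. A minimal sketch of the usual sliding-window framing follows; the helper name `make_windows`, the window length, and the toy load values are assumptions for illustration and are not taken from the project.

```python
def make_windows(series, n_lags, horizon):
    """Turn a series into (past, future) training pairs.

    Each pair holds n_lags consecutive observations as input and the
    following `horizon` observations as the target. horizon=1 gives
    one-step prediction; horizon>1 gives sequence prediction.
    """
    pairs = []
    last_start = len(series) - n_lags - horizon
    for start in range(last_start + 1):
        past = series[start:start + n_lags]
        future = series[start + n_lags:start + n_lags + horizon]
        pairs.append((past, future))
    return pairs


# Hypothetical hourly load values (units unspecified).
hourly_load = [10.0, 12.0, 11.0, 13.0, 15.0, 14.0]

# One-step prediction: three past hours -> the next hour.
one_step = make_windows(hourly_load, n_lags=3, horizon=1)

# Sequence prediction: three past hours -> the next two hours.
sequence = make_windows(hourly_load, n_lags=3, horizon=2)
```

Exogenous inputs such as weather readings could be added by making each observation a tuple of values instead of a single number; the windowing logic stays the same.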

Repositorio Documental de la Universidad de Valladolid

Modeling and Optimization of Active Distribution Network Operation Based on Deep Learning

Author: Liao Wenlong
Publication venue: Aalborg Universitetsforlag
Publication date: 01/01/2023

VBN

Applications of Machine Learning to Optimizing Polyolefin Manufacturing

Author: Liu Y. A.; Sharma Niket
Publication date: 18/01/2024

This chapter is a preprint from our book by , focusing on leveraging machine learning (ML) in chemical and polyolefin manufacturing optimization. It's crafted for both novices and seasoned professionals keen on the latest ML applications in chemical processes. We trace the evolution of AI and ML in chemical industries, delineate core ML components, and provide resources for ML beginners. A detailed discussion on various ML methods is presented, covering regression, classification, and unsupervised learning techniques, with performance metrics and examples. Ensemble methods, deep learning networks, including MLP, DNNs, RNNs, CNNs, and transformers, are explored for their growing role in chemical applications. Practical workshops guide readers through predictive modeling using advanced ML algorithms. The chapter culminates with insights into science-guided ML, advocating for a hybrid approach that enhances model accuracy. The extensive bibliography offers resources for further research and practical implementation. This chapter aims to be a thorough primer on ML's practical application in chemical engineering, particularly for polyolefin production, and sets the stage for continued learning in subsequent chapters. Please cite the original work [169,170] when referencing

arXiv.org e-Print Archive

Data analytics for mobile traffic in 5G networks using machine learning techniques

Author: Trinh Hoang Duy
Publication venue: Universitat Politècnica de Catalunya
Publication date: 10/06/2020

This thesis collects the research I pursued as a Ph.D. candidate at the Universitat Politècnica de Catalunya (UPC). Most of the work was carried out at the Mobile Network Department of the Centre Tecnològic de Telecomunicacions de Catalunya (CTTC). The main topic of my research is the study of mobile network traffic through the analysis of datasets from operational networks using machine learning techniques. Understanding actual network deployments is a fundamental first step towards improving performance and Quality of Service (QoS) for users of next-generation (5G) networks. The work starts with the collection of a novel type of dataset, using an over-the-air monitoring tool that extracts control information from the radio-link channel without compromising users’ identities. The subsequent analysis comprises a statistical characterization of the traffic and the derivation of prediction models for the network traffic. A wide range of algorithms is implemented and compared in order to identify the best-performing ones. Moreover, the thesis addresses a set of applications in the context of mobile networks that are of key importance for future mobile networks. These include the detection of urban anomalies, the classification of users based on the network services they demand, and the design of a proactive wake-up scheme for energy-efficient devices.

Postprint (published version

UPCommons. Portal del coneixement obert de la UPC