    A two-stage framework for short-term wind power forecasting using different feature-learning models

    With the growing dependence on wind power generation, improving the accuracy of short-term forecasting has become increasingly important for ensuring economical and reliable system operation. Ensemble-based forecasting models have been studied extensively in the wind power forecasting field; however, few have considered learning features from both historical wind data and numerical weather prediction (NWP) data. In addition, multiple-input and multiple-output learning structures remain largely unexplored in the wind power forecasting literature. This study therefore uses NWP and historical wind data as input and proposes a two-stage forecasting framework built on a moving-window algorithm. In the first stage, four forecasting models are constructed with deep neural networks, covering the multiple-input and multiple-output structures; in the second stage, an ensemble model is developed using ridge regression to reduce the extrapolation error. Experiments are conducted on three existing wind farms, examining the 2-h-ahead forecasting point. The results demonstrate that 1) the single-input-multiple-output (SIMO) structure yields better forecasting accuracy than the other three structures; 2) ridge regression produces a better ensemble model, further improving forecasting accuracy, than the other machine learning methods tested; 3) the proposed two-stage forecasting framework is likely to generate more accurate and stable results than other existing algorithms.
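The second stage described above (a ridge regression ensemble refit over a moving window) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names, the window length, and the penalty `alpha` are hypothetical, the first-stage forecasts are assumed to be already available as columns of an array, and the sketch ignores the delay with which observed power becomes available for a 2-h-ahead forecast.

```python
import numpy as np


def fit_ridge_weights(base_preds, target, alpha=1.0):
    """Closed-form ridge weights (no intercept): w = (P^T P + alpha I)^-1 P^T y."""
    P = np.asarray(base_preds, dtype=float)
    y = np.asarray(target, dtype=float)
    gram = P.T @ P + alpha * np.eye(P.shape[1])
    return np.linalg.solve(gram, P.T @ y)


def moving_window_ensemble(base_preds, target, window=48, alpha=1.0):
    """Combine base forecasts at step t with weights fitted on steps [t - window, t).

    base_preds: array of shape (n_steps, n_models), one column per first-stage model.
    target:     array of shape (n_steps,), the observed power.
    Returns an array of shape (n_steps,); the first `window` entries are NaN
    because no full training window exists for them.
    """
    P = np.asarray(base_preds, dtype=float)
    y = np.asarray(target, dtype=float)
    combined = np.full(len(y), np.nan)
    for t in range(window, len(y)):
        # Refit the ensemble weights on the most recent window only.
        w = fit_ridge_weights(P[t - window:t], y[t - window:t], alpha)
        combined[t] = P[t] @ w
    return combined
```

Refitting on a short recent window lets the ensemble weights follow changes in which base model is currently most reliable, while the ridge penalty keeps the weights stable when the base forecasts are strongly correlated.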

    Empowering the Data Scientist professional profile through competition dynamics

    Data Science is the area that comprises the development of scientific methods, processes, and systems for extracting knowledge from previously collected data, with the aim of analysing the procedures currently being carried out. The professional profile associated with this field is that of the Data Scientist, a role generally filled by Computer Engineers because the skills and competencies acquired during their training are well suited to what the job requires. Owing to the need to train new Data Scientists, among other goals, platforms such as Kaggle have emerged where they can acquire extensive experience. The main objective of this teaching experience is to provide students with practical experience on a real problem, as well as the opportunity to cooperate and compete at the same time. The acquisition and development of the necessary Data Science competencies thus take place in a highly motivating environment. Carrying out activities related to this profile has had a direct impact on the students; motivation, learning capacity, and the continuous updating of knowledge required of Computer Engineers proved fundamental.

    Wind energy prediction with ensemble methods (Predicción de energía eólica con métodos de ensemble)

    Artificial intelligence is an area of computer science in which the investment of work and research effort is yielding great rewards. It is a field in which there seem to be ever more possibilities, and in which this work aims to make progress by solving a concrete problem: wind energy prediction at the Sotavento Experimental Wind Farm. Wind energy is a form of renewable energy in which Spain has been a pioneer, and the share of energy coming from this source keeps growing. It is therefore essential to obtain accurate predictions of the energy that will be produced at any given time, and this work addresses the problem of predicting it from atmospheric forecast data. Several regression models are studied and then used to build a stacking ensemble model that tries to compensate for the errors of its component models, and the results obtained are analysed to give indications for possible future work.

    Combining Multi-Task Learning and Multi-Channel Variational Auto-Encoders to Exploit Datasets with Missing Observations - Application to Multi-Modal Neuroimaging Studies in Dementia

    The joint modeling of neuroimaging data across multiple datasets requires the consistent analysis of high-dimensional and heterogeneous information in the presence of often non-overlapping sets of views across data samples (e.g., imaging data, clinical scores, biological measurements). This analysis is associated with the problem of missing information across datasets, which can take two forms: missing at random (MAR), when the absence of a view is unpredictable and does not depend on the dataset (e.g., due to data corruption); and missing not at random (MNAR), when a specific view is absent by design for a specific dataset. In order to take advantage of the increased variability and sample size obtained by pooling observations from many cohorts, while at the same time coping with the ubiquitous problem of missing information, we propose a multi-task generative latent-variable model in which the common variability across datasets stems from the estimation of a shared latent representation across views. Our formulation makes it possible to retrieve a consistent latent representation common to all views and datasets, even in the presence of missing information. Simulations on synthetic data show that our method is able to identify a common latent representation of multi-view datasets, even when the compatibility across datasets is minimal. When jointly analyzing multi-modal neuroimaging and clinical data from real independent dementia studies, our model is able to mitigate the absence of modalities without having to discard any available information. Moreover, the common latent representation inferred with our model can be used to define robust classifiers that gather the combined information across different datasets. To conclude, in both synthetic and real data experiments, our model compared favorably with state-of-the-art benchmark methods, providing a more powerful exploitation of multi-modal observations with missing views.
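The two missingness patterns described above, and the idea of not discarding samples with absent views, can be illustrated with a view-availability mask and a reconstruction error that counts only observed views. This is a hedged sketch and not the model proposed in the abstract: the function names, the `absent_by_design` and `p_random` parameters, and the use of a plain squared error are all assumptions made for illustration.

```python
import numpy as np


def availability_mask(dataset_ids, n_views, absent_by_design=None, p_random=0.0, seed=0):
    """Return a boolean (n_samples, n_views) array; True means the view is observed.

    absent_by_design: dict mapping a dataset id to the view indices that dataset
        never collects (the "absent by design" case in the text).
    p_random: probability that a view is dropped independently of the dataset
        (the "unpredictable" case in the text).
    """
    rng = np.random.default_rng(seed)
    dataset_ids = np.asarray(dataset_ids)
    mask = rng.random((len(dataset_ids), n_views)) >= p_random
    for ds, views in (absent_by_design or {}).items():
        rows = dataset_ids == ds
        for v in views:
            mask[rows, v] = False
    return mask


def masked_reconstruction_error(views, reconstructions, mask):
    """Per-sample squared error, summed over the observed views only.

    views, reconstructions: lists of arrays, one per view, each of shape
        (n_samples, n_features_of_that_view). Entries of missing views may be NaN.
    """
    total = np.zeros(mask.shape[0])
    for v, (x, x_hat) in enumerate(zip(views, reconstructions)):
        per_sample = np.mean((x - x_hat) ** 2, axis=1)
        # Missing views contribute zero, so no sample has to be discarded.
        total += np.where(mask[:, v], per_sample, 0.0)
    return total
```

Masking the error rather than dropping incomplete samples is what allows every available observation to contribute to a shared representation, which is the property the abstract emphasises.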