155 research outputs found

    General noise support vector regression with non-constant uncertainty intervals for solar radiation prediction

    General noise cost functions have been recently proposed for support vector regression (SVR). When applied to tasks whose underlying noise distribution is similar to the one assumed for the cost function, these models should perform better than classical ε-SVR. On the other hand, uncertainty estimates for SVR have received somewhat limited attention in the literature until now and still have unaddressed problems. Keeping this in mind, three main goals are addressed here. First, we propose a framework that uses a combination of general noise SVR models with naive online R minimization algorithm (NORMA) as optimization method, and then gives nonconstant error intervals dependent upon input data aided by the use of clustering techniques. We give theoretical details required to implement this framework for Laplace, Gaussian, Beta, Weibull and Marshall–Olkin generalized exponential distributions. Second, we test the proposed framework in two real-world regression problems using data of two public competitions about solar energy. Results show the validity of our models and an improvement over classical ε-SVR. Finally, in accordance with the principle of reproducible research, we make sure that data and model implementations used for the experiments are easily and publicly accessible. With partial support from Spain’s grants TIN2013-42351-P, TIN2016-76406-P, TIN2015-70308-REDT, as well as S2013/ICE-2845 CASI-CAM-CM. This work was supported also by project FACIL–Ayudas Fundación BBVA a Equipos de Investigación Científica 2016 and the UAM–ADIC Chair for Data Science and Machine Learning. We gratefully acknowledge the use of the facilities of Centro de Computación Científica, CCC, at Universidad Autónoma de Madrid, UA
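
    The idea of input-dependent error intervals can be illustrated with a minimal sketch. It assumes zero-centered Laplace residuals and cluster labels supplied by any clustering method; the function name and the closed-form width are illustrative assumptions, not the authors' implementation.

```python
import math
from collections import defaultdict


def laplace_half_widths(residuals, cluster_ids, confidence=0.9):
    """Per-cluster half-width t with P(|r| <= t) = confidence,
    assuming residuals in each cluster follow a zero-centered Laplace law."""
    if not 0.0 < confidence < 1.0:
        raise ValueError("confidence must lie strictly between 0 and 1")

    # Group absolute residuals by the cluster of the corresponding input.
    grouped = defaultdict(list)
    for r, c in zip(residuals, cluster_ids):
        grouped[c].append(abs(r))

    widths = {}
    for c, abs_res in grouped.items():
        # Maximum-likelihood estimate of the Laplace scale when the location is 0.
        scale = sum(abs_res) / len(abs_res)
        # For Laplace noise, P(|r| > t) = exp(-t / scale), so solve for t.
        widths[c] = -scale * math.log(1.0 - confidence)
    return widths
```

    A prediction for an input assigned to cluster c would then be reported as the point estimate plus or minus widths[c], so clusters with noisier residuals get wider intervals.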

    Regional Forecasting with Support Vector Regressions: The Case of Spain

    This study attempts to assess the forecasting accuracy of Support Vector Regression (SVR) with regard to other Artificial Intelligence techniques based on statistical learning. We use two different neural networks and three SVR models that differ by the type of kernel used. We focus on international tourism demand to all seventeen regions of Spain. The SVR with a Gaussian kernel shows the best forecasting performance. The best predictions are obtained for longer forecast horizons, which suggests the suitability of machine learning techniques for medium and long term forecasting.

    Forecasting bus passenger flows by using a clustering-based support vector regression approach

    As a significant component of the intelligent transportation system, forecasting bus passenger flows plays a key role in resource allocation, network planning, and frequency setting. However, it remains challenging to recognize high fluctuations, nonlinearity, and periodicity of bus passenger flows due to varied destinations and departure times. For this reason, a novel forecasting model named affinity propagation-based support vector regression (AP-SVR) is proposed based on clustering and nonlinear simulation. In this approach, a clustering algorithm is first used to generate clustering-based intervals. A support vector regression (SVR) is then exploited to forecast the passenger flow for each cluster, with the use of particle swarm optimization (PSO) for obtaining the optimized parameters. Finally, the prediction results of the SVR are rearranged in chronological order. The proposed model is tested using real bus passenger data from a bus line over four months. Experimental results demonstrate that the proposed model performs better than other peer models in terms of absolute percentage error and mean absolute percentage error. The results suggest that a deterministic clustering technique with stable cluster results (AP) can improve the forecasting performance significantly.

    Computational approaches for sub-meter ocean color remote sensing

    Submitted in partial fulfillment of the requirements for the degree of Doctor of Philosophy in Mechanical and Oceanographic Engineering at the Massachusetts Institute of Technology and the Woods Hole Oceanographic Institution February 2021. The satellite ocean color remote sensing paradigm developed by government space agencies enables the assessment of ocean color products on global scales at kilometer resolutions. A similar paradigm has not yet been developed for regional scales at sub-meter resolutions, but it is essential for specific ocean color applications (e.g., mapping algal biomass in the marginal ice zone). While many aspects of the satellite ocean color remote sensing paradigm are applicable to sub-meter scales, steps within the paradigm must be adapted to the optical character of the ocean at these scales and the opto-electronics of the available sensing instruments. This dissertation adapts the three steps of the satellite ocean color remote sensing paradigm that benefit the most from reassessment at sub-meter scales, namely the correction for surface-reflected light, the design and selection of the opto-electronics and the post-processing of over-sampled regions. First, I identify which surface-reflected light removal algorithm and view angle combination are optimal at sub-meter scales, using data collected during a field deployment to the Martha’s Vineyard Coastal Observatory. I find that of the three most widely used glint correction algorithms, a spectral optimization based approach applied to measurements with a 40° view angle best recovers the remote sensing reflectance and chlorophyll concentration despite centimeter scale variability in the surface-reflected light. Second, I develop a simulation framework to assess the impact of higher optical and electronics noise on ocean color product retrieval from unique ocean color scenarios.
I demonstrate the framework’s power as a design tool by identifying hardware limitations, and developing potential solutions, for estimating algal biomass from high dynamic range sensing in the marginal ice zone. Third, I investigate a spectral super-resolution technique for application to spatially over-sampled oceanic regions. I determine that this technique more accurately represents spectral frequencies beyond the Nyquist and that it can be trained to be invariant to noise sources characteristic of ocean color remote sensing on images with similar statistics as the training dataset. Overall, the developed and critically assessed sub-meter ocean color remote sensing paradigm enables researchers to collect high fidelity sub-meter data from imaging spectrometers in unique ocean color scenarios. Ryan O’Shea was supported by the Department of Defense (DoD) through the National Defense Science & Engineering Graduate Fellowship (NDSEG) Program. This research was funded by Woods Hole Oceanographic Institution’s Edwin W. Hiam Ocean Science and Technology Award Fund, its Ocean Venture Funds, its Academic Programs Office, and the National Aeronautics and Space Administration via grant number CCE NNX17AI72G to Dr. Samuel Laney. The raw data for Figures 3-3 and 3-4 were provided through Australian Antarctic Science grants 2678 and 4390

    SVR, General Noise Functions and Deep Learning. General Noise Deep Models

    Unpublished doctoral thesis defended at the Universidad Autónoma de Madrid, Escuela Politécnica Superior, Departamento de Ingeniería Informática. Defense date: 20-01-2023. Machine learning (ML) is a branch of artificial intelligence that makes it possible to build systems that learn to solve a task automatically from data, in the sense that they do not need to be explicitly programmed with the rules or the method for doing so. ML covers different types of problems; one of them, regression, involves predicting a numerical outcome and is the focus of this thesis. Among the ML models used for regression, Support Vector Machines (SVM) are one of the main algorithms of choice, usually called Support Vector Regression (SVR) when applied to regression tasks. This type of model generally employs the ϵ-insensitive loss function, which implies assuming a specific distribution for the noise present in the data, but general noise cost functions have recently been proposed for SVR. These cost functions should be more effective when applied to regression problems whose underlying noise distribution follows the one assumed for that particular cost function. However, using these general functions, with the resulting disparity in mathematical properties such as differentiability, means that the standard optimization method used in SVR, sequential minimal optimization (SMO), is no longer an option. Moreover, possibly the main drawback of SVR models is that they can suffer from scalability problems when working with large data sets, a common situation in the era of big data.
On the other hand, Deep Learning (DL) models can handle large data sets more easily, which is one of the fundamental reasons for their recent popularity. Finally, although SVR models have been studied thoroughly, the construction of error intervals for them seems to have received less attention and remains an unsolved problem. This is a significant disadvantage, since in many applications that involve solving a regression problem not only is an accurate prediction useful, but a confidence interval associated with that prediction can also be extremely valuable. Taking all these factors into account, this thesis has four main objectives: First, to propose a framework for training general noise SVR models using the Naive Online R Minimization Algorithm (NORMA) as the optimization method. Second, to provide a method for building general noise DL models that combine the highly nonlinear feature processing of DL models with the predictive potential of general noise loss functions, of which the ϵ-insensitive loss function used in SVR is just one particular example. Third, to describe a direct approach to building error intervals for SVR or other regression models, based on assuming that the residuals follow a specific distribution function. And finally, to unify the three previous objectives in a single modeling framework that makes it possible to build general noise deep models for prediction in regression problems, with the possibility of obtaining associated confidence intervals or error intervals
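
    For readers unfamiliar with NORMA, the following is a minimal sketch of a NORMA-style online kernel update. It assumes a Gaussian kernel and the ϵ-insensitive loss; the function names, hyperparameter defaults, and the choice of loss are illustrative assumptions and are not taken from the thesis, which targets general noise losses.

```python
import math


def gaussian_kernel(a, b, gamma=1.0):
    """Gaussian (RBF) kernel between two equal-length sequences of floats."""
    return math.exp(-gamma * sum((ai - bi) ** 2 for ai, bi in zip(a, b)))


def norma_predict(centers, alphas, x, gamma=1.0):
    """Evaluate the kernel expansion f(x) = sum_i alpha_i * k(c_i, x)."""
    return sum(a * gaussian_kernel(c, x, gamma) for c, a in zip(centers, alphas))


def norma_fit(xs, ys, eta=0.1, lam=0.01, eps=0.1, gamma=1.0):
    """One online pass of stochastic gradient descent on the regularized risk.

    eta: learning rate, lam: regularization strength,
    eps: half-width of the insensitive zone of the loss.
    """
    centers, alphas = [], []
    for x, y in zip(xs, ys):
        err = norma_predict(centers, alphas, x, gamma) - y
        # Derivative of the eps-insensitive loss with respect to the prediction.
        grad = 0.0 if abs(err) <= eps else math.copysign(1.0, err)
        # The regularizer shrinks every existing coefficient.
        alphas = [(1.0 - eta * lam) * a for a in alphas]
        # A sample with nonzero loss gradient becomes a new expansion center.
        if grad != 0.0:
            centers.append(x)
            alphas.append(-eta * grad)
    return centers, alphas
```

    Swapping in a different noise assumption would only change the line that computes grad, which is what makes this style of optimizer convenient when the loss is not the one SMO was designed for.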

    Downscaling Aerosol Optical Thickness from Satellite Observations: Physics and Machine Learning Approaches

    In recent years, the satellite observation of aerosol properties has been greatly improved. As a result, the derivation of Aerosol Optical Thickness (AOT), one of the most popular atmospheric parameters used in air pollution monitoring, over ocean and continents from satellite observations shows comparable quality to ground-based measurements. Satellite AOT products are often applied for monitoring at global scale because of their coarse spatial resolution. However, monitoring at local scale, such as over cities, requires more detailed AOT information. Increasing the spatial resolution to a suitable level has potential for applications of air pollution monitoring at global-to-local scale, detecting emission sources, deciding pollution management strategies, localizing aerosol estimation, etc. In this thesis, we investigated, proposed, implemented and validated algorithms to derive AOT maps with spatial resolution increased up to 1×1 km² from Moderate Resolution Imaging Spectroradiometer (MODIS) observations provided by the National Aeronautics and Space Administration (NASA), while MODIS standard aerosol products provide maps at 10×10 km² of spatial resolution. The solutions are considered from two perspectives: dynamical downscaling by improving the algorithm for remote sensing of tropospheric aerosol from MODIS and statistical downscaling using Support Vector Regression.