
    Three Essays in Applied Econometrics: with Application to Natural Resource and Energy Markets

    Abstract
    Chapter 1: This essay examines the effect of state renewable energy policies on inducing innovation and the spillover effect of these policies on innovation in neighboring states. The analysis is conducted with patent data related to renewable wind power technology for the United States over the period 1983-2010. We run a panel data regression of a log transformation of states' yearly patent counts on state renewable energy policies and a spatially weighted average of renewable energy policies in neighboring states, using a Tobit model with individual effects. The results show that renewable energy rules, regulations and mandates enacted in neighboring states, such as interconnection standards, net metering and renewable portfolio standards, have a statistically significant positive spillover effect on the number of patent applications in the home state. However, financial policies such as tax incentives and subsidies implemented by neighboring states have a statistically significant negative effect on technological innovation in the home state.
    Chapter 2: In this essay, we conduct a Monte Carlo study of the prediction performance of various nonparametric estimation methods for spatially dependent data: the nonparametric local linear kernel estimator, the Nadaraya-Watson estimator, and the k-Nearest Neighbors method developed by Hallin et al. (2004b), Lu and Chen (2002), P.M. Robinson (2011) and Li and Tran (2009). With data sampled on a rectangular grid in a nonlinear random field, the results show that the nonparametric local linear kernel method has the best performance in terms of mean squared prediction error. The Nadaraya-Watson estimator also performs well. In general, these two nonparametric methods consistently outperform the k-Nearest Neighbors method and the maximum likelihood method regardless of the data generating process and sample size. The maximum likelihood method does not perform well because the spatial weight matrix can only capture linear structure while the true data generating process is nonlinear. This lends support to using nonparametric methods when misspecification may exist in either the functional form or the spatial weight matrix for spatially dependent data. We use these methods to predict county-level crop yields with spatially weighted precipitation. The results are generally consistent with the simulation results: the nonparametric local linear kernel estimator has the best prediction performance, and the Nadaraya-Watson estimator also performs better than the k-Nearest Neighbors method and the maximum likelihood estimator. However, with an inverse distance weighting matrix, the maximum likelihood estimator outperforms the k-Nearest Neighbors method in predicting crop yield.
    Chapter 3: This essay uses the exceedances-over-high-thresholds model of Davison and Smith (1990) to investigate the univariate tail distribution of the returns on various energy products: crude oil, gasoline, heating oil, propane and diesel. The bivariate threshold exceedance model of Ledford and Tawn (1996) is also used to study the tail dependence between returns on selected pairs of these products. Tail index estimates for the univariate threshold exceedance models show that these returns generally have fat tails similar to those of a Student's t-distribution with 2 to 5 degrees of freedom, except for crude oil, where the tail index estimates are closer to those of a normal distribution. We also estimate the tail dependence index for four pairs of energy products: crude oil/gasoline, crude oil/heating oil, crude oil/propane and crude oil/diesel. The correlation coefficients implied by the dependence index estimates show that correlations conditional on threshold exceedance are generally higher than the unconditional correlations for crude oil/heating oil and crude oil/gasoline. However, there is some variation in the implied correlation for crude oil/propane and crude oil/diesel; whether the extreme correlation is higher or lower than the unconditional correlation depends on the threshold chosen.
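    As a rough illustration of the univariate peaks-over-threshold approach in Chapter 3, the sketch below fits a Generalized Pareto distribution to exceedances of simulated fat-tailed returns over a high quantile. The simulated data, the 95% threshold, and the reciprocal-shape reading of the tail index are illustrative assumptions, not the thesis data or estimates.

    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(0)
    # Fat-tailed stand-in for daily energy returns (Student's t with 3 d.o.f.), not real data
    returns = rng.standard_t(df=3, size=5000)

    u = np.quantile(returns, 0.95)            # high threshold (assumed 95th percentile)
    exceedances = returns[returns > u] - u    # amounts by which returns exceed the threshold

    # Fit a Generalized Pareto distribution to the exceedances; the shape xi governs tail heaviness
    xi, _, scale = stats.genpareto.fit(exceedances, floc=0.0)
    tail_index = 1.0 / xi if xi > 0 else float("inf")  # roughly comparable to t-distribution degrees of freedom
    print(f"threshold={u:.3f}  shape xi={xi:.3f}  scale={scale:.3f}  implied tail index={tail_index:.2f}")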

    Scalable aggregation predictive analytics: a query-driven machine learning approach

    We introduce a predictive modeling solution that provides high quality predictive analytics over aggregation queries in Big Data environments. Our predictive methodology is generally applicable in environments in which large-scale data owners may or may not restrict access to their data and allow only aggregation operators like COUNT to be executed over their data. In this context, our methodology is based on historical queries and their answers to accurately predict ad-hoc queries' answers. We focus on the widely used set-cardinality, i.e., COUNT, aggregation query, as COUNT is a fundamental operator for both internal data system optimizations and for aggregation-oriented data exploration and predictive analytics. We contribute a novel, query-driven Machine Learning (ML) model whose goals are to: (i) learn the query-answer space from past issued queries, (ii) associate the query space with local linear regression & associative function estimators, (iii) define query similarity, and (iv) predict the cardinality of the answer set of unseen incoming queries, referred to as the Set Cardinality Prediction (SCP) problem. Our ML model incorporates incremental ML algorithms for ensuring high quality prediction results. The significance of this contribution lies in the fact that it (i) is the only query-driven solution applicable over general Big Data environments, which include restricted-access data, (ii) offers incremental learning adjusted for arriving ad-hoc queries, which is well suited for query-driven data exploration, and (iii) offers a performance (in terms of scalability, SCP accuracy, processing time, and memory requirements) that is superior to data-centric approaches. We provide a comprehensive performance evaluation of our model, assessing its sensitivity, scalability and efficiency for quality predictive analytics. In addition, we report on the development and incorporation of our ML model in Spark, showing its superior performance compared to Spark's COUNT method.
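    The sketch below is a deliberately simplified stand-in for the query-driven idea: historical range queries, encoded by their per-dimension bounds, and their COUNT answers train a distance-weighted k-nearest-neighbour regressor that predicts the cardinality of an unseen query without scanning the data. The synthetic table, the query encoding and the use of scikit-learn's KNeighborsRegressor are assumptions for illustration; the paper's actual model combines query-space learning with local linear regression and incremental updates.

    import numpy as np
    from sklearn.neighbors import KNeighborsRegressor

    rng = np.random.default_rng(1)
    data = rng.uniform(0, 1, size=(50_000, 2))            # data owner's (restricted) table

    def true_count(q):                                     # only used to label historical queries
        lo, hi = q[:2], q[2:]
        return int(np.sum(np.all((data >= lo) & (data <= hi), axis=1)))

    # Historical workload: random 2-D range queries encoded as [lo_x, lo_y, hi_x, hi_y]
    los = rng.uniform(0.0, 0.7, size=(1000, 2))
    his = los + rng.uniform(0.05, 0.3, size=(1000, 2))
    queries = np.hstack([los, his])
    answers = np.array([true_count(q) for q in queries])

    # Query-driven predictor: learns the query-answer space, never touches the data at prediction time
    model = KNeighborsRegressor(n_neighbors=10, weights="distance").fit(queries, answers)

    new_query = np.array([[0.2, 0.3, 0.4, 0.5]])
    print("predicted COUNT:", model.predict(new_query)[0])
    print("actual COUNT   :", true_count(new_query[0]))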

    Proceedings of the 19th Workshop on Computational Intelligence, Dortmund, 2-4 December 2009

    This proceedings volume contains the contributions to the 19th Workshop "Computational Intelligence" of Technical Committee 5.14 of the VDI/VDE Society for Measurement and Automatic Control (GMA) and the special interest group "Fuzzy Systems and Soft Computing" of the German Informatics Society (GI), held at Haus Bommerholz near Dortmund on 2-4 December 2009.

    Efficient Learning and Inference for High-dimensional Lagrangian Systems

    Learning the nature of a physical system is a problem that presents many challenges and opportunities owing to the unique structure associated with such systems. Many physical systems of practical interest in engineering are high-dimensional, which prohibits the application of standard learning methods to such problems. The first part of this work therefore proposes to solve learning problems associated with physical systems by identifying their low-dimensional Lagrangian structure. Algorithms are given to learn this structure in the case that it is obscured by a change of coordinates. The associated inference problem corresponds to solving a high-dimensional minimum-cost path problem, which can be solved by exploiting the symmetry of the problem. These techniques are demonstrated via an application to learning from high-dimensional human motion capture data. The second part of this work is concerned with the application of these methods to high-dimensional motion planning. Algorithms are given to learn and exploit the structure of holonomic motion planning problems effectively via spectral analysis and iterative dynamic programming, admitting solutions to problems of unprecedented dimension compared to known methods for optimal motion planning. The quality of the solutions found is also demonstrated to be much better in practice than that obtained via sampling-based planning and smoothing, in both simulated problems and experiments with a robot arm. This work therefore provides strong validation of the idea that learning low-dimensional structure is the key to future advances in this field.
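    The inference step described above reduces to a minimum-cost path computation, which the thesis attacks in high dimension by exploiting problem structure. The toy sketch below only illustrates the underlying cost-to-go recursion, running value iteration on a small 2-D grid with hand-picked traversal costs; the grid, costs and goal are assumptions for illustration, not the thesis setting.

    import numpy as np

    cost = np.ones((20, 20))
    cost[5:15, 10] = 50.0                        # an expensive "obstacle" column
    goal = (19, 19)

    value = np.full(cost.shape, np.inf)
    value[goal] = 0.0

    # Bellman backups: V(s) = cost(s) + min over 4-neighbours V(s')
    for _ in range(400):
        padded = np.pad(value, 1, constant_values=np.inf)
        best_neighbour = np.minimum.reduce([
            padded[:-2, 1:-1], padded[2:, 1:-1],   # up, down
            padded[1:-1, :-2], padded[1:-1, 2:],   # left, right
        ])
        new_value = cost + best_neighbour
        new_value[goal] = 0.0
        if np.allclose(new_value, value):
            break
        value = new_value

    print("cost-to-go from start (0, 0):", value[0, 0])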

    A similarity-based Bayesian mixture-of-experts model

    We present a new nonparametric mixture-of-experts model for multivariate regression problems, inspired by the probabilistic k-nearest neighbors algorithm. Using a conditionally specified model, predictions for out-of-sample inputs are based on similarities to each observed data point, yielding predictive distributions represented by Gaussian mixtures. Posterior inference is performed on the parameters of the mixture components as well as the distance metric using a mean-field variational Bayes algorithm accompanied by a stochastic gradient-based optimization procedure. The proposed method is especially advantageous in settings where inputs are of relatively high dimension in comparison to the data size, where input-output relationships are complex, and where predictive distributions may be skewed or multimodal. Computational studies on two synthetic datasets and one dataset comprising dose statistics of radiation therapy treatment plans show that our mixture-of-experts method performs comparably to or better than a conditional Dirichlet process mixture model, both in terms of validation metrics and visual inspection.
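    A minimal sketch of the predictive form described above, assuming a fixed Gaussian similarity kernel and a fixed component variance: the predictive density at a new input is a Gaussian mixture whose components sit at the observed outputs, weighted by the similarity of the new input to the observed inputs. The variational-Bayes inference over the metric and component parameters is omitted; the data, length scale and variance below are illustrative.

    import numpy as np

    rng = np.random.default_rng(2)
    X = rng.uniform(-3, 3, size=(200, 5))                  # training inputs
    y = np.sin(X[:, 0]) + 0.1 * rng.standard_normal(200)   # training outputs

    length_scale, sigma = 1.0, 0.2                         # assumed hyperparameters

    def predictive_mixture(x_star):
        # Mixture of N(y_i, sigma^2) components with similarity-based weights
        d2 = np.sum((X - x_star) ** 2, axis=1)
        w = np.exp(-0.5 * d2 / length_scale**2)
        w /= w.sum()
        return w, y, sigma

    w, mu, s = predictive_mixture(np.zeros(5))
    print("predictive mean:", np.dot(w, mu))
    print("predictive std :", np.sqrt(np.dot(w, s**2 + mu**2) - np.dot(w, mu) ** 2))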

    Relative Positional Encoding for Transformers with Linear Complexity

    Recent advances in Transformer models allow for unprecedented sequence lengths, due to linear space and time complexity. In the meantime, relative positional encoding (RPE) was proposed as beneficial for classical Transformers and consists of exploiting lags instead of absolute positions for inference. Still, RPE is not available for the recent linear variants of the Transformer, because it requires the explicit computation of the attention matrix, which is precisely what is avoided by such methods. In this paper, we bridge this gap and present Stochastic Positional Encoding as a way to generate PE that can be used as a replacement for the classical additive (sinusoidal) PE and provably behaves like RPE. The main theoretical contribution is to make a connection between positional encoding and cross-covariance structures of correlated Gaussian processes. We illustrate the performance of our approach on the Long-Range Arena benchmark and on music generation.
    Comment: ICML 2021 (long talk) camera-ready. 24 pages.
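    The lag-only dependence that the paper obtains can be illustrated with a generic random Fourier feature construction, sketched below: positional codes are drawn so that their expected dot product approximates a Gaussian kernel of the lag m - n rather than a function of the absolute positions. This is only an illustration of the cross-covariance idea under assumed frequencies and length scale, not the paper's Stochastic Positional Encoding algorithm.

    import numpy as np

    rng = np.random.default_rng(3)
    R, L, ell = 512, 64, 8.0                      # number of random features, sequence length, kernel length scale

    omega = rng.normal(0.0, 1.0 / ell, size=R)    # frequencies from the Gaussian kernel's spectral density
    phi = rng.uniform(0.0, 2.0 * np.pi, size=R)   # random phases

    positions = np.arange(L)
    codes = np.sqrt(2.0 / R) * np.cos(np.outer(positions, omega) + phi)   # (L, R) stochastic positional codes

    # Dot products depend (approximately) only on the lag: codes[m] @ codes[n] ~ exp(-(m-n)^2 / (2 ell^2))
    gram = codes @ codes.T
    print("approx k(0), k(4), k(16):", gram[0, 0], gram[0, 4], gram[0, 16])
    print("exact  k(0), k(4), k(16):", 1.0, np.exp(-16 / (2 * ell**2)), np.exp(-256 / (2 * ell**2)))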