1,853 research outputs found

    Bounding Optimality Gap in Stochastic Optimization via Bagging: Statistical Efficiency and Stability

    Full text link
    We study a statistical method to estimate the optimal value and the optimality gap of a given solution for stochastic optimization, as an assessment of the solution quality. Our approach is based on bootstrap aggregating, or bagging, of resampled sample average approximation (SAA). We show how this approach leads to valid statistical confidence bounds for non-smooth optimization. We also demonstrate its statistical efficiency and stability, which are especially desirable in limited-data situations, and compare these properties with those of some existing methods. We present our theory, which views SAA as a kernel in an infinite-order symmetric statistic that can be approximated via bagging. We substantiate our theoretical findings with numerical results.
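
    As a rough illustration of the bagging idea described above (not the paper's exact estimator), the sketch below bags resampled SAA optimal values for a toy newsvendor problem and uses them to bound the optimality gap of a candidate solution. The cost parameters, demand data, subsample size, and confidence level are all hypothetical choices made for the example.

```python
"""
Minimal sketch (not the paper's exact estimator): estimate the optimality gap
of a candidate solution for a newsvendor problem by bagging resampled SAA.
All problem data and tuning choices below are illustrative assumptions.
"""
import numpy as np

rng = np.random.default_rng(0)
h, b = 1.0, 3.0                      # assumed holding / backorder costs

def cost(x, d):
    """Newsvendor cost for order quantity x and demand d."""
    return h * np.maximum(x - d, 0.0) + b * np.maximum(d - x, 0.0)

def saa_optimal_value(sample):
    """Solve the SAA: the optimal order quantity is the b/(b+h) sample quantile."""
    x_star = np.quantile(sample, b / (b + h))
    return cost(x_star, sample).mean()

# Limited data set and a candidate solution whose quality we want to assess.
data = rng.gamma(shape=2.0, scale=5.0, size=80)   # hypothetical demand data
x_candidate = 12.0

# Bagging: solve SAA on many resampled subsets and aggregate the optimal values.
B, k = 500, 40                        # number of resamples, subsample size
bagged_vals = np.array([
    saa_optimal_value(rng.choice(data, size=k, replace=False))
    for _ in range(B)
])

v_bag = bagged_vals.mean()                        # bagged estimate of optimal value
gap_estimate = cost(x_candidate, data).mean() - v_bag
# Crude percentile-style upper confidence bound on the gap (illustrative only).
gap_upper = cost(x_candidate, data).mean() - np.quantile(bagged_vals, 0.05)
print(f"estimated gap: {gap_estimate:.3f}, ~95% upper bound: {gap_upper:.3f}")
```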

    Curriculum Guidelines for Undergraduate Programs in Data Science

    Get PDF
    The Park City Math Institute (PCMI) 2016 Summer Undergraduate Faculty Program met for the purpose of composing guidelines for undergraduate programs in Data Science. The group consisted of 25 undergraduate faculty from a variety of institutions in the U.S., primarily from the disciplines of mathematics, statistics, and computer science. These guidelines are meant to provide some structure for institutions planning for or revising a major in Data Science.

    A Network Model of Financial Markets

    Get PDF
    This thesis introduces a network representation of equity markets. The model is based on the premise that assets share dependencies on abstract ‘factors’, resulting in exploitable patterns among asset price levels. The network model is a collection of long-run market trends estimated by a 3-layer machine learning framework. The network model’s comprehensive validity is established with 2 simulations, in the fields of algorithmic trading and systemic risk. The algorithmic trading validation applies expectations derived from the network model to estimating expected future returns. It further uses the network’s expectations to actively manage a theoretically market-neutral portfolio. The validation demonstrates that the network model’s portfolio generates excess returns relative to 2 benchmarks. Over the period April 2007 to January 2014, the network model’s portfolio for assets drawn from the S&P/ASX 100 produced a Sharpe ratio of 0.674, approximately double that of the nearest benchmark. The systemic risk validation used the network model to simulate shocks to selected market sectors and evaluate the resulting financial contagion. The validation successfully differentiated sectors by systemic connectivity levels and suggested some interesting market features. Most notable was the identification of the ‘Financials’ sector as the most systemically influential and ‘Basic Materials’ as the most systemically dependent. Additionally, there was evidence that ‘Financials’ may function as a hub of systemic risk which exacerbates losses from multiple market sectors.
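
    The thesis estimates its network with a 3-layer machine learning framework, which is not reproduced here; the sketch below only illustrates two simpler ingredients the abstract refers to: a dependency network over assets (approximated here by a plain correlation-threshold graph, an assumption made for the example) and the annualized Sharpe ratio used to evaluate a portfolio, computed on synthetic returns.

```python
"""
Illustrative sketch only: a correlation-threshold dependency network and a
Sharpe ratio calculation on synthetic daily returns. This is not the thesis's
3-layer machine learning framework; all data and thresholds are assumptions.
"""
import numpy as np

rng = np.random.default_rng(1)
n_days, n_assets = 750, 10
returns = rng.normal(0.0003, 0.01, size=(n_days, n_assets))  # synthetic daily returns

# Dependency network: connect assets whose return correlation exceeds a threshold.
corr = np.corrcoef(returns, rowvar=False)
threshold = 0.3
adjacency = (np.abs(corr) > threshold) & ~np.eye(n_assets, dtype=bool)
print("network edges:", int(adjacency.sum() / 2))

# Sharpe ratio of an equal-weight portfolio (annualized with 252 trading days).
portfolio = returns.mean(axis=1)
sharpe = np.sqrt(252) * portfolio.mean() / portfolio.std(ddof=1)
print(f"annualized Sharpe ratio: {sharpe:.3f}")
```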

    A Tabu Search Heuristic Procedure in Markov Chain Bootstrapping

    Get PDF
    Markov chain theory is proving to be a powerful approach to bootstrap finite-state processes, especially when time dependence is nonlinear. In this work we extend this approach to bootstrap discrete-time, continuous-valued processes. To this end we solve a minimization problem to partition the state space of a continuous-valued process into a finite number of intervals or unions of intervals (i.e. its states) and to identify the time lags which provide “memory” to the process. A distance is used as the objective function to stimulate the clustering of states having similar transition probabilities. The problem of the exploding number of alternative partitions in the solution space (which grows with the number of states and the order of the Markov chain) is addressed through a Tabu Search algorithm. The method is applied to bootstrap the series of German and Spanish electricity prices. The analysis of the results confirms the good consistency properties of the method we propose.
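
    The sketch below illustrates the general Markov chain bootstrap idea on a synthetic continuous-valued series: partition the values into a few states, estimate a transition matrix, and resample by simulating the chain. The quantile-based partition and the first-order chain are simplifying assumptions made for the example; in the paper the partition and time lags are instead selected by the Tabu Search over a distance-based objective.

```python
"""
Minimal sketch of Markov chain bootstrapping of a continuous-valued series.
The fixed quantile-bin partition and first-order chain are assumptions for
illustration only; the paper chooses states and lags via Tabu Search.
"""
import numpy as np

rng = np.random.default_rng(2)
series = np.cumsum(rng.normal(size=500)) + 50.0   # synthetic price-like series

# Partition the state space into a small number of quantile-based intervals.
n_states = 4
edges = np.quantile(series, np.linspace(0, 1, n_states + 1)[1:-1])
states = np.digitize(series, edges)               # state label of each observation

# Estimate the first-order transition matrix from observed state pairs.
P = np.zeros((n_states, n_states))
for s, t in zip(states[:-1], states[1:]):
    P[s, t] += 1
P = P / P.sum(axis=1, keepdims=True)

# Bootstrap replicate: simulate the chain, then draw an observed value
# from the interval (state) that the chain visits at each step.
def bootstrap_replicate(length):
    out, s = [], states[0]
    for _ in range(length):
        s = rng.choice(n_states, p=P[s])
        out.append(rng.choice(series[states == s]))
    return np.array(out)

replicate = bootstrap_replicate(len(series))
print(replicate[:5])
```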

    The impact of macroeconomic leading indicators on inventory management

    Get PDF
    Forecasting tactical sales is important for long-term decisions such as procurement and for informing lower-level inventory management decisions. Macroeconomic indicators have been shown to improve forecast accuracy at the tactical level, as these indicators can provide early warnings of changing markets, while at the same time tactical sales are sufficiently aggregated to facilitate the identification of useful leading indicators. Past research has shown that we can achieve significant gains by incorporating such information. However, at the lower levels at which inventory decisions are taken, this is often not feasible due to the level of noise in the data. To take advantage of macroeconomic leading indicators at this level, we need to translate the tactical forecasts into operational-level ones. In this research we investigate how best to assimilate top-level forecasts that incorporate such exogenous information with bottom-level (Stock Keeping Unit, or SKU, level) extrapolative forecasts. The aim is to demonstrate whether incorporating these variables has a positive impact on bottom-level planning and, eventually, inventory levels. We construct appropriate hierarchies of sales and use that structure to reconcile the forecasts, and in turn the different available information, across levels. We are interested in both the point forecasts and the prediction intervals, as the latter inform safety stock decisions. The contribution of this research is therefore twofold: we investigate the usefulness of macroeconomic leading indicators for SKU-level forecasts and alternative ways to estimate the variance of hierarchically reconciled forecasts. We provide evidence using a real case study.
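
    As a minimal illustration of pushing an indicator-informed top-level forecast down to SKU level, the sketch below applies a proportional top-down reconciliation so that the SKU forecasts sum to the top-level forecast. The numbers and the proportional split are illustrative assumptions; the study itself investigates hierarchical reconciliation and alternative variance estimators rather than this particular scheme.

```python
"""
Hedged sketch of one simple reconciliation scheme: proportional top-down
adjustment of SKU forecasts to match a top-level, indicator-informed forecast.
All figures below are hypothetical.
"""
import numpy as np

# Bottom-level extrapolative forecasts for four hypothetical SKUs.
sku_forecasts = np.array([120.0, 80.0, 45.0, 30.0])

# Top-level forecast for the product family, adjusted using a leading indicator.
top_forecast = 300.0

# Top-down reconciliation: rescale SKU forecasts so they sum to the top forecast.
proportions = sku_forecasts / sku_forecasts.sum()
reconciled = proportions * top_forecast
print("reconciled SKU forecasts:", np.round(reconciled, 1))
print("sum matches top level:", np.isclose(reconciled.sum(), top_forecast))
```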

    A Topic Coverage Approach to Evaluation of Topic Models

    Full text link
    Topic models are widely used unsupervised models of text, capable of learning topics - weighted lists of words and documents - from large collections of text documents. When topic models are used for discovery of topics in text collections, a question that arises naturally is how well the model-induced topics correspond to topics of interest to the analyst. In this paper we revisit and extend a so far neglected approach to topic model evaluation based on measuring topic coverage - computationally matching model topics with a set of reference topics that the models are expected to uncover. The approach is well suited for analyzing models' performance in topic discovery and for large-scale analysis of both topic models and measures of model quality. We propose new measures of coverage and evaluate, in a series of experiments, different types of topic models on two distinct text domains for which interest in topic discovery exists. The experiments include evaluation of model quality, analysis of coverage of distinct topic categories, and analysis of the relationship between coverage and other methods of topic model evaluation. The contributions of the paper include new measures of coverage, insights into both topic models and other methods of model evaluation, and datasets and code facilitating future research on topic coverage and other approaches to topic model evaluation.
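
    A minimal sketch of the coverage idea, under assumptions not taken from the paper: model topics (top-word lists) are matched to reference topics with Jaccard similarity and a fixed threshold, and coverage is reported as the fraction of reference topics matched by at least one model topic.

```python
"""
Illustrative sketch of topic coverage. The paper defines its own coverage
measures; the Jaccard similarity and the 0.3 threshold used here are
assumptions made for this example, and the topics are toy data.
"""
def jaccard(a, b):
    a, b = set(a), set(b)
    return len(a & b) / len(a | b)

def topic_coverage(model_topics, reference_topics, threshold=0.3):
    """Fraction of reference topics matched by at least one model topic."""
    matched = sum(
        any(jaccard(ref, mod) >= threshold for mod in model_topics)
        for ref in reference_topics
    )
    return matched / len(reference_topics)

reference = [["election", "vote", "party", "candidate"],
             ["match", "goal", "league", "player"]]
model = [["vote", "party", "election", "poll"],
         ["market", "stock", "price", "trade"]]

print("coverage:", topic_coverage(model, reference))   # -> 0.5
```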