58 research outputs found

    Incrementally Improving Graph WaveNet Performance on Traffic Prediction

    We present a series of modifications which improve upon Graph WaveNet's previously state-of-the-art performance on the METR-LA traffic prediction task. The goal of this task is to predict the future speed of traffic at each sensor in a network using the past hour of sensor readings. Graph WaveNet (GWN) is a spatio-temporal graph neural network which interleaves graph convolution to aggregate information from nearby sensors and dilated convolutions to aggregate information from the past. We improve GWN by (1) using better hyperparameters, (2) adding connections that allow larger gradients to flow back to the early convolutional layers, and (3) pretraining on an easier short-term traffic prediction task. These modifications reduce the mean absolute error by 0.06 on the METR-LA task, nearly equal to GWN's improvement over its predecessor. These improvements generalize to the PEMS-BAY dataset, with similar relative magnitude. We also show that ensembling separate models for short- and long-term predictions further improves performance. Code is available at https://github.com/sshleifer/Graph-WaveNet
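The two building blocks named in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation (their code is at the URL above). The function names, filter weights, and toy adjacency matrix are assumptions made only for illustration, and the graph step assumes every sensor has a self-loop so that no row sums to zero.

```python
# Illustrative sketch only; function names and numbers are hypothetical.

def dilated_causal_conv(series, weights, dilation):
    """1-D causal convolution: output[t] mixes series[t], series[t - dilation], ..."""
    out = []
    for t in range(len(series)):
        total = 0.0
        for k, w in enumerate(weights):
            idx = t - k * dilation
            if idx >= 0:  # positions before the start act as zero padding
                total += w * series[idx]
        out.append(total)
    return out


def graph_conv(features, adjacency):
    """Replace each sensor's value with a weighted average over its neighbours.

    Assumes each row of `adjacency` has a non-zero sum (e.g. self-loops).
    """
    n = len(adjacency)
    out = []
    for i in range(n):
        row_sum = sum(adjacency[i])
        out.append(sum(adjacency[i][j] * features[j] for j in range(n)) / row_sum)
    return out


def mean_absolute_error(pred, target):
    """Average absolute difference, the metric quoted in the abstract."""
    return sum(abs(p - t) for p, t in zip(pred, target)) / len(pred)


# Temporal step: one sensor's readings, looking two steps into the past.
temporal = dilated_causal_conv([1.0, 2.0, 3.0, 4.0], [0.5, 0.5], dilation=2)

# Spatial step: three sensors on a line graph, each connected to itself.
spatial = graph_conv([60.0, 30.0, 0.0], [[1, 1, 0], [1, 1, 1], [0, 1, 1]])
```

A full model would stack several such layers with learned weights; here the weights are fixed so that the data flow is easy to follow.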

    Classification as Decoder: Trading Flexibility for Control in Medical Dialogue

    Generative seq2seq dialogue systems are trained to predict the next word in dialogues that have already occurred. They can learn from large unlabeled conversation datasets, build a deeper understanding of conversational context, and generate a wide variety of responses. This flexibility comes at the cost of control, a concerning tradeoff in doctor/patient interactions. Inaccuracies, typos, or undesirable content in the training data will be reproduced by the model at inference time. We trade a small amount of labeling effort and some loss of response variety in exchange for quality control. More specifically, a pretrained language model encodes the conversational context, and we finetune a classification head to map an encoded conversational context to a response class, where each class is a noisily labeled group of interchangeable responses. Experts can update these exemplar responses over time as best practices change without retraining the classifier or invalidating old training data. Expert evaluation of 775 unseen doctor/patient conversations shows that only 12% of the discriminative model's responses are worse than what the doctor ended up writing, compared to 18% for the generative model. Comment: Machine Learning for Health (ML4H) at NeurIPS 2019 - Extended Abstract. arXiv admin note: substantial text overlap with arXiv:1910.0347
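The separation between choosing a response class and wording the response can be sketched as follows. This is a toy illustration, not the paper's system: a keyword scorer stands in for the pretrained encoder and finetuned classification head, and the class labels, keywords, and exemplar sentences are all hypothetical.

```python
# Hypothetical stand-in for "classification as decoder"; all names are invented.

# Each response class maps to one expert-approved exemplar response.
RESPONSE_CLASSES = {
    "ask_duration": "How long have you had these symptoms?",
    "ask_severity": "On a scale of 1 to 10, how severe is the pain?",
}

# A keyword scorer plays the role of the finetuned classification head.
KEYWORDS = {
    "ask_duration": {"started", "since", "days"},
    "ask_severity": {"pain", "hurts", "ache"},
}


def classify(context):
    """Return the label whose keywords overlap most with the context."""
    tokens = set(context.lower().split())
    return max(KEYWORDS, key=lambda label: len(tokens & KEYWORDS[label]))


def respond(context, exemplars=RESPONSE_CLASSES):
    """Decode by lookup: the classifier picks a class, the table supplies the text."""
    return exemplars[classify(context)]


reply_before = respond("my knee hurts a lot")

# An expert rewrites the exemplar; the classifier itself is left untouched.
RESPONSE_CLASSES["ask_severity"] = "How would you rate the pain: mild, moderate, or severe?"
reply_after = respond("my knee hurts a lot")
```

Because the wording lives in a lookup table, editing an exemplar changes what is sent without touching the classifier, which is the kind of control the abstract describes.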

    HuggingFace's Transformers: State-of-the-art Natural Language Processing

    Recent progress in natural language processing has been driven by advances in both model architecture and model pretraining. Transformer architectures have facilitated building higher-capacity models and pretraining has made it possible to effectively utilize this capacity for a wide variety of tasks. Transformers is an open-source library with the goal of opening up these advances to the wider machine learning community. The library consists of carefully engineered state-of-the-art Transformer architectures under a unified API. Backing this library is a curated collection of pretrained models made by and available for the community. Transformers is designed to be extensible by researchers, simple for practitioners, and fast and robust in industrial deployments. The library is available at https://github.com/huggingface/transformers. Comment: 8 pages, 4 figures, more details at https://github.com/huggingface/transformer

    Expectations of Returns and Expected Returns

    We analyze time series of investor expectations of future stock market returns from six data sources between 1963 and 2011. The six measures of expectations are highly positively correlated with each other, as well as with past stock returns and with the level of the stock market. However, investor expectations are strongly negatively correlated with model-based expected returns. We reconcile the evidence by calibrating a simple behavioral model, in which fundamental traders require a premium to accommodate expectations shocks from extrapolative traders, but markets are not efficient.

    Why Did Holdings of Highly-Rated Securitization Tranches Differ So Much Across Banks?

    We provide estimates of holdings of highly-rated securitization tranches of U.S. bank holding companies ahead of the credit crisis and evaluate hypotheses that have been advanced to explain these holdings. Our broadest estimates include CDOs as well as holdings in off-balance-sheet conduits. While holdings exceeded Tier 1 capital for some large banks, they were economically trivial for the typical U.S. bank. The banks with high holdings were not riskier before the crisis using conventional measures, but their performance was poorer during the crisis. We find that holdings of highly-rated tranches are correlated with a bank’s securitization activity. Theories of highly-rated tranches that are unrelated to a bank’s securitization activity, such as “bad incentives,” “bad governance,” or “bad risk management,” have no support in the data.

    Pseudo-market timing and predictive regressions, Harvard University working paper

    A number of studies claim that aggregate managerial decision variables, such as aggregate equity issuance, have power to predict stock or bond market returns. Recent research argues that these results may be driven by an aggregate time-series version of Schultz’s (2003) pseudo market timing bias. We use standard simulation techniques to estimate the size of the aggregate pseudo market timing bias for a variety of predictive regressions based on managerial decision variables. The results show that the bias explains less than two percent of the predictive power of the equity share in new issues, and that it is also too small to overturn prior inferences about the predictive power of corporate investment plans, insider trading, dividend initiations, or the maturity of corporate debt issues.

    Bad beta, good beta

    This paper explains the size and value “anomalies” in stock returns using an economically motivated two-beta model. We break the beta of a stock with the market portfolio into two components, one reflecting news about the market’s future cash flows and one reflecting news about the market’s discount rates. Intertemporal asset pricing theory suggests that the former should have a higher price of risk; thus beta, like cholesterol, comes in “bad” and “good” varieties. Empirically, we find that value stocks and small stocks have considerably higher cash-flow betas than growth stocks and large stocks, and this can explain their higher average returns. The poor performance of the capital asset pricing model (CAPM) since 1963 is explained by the fact that growth stocks and high-past-beta stocks have predominantly good betas with low risk prices. (JEL G12, G14, N22) How should a rational investor measure the risks of stock market investments? What determines the risk premium that will induce a rational investor to hold an individual stock at its market weight, rather than overweighting or underweighting it? According to the CAPM of William Sharpe (1964) and John Lintner (1965), a stock’s risk is summarized by its beta with the market portfolio of all invested wealth. Controlling for beta, no other characteristics of a stock should influence the return required by a rational investor. It is well known that the CAPM fails to describe average realized stock returns since the early 1960s, if a value-weighted equity index is used as a proxy for the market portfolio.
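The beta decomposition described in this abstract can be illustrated numerically. The sketch below assumes the common formulation in which the unexpected market return equals cash-flow news minus discount-rate news, and each component beta is a covariance scaled by the variance of that unexpected return. The abstract does not spell out these formulas, so treat them as an assumption. The return and news series are made-up numbers, not data from the paper.

```python
# Illustrative sketch; the news series and stock returns are invented numbers.

def covariance(x, y):
    """Sample covariance of two equal-length series."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    return sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / (len(x) - 1)


def two_betas(stock, cash_flow_news, discount_rate_news):
    """Split a stock's market beta into cash-flow and discount-rate components.

    Assumes unexpected market return = cash-flow news - discount-rate news.
    """
    market = [cf - dr for cf, dr in zip(cash_flow_news, discount_rate_news)]
    var_market = covariance(market, market)
    beta_cf = covariance(stock, cash_flow_news) / var_market
    beta_dr = covariance(stock, [-dr for dr in discount_rate_news]) / var_market
    return beta_cf, beta_dr


n_cf = [0.010, -0.020, 0.015, 0.000]   # hypothetical cash-flow news
n_dr = [-0.005, 0.010, 0.000, -0.010]  # hypothetical discount-rate news
stock = [0.020, -0.030, 0.010, 0.015]  # hypothetical stock returns

beta_cf, beta_dr = two_betas(stock, n_cf, n_dr)

# The usual single market beta, for comparison with the sum of the two parts.
market = [cf - dr for cf, dr in zip(n_cf, n_dr)]
market_beta = covariance(stock, market) / covariance(market, market)
```

Under that assumption the two components add up to the ordinary market beta, because covariance is linear in its second argument.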

    Predicting Returns with Managerial Decision Variables: Is there a Small Sample Bias?

    Many studies find that aggregate managerial decision variables, such as aggregate equity issuance, predict stock or bond market returns. Recent research argues that these findings may be driven by an aggregate time-series version of Schultz’s (2003) pseudo market-timing bias. Using standard simulation techniques, we find that the bias is much too small to account for the observed predictive power of the equity share in new issues, corporate investment plans, insider trading, dividend initiations, or the maturity of corporate debt issues.

    Eastern Universities Development Conference and the MIT Development Economics and Organizational Economics

    Do historical institutions have a persistent impact on economic performance? We analyze the colonial institutions set up by the British to collect land revenue in India, and show that differences in historical property rights institutions lead to sustained differences in economic outcomes. Areas in which proprietary rights in land were historically given to landlords have significantly lower agricultural investments, agricultural productivity and investments in public goods in the post-Independence period than areas in which these rights were given to the cultivators. We verify that these differences are not driven by omitted variables or endogeneity of the historical institutions, and argue that they probably arise because differences in institutions lead to very different policy choices.

    Politics and Local Economic Growth: Evidence from India

    Does politics have an impact on local economic outcomes? Using a regression discontinuity design built around close elections in India from 1990-2005, we examine the local economic effects of one form of political favoritism: the benefit of having a local politician who is aligned with the party in control of the state government. We show that private sector employment in politically aligned constituencies grows by 1.7 percentage points more per year than in non-aligned constituencies. We find no effect on government employment or supply of public infrastructure. Stock prices show 12-15% positive cumulative abnormal returns when an aligned candidate wins the constituency where a firm is headquartered, suggesting that political alignment is a net benefit to both local labor and capital. Finally, we use international survey data to classify industries by their dependence on (i) government bureaucracy, (ii) direct transfers in the form of procurement, and (iii) external finance. We find the effect of political alignment is largest in industries that depend most on government officials, with no significant effect of dependence on credit or procurement. The results suggest that state politicians can control the enforcement of regulation, with important economic consequences.
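The idea behind a regression discontinuity around close elections can be sketched in a few lines. This is a deliberately simplified version, a difference in means inside a narrow window around the cutoff, rather than the estimator used in the paper. The margins, outcomes, and bandwidth below are invented for illustration.

```python
# Simplified sketch of a regression discontinuity comparison; data are made up.

def rd_estimate(margins, outcomes, bandwidth):
    """Difference in mean outcomes between narrow winners and narrow losers.

    `margins` is the aligned candidate's vote margin (positive means a win).
    Only observations within `bandwidth` of the zero cutoff are compared.
    """
    above = [y for m, y in zip(margins, outcomes) if 0 < m <= bandwidth]
    below = [y for m, y in zip(margins, outcomes) if -bandwidth <= m < 0]
    if not above or not below:
        raise ValueError("no observations on one side of the cutoff")
    return sum(above) / len(above) - sum(below) / len(below)


# Hypothetical constituencies: vote margin and employment growth (percent).
margins = [-0.10, -0.02, -0.01, 0.01, 0.03, 0.20]
growth = [1.0, 2.0, 2.0, 4.0, 4.0, 9.0]

effect = rd_estimate(margins, growth, bandwidth=0.05)
```

The constituencies with margins of -0.10 and 0.20 fall outside the window and are ignored, which reflects the design's premise that only near-ties are comparable.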