Incrementally Improving Graph WaveNet Performance on Traffic Prediction

Author: Chitters Vamsi
McCreery Clara
Shleifer Sam
Publication date: 11/12/2019

We present a series of modifications which improve upon Graph WaveNet's previously state-of-the-art performance on the METR-LA traffic prediction task. The goal of this task is to predict the future speed of traffic at each sensor in a network using the past hour of sensor readings. Graph WaveNet (GWN) is a spatio-temporal graph neural network which interleaves graph convolution to aggregate information from nearby sensors and dilated convolutions to aggregate information from the past. We improve GWN by (1) using better hyperparameters, (2) adding connections that allow larger gradients to flow back to the early convolutional layers, and (3) pretraining on an easier short-term traffic prediction task. These modifications reduce the mean absolute error by 0.06 on the METR-LA task, nearly equal to GWN's improvement over its predecessor. These improvements generalize to the PEMS-BAY dataset, with similar relative magnitude. We also show that ensembling separate models for short- and long-term predictions further improves performance. Code is available at https://github.com/sshleifer/Graph-WaveNet

arXiv.org e-Print Archive
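
The dilated convolutions mentioned in the abstract above can be illustrated with a minimal sketch. This is not the authors' code (that lives in the repository linked in the abstract), and it leaves out the graph convolution entirely. The function names, the averaging kernel, and the speed readings are made up for illustration.

```python
from typing import List


def dilated_causal_conv(x: List[float], kernel: List[float], dilation: int) -> List[float]:
    """1-D causal convolution with gaps of `dilation` steps between taps.

    The output at time t combines x[t], x[t - dilation], x[t - 2 * dilation], ...
    Positions before the start of the series are treated as zero.
    """
    out = []
    for t in range(len(x)):
        total = 0.0
        for k, w in enumerate(kernel):
            idx = t - k * dilation
            if idx >= 0:
                total += w * x[idx]
        out.append(total)
    return out


def receptive_field(kernel_size: int, dilations: List[int]) -> int:
    """Number of time steps (including the current one) seen after stacking layers."""
    return 1 + sum((kernel_size - 1) * d for d in dilations)


# Hypothetical speed readings from a single sensor.
speeds = [60.0, 58.0, 55.0, 40.0, 35.0, 50.0, 62.0, 61.0]

# Two stacked layers: doubling the dilation widens the window without adding weights.
h1 = dilated_causal_conv(speeds, [0.5, 0.5], dilation=1)
h2 = dilated_causal_conv(h1, [0.5, 0.5], dilation=2)
```

With a kernel of size 2, stacking dilations 1 and 2 lets each output see 4 time steps, which is why dilated stacks can cover a long history with few layers.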

Classification as Decoder: Trading Flexibility for Control in Medical Dialogue

Author: Amatriain Xavier
Chablani Manish
Kannan Anitha
Katariya Namit
Shleifer Sam
Publication date: 15/11/2019

Generative seq2seq dialogue systems are trained to predict the next word in dialogues that have already occurred. They can learn from large unlabeled conversation datasets, build a deeper understanding of conversational context, and generate a wide variety of responses. This flexibility comes at the cost of control, a concerning tradeoff in doctor/patient interactions. Inaccuracies, typos, or undesirable content in the training data will be reproduced by the model at inference time. We trade a small amount of labeling effort and some loss of response variety in exchange for quality control. More specifically, a pretrained language model encodes the conversational context, and we finetune a classification head to map an encoded conversational context to a response class, where each class is a noisily labeled group of interchangeable responses. Experts can update these exemplar responses over time as best practices change without retraining the classifier or invalidating old training data. Expert evaluation of 775 unseen doctor/patient conversations shows that only 12% of the discriminative model's responses are worse than what the doctor ended up writing, compared to 18% for the generative model.

Comment: Machine Learning for Health (ML4H) at NeurIPS 2019 - Extended Abstract. arXiv admin note: substantial text overlap with arXiv:1910.0347

arXiv.org e-Print Archive
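
The "classification as decoder" idea in the abstract above separates choosing a response class from the wording of the reply. A toy sketch follows. The keyword scorer is only a stand-in for the pretrained encoder and finetuned classification head the abstract describes, and the class labels, cue words, and exemplar texts are all hypothetical.

```python
from typing import Dict, List

# Hypothetical response classes; each stands for a group of interchangeable replies.
EXEMPLARS: Dict[str, str] = {
    "ask_duration": "How long have you had these symptoms?",
    "ask_severity": "On a scale of 1 to 10, how severe is it?",
}

# Stand-in for the finetuned classification head: keyword cues per class.
CUES: Dict[str, List[str]] = {
    "ask_duration": ["started", "since", "began"],
    "ask_severity": ["pain", "hurts", "ache"],
}


def classify(context: str) -> str:
    """Return the response class whose cue words appear most often in the context."""
    words = context.lower().split()
    scores = {label: sum(w in cues for w in words) for label, cues in CUES.items()}
    return max(scores, key=scores.get)


def respond(context: str) -> str:
    """Decode by table lookup: the classifier picks a class, the table supplies the text."""
    return EXEMPLARS[classify(context)]


first = respond("my knee hurts a lot")

# An expert edits the exemplar; the classifier itself is untouched.
EXEMPLARS["ask_duration"] = "When did this begin?"
second = respond("the cough started yesterday")
```

Because the reply text lives in a lookup table, editing an exemplar changes what the system says without touching the classifier, which mirrors the abstract's point about updating responses without retraining.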

HuggingFace's Transformers: State-of-the-art Natural Language Processing

Author: Chaumond Julien
Cistac Pierric
Davison Joe
Debut Lysandre
Delangue Clement
Drame Mariama
Funtowicz Morgan
Gugger Sylvain
Jernite Yacine
Lhoest Quentin
Louf Rémi
Ma Clara
Moi Anthony
Plu Julien
Rault Tim
Rush Alexander M.
Sanh Victor
Scao Teven Le
Shleifer Sam
von Platen Patrick
Wolf Thomas
Xu Canwen
Publication date: 13/07/2020

Recent progress in natural language processing has been driven by advances in both model architecture and model pretraining. Transformer architectures have facilitated building higher-capacity models and pretraining has made it possible to effectively utilize this capacity for a wide variety of tasks. Transformers is an open-source library with the goal of opening up these advances to the wider machine learning community. The library consists of carefully engineered state-of-the-art Transformer architectures under a unified API. Backing this library is a curated collection of pretrained models made by and available for the community. Transformers is designed to be extensible by researchers, simple for practitioners, and fast and robust in industrial deployments. The library is available at https://github.com/huggingface/transformers.

Comment: 8 pages, 4 figures, more details at https://github.com/huggingface/transformer

arXiv.org e-Print Archive

Expectations of Returns and Expected Returns

Author: Adi Sunderam
Andrei Shleifer
Annette Vissing
Jared Dourdeville
Joshua Schwartzstein
Owen Lamont
Robin Greenwood
Sam Hanson
Stefan Nagel
Publication date: 01/01/2013

We analyze time-series of investor expectations of future stock market returns from six data sources between 1963 and 2011. The six measures of expectations are highly positively correlated with each other, as well as with past stock returns and with the level of the stock market. However, investor expectations are strongly negatively correlated with model-based expected returns. We reconcile the evidence by calibrating a simple behavioral model, in which fundamental traders require a premium to accommodate expectations shocks from extrapolative traders, but markets are not efficient.

CiteSeerX

Why Did Holdings of Highly-Rated Securitization Tranches Differ So Much Across Banks?

Author: Andrei Shleifer
David Hirshleifer
George Pennachi
Isil Erel
Philip Strahan
René M. Stulz
Sam Hanson
Taylor Nadauld

We provide estimates of holdings of highly-rated securitization tranches of U.S. bank holding companies ahead of the credit crisis and evaluate hypotheses that have been advanced to explain these holdings. Our broadest estimates include CDOs as well as holdings in off-balance-sheet conduits. While holdings exceeded Tier 1 capital for some large banks, they were economically trivial for the typical U.S. bank. The banks with high holdings were not riskier before the crisis using conventional measures, but their performance was poorer during the crisis. We find that holdings of highly-rated tranches are correlated with a bank’s securitization activity. Theories of highly-rated tranches that are unrelated to a bank’s securitization activity, such as “bad incentives,” “bad governance,” or “bad risk management,” have no support in the data.

CiteSeerX

Pseudo-market timing and predictive regressions, Harvard University working paper

Author: Andrei Shleifer
Jeffrey Wurgler
Jeremy Stein
Jim Stock
Malcolm Baker
Rob Stambaugh
Ryan Taliaferro
Sam Thompson
Tuomo Vuolteenaho

A number of studies claim that aggregate managerial decision variables, such as aggregate equity issuance, have power to predict stock or bond market returns. Recent research argues that these results may be driven by an aggregate time-series version of Schultz’s (2003) pseudo market timing bias. We use standard simulation techniques to estimate the size of the aggregate pseudo market timing bias for a variety of predictive regressions based on managerial decision variables. The results show that the bias explains less than two percent of the predictive power of the equity share in new issues, and that it is also too small to overturn prior inferences about the predictive power of corporate investment plans, insider trading, dividend initiations, or the maturity of corporate debt issues.

CiteSeerX

Bad beta, good beta

Author: Andrei Shleifer
Christopher Polk
Jay Shanken
Luis Viceira
Matti Keloharju
Robert Hodrick
Sam Thompson
Tuomo Vuolteenaho
Y Cohen
Y. Campbell
Publication venue: Harvard University

This paper explains the size and value “anomalies” in stock returns using an economically motivated two-beta model. We break the beta of a stock with the market portfolio into two components, one reflecting news about the market’s future cash flows and one reflecting news about the market’s discount rates. Intertemporal asset pricing theory suggests that the former should have a higher price of risk; thus beta, like cholesterol, comes in “bad” and “good” varieties. Empirically, we find that value stocks and small stocks have considerably higher cash-flow betas than growth stocks and large stocks, and this can explain their higher average returns. The poor performance of the capital asset pricing model (CAPM) since 1963 is explained by the fact that growth stocks and high-past-beta stocks have predominantly good betas with low risk prices. (JEL G12, G14, N22)

How should a rational investor measure the risks of stock market investments? What determines the risk premium that will induce a rational investor to hold an individual stock at its market weight, rather than overweighting or underweighting it? According to the CAPM of William Sharpe (1964) and John Lintner (1965), a stock’s risk is summarized by its beta with the market portfolio of all invested wealth. Controlling for beta, no other characteristics of a stock should influence the return required by a rational investor. It is well known that the CAPM fails to describe average realized stock returns since the early 1960s, if a value-weighted equity index is used as a proxy for the market portfolio.

CiteSeerX
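
The two-beta decomposition described in the abstract above can be written out as follows. The notation is an assumption chosen for illustration (the listing gives no formulas): $N_{CF}$ and $N_{DR}$ denote the market's cash-flow news and discount-rate news, and the unexpected market return is taken to be $N_{CF,t} - N_{DR,t}$.

```latex
% Assumed notation: r_{i,t} is the return on stock i, r^{e}_{M,t} the market excess return.
\begin{align}
  \beta_{i,CF} &= \frac{\operatorname{Cov}\left(r_{i,t},\, N_{CF,t}\right)}
                       {\operatorname{Var}\left(r^{e}_{M,t} - E_{t-1}\, r^{e}_{M,t}\right)}, \\
  \beta_{i,DR} &= \frac{\operatorname{Cov}\left(r_{i,t},\, -N_{DR,t}\right)}
                       {\operatorname{Var}\left(r^{e}_{M,t} - E_{t-1}\, r^{e}_{M,t}\right)}, \\
  \beta_{i,M}  &= \beta_{i,CF} + \beta_{i,DR}.
\end{align}
```

Under this notation the cash-flow beta $\beta_{i,CF}$ is the "bad" beta with the higher price of risk, and the discount-rate beta $\beta_{i,DR}$ is the "good" one.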

Predicting Returns with Managerial Decision Variables: Is there a Small Sample Bias

Author: Andrei Shleifer
Jay Ritter
Jeffrey Wurgler
Jeremy Stein
Jim Stock
Malcolm Baker
Rob Stambaugh
Ryan Taliaferro
Sam Thompson
Tuomo Vuolteenaho

Many studies find that aggregate managerial decision variables, such as aggregate equity issuance, predict stock or bond market returns. Recent research argues that these findings may be driven by an aggregate time-series version of Schultz’s (2003) pseudo market-timing bias. Using standard simulation techniques, we find that the bias is much too small to account for the observed predictive power of the equity share in new issues, corporate investment plans, insider trading, dividend initiations, or the maturity of corporate debt issues.

CiteSeerX

Eastern Universities Development Conference and the MIT Development Economics and Organizational Economics

Author: Abhijit Banerjee
Amiya Bagchi
Andrei Shleifer
Esther Duflo
Karla Hoff
Lakshmi Iyer
Maitreesh Ghatak
Raghuram Rajan
Sam Bowles
Daron Acemoglu
Publication date: 01/01/2002

Do historical institutions have a persistent impact on economic performance? We analyze the colonial institutions set up by the British to collect land revenue in India, and show that differences in historical property rights institutions lead to sustained differences in economic outcomes. Areas in which proprietary rights in land were historically given to landlords have significantly lower agricultural investments, agricultural productivity and investments in public goods in the post-Independence period than areas in which these rights were given to the cultivators. We verify that these differences are not driven by omitted variables or endogeneity of the historical institutions, and argue that they probably arise because differences in institutions lead to very different policy choices.

CiteSeerX

Politics and Local Economic Growth: Evidence from India

Author: Andrei Shleifer
Devesh Kapur
Josh Angrist
Lakshmi Iyer
Lorenzo Casaburi
Paul Novosad
Ricardo Hausmann
Richard Hornbeck
Rohini P
Sam Asher
Sendhil Mullainathan

Does politics have an impact on local economic outcomes? Using a regression discontinuity design built around close elections in India from 1990 to 2005, we examine the local economic effects of one form of political favoritism: the benefit of having a local politician who is aligned with the party in control of the state government. We show that private sector employment in politically aligned constituencies grows by 1.7 percentage points more per year than in non-aligned constituencies. We find no effect on government employment or supply of public infrastructure. Stock prices show 12-15% positive cumulative abnormal returns when an aligned candidate wins the constituency where a firm is headquartered, suggesting that political alignment is a net benefit to both local labor and capital. Finally, we use international survey data to classify industries by their dependence on (i) government bureaucracy, (ii) direct transfers in the form of procurement, and (iii) external finance. We find the effect of political alignment is largest in industries that depend most on government officials, with no significant effect of dependence on credit or procurement. The results suggest that state politicians can control the enforcement of regulation, with important economic consequences.

CiteSeerX