    Specialization and Regulation: The Rise of Professionals and the Emergence of Occupational Licensing Regulation

    This paper explores the origins and effects of occupational licensing regulation in late nineteenth- and early twentieth-century America. Was licensing regulation introduced to limit competition in the market for professional services at the expense of efficiency? Or was licensing adopted to reduce informational asymmetries about professional quality? To investigate these hypotheses, we analyze the determinants of licensing legislation and the effect of licensing on entry into eleven occupations. We also examine the impact of medical licensing laws on entry into the medical profession, physician earnings, mortality rates, and the incidence of medical malpractice. We believe that, at least for the Progressive Era, the evidence is more consistent with the asymmetric information hypothesis than with the industry capture hypothesis.

    The Determinants of Progressive Era Reform: The Pure Food and Drugs Act of 1906

    We examine three theories of Progressive Era regulation: public interest, industry capture, and information manipulation by the federal bureaucracy and muckraking press. Based on analysis of qualitative legislative histories and econometric evidence, we argue that the adoption of the 1906 Pure Food and Drugs Act was due to all three factors. Select producer groups sought regulation to tilt the competitive playing field to their advantage. Progressive reform interests desired regulation to reduce uncertainty about food and drug quality. Additionally, rent-seeking by the muckraking press and its bureaucratic allies played a key role in the timing of the legislation. We also find that because the interests behind regulation could not shape the enforcing agency or the legal environment in which enforcement took place, these groups did not ultimately benefit from regulation in the ways originally anticipated.

    Medical Licensing Board Characteristics and Physician Discipline: An Empirical Analysis

    This paper investigates the relationship between the characteristics of medical licensing boards and the frequency with which boards discipline physicians. Specifically, we take advantage of variation in the structure of medical licensing boards between 1993 and 2003 to determine the effect of organizational and budgetary independence, public oversight, and resource constraints on rates of physician discipline. We find that larger licensing boards, boards with more staff, and boards that are organizationally independent from state government discipline doctors more frequently. Public oversight and political control over board budgets do not appear to influence the extent to which medical licensing boards discipline doctors. These findings are broadly consistent with theories of regulatory behavior that emphasize the importance of bureaucratic autonomy for effective regulatory enforcement.

    The Political Economy of "Truth-in-Advertising" Regulation During the Progressive Era

    This paper explores the origins and impact of "truth-in-advertising" regulation during the Progressive Era. Was advertising regulation adopted in response to rent-seeking on the part of firms that sought to limit the availability of advertising as a competitive device? Or was advertising regulation desired because it furnished a mechanism through which firms could improve the credibility of advertising? We find the available qualitative and quantitative evidence to be more consistent with the latter hypothesis.

    Low Budget Active Learning via Wasserstein Distance: An Integer Programming Approach

    Given restrictions on the availability of data, active learning is the process of training a model with limited labeled data by selecting a core subset of an unlabeled data pool to label. Although selecting the most useful points for training is an optimization problem, the scale of deep learning data sets forces most selection strategies to employ efficient heuristics. Instead, we propose a new integer optimization problem for selecting a core set that minimizes the discrete Wasserstein distance from the unlabeled pool. We demonstrate that this problem can be tractably solved with a Generalized Benders Decomposition algorithm. Our strategy requires high-quality latent features, which we obtain by unsupervised learning on the unlabeled pool. Numerical results on several data sets show that our optimization approach is competitive with baselines and particularly outperforms them in the low budget regime where less than one percent of the data set is labeled.
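
    The core-set objective described above can be illustrated on a toy problem. The following is a minimal sketch, assuming one-dimensional features, uniform weights on both the pool and the selected subset, and exhaustive enumeration in place of the Generalized Benders Decomposition algorithm named in the abstract. The function names `wasserstein_1d` and `best_core_set` are hypothetical and chosen only for this illustration.

```python
from collections import Counter
from itertools import combinations


def wasserstein_1d(xs, ys):
    """1-Wasserstein distance between two uniform empirical distributions on the line.

    In one dimension the distance equals the area between the two CDFs.
    """
    count_x, count_y = Counter(xs), Counter(ys)
    support = sorted(set(count_x) | set(count_y))
    cdf_x = cdf_y = 0.0
    total = 0.0
    for left, right in zip(support, support[1:]):
        cdf_x += count_x[left] / len(xs)
        cdf_y += count_y[left] / len(ys)
        total += abs(cdf_x - cdf_y) * (right - left)
    return total


def best_core_set(pool, budget):
    """Brute-force search for the subset of size `budget` closest to the pool.

    Assumption: enumeration is only feasible for tiny pools; it stands in
    for the integer programming approach described in the abstract.
    """
    return min(combinations(pool, budget),
               key=lambda subset: wasserstein_1d(subset, pool))


pool = [0.0, 1.0, 2.0, 10.0, 11.0, 12.0]
core = best_core_set(pool, 2)
print(core, wasserstein_1d(core, pool))
```

    On this pool with two well-separated clusters, the search picks one point from the middle of each cluster, which is the behavior a distribution-matching objective is meant to encourage.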

    Graph Metanetworks for Processing Diverse Neural Architectures

    Neural networks efficiently encode learned information within their parameters. Consequently, many tasks can be unified by treating neural networks themselves as input data. When doing so, recent studies have demonstrated the importance of accounting for the symmetries and geometry of parameter spaces. However, those works developed architectures tailored to specific networks such as MLPs and CNNs without normalization layers, and generalizing such architectures to other types of networks can be challenging. In this work, we overcome these challenges by building new metanetworks, that is, neural networks that take weights from other neural networks as input. Put simply, we carefully build graphs representing the input neural networks and process the graphs using graph neural networks. Our approach, Graph Metanetworks (GMNs), generalizes to neural architectures where competing methods struggle, such as multi-head attention layers, normalization layers, convolutional layers, ResNet blocks, and group-equivariant linear layers. We prove that GMNs are expressive and equivariant to parameter permutation symmetries that leave the input neural network functions unchanged. We validate the effectiveness of our method on several metanetwork tasks over diverse neural network architectures. Comment: 29 pages. v2 updated experimental results and detail
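
    The idea of representing an input network as a graph can be sketched for the simplest case. The following is a minimal illustration, assuming a bias-free MLP given as nested lists, with one node per neuron and one edge per weight. The abstract does not spell out the graph construction, so this is not the authors' construction, and the function name `mlp_to_graph` is hypothetical.

```python
def mlp_to_graph(weight_matrices):
    """Turn a bias-free MLP into (nodes, edges).

    Assumption: weight_matrices[l][j][i] is the weight from neuron i in
    layer l to neuron j in layer l + 1. Nodes are (layer, index) pairs and
    edges map (source_id, target_id) to the weight value.
    """
    layer_sizes = [len(weight_matrices[0][0])] + [len(w) for w in weight_matrices]

    # offsets[l] is the global node id of the first neuron in layer l.
    offsets = [0]
    for size in layer_sizes:
        offsets.append(offsets[-1] + size)

    nodes = [(layer, idx)
             for layer, size in enumerate(layer_sizes)
             for idx in range(size)]

    edges = {}
    for layer, matrix in enumerate(weight_matrices):
        for j, row in enumerate(matrix):
            for i, value in enumerate(row):
                edges[(offsets[layer] + i, offsets[layer + 1] + j)] = value
    return nodes, edges


# A 2-3-1 MLP: the first matrix is 3x2, the second is 1x3.
weights = [
    [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]],
    [[7.0, 8.0, 9.0]],
]
nodes, edges = mlp_to_graph(weights)
print(len(nodes), len(edges))
```

    Under this encoding, permuting the hidden neurons only relabels nodes in the graph, which is why a model that treats node labels as interchangeable, such as a graph neural network, is a natural fit for respecting that symmetry.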

    Optimizing Data Collection for Machine Learning

    Modern deep learning systems require huge data sets to achieve impressive performance, but there is little guidance on how much or what kind of data to collect. Over-collecting data incurs unnecessary present costs, while under-collecting may incur future costs and delay workflows. We propose a new paradigm for modeling the data collection workflow as a formal optimal data collection problem that allows designers to specify performance targets, collection costs, a time horizon, and penalties for failing to meet the targets. Additionally, this formulation generalizes to tasks requiring multiple data sources, such as labeled and unlabeled data used in semi-supervised learning. To solve our problem, we develop Learn-Optimize-Collect (LOC), which minimizes expected future collection costs. Finally, we numerically compare our framework to the conventional baseline of estimating data requirements by extrapolating from neural scaling laws. We significantly reduce the risks of failing to meet desired performance targets on several classification, segmentation, and detection tasks, while maintaining low total collection costs. Comment: Accepted to NeurIPS 202
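
    The conventional baseline mentioned above, extrapolating from a scaling law, can be sketched as follows. This is a minimal illustration of the baseline only, not of LOC. It assumes the error follows a power law of the form error(n) = a * n^(-b) and fits it by least squares in log-log space. The function names and the synthetic measurements are hypothetical.

```python
import math


def fit_power_law(sizes, errors):
    """Fit error(n) = a * n**(-b) by linear regression on log-log data."""
    xs = [math.log(n) for n in sizes]
    ys = [math.log(e) for e in errors]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    intercept = mean_y - slope * mean_x
    return math.exp(intercept), -slope  # (a, b)


def required_size(a, b, target_error):
    """Invert the fitted law to estimate the data set size that hits the target."""
    return (a / target_error) ** (1.0 / b)


# Synthetic measurements generated from an exact power law (assumed for illustration).
sizes = [100, 200, 400, 800]
errors = [2.0 * n ** -0.5 for n in sizes]

a, b = fit_power_law(sizes, errors)
print(a, b, required_size(a, b, target_error=0.05))
```

    A point estimate of this kind carries no notion of collection cost or of the penalty for missing the target, which is the gap the optimization formulation in the abstract is meant to address.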