10 research outputs found

    The Lovász Hinge: A Novel Convex Surrogate for Submodular Losses

    Learning with non-modular losses is an important problem when sets of predictions are made simultaneously. The main tools for constructing convex surrogate loss functions for set prediction are margin rescaling and slack rescaling. In this work, we show that these strategies lead to tight convex surrogates iff the underlying loss function is increasing in the number of incorrect predictions. However, gradient or cutting-plane computation for these functions is NP-hard for non-supermodular loss functions. We propose instead a novel surrogate loss function for submodular losses, the Lovász hinge, which leads to O(p log p) complexity with O(p) oracle accesses to the loss function to compute a gradient or cutting-plane. We prove that the Lovász hinge is convex and yields an extension. As a result, we have developed the first tractable convex surrogates in the literature for submodular losses. We demonstrate the utility of this novel convex surrogate through several set prediction tasks, including on the PASCAL VOC and Microsoft COCO datasets.
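The O(p log p) cost with O(p) oracle calls quoted in the abstract above can be illustrated with a minimal sketch. The abstract does not give the formula, so this assumes one common formulation for an increasing submodular set loss: sort the margins in decreasing order, query the loss on the growing prefix sets, and weight each positive margin by the corresponding marginal gain. The names `lovasz_hinge`, `set_loss`, and `sqrt_cardinality_loss` are hypothetical and chosen only for illustration.

```python
import math
from typing import Callable, FrozenSet, List, Sequence, Tuple


def lovasz_hinge(
    margins: Sequence[float],
    set_loss: Callable[[FrozenSet[int]], float],
) -> Tuple[float, List[float]]:
    """Sketch of a Lovász-hinge value and a subgradient w.r.t. the margins.

    Assumptions (not taken from the abstract):
      * margins[i] = 1 - y_i * g_i for label y_i in {-1, +1} and score g_i;
      * set_loss maps a set of mispredicted indices to a loss, is submodular
        and increasing, and satisfies set_loss(frozenset()) == 0.
    """
    # One sort: O(p log p).
    order = sorted(range(len(margins)), key=lambda i: margins[i], reverse=True)

    value = 0.0
    grad = [0.0] * len(margins)
    prefix = set()
    previous = float(set_loss(frozenset()))

    # One oracle call per element (plus one for the empty set above).
    for i in order:
        prefix.add(i)
        current = float(set_loss(frozenset(prefix)))
        gain = current - previous  # marginal gain of adding index i
        previous = current
        if margins[i] > 0:  # componentwise hinge: only violated margins count
            value += margins[i] * gain
            grad[i] = gain
    return value, grad


def sqrt_cardinality_loss(errors: FrozenSet[int]) -> float:
    """Illustrative submodular, increasing loss: concave in the error count."""
    return math.sqrt(len(errors))


value, grad = lovasz_hinge([0.5, 2.0, -1.0], sqrt_cardinality_loss)
print(value, grad)
```

With a modular loss such as `len`, the sketch reduces to the sum of ordinary hinge terms, and with 0/1 margins it returns the set loss itself, which is the sense in which the surrogate is an extension.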

    A Convex Surrogate Operator for General Non-Modular Loss Functions

    Empirical risk minimization frequently employs convex surrogates to underlying discrete loss functions in order to achieve computational tractability during optimization. However, classical convex surrogates can only tightly bound modular loss functions, submodular functions or supermodular functions separately while maintaining polynomial time computation. In this work, a novel generic convex surrogate for general non-modular loss functions is introduced, which provides for the first time a tractable solution for loss functions that are neither supermodular nor submodular. This convex surrogate is based on a submodular-supermodular decomposition whose existence and uniqueness are proven in this paper. It takes the sum of two convex surrogates that separately bound the supermodular component and the submodular component using slack-rescaling and the Lovász hinge, respectively. It is further proven that this surrogate is convex, piecewise linear, an extension of the loss function, and admits polynomial-time subgradient computation. Empirical results are reported on a non-submodular loss based on the Sørensen-Dice difference function, and a real-world face track dataset with tens of thousands of frames, demonstrating the improved performance, efficiency, and scalability of the novel convex surrogate.

    Hinge-Loss Markov Random Fields and Probabilistic Soft Logic: A Scalable Approach to Structured Prediction

    A fundamental challenge in developing impactful artificial intelligence technologies is balancing the ability to model rich, structured domains with the ability to scale to big data. Many important problem areas are both richly structured and large scale, from social and biological networks, to knowledge graphs and the Web, to images, video, and natural language. In this thesis I introduce two new formalisms for modeling structured data, distinguished from previous approaches by their ability to both capture rich structure and scale to big data. The first, hinge-loss Markov random fields (HL-MRFs), is a new kind of probabilistic graphical model that generalizes different approaches to convex inference. I unite three views of inference from the randomized algorithms, probabilistic graphical models, and fuzzy logic communities, showing that all three views lead to the same inference objective. I then derive HL-MRFs by generalizing this unified objective. The second new formalism, probabilistic soft logic (PSL), is a probabilistic programming language that makes HL-MRFs easy to define, refine, and reuse for relational data. PSL uses a syntax based on first-order logic to compactly specify complex models. I next introduce an algorithm for inferring most-probable variable assignments (MAP inference) for HL-MRFs that is extremely scalable, much more so than commercially available software, because it uses message passing to leverage the sparse dependency structures common in inference tasks. I then show how to learn the parameters of HL-MRFs using a number of learning objectives. The learned HL-MRFs are as accurate as traditional, discrete models, but much more scalable. To enable HL-MRFs and PSL to capture even richer dependencies, I then extend learning to support latent variables, i.e., variables without training labels. 
To overcome the bottleneck of repeated inferences required during learning, I introduce paired-dual learning, which interleaves inference and parameter updates. Paired-dual learning learns accurate models and is also scalable, often completing before traditional methods make even one parameter update. Together, these algorithms enable HL-MRFs and PSL to model rich, structured data at scales not previously possible.
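As a rough illustration of the hinge-loss potentials named in the abstract above, the sketch below evaluates an energy of the form sum_j w_j * max(l_j(y), 0)^p_j over variables in [0, 1], where each l_j is linear. The example rule, its Łukasiewicz-style relaxation, the weight, and all names (`hinge_potential`, `energy`, `Friends`, `VotesA`, `VotesB`) are assumptions made for illustration; this is not PSL syntax or its API.

```python
from typing import Dict, List, Tuple

# A potential is (weight, coefficients, offset, power); all names are hypothetical.
Potential = Tuple[float, Dict[str, float], float, int]


def hinge_potential(
    coeffs: Dict[str, float], offset: float, y: Dict[str, float], power: int = 1
) -> float:
    """max(linear(y), 0) ** power for a linear function of soft truth values."""
    linear = offset + sum(c * y[name] for name, c in coeffs.items())
    return max(linear, 0.0) ** power


def energy(potentials: List[Potential], y: Dict[str, float]) -> float:
    """Weighted sum of hinge potentials; MAP inference would minimize this over y."""
    return sum(w * hinge_potential(c, b, y, p) for (w, c, b, p) in potentials)


# Assumed soft rule "Friends & VotesA -> VotesB", relaxed to the distance to
# satisfaction max(0, Friends + VotesA - 1 - VotesB).
rule: Potential = (2.0, {"Friends": 1.0, "VotesA": 1.0, "VotesB": -1.0}, -1.0, 1)

assignment = {"Friends": 1.0, "VotesA": 0.9, "VotesB": 0.3}
print(energy([rule], assignment))
```

Because every potential is a convex function of `y`, the energy is convex, which is what makes MAP inference a convex optimization problem in this model class.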

    Insurance and taxation

    Thesis (Ph. D.)--Massachusetts Institute of Technology, Dept. of Economics, 2010. Cataloged from PDF version of thesis. Includes bibliographical references (p. 135-144). Chapter 1 analyzes Pareto optimal non-linear taxation of profits and labor income in a private information economy with endogenous firm formation. Individuals differ in both their skill and their cost of setting up a firm, and choose between becoming workers and entrepreneurs. I show that a tax system in which entrepreneurial profits and labor income must be subject to the same non-linear tax schedule makes use of general equilibrium effects through wages to indirectly achieve redistribution between entrepreneurs and workers. As a result, constrained Pareto optimal policies can involve negative marginal tax rates at the top and, if available, input taxes that distort the firms' input choices. However, these properties disappear when a differential tax treatment of profits and labor income is possible, as for instance implemented by a corporate income tax. In this case, redistribution is achieved directly through the tax system rather than "trickle down" effects, and production efficiency is always optimal. When I extend the model to incorporate entrepreneurial borrowing in credit markets, I find that endogenous cross-subsidization in the credit market equilibrium results in excessive (insufficient) entry of low-skilled (high-skilled) agents into entrepreneurship. Even without redistributive objectives, this gives rise to an additional, corrective role for differential taxation of entrepreneurial profits and labor income. In particular, a regressive profit tax may restore the efficient occupational choice. In chapter 2, which is joint work with Nick Netzer, we show that, in the presence of a time-inconsistency problem with optimal agency contracts, competitive markets can implement allocations that Pareto dominate those achieved by a benevolent planner, and they induce more effort. 
In particular, we analyze a model with moral hazard and two-sided lack of commitment. After agents have chosen a hidden effort and the need to provide incentives has vanished, firms can modify their contracts and agents can switch firms, resulting in an adverse selection problem at the ex-post stage. As long as the ex-post market outcome satisfies a weak notion of competitiveness and sufficiently separates individuals who choose different effort levels, the market allocation is Pareto superior to a social planner's allocation with a complete breakdown of incentives. In addition, even when a planner without commitment is able to sustain effort incentives, competitive markets without commitment implement more effort in equilibrium under general conditions. We illustrate our findings with standard market equilibrium concepts. Chapter 3 studies Pareto-optimal risk-sharing arrangements in a private information economy with aggregate uncertainty and ex ante heterogeneous agents. I show that any such arrangement has to be such that ratios of expected inverse marginal utilities across different agents are independent of aggregate shocks. I use this condition to show how to implement Pareto-optima as equilibria when agents can trade claims to consumption contingent on aggregate shocks in financial markets. If aggregate shocks affect individual outputs only, the implementation of optimal allocations does not require interventions in financial markets. If they also affect probability distributions over idiosyncratic risk, however, transaction taxes need to be introduced that are higher for claims to consumption in states with a more volatile distribution of likelihood ratios in the sense of second-order stochastic dominance. Two implementation results are provided. If transaction taxes are constrained to be linear, they need to condition on individual outputs in addition to aggregate shocks. 
To prevent double-deviations, they induce additional risk for agents who buy financial claims and provide additional insurance to those who sell them. Finally, an implementation with non-linear transaction taxes that do not depend on idiosyncratic shocks is constructed. by Florian Scheuer. Ph.D.

    Uncertainty in Artificial Intelligence: Proceedings of the Thirty-Fourth Conference

    LIPIcs, Volume 274, ESA 2023, Complete Volume


    LIPIcs, Volume 261, ICALP 2023, Complete Volume
