thesis

Machine learning in the real world with multiple objectives

Abstract

Machine learning (ML) is ubiquitous in many real-world applications. Existing ML systems are based on optimizing a single quality metric such as prediction accuracy. These metrics typically do not fully align with real-world design constraints such as computation, latency, fairness, and acquisition costs that we encounter in real-world applications. In this thesis, we develop ML methods for optimizing prediction accuracy while accounting for such real-world constraints. In particular, we introduce multi-objective learning in two different setups: resource-efficient prediction and algorithmic fairness in language models. First, we focus on decreasing the test-time computational costs of prediction systems. Budget constraints arise in many machine learning problems. Computational costs limit the usage of many models on small devices such as IoT or mobile phones and increase the energy consumption in cloud computing. We design systems that allow on-the-fly modification of the prediction model for each input sample. These sample-adaptive systems allow us to leverage wide variability in sample complexity where we learn policies for selecting cheap models for low complexity instances and using descriptive models only for complex ones. We utilize multiple--objective approach where one minimizes the system cost while preserving predictive accuracy. We demonstrate significant speed-ups in the fields of computer vision, structured prediction, natural language processing, and deep learning. In the context of fairness, we first demonstrate that a naive application of ML methods runs the risk of amplifying social biases present in data. This danger is particularly acute for methods based on word embeddings, which are increasingly gaining importance in many natural language processing applications of ML. We show that word embeddings trained on Google News articles exhibit female/male gender stereotypes. We demonstrate that geometrically, gender bias is captured by unique directions in the word embedding vector space. To remove bias we formulate a empirical risk objective with fairness constraints to remove stereotypes from embeddings while maintaining desired associations. Using crowd-worker evaluation as well as standard benchmarks, we empirically demonstrate that our algorithms significantly reduces gender bias in embeddings, while preserving its useful properties such as the ability to cluster related concepts

    Similar works