
    Environmental adaptation and differential replication in machine learning

    When deployed in the wild, machine learning models are usually confronted with an environment that imposes severe constraints. As this environment evolves, so do these constraints. As a result, the feasible set of solutions for the considered need is prone to change in time. We refer to this problem as that of environmental adaptation. In this paper, we formalize environmental adaptation and discuss how it differs from other problems in the literature. We propose solutions based on differential replication, a technique where the knowledge acquired by the deployed models is reused in specific ways to train more suitable future generations. We discuss different mechanisms to implement differential replication in practice, depending on the considered level of knowledge. Finally, we present seven examples where the problem of environmental adaptation can be solved through differential replication in real-life applications.
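The generational reuse described in this abstract can be sketched in code. The following is a minimal illustration, not the paper's actual procedure: it assumes a scikit-learn setting in which the deployed model can only be queried for predictions, and the Gaussian sampling scheme, dataset, and model choices are all illustrative assumptions.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)

# Generation 0: the deployed model, trained on data we later lose access to.
X, y = make_classification(n_samples=1000, n_features=5, random_state=0)
deployed = RandomForestClassifier(random_state=0).fit(X, y)

# The environment changes: the original data can no longer be used, but the
# deployed model can still be queried. Sample synthetic points and label
# them with the predecessor's predictions.
X_synth = rng.normal(size=(5000, 5))
y_synth = deployed.predict(X_synth)

# Generation 1: a successor from a different hypothesis space, trained
# only on the knowledge extracted from its predecessor.
successor = DecisionTreeClassifier(max_depth=5, random_state=0)
successor.fit(X_synth, y_synth)

# Agreement between generations on fresh query points.
X_new = rng.normal(size=(1000, 5))
agreement = (successor.predict(X_new) == deployed.predict(X_new)).mean()
print(f"generation agreement: {agreement:.2f}")
```

The successor here deliberately lives in a different hypothesis space (a shallow tree rather than a forest), which is what lets the new generation satisfy constraints the original could not.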

    Differential Replication for Credit Scoring in Regulated Environments

    Differential replication is a method to adapt existing machine learning solutions to the demands of highly regulated environments by reusing knowledge from one generation to the next. Copying is a technique that allows differential replication by projecting a given classifier onto a new hypothesis space, in circumstances where access to both the original solution and its training data is limited. The resulting model replicates the original decision behavior while displaying new features and characteristics. In this paper, we apply this approach to a use case in the context of credit scoring. We use a private residential mortgage default dataset. We show that differential replication through copying can be exploited to adapt a given solution to the changing demands of a constrained environment such as that of the financial market. In particular, we show how copying can be used to replicate the decision behavior not only of a model, but also of a full pipeline. As a result, we can ensure the decomposability of the attributes used to provide explanations for credit scoring models and reduce the time-to-market delivery of these solutions.
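Copying a full pipeline, as described in this abstract, can be sketched as follows: the preprocessing steps and the model are queried jointly as a single black box, so the copy operates directly on the raw attributes and explanations stay decomposable. The pipeline components, the uniform sampling over attribute ranges, and the decision-tree copy are illustrative assumptions, not the paper's actual setup (which uses a private mortgage dataset).

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(1)

# A full scoring pipeline: preprocessing plus model, treated as one black box.
X, y = make_classification(n_samples=1000, n_features=4, random_state=1)
pipeline = make_pipeline(
    StandardScaler(),
    GradientBoostingClassifier(random_state=1),
).fit(X, y)

# Copying: query the whole pipeline on synthetic raw inputs, so the copy
# replicates preprocessing and model jointly.
X_synth = rng.uniform(X.min(axis=0), X.max(axis=0), size=(5000, 4))
y_synth = pipeline.predict(X_synth)

# The copy is a single interpretable model over the raw attributes.
copy = DecisionTreeClassifier(max_depth=4, random_state=1)
copy.fit(X_synth, y_synth)

fidelity = (copy.predict(X_synth) == y_synth).mean()
print(f"copy fidelity on the synthetic set: {fidelity:.2f}")
```

Because the copy never sees the scaler or the boosted model separately, its splits are expressed in the original, un-transformed attributes, which is the decomposability property the abstract refers to.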

    Risk mitigation in algorithmic accountability: The role of machine learning copies

    Machine learning plays an increasingly important role in our society and economy and is already having an impact on our daily life in many different ways. From several perspectives, machine learning is seen as the new engine of productivity and economic growth. It can increase business efficiency, improve decision-making processes, and spawn new products and services built on complex machine learning algorithms. In this scenario, the lack of actionable accountability-related guidance is potentially the single most important challenge facing the machine learning community. Machine learning systems are often composed of many parts and ingredients, mixing third-party components or software-as-a-service APIs, among others. In this paper we study the role of copies for risk mitigation in such machine learning systems. Formally, a copy can be regarded as an approximated projection operator of a model into a target model hypothesis set. Under the conceptual framework of actionable accountability, we explore the use of copies as a viable alternative in circumstances where models cannot be re-trained, nor enhanced by means of a wrapper. We use a real residential mortgage default dataset as a use case to illustrate the feasibility of this approach.
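Viewing a copy as an approximated projection suggests quantifying how far it falls from the original in terms of disagreement on queries. As one illustrative risk check, not a method taken from the paper, the empirical disagreement on i.i.d. query points can be turned into a high-probability upper bound via Hoeffding's inequality; the simulated predictions below are likewise an assumption for demonstration.

```python
import numpy as np

def disagreement_bound(original_preds, copy_preds, delta=0.05):
    """Empirical disagreement between copy and original, plus a Hoeffding
    upper bound on the true disagreement rate: with n i.i.d. query points,
    the true rate exceeds d_hat + sqrt(ln(1/delta) / (2n)) with probability
    at most delta."""
    n = len(original_preds)
    d_hat = np.mean(original_preds != copy_preds)
    return d_hat, d_hat + np.sqrt(np.log(1.0 / delta) / (2.0 * n))

rng = np.random.default_rng(2)
orig = rng.integers(0, 2, size=10_000)
# A simulated copy that flips roughly 3% of the original's decisions.
copy = np.where(rng.random(10_000) < 0.03, 1 - orig, orig)

d_hat, d_up = disagreement_bound(orig, copy)
print(f"empirical disagreement {d_hat:.3f}, 95% upper bound {d_up:.3f}")
```

A bound of this kind gives the accountable party a certificate that the copy's behavior stays within a stated distance of the deployed model, which is the kind of actionable evidence risk mitigation requires.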

    Adapting by copying. Towards a sustainable machine learning

    Despite the rapid growth of machine learning in the past decades, deploying automated decision-making systems in practice remains a challenge for most companies. On an average day, data scientists face substantial barriers to putting models into production. Production environments are complex ecosystems, still largely based on on-premise technology, where modifications are time-consuming and costly. Given the rapid pace with which the machine learning environment changes these days, companies struggle to stay up to date with the latest software releases, the changes in regulation and the newest market trends. As a result, machine learning often fails to deliver according to expectations. And more worryingly, this can result in unwanted risks for users, for the company itself and even for society as a whole, insofar as the negative impact of these risks is perpetuated over time. In this context, adaptation is an instrument that is both necessary and crucial for ensuring a sustainable deployment of industrial machine learning. This dissertation is devoted to developing theoretical and practical tools to enable adaptation of machine learning models in company production environments. More precisely, we focus on devising mechanisms to exploit the knowledge acquired by models to train future generations that are better fit to meet the stringent demands of a changing ecosystem. We introduce copying as a mechanism to replicate the decision behaviour of a model using another that presents differential characteristics, in cases where access to both the models and their training data is restricted. We discuss the theoretical implications of this methodology and show how it can be performed and evaluated in practice.
Under the conceptual framework of actionable accountability, we also explore how copying can be used to ensure risk mitigation in circumstances where deployment of a machine learning solution results in a negative impact on individuals or organizations.