5,603 research outputs found

    iFair: Learning Individually Fair Data Representations for Algorithmic Decision Making

    People are rated and ranked, towards algorithmic decision making in an increasing number of applications, typically based on machine learning. Research on how to incorporate fairness into such tasks has prevalently pursued the paradigm of group fairness: giving adequate success rates to specifically protected groups. In contrast, the alternative paradigm of individual fairness has received relatively little attention, and this paper advances this less explored direction. The paper introduces a method for probabilistically mapping user records into a low-rank representation that reconciles individual fairness and the utility of classifiers and rankings in downstream applications. Our notion of individual fairness requires that users who are similar in all task-relevant attributes such as job qualification, and disregarding all potentially discriminating attributes such as gender, should have similar outcomes. We demonstrate the versatility of our method by applying it to classification and learning-to-rank tasks on a variety of real-world datasets. Our experiments show substantial improvements over the best prior work for this setting. Comment: Accepted at ICDE 2019. Please cite the ICDE 2019 proceedings version.

    Big-Data-Driven Materials Science and its FAIR Data Infrastructure

    This chapter addresses the fourth paradigm of materials research -- big-data driven materials science. Its concepts and state-of-the-art are described, and its challenges and chances are discussed. For furthering the field, Open Data and an all-embracing sharing, an efficient data infrastructure, and the rich ecosystem of computer codes used in the community are of critical importance. For shaping this fourth paradigm and contributing to the development or discovery of improved and novel materials, data must be what is now called FAIR -- Findable, Accessible, Interoperable and Re-purposable/Re-usable. This sets the stage for advances of methods from artificial intelligence that operate on large data sets to find trends and patterns that cannot be obtained from individual calculations and not even directly from high-throughput studies. Recent progress is reviewed and demonstrated, and the chapter is concluded by a forward-looking perspective, addressing important not yet solved challenges. Comment: submitted to the Handbook of Materials Modeling (eds. S. Yip and W. Andreoni), Springer 2018/201

    iFair: Learning Individually Fair Data Representations for Algorithmic Decision Making

    People are rated and ranked, towards algorithmic decision making in an increasing number of applications, typically based on machine learning. Research on how to incorporate fairness into such tasks has prevalently pursued the paradigm of group fairness: ensuring that each ethnic or social group receives its fair share in the outcome of classifiers and rankings. In contrast, the alternative paradigm of individual fairness has received relatively little attention. This paper introduces a method for probabilistically clustering user records into a low-rank representation that captures individual fairness yet also achieves high accuracy in classification and regression models. Our notion of individual fairness requires that users who are similar in all task-relevant attributes such as job qualification, and disregarding all potentially discriminating attributes such as gender, should have similar outcomes. Since the case for fairness is ubiquitous across many tasks, we aim to learn general representations that can be applied to arbitrary downstream use-cases. We demonstrate the versatility of our method by applying it to classification and learning-to-rank tasks on two real-world datasets. Our experiments show substantial improvements over the best prior work for this setting

    Licensing FAIR data for reuse

    The last letter of the FAIR acronym stands for Reusability. Data and metadata should be made available with a clear and accessible usage license. But what are the choices? How can researchers share data and allow reusability? Are all the licenses available for sharing content suitable for data? Data can be covered by different layers of copyright protection, making the relationship between data and copyright particularly complex. Some research data can be considered as a work and therefore covered by full copyright, while other data can be in the public domain due to their lack of originality. Moreover, a collection of data can be protected by special rights in Europe to acknowledge the investment in time and money in obtaining, presenting, arranging or verifying the data. The need to use a license when sharing data comes from the fact that, under current copyright laws, when rights exist, the absence of any legal notice must be understood as the default “all rights reserved” regime. Unless an exception applies, the authorisation of right holders is necessary for reuse. Right holders could use any text to state the reusability of data, but it is advisable to use one of the existing licenses, especially those suitable for data and databases. We hope that with this paper we can bring some clarity in relation to the rights involved when sharing research data

    Fair Data: History and Present Context

    In this paper, we discuss FAIR Data, why it exists, and who it applies to. We further review the principles of FAIR data and how they are managed in research centers. We also discuss the types of problems that researchers encounter, and what an information professional can do to assist them. At present, the vast majority of centers subscribe to the FAIR principles. However, both center and researcher face the arduous task of understanding, managing, and implementing the model. They must know data formats and standards. For a correct description and to facilitate data retrieval and interoperability, they must know about different types of metadata schemas. They must know about digital preservation and specific aspects of knowledge and information management. In addition, there are also ethical issues, intellectual property, and cultural differences. All these controversies translate into extra workload for researchers, who only get a return in the form of citations. It is critical to note that these information professionals can play a key role in the proper management of research data, and can help achieve the objectives described in the principles: making data findable, accessible, interoperable, and reusable

    Exploiting FAIR Data to Enhance Data Analysis

    In times of continuously increasing data-intensive research, good management of data becomes indispensable. This talk outlines some technologies, and Coscine as a tool, that are available to researchers to manage their data according to the FAIR principles and to make the best use of it. NFDI-MatWerk is funded as part of the National Research Data Infrastructure (NFDI) following a recommendation of the German Joint Science Conference (GWK). The funding is provided by the Federal Government and the Heads of Government of the Länder and managed by the German Research Foundation (DFG) - project number 460247524

    AgroFIMS v.1.0 - User manual

    The Agronomy Field Information Management System (AgroFIMS) has been developed on CGIAR’s HIDAP (Highly Interactive Data Analysis Platform) created by CGIAR’s International Potato Center, CIP. AgroFIMS draws fully on ontologies, particularly the Agronomy Ontology (AgrO). It consists of modules that represent the typical cycle of operations in agronomic trial management (seeding, weeding, fertilization, harvest, and more) and enables the creation of data collection sheets using the same ontology-based set of variables, terminology, units and protocols. AgroFIMS therefore enables a priori harmonization with metadata and data interoperability standards and adherence to the FAIR Data Principles essential for data reuse and increasingly, for compliance with funder mandates - without any extra work for researchers. AgroFIMS is therefore of value to anyone (scientist, researcher, agronomist, etc.) who wishes to easily design a standards-compliant agronomic research fieldbook following the FAIR Data Principles. AgroFIMS also allows users to collect data electronically in the field, thereby reducing errors. Currently this is restricted to the KDSmart Android platform, but we expect to enable this capability with other platforms such as the Open Data Kit (ODK) and Field Book in v.2.0. Once data is collected using KDSmart, the data can be uploaded back to AgroFIMS for data validation, statistical analysis, and the generation of statistical analysis reports. V.2.0 will allow easy upload of the data from AgroFIMS to an institutional or compliant repository of the user’s choice

    A funder's perspective on data management, FAIR data and Open Access

    Presentation on the perspective of Dutch health funder ZonMW on data management, FAIR data and Open Access for the Health-RI FAIR data stewards basics course, in Utrecht, on 18 June 202

    Towards FAIR Data in Heterogeneous Catalysis Research

    This poster elaborates on the challenges of developing a database for heterogeneous catalysis research and FAIRmat’s effort in tailoring NOMAD schemas and search interface for catalysis researchers, as well as demonstrating first steps of an ontology development for catalysis. The poster was presented at the 1st Conference on Research Data Infrastructure (CoRDI 2023), which took place on September 12 - 14, 2023, in Karlsruhe, Germany. Funding statement: FAIRmat is a consortium of the German National Research Data Infrastructure (NFDI) funded by the Deutsche Forschungsgemeinschaft (DFG, German Research Foundation) – project 460197019