244 research outputs found

    Improving fairness in machine learning systems: What do industry practitioners need?

    The potential for machine learning (ML) systems to amplify social inequities and unfairness is receiving increasing popular and academic attention. A surge of recent work has focused on the development of algorithmic tools to assess and mitigate such unfairness. If these tools are to have a positive impact on industry practice, however, it is crucial that their design be informed by an understanding of real-world needs. Through 35 semi-structured interviews and an anonymous survey of 267 ML practitioners, we conduct the first systematic investigation of commercial product teams' challenges and needs for support in developing fairer ML systems. We identify areas of alignment and disconnect between the challenges faced by industry practitioners and solutions proposed in the fair ML research literature. Based on these findings, we highlight directions for future ML and HCI research that will better address industry practitioners' needs. Comment: To appear in the 2019 ACM CHI Conference on Human Factors in Computing Systems (CHI 2019).

    World citation and collaboration networks: uncovering the role of geography in science

    Modern information and communication technologies, especially the Internet, have diminished the role of spatial distances and territorial boundaries on the access and transmissibility of information. This has enabled closer collaboration and greater internationalization among scientists. Nevertheless, geography remains an important factor affecting the dynamics of science. Here we present a systematic analysis of citation and collaboration networks between cities and countries, by assigning papers to the geographic locations of their authors' affiliations. The citation flows as well as the collaboration strengths between cities decrease with the distance between them and follow gravity laws. In addition, the total research impact of a country grows linearly with the amount of national funding for research & development. However, the average impact reveals a peculiar threshold effect: the scientific output of a country may reach an impact larger than the world average only if the country invests more than about 100,000 USD per researcher annually. Comment: Published version. 9 pages, 5 figures + Appendix. The world citation and collaboration networks at both city and country level are available at http://becs.aalto.fi/~rajkp/datasets.htm
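As a rough illustration of what a gravity law means for flows between cities, the sketch below computes a flow proportional to the product of two city "masses" and inversely proportional to a power of their distance. The function name, the choice of mass (for example, paper counts), and all exponent values are assumptions made for illustration; the abstract does not state the fitted form or parameters.

```python
def gravity_flow(mass_i, mass_j, distance_km, g=1.0, alpha=1.0, beta=1.0, gamma=1.0):
    """Generic gravity-law flow: g * mass_i**alpha * mass_j**beta / distance_km**gamma.

    All parameter defaults are hypothetical placeholders, not values from the paper.
    """
    if distance_km <= 0:
        raise ValueError("distance_km must be positive")
    return g * (mass_i ** alpha) * (mass_j ** beta) / (distance_km ** gamma)


# With gamma = 1, doubling the distance halves the predicted flow.
near = gravity_flow(100, 200, 50)
far = gravity_flow(100, 200, 100)
```

In practice the exponents would be estimated from data, for instance by a linear fit on log-transformed flows and distances.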

    Large-scale machine learning-based phenotyping significantly improves genomic discovery for optic nerve head morphology.

    Genome-wide association studies (GWASs) require accurate cohort phenotyping, but expert labeling can be costly, time intensive, and variable. Here, we develop a machine learning (ML) model to predict glaucomatous optic nerve head features from color fundus photographs. We used the model to predict vertical cup-to-disc ratio (VCDR), a diagnostic parameter and cardinal endophenotype for glaucoma, in 65,680 Europeans in the UK Biobank (UKB). A GWAS of ML-based VCDR identified 299 independent genome-wide significant (GWS; p ≤ 5 × 10⁻⁸) hits in 156 loci. The ML-based GWAS replicated 62 of 65 GWS loci from a recent VCDR GWAS in the UKB for which two ophthalmologists manually labeled images for 67,040 Europeans. The ML-based GWAS also identified 93 novel loci, significantly expanding our understanding of the genetic etiologies of glaucoma and VCDR. Pathway analyses support the biological significance of the novel hits to VCDR: select loci near genes involved in neuronal and synaptic biology or harboring variants are known to cause severe Mendelian ophthalmic disease. Finally, the ML-based GWAS results significantly improve polygenic prediction of VCDR and primary open-angle glaucoma in the independent EPIC-Norfolk cohort.

    International Migration of Doctors, and Its Impact on Availability of Psychiatrists in Low and Middle Income Countries

    Background: Migration of health professionals from low and middle income countries to rich countries is a large-scale and long-standing phenomenon, which is detrimental to the health systems in the donor countries. We sought to explore the extent of psychiatric migration. Methods: In our study, we use the respective professional databases in each country to establish the numbers of psychiatrists currently registered in the UK, US, New Zealand, and Australia who originate from other countries. We also estimate the impact of this migration on the psychiatrist population ratios in the donor countries. Findings: We document large numbers of psychiatrists currently registered in the UK, US, New Zealand and Australia originating from India (4687 psychiatrists), Pakistan (1158), Bangladesh (149), Nigeria (384), Egypt (484), Sri Lanka (142), and the Philippines (1593). For some countries of origin, the numbers of psychiatrists currently registered within high-income countries' professional databases are very small but very significant relative to the home workforce (e.g., 5 psychiatrists of Tanzanian origin are registered in the 4 high-income countries we studied, compared with the 15 psychiatrists currently registered in Tanzania). Without such emigration, many countries would have more than double the number of psychiatrists per 100,000 population (e.g. Bangladesh, Myanmar, Afghanistan, Egypt, Syria, Lebanon); and some countries would have had five to eight times more psychiatrists per 100,000 (e.g. Philippines, Pakistan, Sri Lanka, Liberia, Nigeria and Zambia). Conclusions: Large numbers of psychiatrists originating from key low and middle income countries are currently registered in the UK, US, New Zealand and Australia, with concomitant impact on the psychiatrist/population ratio in the originating countries. We suggest that creative international policy approaches are needed to ensure the individual migration rights of health professionals do not compromise societal population rights to health, and that there are public and fair agreements between countries within an internationally agreed framework. © 2010 Jenkins et al.

    Modeling Methane Adsorption in Interpenetrating Porous Polymer Networks

    Porous polymer networks (PPNs) are a class of porous materials of particular interest in a variety of energy-related applications because of their stability, high surface areas, and gas uptake capacities. Computationally derived structures for five recently synthesized PPN frameworks, PPN-2, -3, -4, -5, and -6, were generated for various topologies, optimized using semiempirical electronic structure methods, and evaluated using classical grand-canonical Monte Carlo simulations. We show that a key factor in modeling the methane uptake performance of these materials is whether, and how, these material frameworks interpenetrate and demonstrate a computational approach for predicting the presence, degree, and nature of interpenetration in PPNs that enables the reproduction of experimental adsorption data. © 2013 American Chemical Society

    HXE 108 - APPROACHES TO ENGLISH LITERATURE OCT 04.

    Recent years have witnessed a persistent interest in generating pseudo test collections, both for training and evaluation purposes. We describe a method for generating queries and relevance judgments for microblog search in an unsupervised way. Our starting point is this intuition: tweets with a hashtag are relevant to the topic covered by the hashtag and hence to a suitable query derived from the hashtag. Our baseline method selects all commonly used hashtags, and all associated tweets as relevance judgments; we then generate a query from these tweets. Next, we generate a timestamp for each query, allowing us to use temporal information in the training process. We then enrich the generation process with knowledge derived from an editorial test collection for microblog search. We use our pseudo test collections in two ways. First, we tune parameters of a variety of well-known retrieval methods on them. Correlations with parameter sweeps on an editorial test collection are high on average, with a large variance over retrieval algorithms. Second, we use the pseudo test collections as training sets in a learning-to-rank scenario. Performance close to training on an editorial test collection is achieved in many cases. Our results demonstrate the utility of tuning and training microblog search algorithms on automatically generated training material.
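The baseline idea in this abstract (commonly used hashtags define topics, and the tweets carrying them serve as relevance judgments) can be sketched roughly as follows. This is a minimal illustration, not the authors' method: the function name, the `min_tweets` threshold, and the frequency-based query generation are all assumptions, since the abstract does not say how queries are derived from the tweets. Timestamp generation is omitted.

```python
import re
from collections import Counter, defaultdict

HASHTAG = re.compile(r"#(\w+)")


def build_pseudo_collection(tweets, min_tweets=2, query_len=3):
    """Build {hashtag: {"query": [...], "relevant": [...]}} from (id, text) pairs.

    min_tweets and query_len are hypothetical knobs chosen for illustration.
    """
    # Group tweets by each hashtag they contain.
    by_tag = defaultdict(list)
    for tweet_id, text in tweets:
        for tag in set(HASHTAG.findall(text.lower())):
            by_tag[tag].append((tweet_id, text))

    collection = {}
    for tag, tagged in by_tag.items():
        if len(tagged) < min_tweets:
            continue  # keep only "commonly used" hashtags
        # Assumed query generation: most frequent longer words, hashtags removed.
        counts = Counter()
        for _, text in tagged:
            words = re.findall(r"[a-z]+", HASHTAG.sub(" ", text.lower()))
            counts.update(w for w in words if len(w) > 3)
        collection[tag] = {
            "query": [w for w, _ in counts.most_common(query_len)],
            "relevant": [tweet_id for tweet_id, _ in tagged],
        }
    return collection


sample = [
    (1, "Great keynote today #chi2019 fairness"),
    (2, "More fairness talks #chi2019"),
    (3, "lunch #food"),
]
pseudo = build_pseudo_collection(sample)
```

Here only `chi2019` passes the frequency threshold, so tweets 1 and 2 become its pseudo relevance judgments.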

    One-Pass Ranking Models for Low-Latency Product Recommendations

    Purchase logs collected in e-commerce platforms provide rich information about customer preferences. These logs can be leveraged to improve the quality of product recommendations by feeding them to machine-learned ranking models. However, a variety of deployment constraints limit the naïve applicability of machine learning to this problem. First, the amount and the dimensionality of the data make in-memory learning simply not possible. Second, the drift of customers' preferences over time requires retraining the ranking model regularly with freshly collected data. This limits the time that is available for training to prohibitively short intervals. Third, ranking in real-time is necessary whenever the query complexity prevents us from caching the predictions. This constraint requires minimizing prediction time (or equiva