1,077 research outputs found

    Recalibrating machine learning for social biases: demonstrating a new methodology through a case study classifying gender biases in archival documentation

    This thesis proposes a recalibration of Machine Learning for social biases to minimize harms from existing approaches and practices in the field. Prioritizing quality over quantity, accuracy over efficiency, representativeness over convenience, and situated thinking over universal thinking, the thesis demonstrates an alternative approach to creating Machine Learning models. Drawing on GLAM, the Humanities, the Social Sciences, and Design, the thesis focuses on understanding and communicating biases in a specific use case. 11,888 metadata descriptions from the University of Edinburgh Heritage Collections' Archives catalog were manually annotated for gender biases, and text classification models were then trained on the resulting dataset of 55,260 annotations. Evaluations of the models' performance demonstrate that annotating gender biases can be automated; however, the subjectivity of bias as a concept complicates the generalizability of any one approach. The contributions are: (1) an interdisciplinary and participatory Bias-Aware Methodology, (2) a Taxonomy of Gendered and Gender Biased Language, (3) data annotated for gender biased language, (4) gender biased text classification models, and (5) a human-centered approach to model evaluation. The contributions have implications for Machine Learning, demonstrating how bias is inherent to all data and models; more specifically for Natural Language Processing, providing an annotation taxonomy, annotated datasets and classification models for analyzing gender biased language at scale; for the Gallery, Library, Archives, and Museum sector, offering guidance to institutions seeking to reconcile with histories of marginalizing communities through their documentation practices; and for historians, who utilize cultural heritage documentation to study and interpret the past. 
Through a real-world application of the Bias-Aware Methodology in a case study, the thesis illustrates the need to shift away from removing social biases and towards acknowledging them, creating data and models that surface the uncertainty and multiplicity characteristic of human societies.

    Researching animal research: What the humanities and social sciences can contribute to laboratory animal science and welfare

    Every year around 80 million scientific procedures are carried out on animals globally. These experiments have the potential to generate new understandings of biology and clinical treatments. They also give rise to ongoing societal debate. This book demonstrates how the humanities and social sciences can contribute to understanding what is created through animal procedures - including constitutional forms of research governance, different institutional cultures of care, the professional careers of scientists and veterinarians, collaborations with patients and publics, and research animals, specially bred for experiments or surplus to requirements. Developing the idea of the animal research nexus, this book explores how connections and disconnections are made between these different elements, how these have reshaped each other historically, and how they configure the current practice and policy of UK animal research.

    LIPIcs, Volume 251, ITCS 2023, Complete Volume

    LIPIcs, Volume 251, ITCS 2023, Complete Volume

    Mapping the Focal Points of WordPress: A Software and Critical Code Analysis

    Programming languages or code can be examined through numerous analytical lenses. This project is a critical analysis of WordPress, a prevalent web content management system, applying four modes of inquiry. The project draws on theoretical perspectives and areas of study in media, software, platforms, code, language, and power structures. The applied research is based on Critical Code Studies, an interdisciplinary field of study that holds the potential as a theoretical lens and methodological toolkit to understand computational code beyond its function. The project begins with a critical code analysis of WordPress, examining its origins and source code and mapping selected vulnerabilities. An examination of the influence of digital and computational thinking follows this. The work also explores the intersection of code patching and vulnerability management and how code shapes our sense of control, trust, and empathy, ultimately arguing that a rhetorical-cultural lens can be used to better understand code's controlling influence. Recurring themes throughout these analyses and observations are the connections to power and vulnerability in WordPress' code and how cultural, processual, rhetorical, and ethical implications can be expressed through its code, creating a particular worldview. Code's emergent properties help illustrate how human values and practices (e.g., empathy, aesthetics, language, and trust) become encoded in software design and how people perceive the software through its worldview. These connected analyses reveal cultural, processual, and vulnerability focal points and the influence these entanglements have concerning WordPress as code, software, and platform. WordPress is a complex sociotechnical platform worthy of further study, as is the interdisciplinary merging of theoretical perspectives and disciplines to critically examine code. 
Ultimately, this project helps further enrich the field by introducing focal points in code, examining sociocultural phenomena within the code, and offering techniques to apply critical code methods.

    Discovering circulating protein biomarkers through in-depth plasma proteomics

    Plasma, i.e., the liquid component of blood, is one of the most clinically used samples for biomarker measurement. Although plasma proteins and metabolites are the most frequently analysed biomarkers in practice, the identification and implementation of new circulating protein biomarkers for diagnosis, treatment prediction, prognosis, and disease monitoring has been limited. This PhD thesis compiles the discovery of systemic alterations in the blood plasma proteome and potential biomarkers related to disease status, prognosis, or treatment through plasma proteomics. We analysed plasma and serum samples with global proteomics by high-resolution isoelectric focusing (HiRIEF) and liquid chromatography coupled with mass-spectrometry (LC-MS/MS), and targeted proteomics by antibody-based proximity extension assays (PEA) in three diseases that would benefit from blood biomarkers: stage IV metastatic cutaneous melanoma (mCM), glioblastoma (GBM), and coronavirus disease 2019 (COVID-19). Specifically: (a) New treatment options for mCM substantially prolong overall survival (OS), but many patients do not respond to treatment or develop treatment resistance, thus having shorter progression free survival (PFS). Given the presence of multiple metastases, which makes biomarker sampling difficult, circulating proteins derived from the tumour and in response to treatment could serve as predictive and prognostic biomarkers in mCM. (b) GBM is the most malignant primary brain tumour with limited treatment options and notoriously short OS. Sampling biomarkers for GBM requires an invasive surgical intervention on the skull, which makes GBM a good candidate for circulating protein biomarkers for prognosis and monitoring. (c) COVID-19 is an inflammation-driven infectious disease that affects multiple organs and systems, thus making the plasma proteome a good source to explore systemic biological processes occurring in COVID-19. 
In papers I and II, using HiRIEF LC-MS/MS and PEA, we explored the treatment-driven plasma proteome alterations in mCM patients treated with anti-PD-1 immune checkpoint inhibitors (ICI) and MAPK-inhibitors (MAPKi), respectively, and identified potential treatment predictive and monitoring biomarkers. mCM patients treated with anti-PD-1 ICI had a strong increase in soluble PD-1 levels during treatment, and upregulation of proteins involved in T-cell response. BRAF[V600]-mutated mCM patients treated with MAPKi had deregulation in proteins involved in immune response and proteolysis. CPB1 had the highest increase in patients treated with BRAF- and MEK-inhibitors and was associated with longer PFS. Higher levels of several proteins involved in inflammation before treatment were associated with shorter PFS regardless of ICI or MAPKi treatment. In paper III, using HiRIEF LC-MS/MS and PEA, we longitudinally analysed the plasma proteome dynamics of GBM patients, collecting plasma samples before surgery and at three timepoints after surgery. Through consensus clustering, based on treatment-naïve plasma protein levels, we identified two patient clusters that differed in median OS. The association between the cluster membership and OS remained consistent after adjustment for age, sex, and treatment. Through machine learning, we identified protein panels that separated the patient clusters and may serve as prognostic biomarkers. The largest alterations in the plasma proteome of GBM patients occurred within two months after surgery, whereas the plasma protein levels at later timepoints did not differ from pre-surgery levels. We observed a decrease in glioma-elevated proteins in the blood after surgery, identifying potential monitoring biomarkers. 
In paper IV, using HiRIEF LC-MS/MS, we analysed serum proteome alterations in hospitalised COVID-19 patients in comparison to healthy controls, and identified a strong upregulation in inflammatory, interferon-induced, and proteasomal proteins. Several protein groups showed association with clinical parameters of COVID-19 severity, including proteasomal proteins. Serum proteome alterations were traceable to proteome alterations induced in a lung adenocarcinoma cell line (Calu-3) by infection with SARS-CoV-2. Finally, we performed the first meta-analysis of global proteomics studies of the soluble blood proteome in COVID-19, providing estimates of standardised mean differences and summary receiver operating characteristic curves. We demonstrate the high accuracy and precision of HiRIEF LC-MS/MS when compared to the meta-analysis estimates and pinpoint proteins that may serve as biomarkers of COVID-19. In summary, this thesis postulates that new circulating protein biomarkers would be clinically useful. By combining mass-spectrometry- and antibody-based proteomics, we demonstrate the potential of in-depth analyses of the plasma proteome in capturing systemic alterations related to treatment, survival, and disease status, pinpointing potentially novel biomarkers that require validation in larger cohorts.

    Bias and Fairness in Large Language Models: A Survey

    Rapid advancements of large language models (LLMs) have enabled the processing, understanding, and generation of human-like text, with increasing integration into systems that touch our social sphere. Despite this success, these models can learn, perpetuate, and amplify harmful social biases. In this paper, we present a comprehensive survey of bias evaluation and mitigation techniques for LLMs. We first consolidate, formalize, and expand notions of social bias and fairness in natural language processing, defining distinct facets of harm and introducing several desiderata to operationalize fairness for LLMs. We then unify the literature by proposing three intuitive taxonomies, two for bias evaluation, namely metrics and datasets, and one for mitigation. Our first taxonomy of metrics for bias evaluation disambiguates the relationship between metrics and evaluation datasets, and organizes metrics by the different levels at which they operate in a model: embeddings, probabilities, and generated text. Our second taxonomy of datasets for bias evaluation categorizes datasets by their structure as counterfactual inputs or prompts, and identifies the targeted harms and social groups; we also release a consolidation of publicly-available datasets for improved access. Our third taxonomy of techniques for bias mitigation classifies methods by their intervention during pre-processing, in-training, intra-processing, and post-processing, with granular subcategories that elucidate research trends. Finally, we identify open problems and challenges for future work. Synthesizing a wide range of recent research, we aim to provide a clear guide of the existing literature that empowers researchers and practitioners to better understand and prevent the propagation of bias in LLMs.

    ML-based data-entry automation and data anomaly detection to support data quality assurance

    Data plays a central role in modern software systems, which are very often powered by machine learning (ML) and used in critical domains of our daily lives, such as finance, health, and transportation. However, the effectiveness of ML-intensive software applications highly depends on the quality of the data. Data quality is affected by data anomalies; data entry errors are one of the main sources of anomalies. The goal of this thesis is to develop approaches to ensure data quality by preventing data entry errors during the form-filling process and by checking the offline data saved in databases. The main contributions of this thesis are: 1. LAFF, an approach to automatically suggest possible values of categorical fields in data entry forms. 2. LACQUER, an approach to automatically relax the completeness requirement of data entry forms by deciding when a field should be optional based on the filled fields and historical input instances. 3. LAFF-AD, an approach to automatically detect data anomalies in categorical columns in offline datasets. LAFF and LACQUER focus mainly on preventing data entry errors during the form-filling process. Both approaches can be integrated into data entry applications as efficient and effective strategies to assist the user during the form-filling process. LAFF-AD can be used offline on existing suspicious data to effectively detect anomalies in categorical data. In addition, we performed an extensive evaluation of the three approaches, assessing their effectiveness and efficiency, using real-world datasets.

    Measuring the impact of COVID-19 on hospital care pathways

    Care pathways in hospitals around the world reported significant disruption during the recent COVID-19 pandemic, but measuring the actual impact is more problematic. Process mining can be useful for hospital management to measure the conformance of real-life care to what might be considered normal operations. In this study, we aim to demonstrate that process mining can be used to investigate process changes associated with complex disruptive events. We studied perturbations to accident and emergency (A&E) and maternity pathways in a UK public hospital during the COVID-19 pandemic. Coincidentally, the hospital had implemented a Command Centre approach for patient-flow management, affording an opportunity to study both the planned improvement and the disruption due to the pandemic. Our study proposes and demonstrates a method for measuring and investigating the impact of such planned and unplanned disruptions affecting hospital care pathways. We found that during the pandemic, both A&E and maternity pathways had measurable reductions in the mean length of stay and a measurable drop in the percentage of pathways conforming to normative models. There were no distinctive patterns of monthly mean values of length of stay or conformance throughout the phases of the installation of the hospital’s new Command Centre approach. Due to a deficit in the available A&E data, the findings for A&E pathways could not be interpreted.

    Postmodern Classicism: A Practice-Based Investigation

    This thesis establishes a critical framework for a grassroots literary genre, postmodern classicism (pomoclassicism), which was founded by myself and Stephen Spencer II circa 2010. Postmodernism here signifies the intellectual and cultural concerns which were paramount in the latter half of the twentieth century, and by extension, classical writing simply refers to that which was apparently before the postmodern, in a heuristic sliding scale oriented around canonicity and nostalgia. A portfolio of creative writing accompanies critical efforts at engaging with and describing the foundational assumptions of the western canon, from which much of the creative work is appropriated. My research writing is grounded in a reformulation of the early modern notion of canonical literature (circa 1700): ‘eternal life’ through literary preservation, which is itself the paradoxical material upon which the ‘canon’ is founded. This theme is taken up in the oeuvre of Goethe. Goethe’s writing relies on the paradoxical reconciliation of opposites known to the author as ‘polarity,’ and influences how Friedrich Nietzsche, Sigmund Freud, and Franz Kafka understand canonical literature itself. Goethe, Nietzsche, and Kafka’s use of appropriation has influenced my own creative work, which includes redaction writing, erasure, and other forms of narrative appropriation. Kafka will be shown to have taken up the theme of ‘polarity’ in his own literary writing, as examined by Benjamin and Deleuze and Guattari. Finally, I will draw upon the critical writing of Sabina Spielrein, whose concepts of simultaneous creation and destruction and erotic fusion are the conceptual core of my own poetic approach, and who provides a Nietzschean critique of the early modern notion of ‘eternity.’