1,317 research outputs found

    Recalibrating machine learning for social biases: demonstrating a new methodology through a case study classifying gender biases in archival documentation

    This thesis proposes a recalibration of Machine Learning for social biases to minimize harms from existing approaches and practices in the field. Prioritizing quality over quantity, accuracy over efficiency, representativeness over convenience, and situated thinking over universal thinking, the thesis demonstrates an alternative approach to creating Machine Learning models. Drawing on GLAM, the Humanities, the Social Sciences, and Design, the thesis focuses on understanding and communicating biases in a specific use case. 11,888 metadata descriptions from the University of Edinburgh Heritage Collections' Archives catalog were manually annotated for gender biases, and text classification models were then trained on the resulting dataset of 55,260 annotations. Evaluations of the models' performance demonstrate that annotating gender biases can be automated; however, the subjectivity of bias as a concept complicates the generalizability of any one approach. The contributions are: (1) an interdisciplinary and participatory Bias-Aware Methodology, (2) a Taxonomy of Gendered and Gender Biased Language, (3) data annotated for gender biased language, (4) gender biased text classification models, and (5) a human-centered approach to model evaluation. The contributions have implications for Machine Learning, demonstrating how bias is inherent to all data and models; more specifically for Natural Language Processing, providing an annotation taxonomy, annotated datasets and classification models for analyzing gender biased language at scale; for the Gallery, Library, Archives, and Museum sector, offering guidance to institutions seeking to reconcile with histories of marginalizing communities through their documentation practices; and for historians, who utilize cultural heritage documentation to study and interpret the past. 
Through a real-world application of the Bias-Aware Methodology in a case study, the thesis illustrates the need to shift away from removing social biases and towards acknowledging them, creating data and models that surface the uncertainty and multiplicity characteristic of human societies

    Digital Technologies for Teaching English as a Foreign/Second Language: a collective monograph

    This collective monograph explores various aspects of using digital technologies in teaching English as a foreign/second language (digital storytelling, mobile applications, interactive learning and online games, etc.) and offers educators and researchers a resource for enriching their professional practice. Particular attention is given to digital tools for implementing social-emotional learning and inclusive education in English lessons. Intended for English teachers, methodologists, higher education instructors, researchers, and students in higher education

    Self-supervised learning for transferable representations

    Machine learning has undeniably achieved remarkable advances thanks to large labelled datasets and supervised learning. However, this progress is constrained by the labour-intensive annotation process. It is not feasible to generate extensive labelled datasets for every problem we aim to address. Consequently, there has been a notable shift in recent times toward approaches that solely leverage raw data. Among these, self-supervised learning has emerged as a particularly powerful approach, offering scalability to massive datasets and showcasing considerable potential for effective knowledge transfer. This thesis investigates self-supervised representation learning with a strong focus on computer vision applications. We provide a comprehensive survey of self-supervised methods across various modalities, introducing a taxonomy that categorises them into four distinct families while also highlighting practical considerations for real-world implementation. Our focus thenceforth is on the computer vision modality, where we perform a comprehensive benchmark evaluation of state-of-the-art self-supervised models against many diverse downstream transfer tasks. Our findings reveal that self-supervised models often outperform supervised learning across a spectrum of tasks, albeit with correlations weakening as tasks transition beyond classification, particularly for datasets with distribution shifts. Digging deeper, we investigate the influence of data augmentation on the transferability of contrastive learners, uncovering a trade-off between spatial and appearance-based invariances that generalise to real-world transformations. This begins to explain the differing empirical performances achieved by self-supervised learners on different downstream tasks, and it showcases the advantages of specialised representations produced with tailored augmentation. 
Finally, we introduce a novel self-supervised pre-training algorithm for object detection, aligning pre-training with downstream architecture and objectives, leading to reduced localisation errors and improved label efficiency. In conclusion, this thesis contributes a comprehensive understanding of self-supervised representation learning and its role in enabling effective transfer across computer vision tasks

    Understanding Agreement and Disagreement in Listeners’ Perceived Emotion in Live Music Performance

    Emotion perception of music is subjective and time dependent. Most computational music emotion recognition (MER) systems overlook time- and listener-dependent factors by averaging emotion judgments across listeners. In this work, we investigate the influence of music, setting (live vs lab vs online), and individual factors on music emotion perception over time. In an initial study, we explore changes in perceived music emotions among audience members during live classical music performances. Fifteen audience members used a mobile application to annotate time-varying emotion judgments based on the valence-arousal model. Inter-rater reliability analyses indicate that consistency in emotion judgments varies significantly across rehearsal segments, with systematic disagreements in certain segments. In a follow-up study, we examine listeners' reasons for their ratings in segments with high and low agreement. We relate these reasons to acoustic features and individual differences. Twenty-one listeners annotated perceived emotions while watching a recorded video of the live performance. They then reflected on their judgments and provided explanations retrospectively. Disagreements were attributed to listeners attending to different musical features or being uncertain about the expressed emotions. Emotion judgments were significantly associated with personality traits, gender, cultural background, and music preference. Thematic analysis of explanations revealed cognitive processes underlying music emotion perception, highlighting attributes less frequently discussed in MER studies, such as instrumentation, arrangement, musical structure, and multimodal factors related to performer expression. Exploratory models incorporating these semantic features and individual factors were developed to predict perceived music emotion over time. 
Regression analyses confirmed the significance of listener-informed semantic features as independent variables, with individual factors acting as moderators between loudness, pitch range, and arousal. In our final study, we analyzed the effects of individual differences on music emotion perception among 128 participants with diverse backgrounds. Participants annotated perceived emotions for 51 piano performances of different compositions from the Western canon, spanning various eras. Linear mixed effects models revealed significant variations in valence and arousal ratings, as well as the frequency of emotion ratings, with regard to several individual factors: music sophistication, music preferences, personality traits, and mood states. Additionally, participants' ratings of arousal, valence, and emotional agreement were significantly associated with the historical time periods of the examined clips. This research highlights the complexity of music emotion perception, revealing it to be a dynamic, individual and context-dependent process. It paves the way for the development of more individually nuanced, time-based models in music psychology, opening up new avenues for personalised music emotion recognition and recommendation, music emotion-driven generation and therapeutic applications

    Reconstruction and Synthesis of Human-Scene Interaction

    In this thesis, we argue that the 3D scene is vital for understanding, reconstructing, and synthesizing human motion. We present several approaches which take the scene into consideration in reconstructing and synthesizing Human-Scene Interaction (HSI). We first observe that state-of-the-art pose estimation methods ignore the 3D scene and hence reconstruct poses that are inconsistent with the scene. We address this by proposing a pose estimation method that takes the 3D scene explicitly into account. We call our method PROX for Proximal Relationships with Object eXclusion. We leverage the data generated using PROX and build a method to automatically place 3D scans of people with clothing in scenes. The core novelty of our method is encoding the proximal relationships between the human and the scene in a novel HSI model, called POSA for Pose with prOximitieS and contActs. POSA is limited to static HSI, however. We propose a real-time method for synthesizing dynamic HSI, which we call SAMP for Scene-Aware Motion Prediction. SAMP enables virtual humans to navigate cluttered indoor scenes and naturally interact with objects. Data-driven kinematic models, like SAMP, can produce high-quality motion when applied in environments similar to those shown in the dataset. However, when applied to new scenarios, kinematic models can struggle to generate realistic behaviors that respect scene constraints. In contrast, we present InterPhys which uses adversarial imitation learning and reinforcement learning to train physically-simulated characters that perform scene interaction tasks in a physical and life-like manner

    Die Verbildlichung von Klangstrukturen im Kontext der Entwicklung von Werkzeugen für die Medienproduktion

    Audio is one of the most important aspects of media-based productions. 
However, the interfaces for searching, creating, and manipulating audio are often driven by the underlying technical parameters. These parameters usually describe neither their meaning for the sound nor the qualities expressed in it. As a result, such interfaces rarely offer an intuitive or expressive way of interacting, since a translation process from the artistic idea to the technical parameters must take place first. In order to make the interfaces more approachable and comprehensible, the visualization of the structures of sound is therefore considered on different levels. In particular, the thesis highlights how visualizations can be used to interact with audio. In this way, the graphical extension of technical spectral editing is discussed as an example of direct signal processing. Metaphorical visualizations for audio effects are also addressed. In addition, the mental model of audio is analyzed, which is here elicited associatively through sketches as abstract visual representations. To this end, studies were conducted to collect and evaluate sketches, each depicting one sound. From the resulting sketch associations, a procedure for deriving a classification of the sketch-based mental model has been developed. This classification is a possible starting point for tool development and provides an awareness of the expressive possibilities when using graphical associations, up to affordances in the design of datasets for machine learning. Statistical correlations between the knowledge of the people drawing and the sketch classes they used were also investigated. This allows the type of illustration to be chosen in relation to the desired target group

    2007 GREAT Day Program

    SUNY Geneseo’s First Annual G.R.E.A.T. Day.

    Human participants in AI research: Ethics and transparency in practice

    In recent years, research involving human participants has been critical to advances in artificial intelligence (AI) and machine learning (ML), particularly in the areas of conversational, human-compatible, and cooperative AI. For example, around 12% and 6% of publications at recent AAAI and NeurIPS conferences indicate the collection of original human data, respectively. Yet AI and ML researchers lack guidelines for ethical, transparent research practices with human participants. Fewer than one out of every four of these AAAI and NeurIPS papers provide details of ethical review, the collection of informed consent, or participant compensation. This paper aims to bridge this gap by exploring normative similarities and differences between AI research and related fields that involve human participants. Though psychology, human-computer interaction, and other adjacent fields offer historic lessons and helpful insights, AI research raises several specific concerns (namely, participatory design, crowdsourced dataset development, and an expansive role of corporations) that necessitate a contextual ethics framework. To address these concerns, this paper outlines a set of guidelines for ethical and transparent practice with human participants in AI and ML research. These guidelines can be found in Section 4 on pp. 4–7

    Workshop Proceedings of the 12th edition of the KONVENS conference

    The 2014 issue of KONVENS is even more a forum for exchange: its main topic is the interaction between Computational Linguistics and Information Science, and the synergies such interaction, cooperation and integrated views can produce. This topic, at the crossroads of different research traditions which deal with natural language as a container of knowledge and with methods to extract and manage knowledge that is linguistically represented, is close to the heart of many researchers at the Institut für Informationswissenschaft und Sprachtechnologie of Universität Hildesheim: it has long been one of the institute’s research topics, and it has received even more attention over the last few years

    Coping with low data availability for social media crisis message categorisation

    During crisis situations, social media allows people to quickly share information, including messages requesting help. This can be valuable to emergency responders, who need to categorise and prioritise these messages based on the type of assistance being requested. However, the high volume of messages makes it difficult to filter and prioritise them without the use of computational techniques. Fully supervised filtering techniques for crisis message categorisation typically require a large amount of annotated training data, but this can be difficult to obtain during an ongoing crisis and is expensive in terms of time and labour to create. This thesis focuses on addressing the challenge of low data availability when categorising crisis messages for emergency response. It first presents domain adaptation as a solution for this problem, which involves learning a categorisation model from annotated data from past crisis events (source domain) and adapting it to categorise messages from an ongoing crisis event (target domain). In many-to-many adaptation, where the model is trained on multiple past events and adapted to multiple ongoing events, a multi-task learning approach is proposed using pre-trained language models. This approach outperforms baselines, and an ensemble approach further improves performance.