
    Recalibrating machine learning for social biases: demonstrating a new methodology through a case study classifying gender biases in archival documentation

    This thesis proposes a recalibration of Machine Learning for social biases to minimize harms from existing approaches and practices in the field. Prioritizing quality over quantity, accuracy over efficiency, representativeness over convenience, and situated thinking over universal thinking, the thesis demonstrates an alternative approach to creating Machine Learning models. Drawing on GLAM, the Humanities, the Social Sciences, and Design, the thesis focuses on understanding and communicating biases in a specific use case. 11,888 metadata descriptions from the University of Edinburgh Heritage Collections' Archives catalog were manually annotated for gender biases, and text classification models were then trained on the resulting dataset of 55,260 annotations. Evaluations of the models' performance demonstrate that annotating gender biases can be automated; however, the subjectivity of bias as a concept complicates the generalizability of any one approach. The contributions are: (1) an interdisciplinary and participatory Bias-Aware Methodology, (2) a Taxonomy of Gendered and Gender Biased Language, (3) data annotated for gender biased language, (4) gender biased text classification models, and (5) a human-centered approach to model evaluation. The contributions have implications for Machine Learning, demonstrating how bias is inherent to all data and models; more specifically for Natural Language Processing, providing an annotation taxonomy, annotated datasets, and classification models for analyzing gender biased language at scale; for the Galleries, Libraries, Archives, and Museums (GLAM) sector, offering guidance to institutions seeking to reconcile with histories of marginalizing communities through their documentation practices; and for historians, who utilize cultural heritage documentation to study and interpret the past. Through a real-world application of the Bias-Aware Methodology in a case study, the thesis illustrates the need to shift away from removing social biases and towards acknowledging them, creating data and models that surface the uncertainty and multiplicity characteristic of human societies.
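    The annotation step the abstract describes can be illustrated with a minimal, dependency-free sketch. The taxonomy categories and terms below are hypothetical stand-ins for illustration, not the thesis's actual Taxonomy of Gendered and Gender Biased Language:

```python
import re

# Hypothetical mini-taxonomy (illustrative categories and terms only).
TAXONOMY = {
    "gendered-pronoun": {"he", "she", "him", "her", "his", "hers"},
    "gendered-role": {"wife", "husband", "widow", "spinster", "mrs", "miss"},
}

def annotate(description: str) -> list[tuple[str, str]]:
    """Tag each taxonomy term found in a catalogue description."""
    tokens = re.findall(r"[a-z']+", description.lower())
    tags = []
    for token in tokens:
        for category, terms in TAXONOMY.items():
            if token in terms:
                tags.append((token, category))
    return tags

desc = "Letters from Mrs Smith, widow of the painter, to her son."
print(annotate(desc))
# [('mrs', 'gendered-role'), ('widow', 'gendered-role'), ('her', 'gendered-pronoun')]
```

    Annotations of this shape, once produced at scale by human annotators, are what the text classification models would be trained on.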

    Fake News: Finding Truth in Strategic Communication

    Fake news is an old phenomenon that has become a new obsession and a menace to society due to technological advancement and the proliferation of social media, which has changed traditional journalism norms. As the spread of false information has increased over the past few years, it has become increasingly difficult for information consumers to distinguish between facts and fakes. A comprehensive systematic literature review was conducted to extract themes, revealing the major factors responsible for spreading fake news. This qualitative interpretative meta-synthesis (QIMS) aims to better understand and offer solutions to combat fake news. This Ph.D. dissertation will serve as a guide for ethical communication practice and a reference for future research studies.

    Language Design for Reactive Systems: On Modal Models, Time, and Object Orientation in Lingua Franca and SCCharts

    Reactive systems play a crucial role in the embedded domain. They continuously interact with their environment, handle concurrent operations, and are commonly expected to provide deterministic behavior to enable application in safety-critical systems. In this context, language design is a key aspect, since carefully tailored language constructs can aid in addressing the challenges faced in this domain, as illustrated by the various concurrency models that prevent the known pitfalls of regular threads. Today, many languages exist in this domain and often provide unique characteristics that make them specifically fit for certain use cases. This thesis revolves around two distinctive languages: the actor-oriented polyglot coordination language Lingua Franca and the synchronous statecharts dialect SCCharts. While they take different approaches in providing reactive modeling capabilities, they share clear similarities in their semantics and complement each other in design principles. This thesis analyzes and compares key design aspects in the context of these two languages. For three particularly relevant concepts, it provides and evaluates lean and seamless language extensions that are carefully aligned with the fundamental principles of the underlying language. Specifically, Lingua Franca is extended toward coordinating modal behavior, while SCCharts receives a timed automaton notation with an efficient execution model using dynamic ticks and an extension toward the object-oriented modeling paradigm.

    Southern Adventist University Undergraduate Catalog 2023-2024

    Southern Adventist University's undergraduate catalog for the academic year 2023-2024.
    https://knowledge.e.southern.edu/undergrad_catalog/1123/thumbnail.jp

    Digital Technologies for Teaching English as a Foreign/Second Language: a collective monograph

    This collective monograph explores various aspects of using digital technologies in teaching English as a foreign/second language (digital storytelling, mobile applications, interactive learning and online games, etc.) and offers educators and researchers a resource for enriching their professional practice. Particular attention is paid to digital tools for implementing social-emotional learning and inclusive education in English lessons. Intended for English teachers, methodologists, university instructors, researchers, and students of higher education institutions.

    Visualizing emoji usage in geo-social media across time, space, and topic

    Social media is ubiquitous in the modern world and its use is ever-increasing. Similarly, the use of emojis within social media posts continues to surge. Geo-social media produces massive amounts of spatial data that can provide insights into users' thoughts and reactions across time and space. This research used emojis as an alternative to text-based social media analysis in order to avoid the common obstacles of natural language processing such as spelling mistakes, grammatical errors, slang, and sarcasm. Because emojis offer a non-verbal means to express thoughts and emotions, they provide additional context in comparison to purely text-based analysis. This facilitates cross-language studies. In this study, the spatial and temporal usage of emojis was visualized in order to detect relevant topics of discussion within a Twitter dataset that is not thematically pre-filtered. The dataset consists of Twitter posts that were geotagged within Europe during the year 2020. This research leveraged cartographic visualization techniques to detect spatial-temporal changes in emoji usage and to investigate the correlation of emoji usage with significant topics. The spatial and temporal developments of these topics and their respective emojis were visualized as a series of choropleth maps and map matrices. This geovisualization technique allowed for individual emojis to be independently analyzed and for specific spatial or temporal trends to be further investigated. Emoji usage was found to be spatially and temporally heterogeneous, and trends in emoji usage were found to correlate with topics including the COVID-19 pandemic, several political movements, and leisure activities.
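    The aggregation behind such a map matrix can be sketched without any dependencies. The posts, region codes, and the Unicode-category heuristic below are illustrative assumptions, not the study's actual pipeline:

```python
import unicodedata
from collections import Counter

def extract_emojis(text: str) -> list[str]:
    # Treat a character as an emoji if Unicode classifies it as an
    # "other symbol" ("So") -- a crude but dependency-free heuristic.
    return [ch for ch in text if unicodedata.category(ch) == "So"]

# Hypothetical geotagged posts: (region, month, text).
posts = [
    ("DE", "2020-03", "Stay home 🏠🙏"),
    ("DE", "2020-03", "🙏 for the nurses"),
    ("ES", "2020-07", "Beach day ☀️🏖"),
]

# Aggregate emoji counts per (region, month) cell -- the unit that
# would feed one cell of a choropleth map matrix.
counts = Counter()
for region, month, text in posts:
    for emoji in extract_emojis(text):
        counts[(region, month, emoji)] += 1

print(counts[("DE", "2020-03", "🙏")])  # 2
```

    A real pipeline would normalize skin-tone modifiers and multi-codepoint sequences, but the per-cell counting shown here is the core of the spatio-temporal analysis.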

    Analyzing public discourse on photovoltaic (PV) adoption in Indonesia: A topic-based sentiment analysis of news articles and social media

    The importance of integrating renewable energy, such as solar PV, into the global energy mix to mitigate carbon emissions is increasing. Despite the global drive towards renewable energy, the limited uptake of solar PV, particularly in developing nations such as Indonesia, poses significant challenges for the transition to sustainable energy. This study analyses public discourse to comprehend the obstacles to widespread adoption of solar PV technologies. It employs topic modelling and sentiment analysis of mainstream and social media data to comprehensively capture public discourse and perceptions concerning PV and residential PV adoption in Indonesia. The findings reveal shared thematic areas in both mainstream and social media. Nonetheless, the two media types diverge significantly in their focal points. Our findings support previous survey-based research while introducing three new topics found in both media channels. These topics are: (1) knowledge, misconceptions, and skepticism; (2) economically viable alternative PV technologies; and (3) government regulations and policies. Social and visual impressions such as aesthetics, hedonic motivation, and social influence are notably absent. Public perception varies, with mainstream media portraying PV technology more positively than social media. Across both media, the public generally holds favorable views of PV, particularly in terms of its practicality, installation, safety, and information accessibility. Nevertheless, negative perceptions arise regarding investment costs, regulations, governmental policies, and the adequacy of government support.
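    A toy sketch of topic-based sentiment scoring follows. The keyword lexicons are illustrative placeholders standing in for the topic model and sentiment classifier the study actually employs:

```python
# Hypothetical keyword lexicons (all terms are illustrative).
TOPICS = {
    "costs": {"cost", "price", "expensive", "investment"},
    "policy": {"regulation", "subsidy", "government", "policy"},
}
SENTIMENT = {"affordable": 1, "practical": 1, "expensive": -1, "unclear": -1}

def classify(post):
    """Assign the first matching topic and a lexicon sentiment score."""
    words = set(post.lower().split())
    topic = next((t for t, kw in TOPICS.items() if words & kw), None)
    score = sum(SENTIMENT.get(w, 0) for w in words)
    return topic, score

print(classify("rooftop solar is expensive and the regulation is unclear"))
# ('costs', -2)
```

    Pairing each post's topic with its sentiment score is what allows perceptions of, say, investment costs to be compared across mainstream and social media.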

    Attention is Not Always What You Need: Towards Efficient Classification of Domain-Specific Text

    For large-scale IT corpora with hundreds of classes organized in a hierarchy, accurate classification at the higher levels of the hierarchy is crucial to avoid errors propagating to the lower levels. In the business world, an efficient and explainable ML model is preferred over an expensive black-box model, especially if the performance increase is marginal. A current trend in the Natural Language Processing (NLP) community is towards employing huge pre-trained language models (PLMs), also known as self-attention models (e.g., BERT), for almost any kind of NLP task (e.g., question-answering, sentiment analysis, text classification). Despite the widespread use of PLMs and their impressive performance in a broad range of NLP tasks, there is a lack of clear and well-justified reasoning as to why these models are being employed for domain-specific text classification (TC) tasks, given that the monosemic nature of specialized words (i.e., jargon) found in domain-specific text renders the purpose of contextualized embeddings (e.g., PLMs) futile. In this paper, we compare the accuracies of some state-of-the-art (SOTA) models reported in the literature against a Linear SVM classifier with a TF-IDF vectorization model on three TC datasets. Results show comparable performance for the Linear SVM. The findings of this study show that for domain-specific TC tasks, a linear model can provide a comparable, cheap, reproducible, and interpretable alternative to attention-based models.
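    The TF-IDF weighting at the heart of such a linear baseline can be sketched without any ML library. The toy IT-support corpus is invented for illustration; the paper's point is that monosemic jargon like "vpn" or "kernel" already separates classes under sparse weighting, without contextual embeddings:

```python
import math

# Toy labeled corpus: (class, text). Invented for illustration.
docs = [
    ("network", "vpn tunnel fails after reboot"),
    ("network", "dns lookup times out on vpn"),
    ("os", "kernel panic after driver update"),
    ("os", "kernel module fails to load"),
]

def tfidf(term: str, doc: list[str], corpus: list[list[str]]) -> float:
    """Plain TF-IDF: term frequency times log inverse document frequency."""
    tf = doc.count(term) / len(doc)
    df = sum(term in d for d in corpus)
    idf = math.log(len(corpus) / df)
    return tf * idf

corpus = [text.split() for _, text in docs]
print(round(tfidf("kernel", corpus[2], corpus), 3))  # 0.139
```

    A linear SVM trained on these sparse vectors yields per-class weights that can be inspected directly, which is the interpretability argument the paper makes against black-box attention models.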

    SeeChart: Enabling Accessible Visualizations Through Interactive Natural Language Interface For People with Visual Impairments

    Web-based data visualizations have become very popular for exploring data and communicating insights. Newspapers, journals, and reports regularly publish visualizations to tell compelling stories with data. Unfortunately, most visualizations are inaccessible to readers with visual impairments. For many charts on the web, there are no accompanying alternative (alt) texts, and even if such texts exist they do not adequately describe important insights from charts. To address the problem, we first interviewed 15 blind users to understand their challenges and requirements for reading data visualizations. Based on the insights from these interviews, we developed SeeChart, an interactive tool that automatically deconstructs charts from web pages and then converts them to accessible visualizations for blind people by enabling them to hear the chart summary as well as to navigate through data points using the keyboard. Our evaluation with 14 blind participants suggests the efficacy of SeeChart in understanding key insights from charts and fulfilling their information needs while reducing their required time and cognitive burden.
    Comment: 28 pages, 13 figures

    Software Design Change Artifacts Generation through Software Architectural Change Detection and Categorisation

    Software is designed, implemented, tested, and inspected solely by experts, unlike other engineering projects, which are mostly implemented by non-expert workers after being designed by engineers. Researchers and practitioners have linked software bugs, security holes, problematic integration of changes, complex-to-understand codebases, unwarranted mental pressure, and similar problems in software development and maintenance to inconsistent and complex design and a lack of ways to easily understand what is going on and what to plan in a software system. The unavailability of proper information and insights needed by development teams to make good decisions makes these challenges worse. Therefore, extracting software design documents and other insightful information is essential to reduce the above-mentioned anomalies. Moreover, extracting architectural design artifacts is required to create developer profiles that can be made available to the market in many crucial scenarios. To that end, architectural change detection, categorization, and change description generation are crucial because they are the primary artifacts used to trace other software artifacts. However, it is not feasible for humans to analyze all the changes in a single release to detect change and impact, because doing so is time-consuming, laborious, costly, and inconsistent. In this thesis, we conduct six studies considering the mentioned challenges to automate architectural change information extraction and document generation that could potentially assist development and maintenance teams. In particular, (1) we detect architectural changes using lightweight techniques leveraging textual and codebase properties, (2) categorize them considering intelligent perspectives, and (3) generate design change documents by exploiting precise contexts of components' relations and change purposes, which were previously unexplored.
    Our experiment using 4000+ architectural change samples and 200+ design change documents suggests that our proposed approaches are promising in accuracy and scalability to deploy frequently. Our proposed change detection approach can detect up to 100% of the architectural change instances (and is very scalable). On the other hand, our proposed change classifier's F1 score is 70%, which is promising given the challenges. Finally, our proposed system can produce descriptive design change artifacts with 75% significance. Since most of our studies are foundational, our approaches and prepared datasets can be used as baselines for advancing research in design change information extraction and documentation.
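    A lightweight textual comparison of the kind such detection relies on can be sketched with the standard library. The component names, descriptions, and threshold below are hypothetical, not the thesis's actual technique:

```python
from difflib import SequenceMatcher

# Hypothetical component descriptions extracted from two releases.
release_a = {"auth": "validates user tokens via the session service"}
release_b = {"auth": "validates user tokens via the new oauth gateway"}

def changed(name: str, threshold: float = 0.9) -> bool:
    """Flag a component as an architectural-change candidate when its
    textual description drifts below a similarity threshold."""
    ratio = SequenceMatcher(None, release_a[name], release_b[name]).ratio()
    return ratio < threshold

print(changed("auth"))  # True
```

    Flagged candidates would then feed the categorization and description-generation steps rather than serve as a final verdict on their own.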