Self-supervised learning for transferable representations
Machine learning has undeniably achieved remarkable advances thanks to large labelled datasets and supervised learning. However, this progress is constrained by the labour-intensive annotation process. It is not feasible to generate extensive labelled datasets for every problem we aim to address. Consequently, there has been a notable shift in recent times toward approaches that solely leverage raw data. Among these, self-supervised learning has emerged as a particularly powerful approach, offering scalability to massive datasets and showcasing considerable potential for effective knowledge transfer. This thesis investigates self-supervised representation learning with a strong focus on computer vision applications. We provide a comprehensive survey of self-supervised methods across various modalities, introducing a taxonomy that categorises them into four distinct families while also highlighting practical considerations for real-world implementation. We then focus on the computer vision modality, where we perform a comprehensive benchmark evaluation of state-of-the-art self-supervised models against many diverse downstream transfer tasks. Our findings reveal that self-supervised models often outperform supervised learning across a spectrum of tasks, albeit with correlations weakening as tasks transition beyond classification, particularly for datasets with distribution shifts. Digging deeper, we investigate the influence of data augmentation on the transferability of contrastive learners, uncovering a trade-off between spatial and appearance-based invariances that generalise to real-world transformations. This begins to explain the differing empirical performances achieved by self-supervised learners on different downstream tasks, and it showcases the advantages of specialised representations produced with tailored augmentation.
Finally, we introduce a novel self-supervised pre-training algorithm for object detection, aligning pre-training with downstream architecture and objectives, leading to reduced localisation errors and improved label efficiency. In conclusion, this thesis contributes a comprehensive understanding of self-supervised representation learning and its role in enabling effective transfer across computer vision tasks.
Examining systemic and dispositional factors impacting historically disenfranchised schools across North Carolina
This mixed method sequential explanatory study provided analysis of North Carolina (NC) school leaders’ dispositions in eliminating opportunity gaps, outlined in NC’s strategic plan. The study’s quantitative phase used descriptive and correlation analysis of eight Likert subscales around four tenets of transformative leadership (Shields, 2011) and aspects of critical race theory (Bell, 1992; Ladson-Billings, 1998; Ladson-Billings & Tate, 2006) to understand systemic inequities and leadership attitudes.
The qualitative phase comprised three analyses of education leadership dispositions and systemic factors in NC schools. The first analysis of State Board of Education meeting minutes from 2018–2023 quantified and analyzed utterances of racism and critical race, outlined the sociopolitical context of such utterances, and identified systemic patterns and state leader dispositions. The second analysis of five interviews of K–12 graduates identified persistent and systemic factors influencing NC education 3 decades after Brown v. Board of Education (1954) and within the context of Leandro v. State of NC (1997), where the NC Supreme Court recognized the state constitutional right for every student to access a “sound basic education.” The final qualitative analysis consisted of five interviews of current NC public school system leaders, for personal narratives of the state of NC schools compared to patterns from lived experiences of NC K–12 graduates.
The study’s findings suggested NC school and state education leaders experience a racialized dichotomy between willingness for change (equity intentions) and execution of transformative action (practice). Although leaders at the board and school levels recognize the need for inclusivity and equity, a struggle to transcend systemic challenges, especially those rooted in racial biases and power dynamics, is evident. This study may identify leadership qualities needed for change in NC to address systemic inequities for improving educational access and inform policy to uphold all students’ constitutional right to a sound basic education.
Multidisciplinary perspectives on Artificial Intelligence and the law
This open access book presents an interdisciplinary, multi-authored, edited collection of chapters on Artificial Intelligence (‘AI’) and the Law. AI technology has come to play a central role in the modern data economy. Through a combination of increased computing power, the growing availability of data and the advancement of algorithms, AI has now become an umbrella term for some of the most transformational technological breakthroughs of this age. The importance of AI stems from both the opportunities that it offers and the challenges that it entails. While AI applications hold the promise of economic growth and efficiency gains, they also create significant risks and uncertainty. The potential and perils of AI have thus come to dominate modern discussions of technology and ethics – and although AI was initially allowed to largely develop without guidelines or rules, few would deny that the law is set to play a fundamental role in shaping the future of AI. As the debate over AI is far from over, the need for rigorous analysis has never been greater. This book thus brings together contributors from different fields and backgrounds to explore how the law might provide answers to some of the most pressing questions raised by AI. An outcome of the Católica Research Centre for the Future of Law and its interdisciplinary working group on Law and Artificial Intelligence, it includes contributions by leading scholars in the fields of technology, ethics and the law.
Design Knowledge for Virtual Learning Companions from a Value-centered Perspective
The increasing popularity of conversational agents such as ChatGPT has sparked interest in their potential use in educational contexts but undermines the role of companionship in learning with these tools. Our study targets the design of virtual learning companions (VLCs), focusing on bonding relationships for collaborative learning while facilitating students’ time management and motivation. We draw upon design science research (DSR) to derive prescriptive design knowledge for VLCs as the core of our contribution. Through three DSR cycles, we conducted interviews with working students and experts, held interdisciplinary workshops with the target group, designed and evaluated two conceptual prototypes, and fully coded a VLC instantiation, which we tested with students in class. Our approach has yielded 9 design principles, 28 meta-requirements, and 33 design features centered around the value-in-interaction. These encompass Human-likeness and Dialogue Management, Proactive and Reactive Behavior, and Relationship Building on the Relationship Layer (DP1,3,4), Adaptation (DP2) on the Matching Layer, as well as Provision of Supportive Content, Fostering Learning Competencies, Motivational Environment, and Ethical Responsibility (DP5-8) on the Service Layer.
LIPIcs, Volume 251, ITCS 2023, Complete Volume
LIPIcs, Volume 251, ITCS 2023, Complete Volume
A Closer Look at Audio-Visual Semantic Segmentation
Audio-visual segmentation (AVS) is a complex task that involves accurately
segmenting the corresponding sounding object based on audio-visual queries.
Successful audio-visual learning requires two essential components: 1) an
unbiased dataset with high-quality pixel-level multi-class labels, and 2) a
model capable of effectively linking audio information with its corresponding
visual object. However, these two requirements are only partially addressed by
current methods, with training sets containing biased audio-visual data, and
models that generalise poorly beyond this biased training set. In this work, we
propose a new strategy to build cost-effective and relatively unbiased
audio-visual semantic segmentation benchmarks. Our strategy, called Visual
Post-production (VPO), explores the observation that it is not necessary to
have explicit audio-visual pairs extracted from single video sources to build
such benchmarks. We also refine the previously proposed AVSBench to transform
it into the audio-visual semantic segmentation benchmark AVSBench-Single+.
Furthermore, this paper introduces a new pixel-wise audio-visual contrastive
learning method to enable a better generalisation of the model beyond the
training set. We verify the validity of the VPO strategy by showing that
state-of-the-art (SOTA) models trained with datasets built by matching audio
and visual data from different sources or with datasets containing audio and
visual data from the same video source produce almost the same accuracy. Then,
using the proposed VPO benchmarks and AVSBench-Single+, we show that our method
produces more accurate audio-visual semantic segmentation than SOTA models.
Code and dataset will be available.
Software Design Change Artifacts Generation through Software Architectural Change Detection and Categorisation
Software is designed, implemented, tested, and inspected solely by experts, unlike other engineering projects, which are mostly implemented by non-expert workers after being designed by engineers. Researchers and practitioners have linked software bugs, security holes, problematic integration of changes, hard-to-understand codebases, unwarranted mental pressure, and similar problems in software development and maintenance to inconsistent and complex design and to a lack of ways to easily understand what is going on and what to plan in a software system. The unavailability of proper information and insights needed by the development teams to make good decisions makes these challenges worse. Therefore, extracting software design documents and other insightful information is essential to reduce the above-mentioned anomalies. Moreover, extracting architectural design artifacts is required to create the developer’s profile to be available to the market for many crucial scenarios. To that end, architectural change detection, categorization, and change description generation are crucial because they are the primary artifacts to trace other software artifacts.
However, it is not feasible for humans to analyze all the changes in a single release to detect change and impact because doing so is time-consuming, laborious, costly, and inconsistent. In this thesis, we conduct six studies considering the mentioned challenges to automate the architectural change information extraction and document generation that could potentially assist the development and maintenance teams. In particular, (1) we detect architectural changes using lightweight techniques leveraging textual and codebase properties, (2) categorize them considering intelligent perspectives, and (3) generate design change documents by exploiting precise contexts of components’ relations and change purposes which were previously unexplored. Our experiment using 4000+ architectural change samples and 200+ design change documents suggests that our proposed approaches are promising in accuracy and scalability to deploy frequently. Our proposed change detection approach can detect up to 100% of the architectural change instances (and is very scalable). On the other hand, our proposed change classifier’s F1 score is 70%, which is promising given the challenges. Finally, our proposed system can produce descriptive design change artifacts with 75% significance. Since most of our studies are foundational, our approaches and prepared datasets can be used as baselines for advancing research in design change information extraction and documentation.
Mapping the Focal Points of WordPress: A Software and Critical Code Analysis
Programming languages or code can be examined through numerous analytical lenses. This project is a critical analysis of WordPress, a prevalent web content management system, applying four modes of inquiry. The project draws on theoretical perspectives and areas of study in media, software, platforms, code, language, and power structures. The applied research is based on Critical Code Studies, an interdisciplinary field of study that holds the potential as a theoretical lens and methodological toolkit to understand computational code beyond its function. The project begins with a critical code analysis of WordPress, examining its origins and source code and mapping selected vulnerabilities. An examination of the influence of digital and computational thinking follows this. The work also explores the intersection of code patching and vulnerability management and how code shapes our sense of control, trust, and empathy, ultimately arguing that a rhetorical-cultural lens can be used to better understand code's controlling influence. Recurring themes throughout these analyses and observations are the connections to power and vulnerability in WordPress' code and how cultural, processual, rhetorical, and ethical implications can be expressed through its code, creating a particular worldview. Code's emergent properties help illustrate how human values and practices (e.g., empathy, aesthetics, language, and trust) become encoded in software design and how people perceive the software through its worldview. These connected analyses reveal cultural, processual, and vulnerability focal points and the influence these entanglements have concerning WordPress as code, software, and platform. WordPress is a complex sociotechnical platform worthy of further study, as is the interdisciplinary merging of theoretical perspectives and disciplines to critically examine code.
Ultimately, this project helps further enrich the field by introducing focal points in code, examining sociocultural phenomena within the code, and offering techniques to apply critical code methods.
SAF-IS: a Spatial Annotation Free Framework for Instance Segmentation of Surgical Tools
Instance segmentation of surgical instruments is a long-standing research
problem, crucial for the development of many applications for computer-assisted
surgery. This problem is commonly tackled via fully-supervised training of deep
learning models, requiring expensive pixel-level annotations to train. In this
work, we develop a framework for instance segmentation not relying on spatial
annotations for training. Instead, our solution only requires binary tool
masks, obtainable using recent unsupervised approaches, and binary tool
presence labels, freely obtainable in robot-assisted surgery. Based on the
binary mask information, our solution learns to extract individual tool
instances from single frames, and to encode each instance into a compact vector
representation, capturing its semantic features. Such representations guide the
automatic selection of a tiny number of instances (8 only in our experiments),
displayed to a human operator for tool-type labelling. The gathered information
is finally used to match each training instance with a binary tool presence
label, providing an effective supervision signal to train a tool instance
classifier. We validate our framework on the EndoVis 2017 and 2018 segmentation
datasets. We provide results using binary masks obtained either by manual
annotation or as predictions of an unsupervised binary segmentation model. The
latter solution yields an instance segmentation approach completely free from
spatial annotations, outperforming several state-of-the-art fully-supervised
segmentation approaches.