6,824 research outputs found
Augmented Behavioral Annotation Tools, with Application to Multimodal Datasets and Models: A Systematic Review
Annotation tools are an essential component in the creation of datasets for machine learning purposes. Annotation tools have evolved greatly since the turn of the century, and now commonly include collaborative features to divide labor efficiently, as well as automation employed to amplify human efforts. Recent developments in machine learning models, such as Transformers, allow for training upon very large and sophisticated multimodal datasets and enable generalization across domains of knowledge. These models also herald an increasing emphasis on prompt engineering to provide qualitative fine-tuning upon the model itself, adding a novel emerging layer of direct machine learning annotation. These capabilities enable machine intelligence to recognize, predict, and emulate human behavior with much greater accuracy and nuance, a noted shortfall in which has contributed to algorithmic injustice in previous techniques. However, the scale and complexity of training data required for multimodal models present engineering challenges. Best practices for conducting annotation for large multimodal models in the safest and most ethical, yet efficient, manner have not been established. This paper presents a systematic literature review of crowd and machine learning augmented behavioral annotation methods to distill practices that may have value in multimodal implementations, cross-correlated across disciplines. Research questions were defined to provide an overview of the evolution of augmented behavioral annotation tools in the past, in relation to the present state of the art. (Contains five figures and four tables)
Endogenous measures for contextualising large-scale social phenomena: a corpus-based method for mediated public discourse
This work presents an interdisciplinary methodology for developing endogenous measures of group membership through analysis of pervasive linguistic patterns in public discourse. Focusing on political discourse, this work critiques the conventional approach to the study of political participation, which is premised on decontextualised, exogenous measures to characterise groups. Considering the theoretical and empirical weaknesses of decontextualised approaches to large-scale social phenomena, this work suggests that contextualisation using endogenous measures might provide a complementary perspective to mitigate such weaknesses.
This work develops a sociomaterial perspective on political participation in mediated discourse as affiliatory action performed through language. While the affiliatory function of language is often performed consciously (such as statements of identity), this work is concerned with unconscious features (such as patterns in lexis and grammar). This work argues that pervasive patterns in such features that emerge through socialisation are resistant to change and manipulation, and thus might serve as endogenous measures of sociopolitical contexts, and thus of groups.
In terms of method, the work takes a corpus-based approach to the analysis of data from the Twitter messaging service whereby patterns in users' speech are examined statistically in order to trace potential community membership. The method is applied in the US state of Michigan during the second half of 2018, 6 November having been the date of midterm (i.e. non-Presidential) elections in the United States. The corpus is assembled from the original posts of 5,889 users, who are nominally geolocalised to 417 municipalities. These users are clustered according to pervasive language features. Comparing the linguistic clusters according to the municipalities they represent finds that there are regular sociodemographic differentials across clusters. This is understood as an indication of social structure, suggesting that endogenous measures derived from pervasive patterns in language may indeed offer a complementary, contextualised perspective on large-scale social phenomena.
Ab Initio Language Teaching in British Higher Education
Drawing extensively on the expertise of teachers of German in universities across the UK, this volume offers an overview of recent trends, new pedagogical approaches and practical guidance for teaching at beginners level in the higher education classroom. At a time when entries for UK school exams in modern foreign languages are decreasing, this book serves the urgent need for research and guidance on ab initio learning and teaching in HE. Using the example of teaching German, it offers theoretical reflections on teaching ab initio and practice-oriented approaches that will be useful for teachers of both German and other languages in higher education.
The first chapters assess the role of ab initio provision within the wider context of modern languages departments and language centres. They are followed by sections on teaching methods and innovative approaches in the ab initio classroom that include chapters on the use of music, textbook evaluation, the effective use of a flipped classroom and the contribution of language apps. Finally, the book focuses on the learner in the ab initio context and explores issues around autonomy and learner strengths. The whole builds into a theoretically grounded guide that sketches out perspectives for teaching and learning ab initio languages that will benefit current and future generations of students.
Inclusive Intelligent Learning Management System Framework - Application of Data Science in Inclusive Education
Dissertation presented as the partial requirement for obtaining a Master's degree in Data Science and Advanced Analytics, specialization in Data Science
Being a disabled student, the author faced higher education with a handicap, and the experience of
studying during COVID-19 confinement periods matched the findings in recent research about the
importance of digital accessibility through more e-learning intensive academic experiences. Narrative
and systematic literature reviews provided context on the World Health Organization's
International Classification of Functioning, Disability and Health, the legal and standards framework, and
the information technology and communication state of the art. Assessing Portuguese higher education
institutions' websites revealed that only outlying institutions had implemented near-perfect
websites in terms of accessibility.
Therefore a gap was identified in how accessible the Portuguese higher education websites are, the
needs of all students, including those with disabilities, and even the accessibility minimum legal
requirements for digital products and the services provided by public or publicly funded organizations.
Having identified a problem in society and exploring the scientific base of knowledge for context and
state of the art was a first stage in the Design Science Research methodology, to which followed
development and validation cycles of an Inclusive Intelligent Learning Management System
Framework. The framework blends various Data Science study fields contributions with accessibility
guidelines compliant interface design and content upload accessibility compliance assessment.
Validation was provided by a focus group whose inputs were considered for the version presented in
this dissertation. Not being the purpose of the research to deliver a complete implementation of the
framework and lacking consistent data to put all the modules interacting with each other, the most
relevant modules were tested with open data as proof of concept.
The rigor cycle of DSR started with the inclusion of the previous thesis on Atlântica University Institute
Scientific Repository and is to be completed with the publication of this thesis and the already started
PhD's findings in relevant journals and conferences.
Foundation Models and Fair Use
Existing foundation models are trained on copyrighted material. Deploying
these models can pose both legal and ethical risks when data creators fail to
receive appropriate attribution or compensation. In the United States and
several other countries, copyrighted content may be used to build foundation
models without incurring liability due to the fair use doctrine. However, there
is a caveat: If the model produces output that is similar to copyrighted data,
particularly in scenarios that affect the market of that data, fair use may no
longer apply to the output of the model. In this work, we emphasize that fair
use is not guaranteed, and additional work may be necessary to keep model
development and deployment squarely in the realm of fair use. First, we survey
the potential risks of developing and deploying foundation models based on
copyrighted content. We review relevant U.S. case law, drawing parallels to
existing and potential applications for generating text, source code, and
visual art. Experiments confirm that popular foundation models can generate
content considerably similar to copyrighted material. Second, we discuss
technical mitigations that can help foundation models stay in line with fair
use. We argue that more research is needed to align mitigation strategies with
the current state of the law. Lastly, we suggest that the law and technical
mitigations should co-evolve. For example, coupled with other policy
mechanisms, the law could more explicitly consider safe harbors when strong
technical tools are used to mitigate infringement harms. This co-evolution may
help strike a balance between intellectual property and innovation, which
speaks to the original goal of fair use. But we emphasize that the strategies
we describe here are not a panacea and more work is needed to develop policies
that address the potential harms of foundation models.
BIM-GPT: a Prompt-Based Virtual Assistant Framework for BIM Information Retrieval
Efficient information retrieval (IR) from building information models (BIMs)
poses significant challenges due to the necessity for deep BIM knowledge or
extensive engineering efforts for automation. We introduce BIM-GPT, a
prompt-based virtual assistant (VA) framework integrating BIM and generative
pre-trained transformer (GPT) technologies to support NL-based IR. A prompt
manager and dynamic template generate prompts for GPT models, enabling
interpretation of NL queries, summarization of retrieved information, and
answering BIM-related questions. In tests on a BIM IR dataset, our approach
achieved 83.5% and 99.5% accuracy rates for classifying NL queries with no data
and 2% data incorporated in prompts, respectively. Additionally, we validated
the functionality of BIM-GPT through a VA prototype for a hospital building.
This research contributes to the development of effective and versatile VAs for
BIM IR in the construction industry, significantly enhancing BIM accessibility
and reducing engineering efforts and training data requirements for processing
NL queries.
Comment: 35 pages, 15 figures
Bridging Systems: Open Problems for Countering Destructive Divisiveness across Ranking, Recommenders, and Governance
Divisiveness appears to be increasing in much of the world, leading to
concern about political violence and a decreasing capacity to collaboratively
address large-scale societal challenges. In this working paper we aim to
articulate an interdisciplinary research and practice area focused on what we
call bridging systems: systems which increase mutual understanding and trust
across divides, creating space for productive conflict, deliberation, or
cooperation. We give examples of bridging systems across three domains:
recommender systems on social media, collective response systems, and
human-facilitated group deliberation. We argue that these examples can be more
meaningfully understood as processes for attention-allocation (as opposed to
"content distribution" or "amplification") and develop a corresponding
framework to explore similarities - and opportunities for bridging - across
these seemingly disparate domains. We focus particularly on the potential of
bridging-based ranking to bring the benefits of offline bridging into spaces
which are already governed by algorithms. Throughout, we suggest research
directions that could improve our capacity to incorporate bridging into a world
increasingly mediated by algorithms and artificial intelligence.
Comment: 40 pages, 11 figures. See https://bridging.systems for more about
this work
Interdisciplinarity as a political instrument of governance and its consequences for doctoral training
UK educational policies exploit interdisciplinarity as a marketing tool in a competitive educational world by building images of prosperous futures for society, the economy, and universities. Following this narrative, interdisciplinary science is promoted as superior to disciplinary forms of research and requires the training of future researchers accordingly, with interdisciplinary doctoral education becoming more established in universities.
This emphasis on the growth of interdisciplinary science polarises scholars' views on the role of academic research between the production of knowledge on the one hand and knowledge as an economic resource on the other. This research asks: what is the rationale behind the perceived value of interdisciplinary research and training, and how does it affect graduate students' experiences of their PhD?
Based on a practice theory perspective, chosen for its suitability in generating insights into how a university's social life is organised, reproduced and transformed, the doctorate is conceptualised as sets of interconnected practices that are observable as they happen. This current study, therefore, comprised two stages of data collection and analysis: the examination of documents to elucidate educational policy practices and an educational ethnography of an interdisciplinary doctoral programme.
This study found interdisciplinary doctoral training is hindered by the lack of role models and positive social relationships, which are crucial to the way interdisciplinary students learn. Furthermore, it is argued that interdisciplinarity is sometimes applied to research as a label to fit with funders' requirements. Specifically, in this case, medical optical imaging is best seen as an interdiscipline as it does not exhibit true interdisciplinary integration.
Further insights show that while interdisciplinarity is promoted in policy around promises and expectations for a better future, it is in tension with how it is organisationally embedded in higher education. These insights form the basis for a list of practical recommendations for institutions. Overall, interdisciplinary doctoral training was observed to present students with difficulties and to leave policy concerns unaddressed.
Towards Mobility Data Science (Vision Paper)
Mobility data captures the locations of moving objects such as humans,
animals, and cars. With the availability of GPS-equipped mobile devices and
other inexpensive location-tracking technologies, mobility data is collected
ubiquitously. In recent years, the use of mobility data has demonstrated
significant impact in various domains including traffic management, urban
planning, and health sciences. In this paper, we present the emerging domain of
mobility data science. Towards a unified approach to mobility data science, we
envision a pipeline having the following components: mobility data collection,
cleaning, analysis, management, and privacy. For each of these components, we
explain how mobility data science differs from general data science, we survey
the current state of the art and describe open challenges for the research
community in the coming years.
Comment: Updated arXiv metadata to include two authors that were missing from
the metadata. PDF has not been changed