27,287 research outputs found
Gated Convolutional Bidirectional Attention-based Model for Off-topic Spoken Response Detection
Off-topic spoken response detection, the task aiming at predicting whether a
response is off-topic for the corresponding prompt, is important for an
automated speaking assessment system. In many real-world educational
applications, off-topic spoken response detectors are required to achieve high
recall for off-topic responses not only on seen prompts but also on prompts
that are unseen during training. In this paper, we propose a novel approach for
off-topic spoken response detection with high off-topic recall on both seen and
unseen prompts. We introduce a new model, Gated Convolutional Bidirectional
Attention-based Model (GCBiA), which applies bi-attention mechanism and
convolutions to extract topic words of prompts and key-phrases of responses,
and introduces gated unit and residual connections between major layers to
better represent the relevance of responses and prompts. Moreover, a new
negative sampling method is proposed to augment training data. Experiment
results demonstrate that our novel approach can achieve significant
improvements in detecting off-topic responses with extremely high on-topic
recall, for both seen and unseen prompts.Comment: ACL2020 long pape
Use of nonintrusive sensor-based information and communication technology for real-world evidence for clinical trials in dementia
Cognitive function is an important end point of treatments in dementia clinical trials. Measuring cognitive function by standardized tests, however, is biased toward highly constrained environments (such as hospitals) in selected samples. Patient-powered real-world evidence using information and communication technology devices, including environmental and wearable sensors, may help to overcome these limitations. This position paper describes current and novel information and communication technology devices and algorithms to monitor behavior and function in people with prodromal and manifest stages of dementia continuously, and discusses clinical, technological, ethical, regulatory, and user-centered requirements for collecting real-world evidence in future randomized controlled trials. Challenges of data safety, quality, and privacy and regulatory requirements need to be addressed by future smart sensor technologies. When these requirements are satisfied, these technologies will provide access to truly user relevant outcomes and broader cohorts of participants than currently sampled in clinical trials
An Exploratory Application of Rhetorical Structure Theory to Detect Coherence Errors in L2 English Writing: Possible Implications for Automated Writing Evaluation Software
This paper presents an initial attempt to examine whether Rhetorical Structure Theory (RST) (Mann & Thompson, 1988) can be fruitfully applied to the detection of the coherence errors made by Taiwanese low-intermediate learners of English. This investigation is considered warranted for three reasons. First, other methods for bottom-up coherence analysis have proved ineffective (e.g., Watson Todd et al., 2007). Second, this research provides a preliminary categorization of the coherence errors made by first language (L1) Chinese learners of English. Third, second language discourse errors in general have received little attention in applied linguistic research. The data are 45 written samples from the LTTC English Learner Corpus, a Taiwanese learner corpus of English currently under construction. The rationale of this study is that diagrams which violate some of the rules of RST diagram formation will point to coherence errors. No reliability test has been conducted since this work is at an initial stage. Therefore, this study is exploratory and results are preliminary. Results are discussed in terms of the practicality of using this method to detect coherence errors, their possible consequences about claims for a typical inductive content order in the writing of L1 Chinese learners of English, and their potential implications for Automated Writing Evaluation (AWE) software, since discourse organization is one of the essay characteristics assessed by this software. In particular, the extent to which the kinds of errors detected through the RST analysis match those located by Criterion (Burstein, Chodorow, & Leachock, 2004), a well-known AWE software by Educational Testing Service (ETS), is discussed
CEPS Task Force on Artificial Intelligence and Cybersecurity Technology, Governance and Policy Challenges Task Force Evaluation of the HLEG Trustworthy AI Assessment List (Pilot Version). CEPS Task Force Report 22 January 2020
The Centre for European Policy Studies launched a Task Force on Artificial Intelligence (AI) and
Cybersecurity in September 2019. The goal of this Task Force is to bring attention to the market,
technical, ethical and governance challenges posed by the intersection of AI and cybersecurity,
focusing both on AI for cybersecurity but also cybersecurity for AI. The Task Force is multi-stakeholder
by design and composed of academics, industry players from various sectors, policymakers and civil
society.
The Task Force is currently discussing issues such as the state and evolution of the application of AI
in cybersecurity and cybersecurity for AI; the debate on the role that AI could play in the dynamics
between cyber attackers and defenders; the increasing need for sharing information on threats and
how to deal with the vulnerabilities of AI-enabled systems; options for policy experimentation; and
possible EU policy measures to ease the adoption of AI in cybersecurity in Europe.
As part of such activities, this report aims at assessing the High-Level Expert Group (HLEG) on AI Ethics
Guidelines for Trustworthy AI, presented on April 8, 2019. In particular, this report analyses and
makes suggestions on the Trustworthy AI Assessment List (Pilot version), a non-exhaustive list aimed
at helping the public and the private sector in operationalising Trustworthy AI. The list is composed
of 131 items that are supposed to guide AI designers and developers throughout the process of
design, development, and deployment of AI, although not intended as guidance to ensure
compliance with the applicable laws. The list is in its piloting phase and is currently undergoing a
revision that will be finalised in early 2020.
This report would like to contribute to this revision by addressing in particular the interplay between
AI and cybersecurity. This evaluation has been made according to specific criteria: whether and how
the items of the Assessment List refer to existing legislation (e.g. GDPR, EU Charter of Fundamental
Rights); whether they refer to moral principles (but not laws); whether they consider that AI attacks
are fundamentally different from traditional cyberattacks; whether they are compatible with
different risk levels; whether they are flexible enough in terms of clear/easy measurement,
implementation by AI developers and SMEs; and overall, whether they are likely to create obstacles
for the industry.
The HLEG is a diverse group, with more than 50 members representing different stakeholders, such
as think tanks, academia, EU Agencies, civil society, and industry, who were given the difficult task of
producing a simple checklist for a complex issue. The public engagement exercise looks successful
overall in that more than 450 stakeholders have signed in and are contributing to the process.
The next sections of this report present the items listed by the HLEG followed by the analysis and
suggestions raised by the Task Force (see list of the members of the Task Force in Annex 1)
Impact of ASR performance on free speaking language assessment
In free speaking tests candidates respond in spontaneous speech to prompts. This form of test allows the spoken language proficiency of a non-native speaker of English to be assessed more fully than read aloud tests. As the candidate's responses are unscripted, transcription by automatic speech recognition (ASR) is essential for automated assessment. ASR will never be 100% accurate so any assessment system must seek to minimise and mitigate ASR errors. This paper considers the impact of ASR errors on the performance of free speaking test auto-marking systems. Firstly rich linguistically related features, based on part-of-speech tags from statistical parse trees, are investigated for assessment. Then, the impact of ASR errors on how well the system can detect whether a learner's answer is relevant to the question asked is evaluated. Finally, the impact that these errors may have on the ability of the system to provide detailed feedback to the learner is analysed. In particular, pronunciation and grammatical errors are considered as these are important in helping a learner to make progress. As feedback resulting from an ASR error would be highly confusing, an approach to mitigate this problem using confidence scores is also analysed
A Virtual Conversational Agent for Teens with Autism: Experimental Results and Design Lessons
We present the design of an online social skills development interface for
teenagers with autism spectrum disorder (ASD). The interface is intended to
enable private conversation practice anywhere, anytime using a web-browser.
Users converse informally with a virtual agent, receiving feedback on nonverbal
cues in real-time, and summary feedback. The prototype was developed in
consultation with an expert UX designer, two psychologists, and a pediatrician.
Using the data from 47 individuals, feedback and dialogue generation were
automated using a hidden Markov model and a schema-driven dialogue manager
capable of handling multi-topic conversations. We conducted a study with nine
high-functioning ASD teenagers. Through a thematic analysis of post-experiment
interviews, identified several key design considerations, notably: 1) Users
should be fully briefed at the outset about the purpose and limitations of the
system, to avoid unrealistic expectations. 2) An interface should incorporate
positive acknowledgment of behavior change. 3) Realistic appearance of a
virtual agent and responsiveness are important in engaging users. 4)
Conversation personalization, for instance in prompting laconic users for more
input and reciprocal questions, would help the teenagers engage for longer
terms and increase the system's utility
- …