16,888 research outputs found
Corpora and evaluation tools for multilingual named entity grammar development
We present an effort for the development of multilingual named entity grammars in a unification-based finite-state formalism (SProUT). Following an extended version of the MUC7 standard, we have developed Named Entity Recognition grammars for German, Chinese, Japanese, French, Spanish, English, and Czech. The grammars recognize person names, organizations, geographical locations, currency, time and date expressions. Subgrammars and gazetteers are shared as much as possible for the grammars of the different languages. Multilingual corpora from the business domain are used for grammar development and evaluation. The annotation format (named entity and other linguistic information) is described. We present an evaluation tool which provides detailed statistics and diagnostics, allows for partial matching of annotations, and supports user-defined mappings between different annotation and grammar output formats
Keystroke Biometrics in Response to Fake News Propagation in a Global Pandemic
This work proposes and analyzes the use of keystroke biometrics for content
de-anonymization. Fake news have become a powerful tool to manipulate public
opinion, especially during major events. In particular, the massive spread of
fake news during the COVID-19 pandemic has forced governments and companies to
fight against missinformation. In this context, the ability to link multiple
accounts or profiles that spread such malicious content on the Internet while
hiding in anonymity would enable proactive identification and blacklisting.
Behavioral biometrics can be powerful tools in this fight. In this work, we
have analyzed how the latest advances in keystroke biometric recognition can
help to link behavioral typing patterns in experiments involving 100,000 users
and more than 1 million typed sequences. Our proposed system is based on
Recurrent Neural Networks adapted to the context of content de-anonymization.
Assuming the challenge to link the typed content of a target user in a pool of
candidate profiles, our results show that keystroke recognition can be used to
reduce the list of candidate profiles by more than 90%. In addition, when
keystroke is combined with auxiliary data (such as location), our system
achieves a Rank-1 identification performance equal to 52.6% and 10.9% for a
background candidate list composed of 1K and 100K profiles, respectively.Comment: arXiv admin note: text overlap with arXiv:2004.0362
Exploring User Satisfaction in a Tutorial Dialogue System
Abstract User satisfaction is a common evaluation metric in task-oriented dialogue systems, whereas tutorial dialogue systems are often evaluated in terms of student learning gain. However, user satisfaction is also important for such systems, since it may predict technology acceptance. We present a detailed satisfaction questionnaire used in evaluating the BEETLE II system (REVU-NL), and explore the underlying components of user satisfaction using factor analysis. We demonstrate interesting patterns of interaction between interpretation quality, satisfaction and the dialogue policy, highlighting the importance of more finegrained evaluation of user satisfaction
- …