Learning to Identify Ambiguous and Misleading News Headlines
Accuracy is one of the basic principles of journalism. However, it is
increasingly hard to manage due to the diversity of news media. Some editors of
online news tend to use catchy headlines which trick readers into clicking.
These headlines are either ambiguous or misleading, degrading the reading
experience of the audience. Thus, identifying inaccurate news headlines is a
task worth studying. Previous work names these headlines "clickbaits" and
mainly focuses on the features extracted from the headlines, which limits the
performance since the consistency between headlines and news bodies is
underappreciated. In this paper, we clearly redefine the problem and identify
ambiguous and misleading headlines separately. We utilize class sequential
rules to exploit structure information when detecting ambiguous headlines. For
the identification of misleading headlines, we extract features based on the
congruence between headlines and bodies. To make use of the large unlabeled
data set, we apply a co-training method and gain an increase in performance.
The experimental results show the effectiveness of our methods. We then use our
classifiers to detect inaccurate headlines crawled from different sources and
conduct a data analysis. Comment: Accepted by IJCAI 201
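The abstract above does not list its congruence features, so the following is only a minimal sketch of one plausible headline-body signal: the fraction of headline tokens that also occur in the body. The function names `tokens` and `headline_body_overlap` and the tokenization rule are assumptions for illustration, not the authors' method.

```python
import re


def tokens(text: str) -> set[str]:
    """Lowercase the text and return its set of alphanumeric tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def headline_body_overlap(headline: str, body: str) -> float:
    """Fraction of headline tokens that also appear in the body.

    Hypothetical congruence feature: a low value hints that the
    headline talks about things the body never mentions.
    """
    head = tokens(headline)
    if not head:
        return 0.0
    return len(head & tokens(body)) / len(head)


score = headline_body_overlap(
    "Mayor opens new bridge",
    "The mayor opened the new bridge on Monday.",
)
print(score)  # 3 of the 4 headline tokens occur in the body
```

A real system would likely add stemming and semantic similarity, since exact token overlap misses pairs such as "opens" and "opened".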
Video Highlight Prediction Using Audience Chat Reactions
Sports channel video portals offer an exciting domain for research on
multimodal, multilingual analysis. We present methods addressing the problem of
automatic video highlight prediction based on joint visual features and textual
analysis of the real-world audience discourse with complex slang, in both
English and traditional Chinese. We present a novel dataset based on League of
Legends championships recorded from North American and Taiwanese Twitch.tv
channels (to be released for further research), and demonstrate strong
results on these using multimodal, character-level CNN-RNN model architectures. Comment: EMNLP 201
Monolingual, Bilingual Dictionaries and Language Study
This paper attempts to show that neither monolingual nor bilingual dictionaries can, by themselves, satisfy the needs of foreign language learners. Different stages of second language acquisition require different types of dictionaries, and each type has its own function in helping learners form a new language habit. The paper reviews past and present research and various scholarly points of view. A quantitative method is adopted to investigate which kind of dictionary best meets the needs of students at different stages of foreign language acquisition. Finally, the results are presented and discussed to conclude the investigation.
Key words: monolingual dictionary, bilingual dictionary, language study
Résumé: Le présent article tente de prouver que, ni le dictionnaire monolingue ni le dictionnaire bilingue ne peut satisfaire les besoins des apprenants de langues étrangères. Les différentes étapes de l’acquisition de la deuxième langue exigent de différents types de dictionnaires, et ils ont tous leur propre fonction pour aider les apprenants à former une nouvelle habitude langagière. Le présent article met en revue les recherches actuelles et passées ainsi que les points de vues académiques divers. Quant à l’approche de recherches, des méthodes quantitatives sont adoptées pour étudier quel type de dictionnaire répond le mieux aux besoins des étudiants dans les différentes phases de l’acquisition des langues étrangères. Finalement, on montre les résultats et les discussions pour conclure l’investigation.
Mots-Clés: dictionnaire monolingue, dictionnaire bilingue, étude linguistique
摘要:本文試圖證明無論單語詞典還是雙語詞典都不能獨立地滿足語言學習的需求。不同階段的外語學習需要使用不同的詞典,它們在幫助學習者形成一種新的語言習慣過程中發揮著不同的作用。本文回顧了有關文獻和不同的研究觀點,採用定量和定性的方法來研究哪種詞典能滿足二語習得不同階段的學習要求,最後得出結論。
關鍵詞:單語詞典;雙語詞典;語言學
Text Processing Like Humans Do: Visually Attacking and Shielding NLP Systems
Visual modifications to text are often used to obfuscate offensive comments
in social media (e.g., "!d10t") or as a writing style ("1337" in "leet speak"),
among other scenarios. We consider this as a new type of adversarial attack in
NLP, a setting to which humans are very robust, as our experiments with both
simple and more difficult visual input perturbations demonstrate. We then
investigate the impact of visual adversarial attacks on current NLP systems on
character-, word-, and sentence-level tasks, showing that both neural and
non-neural models are, in contrast to humans, extremely sensitive to such
attacks, suffering performance decreases of up to 82%. We then explore three
shielding methods (visual character embeddings, adversarial training, and
rule-based recovery), which substantially improve the robustness of the models.
However, the shielding methods still fall behind performances achieved in
non-attack scenarios, which demonstrates the difficulty of dealing with visual
attacks. Comment: Accepted as long paper at NAACL-2019; fixed one ungrammatical
sentence
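To make the attack and the rule-based recovery idea concrete, here is a minimal sketch using simple look-alike character substitution. The tables `LOOKALIKES` and `RECOVERY`, the helper names, and the keyword filter are all assumptions for illustration; the abstract does not specify the perturbation sets or the recovery rules actually used.

```python
# Hypothetical look-alike table used to perturb text (not from the paper).
LOOKALIKES = {"i": "1", "o": "0", "e": "3", "a": "4"}

# Hypothetical recovery rules. "1" could stand for "i" or "l", so any
# fixed rule is lossy; here "1" is always mapped back to "i".
RECOVERY = {"!": "i", "1": "i", "0": "o", "3": "e", "4": "a", "7": "t"}


def perturb(text: str) -> str:
    """Replace characters with visually similar ones."""
    return "".join(LOOKALIKES.get(ch, ch) for ch in text)


def recover(text: str) -> str:
    """Map known look-alike characters back to letters."""
    return "".join(RECOVERY.get(ch, ch) for ch in text)


def is_flagged(text: str, blocklist: set[str]) -> bool:
    """Toy word-level filter: flag text containing a blocklisted token."""
    return any(tok in blocklist for tok in text.lower().split())


blocklist = {"idiot"}
attacked = perturb("you idiot")
print(attacked)
print(is_flagged(attacked, blocklist))           # the filter misses the attacked text
print(is_flagged(recover(attacked), blocklist))  # recovery restores the match
```

The same recovery rules also rewrite genuine digits (a price or a year would be corrupted), which is one reason such a shield cannot fully close the gap to non-attack performance.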
Test of adolescent semantic knowledge : a pilot study
The present study investigated the feasibility of a beta version of the Test of Adolescent Semantic Knowledge (TASK) in assessing the lexical-semantic development of Cantonese-speaking adolescents with or without specific language impairment in Hong Kong. Sixty typically developing Cantonese-speaking adolescents (TD group) and 17 language-impaired Cantonese-speaking adolescents (LI group), aged between 12;01 and 17;06 and studying in Secondary 1 (S.1), Secondary 3 (S.3) and Secondary 5 (S.5), were recruited. A list of 300 vocabulary items was compiled from local textbooks and dictionaries. The list was reduced to 91 items according to secondary school teachers’ judgement and feedback from a pre-pilot try-out. Three receptive vocabulary tasks, two expressive vocabulary tasks and a lexical inferencing task were devised to examine five domains: (1) literate words, (2) idioms, (3) slang, (4) homophones, and (5) lexical inferencing strategies. The composite scores demonstrated significant growth with grade level in the TD group. The LI group performed significantly worse than the TD group in all five domains and in the composite scores. At an individual level, with -1.5 SD as the cutoff, TASK showed 85.3% overall accuracy, with 76.5% sensitivity and 94.1% specificity. The results suggest that semantic knowledge grows continually during adolescence and that TASK could be a feasible test for evaluating the semantic knowledge of Cantonese-speaking adolescents in Hong Kong.
Speech and Hearing Sciences; Bachelor of Science in Speech and Hearing Science
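As a worked illustration of how the reported screening figures relate to one another, the sketch below computes sensitivity, specificity and overall accuracy from confusion-matrix counts. The counts are an assumption: 13 of 17 language-impaired and 16 of 17 comparison participants classified correctly happen to reproduce the reported percentages, but the abstract does not state the counts, and it reports 60 TD participants, so the size of the comparison group used here is hypothetical.

```python
def screening_metrics(tp: int, fn: int, tn: int, fp: int) -> tuple[float, float, float]:
    """Return (sensitivity, specificity, accuracy) from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)   # impaired cases correctly flagged
    specificity = tn / (tn + fp)   # unimpaired cases correctly passed
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    return sensitivity, specificity, accuracy


# Assumed counts, chosen only because they match the reported percentages.
sens, spec, acc = screening_metrics(tp=13, fn=4, tn=16, fp=1)
print(f"sensitivity={sens:.1%} specificity={spec:.1%} accuracy={acc:.1%}")
# prints: sensitivity=76.5% specificity=94.1% accuracy=85.3%
```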