An evaluation of syntactic simplification rules for people with autism
Proceedings of the 3rd Workshop on Predicting and Improving Text Readability for Target Reader Populations (PITR) at the 14th Conference of the European Chapter of the Association for Computational Linguistics (EACL 2014)

Syntactically complex sentences constitute an obstacle for some people with Autistic Spectrum Disorders. This paper evaluates a set of simplification rules specifically designed for tackling complex and compound sentences. In total, 127 different rules were developed for the rewriting of complex sentences and 56 for the rewriting of compound sentences. The evaluation assessed the accuracy of these rules individually and revealed that fully automatic conversion of these sentences into a more accessible form is not very reliable.

Funding: EC FP7-ICT-2011-
An Empirical Study on the Generalization Power of Neural Representations Learned via Visual Guessing Games
Guessing games are a prototypical instance of the "learning by interacting"
paradigm. This work investigates how well an artificial agent can benefit from
playing guessing games when later asked to perform on novel NLP downstream
tasks such as Visual Question Answering (VQA). We propose two ways to exploit
playing guessing games: 1) a supervised learning scenario in which the agent
learns to mimic successful guessing games and 2) a novel way for an agent to
play by itself, called Self-play via Iterated Experience Learning (SPIEL).
We evaluate the ability of both procedures to generalize: an in-domain
evaluation shows an increased accuracy (+7.79) compared with competitors on the
evaluation suite CompGuessWhat?!; a transfer evaluation shows improved
performance for VQA on the TDIUC dataset in terms of harmonic average accuracy
(+5.31) thanks to more fine-grained object representations learned via SPIEL.

Comment: Accepted paper for the 16th Conference of the European Chapter of the Association for Computational Linguistics (EACL 2021).
Multi-Relational Hyperbolic Word Embeddings from Natural Language Definitions
Natural language definitions possess a recursive, self-explanatory semantic
structure that can support representation learning methods able to preserve
explicit conceptual relations and constraints in the latent space. This paper
presents a multi-relational model that explicitly leverages such a structure to
derive word embeddings from definitions. By automatically extracting the
relations linking defined and defining terms from dictionaries, we demonstrate
how the problem of learning word embeddings can be formalised via a
translational framework in Hyperbolic space and used as a proxy to capture the
global semantic structure of definitions. An extensive empirical analysis
demonstrates that the framework can help imposing the desired structural
constraints while preserving the semantic mapping required for controllable and
interpretable traversal. Moreover, the experiments reveal the superiority of
the Hyperbolic word embeddings over the Euclidean counterparts and demonstrate
that the multi-relational approach can obtain competitive results when compared
to state-of-the-art neural models, with the advantage of being intrinsically
more efficient and interpretable.

Comment: Accepted at the 18th Conference of the European Chapter of the Association for Computational Linguistics (EACL 2024), camera-ready.
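The abstract above formalises word embeddings as translations in hyperbolic space. As background only (this is the standard Poincaré-ball distance commonly used for hyperbolic embeddings, not the paper's specific multi-relational objective), a minimal sketch:

```python
import math

def poincare_dist(u, v):
    """Geodesic distance between two points strictly inside the unit Poincare ball."""
    su = sum(x * x for x in u)                    # ||u||^2
    sv = sum(x * x for x in v)                    # ||v||^2
    sd = sum((a - b) ** 2 for a, b in zip(u, v))  # ||u - v||^2
    return math.acosh(1.0 + 2.0 * sd / ((1.0 - su) * (1.0 - sv)))
```

Distances grow rapidly near the boundary of the ball, which yields the tree-like geometry that makes hyperbolic embeddings a natural fit for hierarchical, definition-derived relations.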
Unsupervised does not mean uninterpretable: the case for word sense induction and disambiguation
This dataset contains the models for interpretable Word Sense Disambiguation (WSD) employed in Panchenko et al. (2017); the paper can be accessed at https://www.lt.informatik.tu-darmstadt.de/fileadmin/user_upload/Group_LangTech/publications/EACL_Interpretability___FINAL__1_.pdf.
The files were computed on a 2015 dump from the English Wikipedia. Their contents:
Induced Sense Inventories: wp_stanford_sense_inventories.tar.gz
This file contains 3 inventories (coarse, medium, fine)
Language Model (3-gram): wiki_text.3.arpa.gz
This file contains all n-grams up to n=3 and can be loaded into an index
Weighted Dependency Features: wp_stanford_lemma_LMI_s0.0_w2_f2_wf2_wpfmax1000_wpfmin2_p1000.gz
This file contains weighted word–context-feature combinations together with their counts and an LMI significance score
Distributional Thesaurus (DT) of Dependency Features: wp_stanford_lemma_BIM_LMI_s0.0_w2_f2_wf2_wpfmax1000_wpfmin2_p1000_simsortlimit200_feature expansion.gz
This file contains a DT of context features. The context feature similarities can be used for context expansion
For further information, consult the paper and the companion page: http://jobimtext.org/wsd/
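The 3-gram language model above is distributed in the standard ARPA text format, which "can be loaded into an index". A minimal sketch of such an index, using a toy ARPA snippet in place of the real gzipped Wikipedia model and ignoring backoff weights for simplicity:

```python
# Minimal sketch: load ARPA-format n-gram lines (like wiki_text.3.arpa)
# into a plain dict index {ngram tuple: log10 probability}.
TOY_ARPA = """\\data\\
ngram 1=2
ngram 2=1

\\1-grams:
-1.0 the -0.5
-1.3 cat

\\2-grams:
-0.7 the cat

\\end\\
"""

def load_arpa(lines):
    index, order = {}, 0
    for raw in lines:
        line = raw.strip()
        if not line or line == "\\data\\" or line.startswith("ngram "):
            continue                              # skip header/count lines
        if line == "\\end\\":
            break
        if line.endswith("-grams:"):              # section header, e.g. "\\2-grams:"
            order = int(line[1:line.index("-")])
            continue
        fields = line.split()                     # [logprob, w1 .. wn, (backoff)]
        index[tuple(fields[1:1 + order])] = float(fields[0])
    return index

index = load_arpa(TOY_ARPA.splitlines())
# For the real file, gzip.open("wiki_text.3.arpa.gz", "rt") yields the lines.
```

For production use, a dedicated library such as KenLM offers a far more memory-efficient index than a Python dict; the sketch only illustrates the file's structure.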
Panchenko A., Ruppert E., Faralli S., Ponzetto S. P., and Biemann C. (2017): Unsupervised Does Not Mean Uninterpretable: The Case for Word Sense Induction and Disambiguation. In Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics (EACL'2017). Valencia, Spain. Association for Computational Linguistics