Duluth at SemEval-2017 Task 6: Language Models in Humor Detection
This paper describes the Duluth system that participated in SemEval-2017 Task
6 #HashtagWars: Learning a Sense of Humor. The system participated in Subtasks
A and B using N-gram language models, ranking highly in the task evaluation.
This paper discusses the results of our system in the development and
evaluation stages and from two post-evaluation runs. Comment: 5 pages, to appear in the Proceedings of the 11th International
Workshop on Semantic Evaluation (SemEval 2017), August 2017, Vancouver, B
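As a rough illustration of what scoring text with an N-gram language model involves, here is a minimal sketch of a bigram model with add-one smoothing. It is not the Duluth system: the function names, the whitespace tokenization, the smoothing method, and the toy training data are all assumptions made for this example, and the abstract does not say which toolkit, N-gram order, or corpora were used.

```python
import math
from collections import Counter


def train_bigram(sentences):
    """Count unigrams and bigrams over whitespace-tokenized sentences."""
    unigrams, bigrams = Counter(), Counter()
    for s in sentences:
        tokens = ["<s>"] + s.lower().split() + ["</s>"]
        unigrams.update(tokens)
        bigrams.update(zip(tokens, tokens[1:]))
    return unigrams, bigrams


def log_prob(sentence, unigrams, bigrams):
    """Add-one smoothed bigram log-probability of a sentence."""
    vocab_size = len(unigrams) + 1  # +1 reserves mass for unseen words
    tokens = ["<s>"] + sentence.lower().split() + ["</s>"]
    total = 0.0
    for prev, cur in zip(tokens, tokens[1:]):
        # Counter returns 0 for unseen bigrams and unseen history words.
        total += math.log((bigrams[(prev, cur)] + 1) / (unigrams[prev] + vocab_size))
    return total


# Toy usage (hypothetical data): a word order seen in training
# receives a higher score than a scrambled one.
uni, bi = train_bigram(["the cat sat", "the cat ran"])
seen = log_prob("the cat sat", uni, bi)
scrambled = log_prob("sat the cat", uni, bi)
```

Two texts can then be compared by their log-probabilities under a model trained on some reference corpus. Whether a higher or a lower score should count as funnier is a modeling decision that this sketch leaves open.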
Parsing coordinations
The present paper is concerned with statistical parsing of constituent structures in German. It presents four experiments that aim to improve parsing performance on coordinate structures: 1) reranking the n-best parses of a PCFG parser, 2) enriching the input to a PCFG parser with gold scopes for any conjunct, 3) reranking the parser output for all possible conjunct scopes that are permissible with regard to clause structure, and 4) reranking a combination of parses from experiments 1 and 3. The experiments show that n-best parsing combined with reranking improves results by a large margin. Providing the parser with different scope possibilities and reranking the resulting parses increases the F-score from 69.76 for the baseline to 74.69. While this F-score is similar to that of the first experiment (n-best parsing and reranking), the first experiment yields higher recall (75.48% vs. 73.69%) and the third yields higher precision (75.43% vs. 73.26%). Combining the two methods gives the best result, with an F-score of 76.69.
Evaluation of the NLP Components of the OVIS2 Spoken Dialogue System
The NWO Priority Programme Language and Speech Technology is a 5-year
research programme aiming at the development of spoken language information
systems. In the Programme, two alternative natural language processing (NLP)
modules are developed in parallel: a grammar-based (conventional, rule-based)
module and a data-oriented (memory-based, stochastic, DOP) module. In order to
compare the NLP modules, a formal evaluation has been carried out three years
after the start of the Programme. This paper describes the evaluation procedure
and the evaluation results. The grammar-based component performs much better
than the data-oriented one in this comparison. Comment: Proceedings of CLIN 9
Scattertext: a Browser-Based Tool for Visualizing how Corpora Differ
Scattertext is an open source tool for visualizing linguistic variation
between document categories in a language-independent way. The tool presents a
scatterplot, where each axis corresponds to the rank-frequency with which a term occurs in
a category of documents. Through a tie-breaking strategy, the tool is able to
display thousands of visible term-representing points and find space to legibly
label hundreds of them. Scattertext also lends itself to a query-based
visualization of how the use of terms with similar embeddings differs between
document categories, as well as a visualization for comparing the importance
scores of bag-of-words features to univariate metrics. Comment: ACL 2017 Demos. 6 pages, 5 figures. See the GitHub repo
https://github.com/JasonKessler/scattertext for source code and documentation
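To make the rank-frequency axes concrete, the following is a minimal sketch of how plot coordinates could be computed for two document categories. It deliberately does not use the Scattertext library's API. The function names, the whitespace tokenization, and the choice of dense ranking scaled to [0, 1] are assumptions made for illustration, and the tool's tie-breaking and labeling strategy is not reproduced.

```python
from collections import Counter


def dense_ranks(counts):
    """Map each term to the dense rank of its frequency, scaled to [0, 1].

    Terms with equal frequency share a rank (no tie-breaking here).
    """
    distinct = sorted(set(counts.values()))
    rank_of = {freq: i for i, freq in enumerate(distinct)}
    top = max(len(distinct) - 1, 1)  # avoid division by zero
    return {term: rank_of[freq] / top for term, freq in counts.items()}


def rank_frequency_points(docs_a, docs_b):
    """Return {term: (x, y)}, one scaled frequency rank per category."""
    counts_a = Counter(w for d in docs_a for w in d.lower().split())
    counts_b = Counter(w for d in docs_b for w in d.lower().split())
    vocab = set(counts_a) | set(counts_b)
    # Terms absent from a category get frequency 0 in that category.
    ranks_a = dense_ranks({t: counts_a[t] for t in vocab})
    ranks_b = dense_ranks({t: counts_b[t] for t in vocab})
    return {t: (ranks_a[t], ranks_b[t]) for t in vocab}


# Toy usage with two hypothetical one-document categories.
points = rank_frequency_points(["tax tax tax jobs"], ["jobs jobs health"])
```

In this layout, terms near the corner (1, 0) are frequent in the first category and rare in the second, terms near (0, 1) show the opposite pattern, and terms along the diagonal are used at similar rates in both.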
Distributional semantics beyond words: Supervised learning of analogy and paraphrase
There have been several efforts to extend distributional semantics beyond
individual words, to measure the similarity of word pairs, phrases, and
sentences (briefly, tuples; ordered sets of words, contiguous or
noncontiguous). One way to extend beyond words is to compare two tuples using a
function that combines pairwise similarities between the component words in the
tuples. A strength of this approach is that it works with both relational
similarity (analogy) and compositional similarity (paraphrase). However, past
work required hand-coding the combination function for different tasks. The
main contribution of this paper is that combination functions are generated by
supervised learning. We achieve state-of-the-art results in measuring
relational similarity between word pairs (SAT analogies and SemEval 2012 Task
2) and measuring compositional similarity between noun-modifier phrases and
unigrams (multiple-choice paraphrase questions).