24 research outputs found

    A Preliminary Evaluation of the Impact of Syntactic Structure in Semantic Textual Similarity and Semantic Relatedness Tasks

    The closely related tasks of evaluating Semantic Textual Similarity and Semantic Relatedness have received special attention in the NLP community. Many different approaches have been proposed, implemented and evaluated at different levels, such as lexical similarity, word/string/POS tag overlap, semantic modeling (LSA, LDA), etc. However, at the level of syntactic structure, it is not clear how significantly it contributes to the overall accuracy. In this paper, we make a preliminary evaluation of the impact of syntactic structure on these tasks by running and analyzing the results of several experiments on how syntactic structure contributes to solving them.

    Learning the Impact and Behavior of Syntactic Structure: A Case Study in Semantic Textual Similarity

    We present a case study on the role of syntactic structure in resolving the Semantic Textual Similarity (STS) task. Although various approaches have been proposed, the use of syntactic information to determine semantic similarity is a relatively under-researched area. At the level of syntactic structure, it is not clear how significantly syntactic structure contributes to the overall accuracy of the task. In this paper, we analyze the impact of syntactic structure on the overall performance and its behavior in different score ranges of the STS semantic scale.

    FBK-HLT: An Application of Semantic Textual Similarity for Answer Selection in Community Question Answering

    This paper reports the description and performance of our system, FBK-HLT, participating in SemEval 2015 Task #3, "Answer Selection in Community Question Answering", for English, on both subtasks. We submitted two runs with different classifiers combining typical features (lexical similarity, string similarity, word n-grams, etc.) with machine translation evaluation metrics and some ad hoc features (e.g., user overlap, spam filtering). We outperform the baseline system and achieve interesting results on both subtasks.

    FBK-HLT: An Effective System for Paraphrase Identification and Semantic Similarity in Twitter

    This paper reports the description and performance of our system, FBK-HLT, participating in SemEval 2015 Task #1, "Paraphrase and Semantic Similarity in Twitter", on both subtasks. We submitted two runs with different classifiers combining typical features (lexical similarity, string similarity, word n-grams, etc.) with machine translation metrics and edit distance features. We outperform the baseline system and achieve a result very competitive with the best system on the first subtask. Overall, we ranked 4th out of 18 teams participating in the "Paraphrase Identification" subtask.

    FBK-HLT: A New Framework for Semantic Textual Similarity

    This paper reports the description and performance of our system, FBK-HLT, participating in SemEval 2015 Task #2, “Semantic Textual Similarity”, English subtask. We submitted three runs with different hypotheses for combining typical features (lexical similarity, string similarity, word n-grams, etc.) with syntactic structure features, resulting in different sets of features. The results evaluated on both STS 2014 and 2015 datasets prove our hypothesis of building an STS system that takes syntactic information into consideration. We outperform the best system on the STS 2014 datasets and achieve a result very competitive with the best system on the STS 2015 datasets.

    Identifying Motion Entities in Natural Language and A Case Study for Named Entity Recognition

    Motion recognition is one of the basic cognitive capabilities of many life forms; however, detecting and understanding motion in text is not a trivial task. In addition, identifying motion entities in natural language is not only challenging but also beneficial for better natural language understanding. In this paper, we present a Motion Entity Tagging (MET) model to identify entities in motion in a text, using the Literal-Motion-in-Text (LiMiT) dataset for training and evaluating the model. We then propose a new method to split clauses and phrases from complex and long motion sentences to improve the performance of our MET model. We also present results showing that motion features, in particular entities in motion, benefit the Named-Entity Recognition (NER) task. Finally, we present an analysis of the special co-occurrence relation between the person category in NER and animate entities in motion, which significantly improves the classification performance for the person category in NER.

    Safety and efficacy of fluoxetine on functional outcome after acute stroke (AFFINITY): a randomised, double-blind, placebo-controlled trial

    Background: Trials of fluoxetine for recovery after stroke report conflicting results. The Assessment oF FluoxetINe In sTroke recoverY (AFFINITY) trial aimed to show if daily oral fluoxetine for 6 months after stroke improves functional outcome in an ethnically diverse population. Methods: AFFINITY was a randomised, parallel-group, double-blind, placebo-controlled trial done in 43 hospital stroke units in Australia (n=29), New Zealand (four), and Vietnam (ten). Eligible patients were adults (aged ≥18 years) with a clinical diagnosis of acute stroke in the previous 2–15 days, brain imaging consistent with ischaemic or haemorrhagic stroke, and a persisting neurological deficit that produced a modified Rankin Scale (mRS) score of 1 or more. Patients were randomly assigned 1:1 via a web-based system using a minimisation algorithm to once daily, oral fluoxetine 20 mg capsules or matching placebo for 6 months. Patients, carers, investigators, and outcome assessors were masked to the treatment allocation. The primary outcome was functional status, measured by the mRS, at 6 months. The primary analysis was an ordinal logistic regression of the mRS at 6 months, adjusted for minimisation variables. Primary and safety analyses were done according to the patient's treatment allocation. The trial is registered with the Australian New Zealand Clinical Trials Registry, ACTRN12611000774921. Findings: Between Jan 11, 2013, and June 30, 2019, 1280 patients were recruited in Australia (n=532), New Zealand (n=42), and Vietnam (n=706), of whom 642 were randomly assigned to fluoxetine and 638 were randomly assigned to placebo. Mean duration of trial treatment was 167 days (SD 48·1). At 6 months, mRS data were available in 624 (97%) patients in the fluoxetine group and 632 (99%) in the placebo group. The distribution of mRS categories was similar in the fluoxetine and placebo groups (adjusted common odds ratio 0·94, 95% CI 0·76–1·15; p=0·53). Compared with patients in the placebo group, patients in the fluoxetine group had more falls (20 [3%] vs seven [1%]; p=0·018), bone fractures (19 [3%] vs six [1%]; p=0·014), and epileptic seizures (ten [2%] vs two [<1%]; p=0·038) at 6 months. Interpretation: Oral fluoxetine 20 mg daily for 6 months after acute stroke did not improve functional outcome and increased the risk of falls, bone fractures, and epileptic seizures. These results do not support the use of fluoxetine to improve functional outcome after stroke.

    Contributions to Semantic Textual Similarity Algorithms

    Similarity plays a central role in the language understanding process. However, it is always difficult to define precisely which type of data and which similarity metrics we can apply in order to assess the similarity of two texts. In this spirit, the Semantic Textual Similarity (STS) task was introduced as a pilot task at the Semantic Evaluation (SemEval) workshop in 2012. This thesis seeks to investigate the variance in performance of STS systems with respect to heterogeneous data sources, and to find solutions that alleviate this variance and improve system performance. We carry out a series of studies addressing different aspects of measuring semantic similarity for texts under the umbrella of the Semantic Textual Similarity task. First, we analyze the variance of system performance on different corpora with preliminary experiments and propose the hypothesis that system performance depends heavily on the type of training and test corpora coming from heterogeneous sources. We analyze a standard textual similarity model built on vectorial representation and derive a couple of modalities which help significantly alleviate the negative influence of the vectorial mapping model. In particular, we study how structural information and the most advanced word alignment models in Machine Translation improve the accuracy of systems. Our analysis also leads us to carry out, for the first time, an analysis of the relation between Semantic Relatedness and Textual Entailment; we then propose a co-learning model to improve the accuracy on both tasks by exploiting their mutual relationship. As a result, all these steps lead to a consistent improvement over the standard model, which is manifested across corpora. The evaluation shows that our system systematically achieves and goes beyond the former state of the art, while also reducing the variation in accuracy across various types of corpora.

    Fast and Accurate Misspelling Correction in Large Corpora

    There are several NLP systems whose accuracy depends crucially on finding misspellings fast. However, the classical approach is based on a quadratic time algorithm with 80% coverage. We present a novel algorithm for misspelling detection, which runs in constant time and improves the coverage to more than 96%. We use this algorithm together with a cross-document coreference system in order to find proper name misspellings. The experiments confirmed significant improvement over the state of the art.

    FBK-TR: SVM for Semantic Relatedeness and Corpus Patterns for RTE

    This paper reports the description and scores of our system, FBK-TR, which participated in SemEval 2014 Task #1, "Evaluation of Compositional Distributional Semantic Models on Full Sentences through Semantic Relatedness and Entailment". The system consists of two parts: one for computing semantic relatedness, based on SVM, and the other for identifying the entailment values on the basis of both semantic relatedness scores and entailment patterns based on verb-specific semantic frames. The system ranked 11th on both tasks with competitive results.