2,894 research outputs found

    Lexical Choice via Topic Adaptation for Paraphrasing Written Language to Spoken Language

    Film as a Teaching Medium

    PersoNER: Persian named-entity recognition

    © 1963-2018 ACL. Named-Entity Recognition (NER) is still a challenging task for languages with low digital resources. The main difficulties arise from the scarcity of annotated corpora and the consequent difficulty of training an effective NER pipeline. To bridge this gap, in this paper we target the Persian language, which is spoken by a population of over a hundred million people worldwide. We first present and provide ArmanPersoNERCorpus, the first manually annotated Persian NER corpus. Then, we introduce PersoNER, an NER pipeline for Persian that leverages a word embedding and a sequential max-margin classifier. The experimental results show that the proposed approach is capable of achieving interesting MUC7 and CoNLL scores while outperforming two alternatives based on a CRF and a recurrent neural network.

    Survey of the State of the Art in Natural Language Generation: Core tasks, applications and evaluation

    This paper surveys the current state of the art in Natural Language Generation (NLG), defined as the task of generating text or speech from non-linguistic input. A survey of NLG is timely in view of the changes that the field has undergone over the past decade or so, especially in relation to new (usually data-driven) methods, as well as new applications of NLG technology. This survey therefore aims to (a) give an up-to-date synthesis of research on the core tasks in NLG and the architectures adopted in which such tasks are organised; (b) highlight a number of relatively recent research topics that have arisen partly as a result of growing synergies between NLG and other areas of artificial intelligence; (c) draw attention to the challenges in NLG evaluation, relating them to similar challenges faced in other areas of Natural Language Processing, with an emphasis on different evaluation methods and the relationships between them. Comment: Published in Journal of AI Research (JAIR), volume 61, pp 75-170. 118 pages, 8 figures, 1 table.

    Neural probabilistic language model for system combination

    This paper gives the system description of the neural probabilistic language modeling (NPLM) team of Dublin City University for our participation in the system combination task in the Second Workshop on Applying Machine Learning Techniques to Optimise the Division of Labour in Hybrid MT (ML4HMT-12). We used the information obtained by NPLM as meta information for the system combination module. For the Spanish-English data, our paraphrasing approach achieved 25.81 BLEU points, which is 0.19 BLEU points absolute lower than the standard confusion network-based system combination. We note that our current usage of NPLM is very limited due to the difficulty of combining NPLM and system combination.

    Ambiguity Found in the Text Containing Local Wisdom

    Theories of semantic interpretation do not only discuss speaker meaning and word or sentence meaning; they are also concerned with ambiguity. Ambiguity can cause different interpretations of meaning. One source of the problem is a word that has more than one meaning. The aim of this research, entitled “The Analysis of Ambiguity Found in the Texts Containing Local Wisdom”, is to identify lexical ambiguity in texts containing local wisdom. The texts chosen can be brochures of tourist attractions around Cirebon, or the historical background of Cirebon and other places in Area III of Cirebon and Ciayu Maja Kuning. This research is considered important because ambiguity can cause misinterpretation of meaning, which in turn causes misunderstanding. The term lexical ambiguity follows the theories of Fromkin et al. (2002) and Kent Bach (2009). The data are taken from the texts described above and then analyzed to determine the type and occurrence of ambiguity in those texts. After that, the way to disambiguate them is described. The research method used is descriptive qualitative, following Creswell (1994). Findings: The word classes found to be ambiguous were 46.6% nouns, 33.33% adjectives, 6.66% verbs, and 6.66% adverbs. This means that ambiguities occurred mostly in nouns, adjectives, verbs, and adverbs. The ambiguous words were mostly disambiguated by giving the additional information required to clarify their meanings; only one was disambiguated using a picture. Almost all the ambiguities were caused by words having multiple meanings, also called polysemy. Other researchers who are interested in knowing more about ambiguity can use other kinds of texts to elaborate on the findings. Keywords: ambiguity, local wisdom content-text, lexical

    A critical analysis of the strategies of terminology creation in the context of a multilingual Namibia: the case of ruManyo

    This study examines the strategies used to develop terms in the language ruManyo. The study focuses on existing strategies used by language practitioners to construct analogous key-concept terms in ruManyo for application in various fields. The sample was taken through purposive sampling, and the investigation was carried out in Namibia's Kavango East region, in domains such as education, radio, agriculture, law, hospital, bank, and church. The data for this report was collected using a case study, which included document analysis, participant observations and interviews with ruManyo language practitioners. The findings of the study indicate that ruManyo language practitioners lack the skills and information needed to build appropriate terminology solutions for specific domains. Furthermore, it appears that linguistic competence is not guiding word-generation efforts in certain disciplines. The study re-evaluated the evolution of multilingual word-generation techniques, and discovered that specific domains necessitate specific tactics, based on the context in which terms are employed. Based on the findings of this study, the recommendation is to design unambiguous word-invention strategies for specific domains that are consistent with the terminology development guidelines for indigenous African languages. Due to the deficiencies in African indigenous language terminologies highlighted in this study, the researcher proposes the creation of a manual for ruManyo, detailing each method for application in different domains.