Introducing a Romanian Frequency List and the Romanian Vocabulary Levels Test
Vocabulary is considered essential to language learning, and English word lists and tests based on frequency information have therefore become a focus of attention for researchers, teachers and learners alike. This paper argues that frequency-based word lists and tests should likewise be adapted and regarded as key elements for teaching and learning Romanian as an additional language.
Since there are currently no reliable frequency lists and lexical tests in Romanian, this paper aims to bridge this gap by introducing the first Romanian Word List and the Romanian Vocabulary Levels Test. The list contains the 10,000 most frequent Romanian words and is based on the Romanian Balanced Annotated Corpus (ROMBAC, Ion, Irimia, Ștefănescu, Tufiș 2012).
The primary objective of the paper is to elaborate on the compilation criteria, the challenges involved and the benefits of such a list for teaching, learning and curriculum design in Romanian as an additional language. The secondary objective is to present a practical application of the word list by introducing an exemplary Romanian lexical test, the Romanian Vocabulary Levels Test, and to examine its reliability and validity.
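As a rough illustration of what compiling a frequency list involves, the sketch below counts word forms in raw text and keeps the most frequent ones. It is not the procedure used for the Romanian Word List: the function name `frequency_list`, the regular-expression tokenizer and the decision to count lowercased surface forms rather than lemmas are all assumptions made for this example.

```python
import re
from collections import Counter
from typing import List, Tuple


def frequency_list(text: str, top_n: int = 10_000) -> List[Tuple[str, int]]:
    """Return the top_n most frequent word forms with their counts.

    Assumption: a "word" is any run of Unicode letters, so Romanian
    diacritics are kept, but hyphenated forms are split at the hyphen.
    """
    # [^\W\d_]+ matches one or more letters (word characters minus digits and "_").
    tokens = re.findall(r"[^\W\d_]+", text.lower())
    # most_common sorts by descending count; ties keep first-seen order.
    return Counter(tokens).most_common(top_n)


sample = "Casa este mare. Casa este veche, iar casa e a mea."
print(frequency_list(sample, top_n=2))
```

A list built from an annotated corpus would more plausibly group inflected forms under a lemma before counting; that step is omitted here.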
Proceedings of the Workshop Semantic Content Acquisition and Representation (SCAR) 2007
This is the proceedings of the Workshop on Semantic Content Acquisition and Representation, held in conjunction with NODALIDA 2007, on May 24, 2007 in Tartu, Estonia.
A Survey of Paraphrasing and Textual Entailment Methods
Paraphrasing methods recognize, generate, or extract phrases, sentences, or
longer natural language expressions that convey almost the same information.
Textual entailment methods, on the other hand, recognize, generate, or extract
pairs of natural language expressions, such that a human who reads (and trusts)
the first element of a pair would most likely infer that the other element is
also true. Paraphrasing can be seen as bidirectional textual entailment and
methods from the two areas are often similar. Both kinds of methods are useful,
at least in principle, in a wide range of natural language processing
applications, including question answering, summarization, text generation, and
machine translation. We summarize key ideas from the two areas by considering
in turn recognition, generation, and extraction methods, also pointing to
prominent articles and resources.
Comment: Technical Report, Natural Language Processing Group, Department of
Informatics, Athens University of Economics and Business, Greece, 201
Language Trees and Zipping
In this letter we present a very general method to extract information from a
generic string of characters, e.g. a text, a DNA sequence or a time series.
Based on data-compression techniques, its key point is the computation of a
suitable measure of the remoteness of two bodies of knowledge. We present the
application of the method to linguistically motivated problems, featuring highly
accurate results for language recognition, authorship attribution and language
classification.
Comment: 5 pages, RevTeX4, 1 eps figure. In press in Phys. Rev. Lett. (January
2002)
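The compression-based idea in this abstract can be sketched in a few lines. The code below is a minimal illustration, not the authors' implementation: it assumes `zlib` as the compressor, and the helper names `compressed_size`, `remoteness` and `closest_reference` are hypothetical. The remoteness of a probe text from a reference text is estimated as the number of extra compressed bytes needed when the probe is appended to the reference.

```python
import zlib
from typing import Mapping


def compressed_size(data: bytes) -> int:
    """Length in bytes of the zlib-compressed data (maximum compression level)."""
    return len(zlib.compress(data, 9))


def remoteness(reference: bytes, probe: bytes) -> int:
    """Extra compressed bytes needed to encode probe once reference is known.

    A small value suggests the probe reuses patterns already present
    in the reference, i.e. the two texts are statistically close.
    """
    return compressed_size(reference + probe) - compressed_size(reference)


def closest_reference(references: Mapping[str, bytes], probe: bytes) -> str:
    """Name of the reference text with the smallest remoteness to probe."""
    return min(references, key=lambda name: remoteness(references[name], probe))


references = {
    "english": b"the quick brown fox jumps over the lazy dog and the cat sat on the mat " * 5,
    "italian": b"la volpe veloce salta sopra il cane pigro e il gatto dorme sul tappeto " * 5,
}
probe = b"the lazy dog sat on the mat and the quick brown fox jumps"
print(closest_reference(references, probe))
```

Because zlib only looks back over a 32 KB window, this sketch is meaningful only for short references; longer corpora would need a different compressor or a chunking scheme.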