
    A Formal Framework for Linguistic Annotation

    `Linguistic annotation' covers any descriptive or analytic notations applied to raw language data. The basic data may be in the form of time functions -- audio, video and/or physiological recordings -- or it may be textual. The added notations may include transcriptions of all sorts (from phonetic features to discourse structures), part-of-speech and sense tagging, syntactic analysis, `named entity' identification, co-reference annotation, and so on. While there are several ongoing efforts to provide formats and tools for such annotations and to publish annotated linguistic databases, the lack of widely accepted standards is becoming a critical problem. Proposed standards, to the extent they exist, have focussed on file formats. This paper focuses instead on the logical structure of linguistic annotations. We survey a wide variety of existing annotation formats and demonstrate a common conceptual core, the annotation graph. This provides a formal framework for constructing, maintaining and searching linguistic annotations, while remaining consistent with many alternative data structures and file formats. Comment: 49 pages
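
    To make the annotation-graph idea concrete, here is a minimal sketch in Python: time-anchored nodes joined by labelled arcs, where each arc belongs to an annotation layer (word, part-of-speech, coreference, and so on). The class and field names are illustrative choices, not notation taken from the paper.

```python
# Illustrative sketch of an annotation graph; names are not taken from the paper.
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Node:
    id: int
    time: Optional[float] = None   # offset into the audio/video signal, if anchored


@dataclass
class Arc:
    start: Node
    end: Node
    layer: str                     # annotation layer, e.g. "word", "pos", "coref"
    label: str                     # the annotation content itself


@dataclass
class AnnotationGraph:
    nodes: List[Node] = field(default_factory=list)
    arcs: List[Arc] = field(default_factory=list)

    def layer_arcs(self, layer: str) -> List[Arc]:
        """Return every arc belonging to one annotation layer."""
        return [a for a in self.arcs if a.layer == layer]


# The word "cat", tagged as a noun, spanning 0.5-0.9 seconds of the signal.
g = AnnotationGraph()
n1, n2 = Node(1, 0.5), Node(2, 0.9)
g.nodes.extend([n1, n2])
g.arcs.append(Arc(n1, n2, "word", "cat"))
g.arcs.append(Arc(n1, n2, "pos", "NN"))
print([a.label for a in g.layer_arcs("pos")])   # ['NN']
```

    Because several layers can span the same pair of nodes, transcription, tagging and higher-level annotation live in one structure, which is the sense in which the graph stays compatible with many alternative file formats.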

    IMAGINE Final Report


    A Topic-Agnostic Approach for Identifying Fake News Pages

    Fake news and misinformation have been increasingly used to manipulate popular opinion and influence political processes. To better understand fake news, how it is propagated, and how to counter its effect, it is necessary to first identify it. Recently, approaches have been proposed to automatically classify articles as fake based on their content. An important challenge for these approaches comes from the dynamic nature of news: as new political events are covered, topics and discourse constantly change, and thus a classifier trained on content from articles published at a given time is likely to become ineffective in the future. To address this challenge, we propose a topic-agnostic (TAG) classification strategy that uses linguistic and web-markup features to identify fake news pages. We report experimental results using multiple data sets which show that our approach attains high accuracy in the identification of fake news, even as topics evolve over time. Comment: Accepted for publication in the Companion Proceedings of the 2019 World Wide Web Conference (WWW'19 Companion). Presented in the 2019 International Workshop on Misinformation, Computational Fact-Checking and Credible Web (MisinfoWorkshop2019). 6 pages
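
    As a rough illustration of the topic-agnostic idea, the sketch below (Python with scikit-learn) replaces topic-bound content words with simple linguistic surface statistics and web-markup counts before training a standard classifier. The specific features and the random-forest model are assumptions made for illustration, not the paper's actual TAG feature set.

```python
# Illustrative sketch only; the paper's actual TAG features and model are not reproduced here.
import re
from typing import List, Tuple

from sklearn.ensemble import RandomForestClassifier


def linguistic_features(text: str) -> List[float]:
    # Topic-agnostic surface statistics instead of content words.
    words = re.findall(r"\w+", text)
    n = max(len(words), 1)
    return [
        float(len(words)),                       # text length
        sum(len(w) for w in words) / n,          # mean word length
        sum(w.isupper() for w in words) / n,     # proportion of all-caps words
        float(text.count("!") + text.count("?")),
    ]


def markup_features(html: str) -> List[float]:
    # Simple web-markup cues taken from the page source.
    return [float(html.count(tag)) for tag in ("<script", "<iframe", "<a ", "<img")]


def featurize(text: str, html: str) -> List[float]:
    return linguistic_features(text) + markup_features(html)


def train(pages: List[Tuple[str, str]], labels: List[int]) -> RandomForestClassifier:
    """pages: (article_text, page_html) pairs; labels: 1 = fake, 0 = legitimate."""
    X = [featurize(text, html) for text, html in pages]
    clf = RandomForestClassifier(n_estimators=200, random_state=0)
    clf.fit(X, labels)
    return clf
```

    Because none of these features depend on which entities or events a story mentions, a model trained this way degrades less as news topics shift, which is the property the TAG strategy targets.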

    Natural language processing and advanced information management

    Integrating diverse information sources and application software in a principled and general manner will require a very capable advanced information management (AIM) system. In particular, such a system will need a comprehensive addressing scheme to locate material in its docuverse. It will also need a natural language processing (NLP) system of great sophistication. It seems that the NLP system must serve three functions. First, it provides a natural language interface (NLI) for the users. Second, it serves as the core component that understands and makes use of the real-world interpretations (RWIs) contained in the docuverse. Third, it enables the reasoning specialists (RSs) to arrive at conclusions that can be transformed into procedures satisfying the users' requests. The best candidate for an intelligent agent that can make satisfactory use of RSs and transform documents (TDs) appears to be an object-oriented database (OODB): OODBs appear to have an inherent capacity to handle the large numbers of RSs and TDs that an AIM system will require, and to use them effectively.
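
    One way to picture the three NLP roles listed above is as a single abstract interface. The sketch below is purely illustrative: the class and method names (NLPSystem, interpret_request, ground_document, plan) are hypothetical stand-ins for the NLI, RWI and RS functions, not anything specified in the report.

```python
# Purely illustrative; method names are hypothetical, not drawn from the report.
from abc import ABC, abstractmethod
from typing import Dict, List


class NLPSystem(ABC):
    """The three roles the abstract assigns to the NLP component of an AIM system."""

    @abstractmethod
    def interpret_request(self, utterance: str) -> Dict:
        """NLI role: turn a user's natural-language request into a structured query."""

    @abstractmethod
    def ground_document(self, address: str) -> Dict:
        """RWI role: recover the real-world interpretation of an item in the docuverse."""

    @abstractmethod
    def plan(self, query: Dict, interpretations: List[Dict]) -> List[str]:
        """RS role: hand structured material to reasoning specialists and return
        a procedure (a list of steps) that satisfies the user's request."""
```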

    Handbook of Easy Languages in Europe

    The Handbook of Easy Languages in Europe describes what Easy Language is and how it is used in European countries. It demonstrates the great diversity of actors, instruments and outcomes related to Easy Language throughout Europe. All people, despite their limitations, have an equal right to information, inclusion, and social participation. This results in requirements for understandable language. The notion of Easy Language refers to modified forms of standard languages that aim to facilitate reading and language comprehension. This handbook describes the historical background, the principles and the practices of Easy Language in 21 European countries. Its topics include terminological definitions, legal status, stakeholders, target groups, guidelines, practical outcomes, education, research, and a reflection on future perspectives related to Easy Language in each country. Written in an academic yet interesting and understandable style, this Handbook of Easy Languages in Europe aims to find a wide audience.

    EFL teaching through English-Practice Work Stations (EPWS) to enhance participation and interaction in English for third-grade learners

    Thesis (English Pedagogy). Teachers all over the world are constantly searching for new activities, strategies and methodologies that can make their lessons more satisfying for their learners and enhance both the learning process and its results. This is a titanic effort for teachers who lack the time needed to plan their lessons and design their activities, although numerous websites and teachers on social networks are willing to share ideas. While carrying out the preliminary research for this thesis, the authors came across Debbie Diller's book "Literacy Work Stations: Making Centers Work" (2003), which explains how Literacy Work Stations (henceforth, LWS) operate. Several aspects of this method closely resemble the approach two of the authors experienced at pre-elementary school, where instruction was organized into stations that learners worked at for short periods before rotating, so that they could practise different subjects. The researchers believe this methodology produced a faster and deeper development of the four English language skills for them. The main purpose of this research is therefore to assess its usefulness and to propose an innovative strategy to enhance participation and interaction, not only in Spanish but mainly in English, within English as a Foreign Language (henceforth, EFL) lessons. The research concerns the implementation of a strategy for elementary grades, called English-Practice Work Stations, to enhance participation and interaction during EFL lessons.

    All mixed up? Finding the optimal feature set for general readability prediction and its application to English and Dutch

    Readability research has a long and rich tradition, but there has been too little focus on general readability prediction that does not target a specific audience or text genre. Moreover, though NLP-inspired research has focused on adding ever more complex readability features, there is still no consensus on which features contribute most to the prediction. In this article, we investigate in close detail the feasibility of constructing a readability prediction system for generic English and Dutch text using supervised machine learning. Based on readability assessments by both experts and a crowd, we implement different types of text characteristics, ranging from easy-to-compute superficial features to features requiring deep linguistic processing, resulting in ten feature groups. Both a regression and a classification setup are investigated, reflecting the two possible readability prediction tasks: scoring individual texts or comparing two texts. We show that going beyond correlation calculations and optimizing readability prediction with a wrapper-based genetic algorithm is promising: it provides considerable insight into which feature combinations contribute to the overall prediction. Since gold-standard information is also available for the features requiring deep processing, we are able to investigate the true upper bound of our Dutch system. Interestingly, the performance of our fully automatic readability prediction pipeline is on par with the pipeline using gold-standard deep syntactic and semantic information.
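
    As a rough sketch of the wrapper-based optimization described above, a small genetic algorithm can search over binary masks of feature groups, scoring each mask by the cross-validated performance of a readability regressor. The group count, the ridge regressor and the GA settings below are placeholder assumptions, not the article's actual setup.

```python
# Toy sketch only; group layout, regressor and GA settings are placeholder assumptions.
import random

import numpy as np
from sklearn.linear_model import Ridge
from sklearn.model_selection import cross_val_score


def fitness(mask, feature_groups, y):
    """Score a binary mask of feature groups by cross-validated R^2 of a ridge regressor."""
    selected = [X for keep, X in zip(mask, feature_groups) if keep]
    if not selected:
        return float("-inf")
    return cross_val_score(Ridge(), np.hstack(selected), y, cv=5, scoring="r2").mean()


def ga_select(feature_groups, y, pop_size=20, generations=30, seed=0):
    """feature_groups: list of (n_texts, n_features_i) arrays, one array per feature group."""
    rng = random.Random(seed)
    pop = [[rng.randint(0, 1) for _ in feature_groups] for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=lambda m: fitness(m, feature_groups, y), reverse=True)
        parents = pop[: pop_size // 2]                      # selection: keep the fitter half
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, len(a))
            child = a[:cut] + b[cut:]                       # one-point crossover
            if rng.random() < 0.1:                          # mutation: flip one group bit
                flip = rng.randrange(len(child))
                child[flip] ^= 1
            children.append(child)
        pop = parents + children
    return max(pop, key=lambda m: fitness(m, feature_groups, y))
```

    For the pairwise comparison task, the same wrapper can score a classifier on text pairs instead of a regressor on individual texts; only the fitness function changes.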