One-time treatment for incidental vocabulary learning: Call for discontinuation
Incidental vocabulary learning has attracted a great deal of attention in ELT research. However, it is important that teacher and researcher exploitation of vocabulary developments be guided by more than replication of previous research designs. For conclusions based on empirical research to be valid, it is important to be clear about exactly what any data being gathered pertains to. While Karakas & Saricoban (2012) claim to have presented a solid piece of research on the effects of subtitled cartoons on incidental vocabulary learning, in practice this is not the case. It is argued that weaknesses in the validity of the research design led to questionable results with little relevance to genuine incidental vocabulary learning.
A bag-of-features framework for incremental learning of speech invariants in unsegmented audio streams
We introduce a computational framework that allows a machine to bootstrap flexible autonomous learning of speech recognition skills. Technically, this framework shall enable a robot to incrementally learn to recognize speech invariants from unsegmented audio streams and with no prior knowledge of phonetics. To achieve this, we import the bag-of-words/bag-of-features approach from recent research in computer vision, and adapt it to incremental developmental speech processing. We evaluate an implementation of this framework on a complex speech database.
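As a rough illustration of the bag-of-features idea mentioned in the abstract above, the following minimal sketch quantizes a stream of feature frames against a codebook that grows incrementally, then counts codeword occurrences. It is not the authors' implementation: the function names, the toy 2-D frames, and the `new_word_radius` threshold are all assumptions made for the example.

```python
import math
from collections import Counter


def nearest(codebook, frame):
    """Return (index, distance) of the codeword closest to `frame`.

    Returns (-1, inf) when the codebook is still empty.
    """
    best_i, best_d = -1, math.inf
    for i, codeword in enumerate(codebook):
        d = math.dist(codeword, frame)
        if d < best_d:
            best_i, best_d = i, d
    return best_i, best_d


def bag_of_features(frames, codebook, new_word_radius=1.0):
    """Quantize `frames` and return a histogram of codeword counts.

    The codebook is extended in place whenever a frame lies farther than
    `new_word_radius` (a hypothetical threshold) from every known codeword,
    which is one simple way to make the vocabulary incremental.
    """
    hist = Counter()
    for frame in frames:
        i, d = nearest(codebook, frame)
        if d > new_word_radius:
            codebook.append(tuple(frame))  # learn a new codeword
            i = len(codebook) - 1
        hist[i] += 1
    return hist


# Toy stream: two clusters of 2-D "acoustic" frames (made-up values).
codebook = []
frames = [(0.0, 0.0), (0.1, 0.0), (5.0, 5.0), (5.1, 4.9), (0.0, 0.1)]
hist = bag_of_features(frames, codebook)
print(len(codebook), dict(hist))
```

The histogram discards frame order, which is what lets such a representation work on unsegmented input; a real system would use actual acoustic features rather than hand-written points.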
The Teaching And Learning Of Vocabulary: With Special Reference To Bilingual Pupils
The study reported here examines the English language knowledge and performance of bilingual school children of Middle School age in Britain, in particular their acquisition and use of vocabulary. One of the chief premises of the research is that pupils from bilingual minority ethnic backgrounds suffer a major disadvantage while learning from the National Curriculum because they lack the necessary richness of word knowledge, accompanied by the conceptual frameworks expected in learning subjects such as science and geography. Furthermore, it is believed that by raising awareness among teachers and by the adoption of appropriate methods of vocabulary teaching founded on research, the vocabulary learning of bilingual pupils can be greatly increased. The aim of the study is to identify, describe and evaluate methods of vocabulary instruction currently used and to provide recommendations for suitable methods to be introduced. By means of an action research methodology implemented in a middle school, and with the joint participation of some members of staff and some pupils, classroom data were collected over a two-and-a-half-year period from teachers of science, geography and English and their pupils, supplemented with semi-structured interviews with teachers and support staff and conversations with children. These data provided material for a detailed analysis of exactly how individual words develop from first introduction into the pupils’ active vocabulary.
Leveraging Text-to-Scene Generation for Language Elicitation and Documentation
Text-to-scene generation systems take input in the form of a natural language text and output a 3D scene illustrating the meaning of that text. A major benefit of text-to-scene generation is that it allows users to create custom 3D scenes without requiring them to have a background in 3D graphics or knowledge of specialized software packages. This contributes to making text-to-scene useful in scenarios from creative applications to education. The primary goal of this thesis is to explore how we can use text-to-scene generation in a new way: as a tool to facilitate the elicitation and formal documentation of language. In particular, we use text-to-scene generation (a) to assist field linguists studying endangered languages; (b) to provide a cross-linguistic framework for formally modeling spatial language; and (c) to collect language data using crowdsourcing. As a side effect of these goals, we also explore the problem of multilingual text-to-scene generation, that is, systems for generating 3D scenes from languages other than English.
The contributions of this thesis are the following. First, we develop a novel tool suite (the WordsEye Linguistics Tools, or WELT) that uses the WordsEye text-to-scene system to assist field linguists with eliciting and documenting endangered languages. WELT allows linguists to create custom elicitation materials and to document semantics in a formal way. We test WELT with two endangered languages, Nahuatl and Arrernte. Second, we explore the question of how to learn a syntactic parser for WELT. We show that an incremental learning method using a small number of annotated dependency structures can produce reasonably accurate results. We demonstrate that using a parser trained in this way can significantly decrease the time it takes an annotator to label a new sentence with dependency information. Third, we develop a framework that generates 3D scenes from spatial and graphical semantic primitives. We incorporate this system into the WELT tools for creating custom elicitation materials, allowing users to directly manipulate the underlying semantics of a generated scene. Fourth, we introduce a deep semantic representation of spatial relations and use this to create a new resource, SpatialNet, which formally declares the lexical semantics of spatial relations for a language. We demonstrate how SpatialNet can be used to support multilingual text-to-scene generation. Finally, we show how WordsEye and the semantic resources it provides can be used to facilitate elicitation of language using crowdsourcing.
Language Writ Large: LLMs, ChatGPT, Grounding, Meaning and Understanding
Apart from what (little) OpenAI may be concealing from us, we all know
(roughly) how ChatGPT works (its huge text database, its statistics, its vector
representations, and their huge number of parameters, its next-word training,
and so on). But none of us can say (hand on heart) that we are not surprised by
what ChatGPT has proved to be able to do with these resources. This has even
driven some of us to conclude that ChatGPT actually understands. It is not true
that it understands. But it is also not true that we understand how it can do
what it can do. I will suggest some hunches about benign biases: convergent
constraints that emerge at LLM scale that may be helping ChatGPT do so much
better than we would have expected. These biases are inherent in the nature of
language itself, at LLM scale, and they are closely linked to what it is that
ChatGPT lacks, which is direct sensorimotor grounding to connect its words to
their referents and its propositions to their meanings. These convergent biases
are related to (1) the parasitism of indirect verbal grounding on direct
sensorimotor grounding, (2) the circularity of verbal definition, (3) the
mirroring of language production and comprehension, (4) iconicity in
propositions at LLM scale, (5) computational counterparts of human categorical
perception in category learning by neural nets, and perhaps also (6) a
conjecture by Chomsky about the laws of thought. The exposition will be in the
form of a dialogue with ChatGPT-4.
Comment: 48 pages, 25 references
Self-Organizing Maps with Variable Input Length for Motif Discovery and Word Segmentation
Time Series Motif Discovery (TSMD) is defined as searching for patterns that
are previously unknown and appear with a given frequency in time series.
Another problem closely related to TSMD is Word Segmentation. This problem
has received much attention from the community that studies early language
acquisition in babies and toddlers. The development of biologically plausible
models for word segmentation could greatly advance this field. Therefore, in
this article, we propose the Variable Input Length Map (VILMAP) for Motif
Discovery and Word Segmentation. The model is based on the Self-Organizing Maps
and can identify Motifs with different lengths in time series. In our
experiments, we show that VILMAP presents good results in finding Motifs in a
standard Motif discovery dataset and can avoid catastrophic forgetting when
trained with datasets with increasing values of input size. We also show that
VILMAP achieves results similar or superior to those of other methods in the
literature developed for the task of word segmentation.
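For readers unfamiliar with the Self-Organizing Maps on which the model above is based, here is a minimal sketch of one standard SOM training step on a 1-D map: find the best matching unit, then pull it and its neighbours toward the input. This is the textbook fixed-length update, not VILMAP itself; the function names, learning rate, neighbourhood width, and toy weights are assumptions for illustration.

```python
import math


def best_matching_unit(weights, x):
    """Index of the map unit whose weight vector is closest to input `x`."""
    return min(range(len(weights)), key=lambda i: math.dist(weights[i], x))


def som_step(weights, x, lr=0.5, sigma=1.0):
    """Apply one SOM update in place on a 1-D map and return the BMU index.

    Each unit moves toward `x` by an amount scaled by the learning rate `lr`
    and a Gaussian neighbourhood centred on the best matching unit.
    """
    bmu = best_matching_unit(weights, x)
    for i, w in enumerate(weights):
        h = math.exp(-((i - bmu) ** 2) / (2 * sigma ** 2))  # neighbourhood weight
        weights[i] = [wj + lr * h * (xj - wj) for wj, xj in zip(w, x)]
    return bmu


# Three units on a line, one made-up 2-D input sample.
weights = [[0.0, 0.0], [1.0, 1.0], [2.0, 2.0]]
bmu = som_step(weights, [2.0, 2.2])
print(bmu, weights)
```

Repeating this step over many inputs, while shrinking `lr` and `sigma`, is the usual way such a map is trained; handling inputs of varying length, as the abstract describes, would require additional machinery not shown here.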
Survey of the State of the Art in Natural Language Generation: Core tasks, applications and evaluation
This paper surveys the current state of the art in Natural Language
Generation (NLG), defined as the task of generating text or speech from
non-linguistic input. A survey of NLG is timely in view of the changes that the
field has undergone over the past decade or so, especially in relation to new
(usually data-driven) methods, as well as new applications of NLG technology.
This survey therefore aims to (a) give an up-to-date synthesis of research on
the core tasks in NLG and the architectures adopted in which such tasks are
organised; (b) highlight a number of relatively recent research topics that
have arisen partly as a result of growing synergies between NLG and other areas
of artificial intelligence; (c) draw attention to the challenges in NLG
evaluation, relating them to similar challenges faced in other areas of Natural
Language Processing, with an emphasis on different evaluation methods and the
relationships between them.
Comment: Published in Journal of AI Research (JAIR), volume 61, pp 75-170. 118
pages, 8 figures, 1 table
Constituting grammar and its pedagogy : the reform of the South African English Home Language intermediate phase curriculum between 1997 and 2012
Post-apartheid curriculum reform in South Africa has impacted the constitution and organisation of English language knowledge, including grammatical knowledge and its pedagogy. Additionally, changes in theoretical viewpoints on grammar instruction and early literacy instruction have influenced the conceptualisation and teaching of English grammar. This study aims to determine how grammar and its pedagogy have been constituted and explicated in the South African Intermediate Phase (IP) English Home Language (HL) curricula through curriculum reforms after 1997. It also seeks to explore how the constitution of grammar within Curriculum 2005 (C2005), the Revised National Curriculum Statements (RNCS), and the Curriculum and Assessment Policy Statements (CAPS) have been influenced by changing grammar and early literacy instruction theories and language teaching methodologies. The study analyses and compares the organisation and structure of grammatical knowledge and its suggested pedagogy in the three curriculum documents using Bernstein’s concepts of classification and framing. Grammar instruction theories and conceptualisations of grammar types as prescriptive, descriptive and rhetorical (drawn from a variety of grammar instruction commentators including Lefstein, Thornbury and Hudson & Walmsley) are identified in teacher guides and other supporting literature accompanying the three curricula. These documents are also analysed to identify the predominant early literacy instruction theories - skills/phonics-based, whole language, and balanced language approaches - underpinning curriculum development. The analysis shows that through the curriculum reforms, grammatical knowledge has been more strongly classified and framed resulting in a more explicit constitution of grammar as a skill to be acquired by learners for the development of an English meta-language.
The CAPS English HL IP syllabus has returned to a content- or knowledge-based curriculum. This clearer constitution of grammatical knowledge mirrors the re-emergence of explicit grammar instruction internationally, most notably in the UK. The analysis also shows that indistinct, arbitrarily based progression requirements between grades, pertaining to the acquisition of specific grammatical knowledge, are a consistent concern in all three curricula. It also finds that conceptual ambiguity regarding early literacy instruction approaches in curricula and accompanying guides, present since the inception of the RNCS and continuing in the CAPS, makes the task of curriculum interpretation difficult. The study concludes with some possible implications the areas of concern may have for teacher training and suggestions on grammatical knowledge organisation for clearer curriculum interpretation and implementation.