3,468 research outputs found

    A plea for more interactions between psycholinguistics and natural language processing research

    A new development in psycholinguistics is the use of regression analyses on tens of thousands of words, known as the megastudy approach. This development has led to the collection of processing times and subjective ratings (of age of acquisition, concreteness, valence, and arousal) for most of the existing words in English and Dutch. In addition, a crowdsourcing study in the Dutch language has resulted in information about how well 52,000 lemmas are known. This information is likely to be of interest to NLP researchers and computational linguists. At the same time, large-scale measures of word characteristics developed in the latter traditions are likely to be pivotal in bringing the megastudy approach to the next level.

    Corpus linguistics

    The first comprehensive guide to research methods and technologies in psycholinguistics and the neurobiology of language. Bringing together contributions from a distinguished group of researchers and practitioners, editors Annette M. B. de Groot and Peter Hagoort explore the methods and technologies used by researchers of language acquisition, language processing, and communication, including traditional observational and behavioral methods, computational modelling, corpus linguistics, and virtual reality. The book also examines neurobiological methods, including functional and structural neuroimaging and molecular genetics. Ideal for students engaged in the field, Research Methods in Psycholinguistics and the Neurobiology of Language examines the relative strengths and weaknesses of various methods in relation to competing approaches. It describes the apparatus involved, the nature of the stimuli and data used, and the data collection and analysis techniques for each method. Featuring numerous example studies, along with many full-color illustrations, this indispensable text will help readers gain a clear picture of the practices and tools described. Brings together contributions from distinguished researchers across an array of related disciplines who explain the underlying assumptions and rationales of their research methods. Describes the apparatus involved, the nature of the stimuli and data used, and the data collection and analysis techniques for each method. Explores the relative strengths and weaknesses of various methods in relation to competing approaches. Features numerous real-world examples, along with many full-color illustrations, to help readers gain a clear picture of the practices and tools described.

    Recognition times for 62 thousand English words: data from the English Crowdsourcing Project

    We present a new dataset of English word recognition times for a total of 62 thousand words, called the English Crowdsourcing Project. The data were collected via an internet vocabulary test in which more than one million people participated. The present dataset is limited to native English speakers. Participants were asked to indicate which words they knew. Their response times were registered, although at no point were the participants asked to respond as quickly as possible. Still, the response times correlate around .75 with the response times of the English Lexicon Project for the shared words. Also, the results of virtual experiments indicate that the new response times are a valid addition to the English Lexicon Project. This means not only that we have useful response times for some 35 thousand extra words, but also that we now have data on differences in response latencies as a function of education and age.

    Which words do English non-native speakers know? New supernational levels based on yes/no decision

    To have more information about the English words known by second language (L2) speakers, we ran a large-scale crowdsourcing vocabulary test, which yielded 17 million useful responses. It provided us with a list of 445 words known to nearly all participants. The list was compared to various existing lists of words recommended for inclusion in the first stages of English L2 teaching. The data also provided us with a ranking of 61,000 words in terms of degree and speed of word recognition in English L2 speakers, which correlated r = .85 with a similar ranking based on native English speakers. The L2 speakers in our study were relatively better at academic words (which are often cognates in their mother tongue) and words related to experiences English L2 students are likely to have. They were worse at words related to childhood and family life. Finally, a new list of 20 levels of 1,000 word families is presented, which will be of use to English L2 teachers, as the levels represent the order in which English vocabulary seems to be acquired by L2 learners across the world.

    Dynamic Prediction of Retail Website Visitors' Intentions

    This paper presents a model for identifying general intentions of consumers visiting a retail website. When visiting a transactional website, consumers have various intentions such as browsing (i.e., no purchase intention), purchasing a product in the near future, or purchasing a particular product during their current visit. By predicting these intentions early in the visit, online merchants could personalize their offer to better fulfill the needs of consumers. We propose a simple model that classifies visitors according to their intentions after only four traversals (clicks). The model is based solely on navigation patterns that can be automatically extracted from the clickstream. The results are presented and extensions of the model are proposed.