780 research outputs found

    Impact of lexical and sentiment factors on the popularity of scientific papers

    We investigate how textual properties of scientific papers relate to the number of citations they receive. Our main finding is that correlations are non-linear and affect most-cited and typical papers differently. For instance, we find that in most journals short titles correlate positively with citations only for the most cited papers; for typical papers the correlation is in most cases negative. Our analysis of 6 different factors, calculated both at the title and abstract level of 4.3 million papers in over 1500 journals, reveals the number of authors, and the length and complexity of the abstract, as having the strongest (positive) influence on the number of citations. Comment: 9 pages, 3 figures, 3 tables

    Presenting GECO: an eyetracking corpus of monolingual and bilingual sentence reading

    This paper introduces GECO, the Ghent Eye-tracking Corpus, a monolingual and bilingual corpus of eye-tracking data of participants reading a complete novel. English monolinguals and Dutch-English bilinguals read an entire novel, which was presented in paragraphs on the screen. The bilinguals read half of the novel in their first language, and the other half in their second language. In this paper we describe the distributions and descriptive statistics of the most important reading time measures for the two groups of participants. This large eye-tracking corpus is perfectly suited for both exploratory purposes as well as more directed hypothesis testing, and it can guide the formulation of ideas and theories about naturalistic reading processes in a meaningful context. Most importantly, this corpus has the potential to evaluate the generalizability of monolingual and bilingual language theories and models to reading of long texts and narratives

    Pressure points in reading comprehension: a quantile multiple regression analysis

    The goal of this study was to examine how selected pressure points or areas of vulnerability are related to individual differences in reading comprehension and whether the importance of these pressure points varies as a function of the level of children’s reading comprehension. A sample of 245 third grade children were given an assessment battery that included multiple measures of vocabulary, grammar, higher-level language ability, word reading, working memory, and reading comprehension. Ordinary least squares (OLS) and quantile regression analyses were undertaken. OLS regression analyses indicated that all variables except working memory accounted for unique variance in reading comprehension. However, quantile regression showed that the extent of the relationships varied in some cases across readers of different ability levels. Results suggest that quantile regression may be a useful approach for the study of reading in both typical and atypical readers and aid greater specification of componential models of reading comprehension across the ability range

    The thermodynamics of human reaction times

    I present a new approach for the interpretation of reaction time (RT) data from behavioral experiments. From a physical perspective, the entropy of the RT distribution provides a model-free estimate of the amount of processing performed by the cognitive system. In this way, the focus is shifted from the conventional interpretation of individual RTs being either long or short, into their distribution being more or less complex in terms of entropy. The new approach enables the estimation of the cognitive processing load without reference to the informational content of the stimuli themselves, thus providing a more appropriate estimate of the cognitive impact of different sources of information that are carried by experimental stimuli or tasks. The paper introduces the formulation of the theory, followed by an empirical validation using a database of human RTs in lexical tasks (visual lexical decision and word naming). The results show that this new interpretation of RTs is more powerful than the traditional one. The method provides theoretical estimates of the processing loads elicited by individual stimuli. These loads sharply distinguish the responses from different tasks. In addition, it provides upper-bound estimates for the speed at which the system processes information. Finally, I argue that the theoretical proposal, and the associated empirical evidence, provide strong arguments for an adaptive system that systematically adjusts its operational processing speed to the particular demands of each stimulus. This finding is in contradiction with Hick's law, which posits a relatively constant processing speed within an experimental context

    Multi-Level Modeling of Quotation Families Morphogenesis

    This paper investigates cultural dynamics in social media by examining the proliferation and diversification of clear-cut pieces of content: quoted texts. In line with the pioneering work of Leskovec et al. and Simmons et al. on meme dynamics, we investigate in depth the transformations that quotations published online undergo during their diffusion. We deliberately put aside the structure of the social network as well as the dynamical patterns pertaining to the diffusion process to focus on the way quotations are changed, how often they are modified and how these changes shape more or less diverse families and sub-families of quotations. Following a biological metaphor, we try to understand in which way mutations can transform quotations at different scales and how mutation rates depend on various properties of the quotations. Comment: Published in the Proceedings of the ASE/IEEE 4th Intl. Conf. on Social Computing "SocialCom 2012", Sep. 3-5, 2012, Amsterdam, N

    Indian English Evolution and Focusing Visible Through Power Laws

    New dialect emergence and focusing in language contact settings is difficult to capture and date in terms of global structural dialect stabilization. This paper explores whether diachronic power law frequency distributions can provide evidence of dialect evolution and new dialect focusing, by considering the quantitative frequency characteristics of three diachronic Indian English (IE) corpora (1970s–2008). The results demonstrate that IE consistently follows power law frequency distributions and the corpora are each best fit by Mandelbrot’s Law. Diachronic changes in the constants are interpreted as evidence of lexical and syntactic collocational focusing within the process of new dialect formation. Evidence of new dialect focusing is also visible through apparent time comparison of spoken and written data. Age and gender-separated sub-corpora of the most recent corpus show minimal deviation, providing apparent time evidence for emerging IE dialect stability. From these findings, we extend the interpretation of diachronic changes in the β coefficient—as indicative of changes in the degree of synthetic/analytic structure—so that β is also sensitive to grammaticalization and changes in collocational patterns

    It takes time to prime: Semantic priming in the ocular lexical decision task

    Two eye-tracking experiments were conducted in which the manual response mode typically used in lexical decision tasks (LDTs) was replaced with an eye-movement response through a sequence of 3 words. This ocular LDT combines the explicit control of task goals found in LDTs with the highly practiced ocular response used in reading text. In Experiment 1, forward saccades indicated an affirmative lexical decision (LD) on each word in the triplet. In Experiment 2, LD responses were delayed until all 3 letter strings had been read. The goal of the study was to evaluate the contribution of task goals and response mode to semantic priming. Semantic priming is very robust in tasks that involve recognition of words in isolation, such as LDT, but limited during text reading, as measured using eye movements. Gaze durations in both experiments showed robust semantic priming even though ocular response times were much shorter than manual LDs for the same words in the English Lexicon Project. Ex-Gaussian distribution fits revealed that the priming effect was concentrated in estimates of tau (τ), meaning that priming was most pronounced in the slow tail of the distribution. This pattern shows differential use of the prime information, which may be more heavily recruited in cases in which the LD is difficult, as indicated by longer response times. Compared with the manual LD responses, ocular LDs provide a more sensitive measure of this task-related influence on word recognition as measured by the LDT

    Does Conceptual Representation Require Embodiment? Insights From Large Language Models

    Recent advances in large language models (LLMs) have the potential to shed light on the debate regarding the extent to which knowledge representation requires the grounding of embodied experience. Despite learning from limited modalities (e.g., text for GPT-3.5, and text+image for GPT-4), LLMs have nevertheless demonstrated human-like behaviors in various psychology tasks, which may provide an alternative interpretation of the acquisition of conceptual knowledge. We compared lexical conceptual representations between humans and ChatGPT (GPT-3.5 and GPT-4) on subjective ratings of various lexical conceptual features or dimensions (e.g., emotional arousal, concreteness, haptic, etc.). The results show that both GPT-3.5 and GPT-4 were strongly correlated with humans in some abstract dimensions, such as emotion and salience. In dimensions related to sensory and motor domains, GPT-3.5 shows weaker correlations while GPT-4 has made significant progress compared to GPT-3.5. Still, GPT-4 struggles to fully capture motor aspects of conceptual knowledge such as actions with foot/leg, mouth/throat, and torso. Moreover, we found that GPT-4's progress can largely be associated with its training in the visual domain. Certain aspects of conceptual representation appear to exhibit a degree of independence from sensory capacities, but others seem to necessitate them. Our findings provide insights into the complexities of knowledge representation from diverse perspectives and highlight the potential influence of embodied experience in shaping language and cognition