The limits of my language are the limits of my world: The Scientific Lexicon from 1350 to 1640
[Abstract] A diachronic compilation of different types of texts such as the Helsinki Corpus provides adequate material for a preliminary approach to the degree of diffusion of scientific/technical vocabulary (mainly nouns). The presence of this specific lexicon in non-scientific writings may be taken as an indicator of this diffusion. In addition, the lexical features of certain texts may give rise to a variety of language for specific purposes as a linguistic response to external demands. Socio-economic specialisation in a speech community is paralleled by the creation of specific registers
Lexical typology : a programmatic sketch
The present paper is an attempt to lay the foundation for Lexical Typology as a new kind of linguistic typology. The goal of Lexical Typology is to investigate crosslinguistically significant patterns of interaction between lexicon and grammar
Collaborative Deep Learning for Recommender Systems
Collaborative filtering (CF) is a successful approach commonly used by many
recommender systems. Conventional CF-based methods use the ratings given to
items by users as the sole source of information for learning to make
recommendations. However, the ratings are often very sparse in many
applications, causing CF-based methods to degrade significantly in their
recommendation performance. To address this sparsity problem, auxiliary
information such as item content information may be utilized. Collaborative
topic regression (CTR) is an appealing recent method taking this approach which
tightly couples the two components that learn from two different sources of
information. Nevertheless, the latent representation learned by CTR may not be
very effective when the auxiliary information is very sparse. To address this
problem, we generalize recent advances in deep learning from i.i.d. input to
non-i.i.d. (CF-based) input and propose in this paper a hierarchical Bayesian
model called collaborative deep learning (CDL), which jointly performs deep
representation learning for the content information and collaborative filtering
for the ratings (feedback) matrix. Extensive experiments on three real-world
datasets from different domains show that CDL can significantly advance the
state of the art
WRITING AND LITERARY ACTIVITY IN THE VERNACULAR IN ANGLO-SAXON ENGLAND
The article examines the emergence and functioning of various forms of textual record in the
English territorial dialects of the earliest period of the English language, as a consequence of
the development of the social functions of the language and the expansion of the domains in which
its written form was used during the formation of Anglo-Saxon society. The development of the
main text categories and text types of Old English writing is traced in historical perspective on
the basis of a functional classification of the written records of the early period. The formation
of the English written tradition is described in its sociolinguistic context, and the main
manuscript source texts of some Old English written records are identified
BLOOM: A 176B-Parameter Open-Access Multilingual Language Model
Large language models (LLMs) have been shown to be able to perform new tasks
based on a few demonstrations or natural language instructions. While these
capabilities have led to widespread adoption, most LLMs are developed by
resource-rich organizations and are frequently kept from the public. As a step
towards democratizing this powerful technology, we present BLOOM, a
176B-parameter open-access language model designed and built thanks to a
collaboration of hundreds of researchers. BLOOM is a decoder-only Transformer
language model that was trained on the ROOTS corpus, a dataset comprising
hundreds of sources in 46 natural and 13 programming languages (59 in total).
We find that BLOOM achieves competitive performance on a wide variety of
benchmarks, with stronger results after undergoing multitask prompted
finetuning. To facilitate future research and applications using LLMs, we
publicly release our models and code under the Responsible AI License
Action nominalizations in Early Modern scientific English
The present dissertation, Action nominalizations in Early Modern scientific English, was conceived as a contribution to the literature on nominalizations. Its point of departure was the assertion that action nominalizations are the result of a word-formation process which aims at filling gaps in the vocabulary of a particular language, English in this case. Action nominalizations are clear cases of grammatical metaphor (Halliday 2004 [1985]), since they are nouns, but they refer to actions as verbs do. For this reason, attention is given to the evolution and use of action nominalizations in the Early Modern English period (henceforth EModE), the time which sees the greatest increase of vocabulary in the history of the English language. Given that nouns prototypically refer to objects rather than actions, the question arises as to how they behave when they denote actions, and what the consequences of this use are
Blunder, Error, Mistake, Pitfall: Trawling the OED with the Help of the Historical Thesaurus
The paper considers the lexis of error and examines its use across time in relation to the writing and spelling of English, to grammar and pronunciation. Discussion focuses first on the earliest records of notions of correctness in English language usage, from Ælfric forwards to the emergence of standard English, from the sixteenth century’s growing worries about copiousness and purity of diction to eighteenth-century concerns to prescribe and rule the language. The historical overview is complemented by consideration of the data drawn together by the Glasgow Historical Thesaurus project, its evidence taken from the Oxford English Dictionary and the Dictionary of Old English Corpus. For earlier centuries, there are far fewer relevant citations, often buried within words wide in reference. With the help of the Historical Thesaurus we drill down to see how views of language mistakes and errors have changed over the centuries of the recorded history of English