103 research outputs found
Interaction of lexical strata in hybrid compound words through gradient phonotactics
We analyse hybrid compound words in Japanese, where a hybrid compound is one formed from stems that belong to more than one lexical stratum. In native-foreign compounds, where one stem belongs to one of the three native strata (Yamato, Sino-Japanese and mimetic) identified by Itô and Mester (1995) and the other to the foreign stratum, we observe that violations of phonological well-formedness constraints in the foreign stem are significantly less probable than in pure foreign words. These observations are explainable through gradient phonotactic probability, where the probability of a phoneme is determined by the whole sequence of phonemes that precedes it. We argue that this phonotactic behaviour of hybrid compounds is best explained by the hypothesis that both lexical strata distinctions and phonotactics are graded rather than categorical.
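As a rough illustration of gradient phonotactic probability, the sketch below scores a phoneme sequence with an add-alpha smoothed bigram model. This is a simplifying assumption: the abstract conditions each phoneme on the whole preceding sequence, whereas this sketch conditions only on the immediately preceding phoneme. The function names, the smoothing constant, and the toy lexicon are all hypothetical and are not taken from the paper.

```python
import math
from collections import Counter

def train_bigram(words, alpha=1.0):
    """Return a function prob(prev, cur) with add-alpha smoothed bigram estimates.

    `words` is a list of phoneme lists. "#" marks the word boundary.
    """
    inventory = {p for w in words for p in w} | {"#"}
    bigrams, contexts = Counter(), Counter()
    for w in words:
        seq = ["#"] + list(w) + ["#"]
        for prev, cur in zip(seq, seq[1:]):
            bigrams[(prev, cur)] += 1
            contexts[prev] += 1

    def prob(prev, cur):
        # Smoothing keeps unseen transitions improbable but not impossible,
        # which is what makes the resulting score gradient, not categorical.
        return (bigrams[(prev, cur)] + alpha) / (contexts[prev] + alpha * len(inventory))

    return prob

def log_prob(word, prob):
    """Sum of log transition probabilities, including both word boundaries."""
    seq = ["#"] + list(word) + ["#"]
    return sum(math.log(prob(p, c)) for p, c in zip(seq, seq[1:]))

# Toy, made-up training forms (NOT real Japanese corpus data).
toy_lexicon = [
    ["k", "a", "t", "a"],
    ["t", "a", "k", "a"],
    ["k", "a", "k", "i"],
    ["t", "i", "k", "a"],
]
prob = train_bigram(toy_lexicon)

# A form built from attested transitions versus one with an unattested cluster.
attested_like = log_prob(["k", "a", "t", "i"], prob)
cluster = log_prob(["k", "t", "a", "a"], prob)
print(attested_like, cluster)
```

Under these assumptions the form containing an unattested consonant cluster receives a lower, but still finite, log-probability than the form built from attested transitions. A model closer to the abstract's description would replace the bigram context with the full preceding sequence.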
Social networks and intraspeaker variation during periods of language change
Previous work has revealed general characteristics of language change at the level of both linguistic communities and individual speakers. What are the properties of language users such that we can account for these characteristics? To address this question, we built a computational model of a social network of language users. By holding the network structure constant and varying properties of the language users, we found that language change reflects both the structure of social networks and properties of language users. In particular, our results suggest that although language users must be capable of probabilistically accessing multiple grammars, they must prefer to access a single grammar categorically.
Learning a gradient grammar of French liaison
In certain French words, an orthographically final consonant is unpronounced except, in certain environments, when it precedes a vowel. This phenomenon, liaison, shows significant interactions with several other patterns in French (including h-aspiré, schwa deletion, and the presence of other morphemes in the liaison context). We present a learning algorithm that acquires a grammar that accounts for these patterns and their interactions. The learned grammar employs Gradient Symbolic Computation (GSC), incorporating weighted constraints and partially-activated symbolic representations. Grammatical analysis in the GSC framework includes the challenging determination of the numerical strength of symbolic constituent activations (as well as constraints). Here we present the first general algorithm for learning these quantities from empirical examples: the Error-Driven Gradient Activation Readjustment (EDGAR). Smolensky and Goldrick (2016) proposed a GSC analysis, with hand-determined numerical strengths, in which liaison derives from the coalescence of partially-activated input consonants. EDGAR allows us to extend this work to a wider range of liaison phenomena by automatically determining the more comprehensive set of numerical strengths required to generate the complex pattern of overall liaison behaviour.
Transient blend states and discrete agreement-driven errors in sentence production
Errors in subject-verb agreement are common in everyday language production. This has been studied using a preamble completion task in which a participant hears or reads a preamble containing inflected nouns and forms a complete English sentence (“The key to the cabinets” could be completed as The key to the cabinets is gold). Existing work has focused on errors arising in selecting the correct verb form for production in the presence of a more ‘local’ noun with different number features (The key to the cabinets are gold). However, the same paradigm elicits substantial numbers of preamble errors (The key to the cabinets repeated as The key to the cabinet) that existing theories have largely failed to address.
We propose a Gradient Symbolic Computation (GSC) account of agreement and preamble errors. Sentence processing is modeled as a continuous-time, continuous-state stochastic dynamical system. Within this continuous representational space, a subset of states reflects discrete symbolic structures. The remainder are blend states where multiple symbols are simultaneously partially active. Initial phases of computation prefer blend states; an additional dynamic control parameter, commitment strength, pushes the model to discrete structures. This process, combined with stochastic gradient ascent dynamics respecting grammatical constraints on syntactic structures, yields discrete sentence outputs. We propose that transient blend states allow portions of target and non-target syntactic structures to interact, yielding both verb and preamble errors.
A restricted interaction account (RIA) of spoken word production: The best of both worlds.
Theories of spoken word production generally assume that mapping from conceptual representations (e.g., [furry, feline, domestic]) to phonemes (e.g., /k/, /ae/, /t/) involves both a meaning-based process and a sound-based process. A central question in this framework is how these two processes interact with one another. Two theories that occupy extreme positions on the continuum of interactivity are reviewed: a highly discrete position (e.g.,
Theories of single word production generally assume that two cognitive processes are required for mapping from a conceptual representation (e.g., [furry, feline, domestic]) to the set of phonemes used to communicate that concept (/k/, /ae/, /t/). The first process is meaning-based (semantic) and involves the selection of a particular word to express a nonverbal concept. The second is sound-based (phonological) and involves retrieving the phonemes that correspond to the selected word.
To motivate RIA, we first examine highly discrete and interactive theories and carry out a fairly extensive review of the data that are problematic for each. We then show that RIA has the necessary and sufficient features to account for the existing empirical findings and, furthermore, that it can be extended to account for more recent challenges.
A GENERIC TWO-STAGE FRAMEWORK
Theories of spoken word production differ not only in terms of the degree of interactivity that they incorporate, but also with regard to a number of representational and architectural issues. In order to focus specifically on differences among the theories regarding discreteness/interactivity, we adopt a "generic" architecture that abstracts away from many of the representational differences among current theories. To motivate this generic framework, we first briefly review certain prominent theories of spoken word production.
Theories of spoken word production: Representations
Most accounts of spoken word production assume a spreading-activation architecture, whereby processing involves sets of units or nodes that accumulate activation and transmit it to other units. The sets of units represent different types of information. Most theories assume separate sets for semantic, syntactic, and phonological information; theories differ, however, in other regards. We briefly review the proposals of Levelt et al
Enhanced harvest performance predictability through advanced multivariate data analysis of mammalian cell culture particle size distribution
The industry's pursuit of higher antibody production has led to increased cell density cultures that impact the performance of subsequent product recovery steps. This increase in cell concentration has highlighted the critical role of solids concentration in centrifugation yield, while recent product degradation cases have shed light on the impact of cell lysis on product quality. Current methods for measuring solids concentration and cell lysis are not suited for early-stage high-throughput experimentation, which means that these cell culture outputs are not well characterized in early process development. This article describes a novel approach that leverages the data from a widely used automated cell counter (Vi-CELL™ XR) to accurately predict solids concentration and a common cell lysis indicator, lactate dehydrogenase (LDH) release. For this purpose, partial least squares (PLS) models were derived with k-fold cross-validation from the particle size distribution data generated by the cell counter. The PLS models showed good predictive potential for both LDH release and solids concentration. This novel approach reduced the time required for evaluating the solids concentration and LDH for a typical high-throughput cell culture system (with 48 bioreactors in parallel) from around 7 h down to a few minutes.
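The modelling step described above (PLS regression evaluated with k-fold cross-validation) can be sketched as follows. This is a minimal single-response PLS (PLS1, NIPALS-style) written with NumPy, not the authors' implementation. The data are synthetic stand-ins for particle size distribution bins and a measured response. The number of components, the number of folds, and all function names are assumptions chosen for illustration.

```python
import numpy as np

def pls1_fit(X, y, n_components):
    """Fit single-response PLS via NIPALS; returns (x_mean, y_mean, coef)."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xr, yr = X - x_mean, y - y_mean
    W, P, q = [], [], []
    for _ in range(n_components):
        w = Xr.T @ yr
        w /= np.linalg.norm(w)          # weight vector (unit length)
        t = Xr @ w                      # scores
        tt = t @ t
        p = Xr.T @ t / tt               # X loadings
        qa = yr @ t / tt                # y loading
        Xr = Xr - np.outer(t, p)        # deflate X
        yr = yr - qa * t                # deflate y
        W.append(w); P.append(p); q.append(qa)
    W, P, q = np.array(W).T, np.array(P).T, np.array(q)
    coef = W @ np.linalg.solve(P.T @ W, q)   # regression vector in X space
    return x_mean, y_mean, coef

def pls1_predict(model, X):
    x_mean, y_mean, coef = model
    return (X - x_mean) @ coef + y_mean

def kfold_predictions(X, y, n_components, k=5, seed=0):
    """Out-of-fold predictions: each sample is predicted by a model that never saw it."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(y)), k)
    y_hat = np.empty_like(y, dtype=float)
    for test_idx in folds:
        train_mask = np.ones(len(y), dtype=bool)
        train_mask[test_idx] = False
        model = pls1_fit(X[train_mask], y[train_mask], n_components)
        y_hat[test_idx] = pls1_predict(model, X[test_idx])
    return y_hat

# Synthetic data (assumption): 60 samples, 30 size bins driven by 2 latent factors.
rng = np.random.default_rng(42)
n_samples, n_bins = 60, 30
latent = rng.normal(size=(n_samples, 2))
loadings = rng.normal(size=(2, n_bins))
X = latent @ loadings + 0.05 * rng.normal(size=(n_samples, n_bins))
y = 3.0 * latent[:, 0] - 1.5 * latent[:, 1] + 0.05 * rng.normal(size=n_samples)

y_cv = kfold_predictions(X, y, n_components=2, k=5)
# Cross-validated coefficient of determination (often written Q^2).
q2 = 1.0 - np.sum((y - y_cv) ** 2) / np.sum((y - y.mean()) ** 2)
print(f"cross-validated Q^2: {q2:.3f}")
```

With real data, the number of components would normally be chosen by comparing the cross-validated error across candidate values; a fixed value of 2 is used here only because the synthetic data were generated from two latent factors.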
Machine learning and advanced data analytics automating the exploitation of Raman spectroscopy: From micro-scale to large-scale operation
- …