55 research outputs found

    Combining quantitative narrative analysis and predictive modeling - an eye tracking study

    As part of a larger interdisciplinary project on the reception of Shakespeare’s sonnets (Jacobs et al., 2017; Xue et al., 2017), the present study analyzed the eye movement behavior of participants reading three of the 154 sonnets as a function of seven lexical features extracted via Quantitative Narrative Analysis (QNA). Using a machine learning-based predictive modeling approach, five ‘surface’ features (word length, orthographic neighborhood density, word frequency, orthographic dissimilarity and sonority score) were identified as important predictors of total reading time and fixation probability in poetry reading. The fact that one phonological feature, i.e., sonority score, also played a role is in line with current theorizing on poetry reading. Our approach opens new ways for future eye movement research on reading poetic texts and other complex literary materials (cf. Jacobs, 2015c).

    Reading and Rereading Shakespeare’s Sonnets: Combining Quantitative Narrative Analysis and Predictive Modeling

    Natural reading is rather like a juggling feat, as our eyes and minds are kept on several things at the same time. In contrast, texts constructed by researchers (so-called “textoids”; Graesser, Millis, & Zwaan, 1997) tend to be fairly simple, since this facilitates experimental investigation and allows clear statements about the effects of predefined variables. Likewise, most empirical studies have focused on only a few selected features while ignoring the great diversity of other possibly important ones (e.g., Rayner et al., 2001; Reichle, Rayner, & Pollatsek, 2003; Rayner & Pollatsek, 2006; Engbert et al., 2005; Rayner, 2009). However, results obtained with textoids cannot be transferred directly to natural reading, because more than 100 features on different hierarchical levels have been identified that may influence the processing of a natural text (Graf, Nagler, & Jacobs, 2005; Jacobs, 2015a, b; Jacobs et al., 2017). The present dissertation differs from past research in that it used a literary text, i.e., Shakespeare’s sonnets, instead of texts constructed by the experimenter. Its goal was to investigate how psycholinguistic features may influence reading behavior during poem perception. To this end, two problems need to be handled. Firstly, complex natural texts need to be broken up into measurable and testable features by “turning words into numbers” (Franzosi, 2010) for the sake of statistical analysis. Secondly, statistical methods are needed to deal with the non-linear webs of correlations among different features, which has long been a concern of Jacobs’ working group (e.g., Willems, 2015; Willems & Jacobs, 2016; Jacobs & Willems, 2018). A quantitative narrative analysis (QNA) based predictive modeling approach was suggested to solve these problems (e.g., Jacobs et al., 2017; Jacobs, 2017, 2018a, b).
Since it is impossible to identify all relevant features of a natural text [e.g., over 50 features mentioned for single word recognition (Graf et al., 2005) or over 100 features computed for the corpus of Shakespeare sonnets (Jacobs et al., 2017)], and since including more inter/supra-lexical features would also require larger samples (i.e., more/longer texts and more participants), my dissertation focuses on lexical features. Seven of these are surface features (word length, word frequency, orthographic neighborhood density, higher frequency neighbors, orthographic dissimilarity index, consonant vowel quotient, and the sonority score) and two are affective-semantic features (valence and arousal). By applying the QNA-based predictive modeling approach, I conducted three eye tracking studies. Study 1 (Chapter 5) asked native English speakers to read three of Shakespeare’s sonnets (sonnets 27, 60, and 66), aiming to investigate the role of seven surface psycholinguistic features in sonnet reading. Study 2 (Chapter 6) used a rereading paradigm and had another group of native English speakers read two of the three sonnets (sonnets 27 and 66), to find out whether the roles of the surface psycholinguistic features change in rereading. In study 3 (Chapter 7), I reanalyzed the data of study 2, considering the affective-semantic features in addition to the surface features, to examine whether the roles of surface and affective-semantic features differ across reading sessions. The three studies provide highly reliable evidence for the high feature importance of surface variables and, in rereading, for an increasing impact of affective-semantic features in reading Shakespeare’s sonnets. From a methodological viewpoint, all three studies show that the neural net approach performs considerably better than the classical general linear model approach in psycholinguistic eye tracking research.
In the rereading studies, compared to the first reading, rereading generally improved the fluency of reading at the poem level (shorter total reading times, shorter regression times, and lower fixation probability) and the depth of comprehension (e.g., Hakemulder, 2004; Kuijpers & Hakemulder, 2018). Contrary to other rereading studies using literary texts (e.g., Dixon et al., 1993; Millis, 1995; Kuijpers & Hakemulder, 2018), no increase in appreciation was apparent. In summary, this dissertation shows that predictive modeling might be far more suitable for capturing the highly interactive, non-linear composition of linguistic features in natural texts that guide reading behavior and reception. In addition, surface features seem to influence reading during all reading sessions, while affective-semantic features seem to gain importance in line with processing depth, as indicated by their higher influence during rereading. The results seem to be stable and valid, as I could replicate these novel findings using machine learning algorithms within my dissertation project. My dissertation project is a first step towards a more differentiated picture of the guiding factors of poetry reception and a poetry-specific reading model.

    Visual Recognition and Categorization on the Basis of Similarities to Multiple Class Prototypes

    To recognize a previously seen object, the visual system must overcome the variability in the object's appearance caused by factors such as illumination and pose. Developments in computer vision suggest that it may be possible to counter the influence of these factors by learning to interpolate between stored views of the target object, taken under representative combinations of viewing conditions. Daily life situations, however, typically require categorization, rather than recognition, of objects. Due to the open-ended character both of natural kinds and of artificial categories, categorization cannot rely on interpolation between stored examples. Nonetheless, knowledge of several representative members, or prototypes, of each of the categories of interest can still provide the necessary computational substrate for the categorization of new instances. The resulting representational scheme based on similarities to prototypes appears to be computationally viable, and is readily mapped onto the mechanisms of biological vision revealed by recent psychophysical and physiological studies.
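
The idea of describing a new instance by its similarities to stored class prototypes can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the abstract: the Gaussian similarity function, the `width` parameter, the averaging rule in `categorize`, and the two-dimensional feature vectors in `PROTOTYPES`.

```python
import math


def similarity(x, prototype, width=1.0):
    """Gaussian similarity between a feature vector and one stored prototype."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, prototype))
    return math.exp(-sq_dist / (2.0 * width ** 2))


def similarity_profile(x, prototypes_by_class, width=1.0):
    """Describe x by its similarities to every prototype of every class."""
    return {
        label: [similarity(x, p, width) for p in protos]
        for label, protos in prototypes_by_class.items()
    }


def categorize(x, prototypes_by_class, width=1.0):
    """Assign x to the class whose prototypes are, on average, most similar."""
    profile = similarity_profile(x, prototypes_by_class, width)
    return max(profile, key=lambda label: sum(profile[label]) / len(profile[label]))


# Hypothetical 2-D feature vectors: two prototypes per category.
PROTOTYPES = {
    "cat": [(1.0, 1.0), (1.2, 0.8)],
    "car": [(5.0, 5.0), (4.8, 5.3)],
}

if __name__ == "__main__":
    # A new instance that matches no stored prototype exactly.
    print(categorize((1.1, 1.1), PROTOTYPES))
```

The new instance is never matched against a stored view of itself. It is placed relative to known category members, which is what lets the scheme handle open-ended categories.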

    Reading Shakespeare sonnets: Combining quantitative narrative analysis and predictive modeling - an eye tracking study

    As part of a larger interdisciplinary project on the reception of Shakespeare’s sonnets (Jacobs et al., 2017; Xue et al., 2017), the present study analyzed the eye movement behavior of participants reading three of the 154 sonnets as a function of seven lexical features extracted via Quantitative Narrative Analysis (QNA). Using a machine learning-based predictive modeling approach, five ‘surface’ features (word length, orthographic neighborhood density, word frequency, orthographic dissimilarity and sonority score) were identified as important predictors of total reading time and fixation probability in poetry reading. The fact that one phonological feature, i.e., sonority score, also played a role is in line with current theorizing on poetry reading. Our approach opens new ways for future eye movement research on reading poetic texts and other complex literary materials (cf. Jacobs, 2015c).

    Reduced Order Models and Data Assimilation for Hydrological Applications

    This thesis concerns the study of Monte Carlo (MC)-based data assimilation methods applied to the numerical simulation of complex hydrological models with stochastic parameters. The ensemble Kalman filter (EnKF) and sequential importance resampling (SIR) are implemented in the CATHY model, a solver that couples subsurface water flow in porous media with surface water dynamics. A detailed comparison of the results given by the two filters in a synthetic test case highlights the main benefits and drawbacks associated with these techniques. A modification of the SIR update is suggested to improve the performance of the filter in the case of small ensemble sizes and small variances of the measurement errors. With this modification, both filters are able to assimilate pressure head and streamflow measurements and correct model errors, such as biased initial and boundary conditions. The SIR technique seems to be better suited for the simulations at hand, as it does not make use of the Gaussian approximation inherent in the EnKF method. Further research is needed, however, to assess the robustness of particle filter methods, in particular to ensure the accuracy of the results even when relatively small ensemble sizes are employed. In the second part of the thesis, the focus shifts to reducing the computational burden associated with the construction of the MC realizations (which constitutes the core of the EnKF and SIR). With this goal, we analyze the computational savings associated with the use of reduced order models (RM) for the generation of the ensemble of solutions. The proper orthogonal decomposition (POD) is applied to the linear equations of groundwater flow in saturated porous media with a randomly distributed recharge and random heterogeneous hydraulic conductivity. Several test cases are used to assess the errors on the ensemble statistics caused by the RM approximation.
Particular attention is given to the efficient computation of the principal components that are needed to project the model equations onto the reduced space. The greedy algorithm selects the snapshots in the set of the MC realizations in such a way that the final principal components are parameter independent. An innovative residual-based estimation of the error associated with the RM solution is used to assess the precision of the RM and to stop the iterations of the greedy algorithm. By way of numerical applications in synthetic and real scenarios, we demonstrate that this modified greedy algorithm determines the minimum number of principal components to use in the reduction and, thus, leads to important computational savings.
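
For readers unfamiliar with SIR, the textbook form of a single update can be sketched as follows. This is not the modified update proposed in the thesis, whose details the abstract does not give. It assumes a scalar state that is observed directly, a Gaussian measurement error with standard deviation `obs_std`, and systematic resampling. The function name `sir_update` and all parameter names are hypothetical.

```python
import math
import random


def sir_update(particles, weights, observation, obs_std, rng):
    """One textbook SIR step: reweight particles by the likelihood, then resample."""
    # Importance step: multiply each weight by a Gaussian likelihood p(y | x).
    likelihoods = [
        math.exp(-0.5 * ((observation - x) / obs_std) ** 2) for x in particles
    ]
    new_weights = [w * lik for w, lik in zip(weights, likelihoods)]
    total = sum(new_weights)
    if total == 0.0:
        # Every likelihood underflowed to zero: fall back to the prior weights.
        new_weights, total = list(weights), sum(weights)
    new_weights = [w / total for w in new_weights]

    # Systematic resampling: one random offset, then n evenly spaced pointers.
    n = len(particles)
    offset = rng.random() / n
    resampled, cumulative, j = [], new_weights[0], 0
    for i in range(n):
        pointer = offset + i / n
        while pointer > cumulative and j < n - 1:
            j += 1
            cumulative += new_weights[j]
        resampled.append(particles[j])

    # After resampling, every particle carries the same weight.
    return resampled, [1.0 / n] * n


if __name__ == "__main__":
    prior = [0.0, 1.0, 2.0, 3.0, 4.0]
    posterior, _ = sir_update(prior, [0.2] * 5, 3.0, 0.5, random.Random(0))
    print(posterior)
```

With a small ensemble and a small `obs_std`, almost all of the weight falls on one or two particles, so the resampled ensemble collapses onto a few values. That degeneracy is the situation the abstract says the modified update is meant to address.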

    Second language acquisition of Japanese orthography


    Forested Watersheds and Water Supply: Exploring Effects of Wildfires, Silviculture, and Climate Change on Downstream Waters

    Drinking water supplies for much of society originate in forests. To preserve the capability of these forests to produce clean and easily treatable water, source water supply and protection strategies focus in particular on potential disturbances to the landscape, which include prescribed forest harvesting and wildfires of varying intensity. While decades of work have revealed relationships between forest harvesting and stream flow response, there is a considerable lack of synthesis disentangling the interactions of climate, wildfires, stream flow, and water quality. Revealing the mechanisms for impacts on downstream waters after disturbances of harvesting and wildfire will greatly improve land and water management. In this dissertation, I combined synthesis of previously published or available data, novel mathematical analyses, and deterministic modeling to disentangle various disturbance effects and further our understanding of processes in forested watersheds. I broadly sought to explore how streamflow and water quality change after forest disturbances, and how new methods and analyses can provide insight into the biogeochemical and ecohydrologic processes changing during disturbances. First, I examined the effect of wildfire on hydrology, and developed a novel Budyko decomposition method to separate climatic and disturbance effects on streamflow. Using a set of 17 watersheds in southern California, I showed that while traditional metrics like changes in flow or runoff ratio might not detect a disturbance effect from wildfire due to confounding climate signals, the Budyko framework can be used successfully for statistical change detection. The method was used to estimate hydrologic recovery timescales that varied between 5 and 45 years, with an increase of about 4 years of recovery time per 10% of the watershed burned. 
Next, in Chapter 3, I used a meta-analysis approach to examine the effect of wildfire on water quality, using data from 121 catchments around the world. Analyzing the changes in concentrations of stream water nutrients, including carbon, nitrogen, and phosphorus, I showed that concentrations generally increased after fire. While a large amount of variability existed in the data, I found concurrent increases in the constituents after fire, highlighting the tight coupling of the biogeochemical cycles. Most interestingly, I found that fire increased the concentrations of biologically active nutrients like nitrate and phosphate at a greater rate than total nitrogen and phosphorus, with median increases of 40-60% in the nitrate to TN and SRP to TP ratios. Next, I conducted an analysis of both water quality and hydrology together after fire in Chapter 4, using a set of 29 wildfire-impacted watersheds in the United States. Concentration-discharge relationships can be used to reveal pathways and sources of elements exported from watersheds, and my overall hypothesis was that these relationships change in post-fire landscapes. I developed a new methodology, using k-means clustering, to classify watersheds as chemostatic, dilution, mobilization and chemodynamic, and explored how their position within the cluster changed in post-fire landscapes. I found that the behavior of nitrate and ammonium was increasingly chemostatic after fire, while the behavior of total nitrogen, phosphorus, and organic phosphorus was increasingly mobilizing after fire. Finally, I developed a coupled hydrology-vegetation-biogeochemistry model to simulate and elucidate processes controlling the impact of harvesting on downstream waters.
I focused on the Turkey Lakes watershed, where a significant amount of data has been collected on vegetation and soil nutrient dynamics in addition to traditional streamflow and water quality metrics, and developed a novel multi-part calibration process that used measured data on stream, forest, and soil characteristics and dynamics. Future work would involve using the model to explore the data-driven relationships that have been developed in the earlier chapters of the paper. The work presented in this dissertation highlights new small and large-scale relationships between disturbances in forested watersheds and effects on downstream waters. With more threats predicted to escalate and overlap in the coming years, the novel results and methodologies that I have presented here should contribute to improving land and water management.
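
As background for the Budyko framework mentioned in this abstract, the sketch below evaluates the classic Budyko (1974) curve and compares an observed runoff ratio with the value the curve predicts from climate alone. It is not the decomposition method developed in the dissertation, which the abstract does not specify. It assumes a long-term water balance with negligible storage change (Q = P - E), and the function names and example values are hypothetical.

```python
import math


def budyko_evaporative_ratio(aridity_index):
    """Classic Budyko (1974) curve: long-term E/P as a function of PET/P (> 0)."""
    phi = aridity_index
    return math.sqrt(phi * math.tanh(1.0 / phi) * (1.0 - math.exp(-phi)))


def runoff_ratio_anomaly(precip, pet, streamflow):
    """Observed minus Budyko-expected runoff ratio Q/P (inputs in the same units)."""
    expected_runoff_ratio = 1.0 - budyko_evaporative_ratio(pet / precip)
    observed_runoff_ratio = streamflow / precip
    return observed_runoff_ratio - expected_runoff_ratio


if __name__ == "__main__":
    # Hypothetical annual values in mm: precipitation, potential ET, streamflow.
    print(runoff_ratio_anomaly(precip=1000.0, pet=1000.0, streamflow=400.0))
```

A positive anomaly means the watershed yields more streamflow than its climate alone would suggest. Because the expected value already accounts for the aridity index, a shift in the anomaly after a fire is less confounded by wet or dry years than a raw change in flow or runoff ratio.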

    A Computational Model of Auditory Feature Extraction and Sound Classification

    This thesis introduces a computer model that incorporates responses similar to those found in the cochlea, in sub-cortical auditory processing, and in auditory cortex. The principal aim of this work is to show that this can form the basis for a biologically plausible mechanism of auditory stimulus classification. We will show that this classification is robust to stimulus variation and time compression. In addition, the response of the system is shown to support multiple, concurrent, behaviourally relevant classifications of natural stimuli (speech). The model incorporates transient enhancement, an ensemble of spectro-temporal filters, and a simple measure analogous to the idea of visual salience to produce a quasi-static description of the stimulus suitable either for classification with an analogue artificial neural network or, using appropriate rate coding, a classifier based on artificial spiking neurons. We also show that the spectro-temporal ensemble can be derived from a limited class of 'formative' stimuli, consistent with a developmental interpretation of ensemble formation. In addition, ensembles chosen on information theoretic grounds consist of filters with relatively simple geometries, which is consistent with reports of responses in mammalian thalamus and auditory cortex. A powerful feature of this approach is that the ensemble response, from which salient auditory events are identified, amounts to a stimulus-ensemble-driven method of segmentation which respects the envelope of the stimulus, and leads to a quasi-static representation of auditory events which is suitable for spike rate coding. We also present evidence that the encoded auditory events may form the basis of a representation-of-similarity, or second order isomorphism, which implies a representational space that respects similarity relationships between stimuli, including novel stimuli.