
    Influence properties of partial least squares regression.

    In this paper, we compute the influence function for partial least squares regression. Thereunto, we design two alternative algorithms, according to the PLS algorithm used. One algorithm for the computation of the influence function is based on the Helland PLS algorithm, whilst the other is compatible with SIMPLS. The calculation of the influence function leads to new influence diagnostic plots for PLS. An alternative to the well-known Cook distance plot is proposed, as well as a variant which is sample specific. Moreover, a novel estimate of prediction variance is deduced. The validity of the latter is corroborated by dint of a Monte Carlo simulation.
    Influence function; Design; Algorithms; Simulation;

    Influence properties of partial least squares regression.

    Regression; Partial least squares; Least-squares; Squares; Squares regression;

    The L2 acquisition of agreement: comparing the interlanguage of Dutch, English, French and Swedish-speaking learners of Spanish

    Much of current generative research into non-native language (L2) acquisition of morphosyntax has focused on L1 transfer and access to Universal Grammar. Subject-Verb agreement (1) has figured more prominently than nominal agreement ((1)-(2)) in this debate, but empirical findings remain inconclusive. For instance, Hawkins & Franceschina (2004) conclude that UG features (e.g. [GENDER]) not realised in the L1 cannot be acquired, whereas White et al. (2001) argue the opposite. The present study examines the acquisition of nominal and verbal agreement marking in L2 Spanish through acceptability judgement, comprehension and production tasks carried out amongst adult L2 acquirers matched for at least two levels of proficiency, with L1s which vary in terms of the realisation of nominal and/or verbal agreement. I demonstrate that the fact that L2ers can produce or recognise agreeing morphological markers is not sufficient to ascribe to them knowledge of syntactic agreement (and hence of the relevant functional features). The experiments address this issue by examining (non)agreement in non-contiguous ('long' distance) contexts with a complex sentential subject consisting of a head noun and an intervener (as illustrated in (3a-b) for nominal gender marking). Such test items systematically contrast contexts where the head noun and intervener have matching (3a) versus opposite (3b) agreement features. L2ers at lower proficiency levels perform significantly better in contexts with matching than with opposite gender agreement features, suggesting that they rely more on linear word order and hence general cognitive learning strategies.
The most advanced L2ers, however, demonstrate native-like 'long' distance agreement in all contexts, suggesting (hierarchical) structure dependency and hence acquisition that is specific to Language (contra Hawkins & Chan's (1997) Failed Functional Features Hypothesis, but supporting access to UG as defined by Schwartz & Sprouse's (1996) Full Transfer/Full Access Theory). The data also reveal that not all types of morphosyntactic agreement are equally acquirable. For all L2ers regardless of their L1, nominal and verbal [NUMBER] are less problematic than [PERSON] and [GENDER]. These L2A findings differ from the results of studies into the L1A of Spanish agreement morphology. L1 children master gender agreement before they start producing nominal number agreement (Marrero & Aguirre 2003, Hernandez Pina 1984) and produce distinctions between different verbal persons (1st and 3rd) before plural verb forms emerge (Bel 2002, Grinstead 2000, López Ornat 1997). The L2ers' L1 does play a role, however, in the initial stages of L2A, particularly in the field of L2 morphology. Problems with remapping syntactic features onto surface morphology cause difficulties for L2ers whose L1 operates a different morphological system to L2 Spanish. L1 French speakers, for instance, have fewer problems with the acquisition of separate morphemes for nominal gender and nominal number agreement in L2 Spanish than Dutch and Swedish L2ers whose L1 uses a portmanteau morpheme to realise both features. These problems in the field of 'morphological competence' (Lardiere 2005) appear more relevant than issues of syntactic transfer as predicted by Schwartz & Sprouse (1996). Indeed, L1 English learners of Spanish do not seem to experience more problems building up a morphosyntactic system for nominal agreement from scratch than the Swedish and Dutch L2ers who need to 'remap' (i.e. disentangle and reassemble - Lardiere 2005) syntactic features to agreement morphemes.
The finding that mapping problems between syntactic features and lexical forms prevent some L2ers from producing concording agreement morphology is also confirmed by the discrepancy between L2ers' ability to interpret and judge agreement marking, as reflected in the acceptability judgement and comprehension tasks, and the L2ers' more limited ability to produce agreement marking. Moreover, the least marked features often act as defaults, as demonstrated by the overgeneralization of [+MASC], [+3P] and [+SG] markings.

    Robust continuum regression.

    Several applications of continuum regression to non-contaminated data have shown that a significant improvement in predictive power can be obtained compared to the three standard techniques which it encompasses (Ordinary Least Squares, Principal Component Regression and Partial Least Squares). For contaminated data continuum regression may yield aberrant estimates due to its non-robustness with respect to outliers. Also for data originating from a distribution which significantly differs from the normal distribution, continuum regression may yield very inefficient estimates. In the current paper, robust continuum regression (RCR) is proposed. To construct the estimator, an algorithm based on projection pursuit is proposed. The robustness and good efficiency properties of RCR are shown by means of a simulation study. An application to an X-ray fluorescence analysis of hydrometallurgical samples illustrates the method's applicability in practice.
    Advantages; Applications; Calibration; Continuum regression (CR); Data; Distribution; Efficiency; Estimator; Least-squares; M-estimators; Methods; Model; Optimal; Ordinary least squares; Outliers; Partial least squares; Precision; Prediction; Projection-pursuit; Regression; Research; Robust continuum regression (RCR); Robust multivariate calibration; Robust regression; Robustness; Simulation; Squares; Studies; Variables; Yield;

    Partial robust M-regression.

    Partial Least Squares (PLS) is a standard statistical method in chemometrics. It can be considered as an incomplete, or 'partial', version of the Least Squares estimator of regression, applicable when high or perfect multicollinearity is present in the predictor variables. The Least Squares estimator is well-known to be an optimal estimator for regression, but only when the error terms are normally distributed. In the absence of normality, and in particular when outliers are in the data set, other more robust regression estimators have better properties. In this paper a 'partial' version of M-regression estimators will be defined. If an appropriate weighting scheme is chosen, partial M-estimators become entirely robust to any type of outlying points, and are called Partial Robust M-estimators. It is shown that partial robust M-regression outperforms existing methods for robust PLS regression in terms of statistical precision and computational speed, while keeping good robustness properties. The method is applied to a data set consisting of EPXMA spectra of archaeological glass vessels. This data set contains several outliers, and the advantages of partial robust M-regression are illustrated. Applying partial robust M-regression yields much smaller prediction errors for noisy calibration samples than PLS. On the other hand, if the data follow perfectly well a normal model, the loss in efficiency to be paid for is very small.
    Advantages; Applications; Calibration; Data; Distribution; Efficiency; Estimator; Least-squares; M-estimators; Methods; Model; Optimal; Ordinary least squares; Outliers; Partial least squares; Precision; Prediction; Projection-pursuit; Regression; Robust regression; Robustness; Simulation; Spectometric quantization; Squares; Studies; Variables; Yield;

    Robust continuum regression.

    Several applications of continuum regression (CR) to non-contaminated data have shown that a significant improvement in predictive power can be obtained compared to the three standard techniques which it encompasses (ordinary least squares (OLS), principal component regression (PCR) and partial least squares (PLS)). For contaminated data continuum regression may yield aberrant estimates due to its non-robustness with respect to outliers. Also for data originating from a distribution which significantly differs from the normal distribution, continuum regression may yield very inefficient estimates. In the current paper, robust continuum regression (RCR) is proposed. To construct the estimator, an algorithm based on projection pursuit (PP) is proposed. The robustness and good efficiency properties of RCR are shown by means of a simulation study. An application to an X-ray fluorescence analysis of hydrometallurgical samples illustrates the method's applicability in practice.
    Regression; Applications; Data; Ordinary least squares; Least-squares; Squares; Partial least squares; Yield; Outliers; Distribution; Estimator; Projection-pursuit; Robustness; Efficiency; Simulation; Studies;

    Reconciliation of weak pairwise spike-train correlations and highly coherent local field potentials across space

    Chronic and acute implants of multi-electrode arrays that cover several mm² of neural tissue provide simultaneous access to population signals like extracellular potentials and the spiking activity of 100 or more individual neurons. While the recorded data may uncover principles of brain function, its interpretation calls for multiscale computational models with corresponding spatial dimensions and signal predictions. Such models can facilitate the search of mechanisms underlying observed spatiotemporal activity patterns in cortex. Multi-layer spiking neuron network models of local cortical circuits covering ~1 mm² have been developed, integrating experimentally obtained neuron-type specific connectivity data and reproducing features of in-vivo spiking statistics. With forward models, local field potentials (LFPs) can be computed from the simulated spiking activity. To account for the spatial scale of common neural recordings, we extend a local network and LFP model to 4x4 mm². The upscaling preserves the neuron densities, and introduces distance-dependent connection probabilities and delays. As detailed experimental connectivity data is partially lacking, we address this uncertainty in model parameters by testing parameter combinations within biologically plausible bounds. Based on model predictions of spiking activity and LFPs, we find that the upscaling procedure preserves the overall spiking statistics of the original model and reproduces asynchronous irregular spiking across populations and weak pairwise spike-train correlations observed in sensory cortex. In contrast with the weak spike-train correlations, the correlation of LFP signals is strong and distance-dependent, compatible with experimental observations. Enhanced spatial coherence in the low-gamma band may explain the recent experimental report of an apparent band-pass filter effect in the spatial reach of the LFP.
    Comment: 44 pages, 9 figures, 5 tables

    Intersecting near-optimal spaces: European power systems with more resilience to weather variability

    We suggest a new methodology for designing robust energy systems. For this, we investigate so-called near-optimal solutions to energy system optimisation models; solutions whose objective values deviate only marginally from the optimum. Using a refined method for obtaining explicit geometric descriptions of these near-optimal feasible spaces, we find designs that are as robust as possible to perturbations. This contributes to the ongoing debate on how to define and work with robustness in energy systems modelling. We apply our methods in an investigation using multiple decades of weather data. For the first time, we run a capacity expansion model of the European power system (one node per country) with a three-hourly temporal resolution and 41 years of weather data. While an optimisation with 41 weather years is at the limits of computational feasibility, we use the near-optimal feasible spaces of single years to gain an understanding of the design space over the full time period. Specifically, we intersect all near-optimal feasible spaces for the individual years in order to get designs that are likely to be feasible over the entire time period. We find significant potential for investment flexibility, and verify the feasibility of these designs by simulating the resulting dispatch problem with four decades of weather data. They are characterised by a shift towards more onshore wind and solar power, while emitting more than 50% less CO2 than a cost-optimal solution over that period. Our work builds on recent developments in the field, including techniques such as Modelling to Generate Alternatives (MGA) and Modelling All Alternatives (MAA), and provides new insights into the geometry of near-optimal feasible spaces and the importance of multi-decade weather variability for energy systems design. We also provide an effective way of working with a multi-decade time frame in a highly parallelised manner. 
Our implementation is open-sourced, adaptable and is based on PyPSA-Eur.