
    lassopack: Model selection and prediction with regularized regression in Stata

    This article introduces lassopack, a suite of programs for regularized regression in Stata. lassopack implements lasso, square-root lasso, elastic net, ridge regression, adaptive lasso and post-estimation OLS. The methods are suitable for the high-dimensional setting where the number of predictors $p$ may be large and possibly greater than the number of observations, $n$. We offer three different approaches for selecting the penalization ('tuning') parameters: information criteria (implemented in lasso2), $K$-fold cross-validation and $h$-step ahead rolling cross-validation for cross-section, panel and time-series data (cvlasso), and theory-driven ('rigorous') penalization for the lasso and square-root lasso for cross-section and panel data (rlasso). We discuss the theoretical framework and practical considerations for each approach. We also present Monte Carlo results to compare the performance of the penalization approaches. Comment: 52 pages, 6 figures, 6 tables; submitted to Stata Journal; for more information see https://statalasso.github.io

    Optimal Scaling transformations to model non-linear relations in GLMs with ordered and unordered predictors

    In Generalized Linear Models (GLMs) it is assumed that there is a linear effect of the predictor variables on the outcome. However, this assumption is often too strict, because in many applications predictors have a nonlinear relation with the outcome. Optimal Scaling (OS) transformations combined with GLMs can deal with this type of relation. Transformations of the predictors have been integrated in GLMs before, e.g. in Generalized Additive Models. However, the OS methodology has several benefits. For example, the levels of categorical predictors are quantified directly, such that they can be included in the model without defining dummy variables. This approach enhances the interpretation and visualization of the effect of different levels on the outcome. Furthermore, monotonicity restrictions can be applied to the OS transformations such that the original ordering of the category values is preserved. This improves the interpretation of the effect and may prevent overfitting. The scaling level can be chosen for each individual predictor such that models can include mixed scaling levels. In this way, a suitable transformation can be found for each predictor in the model. The implementation of OS in logistic regression is demonstrated using three datasets that contain a binary outcome variable and a set of categorical and/or continuous predictor variables. Comment: 35 pages, 4 figures

    Fisheries

    This is the final version. Available from MCCIP via the DOI in this record

    Sub-Typing of Rheumatic Diseases Based on a Systems Diagnosis Questionnaire

    The future of personalized medicine depends on advanced diagnostic tools to characterize responders and non-responders to treatment. Systems diagnosis is a new approach which aims to capture a large amount of symptom information from patients to characterize relevant sub-groups. 49 patients with a rheumatic disease were characterized using a systems diagnosis questionnaire containing 106 questions based on Chinese and Western medicine symptoms. Categorical principal component analysis (CATPCA) was used to discover differences in symptom patterns between the patients. Two Chinese medicine experts were subsequently asked to rank the Cold and Heat status of all the patients based on the questionnaires. These rankings were used to study the Cold and Heat symptoms used by these practitioners. The CATPCA analysis results in three dimensions. The first dimension is a general factor (40.2% explained variance). In the second dimension (12.5% explained variance) 'anxious', 'worrying', 'uneasy feeling' and 'distressed' were interpreted as the Internal disease stage, and 'aggravate in wind', 'fear of wind' and 'aversion to cold' as the External disease stage. In the third dimension (10.4% explained variance) 'panting s', 'superficial breathing', 'shortness of breath s', 'shortness of breath f' and 'aversion to cold' were interpreted as Cold and 'restless', 'nervous', 'warm feeling', 'dry mouth s' and 'thirst' as Heat related. 'Aversion to cold', 'fear of wind' and 'pain aggravates with cold' are most related to the experts' Cold rankings and 'aversion to heat', 'fullness of chest' and 'dry mouth' to the Heat rankings. This study shows that the presented systems diagnosis questionnaire is able to identify groups of symptoms that are relevant for sub-typing patients with a rheumatic disease.

    Comparison of computational codes for direct numerical simulations of turbulent Rayleigh-Bénard convection

    Computational codes for direct numerical simulations of Rayleigh-Bénard (RB) convection are compared in terms of computational cost and quality of the solution. As a benchmark case, RB convection at $Ra=10^8$ and $Pr=1$ in a periodic domain, in cubic and cylindrical containers is considered. A dedicated second-order finite-difference code (AFID/RBflow) and a specialized fourth-order finite-volume code (Goldfish) are compared with a general purpose finite-volume approach (OpenFOAM) and a general purpose spectral-element code (Nek5000). Reassuringly, all codes provide predictions of the average heat transfer that converge to the same values. The computational costs, however, are found to differ considerably. The specialized codes AFID/RBflow and Goldfish are found to excel in efficiency, outperforming the general purpose flow solvers Nek5000 and OpenFOAM by an order of magnitude with an error on the Nusselt number $Nu$ below 5%. However, we find that $Nu$ alone is not sufficient to assess the quality of the numerical results: in fact, instantaneous snapshots of the temperature field from a near wall region obtained for deliberately under-resolved simulations using Nek5000 clearly indicate inadequate flow resolution even when $Nu$ is converged. Overall, dedicated special purpose codes for RB convection are found to be more efficient than general purpose codes. Comment: 12 pages, 5 figures

    Mathematics in different settings: plenary panel.

    When we think about the title “Mathematics in different settings”, a number of questions arise. For example:
    • How many mathematics are there – one or many? Is there a mathematics that is “prior to”, or independent of, any setting?
    • What (who) is it that makes settings “different”? And how does this relate to social differences among people?
    • What is an appropriate typology of different settings – for research or for curriculum design purposes? Relatedly, we might ask: who decides what is “important”?
    • What is the nature of relations among policy arrangements, research and educational institutional settings?
    • How are different settings represented in mathematics teaching and assessment?
    • What is the relationship of mathematics education researchers to any setting