714 research outputs found

lassopack: Model selection and prediction with regularized regression in Stata

Author: Athey S.
Huang J.
Shao J.
Tikhonov A. N.
Van der Kooij A.
Yang Y.
Zhao P.
Publication date: 16/01/2019

This article introduces lassopack, a suite of programs for regularized regression in Stata. lassopack implements lasso, square-root lasso, elastic net, ridge regression, adaptive lasso and post-estimation OLS. The methods are suitable for the high-dimensional setting where the number of predictors p may be large and possibly greater than the number of observations, n. We offer three different approaches for selecting the penalization ('tuning') parameters: information criteria (implemented in lasso2), K-fold cross-validation and h-step ahead rolling cross-validation for cross-section, panel and time-series data (cvlasso), and theory-driven ('rigorous') penalization for the lasso and square-root lasso for cross-section and panel data (rlasso). We discuss the theoretical framework and practical considerations for each approach. We also present Monte Carlo results to compare the performance of the penalization approaches.

Comment: 52 pages, 6 figures, 6 tables; submitted to Stata Journal; for more information see https://statalasso.github.io

arXiv.org e-Print Archive

Heriot Watt Pure

Crossref

Determination of Serum Nucleotidase with Cytidine Monophosphate as Substrate. Part II: Improvement of the procedure

Author: Haut A.
Kooij P. J. van der
PERSIJN J.-P.
Slik W. van der
Publication venue: 'Walter de Gruyter GmbH'
Publication date: 01/01/1976

Peer Reviewed

Dokumenten-Publikationsserver der Humboldt-Universität zu Berlin

Optimal Scaling transformations to model non-linear relations in GLMs with ordered and unordered predictors

Author: Meulman J. J.
van der Kooij A. J.
Willems S. J. W.
Publication date: 01/09/2023

In Generalized Linear Models (GLMs) it is assumed that there is a linear effect of the predictor variables on the outcome. However, this assumption is often too strict, because in many applications predictors have a nonlinear relation with the outcome. Optimal Scaling (OS) transformations combined with GLMs can deal with this type of relation. Transformations of the predictors have been integrated in GLMs before, e.g. in Generalized Additive Models. However, the OS methodology has several benefits. For example, the levels of categorical predictors are quantified directly, such that they can be included in the model without defining dummy variables. This approach enhances the interpretation and visualization of the effect of different levels on the outcome. Furthermore, monotonicity restrictions can be applied to the OS transformations such that the original ordering of the category values is preserved. This improves the interpretation of the effect and may prevent overfitting. The scaling level can be chosen for each individual predictor such that models can include mixed scaling levels. In this way, a suitable transformation can be found for each predictor in the model. The implementation of OS in logistic regression is demonstrated using three datasets that contain a binary outcome variable and a set of categorical and/or continuous predictor variables.

Comment: 35 pages, 4 figures

arXiv.org e-Print Archive

Fisheries

Author: Engelhard GH
Garrett A
Pinnegar JK
Simpson SD
van der Kooij J
Publication venue: Marine Climate Change Impacts Partnership
Publication date: 24/11/2017

This is the final version. Available from MCCIP via the DOI in this record.

Open Research Exeter

Sub-Typing of Rheumatic Diseases Based on a Systems Diagnosis Questionnaire

Author: Hankemeier Thomas
Meulman Jacqueline J.
Reijmers Theo H.
Schroën Jan
van der Greef Jan
van der Kooij Anita J.
van Wietmarschen Herman A.
Wei Heng
Publication venue: Public Library of Science
Publication date: 01/01/2011

The future of personalized medicine depends on advanced diagnostic tools to characterize responders and non-responders to treatment. Systems diagnosis is a new approach which aims to capture a large amount of symptom information from patients to characterize relevant sub-groups. 49 patients with a rheumatic disease were characterized using a systems diagnosis questionnaire containing 106 questions based on Chinese and Western medicine symptoms. Categorical principal component analysis (CATPCA) was used to discover differences in symptom patterns between the patients. Two Chinese medicine experts were subsequently asked to rank the Cold and Heat status of all the patients based on the questionnaires. These rankings were used to study the Cold and Heat symptoms used by these practitioners. The CATPCA analysis results in three dimensions. The first dimension is a general factor (40.2% explained variance). In the second dimension (12.5% explained variance) 'anxious', 'worrying', 'uneasy feeling' and 'distressed' were interpreted as the Internal disease stage, and 'aggravate in wind', 'fear of wind' and 'aversion to cold' as the External disease stage. In the third dimension (10.4% explained variance) 'panting s', 'superficial breathing', 'shortness of breath s', 'shortness of breath f' and 'aversion to cold' were interpreted as Cold and 'restless', 'nervous', 'warm feeling', 'dry mouth s' and 'thirst' as Heat related. 'Aversion to cold', 'fear of wind' and 'pain aggravates with cold' are most related to the experts' Cold rankings and 'aversion to heat', 'fullness of chest' and 'dry mouth' to the Heat rankings. This study shows that the presented systems diagnosis questionnaire is able to identify groups of symptoms that are relevant for sub-typing patients with a rheumatic disease.

Public Library of Science (PLOS)

Crossref

Directory of Open Access Journals

PubMed Central

Leiden University Scholarly Publications

Comparison of computational codes for direct numerical simulations of turbulent Rayleigh-Bénard convection

Author: Botchev Mikhail A.
Frederix Edo M. A.
Geurts Bernard J.
Horn Susanne
Kooij Gijs L.
Lohse Detlef
Shishkina Olga
Stevens Richard J. A. M.
van der Poel Erwin P.
Verzicco Roberto
Publication venue: 'Elsevier BV'
Publication date: 01/01/2018

Computational codes for direct numerical simulations of Rayleigh-Bénard (RB) convection are compared in terms of computational cost and quality of the solution. As a benchmark case, RB convection at Ra = 10^8 and Pr = 1 in a periodic domain, in cubic and cylindrical containers is considered. A dedicated second-order finite-difference code (AFID/RBflow) and a specialized fourth-order finite-volume code (Goldfish) are compared with a general purpose finite-volume approach (OpenFOAM) and a general purpose spectral-element code (Nek5000). Reassuringly, all codes provide predictions of the average heat transfer that converge to the same values. The computational costs, however, are found to differ considerably. The specialized codes AFID/RBflow and Goldfish are found to excel in efficiency, outperforming the general purpose flow solvers Nek5000 and OpenFOAM by an order of magnitude with an error on the Nusselt number Nu below 5%. However, we find that Nu alone is not sufficient to assess the quality of the numerical results: in fact, instantaneous snapshots of the temperature field from a near wall region obtained for deliberately under-resolved simulations using Nek5000 clearly indicate inadequate flow resolution even when Nu is converged. Overall, dedicated special purpose codes for RB convection are found to be more efficient than general purpose codes.

Comment: 12 pages, 5 figures

arXiv.org e-Print Archive

Repository TU/e

Crossref

Pure OAI Repository

University of Twente Research Information

MPG.PuRe

Mathematics in different settings: plenary panel.

Author: Alatorre S.
Evans J.
International Group for the Psychology of Mathematics Education.
Noyes A.
Potari D.
Van der Kooij H.
Publication venue: Federal University of Minas Gerais
Publication date: 01/01/2010

When we think about the title “Mathematics in different settings”, a number of questions arise. For example:
• How many mathematics are there – one or many? Is there a mathematics that is “prior to”, or independent of, any setting?
• What (who) is it that makes settings “different”? And how does this relate to social differences among people?
• What is an appropriate typology of different settings – for research or for curriculum design purposes? Relatedly, we might ask: who decides what is “important”?
• What is the nature of relations among policy arrangements, research and educational institutional settings?
• How are different settings represented in mathematics teaching and assessment?
• What is the relationship of mathematics education researchers to any setting?

Middlesex University Research Repository