43 research outputs found

Fitting Prediction Rule Ensembles with R Package pre

Author: Fokkema Marjolein
Publication venue: 'Foundation for Open Access Statistics'
Publication date: 01/02/2020

Prediction rule ensembles (PREs) are sparse collections of rules, offering highly interpretable regression and classification models. This paper presents the R package pre, which derives PREs through the methodology of Friedman and Popescu (2008). The implementation and functionality of package pre are described and illustrated through application to a dataset on the prediction of depression. Furthermore, the accuracy and sparsity of PREs are compared with those of single trees, random forests and lasso regression on four benchmark datasets. Results indicate that pre derives ensembles with predictive accuracy comparable to that of random forests, while using a smaller number of variables for prediction.

arXiv.org e-Print Archive

Journal of Statistical Software

Leiden University Scholary Publications

Optimizing the assessment of suicidal behavior: the application of curtailment techniques

Author: De Beurs Derek P.
Fokkema Marjolein
O'Connor Rory
Publication venue: 'Elsevier BV'
Publication date: 01/01/2016

Background: Given their length, commonly used scales to assess suicide risk, such as the Beck Scale for Suicide Ideation (SSI), are of limited use as screening tools. In the current study we tested whether deterministic and stochastic curtailment can be applied to shorten the 19-item SSI without compromising its accuracy. Methods: Data from 366 patients, who were seen by a liaison psychiatry service in a general hospital in Scotland after a suicide attempt, were used. Within 24 h of admission, the SSI was administered; 15 months later, it was determined whether a patient was re-admitted to a hospital as the result of another suicide attempt. We fitted a Receiver Operating Characteristic curve to derive the best cut-off value of the SSI for predicting future suicidal behavior. Using this cut-off, both deterministic and stochastic curtailment were simulated on the item score patterns of the SSI. Results: A cut-off value of SSI≥6 provided the best classification accuracy for future suicidal behavior. Using this cut-off, we found that both deterministic and stochastic curtailment reduce the length of the SSI without reducing the accuracy of the final classification decision. With stochastic curtailment, on average, fewer than 8 items are needed to assess whether administration of the full-length test will result in an SSI score below or above the cut-off value of 6. Limitations: New studies using other datasets should re-validate the optimal cut-off for risk of repeated suicidal behavior after being treated in a hospital following an attempt. Conclusions: Curtailment can be used to simplify the assessment of suicidal behavior, and should be considered as an alternative to the full scale.

VU Research Portal

Crossref

Leiden University Scholary Publications

Enlighten
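
As a rough illustration of the deterministic curtailment described in the abstract above, the following Python sketch stops administering items as soon as the final classification can no longer change. Only the 19-item length and the cut-off of 6 come from the abstract; the function name, the assumption that each item is scored from 0 to 2, and the example score patterns are illustrative assumptions, not details of the study. Stochastic curtailment, which stops early when the outcome is merely very likely, is not shown.

```python
from typing import Sequence, Tuple


def deterministic_curtailment(
    item_scores: Sequence[int],
    cutoff: int = 6,          # cut-off reported in the abstract (SSI >= 6)
    max_item_score: int = 2,  # ASSUMPTION: each item is scored 0, 1 or 2
) -> Tuple[bool, int]:
    """Return (above_cutoff, number_of_items_administered).

    Items are administered in order. Testing stops once the decision
    for the full-length scale is already certain.
    """
    n_items = len(item_scores)
    total = 0
    for administered, score in enumerate(item_scores, start=1):
        total += score
        # Highest score the remaining items could still add.
        remaining_max = (n_items - administered) * max_item_score
        if total >= cutoff:
            # The cut-off is reached; further items cannot lower the total.
            return True, administered
        if total + remaining_max < cutoff:
            # The cut-off can no longer be reached.
            return False, administered
    return total >= cutoff, n_items


# Hypothetical score patterns for a 19-item scale.
high_pattern = [2, 2, 2] + [0] * 16
low_pattern = [0] * 19

print(deterministic_curtailment(high_pattern))  # (True, 3)
print(deterministic_curtailment(low_pattern))   # (False, 17)
```

In both hypothetical patterns the curtailed decision is identical to the decision the full-length scale would give, which is the defining property of the deterministic variant.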

Fitting Prediction Rule Ensembles to Psychological Research Data: An Introduction and Tutorial

Author: Fokkema Marjolein
Strobl Carolin
Publication venue: 'American Psychological Association (APA)'
Publication date: 01/10/2020

Prediction rule ensembles (PREs) are a relatively new statistical learning method that aims to strike a balance between predictive accuracy and interpretability. Starting from a decision tree ensemble, like a boosted tree ensemble or a random forest, PREs retain a small subset of tree nodes in the final predictive model. These nodes can be written as simple rules of the form if [condition] then [prediction]. As a result, PREs are often much less complex than full decision tree ensembles, while they have been found to provide similar predictive accuracy in many situations. The current paper introduces the methodology and shows how PREs can be fitted using the R package pre through several real-data examples from psychological research. The examples also illustrate a number of features of package pre that may be particularly useful for applications in psychology: support for categorical, multivariate and count responses, application of (non-)negativity constraints, inclusion of confirmatory rules and standardized variable importance measures. Comment: Published in Psychological Methods

arXiv.org e-Print Archive

Leiden University Scholary Publications
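
The rule form mentioned in the abstract above (if [condition] then [prediction]) can be sketched in a few lines of Python. This is a minimal sketch of how a fitted rule ensemble might produce a prediction, assuming the final model is an intercept plus a coefficient for every rule whose condition holds. It is not the API of the R package pre, and the variable names, rules and coefficient values are all invented for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

Row = Dict[str, float]


@dataclass
class Rule:
    description: str                  # human-readable form of the condition
    condition: Callable[[Row], bool]  # True when the rule applies to a row
    coefficient: float                # contribution added when the rule applies


def predict(intercept: float, rules: List[Rule], row: Row) -> float:
    """Intercept plus the coefficients of all rules that apply to the row."""
    return intercept + sum(r.coefficient for r in rules if r.condition(row))


# HYPOTHETICAL ensemble: variables, thresholds and coefficients are made up.
rules = [
    Rule("age < 40", lambda r: r["age"] < 40, 2.5),
    Rule("symptom_score > 10", lambda r: r["symptom_score"] > 10, -1.0),
]

print(predict(10.0, rules, {"age": 30, "symptom_score": 12}))  # 11.5
print(predict(10.0, rules, {"age": 50, "symptom_score": 5}))   # 10.0
```

Because each retained rule is a single readable condition with one coefficient, the contribution of every rule to a given prediction can be inspected directly, which is the interpretability argument the abstract makes.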

Stacked Penalized Logistic Regression for Selecting Views in Multi-View Learning

Author: de Rooij Mark
Fokkema Marjolein
Szabo Botond
van Loon Wouter
Publication venue: 'Elsevier BV'
Publication date: 01/01/2020

In biomedical research, many different types of patient data can be collected, such as various types of omics data and medical imaging modalities. Applying multi-view learning to these different sources of information can increase the accuracy of medical classification models compared with single-view procedures. However, collecting biomedical data can be expensive and/or burdensome for patients, so it is important to reduce the amount of required data collection. It is therefore necessary to develop multi-view learning methods which can accurately identify those views that are most important for prediction. In recent years, several biomedical studies have used an approach known as multi-view stacking (MVS), where a model is trained on each view separately and the resulting predictions are combined through stacking. In these studies, MVS has been shown to increase classification accuracy. However, the MVS framework can also be used for selecting a subset of important views. To study the view selection potential of MVS, we develop a special case called stacked penalized logistic regression (StaPLR). Compared with existing view-selection methods, StaPLR can make use of faster optimization algorithms and is easily parallelized. We show that nonnegativity constraints on the parameters of the function which combines the views play an important role in preventing unimportant views from entering the model. We investigate the performance of StaPLR through simulations, and consider two real data examples. We compare the performance of StaPLR with an existing view selection method called the group lasso and observe that, in terms of view selection, StaPLR is often more conservative and has a consistently lower false positive rate. Comment: 26 pages, 9 figures. Accepted manuscript

arXiv.org e-Print Archive

Archivio istituzionale della Ricerca - Bocconi

Why we need systematic reviews and meta-analyses in the testing and assessment literature

Author: Fokkema Marjolein
Greiff Samuel
Iliescu Dragos
Rusu A
Scherer Ronny
Publication date: 01/01/2022

Multivariate analysis of psychological data

Leiden University Scholary Publications

Open Repository and Bibliography - Luxembourg

Analyzing hierarchical multi-view MRI data with StaPLR: An application to Alzheimer's disease classification

Author: de Rooij Mark
de Vos Frank
Fokkema Marjolein
Koini Marisa
Schmidt Reinhold
Szabo Botond
van Loon Wouter
Publication venue: 'Frontiers Media SA'
Publication date: 01/01/2022

Multi-view data refers to a setting where features are divided into feature sets, for example because they correspond to different sources. Stacked penalized logistic regression (StaPLR) is a recently introduced method that can be used for classification and for automatically selecting the views that are most important for prediction. We introduce an extension of this method to a setting where the data has a hierarchical multi-view structure. We also introduce a new view importance measure for StaPLR, which allows us to compare the importance of views at any level of the hierarchy. We apply our extended StaPLR algorithm to Alzheimer's disease classification where different MRI measures have been calculated from three scan types: structural MRI, diffusion-weighted MRI, and resting-state fMRI. StaPLR can identify which scan types and which derived MRI measures are most important for classification, and it outperforms elastic net regression in classification performance. Comment: 36 pages, 9 figures. Accepted manuscript

arXiv.org e-Print Archive

Archivio istituzionale della Ricerca - Bocconi

PubMed Central

Leiden University Scholary Publications

Contacts between elderly parents and their children in four European countries : current patterns and future prospects

Author: Broese van Groenou Marjolein
Fokkema Tineke
Grundy Emily
Kalogirou Stamatis
Karisto Antti
Martikainen Pekka
Tomassini Cecilia
Publication date: 01/01/2004

Peer reviewed

Università degli Studi del Molise: IRIS

VU Research Portal

LSHTM Research Online

LSE Research Online

Helsingin yliopiston digitaalinen arkisto

Optimizing the assessment of suicidal behavior: The application of curtailment techniques

Author: De Beurs Derek P.
Fokkema Marjolein
O’Connor Rory C.
Publication venue: 'Elsevier BV'

Crossref

Predicting mental health improvement and deterioration in a large community sample of 11- to 13-year-olds.

Author: Costa da Silva Luís
Edbrooke-Childs Julian
Fokkema Marjolein
Jacob Jenna
Napoleone Elisa
Patalay Praveetha
Patel Meera
Promberger Marianne
Wolpert Miranda
Zamperoni Victoria
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/02/2020

Of children with mental health problems who access specialist help, 50% show reliable improvement on self-report measures at case closure and 10% reliable deterioration. To contextualise these figures it is necessary to consider rates of improvement for those in the general population. This study examined rates of reliable improvement/deterioration for children in a school sample over time. N = 9074 children (mean age 12; 52% female; 79% white) from 118 secondary schools across England provided self-report mental health (SDQ), quality of life and demographic data (age, ethnicity and free school meals (FSM)) at baseline and 1 year, and self-report data on access to mental health support at 1 year. Multinomial logistic regressions and classification trees were used to analyse the data. Of 2270 (25%) scoring above threshold for mental health problems at outset, 27% reliably improved and 9% reliably deteriorated at 1-year follow up. Of 6804 (75%) scoring below threshold, 4% reliably improved and 12% reliably deteriorated. Greater emotional difficulties at outset were associated with greater rates of reliable improvement for both groups (above threshold group: OR = 1.89, p < 0.001, 95% CI [1.64, 2.17], below threshold group: OR = 2.23, p < 0.001, 95% CI [1.93, 2.57]). For those above threshold, higher baseline quality of life was associated with greater likelihood of reliable improvement (OR = 1.28, p < 0.001, 95% CI [1.13, 1.46]), whilst being in receipt of FSM was associated with reduced likelihood of reliable improvement (OR = 0.68, p < 0.01, 95% CI [0.53, 0.88]). For the group below threshold, being female was associated with increased likelihood of reliable deterioration (OR = 1.20, p < 0.025, 95% CI [1.00, 1.42]), whereas being from a non-white ethnic background was associated with decreased likelihood of reliable deterioration (OR = 0.66, p < 0.001, 95% CI [0.54, 0.80]). For those above threshold, almost one in three children showed reliable improvement at 1 year. The extent of emotional difficulties at outset showed the highest associations with rates of reliable improvement.

University of Liverpool Repository