
    Segmentation and Dimension Reduction: Exploratory and Model-Based Approaches

    Representing the information in a data set in a concise way is an important part of data analysis. A variety of multivariate statistical techniques have been developed for this purpose, such as k-means clustering and principal components analysis. These techniques are often based on the principles of segmentation (partitioning the observations into distinct groups) and dimension reduction (constructing a low-dimensional representation of a data set). However, such techniques typically make no statistical assumptions about the process that generates the data; as a result, the statistical significance of the results is often unknown. In this thesis, we incorporate the modeling principles of segmentation and dimension reduction into statistical models. We thus develop new models that can summarize and explain the information in a data set in a simple way. The focus is on dimension reduction using bilinear parameter structures and on techniques for clustering both modes of a two-mode data matrix. To illustrate the usefulness of the techniques, the thesis includes a variety of empirical applications in marketing, psychometrics, and political science. An important application is modeling response behavior in surveys with rating scales, which provides novel insight into what kinds of response styles exist and how substantive opinions vary among respondents. We find that our modeling approaches yield new techniques for data analysis that can be useful in a variety of applied fields.

    Mobility level and factors affecting mobility status in hospitalized patients admitted in single-occupancy patient rooms

    Background: Although stimulating patients’ mobility is considered a component of fundamental nursing care, approximately 35% of hospitalized patients experience functional decline during or after hospital admission. The aim of this study is to assess mobility level and to identify factors affecting mobility status in hospitalized patients admitted in single-occupancy patient rooms (SPRs) on general wards. Methods: Mobility level was quantified with the Johns Hopkins Highest Level of Mobility Scale (JH-HLM) and the EQ-5D-3L. GENEActiv accelerometer data over 24 h were collected in a subset of patients. Data were analyzed using generalized ordinal logistic regression analysis. The STROBE reporting checklist was applied. Results: Wearing pajamas during daytime, having pain, admission in an isolation room, and using three or more pieces of medical equipment were negatively associated with mobilization level. More than half of the patients (58.9%) who were able to mobilize according to the EQ-5D-3L did not achieve the highest possible level of mobility according to the JH-HLM. The subset of patients who wore an accelerometer spent most of the day in sedentary behavior (median 88.1%, IQR 85.9–93.6). The median total daily step count was 1326 (range 22–5362). Conclusion: We found that the majority of participating hospitalized patients staying in single-occupancy patient rooms were able to mobilize. It appeared, however, that most patients who were physically capable of walking did not reach the highest possible level of mobility according to the JH-HLM scale. Nurses should take responsibility for ensuring that patients achieve the highest possible level of mobility.

    A Simple Cost-Effectiveness Model of Screening: An Open-Source Teaching and Research Tool Coded in R

    Applied cost-effectiveness analysis models are an important tool for assessing the health and economic effects of healthcare interventions but are not best suited for illustrating methods. Our objective is to provide a simple, open-source model for the simulation of disease-screening cost-effectiveness for teaching and research purposes. We introduce our model and provide an initial application to examine changes to the efficiency frontier as input parameters vary and to demonstrate face validity. We describe a vectorised, discrete-event simulation of screening in R with an Excel interface to define parameters and inspect principal results. An R Shiny app permits dynamic interpretation of simulation outputs. An example with 8161 screening strategies illustrates the effect of varying the disease sojourn time, treatment effectiveness, and test performance characteristics and costs on the cost and effectiveness of screening policies. Many of our findings are intuitive and straightforward, such as a reduction in screening costs leading to decreased overall costs and improved cost-effectiveness. Others are less obvious and depend on whether we consider gross outcomes or those net to no screening. For instance, enhanced treatment of symptomatic disease increases gross effectiveness, but reduces the net effectiveness and cost-effectiveness of screening. A lengthening of the preclinical sojourn time has ambiguous effects relative to no screening, as cost-effectiveness improves for some strategies but deteriorates for others. Our simple model offers an accessible platform for methods research and teaching. We hope it will serve as a public good and promote an intuitive understanding of the cost-effectiveness of screening. Funding: Health Research Board.
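    The mechanics of such a screening simulation can be sketched in a few vectorised lines. The following is a deliberately minimal, hypothetical sketch, not the model described above: all parameter values, cost figures, and the crude effect measure are illustrative assumptions.

    ```python
    import numpy as np

    # Hypothetical vectorised screening simulation: individuals develop
    # preclinical disease, which surfaces clinically after an exponential
    # sojourn time unless a screen detects it first.
    rng = np.random.default_rng(0)
    n = 100_000
    onset = rng.uniform(40, 80, size=n)        # age at preclinical onset
    sojourn = rng.exponential(4.0, size=n)     # preclinical sojourn time (years)
    clinical = onset + sojourn                 # age at clinical surfacing

    screen_ages = np.array([50.0, 55.0, 60.0]) # illustrative screening policy
    sensitivity = 0.8                          # illustrative test sensitivity

    detected_age = np.full(n, np.inf)
    for age in screen_ages:
        # eligible: in the preclinical phase at this screen, not yet detected
        eligible = (onset <= age) & (clinical > age) & (detected_age == np.inf)
        hit = eligible & (rng.random(n) < sensitivity)
        detected_age[hit] = age

    screen_detected = np.isfinite(detected_age)
    lead_time = np.where(screen_detected, clinical - detected_age, 0.0)

    # Illustrative unit costs and a crude effect per lead-time year
    cost = n * len(screen_ages) * 25.0 + screen_detected.sum() * 5_000.0
    effect = lead_time.sum() * 0.1
    icer = cost / effect                       # vs a no-screening comparator
    ```

    Because everything is stored as arrays, alternative policies (other screen ages, sensitivities, or costs) can be evaluated over the same simulated cohort, which is what makes comparing thousands of strategies feasible.
    
    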

    Identifying Unknown Response Styles: A Latent-Class Bilinear Multinomial Logit Model

    Respondents can vary significantly in the way they use rating scales. Specifically, respondents can exhibit varying degrees of response style, which threatens the validity of the responses. The purpose of this article is to investigate to what extent rating scale responses reflect response styles as opposed to the substantive content of the items. The authors develop a novel model that accounts for possibly unknown kinds of response styles, the content of the items, and background characteristics of respondents. By imposing a bilinear structure on the parameters of a multinomial logit model, the authors can visually distinguish the effects on response behavior of both the characteristics of a respondent and the content of the item. This approach is combined with finite mixture modeling, so that two separate segmentations of the respondents are obtained: one for response style and one for item content. This latent-class bilinear multinomial logit (LC-BML) model is applied to a cross-national data set. The results show that item content is highly influential in explaining response behavior and reveal the presence of several response styles, including the prominent acquiescence and extreme response styles.
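    The bilinear idea can be illustrated with a small numerical sketch: constraining the covariate-by-category coefficient matrix of a multinomial logit to a low-rank product is what allows respondent characteristics and rating categories to be displayed jointly in one low-dimensional plot. All names, dimensions, and values below are illustrative assumptions, not the LC-BML specification itself.

    ```python
    import numpy as np

    rng = np.random.default_rng(1)
    n, p, C, r = 200, 4, 5, 2   # respondents, covariates, rating categories, rank

    # Bilinear (low-rank) coefficient structure: B = A @ G.T, where rows of A
    # place covariates and rows of G place categories in an r-dimensional space.
    A = rng.normal(size=(p, r))
    G = rng.normal(size=(C, r))
    B = A @ G.T                  # p x C coefficient matrix of rank r

    X = rng.normal(size=(n, p))  # respondent background characteristics
    alpha = rng.normal(size=C)   # category intercepts
    eta = alpha + X @ B          # n x C linear predictors

    # Multinomial logit: softmax over categories (max subtracted for stability)
    P = np.exp(eta - eta.max(axis=1, keepdims=True))
    P /= P.sum(axis=1, keepdims=True)
    ```

    The rank constraint reduces the p × C free interaction parameters to (p + C) × r coordinates, which is what makes the biplot interpretation possible.
    
    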

    A Bayesian approach to two-mode clustering

    We develop a new Bayesian approach to estimate the parameters of a latent-class model for the joint clustering of both modes of two-mode data matrices. Posterior results are obtained using a Gibbs sampler with data augmentation. Our Bayesian approach has three advantages over existing methods. First, we are able to do statistical inference on the model parameters, which would not be possible using frequentist estimation procedures. In addition, the Bayesian approach allows us to provide statistical criteria for determining the optimal numbers of clusters. Finally, our Gibbs sampler has fewer problems with local optima in the likelihood function and empty classes than the EM algorithm used in a frequentist approach. We apply the Bayesian estimation method of the latent-class two-mode clustering model to two empirical data sets. The first data set is the Supreme Court voting data set of Doreian, Batagelj, and Ferligoj (2004). The second data set comprises the roll call votes of the United States House.
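    A toy version of such a Gibbs sampler for a binary two-mode matrix (e.g. votes) can be sketched as follows. This is a sketch of the general idea only, assuming Bernoulli block probabilities with Beta(1,1) priors and uniform label priors; it is not the authors' exact specification.

    ```python
    import numpy as np

    def two_mode_gibbs(X, K=2, L=2, n_iter=300, seed=0):
        """Toy Gibbs sampler for latent-class two-mode clustering of a binary
        matrix X: x_ij ~ Bernoulli(theta[r_i, c_j]), Beta(1,1) block priors."""
        rng = np.random.default_rng(seed)
        n, m = X.shape
        r = rng.integers(K, size=n)          # row-cluster labels
        c = rng.integers(L, size=m)          # column-cluster labels
        for _ in range(n_iter):
            # Sample block probabilities given the current labels
            theta = np.empty((K, L))
            for k in range(K):
                for l in range(L):
                    block = X[np.ix_(r == k, c == l)]
                    theta[k, l] = rng.beta(1 + block.sum(),
                                           1 + block.size - block.sum())
            logt, log1m = np.log(theta), np.log1p(-theta)
            # Sample row labels given theta and the column labels
            for i in range(n):
                ll = np.array([(X[i] * logt[k, c]
                                + (1 - X[i]) * log1m[k, c]).sum()
                               for k in range(K)])
                pr = np.exp(ll - ll.max()); pr /= pr.sum()
                r[i] = rng.choice(K, p=pr)
            # Sample column labels symmetrically
            for j in range(m):
                ll = np.array([(X[:, j] * logt[r, l]
                                + (1 - X[:, j]) * log1m[r, l]).sum()
                               for l in range(L)])
                pr = np.exp(ll - ll.max()); pr /= pr.sum()
                c[j] = rng.choice(L, p=pr)
        return r, c, theta
    ```

    Note how the data-augmented sampler never gets trapped by an empty class in the way an EM algorithm can: an empty block simply draws its probability from the Beta(1,1) prior, so labels can repopulate it on a later sweep.
    
    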

    Fuzzy clustering with Minkowski distance

    Distances in the well-known fuzzy c-means algorithm of Bezdek (1973) are measured by the squared Euclidean distance. Other distances have been used in fuzzy clustering as well. For example, Jajuga (1991) proposed to use the L_1-distance, and Bobrowski and Bezdek (1991) also used the L_infty-distance. For the more general case of Minkowski distance and the case of using a root of the squared Minkowski distance, Groenen and Jajuga (2001) introduced a majorization algorithm to minimize the error. One of the advantages of iterative majorization is that it is a guaranteed descent algorithm, so that every iteration reduces the error until convergence is reached. However, their algorithm was limited to the case of the Minkowski parameter between 1 and 2, that is, between the L_1-distance and the Euclidean distance. Here, we extend their majorization algorithm to any Minkowski distance with Minkowski parameter greater than (or equal to) 1. This extension also includes the case of the L_infty-distance. We also investigate how well this algorithm performs and present an empirical application.
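    For reference, the base case of the algorithm (squared Euclidean distances, i.e. standard fuzzy c-means) can be sketched as below. The Minkowski extension discussed above replaces the closed-form centroid update with a majorization step, while the membership update is analogous. Function and parameter names are illustrative.

    ```python
    import numpy as np

    def fuzzy_cmeans(X, c=2, m=2.0, n_iter=100, tol=1e-6, seed=0):
        """Standard fuzzy c-means (Bezdek) with squared Euclidean distances.
        m > 1 is the fuzzifier; returns cluster centers and memberships U."""
        rng = np.random.default_rng(seed)
        n = X.shape[0]
        U = rng.dirichlet(np.ones(c), size=n)       # n x c membership matrix
        for _ in range(n_iter):
            W = U ** m
            # Closed-form weighted-mean centroid update (Euclidean case only;
            # general Minkowski distances require an iterative/majorizing step)
            centers = (W.T @ X) / W.sum(axis=0)[:, None]
            d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(-1)
            d2 = np.maximum(d2, 1e-12)              # guard against zero distance
            inv = d2 ** (-1.0 / (m - 1.0))
            U_new = inv / inv.sum(axis=1, keepdims=True)
            if np.abs(U_new - U).max() < tol:
                U = U_new
                break
            U = U_new
        return centers, U
    ```

    Both update steps decrease the fuzzy clustering criterion, which is the same guaranteed-descent property that iterative majorization preserves for the general Minkowski case.
    
    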

    Optimal Scaling of Interaction Effects in Generalized Linear Models

    Multiplicative interaction models, such as Goodman's RC(M) association models, can be a useful tool for analyzing the content of interaction effects. However, most models for interaction effects are only suitable for data sets with two or three predictor variables. Here, we discuss an optimal scaling model for analyzing the content of interaction effects in generalized linear models with any number of categorical predictor variables. This model, which we call the optimal scaling of interactions (OSI) model, is a parsimonious, one-dimensional multiplicative interaction model. We discuss how the model can be used to visually interpret the interaction effects. Two empirical data sets are used to show how the results of the model can be interpreted.
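    The structure of such a one-dimensional multiplicative interaction can be illustrated numerically. The sketch below uses a rank-1 interaction in the spirit of Goodman's RC(1) association model for two categorical predictors; all parameter values are illustrative assumptions.

    ```python
    import numpy as np

    # Rank-1 multiplicative interaction in a log-linear/Poisson setting:
    #   log mu_ij = alpha_i + beta_j + phi * u_i * v_j
    rng = np.random.default_rng(2)
    I, J = 4, 5
    alpha = rng.normal(size=I)   # row main effects
    beta = rng.normal(size=J)    # column main effects
    u = rng.normal(size=I)       # optimally scaled row scores
    v = rng.normal(size=J)       # optimally scaled column scores
    phi = 0.8                    # association strength

    log_mu = alpha[:, None] + beta[None, :] + phi * np.outer(u, v)
    mu = np.exp(log_mu)          # expected cell counts

    # The interaction part is rank 1: all I*J interaction terms are summarised
    # by the I + J scores, which is what makes a one-dimensional plot possible.
    ```

    The OSI model extends this idea beyond two predictors by assigning every category of every predictor a score on one dimension, so the interaction content can still be read off a single plot.
    
    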

    Disease activity in primary progressive multiple sclerosis: a systematic review and meta-analysis

    Background: Disease activity in multiple sclerosis (MS) is defined as the presence of relapses, gadolinium-enhancing lesions, and/or new or enlarging lesions on MRI. It is associated with efficacy of immunomodulating therapies (IMTs) in primary progressive MS (PPMS). However, a thorough review on disease activity in PPMS is lacking. In relapsing remitting MS, the prevalence of activity decreases in more contemporary cohorts. For PPMS, this is unknown. Aim: To review disease activity in PPMS cohorts and identify its predictors. Methods: A systematic search in EMBASE, MEDLINE, Web of Science Core Collection, the Cochrane CENTRAL Register of Trials, and Google Scholar was performed. Keywords included PPMS, inflammation, and synonyms. We included original studies with predefined available data, extracted cohort characteristics and disease activity outcomes, and performed meta-regression analyses. Results: We included 34 articles describing 7,109 people with PPMS (pwPPMS). The weighted estimated proportion of pwPPMS with overall disease activity was 26.8% (95% CI 20.6–34.0%). A lower age at inclusion predicted higher disease activity (OR 0.91, p = 0.031). Radiological activity (31.9%) was more frequent than relapses (9.2%) and was predicted by longer follow-up duration (OR 1.27, p = 0.033). Year of publication was not correlated with disease activity. Conclusion: Inflammatory disease activity is common in PPMS and has remained stable over the last decades. Age and follow-up duration predict disease activity, advocating prolonged monitoring of young pwPPMS to evaluate potential IMT benefits.

    Dealing with Time in Health Economic Evaluation: Methodological Issues and Recommendations for Practice

    Time is an important aspect of health economic evaluation, as the timing and duration of clinical events, healthcare interventions and their consequences all affect estimated costs and effects. These issues should be reflected in the design of health economic models. This article considers three important aspects of time in modelling: (1) which cohorts to simulate and how far into the future to extend the analysis; (2) the simulation of time, including the difference between discrete-time and continuous-time models, cycle lengths, and converting rates and probabilities; and (3) discounting future costs and effects to their present values. We provide a methodological overview of these issues and make recommendations to help inform both the conduct of cost-effectiveness analyses and the interpretation of their results. For choosing which cohorts to simulate and how many, we suggest analysts carefully assess potential reasons for variation in cost effectiveness between cohorts and the feasibility of subgroup-specific recommendations. For the simulation of time, we recommend using short cycles or continuous-time models to avoid biases and the need for half-cycle corrections, and we provide advice on discounting future costs and effects.
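    Two of the mechanics mentioned above, converting between rates and probabilities and discounting to present values, can be sketched directly. Function names and the 3.5% discount rate are illustrative assumptions; the key point is that probabilities must be rescaled via the underlying rate, not by multiplying the probability itself.

    ```python
    import math

    def rate_to_prob(rate: float, t: float = 1.0) -> float:
        """Probability of at least one event in time t under a constant rate."""
        return 1.0 - math.exp(-rate * t)

    def prob_to_rate(prob: float, t: float = 1.0) -> float:
        """Inverse transformation: recover the constant rate from a probability."""
        return -math.log(1.0 - prob) / t

    def discount(values, annual_rate=0.035, cycle_length=1.0):
        """Present value of a stream of per-cycle costs or effects."""
        return sum(v / (1.0 + annual_rate) ** (i * cycle_length)
                   for i, v in enumerate(values))

    # An annual event rate of 0.2 corresponds to a 1-year probability of
    # 1 - exp(-0.2); a monthly probability must go through the rate:
    p_year = rate_to_prob(0.2)
    p_month = rate_to_prob(prob_to_rate(p_year) / 12)
    ```

    Note that `p_year / 12` would overstate the monthly probability, which is exactly the kind of conversion error that short cycles and explicit rate transformations guard against.
    
    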