328 research outputs found

Stacked Penalized Logistic Regression for Selecting Views in Multi-View Learning

Author: de Rooij Mark
Fokkema Marjolein
Szabo Botond
van Loon Wouter
Publication venue: 'Elsevier BV'
Publication date: 01/01/2020

In biomedical research, many different types of patient data can be collected, such as various types of omics data and medical imaging modalities. Applying multi-view learning to these different sources of information can increase the accuracy of medical classification models compared with single-view procedures. However, collecting biomedical data can be expensive and/or burdensome for patients, so it is important to reduce the amount of required data collection. It is therefore necessary to develop multi-view learning methods which can accurately identify those views that are most important for prediction. In recent years, several biomedical studies have used an approach known as multi-view stacking (MVS), where a model is trained on each view separately and the resulting predictions are combined through stacking. In these studies, MVS has been shown to increase classification accuracy. However, the MVS framework can also be used for selecting a subset of important views. To study the view selection potential of MVS, we develop a special case called stacked penalized logistic regression (StaPLR). Compared with existing view-selection methods, StaPLR can make use of faster optimization algorithms and is easily parallelized. We show that nonnegativity constraints on the parameters of the function which combines the views play an important role in preventing unimportant views from entering the model. We investigate the performance of StaPLR through simulations, and consider two real data examples. We compare the performance of StaPLR with an existing view selection method called the group lasso and observe that, in terms of view selection, StaPLR is often more conservative and has a consistently lower false positive rate. Comment: 26 pages, 9 figures. Accepted manuscript
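
The stacking idea in the abstract above can be illustrated with a minimal sketch. It is not the authors' StaPLR implementation: it assumes plain gradient-descent logistic regression in place of penalized fits, feeds in-sample base predictions to the meta-learner where stacking would normally use cross-validated ones, and enforces nonnegativity by simple projection. All function names and the toy data are hypothetical.

```python
import math
import random

def sigmoid(z):
    # Numerically safe logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def fit_logistic(X, y, nonneg=False, lr=0.5, steps=500):
    """Unpenalized logistic regression by gradient descent (an assumption;
    StaPLR uses penalized fits). Returns (intercept, weights). With
    nonneg=True, weights are projected onto [0, inf) after every step."""
    n, p = len(X), len(X[0])
    b, w = 0.0, [0.0] * p
    for _ in range(steps):
        gb, gw = 0.0, [0.0] * p
        for xi, yi in zip(X, y):
            err = sigmoid(b + sum(wj * xj for wj, xj in zip(w, xi))) - yi
            gb += err
            for j in range(p):
                gw[j] += err * xi[j]
        b -= lr * gb / n
        w = [wj - lr * gj / n for wj, gj in zip(w, gw)]
        if nonneg:
            w = [max(0.0, wj) for wj in w]
    return b, w

def predict(model, X):
    b, w = model
    return [sigmoid(b + sum(wj * xj for wj, xj in zip(w, xi))) for xi in X]

def multi_view_stacking(views, y):
    # Step 1: one base learner per view.
    base = [fit_logistic(V, y) for V in views]
    # Step 2: collect each view's predictions as one meta-feature column.
    # (In-sample predictions for brevity; cross-validated ones are usual.)
    columns = [predict(m, V) for m, V in zip(base, views)]
    Z = [list(row) for row in zip(*columns)]
    # Step 3: meta-learner with nonnegative weights, one weight per view.
    meta = fit_logistic(Z, y, nonneg=True)
    return base, meta

# Toy data: the first view carries signal, the second is pure noise.
random.seed(0)
y = [random.randint(0, 1) for _ in range(200)]
view_signal = [[(2 * yi - 1) + random.gauss(0, 1), random.gauss(0, 1)] for yi in y]
view_noise = [[random.gauss(0, 1), random.gauss(0, 1)] for _ in y]

base, meta = multi_view_stacking([view_signal, view_noise], y)
meta_weights = meta[1]  # one nonnegative weight per view
print(meta_weights)
```

A view whose meta-weight is driven to zero contributes nothing to the combined prediction, which is the sense in which the combining function can act as a view selector.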

arXiv.org e-Print Archive

Archivio istituzionale della Ricerca - Bocconi

Crossing the Line: Evidence for the Categorization Theory of Spatial Voting

Author: de Rooij Eline A.
Kimbrough Erik O.
Pickup Mark
Publication venue: Chapman University Digital Commons
Publication date: 13/10/2023

Bølstad and Dinas (2017) propose a model of spatial voting, based on social identity theory, that suggests supporting a candidate/policy on the other side of the ideological spectrum has a disutility that is not accounted for by common spatial models. Unfortunately, the data they use cannot speak directly to whether the disutility arises because individuals perceive their ideology as a social identity. We present the results of an experimental study that measures the norm against crossing the ideological spectrum; tests the cost of doing so, controlling for spatial effects; and demonstrates that this cost increases with the salience and strength of identity norms. By demonstrating the norm mechanism for the disutility of crossing the ideological spectrum, we provide strong support for B&D's model

Chapman University Digital Commons

Computational mechanisms for resolving misunderstandings

Author: Blokpoel Mark
Braak Laura van de
Dingemanse Mark
Rooij Iris van
Toni Ivan
Publication venue: eScholarship, University of California
Publication date: 01/01/2020

Imagine discussing yesterday's dinner with a friend: It wasn't particularly tasty. Your friend concurs, it was very salty! Thinking you were talking about the appetizer (which wasn't salty at all), you're forced to reconsider which course your friend was talking about. Was the appetizer salty to her? Was she talking about the main course? People encounter misunderstandings in everyday conversation, yet quickly and seamlessly resolve them. How people do this is an explanatory challenge: the thing being talked about (i.e., the referent) is often not physically present during the conversation. Hence, there's no easy way for interlocutors to establish common ground via ostensive signaling (e.g., by pointing at the dish). We develop a model of speakers that use pragmatic reasoning to infer the referent inferred by listeners. We explore the performance of this model using agent-based simulated conversations. The results imply necessary and sufficient conditions for successful updating

eScholarship - University of California

Research Openness in Canadian Political Science: Toward an Inclusive and Differentiated Discussion

Author: de Rooij Eline A.
Johnson Genevieve Fuji
Leger Remi
Pickup Mark
Publication date: 01/03/2017

In this paper, we initiate a discussion within the Canadian political science community about research openness and its implications for our discipline.  This discussion is important because the Tri-Agency has recently released guidelines on data management and because a number of political science journals, from several subfields, have signed the Journal Editors’ Transparency Statement requiring data access and research transparency (DA-RT).  As norms regarding research openness develop, an increasing number and range of journals and funding agencies may begin to implement DA-RT-type requirements.  If Canadian political scientists wish to continue to participate in the global political science community, we must take careful note of and be proactive participants in the ongoing developments concerning research openness

Simon Fraser University Institutional Repository

The Bradley–Terry Regression Trunk approach for Modeling Preference Data with Small Trees

Author: Alessio Baldassarre
Antonio D’Ambrosio
Claudio Conversano
Elise Dusseldorp
Mark de Rooij
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2022

This paper introduces the Bradley-Terry regression trunk model, a novel probabilistic approach for the analysis of preference data expressed through paired comparison rankings. In some cases, it may be reasonable to assume that the preferences expressed by individuals depend on their characteristics. Within the framework of tree-based partitioning, we specify a tree-based model estimating the joint effects of subject-specific covariates over and above their main effects. We therefore combine a tree-based model and the log-linear Bradley-Terry model, using the outcome of the comparisons as the response variable. The proposed model provides a solution to discover interaction effects when no a priori hypotheses are available. It produces a small tree, called a trunk, that represents a fair compromise between a simple interpretation of the interaction effects and an easy-to-read partition of judges based on their characteristics and the preferences they have expressed. We present an application on a real dataset following two different approaches, and a simulation study to test the model's performance. Simulations showed that the quality of the model performance increases when the number of rankings and objects increases. In addition, the performance is considerably amplified when the judges' characteristics have a high impact on their choices
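
For readers unfamiliar with the model named in this abstract, the standard Bradley-Terry formulation for a paired comparison can be written as follows. The notation is an assumption made here for illustration and is not taken from the paper: \pi_i > 0 is the worth of object i, and \lambda_i = \log \pi_i is its log-linear parameter.

```latex
\[
  P(i \succ j) = \frac{\pi_i}{\pi_i + \pi_j},
  \qquad
  \log \frac{P(i \succ j)}{P(j \succ i)} = \lambda_i - \lambda_j,
  \quad \lambda_i = \log \pi_i .
\]
```

The second expression is the log-linear form: the log-odds that object i is preferred over object j is the difference of the two log-worth parameters.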

Archivio della ricerca - Università degli studi di Napoli Federico II

Archivio istituzionale della ricerca - Università di Cagliari

Leiden University Scholary Publications

Continuous Sweep: an improved, binary quantifier

Author: de Rooij Mark
Karch Julian D.
Kloos Kevin
Meertens Quinten A.
Publication date: 16/08/2023

Quantification is a supervised machine learning task, focused on estimating the class prevalence of a dataset rather than labeling its individual observations. We introduce Continuous Sweep, a new parametric binary quantifier inspired by the well-performing Median Sweep. Median Sweep is currently one of the best binary quantifiers, but we have changed this quantifier on three points, namely 1) using parametric class distributions instead of empirical distributions, 2) optimizing decision boundaries instead of applying discrete decision rules, and 3) calculating the mean instead of the median. We derive analytic expressions for the bias and variance of Continuous Sweep under general model assumptions. This is one of the first theoretical contributions in the field of quantification learning. Moreover, these derivations enable us to find the optimal decision boundaries. Finally, our simulation study shows that Continuous Sweep outperforms Median Sweep in a wide range of situations
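
As background for the abstract above, here is a minimal sketch of the Median Sweep baseline it refers to. It is not Continuous Sweep itself. The sketch assumes classifier scores are already available, estimates true and false positive rates from labeled scores at each threshold, applies the adjusted-count correction, and takes the median. The function names, the `min_gap` filter value, and the Gaussian toy scores are assumptions of this sketch.

```python
import random
from statistics import median

def rate_above(scores, t):
    # Fraction of scores classified as positive at threshold t.
    return sum(s > t for s in scores) / len(scores)

def median_sweep(pos_scores, neg_scores, test_scores, thresholds, min_gap=0.25):
    """Median of adjusted-count prevalence estimates over many thresholds.
    Thresholds where tpr - fpr is below min_gap are skipped because the
    correction divides by that gap. Raises statistics.StatisticsError
    if every threshold is skipped."""
    estimates = []
    for t in thresholds:
        tpr = rate_above(pos_scores, t)   # true positive rate on labeled data
        fpr = rate_above(neg_scores, t)   # false positive rate on labeled data
        if tpr - fpr < min_gap:
            continue
        raw = rate_above(test_scores, t)  # plain classify-and-count
        adjusted = (raw - fpr) / (tpr - fpr)
        estimates.append(min(1.0, max(0.0, adjusted)))  # clip to [0, 1]
    return median(estimates)

# Toy scores: positives ~ N(1, 1), negatives ~ N(-1, 1).
random.seed(1)
pos_train = [random.gauss(1, 1) for _ in range(1000)]
neg_train = [random.gauss(-1, 1) for _ in range(1000)]
# Unlabeled test set with a true positive prevalence of 0.3.
test = [random.gauss(1, 1) for _ in range(600)] + \
       [random.gauss(-1, 1) for _ in range(1400)]

thresholds = [-1.0 + 0.1 * i for i in range(21)]
estimate = median_sweep(pos_train, neg_train, test, thresholds)
print(round(estimate, 3))
```

The three changes listed in the abstract map onto this sketch directly: the empirical rates in `rate_above` would become parametric class distributions, the discrete threshold filter would become optimized decision boundaries, and the median would become a mean.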

arXiv.org e-Print Archive

Elastohydrodynamic lubrication of coated finite line contacts

Author: Alakhramsing Shivam S.
de Rooij Matthijn B.
Schipper Dirk J.
van Drogen Mark
Publication venue: 'SAGE Publications'
Publication date: 17/08/2018

University of Twente Research Information

Analyzing hierarchical multi-view MRI data with StaPLR: An application to Alzheimer's disease classification

Author: de Rooij Mark
de Vos Frank
Fokkema Marjolein
Koini Marisa
Schmidt Reinhold
Szabo Botond
van Loon Wouter
Publication venue: 'Frontiers Media SA'
Publication date: 01/01/2022

Multi-view data refers to a setting where features are divided into feature sets, for example because they correspond to different sources. Stacked penalized logistic regression (StaPLR) is a recently introduced method that can be used for classification and automatically selecting the views that are most important for prediction. We introduce an extension of this method to a setting where the data has a hierarchical multi-view structure. We also introduce a new view importance measure for StaPLR, which allows us to compare the importance of views at any level of the hierarchy. We apply our extended StaPLR algorithm to Alzheimer's disease classification where different MRI measures have been calculated from three scan types: structural MRI, diffusion-weighted MRI, and resting-state fMRI. StaPLR can identify which scan types and which derived MRI measures are most important for classification, and it outperforms elastic net regression in classification performance. Comment: 36 pages, 9 figures. Accepted manuscript

arXiv.org e-Print Archive

Archivio istituzionale della Ricerca - Bocconi

PubMed Central

Leiden University Scholary Publications

The detection and modeling of direct effects in latent class analysis

Author: Bakk Zsuzsa
de Rooij Mark J.
Janssen Jeroen H. M.
Kuha Jouni
van Laar Saskia
Publication venue: 'Informa UK Limited'
Publication date: 01/01/2018

Several approaches have been proposed for latent class modeling with external variables, including one-step, two-step and three-step estimators. However, very little is known yet about the performance of these approaches when direct effects of the external variable on the indicators of latent class membership are present. In the current article, we compare those approaches and investigate the consequences of not modeling these direct effects when present, as well as the power of residual and fit statistics to identify such effects. The results of the simulations show that not modeling direct effects can lead to severe parameter bias, especially with a weak measurement model. Both residual and fit statistics can be used to identify such effects, as long as the number and strength of these effects are low and the measurement model is sufficiently strong

LSE Research Online

Leiden University Scholary Publications

NORA - Norwegian Open Research Archives