Variable selection with LASSO regression for complex survey data

Author: Arostegui I.
Barrio I.
Iparragirre A.
Lumley T.
Publication date: 01/01/2023

Variable selection is an important step in building good prediction models. LASSO regression is one of the most commonly used methods for this purpose, and cross-validation is the most widely applied technique for choosing its tuning parameter (λ). Validation techniques in a complex survey framework are closely related to “replicate weights”. However, to our knowledge, they have never been used in a LASSO regression context. Applying LASSO regression models to complex survey data could be challenging. The goal of this paper is two-fold. On the one hand, we analyze the performance of replicate weights methods for selecting the tuning parameter when fitting LASSO regression models to complex survey data. On the other hand, we propose new replicate weights methods for the same purpose. In particular, we propose a new design-based cross-validation method that combines traditional cross-validation with replicate weights. The performance of all these methods has been analyzed by means of an extensive simulation study and compared with that of the traditional cross-validation technique for selecting the tuning parameter of LASSO regression models. The results suggest a considerable improvement when the proposed design-based cross-validation is used instead of traditional cross-validation. IT1456-22 PIF18/21
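The general idea of choosing λ by cross-validation while respecting survey weights can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' design-based cross-validation and not a replicate-weights method: it swaps in a simpler scheme where whole clusters are assigned to folds and both the fit and the validation error are weighted by the sampling weights. All function names (`weighted_lasso`, `cluster_folds`, `select_lambda`), the toy data, and the λ grid are hypothetical.

```python
import random

def soft_threshold(z, lam):
    """Soft-thresholding operator used in the LASSO coordinate update."""
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0

def weighted_lasso(X, y, w, lam, n_iter=100):
    """Coordinate descent for the weighted LASSO objective
    (1 / (2 * sum(w))) * sum_i w_i * (y_i - b0 - x_i . beta)^2 + lam * ||beta||_1."""
    n, p = len(X), len(X[0])
    W = sum(w)
    b0 = sum(wi * yi for wi, yi in zip(w, y)) / W
    beta = [0.0] * p
    r = [yi - b0 for yi in y]  # current residuals
    for _ in range(n_iter):
        for j in range(p):
            num = sum(w[i] * X[i][j] * (r[i] + X[i][j] * beta[j]) for i in range(n)) / W
            den = sum(w[i] * X[i][j] ** 2 for i in range(n)) / W
            new = soft_threshold(num, lam) / den if den > 0 else 0.0
            shift = new - beta[j]
            if shift != 0.0:
                for i in range(n):
                    r[i] -= X[i][j] * shift
            beta[j] = new
        # The intercept is not penalized: recentre the weighted residuals.
        delta = sum(w[i] * r[i] for i in range(n)) / W
        b0 += delta
        r = [ri - delta for ri in r]
    return b0, beta

def weighted_mse(X, y, w, b0, beta):
    """Weighted mean squared prediction error."""
    num = sum(wi * (yi - b0 - sum(x * b for x, b in zip(xi, beta))) ** 2
              for xi, yi, wi in zip(X, y, w))
    return num / sum(w)

def cluster_folds(clusters, k, seed=0):
    """Assign whole clusters to folds (assumption: clusters play the role of PSUs)."""
    ids = sorted(set(clusters))
    random.Random(seed).shuffle(ids)
    fold_of = {c: idx % k for idx, c in enumerate(ids)}
    return [fold_of[c] for c in clusters]

def select_lambda(X, y, w, clusters, lambdas, k=4, seed=0):
    """Return the lambda with the lowest average weighted validation error."""
    folds = cluster_folds(clusters, k, seed)
    best_lam, best_err = None, float("inf")
    for lam in lambdas:
        err = 0.0
        for f in range(k):
            tr = [i for i in range(len(y)) if folds[i] != f]
            te = [i for i in range(len(y)) if folds[i] == f]
            b0, beta = weighted_lasso([X[i] for i in tr], [y[i] for i in tr],
                                      [w[i] for i in tr], lam)
            err += weighted_mse([X[i] for i in te], [y[i] for i in te],
                                [w[i] for i in te], b0, beta)
        if err / k < best_err:
            best_lam, best_err = lam, err / k
    return best_lam

# Toy data (hypothetical): only the first covariate is related to the outcome.
rng = random.Random(1)
n = 60
X = [[rng.gauss(0, 1) for _ in range(3)] for _ in range(n)]
y = [2.0 * xi[0] + rng.gauss(0, 0.5) for xi in X]
w = [rng.uniform(1, 5) for _ in range(n)]   # made-up sampling weights
clusters = [i // 5 for i in range(n)]       # 12 clusters of 5 units each
best = select_lambda(X, y, w, clusters, [0.01, 0.1, 0.5, 1.0, 5.0])
print("selected lambda:", best)
```

Keeping clusters intact within folds is one plausible way to avoid leaking within-cluster correlation between training and validation sets; a replicate-weights approach would instead refit the model under each set of replicate weights supplied with the survey.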

BCAM's Institutional Repository Data

Archivo Digital para la Docencia y la Investigación

Using agent based simulation to empirically examine complexity in carbon footprint business process

Author: Khan T
Patel NV
Pilla VN
Publication venue: EMCIS 2009
Publication date: 01/01/2009

A critical analysis of the extant literature shows that simulation is widely used as a research method in the natural sciences, engineering and the social sciences, as a third way of carrying out research alongside argumentation and formalisation. Simulation is not as widely used in business and management research as it ought to be, though this is changing for the better with technological advances in computers and their computational power. These advances enhance, like never before, the capability of theoretical research models both to define a problem and to empirically examine a solution to it in simulated reality. Management journal searches for “Simulation and Complexity Theory” returned no results, which suggests that this combination is not popular in management research, though the two are used individually more often. The major objective of this paper is to analyse some of the conceptual (or theoretical) and methodological (or empirical) contributions that Agent Based Simulation and Complexity Theory can make to the business and management community in its business-process-related research. In view of this, some basic ideas are discussed on using Agent Based Simulation as a method in Business and Management Studies research and on how an Agent Based Model can be applied to a business process as complex as Carbon Footprint. It is in this context that the use of Complexity as the base theory to empirically examine a business process is discussed. Throughout this article, our research on complex adaptive systems (e.g., Accounting Information System) in continuously changing organisations managing complex business processes (e.g., Carbon Footprint business process) is used as the basis for illustrating some of the concepts. Finally, avenues for further management research using these tools and methodology are suggested.

Brunel University Research Archive

Overview on agent-based social modelling and the use of formal languages

Author: Casanovas Garcia Josep
Cela Espín José M.
Kaplan Marcusan Adriana
Montañola Sales Cristina
Rubio Campillo Xavier
Publication venue: IGI Global
Publication date: 01/01/2013

Transdisciplinary Models and Applications investigates a variety of programming languages used in validating and verifying models in order to assist in their eventual implementation. This book will explore different methods of evaluating and formalizing simulation models, enabling computer and industrial engineers, mathematicians, and students working with computer simulations to thoroughly understand the progression from simulation to product, improving the overall effectiveness of modeling systems. Postprint (author's final draft)

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

UPCommons. Portal del coneixement obert de la UPC

Engineering simulations for cancer systems biology

Author: Andrews Paul S.
Bown James L.
Deeni Yusuf Y.
Goltsov Alexey
Idowu Michael A.
Polack Fiona A.C.
Sampson Adam T.
Shovman Mark
Stepney Susan
Publication date: 01/11/2012

Computer simulation can be used to inform in vivo and in vitro experimentation, enabling rapid, low-cost hypothesis generation and directing experimental design in order to test those hypotheses. In this way, in silico models become a scientific instrument for investigation, and so should be developed to high standards, be carefully calibrated, and have their findings presented in such a way that they may be reproduced. Here, we outline a framework that supports developing simulations as scientific instruments, and we select cancer systems biology as an exemplar domain, with a particular focus on cellular signalling models. We consider the challenges of lack of data, incomplete knowledge and modelling in the context of a rapidly changing knowledge base. Our framework comprises a process to clearly separate scientific and engineering concerns in model and simulation development, and an argumentation approach to documenting models as a rigorous way of recording assumptions and knowledge gaps. We propose interactive, dynamic visualisation tools to enable the biological community to interact with cellular signalling models directly for experimental design. There is a mismatch in scale between these cellular models and the tissue structures that are affected by tumours, and bridging this gap requires substantial computational resource. We present concurrent programming as a technology to link scales without losing important details through model simplification. We discuss the value of combining this technology, interactive visualisation, argumentation and model separation to support development of multi-scale models that represent biologically plausible cells arranged in biologically plausible structures that model cell behaviour, interactions and response to therapeutic interventions.

CiteSeerX

Abertay Research Portal

Towards Data-driven Simulation of End-to-end Network Performance Indicators

Author: Sliwa Benjamin
Wietfeld Christian
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 21/11/2019

Novel vehicular communication methods are mostly analyzed through simulation or analytically, as real-world performance tests are highly time-consuming and cost-intensive. Moreover, the high number of uncontrollable effects makes it practically impossible to reevaluate different approaches under the exact same conditions. However, as these methods massively simplify the effects of the radio environment and various cross-layer interdependencies, the results of end-to-end indicators (e.g., the resulting data rate) often differ significantly from real-world measurements. In this paper, we present a data-driven approach that exploits a combination of multiple machine learning methods for modeling the end-to-end behavior of network performance indicators within vehicular networks. The proposed approach can be exploited for fast and close-to-reality evaluation and optimization of new methods in a controllable environment, as it implicitly considers cross-layer dependencies between measurable features. Within an example case study for opportunistic vehicular data transfer, the proposed approach is validated against real-world measurements and a classical system-level network simulation setup. Although the proposed method requires only a fraction of the computation time of the latter, it achieves a significantly better match with the real-world evaluations.

arXiv.org e-Print Archive

Crossref

Agent-based modeling: a systematic assessment of use cases and requirements for enhancing pharmaceutical research and development productivity.

Author: Hunt C. Anthony
Kennedy RC
Kim SHJ
Ropella GEP
Publication venue: eScholarship, University of California
Publication date: 04/06/2013

A crisis continues to brew within the pharmaceutical research and development (R&D) enterprise: productivity continues declining as costs rise, despite ongoing, often dramatic scientific and technical advances. To reverse this trend, we offer various suggestions for both the expansion and broader adoption of modeling and simulation (M&S) methods. We suggest strategies and scenarios intended to enable new M&S use cases that directly engage R&D knowledge generation and build actionable mechanistic insight, thereby opening the door to enhanced productivity. What M&S requirements must be satisfied to access and open the door, and begin reversing the productivity decline? Can current methods and tools fulfill the requirements, or are new methods necessary? We draw on the relevant, recent literature to provide and explore answers. In so doing, we identify essential roles for agent-based and other methods. We assemble a list of requirements necessary for M&S to meet the diverse needs distilled from a collection of research, review, and opinion articles. We argue that to realize its full potential, M&S should be actualized within a larger information technology framework (a dynamic knowledge repository) wherein models of various types execute, evolve, and increase in accuracy over time. We offer some details of the issues that must be addressed for such a repository to accrue the capabilities needed to reverse the productivity decline.

PubMed Central

eScholarship - University of California