Regression Trees for Longitudinal Data
While studying response trajectories, the population of interest is often
diverse enough to contain distinct subgroups, and the longitudinal change in
response may not be uniform across these subgroups. That is, the time slope
and/or the influence of covariates on the longitudinal profile may vary among
the subgroups. For example, Raudenbush (2001) used depression as an example to
argue that it is incorrect to assume that all the people in a given population
experience either increasing or decreasing levels of depression. In such
cases, a traditional linear mixed-effects model (assuming a common parametric
form for covariates and time) is not directly applicable to the entire
population, as a group-averaged trajectory can mask important subgroup
differences. Our aim is to identify and characterize longitudinally
homogeneous subgroups based on combinations of baseline covariates in the most
parsimonious way. This goal can be achieved by constructing a regression tree
for longitudinal data using baseline covariates as partitioning variables.
We propose the LongCART algorithm to construct a regression tree for
longitudinal data. In each node, LongCART determines the need for further
splitting (i.e., whether any parameter of the longitudinal profile is
influenced by a baseline attribute) via parameter instability tests, so the
decision to split further is type-I error controlled. We derive asymptotic
results for the proposed instability test and examine the finite-sample
behavior of the whole algorithm through simulation studies. Finally, we apply
the LongCART algorithm to study longitudinal changes in choline level among
HIV patients.
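The core idea — test whether a baseline covariate destabilizes the subgroup's time slope before splitting — can be illustrated with a minimal sketch. This is not the paper's actual instability test; as a stand-in, a two-sample z-test on subject-level OLS slopes decides whether a hypothetical binary baseline covariate warrants a split. All data and names below are illustrative.

```python
import numpy as np
from math import erf, sqrt

def subject_slope(t, y):
    """OLS slope of one subject's response on time."""
    t = np.asarray(t, float) - np.mean(t)
    y = np.asarray(y, float)
    return float(t @ (y - y.mean()) / (t @ t))

def split_is_needed(slopes, covariate, alpha=0.05):
    """Decide whether a binary baseline covariate destabilizes the time
    slope. A two-sample z-test on subject-level slopes stands in for
    LongCART's parameter instability test (a deliberate simplification)."""
    a, b = slopes[covariate == 0], slopes[covariate == 1]
    se = sqrt(a.var(ddof=1) / len(a) + b.var(ddof=1) / len(b))
    z = (a.mean() - b.mean()) / se
    p = 2.0 * (1.0 - 0.5 * (1.0 + erf(abs(z) / sqrt(2.0))))
    return p < alpha, p  # split only if instability is significant

# Simulated cohort: 40 subjects, 5 visits; subgroup 0 improves, 1 declines.
rng = np.random.default_rng(0)
visits = np.arange(5.0)
group = np.repeat([0, 1], 20)  # baseline covariate
slopes = np.array([
    subject_slope(visits, (1.0 if g == 0 else -1.0) * visits
                  + rng.normal(0.0, 0.2, visits.size))
    for g in group
])
needed, p_value = split_is_needed(slopes, group)
print(needed)  # prints: True -- the node should be split on this covariate
```

Because the splitting decision is a significance test at level alpha, the tree stops growing once no baseline covariate shows significant instability, which is what makes the procedure type-I error controlled.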
A cognitive architecture for learning in reactive environments
A framework for empirical discovery
Previous research in machine learning has viewed the process of empirical discovery as search through a space of 'theoretical' terms. In this paper, we propose a problem space for empirical discovery, specifying six complementary operators for defining new terms that ease the statement of empirical laws. The six types of terms include: numeric attributes (such as PV/T); intrinsic properties (such as mass); composite objects (such as pairs of colliding balls); classes of objects (such as acids and alkalis); composite relations (such as chemical reactions); and classes of relations (such as combustion/oxidation). We review existing machine discovery systems in light of this framework, examining which parts of the problem space were covered by these systems. Finally, we outline an integrated discovery system (IDS) we are constructing that includes all six of the operators and should be able to discover a broad range of empirical laws.
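The first operator — defining a numeric attribute such as PV/T — can be sketched concretely: propose a composite term from observed quantities and keep it if it is nearly invariant across trials, since an invariant term states an empirical law. The gas-trial data and constants below are illustrative, not from the paper.

```python
import numpy as np

# Hypothetical gas trials: a fixed amount of gas (1.5 mol), varying
# temperature T (K) and volume V (m^3); pressure P measured with 1% noise.
rng = np.random.default_rng(1)
R, n_mol = 8.314, 1.5
T = rng.uniform(250.0, 350.0, 30)
V = rng.uniform(0.01, 0.05, 30)
P = n_mol * R * T / V * (1.0 + rng.normal(0.0, 0.01, 30))

def coefficient_of_variation(x):
    """Relative spread of a candidate term across trials."""
    return float(np.std(x) / np.mean(x))

# The 'define a numeric attribute' operator: propose PV/T and keep it
# because it is (nearly) constant across trials, unlike raw pressure.
term = P * V / T
cv_raw = coefficient_of_variation(P)
cv_term = coefficient_of_variation(term)
print(cv_term < 0.05 < cv_raw)  # prints: True
```

A discovery system would generate many such candidate terms with its six operators and retain the few that make an empirical regularity expressible this simply.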
Automated data pre-processing via meta-learning
A data mining algorithm may perform differently on datasets with different characteristics; e.g., it might perform better on a dataset with continuous attributes than on one with categorical attributes, or the other way around.
As a matter of fact, a dataset usually needs to be pre-processed. Taking into account all the possible pre-processing operators, there exists a staggeringly large number of alternatives, and non-experienced users become overwhelmed.
We show that this problem can be addressed by an automated approach, leveraging ideas from meta-learning.
Specifically, we consider a wide range of data pre-processing techniques and a set of data mining algorithms. For each data mining algorithm and selected dataset, we are able to predict the transformations that improve the result
of the algorithm on the respective dataset. Our approach will help non-expert users to more effectively identify the transformations appropriate to their applications, and hence to achieve improved results.
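A minimal sketch of the meta-learning idea, under assumptions of my own (the paper's actual meta-features and learner are richer): describe each dataset by a few meta-features, record which pre-processing step helped most on past datasets, and recommend the step that worked on the most similar previously seen dataset.

```python
import numpy as np

def meta_features(n_rows, n_numeric, n_categorical):
    """Simple dataset meta-features: log-size and attribute-type mix."""
    total = n_numeric + n_categorical
    return np.array([np.log10(n_rows), n_numeric / total, n_categorical / total])

# Hypothetical history of past runs: meta-features of a dataset paired
# with the pre-processing step that most improved a given algorithm.
history = [
    (meta_features(1000, 8, 0), "standardize"),
    (meta_features(500, 0, 10), "one_hot_encode"),
    (meta_features(20000, 3, 3), "discretize"),
    (meta_features(300, 1, 9), "one_hot_encode"),
]

def recommend(mf):
    """1-nearest-neighbour over meta-features: reuse the transformation
    that worked on the most similar previously seen dataset."""
    dists = [np.linalg.norm(mf - h_mf) for h_mf, _ in history]
    return history[int(np.argmin(dists))][1]

# A new, mostly categorical dataset should get one-hot encoding.
print(recommend(meta_features(800, 1, 12)))  # prints: one_hot_encode
```

The design choice here is the standard meta-learning move: the search over pre-processing alternatives, overwhelming for a non-expert, is replaced by a prediction learned from prior experiments.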
Recursive Partitioning for Heterogeneous Causal Effects
In this paper we study the problems of estimating heterogeneity in causal
effects in experimental or observational studies and conducting inference about
the magnitude of the differences in treatment effects across subsets of the
population. In applications, our method provides a data-driven approach to
determine which subpopulations have large or small treatment effects and to
test hypotheses about the differences in these effects. For experiments, our
method allows researchers to identify heterogeneity in treatment effects that
was not specified in a pre-analysis plan, without concern about invalidating
inference due to multiple testing. In most of the literature on supervised
machine learning (e.g. regression trees, random forests, LASSO, etc.), the goal
is to build a model of the relationship between a unit's attributes and an
observed outcome. A prominent role in these methods is played by
cross-validation which compares predictions to actual outcomes in test samples,
in order to select the level of complexity of the model that provides the best
predictive power. Our method is closely related, but it differs in that it is
tailored for predicting causal effects of a treatment rather than a unit's
outcome. The challenge is that the "ground truth" for a causal effect is not
observed for any individual unit: we observe the unit with the treatment, or
without the treatment, but not both at the same time. Thus, it is not obvious
how to use cross-validation to determine whether a causal effect has been
accurately predicted. We propose several novel cross-validation criteria for
this problem and demonstrate through simulations the conditions under which
they perform better than standard methods for the problem of estimating causal
effects. We then apply the method to a large-scale field experiment that
re-ranks results on a search engine.
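The central difficulty the abstract describes — no unit reveals its own treatment effect, so there is no direct "ground truth" for cross-validation — can be illustrated with one well-known device (a sketch, not necessarily the paper's exact criterion): the Horvitz-Thompson transformed outcome, whose conditional mean equals the treatment effect, so it can serve as a noisy validation target. The simulation below is illustrative.

```python
import numpy as np

def transformed_outcome(y, w, p):
    """Horvitz-Thompson transform: E[y_star | x] = tau(x), so y_star can
    play the role of the unobserved treatment effect in test samples."""
    return y * w / p - y * (1.0 - w) / (1.0 - p)

def proxy_mse(tau_hat, y, w, p):
    """Score a predicted treatment effect against the transformed outcome."""
    return float(np.mean((transformed_outcome(y, w, p) - tau_hat) ** 2))

# Simulated randomized experiment: constant true effect tau = 2,
# treatment assigned with probability p = 0.5.
rng = np.random.default_rng(2)
n, p, tau = 20000, 0.5, 2.0
w = rng.binomial(1, p, n).astype(float)
y = tau * w + rng.normal(0.0, 1.0, n)

# The criterion ranks the correct effect estimate above a wrong one,
# even though tau is never observed for any single unit.
print(proxy_mse(tau, y, w, p) < proxy_mse(0.0, y, w, p))  # prints: True
```

The transformed outcome is very noisy for any single unit, which is exactly why the choice of cross-validation criterion matters and why the paper compares several.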
Price Indexes For Multi-dwelling Properties In Sweden
The econometric tests in this paper indicate that standard property and municipality attributes are important determinants of sales prices for MDCBs (multi-dwelling and commercial buildings) in Sweden. I also employ spatial econometric techniques and find that spatially specified regressions improve the models' explanatory power. The constant-quality price for a model estimated with OLS is roughly one percentage point higher than for a model controlling for spatial autocorrelation. When the constant-quality price trend is estimated on a yearly basis, there are hardly any differences between the estimated parameters, notwithstanding whether all MDCBs are in the sample or the sample is split into sub-markets. However, estimating models with a quarterly constant-quality price trend shows, to some extent, different price trends for the three sub-markets.
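A constant-quality price trend of the kind described here is typically recovered from a hedonic regression: attribute effects are held fixed and time-dummy coefficients trace the pure price change. The sketch below uses simulated data and OLS only (no spatial correction); all coefficients and variable names are illustrative.

```python
import numpy as np

# Hypothetical hedonic data: log price = intercept + 0.5 * log(size)
# + year effect + noise. The yearly dummy coefficients are the
# constant-quality log-price trend (attribute effects held fixed).
rng = np.random.default_rng(3)
n = 300
year = rng.integers(0, 3, n)              # three sample years
log_size = rng.normal(6.0, 0.4, n)
year_fx = np.array([0.0, 0.08, 0.15])     # true log-price trend
log_price = (1.0 + 0.5 * log_size + year_fx[year]
             + rng.normal(0.0, 0.05, n))

# Design matrix: intercept, log size, dummies for years 1 and 2.
X = np.column_stack([
    np.ones(n),
    log_size,
    (year == 1).astype(float),
    (year == 2).astype(float),
])
beta, *_ = np.linalg.lstsq(X, log_price, rcond=None)
trend = beta[2:]  # constant-quality log-price change relative to year 0
print(np.round(trend, 2))
```

A spatially specified model would add, e.g., a spatial lag of neighbouring prices to this regression; per the abstract, omitting it biases the constant-quality price upward by roughly one percentage point.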