Balancing Multiple Goals In Observational Study Design

Author: Pimentel Samuel
Publication venue: ScholarlyCommons
Publication date: 01/01/2017

This thesis unites three papers discussing new strategies for matched pair designs using observational data, developed to balance the demands of various disparate design goals. The first chapter introduces a new matching algorithm for large-scale treated-control comparisons when many categorical covariates are present. The algorithm balances covariates and their interactions in a prioritized manner by solving a combinatorial optimization problem, and guarantees computational efficiency through the use of a sparse network representation. The second chapter defines a class of variables called prods which can be ignored when matching in order to strictly attenuate unmeasured bias, if it is present. These variables can be difficult to identify with confidence, so a multiple-control-group strategy is proposed in which investigators match once on all variables, and once ignoring prods; the two treated-control comparisons together give stronger evidence about treatment effects than either one individually. The final paper considers a new version of Fisher's classical lack-of-fit test for regression models, appropriate for data that lack replicated observations. The test uses matched pairs formed by optimal nonbipartite matching as near-replicates, and the model fit is used in constructing the matching distance in order to focus attention on variables that are predictive in the null model.

ScholarlyCommons@Penn

Choosing a Clustering: An A Posteriori Method for Social Networks

Author: Pimentel Samuel D.
Publication venue: Exeley, Inc.

Exeley Inc.

Covariate-adaptive randomization inference in matched designs

Authors: Huang Yaxuan; Pimentel Samuel D.
Publication date: 10/11/2023

It is common to conduct causal inference in matched observational studies by proceeding as though treatment assignments within matched sets are assigned uniformly at random and using this distribution as the basis for inference. This approach ignores observed discrepancies in matched sets that may be consequential for the distribution of treatment, which are succinctly captured by within-set differences in the propensity score. We address this problem via covariate-adaptive randomization inference, which modifies the permutation probabilities to vary with estimated propensity score discrepancies and avoids requirements to exclude matched pairs or model an outcome variable. We show that the test achieves type I error control arbitrarily close to the nominal level when large samples are available for propensity score estimation. We characterize the large-sample behavior of the new randomization test for a difference-in-means estimator of a constant additive effect. We also show that existing methods of sensitivity analysis generalize effectively to covariate-adaptive randomization inference. Finally, we evaluate the empirical value of covariate-adaptive randomization procedures via comparisons to traditional uniform inference in matched designs with and without propensity score calipers and regression adjustment using simulations and analyses of genetic damage among welders and right-heart catheterization in surgical patients. Comment: 41 pages, 8 figures.

arXiv.org e-Print Archive
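
The idea in the abstract above, letting permutation probabilities within a matched pair depend on propensity score discrepancies, can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes matched pairs with exactly one treated unit, independent treatment assignment before conditioning, a one-sided test of the sharp null of no effect, and propensity scores treated as known. The function names `pair_treatment_prob` and `adaptive_randomization_pvalue` are hypothetical.

```python
import random


def pair_treatment_prob(e_treated, e_control):
    """Probability that the observed treated unit is the treated one,
    given that exactly one unit in the pair is treated.

    Assumes independent assignment with propensity scores e_treated and
    e_control strictly between 0 and 1. Equal scores give 0.5, which is
    the uniform randomization distribution.
    """
    a = e_treated * (1.0 - e_control)
    b = e_control * (1.0 - e_treated)
    return a / (a + b)


def adaptive_randomization_pvalue(diffs, probs, n_draws=10000, seed=0):
    """One-sided Monte Carlo p-value for the mean treated-minus-control
    difference under the sharp null of no treatment effect.

    diffs: observed treated-minus-control outcome difference per pair.
    probs: per-pair probability that the observed assignment occurs
           (for example from pair_treatment_prob).
    Under the sharp null, swapping treatment labels in a pair flips the
    sign of its difference, so each draw keeps the sign with
    probability p and flips it with probability 1 - p.
    """
    rng = random.Random(seed)
    n = len(diffs)
    observed = sum(diffs) / n
    hits = 0
    for _ in range(n_draws):
        stat = 0.0
        for d, p in zip(diffs, probs):
            stat += d if rng.random() < p else -d
        stat /= n
        # Small tolerance guards against floating-point ties.
        if stat >= observed - 1e-12:
            hits += 1
    # Add-one correction keeps the Monte Carlo p-value strictly positive.
    return (hits + 1) / (n_draws + 1)


if __name__ == "__main__":
    # Hypothetical data: five pairs with made-up outcomes and scores.
    diffs = [1.2, 0.4, 0.9, -0.3, 1.5]
    scores = [(0.6, 0.5), (0.4, 0.4), (0.7, 0.5), (0.3, 0.35), (0.55, 0.5)]
    probs = [pair_treatment_prob(et, ec) for et, ec in scores]
    print(adaptive_randomization_pvalue(diffs, probs))
```

Setting every probability to 0.5 recovers the traditional uniform randomization test, so the two approaches differ only when propensity scores within a pair disagree.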

Variance-based sensitivity analysis for weighting estimators results in more informative bounds

Authors: Huang Melody; Pimentel Samuel D.
Publication date: 02/08/2022

Weighting methods are popular tools for estimating causal effects; assessing their robustness under unobserved confounding is important in practice. In the following paper, we introduce a new set of sensitivity models called "variance-based sensitivity models". Variance-based sensitivity models characterize the bias from omitting a confounder by bounding the distributional differences that arise in the weights from omitting a confounder, with several notable innovations over existing approaches. First, the variance-based sensitivity models can be parameterized with respect to a simple $R^2$ parameter that is both standardized and bounded. We introduce a formal benchmarking procedure that allows researchers to use observed covariates to reason about plausible parameter values in an interpretable and transparent way. Second, we show that researchers can estimate valid confidence intervals under a set of variance-based sensitivity models, and provide extensions for researchers to incorporate their substantive knowledge about the confounder to help tighten the intervals. Last, we highlight the connection between our proposed approach and existing sensitivity analyses, and demonstrate, both empirically and theoretically, that variance-based sensitivity models can provide improvements on both the stability and tightness of the estimated confidence intervals over existing methods. We illustrate our proposed approach on a study examining blood mercury levels using the National Health and Nutrition Examination Survey (NHANES).

arXiv.org e-Print Archive

GRFT – Genetic Records Family Tree Web Applet

Authors: Fernandes John; Pimentel Samuel; Walbot Virginia
Publication venue: Frontiers Research Foundation
Publication date: 01/01/2011

Current software for storing and displaying records of genetic crosses does not provide an easy way to determine the lineage of an individual. The genetic records family tree (GRFT) applet processes records of genetic crosses and allows researchers to quickly visualize lineages using a family tree construct and to access other information from these records using any Internet browser. Users select from three display features: (1) a family tree view which displays a color-coded family tree for an individual, (2) a sequential list of crosses, and (3) a list of crosses matching user-defined search criteria. Each feature contains options to specify the number of records shown and the latter two contain an option to filter results by the owner of the cross. The family tree feature is interactive, displaying a popup box with genetic information when the user mouses over an individual and allowing the user to draw a new tree by clicking on any individual in the current tree. The applet is written in JavaScript and reads genetic records from a tab-delimited text file on the server, so it is cross-platform, can be accessed by anyone with an Internet connection, and supports almost instantaneous generation of new trees and table lists. Researchers can use the tool with their own genetic cross records for any sexually reproducing organism. No additional software is required and with only minor modifications to the script, researchers can add their own custom columns. GRFT’s speed, versatility, and low overhead make it an effective and innovative visualization method for genetic records. A sample tool is available at http://stanford.edu/walbot/grft-sample.html

Directory of Open Access Journals

PubMed Central

Frontiers - Publisher Connector

An Exact Test of Fit for the Gaussian Linear Model using Optimal Nonbipartite Matching

Authors: Pimentel Samuel D; Rosenbaum Paul R; Small Dylan S
Publication venue: ScholarlyCommons
Publication date: 13/04/2017

Fisher tested the fit of Gaussian linear models using replicated observations. We refine this method by (1) constructing near-replicates using an optimal nonbipartite matching and (2) defining a distance that focuses on predictors important to the model’s predictions. Near-replicates may not exist unless the predictor set is low-dimensional; the test addresses dimensionality by betting that model failures involve a subset of predictors important in the old fit. Despite using the old fit to pair observations, the test has exactly its stated level under the null hypothesis. Simulations show the test has reasonable power even when many spurious predictors are present.

ScholarlyCommons@Penn

Constructed Second Control Groups and Attenuation of Unmeasured Biases

Authors: Pimentel Samuel D; Rosenbaum Paul R; Small Dylan S
Publication venue: ScholarlyCommons
Publication date: 01/10/2016

The informal folklore of observational studies claims that if an irrelevant observed covariate is left uncontrolled, say unmatched, then it will influence treatment assignment in haphazard ways, thereby diminishing the biases from unmeasured covariates. We prove a result along these lines: it is true, in a certain sense, to a limited degree, under certain conditions. Alas, the conditions are neither inconsequential nor easy to check in empirical work; indeed, they are often dubious, more often implausible. We suggest the result is most useful in the computerized construction of a second control group, where the investigator can see more in available data without necessarily believing the required conditions. One of the two control groups controls for the possibly irrelevant observed covariate, the other control group either leaves it uncontrolled or forces separation; therefore, the investigator views one situation from two angles under different assumptions. A pair of sensitivity analyses for the two control groups is coordinated by a weighted Holm or recycling procedure built around the possibility of slight attenuation of bias in one control group. Issues are illustrated using an observational study of the possible effects of cigarette smoking as a cause of increased homocysteine levels, a risk factor for cardiovascular disease. Supplementary materials for this article are available online.

ScholarlyCommons@Penn

FigShare

Genética e melhoramento de ovinos no Brasil

Authors: Araújo Ronyere Olegário de; Paiva Samuel Rezende; Pimentel Concepta Margaret McManus
Publication venue: FapUNIFESP (SciELO)
Publication date: 01/07/2010

Studies on the genetics and breeding of sheep in Brazil have increased significantly in recent years. These involve research on the characterization, breeding, and crossing of sheep using newly available technologies, incorporating both classical quantitative and molecular genetics. Suggestions are made for improvements in statistical techniques, computational resources, and DNA analysis, and gaps in present knowledge and opportunities for further research are pointed out. There is a need for greater interaction among the various groups working in the country, as well as with other disciplines such as Geographical Information Systems, Statistics, and Bioinformatics, and with biological fields such as physiology and proteomics.

Repositório Institucional da Universidade de Brasília

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

RCAAP - Repositório Científico de Acesso Aberto de Portugal

Evaluation of PRNP polymorphisms in Brazilian local adapted breeds

Authors: Caetano Alexandre Rodrigues; Ianella Patrícia; Paiva Samuel Rezende; Pimentel Concepta Margaret McManus
Publication date: 01/01/2010

The present study was conducted to genotype and estimate haplotypes and haplotypic and genotypic frequencies for three previously reported PRNP polymorphisms in Brazilian locally adapted/naturalized breeds, and to evaluate the flock's genetic potential in relation to scrapie susceptibility/resistance.

Repositório Institucional da Universidade de Brasília