99 research outputs found
Recommended from our members
The Estimation of Causal Effects from Observational Data
When experimental designs are infeasible, researchers must resort to the use of observational data from surveys, censuses, and administrative records. Because assignment to the independent variables of observational data is usually nonrandom, the challenge of estimating causal effects with observational data can be formidable. In this chapter, we review the large literature produced primarily by statisticians and econometricians in the past two decades on the estimation of causal effects from observational data. We first review the now widely accepted counterfactual framework for the modeling of causal effects. After examining estimators, both old and new, that can be used to estimate causal effects from cross-sectional data, we present estimators that exploit the additional information furnished by longitudinal data. Because of the size and technical nature of the literature, we cannot offer a fully detailed and comprehensive presentation. Instead, we present only the main features of methods that are accessible and potentially of use to quantitatively oriented sociologists.
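As a concrete illustration of the counterfactual framework described in this chapter, the sketch below simulates nonrandom treatment assignment and contrasts a naive difference in means with an inverse-probability-weighted estimator, one of the cross-sectional estimators the literature covers. The data, coefficients, and variable names are invented for illustration and are not drawn from the chapter.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 20_000

# Simulated observational data: confounder x drives both treatment and outcome.
x = rng.normal(size=n)
p_treat = 1 / (1 + np.exp(-0.8 * x))        # nonrandom assignment probability
d = rng.binomial(1, p_treat)                # treatment indicator
y = 2.0 * d + 1.5 * x + rng.normal(size=n)  # true causal effect is 2.0

# Naive difference in means is confounded by x.
naive = y[d == 1].mean() - y[d == 0].mean()

# Inverse-probability weighting with the true propensity score
# (in practice the score would itself be estimated, e.g. by a logit model).
ipw = np.mean(d * y / p_treat) - np.mean((1 - d) * y / (1 - p_treat))

print(f"naive estimate: {naive:.2f}")  # biased upward by the confounder
print(f"IPW estimate:   {ipw:.2f}")    # close to the true effect of 2.0
```

The point of the contrast is that weighting each observation by the inverse of its assignment probability recovers the randomized-experiment benchmark that the raw comparison cannot.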
Translating Causal Claims: Principles and Strategies for Policy-Relevant Criminology
Research Summary
This article reviews the causal turn in the social sciences and accompanying efforts by criminologists to make policy claims more credible. Although there has been much progress in techniques for the estimation of causal effects, we find that the link between evidence and valid policy implications remains elusive. Drawing on criminological theory and research insights from disciplines such as sociology, economics, and statistics, we assess principles and strategies for informing policy in a causally uncertain world. We identify three distinct domains of inquiry that form a part of the translational process from evidence to policy and that complicate the straightforward exportation of causal effects to policy recommendations: (a) mechanisms and causal pathways, (b) effect heterogeneity, and (c) contextualization. We elaborate these three concepts by examining research on broken windows theory, policing, video games and violence, the Moving to Opportunity voucher experiment, incarceration, and especially the rich set of experimental studies on domestic violence that originated in Minneapolis, MN in the early 1980s. We also articulate a set of conceptual tools for advancing the goal of policy translation and offer recommendations for how what we call “policy graphs”—causal graphs used to analyze the policy implications of a system of causal relations—can potentially integrate the theoretical and policy arms of criminology.
Policy Implications
Evidence, even if causal, does not necessarily inform policy. In fact, the question of “what works,” the focus of the growing evidence-based movement in criminology, turns out to be a different question than, “what will work?” Evidence-based policy research must therefore be concerned with much more than providing policymakers with research on causal effects, however precisely measured. The implication is that we must separate criminology’s increasing focus on causality from its policy turn and formally recognize that the latter requires a different standard of theory and evidence than does the former. In particular, criminologists interested in making policy claims must ask hard questions about the potential mechanisms through which a treatment influences an outcome, heterogeneous effects across people and time, contextual variations, and all of the real-world phenomena to which these challenges give rise—such as unintended consequences, policies that change incentive and opportunity structures, and the scale at which policies change in meaning. Theoretically guided causal graphs enhance this goal and help inform policy in a causally uncertain world. Translational criminology is ultimately a process that entails the constant interplay of theory, research, and practice.
A Logit Model with Interactions for Predicting Major Gift Donors
We provide a new statistical model developed from the alumni database at Northwestern University for identifying potential major gift donors. Our logit model with interactions predicts which individuals will give $10,000.
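The abstract does not list the paper's actual predictors, so the sketch below illustrates the general technique on invented alumni-style features: a logistic regression whose design matrix includes a pairwise interaction term, used to rank individuals by predicted giving probability. The features, coefficients, and data are hypothetical.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.preprocessing import PolynomialFeatures

rng = np.random.default_rng(1)
n = 5_000

# Illustrative predictors (not the paper's actual variables).
years_since_grad = rng.uniform(0, 50, n)
prior_gifts = rng.poisson(2, n)

# Simulate a major-donor flag whose log-odds include an interaction term.
logit = (-6 + 0.05 * years_since_grad + 0.4 * prior_gifts
         + 0.02 * years_since_grad * prior_gifts)
y = rng.binomial(1, 1 / (1 + np.exp(-logit)))

# interaction_only=True appends the pairwise product of the predictors,
# yielding a logit model with interactions.
X = np.column_stack([years_since_grad, prior_gifts])
X_int = PolynomialFeatures(degree=2, interaction_only=True,
                           include_bias=False).fit_transform(X)
model = LogisticRegression(max_iter=1000).fit(X_int, y)

# Rank individuals by predicted probability of a major gift.
scores = model.predict_proba(X_int)[:, 1]
top_prospects = np.argsort(scores)[::-1][:10]
```

In a fundraising setting the ranked scores, rather than a hard classification, are what drive outreach decisions, which is why the model is scored with predicted probabilities.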
Ecometrics in the Age of Big Data: Measuring and Assessing "Broken Windows" Using Large-scale Administrative Records
The collection of large-scale administrative records in electronic form by many cities provides a new opportunity for the measurement and longitudinal tracking of neighborhood characteristics, but one that requires novel methodologies that convert such data into research-relevant measures. The authors illustrate these challenges by developing measures of “broken windows” from Boston’s constituent relationship management (CRM) system (aka 311 hotline). A 16-month archive of the CRM database contains more than 300,000 address-based requests for city services, many of which reference physical incivilities (e.g., graffiti removal). The authors carry out three ecometric analyses, each building on the previous one. Analysis 1 examines the content of the measure, identifying 28 items that constitute two independent constructs, private neglect and public denigration. Analysis 2 assesses the validity of the measure by using investigator-initiated neighborhood audits to examine the “civic response rate” across neighborhoods. Indicators of civic response were then extracted from the CRM database so that measurement adjustments could be automated. These adjustments were calibrated against measures of litter from the objective audits. Analysis 3 examines the reliability of the composite measure of physical disorder at different spatiotemporal windows, finding that census tracts can be measured at two-month intervals and census block groups at six-month intervals. The final measures are highly detailed, can be tracked longitudinally, and are virtually costless. This framework thus provides an example of how new forms of large-scale administrative data can yield ecometric measurement for urban science while illustrating the methodological challenges that must be addressed.
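A minimal sketch of the aggregation step such a framework requires, assuming a toy CRM extract with invented column names: requests are grouped by census tract at two-month intervals, the finest window the reliability analysis found workable for tracts. Real records would carry geocoded addresses mapped to tracts and flags for the 28 "broken windows" items.

```python
import pandas as pd

# Toy CRM extract; column names and values are illustrative.
crm = pd.DataFrame({
    "date": pd.to_datetime(["2011-01-05", "2011-02-20", "2011-03-03",
                            "2011-01-15", "2011-04-09", "2011-05-30"]),
    "tract": ["A", "A", "A", "B", "B", "B"],
    "private_neglect": [1, 0, 1, 1, 0, 0],      # construct 1 indicator
    "public_denigration": [0, 1, 0, 0, 1, 1],   # construct 2 indicator
})

# Two-month windows per census tract, following the finding that tracts
# can be reliably measured at two-month intervals.
counts = (crm
          .set_index("date")
          .groupby("tract")
          .resample("2MS")[["private_neglect", "public_denigration"]]
          .sum())
print(counts)
```

Because the input is an ordinary administrative table, the same pipeline can be re-run as new 311 records arrive, which is what makes longitudinal, virtually costless tracking possible.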
Cancer risk and tumour spectrum in 172 patients with a germline SUFU pathogenic variation: a collaborative study of the SIOPE Host Genome Working Group
Background: Little is known about the risks associated with germline SUFU pathogenic variants (PVs), a known cancer predisposition syndrome.
Methods: To study tumour risks, we analysed data on a large cohort of 45 unpublished patients with a germline SUFU PV, supplemented with 127 previously published patients. To reduce the ascertainment bias due to index patient selection, the risk of tumours was evaluated in relatives with a SUFU PV (89 patients) using the Nelson-Aalen estimator.
Results: Overall, 117/172 (68%) SUFU PV carriers developed at least one tumour: medulloblastoma (MB) (86 patients), basal cell carcinoma (BCC) (25 patients), meningioma (20 patients) and gonadal tumours (11 patients). Thirty-three of them (28%) had multiple tumours. Median ages at diagnosis of MB, gonadal tumour, first BCC and first meningioma were 1.5, 14, 40 and 44 years, respectively. Follow-up data were available for 160 patients (137 remained alive and 23 died). The cumulative incidence of tumours in relatives was 14.4% (95% CI 6.8 to 21.4), 18.2% (95% CI 9.7 to 25.9) and 44.1% (95% CI 29.7 to 55.5) at the ages of 5, 20 and 50 years, respectively. The cumulative risks of MB, gonadal tumour, BCC and meningioma at age 50 years were 13.3% (95% CI 6 to 20.1), 4.6% (95% CI 0 to 9.7), 28.5% (95% CI 13.4 to 40.9) and 5.2% (95% CI 0 to 12), respectively. Sixty-four different PVs were reported across the entire SUFU gene; the variant was inherited in 73% of cases in which inheritance could be evaluated.
Conclusion: Germline SUFU PV carriers have a lifelong increased risk of tumours, with a spectrum dominated by MB before the age of 5, gonadal tumours during adolescence, and BCC and meningioma in adulthood, justifying fine-tuned surveillance programmes.
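The Nelson-Aalen estimator used to evaluate tumour risk in relatives can be sketched directly: the cumulative hazard at time t is the sum of d_i / n_i over event times up to t, where d_i events occur among the n_i subjects still at risk. The ages and event flags below are illustrative, not the cohort's data.

```python
import numpy as np

def nelson_aalen(times, events):
    """Nelson-Aalen cumulative hazard: H(t) = sum over event times t_i <= t
    of d_i / n_i, with d_i events among n_i subjects still at risk."""
    order = np.argsort(times)
    times = np.asarray(times, dtype=float)[order]
    events = np.asarray(events)[order]
    n_at_risk = len(times)
    hazard, cum = {}, 0.0
    for t in np.unique(times):
        mask = times == t
        d = events[mask].sum()       # events observed at time t
        if d:
            cum += d / n_at_risk
        hazard[t] = cum
        n_at_risk -= mask.sum()      # events and censorings leave the risk set
    return hazard

# Toy ages at first tumour (event=1) or last follow-up (event=0).
ages   = [1.5, 2, 5, 14, 20, 40, 44, 50, 50, 50]
events = [1,   1, 0, 1,  0,  1,  1,  0,  0,  0]
H = nelson_aalen(ages, events)

# Cumulative incidence can be approximated as 1 - exp(-H(t)).
print({t: round(1 - np.exp(-h), 3) for t, h in H.items()})
```

Handling censored follow-up this way is what lets risk be estimated from relatives with incomplete observation, which is how the study reduces index-patient ascertainment bias.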
[Comment] Redefine statistical significance
The lack of reproducibility of scientific studies has caused growing concern over the credibility of claims of new discoveries based on “statistically significant” findings. There has been much progress toward documenting and addressing several causes of this lack of reproducibility (e.g., multiple testing, P-hacking, publication bias, and under-powered studies). However, we believe that a leading cause of non-reproducibility has not yet been adequately addressed: Statistical standards of evidence for claiming discoveries in many fields of science are simply too low. Associating “statistically significant” findings with P < 0.05 results in a high rate of false positives even in the absence of other experimental, procedural and reporting problems.
For fields where the threshold for defining statistical significance is P < 0.05, we propose a change to P < 0.005. This simple step would immediately improve the reproducibility of scientific research in many fields. Results that would currently be called “significant” but do not meet the new threshold should instead be called “suggestive.” While statisticians have known the relative weakness of using P ≈ 0.05 as a threshold for discovery and the proposal to lower it to 0.005 is not new (1, 2), a critical mass of researchers now endorse this change.
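The arithmetic behind the false-positive claim can be made concrete. Assuming illustrative prior odds of 1:10 that a tested effect is real and 80% power (numbers chosen for illustration, not taken from the comment), the share of "discoveries" that are false at each threshold is:

```python
def false_positive_rate(alpha, power=0.8, prior_odds=1 / 10):
    """Expected share of claimed discoveries that are false positives,
    given the significance threshold, power, and prior odds of a real effect."""
    pi1 = prior_odds / (1 + prior_odds)  # fraction of tested effects that are real
    pi0 = 1 - pi1                        # fraction that are null
    return alpha * pi0 / (alpha * pi0 + power * pi1)

for alpha in (0.05, 0.005):
    rate = false_positive_rate(alpha)
    print(f"alpha = {alpha}: {rate:.1%} of discoveries false")
```

Under these assumed inputs, lowering the threshold from 0.05 to 0.005 cuts the false-positive share among discoveries from roughly a third to under a tenth, which is the intuition the proposal rests on.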
We restrict our recommendation to claims of discovery of new effects. We do not address the appropriate threshold for confirmatory or contradictory replications of existing claims. We also do not advocate changes to discovery thresholds in fields that have already adopted more stringent standards (e.g., genomics and high-energy physics research; see Potential Objections below).
We also restrict our recommendation to studies that conduct null hypothesis significance tests. We have diverse views about how best to improve reproducibility, and many of us believe that other ways of summarizing the data, such as Bayes factors or other posterior summaries based on clearly articulated model assumptions, are preferable to P-values. However, changing the P-value threshold is simple and might quickly achieve broad acceptance.
The FANCM:p.Arg658* truncating variant is associated with risk of triple-negative breast cancer
Breast cancer is a common disease partially caused by genetic risk factors. Germline pathogenic variants in the DNA repair genes BRCA1, BRCA2, PALB2, ATM, and CHEK2 are associated with breast cancer risk. FANCM, which encodes a DNA translocase, has been proposed as a breast cancer predisposition gene, with greater effects for the ER-negative and triple-negative breast cancer (TNBC) subtypes. We tested the three recurrent protein-truncating variants FANCM:p.Arg658*, p.Gln1701*, and p.Arg1931* for association with breast cancer risk in 67,112 cases, 53,766 controls, and 26,662 carriers of pathogenic variants of BRCA1 or BRCA2. These three variants were also studied functionally by measuring survival and chromosome fragility in FANCM−/− patient-derived immortalized fibroblasts treated with diepoxybutane or olaparib. We observed that FANCM:p.Arg658* was associated with increased risk of ER-negative disease and TNBC (OR = 2.44, P = 0.034 and OR = 3.79, P = 0.009, respectively). In a country-restricted analysis, we confirmed the associations detected for FANCM:p.Arg658* and found that FANCM:p.Arg1931* was also associated with ER-negative breast cancer risk (OR = 1.96; P = 0.006). The functional results indicated that all three variants were deleterious, affecting cell survival and chromosome stability, with FANCM:p.Arg658* causing the most severe phenotypes. In conclusion, we confirmed that the two rare deleterious FANCM variants p.Arg658* and p.Arg1931* are risk factors for the ER-negative and TNBC subtypes. Overall, our data suggest that the effect of truncating variants on breast cancer risk may depend on their position in the gene. Cell sensitivity to olaparib exposure identifies a possible therapeutic option for treating FANCM-associated tumors.
The role of administrative data in the big data revolution in social science research
The term big data is currently a buzzword in social science; however, its precise meaning is ambiguous. In this paper we focus on administrative data, which is a distinctive form of big data. New administrative data resources will afford exciting new opportunities for social science research, but these are currently underappreciated by the research community. The central aim of this paper is to discuss the challenges associated with administrative data. We emphasise that it is critical for researchers to carefully consider how administrative data have been produced. We conclude that administrative datasets have the potential to contribute to the development of high-quality and impactful social science research, and should not be overlooked in the emerging field of big data.