
    A Bayesian Approach to Graphical Record Linkage and Deduplication

    © 2016 American Statistical Association. We propose an unsupervised approach for linking records across arbitrarily many files, while simultaneously detecting duplicate records within files. Our key innovation involves the representation of the pattern of links between records as a bipartite graph, in which records are directly linked to latent true individuals, and only indirectly linked to other records. This flexible representation of the linkage structure naturally allows us to estimate the attributes of the unique observable people in the population, calculate transitive linkage probabilities across records (and represent this visually), and propagate the uncertainty of record linkage into later analyses. Our method makes it particularly easy to integrate record linkage with post-processing procedures such as logistic regression, capture–recapture, etc. Our linkage structure lends itself to an efficient, linear-time, hybrid Markov chain Monte Carlo algorithm, which overcomes many obstacles encountered by previously proposed record linkage approaches, despite the high-dimensional parameter space. We illustrate our method using longitudinal data from the National Long Term Care Survey and with data from the Italian Survey on Household and Wealth, where we assess the accuracy of our method and show it to be better in terms of error rates and empirical scalability than other approaches in the literature. Supplementary materials for this article are available online.
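
    A small, purely hypothetical sketch of the bipartite representation may help: each record carries a label pointing to a latent individual and records are never linked directly to one another, so the number of distinct labels in use is the number of unique observed people, a quantity that feeds naturally into capture–recapture style post-processing. The record and label names below are invented for illustration; this is not the authors' implementation.

        # Hypothetical posterior draws of the record-to-latent assignment:
        # one dict per MCMC sample, mapping record id -> latent individual id.
        samples = [
            {"file1_rec1": 0, "file1_rec2": 1, "file2_rec1": 0, "file2_rec2": 2},
            {"file1_rec1": 0, "file1_rec2": 1, "file2_rec1": 0, "file2_rec2": 1},
            {"file1_rec1": 0, "file1_rec2": 1, "file2_rec1": 3, "file2_rec2": 2},
        ]

        # Records corefer exactly when they share a latent label, so the number of
        # unique observed individuals in a draw is the number of distinct labels.
        unique_counts = [len(set(s.values())) for s in samples]
        print(unique_counts, sum(unique_counts) / len(unique_counts))  # [3, 2, 4] 3.0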

    SMERED: A Bayesian Approach to Graphical Record Linkage and De-duplication

    We propose a novel unsupervised approach for linking records across arbitrarily many files, while simultaneously detecting duplicate records within files. Our key innovation is to represent the pattern of links between records as a bipartite graph, in which records are directly linked to latent true individuals, and only indirectly linked to other records. This flexible new representation of the linkage structure naturally allows us to estimate the attributes of the unique observable people in the population, calculate k-way posterior probabilities of matches across records, and propagate the uncertainty of record linkage into later analyses. Our linkage structure lends itself to an efficient, linear-time, hybrid Markov chain Monte Carlo algorithm, which overcomes many obstacles encountered by previously proposed methods of record linkage, despite the high-dimensional parameter space. We assess our results on real and simulated data.
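
    A minimal sketch (hypothetical, not the SMERED code) of how the k-way posterior match probabilities can be read off posterior samples of the linkage structure: the probability that a set of records all refer to the same latent individual is simply the fraction of MCMC draws in which their latent labels coincide.

        # Hypothetical posterior draws of record -> latent-individual labels.
        samples = [
            {"A1": 0, "A2": 0, "B1": 0},
            {"A1": 0, "A2": 2, "B1": 0},
            {"A1": 1, "A2": 1, "B1": 1},
        ]

        def k_way_match_probability(samples, records):
            """Fraction of draws in which all listed records share one label."""
            hits = sum(len({s[r] for r in records}) == 1 for s in samples)
            return hits / len(samples)

        print(k_way_match_probability(samples, ["A1", "B1"]))        # pairwise: 1.0
        print(k_way_match_probability(samples, ["A1", "A2", "B1"]))  # 3-way: 0.67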

    Structure in the nucleus of NGC 1068 at 10 microns

    New 8 to 13 micron array camera images of the central kiloparsec of the Seyfert 2 galaxy NGC 1068 resolve structure that is similar to that observed at visible and radio wavelengths. The images reveal an infrared source which is extended and asymmetric, with its long axis oriented at P.A. 33 deg. Maps of the spatial distribution of 8 to 13 micron color temperature and warm dust opacity are derived from the multiwavelength infrared images. The results suggest that there exist two pointlike luminosity sources in the central regions of NGC 1068, with the brighter source at the nucleus and the fainter one some 100 pc to the northeast. This geometry strengthens the possibility that the 10 micron emission observed from grains in the nucleus is powered by a nonthermal source. In the context of earlier visible and radio studies, these results considerably strengthen the case for jet-induced star formation in NGC 1068.
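
    One generic way to derive such a color temperature map (a sketch under simplifying assumptions, not the authors' pipeline): if both bands see blackbody-like emission with the same emissivity and beam, the per-pixel flux ratio fixes the temperature through the ratio of Planck functions, which can be inverted numerically. The 8.0 and 13.0 micron wavelengths and the temperature bracket below are illustrative choices.

        import numpy as np
        from scipy.optimize import brentq

        H, C, KB = 6.626e-34, 2.998e8, 1.381e-23  # SI: Planck, light speed, Boltzmann

        def planck_lambda(wavelength_m, temp_k):
            """Planck spectral radiance B_lambda(T) in W m^-3 sr^-1."""
            return (2.0 * H * C**2 / wavelength_m**5) / np.expm1(H * C / (wavelength_m * KB * temp_k))

        def color_temperature(ratio, lam1=8.0e-6, lam2=13.0e-6):
            """Solve B(lam1, T) / B(lam2, T) = observed flux ratio for T (kelvin)."""
            return brentq(lambda t: planck_lambda(lam1, t) / planck_lambda(lam2, t) - ratio, 20.0, 2000.0)

        # e.g. a pixel twice as bright (per unit wavelength) at 8 um as at 13 um
        print(color_temperature(2.0))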

    The 8.3 and 12.4 micron imaging of the Galactic Center source complex with the Goddard infrared array camera

    A 30 x 30 arcsec field at the Galactic Center (1.5 x 1.5 parsec) was mapped at 8.3 microns and 12.4 microns with high spatial resolution and accurate relative astrometry, using the 16 x 16 Si:Bi accumulation mode charge injection device Goddard infrared array camera. The design and performance of the array camera detector electronics system and image data processing techniques are discussed. Color temperature and dust opacity distributions derived from the spatially accurate images indicate that the compact infrared sources and the large scale ridge structure are bounded by warmer, more diffuse material. None of the objects appear to be heated appreciably by internal luminosity sources. These results are consistent with the model proposing that the complex is heated externally by a strong luminosity source at the Galactic Center, which dominates the energetics of the inner few parsecs of the galaxy.
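
    For the opacity maps, a standard single-slab radiative-transfer relation (assumed here for illustration, not quoted from the abstract) connects the observed surface brightness in a band to the color temperature derived from the two bands:

        I_\lambda = B_\lambda(T_c)\,\bigl(1 - e^{-\tau_\lambda}\bigr)
        \quad\Longrightarrow\quad
        \tau_\lambda = -\ln\!\bigl(1 - I_\lambda / B_\lambda(T_c)\bigr),

    where B_\lambda is the Planck function and T_c the two-band color temperature; in the optically thin limit this reduces to \tau_\lambda \approx I_\lambda / B_\lambda(T_c).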

    Differentially Private Model Selection with Penalized and Constrained Likelihood

    In statistical disclosure control, the goal of data analysis is twofold: the released information must provide accurate and useful statistics about the underlying population of interest, while minimizing the potential for an individual record to be identified. In recent years, the notion of differential privacy has received much attention in theoretical computer science, machine learning, and statistics. It provides a rigorous and strong notion of protection for individuals' sensitive information. A fundamental question is how to incorporate differential privacy into traditional statistical inference procedures. In this paper we study model selection in multivariate linear regression under the constraint of differential privacy. We show that model selection procedures based on penalized least squares or likelihood can be made differentially private by a combination of regularization and randomization, and propose two algorithms to do so. We show that our private procedures are consistent under essentially the same conditions as the corresponding non-private procedures. We also find that under differential privacy, the procedure becomes more sensitive to the tuning parameters. We illustrate and evaluate our method using simulation studies and two real data examples.
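
    One generic illustration of how regularization and randomization can be combined (a sketch using the exponential mechanism, not necessarily either of the paper's two algorithms): score each candidate submodel with a penalized criterion and sample a model with probability decaying exponentially in that score, scaled by the privacy budget epsilon and by the criterion's sensitivity to changing one record. Bounding that sensitivity is the delicate step and is left as a placeholder assumption below.

        import numpy as np

        def dp_model_select(scores, epsilon, sensitivity, rng=None):
            """Exponential mechanism over candidate models.

            scores      : dict mapping model id -> penalized criterion (lower is better)
            sensitivity : assumed bound on how much one record can change the criterion
            """
            rng = np.random.default_rng() if rng is None else rng
            models = list(scores)
            s = np.array([scores[m] for m in models], dtype=float)
            weights = np.exp(-epsilon * (s - s.min()) / (2.0 * sensitivity))
            return models[rng.choice(len(models), p=weights / weights.sum())]

        # Hypothetical penalized-likelihood (BIC-style) scores for three submodels.
        print(dp_model_select({"x1": 210.3, "x1+x2": 195.7, "x1+x2+x3": 198.1},
                              epsilon=1.0, sensitivity=5.0))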

    Sharing Social Network Data: Differentially Private Estimation of Exponential-Family Random Graph Models

    Motivated by a real-life problem of sharing social network data that contain sensitive personal information, we propose a novel approach to release and analyze synthetic graphs in order to protect privacy of individual relationships captured by the social network while maintaining the validity of statistical results. A case study using a version of the Enron e-mail corpus dataset demonstrates the application and usefulness of the proposed techniques in solving the challenging problem of maintaining privacy and supporting open access to network data to ensure reproducibility of existing studies and discovering new scientific insights that can be obtained by analyzing such data. We use a simple yet effective randomized response mechanism to generate synthetic networks under ε-edge differential privacy, and then use likelihood-based inference for missing data and Markov chain Monte Carlo techniques to fit exponential-family random graph models to the generated synthetic networks.
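
    A standard instantiation of such a randomized response release (the paper's exact flip probabilities may differ; the function and variable names here are illustrative): each dyad's edge indicator is kept with probability e^eps / (1 + e^eps) and flipped with probability 1 / (1 + e^eps), independently across dyads, so the two possible outputs for any single dyad have probability ratio e^eps, which is what epsilon-edge differential privacy requires.

        import numpy as np

        def randomized_response_graph(adj, epsilon, rng=None):
            """Release a synthetic undirected graph by flipping each upper-triangular
            edge indicator independently with probability 1 / (1 + exp(epsilon))."""
            rng = np.random.default_rng() if rng is None else rng
            p_flip = 1.0 / (1.0 + np.exp(epsilon))
            out = adj.copy()
            iu = np.triu_indices(adj.shape[0], k=1)
            flips = rng.random(len(iu[0])) < p_flip
            out[iu] = np.where(flips, 1 - adj[iu], adj[iu])
            out[(iu[1], iu[0])] = out[iu]  # mirror to keep the graph undirected
            return out

        A = np.array([[0, 1, 0, 0],
                      [1, 0, 1, 0],
                      [0, 1, 0, 1],
                      [0, 0, 1, 0]])
        print(randomized_response_graph(A, epsilon=1.0))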

    Bayesian Exponential Random Graph Models with Nodal Random Effects

    We extend the well-known and widely used Exponential Random Graph Model (ERGM) by including nodal random effects to compensate for heterogeneity in the nodes of a network. The Bayesian framework for ERGMs proposed by Caimo and Friel (2011) forms the basis of our modelling algorithm. A central question in network modelling is model selection, and following the Bayesian paradigm we focus on estimating Bayes factors. To do so we develop an approximate but feasible calculation of the Bayes factor which allows one to pursue model selection. Two data examples and a small simulation study illustrate our mixed model approach and the corresponding model selection.
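
    One common way to write an ERGM extended with nodal random effects (an illustration consistent with this modelling approach, not a quotation from the paper) is

        P(Y = y \mid \theta, \phi) = \frac{\exp\bigl(\theta^\top s(y) + \sum_{i<j} (\phi_i + \phi_j)\, y_{ij}\bigr)}{c(\theta, \phi)},
        \qquad \phi_i \sim N(\mu_\phi, \sigma_\phi^2),

    where s(y) collects the network statistics, the node-specific effects \phi_i absorb degree heterogeneity, and the intractable normalizing constant c(\theta, \phi) is what makes Bayes factor estimation, and hence model selection, nontrivial.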

    R. A. Fisher, design theory, and the Indian connection

    Design Theory, a branch of mathematics, was born out of the experimental statistics research of the population geneticist R. A. Fisher and of Indian mathematical statisticians in the 1930s. The field combines elements of combinatorics, finite projective geometries, Latin squares, and a variety of further mathematical structures, brought together in surprising ways. This essay will present these structures and ideas as well as how the field came together, in itself an interesting story.