64 research outputs found

US Data Access and the Commission for Evidence-based Policymaking

Author: Amy O'Hara
Publication venue: Swansea University
Publication date: 01/09/2018

Introduction In September 2017, the bipartisan Commission for Evidence-based Policymaking released twenty-two recommendations to improve secure data access for evidence-building activities involving population-level government files. Many of the files are siloed in government agencies. The commission deliberated over eighteen months to understand the risks and barriers to broader data use.
Objectives and Approach I will describe the Commission’s charge and review its recommendations in the context of US laws and privacy debates. I will compare the report’s recommendations and implications to laws and initiatives in other countries. The report calls for the establishment of a National Secure Data Service (NSDS), which has the potential to transform the data sharing environment for federal agencies, policy makers, and researchers. The report suggests more extensive use of differential privacy and secure multiparty computation to protect privacy. I will describe how the current environment could change depending on how the recommendations are implemented.
Results The Commission was established under one administration, but the recommendations were released under another. Despite the political and budget uncertainty in Washington, a bill was introduced and passed in the House in November 2017 to implement some recommendations. I will summarize the actions to be taken if the bill becomes law, including directives on learning agendas to prioritize and coordinate evidence-building activities across government, the roles of chief evaluation and chief data officers, and formation of an advisory committee to plan an NSDS. I will describe benefits that could follow from directives in the bill, including transparency about uses of administrative data, development of guidance to assess the risk when combining data sources, and minimization of the risk of publicly releasing de-identified data.
Conclusion/Implications The US may develop a national secure data service to support evaluations and policymaking. The recommendations are akin to the UK Data Service. Some recommendations are straightforward, others need years of planning and technical breakthroughs, and all require political buy-in and funding.

Directory of Open Access Journals

The Differential Privacy Corner: What has the US Backed Itself Into?

Author: Amy O'Hara
Quentin Brummet
Publication venue: Swansea University
Publication date: 01/11/2019

An expanding body of data privacy research reveals that computational advances and ever-growing amounts of publicly retrievable data increase re-identification risks. Because of this, data publishers are realizing that traditional statistical disclosure limitation methods may not protect privacy. This paper discusses the use of differential privacy at the US Census Bureau to protect the published results of the 2020 census. We first discuss the legal framework under which the Census Bureau intends to use differential privacy. The Census Act in the US states that the agency must keep information confidential, avoiding “any publication whereby the data furnished by any particular establishment or individual under this title can be identified.” The fact that Census may release fewer statistics in 2020 than in 2010 is leading scholars to parse the meaning of identification and reevaluate the agency’s responsibility to balance data utility with privacy protection. We then describe technical aspects of the application of differential privacy in the U.S. Census. This data collection is enormously complex and serves a wide variety of users and uses -- 7.8 billion statistics were released using the 2010 US Census. This complexity strains the application of differential privacy to ensure appropriate geographic relationships, respect legal requirements for certain statistics to be free of noise infusion, and provide information for detailed demographic groups. We end by discussing the prospects of applying formal mathematical privacy to other information products at the Census Bureau. At present, techniques exist for applying differential privacy to descriptive statistics, histograms, and counts, but are less developed for more complex data releases including panel data, linked data, and vast person-level datasets. 
We expect the continued development of formally private methods to occur alongside discussions of what privacy means and the policy issues involved in trading off protection for accuracy.
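
As a rough illustration of what applying differential privacy to counts and histograms can look like, the sketch below adds Laplace noise to each cell of a histogram. It is not the Census Bureau's algorithm. The function names, the `epsilon` value, and the assumption that each person contributes to exactly one cell (so adding or removing one person changes one count by 1) are all illustrative assumptions.

```python
import random


def laplace_noise(scale, rng):
    """Draw one sample from a Laplace(0, scale) distribution.

    The difference of two independent exponential draws with mean `scale`
    follows a Laplace distribution with that scale.
    """
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def private_histogram(counts, epsilon, rng=None):
    """Return a noisy copy of `counts` (a dict mapping cell -> count).

    Assumption: each person falls in exactly one cell, so the L1 sensitivity
    of the histogram is 1 and Laplace noise with scale 1/epsilon per cell
    gives epsilon-differential privacy for the released counts.
    """
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    rng = rng or random.Random()
    scale = 1.0 / epsilon
    return {cell: n + laplace_noise(scale, rng) for cell, n in counts.items()}


if __name__ == "__main__":
    # Hypothetical block-level counts by age group.
    true_counts = {"0-17": 12, "18-64": 40, "65+": 9}
    print(private_histogram(true_counts, epsilon=0.5))
```

A smaller `epsilon` means more noise and stronger protection, which is the accuracy trade-off the abstract refers to. The noisy counts can be negative or non-integer; making them consistent across geographic levels is a separate post-processing problem that this sketch does not attempt.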

Directory of Open Access Journals

Genomic and proteomic profiling of responses to toxic metals in human lung cells.

Author: Andrew Angeline S
Barchowsky Aaron
Hamilton Joshua W
Klei Linda
O'Hara Kimberley A
Soucy Nicole V
Temple Kaili A
Warren Amy J
Publication date: 01/05/2003

Examining global effects of toxic metals on gene expression can be useful for elucidating patterns of biological response, discovering underlying mechanisms of toxicity, and identifying candidate metal-specific genetic markers of exposure and response. Using a 1,200 gene nylon array, we examined changes in gene expression following low-dose, acute exposures of cadmium, chromium, arsenic, nickel, or mitomycin C (MMC) in BEAS-2B human bronchial epithelial cells. Total RNA was isolated from cells exposed to 3 μM Cd(II) (as cadmium chloride), 10 μM Cr(VI) (as sodium dichromate), 3 μg/cm2 Ni(II) (as nickel subsulfide), 5 μM or 50 μM As(III) (as sodium arsenite), or 1 μM MMC for 4 hr. Expression changes were verified at the protein level for several genes. Only a small subset of genes was differentially expressed in response to each agent: Cd, Cr, Ni, As (5 μM), As (50 μM), and MMC each differentially altered the expression of 25, 44, 31, 110, 65, and 16 individual genes, respectively. Few genes were commonly expressed among the various treatments. Only one gene was altered in response to all four metals (hsp90), and no gene overlapped among all five treatments. We also compared low-dose (5 μM, noncytotoxic) and high-dose (50 μM, cytotoxic) arsenic treatments, which, surprisingly, affected expression of almost completely nonoverlapping subsets of genes, suggesting a threshold switch from a survival-based biological response at low doses to a death response at high doses.

Crossref

PubMed Central

Dartmouth Digital Commons (Dartmouth College)

Reconciling Parent-Child Relationships across US Administrative Datasets

Author: Amy O'Hara
Carla Medalia
Katie Genadek
Trent Alexander
Publication venue: Swansea University
Publication date: 01/09/2018

Introduction Population data capture children, parents, relatives, and others moving in and out of households. The U.S. has seen falling marriage rates, and increases in multigenerational households and complex families, young children living with grandparents, and adult children living with parents. Robust parent-child linkages are critical to understand these demographic shifts.
Objectives and Approach We construct and validate parent-child linkages over a century to observe how U.S. households are changing over time. The three largest person-based datafiles in the U.S. are the decennial censuses, the Social Security Administration transaction file, and individual tax returns from the Internal Revenue Service. These sources operationalize relationships differently, capture data at various frequencies, and gather the data for unique purposes. We use probabilistic matching to observe and reconcile parent-child relationships across these sources. The data include a variety of personal identifiers including name, date of birth, parents’ names, address, and place of birth that support matching and validation.
Results We find that understanding the content, consistency, and coverage of the files before matching is critical for high quality linkages. The representativeness of the parent-child relationship file improves over time, with the weakest coverage for the Greatest Generation and the strongest coverage for Millennials. Coverage varies by source: tax data underrepresent non-white children and have duplicate records for SSNs, while names and dates of birth are missing from Census data. Multiple match rates differ among demographic groups and over time. In the matching process, the blocking variables rely on common variables across the population datasets. Our approach provides robust entity resolution for women, despite married-maiden name changes. We describe challenges due to data problems in old census records and validation changes in social security data.
Conclusion/Implications We conduct a successful reconciliation of parent-child relationships in U.S. population level files. The project supports operational and research uses, such as the 2020 Census. We will extend this work using graph matching and will expand the method to validate other relationship links including spouses and siblings.
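
To make the ideas of blocking and probabilistic matching more concrete, here is a minimal sketch in the style of Fellegi-Sunter agreement weights. It is not the authors' implementation. The field names, the `birth_year` blocking key, the m/u probabilities, and the threshold are hypothetical values chosen only for illustration.

```python
import math
from collections import defaultdict

# Hypothetical (m, u) probabilities per field:
# m = P(fields agree | records are the same person),
# u = P(fields agree | records are different people).
M_U = {"name": (0.95, 0.01), "dob": (0.98, 0.003), "birthplace": (0.90, 0.05)}


def match_weight(rec_a, rec_b):
    """Sum log2 likelihood-ratio weights over the compared fields."""
    total = 0.0
    for field, (m, u) in M_U.items():
        if rec_a.get(field) is None or rec_b.get(field) is None:
            continue  # a missing value contributes no evidence either way
        if rec_a[field] == rec_b[field]:
            total += math.log2(m / u)
        else:
            total += math.log2((1 - m) / (1 - u))
    return total


def candidate_pairs(file_a, file_b, block_key):
    """Yield only pairs that share a blocking key, avoiding a full cross join."""
    index = defaultdict(list)
    for rec in file_b:
        index[block_key(rec)].append(rec)
    for rec_a in file_a:
        for rec_b in index.get(block_key(rec_a), []):
            yield rec_a, rec_b


def link(file_a, file_b, block_key, threshold):
    """Return (id_a, id_b, weight) for candidate pairs at or above the threshold."""
    links = []
    for rec_a, rec_b in candidate_pairs(file_a, file_b, block_key):
        weight = match_weight(rec_a, rec_b)
        if weight >= threshold:
            links.append((rec_a["id"], rec_b["id"], weight))
    return links
```

In this sketch a record whose surname differs but whose date and place of birth agree can still clear the threshold, which shows one way a weight-based approach can tolerate name changes. A production system would typically use approximate string comparison and estimate the m/u probabilities from the data rather than fixing them by hand.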

Directory of Open Access Journals

The Building Blocks of Interoperability. A Multisite Analysis of Patient Demographic Attributes Available for Matching.

Author: Applegate Reuben
Becich Michael J
Bell Douglas
Bernstam Elmer
Bian Jiang
Cappella Nickie
Carton Thomas
Culbertson Adam
Goel Satyender
Grannis Shaun
Hall Lauren
Hogan William
Jackson Kathryn L
Kho Abel
Klann Jeff
Krishnamurthy Ashok
Lipori Gloria
Liu Mei
Madden Margaret B
Martin Andrew
Matheny Michael
O'Hara Amy B
Rothman Russell
Safaeinili Niloufar
Sutphen Rebecca
Visweswaran Shyam
Waitman Russ
Publication venue: eScholarship, University of California
Publication date: 01/04/2017

Background: Patient matching is a key barrier to achieving interoperability. Patient demographic elements must be consistently collected over time and region to be valuable elements for patient matching. Objectives: We sought to determine what patient demographic attributes are collected at multiple institutions in the United States and see how their availability changes over time and across clinical sites. Methods: We compiled a list of 36 demographic elements that stakeholders previously identified as essential patient demographic attributes that should be collected for the purpose of linking patient records. We studied a convenience sample of 9 health care systems from geographically distinct sites around the country. We identified changes in the availability of individual patient demographic attributes over time and across clinical sites. Results: Several attributes were consistently available over the study period (2005-2014) including last name (99.96%), first name (99.95%), date of birth (98.82%), gender/sex (99.73%), postal code (94.71%), and full street address (94.65%). Other attributes changed significantly from 2005-2014: Social security number (SSN) availability declined from 83.3% to 50.44% (p<0.0001). Email address availability increased from 8.94% to 54% (p<0.0001). Work phone number availability increased from 20.61% to 52.33% (p<0.0001). Conclusions: Overall, first name, last name, date of birth, gender/sex and address were widely collected across institutional sites and over time. Availability of emerging attributes such as email and phone numbers is increasing while SSN use is declining. Understanding the relative availability of patient attributes can inform strategies for optimal matching in healthcare.

Crossref

eScholarship - University of California

Self Curation, Social Partitioning, Escaping from Prejudice and Harassment: the Many Dimensions of Lying Online

Author: Guy Amy
Murray-Rust Dave
O'Hara Kieron
Shadbolt Nigel
Smith Daniel Alexander
Van Kleek Max
Publication venue: Association for Computing Machinery (ACM)
Publication date: 23/05/2015

Portraying matters as other than they truly are is an important part of everyday human communication. In this paper, we use a survey to examine ways in which people fabricate, omit or alter the truth online. Many reasons are found, including creative expression, hiding sensitive information, role-playing, and avoiding harassment or discrimination. The results suggest lying is often used for benign purposes, and we conclude that its use may be essential to maintaining a humane online society.

CiteSeerX

Southampton (e-Prints Soton)

Edinburgh Research Explorer

Establishing an International Data Linkage Repository Workgroup Toward a Benchmarking Repository

Author: Amy O'Hara
Hye-Chung Kum
Luiza Antonie
Margaret Levenstein
Susan Leonard
Trent Alexander
Özgür Akgün
Publication venue: Swansea University
Publication date: 01/09/2018

Introduction Access to real data with diverse attributes is critical for effective development of any data analytic algorithm. Benchmarking data repositories have all been vital to the development of research communities focused on algorithm development. This work reports on the development of such a data repository for record linkage.
Objectives and Approach Establishing a common benchmarking repository of real data can propel a field to the next level of rigor by facilitating comparison of different algorithms, understanding what type of algorithms work best under certain real data conditions and problem domains, promoting transparency and replicability of research, and creating incentives for proper citations for contributions. In addition, benchmarking repositories can bring together the diverse stakeholders (e.g., computer scientists, statisticians, data custodians, data users including social, behaviour, economic, and health (SBEH) scientists) that can advance the field more effectively than could researchers from any single discipline.
Results In Fall 2016, international leaders in record linkage formed a Data Linkage Repository workgroup (DLRep) to establish a benchmarking data repository for record linkage. The workgroup is working in collaboration with The Inter-university Consortium for Political and Social Research (ICPSR) to host the site data repository planned for release in Summer 2018. The repository for record linkage research will house various types of real data that require linking with metadata, unique handles for citations, proposed algorithms for evaluation criteria, and a platform for posting, sharing, and comparing results as well as citations of relevant papers. Some datasets will have the gold standard published that researchers can evaluate their results against. Other datasets will gather results to build the gold standard as a community.
Conclusion/Implications Record linkage methodology is important to domains where data needs to be integrated from multiple sources, including diverse disciplines. Establishing an international interdisciplinary research community around a benchmark data linkage repository to validate and compare linkage algorithms is crucial to fully realizing the social benefits of data about people.
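
Evaluating results against a published gold standard is commonly summarised with precision, recall, and F1 over linked pairs. The sketch below shows that calculation under stated assumptions. The function name and the representation of links as `(id_a, id_b)` tuples are illustrative and do not describe the repository's actual evaluation criteria.

```python
def evaluate_links(predicted, gold):
    """Compare predicted record pairs with gold-standard pairs.

    Both arguments are iterables of (id_a, id_b) tuples; the pair format
    is an assumption made for this sketch.
    """
    predicted, gold = set(predicted), set(gold)
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0:
        f1 = 0.0
    else:
        f1 = 2 * precision * recall / (precision + recall)
    return {"precision": precision, "recall": recall, "f1": f1}


if __name__ == "__main__":
    # Hypothetical output of a linkage algorithm and a hypothetical gold standard.
    predicted = [("a1", "b1"), ("a2", "b3")]
    gold = [("a1", "b1"), ("a2", "b2"), ("a3", "b3")]
    print(evaluate_links(predicted, gold))
```

Reporting all three numbers matters because a linkage algorithm can trade missed links for false links, and a shared metric is what makes results from different algorithms comparable on the same dataset.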

Directory of Open Access Journals

Seven-step framework to enhance practitioner explanations and parental understandings of research without prior consent in paediatric emergency and critical care trials

Author: Amy Humphreys
Anand Iyer
Carrol Gamble
Elizabeth D Lee
Helen Hickey
Joanne Noblet
Kerry Woolfall
Louise Roper
Lyttle
Mark D Lyttle
Naomi Rainford
O'Hara
Peters
Realpe
Richard Appleton
Shrouk Messahel
Woolfall
Publication venue: BMJ
Publication date: 22/02/2021

Background: Alternatives to prospective informed consent enable the conduct of paediatric emergency and critical care trials. Research without prior consent (RWPC) involves practitioners approaching parents after an intervention has been given and seeking consent for their child to continue in the trial. As part of an embedded study in the 'Emergency treatment with Levetiracetam or Phenytoin in Status Epilepticus in children' (EcLiPSE) trial, we explored how practitioners described the trial and RWPC during recruitment discussions, and how well this information was understood by parents. We aimed to develop a framework to assist trial conversations in future paediatric emergency and critical care trials using RWPC. Methods: Qualitative methods embedded within the EcLiPSE trial processes, including audiorecorded practitioner-parent trial discussions and telephone interviews with parents. We analysed data using thematic analysis, drawing on the Realpe et al (2016) model for recruitment to trials. Results: We analysed 76 recorded trial discussions and conducted 30 parent telephone interviews. For 19 parents, we had recorded trial discussion and interview data, which were matched for analysis. Parental understanding of the EcLiPSE trial was enhanced when practitioners: provided a comprehensive description of trial aims; explained the reasons for RWPC; discussed uncertainty about which intervention was best; provided a balanced description of trial intervention; provided a clear explanation about randomisation and provided an opportunity for questions. We present a seven-step framework to assist recruitment practice in trials involving RWPC. Conclusion: This study provides a framework to enhance recruitment practice and parental understanding in paediatric emergency and critical care trials involving RWPC. Further testing of this framework is required.

University of Liverpool Repository

Crossref

UWE Bristol Research Repository

Establishing a large prospective clinical cohort in people with head and neck cancer as a biomedical resource: head and neck 5000

Author: Abdelkader Maged
Ahmed Imtiaz
Allmark Christine
Anari Shahram
Andrade Gerard
Baldwin Andrew
Balfour Alistair
Barnes Debi
Beaumont-Jewell Dawn
Benson Richard
Berry Sandeep
Bisase Brian
Brammer Caroline
Carr Ruth
Casasola Richard
Christian Judith
Coatesworth Andrew
Cogill Geoffrey
Cole Naomi
Conway David
Dallas Nicola
Davies Joe
Doyle Margret
Dyker Karen
Dyson P
England James
Evans Andrew
Evans Mererid
Fisher Sheila
Foran Bernie
Forster Martin
Fresco Lydia
Gahir Daljit
Gollins Simon
Goodchild Kate
Gunasekaran Sinnappa P
Hall Charles
Hamid Abdel
Hari Churunal
Hollingworth Will
Homer Jarrod
Hurley Katrina
Hwang David
Hyde Nicholas
Jankowska Petra
Jeffreys Mona
Kim Dae
King Emma
Lamont Alan
Leary Sam
Lees Laura
Lester Jim
Lester Shane
Loo H W
Lowe Rachel
Mano Joseph
McAllister Ken
McCaul James
Mehanna Hisham
Moss Laura
Moule Russell
Ness Andrew Robert
Nutting Chris
Nutting Christopher
O'Hara James
Palaniappan Nachi
Paleri Vinidh
Penfold Chris
Persson Martin
Peters Tim J
Pring Miranda
Repanos Costas
Richards Stuart
Ring Susan
Rogers Simon
Roques Tom
Rowell Nick
Roy Amy
Scrase Christopher
Sen Mehmet
Sheehan Tom
Simcock Richard
Siva Muthu
Stewart Simon
Tatla Taran
Thiruchelvam J K
Thomas Steve
Tierney Paul
Toms Stu
Tyler Jayne
Wagstaff Lynda
Waldron John
Waylen Andrea
Wight Richard
Winter Stuart
Wood Christine
Wood Katie
Worthington Helen
Publication venue: Springer Science and Business Media LLC
Publication date: 01/01/2014

BACKGROUND: Head and neck cancer is an important cause of ill health. Survival appears to be improving but the reasons for this are unclear. They could include evolving aetiology, modifications in care, improvements in treatment or changes in lifestyle behaviour. Observational studies are required to explore survival trends and identify outcome predictors. METHODS: We are identifying people with a new diagnosis of head and neck cancer. We obtain consent that includes agreement to collect longitudinal data, store samples and record linkage. Prior to treatment we give participants three questionnaires on health and lifestyle, quality of life and sexual history. We collect blood and saliva samples, complete a clinical data capture form and request a formalin fixed tissue sample. At four and twelve months we complete further data capture forms and send participants further quality of life questionnaires. DISCUSSION: This large clinical cohort of people with head and neck cancer brings together clinical data, patient-reported outcomes and biological samples in a single co-ordinated resource for translational and prognostic research.

Crossref

University of Birmingham Research Portal

Edge Hill University Research Information Repository

PubMed Central

The University of Manchester - Institutional Repository

Institute of Cancer Research Repository

Explore Bristol Research