Evaluating privacy-preserving record linkage using cryptographic long-term keys and multibit trees on large medical datasets.
Background: Integrating medical data using databases from different sources by record linkage is a powerful technique increasingly used in medical research. Under many jurisdictions, unique personal identifiers needed for linking the records are unavailable. Since sensitive attributes, such as names, have to be used instead, privacy regulations usually demand encrypting these identifiers. The corresponding set of techniques for privacy-preserving record linkage (PPRL) has received widespread attention. One recent method is based on Bloom filters. Due to superior resilience against cryptographic attacks, composite Bloom filters (cryptographic long-term keys, CLKs) are considered best practice for privacy in PPRL. The real-world performance of these techniques on large-scale data has so far been unknown. Methods: Using a large subset of Australian hospital admission data, we tested the performance of an innovative PPRL technique (CLKs using multibit trees) against a gold standard derived from clear-text probabilistic record linkage. Linkage time and linkage quality (recall, precision and F-measure) were evaluated. Results: Clear-text probabilistic linkage resulted in marginally higher precision and recall than CLKs. PPRL required more computing time, but 5 million records could still be de-duplicated within one day. However, the PPRL approach required fine tuning of parameters. Conclusions: We argue that the increased privacy of PPRL comes at the price of small losses in precision and recall and a large increase in computational burden and setup time. These costs seem to be acceptable in most applied settings, but they have to be considered in the decision to apply PPRL. Further research on the optimal automatic choice of parameters is needed.
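The encoding and evaluation steps named in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the bigram padding, the filter length of 1000 bits, the 10 hash functions, the HMAC-based double hashing, and all function names are assumptions chosen for illustration, and the multibit-tree search is omitted in favour of a direct pairwise Dice comparison.

```python
import hashlib
import hmac


def bigrams(value):
    """Split a padded, lower-cased string into character 2-grams."""
    padded = f" {value.strip().lower()} "
    return [padded[i:i + 2] for i in range(len(padded) - 1)]


def clk(fields, secret, size=1000, num_hashes=10):
    """Hash the bigrams of all fields into ONE Bloom filter.

    The filter is returned as the set of bit positions that are set.
    size and num_hashes are illustrative values, not taken from the paper.
    """
    bits = set()
    for field in fields:
        for gram in bigrams(field):
            msg = gram.encode("utf-8")
            # Double hashing: position_i = (h1 + i * h2) mod size
            h1 = int.from_bytes(hmac.new(secret, msg, hashlib.sha1).digest(), "big")
            h2 = int.from_bytes(hmac.new(secret, msg, hashlib.sha256).digest(), "big")
            for i in range(num_hashes):
                bits.add((h1 + i * h2) % size)
    return bits


def dice(a, b):
    """Dice coefficient of two Bloom filters given as sets of bit positions."""
    if not a and not b:
        return 1.0
    return 2 * len(a & b) / (len(a) + len(b))


def linkage_quality(predicted, truth):
    """Precision, recall and F-measure of predicted record pairs against a gold standard."""
    true_pos = len(predicted & truth)
    precision = true_pos / len(predicted) if predicted else 0.0
    recall = true_pos / len(truth) if truth else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    return precision, recall, 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    key = b"shared-secret"  # hypothetical key agreed between the data holders
    a = clk(["smith", "john"], key)
    b = clk(["smyth", "john"], key)
    print(round(dice(a, b), 3))
```

Two records would be declared a match when their Dice coefficient exceeds a chosen threshold; choosing that threshold is one of the parameters the abstract says needs tuning.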
Estimating parameters for probabilistic linkage of privacy-preserved datasets.
Background: Probabilistic record linkage is a process used to bring together person-based records from within the same dataset (de-duplication) or from disparate datasets using pairwise comparisons and matching probabilities. The linkage strategy and associated match probabilities are often estimated through investigations into data quality and manual inspection. However, as privacy-preserved datasets comprise encrypted data, such methods are not possible. In this paper, we present a method for estimating the probabilities and threshold values for probabilistic privacy-preserved record linkage using Bloom filters. Methods: Our method was tested through a simulation study using synthetic data, followed by an application using real-world administrative data. Synthetic datasets were generated with error rates from zero to 20%. Our method was used to estimate parameters (probabilities and thresholds) for de-duplication linkages. Linkage quality was determined by F-measure. Each dataset was privacy-preserved using separate Bloom filters for each field. Match probabilities were estimated using the expectation-maximisation (EM) algorithm on the privacy-preserved data. Threshold cut-off values were determined by an extension to the EM algorithm allowing linkage quality to be estimated for each possible threshold. De-duplication linkages of each privacy-preserved dataset were performed using both estimated and calculated probabilities. Linkage quality using the F-measure at the estimated threshold values was also compared to the highest F-measure. Three large administrative datasets were used to demonstrate the applicability of the probability and threshold estimation technique on real-world data. Results: Linkage of the synthetic datasets using the estimated probabilities produced an F-measure that was comparable to the F-measure using calculated probabilities, even with up to 20% error. Linkage of the administrative datasets using estimated probabilities produced an F-measure that was higher than the F-measure using calculated probabilities. Further, the threshold estimation yielded results for F-measure that were only slightly below the highest possible for those probabilities. Conclusions: The method appears highly accurate across a spectrum of datasets with varying degrees of error. As there are few alternatives for parameter estimation, the approach is a major step towards providing a complete operational approach for probabilistic linkage of privacy-preserved datasets.
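The EM step mentioned in this abstract can be sketched as follows. This is a generic two-class EM for match (m) and non-match (u) agreement probabilities, assuming binary field agreement and conditional independence between fields; it is not the authors' procedure, it omits their threshold-estimation extension, and the function names, starting values and simulated data are all hypothetical.

```python
import math
import random

EPS = 1e-6  # keeps probabilities away from exactly 0 or 1


def _clamp(x):
    return min(max(x, EPS), 1 - EPS)


def em_match_probabilities(patterns, iterations=100, p=0.1, m_init=0.9, u_init=0.1):
    """Estimate match share p and per-field m/u probabilities.

    patterns: list of tuples of 0/1 agreement indicators, one tuple per record pair.
    """
    k = len(patterns[0])
    n = len(patterns)
    m = [m_init] * k
    u = [u_init] * k
    for _ in range(iterations):
        # E-step: posterior probability that each pair is a true match
        weights = []
        for g in patterns:
            pm, pu = p, 1 - p
            for j, agree in enumerate(g):
                pm *= m[j] if agree else 1 - m[j]
                pu *= u[j] if agree else 1 - u[j]
            weights.append(pm / (pm + pu))
        # M-step: re-estimate the parameters from the weighted pairs
        total = sum(weights)
        p = _clamp(total / n)
        for j in range(k):
            m[j] = _clamp(sum(w * g[j] for w, g in zip(weights, patterns)) / total)
            u[j] = _clamp(sum((1 - w) * g[j] for w, g in zip(weights, patterns)) / (n - total))
    return p, m, u


def match_weight(m_j, u_j, agree):
    """Log2 agreement or disagreement weight for one field."""
    return math.log2(m_j / u_j) if agree else math.log2((1 - m_j) / (1 - u_j))


def simulate_patterns(n_pairs, match_share, m_true, u_true, seed=0):
    """Generate synthetic agreement patterns (illustration only)."""
    rng = random.Random(seed)
    patterns = []
    for _ in range(n_pairs):
        probs = m_true if rng.random() < match_share else u_true
        patterns.append(tuple(int(rng.random() < q) for q in probs))
    return patterns


if __name__ == "__main__":
    data = simulate_patterns(5000, 0.1, [0.95] * 4, [0.1] * 4)
    p, m, u = em_match_probabilities(data)
    print(round(p, 3), [round(x, 3) for x in m], [round(x, 3) for x in u])
```

With Bloom-filter encoded fields, the 0/1 agreement indicator for a field would typically come from thresholding a similarity score between the two filters; that step is an assumption here and is left out of the sketch.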
K30, H150, and H168 Are Essential Residues for Coordinating Pyridoxal 5′-Phosphate of O-Acetylserine Sulfhydrylase from Acidithiobacillus ferrooxidans
O-acetylserine sulfhydrylase (OASS) is a key enzyme in the cysteine biosynthesis pathway. The OASS gene from Acidithiobacillus ferrooxidans ATCC 23270 was cloned and expressed in E. coli, and the soluble protein was purified to apparent homogeneity by one-step affinity chromatography. The color and UV–vis scanning results of the recombinant protein confirmed that it is a pyridoxal 5′-phosphate (PLP)-containing protein. Sequence alignment and site-directed mutagenesis of the enzyme revealed that the cofactor PLP is covalently bound in a Schiff base linkage with K30, and that the two residues H150 and H168 are crucial for PLP binding and stabilization.
Age-Related Reference Intervals of the Main Biochemical and Hematological Parameters in C57BL/6J, 129SV/EV and C3H/HeJ Mouse Strains
BACKGROUND: Although the mouse is the animal model most widely used to study the pathogenesis and treatment of human diseases, reference values for biochemical parameters are scanty or lacking for the most frequently used strains. We therefore evaluated these parameters in C57BL/6J, 129SV/EV and C3H/HeJ mice. METHODOLOGY/PRINCIPAL FINDINGS: We measured 26 analytes related to electrolyte balance, lipoprotein metabolism, and muscle/heart, liver, kidney and pancreas functions by dry chemistry, and 5 hematological parameters by automated blood counter, in 30 animals (15 male and 15 female) of each mouse strain at three age ranges: 1-2 months, 3-8 months and 9-12 months. Whole blood was collected from the retro-orbital sinus. We used quality control procedures to investigate analytical imprecision and inaccuracy. Reference values were calculated by nonparametric methods (median and 2.5th and 97.5th percentiles). The Mann-Whitney and Kruskal-Wallis tests were used for between-group comparisons. Median levels of GLU, LDH, Chol and BUN were higher, and LPS, AST, ALP and CHE were lower, in males than in females (p range: 0.05-0.001). Inter-strain differences were observed for: (1) GLU, t-Bil, K+, Ca++, PO(4)- (p<0.05) and for TAG, Chol, AST, Fe++ (p<0.001) in 4-8 month-old animals; (2) for CK, Crea, Mg++, Na+, K+, Cl- (p<0.05) and BUN (p<0.001) in 2- and in 10-12 month-old mice; and (3) for WBC, RBC, HGB, HCT and PLT (p<0.05) during the 1 year life span. CONCLUSION/SIGNIFICANCE: Our results indicate that metabolic variations in C57BL/6J, 129SV/EV and C3H/HeJ mice after therapeutic intervention should be evaluated against gender- and age-dependent reference intervals.
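The nonparametric reference values described in this abstract (median with 2.5th and 97.5th percentiles) can be computed as in the sketch below. The percentile definition used here, linear interpolation between order statistics, is an assumption, since the abstract does not say which definition the authors applied; the function names and the sample values are hypothetical.

```python
def percentile(sorted_values, q):
    """q-th percentile (0-100) by linear interpolation between order statistics."""
    if not sorted_values:
        raise ValueError("percentile of an empty sample is undefined")
    pos = (len(sorted_values) - 1) * q / 100
    lower = int(pos)                              # index at or below the target position
    upper = min(lower + 1, len(sorted_values) - 1)
    frac = pos - lower
    return sorted_values[lower] * (1 - frac) + sorted_values[upper] * frac


def reference_interval(values):
    """Return (2.5th percentile, median, 97.5th percentile) of a sample."""
    ordered = sorted(values)
    return percentile(ordered, 2.5), percentile(ordered, 50), percentile(ordered, 97.5)


if __name__ == "__main__":
    # Made-up measurements for one analyte in one strain, sex and age group
    sample = [112, 98, 130, 121, 105, 140, 117, 109, 125, 133]
    low, median, high = reference_interval(sample)
    print(low, median, high)
```

Other percentile conventions give slightly different limits for small groups, so the interval for a sample of 30 animals depends on the definition chosen.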
Transverse-target-spin asymmetry in exclusive ω-meson electroproduction
Hard exclusive electroproduction of ω mesons is studied with the
HERMES spectrometer at the DESY laboratory by scattering 27.6 GeV positron and
electron beams off a transversely polarized hydrogen target. The amplitudes of
five azimuthal modulations of the single-spin asymmetry of the cross section
with respect to the transverse proton polarization are measured. They are
determined in the entire kinematic region as well as for two bins in photon
virtuality and momentum transfer to the nucleon. Also, a separation of
asymmetry amplitudes into longitudinal and transverse components is done. These
results are compared to a phenomenological model that includes the pion pole
contribution. Within this model, the data favor a positive πω
transition form factor. Comment: DESY Report 15-14