278 research outputs found
Holographic data visualization: using synthetic full-parallax holography to share information
This investigation explores representing information through data visualization using the medium of holography. It is an exploration from the perspective of a creative practitioner deploying a transdisciplinary approach. The task of visualizing and making use of data and "big data" has been the focus of a large number of research projects during the opening of this century. As the amount of data that can be gathered has increased in a short time, our ability to comprehend and extract meaning from the numbers has come into focus. This project looks at the possibility of employing three-dimensional imaging using holography to visualize data and additional information. To explore the viability of the concept, the project set out to transform the visualization of calculated energy and fluid flow data to a holographic medium. A Computational Fluid Dynamics (CFD) model of flow around a vehicle and a model of solar irradiation on a building were chosen to investigate the process. As no pre-existing software is available to directly transform the data into a compatible format, the team worked collaboratively and across disciplines to achieve an accurate conversion from the format of the calculation and visualization tools to a configuration suitable for synthetic holography production. The project also investigates ideas for layout and design suitable for holographic visualization of energy data. Two completed holograms will be presented. Future possibilities for developing the concept of Holographic Data Visualization are briefly discussed. (c) 2017, Society of Photo-Optical Instrumentation Engineers (SPIE). Downloading of the abstract is permitted for personal use only
Differentially Private Model Selection with Penalized and Constrained Likelihood
In statistical disclosure control, the goal of data analysis is twofold: The
released information must provide accurate and useful statistics about the
underlying population of interest, while minimizing the potential for an
individual record to be identified. In recent years, the notion of differential
privacy has received much attention in theoretical computer science, machine
learning, and statistics. It provides a rigorous and strong notion of
protection for individuals' sensitive information. A fundamental question is
how to incorporate differential privacy into traditional statistical inference
procedures. In this paper we study model selection in multivariate linear
regression under the constraint of differential privacy. We show that model
selection procedures based on penalized least squares or likelihood can be made
differentially private by a combination of regularization and randomization,
and propose two algorithms to do so. We show that our private procedures are
consistent under essentially the same conditions as the corresponding
non-private procedures. We also find that under differential privacy, the
procedure becomes more sensitive to the tuning parameters. We illustrate and
evaluate our method using simulation studies and two real data examples
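The abstract mentions "a combination of regularization and randomization" without giving details. As an illustration only, and not the paper's algorithm, the sketch below shows one standard way to randomize a choice among candidate models: the exponential mechanism, which samples a model with probability that decays exponentially in its penalized score. The function name, the `scores` list, and the `sensitivity` bound (how much one record can change any score) are assumptions supplied by the caller.

```python
import math
import random


def exponential_mechanism(scores, epsilon, sensitivity, rng=random):
    """Sample an index with probability proportional to
    exp(-epsilon * score / (2 * sensitivity)); lower score means better model.

    `sensitivity` is an assumed upper bound on how much a single record
    can change any one score; it is not derived here.
    """
    best = min(scores)
    # Subtracting the best score keeps the exponents non-positive (no overflow)
    # and does not change the sampling probabilities.
    weights = [math.exp(-epsilon * (s - best) / (2.0 * sensitivity)) for s in scores]
    threshold = rng.random() * sum(weights)
    cumulative = 0.0
    for index, weight in enumerate(weights):
        cumulative += weight
        if threshold < cumulative:
            return index
    return len(weights) - 1  # guard against floating-point round-off


# Hypothetical penalized least-squares scores for three candidate models.
candidate_scores = [12.4, 9.8, 10.1]
chosen = exponential_mechanism(candidate_scores, epsilon=1.0, sensitivity=1.0)
```

A smaller `epsilon` gives stronger privacy but makes the selection closer to uniform, which is consistent with the abstract's remark that the private procedure is more sensitive to tuning parameters.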
Incidence of Obesity Among Young US Children Living in Low-Income Families, 2008–2011
OBJECTIVE: To examine the incidence and reversal of obesity among young low-income children and variations across population subgroups.
METHODS: We included 1.2 million participants in federally funded child health and nutrition programs who were 0 to 23 months old in 2008 and were followed up 24 to 35 months later in 2010–2011. Weight and height were measured. Obesity at baseline was defined as gender-specific weight-for-length ≥95th percentile on the 2000 Centers for Disease Control and Prevention growth charts. Obesity at follow-up was defined as gender-specific BMI-for-age ≥95th percentile. We used a multivariable log-binomial model to estimate relative risk of obesity adjusting for gender, baseline age, race/ethnicity, duration of follow-up, and baseline weight-for-length percentile.
RESULTS: The incidence of obesity was 11.0% after the follow-up period. The incidence was significantly higher among boys versus girls and higher among children aged 0 to 11 months at baseline versus those older. Compared with non-Hispanic whites, the risk of obesity was 35% higher among Hispanics and 49% higher among American Indians (AIs)/Alaska Natives (ANs), but 8% lower among non-Hispanic African Americans. Among children who were obese at baseline, 36.5% remained obese and 63.5% were nonobese at follow-up. The proportion of children whose obesity reversed was significantly lower among Hispanics and AIs/ANs than among other racial/ethnic groups.
CONCLUSIONS: The high incidence underscores the importance of early-life obesity prevention in multiple settings for low-income children and their families. The variations within population subgroups suggest that culturally appropriate intervention efforts should be focused on Hispanics and AIs/ANs
Can a supernova be located by its neutrinos?
A future core-collapse supernova in our Galaxy will be detected by several
neutrino detectors around the world. The neutrinos escape from the supernova
core over several seconds from the time of collapse, unlike the electromagnetic
radiation, emitted from the envelope, which is delayed by a time of order
hours. In addition, the electromagnetic radiation can be obscured by dust in
the intervening interstellar space. The question therefore arises whether a
supernova can be located by its neutrinos alone. The early warning of a
supernova and its location might allow greatly improved astronomical
observations. The theme of the present work is a careful and realistic
assessment of this question, taking into account the statistical significance
of the various neutrino signals. Not surprisingly, neutrino-electron forward
scattering leads to a good determination of the supernova direction, even in
the presence of the large and nearly isotropic background from other reactions.
Even with the most pessimistic background assumptions, SuperKamiokande (SK) and
the Sudbury Neutrino Observatory (SNO) can restrict the supernova direction to
be within circles of radius and , respectively. Other
reactions with more events but weaker angular dependence are much less useful
for locating the supernova. Finally, there is the oft-discussed possibility of
triangulation, i.e., determination of the supernova direction based on an
arrival time delay between different detectors. Given the expected statistics
we show that, contrary to previous estimates, this technique does not allow a
good determination of the supernova direction.
Comment: 11 pages including 2 figures. Revised version corrects typos, adds
some brief comment
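The triangulation idea in the abstract rests on simple geometry: for two detectors separated by a baseline d, a signal travelling at essentially the speed of light c arrives with a delay Δt = (d/c)·cos θ, where θ is the angle between the baseline and the direction of travel. The sketch below works through that arithmetic in an idealized form; the function names are illustrative, and it deliberately ignores the event statistics that the abstract identifies as the limiting factor.

```python
C_KM_PER_S = 299_792.458  # speed of light in km/s


def max_delay_s(baseline_km):
    """Largest possible arrival-time difference for a given baseline."""
    return baseline_km / C_KM_PER_S


def cos_theta_from_delay(delay_s, baseline_km):
    """Cosine of the angle between the baseline and the travel direction."""
    return C_KM_PER_S * delay_s / baseline_km


def cos_theta_uncertainty(timing_error_s, baseline_km):
    """Uncertainty in cos(theta) caused by an error in the measured delay."""
    return C_KM_PER_S * timing_error_s / baseline_km


# Even a baseline as long as Earth's diameter (about 12,742 km) gives a
# maximum delay of only a few tens of milliseconds.
earth_diameter_km = 12_742.0
longest_delay = max_delay_s(earth_diameter_km)

# A hypothetical 1 ms error in the measured delay, on that same baseline.
error_in_cos_theta = cos_theta_uncertainty(0.001, earth_diameter_km)
```

Because the whole usable delay is so short, the timing error on each detector's signal must be small compared with it for the direction to be constrained at all.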
Exclusive Strategy for Generalization Algorithms in Micro-data Disclosure
Abstract. When generalization algorithms are known to the public, an adversary can obtain a more precise estimation of the secret table than what can be deduced from the disclosed generalization result. Therefore, whether a generalization algorithm can satisfy a privacy property should be judged based on such an estimation. In this paper, we show that the computation of the estimation is inherently a recursive process that exhibits a high complexity when generalization algorithms take a straightforward inclusive strategy. To facilitate the design of more efficient generalization algorithms, we suggest an alternative exclusive strategy, which adopts a seemingly drastic approach to eliminate the need for recursion. Surprisingly, the data utility of the two strategies is actually not comparable and the exclusive strategy can provide better data utility in certain cases.
Anonymisation of geographical distance matrices via Lipschitz embedding
BACKGROUND: Anonymisation of spatially referenced data has received increasing attention in recent years. Whereas the research focus has been on the anonymisation of point locations, the disclosure risk arising from the publishing of inter-point distances and corresponding anonymisation methods have not been studied systematically.
METHODS: We propose a new anonymisation method for the release of geographical distances between records of a microdata file, for example patients in a medical database. We discuss a data release scheme in which microdata without coordinates and an additional distance matrix between the corresponding rows of the microdata set are released. In contrast to most other approaches, this method preserves small distances better than larger distances. The distances are modified by a variant of Lipschitz embedding.
RESULTS: The effects of the embedding parameters on the risk of data disclosure are evaluated by linkage experiments using simulated data. The results indicate small disclosure risks for appropriate embedding parameters.
CONCLUSION: The proposed method is useful if published distance information might be misused for the re-identification of records. The method can be used for publishing scientific-use files and as an additional tool for record-linkage studies
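For readers unfamiliar with the technique named in the abstract, the sketch below shows a basic Lipschitz embedding, not the paper's variant: each point is replaced by its vector of distances to a handful of reference sets, and distances are then measured between those vectors. The function names and the choice of the maximum-coordinate (Chebyshev) metric in the embedded space are assumptions made for illustration.

```python
import math


def lipschitz_embed(points, reference_sets):
    """Map each point to its distances to the reference sets.

    The distance from a point to a set is the distance to the nearest
    member of that set.
    """
    return [
        [min(math.dist(p, r) for r in ref_set) for ref_set in reference_sets]
        for p in points
    ]


def chebyshev(u, v):
    """Largest absolute coordinate difference between two vectors."""
    return max(abs(a - b) for a, b in zip(u, v))


def embedded_distance_matrix(points, reference_sets):
    """Pairwise distances between embedded points (coordinates not needed)."""
    embedded = lipschitz_embed(points, reference_sets)
    return [[chebyshev(u, v) for v in embedded] for u in embedded]


# Hypothetical planar coordinates and reference sets.
locations = [(0.0, 0.0), (3.0, 4.0), (1.0, 1.0), (6.0, 8.0)]
references = [[(0.0, 0.0)], [(3.0, 4.0), (6.0, 8.0)]]
released_matrix = embedded_distance_matrix(locations, references)
```

By the triangle inequality, each embedded distance computed this way is at most the original distance, so the released matrix never overstates how far apart two records are; how much it understates depends on the reference sets chosen.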
Functional anonymisation: Personal data and the data environment
Anonymisation of personal data has a long history stemming from the expansion of the types of data products routinely provided by National Statistical Institutes. Variants on anonymisation have received serious criticism reinforced by much-publicised apparent failures. We argue that both the operators of such schemes and their critics have become confused by being overly focused on the properties of the data themselves. We claim that, far from being able to determine whether data are anonymous (and therefore non-personal) by looking at the data alone, any anonymisation technique worthy of the name must take account of not only the data but also their environment. This paper proposes an alternative formulation called functional anonymisation that focuses on the relationship between the data and the environment within which the data exist (their data environment). We provide a formulation for describing the relationship between the data and their environment that links the legal notion of personal data with the statistical notion of disclosure control. Anonymisation, properly conceived and effectively conducted, can be a critical part of the toolkit of the privacy-respecting data controller and the wider remit of providing accurate and usable data