Preprocessing Solar Images while Preserving their Latent Structure
Telescopes such as the Atmospheric Imaging Assembly aboard the Solar Dynamics
Observatory, a NASA satellite, collect massive streams of high resolution
images of the Sun through multiple wavelength filters. Reconstructing
pixel-by-pixel thermal properties based on these images can be framed as an
ill-posed inverse problem with Poisson noise, but this reconstruction is
computationally expensive and there is disagreement among researchers about
what regularization or prior assumptions are most appropriate. This article
presents an image segmentation framework for preprocessing such images in order
to reduce the data volume while preserving as much thermal information as
possible for later downstream analyses. The resulting segmented images reflect
thermal properties but do not depend on solving the ill-posed inverse problem.
This allows users to avoid the Poisson inverse problem altogether or to tackle
it on each of ~10 segments rather than on each of ~10^5 pixels,
reducing computing time by a factor of ~10^4. We employ a parametric
class of dissimilarities that can be expressed as cosine dissimilarity
functions or Hellinger distances between nonlinearly transformed vectors of
multi-passband observations in each pixel. We develop a decision theoretic
framework for choosing the dissimilarity that minimizes the expected loss that
arises when estimating identifiable thermal properties based on segmented
images rather than on a pixel-by-pixel basis. We also examine the efficacy of
different dissimilarities for recovering clusters in the underlying thermal
properties. The expected losses are computed under scientifically motivated
prior distributions. Two simulation studies guide our choices of dissimilarity
function. We illustrate our method by segmenting images of a coronal hole
observed on 26 February 2015.
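As a rough illustration of the dissimilarity family described above, the sketch below (assuming NumPy; the function names and the six-passband example vectors are hypothetical) computes a Hellinger distance and a cosine dissimilarity between power-transformed ("tempered") vectors of per-pixel passband proportions. It shows the general idea only, not the paper's exact parameterization or decision-theoretic tuning.

```python
import numpy as np

def temper(p, alpha=0.5):
    """Power-transform ("temper") a probability vector and renormalize.
    alpha = 0.5 corresponds to the square-root map underlying the Hellinger distance."""
    q = p ** alpha
    return q / q.sum()

def hellinger(p, q):
    """Hellinger distance between two probability vectors."""
    return np.sqrt(0.5 * np.sum((np.sqrt(p) - np.sqrt(q)) ** 2))

def cosine_dissimilarity(p, q):
    """1 - cosine similarity between two (possibly transformed) vectors."""
    return 1.0 - np.dot(p, q) / (np.linalg.norm(p) * np.linalg.norm(q))

# Hypothetical example: per-pixel proportions of counts in six passbands
pixel_a = np.array([0.30, 0.25, 0.15, 0.12, 0.10, 0.08])
pixel_b = np.array([0.05, 0.10, 0.20, 0.25, 0.22, 0.18])

print(hellinger(pixel_a, pixel_b))
print(cosine_dissimilarity(temper(pixel_a), temper(pixel_b)))
```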
H-means image segmentation to identify solar thermal features
Properly segmenting multiband images of the Sun by their thermal properties will help determine the thermal structure of the solar corona. However, off-the-shelf segmentation algorithms are typically inappropriate because temperature information is captured by the relative intensities in different passbands, while the absolute levels are not relevant. Input features are therefore pixel-wise proportions of photons observed in each band. To segment solar images based on these proportions, we use a modification of k-means clustering that we call the H-means algorithm because it uses the Hellinger distance to compare probability vectors. H-means has a closed-form expression for cluster centroids, so computation is as fast as k-means. Tempering the input probability vectors reveals a broader class of H-means algorithms which include spherical k-means clustering. More generally, H-means can be used anytime the input feature is a probability distribution, and hence is useful beyond image segmentation applications.
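A minimal sketch of an H-means-style clustering loop is given below, assuming NumPy; the function name and toy data are illustrative, not the authors' implementation. The closed-form centroid update follows from minimizing the total squared Hellinger distance within a cluster: average the square-root vectors, square the result, and renormalize to a probability vector.

```python
import numpy as np

def h_means(P, k, n_iter=100, seed=0):
    """Cluster rows of P (pixel-wise probability vectors) by squared Hellinger distance."""
    rng = np.random.default_rng(seed)
    n = P.shape[0]
    R = np.sqrt(P)                               # work on the square-root scale
    centers = P[rng.choice(n, size=k, replace=False)]
    for _ in range(n_iter):
        # squared Hellinger distance = 1 - Bhattacharyya coefficient
        d2 = 1.0 - R @ np.sqrt(centers).T        # shape (n, k)
        labels = d2.argmin(axis=1)
        new_centers = np.empty_like(centers)
        for j in range(k):
            if np.any(labels == j):
                s = R[labels == j].mean(axis=0)  # mean of square-root vectors
            else:                                # restart an empty cluster
                s = np.sqrt(P[rng.integers(n)])
            c = s ** 2                           # square and renormalize
            new_centers[j] = c / c.sum()
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels, centers

# Toy example: 200 "pixels" with proportions in 4 passbands
rng = np.random.default_rng(1)
P = rng.dirichlet([2, 1, 1, 1], size=200)
labels, centers = h_means(P, k=3)
```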
Detecting Unspecified Structure in Low-Count Images
Unexpected structure in images of astronomical sources often presents itself
upon visual inspection of the image, but such apparent structure may either
correspond to true features in the source or be due to noise in the data. This
paper presents a method for testing whether inferred structure in an image with
Poisson noise represents a significant departure from a baseline (null) model
of the image. To infer image structure, we conduct a Bayesian analysis of a
full model that uses a multiscale component to allow flexible departures from
the posited null model. As a test statistic, we use a tail probability of the
posterior distribution under the full model. This choice of test statistic
allows us to estimate a computationally efficient upper bound on a p-value that
enables us to draw strong conclusions even when there are limited computational
resources that can be devoted to simulations under the null model. We
demonstrate the statistical performance of our method on simulated images.
Applying our method to an X-ray image of the quasar 0730+257, we find
significant evidence against the null model of a single point source and
uniform background, lending support to the claim of an X-ray jet.
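The general device of drawing conclusions from a limited number of null-model simulations can be illustrated with a standard conservative Monte Carlo p-value plus a binomial upper confidence bound. The sketch below (assuming NumPy and SciPy, with a hypothetical function name) shows that generic idea; it is not the specific upper bound derived in the paper.

```python
import numpy as np
from scipy.stats import beta

def mc_pvalue_upper_bound(t_obs, t_null, conf=0.95):
    """Conservative Monte Carlo p-value and a one-sided upper confidence bound.

    t_obs  : observed test statistic (e.g., a posterior tail probability)
    t_null : test statistics from simulations under the null model
    """
    t_null = np.asarray(t_null)
    n = len(t_null)
    b = int(np.sum(t_null >= t_obs))
    p_hat = (b + 1) / (n + 1)                    # standard conservative estimate
    # Clopper-Pearson-style upper bound for the null exceedance probability
    p_upper = beta.ppf(conf, b + 1, n - b) if b < n else 1.0
    return p_hat, p_upper
```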
Does Function Follow Organizational Form? Evidence From the Lending Practices of Large and Small Banks
Theories based on incomplete contracting suggest that small organizations may do better than large organizations in activities that require the processing of soft information. We explore this idea in the context of bank lending to small firms, an activity that is typically thought of as relying heavily on soft information. We find that large banks are less willing than small banks to lend to informationally 'difficult' credits, such as firms that do not keep formal financial records. Moreover, controlling for the endogeneity of bank-firm matching, large banks lend at a greater distance, interact more impersonally with their borrowers, have shorter and less exclusive relationships, and do not alleviate credit constraints as effectively. All of this is consistent with small banks being better able to collect and act on soft information than large banks.
Advances in Empirical Bayes Modeling and Bayesian Computation
Chapter 1 of this thesis focuses on accelerating perfect sampling algorithms for a Bayesian hierarchical model. A discrete data augmentation scheme together with two different parameterizations yields two Gibbs samplers for sampling from the posterior distribution of the hyperparameters of the Dirichlet-multinomial hierarchical model under a default prior distribution. The finite-state space nature of this data augmentation permits us to construct two perfect samplers using bounding chains that take advantage of monotonicity and anti-monotonicity in the target posterior distribution, but both are impractically slow. We demonstrate however that a composite algorithm that strategically alternates between the two samplers' updates can be substantially faster than either individually. We theoretically bound the expected time until coalescence for the composite algorithm, and show via simulation that the theoretical bounds can be close to actual performance. Chapters 2 and 3 introduce a strategy for constructing scientifically sensible priors in complex models. We call these priors catalytic priors to suggest that adding such prior information catalyzes our ability to use richer, more realistic models. Because they depend on observed data, catalytic priors are a tool for empirical Bayes modeling. The overall perspective is data-driven: catalytic priors have a pseudo-data interpretation, and the building blocks are alternative plausible models for observations, yielding behavior similar to hierarchical models but with a conceptual shift away from distributional assumptions on parameters. The posterior under a catalytic prior can be viewed as an optimal approximation to a target measure, subject to a constraint on the posterior distribution's predictive implications. In Chapter 3, we apply catalytic priors to several familiar models and investigate the performance of the resulting posterior distributions. We also illustrate the application of catalytic priors in a preliminary analysis of the effectiveness of a job training program, which is complicated by the need to account for noncompliance, partially defined outcomes, and missing outcome data.
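The pseudo-data interpretation of a catalytic prior can be sketched as follows, assuming NumPy and scikit-learn: synthetic observations are drawn from a simpler fitted model and added to the observed data with a small total weight before fitting the richer model. The specific choices below (an intercept-only generating model, resampled covariates, and the weight tau) are illustrative assumptions, not the specification used in the thesis.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

# Small observed data set: n = 30 observations, 5 predictors
n, d = 30, 5
X = rng.normal(size=(n, d))
y = rng.binomial(1, 1 / (1 + np.exp(-X[:, 0])))

# Step 1: a simple "generating" model -- here just the marginal success rate
p0 = y.mean()

# Step 2: synthetic (pseudo) data drawn from the simple model
M = 200
X_syn = X[rng.integers(n, size=M)]              # resampled covariate values
y_syn = rng.binomial(1, p0, size=M)

# Step 3: weighted fit of the richer model; pseudo-data carry total weight tau
tau = 5.0
X_aug = np.vstack([X, X_syn])
y_aug = np.concatenate([y, y_syn])
w = np.concatenate([np.ones(n), np.full(M, tau / M)])

fit = LogisticRegression(C=1e6, max_iter=1000).fit(X_aug, y_aug, sample_weight=w)
print(fit.coef_)
```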
Mendelian randomization in family data
The phrase "mendelian randomization" has become associated with the use of genetic polymorphisms to uncover causal relationships between phenotypic variables. The statistical methods useful in mendelian randomization are known as instrumental variable techniques. We present an approach to instrumental variable estimation that is useful in family data and is robust to the use of weak instruments. We illustrate our method to measure the causal influence of low-density lipoprotein on high-density lipoprotein, body mass index, triglycerides, and systolic blood pressure. We use the Framingham Heart Study data as distributed to participants in the Genetics Analysis Workshop 16
Comparison of univariate and multivariate linkage analysis of traits related to hypertension
Complex traits are often manifested by multiple correlated traits. One example of this is hypertension (HTN), which is measured on a continuous scale by systolic blood pressure (SBP). Predisposition to HTN is predicted by hyperlipidemia, characterized by elevated triglycerides (TG), low-density lipids (LDL), and high-density lipids (HDL). We hypothesized that the multivariate analysis of TG, LDL, and HDL would be more powerful for detecting HTN genes via linkage analysis compared with univariate analysis of SBP. We conducted linkage analysis of four chromosomal regions known to contain genes associated with HTN using SBP as a measure of HTN in univariate Haseman-Elston regression and using the correlated traits TG, LDL, and HDL in multivariate Haseman-Elston regression. All analyses were conducted using the Framingham Heart Study data. We found that multivariate linkage analysis was better able to detect chromosomal regions in which the angiotensinogen (AGT), angiotensin receptor, guanine nucleotide-binding protein 3, and prostaglandin I2 synthase genes reside. Univariate linkage analysis only detected the AGT gene. We conclude that multivariate analysis is appropriate for the analysis of multiple correlated phenotypes, and our findings suggest that it may yield new linkage signals undetected by univariate analysis.
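A toy sketch of the univariate Haseman-Elston idea, assuming NumPy and SciPy: regress the squared trait difference of each sibling pair on the proportion of alleles shared identical by descent (IBD) at a marker and look for a significantly negative slope. The function name and simulated data are illustrative only; the study itself applies univariate and multivariate Haseman-Elston regression to the Framingham data.

```python
import numpy as np
from scipy import stats

def haseman_elston(trait_pairs, ibd_share):
    """Univariate Haseman-Elston regression for sibling pairs.

    trait_pairs : array of shape (n_pairs, 2), trait values for each sib pair
    ibd_share   : proportion of alleles shared IBD at the marker (0, 0.5, or 1)

    A significantly negative slope is evidence of linkage at the marker.
    """
    sq_diff = (trait_pairs[:, 0] - trait_pairs[:, 1]) ** 2
    slope, intercept, r, p_two_sided, se = stats.linregress(ibd_share, sq_diff)
    p_one_sided = p_two_sided / 2 if slope < 0 else 1 - p_two_sided / 2
    return slope, p_one_sided

# Simulated linked marker: sibs sharing more alleles IBD have more similar traits
rng = np.random.default_rng(0)
pi = rng.choice([0.0, 0.5, 1.0], size=500, p=[0.25, 0.5, 0.25])
shared = rng.normal(size=500)
y1 = np.sqrt(pi) * shared + np.sqrt(1 - pi) * rng.normal(size=500)
y2 = np.sqrt(pi) * shared + np.sqrt(1 - pi) * rng.normal(size=500)
print(haseman_elston(np.column_stack([y1, y2]), pi))
```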
Randomized Controlled Trial of Prophylactic Antibiotics for Dog Bites with Refined Cost Model