New Approaches To Photometric Redshift Prediction Via Gaussian Process Regression In The Sloan Digital Sky Survey
Expanding upon the work of Way and Srivastava (2006), we demonstrate how the use
of training sets of comparable size continues to make Gaussian process
regression (GPR) a competitive approach to that of neural networks and other
least-squares fitting methods. This is made possible by new large-scale matrix
inversion techniques developed for Gaussian processes (GPs) that do not require
that the kernel matrix be sparse. This development, combined with a
neural-network kernel function, appears to give superior results for this
problem. Our best-fit results for the Sloan Digital Sky Survey (SDSS) Main
Galaxy Sample using the u, g, r, i, z filters give an rms error of 0.0201, while our
results for the same filters in the Luminous Red Galaxy sample yield 0.0220. We
also demonstrate that there appears to be a minimum number of training-set
galaxies needed to obtain the optimal fit when using our GPR rank-reduction
methods. We find that morphological information included with many photometric
surveys appears, for the most part, to make the photometric redshift evaluation
slightly worse rather than better. This would indicate that, from the GP point
of view, most morphological information simply adds noise in the data used
herein. In addition, we show that cross-match catalog results involving
combinations of the Two Micron All Sky Survey, SDSS, and Galaxy Evolution
Explorer have to be evaluated in the context of the resulting cross-match
magnitude and redshift distribution. Otherwise one may be misled into overly
optimistic conclusions.
Comment: 32 pages, ApJ in press; 2 new figures, 1 new table of comparison
methods, updated discussion, references and typos to reflect the version in press
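The GPR approach described above can be sketched minimally in numpy. This is a generic illustration on synthetic data with a plain squared-exponential kernel, not the paper's neural-network kernel, rank-reduction inversion, or SDSS sample; the shapes (200 training "galaxies", 5 "filters") are arbitrary:

```python
import numpy as np

rng = np.random.default_rng(0)

def rbf_kernel(A, B, length_scale=1.0):
    # Squared-exponential kernel: k(a, b) = exp(-|a - b|^2 / (2 l^2)).
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-0.5 * d2 / length_scale**2)

# Synthetic training set: 200 objects x 5 magnitudes, smooth scalar target
# standing in for redshift (NOT real photometry).
X_train = rng.normal(size=(200, 5))
y_train = np.sin(X_train.sum(axis=1))
X_test = rng.normal(size=(10, 5))

noise = 1e-3
K = rbf_kernel(X_train, X_train) + noise * np.eye(len(X_train))
K_star = rbf_kernel(X_test, X_train)

# GP posterior mean: K_* K^{-1} y. A linear solve is used instead of an
# explicit inverse; the paper's contribution is avoiding exactly this
# O(n^3) cost via rank-reduction, which this sketch does not implement.
y_pred = K_star @ np.linalg.solve(K, y_train)
print(y_pred.shape)  # (10,)
```

The cubic cost of the solve is why dense-kernel GPR was long considered impractical for large training sets, and why the matrix-inversion techniques the abstract mentions matter.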
Multi-Document Summarization via Discriminative Summary Reranking
Existing multi-document summarization systems usually rely on a specific
summarization model (i.e., a summarization method with a specific parameter
setting) to extract summaries for different document sets with different
topics. However, according to our quantitative analysis, none of the existing
summarization models can always produce high-quality summaries for different
document sets, and even a summarization model with good overall performance may
produce low-quality summaries for some document sets. Conversely, a
baseline summarization model may produce high-quality summaries for some
document sets. Based on the above observations, we treat the summaries produced
by different summarization models as candidate summaries, and then explore
discriminative reranking techniques to identify high-quality summaries from the
candidates for different document sets. We propose to extract a set of
candidate summaries for each document set based on an ILP framework, and then
leverage Ranking SVM for summary reranking. Various useful features have been
developed for the reranking process, including word-level features,
sentence-level features and summary-level features. Evaluation results on the
benchmark DUC datasets validate the efficacy and robustness of our proposed
approach.
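The Ranking-SVM reranking step can be sketched with the standard pairwise-difference reduction: a linear scorer is trained so that a better candidate summary outscores a worse one. The three features and the quality signal below are synthetic stand-ins, not the paper's word-, sentence-, and summary-level feature set:

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical setup: 100 candidate summaries, 3 features each; the true
# quality (a stand-in for a ROUGE-like score) is a linear function.
w_true = np.array([1.0, 0.5, 2.0])
X = rng.normal(size=(100, 3))
quality = X @ w_true

# Pairwise reduction: difference x_i - x_j, signed so the better
# candidate's features come first.
pairs = []
for _ in range(500):
    i, j = rng.integers(0, 100, size=2)
    if quality[i] == quality[j]:
        continue
    sign = 1.0 if quality[i] > quality[j] else -1.0
    pairs.append(sign * (X[i] - X[j]))
P = np.array(pairs)

# Pairwise hinge loss max(0, 1 - w . d), minimized by plain subgradient
# descent (the SVM regularization term is omitted for brevity).
w = np.zeros(3)
for _ in range(200):
    margins = P @ w
    grad = -(P[margins < 1].sum(axis=0)) / len(P)
    w -= 0.5 * grad

# Rerank: choose the candidate with the highest learned score.
best = int(np.argmax(X @ w))
```

In the paper's setting, `X` would hold the engineered features for the ILP-generated candidate summaries of one document set, and the pairwise constraints would come from reference-based quality scores on training topics.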
The effect of the color filter array layout choice on state-of-the-art demosaicing
Interpolation from a Color Filter Array (CFA) is the most common method for obtaining full color image data. Its success relies on the smart combination of a CFA and a demosaicing algorithm. Demosaicing, on the one hand, has been extensively studied. Algorithmic development in the past 20 years ranges from simple linear interpolation to modern neural-network-based (NN) approaches that encode the prior knowledge of millions of training images to fill in missing data in an inconspicuous way. CFA design, on the other hand, is less well studied, although still recognized to strongly impact demosaicing performance. This is because demosaicing algorithms are typically limited to one particular CFA pattern, impeding straightforward CFA comparison. This is starting to change with newer classes of demosaicing that may be considered generic or CFA-agnostic. In this study, by comparing the performance of two state-of-the-art generic algorithms, we evaluate the potential of modern CFA-demosaicing. We test the hypothesis that, with the increasing power of NN-based demosaicing, the influence of optimal CFA design on system performance decreases. This hypothesis is supported by the experimental results. Such a finding would herald the possibility of relaxing CFA requirements, providing more freedom in the CFA design choice and producing high-quality cameras.
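To make the CFA/demosaicing pipeline concrete, here is the classical bilinear baseline for the common RGGB Bayer pattern, the "simple linear interpolation" end of the 20-year range the abstract describes, not one of the generic NN-based algorithms it evaluates:

```python
import numpy as np

def conv2(img, k):
    # Tiny zero-padded 2-D convolution (kernels here are 3x3).
    ph, pw = k.shape[0] // 2, k.shape[1] // 2
    p = np.pad(img, ((ph, ph), (pw, pw)))
    out = np.zeros_like(img)
    for dy in range(k.shape[0]):
        for dx in range(k.shape[1]):
            out += k[dy, dx] * p[dy:dy + img.shape[0], dx:dx + img.shape[1]]
    return out

def demosaic_rggb(mosaic):
    # Masks selecting which sensor pixels carry which color in RGGB:
    # R at (even, even), B at (odd, odd), G elsewhere.
    h, w = mosaic.shape
    yy, xx = np.mgrid[:h, :w]
    r_mask = ((yy % 2 == 0) & (xx % 2 == 0)).astype(float)
    b_mask = ((yy % 2 == 1) & (xx % 2 == 1)).astype(float)
    g_mask = 1.0 - r_mask - b_mask

    # Bilinear interpolation kernels: green averages its 4 neighbors,
    # red/blue average 2 or 4 depending on position.
    k_g = np.array([[0, 1, 0], [1, 4, 1], [0, 1, 0]]) / 4.0
    k_rb = np.array([[1, 2, 1], [2, 4, 2], [1, 2, 1]]) / 4.0

    r = conv2(mosaic * r_mask, k_rb)
    g = conv2(mosaic * g_mask, k_g)
    b = conv2(mosaic * b_mask, k_rb)
    return np.stack([r, g, b], axis=-1)

# Sanity check: a flat gray scene survives demosaicing unchanged away
# from the zero-padded border.
flat = np.full((8, 8), 0.5)
rgb = demosaic_rggb(flat)
print(rgb.shape)  # (8, 8, 3)
```

Note how the kernels are tied to the RGGB layout: this hard-wiring of algorithm to pattern is exactly the obstacle to CFA comparison that the abstract says generic, CFA-agnostic demosaicers remove.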
Learning Object Categories From Internet Image Searches
In this paper, we describe a simple approach to learning models of visual object categories from images gathered from Internet image search engines. The images for a given keyword are typically highly variable, with a large fraction being unrelated to the query term, and thus pose a challenging environment from which to learn. By training our models directly from Internet images, we remove the need to laboriously compile training data sets, required by most other recognition approaches; this opens up the possibility of learning object category models "on-the-fly." We describe two simple approaches, derived from the probabilistic latent semantic analysis (pLSA) technique for text document analysis, that can be used to automatically learn object models from these data. We show two applications of the learned model: first, to rerank the images returned by the search engine, thus improving the quality of the search results; and second, to recognize objects in other image data sets.