
    Kernel Regression For Determining Photometric Redshifts From Sloan Broadband Photometry

    We present a new approach, kernel regression, to determine photometric redshifts for 399,929 galaxies in the Fifth Data Release of the Sloan Digital Sky Survey (SDSS). In our case, kernel regression is a weighted average of the spectral redshifts of the neighbors of a query point, where higher weights are assigned to points that are closer to the query point. One important design decision when using kernel regression is the choice of the bandwidth. We apply 10-fold cross-validation to choose the optimal bandwidth, taken as the value at which the cross-validation error reaches its minimum. The experiments show that the optimal bandwidth differs across input patterns. The lowest rms error of the photometric redshift estimates is 0.019, obtained with color+eClass as the inputs; the next lowest is 0.020, with ugriz+eClass as the inputs; and the rms scatter is 0.021 with color+r as the inputs. Here eClass is a galaxy spectral type. Comment: 6 pages, 2 figures, accepted for publication in MNRA
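
    The weighted-average estimator and the 10-fold cross-validated bandwidth search described in this abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the Gaussian kernel, the Euclidean distance, the function names (`kernel_regression`, `cv_rms_error`, `select_bandwidth`), the candidate bandwidths, and the synthetic data are all assumptions, since the abstract does not specify them.

```python
import numpy as np


def kernel_regression(X_train, y_train, X_query, bandwidth):
    """Kernel-weighted average of training targets for each query point."""
    # Squared Euclidean distance between every query and training point.
    d2 = ((X_query[:, None, :] - X_train[None, :, :]) ** 2).sum(axis=2)
    # Gaussian kernel (an assumption): closer neighbors get larger weights.
    # Subtracting each row's minimum leaves the weighted average unchanged
    # but guarantees at least one weight equals 1, avoiding division by zero.
    d2 = d2 - d2.min(axis=1, keepdims=True)
    w = np.exp(-d2 / (2.0 * bandwidth ** 2))
    return (w @ y_train) / w.sum(axis=1)


def cv_rms_error(X, y, bandwidth, n_folds=10, seed=0):
    """RMS prediction error of kernel regression under k-fold cross-validation."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(X)), n_folds)
    sq_errors = []
    for i, test_idx in enumerate(folds):
        train_idx = np.concatenate([f for j, f in enumerate(folds) if j != i])
        pred = kernel_regression(X[train_idx], y[train_idx], X[test_idx], bandwidth)
        sq_errors.append((pred - y[test_idx]) ** 2)
    return float(np.sqrt(np.concatenate(sq_errors).mean()))


def select_bandwidth(X, y, candidates, n_folds=10):
    """Return the candidate bandwidth with the lowest cross-validation error."""
    errors = [cv_rms_error(X, y, h, n_folds) for h in candidates]
    best = int(np.argmin(errors))
    return candidates[best], errors[best]


if __name__ == "__main__":
    # Synthetic stand-in data (hypothetical): 4 "color" features, 1 target.
    rng = np.random.default_rng(42)
    X = rng.uniform(0.0, 1.0, size=(500, 4))
    y = 0.1 * X.sum(axis=1) + rng.normal(0.0, 0.01, size=500)
    h, err = select_bandwidth(X, y, [0.02, 0.05, 0.1, 0.2, 0.5])
    print(f"selected bandwidth: {h}, cross-validated rms error: {err:.4f}")
```

    Running the search separately for each input pattern (for example, different feature subsets) would yield a separate bandwidth per pattern, which is consistent with the abstract's observation that the optimal bandwidth depends on the inputs.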

    Rule-based Machine Learning Methods for Functional Prediction

    We describe a machine learning method for predicting the value of a real-valued function, given the values of multiple input variables. The method induces solutions from samples in the form of ordered disjunctive normal form (DNF) decision rules. A central objective of the method and representation is the induction of compact, easily interpretable solutions. This rule-based decision model can be extended to search efficiently for similar cases prior to approximating function values. Experimental results on real-world data demonstrate that the new techniques are competitive with existing machine learning and statistical methods and can sometimes yield superior regression performance. Comment: See http://www.jair.org/ for any accompanying file

    Weighted Polynomial Approximations: Limits for Learning and Pseudorandomness

    Polynomial approximations to boolean functions have led to many positive results in computer science. In particular, polynomial approximations to the sign function underlie algorithms for agnostically learning halfspaces, as well as pseudorandom generators for halfspaces. In this work, we investigate the limits of these techniques by proving inapproximability results for the sign function. Firstly, the polynomial regression algorithm of Kalai et al. (SIAM J. Comput. 2008) shows that halfspaces can be learned with respect to log-concave distributions on $\mathbb{R}^n$ in the challenging agnostic learning model. The power of this algorithm relies on the fact that under log-concave distributions, halfspaces can be approximated arbitrarily well by low-degree polynomials. We ask whether this technique can be extended beyond log-concave distributions, and establish a negative result. We show that polynomials of any degree cannot approximate the sign function to within arbitrarily low error for a large class of non-log-concave distributions on the real line, including those with densities proportional to $\exp(-|x|^{0.99})$. Secondly, we investigate the derandomization of Chernoff-type concentration inequalities. Chernoff-type tail bounds on sums of independent random variables have pervasive applications in theoretical computer science. Schmidt et al. (SIAM J. Discrete Math. 1995) showed that these inequalities can be established for sums of random variables with only $O(\log(1/\delta))$-wise independence, for a tail probability of $\delta$. We show that their results are tight up to constant factors. These results rely on techniques from weighted approximation theory, which studies how well functions on the real line can be approximated by polynomials under various distributions. We believe that these techniques will have further applications in other areas of computer science. Comment: 22 page