
    Diversity-based Attribute Weighting for K-modes Clustering

    Categorical data are widely used in computation in computer science. Extracting information from categorical input requires a clustering algorithm, and researchers have proposed many. One clustering algorithm for categorical data is k-modes, which uses a simple matching approach based on similarity values: two objects that match have a similarity value of 1, and 0 otherwise. In practice, each attribute takes several distinct values, and each value occurs with a different frequency. A similarity value of 0 or 1 is therefore not enough to represent the real semantic distance between a data object and a cluster. In this paper, we generalize the k-modes algorithm for categorical data by adding a weight and a diversity value for each attribute value to optimize categorical data clustering.
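
The simple matching approach described in this abstract can be illustrated with a minimal Python sketch. It covers only the unweighted 0/1 matching of plain k-modes, not the paper's weighting or diversity values, which the abstract does not specify. The function names and the toy data are assumptions made for illustration.

```python
from collections import Counter


def simple_matching_similarity(a, b):
    """Per-attribute similarity: 1 where the two values match, 0 otherwise."""
    if len(a) != len(b):
        raise ValueError("objects must have the same number of attributes")
    return [1 if x == y else 0 for x, y in zip(a, b)]


def simple_matching_distance(obj, mode):
    """Number of attributes on which obj differs from the cluster mode."""
    return len(obj) - sum(simple_matching_similarity(obj, mode))


def cluster_mode(objects):
    """Most frequent value of each attribute across the cluster's objects."""
    return tuple(Counter(column).most_common(1)[0][0] for column in zip(*objects))


# Hypothetical toy cluster with three categorical attributes.
cluster = [
    ("red", "small", "round"),
    ("red", "large", "round"),
    ("blue", "small", "round"),
]
mode = cluster_mode(cluster)
distance = simple_matching_distance(("blue", "large", "round"), mode)
```

Every mismatch contributes exactly 1 to the distance regardless of how common or rare the mismatching value is, which is the limitation the abstract says the proposed weighting addresses.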

    Redshift-weighted constraints on primordial non-Gaussianity from the clustering of the eBOSS DR14 quasars in Fourier space

    We present constraints on local primordial non-Gaussianity (PNG), parametrized through $f^{\rm loc}_{\rm NL}$, using the Sloan Digital Sky Survey IV extended Baryon Oscillation Spectroscopic Survey Data Release 14 quasar sample. We measure and analyze the anisotropic clustering of the quasars in Fourier space, testing for the scale-dependent bias introduced by primordial non-Gaussianity on large scales. We derive and employ a power spectrum estimator using optimal weights that account for the redshift evolution of the PNG signal. We find constraints of $-51<f^{\rm loc}_{\rm NL}<21$ at 95% confidence level. These are among the tightest constraints from Large Scale Structure (LSS) data. Our redshift weighting improves the error bar by 15% in comparison to the unweighted case. If quasars have a lower response to PNG, the constraint degrades to $-81<f^{\rm loc}_{\rm NL}<26$, with a 40% improvement over the standard approach. We forecast that the full eBOSS dataset could reach $\sigma_{f^{\rm loc}_{\rm NL}}\simeq 5\text{-}8$ using optimal methods and the full range of scales.
    Comment: 28 pages, 12 figures. Comments welcome