9,686 research outputs found

    On Semantic Word Cloud Representation

    Full text link
    We study the problem of computing semantic-preserving word clouds in which semantically related words are close to each other. While several heuristic approaches have been described in the literature, we formalize the underlying geometric algorithm problem: Word Rectangle Adjacency Contact (WRAC). In this model each word is associated with rectangle with fixed dimensions, and the goal is to represent semantically related words by ensuring that the two corresponding rectangles touch. We design and analyze efficient polynomial-time algorithms for some variants of the WRAC problem, show that several general variants are NP-hard, and describe a number of approximation algorithms. Finally, we experimentally demonstrate that our theoretically-sound algorithms outperform the early heuristics

    Precise Algorithm to Generate Random Sequential Addition of Hard Hyperspheres at Saturation

    Full text link
    Random sequential addition (RSA) time-dependent packing process, in which congruent hard hyperspheres are randomly and sequentially placed into a system without interparticle overlap, is a useful packing model to study disorder in high dimensions. Of particular interest is the infinite-time {\it saturation} limit in which the available space for another sphere tends to zero. However, the associated saturation density has been determined in all previous investigations by extrapolating the density results for near-saturation configurations to the saturation limit, which necessarily introduces numerical uncertainties. We have refined an algorithm devised by us [S. Torquato, O. Uche, and F.~H. Stillinger, Phys. Rev. E {\bf 74}, 061308 (2006)] to generate RSA packings of identical hyperspheres. The improved algorithm produce such packings that are guaranteed to contain no available space using finite computational time with heretofore unattained precision and across the widest range of dimensions (2d82 \le d \le 8). We have also calculated the packing and covering densities, pair correlation function g2(r)g_2(r) and structure factor S(k)S(k) of the saturated RSA configurations. As the space dimension increases, we find that pair correlations markedly diminish, consistent with a recently proposed "decorrelation" principle, and the degree of "hyperuniformity" (suppression of infinite-wavelength density fluctuations) increases. We have also calculated the void exclusion probability in order to compute the so-called quantizer error of the RSA packings, which is related to the second moment of inertia of the average Voronoi cell. Our algorithm is easily generalizable to generate saturated RSA packings of nonspherical particles

    Scalable and interpretable product recommendations via overlapping co-clustering

    Full text link
    We consider the problem of generating interpretable recommendations by identifying overlapping co-clusters of clients and products, based only on positive or implicit feedback. Our approach is applicable on very large datasets because it exhibits almost linear complexity in the input examples and the number of co-clusters. We show, both on real industrial data and on publicly available datasets, that the recommendation accuracy of our algorithm is competitive to that of state-of-art matrix factorization techniques. In addition, our technique has the advantage of offering recommendations that are textually and visually interpretable. Finally, we examine how to implement our technique efficiently on Graphical Processing Units (GPUs).Comment: In IEEE International Conference on Data Engineering (ICDE) 201
    corecore