Sampled Weighted Min-Hashing for Large-Scale Topic Mining
We present Sampled Weighted Min-Hashing (SWMH), a randomized approach to
automatically mine topics from large-scale corpora. SWMH generates multiple
random partitions of the corpus vocabulary based on term co-occurrence and
agglomerates highly overlapping inter-partition cells to produce the mined
topics. While other approaches define a topic as a probabilistic distribution
over a vocabulary, SWMH topics are ordered subsets of such vocabulary.
Interestingly, the topics mined by SWMH underlie themes from the corpus at
different levels of granularity. We extensively evaluate the meaningfulness of
the mined topics both qualitatively and quantitatively on the NIPS (1.7 K
documents), 20 Newsgroups (20 K), Reuters (800 K) and Wikipedia (4 M) corpora.
Additionally, we compare the quality of SWMH with Online LDA topics for
document representation in classification.
Comment: 10 pages, Proceedings of the Mexican Conference on Pattern
Recognition 201
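The partition-and-agglomerate idea in the abstract above can be illustrated with a small sketch. This is not the authors' SWMH implementation: it uses plain, unweighted min-hashing over the set of documents each term occurs in, followed by a greedy merge of overlapping cells, and it returns unordered term sets rather than ordered ones. The function names (minhash_cells, merge_cells), the parameters (tuple_size, num_tables, min_overlap) and the toy corpus are all hypothetical and chosen only for illustration.

```python
import random
from collections import defaultdict


def minhash_cells(term_docs, num_docs, tuple_size=2, num_tables=50, seed=0):
    """Group terms whose document sets share the same min-hash tuple.

    term_docs maps each term to the non-empty set of document ids it occurs in.
    Each table is one random partition of the vocabulary (assumed scheme).
    """
    rng = random.Random(seed)
    cells = set()
    for _ in range(num_tables):
        # One random permutation of document ids per hash function.
        perms = []
        for _ in range(tuple_size):
            perm = list(range(num_docs))
            rng.shuffle(perm)
            perms.append(perm)
        table = defaultdict(set)
        for term, docs in term_docs.items():
            # Min-hash value: smallest permuted id among the term's documents.
            key = tuple(min(perm[d] for d in docs) for perm in perms)
            table[key].add(term)
        # Keep only cells that contain at least two co-occurring terms.
        cells.update(frozenset(c) for c in table.values() if len(c) > 1)
    return cells


def merge_cells(cells, min_overlap=0.5):
    """Greedily merge cells whose overlap coefficient reaches min_overlap."""
    topics = []
    for cell in sorted(cells, key=len, reverse=True):
        for topic in topics:
            overlap = len(cell & topic) / min(len(cell), len(topic))
            if overlap >= min_overlap:
                topic |= cell
                break
        else:
            topics.append(set(cell))
    return topics


# Toy corpus (made up): two groups of terms with disjoint document sets.
term_docs = {
    "neural": {0, 1, 2},
    "network": {0, 1, 2},
    "magnetic": {3, 4, 5},
    "crystal": {3, 4, 5},
}
topics = merge_cells(minhash_cells(term_docs, num_docs=6))
print(sorted(sorted(t) for t in topics))
```

In this toy case, terms with identical document sets always share a cell and terms with disjoint document sets never do, so the two groups come out as separate topics. On real corpora, terms that co-occur only partially collide with probability equal to a power of their Jaccard similarity, which is why many tables are drawn.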
Physical properties of CeGe2-x (x = 0.24) single crystals
We present data on the anisotropic magnetic properties, heat capacity and
transport properties of CeGe2-x (x = 0.24) single crystals. The electronic
coefficient of the heat capacity, gamma ~ 110 mJ/mol K^2, is enhanced; three
magnetic transitions, with critical temperatures of ~ 7 K, ~ 5 K, and ~ 4 K are
observed in thermodynamic and transport measurements. The ground state has a
small ferromagnetic component along the c-axis. A small applied field, below 10
kOe, is enough to bring the material to an apparent saturated paramagnetic
state (with no further metamagnetic transitions up to 55 kOe) with a reduced,
below 1 mu_B, saturated moment.
- …