Memory-Efficient Topic Modeling
As one of the simplest probabilistic topic modeling techniques, latent
Dirichlet allocation (LDA) has found many important applications in text
mining, computer vision and computational biology. Recent training algorithms
for LDA can be interpreted within a unified message passing framework. However,
message passing requires storing the previous messages, which consumes memory
that grows linearly with the number of documents and the number of
topics. This high memory usage is therefore often a major obstacle to topic
modeling of massive corpora containing a large number of topics. To reduce the
space complexity, we propose tiny belief propagation (TBP), a novel algorithm
that trains LDA without storing previous messages. The basic idea of TBP is to
relate the message passing algorithms to non-negative matrix factorization
(NMF) algorithms, which absorb the message updates into the message passing
process and thus avoid storing previous messages. Experimental
results on four large data sets confirm that TBP performs comparably to, or
even better than, current state-of-the-art training algorithms for LDA, with
much lower memory consumption. TBP makes topic modeling possible when massive
corpora cannot fit in computer memory, for example, extracting thematic topics
from the 7 GB PUBMED corpus on a common desktop computer with 2 GB of memory.
Comment: 20 pages, 7 figures
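The abstract's connection between message passing for LDA and NMF can be pictured with the classic KL-divergence multiplicative NMF updates on a word-document count matrix; like TBP, these updates keep only the two factor matrices rather than per-message state. This is an illustrative sketch under that analogy, not the authors' TBP algorithm, and all sizes and data below are toy assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.poisson(2.0, size=(50, 30)).astype(float) + 1e-9  # word-document counts
K = 5                                                     # number of topics
W = rng.random((50, K)) + 0.1                             # word-topic factors
H = rng.random((K, 30)) + 0.1                             # topic-document factors

def kl_error(X, W, H):
    """Generalized KL divergence between counts X and reconstruction WH."""
    WH = W @ H
    return float(np.sum(X * np.log(X / WH) - X + WH))

before = kl_error(X, W, H)
for _ in range(200):
    # Multiplicative updates (Lee & Seung): only W and H are stored,
    # mirroring the memory argument made for TBP.
    W *= ((X / (W @ H)) @ H.T) / H.sum(axis=1)
    H *= (W.T @ (X / (W @ H))) / W.sum(axis=0)[:, None]
after = kl_error(X, W, H)
assert after < before  # the divergence is non-increasing under these updates
```

The memory footprint here is O((V + D)K) for the factors, versus storage that grows with the full set of word-level messages in standard batch belief propagation.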
A New Approach to Speeding Up Topic Modeling
Latent Dirichlet allocation (LDA) is a widely used probabilistic topic
modeling paradigm that has recently found many applications in computer vision and
computational biology. In this paper, we propose a fast and accurate batch
algorithm, active belief propagation (ABP), for training LDA. Usually batch LDA
algorithms require repeated scanning of the entire corpus and searching the
complete topic space. For massive corpora with a large number of topics, each
training iteration of a batch LDA algorithm is therefore inefficient and
time-consuming. To accelerate training, ABP actively scans a subset of the
corpus and searches a subset of the topic space, thereby saving substantial
training time in each iteration. To ensure accuracy, ABP selects only those
documents and topics that contribute the largest residuals within the
residual belief propagation (RBP) framework. On four real-world corpora,
ABP runs many times faster than state-of-the-art batch LDA algorithms with
comparable topic modeling accuracy.
Comment: 14 pages, 12 figures
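The residual-driven selection idea described above — scanning only the documents whose messages changed most — can be sketched with a cheap partial sort. This is a hedged illustration, not the paper's actual ABP procedure; the variable names, sizes, and the 10% fraction are assumptions for the example.

```python
import numpy as np

rng = np.random.default_rng(1)
n_docs = 1000
residuals = rng.random(n_docs)   # toy per-document message residuals
frac = 0.1                       # actively scan only the top 10% per iteration
k = int(frac * n_docs)

# argpartition locates the k largest residuals in O(n), without a full sort,
# so selecting the active subset stays cheap even for massive corpora.
active = np.argpartition(residuals, -k)[-k:]
threshold = np.partition(residuals, -k)[-k]

assert active.size == k
assert residuals[active].min() >= threshold  # every selected doc is in the top k
```

In an actual training loop, the residuals would be refreshed from the message updates each iteration, so documents whose topic distributions have converged naturally drop out of the active set.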
Experimental Test of Tracking the King Problem
In quantum theory, the retrodiction problem is not as clear as its classical
counterpart because of the uncertainty principle of quantum mechanics. In
classical physics, the measurement outcomes of the present state can be used
directly for predicting future events and for inferring past events, the latter
being known as retrodiction. Quantum mechanics, however, is a probabilistic
theory, and quantum-mechanical retrodiction is a nontrivial problem that has
been investigated for a long time, of which the Mean King Problem is one of the
most extensively studied examples. Here, we present the first experimental test
of a variant of the Mean King Problem that imposes a more stringent requirement
and is termed "Tracking the King". We demonstrate that Alice, by harnessing
shared entanglement and a controlled-not gate, can successfully retrodict the
choice of the King's measurement without knowing any measurement outcome. Our
results also demonstrate a counterintuitive quantum communication scheme that
delivers information hidden in the choice of measurement.
Comment: 16 pages, 5 figures, 2 tables
Deep Learning the Effects of Photon Sensors on the Event Reconstruction Performance in an Antineutrino Detector
We provide a fast, deep-learning-based approach for evaluating the effects of
the photon sensors in an antineutrino detector on its event reconstruction
performance. This work is an attempt to harness the power of deep learning for
detector design and upgrade planning. Using the Daya Bay detector as a
benchmark case and the vertex reconstruction performance as the objective for
the deep neural network, we find that the photomultiplier tubes (PMTs) differ
in their relative importance to vertex reconstruction. More importantly, the
vertex position resolution for the Daya Bay detector follows an approximately
multi-exponential relationship with the number of PMTs and hence the coverage.
This could also assist in deciding on the merits of installing additional PMTs
in future detector plans. The approach could easily be used with other
objectives in place of vertex reconstruction.
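The multi-exponential trend reported above can be pictured with a toy model: resolution decays as a sum of exponentials in the PMT count and saturates toward a floor. The coefficients below are invented for illustration and are not fitted Daya Bay values.

```python
import numpy as np

def vertex_resolution(n_pmts, amps=(60.0, 25.0), taus=(30.0, 300.0), floor=8.0):
    """Toy multi-exponential model: sigma(N) = sum_i a_i * exp(-N / tau_i) + floor.
    All parameter values are illustrative assumptions, not measured quantities."""
    n = np.asarray(n_pmts, dtype=float)
    return sum(a * np.exp(-n / t) for a, t in zip(amps, taus)) + floor

n = np.arange(10, 200, 10)
sigma = vertex_resolution(n)

assert np.all(np.diff(sigma) < 0)  # adding PMTs always lowers (improves) sigma
assert sigma[-1] > 8.0             # but the gains saturate toward the floor
```

A model of this shape makes the cost-benefit question concrete: once the curve flattens near its floor, each additional PMT buys progressively less resolution, which is the kind of trade-off the abstract suggests using for upgrade decisions.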
catena-Poly[[[[N′-(4-cyanobenzylidene)nicotinohydrazide]silver(I)]-μ-[N′-(4-cyanobenzylidene)nicotinohydrazide]] hexafluoridoarsenate]
In the title compound, {[Ag(C14H10N4O)2]AsF6}n, the AgI ion is coordinated by two N atoms from two different pyridyl rings and one N atom from one carbonitrile group of three different N′-(4-cyanobenzylidene)nicotinohydrazide ligands in a distorted T-shaped geometry. The Ag—Ncarbonitrile bond distance is significantly longer than the Ag—Npyridyl distances. The bond angles around the AgI atom also deviate from those of an ideal T-shaped geometry. One type of ligand acts as a bridge that connects the AgI atoms into chains along [01]. These chains are linked to each other via N—H⋯O hydrogen bonds and Ag⋯O interactions with an Ag⋯O separation of 2.869 (2) Å. In addition, the [AsF6]− counter-anions are linked to the hydrazone groups through N—H⋯F hydrogen bonds. Four of the F atoms of the [AsF6]− anion are disordered over two sets of sites with occupancies of 0.732 (9) and 0.268 (9).