Enhancing Domain Word Embedding via Latent Semantic Imputation
We present a novel method named Latent Semantic Imputation (LSI) to transfer
external knowledge into semantic space for enhancing word embedding. The method
integrates graph theory to extract the latent manifold structure of the
entities in the affinity space and leverages non-negative least squares with
standard simplex constraints and power iteration method to derive spectral
embeddings. It provides an effective and efficient approach to combining entity
representations defined in different Euclidean spaces. Specifically, our
approach generates and imputes reliable embedding vectors for low-frequency
words in the semantic space and benefits downstream language tasks that depend
on word embedding. We conduct comprehensive experiments on a carefully designed
classification problem and language modeling and demonstrate the superiority of
the enhanced embedding via LSI over several well-known benchmark embeddings. We
also confirm the consistency of the results under different parameter settings
of our method. Comment: ACM SIGKDD 201
Vacuum structure and string tension in Yang-Mills dimeron ensembles
We numerically simulate ensembles of SU(2) Yang-Mills dimeron solutions with
a statistical weight determined by the classical action and perform a
comprehensive analysis of their properties. In particular, we examine the
extent to which these ensembles capture topological and confinement properties
of the Yang-Mills vacuum. This further allows us to test the classic picture of
meron-induced quark confinement as triggered by dimeron dissociation. At small
bare couplings, spatial, topological-charge and color correlations among the
dimerons generate a short-range order which screens topological charges. With
increasing coupling this order weakens rapidly, however, in part because the
dimerons gradually dissociate into their meron constituents. Monitoring
confinement properties by evaluating Wilson-loop expectation values, we find
the growing disorder due to these progressively liberated merons to generate a
finite and (with the coupling) increasing string tension. The short-distance
behavior of the static quark-antiquark potential, on the other hand, is
dominated by small, "instanton-like" dimerons. String tension, action density
and topological susceptibility of the dimeron ensembles in the physical
coupling region turn out to be of the order of standard values. Hence the above
results demonstrate without reliance on weak-coupling or low-density
approximations that the dissociating dimeron component in the Yang-Mills vacuum
can indeed produce a meron-populated confining phase. The density of
coexisting, hardly dissociated and thus instanton-like dimerons seems to remain
large enough, on the other hand, to reproduce much of the additional
phenomenology successfully accounted for by non-confining instanton vacuum
models. Hence dimeron ensembles should provide an efficient basis for a rather
complete description of the Yang-Mills vacuum. Comment: 36 pages, 17 figure
Matching Methods for Causal Inference: A Review and a Look Forward
When estimating causal effects using observational data, it is desirable to
replicate a randomized experiment as closely as possible by obtaining treated
and control groups with similar covariate distributions. This goal can often be
achieved by choosing well-matched samples of the original treated and control
groups, thereby reducing bias due to the covariates. Since the 1970s, work on
matching methods has examined how to best choose treated and control subjects
for comparison. Matching methods are gaining popularity in fields such as
economics, epidemiology, medicine and political science. However, until now the
literature and related advice have been scattered across disciplines.
Researchers who are interested in using matching methods---or developing
methods related to matching---do not have a single place to turn to learn about
past and current research. This paper provides a structure for thinking about
matching methods and guidance on their use, coalescing the existing research
(both old and new) and providing a summary of where the literature on matching
methods is now and where it should be headed. Comment: Published in at http://dx.doi.org/10.1214/09-STS313 the Statistical
Science (http://www.imstat.org/sts/) by the Institute of Mathematical
Statistics (http://www.imstat.org
Semantic Visual Localization
Robust visual localization under a wide range of viewing conditions is a
fundamental problem in computer vision. Handling the difficult cases of this
problem is not only very challenging but also of high practical relevance,
e.g., in the context of life-long localization for augmented reality or
autonomous robots. In this paper, we propose a novel approach based on a joint
3D geometric and semantic understanding of the world, enabling it to succeed
under conditions where previous approaches failed. Our method leverages a novel
generative model for descriptor learning, trained on semantic scene completion
as an auxiliary task. The resulting 3D descriptors are robust to missing
observations by encoding high-level 3D geometric and semantic information.
Experiments on several challenging large-scale localization datasets
demonstrate reliable localization under extreme viewpoint, illumination, and
geometry changes.
Terminology mining in social media
The highly variable and dynamic word usage in social media presents serious challenges for both research and those commercial applications that are geared towards blogs or other user-generated non-editorial texts. This paper discusses and exemplifies a terminology mining approach for dealing with the productive character of the textual environment in social media. We explore the challenges of practically acquiring new terminology, and of modeling similarity and relatedness of terms from observing realistic amounts of data. We also discuss semantic evolution and density, and investigate novel measures for characterizing the preconditions for terminology mining.