End-to-end Learning for Short Text Expansion
Effectively making sense of short texts is a critical task for many real
world applications such as search engines, social media services, and
recommender systems. The task is particularly challenging as a short text
contains very sparse information, often too sparse for a machine learning
algorithm to pick up useful signals. A common practice for analyzing short text
is to first expand it with external information, which is usually harvested
from a large collection of longer texts. In the literature, short text expansion
has relied on a variety of heuristics. We propose an end-to-end solution
that automatically learns how to expand short text to optimize a given learning
task. A novel deep memory network is proposed to automatically find relevant
information from a collection of longer documents and reformulate the short
text through a gating mechanism. Using short text classification as a
demonstrating task, we show that the deep memory network significantly
outperforms classical text expansion methods with comprehensive experiments on
real-world data sets.
Comment: KDD'201
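The memory lookup and gating described in this abstract can be illustrated with a minimal sketch. It is not the authors' network: the dot-product attention, the scalar `gate`, and the names `softmax` and `expand_short_text` are assumptions chosen only to show the general idea of reading from longer documents and blending the result into the short-text representation.

```python
import math

def softmax(scores):
    # Numerically stable softmax over a list of floats.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def expand_short_text(query, memory, gate):
    """Blend a short-text vector with an attention read over memory vectors.

    query  : embedding of the short text (list of floats)
    memory : embeddings of longer documents (list of lists, same dimension)
    gate   : hypothetical scalar in [0, 1]; 1.0 keeps the query unchanged
    """
    # Relevance of each long document to the short text (dot product).
    scores = [sum(q * d for q, d in zip(query, doc)) for doc in memory]
    weights = softmax(scores)
    # Attention read: weighted average of the memory vectors.
    read = [sum(w * doc[i] for w, doc in zip(weights, memory))
            for i in range(len(query))]
    # Gated reformulation of the short text.
    expanded = [gate * q + (1.0 - gate) * r for q, r in zip(query, read)]
    return expanded, weights

expanded, weights = expand_short_text([1.0, 0.0], [[1.0, 0.0], [0.0, 1.0]], gate=0.5)
```

In a trained model the gate would presumably be learned from the data rather than passed in by hand; the fixed scalar here only keeps the sketch short.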
Topological and Algebraic Properties of Chernoff Information between Gaussian Graphs
In this paper, we identify the factors that determine Chernoff
information in distinguishing a set of Gaussian graphs. We find that the Chernoff
information of two Gaussian graphs is determined by the generalized
eigenvalues of their covariance matrices. We find that a unit generalized
eigenvalue does not affect Chernoff information, and its corresponding dimension
provides no information for classification. In addition, we can
provide a partial ordering using Chernoff information between a series of
Gaussian trees connected by independent grafting operations. With the
relationship between generalized eigenvalues and Chernoff information, we can
do optimal linear dimension reduction with least loss of information for
classification.
Comment: Submitted to Allerton 2018; this version contains proofs of the
propositions in the paper
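The dependence on generalized eigenvalues can be made concrete with a short sketch. It assumes zero-mean Gaussians with covariances Σ1 and Σ2 (an assumption, not stated in the abstract) and takes as input the generalized eigenvalues μ_i solving Σ1 v = μ Σ2 v. These could be obtained, for example, with `scipy.linalg.eigh(S1, S2, eigvals_only=True)`. Under that assumption, -log ∫ p1^α p2^(1-α) equals ½ Σ_i [log(α + (1-α) μ_i) - (1-α) log μ_i], and the Chernoff information is its maximum over α in [0, 1]. The function names and the grid search are illustrative choices.

```python
import math

def chernoff_exponent(mus, alpha):
    # -log of the integral of p1^alpha * p2^(1-alpha) for zero-mean Gaussians,
    # written in terms of the generalized eigenvalues mus of (Sigma1, Sigma2).
    return 0.5 * sum(
        math.log(alpha + (1.0 - alpha) * mu) - (1.0 - alpha) * math.log(mu)
        for mu in mus
    )

def chernoff_information(mus, steps=2001):
    # The exponent is concave in alpha, so a simple grid search on [0, 1]
    # is enough for a sketch; a 1-D optimizer would be more precise.
    return max(chernoff_exponent(mus, k / (steps - 1)) for k in range(steps))

base = chernoff_information([4.0, 0.5])
with_unit = chernoff_information([4.0, 0.5, 1.0])
```

A term with μ_i = 1 contributes log(1) - 0 = 0 for every α, which is why `base` and `with_unit` agree up to rounding: the corresponding dimension carries no discriminating information.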
Code Completion with Neural Attention and Pointer Networks
Intelligent code completion has become an essential research task to
accelerate modern software development. To facilitate effective code completion
for dynamically-typed programming languages, we apply neural language models by
learning from large codebases, and develop a tailored attention mechanism for
code completion. However, standard neural language models, even with an attention
mechanism, cannot correctly predict out-of-vocabulary (OoV) words, which
restricts code completion performance. In this paper, inspired by the
prevalence of locally repeated terms in program source code, and the recently
proposed pointer copy mechanism, we propose a pointer mixture network for
better predicting OoV words in code completion. Based on the context, the
pointer mixture network learns to either generate a within-vocabulary word
through an RNN component, or regenerate an OoV word from local context through
a pointer component. Experiments on two benchmark datasets demonstrate the
effectiveness of our attention mechanism and pointer mixture network on the
code completion task.
Comment: Accepted in IJCAI 201
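The choice between generating and copying can be sketched as a gated mixture of two distributions. This is a minimal illustration, not the authors' model: it assumes the RNN component and the pointer component have already produced `vocab_probs` and `pointer_probs`, and that a scalar `gate` in [0, 1] weights them. All names and numbers below are hypothetical.

```python
def pointer_mixture(vocab_probs, pointer_probs, gate):
    """Concatenate gate-weighted vocabulary and pointer distributions."""
    if not 0.0 <= gate <= 1.0:
        raise ValueError("gate must lie in [0, 1]")
    return ([gate * p for p in vocab_probs]
            + [(1.0 - gate) * p for p in pointer_probs])

def predict(vocab, context, vocab_probs, pointer_probs, gate):
    """Return the most likely next token: a vocabulary word or a copied context token."""
    joint = pointer_mixture(vocab_probs, pointer_probs, gate)
    best = max(range(len(joint)), key=joint.__getitem__)
    if best < len(vocab):
        return vocab[best]               # generated from the vocabulary
    return context[best - len(vocab)]    # copied from the local context

vocab = ["if", "return", "<unk>"]
context = ["my_helper_fn", "(", "x"]     # recent tokens; the first is OoV
vocab_probs = [0.2, 0.6, 0.2]
pointer_probs = [0.8, 0.1, 0.1]

copied = predict(vocab, context, vocab_probs, pointer_probs, gate=0.3)
generated = predict(vocab, context, vocab_probs, pointer_probs, gate=0.9)
```

With a low gate the mixture favors the pointer and returns the OoV identifier from the context; with a high gate it returns a within-vocabulary word.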
Twin-solute, twin-dislocation and twin-twin interactions in magnesium
Magnesium alloys have received considerable research interest due to their light weight, high specific strength and excellent castability. However, their plastic deformation is more complicated than that of cubic materials, primarily because of their low-symmetry hexagonal close-packed (hcp) crystal structure. Deformation twinning is a crucial plastic deformation mechanism in magnesium, and twins can affect the evolution of microstructure by interacting with other lattice defects, thereby affecting the mechanical properties. This paper provides a review of the interactions between deformation twins and lattice defects, such as solute atoms, dislocations and twins, in magnesium and its alloys. This review starts with interactions between twin boundaries and substitutional solutes like yttrium, zinc and silver, as well as interstitial solutes like hydrogen and oxygen. This is followed by twin-dislocation interactions, which mainly involve those between {101̄2} tension or {101̄1} compression twins and 〈a〉, 〈c〉 or 〈c+a〉 type dislocations. The following section examines twin-twin interactions, which occur either among the six variants of the same {101̄2} or {101̄1} twin, or between different types of twins. The resulting structures, including twin-twin junctions or boundaries, tension-tension double twins, and compression-tension double twins, are discussed in detail. Lastly, this review highlights the remaining research issues concerning the interactions between twins and lattice defects in magnesium, and provides suggestions for future work in this area.
Partition Information and its Transmission over Boolean Multi-Access Channels
In this paper, we propose a novel partition reservation system to study the
partition information and its transmission over a noise-free Boolean
multi-access channel. The objective of transmission is not message restoration,
but to partition active users into distinct groups so that they can,
subsequently, transmit their messages without collision. We first calculate (by
mutual information) the amount of information needed for the partitioning
without channel effects, and then propose two different coding schemes to
obtain achievable transmission rates over the channel. The first one is the
brute force method, where the codebook design is based on centralized source
coding; the second method uses random coding where the codebook is generated
randomly and optimal Bayesian decoding is employed to reconstruct the
partition. Both methods shed light on the internal structure of the partition
problem. A novel hypergraph formulation is proposed for the random coding
scheme, which intuitively describes the information in terms of a strong
coloring of a hypergraph induced by a sequence of channel operations and
interactions between active users. An extended Fibonacci structure is found for
a simple, but non-trivial, case with two active users. A comparison between
these methods and group testing is conducted to demonstrate the uniqueness of
our problem.
Comment: Submitted to IEEE Transactions on Information Theory, major revision
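The channel model and the partition objective can be illustrated with a small sketch. It is not the authors' coding scheme: the toy `codebook`, the brute-force search over user pairs, and all function names are assumptions. It only shows that a noise-free Boolean multi-access channel outputs the slot-wise OR of the active users' codewords, and that a partition succeeds when the active users fall into distinct groups.

```python
from itertools import combinations

def channel_output(codebook, active):
    """Noise-free Boolean multi-access channel: slot-wise OR over active users."""
    n_slots = len(codebook[0])
    return [int(any(codebook[u][t] for u in active)) for t in range(n_slots)]

def consistent_pairs(codebook, output):
    """All pairs of users whose joint transmission would produce `output`."""
    users = range(len(codebook))
    return [pair for pair in combinations(users, 2)
            if channel_output(codebook, pair) == output]

def is_collision_free(groups, active):
    """True if every active user is assigned to a different group."""
    labels = [groups[u] for u in active]
    return len(set(labels)) == len(labels)

# Toy codebook: one row of 0/1 symbols per user, one column per channel use.
codebook = [[1, 0, 0], [0, 1, 0], [0, 0, 1], [1, 1, 0]]
observed = channel_output(codebook, (0, 1))
candidates = consistent_pairs(codebook, observed)
```

In this toy example several pairs explain the same observation, so the receiver cannot identify the active users. The partition problem asks for less: an assignment of users to groups that separates the active users, which can be checked with `is_collision_free`.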
Asymptotic Error Free Partitioning over Noisy Boolean Multiaccess Channels
In this paper, we consider the problem of partitioning active users in a
manner that facilitates multi-access without collision. The setting is of a
noisy, synchronous, Boolean, multi-access channel where active users (out
of a total of users) seek to access. A solution to the partition problem
places each of the users in one of groups (or blocks) such that no two
active nodes are in the same block. We consider a simple, but non-trivial and
illustrative case of active users and study the number of steps used
to solve the partition problem. By random coding and a suboptimal decoding
scheme, we show that for any , where and
are positive constants (independent of ), and can be
arbitrarily small, the partition problem can be solved with error probability
, for large . Under the same scheme, we also bound from
the other direction, establishing that, for any ,
the error probability for large ; again and
are constants and can be arbitrarily small. These bounds on the number
of steps are lower than the tight achievable lower-bound in terms of for group testing (in which all active users are identified,
rather than just partitioned). Thus, partitioning may prove to be a more
efficient approach for multi-access than group testing.
Comment: This paper was submitted in June 2014 to IEEE Transactions on
Information Theory, and is under review now