End-to-End Neural Ad-hoc Ranking with Kernel Pooling
This paper proposes K-NRM, a kernel based neural model for document ranking.
Given a query and a set of documents, K-NRM uses a translation matrix that
models word-level similarities via word embeddings, a new kernel-pooling
technique that uses kernels to extract multi-level soft match features, and a
learning-to-rank layer that combines those features into the final ranking
score. The whole model is trained end-to-end. The ranking layer learns desired
feature patterns from the pairwise ranking loss. The kernels transfer the
feature patterns into soft-match targets at each similarity level and enforce
them on the translation matrix. The word embeddings are tuned accordingly so
that they can produce the desired soft matches. Experiments on a commercial
search engine's query log demonstrate the improvements of K-NRM over prior
feature-based and neural-based states-of-the-art, and explain the source of
K-NRM's advantage: Its kernel-guided embedding encodes a similarity metric
tailored for matching query words to document words, and provides effective
multi-level soft matches.
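
The pipeline described above (translation matrix, kernel pooling, ranking layer) can be sketched in a few lines. This is only an illustrative sketch, not the authors' implementation: the abstract does not give formulas, so the Gaussian (RBF) kernels, the log-sum over query words, and the tanh scoring layer are assumptions here, and all function names, kernel means, widths, and weights are hypothetical. In the real model the embeddings and ranking weights would be learned end-to-end from a pairwise ranking loss; here they are fixed by hand.

```python
import math

def cosine(u, v):
    """Cosine similarity between two embedding vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def translation_matrix(query_vecs, doc_vecs):
    """Word-level similarities: one row per query word, one column per document word."""
    return [[cosine(q, d) for d in doc_vecs] for q in query_vecs]

def kernel_pooling(matrix, mus, sigmas):
    """One soft-match feature per kernel (assumed RBF kernels).

    Each kernel counts, softly, how many document words have a similarity
    close to its mean mu; the per-query-word counts are combined by a log-sum.
    """
    features = []
    for mu, sigma in zip(mus, sigmas):
        total = 0.0
        for row in matrix:
            soft_count = sum(math.exp(-((m - mu) ** 2) / (2 * sigma ** 2)) for m in row)
            total += math.log(max(soft_count, 1e-10))  # clamp to avoid log(0)
        features.append(total)
    return features

def ranking_score(features, weights, bias):
    """Learning-to-rank layer: a linear combination squashed by tanh (assumed form)."""
    return math.tanh(sum(w * f for w, f in zip(weights, features)) + bias)
```

With a single kernel centred on similarity 1.0 and a positive weight, a document word identical to the query word yields a higher score than an orthogonal one, which is the behaviour the soft-match features are meant to capture.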
Do branch lengths help to locate a tree in a phylogenetic network?
Phylogenetic networks are increasingly used in evolutionary biology to
represent the history of species that have undergone reticulate events such as
horizontal gene transfer, hybrid speciation and recombination. One of the most
fundamental questions that arise in this context is whether the evolution of a
gene with one copy in all species can be explained by a given network. In
mathematical terms, this is often translated in the following way: is a given
phylogenetic tree contained in a given phylogenetic network? Recently this tree
containment problem has been widely investigated from a computational
perspective, but most studies have only focused on the topology of the
phylogenies, ignoring a piece of information that, in the case of phylogenetic
trees, is routinely inferred by evolutionary analyses: branch lengths. These
measure the amount of change (e.g., nucleotide substitutions) that has occurred
along each branch of the phylogeny. Here, we study a number of versions of the
tree containment problem that explicitly account for branch lengths. We show
that, although length information has the potential to locate a tree more
precisely within a network, the problem is computationally hard in its most general
form. On a positive note, for a number of special cases of biological
relevance, we provide algorithms that solve this problem efficiently. This
includes the case of networks of limited complexity, for which it is possible
to recover, among the trees contained by the network with the same topology as
the input tree, the closest one in terms of branch lengths.
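
To make the problem statement concrete, the following is a brute-force sketch of tree containment with branch lengths. It is not one of the paper's algorithms. It assumes that a network is given as a list of (parent, child, length) edges, that leaves are identified by their labels, that a tree is contained when some choice of one incoming edge per reticulation yields the tree after dead ends are pruned and nodes with a single child are suppressed (with their lengths added), and that any length above the root of the contained tree is ignored. The names `canonical_form` and `displays` are hypothetical. The search is exponential in the number of reticulations, in line with the hardness of the general problem.

```python
from itertools import product

def canonical_form(edges, leaves):
    """Reduce a rooted tree to a nested, order-independent form with path lengths."""
    children, has_parent = {}, set()
    for parent, child, length in edges:
        children.setdefault(parent, []).append((child, length))
        has_parent.add(child)
    root = next(n for n in children if n not in has_parent)  # assumes a single root

    def collapse(node):
        # Returns (length accumulated through suppressed nodes, subtree form),
        # or None if no labelled leaf lies below this node.
        if node in leaves:
            return (0.0, node)
        subs = []
        for child, length in children.get(node, []):
            reduced = collapse(child)
            if reduced is not None:
                subs.append((round(length + reduced[0], 9), reduced[1]))
        if not subs:
            return None          # dead end: prune
        if len(subs) == 1:
            return subs[0]       # single child: suppress node, keep summed length
        return (0.0, frozenset(subs))

    return collapse(root)[1]     # ignore any length above the top branching node

def displays(network_edges, tree_edges, leaves):
    """True if some switching of the network reduces to the tree, lengths included."""
    target = canonical_form(tree_edges, leaves)
    incoming = {}
    for edge in network_edges:
        incoming.setdefault(edge[1], []).append(edge)
    # Keep exactly one incoming edge per node; reticulations offer several options.
    return any(canonical_form(choice, leaves) == target
               for choice in product(*incoming.values()))

# Hypothetical example: node "h" is a reticulation with parents "a" and "b".
network = [("r", "a", 1), ("r", "b", 1), ("a", "x", 1), ("a", "h", 1),
           ("b", "h", 2), ("b", "z", 1), ("h", "y", 1)]
leaves = {"x", "y", "z"}
tree_ok = [("t", "u", 1), ("u", "x", 1), ("u", "y", 2), ("t", "z", 2)]
tree_bad = [("t", "u", 1), ("u", "x", 1), ("u", "y", 3), ("t", "z", 2)]
```

In this example `tree_ok` and `tree_bad` share a topology that the network contains, but only `tree_ok` also matches on branch lengths, so `displays` accepts the first and rejects the second.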