Weakly Supervised Cross-Lingual Named Entity Recognition via Effective Annotation and Representation Projection
The state-of-the-art named entity recognition (NER) systems are supervised
machine learning models that require large amounts of manually annotated data
to achieve high accuracy. However, manual annotation of NER data is expensive
and time-consuming, and can be quite difficult for a new language. In this
paper, we present two weakly supervised approaches for cross-lingual NER with
no human annotation in a target language. The first approach is to create
automatically labeled NER data for a target language via annotation projection
on comparable corpora, where we develop a heuristic scheme that effectively
selects good-quality projection-labeled data from noisy data. The second
approach is to project distributed representations of words (word embeddings)
from a target language to a source language, so that the source-language NER
system can be applied to the target language without re-training. We also
design two co-decoding schemes that effectively combine the outputs of the two
projection-based approaches. We evaluate the performance of the proposed
approaches on both in-house and open NER data for several target languages. The
results show that the combined systems outperform three other weakly supervised
approaches on the CoNLL data.
Comment: 11 pages, The 55th Annual Meeting of the Association for
Computational Linguistics (ACL), 201
Q-CSMA: Queue-Length Based CSMA/CA Algorithms for Achieving Maximum Throughput and Low Delay in Wireless Networks
Recently, it has been shown that CSMA-type random access algorithms can
achieve the maximum possible throughput in ad hoc wireless networks. However,
these algorithms assume an idealized continuous-time CSMA protocol where
collisions can never occur. In addition, simulation results indicate that the
delay performance of these algorithms can be quite bad. On the other hand,
although some simple heuristics (such as distributed approximations of greedy
maximal scheduling) can yield much better delay performance for a large set of
arrival rates, they may only achieve a fraction of the capacity region in
general. In this paper, we propose a discrete-time version of the CSMA
algorithm. Central to our results is a discrete-time distributed randomized
algorithm which is based on a generalization of the so-called Glauber dynamics
from statistical physics, where multiple links are allowed to update their
states in a single time slot. The algorithm generates collision-free
transmission schedules while explicitly taking collisions into account during
the control phase of the protocol, thus relaxing the perfect CSMA assumption.
More importantly, the algorithm allows us to incorporate mechanisms which lead
to very good delay performance while retaining the throughput-optimality
property. It also resolves the hidden and exposed terminal problems associated
with wireless networks.
Comment: 12 pages
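The update rule described in this abstract (several links changing state in one slot while the schedule stays collision-free) can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that the set of links allowed to update in a slot is an independent set of the conflict graph, and that each link `i` has a given activation probability `p[i]`; how that probability is derived from queue lengths is not specified here. The function names and the random-draw contention in `random_decision_set` are hypothetical stand-ins for the protocol's control phase.

```python
import random

def random_decision_set(adj, rng):
    """Hypothetical control phase: a link joins the decision set only if its
    random draw is smaller than every neighbour's draw, so no two neighbours
    are ever selected together (the set is independent)."""
    draw = {i: rng.random() for i in adj}
    return {i for i in adj if all(draw[i] < draw[j] for j in adj[i])}

def parallel_glauber_step(adj, state, decision, p, rng):
    """One slot of a parallel Glauber-style update.

    Links in `decision` re-sample their state: a link turns on with
    probability p[i] if none of its neighbours was active in the previous
    slot, and turns off otherwise. All other links keep their state."""
    new_state = dict(state)
    for i in decision:
        if any(state[j] for j in adj[i]):
            new_state[i] = 0
        else:
            new_state[i] = 1 if rng.random() < p[i] else 0
    return new_state

def is_collision_free(adj, state):
    """True if no two conflicting links are active at the same time."""
    return not any(state[i] and state[j] for i in adj for j in adj[i])

def simulate(adj, p, slots, seed=0):
    """Run the sketch from the empty schedule and return the final schedule."""
    rng = random.Random(seed)
    state = {i: 0 for i in adj}
    for _ in range(slots):
        decision = random_decision_set(adj, rng)
        state = parallel_glauber_step(adj, state, decision, p, rng)
    return state

# Usage: five links whose conflict graph is a ring.
ring = {i: [(i - 1) % 5, (i + 1) % 5] for i in range(5)}
final_schedule = simulate(ring, {i: 0.8 for i in ring}, slots=200)
```

Because the decision set is independent, a link that switches on has only neighbours that were inactive and are not updating in the same slot, which is why a collision-free schedule stays collision-free.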
Improving Multilingual Named Entity Recognition with Wikipedia Entity Type Mapping
The state-of-the-art named entity recognition (NER) systems are statistical
machine learning models that have strong generalization capability (i.e., can
recognize unseen entities that do not appear in training data) based on lexical
and contextual information. However, such a model could still make mistakes if
its features favor a wrong entity type. In this paper, we utilize Wikipedia as
an open knowledge base to improve multilingual NER systems. Central to our
approach is the construction of high-accuracy, high-coverage multilingual
Wikipedia entity type mappings. These mappings are built from weakly annotated
data and can be extended to new languages with no human annotation or
language-dependent knowledge involved. Based on these mappings, we develop
several approaches to improve an NER system. We evaluate the performance of the
approaches via experiments on NER systems trained for 6 languages. Experimental
results show that the proposed approaches are effective in improving the
accuracy of such systems on unseen entities, especially when a system is
applied to a new domain or it is trained with little training data (up to 18.3
F1 score improvement).
Comment: 11 pages, Conference on Empirical Methods in Natural Language
Processing (EMNLP), 201
Learning Loosely Connected Markov Random Fields
We consider the structure learning problem for graphical models that we call
loosely connected Markov random fields, in which the number of short paths
between any pair of nodes is small, and present a new conditional independence
test based algorithm for learning the underlying graph structure. The novel
maximization step in our algorithm ensures that the true edges are detected
correctly even when there are short cycles in the graph. The number of samples
required by our algorithm is C*log p, where p is the size of the graph and the
constant C depends on the parameters of the model. We show that several
previously studied models are examples of loosely connected Markov random
fields, and our algorithm achieves the same or lower computational complexity
than the previously designed algorithms for individual cases. We also get new
results for more general graphical models, in particular, our algorithm learns
general Ising models on the Erdos-Renyi random graph G(p, c/p) correctly with
running time O(np^5).
Comment: 45 pages, minor revisio
Fast Mixing of Parallel Glauber Dynamics and Low-Delay CSMA Scheduling
Glauber dynamics is a powerful tool to generate randomized, approximate
solutions to combinatorially difficult problems. It has been used to analyze
and design distributed CSMA (Carrier Sense Multiple Access) scheduling
algorithms for multi-hop wireless networks. In this paper we derive bounds on
the mixing time of a generalization of Glauber dynamics where multiple links
are allowed to update their states in parallel and the fugacity of each link
can be different. The results can be used to prove that the average queue
length (and hence, the delay) under the parallel Glauber dynamics based CSMA
grows polynomially in the number of links for wireless networks with
bounded-degree interference graphs when the arrival rate lies in a fraction of
the capacity region. We also show that in specific network topologies, the
low-delay capacity region can be further improved.
Comment: 12 pages
Stochastic Behavior of the Nonnegative Least Mean Fourth Algorithm for Stationary Gaussian Inputs and Slow Learning
Some system identification problems impose nonnegativity constraints on the
parameters to estimate due to inherent physical characteristics of the unknown
system. The nonnegative least-mean-square (NNLMS) algorithm and its variants
allow this problem to be addressed in an online manner. A nonnegative least mean
fourth (NNLMF) algorithm has been recently proposed to improve the performance
of these algorithms in cases where the measurement noise is not Gaussian. This
paper provides a first theoretical analysis of the stochastic behavior of the
NNLMF algorithm for stationary Gaussian inputs and slow learning. Simulation
results illustrate the accuracy of the proposed analysis.
Comment: 11 pages, 8 figures, submitted for publication
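The abstract does not state the update equations. As a hedged illustration, the sketch below assumes the commonly cited forms, in which each weight is scaled by its own current value so that small steps cannot push it below zero: w_i(n+1) = w_i(n) + eta * w_i(n) * x_i(n) * e(n) for NNLMS, and the same expression with e(n)^3 for NNLMF. The function name and its `power` parameter are hypothetical and chosen only for this example.

```python
def nonnegative_step(w, x, d, eta, power=3):
    """One update of the assumed NNLMS (power=1) or NNLMF (power=3) form.

    w   : current weight estimates (list of floats, expected nonnegative)
    x   : current input vector (list of floats, same length as w)
    d   : desired (reference) output sample
    eta : step size
    Returns the updated weights and the estimation error e = d - w.x
    """
    y = sum(wi * xi for wi, xi in zip(w, x))   # filter output
    e = d - y                                  # estimation error
    # Each correction is multiplied by the weight itself, so a weight that
    # is exactly zero stays at zero.
    w_new = [wi + eta * wi * xi * e ** power for wi, xi in zip(w, x)]
    return w_new, e

# Usage: one NNLMF step and one NNLMS step from the same starting point.
w_lmf, err = nonnegative_step([1.0, 2.0], [1.0, -1.0], d=0.0, eta=0.1)
w_lms, _ = nonnegative_step([1.0, 2.0], [1.0, -1.0], d=0.0, eta=0.1, power=1)
```

Nonnegativity is only preserved when the step size is small enough that eta * x_i * e^power stays above -1, which is consistent with the slow-learning regime the abstract refers to.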
Cooperative non-orthogonal multiple access in cognitive radio
This letter studies the application of non-orthogonal multiple access to a downlink cognitive radio system (termed CR-NOMA). A new cooperative transmission scheme is proposed to exploit the inherent spatial diversity offered by the CR-NOMA system. Closed-form analytical results show that the cooperative transmission scheme performs better when more secondary users participate in relaying, which helps achieve the maximum diversity order at the secondary user and a diversity order of two at the primary user. Simulations validate the performance of the proposed scheme and the accuracy of the analytical results.
Application of non-orthogonal multiple access in cooperative spectrum-sharing networks over Nakagami-m fading channels
This paper proposes a novel non-orthogonal multiple access (NOMA)-based cooperative transmission scheme for a spectrum-sharing cognitive radio network, whereby a secondary transmitter (ST) serves as a relay and helps transmit the primary and secondary messages simultaneously using NOMA signaling. This cooperation is particularly useful when the ST has good channel conditions to a primary receiver but lacks radio spectrum. To evaluate the performance of the proposed scheme, the outage probability and system throughput for the primary and secondary networks are derived in closed form. Simulation results demonstrate superior performance gains for both networks thanks to the proposed NOMA-based cooperative transmission scheme. It is also shown that NOMA outperforms conventional orthogonal multiple access and achieves better spectrum utilization.
