    Weakly Supervised Cross-Lingual Named Entity Recognition via Effective Annotation and Representation Projection

    State-of-the-art named entity recognition (NER) systems are supervised machine learning models that require large amounts of manually annotated data to achieve high accuracy. However, manual annotation of NER data is expensive and time-consuming, and can be quite difficult for a new language. In this paper, we present two weakly supervised approaches for cross-lingual NER with no human annotation in a target language. The first approach is to create automatically labeled NER data for a target language via annotation projection on comparable corpora, where we develop a heuristic scheme that effectively selects good-quality projection-labeled data from noisy data. The second approach is to project distributed representations of words (word embeddings) from a target language to a source language, so that the source-language NER system can be applied to the target language without re-training. We also design two co-decoding schemes that effectively combine the outputs of the two projection-based approaches. We evaluate the performance of the proposed approaches on both in-house and open NER data for several target languages. The results show that the combined systems outperform three other weakly supervised approaches on the CoNLL data.
    Comment: 11 pages, The 55th Annual Meeting of the Association for Computational Linguistics (ACL), 201
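
The annotation projection idea described in this abstract can be illustrated with a minimal sketch. It assumes word alignments are already available as (source index, target index) pairs and uses plain entity-type tags rather than a BIO scheme. The function names and the coverage-based filter are hypothetical stand-ins; the abstract does not specify the paper's own selection heuristic.

```python
def project_tags(source_tags, alignment, target_len):
    """Copy entity tags from source tokens to their aligned target tokens."""
    target_tags = ["O"] * target_len  # "O" marks a non-entity token
    for src_idx, tgt_idx in alignment:
        if source_tags[src_idx] != "O":
            target_tags[tgt_idx] = source_tags[src_idx]
    return target_tags


def entity_coverage(source_tags, alignment):
    """Fraction of source entity tokens that have at least one alignment link.

    Hypothetical quality score, used here only to illustrate filtering
    noisy projections; it is not the heuristic from the paper.
    """
    entity_positions = {i for i, tag in enumerate(source_tags) if tag != "O"}
    if not entity_positions:
        return 1.0
    aligned = {src_idx for src_idx, _ in alignment}
    return len(entity_positions & aligned) / len(entity_positions)


# Toy example: "Obama visited Paris" -> "Obama a visité Paris"
source_tags = ["PER", "O", "LOC"]
alignment = [(0, 0), (1, 2), (2, 3)]
target_tags = project_tags(source_tags, alignment, target_len=4)
keep = entity_coverage(source_tags, alignment) >= 0.9  # assumed threshold
```

Sentences whose coverage falls below the assumed threshold would be dropped from the automatically labeled training set.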

    Q-CSMA: Queue-Length Based CSMA/CA Algorithms for Achieving Maximum Throughput and Low Delay in Wireless Networks

    Recently, it has been shown that CSMA-type random access algorithms can achieve the maximum possible throughput in ad hoc wireless networks. However, these algorithms assume an idealized continuous-time CSMA protocol where collisions can never occur. In addition, simulation results indicate that the delay performance of these algorithms can be quite bad. On the other hand, although some simple heuristics (such as distributed approximations of greedy maximal scheduling) can yield much better delay performance for a large set of arrival rates, they may only achieve a fraction of the capacity region in general. In this paper, we propose a discrete-time version of the CSMA algorithm. Central to our results is a discrete-time distributed randomized algorithm which is based on a generalization of the so-called Glauber dynamics from statistical physics, where multiple links are allowed to update their states in a single time slot. The algorithm generates collision-free transmission schedules while explicitly taking collisions into account during the control phase of the protocol, thus relaxing the perfect CSMA assumption. More importantly, the algorithm allows us to incorporate mechanisms which lead to very good delay performance while retaining the throughput-optimality property. It also resolves the hidden and exposed terminal problems associated with wireless networks.
    Comment: 12 pages
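
A rough sketch of one time slot of a parallel Glauber-dynamics style update, as described in this abstract, might look as follows. The contention rule in the control phase (each link signals intent with probability `intent_prob` and backs off if a conflicting neighbor also signals) and the per-link `activation_prob` are simplifying assumptions made for illustration. In a queue-length based variant the activation probabilities would be derived from queue lengths, which is not modeled here.

```python
import random


def q_csma_step(adj, state, activation_prob, rng, intent_prob=0.5):
    """One slot of a parallel Glauber-style update on an interference graph.

    adj maps each link to the set of links it conflicts with;
    state maps each link to 0 (idle) or 1 (transmitting).
    """
    # Control phase: links contend; colliding intents are discarded, so
    # the resulting decision set contains no two conflicting links.
    intents = {i for i in adj if rng.random() < intent_prob}
    decision = {i for i in intents if not (adj[i] & intents)}

    # Only links in the decision set may change state in this slot.
    new_state = dict(state)
    for i in decision:
        if any(state[j] for j in adj[i]):
            new_state[i] = 0  # a conflicting link was active: stay silent
        else:
            new_state[i] = 1 if rng.random() < activation_prob[i] else 0
    return new_state


def is_collision_free(adj, state):
    """True if no two conflicting links are active at the same time."""
    return all(not (state[i] and state[j]) for i in adj for j in adj[i])


# Four links on a ring: each link conflicts with its two neighbors.
adj = {0: {1, 3}, 1: {0, 2}, 2: {1, 3}, 3: {2, 0}}
rng = random.Random(0)
state = {i: 0 for i in adj}
probs = {i: 0.8 for i in adj}  # assumed fixed activation probabilities
for _ in range(100):
    state = q_csma_step(adj, state, probs, rng)
```

Because the decision set is collision-free and an updating link only activates when none of its conflicting links was active, a collision-free schedule stays collision-free after every slot.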

    Improving Multilingual Named Entity Recognition with Wikipedia Entity Type Mapping

    The state-of-the-art named entity recognition (NER) systems are statistical machine learning models that have strong generalization capability (i.e., can recognize unseen entities that do not appear in training data) based on lexical and contextual information. However, such a model could still make mistakes if its features favor a wrong entity type. In this paper, we utilize Wikipedia as an open knowledge base to improve multilingual NER systems. Central to our approach is the construction of high-accuracy, high-coverage multilingual Wikipedia entity type mappings. These mappings are built from weakly annotated data and can be extended to new languages with no human annotation or language-dependent knowledge involved. Based on these mappings, we develop several approaches to improve an NER system. We evaluate the performance of the approaches via experiments on NER systems trained for 6 languages. Experimental results show that the proposed approaches are effective in improving the accuracy of such systems on unseen entities, especially when a system is applied to a new domain or it is trained with little training data (up to 18.3 F1 score improvement).
    Comment: 11 pages, Conference on Empirical Methods in Natural Language Processing (EMNLP), 201

    Learning Loosely Connected Markov Random Fields

    We consider the structure learning problem for graphical models that we call loosely connected Markov random fields, in which the number of short paths between any pair of nodes is small, and present a new conditional independence test based algorithm for learning the underlying graph structure. The novel maximization step in our algorithm ensures that the true edges are detected correctly even when there are short cycles in the graph. The number of samples required by our algorithm is C log p, where p is the size of the graph and the constant C depends on the parameters of the model. We show that several previously studied models are examples of loosely connected Markov random fields, and our algorithm achieves the same or lower computational complexity than the previously designed algorithms for individual cases. We also obtain new results for more general graphical models; in particular, our algorithm learns general Ising models on the Erdos-Renyi random graph G(p, c/p) correctly with running time O(np^5).
    Comment: 45 pages, minor revision

    Fast Mixing of Parallel Glauber Dynamics and Low-Delay CSMA Scheduling

    Glauber dynamics is a powerful tool to generate randomized, approximate solutions to combinatorially difficult problems. It has been used to analyze and design distributed CSMA (Carrier Sense Multiple Access) scheduling algorithms for multi-hop wireless networks. In this paper we derive bounds on the mixing time of a generalization of Glauber dynamics where multiple links are allowed to update their states in parallel and the fugacity of each link can be different. The results can be used to prove that the average queue length (and hence, the delay) under the parallel Glauber dynamics based CSMA grows polynomially in the number of links for wireless networks with bounded-degree interference graphs when the arrival rate lies in a fraction of the capacity region. We also show that in specific network topologies, the low-delay capacity region can be further improved.
    Comment: 12 pages

    Stochastic Behavior of the Nonnegative Least Mean Fourth Algorithm for Stationary Gaussian Inputs and Slow Learning

    Some system identification problems impose nonnegativity constraints on the parameters to be estimated because of inherent physical characteristics of the unknown system. The nonnegative least-mean-square (NNLMS) algorithm and its variants allow this problem to be addressed in an online manner. A nonnegative least mean fourth (NNLMF) algorithm has recently been proposed to improve the performance of these algorithms in cases where the measurement noise is not Gaussian. This paper provides a first theoretical analysis of the stochastic behavior of the NNLMF algorithm for stationary Gaussian inputs and slow learning. Simulation results illustrate the accuracy of the proposed analysis.
    Comment: 11 pages, 8 figures, submitted for publication
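
For orientation, a minimal sketch of an NNLMF-style recursion is given below. It assumes the update form commonly stated in the NNLMS literature, w(n+1) = w(n) + η e(n)^3 D_x(n) w(n), where D_x(n) is the diagonal matrix built from the input vector; the abstract itself does not give the recursion. The toy system, the uniform (non-Gaussian) noise, and all parameter values are illustrative assumptions.

```python
import random


def nnlmf_step(w, x, d, step_size):
    """One NNLMF-style update: each weight is scaled by its own value,
    the matching input sample, and the cubed estimation error."""
    y = sum(wi * xi for wi, xi in zip(w, x))  # filter output
    e = d - y                                 # estimation error
    return [wi + step_size * wi * xi * e ** 3 for wi, xi in zip(w, x)]


def run_nnlmf(w_true, w_init, step_size, n_iter, seed=0):
    """Identify w_true from stationary Gaussian inputs and uniform noise."""
    rng = random.Random(seed)
    w = list(w_init)
    for _ in range(n_iter):
        x = [rng.gauss(0.0, 1.0) for _ in w]
        noise = rng.uniform(-0.1, 0.1)  # non-Gaussian measurement noise
        d = sum(wt * xi for wt, xi in zip(w_true, x)) + noise
        w = nnlmf_step(w, x, d, step_size)
    return w


# Small step size, matching the slow-learning regime named in the abstract.
w_final = run_nnlmf(
    w_true=[0.8, 0.3, 0.0],
    w_init=[0.5, 0.5, 0.5],
    step_size=0.002,
    n_iter=2000,
)
```

Because each correction is multiplied by the current weight, a weight that is exactly zero stays at zero, and a positive weight stays nonnegative as long as the step size is small enough that the per-sample scaling factor remains positive.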

    Cooperative non-orthogonal multiple access in cognitive radio

    This letter studies the application of non-orthogonal multiple access to a downlink cognitive radio (termed CR-NOMA) system. A new cooperative transmission scheme is proposed that aims to exploit the inherent spatial diversity offered by the CR-NOMA system. Closed-form analytical results are developed to show that the cooperative transmission scheme gives better performance when more secondary users participate in relaying, which helps achieve the maximum diversity order at the secondary user and a diversity order of two at the primary user. Simulations are performed to validate the performance of the proposed scheme and the accuracy of the analytical results.

    Application of non-orthogonal multiple access in cooperative spectrum-sharing networks over Nakagami-m fading channels

    This paper proposes a novel non-orthogonal multiple access (NOMA)-based cooperative transmission scheme for a spectrum-sharing cognitive radio network, whereby a secondary transmitter (ST) serves as a relay and helps transmit the primary and secondary messages simultaneously by employing NOMA signaling. This cooperation is particularly useful when the ST has good channel conditions to a primary receiver but lacks radio spectrum. To evaluate the performance of the proposed scheme, the outage probability and system throughput for the primary and secondary networks are derived in closed form. Simulation results demonstrate the superior performance gains for both networks thanks to the use of the proposed NOMA-based cooperative transmission scheme. It is also revealed that NOMA outperforms conventional orthogonal multiple access and achieves better spectrum utilization.
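
To make the idea of sending two messages at once concrete, here is a small sketch of generic two-user power-domain NOMA with successive interference cancellation (SIC). It is not the paper's exact relaying protocol. The power split `a_primary`, the transmit SNR `rho`, the channel power gains, and the target rates are all assumed inputs chosen for illustration.

```python
import math


def noma_sinrs(a_primary, rho, g_primary, g_secondary):
    """SINRs for a superposed primary + secondary signal.

    a_primary: fraction of transmit power given to the primary message
    rho: transmit SNR (linear scale)
    g_primary, g_secondary: channel power gains to the two receivers
    """
    a_secondary = 1.0 - a_primary
    # Primary receiver decodes its message, treating the other as noise.
    sinr_primary_rx = (a_primary * rho * g_primary
                       / (a_secondary * rho * g_primary + 1.0))
    # Secondary receiver first decodes the primary message (SIC step)...
    sinr_sic = (a_primary * rho * g_secondary
                / (a_secondary * rho * g_secondary + 1.0))
    # ...then removes it and decodes its own message without interference.
    snr_secondary = a_secondary * rho * g_secondary
    return sinr_primary_rx, sinr_sic, snr_secondary


def in_outage(sinr, target_rate):
    """Outage occurs when the achievable rate falls below the target rate."""
    return math.log2(1.0 + sinr) < target_rate


sinr_p, sinr_sic, snr_s = noma_sinrs(
    a_primary=0.8, rho=10.0, g_primary=1.0, g_secondary=1.0
)
primary_outage = in_outage(sinr_p, target_rate=1.0)
secondary_outage = in_outage(sinr_sic, 1.0) or in_outage(snr_s, 2.0)
```

Averaging such outage indicators over random channel gains (for example, drawn from a Nakagami-m fading model) would give a Monte Carlo estimate to compare against closed-form outage expressions.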