177 research outputs found

    Exploiting Multiple Embeddings for Chinese Named Entity Recognition

    Identifying the named entities mentioned in text enriches many downstream semantic applications. However, due to the predominant use of colloquial language in microblogs, named entity recognition (NER) in Chinese microblogs suffers a significant performance deterioration compared with NER on formal Chinese corpora. In this paper, we propose a simple yet effective neural framework, named ME-CNER, to derive character-level embeddings for NER in Chinese text. A character embedding is derived with rich semantic information harnessed at multiple granularities, ranging from the radical and character levels to the word level. The experimental results demonstrate that the proposed approach achieves a large performance improvement on the Weibo dataset and comparable performance on the MSRA news dataset, at a lower computational cost than the existing state-of-the-art alternatives. (Comment: accepted at CIKM 201)
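
    The multi-granularity idea can be illustrated as a character representation built by concatenating radical-, character- and word-level embeddings. Below is a minimal PyTorch sketch under assumed vocabulary sizes and dimensions; it is not the authors' ME-CNER implementation.

    import torch
    import torch.nn as nn

    class MultiGranularityCharEmbedding(nn.Module):
        """Concatenate radical-, character- and word-level embeddings per character.
        Hypothetical sketch: vocabulary sizes and dimensions are illustrative."""
        def __init__(self, n_radicals, n_chars, n_words, d_rad=32, d_char=64, d_word=128):
            super().__init__()
            self.rad_emb = nn.Embedding(n_radicals, d_rad)
            self.char_emb = nn.Embedding(n_chars, d_char)
            self.word_emb = nn.Embedding(n_words, d_word)
            self.proj = nn.Linear(d_rad + d_char + d_word, d_char)

        def forward(self, radical_ids, char_ids, word_ids):
            # All inputs: (batch, seq_len); word_ids repeats the id of the word each
            # character belongs to, so every character also carries word-level context.
            x = torch.cat([self.rad_emb(radical_ids),
                           self.char_emb(char_ids),
                           self.word_emb(word_ids)], dim=-1)
            return self.proj(x)   # (batch, seq_len, d_char), fed to a NER tagger

    # toy usage
    emb = MultiGranularityCharEmbedding(n_radicals=300, n_chars=5000, n_words=50000)
    rad = torch.randint(0, 300, (2, 10))
    ch = torch.randint(0, 5000, (2, 10))
    wd = torch.randint(0, 50000, (2, 10))
    print(emb(rad, ch, wd).shape)  # torch.Size([2, 10, 64])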

    A word-building method based on neural network for text classification

    Text classification is a foundational task in many natural language processing applications. Traditional text classifiers take words as the basic units and first run a pre-training step (such as word2vec) to generate word vectors directly. However, none of them consider the information contained in word structure, which has been shown to be helpful for text classification. In this paper, we propose a word-building method based on a neural network model that decomposes a Chinese word into a sequence of radicals and learns structural information from these radical-level features, which is a key difference from existing models. A convolutional neural network is then applied to extract word-structure information from the radical sequence and generate a word vector, and a long short-term memory network generates the sentence vector used for prediction. The experimental results show that our model outperforms existing models on the Chinese dataset. Our model is also applicable to English, where a word can be decomposed down to the character level, demonstrating its generalisation ability; the experimental results show that it also outperforms the other models on the English dataset.
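
    A compact sketch of the described pipeline, under assumed sizes: a CNN pools over each word's radical sequence to form a word vector, and an LSTM over the word vectors yields a sentence vector for classification. This is illustrative, not the paper's exact architecture.

    import torch
    import torch.nn as nn

    class RadicalWordSentenceClassifier(nn.Module):
        """CNN over radicals -> word vectors; LSTM over words -> sentence vector."""
        def __init__(self, n_radicals, n_classes, d_rad=32, d_word=100, d_sent=128):
            super().__init__()
            self.rad_emb = nn.Embedding(n_radicals, d_rad, padding_idx=0)
            self.conv = nn.Conv1d(d_rad, d_word, kernel_size=3, padding=1)
            self.lstm = nn.LSTM(d_word, d_sent, batch_first=True)
            self.out = nn.Linear(d_sent, n_classes)

        def forward(self, radical_ids):
            # radical_ids: (batch, n_words, n_radicals_per_word)
            b, w, r = radical_ids.shape
            x = self.rad_emb(radical_ids.view(b * w, r))      # (b*w, r, d_rad)
            x = self.conv(x.transpose(1, 2))                  # (b*w, d_word, r)
            word_vecs = x.max(dim=2).values.view(b, w, -1)    # max-pool over radicals
            _, (h, _) = self.lstm(word_vecs)                  # final hidden = sentence vector
            return self.out(h[-1])                            # (batch, n_classes)

    model = RadicalWordSentenceClassifier(n_radicals=300, n_classes=10)
    logits = model(torch.randint(1, 300, (4, 20, 6)))         # 4 sentences, 20 words each
    print(logits.shape)                                       # torch.Size([4, 10])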

    End-to-end Neural Information Retrieval

    In recent years we have witnessed many successes of neural networks in the information retrieval community when large amounts of labeled data are available. Yet it remains unknown whether the same techniques can be easily adapted to searching social media posts, where the text is much shorter. In addition, we find that most neural information retrieval models are compared against weak baselines. In this thesis, we build an end-to-end neural information retrieval system using two toolkits: Anserini and MatchZoo. We also propose a novel neural model, named MP-HCNN, to capture the relevance of short and varied tweet text. With the information retrieval toolkit Anserini, we build a reranking architecture on top of several traditional information retrieval models (QL, QL+RM3, BM25, BM25+RM3), including a strong pseudo-relevance feedback baseline: RM3. With the neural network toolkit MatchZoo, we offer an empirical study of a number of popular neural ranking models (DSSM, CDSSM, KNRM, DUET, DRMM). Experiments on datasets from the TREC Microblog Tracks and the TREC Robust Retrieval Track show that most existing neural network models cannot beat a simple language model baseline. However, DRMM provides a significant improvement over the pseudo-relevance feedback baseline (BM25+RM3) on the Robust04 dataset, and DUET, DRMM and MP-HCNN provide significant improvements over the baseline (QL+RM3) on the microblog datasets. Further detailed analyses suggest that searching social media and searching news articles exhibit several different characteristics that require customized model design, shedding light on future directions.
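
    The reranking setup can be sketched as follows: a first-stage retriever (e.g. BM25+RM3) supplies scored candidates, and a neural model rescores them, with the two scores interpolated. The neural_score callable here is a placeholder assumption standing in for models such as DRMM; this is not Anserini's or MatchZoo's API.

    from typing import Callable, List, Tuple

    def rerank(query: str,
               candidates: List[Tuple[str, str, float]],   # (doc_id, text, first_stage_score)
               neural_score: Callable[[str, str], float],
               alpha: float = 0.5,
               k: int = 10) -> List[Tuple[str, float]]:
        """Interpolate first-stage and neural scores and return the top-k documents."""
        rescored = [(doc_id, alpha * s + (1 - alpha) * neural_score(query, text))
                    for doc_id, text, s in candidates]
        return sorted(rescored, key=lambda x: x[1], reverse=True)[:k]

    # toy usage with a trivial lexical-overlap scorer standing in for a neural model
    overlap = lambda q, d: len(set(q.split()) & set(d.split())) / (len(d.split()) + 1)
    cands = [("d1", "neural ranking for tweets", 12.3), ("d2", "weather report", 9.8)]
    print(rerank("neural tweet search", cands, overlap, k=2))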

    Detecting Traffic Information From Social Media Texts With Deep Learning Approaches

    Mining traffic-relevant information from social media data has become an emerging topic due to the real-time and ubiquitous nature of social media. In this paper, we focus on a specific problem in social media mining: extracting traffic-relevant microblogs from Sina Weibo, a Chinese microblogging platform. We cast it as a machine learning problem of short text classification. First, we apply the continuous bag-of-words model to learn word embedding representations from a dataset of three billion microblogs. Compared with the traditional one-hot representation of words, word embeddings capture semantic similarity between words and have proved effective in natural language processing tasks. Next, we propose using convolutional neural networks (CNNs), long short-term memory (LSTM) models and their combination, LSTM-CNN, to extract traffic-relevant microblogs with the learned word embeddings as inputs. We compare the proposed methods with competitive baselines, including a support vector machine (SVM) based on bag-of-n-gram features, an SVM based on word vector features, and a multi-layer perceptron based on word vector features. Experiments show the effectiveness of the proposed deep learning approaches.
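
    One way to realize the LSTM-CNN combination described above is sketched below in PyTorch: pre-trained CBOW vectors would be loaded into the embedding layer, an LSTM encodes the sequence, and a CNN with max-pooling produces the classification features. Dimensions are illustrative assumptions.

    import torch
    import torch.nn as nn

    class LSTMCNNClassifier(nn.Module):
        """LSTM over embeddings, CNN + global max-pooling, linear classifier."""
        def __init__(self, vocab_size, emb_dim=100, hidden=128, n_filters=100, n_classes=2):
            super().__init__()
            self.emb = nn.Embedding(vocab_size, emb_dim, padding_idx=0)
            self.lstm = nn.LSTM(emb_dim, hidden, batch_first=True, bidirectional=True)
            self.conv = nn.Conv1d(2 * hidden, n_filters, kernel_size=3, padding=1)
            self.fc = nn.Linear(n_filters, n_classes)

        def forward(self, token_ids):                       # (batch, seq_len)
            x, _ = self.lstm(self.emb(token_ids))           # (batch, seq_len, 2*hidden)
            x = torch.relu(self.conv(x.transpose(1, 2)))    # (batch, n_filters, seq_len)
            x = x.max(dim=2).values                         # global max-pool over time
            return self.fc(x)

    model = LSTMCNNClassifier(vocab_size=30000)
    # model.emb.weight.data.copy_(torch.tensor(cbow_vectors))  # assumed pre-trained CBOW matrix
    print(model(torch.randint(1, 30000, (8, 40))).shape)        # torch.Size([8, 2])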

    News Text Classification Based on an Improved Convolutional Neural Network

    With the explosive growth of Internet news media and the disorganized state of news texts, this paper puts forward an automatic news classification model based on a Convolutional Neural Network (CNN). In the model, Word2vec is first combined with Latent Dirichlet Allocation (LDA) to generate an effective text feature representation. An attention mechanism is then added so that key features receive higher attention weights, leading to more accurate predictions. The results show that the model's precision, recall and F1 score reach 96.4%, 95.9% and 96.2% respectively, indicating that the improved CNN can extract deep semantic features of the text and provides strong support for building an efficient and accurate news text classification model.
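
    A minimal sketch of this kind of design, under assumed dimensions and a simple fusion scheme: each token's word2vec vector is concatenated with an LDA topic vector for the document, a CNN extracts local features, and an attention layer weights them before classification. It is illustrative rather than the paper's exact model.

    import torch
    import torch.nn as nn

    class AttentiveCNNClassifier(nn.Module):
        """Word2vec + LDA feature fusion, CNN feature extraction, attention pooling."""
        def __init__(self, vocab_size, n_classes, emb_dim=100, n_topics=50, n_filters=128):
            super().__init__()
            self.emb = nn.Embedding(vocab_size, emb_dim)          # initialized from word2vec
            self.conv = nn.Conv1d(emb_dim + n_topics, n_filters, kernel_size=3, padding=1)
            self.att = nn.Linear(n_filters, 1)                    # scalar attention score per position
            self.fc = nn.Linear(n_filters, n_classes)

        def forward(self, token_ids, topic_vec):
            # token_ids: (batch, seq_len); topic_vec: (batch, n_topics) from LDA
            seq_len = token_ids.size(1)
            topics = topic_vec.unsqueeze(1).expand(-1, seq_len, -1)
            x = torch.cat([self.emb(token_ids), topics], dim=-1)  # fuse word2vec + LDA features
            h = torch.relu(self.conv(x.transpose(1, 2))).transpose(1, 2)  # (batch, seq_len, n_filters)
            w = torch.softmax(self.att(h), dim=1)                 # attention over positions
            return self.fc((w * h).sum(dim=1))                    # attention-weighted pooling

    model = AttentiveCNNClassifier(vocab_size=30000, n_classes=10)
    logits = model(torch.randint(0, 30000, (4, 60)), torch.rand(4, 50))
    print(logits.shape)                                           # torch.Size([4, 10])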

    A history and theory of textual event detection and recognition


    Char-RNN and Active Learning for Hashtag Segmentation

    We explore the ability of a character-level recurrent neural network (char-RNN) to segment hashtags. Our approach is as follows: we generate a synthetic training dataset from frequent n-grams that satisfy predefined morpho-syntactic patterns, which avoids any manual annotation, and an active learning strategy limits the training dataset by selecting an informative subset. The approach does not require any language-specific settings and is evaluated on two languages that differ in degree of inflection. (Comment: to appear in Cicling201)
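
    Hashtag segmentation with a char-RNN can be framed as per-character boundary tagging, as in the sketch below. The character vocabulary, dimensions, and the segment helper are illustrative assumptions, not the paper's implementation.

    import torch
    import torch.nn as nn

    class CharRNNSegmenter(nn.Module):
        """Bidirectional char-RNN that predicts, for each character, whether a new word starts there."""
        def __init__(self, n_chars, emb_dim=32, hidden=64):
            super().__init__()
            self.emb = nn.Embedding(n_chars, emb_dim)
            self.rnn = nn.LSTM(emb_dim, hidden, batch_first=True, bidirectional=True)
            self.out = nn.Linear(2 * hidden, 2)        # boundary / no boundary

        def forward(self, char_ids):                   # (batch, hashtag_length)
            h, _ = self.rnn(self.emb(char_ids))
            return self.out(h)                         # (batch, length, 2) per-character logits

    def segment(hashtag, model, char2id):
        ids = torch.tensor([[char2id.get(c, 0) for c in hashtag]])
        tags = model(ids).argmax(dim=-1)[0].tolist()   # 1 = a new word starts at this character
        words, cur = [], ""
        for c, t in zip(hashtag, tags):
            if t == 1 and cur:
                words.append(cur)
                cur = ""
            cur += c
        return words + [cur] if cur else words

    model = CharRNNSegmenter(n_chars=128)              # untrained, for shape checking only
    print(model(torch.randint(0, 128, (1, 12))).shape) # torch.Size([1, 12, 2])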