35 research outputs found

    TransPrompt v2: A Transferable Prompting Framework for Cross-task Text Classification

    Full text link
    Text classification is one of the most important tasks in natural language processing (NLP). Recent advances with pre-trained language models (PLMs) have shown remarkable success on this task. However, the strong results obtained by PLMs depend heavily on large amounts of task-specific labeled data, which may not be feasible in many application scenarios due to data access and privacy constraints. The recently proposed prompt-based fine-tuning paradigm improves the performance of PLMs for few-shot text classification with task-specific templates. Yet it is unclear how prompting knowledge can be transferred across tasks for mutual reinforcement. We propose TransPrompt v2, a novel transferable prompting framework for few-shot learning across similar or distant text classification tasks. For learning across similar tasks, we employ a multi-task meta-knowledge acquisition (MMA) procedure to train a meta-learner that captures cross-task transferable knowledge. For learning across distant tasks, we further inject task type descriptions into the prompt and capture the intra-type and inter-type prompt embeddings among multiple distant tasks. Additionally, two de-biasing techniques are designed to make the trained meta-learner more task-agnostic and unbiased towards any task. The meta-learner can then be adapted to each specific task with better parameter initialization. Extensive experiments show that TransPrompt v2 outperforms strong single-task and cross-task baselines over multiple NLP tasks and datasets. We further show that the meta-learner can effectively improve the performance of PLMs on previously unseen tasks. In addition, TransPrompt v2 also outperforms strong fine-tuning baselines when learning with full training sets.
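
    As a minimal illustration of injecting a task type description into a prompt, a sketch follows; the template, task names, and type descriptions are hypothetical, not the paper's actual prompts.

```python
# Hypothetical task-to-type mapping; the paper groups distant tasks by type.
TASK_TYPES = {"sst2": "sentiment", "mnli": "inference"}

def build_prompt(text, task, template="Task type: {ttype}. {text} It was [MASK]."):
    """Prepend the task type description so that distant tasks of the
    same type can share prompt knowledge; [MASK] is the slot the PLM fills."""
    return template.format(ttype=TASK_TYPES[task], text=text)
```

    The prompt embedding machinery (intra-type and inter-type) would operate on top of such type-tagged templates.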

    An information theoretic approach to sentiment polarity classification

    Full text link
    Sentiment classification is the task of classifying documents according to their overall sentiment inclination. It is important and popular in many web applications, such as credibility analysis of news sites on the Web, recommendation systems, and mining online discussions. The vector space model is widely applied to modeling documents in supervised sentiment classification, in which the feature presentation (including feature type and weighting function) is crucial for classification accuracy. Traditional feature presentation methods from text categorization do not perform well in sentiment classification, because the ways sentiment is expressed are more subtle. We analyze the relationships of terms with sentiment labels based on information theory, and propose a method that applies an information theoretic approach to sentiment classification of documents. In this paper, we first adopt mutual information to quantify the sentiment polarities of terms in a document. Then the terms are weighted in the vector space based on both their sentiment scores and their contribution to the document. We perform extensive experiments with SVM on sets of multiple product reviews, and the experimental results show our approach is more effective than the traditional ones.
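
    The term-polarity quantification step can be sketched as a generic mutual-information weighting in the spirit of the abstract; this is not the paper's exact formulation, and all function names are illustrative.

```python
import math
from collections import Counter

def term_sentiment_mi(docs, labels):
    """Mutual information I(term; label) over a labeled corpus.

    docs: list of token lists; labels: parallel list of class labels.
    Per-document presence/absence of a term is the binary term variable.
    Returns {term: MI score in bits}.
    """
    n = len(docs)
    label_counts = Counter(labels)
    vocab = {t for d in docs for t in set(d)}
    mi = {}
    for t in vocab:
        present = [i for i, d in enumerate(docs) if t in d]
        p_t = len(present) / n
        score = 0.0
        for c, n_c in label_counts.items():
            p_c = n_c / n
            n_tc = sum(1 for i in present if labels[i] == c)  # present AND label c
            for joint, p_term in ((n_tc / n, p_t), ((n_c - n_tc) / n, 1 - p_t)):
                if joint > 0 and p_term > 0:
                    score += joint * math.log2(joint / (p_term * p_c))
        mi[t] = score
    return mi

def weight_document(doc, mi):
    """Weight raw term frequencies by each term's MI with the labels."""
    tf = Counter(doc)
    return {t: f * mi.get(t, 0.0) for t, f in tf.items()}
```

    A term that appears only in one sentiment class gets maximal weight, while a term spread evenly across classes is driven toward zero, which is the intuition behind sentiment-aware term weighting.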

    Measurement of the vertical atmospheric density profile from the X-ray Earth occultation of the Crab Nebula with Insight-HXMT

    Full text link
    In this paper, the X-ray Earth occultation (XEO) of the Crab Nebula is investigated using the Hard X-ray Modulation Telescope (Insight-HXMT). The pointed observation data from 30 September 2018 recorded by the Low Energy X-ray telescope (LE) of Insight-HXMT are selected and analyzed. The extinction lightcurves and spectra during the X-ray Earth occultation process are extracted. A forward model for the XEO lightcurve is established and the theoretical lightcurve signal is predicted. The atmospheric density model is built with a scale factor applied to the commonly used MSIS density profile within a certain altitude range. A Bayesian data analysis method is developed for the XEO lightcurve modeling and the atmospheric density retrieval. The posterior probability distribution of the model parameters is derived through the Markov Chain Monte Carlo (MCMC) algorithm with the NRLMSISE-00 model and the NRLMSIS 2.0 model as basis functions, and the best-fit density profiles are retrieved respectively. It is found that in the altitude range of 105--200 km, the retrieved density profile is 88.8% of the density of NRLMSISE-00 and 109.7% of the density of NRLMSIS 2.0, obtained by fitting the lightcurve in the energy range of 1.0--2.5 keV with the XEOS method. In the altitude range of 95--125 km, the retrieved density profile is 81.0% of the density of NRLMSISE-00 and 92.3% of the density of NRLMSIS 2.0, from the lightcurve in the energy range of 2.5--6.0 keV. In the altitude range of 85--110 km, the retrieved density profile is 87.7% of the density of NRLMSISE-00 and 101.4% of the density of NRLMSIS 2.0, from the lightcurve in the energy range of 6.0--10.0 keV. This study demonstrates that XEOS with the X-ray astronomical satellite Insight-HXMT can provide an approach for the study of the upper atmosphere. (Comment: 31 pages, 15 figures, 5 tables; accepted for publication in Atmospheric Measurement Techniques.)
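
    A toy version of the retrieval idea is a Metropolis sampler estimating a single scale factor on a fixed density profile from a synthetic extinction-like signal; the forward model below is purely illustrative and far simpler than the paper's XEO model, and all names are assumptions.

```python
import math
import random

def log_likelihood(k, alts, model_density, obs, sigma):
    """Gaussian log-likelihood of an observed signal given scale factor k.

    Toy forward model: 'transmission' decays exponentially with
    k times the model density at each altitude.
    """
    ll = 0.0
    for h, d_obs in zip(alts, obs):
        pred = math.exp(-k * model_density(h))
        ll += -0.5 * ((d_obs - pred) / sigma) ** 2
    return ll

def metropolis(alts, model_density, obs, sigma,
               n_steps=20000, k0=1.0, step=0.05, seed=0):
    """Random-walk Metropolis sampling of the posterior over k."""
    rng = random.Random(seed)
    k, ll = k0, log_likelihood(k0, alts, model_density, obs, sigma)
    samples = []
    for _ in range(n_steps):
        k_new = k + rng.gauss(0.0, step)
        if k_new > 0:  # flat prior on k > 0
            ll_new = log_likelihood(k_new, alts, model_density, obs, sigma)
            if math.log(rng.random()) < ll_new - ll:
                k, ll = k_new, ll_new
        samples.append(k)
    return samples
```

    Discarding the first half of the chain as burn-in, the posterior mean of k recovers the true scale factor on synthetic data; the paper's analysis does the analogous fit with NRLMSISE-00 and NRLMSIS 2.0 as the base profiles.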

    Improving Hypernymy Prediction via Taxonomy Enhanced Adversarial Learning

    No full text
    Hypernymy is a basic semantic relation in computational linguistics that expresses the “is-a” relation between a generic concept and its specific instances, serving as the backbone of taxonomies and ontologies. Although several NLP tasks related to hypernymy prediction have been extensively addressed, few methods have fully exploited the large number of hypernymy relations in Web-scale taxonomies. In this paper, we introduce Taxonomy Enhanced Adversarial Learning (TEAL) for hypernymy prediction. We first propose an unsupervised measure, U-TEAL, to distinguish hypernymy from other semantic relations. It is implemented with a word embedding projection network distantly trained over a taxonomy. To address supervised hypernymy detection tasks, the supervised model S-TEAL and its improved version, the adversarial supervised model AS-TEAL, are further presented. Specifically, AS-TEAL employs a coupled adversarial training algorithm to transfer hierarchical knowledge in taxonomies to hypernymy prediction models. We conduct extensive experiments to confirm the effectiveness of TEAL over three standard NLP tasks: unsupervised hypernymy classification, supervised hypernymy detection, and graded lexical entailment. We also show that TEAL can be applied to non-English languages and can detect missing hypernymy relations in taxonomies.
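
    The distantly trained embedding projection behind U-TEAL can be sketched as a linear map fitted over taxonomy pairs; this toy version (pure-Python gradient descent, illustrative names and dimensions) scores candidate pairs by projected distance.

```python
def fit_projection(pairs, dim=2, lr=0.1, epochs=200):
    """Fit a linear map W so that W @ hypo ~= hyper over taxonomy pairs,
    via plain gradient descent on squared error."""
    W = [[0.0] * dim for _ in range(dim)]
    for _ in range(epochs):
        for x, y in pairs:
            pred = [sum(W[i][j] * x[j] for j in range(dim)) for i in range(dim)]
            err = [p - yi for p, yi in zip(pred, y)]
            for i in range(dim):
                for j in range(dim):
                    W[i][j] -= lr * 2 * err[i] * x[j]
    return W

def projection_distance(W, hypo, hyper):
    """Smaller distance -> the pair looks more like hypernymy."""
    dim = len(hypo)
    pred = [sum(W[i][j] * hypo[j] for j in range(dim)) for i in range(dim)]
    return sum((p - h) ** 2 for p, h in zip(pred, hyper)) ** 0.5
```

    In the full method the training pairs come distantly from a Web-scale taxonomy and the projection sits inside a deeper network; the sketch only shows the projection-and-score idea.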

    An Efficient Cooperative Framework for Multi-Query Processing over

    No full text
    XML is a de-facto standard for exchanging and presenting information on the Web. However, XML data is also recognized as verbose, since repeated tags and structures heavily inflate the size of the data. This verbosity gives rise to many challenges for conventional distributed database technologies. In this paper, we study the XML dissemination problem over the Internet, where the speed of information delivery can be rather slow in a server-client architecture consisting of a large number of geographically dispersed users who access a large amount of correlated XML information. The problem becomes more severe when the users access closely related XML fragments, in which case bandwidth usage is inefficient. In order to save bandwidth and process the queries efficiently, we propose an architecture that incorporates XML compression techniques and exploits the results of XPath containment. Within our framework, we demonstrate that the load on the server is reduced, the network bandwidth can be used more efficiently and, consequently, all clients as a whole benefit from the resulting cost savings.

    Scalable XSLT Evaluation

    No full text
    XSLT is an increasingly popular language for processing XML data and is widely supported by application platform software. However, little optimization effort has been made inside current XSLT processing engines: evaluating even a very simple XSLT program on a large XML document with a simple schema may result in extensive memory usage. In this paper, we present a novel notion of a Streaming Processing Model (SPM) to evaluate a subset of XSLT programs on XML documents, especially large ones. With SPM, an XSLT processor can transform an XML source document to other formats without requiring extra memory buffers. Therefore, our approach can not only handle large source documents but also produce large results. We demonstrate the advantages of the SPM approach with a performance study. Experimental results clearly confirm that SPM typically improves XSLT evaluation by a factor of 2 to 10 over existing approaches. Moreover, the SPM approach also features high scalability.
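
    The streaming idea can be illustrated with a generic SAX-style transform; this is not the paper's SPM, just a sketch of event-driven, constant-memory transformation, and the element names are hypothetical.

```python
import io
import xml.sax

class StreamingTransform(xml.sax.ContentHandler):
    """Emit output as parse events arrive, so memory use is independent
    of document size -- the core property behind streaming XSLT evaluation."""

    def __init__(self, out):
        super().__init__()
        self.out = out
        self._text = []

    def startElement(self, name, attrs):
        if name == "book":
            self.out.write("<li>")
        self._text = []

    def characters(self, content):
        self._text.append(content)

    def endElement(self, name):
        if name == "title":
            self.out.write("".join(self._text).strip())
        elif name == "book":
            self.out.write("</li>\n")

def transform(xml_text):
    """Transform <book><title>...</title></book> entries into list items."""
    out = io.StringIO()
    xml.sax.parseString(xml_text.encode("utf-8"), StreamingTransform(out))
    return out.getvalue()
```

    Because nothing outside the current element's text is buffered, the same handler processes a gigabyte-scale document in constant memory, which a tree-building processor cannot.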

    Uncertainty-Aware Self-Training for Low-Resource Neural Sequence Labeling

    No full text
    Neural sequence labeling (NSL) aims at assigning labels to input language tokens and covers a broad range of applications, such as named entity recognition (NER) and slot filling. However, the satisfying results achieved by traditional supervised approaches depend heavily on large amounts of human-annotated data, which may not be feasible in real-world scenarios due to data privacy and computational efficiency issues. This paper presents SeqUST, a novel uncertainty-aware self-training framework for NSL that addresses the labeled-data scarcity issue and effectively utilizes unlabeled data. Specifically, we incorporate Monte Carlo (MC) dropout in a Bayesian neural network (BNN) to perform uncertainty estimation at the token level, and then select reliable language tokens from unlabeled data based on the model's confidence and certainty. A well-designed masked sequence labeling task with a noise-robust loss supports robust training and aims to suppress the problem of noisy pseudo labels. In addition, we develop a Gaussian-based consistency regularization technique to further improve model robustness on Gaussian-distributed perturbed representations. This effectively alleviates the over-fitting dilemma originating from pseudo-labeled augmented data. Extensive experiments over six benchmarks demonstrate that our SeqUST framework effectively improves the performance of self-training, and consistently outperforms strong baselines by a large margin in low-resource scenarios.
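
    The token-selection step can be sketched as follows, assuming a stochastic model whose repeated calls differ because dropout stays active at inference; the thresholds and function names are illustrative, not the paper's.

```python
import random
import statistics

def mc_dropout_select(tokens, stochastic_model, n_passes=30,
                      min_conf=0.7, max_var=0.01, seed=0):
    """Select reliable pseudo-labeled tokens via MC dropout.

    stochastic_model(token, rng) -> list of class probabilities, noisy
    across calls because dropout is left on. A token is kept when the
    mean probability of its top class is high (confidence) and the
    variance of that probability across passes is low (certainty).
    """
    rng = random.Random(seed)
    selected = []
    for tok in tokens:
        passes = [stochastic_model(tok, rng) for _ in range(n_passes)]
        n_cls = len(passes[0])
        mean = [statistics.fmean(p[c] for p in passes) for c in range(n_cls)]
        top = max(range(n_cls), key=mean.__getitem__)
        var = statistics.pvariance([p[top] for p in passes])
        if mean[top] >= min_conf and var <= max_var:
            selected.append((tok, top))  # (token, pseudo-label)
    return selected
```

    Tokens failing either threshold are excluded from the pseudo-labeled training set, which is how uncertainty estimation suppresses noisy pseudo labels.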

    Sentiment Classification via Integrating Multiple Feature Presentations

    No full text
    In the bag-of-words framework, documents are converted into vectors according to predefined features together with weighting mechanisms. Since each feature presentation has its own characteristics, it is difficult to determine which one should be chosen for a specific domain, especially for users who are not familiar with the domain. This paper explores the integration of various feature presentations to improve classification accuracy. A general two-phase framework is proposed. In the first phase, we train multiple base classifiers with various vector spaces and use these classifiers to predict the class of each test sample. In the second phase, these predictions are combined into a final class via stacking with an SVM. The experimental results demonstrate the effectiveness of our method.
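
    A minimal sketch of the two-phase framework follows, with a perceptron standing in for the SVM meta-classifier of the paper; the base classifiers and data are toy examples.

```python
def train_perceptron(X, y, epochs=50, lr=0.1):
    """Phase 2: a linear meta-classifier (stand-in for the paper's SVM)
    trained on the base classifiers' outputs."""
    w = [0.0] * (len(X[0]) + 1)  # last weight is the bias
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            pred = 1 if sum(wj * xj for wj, xj in zip(w, xi + [1.0])) > 0 else 0
            err = yi - pred
            w = [wj + lr * err * xj for wj, xj in zip(w, xi + [1.0])]
    return w

def stack_predict(base_classifiers, meta_w, doc):
    """Phase 1: each base classifier votes; phase 2: the meta-classifier decides."""
    feats = [float(clf(doc)) for clf in base_classifiers] + [1.0]
    return 1 if sum(wj * xj for wj, xj in zip(meta_w, feats)) > 0 else 0
```

    Each base classifier would be trained on its own feature presentation (e.g. different term weightings); the meta-classifier learns which presentations to trust for the domain at hand.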

    GFilter: A General Gram Filter for String Similarity Search

    No full text
    Numerous applications such as data integration, protein detection, and article copy detection share a similar core problem: given a string as the query, how to efficiently find all the similar answers from a large-scale string collection. Many existing methods adopt a prefix-filter-based framework to solve this problem, and a number of recent works aim to use advanced filters to improve overall search performance. In this paper, we propose a gram-based framework to achieve near-maximum filter performance. The main idea is to judiciously choose high-quality grams as the prefix of the query according to their estimated ability to filter candidates. As this selection process is proved to be an NP-hard problem, we give a cost model to measure the filtering ability of grams and develop efficient heuristic algorithms to find high-quality grams. Extensive experiments on real datasets demonstrate the superiority of the proposed framework in comparison with state-of-the-art approaches.
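
    A simplified prefix-filter pipeline for edit-distance search can be sketched as follows; grams are ordered by ascending global frequency as a crude stand-in for GFilter's cost model, and the prefix length q·τ+1 follows the standard count-filter bound for q-grams under edit distance τ.

```python
from collections import Counter, defaultdict

def qgrams(s, q=2):
    return [s[i:i + q] for i in range(len(s) - q + 1)]

def edit_distance(a, b):
    """Standard dynamic-programming Levenshtein distance."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + (ca != cb)))
        prev = cur
    return prev[-1]

def build_index(strings, q=2, tau=1):
    """Inverted index over each string's prefix grams; rare grams first,
    since they prune more candidates."""
    freq = Counter(g for s in strings for g in qgrams(s, q))
    order = lambda g: (freq[g], g)
    index = defaultdict(list)
    for sid, s in enumerate(strings):
        for g in sorted(qgrams(s, q), key=order)[:q * tau + 1]:
            index[g].append(sid)
    return index, order

def search(query, strings, index, order, q=2, tau=1):
    """Candidates share at least one prefix gram with the query,
    then survive exact edit-distance verification."""
    prefix = sorted(qgrams(query, q), key=order)[:q * tau + 1]
    cands = {sid for g in prefix for sid in index[g]}
    return [strings[sid] for sid in sorted(cands)
            if edit_distance(query, strings[sid]) <= tau]
```

    GFilter's contribution is to replace the frequency ordering with a cost model that estimates each gram's filtering ability, and heuristics that choose near-optimal prefixes under that model.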