TransPrompt v2: A Transferable Prompting Framework for Cross-task Text Classification
Text classification is one of the most important tasks in natural language
processing (NLP). Recent advances with pre-trained language models (PLMs) have
shown remarkable success on this task. However, the strong results obtained
by PLMs depend heavily on large amounts of task-specific labeled data,
which may not be feasible in many application scenarios due to data access and
privacy constraints. The recently-proposed prompt-based fine-tuning paradigm
improves the performance of PLMs for few-shot text classification with
task-specific templates. Yet, it is unclear how the prompting knowledge can be
transferred across tasks, for the purpose of mutual reinforcement. We propose
TransPrompt v2, a novel transferable prompting framework for few-shot learning
across similar or distant text classification tasks. For learning across
similar tasks, we employ a multi-task meta-knowledge acquisition (MMA)
procedure to train a meta-learner that captures the cross-task transferable
knowledge. For learning across distant tasks, we further inject the task type
descriptions into the prompt, and capture the intra-type and inter-type prompt
embeddings among multiple distant tasks. Additionally, two de-biasing
techniques are further designed to make the trained meta-learner more
task-agnostic and unbiased towards any task. After that, the meta-learner can
be adapted to each specific task with better parameter initialization.
Extensive experiments show that TransPrompt v2 outperforms single-task and
cross-task strong baselines over multiple NLP tasks and datasets. We further
show that the meta-learner can effectively improve the performance of PLMs on
previously unseen tasks. In addition, TransPrompt v2 also outperforms strong
fine-tuning baselines when learning with full training sets.
An information theoretic approach to sentiment polarity classification
Sentiment classification is the task of classifying documents according to their overall sentiment inclination. It is important and popular in many web applications, such as credibility analysis of news sites on the Web, recommendation systems, and mining online discussions. The vector space model is widely applied to modeling documents in supervised sentiment classification, in which the feature presentation (including feature type and weight function) is crucial for classification accuracy. The traditional feature presentation methods of text categorization do not perform well in sentiment classification, because the ways in which sentiment is expressed are more subtle. We analyze the relationships of terms with sentiment labels based on information theory, and propose a method that applies an information theoretic approach to sentiment classification of documents. In this paper, we first adopt mutual information to quantify the sentiment polarities of terms in a document. Then the terms are weighted in vector space based on both sentiment scores and contribution to the document. We perform extensive experiments with SVM on multiple sets of product reviews, and the experimental results show our approach is more effective than the traditional ones.
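The term-scoring idea can be illustrated with a minimal sketch. The abstract does not give the exact formulas, so everything here is an assumption: the polarity of a term is taken as the difference of its pointwise mutual information with the positive and negative labels (which reduces to a smoothed log-ratio of document frequencies), and a term's weight in a document is its relative frequency times that score. The function names `term_polarity_scores` and `weight_document` are hypothetical.

```python
import math
from collections import Counter


def term_polarity_scores(docs, labels, smoothing=1.0):
    """Score each term as PMI(term, pos) - PMI(term, neg).

    docs   : list of token lists
    labels : list of "pos" / "neg", one per document
    The difference of the two PMI values equals
    log P(term | pos) - log P(term | neg); add-one style smoothing
    (an assumption) avoids log(0) for terms seen in one class only.
    """
    n_docs = Counter(labels)
    doc_freq = {"pos": Counter(), "neg": Counter()}
    for tokens, label in zip(docs, labels):
        doc_freq[label].update(set(tokens))  # count each term once per doc

    scores = {}
    for term in set(doc_freq["pos"]) | set(doc_freq["neg"]):
        p_pos = (doc_freq["pos"][term] + smoothing) / (n_docs["pos"] + 2 * smoothing)
        p_neg = (doc_freq["neg"][term] + smoothing) / (n_docs["neg"] + 2 * smoothing)
        scores[term] = math.log(p_pos) - math.log(p_neg)
    return scores


def weight_document(tokens, scores):
    """Weight = relative term frequency * polarity score (assumed scheme)."""
    counts = Counter(tokens)
    return {t: (c / len(tokens)) * scores.get(t, 0.0) for t, c in counts.items()}


docs = [["good", "movie"], ["great", "movie"], ["bad", "movie"], ["awful", "movie"]]
labels = ["pos", "pos", "neg", "neg"]
scores = term_polarity_scores(docs, labels)
vector = weight_document(["good", "movie"], scores)
```

In this toy corpus a term that appears equally often in both classes ("movie") scores zero, so it contributes nothing to the weighted vector that would be passed to a classifier such as an SVM.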
Measurement of the vertical atmospheric density profile from the X-ray Earth occultation of the Crab Nebula with Insight-HXMT
In this paper, the X-ray Earth occultation (XEO) of the Crab Nebula is
investigated by using the Hard X-ray Modulation Telescope (Insight-HXMT). The
pointing observation data on the 30th September, 2018 recorded by the Low
Energy X-ray telescope (LE) of Insight-HXMT are selected and analyzed. The
extinction lightcurves and spectra during the X-ray Earth occultation process
are extracted. A forward model for the XEO lightcurve is established and the
theoretical observational signal for lightcurve is predicted. The atmospheric
density model is built with a scale factor to the commonly used MSIS density
profile within a certain altitude range. A Bayesian data analysis method is
developed for the XEO lightcurve modeling and the atmospheric density
retrieval. The posterior probability distribution of the model parameters is
derived through the Markov Chain Monte Carlo (MCMC) algorithm with the
NRLMSISE-00 model and the NRLMSIS 2.0 model as basis functions and the best-fit
density profiles are retrieved respectively. It is found that in the altitude
range of 105--200 km, the retrieved density profile is 88.8% of the density of
NRLMSISE-00 and 109.7% of the density of NRLMSIS 2.0 by fitting the lightcurve
in the energy range of 1.0--2.5 keV based on the XEOS method. In the altitude range
of 95--125 km, the retrieved density profile is 81.0% of the density of
NRLMSISE-00 and 92.3% of the density of NRLMSIS 2.0 by fitting the lightcurve
in the energy range of 2.5--6.0 keV based on the XEOS method. In the altitude range
of 85--110 km, the retrieved density profile is 87.7% of the density of
NRLMSISE-00 and 101.4% of the density of NRLMSIS 2.0 by fitting the lightcurve
in the energy range of 6.0--10.0 keV based on the XEOS method. This study
demonstrates that the XEOS from the X-ray astronomical satellite Insight-HXMT
can provide an approach for the study of the upper atmosphere.
Comment: 31 pages, 15 figures, 5 tables, accepted for publication in
Atmospheric Measurement Techniques
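The role of the scale factor can be sketched with a toy forward model. This is not the paper's pipeline: it assumes a single effective cross-section, Beer-Lambert attenuation along the line of sight, and a closed-form least-squares fit in place of the Bayesian MCMC retrieval described above. The cross-section, the column densities, and the names `transmission` and `fit_scale_factor` are all hypothetical.

```python
import math


def transmission(gamma, sigma, columns):
    """Toy forward model: T_i = exp(-gamma * sigma * N_i).

    gamma   : scale factor applied to the reference density profile
    sigma   : effective absorption cross-section in cm^2 (assumed value)
    columns : reference-model column densities N_i in cm^-2, one per
              tangent altitude (assumed values)
    """
    return [math.exp(-gamma * sigma * n) for n in columns]


def fit_scale_factor(observed, sigma, columns):
    """Least-squares estimate of gamma from -ln(T_i) = gamma * sigma * N_i.

    A simple stand-in for the MCMC retrieval; it returns a point
    estimate only, with no posterior distribution.
    """
    xs = [sigma * n for n in columns]
    ys = [-math.log(t) for t in observed]
    return sum(x * y for x, y in zip(xs, ys)) / sum(x * x for x in xs)


sigma = 1e-22                          # hypothetical cross-section
columns = [1e20, 1e21, 5e21, 1e22]     # hypothetical column densities
observed = transmission(0.888, sigma, columns)  # synthetic, noise-free data
gamma_hat = fit_scale_factor(observed, sigma, columns)
```

With noise-free synthetic data the fit recovers the scale factor used to generate it; a retrieved value below 1 corresponds to a density lower than the reference model, as in the percentages reported above.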
Improving Hypernymy Prediction via Taxonomy Enhanced Adversarial Learning
Hypernymy is a basic semantic relation in computational linguistics that expresses the “is-a” relation between a generic concept and its specific instances, serving as the backbone in taxonomies and ontologies. Although several NLP tasks related to hypernymy prediction have been extensively addressed, few methods have fully exploited the large number of hypernymy relations in Web-scale taxonomies. In this paper, we introduce Taxonomy Enhanced Adversarial Learning (TEAL) for hypernymy prediction. We first propose an unsupervised measure, U-TEAL, to distinguish hypernymy from other semantic relations. It is implemented based on a word embedding projection network distantly trained over a taxonomy. To address supervised hypernymy detection tasks, the supervised model S-TEAL and its improved version, the adversarial supervised model AS-TEAL, are further presented. Specifically, AS-TEAL employs a coupled adversarial training algorithm to transfer hierarchical knowledge in taxonomies to hypernymy prediction models. We conduct extensive experiments to confirm the effectiveness of TEAL over three standard NLP tasks: unsupervised hypernymy classification, supervised hypernymy detection and graded lexical entailment. We also show that TEAL can be applied to non-English languages and can detect missing hypernymy relations in taxonomies.
An Efficient Cooperative Framework for Multi-Query Processing over
XML is a de facto standard for exchanging and presenting information on the Web. However, XML data is also recognized as verbose, since repeated tags and structures heavily inflate the size of the data. This verbosity poses many challenges for conventional distributed database technologies. In this paper, we study the XML dissemination problem over the Internet, where information delivery can be rather slow in a server-client architecture in which a large number of geographically distributed users access a large amount of correlated XML information. The problem becomes more severe when the users access closely related XML fragments, in which case bandwidth is used inefficiently. In order to save bandwidth and process the queries efficiently, we propose an architecture that incorporates XML compression techniques and exploits the results of XPath containment. Within our framework, we demonstrate that the load on the server is reduced, the network bandwidth is used more efficiently and, consequently, all clients as a whole benefit from the savings in various costs.
Scalable XSLT Evaluation
XSLT is an increasingly popular language for processing XML data. It is widely supported by application platform software. However, little optimization effort has been made inside current XSLT processing engines. Evaluating even a very simple XSLT program on a large XML document with a simple schema may result in extensive memory usage. In this paper, we present a novel Streaming Processing Model (SPM) to evaluate a subset of XSLT programs on XML documents, especially large ones. With SPM, an XSLT processor can transform an XML source document to other formats without requiring extra memory buffers. Therefore, our approach can not only handle large source documents, but also produce large results. We demonstrate the advantages of the SPM approach with a performance study. Experimental results confirm that SPM typically improves XSLT evaluation by a factor of 2 to 10 over existing approaches. Moreover, the SPM approach is highly scalable.
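The general idea of streaming evaluation can be illustrated outside XSLT. The sketch below is not SPM; it only shows, using Python's standard `xml.etree.ElementTree.iterparse`, how a simple transformation can emit output while the input is still being read instead of first building the whole document tree. The element names `item` and `name` and the function `stream_names` are hypothetical.

```python
import io
import xml.etree.ElementTree as ET


def stream_names(source, out):
    """Write the <name> text of every <item> to `out`, one per line.

    Elements are handled as soon as their end tag is parsed, and each
    finished <item> is cleared so its subtree can be reclaimed, which
    keeps memory use far below that of loading the full document.
    """
    for _event, elem in ET.iterparse(source, events=("end",)):
        if elem.tag == "item":
            out.write(elem.findtext("name", default="") + "\n")
            elem.clear()  # drop the children of the processed <item>


xml_bytes = b"<items><item><name>a</name></item><item><name>b</name></item></items>"
out = io.StringIO()
stream_names(io.BytesIO(xml_bytes), out)
result = out.getvalue()
```

Because output is written per element, the size of the result does not have to fit in memory either, which mirrors the claim above about producing large results.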
Uncertainty-Aware Self-Training for Low-Resource Neural Sequence Labeling
Neural sequence labeling (NSL) aims at assigning labels to input language tokens, and covers a broad range of applications such as named entity recognition (NER) and slot filling. However, the strong results achieved by traditional supervised approaches depend heavily on large amounts of human-annotated data, which may not be feasible in real-world scenarios due to data privacy and computation efficiency issues. This paper presents SeqUST, a novel uncertainty-aware self-training framework for NSL that addresses the labeled data scarcity issue and effectively utilizes unlabeled data. Specifically, we incorporate Monte Carlo (MC) dropout in a Bayesian neural network (BNN) to perform uncertainty estimation at the token level, and then select reliable language tokens from unlabeled data based on the model confidence and certainty. A well-designed masked sequence labeling task with a noise-robust loss supports robust training, which aims to suppress the problem of noisy pseudo labels. In addition, we develop a Gaussian-based consistency regularization technique to further improve the model robustness on Gaussian-distributed perturbed representations. This effectively alleviates the over-fitting dilemma originating from pseudo-labeled augmented data. Extensive experiments over six benchmarks demonstrate that our SeqUST framework effectively improves the performance of self-training, and consistently outperforms strong baselines by a large margin in low-resource scenarios.
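The token selection step can be sketched without a neural network by starting from the outputs of several stochastic forward passes. The sketch assumes that each token comes with T probability distributions (one per dropout pass), that confidence is the mean probability of the predicted label, and that uncertainty is the mutual information between prediction and model parameters (often called BALD). The thresholds and the names `token_uncertainty` and `select_reliable` are hypothetical; the abstract does not specify the exact selection rule.

```python
import math


def entropy(p):
    """Shannon entropy (natural log) of a probability distribution."""
    return -sum(x * math.log(x) for x in p if x > 0)


def token_uncertainty(samples):
    """Summarise T dropout passes for one token.

    samples : list of T probability distributions over the label set
    Returns (predicted_label, confidence, bald), where bald is the
    entropy of the mean prediction minus the mean per-pass entropy.
    """
    t = len(samples)
    k = len(samples[0])
    mean = [sum(s[i] for s in samples) / t for i in range(k)]
    label = max(range(k), key=mean.__getitem__)
    bald = entropy(mean) - sum(entropy(s) for s in samples) / t
    return label, mean[label], bald


def select_reliable(token_samples, min_conf=0.8, max_bald=0.05):
    """Keep (token_index, label) pairs that are confident and certain."""
    selected = []
    for idx, samples in enumerate(token_samples):
        label, conf, bald = token_uncertainty(samples)
        if conf >= min_conf and bald <= max_bald:
            selected.append((idx, label))
    return selected


stable = [[0.90, 0.10], [0.92, 0.08], [0.88, 0.12]]    # passes agree
unstable = [[0.95, 0.05], [0.05, 0.95], [0.90, 0.10]]  # passes disagree
pseudo_labels = select_reliable([stable, unstable])
```

Only the token whose dropout passes agree is kept as a pseudo-label; the token with conflicting passes is filtered out even though individual passes are confident.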
Sentiment Classification via Integrating Multiple Feature Presentations
In the bag-of-words framework, documents are often converted into vectors according to predefined features together with weighting mechanisms. Since each feature presentation has its own characteristics, it is difficult to determine which one should be chosen for a specific domain, especially for users who are not familiar with the domain. This paper explores the integration of various feature presentations to improve classification accuracy. A general two-phase framework is proposed. In the first phase, we train multiple base classifiers with various vector spaces and use each of these classifiers to predict the class of the testing samples. In the second phase, the predicted results are integrated into the final class via stacking with SVM. The experimental results demonstrate the effectiveness of our method.
GFilter: A General Gram Filter for String Similarity Search
Numerous applications such as data integration, protein detection, and article copy detection share a similar core problem: given a string as the query, how to efficiently find all the similar answers from a large-scale string collection. Many existing methods adopt a prefix-filter-based framework to solve this problem, and a number of recent works aim to use advanced filters to improve the overall search performance. In this paper, we propose a gram-based framework to achieve near-maximum filter performance. The main idea is to judiciously choose high-quality grams as the prefix of the query according to their estimated ability to filter candidates. As this selection process is proved to be an NP-hard problem, we give a cost model to measure the filter ability of grams and develop efficient heuristic algorithms to find high-quality grams. Extensive experiments on real datasets demonstrate the superiority of the proposed framework in comparison with the state-of-the-art approaches.
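The prefix-filter idea with gram selection can be sketched as follows. This is not GFilter's cost model: it assumes edit distance as the similarity measure and uses the simplest heuristic, picking the rarest q-grams of the query. It relies on the standard observation that one edit operation destroys at most q q-grams, so any string within edit distance tau of the query must contain at least one of any q * tau + 1 distinct grams of the query. The function names are hypothetical.

```python
from collections import defaultdict


def qgrams(s, q=2):
    """Set of distinct q-grams of s."""
    return {s[i:i + q] for i in range(len(s) - q + 1)}


def edit_distance(a, b):
    """Levenshtein distance via the usual two-row dynamic programme."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + (ca != cb)))
        prev = cur
    return prev[-1]


def build_index(strings, q=2):
    """Inverted index: gram -> ids of the strings containing it."""
    index = defaultdict(set)
    for sid, s in enumerate(strings):
        for g in qgrams(s, q):
            index[g].add(sid)
    return index


def search(query, strings, index, tau=1, q=2):
    """Return all strings within edit distance tau of the query."""
    grams = qgrams(query, q)
    need = q * tau + 1
    if len(grams) < need:
        # Too few grams for the filter to be safe: verify everything.
        candidates = set(range(len(strings)))
    else:
        # Rarest grams first, so the prefix yields the fewest candidates.
        prefix = sorted(grams, key=lambda g: (len(index.get(g, ())), g))[:need]
        candidates = set().union(*(index.get(g, set()) for g in prefix))
    return sorted(strings[i] for i in candidates
                  if edit_distance(query, strings[i]) <= tau)


strings = ["kitten", "sitten", "sitting", "mitten", "banana"]
index = build_index(strings)
matches = search("kitten", strings, index, tau=1)
```

Here the three rarest grams of "kitten" select only three of the five strings for verification, and "sitting" and "banana" are never compared against the query.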