Improving Pre-trained Language Model Fine-tuning with Noise Stability Regularization
The advent of large-scale pre-trained language models has contributed greatly
to the recent progress in natural language processing. Many state-of-the-art
language models are first trained on a large text corpus and then fine-tuned on
downstream tasks. Despite its recent success and wide adoption, fine-tuning a
pre-trained language model often suffers from overfitting, which leads to poor
generalizability due to the extremely high complexity of the model and the
limited training samples from downstream tasks. To address this problem, we
propose a novel and effective fine-tuning framework, named Layerwise Noise
Stability Regularization (LNSR). Specifically, we propose to inject standard
Gaussian noise or in-manifold noise and regularize the hidden representations
of the fine-tuned model. We first provide theoretical analyses
to support the efficacy of our method. We then demonstrate the advantages of
the proposed method over other state-of-the-art algorithms including L2-SP,
Mixout and SMART. While these previous works only verify the effectiveness of
their methods on relatively simple text classification tasks, we also verify
the effectiveness of our method on question answering tasks, where the target
problem is much more difficult and more training examples are available.
Furthermore, extensive experimental results indicate that the proposed
algorithm can not only enhance the in-domain performance of the language models
but also improve the domain generalization performance on out-of-domain data.
Comment: Accepted by TNNLS
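The core of the noise stability idea described above can be sketched in a few lines: inject Gaussian noise into a layer's input and penalize how much its output changes. The following is a minimal NumPy illustration of that principle, not the paper's implementation; the toy layer, `noise_std`, and the number of noise samples are illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(0)

def layer(x, W, b):
    """A single feed-forward layer with tanh activation (toy stand-in
    for one hidden layer of a pre-trained language model)."""
    return np.tanh(x @ W + b)

def noise_stability_penalty(x, W, b, noise_std=0.1, n_samples=8):
    """Average squared change in the layer's output when standard
    Gaussian noise (scaled by noise_std) is injected into its input."""
    clean = layer(x, W, b)
    diffs = []
    for _ in range(n_samples):
        noisy = layer(x + noise_std * rng.standard_normal(x.shape), W, b)
        diffs.append(np.mean((noisy - clean) ** 2))
    return float(np.mean(diffs))

# Toy data: a batch of 2 examples with 4 features, hidden width 3.
x = rng.standard_normal((2, 4))
W = rng.standard_normal((4, 3)) * 0.5
b = np.zeros(3)

penalty = noise_stability_penalty(x, W, b)
print(f"stability penalty: {penalty:.6f}")
```

In a layerwise scheme, a penalty of this form would be computed per layer and added to the task loss with a regularization weight, so that fine-tuning prefers representations that are stable under small perturbations.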
On the possible Computational Power of the Human Mind
The aim of this paper is to address the question: Can an artificial neural
network (ANN) model be used as a possible characterization of the power of the
human mind? We will discuss what might be the relationship between such a model
and its natural counterpart. A possible characterization of the different power
capabilities of the mind is suggested in terms of the information contained (in
its computational complexity) or achievable by it. Such characterization takes
advantage of recent results based on natural neural networks (NNN) and the
computational power of arbitrary artificial neural networks (ANN). The possible
acceptance of neural networks as the model of the human mind's operation makes
the aforementioned quite relevant.
Comment: Complexity, Science and Society Conference, 2005, University of Liverpool, UK. 23 pages
Making Queries Tractable on Big Data with Preprocessing
A query class is traditionally considered tractable if there exists a polynomial-time (PTIME) algorithm to answer its queries. When it comes to big data, however, PTIME algorithms often become infeasible in practice. A traditional and effective approach to coping with this is to preprocess data off-line, so that queries in the class can be subsequently evaluated on the data efficiently. This paper aims to provide a formal foundation for this approach in terms of computational complexity. (1) We propose a set of Π-tractable queries, denoted by ΠT0Q, to characterize classes of queries that can be answered in parallel poly-logarithmic time (NC) after PTIME preprocessing. (2) We show that several natural query classes are Π-tractable and are feasible on big data. (3) We also study a set ΠTQ of query classes that can be effectively converted to Π-tractable queries by re-factorizing their data and queries for preprocessing. We introduce a form of NC reductions to characterize such conversions. (4) We show that a natural query class is complete for ΠTQ. (5) We also show that ΠT0Q ⊊ P unless P = NC, i.e., the set ΠT0Q of all Π-tractable queries is properly contained in the set P of all PTIME queries. Nonetheless, ΠTQ = P, i.e., all PTIME query classes can be made Π-tractable via proper re-factorizations. This work is a step towards understanding the tractability of queries in the context of big data.
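The paper's formal setting concerns NC evaluation after PTIME preprocessing; as a loose single-machine analogue (my illustration, not the paper's construction), one-off O(n log n) preprocessing can turn linear-scan membership queries into O(log n) ones:

```python
import bisect

def preprocess(records):
    """One-off PTIME preprocessing (here O(n log n)): sort the data."""
    return sorted(records)

def query(index, key):
    """Poly-logarithmic (O(log n)) membership query via binary search
    against the preprocessed data."""
    i = bisect.bisect_left(index, key)
    return i < len(index) and index[i] == key

index = preprocess([42, 7, 19, 3, 56])
print(query(index, 19))   # True
print(query(index, 20))   # False
```

The point of the analogy is only that the cost model changes once preprocessing is paid for up front; the paper's Π-tractability additionally requires the query step to parallelize into NC.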
Automatic Analysis of Linguistic Complexity and Its Application in Language Learning Research
The construct of complexity, together with accuracy and fluency have become the central foci of language learning research in recent years. This dissertation focuses on complexity, a multidimensional construct that has its own working mechanism, cognitive and psycholinguistic processes, and developmental dynamics. Six studies revolving around complexity, including its conceptualization, automatic measurement, and application in language acquisition research are reported.
The basis of these studies is the automatic multidimensional analysis of linguistic complexity, which was implemented in a Web platform called the Common Text Analysis Platform using state-of-the-art Natural Language Processing (NLP) technologies. The system provides a rich set of complexity measures that are easily accessible to non-expert users and supports collaborative development of complexity feature extractors.
An application study characterizing text-level readability with the word-level feature of lexical frequency is reported next. The lexical complexity measure of word frequency was found to be highly predictive of text readability. Another application study investigates the developmental interrelationship between complexity and accuracy, an issue on which conflicting theories and research results have been reported. Our findings support the simultaneous-development account.
The remaining studies apply automatic complexity analysis to promote language development, which involves analyzing both learning input and learner production, as well as linking the two spaces. We first proposed and validated an approach that links input and production through distances between complexity feature vectors. The ICALL system SyB, which implements this approach, was then developed and demonstrated. The system was evaluated in a randomized controlled experiment that tested the effects of different levels of input challenge on L2 development. The results support the comprehensible input hypothesis in Second Language Acquisition (SLA) and provide an automatizable operationalization of the theory.
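The linking step described above can be made concrete: represent each text as a vector of complexity features and rank candidate input texts by their distance to the learner's production. The sketch below is my own illustration of that idea; the feature names, the Euclidean metric, and the example values are hypothetical, not taken from the dissertation.

```python
import math

def complexity_distance(vec_a, vec_b):
    """Euclidean distance between two texts' complexity feature vectors
    (e.g., mean sentence length, type-token ratio, clauses per sentence)."""
    assert len(vec_a) == len(vec_b)
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(vec_a, vec_b)))

# Hypothetical feature vectors:
# [mean sentence length, type-token ratio, clauses per sentence]
learner_production = [12.0, 0.45, 1.3]
candidate_inputs = {
    "text_a": [13.5, 0.50, 1.4],   # slightly above the learner's level
    "text_b": [25.0, 0.70, 2.8],   # far above the learner's level
}

# Rank candidate input texts by closeness to the learner's production.
ranked = sorted(candidate_inputs,
                key=lambda t: complexity_distance(candidate_inputs[t],
                                                  learner_production))
print(ranked[0])   # text_a is the nearer match
```

In a comprehensible-input setting, a system could then select input texts a controlled distance above the learner's current production level rather than the nearest match outright.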
The series of studies in this dissertation demonstrates how language learning research can benefit from NLP technologies. It also demonstrates how these technologies can be applied to build practical language learning systems grounded in solid theoretical and research foundations in SLA.
Technology assessment of advanced automation for space missions
Six general classes of technology requirements derived during the mission definition phase of the study were identified as having maximum importance and urgency: autonomous world-model-based information systems; learning and hypothesis formation; natural language and other man-machine communication; space manufacturing; teleoperators and robot systems; and computer science and technology.