
    Improving Pre-trained Language Model Fine-tuning with Noise Stability Regularization

    The advent of large-scale pre-trained language models has contributed greatly to recent progress in natural language processing. Many state-of-the-art language models are first trained on a large text corpus and then fine-tuned on downstream tasks. Despite its recent success and wide adoption, fine-tuning a pre-trained language model often suffers from overfitting, which leads to poor generalizability because of the extremely high model complexity and the limited training samples available from downstream tasks. To address this problem, we propose a novel and effective fine-tuning framework, named Layerwise Noise Stability Regularization (LNSR). Specifically, we inject standard Gaussian noise or in-manifold noise and regularize the hidden representations of the fine-tuned model. We first provide theoretical analyses to support the efficacy of our method. We then demonstrate its advantages over other state-of-the-art algorithms, including L2-SP, Mixout, and SMART. Whereas these previous works verify the effectiveness of their methods only on relatively simple text classification tasks, we also verify our method on question answering tasks, where the target problem is much more difficult and more training examples are available. Furthermore, extensive experimental results indicate that the proposed algorithm not only enhances the in-domain performance of the language models but also improves their domain generalization performance on out-of-domain data.
    Comment: Accepted by TNNLS
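    For intuition, here is a minimal sketch of a layerwise noise-stability penalty in PyTorch. It illustrates the idea described in the abstract rather than the authors' released implementation: the name lnsr_penalty, the noise scale sigma, and the generic layers interface are all assumptions made for the example.

    ```python
    import torch

    def lnsr_penalty(layers, hidden, sigma=1e-3):
        """Illustrative layerwise noise-stability penalty (not the paper's code).

        For each layer in the stack, compare its output on the clean hidden
        state with its output on a Gaussian-perturbed copy; the accumulated
        squared difference rewards representations that stay stable under noise.
        """
        penalty = hidden.new_zeros(())
        for layer in layers:
            noisy = hidden + sigma * torch.randn_like(hidden)  # inject standard Gaussian noise
            clean_out = layer(hidden)
            noisy_out = layer(noisy)
            penalty = penalty + (noisy_out - clean_out).pow(2).mean()
            hidden = clean_out  # the clean pass feeds the next layer
        return penalty

    # Hypothetical fine-tuning step: add the penalty to the task loss.
    # loss = task_loss + lam * lnsr_penalty(encoder.layers, embedding_output)
    ```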

    On the possible Computational Power of the Human Mind

    The aim of this paper is to address the question: can an artificial neural network (ANN) model be used as a possible characterization of the power of the human mind? We discuss what the relationship between such a model and its natural counterpart might be. A possible characterization of the mind's different power capabilities is suggested in terms of the information contained in it (in the sense of computational complexity) or achievable by it. This characterization takes advantage of recent results on natural neural networks (NNN) and on the computational power of arbitrary artificial neural networks (ANN). If neural networks are accepted as a model of the human mind's operation, the above becomes quite relevant.
    Comment: Complexity, Science and Society Conference, 2005, University of Liverpool, UK. 23 pages

    Making Queries Tractable on Big Data with Preprocessing

    A query class is traditionally considered tractable if there exists a polynomial-time (PTIME) algorithm to answer its queries. When it comes to big data, however, PTIME algorithms often become infeasible in practice. A traditional and effective approach to coping with this is to preprocess data off-line, so that queries in the class can subsequently be evaluated on the data efficiently. This paper aims to provide a formal foundation for this approach in terms of computational complexity. (1) We propose a set of Π-tractable queries, denoted by ΠT0Q, to characterize classes of queries that can be answered in parallel poly-logarithmic time (NC) after PTIME preprocessing. (2) We show that several natural query classes are Π-tractable and are feasible on big data. (3) We also study a set ΠTQ of query classes that can be effectively converted to Π-tractable queries by re-factorizing their data and queries for preprocessing. We introduce a form of NC reductions to characterize such conversions. (4) We show that a natural query class is complete for ΠTQ. (5) We also show that ΠT0Q ⊂ P unless P = NC, i.e., the set ΠT0Q of all Π-tractable queries is properly contained in the set P of all PTIME queries. Nonetheless, ΠTQ = P, i.e., all PTIME query classes can be made Π-tractable via proper re-factorizations. This work is a step towards understanding the tractability of queries in the context of big data.
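    As a toy instance of the preprocess-then-query paradigm this abstract formalizes (not an example from the paper itself), the sketch below spends polynomial time sorting the data off-line so that each later membership query runs in O(log n) time, comfortably within a poly-logarithmic bound:

    ```python
    import bisect

    # Off-line (PTIME) preprocessing: sort the data once so that queries
    # never have to scan the full dataset.
    def preprocess(records):
        return sorted(records)

    # On-line query: an O(log n) membership test against the preprocessed
    # index, in the spirit of the poly-logarithmic bound discussed above.
    def contains(index, key):
        i = bisect.bisect_left(index, key)
        return i < len(index) and index[i] == key

    index = preprocess([17, 3, 42, 8, 99])
    print(contains(index, 42))  # True
    print(contains(index, 5))   # False
    ```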

    Automatic Analysis of Linguistic Complexity and Its Application in Language Learning Research

    The constructs of complexity, accuracy, and fluency have become the central foci of language learning research in recent years. This dissertation focuses on complexity, a multidimensional construct with its own working mechanisms, cognitive and psycholinguistic processes, and developmental dynamics. Six studies revolving around complexity, covering its conceptualization, automatic measurement, and application in language acquisition research, are reported. The basis of these studies is the automatic multidimensional analysis of linguistic complexity, implemented as a Web platform called the Common Text Analysis Platform (CTAP) using state-of-the-art Natural Language Processing (NLP) technologies. The system provides a rich set of complexity measures that are easily accessible to non-expert users and supports collaborative development of complexity feature extractors. An application study characterizing text-level readability with the word-level feature of lexical frequency is reported next; it found that the lexical complexity measure of word frequency is highly predictive of text readability. Another application study investigates the developmental interrelationship between complexity and accuracy, an issue on which conflicting theories and research results have been reported; our findings support the simultaneous-development account. The remaining studies apply automatic complexity analysis to promote language development, analyzing both learning input and learner production and linking the two spaces. We first proposed and validated an approach that links input and production through distances between complexity feature vectors. The ICALL system SyB, which implements this approach, was then developed and demonstrated. The system's effectiveness was tested in a randomized controlled experiment examining the effects of different levels of input challenge on L2 development. The results support the comprehensible input hypothesis in Second Language Acquisition (SLA) and provide an automatizable operationalization of the theory. The series of studies in this dissertation demonstrates how language learning research can benefit from NLP technologies and, conversely, how these technologies can be applied to build practical language learning systems on solid theoretical and research foundations in SLA.
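    The input-production linking idea lends itself to a small sketch. The two features below (mean sentence length and type-token ratio) are stand-ins chosen for brevity, and the function names are illustrative; the CTAP platform described above computes a far richer feature set.

    ```python
    import math

    def complexity_vector(text):
        """Toy complexity vector: mean sentence length and type-token ratio."""
        sentences = [s for s in text.split('.') if s.strip()]
        tokens = text.lower().split()
        mean_sentence_length = len(tokens) / max(len(sentences), 1)
        type_token_ratio = len(set(tokens)) / max(len(tokens), 1)
        return (mean_sentence_length, type_token_ratio)

    def distance(u, v):
        """Euclidean distance between two complexity feature vectors."""
        return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

    # Rank candidate reading inputs by their complexity distance from a
    # learner's own production -- the linking idea described above.
    def rank_inputs(production, candidates):
        p = complexity_vector(production)
        return sorted(candidates, key=lambda c: distance(p, complexity_vector(c)))
    ```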

    Technology assessment of advanced automation for space missions

    Six general classes of technology requirements derived during the mission definition phase of the study were identified as having maximum importance and urgency: autonomous world-model-based information systems, learning and hypothesis formation, natural language and other man-machine communication, space manufacturing, teleoperators and robot systems, and computer science and technology.