Search CORE

119,723 research outputs found

Transfer Learning for Power Outage Detection Task with Limited Training Data

Author: Owolabi Olukunle
Publication venue
Publication date: 28/05/2023
Field of study

Early detection of power outages is crucial for maintaining a reliable power distribution system. This research investigates the use of transfer learning and language models in detecting outages with limited labeled data. By leveraging pretraining and transfer learning, models can generalize to unseen classes. Using a curated balanced dataset of social media tweets related to power outages, we conducted experiments using zero-shot and few-shot learning. Our hypothesis is that Language Models pretrained with limited data could achieve high performance in outage detection tasks over baseline models. Results show that while classical models outperform zero-shot Language Models, few-shot fine-tuning significantly improves their performance. For example, with 10% fine-tuning, BERT achieves 81.3% accuracy (+15.3%), and GPT achieves 74.5% accuracy (+8.5%). This has practical implications for analyzing and localizing outages in scenarios with limited data availability. Our evaluation provides insights into the potential of few-shot fine-tuning with Language Models for power outage detection, highlighting their strengths and limitations. This research contributes to the knowledge base of leveraging advanced natural language processing techniques for managing critical infrastructure

arXiv.org e-Print Archive

Investigating Linguistic Pattern Ordering in Hierarchical Natural Language Generation

Author: Chen Yun-Nung
Su Shang-Yu
Publication venue
Publication date: 19/09/2018
Field of study

Natural language generation (NLG) is a critical component in spoken dialogue system, which can be divided into two phases: (1) sentence planning: deciding the overall sentence structure, (2) surface realization: determining specific word forms and flattening the sentence structure into a string. With the rise of deep learning, most modern NLG models are based on a sequence-to-sequence (seq2seq) model, which basically contains an encoder-decoder structure; these NLG models generate sentences from scratch by jointly optimizing sentence planning and surface realization. However, such simple encoder-decoder architecture usually fail to generate complex and long sentences, because the decoder has difficulty learning all grammar and diction knowledge well. This paper introduces an NLG model with a hierarchical attentional decoder, where the hierarchy focuses on leveraging linguistic knowledge in a specific order. The experiments show that the proposed method significantly outperforms the traditional seq2seq model with a smaller model size, and the design of the hierarchical attentional decoder can be applied to various NLG systems. Furthermore, different generation strategies based on linguistic patterns are investigated and analyzed in order to guide future NLG research work.Comment: accepted by the 7th IEEE Workshop on Spoken Language Technology (SLT 2018). arXiv admin note: text overlap with arXiv:1808.0274

arXiv.org e-Print Archive

Crossref

Intelligent Word Embeddings of Free-Text Radiology Reports

Author: Banerjee Imon
Goldman Roger Eric
Madhavan Sriraman
Rubin Daniel L.
Publication venue
Publication date: 01/01/2017
Field of study

Radiology reports are a rich resource for advancing deep learning applications in medicine by leveraging the large volume of data continuously being updated, integrated, and shared. However, there are significant challenges as well, largely due to the ambiguity and subtlety of natural language. We propose a hybrid strategy that combines semantic-dictionary mapping and word2vec modeling for creating dense vector embeddings of free-text radiology reports. Our method leverages the benefits of both semantic-dictionary mapping as well as unsupervised learning. Using the vector representation, we automatically classify the radiology reports into three classes denoting confidence in the diagnosis of intracranial hemorrhage by the interpreting radiologist. We performed experiments with varying hyperparameter settings of the word embeddings and a range of different classifiers. Best performance achieved was a weighted precision of 88% and weighted recall of 90%. Our work offers the potential to leverage unstructured electronic health record data by allowing direct analysis of narrative clinical notes.Comment: AMIA Annual Symposium 201

arXiv.org e-Print Archive

eScholarship - University of California

Survey on reinforcement learning for language processing

Author: Martin-Gonzalez Anabel
Navarro-Guerrero Nicolas
Uc-Cetina Victor
Weber Cornelius
Wermter Stefan
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 15/03/2022
Field of study

In recent years some researchers have explored the use of reinforcement learning (RL) algorithms as key components in the solution of various natural language processing tasks. For instance, some of these algorithms leveraging deep neural learning have found their way into conversational systems. This paper reviews the state of the art of RL methods for their possible use for different problems of natural language processing, focusing primarily on conversational systems, mainly due to their growing relevance. We provide detailed descriptions of the problems as well as discussions of why RL is well-suited to solve them. Also, we analyze the advantages and limitations of these methods. Finally, we elaborate on promising research directions in natural language processing that might benefit from reinforcement learning

arXiv.org e-Print Archive

Cultural Compass: Predicting Transfer Learning Success in Offensive Language Detection with Cultural Features

Author: Chen Wenyu
Hershcovich Daniel
Karamolegkou Antonia
Zhou Li
Publication venue
Publication date: 10/10/2023
Field of study

The increasing ubiquity of language technology necessitates a shift towards considering cultural diversity in the machine learning realm, particularly for subjective tasks that rely heavily on cultural nuances, such as Offensive Language Detection (OLD). Current understanding underscores that these tasks are substantially influenced by cultural values, however, a notable gap exists in determining if cultural features can accurately predict the success of cross-cultural transfer learning for such subjective tasks. Addressing this, our study delves into the intersection of cultural features and transfer learning effectiveness. The findings reveal that cultural value surveys indeed possess a predictive power for cross-cultural transfer learning success in OLD tasks and that it can be further improved using offensive word distance. Based on these results, we advocate for the integration of cultural information into datasets. Additionally, we recommend leveraging data sources rich in cultural information, such as surveys, to enhance cultural adaptability. Our research signifies a step forward in the quest for more inclusive, culturally sensitive language technologies.Comment: Findings of EMNLP 202

arXiv.org e-Print Archive

Machine translation as an underrated ingredient? : solving classification tasks with large language models for comparative research

Author: Feldmann Ádám
Máté Ákos
Sebők Miklós
Stolicki Dariusz
Wordliczek Łukasz
Publication venue
Publication date: 01/01/2023
Field of study

While large language models have revolutionised computational text analysis methods, the field is still tilted towards English language resources. Even as there are pre-trained models for some "smaller" languages, the coverage is far from universal, and pre-training large language models is an expensive and complicated task. This uneven language coverage limits comparative social research in terms of its geographical and linguistic scope. We propose a solution that sidesteps these issues by leveraging transfer learning and open-source machine translation. We use English as a bridge language between Hungarian and Polish bills and laws to solve a classification task related to the Comparative Agendas Project (CAP) coding scheme. Using the Hungarian corpus as training data for model fine-tuning, we categorise the Polish laws into 20 CAP categories. In doing so, we compare the performance of Transformer-based deep learning models (monolinguals, such as BERT, and multilinguals such as XLM-RoBERTa) and machine learning algorithms (e.g., SVM). Results show that the fine-tuned large language models outperform the traditional supervised learning benchmarks but are themselves surpassed by the machine translation approach. Overall, the proposed solution demonstrates a viable option for applying a transfer learning framework for low-resource languages and achieving state-of-the-art results without requiring expensive pre-training

Jagiellonian Univeristy Repository

Language Grounding through Social Interactions and Curiosity-Driven Multi-Goal Learning

Author: Colas Cédric
Dominey Peter
Dussoux Jean-Michel
Lair Nicolas
Oudeyer Pierre-Yves
Portelas Rémy
Publication venue: HAL CCSD
Publication date: 08/11/2019
Field of study

International audienceAutonomous reinforcement learning agents, like children, do not have access to predefined goals and reward functions. They must discover potential goals, learn their own reward functions and engage in their own learning trajectory. Children, however, benefit from exposure to language, helping to organize and mediate their thought. We propose LE2 (Language Enhanced Exploration), a learning algorithm leveraging intrinsic motivations and natural language (NL) interactions with a descriptive social partner (SP). Using NL descriptions from the SP, it can learn an NL-conditioned reward function to formulate goals for intrinsically motivated goal exploration and learn a goal-conditioned policy. By exploring, collecting descriptions from the SP and jointly learning the reward function and the policy, the agent grounds NL descriptions into real behavioral goals. From simple goals discovered early to more complex goals discovered by experimenting on simpler ones, our agent autonomously builds its own behavioral repertoire. This naturally occurring curriculum is supplemented by an active learning curriculum resulting from the agent's intrinsic motivations. Experiments are presented with a simulated robotic arm that interacts with several objects including tools

arXiv.org e-Print Archive

INRIA a CCSD electronic archive server

Hal-Diderot