Transfer Learning for Power Outage Detection Task with Limited Training Data
Early detection of power outages is crucial for maintaining a reliable power
distribution system. This research investigates the use of transfer learning
and language models in detecting outages with limited labeled data. By
leveraging pretraining and transfer learning, models can generalize to unseen
classes.
Using a curated balanced dataset of social media tweets related to power
outages, we conducted experiments using zero-shot and few-shot learning. Our
hypothesis is that Language Models pretrained with limited data could achieve
high performance in outage detection tasks over baseline models. Results show
that while classical models outperform zero-shot Language Models, few-shot
fine-tuning significantly improves their performance. For example, with 10%
fine-tuning, BERT achieves 81.3% accuracy (+15.3%), and GPT achieves 74.5%
accuracy (+8.5%). This has practical implications for analyzing and localizing
outages in scenarios with limited data availability.
Our evaluation provides insights into the potential of few-shot fine-tuning
with Language Models for power outage detection, highlighting their strengths
and limitations. This research contributes to the knowledge base of leveraging
advanced natural language processing techniques for managing critical
infrastructure.
Investigating Linguistic Pattern Ordering in Hierarchical Natural Language Generation
Natural language generation (NLG) is a critical component of spoken dialogue
systems and can be divided into two phases: (1) sentence planning: deciding
the overall sentence structure, (2) surface realization: determining specific
word forms and flattening the sentence structure into a string. With the rise
of deep learning, most modern NLG models are based on a sequence-to-sequence
(seq2seq) model, which basically contains an encoder-decoder structure; these
NLG models generate sentences from scratch by jointly optimizing sentence
planning and surface realization. However, such a simple encoder-decoder
architecture usually fails to generate long, complex sentences, because the
decoder has difficulty learning all grammar and diction knowledge well. This
paper introduces an NLG model with a hierarchical attentional decoder, where
the hierarchy focuses on leveraging linguistic knowledge in a specific order.
The experiments show that the proposed method significantly outperforms the
traditional seq2seq model with a smaller model size, and the design of the
hierarchical attentional decoder can be applied to various NLG systems.
Furthermore, different generation strategies based on linguistic patterns are
investigated and analyzed in order to guide future NLG research work.
Comment: accepted by the 7th IEEE Workshop on Spoken Language Technology (SLT
2018). arXiv admin note: text overlap with arXiv:1808.0274
Intelligent Word Embeddings of Free-Text Radiology Reports
Radiology reports are a rich resource for advancing deep learning
applications in medicine by leveraging the large volume of data continuously
being updated, integrated, and shared. However, there are significant
challenges as well, largely due to the ambiguity and subtlety of natural
language. We propose a hybrid strategy that combines semantic-dictionary
mapping and word2vec modeling for creating dense vector embeddings of free-text
radiology reports. Our method leverages the benefits of both
semantic-dictionary mapping as well as unsupervised learning. Using the vector
representation, we automatically classify the radiology reports into three
classes denoting confidence in the diagnosis of intracranial hemorrhage by the
interpreting radiologist. We performed experiments with varying hyperparameter
settings of the word embeddings and a range of different classifiers. Best
performance achieved was a weighted precision of 88% and weighted recall of
90%. Our work offers the potential to leverage unstructured electronic health
record data by allowing direct analysis of narrative clinical notes.
Comment: AMIA Annual Symposium 201
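The weighted precision and recall reported above can be illustrated with a minimal sketch. It assumes "weighted" means each class's score is weighted by its share of the true labels; the abstract does not state the exact definition. The function name and the three-class labels below are hypothetical and are not data from the paper.

```python
from collections import Counter


def weighted_precision_recall(y_true, y_pred):
    """Average per-class precision and recall, weighted by class support.

    Assumption: the weight of a class is its fraction of the true labels.
    """
    support = Counter(y_true)
    total = len(y_true)
    precision = 0.0
    recall = 0.0
    for label, n_true in support.items():
        # True positives: items of this class that were predicted as this class.
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == label and p == label)
        # Number of items predicted as this class (correctly or not).
        n_pred = sum(1 for p in y_pred if p == label)
        class_precision = tp / n_pred if n_pred else 0.0
        class_recall = tp / n_true
        weight = n_true / total
        precision += weight * class_precision
        recall += weight * class_recall
    return precision, recall


# Hypothetical confidence labels for eight reports (illustration only).
y_true = ["high", "high", "low", "low", "none", "none", "none", "none"]
y_pred = ["high", "low", "low", "low", "none", "none", "none", "high"]
p, r = weighted_precision_recall(y_true, y_pred)
print(f"weighted precision={p:.3f}, weighted recall={r:.3f}")
```

Under this definition, weighted recall equals overall accuracy, so on imbalanced classes it is mostly driven by the largest class.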
Survey on reinforcement learning for language processing
In recent years some researchers have explored the use of reinforcement
learning (RL) algorithms as key components in the solution of various natural
language processing tasks. For instance, some of these algorithms leveraging
deep neural learning have found their way into conversational systems. This
paper reviews the state of the art of RL methods for their possible use for
different problems of natural language processing, focusing primarily on
conversational systems, mainly due to their growing relevance. We provide
detailed descriptions of the problems as well as discussions of why RL is
well-suited to solve them. Also, we analyze the advantages and limitations of
these methods. Finally, we elaborate on promising research directions in
natural language processing that might benefit from reinforcement learning.
Cultural Compass: Predicting Transfer Learning Success in Offensive Language Detection with Cultural Features
The increasing ubiquity of language technology necessitates a shift towards
considering cultural diversity in the machine learning realm, particularly for
subjective tasks that rely heavily on cultural nuances, such as Offensive
Language Detection (OLD). Current understanding underscores that these tasks
are substantially influenced by cultural values; however, a notable gap exists
in determining if cultural features can accurately predict the success of
cross-cultural transfer learning for such subjective tasks. Addressing this,
our study delves into the intersection of cultural features and transfer
learning effectiveness. The findings reveal that cultural value surveys indeed
possess predictive power for cross-cultural transfer learning success in OLD
tasks and that this can be further improved using offensive word distance. Based
on these results, we advocate for the integration of cultural information into
datasets. Additionally, we recommend leveraging data sources rich in cultural
information, such as surveys, to enhance cultural adaptability. Our research
signifies a step forward in the quest for more inclusive, culturally sensitive
language technologies.
Comment: Findings of EMNLP 202
Machine translation as an underrated ingredient? : solving classification tasks with large language models for comparative research
While large language models have revolutionised computational text analysis methods, the field is still tilted towards English language resources. Even though there are pre-trained models for some "smaller" languages, the coverage is far from universal, and pre-training large language models is an expensive and complicated task. This uneven language coverage limits comparative social research in terms of its geographical and linguistic scope. We propose a solution that sidesteps these issues by leveraging transfer learning and open-source machine translation. We use English as a bridge language between Hungarian and Polish bills and laws to solve a classification task related to the Comparative Agendas Project (CAP) coding scheme. Using the Hungarian corpus as training data for model fine-tuning, we categorise the Polish laws into 20 CAP categories. In doing so, we compare the performance of Transformer-based deep learning models (monolingual, such as BERT, and multilingual, such as XLM-RoBERTa) and machine learning algorithms (e.g., SVM). Results show that the fine-tuned large language models outperform the traditional supervised learning benchmarks but are themselves surpassed by the machine translation approach. Overall, the proposed solution demonstrates a viable option for applying a transfer learning framework for low-resource languages and achieving state-of-the-art results without requiring expensive pre-training.
Language Grounding through Social Interactions and Curiosity-Driven Multi-Goal Learning
Autonomous reinforcement learning agents, like children, do not have access to predefined goals and reward functions. They must discover potential goals, learn their own reward functions and engage in their own learning trajectory. Children, however, benefit from exposure to language, helping to organize and mediate their thought. We propose LE2 (Language Enhanced Exploration), a learning algorithm leveraging intrinsic motivations and natural language (NL) interactions with a descriptive social partner (SP). Using NL descriptions from the SP, it can learn an NL-conditioned reward function to formulate goals for intrinsically motivated goal exploration and learn a goal-conditioned policy. By exploring, collecting descriptions from the SP and jointly learning the reward function and the policy, the agent grounds NL descriptions into real behavioral goals. From simple goals discovered early to more complex goals discovered by experimenting on simpler ones, our agent autonomously builds its own behavioral repertoire. This naturally occurring curriculum is supplemented by an active learning curriculum resulting from the agent's intrinsic motivations. Experiments are presented with a simulated robotic arm that interacts with several objects including tools.