6,949 research outputs found
Breaking Sticks and Ambiguities with Adaptive Skip-gram
Recently proposed Skip-gram model is a powerful method for learning
high-dimensional word representations that capture rich semantic relationships
between words. However, Skip-gram as well as most prior work on learning word
representations does not take into account word ambiguity and maintain only
single representation per word. Although a number of Skip-gram modifications
were proposed to overcome this limitation and learn multi-prototype word
representations, they either require a known number of word meanings or learn
them using greedy heuristic approaches. In this paper we propose the Adaptive
Skip-gram model which is a nonparametric Bayesian extension of Skip-gram
capable to automatically learn the required number of representations for all
words at desired semantic resolution. We derive efficient online variational
learning algorithm for the model and empirically demonstrate its efficiency on
word-sense induction task
BlogForever D2.6: Data Extraction Methodology
This report outlines an inquiry into the area of web data extraction, conducted within the context of blog preservation. The report reviews theoretical advances and practical developments for implementing data extraction. The inquiry is extended through an experiment that demonstrates the effectiveness and feasibility of implementing some of the suggested approaches. More specifically, the report discusses an approach based on unsupervised machine learning that employs the RSS feeds and HTML representations of blogs. It outlines the possibilities of extracting semantics available in blogs and demonstrates the benefits of exploiting available standards such as microformats and microdata. The report proceeds to propose a methodology for extracting and processing blog data to further inform the design and development of the BlogForever platform
A Survey on Few-Shot Class-Incremental Learning
Large deep learning models are impressive, but they struggle when real-time data is not available. Few-shot class-incremental learning (FSCIL) poses a significant challenge for deep neural networks to learn new tasks from just a few labeled samples without forgetting the previously learned ones. This setup can easily leads to catastrophic forgetting and overfitting problems, severely affecting model performance. Studying FSCIL helps overcome deep learning model limitations on data volume and acquisition time, while improving practicality and adaptability of machine learning models. This paper provides a comprehensive survey on FSCIL. Unlike previous surveys, we aim to synthesize few-shot learning and incremental learning, focusing on introducing FSCIL from two perspectives, while reviewing over 30 theoretical research studies and more than 20 applied research studies. From the theoretical perspective, we provide a novel categorization approach that divides the field into five subcategories, including traditional machine learning methods, meta learning-based methods, feature and feature space-based methods, replay-based methods, and dynamic network structure-based methods. We also evaluate the performance of recent theoretical research on benchmark datasets of FSCIL. From the application perspective, FSCIL has achieved impressive achievements in various fields of computer vision such as image classification, object detection, and image segmentation, as well as in natural language processing and graph. We summarize the important applications. Finally, we point out potential future research directions, including applications, problem setups, and theory development. Overall, this paper offers a comprehensive analysis of the latest advances in FSCIL from a methodological, performance, and application perspective
W-procer: Weighted Prototypical Contrastive Learning for Medical Few-Shot Named Entity Recognition
Contrastive learning has become a popular solution for few-shot Name Entity
Recognization (NER). The conventional configuration strives to reduce the
distance between tokens with the same labels and increase the distance between
tokens with different labels. The effect of this setup may, however, in the
medical domain, there are a lot of entities annotated as OUTSIDE (O), and they
are undesirably pushed apart to other entities that are not labeled as OUTSIDE
(O) by the current contrastive learning method end up with a noisy prototype
for the semantic representation of the label, though there are many OUTSIDE (O)
labeled entities are relevant to the labeled entities. To address this
challenge, we propose a novel method named Weighted Prototypical Contrastive
Learning for Medical Few Shot Named Entity Recognization (W-PROCER). Our
approach primarily revolves around constructing the prototype-based contractive
loss and weighting network. These components play a crucial role in assisting
the model in differentiating the negative samples from OUTSIDE (O) tokens and
enhancing the discrimination ability of contrastive learning. Experimental
results show that our proposed W-PROCER framework significantly outperforms the
strong baselines on the three medical benchmark datasets
- âŠ