45 research outputs found

Reframing in Clustering: An Introductory Survey

Author: Rahman Md. Geaur
Publication venue: 'International Journal of Computer Engineering and Applications'
Publication date: 08/07/2018

Reframing is an essential task for improving the performance of machine learning and data mining algorithms in areas where the context changes between the source and target domains. A major assumption in many reframing algorithms is that the target domain has some labelled data. However, in many real-world applications, this assumption may not hold. For example, we sometimes have a clustering task in one domain of interest but only have sufficient source data in another domain, where the latter data may be in a different feature space or follow a different data distribution. Moreover, both source and target data may be unlabelled. In such cases, reframing in clustering, if done successfully, would greatly improve clustering performance while avoiding much expensive data labelling effort. In recent years, reframing in clustering has emerged as a new clustering framework to address this problem. In this paper, we present a review of state-of-the-art approaches to reframing in clustering; to the best of our knowledge, no such review exists in the literature. We give a definition of reframing in clustering. We also explore some potential future issues in this area of research.

International Journal of Computer (IJC - Global Society of Scientific Research and Researchers, GSSRR)

Understanding and Improving Visual Prompting: A Label-Mapping Perspective

Author: Chen Aochuan
Chen Pin-Yu
Liu Sijia
Yao Yuguang
Zhang Yihua
Publication date: 21/11/2022

We revisit and advance visual prompting (VP), an input prompting technique for vision tasks. VP can reprogram a fixed, pre-trained source model to accomplish downstream tasks in the target domain by simply incorporating universal prompts (in terms of input perturbation patterns) into downstream data points. Yet, it remains elusive why VP stays effective even given a ruleless label mapping (LM) between the source classes and the target classes. Inspired by the above, we ask: How is LM interrelated with VP? And how to exploit such a relationship to improve its accuracy on target tasks? We peer into the influence of LM on VP and provide an affirmative answer that a better 'quality' of LM (assessed by mapping precision and explanation) can consistently improve the effectiveness of VP. This is in contrast to the prior art where the factor of LM was missing. To optimize LM, we propose a new VP framework, termed ILM-VP (iterative label mapping-based visual prompting), which automatically re-maps the source labels to the target labels and progressively improves the target task accuracy of VP. Further, when using a contrastive language-image pretrained (CLIP) model, we propose to integrate an LM process to assist the text prompt selection of CLIP and to improve the target task accuracy. Extensive experiments demonstrate that our proposal significantly outperforms state-of-the-art VP methods. As highlighted below, we show that when reprogramming an ImageNet-pretrained ResNet-18 to 13 target tasks, our method outperforms baselines by a substantial margin, e.g., 7.9% and 6.7% accuracy improvements in transfer learning to the target Flowers102 and CIFAR100 datasets. Besides, our proposal on CLIP-based VP provides 13.7% and 7.1% accuracy improvements on Flowers102 and DTD respectively. Our code is available at https://github.com/OPTML-Group/ILM-VP

arXiv.org e-Print Archive

Transfer Learning for Historical Corpora: An Assessment on Post-OCR Correction and Named Entity Recognition

Author: Colavizza G.
Todorov K.
Publication date: 01/01/2020

International Migration, Integration and Social Cohesion online publications

Transfer learning for historical corpora: An assessment on post-OCR correction and named entity recognition

Author: Colavizza Giovanni
Todorov Konstantin
Publication venue: CEUR-WS
Publication date: 01/01/2020

Transfer learning in Natural Language Processing, mainly in the form of pre-trained language models, has recently delivered substantial gains across a range of tasks. Scholars and practitioners working with OCRed historical corpora are thus increasingly exploring the use of pre-trained language models. Nevertheless, the specific challenges posed by historical documents, including OCR quality and linguistic change, call for a critical assessment of the use of pre-trained language models in this setting. We consider two shared tasks, ICDAR2019 (post-OCR correction) and CLEF-HIPE-2020 (Named Entity Recognition, NER), and systematically assess the use of pre-trained language models with data in French, German and English. We find that using pre-trained language models helps with NER but less so with post-OCR correction. Pre-trained language models should therefore be used critically when working with OCRed historical corpora. We release our code base to allow others to replicate our results and test other pre-trained representations.

Archivio istituzionale della ricerca - Alma Mater Studiorum Università di Bologna

On surrogate supervision multi-view learning

Author: Jin Gaole
Publication venue: 'Oregon State University'

Data can be represented in multiple views. Traditional multi-view learning methods (e.g., co-training, multi-task learning) focus on improving learning performance using information from the auxiliary view, even though information from the target view is sufficient for the learning task. In contrast, this work addresses a semi-supervised case of multi-view learning, surrogate supervision multi-view learning, in which labels are available only on limited views and a classifier must be obtained on the target view, where labels are missing. In surrogate supervision multi-view learning, one cannot obtain a classifier without information from the auxiliary view. To solve this challenging problem, we propose discriminative and generative approaches.

ScholarsArchive@OSU