BSDAR: Beam Search Decoding with Attention Reward in Neural Keyphrase Generation
This study investigates two decoding problems in neural keyphrase
generation: sequence-length bias and beam diversity. We introduce an extension
of beam search inference based on word-level and n-gram-level attention scores
to adjust and constrain Seq2Seq predictions at test time. Results show that our
proposed solution overcomes the algorithm's bias toward shorter and nearly
identical sequences, yielding a significant improvement in decoding
performance when generating keyphrases that are present and absent in the
source text.
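The core idea, rescoring beam hypotheses with an attention-based reward to offset the length bias, can be sketched as follows. This is a minimal illustration, not the paper's exact BSDAR formulation: the `gamma` weight, the per-word `attention_scores`, and the length normalization are all illustrative assumptions.

```python
def rerank_with_attention_reward(beams, attention_scores, gamma=0.5):
    """Re-rank beam hypotheses with an attention-based reward.

    beams: list of (tokens, log_prob) pairs from beam search.
    attention_scores: hypothetical per-word attention mass over the
        source text, mapping token -> score.
    gamma: weight of the attention reward term (illustrative value).

    The adjusted score adds, to the length-normalized log-probability,
    a reward proportional to the attention mass a hypothesis covers.
    Longer hypotheses that attend to more of the source can then beat
    short, nearly identical ones.
    """
    reranked = []
    for tokens, log_prob in beams:
        reward = sum(attention_scores.get(t, 0.0) for t in tokens)
        score = log_prob / max(len(tokens), 1) + gamma * reward
        reranked.append((tokens, score))
    reranked.sort(key=lambda pair: pair[1], reverse=True)
    return reranked
```

With this rescoring, a three-word hypothesis that covers most of the source attention can outrank a one-word hypothesis that had the higher raw log-probability.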
KeyGen2Vec: Learning Document Embedding via Multi-label Keyword Generation in Question-Answering
Representing documents in a high-dimensional embedding space while preserving
the structural similarity between document sources has been a long-standing
goal of work on text representation learning. Current embedding models,
however, mainly rely on the availability of label supervision to increase the
expressiveness of the resulting embeddings. In contrast, unsupervised
embeddings are cheap, but they often cannot capture implicit structure in the
target corpus, particularly for samples that come from a different
distribution than the pretraining source.
Our study aims to loosen the dependency on label supervision by learning
document embeddings via a Sequence-to-Sequence (Seq2Seq) text generator.
Specifically, we reformulate the keyphrase generation task as multi-label
keyword generation in community-based Question Answering (cQA). Our empirical
results show that KeyGen2Vec is in general superior to a multi-label keyword
classifier by up to 14.7% in terms of Purity, Normalized Mutual Information
(NMI), and F1-Score. Interestingly, although the absolute advantage of
learning embeddings through label supervision is generally high across
evaluation datasets, KeyGen2Vec is competitive with a classifier that exploits
topic-label supervision in Yahoo! cQA with a larger number of latent topic
labels.
Comment: arXiv preprint
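The extraction step behind this idea, taking the encoder's summary state of a trained Seq2Seq keyword generator as the document embedding, can be sketched below. Everything here is a stand-in: the hash-based word vectors replace learned encoder embeddings, and mean pooling replaces the trained encoder's final state; only the extract-and-compare pattern is the point.

```python
import hashlib
import math

def word_vector(word, dim=8):
    """Deterministic pseudo-embedding for a word (a hypothetical
    stand-in for a learned encoder embedding table)."""
    digest = hashlib.sha256(word.encode("utf-8")).digest()
    return [b / 255.0 for b in digest[:dim]]

def encode_document(tokens, dim=8):
    """Mean-pool word vectors as a stand-in for the final encoder
    state of a Seq2Seq keyword generator. In the KeyGen2Vec setting
    the encoder is trained to generate keywords, so its summary state
    reflects topical structure; here we only illustrate extraction."""
    if not tokens:
        return [0.0] * dim
    vecs = [word_vector(t, dim) for t in tokens]
    return [sum(col) / len(vecs) for col in zip(*vecs)]

def cosine(a, b):
    """Cosine similarity between two embeddings."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0
```

Clustering metrics such as Purity and NMI are then computed over these document vectors, grouping documents whose embeddings are close in cosine space.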
Looking Deeper into Deep Learning Model: Attribution-based Explanations of TextCNN
Layer-wise Relevance Propagation (LRP) and saliency maps have recently been
used to explain the predictions of deep learning models, specifically in the
domain of text classification. Given different attribution-based explanations
that highlight relevant words for a predicted class label, experiments based
on word-deletion perturbation are a common evaluation method. This
word-removal approach, however, disregards any linguistic dependencies that
may exist between words or phrases in a sentence, which could semantically
guide a classifier to a particular prediction. In this paper, we present a
feature-based evaluation framework for comparing the two attribution methods
on customer reviews (public data sets) and Customer Due Diligence (CDD)
extracted reports (corporate data set). Instead of removing words based on
their relevance scores, we investigate perturbations based on removing
embedded features from intermediate layers of Convolutional Neural Networks.
Our experimental study covers embedded-word, embedded-document, and
embedded-n-gram explanations. Using the proposed framework, we provide a
visualization tool to assist analysts in reasoning toward the model's final
prediction.
Comment: NIPS 2018 Workshop on Challenges and Opportunities for AI in
Financial Services: the Impact of Fairness, Explainability, Accuracy, and
Privacy, Montréal, Canada
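The feature-removal evaluation described above can be sketched as follows: instead of deleting words from the input, the most relevant intermediate-layer features are zeroed out and the drop in the model's output is measured. This is a minimal sketch, assuming a toy linear read-out in place of a real TextCNN; the `relevance` scores stand in for LRP or saliency attributions.

```python
def perturb_features(feature_map, relevance, k):
    """Zero out the k features with the highest relevance scores.

    feature_map: intermediate-layer activations (e.g. pooled n-gram
        features from a TextCNN; values here are illustrative).
    relevance: attribution score per feature (e.g. from LRP or a
        saliency map).
    """
    top = set(sorted(range(len(relevance)),
                     key=lambda i: relevance[i],
                     reverse=True)[:k])
    return [0.0 if i in top else v for i, v in enumerate(feature_map)]

def toy_logit(feature_map, weights):
    """A stand-in linear read-out layer over the feature map, in place
    of the network's remaining layers."""
    return sum(f * w for f, w in zip(feature_map, weights))
```

A faithful attribution method should identify features whose removal causes a larger drop in the class logit than removing low-relevance features, without ever breaking the linguistic structure of the input sentence.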