7,724 research outputs found
Boosting Multi-Modal E-commerce Attribute Value Extraction via Unified Learning Scheme and Dynamic Range Minimization
With the prosperity of e-commerce industry, various modalities, e.g., vision
and language, are utilized to describe product items. It is an enormous
challenge to understand such diversified data, especially via extracting the
attribute-value pairs in text sequences with the aid of helpful image regions.
Although a series of previous works have been dedicated to this task, there
remain seldomly investigated obstacles that hinder further improvements: 1)
Parameters from up-stream single-modal pretraining are inadequately applied,
without proper jointly fine-tuning in a down-stream multi-modal task. 2) To
select descriptive parts of images, a simple late fusion is widely applied,
regardless of priori knowledge that language-related information should be
encoded into a common linguistic embedding space by stronger encoders. 3) Due
to diversity across products, their attribute sets tend to vary greatly, but
current approaches predict with an unnecessary maximal range and lead to more
potential false positives. To address these issues, we propose in this paper a
novel approach to boost multi-modal e-commerce attribute value extraction via
unified learning scheme and dynamic range minimization: 1) Firstly, a unified
scheme is designed to jointly train a multi-modal task with pretrained
single-modal parameters. 2) Secondly, a text-guided information range
minimization method is proposed to adaptively encode descriptive parts of each
modality into an identical space with a powerful pretrained linguistic model.
3) Moreover, a prototype-guided attribute range minimization method is proposed
to first determine the proper attribute set of the current product, and then
select prototypes to guide the prediction of the chosen attributes. Experiments
on the popular multi-modal e-commerce benchmarks show that our approach
achieves superior performance over the other state-of-the-art techniques
Econometrics meets sentiment : an overview of methodology and applications
The advent of massive amounts of textual, audio, and visual data has spurred the development of econometric methodology to transform qualitative sentiment data into quantitative sentiment variables, and to use those variables in an econometric analysis of the relationships between sentiment and other variables. We survey this emerging research field and refer to it as sentometrics, which is a portmanteau of sentiment and econometrics. We provide a synthesis of the relevant methodological approaches, illustrate with empirical results, and discuss useful software
Harvesting Entities from the Web Using Unique Identifiers -- IBEX
In this paper we study the prevalence of unique entity identifiers on the
Web. These are, e.g., ISBNs (for books), GTINs (for commercial products), DOIs
(for documents), email addresses, and others. We show how these identifiers can
be harvested systematically from Web pages, and how they can be associated with
human-readable names for the entities at large scale.
Starting with a simple extraction of identifiers and names from Web pages, we
show how we can use the properties of unique identifiers to filter out noise
and clean up the extraction result on the entire corpus. The end result is a
database of millions of uniquely identified entities of different types, with
an accuracy of 73--96% and a very high coverage compared to existing knowledge
bases. We use this database to compute novel statistics on the presence of
products, people, and other entities on the Web.Comment: 30 pages, 5 figures, 9 tables. Complete technical report for A.
Talaika, J. A. Biega, A. Amarilli, and F. M. Suchanek. IBEX: Harvesting
Entities from the Web Using Unique Identifiers. WebDB workshop, 201
A novel Auto-ML Framework for Sarcasm Detection
Many domains have sarcasm or verbal irony presented in the text of reviews, tweets, comments, and dialog discussions. The purpose of this research is to classify sarcasm for multiple domains using the deep learning based AutoML framework. The proposed AutoML framework has five models in the model search pipeline, these five models are the combination of convolutional neural network (CNN), Long Short-Term Memory (LSTM), deep neural network (DNN), and Bidirectional Long Short-Term Memory (BiLSTM). The hybrid combination of CNN, LSTM, and DNN models are presented as CNN-LSTM-DNN, LSTM-DNN, BiLSTM-DNN, and CNN-BiLSTM-DNN. This work has proposed the algorithms that contrast polarities between terms and phrases, which are categorized into implicit and explicit incongruity categories. The incongruity and pragmatic features like punctuation, exclamation marks, and others integrated into the AutoML DeepConcat framework models. That integration was possible when the DeepConcat AutoML framework initiate a model search pipeline for five models to achieve better performance. Conceptually, DeepConcat means that model will integrate with generalized features. It was evident that the pretrain model BiLSTM achieved a better performance of 0.98 F1 when compared with the other five model performances. Similarly, the AutoML based BiLSTM-DNN model achieved the best performance of 0.98 F1, which is better than core approaches and existing state-of-the-art Tweeter tweet dataset, Amazon reviews, and dialog discussion comments. The proposed AutoML framework has compared performance metrics F1 and AUC and discovered that F1 is better than AUC. The integration of all feature categories achieved a better performance than the individual category of pragmatic and incongruity features. This research also evaluated the performance of the dropout layer hyperparameter and it achieved better performance than the fixed percentage like 10% of dropout parameter of the AutoML based Bayesian optimization. Proposed AutoML framework DeepConcat evaluated best pretrain models BiLSTM-DNN and CNN-CNN-DNN to transfer knowledge across domains like Amazon reviews and Dialog discussion comments (text) using the last strategy, full layer, and our fade-out freezing strategies. In the transfer learning fade-out strategy outperformed the existing state-of-the-art model BiLSTM-DNN, the performance is 0.98 F1 on tweets, 0.85 F1 on Amazon reviews, and 0.87 F1 on the dialog discussion SCV2-Gen dataset. Further, all strategies with various domains can be compared for the best model selection
Towards a Universal Wordnet by Learning from Combined Evidenc
Lexical databases are invaluable sources of knowledge about words and their meanings, with numerous applications in areas like NLP, IR, and AI. We propose a methodology for the automatic construction of a large-scale multilingual lexical database where words of many languages are hierarchically organized in terms of their meanings and their semantic relations to other words. This resource is bootstrapped from WordNet, a well-known English-language resource. Our approach extends WordNet with around 1.5 million meaning links for 800,000 words in over 200 languages, drawing on evidence extracted from a variety of resources including existing (monolingual) wordnets, (mostly bilingual) translation dictionaries, and parallel corpora. Graph-based scoring functions and statistical learning techniques are used to iteratively integrate this information and build an output graph. Experiments show that this wordnet has a high level of precision and coverage, and that it can be useful in applied tasks such as cross-lingual text classification
A Survey on Semantic Processing Techniques
Semantic processing is a fundamental research domain in computational
linguistics. In the era of powerful pre-trained language models and large
language models, the advancement of research in this domain appears to be
decelerating. However, the study of semantics is multi-dimensional in
linguistics. The research depth and breadth of computational semantic
processing can be largely improved with new technologies. In this survey, we
analyzed five semantic processing tasks, e.g., word sense disambiguation,
anaphora resolution, named entity recognition, concept extraction, and
subjectivity detection. We study relevant theoretical research in these fields,
advanced methods, and downstream applications. We connect the surveyed tasks
with downstream applications because this may inspire future scholars to fuse
these low-level semantic processing tasks with high-level natural language
processing tasks. The review of theoretical research may also inspire new tasks
and technologies in the semantic processing domain. Finally, we compare the
different semantic processing techniques and summarize their technical trends,
application trends, and future directions.Comment: Published at Information Fusion, Volume 101, 2024, 101988, ISSN
1566-2535. The equal contribution mark is missed in the published version due
to the publication policies. Please contact Prof. Erik Cambria for detail
- …