Emoji's sentiment score estimation using convolutional neural network with multi-scale emoji images
Emojis are small images, symbols, or icons used in social media. Several well-known emojis have been ranked and assigned sentiment scores. These ranked emojis can be used for sentiment analysis; however, many newly released emojis have not yet been ranked and therefore have no sentiment score. This paper proposes a new method to estimate the sentiment score of any unranked emotion emoji from its image by classifying it into the class of the most similar ranked emoji and then estimating its sentiment score from that emoji's score. The accuracy of sentiment score estimation is improved by using multi-scale images. The ranked emoji image data set consisted of 613 classes, each with 161 emoji images drawn from three different platforms. The images were cropped to produce multi-scale images. Classification and estimation were performed with a convolutional neural network (CNN) applied to the multi-scale emoji images, combined with the proposed voting algorithm, majority voting with probability (MVP). The proposed method was evaluated on two datasets: ranked emoji images and unranked emoji images. The accuracies of sentiment score estimation on the ranked and unranked emoji test images are 98% and 51%, respectively.
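A minimal sketch of the "majority voting with probability" (MVP) idea the abstract describes: each multi-scale crop of an unranked emoji is classified by a CNN, the per-crop class probabilities are summed per class, and the winning ranked emoji's sentiment score is reused. The class names, probabilities, and scores below are invented for illustration; the real method uses CNN outputs over 613 ranked-emoji classes.

```python
def mvp_vote(crop_probs, sentiment_scores):
    """Combine per-crop probability vectors by summing per class,
    then return (winning class, that class's sentiment score)."""
    totals = {}
    for probs in crop_probs:            # one probability dict per crop
        for cls, p in probs.items():
            totals[cls] = totals.get(cls, 0.0) + p
    winner = max(totals, key=totals.get)
    return winner, sentiment_scores[winner]

# Hypothetical CNN outputs for three multi-scale crops of one emoji
crops = [
    {"grinning": 0.7, "crying": 0.2, "neutral": 0.1},
    {"grinning": 0.4, "crying": 0.5, "neutral": 0.1},
    {"grinning": 0.6, "crying": 0.3, "neutral": 0.1},
]
scores = {"grinning": 0.64, "crying": -0.39, "neutral": 0.0}
print(mvp_vote(crops, scores))  # ('grinning', 0.64): 1.7 vs 1.0 vs 0.3
```

Summing probabilities (rather than counting hard votes) lets a crop that is confident outweigh two crops that are only marginally leaning the other way.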
Logical disagreement: an epistemological study
While the epistemic significance of disagreement has been a popular topic in epistemology for at least a decade, little attention has been paid to logical disagreement. This monograph is meant as a remedy. The text starts with an extensive literature review of the epistemology of (peer) disagreement and sets the stage for an epistemological study of logical disagreement. The guiding thread for the rest of the work is then three distinct readings of the ambiguous term "logical disagreement". Chapters 1 and 2 focus on the Ad Hoc Reading, according to which logical disagreements occur when two subjects take incompatible doxastic attitudes toward a specific proposition in or about logic. Chapter 2 presents a new counterexample to the widely discussed Uniqueness Thesis. Chapters 3 and 4 focus on the Theory Choice Reading of "logical disagreement". According to this interpretation, logical disagreements occur at the level of entire logical theories rather than individual entailment-claims. Chapter 4 concerns a key question from the philosophy of logic, viz., how we have epistemic justification for claims about logical consequence. In Chapters 5 and 6 we turn to the Akrasia Reading. On this reading, logical disagreements occur when there is a mismatch between the deductive strength of one's background logic and the logical theory one prefers (officially). Chapter 6 introduces logical akrasia by analogy to epistemic akrasia and presents a novel dilemma. Chapter 7 revisits the epistemology of peer disagreement and argues that the epistemic significance of central principles from the literature is at best deflated in the context of logical disagreement. The chapter also develops a simple formal model of deep disagreement in Default Logic, relating this to our general discussion of logical disagreement. The monograph ends in an epilogue with some reflections on the potential epistemic significance of convergence in logical theorizing.
Location Reference Recognition from Texts: A Survey and Comparison
A vast amount of location information exists in unstructured texts, such as social media posts, news stories, scientific articles, web pages, travel blogs, and historical archives. Geoparsing refers to recognizing location references in texts and identifying their geospatial representations. While geoparsing can benefit many domains, a summary of its specific applications is still missing. Further, there is a lack of a comprehensive review and comparison of existing approaches for location reference recognition, which is the first and core step of geoparsing. To fill these research gaps, this review first summarizes seven typical application domains of geoparsing: geographic information retrieval, disaster management, disease surveillance, traffic management, spatial humanities, tourism management, and crime management. We then review existing approaches for location reference recognition by categorizing them into four groups based on their underlying functional principle: rule-based, gazetteer matching-based, statistical learning-based, and hybrid approaches. Next, we thoroughly evaluate the correctness and computational efficiency of the 27 most widely used approaches for location reference recognition on 26 public datasets with different types of texts (e.g., social media posts and news stories) containing 39,736 location references worldwide. Results from this thorough evaluation can help inform future methodological developments and guide the selection of appropriate approaches based on application needs.
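A toy illustration of the gazetteer matching-based family the survey categorizes: scan text tokens against a list of place names, preferring longer multi-word matches. The gazetteer here is a three-entry set invented for the example; real systems draw on large resources such as GeoNames and must handle abbreviation, ambiguity, and spelling variation.

```python
def recognize_locations(text, gazetteer):
    """Greedy longest-match lookup of place names (up to 3 tokens)."""
    tokens = text.split()
    found = []
    i = 0
    while i < len(tokens):
        for n in (3, 2, 1):             # try longest candidate first
            cand = " ".join(tokens[i:i + n]).strip(".,")
            if cand in gazetteer:
                found.append(cand)
                i += n
                break
        else:                           # no match at this position
            i += 1
    return found

gaz = {"Paris", "New York", "Lake Victoria"}
print(recognize_locations("Flights from New York to Paris resume.", gaz))
# ['New York', 'Paris']
```

The longest-first scan is what keeps "New York" from being missed in favor of no match on "New" alone; statistical and hybrid approaches exist largely to cover the cases such exact matching cannot.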
Explainable text-based features in predictive models of crowdfunding campaigns
Reward-based crowdfunding offers an opportunity for innovative ventures that would not be supported through traditional financing. A key problem for those seeking funding is understanding which features of a crowdfunding campaign will sway the decisions of a sufficient number of funders. Predictive models of fund-raising campaigns, used in combination with Explainable AI methods, promise to provide such insights. However, previous work on Explainable AI has largely focused on quantitative structured data. In this study, our aim is to construct explainable models of human decisions based on analysis of natural language text, thus contributing to a fast-growing body of research on the use of Explainable AI for text analytics. We propose a novel method to construct predictions based on text via semantic clustering of sentences, which, compared with traditional methods using individual words and phrases, allows complex meaning contained in the text to be operationalised. Using experimental evaluation, we compare our proposed method to keyword extraction and topic modelling, which have traditionally been used in similar applications. Our results demonstrate that the sentence clustering method produces features with predictive power comparable to that of keyword-based methods and topic models, but which are much easier for human raters to interpret. We furthermore conduct a SHAP analysis of the models incorporating sentence clusters, demonstrating concrete insights into the types of natural language content that influence the outcome of crowdfunding campaigns.
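A highly simplified sketch of the feature construction the abstract describes: represent each campaign description as counts of sentences falling into semantic clusters, and feed those counts to a predictive model. For self-containedness, the sentence encoder and clustering step are replaced by a hand-written keyword lookup; the cluster names and vocabularies are invented and stand in for clusters learned from sentence embeddings.

```python
# Stand-in "clusters": in the real method these come from clustering
# sentence embeddings, not from fixed keyword lists.
CLUSTERS = {
    "reward":  {"reward", "perk", "backer"},
    "product": {"prototype", "design", "device"},
    "story":   {"dream", "journey", "passion"},
}

def cluster_features(description):
    """Map a campaign description to per-cluster sentence counts."""
    feats = {name: 0 for name in CLUSTERS}
    for sentence in description.lower().split("."):
        words = set(sentence.split())
        for name, vocab in CLUSTERS.items():
            if words & vocab:           # sentence touches this cluster
                feats[name] += 1
    return feats

text = ("Every backer gets a reward. Our prototype device works. "
        "We share our dream.")
print(cluster_features(text))  # {'reward': 1, 'product': 1, 'story': 1}
```

The point of the representation is interpretability: a SHAP value attached to the "reward" count reads as "sentences about rewards helped/hurt this campaign", which a keyword weight rarely does.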
A Simple and Effective Method of Cross-Lingual Plagiarism Detection
We present a simple cross-lingual plagiarism detection method applicable to a
large number of languages. The presented approach leverages open multilingual
thesauri for candidate retrieval task and pre-trained multilingual BERT-based
language models for detailed analysis. The method does not rely on machine
translation and word sense disambiguation when in use, and therefore is
suitable for a large number of languages, including under-resourced languages.
The effectiveness of the proposed approach is demonstrated on several existing
and new benchmarks, achieving state-of-the-art results for French, Russian, and
Armenian.
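The detailed-analysis step outlined above compares passages via embeddings from a multilingual BERT-based model. A minimal sketch of that comparison, with the embeddings replaced by made-up vectors (in practice they would come from a pre-trained multilingual encoder) and a similarity threshold chosen arbitrarily for illustration:

```python
import numpy as np

def cosine(a, b):
    """Cosine similarity between two embedding vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def is_plagiarised(src_vec, cand_vec, threshold=0.8):
    """Flag a candidate passage whose embedding is close to the source."""
    return cosine(src_vec, cand_vec) >= threshold

source    = np.array([0.9, 0.1, 0.3])    # hypothetical source embedding
candidate = np.array([0.88, 0.15, 0.25]) # near-duplicate in another language
unrelated = np.array([0.0, 1.0, 0.1])

print(is_plagiarised(source, candidate), is_plagiarised(source, unrelated))
# True False
```

Because the encoder maps semantically equivalent sentences from different languages to nearby vectors, no machine translation or word sense disambiguation is needed at comparison time, which is what makes the approach portable to under-resourced languages.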
Faithful Low-Resource Data-to-Text Generation through Cycle Training
Methods to generate text from structured data have advanced significantly in
recent years, primarily due to fine-tuning of pre-trained language models on
large datasets. However, such models can fail to produce output faithful to the
input data, particularly on out-of-domain data. Sufficient annotated data is
often not available for specific domains, leading us to seek an unsupervised
approach to improve the faithfulness of output text. Since the problem is
fundamentally one of consistency between the representations of the structured
data and text, we evaluate the effectiveness of cycle training in this work.
Cycle training uses two models which are inverses of each other: one that
generates text from structured data, and one which generates the structured
data from natural language text. We show that cycle training, when initialized
with a small amount of supervised data (100 samples in our case), achieves
nearly the same performance as fully supervised approaches for the data-to-text
generation task on the WebNLG, E2E, WTQ, and WSQL datasets. We perform
extensive empirical analysis with automated evaluation metrics and a newly
designed human evaluation schema to reveal different cycle training strategies'
effectiveness in reducing various types of generation errors. Our code is
publicly available at https://github.com/Edillower/CycleNLG.
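The cycle described above can be sketched schematically: one model maps structured data to text, its inverse maps text back to data, and cycle training rewards round trips that reconstruct the input. The two "models" below are trivial lookup functions written only to show the data flow; the actual method fine-tunes two pre-trained language models on each other's outputs.

```python
def data_to_text(record):
    """Stand-in for the data-to-text model."""
    return f"{record['subject']} was born in {record['city']}."

def text_to_data(text):
    """Stand-in for the inverse, text-to-data model."""
    subject, _, rest = text.partition(" was born in ")
    return {"subject": subject, "city": rest.rstrip(".")}

def cycle_consistent(record):
    """Core training signal: a record should survive the round trip."""
    return text_to_data(data_to_text(record)) == record

rec = {"subject": "Ada Lovelace", "city": "London"}
print(cycle_consistent(rec))  # True: the round trip reconstructs the record
```

When the round trip fails, the mismatch between the original and reconstructed record is exactly the kind of unfaithful output the training signal penalizes, which is why cycle training improves faithfulness without needing large amounts of annotated pairs.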
A Comprehensive Survey on Applications of Transformers for Deep Learning Tasks
The transformer is a deep neural network architecture that employs a self-attention mechanism
to comprehend the contextual relationships within sequential data. Unlike
conventional neural networks or updated versions of Recurrent Neural Networks
(RNNs) such as Long Short-Term Memory (LSTM), transformer models excel in
handling long dependencies between input sequence elements and enable parallel
processing. As a result, transformer-based models have attracted substantial
interest among researchers in the field of artificial intelligence. This can be
attributed to their immense potential and remarkable achievements, not only in
Natural Language Processing (NLP) tasks but also in a wide range of domains,
including computer vision, audio and speech processing, healthcare, and the
Internet of Things (IoT). Although several survey papers have been published
highlighting the transformer's contributions in specific fields, architectural
differences, or performance evaluations, there is still a significant absence
of a comprehensive survey paper encompassing its major applications across
various domains. Therefore, we undertook the task of filling this gap by
conducting an extensive survey of proposed transformer models from 2017 to
2022. Our survey encompasses the identification of the top five application
domains for transformer-based models, namely: NLP, Computer Vision,
Multi-Modality, Audio and Speech Processing, and Signal Processing. We analyze
the impact of highly influential transformer-based models in these domains and
subsequently classify them based on their respective tasks using a proposed
taxonomy. Our aim is to shed light on the existing potential and future
possibilities of transformers for enthusiastic researchers, thus contributing
to the broader understanding of this groundbreaking technology.
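The self-attention mechanism at the heart of the architectures this survey covers can be written in a few lines of NumPy: every position produces a query, key, and value, and each output is a softmax-weighted mix of all values. This is a single unmasked head with random weights, a sketch rather than a full transformer layer (no multi-head split, residuals, or layer norm).

```python
import numpy as np

def self_attention(X, Wq, Wk, Wv):
    """Scaled dot-product self-attention for one sequence (no masking)."""
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    scores = Q @ K.T / np.sqrt(K.shape[-1])          # pairwise similarities
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)   # softmax over keys
    return weights @ V                               # weighted mix of values

rng = np.random.default_rng(0)
X = rng.normal(size=(4, 8))                          # 4 tokens, dim 8
Wq, Wk, Wv = (rng.normal(size=(8, 8)) for _ in range(3))
out = self_attention(X, Wq, Wk, Wv)
print(out.shape)  # (4, 8): one contextualized vector per token
```

Because every token attends to every other token in one matrix product, long-range dependencies cost no more steps than adjacent ones, and the whole sequence is processed in parallel, the two advantages over RNNs noted above.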
- …