An Activation Method of Topic Dictionary to Expand Training Data for Trend Rule Discovery

Author: Kyoko Makino
Shigeaki Sakurai
Shigeru Matsumoto
Publication venue: 'Hindawi Limited'
Publication date: 01/01/2014

This paper improves a method for predicting whether evaluation objects, such as companies and products, will be attractive in the near future. Attractiveness is evaluated by trend rules, which represent relationships among evaluation objects, keywords, and numerical changes related to the evaluation objects. The rules are inductively acquired from text sequential data and numerical sequential data. The method assigns evaluation objects to the text sequential data by activating a topic dictionary, which describes keywords representing the numerical change. Activating the dictionary can expand the amount of training data, and the expansion is anticipated to lead to the acquisition of more valid trend rules. The paper applies the method to the task of predicting attractive stock brands from news headlines and stock price sequences. Numerical experiments show that the method can improve the detection performance of evaluation objects.

Crossref

Directory of Open Access Journals

A Latent Dirichlet Allocation and Fuzzy Clustering Based Machine Learning Model for Text Thesaurus

Author: Dai Zong
Luo Jia
Yu Dongwen
Publication venue: Agora University Press
Publication date: 28/03/2020

Manual methods cannot practically process huge amounts of structured and semi-structured data. This study aims to solve the problem of processing such data with machine learning algorithms. We collected text data on the company’s public opinion through crawlers, used the Latent Dirichlet Allocation (LDA) algorithm to extract keywords from the text, and applied fuzzy clustering to group the keywords into topics. The topic keywords serve as a seed dictionary for new word discovery. To verify the efficiency of machine learning in new word discovery, algorithms based on association rules, N-Gram, PMI, and Word2vec were compared. The experimental results show that the Word2vec algorithm, based on a machine learning model, has the highest accuracy, recall, and F-value.
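
The abstract names PMI (pointwise mutual information) as one of the compared new word discovery baselines but does not give a formula. The sketch below uses the standard definition, PMI(a, b) = log2(p(a, b) / (p(a) p(b))), over adjacent token pairs. The function name `pmi_scores` and the toy sentence are illustrative assumptions, not taken from the paper.

```python
import math
from collections import Counter

def pmi_scores(tokens):
    """Return base-2 PMI for every adjacent token pair (hypothetical helper)."""
    unigrams = Counter(tokens)
    bigrams = Counter(zip(tokens, tokens[1:]))
    n_uni = sum(unigrams.values())
    n_bi = sum(bigrams.values())
    scores = {}
    for (a, b), count in bigrams.items():
        p_ab = count / n_bi          # probability of the pair
        p_a = unigrams[a] / n_uni    # probability of the first token
        p_b = unigrams[b] / n_uni    # probability of the second token
        scores[(a, b)] = math.log2(p_ab / (p_a * p_b))
    return scores

# Toy input, invented for illustration only.
tokens = "machine learning helps new word discovery and machine learning scales".split()
scores = pmi_scores(tokens)
```

Pairs with a high score co-occur more often than their individual frequencies would suggest, which is why PMI is commonly used to propose candidate multi-token words.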

Agora University Editing House: Journals

Multiple-Domain Sentiment Classification for Cantonese Using a Combined Approach

Author: Chai P.Y.F.
Choi Y.S.
Lee M.C.M.
Ngai E.W.T.
Publication venue: AIS Electronic Library (AISeL)
Publication date: 26/06/2018

In this study, we proposed a combined approach that amalgamates machine learning and lexicon-based approaches for multiple-domain sentiment classification in support of Cantonese-based social media analysis. Our study contributes to the existing literature not only by investigating the effectiveness of the proposed combined approach for supporting social media analysis in the Cantonese context but also by verifying that the proposed method outperforms the baseline approaches commonly used in the literature. We demonstrated that social media network-based classifiers can be general classifiers that support multiple-domain sentiment classification.

AIS Electronic Library (AISeL)

Neurocognitive Informatics Manifesto.

Author: Duch Wlodzislaw
Publication venue: California Polytechnic State University
Publication date: 01/01/2009

Informatics studies all aspects of the structure of natural and artificial information systems. Theoretical and abstract approaches to information have made great advances, but human information processing is still unmatched in many areas, including information management, representation and understanding. Neurocognitive informatics is a new, emerging field that should help to improve the matching of artificial and natural systems, and inspire better computational algorithms to solve problems that are still beyond the reach of machines. In this position paper examples of neurocognitive inspirations and promising directions in this area are given

CiteSeerX

CogPrints Cognitive Sciences Eprint Archive

A hybrid model for aspect-based sentiment analysis on customer feedback: research on the mobile commerce sector in Vietnam

Author: Bui Hien Minh
Ho Thanh Trung
Thai Phung Kim
Publication venue: Universitas Ahmad Dahlan
Publication date: 01/07/2023

Feedback and comments on mobile commerce applications are valuable information sources that reflect the quality of products or services. They indicate whether opinions are positive or negative and help businesses monitor brand and product sentiment in customers’ feedback and understand customers’ needs. However, the increasing number of comments makes it increasingly difficult to understand customers using manual methods. To solve this problem, this study builds a hybrid research model based on aspect mining and comment classification for aspect-based sentiment analysis (ABSA) to understand customers and their experiences in depth. Based on previous classification results, we first construct a dictionary of positive and negative words in the e-commerce field. Then, POS tagging is applied to classify Vietnamese words and extract aspects of mobile commerce related to positive or negative words. The model is implemented with machine learning and deep learning methods on a corpus of more than 1,000,000 customer opinions collected from Vietnam's four largest mobile commerce applications. Experimental results show that the Bi-LSTM method has the highest accuracy, 92.01%; it is selected for the proposed model to analyze the viewpoint of words on real data. The findings indicate that the proposed hybrid model can be applied to monitor online customer experience in real time, enable administrators to make timely and accurate decisions, and improve the quality of products and services to gain a competitive advantage.

International Journal of Advances in Intelligent Informatics

International Journal of Advances in Intelligent Informatics (IJAIN)

Document-level sentiment analysis of email data

Author: Liu Sisi
Publication date: 01/01/2020

Sisi Liu investigated machine learning methods for email document sentiment analysis. She developed a systematic framework that has been qualitatively and quantitatively proved to be effective and efficient in identifying sentiment from massive amounts of email data. Analytical results obtained from the document-level email sentiment analysis framework are beneficial for better decision making in various business settings.

ResearchOnline@JCU

ResearchOnline at James Cook University

A comparison of homonym meaning frequency estimates derived from movie and television subtitles, free association, and explicit ratings

Author: Armstrong Blair C.
Beekhuizen Barend
Dubrovsky Vladimir
Rice Caitlin A.
Stevenson Suzanne
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2019

First Online: 10 September 2018

Most words are ambiguous, with interpretation dependent on context. Advancing theories of ambiguity resolution is important for any general theory of language processing, and for resolving inconsistencies in observed ambiguity effects across experimental tasks. Focusing on homonyms (words such as bank with unrelated meanings EDGE OF A RIVER vs. FINANCIAL INSTITUTION), the present work advances theories and methods for estimating the relative frequency of their meanings, a factor that shapes observed ambiguity effects. We develop a new method for estimating meaning frequency based on the meaning of a homonym evoked in lines of movie and television subtitles according to human raters. We also replicate and extend a measure of meaning frequency derived from the classification of free associates. We evaluate the internal consistency of these measures, compare them to published estimates based on explicit ratings of each meaning’s frequency, and compare each set of norms in predicting performance in lexical and semantic decision mega-studies. All measures have high internal consistency and show agreement, but each is also associated with unique variance, which may be explained by integrating cognitive theories of memory with the demands of different experimental methodologies. To derive frequency estimates, we collected manual classifications of 533 homonyms over 50,000 lines of subtitles, and of 357 homonyms across over 5000 homonym–associate pairs. This database, publicly available at www.blairarmstrong.net/homonymnorms/, constitutes a novel resource for computational cognitive modeling and computational linguistics, and we offer suggestions around good practices for its use in training and testing models on labeled data
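
As a rough illustration of the idea in this abstract, relative meaning frequency can be estimated by counting how often each meaning of a homonym is chosen across classified lines and normalizing per homonym. This is only a minimal sketch: the function name `meaning_frequencies` and the four labels are invented for illustration and are not drawn from the published database, and the paper's actual rating and aggregation procedure is not described here.

```python
from collections import Counter, defaultdict

def meaning_frequencies(labels):
    """Map each homonym to the relative frequency of its labeled meanings."""
    counts = defaultdict(Counter)
    for homonym, meaning in labels:
        counts[homonym][meaning] += 1
    return {
        homonym: {m: c / sum(per_meaning.values()) for m, c in per_meaning.items()}
        for homonym, per_meaning in counts.items()
    }

# Hypothetical rater classifications of four subtitle lines containing "bank".
labels = [
    ("bank", "FINANCIAL INSTITUTION"),
    ("bank", "FINANCIAL INSTITUTION"),
    ("bank", "EDGE OF A RIVER"),
    ("bank", "FINANCIAL INSTITUTION"),
]
freqs = meaning_frequencies(labels)
```

With these made-up labels, the dominant meaning receives three quarters of the probability mass and the subordinate meaning one quarter.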

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

Archivo Digital para la Docencia y la Investigación

Machine Learning and Alternative Data Analytics for Fashion Finance

Author: Bainiaksinaite Julija
Publication venue: UCL (University College London)
Publication date: 28/03/2020

This dissertation investigates the application of Machine Learning, Natural Language Processing and computational finance to a novel area Fashion Finance. Specifically identifying investment opportunities within the Apparel industry using influential alternative data sources such as Instagram. Fashion investment is challenging due to the ephemeral nature of the industry and the difficulty for investors who lack an understanding of how to analyze trend-driven consumer brands. Unstructured online data (e-commerce stores, social media, online blogs, news, etc.), introduce new opportunities for investment signals extraction. We focus on how trading signals can be generated from the Instagram data and events reported in the news articles. Part of this research work was done in collaboration with Arabesque Asset Management. Farfetch, the online luxury retailer, and Living Bridge Private Equity provided industry advice.

Research Datasets

The datasets used for this research are collected from various sources and include the following types of data:

- Financial data: daily stock prices of 50 U.S. and European Apparel and Footwear equities, daily U.S. Retail Trade and U.S. Consumer Non-Durables sectors indices, Form 10-K reports.
- Instagram data: daily Instagram profile followers for 11 fashion companies.
- News data: 0.5 mln news articles that mention selected 50 equities.

Research Experiments

The thesis consists of the below studies:

1. Relationship between Instagram Popularity and Stock Prices. This study investigates a link between the changes in a company's popularity (daily followers counts) on Instagram and its stock price, revenue movements. We use cross-correlation analysis to find whether the signals derived from the followers' data could help to infer a company's future financial performance. Two hypothetical trading strategies are designed to test if the changes in a company's Instagram popularity could improve the returns. To test the hypotheses, Wilcoxon signed-rank test is used.
2. Dynamic Density-based News Clustering. The aim of this study is twofold: 1) analyse the characteristics of relevant news event articles and how they differ from the noisy/irrelevant news; 2) using the insights, design an unsupervised framework that clusters news articles and identifies events clusters without predefined parameters or expert knowledge. The framework incorporates the density-based clustering algorithm DBSCAN where the clustering parameters are selected dynamically with Gaussian Mixture Model and by maximizing the inter-cluster Information Entropy.
3. ALGA: Automatic Logic Gate Annotator for Event Detection. We design a news classification model for detecting fashion events that are likely to impact a company's stock price. The articles are represented by the following text embeddings: TF-IDF, Doc2Vec and BERT (Transformer Neural Network). The study is comprised of two parts: 1) we design a domain-specific automatic news labelling framework ALGA. The framework incorporates topic extraction (Latent Dirichlet Allocation) and clustering (DBSCAN) algorithms in addition to other filters to annotate the dataset; 2) using the labelled dataset, we train Logistic Regression classifier for identifying financially relevant news. The model shows the state-of-the-art results in the domain-specific financial event detection problem.

Contribution to Science

This research work presents the following contributions to science:

- Introducing original work in Machine Learning and Natural Language Processing application for analysing alternative data on ephemeral fashion assets.
- Introducing the new metrics to measure and track a fashion brand's popularity for investment decision making.
- Design of the dynamic news events clustering framework that finds events clusters of various sizes in the news articles without predefined parameters.
- Present the original Automatic Logic Gate Annotator framework (ALGA) for automatic labelling of news articles for the financial event detection task.
- Design of the Apparel and Footwear news events classifier using the datasets generated by the ALGA's framework and show the state-of-the-art performance in a domain-specific financial event detection task.
- Build the Fashion Finance Dictionary that contains 320 phrases related to various financially-relevant events in the Apparel and Footwear industry

UCL Discovery

Promotional Campaigns in the Era of Social Platforms

Author: Abu-el-rub Noor E
Publication venue: UNM Digital Repository
Publication date: 01/07/2019

The rise of social media has facilitated the diffusion of information to more easily reach millions of users. While some users connect with friends and organically share information and opinions on social media, others have exploited these platforms to gain influence and profit through promotional campaigns and advertising. The existence of promotional campaigns contributes to the spread of misleading information, spam, and fake news. Thus, these campaigns affect the trustworthiness and reliability of social media and render it as a crowd advertising platform. This dissertation studies the existence of promotional campaigns in social media and explores different ways users and bots (i.e. automated accounts) engage in such campaigns. In this dissertation, we design a suite of detection, ranking, and mining techniques. We study user-generated reviews in online e-commerce sites, such as Google Play, to extract campaigns. We identify cooperating sets of bots and classify their interactions in social networks such as Twitter, and rank the bots based on the degree of their malevolence. Our study shows that modern online social interactions are largely modulated by promotional campaigns such as political campaigns, advertisement campaigns, and incentive-driven campaigns. We measure how these campaigns can potentially impact information consumption of millions of social media users

Fourth Conference on Artificial Intelligence for Space Applications

Author: Denton Judith S.
Odell Stephen L.
Vereen Mary

Proceedings of a conference held in Huntsville, Alabama, on November 15-16, 1988. The Fourth Conference on Artificial Intelligence for Space Applications brings together diverse technical and scientific work in order to help those who employ AI methods in space applications to identify common goals and to address issues of general interest in the AI community. Topics include the following: space applications of expert systems in fault diagnostics, in telemetry monitoring and data collection, in design and systems integration; and in planning and scheduling; knowledge representation, capture, verification, and management; robotics and vision; adaptive learning; and automatic programming

NASA Technical Reports Server