CLOUD-BASED MACHINE LEARNING AND SENTIMENT ANALYSIS
The role of a Data Scientist is becoming increasingly ubiquitous as companies and institutions see the need to gain additional insights from data to make better decisions and improve the quality of service delivered to customers. This thesis contains three data science projects aimed at improving the tools and techniques used in analyzing and evaluating data. The first research study applied cloud-based automated machine learning algorithms to a standard cybersecurity dataset to detect vulnerabilities in the network traffic data. The performance of the algorithms was measured and compared using standard evaluation metrics. The second research study involved text mining of social media, specifically Reddit. We mined up to 100,000 comments in multiple subreddits and tested for hate speech via a custom-designed version of the Python Vader sentiment analysis package. Our work integrated standard sentiment analysis with Hatebase.org, and we demonstrate that our new method can better detect hate speech in social media. Following sentiment analysis and hate speech detection, in the third research project we applied statistical techniques to evaluate whether sentiment categories differ significantly between lexicon-based software and cloud-based tools. We compared the three big cloud providers, AWS, Azure, and GCP, with the standard Python Vader sentiment analysis library. Statistical analysis showed a significant difference among the cloud platforms and Vader, and demonstrated that each platform is unique in its scoring mechanism.
A Survey of GPT-3 Family Large Language Models Including ChatGPT and GPT-4
Large language models (LLMs) are a special class of pretrained language
models obtained by scaling model size, pretraining corpus and computation.
LLMs, because of their large size and pretraining on large volumes of text
data, exhibit special abilities which allow them to achieve remarkable
performances without any task-specific training in many of the natural language
processing tasks. The era of LLMs started with the OpenAI GPT-3 model, and the
popularity of LLMs is increasing exponentially after the introduction of models
like ChatGPT and GPT4. We refer to GPT-3 and its successor OpenAI models,
including ChatGPT and GPT4, as GPT-3 family large language models (GLLMs). With
the ever-rising popularity of GLLMs, especially in the research community,
there is a strong need for a comprehensive survey which summarizes the recent
research progress in multiple dimensions and can guide the research community
with insightful future research directions. We start the survey paper with
foundation concepts like transformers, transfer learning, self-supervised
learning, pretrained language models and large language models. We then present
a brief overview of GLLMs and discuss the performances of GLLMs in various
downstream tasks, specific domains and multiple languages. We also discuss the
data labelling and data augmentation abilities of GLLMs, the robustness of
GLLMs, the effectiveness of GLLMs as evaluators, and finally, conclude with
multiple insightful future research directions. To summarize, this
comprehensive survey paper will serve as a good resource for both academic and
industry people to stay updated with the latest research related to GPT-3
family large language models. Comment: Preprint under review, 58 pages
A survey on sentiment analysis in Urdu: A resource-poor language
© 2020 Background/introduction: The dawn of the internet opened the doors to the easy and widespread sharing of information on subject matters such as products, services, events and political opinions. While the volume of studies conducted on sentiment analysis is rapidly expanding, these studies mostly address the English language. The primary goal of this study is to present a state-of-the-art survey that identifies the progress and shortcomings of Urdu sentiment analysis and proposes rectifications. Methods: We describe the advancements made thus far in this area by categorising the studies along three dimensions, namely text pre-processing, lexical resources and sentiment classification. The pre-processing operations include word segmentation, text cleaning, spell checking and part-of-speech tagging. An evaluation of sophisticated lexical resources, including corpora and lexicons, was carried out, and investigations were conducted on sentiment analysis constructs such as opinion words, modifiers and negations. Results and conclusions: Performance is reported for each reviewed study. The experimental results and proposals put forward in this paper provide the groundwork for further studies on Urdu sentiment analysis.
Transforming Sentiment Analysis in the Financial Domain with ChatGPT
Financial sentiment analysis plays a crucial role in decoding market trends
and guiding strategic trading decisions. Despite the deployment of advanced
deep learning techniques and language models to refine sentiment analysis in
finance, this study breaks new ground by investigating the potential of large
language models, particularly ChatGPT 3.5, in financial sentiment analysis,
with a strong emphasis on the foreign exchange market (forex). Employing a
zero-shot prompting approach, we examine multiple ChatGPT prompts on a
meticulously curated dataset of forex-related news headlines, measuring
performance using metrics such as precision, recall, f1-score, and Mean
Absolute Error (MAE) of the sentiment class. Additionally, we probe the
correlation between predicted sentiment and market returns as an additional
evaluation approach. ChatGPT, compared to FinBERT, a well-established sentiment
analysis model for financial texts, exhibited approximately 35% enhanced
performance in sentiment classification and a 36% higher correlation with
market returns. By underlining the significance of prompt engineering,
particularly in zero-shot contexts, this study spotlights ChatGPT's potential
to substantially boost sentiment analysis in financial applications. By sharing
the utilized dataset, our intention is to stimulate further research and
advancements in the field of financial services. Comment: 10 pages, 8 figures, Preprint submitted to Machine Learning with
Application
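The evaluation metrics named in this abstract (precision, recall, F1-score, MAE of the sentiment class, and correlation with market returns) can be illustrated with a minimal Python sketch. It is not the authors' evaluation code. The label encoding (-1 negative, 0 neutral, 1 positive), the function names, and the toy data are all assumptions made for illustration, and Pearson correlation is assumed because the abstract does not name the correlation measure.

```python
from math import sqrt

def precision_recall_f1(y_true, y_pred, label):
    """Precision, recall, and F1 for one sentiment class (one-vs-rest)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == label and p == label)
    fp = sum(1 for t, p in pairs if t != label and p == label)
    fn = sum(1 for t, p in pairs if t == label and p != label)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1

def mean_absolute_error(y_true, y_pred):
    """MAE over ordinal class labels; a negative/positive confusion costs 2."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

def pearson(xs, ys):
    """Pearson correlation, e.g. between predicted sentiment and returns."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    sd_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (sd_x * sd_y) if sd_x and sd_y else 0.0

# Toy data (hypothetical): gold labels, model predictions, same-period returns.
y_true = [1, 0, -1, 1, -1, 0]
y_pred = [1, 0, 0, 1, -1, -1]
returns = [0.4, 0.0, -0.1, 0.3, -0.5, -0.2]

for label in (-1, 0, 1):
    print(label, precision_recall_f1(y_true, y_pred, label))
print("MAE:", mean_absolute_error(y_true, y_pred))
print("corr:", pearson(y_pred, returns))
```

Treating the classes as ordinal is what makes MAE meaningful here: it penalizes predicting negative for a positive headline more than predicting neutral, which plain accuracy does not.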
Analyzing the Application of Minimalism in Product Appearance Design using Associative Data Mining Optimized Feature Selection and Deep Learning of Bang&Olufsen Products
The application of minimalism in product appearance design has gained significant attention in recent years due to its focus on simplicity, functionality, and aesthetic appeal. This paper explores the use of an Associative Data Mining Optimized Feature Selection (ADM-OFS) classifier with deep learning techniques to analyze the application of minimalism in product appearance design, using Bang&Olufsen products as a case study. The proposed ADM-OFS performs feature selection using an associative data mining approach, which estimates the most relevant and influential features that contribute to minimalistic design. The optimized feature selection process enhances the accuracy and efficiency of the analysis by reducing the dimensionality of the dataset while retaining its essential characteristics. The ADM-OFS model employs deep learning techniques to capture intricate patterns and relationships between minimalism and product appearance design. The deep learning model is trained on the dataset, enabling it to recognize complex visual features and make predictions about the minimalistic qualities of new product designs. The findings of ADM-OFS provide valuable insights into the application of minimalism in product appearance design, specifically in the context of Bang&Olufsen products. The analysis demonstrated the use of the ADM-OFS classifier with deep learning in analyzing and interpreting the application of minimalism in product appearance design. The findings of ADM-OFS are intended to support designers, manufacturers, and researchers in their pursuit of creating visually appealing and functionally efficient products that embody the principles of minimalism.
An empirical study on the various stock market prediction methods
Investing in the stock market is one of the most admired forms of investment. However, predicting the stock market has remained a hard task because of the non-linearity it exhibits. The non-linearity is due to multiple influencing factors such as the global economy, political situations, sector performance, economic numbers, foreign institutional investment, domestic institutional investment, and so on. A proper set of such representative factors must be analyzed to build an efficient prediction model. Even a marginal improvement in prediction accuracy can be gainful for investors. This review provides a detailed analysis of research papers presenting stock market prediction techniques. These techniques are assessed in the time series analysis and sentiment analysis sections. A detailed discussion of research gaps and issues is presented. The reviewed articles are analyzed based on their use of prediction techniques, optimization algorithms, feature selection methods, datasets, toolsets, evaluation metrics, and input parameters. The techniques are further investigated to analyze the relations of prediction methods with feature selection methods, datasets, and input parameters. In addition, major problems raised by the present techniques are also discussed. This survey will provide researchers with deeper insight into various aspects of current stock market prediction methods.