4 research outputs found

InProC: Industry and Product/Service Code Classification

Authors: Kaur Simerjot, Shah Sameena, Stefanucci Andrea
Publication date: 22/05/2023

Determining industry and product/service codes for a company is an important real-world task and is typically very expensive, as it involves manual curation of data about the companies. Building an AI agent that can predict these codes automatically can significantly reduce costs and eliminate human biases and errors. However, the unavailability of labeled datasets, together with the need for high-precision results within the financial domain, makes this a challenging problem. In this work, we propose a hierarchical multi-class industry code classifier with a targeted multi-label product/service code classifier, leveraging advances in unsupervised representation learning techniques. We demonstrate how a high-quality industry and product/service code classification system can be built using an extremely limited labeled dataset. We evaluate our approach on a dataset of more than 20,000 companies and achieve a classification accuracy of more than 92%. Additionally, we compared our approach with a dataset of 350 manually labeled product/service codes provided by Subject Matter Experts (SMEs) and obtained an accuracy of more than 96%, resulting in real-life adoption within the financial domain.

arXiv.org e-Print Archive

Company classification using zero-shot learning

Authors: Jankov Andrej, Miskovski Igor, Pinsky Eugene, Rizinski Maryan, Sankaradas Vignesh, Trajanov Dimitar
Publication date: 01/05/2023

In recent years, natural language processing (NLP) has become increasingly important in a variety of business applications, including sentiment analysis, text classification, and named entity recognition. In this paper, we propose an approach for company classification using NLP and zero-shot learning. Our method utilizes pre-trained transformer models to extract features from company descriptions, and then applies zero-shot learning to classify companies into relevant categories without the need for specific training data for each category. We evaluate our approach on publicly available datasets of textual descriptions of companies, and demonstrate that it can streamline the process of company classification, thereby reducing the time and resources required in traditional approaches such as the Global Industry Classification Standard (GICS). The results show that this method has potential for automation of company classification, making it a promising avenue for future research in this area.

Comment: 6 pages, 1 figure, 4 tables, conference paper, to be published in the 20th International Conference on Informatics and Information Technologies (CIIT 2023)

arXiv.org e-Print Archive
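
The zero-shot idea in the abstract above (embed a company description, embed each category, pick the closest category without any per-category training data) can be illustrated with a minimal sketch. This is not the authors' implementation: the bag-of-words `embed` function is a toy stand-in for a pre-trained transformer encoder, and the label descriptions in `LABELS` are hypothetical examples keyed by three GICS sector names.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy stand-in for a transformer encoder: a bag-of-words count vector."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def zero_shot_classify(description, label_descriptions):
    """Score every label by similarity to the description; no training step."""
    doc = embed(description)
    scores = {
        label: cosine(doc, embed(text))
        for label, text in label_descriptions.items()
    }
    best = max(scores, key=scores.get)
    return best, scores


# Hypothetical label descriptions (assumption, not taken from the paper).
LABELS = {
    "Energy": "oil gas drilling energy exploration refining",
    "Financials": "bank loans insurance deposits financial services",
    "Information Technology": "software cloud semiconductors computing services",
}

label, scores = zero_shot_classify(
    "A regional bank offering loans and deposits to small businesses.", LABELS
)
print(label, scores)
```

In a realistic setting the `embed` function would be replaced by sentence embeddings from a pre-trained model; the scoring and arg-max step would stay the same.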

Knitting Machinery Spare Classification using Deep Learning with Differential Privacy

Authors: Akin Erhan, Kasap Songul, Tastimur Canan
Publication venue: Journal of Scientific and Industrial Research (JSIR)
Publication date: 26/09/2021

Given their widespread use, knitting machines must be maintained regularly. When the spare parts that make up these machines break down or become unusable, they must be replaced with new ones. However, the code/name information of the spare parts is not available to the end user, and can only be accessed with high-cost catalog procurement. Manufacturing companies keep the code/name information of such machine parts confidential. When the literature is examined, there are no studies in which spare parts are classified with machine learning–based algorithms. In line with this, this study focuses on the classification of spare parts using machine learning–based algorithms. The deep learning–based Convolutional Neural Network (CNN) architecture developed in this study can classify highly similar spare parts. In addition, since the code/name information received from the manufacturer and the spare part sample images require confidentiality, the CNN architecture has been developed in combination with the Differential Privacy (DP) method to present the DP-CNN method. As a result of the application of the Differential Privacy method, there has been no great loss of accuracy. This is an important development for our study. In the article, many optimizer algorithms are tested on the proposed method and comparative results are given. A 99.41% accuracy ratio has been obtained with the DP-RMSProp optimization method, which produces the best results. Experimental results of our study are presented in detail.

Online Publishing @ NISCAIR
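
The abstract above does not spell out how Differential Privacy is combined with the optimizer. A common recipe for differentially private training, assumed here purely for illustration, is to clip each per-example gradient to a fixed L2 norm and add Gaussian noise before averaging. The sketch below shows only that aggregation step; the function names, `max_norm`, and `noise_multiplier` are hypothetical, and the resulting gradient could then be passed to any update rule (plain SGD, RMSProp, and so on).

```python
import math
import random


def clip(grad, max_norm):
    """Scale a gradient vector down so its L2 norm is at most max_norm."""
    norm = math.sqrt(sum(g * g for g in grad))
    scale = min(1.0, max_norm / norm) if norm > 0 else 1.0
    return [g * scale for g in grad]


def private_mean_gradient(per_example_grads, max_norm, noise_multiplier, rng):
    """Clip each example's gradient, sum, add Gaussian noise, then average."""
    clipped = [clip(g, max_norm) for g in per_example_grads]
    dim = len(clipped[0])
    summed = [sum(g[i] for g in clipped) for i in range(dim)]
    sigma = noise_multiplier * max_norm  # noise scale tied to the clipping bound
    noisy = [s + rng.gauss(0.0, sigma) for s in summed]
    return [v / len(clipped) for v in noisy]


rng = random.Random(0)  # seeded only to make the sketch reproducible
grads = [[3.0, 4.0], [0.3, 0.4], [-6.0, 8.0]]  # made-up per-example gradients
update = private_mean_gradient(grads, max_norm=1.0, noise_multiplier=1.1, rng=rng)
print(update)
```

Clipping bounds how much any single training image can influence the update, and the noise masks what remains; the privacy guarantee actually achieved depends on the noise level, batch sampling, and number of steps, none of which are modeled here.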

Knitting Machinery Spare Classification using Deep Learning with Differential Privacy

Authors: Akin Erhan, Kasap Songul, Tastimur Canan
Publication venue: NIScPR-CSIR, India
Publication date: 01/07/2021

Pages 570-581

Given their widespread use, knitting machines must be maintained regularly. When the spare parts that make up these machines break down or become unusable, they must be replaced with new ones. However, the code/name information of the spare parts is not available to the end user, and can only be accessed with high-cost catalog procurement. Manufacturing companies keep the code/name information of such machine parts confidential. When the literature is examined, there are no studies in which spare parts are classified with machine learning–based algorithms. In line with this, this study focuses on the classification of spare parts using machine learning–based algorithms. The deep learning–based Convolutional Neural Network (CNN) architecture developed in this study can classify highly similar spare parts. In addition, since the code/name information received from the manufacturer and the spare part sample images require confidentiality, the CNN architecture has been developed in combination with the Differential Privacy (DP) method to present the DP-CNN method. As a result of the application of the Differential Privacy method, there has been no great loss of accuracy. This is an important development for our study. In the article, many optimizer algorithms are tested on the proposed method and comparative results are given. A 99.41% accuracy ratio has been obtained with the DP-RMSProp optimization method, which produces the best results. Experimental results of our study are presented in detail.

NOPR