6 research outputs found

    PatentSBERTa: A Deep NLP based Hybrid Model for Patent Distance and Classification using Augmented SBERT

    This study provides an efficient approach for using text data to calculate patent-to-patent (p2p) technological similarity, and presents a hybrid framework for leveraging the resulting p2p similarity in applications such as semantic search and automated patent classification. We create embeddings from patent claims using Sentence-BERT (SBERT). We leverage SBERT's efficiency in creating embedding distance measures to map p2p similarity in large sets of patent data. We deploy our framework for classification with a simple K-Nearest Neighbors (KNN) model that predicts the Cooperative Patent Classification (CPC) of a patent based on the class assignments of the K patents with the highest p2p similarity. We thereby validate that the p2p similarity captures the patents' technological features in terms of CPC overlap, and at the same time demonstrate the usefulness of this approach for automatic patent classification based on text data. Furthermore, the presented classification framework is simple, and its results are easy for end-users to interpret and evaluate. In the out-of-sample model validation, we perform a multi-label prediction of all assigned CPC classes at the subclass level (663 classes) on 1,492,294 patents with an accuracy of 54% and an F1 score > 66%, which suggests that our model outperforms the current state of the art in text-based multi-label and multi-class patent classification. We furthermore discuss the applicability of the presented framework to semantic IP search, patent landscaping, and technology intelligence. Finally, we point towards a future research agenda for leveraging multi-source patent embeddings, assessing their appropriateness across applications, and improving and validating patent embeddings by creating domain-expert curated Semantic Textual Similarity (STS) benchmark datasets. Comment: 18 pages, 7 figures, and 4 tables.
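The nearest-neighbour classification step described in this abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the three-dimensional vectors stand in for SBERT claim embeddings, the CPC labels are illustrative, and the function name `knn_predict_labels` and the `min_share` voting threshold are hypothetical choices (the abstract only says that the prediction is based on the class assignments of the K most similar patents).

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def knn_predict_labels(query, train_vecs, train_labels, k=3, min_share=0.5):
    """Predict a label set from the k most similar training items.

    min_share is an assumed voting rule: a label is kept when at least
    that fraction of the k neighbours carries it.
    """
    # Rank training items by similarity to the query, most similar first.
    ranked = sorted(
        range(len(train_vecs)),
        key=lambda i: cosine(query, train_vecs[i]),
        reverse=True,
    )
    nearest = ranked[:k]
    # Count how many of the k neighbours carry each label.
    votes = {}
    for idx in nearest:
        for label in train_labels[idx]:
            votes[label] = votes.get(label, 0) + 1
    return {label for label, n in votes.items() if n / k >= min_share}


# Toy vectors standing in for SBERT embeddings of patent claims (assumption).
train_vecs = [
    [1.0, 0.1, 0.0],
    [0.9, 0.2, 0.1],
    [0.0, 1.0, 0.1],
    [0.1, 0.9, 0.0],
]
# Illustrative multi-label CPC subclass assignments (assumption).
train_labels = [{"H04L"}, {"H04L", "G06F"}, {"A61K"}, {"A61K", "C07D"}]

query = [0.95, 0.15, 0.05]
predicted = knn_predict_labels(query, train_vecs, train_labels, k=2)
strict = knn_predict_labels(query, train_vecs, train_labels, k=2, min_share=1.0)
print(sorted(predicted), sorted(strict))
```

With `k=2`, the two nearest neighbours are the first two training vectors, so the looser threshold keeps both of their labels while `min_share=1.0` keeps only the label they share. In practice the toy vectors would be replaced by real sentence embeddings and the brute-force ranking by an approximate nearest-neighbour index.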

    A New Model to Identify the Reliability and Trust of Internet Banking Users Using Fuzzy Theory and Data-Mining

    As banking shifts from traditional to virtual systems, security in data exchange has become more important, so a pattern based on smart models is needed to reduce fraud in this field. This article presents a new algorithm that improves security and specifies the limits on offering special services to Internet banking users, in order to lay appropriate groundwork for virtual banking. In addition to identifying behavioral models of customers, the algorithm compares each customer's behavior with the model and computes the rate of trust in that customer's behavior. The algorithm adopts a hybrid data-mining and knowledge-based structure built on fuzzy systems. In this research, qualitative data were gathered from interviews with banking experts and analyzed with Expert Choice to identify the most important variables for customer behavior analysis; customer behavior and one year of customers' Internet banking transaction data were then analyzed with MATLAB and Clementine. The results indicate that the proposed structure's ability to recognize the rate of trust in an Internet banking user's behavior may be at a reasonable level for experts in this area.

    The Banking Industry Foresight Using the Scenario Planning Approach and the Cross-Effects Matrix

    This study identifies the key indicators that most influence the banking industry and attempts to forecast the future of the Iranian banking industry, using scenario planning and a cross-impact matrix. From all the identified factors, 29 key factors influencing the future of the industry were selected through a fuzzy analytic hierarchy process, and the impact of each of these factors was then determined through analysis of the cross-impact matrix. The cross-impact balance was then used to write scenarios. Of all the combined scenarios, the most likely strong scenarios were clustered into five general categories using K-modes clustering. Finally, four scenarios were identified: optimism for the bank, banking industry development, inflationary conditions, and sanctions. It was therefore possible to define action plans for each of the scenarios.
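One common way to read a cross-impact matrix is to sum each row to get a factor's direct influence on the others and each column to get its dependence on them. The sketch below illustrates that reading only. It is not the procedure used in the study, and the factor names, the scores, and the function name `influence_and_dependence` are all hypothetical.

```python
# Hypothetical factors and impact scores; impact[i][j] is the assumed
# strength with which factor i affects factor j (0 = none, 3 = strong).
factors = ["sanctions", "inflation", "digital banking"]
impact = [
    [0, 3, 1],
    [1, 0, 2],
    [0, 1, 0],
]


def influence_and_dependence(matrix):
    """Return (row sums, column sums), ignoring the diagonal."""
    n = len(matrix)
    # Row sum: how strongly factor i drives the other factors.
    influence = [sum(matrix[i][j] for j in range(n) if j != i) for i in range(n)]
    # Column sum: how strongly factor j is driven by the other factors.
    dependence = [sum(matrix[i][j] for i in range(n) if i != j) for j in range(n)]
    return influence, dependence


influence, dependence = influence_and_dependence(impact)

# Rank factors from most to least influential.
ranking = sorted(zip(factors, influence), key=lambda pair: pair[1], reverse=True)
for name, score in ranking:
    print(name, score)
```

Factors with high influence and low dependence are usually treated as drivers when scenarios are written, whereas factors with high dependence behave more like outcomes.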