5,854 research outputs found

    A Hybrid Technological Innovation Text Mining, Ensemble Learning and Risk Scorecard Approach for Enterprise Credit Risk Assessment

    Get PDF
    Enterprise credit risk assessment models typically use financial-based information as a predictor variable, relying on backward-looking historical information rather than forward-looking information for risk assessment. We propose a novel hybrid assessment of credit risk that uses technological innovation information as a predictor variable. Text mining techniques are used to extract this information for each enterprise. A combination of random forest and extreme gradient boosting are used for indicator screening, and finally, risk scorecard based on logistic regression is used for credit risk scoring. Our results show that technological innovation indicators obtained through text mining provide valuable information for credit risk assessment, and that the combination of ensemble learning from random forest and extreme gradient boosting combinations with logistic regression models outperforms other traditional methods. The best results achieved 0.9129 area under receiver operating characteristic. In addition, our approach provides meaningful scoring rules for credit risk assessment of technology innovation enterprises

    Patent Keyword Extraction Algorithm Based on Distributed Representation for Patent Classification

    Get PDF
    Many text mining tasks such as text retrieval, text summarization, and text comparisons depend on the extraction of representative keywords from the main text. Most existing keyword extraction algorithms are based on discrete bag-of-words type of word representation of the text. In this paper, we propose a patent keyword extraction algorithm (PKEA) based on the distributed Skip-gram model for patent classification. We also develop a set of quantitative performance measures for keyword extraction evaluation based on information gain and cross-validation, based on Support Vector Machine (SVM) classification, which are valuable when human-annotated keywords are not available. We used a standard benchmark dataset and a homemade patent dataset to evaluate the performance of PKEA. Our patent dataset includes 2500 patents from five distinct technological fields related to autonomous cars (GPS systems, lidar systems, object recognition systems, radar systems, and vehicle control systems). We compared our method with Frequency, Term Frequency-Inverse Document Frequency (TF-IDF), TextRank and Rapid Automatic Keyword Extraction (RAKE). The experimental results show that our proposed algorithm provides a promising way to extract keywords from patent texts for patent classification

    Prediction of Emerging Technologies Based on Analysis of the U.S. Patent Citation Network

    Full text link
    The network of patents connected by citations is an evolving graph, which provides a representation of the innovation process. A patent citing another implies that the cited patent reflects a piece of previously existing knowledge that the citing patent builds upon. A methodology presented here (i) identifies actual clusters of patents: i.e. technological branches, and (ii) gives predictions about the temporal changes of the structure of the clusters. A predictor, called the {citation vector}, is defined for characterizing technological development to show how a patent cited by other patents belongs to various industrial fields. The clustering technique adopted is able to detect the new emerging recombinations, and predicts emerging new technology clusters. The predictive ability of our new method is illustrated on the example of USPTO subcategory 11, Agriculture, Food, Textiles. A cluster of patents is determined based on citation data up to 1991, which shows significant overlap of the class 442 formed at the beginning of 1997. These new tools of predictive analytics could support policy decision making processes in science and technology, and help formulate recommendations for action

    Phonetic Searching

    Get PDF
    An improved method and apparatus is disclosed which uses probabilistic techniques to map an input search string with a prestored audio file, and recognize certain portions of a search string phonetically. An improved interface is disclosed which permits users to input search strings, linguistics, phonetics, or a combination of both, and also allows logic functions to be specified by indicating how far separated specific phonemes are in time.Georgia Tech Research Corporatio

    A Novel Patent Similarity Measurement Methodology: Semantic Distance and Technological Distance

    Full text link
    Measuring similarity between patents is an essential step to ensure novelty of innovation. However, a large number of methods of measuring the similarity between patents still rely on manual classification of patents by experts. Another body of research has proposed automated methods; nevertheless, most of it solely focuses on the semantic similarity of patents. In order to tackle these limitations, we propose a hybrid method for automatically measuring the similarity between patents, considering both semantic and technological similarities. We measure the semantic similarity based on patent texts using BERT, calculate the technological similarity with IPC codes using Jaccard similarity, and perform hybridization by assigning weights to the two similarity methods. Our evaluation result demonstrates that the proposed method outperforms the baseline that considers the semantic similarity only

    The direction of technical change in AI and the trajectory effects of government funding

    Get PDF
    Government funding of innovation can have a significant impact not only on the rate of technical change, but also on its direction. In this paper, we examine the role that government grants and government departments played in the development of artificial intelligence (AI), an emergent general purpose technology with the potential to revolutionize many aspects of the economy and society. We analyze all AI patents filed at the US Patent and Trademark Office and develop network measures that capture each patent’s influence on all possible sequences of follow-on innovation. By identifying the effect of patents on technological trajectories, we are able to account for the long-term cumulative impact of new knowledge that is not captured by standard patent citation measures. We show that patents funded by government grants, but above all patents filed by federal agencies and state departments, profoundly influenced the development of AI. These long-term effects were especially significant in early phases, and weakened over time as private incentives took over. These results are robust to alternative specifications and controlling for endogeneity

    AI Patents: A Data Driven Approach

    Get PDF
    While artificial intelligence (AI) research brings challenges, the resulting systems are no accident. In fact, academics, researchers, and industry professionals have been developing AI systems since the early 1900s. AI is a field uniquely positioned at the intersection of several scientific disciplines including computer science, applied mathematics, and neuroscience. The AI design process is meticulous, deliberate, and time-consuming – involving intensive mathematical theory, data processing, and computer programming. All the while, AI’s economic value is accelerating. As such, protecting the intellectual property (IP) springing from this work is a keystone for technology firms acting in competitive markets

    AI-assisted patent prior art searching - feasibility study

    Get PDF
    This study seeks to understand the feasibility, technical complexities and effectiveness of using artificial intelligence (AI) solutions to improve operational processes of registering IP rights. The Intellectual Property Office commissioned Cardiff University to undertake this research. The research was funded through the BEIS Regulators’ Pioneer Fund (RPF). The RPF fund was set up to help address barriers to innovation in the UK economy
    • …
    corecore