4 research outputs found

    Auto-clustering of Financial Reports Based on Formatting Style and Author’s Fingerprint

    Peer reviewed. We present a new clustering algorithm for financial reports that is based on the reports’ formatting and style. The algorithm uses layout and content information to automatically generate as many clusters as needed. This allows us to reduce the effort of labeling the reports in order to train text-based machine learning models for extracting person or company names, addresses, financial categories, etc. In addition, the algorithm produces a set of sub-clusters inside each cluster, where each sub-cluster corresponds to a set of reports made by the same author (person or firm). The information about sub-clusters allows us to evaluate the change in the author over time. We have applied the algorithm to a dataset of over 38,000 financial reports (the last Annual Account presented by each company) from the Luxembourg Business Registers (LBR) and found 2,165 clusters containing between 2 and 850 documents, with a median of 4 and an average of 14. When adding 2,500 new documents to the existing cluster set (previous annual accounts presented by companies), we found that 67.3% of the financial reports were placed in the correct cluster and sub-cluster. Of the remaining documents, 65% were placed in a different sub-cluster because the company changed the formatting style, which is expected and correct behavior. Finally, by labeling 11% of the entire dataset, we can replicate these labels to up to 72% of the dataset, keeping a high feature coverage.
    U-AGR-7012 - BRIDGES2020/IS/15403349/SCRiPT_Yoba Cont (01/04/2021 - 31/03/2024) - BRORSSON Mats Hakan
    9. Industry, innovation and infrastructur
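
    As a rough illustration of the label-replication idea in the abstract above (label a few documents, then copy those labels to the other members of the same cluster), the following is a minimal sketch. It is not the authors' implementation: the function name `propagate_labels`, the document ids, the cluster ids, and the label values are all hypothetical, and the majority-vote rule is an assumption made for the example.

```python
from collections import Counter


def propagate_labels(cluster_of, seed_labels):
    """Copy each cluster's majority seed label to its unlabeled documents.

    cluster_of:  dict mapping document id -> cluster id (hypothetical input)
    seed_labels: dict mapping document id -> manually assigned label
    """
    # Count the seed labels observed in each cluster.
    votes = {}
    for doc, label in seed_labels.items():
        votes.setdefault(cluster_of[doc], Counter())[label] += 1

    # Pick the most frequent seed label per cluster.
    majority = {c: counts.most_common(1)[0][0] for c, counts in votes.items()}

    # Keep manual labels; fill in the rest where the cluster has a seed.
    labels = dict(seed_labels)
    for doc, cluster in cluster_of.items():
        if doc not in labels and cluster in majority:
            labels[doc] = majority[cluster]
    return labels


# Toy data: six reports in three clusters, two of them labeled by hand.
cluster_of = {"r1": 0, "r2": 0, "r3": 0, "r4": 1, "r5": 1, "r6": 2}
seed_labels = {"r1": "template_A", "r4": "template_B"}

labels = propagate_labels(cluster_of, seed_labels)
coverage = len(labels) / len(cluster_of)
print(labels)
print(f"labeled fraction: {coverage:.2f}")
```

    In this toy run, report `r6` stays unlabeled because its cluster contains no hand-labeled document, which is why replicated coverage can remain below 100%.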

    Effective Automatic Feature Engineering on Financial Statements for Bankruptcy Prediction

    Peer reviewed. Feature engineering on financial records for bankruptcy prediction has traditionally relied heavily on domain knowledge and typically results in a range of financial ratios with limited complexity and feature utilization due to manual design. It is often a time-consuming and error-prone procedure, confined to the domain experts’ experience, that does not take into account the characteristics of different data sets. In this paper, we propose an automated feature engineering approach to generate effective, explainable, and extensible model training features. The experiments were conducted using a publicly available record of financial statements submitted to the Luxembourg Business Registers. This approach aims to improve bankruptcy prediction for professionals who may not possess the necessary engineering expertise or efficient data. The experimental results suggest that the proposed approach can provide valuable features for model training and that, in most cases, the resulting models outperform those built with traditional approaches and other well-known approaches.

    Towards an autonomous vision-based unmanned aerial system against wildlife poachers.

    Poaching is an illegal activity that remains out of control in many countries. Based on the 2014 report of the United Nations and Interpol, the illegal global trade in wildlife and natural resources amounts to nearly $213 billion every year and is even helping to fund armed conflicts. Poaching activities around the world are pushing many animal species further to the brink of extinction. Unfortunately, the traditional methods of fighting poachers are not enough, hence the demand for more efficient approaches. In this context, the use of new technologies in sensors and algorithms, as well as aerial platforms, is crucial to counter the sharp increase in poaching activities in the last few years. Our work focuses on the use of vision sensors on UAVs for the detection and tracking of animals and poachers, as well as the use of such sensors to control quadrotors during autonomous vehicle following and autonomous landing.