
    Application of an Artificial Neural Network as a Third-Party Database Auditing System

    Data auditing is a fundamental challenge for organizations that deal with large databases. Databases are frequently targeted by attacks that grow in quantity and sophistication every day, and one-third of these attacks come from users inside the organization. Database auditing plays a vital role in protecting against these attacks. Native features in database auditing systems monitor and capture activities and incidents that occur within a database and notify the database administrator. However, the administration cost and performance overhead of such software must be considered. Instead of relying on native auditing tools, a better path to a more secure database is to utilize third-party products. The primary goal of this thesis is to apply an efficient and optimized deep learning approach to detect suspicious behaviors within a database by calculating the amount of risk that each user poses to the system. This is accomplished by using an Artificial Neural Network (ANN) as an enhanced feature of the analyzer component of a database auditing system; the ANN works as a third-party product for the database auditing system. The model has been validated to ensure low bias and low variance. Moreover, a parameter tuning technique has been utilized to find the parameters that yield the highest accuracy for the model.
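
    A minimal illustrative sketch (not the thesis implementation): a small feed-forward neural network that scores per-user risk from audit-log features, with grid-search parameter tuning in the spirit described above. The feature layout, labels, and hyperparameter grid are assumptions.

```python
# Sketch only: MLP risk scorer for database audit records with parameter tuning.
# Feature columns and labels below are hypothetical placeholders.
import numpy as np
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# X: one row per audited user action (e.g. query counts, off-hours access,
# privilege level); y: 1 = suspicious, 0 = benign (placeholder data).
X = np.random.rand(1000, 8)
y = np.random.randint(0, 2, 1000)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)

pipeline = Pipeline([
    ("scale", StandardScaler()),
    ("ann", MLPClassifier(max_iter=500, random_state=0)),
])

# Parameter tuning: search layer sizes and regularisation for the best accuracy.
param_grid = {
    "ann__hidden_layer_sizes": [(16,), (32, 16)],
    "ann__alpha": [1e-4, 1e-3],
}
search = GridSearchCV(pipeline, param_grid, cv=5, scoring="accuracy")
search.fit(X_train, y_train)

# Risk score per user action = predicted probability of the "suspicious" class.
risk_scores = search.predict_proba(X_test)[:, 1]
print("held-out accuracy:", search.score(X_test, y_test))
```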

    Internet Financial Credit Risk Assessment with Sliding Window and Attention Mechanism LSTM Model

    With the accelerated pace of market-oriented reform, Internet finance has gained a broad and healthy development environment. Existing studies lack consideration of time trends in financial risk, and treating all features equally may lead to inaccurate predictions. To address these problems, we propose an LSTM model based on a sliding window and an attention mechanism. The sliding window enables the model to effectively exploit the contextual relevance of loan data, and the attention mechanism enables the model to focus on important information. Results on the Lending Club public desensitized dataset show that our model outperforms ARIMA, SVM, ANN, LSTM, and GRU models.
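
    A minimal illustrative sketch (not the paper's code) of the described approach: sliding windows are cut from chronologically ordered loan records, and an LSTM with a simple additive attention layer over the time steps produces a default-risk probability. Window length, feature count, layer sizes, and the placeholder data are assumptions.

```python
# Sketch only: sliding-window LSTM with additive attention over time steps.
import numpy as np
import tensorflow as tf
from tensorflow.keras import Model, layers

def make_windows(features, labels, window):
    """Cut fixed-length windows from chronologically ordered loan records."""
    X, y = [], []
    for i in range(len(features) - window):
        X.append(features[i:i + window])
        y.append(labels[i + window])      # label following each window
    return np.array(X), np.array(y)

n_features, window = 20, 12               # assumed dimensions

inputs = layers.Input(shape=(window, n_features))
h = layers.LSTM(64, return_sequences=True)(inputs)   # per-time-step hidden states

# Attention: score each time step, softmax over time, take the weighted sum.
scores = layers.Dense(1)(h)                           # (batch, window, 1)
weights = layers.Softmax(axis=1)(scores)              # attention weights over time
context = layers.Lambda(lambda t: tf.reduce_sum(t[0] * t[1], axis=1))([h, weights])

outputs = layers.Dense(1, activation="sigmoid")(context)   # risk probability
model = Model(inputs, outputs)
model.compile(optimizer="adam", loss="binary_crossentropy",
              metrics=[tf.keras.metrics.AUC()])

# Placeholder data standing in for real loan features and default labels.
features = np.random.rand(500, n_features)
labels = np.random.randint(0, 2, 500)
X, y = make_windows(features, labels, window)
model.fit(X, y, epochs=2, batch_size=32, verbose=0)
```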

    A transformer-based model for default prediction in mid-cap corporate markets

    In this paper, we study mid-cap companies, i.e. publicly traded companies with less than US $10 billion in market capitalisation. Using a large dataset of US mid-cap companies observed over 30 years, we look to predict the default probability term structure over the medium term and understand which data sources (i.e. fundamental, market or pricing data) contribute most to the default risk. Whereas existing methods typically require that data from different time periods are first aggregated and turned into cross-sectional features, we frame the problem as a multi-label time-series classification problem. We adapt transformer models, a state-of-the-art deep learning architecture from the natural language processing domain, to the credit risk modelling setting. We also interpret the predictions of these models using attention heat maps. To optimise the model further, we present a custom loss function for multi-label classification and a novel multi-channel architecture with differential training that gives the model the ability to use all input data efficiently. Our results show the proposed deep learning architecture's superior performance, resulting in a 13% improvement in AUC (Area Under the receiver operating characteristic Curve) over traditional models. We also demonstrate how to produce an importance ranking for the different data sources and the temporal relationships using a Shapley approach specific to these models. Comment: to be published in the European Journal of Operational Research.
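
    A minimal illustrative sketch (not the authors' architecture): a single transformer-encoder block over a company's observation time series, feeding a multi-label sigmoid head with one output per prediction horizon. The dimensions, the standard binary cross-entropy loss (standing in for the paper's custom loss), and the single-channel layout are assumptions.

```python
# Sketch only: transformer encoder for multi-label time-series classification.
import tensorflow as tf
from tensorflow.keras import Model, layers

timesteps, n_features, n_horizons = 60, 32, 5    # assumed: 60 periods, 5 horizons

inputs = layers.Input(shape=(timesteps, n_features))
x = layers.Dense(64)(inputs)                     # project features to model width

# Encoder block: self-attention + feed-forward, each with a residual connection.
attn = layers.MultiHeadAttention(num_heads=4, key_dim=16)(x, x)
x = layers.LayerNormalization()(layers.Add()([x, attn]))
ff = layers.Dense(128, activation="relu")(x)
ff = layers.Dense(64)(ff)
x = layers.LayerNormalization()(layers.Add()([x, ff]))

x = layers.GlobalAveragePooling1D()(x)
# Multi-label head: an independent default probability for each horizon.
outputs = layers.Dense(n_horizons, activation="sigmoid")(x)

model = Model(inputs, outputs)
model.compile(optimizer="adam",
              loss="binary_crossentropy",        # stand-in for the custom loss
              metrics=[tf.keras.metrics.AUC(multi_label=True)])
model.summary()
```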

    USING MACHINE LEARNING TO OPTIMIZE PREDICTIVE MODELS USED FOR BIG DATA ANALYTICS IN VARIOUS SPORTS EVENTS

    In today’s world, data is growing in volume and variety every day. Historical data can therefore be leveraged to predict the likelihood of events that will occur in the future. This process of using statistical or other data to predict future outcomes is commonly termed predictive modelling. Predictive modelling is becoming increasingly important for several reasons, but mainly because it enables businesses or individual users to gain accurate insights and decide on suitable actions for a profitable outcome. Machine learning techniques are generally used to build these predictive models. Examples range from time-series regression models, which can be used for predicting airline traffic volume, to linear regression models, which can be used for predicting fuel efficiency. Many domains can gain a competitive advantage by using predictive modelling with machine learning, including banking and financial services, retail, insurance, fraud detection, stock market analysis, and sentiment analysis. In this research project, predictive analysis is applied to the sports domain, an emerging area where machine learning can help make better predictions. Numerous sports events happen around the globe every day, and the data gathered from these events can be used to predict as well as improve future events. In this project, machine learning and statistics are used to perform quantitative and predictive analysis of a soccer dataset. A comparison of these models showing how effective they are is also presented, and several big data tools and techniques are used to optimize the predictive models and increase their accuracy to over 90%.
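
    A minimal illustrative sketch (not the project's code): two common classifiers compared by cross-validated accuracy for predicting soccer match outcomes. The features here are random placeholders standing in for real match statistics such as team form and goal averages.

```python
# Sketch only: comparing candidate models for soccer outcome prediction.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
X = rng.random((800, 6))                   # placeholder match features
y = rng.choice(["H", "D", "A"], size=800)  # result: home win / draw / away win

models = {
    "logistic_regression": make_pipeline(StandardScaler(),
                                         LogisticRegression(max_iter=1000)),
    "random_forest": RandomForestClassifier(n_estimators=300, random_state=0),
}

# Compare the candidate models by 5-fold cross-validated accuracy.
for name, model in models.items():
    scores = cross_val_score(model, X, y, cv=5, scoring="accuracy")
    print(f"{name}: mean accuracy = {scores.mean():.3f}")
```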