A credit risk model with small sample data based on G-XGBoost
Existing credit risk models, e.g., Scoring Card and Extreme Gradient Boosting (XGBoost), usually require a sufficiently large modeling sample. With a small sample, the trained model may neither achieve the expected accuracy nor distinguish risks well. On the other hand, data acquisition can be difficult and restricted due to data protection regulations. To address this dilemma, this paper applies Generative Adversarial Nets (GAN) to the construction of a credit risk model for small and micro enterprises (SMEs), and proposes a novel training method, namely G-XGBoost, based on the XGBoost model. A few batches of real data are selected to train the GAN. When the generative network reaches Nash equilibrium, it is used to generate pseudo data with the same distribution as the real data. The pseudo data is then combined with the real data to form an amplified sample set, which is used to train XGBoost for credit risk prediction. The feasibility and advantages of the G-XGBoost model are demonstrated by comparison with the XGBoost model.
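The amplification step described in this abstract can be sketched as follows. This is a minimal illustration, not the paper's implementation: a per-class, per-feature Gaussian sampler stands in for the trained GAN generator, class-conditional generation is an assumption, and the function names (`fit_stand_in_generator`, `generate_pseudo`, `amplify`) and the toy data are hypothetical. The amplified set would then be passed to a classifier such as XGBoost.

```python
import random
import statistics


def fit_stand_in_generator(rows):
    """Fit an independent Gaussian per feature column.

    ASSUMPTION: this replaces the trained GAN generator from the abstract
    so that the sketch runs with the standard library only.
    """
    columns = list(zip(*rows))
    return [(statistics.mean(c), statistics.pstdev(c)) for c in columns]


def generate_pseudo(params, n, rng):
    """Draw n pseudo samples from the fitted per-feature Gaussians."""
    return [[rng.gauss(mu, sigma) for mu, sigma in params] for _ in range(n)]


def amplify(real_rows, real_labels, n_pseudo_per_class, seed=0):
    """Return the real data followed by pseudo data generated per class."""
    rng = random.Random(seed)
    rows = list(real_rows)
    labels = list(real_labels)
    for cls in sorted(set(real_labels)):
        cls_rows = [r for r, y in zip(real_rows, real_labels) if y == cls]
        params = fit_stand_in_generator(cls_rows)
        rows.extend(generate_pseudo(params, n_pseudo_per_class, rng))
        labels.extend([cls] * n_pseudo_per_class)
    return rows, labels


# Hypothetical toy data: two features, two classes (0 = good, 1 = default).
real_X = [[0.2, 1.0], [0.3, 1.2], [0.9, 3.1], [1.1, 2.9]]
real_y = [0, 0, 1, 1]
X_amp, y_amp = amplify(real_X, real_y, n_pseudo_per_class=5)
```

Keeping the real rows at the front of the amplified set makes it easy to hold them out later, for example to evaluate the classifier on real data only.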
Development of a Machine Learning-Based Financial Risk Control System
With the gradual end of the COVID-19 outbreak and the gradual recovery of the economy, more and more individuals and businesses are in need of loans. This demand brings business opportunities to financial institutions, but it also brings new risks. Traditional loan application review is mostly manual and relies on the business experience of the auditor, so it is inefficient and cannot process large volumes of applications. Since the traditional review process is no longer suitable, financial institutions urgently need another way to reduce the rate of non-performing loans and to detect fraud in applications.
In this project, a financial risk control model is built by using various machine learning algorithms. The model is used to replace the traditional manual approach to reviewing loan applications. It improves the speed of review as well as the accuracy and approval rate of the review. Machine learning algorithms were also used in this project to create a loan user scorecard system that better reflects changes in user information compared to the credit card systems used by financial institutions today. In this project, the data imbalance problem and the performance improvement problem are also explored.
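As an illustration of how a scorecard can turn a model's predicted default probability into points, the sketch below uses the common log-odds scaling (score = offset + factor × ln(odds), with factor = PDO / ln 2). The abstract does not say how the project's scorecard is built, so the function name `scorecard_points` and the calibration values (600 points at good:bad odds of 50:1, 20 points to double the odds) are assumptions chosen only for the example.

```python
import math


def scorecard_points(prob_default, base_score=600.0, base_odds=50.0, pdo=20.0):
    """Map a predicted default probability to a score.

    ASSUMPTION: base_score, base_odds and pdo (points to double the odds)
    are illustrative calibration choices, not values from the project.
    """
    factor = pdo / math.log(2)
    offset = base_score - factor * math.log(base_odds)
    odds_good = (1.0 - prob_default) / prob_default  # good:bad odds
    return offset + factor * math.log(odds_good)


# An applicant at the base odds receives the base score; doubling the odds adds pdo points.
score_at_base = scorecard_points(1 / 51)
score_at_double = scorecard_points(1 / 101)
```

Because the score is a monotone function of the predicted probability, rescoring an applicant whenever the model's inputs change is enough for the score to follow changes in user information.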
Intelligent instance selection techniques for support vector machine speed optimization with application to e-fraud detection.
Doctor of Philosophy in Computer Science. University of KwaZulu-Natal, Durban 2017.
Decision-making is a very important aspect of many businesses. Wrong decisions carry serious penalties, including financial loss, damage to company reputation and reduced productivity. Hence, it is critically important that managers make the right decisions. Machine Learning (ML) simplifies the process of decision making: it helps to discover useful patterns from historical data, which can be used for meaningful decision-making. The ability to make strategic and meaningful decisions is dependent on the reliability of data. Currently, many organizations are overwhelmed with vast amounts of data, and unfortunately, ML algorithms cannot effectively handle large datasets. This thesis therefore proposes seven filter-based and five wrapper-based intelligent instance selection techniques for optimizing the speed and predictive accuracy of ML algorithms, with a particular focus on Support Vector Machine (SVM). This thesis also proposes a novel fitness function for instance selection. The primary difference between the filter-based and wrapper-based techniques is in their method of selection. The filter-based techniques utilize the proposed fitness function for selection, while the wrapper-based techniques utilize the SVM algorithm for selection.
The proposed techniques are obtained by fusing the SVM algorithm with the following nature-inspired algorithms: flower pollination algorithm, social spider algorithm, firefly algorithm, cuckoo search algorithm and bat algorithm. Also, two of the filter-based techniques are boundary detection algorithms, inspired by edge detection in image processing and edge selection in ant colony optimization. Two different sets of experiments were performed in order to evaluate the performance of the proposed techniques (wrapper-based and filter-based). All experiments were performed on four datasets containing three popular e-fraud types: credit card fraud, email spam and phishing email. In addition, experiments were performed on 20 datasets provided by the well-known UCI data repository. The results show that the proposed filter-based techniques improved SVM training speed in 100% (24 out of 24) of the datasets used for evaluation, without significantly affecting SVM classification quality. Moreover, experimental results also show that the wrapper-based techniques improved SVM predictive accuracy in 78% (18 out of 23) of the datasets used for evaluation and simultaneously improved SVM training speed in all cases. Furthermore, two different statistical tests were conducted to further validate the credibility of the results: Friedman's test and Holm's post-hoc test. The statistical test results reveal that the proposed filter-based and wrapper-based techniques are significantly faster, compared to standard SVM and some existing instance selection techniques, in all cases. Moreover, statistical test results also reveal that the Cuckoo Search Instance Selection Algorithm outperforms all the other proposed techniques in terms of speed.
Overall, the proposed techniques have proven to be fast and accurate ML-based e-fraud detection techniques, with improved training speed, predictive accuracy and storage reduction. In real-life applications that require a classifier to be trained very quickly for speedy classification of new target concepts, such as video surveillance and intrusion detection systems, the filter-based techniques provide the best solutions, while the wrapper-based techniques are better suited for applications, such as email filters, that are very sensitive to slight changes in predictive accuracy.
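To make the idea of boundary-oriented instance selection concrete, here is a minimal sketch. It is not one of the thesis's algorithms or its fitness function: it uses a simple, assumed rule (keep an instance only if its nearest neighbour belongs to a different class), and the function names and toy data are hypothetical. The intuition is that an SVM's decision boundary is determined by points near the class border, so training on those alone can be much faster.

```python
import math


def nearest_neighbor_index(points, i):
    """Return the index of the point closest to points[i] (excluding i itself)."""
    best, best_dist = None, math.inf
    for j, q in enumerate(points):
        if j == i:
            continue
        d = math.dist(points[i], q)
        if d < best_dist:
            best, best_dist = j, d
    return best


def select_boundary_instances(points, labels):
    """Keep indices of instances whose nearest neighbour has a different label.

    ASSUMPTION: this nearest-neighbour rule is an illustrative stand-in for
    the boundary detection techniques mentioned in the abstract.
    """
    keep = []
    for i in range(len(points)):
        j = nearest_neighbor_index(points, i)
        if j is not None and labels[j] != labels[i]:
            keep.append(i)
    return keep


# Hypothetical toy data: two classes laid out along a line.
points = [(0.0, 0.0), (0.1, 0.0), (0.9, 0.0), (1.1, 0.0), (2.0, 0.0), (2.1, 0.0)]
labels = [0, 0, 0, 1, 1, 1]
selected = select_boundary_instances(points, labels)  # indices of border points
```

On this toy data only the two points that face each other across the class border are kept, so a classifier would be trained on two instances instead of six.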
Advanced Information Systems and Technologies
This book comprises the proceedings of the VI International Scientific Conference “Advanced Information Systems and Technologies, AIST-2018”. The proceedings papers cover issues related to system analysis and modeling, project management, information system engineering, intelligent data processing, computer networking and telecommunications, and modern methods and information technologies of sustainable development. They will be useful for students, graduate students, and researchers who are interested in computer science.
Deep Learning-Powered Computational Intelligence for Cyber-Attacks Detection and Mitigation in 5G-Enabled Electric Vehicle Charging Station
An electric vehicle charging station (EVCS) infrastructure is the backbone of transportation electrification. However, the EVCS has various cyber-attack vulnerabilities in software, hardware, supply chain, and incumbent legacy technologies such as network, communication, and control. Therefore, proactively monitoring, detecting, and defending against these attacks is very important. The state-of-the-art approaches are not agile and intelligent enough to detect, mitigate, and defend against various cyber-physical attacks in the EVCS system. To overcome these limitations, this dissertation primarily designs, develops, implements, and tests data-driven deep learning-powered computational intelligence to detect and mitigate cyber-physical attacks at the network and physical layers of 5G-enabled EVCS infrastructure. Also, the 5G slicing application to ensure the security and service level agreement (SLA) in the EVCS ecosystem has been studied. Various cyber-attacks such as distributed denial of service (DDoS), false data injection (FDI), advanced persistent threats (APT), and ransomware attacks on the network in a standalone 5G-enabled EVCS environment have been considered. Mathematical models for the mentioned cyber-attacks have been developed. The impact of cyber-attacks on the EVCS operation has been analyzed. Various deep learning-powered intrusion detection systems have been proposed to detect attacks using local electrical and network fingerprints. Furthermore, a novel detection framework has been designed and developed to deal with ransomware threats in high-speed, high-dimensional, multimodal data and assets from eccentric stakeholders of the connected automated vehicle (CAV) ecosystem. To mitigate the adverse effects of cyber-attacks on EVCS controllers, novel data-driven digital clones based on Twin Delayed Deep Deterministic Policy Gradient (TD3) Deep Reinforcement Learning (DRL) have been developed.
Also, various brute-force and controller clone-based methods have been devised and tested to aid the defense against and mitigation of the impact of the attacks on the EVCS operation. The performance of the proposed mitigation method has been compared with that of a benchmark Deep Deterministic Policy Gradient (DDPG)-based digital clones approach. Simulation results obtained from the Python, Matlab/Simulink, and NetSim software demonstrate that the cyber-attacks are disruptive and detrimental to the operation of EVCS. The proposed detection and mitigation methods are effective and perform better than the conventional and benchmark techniques for the 5G-enabled EVCS.
Accelerating Network Communication and I/O in Scientific High Performance Computing Environments
High performance computing has become one of the major drivers behind technology inventions and science discoveries. Originally driven through the increase of operating frequencies and technology scaling, a recent slowdown in this evolution has led to the development of multi-core architectures, which are supported by accelerator devices such as graphics processing units (GPUs). With the upcoming exascale era, the overall power consumption and the gap between compute capabilities and I/O bandwidth have become major challenges. Nowadays, the system performance is dominated by the time spent in communication and I/O, which highly depends on the capabilities of the network interface. In order to cope with the extreme concurrency and heterogeneity of future systems, the software ecosystem of the interconnect needs to be carefully tuned to excel in reliability, programmability, and usability.
This work identifies and addresses three major gaps in today's interconnect software systems. The I/O gap describes the disparity in operating speeds between the computing capabilities and second storage tiers. The communication gap is introduced through the communication overhead needed to synchronize distributed large-scale applications and the mixed workload. The last gap is the so-called concurrency gap, which is introduced through the extreme concurrency and the steep learning curve that scientific application developers face when exploiting the hardware capabilities.
The first contribution is the introduction of the network-attached accelerator approach, which moves accelerators into a "stand-alone" cluster connected through the Extoll interconnect. The novel communication architecture enables direct communication between accelerators without any host interaction, as well as an optimal application-to-compute-resources mapping. The effectiveness of this approach is evaluated for two classes of accelerators: Intel Xeon Phi coprocessors and NVIDIA GPUs.
The next contribution comprises the design, implementation, and evaluation of the support of legacy codes and protocols over the Extoll interconnect technology. By providing TCP/IP protocol support over Extoll, it is shown that the performance benefits of the interconnect can be fully leveraged by a broader range of applications, including the seamless support of legacy codes.
The third contribution is twofold. First, a comprehensive analysis of the Lustre networking protocol semantics and interfaces is presented. Afterwards, these insights are utilized to map the LNET protocol semantics onto the Extoll networking technology. The result is a fully functional Lustre network driver for Extoll. An initial performance evaluation demonstrates promising bandwidth and message rate results.
The last contribution comprises the design, implementation, and evaluation of two easy-to-use load balancing frameworks, which transparently distribute the I/O workload across all available storage system components. The solutions maximize the parallelization and throughput of file I/O. The frameworks are evaluated on the Titan supercomputing system for three I/O interfaces. For example, for large-scale application runs, POSIX I/O and MPI-IO can be improved by up to 50% on a per-job basis, while HDF5 shows performance improvements of up to 32%.
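The idea of spreading an I/O workload across storage components can be illustrated with a simple greedy placement. This is only a sketch under stated assumptions and not the frameworks' actual policy: each file is assigned to whichever storage target currently holds the fewest bytes, and the function name `assign_files`, the file names and the sizes are hypothetical.

```python
import heapq


def assign_files(file_sizes, n_targets):
    """Greedily place each file on the least-loaded storage target.

    ASSUMPTION: load is measured as total assigned bytes; a real framework
    would also weigh current server utilization and network paths.
    """
    heap = [(0, target) for target in range(n_targets)]  # (load, target id)
    heapq.heapify(heap)
    placement = {}
    # Place the largest files first so that small files can even out the loads.
    for name, size in sorted(file_sizes.items(), key=lambda kv: (-kv[1], kv[0])):
        load, target = heapq.heappop(heap)
        placement[name] = target
        heapq.heappush(heap, (load + size, target))
    loads = [0] * n_targets
    for name, target in placement.items():
        loads[target] += file_sizes[name]
    return placement, loads


# Hypothetical job writing four files (sizes in GiB) to two storage targets.
files = {"a.h5": 8, "b.h5": 6, "c.h5": 4, "d.h5": 2}
placement, loads = assign_files(files, 2)
```

With these sizes both targets end up holding 10 GiB, so neither becomes a bottleneck for the job.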
Blown to Bits: Your Life, Liberty, and Happiness After the Digital Explosion
382 p. Electronic book.
Each of us has been in the computing field for more than 40 years. The book is the product of a lifetime of observing and participating in the changes it has brought. Each of us has been both a teacher and a learner in the field.
This book emerged from a general education course we have taught at Harvard, but it is not a textbook. We wrote this book to share what wisdom we have with as many people as we can reach. We try to paint a big picture, with dozens of illuminating anecdotes as the brushstrokes. We aim to entertain you at the same time as we provoke your thinking.
Preface
Chapter 1 Digital Explosion
Why Is It Happening, and What Is at Stake?
The Explosion of Bits, and Everything Else
The Koans of Bits
Good and Ill, Promise and Peril
Chapter 2 Naked in the Sunlight
Privacy Lost, Privacy Abandoned
1984 Is Here, and We Like It
Footprints and Fingerprints
Why We Lost Our Privacy, or Gave It Away
Little Brother Is Watching
Big Brother, Abroad and in the U.S.
Technology Change and Lifestyle Change
Beyond Privacy
Chapter 3 Ghosts in the Machine
Secrets and Surprises of Electronic Documents
What You See Is Not What the Computer Knows
Representation, Reality, and Illusion
Hiding Information in Images
The Scary Secrets of Old Disks
Chapter 4 Needles in the Haystack
Google and Other Brokers in the Bits Bazaar
Found After Seventy Years
The Library and the Bazaar
The Fall of Hierarchy
It Matters How It Works
Who Pays, and for What?
Search Is Power
You Searched for WHAT? Tracking Searches
Regulating or Replacing the Brokers
Chapter 5 Secret Bits
How Codes Became Unbreakable
Encryption in the Hands of Terrorists, and Everyone Else
Historical Cryptography
Lessons for the Internet Age
Secrecy Changes Forever
Cryptography for Everyone
Cryptography Unsettled
Chapter 6 Balance Toppled
Who Owns the Bits?
Automated Crimes—Automated Justice
NET Act Makes Sharing a Crime
The Peer-to-Peer Upheaval
Sharing Goes Decentralized
Authorized Use Only
Forbidden Technology
Copyright Koyaanisqatsi: Life Out of Balance
The Limits of Property
Chapter 7 You Can’t Say That on the Internet
Guarding the Frontiers of Digital Expression
Do You Know Where Your Child Is on the Web Tonight?
Metaphors for Something Unlike Anything Else
Publisher or Distributor?
Neither Liberty nor Security
The Nastiest Place on Earth
The Most Participatory Form of Mass Speech
Protecting Good Samaritans—and a Few Bad Ones
Laws of Unintended Consequences
Can the Internet Be Like a Magazine Store?
Let Your Fingers Do the Stalking
Like an Annoying Telephone Call?
Digital Protection, Digital Censorship—and Self-Censorship
Chapter 8 Bits in the Air
Old Metaphors, New Technologies, and Free Speech
Censoring the President
How Broadcasting Became Regulated
The Path to Spectrum Deregulation
What Does the Future Hold for Radio?
Conclusion After the Explosion
Bits Lighting Up the World
A Few Bits in Conclusion
Appendix The Internet as System and Spirit
The Internet as a Communication System
The Internet Spirit
Endnotes
Inde
Phishing within e-commerce: reducing the risk, increasing the trust
E-Commerce has been plagued with problems since its inception, and this study examines one of these problems: the lack of user trust in E-Commerce created by the risk of phishing. Phishing has grown exponentially together with the expansion of the Internet. This growth and the advancement of technology have not only benefited honest Internet users, but have also enabled criminals to increase their effectiveness, which has caused considerable damage to this budding area of commerce. Moreover, it has negatively impacted both the user and online business by breaking down the trust relationship between them. In an attempt to explore this problem, the following was considered. First, E-Commerce’s vulnerability to phishing attacks: by referring to the Common Criteria Security Model, various critical security areas within E-Commerce are identified, as well as the areas of vulnerability and weakness. Second, the methods and techniques used in phishing, such as phishing e-mails, websites and addresses, distributed attacks and redirected attacks, as well as the data that phishers seek to obtain, are examined. Third, ways to reduce the risk of phishing, and in turn increase the trust between users and websites, are identified. Here the importance of Trust and the Uncertainty Reduction Theory, plus the fine balance between trust and control, is explored. Finally, the study presents Critical Success Factors that aid in phishing prevention and control, these being: User Authentication, Website Authentication, E-mail Authentication, Data Cryptography, Communication, and Active Risk Mitigation.