Search CORE

1,716 research outputs found

Fraud detection for online banking for scalable and distributed data

Author: Haq Ikram
Publication venue: 'Federation University Australia'
Publication date: 01/01/2020
Field of study

Online fraud causes billions of dollars in losses for banks. Therefore, online banking fraud detection is an important field of study. However, there are many challenges in conducting research in fraud detection. One of the constraints is due to unavailability of bank datasets for research or the required characteristics of the attributes of the data are not available. Numeric data usually provides better performance for machine learning algorithms. Most transaction data however have categorical, or nominal features as well. Moreover, some platforms such as Apache Spark only recognizes numeric data. So, there is a need to use techniques e.g. One-hot encoding (OHE) to transform categorical features to numerical features, however OHE has challenges including the sparseness of transformed data and that the distinct values of an attribute are not always known in advance. Efficient feature engineering can improve the algorithm’s performance but usually requires detailed domain knowledge to identify correct features. Techniques like Ripple Down Rules (RDR) are suitable for fraud detection because of their low maintenance and incremental learning features. However, high classification accuracy on mixed datasets, especially for scalable data is challenging. Evaluation of RDR on distributed platforms is also challenging as it is not available on these platforms. The thesis proposes the following solutions to these challenges: • We developed a technique Highly Correlated Rule Based Uniformly Distribution (HCRUD) to generate highly correlated rule-based uniformly-distributed synthetic data. • We developed a technique One-hot Encoded Extended Compact (OHE-EC) to transform categorical features to numeric features by compacting sparse-data even if all distinct values are unknown. • We developed a technique Feature Engineering and Compact Unified Expressions (FECUE) to improve model efficiency through feature engineering where the domain of the data is not known in advance. • A Unified Expression RDR fraud deduction technique (UE-RDR) for Big data has been proposed and evaluated on the Spark platform. Empirical tests were executed on multi-node Hadoop cluster using well-known classifiers on bank data, synthetic bank datasets and publicly available datasets from UCI repository. These evaluations demonstrated substantial improvements in terms of classification accuracy, ruleset compactness and execution speed.Doctor of Philosoph

Federation ResearchOnline

Improving Detection of DeepFakes through Facial Region Analysis in Images

Author: Alanazi F
Morgan G
Ushaw G
Publication venue: 'MDPI AG'
Publication date
Field of study

\ua9 2023 by the authors. In the evolving landscape of digital media, the discipline of media forensics, which encompasses the critical examination and authentication of digital images, videos, and audio recordings, has emerged as an area of paramount importance. This heightened significance is predominantly attributed to the burgeoning concerns surrounding the proliferation of DeepFakes, which are highly realistic and manipulated media content, often created using advanced artificial intelligence techniques. Such developments necessitate a profound understanding and advancement in media forensics to ensure the integrity of digital media in various domains. Current research endeavours are primarily directed towards addressing a common challenge observed in DeepFake datasets, which pertains to the issue of overfitting. Many suggested remedies centre around the application of data augmentation methods, with a frequently adopted strategy being the incorporation of random erasure or cutout. This method entails the random removal of sections from an image to introduce diversity and mitigate overfitting. Generating disparities between the altered and unaltered images serves to inhibit the model from excessively adapting itself to individual samples, thus leading to more favourable results. Nonetheless, the stochastic nature of this approach may inadvertently obscure facial regions that harbour vital information necessary for DeepFake detection. Due to the lack of guidelines on specific regions for cutout, most studies use a randomised approach. However, in recent research, face landmarks have been integrated to designate specific facial areas for removal, even though the selection remains somewhat random. Therefore, there is a need to acquire a more comprehensive insight into facial features and identify which regions hold more crucial data for the identification of DeepFakes. In this study, the investigation delves into the data conveyed by various facial components through the excision of distinct facial regions during the training of the model. The goal is to offer valuable insights to enhance forthcoming face removal techniques within DeepFake datasets, fostering a deeper comprehension among researchers and advancing the realm of DeepFake detection. Our study presents a novel method that uses face cutout techniques to improve understanding of key facial features crucial in DeepFake detection. Moreover, the method combats overfitting in DeepFake datasets by generating diverse images with these techniques, thereby enhancing model robustness. The developed methodology is validated against publicly available datasets like FF++ and Celeb-DFv2. Both face cutout groups surpassed the Baseline, indicating cutouts improve DeepFake detection. Face Cutout Group 2 excelled, with 91% accuracy on Celeb-DF and 86% on the compound dataset, suggesting external facial features’ significance in detection. The study found that eyes are most impactful and the nose is least in model performance. Future research could explore the augmentation policy’s effect on video-based DeepFake detection

Newcastle University E-Prints

Unveiling the frontiers of deep learning: innovations shaping diverse domains

Author: Afrin Shaila
Ahmed Shams Forruque
Alam Md. Sakib Bin
Gandomi Amir H.
Kabir Maliha
Mehjabin Aanushka
Rafa Sabiha Jannat
Publication venue
Publication date: 06/09/2023
Field of study

Deep learning (DL) enables the development of computer models that are capable of learning, visualizing, optimizing, refining, and predicting data. In recent years, DL has been applied in a range of fields, including audio-visual data processing, agriculture, transportation prediction, natural language, biomedicine, disaster management, bioinformatics, drug design, genomics, face recognition, and ecology. To explore the current state of deep learning, it is necessary to investigate the latest developments and applications of deep learning in these disciplines. However, the literature is lacking in exploring the applications of deep learning in all potential sectors. This paper thus extensively investigates the potential applications of deep learning across all major fields of study as well as the associated benefits and challenges. As evidenced in the literature, DL exhibits accuracy in prediction and analysis, makes it a powerful computational tool, and has the ability to articulate itself and optimize, making it effective in processing data with no prior training. Given its independence from training data, deep learning necessitates massive amounts of data for effective analysis and processing, much like data volume. To handle the challenge of compiling huge amounts of medical, scientific, healthcare, and environmental data for use in deep learning, gated architectures like LSTMs and GRUs can be utilized. For multimodal learning, shared neurons in the neural network for all activities and specialized neurons for particular tasks are necessary.Comment: 64 pages, 3 figures, 3 table

arXiv.org e-Print Archive

An introduction to Graph Data Management

Author: A Dries
A Gutiérrez
A Iosup
A Morari
A Poulovassilis
AD Zhu
AO Mendelzon
B Amann
B Elser
C Berge
C Vicknair
C Watters
C Weiss
CS Chang
D Conte
D Dominguez-Sal
D Theodoratos
DC Faye
DW Shipman
EF Codd
FW Tompa
G Malewicz
GM Kuper
H He
HS Kunii
IF Cruz
IF Cruz
J Hidders
J Paredaens
J Peckham
J. Hidders
Jonathan Hayes
K Zeng
L Kowalik
L Zou
M Atre
M Ciglan
M Consens
M Gemis
M Gyssens
M Han
M Levene
M Levene
M Levene
M Mainguenaud
M Schmidt
M Yannakakis
MA Bornea
MA Rodriguez
MA Rodriguez
Marc Andries
MP Consens
MP Consens
N Kiesel
N Roussopoulos
O Erling
P Barceló Baeza
P Buneman
P Yuan
Philippe Cudré-Mauroux
PPS Chen
PT Wood
PT Wood
R Agrawal
R Angles
R Angles
R Brijder
R Ronen
RH Güting
RS Xin
S Abiteboul
S Abiteboul
T Neumann
W Fan
W Kim
Y Guo
Y Low
Y Papakonstantinou
Y Tian
Y Zhao
YA Liu
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 29/12/2017
Field of study

A graph database is a database where the data structures for the schema and/or instances are modeled as a (labeled)(directed) graph or generalizations of it, and where querying is expressed by graph-oriented operations and type constructors. In this article we present the basic notions of graph databases, give an historical overview of its main development, and study the main current systems that implement them

arXiv.org e-Print Archive

Crossref

Handbook of Digital Face Manipulation and Detection

Author
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 14/02/2022
Field of study

This open access book provides the first comprehensive collection of studies dealing with the hot topic of digital face manipulation such as DeepFakes, Face Morphing, or Reenactment. It combines the research fields of biometrics and media forensics including contributions from academia and industry. Appealing to a broad readership, introductory chapters provide a comprehensive overview of the topic, which address readers wishing to gain a brief overview of the state-of-the-art. Subsequent chapters, which delve deeper into various research challenges, are oriented towards advanced readers. Moreover, the book provides a good starting point for young researchers as well as a reference guide pointing at further literature. Hence, the primary readership is academic institutions and industry currently involved in digital face manipulation and detection. The book could easily be used as a recommended text for courses in image processing, machine learning, media forensics, biometrics, and the general security area

Directory of Open Access Books (DOAB)

Cyber Security

Author
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 14/02/2022
Field of study

This open access book constitutes the refereed proceedings of the 17th International Annual Conference on Cyber Security, CNCERT 2021, held in Beijing, China, in AJuly 2021. The 14 papers presented were carefully reviewed and selected from 51 submissions. The papers are organized according to the following topical sections: data security; privacy protection; anomaly detection; traffic analysis; social network security; vulnerability detection; text classification

Directory of Open Access Books (DOAB)

Design and Real-World Application of Novel Machine Learning Techniques for Improving Face Recognition Algorithms

Author: Sáez Trigueros Daniel
Publication venue
Publication date: 28/05/2019
Field of study

Recent progress in machine learning has made possible the development of real-world face recognition applications that can match face images as good as or better than humans. However, several challenges remain unsolved. In this PhD thesis, some of these challenges are studied and novel machine learning techniques to improve the performance of real-world face recognition applications are proposed. Current face recognition algorithms based on deep learning techniques are able to achieve outstanding accuracy when dealing with face images taken in unconstrained environments. However, training these algorithms is often costly due to the very large datasets and the high computational resources needed. On the other hand, traditional methods for face recognition are better suited when these requirements cannot be satisfied. This PhD thesis presents new techniques for both traditional and deep learning methods. In particular, a novel traditional face recognition method that combines texture and shape features together with subspace representation techniques is first presented. The proposed method is lightweight and can be trained quickly with small datasets. This method is used for matching face images scanned from identity documents against face images stored in the biometric chip of such documents. Next, two new techniques to increase the performance of face recognition methods based on convolutional neural networks are presented. Specifically, a novel training strategy that increases face recognition accuracy when dealing with face images presenting occlusions, and a new loss function that improves the performance of the triplet loss function are proposed. Finally, the problem of collecting large face datasets is considered, and a novel method based on generative adversarial networks to synthesize both face images of existing subjects in a dataset and face images of new subjects is proposed. The accuracy of existing face recognition algorithms can be increased by training with datasets augmented with the synthetic face images generated by the proposed method. In addition to the main contributions, this thesis provides a comprehensive literature review of face recognition methods and their evolution over the years. A significant amount of the work presented in this PhD thesis is the outcome of a 3-year-long research project partially funded by Innovate UK as part of a Knowledge Transfer Partnership between University of Hertfordshire and IDscan Biometrics Ltd (partnership number: 009547)

University of Hertfordshire Research Archive

Handbook of Digital Face Manipulation and Detection

Author
Publication venue: 'Springer Science and Business Media LLC'
Publication date
Field of study

OAPEN Library