1,420 research outputs found
GraphFC: Customs Fraud Detection with Label Scarcity
Customs officials across the world encounter huge volumes of transactions.
With increased connectivity and globalization, the customs transactions
continue to grow every year. Associated with customs transactions is the
customs fraud - the intentional manipulation of goods declarations to avoid the
taxes and duties. With limited manpower, customs offices can only undertake
manual inspection of a limited number of declarations. This necessitates
automating customs fraud detection with machine learning (ML)
techniques. Due to the limited manual inspection for labeling newly incoming
declarations, the ML approach should have robust performance subject to the
scarcity of labeled data. However, current approaches for customs fraud
detection are not well suited and designed for this real-world setting. In this
work, we propose GraphFC (Graph neural networks for
Customs Fraud), a model-agnostic, domain-specific,
semi-supervised graph neural network based customs fraud detection algorithm
that has strong semi-supervised and inductive capabilities. With up to a 252%
relative increase in recall over the present state of the art, extensive
experimentation on real customs data from the customs administrations of three
different countries demonstrates that GraphFC consistently outperforms various
baselines and the present state of the art by a large margin.
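To make the "252% relative increase in recall" figure concrete, the following is a minimal arithmetic sketch. The baseline recall of 0.10 is a hypothetical value chosen for illustration and is not taken from the paper; the function names are likewise illustrative.

```python
def relative_increase(baseline: float, improved: float) -> float:
    """Relative increase of `improved` over `baseline`, as a fraction."""
    return (improved - baseline) / baseline


def max_baseline_for(rel_increase: float) -> float:
    """Largest baseline recall for which the improved recall stays <= 1."""
    return 1.0 / (1.0 + rel_increase)


# Hypothetical baseline recall (assumption, not a reported result).
baseline_recall = 0.10
# A 252% relative increase multiplies the baseline by 1 + 2.52 = 3.52.
improved_recall = baseline_recall * (1 + 2.52)

print(round(improved_recall, 3))
print(round(relative_increase(baseline_recall, improved_recall), 2))
print(round(max_baseline_for(2.52), 3))
```

Because recall cannot exceed 1, a relative increase of this size is only possible when the baseline recall is below roughly 0.28.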
Semi-supervised multi-layered clustering model for intrusion detection
A Machine Learning (ML)-based Intrusion Detection and Prevention System (IDPS) requires a large amount of labeled, up-to-date training data to effectively detect intrusions and generalize well to novel attacks. However, labeling data is costly and becomes infeasible when dealing with big data, such as that generated by IoT (Internet of Things)-based applications. To this end, building an ML model that learns from unlabeled or partially labeled data is of critical importance. This paper proposes a novel Semi-supervised Multi-Layered Clustering Model (SMLC) for network intrusion detection and prevention tasks. The SMLC can learn from partially labeled data while achieving detection performance comparable to that of a supervised ML-based IDPS. The performance of the SMLC is compared with well-known supervised ensemble ML models, namely RandomForest, Bagging, and AdaboostM1, and with a semi-supervised model (i.e., tri-training) on a benchmark network intrusion dataset, the Kyoto 2006+. Experimental results show that the SMLC outperforms all other models and can achieve better detection accuracy using only 20% labeled instances of the training data.
A comprehensive survey on deep active learning and its applications in medical image analysis
Deep learning has achieved widespread success in medical image analysis,
leading to an increasing demand for large-scale expert-annotated medical image
datasets. Yet, the high cost of annotating medical images severely hampers the
development of deep learning in this field. To reduce annotation costs, active
learning aims to select the most informative samples for annotation and train
high-performance models with as few labeled samples as possible. In this
survey, we review the core methods of active learning, including the evaluation
of informativeness and sampling strategy. For the first time, we provide a
detailed summary of the integration of active learning with other
label-efficient techniques, such as semi-supervised and self-supervised
learning. Additionally, we highlight active learning works that are
specifically tailored to medical image analysis. Finally, we offer our
perspectives on the future trends and challenges of active learning and its
applications in medical image analysis. Comment: Paper list on GitHub:
https://github.com/LightersWang/Awesome-Active-Learning-for-Medical-Image-Analysi
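The idea of selecting "the most informative samples for annotation" can be illustrated with entropy-based uncertainty sampling, one common informativeness criterion. This is a minimal sketch and not the survey's own method: the function names, the labeling budget, and the predicted probabilities below are all hypothetical.

```python
import math


def entropy(probs):
    """Shannon entropy (in nats) of one predicted class distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def select_most_uncertain(pool_probs, budget):
    """Return indices of the `budget` unlabeled samples with highest entropy."""
    ranked = sorted(
        range(len(pool_probs)),
        key=lambda i: entropy(pool_probs[i]),
        reverse=True,
    )
    return ranked[:budget]


# Hypothetical model outputs for four unlabeled samples (two classes each).
pool_probs = [
    [0.98, 0.02],  # confident prediction, low entropy
    [0.55, 0.45],  # uncertain
    [0.80, 0.20],  # moderately confident
    [0.50, 0.50],  # maximally uncertain
]

queried = select_most_uncertain(pool_probs, budget=2)
print(queried)  # indices to send to the annotator
```

In a full active learning loop, the queried samples would be labeled by an expert, added to the training set, and the model retrained before the next round of selection.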
Recent advancement in Disease Diagnostic using machine learning: Systematic survey of decades, comparisons, and challenges
Computer-aided diagnosis (CAD), a vibrant medical imaging research field, is
expanding quickly. Because errors in medical diagnostic systems might lead to
seriously misleading medical treatments, major efforts have been made in recent
years to improve computer-aided diagnostics applications. The use of machine
learning in computer-aided diagnosis is crucial. A simple equation may result
in a false indication of items like organs. Therefore, learning from examples
is a vital component of pattern recognition. Pattern recognition and machine
learning in the biomedical area promise to increase the precision of disease
detection and diagnosis. They also support the decision-making process's
objectivity. Machine learning provides a practical method for creating elegant
and autonomous algorithms to analyze high-dimensional and multimodal
bio-medical data. This review article examines machine-learning algorithms for
detecting diseases, including hepatitis, diabetes, liver disease, dengue fever,
and heart disease. It draws attention to the collection of machine learning
techniques and algorithms employed in studying conditions and the ensuing
decision-making process.
Recent Advances in Transfer Learning for Cross-Dataset Visual Recognition: A Problem-Oriented Perspective
This paper takes a problem-oriented perspective and presents a comprehensive
review of transfer learning methods, both shallow and deep, for cross-dataset
visual recognition. Specifically, it categorises the cross-dataset recognition
into seventeen problems based on a set of carefully chosen data and label
attributes. Such a problem-oriented taxonomy has allowed us to examine how
different transfer learning approaches tackle each problem and how well each
problem has been researched to date. The comprehensive problem-oriented review
of the advances in transfer learning with respect to the problem has not only
revealed the challenges in transfer learning for visual recognition, but also
the problems (e.g. eight of the seventeen problems) that have been scarcely
studied. This survey not only presents an up-to-date technical review for
researchers, but also a systematic approach and a reference for a machine
learning practitioner to categorise a real problem and to look up a
possible solution accordingly.