The (de)biasing effect of GAN-based augmentation methods on skin lesion images
New medical datasets are now more open to the public, allowing for better and
more extensive research. Although prepared with the utmost care, new datasets
might still be a source of spurious correlations that affect the learning
process. Moreover, data collections are usually not large enough and are often
unbalanced. One approach to alleviate the data imbalance is using data
augmentation with Generative Adversarial Networks (GANs) to extend the dataset
with high-quality images. GANs are usually trained on the same biased datasets
as the target data, resulting in more biased instances. This work explored
unconditional and conditional GANs to compare their bias inheritance and how
the synthetic data influenced the models. We provided extensive manual data
annotation of possibly biasing artifacts on the well-known ISIC dataset with
skin lesions. In addition, we examined classification models trained on both
real and synthetic data with counterfactual bias explanations. Our experiments
showed that GANs inherited biases and sometimes even amplified them, leading to
even stronger spurious correlations. Manual data annotation and synthetic
images are publicly available for reproducible scientific research.
Comment: Accepted to MICCAI202
Modelling Arbitrary Complex Dielectric Properties – an automated implementation for gprMax
There is a need to accurately simulate materials with complex electromagnetic properties when modelling Ground Penetrating Radar (GPR), as many objects encountered with GPR contain water, e.g. soils, curing concrete, and water-filled pipes. One widely used open-source package that simulates electromagnetic wave propagation is gprMax. It uses Yee’s algorithm to solve Maxwell’s equations with the Finite-Difference Time-Domain (FDTD) method. A significant drawback of the FDTD method is its limited ability to model materials with dispersive properties, which is currently restricted to a specific set of relaxation mechanisms, namely multi-Debye, Drude and Lorentz media. Consequently, any arbitrary complex material has to be modelled by approximating it as a combination of these functions. This paper describes work carried out as part of the Google Summer of Code (GSoC) programme 2021 to develop a new module within gprMax that can be used to simulate complex dispersive materials automatically using multi-Debye expansions. The module is capable of modelling Havriliak-Negami, Cole-Cole, Cole-Davidson, Jonscher, Complex-Refractive Index Models, and indeed any arbitrary dispersive material with real and imaginary permittivity specified by the user.
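To make the relaxation models named in this abstract concrete, the following minimal sketch evaluates the textbook Havriliak-Negami permittivity and a sum of Debye poles. It is not the gprMax module or its API: the function names and the parameter values are assumptions chosen for illustration, and the e^{jωt} sign convention is assumed so that loss appears as a negative imaginary part.

```python
def havriliak_negami(omega, eps_inf, delta_eps, tau, alpha, beta):
    """Complex relative permittivity of a Havriliak-Negami medium.

    eps(omega) = eps_inf + delta_eps / (1 + (j*omega*tau)**alpha)**beta
    beta = 1 gives Cole-Cole, alpha = 1 gives Cole-Davidson,
    alpha = beta = 1 gives a single Debye pole.
    """
    return eps_inf + delta_eps / (1 + (1j * omega * tau) ** alpha) ** beta


def multi_debye(omega, eps_inf, poles):
    """Sum of Debye poles; `poles` is a list of (delta_eps, tau) pairs."""
    return eps_inf + sum(d / (1 + 1j * omega * t) for d, t in poles)


# Illustrative (assumed) parameters, roughly water-like.
EPS_INF, DELTA_EPS, TAU = 4.9, 75.2, 9.231e-12
omega = 1.0 / TAU  # angular frequency at the relaxation peak

# With alpha = beta = 1 the two models coincide.
eps_hn = havriliak_negami(omega, EPS_INF, DELTA_EPS, TAU, alpha=1.0, beta=1.0)
eps_debye = multi_debye(omega, EPS_INF, [(DELTA_EPS, TAU)])

# A broadened relaxation that a single Debye pole cannot reproduce exactly.
eps_broad = havriliak_negami(omega, EPS_INF, DELTA_EPS, TAU, alpha=0.8, beta=0.6)

print(eps_hn, eps_debye, eps_broad)
```

A multi-Debye expansion approximates a curve such as `eps_broad` over a frequency band by choosing several `(delta_eps, tau)` pairs; how those pairs are chosen is the optimisation task the abstract attributes to the new module and is not shown here.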
Waste detection in Pomerania: non-profit project for detecting waste in environment
Waste pollution is one of the most significant environmental issues in the
modern world. The importance of recycling is well known, either for economic or
ecological reasons, and the industry demands high efficiency. Our team
conducted comprehensive research on Artificial Intelligence usage in waste
detection and classification to fight the world's waste pollution problem. As a
result, an open-source framework that enables the detection and classification
of litter was developed. The final pipeline consists of two neural networks:
one that detects litter and a second responsible for litter classification.
Waste is classified into seven categories: bio, glass, metal and plastic,
non-recyclable, other, paper and unknown. Our approach achieves up to 70%
average precision in waste detection and around 75% classification accuracy
on the test dataset. The code used in the studies is publicly available online.
Comment: Litter detection, Waste detection, Object detectio
Towards trustworthy multi-modal motion prediction: Holistic evaluation and interpretability of outputs
Predicting the motion of other road agents enables autonomous vehicles to
perform safe and efficient path planning. This task is very complex, as the
behaviour of road agents depends on many factors and the number of possible
future trajectories can be considerable (multi-modal). Most prior approaches
proposed to address multi-modal motion prediction are based on complex machine
learning systems that have limited interpretability. Moreover, the metrics used
in current benchmarks do not evaluate all aspects of the problem, such as the
diversity and admissibility of the output. In this work, we aim to advance
towards the design of trustworthy motion prediction systems, based on some of
the requirements for the design of Trustworthy Artificial Intelligence. We
focus on evaluation criteria, robustness, and interpretability of outputs.
First, we comprehensively analyse the evaluation metrics, identify the main
gaps of current benchmarks, and propose a new holistic evaluation framework. We
then introduce a method for the assessment of spatial and temporal robustness
by simulating noise in the perception system. To enhance the interpretability
of the outputs and generate more balanced results in the proposed evaluation
framework, we propose an intent prediction layer that can be attached to
multi-modal motion prediction models. The effectiveness of this approach is
assessed through a survey that explores different elements in the visualization
of the multi-modal trajectories and intentions. The proposed approach and
findings make a significant contribution to the development of trustworthy
motion prediction systems for autonomous vehicles, advancing the field towards
greater safety and reliability.
Comment: 16 pages, 7 figures, 6 table
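As a rough illustration of the robustness idea in this abstract, the sketch below jitters an observed track with Gaussian noise and scores a set of predicted modes with the standard minimum average displacement error (minADE). It is a hypothetical example and not the paper's evaluation framework: the function names, the noise model, and the toy trajectories are all assumptions.

```python
import math
import random


def add_spatial_noise(track, sigma, rng):
    """Return a copy of `track` with zero-mean Gaussian jitter on x and y."""
    return [(x + rng.gauss(0.0, sigma), y + rng.gauss(0.0, sigma)) for x, y in track]


def ade(pred, truth):
    """Average displacement error between two equally long trajectories."""
    return sum(math.dist(p, t) for p, t in zip(pred, truth)) / len(truth)


def min_ade(modes, truth):
    """Best (lowest) ADE over all predicted modes."""
    return min(ade(mode, truth) for mode in modes)


# Toy ground truth: an agent moving along the x axis.
truth = [(float(t), 0.0) for t in range(5)]

# Two hypothetical predicted modes, offset laterally by 1 m and 3 m.
modes = [
    [(x, y + 1.0) for x, y in truth],
    [(x, y + 3.0) for x, y in truth],
]

rng = random.Random(0)  # fixed seed for repeatability
noisy_observation = add_spatial_noise(truth, sigma=0.2, rng=rng)

print(min_ade(modes, truth), ade(noisy_observation, truth))
```

In a robustness study of this kind, the noisy track would be fed to the predictor in place of the clean one, and the change in a metric such as minADE would be reported as a function of `sigma`.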
GAN-based generative modelling for dermatological applications -- comparative study
The lack of sufficiently large open medical databases is one of the biggest
challenges in AI-powered healthcare. Synthetic data created using Generative
Adversarial Networks (GANs) appears to be a good solution to mitigate the
issues with privacy policies. Another remedy is a decentralized protocol
across multiple medical institutions that does not exchange local data samples. In
this paper, we explored unconditional and conditional GANs in centralized and
decentralized settings. The centralized setting imitates studies on a large but
highly unbalanced skin lesion dataset, while the decentralized one simulates a
more realistic hospital scenario with three institutions. We evaluated models'
performance in terms of fidelity, diversity, speed of training, and predictive
ability of classifiers trained on the generated synthetic data. In addition, we
provided explainability through exploration of latent space and embeddings
projection focused on both global and local explanations. The calculated distance
between real images and their projections in the latent space proved the
authenticity and generalization of the trained GANs, which is one of the main
concerns in this type of application. The open-source code for the conducted
studies is publicly available at
https://github.com/aidotse/stylegan2-ada-pytorch.
Comment: 16 pages, 5 figures, 2 table
Security theory and practice: Improving the level of security: Methods and tools
From the introduction: "Security studies and management and quality studies, as scientific disciplines,
belong to the field of social sciences. The object and subject of research in the social
sciences is social reality, which comprises social collectivities
and groups, social institutions, as well as social processes and phenomena.
The object of research is heterogeneous, which requires the use of diverse
research tools, methods, and techniques, often drawn from other scientific
disciplines outside the field of social sciences. One of the aims of research on
security may be to raise its level, including by drawing on the
achievements of management studies. One should take into account the varied understanding of the concept of
'security' and the fact that this term is currently under discussion."(...
Numerical modelings of ultrashort pulse propagation and conical emission in multimode optical fibers
We make use of two well-known numerical approaches to nonlinear pulse propagation, namely, the unidirectional pulse propagation equation and the multimode generalized nonlinear Schrödinger equation, to provide a detailed comparison of ultrashort pulse propagation and possible conical emission in the context of multimode optical fibers. We confirm the strong impact of the frequency dispersion of the nonlinear response on pulse splitting and supercontinuum dynamics in the femtosecond regime for pumping powers around the critical self-focusing threshold. Our results also confirm that the modal distribution of optical fibers provides a discretization of the conical emission of the corresponding bulk medium (here, fused silica). This study also provides some criteria for the use of numerical models, and it paves the way for future nonlinear experiments in commercially available optical fibers.