21 research outputs found
Generated or Not Generated (GNG): The Importance of Background in the Detection of Fake Images
Facial biometrics are widely used to reliably and conveniently recognize people in photos, in videos, or from real-time webcam streams. It is therefore of fundamental importance to detect synthetic faces in images in order to reduce the vulnerability of biometrics-based security systems. Furthermore, manipulated images of faces can be intentionally shared on social media to spread fake news about the targeted individual. This paper shows that fake-face detection models may rely mainly on the information contained in the background when dealing with generated faces, which reduces their effectiveness. Specifically, a classifier is trained to separate fake images from real ones using their representation in a latent space. Subsequently, the faces are segmented, the background is removed, and the detection procedure is performed again, revealing a significant drop in classification accuracy. Finally, an explainability tool (SHAP) is used to highlight the salient areas of the image, showing that the background and face contours crucially influence the classifier's decision.
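The background-removal experiment described above can be sketched in a few lines. This is a minimal, illustrative example (not the paper's actual pipeline): it assumes a binary face mask is already available from a segmentation step and simply zeroes out every background pixel before the image is handed to the detector.

```python
import numpy as np

def remove_background(image, face_mask):
    """Zero out every pixel outside the segmented face region.

    image:     H x W x C array
    face_mask: H x W boolean array, True where the face is
    """
    out = image.copy()
    out[~face_mask] = 0  # broadcast over all channels
    return out

# Toy 4x4 RGB "image" of ones, with a 2x2 "face" in the centre.
img = np.ones((4, 4, 3))
mask = np.zeros((4, 4), dtype=bool)
mask[1:3, 1:3] = True
no_bg = remove_background(img, mask)
```

Running the detector on `img` and on `no_bg` and comparing accuracies is what exposes how much the classifier leans on the background.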
Facial Segmentation in Deepfake Classification: a Transfer Learning Approach
Artificial Intelligence (AI)-generated images represent a significant threat in various fields, such as security, privacy, media forensics, and content moderation. In this paper, a novel approach for the detection of StyleGAN2-generated human faces is presented, leveraging a transfer learning strategy to improve the classification performance of the models. A modified version of the state-of-the-art semantic segmentation model DeepLabV3+, using either a ResNet50 or a MobileNetV3 Large feature extraction backbone, is used to create both a face segmentation model and the synthetic image detector. To this end, the models are first trained for face segmentation as a multi-class classification task on a widely used semantic segmentation dataset, achieving remarkable results for both configurations. The pre-trained models are then retrained on a collection of real and generated images, gathered from different sources, to solve a binary classification task, namely detecting synthetic (i.e., generated) images, thereby carrying out two different transfer learning strategies. The results indicate that this targeted methodology significantly improves detection rates compared to analyzing the face as a whole, and underline the importance of advanced image recognition technologies when tackling the challenge of detecting generated faces.
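The core transfer-learning idea, pre-train a feature extractor on segmentation, freeze it, and train only a new binary head, can be illustrated with a deliberately simplified stand-in. Everything here is hypothetical: the frozen "backbone" is just a fixed random projection plus ReLU (standing in for DeepLabV3+ features), and the data are two synthetic Gaussian clusters playing the roles of real and generated images.

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in for a frozen, pre-trained backbone: a fixed random projection
# followed by a ReLU. Its weights are never updated during fine-tuning.
W_backbone = rng.normal(size=(8, 4))

def extract_features(x):
    return np.maximum(x @ W_backbone, 0.0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Toy "real" (label 1) vs "generated" (label 0) samples.
X = np.vstack([rng.normal(1.0, 0.5, (50, 8)),
               rng.normal(-1.0, 0.5, (50, 8))])
y = np.array([1] * 50 + [0] * 50)

# Transfer learning step: only the new binary head is trained.
F = extract_features(X)
w_head = np.zeros(4)
for _ in range(500):
    p = sigmoid(F @ w_head)
    w_head -= 0.1 * F.T @ (p - y) / len(y)  # logistic-loss gradient descent

accuracy = ((sigmoid(F @ w_head) > 0.5) == y).mean()
```

The point of the sketch is the separation of roles: the segmentation-derived features stay fixed while a lightweight classifier adapts them to the new real-vs-generated task.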
SARS-CoV-2 susceptibility and COVID-19 disease severity are associated with genetic variants affecting gene expression in a variety of tissues
Variability in SARS-CoV-2 susceptibility and COVID-19 disease severity between individuals is partly due to genetic factors. Here, we identify 4 genomic loci with suggestive associations for SARS-CoV-2 susceptibility and 19 for COVID-19 disease severity. Four of these 23 loci likely have an ethnicity-specific component. Genome-wide association study (GWAS) signals in 11 loci colocalize with expression quantitative trait loci (eQTLs) associated with the expression of 20 genes in 62 tissues/cell types (range: 1–43 tissues/gene), including lung, brain, heart, muscle, and skin, as well as the digestive and immune systems. We perform genetic fine mapping to compute 99% credible SNP sets, which identify 10 GWAS loci that have eight or fewer SNPs in the credible set, including three loci with a single likely causal SNP. Our study suggests that the diverse symptoms and disease severity of COVID-19 observed between individuals are associated with variants across the genome, affecting gene expression levels in a wide variety of tissue types.
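A 99% credible set, as used in the fine-mapping step above, is the smallest set of SNPs at a locus whose posterior probabilities of being causal sum to at least 0.99. A minimal sketch of that selection rule, assuming the posterior inclusion probabilities are already computed and normalised over the locus (the SNP ids and values below are made up):

```python
def credible_set(pips, level=0.99):
    """Smallest set of SNPs whose posterior probabilities sum to >= level.

    pips: dict mapping SNP id -> posterior probability of being causal,
          assumed to sum to 1 over the locus.
    """
    total, chosen = 0.0, []
    # Greedily take SNPs in decreasing order of posterior probability.
    for snp, p in sorted(pips.items(), key=lambda kv: -kv[1]):
        chosen.append(snp)
        total += p
        if total >= level:
            break
    return chosen

pips = {"rs1": 0.90, "rs2": 0.07, "rs3": 0.02, "rs4": 0.01}
cs = credible_set(pips)  # rs1 + rs2 + rs3 reach 0.99
```

A locus where a single SNP already carries >= 0.99 of the posterior mass yields a one-element credible set, the "single likely causal SNP" case reported above.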
Common, low-frequency, rare, and ultra-rare coding variants contribute to COVID-19 severity
The combined impact of common and rare exonic variants in COVID-19 host genetics is currently insufficiently understood. Here, common and rare variants from whole-exome sequencing data of about 4,000 SARS-CoV-2-positive individuals were used to define an interpretable machine-learning model for predicting COVID-19 severity. First, variants were converted into separate sets of Boolean features, depending on the absence or presence of variants in each gene. An ensemble of LASSO logistic regression models was used to identify the most informative Boolean features with respect to the genetic bases of severity. The Boolean features selected by these logistic models were combined into an Integrated PolyGenic Score that offers a synthetic and interpretable index for describing the contribution of host genetics to COVID-19 severity, as demonstrated through testing in several independent cohorts. The selected features belong to ultra-rare, rare, low-frequency, and common variants, including those in linkage disequilibrium with known GWAS loci. Notably, around one quarter of the selected genes are sex-specific. Pathway analysis of the selected genes associated with COVID-19 severity reflected the multi-organ nature of the disease. The proposed model might provide useful information for developing diagnostics and therapeutics, while also being able to guide bedside disease management.
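The first step above, turning per-sample variant calls into per-gene Boolean features, can be sketched as follows. This is an illustrative reduction, not the paper's code: gene names are placeholders, and each feature simply records whether a sample carries at least one qualifying variant in that gene (the real model builds separate feature sets per frequency class before the LASSO ensemble selects among them).

```python
def boolean_features(variants_per_sample, genes):
    """One Boolean feature per gene: does the sample carry any variant in it?

    variants_per_sample: list of sets, each set holding the gene symbols
                         in which that sample carries a variant.
    genes:               the gene panel defining the feature columns.
    """
    return [
        {g: (g in sample_variants) for g in genes}
        for sample_variants in variants_per_sample
    ]

# Three hypothetical samples over a two-gene panel.
samples = [{"TLR7", "ACE2"}, {"ACE2"}, set()]
feats = boolean_features(samples, ["TLR7", "ACE2"])
```

Encoding presence/absence per gene, rather than raw genotypes, is what keeps the downstream logistic models and the resulting Integrated PolyGenic Score interpretable at the gene level.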
Improving fake image detection through background analysis, facial segmentation, and model interpretability
The diffusion of powerful generative AI models, such as the StyleGAN family developed by NVIDIA, has paved the way for high-resolution images depicting synthetic human faces. The uncontrolled proliferation of these images poses a threat in any field where facial biometrics are involved, such as security systems, identity verification, and digital forensics. This thesis addresses these vulnerabilities, with particular focus on the importance of different areas of the image for the detection of synthetic content. To this aim, a state-of-the-art segmentation model is trained first to partition the image into various semantic areas, and then to distinguish real photos from artificially generated content. The semantic segmentation is used in two ways. First, it is applied to remove the background from the image, allowing the detection process to be performed separately on the original dataset and on the background-removed version. This approach demonstrates that the background significantly aids the classifier, as removing it results in a noticeable drop in performance. Second, the segmentation model is employed in a transfer learning framework, where the features learned during segmentation are leveraged to improve the detection of synthetic faces. Finally, this work addresses the explainability aspect by employing SHapley Additive exPlanations (SHAP) to analyze the decision-making process, demonstrating that on the original images the model tends to focus on the background and face contours as key features for distinguishing real from synthetic images, further confirming the role of the background in the decision process.
A Hybrid Deep Learning Approach for Liver Tumor Segmentation Using DeepLabV3+ and Hidden Markov Models
The liver, the largest solid organ in the human body, is one of the most important players in metabolism and digestion. It is also a site where primary tumors can originate and where secondary tumors can metastasize from distant organs such as the lungs or from other abdominal organs such as the pancreas and colon. The liver is therefore regularly screened for the presence of lesions. These lesions require precise segmentation techniques to accurately diagnose cancer and to support patient monitoring and the assessment of disease progression and response to treatment. In this paper, a slightly modified version of a DeepLabV3+ network, a well-known, state-of-the-art segmentation model, paired with a Hidden Markov Model (HMM)-based noise reduction module, is employed and trained on the Medical Segmentation Decathlon (MSD) liver tumor dataset. This collection of liver lesions is a fraction of the MSD international challenge dedicated to identifying a general-purpose algorithm for medical image segmentation. The model is then evaluated on the test set of the same dataset using pixel-wise accuracy and Intersection over Union (IoU).
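The two evaluation metrics mentioned above have simple definitions for binary (lesion vs. background) masks: pixel-wise accuracy is the fraction of matching pixels, and IoU is the overlap of the two masks divided by their union. A minimal sketch on a toy 2x2 mask:

```python
import numpy as np

def pixel_accuracy(pred, target):
    """Fraction of pixels whose predicted label matches the ground truth."""
    return (pred == target).mean()

def iou(pred, target):
    """Intersection over Union for a binary (lesion vs background) mask."""
    pred, target = pred.astype(bool), target.astype(bool)
    union = np.logical_or(pred, target).sum()
    if union == 0:
        return 1.0  # both masks empty: perfect agreement by convention
    return np.logical_and(pred, target).sum() / union

# Toy example: prediction marks two pixels, ground truth marks one of them.
pred = np.array([[1, 1], [0, 0]])
gt = np.array([[1, 0], [0, 0]])
```

Note that IoU is the stricter metric: a small lesion missed entirely barely moves pixel accuracy on a large scan, but drives IoU for that lesion to zero.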
