102 research outputs found
Using Model Explanations to Guide Deep Learning Models Towards Consistent Explanations for EHR Data
It has been shown that identical Deep Learning (DL) architectures will produce distinct explanations when trained with different hyperparameters that are orthogonal to the task (e.g. random seed, training set order). In domains such as healthcare and finance, where transparency and explainability are paramount, this can be a significant barrier to DL adoption. In this study we present a further analysis of explanation (in)consistency on 6 tabular datasets/tasks, with a focus on Electronic Health Records data. We propose a novel deep learning ensemble architecture that trains its sub-models to produce consistent explanations, improving explanation consistency by as much as 315% (e.g. from 0.02433 to 0.1011 on MIMIC-IV), and on average by 124% (e.g. from 0.12282 to 0.4450 on the BCW dataset). We evaluate the effectiveness of our proposed technique and discuss the implications our results have for both industrial applications of DL and explainability as well as future methodological work.
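The "as much as 315%" figure quoted above can be checked as a relative increase over the baseline consistency score. The sketch below is only an arithmetic illustration; the function name `relative_improvement` is hypothetical and does not come from the paper.

```python
def relative_improvement(before, after):
    """Percentage increase of `after` relative to `before`."""
    return (after - before) / before * 100.0

# MIMIC-IV consistency scores as quoted in the abstract above.
mimic_iv = relative_improvement(0.02433, 0.1011)
print(f"MIMIC-IV relative improvement: {mimic_iv:.1f}%")
```

The result falls between 315% and 316%, in line with the quoted maximum improvement.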
SMS spam filtering using probabilistic topic modelling and Stacked Denoising Autoencoder.
In this paper we present a novel approach to spam filtering and demonstrate its applicability with respect to SMS messages. Our approach requires minimal feature engineering and a small set of labelled data samples. Features are extracted using topic modelling based on latent Dirichlet allocation, and then a comprehensive data model is created using a Stacked Denoising Autoencoder (SDA). Topic modelling summarises the data, providing ease of use and high interpretability by visualising the topics using word clouds. Given that the SMS messages can be regarded as either spam (unwanted) or ham (wanted), the SDA is able to model the messages and accurately discriminate between the two classes without the need for a pre-labelled training set. The results are compared against the state-of-the-art spam detection algorithms, with our proposed approach achieving over 97% accuracy, which compares favourably to the best reported algorithms presented in the literature.
Disentangling Racial Phenotypes: Fine-Grained Control of Race-related Facial Phenotype Characteristics
Achieving an effective fine-grained appearance variation over 2D facial images, whilst preserving facial identity, is a challenging task due to the high complexity and entanglement of common 2D facial feature encoding spaces. Despite these challenges, such fine-grained control, by way of disentanglement, is a crucial enabler for data-driven racial bias mitigation strategies across multiple automated facial analysis tasks, as it allows human facial diversity to be analysed, characterised and synthesised. In this paper, we propose a novel GAN framework to enable fine-grained control over individual race-related phenotype attributes of the facial images. Our framework factors the latent (feature) space into elements that correspond to race-related facial phenotype representations, thereby separating phenotype aspects (e.g. skin, hair colour, nose, eye, mouth shapes), which are notoriously difficult to annotate robustly in real-world facial data. Concurrently, we also introduce a high quality augmented, diverse 2D face image dataset drawn from CelebA-HQ for GAN training. Unlike prior work, our framework only relies upon 2D imagery and related parameters to achieve state-of-the-art individual control over race-related phenotype attributes with improved photo-realistic output.
Geometric Particle Swarm Optimization for Multi-objective Optimization Using Decomposition
Multi-objective evolutionary algorithms (MOEAs) based on decomposition are aggregation-based algorithms which transform a multi-objective optimization problem (MOP) into several single-objective subproblems. Being effective, efficient, and easy to implement, Particle Swarm Optimization (PSO) has become one of the most popular single-objective optimizers for continuous problems, and recently it has been successfully extended to the multi-objective domain. However, no investigation on the application of PSO within a multi-objective decomposition framework exists in the context of combinatorial optimization. This is precisely the focus of the paper. More specifically, we study the incorporation of Geometric Particle Swarm Optimization (GPSO), a discrete generalization of PSO that has proven successful on a number of single-objective combinatorial problems, into a decomposition approach. We conduct experiments on many-objective 0/1 knapsack problems, i.e. problems with more than three objective functions, which are substantially harder than multi-objective problems with fewer objectives. The results indicate that the proposed multi-objective GPSO based on decomposition is able to outperform two versions of the well-known MOEA based on decomposition (MOEA/D) and the most recent version of the non-dominated sorting genetic algorithm (NSGA-III), which are state-of-the-art multi-objective evolutionary approaches based on decomposition.
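To illustrate what "transforming an MOP into several single-objective subproblems" means for a 0/1 knapsack, here is a minimal sketch. It assumes weighted-sum aggregation (the abstract does not say which scalarizing function the paper uses), and it replaces GPSO with brute-force enumeration so that it stays short. The instance data and all function names are hypothetical, and only two objectives are used for readability, whereas the paper studies more than three.

```python
from itertools import product

def objective_values(x, profits):
    """Profit of the 0/1 selection x under each objective."""
    return [sum(p * xi for p, xi in zip(row, x)) for row in profits]

def is_feasible(x, weights, capacity):
    """A selection is feasible if its total weight fits the knapsack."""
    return sum(w * xi for w, xi in zip(weights, x)) <= capacity

def weighted_sum(x, profits, lam):
    """Scalar value of x for the subproblem defined by weight vector lam."""
    return sum(l * f for l, f in zip(lam, objective_values(x, profits)))

def solve_subproblem(profits, weights, capacity, lam):
    """Brute-force stand-in for the optimizer (tiny instances only)."""
    n = len(weights)
    feasible = (list(x) for x in product((0, 1), repeat=n)
                if is_feasible(x, weights, capacity))
    return max(feasible, key=lambda x: weighted_sum(x, profits, lam))

# Hypothetical instance: 4 items, 2 objectives (maximisation).
profits = [[10, 2, 6, 1],   # objective 1
           [1, 9, 5, 8]]    # objective 2
weights = [4, 3, 3, 2]
capacity = 6

# Each weight vector defines one single-objective subproblem.
lambdas = [(1.0, 0.0), (0.5, 0.5), (0.0, 1.0)]
solutions = {lam: solve_subproblem(profits, weights, capacity, lam)
             for lam in lambdas}
for lam, x in solutions.items():
    print(lam, x, objective_values(x, profits))
```

Each weight vector yields a different trade-off solution on this instance, which is how a decomposition approach covers different parts of the Pareto front.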
Collaborative denoising autoencoder for high glycated haemoglobin prediction.
A pioneering study is presented demonstrating that the presence of high glycated haemoglobin (HbA1c) levels in a patient’s blood can be reliably predicted from routinely collected clinical data. This paves the way for performing early detection of Type-2 Diabetes Mellitus (T2DM). This will save healthcare providers a major cost associated with the administration and assessment of clinical tests for HbA1c. A novel collaborative denoising autoencoder framework is used to address this challenge. The framework builds an independent denoising autoencoder model for each of the high and low HbA1c levels, which extracts feature representations in the latent space. A baseline model using just three features (patient age together with triglycerides and glucose level) achieves 76% F1-score with an SVM classifier. The collaborative denoising autoencoder uses 78 features and can predict HbA1c level with 81% F1-score.
HIV and HPV infections and ocular surface squamous neoplasia: systematic review and meta-analysis.
BACKGROUND: The frequency of ocular surface squamous neoplasias (OSSNs) has been increasing in populations with a high prevalence of infection with human immunodeficiency virus/acquired immunodeficiency syndrome (HIV/AIDS) and infection with human papillomavirus (HPV). We aimed to quantify the association between HIV/AIDS and HPV infection and OSSN, through systematic review and meta-analysis. METHODS: The articles providing data on the association between HIV/AIDS and/or HPV infection and OSSN were identified in MEDLINE, SCOPUS and EMBASE searched up to May 2013, and through backward citation tracking. The DerSimonian and Laird method was used to compute summary relative risk (RR) estimates and 95% confidence intervals (95% CI). Heterogeneity was quantified with the I² statistic. RESULTS: HIV/AIDS was strongly associated with an increased risk of OSSN (summary RR=8.06, 95% CI: 5.29-12.30, I²=56.0%, 12 studies). The summary RR estimate for the infection with mucosal HPV subtypes was 3.13 (95% CI: 1.72-5.71, I²=45.6%, 16 studies). Four studies addressed the association between both cutaneous and mucosal HPV subtypes and OSSN; the summary RR estimates were 3.52 (95% CI: 1.23-10.08, I²=21.8%) and 1.08 (95% CI: 0.57-2.05, I²=0.0%), respectively. CONCLUSION: Human immunodeficiency virus infection increases the risk of OSSN by nearly eight-fold. Regarding HPV infection, only the cutaneous subtypes seem to be a risk factor.
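The pooling procedure named in the METHODS above can be sketched as follows. This is a minimal illustration of the standard DerSimonian and Laird random-effects estimator and the I² statistic, not the authors' analysis code. The study-level log relative risks and variances are hypothetical placeholders, and the function name is an assumption.

```python
import math

def dersimonian_laird(log_rr, variances):
    """Pool log relative risks with the DerSimonian-Laird random-effects method."""
    k = len(log_rr)
    w = [1.0 / v for v in variances]            # inverse-variance (fixed-effect) weights
    sw = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, log_rr)) / sw
    # Cochran's Q: weighted squared deviations from the fixed-effect estimate.
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, log_rr))
    df = k - 1
    c = sw - sum(wi ** 2 for wi in w) / sw
    tau2 = max(0.0, (q - df) / c)               # between-study variance, truncated at 0
    w_re = [1.0 / (v + tau2) for v in variances]  # random-effects weights
    pooled = sum(wi * yi for wi, yi in zip(w_re, log_rr)) / sum(w_re)
    se = math.sqrt(1.0 / sum(w_re))
    # I^2: share of total variation attributed to heterogeneity, in percent.
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0
    return {
        "rr": math.exp(pooled),
        "ci": (math.exp(pooled - 1.96 * se), math.exp(pooled + 1.96 * se)),
        "tau2": tau2,
        "i2": i2,
    }

# Hypothetical inputs: four studies, log(RR) and its variance for each.
log_rr = [math.log(5.0), math.log(9.0), math.log(12.0), math.log(6.5)]
variances = [0.10, 0.08, 0.20, 0.05]
result = dersimonian_laird(log_rr, variances)
print(result)
```

Pooling is done on the log scale and exponentiated at the end, which is why the resulting confidence intervals for RR are asymmetric, as in the intervals reported above.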
Large-scale proteomic identification of S100 proteins in breast cancer tissues
BACKGROUND: Attempts to reduce morbidity and mortality in breast cancer are based on efforts to identify novel biomarkers to support prognosis and therapeutic choices. The present study has focussed on S100 proteins as a potentially promising group of markers in cancer development and progression. One reason for interest in this family of proteins is that the majority of the S100 genes are clustered on a region of human chromosome 1q21 that is prone to genomic rearrangements. Moreover, there is increasing evidence that S100 proteins are often up-regulated in many cancers, including breast, and this is frequently associated with tumour progression. METHODS: Samples of breast cancer tissues were obtained during surgical intervention, according to the bioethical recommendations, and cryo-preserved until used. Tissue extracts were submitted to proteomic preparations for 2D-IPG. Protein identification was performed by N-terminal sequencing and/or peptide mass fingerprinting. RESULTS: The majority of the detected S100 proteins were absent, or present at very low levels, in the non-tumoral tissues adjacent to the primary tumor. This finding strengthens the role of S100 proteins as putative biomarkers. The proteomic screening of 100 cryo-preserved breast cancer tissues showed that some proteins were ubiquitously expressed in almost all patients while others appeared more sporadic. Most, if not all, of the detected S100 members appeared reciprocally correlated. Finally, from the perspective of biomarker establishment, a promising finding was the observation that patients who developed distant metastases after a three-year follow-up showed a general tendency of higher S100 protein expression, compared to the disease-free group. CONCLUSIONS: This article reports for the first time the comparative proteomic screening of several S100 protein members among a large group of breast cancer patients. The results obtained strongly support the hypothesis that a significant deregulation of multiple S100 protein members is associated with breast cancer progression, and suggest that these proteins might act as potential prognostic factors for patient stratification. We propose that this may offer a significant contribution to the knowledge and clinical applications of the S100 protein family to breast cancer.
S100A7-Downregulation Inhibits Epidermal Growth Factor-Induced Signaling in Breast Cancer Cells and Blocks Osteoclast Formation
S100A7 is a small calcium binding protein, which has been shown to be differentially expressed in psoriatic skin lesions, as well as in squamous cell tumors of the skin, lung and breast. Although its expression has been correlated to HER+ high-grade tumors and to a high risk of progression, the molecular mechanisms of these S100A7-mediated tumorigenic effects are not well known. Here, we showed for the first time that epidermal growth factor (EGF) induces S100A7 expression in both MCF-7 and MDA-MB-468 cell lines. We also observed a decrease in EGF-directed migration in shRNA-downregulated MDA-MB-468 cell lines. Furthermore, our signaling studies revealed that EGF induced simultaneous EGF receptor phosphorylation at Tyr1173 and HER2 phosphorylation at Tyr1248 in S100A7-downregulated cell lines as compared to the vector-transfected controls. In addition, reduced phosphorylation of Src at tyrosine 416 and p-SHP2 at tyrosine 542 was observed in these downregulated cell lines. Further studies revealed that S100A7-downregulated cells had reduced angiogenesis in vivo based on matrigel plug assays. Our results also showed decreased tumor-induced osteoclastic resorption in an intra-tibial bone injection model involving SCID mice. S100A7-downregulated cells had decreased osteoclast number and size as compared to the vector controls, and this decrease was associated with variations in IL-8 expression in in vitro cell cultures. This is a novel report on the role of S100A7 in EGF-induced signaling in breast cancer cells and in osteoclast formation
Improving the multiobjective evolutionary algorithm based on decomposition with new penalty schemes
It has been increasingly reported that the multiobjective optimization evolutionary algorithm based on decomposition (MOEA/D) is promising for handling multiobjective optimization problems (MOPs). MOEA/D employs scalarizing functions to convert an MOP into a number of single-objective subproblems. Among them, penalty boundary intersection (PBI) is one of the most popular decomposition approaches and has been widely adopted for dealing with MOPs. However, the original PBI uses a constant penalty value for all subproblems and has difficulties in achieving a good distribution and coverage of the Pareto front for some problems. In this paper, we investigate the influence of the penalty factor on PBI, and suggest two new penalty schemes, i.e., adaptive penalty scheme and subproblem-based penalty scheme (SPS), to enhance the spread of Pareto-optimal solutions. The new penalty schemes are examined on several complex MOPs, showing that PBI with the use of them is able to provide a better approximation of the Pareto front than the original one. The SPS is further integrated into two recently developed MOEA/D variants to help balance the population diversity and convergence. Experimental results show that it can significantly enhance the algorithm's performance. © 2016, Springer-Verlag Berlin Heidelberg
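For readers unfamiliar with PBI, the sketch below computes the standard PBI scalarizing value with a constant penalty factor, which is the "original" form the abstract refers to. It does not implement the adaptive or subproblem-based schemes proposed in the paper, whose details are not given here. The function name, the example points and the value theta = 5.0 are assumptions made for illustration.

```python
import math

def pbi(f, weight, ideal, theta=5.0):
    """Penalty boundary intersection value of objective vector f (minimisation)."""
    diff = [fi - zi for fi, zi in zip(f, ideal)]
    norm_w = math.sqrt(sum(w * w for w in weight))
    # d1: length of the projection of (f - ideal) onto the weight direction.
    d1 = abs(sum(d * w for d, w in zip(diff, weight))) / norm_w
    # d2: perpendicular distance from f to the line through ideal along weight.
    d2 = math.sqrt(sum((d - d1 * w / norm_w) ** 2
                       for d, w in zip(diff, weight)))
    return d1 + theta * d2

ideal = (0.0, 0.0)
weight = (1.0, 1.0)
on_line = pbi((1.0, 1.0), weight, ideal)    # lies on the weight direction, so d2 = 0
off_line = pbi((2.0, 0.0), weight, ideal)   # same d1, but penalised through d2
print(on_line, off_line)
```

Both points have the same projection length d1, so the difference between the two values comes entirely from the penalty term. A larger theta pushes solutions towards the weight direction, which is the trade-off between convergence and diversity that the penalty schemes above aim to control.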