Qluster: An easy-to-implement generic workflow for robust clustering of health data
Exploring health data with clustering algorithms makes it possible to better describe populations of interest by identifying the sub-profiles that compose them. This reinforces medical knowledge, whether about a disease or a targeted population in real life. Nevertheless, in contrast to conventional biostatistical methods, for which numerous guidelines exist, the standardization of data science approaches in clinical research remains little discussed. This results in significant variability in the execution of data science projects, in terms of the algorithms used and the reliability and credibility of the designed approach. Favoring a parsimonious and judicious choice of both algorithms and implementations at each stage, this article proposes Qluster, a practical workflow for performing clustering tasks. The workflow strikes a compromise between (1) genericity of application (e.g. usable on small or big data, on continuous, categorical or mixed variables, on databases of high dimensionality or not), (2) ease of implementation (need for few packages, few algorithms, few parameters, ...), and (3) robustness (e.g. use of proven algorithms and robust packages, evaluation of the stability of clusters, management of noise and multicollinearity). This workflow can be easily automated and/or routinely applied to a wide range of clustering projects. It can be useful both for data scientists with little experience in the field, to make data clustering easier and more robust, and for more experienced data scientists who are looking for a straightforward and reliable solution to routinely perform preliminary data mining. A synthesis of the literature on data clustering, as well as the scientific rationale supporting the proposed workflow, is also provided. Finally, a detailed application of the workflow to a concrete use case is provided, along with a practical discussion for data scientists.
An implementation on the Dataiku platform is available upon request from the authors.
Deep Transfer Learning Applications in Intrusion Detection Systems: A Comprehensive Review
Globally, the external Internet is increasingly being connected to the
contemporary industrial control system. As a result, there is an immediate need
to protect the network from several threats. The key infrastructure of
industrial activity may be protected from harm by using an intrusion detection
system (IDS), a preventive measure mechanism, to recognize new kinds of
dangerous threats and hostile activities. The most recent artificial
intelligence (AI) techniques used to create IDS in many kinds of industrial
control networks are examined in this study, with a particular emphasis on
IDS-based deep transfer learning (DTL). The latter can be seen as a type of
information fusion that merges and/or adapts knowledge from multiple domains to
enhance the performance of the target task, particularly when the labeled data
in the target domain is scarce. Publications issued after 2015 were taken into
account. These selected publications were divided into three categories:
DTL-only and IDS-only are involved in the introduction and background, and
DTL-based IDS papers are involved in the core papers of this review.
Researchers will be able to have a better grasp of the current state of DTL
approaches used in IDS in many different types of networks by reading this
review paper. Other useful information, such as the datasets used, the sort of
DTL employed, the pre-trained network, IDS techniques, the evaluation metrics
including accuracy/F-score and false alarm rate (FAR), and the improvement
gained, is also covered. The algorithms and methods that are used in several
studies, or that illustrate deeply and clearly the principle of any DTL-based IDS
subcategory, are presented to the reader.
Machine Learning Research Trends in Africa: A 30 Years Overview with Bibliometric Analysis Review
In this paper, a critical bibliometric analysis study is conducted, coupled
with an extensive literature survey on recent developments and associated
applications in machine learning research with a perspective on Africa. The
presented bibliometric analysis study consists of 2761 machine learning-related
documents, of which 98% were articles with at least 482 citations published in
903 journals during the past 30 years. Furthermore, the collated documents were
retrieved from the Science Citation Index EXPANDED, comprising research
publications from 54 African countries between 1993 and 2021. The bibliometric
study shows the visualization of the current landscape and future trends in
machine learning research and its application to facilitate future
collaborative research and knowledge exchange among authors from different
research institutions scattered across the African continent
Modelling uncertainties for measurements of the H → γγ Channel with the ATLAS Detector at the LHC
The Higgs boson to diphoton (H → γγ) branching ratio is only 0.227 %, but this
final state has yielded some of the most precise measurements of the particle. As
measurements of the Higgs boson become increasingly precise, greater import is
placed on the factors that constitute the uncertainty. Reducing the effects of these
uncertainties requires an understanding of their causes. The research presented
in this thesis aims to illuminate how uncertainties on simulation modelling are
determined and proffers novel techniques in deriving them.
The upgrade of the FastCaloSim tool, used for simulating events in the ATLAS
calorimeter at a rate far exceeding that of the nominal detector simulation,
Geant4, is described. The integration of a method that allows the toolbox to emulate the
accordion geometry of the liquid argon calorimeters is detailed. This tool allows
for the production of larger samples while using significantly fewer computing
resources.
A measurement of the total Higgs boson production cross-section multiplied
by the diphoton branching ratio (σ × Bγγ) is presented, where this value was
determined to be (σ × Bγγ)obs = 127 ± 7 (stat.) ± 7 (syst.) fb, in agreement
with the Standard Model prediction. The signal and background shape modelling
is described, and the contribution of the background modelling uncertainty to the
total uncertainty ranges from 18–2.4 %, depending on the Higgs boson production
mechanism.
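As an illustrative aside, the two quoted uncertainties can be combined in quadrature to gauge the overall precision of the measurement. This is a minimal sketch that assumes the statistical and systematic components are uncorrelated, which the abstract does not state; the function name is hypothetical and the result is not a figure reported by the thesis.

```python
import math


def combine_in_quadrature(*uncertainties):
    """Combine independent uncertainties as the square root of the sum of squares."""
    return math.sqrt(sum(u * u for u in uncertainties))


# Values quoted in the abstract, in fb.
central = 127.0
stat = 7.0
syst = 7.0

# Assumption: stat. and syst. components are uncorrelated.
total = combine_in_quadrature(stat, syst)
relative = total / central

print(f"total uncertainty = {total:.1f} fb "
      f"({100 * relative:.1f} % of the central value)")
```

Under that assumption the combined uncertainty is roughly 10 fb, or a little under 8 % of the central value.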
A method for estimating the number of events in a Monte Carlo background
sample required to model the shape is detailed. It was found that the size of
the nominal γγ background events sample required a multiplicative increase by
a factor of 3.60 to adequately model the background with a confidence level of
68 %, or a factor of 7.20 for a confidence level of 95 %. Based on this estimate,
0.5 billion additional simulated events were produced, substantially reducing the
background modelling uncertainty.
A technique is detailed for emulating the effects of Monte Carlo event generator
differences using multivariate reweighting. The technique is used to estimate the
event generator uncertainty on the signal modelling of tHqb events, improving the
reliability of estimating the tHqb production cross-section. Then this multivariate
reweighting technique is used to estimate the generator modelling uncertainties
on background V γγ samples for the first time. The estimated uncertainties were
found to be covered by the currently assumed background modelling uncertainty
In vitro investigation of the effect of disulfiram on hypoxia induced NFκB, epithelial to mesenchymal transition and cancer stem cells in glioblastoma cell lines
A thesis submitted in partial fulfilment of the requirements of the University of Wolverhampton for the degree of Doctor of Philosophy.
Glioblastoma multiforme (GBM) is one of the most aggressive and lethal cancers with a poor prognosis. Advances in the treatment of GBM are limited due to several resistance mechanisms and limited drug delivery into the central nervous system (CNS) compartment by the blood-brain barrier (BBB) and by actions of the normal brain to counteract tumour-targeting medications. Hypoxia is common in malignant brain tumours such as GBM and plays a significant role in tumour pathobiology. It is widely accepted that hypoxia is a major driver of GBM malignancy. Although it has been confirmed that hypoxia induces GBM stem-like-cells (GSCs), which are highly invasive and resistant to all chemotherapeutic agents, the detailed molecular pathways linking hypoxia, GSC traits and chemoresistance remain obscure. Evidence shows that hypoxia induces cancer stem cell phenotypes via epithelial-to-mesenchymal transition (EMT), promoting therapeutic resistance in most cancers, including GBM.
This study demonstrated that spheroid-cultured GBM cells consist of a large population of hypoxic cells with CSC and EMT characteristics. GSCs were chemoresistant and displayed increased levels of HIFs and NFκB activity. Similarly, the hypoxia-cultured GBM cells manifested GSC traits, chemoresistance and invasiveness. These results suggest that hypoxia is responsible for GBM stemness, chemoresistance and invasiveness. GBM cells transfected with the nuclear factor kappa B-p65 (NFκB-p65) subunit exhibited CSC and EMT markers, indicating the essential role of NFκB in maintaining GSC phenotypes. The study also highlighted the significance of NFκB in driving chemoresistance and invasiveness, and the potential role of NFκB as the central regulator of hypoxia-induced stemness in GBM cells. The GSC population has the capacity for self-renewal, cancer initiation and development of secondary heterogeneous cancer. The very poor prognosis of GBM could largely be attributed to the existence of GSCs, which promote tumour propagation, maintenance, radio- and chemoresistance and local infiltration.
In this study, we used Disulfiram (DS), a drug used for more than 65 years in alcoholism clinics, in combination with copper (Cu) to target the NFκB pathway, reverse chemoresistance and block invasion in GSCs. The obtained results showed that DS/Cu is highly cytotoxic to GBM cells and completely eradicated the resistant CSC population at low dose levels in vitro. DS/Cu inhibited the migration and invasion of hypoxia-induced CSC- and EMT-like GBM cells at low nanomolar concentrations.
DS is an FDA-approved drug with low toxicity to normal tissues and can pass through the BBB. Further research may lead to the quick translation of DS into cancer clinics and provide new therapeutic options to improve treatment outcomes in GBM patients.
Predictive Maintenance of Critical Equipment for Floating Liquefied Natural Gas Liquefaction Process
Predictive Maintenance of Critical Equipment for Liquefied Natural Gas Liquefaction Process
Meeting global energy demand is a massive challenge, especially given the growing preference for sustainable and cleaner energy. Natural gas is viewed as a bridge fuel to renewable energy. LNG, as a processed form of natural gas, is the fastest-growing and cleanest form of fossil fuel. Recently, the unprecedented increase in LNG demand has pushed its exploration and processing offshore as Floating LNG (FLNG). Offshore topside gas processing and liquefaction have been identified as one of the great challenges of FLNG. Maintaining topside liquefaction process assets such as gas turbines is critical to the profitability, reliability and availability of the process facilities. Given the setbacks of the widely used reactive and preventive time-based maintenance approaches in meeting the optimal reliability and availability requirements of oil and gas operators, this thesis presents a framework driven by AI-based learning approaches for predictive maintenance. The framework aims to leverage the value of condition-based maintenance to minimise the failures and downtime of critical FLNG equipment (aeroderivative gas turbines).
In this study, gas turbine thermodynamics were introduced, as well as some factors affecting gas turbine modelling. Important considerations when modelling a gas turbine system, such as modelling objectives, modelling methods and modelling approaches, were investigated. These provide the basis and mathematical background for developing a simulated gas turbine model. The behaviour of a simple-cycle HDGT was simulated using thermodynamic laws and operational data based on the Rowen model. A Simulink model was created using experimental data based on Rowen’s model, with the aim of exploring the transient behaviour of an industrial gas turbine. The results show the capability of the Simulink model to capture the nonlinear dynamics of the gas turbine system, although its application to further condition monitoring studies is constrained by the lack of some relevant correlated features required by the model.
AI-based models were found to perform well in predicting gas turbine failures. These capabilities were investigated in this thesis and validated using experimental data obtained from a gas turbine engine facility. The dynamic behaviour of gas turbines changes when they are exposed to different varieties of fuel. Diagnostics-based AI models were developed to diagnose different gas turbine engine failures associated with exposure to various types of fuel. The Principal Component Analysis (PCA) technique was harnessed to reduce the dimensionality of the dataset and extract good features for the development of the diagnostics models.
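For readers unfamiliar with PCA-based dimensionality reduction, the idea can be sketched in a few lines of NumPy. This is a generic illustration under stated assumptions, not the thesis's implementation: the data are synthetic stand-ins for sensor channels, and the function name `pca_reduce` is hypothetical.

```python
import numpy as np


def pca_reduce(X, n_components):
    """Project the rows of X onto the top principal components (via SVD)."""
    X_centered = X - X.mean(axis=0)
    # Rows of Vt are the principal directions, ordered by decreasing singular value.
    _, S, Vt = np.linalg.svd(X_centered, full_matrices=False)
    explained_variance_ratio = (S ** 2) / np.sum(S ** 2)
    scores = X_centered @ Vt[:n_components].T
    return scores, explained_variance_ratio[:n_components]


# Synthetic stand-in for sensor data (an assumption, not the thesis dataset):
# 200 samples of 6 correlated channels driven by 2 latent factors plus small noise.
rng = np.random.default_rng(0)
latent = rng.normal(size=(200, 2))
mixing = rng.normal(size=(2, 6))
X = latent @ mixing + 0.05 * rng.normal(size=(200, 6))

scores, ratio = pca_reduce(X, n_components=2)
print(scores.shape)   # reduced feature matrix: one row per sample
print(ratio.sum())    # fraction of variance retained by the kept components
```

The reduced `scores` matrix would then serve as input features for a downstream diagnostic model.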
Signal processing-based techniques (time-domain, frequency-domain and time-frequency-domain) were also used as feature extraction tools; they added significantly more correlations to the dataset and influenced the prediction results obtained. Signal processing played a vital role in extracting good features for the diagnostic models when compared with PCA. The overall results obtained from both the PCA-based and signal processing-based models demonstrated the capabilities of neural network-based models in predicting gas turbine failures. Further, a deep learning-based LSTM model was developed, which extracts features from the time series dataset directly and hence does not require any feature extraction tool. The LSTM model achieved the highest performance and prediction accuracy compared with both the PCA-based and signal processing-based models.
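As a rough illustration of what time-domain and frequency-domain feature extraction can look like, the sketch below computes a few common features from a one-dimensional signal. The choice of features, the function name `extract_features` and the synthetic 50 Hz test signal are all assumptions made for illustration; the abstract does not specify which features the thesis used.

```python
import numpy as np


def extract_features(signal, sample_rate_hz):
    """Return a few simple time- and frequency-domain features of a 1-D signal."""
    signal = np.asarray(signal, dtype=float)
    centered = signal - signal.mean()

    # Time-domain features.
    rms = float(np.sqrt(np.mean(signal ** 2)))
    peak_to_peak = float(signal.max() - signal.min())

    # Frequency-domain feature: the frequency bin with the largest magnitude.
    spectrum = np.abs(np.fft.rfft(centered))
    freqs = np.fft.rfftfreq(signal.size, d=1.0 / sample_rate_hz)
    dominant_frequency_hz = float(freqs[np.argmax(spectrum)])

    return {
        "rms": rms,
        "peak_to_peak": peak_to_peak,
        "dominant_frequency_hz": dominant_frequency_hz,
    }


# Synthetic test signal (hypothetical): one second of a 50 Hz sine at 1 kHz sampling.
sample_rate_hz = 1000
t = np.arange(1000) / sample_rate_hz
synthetic = np.sin(2 * np.pi * 50 * t)

features = extract_features(synthetic, sample_rate_hz)
print(features)
```

Time-frequency features (for example from a short-time Fourier transform) would follow the same pattern, applied to successive windows of the signal.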
In summary, this thesis concludes that, despite some challenges preventing the gas turbine Simulink model from being fully integrated into gas turbine condition monitoring studies, data-driven models have shown strong potential and excellent performance in gas turbine CBM diagnostics. The models developed in this thesis can be used for design and manufacturing purposes on gas turbines applied to FLNG, especially for condition monitoring and fault detection of gas turbines. The results obtained would provide valuable understanding and helpful guidance for researchers and practitioners to implement robust predictive maintenance models that will enhance the reliability and availability of critical FLNG equipment. Petroleum Technology Development Funds (PTDF) Nigeri
Annals [...].
Pedometrics: innovation in tropics; Legacy data: how turn it useful?; Advances in soil sensing; Pedometric guidelines to systematic soil surveys. Online event. Coordinated by: Waldir de Carvalho Junior, Helena Saraiva Koenow Pinheiro, Ricardo Simão Diniz Dalmolin
A Molecular Approach to the Diagnosis, Assessment, Monitoring and Treatment of Pulmonary Non-Tuberculous Mycobacterial Disease
Introduction: Non-Tuberculous Mycobacteria (NTM) can cause disease of the lungs and sinuses, lymph nodes, joints and central nervous system as well as disseminated infections in immunocompromised individuals. Efforts to tackle NTM infections are hampered by a lack of reliable biomarkers for diagnosis, assessment of disease activity, and prognostication.
Aims: The broad aims of this thesis are:
1. to develop molecular assays capable of quantifying the 6 most common pathogenic mycobacteria (M. abscessus, M. avium, M. intracellulare, M. malmoense, M. kansasii, M. xenopi) and calculate comparative sensitivities and specificities for each assay.
2. to assess patients’ clinical course over 12 – 18 months by performing the developed molecular assays against DNA extracted from sputum from patients with NTM infection.
3. to assess dynamic bacterial changes of the lung microbiome in patients on treatment for NTM disease and those who are treatment naïve.
Methods: DNA was extracted from a total of 410 sputum samples obtained from 38 patients who were either:
• commencing treatment for either M. abscessus or Mycobacterium avium complex.
• considered colonised with M. abscessus or Mycobacterium avium complex (i.e. cultured NTM but were not deemed to have infection as they did not meet ATS or BTS criteria for disease).
• diagnosed with cystic fibrosis (CF) or non-CF bronchiectasis but had never cultured NTM.
For the development of quantitative molecular assays, NTM hsp65 gene sequences were aligned and interrogated for areas of variability. These variable regions enabled the creation of species-specific probes. In vitro sensitivity and specificity for each probe were determined by testing each probe against a panel of plasmids containing hsp65 gene inserts from different NTM species. Quantification accuracy was determined by using each assay against a mock community containing serial dilutions of target DNA.
Each sample was tested with the probes targeting: M. abscessus, M. avium and M. intracellulare producing a longitudinal assessment of NTM copy number during each patient’s clinical course.
In addition, a total of 64 samples from 16 patients underwent 16S rRNA gene sequencing to characterise longitudinal changes in the microbiome of both NTM disease and controls.
Results: In vitro sensitivity for the custom assays was 100% and specificity ranged from 91.6% to 100%. In terms of quantification accuracy, there was no significant difference between the measured results of each assay and the expected values when performed in singleplex. The assays were able to accurately determine NTM copy number to a theoretical limit of 10 copies/μl.
When used against samples derived from human sputum, with culture results as the gold standard, the sensitivity of the assay was 0.87 for M. abscessus and 0.86 for Mycobacterium avium complex (MAC). The specificity was 0.95 for M. abscessus and 0.62 for MAC. The negative predictive value was 0.98 for M. abscessus and 0.95 for MAC. This resulted in an AUC of 0.92 for M. abscessus and 0.74 for MAC.
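For reference, sensitivity, specificity and negative predictive value are all derived from the same two-by-two table of assay results against the reference standard (here, culture). The sketch below shows the standard formulas; the counts are hypothetical and chosen only to make the arithmetic easy to follow, and they are not the thesis's data.

```python
def diagnostic_metrics(tp, fp, tn, fn):
    """Standard diagnostic accuracy measures from a 2x2 confusion table.

    tp/fn: reference-positive samples the assay called positive/negative.
    tn/fp: reference-negative samples the assay called negative/positive.
    """
    return {
        "sensitivity": tp / (tp + fn),  # share of true positives detected
        "specificity": tn / (tn + fp),  # share of true negatives correctly ruled out
        "npv": tn / (tn + fn),          # probability a negative result is correct
        "ppv": tp / (tp + fp),          # probability a positive result is correct
    }


# Hypothetical counts for illustration only (not taken from the thesis).
metrics = diagnostic_metrics(tp=40, fp=10, tn=140, fn=10)
for name, value in metrics.items():
    print(f"{name}: {value:.2f}")
```

Note that predictive values depend on how common the condition is in the tested samples, whereas sensitivity and specificity do not, which is why a high negative predictive value can coexist with a modest specificity.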
Longitudinal analysis of the lung microbiome using 16S rRNA gene sequencing showed that bacterial burden initially decreases after initiation of antibiotic therapy but begins to return to normal levels over several months of antibiotic therapy. This effect is mirrored by changes in alpha diversity. The decrease in bacterial burden and loss of alpha diversity were found to be secondary to significant changes in specific genera such as Veillonella and Streptococcus. The abundance of other Proteobacteria such as Pseudomonas remains relatively constant.
Conclusion: The molecular assay has shown high in vitro sensitivity and specificity for the detection and accurate quantification of the 6 most commonly pathogenic NTM species. The assays successfully identified NTM DNA from human sputum samples.
A notable association between NTM copy number and the cessation of one or more antibiotics existed (i.e. when one antibiotic was stopped because of patient intolerance, NTM copy number increased, often having been unrecordable prior to this). The qPCR assays developed in this thesis provide an affordable, real time and rapid measurement of NTM burden allowing clinicians to act on problematic results sooner than currently possible.
There was no significant difference between the microbiome in bronchiectasis and cystic fibrosis nor was there a significant difference between the microbiome in patients requiring treatment for NTM and those who did not. Patients receiving treatment experienced an initial decrease in bacterial burden over the first weeks of treatment followed by a gradual increase towards baseline over the next weeks to months. This change was mirrored in measures of alpha diversity. Changes in abundance and diversity were accounted for by decreases in specific bacteria whilst the abundance of other bacteria increased, occupying the microbial niche created. These bacteria (for example Pseudomonas spp) are often associated with morbidity. Open Acces
Development of in-vitro in-silico technologies for modelling and analysis of haematological malignancies
Worldwide, haematological malignancies are responsible for roughly 6% of all the cancer-related deaths. Leukaemias are one of the most severe types of cancer, as only about 40% of the patients have an overall survival of 10 years or more. Myelodysplastic Syndrome (MDS), a pre-leukaemic condition, is a blood disorder characterized by the presence of dysplastic, irregular, immature cells, or blasts, in the peripheral blood (PB) and in the bone marrow (BM), as well as multi-lineage cytopenias.
We have created a detailed, lineage-specific, high-fidelity in-silico erythroid model that incorporates known biological stimuli (cytokines and hormones) and a competing diseased haematopoietic population, correctly capturing crucial biological checkpoints (EPO-dependent CFU-E differentiation) and replicating the in-vivo erythroid differentiation dynamics. In parallel, we have also proposed a long-term, cytokine-free 3D cell culture system for primary MDS cells, which was first optimized using easily-accessible healthy controls. This system enabled long-term (24-day) maintenance in culture with high (>75%) cell viability, promoting spontaneous expansion of erythroid phenotypes (CD71+/CD235a+) without the addition of any exogenous cytokines. Lastly, we have proposed a novel in-vitro in-silico framework using GC-MS metabolomics for the metabolic profiling of BM and PB plasma, aiming not only to discriminate between haematological conditions but also to sub-classify MDS patients, potentially based on candidate biomarkers. Unsupervised multivariate statistical analysis showed clear intra- and inter-disease separation of samples of 5 distinct haematological malignancies, demonstrating the potential of this approach for disease characterization.
The work herein presented paves the way for the development of in-vitro in-silico technologies to better characterize, diagnose, model and target haematological malignancies such as MDS and AML. Open Acces
Statistical Learning for Gene Expression Biomarker Detection in Neurodegenerative Diseases
In this work, statistical learning approaches are used to detect biomarkers for neurodegenerative diseases (NDs). NDs are becoming increasingly prevalent as populations age, making understanding of disease and identification of biomarkers progressively important for facilitating early diagnosis and the screening of individuals for clinical trials. Advancements in gene expression profiling have enabled the exploration of disease biomarkers at an unprecedented scale. The work presented here demonstrates the value of gene expression data in understanding the underlying processes and detection of biomarkers of NDs. The value of novel approaches to previously collected -omics data is shown and it is demonstrated that new therapeutic targets can be identified. Additionally, the importance of meta-analysis to improve the power of multiple small studies is demonstrated. The value of blood transcriptomics data is shown in applications to researching NDs to understand underlying processes using network analysis and a novel hub detection method. Finally, after demonstrating the value of blood gene expression data for investigating NDs, a combination of feature selection and classification algorithms was used to identify novel accurate biomarker signatures for the diagnosis and prognosis of Parkinson’s disease (PD) and Alzheimer’s disease (AD). Additionally, the use of feature pools based on previous knowledge of disease and the viability of neural networks in dimensionality reduction and biomarker detection are demonstrated and discussed. In summary, gene expression data is shown to be valuable for the investigation of ND and novel gene biomarker signatures for the diagnosis and prognosis of PD and AD