    Failure mode prediction and energy forecasting of PV plants to assist dynamic maintenance tasks by ANN based models

    In the field of renewable energy, reliability analysis techniques that combine the operating time of a system with observations of its operational and environmental conditions are gaining importance. In this paper, reliability models are adapted to incorporate monitoring data on operating assets, as well as information on their environmental conditions, in their calculations. To that end, a logical decision tool based on two artificial neural network models is presented. This tool allows asset reliability analysis to be updated according to changes in operational and/or environmental conditions. The proposed tool could easily be automated within a supervisory control and data acquisition system, where reference values and the corresponding warnings and alarms could then be generated dynamically. Thanks to this capability, on-line diagnosis and/or prediction of potential asset degradation can be improved. The reliability models in the tool are developed according to the amount of failure data available and are used for early detection of degradation in energy production due to functional failures of power inverters and solar trackers. Another capability of the tool is to assess the economic risk associated with the system under existing conditions and for a given period of time. This information can then also be used to trigger preventive maintenance activities.

    Fault Prognostics Using Logical Analysis of Data and Non-Parametric Reliability Estimation Methods

    ABSTRACT: Estimating the remaining useful life (RUL) of a system working under different operating conditions is a major challenge for researchers in the condition-based maintenance (CBM) domain. The reason is that the relationship between the covariates that represent those operating conditions and the RUL is not fully understood in many practical cases, owing to the high degree of correlation between such covariates and their dependence on time. It is also difficult or even impossible for experts to acquire and accumulate knowledge about a complex system, where the failure of the system is regarded as the result of interaction and competition between several failure modes. This thesis presents systematic CBM prognostic methodologies based on a pattern-based machine learning and knowledge discovery approach called Logical Analysis of Data (LAD). The proposed methodologies comprise different implementations of the LAD approach combined with non-parametric reliability estimation methods.
The objective of these methodologies is to predict the RUL of the monitored system while considering the analysis of single or multiple failure modes. Three methodologies are presented: two deal with a single failure mode and one deals with multiple failure modes. The two methodologies for single-mode prognostics differ in the way they represent the data. The prognostic methodologies in this doctoral research have been tested and validated on a set of widely known tests. In these tests, the methodologies were compared with well-known prognostic techniques: the Cox proportional hazards model (PHM), artificial neural networks (ANNs) and support vector machines (SVMs). Two datasets were used to illustrate the performance of the three methodologies: the turbofan engine dataset available in the NASA prognostics data repository, and another dataset collected from a real industrial application. The results of these comparisons indicate that each of the proposed methodologies provides an accurate prediction of the RUL of the monitored system. This doctoral research concludes that the LAD approach has attractive merits and advantages that benefit the field of CBM prognostics. It is capable of dealing with CBM data that are correlated and time-varying. Another advantage is that it generates interpretable knowledge that is useful to maintenance personnel.
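    The abstract does not say which non-parametric reliability estimator is used. A common estimator of that kind is the Kaplan-Meier product-limit estimator, and the minimal sketch below assumes that choice purely for illustration. The function name `kaplan_meier` and the toy data are hypothetical and are not taken from the thesis.

```python
from collections import Counter


def kaplan_meier(durations, observed):
    """Kaplan-Meier survival estimate (illustrative sketch).

    durations: time at which each unit failed or was censored
    observed:  True if the unit failed at that time, False if censored
    Returns a list of (time, survival probability) at each failure time.
    """
    failures = Counter(t for t, e in zip(durations, observed) if e)
    exits = Counter(durations)      # failures and censorings per time
    at_risk = len(durations)        # units still under observation
    survival = 1.0
    curve = []
    for t in sorted(exits):
        d = failures.get(t, 0)
        if d:
            # Product-limit step: multiply by the fraction that survives t.
            survival *= 1.0 - d / at_risk
            curve.append((t, survival))
        # Everyone who failed or was censored at t leaves the risk set.
        at_risk -= exits[t]
    return curve


# Hypothetical data: five units, two of them censored.
curve = kaplan_meier([2, 3, 3, 5, 8], [True, True, False, True, False])
for t, s in curve:
    print(f"t={t}: S(t)={s:.2f}")
```

    Censored units reduce the risk set without lowering the survival estimate, which is what lets such an estimator use incomplete run-to-failure histories.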

    Recent advances in the theory and practice of logical analysis of data

    Logical Analysis of Data (LAD) is a data analysis methodology introduced by Peter L. Hammer in 1986. LAD distinguishes itself from other classification and machine learning methods in that it analyzes a significant subset of combinations of variables to describe the positive or negative nature of an observation, and uses combinatorial techniques to extract models defined in terms of patterns. In recent years, the methodology has advanced considerably through numerous theoretical developments and practical applications. In the present paper, we review the methodology and its recent advances; describe novel applications in engineering, finance, and health care, as well as algorithmic techniques for some stochastic optimization problems; and provide a comparative description of LAD and well-known classification methods.

    Characterization of vascular heterogeneity of astrocytomas grade 4 for supporting patient prognosis estimation, and treatment response assessment

    Brain tumors are among the most devastating diseases today because of the significant cognitive impairment suffered by patients, the high mortality rate, and the poor prognosis. Five-year survival for astrocytomas grade 4 is approximately 5% of diagnosed patients, making them the most aggressive and lethal tumors of the Central Nervous System (CNS). Astrocytomas grade 4 remain an unresolved, complex medical problem. Despite accounting for more than 60% of malignant brain tumors in adults, these tumors have a low relative prevalence and are considered an orphan disease, which hinders the development of new drugs or treatments that might benefit patients.
The aggressiveness of these tumors is due to several characteristics, such as strong angiogenesis, necrosis, vascular microproliferation, the capacity of the tumor cells to invade and infiltrate, and a particular immune microenvironment. In addition, because astrocytomas grade 4 progress rapidly, different specific regions that change over time coexist in the lesion area. This complex nature, along with the marked interpatient, intratumor, and longitudinal heterogeneity, complicates the success of any single treatment for all patients. Magnetic Resonance Imaging (MRI) is a useful technique for characterizing tumor morphology and vascularity. Advanced and robust methods for analyzing MR images collected in the initial stages of patient management allow the different regions of astrocytomas grade 4 to be delineated, making them useful tools for researchers, radiologists and neurosurgeons. In addition, the calculation of vascular imaging biomarkers, such as those proposed in this thesis, would facilitate tumor characterization, prognosis estimation and more personalized treatment approaches. This thesis proposes four fundamental pillars to advance the management of astrocytomas grade 4: I) the multilevel characterization of the tumor to improve classifications of high-grade CNS gliomas; II) the search for and development of robust biomarkers for estimating patient prognosis from the presurgical moment; III) the use of such biomarkers to evaluate the response to treatments and to select patients who may benefit from specific therapies; and IV) the design and implementation of clinical studies and protocols for long-term data collection from notable international cohorts of patients. To address these four pillars, an interdisciplinary approach has been used that combines medical imaging analysis, advanced artificial intelligence techniques, and molecular, histopathological, and clinical variables.
In conclusion, we have addressed the influence of both interpatient and intratumor heterogeneity of astrocytoma grade 4 on tumor characterization and classification, patient prognosis estimation, and the prediction of treatment responses. In addition, different clinical studies have been designed and implemented that allow the collection of multilevel data from international cohorts of patients with astrocytoma grade 4.
    Álvarez Torres, MDM. (2022). Characterization of vascular heterogeneity of astrocytomas grade 4 for supporting patient prognosis estimation, and treatment response assessment [Doctoral thesis]. Universitat Politècnica de València. https://doi.org/10.4995/Thesis/10251/18895

    Analytics of Sequential Time Data from Physical Assets

    ABSTRACT: Data analysis has become a necessity for industry. Relying on inherited expertise alone has become insufficient, expensive, not easily transferable, and mostly unavailable for new industries and facilities. Data analysis can give decision-makers more insight into how to manage their production, maintenance and personnel. Data collection requires the acquisition and storage of observational information about the state of the different production assets. Data are usually collected over time, which results in time series of observations. The analyses that are possible depend on the type of data records available. Data labeled using previous human experience, in terms of identifiable faults or fatigue, can be used to build models that perform the expert's task in the future by means of supervised learning. If no human labeling is available, data analysis can instead provide insights about similar observations, or visualize these similarities, through unsupervised learning. Both types of analysis are challenging. The challenges are two-fold: the first originates from the data and their adequacy, and the second is the selection of the type of analysis, which is a decision made by the analyst. Data challenges are due to the substantial number of unknown sources of variation inherent in the collected data, which may sometimes include human errors. Deciding on the type of modelling is another issue, as each model has its own assumptions, parameters to tune, and limitations.
This thesis proposes four new types of time-series analysis, two of which are supervised and require the data to be labelled with events such as failure, and two of which are unsupervised and require no such labelling. These analysis techniques are tested and applied on various industrial applications, namely road maintenance, bearing outer race failure detection, cutting tool failure prediction, and turbo engine failure prediction. The techniques aim to minimize the burden of choice placed on the analyst working with industrial data by providing reliable analysis tools that require fewer choices. This in turn allows different industries to make use of their data easily without requiring much expertise. For prognostic purposes, a proposed modification to the binary Logical Analysis of Data (LAD) classifier is used to adaptively stratify survival curves into long-survivor and short-life sets. This model requires no parameters to be chosen and relies completely on empirical estimations. The proposed Logical Analysis of Survival Curves (LASC) shows a 27% improvement in prediction accuracy, in terms of mean absolute error, over the results obtained by well-known machine learning techniques. The other prognostic model is a new bidirectional Long Short-Term Memory (LSTM) neural network termed the Bidirectional Handshaking LSTM (BHLSTM). This model makes better use of short sequences by making a round pass through the given data. Moreover, the network is trained using a new safety-oriented objective function, which forces the network to make safer predictions. Finally, since LSTM is a supervised technique, a novel approach for generating the target Remaining Useful Life (RUL) is proposed that requires fewer assumptions than previous approaches. The proposed network architecture shows an average decrease of 18.75% in the mean absolute error of predictions on the NASA turbo engine dataset.
For unsupervised diagnostic purposes, a new technique for interpretable clustering is proposed, named Interpretable Clustering for Rule Extraction and Anomaly Detection (IC-READ). Interpretable means that the resulting clusters are formulated using simple conditional logic. This is very important when providing the results to non-specialists, especially those in management, and it eases any hardware implementation if required. The proposed technique is also non-parametric, which means that no tuning is required, and it shows an average improvement of 20% in cluster purity over other clustering techniques applied to 11 benchmark datasets. The technique can also use the resulting clusters to build an anomaly detector. The last proposed technique is a whole time-series clustering approach for multivariate, variable-length series that uses a modified Dynamic Time Warping (DTW) distance. The modified DTW gives higher matches for time series that have similar trends and magnitudes, rather than focusing on either property alone. This technique is also non-parametric and uses hierarchical clustering to group time series in an unsupervised fashion. This can be particularly useful for management when deciding on maintenance scheduling. It is also shown that it can be used along with Kernel Principal Components Analysis (KPCA) to visualize variable-length sequences in two-dimensional plots. In cases where there is a lot of variation within certain classes, the unsupervised techniques can ease the supervised learning task by breaking it into smaller problems of the same nature.
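    The modified DTW distance proposed in the thesis is not described in the abstract, so it cannot be reproduced here. As background, the following is a minimal sketch of the classic, unmodified DTW distance for univariate sequences of different lengths. The function name `dtw_distance`, the absolute-difference local cost, and the sample sequences are assumptions made for illustration.

```python
import math


def dtw_distance(a, b):
    """Classic dynamic time warping distance between two numeric sequences.

    Illustrative sketch only: univariate input, absolute difference as the
    local cost, no warping window. This is NOT the modified DTW of the thesis.
    """
    n, m = len(a), len(b)
    # cost[i][j] = minimal cumulative cost of aligning a[:i] with b[:j]
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            cost[i][j] = local + min(
                cost[i - 1][j],      # a[i-1] aligned with an already used b[j-1]
                cost[i][j - 1],      # b[j-1] aligned with an already used a[i-1]
                cost[i - 1][j - 1],  # both sequences advance together
            )
    return cost[n][m]


# Sequences of different lengths with the same shape align at zero cost.
print(dtw_distance([1, 2, 3], [1, 2, 2, 3]))
# A constant offset between the sequences is penalized.
print(dtw_distance([0, 0], [1, 1]))
```

    A matrix of such pairwise distances is the kind of input that a hierarchical clustering routine can group without requiring sequences of equal length.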

    Integration of serum metabolomics into clinical assessment to improve outcome prediction of metastatic soft tissue sarcoma patients treated with trabectedin

    Soft tissue sarcomas (STS) are a group of rare and heterogeneous cancers with few diagnostic or prognostic biomarkers. This metabolomics study aimed to identify new serum prognostic biomarkers to improve the prediction of overall survival in patients with metastatic STS. The study enrolled 24 patients treated with the same trabectedin regimen. The baseline serum metabolomics profile, targeting 68 metabolites in the amino acid and bile acid pathways, was quantified by liquid chromatography-tandem mass spectrometry. Correlations between individual metabolomics profiles and overall survival were examined, and a risk model to predict survival was built by Cox multivariate regression. The median overall survival of the studied patients was 13.0 months (95% CI, 5.6–23.5). Among all the metabolites investigated, only citrulline and histidine correlated significantly with overall survival. The best Cox risk prediction model, obtained by integrating metabolomics and clinical data, included citrulline, hemoglobin and patients’ performance status score. It separated patients into a high-risk group with a low median overall survival of 2.1 months and a low-to-moderate-risk group with a median overall survival of 19.1 months (p < 0.0001). The results of this translational metabolomics study indicate that citrulline, an amino acid involved in arginine metabolism, represents an important metabolic signature that may help explain the high inter-patient variability in overall survival among STS patients. The risk prediction model based on baseline serum citrulline, hemoglobin and performance status may represent a new prognostic tool for the early classification of patients with metastatic STS according to their expected overall survival.

    Measuring uncertainty in economic evaluations: A case study in liver transplantation

    This thesis was submitted for the degree of Doctor of Philosophy and awarded by Brunel University, 29/11/2006. It is important to account for all sources of uncertainty when evaluating the clinical or cost-effectiveness of health care technologies. Therefore, this thesis takes as its basis a cost-effectiveness study in liver transplantation and identifies two previously unexplored issues that can arise in clinical and cost-effectiveness studies. A literature review of studies evaluating the effectiveness, costs or cost-effectiveness of solid organ transplantation confirmed that these issues were important and relevant to other transplantation studies. The first issue concerns the selection of an appropriate method for estimating mean study costs in the presence of incomplete (censored) data. Twelve techniques were identified and their accuracy was compared across artificially created mechanisms and levels of censoring. Lin's method with known cost histories and short interval lengths is recommended for accurately estimating mean costs and their uncertainty. It is assumed that these findings are generalisable to any solid organ transplant study where censoring is an issue. The second issue explored in this thesis relates to methods for measuring uncertainty around survival, HRQL and cost estimates derived from prognostic models in the absence of observed data. Probabilistic sensitivity analysis is recommended for measuring prognostic model parameter uncertainty and estimating individual patient outcomes and their uncertainties, as it is able to incorporate the additional uncertainty from using prognostic models to estimate control group outcomes. This thesis shows the quantitative importance of these issues and the methodological guidance offered should enable decision makers to have more confidence in clinical and cost-effectiveness estimates. 
Providing decision makers with a fuller estimate of the uncertainty around clinical and cost-effectiveness estimates will aid them in decisions about the necessity of conducting further research into the clinical or cost-effectiveness of health care technologies. Department of Healt

    Proteome-wide analysis of protein abundance and turnover remodelling during oncogenic transformation of human breast epithelial cells

    Background: Viral oncogenes and mutated proto-oncogenes are potent drivers of cancer malignancy. Downstream of the oncogenic trigger are alterations in protein properties that give rise to cellular transformation and the acquisition of malignant cellular phenotypes. Developments in mass spectrometry enable large-scale, multidimensional characterisation of proteomes. Such techniques could provide an unprecedented, unbiased view of how oncogene activation remodels a human cell proteome. Methods: Using quantitative MS-based proteomics and cellular assays, we analysed how transformation induced by activating v-Src kinase remodels the proteome and cellular phenotypes of breast epithelial (MCF10A) cells. SILAC MS was used to comprehensively characterise the MCF10A proteome and to measure v-Src-induced changes in protein abundance across seven time-points (1-72 hrs). We used pulse-SILAC MS (Boisvert et al., 2012) to compare protein synthesis and turnover in control and transformed cells. Follow-on experiments employed a combination of cellular and functional assays to characterise the roles of selected Src-responsive proteins. Results: Src-induced transformation changed the expression and/or turnover levels of ~3% of proteins, affecting ~1.5% of the total protein molecules in the cell. Transformation increased the average rate of proteome turnover and disrupted protein homeostasis. We identify distinct classes of protein kinetics in response to Src activation. We demonstrate that members of the polycomb repressive complex 1 (PRC1) are important regulators of invasion and migration in MCF10A cells. Many Src-regulated proteins are present in low abundance and some are regulated post-transcriptionally. The signature of Src-responsive proteins is highly predictive of poor patient survival across multiple cancer types. Open access to search and interactively explore all these proteomic data is provided via the EPD database (www.peptracker.com/epd). 
Conclusions: We present the first comprehensive analysis measuring how protein expression and protein turnover are affected by cell transformation, providing a detailed picture, at the protein level, of the consequences of oncogene activation.

    A Single-Cell Taxonomy Predicts Inflammatory Niche Remodeling to Drive Tissue Failure and Outcome in Human AML

    Cancer initiation is orchestrated by an interplay between tumor-initiating cells and their stromal/immune environment. Here, by adapted single-cell RNA sequencing, we decipher the predicted signaling between tissue-resident hematopoietic stem/progenitor cells (HSPC) and their neoplastic counterparts with their native niches in the human bone marrow. LEPR+ stromal cells are identified as central regulators of hematopoiesis through predicted interactions with all cells in the marrow. Inflammatory niche remodeling and the resulting deprivation of critical HSPC regulatory factors are predicted to repress high-output hematopoietic stem cell subsets in NPM1-mutated acute myeloid leukemia (AML), with relative resistance of clonal cells. Stromal gene signatures reflective of niche remodeling are associated with reduced relapse rates and favorable outcomes after chemotherapy across all genetic risk categories. Elucidation of the intercellular signaling defining human AML, thus, predicts that inflammatory remodeling of stem cell niches drives tissue repression and clonal selection but may pose a vulnerability for relapse-initiating cells in the context of chemotherapeutic treatment.