1,968 research outputs found

    A Review

    Ovarian cancer is the most common cause of death among gynecological malignancies. We discuss the different types of clinical and nonclinical features that are used to study and analyze the differences between benign and malignant ovarian tumors. Computer-aided diagnostic (CAD) systems of high accuracy are being developed as an initial test for ovarian tumor classification instead of biopsy, which is the current gold standard diagnostic test. We also discuss different aspects of developing a reliable CAD system for the automated classification of ovarian tumors into benign and malignant types. A brief description of the classifiers commonly used in ultrasound-based CAD systems is also given.

    Computational Tools for the Untargeted Assignment of FT-MS Metabolomics Datasets

    Metabolomics is the study of metabolomes, the sets of metabolites observed in living systems. Metabolism interconverts these metabolites to provide the molecules and energy necessary for life processes. Many disease processes, including cancer, have a significant metabolic component that manifests as differences in which metabolites are present and in what quantities they are produced and utilized. Thus, using metabolomics, differences between metabolomes in disease and non-disease states can be detected, and these differences improve our understanding of disease processes at the molecular level. Despite the potential benefits of metabolomics, the comprehensive investigation of metabolomes remains difficult. A popular analytical technique for metabolomics is mass spectrometry. Advances in Fourier transform mass spectrometry (FT-MS) instrumentation have yielded simultaneous improvements in mass resolution, mass accuracy, and detection sensitivity. In the metabolomics field, these advantages permit more complicated but more informative experimental designs, such as the use of multiple isotope-labeled precursors in stable isotope-resolved metabolomics (SIRM) experiments. However, despite these potential applications, several outstanding problems hamper the use of FT-MS for metabolomics studies. First, artifacts and data quality problems in FT-MS spectra can confound downstream data analyses, confuse machine learning models, and complicate the robust detection and assignment of metabolite features. Second, the assignment of observed spectral features to metabolites remains difficult. Existing targeted approaches for assignment often employ databases of known metabolites; however, metabolite databases are incomplete, thus limiting or biasing assignment results. Additionally, FT-MS provides limited structural information for observed metabolites, which complicates the determination of metabolite class (e.g. lipid, sugar, etc.) for observed metabolite spectral features, a necessary step for many metabolomics experiments.
To address these problems, a set of tools was developed. The first tool identifies artifacts with high peak density observed in many FT-MS spectra and removes them safely. Using this tool, two previously unreported types of high peak density artifact were identified in FT-MS spectra: fuzzy sites and partial ringing. Fuzzy sites were particularly problematic as they confused and reduced the accuracy of machine learning models trained on datasets containing these artifacts. Second, a tool called SMIRFE was developed to assign isotope-resolved molecular formulas to observed spectral features in an untargeted manner without a database of expected metabolites. This new untargeted method was validated on a gold-standard dataset containing both unlabeled and 15N-labeled compounds and was able to identify 18 of 18 expected spectral features. Third, a collection of machine learning models was constructed to predict whether a molecular formula corresponds to one or more lipid categories. These models accurately predict the correct one of eight lipid categories on our training dataset of known lipid and non-lipid molecular formulas, with precisions and accuracies over 90% for most categories. These models were used to predict lipid categories for untargeted SMIRFE-derived assignments in a non-small cell lung cancer dataset. Subsequent differential abundance analysis revealed a sub-population of non-small cell lung cancer samples with a significantly increased abundance of sterol lipids. This finding implies a possible therapeutic role of statins in the treatment and/or prevention of non-small cell lung cancer. Collectively, these tools represent a pipeline for FT-MS metabolomics datasets that is compatible with isotope labeling experiments. With these tools, more robust and untargeted metabolic analyses of disease will be possible.
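    The idea of assigning molecular formulas without a metabolite database can be illustrated with a brute-force search over element counts. The sketch below is not SMIRFE and ignores isotope labeling and charge; the function names, the CHNO element set, the count limits, and the 1 ppm tolerance are all assumptions chosen for illustration.

```python
from itertools import product

# Monoisotopic masses (daltons) of the most abundant isotope of each element.
ELEMENTS = ("C", "H", "N", "O")
MASSES = (12.0, 1.00782503207, 14.0030740048, 15.99491461956)


def monoisotopic_mass(counts):
    """Neutral monoisotopic mass for a tuple of (C, H, N, O) counts."""
    return sum(n * m for n, m in zip(counts, MASSES))


def format_formula(counts):
    """Render counts such as (6, 12, 0, 6) as 'C6H12O6'."""
    return "".join(
        f"{el}{n if n > 1 else ''}" for el, n in zip(ELEMENTS, counts) if n > 0
    )


def candidate_formulas(neutral_mass, ppm=1.0, max_counts=(10, 20, 4, 8)):
    """List every CHNO formula within `ppm` of the observed neutral mass.

    Hypothetical helper: exhaustive search, no chemical plausibility rules.
    """
    tolerance = neutral_mass * ppm * 1e-6
    hits = []
    for counts in product(*(range(limit + 1) for limit in max_counts)):
        mass = monoisotopic_mass(counts)
        if abs(mass - neutral_mass) <= tolerance:
            hits.append((format_formula(counts), mass))
    return hits


if __name__ == "__main__":
    observed = monoisotopic_mass((6, 12, 0, 6))  # a hexose such as glucose
    for formula, mass in candidate_formulas(observed):
        print(formula, round(mass, 5))
```

    A real untargeted assignment tool would also have to handle adducts, isotopologues, and instrument artifacts, which this sketch leaves out.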

    On the development of intelligent medical systems for pre-operative anaesthesia assessment

    This thesis describes the research and development of a decision support tool for determining a medical patient's suitability for surgical anaesthesia. At present, there is a change in the way that patients are clinically assessed prior to surgery. The pre-operative assessment, usually conducted by a qualified anaesthetist, is being more frequently performed by nursing grade staff. The pre-operative assessment exists to minimise the risk of surgical complications for the patient. Nursing grade staff are often not as experienced as qualified anaesthetists, and thus are not as well suited to the role of performing the pre-operative assessment. This research project used data collected during pre-operative assessments to develop a decision support tool that would assist the nurse (or anaesthetist) in determining whether a patient is suitable for surgical anaesthesia. The three main objectives are: firstly, to research and develop an automated intelligent systems technique for classifying heart and lung sounds and hence identifying cardio-respiratory pathology; secondly, to research and develop an automated intelligent systems technique for assessing the patient's blood oxygen level and pulse waveform; and finally, to develop a decision support tool that would combine the assessments above in forming a decision as to whether the patient is suitable for surgical anaesthesia. Clinical data were collected from hospital outpatient departments and recorded alongside the diagnoses made by a qualified anaesthetist. Heart and lung sounds were collected using an electronic stethoscope. Using these data, two ensembles of artificial neural networks were trained to classify the different heart and lung sounds into different pathology groups. Classification accuracies of up to 99.77% for the heart sounds and 100% for the lung sounds have been obtained. Oxygen saturation and pulse waveform measurements were recorded using a pulse oximeter.
Using these data, an artificial neural network was trained to discriminate between normal and abnormal pulse waveforms. A discrimination accuracy of 98% has been obtained from the system. A fuzzy inference system was generated to classify the patient's blood oxygen level as being either an inhibiting or non-inhibiting factor in their suitability for surgical anaesthesia. When tested, the system correctly classified 100% of the test dataset. A decision support tool, applying the genetic programming evolutionary technique to a fuzzy classification system, was created. The decision support tool combined the results from the heart sound, lung sound and pulse oximetry classifiers in determining whether a patient was suitable for surgical anaesthesia. The evolved fuzzy system attained a classification accuracy of 91.79%. The principal conclusion from this thesis is that intelligent systems, such as artificial neural networks, genetic programming, and fuzzy inference systems, can be successfully applied to the creation of medical decision support tools. EThOS - Electronic Theses Online Service; Medicdirect.co.uk Ltd.; GB; United Kingdom
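    The fuzzy classification of blood oxygen level described above can be pictured with a single membership function. This is a minimal sketch, not the thesis's inference system: the function names and the 90% and 95% breakpoints are hypothetical placeholders with no clinical authority.

```python
def ramp_down(x, full, zero):
    """Membership that is 1.0 at or below `full` and falls linearly to 0.0 at `zero`."""
    if x <= full:
        return 1.0
    if x >= zero:
        return 0.0
    return (zero - x) / (zero - full)


def classify_spo2(spo2_percent, low_full=90.0, low_zero=95.0):
    """Label an oxygen saturation reading and return its 'inhibiting' membership.

    The breakpoints are illustrative assumptions, not clinical thresholds.
    """
    inhibiting = ramp_down(spo2_percent, low_full, low_zero)
    non_inhibiting = 1.0 - inhibiting
    label = "inhibiting" if inhibiting > non_inhibiting else "non-inhibiting"
    return label, inhibiting


if __name__ == "__main__":
    for reading in (88.0, 92.5, 98.0):
        print(reading, classify_spo2(reading))
```

    Unlike a hard threshold, the graded membership lets a downstream rule base weigh a borderline reading against the other classifier outputs.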

    New Statistical Algorithms for the Analysis of Mass Spectrometry Time-Of-Flight Mass Data with Applications in Clinical Diagnostics

    Mass spectrometry (MS) based techniques have emerged as a standard for large-scale protein analysis. Ongoing progress in more sensitive machines and improved data analysis algorithms has led to a constant expansion of its fields of application. Recently, MS was introduced into clinical proteomics with the prospect of early disease detection using proteomic pattern matching. Analyzing biological samples (e.g. blood) by mass spectrometry generates mass spectra that represent the components (molecules) contained in a sample as masses and their respective relative concentrations. In this work, we are interested in those components that are constant within a group of individuals but differ markedly between individuals of two distinct groups. These distinguishing components, which depend on a particular medical condition, are generally called biomarkers. Since not all biomarkers found by the algorithms are of equal (discriminating) quality, we are only interested in a small biomarker subset that - as a combination - can be used as a fingerprint for a disease. Once a fingerprint for a particular disease (or medical condition) is identified, it can be used in clinical diagnostics to classify unknown spectra. In this thesis we have developed new algorithms for the automatic extraction of disease-specific fingerprints from mass spectrometry data. Special emphasis has been put on designing highly sensitive methods with respect to signal detection. Thanks to our statistically based approach, our methods are able to detect signals, such as those of hormones, even below the noise level inherent in data acquired by common MS machines. To provide access to these new classes of algorithms to collaborating groups, we have created a web-based analysis platform that provides all necessary interfaces for data transfer, data analysis and result inspection. To prove the platform's practical relevance, it has been utilized in several clinical studies, two of which are presented in this thesis.
In these studies it could be shown that our platform is superior to commercial systems with respect to fingerprint identification. As an outcome of these studies, several fingerprints for different cancer types (bladder, kidney, testicle, pancreas, colon and thyroid) have been detected and validated. The clinical partners in fact emphasize that these results would be impossible with a less sensitive analysis tool (such as the currently available systems). In addition to the issue of reliably finding and handling signals in noise, we faced the problem of handling very large amounts of data, since an average dataset of an individual is about 2.5 gigabytes in size and we have data from hundreds to thousands of persons. To cope with these large datasets, we developed a new framework for a heterogeneous (quasi) ad-hoc Grid - an infrastructure that allows the integration of thousands of computing resources (e.g. desktop computers, computing clusters or specialized hardware, such as IBM's Cell Processor in a PlayStation 3).
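    The notion of a biomarker as a component that is stable within a group but differs between groups can be made concrete with a two-sample statistic. The sketch below ranks peaks by the absolute Welch t-statistic; it is a generic illustration, not the thesis's algorithm, and the function names, peak labels and intensities are invented.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(a, b):
    """Welch's t-statistic for two samples.

    Each sample needs at least two values, and the two sample variances
    must not both be zero.
    """
    return (mean(a) - mean(b)) / sqrt(variance(a) / len(a) + variance(b) / len(b))


def rank_peaks(group_a, group_b):
    """Order peak labels from most to least discriminating.

    Both arguments map a peak label to a list of intensities, one per individual.
    """
    scores = {
        peak: abs(welch_t(group_a[peak], group_b[peak])) for peak in group_a
    }
    return sorted(scores, key=scores.get, reverse=True)


if __name__ == "__main__":
    # Invented intensities for two hypothetical peaks.
    healthy = {"m/z 1000": [1.0, 1.1, 0.9], "m/z 2000": [5.0, 5.2, 4.8]}
    disease = {"m/z 1000": [1.0, 1.2, 0.8], "m/z 2000": [8.0, 8.1, 7.9]}
    print(rank_peaks(healthy, disease))
```

    A small within-group variance and a large between-group difference both raise the score, which matches the definition given in the abstract; selecting a combined fingerprint from the top-ranked peaks would be a further step.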

    Design and implementation of a statistical analysis tool for two biological states

    The major goal of the research in this thesis is to design and implement a software tool (Q5+) that can easily, quickly and reliably search for biomarkers by statistically analyzing mass spectrometry data from two different biological states. Q5+ implements most of the Q5 algorithm, a very good algorithm for classifying mass spectrometry data (Lilien et al.). Compared with Q5, Q5+ improves usability by incorporating a graphical user interface and a matrix library. Results show that, when run on the same data, Q5+ and Q5 have equivalent classification ability. Q5+ also implements the Peak Screening feature, which can be used to identify a set of peaks that may have discriminant power. Although human inspection remains necessary, this feature offers a route to further investigation that might not be possible by human inspection alone. Overall, Q5+ is an easy and reliable tool for lab research.

    Data Mining

    Data mining is a branch of computer science that automatically extracts meaningful, useful knowledge and previously unknown, hidden, interesting patterns from large amounts of data to support the decision-making process. This book presents recent theoretical and practical advances in the field of data mining. It discusses a number of data mining methods, including classification, clustering, and association rule mining. This book brings together many different successful data mining studies in various areas such as health, banking, education, software engineering, animal science, and the environment.

    A survey of the application of soft computing to investment and financial trading


    Implementing decision tree-based algorithms in medical diagnostic decision support systems

    As a branch of healthcare, medical diagnosis can be defined as finding the disease based on the signs and symptoms of the patient. To this end, the required information is gathered from different sources such as physical examination, medical history and general information about the patient. The development of smart classification models for medical diagnosis is of great interest among researchers. This is mainly because machine learning and data mining algorithms are capable of detecting hidden trends between the features of a database. Hence, classifying medical datasets using smart techniques paves the way to designing more efficient medical diagnostic decision support systems. Several databases have been provided in the literature to investigate different aspects of diseases. As an alternative to the available diagnosis tools/methods, this research involves the machine learning algorithms Classification and Regression Tree (CART), Random Forest (RF) and Extremely Randomized Trees or Extra Trees (ET) for the development of classification models that can be implemented in computer-aided diagnosis systems. As a decision tree (DT), CART is fast to create, and it applies to both quantitative and qualitative data. RF and ET combine a number of weak learners such as CART into a model for classification tasks. We employed the Wisconsin Breast Cancer Database (WBCD), the Z-Alizadeh Sani dataset for coronary artery disease (CAD) and the databanks gathered in Ghaem Hospital’s dermatology clinic on the response of patients having common and/or plantar warts to the cryotherapy and/or immunotherapy methods. To classify the breast cancer type based on the WBCD, the RF and ET methods were employed. It was found that the developed RF and ET models predict the WBCD type with 100% accuracy in all cases. To choose the proper treatment approach for warts as well as for the CAD diagnosis, the CART methodology was employed.
The findings of the error analysis revealed that the proposed CART models for the applications of interest attain the highest precision, and no model in the literature can rival them. The outcome of this study supports the idea that methods like CART, RF and ET not only improve diagnostic precision, but also reduce the time and expense needed to reach a diagnosis. However, since these strategies are highly sensitive to the quality and quantity of the introduced data, more extensive databases with a greater number of independent parameters might be required for further practical implications of the developed models.
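    The core step of a CART classifier, choosing the threshold that minimises the weighted Gini impurity of the two child nodes, can be sketched in a few lines. This is a single-feature illustration under assumed names and toy data, not the models developed in the study.

```python
def gini(labels):
    """Gini impurity of a list of class labels (0.0 for a pure node)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((labels.count(c) / n) ** 2 for c in set(labels))


def best_split(values, labels):
    """Return (threshold, weighted_gini) for the best binary split on one feature.

    Candidate thresholds are midpoints between consecutive distinct values.
    Returns (None, inf) when the feature has a single distinct value.
    """
    best = (None, float("inf"))
    distinct = sorted(set(values))
    for low, high in zip(distinct, distinct[1:]):
        threshold = (low + high) / 2
        left = [y for x, y in zip(values, labels) if x <= threshold]
        right = [y for x, y in zip(values, labels) if x > threshold]
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
        if score < best[1]:
            best = (threshold, score)
    return best


if __name__ == "__main__":
    # Toy feature values and labels, invented for illustration.
    sizes = [1.0, 2.0, 3.0, 8.0, 9.0, 10.0]
    classes = ["benign"] * 3 + ["malignant"] * 3
    print(best_split(sizes, classes))
```

    A full tree applies this search to every feature at every node and recurses on the two children; RF and ET then build many such trees on resampled or randomised versions of the problem and aggregate their votes.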