
    Ensemble classification of incomplete data – a non-imputation approach with an application in ovarian tumour diagnosis support

    Faculty of Mathematics and Computer Science. In this doctoral dissertation I focus on the classification of incomplete data. The motivation for the research comes from medicine, where missing data are very common. The most popular way of handling missingness is imputation, that is, filling in missing values on the basis of statistical relationships among features. In my research I take a different approach. Previously developed classifiers can be transformed into a form that returns an interval of possible predictions. Aggregation operators and thresholding methods are then applied to these intervals to reach a final classification. I show how to perform this transformation and how to use aggregation strategies for interval data in classification. The methods I developed improve the quality of classification of incomplete data in the problem of ovarian tumour diagnosis support. Additional analysis on external datasets from the University of California, Irvine (UCI) Machine Learning Repository indicates that the presented methods are complementary to imputation.
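
The interval idea described in this abstract can be illustrated with a minimal sketch. It is not the dissertation's method: it assumes a hypothetical linear scoring model, features scaled to [0, 1], the arithmetic mean of interval endpoints as the aggregation operator, and a midpoint threshold. All names and numbers are illustrative.

```python
from typing import Optional, Sequence, Tuple

Interval = Tuple[float, float]


def interval_score(weights: Sequence[float], bias: float,
                   x: Sequence[Optional[float]]) -> Interval:
    """Bounds of a linear score when each missing feature (None) may lie anywhere in [0, 1]."""
    lo = hi = bias
    for w, v in zip(weights, x):
        if v is None:
            # A missing feature contributes between min(0, w) and max(0, w).
            lo += min(0.0, w)
            hi += max(0.0, w)
        else:
            lo += w * v
            hi += w * v
    return lo, hi


def aggregate_mean(intervals: Sequence[Interval]) -> Interval:
    """Aggregate intervals by averaging lower and upper endpoints separately."""
    n = len(intervals)
    return (sum(i[0] for i in intervals) / n, sum(i[1] for i in intervals) / n)


def classify(interval: Interval, threshold: float = 0.5) -> int:
    """Threshold the interval midpoint (one of many possible thresholding rules)."""
    lo, hi = interval
    return 1 if (lo + hi) / 2.0 >= threshold else 0


# Two hypothetical models (weights, bias) and one patient with a missing second feature.
models = [([0.6, 0.4], 0.0), ([0.2, 0.5], 0.1)]
patient = [0.5, None]

intervals = [interval_score(w, b, patient) for w, b in models]
combined = aggregate_mean(intervals)
label = classify(combined)
```

With complete data the interval collapses to a single point, so the same pipeline reduces to ordinary score averaging and thresholding.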

    Implementing decision tree-based algorithms in medical diagnostic decision support systems

    As a branch of healthcare, medical diagnosis can be defined as identifying a disease from the patient's signs and symptoms. The required information is gathered from sources such as physical examination, medical history and general patient information. Developing smart classification models for medical diagnosis is of great interest to researchers, mainly because machine learning and data mining algorithms can detect hidden relationships between the features of a database. Classifying medical datasets with such techniques therefore paves the way to more efficient medical diagnostic decision support systems. Several databases are available in the literature for investigating different aspects of diseases. As an alternative to the available diagnostic tools and methods, this research uses the machine learning algorithms Classification and Regression Tree (CART), Random Forest (RF) and Extremely Randomized Trees, or Extra Trees (ET), to develop classification models that can be implemented in computer-aided diagnosis systems. As a decision tree (DT), CART is fast to build and applies to both quantitative and qualitative data. RF and ET combine a number of weak learners such as CART into a single classification model. We used the Wisconsin Breast Cancer Database (WBCD), the Z-Alizadeh Sani dataset for coronary artery disease (CAD), and the data gathered in Ghaem Hospital’s dermatology clinic on the response of patients with common and/or plantar warts to cryotherapy and/or immunotherapy. The RF and ET methods were used to classify the breast cancer type in the WBCD, and the resulting models predicted it with 100% accuracy in all cases. The CART methodology was used to choose the proper treatment approach for warts and to diagnose CAD. The error analysis showed that the proposed CART models attain the highest precision for these applications, exceeding the models reported in the literature. The outcome of this study supports the idea that methods like CART, RF and ET not only improve diagnostic precision but also reduce the time and expense needed to reach a diagnosis. However, since these strategies are highly sensitive to the quality and quantity of the input data, more extensive databases with a greater number of independent parameters might be required before the developed models can be applied in practice.
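
A CART-style tree grows by repeatedly choosing the split that leaves the purest child nodes. The sketch below illustrates one such split search using Gini impurity, a common CART criterion. The abstract does not state which criterion or features the authors used, so the criterion, the feature values and the class labels here are assumptions for illustration only.

```python
from collections import Counter
from typing import Sequence, Tuple


def gini(labels: Sequence[int]) -> float:
    """Gini impurity: 1 minus the sum of squared class proportions."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def best_threshold(values: Sequence[float], labels: Sequence[int]) -> Tuple[float, float]:
    """Return (threshold, weighted impurity) of the best split 'value <= threshold'.

    If no split improves on the unsplit node, the threshold is NaN.
    """
    n = len(values)
    best = (float("nan"), gini(labels))
    ordered = sorted(set(values))
    # Candidate thresholds are midpoints between consecutive distinct values.
    for a, b in zip(ordered, ordered[1:]):
        t = (a + b) / 2.0
        left = [y for v, y in zip(values, labels) if v <= t]
        right = [y for v, y in zip(values, labels) if v > t]
        score = (len(left) * gini(left) + len(right) * gini(right)) / n
        if score < best[1]:
            best = (t, score)
    return best


# Hypothetical single feature and binary class labels.
sizes = [1.0, 2.0, 3.0, 7.0, 8.0, 9.0]
classes = [0, 0, 0, 1, 1, 1]
threshold, impurity = best_threshold(sizes, classes)
```

RF and ET build many such trees on resampled or randomised versions of the data and combine their votes. ET additionally draws split thresholds at random instead of searching for the best one.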

    Evaluation of Adnexal Mass in Reproductive and Perimenopausal Age Group

    INTRODUCTION: The ovaries can give rise to both benign and malignant tumors throughout a woman's life. Ovarian cancer remains one of the leading causes of cancer-related deaths. The most important risk factor is family history, as 10% of patients have an inherited genetic predisposition. Ovarian masses are a frequent finding in general gynecology, and most are cystic. Histologically, ovarian cysts are often divided into neoplastic growths (ovarian cystic neoplasms) and those created by disruption of normal ovulation (functional ovarian cysts). Angiogenesis is an essential component of both the follicular and luteal phases of the ovarian cycle. It is also a component of various pathologic ovarian processes, including follicular cyst formation, PCOS, ovarian hyperstimulation syndrome, and benign and malignant ovarian neoplasms. Functional ovarian cysts make up a large proportion of ovarian masses. Neoplasms make up the remainder and are predominantly benign.
    AIMS & OBJECTIVE: The primary objective of my study is to evaluate adnexal masses in the reproductive and perimenopausal age group in order to determine the percentage of malignant adnexal tumors in this age group.
    METHODOLOGY: The study included patients in the reproductive and perimenopausal age group admitted to ISOKGH for evaluation over a period of 1 year. Basic data (age, occupation, education and address) and gynaecological data (age at menarche, parity, last menstrual cycle, symptoms and family history) were obtained from all patients. In addition, blood analysis, tumor markers, clinical examination, ultrasonography and CT of the pelvic organs, and histopathological examination (HPE) were performed. The risk of malignancy index (RMI) was calculated for all patients. The IOTA Simple Rules were applied as a reliable triage test to differentiate between benign and malignant masses. The outcome was assessed for all patients.
    Inclusion Criteria: All patients in the reproductive and perimenopausal age group admitted to ISOKGH.
    Exclusion Criteria: Patients younger than 15 or older than 50 years. Patients treated as outpatients.
    RESULTS AND CONCLUSION:
    ◈ Adnexal masses were more common in middle-aged women, particularly perimenopausal women, and usually presented with abdominal pain and distension along with dysfunctional uterine bleeding.
    ◈ Parity and sterilization procedures were not associated with the occurrence of an adnexal mass.
    ◈ Adnexal masses were not associated with pathology of the cervix, vagina or uterus.
    ◈ Per vaginam examination showed forniceal fullness in most patients with an adnexal mass.
    ◈ Right-sided ovarian masses were more common than left-sided or bilateral masses.
    ◈ Right adnexal mass was the most common clinical diagnosis.
    ◈ Right-sided ovarian cyst was the most common USG finding.
    ◈ Mean uterine length and breadth were close to normal.
    ◈ Multiloculated septa were seen in the mass lesion in 30% of the patients.
    ◈ Solid components were present in 14% of the lesions.
    ◈ Papillary projections were seen in 12% of the lesions.
    ◈ Of all the adnexal masses, 15% were malignant, 6% were borderline and the remainder were benign.
    ◈ Simple ovarian cyst and mucinous cystadenoma were the most common benign lesions, and the most common malignant lesion was cystadenocarcinoma.
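
The abstract does not say which variant of the risk of malignancy index was used. As a hedged illustration, the sketch below assumes the original RMI I definition: the product of an ultrasound score U, a menopausal score M and the serum CA-125 level in U/mL. The patient values are hypothetical.

```python
def ultrasound_score(features_present: int) -> int:
    """Map the number of suspicious ultrasound features (0 to 5) to U under RMI I.

    Assumed RMI I convention: U = 0 for no features, 1 for one feature,
    3 for two to five features.
    """
    if not 0 <= features_present <= 5:
        raise ValueError("features_present must be between 0 and 5")
    if features_present == 0:
        return 0
    if features_present == 1:
        return 1
    return 3


def rmi(features_present: int, postmenopausal: bool, ca125_u_per_ml: float) -> float:
    """RMI I = U x M x CA-125, with M = 1 (premenopausal) or 3 (postmenopausal)."""
    m = 3 if postmenopausal else 1
    return ultrasound_score(features_present) * m * ca125_u_per_ml


# Hypothetical premenopausal patient with two ultrasound features and CA-125 of 40 U/mL.
example = rmi(features_present=2, postmenopausal=False, ca125_u_per_ml=40.0)
```

A cut-off of 200 is conventionally used with RMI I to flag a mass as high risk. Whether this study applied that cut-off is not stated in the abstract.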

    Biochemical markers and combination testing for the diagnosis of ovarian cancer in women with symptoms or signs suspicious of ovarian cancer

    Ovarian cancer (OC) has the highest mortality of all gynaecological cancers, and delayed diagnosis is a significant contributing factor. There is currently no consensus on the best test for early diagnosis. A review of existing systematic reviews of symptoms, biochemical markers and ultrasound (US) tests, used alone or in combination to diagnose OC in symptomatic women, showed that the existing reviews vary in quality and applicability and are limited by poor reporting. I attempted to address these deficiencies in two reviews: one on the accuracy of biomarkers alone, and one on symptoms, biomarkers or US in combination, for the diagnosis of OC in symptomatic women in generalist settings, with pre- and postmenopausal women considered separately. My thesis identifies key methodological issues: the literature is not applicable to generalist settings because studies included women typical of tertiary healthcare settings; some studies excluded borderline tumours, which inflates estimates of sensitivity; and test performance differs importantly between pre- and postmenopausal women. The main results are: 1) existing reviews are not applicable to primary care settings, and more research is needed; 2) for biomarkers, i) HE4 at thresholds of 60-80 pmol/L and 130-150 pmol/L is recommended in pre- and postmenopausal women for low-prevalence settings, and ii) ROMA or LR2 is recommended to replace RMI in premenopausal women in secondary/tertiary settings, while RMI should continue to be used for postmenopausal women, as it shows comparable accuracy to ROMA and LR2.

    Externally validated models for first diagnosis and risk of progression of knee osteoarthritis.

    Objective: We develop and externally validate two models for use with radiological knee osteoarthritis (KOA): a diagnostic model for KOA and a prognostic model of time to onset of KOA. Model development and optimisation used data from the Osteoarthritis Initiative (OAI), and both models were externally validated on data from the Multicenter Osteoarthritis Study (MOST). Materials and methods: The diagnostic model at first presentation uses subjects in the OAI with and without KOA (n = 2006), modelled with multivariate logistic regression. The prognostic sample involves 5-year follow-up of subjects presenting without clinical KOA (n = 1155), modelled with Cox regression. In both instances the models used training data sets of n = 1353 and 1002 subjects, and optimisation used test data sets of n = 1354 and 1003. The external validation data sets for the diagnostic and prognostic models comprised n = 2006 and n = 1155 subjects respectively. Results: The diagnostic model has an AUC of 0.748 (0.721-0.774) on the test data and 0.670 (0.631-0.708) in external validation. The survival model has concordance scores of 0.74 (0.7325-0.7439) on the OAI test set and 0.72 (0.7190-0.7373) in external validation. The survival approach stratified the population into two risk cohorts, and the separation between the cohorts remains when the model is applied to the validation data. Discussion: The models are interpretable, with app interfaces that implement nomograms. The apps may be used for stratification and for educating patients about the impact of modifiable risk factors. The externally validated results, obtained by application to data from a substantial prospective observational study, show the robustness of the models for the likelihood of presenting with KOA at an initial assessment, based on risk factors identified by the OAI protocol, and for stratifying the risk of developing KOA in the next five years. Conclusion: Modelling clinical KOA from OAI data validates well on the MOST data set. Both risk models identified key factors for differentiating the target population from commonly available variables. This analysis has the potential to improve the clinical management of patients.
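
The AUC reported for the diagnostic model can be read as the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case, with ties counted as one half. The sketch below computes it directly from that definition. The scores and labels are made up for illustration and are not taken from the OAI or MOST data.

```python
from typing import Sequence


def auc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """Area under the ROC curve via pairwise comparison of positive and negative scores."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a win
    return wins / (len(pos) * len(neg))


# Hypothetical predicted risks and true outcomes (1 = KOA present).
risk = [0.9, 0.8, 0.4, 0.3, 0.2]
outcome = [1, 0, 1, 0, 0]
value = auc(risk, outcome)
```

The pairwise loop is quadratic in the number of cases, which is fine for a sketch. Rank-based formulas give the same value more efficiently on large samples.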

    Using Bayesian neural networks with ARD input selection to detect malignant ovarian masses prior to surgery

    In this paper, we applied Bayesian multi-layer perceptrons (MLPs) using the evidence procedure to predict malignancy of ovarian masses in a large (n = 1,066) multi-centre data set. Automatic relevance determination (ARD) was used to select the most relevant inputs. Fivefold cross-validation (5CV) and repeated 5CV were used to select the optimal combination of input set and number of hidden neurons. The models perform well, with area under the receiver operating characteristic curve values of 0.93-0.94 on independent data. Comparison with a linear benchmark model and a previously developed logistic regression model shows that the present problem is very nearly linearly separable. A resampling analysis further shows that the number of hidden neurons specified in the ARD analyses for input selection may influence model performance. This paper shows that Bayesian MLPs, although not frequently used, are a useful tool for detecting malignant ovarian tumours.
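
Fivefold cross-validation splits the cases into five disjoint folds and uses each fold once as the held-out set. The sketch below shows one way to generate such splits for a data set of the size mentioned in this abstract. The shuffling scheme and seed are assumptions, not details from the paper.

```python
import random
from typing import List, Tuple


def k_fold_indices(n: int, k: int = 5, seed: int = 0) -> List[Tuple[List[int], List[int]]]:
    """Return k (train, test) index splits over n cases after a seeded shuffle."""
    indices = list(range(n))
    random.Random(seed).shuffle(indices)
    # Deal the shuffled indices into k folds whose sizes differ by at most one.
    folds = [indices[i::k] for i in range(k)]
    splits = []
    for i in range(k):
        test = folds[i]
        train = [j for m, fold in enumerate(folds) if m != i for j in fold]
        splits.append((train, test))
    return splits


# 1,066 cases as in the abstract; the seed is arbitrary.
splits = k_fold_indices(1066, k=5, seed=0)
```

Repeated 5CV can be sketched by calling the function again with different seeds and averaging the resulting performance estimates.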

    Brain and Human Body Modeling 2020

    This open access book describes modern applications of computational human modeling in an effort to advance neurology, cancer treatment, and radio-frequency studies, including regulatory, safety, and wireless communication fields. Readers working on any application that may expose human subjects to electromagnetic radiation will benefit from this book’s coverage of the latest models and techniques available to assess a given technology’s safety and efficacy in a timely and efficient manner. The book describes computational human body phantom construction and application; explains new practices in computational human body modeling for electromagnetic safety and exposure evaluations; and includes a survey of modern applications for which computational human phantoms are critical.