423 research outputs found

    A Review

    Get PDF
    Ovarian cancer is the most common cause of death among gynecological malignancies. We discuss different types of clinical and nonclinical features that are used to study and analyze the differences between benign and malignant ovarian tumors. Computer-aided diagnostic (CAD) systems of high accuracy are being developed as an initial test for ovarian tumor classification instead of biopsy, which is the current gold standard diagnostic test. We also discuss different aspects of developing a reliable CAD system for the automated classification of ovarian cancer into benign and malignant types. A brief description of the commonly used classifiers in ultrasound-based CAD systems is also given

    New Statistical Algorithms for the Analysis of Mass Spectrometry Time-Of-Flight Mass Data with Applications in Clinical Diagnostics

    Get PDF
    Mass spectrometry (MS) based techniques have emerged as a standard forlarge-scale protein analysis. The ongoing progress in terms of more sensitive machines and improved data analysis algorithms led to a constant expansion of its fields of applications. Recently, MS was introduced into clinical proteomics with the prospect of early disease detection using proteomic pattern matching. Analyzing biological samples (e.g. blood) by mass spectrometry generates mass spectra that represent the components (molecules) contained in a sample as masses and their respective relative concentrations. In this work, we are interested in those components that are constant within a group of individuals but differ much between individuals of two distinct groups. These distinguishing components that dependent on a particular medical condition are generally called biomarkers. Since not all biomarkers found by the algorithms are of equal (discriminating) quality we are only interested in a small biomarker subset that - as a combination - can be used as a fingerprint for a disease. Once a fingerprint for a particular disease (or medical condition) is identified, it can be used in clinical diagnostics to classify unknown spectra. In this thesis we have developed new algorithms for automatic extraction of disease specific fingerprints from mass spectrometry data. Special emphasis has been put on designing highly sensitive methods with respect to signal detection. Thanks to our statistically based approach our methods are able to detect signals even below the noise level inherent in data acquired by common MS machines, such as hormones. To provide access to these new classes of algorithms to collaborating groups we have created a web-based analysis platform that provides all necessary interfaces for data transfer, data analysis and result inspection. To prove the platform's practical relevance it has been utilized in several clinical studies two of which are presented in this thesis. In these studies it could be shown that our platform is superior to commercial systems with respect to fingerprint identification. As an outcome of these studies several fingerprints for different cancer types (bladder, kidney, testicle, pancreas, colon and thyroid) have been detected and validated. The clinical partners in fact emphasize that these results would be impossible with a less sensitive analysis tool (such as the currently available systems). In addition to the issue of reliably finding and handling signals in noise we faced the problem to handle very large amounts of data, since an average dataset of an individual is about 2.5 Gigabytes in size and we have data of hundreds to thousands of persons. To cope with these large datasets, we developed a new framework for a heterogeneous (quasi) ad-hoc Grid - an infrastructure that allows to integrate thousands of computing resources (e.g. Desktop Computers, Computing Clusters or specialized hardware, such as IBM's Cell Processor in a Playstation 3)

    Wavelet feature extraction and genetic algorithm for biomarker detection in colorectal cancer data

    Get PDF
    Biomarkers which predict patient’s survival can play an important role in medical diagnosis and treatment. How to select the significant biomarkers from hundreds of protein markers is a key step in survival analysis. In this paper a novel method is proposed to detect the prognostic biomarkers ofsurvival in colorectal cancer patients using wavelet analysis, genetic algorithm, and Bayes classifier. One dimensional discrete wavelet transform (DWT) is normally used to reduce the dimensionality of biomedical data. In this study one dimensional continuous wavelet transform (CWT) was proposed to extract the features of colorectal cancer data. One dimensional CWT has no ability to reduce dimensionality of data, but captures the missing features of DWT, and is complementary part of DWT. Genetic algorithm was performed on extracted wavelet coefficients to select the optimized features, using Bayes classifier to build its fitness function. The corresponding protein markers were located based on the position of optimized features. Kaplan-Meier curve and Cox regression model 2 were used to evaluate the performance of selected biomarkers. Experiments were conducted on colorectal cancer dataset and several significant biomarkers were detected. A new protein biomarker CD46 was found to significantly associate with survival time

    A comparative analysis of classifiers in cancer prediction using multiple data mining techniques

    Get PDF
    In recent years, application of data mining methods in health industry has received increased attention from both health professionals and scholars. This paper presents a data mining framework for detecting breast cancer based on real data from one of Iran hospitals by applying association rules and the most commonly used classifiers. The former were adopted for reducing the size of datasets, while the latter were chosen for cancer prediction. A k-fold cross validation procedure was included for evaluating the performance of the proposed classifiers. Among the six classifiers used in this paper, support vector machine achieved the best results, with an accuracy of 93%. It is worth mentioning that the approach proposed can be applied for detecting other diseases as well

    Mass spectrometry data mining for cancer detection

    Get PDF
    Early detection of cancer is crucial for successful intervention strategies. Mass spectrometry-based high throughput proteomics is recognized as a major breakthrough in cancer detection. Many machine learning methods have been used to construct classifiers based on mass spectrometry data for discriminating between cancer stages, yet, the classifiers so constructed generally lack biological interpretability. To better assist clinical uses, a key step is to discover ”biomarker signature profiles”, i.e. combinations of a small number of protein biomarkers strongly discriminating between cancer states. This dissertation introduces two innovative algorithms to automatically search for a signature and to construct a high-performance signature-based classifier for cancer discrimination tasks based on mass spectrometry data, such as data acquired by MALDI or SELDI techniques. Our first algorithm assumes that homogeneous groups of mass spectra can be modeled by (unknown) Gibbs distributions to generate an optimal signature and an associated signature-based classifier by robust log-likelihood analysis; our second algorithm uses a stochastic optimization algorithm to search for two lists of biomarkers, and then constructs a signature-based classifier. To support these two algorithms theoretically, this dissertation also studies the empirical probability distributions of mass spectrometry data and implements the actual fitting of Markov random fields to these high-dimensional distributions. We have validated our two signature discovery algorithms on several mass spectrometry datasets related to ovarian cancer and to colorectal cancer patients groups. For these cancer discrimination tasks, our algorithms have yielded better classification performances than existing machine learning algorithms and in addition,have generated more interpretable explicit signatures.Mathematics, Department o

    Superhydrophobic lab-on-chip measures secretome protonation state and provides a personalized risk assessment of sporadic tumour

    Get PDF
    Secretome of primary cultures is an accessible source of biological markers compared to more complex and less decipherable mixtures such as serum or plasma. The protonation state (PS) of secretome reflects the metabolism of cells and can be used for cancer early detection. Here, we demonstrate a superhydrophobic organic electrochemical device that measures PS in a drop of secretome derived from liquid biopsies. Using data from the sensor and principal component analysis (PCA), we developed algorithms able to efficiently discriminate tumour patients from non-tumour patients. We then validated the results using mass spectrometry and biochemical analysis of samples. For the 36 patients across three independent cohorts, the method identified tumour patients with high sensitivity and identification as high as 100% (no false positives) with declared subjects at-risk, for sporadic cancer onset, by intermediate values of PS. This assay could impact on cancer risk management, individual’s diagnosis and/or help clarify risk in healthy populations

    Comparison of metaheuristic strategies for peakbin selection in proteomic mass spectrometry data

    Get PDF
    Mass spectrometry (MS) data provide a promising strategy for biomarker discovery. For this purpose, the detection of relevant peakbins in MS data is currently under intense research. Data from mass spectrometry are challenging to analyze because of their high dimensionality and the generally low number of samples available. To tackle this problem, the scientific community is becoming increasingly interested in applying feature subset selection techniques based on specialized machine learning algorithms. In this paper, we present a performance comparison of some metaheuristics: best first (BF), genetic algorithm (GA), scatter search (SS) and variable neighborhood search (VNS). Up to now, all the algorithms, except for GA, have been first applied to detect relevant peakbins in MS data. All these metaheuristic searches are embedded in two different filter and wrapper schemes coupled with Naive Bayes and SVM classifiers

    ENDOMET database – A means to identify novel diagnostic and prognostic tools for endometriosis

    Get PDF
    Endometriosis is a common benign hormone reliant inflammatory gynecological disease that affects fertile aged women and has a considerable economic impact on healthcare systems. Symptoms include intense menstrual pain, persistent pelvic pain, and infertility. It is defined by the existence of endometrium-like tissue developing in ectopic locations outside the uterine cavity and inflammation in the peritoneal cavity. Endometriosis presents with multifactorial etiology, and despite extensive research the etiology is still poorly understood. Diagnostic delay from the onset of the disease to when a conclusive diagnosis is reached is between 7–12 years. There is no known cure, although symptoms can be improved with hormonal medications (which often have multiple side effects and prevent pregnancy), or through surgery which carries its own risk. Current non-invasive tools for diagnosis are not sufficiently dependable, and a definite diagnosis is achieved through laparoscopy or laparotomy. This study was based on two prospective cohorts: The ENDOMET study, including 137 endometriosis patients scheduled for surgery and 62 healthy women, and PROENDO that included 138 endometriosis patients and 33 healthy women. Our long-term goal with the current study was to support the discovery of innovative new tools for efficient diagnosis of endometriosis as well as tools to further understand the etiology and pathogenesis of the disease. We set about achieving this goal by creating a database, EndometDB, based on a relational data model, implemented with PostgreSQL programming language. The database allows e.g., for the exploration of global genome-wide expression patterns in the peritoneum, endometrium, and in endometriosis lesions of endometriosis patients as well as in the peritoneum and endometrium of healthy control women of reproductive age. The data collected in the EndometDB was also used for the development and validation of a symptom and biomarker-based predictive model designed for risk evaluation and early prediction of endometriosis without invasive diagnostic methods. Using the data in the EndometDB we discovered that compared with the eutopic endometrium, the WNT- signaling pathway is one of the molecular pathways that undergo strong changes in endometriosis. We then evaluated the potential role for secreted frizzled-related protein 2 (SFRP-2, a WNT-signaling pathway modulator), in improving endometriosis lesion border detection. The SFRP-2 expression visualizes the lesion better than previously used markers and can be used to better define lesion size and that the surgical excision of the lesions is complete.ENDOMET tietokanta – Keino tunnistaa uusi diagnostinen ja ennustava työkalu endometrioosille Endometrioosi on yleinen hyvänlaatuinen, hormoneista riippuvainen tulehduksellinen lisääntymisikäisten naisten gynekologinen sairaus, joka kuormittaa terveydenhuoltojärjestelmää merkittävästi. Endometrioositaudin oireita ovat mm. voimakas kuukautiskipu, jatkuva lantion alueen kipu ja hedelmättömyys. Sairaus määritellään kohdun limakalvon kaltaisen kudoksen esiintymisenä kohdun ulkopuolella sekä siihen liittyvänä vatsakalvon tulehduksena. Endometrioosin etiologia on monitahoinen, ja laajasta tutkimuksesta huolimatta edelleen huonosti tunnettu. Kesto taudin puhkeamisesta lopullisen diagnoosin saamiseen on usein jopa 7–12 vuotta. Sairauteen ei tunneta parannuskeinoa, mutta oireita voidaan lievittää esimerkiksi hormonaalisilla lääkkeillä (joilla on usein monia sivuvaikutuksia ja jotka estävät raskauden) tai leikkauksella, johon liittyy omat tunnetut riskit. Nykyiset ei-invasiiviset diagnoosityökalut eivät ole riittävän luotettavia sairauden tunnistamiseen, ja varma endometrioosin diagnoosi saavutetaan laparoskopian tai laparotomian avulla. Tämä tutkimus perustui kahteen prospektiiviseen kohorttiin: ENDOMET-tutkimuk-seen, johon osallistui 137 endometrioosipotilasta ja 62 terveellistä naista, sekä PROENDO-tutkimukseen, johon osallistui 138 endometrioosipotilasta ja 33 terveellistä naista. Tässä tutkimuksessa pitkän aikavälin tavoitteemme oli löytää uusia työkalujen endometrioosin diagnosointiin, sekä ymmärtää endometrioosin etiologiaa ja patogeneesiä. Ensimmäisessä vaiheessa loimme EndometDB –tietokannan PostgreSQL-ohjelmointi-kielellä. Tämän osittain avoimeen käyttöön vapautetun tietokannan avulla voidaan tutkia genomin, esimerkiksi kaikkien tunnettujen geenien ilmentymistä peritoneumissa, endo-metriumissa ja endometrioosipotilaiden endometrioosileesioissa EndometDB-tietokantaan kerättyjä tietoja käytettiin oireiden ja biomarkkeripohjaisen ennustemallin kehittämiseen ja validointiin. Malli tuottaa riskinarvioinnin endometrioositaudin varhaiseen ennustamiseen ilman laparoskopiaa. Käyttäen EndometDB-tietokannan tietoja havaitsimme, että endo-metrioositautikudoksessa tapahtui voimakkaita geeni-ilmentymisen muutoksia erityisesti geeneissä, jotka liittyvät WNT-signalointireitin säätelyyn. Keskeisin löydös oli, että SFRP-2 proteiinin ilmentyminen oli huomattavasti koholla endometrioosikudoksessa ja SFRP-2 proteiinin immunohistokemiallinen värjäys erottaa endometrioosin tautikudoksen terveestä kudoksesta aiempia merkkiaineita paremmin. Löydetyllä menetelmällä voidaan siten selvittää tautikudoksen laajuus ja tarvittaessa osoittaa, että leikkauksella on kyetty poistamaan koko sairas kudos
    • …
    corecore