A Review
Ovarian cancer is the most common cause of death among gynecological malignancies. We discuss different types of clinical and nonclinical features that are used to study and analyze the differences between benign and malignant ovarian tumors. Computer-aided diagnostic (CAD) systems of high accuracy are being developed as an initial test for ovarian tumor classification instead of biopsy, which is the current gold standard diagnostic test. We also discuss different aspects of developing a reliable CAD system for the automated classification of ovarian cancer into benign and malignant types. A brief description of the commonly used classifiers in ultrasound-based CAD systems is also given
New Statistical Algorithms for the Analysis of Mass Spectrometry Time-Of-Flight Mass Data with Applications in Clinical Diagnostics
Mass spectrometry (MS) based techniques have emerged as a standard for large-scale protein analysis. The ongoing progress in terms of more sensitive
machines and improved data analysis algorithms led to a constant expansion of
its fields of applications. Recently, MS was introduced into clinical proteomics
with the prospect of early disease detection using proteomic pattern matching.
Analyzing biological samples (e.g. blood) by mass spectrometry generates
mass spectra that represent the components (molecules) contained in a
sample as masses and their respective relative concentrations.
In this work, we are interested in those components that are constant within a
group of individuals but differ much between individuals of two distinct groups.
These distinguishing components that depend on a particular medical condition
are generally called biomarkers. Since not all biomarkers found by the
algorithms are of equal (discriminating) quality we are only interested in a
small biomarker subset that - as a combination - can be used as a
fingerprint for a disease. Once a fingerprint for a particular disease
(or medical condition) is identified, it can be used in clinical diagnostics to
classify unknown spectra.
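The idea of a biomarker as a component that is stable within a group but differs between two groups can be illustrated with a simple per-position separation score. The sketch below is not the algorithm developed in the thesis; it assumes a Fisher-style ratio (squared difference of group means divided by the summed group variances) and spectra that are already aligned to a common mass axis. The names `separation_score` and `rank_positions` and the toy intensities are hypothetical.

```python
from statistics import mean, pvariance


def separation_score(group_a, group_b):
    """Large when the group means differ and the within-group spread is small."""
    gap = (mean(group_a) - mean(group_b)) ** 2
    spread = pvariance(group_a) + pvariance(group_b)
    if spread == 0:
        # Both groups are perfectly constant at this position.
        return float("inf") if gap > 0 else 0.0
    return gap / spread


def rank_positions(spectra_a, spectra_b):
    """Return mass-axis positions (column indices) ordered from best to worst score.

    Each spectrum is a list of intensities; all spectra are assumed to share
    the same mass axis (an assumption of this sketch).
    """
    scores = []
    for j in range(len(spectra_a[0])):
        col_a = [s[j] for s in spectra_a]
        col_b = [s[j] for s in spectra_b]
        scores.append((separation_score(col_a, col_b), j))
    scores.sort(key=lambda pair: pair[0], reverse=True)
    return [j for _, j in scores]


# Toy, made-up intensities: only position 1 separates the two groups cleanly.
healthy = [[1.0, 5.0, 2.0], [1.1, 5.2, 2.9], [0.9, 4.8, 1.1]]
disease = [[1.0, 9.0, 2.5], [1.2, 9.1, 1.5], [0.8, 8.9, 3.0]]
print(rank_positions(healthy, disease))
```

A small subset taken from the top of such a ranking is one naive way to form a candidate fingerprint; a real pipeline would also need peak detection, alignment, and validation on held-out spectra.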
In this thesis we have developed new algorithms for automatic extraction of
disease specific fingerprints from mass spectrometry data. Special emphasis has
been put on designing highly sensitive methods with respect to signal detection.
Thanks to our statistically based approach, our methods are able to
detect signals, such as hormones, even below the noise level inherent in data
acquired by common MS machines.
To provide access to these new classes of algorithms to collaborating groups
we have created a web-based analysis platform that provides all necessary
interfaces for data transfer, data analysis and result inspection.
To prove the platform's practical relevance, it has been utilized in several
clinical studies, two of which are presented in this thesis. In these studies it
could be shown that our platform is superior to commercial systems with respect
to fingerprint identification. As an outcome of these studies several
fingerprints for different cancer types (bladder, kidney, testicle, pancreas,
colon and thyroid) have been detected and validated. The clinical partners in
fact emphasize that these results would be impossible with a less sensitive
analysis tool (such as the currently available systems).
In addition to the issue of reliably finding and handling signals in noise we
faced the problem of handling very large amounts of data, since an average dataset
of an individual is about 2.5 Gigabytes in size and we have data of hundreds to
thousands of persons. To cope with these large datasets, we developed a new
framework for a heterogeneous (quasi) ad-hoc Grid - an infrastructure that
allows the integration of thousands of computing resources (e.g. desktop computers,
computing clusters or specialized hardware, such as IBM's Cell Processor in a
PlayStation 3).
Wavelet feature extraction and genetic algorithm for biomarker detection in colorectal cancer data
Biomarkers that predict a patient's survival can play an important role in medical diagnosis and treatment. Selecting the significant biomarkers from hundreds of protein markers is a key step in survival analysis. In this paper, a novel method is proposed to detect prognostic biomarkers of survival in colorectal cancer patients using wavelet analysis, a genetic algorithm, and a Bayes classifier. The one-dimensional discrete wavelet transform (DWT) is normally used to reduce the dimensionality of biomedical data. In this study, the one-dimensional continuous wavelet transform (CWT) was used to extract features from colorectal cancer data. The one-dimensional CWT cannot reduce the dimensionality of the data, but it captures features that the DWT misses and is therefore complementary to it. A genetic algorithm was run on the extracted wavelet coefficients to select the optimized features, using a Bayes classifier to build its fitness function. The corresponding protein markers were located based on the positions of the optimized features. Kaplan-Meier curves and a Cox regression model were used to evaluate the performance of the selected biomarkers. Experiments were conducted on a colorectal cancer dataset, and several significant biomarkers were detected. A new protein biomarker, CD46, was found to be significantly associated with survival time.
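The Kaplan-Meier curve used for evaluation above estimates the survival function as a running product: at each observed event time, the current survival probability is multiplied by (1 - deaths / number at risk). The following is a minimal sketch of that standard estimator, not the paper's code; the function name `kaplan_meier` and the toy follow-up data are hypothetical.

```python
def kaplan_meier(times, events):
    """Return [(t, S(t))] for each distinct time at which a death was observed.

    times:  follow-up time for each patient
    events: 1 if death was observed at that time, 0 if the patient was censored
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        leaving = 0
        # Group all patients who share the same follow-up time t.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            leaving += 1
            i += 1
        if deaths > 0:
            survival *= 1.0 - deaths / n_at_risk
            curve.append((t, survival))
        # Deaths and censored patients both leave the risk set after time t.
        n_at_risk -= leaving
    return curve


# Toy, made-up data: five patients, two of them censored.
for t, s in kaplan_meier([1, 2, 2, 3, 4], [1, 1, 0, 1, 0]):
    print(t, round(s, 3))
```

To compare two biomarker-defined groups, one would compute a curve per group and test the difference, for example with a log-rank test, which is not shown here.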
A comparative analysis of classifiers in cancer prediction using multiple data mining techniques
In recent years, the application of data mining methods in the health industry has received increased attention from both health professionals and scholars. This paper presents a data mining framework for detecting breast cancer based on real data from one of Iran's hospitals by applying association rules and the most commonly used classifiers. The former were adopted for reducing the size of the datasets, while the latter were chosen for cancer prediction. A k-fold cross-validation procedure was included for evaluating the performance of the proposed classifiers. Among the six classifiers used in this paper, the support vector machine achieved the best results, with an accuracy of 93%. It is worth mentioning that the proposed approach can be applied to detecting other diseases as well.
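The k-fold cross-validation procedure mentioned above can be sketched as follows: the samples are split into k disjoint folds, each fold is held out once for testing while the remaining folds are used for training, and the correct predictions are pooled into one accuracy. To keep the example self-contained, a nearest-centroid classifier stands in for the classifiers compared in the paper (it is not an SVM); all function names and the toy data are hypothetical.

```python
def k_fold_indices(n_samples, k):
    """Split indices 0..n_samples-1 into k disjoint, nearly equal folds."""
    return [list(range(i, n_samples, k)) for i in range(k)]


def fit_centroids(X, y):
    """Compute the mean feature vector of each class."""
    centroids = {}
    for label in sorted(set(y)):
        rows = [x for x, lab in zip(X, y) if lab == label]
        centroids[label] = [sum(col) / len(rows) for col in zip(*rows)]
    return centroids


def predict(centroids, x):
    """Assign x to the class whose centroid is closest (squared Euclidean distance)."""
    def sq_dist(c):
        return sum((a - b) ** 2 for a, b in zip(x, c))
    return min(centroids, key=lambda label: sq_dist(centroids[label]))


def cross_val_accuracy(X, y, k):
    """Pooled accuracy over k held-out folds."""
    correct = 0
    for test_idx in k_fold_indices(len(X), k):
        held_out = set(test_idx)
        train_X = [X[i] for i in range(len(X)) if i not in held_out]
        train_y = [y[i] for i in range(len(y)) if i not in held_out]
        model = fit_centroids(train_X, train_y)
        correct += sum(predict(model, X[i]) == y[i] for i in test_idx)
    return correct / len(X)


# Toy, made-up data with two well-separated classes.
X = [[0.0, 0.1], [0.2, 0.0], [0.1, 0.3], [5.0, 5.1], [5.2, 4.9], [4.8, 5.3]]
y = [0, 0, 0, 1, 1, 1]
print(cross_val_accuracy(X, y, k=3))
```

The interleaved fold assignment is a simplification; in practice the samples are usually shuffled, and often stratified by class, before splitting.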
Mass spectrometry data mining for cancer detection
Early detection of cancer is crucial for successful intervention strategies. Mass spectrometry-based high-throughput proteomics is recognized as a major breakthrough in cancer detection. Many machine learning methods have been used to construct classifiers based on mass spectrometry data for discriminating between cancer stages, yet the classifiers so constructed generally lack biological interpretability. To better assist clinical uses, a key step is to discover "biomarker signature profiles", i.e. combinations of a small number of protein biomarkers strongly discriminating between cancer states.
This dissertation introduces two innovative algorithms to automatically search for a signature and to construct a high-performance signature-based classifier for cancer discrimination tasks based on mass spectrometry data, such as data acquired by MALDI or SELDI techniques. Our first algorithm assumes that homogeneous groups of mass spectra can be modeled by (unknown) Gibbs distributions to generate an optimal signature and an associated signature-based classifier by robust log-likelihood analysis; our second algorithm uses a stochastic optimization algorithm to search for two lists of biomarkers, and then constructs a signature-based classifier.
To support these two algorithms theoretically, this dissertation also studies the empirical probability distributions of mass spectrometry data and implements the actual fitting of Markov random fields to these high-dimensional distributions. We have validated our two signature discovery algorithms on several mass spectrometry datasets related to ovarian cancer and colorectal cancer patient groups. For these cancer discrimination tasks, our algorithms have yielded better classification performance than existing machine learning algorithms and, in addition, have generated more interpretable explicit signatures.
Superhydrophobic lab-on-chip measures secretome protonation state and provides a personalized risk assessment of sporadic tumour
Secretome of primary cultures is an accessible source of biological markers compared to more complex and less decipherable
mixtures such as serum or plasma. The protonation state (PS) of secretome reflects the metabolism of cells and can be used for
early cancer detection. Here, we demonstrate a superhydrophobic organic electrochemical device that measures PS in a drop of
secretome derived from liquid biopsies. Using data from the sensor and principal component analysis (PCA), we developed
algorithms able to efficiently discriminate tumour patients from non-tumour patients. We then validated the results using mass
spectrometry and biochemical analysis of samples. For the 36 patients across three independent cohorts, the method identified
tumour patients with high sensitivity and identification as high as 100% (no false positives), and declared subjects with
intermediate values of PS to be at risk of sporadic cancer onset. This assay could affect cancer risk management and individual
diagnosis, and/or help clarify risk in healthy populations.
Comparison of metaheuristic strategies for peakbin selection in proteomic mass spectrometry data
Mass spectrometry (MS) data provide a promising strategy for biomarker discovery. For this purpose, the detection of relevant peakbins in MS data is currently under intense research. Data from mass spectrometry are challenging to analyze because of their high dimensionality and the generally low number of samples available. To tackle this problem, the scientific community is becoming increasingly interested in applying feature subset selection techniques based on specialized machine learning algorithms. In this paper, we present a performance comparison of several metaheuristics: best first (BF), genetic algorithm (GA), scatter search (SS) and variable neighborhood search (VNS). With the exception of GA, this is the first time these algorithms have been applied to detect relevant peakbins in MS data. All these metaheuristic searches are embedded in two different filter and wrapper schemes coupled with Naive Bayes and SVM classifiers.
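In a wrapper scheme, each candidate subset of peakbins is scored by training and evaluating a classifier on just those features, and a search strategy decides which subsets to try. The sketch below uses plain greedy forward selection, which is simpler than any of the four metaheuristics compared in the paper, and a nearest-centroid classifier with leave-one-out evaluation in place of Naive Bayes or SVM. All names and the toy data are hypothetical, and each class is assumed to have at least two samples.

```python
def loo_accuracy(X, y, features):
    """Leave-one-out accuracy of a nearest-centroid classifier on the given feature subset."""
    correct = 0
    for i in range(len(X)):
        centroids = {}
        for label in sorted(set(y)):
            rows = [[X[j][f] for f in features]
                    for j in range(len(X)) if j != i and y[j] == label]
            centroids[label] = [sum(col) / len(rows) for col in zip(*rows)]
        point = [X[i][f] for f in features]
        pred = min(
            centroids,
            key=lambda lab: sum((a - b) ** 2 for a, b in zip(point, centroids[lab])),
        )
        correct += pred == y[i]
    return correct / len(X)


def greedy_forward_selection(X, y, max_features):
    """Add one feature at a time while the wrapper score keeps improving."""
    selected, best_score = [], 0.0
    candidates = list(range(len(X[0])))
    while candidates and len(selected) < max_features:
        scored = [(loo_accuracy(X, y, selected + [f]), f) for f in candidates]
        # Highest score wins; ties go to the lowest feature index.
        score, feature = max(scored, key=lambda pair: (pair[0], -pair[1]))
        if score <= best_score:
            break  # no candidate improves the current subset
        selected.append(feature)
        candidates.remove(feature)
        best_score = score
    return selected, best_score


# Toy, made-up peakbin intensities: only feature 1 carries class information.
X = [[0.5, 1.0, 3.0], [0.1, 1.2, 1.0], [0.9, 0.8, 2.0],
     [0.4, 4.0, 2.5], [0.2, 4.1, 1.5], [0.8, 3.9, 3.0]]
y = [0, 0, 0, 1, 1, 1]
print(greedy_forward_selection(X, y, max_features=2))
```

A filter scheme would instead rank features with a classifier-independent statistic, which is cheaper but ignores interactions between peakbins.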
ENDOMET database – A means to identify novel diagnostic and prognostic tools for endometriosis
Endometriosis is a common, benign, hormone-dependent inflammatory gynecological disease that affects women of reproductive age and has a considerable economic impact on healthcare systems. Symptoms include intense menstrual pain, persistent pelvic pain, and infertility. It is defined by the presence of endometrium-like tissue in ectopic locations outside the uterine cavity and by inflammation in the peritoneal cavity. Endometriosis has a multifactorial etiology that, despite extensive research, is still poorly understood. The diagnostic delay from the onset of the disease to a conclusive diagnosis is 7–12 years. There is no known cure, although symptoms can be improved with hormonal medications (which often have multiple side effects and prevent pregnancy) or through surgery, which carries its own risks. Current non-invasive diagnostic tools are not sufficiently dependable, and a definite diagnosis is achieved through laparoscopy or laparotomy.
This study was based on two prospective cohorts: The ENDOMET study, including 137 endometriosis patients scheduled for surgery and 62 healthy women, and PROENDO that included 138 endometriosis patients and 33 healthy women.
Our long-term goal with the current study was to support the discovery of innovative new tools for efficient diagnosis of endometriosis as well as tools to further understand the etiology and pathogenesis of the disease. We set about achieving this goal by creating a database, EndometDB, based on a relational data model and implemented in PostgreSQL. The database allows, for example, the exploration of global genome-wide expression patterns in the peritoneum, endometrium, and endometriosis lesions of endometriosis patients as well as in the peritoneum and endometrium of healthy control women of reproductive age. The data collected in the EndometDB were also used for the development and validation of a symptom- and biomarker-based predictive model designed for risk evaluation and early prediction of endometriosis without invasive diagnostic methods. Using the data in the EndometDB, we discovered that, compared with the eutopic endometrium, the WNT signaling pathway is one of the molecular pathways that undergo strong changes in endometriosis. We then evaluated the potential role of secreted frizzled-related protein 2 (SFRP-2, a WNT signaling pathway modulator) in improving endometriosis lesion border detection. SFRP-2 expression visualizes the lesion better than previously used markers and can be used to better define lesion size and to confirm that the surgical excision of the lesions is complete.
ENDOMET database – A means to identify a new diagnostic and predictive tool for endometriosis
Endometriosis is a common, benign, hormone-dependent inflammatory gynecological disease of women of reproductive age that places a significant burden on the healthcare system. Symptoms of endometriosis include severe menstrual pain, persistent pelvic pain, and infertility. The disease is defined as the presence of endometrium-like tissue outside the uterus together with associated inflammation of the peritoneum. The etiology of endometriosis is multifactorial and, despite extensive research, still poorly understood. The time from disease onset to a final diagnosis is often as long as 7–12 years. There is no known cure, but symptoms can be relieved, for example, with hormonal medications (which often have many side effects and prevent pregnancy) or with surgery, which carries its own known risks. Current non-invasive diagnostic tools are not reliable enough to identify the disease, and a definite diagnosis of endometriosis is reached by laparoscopy or laparotomy.
This study was based on two prospective cohorts: the ENDOMET study, which included 137 endometriosis patients and 62 healthy women, and the PROENDO study, which included 138 endometriosis patients and 33 healthy women.
Our long-term goal in this study was to find new tools for diagnosing endometriosis and to understand the etiology and pathogenesis of endometriosis. In the first phase, we created the EndometDB database using PostgreSQL. With this database, which has been partly released for open use, it is possible to study the genome, for example the expression of all known genes in the peritoneum, the endometrium, and the endometriosis lesions of endometriosis patients. The data collected in the EndometDB database were used to develop and validate a symptom- and biomarker-based predictive model. The model produces a risk assessment for the early prediction of endometriosis without laparoscopy. Using the data in the EndometDB database, we found that strong changes in gene expression occurred in endometriosis tissue, particularly in genes involved in the regulation of the WNT signaling pathway. The key finding was that expression of the SFRP-2 protein was markedly elevated in endometriosis tissue and that immunohistochemical staining of the SFRP-2 protein distinguishes diseased endometriosis tissue from healthy tissue better than previously used markers. The method can thus be used to determine the extent of the diseased tissue and, where necessary, to show that surgery has removed all of the diseased tissue.