28 research outputs found

    Local object patterns for representation and classification of colon tissue images

    This paper presents a new approach for the effective representation and classification of images of histopathological colon tissues stained with hematoxylin and eosin. In this approach, we propose to decompose a tissue image into its histological components and introduce a set of new texture descriptors, which we call local object patterns, on these components to model their composition within a tissue. We define these descriptors using the idea of local binary patterns, which quantify a pixel by constructing a binary string based on relative intensities of its neighbors. However, as opposed to pixel-level local binary patterns, we define our local object pattern descriptors at the component level to quantify a component. To this end, we specify neighborhoods with different locality ranges and encode spatial arrangements of the components within the specified local neighborhoods by generating strings. We then extract our texture descriptors from these strings to characterize histological components and construct the bag-of-words representation of an image from the characterized components. Working on microscopic images of colon tissues, our experiments reveal that the use of these component-level texture descriptors results in higher classification accuracies than the previous textural approaches. © 2013 IEEE
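
The pixel-level local binary pattern idea that this abstract builds on can be sketched in a few lines of Python. This is only an illustration of the classic pixel-level descriptor, not the authors' component-level local object patterns. The 3x3 neighborhood, the clockwise bit order, the `>=` comparison, and the name `lbp_code` are all assumptions chosen for the example.

```python
def lbp_code(image, row, col):
    """Return the 8-bit local binary pattern code of the pixel at (row, col).

    `image` is a list of rows of grayscale intensities; (row, col) must not
    lie on the image border, because all eight neighbors are read.
    """
    center = image[row][col]
    # Neighbors are visited clockwise, starting at the top-left corner.
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    code = 0
    for bit, (dr, dc) in enumerate(offsets):
        # A neighbor at least as bright as the center sets its bit to 1.
        if image[row + dr][col + dc] >= center:
            code |= 1 << bit
    return code


patch = [
    [6, 5, 2],
    [7, 6, 1],
    [9, 8, 7],
]
print(lbp_code(patch, 1, 1))  # bits 0, 4, 5, 6, 7 are set -> 241
```

A histogram of such codes over an image region is the usual texture descriptor; the abstract describes moving the same encoding idea from pixels to histological components.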

    Color and morphological features extraction and nuclei classification in tissue samples of colorectal cancer

    Cancer is an important public health problem and the third leading cause of death in North America. Among the highest-impact types of cancer are colorectal, breast, lung, and prostate. This thesis addresses feature extraction using different artificial intelligence algorithms that provide distinct solutions for the purpose of Computer-Aided Diagnosis (CAD). For example, classification algorithms are employed to identify histological structures, such as lymphocytes, cancer-cell nuclei and glands, from features like existence, extension or shape. The morphological aspect of these structures indicates the degree of severity of the related disease. In this paper, we use a large dataset of 5000 images to classify eight different tissue types in the case of colorectal cancer. We compare results with another dataset. We perform image segmentation and extract statistical information about the area, perimeter, circularity, eccentricity and solidity of the interest points in the image. Finally, we use and compare four popular machine learning techniques, i.e., Naive Bayes, Random Forest, Support Vector Machine and Multilayer Perceptron, to classify and to improve the precision of category assignment. The performance of each algorithm was measured using three types of metrics: precision, recall and F1-score, a contribution that complements the existing literature quantitatively. The large number of images has helped us to circumvent the overfitting and reproducibility problems. The main contribution is the use of new characteristics different from those already studied: this work investigates the color and morphological characteristics in the images that may be useful for performing tissue classification in colorectal cancer histology.
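
To make the morphological features named above concrete, here is a minimal Python sketch that computes area, perimeter and circularity for a segmented region given as a polygon outline. The polygon representation, the function names, and the common definition circularity = 4πA/P² are assumptions for illustration; the abstract does not state how the thesis computes these quantities.

```python
import math


def polygon_area(points):
    """Area of a simple polygon via the shoelace formula."""
    total = 0.0
    for (x1, y1), (x2, y2) in zip(points, points[1:] + points[:1]):
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


def polygon_perimeter(points):
    """Sum of the edge lengths of the closed polygon."""
    return sum(math.dist(p, q) for p, q in zip(points, points[1:] + points[:1]))


def circularity(points):
    """4*pi*A / P**2: equals 1 for a circle and is smaller for other shapes."""
    return 4.0 * math.pi * polygon_area(points) / polygon_perimeter(points) ** 2


# Hypothetical outline of a segmented nucleus: a square with side 2.
square = [(0, 0), (2, 0), (2, 2), (0, 2)]
print(polygon_area(square), polygon_perimeter(square), circularity(square))
```

Values like these, collected per region, would form the feature vectors passed to the classifiers the abstract lists.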

    Inference for a General Class of Models for Recurrent Events with application to cancer data

    Survival analysis arises when we are interested in studying the statistical properties of a variable that describes the time to a single event. In some situations, the event of interest occurs repeatedly in the same individual, such as when a patient diagnosed with cancer relapses over time or when a person is repeatedly readmitted to a hospital. In this case we speak of survival analysis with recurrent events. The recurrent nature of the events makes it necessary to use techniques other than those used to analyze survival times for a single event. In this dissertation we deal with this type of analysis, motivated mainly by two cancer research studies that were created specially for this work. One is a study of hospital readmissions in patients diagnosed with colorectal cancer, while the other deals with patients diagnosed with non-Hodgkin's lymphoma. The latter study is especially relevant because we include information about the effect of treatment after relapses, and some authors have stated the need to develop a specific model for relapsing patients in cancer settings.
Our first contribution, to univariate analysis, is a method to construct confidence intervals for the median survival time in recurrent event settings. Two approaches are developed. One is based on asymptotic variances derived from two existing estimators of the survival function, while the other uses bootstrap techniques. The latter approach is useful because one of the estimators used does not yet have a closed form for its variance. The new contribution of this work is the examination of how to bootstrap in the presence of recurrent event data arising from a sum-quota accrual scheme, and of the informativeness of the right-censoring mechanism. Weak convergence is proved, and asymptotic confidence intervals are built according to this result.
Multivariate analysis, on the other hand, addresses the problem of how to incorporate more than one covariate in the analysis. In recurrent event settings we also need to take into account that, apart from covariates, the heterogeneity, the number of occurrences or, especially, the effect of interventions after reoccurrences may modify the probability of observing a new event in a patient. This last point is very important because it has not yet been taken into consideration in biomedical studies. To address this problem, we base our work on a new model for recurrent events proposed by Peña and Hollander (2004). Our contribution is to adapt this model to cancer relapses; the effect of interventions is represented by an effective age process acting on the baseline hazard function. We call this model the dynamic cancer model.
We also address the problem of estimating the parameters of the general class of models for recurrent events proposed by Peña and Hollander (2004), of which the dynamic cancer model may be seen as a special case. Two general approaches are developed. The first is based on semiparametric inference, where the baseline hazard function is specified nonparametrically and the EM algorithm is used. The second is a penalized likelihood approach in which two different strategies are adopted. One penalizes the partial likelihood, with the penalty bearing on the regression coefficients. The other penalizes the full likelihood and gives a nonparametric estimate of the baseline hazard function using a continuous estimator; the solution is then approximated using splines. The main advantage of this method is that we can easily obtain smooth estimates of the hazard function and an estimate of the variance of the frailty variance, which is not possible with the other approaches. In addition, this last approach has a considerably lower computational cost than the others. The results obtained using the dynamic cancer model on real data sets indicate that the flexibility of this method provides a safeguard for analyzing data in which patients relapse over time and interventions are performed after tumoral reoccurrences.
The computational aspect is another important contribution of this work to recurrent event settings. We have developed three R packages called survrec, gcmrec, and frailtypack that are available on CRAN, http://www.r-project.org/. These packages allow users, respectively, to compute the median survival time and its confidence intervals, to estimate the parameters of the Peña and Hollander model (in particular of the dynamic cancer model) using the EM algorithm, and to estimate these parameters using the penalized approach.
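
The bootstrap approach to a confidence interval for the median mentioned in this abstract can be illustrated with a deliberately simplified Python sketch. It assumes fully observed, single-event times and uses the plain percentile bootstrap; it does not handle censoring, recurrent events, or the sum-quota accrual scheme studied in the thesis, and it is not the algorithm implemented in survrec. The function name, the number of resamples and the sample data are hypothetical.

```python
import random
import statistics


def bootstrap_median_ci(times, n_resamples=2000, alpha=0.05, seed=0):
    """Percentile-bootstrap confidence interval for the median of `times`."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    # Resample with replacement and record the median of each resample.
    medians = sorted(
        statistics.median(rng.choices(times, k=len(times)))
        for _ in range(n_resamples)
    )
    # Read the interval off the empirical quantiles of the bootstrap medians.
    lower = medians[int((alpha / 2) * n_resamples)]
    upper = medians[int((1 - alpha / 2) * n_resamples) - 1]
    return lower, upper


# Hypothetical, uncensored times to event (for example, in months).
sample_times = [3, 5, 7, 8, 12, 13, 17, 21, 24, 30]
print(statistics.median(sample_times), bootstrap_median_ci(sample_times))
```

With recurrent events one would typically resample whole patients rather than individual event times, which is part of what makes the setting in the thesis harder than this sketch.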

    Feature extraction to aid disease detection and assessment of disease progression in CT and MR colonography

    Computed tomographic colonography (CTC) is a technique employed to examine the whole colon for cancers and premalignant adenomas (polyps). Oral preparation is taken to fully cleanse the colon, and gas insufflation maximises the attenuation contrast between the endoluminal colon surface and the lumen. The procedure is performed routinely with the patient both prone and supine to redistribute gas and residue. This helps to differentiate fixed colonic pathology from mobile faecal residue and also helps discover pathology occluded by retained fluid or luminal collapse. Matching corresponding endoluminal surface locations with the patient in the prone and supine positions is therefore an essential aspect of interpretation by radiologists; however, interpretation can be difficult and time consuming due to the considerable colonic deformations that occur during repositioning. Hence, a method for automated registration has the potential to improve efficiency and diagnostic accuracy. I propose a novel method to establish correspondence between prone and supine CT colonography acquisitions automatically. The problem is first simplified by detecting haustral folds, which are elongated ridge-like endoluminal structures that can be identified by curvature-based measurements. These are subsequently matched using appearance-based features and their relative geometric relationships. It is shown that these matches can be used to find correspondence along the full length of the colon, but may also be used in conjunction with other registration methods to achieve a more robust and accurate result, explicitly addressing the problem of colonic collapse. The potential clinical value of this method has been assessed in an external clinical validation, and the application to follow-up CTC surveillance has been investigated.
MRI has recently been applied as a tool to quantitatively evaluate the response to therapy in patients with Crohn's disease, and is the preferred choice for repeated imaging. A primary biomarker for this evaluation is the variation in bowel wall thickness on changing from the active phase of the disease to remission; however, a poor level of interobserver agreement in measured thickness is reported, and therefore a system for accurate, robust and reproducible measurements is desirable. I propose a novel method which automatically tracks sections of colon by estimating the positions of elliptical cross sections. Subsequently, the positions of the inner and outer bowel walls are estimated from image gradient information, so that a thickness measurement can be extracted.

    Predicting the potential health and economic impact of a sugary drink tax in Canada: a modelling study

    BACKGROUND Consistent with global trends, high body mass index (BMI) and high blood glucose have risen at rapid rates among Canadians. The consumption of sugar-sweetened beverages (SSBs) is a well-established and important dietary risk factor for these conditions. A growing body of research suggests that SSB taxes can achieve meaningful health impacts by shifting dietary preferences. However, there are several key literature gaps, including no estimations of the potential benefit of taxing ‘sugary drinks’, a beverage category that includes both SSBs and 100% juice, which is high in sugar. In addition, no published studies have simulated a tax on SSBs or sugary drinks for the Canadian population. PURPOSE The proposed study’s objectives were: 1) to investigate Canadians’ consumption of sugary drink types, and differences by socio-economic characteristics; and 2) to estimate the potential impact among the Canadian population of a simulated national tax on SSBs and a simulated tax on sugary drinks. METHODS The study was conducted in two components. First, sugary drink intake (volume and energy) was estimated using 24-hour dietary recall data from the 2015 Canadian Community Health Survey – Nutrition (respondents ages >1 year; final sample N=20,176). For 100% juice and ‘total SSBs’ (which included 15 beverage types), intake was reported overall and by socio-economic measures: sex, age, ethnicity, income, province, and BMI category. Student’s t-test and Wald F-test tested for differences among population sub-groups. SSB and sugary drink intakes were also estimated for inclusion in the study’s second component: a simulation of a sugary drink tax. The impact of the tax intervention was estimated using a proportional multi-state life table-based Markov model adapted to simulate the 2015 Canadian adult population. 
The model applied 10%, 20%, and 30% ad valorem taxes on SSBs and sugary drinks, and compared two populations: one with a tax intervention and one without. The model simulated the effect of energy intake from beverages on 19 diseases mediated by body mass, and the direct effects of intake on type 2 diabetes, accounting for beverage substitution. Sensitivity analyses examined key assumptions and Monte Carlo simulation assessed uncertainty. RESULTS A large proportion of respondents reported consuming 100% juice (children, 39.3%; adults, 22.8%) or some type of SSB (children, 53.0%; adults, 40.8%) during the previous 24-hour period. In 2015, each Canadian consumed an average of 74.3 ml (33.7 kcal) of 100% juice and 203.6 ml (98.7 kcal) of SSBs per day. 100% juice was consumed more than any other sugary drink, followed closely by regular carbonated soft drinks. Compared to females, males’ consumption was significantly higher for 100% juice (37% greater volume) and total SSBs (54% greater). Children consumed more sugary drinks on average each day than adults: nearly double the volume of 100% juice (86% more) and 14% more SSBs. Beverage intake differed by ethnicity, province, and BMI category, but not by income quintile. For the simulated taxes, there were sizeable differences in the impacts of an SSB tax versus a sugary drinks tax: prevalence of overweight/obesity changed from 63.3% to 61.7% vs 61.0%; the type 2 diabetes incidence rate decreased by 5.9% vs 7.4%. Over a 25-year period, compared to an SSB tax, a sugary drinks tax produced 47% more averted disability-adjusted life years (DALYs; 314,326 versus 460,812), 45% greater health care cost savings (7.5 billion vs 10.9 billion Canadian dollars), and 37% more annual tax revenue (1.0 billion CAD vs 1.4 billion CAD). CONCLUSIONS Consumption of sugary drinks remains an important disease risk factor among the Canadian population. 
Average intake of sugary drinks in 2015 is lower than 2004 estimates, but remains high, especially among children and youth. The current study suggests that a beverage tax in Canada has the potential to substantially reduce the health burden while generating health care savings and tax revenue, especially if 100% juice is among taxed beverages. Given Canadians’ high 100% juice consumption, the mounting evidence on adverse effects associated with free sugar consumption, and the role of 100% juice as a substitute beverage to SSBs, there is a strong rationale for its inclusion as a taxed beverage. Future studies could examine the potential impact of a ‘tiered’ tax based on beverage sugar content, as well as the effects of a tax relative to other nutrition interventions

    Hematological image analysis for acute lymphoblastic leukemia detection and classification

    Microscopic analysis of peripheral blood smears is a critical step in the detection of leukemia. However, this type of light microscopic assessment is time consuming, inherently subjective, and governed by the hematopathologist's clinical acumen and experience. To circumvent such problems, an efficient computer-aided methodology for quantitative analysis of peripheral blood samples needs to be developed. In this thesis, efforts are therefore made to devise methodologies for automated detection and subclassification of Acute Lymphoblastic Leukemia (ALL) using image processing and machine learning methods. The choice of an appropriate segmentation scheme plays a vital role in the automated disease recognition process. Accordingly, novel schemes have been proposed to segment normal mature lymphocyte and malignant lymphoblast images into their constituent morphological regions. To make the proposed schemes viable from a practical and real-time standpoint, the segmentation problem is addressed in both supervised and unsupervised frameworks. The proposed methods are based on neural networks, feature space clustering, and Markov random field modeling, where the segmentation problem is formulated as a pixel classification, pixel clustering, and pixel labeling problem, respectively. A comprehensive validation analysis is presented to evaluate the performance of four proposed lymphocyte image segmentation schemes against manual segmentation results provided by a panel of hematopathologists. It is observed that the morphological components of normal and malignant lymphocytes differ significantly. To automatically recognize lymphoblasts and detect ALL in peripheral blood samples, an efficient methodology is proposed. Morphological, textural and color features are extracted from the segmented nucleus and cytoplasm regions of the lymphocyte images.
An ensemble of classifiers, denoted EOC3 and comprising three classifiers, shows the highest classification accuracy, 94.73%, in comparison to its individual members. The subclassification of ALL based on French–American–British (FAB) and World Health Organization (WHO) criteria is essential for prognosis and treatment planning. Accordingly, two independent methodologies are proposed for automated classification of malignant lymphocyte (lymphoblast) images based on morphology and phenotype. These methods include lymphoblast image segmentation, nucleus and cytoplasm feature extraction, and efficient classification.
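
The abstract does not say how the three members of the ensemble are combined. One common rule is simple majority voting, sketched below in Python purely as an assumption; the function name, the labels and the example predictions are hypothetical and are not taken from the thesis.

```python
from collections import Counter


def majority_vote(predictions):
    """Combine per-classifier label lists into one label per sample.

    `predictions` holds one list of labels per classifier, all of equal
    length. With three classifiers and two classes there are no ties.
    """
    return [Counter(votes).most_common(1)[0][0] for votes in zip(*predictions)]


# Hypothetical outputs of three classifiers on the same three cell images.
classifier_a = ["blast", "normal", "blast"]
classifier_b = ["blast", "blast", "normal"]
classifier_c = ["normal", "normal", "blast"]

print(majority_vote([classifier_a, classifier_b, classifier_c]))
# -> ['blast', 'normal', 'blast']
```

An ensemble of this kind can outperform its members when their errors fall on different samples, which is consistent with the accuracy comparison reported in the abstract.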