43 research outputs found

    Maximum margin learning of t-SPNs for cell classification with filtered input

    An algorithm based on a deep probabilistic architecture referred to as a tree-structured sum-product network (t-SPN) is considered for cell classification. The t-SPN is constructed such that the unnormalized probability is represented as conditional probabilities of a subset of the most similar cell classes. The constructed t-SPN architecture is learned by maximizing the margin, which is the difference in the conditional probability between the true and the most competitive false label. To enhance the generalization ability of the architecture, L2-regularization (REG) is considered along with the maximum margin (MM) criterion in the learning process. To highlight cell features, this paper investigates the effectiveness of two generic high-pass filters: the ideal high-pass filter and the Laplacian of Gaussian (LOG) filter. On both the HEp-2 and Feulgen benchmark datasets, the t-SPN architecture learned with the max-margin criterion and regularization produced the highest accuracy rate among state-of-the-art algorithms, including convolutional neural network (CNN) based approaches. The ideal high-pass filter was more effective on the HEp-2 dataset, which is based on immunofluorescence staining, while the LOG filter was more effective on the Feulgen dataset, which is based on Feulgen staining.
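    The abstract does not spell out the training objective, but the margin it describes can be illustrated with a hinge-style multiclass loss on per-class unnormalized log-probabilities. The sketch below is a minimal illustration under that assumption; the function name, the margin constant, and the schematic L2 term are illustrative, not the paper's actual t-SPN training code.

```python
import numpy as np

def max_margin_loss(log_probs, true_label, margin=1.0):
    """Hinge-style max-margin loss on per-class unnormalized log-probabilities.

    The margin is the gap between the true class and the most competitive
    false class; the actual t-SPN objective and its L2-regularized (REG)
    form may differ in detail.
    """
    competitors = np.delete(log_probs, true_label)
    gap = log_probs[true_label] - competitors.max()
    return max(0.0, margin - gap)

# Illustrative use: 4 cell classes, class 2 is the ground truth.
scores = np.array([1.2, 0.3, 2.0, 1.8])
loss = max_margin_loss(scores, true_label=2)   # small gap to class 3 -> positive loss
# In practice the L2 term applies to the SPN weights; shown schematically here:
loss_with_reg = loss + 1e-3 * np.sum(scores ** 2)
```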

    Bayesian Learning of Sum-Product Networks

    Sum-product networks (SPNs) are flexible density estimators and have received significant attention due to their attractive inference properties. While parameter learning in SPNs is well developed, structure learning leaves something to be desired: even though there is a plethora of SPN structure learners, most of them are somewhat ad hoc and based on intuition rather than a clear learning principle. In this paper, we introduce a well-principled Bayesian framework for SPN structure learning. First, we decompose the problem into i) laying out a computational graph, and ii) learning the so-called scope function over the graph. The first is rather unproblematic and akin to neural network architecture validation. The second represents the effective structure of the SPN and needs to respect the usual structural constraints in SPNs, i.e. completeness and decomposability. While representing and learning the scope function is somewhat involved in general, in this paper we propose a natural parametrisation for an important and widely used special case of SPNs. These structural parameters are incorporated into a Bayesian model, such that simultaneous structure and parameter learning is cast into monolithic Bayesian posterior inference. In various experiments, our Bayesian SPNs often improve test likelihoods over greedy SPN learners. Further, since the Bayesian framework protects against overfitting, we can evaluate hyper-parameters directly on the Bayesian model score, waiving the need for a separate validation set, which is especially beneficial in low-data regimes. Bayesian SPNs can be applied to heterogeneous domains and can easily be extended to nonparametric formulations. Moreover, our Bayesian approach is the first which consistently and robustly learns SPN structures under missing data. (Comment: NeurIPS 2019; see conference page for supplement.)
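    As a side note on the structural constraints mentioned above (completeness and decomposability), the hand-built toy SPN over two binary variables below shows how they guarantee a normalized distribution. It is only an illustration of the constraints, not the paper's Bayesian structure learner.

```python
import numpy as np

def leaf(p, x):
    """Log-probability of a Bernoulli leaf with success probability p."""
    return np.log(p if x == 1 else 1.0 - p)

def spn_log_prob(x1, x2):
    # Product nodes: children have disjoint scopes ({X1} and {X2}) -> decomposability.
    prod1 = leaf(0.8, x1) + leaf(0.3, x2)
    prod2 = leaf(0.2, x1) + leaf(0.7, x2)
    # Sum node: children share the full scope {X1, X2} -> completeness.
    w = np.array([0.6, 0.4])
    return np.logaddexp(np.log(w[0]) + prod1, np.log(w[1]) + prod2)

# Because both constraints hold, the SPN defines a normalized distribution:
total = sum(np.exp(spn_log_prob(a, b)) for a in (0, 1) for b in (0, 1))
assert abs(total - 1.0) < 1e-9
```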

    Preprocessing reference sensor pattern noise via spectrum equalization

    Although sensor pattern noise (SPN) has been proven to be an effective means to uniquely identify digital cameras, some non-unique artifacts, shared amongst cameras that undergo the same or similar in-camera processing procedures, often give rise to false identifications. Therefore, it is desirable and necessary to suppress these unwanted artifacts so as to improve identification accuracy and reliability. In this work, we propose a novel preprocessing approach for attenuating the influence of the non-unique artifacts on the reference SPN to reduce the false identification rate. Specifically, we equalize the magnitude spectrum of the reference SPN by detecting and suppressing the peaks according to the local characteristics, aiming at removing the interfering periodic artifacts. Combined with six SPN extraction or enhancement methods, our proposed Spectrum Equalization Algorithm (SEA) is evaluated on the Dresden image database as well as our own database, and compared with state-of-the-art preprocessing schemes. Experimental results indicate that the proposed procedure outperforms, or at least performs comparably to, the existing methods in terms of the overall ROC curve and the kappa statistic computed from a confusion matrix, and tends to be more resistant to JPEG compression for medium and small image blocks.
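    A rough sketch of the general idea of spectrum equalization is given below, assuming the reference SPN is a 2-D array: periodic non-unique artifacts show up as peaks in the magnitude spectrum, which can be clipped before transforming back. The global median-based clipping here is only a stand-in for the locally adaptive peak detection the paper actually proposes.

```python
import numpy as np

def equalize_reference_spn(spn, peak_factor=3.0):
    """Clip peaks in the magnitude spectrum of a reference SPN (2-D array).

    Dominant peaks typically correspond to periodic, non-unique artifacts;
    the published SEA detects them with locally adaptive criteria rather
    than the global median heuristic used here.
    """
    F = np.fft.fft2(spn)
    mag, phase = np.abs(F), np.angle(F)
    mag = np.minimum(mag, peak_factor * np.median(mag))  # suppress dominant peaks
    return np.real(np.fft.ifft2(mag * np.exp(1j * phase)))

cleaned = equalize_reference_spn(np.random.randn(256, 256))
```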

    Heterogeneidad tumoral en imágenes PET-CT

    Unpublished doctoral thesis, Universidad Complutense de Madrid, Facultad de Ciencias Físicas, Departamento de Estructura de la Materia, Física Térmica y Electrónica; defended on 28/01/2021. Cancer is a leading cause of morbidity and mortality [1]. The most frequent cancers worldwide are non-small cell lung carcinoma (NSCLC) and breast cancer [2], and their management is a challenging task [3]. Tumor diagnosis is usually made through biopsy [4]. However, medical imaging also plays an important role in diagnosis, staging, response to treatment, and recurrence assessment [5]. Tumor heterogeneity is recognized to be involved in cancer treatment failure, with worse clinical outcomes for highly heterogeneous tumors [6,7]. It leads to the existence of tumor sub-regions with different biological behavior (some more aggressive and treatment-resistant than others) [8-10], which are characterized by different patterns of vascularization, vessel permeability, metabolism, cell proliferation, cell death, and other features that can be measured by modern medical imaging techniques, including positron emission tomography/computed tomography (PET/CT) [10-12]. Thus, the assessment of tumor heterogeneity through medical images could allow the prediction of therapy response and long-term outcomes of patients with cancer [13]. PET/CT has become essential in oncology [14,15] and is usually evaluated through semiquantitative metabolic parameters, such as maximum/mean standard uptake value (SUVmax, SUVmean) or metabolic tumor volume (MTV), which are valuable as prognostic image-based biomarkers in several tumors [16,17] but do not assess tumor heterogeneity. Likewise, fluorodeoxyglucose (18F-FDG) PET/CT is important to differentiate malignant from benign solitary pulmonary nodules (SPN), thereby reducing the number of patients who undergo unnecessary surgical biopsies. Several publications have shown that some quantitative image features, extracted from medical images, are suitable for diagnosis, tumor staging, prognosis of treatment response, and long-term evolution of cancer patients [18-20]. The process of extracting and relating image features with clinical or biological variables is called “Radiomics” [9,20-24]. Radiomic parameters, such as textural features, have been related directly to tumor heterogeneity [25]. This thesis investigated the relationships of tumor heterogeneity, assessed by 18F-FDG-PET/CT texture analysis, with metabolic parameters and pathologic staging in patients with NSCLC, and explored the diagnostic performance of different metabolic, morphologic, and clinical criteria for classifying solitary pulmonary nodules (SPN) as malignant or benign. Furthermore, 18F-FDG-PET/CT radiomic features of patients with recurrent/metastatic breast cancer were used to construct predictive models of response to chemotherapy, based on an optimal combination of several feature selection and machine learning (ML) methods...
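    The semiquantitative PET parameters named in the abstract (SUVmax, SUVmean, MTV) have standard definitions that can be sketched as follows; the fixed SUV threshold of 2.5 is a common convention, not necessarily the one used in the thesis, and the synthetic volume is purely illustrative.

```python
import numpy as np

def metabolic_parameters(suv_volume, voxel_volume_ml, suv_threshold=2.5):
    """SUVmax, SUVmean and MTV from a 3-D array of SUV values.

    The tumor region is defined here by a fixed SUV threshold (2.5), a common
    but not universal convention; MTV is the thresholded volume in millilitres.
    """
    tumor = suv_volume >= suv_threshold
    suv_max = float(suv_volume.max())
    suv_mean = float(suv_volume[tumor].mean()) if tumor.any() else 0.0
    mtv_ml = float(tumor.sum()) * voxel_volume_ml
    return suv_max, suv_mean, mtv_ml

# Example with a synthetic volume of 4 mm x 4 mm x 4 mm voxels (0.064 ml each):
suv = np.random.gamma(shape=2.0, scale=1.0, size=(64, 64, 32))
print(metabolic_parameters(suv, voxel_volume_ml=0.064))
```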

    Novel Architectures for Offloading and Accelerating Computations in Artificial Intelligence and Big Data

    Due to the end of Moore's Law and Dennard Scaling, performance gains in general-purpose architectures have slowed significantly in recent years. While increasing the number of cores has been a viable approach for further performance gains, Amdahl's Law and its implications on parallelization also limit what can be achieved this way. Consequently, research has shifted towards different approaches, including domain-specific custom architectures tailored to specific workloads. This has led to a new golden age for computer architecture, as noted in the Turing Award Lecture by Hennessy and Patterson, which has spawned several new architectures and architectural advances specifically targeted at prominent current workloads, including Machine Learning. This thesis introduces a hierarchy of architectural improvements ranging from minor incremental changes, such as High-Bandwidth Memory, to more complex architectural extensions that offload workloads from the general-purpose CPU towards more specialized accelerators. Finally, we introduce novel architectural paradigms, namely Near-Data and In-Network Processing, as the most complex architectural improvements. This cumulative dissertation then investigates several architectural improvements to accelerate Sum-Product Networks, a novel Machine Learning approach from the class of Probabilistic Graphical Models. Furthermore, we use these improvements as case studies to discuss the impact of novel architectures, showing that minor and major architectural changes can significantly increase performance in Machine Learning applications. In addition, this thesis presents recent work on Near-Data Processing, which introduces Smart Storage Devices as a novel architectural paradigm that is especially interesting in the context of Big Data. We discuss how Near-Data Processing can be applied to improve performance in different database settings by offloading database operations to smart storage devices. Offloading data-reductive operations, such as selections, reduces the amount of data transferred, thus improving performance and alleviating bandwidth-related bottlenecks. Using Near-Data Processing as a use case, we also discuss how Machine Learning approaches, like Sum-Product Networks, can improve novel architectures. Specifically, we introduce an approach for offloading Cardinality Estimation using Sum-Product Networks that could enable more intelligent decision-making in smart storage devices. Overall, we show that Machine Learning can benefit from the development of novel architectures while also showing that Machine Learning can be applied to improve the applications of novel architectures.
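    As a schematic of the cardinality-estimation offloading idea mentioned above: an SPN learned over a table's columns returns an estimated selectivity for a predicate, and multiplying that by the row count gives the cardinality estimate. The sketch below is purely illustrative; the function names and stand-in model are hypothetical, not the dissertation's implementation.

```python
def estimate_cardinality(spn_selectivity, predicate, table_rows):
    """Schematic SPN-based cardinality estimation for offloading decisions.

    spn_selectivity: callable returning the SPN's estimated probability that a
    row satisfies the predicate (its selectivity); names here are hypothetical.
    """
    selectivity = spn_selectivity(predicate)
    return int(round(selectivity * table_rows))

# Hypothetical usage with a stand-in model that claims 8% of rows match:
toy_spn = lambda predicate: 0.08
print(estimate_cardinality(toy_spn, {"price": (">", 100)}, table_rows=1_000_000))  # -> 80000
```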

    Computational methods for the analysis of functional 4D-CT chest images.

    Medical imaging is an important emerging technology that has been used intensively in the last few decades for disease diagnosis and monitoring as well as for the assessment of treatment effectiveness. Medical images provide a very large amount of valuable information, too much to be fully exploited by radiologists and physicians. Therefore, the design of computer-aided diagnostic (CAD) systems, which can be used as assistive tools for the medical community, is of great importance. This dissertation deals with the development of a complete CAD system for lung cancer patients; lung cancer remains the leading cause of cancer-related death in the USA. In 2014, there were approximately 224,210 new cases of lung cancer and 159,260 related deaths. The process begins with the detection of lung cancer, which is detected through the diagnosis of lung nodules (a manifestation of lung cancer). These nodules are approximately spherical regions of primarily high-density tissue that are visible in computed tomography (CT) images of the lung. The treatment of these lung cancer nodules is complex; nearly 70% of lung cancer patients require radiation therapy as part of their treatment. Radiation-induced lung injury is a limiting toxicity that may decrease cure rates and increase treatment-related morbidity and mortality. Finding ways to accurately detect lung injury at an early stage, and hence prevent it, would have significant positive consequences for lung cancer patients. The ultimate goal of this dissertation is to develop a clinically usable CAD system that can improve the sensitivity and specificity of early detection of radiation-induced lung injury, based on the hypothesis that irradiated lung tissues may be affected and suffer a decrease in functionality as a side effect of radiation therapy treatment. This hypothesis has been validated by demonstrating that automatic segmentation of the lung regions and registration of consecutive respiratory phases, used to estimate elasticity, ventilation, and texture features, provide discriminatory descriptors that can be used for early detection of radiation-induced lung injury. The proposed methodologies will lead to novel indexes for distinguishing normal/healthy and injured lung tissues in clinical decision-making. To achieve this goal, a CAD system for accurate detection of radiation-induced lung injury has been developed, comprising three basic components: lung field segmentation, lung registration, and feature extraction with tissue classification. This dissertation starts with an exploration of the available medical imaging modalities to present the importance of medical imaging in today's clinical applications. Secondly, the methodologies, challenges, and limitations of recent CAD systems for lung cancer detection are covered. This is followed by an accurate segmentation methodology for the lung parenchyma, with a focus on pathological lungs, to extract the volume of interest (VOI) to be analyzed for the potential existence of lung injuries stemming from the radiation therapy. After the segmentation of the VOI, a lung registration framework is introduced to perform a crucial step that ensures the co-alignment of the intra-patient scans. This step eliminates the effects of orientation differences, motion, breathing, heart beats, and differences in scanning parameters, so that the functionality features for the lung fields can be extracted accurately.
The developed registration framework also helps in the evaluation and gated control of the radiotherapy through motion estimation analysis before and after the therapy dose. Finally, the radiation-induced lung injury detection framework is introduced, which combines the previous two medical image processing and analysis steps with the feature estimation and classification step. This framework estimates and combines both texture and functional features. The texture features are modeled using a novel 7th-order Markov Gibbs random field (MGRF) model that accurately models the texture of healthy and injured lung tissues by simultaneously accounting for both vertical and horizontal relative dependencies between voxel-wise signals. The functionality features are calculated from the deformation fields, obtained from the 4D-CT lung registration, that map lung voxels between successive CT scans in the respiratory cycle. These functionality features describe the ventilation (air flow rate) of the lung tissues using the Jacobian of the deformation field, and the tissues' elasticity using the strain components calculated from the gradient of the deformation field. Finally, these features are combined in the classification model to detect the injured parts of the lung at an early stage and enable earlier intervention.
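    The ventilation and elasticity descriptors mentioned above can be sketched directly from a displacement field: the Jacobian determinant of the mapping x -> x + u(x) summarizes local volume change (ventilation), and the symmetric part of the displacement gradient gives small-deformation strain components (elasticity). The code below is a minimal numpy illustration of those definitions under a small-deformation assumption, not the dissertation's pipeline.

```python
import numpy as np

def ventilation_and_strain(ux, uy, uz, spacing=(1.0, 1.0, 1.0)):
    """Jacobian determinant (ventilation) and strain components (elasticity)
    from a 3-D displacement field (ux, uy, uz); spacing is the voxel size."""
    gx = np.gradient(ux, *spacing)   # derivatives of ux along axes 0, 1, 2
    gy = np.gradient(uy, *spacing)
    gz = np.gradient(uz, *spacing)

    # det(I + grad u): local volume change of the mapping x -> x + u(x).
    jac = ((1 + gx[0]) * ((1 + gy[1]) * (1 + gz[2]) - gy[2] * gz[1])
           - gx[1] * (gy[0] * (1 + gz[2]) - gy[2] * gz[0])
           + gx[2] * (gy[0] * gz[1] - (1 + gy[1]) * gz[0]))

    # Small-deformation strain: symmetric part of the displacement gradient.
    strain = {
        "xx": gx[0], "yy": gy[1], "zz": gz[2],
        "xy": 0.5 * (gx[1] + gy[0]),
        "xz": 0.5 * (gx[2] + gz[0]),
        "yz": 0.5 * (gy[2] + gz[1]),
    }
    return jac, strain

jac, strain = ventilation_and_strain(*(np.random.randn(3, 32, 32, 32) * 0.1))
```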

    Learning based forensic techniques for source camera identification

    In recent years, multimedia forensics has received rapidly growing attention. One challenging problem in multimedia forensics is source camera identification, the goal of which is to identify the source of a multimedia object, such as a digital image or video. Sensor pattern noises, produced by imaging sensors, have been proven to be an effective means of source camera identification. Precisely speaking, conventional SPN-based source camera identification has two application models: verification and identification. In the past decade, significant progress has been achieved in the tasks of SPN-based source camera verification and identification. However, there are still many cases requiring solutions beyond the capabilities of the current methods. In this thesis, we considered and addressed two commonly seen but less studied problems. The first problem is source camera verification with reference SPNs corrupted by scene details. The most significant limitation of using SPN for source camera identification is that SPN can be seriously contaminated by scene details. Most existing methods assume that contamination from scene details occurs only in query images but not in reference images. To address this issue, we propose a measurement based on the combination of local image entropy and brightness so as to evaluate the quality of the SPN contained in different image blocks. Based on this measurement, a context-adaptive reference SPN estimator is proposed to address the problem of reference images contaminated by scene details. The second problem that we considered relates to the high computational complexity of using SPN in source camera identification, which is caused by the high dimensionality of SPN. In order to improve identification efficiency without degrading accuracy, we propose an effective feature extraction algorithm based on the concept of PCA denoising to extract a small set of components from the original noise residual, which tends to carry most of the information of the true SPN signal. To further improve the performance of this framework, two enhancement methods are introduced. The first takes advantage of the label information of the reference images so as to better separate different classes and further reduce the dimensionality. The second is an extension based on Candid Covariance-free Incremental PCA that incrementally updates the feature extractor according to the received images, so that there is no need to retrain every time a new image is added to the database. Moreover, an ensemble method based on the random subspace method and majority voting is proposed in the context of source camera identification to tackle the performance degradation of the PCA-based feature extraction method due to corruption by unwanted interference in the training set. The proposed algorithms are evaluated on the challenging Dresden image database, and experimental results confirmed their effectiveness.
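    The PCA-denoising idea described above can be sketched as follows: flatten noise residuals to vectors, fit a PCA on reference residuals, keep the leading components, and match query and reference fingerprints in the reduced space. This is a generic sklearn-based illustration under those assumptions, not the thesis code; in particular the similarity measure and the number of components are arbitrary choices here.

```python
import numpy as np
from sklearn.decomposition import PCA

def train_feature_extractor(reference_residuals, n_components=50):
    """Fit PCA on flattened reference noise residuals.

    n_components must not exceed the number of training residuals; a real
    system would choose it (and the training set) far more carefully.
    """
    X = np.asarray([r.ravel() for r in reference_residuals])
    return PCA(n_components=n_components).fit(X)

def match_score(pca, query_residual, reference_residual):
    """Cosine similarity between query and reference in the reduced space."""
    q = pca.transform(query_residual.ravel()[None, :])[0]
    r = pca.transform(reference_residual.ravel()[None, :])[0]
    return float(q @ r / (np.linalg.norm(q) * np.linalg.norm(r) + 1e-12))
```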

    Co-Segmentation Methods for Improving Tumor Target Delineation in PET-CT Images

    Positron emission tomography (PET)-computed tomography (CT) plays an important role in cancer management. As a multi-modal imaging technique, it provides both functional and anatomical information about tumor spread. Such information improves cancer treatment in many ways. One important use of PET-CT in cancer treatment is to facilitate radiotherapy planning, because the information it provides helps radiation oncologists to better target the tumor region. However, currently most tumor delineations in radiotherapy planning are performed by manual segmentation, which is time-consuming and labor-intensive. Most computer-aided algorithms need a knowledgeable user to roughly locate the tumor area as a starting point. This is because, in PET-CT imaging, some tissues, such as the heart and kidneys, may also exhibit a level of activity similar to that of a tumor region. In order to address this issue, a novel co-segmentation method is proposed in this work to enhance the accuracy of tumor segmentation using PET-CT, and a localization algorithm is developed to differentiate and segment tumor regions from normal regions. On a combined dataset containing 29 patients with lung tumors, the combined method shows good segmentation results as well as a good tumor recognition rate.
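    To make the localization problem concrete: simple SUV thresholding of a PET volume yields several high-uptake blobs (tumor, heart, kidneys), which must then be told apart. The sketch below only illustrates this generic candidate-extraction step with scipy; it is not the co-segmentation or localization algorithm proposed in the thesis, and the threshold and minimum size are arbitrary.

```python
import numpy as np
from scipy import ndimage

def high_uptake_candidates(suv_volume, suv_threshold=2.5, min_voxels=50):
    """Label connected high-uptake regions in a PET volume.

    Returns the label volume and the labels of components large enough to be
    candidate regions (tumor, heart, kidneys, ...); telling them apart is the
    hard part that a dedicated localization algorithm must solve.
    """
    mask = suv_volume >= suv_threshold
    labels, n = ndimage.label(mask)                      # connected components
    sizes = ndimage.sum(mask, labels, index=range(1, n + 1))
    keep = [i + 1 for i, s in enumerate(sizes) if s >= min_voxels]
    return labels, keep
```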

    Extracción y análisis de características para identificación, agrupamiento y modificación de la fuente de imágenes generadas por dispositivos móviles

    Unpublished doctoral thesis, Universidad Complutense de Madrid, Facultad de Informática, Departamento de Ingeniería del Software e Inteligencia Artificial; defended on 02/10/2017. Nowadays, digital images play an important role in our society. The presence of mobile devices with integrated cameras is growing at an unrelenting pace, resulting in the majority of digital images coming from this kind of device. Technological development not only facilitates the generation of these images, but also their malicious manipulation. Therefore, it is of interest to have tools that allow the device that has generated a certain digital image to be identified. The source of a digital image can be identified through the features that the generating device imprints on it during the creation process. In recent years, most research on source identification techniques has focused solely on traditional cameras. Forensic analysis techniques for digital images generated by mobile devices are therefore of particular importance, since these devices have specific characteristics that allow for better results, and forensic techniques designed for digital images generated by other kinds of devices are often not valid. This thesis provides various contributions in two of the main research lines of forensic analysis: the field of identification techniques and the counter-forensics or attacks on these techniques. In the field of digital image source acquisition identification techniques, both closed and open scenarios are addressed. In closed scenarios, the images whose acquisition source is to be determined belong to a group of devices known a priori. Meanwhile, an open scenario is one in which the images under analysis belong to a set of devices that is not known a priori by the forensic analyst. In this case, the objective is not the identification of the concrete image acquisition source, but the classification of the images into groups that each belong to the same mobile device. Image clustering techniques are of particular interest in real situations, since in many cases the forensic analyst does not know a priori which devices generated certain images. Firstly, techniques for identifying the device type (computer, scanner, or digital camera of a mobile device) or class (make and model) of the image acquisition source in mobile devices are proposed, which are two relevant branches of forensic analysis of mobile device images. An approach based on different types of image features and a Support Vector Machine classifier is presented. Secondly, a technique for identification in open scenarios is developed that consists of grouping digital images of mobile devices according to their acquisition source, that is to say, a class-grouping of all input images is performed. The proposal is based on the combination of hierarchical grouping and flat grouping using the Sensor Pattern Noise. Lastly, in the area of attacks on forensic techniques, topics related to the robustness of image source identification forensic techniques are addressed. For this, two new algorithms based on the sensor noise and the wavelet transform are designed: one for the destruction of the image identity and another for its forgery. Results obtained by the two algorithms were compared with other tools designed for the same purpose. It is worth noting that the solution presented in this work requires a smaller amount and lower complexity of input data than the tools to which it was compared.
Finally, these identification techniques have been included in a tool for the forensic analysis of digital images of mobile devices called Theia. Among the different branches of forensic analysis, Theia focuses mainly on the trustworthy identification of the make and model of the mobile camera that generated a given image. All proposed algorithms have been implemented and integrated in Theia, thus strengthening its functionality.
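    The open-scenario grouping described above can be illustrated generically: compute an SPN fingerprint per image, measure pairwise correlation distances, and cut a hierarchical clustering tree to obtain per-image cluster labels. The sketch below shows only this generic scipy pipeline; the thesis combines hierarchical and flat grouping in a more elaborate way, and the cut distance here is an arbitrary illustrative value.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage
from scipy.spatial.distance import pdist

def cluster_by_spn(fingerprints, cut_distance=0.8):
    """Group images by source using correlation distances between SPN fingerprints.

    fingerprints: list of equally sized 2-D noise-residual arrays, one per image.
    Returns one cluster label per input image.
    """
    X = np.asarray([f.ravel() for f in fingerprints])
    dists = pdist(X, metric="correlation")        # 1 - Pearson correlation
    tree = linkage(dists, method="average")
    return fcluster(tree, t=cut_distance, criterion="distance")
```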