    Advanced imaging and data mining technologies for medical and food safety applications

    As one of the fastest-developing research areas, biological imaging and image analysis are receiving increasing attention and have already been widely applied in many scientific fields, including medical diagnosis and food safety inspection. This research focuses on advanced imaging and pattern recognition technologies in both medical and food safety applications, which include 1) noise reduction in ultra-low-dose multi-slice helical CT imaging for early lung cancer screening, and 2) automated discrimination between walnut shell and meat under hyperspectral fluorescence imaging. In the medical imaging and diagnosis area, because X-ray computed tomography (CT) has been applied to screen large populations for early lung cancer detection during the last decade, increasing attention has been paid to low-dose, and even ultra-low-dose, X-ray CT. However, reducing CT radiation exposure inevitably increases the noise level in the sinogram, thereby degrading the quality of reconstructed CT images. Thus, how to reduce the noise level in low-dose CT images has become a meaningful topic. In this research, a nonparametric smoothing method with block-based thin-plate smoothing splines and a roughness penalty was introduced to restore ultra-low-dose helical CT raw data acquired under a 120 kVp / 10 mAs protocol. An objective thorax image quality evaluation was first conducted to assess the image quality and noise level of the proposed method. A web-based subjective evaluation system was also built so that a total of 23 radiologists could compare the proposed approach with a traditional sinogram restoration method. Both the objective and subjective evaluation studies showed the effectiveness of the proposed thin-plate-based nonparametric regression method in sinogram restoration for multi-slice helical ultra-low-dose CT.
In the food quality inspection area, automated discrimination between walnut shell and meat has become an imperative task in the walnut postharvest processing industry in the U.S. This research developed two hyperspectral fluorescence imaging-based approaches capable of differentiating small walnut shell fragments from meat. First, a Bayesian classification method based on principal component analysis (PCA) and a Gaussian mixture model (PCA-GMM) was introduced. PCA was used to extract features, and the optimal number of principal components was selected by cross-validation. The PCA-GMM-based Bayesian classifier was then applied to differentiate walnut shell from meat according to the class-conditional probability and the prior estimated by the Gaussian mixture model. The experimental results showed the effectiveness of this PCA-GMM approach, which achieved an overall recognition rate of 98.2%. Second, a Gaussian-kernel Support Vector Machine (SVM) was presented for walnut shell and meat discrimination in hyperspectral fluorescence imagery. The SVM was applied to seek an optimal mapping from a low-dimensional to a high-dimensional space such that input data that are not linearly separable in the original space become separable in the mapped high-dimensional space, thereby accomplishing the classification between walnut shell and meat. This method achieved an overall recognition rate of 98.7%. Although hyperspectral fluorescence imaging is capable of differentiating between walnut shell and meat, one persistent problem is how to handle the huge amount of data acquired by the hyperspectral imaging system and thereby improve the efficiency of the application system. To address this problem, an Independent Component Analysis with k-Nearest Neighbor classifier (ICA-kNN) approach was presented to reduce data redundancy without sacrificing too much classification performance.
An overall 90.6% detection rate was achieved given 10 optimal wavelengths, which constituted only 13% of the total acquired hyperspectral image data. To further evaluate the proposed method, the classification results of the ICA-kNN approach were also compared with those of the kNN classifier alone. The experimental results showed that the ICA-kNN method with fewer wavelengths had the same performance as the kNN classifier alone using information from all 79 wavelengths. This demonstrated the effectiveness of the proposed ICA-kNN method for hyperspectral band selection in walnut shell and meat classification.
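The band-selection idea in the abstract above can be illustrated with a minimal sketch: a k-nearest-neighbor classifier that measures distance only on a chosen subset of wavelength indices. The ICA-based ranking of wavelengths is not reproduced here. The `selected_bands` indices, the function name `knn_predict`, and the five-band toy spectra are all hypothetical and are not taken from the study.

```python
import math
from collections import Counter

def knn_predict(train_x, train_y, query, selected_bands, k=3):
    """Classify `query` by majority vote among its k nearest training spectra,
    using Euclidean distance computed on the selected bands only."""
    def dist(a, b):
        return math.sqrt(sum((a[i] - b[i]) ** 2 for i in selected_bands))
    # Sort training samples by distance to the query and keep the k closest.
    nearest = sorted(zip(train_x, train_y), key=lambda s: dist(s[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

# Toy 5-band "spectra" (made-up numbers, not measured data).
train_x = [
    [0.9, 0.1, 0.5, 0.8, 0.2],  # shell
    [0.8, 0.2, 0.5, 0.9, 0.1],  # shell
    [0.2, 0.9, 0.5, 0.1, 0.8],  # meat
    [0.1, 0.8, 0.5, 0.2, 0.9],  # meat
]
train_y = ["shell", "shell", "meat", "meat"]
selected_bands = [0, 3]  # hypothetical subset of informative bands

print(knn_predict(train_x, train_y, [0.85, 0.5, 0.5, 0.85, 0.5], selected_bands))
```

Because the distance ignores every band outside `selected_bands`, the cost per comparison scales with the number of selected wavelengths rather than with the full spectrum, which is the efficiency argument the abstract makes.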

    A Balanced Secondary Structure Predictor

    Secondary structure (SS) refers to the local spatial organization of the polypeptide backbone atoms of a protein. Accurate prediction of SS is a vital clue for resolving the 3D structure of a protein. SS has three components: helix (H), beta (E) and coil (C). Most SS predictors are imbalanced: their accuracy in predicting helix and coil is high, but significantly lower for beta. The objective of this thesis is to develop a balanced SS predictor that achieves good accuracy in all three SS components. We propose a novel approach to this problem that combines a genetic algorithm (GA) with a support vector machine. We prepared two test datasets (CB471 and N295) to compare the performance of our predictor with SPINE X. The overall accuracy of our predictor was 76.4% and 77.2% on the CB471 and N295 datasets, respectively, while SPINE X gave 76.5% overall accuracy on both test datasets.
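To make the notion of a "balanced" predictor concrete, the following minimal sketch computes the overall three-state accuracy alongside the accuracy for each state separately. It is only an illustration of the evaluation idea: the function name `per_state_accuracy` and the toy sequences are hypothetical, and the thesis may define its metrics differently.

```python
def per_state_accuracy(true_ss, pred_ss, states="HEC"):
    """Return the overall accuracy and a dict of accuracies per SS state."""
    if not true_ss or len(true_ss) != len(pred_ss):
        raise ValueError("sequences must be non-empty and of equal length")
    correct = sum(t == p for t, p in zip(true_ss, pred_ss))
    overall = correct / len(true_ss)
    per_state = {}
    for s in states:
        # Residues whose true state is s, and how many of them were predicted as s.
        total = sum(t == s for t in true_ss)
        hits = sum(t == s and p == s for t, p in zip(true_ss, pred_ss))
        per_state[s] = hits / total if total else float("nan")
    return overall, per_state

# Toy example (invented labels): half of the beta residues are predicted as coil.
true_ss = "HHHHEEEECCCC"
pred_ss = "HHHHCCEECCCC"
overall, per_state = per_state_accuracy(true_ss, pred_ss)
print(overall, per_state)
```

In this toy case the overall accuracy looks reasonable while the beta accuracy is only 0.5, which is the kind of imbalance the abstract describes.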

    Assessing and predicting small industrial enterprises’ credit ratings: A fuzzy decision-making approach

    Corporate credit-rating assessment plays a crucial role in helping financial institutions make their lending decisions and in reducing the financial constraints of small enterprises. This paper presents a new approach for small industrial enterprises’ credit-rating assessment using fuzzy decision-making methods, and tests it using real bank loan data from 1,820 small industrial enterprises in China. The procedure of the proposed rating approach includes (1) using triangular fuzzy numbers to quantify the qualitative evaluation indicators; (2) adopting correlation analysis, univariate analysis and a stepping backwards feature selection method to select the input features; (3) employing the best-worst method (BWM) combined with the entropy weight method (EWM), the fuzzy c-means algorithm and the technique for order of preference by similarity to ideal solution (TOPSIS) to classify small enterprises into rating classes; and (4) applying the lattice degree of nearness to predict a new loan applicant’s rating. We also conduct a 10-fold cross-validation to evaluate the predictive performance of our proposed approach. The predictive results demonstrate that our proposed data-processing and feature selection approaches have better accuracy than the alternative approaches in predicting default, offering bankers a valuable new rating system to assist their decision making.
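Two of the building blocks named in step (3) can be sketched in a few lines: the entropy weight method for deriving indicator weights and TOPSIS for scoring alternatives. This is a simplified illustration only. It omits BWM, fuzzy c-means and the triangular fuzzy numbers, and it assumes that all indicators are positive benefit-type values, that there are at least two enterprises, and that at least one indicator varies across them. The function names and the toy decision matrix are hypothetical.

```python
import math

def entropy_weights(matrix):
    """Entropy weight method: indicators whose values vary more get more weight."""
    m, n = len(matrix), len(matrix[0])
    diversities = []
    for j in range(n):
        col_sum = sum(row[j] for row in matrix)
        probs = [row[j] / col_sum for row in matrix]
        # Normalised Shannon entropy of column j, in [0, 1].
        entropy = -sum(p * math.log(p) for p in probs if p > 0) / math.log(m)
        diversities.append(1.0 - entropy)
    total = sum(diversities)
    return [d / total for d in diversities]

def topsis(matrix, weights):
    """Return the TOPSIS closeness score of each row (higher is better)."""
    n = len(matrix[0])
    norms = [math.sqrt(sum(row[j] ** 2 for row in matrix)) for j in range(n)]
    weighted = [[weights[j] * row[j] / norms[j] for j in range(n)] for row in matrix]
    best = [max(row[j] for row in weighted) for j in range(n)]
    worst = [min(row[j] for row in weighted) for j in range(n)]
    scores = []
    for row in weighted:
        d_best = math.sqrt(sum((row[j] - best[j]) ** 2 for j in range(n)))
        d_worst = math.sqrt(sum((row[j] - worst[j]) ** 2 for j in range(n)))
        scores.append(d_worst / (d_best + d_worst))
    return scores

# Toy decision matrix: 3 enterprises x 2 indicators (invented values).
matrix = [[0.9, 0.8], [0.5, 0.6], [0.2, 0.1]]
weights = entropy_weights(matrix)
scores = topsis(matrix, weights)
print(weights, scores)
```

The closeness scores could then be cut into rating classes; how the paper actually forms those classes (via fuzzy c-means) is not shown here.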

    Classifiers and machine learning techniques for image processing and computer vision

    Advisor: Siome Klein Goldenstein. Thesis (doctorate) - Universidade Estadual de Campinas, Instituto da Computação. Abstract: In this work, we propose the use of classifiers and machine learning techniques to extract useful information from data sets (e.g., images) to solve important problems in Image Processing and Computer Vision. We are particularly interested in two-class and multi-class image categorization, hidden message detection, discrimination between natural and forged images, authentication, and multi-classification. To start with, we present a comparative and critical survey of the state of the art in digital image forensics and hidden message detection. Our objective is to show the potential of the existing solutions and, more importantly, to point out their limitations. In this study, we show that most of the problems in this area come down to two common issues in Machine Learning: feature selection and the choice of learning techniques. Furthermore, we discuss the legal and ethical aspects of image forensics analysis, such as the use of digital images by criminals. We then introduce a technique for image forensics analysis, tested in the context of hidden message detection and of general image classification into categories such as indoors, outdoors, computer generated, and art works.
From this multi-class classification problem, some important questions arise: how can we solve a multi-class problem so as to combine, for instance, several different features such as color, texture, shape, and silhouette without worrying too much about the pre-processing and normalization of the combined feature vector? How can we take advantage of different classifiers, each one custom tailored to a specific set of features or classes in confusion? To cope with most of these problems, we present a feature and classifier fusion technique for the multi-class scenario based on combinations of binary classifiers. We validate our solution in a real application for the automatic classification of fruits and vegetables. Finally, we address another interesting problem: how can powerful binary classifiers be used in the multi-class scenario more efficiently and effectively? In this context, we present a technique for combining binary classifiers (called base classifiers) to solve problems in the general multi-classification setting. Degree: Doctorate in Computer Engineering; Doctor of Computer Science.
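One generic way to build a multi-class decision from binary classifiers is one-vs-one majority voting, sketched below. This is a textbook scheme shown for illustration and is not claimed to be the fusion technique proposed in the thesis. The nearest-centroid base classifier is a deliberately simple stand-in, and the function names, class labels and two-dimensional toy features are hypothetical.

```python
from collections import Counter
from itertools import combinations

def centroid(points):
    """Component-wise mean of a list of equal-length feature vectors."""
    dim = len(points[0])
    return [sum(p[d] for p in points) / len(points) for d in range(dim)]

def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def train_one_vs_one(samples, labels):
    """Train one nearest-centroid binary classifier per pair of classes."""
    by_class = {}
    for x, y in zip(samples, labels):
        by_class.setdefault(y, []).append(x)
    centroids = {c: centroid(pts) for c, pts in by_class.items()}
    # Each base classifier is represented by the two class centroids it compares.
    return [((a, centroids[a]), (b, centroids[b]))
            for a, b in combinations(sorted(centroids), 2)]

def predict(model, x):
    """Let every binary classifier vote and return the most voted class."""
    votes = Counter()
    for (a, ca), (b, cb) in model:
        votes[a if sq_dist(x, ca) <= sq_dist(x, cb) else b] += 1
    return votes.most_common(1)[0][0]

# Toy 2-D features for three produce classes (invented values).
samples = [[1.0, 1.0], [1.2, 0.8], [5.0, 1.0], [5.2, 1.2], [3.0, 5.0], [2.8, 5.2]]
labels = ["apple", "apple", "banana", "banana", "carrot", "carrot"]
model = train_one_vs_one(samples, labels)
print(predict(model, [4.8, 1.2]))
```

In practice each pair could use a different base classifier or feature set, which is where a fusion approach of the kind described in the abstract would differ from this uniform sketch.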

    An object's smell in the multisensory brain: how our senses interact during olfactory object processing

    Object perception is a remarkable and fundamental cognitive ability that allows us to interpret and interact with the world we are living in. In our everyday life, we constantly perceive objects, mostly without being aware of it and through several senses at the same time. Although it might seem that object perception is accomplished without any effort, the underlying neural mechanisms are anything but simple. How we perceive objects in the world surrounding us is the result of a complex interplay of our senses. The aim of the present thesis was to explore, by means of functional magnetic resonance imaging, how our senses interact when we perceive an object’s smell in a multisensory setting where the amount of sensory stimulation increases, as well as in a unisensory setting where we perceive an object’s smell in isolation. In Study I, we sought to determine whether and how multisensory object information influences the processing of olfactory object information in the posterior piriform cortex (PPC), a region linked to olfactory object encoding. In Study II, we then expanded our search for integration effects during multisensory object perception to the whole brain because previous research has demonstrated that multisensory integration is accomplished by a network of early sensory cortices and higher-order multisensory integration sites. We specifically aimed at determining whether there are cortical regions that process multisensory object information regardless of which senses, and how many senses, the information arises from. In Study III, we then sought to unveil how our senses interact during olfactory object perception in a unisensory setting. Previous studies have shown that even in such unisensory settings, olfactory object processing is not exclusively accomplished by regions within the olfactory system but instead engages a more widespread network of brain regions, such as regions belonging to the visual system.
We aimed at determining what this visual engagement represents. That is, we asked whether areas of the brain that are principally concerned with processing visual object information also hold neural representations of olfactory object information, and if so, whether these representations are similar for smells and pictures of the same objects. In Study I we demonstrated that assisting inputs from our senses of vision and hearing increase the processing of olfactory object information in the PPC, and that the more assisting input we receive, the more the processing is enhanced. As this enhancement occurred only for matching inputs, it likely reflects integration of multisensory object information. Study II provided evidence for convergence of multisensory object information in the form of a non-linear response enhancement in the inferior parietal cortex: activation increased for bimodal compared to unimodal stimulation, and increased even further for trimodal compared to bimodal stimulation. As this multisensory response enhancement occurred independently of the congruency of the incoming signals, it likely reflects a process of relating the incoming sensory information streams to each other. Finally, Study III revealed that regions of the ventral visual object stream are engaged in recognition of an object’s smell and represent olfactory object information in the form of distinct neural activation patterns. While the visual system encodes information about both visual and olfactory objects, it appears to keep information from the two sensory modalities separate by representing smells and pictures of objects differently. Taken together, the studies included in this thesis reveal that olfactory object perception is a multisensory process that engages a widespread network of early sensory as well as higher-order cortical regions, even if we do not find ourselves in a multisensory setting but exclusively perceive an object’s smell.

    Sustainable Agriculture and Advances of Remote Sensing (Volume 2)

    Agriculture, as the main source of food and the most important economic activity globally, is being affected by the impacts of climate change. To maintain and increase global food production, reduce biodiversity loss, and preserve natural ecosystems, new practices and technologies are required. This book focuses on the latest advances in remote sensing technology and agricultural engineering that lead to sustainable agricultural practices. Earth observation data, in situ and proxy-remote sensing data are the main sources of information for monitoring and analyzing agricultural activities. Particular attention is given to earth observation satellites and the Internet of Things for data collection, to multispectral and hyperspectral data analysis using machine learning and deep learning, and to WebGIS and the Internet of Things for sharing and publishing the results, among others.

    Fake Review Detection using Data Mining

    Online spam reviews are deceptive evaluations of products and services. They are often carried out as a deliberate manipulation strategy to deceive readers. Recognizing such reviews is an important but challenging problem. In this work, I try to solve this problem by using different data mining techniques. I explore the strengths and weaknesses of those data mining techniques in detecting fake reviews. I start with different supervised techniques such as Support Vector Machine (SVM), Multinomial Naive Bayes (MNB), and Multilayer Perceptron. The results attest that all of the above-mentioned supervised techniques can successfully detect fake reviews with more than 86% accuracy. Then, I work on a semi-supervised technique which reduces the dimensionality of the input feature vector but offers similar performance to existing approaches. I use a combination of topic modeling and SVM to implement the semi-supervised technique. I also compare the results with other approaches that consider all the words of a dataset as input features. I found that topic words are enough as input features to get accuracy similar to that of approaches where researchers consider all the words as input features. Finally, I propose an unsupervised learning approach named Words Basket Analysis for fake review detection. I use five Amazon product review datasets for an experiment and report the performance of the proposed approach on these datasets.
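As an illustration of one of the supervised techniques named above, here is a minimal from-scratch Multinomial Naive Bayes sketch with Laplace smoothing over whitespace-separated words. It is not the thesis's implementation: the function names `train_mnb` and `predict_mnb`, the smoothing value, and the four toy reviews are invented for this example.

```python
import math
from collections import Counter, defaultdict

def train_mnb(docs, labels, alpha=1.0):
    """Count words per class; `alpha` is the Laplace smoothing constant."""
    vocab = set()
    word_counts = defaultdict(Counter)
    class_counts = Counter(labels)
    for text, label in zip(docs, labels):
        tokens = text.lower().split()
        vocab.update(tokens)
        word_counts[label].update(tokens)
    return {"vocab": vocab, "word_counts": word_counts,
            "class_counts": class_counts, "alpha": alpha}

def predict_mnb(model, text):
    """Return the class with the highest log prior plus log likelihood."""
    vocab, alpha = model["vocab"], model["alpha"]
    n_docs = sum(model["class_counts"].values())
    best_label, best_score = None, -math.inf
    for label, n in model["class_counts"].items():
        counts = model["word_counts"][label]
        total = sum(counts.values())
        score = math.log(n / n_docs)
        for tok in text.lower().split():
            if tok in vocab:  # words never seen in training are ignored
                score += math.log((counts[tok] + alpha) / (total + alpha * len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label

# Invented toy reviews, for illustration only.
docs = [
    "best product ever buy now",
    "amazing amazing best deal buy now",
    "battery lasts two days",
    "screen cracked after two weeks",
]
labels = ["fake", "fake", "genuine", "genuine"]
model = train_mnb(docs, labels)
print(predict_mnb(model, "best deal buy now"))
```

Restricting the loop to a smaller set of topic words instead of the full vocabulary would mirror the dimensionality-reduction idea the abstract describes for its semi-supervised variant.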

    Pattern Recognition

    A wealth of advanced pattern recognition algorithms are emerging at the intersection of effective visual feature technologies and the study of the human-brain cognition process. Effective visual features are made possible by rapid developments in sensor equipment, novel filter designs, and viable information processing architectures, while a better understanding of the human-brain cognition process broadens the ways in which computers can perform pattern recognition tasks. This book collects representative research from around the globe focusing on low-level vision, filter design, features and image descriptors, data mining and analysis, and biologically inspired algorithms. The 27 chapters covered in this book present recent advances and new ideas for promoting the techniques, technology and applications of pattern recognition.
