
    Combining Exploration and Exploitation in Active Learning

    This thesis investigates active learning in the presence of model bias. State-of-the-art approaches advocate combining exploration and exploitation in active learning, but they suffer from premature exploitation or unnecessary exploration in the later stages of learning. We propose to combine exploration and exploitation in active learning by discarding instances outside a sampling window centered on the estimated decision boundary and drawing samples uniformly from within this window. Initially the window spans the entire feature space and is gradually constricted, with the rate of constriction modelling the exploration-exploitation tradeoff. The desired effect of this approach (CExp) is an increasing sampling density in informative regions as active learning progresses, yielding a continuous and natural transition from exploration to exploitation that limits both premature exploitation and unnecessary exploration. We show that our approach outperforms the state of the art on real-world multiclass datasets. We also extend our approach to spatial mapping problems, where the standard active learning assumption of uniform costs is violated. We show that we can take advantage of spatial continuity in the data by geographically partitioning the instances in the sampling window and choosing a single partition (region) for sampling, as opposed to taking a random sample from the entire window, resulting in a novel spatial active learning algorithm that combines exploration and exploitation. We demonstrate that our approach (CExp-Spatial) generates more cost-effective sampling trajectories than baseline sampling methods. Finally, we present the real-world problem of mapping benthic habitats, where bathymetry-derived features are typically not strong enough to discriminate the fine details between classes identified from high-resolution imagery, increasing the possibility of model bias in active learning. We demonstrate, under such conditions, that CExp outperforms the state of the art and that CExp-Spatial generates more cost-effective sampling trajectories for an Autonomous Underwater Vehicle than baseline sampling strategies.
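
    To make the shrinking-window idea concrete, here is a minimal illustrative sketch (not the thesis implementation): the margin-based distance to the decision boundary, the linear constriction schedule, and the toy dataset are assumptions introduced for the example.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def cexp_query(pool_X, clf, round_idx, total_rounds, rng):
    """Pick one unlabelled instance: uniform draw from a window around the
    estimated decision boundary that constricts as learning progresses."""
    # Distance to the boundary, proxied by |decision_function| (assumption).
    margins = np.abs(clf.decision_function(pool_X))
    if margins.ndim > 1:                      # multiclass: smallest one-vs-rest margin
        margins = margins.min(axis=1)
    # Window starts at 100% of the pool and shrinks linearly (assumed schedule).
    frac = 1.0 - round_idx / total_rounds
    width = max(int(np.ceil(frac * len(pool_X))), 1)
    window = np.argsort(margins)[:width]      # instances closest to the boundary
    return rng.choice(window)                 # uniform draw inside the window

# Toy usage on a synthetic binary problem.
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 2))
y = (X[:, 0] + X[:, 1] > 0).astype(int)
labelled = list(np.where(y == 0)[0][:5]) + list(np.where(y == 1)[0][:5])
for t in range(40):
    clf = LogisticRegression().fit(X[labelled], y[labelled])
    pool = np.setdiff1d(np.arange(len(X)), labelled)
    labelled.append(pool[cexp_query(X[pool], clf, t, 40, rng)])
```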

    Prediction of pathological stage in patients with prostate cancer: a neuro-fuzzy model

    The prediction of cancer staging in prostate cancer is a process for estimating the likelihood that the cancer has spread before treatment is given to the patient. Although important for determining the most suitable treatment and optimal management strategy for patients, staging continues to present significant challenges to clinicians. Clinical test results such as the pre-treatment Prostate-Specific Antigen (PSA) level, the most common tumor pattern (Primary Gleason pattern) and the second most common tumor pattern (Secondary Gleason pattern) in tissue biopsies, and the clinical T stage can be used by clinicians to predict the pathological stage of cancer. However, not every patient will return abnormal results in all tests, which significantly limits the capacity to effectively predict the stage of prostate cancer. Herein we have developed a neuro-fuzzy computational intelligence model for classifying and predicting the likelihood of a patient having Organ-Confined Disease (OCD) or Extra-Prostatic Disease (ED), using a prostate cancer patient dataset obtained from The Cancer Genome Atlas (TCGA) Research Network. The system input consisted of the following variables: Primary and Secondary Gleason biopsy patterns, PSA level, age at diagnosis, and clinical T stage. The performance of the neuro-fuzzy system was compared to other computational intelligence based approaches, namely the Artificial Neural Network, Fuzzy C-Means, Support Vector Machine and Naive Bayes classifiers, as well as the AJCC pTNM Staging Nomogram commonly used by clinicians. A comparison of the optimal Receiver Operating Characteristic (ROC) points identified using these approaches revealed that the neuro-fuzzy system, at its optimal point, returns the largest Area Under the ROC Curve (AUC) with a low number of false positives (FPR = 0.274, TPR = 0.789, AUC = 0.812). The proposed approach is also an improvement over the AJCC pTNM Staging Nomogram (FPR = 0.032, TPR = 0.197, AUC = 0.582).
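
    As a rough illustration of the evaluation protocol described above (comparing classifiers by their ROC curves and AUC), the sketch below fits a few scikit-learn baselines on synthetic data and reports each model's AUC and optimal ROC point. The synthetic features standing in for the clinical variables and the "closest to the top-left corner" optimality rule are assumptions, and no neuro-fuzzy model is included.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.neural_network import MLPClassifier
from sklearn.svm import SVC
from sklearn.metrics import roc_curve, roc_auc_score

# Synthetic stand-in for (PSA, primary/secondary Gleason, age, clinical T stage).
X, y = make_classification(n_samples=500, n_features=5, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

models = {"ANN": MLPClassifier(max_iter=1000, random_state=0),
          "SVM": SVC(probability=True, random_state=0),
          "NaiveBayes": GaussianNB()}

for name, model in models.items():
    scores = model.fit(X_tr, y_tr).predict_proba(X_te)[:, 1]
    fpr, tpr, _ = roc_curve(y_te, scores)
    best = np.argmin(np.hypot(fpr, 1 - tpr))   # point closest to (FPR=0, TPR=1)
    print(f"{name}: AUC={roc_auc_score(y_te, scores):.3f} "
          f"optimal point FPR={fpr[best]:.3f} TPR={tpr[best]:.3f}")
```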

    Adaptive algorithms for real-world transactional data mining.

    The accurate identification of the right customer to target with the right product at the right time, through the right channel, to satisfy the customer’s evolving needs, is a key performance driver and enhancer for businesses. Data mining is an analytic process designed to explore usually large amounts of data (typically business or market related) in search of consistent patterns and/or systematic relationships between variables, for the purpose of generating explanatory or predictive data models from the detected patterns. It provides an effective and established mechanism for the accurate identification and classification of customers. Data models derived from the data mining process can aid in effectively recognizing the status and preferences of customers, individually and as a group, and can be incorporated into business market segmentation, customer targeting and channelling decisions with the goal of maximizing the total customer lifetime profit. However, due to costs, privacy and/or data protection reasons, the customer data available for data mining is often restricted to verified and validated data (in most cases, only business-owned transactional data is available). Transactional data is a valuable resource for generating such data models: it can be collected electronically and readily made available for data mining in large quantities at minimum extra cost. Transactional data is, however, inherently sparse and skewed, and these characteristics lead to poor performance of data models built on it. Data models for identifying, describing and classifying customers, constructed using evolving transactional data, thus need to handle the inherent sparseness and skewness of such data effectively in order to be efficient and accurate. Using real-world transactional data, this thesis presents the findings and results from the investigation of data mining algorithms for analysing, describing, identifying and classifying customers with evolving needs. In particular, methods for handling the issues of scalability, uncertainty and adaptation whilst mining evolving transactional data are analysed and presented. A novel application of a new framework for integrating transactional data binning and classification techniques is presented, alongside an effective prototype selection algorithm for efficient transactional data model building. A new change mining architecture for monitoring, detecting and visualizing changes in customer behaviour using transactional data is proposed and discussed as an effective means for analysing and understanding the change in customer buying behaviour over time. Finally, the challenging problem of discerning between a change in the customer profile (which may necessitate changing the customer’s label) and a change in the performance of the model(s) (which may necessitate changing or adapting the model(s)) is introduced and discussed by way of a novel, flexible and efficient architecture for classifier model adaptation and customer profile class relabelling.
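
    To picture the binning-plus-classification idea, here is a minimal sketch that aggregates a toy transaction log into per-customer features and applies quantile binning before a standard classifier. The column names, the RFM-style aggregation, and the random-forest classifier are illustrative assumptions, not the framework developed in the thesis.

```python
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import KBinsDiscretizer

# Toy transactional log: one row per purchase (columns are assumptions).
tx = pd.DataFrame({
    "customer_id": [1, 1, 2, 2, 2, 3],
    "amount":      [20.0, 35.5, 5.0, 7.5, 6.0, 120.0],
    "days_ago":    [3, 40, 1, 2, 5, 200],
})

# Aggregate sparse, skewed transactions into per-customer features (RFM-style).
features = tx.groupby("customer_id").agg(
    recency=("days_ago", "min"),
    frequency=("amount", "size"),
    monetary=("amount", "sum"),
)
labels = pd.Series([1, 0, 1], index=features.index)   # e.g. responded to an offer

# Quantile binning tames the skew before the classifier sees the data.
model = make_pipeline(
    KBinsDiscretizer(n_bins=3, encode="ordinal", strategy="quantile"),
    RandomForestClassifier(random_state=0),
)
model.fit(features, labels)
print(model.predict(features))
```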

    Biomedical Data Classification with Improvised Deep Learning Architectures

    With the rise of very powerful hardware and the evolution of deep learning architectures, healthcare data analysis and its applications have been drastically transformed. These transformations mainly aim to aid healthcare personnel with the diagnosis and prognosis of a disease or abnormality at any given point of the routine healthcare workflow. For instance, much of cancer metastases detection depends on pathological tissue procedures and pathologist reviews. Severity classification reports vary amongst pathologists, which then leads to different treatment options for a patient. This labor-intensive work can lead to errors or mistreatments, resulting in high healthcare costs. With the help of machine learning and deep learning modules, some of these traditional diagnosis techniques can be improved, aiding a doctor in decision making with an unbiased view. Such modules can help reduce the cost, the shortage of expertise, and the time involved in identifying the disease. However, there are many other datapoints available alongside medical images, such as omics data, biomarker calculations, patient demographics and history. All these datapoints can enhance disease classification or prediction of progression with the help of machine learning/deep learning modules. However, it is very difficult to find a comprehensive dataset with all the different modalities and features in a healthcare setting due to privacy regulations. Hence, in this thesis, we explore both medical imaging data with clinical datapoints and genomics datasets separately for classification tasks using combinational deep learning architectures. We use deep neural networks with 3D volumetric structural magnetic resonance images of an Alzheimer's Disease dataset for classification of the disease. A separate study is implemented to understand classification based on clinical datapoints achieved by machine learning algorithms. For bioinformatics applications, sequence classification is a crucial step for many metagenomics applications; however, it requires extensive preprocessing, such as sequence assembly or sequence alignment, before raw whole-genome sequencing data can be used, making it time consuming, especially in bacterial taxonomy classification. Only a few approaches exist for sequence classification tasks, mainly involving convolutions and deep neural networks. A novel method is developed that uses the intrinsic nature of recurrent neural networks for 16S rRNA sequence classification and can be adapted to utilize read sequences directly. For this classification task, the accuracy is improved using optimization techniques with a hybrid neural network.
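
    To illustrate the recurrent-network idea for read-level classification, the sketch below encodes short DNA reads as one-hot tensors and runs them through a small GRU classifier in PyTorch. The encoding, network sizes, number of taxa, and toy reads are assumptions made for the example, not the thesis architecture.

```python
import torch
import torch.nn as nn

BASES = {"A": 0, "C": 1, "G": 2, "T": 3}

def one_hot(read):
    """Encode a DNA read as a (length, 4) one-hot tensor."""
    idx = torch.tensor([BASES[b] for b in read])
    return nn.functional.one_hot(idx, num_classes=4).float()

class ReadClassifier(nn.Module):
    """GRU over raw read bases; the last hidden state feeds a class head."""
    def __init__(self, n_classes, hidden=64):
        super().__init__()
        self.rnn = nn.GRU(input_size=4, hidden_size=hidden, batch_first=True)
        self.head = nn.Linear(hidden, n_classes)

    def forward(self, x):              # x: (batch, read_len, 4)
        _, h = self.rnn(x)
        return self.head(h[-1])        # logits: (batch, n_classes)

# Toy usage with two fake reads and three taxa.
reads = ["ACGTACGTAC", "TTGACCGTAA"]
batch = torch.stack([one_hot(r) for r in reads])
model = ReadClassifier(n_classes=3)
logits = model(batch)
print(logits.shape)                    # torch.Size([2, 3])
```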

    Simple but Not Simplistic: Reducing the Complexity of Machine Learning Methods

    Programa Oficial de Doutoramento en Computación. 5009V01. [Abstract] The advent of Big Data and the explosion of the Internet of Things have brought unprecedented challenges to Machine Learning researchers, making the learning task even more complex. Real-world machine learning problems usually have inherent complexities, such as the intrinsic characteristics of the data, the large number of instances, the high dimensionality of the input data, shifts in distribution between the training and test sets, etc. All these aspects matter, and they call for new models that can confront these situations. In this thesis, all these issues are addressed, with the aim of simplifying the machine learning process in the current scenario. First, a complexity analysis is carried out to observe how complexity influences the classification task, and whether applying a prior feature selection step can reduce that complexity. Then, the simplification of the learning phase is addressed through the divide-and-conquer philosophy, using a distributed approach. Next, the same philosophy is applied to the feature selection process. Finally, we opt for a different approach following the Edge Computing philosophy, which allows the data produced by Internet of Things devices to be processed closer to where they were created. The proposed approaches have demonstrated their capability to reduce the complexity of traditional machine learning methods, and thus the contribution of this thesis is expected to open the doors to the development of new machine learning methods that are simpler, more robust, and more computationally efficient.
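
    To illustrate the divide-and-conquer idea applied to feature selection, the sketch below splits the data into partitions, ranks features independently on each partition (here with mutual information), and merges the per-partition rankings by mean rank. The ranking criterion and the merge rule are assumptions for the example, not the methods developed in the thesis.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.feature_selection import mutual_info_classif

def distributed_feature_selection(X, y, n_partitions=4, k=5, seed=0):
    """Rank features independently on each data partition, then merge by mean rank."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(X))
    ranks = []
    for part in np.array_split(idx, n_partitions):
        scores = mutual_info_classif(X[part], y[part], random_state=seed)
        ranks.append(np.argsort(np.argsort(-scores)))   # rank 0 = most informative
    mean_rank = np.mean(ranks, axis=0)
    return np.argsort(mean_rank)[:k]                    # k best features overall

X, y = make_classification(n_samples=2000, n_features=30, n_informative=5, random_state=0)
print(distributed_feature_selection(X, y))
```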

    Imaging White Blood Cells using a Snapshot Hyper-Spectral Imaging System

    Automated white blood cell (WBC) counting systems process an extracted whole blood sample and provide a cell count, a step that is not ideal for on-site screening of individuals in triage or at a security gate. Snapshot Hyper-Spectral imaging systems are capable of capturing several spectral bands simultaneously, offering co-registered images of a target. With appropriate optics, these systems are potentially able to image blood cells in vivo as they flow through a vessel, eliminating the need for a blood draw and sample staining. Our group has evaluated the capability of a commercial Snapshot Hyper-Spectral imaging system, specifically the Arrow system from Rebellion Photonics, in differentiating between white and red blood cells on unstained and sealed blood smear slides. We evaluated the imaging capabilities of this hyperspectral camera as a platform on which to build an automated blood cell counting system. Hyperspectral data consisting of 25 bands of 443 x 313 pixels, with ~3 nm spacing, were captured over the range of 419 to 494 nm. Open-source hyperspectral datacube analysis tools, used primarily in Geographic Information Systems (GIS) applications, indicate that white blood cells' features are most prominent in the 428-442 nm band for blood samples viewed under 20x and 50x magnification over a varying range of illumination intensities. The system has been shown to successfully segment blood cells based on their spectral-spatial information. These images could potentially be used in subsequent automated white blood cell segmentation and counting algorithms for performing in vivo white blood cell counting.
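
    As a rough illustration of working with such a datacube, the sketch below selects the 428-442 nm bands and thresholds their mean image to obtain a candidate white-blood-cell mask. The simulated cube, the wavelength grid, and the simple global threshold are assumptions for the example, not the group's processing pipeline.

```python
import numpy as np

# Simulated datacube: 25 bands of 443 x 313 pixels, ~3 nm spacing from 419 to 494 nm.
rng = np.random.default_rng(0)
cube = rng.random((25, 443, 313))
wavelengths = np.linspace(419, 494, 25)

# Keep the bands where WBC features were reported as most prominent (428-442 nm).
band_sel = (wavelengths >= 428) & (wavelengths <= 442)
wbc_image = cube[band_sel].mean(axis=0)

# Simple global threshold as a stand-in for spectral-spatial segmentation.
threshold = wbc_image.mean() + wbc_image.std()
mask = wbc_image > threshold
print(f"{band_sel.sum()} bands used, {mask.sum()} candidate WBC pixels")
```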