146 research outputs found

    Active Learning for Data Streams under Concept Drift and Concept Evolution

    Data stream classification is an important problem that poses many challenges. Since the length of the data is theoretically infinite, it is impractical to store and process all the historical data. Data streams also experience changes in their underlying distribution (concept drift), so the classifier must adapt. Another challenge of data stream classification is the possible emergence and disappearance of classes, known as the concept evolution problem. On top of these challenges, acquiring labels for such large data is expensive. In this paper, we propose a stream-based active learning (AL) strategy (SAL) that handles the aforementioned challenges. SAL aims to query the labels of samples so as to optimize the expected future error. It handles concept drift and concept evolution by adapting to the change in the stream. Furthermore, as part of the error reduction process, SAL handles the sampling bias problem and queries the samples that caused the change, i.e., drifted samples or samples coming from new classes. To tackle the lack of prior knowledge about the streaming data, non-parametric Bayesian modelling is adopted, namely two representations of the Dirichlet process: Dirichlet mixture models and the stick-breaking process. Empirical results obtained on real-world benchmarks show the high performance of the proposed SAL method compared to state-of-the-art methods.
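As a rough illustration of the stick-breaking representation named in the abstract above (not the authors' SAL implementation), the sketch below draws truncated mixture weights for a Dirichlet process. The function name `stick_breaking_weights`, the concentration parameter `alpha`, and the truncation level `num_sticks` are assumptions introduced only for this example.

```python
import random


def stick_breaking_weights(alpha, num_sticks, seed=0):
    """Draw truncated stick-breaking weights for a Dirichlet process.

    Each beta_k ~ Beta(1, alpha) breaks off a fraction of the stick that
    is still left; the k-th weight is that fraction times the remainder.
    Returns the list of weights and the unassigned remainder of the stick.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    weights = []
    remaining = 1.0
    for _ in range(num_sticks):
        beta_k = rng.betavariate(1.0, alpha)
        weights.append(beta_k * remaining)
        remaining *= 1.0 - beta_k
    return weights, remaining


if __name__ == "__main__":
    w, rest = stick_breaking_weights(alpha=2.0, num_sticks=10)
    print(w, rest)
```

A smaller `alpha` tends to concentrate mass on the first few components, while a larger one spreads it over more components, which is why this construction is convenient when the number of clusters or classes is not known in advance.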

    A non-parametric hierarchical clustering model

    © 2015 IEEE. We present a novel non-parametric hierarchical clustering model based on Gaussian mixture models (NHCM). NHCM uses a novel Dirichlet process (DP) prior that allows for more flexible modeling of the data, where the base distribution of the DP is itself an infinite mixture of Gaussian conjugate priors. NHCM can be thought of as a hierarchical clustering model in which the low-level base prior governs the distribution of the data points forming sub-clusters, and the higher-level prior governs the distribution of the sub-clusters forming clusters. Using this hierarchical configuration, we can keep the complexity of the model low and still cluster skewed, complex data. To perform inference, we propose a Gibbs sampling algorithm. Empirical investigations have been carried out to analyse the efficiency of the proposed clustering model.

    A Bi-Criteria Active Learning Algorithm for Dynamic Data Streams

    Active learning (AL) is a promising way to efficiently build up training sets with minimal supervision. A learner deliberately queries specific instances to tune the classifier's model using as few labels as possible. The challenge for streaming is that the data distribution may evolve over time and therefore the model must adapt. Another challenge is sampling bias, where the sampled training set does not reflect the underlying data distribution. In the presence of concept drift, sampling bias is more likely to occur because the training set needs to represent the whole evolving data. To tackle these challenges, we propose a novel bi-criteria AL approach (BAL) that relies on two selection criteria, namely a label uncertainty criterion and a density-based criterion. While the first criterion selects the instances that are the most uncertain in terms of class membership, the latter dynamically curbs the sampling bias by weighting the samples to reflect the true underlying distribution. To design and implement these two criteria for learning from streams, BAL adopts a Bayesian online learning approach and combines online classification and online clustering through the use of online logistic regression and online growing Gaussian mixture models, respectively. Empirical results obtained on standard synthetic and real-world benchmarks show the high performance of the proposed BAL method compared to state-of-the-art AL methods.
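To make the idea of combining two selection criteria concrete, here is a minimal sketch of one possible query rule. It is not the BAL algorithm: the abstract does not say how the two criteria are combined, so the product rule, the `threshold` value, and the function names `label_uncertainty` and `should_query` are all assumptions chosen for illustration. The density weight is taken as a given number rather than computed from a growing Gaussian mixture model.

```python
import math


def label_uncertainty(p_positive):
    """Binary entropy (in bits) of a predicted positive-class probability."""
    if p_positive <= 0.0 or p_positive >= 1.0:
        return 0.0  # a certain prediction carries no uncertainty
    q = 1.0 - p_positive
    return -(p_positive * math.log2(p_positive) + q * math.log2(q))


def should_query(p_positive, density_weight, threshold=0.5):
    """Hypothetical rule: query the label when the uncertainty, scaled by
    a density-based weight, exceeds a fixed threshold."""
    return label_uncertainty(p_positive) * density_weight > threshold


if __name__ == "__main__":
    # An uncertain sample in a well-represented region is queried.
    print(should_query(0.5, 1.0))
    # A confidently classified sample is not.
    print(should_query(0.99, 1.0))
```

Scaling the uncertainty by a density weight means that an ambiguous sample in a sparse or down-weighted region is less likely to be queried than an equally ambiguous one in a region the weight favours.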

    Asynchronous Stochastic Variational Inference

    Stochastic variational inference (SVI) employs stochastic optimization to scale up Bayesian computation to massive data. Since SVI is at its core a stochastic gradient-based algorithm, horizontal parallelism can be harnessed to allow larger scale inference. We propose a lock-free parallel implementation for SVI which allows distributed computations over multiple slaves in an asynchronous style. We show that our implementation leads to linear speed-up while guaranteeing an asymptotic ergodic convergence rate of $O(1/\sqrt{T})$, given that the number of slaves is bounded by $\sqrt{T}$ ($T$ is the total number of iterations). The implementation is done in a high-performance computing (HPC) environment using the message passing interface (MPI) for Python (MPI4py). The extensive empirical evaluation shows that our parallel SVI is lossless, performing comparably well to its counterpart serial SVI with linear speed-up. Comment: 7 pages, 8 figures, 1 table, 2 algorithms. The paper has been submitted for publication.

    Synthesis and anti-tumor activities of new [1,2,4]triazolo[1,5-a]pyrimidine derivatives

    Condensation of 1H-1,2,4-triazol-5-amine with the appropriate sodium (E)-(2-oxocycloalkylidene)methanolate gave 7,8-dihydro-6H-cyclopenta[e][1,2,4]triazolo[1,5-a]pyrimidine, 6,7,8,9-tetrahydro-[1,2,4]triazolo[1,5-a]quinazoline, 7,8,9,10-tetrahydro-6H-cyclohepta[e][1,2,4]triazolo[1,5-a]pyrimidine and 6,7,8,9,10,11-hexahydrocycloocta[e][1,2,4]triazolo[1,5-a]pyrimidine. Structures of the newly synthesized compounds were elucidated via elemental analyses, spectral (IR, 1H NMR, 13C NMR, 2D NMR), and X-ray single crystal diffraction data. These derivatives showed potent anti-tumor cytotoxic activity in vitro using different human cancer cell lines.

    Active Learning for Classifying Data Streams with Unknown Number of Classes.

    The classification of data streams is an interesting but also a challenging problem. A data stream may grow infinitely, making it impractical to store prior to processing and classification. Due to its dynamic nature, the underlying distribution of the data stream may change over time, resulting in so-called concept drift or in the possible emergence and fading of classes, known as concept evolution. In addition, acquiring labels of data samples in a stream is admittedly expensive, if not infeasible. In this paper, we propose a novel stream-based active learning algorithm (SAL) which is capable of coping with both concept drift and concept evolution by adapting the classification model to the dynamic changes in the stream. SAL is the first AL algorithm in the literature to explicitly take account of these concepts. Moreover, using SAL, only the labels of samples that are expected to reduce the expected future error are queried. This is done while tackling the problem of sampling bias, so that samples that induce the change (i.e., drifting samples or samples coming from new classes) are queried. To implement SAL efficiently, the paper proposes the application of non-parametric Bayesian models, which make it possible to cope with the lack of prior knowledge about the data stream. In particular, Dirichlet mixture models and the stick-breaking process are adopted and adapted to meet the requirements of online learning. The empirical results obtained on real-world benchmarks demonstrate the superiority of SAL in terms of classification performance over state-of-the-art methods, using average accuracy and average class accuracy.

    Encapsulated polycaprolactone with triazole derivatives and selenium nanoparticles as promising antiproliferative and anticancer agents

    Background and purpose: Polycaprolactone nanocapsules incorporated with triazole derivatives in the presence and absence of selenium nanoparticles were prepared and evaluated as antiproliferative and anticancer agents. Polycaprolactone nanoparticles were prepared using the emulsion technique. Experimental approach: The prepared capsules were characterized using FT-IR, TEM and DLS measurements. The synthesized triazolopyrimidine derivative in the presence and absence of selenium nanoparticles encapsulated in polycaprolactone was tested for its in vitro antiproliferative efficiency towards a human breast cancer cell line (MCF7) and a murine fibroblast normal cell line (BALB/3T3) in comparison to doxorubicin as a standard anticancer drug. Key results: The results indicated that encapsulated polycaprolactone with selenium nanoparticles (SeNPs) and triazole-SeNPs were the most potent samples against the tested breast cancer cell line (MCF7). On the other hand, all compounds showed weak or moderate activities towards the tested murine fibroblast normal cell line (BALB/3T3). Conclusion: As the safety index (SI) was higher than 1.0, it expanded the way for newly synthesized compounds to express antiproliferative efficacy against tumour cells. Hence, these compounds may be considered promising ones. However, they should be examined through further in vivo and pharmacokinetic studies.

    Preface

    Preface

    Detection of Helicobacter pylori oipA and dupA genes among dyspeptic patients with chronic gastritis

    Helicobacter pylori (H. pylori) is a microbe with wide genetic diversity that infects the stomach of most people in developing countries, leading to several clinical outcomes among different individuals such as gastritis, ulcers, or gastric cancer. Outer inflammatory protein A (oipA) and duodenal ulcer promoting (dupA) genes are among the possible virulence factors which determine the patient outcome. Aim: To detect oipA and dupA genes of H. pylori among dyspeptic Egyptian patients, and to investigate their correlation with the varying degrees of the associated chronic gastritis. Methods: The study enrolled 50 patients with dyspepsia attending the Gastrointestinal Endoscopy unit of the Gastroenterology and Tropical Departments at Ain Shams University Hospital for upper gastrointestinal endoscopy between June and December 2019. Four antral gastric biopsies were taken from each patient for polymerase chain reaction (PCR) assay to detect the virulence genes oipA, dupA, and cagA and for histopathological assessment. Results: Forty patients were H. pylori positive by histopathology and PCR. cagA, oipA, and dupA were identified in 6 (15%), 13 (32.5%), and 9 (22.5%) of biopsies, respectively. Both cagA and oipA genes were highly significantly associated with increasing severity of gastritis. Only the oipA virulence gene showed a highly significant association with gastroduodenitis. There was a highly significant moderate association between cagA and oipA genes. Conclusion: oipA could be a virulence biomarker of great value in predicting the progress of gastric mucosal damage in patients with chronic gastritis, and in targeting antimicrobial therapy in those patients to prevent severe gastroduodenal diseases.