180 research outputs found

    Knowledge discovery in data streams

    Knowing what to do with the massive amounts of data they collect has long been an issue for many organizations. While data mining has been touted as the solution, it has failed to deliver the expected impact despite its successes in many areas. One reason is that data mining algorithms were not designed for the real world: they usually assume a static view of the data and a stable execution environment where resources are abundant. In reality, however, data change constantly and the execution environment is dynamic. Hence, it becomes difficult for data mining to deliver timely and relevant results. Recently, the processing of stream data has received much attention. Interestingly, the methodology used to design stream-based algorithms may well be the solution to the above problem. In this entry, we discuss this issue and present an overview of recent work.

    Outlier Detection from Network Data with Subnetwork Interpretation

    Detecting a small number of outliers from a set of data observations is always challenging. This problem is more difficult in the setting of multiple network samples, where computing the anomalous degree of a network sample is generally not sufficient. In fact, explaining why the network is exceptional, expressed in the form of a subnetwork, is equally important. In this paper, we develop a novel algorithm to address these two key problems. We treat each network sample as a potential outlier and identify the subnetworks that best discriminate it from nearby regular samples. The algorithm is developed in the framework of network regression, combined with constraints on both network topology and L1-norm shrinkage to perform subnetwork discovery. Our method thus goes beyond subspace/subgraph discovery, and we show that it converges to a global optimum. Evaluation on various real-world network datasets demonstrates that our algorithm not only outperforms baselines in both network and high-dimensional settings, but also discovers highly relevant and interpretable local subnetworks, further enhancing our understanding of anomalous networks.

    The role of orographic effects on occurrence of the heavy rainfall event over Central Vietnam in November 1999

    In this study, the WRF model is used to investigate the role of the Central Vietnam terrain in the occurrence of the heavy rainfall event of November 1999 over Central Vietnam. Two model experiments, with and without terrain, were performed to examine the orographic blocking effects during the event. In the terrain experiment, the results from a three-day simulation show that the model captures the northeast monsoon circulation, tropical cyclones, and the occurrence of heavy rainfall in Central Vietnam reasonably well. The topography causes a high-pressure anomaly that intensifies the northeast monsoon. When the terrain is removed, the three-day accumulated rainfall decreases by approximately 75% in comparison with that in the terrain experiment. The terrain blocking and lifting effects in strong-wind, moisture-laden conditions, combined with convergence circulation over the open ocean, are the main factors in the occurrence of the heavy rainfall event.

    Maximal Domain Independent Representations Improve Transfer Learning

    Domain adaptation (DA) adapts a training dataset from a source domain for use in a learning task in a target domain, in combination with data available at the target. One popular approach for DA is to create a domain-independent representation (DIRep) learned by a generator from all input samples and then train a classifier on top of it using all labeled samples. A domain discriminator is added to train the generator adversarially to exclude domain-specific features from the DIRep. However, this approach tends to generate insufficient information for accurate classification learning. In this paper, we present a novel approach that integrates the adversarial model with a variational autoencoder. In addition to the DIRep, we introduce a domain-dependent representation (DDRep) such that information from both DIRep and DDRep is sufficient to reconstruct samples from both domains. We further penalize the size of the DDRep to drive as much information as possible to the DIRep, which maximizes the accuracy of the classifier in labeling samples in both domains. We empirically evaluate our model using synthetic datasets and demonstrate that spurious class-related features introduced in the source domain are successfully absorbed by the DDRep. This leaves a rich and clean DIRep for accurate transfer learning in the target domain. We further demonstrate its superior performance against other algorithms on a number of common image datasets. We also show that we can take advantage of pretrained models.

    Functional-Antioxidant Food

    Nowadays, people face many different dangers, such as stress, unsafe food, and environmental pollution, but not everyone suffers. Meanwhile, free radicals are the biggest threat to humans because they lead to over 80 different diseases, including aging. Free radicals can only be eliminated or minimized with antioxidant foods or antioxidants. The chapter on functional-antioxidant food presents the antioxidant functional food concept and the classification, structure, and extraction process of antioxidant ingredients. Various antioxidant substances, such as protein (collagen), polysaccharides (fucoidans, alginates, glucosamines, inulins, laminarins, ulvans, and pectins), and secondary metabolites (polyphenols (phlorotannins, lignins, polyphenols), alkaloids, and flavonoids), are also presented. The production technology, mechanism, opportunities, and challenges of antioxidant functional food are also presented in the chapter. The chapter also gives the production process of functional-antioxidant food in forms including capsules, tablets, tubes, pills, powders, and effervescent tablets.