180 research outputs found
Knowledge discovery in data streams
Knowing what to do with the massive amount of data collected has long been an issue for many organizations. While data mining has been touted as the solution, it has not delivered the promised impact despite its successes in many areas. One reason is that data mining algorithms were not designed for the real world, i.e., they usually assume a static view of the data and a stable execution environment where resources are abundant. The reality, however, is that data are constantly changing and the execution environment is dynamic. Hence, it becomes difficult for data mining to deliver timely and relevant results. Recently, the processing of stream data has received much attention. What is interesting is that the methodology used to design stream-based algorithms may well be the solution to the above problem. In this entry, we discuss this issue and present an overview of recent work.
Outlier Detection from Network Data with Subnetwork Interpretation
Detecting a small number of outliers from a set of data observations is
always challenging. This problem is more difficult in the setting of multiple
network samples, where computing the anomalous degree of a network sample is
generally not sufficient. In fact, explaining why the network is exceptional,
expressed in the form of a subnetwork, is equally important. In this paper,
we develop a novel algorithm to address these two key problems. We treat each
network sample as a potential outlier and identify subnetworks that best
discriminate it from nearby regular samples. The algorithm is developed in the
framework of network regression combined with the constraints on both network
topology and L1-norm shrinkage to perform subnetwork discovery. Our method thus
goes beyond subspace/subgraph discovery and we show that it converges to a
global optimum. Evaluation on various real-world network datasets demonstrates
that our algorithm not only outperforms baselines in both network and
high-dimensional settings, but also discovers highly relevant and interpretable local
subnetworks, further enhancing our understanding of anomalous networks.
The role of orographic effects on occurrence of the heavy rainfall event over Central Vietnam in November 1999
In this study, the WRF model is used to investigate the role of the Central Vietnam terrain in the occurrence of the heavy rainfall event of November 1999 over Central Vietnam. Two model experiments, with and without terrain, were performed to examine the orographic blocking effects during the event. In the terrain experiment, the results from a three-day simulation show that the model captures the northeast monsoon circulation, tropical cyclones, and the occurrence of heavy rainfall in Central Vietnam reasonably well. The topography causes a high-pressure anomaly that intensifies the northeast monsoon. When the terrain is removed, the three-day accumulated rainfall decreases by approximately 75% in comparison with that in the terrain experiment. The terrain blocking and lifting effects under strong-wind and moisture-laden conditions, combined with convergent circulation over the open ocean, are the main factors in the occurrence of the heavy rainfall event.
Maximal Domain Independent Representations Improve Transfer Learning
Domain adaptation (DA) adapts a training dataset from a source domain for use
in a learning task in a target domain in combination with data available at the
target. One popular approach for DA is to create a domain-independent
representation (DIRep) learned by a generator from all input samples and then
train a classifier on top of it using all labeled samples. A domain
discriminator is added to train the generator adversarially to exclude
domain-specific features from the DIRep. However, this approach tends to generate
insufficient information for accurate classification learning. In this paper,
we present a novel approach that integrates the adversarial model with a
variational autoencoder. In addition to the DIRep, we introduce a
domain-dependent representation (DDRep) such that information from both DIRep
and DDRep is sufficient to reconstruct samples from both domains. We further
penalize the size of the DDRep to drive as much information as possible to the
DIRep, which maximizes the accuracy of the classifier in labeling samples in
both domains. We empirically evaluate our model using synthetic datasets and
demonstrate that spurious class-related features introduced in the source
domain are successfully absorbed by the DDRep. This leaves a rich and clean
DIRep for accurate transfer learning in the target domain. We further
demonstrate its superior performance against other algorithms for a number of
common image datasets. We also show we can take advantage of pretrained models
Functional-Antioxidant Food
Nowadays, people face many different dangers, such as stress, unsafe food, and environmental pollution, but not everyone suffers from them. Meanwhile, free radicals are the biggest threat to humans because they lead to over 80 different diseases, including aging. Free radicals can only be eliminated or minimized with antioxidant foods or antioxidants. This chapter on functional-antioxidant food presents the concept of antioxidant functional food and the classification, structure, and extraction process of antioxidant ingredients. Various antioxidant substances, such as protein (collagen), polysaccharides (fucoidans, alginates, glucosamines, inulins, laminarins, ulvans, and pectins), and secondary metabolites (polyphenols (phlorotannins, lignins, polyphenols), alkaloids, and flavonoids), are also presented. The production technology, mechanism, opportunities, and challenges of antioxidant functional food are covered as well. The chapter also gives the production process of functional-antioxidant food in forms including capsules, tablets, tubes, pills, powders, and effervescent tablets.
- …