A framework for automated anomaly detection in high frequency water-quality data from in situ sensors
River water-quality monitoring is increasingly conducted using automated in
situ sensors, enabling timelier identification of unexpected values. However,
anomalies caused by technical issues confound these data, while the volume and
velocity of data prevent manual detection. We present a framework for automated
anomaly detection in high-frequency water-quality data from in situ sensors,
using turbidity, conductivity and river level data. After identifying end-user
needs and defining anomalies, we ranked their importance and selected suitable
detection methods. High priority anomalies included sudden isolated spikes and
level shifts, most of which were classified correctly by regression-based
methods such as autoregressive integrated moving average models. However, using
other water-quality variables as covariates reduced performance due to complex
relationships among variables. Classification of drift and periods of
anomalously low or high variability improved when we replaced anomalous
measurements with forecasts, but this inflated false positive rates.
Feature-based methods also performed well on high priority anomalies, but were
less proficient at detecting lower priority anomalies, resulting in high
false negative rates. Unlike regression-based methods, all feature-based
methods produced low false positive rates, but did not require training or
optimization. Rule-based methods successfully detected impossible values and
missing observations. Thus, we recommend using a combination of methods to
improve anomaly detection performance, whilst minimizing false detection rates.
Furthermore, our framework emphasizes the importance of communication between
end-users and analysts for optimal outcomes with respect to both detection
performance and end-user needs. Our framework is applicable to other types of
high frequency time-series data and anomaly detection applications.
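The rule-based step mentioned in the abstract above (detecting impossible values and missing observations) can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `rule_based_flags`, the valid range, and the fixed sampling step are all assumptions made for illustration.

```python
from datetime import datetime, timedelta


def rule_based_flags(readings, lower, upper, step):
    """Flag impossible values and missing observations.

    readings: list of (timestamp, value) pairs sorted by time; value may be None.
    lower, upper: assumed physically possible range for the variable.
    step: assumed regular sampling interval (a timedelta).
    Returns (impossible, missing) as two lists of timestamps.
    """
    impossible = []
    missing = []
    previous = None
    for ts, value in readings:
        # A gap longer than one sampling step means observations were skipped.
        if previous is not None:
            expected = previous + step
            while expected < ts:
                missing.append(expected)
                expected += step
        if value is None:
            # A timestamp logged without a measurement also counts as missing.
            missing.append(ts)
        elif not (lower <= value <= upper):
            # Values outside the possible range (e.g. negative turbidity).
            impossible.append(ts)
        previous = ts
    return impossible, missing
```

Rules of this kind need no training, which is why they pair naturally with the regression-based and feature-based methods the abstract recommends combining.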
Dependency relations as source context in phrase-based SMT
The Phrase-Based Statistical Machine Translation (PB-SMT) model has recently begun to include source context modeling, under the assumption that the proper lexical
choice of an ambiguous word can be determined from the context in which it appears. Various types of lexical and syntactic features such as words, parts-of-speech, and
supertags have been explored as effective source context in SMT. In this paper, we show that position-independent syntactic dependency relations of the head of a source phrase can be modeled as useful source context to improve target phrase selection and thereby improve overall performance of PB-SMT. On a Dutch-English translation task, by combining dependency relations and syntactic contextual features (part-of-speech), we achieved a 1.0 BLEU (Papineni et al., 2002) point improvement (3.1% relative) over the baseline.
Heterogeneous structure in mixed-species corvid flocks in flight
Flocks of birds in flight represent a striking example of collective behaviour. Models of self-organization suggest that repeated interactions among individuals following simple rules can generate the complex patterns and coordinated movements exhibited by flocks. However, such models often assume that individuals are identical and interchangeable, and fail to account for individual differences and social relationships among group members. Here, we show that heterogeneity resulting from species differences and social structure can affect flock spatial dynamics. Using high-resolution photographs of mixed flocks of jackdaws, Corvus monedula, and rooks, Corvus frugilegus, we show that birds preferentially associated with conspecifics and that, like high-ranking members of single-species groups, the larger and more socially dominant rooks positioned themselves near the leading edge of flocks. Neighbouring birds showed closer directional alignment if they were of the same species, and neighbouring jackdaws in particular flew very close to one another. Moreover, birds of both species often flew especially close to a single same-species neighbour, probably reflecting the monogamous pair bonds that characterize these corvid social systems. Together, our findings demonstrate that the characteristics of individuals and their social systems are likely to result in preferential associations that critically influence flock structure.
Recommender System Based on Semantic Similarity
In electronic commerce, helping users find their favourite products requires a system that classifies products according to each user's interests and needs and recommends them accordingly. For the same reason, recommendation systems are designed to help users find information in large websites. They are developed to offer products to customers in an automated fashion so that they can shop conveniently. Developing such systems is important because purchasing a product often involves a large number of factors, which makes it difficult for the customer to make the best decision. Finding relationships among users and among products is an important issue in these systems. One such relationship is similarity. A measure of similarity among users and products is used in pure methods to calculate the degree of similarity. In this paper, semantic similarity is used to find a set of k nearest neighbours to the target user or target item. The experimental results show that, by incorporating semantic similarity, the proposed recommendation system obtained high accuracy on a private building company dataset in comparison with state-of-the-art recommender systems.
DOI: http://dx.doi.org/10.11591/ijece.v3i6.393
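The neighbour-selection step described in the abstract above (finding the k nearest neighbours of a target user or item under a similarity measure) might look roughly like the sketch below. The paper's semantic similarity measure is not specified here, so plain cosine similarity over hypothetical feature vectors stands in for it; the function names and the data layout are assumptions for illustration only.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length numeric vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def k_nearest_neighbours(target, candidates, k, similarity=cosine_similarity):
    """Return the names of the k candidates most similar to the target vector.

    candidates: dict mapping a user or item name to its feature vector.
    similarity: any function of two vectors; a semantic measure could be
    passed here instead of the cosine stand-in.
    """
    scored = [(similarity(target, vec), name) for name, vec in candidates.items()]
    # Highest similarity first; ties are broken alphabetically by name.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [name for _, name in scored[:k]]
```

Passing the similarity function as a parameter keeps the neighbour search independent of how similarity is defined, so a different measure can be substituted without changing the selection logic.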
Characterisation of acoustic scenes using a temporally-constrained shift-invariant model
In this paper, we propose a method for modeling and classifying acoustic scenes using temporally-constrained shift-invariant probabilistic latent component analysis (SIPLCA). SIPLCA can be used for extracting time-frequency patches from spectrograms in an unsupervised manner. Component-wise hidden Markov models are incorporated into the SIPLCA formulation to enforce temporal constraints on the activation of each acoustic component. The time-frequency patches are converted to cepstral coefficients in order to provide a compact representation of acoustic events within a scene. Experiments are conducted using a corpus of train station recordings, classified into 6 scene classes. Results show that the proposed model is able to model salient events within a scene and outperforms the non-negative matrix factorization algorithm for the same task. In addition, it is demonstrated that the use of temporal constraints can lead to improved performance.
Weakly supervised segment annotation via expectation kernel density estimation
Since the labelling for the positive images/videos is ambiguous in weakly
supervised segment annotation, negative mining based methods that only use the
intra-class information emerge. In these methods, negative instances are
utilized to penalize unknown instances to rank their likelihood of being an
object, which can be considered as a voting in terms of similarity. However,
these methods 1) ignore the information contained in positive bags, 2) only
rank the likelihood but cannot generate an explicit decision function. In this
paper, we propose a voting scheme involving not only the definite negative
instances but also the ambiguous positive instances to make use of the extra
useful information in the weakly labelled positive bags. In the scheme, each
instance votes for its label with a magnitude arising from the similarity, and
the ambiguous positive instances are assigned soft labels that are iteratively
updated during the voting. It overcomes the limitations of voting using only
the negative bags. We also propose an expectation kernel density estimation
(eKDE) algorithm to gain further insight into the voting mechanism.
Experimental results demonstrate the superiority of our scheme over the
baselines.
Comment: 9 pages, 2 figures