8,936 research outputs found

    A framework for automated anomaly detection in high frequency water-quality data from in situ sensors

    River water-quality monitoring is increasingly conducted using automated in situ sensors, enabling timelier identification of unexpected values. However, anomalies caused by technical issues confound these data, while the volume and velocity of data prevent manual detection. We present a framework for automated anomaly detection in high-frequency water-quality data from in situ sensors, using turbidity, conductivity and river level data. After identifying end-user needs and defining anomalies, we ranked their importance and selected suitable detection methods. High-priority anomalies included sudden isolated spikes and level shifts, most of which were classified correctly by regression-based methods such as autoregressive integrated moving average models. However, using other water-quality variables as covariates reduced performance due to complex relationships among variables. Classification of drift and periods of anomalously low or high variability improved when we replaced anomalous measurements with forecasts, but this inflated false positive rates. Feature-based methods also performed well on high-priority anomalies, but were less proficient at detecting lower-priority anomalies, resulting in high false negative rates. Unlike regression-based methods, all feature-based methods produced low false positive rates and did not require training or optimization. Rule-based methods successfully detected impossible values and missing observations. Thus, we recommend using a combination of methods to improve anomaly detection performance, whilst minimizing false detection rates. Furthermore, our framework emphasizes the importance of communication between end-users and analysts for optimal outcomes with respect to both detection performance and end-user needs. Our framework is applicable to other types of high-frequency time-series data and anomaly detection applications.
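
    As a rough illustration of the rule-based checks mentioned in the abstract (impossible values and missing observations), a minimal Python sketch might look like the following. The function names, the sensor range, the spike threshold and the sample readings are all assumptions made for illustration. The neighbour-difference spike check is a deliberately simple stand-in and is not the regression-based (ARIMA) method the abstract describes.

```python
from typing import List, Optional

Reading = Optional[float]  # None stands for a missing observation (assumption)


def flag_missing(values: List[Reading]) -> List[bool]:
    """Flag time steps where the sensor reported no value."""
    return [v is None for v in values]


def flag_impossible(values: List[Reading], lower: float, upper: float) -> List[bool]:
    """Flag readings outside the physically possible range [lower, upper]."""
    return [v is not None and not (lower <= v <= upper) for v in values]


def flag_spikes(values: List[Reading], threshold: float) -> List[bool]:
    """Flag isolated spikes: a point that departs from BOTH neighbours,
    in the same direction, by more than `threshold`.
    This is a simple stand-in, not an ARIMA-based detector."""
    flags = [False] * len(values)
    for i in range(1, len(values) - 1):
        prev, cur, nxt = values[i - 1], values[i], values[i + 1]
        if prev is None or cur is None or nxt is None:
            continue  # cannot judge a spike next to a gap
        d_prev, d_next = cur - prev, cur - nxt
        if abs(d_prev) > threshold and abs(d_next) > threshold and d_prev * d_next > 0:
            flags[i] = True
    return flags


# Hypothetical turbidity readings; the range and threshold are illustrative only.
turbidity = [2.1, 2.3, None, 2.2, 50.0, 2.4, -1.0, 2.5]
missing = flag_missing(turbidity)
impossible = flag_impossible(turbidity, lower=0.0, upper=1000.0)
spikes = flag_spikes(turbidity, threshold=10.0)
```

    Keeping each rule in its own function makes it straightforward to combine the flags with those from other detectors, in line with the abstract's recommendation to use a combination of methods.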

    Dependency relations as source context in phrase-based SMT

    The Phrase-Based Statistical Machine Translation (PB-SMT) model has recently begun to include source context modeling, under the assumption that the proper lexical choice of an ambiguous word can be determined from the context in which it appears. Various types of lexical and syntactic features such as words, parts-of-speech, and supertags have been explored as effective source context in SMT. In this paper, we show that position-independent syntactic dependency relations of the head of a source phrase can be modeled as useful source context to improve target phrase selection and thereby improve overall performance of PB-SMT. On a Dutch-English translation task, by combining dependency relations and syntactic contextual features (part-of-speech), we achieved a 1.0 BLEU (Papineni et al., 2002) point improvement (3.1% relative) over the baseline.

    Heterogeneous structure in mixed-species corvid flocks in flight

    Flocks of birds in flight represent a striking example of collective behaviour. Models of self-organization suggest that repeated interactions among individuals following simple rules can generate the complex patterns and coordinated movements exhibited by flocks. However, such models often assume that individuals are identical and interchangeable, and fail to account for individual differences and social relationships among group members. Here, we show that heterogeneity resulting from species differences and social structure can affect flock spatial dynamics. Using high-resolution photographs of mixed flocks of jackdaws, Corvus monedula, and rooks, Corvus frugilegus, we show that birds preferentially associated with conspecifics and that, like high-ranking members of single-species groups, the larger and more socially dominant rooks positioned themselves near the leading edge of flocks. Neighbouring birds showed closer directional alignment if they were of the same species, and neighbouring jackdaws in particular flew very close to one another. Moreover, birds of both species often flew especially close to a single same-species neighbour, probably reflecting the monogamous pair bonds that characterize these corvid social systems. Together, our findings demonstrate that the characteristics of individuals and their social systems are likely to result in preferential associations that critically influence flock structure.

    Recommender System Based on Semantic Similarity

    In electronic commerce, helping users find the products they want requires a system that classifies products according to each user's interests and needs and then recommends them. For the same reason, recommendation systems are designed to help users find information on large websites. They are developed to offer products to customers in an automated fashion so that customers can shop conveniently. Developing such systems is important because purchasing a product often involves a large number of factors, which makes it difficult for the customer to make the best decision. Finding relationships among users and among products is an important issue in these systems. One such relationship is similarity. Similarity measures among users and products are used in the pure methods to calculate the degree of similarity. In this paper, semantic similarity is used to find a set of k nearest neighbours to the target user or target item. In the experiments, the proposed recommendation system, which incorporates semantic similarity, achieved high accuracy on a private building company dataset in comparison with state-of-the-art recommender systems. DOI: http://dx.doi.org/10.11591/ijece.v3i6.393
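
    The neighbour-selection step described in the abstract can be sketched in Python as below. The abstract does not specify the semantic similarity measure, so Jaccard overlap between hand-written attribute sets is used purely as a placeholder. The item names, attributes and function names are hypothetical and are not taken from the paper's dataset.

```python
from typing import Callable, Dict, List, Set


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Placeholder similarity: shared attributes divided by all attributes."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def k_nearest(
    target: str,
    items: Dict[str, Set[str]],
    k: int,
    sim: Callable[[Set[str], Set[str]], float] = jaccard,
) -> List[str]:
    """Return the k items most similar to `target` (ties broken by name)."""
    scored = [
        (sim(items[target], attrs), name)
        for name, attrs in items.items()
        if name != target
    ]
    # Highest similarity first; alphabetical order keeps ties deterministic.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [name for _, name in scored[:k]]


# Hypothetical catalogue of building products with descriptive attributes.
catalogue = {
    "brick": {"masonry", "wall", "clay"},
    "block": {"masonry", "wall", "concrete"},
    "tile": {"clay", "roof"},
    "pipe": {"plumbing", "pvc"},
}
neighbours = k_nearest("brick", catalogue, k=2)
```

    Passing the similarity function as a parameter means a semantic measure (for example, one based on an ontology) could be substituted without changing the neighbour search itself.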

    Weakly supervised segment annotation via expectation kernel density estimation

    Since the labelling for the positive images/videos is ambiguous in weakly supervised segment annotation, negative mining based methods that only use the intra-class information emerge. In these methods, negative instances are utilized to penalize unknown instances to rank their likelihood of being an object, which can be considered as a form of voting based on similarity. However, these methods 1) ignore the information contained in positive bags, and 2) only rank the likelihood but cannot generate an explicit decision function. In this paper, we propose a voting scheme involving not only the definite negative instances but also the ambiguous positive instances to make use of the extra useful information in the weakly labelled positive bags. In the scheme, each instance votes for its label with a magnitude arising from the similarity, and the ambiguous positive instances are assigned soft labels that are iteratively updated during the voting. It overcomes the limitations of voting using only the negative bags. We also propose an expectation kernel density estimation (eKDE) algorithm to gain further insight into the voting mechanism. Experimental results demonstrate the superiority of our scheme over the baselines. Comment: 9 pages, 2 figures
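
    To make the idea of similarity-weighted voting with iteratively updated soft labels more concrete, here is a heavily simplified Python sketch. It is not the eKDE algorithm from the paper. The one-dimensional features, the Gaussian kernel, the fixed label of 0 for negatives, the initial soft label of 1 for ambiguous instances and the update rule are all assumptions chosen for illustration.

```python
import math
from typing import List


def gaussian_kernel(a: float, b: float, bandwidth: float) -> float:
    """Similarity between two 1-D feature values."""
    return math.exp(-0.5 * ((a - b) / bandwidth) ** 2)


def vote_soft_labels(
    negatives: List[float],
    ambiguous: List[float],
    bandwidth: float = 1.0,
    iterations: int = 5,
) -> List[float]:
    """Return a soft label in [0, 1] for each ambiguous instance.

    Negatives keep a fixed label of 0. Each ambiguous instance starts at 1
    and is repeatedly replaced by the similarity-weighted average of the
    labels of all other instances (a hypothetical update rule)."""
    soft = [1.0] * len(ambiguous)
    for _ in range(iterations):
        updated = []
        for i, x in enumerate(ambiguous):
            weighted_votes = 0.0
            total_weight = 0.0
            for n in negatives:
                # A negative votes for label 0, so it only adds weight.
                total_weight += gaussian_kernel(x, n, bandwidth)
            for j, z in enumerate(ambiguous):
                if j == i:
                    continue  # an instance does not vote for itself
                w = gaussian_kernel(x, z, bandwidth)
                weighted_votes += w * soft[j]
                total_weight += w
            updated.append(weighted_votes / total_weight if total_weight > 0 else soft[i])
        soft = updated
    return soft


# Hypothetical data: one ambiguous instance lies among the negatives,
# while two lie far from them and close to each other.
scores = vote_soft_labels(negatives=[0.0, 0.2, 0.1], ambiguous=[0.15, 5.0, 5.2])
```

    In this toy setting the instance near the negatives is pushed towards 0, while the two instances that resemble each other but not the negatives keep labels close to 1.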