1,220 research outputs found

    A Review on Outlier/Anomaly Detection in Time Series Data

    Recent advances in technology have brought major breakthroughs in data collection, enabling a large amount of data to be gathered over time and thus generating time series. Mining this data has become an important task for researchers and practitioners in the past few years, including the detection of outliers or anomalies that may represent errors or events of interest. This review aims to provide a structured and comprehensive overview of the state of the art in outlier detection techniques for time series. To this end, a taxonomy is presented based on the main aspects that characterize an outlier detection technique. Grant identifiers: KK/2019-00095, IT1244-19, TIN2016-78365-R, PID2019-104966GB-I0

    A taxonomy framework for unsupervised outlier detection techniques for multi-type data sets

    The term "outlier" can generally be defined as an observation that is significantly different from the other values in a data set. Outliers may be instances of error or may indicate events. The task of outlier detection aims at identifying such outliers in order to improve the analysis of data and to discover interesting and useful knowledge about unusual events across numerous application domains. In this paper, we report on contemporary unsupervised outlier detection techniques for multiple types of data sets and provide a comprehensive taxonomy framework and two decision trees for selecting the most suitable technique based on the data set. Furthermore, we highlight the advantages, disadvantages and performance issues of each class of outlier detection techniques under this taxonomy framework.

    Characteristics of Positive Deviants in Western Chimpanzee Populations

    With continued expansion of anthropogenically modified landscapes, the proximity between humans and wildlife is continuing to increase, frequently resulting in species decline. Occasionally however, species are able to persist and there is an increased interest in understanding such positive outliers and underlying mechanisms. Eventually, such insights can inform the design of effective conservation interventions by mimicking aspects of the social-ecological conditions found in areas of species persistence. Recently, frameworks have been developed to study the heterogeneity of species persistence across populations with a focus on positive outliers. Applications are still rare, and to our knowledge this is one of the first studies using this approach for terrestrial species conservation. We applied the positive deviance concept to the western chimpanzee, which occurs in a variety of social-ecological landscapes. It is now categorized as Critically Endangered due to hunting and habitat loss and resulting excessive decline of most of its populations. Here we are interested in understanding why some of the populations did not decline. We compiled a dataset of 17,109 chimpanzee survey transects (10,929 km) across nine countries and linked them to a range of social and ecological variables. We found that chimpanzees seemed to persist within three social-ecological configurations: first, rainforest habitats with a low degree of human impact, second, steep areas, and third, areas with high prevalence of hunting taboos and low degree of human impact. The largest chimpanzee populations are nowadays found under the third social-ecological configuration, even though most of these areas are not officially protected. Most commonly chimpanzee conservation has been based on exclusion of threats by creation of protected areas and law enforcement. 
Our findings suggest, however, that this approach should be complemented by an additional focus on threat reduction, i.e., interventions that directly target individual human behavior that is most threatening to chimpanzees, which is hunting. Although changing human behavior is difficult, stakeholder co-designed behavioral change approaches developed in the social sciences have been used successfully to promote pro-environmental behavior. With only a fraction of chimpanzees and primates living inside protected areas, such new approaches might be a way forward to improve primate conservation

    Finding Anomalous Periodic Time Series: An Application to Catalogs of Periodic Variable Stars

    Catalogs of periodic variable stars contain large numbers of periodic light-curves (photometric time series data from the astrophysics domain). Separating anomalous objects from well-known classes is an important step towards the discovery of new classes of astronomical objects. Most anomaly detection methods for time series data assume either a single continuous time series or a set of time series whose periods are aligned. Light-curve data precludes the use of these methods as the periods of any given pair of light-curves may be out of sync. One may use an existing anomaly detection method if, prior to similarity calculation, one performs the costly act of aligning two light-curves, an operation that scales poorly to massive data sets. This paper presents PCAD, an unsupervised anomaly detection method for large sets of unsynchronized periodic time-series data, that outputs a ranked list of both global and local anomalies. It calculates its anomaly score for each light-curve in relation to a set of centroids produced by a modified k-means clustering algorithm. Our method is able to scale to large data sets through the use of sampling. We validate our method on both light-curve data and other time series data sets. We demonstrate its effectiveness at finding known anomalies, and discuss the effect of sample size and number of centroids on our results. We compare our method to naive solutions and existing time series anomaly detection methods for unphased data, and show that PCAD's reported anomalies are comparable to or better than all other methods. Finally, astrophysicists on our team have verified that PCAD finds true anomalies that might be indicative of novel astrophysical phenomena
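The abstract above describes scoring each light-curve against a set of centroids without first synchronizing the curves. The following is a minimal sketch of that general idea, not the authors' PCAD implementation: it assumes every curve is already folded and resampled to the same number of points, uses a brute-force search over circular shifts as the phase-invariant distance, and runs a plain k-means loop in which members are shifted to their centroid before averaging. All names (`shifted`, `best_alignment`, `anomaly_scores`) and parameters (`k`, `sample_size`, `iters`, `seed`) are hypothetical.

```python
import math
import random


def shifted(x, s):
    """Return a copy of list x circularly shifted by s positions."""
    s %= len(x)
    return x[-s:] + x[:-s]


def euclid(a, b):
    """Euclidean distance between two equal-length sequences."""
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))


def best_alignment(x, centroid):
    """Return (distance, shifted copy of x) for the circular shift of x closest to centroid."""
    candidates = (shifted(x, s) for s in range(len(x)))
    return min(((euclid(c, centroid), c) for c in candidates), key=lambda t: t[0])


def anomaly_scores(curves, k=2, sample_size=None, iters=10, seed=0):
    """Score every curve by its shift-invariant distance to the nearest centroid.

    Centroids are fitted on a random sample of the curves (assumption: k is
    no larger than the sample). Higher scores mean more anomalous curves.
    """
    rng = random.Random(seed)
    curves = [[float(v) for v in c] for c in curves]
    m = len(curves) if sample_size is None else min(sample_size, len(curves))
    sample = rng.sample(curves, m)
    centroids = [list(c) for c in rng.sample(sample, k)]
    for _ in range(iters):
        members = [[] for _ in range(k)]
        for x in sample:
            fits = [best_alignment(x, c) for c in centroids]
            j = min(range(k), key=lambda i: fits[i][0])
            members[j].append(fits[j][1])  # store the phase-aligned copy
        for j, group in enumerate(members):
            if group:  # keep the old centroid if a cluster went empty
                centroids[j] = [sum(col) / len(group) for col in zip(*group)]
    return [min(best_alignment(x, c)[0] for c in centroids) for x in curves]


# Usage: twenty out-of-phase copies of one sine period plus one noise curve.
base = [math.sin(2 * math.pi * i / 32) for i in range(32)]
noise_rng = random.Random(1)
odd_one = [noise_rng.gauss(0.0, 1.0) for _ in range(32)]
scores = anomaly_scores([shifted(base, s) for s in range(20)] + [odd_one], k=1)
ranking = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
```

The brute-force shift search costs quadratic time per comparison in the curve length; fitting centroids on a sample, as the abstract mentions, limits how many such comparisons the clustering step needs.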

    Towards a Hierarchical Approach for Outlier Detection in Industrial Production Settings

    In the context of Industry 4.0, the degree of cross-linking between machines, sensors, and production lines increases rapidly. However, this trend also offers the potential for the improvement of outlier scores, especially by combining outlier detection information between different production levels. The latter, in turn, offer various other useful aspects like different time series resolutions or context variables. When utilizing these aspects, valuable outlier information can be extracted, which can be then used for condition-based monitoring, alert management, or predictive maintenance. In this work, we compare different types of outlier detection methods and scores in the light of the aforementioned production levels with the goal to develop a model for outlier detection that incorporates these production levels. The proposed model, in turn, is basically inspired by a use case from the field of additive manufacturing, which is also known as industrial 3D-printing. Altogether, our model shall improve the detection of outliers by the use of a hierarchical structure that utilizes production levels in industrial scenarios.

    Informing the transition to evidence-based conservation planning for western chimpanzees

    Large-scale land-use change across the tropics has led to the decline of animal populations and their habitat. With large investments into mining, hydropower dams and industrial agriculture this trend is likely to continue. Consequently, there is a need for systematic land-use planning to set aside areas for protection and allocate scarce conservation funding effectively. Even though primates are relatively well studied, data-driven systematic planning is still rarely implemented. The overall aim of this dissertation was to investigate population parameters needed for evidence-based conservation planning for the critically endangered western chimpanzee (Pan troglodytes verus) in West Africa. To this end, I compiled density datasets covering the entire geographic range of this taxon from the IUCN SSC A.P.E.S. database and modeled chimpanzee densities as a function of 20 social-ecological variables. I found that western chimpanzees seemingly persist within three social-ecological configurations: rainforests with a low degree of anthropogenic threats, steep areas that are less likely to be developed and are harder to access by humans, and areas with a high prevalence of cultural taboos against hunting chimpanzees. The third configuration of reduced hunting pressure is not yet reflected in commonly implemented conservation interventions, suggesting a need for designing new approaches aimed at reducing the threat of hunting. Based on the modeled density distribution, I estimated that 52,811 (95% CI 17,577-96,564) western chimpanzees remain in West Africa, and identified areas of high conservation value to which conservation interventions should be targeted. These results can be used to inform the expansion of the protected area network in West Africa, to quantify the impact of planned industrial projects on western chimpanzees, and to guide the systematic allocation of conservation funding. 
In addition, this thesis highlights the unique position of taxon-specific databases of providing access to high-resolution data at the scale needed for conservation planning. Data-driven conservation planning has the potential to enable conservationists to respond more proactively to current and emerging threats, and ultimately improve conservation outcomes

    A Multiple Case Study Analysis of the Positive Deviance Approach in Community Health

    The positive deviance (PD) approach involves finding individuals who have solved a problem and spreading their unique solutions to others. While there have been calls for PD to become a standard tool in community health, there has been little research on the approach. This study investigated how PD is used in practice and the evidence of its effectiveness by analyzing case studies of 40 PD programs and 32 PD inquiries implemented in a range of high-, middle-, and low-income countries by both national and international organizations. Case studies were developed using data from publicly available documents. Qualitative within-case and cross-case analyses were used to identify common themes and trends using the theory of diffusion of innovations. Results show that the first large-scale applications of the PD approach were in child malnutrition in the 1990s. Since then the approach has been applied to other issues in individual behavior change (e.g., HIV/AIDS), organizational change (e.g., health services), and sociocultural change (e.g., female genital mutilation). Current PD approaches can be classified by the level of intervention, and the methods used to identify positive deviants, discover their behaviors, and spread the behaviors to others. Most programs do not fully involve the community at all stages. While there is substantial evidence for the effectiveness of the PD approach in child malnutrition, few high-quality outcome evaluations have been conducted in other areas. Implications for positive social change include providing data to encourage practitioners to use the PD approach as a standard tool for child malnutrition, where it has the potential to improve nutritional status and thus contribute to long-term outcomes in child health, education and social development.