225 research outputs found

    Anomaly and event detection for unsupervised athlete performance data

    Many projects today collect data automatically to provide input for data mining algorithms. A problem with freshly generated datasets is their unsupervised nature, which makes it difficult to fit predictive algorithms without substantial manual effort. One of the first steps in dataset preparation and mining is anomaly detection, where clear anomalies and outliers, as well as events or changes in the pattern of the data, are identified as a precursor to subsequent data mining steps. In the research presented here, we provide a multi-step anomaly detection process that uses different combinations of algorithms to identify outliers and events as accurately as possible.
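The abstract does not name the algorithms it combines, so the following is only a minimal sketch of what a two-step process of this kind could look like. It assumes a robust median/MAD score for point outliers and a simple before/after window mean comparison for events; the function names, thresholds, and window size are all hypothetical and are not taken from the paper.

```python
from statistics import mean, median


def point_outliers(values, threshold=3.5):
    """Step 1 (assumed technique): flag isolated outliers with a robust
    z-score built from the median and the median absolute deviation."""
    med = median(values)
    mad = median([abs(v - med) for v in values])
    if mad == 0:
        return set()  # no spread to measure against, so flag nothing
    return {i for i, v in enumerate(values)
            if 0.6745 * abs(v - med) / mad > threshold}


def level_shifts(values, skip, window=3, min_jump=5.0):
    """Step 2 (assumed technique): report events where the mean of the next
    `window` points differs from the mean of the previous `window` points.
    Indices in `skip` (the step 1 outliers) are ignored."""
    clean = [(i, v) for i, v in enumerate(values) if i not in skip]
    jumps = []  # (original index, absolute change in window mean)
    for k in range(window, len(clean) - window + 1):
        before = mean([v for _, v in clean[k - window:k]])
        after = mean([v for _, v in clean[k:k + window]])
        jumps.append((clean[k][0], abs(after - before)))
    events = []
    for j, (idx, jump) in enumerate(jumps):
        left = jumps[j - 1][1] if j > 0 else float("-inf")
        right = jumps[j + 1][1] if j + 1 < len(jumps) else float("-inf")
        # keep only the local peak so each shift is reported once
        if jump >= min_jump and jump >= left and jump > right:
            events.append(idx)
    return events


series = [10, 11, 10, 50, 10, 11, 10, 20, 21, 20, 21, 20]
outliers = point_outliers(series)
print(outliers)                        # {3}
print(level_shifts(series, outliers))  # [7]
```

Removing the outliers before looking for events keeps the spike at index 3 from being mistaken for a change in level, which is one reason to order the steps this way.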

    Data Analytics for Uncovering Fraudulent Behaviour in Elite Sports

    Sports officials around the world face societal challenges arising from fraudulent practices by unscrupulous athletes. Recently, sample swapping has been raised as a potential practice in which some athletes exchange their doped sample for a clean one to evade a positive test. Current detection methods for such cases include laboratory testing such as DNA analysis. However, these methods are costly and time-consuming, and exceed the budgetary limits of anti-doping organisations. Therefore, there is a need to explore alternative methods to improve decision-making. We present a data-analytical methodology that supports anti-doping decision-makers in the task of athlete disambiguation. Our proposed model helps identify the swapped sample and outperforms the current state-of-the-art method and several baseline models. The evaluation on real-world sample swapping cases shows promising results that help advance research on the application of data analytics to anti-doping analysis.

    Robust Video-Based Process Anomaly Detection

    Thesis (Master's) -- Seoul National University Graduate School: College of Engineering, Department of Mechanical Engineering, August 2021. 박종우. Industrial video anomaly detection is an important problem in industrial inspection, with features that distinguish it from video anomaly detection in other application domains such as surveillance. No public datasets pertinent to the problem have been developed, and accordingly, robust models suited to industrial video anomaly detection have yet to be developed. In this thesis, the key differences that distinguish the industrial video anomaly detection problem from its generic counterparts are examined: the relatively small amount of video data available, the lack of diversity among frames within the video clips, and the absence of labels that indicate anomalies. We then propose a robust framework for industrial video inspection that addresses these specific challenges. One novel aspect of our framework is a model that masks regions in frames that are irrelevant to the inspection task. We show that our framework outperforms existing methods when validated on a novel database that replicates video clips of real-world automated tasks.
Among the many problems in the rather broad field of process inspection, industrial video anomaly detection is of great importance, yet it has not received attention commensurate with that importance. There is no public dataset for studying this problem, and no prior research has been conducted on methods specialized for industrial video anomaly detection built on such a dataset. This thesis analyzes and identifies the characteristics that distinguish the industrial video anomaly detection problem from the general video anomaly detection problem. Unlike general video anomaly detection, industrial video anomaly detection has only a limited amount of data available, and because the labels needed for training do not exist, it is impossible to develop models that rely on them. For these reasons, when existing models are applied to industrial video anomaly detection, false alarms caused by the appearance and movement of elements unrelated to the motion under inspection occur far too often. Based on this analysis, we devised an industrial video anomaly detection method capable of robust video anomaly detection. Separately from the anomaly detection model, this method uses a model that masks elements in the video that are irrelevant to motion detection. To verify the effectiveness of the proposed method, we built a database of recorded robot motions that exhibit characteristics similar to real process footage, and used it to measure model performance. By releasing the robust video anomaly detection method and the database presented in this study through this thesis, we expect to help stimulate more diverse research in this field.
Contents: 1 Introduction; 1.1 Related Works; 1.2 Contributions of Our Work; 1.3 Organization; 2 Preliminaries; 2.1 Weakly Supervised Learning; 2.1.1 Supervised Learning and Unsupervised Learning; 2.1.2 Demystification on Weakly Supervised Learning; 2.2 Class Activation Maps; 2.2.1 Overview on Visualizing Activations; 2.2.2 Overview on CAM; 2.2.3 Overview on Grad-CAM; 2.2.4 Overview on Eigen-CAM; 2.3 Dynamic Time Warping; 2.4 Label Smoothing; 2.4.1 Review on Cross Entropy Function; 2.4.2 Summary on Label Smoothing; 3 Robust Framework for Industrial Video Anomaly Detection; 3.1 Components of the Framework; 3.1.1 Anomaly Detection Model; 3.1.2 Background Masking Model; 3.1.3 Fusing Results from the Components of the Framework; 3.2 Details of the Weakly Supervised Learning Method; 3.2.1 Partition Order Prediction Task; 3.2.2 Partition Order Labels; 3.2.3 Conditioning the Labels; 4 Experiments; 4.1 Database for Industrial Video Anomaly Detection; 4.2 Ideal Background Mask; 4.2.1 Acquiring Ideal Masks; 4.2.2 Enhancing Robustness in VAD Using Masks; 4.3 Masking Using the Proposed Method vs Using an Ideal Mask; 4.4 Performance Enhancement Using the Proposed Method; 4.5 Ablation Study; 4.5.1 Number of Layers for Eigen-CAM; 4.5.2 Threshold on Attention Maps; 4.5.3 Temporal Smoothing Window Size; 5 Conclusion; A Appendix; A.1 Experimental Results for All Tasks in the Database; Bibliography; Abstract in Korean

    Enriching the fan experience in a smart stadium using internet of things technologies

    Rapid urbanization has brought about an influx of people to cities, tipping the scale between urban and rural living. Population predictions estimate that 64% of the global population will reside in cities by 2050. To meet growing resource needs, improve management, reduce complexity, and eliminate unnecessary costs while enhancing citizens' quality of life, cities are increasingly exploring open innovation frameworks and smart city initiatives that target priority areas including transportation, sustainability, and security. The size and heterogeneity of urban centers impede the progress of technological innovations for smart cities. We propose a Smart Stadium as a living laboratory that balances size and heterogeneity, so that smart city solutions and Internet of Things (IoT) technologies may be deployed and tested within an environment small enough to trial in practice but large and diverse enough to evaluate scalability and efficacy. The Smart Stadium for Smart Living initiative brings together multiple institutions and partners, including Arizona State University (ASU), Dublin City University (DCU), Intel Corporation, and the Gaelic Athletic Association (GAA), to turn ASU's Sun Devil Stadium and Ireland's Croke Park Stadium into twinned smart stadia for investigating IoT and smart city technologies and applications.

    REPRESENTATION LEARNING FOR ACTION RECOGNITION

    The objective of this research work is to develop discriminative representations for human actions. The motivation stems from the many issues encountered while capturing actions in videos, such as intra-action variations (due to actors, viewpoints, and duration), inter-action similarity, background motion, and occlusion of actors. Hence, obtaining a representation that can address all the variations in the same action while maintaining discrimination from other actions is a challenging task. In the literature, actions have been represented using either low-level or high-level features. Low-level features describe the motion and appearance in small spatio-temporal volumes extracted from a video. Because of the limited space-time volume used for extracting low-level features, they cannot account for viewpoint and actor variations or variable-length actions. On the other hand, high-level features handle variations in actors, viewpoints, and duration, but the resulting representation is often high-dimensional, which introduces the curse of dimensionality. In this thesis, we propose new representations for describing actions by combining the advantages of both low-level and high-level features. Specifically, we investigate various linear and non-linear decomposition techniques to extract meaningful attributes in both high-level and low-level features. In the first approach, the sparsity of high-level feature descriptors is leveraged to build action-specific dictionaries. Each dictionary retains only the discriminative information for a particular action and hence reduces inter-action similarity. Then, a sparsity-based classification method is proposed to classify the low-rank representation of clips obtained using these dictionaries. We show that this representation based on dictionary learning improves the classification performance across actions.
Also, a few of the actions consist of rapid body deformations that hinder the extraction of local features from body movements. Hence, we propose to use a dictionary trained on convolutional neural network (CNN) features of the human body in various poses to reliably distinguish actors from the background. In particular, we demonstrate the efficacy of sparse representation in identifying the human body under rapid and substantial deformation. In the first two approaches, sparsity-based representation is developed to improve discriminability using class-specific dictionaries that utilize action labels. However, developing an unsupervised representation of actions is more beneficial, as it can be used both to recognize similar actions and to localize actions. We propose to exploit inter-action similarity to train a universal attribute model (UAM) in order to learn action attributes (common and distinct) implicitly across all the actions. Using maximum a posteriori (MAP) adaptation, a high-dimensional super action-vector (SAV) is extracted for each clip. As this SAV contains redundant attributes of all other actions, we use factor analysis to extract a novel low-dimensional action-vector representation for each clip. Action-vectors are shown to suppress background motion and highlight actions of interest in both trimmed and untrimmed clips, which contributes to action recognition without the help of any classifiers. We observe in our experiments that action-vectors cannot effectively discriminate between actions that are visually similar to each other. Hence, we subject action-vectors to supervised linear embedding using linear discriminant analysis (LDA) and probabilistic LDA (PLDA) to enforce discrimination. In particular, we show that leveraging complementary information across action-vectors using different local features followed by discriminative embedding provides the best classification performance.
Further, we explore non-linear embedding of action-vectors using Siamese networks, especially for fine-grained action recognition. A visualization of the hidden layer output in Siamese networks shows its ability to effectively separate visually similar actions. This leads to better classification performance than linear embedding on fine-grained action recognition. All of the above approaches are evaluated on large unconstrained datasets with hundreds of examples per action. However, actions in surveillance videos, such as snatch thefts, are difficult to model because of the wide variety of scenarios in which they occur and the very few labeled examples available. Hence, we propose to use the universal attribute model (UAM) trained on large action datasets to represent such actions. Specifically, we show that certain actions in the large datasets are similar to snatch thefts, which helps in extracting a representation for snatch thefts using the attributes from the UAM. This representation is shown to be effective in distinguishing snatch thefts from regular actions with high accuracy. In summary, this thesis proposes both supervised and unsupervised approaches for representing actions that provide better discrimination than existing representations. The first approach presents a dictionary-learning-based sparse representation for effective discrimination of actions. We also propose a dictionary-based sparse representation of the human body in order to recognize actions with rapid body deformations. In the next approach, a low-dimensional representation called the action-vector is presented for unsupervised action recognition. Further, linear and non-linear embeddings of action-vectors are proposed for addressing inter-action similarity and fine-grained action recognition, respectively. Finally, we propose a representation for locating snatch thefts among thousands of regular interactions in surveillance videos.

    Anomaly detection in agri warehouse construction

    As in many sectors, strategists and decision makers in the agricultural sector need to predict key measures such as product and feed pricing in order to maintain their position and, in some cases, to survive in their industry. Building predictive algorithms for Agri Analytics has proven very difficult because of the wide range of parameters and the often unpredictable nature of some of these variables. Improving the predictive capability of Agri planners requires access to up-to-date external information in addition to the analyses provided by their own in-house databases. This motivates the need for an Agri Data Warehouse together with appropriate cleaning and transformation processes. Moreover, with rich and wide-ranging sources of Agri data now available online, there is a strong motivation to process as much current online information as possible. In this work, we introduce the Agri Data Warehouse built for the DATAS project, which not only harvests data from a large number of online sources but also adopts an anomaly detection and labelling process to assist transformation and loading into the warehouse.

    Dutkat: A Privacy-Preserving System for Automatic Catch Documentation and Illegal Activity Detection in the Fishing Industry

    The United Nations' Sustainable Development Goal 14 aims to conserve and sustainably use the oceans and their resources for the benefit of people and the planet. This includes protecting marine ecosystems, preventing pollution and overfishing, and increasing scientific understanding of the oceans. Achieving this goal will help ensure the health and well-being of marine life and of the millions of people who rely on the oceans for their livelihoods. To ensure sustainable fishing practices, it is important to have a system in place for automatic catch documentation. This thesis presents our research on the design and development of Dutkat, a privacy-preserving, edge-based system for catch documentation and detection of illegal activities in the fishing industry. Using machine learning techniques, Dutkat can analyse large amounts of data and identify patterns that may indicate illegal activities such as overfishing or illegal discard of catch. Additionally, the system can assist in catch documentation by automating the identification and counting of fish species, thus reducing potential human error and increasing efficiency. Specifically, our research has consisted of developing various components of the Dutkat system, evaluating them through experimentation, exploring existing data, and organizing machine learning competitions. We have also implemented the system from a compliance-by-design perspective so that it complies with data protection laws and regulations such as GDPR. Our goal with Dutkat is to promote sustainable fishing practices, in line with Sustainable Development Goal 14, while protecting the privacy and rights of fishing crews.

    Newfoundland Orange

    Business analytics in sport talent acquisition: methods, experiences, and open research opportunities

    Recruitment of talented young players is a critical activity for most professional teams in sports such as football, soccer, basketball, baseball, and cycling. In the past, the most promising players were selected by relying on experts' opinions alone, without systematic data support. Nowadays, the availability of large amounts of data and powerful analytical tools has raised interest in making informed decisions based on data analysis and data-driven methods. Hence, most professional clubs are bringing in data scientists to support managers with data-intensive methods and techniques that can identify the best candidates and predict their future evolution. This paper reviews existing work on the use of data analytics, artificial intelligence, and machine learning methods in talent acquisition. A numerical case study, based on real-life data, is also included to illustrate some of the potential applications of business analytics in sport talent acquisition. In addition, research trends, challenges, and open lines are identified and discussed.

    Towards outlier detection for high-dimensional data streams using projected outlier analysis strategy

    Outlier detection is an important research problem in data mining that aims to discover useful abnormal and irregular patterns hidden in large data sets. Most existing outlier detection methods deal only with static data of relatively low dimensionality. Recently, outlier detection for high-dimensional stream data has emerged as a new research problem. A key observation that motivates this research is that outliers in high-dimensional data are projected outliers, i.e., they are embedded in lower-dimensional subspaces. Detecting projected outliers from high-dimensional stream data is a very challenging task for several reasons. First, detecting projected outliers is difficult even for high-dimensional static data. The exhaustive search for the outlying subspaces where projected outliers are embedded is an NP problem. Second, algorithms for handling data streams are constrained to take only one pass over the streaming data, under conditions of space limitation and time criticality. Existing methods for outlier detection are found to be ineffective for detecting projected outliers in high-dimensional data streams. In this thesis, we present a new technique, called the Stream Project Outlier deTector (SPOT), which attempts to detect projected outliers in high-dimensional data streams. SPOT employs an innovative window-based time model to capture dynamic statistics from stream data, and a novel data structure containing a set of top sparse subspaces to detect projected outliers effectively. SPOT also employs a multi-objective genetic algorithm as an effective search method for finding the outlying subspaces where most projected outliers are embedded. The experimental results demonstrate that SPOT is efficient and effective in detecting projected outliers in high-dimensional data streams.
The main contribution of this thesis is that it provides a backbone for tackling the challenging problem of outlier detection in high-dimensional data streams. SPOT can facilitate the discovery of useful abnormal patterns and can potentially be applied to a variety of high-demand applications, such as sensor network data monitoring and online transaction protection.
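To make the idea of a projected outlier concrete, here is a minimal one-pass sketch. It is not SPOT: in place of the genetic search and the set of top sparse subspaces described in the abstract, it simply enumerates every 1- and 2-dimensional subspace, and it uses a plain fixed-size sliding window. The class name, the grid of equal-width cells, the assumption that values lie in [0, 1), and all parameters are hypothetical.

```python
from collections import Counter, deque
from itertools import combinations


class ProjectedOutlierSketch:
    """Flags points that fall into sparsely populated grid cells of some
    low-dimensional subspace, using counts over a sliding window."""

    def __init__(self, n_dims, window=100, bins=4, min_count=2, max_dim=2):
        self.window = window        # number of recent points remembered
        self.bins = bins            # equal-width cells per dimension
        self.min_count = min_count  # cells with fewer points count as sparse
        self.recent = deque()       # cell lists of the points in the window
        self.counts = Counter()     # (subspace, cell) -> points in window
        # Exhaustive enumeration; only feasible for few dimensions.
        self.subspaces = [s for d in range(1, max_dim + 1)
                          for s in combinations(range(n_dims), d)]

    def _cells(self, point):
        # Assumes each coordinate lies in [0, 1); values outside are clamped.
        grid = [max(0, min(int(x * self.bins), self.bins - 1)) for x in point]
        return [(s, tuple(grid[i] for i in s)) for s in self.subspaces]

    def process(self, point):
        """Consume one point; return the subspaces in which it is outlying.
        Nothing is reported until the window has filled once."""
        cells = self._cells(point)
        outlying = []
        if len(self.recent) >= self.window:
            outlying = [s for s, c in cells
                        if self.counts[(s, c)] < self.min_count]
        self.recent.append(cells)
        self.counts.update(cells)
        if len(self.recent) > self.window:
            self.counts.subtract(self.recent.popleft())  # expire oldest point
        return outlying
```

With two clusters near (0.1, 0.1, 0.1) and (0.9, 0.9, 0.1) in the window, a point at (0.1, 0.9, 0.1) looks ordinary in every single dimension but lands in an empty cell of the subspace formed by dimensions 0 and 1, so only that subspace is returned. The number of subspaces grows combinatorially with dimensionality, which is the cost a guided search over subspaces is meant to avoid.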