6 research outputs found

    An Outlier Detection Algorithm Based on Cross-Correlation Analysis for Time Series Dataset

    Outlier detection is an essential problem in a variety of application areas. Many detection methods perform poorly on high-dimensional time series data sets containing both isolated and assembled outliers. In this paper, we propose an Outlier Detection method based on Cross-correlation Analysis (ODCA). ODCA consists of three key parts: data preprocessing, outlier analysis, and outlier ranking. First, we investigate a linear interpolation method to convert assembled outliers into isolated ones. Second, a detection mechanism based on cross-correlation analysis is proposed to translate the high-dimensional data sets into a 1-D cross-correlation function, from which the isolated outliers are determined. Finally, a multilevel Otsu's method is adopted to select the rank thresholds adaptively and output the abnormal samples at different levels. To illustrate the effectiveness of the ODCA algorithm, four experiments are performed using several high-dimensional time series data sets, including two small-scale sets and two large-scale sets. Furthermore, we compare the proposed algorithm with detection methods based on wavelet analysis, bilateral filtering, particle swarm optimization, auto-regression, and extreme learning machines. In addition, we discuss the robustness of the ODCA algorithm. The statistical results show that the ODCA algorithm outperforms existing mainstream methods in both effectiveness and time complexity.
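    As an illustration of the core idea of collapsing a cross-correlated multivariate series into a 1-D correlation signal in which isolated outliers appear as dips, the following Python sketch computes a sliding-window mean pairwise correlation. It is a minimal sketch under assumed conventions (window size, lag-0 correlation, synthetic data), not the paper's ODCA implementation, which additionally preprocesses assembled outliers and ranks samples with a multilevel Otsu threshold.

    import numpy as np

    def windowed_cross_correlation(X, window=20):
        """Collapse a multivariate series X (T x D) into a 1-D series: for each
        sliding window, the mean absolute pairwise Pearson correlation between
        dimensions. Windows containing an isolated outlier show a visible dip.
        (Illustrative sketch only, not the paper's exact ODCA procedure.)"""
        T, D = X.shape
        half = window // 2
        out = np.full(T, np.nan)
        for t in range(half, T - half):
            W = X[t - half:t + half]             # window x D slice around time t
            C = np.corrcoef(W, rowvar=False)     # D x D correlation matrix
            iu = np.triu_indices(D, k=1)
            out[t] = np.abs(C[iu]).mean()        # mean |correlation| over dimension pairs
        return out

    # Toy data: five dimensions driven by one latent signal, hence strongly
    # cross-correlated; an isolated spike in one dimension breaks that correlation.
    rng = np.random.default_rng(0)
    latent = np.sin(np.linspace(0, 20, 400))
    X = np.outer(latent, rng.uniform(0.5, 2.0, size=5)) + 0.005 * rng.normal(size=(400, 5))
    X[200, 3] += 5.0
    cc = windowed_cross_correlation(X)
    print(int(np.nanargmin(cc)))                 # the minimum falls inside the windows around index 200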

    Advanced analytical methods for fraud detection: a systematic literature review

    The developments of the digital era demand new ways of producing goods and rendering services. This fast-paced evolution within companies requires a new approach from auditors, who must keep up with the constant transformation. Given the dynamic dimensions of data, it is important to seize the opportunity to add value to companies, and the need to apply more robust methods to detect fraud is evident. This thesis investigates the use of advanced analytical methods for fraud detection through an analysis of the existing literature on the topic. Both a systematic literature review and a bibliometric approach are applied to the most appropriate database to measure scientific production and current trends. This study intends to contribute to the academic research conducted so far by centralizing the existing information on this topic.

    Novelty detectors and specialized classifiers in passive sonar systems

    In submarines, sonar operators have the task of identifying and classifying passive sonar contacts so that possible threats are detected. Automating this process is extremely relevant, since it eases the work of professionals in this area by requiring less physical and mental effort during surveillance. This study investigates the efficiency of specialized models in building such a system, aiming to derive a mechanism that effectively detects unknown ships and correctly identifies the labels of those already known. Three levels of specialization were considered: non-specialized, specialized by class, and specialized by ship, using the following techniques to construct the system: Principal Component Analysis (PCA), Kernel Principal Component Analysis (KPCA), One-Class Support Vector Machines (OCSVM), Gaussian Mixture Models (GMM), k-Nearest Neighbors (kNN), sparse k-Nearest Neighbors (s-kNN), and Local Outlier Factor (LOF). Experiments conducted with real data acquired on an acoustic test range showed the best performance for the ship-specialized models, which reached a novelty detection rate of 83.4% combined with an average recognition rate of 90.5% for known classes. Regarding specifically the classification of known classes, 98.7% of samples are correctly labeled.
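    To make the specialized novelty-detection setup concrete, the sketch below trains one one-class model per known class and flags a contact as a novelty when every model rejects it. It is only an illustration using scikit-learn's OneClassSVM on synthetic features; the toy data, class labels, and nu value are assumptions, and the thesis also evaluates PCA, KPCA, GMM, kNN, s-kNN, and LOF variants at per-class and per-ship levels.

    import numpy as np
    from sklearn.pipeline import make_pipeline
    from sklearn.preprocessing import StandardScaler
    from sklearn.svm import OneClassSVM

    def fit_per_class_models(X_train, y_train, nu=0.05):
        """Fit one one-class model per known class (the 'specialized' setting)."""
        models = {}
        for label in np.unique(y_train):
            model = make_pipeline(StandardScaler(),
                                  OneClassSVM(kernel="rbf", nu=nu, gamma="scale"))
            model.fit(X_train[y_train == label])
            models[label] = model
        return models

    def classify_or_flag(models, X_test):
        """Assign each sample to the best-scoring known class, or flag it as a
        novelty when every class model rejects it (decision_function < 0)."""
        labels = np.array(sorted(models))
        scores = np.column_stack([models[k].decision_function(X_test) for k in labels])
        predicted = labels[scores.argmax(axis=1)]
        is_novel = scores.max(axis=1) < 0
        return predicted, is_novel

    # Toy usage with synthetic features standing in for sonar spectra.
    rng = np.random.default_rng(1)
    X_known = np.vstack([rng.normal(loc=c, size=(100, 8)) for c in (0.0, 3.0, 6.0)])
    y_known = np.repeat([0, 1, 2], 100)
    models = fit_per_class_models(X_known, y_known)
    X_new = rng.normal(loc=12.0, size=(5, 8))     # an unseen "ship" far from all known classes
    pred, novel = classify_or_flag(models, X_new)
    print(novel.all())                            # expected: True (all samples flagged as novelties)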

    Distributed Contextual Anomaly Detection from Big Event Streams

    The age of big digital data has emerged, and the volume of data generated every millisecond through Internet of Things (IoT) and Internet of Everything (IoE) objects is rapidly increasing. Most of today's data are generated as streams by applications including sensor networks, bioinformatics, smart airports, smart highway traffic, smart homes, e-commerce, and social media. In this context, processing and mining such high-volume data streams has become a priority research concern and a challenging task. On the one hand, processing high volumes of streaming data with a low-latency response is critical in most real-time applications, before important information is missed or disregarded. On the other hand, detecting events in data streams is a new research challenge, since existing traditional anomaly detection methods mainly focus on (a) limited data sizes, (b) centralised detection with limited computing resources, and (c) point or collective anomaly types rather than the contextual behaviour of the data. Thus, detecting contextual events in high-volume sequential data streams is one of the research concerns addressed in this thesis. As IoT data streams scale up to high volumes, existing data-processing structures and anomaly detection methods become impractical because of the space, time, and complexity of the existing processing models and learning algorithms. In this thesis, a novel distributed anomaly detection method and algorithm is proposed to detect contextual behaviours from sequences of bounded streams. The proposed solution first captures event streams and partitions them over several windows to control the high rate of the streams, and then applies a parallel and distributed algorithm to detect contextual anomalous events. The experimental results are evaluated in terms of the algorithm's performance, low-latency processing response, and the accuracy of detecting contextual anomalous behaviour in the event streams. Finally, to address the scalability of contextual event detection, appropriate computational metrics are proposed to measure and evaluate the processing latency of the distributed method. The achieved results show that distributed detection is effective for learning from high volumes of streams in real time.
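    The windowing idea behind the proposed solution can be illustrated with a small single-machine sketch: the stream is partitioned into bounded windows, each window is handed to a worker, and a reading is flagged as contextually anomalous when it deviates from the statistics of its own context rather than from the global distribution. The hour-of-day context, the 3-sigma rule, and the use of Python multiprocessing as a stand-in for distributed workers are assumptions, not the thesis's actual system.

    import numpy as np
    from multiprocessing import Pool

    def detect_in_window(window):
        """window: array of (hour, value) rows; return indices (within the window)
        of readings more than 3 sigma away from the mean of their own hour."""
        hours = window[:, 0].astype(int)
        values = window[:, 1]
        anomalies = []
        for h in np.unique(hours):
            mask = hours == h
            mu, sigma = values[mask].mean(), values[mask].std() + 1e-9
            flagged = np.where(mask & (np.abs(values - mu) > 3 * sigma))[0]
            anomalies.extend(flagged.tolist())
        return sorted(anomalies)

    def detect_stream(stream, window_size=500, workers=4):
        """Partition the bounded stream into windows and detect in parallel."""
        windows = [stream[i:i + window_size] for i in range(0, len(stream), window_size)]
        with Pool(workers) as pool:
            results = pool.map(detect_in_window, windows)
        # Translate per-window indices back to positions in the full stream.
        return [w * window_size + i for w, idxs in enumerate(results) for i in idxs]

    if __name__ == "__main__":
        rng = np.random.default_rng(2)
        hours = rng.integers(0, 24, size=5000)
        hours[1234] = 3                                            # fix the context of the injected anomaly
        values = 10.0 * hours + rng.normal(scale=1.0, size=5000)   # the value depends on the hour (context)
        values[1234] += 40.0   # ~70: a normal reading for hour 7, but anomalous for hour 3
        stream = np.column_stack([hours, values])
        print(detect_stream(stream))                               # expected to include 1234 (plus any chance 3-sigma hits)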