4,872 research outputs found

    Periodic Pattern Mining a Algorithms and Applications

    Get PDF
    Owing to a large number of applications periodic pattern mining has been extensively studied for over a decade Periodic pattern is a pattern that repeats itself with a specific period in a give sequence Periodic patterns can be mined from datasets like biological sequences continuous and discrete time series data spatiotemporal data and social networks Periodic patterns are classified based on different criteria Periodic patterns are categorized as frequent periodic patterns and statistically significant patterns based on the frequency of occurrence Frequent periodic patterns are in turn classified as perfect and imperfect periodic patterns full and partial periodic patterns synchronous and asynchronous periodic patterns dense periodic patterns approximate periodic patterns This paper presents a survey of the state of art research on periodic pattern mining algorithms and their application areas A discussion of merits and demerits of these algorithms was given The paper also presents a brief overview of algorithms that can be applied for specific types of datasets like spatiotemporal data and social network

    Pattern Recognition in Intensive Care Online Monitoring

    Get PDF
    Clinical information systems can record numerous variables describing the patient’s state at high sampling frequencies. Intelligent alarm systems and suitable bedsidedecision support are needed to cope with this flood of information. A basic task here is the fast and correct detection of important patterns of change such as level shifts and trends in the data. We present approaches for automated pattern detection in online-monitoring data. Several methods based on curve fitting and statistical time series analysis are described. Median filtering can be used as a preliminary step to reduce the noise and to remove clinically irrelevant short term fluctuations. Our special focus is the potential of these methods for online-monitoring in intensive care. The strengths and weaknesses of the methods are discussed in this special context. The best approach may well be a suitable combination of the methods for achieving reliable results. Further investigations are needed to further improve the methods and their performance should be compared extensively in simulation studies and applications to real data

    Anomaly Detection in Categorical Datasets with Artificial Contrasts

    Get PDF
    abstract: Anomaly is a deviation from the normal behavior of the system and anomaly detection techniques try to identify unusual instances based on deviation from the normal data. In this work, I propose a machine-learning algorithm, referred to as Artificial Contrasts, for anomaly detection in categorical data in which neither the dimension, the specific attributes involved, nor the form of the pattern is known a priori. I use RandomForest (RF) technique as an effective learner for artificial contrast. RF is a powerful algorithm that can handle relations of attributes in high dimensional data and detect anomalies while providing probability estimates for risk decisions. I apply the model to two simulated data sets and one real data set. The model was able to detect anomalies with a very high accuracy. Finally, by comparing the proposed model with other models in the literature, I demonstrate superior performance of the proposed model.Dissertation/ThesisMasters Thesis Industrial Engineering 201

    Automatic Network Fingerprinting through Single-Node Motifs

    Get PDF
    Complex networks have been characterised by their specific connectivity patterns (network motifs), but their building blocks can also be identified and described by node-motifs---a combination of local network features. One technique to identify single node-motifs has been presented by Costa et al. (L. D. F. Costa, F. A. Rodrigues, C. C. Hilgetag, and M. Kaiser, Europhys. Lett., 87, 1, 2009). Here, we first suggest improvements to the method including how its parameters can be determined automatically. Such automatic routines make high-throughput studies of many networks feasible. Second, the new routines are validated in different network-series. Third, we provide an example of how the method can be used to analyse network time-series. In conclusion, we provide a robust method for systematically discovering and classifying characteristic nodes of a network. In contrast to classical motif analysis, our approach can identify individual components (here: nodes) that are specific to a network. Such special nodes, as hubs before, might be found to play critical roles in real-world networks.Comment: 16 pages (4 figures) plus supporting information 8 pages (5 figures

    Attribute Relationship Analysis in Outlier Mining and Stream Processing

    Get PDF
    The main theme of this thesis is to unite two important fields of data analysis, outlier mining and attribute relationship analysis. In this work we establish the connection between these two fields. We present techniques which exploit this connection, allowing to improve outlier detection in high dimensional data. In the second part of the thesis we extend our work to the emerging topic of data streams

    User profiles’ image clustering for digital investigations

    Get PDF
    Sharing images on Social Network (SN) platforms is one of the most widespread behaviors which may cause privacy-intrusive and illegal content to be widely distributed. Clustering the images shared through SN platforms according to the acquisition cameras embedded in smartphones is regarded as a significant task in forensic investigations of cybercrimes. The Sensor Pattern Noise (SPN) caused by camera sensor imperfections due to the manufacturing process has been proved to be an effective and robust camera fingerprint that can be used for several tasks, such as digital evidence analysis, smartphone fingerprinting and user profile linking as well. Clustering the images uploaded by users on their profiles is a way of fingerprinting the camera sources and it is considered a challenging task since users may upload different types of images, i.e., the images taken by users’ smartphones (taken images) and single images from different sources, cropped images, or generic images from the Web (shared images). The shared images make a perturbation in the clustering task, as they do not usually present sufficient characteristics of SPN of their related sources. Moreover, they are not directly referable to the user’s device so they have to be detected and removed from the clustering process. In this paper, we propose a user profiles’ image clustering method without prior knowledge about the type and number of the camera sources. The hierarchical graph-based method clusters both types of images, taken images and shared images. The strengths of our method include overcoming large-scale image datasets, the presence of shared images that perturb the clustering process and the loss of image details caused by the process of content compression on SN platforms. The method is evaluated on the VISION dataset, which is a public benchmark including images from 35 smartphones. The dataset is perturbed by 3000 images, simulating the shared images from different sources except for users’ smartphones. Experimental results confirm the robustness of the proposed method against perturbed datasets and its effectiveness in the image clustering
    • …
    corecore