1,095 research outputs found

    DRSP : Dimension Reduction For Similarity Matching And Pruning Of Time Series Data Streams

    Similarity matching and join of time series data streams have gained considerable relevance now that large volumes of streaming data are common. The process is widely applied in areas such as location tracking, sensor networks, and object positioning and monitoring. However, as the size of the data stream increases, so does the cost of retaining all the data needed for similarity matching. We develop a novel framework that addresses the following objectives. First, dimension reduction is performed in the preprocessing stage, where large stream data is segmented and reduced into a compact representation that retains all the crucial information, using a technique called Multi-level Segment Means (MSM). This reduces the space complexity associated with storing large time series data streams. Second, the framework incorporates an effective similarity matching technique to determine whether new data objects are symmetric to the existing data stream. Finally, a pruning technique filters out the pseudo data object pairs and joins only the relevant pairs. The computational cost of MSM is O(l*ni) and the cost of pruning is O(DRF*wsize*d), where DRF is the Dimension Reduction Factor. We have performed exhaustive experimental trials to show that the proposed framework is both efficient and competitive in comparison with earlier works. Comment: 20 pages, 8 figures, 6 tables
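    The abstract does not spell out how MSM is computed, so the following is only a minimal sketch of one common reading of multi-level segment means: at level j the series is split into 2^(j-1) equal segments and each segment is replaced by its mean. The function names and the requirement that the series length be divisible by the segment count are assumptions made for illustration, not details taken from the paper.

```python
from typing import List, Sequence


def segment_means(series: Sequence[float], num_segments: int) -> List[float]:
    """Split `series` into `num_segments` equal parts and return each part's mean.

    Assumption for this sketch: len(series) is divisible by num_segments.
    """
    n = len(series)
    if num_segments <= 0 or n == 0 or n % num_segments != 0:
        raise ValueError("series length must be a positive multiple of num_segments")
    seg_len = n // num_segments
    return [
        sum(series[i * seg_len:(i + 1) * seg_len]) / seg_len
        for i in range(num_segments)
    ]


def multi_level_segment_means(series: Sequence[float], levels: int) -> List[List[float]]:
    """Return one list of segment means per level; level j uses 2**(j-1) segments."""
    return [segment_means(series, 2 ** (j - 1)) for j in range(1, levels + 1)]


if __name__ == "__main__":
    window = [1, 2, 3, 4, 5, 6, 7, 8]
    for level, means in enumerate(multi_level_segment_means(window, 3), start=1):
        print(level, means)
```

    Coarse levels give a cheap first comparison between two windows; finer levels are only needed when the coarse means are close, which is the general idea behind pruning candidate pairs on a reduced representation.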

    Attribute Relationship Analysis in Outlier Mining and Stream Processing

    The main theme of this thesis is to unite two important fields of data analysis: outlier mining and attribute relationship analysis. In this work we establish the connection between these two fields. We present techniques that exploit this connection to improve outlier detection in high-dimensional data. In the second part of the thesis we extend our work to the emerging topic of data streams.

    Multivariate Correlation Discovery in Streaming Data

    Immersogeometric analysis of compressible flows

    This dissertation presents the development of a novel immersogeometric method for the simulation of turbulent compressible flows around complex geometries. Immersogeometric analysis is first extended to a tetrahedral finite cell method in order to handle complex geometries flexibly and accurately. The developed method immerses complex objects into non-boundary-fitted meshes of tetrahedral finite elements, which can be easily refined in regions of interest. Adaptively refined quadrature rules faithfully capture the flow domain geometry in the discrete problem without modifying the non-boundary-fitted finite element mesh. Particular emphasis is placed on studying the importance of the geometry resolution in intersected elements. In line with the immersogeometric concept, the results show that a faithful representation of the geometry in intersected elements is critical for accurate flow analysis. To simulate compressible flows accurately and robustly, a novel stabilized finite element formulation is developed. A new weak imposition of essential boundary conditions and sliding-interface formulations are also proposed in the context of moving-domain compressible flows. The new formulation is successfully tested on a set of examples spanning a wide range of Reynolds and Mach numbers, showing its superior robustness. Experimental validation of the new formulation is also carried out successfully. The tetrahedral finite cell method and the stabilized finite element formulations are then combined to further develop the immersogeometric method for compressible flows. A non-symmetric Nitsche method is used in the weak-boundary-condition operator to offer good performance in the context of non-boundary-fitted discretization. The developed immersogeometric method is tested against several benchmark problems to demonstrate that its accuracy is comparable to that of its boundary-fitted counterpart.
    Finally, the aerodynamic analysis of a UH-60 helicopter is carried out using the developed method to illustrate its potential to support the design of real engineering systems through high-fidelity aerodynamic analysis.

    Structural Generative Descriptions for Temporal Data

    In data mining problems the representation or description of data plays a fundamental role, since it defines the set of essential properties for the extraction and characterisation of patterns. However, for the case of temporal data, such as time series and data streams, one outstanding issue when developing mining algorithms is finding an appropriate data description or representation. In this thesis two novel domain-independent representation frameworks for temporal data suitable for off-line and online mining tasks are formulated. First, a domain-independent temporal data representation framework based on a novel data description strategy which combines structural and statistical pattern recognition approaches is developed. The key idea here is to move the structural pattern recognition problem to the probability domain. This framework is composed of three general tasks: a) decomposing input temporal patterns into subpatterns in time or any other transformed domain (for instance, wavelet domain); b) mapping these subpatterns into the probability domain to find attributes of elemental probability subpatterns called primitives; and c) mining input temporal patterns according to the attributes of their corresponding probability domain subpatterns. This framework is referred to as Structural Generative Descriptions (SGDs). Two off-line and two online algorithmic instantiations of the proposed SGDs framework are then formulated: i) For the off-line case, the first instantiation is based on the use of Discrete Wavelet Transform (DWT) and Wavelet Density Estimators (WDE), while the second algorithm includes DWT and Finite Gaussian Mixtures. ii) For the online case, the first instantiation relies on an online implementation of DWT and a recursive version of WDE (RWDE), whereas the second algorithm is based on a multi-resolution exponentially weighted moving average filter and RWDE. 
    The empirical evaluation of the proposed SGDs-based algorithms is performed in the context of time series classification, for off-line algorithms, and in the context of change detection and clustering, for online algorithms. For this purpose, synthetic and publicly available real-world data are used. Additionally, a novel framework for multidimensional data stream evolution diagnosis incorporating RWDE into the context of Velocity Density Estimation (VDE) is formulated. Changes in streaming data and changes in their correlation structure are characterised by means of local and global evolution coefficients as well as by means of recursive correlation coefficients. The proposed VDE framework is evaluated using temperature data from the UK and air pollution data from Hong Kong.
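    The abstract mentions a multi-resolution exponentially weighted moving average filter for the online case without giving its definition. As a rough illustration only, the sketch below runs several standard EWMA filters in parallel over one stream, each with its own smoothing factor, so that each output tracks the stream at a different time scale. The function names, the choice of smoothing factors, and the initialisation with the first sample are assumptions of this sketch and may differ from the filter used in the thesis.

```python
from typing import Iterable, List, Optional, Sequence


def ewma_update(previous: float, sample: float, alpha: float) -> float:
    """One recursive EWMA step: s_t = alpha * x_t + (1 - alpha) * s_{t-1}."""
    return alpha * sample + (1.0 - alpha) * previous


def multi_resolution_ewma(
    stream: Iterable[float], alphas: Sequence[float]
) -> List[List[float]]:
    """Run one EWMA per smoothing factor over the stream.

    Returns, for every sample, the list of filter states (one per alpha).
    Assumption for this sketch: each filter is initialised with the first sample.
    """
    if any(not (0.0 < a <= 1.0) for a in alphas):
        raise ValueError("each alpha must lie in (0, 1]")
    state: Optional[List[float]] = None
    history: List[List[float]] = []
    for sample in stream:
        if state is None:
            state = [float(sample)] * len(alphas)
        else:
            state = [ewma_update(s, sample, a) for s, a in zip(state, alphas)]
        history.append(list(state))
    return history


if __name__ == "__main__":
    # A large alpha reacts quickly (fine scale); a small alpha smooths heavily (coarse scale).
    for row in multi_resolution_ewma([0, 4, 4, 4], alphas=[0.5, 0.25]):
        print(row)
```

    Because each update needs only the previous state and the new sample, the memory cost per resolution is constant, which is what makes this kind of filter attractive for online mining.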