1,095 research outputs found
DRSP : Dimension Reduction For Similarity Matching And Pruning Of Time Series Data Streams
Similarity matching and join of time series data streams have gained considerable
relevance in today's world of large streaming data. This process finds
wide-scale application in the areas of location tracking, sensor networks,
object positioning and monitoring, to name a few. However, as the size of the
data stream increases, the cost involved to retain all the data in order to aid
the process of similarity matching also increases. We develop a novel framework
to address the following objectives. Firstly, dimension reduction is
performed in the preprocessing stage, where a technique called Multi-level
Segment Means (MSM) segments large stream data and reduces it to a compact
representation that retains all the crucial information. This reduces
the space complexity associated with the storage of large time-series data
streams. Secondly, it incorporates an effective similarity matching technique to
analyze whether the new data objects are symmetric to the existing data stream.
Finally, a pruning technique filters out the pseudo data object pairs
and joins only the relevant pairs. The computational cost for MSM is O(l*ni) and
the cost for pruning is O(DRF*wsize*d), where DRF is the Dimension Reduction
Factor. We have performed exhaustive experimental trials to show that the
proposed framework is both efficient and competent in comparison with earlier
works. Comment: 20 pages, 8 figures, 6 tables
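As a rough illustration of the dimension-reduction step, the sketch below computes segment means at several levels for one window of a stream. It is not the authors' implementation: the abstract does not define the levels, so the assumption here is that level j splits the window into 2**(j-1) equal segments. The function names `segment_means` and `multi_level_segment_means` are hypothetical.

```python
def segment_means(series, num_segments):
    """Return the mean of each of num_segments equal-length, contiguous segments."""
    if num_segments <= 0 or not series or len(series) % num_segments != 0:
        raise ValueError("series length must be a positive multiple of num_segments")
    seg_len = len(series) // num_segments
    return [
        sum(series[i:i + seg_len]) / seg_len
        for i in range(0, len(series), seg_len)
    ]


def multi_level_segment_means(series, levels):
    """Assumed layout: level j (1-based) holds 2**(j-1) segment means."""
    return [segment_means(series, 2 ** j) for j in range(levels)]


# One window of eight samples reduced to three levels of means.
window = [2.0, 4.0, 6.0, 8.0, 1.0, 3.0, 5.0, 7.0]
msm = multi_level_segment_means(window, 3)
for level, means in enumerate(msm, start=1):
    print(f"level {level}: {means}")
```

Coarse levels are cheap to compare, so a matching procedure could check them first and descend to finer levels only for candidate pairs that survive; whether the paper prunes in exactly this way is not stated in the abstract.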
Attribute Relationship Analysis in Outlier Mining and Stream Processing
The main theme of this thesis is to unite two important fields of data analysis: outlier mining and attribute relationship analysis. In this work we establish the connection between these two fields. We present techniques that exploit this connection to improve outlier detection in high-dimensional data. In the second part of the thesis we extend our work to the emerging topic of data streams.
Immersogeometric analysis of compressible flows
This dissertation presents the development of a novel immersogeometric method for the simulation of turbulent compressible flows around complex geometries.
The immersogeometric analysis is first extended to a tetrahedral finite cell method in order to handle complex geometries flexibly and accurately. The developed method immerses complex objects into non-boundary-fitted meshes of tetrahedral finite elements, which can be easily refined in regions of interest. Adaptively refined quadrature rules faithfully capture the flow domain geometry in the discrete problem without modifying the non-boundary-fitted finite element mesh. Particular emphasis is placed on studying the importance of the geometry resolution in intersected elements. In line with the immersogeometric concept, the results show that the faithful representation of the geometry in intersected elements is critical for accurate flow analysis.
To simulate compressible flows in an accurate and robust way, a novel stabilized finite element formulation is developed. A new weak imposition of essential boundary conditions and sliding-interface formulations are also proposed in the context of moving-domain compressible flows. The new formulation is successfully tested on a set of examples spanning a wide range of Reynolds and Mach numbers, showing its superior robustness. Experimental validation of the new formulation is also carried out with good success.
The developments of the tetrahedral finite cell method and the stabilized finite element formulation are combined to further develop the immersogeometric method for compressible flows. The non-symmetric Nitsche method is used in the weak-boundary-condition operator to offer good performance in the context of non-boundary-fitted discretization. The developed immersogeometric method is tested against several benchmark problems to demonstrate that its accuracy is comparable to that of its boundary-fitted counterpart. Finally, the aerodynamic analysis of a UH-60 helicopter is carried out using the developed method to illustrate its potential to support the design of real engineering systems through high-fidelity aerodynamic analysis.
Structural Generative Descriptions for Temporal Data
In data mining problems the representation or description of data plays a fundamental role, since it defines the set of essential properties for the extraction and characterisation of patterns. However, for the case of temporal data, such as time series and data streams, one outstanding issue when developing mining algorithms is finding an appropriate data description or representation.
In this thesis two novel domain-independent representation frameworks for temporal data suitable for off-line and online mining tasks are formulated.
First, a domain-independent temporal data representation framework based on a novel data description strategy which combines structural and statistical pattern recognition approaches is developed. The key idea here is to move the structural pattern recognition problem to the probability domain. This framework is composed of three general tasks: a) decomposing input temporal patterns into subpatterns in time or any other transformed domain (for instance, wavelet domain); b) mapping these subpatterns into the probability domain to find attributes of elemental probability subpatterns called primitives; and c) mining input temporal patterns according to the attributes of their corresponding probability domain subpatterns. This framework is referred to as Structural Generative Descriptions (SGDs).
Two off-line and two online algorithmic instantiations of the proposed SGDs framework are then formulated: i) For the off-line case, the first instantiation is based on the use of Discrete Wavelet Transform (DWT) and Wavelet Density Estimators (WDE), while the second algorithm includes DWT and Finite Gaussian Mixtures. ii) For the online case, the first instantiation relies on an online implementation of DWT and a recursive version of WDE (RWDE), whereas the second algorithm is based on a multi-resolution exponentially weighted moving average filter and RWDE. The empirical evaluation of proposed SGDs-based algorithms is performed in the context of time series classification, for off-line algorithms, and in the context of change detection and clustering, for online algorithms. For this purpose, synthetic and publicly available real-world data are used.
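The multi-resolution exponentially weighted moving average filter mentioned above can be pictured with a minimal sketch. The thesis abstract does not specify the filter design, so this assumes the simplest construction: a bank of independent EWMA filters, one per smoothing factor, all updated once per incoming sample. The class name `MultiResolutionEWMA` and its parameters are hypothetical.

```python
class MultiResolutionEWMA:
    """Bank of EWMA filters; each smoothing factor gives one resolution (assumed design)."""

    def __init__(self, alphas):
        if not alphas or not all(0.0 < a <= 1.0 for a in alphas):
            raise ValueError("each alpha must lie in (0, 1]")
        self.alphas = list(alphas)
        self.state = None  # filled in on the first sample

    def update(self, x):
        """Consume one sample and return the smoothed value at every resolution."""
        if self.state is None:
            # Initialise every filter with the first observation.
            self.state = [float(x)] * len(self.alphas)
        else:
            self.state = [
                a * x + (1.0 - a) * s
                for a, s in zip(self.alphas, self.state)
            ]
        return list(self.state)


# Larger alpha tracks the stream closely; smaller alpha gives a coarser view.
bank = MultiResolutionEWMA([0.5, 0.25])
for sample in [0.0, 4.0, 4.0]:
    smoothed = bank.update(sample)
print(smoothed)
```

Each update costs constant time and memory per resolution, which is what makes this kind of filter suitable for online processing; the smoothed outputs would then be passed to a density estimator such as RWDE.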
Additionally, a novel framework for multidimensional data stream evolution diagnosis incorporating RWDE into the context of Velocity Density Estimation (VDE) is formulated. Changes in streaming data and changes in their correlation structure are characterised by means of local and global evolution coefficients as well as by means of recursive correlation coefficients. The proposed VDE framework is evaluated using temperature data from the UK and air pollution data from Hong Kong. Open Access
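To give a sense of what a recursively updated correlation coefficient between two streams might look like, here is a minimal sketch. The abstract does not define the thesis's recursive correlation coefficients, so this assumes a standard Pearson correlation computed from exponentially discounted running sums. The class name `RecursiveCorrelation` and the `forgetting` parameter are hypothetical.

```python
import math


class RecursiveCorrelation:
    """Pearson correlation of two streams from exponentially discounted sums (assumed form)."""

    def __init__(self, forgetting=0.9):
        if not 0.0 < forgetting <= 1.0:
            raise ValueError("forgetting factor must lie in (0, 1]")
        self.lam = forgetting
        self.w = self.sx = self.sy = 0.0
        self.sxx = self.syy = self.sxy = 0.0

    def update(self, x, y):
        """Consume one pair of samples and return the current correlation estimate."""
        lam = self.lam
        # Discount the old sums, then add the new observation.
        self.w = lam * self.w + 1.0
        self.sx = lam * self.sx + x
        self.sy = lam * self.sy + y
        self.sxx = lam * self.sxx + x * x
        self.syy = lam * self.syy + y * y
        self.sxy = lam * self.sxy + x * y
        mean_x, mean_y = self.sx / self.w, self.sy / self.w
        cov = self.sxy / self.w - mean_x * mean_y
        var_x = max(self.sxx / self.w - mean_x ** 2, 0.0)
        var_y = max(self.syy / self.w - mean_y ** 2, 0.0)
        denom = math.sqrt(var_x * var_y)
        # Undefined while either stream has no variance; report 0.0 in that case.
        return cov / denom if denom > 0.0 else 0.0


# Two perfectly linearly related streams: the estimate approaches 1.
tracker = RecursiveCorrelation(forgetting=0.9)
for x in [1.0, 2.0, 3.0, 4.0]:
    r = tracker.update(x, 2.0 * x + 1.0)
print(round(r, 6))
```

Only six running sums are stored, so the estimate can follow changes in correlation structure without retaining past samples; a forgetting factor closer to 1 gives a longer effective memory.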