557 research outputs found
Continuous Action Recognition Based on Sequence Alignment
Continuous action recognition is more challenging than isolated recognition
because classification and segmentation must be simultaneously carried out. We
build on the well known dynamic time warping (DTW) framework and devise a novel
visual alignment technique, namely dynamic frame warping (DFW), which performs
isolated recognition based on per-frame representation of videos, and on
aligning a test sequence with a model sequence. Moreover, we propose two
extensions which enable to perform recognition concomitant with segmentation,
namely one-pass DFW and two-pass DFW. These two methods have their roots in the
domain of continuous recognition of speech and, to the best of our knowledge,
their extension to continuous visual action recognition has been overlooked. We
test and illustrate the proposed techniques with a recently released dataset
(RAVEL) and with two public-domain datasets widely used in action recognition
(Hollywood-1 and Hollywood-2). We also compare the performances of the proposed
isolated and continuous recognition algorithms with several recently published
methods
Optimised meta-clustering approach for clustering Time Series Matrices
The prognostics (health state) of multiple components represented as time series data stored in vectors and matrices were processed and clustered more effectively and efficiently using the newly devised ‘Meta-Clustering’ approach. These time series data gathered from large applications and systems in diverse fields such as communication, medicine, data mining, audio, visual applications, and sensors. The reason time series data was used as the domain of this research is that meaningful information could be extracted regarding the characteristics of systems and components found in large applications. Also when it came to clustering, only time series data would allow us to group these data according to their life cycle, i.e. from the time which they were healthy until the time which they start to develop faults and ultimately fail. Therefore by proposing a technique that can better process extracted time series data would significantly cut down on space and time consumption which are both crucial factors in data mining. This approach will, as a result, improve the current state of the art pattern recognition algorithms such as K-NM as the clusters will be identified faster while consuming less space. The project also has application implications in the sense that by calculating the distance between the similar components faster while also consuming less space means that the prognostics of multiple components clustered can be realised and understood more efficiently. This was achieved by using the Meta-Clustering approach to process and cluster the time series data by first extracting and storing the time series data as a two-dimensional matrix. Then implementing an enhance K-NM clustering algorithm based on the notion of Meta-Clustering and using the Euclidean distance tool to measure the similarity between the different set of failure patterns in space. This approach would initially classify and organise each component within its own refined individual cluster. This would provide the most relevant set of failure patterns that show the highest level of similarity and would also get rid of any unnecessary data that adds no value towards better understating the failure/health state of the component. Then during the second stage, once these clusters were effectively obtained, the following inner clusters initially formed are thereby grouped into one general cluster that now represents the prognostics of all the processed components. The approach was tested on multivariate time series data extracted from IGBT components within Matlab and the results achieved from this experiment showed that the optimised Meta-Clustering approach proposed does indeed consume less time and space to cluster the prognostics of IGBT components as compared to existing data mining techniques
An efficient implementation of lattice-ladder multilayer perceptrons in field programmable gate arrays
The implementation efficiency of electronic systems is a combination of conflicting requirements, as increasing volumes of computations, accelerating the exchange of data, at the same time increasing energy consumption forcing the researchers not only to optimize the algorithm, but also to quickly implement in a specialized hardware. Therefore in this work, the problem of efficient and straightforward implementation of operating in a real-time electronic intelligent systems on field-programmable gate array (FPGA) is tackled. The object of research is specialized FPGA intellectual property (IP) cores that operate in a real-time. In the thesis the following main aspects of the research object are investigated: implementation criteria and techniques.
The aim of the thesis is to optimize the FPGA implementation process of selected class dynamic artificial neural networks. In order to solve stated problem and reach the goal following main tasks of the thesis are formulated: rationalize the selection of a class of Lattice-Ladder Multi-Layer Perceptron (LLMLP) and its electronic intelligent system test-bed – a speaker dependent Lithuanian speech recognizer, to be created and investigated; develop dedicated technique for implementation of LLMLP class on FPGA that is based on specialized efficiency criteria for a circuitry synthesis; develop and experimentally affirm the efficiency of optimized FPGA IP cores used in
Lithuanian speech recognizer.
The dissertation contains: introduction, four chapters and general conclusions. The first chapter reveals the fundamental knowledge on computer-aideddesign, artificial neural networks and speech recognition implementation on FPGA. In the second chapter the efficiency criteria and technique of LLMLP IP cores implementation are proposed in order to make multi-objective optimization of throughput, LLMLP complexity and resource utilization. The data flow graphs are applied for optimization of LLMLP computations. The optimized neuron processing element is proposed. The IP cores for features extraction and comparison are developed for Lithuanian speech recognizer and analyzed in third chapter. The fourth chapter is devoted for experimental verification of developed numerous LLMLP IP cores. The experiments of isolated word recognition accuracy and speed for different speakers, signal to noise ratios, features extraction and accelerated comparison methods were performed.
The main results of the thesis were published in 12 scientific publications: eight of them were printed in peer-reviewed scientific journals, four of them in a Thomson Reuters Web of Science database, four articles – in conference proceedings. The results were presented in 17 scientific conferences
Speaker independent isolated word recognition
The work presented in this thesis concerns the recognition of
isolated words using a pattern matching approach. In such a system,
an unknown speech utterance, which is to be identified, is
transformed into a pattern of characteristic features. These
features are then compared with a set of pre-stored reference
patterns that were generated from the vocabulary words. The unknown
word is identified as that vocabulary word for which the reference
pattern gives the best match.
One of the major difficul ties in the pattern comparison process is
that speech patterns, obtained from the same word, exhibit non-linear
temporal fluctuations and thus a high degree of redundancy. The
initial part of this thesis considers various dynamic time warping
techniques used for normalizing the temporal differences between
speech patterns. Redundancy removal methods are also considered, and
their effect on the recognition accuracy is assessed.
Although the use of dynamic time warping algorithms provide
considerable improvement in the accuracy of isolated word recognition
schemes, the performance is ultimately limited by their poor ability
to discriminate between acoustically similar words. Methods for
enhancing the identification rate among acoustically similar words,
by using common pattern features for similar sounding regions, are
investigated.
Pattern matching based, speaker independent systems, can only operate
with a high recognition rate, by using multiple reference patterns
for each of the words included in the vocabulary. These patterns are
obtained from the utterances of a group of speakers. The use of
multiple reference patterns, not only leads to a large increase in
the memory requirements of the recognizer, but also an increase in
the computational load. A recognition system is proposed in this
thesis, which overcomes these difficulties by (i) employing vector
quantization techniques to reduce the storage of reference patterns,
and (ii) eliminating the need for dynamic time warping which reduces
the computational complexity of the system.
Finally, a method of identifying the acoustic structure of an
utterance in terms of voiced, unvoiced, and silence segments by using
fuzzy set theory is proposed. The acoustic structure is then
employed to enhance the recognition accuracy of a conventional
isolated word recognizer
Knowledge Extraction in Video Through the Interaction Analysis of Activities
Video is a massive amount of data that contains complex interactions between moving objects. The extraction of knowledge from this type of information creates a demand for video analytics systems that uncover statistical relationships between activities and learn the correspondence between content and labels. However, those are open research problems that have high complexity when multiple actors simultaneously perform activities, videos contain noise, and streaming scenarios are considered. The techniques introduced in this dissertation provide a basis for analyzing video. The primary contributions of this research consist of providing new algorithms for the efficient search of activities in video, scene understanding based on interactions between activities, and the predicting of labels for new scenes
Clustering of Time Series Data: Measures, Methods, and Applications
Clustering is an essential branch of data mining and statistical analysis that could help us explore the distribution of data and extract knowledge. With the broad accumulation and application of time series data, the study of its clustering is a natural extension of existing unsupervised learning heuristics. We discuss the components which configure the clustering of time series data, specifically, the similarity measure, the clustering heuristic, the evaluation of cluster quality, and the applications of said heuristics. Being the groundwork for the task of data analysis, we propose a scalable and efficient time series similarity measure: segmented-Dynamic Time Warping. For time series clustering, we formulate the Distance Density Clustering heuristic, a deterministic clustering algorithm that adopts concepts from both density and distance separation. In addition, we explored the characteristics and discussed the limitations of existing cluster evaluation methods. Finally, all components lead to the goal of real-world applications
- …