Multiple 2D self organising map network for surface reconstruction of 3D unstructured data
Surface reconstruction is a challenging task in reverse engineering because the reconstructed surface must closely represent the original object based on the data obtained. The data obtained are mostly unstructured, so there is not enough information and an incorrect surface may be obtained. Therefore, the data should be reorganised by finding the correct topology with minimum surface error. Previous studies showed that the Self Organising Map (SOM) model, the conventional surface approximation approach with Non-Uniform Rational B-Splines (NURBS) surfaces, and optimisation methods such as the Genetic Algorithm (GA), Differential Evolution (DE) and Particle Swarm Optimisation (PSO) are widely used to solve surface reconstruction. However, these models, approaches and optimisation methods still suffer from unstructured-data and accuracy problems. Therefore, the aims of this research are to propose a Cube SOM (CSOM) model with multiple 2D SOM networks for organising unstructured surface data, and to propose an optimised surface approximation approach for generating the NURBS surfaces. GA, DE and PSO are implemented to minimise the surface error by adjusting the NURBS control points. To test and validate the proposed model and approach, four primitive object datasets and one medical image dataset are used. To evaluate their performance, three measurements are used: Average Quantisation Error (AQE) and Number Of Vertices (NOV) for the CSOM model, and surface error for the optimised surface approximation approach. The AQE of the CSOM model improved by 64% and 66% compared to 2D and 3D SOM, respectively. The NOV of the CSOM model was reduced from 8000 to 2168 compared to 3D SOM. The surface error of the optimised surface approximation approach improved by 7% compared to the conventional approach.
The proposed CSOM model and optimised surface approximation approach successfully reconstructed the surfaces of all five datasets, with better performance on the three measurements used in the evaluation.
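The self-organising update at the core of SOM-based surface reconstruction can be sketched as follows; this is a generic illustration, not the thesis code, and the grid size, learning rate, decay schedule and sample data are arbitrary assumptions:

```python
import math
import random

def train_som(data, rows=4, cols=4, iters=500, lr0=0.5, radius0=2.0, seed=0):
    """Fit a 2D SOM grid to 3D points: the best-matching node and its
    neighbours are dragged toward each sampled point."""
    rng = random.Random(seed)
    # one 3D weight vector per grid node
    w = {(r, c): [rng.random() for _ in range(3)]
         for r in range(rows) for c in range(cols)}
    for t in range(iters):
        x = rng.choice(data)
        # best-matching unit: the node whose weights are closest to the sample
        bmu = min(w, key=lambda n: sum((wi - xi) ** 2
                                       for wi, xi in zip(w[n], x)))
        lr = lr0 * (1 - t / iters)                    # decaying learning rate
        radius = max(radius0 * (1 - t / iters), 0.5)  # shrinking neighbourhood
        for (r, c), wv in w.items():
            d2 = (r - bmu[0]) ** 2 + (c - bmu[1]) ** 2
            h = math.exp(-d2 / (2 * radius ** 2))     # Gaussian neighbourhood
            for i in range(3):
                wv[i] += lr * h * (x[i] - wv[i])
    return w

def quantisation_error(data, w):
    """Average distance from each point to its best-matching unit (AQE)."""
    return sum(min(math.dist(wv, x) for wv in w.values())
               for x in data) / len(data)
```

Training pulls each best-matching node, and a Gaussian-weighted neighbourhood around it, toward the sampled points; the AQE used in the evaluation is simply the mean distance from the data to their best-matching units.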
Acoustic data optimisation for seabed mapping with visual and computational data mining
Oceans cover 70% of Earth’s surface but little is known about their waters.
While echosounders, often used to explore our oceans, have developed at
a tremendous rate since World War II, the methods used to analyse and interpret the data
still remain the same. These methods are inefficient, time-consuming, and often
costly when dealing with the large datasets that modern echosounders produce. This PhD
project will examine the complexity of the de facto seabed mapping technique by
exploring and analysing acoustic data with a combination of data mining and visual
analytic methods.
First, we test for redundancy in multibeam echosounder (MBES) data
by using the component plane visualisation of a Self Organising Map (SOM). A total
of 16 visual groups were identified among the 132 statistical data descriptors. The
optimised MBES dataset had 35 attributes from 16 visual groups and represented a
73% reduction in data dimensionality. A combined Principal Component Analysis
(PCA) + k-means approach was used to cluster both datasets. The cluster results were
visually compared as well as internally validated using four different internal
validation methods.
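The PCA + k-means pipeline can be sketched in a few lines; this is a generic illustration on synthetic data, not the thesis code, and the component count and cluster number are placeholders:

```python
import numpy as np

def pca(X, n_components):
    """Project X onto its top principal components
    (eigenvectors of the covariance matrix)."""
    Xc = X - X.mean(axis=0)
    cov = np.cov(Xc, rowvar=False)
    vals, vecs = np.linalg.eigh(cov)                  # eigenvalues ascending
    top = vecs[:, np.argsort(vals)[::-1][:n_components]]
    return Xc @ top

def kmeans(X, k, iters=50, seed=0):
    """Plain Lloyd's algorithm: assign points to the nearest centroid,
    then recompute centroids, repeatedly."""
    rng = np.random.default_rng(seed)
    centroids = X[rng.choice(len(X), k, replace=False)]
    for _ in range(iters):
        dists = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=-1)
        labels = dists.argmin(axis=1)
        centroids = np.array([X[labels == j].mean(axis=0) for j in range(k)])
    return labels, centroids
```

Reducing the attributes with PCA before clustering, as above, is what makes the dimensionality reduction of the optimised MBES dataset directly usable by k-means.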
Next, we tested two novel approaches to singlebeam echosounder (SBES)
data processing and clustering: visual exploration for outlier detection and
direct clustering of time series echo returns. Visual exploration identified further
outliers that the automatic procedure was not able to find. The SBES data were then
clustered directly. The internal validation indices suggested the optimal number of
clusters to be three. This is consistent with the assumption that the SBES time series
represented the subsurface classes of the seabed.
Next, the SBES data were joined with the corresponding MBES data based on
identification of the closest locations between the MBES and SBES datasets. Two
algorithms, PCA + k-means and fuzzy c-means, were tested and the results visualised.
In visual comparison, the cluster boundaries appeared better defined than those from
the clustered MBES data alone. The results seem to indicate that adding SBES did
in fact improve the boundary definitions.
Next, the cluster results from the analysis chapters were validated against
ground truth data using a confusion matrix and kappa coefficients. For MBES, the
classes derived from the optimised data yielded better accuracy than those from the
original data. For SBES, direct clustering provided a relatively reliable
overview of the underlying classes in the survey area. The combined MBES + SBES
data provided by far the best accuracy for mapping, with almost a 10% increase in
overall accuracy compared to the original MBES data.
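Validation with a confusion matrix and Cohen's kappa can be sketched as follows; the class labels here are hypothetical stand-ins for the ground-truth seabed classes:

```python
from collections import Counter

def confusion_matrix(truth, pred, classes):
    """Rows = ground-truth class, columns = predicted class."""
    counts = Counter(zip(truth, pred))
    return [[counts[(t, p)] for p in classes] for t in classes]

def kappa(truth, pred):
    """Cohen's kappa: observed agreement corrected for chance agreement."""
    n = len(truth)
    po = sum(t == p for t, p in zip(truth, pred)) / n    # observed agreement
    ct, cp = Counter(truth), Counter(pred)
    pe = sum(ct[c] * cp[c] for c in ct) / n ** 2         # chance agreement
    return (po - pe) / (1 - pe)
```

Kappa is preferred over raw accuracy here because it discounts the agreement a classifier would achieve by chance given the class proportions.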
The results proved promising for optimising acoustic data and improving the
quality of seabed mapping. Furthermore, these approaches have the potential for
significant time and cost savings in the seabed mapping process. Finally, some
future directions are recommended for the findings of this research project, with
the consideration that they could contribute to further development of seabed
mapping at mapping agencies worldwide.
Dynamic data placement and discovery in wide-area networks
The workloads of online services and applications such as social networks, sensor data platforms and web search engines have become increasingly global and dynamic, posing new challenges for providing users with low latency access to data. To achieve this, these services typically leverage a multi-site wide-area networked infrastructure. Data access latency in such an infrastructure depends on the network paths between users and data, which are determined by the data placement and discovery strategies. Current strategies are static: they offer low latencies upon deployment but perform worse under a dynamic workload.
We propose dynamic data placement and discovery strategies for wide-area networked infrastructures, which adapt to the data access workload. We achieve this with data activity correlation (DAC), an application-agnostic approach for determining the correlations between data items based on access pattern similarities. By dynamically clustering data according to DAC, network traffic in clusters is kept local. We utilise DAC as a key component in reducing access latencies for two application scenarios, emphasising different aspects of the problem:
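Determining correlations from access-pattern similarity can be sketched as follows; this is a minimal, centralised illustration (Pearson correlation plus union-find grouping), whereas the proposed strategies are online and distributed, and the correlation threshold is an arbitrary assumption:

```python
from itertools import combinations

def pearson(x, y):
    """Pearson correlation of two (non-constant) access-count series."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x) ** 0.5
    vy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (vx * vy)

def dac_clusters(access, threshold=0.8):
    """Union data items whose access patterns correlate above the threshold."""
    parent = {k: k for k in access}
    def find(a):                       # union-find with path compression
        while parent[a] != a:
            parent[a] = parent[parent[a]]
            a = parent[a]
        return a
    for a, b in combinations(access, 2):
        if pearson(access[a], access[b]) >= threshold:
            parent[find(a)] = find(b)
    groups = {}
    for k in access:
        groups.setdefault(find(k), set()).add(k)
    return list(groups.values())
```

Items that are accessed together end up in the same cluster, so traffic between them can be kept local to one site.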
The first scenario assumes the fixed placement of data at sites, and thus focusses on data discovery. This is the case for a global sensor discovery platform, which aims to provide low latency discovery of sensor metadata. We present a self-organising hierarchical infrastructure consisting of multiple DAC clusters, maintained with an online and distributed split-and-merge algorithm. This reduces the number of sites visited, and thus latency, during discovery for a variety of workloads.
The second scenario focusses on data placement. This is the case for global online services that leverage a multi-data centre deployment to provide users with low latency access to data. We present a geo-dynamic partitioning middleware, which maintains DAC clusters with an online elastic partition algorithm. It supports the geo-aware placement of partitions across data centres according to the workload. This provides globally distributed users with low latency access to data for static and dynamic workloads.
Mining topological dependencies of recurrent congestion in road networks
The discovery of spatio-temporal dependencies within urban road networks that cause Recurrent Congestion (RC) patterns is crucial for numerous real-world applications, including urban planning and the scheduling of public transportation services. While most existing studies investigate temporal patterns of RC phenomena, the influence of the road network topology on RC is often overlooked. This article proposes the ST-DISCOVERY algorithm, a novel unsupervised spatio-temporal data mining algorithm that facilitates effective data-driven discovery of RC dependencies induced by the road network topology using real-world traffic data. We factor out regularly reoccurring traffic phenomena, such as rush hours, mainly induced by the time of day, by modelling and systematically exploiting temporal traffic load outliers. We present an algorithm that first constructs connected subgraphs of the road network based on the traffic speed outliers. Second, the algorithm identifies pairs of subgraphs that indicate spatio-temporal correlations in their traffic load behaviour to identify topological dependencies within the road network. Finally, we rank the identified subgraph pairs based on the dependency score determined by our algorithm. Our experimental results demonstrate that ST-DISCOVERY can effectively reveal topological dependencies in urban road networks.
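The three steps of the algorithm can be sketched on a toy road graph; the adjacency structure, outlier sets and load series below are made-up stand-ins, not the evaluation data:

```python
from itertools import combinations

def connected_subgraphs(adj, outlier_nodes):
    """Step 1: connected components of the road graph restricted to
    nodes flagged as traffic speed outliers."""
    seen, comps = set(), []
    for start in outlier_nodes:
        if start in seen:
            continue
        comp, stack = set(), [start]
        while stack:
            n = stack.pop()
            if n in comp:
                continue
            comp.add(n)
            stack.extend(m for m in adj.get(n, ()) if m in outlier_nodes)
        seen |= comp
        comps.append(frozenset(comp))
    return comps

def rank_pairs(comps, load):
    """Steps 2-3: score subgraph pairs by the correlation of their mean
    traffic load series, then rank by that dependency score."""
    T = len(next(iter(load.values())))
    def series(comp):
        return [sum(load[n][t] for n in comp) / len(comp) for t in range(T)]
    def corr(x, y):
        n = len(x)
        mx, my = sum(x) / n, sum(y) / n
        num = sum((a - mx) * (b - my) for a, b in zip(x, y))
        den = (sum((a - mx) ** 2 for a in x)
               * sum((b - my) ** 2 for b in y)) ** 0.5
        return num / den if den else 0.0
    scored = [(corr(series(a), series(b)), a, b)
              for a, b in combinations(comps, 2)]
    return sorted(scored, key=lambda s: -s[0])
```

A highly ranked pair indicates that congestion in one subgraph co-occurs with congestion in the other, i.e. a candidate topological dependency.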
Image Compression Techniques: A Survey in Lossless and Lossy algorithms
The bandwidth of communication networks has been increasing continuously as a result of technological advances. However, the introduction of new services and the expansion of existing ones have resulted in even higher demand for bandwidth. This explains the many efforts currently being invested in the area of data compression. The primary goal of these works is to develop techniques for coding information sources such as speech, image and video, to reduce the number of bits required to represent a source without significantly degrading its quality. With the large increase in the generation of digital image data, there has been a correspondingly large increase in research activity in the field of image compression. The goal is to represent an image in the fewest number of bits without losing the essential information content within. Images carry three main types of information: redundant, irrelevant, and useful. Redundant information is the deterministic part, which can be reproduced without loss from other information contained in the image. Irrelevant information is the part with enormous detail beyond the limit of perceptual significance (i.e., psychovisual redundancy). Useful information is the part that is neither redundant nor irrelevant. Humans usually observe decompressed images; therefore, image fidelity is subject to the capabilities and limitations of the Human Visual System. This paper provides a survey of various image compression techniques, their limitations and compression rates, and highlights current research in medical image compression.
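The removal of redundant (deterministic) information is the basis of lossless coding; run-length encoding is the simplest concrete example of it (a generic sketch, not one of the schemes surveyed above):

```python
def rle_encode(pixels):
    """Collapse runs of identical pixel values into [value, count] pairs."""
    out = []
    for p in pixels:
        if out and out[-1][0] == p:
            out[-1][1] += 1          # extend the current run
        else:
            out.append([p, 1])       # start a new run
    return out

def rle_decode(pairs):
    """Expand [value, count] pairs back into the original pixel sequence."""
    return [v for v, n in pairs for _ in range(n)]
```

Because the runs are fully recoverable, the decoded image is bit-identical to the original; lossy schemes go further by also discarding the irrelevant (psychovisually redundant) part.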
A distributed data extraction and visualisation service for wireless sensor networks
With the increase in applications of wireless sensor networks, data extraction and visualisation have become key issues in developing and operating these networks. Wireless sensor networks typically gather data at a discrete number of locations. By bestowing upon the network the ability to predict inter-node values, it is proposed that it will become possible to build applications that are unaware of the concrete reality of sparse data. The aim of this thesis is to develop a service for maximising information return from large scale wireless sensor networks. This aim is achieved through the development of a distributed information extraction and visualisation service called the mapping service. In the distributed mapping service, groups of network nodes cooperate to produce local maps which are cached and merged at a sink node, producing a map of the global network. Such a service greatly simplifies the production of higher-level, information-rich representations suitable for informing other network services and for delivering field information visualisations. The proposed distributed mapping service utilises a blend of inductive and deductive models to map sensed data in accordance with universal physical principles. It exploits the special characteristics of the application domain to render visualisations in a map format that precisely reflect the concrete reality. The service is suitable for visualising an arbitrary number of sense modalities, and can combine multiple independent types of sense data to overcome the limitations of generating visualisations from a single modality. Furthermore, the proposed mapping service responds to changes in environmental conditions that may affect visualisation performance by continuously updating the application domain model in a distributed manner.
Finally, a new distributed self-adaptation algorithm, the Virtual Congress Algorithm, which is based on the concept of a virtual congress, is proposed, with the goal of saving more power and generating more accurate data visualisations.
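Predicting inter-node values from sparse readings, as the mapping service proposes, can be sketched with inverse-distance weighting; this is one possible inductive model chosen for illustration, with made-up coordinates and readings, not the thesis' actual blend of models:

```python
import math

def idw(query, nodes, power=2):
    """Estimate the sensed field at `query` as a distance-weighted mean of
    node readings. `nodes` is a list of ((x, y), value) pairs."""
    num = den = 0.0
    for (x, y), v in nodes:
        d = math.hypot(query[0] - x, query[1] - y)
        if d == 0:
            return v                  # query sits exactly on a sensor node
        w = 1.0 / d ** power          # closer nodes get more weight
        num += w * v
        den += w
    return num / den
```

Filling in inter-node values this way is what lets map-consuming applications behave as if the field were densely sampled.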
Evaluation and analysis of hybrid intelligent pattern recognition techniques for speaker identification
This thesis was submitted for the degree of Doctor of Philosophy and awarded by Brunel University. The rapid momentum of technological progress in recent years has led to a tremendous rise in the use of biometric authentication systems. The objective of this research is to investigate the problem of identifying a speaker from their voice regardless of the content (i.e. text-independent), and to design efficient methods of combining face and voice to produce a robust authentication system.
A novel approach towards speaker identification is developed using wavelet analysis and multiple neural networks, including the Probabilistic Neural Network (PNN), General Regression Neural Network (GRNN) and Radial Basis Function Neural Network (RBF NN) with an AND voting scheme. This approach is tested on the GRID and VidTIMIT corpora, and comprehensive test results have been validated against state-of-the-art approaches. The system was found to be competitive: it improved the recognition rate by 15% compared to classical Mel-Frequency Cepstral Coefficients (MFCC), and reduced the recognition time by 40% compared to the Back Propagation Neural Network (BPNN), Gaussian Mixture Models (GMM) and Principal Component Analysis (PCA).
Another novel approach using vowel formant analysis is implemented using Linear Discriminant Analysis (LDA). Vowel formant based speaker identification is best suited to real-time implementation and requires only a few bytes of information to be stored for each speaker, making it both storage and time efficient. Tested on GRID and VidTIMIT, the proposed scheme was found to be 85.05% accurate when Linear Predictive Coding (LPC) is used to extract the vowel formants, which is much higher than the accuracy of BPNN and GMM. Since the proposed scheme does not require any training time other than creating a small database of vowel formants, it is faster as well. Furthermore, an increasing number of speakers makes it difficult for BPNN and GMM to sustain their accuracy, but the proposed score-based methodology stays almost linear.
Finally, a novel audio-visual fusion based identification system is implemented using GMM and MFCC for speaker identification and PCA for face recognition. The results of speaker identification and face recognition are fused at different levels, namely the feature, score and decision levels. Both the score-level and decision-level (with OR voting) fusions were shown to outperform the feature-level fusion in terms of accuracy and error resilience. This result is in line with the distinct nature of the two modalities, which lose themselves when combined at the feature level. The GRID and VidTIMIT test results validate that the proposed scheme is one of the best candidates for the fusion of face and voice due to its low computational time and high recognition accuracy.
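The score-level and decision-level fusion schemes can be sketched as follows; the weights, scores and identities are arbitrary illustrations, not values from the GRID or VidTIMIT experiments:

```python
def score_fusion(voice_scores, face_scores, w_voice=0.5):
    """Score level: weighted sum of per-speaker match scores from the two
    modalities, identifying the speaker with the highest fused score."""
    fused = {s: w_voice * voice_scores[s] + (1 - w_voice) * face_scores[s]
             for s in voice_scores}
    return max(fused, key=fused.get)

def decision_fusion_or(voice_id, face_id, claimed):
    """Decision level with OR voting: accept the claimed identity if either
    modality's independent decision matches it."""
    return voice_id == claimed or face_id == claimed
```

Fusing at the score or decision level keeps each modality's classifier intact, which is why these schemes tolerate the failure of one modality better than feature-level fusion.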
Towards Personalized and Human-in-the-Loop Document Summarization
The ubiquitous availability of computing devices and the widespread use of
the internet continuously generate large amounts of data. The amount of
available information on any given topic is therefore far beyond humans'
capacity to process, causing what is known as information overload. To cope
efficiently with large amounts of information and generate content of
significant value to users, we need to identify, merge and summarise
information. Data summaries can gather related information and collect it
into a shorter format that enables answering complicated questions, gaining
new insight and discovering conceptual boundaries.
This thesis focuses on three main challenges to alleviate information
overload using novel summarisation techniques. It further intends to facilitate
the analysis of documents to support personalised information extraction. This
thesis separates the research issues into four areas, covering (i) feature
engineering in document summarisation, (ii) traditional static and inflexible
summaries, (iii) traditional generic summarisation approaches, and (iv) the
need for reference summaries. We propose novel approaches to tackle these
challenges by: (i) enabling automatic, intelligent feature engineering;
(ii) enabling flexible and interactive summarisation; and (iii) utilising
intelligent and personalised summarisation approaches. The experimental
results demonstrate the efficiency of the proposed approaches compared to
other state-of-the-art models. We further propose solutions to the
information overload problem in different domains through summarisation,
covering network traffic data, health data and business process data.
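The extractive core behind such summarisation pipelines can be sketched with a textbook frequency-based baseline; this is a generic illustration, not the personalised or interactive models proposed in the thesis:

```python
import re
from collections import Counter

def summarise(text, n_sentences=1):
    """Score each sentence by the average corpus frequency of its words;
    keep the top n sentences in their original order."""
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    freq = Counter(re.findall(r"\w+", text.lower()))
    def score(s):
        toks = re.findall(r"\w+", s.lower())
        return sum(freq[t] for t in toks) / max(len(toks), 1)
    top = sorted(sorted(sentences, key=score, reverse=True)[:n_sentences],
                 key=sentences.index)
    return " ".join(top)
```

Sentences built from frequent words are treated as more central to the document; personalisation and interactivity would then adjust these scores toward a particular user's interests.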
Digital ecosystems
We view Digital Ecosystems to be the digital counterparts of biological ecosystems, which
are considered to be robust, self-organising and scalable architectures that can automatically
solve complex, dynamic problems. So, this work is concerned with the creation, investigation,
and optimisation of Digital Ecosystems, exploiting the self-organising properties of biological
ecosystems. First, we created the Digital Ecosystem, a novel optimisation technique inspired
by biological ecosystems, where the optimisation works at two levels: a first optimisation,
migration of agents which are distributed in a decentralised peer-to-peer network, operating
continuously in time; this process feeds a second optimisation based on evolutionary computing
that operates locally on single peers and is aimed at finding solutions to satisfy locally relevant
constraints. We then investigated its self-organising aspects, starting with an extension
to the definition of Physical Complexity to include the evolving agent populations of our
Digital Ecosystem. Next, we established stability of evolving agent populations over time,
by extending the Chli-DeWilde definition of agent stability to include evolutionary dynamics.
Further, we evaluated the diversity of the software agents within evolving agent populations,
relative to the environment provided by the user base. To conclude, we considered alternative
augmentations to optimise and accelerate our Digital Ecosystem, by studying the accelerating
effect of a clustering catalyst on the evolutionary dynamics of our Digital Ecosystem, through
the direct acceleration of the evolutionary processes. We also studied the optimising effect of
targeted migration on the ecological dynamics of our Digital Ecosystem, through the indirect
and emergent optimisation of the agent migration patterns. Overall, we have advanced the
understanding of creating Digital Ecosystems, the self-organisation that occurs within them,
and the optimisation of their Ecosystem-Oriented Architecture.
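The second, local optimisation level (evolutionary computing running on a single peer) can be sketched as a minimal mutate-and-select loop; the bit-string genome and the one-max fitness function below are stand-ins for agents and locally relevant constraints:

```python
import random

def evolve(fitness, genome_len=8, pop_size=20, generations=60, seed=0):
    """Minimal evolutionary loop: mutate each bit-string agent by flipping
    one bit, then keep the fittest individuals from parents + children."""
    rng = random.Random(seed)
    pop = [[rng.randint(0, 1) for _ in range(genome_len)]
           for _ in range(pop_size)]
    for _ in range(generations):
        children = []
        for g in pop:
            child = g[:]
            child[rng.randrange(genome_len)] ^= 1     # flip one random bit
            children.append(child)
        # truncation selection over the combined population
        pop = sorted(pop + children, key=fitness, reverse=True)[:pop_size]
    return pop[0]

# stand-in constraint: maximise the number of 1-bits in the genome
best = evolve(fitness=sum)
```

In the Digital Ecosystem, migration between peers would continually inject new agents into such local populations, coupling this loop to the first, network-level optimisation.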
Self-building Artificial Intelligence and machine learning to empower big data analytics in smart cities
The emerging information revolution makes it necessary to manage vast amounts of unstructured data rapidly. As the world is
increasingly populated by IoT devices and sensors that can sense their surroundings and communicate with each other, a digital
environment has been created with vast volumes of volatile and diverse data. Traditional AI and machine learning techniques
designed for deterministic situations are not suitable for such environments. With a large number of parameters required by each
device in this digital environment, it is desirable that the AI is able to be adaptive and self-build (i.e. self-structure, self-configure,
self-learn), rather than be structurally and parameter-wise pre-defined. This study explores the benefits of self-building AI and
machine learning with unsupervised learning for empowering big data analytics for smart city environments. By using the
growing self-organizing map, a new suite of self-building AI is proposed. The self-building AI overcomes the limitations of
traditional AI and enables data processing in dynamic smart city environments. With cloud computing platforms, the self-building AI can integrate the data analytics applications that currently work in silos. The new paradigm of the self-building AI and its value are demonstrated using IoT, video surveillance, and action recognition applications. This work was supported by the Data to Decisions Cooperative Research Centre (D2D CRC) as part of their analytics and decision support program, and by a La Trobe University Postgraduate Research Scholarship.
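The growing self-organizing map at the heart of the proposed suite controls its own structure with a spread factor: a boundary node spawns new neighbours once its accumulated quantisation error exceeds a growth threshold, commonly defined in the GSOM literature as GT = -D × ln(SF) for data dimensionality D and spread factor SF. A minimal sketch of that growth criterion (the values are illustrative):

```python
import math

def growth_threshold(dimensions, spread_factor):
    """GSOM growth threshold GT = -D * ln(SF): a lower spread factor raises
    the threshold and yields a smaller, coarser map."""
    return -dimensions * math.log(spread_factor)

def should_grow(accumulated_error, dimensions, spread_factor):
    """A node grows new neighbours once its accumulated quantisation
    error exceeds the growth threshold."""
    return accumulated_error > growth_threshold(dimensions, spread_factor)
```

Because the map grows only where the data demand it, the network self-structures instead of being parameter-wise pre-defined, which is the key property exploited for dynamic smart city data.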