An Order-based Algorithm for Minimum Dominating Set with Application in Graph Mining

Author: Chalupa David
Publication venue: 'Elsevier BV'
Publication date: 03/11/2017

A dominating set is a set of vertices of a graph such that all other vertices have a neighbour in the dominating set. We propose a new order-based randomised local search (RLS_o) algorithm to solve the minimum dominating set problem in large graphs. Experimental evaluation is presented for multiple types of problem instances. These instances include unit disk graphs, which represent a model of wireless networks, random scale-free networks, as well as samples from two social networks and real-world graphs studied in network science. Our experiments indicate that RLS_o performs better than both a classical greedy approximation algorithm and two metaheuristic algorithms based on ant colony optimisation and local search. The order-based algorithm is able to find small dominating sets for graphs with tens of thousands of vertices. In addition, we propose a multi-start variant of RLS_o that is suitable for solving the minimum weight dominating set problem. The application of RLS_o in graph mining is also briefly demonstrated.
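To make the definition concrete, the following is a minimal Python sketch of the classical greedy heuristic that the abstract names as a baseline. It is not the order-based RLS_o algorithm, whose details the abstract does not give. The adjacency-dictionary representation and the function names are assumptions made for illustration.

```python
def greedy_dominating_set(adj):
    """Greedy heuristic: repeatedly pick the vertex whose closed
    neighbourhood covers the most still-undominated vertices.

    adj maps each vertex to the set of its neighbours (assumed format).
    """
    undominated = set(adj)
    chosen = set()
    while undominated:
        # Closed neighbourhood = the vertex itself plus its neighbours.
        best = max(adj, key=lambda v: len(({v} | adj[v]) & undominated))
        chosen.add(best)
        undominated -= {best} | adj[best]
    return chosen


def is_dominating(adj, candidate):
    """True if every vertex is in the set or has a neighbour in it."""
    return all(v in candidate or bool(adj[v] & candidate) for v in adj)


# A star graph: vertex 0 is adjacent to every other vertex.
star = {0: {1, 2, 3, 4}, 1: {0}, 2: {0}, 3: {0}, 4: {0}}
# A path 0 - 1 - 2 - 3 - 4.
path = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2, 4}, 4: {3}}

print(greedy_dominating_set(star))
print(greedy_dominating_set(path))
```

The greedy rule gives no optimality guarantee in general, which is why the abstract treats it as a reference point for comparison rather than as the proposed method.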

arXiv.org e-Print Archive

VBN

Bandwidth-efficient communication systems based on finite-length low density parity check codes

Author: Vu Huy Gia
Publication venue: 'University of Saskatchewan Library'

Low density parity check (LDPC) codes are linear block codes constructed by pseudo-random parity check matrices. These codes are powerful in terms of error performance and, especially, have low decoding complexity. While infinite-length LDPC codes approach the capacity of communication channels, finite-length LDPC codes also perform well, and simultaneously meet the delay requirement of many communication applications such as voice and backbone transmissions. Therefore, finite-length LDPC codes are attractive to employ in low-latency communication systems. This thesis mainly focuses on the bandwidth-efficient communication systems using finite-length LDPC codes. Such bandwidth-efficient systems are realized by mapping a group of LDPC coded bits to a symbol of a high-order signal constellation. Depending on the systems' infrastructure and knowledge of the channel state information (CSI), the signal constellations in different coded modulation systems can be two-dimensional multilevel/multiphase constellations or multi-dimensional space-time constellations. In the first part of the thesis, two basic bandwidth-efficient coded modulation systems, namely LDPC coded modulation and multilevel LDPC coded modulation, are investigated for both additive white Gaussian noise (AWGN) and frequency-flat Rayleigh fading channels. The bounds on the bit error rate (BER) performance are derived for these systems based on the maximum likelihood (ML) criterion. The derivation of these bounds relies on the union bounding and combinatorial techniques. In particular, for the LDPC coded modulation, the ML bound is computed from the Hamming distance spectrum of the LDPC code and the Euclidean distance profile of the two-dimensional constellation. For the multilevel LDPC coded modulation, the bound of each decoding stage is obtained for a generalized multilevel coded modulation, where more than one coded bit is considered per level.
For both systems, the bounds are confirmed by the simulation results of ML decoding and/or the performance of the ordered-statistic decoding (OSD) and the sum-product decoding. It is demonstrated that these bounds can be efficiently used to evaluate the error performance and select appropriate parameters (such as the code rate, constellation and mapping) for the two communication systems. The second part of the thesis studies bandwidth-efficient LDPC coded systems that employ multiple transmit and multiple receive antennas, i.e., multiple-input multiple-output (MIMO) systems. Two scenarios of CSI availability are considered: (i) the CSI is unknown at both the transmitter and the receiver; (ii) the CSI is known at both the transmitter and the receiver. For the first scenario, LDPC coded unitary space-time modulation systems are most suitable and the ML performance bound is derived for these non-coherent systems. To derive the bound, the summation of chordal distances is obtained and used instead of the Euclidean distances. For the second case of CSI, adaptive LDPC coded MIMO modulation systems are studied, where three adaptive schemes with antenna beamforming and/or antenna selection are investigated and compared in terms of the bandwidth efficiency. For uncoded discrete-rate adaptive modulation, the computation of the bandwidth efficiency shows that the scheme with antenna selection at the transmitter and antenna combining at the receiver performs the best when the number of antennas is small. For adaptive LDPC coded MIMO modulation systems, an achievable threshold of the bandwidth efficiency is also computed from the ML bound of LDPC coded modulation derived in the first part.
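The mapping of groups of coded bits to constellation symbols can be illustrated with a short Python sketch. It only shows the bookkeeping: an M-point constellation carries log2(M) coded bits per symbol, and a rate-R code therefore delivers R * log2(M) information bits per symbol. The function names are hypothetical, the constellation size is assumed to be a power of two, and nothing here reproduces the bounds or systems of the thesis.

```python
def bits_per_symbol(constellation_size):
    """Number of coded bits carried by one symbol of an M-point constellation."""
    if constellation_size < 2 or constellation_size & (constellation_size - 1):
        raise ValueError("constellation size must be a power of two")
    return constellation_size.bit_length() - 1


def information_bits_per_symbol(code_rate, constellation_size):
    """Information bits per transmitted symbol for a rate-R code."""
    return code_rate * bits_per_symbol(constellation_size)


def group_coded_bits(coded_bits, constellation_size):
    """Split a coded bit sequence into the groups that index symbols."""
    m = bits_per_symbol(constellation_size)
    if len(coded_bits) % m:
        raise ValueError("bit sequence length must be a multiple of log2(M)")
    return [tuple(coded_bits[i:i + m]) for i in range(0, len(coded_bits), m)]


# A rate-1/2 code on a 16-point constellation: 4 coded bits per symbol,
# of which 2 are information bits.
print(information_bits_per_symbol(0.5, 16))
print(group_coded_bits([1, 0, 1, 1, 0, 0, 0, 1], 16))
```

How each group of bits is assigned to a particular constellation point (the mapping the abstract lists among the design parameters) is left out, since the abstract does not specify it.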

eCommons@USASK

University of Saskatchewan Research Archive

DATA-DRIVEN APPROACH TO IMAGE CLASSIFICATION

Author: NarasimhaMurthy Venkatesh
Publication venue: ScholarWorks@UMass Amherst
Publication date: 02/07/2019

Image classification has been a core topic in the computer vision community. Its recent success with convolutional neural network (CNN) algorithms has led to various real-world applications such as large scale management of photos/videos on cloud/social-media, image based search for online retailers, self-driving cars, building robots and healthcare. Image classification can be broadly categorized into binary, multi-class and multi-label classification problems. Binary classification involves assigning one of the two class labels to an instance. In multi-class classification problem, an instance should be categorized into one of more than two classes. Multi-label classification is a generalized version of the multi-class classification problem where each image is assigned multiple labels as opposed to a single label. In this work, we first present various methods that take advantage of deep representations (fully connected layer of pre-trained CNN on the ImageNet dataset) and yield better performance on multi-label classification when compared to methods that use over a dozen conventional visual features. Following the success of deep representations, we intend to build a generic end-to-end deep learning framework to address all three problem categories of image classification. However, there are still no well-established guidelines (in terms of choosing the number of layers to go deeper, the number of kernels and the size, the type of regularizer, the choice of non-linear function, etc.) to build an efficient deep neural network and often network architecture design is specific to a problem/dataset. Hence, we present some initial efforts in building a computational framework called Deep Decision Network (DDN) which is completely data-driven. DDN is a tree-like structure built stage-wise.
During the learning phase, starting from the root network node, DDN automatically builds a network that splits the data into disjoint clusters of classes which would be handled by the subsequent expert networks. This results in a tree-like structured network driven by the data. The proposed approach provides an insight into the data by identifying the group of classes that are hard to classify and require more attention when compared to others. This feature is crucial for people trying to solve the problem with little or no domain knowledge, especially for applications in the medical domain. Initially, we evaluate DDN on a binary classification problem and later extend it to more challenging multi-class and multi-label classification problems. The extension of DDN to multi-class and multi-label involves some changes but they still operate under the same underlying principle. In all three cases, the proposed approach is tested for its recognition performance and scalability on publicly available datasets, with comparison to other methods.
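The difference between the multi-class and multi-label settings described in the abstract shows up in how the targets are encoded. The Python sketch below contrasts the two encodings; the class names and function names are hypothetical, and the code says nothing about how DDN itself represents labels.

```python
CLASSES = ["cat", "dog", "car"]  # hypothetical label set for illustration


def one_hot(label, classes):
    """Multi-class target: exactly one class is marked per image."""
    if label not in classes:
        raise ValueError(f"unknown label: {label}")
    return [1 if c == label else 0 for c in classes]


def multi_hot(labels, classes):
    """Multi-label target: any subset of the classes may be marked."""
    unknown = set(labels) - set(classes)
    if unknown:
        raise ValueError(f"unknown labels: {sorted(unknown)}")
    return [1 if c in labels else 0 for c in classes]


print(one_hot("dog", CLASSES))
print(multi_hot({"cat", "car"}, CLASSES))
```

Binary classification is the special case of the first encoding with two classes.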

ScholarWorks@UMass Amherst

Damage and repair identification in reinforced concrete beams modelled with various damage scenarios using vibration data

Author: Al-Ghalib AA
Publication date: 01/01/2013

This research aims at developing a novel vibration-based damage identification technique that can efficiently be applied to real-time large data for detection, classification, localisation and quantification of the potential structural damage.

Nottingham Trent Institutional Repository (IRep)

Antennas and Electromagnetics Research via Natural Language Processing.

Author: Cha Y-O
Publication venue: 'Queen Mary University of London'
Publication date: 17/10/2023

Advanced techniques for performing natural language processing (NLP) are being utilised to devise a pioneering methodology for collecting and analysing data derived from scientific literature. Despite significant advancements in automated database generation and analysis within the domains of material chemistry and physics, the implementation of NLP techniques in the realms of metamaterial discovery, antenna design, and wireless communications remains at its early stages. This thesis proposes several novel approaches to advance research in material science. Firstly, an NLP method has been developed to automatically extract keywords from large-scale unstructured texts in the area of metamaterial research. This enables the uncovering of trends and relationships between keywords, facilitating the establishment of future research directions. Additionally, a trained neural network model based on the encoder-decoder Long Short-Term Memory (LSTM) architecture has been developed to predict future research directions and provide insights into the influence of metamaterials research. This model lays the groundwork for developing a research roadmap of metamaterials. Furthermore, a novel weighting system has been designed to evaluate article attributes in antenna and propagation research, enabling more accurate assessments of the impact of each scientific publication. This approach goes beyond conventional numeric metrics to produce more meaningful predictions. Secondly, a framework has been proposed to leverage text summarisation, one of the primary NLP tasks, to enhance the quality of scientific reviews. It has been applied to review recent developments of antennas and propagation for body-centric wireless communications, and the validation has been made available for comparison with well-referenced datasets for text summarisation. Lastly, the effectiveness of automated database building in the domain of tunable materials and their properties has been presented.
The collected database will be used as an input for training a surrogate machine learning model in an iterative active learning cycle. This model will be utilised to facilitate high-throughput material processing, with the ultimate goal of discovering novel materials exhibiting high tunability. The approaches proposed in this thesis will help to accelerate the discovery of new materials and enhance their applications in antennas, which has the potential to transform electromagnetic material research.

Queen Mary Research Online

Implementation of Continuous Compliance: Automation of Information Security Measures in the software development process to ensure Continuous Compliance

Author: Ozkanli N (Nese)
Publication date: 10/06/2020

Open University of the Netherlands Research Portal

A survey of the application of soft computing to investment and financial trading

Author: Tan Clarence, Vanstone Bruce J
Publication venue: The Australian Pattern Recognition Society
Publication date: 01/01/2003

Bond University Research Portal

Just-in-time Pastureland Trait Estimation for Silage Optimization, under Limited Data Constraints

Author: O'Byrne Patricia
Publication venue: Technological University Dublin
Publication date: 01/01/2021

To ensure that pasture-based farming meets production and environmental targets for a growing population under increasing resource constraints, producers need to know pastureland traits. Current proximal pastureland trait prediction methods largely rely on vegetation indices to determine biomass and moisture content. The development of new techniques relies on the challenging task of collecting labelled pastureland data, leading to small datasets. Classical computer vision has already been applied to weed identification and recognition of fruit blemishes using morphological features, but machine learning algorithms can parameterise models without the provision of explicit features, and deep learning can extract even more abstract knowledge although typically this is assumed to be based around very large datasets. This work hypothesises that through the advantages of state-of-the-art deep learning systems, pastureland crop traits can be accurately assessed in a just-in-time fashion, based on data retrieved from an inexpensive sensor platform, under the constraint of limited amounts of labelled data. However, the challenges to achieve this overall goal are great, and for applications such as just-in-time yield and moisture estimation for farm machinery, this work must bring together systems development, knowledge of good pastureland practice, and also techniques for handling low-volume datasets in a machine learning context. Given these challenges, this thesis makes a number of contributions. The first of these is a comprehensive literature review, relating pastureland traits to ruminant nutrient requirements and exploring trait estimation methods, from contact to remote sensing methods, including details of vegetation indices and the sensors and techniques required to use them. The second major contribution is a high-level specification of a platform for collecting and labelling pastureland data.
This includes the collection of four-channel Blue, Green, Red and NIR (VISNIR) images, narrowband data, height and temperature differential, using inexpensive proximal sensors and provides a basis for holistic data analysis. Physical data platforms built around this specification were created to collect and label pastureland data, involving computer scientists, agricultural, mechanical and electronic engineers, and biologists from academia and industry, working with farmers. Using the developed platform and a set of protocols for data collection, a further contribution of this work was the collection of a multi-sensor multimodal dataset for pastureland properties. This was made up of four-channel image data, height data, thermal data, Global Positioning System (GPS) and hyperspectral data, and is available and labelled with biomass (kg/ha) and percentage dry matter, ready for use in deep learning. However, the most notable contribution of this work was a systematic investigation of various machine learning methods applied to the collected data in order to maximise model performance under the constraints indicated above. The initial set of models focused on collected hyperspectral datasets. However, due to their relative complexity in real-time deployment, the focus was instead on models that could best leverage image data. The main body of these models centred on image processing methods and, in particular, the use of the so-called Inception Resnet and MobileNet models to predict fresh biomass and percentage dry matter, enhancing performance using data fusion, transfer learning and multi-task learning. Images were subdivided to augment the dataset, using two different patch sizes, resulting in around 10,000 small patches of size 156 x 156 pixels and around 5,000 large patches of size 240 x 240 pixels. Five-fold cross validation was used in all analyses.
Prediction accuracy was compared to older mechanisms, albeit using hyperspectral data collected, with no provision made for lighting, humidity or temperature. Hyperspectral labelled data did not produce accurate results when used to calculate Normalized Difference Vegetation Index (NDVI), or to train a neural network (NN), a 1D Convolutional Neural Network (CNN) or Long Short-Term Memory (LSTM) models. Potential reasons for this are discussed, including issues around the use of highly sensitive devices in uncontrolled environments. The most accurate prediction came from a multi-modal hybrid model that concatenated output from an Inception ResNet based model, run on RGB data with ImageNet pre-trained RGB weights, output from a residual network trained on NIR data, and LiDAR height data, before fully connected layers, using the small patch dataset with a minimum validation MAPE of 28.23% for fresh biomass and 11.43% for dryness. However, a very similar prediction accuracy resulted from a model that omitted NIR data, thus requiring fewer sensors and training resources, making it more sustainable. Although NIR and temperature differential data were collected and used for analysis, neither improved prediction accuracy, with the Inception ResNet model’s minimum validation MAPE rising to 39.42% when NIR data was added. When both NIR data and temperature differential were added to a multi-task learning Inception ResNet model, it yielded a minimum validation MAPE of 33.32%. As more labelled data are collected, the models can be further trained, enabling sensors on mowers to collect data and give timely trait information to farmers. This technology is also transferable to other crops. Overall, this work should provide a valuable contribution to the smart agriculture research space.
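Three quantities in this abstract have simple standard definitions that a short Python sketch can make concrete: the mean absolute percentage error (MAPE) used to report accuracy, the NDVI vegetation index, and the subdivision of an image into fixed-size patches. The function names, the sample numbers and the image dimensions are hypothetical. The abstract does not say whether the patches overlapped, so the non-overlapping tiling below is an assumption.

```python
def mape(actual, predicted):
    """Mean absolute percentage error, in percent (actual values must be non-zero)."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    total = sum(abs((a - p) / a) for a, p in zip(actual, predicted))
    return 100.0 * total / len(actual)


def ndvi(nir, red):
    """Normalized Difference Vegetation Index from NIR and red reflectance."""
    return (nir - red) / (nir + red)


def patch_origins(height, width, patch):
    """Top-left corners of non-overlapping patch x patch tiles (assumed tiling)."""
    return [
        (row, col)
        for row in range(0, height - patch + 1, patch)
        for col in range(0, width - patch + 1, patch)
    ]


# Hypothetical biomass values in kg/ha, purely for illustration.
print(mape([100.0, 200.0], [90.0, 220.0]))
print(ndvi(0.6, 0.2))
# A hypothetical 480 x 480 image yields four 240 x 240 patches.
print(patch_origins(480, 480, 240))
```

Because MAPE divides by the true value, it is undefined when a labelled value is zero, which is worth keeping in mind for traits that can legitimately be zero.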

Arrow@TUDublin