Neural Network Ensembles for Time Series Prediction
Rapidly evolving businesses generate massive
amounts of time-stamped data sequences and create a demand
for massively multivariate time series analysis. For such data
the predictive engine shifts from the historical auto-regression
to modelling complex non-linear relationships between multidimensional
features and the time series outputs. In order to
exploit these time-disparate relationships for improved time
series forecasting, the system requires a flexible methodology
for combining multiple prediction models applied to multiple
versions of the temporal data under a significant noise component
and variable temporal depth of predictions. In response
to this challenge, a composite time series prediction model
is proposed which combines the strength of multiple neural
network (NN) regressors applied to the temporally varied
feature subsets and the postprocessing smoothing of outputs
developed to further reduce noise. The key strength of the model
is its excellent adaptability and generalisation ability achieved
through a highly diversified set of complementary NN models.
The model has been evaluated within the NISIS Competition 2006
and the NN3 Competition 2007 concerning prediction of univariate
and multivariate time series. It showed the best predictive
performance among 12 competing models in NISIS 2006
and is under evaluation within the NN3 2007 Competition.
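The combination scheme described in this abstract can be illustrated with a minimal sketch. It is not the authors' model: simple least-squares lag regressors stand in for the NN regressors, each one sees a different temporal depth (lag), their one-step forecasts are averaged, and a trailing moving average plays the role of the postprocessing smoothing. All function names and parameters here are hypothetical.

```python
from statistics import mean


def fit_lag_model(series, lag):
    """Fit y[t] = a + b * y[t - lag] by ordinary least squares.

    Assumption: this linear model is only a stand-in for an NN regressor.
    """
    xs = series[:-lag]          # inputs: values `lag` steps in the past
    ys = series[lag:]           # targets: current values
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx if sxx else 0.0
    a = my - b * mx
    return a, b


def ensemble_forecast(series, lags):
    """Average the one-step-ahead forecasts of one model per lag."""
    forecasts = []
    for lag in lags:
        a, b = fit_lag_model(series, lag)
        # The input for the next time point is the value `lag` steps before it.
        forecasts.append(a + b * series[-lag])
    return mean(forecasts)


def moving_average(values, window):
    """Trailing moving average; early points use the values seen so far."""
    smoothed = []
    for i in range(len(values)):
        chunk = values[max(0, i - window + 1): i + 1]
        smoothed.append(mean(chunk))
    return smoothed


if __name__ == "__main__":
    history = [2.0 * t + 1.0 for t in range(20)]   # toy linear series
    print(ensemble_forecast(history, lags=[1, 2, 3]))
    print(moving_average([1, 2, 3, 4], window=2))
```

Averaging is only one possible combiner; the abstract does not state which combination rule or smoothing filter the proposed model uses.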
Analysis of the Correlation Between Majority Voting Error and the Diversity Measures in Multiple Classifier Systems
Combining classifiers by majority voting (MV) has
recently emerged as an effective way of improving
the performance of individual classifiers. However, the
usefulness of applying MV is not always observed and
depends on the distribution of classification outputs in a
multiple classifier system (MCS). Evaluation of MV
errors (MVE) for all combinations of classifiers in an MCS
is a process of exponential complexity.
Reduction of this complexity can be achieved provided
an explicit relationship between MVE and any other
less complex function operating on classifier outputs is
found. Diversity measures operating on binary
classification outputs (correct/incorrect) are studied in
this paper as potential candidates for such functions.
Their correlation with MVE, interpreted as the quality
of a measure, is thoroughly investigated using artificial
and real-world datasets. Moreover, we propose a new
diversity measure that efficiently exploits information
coming from the whole MCS, rather than only the part to
which it is applied.
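As a rough illustration of the quantities discussed in this abstract, the sketch below computes the majority voting error from binary correct/incorrect outputs and compares it with one common pairwise diversity measure, the disagreement measure. The choice of measure, the treatment of ties as errors, and all function names are assumptions made for the example; the new measure proposed in the paper is not reproduced here.

```python
from itertools import combinations


def majority_vote_error(outputs):
    """Fraction of samples on which correct votes are not a strict majority.

    `outputs` is a list with one list per classifier, holding 1 (correct)
    or 0 (incorrect) for each sample. Ties count as errors (an assumption).
    """
    n_classifiers = len(outputs)
    n_samples = len(outputs[0])
    errors = 0
    for j in range(n_samples):
        correct_votes = sum(clf[j] for clf in outputs)
        if 2 * correct_votes <= n_classifiers:
            errors += 1
    return errors / n_samples


def disagreement(a, b):
    """Fraction of samples on which exactly one of two classifiers is correct."""
    return sum(x != y for x, y in zip(a, b)) / len(a)


def mean_pairwise_disagreement(outputs):
    """Average the disagreement measure over all classifier pairs."""
    pairs = list(combinations(outputs, 2))
    return sum(disagreement(a, b) for a, b in pairs) / len(pairs)


def mve_by_subset(outputs, size):
    """MVE for every subset of `size` classifiers (cost grows combinatorially)."""
    return {
        idx: majority_vote_error([outputs[i] for i in idx])
        for idx in combinations(range(len(outputs)), size)
    }


if __name__ == "__main__":
    diverse = [[1, 1, 0, 1], [1, 0, 1, 1], [0, 1, 1, 1]]
    identical = [[1, 0, 1, 1]] * 3
    for name, team in (("diverse", diverse), ("identical", identical)):
        print(name, majority_vote_error(team), mean_pairwise_disagreement(team))
```

In this toy case all classifiers have the same individual error of 0.25, yet the team whose errors fall on different samples has a lower MVE, which is the kind of relationship the diversity measures are meant to capture cheaply.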
An Overview of Classifier Fusion Methods
A number of classifier fusion methods have been
recently developed opening an alternative approach
leading to a potential improvement in the
classification performance. As there is little theory of
information fusion itself, currently we are faced with
different methods designed for different problems and
producing different results. This paper gives an
overview of classifier fusion methods and attempts to
identify new trends that may dominate this area of
research in the future. A taxonomy of fusion methods
trying to bring some order into the existing "pudding
of diversities" is also provided.
Nature-Inspired Learning Models
Intelligent learning mechanisms found in the natural world are still unsurpassed in their learning performance and efficiency of dealing with uncertain information coming in a variety of forms, yet remain under continuous challenge
from human-driven artificial intelligence methods. This work intends to demonstrate how the phenomena observed in the physical world can be directly used to guide artificial learning models. An inspiration for the new
learning methods has been found in the mechanics of physical fields found at both micro and macro scales.
Exploiting the analogies between data and particles subjected to gravity, electrostatic and gas particle fields, new algorithms have been developed and applied to classification and clustering while the properties of the
field are further reused in regression and visualisation of classification and classifier fusion. The paper covers extensive pictorial examples and visual interpretations of the presented techniques along with some testing over
well-known real and artificial datasets, compared when possible to the traditional methods.
Reducing Spatial Data Complexity for Classification Models
Intelligent data analytics is gradually becoming a day-to-day reality of today's businesses. However, despite rapidly
increasing storage and computational power, current state-of-the-art predictive models still cannot handle massive and noisy
corporate data warehouses. What is more, adaptive and real-time operational environments require multiple models to be
frequently retrained, which further hinders their use. Various data reduction techniques ranging from data sampling up to
density retention models attempt to address this challenge by capturing a summarised data structure, yet they either do
not account for labelled data or degrade the classification performance of the model trained on the condensed dataset. Our
response is a new general framework for reducing the complexity of labelled data by means of controlled
spatial redistribution of class densities in the input space. Using the example of the Parzen Labelled Data Compressor (PLDC), we
demonstrate a simulated data condensation process directly inspired by the electrostatic field interaction, where the data are
moved and merged following the attracting and repelling interactions with the other labelled data. The process is controlled
by the class density function built on the original data that acts as a class-sensitive potential field ensuring preservation of
the original class density distributions, yet allowing data to rearrange and merge joining together their soft class partitions.
As a result, we achieved a model that reduces labelled datasets much further than any competing approach, yet with
the maximum retention of the original class densities and hence the classification performance. PLDC leaves the reduced
dataset with soft accumulative class weights allowing for efficient online updates and, as shown in a series of experiments,
when coupled with the Parzen Density Classifier (PDC) significantly outperforms competing data condensation methods in terms of
classification performance at comparable compression levels.
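To make the idea of a condensed dataset with soft accumulative class weights more concrete, here is a minimal sketch of a Parzen-style density classifier operating on weighted prototypes. It is an illustration under stated assumptions, not the PLDC algorithm: the Gaussian kernel, the shared bandwidth, the weighted-centroid merge rule, and all names are hypothetical, and the field-driven movement of the data is not modelled.

```python
import math


def gaussian_kernel(x, centre, bandwidth):
    """Unnormalised Gaussian kernel; the constant cancels when comparing classes."""
    sq_dist = sum((xi - ci) ** 2 for xi, ci in zip(x, centre))
    return math.exp(-sq_dist / (2.0 * bandwidth ** 2))


def class_densities(x, prototypes, bandwidth):
    """Sum kernel responses per class, scaled by each prototype's class weights.

    `prototypes` is a list of (point, weights) pairs, where `weights` maps a
    class label to the accumulated weight of that class at the prototype.
    """
    scores = {}
    for point, weights in prototypes:
        k = gaussian_kernel(x, point, bandwidth)
        for label, w in weights.items():
            scores[label] = scores.get(label, 0.0) + w * k
    return scores


def classify(x, prototypes, bandwidth=1.0):
    """Return the class with the highest estimated density at x."""
    scores = class_densities(x, prototypes, bandwidth)
    return max(scores, key=scores.get)


def merge(p, q):
    """Merge two prototypes at their weight-weighted centroid (an assumed rule),
    adding their class weights so the total class mass is preserved."""
    (xp, wp), (xq, wq) = p, q
    mp, mq = sum(wp.values()), sum(wq.values())
    centre = tuple((mp * a + mq * b) / (mp + mq) for a, b in zip(xp, xq))
    weights = dict(wp)
    for label, w in wq.items():
        weights[label] = weights.get(label, 0.0) + w
    return centre, weights


if __name__ == "__main__":
    prototypes = [((0.0, 0.0), {"a": 3.0}), ((4.0, 0.0), {"b": 2.0, "a": 0.5})]
    print(classify((0.5, 0.0), prototypes))
    print(classify((4.0, 0.2), prototypes))
    print(merge(((0.0, 0.0), {"a": 1.0}), ((2.0, 0.0), {"a": 1.0, "b": 2.0})))
```

Because the class weights are simply accumulated, a new labelled point can be added as another prototype or merged into an existing one without revisiting the original data, which is one way to read the abstract's remark about efficient online updates.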
Introducing the first whole genomes of nationals from the United Arab Emirates
Whole Genome Sequencing (WGS) provides an in-depth description of genome variation. In the era of large-scale population genome projects, the assembly of ethnic-specific genomes combined with mapping human reference genomes of underrepresented populations has improved the understanding of human diversity and disease associations. In this study, for the first time, whole genome sequences of two nationals of the United Arab Emirates (UAE) at >27X coverage are reported. The two Emirati individuals were predominantly of Central/South Asian ancestry. An in-house customized pipeline using BWA and Picard, followed by the GATK tools, was used to map the raw data from the whole genome sequences of both individuals. A total of 3,994,521 variants (3,350,574 Single Nucleotide Polymorphisms (SNPs) and 643,947 indels) were identified for the first individual, the UAE S001 sample. A similar number of variants, 4,031,580 (3,373,501 SNPs and 658,079 indels), were identified for UAE S002. Variants that are associated with diabetes, hypertension, increased cholesterol levels, and obesity were also identified in these individuals. These whole genome sequences provide a starting point for constructing a UAE reference panel, which will lead to improvements in the delivery of precision medicine, quality of life for affected individuals and a reduction in healthcare costs. The information compiled will likely lead to the identification of target genes that could potentially lead to the development of novel therapeutic modalities.
Analysis of SARS-CoV-2 viral loads in stool samples and nasopharyngeal swabs from COVID-19 patients in the United Arab Emirates
Coronavirus disease 2019 (COVID-19) was first identified in respiratory samples and was found to commonly cause cough and pneumonia. However, non-respiratory symptoms including gastrointestinal disorders are also present, and a large proportion of patients test positive for the virus in stools for a prolonged period. In this cross-sectional study, we investigated viral load trends in stools and nasopharyngeal swabs and their correlation with multiple demographic and clinical factors. The study included 211 laboratory-confirmed cases suffering from a mild form of the disease and completing their isolation period at a non-hospital center in the United Arab Emirates. Demographic and clinical information was collected by standardized questionnaire and from the patients' medical records. Of the 211 participants, 25% tested negative in both sample types at the time of this study, and 53% of the remaining patients had detectable viral RNA in their stools. A positive fecal viral test was associated with male gender, diarrhea as a symptom, and hospitalization during infection. A positive correlation was also observed between a delayed onset of symptoms and a positive stool test. Viral load in stools correlated positively with being overweight, exercising, taking antibiotics in the last 3 months, and blood type O. The viral load in nasopharyngeal swabs, on the other hand, was higher for blood type A and rhesus positive (Rh factor). Regression analysis showed no correlation between the viral loads measured in stool and nasopharyngeal samples in any given patient. The results of this work highlight the factors associated with a higher viral count in each sample. They also show the importance of stool sample analysis for the follow-up and diagnosis of recovering COVID-19 patients.