36 research outputs found

    Neural Network Ensembles for Time Series Prediction

    Rapidly evolving businesses generate massive amounts of time-stamped data sequences and drive a demand for massively multivariate time series analysis. For such data the predictive engine shifts from historical auto-regression to modelling complex non-linear relationships between multidimensional features and the time series outputs. To exploit these time-disparate relationships for improved time series forecasting, the system requires a flexible methodology for combining multiple prediction models applied to multiple versions of the temporal data, under a significant noise component and variable temporal depth of predictions. In response to this challenge, a composite time series prediction model is proposed which combines the strength of multiple neural network (NN) regressors applied to temporally varied feature subsets with postprocessing smoothing of outputs developed to further reduce noise. The key strength of the model is its adaptability and generalisation ability, achieved through a highly diversified set of complementary NN models. The model has been evaluated within the NISIS Competition 2006 and the NN3 Competition 2007, concerning prediction of univariate and multivariate time series. It showed the best predictive performance among 12 competing models in NISIS 2006 and is under evaluation within the NN3 2007 Competition.
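
The two stages named in the abstract above, combining several regressors and then smoothing the combined output, can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the abstract does not say how the NN outputs are combined or which smoother is used, so a plain mean and a centred moving average stand in for both, and the function names `combine_forecasts` and `moving_average` are hypothetical.

```python
def combine_forecasts(forecasts):
    """Average several forecasts step by step.

    forecasts: list of equal-length lists, one per model (assumed input;
    in the paper these would come from NN regressors).
    """
    n_models = len(forecasts)
    return [sum(step) / n_models for step in zip(*forecasts)]


def moving_average(series, window=3):
    """Centred moving average; the window shrinks at the series edges."""
    half = window // 2
    smoothed = []
    for i in range(len(series)):
        lo = max(0, i - half)
        hi = min(len(series), i + half + 1)
        smoothed.append(sum(series[lo:hi]) / (hi - lo))
    return smoothed


# Two hypothetical model outputs over three future time steps.
model_outputs = [[1.0, 2.0, 3.0], [3.0, 4.0, 5.0]]
combined = combine_forecasts(model_outputs)
smoothed = moving_average(combined, window=3)
```

A simple mean is only one of many possible combiners; weighted or trained combiners would slot into `combine_forecasts` without changing the smoothing step.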

    Analysis of the Correlation Between Majority Voting Error and the Diversity Measures in Multiple Classifier Systems

    Combining classifiers by majority voting (MV) has recently emerged as an effective way of improving the performance of individual classifiers. However, MV is not always beneficial: its usefulness depends on the distribution of classification outputs in a multiple classifier system (MCS). Evaluating the MV error (MVE) for all combinations of classifiers in an MCS has exponential complexity. This complexity can be reduced provided an explicit relationship is found between MVE and some other, less complex function operating on classifier outputs. Diversity measures operating on binary classification outputs (correct/incorrect) are studied in this paper as potential candidates for such functions. Their correlation with MVE, interpreted as the quality of a measure, is thoroughly investigated using artificial and real-world datasets. Moreover, we propose a new diversity measure that efficiently exploits information coming from the whole MCS, rather than only the part to which it is applied.
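
As a small illustration of the quantities discussed in the abstract above, the sketch below computes the majority voting error and one well-known pairwise diversity measure (the disagreement measure) from binary correct/incorrect outputs. It is not the new measure proposed in the paper. The assumptions are labelled in the comments: an odd number of classifiers, a vote counted as correct only when more than half of the members are correct, and the function names are hypothetical.

```python
from itertools import combinations


def majority_vote_error(outputs):
    """Fraction of samples on which the majority of classifiers is wrong.

    outputs: one list per classifier, entries 1 (correct) or 0 (incorrect).
    Assumption: the vote is correct only if more than half are correct.
    """
    n_classifiers = len(outputs)
    n_samples = len(outputs[0])
    wrong = 0
    for j in range(n_samples):
        correct_votes = sum(clf[j] for clf in outputs)
        if 2 * correct_votes <= n_classifiers:
            wrong += 1
    return wrong / n_samples


def mean_disagreement(outputs):
    """Average, over classifier pairs, of the fraction of samples
    on which exactly one of the two classifiers is correct."""
    pairs = list(combinations(outputs, 2))
    total = 0.0
    for a, b in pairs:
        total += sum(x != y for x, y in zip(a, b)) / len(a)
    return total / len(pairs)


# Three hypothetical classifiers evaluated on five samples.
outputs = [
    [1, 1, 0, 1, 0],
    [1, 0, 1, 1, 0],
    [0, 1, 1, 1, 0],
]
mve = majority_vote_error(outputs)      # majority is wrong only on the last sample
diversity = mean_disagreement(outputs)
```

In this toy example each classifier alone errs on two of five samples, while the majority vote errs on one, because the individual errors fall on different samples.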

    An Overview of Classifier Fusion Methods

    A number of classifier fusion methods have recently been developed, opening an alternative approach that may improve classification performance. As there is little theory of information fusion itself, we are currently faced with different methods designed for different problems and producing different results. This paper gives an overview of classifier fusion methods and attempts to identify new trends that may dominate this area of research in the future. A taxonomy of fusion methods, trying to bring some order into the existing “pudding of diversities”, is also provided.

    Nature-Inspired Learning Models

    Intelligent learning mechanisms found in the natural world are still unsurpassed in their learning performance and efficiency in dealing with uncertain information coming in a variety of forms, yet remain under continuous challenge from human-driven artificial intelligence methods. This work intends to demonstrate how phenomena observed in the physical world can be directly used to guide artificial learning models. An inspiration for the new learning methods has been found in the mechanics of physical fields at both micro and macro scale. Exploiting the analogies between data and particles subjected to gravity, electrostatic and gas particle fields, new algorithms have been developed and applied to classification and clustering, while the properties of the field are further reused in regression and in the visualisation of classification and classifier fusion. The paper covers extensive pictorial examples and visual interpretations of the presented techniques, along with some testing on well-known real and artificial datasets, compared where possible to traditional methods.

    Reducing Spatial Data Complexity for Classification Models

    Intelligent data analytics is gradually becoming a day-to-day reality of today's businesses. However, despite rapidly increasing storage and computational power, current state-of-the-art predictive models still cannot handle massive and noisy corporate data warehouses. What is more, adaptive and real-time operational environments require multiple models to be frequently retrained, which further hinders their use. Various data reduction techniques, ranging from data sampling up to density retention models, attempt to address this challenge by capturing a summarised data structure, yet they either do not account for labelled data or degrade the classification performance of the model trained on the condensed dataset. Our response is a new general framework for reducing the complexity of labelled data by means of controlled spatial redistribution of class densities in the input space. Using the example of the Parzen Labelled Data Compressor (PLDC), we demonstrate a simulated data condensation process directly inspired by electrostatic field interaction, in which the data are moved and merged following attracting and repelling interactions with the other labelled data. The process is controlled by the class density function built on the original data, which acts as a class-sensitive potential field ensuring preservation of the original class density distributions, yet allowing data to rearrange and merge, joining together their soft class partitions. As a result we achieved a model that reduces labelled datasets much further than competing approaches, yet with maximum retention of the original class densities and hence of the classification performance. PLDC leaves the reduced dataset with soft accumulative class weights, allowing for efficient online updates, and, as shown in a series of experiments, when coupled with the Parzen Density Classifier (PDC) it significantly outperforms competing data condensation methods in terms of classification performance at comparable compression levels.
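
To make the idea of classifying from a condensed dataset with soft class weights concrete, here is a minimal sketch of a Parzen-style density classifier over weighted prototypes. It is an illustration under stated assumptions, not the authors' PLDC or PDC implementation: the Gaussian kernel, the bandwidth parameter `h`, the prototype layout and the names `gaussian_kernel` and `parzen_classify` are all hypothetical choices. The condensation step itself (moving and merging points) is not shown.

```python
import math


def gaussian_kernel(x, centre, h):
    """Unnormalised Gaussian kernel with bandwidth h (assumed kernel choice)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, centre))
    return math.exp(-sq_dist / (2.0 * h * h))


def parzen_classify(x, prototypes, h=1.0):
    """Pick the class with the largest weighted kernel density at x.

    prototypes: list of (point, {label: weight}) pairs, where the weights
    play the role of soft accumulated class weights of merged points.
    """
    scores = {}
    for point, class_weights in prototypes:
        k = gaussian_kernel(x, point, h)
        for label, weight in class_weights.items():
            scores[label] = scores.get(label, 0.0) + weight * k
    return max(scores, key=scores.get), scores


# Two hypothetical prototypes; the first carries weight from both classes.
prototypes = [
    ((0.0, 0.0), {"a": 3.0, "b": 0.5}),
    ((4.0, 4.0), {"b": 2.0}),
]
label_near_origin, _ = parzen_classify((0.5, 0.5), prototypes)
label_far, _ = parzen_classify((3.5, 3.5), prototypes)
```

Because each prototype stores accumulated per-class weights, a new labelled point could be absorbed by adding to the weights of a nearby prototype, which is one way to read the "efficient online updates" mentioned in the abstract.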

    Introducing the first whole genomes of nationals from the United Arab Emirates

    Whole Genome Sequencing (WGS) provides an in-depth description of genome variation. In the era of large-scale population genome projects, the assembly of ethnic-specific genomes combined with mapping human reference genomes of underrepresented populations has improved the understanding of human diversity and disease associations. In this study, for the first time, whole genome sequences of two nationals of the United Arab Emirates (UAE) at >27X coverage are reported. The two Emirati individuals were predominantly of Central/South Asian ancestry. An in-house customized pipeline using BWA and Picard, followed by the GATK tools, was used to map the raw data from the whole genome sequences of both individuals. A total of 3,994,521 variants (3,350,574 Single Nucleotide Polymorphisms (SNPs) and 643,947 indels) were identified for the first individual, the UAE S001 sample. A similar number of variants, 4,031,580 (3,373,501 SNPs and 658,079 indels), were identified for UAE S002. Variants associated with diabetes, hypertension, increased cholesterol levels, and obesity were also identified in these individuals. These whole genome sequences have provided a starting point for constructing a UAE reference panel, which will lead to improvements in the delivery of precision medicine and quality of life for affected individuals, and to a reduction in healthcare costs. The information compiled will likely lead to the identification of target genes that could potentially lead to the development of novel therapeutic modalities.

    Analysis of SARS-CoV-2 viral loads in stool samples and nasopharyngeal swabs from COVID-19 patients in the United Arab Emirates

    Coronavirus disease 2019 (COVID-19) was first identified in respiratory samples and was found to commonly cause cough and pneumonia. However, non-respiratory symptoms including gastrointestinal disorders are also present, and a large proportion of patients test positive for the virus in stools for a prolonged period. In this cross-sectional study, we investigated viral load trends in stools and nasopharyngeal swabs and their correlation with multiple demographic and clinical factors. The study included 211 laboratory-confirmed cases suffering from a mild form of the disease and completing their isolation period at a non-hospital center in the United Arab Emirates. Demographic and clinical information was collected by standardized questionnaire and from the patients' medical records. Of the 211 participants, 25% tested negative in both sample types at the time of this study, and 53% of the remaining patients had detectable viral RNA in their stools. A positive fecal viral test was associated with male gender, diarrhea as a symptom, and hospitalization during infection. A positive correlation was also observed between a delayed onset of symptoms and a positive stool test. Viral load in stools positively correlated with being overweight, exercising, taking antibiotics in the last 3 months, and blood type O. The viral load in nasopharyngeal swabs, on the other hand, was higher for blood type A and rhesus positive (Rh factor). Regression analysis showed no correlation between the viral loads measured in stool and nasopharyngeal samples in any given patient. The results of this work highlight the factors associated with a higher viral count in each sample. They also show the importance of stool sample analysis for the follow-up and diagnosis of recovering COVID-19 patients.