225 research outputs found

    Intelligent instance selection techniques for support vector machine speed optimization with application to e-fraud detection.

    Doctor of Philosophy in Computer Science. University of KwaZulu-Natal, Durban 2017. Decision-making is a very important aspect of many businesses. Wrong decisions carry grievous penalties, including financial loss, damage to company reputation and reduced productivity. Hence, it is vital that managers make the right decisions. Machine Learning (ML) simplifies the process of decision making: it helps to discover useful patterns in historical data, which can be used for meaningful decision-making. The ability to make strategic and meaningful decisions depends on the reliability of data. Currently, many organizations are overwhelmed with vast amounts of data, and unfortunately, ML algorithms cannot effectively handle large datasets. This thesis therefore proposes seven filter-based and five wrapper-based intelligent instance selection techniques for optimizing the speed and predictive accuracy of ML algorithms, with a particular focus on Support Vector Machine (SVM). This thesis also proposes a novel fitness function for instance selection. The primary difference between the filter-based and wrapper-based techniques lies in their method of selection. The filter-based techniques utilize the proposed fitness function for selection, while the wrapper-based techniques utilize the SVM algorithm for selection. The proposed techniques are obtained by fusing the SVM algorithm with the following nature-inspired algorithms: flower pollination algorithm, social spider algorithm, firefly algorithm, cuckoo search algorithm and bat algorithm. In addition, two of the filter-based techniques are boundary detection algorithms, inspired by edge detection in image processing and edge selection in ant colony optimization. Two different sets of experiments were performed to evaluate the performance of the proposed techniques (wrapper-based and filter-based).
All experiments were performed on four datasets containing three popular e-fraud types: credit card fraud, email spam and phishing email. In addition, experiments were performed on 20 datasets provided by the well-known UCI data repository. The results show that the proposed filter-based techniques improved SVM training speed in 100% (24 out of 24) of the datasets used for evaluation, without significantly affecting SVM classification quality. The experimental results also show that the wrapper-based techniques improved SVM predictive accuracy in 78% (18 out of 23) of the datasets used for evaluation and simultaneously improved SVM training speed in all cases. Furthermore, two statistical tests were conducted to validate the results: Friedman’s test and Holm’s post-hoc test. The statistical test results reveal that the proposed filter-based and wrapper-based techniques are significantly faster than standard SVM and some existing instance selection techniques in all cases. They also reveal that the Cuckoo Search Instance Selection Algorithm outperforms all the other proposed techniques in terms of speed. Overall, the proposed techniques have proven to be fast and accurate ML-based e-fraud detection techniques, with improved training speed, predictive accuracy and storage reduction. In real-life applications that require a classifier to be trained very quickly for speedy classification of new target concepts, such as video surveillance and intrusion detection systems, the filter-based techniques provide the best solutions, while the wrapper-based techniques are better suited to applications that are very sensitive to slight changes in predictive accuracy, such as email filters.
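The abstract does not spell out the proposed fitness function or the boundary detection algorithms, so the following is only a minimal sketch of the general idea behind boundary-oriented instance selection, under stated assumptions. It uses a plain "nearest enemy" heuristic (keep the instances closest to an instance of another class), which is a stand-in chosen for illustration and not the thesis's method. The function names, the `keep_fraction` parameter and the toy data are all hypothetical.

```python
import math


def nearest_enemy_distance(i, X, y):
    """Distance from instance i to the closest instance with a different label.

    Assumes at least two classes are present; otherwise min() raises ValueError.
    """
    return min(math.dist(X[i], X[j]) for j in range(len(X)) if y[j] != y[i])


def select_boundary_instances(X, y, keep_fraction=0.5):
    """Return sorted indices of the instances nearest to the opposite class.

    Illustrative heuristic only: instances far from any other class are
    assumed to matter less for the decision boundary and are dropped.
    """
    n_keep = max(1, round(keep_fraction * len(X)))
    ranked = sorted(range(len(X)), key=lambda i: nearest_enemy_distance(i, X, y))
    return sorted(ranked[:n_keep])


# Hypothetical toy data: class 0 on the left, class 1 on the right.
X = [(0.0,), (1.0,), (2.0,), (3.0,), (6.0,), (7.0,), (8.0,), (9.0,)]
y = [0, 0, 0, 0, 1, 1, 1, 1]

selected = select_boundary_instances(X, y, keep_fraction=0.5)
print(selected)  # indices of the four instances closest to the class boundary
```

A classifier such as an SVM would then be trained on the selected subset only, which is where the training-speed gain described above would come from.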

    ACO-based feature selection algorithm for classification

    A dataset with a small number of records but a large number of attributes exhibits a phenomenon called the “curse of dimensionality”. Classifying this type of dataset requires Feature Selection (FS) methods to extract useful information. The modified graph clustering ant colony optimisation (MGCACO) algorithm is an effective FS method that was developed based on grouping highly correlated features. However, the MGCACO algorithm has three main drawbacks in producing a feature subset, arising from its clustering method, its parameter sensitivity, and its final subset determination. An enhanced graph clustering ant colony optimisation (EGCACO) algorithm is proposed to address these three problems. The proposed improvements are: (i) an ACO feature clustering method to obtain clusters of highly correlated features; (ii) an adaptive selection technique for subset construction from the clusters of features; and (iii) a genetic-based method for producing the final subset of features. The ACO feature clustering method uses intensification and diversification mechanisms for local and global optimisation to group highly correlated features. The adaptive technique for ant selection enables the parameter to change adaptively based on feedback from the search space. The genetic method determines the final subset automatically, based on crossover and a subset quality calculation. The performance of the proposed algorithm was evaluated on 18 benchmark datasets from the University of California, Irvine (UCI) repository and nine deoxyribonucleic acid (DNA) microarray datasets against 15 benchmark metaheuristic algorithms.
On the UCI datasets, the EGCACO algorithm outperformed the other benchmark optimisation algorithms in terms of the number of selected features on 16 of the 18 datasets (88.89%) and achieved the best classification accuracy on eight (44.44%). Further, experiments on the nine DNA microarray datasets showed that the EGCACO algorithm is superior to the benchmark algorithms in terms of classification accuracy (first rank) on seven datasets (77.78%) and yields the lowest number of selected features on six datasets (66.67%). The proposed EGCACO algorithm can be used for FS in DNA microarray classification tasks that involve large datasets in various application domains.
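The percentages quoted in this abstract are ratios of dataset counts (out of 18 UCI datasets and 9 DNA microarray datasets). As a quick sanity check, the short sketch below recomputes them from the counts given in the text. The helper name `share` and the dictionary labels are illustrative choices, not anything taken from the thesis.

```python
def share(count, total):
    """Percentage of `count` out of `total`, rounded to two decimals."""
    return round(100 * count / total, 2)


# (count, total) pairs as stated in the abstract.
reported = {
    "number of selected features, UCI": (16, 18),
    "classification accuracy, UCI": (8, 18),
    "classification accuracy, DNA microarray": (7, 9),
    "number of selected features, DNA microarray": (6, 9),
}

for label, (count, total) in reported.items():
    print(f"{label}: {count}/{total} = {share(count, total)}%")
```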

    Biologically-inspired double skin facades for hot climates: a parametric approach for performative design

    Biomimicry is an applied science that derives inspiration for solutions to human problems through the study of natural designs, materials, structures and processes. Many fields of study benefit from biomimetic inspirations, such as agriculture, medicine, engineering, and architecture. Technological advances in parametric and computational design software, together with environmental simulation tools, make it possible to explore the potential of nature’s inspirations in architectural designs that do not just mimic shapes and forms. Energy efficiency is one of the major and growing concerns facing architects. Cooling and ventilation needs are critical factors that affect energy efficiency, especially in hot climates. This thesis addresses the problem of designing building skins that are energy efficient in the context of hot climates such as that of Egypt. The research attempts to define and apply a biomimetic-computational design approach to study and analyse the thermoregulation behaviour of natural organisms. Aiming to decrease cooling loads, the research explores possible architectural solutions for a biologically inspired skin system for office buildings. The building’s skin is parametrically designed using Grasshopper Visual Programming Language for Rhino 3D Modeller, and it is optimised using multi-objective evolutionary algorithms, which are particularly important for finding a range of solutions that reduce cooling loads while maintaining daylight needs.
Consequently, the reduction in cooling loads should not come at the expense of increased energy consumption for artificial lighting. Simulations of thermal performance were carried out using EnergyPlus. A Double-Skin Façade (DSF) is proposed based on inspirations from nature. To evaluate the performance of the proposal, it is compared to the performance of the skin of an existing office building in Cairo acting as a reference case. Data on the reference case, such as the building drawings, material specifications and annual cooling consumption, were obtained in order to build its digital model and assess its accuracy. The proposed design is also evaluated by comparing it to a typical flat DSF. The results obtained for the thermal performance of the proposed building skin are verified by comparing them to results of more accurate simulations performed using Computational Fluid Dynamics (CFD). The aim is to determine the degree of error as well as the appropriateness of using EnergyPlus for geometrically complex DSFs in early design phases, when CFD is not practical. The proposed DSF decreased cooling loads by up to 13.4% while also improving daylight performance, which is often one of the main challenges of using DSFs. The research critically evaluates the presented design approach as a whole, the design/simulation tools used and the performance of the proposed skin, discussing their benefits and limitations. Based on the design experimentation and results, general guidelines and recommendations for DSF design in hot climates are presented. Additionally, the research presents a compiled matrix of the biomimetic ideas explored and analysed, to serve as a mini data bank for architects or designers interested in this design approach to thermoregulation problems. Finally, the comparison between EnergyPlus and CFD software results showed minor differences.

    Systematic analysis of reproductive development in normal and mantled oil palm (Elaeis guineensis Jacq) flowers and fruit

    Oil palm (Elaeis guineensis Jacq.) is the most efficient oil crop in the world; it uses substantially less land and resources and produces more oil than any other oil crop. Even so, to meet the growing palm oil demands due to the increasing global population, per capita consumption rates and biofuel demands, ground-breaking strategies for agronomic and genetic improvement of the commercial planting material are necessary. Clonal propagation through tissue culture has proven to be useful in producing uniform planting materials. However, there are incidences of the deleterious floral homeotic mutant, mantled, in oil palm ramets. In this study, standardised protocols and analytical parameters for the extraction and characterisation of oil palm inflorescences, bunches and pollen in the context of the mantled abnormality are proposed. Genotyping using twenty SSR markers showed good discriminatory powers and revealed ten ‘off types’. Methylation detection at the EgDEF1 KARMA locus using RsaI showed an 18.75% error in distinguishing mantled from normal. Thus, accurate phenotyping and appraisal of mantled phenotype were achieved through visual scoring of unripe bunches. This novel phenotyping regime allowed quantification of the severity as well as variability associated with the aberrant phenotype. For selection and extraction of comparable inflorescence samples from normal and mantled ramets, a new developmental classification was formulated, and the field sampling and histology protocols were optimised through trial. The different developmental categories were validated using ANOVA (F probability<0.001) and Fisher’s protected least significant difference test. This developmental classification supplements the previous model for developmental stage prediction and enables precise field identification of key developmental events. Subsequently, a reproductive developmental series for oil palm from early inflorescence development to floral maturity was prepared. 
This developmental series permitted comparisons between age categories (three-year-old young clone and ten-year-old mature clones), sexes, and phenotypes (normal and mantled). Hence, for the first time, mantled reproductive development is compared alongside equivalent normal samples from the same clone, throughout the reproductive developmental process. The mantled phenotype was indistinguishable by histology until pseudocarpels were observable at the developmental category ‘floral triad 3 (FT3)’. Results revealed three novel features of the mantled phenotype. Firstly, in the present set of samples, phenotypic expression of mantled was limited to pistillate flowers. Contrary to previous reports, even the abortive staminate flowers in mantled female inflorescences showed normal development while the pistillate flower of the same triad was mantled. Secondly, analysis of field sampling data revealed a lower incidence of male phase (p < 0.001) associated with the mantled phenotype. This possible effect of mantled on sex determination indicates an earlier manifestation of the mantled phenotype than previously reported. Lastly, pollen samples from mantled ramets showed significantly higher pollen abortion and degeneration and lower pollen health (chi-squared probability < 0.001). Functional quality assessment of oil palm pollen grains was done through histochemical approaches and germination tests, and pollen from mantled sources was analysed for the first time. Healthy reproductive development and adequate pollination are vital for the optimal yield of oil palm. The systematic investigations undertaken here are a step towards a more comprehensive understanding of these events in normal and mantled ramets. The previously uncharacterised effects of the mantled phenotype reported here call for further investigation into its phenotypic expression. The methodologies and parameters proposed here should be useful for a wide range of research into floral abnormalities of oil palm.

    Smart Manufacturing

    This book is a collection of 11 articles published in the corresponding Machines Special Issue “Smart Manufacturing”. It represents the quality, breadth and depth of the most up-to-date research in smart manufacturing (SM); in particular, digital technologies are deployed to enhance system smartness by (1) empowering physical resources in production, (2) utilizing virtual and dynamic assets over the Internet to expand system capabilities, (3) supporting data-driven decision-making activities at various domains and levels of businesses, or (4) reconfiguring systems to adapt to changes and uncertainties. System smartness can be evaluated by one or a combination of performance metrics such as degree of automation, cost-effectiveness, leanness, robustness, flexibility, adaptability, sustainability, and resilience. This book features, firstly, the concepts of digital triad (DT-II) and Internet of digital triad things (IoDTT), proposed to deal with the complexity, dynamics, and scalability of complex systems simultaneously. This book also features a comprehensive survey of the applications of digital technologies in space instruments; a systematic literature search method is used to investigate the impact of product design and innovation on the development of space instruments. In addition, the survey provides important information and critical considerations for using cutting-edge digital technologies in designing and manufacturing space instruments.

    Determination of Time Dependent Stress Distribution on Potato Tubers at Mechanical Collision

    This study focuses on determining internal stress progression and realistically representing the time-dependent deformation behaviour of potato tubers under a sample mechanical collision case. A reverse engineering approach, physical material tests and finite element method (FEM)-based explicit dynamics simulations were used to investigate the collision-based deformation characteristics of the potato tubers. Useful numerical data and deformation visuals were obtained from the simulation results. The numerical results are presented in a format that can be used to determine the magnitude of bruise susceptibility in solid-like agricultural products. The modulus of elasticity was calculated from experimental data as 3.12 MPa, and simulation results showed that the maximum equivalent stress was 1.40 MPa and 3.13 MPa on the impacting and impacted tubers respectively. These stress values indicate that bruising is likely on the tubers. This study contributes to further research on the use of numerical-methods-based nonlinear explicit dynamics simulation techniques in complicated deformation and bruising investigations and in industrial applications related to solid-like agricultural products.

    An enhanced binary bat and Markov clustering algorithms to improve event detection for heterogeneous news text documents

    Event Detection (ED) identifies events from various types of data. Building an ED model for news text documents greatly helps decision-makers in various disciplines improve their strategies. However, identifying and summarizing events from such data is a non-trivial task due to the large volume of published heterogeneous news text documents. Such documents create a high-dimensional feature space that degrades the overall performance of the baseline methods in the ED model. To address this problem, this research presents an enhanced ED model that includes improved methods for the crucial phases of the ED model: Feature Selection (FS), ED, and summarization. This work focuses on the FS problem by automatically detecting events through a novel wrapper FS method based on the Adapted Binary Bat Algorithm (ABBA) and the Adapted Markov Clustering Algorithm (AMCL), termed ABBA-AMCL. These adaptive techniques were developed to overcome the premature convergence in BBA and the fast convergence rate in MCL. Furthermore, this study proposes four summarizing methods to generate informative summaries. The enhanced ED model was tested on 10 benchmark datasets and 2 Facebook news datasets. The effectiveness of ABBA-AMCL was compared to 8 FS methods based on meta-heuristic algorithms and 6 graph-based ED methods. The empirical and statistical results showed that ABBA-AMCL surpassed the other methods on most datasets. The key representative features demonstrated that the ABBA-AMCL method successfully detects real-world events from the Facebook news datasets, with a Precision of 0.96 and a Recall of 1 for dataset 11, and a Precision of 1 and a Recall of 0.76 for dataset 12. To conclude, the novel ABBA-AMCL presented in this research has successfully bridged the research gap and resolved the curse of high-dimensional feature space for heterogeneous news text documents.
Hence, the enhanced ED model can organize news documents into distinct events and provide policymakers with valuable information for decision making.
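The abstract reports Precision and Recall separately for the two Facebook news datasets. A common way to combine the two into a single figure is the F1 score, their harmonic mean. The sketch below derives it from the reported values; note that F1 is not reported in the abstract itself, so the resulting numbers are a derived illustration, and the function name `f1_score` is a local helper rather than a library call.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall) pairs as stated in the abstract.
reported = {
    "dataset 11": (0.96, 1.0),
    "dataset 12": (1.0, 0.76),
}

for name, (precision, recall) in reported.items():
    print(f"{name}: F1 = {f1_score(precision, recall):.3f}")
```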

    Computational Optimizations for Machine Learning

    The present book contains the 10 articles accepted for publication in the Special Issue “Computational Optimizations for Machine Learning” of the MDPI journal Mathematics, which cover a wide range of topics connected to the theory and applications of machine learning, neural networks and artificial intelligence. These topics include, among others, various types of machine learning, such as supervised, unsupervised and reinforcement learning, as well as deep neural networks, convolutional neural networks, GANs, decision trees, linear regression, SVM, K-means clustering, Q-learning, temporal difference, deep adversarial networks and more. It is hoped that the book will be interesting and useful to those developing mathematical algorithms and applications in the domain of artificial intelligence and machine learning, as well as to those with the appropriate mathematical background who wish to become familiar with recent advances in the mathematics of computational optimization for machine learning, which has permeated almost all sectors of human life and activity.