
    Automated design of genetic programming of classification algorithms.

    Doctoral Degree. University of KwaZulu-Natal, Pietermaritzburg. Over the past decades, there has been an increase in the use of evolutionary algorithms (EAs) for data mining and knowledge discovery in a wide range of application domains. Data classification, a real-world application problem, is one of the areas in which EAs have been widely applied. Data classification has been extensively researched, resulting in the development of a number of EA-based classification algorithms. Genetic programming (GP) in particular has been shown to be one of the most effective EAs at inducing classifiers. It is widely accepted that the effectiveness of a parameterised algorithm like GP depends on its configuration. Currently, the design of GP classification algorithms is predominantly performed manually. Manual design follows an iterative trial-and-error approach, which has been shown to be a menial, non-trivial, time-consuming task with a number of vulnerabilities. The research presented in this thesis is part of a large-scale initiative by the machine learning community to automate the design of machine learning techniques. The study investigates the hypothesis that automating the design of GP classification algorithms can still lead to the induction of effective classifiers. This research proposes using two evolutionary algorithms, namely a genetic algorithm (GA) and grammatical evolution (GE), to automate the design of GP classification algorithms. The proof-by-demonstration research methodology is used in the study to achieve the set objectives. To that end, two systems, namely a genetic algorithm system and a grammatical evolution system, were implemented for automating the design of GP classification algorithms. The classification performance of the automatically designed GP classifiers, i.e., GA-designed and GE-designed GP classifiers, was compared to that of manually designed GP classifiers on real-world binary-class and multiclass classification problems.
The evaluation was performed on multiple problem domains obtained from the UCI machine learning repository and on two specific domains, cybersecurity and financial forecasting. The automatically designed classifiers were found to outperform the manually designed GP classifiers on all the problems considered in this study. GP classifiers evolved by GE were found to be suitable for binary classification problems, while those evolved by a GA were found to be suitable for multiclass classification problems. Furthermore, the automated design time was found to be less than the manual design time. Fitness landscape analysis of the design spaces searched by the GA and GE was carried out for all the classes of problems considered in this study. GE found the search space to be smoother on binary classification problems, while the GA found multiclass problems to be less rugged than binary-class problems.
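The automated-design idea can be sketched in miniature as a GA whose individuals are GP configurations rather than programs. This is a hedged illustration only: the parameter names, ranges and surrogate fitness function below are invented for the example, not taken from the thesis's actual system.

```python
# Toy GA searching a GP-classifier configuration space (all names hypothetical).
import random

random.seed(0)

# Each individual encodes one GP configuration.
PARAM_RANGES = {
    "population_size": (50, 500),
    "max_tree_depth": (3, 17),
    "crossover_rate": (0.5, 0.95),
}

def random_config():
    return {k: random.uniform(lo, hi) for k, (lo, hi) in PARAM_RANGES.items()}

def fitness(cfg):
    # Stand-in for "train a GP classifier with cfg and return test accuracy".
    # A real system would run the full GP induction here.
    return cfg["crossover_rate"] - abs(cfg["max_tree_depth"] - 8) / 8

def mutate(cfg):
    child = dict(cfg)
    k = random.choice(list(PARAM_RANGES))
    child[k] = random.uniform(*PARAM_RANGES[k])
    return child

def design_gp(generations=30, pop_size=10):
    pop = [random_config() for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        parents = pop[: pop_size // 2]               # truncation selection
        pop = parents + [mutate(random.choice(parents)) for _ in parents]
    return max(pop, key=fitness)

best = design_gp()
```

A GE variant would instead derive a configuration from a grammar via a variable-length integer genome, but the outer loop is the same: evaluate each candidate design by running the GP classifier it describes.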

    A GPT-Based Approach for Scientometric Analysis: Exploring the Landscape of Artificial Intelligence Research

    This study presents a comprehensive approach that addresses the challenges of scientometric analysis in the rapidly evolving field of Artificial Intelligence (AI). By combining search terms related to AI with the advanced language-processing capabilities of generative pre-trained transformers (GPT), we developed a highly accurate method for identifying and analyzing AI-related articles in the Web of Science (WoS) database. Our multi-step approach included filtering articles based on WoS citation topics, category and keyword screening, and GPT classification. We evaluated the effectiveness of our method through precision and recall calculations, finding that our combined approach captured around 94% of AI-related articles in the entire WoS corpus with a precision of 90%. Following this, we analyzed publication volume trends, revealing a continuous growth pattern from 2013 to 2022 and an increasing degree of interdisciplinarity. We conducted citation analysis of the top countries and institutions and identified common research themes using keyword analysis and GPT. This study demonstrates the potential of our approach to facilitate accurate scientometric analysis by providing insights into the growth, interdisciplinary nature, and key players in the field. Comment: 29 pages, 10 figures, 5 tables.
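The headline numbers follow from ordinary precision/recall bookkeeping; a minimal sketch, with invented counts rather than the study's actual confusion matrix:

```python
# Precision/recall for a corpus filter (counts below are illustrative only).
def precision_recall(tp, fp, fn):
    """tp: relevant articles kept, fp: irrelevant kept, fn: relevant missed."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return precision, recall

# e.g. 900 AI articles kept, 100 non-AI articles kept, 60 AI articles missed:
p, r = precision_recall(tp=900, fp=100, fn=60)
# p = 0.9, r = 0.9375
```

The trade-off the study navigates is visible here: loosening the filter raises `tp` (recall) at the cost of more `fp` (precision), which is why the multi-step combination of keyword screening and GPT classification matters.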

    The doctoral research abstracts. Vol:8 2015 / Institute of Graduate Studies, UiTM

    Foreword: Thirty-first October 2015 marks the celebration of 47 PhD doctorates receiving their scrolls during UiTM's 83rd Convocation Ceremony. This date is significant to UiTM, since it is an official indication of 47 more scholarly contributions to the world of knowledge and innovation through the novelty of their research. To date, UiTM has contributed 471 producers of knowledge through their doctoral research, ranging across the fields of Science and Technology, Business and Administration, and Social Science and Humanities. This collection of doctoral abstracts epitomizes knowledge par excellence and is a form of tribute to the 47 doctorates whose achievement we proudly celebrate. To the graduands, your success in achieving the highest academic qualification has demonstrated that you have indeed engineered your destiny well. The act of registering for a PhD programme was not by chance but by choice: a choice made to realise your self-actualisation, the highest level in Maslow's hierarchy of needs, while at the same time unleashing your potential in scholarly research. Do not forget that life is a treasure and that its contents continue to be a mystery; thus, your journey of discovery through research has not come to an end but rather is just the beginning. Enjoy life through your continuous discovery of knowledge, and spearhead innovation while you are at it. Make your alma mater proud through this continuous discovery as alumni of UiTM. As you soar upwards in your career, my advice is to remain humble and keep your feet firmly planted on the ground. Congratulations once again, and may you carry UiTM as 'Sentiasa di Hatiku'. Tan Sri Dato' Sri Prof Ir Dr Sahol Hamid Abu Bakar, FASc, PEng, Vice Chancellor, Universiti Teknologi MARA

    Characterization and uncertainty analysis of siliciclastic aquifer-fault system

    The complex siliciclastic aquifer system underneath the Baton Rouge area, Louisiana, USA, is fluvial in origin. The east-west-trending Baton Rouge fault and Denham Springs-Scotlandville fault cut across East Baton Rouge Parish and play an important role in groundwater flow and aquifer salinization. To better understand the salinization underneath Baton Rouge, it is imperative to study the hydrofacies architecture and the groundwater flow field of the Baton Rouge aquifer-fault system. This is done by developing multiple detailed hydrofacies architecture models and multiple groundwater flow models of the aquifer-fault system, representing various uncertain model propositions. The hydrofacies architecture models focus on the Miocene-Pliocene depth interval that consists of the “1,200-foot” sand, “1,500-foot” sand, “1,700-foot” sand and “2,000-foot” sand, these aquifer units being classified and named by their approximate depth below ground level. The groundwater flow models focus only on the “2,000-foot” sand. The study reveals the complexity of the Baton Rouge aquifer-fault system: sand deposition is non-uniform, different sand units are interconnected, sand-unit displacement on the faults is significant, and the spatial distribution of flow pathways through the faults is sporadic. The identified locations of flow pathways through the Baton Rouge fault provide useful information on possible windows for saltwater intrusion from the south. From the results we learn that the “1,200-foot”, “1,500-foot” and “1,700-foot” sands should not be modeled separately, since they are very well connected near the Baton Rouge fault, while the “2,000-foot” sand between the two faults is a separate unit. Results suggest that at the “2,000-foot” sand the Denham Springs-Scotlandville fault has much lower permeability than the Baton Rouge fault, and that the Baton Rouge fault plays an important role in the aquifer salinization.

    A Survey on Compiler Autotuning using Machine Learning

    Since the mid-1990s, researchers have been trying to use machine-learning-based approaches to solve a number of different compiler optimization problems. These techniques primarily enhance the quality of the obtained results and, more importantly, make it feasible to tackle two main compiler optimization problems: optimization selection (choosing which optimizations to apply) and phase-ordering (choosing the order in which to apply them). The compiler optimization space continues to grow due to the advancement of applications, the increasing number of compiler optimizations, and new target architectures. Generic optimization passes in compilers cannot fully leverage newly introduced optimizations and, therefore, cannot keep up with the pace of increasing options. This survey summarizes and classifies the recent advances in using machine learning for compiler optimization, particularly on the two major problems of (1) selecting the best optimizations and (2) the phase-ordering of optimizations. The survey highlights the approaches taken so far, the results obtained, the fine-grained classification among different approaches and, finally, the influential papers of the field. Comment: version 5.0 (updated September 2018); preprint version of the journal article accepted at ACM CSUR 2018 (42 pages). This survey will be updated quarterly here (send me your newly published papers to be added in the subsequent version). History: Received November 2016; Revised August 2017; Revised February 2018; Accepted March 2018.
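The phase-ordering problem the survey describes can be shown in a few lines: search over pass permutations against a cost model. The pass names and cost function below are toy stand-ins; a real autotuner would compile the program and measure runtime or code size.

```python
# Toy phase-ordering search: find the pass order minimising a cost model.
import itertools

PASSES = ["inline", "licm", "gvn", "unroll"]

def cost(order):
    # Invented interaction model: inlining helps most when run first,
    # loop unrolling when run last. Real cost = a measured metric.
    c = 10.0
    if order[0] == "inline":
        c -= 3.0
    if order[-1] == "unroll":
        c -= 2.0
    return c

best_order = min(itertools.permutations(PASSES), key=cost)
```

Exhaustive enumeration only works for tiny pass sets (n! orderings); the machine-learning approaches the survey covers exist precisely because realistic pass sets make this enumeration infeasible.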

    Advances in Computer Science and Engineering

    The book Advances in Computer Science and Engineering constitutes a revised selection of 23 chapters written by scientists and researchers from all over the world. The chapters cover topics in the scientific fields of Applied Computing Techniques, Innovations in Mechanical Engineering, Electrical Engineering and Applications, and Advances in Applied Modeling.

    Exploring the adaptive structure of the mental lexicon

    The mental lexicon is a complex structure organised in terms of phonology, semantics and syntax, among other levels. In this thesis I propose that this structure can be explained in terms of the pressures acting on it: every aspect of the organisation of the lexicon is an adaptation ultimately related to the function of language as a tool for human communication, or to the fact that language has to be learned by subsequent generations of people. A collection of methods, most of which are applied to a Spanish speech corpus, reveal structure at different levels of the lexicon.
    • The patterns of intra-word distribution of phonological information may be a consequence of pressures for optimal representation of the lexicon in the brain, and of the pressure to facilitate speech segmentation.
    • An analysis of perceived phonological similarity between words shows that the sharing of different aspects of phonological similarity is related to different functions. Phonological similarity perception sometimes relates to morphology (the stressed final vowel determines verb tense and person) and at other times shows processing biases (similarity in the word-initial and word-final segments is more readily perceived than in word-internal segments).
    • Another similarity analysis focuses on co-occurrence in speech to create a representation of the lexicon where the position of a word is determined by the words that tend to occur in its close vicinity. Variations of this context-based lexical space naturally categorise words syntactically and semantically.
    • A higher level of lexicon structure is revealed by examining the relationships between the phonological and the co-occurrence similarity spaces. A study in Spanish supports the universality of the small but significant correlation between these two spaces found in English by Shillcock, Kirby, McDonald and Brew (2001).
This systematicity across levels of representation adds an extra layer of structure that may help lexical acquisition and recognition. I apply it in a new paradigm to determine the function of parameters of phonological similarity based on their relationships with the syntactic-semantic level. I find that while some aspects of a language's phonology maintain systematicity, others work against it, perhaps responding to the opposing pressure for word identification. This thesis is an exploratory approach to the study of mental lexicon structure that uses existing and new methodology to deepen our understanding of the relationships between language use and language structure.
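The cross-space correlation at the heart of the last two bullets can be sketched directly: compute pairwise distances between the same words in each space, then correlate the two distance lists. The word vectors below are invented toy data, not the thesis's corpus; systematicity shows up as a positive correlation.

```python
# Correlate pairwise distances in a phonological space with distances in a
# co-occurrence space (all word vectors below are invented toy data).
from itertools import combinations

phon = {"gato": (1, 0), "pato": (1, 1), "perro": (5, 4)}
cooc = {"gato": (0, 2), "pato": (0, 3), "perro": (4, 0)}

def dist(a, b):
    # Euclidean distance between two word vectors.
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

def pearson(xs, ys):
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)

pairs = list(combinations(phon, 2))
d_phon = [dist(phon[a], phon[b]) for a, b in pairs]
d_cooc = [dist(cooc[a], cooc[b]) for a, b in pairs]

r = pearson(d_phon, d_cooc)   # > 0 when the two spaces align
```

On real data the significance of such a correlation is usually assessed with a Mantel-style permutation test, since pairwise distances over the same word set are not independent observations.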

    Utilising restricted for-loops in genetic programming

    Genetic programming is an approach that utilises the power of evolution to allow computers to evolve programs. While loops are natural components of most programming languages and appear in every reasonably sized application, they are rarely used in genetic programming. This work investigates a number of restricted looping constructs to determine whether any significant benefits can be obtained in genetic programming. Possible benefits include solving problems which cannot be solved without loops, evolving smaller solutions which can be more easily understood by human programmers, and solving existing problems more quickly by using fewer evaluations. In this thesis, a number of explicit restricted loop formats were formulated and tested on the Santa Fe ant problem, a modified ant problem, a sorting problem, a visit-every-square problem and a difficult object classification problem. The experimental results showed that these explicit loops can be successfully used in genetic programming. The evolutionary process can decide when, where and how to use them. Runs with these loops tended to generate smaller solutions in fewer evaluations. Solutions with loops were found for some problems that could not be solved without loops. The results and analysis of this thesis establish that there are significant benefits in using loops in genetic programming. Restricted loops can avoid the difficulties of evolving consistent programs and the problem of infinite iteration. Researchers and other users of genetic programming should not be afraid of loops.
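The "restricted" in restricted loops chiefly means a hard bound on the iteration count, which removes the non-termination risk that makes unrestricted loops awkward in GP. A minimal sketch of such a primitive (the cap value and the usage are invented for illustration, not the thesis's actual loop formats):

```python
# A bounded loop primitive for a GP function set (MAX_ITERS is hypothetical).
MAX_ITERS = 10

def restricted_for(n, body, state):
    """Apply `body` min(n, MAX_ITERS) times, threading `state` through.

    Because the iteration count is clamped, any evolved program that calls
    this primitive is guaranteed to terminate.
    """
    for _ in range(min(int(n), MAX_ITERS)):
        state = body(state)
    return state

# An evolved tree might use it to accumulate a value in bounded steps:
result = restricted_for(4, lambda s: s + 2, 0)   # 0 -> 2 -> 4 -> 6 -> 8
```

Placing the bound inside the primitive keeps the rest of the GP machinery unchanged: crossover and mutation can freely rearrange trees containing the loop without ever producing a program that fails to halt.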

    Automated retrieval and extraction of training course information from unstructured web pages

    Web Information Extraction (WIE) is the discipline dealing with the discovery, processing and extraction of specific pieces of information from semi-structured or unstructured web pages. The World Wide Web comprises billions of web pages, and there is much need for systems that will locate, extract and integrate the acquired knowledge into organisations' practices. Some commercial, automated web-extraction software packages exist; however, their success comes from heavily involving their users in the process of finding the relevant web pages, preparing the system to recognise items of interest on these pages, and manually dealing with the evaluation and storage of the extracted results. This research has explored WIE, specifically with regard to the automation of the extraction and validation of online training information. The work also includes research and development in the area of automated Web Information Retrieval (WIR), more specifically in Web Searching (or Crawling) and Web Classification. Different technologies were considered; after much consideration, Naïve Bayes networks were chosen as the most suitable for the development of the classification system. The extraction part of the system used Genetic Programming (GP) to generate web-extraction solutions. Specifically, GP was used to evolve regular expressions, which were then used to extract specific training course information from the web, such as course names, prices, dates and locations. The experimental results indicate that all three aspects of this research perform very well, with the Web Crawler outperforming existing crawling systems, the Web Classifier achieving an accuracy of over 95% and a precision of over 98%, and the Web Extractor achieving an accuracy of over 94% for the extraction of course titles and just under 67% for other course attributes such as dates, prices and locations.
Furthermore, the overall work is of great significance to the sponsoring company, as it simplifies and improves the existing time-consuming, labour-intensive and error-prone manual techniques, as discussed in this thesis. The prototype developed in this research works in the background and requires very little, often no, human assistance.
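The regex-evolution component can be illustrated by its fitness function: score a candidate expression by how many labelled snippets it extracts correctly. Everything below (data, pattern, scoring scheme) is a toy sketch, not the system's actual grammar or corpus:

```python
# Fitness of a candidate regex against labelled extraction examples.
import re

examples = [
    ("Price: £250", "250"),
    ("Course fee £1200", "1200"),
    ("Contact us for details", None),   # nothing to extract here
]

def fitness(pattern):
    correct = 0
    for text, expected in examples:
        m = re.search(pattern, text)
        found = m.group(1) if m else None
        correct += found == expected      # bool counts as 0 or 1
    return correct / len(examples)

candidate = r"£(\d+)"          # one individual in the GP population
score = fitness(candidate)     # 1.0: all three examples handled correctly
```

GP would mutate and recombine such patterns, keeping the high-fitness ones; the evolved expressions are then run over the pages the crawler and classifier have already selected.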