
    Predicting grade progression within the Limpopo Education System

    One way to improve education in South Africa is to ensure that additional support and resourcing are provided to the schools and learners most in need of help. To this end, education officials need to understand the factors affecting learning and the schools most in need of appropriate interventions. Several theories, models and methods have been developed to attempt to address the challenges faced in the education sector. Educational Data Mining (EDM) is one that has gained prominence in addressing these challenges. EDM is a field of data mining using mathematical and machine learning models to improve learners’ performance, education administration, and policy formulation. This study explored the literature and related methodologies used within the EDM context and constructed a solution to improve learner support and planning in the Limpopo primary and secondary school education system. The data utilized included socio-economic environment and demographic information, as well as learners’ performance, sourced from the Education Management Information Systems database of the Limpopo Department of Education (LDoE). Feature selection methods (Information Gain, Correlation, and Asymmetrical Uncertainty) were combined to determine factors that affect learning. Three machine learning classifiers, AdaboostM1 (Decision Stump), HoeffdingTree and NaïveBayes, were used to predict learners’ grade progression. These were compared using several evaluation metrics, and HoeffdingTree outperformed AdaboostM1 (Decision Stump) and NaïveBayes. When the final HoeffdingTree model was applied to the test datasets, it performed exceptionally well. It is hoped that the implementation of this model will assist the LDoE in its role of supporting learning and planning resource allocation.

    Virtual metrology for plasma etch processes.

    Plasma processes can present difficult control challenges due to time-varying dynamics and a lack of relevant and/or regular measurements. Virtual metrology (VM) is the use of mathematical models with accessible measurements from an operating process to estimate variables of interest. This thesis addresses the challenge of virtual metrology for plasma processes, with a particular focus on semiconductor plasma etch. Introductory material covering the essentials of plasma physics, plasma etching, plasma measurement techniques, and black-box modelling techniques is first presented for readers not familiar with these subjects. A comprehensive literature review is then completed to detail the state of the art in modelling and VM research for plasma etch processes. To demonstrate the versatility of VM, a temperature monitoring system utilising a state-space model and Luenberger observer is designed for the variable specific impulse magnetoplasma rocket (VASIMR) engine, a plasma-based space propulsion system. The temperature monitoring system uses optical emission spectroscopy (OES) measurements from the VASIMR engine plasma to correct temperature estimates in the presence of modelling error and inaccurate initial conditions. Temperature estimates within 2% of the real values are achieved using this scheme. An extensive examination of the implementation of a wafer-to-wafer VM scheme to estimate plasma etch rate for an industrial plasma etch process is presented. The VM models estimate etch rate using measurements from the processing tool and a plasma impedance monitor (PIM). A selection of modelling techniques is considered for VM modelling, and Gaussian process regression (GPR) is applied for the first time for VM of plasma etch rate. Models with global and local scope are compared, and modelling schemes that attempt to cater for the etch process dynamics are proposed. 
GPR-based windowed models produce the most accurate estimates, achieving mean absolute percentage errors (MAPEs) of approximately 1.15%. The consistency of the results presented suggests that this level of accuracy represents the best achievable for the plasma etch system at the current frequency of metrology. Finally, a real-time VM and model predictive control (MPC) scheme for control of plasma electron density in an industrial etch chamber is designed and tested. The VM scheme uses PIM measurements to estimate electron density in real time. A predictive functional control (PFC) scheme is implemented to cater for a time delay in the VM system. The controller achieves time constants of less than one second, no overshoot, and excellent disturbance rejection properties. The PFC scheme is further expanded by adapting the internal model in the controller in real time in response to changes in the process operating point.
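The windowed (local-scope) modelling idea can be sketched as follows: refit a GPR model on only the most recent wafers so that estimates track slow process drift, and score the scheme by MAPE. Everything here is synthetic and illustrative (invented features, drift rate, and window size); the thesis used plasma impedance monitor measurements from a production tool.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, WhiteKernel

rng = np.random.default_rng(0)
n = 200
X = rng.normal(size=(n, 3))                 # PIM-like features, one row per wafer
drift = 0.01 * np.arange(n)                 # slow chamber drift over the campaign
y = 100 + 5 * (X @ np.array([1.0, -0.5, 0.3])) + drift \
    + 0.5 * rng.normal(size=n)              # synthetic etch rate (nm/min scale)

window = 40                                 # train on the most recent wafers only
kernel = RBF(length_scale=1.0) + WhiteKernel(noise_level=0.01)
errs = []
for t in range(window, n):
    gpr = GaussianProcessRegressor(kernel=kernel, normalize_y=True)
    gpr.fit(X[t - window:t], y[t - window:t])   # sliding-window refit
    pred = gpr.predict(X[t:t + 1])[0]
    errs.append(abs(pred - y[t]) / abs(y[t]))
mape = 100 * np.mean(errs)
print(f"windowed-GPR MAPE: {mape:.2f}%")
```

The window makes the model local in time: old wafers fall out of the training set, so the drift term never has to be modelled explicitly.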

    Audition, learning and experience: expertise through development

    Our experience with the auditory world can shape and modify perceptual, cognitive and neural processes with respect to audition. Such experience can occur over multiple timescales, and can vary in its specificity and intensity. In order to understand how auditory perceptual, cognitive and neural processes develop, it is important to explore the different means through which experience can influence audition. This thesis aims to address these issues. Using an expertise framework, we explore how the auditory environment and ontogenetic factors can shape and guide perceptual, cognitive and neural processes through long- and short-term profiles of experience. In early chapters, we use expertly-trained musicians as a model for long-term experience accrued under specific auditory constraints. We find that expertise on a particular instrument (violin versus piano) yields training-specific auditory perceptual advantages in a musical context, as well as improvements to ‘low-level’ auditory acuity (versus non-musicians); yet we find limited generalisation of expertise to cognitive tasks that require some of the skills that musicians hone. In a subsequent chapter, we find that expert violinists (versus non-musicians) show subtle increases in quantitative MR proxies for cortical myelin in left auditory core. In later chapters, we explore short-term sound learning. We ask whether listeners can learn combinations of auditory cues within an active visuo-spatial task, and whether development can mediate learning of auditory cue combinations or costs due to cue contingency violations. We show that auditory cue combinations can be learned within minutes. However, we find wide variation in cue learning success across all experiments, with no differences in overall cue combination learning between children and adults. These experiments help to further our understanding of auditory expertise, learning, development and plasticity within an experience-based framework.

    GPU-accelerated 3D visualisation and analysis of migratory behaviour of long lived birds

    As the efficacy of tagging technology improves, the amount of data we collect is increasing, and the methods we previously applied have begun to take longer and longer to process. As we move forward, it is important that the methods we develop also evolve with the data we collect. Maritime visualisation has already begun to leverage the power of parallel processing to accelerate visualisation. However, some of these techniques require the use of distributed computing, which, while useful for datasets that contain billions of points, is harder to implement due to its hardware requirements. Here we show that movement ecology can also benefit significantly from parallel processing, using GPGPU acceleration so that a single workstation suffices. With only minor adjustments, algorithms can be implemented in parallel, enabling computation to be completed in real time. We show this by first implementing a GPGPU-accelerated visualisation of global environmental datasets. Through the use of OpenGL and CUDA, it is possible to visualise a dataset containing over 25 million datapoints per timestamp and swap between timestamps in 5 ms, allowing environmental context to be considered when visualising trajectories in real time. These can then be used alongside different GPU-accelerated visualisation methods, such as aggregate flow diagrams, to explore large datasets in real time. We also apply GPGPU acceleration to the analysis of migratory data through the use of parallel primitives. With these parallel primitives we show that GPGPU acceleration can allow researchers to accelerate their workflow without needing to fully understand the complexities of GPU programming, yielding computation times orders of magnitude faster than sequential CPU methods.
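The parallel-primitive aggregation pattern referred to above can be illustrated in miniature: binning a million trajectory fixes into a grid with a single vectorised histogram (scatter-add) instead of a per-point Python loop. NumPy stands in for the GPU here; the same data-parallel call maps directly onto CUDA or CuPy. Coordinates and grid size are invented for the example.

```python
import numpy as np

rng = np.random.default_rng(0)
lon = rng.uniform(-10.0, 10.0, size=1_000_000)   # synthetic GPS fixes
lat = rng.uniform(40.0, 60.0, size=1_000_000)

nx, ny = 200, 200
# Map each fix to a grid cell index (clipped so edge points stay in the grid).
ix = np.clip(((lon + 10.0) / 20.0 * nx).astype(np.int64), 0, nx - 1)
iy = np.clip(((lat - 40.0) / 20.0 * ny).astype(np.int64), 0, ny - 1)

# Scatter-add / histogram: a classic parallel primitive. One call counts
# all fixes per cell, with no per-point Python loop.
counts = np.bincount(iy * nx + ix, minlength=nx * ny).reshape(ny, nx)
print(int(counts.sum()))   # every fix lands in exactly one cell
```

On a GPU the identical primitive (a histogram or atomic scatter-add kernel) processes all points concurrently, which is what makes real-time aggregate views of large trajectory datasets feasible.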

    Multivariate Analysis of Tumour Gene Expression Profiles Applying Regularisation and Bayesian Variable Selection Techniques

    High-throughput microarray technology is here to stay, for example in oncology for tumour classification and gene expression profiling to predict cancer pathology and clinical outcome. The global objective of this thesis is to investigate multivariate methods that are suitable for this task. After introducing the problem and the biological background, an overview of multivariate regularisation methods is given in Chapter 3 and the binary classification problem is outlined (Chapter 4). The focus of the applications presented in Chapters 5 to 7 is on sparse binary classifiers that are both parsimonious and interpretable. Particular emphasis is on sparse penalised likelihood and Bayesian variable selection models, all in the context of logistic regression. The thesis concludes with a final discussion chapter. The variable selection problem is particularly challenging here, since the number of variables is much larger than the sample size, which results in an ill-conditioned problem with many equally good solutions. Thus, one open problem is the stability of gene expression profiles. In a resampling study, various characteristics including stability are compared between a variety of classifiers applied to five gene expression data sets and validated on two independent data sets. Bayesian variable selection provides an alternative to resampling for estimating the uncertainty in the selection of genes. MCMC methods are used for model space exploration, but, because of the high dimensionality, standard algorithms are computationally expensive and/or result in poor Markov chain mixing. A novel MCMC algorithm is presented that uses the dependence structure between input variables to find blocks of variables to be updated together. This drastically improves mixing while keeping the computational burden acceptable. Several algorithms are compared in a simulation study. In an ovarian cancer application in Chapter 7, the best-performing MCMC algorithms are combined with parallel tempering and compared with an alternative method.
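The block-finding idea, using the dependence structure among inputs to group variables for joint MCMC updates, can be sketched as follows. The clustering recipe here (distance 1 - |correlation|, average linkage, a 0.5 cut) is an illustrative stand-in rather than the thesis's exact algorithm, and the data is synthetic with a known four-block structure.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage
from scipy.spatial.distance import squareform

rng = np.random.default_rng(0)
n, p = 100, 12
base = rng.normal(size=(n, 4))
# Build 12 "genes" in 4 correlated blocks of 3 (the block structure is known
# here so the result can be checked; real expression data would not be).
X = np.repeat(base, 3, axis=1) + 0.3 * rng.normal(size=(n, p))

corr = np.corrcoef(X, rowvar=False)
dist = 1.0 - np.abs(corr)          # small distance = strongly dependent pair
np.fill_diagonal(dist, 0.0)
Z = linkage(squareform(dist, checks=False), method="average")
blocks = fcluster(Z, t=0.5, criterion="distance")
print(blocks)                      # block label per variable
```

Each resulting block collects variables that carry nearly the same information, so proposing their inclusion indicators together lets the sampler move between the many equally good sparse solutions instead of getting stuck in one.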

    Multi-scale data storage schemes for spatial information systems

    This thesis documents a research project that has led to the design and prototype implementation of several data storage schemes suited to the efficient multi-scale representation of integrated spatial data. Spatial information systems will benefit from having data models which allow data to be viewed and analysed at various levels of detail, while the integration of data from different sources will lead to a more accurate representation of reality. The work has addressed two specific problems. The first concerns the design of an integrated multi-scale data model suited for use within Geographical Information Systems. This has led to the development of two data models, each of which allows for the integration of terrain data and topographic data at multiple levels of detail. The models are based on a combination of adapted versions of three previous data structures, namely, the constrained Delaunay pyramid, the line generalisation tree and the fixed grid. The second specific problem addressed in this thesis has been the development of an integrated multi-scale 3-D geological data model, for use within a Geoscientific Information System. This has resulted in a data storage scheme which enables the integration of terrain data, geological outcrop data and borehole data at various levels of detail. The thesis also presents details of prototype database implementations of each of the new data storage schemes. These implementations have served to demonstrate the feasibility and benefits of an integrated multi-scale approach. The research has also brought to light some areas that will need further research before fully functional systems are produced. The final chapter contains, in addition to conclusions made as a result of the research to date, a summary of some of these areas that require future work.
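One ingredient of the line generalisation tree is worth making concrete: simplifying the same line at several tolerances yields the nested levels of detail that such a tree stores. A standard Douglas-Peucker simplification illustrates this (the polyline and tolerances are invented for the example):

```python
import math

def dp(points, tol):
    """Douglas-Peucker simplification of a polyline (list of (x, y) tuples)."""
    if len(points) < 3:
        return list(points)
    (x1, y1), (x2, y2) = points[0], points[-1]

    # Perpendicular distance of a point to the chord between the endpoints.
    def dist(p):
        px, py = p
        num = abs((y2 - y1) * px - (x2 - x1) * py + x2 * y1 - y2 * x1)
        den = math.hypot(x2 - x1, y2 - y1)
        return num / den if den else math.hypot(px - x1, py - y1)

    # Find the interior point furthest from the chord.
    i, d = max(((k, dist(p)) for k, p in enumerate(points[1:-1], 1)),
               key=lambda t: t[1])
    if d <= tol:
        return [points[0], points[-1]]     # whole span within tolerance
    # Split at the furthest point and recurse on both halves.
    return dp(points[:i + 1], tol)[:-1] + dp(points[i:], tol)

line = [(0, 0), (1, 0.1), (2, -0.1), (3, 5), (4, 6), (5, 7), (6, 8.1), (7, 9)]
coarse = dp(line, tol=1.0)    # large tolerance: few points survive
fine = dp(line, tol=0.05)     # small tolerance: more detail retained
print(len(coarse), len(fine))
```

A line generalisation tree effectively precomputes this hierarchy once, so that any requested level of detail can be read off without re-simplifying.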