14 research outputs found

Healing Products of Gaussian Processes

Author: Cohen Samuel
Deisenroth Marc Peter
Marwala Tshilidzi
Mbuvha Rendani
Publication date: 14/02/2021

Gaussian processes (GPs) are nonparametric Bayesian models that have been applied to regression and classification problems. One of the approaches to alleviate their cubic training cost is the use of local GP experts trained on subsets of the data. In particular, product-of-expert models combine the predictive distributions of local experts through a tractable product operation. While these expert models allow for massively distributed computation, their predictions typically suffer from erratic behaviour of the mean or uncalibrated uncertainty quantification. By calibrating predictions via a tempered softmax weighting, we provide a solution to these problems for multiple product-of-expert models, including the generalised product of experts and the robust Bayesian committee machine. Furthermore, we leverage the optimal transport literature and propose a new product-of-expert model that combines predictions of local experts by computing their Wasserstein barycenter, which can be applied to both regression and classification.

Comment: ICML 202
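
To make the combination rules named in this abstract concrete, here is a minimal Python sketch for a single test point with one-dimensional Gaussian expert predictions. It is not the authors' implementation: the function names are hypothetical, and scoring each expert by its negative predictive variance before the softmax is an assumption made for illustration, since the abstract does not say which quantity is tempered. The barycenter function uses the standard closed form for the 2-Wasserstein barycenter of one-dimensional Gaussians, in which means and standard deviations are averaged with the expert weights.

```python
import math


def softmax_weights(variances, temperature=1.0):
    """Tempered softmax over experts.

    Assumption: an expert's score is its negative predictive variance,
    so more confident experts receive larger weights.
    """
    scores = [-temperature * v for v in variances]
    top = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def gpoe_combine(means, variances, weights):
    """Generalised product of experts: weighted product of Gaussians."""
    precision = sum(w / v for w, v in zip(weights, variances))
    mean = sum(w * m / v for w, m, v in zip(weights, means, variances)) / precision
    return mean, 1.0 / precision


def wasserstein_barycenter_1d(means, variances, weights):
    """2-Wasserstein barycenter of 1-D Gaussians (weights must sum to 1)."""
    mean = sum(w * m for w, m in zip(weights, means))
    std = sum(w * math.sqrt(v) for w, v in zip(weights, variances))
    return mean, std ** 2


if __name__ == "__main__":
    # Made-up expert predictions, for illustration only.
    mu, var = [0.9, 1.1, 3.0], [0.2, 0.3, 5.0]
    w = softmax_weights(var, temperature=2.0)
    print(gpoe_combine(mu, var, w))
    print(wasserstein_barycenter_1d(mu, var, w))
```

Under the scoring assumption above, a temperature of 0 weights every expert equally, while a large temperature concentrates the weight on the most confident expert.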

arXiv.org e-Print Archive

UCL Discovery

Imputation of Missing Streamflow Data at Multiple Gauging Stations in Benin Republic

Author: Adounkpe Julien Yise Peniel
Houngnibo Mandela
Marwala Tshilidzi
Mbuvha Rendani
Mongwe Wilson Tsakane
Newlands Nathaniel
Publication date: 17/11/2022

Streamflow observation data are vital for flood monitoring and for agricultural and settlement planning. However, such data are commonly plagued by missing observations, with causes that include harsh environmental conditions and constrained operational resources. This problem is often more pervasive in under-resourced areas such as Sub-Saharan Africa. In this work, we reconstruct streamflow time series data through bias correction of the GEOGloWS ECMWF streamflow service (GESS) forecasts at ten river gauging stations in Benin Republic. We perform bias correction by fitting Quantile Mapping, Gaussian Process, and Elastic Net regression in a constrained training period. We show by simulating missingness in a testing period that GESS forecasts have a significant bias that results in low predictive skill over the ten Beninese stations. Our findings suggest that, overall, bias correction by Elastic Net and Gaussian Process regression achieves superior skill relative to traditional imputation by Random Forest, k-Nearest Neighbour, and GESS lookup. The findings of this work provide a basis for integrating global GESS streamflow data into operational early-warning decision-making systems (e.g., flood alert) in countries vulnerable to drought and flooding due to extreme weather events.

Comment: AAAI 2022 Fall Symposium: The Role of AI in Responding to Climate Challenges, Nov 17-19, 202
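
Of the bias-correction methods listed in this abstract, quantile mapping is the simplest to illustrate. The sketch below is a generic empirical quantile mapping in plain Python, not the paper's code: the function name, the linear interpolation between sorted observations, and the toy numbers are all assumptions made for illustration. A forecast value is located in the empirical distribution of the training forecasts, and the observed value at the same quantile is returned.

```python
from bisect import bisect_right


def quantile_map(x, train_forecast, train_observed):
    """Map forecast value x onto the observed distribution.

    Hypothetical helper: uses the empirical CDF of the training
    forecasts and linear interpolation between sorted observations.
    """
    forecasts = sorted(train_forecast)
    observed = sorted(train_observed)
    # Fraction of training forecasts that are <= x (empirical CDF).
    p = bisect_right(forecasts, x) / len(forecasts)
    # Fractional index of that quantile in the sorted observations.
    pos = p * (len(observed) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(observed) - 1)
    frac = pos - lo
    return observed[lo] + frac * (observed[hi] - observed[lo])


if __name__ == "__main__":
    # Made-up numbers: forecasts that run roughly ten times too high.
    forecast = [10.0, 20.0, 30.0, 40.0]
    gauge = [1.0, 2.0, 3.0, 4.0]
    print(quantile_map(25.0, forecast, gauge))
```

Because the mapping depends only on ranks, it corrects systematic distributional bias but cannot repair timing errors in the forecast.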

arXiv.org e-Print Archive

Open problems in causal structure learning: A case study of COVID-19 in the UK

Author: Chobtham Kiattikun
Constantinou Anthony
Hashemzadeh Arian
Kitson Neville K.
Liu Yang
Mbuvha Rendani
Nanavati Praharsh A.
Petrungaro Bruno
Publication date: 06/09/2023

Causal machine learning (ML) algorithms recover graphical structures that tell us something about cause-and-effect relationships. The causal representation provided by these algorithms enables transparency and explainability, which are necessary for decision making in critical real-world problems. Yet, causal ML has had limited impact in practice compared to associational ML. This paper investigates the challenges of causal ML with application to COVID-19 UK pandemic data. We collate data from various public sources and investigate what the various structure learning algorithms learn from these data. We explore the impact of different data formats on algorithms spanning different classes of learning, and assess the results produced by each algorithm, and by groups of algorithms, in terms of graphical structure, model dimensionality, sensitivity analysis, confounding variables, and predictive and interventional inference. We use these results to highlight open problems in causal structure learning and directions for future research. To facilitate future work, we make all graphs, models, data sets, and source code publicly available online.

arXiv.org e-Print Archive

MphayaNER: Named Entity Recognition for Tshivenda

Author: Adelani David I.
Marivate Vukosi
Marwala Tshilidzi
Masindi Andisani
Mauda Aluwani
Maumela Tshifhiwa Joshua
Mbuvha Rendani
Mutavhatsindi Tendani
Rakhuhu Tshimangadzo
Rananga Seani
Publication date: 08/04/2023

Named Entity Recognition (NER) plays a vital role in various Natural Language Processing tasks such as information retrieval, text classification, and question answering. However, NER can be challenging, especially in low-resource languages with limited annotated datasets and tools. This paper adds to the effort of addressing these challenges by introducing MphayaNER, the first Tshivenda NER corpus in the news domain. We establish NER baselines by fine-tuning state-of-the-art models on MphayaNER. The study also explores zero-shot transfer between Tshivenda and other related Bantu languages, with chiShona and Kiswahili showing the best results. Augmenting MphayaNER with chiShona data was also found to improve model performance significantly. Both MphayaNER and the baseline models are made publicly available.

Comment: Accepted at AfricaNLP Workshop at ICLR 202

arXiv.org e-Print Archive

Bayesianska neuronnät för korttidsprognoser för vindkraft (Bayesian neural networks for short-term wind power forecasting)

Author: Mbuvha Rendani
Publication venue: KTH, Skolan för datavetenskap och kommunikation (CSC)
Publication date: 01/01/2017

In recent years, wind and other variable renewable energy sources have gained a rapidly increasing share of the global energy mix. In this context, the greatest concern facing renewable energy sources like wind is the uncertainty in production volumes, as their generation ability is inherently dependent on weather conditions. When providing forecasts for newly commissioned wind farms, there is a limited amount of historical power production data, while the number of potential features from different weather forecast providers is vast. Bayesian regularization is therefore seen as a possible technique for reducing the model overfitting problems that may arise. This thesis investigates Bayesian Neural Networks in one-hour and day-ahead forecasting of wind power generation. Initial results show that Bayesian Neural Networks display equivalent predictive performance to Neural Networks trained by Maximum Likelihood in both one-hour and day-ahead forecasting. Models selected using maximum evidence were found to have statistically significantly lower test error compared to those selected based on minimum test error. Further results show that the Bayesian framework is able to identify irrelevant features through Automatic Relevance Determination, though this does not result in a statistically significant error reduction in predictive performance in one-hour-ahead forecasting. In day-ahead forecasting, removing irrelevant features based on Automatic Relevance Determination is found to yield statistically significant improvements in test error.

Publikationer från KTH

Digitala Vetenskapliga Arkivet - Academic Archive On-line

Parameter inference using probabilistic techniques

Author: Mbuvha Rendani
Publication date: 01/01/2021

Abstract: Complex non-linear prediction systems have become ubiquitous in numerous decision-making and other socio-technical systems. In recent years, the increased adoption and use of these complex non-linear systems has been dominated by universal approximators such as neural networks and Gaussian Processes. These systems’ applications span a large number of critical domains, including transportation, drug design, law enforcement, financial services, energy planning, and pandemic forecasting. The critical nature of these application domains necessitates the study of the inference methods used to train or calibrate these systems’ parameters. Further to this, inference methods coupled with estimators of the uncertainty around the system’s predictions and measures of the relative influence of its inputs aid in managing the very high societal risks associated with incorrect predictions. This thesis investigates probabilistic parameter inference methods that provide both the required uncertainty and relevance measures. We first introduce Metropolis-Hastings (MH) and Hybrid Monte Carlo (HMC) methods for parameter inference in Bayesian Neural Networks (BNNs) with applications in credit risk modelling and South African wind energy resource planning. We further utilise a Separable Shadow Hamiltonian Hybrid Monte Carlo (S2HMC) method for the first time in the inference of BNN parameters. S2HMC addresses traditional MCMC methods’ discretisation constraints by using a perturbed Hamiltonian, which is conserved to a higher order by the numerical integration scheme. Experimental results on wind energy and credit datasets find that S2HMC yields higher effective sample sizes than the competing HMC. The predictive performance of S2HMC-based and HMC-based BNNs is found to be similar.
Thirdly, we perform hierarchical inference for BNN parameters by combining the S2HMC sampler with Gibbs sampling of hyperparameters for Automatic Relevance Determination (ARD). A generalisable ARD committee framework is introduced to synthesise various samplers’ ARD outputs into robust feature selections. Experimental results show that this ARD committee approach selects features of high predictive information value. Further, the results show that dimensionality reduction performed through this approach improves the sampling performance of samplers that suffer from random-walk behaviour, such as Metropolis-Hastings (MH). The thesis also addresses predictive distribution calibration pathologies of the existing product of Gaussian Process expert models. We introduce a solution to the predictive dominance of uninformed experts through expert combination via the Wasserstein Barycenter and sparsity control through tempered softmax weightings. These proposals are empirically shown to outperform other product of experts (PoE) methods. The proposed PoE models are also found to outperform BNNs on wind speed forecasting regression tasks. Finally, the thesis provides a Bayesian inference approach to change point determination in the spreading rates of the novel Severe Acute Respiratory Syndrome Coronavirus 2 (SARS-CoV-2) in South Africa. This approach is the first probabilistically principled method in the literature for quantifying the relative efficacy of the various South African government interventions to slow the spread of SARS-CoV-2.

Ph.D. (Electrical and Electronic Engineering)
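
The random-walk behaviour of Metropolis-Hastings mentioned in this abstract is easiest to see in a minimal sampler. The Python sketch below is a generic random-walk MH sampler on a one-dimensional target, not code from the thesis: the function names, the Gaussian proposal, the step size, and the standard-normal target are all assumptions chosen for illustration. HMC and S2HMC replace the blind Gaussian proposal with gradient-guided trajectories and are not shown here.

```python
import math
import random


def metropolis_hastings(log_density, x0, n_samples, step=1.0, seed=0):
    """Random-walk Metropolis-Hastings with a symmetric Gaussian proposal."""
    rng = random.Random(seed)
    x = x0
    logp = log_density(x)
    samples = []
    for _ in range(n_samples):
        proposal = x + rng.gauss(0.0, step)
        logp_new = log_density(proposal)
        # Symmetric proposal: accept with probability min(1, p(new) / p(old)).
        accept_prob = math.exp(min(0.0, logp_new - logp))
        if rng.random() < accept_prob:
            x, logp = proposal, logp_new
        samples.append(x)  # a rejected proposal repeats the current state
    return samples


def standard_normal_log_density(x):
    """Log-density of N(0, 1) up to an additive constant (toy target)."""
    return -0.5 * x * x


if __name__ == "__main__":
    draws = metropolis_hastings(standard_normal_log_density, 0.0, 20000)
    mean = sum(draws) / len(draws)
    var = sum((d - mean) ** 2 for d in draws) / len(draws)
    print(mean, var)
```

Successive draws from this sampler are strongly correlated when the step size is poorly matched to the target, which is the inefficiency that effective sample size measures.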

University of Johannesburg Institutional Repository