Machine learning research for developing countries can demonstrate clear
sustainable impact by delivering actionable and timely information to
in-country government organisations (GOs) and NGOs in response to their
critical information requirements. We co-create products with UK and in-country
commercial, GO and NGO partners to ensure the machine learning algorithms
address appropriate user needs whether for tactical decision making or
evidence-based policy decisions. In one particular case, we developed and
deployed a novel algorithm, BCCNet, to quickly process large quantities of
unstructured data to prevent and respond to natural disasters. Crowdsourcing
provides an efficient mechanism to generate labels from unstructured data to
prime machine learning algorithms for large scale data analysis. However, these
labels are often imperfect with qualities varying among different citizen
scientists, which prohibits their direct use with many state-of-the-art machine
learning techniques. We describe BCCNet, a framework that simultaneously
aggregates biased and contradictory labels from the crowd and trains an
automatic classifier to process new data. Our case studies, mosquito sound
detection for malaria prevention and damage detection for disaster response,
show the efficacy of our method in the challenging context of developing world
applications.

Comment: Presented at NeurIPS 2018 Workshop on Machine Learning for the Developing World

Isupova, Olga

Kuzin, Danil

Li, Yunpeng

Reece, Steven

Roberts, Stephen J

Willis, Katherine

English

arXiv

University of Surrey

BCCNet: Bayesian classifier combination neural network

OPUS

Citation for published version: Isupova, O, Li, Y, Kuzin, D, Roberts, SJ, Willis, K & Reece, S 2018 'BCCNet: Bayesian classifier combination neural network'.
Publication date: 2018

BCCNet: Bayesian classifier combination neural network

Olga Isupova* (olga.isupova@eng.ox.ac.uk)
Yunpeng Li† (yunpeng.li@surrey.ac.uk)
Danil Kuzin‡ (dkuzin1@sheffield.ac.uk)
Stephen J Roberts* (sjrob@robots.ox.ac.uk)
Katherine Willis§ (kathy.willis@zoo.ox.ac.uk)
Steven Reece* (reece@robots.ox.ac.uk)

* Department of Engineering Science, University of Oxford, UK
† Department of Computer Science, University of Surrey, UK
‡ Department of Automatic Control and Systems Engineering, University of Sheffield, UK
§ Department of Zoology, University of Oxford, UK

Abstract

Machine learning research for developing countries can demonstrate clear sustainable impact by delivering actionable and timely information to in-country government organisations (GOs) and NGOs in response to their critical information requirements.
We co-create products with UK and in-country commercial, GO and NGO partners to ensure the machine learning algorithms address appropriate user needs, whether for tactical decision making or evidence-based policy decisions. In one particular case, we developed and deployed a novel algorithm, BCCNet, to quickly process large quantities of unstructured data to prevent and respond to natural disasters. Crowdsourcing provides an efficient mechanism to generate labels from unstructured data to prime machine learning algorithms for large-scale data analysis. However, these labels are often imperfect, with quality varying among citizen scientists, which prohibits their direct use with many state-of-the-art machine learning techniques. We describe BCCNet, a framework that simultaneously aggregates biased and contradictory labels from the crowd and trains an automatic classifier to process new data. Our case studies, mosquito sound detection for malaria prevention and damage detection for disaster response, show the efficacy of our method in the challenging context of developing world applications.

1 Introduction

Wide area situation awareness or surveillance, for example following a natural disaster or pre-empting disease, benefits from rich, up-to-date yet unstructured data, including post-hurricane satellite imagery and malarial mosquito audio signals. A small amount of data labelled by hand through crowdsourcing platforms like Zooniverse¹ can be used to train machine learning algorithms, such as neural networks (NNs), to label the rest of the data [1]. However, the crowdsourced labels can be noisy and inconsistent, posing enormous challenges for machine learning algorithms that must aggregate information and produce the best decisions for policy makers and rescue workers [2].
The Bayesian classifier combination (BCC) algorithm [3] resolves classifier bias and aggregates labels taking classifier consistency into account.

¹ https://www.zooniverse.org

NeurIPS 2018 Workshop on Machine Learning for the Developing World (ML4D), Montréal, Canada.
arXiv:1811.12258v1 [stat.ML] 29 Nov 2018

[Figure 1: Heatmap of building damage proportion in Northern Dominica after Hurricane Maria in 2017: less than 20% (green), 20% to 60% (magenta), greater than 60% (red).]

We propose an extension to BCC, the Bayesian classifier combination neural network (BCCNet), which incorporates a neural network object classifier. BCCNet effectively trains the neural network object classifier using BCC bias-corrected crowd labels. A novel hybrid variational Bayesian and maximum likelihood approach is developed to jointly learn the neural network and BCC parameters. We demonstrate the efficacy of the approach on imbalanced data and biased crowd labels, scenarios common in real applications.

Our algorithm has been developed and deployed in collaboration with Zooniverse and Rescue Global², a UK-based not-for-profit, to generate damage heatmaps for disaster responders by combining crowd labels of satellite imagery immediately following Hurricanes Irma and Maria (2017) [3, 4, 5] (see Figure 1), with earlier versions deployed following earthquakes in Nepal (2015) and Ecuador (2016). These heatmaps were passed to the UN, FEMA and over 60 NGOs in a timely manner during the response phase of Irma and Maria. This work has led to several research projects in disaster management and environment protection in Africa, South East Asia and South America. Our Zooniverse project on mosquito detection has crowdsourced labels from more than 1200 citizen scientists on data collected in Thailand, Kenya, the US and the UK.

The rest of the paper is organised as follows: Section 2 describes the BCCNet model.
We present two case studies in Section 3 and conclusions in Section 4.

² The Rescue Global, Oxford machine learning and Zooniverse operational response team is collectively called the 'Planetary Response Network'.

2 The Bayesian Classifier Combination Neural Network Algorithm

BCCNet is a multi-class classifier that combines high dimensional data (e.g., images, audio signals) and noisy, potentially biased crowdsourced labels from a set of imperfect base classifiers (e.g., crowd members). It integrates a neural network with the independent Bayesian classifier combination algorithm [3].

[Figure 2: Graphical model of BCCNet. Nodes: $s_i$, $\theta_{NN}$, $t_i$, $c_i^{(k)}$, $\pi^{(k)}$, $\alpha_0^{(k)}$; plates over $i = 1, \dots, M$ and $k = 1, \dots, K$.]

A neural network with parameters $\theta_{NN}$ takes an object $s_i$, e.g., an image patch of a satellite image, as input and predicts a probability $p(t_i \mid s_i, \theta_{NN})$ that this object has class $t_i \in \{1, \dots, J\}$, $\forall i \in \{1, \dots, M\}$, where $M$ is the number of data points and $J$ is the number of possible classes.

A label $c_i^{(k)} \in \{1, \dots, L\}$ of a base classifier $k \in \{1, \dots, K\}$ is drawn from the multinomial distribution depending on the true label for this data point:

$$c_i^{(k)} \mid \pi^{(k)}, t_i \sim \mathrm{Mult}\big(c_i^{(k)};\, \pi^{(k)}_{t_i}\big), \quad \forall i \in \{1, \dots, M\},\ k \in \{1, \dots, K\}, \tag{1}$$

where $\pi^{(k)}$ is a confusion matrix for the base classifier $k$, $\pi^{(k)}_{t_i}$ is the $t_i$-th row of the confusion matrix $\pi^{(k)}$, $K$ is the total number of base classifiers, and $L$ is the number of values for the base classifiers' labels. Our approach tolerates the case when labels from the base classifiers are missing for some objects.

We impose a Dirichlet prior with hyperparameters $\alpha_{0j}^{(k)}$ for rows of the confusion matrices:

$$\pi_j^{(k)} \mid \alpha_{0j}^{(k)} \sim \mathrm{Dir}\big(\pi_j^{(k)};\, \alpha_{0j}^{(k)}\big), \quad \forall j \in \{1, \dots, J\},\ k \in \{1, \dots, K\}. \tag{2}$$

The resulting graphical model is given in Figure 2. BCCNet inference is based on maximisation of the evidence lower bound (ELBO). The ELBO is optimised using coordinate ascent over the NN parameters $\theta_{NN}$ and the posterior approximating distributions for object class labels $t_i$ and confusion matrices $\pi^{(k)}$ for the base classifiers.
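The generative side of the model, Equations (1) and (2), can be illustrated with a short simulation. The following is a minimal sketch rather than the authors' code: it uses only the Python standard library, classes are 0-based, and the names `sample_dirichlet`, `sample_crowd_labels` and the `p_answer` parameter (the chance that an annotator labels a given object, used here to mimic missing labels) are hypothetical and introduced only for illustration.

```python
import random


def sample_dirichlet(alpha, rng):
    """Draw one sample from Dir(alpha) by normalising Gamma(alpha_l, 1) variates."""
    draws = [rng.gammavariate(a, 1.0) for a in alpha]
    total = sum(draws)
    return [d / total for d in draws]


def sample_crowd_labels(true_labels, alpha0, rng, p_answer=1.0):
    """Sample confusion matrices (Eq. 2) and then crowd labels (Eq. 1).

    true_labels: list of true classes t_i in {0, ..., J-1} (0-based here).
    alpha0:      alpha0[k][j] is the Dirichlet hyperparameter vector (length L)
                 for row j of annotator k's confusion matrix.
    p_answer:    hypothetical probability that an annotator labels an object;
                 unanswered objects are stored as None (missing labels).
    """
    # Eq. (2): each row of each confusion matrix has its own Dirichlet prior.
    confusion = [[sample_dirichlet(row, rng) for row in alpha_k] for alpha_k in alpha0]
    labels = []
    for t in true_labels:
        row_labels = []
        for pi_k in confusion:
            if rng.random() < p_answer:
                # Eq. (1): the label is drawn from row t of the confusion matrix.
                weights = pi_k[t]
                c = rng.choices(range(len(weights)), weights=weights, k=1)[0]
                row_labels.append(c)
            else:
                row_labels.append(None)
        labels.append(row_labels)
    return confusion, labels


rng = random.Random(0)
# Two annotators with J = L = 3; a diagonal-heavy prior encodes "mostly correct".
alpha0 = [[[5.0 if j == l else 1.0 for l in range(3)] for j in range(3)]
          for _ in range(2)]
confusion, labels = sample_crowd_labels([0, 2, 1, 1], alpha0, rng, p_answer=0.8)
```

Each row of a sampled confusion matrix sums to one, and an annotator's bias shows up as off-diagonal mass in the row of the true class, which is the quantity the inference procedure has to recover.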
The NN parameters are updated via stochastic gradient ascent and the posterior approximating distribution is found using the variational mean-field approach. We iterate between one full pass of the data for the NN parameter update and one iteration for the approximating distribution update. We refer to this algorithm as VB.

3 Experiments and results

We evaluated our approach on two real case studies, response after a natural disaster and malaria prevention, and compared the proposed algorithm (VB) with two baselines: i) the EM algorithm [6] (EM) extended to our BCCNet model from Section 2, and ii) the neural network with an added crowd layer that models the confusion matrices [7] (CL). The base neural network for all methods was LeNet-5 [8] with the Adam optimiser [9]. The learning rate was chosen by grid search on validation datasets. We also used validation datasets for early stopping. The results are obtained from trained neural networks on held-out test datasets over 30 Monte Carlo runs with random initialisation.

3.1 Case study 1: damage detection in satellite imagery for disaster response

We analysed crowdsourced labels of damage from Digital Globe³ high resolution (30 cm) optical satellite imagery of Dominica before and after Hurricane Maria in 2017. Crowd members were presented with a subset of satellite sub-images after the hurricane and asked, amongst other tasks, to draw bounding boxes around all buildings in their sub-images and also to mark building damage. We extracted image patches from both 'before' and 'after' imagery corresponding to the bounding boxes as input for a neural network. Image patches were resized to 28 × 28 (the size of an average bounding box). Before and after image patches formed different channels of the NN input layer. We also extracted corresponding labels from the crowd as: "background", "undamaged building", and "damaged building".
We thus obtained a dataset with M = 32,932 objects labelled by K = 13 volunteers (each object is labelled by 6 volunteers on average). This dataset is challenging because of the high discrepancy between different crowd members' answers: 38% of the objects were assigned to different classes by the crowd members (for comparison, in the second case study below only 20% of the objects had such disagreement).

The data lacked ground truth labels to validate the algorithms, so we defined ground truth as the crowd consensus output inferred using BCC [3] when the whole dataset was processed. We then divided the dataset in the ratio 70%/10%/20% into training, validation and test datasets for evaluation of the algorithms. The classification accuracy is given in Figure 3a. The crowd layer network has the lowest accuracy. The VB algorithm for BCCNet provides not only the highest accuracy but also the most stable results across Monte Carlo runs, consistently for all three classes.

3.2 Case study 2: mosquito detection in audio for malaria prevention

The HumBug project⁴ aims to detect malaria-vectoring mosquitoes through their flight tones [10]. A malaria epidemic can occur a few weeks after the initial impact of the disease, and it is crucial to monitor malaria vectors (i.e. Anopheles species) and respond in the early stages [11]. As an initial step, we have launched a crowdsourcing project on the Zooniverse platform⁵ to label 2-second audio clips as containing "mosquito sound" or "no mosquito sound". The project has attracted 1,246 volunteers to date, who have labelled 55,590 audio clips from laboratory recordings collected in the UK, the US and Kenya and field recordings from Thailand. However, the crowd label matrix c is still very sparse (99.8% of the matrix values are missing), so we chose data clips that were labelled by at least 2 volunteers as our training dataset to ensure that our objects were assigned a class with some confidence.
Consequently, we had M = 22,186 and K = 1,128 in this case. We used a subset of laboratory recordings with labels provided by the research team of the HumBug project as ground truth labels for test and validation datasets, with M_test = 6,651 samples for testing and M_val = 3,326 samples for validation.

³ https://www.digitalglobe.com
⁴ http://humbug.ac.uk
⁵ https://www.zooniverse.org/projects/yli/humbug

[Figure 3: Performance results for VB, EM and CL. (a) Box plots for accuracy on the damage detection data: for all classes (blue), for the "background" class (red), for the "undamaged building" class (green), and for the "damaged building" class (lavender). (b) Box plots for F1 measure on the mosquito detection data.]

The neural network input comprised 20 × 26 sound 'images' constructed from audio clips, where 26 is the dimension of the mel-spectrum and 20 is the number of windows into which we divided each of the 2-second audio clips. Mosquito detection audio clips are naturally heavily imbalanced, with most clips containing no mosquito sounds. According to the majority-voted labels in the training data, only 21% of clips contain mosquito sounds. In these settings, the crowd layer neural network always predicts "no mosquito". Therefore, for the CL algorithm we balanced the training dataset based on majority-voted labels. Both the EM and VB algorithms for BCCNet are able to train appropriate networks on the raw data.

Figure 3b provides box plots of the F1 measure for the mosquito sound class. We used the F1 measure in this case as the data is highly imbalanced. The crowd layer neural network has the lowest median accuracy and the highest variance among different Monte Carlo runs.
The EM algorithm for BCCNet provides more stable and more accurate results than the crowd layer network. The proposed VB algorithm for BCCNet also gives stable results and additionally has the highest median F1 measure amongst the competitors.

4 Conclusions

We present BCCNet, an approach to jointly aggregate noisy crowdsourced labels and train a neural network to process new data. This approach can be rapidly deployed as a solution to challenging problems in the developing world that lack labelled data. We demonstrate that BCCNet is stable and able to work with imbalanced data and contradictory crowd labels. Ongoing operational engagement with disaster responders shows that this technology delivers sustainable impact by providing actionable and timely information to end users.

Acknowledgments

This work is part-funded by a Google Impact Challenge award, by a grant from the Alan Turing Institute's Data Centric Engineering programme and also through the UK Space Agency's International Partnerships Programme. The authors would like to thank Digital Globe, Planet, ESA and the Satellite Applications Catapult for ongoing satellite data provision; Dr. Marianne Sinka at the University of Oxford, UK; Paul I. Howell at the Centers for Disease Control and Prevention (CDC), BEI Resources in Atlanta, USA; Dustin Miller at the CDC Foundation, Centers for Disease Control and Prevention in Atlanta; Dr. Sheila Ogoma, US Army Military Research Unit, Kisumu, Kenya (USAMRU-K); and Dr. Theeraphap Chareonviriyaphap, Kasetsart University, Thailand, for their collaborations on data collection and system deployment.

References

[1] A. Gaunt, D. Borsa, and Y. Bachrach. Training deep neural nets to aggregate crowdsourced responses. In Proceedings of the Conference on Uncertainty in Artificial Intelligence, pages 242–251, Jun. 2016.
[2] M. Poblet, E. García-Cuesta, and P. Casanovas. Crowdsourcing roles, methods and tools for data-intensive disaster management. Information Systems Frontiers, Jan. 2017.
[3] E. Simpson, S.J. Roberts, I. Psorakis, and A. Smith. Dynamic Bayesian combination of multiple imperfect classifiers. In Decision making and imperfection, pages 1–35. Springer, 2013.
[4] E. Simpson, S. Reece, and S.J. Roberts. Bayesian heatmaps: probabilistic classification with multiple unreliable information sources. In Proceedings of the Joint European Conference on Machine Learning and Knowledge Discovery in Databases, pages 109–125, 2017.
[5] R. Yore. Here's how citizen scientists assisted with the disaster response in the Caribbean. The Conversation (Science and Technology), 2017.
[6] S. Albarqouni, C. Baur, F. Achilles, V. Belagiannis, S. Demirci, and N. Navab. AggNet: deep learning from crowds for mitosis detection in breast cancer histology images. IEEE Transactions on Medical Imaging, 35(5):1313–1321, 2016.
[7] F. Rodrigues and F. Pereira. Deep learning from crowds. In Proceedings of the AAAI Conference on Artificial Intelligence, pages 1611–1619, 2018.
[8] Y. LeCun, L. Bottou, Y. Bengio, and P. Haffner. Gradient-based learning applied to document recognition. Proceedings of the IEEE, 86(11):2278–2324, 1998.
[9] D. Kingma and J. Ba. Adam: A method for stochastic optimization. In Proceedings of the International Conference on Learning Representations, 2015.
[10] Y. Li, D. Zilli, H. Chan, I. Kiskin, M. Sinka, S.J. Roberts, and K. Willis. Mosquito detection with low-cost smartphones: data acquisition for malaria research. In NIPS Workshop on Machine Learning for the Developing World, Long Beach, USA, Dec. 2017. arXiv:1711.06346.
[11] S. C. Waring and B. J. Brown. The threat of communicable diseases following natural disasters: A public health response. Disaster Management & Response, 3(2):41–47, 2005. ISSN 1540-2487.
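Illustrative note on the variational step of Section 2. The paper does not list its update equations, so the sketch below is an assumption: it writes out the standard mean-field updates for an independent BCC model with the network output $p(t_i \mid s_i, \theta_{NN})$ acting as the prior over $t_i$, namely $q(t_i = j) \propto p(t_i = j \mid s_i, \theta_{NN}) \prod_k \exp \mathbb{E}[\ln \pi^{(k)}_{j, c_i^{(k)}}]$ and $\alpha^{(k)}_{jl} = \alpha^{(k)}_{0jl} + \sum_i q(t_i = j)\, [c_i^{(k)} = l]$. The function names are hypothetical, classes are 0-based, the network probabilities are assumed strictly positive, and the stochastic gradient update of $\theta_{NN}$ is omitted.

```python
import math


def digamma(x):
    """Digamma function for x > 0: upward recurrence, then an asymptotic series."""
    result = 0.0
    while x < 6.0:
        result -= 1.0 / x
        x += 1.0
    inv2 = 1.0 / (x * x)
    result += math.log(x) - 0.5 / x - inv2 * (1.0 / 12 - inv2 * (1.0 / 120 - inv2 / 252))
    return result


def expected_log_pi(alpha_k):
    """E[ln pi_{j,l}] under Dir(alpha_k[j]) for each row j of one confusion matrix."""
    return [[digamma(a) - digamma(sum(row)) for a in row] for row in alpha_k]


def update_q_t(nn_probs, crowd_labels, alpha):
    """Update q(t_i) from network probabilities and current Dirichlet parameters.

    nn_probs[i][j]:     p(t_i = j | s_i, theta_NN), assumed strictly positive.
    crowd_labels[i][k]: label given by annotator k to object i, or None if missing.
    alpha[k][j][l]:     current Dirichlet parameters of q(pi^(k)).
    """
    e_log_pi = [expected_log_pi(alpha_k) for alpha_k in alpha]
    q_t = []
    for probs, labels in zip(nn_probs, crowd_labels):
        log_q = []
        for j, p in enumerate(probs):
            score = math.log(p)
            for k, c in enumerate(labels):
                if c is not None:  # missing labels contribute nothing
                    score += e_log_pi[k][j][c]
            log_q.append(score)
        top = max(log_q)  # subtract the maximum for numerical stability
        weights = [math.exp(v - top) for v in log_q]
        norm = sum(weights)
        q_t.append([w / norm for w in weights])
    return q_t


def update_alpha(q_t, crowd_labels, alpha0):
    """Add expected label counts under q(t) to the prior Dirichlet parameters."""
    alpha = [[list(row) for row in alpha_k] for alpha_k in alpha0]
    for q_i, labels in zip(q_t, crowd_labels):
        for k, c in enumerate(labels):
            if c is not None:
                for j, q in enumerate(q_i):
                    alpha[k][j][c] += q
    return alpha


# Toy example: 2 objects, 2 annotators, 2 classes, an uninformative network.
nn_probs = [[0.5, 0.5], [0.5, 0.5]]
crowd_labels = [[0, 0], [1, None]]
alpha0 = [[[2.0, 1.0], [1.0, 2.0]] for _ in range(2)]
q_t = update_q_t(nn_probs, crowd_labels, alpha0)
alpha = update_alpha(q_t, crowd_labels, alpha0)
```

In a full coordinate-ascent loop these two updates would alternate with a pass of gradient updates on the network; under this model the $\theta_{NN}$-dependent part of the ELBO is $\sum_i \sum_j q(t_i = j) \ln p(t_i = j \mid s_i, \theta_{NN})$, so $q(t)$ plays the role of soft training targets.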

        Citation for published version:Isupova, O, Li, Y, Kuzin, D, Roberts, SJ, Willis, K & Reece, S 2018 'BCCNet: Bayesian classifier combinationneural network'.Publication date:2018Link to publicationUniversity of BathAlternative formatsIf you require this document in an alternative format, please contact:openaccess@bath.ac.ukGeneral rightsCopyright and moral rights for the publications made accessible in the public portal are retained by the authors and/or other copyright ownersand it is a condition of accessing publications that users recognise and abide by the legal requirements associated with these rights.Take down policyIf you believe that this document breaches copyright please contact us providing details, and we will remove access to the work immediatelyand investigate your claim.Download date: 08. Jul. 2024BCCNet: Bayesian classifier combination neuralnetworkOlga Isupova?olga.isupova@eng.ox.ac.ukYunpeng Li†yunpeng.li@surrey.ac.ukDanil Kuzin‡dkuzin1@sheffield.ac.ukStephen J Roberts?sjrob@robots.ox.ac.ukKatherine Willis§kathy.willis@zoo.ox.ac.ukSteven Reece?reece@robots.ox.ac.uk?Department of Engineering Science, University of Oxford, UK†Department of Computer Science, University of Surrey, UK‡Department of Automatic Control and Systems Engineering, University of Sheffield, UK§Department of Zoology, University of Oxford, UKAbstractMachine learning research for developing countries can demonstrate clear sustain-able impact by delivering actionable and timely information to in-country govern-ment organisations (GOs) and NGOs in response to their critical information re-quirements. 
We co-create products with UK and in-country commercial, GO andNGO partners to ensure the machine learning algorithms address appropriate userneeds whether for tactical decision making or evidence-based policy decisions.In one particular case, we developed and deployed a novel algorithm, BCCNet, toquickly process large quantities of unstructured data to prevent and respond to nat-ural disasters. Crowdsourcing provides an efficient mechanism to generate labelsfrom unstructured data to prime machine learning algorithms for large scale dataanalysis. However, these labels are often imperfect with qualities varying amongdifferent citizen scientists, which prohibits their direct use with many state-of-the-art machine learning techniques. We describe BCCNet, a framework that simulta-neously aggregates biased and contradictory labels from the crowd and trains anautomatic classifier to process new data. Our case studies, mosquito sound de-tection for malaria prevention and damage detection for disaster response, showthe efficacy of our method in the challenging context of developing world appli-cations.1 IntroductionWide area situation awareness or surveillance, for example, following a natural disaster or preempt-ing disease, benefits from rich, update-to-date yet unstructured data, including post hurricane satel-lite imagery and malarial mosquito audio signals. A small amount of data labelled by hand throughcrowdsourcing platforms like Zooniverse1 can be used to train machine learning algorithms, suchas neural networks (NNs), to label the rest of the data [1]. However, the crowdsourced labels canbe noisy and inconsistent, posing enormous challenges for machine learning algorithms to aggre-gate information and produce best decisions for policy makers and rescue workers [2]. 
The Bayesianclassifier combination (BCC) algorithm [3] resolves classifier bias and aggregates labels taking clas-sifier consistency into account.1https://www.zooniverse.orgNeurIPS 2018 Workshop on Machine Learning for the Developing World (ML4D), Montréal, Canada.arXiv:1811.12258v1  [stat.ML]  29 Nov 2018Figure 1: Heatmap of build-ing damage proportion inNorthern Dominica afterhurricane Maria in 2017:less than 20% (green),20% to 60% (magenta),greater than 60% (red).We propose an extension to BCC, the Bayesian classifier combi-nation neural network (BCCNet), which incorporates a neural net-work object classifier. BCCNet effectively trains the neural networkobject classifier using BCC bias corrected crowd labels. A novelhybrid variational Bayesian and maximum likelihood approach isdeveloped to jointly learn the neural network and BCC parameters.We demonstrate the efficacy of the approach on imbalanced dataand biased crowd labels, scenarios common in real applications.Our algorithm has been developed and deployed in collaborationwith Zooniverse and Rescue Global2, a UK based not-for-profit, togenerate damage heatmaps for disaster responders by combiningcrowd labels of satellite imagery immediately following HurricanesIrma and Maria (2017) [3, 4, 5] (see Figure 1) and earlier versionsfollowing earthquakes in Nepal (2015) and Ecuador (2016). Theseheatmaps were passed to the UN, FEMA and over 60 NGOs dur-ing the response phase of Irma and Maria in a timely manner. Thiswork has led to several research projects in disaster managementand environment protection in Africa, South East Asia and SouthAmerica. Our Zooniverse project on mosquito detection has crowd-sourced labels from more than 1200 citizen scientists on data col-lected in Thailand, Kenya, US and UK.The rest of the paper is organised as follows: Section 2 describesthe BCCNet model. 
We present two case studies in Section 3 and conclusions in Section 4.2 The Bayesian Classifier Combination Neural Network AlgorithmBCCNet is a multi-class classifier that combines high dimensional data (e.g., images, audio signals)and noisy, potentially biased crowdsourced labels from a set of imperfect base classifiers (e.g., crowdmembers). It integrates a neural network with the independent Bayesian classifier combinationalgorithm [3].c(k)itisi θNNπ(k)α(k)0i = [1,M ]k = [1,K]NNFigure 2: Graphical model of BC-CNetA neural network with parameters θNN takes an object si,e.g., an image patch of a satellite image, as input and pre-dicts a probability p(ti|si,θNN) that this object has class ti ∈{1, . . . , J}, ∀i ∈ {1, . . . ,M}, where M is the number of datapoints, and J is the number of possible classes.A label c(k)i ∈ {1, . . . , L} of a base classifier k ∈ {1, . . . ,K}is drawn from the multinomial distribution depending on thetrue label for this data point:c(k)i |π(k), ti ∼Mult(c(k)i ;π(k)ti )∀i ∈ {1, . . . ,M}, k ∈ {1, . . . ,K},(1)where π(k) is a confusion matrix for the base classifier k, π(k)tiis the ti-th row of the confusion matrix π(k), K is the total number of base classifiers, L is thenumber of values for the base classifiers’ labels. Our approach tolerates the case when labels fromthe base classifiers are missing for some objects.We impose a Dirichlet prior with hyperparameters α(k)0j for rows of the confusion matrices:π(k)j |α(k)0j ∼ Dir(π(k)j ;α(k)0j ),∀j ∈ {1, . . . , J}, k ∈ {1, . . . ,K}. (2)The resulting graphical model is given in Figure 2. BCCNet inference is based on maximisationof the evidence lower bound (ELBO). The ELBO is optimised using coordinate ascent over the NN2Rescue Global, Oxford machine learning and Zooniverse operational response team is collectively calledthe ‘Planetary Response Network’.2parameters θNN and the posterior approximating distributions for object class labels ti and confusionmatrices π(k) for the base classifiers. 
The NN parameters are updated via stochastic gradient ascentand the posterior approximating distribution is found using the variational mean-field approach. Weiterate between one full pass of the data for the NN parameter update and one iteration for theapproximating distribution update. We refer to this algorithm as VB.3 Experiments and resultsWe evaluated our approach on two real case studies, response after a natural disaster and malariaprevention, and compared the proposed algorithm (VB) with two baselines: i) the EM-algorithm [6](EM) extended to our BCCNet model from Section 2, and ii) the neural network with an added crowdlayer that models the confusion matrices [7] (CL). The base neural network for all methods wasLeNet-5 [8] with the Adam optimiser [9]. The learning rate was chosen by grid search on validationdatasets. We also used validation datasets for early stopping. The results are obtained from trainedneural networks on held-out test datasets over 30 Monte Carlo runs with random initialisation.3.1 Case study 1: damage detection in satellite imagery for disaster responseWe analysed crowdsourced labels of damage from Digital Globe3 high resolution (30cm) opticalsatellite imagery of Dominica before and after Hurricane Maria in 2017. Crowd members werepresented with a subset of satellite sub-images after the hurricane and asked, amongst other tasks,to draw bounding boxes around all buildings in their sub-images and also mark building damage.We extracted image patches from both ‘before’ and ‘after’ imagery corresponding to the boundingboxes as input for a neural network. Image patches were resized as 28 × 28 (the size of an averagebounding box). Before and after image patches formed different channels of the NN input layer.We also extracted corresponding labels from the crowd as: “background”, “undamaged building”,and “damaged building”. 
We thus obtain a dataset with M = 32, 932 objects labelled by K = 13volunteers (each object is labelled on average by 6 volunteers). This dataset is challenging because ofthe high discrepancy between different crowd members’ answers: 38% of the objects were assignedto different classes by the crowd members (for comparison in the second case study, below, the datahad only 20% of such objects).The data lacked ground truth labels to validate the algorithms so we defined ground truth as thecrowd consensus output inferred using BCC [3] when the whole dataset was processed. We thendivided the dataset in the ratio 70−10−20% into training, validation and test datasets for evaluationof the algorithms. The classification accuracy is given in Figure 3a. One can notice that the crowdlayer network has the lowest accuracy. The VB algorithm for BCCNet provides not only the highestaccuracy but also the most stable results among different Monte Carlo runs consistently for all threeclasses.3.2 Case study 2: mosquito detection in audio for malaria preventionThe HumBug project4 aims to detect malaria-vectoring mosquitoes through their flight tones [10]. Amalaria epidemic can occur a few weeks after initial impact of the disease and it is crucial to monitormalaria vectors (i.e. Anopheles species) and respond in the early stages [11]. As an initial step, wehave launched a crowdsourcing project on the Zooniverse platform5 to label 2-second length audioclips as containing “mosquito sound” or “no mosquito sound”. The project has attracted 1, 246volunteers up to date who have labelled 55, 590 audio clips from laboratory recordings collected inUK, US and Kenya and field recordings from Thailand. However, the crowd label matrix c is stillvery sparse, 99.8% of the matrix values are missing, so we chose data clips that were labelled by atleast 2 volunteers as our training dataset to ensure that our objects were assigned a class with someconfidence. 
Consequently, we had M = 22,186 and K = 1,128 in this case. We used a subset of laboratory recordings with labels provided by the research team of the HumBug project as ground truth labels for test and validation datasets, with M_test = 6,651 samples for testing and M_val = 3,326 samples for validation.

³ https://www.digitalglobe.com
⁴ http://humbug.ac.uk
⁵ https://www.zooniverse.org/projects/yli/humbug

Figure 3: Performance results. (a) Box plots for accuracy on the damage detection data: for all classes (blue), for the “background” class (red), for the “undamaged building” class (green), and for the “damaged building” class (lavender). (b) Box plots for F1 measure on the mosquito detection data.

The neural network input comprised 20 × 26 sound ‘images’ constructed from audio clips, where 26 is the dimension of the mel-spectrum and 20 is the number of windows we used to divide each of the 2-second audio clips. Mosquito detection audio clips are naturally heavily imbalanced, with most clips containing no mosquito sounds. According to the majority-voted labels in the training data, only 21% of clips contain mosquito sounds. In these settings, the crowd layer neural network always predicts “no mosquito”. Therefore, for the CL algorithm we balanced the training dataset based on majority-voted labels. Both EM and VB algorithms for BCCNet are able to train appropriate networks on the raw data.

Figure 3b provides box plots of F1 measure for the mosquito sound class. We used the F1 measure in this case as the data is highly imbalanced. The crowd layer neural network has the lowest median accuracy and the highest variance among different Monte Carlo runs.
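To see why F1 is preferred over accuracy on imbalanced data, consider a classifier that always predicts "no mosquito". The sketch below uses the standard definition F1 = 2PR / (P + R), with P the precision and R the recall of the positive class. The 21/79 class split is borrowed from the majority-vote figure quoted above purely for illustration, and the labels are synthetic; none of this reproduces the reported results.

```python
def f1_score(y_true, y_pred, positive=1):
    """F1 measure for the positive class, computed from raw counts."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        return 0.0  # no correct positive predictions: F1 is taken as zero
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


# Synthetic labels: 21 "mosquito" clips (1) and 79 "no mosquito" clips (0).
y_true = [1] * 21 + [0] * 79
always_negative = [0] * 100  # degenerate classifier that never detects a mosquito

acc = accuracy(y_true, always_negative)
f1 = f1_score(y_true, always_negative)
print(acc, f1)
```

The degenerate classifier is right on every negative clip, so its accuracy looks respectable, while its F1 for the mosquito class is zero because it never produces a true positive.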
The EM-algorithm for BCCNet provides more stable and more accurate results in comparison to the crowd layer network. The proposed VB-algorithm for BCCNet also gives stable results and additionally it has the highest median F1 measure amongst the competitors.

4 Conclusions

We present BCCNet, an approach to jointly aggregate noisy crowdsourced labels and train a neural network to process new data. This approach can be rapidly deployed as a solution to challenging problems in the developing world that lack labelled data. We demonstrate that BCCNet is stable and able to work with imbalanced data and contradictory crowd labels. Ongoing operational engagement with disaster responders shows that this technology delivers sustainable impact by providing actionable and timely information to end users.

Acknowledgments

This work is part-funded by a Google Impact Challenge award, by a grant from the Alan Turing Institute’s Data Centric Engineering programme and also through the UK Space Agency’s International Partnerships Programme. The authors would like to thank Digital Globe, Planet, ESA and the Satellite Applications Catapult for ongoing satellite data provision; Dr. Marianne Sinka at the University of Oxford, UK, Paul I. Howell at the Centers for Disease Control and Prevention (CDC), BEI Resources in Atlanta, USA, Dustin Miller in CDC Foundation, Centers for Disease Control and Prevention in Atlanta, Dr. Sheila Ogoma, US Army Military Research Unit, Kisumu, Kenya (USAMRU-K), and Dr. Theeraphap Chareonviriyaphap, Kasetsart University, Thailand for their collaborations on data collection and system deployment.

References

[1] A. Gaunt, D. Borsa, and Y. Bachrach. Training deep neural nets to aggregate crowdsourced responses. In Proceedings of the Conference on Uncertainty in Artificial Intelligence, pages 242–251, Jun. 2016.
[2] M. Poblet, E. García-Cuesta, and P. Casanovas. Crowdsourcing roles, methods and tools for data-intensive disaster management. Information Systems Frontiers, Jan. 2017.
[3] E. Simpson, S.J. Roberts, I. Psorakis, and A. Smith. Dynamic Bayesian combination of multiple imperfect classifiers. In Decision making and imperfection, pages 1–35. Springer, 2013.
[4] E. Simpson, S. Reece, and S.J. Roberts. Bayesian heatmaps: probabilistic classification with multiple unreliable information sources. In Proceedings of the Joint European Conference on Machine Learning and Knowledge Discovery in Databases, pages 109–125, 2017.
[5] R. Yore. Here's how citizen scientists assisted with the disaster response in the Caribbean. The Conversation (Science and Technology), 2017.
[6] S. Albarqouni, C. Baur, F. Achilles, V. Belagiannis, S. Demirci, and N. Navab. AggNet: deep learning from crowds for mitosis detection in breast cancer histology images. IEEE Transactions on Medical Imaging, 35(5):1313–1321, 2016.
[7] F. Rodrigues and F. Pereira. Deep learning from crowds. In Proceedings of the AAAI Conference on Artificial Intelligence, pages 1611–1619, 2018.
[8] Y. LeCun, L. Bottou, Y. Bengio, and P. Haffner. Gradient-based learning applied to document recognition. Proceedings of the IEEE, 86(11):2278–2324, 1998.
[9] D. Kingma and J. Ba. Adam: A method for stochastic optimization. In Proceedings of the International Conference on Learning Representations, 2015.
[10] Y. Li, D. Zilli, H. Chan, I. Kiskin, M. Sinka, S.J. Roberts, and K. Willis. Mosquito detection with low-cost smartphones: data acquisition for malaria research. In NIPS Workshop on Machine Learning for the Developing World, Long Beach, USA, Dec. 2017. arXiv:1711.06346.
[11] S. C. Waring and B. J. Brown. The threat of communicable diseases following natural disasters: A public health response. Disaster Management & Response, 3(2):41–47, 2005. ISSN 1540-2487.

https://purehost.bath.ac.uk/ws/files/262185206/1811.12258v1.pdf

BCCNet: Bayesian classifier combination neural network
