3 research outputs found

    Machine Learning and Data Mining-Based Methods to Estimate Parity Status and Age of Wild Mosquito Vectors of Infectious Diseases from Near-Infrared Spectra

    Previous studies show that a partial least squares regression (PLSR) model trained on near-infrared spectra classifies laboratory and semi-field raised mosquitoes as less than or ≥ seven days old with an average accuracy of 80%. This dissertation demonstrates that training models on near-infrared spectra (NIRS) with an artificial neural network (ANN) architecture yields higher accuracies than training them with partial least squares (PLS). In addition, irrespective of the model architecture used, a directly trained binary classifier scores higher accuracy than a regressor that is trained and then interpreted as a binary classifier. Furthermore, for the first time, this dissertation shows that training ANN models on autoencoded near-infrared spectra yields models that estimate the parity status of wild mosquitoes with an accuracy of ≈93%, which is strong enough to support NIRS models as an alternative to ovary dissections. Results also show no significant difference between spectra collected from semi-field raised and wild mosquitoes of the same species, supporting the ongoing practice of training models on semi-field raised mosquitoes to estimate the age class in days of wild mosquitoes. Finally, the study shows that an ANN model trained on semi-field mosquitoes classifies wild mosquitoes as less than or ≥ seven days old with an average accuracy of 76%. In conclusion, the results strongly support ANNs as a suitable architecture for training models that estimate the parity status and age in days of wild mosquito vectors of infectious diseases. The results further suggest near-infrared spectroscopy as an appropriate alternative tool for estimating different parameters of mosquito vectors of infectious diseases.

    Near Infrared Spectroscopy for Estimating the Age of Malaria Transmitting Mosquitoes

    We explore the use of near-infrared spectrometry (NIRS) to classify the age of wild malaria-transmitting mosquitoes. In Chapter Two, using a different set of lab-reared mosquitoes, we replicate the Mayagaya et al. study of the accuracy of NIRS for estimating the age of lab-reared mosquitoes, reproducing the published accuracy. Our results strengthen the Mayagaya et al. study and increase confidence in using NIRS to estimate age classes of mosquitoes. In the field, we wish to classify the ages of wild, not lab-reared, mosquitoes, but the necessary training data from wild mosquitoes is difficult to find. Applying a model trained on spectra from lab-reared mosquitoes to estimate the age of wild mosquitoes is appropriate only if spectra collected from lab-reared mosquitoes are equivalent to those collected from wild mosquitoes. In Chapter Three, we apply k-means cluster analysis to a mixture of spectra collected from lab-reared and wild Anopheles arabiensis mosquitoes to determine whether there is a significant difference between these spectra. We find no significant difference (P = 0.245) in the distributions of wild and lab-reared mosquitoes across the two clusters formed. The two clusters have average silhouette coefficients (a measure of cluster quality) of 0.51 and 0.77, indicating a reasonable and a strong cluster, respectively. Based on the results from Chapter Three, we estimate the age class of wild Anopheles arabiensis mosquitoes using a classification model trained on lab-reared Anopheles arabiensis. We validate the accuracy of the model by comparing its estimates with ovary dissection estimates. While our model estimated 86% and 14% of wild Anopheles arabiensis to be < 7 and ≥ 7 days old, respectively, ovary dissection estimated 72% as young and 28% as old. Studies show that wild mosquito populations generally consist of more young than old mosquitoes. Therefore, our model's age estimates for wild mosquitoes are consistent with ovary dissection and with other studies of the age structure of wild mosquito populations.
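The cluster-quality check described in this abstract can be illustrated with a minimal sketch. It assumes cluster labels have already been assigned (for example by k-means) and uses only the Python standard library. The toy 2-D points, the `origins` list, and the function names `silhouette_per_cluster` and `origin_by_cluster` are hypothetical and stand in for real spectra; they are not taken from the thesis, and the abstract does not say which statistical test produced the reported P value, so no test is reproduced here.

```python
import math
from collections import Counter


def silhouette_per_cluster(points, labels):
    """Return {cluster label: mean silhouette coefficient of its members}."""
    clusters = {}
    for point, label in zip(points, labels):
        clusters.setdefault(label, []).append(point)

    scores = {label: [] for label in clusters}
    for point, label in zip(points, labels):
        own = clusters[label]
        if len(own) == 1:
            # Convention: a singleton cluster gets a silhouette of 0.
            scores[label].append(0.0)
            continue
        # a: mean distance to the other members of the same cluster
        # (the distance from the point to itself contributes 0 to the sum).
        a = sum(math.dist(point, q) for q in own) / (len(own) - 1)
        # b: smallest mean distance to the members of any other cluster.
        b = min(
            sum(math.dist(point, q) for q in members) / len(members)
            for other, members in clusters.items()
            if other != label
        )
        denom = max(a, b)
        scores[label].append((b - a) / denom if denom > 0 else 0.0)
    return {label: sum(v) / len(v) for label, v in scores.items()}


def origin_by_cluster(origins, labels):
    """Count how many samples of each origin fall into each cluster."""
    return Counter(zip(labels, origins))


# Hypothetical toy data: two well-separated groups of 2-D points.
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
          (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
labels = [0, 0, 0, 1, 1, 1]  # assumed output of a clustering step
origins = ["lab", "wild", "lab", "wild", "lab", "wild"]

quality = silhouette_per_cluster(points, labels)
table = origin_by_cluster(origins, labels)
```

Silhouette values near 1 indicate tight, well-separated clusters. The `table` counts are the kind of lab-versus-wild cross-tabulation on which a test of distribution differences could then be run.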

    Age grading An. gambiae and An. arabiensis using near infrared spectra and artificial neural networks

    Background: Near infrared spectroscopy (NIRS) is currently complementing techniques to age-grade mosquitoes. NIRS classifies lab-reared and semi-field raised mosquitoes into < or ≥ 7 days old with an average accuracy of 80%, achieved by training a regression model using partial least squares (PLS) and interpreting it as a binary classifier. Methods and findings: We explore whether using artificial neural network (ANN) analysis instead of PLS regression improves the current accuracy of NIRS models for age-grading malaria-transmitting mosquitoes. We also explore whether directly training a binary classifier, instead of training a regression model and interpreting it as a binary classifier, improves the accuracy. A total of 786 and 870 NIR spectra collected from laboratory-reared An. gambiae and An. arabiensis, respectively, were used and pre-processed according to previously published protocols. The ANN regression model scored a root mean squared error (RMSE) of 1.6 ± 0.2 for An. gambiae and 2.8 ± 0.2 for An. arabiensis, whereas the PLS regression model scored an RMSE of 3.7 ± 0.2 for An. gambiae and 4.5 ± 0.1 for An. arabiensis. When we interpreted the regression models as binary classifiers, the accuracy of the ANN regression model was 93.7 ± 1.0% for An. gambiae and 90.2 ± 1.7% for An. arabiensis, while the PLS regression model scored an accuracy of 83.9 ± 2.3% for An. gambiae and 80.3 ± 2.1% for An. arabiensis. We also find that a directly trained binary classifier yields higher age estimation accuracy than a regression model interpreted as a binary classifier. A directly trained ANN binary classifier scored an accuracy of 99.4 ± 1.0% for An. gambiae and 99.0 ± 0.6% for An. arabiensis, while a directly trained PLS binary classifier scored 93.6 ± 1.2% for An. gambiae and 88.7 ± 1.1% for An. arabiensis. We further tested the reproducibility of these results on different independent mosquito datasets. Models trained using ANNs scored higher estimation accuracies than the same age models trained using PLS. Regardless of the model architecture, directly trained binary classifiers scored higher accuracies in classifying the age of mosquitoes than regression models translated into binary classifiers. Conclusion: We recommend training models to estimate the age of An. arabiensis and An. gambiae using ANN model architectures (especially for datasets with at least 70 mosquitoes per age group) and directly training a binary classifier instead of training a regression model and interpreting it as a binary classifier.
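As a minimal sketch of what "interpreting a regression model as a binary classifier" can mean, the example below thresholds predicted ages at 7 days and scores the resulting classes, alongside the RMSE of the raw predictions. The ages, the predictions, and the function names are hypothetical illustrations; no ANN or PLS model is trained here, and the paper's actual pipeline may differ.

```python
import math

AGE_THRESHOLD_DAYS = 7  # class boundary stated in the abstract: < 7 vs >= 7 days


def rmse(true_ages, predicted_ages):
    """Root mean squared error between true and predicted ages (in days)."""
    squared = [(t - p) ** 2 for t, p in zip(true_ages, predicted_ages)]
    return math.sqrt(sum(squared) / len(squared))


def to_age_class(age_days):
    """Map an age in days to a binary class: 0 for < 7 days, 1 for >= 7 days."""
    return 1 if age_days >= AGE_THRESHOLD_DAYS else 0


def accuracy_as_binary_classifier(true_ages, predicted_ages):
    """Threshold both true and predicted ages, then compute the accuracy."""
    hits = sum(
        to_age_class(t) == to_age_class(p)
        for t, p in zip(true_ages, predicted_ages)
    )
    return hits / len(true_ages)


# Hypothetical true ages and regression outputs, in days.
true_ages = [1, 3, 5, 7, 9, 11]
predicted_ages = [2.0, 2.5, 7.5, 6.0, 9.5, 12.0]

regression_rmse = rmse(true_ages, predicted_ages)
binary_accuracy = accuracy_as_binary_classifier(true_ages, predicted_ages)
```

In this toy data, the two predictions that land on the wrong side of the 7-day boundary (7.5 for a 5-day-old and 6.0 for a 7-day-old) are misclassified even though their absolute errors are small. That is one plausible reason a classifier trained directly on the binary labels could behave differently from a thresholded regressor.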