Generation of incomplete test-data using Bayesian networks

By Olivier François and Philippe Leray

Abstract

We introduce a new method based on the Bayesian network formalism for automatically generating incomplete datasets. This method can either be configured randomly, to generate various datasets with respect to a global percentage of missing data, or manually, in order to handle many parameters. [1] proposed three types of missing data: MCAR (missing completely at random), MAR (missing at random) and NMAR (not missing at random). The proposed approach can successfully generate all MCAR data mechanisms and most MAR data mechanisms. NMAR data generation is very difficult to manage automatically, but we propose some hints in order to cover some NMAR data situations.
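
The three mechanisms named in the abstract can be illustrated with a minimal sketch. This is not the Bayesian network method of the paper. It is only a toy example, assuming a dataset of binary pairs (x, y) in which x is always observed and only y can be hidden. The function name `make_incomplete` and the parameter `p_missing` are hypothetical and chosen for illustration.

```python
import random


def make_incomplete(rows, mechanism, p_missing=0.3, seed=0):
    """Return a copy of rows, a list of (x, y) pairs, with some y set to None.

    Toy illustration only (hypothetical helper, not the paper's method):
      MCAR: y is hidden with probability p_missing, whatever the data.
      MAR:  y can be hidden only when the observed x equals 1.
      NMAR: y can be hidden only when y itself equals 1.
    """
    rng = random.Random(seed)  # local generator, so results are repeatable
    out = []
    for x, y in rows:
        if mechanism == "MCAR":
            p = p_missing                      # independent of x and y
        elif mechanism == "MAR":
            p = p_missing if x == 1 else 0.0   # depends on an observed value
        elif mechanism == "NMAR":
            p = p_missing if y == 1 else 0.0   # depends on the hidden value
        else:
            raise ValueError("unknown mechanism: " + repr(mechanism))
        out.append((x, None if rng.random() < p else y))
    return out


# Small complete dataset of binary pairs, then one incomplete copy per mechanism.
data_rng = random.Random(42)
complete = [(data_rng.randint(0, 1), data_rng.randint(0, 1)) for _ in range(200)]
incomplete = {m: make_incomplete(complete, m) for m in ("MCAR", "MAR", "NMAR")}
```

In this sketch the difference between MAR and NMAR is whether the probability of hiding y can be computed from what remains observed (x) or only from the value that is being removed (y), which is one way to see why NMAR situations are harder to generate automatically.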

Topics: [INFO.INFO-LG]Computer Science [cs]/Machine Learning [cs.LG]
Publisher: 'Institute of Electrical and Electronics Engineers (IEEE)'
Year: 2007
DOI identifier: 10.1109/IJCNN.2007.4371332
OAI identifier: oai:HAL:hal-00412939v1
Provided by: HAL-Univ-Nantes
