Data Readiness Levels
Application of models to data is fraught. Data-generating collaborators often
only have a very basic understanding of the complications of collating,
processing and curating data. Challenges include: poor data collection
practices, missing values, inconvenient storage mechanisms, intellectual
property, security and privacy. All these aspects obstruct the sharing and
interconnection of data, and the eventual interpretation of data through
machine learning or other approaches. In project reporting, a major challenge
is in encapsulating these problems and enabling goals to be built around the
processing of data. Project overruns can occur when the time required to
curate and collate data is not accounted for. But to understand these
failures we need to have a common language for assessing the readiness of a
particular data set. This position paper proposes the use of data readiness
levels: it gives a rough outline of three stages of data preparedness and
speculates on how formalisation of these levels into a common language for data
readiness could facilitate project management.