
    Our Digital Legacy: an Archival Perspective

    Our digital memories are threatened by archival hubris, technical misdirection, and the simplistic application of rules intended to protect privacy rights. The obsession with the technical challenge of digital preservation has blinded much of the archival community to the challenges that the digital transition creates for the other core principles of archival science, namely appraisal (what to keep), sensitivity review (identifying material that cannot yet be disclosed for ethical or legal reasons) and access. The essay will draw on considerations of appraisal and sensitivity review to project a vision of some aspects of access to the Digital Archive. It will argue that only by careful scrutiny of these three challenges, and by the introduction of appropriate practices and procedures, will it be possible to prevent the precautionary closure of digital memories for long periods or, worse still, their destruction. We must ensure that our digital memories can be captured, kept, recalled, and remain faithful to the events and circumstances that created them.

    How the accuracy and confidence of sensitivity classification affects digital sensitivity review

    Government documents must be manually reviewed to identify any sensitive information, e.g., confidential information, before being publicly archived. However, human-only sensitivity review is not practical for born-digital documents due to, for example, the sheer volume of documents to be reviewed. In this work, we conduct a user study to evaluate the effectiveness of sensitivity classification for assisting human sensitivity reviewers. We evaluate how the accuracy and confidence levels of sensitivity classification affect the number of documents that are correctly judged as being sensitive (reviewer accuracy) and the time that it takes to sensitivity review a document (reviewing speed). In our within-subject study, the participants review government documents to identify real sensitivities while being assisted by three sensitivity classification treatments, namely None (no classification predictions), Medium (sensitivity predictions from a simulated classifier with a balanced accuracy (BAC) of 0.7), and Perfect (sensitivity predictions from a classifier with an accuracy of 1.0). Our results show that sensitivity classification leads to significant improvements (ANOVA, p < 0.05) in reviewer accuracy in terms of BAC (+37.9% Medium, +60.0% Perfect) and also in terms of F2 (+40.8% Medium, +44.9% Perfect). Moreover, we show that assisting reviewers with sensitivity classification predictions leads to significantly increased (ANOVA, p < 0.05) mean reviewing speeds (+72.2% Medium, +61.6% Perfect). We find that reviewers do not agree with the classifier significantly more often as the classifier's confidence increases. However, reviewing speed is significantly increased when the reviewers agree with the classifier (ANOVA, p < 0.05). Our in-depth analysis shows that, when the reviewers are not assisted with sensitivity predictions, mean reviewing speeds are 40.5% slower for sensitive judgements than for not-sensitive judgements. However, when the reviewers are assisted with sensitivity predictions, the gap in reviewing speeds between sensitive and not-sensitive judgements narrows by ~10%, from 40.5% to 30.8%. We also find that, for sensitive judgements, sensitivity classification predictions significantly increase mean reviewing speeds by 37.7% when the reviewers agree with the classifier's predictions (t-test, p < 0.05). Overall, our findings demonstrate that sensitivity classification is a viable technology for assisting human reviewers with the sensitivity review of digital documents.
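    The abstract reports its results in terms of balanced accuracy (BAC) and F2 without defining either. A minimal sketch of how these two measures are computed from binary sensitivity judgements is shown below; the function name and example data are illustrative assumptions, not the paper's own code or dataset.

```python
# Illustrative only: computing balanced accuracy (BAC) and F2 from binary
# sensitivity judgements. Labels: 1 = sensitive, 0 = not sensitive.

def bac_and_f2(y_true, y_pred):
    # Tally the confusion-matrix cells.
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)

    # BAC averages the true positive rate and true negative rate, so it is
    # not inflated when not-sensitive documents dominate the collection.
    tpr = tp / (tp + fn) if (tp + fn) else 0.0
    tnr = tn / (tn + fp) if (tn + fp) else 0.0
    bac = (tpr + tnr) / 2

    # F2 is the F-beta measure with beta = 2, weighting recall (catching
    # sensitive documents) four times as heavily as precision.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tpr
    beta_sq = 2 ** 2
    denom = beta_sq * precision + recall
    f2 = (1 + beta_sq) * precision * recall / denom if denom else 0.0

    return bac, f2


# Hypothetical example: one reviewer's judgements over six documents.
truth = [1, 1, 1, 0, 0, 0]
judged = [1, 1, 1, 1, 0, 0]
print(bac_and_f2(truth, judged))  # (0.8333..., 0.9375)
```

    The asymmetry of F2 matches the review setting: releasing a sensitive document by mistake (a false negative) is far more costly than withholding a non-sensitive one, so recall is weighted more heavily than precision.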