Efficient logging and querying for Blockchain-based cross-site genomic dataset access audit
Background: Genomic data have been collected by different institutions and
companies and need to be shared for broader use. In a cross-site genomic data
sharing system, a secure and transparent access control audit module plays an
essential role in ensuring accountability. The first track of the 2018 iDASH
competition provided us with an opportunity to design an efficient logging and
querying system for cross-site genomic dataset access audit. We designed a
blockchain-based log system that provides a lightweight and widely
compatible module for existing blockchain platforms. The submitted solution won
third place in the competition. In this paper, we report the technical
details of our system. Methods: We present two methods: a baseline method and
an enhanced method. We started with the baseline method and then adjusted our
implementation based on the competition evaluation criteria and characteristics
of the log system. To overcome the obstacles to indexing on an immutable
blockchain, we designed a hierarchical timestamp structure that
supports efficient range queries on the timestamp field. Results: We
implemented our methods in Python 3, tested their scalability, and compared their
performance using the test data supplied by the competition organizer. We
successfully boosted the log retrieval speed for complex AND queries that
contain multiple predicates. For range queries, we improved the speed by at
least one order of magnitude. Storage usage was reduced by 25%. Conclusion:
We demonstrate that blockchain can be used to build a time- and space-efficient
logging and querying system for genomic dataset audit trails. It is therefore a
promising solution for sharing genomic data with accountability requirements
across multiple sites.
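
The abstract does not spell out the hierarchical timestamp structure, so the following is only an illustrative sketch of the general idea, not the authors' implementation. It assumes integer timestamps in seconds, a hypothetical choice of day/hour/minute bucket widths, and an in-memory dictionary standing in for the append-only key-value storage a blockchain platform would provide. The class name `HierarchicalTimestampIndex` and its methods are invented for this example.

```python
from collections import defaultdict

# Bucket widths in seconds, coarsest first (assumed: day, hour, minute).
LEVELS = (86_400, 3_600, 60)


class HierarchicalTimestampIndex:
    """Append-only index: each record is filed under one bucket per level."""

    def __init__(self):
        # (bucket_width, bucket_number) -> list of (timestamp, record)
        self.buckets = defaultdict(list)

    def append(self, timestamp, record):
        # Entries are only ever added, never updated, mirroring an immutable log.
        for width in LEVELS:
            self.buckets[(width, timestamp // width)].append((timestamp, record))

    def range_query(self, start, end):
        """Return all records with start <= timestamp < end."""
        results = []
        finest = LEVELS[-1]
        t = start
        while t < end:
            # Prefer the coarsest bucket that starts at t and fits in the range.
            for width in LEVELS:
                if t % width == 0 and t + width <= end:
                    bucket = self.buckets.get((width, t // width), ())
                    results.extend(rec for _, rec in bucket)
                    t += width
                    break
            else:
                # Partial bucket at a range edge: read the finest bucket and
                # filter its entries by timestamp.
                bucket = self.buckets.get((finest, t // finest), ())
                results.extend(rec for ts, rec in bucket if start <= ts < end)
                t = t - t % finest + finest
        return results
```

With these assumed widths, a query spanning one aligned day reads a single day bucket instead of 1,440 minute buckets, which is the kind of saving a hierarchical layout is meant to give range queries. The sketch pays for this by writing every record once per level; a real system would have to weigh that against its storage budget.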