Using an Open Source Python Toolbox (Signac) to Manage High Dimensional Research Data

Abstract

Many research fields have entered the age of big data. For some researchers, big data means computationally generating large datasets through high-dimensional parameter sweeps; for others, it means collecting terabytes of experimental data, each annotated with many kinds of metadata describing the experimental conditions. Recording and storing these data in an organized way for future analysis can be challenging: ad hoc solutions may solve the immediate problem but hinder progress later on. Having battled these challenges myself, I want to share my experience working with Signac, an open-source, Python-based data management system. Signac was first developed in the Glotzer Group at the University of Michigan, where I was a graduate student, to help manage different kinds of molecular dynamics simulations, and was later extended to support many other kinds of data. In this talk, I will briefly discuss the design philosophies of Signac and, drawing on my personal experience, give a quick demonstration of how researchers can use it in their own work.
