research

The df: A proposed data format standard

Abstract

A standard is proposed describing a portable format for electronic exchange of data in the physical sciences. Writing scientific data in a standard format has three basic advantages: portability; the ability to use metadata to aid in interpretation of the data (understandability); and reusability. An improperly formulated standard format tends towards four disadvantages: (1) it can be inflexible and fail to allow the user to express his data as needed; (2) reading and writing such datasets can involve high overhead in computing time and storage space; (3) the format may be accessible only on certain machines using certain languages; and (4) under some circumstances it may be uncertain whether a given dataset actually conforms to the standard. A format was designed which enhances these advantages and lessens the disadvantages. The fundamental approach is to allow the user to make her own choices regarding strategic tradeoffs to achieve the performance desired in her local environment. The choices made are encoded in a specific and portable way in a set of records. A fully detailed description and specification of the format is given, and examples are used to illustrate various concepts. Implementation is discussed

    Similar works