The Partitur Format at BAS
Abstract
Most spoken language resources are produced and disseminated together with symbolic information relating to the speech signal, for instance orthographic transcripts and labelling and segmentation on the phonological, phonetic, prosodic, and phrasal levels. Most of the known formats for these symbolic data are defined in a 'closed form' that is not flexible enough to allow simple, platform-independent processing and easy extension. At the Bavarian Archive for Speech Signals (BAS), a new format has been developed and used over the last few years that shows some significant advantages over existing formats. This paper describes the basic principles behind this format, briefly discusses its advantages, and gives detailed definitions of the description levels used so far. Furthermore, we give some examples of easy processing of the format and of distributed work on the same data. In the future all corpora produced and disseminated by BAS will be distributed with the new BAS Partit..