This paper discusses issues of corpus design and construction that arise when dealing with a complex, multidimensional phenomenon such as determinologisation. Representing it in corpus data calls for fresh reflection on the process itself and on some essential concepts of corpus building. The paper focuses on the need to represent the progressive aspects of determinologisation in the corpus, i.e. across levels of specialisation and over time, and on the practical issues this raises. It also shows that a corpus representative of determinologisation in a specific domain (in this case, particle physics) requires clear and objective criteria for selecting individual texts. Four principles are established to this end. The discussion leads to the proposal of a robust text selection procedure, which ensures that the peculiarities of determinologisation in the domain of particle physics are reflected in the corpus.