Automatic Categorization of Medical Documents

Abstract

The main objective of this thesis is to propose a model for categorizing medical documents, called the HiMeD Model. The HiMeD Model is based on a principle that we call hierarchical correlation of specialized terms: a medical concept to be used in an automatic categorization process can always be represented by terms that are linked along a hierarchical path. This hierarchical linking can contain components that allow the categories to be determined, ordered by the degree of relevance of the adopted concept. Using this principle allows us to isolate the categorization task from the unnecessary influence of terms that do not belong to the reference medical vocabulary, and from the direct term-weight calculation used in the information retrieval process of the classic models. The concepts developed here were used in several experiments that demonstrated the quality of the proposed model. These experiments are another important contribution of this work. Finally, a tool for automatic coding of medical documents was implemented based on the components of our model, thus demonstrating its technological capacity for building automatic categorization tools. This tool, called MedCode, was used in experiments carried out with the help of medical coding specialists, and its use improved the precision of the automatic coding of medical documents. This improvement is largely due to the interactive and visual characteristics of the prototype, which allowed the specialists to modify the coding environment, to select the type of processing algorithm, and to modify other document processing options.

Sociedad Argentina de Informática e Investigación Operativ
