The goal of Gene Normalization (GN) is to identify the unique database identifiers of genes and proteins mentioned in biomedical literature. A major difficulty in GN comes from inter-species gene ambiguity: the same gene name can refer to different database identifiers depending on the species in question. In this paper, we introduce a method that exploits contextual information in an abstract, such as tissue type and chromosome location, to tackle this problem. Using this technique, we improve system performance (F-score) by 14.3% on the BioCreAtIvE-II GN task test set.
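To make the inter-species ambiguity concrete, the following is a minimal sketch of context-based disambiguation, not the method evaluated in this paper. It assumes a toy lexicon in which one gene name maps to several candidate identifiers, each annotated with context cues (species words, chromosome location). The gene name `gene-x`, the identifiers `HYPO:0001` and `HYPO:0002`, the cue sets, and the functions `score` and `normalize` are all hypothetical and chosen only for illustration.

```python
import re
from dataclasses import dataclass, field


@dataclass
class Candidate:
    """One possible database entry for an ambiguous gene name."""
    identifier: str                          # hypothetical database identifier
    species: str
    cues: set = field(default_factory=set)   # context phrases that support this entry


# Hypothetical lexicon: one gene name, two species-specific identifiers.
LEXICON = {
    "gene-x": [
        Candidate("HYPO:0001", "human", {"human", "patient", "chromosome 17"}),
        Candidate("HYPO:0002", "mouse", {"mouse", "murine", "chromosome 11"}),
    ]
}


def score(candidate, text):
    """Count how many of the candidate's cues occur as whole phrases in the text."""
    lowered = text.lower()
    return sum(
        1
        for cue in candidate.cues
        if re.search(r"\b" + re.escape(cue) + r"\b", lowered)
    )


def normalize(mention, abstract):
    """Return the identifier best supported by the abstract's context.

    Returns None when the mention is unknown, when no cue matches,
    or when two candidates tie, since the context is then uninformative.
    """
    candidates = LEXICON.get(mention.lower(), [])
    if not candidates:
        return None
    scored = [(score(c, abstract), c) for c in candidates]
    best = max(s for s, _ in scored)
    winners = [c for s, c in scored if s == best]
    if best == 0 or len(winners) > 1:
        return None
    return winners[0].identifier
```

For example, `normalize("GENE-X", "GENE-X maps to chromosome 11 in murine liver.")` selects the mouse entry because two of its cues appear in the text and none of the human cues do. Cue counting is the simplest possible scoring choice here; a real system would likely weight cues or learn them from data.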