The TANKA project seeks to build a model of a technical domain by semi-automatically processing unedited English text that describes the domain. Each sentence is parsed, and conceptual elements are extracted from the parse. Concepts are derived from the Case structure of a sentence and added to a conceptual network that represents knowledge about the domain. The DIPETT parser has particularly broad coverage of English syntax; its newest version can also process sentence fragments. The HAIKU subsystem is responsible for user-assisted semantic interpretation. It contains a Case Analyzer module that extracts concept-marking phrases from the parse and uses its past processing experience to derive the most likely Case realization of each phrase, with almost no a priori semantic knowledge. The user must validate these selections. A key issue in our research is minimizing the number of interactions with the user by intelligently generating the alternatives offered.
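As a rough illustration of the idea of ranking Case alternatives from past experience, the following minimal Python sketch orders candidate Cases for a marker by how often the user previously validated each one. It is not the HAIKU Case Analyzer: the class `CaseExperience`, the function `interpret`, the `validate` callback, and the Case labels in the usage note are all hypothetical names chosen for this example, and simple frequency counting is an assumed stand-in for whatever the actual system does.

```python
from collections import Counter, defaultdict


class CaseExperience:
    """Hypothetical store of user-validated (marker, Case) pairs."""

    def __init__(self):
        # marker -> Counter mapping each Case label to its validation count
        self._counts = defaultdict(Counter)

    def record(self, marker, case):
        """Remember that the user validated `case` for `marker`."""
        self._counts[marker.lower()][case] += 1

    def rank(self, marker, candidates):
        """Order candidates by past validation count, most frequent first.

        sorted() is stable, so candidates with equal counts (including
        never-seen ones) keep their input order.
        """
        seen = self._counts.get(marker.lower(), Counter())
        return sorted(candidates, key=lambda case: -seen[case])


def interpret(marker, candidates, experience, validate):
    """Offer ranked alternatives, let the user choose, and learn from it.

    `validate` is an assumed callback standing in for the user: it takes
    the marker and the ranked list and returns the accepted Case.
    """
    ranked = experience.rank(marker, candidates)
    chosen = validate(marker, ranked)
    experience.record(marker, chosen)
    return chosen
```

For example, after `record("with", "Instrument")` has been called twice and `record("with", "Accompaniment")` once, `rank("with", ["Accompaniment", "Instrument", "Manner"])` places `"Instrument"` first. Under this assumed scheme, the more often the top-ranked suggestion is correct, the fewer alternatives the user has to inspect before validating.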