Abstract. This paper proposes an unsupervised method that acquires a set of attribute-value pairs (avps, e.g., ⟨director, W. Wyler⟩) for a given object (e.g., “Ben-Hur”) from semi-structured HTML documents. Objects' avps are one of the principal components of domain ontologies. We first acquire class attributes that many web authors use to describe the objects' avps. We then exploit the acquired class attributes to induce patterns for extracting avps from web pages. Experimental results show that, with our method, at least one set of correct avps is acquired for 67.7% of objects among open-domain class-object pairs whose source documents (web pages) include the objects' avps in layouts.
Key words: open-domain attribute-value acquisition, semi-structured texts, question answering, faceted search
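As a rough illustration of the second step (using acquired class attributes to pull avps out of page layouts), the following Python sketch pairs each table or definition-list cell whose text matches a known attribute with the cell that follows it. It is a simplified stand-in, not the pattern-induction method of the paper: the names `CellCollector` and `extract_avps`, the restriction to `th`/`td`/`dt`/`dd` cells, and the sample HTML are all assumptions made for the example.

```python
from html.parser import HTMLParser

# Assumption: avps appear in table cells or definition-list items.
CELL_TAGS = {"th", "td", "dt", "dd"}


class CellCollector(HTMLParser):
    """Collects the text of layout cells in document order."""

    def __init__(self):
        super().__init__()
        self.cells = []
        self._buf = None  # text fragments of the currently open cell, if any

    def handle_starttag(self, tag, attrs):
        if tag in CELL_TAGS:
            self._flush()  # tolerate cells whose end tag was omitted
            self._buf = []

    def handle_endtag(self, tag):
        if tag in CELL_TAGS:
            self._flush()

    def handle_data(self, data):
        if self._buf is not None:
            self._buf.append(data)

    def _flush(self):
        if self._buf is not None:
            # Collapse runs of whitespace and skip empty cells.
            text = " ".join("".join(self._buf).split())
            if text:
                self.cells.append(text)
            self._buf = None


def extract_avps(html, class_attributes):
    """Return (attribute, value) pairs for cells labelled with a known attribute."""
    parser = CellCollector()
    parser.feed(html)
    parser.close()
    known = {a.lower() for a in class_attributes}

    def normalize(text):
        return text.rstrip(": ").lower()

    pairs = []
    # Hypothetical pattern: an attribute cell immediately followed by a value cell.
    for label, value in zip(parser.cells, parser.cells[1:]):
        if normalize(label) in known and normalize(value) not in known:
            pairs.append((normalize(label), value))
    return pairs


if __name__ == "__main__":
    page = (
        "<table>"
        "<tr><th>Director</th><td>W. Wyler</td></tr>"
        "<tr><th>Year</th><td>1959</td></tr>"
        "</table>"
    )
    print(extract_avps(page, ["director", "year"]))
```

In this sketch the class attributes are supplied by hand; in the setting the abstract describes they would come from the first, attribute-acquisition step.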