    Mining Web Pages Using Features of Rendering HTML Elements in the Web Browser

    The Web is the largest repository of useful information available to human users, but Web pages usually do not provide an API for automated access to that information. Information Extractors are developed to solve this problem. We present a new methodology for inducing Information Extractors from the Web, based on rendering HTML elements in the Web browser. The methodology uses a knowledge discovery in databases (KDD) process to mine a dataset of features of the elements in the Web page. Experiments on 10 web sites show the effectiveness of the methodology.

    Funding: Ministerio de Ciencia y Tecnología TIN2007-64119; Junta de Andalucía P07-TIC-02602; Junta de Andalucía P08-TIC-410.
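The abstract does not say which rendering features or which mining algorithm the methodology uses, so the following is only a minimal sketch of the general idea under stated assumptions. The feature names (`x`, `y`, `width`, `height`, `font_size`), the toy data, and the one-rule threshold learner are all hypothetical stand-ins chosen for illustration, not the authors' method. It shows how a table of per-element rendering features, labeled by hand, could be mined for a rule that then acts as a simple extractor on an unseen page.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RenderedElement:
    # Hypothetical rendering features; the abstract does not list the real ones.
    x: float
    y: float
    width: float
    height: float
    font_size: float
    label: str  # hand-assigned class, e.g. "title" or "other"


FEATURES = ("x", "y", "width", "height", "font_size")


def learn_stump(rows, target):
    """Find the rule `feature >= threshold -> target` with the fewest errors.

    Returns (feature, threshold, errors). This one-rule learner is a
    stand-in for the mining step, which the abstract leaves unspecified.
    """
    best = None
    for feature in FEATURES:
        for threshold in sorted({getattr(r, feature) for r in rows}):
            errors = sum(
                (getattr(r, feature) >= threshold) != (r.label == target)
                for r in rows
            )
            if best is None or errors < best[2]:
                best = (feature, threshold, errors)
    return best


def extract(rows, rule):
    """Apply a learned rule as an extractor over rendered elements."""
    feature, threshold, _ = rule
    return [r for r in rows if getattr(r, feature) >= threshold]


# Made-up training data: elements from an annotated page.
training = [
    RenderedElement(10, 10, 600, 40, 24.0, "title"),
    RenderedElement(10, 60, 600, 200, 12.0, "other"),
    RenderedElement(10, 300, 600, 38, 22.0, "title"),
    RenderedElement(10, 350, 600, 180, 12.0, "other"),
    RenderedElement(620, 10, 150, 500, 11.0, "other"),
]

rule = learn_stump(training, "title")
print(rule)  # ('font_size', 22.0, 0)

# Made-up unseen page with unknown labels.
new_page = [
    RenderedElement(10, 20, 600, 42, 23.0, ""),
    RenderedElement(10, 70, 600, 220, 12.0, ""),
]
print(len(extract(new_page, rule)))  # 1
```

In a real setting the feature rows would come from the browser itself, for example from element bounding boxes and computed styles, and a full classifier would likely replace the single threshold rule.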