37 research outputs found

    Satirical News Detection and Analysis using Attention Mechanism and Linguistic Features

    Full text link
    Satirical news is considered to be entertainment, but it is potentially deceptive and harmful. Despite the embedded genre in the article, not everyone can recognize the satirical cues and therefore believe the news as true news. We observe that satirical cues are often reflected in certain paragraphs rather than the whole document. Existing works only consider document-level features to detect the satire, which could be limited. We consider paragraph-level linguistic features to unveil the satire by incorporating neural network and attention mechanism. We investigate the difference between paragraph-level features and document-level features, and analyze them on a large satirical news dataset. The evaluation shows that the proposed model detects satirical news effectively and reveals what features are important at which level.Comment: EMNLP 2017, 11 page

    Meaningful Labeling of Integrated Query Interfaces

    Get PDF
    The contents of Web databases are accessed through queries formulated on complex user interfaces. In many domains of interest (e.g. Auto) users are interested in obtaining information from alternative sources. Thus, they have to access many individual Web databases via query interfaces. We aim to construct automatically a well-designed query interface that integrates a set of interfaces in the same domain. This will permit users to access information uniformly from multiple sources. Earlier research in this area includes matching attributes across multiple query interfaces in the same domain and grouping related attributes. In this paper, we investigate the naming of the attributes in the integrated query interface. We provide a set of properties which are required in order to have consistent labels for the attributes within an integrated interface so that users have no difficulty in understanding it. Based on these properties, we design algorithms to systematically label the attributes. Experimental results on seven domains validate our theoretical study. In the process of naming attributes, a set of logical inference rules among the textual labels is discovered. These inferences are also likely to be applicable to other integration problems sensitive to naming: HTML forms, HTML tables or concept hierarchies in the semantic Web

    ABSTRACT Meaningful Labeling of Integrated Query Interfaces

    No full text
    The contents of Web databases are accessed through queries formulated on complex user interfaces. In many domains of interest (e.g. Auto) users are interested in obtaining information from alternative sources. Thus, they have to access many individual Web databases via query interfaces. We aim to construct automatically a well-designed query interface that integrates a set of interfaces in the same domain. This will permit users to access information uniformly from multiple sources. Earlier research in this area includes matching attributes across multiple query interfaces in the same domain and grouping related attributes. In this paper, we investigate the naming of the attributes in the integrated query interface. We provide a set of properties which are required in order to have consistent labels for the attributes within an integrated interface so that users have no difficulty in understanding it. Based on these properties, we design algorithms to systematically label the attributes. Experimental results on seven domains validate our theoretical study. In the process of naming attributes, a set of logical inference rules among the textual labels is discovered. These inferences are also likely to be applicable to other integration problems sensitive to naming: e.g., HTML forms, HTML tables or concept hierarchies in the semantic Web. 1

    A Hierarchical Approach to Model Web Query Interfaces for Web Source Integration

    Get PDF
    Much data in the Web is hidden behind Web query interfaces. In most cases the only means to surface the content of a Web database is by formulating complex queries on such interfaces. Applications such as Deep Web crawling and Web database integration require an automatic usage of these interfaces. Therefore, an important problem to be addressed is the automatic extraction of query interfaces into an appropriate model. We hypothesize the existence of a set of domain-independent commonsense design rules that guides the creation of Web query interfaces. These rules transform query interfaces into schema trees. In this paper we describe a Web query interface extraction algorithm, which combines HTML tokens and the geometric layout of these tokens within a Web page. Tokens are classified into several classes out of which the most significant ones are text tokens and field tokens. A tree structure is derived for text tokens using their geometric layout. Another tree structure is derived for the field tokens. The hierarchical representation of a query interface is obtained by iteratively merging these two trees. Thus, we convert the extraction problem into an integration problem. Our experiments show the promise of our algorithm: it outperforms the previous approaches on extracting query interfaces on about 6.5% in accuracy as evaluated over three corpora with more than 500 Deep Web interfaces from 15 different domains
    corecore