
    Augmenting cross-domain knowledge bases using web tables

    Cross-domain knowledge bases are increasingly used for a large variety of applications. As the usefulness of a knowledge base for many of these applications increases with its completeness, augmenting knowledge bases with new knowledge is an important task. A source for this new knowledge could be in the form of web tables, which are relational HTML tables extracted from the Web. This thesis researches data integration methods for cross-domain knowledge base augmentation from web tables. Existing methods have focused on the task of slot filling static data. We research methods that additionally enable augmentation in the form of slot filling time-dependent data and entity expansion. When augmenting knowledge bases using time-dependent web table data, we require time-aware fusion methods. They identify from a set of conflicting web table values the one that is valid given a certain temporal scope. A primary concern of time-aware fusion is therefore the estimation of temporal scope annotations, which web table data lacks. We introduce two time-aware fusion approaches. In the first, we extract timestamps from the table and its context to exploit as temporal scopes, additionally introducing approaches to reduce the sparsity and noisiness of these timestamps. We introduce a second time-aware fusion method that exploits a temporal knowledge base to propagate temporal scopes to web table data, reducing the dependence on noisy and sparse timestamps. Entity expansion enriches a knowledge base with previously unknown long-tail entities. It is a task that to our knowledge has not been researched before. We introduce the Long-Tail Entity Extraction Pipeline, the first system that can perform entity expansion from web table data. The pipeline works by employing identity resolution twice, once to disambiguate between entity occurrences within web tables, and once between entities created from web tables and existing entities in the knowledge base. 
In addition to identifying new long-tail entities, the pipeline also creates their descriptions according to the knowledge base schema. By running the pipeline on a large-scale web table corpus, we profile the potential of web tables for the task of entity expansion. We find that, for certain classes, we can enrich a knowledge base with tens or even hundreds of thousands of new entities and corresponding facts. Finally, we introduce a weak supervision approach for long-tail entity extraction, where supervision in the form of a large number of manually labeled matching and non-matching pairs is substituted with a small set of bold matching rules built using the knowledge base schema. Using this approach, we can reduce the supervision effort required to train our pipeline to enable cross-domain entity expansion at web scale. In the context of this research, we created and published two datasets. The Time-Dependent Ground Truth contains time-dependent knowledge with more than one million temporal facts and corresponding temporal scope annotations. It could potentially be employed for a large variety of tasks that consider the temporal aspect of data. We also built the Web Tables for Long-Tail Entity Extraction gold standard, the first benchmark for the task of entity expansion from web tables.

    Count three for wearable computers

    This paper is a postprint of a paper submitted to and accepted for publication in the Proceedings of the IEE Eurowearable 2003 Conference, and is subject to Institution of Engineering and Technology Copyright. The copy of record is available at the IET Digital Library. A revised version of this paper was also published in Electronics Systems and Software, also subject to Institution of Engineering and Technology Copyright. The copy of record is also available at the IET Digital Library. A description of the 'ubiquitous computer' is presented. Ubiquitous computers imply portable computers embedded into everyday objects, which would replace personal computers. Ubiquitous computers can be mapped into a three-tier scheme, differentiated by processor performance and flexibility of function. The power consumption of mobile devices is one of the most important design considerations. The size of a wearable system is often a design limitation.

    Realising context-sensitive mobile messaging

    Mobile technologies aim to assist people as they move from place to place going about their daily work and social routines. Established and very popular mobile technologies include short-text messages and multimedia messages, with newer, growing technologies including Bluetooth mobile data transfer protocols and mobile web access. Here we present new work which combines all of the above technologies to fulfil some of the predictions for future context-aware messaging. We present a context-sensitive mobile messaging system which derives context in the form of physical locations through location sensing and the co-location of people through Bluetooth familiarity.