
    Negative Statements Considered Useful

    Knowledge bases (KBs), pragmatic collections of knowledge about notable entities, are an important asset in applications such as search, question answering and dialogue. Rooted in a long tradition in knowledge representation, all popular KBs only store positive information, while they abstain from taking any stance towards statements not contained in them. In this paper, we make the case for explicitly stating interesting statements which are not true. Negative statements would be important to overcome current limitations of question answering, yet due to their potential abundance, any effort towards compiling them needs a tight coupling with ranking. We introduce two approaches towards compiling negative statements. (i) In peer-based statistical inferences, we compare entities with highly related entities in order to derive potential negative statements, which we then rank using supervised and unsupervised features. (ii) In query-log-based text extraction, we use a pattern-based approach for harvesting search engine query logs. Experimental results show that both approaches hold promising and complementary potential. Along with this paper, we publish the first datasets on interesting negative information, containing over 1.1M statements for 100K popular Wikidata entities.
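
    The peer-based idea in (i) can be illustrated with a minimal sketch. It is not the paper's implementation: the toy knowledge base, the entity names, and the function `candidate_negatives` are hypothetical, and ranking by peer frequency alone is an assumed stand-in for the supervised and unsupervised features the abstract mentions.

```python
from collections import Counter

def candidate_negatives(target, peers, kb):
    """Return statements held by peers but absent from the target,
    ranked by the fraction of peers that hold them (an assumed score)."""
    if not peers:
        return []
    known = kb.get(target, set())
    counts = Counter()
    for peer in peers:
        # Statements the peer has and the target lacks are candidates.
        for stmt in kb.get(peer, set()) - known:
            counts[stmt] += 1
    # Sort by descending count, then by statement for a stable order.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [(stmt, count / len(peers)) for stmt, count in ranked]

# Hypothetical toy KB: entity -> set of (property, value) statements.
kb = {
    "physicist_a": {("occupation", "physicist")},
    "physicist_b": {("occupation", "physicist"), ("award", "Nobel Prize")},
    "physicist_c": {("occupation", "physicist"), ("award", "Nobel Prize"),
                    ("member_of", "Royal Society")},
}
result = candidate_negatives("physicist_a", ["physicist_b", "physicist_c"], kb)
```

    Here the statement shared by both peers ranks above the one held by a single peer, which reflects the intuition that a negative statement is more interesting when most comparable entities have the corresponding positive one.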

    Forecasting with time series imaging

    Feature-based time series representations have attracted substantial attention in a wide range of time series analysis methods. Recently, the use of time series features for forecast model averaging has been an emerging research focus in the forecasting community. Nonetheless, most of the existing approaches depend on the manual choice of an appropriate set of features. Exploiting machine learning methods to extract features from time series automatically has become crucial in state-of-the-art time series analysis. In this paper, we introduce an automated approach to extract time series features based on time series imaging. We first transform time series into recurrence plots, from which local features can be extracted using computer vision algorithms. The extracted features are used for forecast model averaging. Our experiments show that forecasting based on automatically extracted features, with less human intervention and a more comprehensive view of the raw time series data, performs comparably to the best methods on the largest forecasting competition dataset (M4) and outperforms the top methods on the Tourism forecasting competition dataset.
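
    The first step, turning a series into a recurrence plot, can be sketched as follows. This is a simplified illustration and not the authors' code: it assumes a thresholded plot with embedding dimension 1, and the function name `recurrence_plot` and the threshold `eps` are choices made here for illustration.

```python
def recurrence_plot(series, eps):
    """Binary recurrence matrix: entry (i, j) is 1 when the values at
    times i and j lie within eps of each other, else 0."""
    return [[1 if abs(a - b) <= eps else 0 for b in series] for a in series]

# A toy alternating series: equal values recur every second step.
series = [0.0, 1.0, 0.0, 1.0]
rp = recurrence_plot(series, eps=0.1)
for row in rp:
    print(row)
```

    The resulting matrix can be treated as an image, so periodic structure in the series shows up as a regular visual pattern that image feature extractors can pick up.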

    BlogForever D2.4: Weblog spider prototype and associated methodology

    The purpose of this document is to present the evaluation of different solutions for capturing blogs and the established methodology, and to describe the developed blog spider prototype.

    Fuzzy Content Mining for Targeted Advertisement

    Content-targeted advertising is becoming an increasingly important funding source for free web services. Highly efficient content analysis is the key to such a system. This project aims to establish a content analysis engine involving fuzzy logic that is able to automatically analyze real user-posted Web documents such as blog entries. Based on the analysis result, the system matches and retrieves the most appropriate Web advertisements. The focus and complexity lie in how to better estimate and acquire the keywords that represent a given Web document. The fuzzy Web mining concept is applied to consider multiple factors of Web content together. A Fuzzy Ranking System is established based on certain fuzzy (and some crisp) rules, fuzzy sets, and membership functions to get the best candidate keywords. Once it has obtained the keywords, the system retrieves corresponding advertisements from certain providers through Web services as matched advertisements, similarly to retrieving a products list from Amazon.com. In 87% of the cases, the results of this system can match the accuracy of the Google Adwords system. Furthermore, this expandable system will also be a solid base for further research and development on this topic.
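
    How fuzzy and crisp rules might combine to rank candidate keywords can be sketched as below. The abstract does not specify the rules or factors, so everything here is an assumption for illustration: the factors (term frequency, presence in the title), the membership breakpoints, and the names `ramp` and `keyword_score` are all hypothetical.

```python
def ramp(x, lo, hi):
    """Membership that rises linearly from 0 at lo to 1 at hi."""
    if x <= lo:
        return 0.0
    if x >= hi:
        return 1.0
    return (x - lo) / (hi - lo)

def keyword_score(freq, in_title):
    # Fuzzy set "frequent": assumed to start at 1 occurrence, saturate at 5.
    frequent = ramp(freq, 1, 5)
    # Crisp rule: a term appearing in the title counts as fully prominent.
    prominent = 1.0 if in_title else 0.0
    # Fuzzy OR (max) of the two rule strengths.
    return max(frequent, prominent)

# Hypothetical candidates: term -> (frequency in document, appears in title).
candidates = {"fuzzy": (4, False), "mining": (2, True), "the": (1, False)}
scores = {term: keyword_score(f, t) for term, (f, t) in candidates.items()}
ranked = sorted(scores, key=lambda term: -scores[term])
```

    With these assumed rules, a title term outranks a merely frequent one, and a term seen only once outside the title scores zero.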