Taxonomy Induction using Hypernym Subsequences
We propose a novel, semi-supervised approach towards domain taxonomy
induction from an input vocabulary of seed terms. Unlike all previous
approaches, which typically extract direct hypernym edges for terms, our
approach utilizes a novel probabilistic framework to extract hypernym
subsequences. Taxonomy induction from extracted subsequences is cast as an
instance of the minimum-cost flow problem on a carefully designed directed
graph. Through experiments, we demonstrate that our approach outperforms
state-of-the-art taxonomy induction approaches across four languages.
Importantly, we also show that our approach is robust to the presence of noise
in the input vocabulary. To the best of our knowledge, no previous approaches
have been empirically proven to manifest noise-robustness in the input
vocabulary.
Transfer and Multi-Task Learning for Noun-Noun Compound Interpretation
In this paper, we empirically evaluate the utility of transfer and multi-task
learning on a challenging semantic classification task: semantic interpretation
of noun--noun compounds. Through a comprehensive series of experiments and
in-depth error analysis, we show that transfer learning via parameter
initialization and multi-task learning via parameter sharing can help a neural
classification model generalize over a highly skewed distribution of relations.
Further, we demonstrate how dual annotation with two distinct sets of relations
over the same set of compounds can be exploited to improve the overall accuracy
of a neural classifier and its F1 scores on the less frequent, but more
difficult relations.
Comment: EMNLP 2018: Conference on Empirical Methods in Natural Language
Processing (EMNLP)
PowerAqua: fishing the semantic web
The Semantic Web (SW) offers an opportunity to develop novel, sophisticated forms of question answering (QA). Specifically, the availability of distributed semantic markup on a large scale opens the way to QA systems which can make use of such semantic information to provide precise, formally derived answers to questions. At the same time, the distributed, heterogeneous, large-scale nature of the semantic information introduces significant challenges. In this paper we describe the design of PowerAqua, a QA system that exploits semantic markup on the web to provide answers to questions posed in natural language. PowerAqua does not assume that the user has any prior information about the semantic resources. The system takes a natural language query as input and translates it into a set of logical queries, which are then answered by consulting and aggregating information derived from multiple heterogeneous semantic sources.
Using Information Filtering in Web Data Mining Process
Web service-oriented Grid is becoming a standard for achieving loosely coupled distributed computing. Grid services could easily be specified with web-service based interfaces. In this paper we first envisage a realistic Grid market with players such as end-users, brokers and service providers participating co-operatively with the aim of meeting requirements and earning profit. End-users wish to use the functionality of Grid services by paying the minimum possible price, or a price confined within a specified budget. Brokers aim to maximise profit whilst establishing an SLA (Service Level Agreement) and satisfying end-user needs, and at the same time resisting the volatility of service execution time and availability. Service providers aim to develop price models based on end-user or broker demands that will maximise their profit. In this paper we focus on developing stochastic approaches to end-user workflow scheduling that provide QoS guarantees by establishing an SLA. We also develop a novel 2-stage stochastic programming technique that aims at establishing an SLA with end-users regarding satisfying their workflow QoS requirements. We develop a scheduling (workload allocation) technique based on linear programming that embeds the negotiated workflow QoS into the program and models Grid services as generalised queues. This technique is shown to outperform existing scheduling techniques that do not rely on real-time performance information.