2,258 research outputs found
Natural language processing
Beginning with the basic issues of NLP, this chapter aims to chart the major research activities in this area since the last ARIST chapter in 1996 (Haas, 1996), including: (i) natural language text processing systems - text summarization, information extraction, information retrieval, etc., including domain-specific applications; (ii) natural language interfaces; (iii) NLP in the context of the WWW and digital libraries; and (iv) evaluation of NLP systems.
Proceedings of the Workshop Semantic Content Acquisition and Representation (SCAR) 2007
These are the proceedings of the Workshop on Semantic Content Acquisition and Representation, held in conjunction with NODALIDA 2007 on May 24, 2007, in Tartu, Estonia.
G-Asks: An Intelligent Automatic Question Generation System for Academic Writing Support
Many electronic feedback systems have been proposed for writing support. However, most of these systems only aim at supporting writing to communicate instead of writing to learn, as in the case of literature review writing. Trigger questions are potentially forms of support for writing to learn, but current automatic question generation approaches focus on factual question generation for reading comprehension or vocabulary assessment. This article presents a novel Automatic Question Generation (AQG) system, called G-Asks, which generates specific trigger questions as a form of support for students' learning through writing. We conducted a large-scale case study, including 24 human supervisors and 33 research students, in an Engineering Research Method course at The University of Sydney and compared questions generated by G-Asks with human-generated questions. The results indicate that G-Asks can generate questions as useful as those of human supervisors ('useful' is one of five question quality measures) while significantly outperforming Human Peer and Generic Questions in most quality measures after filtering out questions with grammatical and semantic errors. Furthermore, we identified the most frequent question types, derived from the human supervisors' questions, and discussed how the human supervisors generate such questions from the source text.
TEI and LMF crosswalks
The present paper explores various arguments in favour of making the Text Encoding Initiative (TEI) guidelines an appropriate serialisation for ISO standard 24613:2008 (LMF, Lexical Markup Framework). It also identifies the issues that would have to be resolved in order to reach an appropriate implementation of these ideas, in particular in terms of informational coverage. We show how the customisation facilities offered by the TEI guidelines can provide an adequate background, not only to cover missing components within the current Dictionary chapter of the TEI guidelines, but also to allow specific lexical projects to deal with local constraints. We expect this proposal to be a basis for a future ISO project in the context of the ongoing revision of LMF.
Learning Domain-Specific Word Embeddings from Sparse Cybersecurity Texts
Word embedding is a Natural Language Processing (NLP) technique that automatically maps words from a vocabulary to vectors of real numbers in an embedding space. It has been widely used in recent years to boost the performance of a variety of NLP tasks such as Named Entity Recognition, Syntactic Parsing and Sentiment Analysis. Classic word embedding methods such as Word2Vec and GloVe work well when they are given a large text corpus. When the input texts are sparse, as in many specialized domains (e.g., cybersecurity), these methods often fail to produce high-quality vectors. In this paper, we describe a novel method to train domain-specific word embeddings from sparse texts. In addition to domain texts, our method also leverages diverse types of domain knowledge such as domain vocabulary and semantic relations. Specifically, we first propose a general framework to encode diverse types of domain knowledge as text annotations. Then we develop a novel Word Annotation Embedding (WAE) algorithm to incorporate diverse types of text annotations in word embedding. We have evaluated our method on two cybersecurity text corpora: a malware description corpus and a Common Vulnerability and Exposure (CVE) corpus. Our evaluation results have demonstrated the effectiveness of our method in learning domain-specific word embeddings.
The REVERE project: Experiments with the application of probabilistic NLP to systems engineering
Despite natural language’s well-documented shortcomings as a medium for precise technical description, its use in software-intensive systems engineering remains inescapable. This poses many problems for engineers who must derive problem understanding and synthesise precise solution descriptions from free text. This is true both for the largely unstructured textual descriptions from which system requirements are derived, and for more formal documents, such as standards, which impose requirements on system development processes. This paper describes experiments that we have carried out in the REVERE project to investigate the use of probabilistic natural language processing techniques to provide systems engineering support.
The syntax of manner quotative constructions in English and Dutch
This paper proposes an account of some properties of the manner quotative constructions be like [Quote] in English and hebben (zo)iets van [Quote] in Dutch. We make two main claims about these constructions. First, in the spirit of Rothstein’s (1999) proposal for adjectival predicates of copula be, we propose that eventive direct speech interpretations of these quotatives are derived via a coercion mechanism akin to those that make count readings out of mass nouns in the nominal domain. Second, adapting a proposal for be like originally made by Kayne (2007), we propose that some exceptional syntactic properties of be like as a quote introducer in English are explained by the presence of a silent something quantifier, which takes a like-headed PP as its complement. We compare English be like quotatives with innovative (zo)iets van quotative constructions in Dutch, which contain an overt something quantifier and behave similarly.