5,098 research outputs found
Proceedings of the Workshop Semantic Content Acquisition and Representation (SCAR) 2007
This is the proceedings of the Workshop on Semantic Content Acquisition and Representation, held in conjunction with NODALIDA 2007, on May 24 2007 in Tartu, Estonia.</p
Morphological Disambiguation by Voting Constraints
We present a constraint-based morphological disambiguation system in which
individual constraints vote on matching morphological parses, and
disambiguation of all the tokens in a sentence is performed at the end by
selecting parses that receive the highest votes. This constraint application
paradigm makes the outcome of the disambiguation independent of the rule
sequence, and hence relieves the rule developer from worrying about potentially
conflicting rule sequencing. Our results for disambiguating Turkish indicate
that using about 500 constraint rules and some additional simple statistics, we
can attain a recall of 95-96% and a precision of 94-95% with about 1.01 parses
per token. Our system is implemented in Prolog and we are currently
investigating an efficient implementation based on finite state transducers.Comment: 8 pages, Latex source. To appear in Proceedings of ACL/EACL'97
Compressed postscript also available as
ftp://ftp.cs.bilkent.edu.tr/pub/ko/acl97.ps.
Large-Scale information extraction from textual definitions through deep syntactic and semantic analysis
We present DEFIE, an approach to large-scale Information Extraction (IE) based on a syntactic-semantic analysis of textual definitions. Given a large corpus of definitions we leverage syntactic dependencies to reduce data sparsity, then disambiguate the arguments and content words of the relation strings, and finally exploit the resulting information to organize the acquired relations hierarchically. The output of DEFIE is a high-quality knowledge base consisting of several million automatically acquired semantic relations
The interaction of knowledge sources in word sense disambiguation
Word sense disambiguation (WSD) is a computational linguistics task likely to benefit from the tradition of combining different knowledge sources in artificial in telligence research. An important step in the exploration of this hypothesis is to determine which linguistic knowledge sources are most useful and whether their combination leads to improved results.
We present a sense tagger which uses several knowledge sources. Tested accuracy exceeds 94% on our evaluation corpus.Our system attempts to disambiguate all content words in running text rather than limiting itself to treating a restricted vocabulary of words. It is argued that this approach is more likely to assist the creation of practical systems
Event-based Access to Historical Italian War Memoirs
The progressive digitization of historical archives provides new, often
domain specific, textual resources that report on facts and events which have
happened in the past; among these, memoirs are a very common type of primary
source. In this paper, we present an approach for extracting information from
Italian historical war memoirs and turning it into structured knowledge. This
is based on the semantic notions of events, participants and roles. We evaluate
quantitatively each of the key-steps of our approach and provide a graph-based
representation of the extracted knowledge, which allows to move between a Close
and a Distant Reading of the collection.Comment: 23 pages, 6 figure
- …