Automatic Thematic Extractor
We have created a system that identifies musical “keywords” or themes. The system searches for all patterns composed of melodic (intervallic for our purposes) repetition in a piece. This process generally uncovers a large number of patterns, many of which are either uninteresting or only superficially important. Filters reduce the number or prevalence, or both, of such patterns. Patterns are then rated according to perceptually significant characteristics. The top-ranked patterns correspond to important thematic or motivic musical content, as has been verified by comparisons with published musical thematic catalogs. The system operates robustly across a broad range of styles, and relies on no meta-data on its input, allowing it to independently and efficiently catalog multimedia data.
Peer Reviewed
http://deepblue.lib.umich.edu/bitstream/2027.42/46483/1/10844_2004_Article_5122823.pd
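To make the idea of intervallic repetition in the Automatic Thematic Extractor abstract concrete, here is a minimal Python sketch. It is not the authors' system: the function names, the MIDI-number pitch encoding, and the pattern-length bounds are all assumptions made for illustration, and the filtering and perceptual rating stages described in the abstract are left out.

```python
from collections import Counter


def intervals(pitches):
    """Return the successive pitch differences (in semitones) of a melody."""
    return [b - a for a, b in zip(pitches, pitches[1:])]


def repeated_interval_patterns(pitches, min_len=2, max_len=4):
    """Count interval n-grams and keep those that occur at least twice.

    Hypothetical helper: pitches are assumed to be MIDI note numbers, and
    overlapping occurrences are counted separately.
    """
    ivs = intervals(pitches)
    counts = Counter()
    for n in range(min_len, max_len + 1):
        for i in range(len(ivs) - n + 1):
            counts[tuple(ivs[i:i + n])] += 1
    return {pattern: c for pattern, c in counts.items() if c >= 2}


# C-D-E followed by F-G-A: the same contour at a different pitch level.
melody = [60, 62, 64, 65, 67, 69]
patterns = repeated_interval_patterns(melody)
```

Because the patterns are built from intervals rather than absolute pitches, the transposed restatement F-G-A is matched with C-D-E; both yield the interval pair `(2, 2)`.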
GeomRDF: A Geodata Converter with a Fine-Grained Structured Representation of Geometry in the Web
In recent years, with the advent of the web of data, a growing number of
national mapping agencies tend to publish their geospatial data as Linked Data.
However, differences between traditional GIS data models and the Linked Data
model can make the publication process more complicated. Besides, publication
may require setting several parameters and some expertise in semantic
web technologies. In addition, the use of standards like GeoSPARQL (or ad hoc
predicates) is mandatory to perform spatial queries on published geospatial
data. In this paper, we present GeomRDF, a tool that helps users easily convert
spatial data from traditional GIS formats to the RDF model. It generates
geometries represented both as GeoSPARQL WKT literals and as structured
geometries that can be exploited using only the RDF query language, SPARQL.
GeomRDF was implemented as a module in the RDF publication platform Datalift.
GeomRDF was validated against the French administrative units
dataset (provided by IGN France).
Comment: 12 pages, 2 figures, the 1st International Workshop on Geospatial
Linked Data (GeoLD 2014) - SEMANTiCS 201
Automatic Extraction of Subcategorization from Corpora
We describe a novel technique and implemented system for constructing a
subcategorization dictionary from textual corpora. Each dictionary entry
encodes the relative frequency of occurrence of a comprehensive set of
subcategorization classes for English. An initial experiment, on a sample of 14
verbs which exhibit multiple complementation patterns, demonstrates that the
technique achieves accuracy comparable to previous approaches, which are all
limited to a highly restricted set of subcategorization classes. We also
demonstrate that a subcategorization dictionary built with the system improves
the accuracy of a parser by an appreciable amount.
Comment: 8 pages; requires aclap.sty. To appear in ANLP-9
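As an illustration of what a dictionary entry encoding relative frequencies of subcategorization classes might look like, the following Python sketch builds such entries from already-classified observations. The function name, the input format, and the frame labels (`NP_NP`, `NP_PP`) are hypothetical; the abstract does not specify the system's class inventory or how frames are extracted from the corpus.

```python
from collections import Counter, defaultdict


def build_subcat_dictionary(observations):
    """Map each verb to the relative frequency of its observed frames.

    `observations` is assumed to be an iterable of (verb, frame) pairs that
    some earlier step has already extracted from a parsed corpus.
    """
    counts = defaultdict(Counter)
    for verb, frame in observations:
        counts[verb][frame] += 1

    dictionary = {}
    for verb, frame_counts in counts.items():
        total = sum(frame_counts.values())
        dictionary[verb] = {f: n / total for f, n in frame_counts.items()}
    return dictionary


# Hypothetical observations for one verb with two complementation patterns.
observed = [
    ("give", "NP_NP"),
    ("give", "NP_PP"),
    ("give", "NP_NP"),
    ("give", "NP_NP"),
]
entries = build_subcat_dictionary(observed)
```

Storing relative frequencies rather than raw counts makes entries comparable across verbs of different corpus frequency, which is what a parser would need in order to weight competing analyses.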
Syn-QG: Syntactic and Shallow Semantic Rules for Question Generation
Question Generation (QG) is fundamentally a simple syntactic transformation;
however, many aspects of semantics influence what questions are good to form.
We implement this observation by developing Syn-QG, a set of transparent
syntactic rules leveraging universal dependencies, shallow semantic parsing,
lexical resources, and custom rules which transform declarative sentences into
question-answer pairs. We utilize PropBank argument descriptions and VerbNet
state predicates to incorporate shallow semantic content, which helps generate
questions of a descriptive nature and produce inferential and semantically
richer questions than existing systems. In order to improve syntactic fluency
and eliminate grammatically incorrect questions, we employ back-translation
over the output of these syntactic rules. A set of crowd-sourced evaluations
shows that our system can generate a larger number of highly grammatical and
relevant questions than previous QG systems and that back-translation
drastically improves grammaticality at a slight cost of generating irrelevant
questions.
Comment: Some of the results in the paper were incorrec
Generating indicative-informative summaries with SumUM
We present and evaluate SumUM, a text summarization system that takes a raw technical text as input and produces an indicative-informative summary. The indicative part of the summary identifies the topics of the document, and the informative part elaborates on some of these topics according to the reader's interest. SumUM motivates the topics, describes entities, and defines concepts. It is a first step for exploring the issue of dynamic summarization. This is accomplished through a process of shallow syntactic and semantic analysis, concept identification, and text regeneration. Our method was developed through the study of a corpus of abstracts written by professional abstractors. Relying on human judgment, we have evaluated indicativeness, informativeness, and text acceptability of the automatic summaries. The results thus far indicate good performance when compared with other summarization technologies.