Search CORE

2,486 research outputs found

Automatic Thematic Extractor

Author: Birmingham William P.
Meek Colin
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/07/2003
Field of study

We have created a system that identifies musical “keywords” or themes. The system searches for all patterns composed of melodic (intervallic for our purposes) repetition in a piece. This process generally uncovers a large number of patterns, many of which are either uninteresting or only superficially important. Filters reduce the number or prevalence, or both, of such patterns. Patterns are then rated according to perceptually significant characteristics. The top-ranked patterns correspond to important thematic or motivic musical content, as has been verified by comparisons with published musical thematic catalogs. The system operates robustly across a broad range of styles, and relies on no meta-data on its input, allowing it to independently and efficiently catalog multimedia data.Peer Reviewedhttp://deepblue.lib.umich.edu/bitstream/2027.42/46483/1/10844_2004_Article_5122823.pd

Deep Blue Documents at the University of Michigan

GeomRDF: A Geodata Converter with a Fine-Grained Structured Representation of Geometry in the Web

Author: Abadie Nathalie
Bucher Bénédicte
Feliachi Abdelfettah
Hamdi Fayçal
Publication venue
Publication date: 01/01/2014
Field of study

In recent years, with the advent of the web of data, a growing number of national mapping agencies tend to publish their geospatial data as Linked Data. However, differences between traditional GIS data models and Linked Data model can make the publication process more complicated. Besides, it may require, to be done, the setting of several parameters and some expertise in the semantic web technologies. In addition, the use of standards like GeoSPARQL (or ad hoc predicates) is mandatory to perform spatial queries on published geospatial data. In this paper, we present GeomRDF, a tool that helps users to convert spatial data from traditional GIS formats to RDF model easily. It generates geometries represented as GeoSPARQL WKT literal but also as structured geometries that can be exploited by using only the RDF query language, SPARQL. GeomRDF was implemented as a module in the RDF publication platform Datalift. A validation of GeomRDF has been realized against the French administrative units dataset (provided by IGN France).Comment: 12 pages, 2 figures, the 1st International Workshop on Geospatial Linked Data (GeoLD 2014) - SEMANTiCS 201

arXiv.org e-Print Archive

Hal-Diderot

Automatic Extraction of Subcategorization from Corpora

Author: Briscoe Ted
Carroll John
Publication venue
Publication date: 01/01/1997
Field of study

We describe a novel technique and implemented system for constructing a subcategorization dictionary from textual corpora. Each dictionary entry encodes the relative frequency of occurrence of a comprehensive set of subcategorization classes for English. An initial experiment, on a sample of 14 verbs which exhibit multiple complementation patterns, demonstrates that the technique achieves accuracy comparable to previous approaches, which are all limited to a highly restricted set of subcategorization classes. We also demonstrate that a subcategorization dictionary built with the system improves the accuracy of a parser by an appreciable amount.Comment: 8 pages; requires aclap.sty. To appear in ANLP-9

arXiv.org e-Print Archive

CiteSeerX

Crossref

Sussex Research Online

Syn-QG: Syntactic and Shallow Semantic Rules for Question Generation

Author: Dhole Kaustubh D.
Manning Christopher D.
Publication venue
Publication date: 09/06/2021
Field of study

Question Generation (QG) is fundamentally a simple syntactic transformation; however, many aspects of semantics influence what questions are good to form. We implement this observation by developing Syn-QG, a set of transparent syntactic rules leveraging universal dependencies, shallow semantic parsing, lexical resources, and custom rules which transform declarative sentences into question-answer pairs. We utilize PropBank argument descriptions and VerbNet state predicates to incorporate shallow semantic content, which helps generate questions of a descriptive nature and produce inferential and semantically richer questions than existing systems. In order to improve syntactic fluency and eliminate grammatically incorrect questions, we employ back-translation over the output of these syntactic rules. A set of crowd-sourced evaluations shows that our system can generate a larger number of highly grammatical and relevant questions than previous QG systems and that back-translation drastically improves grammaticality at a slight cost of generating irrelevant questions.Comment: Some of the results in the paper were incorrec

arXiv.org e-Print Archive

Generating indicative-informative summaries with SumUM

Author: Benbrahim Mohamed
Guy Lapalme
Horacio Saggion
Jing Hongyan
Johnson Frances C
Jordan Michael P
Radev Dragomir R
Teufel S.
Tombros Anastasios
Publication venue: 'MIT Press - Journals'
Publication date: 01/01/2002
Field of study

We present and evaluate SumUM, a text summarization system that takes a raw technical text as input and produces an indicative informative summary. The indicative part of the summary identifies the topics of the document, and the informative part elaborates on some of these topics according to the reader's interest. SumUM motivates the topics, describes entities, and defines concepts. It is a first step for exploring the issue of dynamic summarization. This is accomplished through a process of shallow syntactic and semantic analysis, concept identification, and text regeneration. Our method was developed through the study of a corpus of abstracts written by professional abstractors. Relying on human judgment, we have evaluated indicativeness, informativeness, and text acceptability of the automatic summaries. The results thus far indicate good performance when compared with other summarization technologies

CiteSeerX

Crossref

White Rose Research Online