Submodularity in Batch Active Learning and Survey Problems on Gaussian Random Fields

Authors: Garnett Roman; Ma Yifei; Schneider Jeff
Publication date: 17/09/2012

Many real-world datasets can be represented as a graph whose edge weights designate similarities between instances. A discrete Gaussian random field (GRF) model is a finite-dimensional Gaussian process (GP) whose prior covariance is the inverse of a graph Laplacian. Minimizing the trace of the predictive covariance Sigma (V-optimality) on GRFs has proven successful in batch active learning classification problems with budget constraints. However, a worst-case bound for it has been missing. We show that V-optimality on GRFs, as a function of the batch query set, is submodular, and hence its greedy selection algorithm guarantees a (1-1/e) approximation ratio. Moreover, GRF models satisfy the absence-of-suppressor (AofS) condition. For active survey problems, we propose a similar survey criterion that minimizes 1'(Sigma)1. In practice, the V-optimality criterion performs better than GPs with mutual information gain criteria and allows nonuniform costs for different nodes.
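The greedy selection mentioned in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the regularizer `delta` (added so the Laplacian is invertible), the optional observation `noise`, and all function names are assumptions made for illustration. Each step picks the node whose observation most reduces the trace of the predictive covariance, then applies the standard Gaussian rank-one conditioning update.

```python
def invert(matrix):
    """Invert a small square matrix by Gauss-Jordan elimination with partial pivoting."""
    n = len(matrix)
    aug = [[float(v) for v in row] + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def grf_prior_covariance(weights, delta=0.01):
    """Prior covariance (L + delta*I)^-1, with L the Laplacian of the weighted graph.

    delta is an assumed regularizer; the abstract does not specify one.
    """
    n = len(weights)
    precision = [[-float(weights[i][j]) for j in range(n)] for i in range(n)]
    for i in range(n):
        degree = sum(weights[i][j] for j in range(n) if j != i)
        precision[i][i] = degree + delta
    return invert(precision)


def trace(cov):
    return sum(cov[i][i] for i in range(len(cov)))


def greedy_v_optimality(cov, budget, noise=0.0):
    """Greedily choose nodes that most reduce the trace of the predictive covariance."""
    n = len(cov)
    cov = [row[:] for row in cov]  # work on a copy
    chosen = []
    for _ in range(min(budget, n)):
        best, best_gain = None, float("-inf")
        for i in range(n):
            if i in chosen:
                continue
            # Trace reduction from observing node i: sum_j cov[j][i]^2 / (cov[i][i] + noise)
            gain = sum(cov[j][i] ** 2 for j in range(n)) / (cov[i][i] + noise)
            if gain > best_gain:
                best, best_gain = i, gain
        chosen.append(best)
        denom = cov[best][best] + noise
        column = [cov[j][best] for j in range(n)]
        # Rank-one update: condition the Gaussian on the chosen node
        cov = [[cov[r][c] - column[r] * column[c] / denom for c in range(n)]
               for r in range(n)]
    return chosen, cov


# Toy example: a path graph 0-1-2-3-4 with unit edge weights.
weights = [[1 if abs(i - j) == 1 else 0 for j in range(5)] for i in range(5)]
prior = grf_prior_covariance(weights, delta=0.01)
chosen, posterior = greedy_v_optimality(prior, budget=2)
print(chosen, trace(prior), trace(posterior))
```

On this toy path graph the first pick is the central node, and the trace shrinks with every query. The sketch treats all nodes as equally costly; the nonuniform costs mentioned in the abstract are not modeled here.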

arXiv.org e-Print Archive

CiteSeerX

Zur Lexikon-Grammatik-Schnittstelle in einem hypermedialen Informationssystem [On the lexicon-grammar interface in a hypermedia information system]

Author: Schneider Roman
Publication venue: Tübingen : Narr
Publication date: 11/08/2015

This contribution describes the design and implementation of the link between lexical databases and the grammatical information system grammis, which has been under development at the Institut für deutsche Sprache (IDS) since mid-1993. The project investigates how grammatical knowledge can be presented clearly and conveyed comprehensibly using modern computer technology.

Publikationsserver des Instituts für Deutsche Sprache

Editorial

Authors: František Turnovec; Martin Gregor; Ondřej Schneider; Roman Horvath

Research Papers in Economics

Using a domain ontology for the semantic-statistical classification of specialist hypertexts

Authors: Bubenhofer Noah; Schneider Roman
Publication date: 17/08/2015

In this feasibility study we aim to contribute to the practical use of domain ontologies for hypertext classification by introducing an algorithm that generates potential keywords. The algorithm uses structural markup information and lemmatized word lists as well as a domain ontology on linguistics. We present the calculation and ranking of keyword candidates based on ontology relationships, word position, frequency information, and statistical significance as evidenced by log-likelihood tests. Finally, the results of our machine-driven classification are validated empirically against manually assigned keywords.
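The statistical-significance component of such a ranking can be sketched as follows. This is an assumption-laden illustration, not the authors' algorithm: it uses the common two-corpus log-likelihood (G2) statistic, and the function names, the toy counts, and the choice to keep only overrepresented words are all hypothetical. Ontology relationships and word position, which the abstract also mentions, are not modeled.

```python
import math


def log_likelihood(freq_doc, freq_ref, size_doc, size_ref):
    """Two-corpus log-likelihood (G2) for one word.

    freq_doc / freq_ref: occurrences in the document and in a reference corpus.
    size_doc / size_ref: total token counts of the document and the reference corpus.
    """
    total = freq_doc + freq_ref
    expected_doc = size_doc * total / (size_doc + size_ref)
    expected_ref = size_ref * total / (size_doc + size_ref)
    g2 = 0.0
    if freq_doc > 0:
        g2 += freq_doc * math.log(freq_doc / expected_doc)
    if freq_ref > 0:
        g2 += freq_ref * math.log(freq_ref / expected_ref)
    return 2.0 * g2


def rank_candidates(doc_counts, ref_counts, size_doc, size_ref):
    """Rank words that are overrepresented in the document by descending G2."""
    scored = []
    for word, freq_doc in doc_counts.items():
        freq_ref = ref_counts.get(word, 0)
        # Keep only words relatively more frequent in the document than in the reference
        if freq_doc / size_doc > freq_ref / size_ref:
            scored.append((word, log_likelihood(freq_doc, freq_ref, size_doc, size_ref)))
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical counts, for illustration only.
doc_counts = {"ontology": 12, "the": 60, "keyword": 5}
ref_counts = {"ontology": 3, "the": 6000, "keyword": 40}
ranked = rank_candidates(doc_counts, ref_counts, size_doc=1000, size_ref=100000)
for word, score in ranked:
    print(word, round(score, 2))
```

A word with the same relative frequency in both corpora scores zero, so function words such as "the" drop out, while domain terms that are rare in the reference corpus rise to the top.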

Publikationsserver des Instituts für Deutsche Sprache

Empirische Verortung konzeptioneller Nähe/Mündlichkeit inner- und außerhalb schriftsprachlicher Korpora [Empirically locating conceptual immediacy/orality inside and outside written-language corpora]

Authors: Broll Sarah; Schneider Roman
Publication venue: German Society for Computational Linguistics and Language Technology (GSCL)
Publication date: 15/05/2023

Linguistic studies often distinguish between spoken and written language, or between communication of immediacy and communication of distance. Assuming a continuum between these poles lends itself to locating a wide variety of utterance forms, including unconventional text types such as pop songs. We design, implement, and evaluate an automated procedure that uses uncorrelated decision trees to make such predictions at the text level. To identify the poles, we define a catalogue of features made up of linguistic phenomena that are discussed as markers of immediacy/orality or distance/literacy, and apply it to prototypical immediacy/orality texts and prototypical distance/written texts. Building on the very good classification quality, we then locate a range of further text types using the trained classifiers. Pop songs emerge as a "middle text type" that combines linguistically motivated features from different levels of the continuum. We further show that our models place orally communicated utterances that were written down beforehand or afterwards, such as speeches or interviews, entirely differently from prototypical conversational data, and we uncover classification differences for social media variants. The aim is not a systematic, binding placement on the continuum, but an empirical approach to the question of which features that are comparatively easy to determine by machine ("shallow features") demonstrably influence the placement.

Journal for Language Technology and Computational Linguistics (JLCL)

Conjunctive query inseparability of OWL 2 QL TBoxes

Authors: Konev B.; Kontchakov Roman; Michel L.; Schneider T.; Wolter F.; Zakharyaschev Michael
Publication venue: AAAI Press
Publication date: 01/01/2011

The OWL 2 profile OWL 2 QL, based on the DL-Lite family of description logics, is emerging as a major language for developing new ontologies and approximating existing ones. Its main application is ontology-based data access, where ontologies are used to provide background knowledge for answering queries over data. We investigate the corresponding notion of query inseparability (or equivalence) for OWL 2 QL ontologies and show that deciding query inseparability is PSpace-hard and in ExpTime. We give polynomial-time (incomplete) algorithms and demonstrate by experiments that they can be used for practical module extraction.

Birkbeck Institutional Research Online

Association for the Advancement of Artificial Intelligence: AAAI Publications

Module extraction via query inseparability in OWL 2 QL

Authors: Konev B.; Kontchakov Roman; Ludwig M.; Schneider T.; Wolter F.; Zakharyaschev Michael
Publication venue: CEUR Workshop Proceedings
Publication date: 01/01/2011

We show that deciding conjunctive query inseparability for OWL 2 QL ontologies is PSpace-hard and in ExpTime. We give polynomial-time (incomplete) algorithms and demonstrate by experiments that they can be used for practical module extraction.

Birkbeck Institutional Research Online