95 research outputs found

Evaluating semantic relations by exploring ontologies on the Semantic Web

Author: A. Budanitsky
A. Lozano-Tello
G.A. Miller
H. Alani
J. Euzenat
M. d’Aquin
N. Guarino
P. Cimiano
R.L. Cilibrasi
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2009

We investigate the problem of evaluating the correctness of a semantic relation and propose two methods which explore the increasing number of online ontologies as a source of evidence for predicting correctness. We obtain encouraging results, with some of our measures reaching average precision values of 75%

CiteSeerX

Crossref

Open Research Online (The Open University)

Retrieval, alignment, and clustering of computational models based on semantic annotations

Author: Becker J
Budanitsky A
Edda Klipp
Falko Krause
Fielding R
Henkel R
Jiang J
Liebermeister W
Lin D
Marvin Schulz
Nicolas Le Novère
Resnik P
Salton G
Tohsato Y
van Rijsbergen C
Wolfram Liebermeister
Publication venue: Nature Publishing Group

As the number of computational systems biology models increases, new methods are needed to explore their content and build connections with experimental data. In this Perspective article, the authors propose a flexible semantic framework that can help achieve these aims

Crossref

PubMed Central

Disambiguation of biomedical text using diverse sources of information

Author: A Aronson
A Budanitsky
A Harley
A Ratnaparkhi
B McInnes
C Friedman
C Manning
D McCarthy
D Swanson
D Widdows
David Martinez
E Agirre
G Leroy
H Liu
I Witten
L Humphreys
L Specia
M Joshi
M Schuemie
M Stevenson
M Weeber
Mark Stevenson
N Ide
R Mihalcea
Robert Gaizauskas
S Humphrey
S Nelson
T Mitchell
T Pedersen
Yikun Guo
Publication venue: BioMed Central
Publication date: 01/01/2008

Background: Like text in other domains, biomedical documents contain a range of terms with more than one possible meaning. These ambiguities form a significant obstacle to the automatic processing of biomedical texts. Previous approaches to resolving this problem have made use of various sources of information including linguistic features of the context in which the ambiguous term is used and domain-specific resources, such as UMLS. Materials and methods: We compare various sources of information including ones which have been previously used and a novel one: MeSH terms. Evaluation is carried out using a standard test set (the NLM-WSD corpus). Results: The best performance is obtained using a combination of linguistic features and MeSH terms. Performance of our system exceeds previously published results for systems evaluated using the same data set. Conclusion: Disambiguation of biomedical terms benefits from the use of information from a variety of sources. In particular, MeSH terms have proved to be useful and should be used if available

CiteSeerX

Crossref

Springer - Publisher Connector

PubMed Central

White Rose Research Online

An evaluative baseline for geo-semantic relatedness and similarity

Author: A Ballatore
A Budanitsky
A Lehrer
A Schwering
A Tversky
Andrea Ballatore
C Keßler
C Khoo
D Medin
D Nelson
David C. Wilson
DM Blei
F Ferrara
F Wilcoxon
G Miller
G Strube
H Rubenstein
H Schütze
J Dawes
J LeBreton
J Rodgers
K Janowicz
L Finkelstein
L James
M Bakillah
M Banerjee
M Kendall
M Rodríguez
Michela Bertolotto
P Turney
R Finn
R Goldstone
R Rada
T Kaptchuk
W Robinson
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2014

In geographic information science and semantics, the computation of semantic similarity is widely recognised as key to supporting a vast number of tasks in information integration and retrieval. By contrast, the role of geo-semantic relatedness has been largely ignored. In natural language processing, semantic relatedness is often confused with the more specific semantic similarity. In this article, we discuss a notion of geo-semantic relatedness based on Lehrer’s semantic fields, and we compare it with geo-semantic similarity. We then describe and validate the Geo Relatedness and Similarity Dataset (GeReSiD), a new open dataset designed to evaluate computational measures of geo-semantic relatedness and similarity. This dataset is larger than existing datasets of this kind, and includes 97 geographic terms combined into 50 term pairs rated by 203 human subjects. GeReSiD is available online and can be used as an evaluation baseline to determine empirically to what degree a given computational model approximates geo-semantic relatedness and similarity

arXiv.org e-Print Archive

Crossref

MURAL - Maynooth University Research Archive Library

NUI Maynooth Eprint Archive

Maynooth University ePrints and eTheses Archive

Birkbeck Institutional Research Online

Geotag Propagation with User Trust Modeling

Author: A Budanitsky
A Jøsang
DG Lowe
DH Ballard
G. Koutrika
I Ivanov
J Luo
JM Kleinberg
L. Ahn
P Heymann
P Vajda
RL Cilibrasi
S Marti
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 09/05/2012

The amount of information that people share on social networks is constantly increasing. People also comment, annotate, and tag their own content (videos, photos, notes, etc.), as well as the content of others. In many cases, the content is tagged manually. One way to make this time-consuming manual tagging process more efficient is to propagate tags automatically from a small set of tagged images to the larger set of untagged images. In such a scenario, however, a wrong or spam tag can damage the integrity and reliability of the automated propagation system. Users may make mistakes in tagging, or irrelevant tags and content may be added maliciously for advertisement or self-promotion. Therefore, a mechanism ensuring the trustworthiness of users or published content is needed. In this chapter, we discuss several image retrieval methods based on tags, various approaches to trust modeling and spam protection in social networks, and trust modeling in geotagging systems. We then consider a specific example of an automated geotag propagation system that adopts a user trust model. Tag propagation in images relies on the similarity between image content (famous landmarks) and its context (associated geotags). For each tagged image, similar untagged images are found by robust graph-based object duplicate detection, and the known tags are propagated accordingly. The user trust value is estimated from social feedback provided by the users of the photo-sharing system, and only tags from trusted users are propagated. This approach demonstrates that a practical tagging system benefits significantly from the intelligent combination of an efficient propagation algorithm and a user-centered trust model

Infoscience - École polytechnique fédérale de Lausanne

Crossref

A transversal approach to predict gene product networks from ontology-based similarity

Author: A Budanitsky
A Schlicker
A Singhal
Anita Burgun
C Wolting
D Lin
DS Harris
E Agirre
E Camon
E Levy
EB Camon
F Azuaje
FD Gibbons
FJ Field
G Rigau
G Salton
GO Consortium
H Bedrine-Ferran
H Sun
H Wang
IG Wool
J Chabalier
J Jiang
Jean Mosser
JH Chiang
JM Mariadason
Julie Chabalier
M Gerstein
M Kanehisa
MB Eisen
MD Weiss
ME Brosnan
O Bodenreider
P Joseph
P Khatri
P Resnik
PW Lord
R Baeza-Yates
R Rada
RC Gentleman
T Barrett
T Nakajima
T Yamamoto
TK Jenssen
X Mao
Y Quentin
Publication venue: BioMed Central
Publication date: 01/01/2007

Background: Interpretation of transcriptomic data is usually made through a "standard" approach which consists of clustering the genes according to their expression patterns and exploiting Gene Ontology (GO) annotations within each expression cluster. This approach makes it difficult to highlight functional relationships between gene products that belong to different expression clusters. To address this issue, we propose a transversal analysis that aims to predict functional networks based on a combination of GO processes and expression data. Results: The transversal approach presented in this paper consists of computing the semantic similarity between gene products in a Vector Space Model. Through a weighting scheme over the annotations, we take into account the representativity of the terms that annotate a gene product. Comparing annotation vectors results in a matrix of gene product similarities. Combined with expression data, the matrix is displayed as a set of functional gene networks. The transversal approach was applied to 186 genes related to the enterocyte differentiation stages. This approach resulted in 18 functional networks that proved to be biologically relevant. These results were compared with those obtained through a standard approach and with an approach based on information content similarity. Conclusion: Complementary to the standard approach, the transversal approach offers new insight into the cellular mechanisms and reveals new research hypotheses by combining gene product networks based on semantic similarity with expression data.

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

ReaderBench Learns Dutch: Building a Comprehensive Automated Essay Scoring System for Dutch Language

Author: A Budanitsky
AC Graesser
CD Manning
CE Shannon
DE Powers
DM Blei
DS McNamara
GA Miller
H van der Vliet
H Zijlstra
J Nelson
M Dascalu
MM Bakhtin
R Williams
S Elliot
S Owen
S Trausan-Matu
T Miller
TA van Dijk
TK Landauer
V Gervasi
W Duyck
W Westera
W Wresch
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2017

Automated Essay Scoring has gained wider applicability and usage with the integration of advanced Natural Language Processing techniques, which enable in-depth analyses of discourse in order to capture the specificities of written texts. In this paper, we introduce a novel Automated Essay Scoring method for the Dutch language, built within the ReaderBench framework, which encompasses a wide range of textual complexity indices, as well as an automated segmentation approach. Our method was evaluated on a corpus of 173 technical reports automatically split into sections and subsections, thus forming a hierarchical structure on which textual complexity indices were subsequently applied. The stepwise regression model explained 30.5% of the variance in students’ scores, while a Discriminant Function Analysis predicted with substantial accuracy (75.1%) whether they were high- or low-performing students. This study is part of the RAGE project. The RAGE project has received funding from the European Union’s Horizon 2020 research and innovation programme under grant agreement No 644187. This publication reflects only the author's view. The European Commission is not responsible for any use that may be made of the information it contains

Open University of the Netherlands Research Portal

Crossref

XML document-grammar comparison: related problems and applications

Author: A. Algergawy
A. Balmin
A. Budanitsky
A. Doan
A. Formica
A. Neumann
B. Bouchou
C. Chitic
C. Werner
C.J. van Rijsbergen
C.Y. Chan
D.C. Reis
E. Bertino
E.T. Ray
F. Giunchiglia
G. Lee
G. Salton
G.M. Landau
H. Do
J. Lee
J. Tekli
K. Zhang
M. Murata
P. Resnik
P. Shvaiko
R. Da Luz
R. Rada
R. Schenkel
S. Amer-Yahia
S. Axelsson
S. Nishimura
S.M. Selkow
T. Akatsu
T. Dalamagas
T. Schlieder
W. Lian
Publication venue: 'Walter de Gruyter GmbH'

Crossref

Omiotis: A Thesaurus-Based Measure of Text Relatedness

Author: A. Budanitsky
G. Tsatsaronis
I. Varlamis
R. Navigli
T. Landauer
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2009

In this paper we present a new approach for measuring the relatedness between text segments, based on implicit semantic links between their words, as offered by a word thesaurus, namely WordNet. The approach does not require any type of training, since it exploits only WordNet to devise the implicit semantic links between text words. The paper presents a prototype online demo of the measure, which can provide word-to-word relatedness values, even for words of different parts of speech. In addition, the demo allows for the computation of relatedness between text segments

CiteSeerX

Crossref

Building Semantic Hierarchies Faithful to Image Semantics

Author: A. Budanitsky
J. Deng
K. Barnard
Y. Liu
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2012

This paper proposes a new image-semantic measure, named "Semantico-Visual Relatedness of Concepts" (SVRC), to estimate the semantic similarity between concepts. The proposed measure incorporates visual, conceptual and contextual information to provide a measure which is more meaningful and more representative of image semantics. We also propose a new methodology to automatically build a semantic hierarchy suitable for image annotation and/or classification. The construction is based on the previously proposed measure SVRC and on a new heuristic, named TRUST-ME, which connects concepts with higher relatedness until the final hierarchy is built. The resulting hierarchy explicitly encodes a general-to-specific relationship between concepts and therefore provides a semantic structure that facilitates the semantic interpretation of images. Our experiments showed that using the constructed semantic hierarchies as a hierarchical classification framework provides better image annotation