
1,548 research outputs found

JRC-Names: Multilingual Entity Name variants and titles as Linked Data

Author: EHRMANN Maud
JACQUET GUILLAUME
STEINBERGER Ralf
Publication venue: 'IOS Press'
Publication date: 30/04/2015

Since 2004 the European Commission’s Joint Research Centre (JRC) has been analysing the online version of printed media in over twenty languages and has automatically recognised and compiled large amounts of named entities (persons and organisations) and their many name variants. The collected variants include not only standard spellings in various countries, languages and scripts, but also frequently found spelling mistakes and lesser-used name forms, all occurring in real-life text (e.g. Benjamin/Binyamin/Bibi/Benyamín/Biniamin/Беньямин/ بنیامین Netanyahu/ Netanjahu/Nétanyahou/Netahnyahu/Нетаньяху/ نتنیاهو ). This entity name variant data, known as JRC-Names, has been available for public download since 2011. In this article, we report on our efforts to render JRC-Names as Linked Data (LD), using lemon, the lexicon model for ontologies. Besides adhering to Semantic Web standards, this new release goes beyond the initial one in that it includes titles found next to the names, as well as the date ranges in which the titles and the name variants were found. It also establishes links to existing datasets, such as DBpedia and Talk-Of-Europe. As a multilingual linguistic linked dataset, JRC-Names can help bridge the gap between structured data and natural languages, thus supporting large-scale data integration (e.g. cross-lingual mapping) and web-based content processing (e.g. entity linking). JRC-Names is publicly available through the dataset catalogue of the European Union’s Open Data Portal.

JRC Publications Repository

Ontology Driven Knowledge Extraction System with Application in e-Government

Author: Dias G. Paiva
Rodrigues M.
Teixeira A
Publication venue: 'Epiarte, s.l.'
Publication date: 01/10/2011

Important sources of information are originally created in natural language. To make that knowledge computer-processable, it is necessary to understand the structure of natural languages, by adding lexical and syntactic information; to have a rich representation, such as ontologies, to encode the knowledge contained in sentences; and to develop algorithms that bridge the gap between natural languages and computer-processable representations. In this paper we present the architecture, modules and results of a prototype that uses an ontology to represent world concepts and their relationships, and also to guide the process of extracting information from natural language documents. The system was tested using minutes of Portuguese municipalities’ meetings. Initial results are presented for three topics of municipalities’ affairs: the subsidies granted, the building permits requested, and the existing protocols with other institutions.

Repositório Institucional da Universidade de Aveiro

Ontology lexicalization: Relationship between content and meaning in the context of Information Retrieval

Author: ALLEMANG D.
BAEZA-YATES R.
BERNERS-LEE T.
BIRD S.
BOND F.
BRIN S.
BRÄSCHER M.
BUITELAAR P.
CASTELLS P.
CIMIANO P.
CONTRERAS J.
DAHLBERG I.
FELLBAUM C.
FERNÁNDEZ M.
GRESSER J. Y.
GUARINO N.
GUHA R.
HEATH T.
HOGAN A.
KARA S.
LESK M.
MAEDCHE A.
McCRAE J.
NARDI D.
NAVIGLI R.
OLIVEIRA H. G.
PAIVA V.
PEDREGOSA F.
POPOV B.
REYMONET A.
ROCHA C.
SILVA F.
SÉRASSET G.
UNGER C.
VALLET D.
WALTER S.
WILKS Y.
Publication venue: 'FapUNIFESP (SciELO)'

Crossref

Medical Informatics

Publication venue: 'IntechOpen'
Publication date: 20/04/2021

Information technology has been revolutionizing the everyday life of the common man, while medical science has been making rapid strides in understanding disease mechanisms, developing diagnostic techniques and effecting successful treatment regimens, even for cases that would have been classified as having a poor prognosis a decade earlier. The confluence of information technology and biomedicine has brought into its ambit additional dimensions of computerized databases for patient conditions, revolutionizing the way health care and patient information is recorded, processed, interpreted and utilized for improving the quality of life. This book consists of seven chapters dealing with three primary issues: medical information acquisition from a patient's and health care professional's perspective, translational approaches from a researcher's point of view, and finally the application potential as required by clinicians and physicians. The book covers modern issues in Information Technology, Bioinformatics Methods and Clinical Applications. The chapters describe the basic process of acquisition of information in a health system, recent technological developments in biomedicine and the realistic evaluation of medical informatics.

Directory of Open Access Books (DOAB)

LexTec - a rich language resource for technical domains in Portuguese

Author: Amaro Raquel
Marrafa Palmira
Mendes Sara
Publication venue: European Language Resources Association
Publication date: 01/01/2014

The growing amount of available information and the importance given to access to technical information enhance the potential role of NLP applications in enabling users to deal with information for a variety of knowledge domains. In this process, language resources are crucial. This paper presents LexTec, a rich computational language resource for technical vocabulary in Portuguese. Encoding a representative set of terms for ten different technical domains, this concept-based relational language resource combines a wide range of linguistic information by integrating each entry in a domain-specific wordnet and associating it with a precise definition for each lexicalization in the technical domain at stake, illustrative texts and information for translation into English.

Universidade de Lisboa: Repositório.UL

Aligning Biomedical Terminologies in French: Towards Semantic Interoperability in Medical Applications

Author: Julien Grosjean
Lina F. Soualmia
Michel Joubert
Stefan J. Darmoni
Tayeb Merabti
Publication venue: 'IntechOpen'
Publication date: 09/03/2012

IntechOpen

The six challenges of the Semantic Web

Author: Benjamins R.
Contreras Jesús
Corcho Oscar
Gómez-Pérez A.
Publication venue: Facultad de Informática (UPM)
Publication date: 01/04/2002

The Semantic Web has attracted a diverse, but significant, community of researchers, institutes and companies, all sharing the belief that one day the Semantic Web will have as big an impact on life as the WWW/Internet currently has. We share that vision, based on the ever-increasing need to reduce information overload and to increase task delegation to software agents. However, there is still a long way to go before the Semantic Web dream comes true. In this paper, we identify some of the major challenges the community faces in the coming years, and we outline solution directions. The major challenges concern: (i) the availability of content, (ii) ontology availability, development and evolution, (iii) scalability, (iv) multilinguality, (v) visualization to reduce information overload, and (vi) stability of Semantic Web languages. We also comment briefly on the economic impact of the Semantic Web.

Archivo Digital UPM

Predicate Matrix: an interoperable lexical knowledge base for predicates

Author: López de Lacalle Maddalen
Publication date: 10/07/2023

183 p. The Predicate Matrix is a new lexical-semantic resource resulting from the integration of multiple knowledge sources, including FrameNet, VerbNet, PropBank and WordNet. The Predicate Matrix provides an extensive and robust lexicon that improves interoperability among the semantic resources mentioned above. The creation of the Predicate Matrix is based on the integration of Semlink and new mappings obtained using automatic methods that link semantic knowledge at the lexical and role levels. We have also extended the Predicate Matrix to cover nominal predicates (English, Spanish) and predicates in other languages (Spanish, Catalan and Basque). As a result, the Predicate Matrix provides a multilingual lexicon that enables interoperable semantic analysis across multiple languages.

Archivo Digital para la Docencia y la Investigación