
1,135 research outputs found

A combination based on OWA operators for multi-label genre classification of web pages

Author: Jebari Chaker
Publication venue: Sociedad Española para el Procesamiento del Lenguaje Natural
Publication date: 01/01/2015

This paper presents a new method for genre identification that combines homogeneous classifiers using OWA (Ordered Weighted Averaging) operators. Our method uses character n-grams extracted from different information sources such as the URL, title, headings and anchors. To deal with the complexity of web pages, we applied MLKNN as a multi-label classifier, so that a web page can be assigned more than one genre. Experiments conducted on a known multi-label corpus show that our method achieves good results.
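As an illustration of the aggregation step named in this abstract, the following is a minimal sketch of an OWA operator: the inputs are sorted in descending order and then combined with a fixed weight vector. The function name `owa`, the per-source scores and the weight vectors are assumptions made for this example; the abstract does not state which weights the paper uses.

```python
from typing import Sequence


def owa(scores: Sequence[float], weights: Sequence[float]) -> float:
    """Ordered Weighted Averaging: weights apply to ranks, not to sources."""
    if len(scores) != len(weights):
        raise ValueError("scores and weights must have the same length")
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    # Sort the scores from largest to smallest; the i-th weight multiplies
    # the i-th largest score, whichever source produced it.
    ordered = sorted(scores, reverse=True)
    return sum(w * s for w, s in zip(weights, ordered))


# Hypothetical confidence scores for one genre from four classifiers
# (URL, title, headings, anchors); the values are illustrative only.
genre_scores = [0.2, 0.9, 0.6, 0.4]

optimistic = owa(genre_scores, [1.0, 0.0, 0.0, 0.0])   # behaves like max
pessimistic = owa(genre_scores, [0.0, 0.0, 0.0, 1.0])  # behaves like min
average = owa(genre_scores, [0.25, 0.25, 0.25, 0.25])  # arithmetic mean
```

Choosing the weight vector moves the operator between the maximum, the minimum and the plain mean, which is what makes OWA a flexible way to fuse classifier outputs.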

Repositorio Institucional de la Universidad de Alicante

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

From classification to quantification in tweet sentiment analysis

Author: A Esuli
AP Dempster
DJ Hopkins
E Martínez-Cámara
F Wilcoxon
F Zou
G Forman
G King
I Csiszár
I Tsochantaridis
J Barranquero
J Bollen
J Demšar
KP Murphy
M Saerens
PS Dodds
R Alaíz-Rodríguez
R-E Fan
S Burton
S Kiritchenko
T Joachims
T-F Wu
TM Cover
V González-Castro
V Vapnik
W Pan
Z Zhang
Publication venue: Springer Science and Business Media LLC
Publication date: 01/04/2016

Crossref

Institutional Knowledge at Singapore Management University

Reasoning & Querying – State of the Art

Author: Bry François
Furche Tim
Weiand Klara
Publication date: 31/08/2008

Various query languages for Web and Semantic Web data, both for practical use and as an area of research in the scientific community, have emerged in recent years. At the same time, the broad adoption of the internet, where keyword search is used in many applications such as search engines, has familiarized casual users with using keyword queries to retrieve information. Unlike this easy-to-use querying, traditional query languages require knowledge of the language itself as well as of the data to be queried. Keyword-based query languages for XML and RDF bridge the gap between the two, aiming to enable simple querying of semi-structured data, which is relevant, for example, in the context of the emerging Semantic Web. This article presents an overview of the field of keyword querying for XML and RDF.

Open Access LMU

Automatic Synthesis of Regular Expressions from Examples

Author: Alberto Bartoli
Andrea De Lorenzo
Enrico Sorio
Eric Medvet
Giorgio Davanzo
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 01/01/2014

We propose a system for the automatic generation of regular expressions for text-extraction tasks. The user describes the desired task only by means of a set of labeled examples. The generated regexes may be used with common engines such as those that are part of Java, PHP, Perl and so on. Using the system does not require any familiarity with regular expression syntax. We performed an extensive experimental evaluation on 12 different extraction tasks applied to real-world datasets. We obtained very good results in terms of precision and recall, even in comparison to earlier state-of-the-art proposals. Our results are highly promising toward the achievement of a practical surrogate for the specific skills required for generating regular expressions, and significant as a demonstration of what can be achieved with GP-based approaches on modern IT technology.
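To make the evaluation criteria in this abstract concrete, the following sketch scores a candidate regular expression against labeled examples using precision and recall. It is not the authors' GP-based system. It only shows the kind of scoring such a search could rely on. The function name `precision_recall`, the example format (a text paired with the substrings that should be extracted) and the sample data are assumptions made for this illustration.

```python
import re
from typing import List, Tuple

Example = Tuple[str, List[str]]  # (text, substrings that should be extracted)


def precision_recall(pattern: str, examples: List[Example]) -> Tuple[float, float]:
    """Score a candidate regex by exact-match extraction on labeled examples."""
    regex = re.compile(pattern)
    tp = fp = fn = 0
    for text, expected in examples:
        remaining = list(expected)
        for match in regex.finditer(text):
            if match.group(0) in remaining:
                remaining.remove(match.group(0))  # each label is matched once
                tp += 1
            else:
                fp += 1
        fn += len(remaining)  # labels the regex failed to extract
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Hypothetical labeled examples for a phone-extension extraction task.
examples = [
    ("Call 555-1234 or 555-9876", ["555-1234", "555-9876"]),
    ("Room 12, ext 555-0000", ["555-0000"]),
]

good = precision_recall(r"\d{3}-\d{4}", examples)
bad = precision_recall(r"\d+", examples)
```

A candidate that extracts exactly the labeled substrings scores 1.0 on both measures, while an overly general one such as `\d+` is penalized on both.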

Archivio istituzionale della ricerca - Università di Trieste

ReOnto: A Neuro-Symbolic Approach for Biomedical Relation Extraction

Author: Jain Monika
Mutharaju Raghava
Singh Kuldeep
Publication date: 04/09/2023

Relation Extraction (RE) is the task of extracting semantic relationships between entities in a sentence and aligning them to relations defined in a vocabulary, which is generally in the form of a Knowledge Graph (KG) or an ontology. Various approaches have been proposed so far to address this task. However, applying these techniques to biomedical text often yields unsatisfactory results because it is hard to infer relations directly from sentences due to the nature of the biomedical relations. To address these issues, we present a novel technique called ReOnto that makes use of neuro-symbolic knowledge for the RE task. ReOnto employs a graph neural network to acquire the sentence representation and leverages publicly accessible ontologies as prior knowledge to identify the sentential relation between two entities. The approach involves extracting the relation path between the two entities from the ontology. We evaluate the effect of using symbolic knowledge from ontologies with graph neural networks. Experimental results on two public biomedical datasets, BioRel and ADE, show that our method outperforms all the baselines (by approximately 3%). Comment: Accepted in ECML 202

arXiv.org e-Print Archive

Application of fuzzy sets in data-to-text system

Author: Ramos Soto Alejandro
Publication date: 01/01/2016

This PhD dissertation addresses the convergence of two distinct paradigms: fuzzy sets and natural language generation. The object of study is the integration of fuzzy set-derived techniques that model imprecision and uncertainty in human language into systems that generate textual information from numeric data, commonly known as data-to-text systems. The dissertation covers an extensive state-of-the-art review, potential convergence points, two real data-to-text applications that integrate fuzzy sets (in the meteorology and learning analytics domains), and a model that encompasses the most relevant elements of the linguistic description of data discipline and provides a framework for building and integrating fuzzy set-based approaches into natural language generation/data-to-text systems.

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

Repositorio Institucional da Universidade de Santiago de Compostela