10 research outputs found

A resource-saving collective approach to biomedical semantic role labeling

Author: Po-Ting Lai
Richard Tzong-Han Tsai
Publication venue: Springer Nature
Publication date: 01/01/2014

BACKGROUND: Biomedical semantic role labeling (BioSRL) is a natural language processing technique that identifies the semantic roles of the words or phrases in sentences describing biological processes and expresses them as predicate-argument structures (PAS’s). Currently, a major problem of BioSRL is that most systems label every node in a full parse tree independently; however, some nodes always exhibit dependency. In general SRL, collective approaches based on the Markov logic network (MLN) model have been successful in dealing with this problem. However, in BioSRL such an approach has not been attempted because it would require more training data to recognize the more specialized and diverse terms found in biomedical literature, increasing training time and computational complexity. RESULTS: We first constructed a collective BioSRL system based on MLN. This system, called collective BIOSMILE (CBIOSMILE), is trained on the BioProp corpus. To reduce the resources used in BioSRL training, we employ a tree-pruning filter to remove unlikely nodes from the parse tree and four argument candidate identifiers to retain candidate nodes in the tree. Nodes not recognized by any candidate identifier are discarded. The pruned annotated parse trees are used to train a resource-saving MLN-based system, which is referred to as resource-saving collective BIOSMILE (RCBIOSMILE). Our experimental results show that our proposed CBIOSMILE system outperforms BIOSMILE, which is the top BioSRL system. Furthermore, our proposed RCBIOSMILE maintains the same level of accuracy as CBIOSMILE using 92% less memory and 57% less training time. CONCLUSIONS: This greatly improved efficiency makes RCBIOSMILE potentially suitable for training on much larger BioSRL corpora over more biomedical domains. Compared to real-world biomedical corpora, BioProp is relatively small, containing only 445 MEDLINE abstracts and 30 event triggers. It is not large enough for practical applications, such as pathway construction. We consider it of primary importance to pursue SRL training on large corpora in the future.
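
The candidate-retention step described in the abstract (keep a parse-tree node only if at least one candidate identifier recognizes it) can be illustrated with a minimal sketch. The `Node` class, the example sentence, and the two identifier functions are hypothetical stand-ins: the abstract says four identifiers are used but does not name them, and the separate tree-pruning filter is not modeled here.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Node:
    label: str                                  # constituent label, e.g. "NP"
    text: str                                   # surface text covered by the node
    children: List["Node"] = field(default_factory=list)


# Hypothetical identifiers; the paper's four identifiers are not given in the abstract.
def is_noun_phrase(node: Node) -> bool:
    return node.label == "NP"


def is_prep_phrase(node: Node) -> bool:
    return node.label == "PP"


IDENTIFIERS: List[Callable[[Node], bool]] = [is_noun_phrase, is_prep_phrase]


def candidate_nodes(root: Node) -> List[Node]:
    """Return nodes accepted by at least one identifier; all others are discarded."""
    kept = []
    stack = [root]
    while stack:
        node = stack.pop()
        if any(identify(node) for identify in IDENTIFIERS):
            kept.append(node)
        # Push children reversed so they are visited left to right.
        stack.extend(reversed(node.children))
    return kept


# Toy parse tree (invented example sentence).
tree = Node("S", "IL-2 activates STAT5 in T cells", [
    Node("NP", "IL-2"),
    Node("VP", "activates STAT5 in T cells", [
        Node("VBZ", "activates"),
        Node("NP", "STAT5"),
        Node("PP", "in T cells", [Node("IN", "in"), Node("NP", "T cells")]),
    ]),
])
kept = candidate_nodes(tree)
```

In this toy tree, four of the eight nodes survive, so a downstream learner would see half as many candidates; the savings reported in the abstract come from the authors' own filter and identifiers, not from this sketch.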

Springer - Publisher Connector

PubMed Central

Improving Data Quality by Leveraging Statistical Relational Learning

Author: Akbik A
Kaul Manohar
Markl V
Rabl T
Visengeriyeva L
Publication date: 01/01/2016

Digitally collected data suffers from many data quality issues, such as duplicate, incorrect, or incomplete data. A common approach for counteracting these issues is to formulate a set of data cleaning rules to identify and repair incorrect, duplicate and missing data. Data cleaning systems must be able to treat data quality rules holistically, to incorporate heterogeneous constraints within a single routine, and to automate data curation. We propose an approach to data cleaning based on statistical relational learning (SRL). We argue that a formalism - Markov logic - is a natural fit for modeling data quality rules. Our approach allows for the usage of probabilistic joint inference over interleaved data cleaning rules to improve data quality. Furthermore, it obliterates the need to specify the order of rule execution. We describe how data quality rules expressed as formulas in first-order logic directly translate into the predictive model in our SRL framework.
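
As a rough illustration of how weighted first-order rules can score candidate repairs jointly, the sketch below uses the standard Markov logic form, in which a world's unnormalized log-score is the weighted count of satisfied rule groundings. The records, the two rules, their weights, and all function names are assumptions made for this example and are not taken from the paper; only two candidate worlds are enumerated, which is not a substitute for real inference.

```python
import math
from itertools import combinations

# Hypothetical observed table of (zip_code, city); the third city has a typo.
observed = [("10115", "Berlin"), ("10115", "Berlin"), ("10115", "Berlni")]


def n_fd(world):
    """Count satisfied groundings of the soft rule: same zip => same city."""
    return sum(1 for a, b in combinations(world, 2) if a[0] != b[0] or a[1] == b[1])


def n_unchanged(world, observed):
    """Count records left as observed (a soft preference for minimal repair)."""
    return sum(1 for w, o in zip(world, observed) if w == o)


def log_score(world, observed, w_fd=2.0, w_keep=1.0):
    """Unnormalized log-probability: sum over rules of weight * satisfied groundings."""
    return w_fd * n_fd(world) + w_keep * n_unchanged(world, observed)


# Two candidate worlds: leave the data alone, or repair the typo.
candidates = {
    "no_repair": list(observed),
    "repair_city": [("10115", "Berlin")] * 3,
}

scores = {name: log_score(world, observed) for name, world in candidates.items()}
z = sum(math.exp(s) for s in scores.values())          # normalizer over the candidates
probs = {name: math.exp(s) / z for name, s in scores.items()}
best = max(probs, key=probs.get)
```

Both rules contribute to a single score, so no execution order has to be chosen between them, which is the property the abstract highlights.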

Research Archive of Indian Institute of Technology Hyderabad

Transforming Graph Representations for Statistical Relational Learning

Author: Aha David W.
McDowell Luke K.
Neville Jennifer
Rossi Ryan A.
Publication date: 01/01/2012

Relational data representations have become an increasingly important topic due to the recent proliferation of network datasets (e.g., social, biological, information networks) and a corresponding increase in the application of statistical relational learning (SRL) algorithms to these domains. In this article, we examine a range of representation issues for graph-based relational data. Since the choice of relational data representation for the nodes, links, and features can dramatically affect the capabilities of SRL algorithms, we survey approaches and opportunities for relational representation transformation designed to improve the performance of these algorithms. This leads us to introduce an intuitive taxonomy for data representation transformations in relational domains that incorporates link transformation and node transformation as symmetric representation tasks. In particular, the transformation tasks for both nodes and links include (i) predicting their existence, (ii) predicting their label or type, (iii) estimating their weight or importance, and (iv) systematically constructing their relevant features. We motivate our taxonomy through detailed examples and use it to survey and compare competing approaches for each of these tasks. We also discuss general conditions for transforming links, nodes, and features. Finally, we highlight challenges that remain to be addressed.

arXiv.org e-Print Archive

CiteSeerX

Bio-molecular Event Extraction with Markov Logic

Author: Bilgic
Björne
Bunescu
Charniak
Clark
Cohen
Crammer
Dredze
Finkel
Kilicoglu
Kim
Kim
Kok
McClosky
McDonald
Meza-Ruiz
Poon
Ray
Richardson
Riedel
Riedel
Riedel
Saetre
Singh
Singla
Surdeanu
Taskar
Taskar
Toutanova
Van Landeghem
Publication venue: Wiley

Crossref

Learning, Probability and Logic: Toward a Unified Approach for Content-Based Music Information Retrieval

Author: Al Farabi
Anglade
Anglade
Anglade
Anglade
Arabi
Aucouturier
Bartsch
Bello
Bello
Bengio
Bergmann
Besold
Blockeel
Boulanger-Lewandowski
Burgoyne
Burgoyne
Böck
Casey
Cella
Cho
Crane
d'Avila Garcez
Dannenberg
Davis
De Raedt
De Raedt
De Raedt
De Raedt
De Raedt
Deng
Deng
Dobrian
Domingos
Donadello
Donadello
Dovey
Downie
Ellis
Ellis
Ellis
Flach
Foote
Friedman
Fujishima
Gaudefroy
Getoor
Getoor
Grosche
Gurevych
Haack
Hamel
Harte
Herremans
Humphrey
Humphrey
Humphrey
Jain
Jain
Jernite
Kameoka
Kempf
Kernfeld
Kersting
Kersting
Kim
Kimmig
Kindermann
Kok
Kok
Koller
Koops
Korzeniowski
Korzeniowski
Korzeniowski
Korzeniowski
Krumhansl
Kuzelka
Lafferty
Lee
Leivant
Lew
Lewin
Liu
Lostanlen
Malkin
Mallat
Mallory
Maresz
Marsík
Mauch
Mauch
McFee
McVicar
Mihalkova
Minsky
Mishkin
Morales
Morales
Muggleton
Muller
Murphy
Müller
Müller
Ni
Nilsson
Ojima
Orio
Oudre
Pachet
Paiement
Pan
Papadopoulos
Papadopoulos
Papadopoulos
Papadopoulos
Papadopoulos
Papai
Paulus
Pauwels
Pawar
Pearl
Pereira
Poole
Poon
Poon
Poon
Prince
Pápai
Raedt
Rameau
Ramirez
Ramirez
Repetto
Richardson
Richardson
Riedel
Riemann
Russell
Salamon
Sarkhel
Schedl
Schedl
Schoenberg
Schuller
Serrà
Serrá
Sheh
Shenoy
Sigtia
Singla
Smith
Snidaro
Socher
Srinivasamurthy
Sutton
Sztyler
Thimm
Tsushima
Van Baelen
Van Haaren
Venugopal
Wang
Widmer
Wu
Zalkow
Zhou
Šourek
Publication venue: Frontiers Media SA
Publication date: 01/01/2019

Within the last 15 years, the field of Music Information Retrieval (MIR) has made tremendous progress in the development of algorithms for organizing and analyzing the ever-increasing large and varied amount of music and music-related data available digitally. However, the development of content-based methods to enable or ameliorate multimedia retrieval still remains a central challenge. In this perspective paper, we critically look at the problem of automatic chord estimation from audio recordings as a case study of content-based algorithms, and point out several bottlenecks in current approaches: expressiveness and flexibility are obtained at the expense of robustness and vice versa; available multimodal sources of information are little exploited; modeling multi-faceted and strongly interrelated musical information is limited with current architectures; models are typically restricted to short-term analysis that does not account for the hierarchical temporal structure of musical signals. Dealing with music data requires the ability to tackle both uncertainty and complex relational structure at multiple levels of representation. Traditional approaches have generally treated these two aspects separately, with probability and learning being the usual way to represent uncertainty in knowledge, and logical representation being the usual way to represent knowledge and complex relational information. We advocate that the identified hurdles of current approaches could be overcome by recent developments in the area of Statistical Relational Artificial Intelligence (StarAI) that unifies probability, logic and (deep) learning. We show that existing approaches used in MIR find powerful extensions and unifications in StarAI, and we explain why we think it is time to consider the new perspectives offered by this promising research field.

HAL-CentraleSupelec

Crossref

Directory of Open Access Journals

HAL-Rennes 1