831 research outputs found

Hypotheses, evidence and relationships: The HypER approach for representing scientific knowledge claims

Author: Buckingham Shum S.
Carusi A.
de Waard A.
Park J.
Samwald M.
Sándor Á.
Publication date: 01/01/2009

Biological knowledge is increasingly represented as a collection of (entity-relationship-entity) triplets. These are queried, mined, appended to papers, and published. However, this representation ignores the argumentation contained within a paper and the relationships between hypotheses, claims and evidence put forth in the article. In this paper, we propose an alternate view of the research article as a network of 'hypotheses and evidence'. Our knowledge representation focuses on scientific discourse as a rhetorical activity, which leads to a different direction in the development of tools and processes for modeling this discourse. We propose to extract knowledge from the article to allow the construction of a system where a specific scientific claim is connected, through trails of meaningful relationships, to experimental evidence. We discuss some current efforts and future plans in this area
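The idea of connecting a claim to experimental evidence through trails of relationships can be illustrated with a small graph sketch. This is not the HypER implementation: the class name `ClaimGraph`, the node kinds, and the relation labels below are hypothetical and chosen only for illustration.

```python
from collections import defaultdict, deque


class ClaimGraph:
    """Toy network of claims, hypotheses and evidence (illustrative only)."""

    def __init__(self):
        self.kinds = {}                  # node id -> "claim" | "hypothesis" | "evidence"
        self.edges = defaultdict(list)   # node id -> [(relation label, target id)]

    def add_node(self, node_id, kind):
        self.kinds[node_id] = kind

    def link(self, source, relation, target):
        # The relation label is kept for inspection; the search below ignores it.
        self.edges[source].append((relation, target))

    def evidence_trails(self, claim):
        """Breadth-first search for every path from a claim to an evidence node."""
        trails = []
        queue = deque([[claim]])
        while queue:
            path = queue.popleft()
            node = path[-1]
            if self.kinds.get(node) == "evidence":
                trails.append(path)
                continue
            for _relation, target in self.edges[node]:
                if target not in path:   # avoid cycles
                    queue.append(path + [target])
        return trails


graph = ClaimGraph()
graph.add_node("c1", "claim")
graph.add_node("h1", "hypothesis")
graph.add_node("e1", "evidence")
graph.link("c1", "elaborates", "h1")      # hypothetical relation label
graph.link("h1", "supported_by", "e1")    # hypothetical relation label
trails = graph.evidence_trails("c1")
```

Here `trails` holds the single path from the claim `c1` through the hypothesis `h1` to the evidence `e1`.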

CiteSeerX

Open Research Online (The Open University)

Copenhagen University Research Information System

Gene Ontology density estimation and discourse analysis for automatic GeneRiF extraction

Author: Ehrler Frédéric
Gobeill Julien
Mottaz Anaïs
Ruch Patrick
Tbahriti Imad
Veuthey Anne-Lise
Publication venue: BioMed Central
Publication date: 01/01/2008

Background: This paper describes and evaluates a sentence selection engine that extracts a GeneRiF (Gene Reference into Functions), as defined in ENTREZ-Gene, from a MEDLINE record. Inputs for this task include both a gene and a pointer to a MEDLINE reference. The suggested approach merges two independent sentence extraction strategies. The first strategy (LASt) uses argumentative features inspired by discourse-analysis models. The second (GOEx) uses an automatic text categorizer to estimate the density of Gene Ontology categories in every sentence, thus providing a full ranking of all possible candidate GeneRiFs. A combination of the two approaches is proposed, which also aims at reducing the size of the selected segment by filtering out non-content-bearing rhetorical phrases. Results: Based on the TREC-2003 Genomics collection for GeneRiF identification, the LASt extraction strategy is already competitive (52.78%). When used in a combined approach, the extraction task clearly shows improvement, achieving a Dice score of over 57% (+10%). Conclusions: Argumentative representation levels and conceptual density estimation using Gene Ontology contents appear complementary for functional annotation in proteomics.
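For readers unfamiliar with the Dice score reported above, the following is a minimal sketch of one common form of the measure, computed over sets of lowercased whitespace tokens. The tokenization and the function name `dice_score` are assumptions made for illustration; the abstract does not say which Dice variant the authors used.

```python
def dice_score(candidate: str, reference: str) -> float:
    """Dice coefficient 2|A ∩ B| / (|A| + |B|) over sets of lowercased tokens."""
    cand = set(candidate.lower().split())
    ref = set(reference.lower().split())
    if not cand and not ref:
        # Two empty strings are treated as identical (a convention, assumed here).
        return 1.0
    return 2 * len(cand & ref) / (len(cand) + len(ref))
```

A score of 1.0 means the extracted sentence and the reference annotation share exactly the same token set, and 0.0 means they share no tokens.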

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Archive ouverte UNIGE

Finding and Interpreting Arguments: An Important Challenge for Humanities Computing and Scholarly Practice

Author: Allen C.
Ravenscroft A.
Publication venue: The Alliance of Digital Humanities Organizations
Publication date: 01/01/2019

Skillful identification and interpretation of arguments is a cornerstone of learning, scholarly activity and thoughtful civic engagement. These are difficult skills for people to learn, and they are beyond the reach of current computational methods from artificial intelligence and machine learning, despite hype suggesting the contrary. In previous work, we have attempted to build systems that scaffold these skills in people. In this paper we reflect on the difficulties posed by this work, and we argue that it is a serious challenge which ought to be taken up within the digital humanities and related efforts to computationally support scholarly practice. Network analysis, bibliometrics, and stylometrics essentially leave out the fundamental humanistic skill of charitable argument interpretation because they touch very little on the meanings embedded in texts. We present a problematisation of the design space for potential tool development, as a result of insights about the nature and form of arguments in historical texts gained from our attempt to locate and map the arguments in one corner of the Hathi Trust digital library.

UEL Research Repository at University of East London

NLPContributions: An Annotation Scheme for Machine Reading of Scholarly Contributions in Natural Language Processing Literature

Author: Auer Sören
D'Souza Jennifer
Publication date: 01/01/2020

We describe an annotation initiative to capture the scholarly contributions in natural language processing (NLP) articles, particularly articles that discuss machine learning (ML) approaches for various information extraction tasks. We develop the annotation task based on a pilot annotation exercise on 50 NLP-ML scholarly articles presenting contributions to five information extraction tasks: 1. machine translation, 2. named entity recognition, 3. question answering, 4. relation classification, and 5. text classification. In this article, we describe the outcomes of this pilot annotation phase. Through the exercise we have obtained an annotation methodology and found ten core information units that reflect the contribution of the NLP-ML scholarly investigations. The resulting annotation scheme we developed based on these information units is called NLPContributions. The overarching goal of our endeavor is four-fold: 1) to find a systematic set of patterns of subject-predicate-object statements for the semantic structuring of scholarly contributions that are more or less generically applicable for NLP-ML research articles; 2) to apply the discovered patterns in the creation of a larger annotated dataset for training machine readers of research contributions; 3) to ingest the dataset into the Open Research Knowledge Graph (ORKG) infrastructure as a showcase for creating user-friendly state-of-the-art overviews; 4) to integrate the machine readers into the ORKG to assist users in the manual curation of their respective article contributions. We envision that the NLPContributions methodology engenders a wider discussion on the topic toward its further refinement and development.
Our pilot dataset of 50 NLP-ML scholarly articles, annotated according to the NLPContributions scheme, is openly available to the research community at https://doi.org/10.25835/0019761.
Comment: In Proceedings of the 1st Workshop on Extraction and Evaluation of Knowledge Entities from Scientific Documents (EEKE 2020) co-located with the ACM/IEEE Joint Conference on Digital Libraries in 2020 (JCDL 2020), Virtual Event, China, August 1. http://ceur-ws.org/Vol-2658

arXiv.org e-Print Archive

Repositorium für Naturwissenschaften und Technik

Discourse structure and language technology

Author: Webber B.
Egg M.
Kordoni V.
Publication venue: Cambridge University Press (CUP)
Publication date: 01/10/2012

Crossref

Edinburgh Research Explorer

Automatic extraction and structure of arguments in legal documents

Author: Poudyal Prakash
Publication venue: Universidade de Evora
Publication date: 04/12/2018

Argumentation plays a cardinal role in human communication when formulating reasons and drawing conclusions. A system to automatically and cost-effectively identify legal arguments in case law was developed. Using 42 case-law documents from the European Court of Human Rights (ECHR), an annotation was performed to establish a "gold-standard" dataset. Then a three-stage process for argument mining was developed and tested. The first stage aims at evaluating the best set of features for automatically identifying argumentative sentences within unstructured text. Several experiments were conducted, depending upon the type of features available in the corpus, in order to determine which approach yielded the best result. In the second stage, a novel approach to clustering (for grouping sentences automatically into a coherent legal argument) was introduced through the development of two new algorithms: the "Appropriate Cluster Identification Algorithm" (ACIA) and the "Distribution of Sentence to the Cluster Algorithm" (DSCA). This work also includes a new evaluation system for the clustering algorithm, which helps tune its performance. In the third stage, a hybrid approach of statistical and rule-based techniques was used to categorize argumentative sentences. Overall, the level of accuracy and usefulness achieved by these new techniques makes them viable as the basis of a general argument-mining framework.

Repositório Científico da Universidade de Évora

Explainable Argument Mining

Author: Lawrence John
Publication date: 01/01/2021

University of Dundee Online Publications

Theoretical Grounding for Computer Assisted Scholarly Text Reading (CASTR)

Author: Meunier Jean-Guy
Publication venue: CISP Journal Services
Publication date: 02/09/2012

Digital humanities technology has mainly focused its development on scholarly text digitalization and text analysis. It is only recently that attention has been paid to the activity of reading in a computerized environment. Some main causes of this have been the advent of the e-book but, more importantly, the massive enterprise of text digitalization (such as Gallica, Google Books, World Wide library, and others). In this article, we analyze, in a very exploratory manner, three main dimensions of computer-assisted scholarly reading of text: the cognitive, the computational and the software dimension. The cognitive dimension of scholarly reading pertains not to the nature of reading as a psychological activity but to the complex interpretative act of going through argumentations, narrations, descriptions, demonstrations, dialogues, themes, etc. that are contained in a text.

Scholarly and Research Communication (E-Journal)